ComboCode.cc.ivs.sigproc.fit

1 """ 2 Fit various functions to timeseries. 3 4 Section 1. Radial velocity data 5 =============================== 6 7 1.1 BD+29.3070 8 -------------- 9 Fit the orbit of the main sequence companion of the sdB+MS system BD+29.3070. 10 The radial velocities are obtained from HERMES spectra using the cross correlation 11 tool of the pipeline. 12 13 Necessary imports: 14 15 >>> from cc.ivs.io.ascii import read2array 16 >>> from cc.ivs.sigproc import funclib 17 >>> import numpy as np 18 19 Read in the data: 20 21 >>> data = read2array('/home/jorisv/IVS_python/test_fit/BD+29.3070.dat') 22 >>> dates, rv, err = data[:,0], data[:,1], data[:,2] 23 24 Import the function we want to fit to the data from the function library: 25 26 >>> mymodel = funclib.kepler_orbit(type='single') 27 28 Setup the starting values of the parameters and the boundaries: 29 30 >>> pars = [1000, dates[0], 0.0, 0.0, (max(rv)-min(rv))/2, np.average(rv)] 31 >>> bounds = [(pars[0]/2, pars[0]*1.5), (data[0][0]-pars[0]/2,data[0][0]+pars[0]/2), (0.0,0.5), (0,2*np.pi), (pars[4]/4,pars[4]*2), (min(rv),max(rv))] 32 >>> vary = [True, True, True, True, True, True] 33 >>> mymodel.setup_parameters(values=pars, bounds=bounds, vary=vary) 34 35 Fit the model to the data: 36 37 >>> result = minimize(dates, rv, mymodel, weights=1/err) 38 39 Print the results: 40 41 >>> print mymodel.param2str() 42 p = 1012.26 +/- 16.57 43 t0 = 2455423.65 +/- 11.27 44 e = 0.16 +/- 0.01 45 omega = 1.72 +/- 0.08 46 k = 6.06 +/- 0.04 47 v0 = 32.23 +/- 0.09 48 49 The minimizer already returned errors on the parameters, based on the Levenberg-Marquardt algorithm of scipy. But we can get more robust errors by using the L{Minimizer.estimate_error} method of the minimizer wich uses an F-test to calculate confidence intervals, fx on the period and eccentricity of the orbit: 50 51 >>> ci = result.estimate_error(p_names=['p', 'e'], sigmas=[0.25,0.65,0.95]) 52 >>> print confidence2string(ci, accuracy=4) 53 p 54 25.00 % 65.00 % 95.00 % 55 - 1006.9878 997.1355 980.7742 56 + 1017.7479 1029.2554 1053.0851 57 e 58 25.00 % 65.00 % 95.00 % 59 - 0.1603 0.1542 0.1433 60 + 0.1667 0.1731 0.1852 61 62 Now plot the resulting rv curve over the original curve: 63 64 >>> p = pl.figure(1) 65 >>> result.plot_results() 66 67 ]include figure]]ivs_sigproc_lmfit_BD+29.3070_1.png] 68 69 We can use the L{Minimizer.plot_confidence_interval} method to plot the confidence intervals of two 70 parameters, and show the correlation between them, fx between the period and T0: 71 72 >>> p = pl.figure(2) 73 >>> result.plot_confidence_interval(xname='p',yname='t0', res=30, filled=True) 74 75 ]include figure]]ivs_sigproc_lmfit_BD+29.3070_2.png] 76 77 To get a better idea of how the parameter space behaves we can start the fitting process from 78 different starting values and see how they converge. Fx, we will let the fitting process start 79 from different values for the period and the eccentricity, and then plot where they converge to 80 81 Herefore we use the L{grid_minimize} function which has the same input as the normal minimize function 82 and some extra parameters. 83 84 >>> fitters, startpars, models, chi2s = fit.grid_minimize(dates, rv, mymodel, weights=1/err, points=200, parameters = ['p','e'], return_all=True) 85 86 We started fitting from 200 points randomly distributed in period and eccentricity space, with the 87 boundary values for there parameters as limits. 88 89 Based on this output we can use the L{plot_convergence} function to plot to which values each starting point converges. 90 91 >>> p = pl.figure(3) 92 >>> plot_convergence(startpars, models, chi2s, xpar='p', ypar='e') 93 94 ]include figure]]ivs_sigproc_lmfit_BD+29.3070_3.png] 95 96 97 98 99 # 1.2 LSI+65010 100 # ------------- 101 # Fit orbit of massive X-ray binary LSI+65010, after Grundstrom 2007: 102 # 103 # Necessary imports: 104 # 105 # >>> from ivs.catalogs import vizier 106 # >>> from ivs.timeseries import pergrams 107 # >>> from ivs.aux import numpy_ext 108 # >>> import pylab as pl 109 # 110 # Read in the data, and remove the outliers: 111 # 112 # >>> data,units,comms = vizier.search('J/ApJ/656/431/table2') 113 # >>> times,RV = data['HJD'],data['RV'] 114 # >>> keep = RV<-30. 115 # >>> times,RV = times[keep],RV[keep] 116 # 117 # Find the best frequency using the Kepler periodogram, and fit an orbit with that 118 # frequency using the linear fitting routine. 119 # 120 # >>> freqs,ampls = pergrams.kepler(times,RV,fn=0.2) 121 # >>> freq = freqs[np.argmax(ampls)] 122 # >>> pars1 = kepler(times, RV, freq, output_type='new') 123 # >>> print pars1 124 # [11.581314028141733, 2451060.7517886101, 0.19000000000000003, 1.0069244281466982, 11.915330492005735, -59.178393186003241] 125 # 126 # Now we want to improve this fit using the nonlinear optimizers, deriving errors 127 # on the parameters on the fly (B{warning: these errors are not necessarily realistic!}). 128 # First, we setup the model: 129 # 130 # >>> mymodel = funclib.kepler_orbit(type='single') 131 # >>> mymodel.setup_parameters(pars1) 132 # >>> result = minimize(times,RV,mymodel) 133 # >>> pars2,e_pars2 = result.model.get_parameters() 134 # >>> print pars2 135 # [ 1.15815058e+01 2.45106077e+06 1.94720600e-01 1.02204827e+00 136 # 1.19264204e+01 -5.91827773e+01] 137 # >>> print mymodel.param2str(accuracy=6) 138 # p = 11.581506 +/- 0.004104 139 # t0 = 2451060.771681 +/- 0.583864 140 # e = 0.194721 +/- 0.060980 141 # omega = 1.022048 +/- 0.320605 142 # k = 11.926420 +/- 0.786787 143 # v0 = -59.182777 +/- 0.503345 144 # 145 # Evaluate the orbital fits, and make phasediagrams of the fits and the data 146 # 147 # >>> myorbit1 = mymodel.evaluate(times,pars1) 148 # >>> myorbit2 = mymodel.evaluate(times,pars2) 149 # >>> period = result.model.parameters['p'].value 150 # >>> phases,phased = evaluate.phasediagram(times,RV,1/period) 151 # >>> phases1,phased1 = evaluate.phasediagram(times,myorbit1,1/period) 152 # >>> phases2,phased2 = evaluate.phasediagram(times,myorbit2,1/period) 153 # 154 # Now plot everything: 155 # 156 # >>> sa1 = np.argsort(phases1) 157 # >>> sa2 = np.argsort(phases2) 158 # >>> p = pl.figure() 159 # >>> p = pl.subplot(121) 160 # >>> p = pl.plot(freqs,ampls,'k-') 161 # >>> p = pl.xlabel('Frequency [d$^{-1}$]') 162 # >>> p = pl.ylabel('Statistic') 163 # >>> p = pl.subplot(122) 164 # >>> p = pl.plot(phases,phased,'ko') 165 # >>> p = pl.plot(phases1[sa1],phased1[sa1],'r-',lw=2,label='Linear fit') 166 # >>> p = pl.plot(phases2[sa2],phased2[sa2],'b--',lw=2,label='Optimization') 167 # >>> p = pl.xlabel('Phase [$2\pi^{-1}$]') 168 # >>> p = pl.ylabel('Amplitude [km s$^{-1}$]') 169 # >>> p = pl.legend() 170 # 171 # ]include figure]]ivs_sigproc_fit_01.png] 172 173 Section 2. Fitting an absorption line 174 ===================================== 175 176 Here we show how to use 2 gaussians to fit an absorption line with an emission feature in its core: 177 178 Setup the two gaussian functions for the fitting process: 179 180 >>> gauss1 = funclib.gauss() 181 >>> pars = [-0.75,1.0,0.1,1] 182 >>> gauss1.setup_parameters(values=pars) 183 184 >>> gauss2 = funclib.gauss() 185 >>> pars = [0.22,1.0,0.01,0.0] 186 >>> vary = [True, True, True, False] 187 >>> gauss2.setup_parameters(values=pars, vary=vary) 188 189 Create the model by summing up the gaussians. As we just want to sum the two gaussian, we do not need 190 to specify an expression for combining the two functions: 191 192 >>> mymodel = Model(functions=[gauss1, gauss2]) 193 194 Create some data with noise on it 195 196 >>> np.random.seed(1111) 197 >>> x = np.linspace(0.5,1.5, num=1000) 198 >>> y = mymodel.evaluate(x) 199 >>> noise = np.random.normal(0.0, 0.015, size=len(x)) 200 >>> y = y+noise 201 202 Change the starting values for the fit parameters: 203 204 >>> pars = [-0.70,1.0,0.11,0.95] 205 >>> gauss1.setup_parameters(values=pars) 206 >>> pars = [0.27,1.0,0.005,0.0] 207 >>> vary = [True, True, True, False] 208 >>> gauss2.setup_parameters(values=pars, vary=vary) 209 210 Fit the model to the data 211 >>> result = minimize(x,y, mymodel) 212 213 Print the resulting values for the parameters. The errors are very small as the data only has some 214 small normal distributed noise added to it: 215 216 >>> print gauss1.param2str(accuracy=6) 217 a = -0.750354 +/- 0.001802 218 mu = 0.999949 +/- 0.000207 219 sigma = 0.099597 +/- 0.000267 220 c = 0.999990 +/- 0.000677 221 >>> print gauss2.param2str(accuracy=6) 222 a = 0.216054 +/- 0.004485 223 mu = 1.000047 +/- 0.000226 224 sigma = 0.009815 +/- 0.000250 225 c = 0.000000 +/- 0.000000 226 227 Now plot the results: 228 229 >>> p = pl.figure(1) 230 >>> result.plot_results() 231 232 ]include figure]]ivs_sigproc_lmfit_gaussian.png] 233 234 Section 3. Pulsation frequency analysis 235 ======================================= 236 237 Do a frequency analysis of the star HD129929, after Aerts 2004: 238 239 Read in the data: 240 241 >>> data,units,comms = vizier.search('J/A+A/415/241/table1') 242 >>> times,signal = data['HJD'],data['Umag'] 243 >>> signal -= signal.mean() 244 245 Find the best frequency using the Scargle periodogram, fit an orbit with that 246 frequency and optimize. Then print the results to the screen: 247 248 >>> freqs,ampls = pergrams.scargle(times,signal,f0=6.4,fn=7) 249 >>> freq = freqs[np.argmax(ampls)] 250 >>> pars1 = sine(times, signal, freq) 251 >>> e_pars1 = e_sine(times,signal, pars1) 252 >>> pars2,e_pars2,gain = optimize(times,signal,pars1,'sine') 253 >>> print pl.mlab.rec2txt(numpy_ext.recarr_join(pars1,e_pars1),precision=6) 254 const ampl freq phase e_const e_ampl e_freq e_phase 255 0.000242 0.014795 6.461705 -0.093895 0.000608 0.001319 0.000006 0.089134 256 >>> print pl.mlab.rec2txt(numpy_ext.recarr_join(pars2,e_pars2),precision=6) 257 const ampl freq phase e_const e_ampl e_freq e_phase 258 0.000242 0.014795 6.461705 -0.093895 0.000609 0.001912 0.000000 0.013386 259 260 Evaluate the sines, and make phasediagrams of the fits and the data 261 262 >>> mysine1 = evaluate.sine(times,pars1) 263 >>> mysine2 = evaluate.sine(times,pars2) 264 >>> phases,phased = evaluate.phasediagram(times,signal,pars1['freq']) 265 >>> phases1,phased1 = evaluate.phasediagram(times,mysine1,pars1['freq']) 266 >>> phases2,phased2 = evaluate.phasediagram(times,mysine2,pars1['freq']) 267 268 Now plot everything: 269 270 >>> sa1 = np.argsort(phases1) 271 >>> sa2 = np.argsort(phases2) 272 >>> p = pl.figure() 273 >>> p = pl.subplot(121) 274 >>> p = pl.plot(freqs,ampls,'k-') 275 >>> p = pl.xlabel('Frequency [d$^{-1}$]') 276 >>> p = pl.ylabel('Amplitude [mag]') 277 >>> p = pl.subplot(122) 278 >>> p = pl.plot(phases,phased,'ko') 279 >>> p = pl.plot(phases1[sa1],phased1[sa1],'r-',lw=2) 280 >>> p = pl.plot(phases2[sa2],phased2[sa2],'b--',lw=2) 281 >>> p = pl.xlabel('Phase [$2\pi^{-1}$]') 282 >>> p = pl.ylabel('Amplitude [mag]') 283 284 ]include figure]]ivs_sigproc_fit_02.png] 285 286 287 Section 4. Exoplanet transit analysis 288 ===================================== 289 290 Find the transits of CoRoT 8b, after Borde 2010. 291 292 >>> import urllib 293 >>> from cc.ivs.io import ascii 294 >>> url = urllib.URLopener() 295 >>> filen,msg = url.retrieve('http://cdsarc.u-strasbg.fr/viz-bin/nph-Plot/Vgraph/txt?J%2fA%2bA%2f520%2fA66%2f.%2flc_white&F=white&P=0&--bitmap-size&800x400') 296 >>> times,signal = ascii.read2array(filen).T 297 >>> signal = signal / np.median(signal) 298 >>> url.close() 299 300 Find the best frequency using the Box Least Squares periodogram, fit a transit 301 model with that frequency, optimize and prewhiten. 302 303 >>> freqs,ampls = pergrams.box(times,signal,f0=0.16,fn=0.162,df=0.005/times.ptp(),qma=0.05) 304 >>> freq = freqs[np.argmax(ampls)] 305 >>> pars = box(times,signal,freq) 306 >>> pars = box(times,signal,freq,b0=pars['ingress'][0]-0.05,bn=pars['egress'][0]+0.05) 307 >>> print pl.mlab.rec2txt(pars,precision=6) 308 freq depth ingress egress cont 309 0.161018 0.005978 0.782028 0.799229 1.000027 310 311 Evaluate the transits, and make phasediagrams of the fits and the data 312 313 >>> transit = evaluate.box(times,pars) 314 >>> phases,phased = evaluate.phasediagram(times,signal,freq) 315 >>> phases1,phased1 = evaluate.phasediagram(times,transit,freq) 316 317 Now plot everything and print the results to the screen: 318 319 >>> sa1 = np.argsort(phases1) 320 >>> p = pl.figure() 321 >>> p = pl.subplot(121) 322 >>> p = pl.plot(freqs,ampls,'k-') 323 >>> p = pl.xlabel('Frequency [d$^{-1}$]') 324 >>> p = pl.ylabel('Statistic') 325 >>> p = pl.subplot(122) 326 >>> p = pl.plot(phases,phased*100,'ko') 327 >>> p = pl.plot(phases1[sa1],phased1[sa1]*100,'r-',lw=2) 328 >>> p = pl.xlim(0.70,0.85) 329 >>> p = pl.xlabel('Phase [$2\pi^{-1}$]') 330 >>> p = pl.ylabel('Depth [%]') 331 332 ]include figure]]ivs_sigproc_fit_03.png] 333 334 Section 5. Eclipsing binary fit 335 =============================== 336 337 Splines are not a good way to fit eclipsing binaries, but just for the sake of 338 showing the use of the periodic spline fitting functions, we do it anyway. 339 340 We use the data on CU Cnc of Ribas, 2003: 341 342 >>> data,units,comms = vizier.search('J/A+A/398/239/table1') 343 >>> times,signal = data['HJD'],data['Dmag'] 344 345 346 Section 6. Blackbody fit 347 ======================== 348 349 We generate a single black body curve with Teff=10000. and scale=1.: 350 351 >>> from cc.ivs.sed import model as sed_model 352 >>> wave_dense = np.logspace(2.6,6,1000) 353 >>> flux_dense = sed_model.blackbody((wave_dense,'AA'),10000.) 354 355 We simulate 20 datapoints of this model and perturb it a bit: 356 357 >>> wave = np.logspace(3,6,20) 358 >>> flux = sed_model.blackbody((wave,'AA'),10000.) 359 >>> error = flux/2. 360 >>> flux += np.random.normal(size=len(wave),scale=error) 361 362 Next, we setup a single black body model to fit through the simulated data. Our 363 initial guess is a temperature of 1000K and a scale of 1: 364 365 >>> pars = [1000.,1.] 366 >>> mymodel = funclib.blackbody() 367 >>> mymodel.setup_parameters(values=pars) 368 369 Fitting and evaluating the fit is as easy as: 370 371 >>> result = minimize(wave,flux, mymodel,weights=1./error) 372 >>> myfit = mymodel.evaluate(wave_dense) 373 374 This is the result: 375 376 >>> print mymodel.param2str() 377 T = 9678.90 +/- 394.26 378 scale = 1.14 +/- 0.17 379 380 A multiple black body is very similar (we make the errors somewhat smaller for 381 easier fitting): 382 383 >>> flux2 = sed_model.blackbody((wave,'AA'),15000.) 384 >>> flux2+= sed_model.blackbody((wave,'AA'),6000.)*10. 385 >>> flux2+= sed_model.blackbody((wave,'AA'),3000.)*100. 386 >>> error2 = flux2/10. 387 >>> flux2 += np.random.normal(size=len(wave),scale=error2) 388 389 The setup now needs three sets of parameters, which we choose to be all equal. 390 391 >>> pars = [[1000.,1.],[1000.,1.],[1000.,1.]] 392 >>> mymodel = funclib.multi_blackbody(n=3) 393 >>> mymodel.setup_parameters(values=pars) 394 395 Fitting and evaluate is again very easy: 396 397 >>> result = minimize(wave,flux2, mymodel,weights=1./error2) 398 >>> myfit2 = result.model.evaluate(wave_dense) 399 400 This is the result: 401 402 >>> print mymodel.param2str() 403 T_0 = 6155.32 +/- 3338.54 404 scale_0 = 9.54 +/- 23.00 405 T_1 = 3134.37 +/- 714.01 406 scale_1 = 93.40 +/- 17.98 407 T_2 = 14696.40 +/- 568.76 408 scale_2 = 1.15 +/- 0.33 409 410 And a nice plot of the two cases: 411 412 >>> p = pl.figure() 413 >>> p = pl.subplot(121) 414 >>> p = pl.plot(wave_dense,flux_dense,'k-') 415 >>> p = pl.plot(wave_dense,myfit,'r-') 416 >>> p = pl.errorbar(wave,flux,yerr=error,fmt='bs') 417 >>> p = pl.gca().set_xscale('log') 418 >>> p = pl.gca().set_yscale('log') 419 >>> p = pl.subplot(122) 420 >>> p = pl.plot(wave_dense,myfit2,'r-') 421 >>> p = pl.errorbar(wave,flux2,yerr=error2,fmt='bs') 422 >>> p = pl.gca().set_xscale('log') 423 >>> p = pl.gca().set_yscale('log') 424 425 ]include figure]]ivs_sigproc_lmfit_blackbody.png] 426 """ 427 import time 428 import logging 429 430 import numpy as np 431 from numpy import pi,cos,sin 432 import numpy.linalg as la 433 from scipy.interpolate import splrep 434 import scipy.optimize 435 436 from cc.ivs.aux import progressMeter as progress 437 from cc.ivs.aux import loggers 438 from cc.ivs.sigproc import evaluate 439 #from ivs.timeseries import pyKEP 440 441 import re 442 import copy 443 import pylab as pl 444 import matplotlib as mpl 445 from cc.ivs.sigproc import lmfit 446 #import lmfit 447 448 logger = logging.getLogger('SP.FIT')

449 450 #{Linear fit functions 451 452 453 -def sine(times, signal, freq, sigma=None,constant=True,error=False,t0=0):

454 """ 455 Fit a harmonic function. 456 457 This function is of the form 458 459 C + \sum_j A_j sin(2pi nu_j (t-t0) + phi_j) 460 461 where the presence of the constant term is an option. The error 462 bars on the fitted parameters can also be requested (by Joris De Ridder). 463 464 (phase in radians!) 465 466 @param times: time points 467 @type times: numpy array 468 @param signal: observations 469 @type signal: numpy array 470 @param freq: frequencies of the harmonics 471 @type freq: numpy array or float 472 @keyword sigma: standard error of observations 473 @type sigma: numpy array 474 @keyword constant: flag, if not None, also fit a constant 475 @type constant: boolean 476 @keyword error: flag, if not None, also compute the errorbars 477 @type error: boolean 478 @keyword t0: time zero point. 479 @type t0: float 480 @return: parameters 481 @rtype: record array 482 """ 483 #-- Subtract the zero point from the time points. 484 times = times - t0 485 486 #-- Prepare the input: if a frequency value is given, put it in a list. If 487 # an iterable is given convert it to an array 488 if not hasattr(freq,'__len__'): 489 freq = [freq] 490 freq = np.asarray(freq) 491 # do the same for the sigmas 492 if sigma is None: 493 sigma = np.ones_like(signal) 494 elif not hasattr(sigma,'__len__'): 495 sigma = sigma * np.ones_like(signal) 496 497 #-- Determine the number of fit parameters 498 Ndata = len(times) 499 Nfreq = len(freq) 500 if not constant: Nparam = 2*Nfreq 501 else: Nparam = 2*Nfreq+1 502 503 #-- The fit function used is of the form 504 # C + \sum_j a_j sin(2pi\nu_j t_i) + b_j cos(2pi\nu_j t_i) 505 # which is linear in its fit parameters. These parameters p can therefore be 506 # solved by minimizing ||b - A p||, where b are the observations, and A is the 507 # basisfunction matrix, i.e. A[i,j] is the j-th base function evaluated in 508 # the i-th timepoint. The first Nfreq columns are the amplitudes of the sine, 509 # the second Nfreq columns are the amplitudes of the cosine, and if requested, 510 # the last column belongs to the constant 511 A = np.zeros((Ndata,Nparam)) 512 for j in range(Nfreq): 513 A[:,j] = sin(2*pi*freq[j]*times) * sigma 514 A[:,Nfreq+j] = cos(2*pi*freq[j]*times) * sigma 515 if constant: 516 A[:,2*Nfreq] = np.ones(Ndata) * sigma 517 518 b = signal * sigma 519 520 #-- Solve using SVD decomposition 521 fitparam, chisq, rank, s = np.linalg.lstsq(A,b) 522 523 #-- Compute the amplitudes and phases: A_j sin(2pi*\nu_j t_i + phi_j) 524 amplitude = np.zeros(Nfreq) 525 phase = np.zeros(Nfreq) 526 for j in range(Nfreq): 527 amplitude[j] = np.sqrt(fitparam[j]**2 + fitparam[Nfreq+j]**2) 528 phase[j] = np.arctan2(fitparam[Nfreq+j], fitparam[j]) 529 530 #-- If no error bars are needed, we are finished here, we collect all parameters 531 if constant: 532 constn = np.zeros(len(amplitude)) 533 constn[0] = fitparam[2*Nfreq] 534 names = ['const','ampl','freq','phase'] 535 fpars = [constn,amplitude,freq,phase/(2*pi)] 536 else: 537 names = ['ampl','freq','phase'] 538 fpars = [amplitude,freq,phase/(2*pi)] 539 540 parameters = np.rec.fromarrays(fpars,names=names) 541 542 logger.debug('SINEFIT: Calculated harmonic fit with %d frequencies through %d datapoints'%(Nfreq,Ndata)) 543 544 return parameters

545

546 -def periodic_spline(times, signal, freq, t0=None, order=20, k=3):

547 """ 548 Fit a periodic spline. 549 550 CAUTION: this definition assumes the proposed period is either 551 small compared to the total time range, or there are a lot of 552 points per period. 553 554 This definition basically phases all the data and constructs 555 an empirical periodic function through an averaging process 556 per phasebin, and then performing a splinefit through those points. 557 558 In order to make the first derivative continuous, we repeat 559 the first point(s) at the end and the end point(s) at the beginning. 560 561 The constructed function can than be removed from the original 562 data. 563 564 Output is a record array containing the columns 'freq', 'knots', 'coeffs' 565 and 'degree'. 566 567 Example usage: 568 569 >>> myfreq = 1/20. 570 >>> times = np.linspace(0,150,10000) 571 >>> signal = sin(2*pi*myfreq*times+0.32*2*pi) + np.random.normal(size=len(times)) 572 >>> cjs = periodic_spline(times,signal,myfreq,order=30) 573 >>> trend = evaluate.periodic_spline(times,cjs) 574 >>> import pylab as pl 575 >>> p=pl.figure() 576 >>> p=pl.plot(times,signal,'ko') 577 >>> p=pl.plot(times,trend,'r-') 578 >>> p=pl.title("test:tfit:periodic_spline_fit") 579 580 @param times: time points 581 @type times: numpy 1d array 582 @param signal: observation points 583 @type signal: numpy 1d array 584 @param freq: frequency of the periodic trend 585 @type freq: float 586 @keyword order: number of points to use for the spline fit. 587 @type order: integer 588 @keyword k: order of the spline 589 @type k: integer 590 @return: parameters 591 @rtype: record array 592 """ 593 #-- get keyword arguments 594 if t0 is None: 595 t0 = times[0] 596 597 #-- Prepare the input: if a frequency value is given, put it in a list. If 598 # an iterable is given convert it to an array 599 if not hasattr(freq,'__len__'): 600 freq = [freq] 601 freq = np.asarray(freq) 602 603 N = order*3+(k-2) 604 parameters = np.zeros(len(freq), dtype=[('freq','float32'), 605 ('knots','%dfloat32'%N), 606 ('coeffs','%dfloat32'%N), 607 ('degree','int8')]) 608 for nf,ifreq in enumerate(freq): 609 #-- get phased signal 610 phases,phased_sig = evaluate.phasediagram(times,signal,ifreq,t0=t0) 611 612 #-- construct phase domain 613 x = np.linspace(0.,1.,order)[:-1] 614 dx = x[1]-x[0] 615 x = x+dx/2 616 617 #-- make bin-averaged signal. Possible that some bins are empty, just skip 618 # them but warn the user 619 found_phase = [] 620 found_averg = [] 621 for xi in x: 622 this_bin = ((xi-dx/2.)<=phases) & (phases<=(xi+dx/2.)) 623 if np.sum(this_bin)==0: 624 logger.warning("PERIODIC SPLINE: some parts of the phase are not covered") 625 continue 626 found_phase.append(xi) 627 found_averg.append(np.average(phased_sig[this_bin])) 628 629 found_phase = np.array(found_phase) 630 found_averg = np.array(found_averg) 631 632 #-- make circular 633 found_phase = np.hstack([found_phase-1,found_phase,found_phase+1]) 634 found_averg = np.hstack([found_averg, found_averg,found_averg]) 635 636 #-- compute spline representation 637 knots,coeffs,degree = splrep(found_phase,found_averg,k=k) 638 parameters['freq'][nf] = ifreq 639 parameters['knots'][nf] = knots 640 parameters['coeffs'][nf] = coeffs 641 parameters['degree'][nf] = degree 642 643 return parameters

644

645 # def kepler(times, signal, freq, sigma=None, wexp=2., e0=0, en=0.99, de=0.01, output_type='old'): 646 # """ 647 # Fit a Kepler orbit to a time series. 648 # 649 # Example usage: 650 # 651 # >>> from ivs.timeseries import pergrams 652 # 653 # First set the parameters we want to use: 654 # 655 # >>> pars = tuple([12.456,23456.,0.37,213/180.*pi,98.76,55.]) 656 # >>> pars = np.rec.array([pars],dtype=[('P','f8'),('T0','f8'),('e','f8'), 657 # ... ('omega','f8'),('K','f8'),('gamma','f8')]) 658 # 659 # Then generate the signal and add some noise 660 # 661 # >>> times = np.linspace(pars[0]['T0'],pars[0]['T0']+5*pars[0]['P'],100) 662 # >>> signalo = evaluate.kepler(times,pars) 663 # >>> signal = signalo + np.random.normal(scale=20.,size=len(times)) 664 # 665 # Calculate the periodogram: 666 # 667 # >>> freqs,ampls = pergrams.kepler(times,signal) 668 # >>> opars = kepler(times,signal,freqs[np.argmax(ampls)]) 669 # >>> signalf = evaluate.kepler(times,opars) 670 # 671 # And make some plots 672 # 673 # >>> import pylab as pl 674 # >>> p = pl.figure() 675 # >>> p = pl.subplot(221) 676 # >>> p = pl.plot(times,signal,'ko') 677 # >>> p = pl.plot(times,signalo,'r-',lw=2) 678 # >>> p = pl.plot(times,signalf,'b--',lw=2) 679 # >>> p = pl.subplot(222) 680 # >>> p = pl.plot(freqs,ampls,'k-') 681 # 682 # @param times: time points 683 # @type times: numpy 1d array 684 # @param signal: observation points 685 # @type signal: numpy 1d array 686 # @param freq: frequency of kepler orbit 687 # @type freq: float 688 # @return: parameters 689 # @rtype: record array 690 # """ 691 # if sigma is None: 692 # sigma = np.ones_like(times) 693 # T = times[-1]-times[0] 694 # x00 = 0. 695 # x0n = 359.9 696 # #-- initialize parameters 697 # f0 = freq 698 # fn = freq+0.05/T 699 # df = 0.1/T 700 # maxstep = int((fn-f0)/df+1) 701 # f1 = np.zeros(maxstep) #-- frequency 702 # s1 = np.zeros(maxstep) #-- power 703 # p1 = np.zeros(maxstep) #-- window 704 # l1 = np.zeros(maxstep) #-- power LS 705 # s2 = np.zeros(maxstep) #-- power Kepler 706 # k2 = np.zeros(6) #-- parameters for Kepler orbit 707 # 708 # pyKEP.kepler(times,signal,sigma,f0,fn,df,wexp,e0,en,de,x00,x0n, 709 # f1,s1,p1,l1,s2,k2) 710 # 711 # freq,x0,e,w,K,RV0 = k2 712 # pars = [1/freq,x0/(2*pi*freq) + times[0], e, w, K, RV0] 713 # if output_type=='old': 714 # pars = np.rec.array([tuple(pars)],dtype=[('P','f8'),('T0','f8'),('e','f8'),('omega','f8'),('K','f8'),('gamma','f8')]) 715 # 716 # return pars 717 718 719 720 -def box(times,signal,freq,b0=0,bn=1,order=50,t0=None):

721 """ 722 Fit box shaped transits. 723 724 @param times: time points 725 @type times: numpy 1d array 726 @param signal: observation points 727 @type signal: numpy 1d array 728 @param freq: frequency of transiting signal 729 @type freq: float 730 @param b0: minimum start of ingress (in phase) 731 @type b0: 0<float<bn 732 @param bn: maximum end of egress (in phase) 733 @type bn: b0<float<1 734 @param order: number of phase bins 735 @type order: integer 736 @param t0: zeropoint of times 737 @type t0: float 738 @return: parameters 739 @rtype: record array 740 """ 741 if t0 is None: 742 t0 = 0. 743 744 #-- Prepare the input: if a frequency value is given, put it in a list. If 745 # an iterable is given convert it to an array 746 if not hasattr(freq,'__len__'): 747 freq = [freq] 748 freq = np.asarray(freq) 749 750 parameters = np.rec.fromarrays([np.zeros(len(freq)) for i in range(5)],dtype=[('freq','f8'),('depth','f8'), 751 ('ingress','f8'),('egress','f8'), 752 ('cont','f8')]) 753 bins = np.linspace(b0,bn,order) 754 755 for fnr,frequency in enumerate(freq): 756 parameters[fnr]['freq'] = frequency 757 #-- we need to check two phase diagrams, to compensate for edge effects 758 phases1,phased1 = evaluate.phasediagram(times,signal,frequency,t0=t0) 759 phases2,phased2 = evaluate.phasediagram(times,signal,frequency,t0=t0+0.5/frequency) 760 best_fit = np.inf 761 for i,start in enumerate(bins): 762 for end in bins[i+1:]: 763 transit1 = (start<=phases1) & (phases1<=end) 764 transit2 = (start<=phases2) & (phases2<=end) 765 transit_sig1 = np.median(np.compress( transit1,phased1)) 766 contini_sig1 = np.median(np.compress(1-transit1,phased1)) 767 depth1 = contini_sig1-transit_sig1 768 transit_sig2 = np.median(np.compress( transit2,phased2)) 769 contini_sig2 = np.median(np.compress(1-transit2,phased2)) 770 depth2 = contini_sig2-transit_sig2 771 #-- check fit 772 chisquare1 = sum((evaluate.box(times,[contini_sig1,frequency,depth1,start,end])-signal)**2) 773 chisquare2 = sum((evaluate.box(times,[contini_sig2,frequency,depth2,start,end])-signal)**2) 774 if chisquare1 < best_fit and chisquare1 < chisquare2: 775 parameters[fnr]['depth'] = depth1 776 parameters[fnr]['ingress'] = start 777 parameters[fnr]['egress'] = end 778 parameters[fnr]['cont'] = contini_sig1 779 best_fit = chisquare1 780 elif chisquare2 < best_fit and chisquare2 < chisquare1: 781 start2 = start - 0.5 782 end2 = end - 0.5 783 if start2 < 0: start2 += 1 784 if end2 < 0: end2 += 1 785 parameters[fnr]['depth'] = depth2 786 parameters[fnr]['ingress'] = start2 787 parameters[fnr]['egress'] = end2 788 parameters[fnr]['cont'] = contini_sig2 789 best_fit = chisquare2 790 return parameters

791

792 -def gauss(x,y,threshold=0.1,constant=False,full_output=False, 793 init_guess_method='analytical',window=None):

794 """ 795 Fit a Gaussian profile to data using a polynomial fit. 796 797 y = A * exp( -(x-mu)**2 / (2*sigma**2)) 798 799 ln(y) = ln(A) - (x-mu)**2 / (2*sigma**2) 800 ln(y) = ln(A) - mu**2 / (2*sigma**2) + mu / (sigma**2) * x - x**2 / (2*sigma**2) 801 802 then the parameters are given by 803 804 p0 = - 1 / (2*sigma**2) 805 p1 = mu / ( sigma**2) 806 p2 = ln(A) - mu**2 / (2*sigma**2) 807 808 Note that not all datapoints are used, but only those above a certain values (namely 809 10% of the maximum value), In this way, we reduce the influence of the continuum 810 and concentrate on the shape of the peak itself. 811 812 Afterwards, we perform a non linear least square fit with above parameters 813 as starting values, but only accept it if the CHI2 has improved. 814 815 If a constant has to be fitted, the nonlinear options has to be True. 816 817 Example: we generate a Lorentzian curve and fit a Gaussian to it: 818 819 >>> x = np.linspace(-10,10,1000) 820 >>> y = evaluate.lorentz(x,[5.,0.,2.]) + np.random.normal(scale=0.1,size=len(x)) 821 >>> pars1,e_pars1 = gauss(x,y) 822 >>> pars2,e_pars2 = gauss(x,y,constant=True) 823 >>> y1 = evaluate.gauss(x,pars1) 824 >>> y2 = evaluate.gauss(x,pars2) 825 >>> p = pl.figure() 826 >>> p = pl.plot(x,y,'k-') 827 >>> p = pl.plot(x,y1,'r-',lw=2) 828 >>> p = pl.plot(x,y2,'b-',lw=2) 829 830 @param x: x axis data 831 @type x: numpy array 832 @param y: y axis data 833 @type y: numpy array 834 @param nl: flag for performing a non linear least squares fit 835 @type nl: boolean 836 @param constant: fit also a constant 837 @type constant: boolean 838 @rtype: tuple 839 @return: A, mu, sigma (,C) 840 """ 841 #-- if a constant needs to be determined, take the median of the 5% lowest 842 # points (but take at least three): 843 if constant: 844 N = len(y) 845 C = np.median(np.sort(y)[:max(int(0.05*N),3)]) 846 else: 847 C = 0. 848 849 if window is not None: 850 win = (window[0]<=x) & (x<=window[1]) 851 x,y = x[win],y[win] 852 853 #-- transform to a polynomial function and perform a fit 854 # first clip where y==0 855 threshold *= max(y-C) 856 use_in_fit = (y-C)>=threshold 857 xc = x[use_in_fit] 858 yc = y[use_in_fit] 859 860 lnyc = np.log(yc-C) 861 p = np.polyfit(xc, lnyc, 2) 862 863 #-- determine constants 864 sigma = np.sqrt(-1. / (2.*p[0])) 865 mu = sigma**2.*p[1] 866 A = np.exp(p[2] + mu**2. / (2.*sigma**2.)) 867 868 #-- handle NaN Exceptions 869 if A!=A or mu!=mu or sigma!=sigma: 870 logger.error('Initial Gaussian fit failed') 871 init_success = False 872 A = (yc-C).max() 873 mu = xc[yc.argmax()] 874 sigma = xc.ptp()/3. 875 else: 876 init_success = True 877 878 #-- check chi2 879 if constant: 880 p0 = evaluate.gauss_preppars(np.asarray([A,mu,sigma,C])) 881 else: 882 p0 = evaluate.gauss_preppars(np.asarray([A,mu,sigma])) 883 yf = evaluate.gauss(xc,p0) 884 chi2 = np.sum((yc-yf)**2) 885 886 #-- perform non linear least square fit 887 if constant: 888 pars,e_pars,gain = optimize(x,y,p0,'gauss', minimizer='leastsq') 889 else: 890 pars,e_pars,gain = optimize(x,y,p0,'gauss', minimizer='leastsq') 891 if gain<0: 892 pars = p0 893 pars['sigma'] = np.abs(pars['sigma']) 894 if not full_output: 895 return pars,e_pars 896 else: 897 return pars,e_pars,p0,use_in_fit,init_success

898

899 900 #} 901 #{ Error determination 902 903 -def e_sine(times,signal,parameters,correlation_correction=True,limit=10000):

904 """ 905 Compute the errors on the parameters from a sine fit. 906 907 Note: errors on the constant are only calculated when the number of datapoints 908 is below 1000. Otherwise, the matrices involved become to huge. 909 910 @param times: time points 911 @type times: numpy array 912 @param signal: observations 913 @type signal: numpy array 914 @param parameters: record array containing the fitted parameters. This should 915 have columns 'ampl','freq' and 'phase', and optionally 'const'. 916 @type parameters: numpy record array 917 @param correlation_correction: set to True if you want to correct for correlation effects 918 @type correlation_correction: boolean 919 @param limit: Calculating the error on the constant requires the calculation 920 of a matrix of size Ndata x Nparam, which takes a long time for large datasets. 921 The routines skips the estimation of the error on the constant if the timeseries 922 is longer than C{limit} datapoints 923 @type limit: integer 924 @return: errors 925 @rtype: Nx4 array(, Nx3 array) 926 """ 927 #-- these are the parameters we need 928 freq = parameters['freq'] 929 phase = parameters['phase'] 930 amplitude = parameters['ampl'] 931 chisq = np.sum((signal - evaluate.sine(times,parameters))**2) 932 933 #-- for quick reference, we also need the dimensions of the data and 934 # fit parameters 935 Ndata = len(times) 936 Nparam = 2*len(parameters) 937 Nfreq = len(freq) 938 T = times.ptp() 939 940 #-- do we need to include the constant? 941 if 'const' in parameters.dtype.names: 942 constant = True 943 Nparam += 1 944 945 #-- these lists will contain the columns and their names 946 errors = [] 947 names = [] 948 949 #-- If error bars on the constant are needed, we do it here. The errors 950 # on the amplitude, frequency and phase are computed below. 951 if constant and Ndata<limit: 952 # First the derivative matrix: the 1st Nfreq columns the derivative w.r.t. 953 # the amplitude, the 2nd Nfreq columns w.r.t. the phase, and the if relevant 954 # the last column w.r.t. the constant. From this the covariance matrix. 955 F = np.zeros((Ndata, Nparam)) 956 for i in range(Ndata): 957 F[i,0:Nfreq] = sin(2*pi*freq*times[i] + phase) 958 F[i,Nfreq:2*Nfreq] = amplitude * cos(2*pi*freq*times[i] + phase) 959 #-- and for the constant 960 F[i,2*Nfreq] = 1.0 961 962 covariance = np.linalg.inv(np.dot(F.T, F)) 963 covariance *= chisq / (Ndata - Nparam) 964 965 #error_ampl = np.sqrt(covariance.diagonal()[:Nfreq]) 966 #error_phase = np.sqrt(covariance.diagonal()[Nfreq:2*Nfreq]) 967 error_const = np.sqrt(covariance[2*Nfreq,2*Nfreq]) 968 error_const = np.ones(Nfreq)*error_const 969 if len(error_const)>1: 970 error_const[1:] = 0 971 errors.append(error_const) 972 names.append('e_const') 973 elif constant: 974 errors.append(np.zeros(Nfreq)) 975 names.append('e_const') 976 977 #-- other variables 978 errors_ = np.zeros((Nfreq,3)) 979 residus = signal + 0. 980 for i in range(1,Nfreq+1): 981 residus -= evaluate.sine(times,parameters[i-1:i]) 982 a = parameters[i-1]['ampl'] 983 sigma_m = np.std(residus) 984 985 #-- error on amplitude, frequency and phase (in that order) 986 errors_[i-1,0] = np.sqrt(2./Ndata) * sigma_m 987 errors_[i-1,1] = np.sqrt(6./Ndata) * 1. / (pi*T) * sigma_m / a 988 errors_[i-1,2] = np.sqrt(2./Ndata) * sigma_m / a 989 990 #-- correct for correlation effects 991 if correlation_correction: 992 rho = max(1,get_correlation_factor(residus)) 993 errors_[i-1,0] *= np.sqrt(rho) 994 errors_[i-1,1] *= np.sqrt(rho) 995 errors_[i-1,2] *= np.sqrt(rho) 996 997 #-- collect results and return 998 e_parameters = np.rec.fromarrays(errors+[errors_[:,0],errors_[:,1],errors_[:,2]], 999 names=names + ['e_ampl','e_freq','e_phase']) 1000 1001 return e_parameters

1002

1003 1004 #{ Linear improvements 1005 1006 -def diffcorr(times, signal, parameters, func_name, \ 1007 max_iter=100, tol=1e-6, args=(), full_output=False):

1008 """ 1009 Differential corrections. 1010 """ 1011 1012 #-- ensure we have a flat array, and look for the functions to evaluate the 1013 # fits and the coefficients 1014 prep_func = getattr(evaluate,func_name+'_preppars') 1015 diff_func = getattr(evaluate,func_name+'_diffcorr') 1016 eval_func = getattr(evaluate,func_name) 1017 if parameters.dtype.names: 1018 parameters = prep_func(parameters) 1019 1020 #-- prepare arrays for coefficients and new parameters 1021 Deltas = np.zeros((max_iter,len(parameters))) 1022 params = np.zeros_like(Deltas) 1023 params[0] = parameters 1024 counter = 1 1025 while (counter==1) or (counter>0 and counter<max_iter and np.any(np.abs(Deltas[counter-1])>tol)): 1026 params[counter] = params[counter-1] + Deltas[counter-1] 1027 myfit = eval_func(times,params[counter],*args) 1028 coeff = diff_func(times,params[counter],*args) 1029 Delta,res,rank,s = la.lstsq(coeff,myfit-signal) 1030 Deltas[counter] = -Delta 1031 counter += 1 1032 1033 #-- transform the parameters to record arrays, as well as the steps 1034 parameters = prep_func(params[counter-1]) 1035 e_parameters = prep_func(Deltas[counter-1]) 1036 e_parameters.dtype.names = ['e_'+name for name in e_parameters.dtype.names] 1037 1038 if full_output: 1039 return parameters,e_parameters,0,params[:counter],Deltas[:counter] 1040 else: 1041 return parameters,e_parameters,0

1042

1043 1044 #} 1045 1046 1047 #{ Non-linear improvements 1048 1049 -def residuals(parameters,domain,data,evalfunc,weights,logar,*args):

1050 fit = evalfunc(domain,parameters,*args) 1051 if weights is None: 1052 weights = np.ones_like(data) 1053 if logar: 1054 return weights*(np.log10(data)-np.log10(fit)) 1055 else: 1056 return weights*(data-fit)

1057

1058 -def residuals_single(parameters,domain,data,evalfunc,weights,logar,*args):

1059 fit = evalfunc(domain,parameters,*args) 1060 if weights is None: 1061 weights = np.ones_like(data) 1062 if logar: 1063 return sum(weights*(np.log10(data)-np.log10(fit))**2) 1064 else: 1065 return sum(weights*(data-fit)**2)

1066

1067 -def optimize(times, signal, parameters, func_name, prep_func=None, 1068 minimizer='leastsq', weights=None, logar=False, args=()):

1069 """ 1070 Fit a function to data. 1071 """ 1072 #-- we need these function to evaluate the fit and to (un)pack the fitting 1073 # parameters from and to flat arrays 1074 if prep_func is None and isinstance(func_name,str): 1075 prepfunc = getattr(evaluate,func_name+'_preppars') 1076 else: 1077 prepfunc = None 1078 if isinstance(func_name,str): 1079 evalfunc = getattr(evaluate,func_name) 1080 else: 1081 evalfunc = func_name 1082 optifunc = getattr(scipy.optimize,minimizer) 1083 1084 #-- if no weights, everything has the same weight 1085 if weights is None: 1086 weights = np.ones_like(times) 1087 1088 #-- if the initial guess of the fitting parameters aren't flat, flatten them 1089 # here: 1090 if prepfunc is not None and parameters.dtype.names: 1091 parameters = prepfunc(parameters) 1092 init_guess = parameters.copy() 1093 #-- keep track of the initial chi square value, to check if there is an 1094 # improvement 1095 dof = (len(times)-len(init_guess)) 1096 signalf_init = evalfunc(times,init_guess) 1097 if logar: 1098 chisq_init = np.sum((np.log10(signalf_init)-np.log10(signal))**2) 1099 else: 1100 chisq_init = np.sum((signalf_init-signal)**2) 1101 1102 #-- optimize 1103 if minimizer=='leastsq': 1104 popt, cov, info, mesg, flag = optifunc(residuals,init_guess, 1105 args=(times,signal,evalfunc,weights,logar)+args,full_output=1)#,diag=[1.,10,1000,1.,100000000.]) 1106 #-- calculate new chisquare, and check if we have improved it 1107 chisq = np.sum(info['fvec']*info['fvec']) 1108 if chisq>chisq_init or flag!=1: 1109 logger.error('Optimization not successful [flag=%d] (%g --> %g)'%(flag,chisq_init,chisq)) 1110 chisq = np.inf 1111 1112 #-- derive the errors from the nonlinear fit 1113 if cov is not None: 1114 errors = np.sqrt(cov.diagonal()) * np.sqrt(chisq/dof) 1115 else: 1116 logger.error('Error estimation via optimize not successful') 1117 errors = np.zeros(len(popt)) 1118 else: 1119 out = optifunc(residuals_single,init_guess, 1120 args=(times,signal,evalfunc,weights),full_output=1,disp=False) 1121 popt = out[0] 1122 #-- calculate new chisquare, and check if we have improved it 1123 signalf_update = evalfunc(times,popt) 1124 chisq = np.sum((signalf_update-signal)**2) 1125 errors = np.zeros(len(popt)) 1126 if chisq>chisq_init: 1127 logger.error('Optimization not successful') 1128 1129 #-- gain in chi square: if positive, we gained, if negative, we lost... 1130 gain = (chisq_init-chisq)/chisq_init*100. 1131 1132 #-- transform the parameters to record arrays, as well as the errors 1133 if prepfunc is not None: 1134 parameters = prepfunc(popt) 1135 e_parameters = prepfunc(errors) 1136 e_parameters.dtype.names = ['e_'+name for name in e_parameters.dtype.names] 1137 else: 1138 parameters = popt 1139 e_parameters = errors 1140 1141 return parameters,e_parameters, gain

1142

1143 #=================================================================================================== 1144 #LMFIT functions 1145 1146 -class Function(object):

1147 """ 1148 Class to define a function with associated parameters. The function can be evaluated using the 1149 L{evaluate} method. Parameters can be added/updated, together with boundaries and expressions, 1150 and can be hold fixed and released by adjusting the vary keyword in L{setup_parameters}. 1151 1152 The provided function needs to take two arguments. The first one is a list of parameters, the 1153 second is a list of x-values for which the function will be evaluated. For example if you want 1154 to define a quadratic function it will look like: 1155 1156 >>> func = lambda pars, x: pars[0] + pars[1] * x + pars[2] * x**2 1157 1158 Functions can be combined using the L{Model} class, or can be fitted directly to data using the 1159 L{Minimizer} class. 1160 1161 To improve the fit, a jacobian can be provided as well. However, some care is nessessary when 1162 defining this function. When using the L{Minimizer} class to fit a Function to data, the 1163 residual function is defined as:: 1164 1165 residuals = data - Function.evaluate() 1166 1167 To derive the jacobian you have to take the derivatives of -1 * function. Furthermore the 1168 derivatives have to be across the rows. For the example quadratic function above, the jacobian 1169 would look like: 1170 1171 >>> jacobian = lambda pars, x: np.array([-np.ones(len(x)), -x, -x**2]).T 1172 1173 If you get bad results, try flipping all signs. The jacobian does not really improve the speed 1174 of the fitprocess, but it does help to reach the minimum in a more consistent way (see examples). 1175 1176 The internal representation of the parameters uses a parameter object of the U{lmfit 1177 <http://cars9.uchicago.edu/software/python/lmfit/index.html>} package. No knowledge of this 1178 repersentation is required as methods for direct interaction with the parameter values and 1179 other settings are provided. If wanted, the parameters object itself can be obtained with 1180 the L{parameters} attribute. 1181 """ 1182

1183 - def __init__(self, function=None, par_names=None, jacobian=None, resfunc=None):

1184 """ 1185 Create a new Function by providing the function of which it consists together with the names 1186 of each parameter in this function. You can specify the jacobian as well. 1187 1188 @param function: A function expression 1189 @type function: function 1190 @param par_names: The names of each parameter of function 1191 @type par_names: list of strings 1192 @param jacobian: A function expression for the jacobian of function. 1193 @type jacobian: function 1194 @param resfunc: Function to calculate the residuals. (Not obligatory) 1195 @type resfunc: function 1196 """ 1197 self.function = function 1198 self.par_names = par_names 1199 self.jacobian = jacobian 1200 self.resfunc = resfunc 1201 1202 #create an empty parameter set based on the parameter names 1203 self.parameters = None 1204 self.setup_parameters()

1205 1206 #{ Interaction 1207

1208 - def evaluate(self,x, *args, **kwargs):

1209 """ 1210 Evaluate the function for the given values and optional the given parameter object. 1211 If no parameter object is given then the parameter object belonging to the function 1212 is used. 1213 1214 >>> #func.evaluate(x, parameters) 1215 >>> #func.evaluate(x) 1216 1217 @param x: the independant data for which to evaluate the function 1218 @type x: numpy array 1219 1220 @return: Function(x), same size as x 1221 @rtype: numpy array 1222 """ 1223 if len(args) == 0: 1224 #-- Use the parameters belonging to this object 1225 pars = [] 1226 for name in self.par_names: 1227 pars.append(self.parameters[name].value) 1228 1229 return self.function(pars,x, **kwargs) 1230 1231 if len(args) == 1: 1232 #-- Use the provided parameters 1233 #-- if provided as a ParameterObject 1234 if isinstance(args[0],dict): 1235 pars = [] 1236 for name in self.par_names: 1237 pars.append(args[0][name].value) 1238 else: 1239 pars = args[0] 1240 1241 return self.function(pars,x, **kwargs)

1242

1243 - def evaluate_jacobian(self, x, *args):

1244 """ 1245 Evaluates the jacobian if that function is provided, using the given parameter object. 1246 If no parameter object is given then the parameter object belonging to the function 1247 is used. 1248 """ 1249 if self.jacobian == None: 1250 return [0.0 for i in self.par_names] 1251 1252 if len(args) == 0: 1253 #-- Use the parameters belonging to this object 1254 pars = [] 1255 for name in self.par_names: 1256 pars.append(self.parameters[name].value) 1257 1258 return self.jacobian(pars,x) 1259 1260 if len(args) == 1: 1261 #-- Use the provided parameters 1262 #-- if provided as a ParameterObject 1263 if isinstance(args[0],dict): 1264 pars = [] 1265 for name in self.par_names: 1266 pars.append(args[0][name].value) 1267 else: 1268 pars = args[0] 1269 1270 return self.jacobian(pars,x)

1271

1272 - def setup_parameters(self, value=None, bounds=None, vary=None, expr=None, **kwargs):

1273 """ 1274 Create or adjust a parameter object based on the parameter names and if provided 1275 the values, bounds, vary and expressions. Basic checking if the parameter boundaries 1276 are consistent is performed. 1277 1278 Example Use: 1279 1280 >>> setup_parameters(values=[v1, v2, ...], bounds=[(b1, b2), (b3, b4), ...], vary=[True, False, ...]) 1281 1282 @param values: The values of the parameters 1283 @type values: array 1284 @param bounds: min and max boundaries on the parameters [(min,max),...] 1285 @type bounds: array of tuples 1286 @param vary: Boolean array, true to vary a parameter, false to keep it fixed in the fitting process 1287 @type vary: array 1288 @param exprs: array of expressions that the paramters have to follow 1289 @type exprs: array 1290 """ 1291 nrpars = len(self.par_names) 1292 if value == None: 1293 value = kwargs['values'] if 'values' in kwargs else [0 for i in range(nrpars)] 1294 if bounds == None: 1295 bounds = np.array([[None,None] for i in range(nrpars)]) 1296 else: 1297 bounds = np.asarray(bounds) 1298 if vary == None: 1299 vary = [True for i in range(nrpars)] 1300 if expr == None: 1301 expr = kwargs['exprs'] if 'exprs' in kwargs else [None for i in range(nrpars)] 1302 1303 min = kwargs['min'] if 'min' in kwargs else bounds[:,0] 1304 max = kwargs['max'] if 'max' in kwargs else bounds[:,1] 1305 1306 def check_boundaries(min, max, value, name): 1307 #-- Check if boundaries are consistent 1308 if max != None and min != None and max < min: 1309 min, max = max, min 1310 logging.warning('Parameter %s: max < min, switched boundaries!'%(name)) 1311 if min != None and value < min: 1312 min = value 1313 logging.warning('Parameter %s: value < min, adjusted min!'%(name)) 1314 if max != None and value > max: 1315 max = value 1316 logging.warning('Parameter %s: value > max, adjusted max!'%(name)) 1317 return min, max

1318 1319 if self.parameters == None: 1320 #-- Create a new parameter object 1321 self.parameters = lmfit.Parameters() 1322 for i,name in enumerate(self.par_names): 1323 min_, max_ = check_boundaries(min[i], max[i], value[i], name) 1324 self.parameters.add(name, value=value[i], vary=vary[i], min=min_, max=max_, expr=expr[i]) 1325 else: 1326 #-- Adjust an existing parameter object 1327 for i,name in enumerate(self.par_names): 1328 min_, max_ = check_boundaries(min[i], max[i], value[i], name) 1329 self.parameters[name].value = float(value[i]) 1330 self.parameters[name].user_value = float(value[i]) 1331 self.parameters[name].vary = bool(vary[i]) if vary[i] != None else True 1332 self.parameters[name].min = min_ 1333 self.parameters[name].max = max_ 1334 self.parameters[name].expr = expr[i]

1335

1336 - def update_parameter(self, parameter=None, **kwargs):

1337 """ 1338 Updates a specified parameter. The parameter can be given by name or by index. This 1339 method also supports the use of min and max keywords to set the lower and upper boundary 1340 seperatly. 1341 1342 Example Use: 1343 1344 >>> update_parameter(parameter='parname', value=10.0, min=5.0, vary=True) 1345 >>> update_parameter(parameter=2, value=0.15, bounds=(0.10, 0.20)) 1346 """ 1347 1348 if type(parameter) == int: 1349 parameter = self.parameters[self.par_names[parameter]] 1350 elif type(parameter) == str: 1351 parameter = self.parameters[parameter] 1352 1353 #-- if bounds is provided, break them up in min and max. 1354 if 'bounds' in kwargs: 1355 kwargs['min'] = kwargs['bounds'][0] 1356 kwargs['max'] = kwargs['bounds'][1] 1357 1358 if 'value' in kwargs: 1359 kwargs['user_value'] = kwargs['value'] 1360 1361 for key in ['value', 'min', 'max', 'vary', 'expr']: 1362 if key in kwargs: 1363 setattr(parameter, key, kwargs[key])

1364

1365 - def get_parameters(self, parameters=None, error='stderr', full_output=False):

1366 """ 1367 Returns the parameter values together with the errors if they exist. If No 1368 fitting has been done, or the errors could not be calculated, None is returned 1369 for the error. 1370 1371 The parameters of which the settings should be returned can be given in 1372 I{parameters}. If None is given, all parameters are returned. 1373 1374 @param parameters: Parameters to be returned or None for all parameters. 1375 @type parameters: array 1376 @param error: Which error to return ('stderr', 'mcerr') 1377 @type error: string 1378 @param full_output: When True, also vary, the boundaries and expression are returned 1379 @type full_output: bool 1380 1381 @return: the parameter values and there errors: value, err, [vary, min, max, expr] 1382 @rtype: numpy arrays 1383 """ 1384 if type(parameters) == str: 1385 parameters = [parameters] 1386 1387 pnames = parameters if parameters != None else self.par_names 1388 1389 1390 out = [] 1391 for name in pnames: 1392 par = self.parameters[name] 1393 out.append([par.value,getattr(par, error), par.vary, par.min, 1394 par.max, par.expr]) 1395 out = np.array(out) 1396 1397 if full_output: 1398 return out.T 1399 else: 1400 return out[:,0], out[:,1]

1401 1402 #} 1403 1404 #{ Print functions 1405

1406 - def param2str(self, **kwargs):

1407 """ 1408 Converts the parameter object of this model to an easy printable string, including 1409 the value, error, boundaries, if the parameter is varied, and if in the fitting process 1410 on of the boundaries was reached. 1411 1412 The error to be printed can be set with the error keyword. You can chose from the 1413 standard error: 'stderr', the monte carlo error: 'mcerr', or any of the confidence 1414 intervalls that you have calculated by coding them like: 'ci###'. Fx: 95% (sigma = 0.95) 1415 use 'ci95', for 99.7% (sigma = 0.997) use 'ci997'. Don't put decimal signs in the ci! 1416 1417 The accuracy with which the parameters are printed can be set with the accuracy keyword. 1418 And the amount of information that is printed is determined by full_output. If False, 1419 only parameter value and error are printed, if True also boundaries and vary are shown. 1420 1421 @param accuracy: Number of decimal places to print 1422 @type accuracy: int 1423 @param error: Which error type to print ('stderr', 'mcerr' or 'ci###') 1424 @type error: string 1425 @param full_output: Amount of information to print 1426 @type full_output: bool 1427 1428 @return: parameters in string format 1429 @rtype: string 1430 """ 1431 return parameters2string(self.parameters, **kwargs)

1432

1433 - def correl2str(self, **kwargs):

1434 """ 1435 Convert the correlations between the different parameters of this function to an 1436 easy printable string. The correlations are only printed if they are larger than 1437 a certain provided limit. And the accuracy keyword sets the amount of decimals 1438 to print 1439 1440 @param accuracy: number of decimal places to print 1441 @param limit: minimum correlation value to print 1442 1443 @return: correlation in string format 1444 @rtype: string 1445 """ 1446 return correlation2string(self.parameters, **kwargs)

1447

1448 - def ci2str(self, **kwargs):

1449 """ 1450 Convert the confidence intervals of the parameters of this model to an easy 1451 printable string. 1452 1453 The accuracy with wich the CIs should be printed can be set with the accuracy 1454 keyword. 1455 1456 @param accuracy: Number of decimal places to print 1457 @type accuracy: int 1458 1459 @return: confidence intervals in string format 1460 @rtype: string 1461 """ 1462 return confidence2string(self.parameters, **kwargs)

1463 1464 #} 1465 1466 #{ Internal 1467

1468 - def __str__(self):

1469 """ String representation of the Function object """ 1470 pnames = ", ".join(self.par_names) 1471 name = "<Function: '{:s}' with parameters: {:s}>".format(self.function.__name__, pnames) 1472 return name

1473

1474 #} 1475 1476 1477 -class Model(object):

1478 """ 1479 Class to create a model using different L{Function}s each with their associated parameters. 1480 This Model can then be used to fit data using the L{Minimizer} class. The Model can be 1481 evaluated using the L{evaluate} method. Parameters can be added/updated, together with 1482 boundaries and expressions, and can be hold fixed or adjusted by changing the vary keyword 1483 in L{update_parameter}. 1484 1485 Parameters for the different Functions are combined. To keep track of which parameter is 1486 which, they get a number added to the end indicating the function they belong to. For example: 1487 when a Model is created summing a gaussian function with a quadratic function. The gaussian has 1488 parameters [a, mu, sigma, c] and the quadratic function has parameters [s0, s1, s2]. If the 1489 functions are provided in order [Gaussian, Quadratic], the parameters will be renames to: 1490 [a_0, mu_0, sigma_0, c_0] and [s0_1, s1_1, s2_1]. Keep in mind that this renaming only happens 1491 in the Model object. In the underlying Functions the parameters will keep there original name. 1492 1493 The functions themselfs can be combined using a mathematical expression in the constructor. 1494 If no expression is provided, the output of each function is summed up. Keep in mind that each 1495 function needs to have the same output shape:: 1496 1497 Model(x) = Function1(x) + Function2(x) + ... 1498 1499 The provided expression needs to be a function taking an array with the results of the 1500 Functions in the model as arguments, and having an numpy array as result. This can be done 1501 with simple I{lambda} expression or more complicated functions:: 1502 1503 Model(x) = Expr( [Function1(x),Function2(x),...] ) 1504 1505 The internal representation of the parameters uses a parameter object of the U{lmfit 1506 <http://cars9.uchicago.edu/software/python/lmfit/index.html>} package. No knowledge of this 1507 repersentation is required as methods for direct interaction with the parameter values and 1508 other settings are provided. If wanted, the parameters object itself can be obtained with 1509 the L{parameters} attribute. 1510 1511 At the moment the use of a jacobian is not supported at the Model level as it is not possible 1512 to derive a symbolic jacobian from the provided functions. If you want to use a jacobian you 1513 will have to write a Function yourself in which you can provide a jacobian function. 1514 """ 1515

1516 - def __init__(self, functions=None, expr=None, resfunc=None):

1517 """ 1518 Create a new Model by providing the functions of which it consists. You can provid an 1519 expression describing how the Functions have to be combined as well. This expression needs 1520 to take the out put of the Fuctions in an array as argument, and provide a new numpy array 1521 as result. 1522 1523 @param functions: A list of L{Function}s 1524 @type functions: list 1525 @param expr: An expression to combine the given functions. 1526 @type expr: function 1527 @param resfunc: A function to calculate the residuals (Not obligatory) 1528 @type resfunc: function 1529 """ 1530 self.functions = functions 1531 self.expr = expr 1532 self.resfunc = resfunc 1533 self.jacobian = None 1534 self._par_names = None 1535 self.parameters = None 1536 1537 #-- Combine the parameters 1538 self.pull_parameters()

1539 1540 #{ Interaction 1541

1542 - def evaluate(self, x, *args, **kwargs):

1543 """ 1544 Evaluate the model for the given values and optional a given parameter object. 1545 If no parameter object is given then the parameter object belonging to the model 1546 is used. 1547 1548 >>> evaluate(x, parameters) 1549 >>> evaluate(x) 1550 1551 @param x: the independant values for which to evaluate the model. 1552 @type x: array 1553 1554 @return: Model(x) 1555 @rtype: numpy array 1556 """ 1557 if len(args) == 0: 1558 #-- Use the parameters belonging to this object 1559 parameters = self.parameters 1560 elif len(args) == 1: 1561 #-- Use the provided parameters 1562 parameters = args[0] 1563 1564 #-- Update the parameters of the individual functions before calling them 1565 self.push_parameters(parameters=parameters) 1566 1567 #-- For each function, read the arguments and calculate the result 1568 if self.expr == None: 1569 result = np.zeros(len(x)) 1570 for function in self.functions: 1571 result += function.evaluate(x, **kwargs) 1572 else: 1573 result = [] 1574 for function in self.functions: 1575 result.append(function.evaluate(x, **kwargs)) 1576 result = self.expr(result) 1577 1578 return result

1579

1580 - def evaluate_jacobian(self, x, *args):

1581 """ 1582 Not implemented! 1583 """ 1584 return [0.0 for p in self.parameters]

1585

1586 - def setup_parameters(self,values=None, bounds=None, vary=None, exprs=None):

1587 """ 1588 Not implemented yet, use the L{setup_parameters} method of the Functions 1589 themselfs, or for adjustments of a single parameter use the L{update_parameter} 1590 function 1591 1592 Please provide feedback on redmine on how you would like to use this function!!! 1593 """ 1594 if values is None: values = [None for i in self.functions] 1595 if bounds is None: bounds = [None for i in self.functions] 1596 if vary is None: vary = [None for i in self.functions] 1597 if exprs is None: exprs = [None for i in self.functions] 1598 1599 for i,func in enumerate(self.functions): 1600 func.setup_parameters(values=values[i],bounds=bounds[i],vary=vary[i],exprs=exprs[i]) 1601 1602 self.pull_parameters()

1603

1604 - def update_parameter(self, parameter=None, **kwargs):

1605 """ 1606 Updates a specified parameter. The parameter can be given by name or by index. 1607 However, you have to be carefull when using the names. The model class changes 1608 the names of the parameters of the underlying functions based on the order in 1609 which the functions are provided (See introduction). This method also supports 1610 the use of kwargs min and max to set the lower 1611 and upper boundary separatly. 1612 1613 Example use: 1614 1615 >>> update_parameter(parameter='parname_0', value=10.0, min=5.0, vary=True) 1616 >>> update_parameter(parameter=2, value=0.15, bounds=(0.10, 0.20)) 1617 """ 1618 1619 if type(parameter) == int: 1620 parameter = self.parameters[self._par_names[parameter]] 1621 elif type(parameter) == str: 1622 parameter = self.parameters[parameter] 1623 1624 #-- if bounds is provided, break them up in min and max. 1625 if 'bounds' in kwargs: 1626 kwargs['min'] = kwargs['bounds'][0] 1627 kwargs['max'] = kwargs['bounds'][1] 1628 1629 for key in ['value', 'min', 'max', 'vary', 'expr']: 1630 if key in kwargs: 1631 setattr(parameter, key, kwargs[key]) 1632 1633 self.push_parameters()

1634

1635 - def get_parameters(self, parameters=None, error='stderr', full_output=False):

1636 """ 1637 Returns the parameter values together with the errors if they exist. If No 1638 fitting has been done, or the errors could not be calculated, None is returned 1639 for the error. 1640 1641 The parameters of which the settings should be returned can be given in 1642 I{parameters}. If None is given, all parameters are returned. 1643 1644 @param parameters: Parameters to be returned or None for all parameters. 1645 @type parameters: array 1646 @param error: Which error to return ('stderr', 'mcerr') 1647 @type error: string 1648 @param full_output: When True, also vary, the boundaries and expression are returned 1649 @type full_output: bool 1650 1651 @return: the parameter values and there errors: value, err, [vary, min, max, expr] 1652 @rtype: numpy arrays 1653 """ 1654 if type(parameters) == str: 1655 parameters = [parameters] 1656 pnames = parameters if parameters != None else self.parameters.keys() 1657 1658 out = [] 1659 for name in pnames: 1660 par = self.parameters[name] 1661 out.append([par.value,getattr(par, error), par.vary, par.min, 1662 par.max, par.expr]) 1663 out = np.array(out) 1664 1665 if full_output: 1666 return out.T 1667 else: 1668 return out[:,0], out[:,1]

1669 1670 #} 1671 1672 #{ Print functions 1673

1674 - def param2str(self, **kwargs):

1675 """ 1676 Converts the parameter object of this model to an easy printable string, including 1677 the value, error, boundaries, if the parameter is varied, and if in the fitting process 1678 on of the boundaries was reached. 1679 1680 The error to be printed can be set with the error keyword. You can chose from the 1681 standard error: 'stderr', the monte carlo error: 'mcerr', or any of the confidence 1682 intervalls that you have calculated by coding them like: 'ci###'. Fx: 95% (sigma = 0.95) 1683 use 'ci95', for 99.7% (sigma = 0.997) use 'ci997'. Don't put decimal signs in the ci! 1684 1685 The accuracy with which the parameters are printed can be set with the accuracy keyword. 1686 And the amount of information that is printed is determined by full_output. If False, 1687 only parameter value and error are printed, if True also boundaries and vary are shown. 1688 1689 @param accuracy: Number of decimal places to print 1690 @type accuracy: int 1691 @param error: Which error type to print ('stderr', 'mcerr' or 'ci###') 1692 @type error: string 1693 @param full_output: Amount of information to print 1694 @type full_output: bool 1695 1696 @return: parameters in string format 1697 @rtype: string 1698 """ 1699 return parameters2string(self.parameters, **kwargs)

1700

1701 - def correl2str(self, **kwargs):

1702 """ 1703 Convert the correlations between the different parameters of this model to an 1704 easy printable string. The correlations are only printed if they are larger than 1705 a certain provided limit. And the accuracy keyword sets the amount of decimals 1706 to print 1707 1708 @param accuracy: number of decimal places to print 1709 @param limit: minimum correlation value to print 1710 1711 @return: correlation in string format 1712 @rtype: string 1713 """ 1714 return correlation2string(self.parameters, **kwargs)

1715

1716 - def ci2str(self, **kwargs):

1717 """ 1718 Convert the confidence intervals of the parameters of this model to an easy 1719 printable string. 1720 1721 The accuracy with wich the CIs should be printed can be set with the accuracy 1722 keyword. 1723 1724 @param accuracy: Number of decimal places to print 1725 @type accuracy: int 1726 1727 @return: confidence intervals in string format 1728 @rtype: string 1729 """ 1730 return confidence2string(self.parameters, **kwargs)

1731 1732 #} 1733 1734 #{ Advanced attributes 1735 1736 @property

1737 - def par_names(self):

1738 'get par_names' 1739 return [val for sublist in self._par_names for val in sublist]

1740 1741 #@par_names.setter 1742 #def par_names(self, val): 1743 #'set par_names' 1744 #self._par_names = val 1745 1746 #} 1747 1748 #{ Internal 1749

1750 - def pull_parameters(self):

1751 """ 1752 Pulls the parameter objects from the underlying functions, and combines it to 1 parameter object. 1753 """ 1754 functions = self.functions 1755 parameters = [] 1756 for func in functions: 1757 parameters.append(func.parameters) 1758 1759 #-- Create new parameter object 1760 new_params = lmfit.Parameters() 1761 pnames = [] 1762 for i, params in enumerate(parameters): 1763 pname = [] 1764 for n,par in params.items(): 1765 pname.append(n+'_%i'%(i)) 1766 new_params.add(n+'_%i'%(i), value=par.value, vary=par.vary, min=par.min, max=par.max, expr=par.expr) 1767 pnames.append(pname) 1768 1769 self._par_names = pnames 1770 self.parameters = new_params

1771

1772 - def push_parameters(self, parameters=None):

1773 """ 1774 Pushes the parameters in the combined parameter object to the parameter objects of the underlying 1775 models or functions. 1776 """ 1777 if parameters == None: 1778 parameters = self.parameters 1779 for pnames,function in zip(self._par_names, self.functions): 1780 old_parameters = function.parameters 1781 for name in pnames: 1782 old_name = re.sub('_[0123456789]*$','',name) 1783 old_parameters[old_name] = parameters[name]

1784

1785 - def __str__(self):

1786 """ String representation of the Model object """ 1787 fnames = ", ".join([f.function.__name__ for f in self.functions]) 1788 expr = self.expr if self.expr != None else "Sum()" 1789 name = "<Model with functions: [{:s}] combined by: {:s}>" 1790 return name.format(fnames, expr)

1791

1792 #} 1793 1794 1795 -class Minimizer(object):

1796 """ 1797 A minimizer class on the U{lmfit <http://cars9.uchicago.edu/software/python/lmfit/index.html>} 1798 Python package, which provides a simple, flexible interface to non-linear 1799 least-squares optimization, or curve fitting. The package is build around the 1800 Levenberg-Marquardt algorithm of scipy, but 2 other minimizers: limited memory 1801 Broyden-Fletcher-Goldfarb-Shanno and simulated annealing are partly supported as 1802 well. For the Levenberg-Marquardt algorithm, the estimated uncertainties and 1803 correlation between fitted variables are calculated as well. 1804 1805 This minimizer finds the best parameters to fit a given L{Model} or L{Function} to a 1806 set of data. You only need to provide the Model and data. Weighted fitting is 1807 supported. 1808 1809 This minimizer uses the basic residual function:: 1810 1811 residuals = ( data - model(x) ) * weights 1812 1813 If a more advanced residual functions is required, fx when working with multi 1814 dimentional data, the used can specify its own residual function in the provided 1815 Function or Model, or by setting the I{resfunc} keyword. This residual function needs 1816 to be of the following call sign:: 1817 1818 resfunc(synth, data, weights=None, errors=None, **kwargs) 1819 1820 Functions 1821 ========= 1822 1823 A L{Function} is basicaly a function together with a list of the parameters that is 1824 needs. In the internal representation the parameters are represented as a Parameters 1825 object. However, the user does not need to now how to handle this, and can just 1826 provided or retrieve the parameter values using arrays. Every fitting parameter are 1827 extensions of simple numerical variables with the following properties: 1828 1829 - Parameters can be fixed or floated in the fit. 1830 - Parameters can be bounded with a minimum and/or maximum value. 1831 - Parameters can be written as simple mathematical expressions of other 1832 Parameters, using the U{asteval module <http://newville.github.com/asteval/>}. 1833 These values will be re-evaluated at each step in the fit, so that the 1834 expression is satisfied. This gives a simple but flexible approach to 1835 constraining fit variables. 1836 1837 Models 1838 ====== 1839 1840 A L{Model} is a combination of Functions with its own parameter set. The Functions 1841 are provided as a list, and a gamma function can be given to describe how the 1842 functions are combined. Methods to handle the parameters in a model are provided, but 1843 the user is recommended handle the parameters at Function level as the naming of the 1844 parameters changes at Model level. 1845 """ 1846

1847 - def __init__(self, x, y, model, errors=None, weights=None, resfunc=None, 1848 engine='leastsq', args=None, kws=None, grid_points=1, grid_params=None, 1849 verbose=False, **kwargs):

1850 1851 self.x = x 1852 self.y = y 1853 self.model = model 1854 self.errors = errors 1855 self.weights = weights 1856 self.model_kws = kws 1857 self.fit_kws = kwargs #fx: iter_cb, scale_covar 1858 self.resfunc = model.resfunc 1859 self.engine = engine 1860 self._minimizers = [None] 1861 1862 if weights == None: 1863 self.weights = np.ones(len(y)) # if no weigths definded set them all at one. 1864 1865 if resfunc != None: 1866 self.resfunc = resfunc # if residual function is provided, use that one. 1867 1868 params = model.parameters 1869 1870 #-- Setup the residual function and the lmfit.minimizer object 1871 self._setup_residual_function() 1872 self._setup_jacobian_function() 1873 fcn_args = (self.x, self.y) 1874 fcn_kws = dict(weights=self.weights, errors=self.errors) 1875 if self.model_kws != None: 1876 fcn_kws.update(self.model_kws) 1877 1878 #-- Setup the Minimizer object 1879 self._prepare_minimizer(fcn_args, fcn_kws, grid_points, grid_params) 1880 1881 #-- Actual fitting 1882 self._start_minimize(engine, verbose=verbose, Dfun=self.jacobian)

1883 1884 #{ Error determination 1885

1886 - def calculate_CI(self, parameters=None, sigmas=[0.654, 0.95, 0.997], maxiter=200, 1887 **kwargs):

1888 """ 1889 Returns the confidence intervalls of the given parameters. This function uses 1890 the F-test method described below to calculate confidence intervalls. The 1891 I{sigma} parameter describes which confidence level is required in percentage: 1892 sigma=0.654 corresponds with the standard 1 sigma level, 0.95 with 2 sigma and 1893 0.997 with 3 sigma. 1894 1895 The output is a dictionary containing for each parameter the lower and upper 1896 boundary of the asked confidence level. If short_output is True, an array of 1897 tupples is returned instead. When only one parameter is given, and short_output 1898 is True, only a tupple of the lower and upper boundary is returned. 1899 1900 The confidence intervalls calculated with this function are stored in the 1901 Model or Function as well. 1902 1903 F-test 1904 ====== 1905 The F-test is used to compare the null model, which is the best fit 1906 found by the minimizer, with an alternate model, where one of the 1907 parameters is fixed to a specific value. The value is changed util the 1908 differnce between chi2_0 and chi2_f can't be explained by the loss of a 1909 degree of freedom with a certain confidence. 1910 1911 M{F = (chi2_f / chi2_0 - 1) * (N-P)/P_fix} 1912 1913 N is the number of data-points, P the number of parameter of the null model. 1914 P_fix is the number of fixed parameters (or to be more clear, the difference 1915 of number of parameters betweeen the null model and the alternate model). 1916 1917 This method relies completely on the I(conf_interval) method of the lmfit 1918 package. 1919 1920 @param parameters: Names of the parameters to calculate the CIs from (if None, 1921 all parameters are used) 1922 @type parameters: array of strings 1923 @param sigmas: The probability levels used to calculate the CI 1924 @type sigmas: array or float 1925 1926 @return: the confidence intervals. 1927 @rtype: dict 1928 """ 1929 #-- check if a special probability function is provided. 1930 prob_func = kwargs.pop('prob_func', None) 1931 1932 if parameters == None: 1933 parameters = self.model.par_names 1934 elif type(parameters) == str: 1935 parameters = [parameters] 1936 1937 if not hasattr(sigmas, "__iter__"): 1938 sigmas = [sigmas] 1939 if 'sigma' in kwargs: 1940 sigmas = [kwargs['sigma']] 1941 print 'WARNING: sigma is depricated, use sigmas' 1942 1943 #-- Use the adjusted conf_interval() function of the lmfit package. 1944 # We need to work on a copy of the minimizer and make a backup of 1945 # the parameter object cause lmfit messes up the minimizer when 1946 # running conf_interval 1947 mini = copy.copy(self.minimizer) 1948 backup = copy.deepcopy(self.model.parameters) 1949 ci = lmfit.conf_interval(mini, p_names=parameters, sigmas=sigmas, 1950 maxiter=maxiter, prob_func=prob_func, trace=False, 1951 verbose=False) 1952 self.model.parameters = backup 1953 1954 #-- store the CI values in the parameter object 1955 for key, value in ci.items(): 1956 self.model.parameters[key].cierr.update(value) 1957 1958 return ci

1959

1960 - def calculate_CI_2D(self, xpar=None, ypar=None, res=10, limits=None, ctype='prob'):

1961 """ 1962 Calculates the confidence interval for 2 given parameters. Both the confidence interval 1963 calculated using the F-test method from the I{estimate_error} method, and the normal chi 1964 squares can be obtained using the I{ctype} keyword. 1965 1966 The confidence intervall is returned as a grid, together with the x and y distribution of 1967 the parameters: (x-values, y-values, grid) 1968 1969 @param xname: The parameter on the x axis 1970 @param yname: The parameter on the y axis 1971 @param res: The resolution of the grid over which the confidence intervall is calculated 1972 @param limits: The upper and lower limit on the parameters for which the confidence 1973 intervall is calculated. If None, 5 times the stderr is used. 1974 @param ctype: 'prob' for probabilities plot (using F-test), 'chi2' for chi-squares. 1975 1976 @return: the x values, y values and confidence values 1977 @rtype: (array, array, 2d array) 1978 """ 1979 1980 xn = hasattr(res,'__iter__') and res[0] or res 1981 yn = hasattr(res,'__iter__') and res[1] or res 1982 1983 prob_func = None 1984 if ctype == 'chi2': 1985 def prob_func(Ndata, Nparas, new_chi, best_chi, nfix=1.): 1986 return new_chi

1987 old = np.seterr(divide='ignore') #turn division errors off temporary 1988 x, y, grid = lmfit.conf_interval2d(self.minimizer, xpar, ypar, xn, yn, limits=limits, 1989 prob_func=prob_func) 1990 np.seterr(divide=old['divide']) 1991 1992 if ctype=='prob': 1993 grid *= 100. 1994 1995 return x, y, grid 1996

1997 - def calculate_MC_error(self, points=100, errors=None, distribution='gauss', 1998 short_output=True, verbose=True, **kwargs):

1999 """ 2000 Use Monte-Carlo simulations to estimate the error of each parameter. In this 2001 approach each datapoint is perturbed by its error, and for each new dataset 2002 best fitting parameters are calculated. The MC error of a parameter is its 2003 deviation over all iterations (points). 2004 2005 The errors supplied to this function will overwrite the errors already stored 2006 in this Minimizer. If however no errors are supplied, the already stored ones 2007 will be used. For now only symetric errors are supported. 2008 2009 Currently all datapoints are considered to have a symetric gaussian distribution, 2010 but in future version more distributions will be supported. 2011 2012 The MC errors are saved in the Model or Function supplied to this fitter, and 2013 can be returned as an array (short_output=True), or as a dictionary 2014 (short_output=False). 2015 2016 @param points: The number of itterations 2017 @type points: int 2018 @param errors: Possible new errors on the input data. 2019 @type errors: array or float 2020 @param distribution: Not yet supported 2021 @type distribution: str 2022 @param short_output: True if you want array, False if you want dictionary 2023 @type short_output: bool 2024 2025 @return: The MC errors of all parameters. 2026 @rtype: array or dict 2027 """ 2028 if errors != None: 2029 self.errors = errors 2030 2031 perturb_args = dict(distribution=distribution) 2032 perturb_args.update(kwargs) 2033 params = np.empty(shape=(points), dtype=object) 2034 2035 #-- perturb the data 2036 y_perturbed = self._perturb_input_data(points, **perturb_args) 2037 2038 if verbose: print "MC simulations ({:.0f} points):".format(points) 2039 if verbose: Pmeter = progress.ProgressMeter(total=points) 2040 for i, y_ in enumerate(y_perturbed): 2041 if verbose: Pmeter.update(1) 2042 2043 #-- setup the fit 2044 pars = copy.deepcopy(self.model.parameters) 2045 fcn_args = (self.x, y_) 2046 fcn_kws = dict(weights=self.weights, errors=self.errors) 2047 if self.model_kws != None: 2048 fcn_kws.update(self.model_kws) 2049 result = lmfit.Minimizer(self.residuals, pars, fcn_args=fcn_args, 2050 fcn_kws=fcn_kws, **self.fit_kws) 2051 2052 #-- run the fit and save the results 2053 result.start_minimize(self.engine, Dfun=self.jacobian) 2054 params[i] = pars 2055 2056 pnames, mcerrors = self._mc_error_from_parameters(params) 2057 2058 if short_output: 2059 return mcerrors 2060 else: 2061 out = dict() 2062 for name, err in zip(pnames, mcerrors): 2063 out[name] = err 2064 return out

2065 2066 #} 2067 2068 #{ Plotting Functions 2069

2070 - def plot_fit(self, points=1000, axis=0, legend=False, **kwargs):

2071 """ 2072 Plot the original data together with the best fit results. This method has some 2073 functionality to plot multi-dimensional data, but is limited to 2D data which it 2074 will plot consecutively allong the specified axis. 2075 2076 @param points: Number of points to use when plotting the best fit model. 2077 @type points: int 2078 @param axis: In case of multi-dim input along which axis to split (0 or 1) 2079 @type axis: int 2080 @param legend: Display legend on plot 2081 @type legend: bool 2082 """ 2083 2084 #-- transform to 2D if nessessary 2085 res = np.atleast_2d(self.y - self.model.evaluate(self.x)) 2086 x, y = np.atleast_2d(self.x), np.atleast_2d(self.y) 2087 err = np.atleast_2d(self.errors) if self.errors != None else np.zeros_like(self.x) 2088 2089 #-- transpose if the axis is 1 2090 if axis == 1: x, y, res, err = x.T, y.T, res.T, err.T 2091 2092 #-- create synthetic x-data 2093 xf = np.empty((len(x), points), dtype=float) 2094 for i, x_ in enumerate(x): 2095 xf[i] = np.linspace(np.min(x_), np.max(x_), points) 2096 2097 #-- calculate the synthetic y-data 2098 yf = np.atleast_2d(self.model.evaluate(np.squeeze(xf))) 2099 2100 #-- setup a colorMap 2101 cmap = cm = pl.get_cmap('spectral') 2102 norm = mpl.colors.Normalize(vmin=-len(x)-1, vmax=len(x)+1) 2103 color = mpl.cm.ScalarMappable(norm=norm, cmap=cmap) 2104 2105 #-- plot the data and fit 2106 for i, (x_, y_, e_) in enumerate(zip(x, y, err)): 2107 pl.errorbar(x_, y_, yerr=e_, marker='+', ms=5, ls='', color=color.to_rgba(-i-1), 2108 label='data %i'%(i+1)) 2109 for i, (xf_, yf_) in enumerate(zip(xf, yf)): 2110 pl.plot(xf_, yf_, ls='-', color=color.to_rgba(i+1), label='fit %i'%(i+1)) 2111 2112 if legend: 2113 pl.legend()

2114

2115 - def plot_residuals(self, axis=0, legend=False, **kwargs):

2116 """ 2117 Plot the residuals of the best fit. This method has some functionality to plot 2118 multi-dimensional data, but is limited to 2D data which it will plot 2119 consecutively allong the specified axis. 2120 2121 @param axis: In case of multi-dim input along which axis to split (0 or 1) 2122 @type axis: int 2123 @param legend: Display legend on plot 2124 @type legend: bool 2125 """ 2126 2127 #-- transform to 2D if nessessary 2128 res = np.atleast_2d(self.y - self.model.evaluate(self.x)) 2129 x = np.atleast_2d(self.x) 2130 err = np.atleast_2d(self.errors) if self.errors != None else np.zeros_like(self.x) 2131 if axis == 1: x, res, err = x.T, res.T, err.T 2132 2133 #-- setup a colorMap 2134 cmap = cm = pl.get_cmap('spectral') 2135 norm = mpl.colors.Normalize(vmin=-len(x)-1, vmax=len(x)+1) 2136 color = mpl.cm.ScalarMappable(norm=norm, cmap=cmap) 2137 2138 #-- plot residuals 2139 for i, (x_, res_, e_) in enumerate(zip(x,res, err)): 2140 pl.errorbar(x_, res_, yerr=e_, marker='+', ms=5, ls='', color=color.to_rgba(-i-1), 2141 label='data %i'%(i+1)) 2142 pl.axhline(y=0, ls=':', color='r') 2143 2144 if legend: 2145 pl.legend()

2146

2147 - def plot_results(self, points=1000, axis=0, legend=False, **kwargs):

2148 """ 2149 Creates a basic plot with the fit results and corresponding residuals. This 2150 method has some functionality to plot multi-dimensional data, but is up till 2151 now limited to 2D data which it will plot consecutively allong the specified 2152 axis. Based on the plot_fit and plot_residuals functions 2153 2154 @param points: Number of points to use when plotting the best fit model. 2155 @type points: int 2156 @param axis: In case of multi-dim input along which axis to split (0 or 1) 2157 @type axis: int 2158 @param legend: Display legend on plot 2159 @type legend: bool 2160 @param xlabel: label for the x-axis 2161 @type xlabel: str 2162 @param ylabel: label for the y-axis 2163 @type ylabel: str 2164 @param title: title of the plot 2165 @type title: str 2166 """ 2167 2168 xlabel = kwargs.pop('xlabel', '$X$') 2169 ylabel = kwargs.pop('ylabel', '$Y$') 2170 title = kwargs.pop('title', 'Fit Results') 2171 2172 pl.subplots_adjust(wspace=0.0, hspace=0.0) 2173 ax = pl.subplot2grid((3,4), (0,0), rowspan=2, colspan=4) 2174 self.plot_fit(points=points, axis=axis, legend=legend, **kwargs) 2175 xlim, ylim = pl.xlim(), pl.ylim() 2176 x, y = xlim[1] - 0.02 * ( xlim[1]-xlim[0] ), ylim[0] + 0.05 * ( ylim[1]-ylim[0] ) 2177 pl.text(x, y, r'$\chi^2 =$ {:0.2f}'.format(self.minimizer.chisqr), ha='right') 2178 pl.ylabel(ylabel) 2179 pl.title(title) 2180 for tick in ax.axes.get_xticklabels(): 2181 tick.set_visible(False) 2182 tick.set_fontsize(0.0) 2183 2184 ax = pl.subplot2grid((3,4), (2,0), colspan=4) 2185 self.plot_residuals(axis=axis, legend=False, **kwargs) 2186 pl.ylabel('$O-C$') 2187 pl.xlabel(xlabel)

2188

2189 - def plot_confidence_interval(self, xpar=None, ypar=None, res=10, limits=None, 2190 ctype='prob', filled=True, **kwargs):

2191 """ 2192 Plot the confidence interval for 2 given parameters. Both the confidence 2193 interval calculated using the F-test method from the I{estimate_error} method, 2194 and the normal chi squares can be plotted using the I{type} keyword. In case 2195 of chi2, the log10 of the chi squares is plotted to improve the clarity of the 2196 plot. 2197 2198 Extra kwargs are passed to C{confourf} or C{contour}. 2199 2200 @param xname: The parameter on the x axis 2201 @param yname: The parameter on the y axis 2202 @param res: The resolution of the grid over which the confidence intervall 2203 is calculated 2204 @param limits: The upper and lower limit on the parameters for which the 2205 confidence intervall is calculated. If None, 5 times the stderr 2206 is used. 2207 @param ctype: 'prob' for probabilities plot (using F-test), 'chi2' for chisquare 2208 plot. 2209 @param filled: True for filled contour plot, False for normal contour plot 2210 @param limits: The upper and lower limit on the parameters for which the confidence intervall is calculated. If None, 5 times the stderr is used. 2211 """ 2212 2213 x, y, grid = self.calculate_CI_2D(xpar=xpar, ypar=ypar, res=res, 2214 limits=limits, ctype=ctype) 2215 2216 if ctype=='prob': 2217 levels = np.linspace(0,100,25) 2218 ticks = [0,20,40,60,80,100] 2219 def func(x, pos=None): 2220 return "%0.0f"%(x)

2221 fmtr = mpl.ticker.FuncFormatter(func) 2222 elif ctype=='chi2': 2223 grid = np.log10(grid) 2224 levels = np.linspace(np.min(grid), np.max(grid), 25) 2225 ticks = np.round(np.linspace(np.min(grid), np.max(grid), 6), 2) 2226 def func(x, pos=None): 2227 return "%0.1f"%(10**x) 2228 fmtr = mpl.ticker.FuncFormatter(func) 2229 2230 if filled: 2231 pl.contourf(x,y,grid,levels,**kwargs) 2232 cbar = pl.colorbar(fraction=0.08, format=fmtr, ticks=ticks) 2233 cbar.set_label(ctype=='prob' and r'Probability' or r'$\chi^2$') 2234 else: 2235 cs = pl.contour(x,y,grid,np.linspace(0,100,11),**kwargs) 2236 cs = pl.contour(x,y,grid,[20,40,60,80,95],**kwargs) 2237 pl.clabel(cs, inline=1, fontsize=10) 2238 pl.plot(self.params[xname].value, self.params[yname].value, '+r', ms=10, mew=2) 2239 pl.xlabel(xname) 2240 pl.ylabel(yname) 2241

2242 - def plot_grid_convergence(self, xpar=None, ypar=None, chi2lim=None, chi2scale='log', 2243 show_colorbar='True', **kwargs):

2244 """ 2245 Plot the convergence path of the results from grid_minimize stored in this 2246 minimizer. The start values of the two selected parameters are plotted 2247 conected to there final best fit values, while using a color coding for the 2248 chisqr value of the fit result. 2249 2250 @param xpar: The parameter on the x axis 2251 @param ypar: The parameter on the y axis 2252 @param chi2lim: Optional limit on the chi2 value (in % of the maximum chi2) 2253 @param chi2scale: Scale for the chi2 values: 'log' or 'linear' 2254 """ 2255 2256 #-- Get the minimizer grid 2257 minis, models, chisqrs = self.grid 2258 2259 if chi2lim != None: 2260 selected = np.where(chisqrs <= chi2lim*max(chisqrs)) 2261 models = models[selected] 2262 chisqrs = np.abs(chisqrs[selected]) 2263 models, chisqrs = models[::-1], chisqrs[::-1] 2264 if chi2scale == 'log': 2265 chisqrs = np.log10(chisqrs) 2266 2267 #-- read the requested parameter values 2268 x1 = np.empty_like(chisqrs) 2269 y1 = np.empty_like(chisqrs) 2270 x2 = np.empty_like(chisqrs) 2271 y2 = np.empty_like(chisqrs) 2272 for i, mod in enumerate(models): 2273 x1[i] = mod.parameters[xpar].user_value 2274 y1[i] = mod.parameters[ypar].user_value 2275 x2[i] = mod.parameters[xpar].value 2276 y2[i] = mod.parameters[ypar].value 2277 2278 #-- set the colors 2279 jet = cm = pl.get_cmap('jet') 2280 cNorm = mpl.colors.Normalize(vmin=min(chisqrs), vmax=max(chisqrs)) 2281 scalarMap = mpl.cm.ScalarMappable(norm=cNorm, cmap=jet) 2282 2283 #-- plot the arrows 2284 ax=pl.gca() 2285 for x1_, y1_, x2_, y2_, chi2 in zip(x1,y1,x2,y2, chisqrs): 2286 colorVal = scalarMap.to_rgba(chi2) 2287 ax.add_patch(mpl.patches.FancyArrowPatch((x1_,y1_),(x2_,y2_), 2288 arrowstyle='->',mutation_scale=30, color=colorVal)) 2289 scatter = pl.scatter(x2, y2, s=30, c=chisqrs, cmap=mpl.cm.jet, norm=cNorm, 2290 edgecolors=None, lw=0) 2291 pl.plot(x2[-1],y2[-1], '+r', ms=12, mew=3) 2292 2293 #-- colorbar 2294 if show_colorbar: 2295 if chi2scale == 'log': 2296 def func(x, pos=None): 2297 return "%0.0f"%(10**x)

2298 fmtr = mpl.ticker.FuncFormatter(func) 2299 else: 2300 fmtr = None 2301 pl.colorbar(scatter, fraction=0.08, format=fmtr) 2302 2303 pl.xlim(min([min(x1),min(x2)]), max([max(x1),max(x2)])) 2304 pl.ylim(min([min(y1),min(y2)]), max([max(y1),max(y2)])) 2305 pl.xlabel(xpar) 2306 pl.ylabel(ypar) 2307 2308 #} 2309 2310 #{ Advanced attributes 2311 @property

2312 - def minimizer(self):

2313 'get minimizer' 2314 return self._minimizers[0]

2315 2316 @minimizer.setter

2317 - def minimizer(self, val):

2318 'set minimizer' 2319 self._minimizers[0] = val

2320 2321 @property

2322 - def errors(self):

2323 'get error' 2324 return self._error

2325 2326 @errors.setter

2327 - def errors(self, val):

2328 'set error' 2329 if val == None: 2330 self._error = None 2331 elif np.shape(val) == (): 2332 self._error = np.ones_like(self.x) * val 2333 elif np.shape(val) == np.shape(self.x): 2334 self._error = val 2335 else: 2336 self._error = np.ones_like(self.x) * val[0]

2337 2338 @property

2339 - def grid(self):

2340 'get minimizer grid' 2341 models = np.empty(len(self._minimizers), dtype=Model) 2342 chisqrs = np.empty(len(self._minimizers), dtype=float) 2343 for i, mini in enumerate(self._minimizers): 2344 models[i] = copy.deepcopy(self.model) 2345 models[i].parameters = mini.params 2346 chisqrs[i] = mini.chisqr 2347 return self._minimizers, models, chisqrs

2348 2349 #} 2350 2351 #{ Internal Functions 2352

2353 - def __getattr__(self, name):

2354 "allow to reach the attributes of the best fitting minimizer directly" 2355 if hasattr(self.minimizer, name): 2356 return getattr(self.minimizer, name) 2357 else: 2358 raise AttributeError

2359

2360 - def _setup_residual_function(self):

2361 "Internal function to setup the residual function for the minimizer." 2362 if self.resfunc != None: 2363 def residuals(params, x, y, weights=None, errors=None, **kwargs): 2364 synth = self.model.evaluate(x, params, **kwargs) 2365 return self.resfunc(synth, y, weights=weights, errors=errors, **kwargs)

2366 else: 2367 def residuals(params, x, y, weights=None, errors=None, **kwargs): 2368 return ( y - self.model.evaluate(x, params, **kwargs) ) * weights 2369 2370 self.residuals = residuals 2371

2372 - def _setup_jacobian_function(self):

2373 "Internal function to setup the jacobian function for the minimizer." 2374 if self.model.jacobian != None: 2375 def jacobian(params, x, y, weights=None, errors=None, **kwargs): 2376 return self.model.evaluate_jacobian(x, params, **kwargs)

2377 self.jacobian = jacobian 2378 else: 2379 self.jacobian = None 2380

2381 - def _prepare_minimizer(self, fcn_args, fcn_kws, grid_points=1, grid_params=None, 2382 append=False):

2383 "Internal function to prepare the minimizer" 2384 2385 params = self.model.parameters 2386 grid_params = params.can_kick(pnames=grid_params) 2387 minimizers = np.empty(grid_points, dtype=Minimizer) 2388 2389 if grid_points == 1 or len(grid_params) == 0: 2390 #-- just one fit 2391 minimizers[0] = lmfit.Minimizer(self.residuals, params, fcn_args=fcn_args, 2392 fcn_kws=fcn_kws, **self.fit_kws) 2393 else: 2394 #-- create the minimizer grid 2395 for i in range(grid_points): 2396 params_ = copy.deepcopy(params) 2397 params_.kick(pnames=grid_params) 2398 minimizers[i] = lmfit.Minimizer(self.residuals, params_, fcn_args=fcn_args, 2399 fcn_kws=fcn_kws, **self.fit_kws) 2400 if append: 2401 self._minimizers.append(minimizers) 2402 else: 2403 self._minimizers = minimizers

2404

2405 - def _start_minimize(self, engine, verbose=False, **kwargs):

2406 "Internal function that starts all minimizers one by one" 2407 #-- Possible termial output 2408 if len(self._minimizers) <= 1: verbose = False 2409 if verbose: print "Grid Minimizer ({:.0f} points):".format(len(self._minimizers)) 2410 if verbose: Pmeter = progress.ProgressMeter(total=len(self._minimizers)) 2411 2412 #-- Start all minimizers 2413 chisqrs = np.empty_like(self._minimizers, dtype=float) 2414 for i, mini in enumerate(self._minimizers): 2415 if verbose: Pmeter.update(1) 2416 mini.start_minimize(engine, **kwargs) 2417 chisqrs[i] = mini.chisqr 2418 2419 #-- Sort on chisqr 2420 inds = chisqrs.argsort() 2421 self._minimizers = self._minimizers[inds] 2422 self.model.parameters = self._minimizers[0].params

2423

2424 - def _perturb_input_data(self, points, **kwargs):

2425 "Internal function to perturb the input data for MC simulations" 2426 2427 #-- create iterator for the data points 2428 y_ = np.empty( (points,)+self.y.shape, dtype=float) 2429 it = np.nditer([self.y, self.errors], ['multi_index'], [['readonly'], ['readonly']]) 2430 2431 #-- perturb the data 2432 while not it.finished: 2433 index = (slice(0,points),) + it.multi_index 2434 y_[index] = np.random.normal(loc=it[0], scale=it[1], size=points) 2435 it.iternext() 2436 2437 return y_

2438

2439 - def _mc_error_from_parameters(self, params):

2440 " Use standard deviation to get the error on a parameter " 2441 #-- calculate the std 2442 pnames = params[0].keys() 2443 errors = np.zeros((len(params), len(pnames))) 2444 for i, pars in enumerate(params): 2445 errors[i] = np.array(pars.value) 2446 errors = np.std(errors, axis=0) 2447 2448 #-- store the error in the original parameter object 2449 params = self.model.parameters 2450 for name, error in zip(pnames, errors): 2451 params[name].mcerr = error 2452 2453 return pnames, errors

2454

2455 - def __str__(self):

2456 " String representation of the Minimizer object " 2457 out = [] 2458 out.append( ('Model', str(self.model)) ) 2459 out.append( ('Data shape', str(np.array(self.x).shape)) ) 2460 out.append( ('Errors', 'Provided' if self._error != None else 'Not Provided') ) 2461 out.append( ('Weights', 'Provided' if self.weights != None else 'Not Provided') ) 2462 out.append( ('Residuals', 'Standard' if self.resfunc == None else 'Custom') ) 2463 out.append( ('Jacobian', 'Provided' if self.jacobian != None else 'Not Provided') ) 2464 out.append( ('Engine', str(self.engine)) ) 2465 out.append( ("Grid points", "{:.0f}".format(len(self._minimizers))) ) 2466 temp = "{:<12s}: {:s}\n" 2467 outstr = "<Minimizer object>\n" 2468 for s in out: 2469 outstr += temp.format(*s) 2470 return outstr.rstrip()

2471

2472 #} 2473 2474 -def minimize(x, y, model, errors=None, weights=None, resfunc=None, engine='leastsq', 2475 args=None, kws=None, scale_covar=True, iter_cb=None, verbose=True, **fit_kws):

2476 """ 2477 Basic minimizer function using the L{Minimizer} class, find values for the parameters 2478 so that the sum-of-squares of M{(y-model(x))} is minimized. When the fitting process 2479 is completed, the parameters of the L{Model} are updated with the results. If the 2480 I{leastsq} engine is used, estimated uncertainties and correlations will be saved to 2481 the L{Model} as well. Returns a I{Minimizer} object. 2482 2483 Fitting engines 2484 =============== 2485 By default, the Levenberg-Marquardt algorithm is used for fitting. While often 2486 criticized, including the fact it finds a local minima, this approach has some 2487 distinct advantages. These include being fast, and well-behaved for most curve- 2488 fitting needs, and making it easy to estimate uncertainties for and correlations 2489 between pairs of fit variables. Alternative fitting algoritms are at least partially 2490 implemented, but not all functions will work with them. 2491 2492 - leastsq: U{Levenberg-Marquardt 2493 <http://en.wikipedia.org/wiki/Levenberg-Marquardt_algorithm>}, 2494 U{scipy.optimize.leastsq 2495 <http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html>} 2496 - anneal: U{Simulated Annealing <http://en.wikipedia.org/wiki/Simulated_annealing.>}, 2497 U{scipy.optimize.anneal 2498 <http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.anneal.html>} 2499 - lbfgsb: U{quasi-Newton optimization 2500 <http://en.wikipedia.org/wiki/Limited-memory_BFGS>}, 2501 U{scipy.optimize.fmin_l_bfgs_b 2502 <http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fmin_l_bfgs_b.html>} 2503 2504 2505 @param x: the independent data array (x data) 2506 @type x: numpy array 2507 @param y: the dependent data array (y data) 2508 @type y: numpy array 2509 @param model: The I{Model} to fit to the data 2510 @param err: The errors on the y data, same dimentions as y 2511 @param weights: The weights given to the different y data 2512 @param resfunc: A function to calculate the residuals, if not provided standard 2513 residual function is used. 2514 @param engine: Which fitting engine to use: 'leastsq', 'anneal', 'lbfgsb' 2515 @param kws: Extra keyword arguments to be passed to the model 2516 @param fit_kws: Extra keyword arguments to be passed to the fitter function 2517 2518 @return: (I{Parameter} object, I{Minimizer} object) 2519 2520 """ 2521 2522 fitter = Minimizer(x, y, model, errors=errors, weights=weights, resfunc=resfunc, 2523 engine=engine, args=args, kws=kws, scale_covar=scale_covar, 2524 iter_cb=iter_cb, verbose=verbose, **fit_kws) 2525 if fitter.message and verbose: 2526 logger.warning(fitter.message) 2527 return fitter

2528

2529 2530 -def grid_minimize(x, y, model, errors=None, weights=None, resfunc=None, engine='leastsq', 2531 args=None, kws=None, scale_covar=True, iter_cb=None, points=100, 2532 parameters=None, return_all=False, verbose=True, **fit_kws):

2533 """ 2534 Grid minimizer. Offers the posibility to start minimizing from a grid of starting 2535 parameters defined by the used. The number of starting points can be specified, as 2536 well as the parameters that are varried. For each parameter for which the start 2537 value should be varied, a minimum and maximum value should be provided when setting 2538 up that parameter. The starting values are chosen randomly in the range [min,max]. 2539 The other arguments are the same as for the normal L{minimize} function. 2540 2541 If parameters are provided that can not be kicked (starting value can not be varried), 2542 they will be removed from the parameters array automaticaly. If no parameters can be 2543 kicked, only one minimize will be performed independently from the number of points 2544 provided. Pay attention with the vary function of the parameters, even if a parameter 2545 has vary = False, it will be kicked by the grid minimizer if it appears in parameters. 2546 This parameter will then be fixed at its new starting value. 2547 2548 @param parameters: The parameters that you want to randomly chose in the fitting process 2549 @type parameters: array of strings 2550 @param points: The number of starting points 2551 @type points: int 2552 @param return_all: if True, the results of all fits are returned, if False, only the 2553 best fit is returned. 2554 @type return_all: Boolean 2555 2556 @return: The best minimizer, or all minimizers as [minimizers, newmodels, chisqrs] 2557 @rtype: Minimizer object or array of [Minimizer, Model, float] 2558 """ 2559 2560 fitter = Minimizer(x, y, model, errors=errors, weights=weights, resfunc=resfunc, 2561 engine=engine, args=args, kws=kws, scale_covar=scale_covar, 2562 iter_cb=iter_cb, grid_points=points, grid_params=parameters, 2563 verbose=verbose, **fit_kws) 2564 if fitter.message and verbose: 2565 logger.warning(fitter.message) 2566 2567 if return_all: 2568 return fitter.grid 2569 else: 2570 return fitter

2571

2572 #} 2573 2574 #{ General purpose 2575 2576 -def get_correlation_factor(residus, full_output=False):

2577 """ 2578 Calculate the correlation facor rho (Schwarzenberg-Czerny, 2003). 2579 2580 Under white noise assumption, the residus are expected to change sign every 2581 2 observations (rho=1). Longer distances, 2*rho, are a sign of correlation. 2582 2583 The errors are then underestimated by a factor 1/sqrt(rho). 2584 2585 @param residus: residus after the fit 2586 @type residus: numpy array 2587 @param full_output: if True, the groups of data with same sign will be returned 2588 @type full_output: bool 2589 @return: rho(,same sign groups) 2590 @rtype: float(,list) 2591 """ 2592 same_sign_groups = [1] 2593 2594 for i in xrange(1,len(residus)): 2595 if np.sign(residus[i])==np.sign(residus[i-1]): 2596 same_sign_groups[-1] += 1 2597 else: 2598 same_sign_groups.append(0) 2599 2600 rho = np.average(same_sign_groups) 2601 2602 logger.debug("Correlation factor rho = %f, sqrt(rho)=%f"%(rho,np.sqrt(rho))) 2603 2604 if full_output: 2605 return rho, same_sign_groups 2606 else: 2607 return rho

2608

2609 #} 2610 2611 #{ Print and plot functions 2612 2613 -def _calc_length(par, accuracy, field=None):

2614 """ helper function to calculate the length of the given parameters for parameters2string """ 2615 if field == 'name': 2616 par = str(par) 2617 extralen = {'name':1, 2618 'value':0, 2619 'user_value':0, 2620 'init_value':0, 2621 'stderr':4, 2622 'mcerr':4, 2623 'cierr':5, 2624 'stderrpc':3, 2625 'mcerrpc':3, 2626 'cierrpc':3, 2627 'bounds':14} 2628 2629 def calculate_length(par): 2630 try: 2631 if type(par) == str: 2632 out = len(par) 2633 elif par == None or par == np.nan or np.isposinf(par): 2634 out = 3 2635 elif np.isneginf(par): 2636 out = 4 2637 else: 2638 old = np.seterr(divide='ignore') #turn division errors off temporary 2639 length = np.floor(np.log10(abs(par))) > 0 and np.floor(np.log10(abs(par))) + 1 or 1 2640 length = par < 0 and length + 1 or length # 1 for the minus sign 2641 length = length + accuracy + 1 # 1 for the decimal point 2642 np.seterr(divide=old['divide']) 2643 out = int(length) 2644 except Exception, e: 2645 logging.warning( 2646 'Could not calculate length of: %s, type = %s, field = %s\nerror: %s'%(par, type(par), field, e)) 2647 out = 0 2648 return out

2649 2650 if type(par) == tuple: 2651 out = 0 2652 for p in par: 2653 out += calculate_length(p) 2654 else: 2655 out = calculate_length(par) 2656 2657 if field != None and field in extralen: 2658 return out + extralen[field] 2659 else: 2660 return out 2661

2662 -def _format_field(par, field, maxlen=10, accuracy=2):

2663 fmt = '{{:.{0}f}}'.format(accuracy) 2664 2665 if field == 'name': 2666 temp = '{{:>{0}s}} ='.format(maxlen) 2667 return temp.format(par.name) 2668 elif field in ['value', 'user_value', 'init_value']: 2669 return fmt.format(getattr(par, field)) 2670 elif field in ['stderr', 'mcerr']: 2671 return '+/- ' + fmt.format(getattr(par, field)) 2672 elif re.match(r"^ci(\d\d+?)$", field): 2673 temp = '+ ' + fmt + ' - ' + fmt 2674 return temp.format(*getattr(par, field)) 2675 elif field == 'stderrpc': 2676 return '('+fmt.format( getattr(par, field) ) + '%)' 2677 elif field == 'bounds': 2678 temp = ' bounds = ' + fmt + ' <-> ' + fmt 2679 return temp.format(*getattr(par, field)) 2680 elif field == 'vary': 2681 return '(vary)' if getattr(par, field) else '(fixed)' 2682 elif field == 'expr': 2683 expr = getattr(par, field) 2684 return 'expr = %s'%(expr) if expr != None else '' 2685 else: 2686 return ''

2687

2688 2689 -def parameters2string(parameters, accuracy=2, error='stderr', output='result', **kwargs):

2690 """ Converts a parameter object to string """ 2691 out = "Parameters ({:s})\n".format(error) 2692 2693 if not hasattr(output, '__itter__'): 2694 if output == 'start': 2695 output = ['name', 'user_value', 'bounds', 'vary', 'expr'] 2696 out = "Parameters (initial)\n" 2697 elif output == 'result': 2698 output = ['name', 'value', error, error + 'pc'] 2699 out = "Parameters ({:s})\n".format(error) 2700 elif output == 'full': 2701 output = ['name', 'value', error, error + 'pc', 'bounds', 'vary', 'expr'] 2702 out = "Parameters ({:s})\n".format(error) 2703 else: 2704 output = ['name', 'value'] 2705 out = "Parameters \n" 2706 2707 #-- calculate the nessessary space 2708 maxlen = np.zeros(len(output), dtype=int) 2709 for i, field in enumerate(output): 2710 max_value = 0 2711 for name, par in parameters.items(): 2712 current = _calc_length(getattr(par, field), accuracy, field=field) 2713 if current > max_value: max_value = current 2714 maxlen[i] = max_value 2715 2716 #-- create matrix with all values in requered accuracy as string 2717 roundedvals = np.empty(shape=(len(parameters.keys()), len(output)), 2718 dtype="|S{:.0f}".format(np.max(maxlen))) 2719 for i, (name, par) in enumerate(parameters.items()): 2720 for j, field in enumerate(output): 2721 roundedvals[i,j] = _format_field(par, field, maxlen[j], accuracy) 2722 2723 #-- create the template 2724 template = ' ' 2725 for max_value in maxlen: 2726 template += "{{:<{0}s}} ".format(max_value) 2727 template += '\n' 2728 2729 #-- create the output string 2730 for line in roundedvals: 2731 out += template.format(*line) 2732 2733 return out.rstrip()

2734

2735 -def correlation2string(parameters, accuracy=3, limit=0.100):

2736 """ Converts the correlation of different parameters to string """ 2737 out = "Correlations (shown if larger as {{:.{0}f}})\n".format(accuracy).format(limit) 2738 2739 #-- first select all correlations we want 2740 cor_name, cor_value = [],[] 2741 for name1,par in parameters.items(): 2742 for name2, corr in par.correl.items(): 2743 n1 = name1 + ' - ' + name2 2744 n2 = name2 + ' - ' + name1 2745 if corr >=limit and not n1 in cor_name and not n2 in cor_name: 2746 cor_name.append(n1) 2747 cor_value.append(corr) 2748 cor_name, cor_value = np.array(cor_name, dtype=str),np.array(cor_value, dtype=float) 2749 ind = cor_value.argsort() 2750 cor_name, cor_value = cor_name[ind][::-1], cor_value[ind][::-1] 2751 2752 #-- calculate nessessary space 2753 max_name = 0 2754 max_value = 0 2755 for name, value in zip(cor_name, cor_value): 2756 max_name = len(name) if len(name) > max_name else max_name 2757 current = _calc_length(value, accuracy) 2758 max_value = current if current > max_value else max_value 2759 2760 #-- print correlations 2761 template = ' {{name:<{0}s}} = {{value:.{1}f}}\n'.format(max_name, accuracy) 2762 for name, value in zip(cor_name, cor_value): 2763 out += template.format(name=name, value=value) 2764 2765 return out.rstrip()

2766

2767 2768 -def confidence2string(parameters, accuracy=4):

2769 """ Converts the confidence intervals to string """ 2770 out="Confidence Intervals\n" 2771 2772 #-- find all calculated cis and the nessessary space 2773 max_name = 0 2774 max_value = 6 2775 sigmas = [] 2776 for name, par in parameters.items(): 2777 if len(name) > max_name: max_name = len(name) 2778 for ci in par.cierr.keys(): 2779 if not ci in sigmas: sigmas.append(ci) 2780 current = _calc_length(par.cierr[ci][0], accuracy) 2781 if current > max_value: max_value = current 2782 current = _calc_length(par.cierr[ci][1], accuracy) 2783 if current > max_value: max_value = current 2784 sigmas.sort() 2785 sigmas.reverse() 2786 2787 #-- create the output values 2788 template = "{{:.{0}f}}".format(accuracy) 2789 cis = np.empty(shape=(len(parameters.keys()), 2*len(sigmas)+1), 2790 dtype="|S{:.0f}".format(max_value)) 2791 for i, (name, par) in enumerate(parameters.items()): #cis 2792 for j, sigma in enumerate(sigmas): 2793 if sigma in par.cierr.keys(): 2794 cis[i,j] = template.format(par.cierr[sigma][0]) 2795 cis[i,-j-1] = template.format(par.cierr[sigma][1]) 2796 else: 2797 cis[i,j] = '-'*max_value 2798 cis[i,-j-1] = '-'*max_value 2799 for i, (name, par) in enumerate(parameters.items()): #values 2800 cis[i,len(sigmas)] = template.format(par.value) 2801 2802 template = "{:.2f}" 2803 header = np.empty(shape=2*len(sigmas)+1, dtype="|S{:.0f}".format(max_value)) 2804 for i, sigma in enumerate(sigmas): 2805 header[i] = template.format(sigma*100) 2806 header[-i-1] = template.format(sigma*100) 2807 header[len(sigmas)] = template.format(0) 2808 2809 #-- create the output 2810 template = "{{:>{0}s}}% ".format(max_value-1) * (2*len(sigmas)+1) 2811 template = " {{:>{0}s}} ".format(max_name) + template +"\n" 2812 out += template.format('', *header) 2813 2814 template = "{{:>{0}s}} ".format(max_value) * (2*len(sigmas)+1) 2815 template = " {{:>{0}s}}: ".format(max_name) + template +"\n" 2816 for name, ci in zip(parameters.keys(), cis): 2817 out += template.format(name, *ci) 2818 2819 return out.rstrip()

2820

2821 -def plot_convergence(startpars, models, chi2s, xpar=None, ypar=None, clim=None):

2822 """ 2823 Plot the convergence path of the results from grid_minimize. 2824 """ 2825 #-- sort the models on reverse chi2 so that the best model is plotted on top 2826 inds = chi2s.argsort()#[::-1] 2827 startpars = startpars[inds] 2828 models = models[inds] 2829 chi2s = chi2s[inds] 2830 2831 if clim != None: 2832 selected = np.where(chi2s <= clim*max(chi2s)) 2833 startpars = startpars[selected] 2834 models = models[selected] 2835 chi2s = chi2s[selected] 2836 2837 #-- read the requested parameter values 2838 points=len(startpars) 2839 x1 = np.zeros(points,dtype=float) 2840 y1 = np.zeros(points,dtype=float) 2841 x2 = np.zeros(points,dtype=float) 2842 y2 = np.zeros(points,dtype=float) 2843 for i in range(points): 2844 x1[i] = startpars[i][xpar].value 2845 y1[i] = startpars[i][ypar].value 2846 x2[i] = models[i].parameters[xpar].value 2847 y2[i] = models[i].parameters[ypar].value 2848 2849 #-- set the colors 2850 jet = cm = pl.get_cmap('jet') 2851 cNorm = mpl.colors.Normalize(vmin=0, vmax=max(chi2s)) 2852 scalarMap = mpl.cm.ScalarMappable(norm=cNorm, cmap=jet) 2853 2854 #-- plot the arrows 2855 ax=pl.gca() 2856 for x1_, y1_, x2_, y2_, chi2 in zip(x1,y1,x2,y2, chi2s): 2857 colorVal = scalarMap.to_rgba(chi2) 2858 ax.add_patch(mpl.patches.FancyArrowPatch((x1_,y1_),(x2_,y2_),arrowstyle='->',mutation_scale=30, color=colorVal)) 2859 pl.scatter(x2, y2, s=30, c=chi2s, cmap=mpl.cm.jet, edgecolors=None, lw=0) 2860 pl.plot(x2[-1],y2[-1], '+r', ms=12, mew=3) 2861 pl.colorbar(fraction=0.08) 2862 pl.xlim(min([min(x1),min(x2)]), max([max(x1),max(x2)])) 2863 pl.ylim(min([min(y1),min(y2)]), max([max(y1),max(y2)])) 2864 pl.xlabel(xpar) 2865 pl.ylabel(ypar)

2866 2867 2868 #} 2869 2870 2871 2872 2873 2874 2875 2876 2877 2878 2879 # 2880 # 2881 # if __name__=="__main__": 2882 # import doctest 2883 # import pylab as pl 2884 # import sys 2885 # doctest.testmod() 2886 # pl.show() 2887 # sys.exit() 2888 # 2889 # from ivs.timeseries import pergrams 2890 # import pylab as pl 2891 # 2892 # times = np.linspace(0,150,5000) 2893 # np.random.seed(10) 2894 # freqs = [1.23,2.59,3.89,4.65] 2895 # freqs += [2*freqs[0],2*freqs[0]+freqs[2],freqs[3]-freqs[0],freqs[0]+freqs[3]-freqs[2]] 2896 # freqs = np.array(freqs) 2897 # amplitudes = np.sort(np.random.uniform(size=len(freqs),low=1,high=10))[::-1] 2898 # phases = np.random.uniform(size=len(freqs),low=-0.5,high=0.5) 2899 # parins = np.rec.fromarrays([np.zeros(len(freqs)),amplitudes,freqs,phases],names=('const','ampl','freq','phase')) 2900 # signal = evaluate.sine(times,parins) 2901 # signal_= signal + np.random.normal(size=len(signal),scale=1) 2902 # 2903 # residus = signal_ + 0. 2904 # frequencies = [] 2905 # 2906 # for i in range(9): 2907 # print "======== STEP %d ======"%(i) 2908 # 2909 # 2910 # pergram = pergrams.scargle(times,residus,threads=2) 2911 # frequency = pergram[0][np.argmax(pergram[1])] 2912 # frequencies.append(frequency) 2913 # parameters = sine(times,signal_,frequencies) 2914 # e_parameters = e_sine(times,signal_,parameters) 2915 # parameters,e_parameters,gain = optimize(times,signal_,parameters,'sine') 2916 # frequencies = list(parameters['freq']) 2917 # 2918 # signalf = evaluate.sine(times,parameters) 2919 # 2920 # 2921 # if i<len(parins): 2922 # print '' 2923 # for par in ['const','ampl','freq','phase']: 2924 # print par,parins[i][par],parameters[par],e_parameters['e_%s'%(par)] 2925 # print 'S/N',parameters['ampl']/e_parameters['e_ampl'] 2926 # 2927 # else: 2928 # print '' 2929 # for par in ['const','ampl','freq','phase']: 2930 # print par,parameters[par],e_parameters['e_%s'%(par)] 2931 # print 'S/N',parameters['ampl']/e_parameters['e_ampl'] 2932 # 2933 # 2934 # 2935 # print '' 2936 # 2937 # 2938 # 2939 # residus = signal_-signalf 2940 # 2941 # 2942 # parameters = sine(times,signal_,frequencies) 2943 # signalf = evaluate.sine(times,parameters) 2944 # 2945 # pl.subplot(121) 2946 # pl.plot(times,signal_,'ko') 2947 # pl.plot(times,signal,'r-') 2948 # pl.plot(times,signalf,'b-') 2949 # 2950 # pl.subplot(122) 2951 # pl.plot(*pergram) 2952 # pl.show() 2953

Source Code for Module ComboCode.cc.ivs.sigproc.fit