Chapter 5
Fitting Data to Nonlinear Models
One of the more difficult topics in all of data analysis in the physical sciences is fitting data to nonlinear models. Often
such fits require large computational resources and great skill, patience, and intuition on the part of the analyst. These
difficulties are one of the reasons that, as we shall see, the whole topic of spectral line shapes is still a very active subject
of research spanning the fields of chemistry, physics, astronomy, and more. In addition, computational methods of
nonlinear fitting are still a current research topic in computer science.
However, since sometimes nature really is nonlinear, such fits are often unavoidable, and the principles and some tools
for nonlinear fitting are the topics of this chapter. The main EDA program introduced here is FindFit, which
accomplishes fits to arbitrary models.
FindFit is similar to the NonlinearFit function in the Statistics`NonlinearFit` package, which is
standard with Mathematica. The primary differences are that FindFit: (1) recognises EDA's data format, including
errors in both coordinates; (2) estimates errors in fit parameters; (3) by default displays graphical information about the
fit; and (4) uses an algorithm that has been optimised for speed and stability for the types of nonlinear fits commonly
performed in the physical sciences and engineering.
The following command loads the initialization of EDA.
Needs["EDA`Master`"]
5.1 Introduction
The previous chapter, "Fitting Data to Linear Models by Least Square Techniques", introduced the distinction between
linear and nonlinear models. To briefly review, the terms refer to the way in which the parameters to which we are
fitting enter into the model.
In this chapter we discuss nonlinear models and the EDA program FindFit that can often find a reasonable fit to them.
Recall that if sos is the sum of the squares of the residuals, then we are seeking the minimum in its value. If we are
fitting to parameters a[0], a[1], ... , a[m], the answer is found by solving a set of simultaneous equations.
D[ sos, a[0] ] == 0
D[ sos, a[1] ] == 0
...
D[ sos, a[m] ] == 0
This in general can be done analytically provided the model to which we are fitting is linear in the parameters. Similarly,
when there are explicit errors in the data, we form the chi-squared, say chisq, and we solve the corresponding equations.
D[ chisq, a[0] ] == 0
D[ chisq, a[1] ] == 0
...
D[ chisq, a[m] ] == 0
This again will be analytic for a linear fit.
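For a straight-line model the simultaneous equations above can be solved in closed form. The following is a minimal sketch in Python, used here only as a compact, language-neutral illustration and not as EDA code; the names fit_line and sum_of_squares and the five data points are invented for the example.

```python
def fit_line(xs, ys):
    """Solve d(sos)/da0 == 0 and d(sos)/da1 == 0 for the model y = a0 + a1*x."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx          # determinant of the 2x2 normal equations
    a0 = (sxx * sy - sx * sxy) / det
    a1 = (n * sxy - sx * sy) / det
    return a0, a1


def sum_of_squares(xs, ys, a0, a1):
    """Sum of the squares of the residuals for the straight-line model."""
    return sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, invented for this illustration.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
a0, a1 = fit_line(xs, ys)
best = sum_of_squares(xs, ys, a0, a1)
```

Because the model is linear in a0 and a1, no iteration and no initial values are needed; any other choice of a0 and a1 gives a larger sum of squares.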
Chapter5.nb
For a nonlinear fit, no such analytic solutions are possible, so iteration is required to find the minimum in the sum of the
squares or the chi-squared.
If we imagine a plot of the value of the sum of the squares or the chi-squared as a function of the parameters to which
we are fitting, in general for a nonlinear fit there may be many local minima instead of the single minimum found in
linear fitting. For example, if we are fitting to two parameters, param1 and param2, the chi-squared as a function of the
values of the parameters might have two or more local minima.
[Figure: Plot3D surface of an illustrative chi-squared as a function of param1 and param2, drawn with PlotPoints -> 40 and axes labelled "param1", "param2", and "chi squared"; the surface has more than one local minimum.]
Thus, a nonlinear fitter must usually start off with initial values close to the real minimum.
The general technique for iteration, "steepest descent", is analogous to the
following situation. It was "a dark and stormy night". Foggy too. You are on the
side of a hill and want to find the valley. So you step in the direction in which the
slope goes down, and continue moving in the direction of the local definition of
"down" until you are in the valley. Of course, if you take giant steps you might
step over the next hill and end up in the wrong valley. And when you get close to
the bottom of the valley you will want to start taking baby steps. The
Levenberg-Marquardt algorithm used by many nonlinear fitters, including
FindFit, is essentially some clever heuristics to define giant steps and baby
steps.
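The giant-step and baby-step idea can be sketched with a one-parameter descent. This is a Python illustration only: it is plain steepest descent with a simple step-size heuristic, not the Levenberg-Marquardt algorithm and not the code used by FindFit. The function chisq, with its two valleys, and the starting values are invented for the example.

```python
def chisq(p):
    """A made-up 'chi-squared' with two valleys, near p = -1 and p = +1."""
    return (p * p - 1.0) ** 2 + 0.3 * p


def slope(p):
    """Derivative of chisq with respect to p."""
    return 4.0 * p * (p * p - 1.0) + 0.3


def descend(p, step=0.1, tol=1e-10, max_iter=10000):
    """Walk downhill: lengthen the step after a success, shorten it after a failure."""
    for _ in range(max_iter):
        trial = p - step * slope(p)
        if chisq(trial) < chisq(p):
            p = trial          # the step went downhill: accept it
            step *= 2.0        # and try a bigger step next time
        else:
            step *= 0.25       # overshot: take baby steps
        if step < tol:
            break
    return p


right = descend(0.9)    # starts in the right-hand valley
left = descend(-0.9)    # starts in the left-hand valley
```

The two starting values end in different valleys, and only the left-hand one is the lower minimum, which is why the initial values matter.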
If the data has noise, which is almost a certainty for real experimental data, then there is a further difficulty. We can take
two sets of data from the same apparatus using the same sample, fit each dataset to a nonlinear model using identical
initial values for the fit parameters, and get very different final fits. This situation leads to ambiguity about which fit
results are "correct".
As already mentioned, unless we are fitting to a very simple model, FindFit must be provided with initial estimates of
the parameters.
We will first fit to a simple model where no such parameter estimates are necessary.
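[Input garbled in extraction: a definition of mydata followed by a FindFit call with fit parameters a and b and no initial values.]
[Figure: the data with the fitted curve, and an inset plot of the residuals.]
[Output, partly garbled in extraction: a -> {3., 0.34}; the value of b, the sum of squares, and DegreesOfFreedom cannot be assigned reliably. The other surviving numbers are 1.25, 1.5, and 0.0000952225702044433.]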
Actually, when not given initial values FindFit starts with initial values for the parameters, here a and b, equal to 1.
Now the syntax of the call to FindFit is given.
FindFit[data, model, ind,
{param1, ... , paramM} ]
Here ind is the name of the independent variable in model. Also note that this model is nonlinear in the parameters to
which we are fitting, param1 ... paramM.
The EDA`FindFit` package includes a Gaussian function.
Information["Gaussian", LongForm -> False]
[Input partly lost in extraction: mydata is a Table of Gaussian values plus noise from Random[Real, {-1, 1}], over {x, 1, 16, .1}.]
[Figure: plot of mydata, a noisy bell-shaped peak.]
We fit the data using FindFit; the fit takes a few seconds to complete.
[Figure: the data with the fitted curve, and a plot of the residuals.]
{ampl -> {9.91, 0.18}, x0 -> {8.075, 0.05400000000000001}, sigma -> {2.488, 0.05400000000000001}, SumOfSquares -> 49.5618230640989, DegreesOfFreedom -> 148}
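The DegreesOfFreedom and the rough size of the SumOfSquares above can be checked with a short sketch. This is Python used as a language-neutral illustration, not EDA code. The form assumed for the Gaussian (ampl is the peak height) and the "true" parameters 10, 8, and 2.5 are assumptions made for the example; the original data-generating values are not preserved in the text.

```python
import math
import random


def gaussian(x, ampl, x0, sigma):
    """Assumed peak shape: ampl is the height of the peak at x = x0."""
    return ampl * math.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))


random.seed(1)                      # any seed will do; fixed for repeatability
true_params = (10.0, 8.0, 2.5)      # hypothetical values for ampl, x0, sigma

# x runs from 1 to 16 in steps of 0.1, as in the Table iterator {x, 1, 16, .1}.
xs = [1.0 + 0.1 * i for i in range(151)]
data = [(x, gaussian(x, *true_params) + random.uniform(-1.0, 1.0)) for x in xs]

# Sum of the squares of the residuals, evaluated at the true parameters.
sos = sum((y - gaussian(x, *true_params)) ** 2 for x, y in data)

# Degrees of freedom: number of data points minus number of fitted parameters.
dof = len(data) - 3
```

Uniform noise on the interval from -1 to 1 has variance 1/3, so for 151 points a sum of squares near 50 is expected, which is comparable to the value reported by the fit.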
Finding good initial values is sometimes subtle. As an example, we generate some made-up data for three peaks with a
Lorentzian shape using the Lorentzian function supplied with the EDA`FindFit` package.
Information["Lorentzian", LongForm -> False]
Lorentzian[x,a,x0,gamma] is a non-relativistic
Lorentzian function of the independent variable
x, center value x0, and full width at
half-maximum gamma. The function is normalized
so that its integral from -Infinity to
+Infinity is equal to a. The maximum
amplitude of the peak is 2*a/(Pi*gamma).
Lorentzian is a synonym for BreitWigner.
mydata = Table[{x, Lorentzian[x, 100, 18, 8] + Lorentzian[x, 10, 23, 4] + Lorentzian[x, 80, 35, 8]}, {x, 1, 60, .1}];
EDAListPlot[mydata];
[Figure: plot of mydata for x from 1 to 60, showing two large peaks and a small peak on the shoulder of the leftmost one.]
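The statements in the Lorentzian usage message can be checked numerically. The sketch below is Python used as a language-neutral illustration; the formula is the standard area-normalized Lorentzian and is assumed, not confirmed, to match the EDA implementation. The name three_peaks is invented for the example.

```python
import math


def lorentzian(x, a, x0, gamma):
    """Lorentzian with total area a, center x0, and full width at half-maximum gamma."""
    half = gamma / 2.0
    return (a / math.pi) * half / ((x - x0) ** 2 + half ** 2)


# Largest peak of the made-up data: area 100, center 18, width 8.
peak = lorentzian(18.0, 100.0, 18.0, 8.0)

# Half a width away from the center the curve has dropped to half its maximum.
half_point = lorentzian(22.0, 100.0, 18.0, 8.0)


def three_peaks(x):
    """Sum of the three peaks used to build the made-up data."""
    return (lorentzian(x, 100.0, 18.0, 8.0)
            + lorentzian(x, 10.0, 23.0, 4.0)
            + lorentzian(x, 80.0, 35.0, 8.0))
```

The maximum of the first peak alone is 2*100/(Pi*8), just under 8, which agrees with the scale of the plotted data; the small peak at 23 reaches only about 1.6 on its own, which is why it shows up as a shoulder.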
It is difficult to see the small peak on the shoulder of the leftmost peak, or to provide initial estimates for its values.
However, we can see the peak and probably make some sensible guesses of its parameters.
Despite the claims sometimes seen in glossy advertisements, there is no known software that can find and estimate peaks
for data such as this as well as a human expert. This is in part because of the great ability of the human visual system to
be an intuitive integrator.
Almost all versions of the notebook front end for Mathematica include a provision to use our visual ability. We can
display a graph of the data and using the mouse, point at a desired location on the graph. The coordinates will be
displayed in the notebook window, and can also be copied out using the cut-and-paste facility; consult the manual for
your version of the notebook to discover how to do this. For example, here we load and examine a part of a nuclear
spectrum.
LoadData[Cobalt60]
LoadData::name: Loading: Cobalt60Data
EDAListPlot[Cobalt60Data];
[Figure: plot of Cobalt60Data, with a single peak of about 180 counts near channel 1830.]
Using the mouse we can pick out the coordinates of the maximum, the points on the peak corresponding to roughly
one-half the maximum, and so on. So, the position of the peak is found.
{1831.88, 181.251}
In some earlier versions of the Windows front end, if the notebook is closed and then reopened, the coordinates will be
returned in PostScript units instead of the actual units of the plot; re-rendering the plot works around this bug.
Finally, for simple spectra EDA supplies a function, FindPeaks, which can sometimes provide initial guesses of the
number of peaks and their parameter values close enough for FindFit. The function is described in Section 8.3.
Although FindFit is intended for fitting to nonlinear models, it can also find the minimum in the sum of the squares
or the chi-squared for linear models, albeit more slowly than LinearFit.
For example, here we repeat a fit from Chapter 4.
LoadData[Thermocouple]
LoadData::name: Loading: ThermocoupleData
LinearFit[ThermocoupleData, {0, 1, 2}, a]
[Figure: the thermocouple data with the fitted quadratic, and an inset plot of the residuals.]
[Output as extracted; minus signs may have been lost:]
{a[0] -> {0.886, 0.03}, a[1] -> {0.0352, 0.0014}, a[2] -> {0.00006, 0.000013}, ChiSquared -> 1.006602038693567, DegreesOfFreedom -> 18}
FindFit[ThermocoupleData, a[0] + a[1] x + a[2] x^2, x, {a[0], a[1], a[2]}]
[Figure: the same data with the FindFit result, and an inset plot of the residuals.]
[Output as extracted; minus signs may have been lost:]
{a[0] -> {0.886, 0.03}, a[1] -> {0.0352, 0.0014}, a[2] -> {0.00006, 0.000013}, ChiSquared -> 1.006602038696037, DegreesOfFreedom -> 18}
For this example, FindFit took about four times as long as LinearFit.
Recall from the chapter on linear fitting that if the data have explicit errors in both coordinates, the effective variance
technique makes the fit essentially nonlinear unless the model is a straight line. Thus, in this case LinearFit iterates
until it finds a minimum in the chi-squared. When there are errors in both coordinates, FindFit also calculates the
error in the dependent variable based on the effective variance. However, although there is a fairly comprehensive
literature on using this technique in linear fits, the main justification of using effective variances in nonlinear fits is
based only on a series of experiments in which it was found that most often the algorithm of FindFit would produce
reasonable results.
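The effective variance idea can be sketched as follows. This is Python used as a language-neutral illustration of the usual textbook formula, in which the error in x is converted into an equivalent error in y through the slope of the model; whether FindFit uses exactly this form is an assumption. The name effective_sigma, the step h, and the example line are invented for the illustration.

```python
import math


def effective_sigma(model, x, sigma_x, sigma_y, h=1e-6):
    """Combine errors in both coordinates into one error in the dependent variable.

    sigma_eff^2 = sigma_y^2 + (d(model)/dx)^2 * sigma_x^2
    The derivative is estimated with a central difference of half-width h.
    """
    dydx = (model(x + h) - model(x - h)) / (2.0 * h)
    return math.sqrt(sigma_y ** 2 + (dydx * sigma_x) ** 2)


def line(x):
    """Hypothetical model with slope 2."""
    return 1.0 + 2.0 * x


# An x error of 0.5 on a slope of 2 counts as a y error of 1,
# which adds in quadrature to the explicit y error of 1.
sig = effective_sigma(line, 3.0, 0.5, 1.0)
```

For a straight line the slope does not depend on the fit position, but for any other model the effective error changes as the parameters change, which is why the fit must iterate.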
We repeat another fit from the chapter on linear models.
LoadData[Reactance]
LinearFit[ReactanceData, {-1, 1}, a]
[Figure: the reactance data, for x from about 22000 to 27000, with the fit and an inset plot of the residuals.]
[Output as extracted; minus signs may have been lost:]
{a[-1] -> {590000.0000000001, 110000.}, a[1] -> {0.00102, 0.0002}, ChiSquared -> 2.191432350342454, DegreesOfFreedom -> 3}
FindFit gives a similar result if started sufficiently close to the "final" values.
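FindFit[ReactanceData, a[-1]/x + a[1] x, x, {{a[-1], -590000}, {a[1], 0.001}}]
[Figure: the reactance data with the FindFit result, and an inset plot of the residuals.]
[Output, partly garbled in extraction; minus signs may have been lost: a[-1] -> {590000.0000000001, 3.800000000000001 times a power of ten whose exponent has magnitude 14}, a[1] -> {0.001009, 0.000021}, ChiSquared -> 2.203721816446138. The sign of the exponent and the value of DegreesOfFreedom are not recoverable.]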
For the above results, FindFit appears to have done a poorer job of estimating the errors in the fit parameters than
LinearFit.
Another difference between the above two fits is that by default the graphs produced by LinearFit display the errors
in the fit parameters, while the graphs produced by FindFit do not. This is because for many nonlinear fits, sorting
out how to combine the various error terms is problematic. To display the fit errors the UseFitErrors option,
discussed in Section 5.3.2.2, may be set to True.
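FindFit[ReactanceData, a[-1]/x + a[1] x, x, {{a[-1], -590000}, {a[1], 0.001}}, UseFitErrors -> True]
[Figure: the reactance data with the FindFit result and lines showing the errors in the fit parameters, and an inset plot of the residuals.]
[Output, partly garbled in extraction; minus signs may have been lost: a[-1] -> {590000.0000000001, 3.800000000000001 times a power of ten whose exponent has magnitude 14}, a[1] -> {0.001009, 0.000021}, ChiSquared -> 2.203721816446138. The sign of the exponent and the value of DegreesOfFreedom are not recoverable.]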
Note that as opposed to the plot produced by LinearFit, here the lines representing the errors in the fit parameters are
parallel. This is an artifact of a heuristic used by the FindFit package to combine the two terms; the heuristic is often
reasonable for true nonlinear fits.
As a final comparison between LinearFit and FindFit, we load and examine some data on oscillations in
interneuron networks.
LoadData[Interneuron]
LoadData::name: Loading: InterneuronData
EDAListPlot[InterneuronData];
[Figure: plot of InterneuronData, the frequency as a function of ipscTau.]
The model for these data, frequency = a*Exp[-b*ipscTau], is nonlinear in the parameters a and b. However, as mentioned in the previous chapter, we can linearize the
relationship by taking the logarithms of both sides.
Log[frequency] = Log[a] - b*ipscTau
Thus, we form a data set of {ipscTau, Log[frequency]}.
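The linearization can be sketched on noise-free made-up numbers. This is Python used as a language-neutral illustration, not EDA code; the values 56 and 0.04 and the list of ipscTau values are invented, and fit_line is a hypothetical helper that solves the straight-line least-squares equations in closed form.

```python
import math


def fit_line(xs, ys):
    """Closed-form least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    return (sxx * sy - sx * sxy) / det, (n * sxy - sx * sy) / det


# Hypothetical exponential data: frequency = 56 * exp(-0.04 * tau).
taus = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
freqs = [56.0 * math.exp(-0.04 * t) for t in taus]

# Form the {tau, Log[frequency]} data set and fit a straight line to it.
logs = [math.log(f) for f in freqs]
log_a, slope_value = fit_line(taus, logs)

a = math.exp(log_a)   # the intercept is Log[a]
b = -slope_value      # the slope is -b
```

With real, noisy data the logarithm also changes the relative weights of the points, so the linearized fit and a direct nonlinear fit will generally not agree exactly.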
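[Input partly lost in extraction: mydata is formed by mapping a pure function over InterneuronData to give {ipscTau, Log[frequency]} pairs.]
linresult = LinearFit[mydata, {0, 1}, a]
[Figure: the logarithmic data with the straight-line fit, and an inset plot of the residuals.]
[Output as extracted; minus signs may have been lost:]
{a[0] -> {4.027000000000001, 0.05700000000000001}, a[1] -> {0.04240000000000001, 0.0033}, PseudoErrorY -> 0.1075978101620072, SumOfSquares -> 0.2547003525365055, DegreesOfFreedom -> 22}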
Thus b = 0.0424 +/- 0.0033 in (ms)^(-1). Although there are problems with the above fit, which are discussed below,
we will calculate a and its errors using the Datum construct discussed in Chapter 3.
Exp[Datum[a[0] /. linresult]] /. Datum -> Identity
{56.10000000000001, 3.2}
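The value and error above are what first-order error propagation through the exponential gives. A minimal Python sketch, as a language-neutral illustration rather than the Datum implementation (the name exp_with_error is invented):

```python
import math


def exp_with_error(value, error):
    """Propagate an error through Exp: if y = exp(v), then dy = exp(v) * dv."""
    central = math.exp(value)
    return central, central * error


# a[0] = 4.027 +/- 0.057 from the linearized fit.
a, a_err = exp_with_error(4.027, 0.057)
```

To the precision set by the error, this gives a = 56.1 +/- 3.2.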
FindFit[InterneuronData, a Exp[-b x], x, {{a, 56}, {b, 0.04}}]
[Figure: the interneuron data with the fitted exponential, and a plot of the residuals.]
{a -> {59.90000000000001, 1.}, b -> {0.04680000000000001, 0.0012}, SumOfSquares -> 164.3913719853976, DegreesOfFreedom -> 22}
The two fits are similar, and both show some problems in the residuals. Also, FindFit seems more confident about the
uncertainties in the values of the fit parameters, probably without justification. The difference in the SumOfSquares is
because of the different values being used for the dependent variable of the data.
One likely reason for the problems here is that the data does not seem to asymptotically approach zero, as assumed by
our model, but some other value c.
frequency = a*Exp[-b*ipscTau] + c
Fitting to this model seems to confirm our hypothesis.
FindFit[InterneuronData, a Exp[-b x] + c, x, {{a, 60}, {b, 0.05}, {c, 15}}]
[Figure: the interneuron data with the fitted exponential plus constant, and a plot of the residuals.]
{a -> {59.00000000000001, 2.3}, b -> {0.0925, 0.0094}, c -> {13.8, 1.5}, SumOfSquares -> 132.853722227798, DegreesOfFreedom -> 21}
This essentially duplicates a fit presented by the experimenters in their original paper.
The model can be linearized as follows:
frequency = a*Exp[-b*ipscTau] + c
frequency - c = a*Exp[-b*ipscTau]
Log[frequency - c] = Log[a] - b*ipscTau
Thus, we can form another data set in which we subtract 13.8 from each value of frequency before taking the logarithms.
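[Input partly lost in extraction: mydata2 is formed by mapping a pure function over InterneuronData that subtracts 13.8 from each frequency before taking the logarithm; it is then fitted with LinearFit and the result stored in linresult2.]
[Figure: the shifted logarithmic data with the straight-line fit, and an inset plot of the residuals.]
[Output as extracted; minus signs may have been lost:]
{a[0] -> {4.030000000000001, 0.11}, a[1] -> {0.0901, 0.006600000000000001}, PseudoErrorY -> 0.2152138256526544, SumOfSquares -> 1.018973796545124, DegreesOfFreedom -> 22}
Exp[Datum[a[0] /. linresult2]] /. Datum -> Identity
{56.3, 6.200000000000001}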
5.1.4 References
Philip R. Bevington, Data Reduction and Error Analysis (McGraw-Hill, 1969), Chapter 11. A classic introduction to
nonlinear fitting techniques.
Xiang Ouyang and Philip L. Varghese, Appl. Optics 28 (1989), p. 1538. A discussion of a popular Fourier
transform-based algorithm for fitting spectra to Galatry and Voigt profiles.
William H. Press, Brian P. Flannery, Saul A. Teukolsky, and William T. Vetterling, Numerical Recipes: The Art of
Scientific Computing or Numerical Recipes in C: The Art of Scientific Computing (Cambridge Univ. Press), Section
14.4. A good brief introduction to nonlinear fitting and the Levenberg-Marquardt algorithm used by FindFit.
A.P. de Weijer, C.B. Lucasius, L. Buydens, G. Kateman, H.M. Heuvel, and H. Mannee, Anal. Chem. 66 (1994), p. 23.
An illuminating discussion of the research into fitting to curves using neural networks, this article also has a good
review of some of the problems with conventional techniques.
5.2 Examples
We begin by loading and examining some data that was briefly examined in Section 5.1.2.
LoadData[Cobalt60]
LoadData::name: Loading: Cobalt60Data
EDAListPlot[Cobalt60Data];
[Figure: plot of Cobalt60Data, with a single peak near channel 1830 on a sloping background.]
The theoretical prediction for the peak is that it should be a Gaussian, so part of the model for the fit will be the
Gaussian function included in the EDA`FindFit` package.
Information["Gaussian", LongForm -> False]
We also note that there is a background under the peak, i.e., counts in addition to just the Gaussian peak. We will
approximate the background as a straight line with the following slope.
a[1] = (16 - 44)/(1950 - 1700) // N
-0.112
Here the numbers in the calculation are based on values obtained by pointing and clicking at the left- and right-hand
sides of the plot of the data with the mouse. We calculate the intercept.
a[0] = 16 - a[1] 1950
234.4
From the plot of the data we estimate the center of the peak to be at channel 1830, and the amplitude above the
background is about 140 counts. The full width at half-maximum is about 90 channels, so we will try an initial value for
sigma of 45 channels.
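The arithmetic behind these initial estimates can be collected in one place. The sketch below is Python used as a language-neutral illustration; the variable names are invented. The last line uses the exact relation between the full width at half-maximum and the standard deviation of a Gaussian, which gives a somewhat smaller sigma than the rough value of 45 adopted in the text; for an initial estimate either is close enough.

```python
import math

# Points read off the left- and right-hand sides of the plot: (channel, counts).
left = (1700.0, 44.0)
right = (1950.0, 16.0)

# Straight-line background through the two points.
slope = (right[1] - left[1]) / (right[0] - left[0])
intercept = right[1] - slope * right[0]

# For a Gaussian, FWHM = 2 * sqrt(2 * ln 2) * sigma, about 2.355 * sigma.
fwhm = 90.0
sigma_from_fwhm = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
```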
Since we are going to use a[0] and a[1] as names for fit parameters, we must clear the definitions we have just made for
them.
Remove "a "
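[Input lost in extraction, followed by a figure of the data with a fitted curve and a plot of the residuals.]
[Input partly lost in extraction: a new data set is formed by mapping a pure function over Cobalt60Data that pairs each count #1[[2]] with an error derived from it; this data set is then fitted.]
[Figure: the data with error bars and the fit, and a plot of the residuals.]
[Output, partly garbled in extraction; minus signs may have been lost: a[0] -> {194., 10.}, x0 -> {1833., 0.51}, DegreesOfFreedom -> 281. The values of a[1], ampl, sigma, and ChiSquared are not recoverable.]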
Although the error bars in the data have obscured the curve representing the fit, the residuals and the ChiSquared per
DegreesOfFreedom show that the fit is reasonable, except perhaps on the far right-hand side.
There is another sort of bell-shaped curve that high-energy physicists usually call a "Breit-Wigner". The same curve is
usually called a "Lorentzian" by spectroscopists. The EDA`FindFit` package includes a BreitWigner function.
Information["BreitWigner", LongForm -> False]
BreitWigner[x,a,x0,gamma] is a non-relativistic
Breit-Wigner function of the independent
variable x, center value x0, and full width at
half-maximum gamma. The function is normalized
so that its integral from -Infinity to
+Infinity is equal to a. The maximum
amplitude of the peak is 2*a/(Pi*gamma).
BreitWigner is a synonym for Lorentzian.
[Figure: a Gaussian and a BreitWigner curve plotted together for comparison.]
Note that, consistent with a common convention, BreitWigner does not take the maximum amplitude as an
argument, but instead the total area under the curve. If the peak in the Cobalt60 data is a Breit-Wigner, this
corresponds to the total number of counts in the peak. The total number of counts in the data, peak plus background, can
be calculated.
Last[Plus @@ Cobalt60Data]
24057
Also, the width is specified by the full width at half-maximum, not the standard deviation. We fit the Cobalt60 data
with errors to a BreitWigner plus a linear background.
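[Input lost in extraction: a FindFit of the Cobalt60 data with errors to a linear background plus a BreitWigner.]
[Figure: the data with the fit, and a plot of the residuals.]
[Output as extracted; minus signs may have been lost:]
{a[0] -> {138., 11.}, a[1] -> {0.07650000000000001, 0.005700000000000001}, a -> {31630., 770.0000000000001}, x0 -> {1832.11, 0.55}, g -> {106.3, 2.4}, ChiSquared -> 448.2636168689677, DegreesOfFreedom -> 281}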
Because of the possibility that FindFit fell into the wrong minimum in the chi-squared, caution must be used in
rejecting the model (i.e., a BreitWigner plus background) because of a high ChiSquared per
DegreesOfFreedom. Nonetheless, the residuals seem to be saying clearly that the data does not match a
BreitWigner.
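One way to put a number on "a high ChiSquared per DegreesOfFreedom" is to compare the ratio with the spread expected for a correct model. The sketch below is Python used as a language-neutral illustration; the function names are invented. It uses the standard facts that a chi-squared variable has mean equal to the degrees of freedom and variance equal to twice the degrees of freedom.

```python
import math


def reduced_chi_squared(chisq, dof):
    """ChiSquared per DegreesOfFreedom; close to 1 for a good model and good errors."""
    return chisq / dof


def sigmas_from_unity(chisq, dof):
    """How far the reduced chi-squared is from 1, in units of its expected spread."""
    return (chisq / dof - 1.0) / math.sqrt(2.0 / dof)


# Values reported for the BreitWigner plus linear background fit.
red = reduced_chi_squared(448.2636168689677, 281)
n_sigma = sigmas_from_unity(448.2636168689677, 281)
```

For this fit the ratio is about 1.6, roughly seven expected standard deviations above 1. This supports the conclusion drawn from the residuals, subject to the caution about wrong minima.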
We return to modeling the Cobalt60 data to a Gaussian plus background, and concentrate on the problems we have
seen on the right-hand side. We can add a quadratic term to the background.
FindFit[mydata, a[0] + a[1] chan + a[2] chan^2 + Gaussian[chan, ampl, x0, sigma], chan, {{a[0], 234}, {a[1], -0.11}, {a[2], 0.02}, {ampl, 140}, {x0, 1830}, {sigma, 45}}]
[Figure: the data with the fit, and a plot of the residuals.]
[Output garbled in extraction: the fitted values are far from the initial estimates (the surviving fragments include 272.1666687, 1212.69729812, and 704.3369377000001), with ChiSquared -> 9042.73001502903 and DegreesOfFreedom -> 280.]
This is a ridiculous result because FindFit fell into the wrong minimum in the ChiSquared. Try again with the
quadratic term set initially to zero, storing the answer in result.
result = FindFit[mydata, a[0] + a[1] chan + a[2] chan^2 + Gaussian[chan, ampl, x0, sigma], chan, {{a[0], 234}, {a[1], -0.11}, {a[2], 0.0}, {ampl, 140}, {x0, 1830}, {sigma, 45}}]
[Figure: the data with the fit, and a plot of the residuals.]
[Output as extracted; minus signs may have been lost, and the negative exponents of a[2] are inferred:]
{a[0] -> {233.8485, 0.001}, a[1] -> {0.1236, 0.0058}, a[2] -> {6.*10^-6, 3.*10^-6}, ampl -> {139.87, 0.05300000000000001}, x0 -> {1833.34, 0.57}, sigma -> {44.02000000000001, 0.6200000000000001}, ChiSquared -> 374.8140617169628, DegreesOfFreedom -> 280}
Although not ridiculous, this fit is not nearly as good as the one without the quadratic background term. In fact, the
chi-square probability is pretty small.
ChiSquareProbability[ChiSquared, DegreesOfFreedom] /. result
0.0128891929874729
These sorts of difficulties are common when the number of fit parameters begins to get large. They are often particularly
acute when we are trying to fit to one or more peaks with a significant background under them. The "noise" in the data
further compounds the difficulty.
One frequently useful technique is to "sneak up" on the values. For example, we will fix all values of the fit at the values
of our best fit so far and allow the quadratic term to vary.
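[Input lost in extraction: a FindFit of mydata in which only a[2] is allowed to vary.]
[Figure: the data with the fit, and a plot of the residuals.]
[Output as extracted; minus signs may have been lost, and the negative exponents are inferred:]
{a[2] -> {7.000000000000001*10^-8, 1.1*10^-7}, ChiSquared -> 299.4365730959574, DegreesOfFreedom -> 285}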
Then we can try the full fit again, starting a[2] with the value from this fit, and storing the answer in result.
result = FindFit[mydata, a[0] + a[1] chan + a[2] chan^2 + Gaussian[chan, ampl, x0, sigma], chan, {{a[0], 194}, {a[1], -0.092}, {a[2], 7/10^8}, {ampl, 154}, {x0, 1833}, {sigma, 42.9}}]
[Figure: the data with the fit, and a plot of the residuals.]
[Output, mostly lost in extraction: only a[0] -> {194.00251, 0.00081} survives.]
The point to all these fits is to demonstrate just how sensitive the results can be to apparently small changes in the initial
values. Here we have a reasonable ChiSquared per DegreesOfFreedom and the quadratic term is now zero within
errors. The fact that we have not managed to account for the small upturn in the data on the right may be because it is
not an artifact of the background, but because there is another peak beyond the range of our data that is bringing up
these values. The fact that each of these fits takes a minute or more, depending on the hardware, means that nonlinear
fitting is often somewhat laborious and time consuming.
Finally, we can compare the total number of counts in the data, 24057, to the total predicted from the fit.
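[Input partly lost in extraction: N[NIntegrate[...]] of the fitted model, whose peak term is Gaussian[x, 154, 1833, 43], over the range of the data.]
23663.73555975553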
Whether or not this number is close to the experimental number cannot be answered until we do some error analysis.
We can find the contributions to this number from the background and the Gaussian, including errors, using some tools
discussed in Chapter 3.
We ignore the quadratic term in the background. Using the Datum construct, we can find the counts and error in the
counts due to the background.
bkgd = Integrate[a + b x, {x, 1700, 1985}] /. {a -> Datum[a[0]], b -> Datum[a[1]]} /. result
Datum[{7100., 3000.}]
We use the fact that the Gaussian is essentially zero at both the left and right of the data, so we can use the full
normalization of the Gaussian to calculate the counts under the peak.
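The full normalization referred to here is the closed-form area of a Gaussian. The sketch below is Python used as a language-neutral illustration; it assumes that the amplitude argument of Gaussian is the height of the peak, and the names gaussian_area and trapezoid are invented. It compares the closed form with a numerical integral over the range of the data, using the rounded parameters 154, 1833, and 43.

```python
import math


def gaussian(x, ampl, x0, sigma):
    """Assumed peak shape: ampl is the height of the peak at x = x0."""
    return ampl * math.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))


def gaussian_area(ampl, sigma):
    """Integral of the Gaussian from -Infinity to +Infinity."""
    return ampl * sigma * math.sqrt(2.0 * math.pi)


def trapezoid(f, lo, hi, n=2000):
    """Simple trapezoidal rule with n equal intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


area = gaussian_area(154.0, 43.0)
numeric = trapezoid(lambda x: gaussian(x, 154.0, 1833.0, 43.0), 1700.0, 1985.0)
```

The closed form gives about 16600 counts, and the integral over channels 1700 to 1985 differs from it by only about a tenth of a percent, so neglecting the tails outside the data is harmless here.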
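[Input partly lost in extraction: peak is computed with N and Datum from the full normalization of the Gaussian, using the fitted values in result.]
{16600., 220.}
[Input lost in extraction: bkgd and peak are added, and Datum is replaced by Identity.]
{23700., 3000.}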
A common line shape in the study of infrared absorption and emission is called a Galatry profile. The EDA`FindFit`
package includes a function for this shape.
Information["Galatry", LongForm -> False]
We will use this function to generate some made-up data of three Galatry peaks.
SeedRandom[1234];
mydata = Table[{t, 2 Galatry[0.95, 1.0, t - 5] + 1.3 Galatry[1.3, 1.0, t - 12] + 1.2 Galatry[0.90, 1.0, t - 25] + Random[Real, {-.07, .07}]}, {t, 0, 35, .2}];
EDAListPlot[mydata];
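[Figure: plot of mydata for t from 0 to 35, showing three peaks.]
[Input lost in extraction: a FindFit of mydata to the sum of three Galatry profiles with fit parameters c1, c2, c3, y1, y2, y3, z1, z2, z3, t1, t2, and t3.]
[Figure: the data with the fit, and an inset plot of the residuals.]
{c1 -> {2.1, 1.7}, c2 -> {1.4, 2.4}, c3 -> {1.1, 1.8}, y1 -> {0.93, 0.69}, y2 -> {1.2, 1.1}, y3 -> {0.9, 1.3}, z1 -> {1.05, 0.91}, z2 -> {1.1, 1.1}, z3 -> {1., 1.6}, t1 -> {5., 1.}, t2 -> {11.9, 1.3}, t3 -> {25., 1.8}, SumOfSquares -> 0.2553336118528619, DegreesOfFreedom -> 164}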
However, you should be aware that this fit pushed FindFit pretty hard. Not only is the fit very sensitive to the initial
values, but on a fairly fast UNIX workstation each attempt took over five minutes of CPU time to perform. This is why specialized
software to solve these sorts of problems has been written. See Ouyang and Varghese, listed in the references, Section
5.1.4, for an example of a technique for fitting this particular type of spectrum.
The options and default values used directly by FindFit are given by using Options.
TableForm[Options[FindFit]]
[Output lost in extraction: the table of FindFit options and their default values.]
For information on the other two tests used by FindFit to determine convergence, see the discussion of the
RelativeChiSquaredTolerance and ValueTolerance options in 5.3.1.3 and in 5.3.1.13, respectively.
FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, MaximumIterations -> 2]
FindFit::maxiters:
FindFit failed to converge after 2
iterations. The maximum number of
iterations can be increased with the
MaximumIterations option.
[Figure: the ganglion data with the fit, and an inset plot of the residuals.]
[Output as extracted; minus signs may have been lost:]
{a[0] -> {0.16, 0.5}, a[1] -> {0.1098, 0.006700000000000001}, SumOfSquares -> 25.55891062641812, DegreesOfFreedom -> 12}
FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, ShowFit -> False]
[Outputs as extracted; minus signs may have been lost:]
{a[0] -> {0.03, 0.5}, a[1] -> {0.1068, 0.006700000000000001}, SumOfSquares -> 25.35464811594685, DegreesOfFreedom -> 12}
FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, ShowFit -> False, ReturnErrors -> False]
{a[0] -> 0.03098373593165319, a[1] -> 0.1067875703568358, SumOfSquares -> 25.35464811594685, DegreesOfFreedom -> 12}
This option is identical to the option of the same name used by LinearFit.
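[Input partly lost in extraction: a FindFit of GanglionData to a[0] + a[1] x with ShowFit -> False and one further option whose name is lost.]
[Output as extracted, with the operator between the two terms lost: 0.03 0.1068 x]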
This option is identical to the option of the same name used by LinearFit.
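FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, ShowFit -> False, ShowProgress -> True]
[Output lost in extraction: the progress report printed during the iterations.]
[Output as extracted; minus signs may have been lost:]
{a[0] -> {0.03, 0.5}, a[1] -> {0.1068, 0.006700000000000001}, SumOfSquares -> 25.35464811594685, DegreesOfFreedom -> 12}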
Note that the final iteration satisfied two of the three possible criteria for convergence. This is common when FindFit
has found a true minimum in the sum of the squares.
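[Input partly lost in extraction: a FindFit of GanglionData to a[0] + a[1] x with ShowFit -> False; any further option is lost.]
[Output as extracted; minus signs may have been lost:]
{a[0] -> {0.03, 0.5}, a[1] -> {0.1068, 0.006700000000000001}, SumOfSquares -> 25.35464811594685, DegreesOfFreedom -> 12}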
FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, ShowFit -> False, UseSignificantFigures -> False]
[Output as extracted; minus signs may have been lost:]
{a[0] -> {0.03098373593165319, 0.4980841602975855}, a[1] -> {0.1067875703568358, 0.006739570956378}, SumOfSquares -> 25.35464811594685, DegreesOfFreedom -> 12}
Now the estimated errors in the fitted parameters have not been used to adjust the number of significant figures
displayed for either the values or the errors in those parameters.
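The adjustment that is switched off here can be mimicked with a simple rule: keep two significant figures of the error and round the value to the same decimal place. The sketch below is Python used as a language-neutral illustration. The rule is an assumption inferred from the numbers printed in this chapter, not a description of the EDA implementation, and the name round_to_error is invented.

```python
import math


def round_to_error(value, error, sig=2):
    """Round error to `sig` significant figures and value to the same decimal place."""
    decimals = sig - 1 - math.floor(math.log10(abs(error)))
    return round(value, decimals), round(error, decimals)


# Unadjusted values from the fit with UseSignificantFigures set to False.
a0 = round_to_error(0.03098373593165319, 0.4980841602975855)
a1 = round_to_error(0.1067875703568358, 0.006739570956378)
```

Applied to the unadjusted numbers above, the rule gives 0.03 and 0.5 for a[0] and 0.1068 and 0.0067 for a[1], the digits that the default output shows.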
This option is identical to the option of the same name used by LinearFit.
By default FindFit uses the program ShowFitResult to display the results of a fit. If we have the results of
FindFit, we can use ShowFitResult to display it. We will demonstrate with the GanglionData used above.
result = FindFit[GanglionData, a[0] + a[1] x + a[2] x^2, x, {a[0], a[1], a[2]}, ShowFit -> False]
ShowFitResult[GanglionData, a[0] + a[1] x + a[2] x^2, x, {a[0], a[1], a[2]}, result]
[Figure: the ganglion data with the fitted quadratic, and an inset plot of the residuals.]
-Graphics-
Note that ShowFitResult returns a Graphics object. This can be convenient if you wish to print the graphic, since
the Graphics object is not returned by FindFit itself.
Note that the syntax and functionality of ShowFitResult are very similar to those of ShowLinearFit in the
EDA`LinearFit` package. ShowFitResult is somewhat more general.
By default, ShowFitResult tries to find a quadrant in the data-fit graph in which to place the residual plot. If no
such quadrant can be found, the residual plot is displayed separately. Setting ResidualPlacement to Separate
causes the residual plot to always be displayed separately. ResidualPlacement can also be set to an integer
between 1 and 4, which causes the residual plot to be placed in that quadrant of the data-fit plot.
ShowFitResult[GanglionData, a[0] + a[1] x + a[2] x^2, x, {a[0], a[1], a[2]}, result, ResidualPlacement -> 4]
[Figure: the ganglion data with the fitted quadratic, with the residual plot placed in the fourth quadrant.]
-Graphics-
If ResidualPlacement is set to None, no residual plot is displayed.
Internally, ShowFitResult uses EDAListPlot, Plot, and ToFitFunction. Options given to
ShowFitResult for these are passed to the appropriate program. In addition, ShowFitResult itself uses only the
two options ResidualPlacement and UseSignificantFigures.
There can be some minor differences between the graphs displayed by FindFit and those displayed by
ShowFitResult. For example, if the data set has no explicit errors and the Reweight option is set to True in the
call to FindFit, then the PseudoErrorY is used in the calculation of the errors in the residuals by FindFit.
ShowFitResult is unaware of this, and will then have slightly different errors in the residual graph. By specifying
ReturnResiduals to True in the call to FindFit, the "better" numbers will be used by ShowFitResult.
We give an example by fitting GanglionData with the Reweight option.
result = FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, Reweight -> True]
[Figure: the ganglion data with the fit, and an inset plot of the residuals with their errors.]
[Output as extracted; minus signs may have been lost:]
{a[0] -> {0.03, 0.72}, a[1] -> {0.107, 0.01}, PseudoErrorY -> 1.453577429308659, SumOfSquares -> 25.35464811594684, DegreesOfFreedom -> 12}
ShowFitResult[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, result];
[Figure: the same fit drawn by ShowFitResult, with an inset plot of the residuals.]
ShowLinearFit does not show the errors in the residuals due to the PseudoErrorY term.
We can have FindFit explicitly return the residuals.
result = FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, Reweight -> True, ShowFit -> False, ReturnResiduals -> True]
[Output, partly garbled in extraction; minus signs may have been lost: a[0] -> {0.03, 0.72}, a[1] -> {0.107, 0.01}, PseudoErrorY -> 1.453577429308659, SumOfSquares -> 25.35464811594684, DegreesOfFreedom -> 12, and Residuals -> a list of 14 entries, each giving x, the residual, and its error. The surviving entries are:
11.1, 1.3, 1.5; 13.6, 1.4, 1.5; 22.5, 0.5, 1.5; 31.4, 0.4, 1.5;
32.7, 0.1, 1.5; 34., 0.2, 1.5; 53.8, 1.2, 1.5; 63., 2.1, 1.5;
67.00000000000001, 0.2, 1.5; 81., 1.8, 1.5; 101., 0.4, 1.5;
107., 0.5, 1.5; 114., 1.1, 1.5; 141., 3.2, 1.5]
We see that the PseudoErrorY has led to an error in each of the residuals. ShowFitResult will display these in
this case.
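The PseudoErrorY values printed in this chapter are consistent with the square root of the SumOfSquares divided by the DegreesOfFreedom. The sketch below is Python used as a language-neutral illustration; the relation is inferred from the printed numbers rather than taken from the EDA source, and the name pseudo_error_y is invented.

```python
import math


def pseudo_error_y(sum_of_squares, degrees_of_freedom):
    """Estimated error in each y value when the data have no explicit errors."""
    return math.sqrt(sum_of_squares / degrees_of_freedom)


# SumOfSquares and DegreesOfFreedom from the GanglionData fit above,
# and from the linearized InterneuronData fit in Section 5.1.
ganglion = pseudo_error_y(25.35464811594684, 12)
interneuron = pseudo_error_y(0.2547003525365055, 22)
```

For the GanglionData fit this gives about 1.45, which is the error of 1.5 attached to each residual after rounding.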
ShowFitResult[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, result];
[Figure: the ganglion data with the fit, and an inset plot of the residuals with their errors.]
Similarly, if the data set has errors in both coordinates, FindFit uses the effective variance in calculating the errors in
the residuals. In this case also, specifying ReturnResiduals as True in the call to FindFit will cause
ShowFitResult to be slightly more correct.
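[Input lost in extraction: result is assigned from a fit whose printed output is only partly preserved.]
[Output, partly garbled in extraction; minus signs may have been lost: a[0] -> {0.981, 0.021}, DegreesOfFreedom -> 19, and a chi-squared or sum-of-squares value of 21.05345454545474; the value of a[1] is lost.]
ToFitFunction[result, a[0] + a[1] x, x, {a[0], a[1]}]
[Output as extracted, with the operator between the two terms lost: 0.981 0.04122 x]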
The ToFitFunction is similar to the ToLinearFunction function supplied in the EDA`LinearFit` package,
but somewhat more general. However, one difference is that when the fit parameters have errors, by default,
ToFitFunction does not return two functions. In contrast, ToLinearFunction will return two functions, the first
being the result of the fit and the second the estimated errors in the function. ToFitFunction can return this second
"error function" if the UseFitErrors option is set to True.
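ToFitFunction[result, a[0] + a[1] x, x, {a[0], a[1]}, UseFitErrors -> True]
[Output, partly garbled in extraction: a list of two functions. The first is the fit function in 0.981 and 0.04122 x; the second, the error function, contains the terms 0.0004410000000000001 and 1.296*10^-7 x^2, where the sign of the exponent is inferred.]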
The EDA`FindFit` package includes some convenience functions to define peak shapes. They are BreitWigner,
Galatry, Gaussian, Lorentzian, PearsonVII, RelativisticBreitWigner, and Voigt.
ToFitFunction[result,model,ind,parameters] takes
result which is assumed to be a set of Rules
such as is returned by default by FindFit and
returns a function of the independent variable
ind for model. The model is assumed to be a
function of ind and the parameters. The
parameters may be a list of Symbols, or
{Symbol, value} pairs, although the values are
in all cases ignored. By default a single
function is returned, which is evaluated at the
result of the fit. If UseFitErrors is set to
True, the routine returns a list of two
functions; the first is as before and the second
is the estimated error in the function due to
the errors in the fit parameters if any.
BreitWigner[x,a,x0,gamma] is a non-relativistic
Breit-Wigner function of the independent
variable x, center value x0, and full width at
half-maximum gamma. The function is normalized
so that its integral from -Infinity to +Infinity
is equal to a. The maximum amplitude of the
peak is 2*a/(Pi*gamma). BreitWigner is a synonym
for Lorentzian.
Lorentzian[x,a,x0,gamma] is a non-relativistic
Lorentzian function of the independent variable
x, center value x0, and full width at
half-maximum gamma. The function is normalized
so that its integral from -Infinity to +Infinity
is equal to a. The maximum amplitude of the
peak is 2*a/(Pi*gamma). Lorentzian is a synonym
for BreitWigner.