
Chapter5.nb

Chapter 5
Fitting Data to Nonlinear Models
One of the more difficult topics in all of data analysis in the physical sciences is fitting data to nonlinear models. Often
such fits require large computational resources and great skill, patience, and intuition on the part of the analyst. These
difficulties are one of the reasons that, as we shall see, the whole topic of spectral line shapes is still a very active subject
of research spanning the fields of chemistry, physics, astronomy, and more. In addition, computational methods of
nonlinear fitting are still a current research topic in computer science.
However, since sometimes nature really is nonlinear, such fits are often unavoidable, and the principles and some tools
for nonlinear fitting are the topics of this chapter. The main EDA program introduced here is FindFit, which
accomplishes fits to arbitrary models.
FindFit is similar to the NonlinearFit function in the Statistics`NonlinearFit` package, which is
standard with Mathematica. The primary differences are that FindFit: (1) recognizes EDA's data format, including
errors in both coordinates; (2) estimates errors in fit parameters; (3) by default displays graphical information about the
fit; and (4) uses an algorithm that has been optimized for speed and stability for the types of nonlinear fits commonly
performed in the physical sciences and engineering.
The following command loads the initialization of EDA.
Needs["EDA`Master`"]

5.1 Introduction


5.1.1 Overview of FindFit

The previous chapter, "Fitting Data to Linear Models by Least Square Techniques", introduced the distinction between
linear and nonlinear models. To briefly review, the terms refer to the way in which the parameters to which we are
fitting enter into the model.
In this chapter we discuss nonlinear models and the EDA program FindFit that can often find a reasonable fit to them.
Recall that if sos is the sum of the squares of the residuals, then we are seeking the minimum in its value. If we are
fitting to parameters a[0], a[1], ... , a[m], the answer is found by solving a set of simultaneous equations.
D[ sos, a[0] ] == 0
D[ sos, a[1] ] == 0
...
D[ sos, a[m] ] == 0
This in general can be done analytically provided the model to which we are fitting is linear in the parameters. Similarly,
when there are explicit errors in the data, we form the chi-squared, say chisq, and we solve the corresponding equations.
D[ chisq, a[0] ] == 0
D[ chisq, a[1] ] == 0
...
D[ chisq, a[m] ] == 0
This again will be analytic for a linear fit.
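As an illustration of why the linear case is analytic, here is a minimal sketch in Python (not EDA code) for the straight-line model y = a0 + a1 x. Setting the two derivatives of the sum of the squares to zero gives two linear equations, which are solved directly. The function name fit_line and the data are made up for this example.

```python
# Closed-form least-squares fit of y = a0 + a1*x, a model linear in a0 and a1.
# d(sos)/d(a0) = 0 and d(sos)/d(a1) = 0 are two linear equations in a0 and a1,
# so they can be solved directly; no iteration is needed.

def fit_line(xs, ys):
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx          # determinant of the two linear equations
    a0 = (sxx * sy - sx * sxy) / det
    a1 = (n * sxy - sx * sy) / det
    return a0, a1

# Made-up data lying exactly on y = 1 + 2 x (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a0, a1 = fit_line(xs, ys)
print(a0, a1)
```

For a model that is nonlinear in its parameters, the corresponding equations are no longer linear in the unknowns, and a direct solution of this kind is generally not available.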


For a nonlinear fit, such analytic solutions are in general not possible, so iteration is required to find the minimum in
the sum of the squares or the chi-squared.
If we imagine a plot of the value of the sum of the squares or the chi-squared as a function of the parameters to which
we are fitting, in general for a nonlinear fit there may be many local minima instead of one big one, as is the case for
linear fitting. For example, if we are fitting to two parameters, param1 and param2, the chi-squared as a function of the
values of the parameters might have two or more local minima.

Plot3D[1 - 1/(param1^2 + param2^2 + 35) -
    1/((param1 - 25)^2 + (param2 - 10)^2 + 25),
  {param1, -20, 50}, {param2, -25, 25},
  PlotRange -> All, PlotPoints -> 40,
  Ticks -> {Automatic, Automatic, None},
  AxesLabel -> {"param1", "param2", "chi squared"},
  ViewPoint -> {1.3, 2.4, .75}];

[Figure: surface plot of chi-squared as a function of param1 and param2, showing two local minima.]

Thus, a nonlinear fitter must usually start off with initial values close to the real minimum.
The general technique for iteration, "steepest descent", is analogous to the
following situation. It was "a dark and stormy night". Foggy too. You are on the
side of a hill and want to find the valley. So you step in the direction in which the
slope goes down, and continue moving in the direction of the local definition of
"down" until you are in the valley. Of course, if you take giant steps you might
step over the next hill and end up in the wrong valley. And when you get close to
the bottom of the valley you will want to start taking baby steps. The
Levenberg-Marquardt algorithm used by many nonlinear fitters, including
FindFit, is essentially some clever heuristics to define giant steps and baby
steps.
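The dependence on the starting point can be sketched in a few lines of Python. This is plain steepest descent with a fixed step size, not the Levenberg-Marquardt algorithm and not EDA code; the surface is an assumed reading of the two-minimum Plot3D example in this section, and the step size and starting points are arbitrary choices for illustration.

```python
# Fixed-step steepest descent on a made-up chi-squared surface with two minima.
# The surface is an assumed reading of the two-minimum example in this section.

def chisq(p1, p2):
    return (1.0
            - 1.0 / (p1**2 + p2**2 + 35.0)
            - 1.0 / ((p1 - 25.0)**2 + (p2 - 10.0)**2 + 25.0))

def gradient(p1, p2):
    # Partial derivatives of chisq with respect to p1 and p2.
    d1 = (p1**2 + p2**2 + 35.0) ** 2
    d2 = ((p1 - 25.0)**2 + (p2 - 10.0)**2 + 25.0) ** 2
    g1 = 2.0 * p1 / d1 + 2.0 * (p1 - 25.0) / d2
    g2 = 2.0 * p2 / d1 + 2.0 * (p2 - 10.0) / d2
    return g1, g2

def descend(p1, p2, step=200.0, iterations=2000):
    # Repeatedly step "downhill", against the local gradient.
    for _ in range(iterations):
        g1, g2 = gradient(p1, p2)
        p1 -= step * g1
        p2 -= step * g2
    return p1, p2

left = descend(-10.0, -10.0)   # starts near the shallower minimum
right = descend(35.0, 15.0)    # starts near the deeper minimum
print(left, chisq(*left))
print(right, chisq(*right))
```

The two starting points end in different valleys, and only one of them is the global minimum. A fitter such as Levenberg-Marquardt adjusts the step size as it goes, but it is subject to the same dependence on where it starts.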
If the data has noise, which is almost a certainty for real experimental data, then there is a further difficulty. We can take
two sets of data from the same apparatus using the same sample, fit each dataset to a nonlinear model using identical
initial values for the fit parameters, and get very different final fits. This situation leads to ambiguity about which fit
results are "correct".

5.1.2 Providing Initial Parameter Values to FindFit

As already mentioned, unless we are fitting to a very simple model, FindFit must be provided with initial estimates of
the parameters.
We will first fit to a simple model where no such parameter estimates are necessary.
mydata = Table[{n, 2 Cos[3 n]}, {n, 0., 1.5, .5}];
FindFit[mydata, b Cos[a x], x, {a, b}]

[Plot: data and fitted curve, with an inset plot of the residuals.]

{a -> {3., 0.34}, b -> {2., 0.71},
 SumOfSquares -> 0.0000952225702044433, DegreesOfFreedom -> 2}


Actually, when not given initial values FindFit starts with initial values for the parameters, here a and b, equal to 1.
Now the syntax of the call to FindFit is given.
FindFit[data, model, ind,
{param1, ... , paramM} ]
Here ind is the name of the independent variable in model. Also note that this model is nonlinear in the parameters to
which we are fitting, param1 ... paramM.
The EDA`FindFit` package includes a Gaussian function.
Information["Gaussian", LongForm -> False]
Gaussian[x,ampl,x0,sigma] is a Gaussian function
of the independent variable x with amplitude
ampl, center value x0, and standard deviation
sigma. The function is normalized so that its
integral from -Infinity to +Infinity is equal
to Sqrt[2Pi]*ampl*sigma.

We will use it to generate some data.
SeedRandom[1234];
mydata = Table[{x, Gaussian[x, 10, 8, 2.5] + Random[Real, {-1, 1}]},
  {x, 1, 16, .1}];
We examine the data with EDAListPlot.
EDAListPlot[mydata];

[Plot: mydata versus x.]

We fit the data using FindFit; the fit takes a few seconds to complete.

FindFit[mydata, Gaussian[x, ampl, x0, sigma], x, {ampl, x0, sigma}]

[Plot: data and fitted curve, followed by a separate plot of the residuals.]

{ampl -> {3.9, 1.9}, x0 -> {-100., 3600.}, sigma -> {-1000., 23000.},
 SumOfSquares -> 1812.095726856373, DegreesOfFreedom -> 148}
This result is ridiculous. What has happened is that FindFit thinks it has found a local minimum in the sum of the
squares at these silly values.
Note that because all four quadrants of the plot of the data and the fit contain significant amounts of data, FindFit is
displaying the plot of the residuals separately.
Repeating the fit with reasonable initial guesses of the parameters gives a much better result.
FindFit[mydata, Gaussian[x, ampl, x0, sigma], x,
  {{ampl, 9}, {x0, 9}, {sigma, 3}}]

[Plot: data and fitted curve, followed by a separate plot of the residuals.]

{ampl -> {9.91, 0.18}, x0 -> {8.075, 0.05400000000000001},
 sigma -> {2.488, 0.05400000000000001}, SumOfSquares -> 49.5618230640989,
 DegreesOfFreedom -> 148}

Finding good initial values is sometimes subtle. As an example, we generate some made-up data for three peaks with a
Lorentzian shape using the Lorentzian function supplied with the EDA`FindFit` package.
Information["Lorentzian", LongForm -> False]
Lorentzian[x,a,x0,gamma] is a nonrelativistic
Lorentzian function of the independent variable
x, center value x0, and full width at
half-maximum gamma. The function is normalized
so that its integral from -Infinity to
+Infinity is equal to a. The maximum
amplitude of the peak is 2*a/(Pi*gamma).
Lorentzian is a synonym for BreitWigner.
mydata = Table[{x, Lorentzian[x, 100, 18, 8] +
     Lorentzian[x, 10, 23, 4] + Lorentzian[x, 80, 35, 8]}, {x, 1, 60, .1}];
EDAListPlot[mydata];

[Plot: mydata versus x.]

It is difficult to see the small peak on the shoulder of the leftmost peak, or to provide initial estimates for its values.
However, we can see the peak and probably make some sensible guesses of its parameters.
Despite the claims sometimes seen in glossy advertisements, there is no known software that can find and estimate peaks
for data such as this as well as a human expert. This is in part because of the great ability of the human visual system to
be an intuitive integrator.
Almost all versions of the notebook front end for Mathematica include a provision to use our visual ability. We can
display a graph of the data and using the mouse, point at a desired location on the graph. The coordinates will be
displayed in the notebook window, and can also be copied out using the cut-and-paste facility; consult the manual for
your version of the notebook to discover how to do this. For example, here we load and examine a part of a nuclear
spectrum.
LoadData[Cobalt60]
LoadData::name: Loading: Cobalt60Data
Information["Cobalt60Data", LongForm -> False]
Cobalt60Data is part of a Cobalt-60 nuclear
spectrum. The format is: {channel, counts}.
The data were taken in the II Year Lab of the
Dept. of Physics, Univ. of Toronto with a NaI
scintillator and a PC-based multichannel
analyzer running PCMCA software by David
Harrison (1990, unpublished).
EDAListPlot[Cobalt60Data];

[Plot: counts versus channel for Cobalt60Data.]

Using the mouse we can pick out the coordinates of the maximum, the points on the peak corresponding to roughly
one-half the maximum, and so on. So, the position of the peak is found.
{1831.88, 181.251}

In some earlier versions of the Windows front end, if the notebook is closed and then reopened, the coordinates will be
returned in PostScript units instead of the actual units of the plot; rerendering the plot works around this bug.
Finally, for simple spectra EDA supplies a function, FindPeaks, which can sometimes provide initial guesses of the
number of peaks and their parameter values close enough for FindFit. The function is described in Section 8.3.

5.1.3 Comparing LinearFit and FindFit

Although FindFit is intended for fitting to nonlinear models, it can also find the minimum in the sum of the squares
or the chi-squared for linear models, albeit more slowly than LinearFit.
For example, here we repeat a fit from Chapter 4.

LoadData[Thermocouple]
LinearFit[ThermocoupleData, {0, 1, 2}, a]
LoadData::name: Loading: ThermocoupleData

[Plot: data and fitted curve, with an inset plot of the residuals.]

{a[0] -> {-0.886, 0.03}, a[1] -> {0.0352, 0.0014}, a[2] -> {0.00006, 0.000013},
 ChiSquared -> 1.006602038693567, DegreesOfFreedom -> 18}

Using FindFit yields the same result.


FindFit[ThermocoupleData, a[0] + a[1] x + a[2] x^2, x, {a[0], a[1], a[2]}]

[Plot: data and fitted curve, with an inset plot of the residuals.]

{a[0] -> {-0.886, 0.03}, a[1] -> {0.0352, 0.0014}, a[2] -> {0.00006, 0.000013},
 ChiSquared -> 1.006602038696037, DegreesOfFreedom -> 18}

For this example, FindFit took about four times as long as LinearFit.
Recall from the chapter on linear fitting that if the data have explicit errors in both coordinates, the effective variance
technique makes the fit essentially nonlinear unless the model is a straight line. Thus, in this case LinearFit iterates
until it finds a minimum in the chi-squared. When there are errors in both coordinates, FindFit also calculates the
error in the dependent variable based on the effective variance. However, although there is a fairly comprehensive
literature on using this technique in linear fits, the main justification of using effective variances in nonlinear fits is
based only on a series of experiments in which it was found that most often the algorithm of FindFit would produce
reasonable results.
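For readers who want to see the idea in code, here is a minimal Python sketch of the usual textbook form of the effective variance, in which the error in x is projected onto y through the slope of the model. Whether FindFit implements exactly this form is not stated here; the tuple layout (x, y, sigma_x, sigma_y), the exponential model, and the numbers are all assumptions made for the example.

```python
import math

def effective_sigma(sigma_x, sigma_y, slope):
    """Combine errors in both coordinates into a single error in y."""
    return math.sqrt(sigma_y**2 + (slope * sigma_x)**2)

def model(x, a, b):
    return a * math.exp(-b * x)

def model_slope(x, a, b):
    # Derivative of the model with respect to x.
    return -a * b * math.exp(-b * x)

def chi_squared(data, a, b):
    """data is a list of (x, y, sigma_x, sigma_y) tuples (hypothetical layout)."""
    total = 0.0
    for x, y, sigma_x, sigma_y in data:
        sigma = effective_sigma(sigma_x, sigma_y, model_slope(x, a, b))
        total += ((y - model(x, a, b)) / sigma) ** 2
    return total

# Made-up data, for illustration only.
data = [(10.0, 40.0, 0.5, 2.0), (20.0, 25.0, 0.5, 2.0), (30.0, 16.0, 0.5, 2.0)]
print(chi_squared(data, 60.0, 0.045))
```

Because the slope depends on the fit parameters unless the model is a straight line, the weights change as the parameters change, which is why the fit becomes iterative.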
We repeat another fit from the chapter on linear models.

LoadData[Reactance]
LinearFit[ReactanceData, {-1, 1}, a]
LoadData::name: Loading: ReactanceData

[Plot: data and fitted curve, with an inset plot of the residuals.]

{a[-1] -> {-590000.0000000001, 110000.}, a[1] -> {0.00102, 0.0002},
 ChiSquared -> 2.191432350342454, DegreesOfFreedom -> 3}

FindFit gives a similar result if started sufficiently close to the "final" values.
FindFit[ReactanceData, a[-1]/x + a[1] x, x,
  {{a[-1], -590000}, {a[1], 0.001}}]

[Plot: data and fitted curve, with an inset plot of the residuals.]

{a[-1] -> {-590000.0000000001, 3.800000000000001*10^-14},
 a[1] -> {0.001009, 0.000021},
 ChiSquared -> 2.203721816446138, DegreesOfFreedom -> 3}

For the above results, FindFit appears to have done a poorer job of estimating the errors in the fit parameters than
LinearFit.
Another difference between the above two fits is that by default the graphs produced by LinearFit display the errors
in the fit parameters, while the graphs produced by FindFit do not. This is because for many nonlinear fits, sorting
out how to combine the various error terms is problematic. To display the fit errors the UseFitErrors option,
discussed in Section 5.3.2.2, may be set to True.

FindFit[ReactanceData, a[-1]/x + a[1] x, x,
  {{a[-1], -590000}, {a[1], 0.001}}, UseFitErrors -> True]

[Plot: data and fitted curve with lines representing the fit errors, and an inset plot of the residuals.]

{a[-1] -> {-590000.0000000001, 3.800000000000001*10^-14},
 a[1] -> {0.001009, 0.000021},
 ChiSquared -> 2.203721816446138, DegreesOfFreedom -> 3}

Note that as opposed to the plot produced by LinearFit, here the lines representing the errors in the fit parameters are
parallel. This is an artifact of a heuristic used by the FindFit package to combine the two terms; the heuristic is often
reasonable for true nonlinear fits.
As a final comparison between LinearFit and FindFit, we load and examine some data on oscillations in
interneuron networks.
LoadData[Interneuron]
LoadData::name: Loading: InterneuronData
Information["InterneuronData", LongForm -> False]
InterneuronData are data of oscillations of
interneuron networks driven by metabotropic
glutamate receptor activation in slices of rat
hippocampus and neocortex. The data are from
Miles A. Whittington, Roger D. Traub and John
G.R. Jefferys, Nature 373, (1995) pg 612. The
format of the data is {ipscTau, frequency},
where ipscTau is the decay constant of the
inhibitory postsynaptic currents in ms and
frequency is the oscillation frequency of the
network in Hz.
EDAListPlot[InterneuronData];

[Plot: frequency versus ipscTau for InterneuronData.]

The data appears to be modeled by an exponential relationship.


frequency = a*Exp[-b*ipscTau]
This is nonlinear in the parameters a and b. However, as mentioned in the previous chapter, we can linearize the
relationship by taking the logarithms of both sides.
Log[frequency] = Log[a] - b*ipscTau
Thus, we form a data set of {ipscTau, Log[frequency]}.
mydata = ({First[#1], Log[Last[#1]]} &) /@ InterneuronData;

We fit this transformed data to a straight line.


linresult = LinearFit[mydata, {0, 1}, a]

[Plot: transformed data and fitted line, with an inset plot of the residuals.]

{a[0] -> {4.027000000000001, 0.05700000000000001},
 a[1] -> {-0.04240000000000001, 0.0033}, PseudoErrorY -> 0.1075978101620072,
 SumOfSquares -> 0.2547003525365055, DegreesOfFreedom -> 22}

Thus b = 0.0424 +/- 0.0033 in (ms)^(-1). Although there are problems with the above fit, which are discussed below,
we will calculate a and its errors using the Datum construct discussed in Chapter 3.
Exp[Datum[a[0] /. linresult]] /. Datum -> Identity
{56.10000000000001, 3.2}

Using FindFit, we can fit directly to the exponential.


FindFit[InterneuronData, a Exp[-b x], x, {{a, 56}, {b, 0.04}}]

[Plot: data and fitted curve, with an inset plot of the residuals.]

{a -> {59.90000000000001, 1.}, b -> {0.04680000000000001, 0.0012},
 SumOfSquares -> 164.3913719853976, DegreesOfFreedom -> 22}

The two fits are similar, and both show some problems in the residuals. Also, FindFit seems more confident about the
uncertainties in the values of the fit parameters, probably without justification. The difference in the SumOfSquares is
because of the different values being used for the dependent variable of the data.

One likely reason for the problems here is that the data does not seem to asymptotically approach zero, as assumed by
our model, but some other value c.
frequency = a*Exp[-b*ipscTau] + c
Fitting to this model seems to confirm our hypothesis.
FindFit[InterneuronData, a Exp[-b x] + c, x, {{a, 60}, {b, 0.05}, {c, 15}}]

[Plot: data and fitted curve, with an inset plot of the residuals.]

{a -> {59.00000000000001, 2.3}, b -> {0.0925, 0.0094}, c -> {13.8, 1.5},
 SumOfSquares -> 132.853722227798, DegreesOfFreedom -> 21}

This essentially duplicates a fit presented by the experimenters in their original paper.
The model can be linearized as follows:
frequency = a*Exp[-b*ipscTau] + c
frequency - c = a*Exp[-b*ipscTau]
Log[frequency - c] = Log[a] - b*ipscTau
Thus, we can form another data set in which we subtract 13.8 from each value of frequency before taking the logarithms.
mydata2 = ({First[#1], Log[Last[#1] - 13.8]} &) /@ InterneuronData;
We fit this to a straight line.
linresult2 = LinearFit[mydata2, {0, 1}, a]

[Plot: transformed data and fitted line, with an inset plot of the residuals.]

{a[0] -> {4.030000000000001, 0.11}, a[1] -> {-0.0901, 0.006600000000000001},
 PseudoErrorY -> 0.2152138256526544, SumOfSquares -> 1.018973796545124,
 DegreesOfFreedom -> 22}

Thus, a can be calculated.

Exp[Datum[a[0] /. linresult2]] /. Datum -> Identity
{56.3, 6.200000000000001}

This number is within errors of the result found by FindFit.


Of course, without FindFit or some other nonlinear fitter available, it would be difficult to get an objective estimate
of the value that should be subtracted from each value of the dependent variable frequency.
Also, although not an issue here, this sort of linearization procedure can introduce biases in the values of the estimates of
the fit parameters; this is discussed further in Chapter 8.
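The linearization step can be sketched in Python as follows. This is not EDA code: the data are noise-free and generated from assumed values of a, b, and c, so the example only shows the mechanics of subtracting the offset, taking logarithms, and fitting a straight line. With noisy data the logarithm distorts the errors, which is the source of the bias mentioned above.

```python
import math

def fit_line(xs, ys):
    """Closed-form least-squares intercept and slope of a straight line."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    return (sxx * sy - sx * sxy) / det, (n * sxy - sx * sy) / det

# Assumed "true" values, used only to generate illustrative data.
a_true, b_true, c_true = 59.0, 0.09, 13.8
taus = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
freqs = [a_true * math.exp(-b_true * t) + c_true for t in taus]

# Log[frequency - c] = Log[a] - b*ipscTau is a straight line in ipscTau.
logs = [math.log(f - c_true) for f in freqs]
intercept, slope = fit_line(taus, logs)
a_est, b_est = math.exp(intercept), -slope
print(a_est, b_est)
```

Note that the offset c must be known before the logarithm can be taken, which is exactly the step that required a nonlinear fit in the text.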

5.1.4 References

Philip R. Bevington, Data Reduction and Error Analysis (McGraw-Hill, 1969), Chapter 11. A classic introduction to
nonlinear fitting techniques.
Xiang Ouyang and Philip L. Varghese, Appl. Optics 28 (1989), p. 1538. A discussion of a popular Fourier
transform-based algorithm for fitting spectra to Galatry and Voigt profiles.
William H. Press, Brian P. Flannery, Saul A. Teukolsky, and William T. Vetterling, Numerical Recipes: The Art of
Scientific Computing or Numerical Recipes in C: The Art of Scientific Computing (Cambridge Univ. Press), Section
14.4. A good brief introduction to nonlinear fitting and the Levenberg-Marquardt algorithm used by FindFit.
A.P. de Weijer, C.B. Lucasius, L. Buydens, G. Kateman, H.M. Heuvel, and H. Mannee, Anal. Chem. 66 (1994), p. 23.
An illuminating discussion of the research into fitting to curves using neural networks, this article also has a good
review of some of the problems with conventional techniques.

5.2 Examples


5.2.1 Fitting to a Single Peak with a Background

We begin by loading and examining some data that was briefly examined in Section 5.1.2.
LoadData[Cobalt60]
LoadData::name: Loading: Cobalt60Data
Information["Cobalt60Data", LongForm -> False]
Cobalt60Data is part of a Cobalt-60 nuclear
spectrum. The format is: {channel, counts}.
The data were taken in the II Year Lab of the
Dept. of Physics, Univ. of Toronto with a NaI
scintillator and a PC-based multichannel
analyzer running PCMCA software by David
Harrison (1990, unpublished).
EDAListPlot[Cobalt60Data];

[Plot: counts versus channel for Cobalt60Data.]

The theoretical prediction for the peak is that it should be a Gaussian, so part of the model for the fit will be the
Gaussian function included in the EDA`FindFit` package.
Information["Gaussian", LongForm -> False]
Gaussian[x,ampl,x0,sigma] is a Gaussian function
of the independent variable x with amplitude
ampl, center value x0, and standard deviation
sigma. The function is normalized so that its
integral from -Infinity to +Infinity is equal
to Sqrt[2Pi]*ampl*sigma.

We also note that there is a background under the peak, i.e., counts in addition to just the Gaussian peak. We will
approximate the background as a straight line with the following slope.
a[1] = (16 - 44)/(1950 - 1700) // N
-0.112
Here the numbers in the calculation are based on values obtained by pointing and clicking at the left and right-hand
sides of the plot of the data with the mouse. We calculate the intercept.
a[0] = 16 - a[1] 1950
234.4

From the plot of the data we estimate the center of the peak to be at channel 1830, and the amplitude above the
background is about 140 counts. The full width at half-maximum is about 90 channels, so we will try an initial value for
sigma of 45 channels. (For a Gaussian the full width at half-maximum is about 2.35 sigma, so 38 channels would be
closer, but a rough value suffices as a starting point.)
Since we are going to use a[0] and a[1] as names for fit parameters, we must clear the definitions we have just made for
them.
Remove["a"]

Now we fit to the data.


FindFit[Cobalt60Data, a[0] + a[1] chan + Gaussian[chan, ampl, x0, sigma], chan,
  {{a[0], 234}, {a[1], -0.11}, {ampl, 140}, {x0, 1830}, {sigma, 45}}]

[Plot: data and fitted curve, followed by a separate plot of the residuals.]

{a[0] -> {201.6, 1.6}, a[1] -> {-0.09559, 0.00087}, ampl -> {154.03, 0.17},
 x0 -> {1833.514, 0.05400000000000001}, sigma -> {43.498, 0.06900000000000001},
 SumOfSquares -> 23500.7168442931, DegreesOfFreedom -> 281}
Note that the Reweight option, which is True by default for LinearFit, is False by default for FindFit.
Section 5.3.1.9 discusses this further.
Here is the syntax of the call to FindFit.
FindFit[ data, model, ind, params ]
The model is the model to which we are fitting and ind is the independent variable. The params can be a list of the
parameter names.
{param1, param2, ... , paramM}
More often params is a list of {parameter name, initial value} pairs.
{ {param1, value1}, {param2, value2}, ... ,
{paramM, valueM} }
The world will end if model contains undefined arguments in addition to ind and those named in params.
Other forms of parameter specification include {name, min, start, max}, which causes FindFit to begin parameter
name at start and not allow it to go below min or above max. Or the parameter can be specified as {name, min, max}, in
which case the value starts at (min + max)/2.
Note that in comparison to fits you may have done using LinearFit, FindFit is very slow. On a fairly fast UNIX
workstation, fitting to Cobalt60Data took over 30 seconds.
From the theory of nuclear physics, we expect the error in the number of counts in each channel to be Sqrt[counts].
Thus, we form a new data set with these errors included.
mydata = N[({#1[[1]], #1[[2]], Sqrt[#1[[2]]]} &) /@ Cobalt60Data];
We fit to this new data set.
FindFit[mydata, a[0] + a[1] chan + Gaussian[chan, ampl, x0, sigma], chan,
  {{a[0], 234}, {a[1], -0.11}, {ampl, 140}, {x0, 1830}, {sigma, 45}}]

[Plot: data with error bars and fitted curve, followed by a separate plot of the residuals.]

{a[0] -> {194., 10.}, a[1] -> {-0.0919, 0.005200000000000001},
 ampl -> {154.8, 1.7}, x0 -> {1833., 0.51}, sigma -> {42.89, 0.55},
 ChiSquared -> 299.4179776308529, DegreesOfFreedom -> 281}


Although the error bars in the data have obscured the curve representing the fit, the residuals and the ChiSquared per
DegreesOfFreedom show that the fit is reasonable, except perhaps on the far right-hand side.
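The quantities involved can be sketched in Python (not EDA code). The functions below assign Sqrt[counts] errors and form the chi-squared; the three-channel example is made up, while the final ratio uses the ChiSquared and DegreesOfFreedom reported by the fit above.

```python
import math

def poisson_errors(counts):
    """Error in each channel taken as the square root of the counts."""
    return [math.sqrt(c) for c in counts]

def chi_squared(observed, predicted, errors):
    """Sum of squared residuals, each weighted by its error."""
    return sum(((o - p) / e) ** 2
               for o, p, e in zip(observed, predicted, errors))

# Made-up three-channel example: each residual equals one error bar.
counts = [100.0, 144.0, 81.0]
predicted = [110.0, 132.0, 90.0]
example = chi_squared(counts, predicted, poisson_errors(counts))
print(example)

# ChiSquared per DegreesOfFreedom for the fit reported in the text.
reduced = 299.4179776308529 / 281
print(reduced)
```

A ratio close to one, as here, indicates that the scatter of the data about the fit is about what the assigned errors predict.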
There is another sort of bell-shaped curve that high-energy physicists usually call a "Breit-Wigner". The same curve is
usually called a "Lorentzian" by spectroscopists. The EDA`FindFit` package includes a BreitWigner function.
Information["BreitWigner", LongForm -> False]
BreitWigner[x,a,x0,gamma] is a nonrelativistic
BreitWigner function of the independent
variable x, center value x0, and full width at
half-maximum gamma. The function is normalized
so that its integral from -Infinity to
+Infinity is equal to a. The maximum
amplitude of the peak is 2*a/(Pi*gamma).
BreitWigner is a synonym for Lorentzian.

We can compare the shape of a BreitWigner to a Gaussian.


Needs["Graphics`Legend`"]
Plot[{Gaussian[x, 10, 10, 2.5], BreitWigner[x, 80, 10, 5]},
  {x, 0, 20}, PlotStyle -> {GrayLevel[0], Dashing[{0.03}]},
  PlotLegend -> {"Gaussian", "BreitWigner"}, LegendSize -> {1, .4}]

[Plot: a Gaussian (solid) and a BreitWigner (dashed) of similar height, both centered at x = 10.]

Note that, consistent with a common convention, BreitWigner does not take the maximum amplitude as an
argument, but instead the total area under the curve. If the peak in the Cobalt-60 data is a Breit-Wigner, this
corresponds to the total number of counts in the peak. The total number of counts in the data, peak plus background, can
be calculated.
Last[Plus @@ Cobalt60Data]
24057

Also, the width is specified by the full width at the half-maximum, not the standard deviation. We fit the Cobalt-60 data
with errors to a BreitWigner plus a linear background.

FindFit[mydata, a[0] + a[1] chan + BreitWigner[chan, a, x0, g], chan,
  {{a[0], 234}, {a[1], -0.11}, {a, 24000}, {x0, 1830}, {g, 90}}]

[Plot: data with error bars and fitted curve, followed by a separate plot of the residuals.]

{a[0] -> {138., 11.}, a[1] -> {-0.07650000000000001, 0.005700000000000001},
 a -> {31630., 770.0000000000001}, x0 -> {1832.11, 0.55}, g -> {106.3, 2.4},
 ChiSquared -> 448.2636168689677, DegreesOfFreedom -> 281}

Because of the possibility that FindFit fell into the wrong minimum in the chi-squared, caution must be used in
rejecting the model (i.e., a BreitWigner plus background) because of a high ChiSquared per
DegreesOfFreedom. Nonetheless, the residuals seem to be saying clearly that the data does not match a
BreitWigner.
We return to modeling the Cobalt-60 data to a Gaussian plus background, and concentrate on the problems we have
seen on the right-hand side. We can add a quadratic term to the background.
FindFit[mydata, a[0] + a[1] chan + a[2] chan^2 + Gaussian[chan, ampl, x0, sigma], chan,
  {{a[0], 234}, {a[1], -0.11}, {a[2], 0.02}, {ampl, 140}, {x0, 1830}, {sigma, 45}}]

[Plot: data with error bars and fitted curve, followed by a separate plot of the residuals.]

{a[0] -> {272.1666687, 4.7*10^-6}, a[1] -> {-0.3632, 0.004400000000000001},
 a[2] -> {0.00009, 2.3*10^-6}, ampl -> {215.2443773, 6.800000000000001*10^-6},
 x0 -> {1212.69729812, 6.1*10^-7}, sigma -> {704.3369377000001, 5.600000000000001*10^-7},
 ChiSquared -> 9042.73001502903, DegreesOfFreedom -> 280}


This is a ridiculous result because FindFit fell into the wrong minimum in the ChiSquared. We try again with the
quadratic term set initially to zero, storing the answer in result.
result =
 FindFit[mydata, a[0] + a[1] chan + a[2] chan^2 + Gaussian[chan, ampl, x0, sigma], chan,
  {{a[0], 234}, {a[1], -0.11}, {a[2], 0.0}, {ampl, 140}, {x0, 1830}, {sigma, 45}}]

[Plot: data with error bars and fitted curve, followed by a separate plot of the residuals.]

{a[0] -> {233.8485, 0.001}, a[1] -> {-0.1236, 0.0058},
 a[2] -> {6.*10^-6, 3.*10^-6}, ampl -> {139.87, 0.05300000000000001},
 x0 -> {1833.34, 0.57}, sigma -> {44.02000000000001, 0.6200000000000001},
 ChiSquared -> 374.8140617169628, DegreesOfFreedom -> 280}

Although not ridiculous, this fit is not nearly as good as the one without the quadratic background term. In fact, the
chi-square probability is pretty small.
ChiSquareProbability[ChiSquared, DegreesOfFreedom] /. result
0.0128891929874729

These sorts of difficulties are common when the number of fit parameters begins to get large. They are often particularly
acute when we are trying to fit to one or more peaks with a significant background under them. The "noise" in the data
further compounds the difficulty.
One frequently useful technique is to "sneak up" on the values. For example, we will fix all values of the fit at the values
of our best fit so far and allow the quadratic term to vary.

FindFit[mydata, 194 - 0.0919 chan + a[2] chan^2 + Gaussian[chan, 155, 1833, 42.9],
  chan, {{a[2], 0.0}}]

[Plot: data with error bars and fitted curve, followed by a separate plot of the residuals.]

{a[2] -> {7.000000000000001*10^-8, 1.1*10^-7}, ChiSquared -> 299.4365730959574,
 DegreesOfFreedom -> 285}

Then we can try the full fit again, starting a[2] with the value from this fit, and storing the answer in result.
result =
 FindFit[mydata, a[0] + a[1] chan + a[2] chan^2 + Gaussian[chan, ampl, x0, sigma], chan,
  {{a[0], 194}, {a[1], -0.092}, {a[2], 7/10^8}, {ampl, 154}, {x0, 1833}, {sigma, 42.9}}]

[Plot: data with error bars and fitted curve, followed by a separate plot of the residuals.]

{a[0] -> {194.00251, 0.00081}, a[1] -> {-0.0918, 0.005700000000000001},
 a[2] -> {0, 2.9*10^-6}, ampl -> {154.009, 0.045},
 x0 -> {1832.98, 0.51}, sigma -> {42.99000000000001, 0.5400000000000001},
 ChiSquared -> 299.6225580604674, DegreesOfFreedom -> 280}


The point to all these fits is to demonstrate just how sensitive the results can be to apparently small changes in the initial
values. Here we have a reasonable ChiSquared per DegreesOfFreedom and the quadratic term is now zero within
errors. The fact that we have not managed to account for the small upturn in the data on the right may be because it is
not an artifact of the background, but because there is another peak beyond the range of our data that is bringing up
these values. The fact that each of these fits takes a minute or more, depending on the hardware, means that nonlinear
fitting is often somewhat laborious and time consuming.
Finally, we can compare the total number of counts in the data, 24057, to the total predicted from the fit.
N[NIntegrate[Gaussian[x, 154, 1833, 43] + 194 - 0.0918 x, {x, 1700, 1985}]]
23663.73555975553

Whether or not this number is close to the experimental number cannot be answered until we do some error analysis.
We can find the contributions to this number from the background and the Gaussian, including errors, using some tools
discussed in Chapter 3.
We ignore the quadratic term in the background. Using the Datum construct, we can find the counts and error in the
counts due to the background.
bkgd = Integrate[a + b x, {x, 1700, 1985}] /.
   {a -> Datum[a[0]], b -> Datum[a[1]]} /. result
Datum[{7100., 3000.}]
We use the fact that the Gaussian is essentially zero at both the left and right of the data, so we can use the full
normalization of the Gaussian to calculate the counts under the peak.
peak = N[Sqrt[2 Pi] Datum[ampl] Datum[sigma] /. result]
Datum[{16600., 220.}]
Thus, the total number of counts predicted by the fit is calculated.
(bkgd + peak) /. Datum -> Identity
{23700., 3000.}

This compares well with the experimental value of 24057.
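The last step can be checked by hand. The Python sketch below (not EDA code, and not necessarily how the Datum construct works internally) adds the two contributions and combines their errors in quadrature, which assumes the background and peak errors are independent. The input numbers are the rounded values reported above.

```python
import math

def add_in_quadrature(*errors):
    """Combine independent errors: square root of the sum of the squares."""
    return math.sqrt(sum(e * e for e in errors))

# (value, error) pairs as reported in the text.
bkgd = (7100.0, 3000.0)
peak = (16600.0, 220.0)

total = bkgd[0] + peak[0]
total_err = add_in_quadrature(bkgd[1], peak[1])
print(total, total_err)

# The measured total, 24057, lies well within one error of the prediction.
measured = 24057
print(abs(measured - total) < total_err)
```

The combined error is dominated by the background term, which is why the total carries essentially the same error, about 3000 counts, as the background alone.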


5.2.2 Fitting to Two Peaks with No Background

A common line shape in the study of infrared absorption and emission is called a Galatry profile. The EDA`FindFit`
package includes a function for this shape.
Information["Galatry", LongForm -> False]
Galatry[y,z,tau] is a Galatry profile with
collision broadening parameter y, collision
narrowing parameter z, and dimensionless time
tau. The profile is normalized so that its
value is one when tau = 0.

We will use this function to generate some made-up data of three Galatry peaks.
SeedRandom[1234];
mydata = Table[{t, 2 Galatry[0.95, 1.0, t - 5] + 1.3 Galatry[1.3, 1.0, t - 12] +
     1.2 Galatry[0.90, 1.0, t - 25] + Random[Real, {-.07, .07}]},
   {t, 0, 35, .2}];
EDAListPlot[mydata];

[Plot: mydata versus t.]

We fit to this data.


FindFit[mydata,
  c1 Galatry[y1, z1, t - t1] + c2 Galatry[y2, z2, t - t2] + c3 Galatry[y3, z3, t - t3],
  t,
  {{c1, 1.995}, {c2, 1.395}, {c3, 1.19}, {y1, 0.94}, {y2, 1.3}, {y3, 0.9},
   {z1, 1}, {z2, 1}, {z3, 1}, {t1, 4.9}, {t2, 12.1}, {t3, 25.1}}]

[Plot: data and fitted curve, with an inset plot of the residuals.]

{c1 -> {2.1, 1.7}, c2 -> {1.4, 2.4}, c3 -> {1.1, 1.8},
 y1 -> {0.93, 0.69}, y2 -> {1.2, 1.1}, y3 -> {0.9, 1.3},
 z1 -> {1.05, 0.91}, z2 -> {1.1, 1.1}, z3 -> {1., 1.6},
 t1 -> {5., 1.}, t2 -> {11.9, 1.3}, t3 -> {25., 1.8},
 SumOfSquares -> 0.2553336118528619, DegreesOfFreedom -> 164}

However, you should be aware that this fit pushed FindFit pretty hard. Not only is the fit very sensitive to the initial
values, but on a fairly fast UNIX workstation each fit took over five minutes of CPU time to perform. This is why
specialized software to solve these sorts of problems has been written. See Ouyang and Varghese, listed in the
references, Section 5.1.4, for an example of a technique for fitting this particular type of spectrum.

5.3 Options, Utilities, and Details


There are many options to FindFit that control both how it does the fit and what it returns. These are discussed in
Section 5.3.1.
The package also includes programs that are used by FindFit, but may also be used directly. These are the topic of
Section 5.3.2, which also discusses some convenience functions of various peak shapes.

5.3.1 Options to FindFit

The options and default values used directly by FindFit are given by using Options.
TableForm[Options[FindFit]]

AbsoluteChiSquaredTolerance -> 0.1
MaximumIterations -> 30
RelativeChiSquaredTolerance -> 0.005
ReturnCovariance -> False
ReturnEffectiveVariance -> False
ReturnErrors -> True
ReturnFunction -> False
ReturnResiduals -> False
Reweight -> False
ShowFit -> True
ShowProgress -> False
UseSignificantFigures -> True
ValueTolerance -> 0.002
These options will be discussed in order.
In addition, if ShowFit is True (the default), FindFit uses the function ShowFitResult. This function is
discussed in Section 5.3.2.1. Options to ShowFitResult given to FindFit are passed to that function.
If FindFit is called with ReturnFunction set to True, or if the ShowFit option is True (the default), the
function ToFitFunction is called. This function is discussed in Section 5.3.2.2. Options to ToFitFunction given
to FindFit are passed to that function.

 5.3.1.1 The AbsoluteChiSquaredTolerance Option


FindFit uses three separate tests to determine whether a fit has converged to a final value. The
AbsoluteChiSquaredTolerance option is used by one of those tests.
If the chi-squared (or the sum of the squares if there are no explicit errors in the data) is decreasing and, in the current
iteration, its value is less than the value in the previous iteration by an amount smaller than
AbsoluteChiSquaredTolerance, the fit is judged to have converged.
If there are declared errors in the data being fit, so that the test is comparing chi-squared statistics, the default value of
0.1 for this option is usually reasonable. If there are no declared errors, so the test is comparing the sum of the squares of
the residuals, then the actual value of the sum of the squares depends on the magnitude of the values of the dependent
variable. In this case some adjustment of the value of AbsoluteChiSquaredTolerance may be appropriate.
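The test just described can be written as a one-line predicate. The sketch below is an illustration only and is not code from the EDA package; the name absoluteTestSketch is hypothetical. It is applied to the two sums of squares that the ShowProgress trace in Section 5.3.1.11 reports for the third iteration of the GanglionData fit.

```mathematica
(* Hypothetical helper, not part of EDA: True when the statistic has
   decreased, but by less than the tolerance tol. *)
absoluteTestSketch[previous_, current_, tol_] :=
  current < previous && previous - current < tol

(* Sums of squares reported for iteration 3 in Section 5.3.1.11,
   tested against the default tolerance of 0.1. *)
absoluteTestSketch[25.5589, 25.3546, 0.1]
(* False: the decrease, about 0.2043, is still larger than 0.1 *)
```

This agrees with the trace, in which the fit goes on to a fourth iteration before this test is met.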


For information on the other two tests used by FindFit to determine convergence, see the discussion of the
RelativeChiSquaredTolerance and ValueTolerance options in 5.3.1.3 and in 5.3.1.13, respectively.

 5.3.1.2 The MaximumIterations Option


As discussed above, FindFit uses an iterative technique to try to determine the minimum in the chi-squared or sum of
the squares. The MaximumIterations option controls the number of iterations FindFit will attempt before giving
up.
When FindFit gives up, it issues a warning message, but also presents the result of the fit just as if it had converged.
For example, here we repeat a fit we performed in Section 5.1.3, but we restrict the number of iterations to less than that
required by FindFit to achieve convergence.
FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, MaximumIterations -> 2]

FindFit::maxiters:
FindFit failed to converge after 2
iterations. The maximum number of
iterations can be increased with the
MaximumIterations option.

[Graphics: GanglionData with the fitted line, and an inset plot titled Residuals]

{a[0] -> {-0.16, 0.5}, a[1] -> {0.1098, 0.006700000000000001},
 SumOfSquares -> 25.55891062641812, DegreesOfFreedom -> 12}

 5.3.1.3 The RelativeChiSquaredTolerance Option


FindFit uses three separate tests to determine whether a fit has converged to a final value. The
RelativeChiSquaredTolerance option is used by one of those tests.
If the chi-squared (or the sum of the squares if there are no explicit errors in the data) is decreasing, and the decrease
from the previous iteration to the current one, divided by the previous value, is less than the value of
RelativeChiSquaredTolerance, then the fit is judged to have converged.
If there are declared errors, so that the test is comparing chi-squared statistics, the default value of 0.005 for this option
is usually reasonable. If there are no declared errors, so the test is comparing the sum of the squares of the residuals, then
the actual value of the sum of the squares depends on the magnitude of the values of the dependent variable; in this case
some adjustment of the value of RelativeChiSquaredTolerance may be appropriate.
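As a rough illustration of the relative test, the sketch below divides the decrease by the previous value. It is not code from the EDA package, and the name relativeTestSketch is hypothetical. The numbers are the sums of squares reported for the third iteration in the ShowProgress trace of Section 5.3.1.11.

```mathematica
(* Hypothetical helper, not part of EDA: True when the statistic has
   decreased, but by a fraction smaller than tol. *)
relativeTestSketch[previous_, current_, tol_] :=
  current < previous && (previous - current)/previous < tol

(* Iteration 3 of the trace in Section 5.3.1.11, default tolerance 0.005. *)
relativeTestSketch[25.5589, 25.3546, 0.005]
(* False: the relative decrease is about 0.008 *)
```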
For information on the other two tests used by FindFit to determine convergence, see the discussion of the
AbsoluteChiSquaredTolerance option and of the ValueTolerance option in 5.3.1.1 and in 5.3.1.13,
respectively.

 5.3.1.4 The ReturnCovariance Option


If the ReturnCovariance option is set to True, then FindFit returns the full covariance matrix of the fit in
addition to other rules about the result.


A discussion of the meaning of the covariance matrix is in Section 4.4.18.

 5.3.1.5 The ReturnEffectiveVariance Option


When both coordinates have errors, FindFit uses an "effective variance" technique. This method is discussed in
Section 4.1.3.
If the ReturnEffectiveVariance option to FindFit is set to True, then the program returns the values of the
effective variance in addition to other rules about the result.
This option is identical to the option of the same name used by LinearFit.
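For orientation only, the sketch below shows the form in which an effective variance is commonly written: the variance in the dependent variable plus the variance in the independent variable scaled by the squared slope of the model. This is an assumption about the general technique, not a transcription of the EDA code; Section 4.1.3 remains the reference for what FindFit actually does. The name effectiveVarianceSketch and the numerical values are hypothetical.

```mathematica
(* Hypothetical helper, not part of EDA. Assumed form:
   sigmaY^2 + (d model/dx at x0)^2 sigmaX^2. *)
effectiveVarianceSketch[model_, x_, x0_, sigmaX_, sigmaY_] :=
  sigmaY^2 + ((D[model, x] /. x -> x0) sigmaX)^2

(* For a straight line with slope a1, the error in x contributes
   (a1 sigmaX)^2 at every point. The numbers are made up. *)
effectiveVarianceSketch[a0 + a1 t, t, 2., 0.1, 0.3]
```

For a nonlinear model the slope depends on the current parameter values, so under this assumption the effective variance would have to be re-evaluated as the fit iterates.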

 5.3.1.6 The ReturnErrors Option


By default, FindFit calculates and returns the errors in the fit parameters. It also uses these errors to adjust the
significant figures in the values of the parameters. If ReturnErrors is set to False, no errors are returned and no
significant figure adjustment is performed.
For example, here we repeat a fit performed before in this chapter, once with ReturnErrors left at its default of True
and once with it set to False. We also set ShowFit to False to suppress the graphs of the fit.
FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, ShowFit -> False]

{a[0] -> {0.03, 0.5}, a[1] -> {0.1068, 0.006700000000000001},
 SumOfSquares -> 25.35464811594685, DegreesOfFreedom -> 12}

FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, ShowFit -> False,
  ReturnErrors -> False]

{a[0] -> 0.03098373593165319, a[1] -> 0.1067875703568358,
 SumOfSquares -> 25.35464811594685, DegreesOfFreedom -> 12}

This option is identical to the option of the same name used by LinearFit.

 5.3.1.7 The ReturnFunction Option


By default, FindFit returns a set of rules for the result of the fit. If ReturnFunction is set to True, FindFit
instead returns a function of the independent variable.
For example, here we repeat a fit performed a few times already in this section, but with ReturnFunction set to
True. We also set ShowFit to False to suppress the graphs of the fit.
FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, ShowFit -> False,
  ReturnFunction -> True]

0.03 + 0.1068 x

This option is identical to the option of the same name used by LinearFit.

 5.3.1.8 The ReturnResiduals Option


If set to True, the ReturnResiduals option causes FindFit to return the residuals of the fit along with the other
results of the fit.
This option is identical to the option of the same name used by LinearFit.


 5.3.1.9 The Reweight Option


When the data has no explicit errors, FindFit finds the minimum in the sum of the squares of the residuals. However,
if the scatter in the data points can be considered to be random and statistical then it is often reasonable to assume that
the effective error in the dependent variable is given by PseudoErrorY.
PseudoErrorY =
Sqrt[ SumOfSquares / DegreesOfFreedom ]
The Reweight option is identical to the option of the same name used by LinearFit except that by default it is
True for LinearFit, but False for FindFit. This is because in a nonlinear fit, reweighting the data can change
the values of the parameters to which we are fitting. This is not true for a linear fit, where reweighting only affects the
calculated errors in those parameters.
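As a quick check of the PseudoErrorY formula, the lines below apply it to the SumOfSquares and DegreesOfFreedom returned for the straight-line fit to GanglionData earlier in this section. The variable names are chosen here for illustration.

```mathematica
(* Values returned by the straight-line fit to GanglionData. *)
sumOfSquares = 25.35464811594685;
degreesOfFreedom = 12;

(* PseudoErrorY as defined above. *)
pseudoErrorY = Sqrt[sumOfSquares/degreesOfFreedom]
(* about 1.4536 *)
```

This can be compared with the PseudoErrorY of 1.453577429308659 that FindFit reports in Section 5.3.2.1 when Reweight is set to True. In that example the error in a[0] grows from 0.5 to 0.72, which is roughly this factor.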

 5.3.1.10 The ShowFit Option


Setting ShowFit to False suppresses the display of the graphical information about the fit. As discussed in Section
5.3.2.1, the graphs can be created later from the result returned by FindFit.
This option is identical to the option of the same name used by LinearFit.

 5.3.1.11 The ShowProgress Option


Setting ShowProgress to True causes FindFit to print information about its progress in performing the fit. This
option is identical to the one of the same name used by LinearFit, although for FindFit the information is much
more verbose and more often of use in finding the "best" fit of the data to a model.
Here we repeat a fit done a few times already in this section, setting ShowProgress to True and setting ShowFit
to False to suppress the graphs of the fit.


FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, ShowFit -> False,
  ShowProgress -> True]

14 data points with 2 variables.
Fitting to 2 parameters
Beginning iterations.
Iteration 1 Sum of squares = 62556.7
Previous sum of squares: 1.79769 10^308
Current answer: {a[0] -> 1, a[1] -> 1}
Evaluating matrices for the current answer.
Scale for step size (lambda) = 0.001
Calling SingularValues to find corrections.
Corrections to parameters:
{-1.15781, -0.890167}
Iteration 2 Sum of squares = 25.5589
Previous sum of squares: 62556.7
Current answer: {a[0] -> -0.15781, a[1] -> 0.109833}
Evaluating matrices for the current answer.
Scale for step size (lambda) = 0.0001
Calling SingularValues to find corrections.
Corrections to parameters:
{0.188662, -0.00304403}
Iteration 3 Sum of squares = 25.3546
Previous sum of squares: 25.5589
Current answer: {a[0] -> 0.0308523, a[1] -> 0.106789}
Evaluating matrices for the current answer.
Scale for step size (lambda) = 0.00001
Calling SingularValues to find corrections.
Corrections to parameters:
{0.000131454, -1.80534 10^-6}
Iteration 4 Sum of squares = 25.3546
Previous sum of squares: 25.3546
Current answer: {a[0] -> 0.0309837, a[1] -> 0.106788}
Convergence: the previous sum-of-squares minus
the current sum-of-squares is less than: 0.1
Convergence: the relative decrease in the
sum-of-squares is less than: 0.005
Calculating covariance matrix using SingularValues.

{a[0] -> {0.03, 0.5}, a[1] -> {0.1068, 0.006700000000000001},
 SumOfSquares -> 25.35464811594685, DegreesOfFreedom -> 12}

Note that the final iteration satisfied two of the three possible criteria for convergence. This is common when FindFit
has found a true minimum in the sum of the squares.

 5.3.1.12 The UseSignificantFigures Option


As discussed in Chapter 3, in the physical sciences, a specification of an error associated with a quantity essentially
defines what the significant figures of that quantity are. By default, FindFit uses this definition of significant figures
in returning the value of fit parameters. The UseSignificantFigures option allows this default behavior to be
turned off. For example, here we repeat a fit we have already done a few times.
FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, ShowFit -> False]

{a[0] -> {0.03, 0.5}, a[1] -> {0.1068, 0.006700000000000001},
 SumOfSquares -> 25.35464811594685, DegreesOfFreedom -> 12}


FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, ShowFit -> False,
  UseSignificantFigures -> False]

{a[0] -> {0.03098373593165319, 0.4980841602975855},
 a[1] -> {0.1067875703568358, 0.006739570956378},
 SumOfSquares -> 25.35464811594685, DegreesOfFreedom -> 12}

Now the estimated errors in the fitted parameters have not been used to adjust the number of significant figures
displayed for either the values or the errors in those parameters.
This option is identical to the option of the same name used by LinearFit.

 5.3.1.13 The ValueTolerance Option


FindFit uses three separate tests to determine if a fit has converged to a final value. The ValueTolerance option
is used by one of those tests.
If the chi-squared or sum of the squares is decreasing, and the largest change in any fit parameter, taken as an absolute
value and divided by the value of that parameter (as stated in the summary in Section 5.4), is less than ValueTolerance,
the fit is judged to have converged. The default value is 0.002.
For information on the other two tests used by FindFit to determine convergence, see the discussion of the
AbsoluteChiSquaredTolerance and RelativeChiSquaredTolerance options in Section 5.3.1.1 and in
Section 5.3.1.3, respectively.
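The sketch below illustrates this third test. The summary of ValueTolerance in Section 5.4 describes the change in each parameter as divided by the value of that parameter, and the sketch assumes that reading; whether EDA divides by the old or the new value is not stated, and the new value is used here. The name valueTestSketch is hypothetical, and the check that the sum of the squares is decreasing is left out for brevity. The numbers are the magnitudes of the last corrections, and the resulting parameter values, from the ShowProgress trace in Section 5.3.1.11.

```mathematica
(* Hypothetical helper, not part of EDA: True when the largest
   fractional change among the parameters is smaller than tol. *)
valueTestSketch[changes_List, values_List, tol_] :=
  Max[Abs[changes/values]] < tol

(* Last corrections and resulting values of a[0] and a[1] in the
   trace, tested against the default tolerance of 0.002. *)
valueTestSketch[{0.000131454, 1.80534*^-6}, {0.0309837, 0.106788}, 0.002]
(* False: a[0] changed by about 0.4 percent of its value *)
```

Under this assumption the result is consistent with the remark in Section 5.3.1.11 that the final iteration satisfied only two of the three criteria.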

5.3.2 Other Routines in the FindFit Package


 5.3.2.1 The ShowFitResult Routine

By default FindFit uses the program ShowFitResult to display the results of a fit. If we have the results of
FindFit, we can use ShowFitResult to display them. We will demonstrate with the GanglionData used above.
result =
 FindFit[GanglionData, a[0] + a[1] x + a[2] x^2, x, {a[0], a[1], a[2]}, ShowFit -> False]

{a[0] -> {2.91, 0.82}, a[1] -> {-0.013, 0.028}, a[2] -> {0.00084, 0.00019},
 SumOfSquares -> 5.520657416491141, DegreesOfFreedom -> 11}
Now we display the results of the fit.
ShowFitResult[GanglionData, a[0] + a[1] x + a[2] x^2, x, {a[0], a[1], a[2]}, result]

[Graphics: GanglionData with the fitted quadratic, and an inset plot titled Residuals with values between -1 and 1]

-Graphics-
Note that ShowFitResult returns a Graphics object. This can be convenient if you wish to print the graphic, since
the Graphics object is not returned by FindFit itself.


Note that the syntax and functionality of ShowFitResult are very similar to those of ShowLinearFit in the
EDA`LinearFit` package. ShowFitResult is somewhat more general.
By default, ShowFitResult tries to find a quadrant in the data-fit graph in which to place the residual plot. If no
such quadrant can be found, the residual plot is displayed separately. Setting ResidualPlacement to Separate
causes the residual plot to always be displayed separately. ResidualPlacement can also be set to an integer
between 1 and 4, which causes the residual plot to be placed in that quadrant of the data-fit plot.
ShowFitResult[GanglionData, a[0] + a[1] x + a[2] x^2, x, {a[0], a[1], a[2]},
  result, ResidualPlacement -> 4]

[Graphics: GanglionData with the fitted quadratic, with the plot titled Residuals placed in the fourth quadrant]

-Graphics-
If ResidualPlacement is set to None, no residual plot is displayed.
Internally, ShowFitResult uses EDAListPlot, Plot, and ToFitFunction. Options given to
ShowFitResult for these are passed to the appropriate program. In addition, ShowFitResult itself uses only the
two options ResidualPlacement and UseSignificantFigures.
There can be some minor differences between the graphs displayed by FindFit and those displayed by
ShowFitResult. For example, if the data set has no explicit errors and the Reweight option is set to True in the
call to FindFit, then the PseudoErrorY is used in the calculation of the errors in the residuals by FindFit.
ShowFitResult is unaware of this, and will then show slightly different errors in the residual graph. If
ReturnResiduals is set to True in the call to FindFit, the "better" numbers will be used by ShowFitResult.
We give an example by fitting GanglionData with the Reweight option.
result =
 FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, Reweight -> True, ShowFit -> True]

[Graphics: GanglionData with the fitted line, and an inset plot titled Residuals]

{a[0] -> {0.03, 0.72}, a[1] -> {0.107, 0.01}, PseudoErrorY -> 1.453577429308659,
 SumOfSquares -> 25.35464811594684, DegreesOfFreedom -> 12}

Next we use ShowFitResult on result.


ShowFitResult[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, result];

[Graphics: GanglionData with the fitted line, and an inset plot titled Residuals]

ShowFitResult does not show the errors in the residuals due to the PseudoErrorY term.
We can have FindFit explicitly return the residuals.
result = FindFit[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, Reweight -> True,
   ShowFit -> False, ReturnResiduals -> True]

{a[0] -> {0.03, 0.72}, a[1] -> {0.107, 0.01}, Residuals ->
  {{11.1, 1.3, 1.5}, {13.6, 1.4, 1.5}, {22.5, 0.5, 1.5}, {31.4, 0.4, 1.5},
   {32.7, -0.1, 1.5}, {34., -0.2, 1.5}, {53.8, -1.2, 1.5}, {63., -2.1, 1.5},
   {67.00000000000001, -0.2, 1.5}, {81., -1.8, 1.5}, {101., 0.4, 1.5},
   {107., -0.5, 1.5}, {114., -1.1, 1.5}, {141., 3.2, 1.5}},
 PseudoErrorY -> 1.453577429308659, SumOfSquares -> 25.35464811594684,
 DegreesOfFreedom -> 12}

We see that the PseudoErrorY has led to an error in each of the residuals. ShowFitResult will display these in
this case.
ShowFitResult[GanglionData, a[0] + a[1] x, x, {a[0], a[1]}, result];

[Graphics: GanglionData with the fitted line, and an inset plot titled Residuals]

Similarly, if the data set has errors in both coordinates, FindFit uses the effective variance in calculating the errors in
the residuals. In this case also, specifying ReturnResiduals as True in the call to FindFit will cause
ShowFitResult to be slightly more correct.

 5.3.2.2 The ToFitFunction Routine


ShowFitResult, discussed previously, uses the function ToFitFunction to graph the results of the fit. FindFit
also directly calls ToFitFunction if the option ReturnFunction is set to True.
The function may also be called directly. We repeat a fit we have done before.


result = FindFit[ThermocoupleData, a[0] + a[1] x, x, {a[0], a[1]}, ShowFit -> False]

LoadData::name: Loading: ThermocoupleData

{a[0] -> {-0.981, 0.021}, a[1] -> {0.04122, 0.00036},
 ChiSquared -> 21.05345454545474, DegreesOfFreedom -> 19}


We pass the result to ToFitFunction.


ToFitFunction[result, a[0] + a[1] x, x, {a[0], a[1]}]

-0.981 + 0.04122 x
ToFitFunction is similar to the ToLinearFunction function supplied in the EDA`LinearFit` package,
but somewhat more general. One difference is that when the fit parameters have errors, ToFitFunction by default
does not return two functions. In contrast, ToLinearFunction will return two functions, the first
being the result of the fit and the second the estimated errors in the function. ToFitFunction can return this second
"error function" if the UseFitErrors option is set to True.
ToFitFunction[result, a[0] + a[1] x, x, {a[0], a[1]}, UseFitErrors -> True]

{-0.981 + 0.04122 x, Sqrt[0.0004410000000000001 + 1.296 10^-7 x^2]}
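The two numbers 0.000441 and 1.296 10^-7 in this output are the squares of the errors in a[0] and a[1] returned by the fit. This suggests that, for this model, the error function adds the contributions of the two parameters in quadrature; that is an inference from the numbers shown, not a statement about how ToFitFunction is implemented. The sketch below reproduces the two coefficients. The names sigma0, sigma1, and errorSketch are hypothetical.

```mathematica
(* Errors in a[0] and a[1] from the ThermocoupleData fit above. *)
sigma0 = 0.021;
sigma1 = 0.00036;

(* Assumed form: the two parameter errors combined in quadrature,
   with no covariance term. Not the EDA implementation. *)
errorSketch[x_] := Sqrt[sigma0^2 + (sigma1 x)^2]

(* The squares of the two errors. *)
{sigma0^2, sigma1^2}
(* about {0.000441, 1.296*10^-7} *)
```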

5.3.3 Peak Shape Routines

The EDA`FindFit` package includes some convenience functions to define peak shapes. They are BreitWigner,
Galatry, Gaussian, Lorentzian, PearsonVII, RelativisticBreitWigner, and Voigt.

5.4 Summary of the FindFit Package


Information["FindFit", LongForm -> False]

FindFit[data, model, ind, parameters] finds a fit
of data to model that minimises the sum of
the squares of the residuals or the chi-squared.
The model is expected to be a function of the
independent variable ind and the variables
named in parameters. The parameters can be of
the form {name1, name2, ... , nameM}, but will
more usually be of the form { {name1,start1},
{name2, start2}, ... , {nameM, startM}}, where
starti are initial values of the parameter.
In addition, each parameter can be specified as
{name, min, start, max}, where name is the
name, start is the starting value, and min
and max specify the minimum and maximum values
of the parameter that can be returned. Finally,
the parameter can be specified as {name, min,
max} in which case the starting value is (min +
max)/2.
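The three parameter forms that carry a starting value can be summarised by the sketch below, which returns the start implied by each form. The function startSketch is hypothetical and is not part of EDA, and the bounds used in the example are made up. The bare-name form is left out because the usage message does not state its starting value, although the ShowProgress trace in Section 5.3.1.11 begins from a[0] -> 1 and a[1] -> 1.

```mathematica
(* Hypothetical helper, not part of EDA: the starting value implied by
   each documented parameter form. *)
startSketch[{name_, start_}] := start
startSketch[{name_, min_, start_, max_}] := start
startSketch[{name_, min_, max_}] := (min + max)/2

(* One parameter in each form; the bounds are illustrative only. *)
startSketch /@ {{c1, 1.995}, {t1, 4., 4.9, 6.}, {y1, 0.5, 1.5}}
(* {1.995, 4.9, 1.} *)
```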

Information["AbsoluteChiSquaredTolerance", LongForm -> False]

AbsoluteChiSquaredTolerance is used in the default
convergence test of FindFit. When the
chi-squared of the current iteration is less
than the chi-squared of the previous iteration
and their difference is less than
AbsoluteChiSquaredTolerance, the fit is judged
to have converged and no more iterations are
performed.

Information["MaximumIterations", LongForm -> False]

MaximumIterations is an option to various routines
that use iterative techniques, and specifies the
maximum number of iterations to perform before
quitting.


Information["RelativeChiSquaredTolerance", LongForm -> False]

RelativeChiSquaredTolerance is used in the default
convergence test of FindFit. When the
chi-squared of the current iteration is less
than the chi-squared of the previous iteration
and their relative difference is less than
RelativeChiSquaredTolerance, the fit is judged
to have converged and no more iterations are
performed.

Information["ReturnCovariance", LongForm -> False]

ReturnCovariance is an option to various fitting
routines. If True, then the routine returns the
full covariance matrix; otherwise it does not.

Information["ReturnEffectiveVariance", LongForm -> False]

ReturnEffectiveVariance is an option to various
fitting routines. If True the routine returns
the effective variance as part of the result of
the fit.

Information["ReturnErrors", LongForm -> False]

ReturnErrors is an option to various fitting
routines. If True, then the routine returns
errors in the fitted parameters; otherwise it
does not.

Information["ReturnFunction", LongForm -> False]

ReturnFunction is an option to various fitting
routines. If set to False, then the fit will be
returned as a set of Rules involving the
parameter given in the call to the routine. If
set to True, then the fit will be returned as a
function and the independent variable is taken
to be parameter.

Information["ReturnResiduals", LongForm -> False]

ReturnResiduals is an option to various fitting
routines. If True, the Residuals of the fit are
returned along with other information about the
fit. The Residuals returned always include a
value for the independent variable; if none is
in the data the values are {1,2, ... , N}.

Information["Reweight", LongForm -> False]

Reweight is an option to LinearFit and FindFit,
which controls whether or not to reweight the
data if it contains no explicit errors. The
default is True for LinearFit and False for
FindFit. If set to True, the data is weighted
using a "statistical assumption", where the
error in the dependent variables is the square
root of the sum of the squares divided by the
number of degrees of freedom; see Taylor, "An
Introduction to Error Analysis," Eqn 8.14 on pg.
158 for further information. When set to True,
all subsequent processing assumes that the
generated errors in the dependent variable are
real.

Information["ShowFit", LongForm -> False]

ShowFit is an option to various fitting routines.
If True, then a graphical display of the results
of the fit is displayed.

Information["ShowProgress", LongForm -> False]

ShowProgress is an option to various routines.
When True, the routine will Print messages
showing its progress in performing its tasks.
If set to False no such information is printed.


Information["UseSignificantFigures", LongForm -> False]

UseSignificantFigures is an option to various
routines. When True, the routine will use
AdjustSignificantFigures on numbers with
associated errors so that the error determines
the number of significant figures in the number
itself. If set to False no such adjustment is
performed.

Information["ValueTolerance", LongForm -> False]

ValueTolerance is used in the default convergence
test of FindFit. When the chi-squared of the
current iteration is less than the chi-squared
of the previous iteration and the maximum change
in the values of the parameters being fit
divided by the value of that parameter is less
than ValueTolerance, the fit is judged to have
converged and no more iterations are performed.

Information["ShowFitResult", LongForm -> False]

ShowFitResult[data, model, ind, parameters,result]
takes result from FindFit and displays graphic
information about the fit. The meanings of the
first four arguments are identical to arguments
of the same name for FindFit. Note that if
parameters contains {param,value} pairs, the
values are ignored by ShowFitResult.
ShowFitResult[data, model, ind, parameters,
result, residuals] is the same as the first form
except the residuals of the fit are given
instead of calculated.

Information["ResidualPlacement", LongForm -> False]

ResidualPlacement is an option to ShowLinearFit and
ShowFitResult. If set to Automatic, they try to
place the plot of residuals as a small graph
inside the graph of the data and results of the
fit; if a quadrant cannot be found then the
residual plot is displayed separately. If the
option is set to Separate, then the residual
plot is always displayed separately. If the
option is set to an integer between 1 and 4,
that quadrant is used to display the residuals.
If the option is set to None, no residuals are
displayed.

Information["ToFitFunction", LongForm -> False]

ToFitFunction[result,model,ind,parameters] takes
result which is assumed to be a set of Rules
such as is returned by default by FindFit and
returns a function of the independent variable
ind for model. The model is assumed to be a
function of ind and the parameters. The
parameters may be a list of Symbols, or
{Symbol, value} pairs, although the values are
in all cases ignored. By default a single
function is returned, which is evaluated at the
result of the fit. If UseFitErrors is set to
True, the routine returns a list of two
functions; the first is as before and the second
is the estimated error in the function due to
the errors in the fit parameters if any.

Information["UseFitErrors", LongForm -> False]

UseFitErrors is an option to ToLinearFunction and
ToFitFunction. If set to True, the default for
ToLinearFunction, and the result contains errors
in the fitted parameters then two functions are
returned. The first is the function for the
values of the parameters, the second is the
function for the values of the errors in the
parameters. A heuristic method is used to
choose the sign of each term in the function of
the errors in the parameters. If UseFitErrors
is set to False, the default for ToFitFunction,
only a single function evaluated at the values
of the parameters is used.


Information["BreitWigner", LongForm -> False]

BreitWigner[x,a,x0,gamma] is a nonrelativistic
Breit-Wigner function of the independent
variable x, center value x0, and full width at
half-maximum gamma. The function is normalized
so that its integral from -Infinity to +Infinity
is equal to a. The maximum amplitude of the
peak is 2*a/(Pi*gamma). BreitWigner is a synonym
for Lorentzian.

Information["Galatry", LongForm -> False]

Galatry[y,z,tau] is a Galatry profile with
collision broadening parameter y, collision
narrowing parameter z, and dimensionless time
tau. The profile is normalized so that its
value is one when tau = 0.

Information["Lorentzian", LongForm -> False]

Lorentzian[x,a,x0,gamma] is a nonrelativistic
Lorentzian function of the independent variable
x, center value x0, and full width at
half-maximum gamma. The function is normalized
so that its integral from -Infinity to +Infinity
is equal to a. The maximum amplitude of the
peak is 2*a/(Pi*gamma). Lorentzian is a synonym
for BreitWigner.
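The properties listed in this usage message (area a, full width at half maximum gamma, peak height 2*a/(Pi*gamma)) are those of the standard Lorentzian written below. The sketch is built from those stated properties and is not the source of the EDA function; the name lorentzianSketch is hypothetical.

```mathematica
(* Hypothetical stand-in built from the stated normalization; not the
   EDA source. *)
lorentzianSketch[x_, a_, x0_, gamma_] :=
  (a/Pi) (gamma/2)/((x - x0)^2 + (gamma/2)^2)

(* Height at the center: this should reduce to 2 a/(Pi gamma). *)
lorentzianSketch[x0, a, x0, gamma]

(* Height half a width away from the center: half of the peak height. *)
lorentzianSketch[x0 + gamma/2, a, x0, gamma]

(* Area under the curve for a positive width: this should give a. *)
Integrate[lorentzianSketch[x, a, 0, gamma], {x, -Infinity, Infinity},
  Assumptions -> gamma > 0]
```

Checks of this kind can help when choosing starting values for a peak fit, since the area, the center, and the width can each be estimated from a plot of the data.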

Information["PearsonVII", LongForm -> False]

PearsonVII[x,a,x0,gamma,m] is a Pearson VII
function of the independent variable x,
absorbance a, center value x0, full width at
half maximum gamma, and tailing factor m.

Information["RelativisticBreitWigner", LongForm -> False]

RelativisticBreitWigner[m_, mr_, gamma_, m1_, m2_]
generates a relativistic Breit-Wigner for a
resonance of nominal mass mr and total width
gamma decaying into two particles of rest mass
m1 and m2. It returns the fraction of such
decays whose invariant mass is m. The function
is normalized so that when m equals mr the
function returns 1.

Information["Voigt", LongForm -> False]

Voigt[y,tau] is a Voigt profile with collision
broadening parameter y and dimensionless time
tau. The profile is the Limit[ Galatry[
y,z,tau], z -> 0].
