VadeMecum for Data Analysis Beginners
(October 3, 2007)

© Copyright Virginio de Oliveira Sannibale, June 2001
Acknowledgments

I started this work with the aim of improving the course of Physics Laboratory for Caltech freshman students, the so-called ph3 course. Thanks to Donald Skelton, ph3 was already a very good course, well designed to satisfy the needs of new students eager to learn the basics of laboratory techniques and data analysis.
Because of the need to introduce new experiments, and new topics in the data analysis notes, I decided to rewrite the didactic material, trying to keep intact the spirit of the course, i.e., the emphasis on techniques and not on the details of the theory.
Anyway, I believe and hope that this attempt to reorganize old experiments and introduce new ones constitutes an improvement of the course.
I would like to thank, in particular, Eugene W. Cowan for the invaluable help he gave me with critiques, suggestions, discussions, and corrections to the notes. His experience as a professor at Caltech for several years was really valuable in making the content of these notes suitable for students in the first year of the undergraduate course.
I would also like to thank all the teaching assistants who make this course work, for their patience and the valuable comments that I constantly received during the academic terms.

Sincerely,
Virginio de Oliveira Sannibale
Contents

1 Physical Observables
  1.1 Random Variables and Measurements
  1.2 Uncertainties on Measurements
    1.2.1 Accuracy and Precision
  1.3 Measurement and Probability Distribution
    1.3.1 Gaussianity
    1.3.2 Gaussian Distribution Parameter Estimation for a Single Variable
    1.3.3 Gaussian Distribution Parameter Estimation for the Average Variable
    1.3.4 Gaussian Distribution Parameter Estimation for the Weighted Average
    1.3.5 Example (Unweighted Average)
    1.3.6 Example (Weighted Average)

2 Propagation of Errors
  2.1 Propagation of Errors Law
  2.2 Statistical Propagation of Errors Law (SPEL)
    2.2.1 Example 1: Area of a Surface
    2.2.2 Example 2: Power Dissipated by a Circuit
    2.2.3 Example 3: Improper Use of the Formula
  2.3 Relative Uncertainties
    2.3.1 Example 1
  2.4 Measurement Comparison
    2.4.1 Example

3 Graphical Representation of Data
  3.1 Introduction
  3.2 Graphical Fit
    3.2.1 Linear Graphic Fit
    3.2.2 Theoretical Points Imposition
  3.3 Linear Plot and Linearization
    3.3.1 Example 1: Square Function
    3.3.2 Example 2: Power Function
    3.3.3 Example 3: Exponential Function
  3.4 Logarithmic Scales
    3.4.1 Linearization with Logarithmic Graph Sheets
  3.5 Difference Plots
    3.5.1 Difference Plot of Logarithmic Scales

4 Probability Distributions
  4.1 Definitions
    4.1.1 Probability and Probability Density Function (PDF)
    4.1.2 Distribution Function (DF)
    4.1.3 Probability and Frequency
    4.1.4 Continuous Random Variable vs. Discrete Random Variable
    4.1.5 Expectation Value
    4.1.6 Intuitive Meaning of the Expectation Value
    4.1.7 Variance
    4.1.8 Intuitive Meaning of the Variance
    4.1.9 Standard Deviation
  4.2 Uniform Distribution
    4.2.1 Random Variable Uniformly Distributed
      4.2.1.1 Example: Ruler Measurements
      4.2.1.2 Example: Analog to Digital Conversion
  4.3 Gaussian Distribution (NPDF)
    4.3.1 Standard Probability Density Function
    4.3.2 Probability Calculation with the Error Function
  4.4 Exponential Distribution
    4.4.1 Random Variable Exponentially Distributed
  4.5 Binomial/Bernoulli Distribution
  4.6 Poisson Distribution
    4.6.1 Example: Silver Activation Experiment

5 Parameter Estimation
  5.1 The Maximum Likelihood Principle (MLP)
    5.1.1 Example: μ and σ of a Normally Distributed Random Variable
    5.1.2 Example: μ of a Set of Normally Distributed Random Variables
  5.2 The Least Square Principle (LSP)
    5.2.1 Geometrical Meaning of the LSP
    5.2.2 Example: Linear Function
    5.2.3 The Reduced χ² (Fit Goodness)
  5.3 The LSP with the Effective Variance
  5.4 Fit Example (Thermistor)
    5.4.1 Linear Fit
    5.4.2 Quadratic Fit
    5.4.3 Cubic Fit
  5.5 Fit Example (Offset Constant)

A Central Limit Theorem
B Statistical Propagation of Errors
C NPDF Random Variable Uncertainties
D The Effective Variance
Bibliography
Chapter 1

Physical Observables

1.1 Random Variables and Measurements
Figure 1.1: Example of a physical system under measurement, i.e., a variant of the Joule experiment (1845): a paddle wheel immersed in a liquid inside a Dewar flask. The isolated liquid is heated up by the paddle-wheel movement driven by an electric motor. A mercury thermometer measures the temperature changes.
The heat flow due to the paddle-wheel movement is not completely constant and is affected by small unpredictable fluctuations. For example, measuring the current flowing through the electric motor, we see small random fluctuations around a constant value.
Some of those perturbations are probably completely negligible (the instrument is unable to see them), some others can be estimated and minimized, and some others cannot².
(Figure 1.2: block diagram of a measurement. An excitation x plus a disturbance ε drives the ideal physical system; the system response is read by an instrument, which is itself affected by ambient disturbances.)

² If we want, for example, the average liquid temperature, we need a more sophisticated apparatus that allows us to map and average the liquid temperature very accurately. Anyway, what we can do is only minimize perturbations, but never get rid of them.
If x*(t) is the value of the physical quantity with no disturbances at the time t, and ε(t) its random fluctuations, the measured value of x at the time t will be

\[ x(t) = x^*(t) + \epsilon(t) . \]

Any physical quantity x is indeed a random variable, or stochastic variable.
1.2 Uncertainties on Measurements

We can distinguish two types of uncertainties based on their nature, i.e., uncertainties that can in principle be eliminated, and uncertainties that cannot be eliminated.
Starting from this criterion, we can divide the sources of uncertainties, also called errors, into two categories:

- Random errors: any errors which are not, or do not appear to be, directly connected to any cause (the cause-and-effect principle does not work), and are indeed not repeatable but random. Random errors cannot be completely eliminated.

- Systematic errors: any errors in the measurement which are not random. Quite often, this kind of error algebraically adds a constant unknown value to the measurement. This value can also change/drift with time.

A typical systematic error comes from a wrong calibration of the instrument used. This kind of error is hard to minimize and quite often difficult to detect. Sometimes, it can be found by repeating the measurement with different procedures and/or instruments.
We can also have systematic errors due to the measurement procedure or definition. Let's consider as an example the measurement of the thickness of an elastic material. Lack in the procedure definition: measurement with a micrometer without defining the pressure applied by the instrument, the temperature, the humidity, etc. Lack in the procedure execution: drift of physical quantities supposed to be stationary, such as pressure, temperature, etc.
Systematic errors can in principle be completely removed.
Figure 1.3: Accuracy and precision comparison. The gray strips represent the uncertainty associated with the measurement x_i; x* represents the value of the measured physical quantity with no perturbations. Measurement x₁ is more accurate but less precise than measurement x₂.

Figure 1.4: Accuracy and precision comparison. The shooter of the target on the left is clearly more precise than the shooter of the right target. Anyway, the latter shooter is more accurate than the former. Which shooter would you like to have as your bodyguard?
1.2.1 Accuracy and Precision

Here we explain two definitions that allow us to compare different measurements and establish their most important relative qualities:

- Accuracy: a measurement is said to be accurate if it is not affected by systematic errors. This characteristic does not preclude the presence of small or large random errors.

- Precision: a measurement is said to be precise if it is not affected by random errors. This characteristic does not preclude the presence of any type of systematic error.
1.3 Measurement and Probability Distribution

Experience shows that, in many cases, the probability density function describing a measured physical quantity is the Gaussian (or normal) probability density function³

\[ p(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} , \]

which is a continuous function (see figure 1.5) with one peak, symmetric around a vertical axis crossing the value μ, and with tails exponentially decreasing to 0. It has therefore one absolute maximum at x = μ. The peak width is defined by the parameter σ.

³ In general, statistics is not able to provide a necessary and sufficient test to check if a random variable follows the Gaussian distribution (gaussianity) [4].
(Figure 1.5: the NPDF of a normally distributed physical quantity (A.U.); the dashed area represents the probability of finding x in the interval (μ − σ, μ + σ).)

The probability of finding a value of x in the interval [a, b] is

\[ P\{a \le x \le b\} = \int_a^b \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx . \]

The most probable value lies indeed inside an interval centered around μ, and the probability of having a value in the interval (μ − σ, μ + σ) is 68.3%; it is represented by the dashed area of figure 1.5. The statistical result of a measurement with a probability/confidence level of 68.3% is written as

x = (x₀ ± σ) units,   [68.3% confidence]
1.3.1 Gaussianity

As has been said before, when we perform measurements, physical quantities arise that behave as random variables following, to a good approximation, the NPDF.
This experimental evidence is theoretically corroborated by the so-called central limit theorem (see appendix A). Under a reasonably limited number of hypotheses, this theorem states that the average x̄ of any random variable x is a normally distributed random variable when the number of averaged values tends to infinity. Often, the measurement of a physical quantity is the result of an intentional or unintentional average of several measurements, and it therefore tends to follow the Gaussian distribution.
Deviations from a Gaussian distribution (gaussianity) are quite often time dependent. In other words, a physical quantity behaves as a Gaussian random variable for a given period of time. This happens mainly because it is always difficult to keep the experimental conditions constant and controlled during the time needed to perform all the measurements. Sudden or slow uncontrolled changes of the system can easily modify the parameters or the PDF of the physical quantity we are measuring.
Anyway, it is important to stress that the gaussianity of a random variable, or more generally the type of PDF, should always be investigated.
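The central limit theorem is easy to check numerically. The following Python sketch (not part of the laboratory procedure; the uniform parent distribution and the sample sizes are arbitrary choices of ours) averages uniformly distributed random numbers and compares the spread of the averages with the CLT prediction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Parent distribution: uniform on [0, 1), which is far from Gaussian.
# For it, E[x] = 1/2 and V[x] = 1/12, so the CLT predicts that the
# average of n samples has standard deviation 1/sqrt(12 n).
for n in (1, 2, 10, 100):
    averages = rng.uniform(0.0, 1.0, size=(100_000, n)).mean(axis=1)
    predicted = 1.0 / np.sqrt(12.0 * n)
    print(f"n = {n:3d}: mean = {averages.mean():.4f}, "
          f"std = {averages.std():.4f}, CLT predicts {predicted:.4f}")
```

A histogram of the averages for n = 10 or n = 100 already looks convincingly Gaussian, which is one quick (if not rigorous) way to investigate gaussianity.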
1.3.2 Gaussian Distribution Parameter Estimation for a Single Variable

Given N measurements x₁, x₂, ..., x_N of a normally distributed quantity x, the estimators of the parameters μ and σ are

\[ \hat\mu = \bar x , \qquad \hat\sigma \simeq s , \]

where x̄ is the average of the measurements and

\[ s^2 \simeq \frac{(x_1 - \bar x)^2 + (x_2 - \bar x)^2 + \dots + (x_N - \bar x)^2}{N} . \]

Because we are averaging the squared distances between the theoretical and experimental data points, it is reasonable to assume that the square root of this value is an estimator of the uncertainty of each single measurement x_i⁴. A rigorous approach shows that s is an estimator of the σ of the distribution, and a more rigorous approach shows that an even better estimator is the sum of the squared distances divided by N − 1, i.e.

\[ s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar x)^2 . \]

⁴ It is possible and probable also to have a single measurement much closer to μ than the average, but this does not help to find an estimate of σ.
1.3.3 Gaussian Distribution Parameter Estimation for the Average Variable

If we consider the average x̄ of N measurements as the random variable, its distribution is still a Gaussian, with estimators

\[ \hat\mu_{\bar x} \simeq \bar x , \qquad \sigma_{\bar x}^2 = \frac{\sigma^2}{N} . \]

1.3.4 Gaussian Distribution Parameter Estimation for the Weighted Average

If each measurement x_i has its own uncertainty σ_i, the best estimator of μ is the weighted average

\[ \bar x = \frac{1}{\sum_{i=1}^{N} w_i} \sum_{i=1}^{N} w_i x_i , \]

where

\[ w_i = \frac{1}{\sigma_i^2} , \qquad i = 1, 2, \dots, N , \]

and the variance of the weighted average is

\[ \sigma_{\bar x}^2 = \frac{1}{\sum_{i=1}^{N} 1/\sigma_i^2} . \]
1.3.5 Example (Unweighted Average)

A voltage V is measured 10 times, giving the following values:

i          1      2      3      4      5      6      7      8      9      10
V_i (mV)   123.5  125.3  124.1  123.9  123.7  124.2  123.2  123.7  124.0  123.2

The average is V̄ = 123.88 mV, and the uncertainty of each single measurement is

\[ s = \sqrt{ \frac{1}{10-1} \sum_{i=1}^{10} (V_i - \bar V)^2 } = 0.6070\ \text{mV} . \]
1.3.6 Example (Weighted Average)

The reflectivity R of a dielectric mirror, measured with 5 sets of measurements⁵, gives the following table:

i    R_i      s_i
1    0.4932   0.0021
2    0.4947   0.0025
3    0.4901   0.0032
4    0.4921   0.0018
5    0.4915   0.0027

The weighted average is

\[ \bar R = \frac{\sum_{i=1}^{5} R_i / s_i^2}{\sum_{i=1}^{5} 1/s_i^2} = 0.492517 , \]

and its uncertainty is

\[ s_{\bar R} = \frac{1}{\sqrt{\sum_{i=1}^{5} 1/s_i^2}} \simeq 0.0010 , \]

so that R = (0.4925 ± 0.0010).

⁵ The cumbersome notation of each single measurement is used here to avoid any kind of ambiguity.
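The computation above is a direct application of the formulas of section 1.3.4; a minimal Python sketch (the variable names are ours) reproduces the quoted result:

```python
import numpy as np

# Reflectivity measurements R_i and their uncertainties s_i (section 1.3.6).
R = np.array([0.4932, 0.4947, 0.4901, 0.4921, 0.4915])
s = np.array([0.0021, 0.0025, 0.0032, 0.0018, 0.0027])

w = 1.0 / s**2                      # weights w_i = 1/s_i^2
R_mean = np.sum(w * R) / np.sum(w)  # weighted average
s_mean = 1.0 / np.sqrt(np.sum(w))   # uncertainty of the weighted average

print(f"R = {R_mean:.6f} +- {s_mean:.6f}")  # R = 0.492517 +- 0.001037
```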
Chapter 2

Propagation of Errors

2.1 Propagation of Errors Law

Figure 2.1: Variation of f(x) at two points, x₀ and x₁. The derivative accounts for the difference in magnitude of the variation of the function f(x) at different points.

We want to find a method to approximate the uncertainty of a physical quantity f which has been indirectly determined, i.e., f is a function of a set of random variables that are physical quantities.
To focus the problem, it is better to consider first the case of f as a function of a single random variable.
2.2 Statistical Propagation of Errors Law (SPEL)

For a function f(x, y) of two random variables x and y, with expectation values x₀ and y₀, the SPEL gives

\[ \sigma_f^2 \simeq \left(\frac{\partial f}{\partial x}\right)^{\!2} \sigma_{x_0}^2 + \left(\frac{\partial f}{\partial y}\right)^{\!2} \sigma_{y_0}^2 + 2\,\frac{\partial f}{\partial x}\,\frac{\partial f}{\partial y}\; E[(x - x_0)(y - y_0)] , \]

where the derivatives are evaluated at (x₀, y₀). If x and y are statistically independent, the last term vanishes and

\[ \sigma_f^2 \simeq \left(\frac{\partial f}{\partial x}\right)^{\!2} \sigma_{x_0}^2 + \left(\frac{\partial f}{\partial y}\right)^{\!2} \sigma_{y_0}^2 . \]

The most general expression for the SPEL and its derivation can be found in appendix B.
2.2.1 Example 1: Area of a Surface

Let's suppose that the area A of a rectangular surface having side lengths a and b is indirectly measured by measuring the sides. We will have

\[ A = ab , \qquad \sigma_A^2 \simeq b^2 \sigma_a^2 + a^2 \sigma_b^2 . \]

If the surface is square, i.e., a = b, then

\[ \sigma_A^2 \simeq 2 a^2 \sigma_a^2 , \]

which implies that we are assuming that the square is perfect, and that it is sufficient to measure one side of the square.
2.2.2 Example 2: Power Dissipated by a Circuit

Let's suppose that the average power P dissipated by a circuit is indirectly measured through the voltage amplitude V, the current amplitude I, and their relative phase φ, i.e.

\[ P = V I \cos\varphi . \qquad (2.2) \]

Applying the SPEL to P, we get

\[ \sigma_P^2 \simeq (I \cos\varphi)^2 \sigma_V^2 + (V \cos\varphi)^2 \sigma_I^2 + (V I \sin\varphi)^2 \sigma_\varphi^2 . \qquad (2.3) \]

Considering the following experimental values with the respective estimates of their expectation values and uncertainties, we get

V = (77.78 ± 0.71) V,   I = (1.21 ± 0.071) A,   φ = (0.283 ± 0.017) rad,

P = (90.37 ± 5.39) W.
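The numerical propagation is easy to check; a sketch (assuming, as the quoted numbers suggest, P = V I cos φ and independent uncertainties):

```python
import math

V, sV = 77.78, 0.71        # voltage (V)
I, sI = 1.21, 0.071        # current (A)
phi, sphi = 0.283, 0.017   # phase (rad)

P = V * I * math.cos(phi)

# SPEL for independent variables: add the squared partial-derivative terms.
sP = math.sqrt((I * math.cos(phi) * sV) ** 2
               + (V * math.cos(phi) * sI) ** 2
               + (V * I * math.sin(phi) * sphi) ** 2)

print(f"P = {P:.2f} +- {sP:.2f} W")  # P = 90.37 +- 5.39 W
```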
2.2.3 Example 3: Improper Use of the Formula

Let x be a random variable with

E[x] = 0,   V[x] = σ²,

and consider a function f of x, f(x) = x².
Applying the SPEL to f(x), we obtain

\[ \sigma_f = [2x]_{x=0}\, \sigma_x = 0 , \]

which leads to a wrong result, because this approximation (see eq. B.1 in appendix B) is not legitimate. In fact, there is no need to expand the function up to the second-order term to understand that the second-order expansion (i.e., the function itself) is not negligible at all.
Considering the definition of variance instead, and with the aid of the integration-by-parts formula, we get the correct result

\[ \sigma_f^2 = V[f(x)] = V[x^2] = 2\sigma^4 . \]
2.3 Relative Uncertainties

The relative uncertainty of a physical quantity x is the ratio of its uncertainty to its value, which is often more convenient than the absolute uncertainty for comparing measurements and for estimating which variable dominates the uncertainty of an indirect measurement.

2.3.1 Example 1

Consider the measurement of the gravitational acceleration g with a pendulum of length l and period T₀,

\[ g = \left(\frac{2\pi}{T_0}\right)^{\!2} l . \]

In terms of maximum relative uncertainties, we get

\[ \frac{\Delta g}{g} = 2\,\frac{\Delta T_0}{T_0} + \frac{\Delta l}{l} , \]

which shows how each measured quantity contributes to the relative uncertainty of g; in the case considered, this requires

Δl < 0.001 m.
2.4 Measurement Comparison

Let's suppose that

x₁ ± σ_{x₁},   x₂ ± σ_{x₂},

are two independent measurements of the physical quantity x. The difference and the uncertainty of the difference will be, indeed,

\[ \Delta x = |x_2 - x_1| , \qquad \sigma_{\Delta x} = \sqrt{\sigma_{x_1}^2 + \sigma_{x_2}^2} . \]

We can assume as a test of confidence that the two measurements are statistically the same if

\[ \Delta x < 2 \sigma_{\Delta x} . \]
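A sketch of this test in code, with hypothetical numbers chosen only to illustrate the criterion:

```python
import math

def compatible(x1, s1, x2, s2):
    """True if x1 +- s1 and x2 +- s2 pass the dx < 2*sigma_dx test."""
    dx = abs(x2 - x1)
    sigma_dx = math.hypot(s1, s2)  # sqrt(s1**2 + s2**2)
    return dx < 2.0 * sigma_dx

# Hypothetical example: two length measurements in millimeters.
print(compatible(12.3, 0.2, 12.8, 0.3))  # True: dx = 0.5 < 2 * 0.36
```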
2.4.1 Example
Chapter 3

Graphical Representation of Data

3.1 Introduction

A good way to analyze a set of experimental data, and to review the results, is to plot them in a graph. It is important to provide all the information necessary to correctly and easily read the graph. The choice of the proper scale and the type of scale is also important.
For a reasonably good understanding of a graph, the following information should be included:

- a title,
- axis labels defining the plotted physical quantities,
- the physical units of the plotted physical quantities,
- a dot corresponding to each experimental point, and error bars or error rectangles,
- graphical analysis made on the graph, with the data points used clearly labeled,
- a legend if more than one data set is plotted.

Figure 3.1 shows an example of a graph containing all the information needed to properly read the plot.
Judgment of the goodness of a curve fit is quite often done by inspection of the graph, the data points and the fitted theoretical curve, or better, by analyzing the so-called difference plot.
(Figure 3.1: example plot with a fitted straight line y(x) = ax + b, with a = (996.3 ± 12.9) and b = (−0.103 ± 0.094) mA.)

3.2 Graphical Fit
3.2.1 Linear Graphic Fit

Let's suppose that the experimental points are expected to follow the linear law

\[ y = ax + b , \qquad (3.1) \]

where the two parameters, the slope a and the intercept b, must be graphically determined.
Let's assume that we are able to trace a straight line which fits the experimental points reasonably well. Considering then two points A = (x₁, y₁) and B = (x₂, y₂) belonging to the straight line, eq. (3.1), and some trivial algebra, we will have the two estimators of a and b:

\[ \hat a = \frac{y_2 - y_1}{x_2 - x_1} , \qquad \hat b = \frac{x_2 y_1 - x_1 y_2}{x_2 - x_1} , \qquad x_1 < x_2 . \]
Tracing the maximum and the minimum slope straight lines which intercept all the experimental points (see figure 3.2), we will then have two estimations of a and b, whose averages will give the estimated values of the slope and the intercept. Their semi-differences will give the maximum uncertainties associated with them:

\[ \hat a = \frac{\hat a_{max} + \hat a_{min}}{2} , \qquad \Delta \hat a = \frac{\hat a_{max} - \hat a_{min}}{2} , \qquad (3.2) \]

\[ \hat b = \frac{\hat b_{max} + \hat b_{min}}{2} , \qquad \Delta \hat b = \frac{\hat b_{max} - \hat b_{min}}{2} , \qquad (3.3) \]

and finally

a = (1.0 ± 0.1) kΩ,   b = (0 ± 1) V.

If we cannot have all the points intercepted by a straight line, and we really need to give some numbers for the slope and intercept, we could use this additional but very subjective rule of thumb:

The maximum and the minimum slope straight lines are those lines which make the straight line computed using eqs. (3.2) and (3.3) intercept at least 2/3 of the error rectangles.

This rule tries to take into account empirically the results of statistics when applied to a curve fit. It is indeed better to use a statistical fitting method, as explained in chapter 5.
3.2.2 Theoretical Points Imposition

Imposing theoretical points on the fit curve implies that we are assuming that the uncertainty of each experimental point is not dominated by any systematic error. For example, imposing the zero crossing (see figure 3.3) and using the points A = (12.0 mA, 12.1 V) and B = (13.1 mA, 12.0 V), we obtain

\[ \hat a_{max} = \frac{12.1}{12.0} = 1.008\ \text{k}\Omega , \qquad \hat a_{min} = \frac{12.0}{13.1} = 0.916\ \text{k}\Omega , \]

and finally

a = (0.96 ± 0.05) kΩ.

Statistically, this new measurement of a agrees with the previous one within their uncertainties. Which measurement is more accurate is difficult to say. A statistical analysis can reduce the uncertainty, giving a more precise measurement.
3.3 Linear Plot and Linearization

The graphical fitting method for straight lines can be extended to apply to non-linear functions through the so-called process of linearization. In other words, if we have a function which is not linear, we can apply functions to linearize it.
We can mathematically formulate the problem in the following terms. Let's suppose that

y = y(x; a, b)

is a non-linear function with two parameters a and b. If we can find transformations

Y = Y(x, y),   X = X(x, y),

that allow us to write the following relation

Y = X F(a, b) + G(a, b),

where F and G are known expressions that depend only on a and b, then we have linearized y. Once the F and G values are found with graphical methods, the parameters a and b can be determined.
Figure 3.2: Maximum and minimum slope straight lines intercepting all the experimental points; the points limiting the maximum and minimum slopes are labeled, among them B = (13.5 mA, 12.0 V). If we try to draw a straight line steeper than the maximum slope line, some of the points will not be intercepted; the same is true for the minimum slope line if we try to draw a line with a less steep slope.
Figure 3.3: Maximum and minimum slope straight lines with the zero crossing point imposed; the points limiting the slopes are A = (12.0 mA, 12.1 V) and B = (13.1 mA, 12.0 V). The comments on the previous graph also apply to this figure.
3.3.1 Example 1: Square Function

If we have

y = ax²,

defining Y(y) = y and X(x) = x², we obtain the linear function to plot, Y = aX.

3.3.2 Example 2: Power Function

If we have

y = bxᵃ,   (3.4)

applying the logarithm to both sides, we get log y = a log x + log b. Defining Y = log y and X = log x, we will have the linear function to plot

Y = aX + log b.

3.3.3 Example 3: Exponential Function

If we have

y = be^{ax},

applying the logarithm to both sides, we have

log y = ax + log b.

If we define the following functions, Y = log y and X = x, we will finally have the linear function to plot

Y = aX + log b.
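All three linearizations can be checked numerically; a sketch for the exponential case (synthetic, noise-free data with parameters a = 0.5 and b = 2 chosen by us), fitting Y = aX + log b with a first-degree polynomial:

```python
import numpy as np

# Synthetic data following y = b * exp(a*x).
a_true, b_true = 0.5, 2.0
x = np.linspace(0.0, 4.0, 20)
y = b_true * np.exp(a_true * x)

# Linearize: Y = log y, X = x, so that Y = a*X + log b.
a_fit, logb_fit = np.polyfit(x, np.log(y), 1)

print(f"a = {a_fit:.3f}, b = {np.exp(logb_fit):.3f}")  # a = 0.500, b = 2.000
```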
3.4 Logarithmic Scales

When the plotted values span a sufficiently small relative range, the logarithmic scale is approximated well by the linear scale.
Another advantage of using logarithmic scales is the linearization of functions, as briefly discussed in the next subsection.

3.4.1 Linearization with Logarithmic Graph Sheets

There are essentially two cases that can be linearized using logarithmic graph sheets:

- If the experimental points follow a power law (y = ax^b), we will obtain a straight line if we plot both y and x on logarithmic scales (log-log graph sheet).

- If the experimental points follow an exponential law (y = ab^x, cf. the linearization of example 3), we will obtain a straight line if we plot y on a logarithmic scale versus x on a linear scale (semi-logarithmic graph sheet).
3.5 Difference Plots

The ability to see how the data points scatter from the fitted theoretical curve in a graph is quite important for assessing the quality of the fit. Quite often, if we plot the experimental points and the fit curve together, it becomes difficult to appreciate and analyze the difference between the experimental points and the curve. In fact, if the measurement range of a physical quantity is much greater than the average distance between the experimental points and the curve, the data points and the curve become indistinguishable.
One way to avoid this problem is to produce the so-called difference plot, i.e., the plot of the difference between the theoretical and the measured points. In a difference plot, the goodness of the fit and/or the poorness of the theoretical model can be better analyzed.
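A sketch of how a difference plot can be produced (matplotlib assumed available; the data and the fitted model are placeholders of ours):

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data: a linear trend plus a small quadratic distortion.
x = np.linspace(0.0, 10.0, 30)
rng = np.random.default_rng(1)
y = 2.0 * x + 1.0 + 0.02 * x**2 + rng.normal(0.0, 0.1, x.size)

a, b = np.polyfit(x, y, 1)     # linear fit
diff = (a * x + b) - y         # theoretical minus measured points

fig, (top, bottom) = plt.subplots(2, 1, sharex=True)
top.plot(x, y, "o", label="data")
top.plot(x, a * x + b, "-", label="linear fit")
top.legend()
bottom.plot(x, diff, "o")      # the difference plot
bottom.axhline(0.0, linestyle="--")
bottom.set_xlabel("x")
plt.show()
```

The residual parabola, invisible in the top panel, stands out clearly in the difference plot.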
3.5.1 Difference Plot of Logarithmic Scales

If

\[ \left| \frac{y}{y_0} - 1 \right| \ll 1 , \]

we get

\[ \log\frac{y}{y_0} = \log\left(1 + \frac{y - y_0}{y_0}\right) \simeq \frac{y - y_0}{y_0} , \]

i.e., on a logarithmic scale, the difference plot directly shows the relative difference.
Chapter 4

Probability Distributions

This chapter reports the basic definitions describing probability density functions (PDFs) and some of the frequently used distributions with their main properties. To comprehend the next chapters, it is important to become familiar mainly with the first section, where some new concepts are introduced. The understanding of each distribution is required, since they will be used in the next chapters.

4.1 Definitions

4.1.1 Probability and Probability Density Function (PDF)

The probability P of finding the continuous random variable x in the interval [a, b] is given by

\[ P\{a \le x \le b\} = \int_a^b p(x)\, dx , \]

where the function p(x) is called the probability density function (PDF) of x.
4.1.2 Distribution Function (DF)

The distribution function of a random variable x is defined as

\[ F(x) = \int_{-\infty}^{x} p(x')\, dx' , \qquad (4.1) \]

with

\[ \lim_{x \to +\infty} F(x) = 1 , \qquad (4.2) \]

\[ \lim_{x \to -\infty} F(x) = 0 . \qquad (4.3) \]

The first limit, the so-called normalization condition, represents the probability that x assumes any one of its possible values. The second limit represents the probability that x does not assume any value.
4.1.3 Probability and Frequency

Dividing the range of the random variable x into n intervals and performing N measurements, let k_i be the number of measurements falling into the i-th interval. The frequencies

\[ f_i = \frac{k_i}{N} , \qquad i = 1, \dots, n , \]

estimate the probabilities of finding x in each interval; the estimate becomes exact in the limit N → ∞.
4.1.4 Continuous Random Variable vs. Discrete Random Variable

4.1.5 Expectation Value

The expectation value of a random variable x with PDF p(x) is defined as

\[ E[x] = \int_{-\infty}^{+\infty} x\, p(x)\, dx . \]

4.1.6 Intuitive Meaning of the Expectation Value

Partitioning the range of x into intervals¹ and choosing a value X_i inside each interval², the expectation value can be approximated by the discrete sum³

\[ E[X] = \sum_{i=1} X_i P(X_i) , \]

i.e., the average of the possible values X_i weighted by the probability of obtaining them.

¹ In general, the partition can be numerable. In other words, it can have an infinite number of intervals, but we can associate an integer number with each one of them.
² There is an arbitrariness in the choice of the values X_i, because we are assuming that any value in each given interval is equiprobable.
³ In general, when we change from the continuous to the discrete variable, we have that ∫ p(x) dx → P(x_i).
4.1.7 Variance

The expectation value of the square of the difference between the random variable x and its expectation value is called the variance of x, i.e.

\[ V[x] = E[(x - E[x])^2] . \]

A more explicit expression gives

\[ V[x] = \int_{-\infty}^{+\infty} (x - \mu)^2 p(x)\, dx , \qquad \mu = E[x] . \]
4.1.8 Intuitive Meaning of the Variance

Using the same discrete partition as before, we have

\[ V[X] = \sum_{i=1} (X_i - E[X])^2 P(X_i) , \]

which shows that the variance is just the sum of the squared distances of each experimental point from the expectation value, weighted by the probability of obtaining the measurement. Estimating the probability with the frequency, we will have that V[X] is just the average of the squared distances of the experimental points X_i from their average, i.e.

\[ V[X] \simeq \frac{1}{N} \sum_{i=1} (X_i - \bar X)^2 . \]
4.1.9 Standard Deviation

The square root of the variance is defined as the standard deviation of the random variable x, i.e.

\[ \sigma = \sqrt{ \int_{-\infty}^{+\infty} (x - \mu)^2 p(x)\, dx } . \]

4.2 Uniform Distribution

A random variable x is said to be uniformly distributed in the interval [a, b] if its PDF is

\[ p(x; a, b) = \begin{cases} \dfrac{1}{b-a} , & a \le x \le b \\ 0 , & \text{otherwise.} \end{cases} \]
Figure 4.1: Uniform probability density function p(x; a, b) and its cumulative function P(x; a, b) for different intervals [a, b].
The cumulative distribution function is

\[ P(x; a, b) = \begin{cases} 0 , & x < a \\ \dfrac{x-a}{b-a} , & a \le x \le b \\ 1 , & x > b . \end{cases} \]

The expectation value of x is

\[ E[x] = \frac{a+b}{2} , \]

and the variance is

\[ V[x] = \frac{1}{12} (b-a)^2 . \]

The calculation of E[x] and V[x] is left as an exercise.
4.2.1 Random Variable Uniformly Distributed

4.2.1.1 Example: Ruler Measurements

Let's suppose that, measuring N times a given physical quantity x, we always obtain the same result x₀. In this case, we cannot study how x is statistically distributed. With this limited knowledge, a reasonable hypothesis is that x is uniformly distributed in the interval [x₀ − Δx/2, x₀ + Δx/2], where Δx is the instrument resolution. Under this assumption, the best estimate of the uncertainty on x is indeed

\[ \sigma_x = \frac{b-a}{\sqrt{12}} = \frac{\Delta x}{2\sqrt{3}} . \]

This is a case where the statistical uncertainty cannot be evaluated from the measurements, and has to be estimated from the instrument resolution.
4.2.1.2 Example: Analog to Digital Conversion

The conversion of an analog signal to a number through an Analog to Digital Converter (ADC) is another example of the creation of a uniformly distributed random variable. In fact, the conversion rounds the analog value to a given integer number. The integer value depends on which interval the analog value lies in, and the interval length ΔV is the ADC resolution. It is then reasonable to assume that the converted value follows the uniform PDF.
If the ADC numerical representation is 12 bits long, and the input/dynamic range is from −10 V to 10 V, the interval length is

\[ \Delta V = \frac{10 - (-10)}{2^{12}}\ \text{V} \simeq 4.88\ \text{mV} . \]
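Combining this interval length with the uniform-distribution uncertainty of section 4.2.1, a quick numerical check:

```python
import math

full_scale = 10.0 - (-10.0)  # ADC input range (V)
n_bits = 12

dV = full_scale / 2**n_bits          # interval length, about 4.88 mV
sigma = dV / (2.0 * math.sqrt(3.0))  # uniform-PDF uncertainty, dV/sqrt(12)

print(f"dV = {dV * 1e3:.2f} mV, sigma = {sigma * 1e3:.2f} mV")
# dV = 4.88 mV, sigma = 1.41 mV
```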
4.3 Gaussian Distribution (NPDF)

A random variable x is said to be normally distributed if its PDF is the Gaussian (or normal) probability density function

\[ p(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} , \qquad x \in \mathbb{R} , \qquad (4.4) \]

whose expectation value and variance are

E[x] = μ,   V[x] = σ².

For |x| ≫ |μ|, the tails of the distribution decrease as

\[ p(x) \propto e^{-\frac{x^2}{2\sigma^2}} . \]
Figure 4.2: Gaussian probability density function p(x; μ, σ) and its cumulative function P(x; μ, σ) for different values of σ.

There is no simple analytical expression for the cumulative function

\[ P(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{x} e^{-\frac{(x'-\mu)^2}{2\sigma^2}}\, dx' , \qquad (4.5) \]

which has to be computed numerically or through tabulated special functions (see subsection 4.3.2).
4.3.1 Standard Probability Density Function

Defining the new variable

\[ t = \frac{x - \mu}{\sigma} , \]

we obtain the so-called standard probability density function

\[ p(t) = \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2} , \qquad t \in \mathbb{R} , \]

with

E[t] = 0,   V[t] = 1,

and cumulative function

\[ P(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-t'^2/2}\, dt' . \]

4.3.2 Probability Calculation with the Error Function

Defining the error function as

\[ \mathrm{erf}(t) = \frac{2}{\sqrt{\pi}} \int_0^t e^{-x^2}\, dx , \]

then

\[ P\{-t_1 \le t \le t_2\} = \frac{1}{2} \left[ \mathrm{erf}\!\left(\frac{t_1}{\sqrt{2}}\right) + \mathrm{erf}\!\left(\frac{t_2}{\sqrt{2}}\right) \right] , \qquad 0 \le t_1 \le t_2 . \]

Going back to the variable x, we have

\[ P\{a \le x \le b; \mu, \sigma\} = \frac{1}{2} \left[ \mathrm{erf}\!\left(\frac{\mu - a}{\sqrt{2}\,\sigma}\right) + \mathrm{erf}\!\left(\frac{b - \mu}{\sqrt{2}\,\sigma}\right) \right] , \qquad a \le \mu \le b . \]

For a symmetric interval about μ, we have

\[ P\{\mu - a\sigma \le x \le \mu + a\sigma; \mu, \sigma\} = \mathrm{erf}\!\left(\frac{a}{\sqrt{2}}\right) , \qquad a \ge 0 . \]
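These expressions are straightforward to evaluate with the error function of the Python standard library; a sketch, with a = 1, 2, 3 reproducing the familiar confidence levels:

```python
import math

# P{mu - a*sigma <= x <= mu + a*sigma} = erf(a / sqrt(2))
for a in (1, 2, 3):
    p = math.erf(a / math.sqrt(2.0))
    print(f"within {a} sigma: {100.0 * p:.2f} %")
# within 1 sigma: 68.27 %
# within 2 sigma: 95.45 %
# within 3 sigma: 99.73 %
```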
Figure 4.3: Exponential probability density function p(x; x₀) and its cumulative distribution function P(x; x₀) for different values of x₀.
4.4 Exponential Distribution

The function

\[ p(x; x_0) = \begin{cases} \dfrac{1}{x_0}\, e^{-x/x_0} , & x \ge 0 \\ 0 , & x < 0 , \end{cases} \qquad x_0 > 0 , \]

is the exponential probability density function, and x is therefore exponentially distributed in the interval [0, ∞) (see figure 4.3).
The cumulative distribution function is

\[ P(x; x_0) = \int_0^x \frac{1}{x_0}\, e^{-x'/x_0}\, dx' = 1 - e^{-x/x_0} , \]

the expectation value is

\[ E[x] = \int_0^{+\infty} x\, \frac{1}{x_0}\, e^{-x/x_0}\, dx = x_0 , \]

and the variance is

\[ V[x] = \int_0^{+\infty} (x - x_0)^2\, \frac{1}{x_0}\, e^{-x/x_0}\, dx = x_0^2 . \]
4.4.1 Random Variable Exponentially Distributed

A classical example of an exponentially distributed random variable is the decay time τ of an unstable particle, whose PDF is

\[ p(\tau) = \frac{1}{\tau_0}\, e^{-\tau/\tau_0} . \]

The quantity τ₀ is called the mean lifetime of the particle. In other words, the previous formula gives the probability for an unstable particle to decay in the time interval (τ, τ + dτ), measured in its rest frame.
Let's demonstrate the previous formula. If N(t) is the number of unstable particles at the time t, then the number ΔN of particles decayed after a time Δt will be given by

\[ \frac{\Delta N}{\Delta t} = -\lambda N , \qquad \lambda > 0 , \]

where λΔt is the probability of a single decay in a time Δt. The assumption here is that the decay of each single particle is an independent random process, and therefore the rate of particles that decay must be proportional to the number of particles. The minus sign is necessary because the particle number is decreasing (ΔN ≤ 0). Considering N very large (lots of decays per unit time, and therefore the variation of the particle number is almost continuous), we can approximate with a continuous decay rate

\[ \frac{\Delta N}{\Delta t} \simeq \frac{dN}{dt} \quad \Rightarrow \quad \frac{dN}{N} = -\lambda\, dt , \]

whose solution is

\[ N(t) = N_0 e^{-\lambda t} , \qquad N(t=0) = N_0 . \]

The PDF of the decay time is then

\[ -\frac{d}{dt} \frac{N(t)}{N_0} = -\frac{d}{dt} e^{-\lambda t} = \lambda e^{-\lambda t} , \qquad \tau_0 = \frac{1}{\lambda} . \]

Experiment 26 of the ph7 sophomore lab is an interesting study of unstable particle decay, which uses a Californium-252 (²⁵²Cf) neutron source to activate silver atoms.
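A quick numerical check of the mean-lifetime result (a sketch; the value τ₀ = 2 is an arbitrary choice of ours):

```python
import numpy as np

tau0 = 2.0  # mean lifetime, arbitrary units
rng = np.random.default_rng(2)

# Draw decay times from the exponential PDF (1/tau0) * exp(-t/tau0).
decay_times = rng.exponential(scale=tau0, size=1_000_000)

# E[t] = tau0 and V[t] = tau0**2, so the mean and the standard deviation
# of the samples should both approach tau0.
print(f"mean = {decay_times.mean():.3f}, std = {decay_times.std():.3f}")
```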
4.5 Binomial/Bernoulli Distribution

Consider an experiment with only two possible outcomes¹, success with probability p and failure with probability 1 − p. The probability of obtaining k successes in n independent trials is given by the binomial (or Bernoulli) distribution

\[ P(k; n, p) = \binom{n}{k} p^k (1-p)^{n-k} , \qquad \binom{n}{k} = \frac{n!}{(n-k)!\, k!} . \]

¹ Any kind of experiment which has more than one, numerable, or infinite results can be arbitrarily arranged into two sets of results, and indeed into two possible events.

(Figure: probability versus number of events.)
4.6 Poisson Distribution

A discrete random variable k follows the Poisson distribution if

\[ P(k; m) = \frac{m^k}{k!}\, e^{-m} , \]

where the parameter m is both the expectation value and the variance of k. Given n measurements k₁, ..., k_n, the estimator m̂ of the parameter m is their average, and the uncertainty of each single measurement is σ_k = √m̂. The uncertainty on the estimator m̂ is

\[ \sigma_{\hat m} = \frac{\sigma_k}{\sqrt{n}} = \sqrt{\frac{\hat m}{n}} . \]

The demonstration of the validity of these estimators is based on concepts explained in chapter 5.
4.6.1 Example: Silver Activation Experiment

A classical example of application of the Poisson distribution is the statistical analysis of atomic decay. In this case, the Poisson variable is the total number of decays measured during a given time Δt. The number N is the number of atoms that can potentially decay, which is normally quite difficult to make very small, making the approximation N → ∞ quite good.
Let's consider here some real data taken from the activation of silver with a radioactive source:

- Number of measurements: n = 20
- Measurement time: Δt = 8 s
- Table of measurements (with the average radioactive background decays already removed):

i              1   2   3   4   5   6   7   8   9   10
k_i (counts)  52  46  61  60  48  55  53  59  56  53

i             11  12  13  14  15  16  17  18  19  20
k_i (counts)  59  48  63  50  55  56  55  61  49  39

The estimate of the Poisson parameter m is

\[ \hat m = \frac{1}{n} \sum_{i=1}^{n} k_i = \frac{1078}{20} = 53.90\ \text{counts} . \]

The uncertainty of each single measurement is

\[ \sigma_k = \sqrt{\hat m} = 7.3417\ \text{counts} , \]

and the uncertainty on m̂ is

\[ \sigma_{\hat m} = \frac{\sigma_k}{\sqrt{n}} = 1.6416\ \text{counts} . \]

The mean number of decaying atoms during a period of Δt = 8 s is indeed

m̂ = (53.90 ± 1.64) counts.

Finally, neglecting the uncertainty on the measurement time Δt, the statistical measurement of the decay rate, obtained by dividing m̂ by Δt, is

R = (6.74 ± 0.21) decays/s.

It is important to notice that a single long measurement gives the same result. In fact, considering the overall measurement time nΔt = 160 s, we will have

m = 1078 counts,   σ_m = √1078 ≃ 32.8 counts,

and therefore again R = m/(nΔt) = (6.74 ± 0.21) decays/s.
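The arithmetic of this example fits in a short sketch (the counts are those of the table above):

```python
import math

counts = [52, 46, 61, 60, 48, 55, 53, 59, 56, 53,
          59, 48, 63, 50, 55, 56, 55, 61, 49, 39]
n = len(counts)
dt = 8.0  # measurement time (s)

m_hat = sum(counts) / n            # Poisson parameter estimate
sigma_k = math.sqrt(m_hat)         # single-measurement uncertainty
sigma_m = sigma_k / math.sqrt(n)   # uncertainty of the estimate

print(f"m = {m_hat:.2f} +- {sigma_m:.2f} counts")              # 53.90 +- 1.64
print(f"R = {m_hat / dt:.2f} +- {sigma_m / dt:.2f} decays/s")  # 6.74 +- 0.21
```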
Chapter 5

Parameter Estimation

When a function depends on a set of parameters, we are faced with the problem of estimating the values of those parameters. Starting from a finite set of measurements, we can make a statistical determination of the parameter set.
In the following sections we will examine two standard methods, the maximum-likelihood and least-squares methods, to estimate the parameters of a PDF and of a general function of one independent variable.
5.1 The Maximum Likelihood Principle (MLP)

Let x be a random variable and f its PDF, which depends on a set of unknown parameters θ = (θ₁, θ₂, ..., θₙ):

f = f(x; θ).

Given N independent samples of x, (x₁, x₂, ..., x_N), the quantity

\[ L(\vec x; \vec\theta) = \prod_{i=1}^{N} f(x_i; \vec\theta) \]

is called the likelihood function, and the maximum likelihood principle states that the best estimate of the parameters θ is the one that maximizes L.
The MLP reduces the problem of parameter estimation to that of maximizing the function L. Because, in general, it is not possible to find the parameters θ that maximize L analytically, numerical methods implemented on computers are often used.
5.1.1 Example: μ and σ of a Normally Distributed Random Variable

For N independent samples of a normally distributed random variable x, the likelihood function is

\[ L(\vec x, \mu, \sigma) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\frac{(x_i - \mu)^2}{2\sigma^2} \right] , \]

and we have to maximize it.
Considering that the exponential function is monotone, we just have to study the argument of the exponential. Imposing the following conditions,

\[ \frac{\partial}{\partial\mu} \left[ \sum_{i=1}^{N} \frac{(x_i - \mu)^2}{2\sigma^2} \right] = 0 , \qquad \frac{\partial}{\partial\sigma} \left[ \sum_{i=1}^{N} \frac{(x_i - \mu)^2}{2\sigma^2} \right] = 0 , \]

is sufficient to determine the absolute maximum of L.
Solving the first equation with respect to μ, we obtain the estimator of μ,

\[ \hat\mu = \frac{1}{N} \sum_{i=1}^{N} x_i , \]

i.e., the average of the measurements, and for the variance

\[ \hat\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat\mu)^2 . \]

This estimator of σ² is biased; an unbiased estimator is

\[ s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \hat\mu)^2 , \qquad E[s^2] = \sigma^2 . \]

The variance of the estimator μ̂ is

\[ \sigma_{\hat\mu}^2 = \frac{1}{N}\, \sigma^2 . \]
5.1.2 Example: μ of a Set of Normally Distributed Random Variables

If the N samples x_i are normally distributed with the same expectation value μ but different standard deviations σ_i, the likelihood function is

\[ L(\vec x, \mu) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left[ -\frac{(x_i - \mu)^2}{2\sigma_i^2} \right] , \]

and we have to maximize it.
Imposing the following condition,

\[ \frac{d}{d\mu} \left[ \sum_{i=1}^{N} \frac{(x_i - \mu)^2}{2\sigma_i^2} \right] = 0 , \]

we obtain

\[ \hat\mu = \sum_{i=1}^{N} \frac{1/\sigma_i^2}{\sum_{k=1}^{N} 1/\sigma_k^2}\; x_i , \]

which is the weighted average of the random variables. What is the variance associated with the weighted average?
To answer this question, let's consider the weighted average of a random variable,

\[ \bar x = \sum_{i=1}^{N} \frac{1/\sigma_i^2}{\sum_{k=1}^{N} 1/\sigma_k^2}\; x_i . \]

Using the pseudo-linearity property, its variance can be directly computed, i.e.

\[ V[\bar x] = \frac{1}{\sum_{i=1}^{N} 1/\sigma_i^2} . \]
5.2 The Least Square Principle (LSP)

Let y(x; θ) be a function of the independent variable x and of a set of parameters θ, and let (x_i, y_i ± σ_i), i = 1, ..., N, be a set of measured points. The least square principle states that the best estimate of the parameters θ is the one that minimizes the quantity

\[ \chi^2 = \sum_{i=1}^{N} \left[ \frac{y_i - y(x_i; \vec\theta)}{\sigma_i} \right]^2 . \]

It is important to notice that, in this formulation, the LSP requires knowledge of the σ_i (the uncertainties of the measurements y_i), and assumes that no uncertainties are associated with the x_i. In a more general and correct formulation, the uncertainties of the x_i should be taken into account (see appendix D). In general, uncertainties in the independent variable x can be neglected if

\[ \left| \frac{\partial y}{\partial x} \right|_{x_i} \sigma_{x_i} \ll \sigma_{y_k} , \qquad i = 1, 2, \dots, N , \quad k = 1, 2, \dots, N . \]
5.2.1 Geometrical Meaning of the LSP

Neglecting the uncertainties σ_i, χ² is just the sum of the squared distances between the curve points (x_i, y(x_i)) and the measured points y_i. The minimization of χ² corresponds to the search for the best curve, i.e., the curve which minimizes the distance between the points y_i and the curve points, varying the parameters θ_k.
The introduction of the uncertainties is necessary if we want to perform a statistical analysis instead of just solving a pure geometrical problem.
5.2.2 Example: Linear Function

Let's apply the LSP to the linear function y = ax + b, assuming for simplicity that all the measurements y_i have the same uncertainty σ_y. Minimizing χ² with respect to a and b, we obtain

\[ \hat a = \frac{\sum_{i=1}^{N} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{N} (x_i - \bar x)^2} , \qquad \hat b = \bar y - \hat a\, \bar x , \]

where

\[ \bar x = \frac{1}{N} \sum_{i=1}^{N} x_i , \qquad \bar y = \frac{1}{N} \sum_{i=1}^{N} y_i . \]

The functions â and b̂ are the best estimators of the parameters a and b given by the LSP.
Uncertainties on a and b can be estimated by applying the SPEL to the â and b̂ expressions. After some tedious algebra, we get

\[ \sigma_{\hat a} \simeq \sigma_y \sqrt{ \sum_{i=1}^{N} \left( \frac{\partial \hat a}{\partial y_i} \right)^{\!2} } , \qquad \sigma_{\hat b} \simeq \sigma_y \sqrt{ \sum_{i=1}^{N} \left( \frac{\partial \hat b}{\partial y_i} \right)^{\!2} } , \]

where the partial derivatives with respect to x₁, ..., x_N are neglected because we assumed that their uncertainties are negligible.
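A sketch of these closed-form estimators on synthetic data (equal uncertainties σ_y assumed, so the simple formulas apply; the true parameters a = 1.5, b = 0.7 are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(3)
sigma_y = 0.2
x = np.linspace(0.0, 10.0, 25)
y = 1.5 * x + 0.7 + rng.normal(0.0, sigma_y, x.size)

xbar, ybar = x.mean(), y.mean()
a_hat = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
b_hat = ybar - a_hat * xbar

# Propagating the (equal) y uncertainties through the estimators.
sa = sigma_y / np.sqrt(np.sum((x - xbar) ** 2))
sb = sa * np.sqrt(np.mean(x**2))

chi2_red = np.sum(((y - (a_hat * x + b_hat)) / sigma_y) ** 2) / (x.size - 2)
print(f"a = {a_hat:.3f} +- {sa:.3f}, b = {b_hat:.3f} +- {sb:.3f}")
print(f"reduced chi2 = {chi2_red:.2f}")  # expected to be close to 1
```

The reduced χ² computed on the last line anticipates the figure of merit introduced in the next subsection.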
5.2.3 The Reduced χ² (Fit Goodness)

A figure of merit of the goodness of a fit is the so-called reduced χ²,

\[ \chi^2 / (N - d) = \frac{1}{N - d} \sum_{i=1}^{N} \left[ \frac{y_i - y(x_i; \hat{\vec\theta})}{\sigma_i} \right]^2 , \]

where d is the number of estimated parameters² and N − d is the number of degrees of freedom of the fit. For a good fit, the reduced χ² is expected to be close to one.

² As already mentioned, the use of the hat symbol is just to distinguish the parameter estimators from the parameters themselves.
5.3 The LSP with the Effective Variance

To generalize the LSP to the case of appreciable uncertainties on the independent variable x, we can use the so-called effective variance, i.e.

\[ \sigma_i^2 = \left( \frac{\partial f}{\partial x} \right)^{\!2} \sigma_{x_i}^2 + \sigma_{y_i}^2 , \]

where σ_{x_i} and σ_{y_i} are the uncertainties associated with x_i and y_i, respectively. The derivative is calculated at x_i.
Substituting this new definition of σ_i into the previous definition of χ² will take into account the effect of the uncertainty on x.
The proof of this formula is given in appendix D.
Figure 5.1: Linear fit of the thermistor data (1/T versus log R) and corresponding difference plot. The difference plot clearly shows the bad approximation of the linear fit.
5.4 Fit Example (Thermistor)

The resistance R of a thermistor depends on the temperature T, to a good approximation, as

\[ R = R_0\, e^{\frac{E_g(T)}{2 k_b T}} , \]

where E_g is the energy band gap of the semiconductor and k_b the Boltzmann constant. Neglecting the temperature dependence of E_g, we can linearize the thermistor response as follows:

\[ \frac{1}{T} = \frac{2 k_b}{E_g} \log R - \frac{2 k_b}{E_g} \log R_0 , \qquad y = \frac{1}{T} , \quad x = \log R . \]

Following what has been done in a published article [6], the corrections to the linear fit can be introduced empirically using a polynomial expansion in log R, i.e.

\[ \frac{1}{T} = C_0 + C_1 \log R + C_2 \log^2 R + C_3 \log^3 R . \]

5.4.1 Linear Fit

The linear fit

\[ \frac{1}{T} = C_0 + C_1 \log R \]

is shown in figure 5.1; the difference plot makes it clear that the linear model is a bad approximation of the thermistor response.
Figure 5.2: Quadratic fit of the thermistor data and corresponding difference plot. The plot of the experimental points and the theoretical curve seems to show a good agreement, but the difference plot clearly shows that the quadratic fit is still a coarse approximation.

5.4.2 Quadratic Fit

The quadratic fit

\[ \frac{1}{T} = C_0 + C_1 \log R + C_2 \log^2 R \]

gives a reduced χ² still appreciably larger than 1, and the difference plot clearly shows a residual trend in the data.
x 10
1/T (K)
3.5
Difference Plot
x 10
1/T (K)
4
2
0
2
4
6
0
2
Resistance
3
(log(Ohm))
Figure 5.3: Cubic fit of the thermistor data. The plot of the experimental
points and the theoretical curve, and the difference plot dont show any
clear systematic difference between the experimental point an theoretical
curve.
1 and the difference plots clearly shows a residual trend in the data.
5.4.3
Cubic Fit
68
C0
C1
C2
C3
2 /( N 4) = 0.403
In this final case the reduced 2 value is smaller than one and the scatter of the data from the horizontal axis in the difference plot of figure 5.3
suggests that the assumed uncertainty on the temperature T = 0.02K is
probably a little too large. Apart from an increase of the data-points uncertainty, no special trend seems to be visible on the difference plot.
The following table contains the data points used for the thermistor
characteristic fits .
Point   Resist.   Temp.     σ_T    σ_R
(#)     R (Ω)     T (K)     (K)    (Ω)
1       0.76      383.15    0.02   0
2       0.86      378.15    0.02   0
3       0.97      373.15    0.02   0
4       1.11      368.15    0.02   0
5       1.45      358.15    0.02   0
6       1.67      353.15    0.02   0
7       1.92      348.15    0.02   0
8       2.23      343.15    0.02   0
9       2.59      338.15    0.02   0
10      3.02      333.15    0.02   0
11      3.54      328.15    0.02   0
12      4.16      323.15    0.02   0
13      4.91      318.15    0.02   0
14      5.83      313.15    0.02   0
15      6.94      308.15    0.02   0
16      8.31      303.15    0.02   0
17      10.00     298.15    0.02   0
18      12.09     293.15    0.02   0
19      14.68     288.15    0.02   0
20      17.96     283.15    0.02   0
21      22.05     278.15    0.02   0
22      27.28     273.15    0.02   0
23      33.89     268.15    0.02   0
24      42.45     263.15    0.02   0
25      53.39     258.15    0.02   0
26      67.74     253.15    0.02   0
27      86.39     248.15    0.02   0
28      111.3     243.15    0.02   0
29      144.0     238.15    0.02   0
30      188.4     233.15    0.02   0
31      247.5     228.15    0.02   0
32      329.2     223.15    0.02   0
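A sketch of how the fits of this section can be reproduced from the table (numpy assumed; we take log to be the natural logarithm — a different base only rescales the coefficients — and the y uncertainties follow from σ_{1/T} = σ_T/T²):

```python
import numpy as np

# Thermistor data from the table above (ohms and kelvin).
R = np.array([0.76, 0.86, 0.97, 1.11, 1.45, 1.67, 1.92, 2.23,
              2.59, 3.02, 3.54, 4.16, 4.91, 5.83, 6.94, 8.31,
              10.00, 12.09, 14.68, 17.96, 22.05, 27.28, 33.89, 42.45,
              53.39, 67.74, 86.39, 111.3, 144.0, 188.4, 247.5, 329.2])
T = np.array([383.15, 378.15, 373.15, 368.15, 358.15, 353.15, 348.15, 343.15,
              338.15, 333.15, 328.15, 323.15, 318.15, 313.15, 308.15, 303.15,
              298.15, 293.15, 288.15, 283.15, 278.15, 273.15, 268.15, 263.15,
              258.15, 253.15, 248.15, 243.15, 238.15, 233.15, 228.15, 223.15])
sigma_T = 0.02  # kelvin, the same for every point

x = np.log(R)             # independent variable, log R
y = 1.0 / T               # dependent variable, 1/T
sigma_y = sigma_T / T**2  # SPEL applied to y = 1/T

for degree in (1, 2, 3):  # linear, quadratic and cubic fits
    coeffs = np.polyfit(x, y, degree, w=1.0 / sigma_y)
    chi2 = np.sum(((y - np.polyval(coeffs, x)) / sigma_y) ** 2)
    print(f"degree {degree}: reduced chi2 = {chi2 / (len(x) - degree - 1):.3f}")
```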
5.5 Fit Example (Offset Constant)

³ Please let us know if you notice that the data and plot are missing in this section.
Appendix A

Central Limit Theorem

Let's consider a set of N independent random variables ξ₁, ξ₂, ..., ξ_N, all having the same PDF p(ξ_i) with the following parameters¹:

E[ξ_i] = μ,   V[ξ_i] = σ_i²,   i ∈ {1, 2, ..., N}.

Defining the random variable x̄ as the average of the ξ_i,

\[ \bar x = \frac{1}{N} \sum_{i=1}^{N} \xi_i , \]

the central limit theorem states that, for N ≫ 1, x̄ is normally distributed:

\[ p(\bar x) \simeq \frac{1}{\sqrt{2\pi\sigma_{\bar x}^2}}\, e^{-\frac{(\bar x - \mu)^2}{2\sigma_{\bar x}^2}} , \qquad E[\bar x] = \mu , \qquad V[\bar x] = \sigma_{\bar x}^2 = \sum_{i=1}^{N} \sigma_i^2 / N^2 . \]

The proof of this theorem is rather complicated and is outside the scope of this work.
The central limit theorem tells us that a random variable that is the sum of random variables following an unknown PDF behaves as a normally distributed random variable.

¹ This theorem has a more general formulation. It was proved first by Laplace and then extended by other mathematicians, including P. L. Chebyshev, A. A. Markov, and A. M. Lyapunov.
Appendix B

Statistical Propagation of Errors

We want to find an approximate formula that computes the variance of a function of random variables by using the standard Taylor series expansion.
Let f be a function of n random variables x = (x₁, x₂, ..., xₙ). The first-order Taylor expansion of f(x) about μ = (μ₁, μ₂, ..., μₙ), where

μ_i = E[x_i],   i = 1, 2, ..., n,

is

\[ f(\vec x) = f(\vec\mu) + \sum_{i=1}^{n} \left. \frac{\partial f(\vec x)}{\partial x_i} \right|_{x_i = \mu_i} (x_i - \mu_i) + O\!\left( (x_i - \mu_i)^2 \right) . \qquad (B.1) \]

Truncating the expansion at first order and applying the definition of variance, we get

\[ V[f(\vec x)] \simeq E\left[ \left( \sum_{i=1}^{n} \left. \frac{\partial f(\vec x)}{\partial x_i} \right|_{x_i = \mu_i} (x_i - \mu_i) \right)^{\!2}\, \right] . \qquad (B.2) \]

By the aid of the expectation value properties, we obtain

\[ V[f(\vec x)] \simeq \sum_{i=1,\,j=1}^{n,\,n} \left. \frac{\partial f(\vec x)}{\partial x_i} \right|_{x_i = \mu_i} \left. \frac{\partial f(\vec x)}{\partial x_j} \right|_{x_j = \mu_j} E[(x_i - \mu_i)(x_j - \mu_j)] . \qquad (B.3) \]

If the random variables are statistically independent, then

\[ E[(x_i - \mu_i)(x_j - \mu_j)] = \sigma_i^2\, \delta_{ij} , \qquad i, j = 1, 2, \dots, n , \qquad (B.4) \]

and (B.3) reduces to the statistical propagation of errors law (SPEL) used in chapter 2.
Appendix C

NPDF Random Variable Uncertainties

Let's consider a set of independent measurements X = {x₁, x₂, ..., x_N} of the same physical quantity x following the NPDF.

Measurement Set with no Uncertainties (Unweighted Case)

The uncertainty s of each single measurement x_i, and the way to report it, are

\[ s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar x)^2 , \qquad \bar x = \frac{1}{N} \sum_{i=1}^{N} x_i , \qquad x = (x_i \pm s)\ \text{units} . \]

The uncertainty s_x̄ of the average x̄ of the measurements, and the way to report it, are

\[ \bar x = \frac{1}{N} \sum_{i=1}^{N} x_i , \qquad s_{\bar x}^2 = \frac{s^2}{N} , \qquad x = (\bar x \pm s_{\bar x})\ \text{units} . \]

Measurement Set with Uncertainties (Weighted Case)

If each measurement x_i has its own uncertainty σ_i, the weighted average, its uncertainty, and the way to report them, are

\[ s_{\bar x}^2 = \frac{1}{\sum_{i=1}^{N} 1/\sigma_i^2} , \qquad \bar x = s_{\bar x}^2 \sum_{i=1}^{N} \frac{x_i}{\sigma_i^2} , \qquad x = (\bar x \pm s_{\bar x})\ \text{units} . \]
Appendix D

The Effective Variance

Let x be a normally distributed random variable, and let y = f(x) be another random variable. Let's suppose that each data point (x_i ± σ_{x_i}, y_i ± σ_{y_i}), i = 1, ..., N, is normally distributed around (x̄_i, ȳ_i).
Applying the MLP to y and x, we obtain

\[ L(x, y) = \prod_{i=1}^{N} \frac{1}{\sigma_{x_i} \sqrt{2\pi}} \exp\left[ -\frac{(x_i - \bar x_i)^2}{2\sigma_{x_i}^2} \right] \frac{1}{\sigma_{y_i} \sqrt{2\pi}} \exp\left[ -\frac{(y_i - \bar y_i)^2}{2\sigma_{y_i}^2} \right] = \left( \prod_{i=1}^{N} \frac{1}{2\pi\, \sigma_{x_i} \sigma_{y_i}} \right) e^{-S} , \]

where

\[ S = \sum_{i=1}^{N} \left[ \frac{(x_i - \bar x_i)^2}{2\sigma_{x_i}^2} + \frac{(y_i - \bar y_i)^2}{2\sigma_{y_i}^2} \right] . \]

Expanding f to first order around x_i, i.e., ȳ_i = f(x̄_i) ≃ f(x_i) + (x̄_i − x_i) f′(x_i), we get

\[ S \simeq \sum_{i=1}^{N} \left[ \frac{(x_i - \bar x_i)^2}{2\sigma_{x_i}^2} + \frac{\left[ f(x_i) + (\bar x_i - x_i) f'(x_i) - y_i \right]^2}{2\sigma_{y_i}^2} \right] . \]

Minimizing S with respect to the x̄_i, i.e., imposing ∂S/∂x̄_i = 0 for i = 1, ..., N, gives

\[ \frac{2(x_i - \bar x_i)}{2\sigma_{x_i}^2} + \frac{2\left\{ y_i - \left[ f(x_i) + (\bar x_i - x_i) f'(x_i) \right] \right\} f'(x_i)}{2\sigma_{y_i}^2} = 0 , \]

whose solution is

\[ x_i - \bar x_i = \frac{\sigma_{x_i}^2}{\sigma_i^2}\, [f(x_i) - y_i]\, f'(x_i) , \qquad \sigma_i^2 = \sigma_{y_i}^2 + [f'(x_i)]^2 \sigma_{x_i}^2 . \]

Substituting back into S, we finally obtain

\[ S = \sum_{i=1}^{N} \frac{[y_i - f(x_i)]^2}{2\sigma_i^2} , \]

i.e., the least square sum with the effective variance σ_i² defined above.
Bibliography

[1] P. R. Bevington and D. K. Robinson, Data Reduction and Error Analysis for the Physical Sciences, second edition, WCB McGraw-Hill.

[2] J. Orear, "Least squares when both variables have uncertainties," Am. J. Phys. 50(10), Oct. 1982.

[3] S. G. Rabinovich, Measurement Errors and Uncertainties: Theory and Practice, second edition, Springer.

[4] C. L. Nikias and A. P. Petropulu, Higher-Order Spectral Analysis, PTR Prentice Hall.

[5] V. de O. Sannibale, Basics on Semiconductor Physics, Freshman Laboratory Notes, http://www.ligo.caltech.edu/~vsanni/ph3/

[6] Deep Sea Research, 1968, Vol. 15, pp. 497–501, Pergamon Press (printed in Great Britain).