• There is a value on the axis around which the data tend to gather; this is
common for most measured continuous random variables.
• A convenient number of intervals for a continuous random variable
can be found from the following formula:
$$K = 1.87\,(N-1)^{0.4} + 1$$
• Here K is the number of intervals and N is the number of
data points.
Probability Density Function, p(x)
Compute the histogram and frequency distribution for the data given in the previous table.
Solution:
For N = 20, K = 1.87(19)^0.4 + 1 ≈ 7.07, so K = 7 intervals are used.
j   Interval               n_j   f_j = n_j/N
1   0.65 ≤ x_i < 0.75      1     0.05
2   0.75 ≤ x_i < 0.85      1     0.05
3   0.85 ≤ x_i < 0.95      3     0.15
4   0.95 ≤ x_i < 1.05      7     0.35
5   1.05 ≤ x_i < 1.15      4     0.20
6   1.15 ≤ x_i < 1.25      2     0.10
7   1.25 ≤ x_i < 1.35      2     0.10
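The frequency distribution above can be reproduced with a short script. This is a minimal sketch in Python using only the standard library; the interval start (0.65) and width (0.10) are read off the solution table, and the variable names are illustrative.

```python
# Data set of 20 measurements used throughout this section.
x = [0.98, 1.07, 0.86, 1.16, 0.96, 0.68, 1.34, 1.04, 1.21, 0.86,
     1.02, 1.26, 1.08, 1.02, 0.94, 1.11, 0.99, 0.78, 1.06, 0.96]

N = len(x)
# Number of intervals from K = 1.87 (N - 1)^0.4 + 1, rounded to an integer.
K = round(1.87 * (N - 1) ** 0.4 + 1)

# Interval layout taken from the solution table: first edge 0.65, width 0.10.
x_min, width = 0.65, 0.10

counts = [0] * K
for xi in x:
    j = int((xi - x_min) / width)  # zero-based index of the interval holding xi
    counts[j] += 1

freqs = [n / N for n in counts]    # relative frequencies f_j = n_j / N
for j, (n, f) in enumerate(zip(counts, freqs), start=1):
    lo = x_min + (j - 1) * width
    print(f"{j}  {lo:.2f} <= x < {lo + width:.2f}  n_j = {n}  f_j = {f:.2f}")
```

None of the 20 values falls exactly on an interval edge, so the simple index calculation is not sensitive to floating-point rounding for this data set.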
Probability Density Function, p(x)
• The probability density function defines the probability that a
measured variable might assume a particular value upon any
individual measurement.
• It also provides the central tendency of the variable, which is
the desired representative value that gives the best estimate
of the true mean value.
• There are a number of standard distribution shapes that
suggest how a variable will be distributed on the probability
density plot.
• Experimentally determined histograms are used to determine
which type of standard distribution the measured variable
tends to follow.
Probability Density Function, p(x)
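• The analysis assumes that the mean value of the measurement
approaches the true value x' as N → ∞ (discrete data) or T → ∞ (continuous signal):
$$x' = \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N} x_i$$
$$x' = \lim_{T\to\infty} \frac{1}{T}\int_{0}^{T} x(t)\,dt$$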
Infinite Statistics
• The probability (P%) that any future measurement will lie
within a certain interval x' ± δx is the area under the
curve p(x) over that interval. This area is found by integration:
$$P(x' - \delta x \le x \le x' + \delta x) = \int_{x'-\delta x}^{x'+\delta x} p(x)\,dx$$
Infinite Statistics
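• Assuming
$$\beta = \frac{x - x'}{\sigma}, \qquad z_1 = \frac{x_1 - x'}{\sigma} = \frac{\delta x}{\sigma}$$
• The integral can be written as
$$P(-z_1 \le \beta \le z_1) = \frac{1}{\sqrt{2\pi}}\int_{-z_1}^{z_1} e^{-\beta^2/2}\,d\beta = 2\left[\frac{1}{\sqrt{2\pi}}\int_{0}^{z_1} e^{-\beta^2/2}\,d\beta\right]$$
The term in brackets is found in tables.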
Infinite Statistics
• The probability, P%, that a measured value, x_i, lies within the interval
$$x_i = x' \pm \delta x = x' \pm z_1\sigma \quad (P\%)$$
is found by integrating the function p(x) between the
limits −z_1 and z_1.
Infinite Statistics
• Example 1:
Find the probability, P%, that a measurement x_i will lie within
x' ± σ.
Infinite Statistics
• Example 2:
Find the probability, P%, that a measurement x_i will lie within
x' ± z_1σ, when z_1 = 2 and z_1 = 3.
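As a numerical check on Examples 1 and 2, the tabulated integral can be evaluated with the error function. This is a minimal sketch assuming a normal (Gaussian) p(x); the helper name `prob_within` is illustrative, and it relies on the identity P(−z1 ≤ β ≤ z1) = erf(z1/√2).

```python
import math

def prob_within(z1):
    """P(-z1 <= beta <= z1) for a normal distribution, i.e. erf(z1 / sqrt(2))."""
    return math.erf(z1 / math.sqrt(2.0))

for z1 in (1, 2, 3):
    print(f"z1 = {z1}: P = {100 * prob_within(z1):.2f}%")
```

The three cases give roughly 68.27%, 95.45% and 99.73%, the same values obtained by doubling the one-sided table entries.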
Infinite Statistics
• Example 3:
It is known that the statistics of a well-defined voltage signal are
given by x' = 8.5 V and σ² = 2.25 V². If a single measurement of the
voltage signal is made, determine the probability that the measured
value will lie between 10.0 and 11.5 V.
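One way to work Example 3 numerically is to convert both limits to z values and difference the cumulative normal distribution. The sketch below assumes the signal is normally distributed; `phi` is an illustrative helper, not a name from the slides.

```python
import math

def phi(z):
    """Cumulative standard normal distribution, from -infinity to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

x_true = 8.5              # true mean x', in V
sigma = math.sqrt(2.25)   # standard deviation, in V (sigma^2 = 2.25 V^2)

z_low = (10.0 - x_true) / sigma    # lower limit in standard units
z_high = (11.5 - x_true) / sigma   # upper limit in standard units

p = phi(z_high) - phi(z_low)
print(f"z from {z_low:.1f} to {z_high:.1f}: P = {p:.4f}")
```

The limits correspond to z = 1 and z = 2, so the result is the difference of the two one-sided table values, about 0.136 (13.6%).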
Finite Statistics
• Finite statistics means that only a finite number of measurements,
N, are to be made.
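• The sample mean, $\bar{x}$, is given by:
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$
• The sample variance, $S_x^2$, is given by:
$$S_x^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2$$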
Finite Statistics
• Analogous to infinite statistics, a measurement, x_i, can be expected to lie within
the interval:
$$x_i = \bar{x} \pm t_{\nu,P}\,S_x \quad (P\%)$$
• The value of the t estimator, $t_{\nu,P}$, is given in the Student's t distribution table
for ν = N − 1 degrees of freedom.
• $S_x$ is the sample standard deviation. The standard deviation of the
means, $S_{\bar{x}}$, is given by:
$$S_{\bar{x}} = \frac{S_x}{\sqrt{N}}$$
• Hence the true mean value is expected to lie within the range
$$\bar{x} \pm t_{\nu,P}\,S_{\bar{x}} \quad (P\%)$$
• Or, in the absence of systematic error, the true mean, x', is:
$$x' = \bar{x} \pm t_{\nu,P}\,S_{\bar{x}} \quad (P\%)$$
Finite Statistics
• Example 4
Consider the data given in the table below.
a) Compute the sample statistics for this data set
b) Estimate the interval of values over which 95% of the
measurements of the random continuous variable (measurand)
should be expected to lie.
c) Estimate the true mean value of the measurand at 95% probability
based on this finite data set.
 i    x_i      i    x_i
 1    0.98    11    1.02
 2    1.07    12    1.26
 3    0.86    13    1.08
 4    1.16    14    1.02
 5    0.96    15    0.94
 6    0.68    16    1.11
 7    1.34    17    0.99
 8    1.04    18    0.78
 9    1.21    19    1.06
10    0.86    20    0.96
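The three parts of Example 4 can be sketched in a few lines of Python. The t value below is an assumption taken from a standard Student's t table (ν = 19, 95%, two-sided); it is hard-coded because the standard library has no t distribution.

```python
import math
import statistics

x = [0.98, 1.07, 0.86, 1.16, 0.96, 0.68, 1.34, 1.04, 1.21, 0.86,
     1.02, 1.26, 1.08, 1.02, 0.94, 1.11, 0.99, 0.78, 1.06, 0.96]

N = len(x)
mean = statistics.mean(x)        # (a) sample mean
s_x = statistics.stdev(x)        # (a) sample standard deviation (N - 1 denominator)
s_mean = s_x / math.sqrt(N)      # standard deviation of the means

t_19_95 = 2.093                  # assumed table value for nu = N - 1 = 19 at 95%

half_width_x = t_19_95 * s_x         # (b) interval for individual measurements
half_width_mean = t_19_95 * s_mean   # (c) interval for the true mean

print(f"mean = {mean:.2f}, S_x = {s_x:.2f}")
print(f"x_i = {mean:.2f} +/- {half_width_x:.2f} (95%)")
print(f"x'  = {mean:.2f} +/- {half_width_mean:.2f} (95%)")
```

With this data the sample mean is about 1.02 and the sample standard deviation about 0.16, so individual measurements are expected within roughly ±0.33 and the true mean within roughly ±0.07 at 95%.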
Finite Pooled Statistics
• Replications are independent estimates of the same measured
value. Combining data from replications leads to better statistical
estimates of a measured variable
• Samples that are grouped in a manner so as to determine a
common set of statistics are said to be pooled.
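• The pooled mean of the samples for M replications, each with N_j measurements,
is denoted $\langle\bar{x}\rangle$ and is computed as follows:
$$\langle\bar{x}\rangle = \frac{\sum_{j=1}^{M} N_j\,\bar{x}_j}{\sum_{j=1}^{M} N_j}$$
• The pooled standard deviation, $\langle S_x\rangle$, with ν_j = N_j − 1 degrees of
freedom in replication j, is:
$$\langle S_x\rangle = \sqrt{\frac{\sum_{j=1}^{M} \nu_j\,S_{x_j}^2}{\sum_{j=1}^{M} \nu_j}}$$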
Finite Pooled Statistics
• The pooled standard deviation of the means is:
$$\langle S_{\bar{x}}\rangle = \frac{\langle S_x\rangle}{\sqrt{\sum_{j=1}^{M} N_j}}$$
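The pooled statistics can be illustrated with a short calculation. The replication data below are hypothetical values invented for the sketch, and the pooled standard deviation is assumed to weight each replication's variance by its degrees of freedom, ν_j = N_j − 1.

```python
import math

# Hypothetical summary statistics for M = 3 replications:
# (N_j, sample mean of replication j, sample standard deviation of replication j).
replications = [(20, 1.02, 0.16), (15, 1.05, 0.14), (25, 0.99, 0.18)]

n_total = sum(n for n, _, _ in replications)        # sum of N_j
nu_total = sum(n - 1 for n, _, _ in replications)   # sum of nu_j = N_j - 1

# Pooled mean: each replication mean weighted by its number of measurements.
pooled_mean = sum(n * m for n, m, _ in replications) / n_total

# Pooled standard deviation: variances weighted by degrees of freedom (assumption).
pooled_s = math.sqrt(sum((n - 1) * s ** 2 for n, _, s in replications) / nu_total)

# Pooled standard deviation of the means.
pooled_s_mean = pooled_s / math.sqrt(n_total)

print(f"pooled mean = {pooled_mean:.3f}")
print(f"pooled S_x = {pooled_s:.3f}, pooled S_mean = {pooled_s_mean:.4f}")
```

Because the pooled standard deviation of the means divides by the square root of the total number of measurements, it is smaller than the value any single replication would give, which is the sense in which pooling improves the estimate.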
Chi-Squared Distribution
Precision Interval in Sample variance
• Analogous to the standard deviation of the means, which indicates how well the
sample mean value, $\bar{x}$, estimates the true mean value, x', here we
need to determine how well the sample variance, $S_x^2$, represents
the true variance, σ².
• The chi-squared probability density function, p(χ²), is used to
determine how well $S_x^2$ predicts σ².
• For a normal distribution, chi-squared, χ², is given as follows:
$$\chi^2 = \nu S_x^2/\sigma^2$$
• where ν is the degrees of freedom.
• The precision interval of the sample variance can be determined
by the probability statement
$$P(\chi^2_{1-\alpha/2} \le \chi^2 \le \chi^2_{\alpha/2}) = 1 - \alpha$$
• where α is the level of significance.
Chi-Squared Distribution
Precision Interval in Sample variance
• Rearranging terms:
$$P(\nu S_x^2/\chi^2_{\alpha/2} \le \sigma^2 \le \nu S_x^2/\chi^2_{1-\alpha/2}) = 1 - \alpha$$
• Example 5:
• Ten steel specimens are tested from a large batch, and a sample
variance of 40 000 (kN/m²)² is found. State the true variance
expected at 95% confidence.
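Example 5 can be worked as follows. This is a sketch only: the two chi-squared values are assumptions taken from a standard chi-squared table for ν = 9 (α/2 = 0.025 and 1 − α/2 = 0.975) and are hard-coded because the Python standard library does not provide them.

```python
nu = 10 - 1        # degrees of freedom for ten specimens
s2 = 40_000.0      # sample variance, in (kN/m^2)^2

# Assumed table values for nu = 9 at 95% confidence (alpha = 0.05).
chi2_alpha_half = 19.023           # chi^2 at alpha/2 = 0.025
chi2_one_minus_alpha_half = 2.700  # chi^2 at 1 - alpha/2 = 0.975

var_low = nu * s2 / chi2_alpha_half              # lower bound on the true variance
var_high = nu * s2 / chi2_one_minus_alpha_half   # upper bound on the true variance

print(f"{var_low:.0f} <= sigma^2 <= {var_high:.0f} (kN/m^2)^2 at 95%")
```

The interval is wide, roughly 1.9 × 10⁴ to 1.3 × 10⁵ (kN/m²)², which shows how weakly a sample of ten constrains the true variance.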
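Chi-Squared Distribution
Goodness of Fit
• The chi-squared statistic compares the observed number of occurrences, n_j, in each
of the K intervals with the expected number, n'_j, predicted by the assumed distribution:
$$\chi^2 = \sum_{j=1}^{K} \frac{(n_j - n'_j)^2}{n'_j}$$
• The lower the value of chi-squared, the better the data set fits the
assumed distribution function.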
Chi-Squared Distribution
Goodness of Fit
• Criteria:
– A very good fit for:
P(χ²) < 5%, i.e. α > 95%
– Non decisive for
 i    x_i      i    x_i
 1    0.98    11    1.02
 2    1.07    12    1.26
 3    0.86    13    1.08
 4    1.16    14    1.02
 5    0.96    15    0.94
 6    0.68    16    1.11
 7    1.34    17    0.99
 8    1.04    18    0.78
 9    1.21    19    1.06
10    0.86    20    0.96
Chi-Squared Distribution
Goodness of Fit
j   Interval               n_j   f_j = n_j/N
1   0.65 ≤ x_i < 0.75      1     0.05
2   0.75 ≤ x_i < 0.85      1     0.05
3   0.85 ≤ x_i < 0.95      3     0.15
4   0.95 ≤ x_i < 1.05      7     0.35
5   1.05 ≤ x_i < 1.15      4     0.20
6   1.15 ≤ x_i < 1.25      2     0.10
7   1.25 ≤ x_i < 1.35      2     0.10
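A goodness-of-fit calculation for this data set might look like the sketch below. It assumes the candidate distribution is normal with its mean and standard deviation estimated from the sample, and it takes the expected count in each interval as N times the normal probability of that interval; these modelling choices are assumptions made for illustration.

```python
import math
import statistics

x = [0.98, 1.07, 0.86, 1.16, 0.96, 0.68, 1.34, 1.04, 1.21, 0.86,
     1.02, 1.26, 1.08, 1.02, 0.94, 1.11, 0.99, 0.78, 1.06, 0.96]
counts = [1, 1, 3, 7, 4, 2, 2]                 # observed n_j from the table
edges = [0.65 + 0.10 * j for j in range(8)]    # interval edges 0.65 ... 1.35

def phi(z):
    """Cumulative standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

N = len(x)
mean = statistics.mean(x)
s_x = statistics.stdev(x)

# Expected number of occurrences n'_j in each interval under the assumed normal.
expected = [N * (phi((b - mean) / s_x) - phi((a - mean) / s_x))
            for a, b in zip(edges[:-1], edges[1:])]

# Chi-squared statistic: sum of (n_j - n'_j)^2 / n'_j over the intervals.
chi2 = sum((n - ne) ** 2 / ne for n, ne in zip(counts, expected))

for j, (n, ne) in enumerate(zip(counts, expected), start=1):
    print(f"interval {j}: n_j = {n}, n'_j = {ne:.2f}")
print(f"chi-squared = {chi2:.2f}")
```

The resulting value is then compared with the chi-squared table at the appropriate degrees of freedom to judge the fit against the criteria above.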
Data Outlier Detection
• The three-sigma test is used to detect
measurement outliers.
• Outliers are points lying outside the interval
$$x_i = \bar{x} \pm t_{\nu,99.8}\,S_x \quad (99.8\%)$$
• Compute the value of
$$z_0 = \frac{|x_i - \bar{x}|}{S_x}$$
• Find the value of P(z_0) from the normal
distribution table.
Data Outlier Detection (contd)
• The point is an outlier if
$$0.5 - P(z_0) \le 0.1$$
Data Outlier Detection (contd.)
• Example 8
– Consider the data given below for 10 measurements of tire pressure
made using a hand-held gauge (Note: 14.5 psi = 1 bar). Compute the
statistics of the data set; then test for outliers using the modified
three-sigma test.
i 1 2 3 4 5 6 7 8 9 10
xi [psi] 28 31 27 29 28 24 29 28 18 27
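Example 8 can be sketched in Python as follows. The tabulated P(z0) is taken to be the area under the standard normal curve from 0 to z0, and the outlier criterion is applied exactly as written above, with no scaling by the number of data points; both readings are assumptions.

```python
import math
import statistics

pressure = [28, 31, 27, 29, 28, 24, 29, 28, 18, 27]   # tire pressure, psi

mean = statistics.mean(pressure)
s_x = statistics.stdev(pressure)   # sample standard deviation

def p_table(z0):
    """Area under the standard normal curve between 0 and z0 (one-sided table value)."""
    return 0.5 * math.erf(z0 / math.sqrt(2.0))

outliers = []
for xi in pressure:
    z0 = abs(xi - mean) / s_x
    # Criterion as stated in the slides: 0.5 - P(z0) <= 0.1.
    if 0.5 - p_table(z0) <= 0.1:
        outliers.append(xi)

print(f"mean = {mean:.1f} psi, S_x = {s_x:.2f} psi")
print(f"outliers: {outliers}")
```

For this data set only the 18 psi reading is flagged. After removing an outlier, the statistics would normally be recomputed from the remaining points.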
Number of Measurements Required
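• This part is concerned with determining the number of
measurements, N, required to reduce the random
error to within an acceptable limit.
$$x' = \bar{x} \pm \underbrace{t_{\nu,P}\,S_{\bar{x}}}_{\text{confidence interval}} \quad (P\%)$$
• The confidence interval, CI, is:
$$CI = \pm t_{\nu,P}\,S_{\bar{x}} = \pm t_{\nu,P}\frac{S_x}{\sqrt{N}} \quad (P\%)$$
• The one-sided precision value, d, is:
$$d = \frac{CI}{2} = t_{\nu,P}\frac{S_x}{\sqrt{N}} \quad (P\%)$$
• Therefore:
$$N = \left(\frac{t_{\nu,P}\,S_x}{d}\right)^2 \quad (P\%)$$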
• Example 9:
– From 51 measurements of a variable, the standard
deviation is found to be 160 units. For a 95%
confidence interval of 60 units in the mean value,
estimate the total number of measurements
required.
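Example 9 can be estimated with the formula above. The sketch assumes the t value from the 51 preliminary measurements (ν = 50, 95%, taken from a standard Student's t table) is good enough for the estimate, and it takes d as half of the stated 60-unit confidence interval.

```python
import math

n_prelim = 51     # preliminary measurements already taken
s_x = 160.0       # standard deviation from the preliminary data, in units
ci = 60.0         # required 95% confidence interval of the mean, in units
d = ci / 2        # one-sided precision value, d = CI / 2

t_50_95 = 2.009   # assumed table value for nu = 51 - 1 = 50 at 95%

# N = (t * S_x / d)^2, rounded up to a whole number of measurements.
n_total = math.ceil((t_50_95 * s_x / d) ** 2)
n_additional = n_total - n_prelim

print(f"total measurements required: {n_total}")
print(f"additional measurements beyond the first {n_prelim}: {n_additional}")
```

Under these assumptions about 115 measurements are needed in total, that is 64 more than the 51 already made. Since t depends on N, the estimate can be refined by repeating the calculation once the extra data are available.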
Thank You
Term Reports
1. Pressure & Velocity Measurements, scientific
ethics
2. Temperature Measurements, Health, safety and
environment
3. Flow measurements, Health, safety and
environment
4. Force, torque and strain measurement, scientific
ethics
5. Air pollution measurements and sampling,
Health, safety and environment