Definition of statistics
Descriptive or deductive statistics:
Describes and analyzes a subject or group.
Inductive
statistics:
Data collection
Variable data:
Measurable. If capable of any degree of
subdivision, it is referred to as continuous.
Attributes:
If your upper and lower specs are 9.58 mm and 9.52 mm, then the data collected should be to the nearest 0.01 mm.
[Figure: targets around the true value illustrating four cases: accurate, precise, accurate and precise, neither accurate nor precise]
TABLE 1 Number of Daily Billing Errors
0  1  1  2  0  1  1
1  5  0  1  4  3  3
3  4  2  1  1  4  0
0  1  0  1  3  0  1
1  2  0  2  1  0  2
Graphical Techniques
Histograms: Frequency Distribution, Ungrouped data
Ungrouped data comprises a listing of the observed values as shown in Table 1. A method of processing the data is necessary. A much better understanding can be obtained by tallying the frequency of each value of Daily Billing Errors as shown in Table 2. The numerical value for the number of tallies is called the frequency.
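The tallying step can be sketched in a few lines of Python. This is only an illustration: the list below is a transcription of the 35 daily billing error counts, and the helper name `tally` is made up for this example.

```python
from collections import Counter

# Values transcribed from Table 1 (number of daily billing errors).
billing_errors = [
    0, 1, 1, 2, 0, 1, 1,
    1, 5, 0, 1, 4, 3, 3,
    3, 4, 2, 1, 1, 4, 0,
    0, 1, 0, 1, 3, 0, 1,
    1, 2, 0, 2, 1, 0, 2,
]

def tally(values):
    """Return {value: frequency}, ordered by value."""
    return dict(sorted(Counter(values).items()))

frequencies = tally(billing_errors)
for value, freq in frequencies.items():
    # One tally mark per observation, followed by the frequency.
    print(f"{value}: {'|' * freq} ({freq})")
```

The printed tally marks play the role of the Tabulation column; the number in parentheses is the frequency.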
[Table 2: tally of daily billing errors; columns: Number Nonconforming, Tabulation, Frequency]
[Fig. 1 Frequency histogram; vertical axis: Frequency]
Number          Frequency   Relative        Cumulative   Relative cumulative
nonconforming               frequency       frequency    frequency
0               9           9/35 = 0.26     9            9/35 = 0.26
1               13          13/35 = 0.37    9+13 = 22    22/35 = 0.63
2               5           5/35 = 0.14     22+5 = 27    27/35 = 0.77
3               4           4/35 = 0.11     27+4 = 31    31/35 = 0.89
4               3           3/35 = 0.09     31+3 = 34    34/35 = 0.97
5               1           1/35 = 0.03     34+1 = 35    35/35 = 1.00
Total           35          1.00
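As a rough sketch, the relative, cumulative, and relative cumulative frequencies can be derived from the plain frequencies as follows. The frequencies are those implied by the fractions in the table (9, 13, 5, 4, 3, 1 out of 35); the function name `frequency_table` is hypothetical.

```python
from itertools import accumulate

# Frequencies of 0..5 daily billing errors, as implied by the fractions x/35.
frequencies = {0: 9, 1: 13, 2: 5, 3: 4, 4: 3, 5: 1}

def frequency_table(freqs):
    """Return rows of (value, f, relative f, cumulative f, relative cumulative f)."""
    total = sum(freqs.values())
    cumulative = list(accumulate(freqs.values()))
    rows = []
    for (value, f), cum in zip(freqs.items(), cumulative):
        rows.append((value, f, f / total, cum, cum / total))
    return rows

table = frequency_table(frequencies)
for value, f, rel, cum, rel_cum in table:
    print(f"{value:>2} {f:>3} {rel:6.2f} {cum:>3} {rel_cum:6.2f}")
```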
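[Figures: relative frequency histogram (vertical axis: Relative frequency, 0 to 0.4); cumulative frequency histogram (vertical axis: Cumulative frequency, 0 to 40); a third plot with a vertical axis from 0 to 1.00]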
Grouped data
Most data are continuous rather than discrete and require grouping.
1. Collect data and construct a tally sheet.
2. Determine the range.
3. Determine the cell interval and the
number of cells.
4. Determine the cell midpoints.
5. Post the cell frequency.
6. Construct the histogram
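Grouped data
1. Collect data and construct a tally sheet.
2. Determine the range.
$R = X_H - X_L$
where $X_H$ = highest number and $X_L$ = lowest number.
3. Determine the cell interval and the number of cells.
Guideline to determine the number of cells:
$h = \sqrt{N}$
where $N$ is the number of observations.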
[Figure: a cell with its lower boundary, midpoint, and upper boundary marked]
4. Determine the cell midpoints.
$M_p = X_L + i / 2$
where $M_p$ is the midpoint of the cell, $X_L$ is the lower boundary of the cell, and $i$ is the cell interval.
5. Post the cell frequency.
Cell frequency is the sum of frequencies of values within the cell boundaries. Make a tally of the values.
6. Construct the histogram.
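Steps 2 to 4 can be sketched as below. This is a minimal illustration, not a prescribed procedure: the function names are invented here, rounding the square root up to the next whole number is an assumption that matches how the worked examples round, and the inputs are the figures used in Example problem 2 (150 observations ranging from 0.1 to 3.8).

```python
import math

def cell_plan(n_obs, x_high, x_low):
    """Return (range R, number of cells h, raw cell interval i).

    Assumes h is sqrt(N) rounded up to a whole number and i = R / h.
    """
    r = x_high - x_low
    h = math.ceil(math.sqrt(n_obs))
    return r, h, r / h

def cell_midpoint(lower_boundary, interval):
    """Midpoint of a cell: Mp = XL + i / 2."""
    return lower_boundary + interval / 2

r, h, i = cell_plan(150, 3.8, 0.1)
print(f"R = {r:.2f}, h = {h}, i = {i:.3f}")
# The raw interval is usually rounded to a convenient value (0.3 here).
print(f"first midpoint = {cell_midpoint(0.1, 0.3):.2f}")
```

In practice the raw interval is rounded to a convenient value before the cell boundaries are laid out, so the final number of cells can differ slightly from `h`.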
Grouped data
Example problem 1
A company that fills bottles of oil tries to maintain a
specific weight of the product. The table gives
the weight of 110 bottles that were checked at
random intervals. Make a tally of these weights
and construct a frequency histogram (weight is in kg).
Weights of the 110 bottles (kg):
6.00  5.98  6.01  6.01  5.97  5.99  5.98  6.01  5.99  5.98  5.96
5.98  5.99  5.99  6.03  5.99  6.01  5.98  5.99  5.97  6.01  5.98
5.97  6.01  6.00  5.96  6.00  5.97  5.95  5.99  5.99  6.01  6.00
6.01  6.03  6.01  5.99  5.99  6.02  6.00  5.98  6.01  5.98  5.99
6.00  5.98  6.05  6.00  6.00  5.98  5.99  6.00  5.97  6.00  6.00
6.00  5.98  6.00  5.94  5.99  6.02  6.00  5.98  6.02  6.01  6.00
5.97  6.01  6.04  6.02  6.01  5.97  5.99  6.02  5.99  6.02  5.99
6.02  5.99  6.01  5.98  5.99  6.00  6.02  5.99  6.02  5.95  6.02
5.96  5.99  6.00  6.00  6.01  5.99  5.96  6.01  6.00  6.01  5.98
6.0   5.9   5.9   5.9   6.0   5.9   6.0   5.9   6.0   6.0   5.9
(The second decimal of the last 11 values is missing.)
Grouped data
Example problem 1 Sol.
$R = X_H - X_L = 6.05 - 5.94 = 0.11$
$h = \sqrt{N} = \sqrt{110} = 10.49 \approx 11$
$h = R / i$
$11 = 0.11 / i$
$i = 0.11 / 11 = 0.01$
Grouped data
Example problem 1 Sol.
Group/cell   Frequency $f_i$
5.94
5.95
5.96
5.97
5.98         16
5.99         24
6.00         20
6.01         17
6.02         13
6.03
6.04
6.05
Total        110
Grouped data
Example problem 2
The relative strengths of 150 silver solder welds are tested, and the results are given in the table. Determine the cell interval and the approximate number of cells. Make a table showing cell midpoints, cell boundaries, and observed frequencies. Plot a frequency histogram.
Relative strengths of the 150 welds:
1.5  1.2  3.1  1.3  0.7  1.3  3.4  1.3  1.7  2.6
1.1  0.8  0.1  2.9  1.0  1.3  2.6  1.7  1.0  1.5
2.2  3.0  2.0  1.8  0.3  0.7  2.4  1.5  0.7  2.1
2.9  2.5  2.0  3.0  1.5  1.3  3.5  1.1  0.7  0.5
1.6  1.4  2.2  1.0  1.7  3.1  2.7  2.3  1.7  3.2
3.0  1.7  2.8  2.2  0.6  2.0  1.4  3.3  2.2  2.9
1.8  2.3  3.3  3.1  3.3  2.9  1.6  2.3  3.3  2.0
1.6  2.7  2.2  1.2  1.3  1.4  2.3  2.5  1.9  2.1
3.4  1.5  0.8  2.2  3.1  2.1  3.5  1.4  2.8  2.8
1.8  2.4  1.2  3.7  1.3  2.1  1.5  1.9  2.0  3.0
0.9  3.1  2.9  3.0  2.1  1.8  1.1  1.4  1.9  1.7
1.5  3.0  2.6  1.0  2.8  1.8  1.8  2.4  2.3  2.2
2.9  1.8  1.4  1.4  3.3  2.4  2.1  1.2  1.4  1.6
2.4  2.1  1.8  2.1  1.6  0.9  2.1  1.5  2.0  1.1
3.8  1.3  1.3  1.0  0.9  2.9  2.5  1.6  1.2  2.4
Grouped data
Example problem 2 Sol.
$R = X_H - X_L = 3.8 - 0.1 = 3.7$
$h = \sqrt{N} = \sqrt{150} = 12.25 \approx 13$
$h = R / i$
$13 = 3.7 / i$
$i = 3.7 / 13 = 0.28 \approx 0.3$
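Cell boundaries   Midpoint $x_i$   Frequency $f_i$
0.1 to 0.4        0.25
0.4 to 0.7        0.55
0.7 to 1.0        0.85
1.0 to 1.3        1.15             14
1.3 to 1.6        1.45             25
1.6 to 1.9        1.75             20
1.9 to 2.2        2.05             18
2.2 to 2.5        2.35             18
2.5 to 2.8        2.65
2.8 to 3.1        2.95             17
3.1 to 3.4        3.25             11
3.4 to 3.7        3.55
3.7 to 4.0        3.85
Total                              150
[Figure: frequency histogram; horizontal axis: Strength; vertical axis: Frequency]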
Uses of Histogram
[Figure: histogram shapes: symmetrical (normal), bimodal, peaked, flat]
Characteristics of Frequency
Distribution Graphs
Provide a basis for decision making without further analysis.
Have certain identifiable characteristics:
Symmetry or lack of symmetry of the data. Are the data equally distributed on
each side of the central value, or are the data skewed to the right or to the left?
Location of data.
Location
Spread
Shape
Analysis of Histograms
[Figure: frequency histogram; horizontal axis: Wash concentration %, 0.7 to 2.8; vertical axis: Frequency]
Interpreting Histogram
Fig. 9 Histogram Shapes
Empty Interval. In this case, one of the intervals has zero frequency. This may result from bias in data collection.
Positive Skew. Positive skew means a long tail to the right. This is
common when successful efforts are being made to minimize the
measured value. Also, variance has a positively skewed distribution.
Negative Skew. Negative skew means a long tail to the left. This is
common when successful efforts are being made to increase the measured
value. Such a histogram may also result if sorting is taking place.
Outlier. Here one or more cells are greatly separated from the main body
of the histogram. Such observations are often the result of wrong
measurement or other mistakes.
Analytical Techniques
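Average, ungrouped data:
$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \dots + X_n}{n}$
where $n$ is the number of observed values.
Average, grouped data:
$\bar{X} = \frac{\sum_{i=1}^{h} f_i X_i}{\sum_{i=1}^{h} f_i}$
where $h$ is the number of cells, $f_i$ is the frequency of a cell, and $X_i$ is its midpoint.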
1.
2. x2 = 3.37
3. x3 = 3.28
4. x4 = 3.34
5. x5 = 3.30
Average = 3.33
$\bar{X} = \frac{\sum_{i=1}^{h} f_i X_i}{\sum_{i=1}^{h} f_i} = \frac{11549}{320} = 36.1$

Group          Midpoint $X_i$   Frequency $f_i$   Computation $f_i X_i$
23.6 to 26.5   25               4                 100
26.6 to 29.5   28               36                1008
29.6 to 32.5   31               51                1581
32.6 to 35.5   34               63                2142
35.6 to 38.5   37               58                2146
38.6 to 41.5   40               52                2080
41.6 to 44.5   43               34                1462
44.6 to 47.5   46               16                736
47.6 to 50.5   49               6                 294
Total                           320               11549
Midpoint $X_i$   Frequency $f_i$   Computation $f_i X_i$
3.5              6                 21.0
3.8              9                 34.2
4.1              18                73.8
4.4              14                61.6
4.7              13                61.1
5.0              5                 25.0
Total            65                276.7

$\bar{X} = \frac{\sum_{i=1}^{6} f_i X_i}{\sum_{i=1}^{6} f_i} = \frac{276.7}{65} = 4.26$ kg
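The grouped-data average can be checked with a short sketch. The function name `grouped_mean` is made up for this illustration, and the frequencies 6, 9, and 5 for midpoints 3.5, 3.8, and 5.0 are an assumption inferred from the computation column (21, 34.2, and 25) and the total of 65.

```python
def grouped_mean(midpoints, frequencies):
    """Weighted average of cell midpoints: sum(f_i * X_i) / sum(f_i)."""
    if len(midpoints) != len(frequencies):
        raise ValueError("midpoints and frequencies must have the same length")
    total_f = sum(frequencies)
    total_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    return total_fx / total_f

midpoints = [3.5, 3.8, 4.1, 4.4, 4.7, 5.0]   # cell midpoints, kg
frequencies = [6, 9, 18, 14, 13, 5]          # 6, 9 and 5 are inferred values

mean_weight = grouped_mean(midpoints, frequencies)
print(f"sum f = {sum(frequencies)}, mean = {mean_weight:.4f} kg")
```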
[Fig. 1 Frequency histogram; horizontal axis: Weight, with cell midpoints from 3.5 to 5.0 and the values 3.65, 4.25, and 4.85 marked; vertical axis: Frequency. Annotation: The process is centered, controlled, but not applicable]
The Median
Example 1. Find the median distance for the following data.
85, 125, 130, 65, 100, 70, 75, 50, 140, 95, 70
Sol.
Ordered data: 50, 65, 70, 70, 75, 85, 95, 100, 125, 130, 140
With 11 values, the median is the 6th ordered value.
Median = 85
Example 2. Find the median distance for the following data.
85, 125, 130, 65, 100, 70, 75, 50, 140, 135, 95, 70
Sol.
Ordered data: 50, 65, 70, 70, 75, 85, 95, 100, 125, 130, 135, 140
With 12 values, the median is the average of the 6th and 7th ordered values.
Median = (85 + 95) / 2 = 90
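Both median examples can be reproduced with Python's standard `statistics` module, which sorts the data and averages the two middle values when the count is even. The variable names are illustrative only.

```python
import statistics

odd_count = [85, 125, 130, 65, 100, 70, 75, 50, 140, 95, 70]         # 11 values
even_count = [85, 125, 130, 65, 100, 70, 75, 50, 140, 135, 95, 70]   # 12 values

median_odd = statistics.median(odd_count)    # middle value of the sorted data
median_even = statistics.median(even_count)  # mean of the two middle values

print(sorted(odd_count), median_odd)
print(sorted(even_count), median_even)
```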
The Mode
Bimodal: 15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26
Unimodal: 6, 8, 9, 9, 9, 10, 11, 14, 15, 18
No mode: 2.7, 3.5, 4.9, 5.1, 8.3
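A small sketch that classifies the three series above. The helper `modes` is hypothetical, and treating a series in which every value occurs only once as having no mode is the convention assumed here.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s), or [] when no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == top)

bimodal = [15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26]
unimodal = [6, 8, 9, 9, 9, 10, 11, 14, 15, 18]
no_mode = [2.7, 3.5, 4.9, 5.1, 8.3]

for name, series in [("bimodal", bimodal), ("unimodal", unimodal), ("no mode", no_mode)]:
    print(name, modes(series))
```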
[Figure: relative positions of the average, median, and mode in positively skewed and negatively skewed distributions]
Measures of Dispersion
Introduction
1- Range
The range of a series of numbers is the difference between the largest and smallest values or observations. Symbolically, it is given by the formula
$R = X_H - X_L$
where
R = range
$X_H$ = highest observation in a series
$X_L$ = lowest observation in a series
Example problem
The weights of a sample of 10 bottles of shampoo are recorded as follows (in g):
150, 147, 152, 156, 144, 148, 149, 153, 146, 151
Determine the range of the sample.
Solution
Max weight $X_H$ = 156 g
Min weight $X_L$ = 144 g
$R = X_H - X_L = 156 - 144 = 12$ g
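2- Standard deviation
$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
or
$s = \sqrt{\frac{n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n - 1)}}$
where
s = sample standard deviation
$X_i$ = observed value
$\bar{X}$ = average
n = number of observed values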
Example Problem 1
$s = \sqrt{\frac{n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n - 1)}} = \sqrt{\frac{6(231.26) - (37.2)^2}{6(6 - 1)}} = 0.35\%$ (see Table 5)

Table 5 - Measure of standard deviation
$X_i$    $X_i^2$
6.7      44.89
6.0      36.00
6.4      40.96
5.9      34.81
6.4      40.96
5.8      33.64
Sum:
37.2     231.26
Example Problem 2
$s = \sqrt{\frac{n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n - 1)}} = \sqrt{\frac{4(0.023758) - (0.308)^2}{4(4 - 1)}} = \sqrt{\frac{0.095032 - 0.094864}{12}} = \sqrt{\frac{0.000168}{12}} = \sqrt{0.000014} = 0.0037$

Measure of standard deviation
$X_i$    $X_i^2$
0.076    0.005776
0.082    0.006724
0.073    0.005329
0.077    0.005929
Sum:
0.308    0.023758
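The two forms of the sample standard deviation formula can be compared numerically. The sketch below uses the four observations of Example Problem 2; the function names are invented for this illustration, and `statistics.stdev` from the standard library serves as an independent check.

```python
import math
import statistics

def sample_std_definition(values):
    """s = sqrt( sum((Xi - mean)^2) / (n - 1) )."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

def sample_std_shortcut(values):
    """s = sqrt( (n * sum(Xi^2) - (sum(Xi))^2) / (n * (n - 1)) )."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

values = [0.076, 0.082, 0.073, 0.077]
s_def = sample_std_definition(values)
s_short = sample_std_shortcut(values)
print(f"{s_def:.4f} {s_short:.4f} {statistics.stdev(values):.4f}")
```

The shortcut form is convenient for hand calculation from a table of $X_i$ and $X_i^2$; in floating-point arithmetic the definition form is generally the more numerically stable of the two.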
Sample statistic                 Population parameter
$\bar{X}$ = average              $\mu$ ($\bar{X}_0$) = mean
s = sample standard deviation    $\sigma$ ($s_0$) = standard deviation
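[Table: eight samples of 10 spheres each; columns: Sample size, No. of green spheres, No. of blue spheres, % of green spheres; per-sample percentages of green spheres shown include 20, 50, and 30]
Total: sample size 80, green spheres 15, blue spheres 65, green spheres 18.8%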
A container holds 800 blue and 200 green spheres. The 1000 spheres are considered the population, with 20% green spheres.
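The gap between a sample statistic and the population parameter can be illustrated with a small simulation. This is a sketch under stated assumptions: eight samples of 10 spheres, each drawn without replacement and returned to the container before the next draw, with a fixed seed so the run is repeatable. The function name and defaults are hypothetical.

```python
import random

def draw_green_counts(n_blue=800, n_green=200, n_samples=8, sample_size=10, seed=0):
    """Return the number of green spheres found in each sample."""
    rng = random.Random(seed)
    population = ["blue"] * n_blue + ["green"] * n_green
    counts = []
    for _ in range(n_samples):
        sample = rng.sample(population, sample_size)  # spheres go back afterwards
        counts.append(sample.count("green"))
    return counts

green_counts = draw_green_counts()
total_drawn = 8 * 10
pct_green = 100 * sum(green_counts) / total_drawn
print(green_counts, f"{pct_green:.1f}% green overall (population: 20%)")
```

Individual samples can differ widely from 20% green, while the pooled percentage tends to sit closer to the population value.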
When the sample size is quite large and the cell interval
is very small, the histogram will take on the appearance of a
smooth polygon or a curve representing the population.
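[Figure: normal curve; vertical axis: Frequency; horizontal axis: $X_i$ from 84 to 96, with the corresponding standardized scale (-3, -2, -1, 0, 1, ...) below it]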
Fig. 15 shows three normal curves with the same mean value
but different standard deviations. The figure illustrates the
principle that the larger the standard deviation, the flatter
the curve, and the smaller the standard deviation, the more
peaked the curve.
[Figures: normal curves on a horizontal axis from 11 to 38, labelled with the parameter values 20 and 29, and 3 and 4.5]
[Figure: percentage of items under the normal curve]
$\mu \pm 1\sigma$: 68.26%
$\mu \pm 2\sigma$: 95.46%
$\mu \pm 3\sigma$: 99.73%
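These percentages can be checked from the error function, since the area of a normal curve within $k$ standard deviations of the mean is $\operatorname{erf}(k / \sqrt{2})$. The function name is made up for this sketch, and the computed values may differ in the last digit from the table-based figures quoted above.

```python
import math

def area_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * area_within(k):.2f}%")
```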
Applications
The areas under the curve for various Z values are given in Table A in the appendix. Table A, "Areas under the Normal Curve," is a left-reading table, which means that the given areas are for that portion of the curve from $-\infty$ to a particular value, $X_i$.
The first step is to determine the Z value using the formula
$Z = \frac{X_i - \mu}{\sigma}$
where
Z = standard normal value
$X_i$ = individual value
$\mu$ = mean
$\sigma$ = population standard deviation
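[Figure: normal curve with Area1 to the left of $X_i$]
$\sigma = 0.024$, $\mu = 0.297$, $X_i = 0.274$
$Z = \frac{X_i - \mu}{\sigma} = \frac{0.274 - 0.297}{0.024} = -0.96$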
From Table A it is found that for Z = - 0.96,
Area1 = 0.1685 or 16.85%
Thus, 16.85% of the data are less than 0.274 kg.
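[Figure: normal curve with Area2 to the left of $X_i$ and Area1 to the right]
$\sigma = 0.024$, $\mu = 0.297$, $X_i = 0.347$
$Z_2 = \frac{X_i - \mu}{\sigma} = \frac{0.347 - 0.297}{0.024} = +2.08$
From Table A it is found that for Z2 = +2.08,
Area2 = 0.9812
Area1 = AreaT - Area2
= 1.0000 - 0.9812
= 0.0188 or 1.88%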
Thus, 1.88% of the data are above 0.347 kg.
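As a cross-check on the table lookups, the same two areas can be computed with `statistics.NormalDist` from the Python standard library. The helper `z_value` is illustrative. Because the code does not round Z to two decimals before the lookup, the results agree with the table-based figures only to about three decimal places.

```python
from statistics import NormalDist

process = NormalDist(mu=0.297, sigma=0.024)  # mean and population standard deviation, kg

def z_value(x, dist):
    """Standard normal value Z = (Xi - mu) / sigma."""
    return (x - dist.mean) / dist.stdev

area_below = process.cdf(0.274)      # left-reading area, as in Table A
area_above = 1 - process.cdf(0.347)  # total area minus the left-reading area

print(f"Z = {z_value(0.274, process):+.2f}, below 0.274 kg: {100 * area_below:.2f}%")
print(f"Z = {z_value(0.347, process):+.2f}, above 0.347 kg: {100 * area_above:.2f}%")
```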
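[Figure: normal curve with Area1, Area2, and Area3 marked]
$\sigma = 1.20$, $\mu = 118.5$, $X_i = 116$ and $X_i = 120$
$Z_2 = \frac{X_i - \mu}{\sigma} = \frac{116 - 118.5}{1.20} = -2.08$
$Z_3 = \frac{X_i - \mu}{\sigma} = \frac{120 - 118.5}{1.20} = +1.25$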
Area1 = 0.1210, $\sigma = 1.20$, $X_0 = ?$, $X_i = 115$
$Z = \frac{X_i - X_0}{\sigma}$
$-1.17 = \frac{115 - X_0}{1.20}$
$X_0 = 116.4$ V
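The reverse lookup (from an area to a Z value, then to the required mean) can be sketched with `NormalDist.inv_cdf`. The function name `required_mean` is hypothetical; the inputs are the values above, reading Area1 = 0.1210 as the area below $X_i$ = 115.

```python
from statistics import NormalDist

def required_mean(x_limit, sigma, area_below):
    """Mean X0 for which `area_below` of the population lies below `x_limit`."""
    z = NormalDist().inv_cdf(area_below)  # Z value whose left-reading area is area_below
    return x_limit - z * sigma            # rearranged from Z = (Xi - X0) / sigma

x0 = required_mean(115, 1.20, 0.1210)
print(f"Z = {NormalDist().inv_cdf(0.1210):.2f}, X0 = {x0:.1f}")
```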
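[Figure: normal curve with Area1 to Area4 marked]
$\sigma = 0.4$, $\mu = 9.07$, $X_a = 8.3$, $X_b = 10$
$Z_a = \frac{X_a - \mu}{\sigma} = \frac{8.3 - 9.07}{0.4} = -1.925$
$Z_b = \frac{X_b - \mu}{\sigma} = \frac{10 - 9.07}{0.4} = +2.325$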
Plastic strips that are used in a sensitive electronic device are manufactured to a maximum specification of 305.70 mm and a minimum specification of 304.55 mm. If the strips are less than the minimum specification, they are scrapped; if greater than the maximum specification, they are reworked. The part dimensions are normally distributed with a population standard deviation of 0.25 mm. What % of the product is scrap? What % is rework? How can the process be centered to eliminate all but 0.1% of the scrap? What is the rework % then?
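[Figure: normal curve with Area1 below Xmin and Area2 to the left of Xmax]
$\sigma = 0.25$, $X_{min} = 304.55$, $X_{max} = 305.70$
With the process centered midway between the specifications:
$\mu = \frac{X_{min} + X_{max}}{2} = \frac{304.55 + 305.70}{2} = 305.125$
$Z_1 = \frac{X_{min} - \mu}{\sigma} = \frac{304.55 - 305.125}{0.25} = -2.3$
From Table A it is found that for Z1 = -2.3, Area1 = 0.0107
Thus, 1.07% of the strips are scrapped.
$Z_2 = \frac{X_{max} - \mu}{\sigma} = \frac{305.70 - 305.125}{0.25} = +2.3$
From Table A it is found that for Z2 = +2.3, Area2 = 0.9893
Thus, % of rework = 1 - 0.9893 = 0.0107 = 1.07%.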
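To leave only 0.1% scrap, Area1 = 0.0010. From Table A, the corresponding value is Z = -3.09.
$Z = \frac{X_{min} - \mu}{\sigma}$
$-3.09 = \frac{304.55 - \mu}{0.25}$
$\mu = 304.55 + 3.09(0.25) = 305.32$
$Z = \frac{X_{max} - \mu}{\sigma} = \frac{305.70 - 305.32}{0.25} = +1.52$
From Table A it is found that for Z = +1.52, area = 0.9357
Thus, rework % = 1 - 0.9357 = 0.0643 = 6.43%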
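The scrap and rework question can also be explored numerically. The sketch below is one reading of the problem, not a definitive solution: it assumes the process starts centered midway between the specifications, and it interprets "all but 0.1% of the scrap" as a lower-tail area of 0.001. Function names are invented for the illustration, and because no intermediate rounding is applied the results can differ slightly from table-based hand calculations.

```python
from statistics import NormalDist

X_MIN, X_MAX, SIGMA = 304.55, 305.70, 0.25  # specifications and sigma, mm

def scrap_and_rework(mean):
    """Fractions below X_MIN (scrap) and above X_MAX (rework) for a given mean."""
    dist = NormalDist(mean, SIGMA)
    return dist.cdf(X_MIN), 1 - dist.cdf(X_MAX)

def mean_for_scrap(target_scrap):
    """Process mean that leaves `target_scrap` of the product below X_MIN."""
    z = NormalDist().inv_cdf(target_scrap)
    return X_MIN - z * SIGMA

centered = (X_MIN + X_MAX) / 2
scrap, rework = scrap_and_rework(centered)
new_mean = mean_for_scrap(0.001)  # assumption: 0.1% read as an area of 0.001
new_scrap, new_rework = scrap_and_rework(new_mean)

print(f"centered at {centered:.3f}: scrap {100 * scrap:.2f}%, rework {100 * rework:.2f}%")
print(f"centered at {new_mean:.2f}: scrap {100 * new_scrap:.2f}%, rework {100 * new_rework:.2f}%")
```

Raising the mean moves the distribution away from the lower specification, so scrap falls while the share above the upper specification grows.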