
Fundamentals of statistics

Definition of statistics

1. A collection of quantitative data pertaining to any subject or group, especially when the data are systematically gathered and collected.

2. The science that deals with the collection, tabulation, analysis, interpretation, and presentation of quantitative data.

Two phases of statistics


Descriptive or deductive statistics: describes and analyzes a subject or group.

Inductive statistics: determines from a limited sample of data an important conclusion about the population.

Data collection

Data may be collected by direct observation or indirectly through written or verbal questions.

Data that are collected for quality control purposes are collected by direct observation and are classified as either variable or attribute.

Data collection

Variable data: quality characteristics that are measurable. A variable capable of any degree of subdivision is referred to as continuous (examples: weight, length). Variables that exhibit gaps are called discrete.

Sometimes it is convenient for verbal or non-numerical data to assume the nature of a variable; for example, the quality of a surface finish can be classified as good (3), average (2), or poor (1).

While many quality characteristics are stated in terms of variables, many others must be stated as attributes.

Data collection

Attributes: quality characteristics that are classified as either conforming or nonconforming (go / no go).

Characteristics that are judged by visual observation are classified as attributes.

Sometimes it is desirable for variables to be classified as attributes; for example, the actual weight of a package may matter less than whether the weight is within specifications.

Data collection

In data collection, the number of figures recorded is a function of the intended use of the data.

For example, for data on the life of light bulbs it is acceptable to record 995.6 h; 995.632 h is more precise than necessary.

If the upper and lower specifications are 9.58 and 9.52 mm, then the data should be collected to the nearest 0.01 mm.

Measuring instruments may not give a true reading because of problems with accuracy and precision.

Accuracy and precision

[Figure: target diagrams labeled Accurate; Precise; Accurate & precise; Not accurate & not precise, each shown relative to the true value]

Describing the data


Sometimes so many data are collected that they are more confusing than helpful. Consider the data shown in Table 1.

TABLE 1 Number of Daily Billing Errors

0   1   1   2   0   1   1
1   5   0   1   4   3   3
3   4   2   1   1   4   0
0   1   0   1   3   0   1
1   2   0   2   1   0   2

Describing the data


Clearly these data, in this form, are difficult to use and are not effective in describing the data's characteristics. Some means of summarizing the data is needed to show what value the data tend to cluster about and how the data are dispersed or spread out.

Two techniques are available to accomplish this summarization of data: graphical and analytical.

The graphical technique is a plot or picture of a frequency distribution.

Analytical techniques summarize data by computing a measure of central tendency and a measure of the dispersion.

Sometimes both the graphical and analytical techniques are used.

Graphical Techniques

Ungrouped data comprise a listing of the observed values, as shown in Table 1. A method of processing the data is necessary. A much better understanding can be obtained by tallying the frequency of each value of daily billing errors, as shown in Table 2.

The numerical value for the number of tallies is called the frequency.

Frequency Distribution Histograms


Table 2 Tally of Number of Daily Billing Errors

Number nonconforming   Tabulation         Frequency
0                      ||||| ||||         9
1                      ||||| ||||| |||    13
2                      |||||              5
3                      ||||               4
4                      |||                3
5                      |                  1
Frequency Distribution Histograms


Ungrouped data

If the "Tabulation" column is eliminated, the resulting table is classified as a frequency distribution and can be presented graphically as a histogram.

A histogram consists of a set of rectangles that represent the frequency in each category, as shown in Fig. 1.

[Fig. 1 Frequency histogram: frequency versus number nonconforming]

Frequency Distribution Histograms


Ungrouped data

Other types of graphical presentation are the relative frequency distribution, the cumulative frequency distribution, and the relative cumulative frequency distribution.

Relative frequency is calculated by dividing the frequency for each data value by the total. These calculations are shown in the 3rd column of Table 3. The graphical presentation is shown in Fig. 2.

Cumulative frequency is calculated by adding the frequency of each data value to the sum of the frequencies for the previous data values. These calculations are shown in the 4th column of Table 3. The graphical presentation is shown in Fig. 3.

Relative cumulative frequency is calculated by dividing the cumulative frequency for each data value by the total. These calculations are shown in the 5th column of Table 3. The graphical presentation is shown in Fig. 4.
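The three calculations just described can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` and the output layout are assumptions made here, while the 35 values are the daily billing errors of Table 1.

```python
from collections import Counter

# Daily billing errors from Table 1 (35 observations).
errors = [
    0, 1, 1, 2, 0, 1, 1,
    1, 5, 0, 1, 4, 3, 3,
    3, 4, 2, 1, 1, 4, 0,
    0, 1, 0, 1, 3, 0, 1,
    1, 2, 0, 2, 1, 0, 2,
]


def frequency_table(data):
    """Return rows of (value, frequency, relative, cumulative, relative cumulative)."""
    counts = Counter(data)
    total = len(data)
    rows = []
    cumulative = 0
    for value in sorted(counts):
        freq = counts[value]
        cumulative += freq  # running sum of the frequencies so far
        rows.append((value, freq, freq / total, cumulative, cumulative / total))
    return rows


for value, f, rel, cum, rel_cum in frequency_table(errors):
    print(f"{value:>5} {f:>9} {rel:>8.2f} {cum:>10} {rel_cum:>8.2f}")
```

Each printed row corresponds to one line of a relative frequency table; the last row should show a cumulative frequency of 35 and a relative cumulative frequency of 1.00.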

Table 3 Relative Frequency Distributions of Data

Number          Frequency   Relative        Cumulative     Relative cumulative
nonconforming               frequency       frequency      frequency
0               9           9/35 = 0.26     9              9/35 = 0.26
1               13          13/35 = 0.37    9+13 = 22      22/35 = 0.63
2               5           5/35 = 0.14     22+5 = 27      27/35 = 0.77
3               4           4/35 = 0.11     27+4 = 31      31/35 = 0.89
4               3           3/35 = 0.09     31+3 = 34      34/35 = 0.97
5               1           1/35 = 0.03     34+1 = 35      35/35 = 1.00
Total           35          1.00

[Fig. 2 Relative frequency histogram: relative frequency versus number nonconforming]

[Fig. 3 Cumulative frequency histogram: cumulative frequency versus number nonconforming]

[Fig. 4 Relative cumulative frequency histogram: relative cumulative frequency versus number nonconforming]

Grouped data
Most data are continuous rather than discrete and require grouping. The procedure is:
1. Collect data and construct a tally sheet.
2. Determine the range.
3. Determine the cell interval and the number of cells.
4. Determine the cell midpoints.
5. Post the cell frequency.
6. Construct the histogram.

Grouped data
1. Collect data and construct a tally sheet.

Individual observations representing the data are collected. Determine the minimum and maximum observations.

2. Determine the range.

The range is the difference between the highest observed value and the lowest observed value:

$R = X_H - X_L$

where $X_H$ = highest number and $X_L$ = lowest number.

Grouped data
3. Determine the cell interval and the number of cells.

The cell interval is the distance between adjacent cell midpoints, as shown in Fig. 5.

The cell interval ($i$) and the number of cells ($h$) are interrelated by the formula

$h = R / i$

Since $h$ and $i$ are both unknown, a trial-and-error approach is used to find the interval that will meet the following guidelines.

Grouped data
Guidelines to determine number of cells

In general, the number of cells should be between 5 and 20.

Use 5 to 9 cells when the number of observations is less than 100;

Use 8 to 17 cells when the number of observations is between 100 and 500; and

Use 15 to 20 cells when the number of observations is greater than 500.

Another method to determine the number of cells $h$:

$h = \sqrt{N}$

where $N$ is the number of observations.
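As a rough sketch of how the square-root rule and the relation $h = R/i$ fit together, the helper below computes the range, the number of cells, and the cell interval. The name `cell_plan` is hypothetical, and rounding the square root up to the next whole number is an assumption taken from the worked examples in this deck (10.49 becomes 11, 12.25 becomes 13).

```python
import math


def cell_plan(n, lowest, highest):
    """Return (range, number of cells, cell interval) using h = sqrt(N)."""
    data_range = highest - lowest       # R = XH - XL
    cells = math.ceil(math.sqrt(n))     # assumption: round sqrt(N) up
    interval = data_range / cells       # from h = R / i
    return data_range, cells, interval


# Figures from the two example problems in this deck.
print(cell_plan(110, 5.94, 6.05))
print(cell_plan(150, 0.1, 3.8))
```

In practice the computed interval is then rounded to a convenient value, for instance 0.28 to 0.3, which is where the trial-and-error step comes in.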

[Fig. 5 Cell classification: a cell with its interval (i), midpoint, lower boundary, and upper boundary]

Grouped data
4. Determine the cell midpoints.

The cell midpoint is determined by using the formula

$M_p = X_L + i/2$

where $M_p$ is the midpoint of the cell, $X_L$ is the lower boundary of the cell, and $i$ is the cell interval.

5. Post the cell frequency.

The cell frequency is the sum of the frequencies of the values within the cell boundaries. Make a tally of the values.

6. Construct the histogram.
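Steps 4 and 5 can be sketched as follows. The function names and the seven sample readings are hypothetical, and the convention that a value equal to a boundary belongs to the cell that starts there (lower boundary inclusive) is an assumption, not something the slides state.

```python
import math


def build_cells(lowest, interval, count):
    """Return (lower boundary, upper boundary, midpoint) for each cell."""
    cells = []
    for k in range(count):
        lower = lowest + k * interval
        # Midpoint formula from the text: Mp = XL + i / 2
        cells.append((lower, lower + interval, lower + interval / 2))
    return cells


def post_frequencies(data, lowest, interval, count):
    """Tally how many observations fall in each cell (lower boundary inclusive)."""
    frequencies = [0] * count
    for x in data:
        # Rounding guards against floating-point noise at cell boundaries.
        k = math.floor(round((x - lowest) / interval, 9))
        if 0 <= k < count:  # values outside the cells are ignored
            frequencies[k] += 1
    return frequencies


sample = [0.1, 0.5, 0.7, 0.9, 1.0, 1.2, 1.3]  # hypothetical readings
cells = build_cells(lowest=0.1, interval=0.3, count=5)
frequencies = post_frequencies(sample, lowest=0.1, interval=0.3, count=5)
for (lower, upper, mid), f in zip(cells, frequencies):
    print(f"{lower:.1f} - {upper:.1f}  midpoint {mid:.2f}  frequency {f}")
```

The resulting midpoints and frequencies are exactly what is plotted in step 6: one rectangle per cell, centered on the midpoint, with height equal to the frequency.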

Grouped data
Example problem 1
A company that fills bottles of oil tries to maintain a specific weight of the product. The table gives the weights of 110 bottles that were checked at random intervals. Make a tally of these weights and construct a frequency histogram (weights are in kg).

Example problem 1: weights of 110 oil bottles (kg)

6.00  5.98  6.01  6.01  5.97  5.99  5.98  6.01  5.99  5.98  5.96
5.98  5.99  5.99  6.03  5.99  6.01  5.98  5.99  5.97  6.01  5.98
5.97  6.01  6.00  5.96  6.00  5.97  5.95  5.99  5.99  6.01  6.00
6.01  6.03  6.01  5.99  5.99  6.02  6.00  5.98  6.01  5.98  5.99
6.00  5.98  6.05  6.00  6.00  5.98  5.99  6.00  5.97  6.00  6.00
6.00  5.98  6.00  5.94  5.99  6.02  6.00  5.98  6.02  6.01  6.00
5.97  6.01  6.04  6.02  6.01  5.97  5.99  6.02  5.99  6.02  5.99
6.02  5.99  6.01  5.98  5.99  6.00  6.02  5.99  6.02  5.95  6.02
5.96  5.99  6.00  6.00  6.01  5.99  5.96  6.01  6.00  6.01  5.98
6.0…  5.9…  5.9…  5.9…  6.0…  5.9…  6.0…  5.9…  6.0…  6.0…  5.9…

[the last digit of each value in the final row is not recoverable]

Grouped data
Example problem 1 Sol.

$R = X_H - X_L = 6.05 - 5.94 = 0.11$

$h = \sqrt{N} = \sqrt{110} = 10.49 \approx 11$

$h = R / i$
$11 = 0.11 / i$
$i = 0.11 / 11 = 0.01$

Grouped data
Example problem 1 Sol.

Group / cell   Frequency $f_i$
5.94
5.95
5.96
5.97
5.98           16
5.99           24
6.00           20
6.01           17
6.02           13
6.03
6.04
6.05
Total          110

Example problem 1 Sol.

[Histogram of oil bottle weights: frequency versus oil bottle weight (kg)]

Grouped data
Example problem 2
The relative strengths of 150 silver solder welds were tested, and the results are given in the table. Determine the cell interval and the approximate number of cells. Make a table showing cell midpoints, cell boundaries, and observed frequencies. Plot a frequency histogram.

Example problem 2: relative strength of 150 silver solder welds

1.5  1.2  3.1  1.3  0.7  1.3  3.4  1.3  1.7  2.6  1.1  0.8  0.1  2.9  1.0
1.3  2.6  1.7  1.0  1.5  2.2  3.0  2.0  1.8  0.3  0.7  2.4  1.5  0.7  2.1
2.9  2.5  2.0  3.0  1.5  1.3  3.5  1.1  0.7  0.5  1.6  1.4  2.2  1.0  1.7
3.1  2.7  2.3  1.7  3.2  3.0  1.7  2.8  2.2  0.6  2.0  1.4  3.3  2.2  2.9
1.8  2.3  3.3  3.1  3.3  2.9  1.6  2.3  3.3  2.0  1.6  2.7  2.2  1.2  1.3
1.4  2.3  2.5  1.9  2.1  3.4  1.5  0.8  2.2  3.1  2.1  3.5  1.4  2.8  2.8
1.8  2.4  1.2  3.7  1.3  2.1  1.5  1.9  2.0  3.0  0.9  3.1  2.9  3.0  2.1
1.8  1.1  1.4  1.9  1.7  1.5  3.0  2.6  1.0  2.8  1.8  1.8  2.4  2.3  2.2
2.9  1.8  1.4  1.4  3.3  2.4  2.1  1.2  1.4  1.6  2.4  2.1  1.8  2.1  1.6
0.9  2.1  1.5  2.0  1.1  3.8  1.3  1.3  1.0  0.9  2.9  2.5  1.6  1.2  2.4

Grouped data
Example problem 2 Sol.

$R = X_H - X_L = 3.8 - 0.1 = 3.7$

$h = \sqrt{N} = \sqrt{150} = 12.25 \approx 13$

$h = R / i$
$13 = 3.7 / i$
$i = 3.7 / 13 = 0.28 \approx 0.3$

Example problem 2 Sol.

Cell boundaries   Midpoint $x_i$   Frequency $f_i$
0.1 - 0.4         0.25             2
0.4 - 0.7         0.55             2
0.7 - 1.0         0.85             9
1.0 - 1.3         1.15             14
1.3 - 1.6         1.45             25
1.6 - 1.9         1.75             20
1.9 - 2.2         2.05             18
2.2 - 2.5         2.35             18
2.5 - 2.8         2.65             8
2.8 - 3.1         2.95             17
3.1 - 3.4         3.25             11
3.4 - 3.7         3.55             4
3.7 - 4.0         3.85             2
Total                              150

Example problem 2 Sol.

[Histogram of strength of silver solder welds: frequency versus strength]

Uses of Histogram

The histogram describes the variation in the process. It is used to:


1. Determine the process capability,
2. Compare with specifications,
3. Suggest the shape of the population, and
4. Indicate discrepancies in data such as gaps.

Fig. 6 Characteristics of Frequency Distribution Graphs

A smooth curve represents a population frequency distribution, whereas the histogram represents a sample frequency distribution.

[Figure panels: Symmetrical (Normal); Bimodal; Skewed to the Right; Peaked; Skewed to the Left; Flat]

Characteristics of Frequency Distribution Graphs

Frequency distribution graphs provide a basis for decision making without further analysis. They have certain identifiable characteristics:

Symmetry or lack of symmetry of the data. Are the data equally distributed on each side of the central value, or are the data skewed to the right or to the left?

Number of modes or peaks in the data.

Location of the data.

Spread of the data (quite peaked or flat).

[Figure 7 Differences due to location, spread, and shape]

Analysis of Histograms

Analysis of a histogram can provide information concerning specifications.

Fig. 8 shows a histogram for the percentage of wash concentration in a steel tube cleaning operation prior to painting.

No complex statistics are needed to show that corrective actions are needed to bring the spread of the distribution closer to the ideal value of 1.6%.

Concentrations less than 1.45% produce poor quality, while concentrations more than 1.75% are costly and therefore reduce productivity.

[Fig. 8 Histogram of wash concentration: frequency versus wash concentration (%), horizontal axis from 0.7 to 2.8, ideal value marked]

Interpreting Histogram
Fig. 9 Histogram Shapes

Interpreting Histogram

Normal. Many measured characteristics follow a normal distribution. The histogram is bell-shaped. The normal distribution is so common that if the histogram is not bell-shaped, we should ask ourselves why not.

Bimodal (or Multimodal). These histograms have two (bimodal) or many (multimodal) peaks. Such histograms result when the data come from two or more distributions. For example, if the data came from different suppliers, machines, shifts, and so on, a bimodal (or multimodal) histogram will signal large differences due to these causes.

Empty Interval. In this case, one of the intervals has zero frequency. This may result from bias in data collection.

Positive Skew. Positive skew means a long tail to the right. This is common when successful efforts are being made to minimize the measured value. Also, the sample variance has a positively skewed distribution.

Negative Skew. Negative skew means a long tail to the left. This is common when successful efforts are being made to increase the measured value. Such a histogram may also result if sorting is taking place.

Uniform. This histogram looks more like a rectangular distribution. Such a histogram can result if the process mean is not in control, as in the case when tool wear is taking place.

Outlier. Here one or more cells are greatly separated from the main body of the histogram. Such observations are often the result of wrong measurement or other mistakes.

The mean, standard deviation, and histogram provide extremely useful summaries of the data. However, they do not contain all the information in the data. In particular, data are often collected over time, and any time trends are lost in the summaries considered so far.

Test for normality


Histogram.

Visual examination of a histogram developed from a large amount of data will give an indication of the underlying population distribution.

If a histogram is unimodal, symmetrical, and tapers off at the tails, normality is a definite possibility, and this may be sufficient information in many practical situations.

The larger the sample size, the better the judgment of normality.

A minimum sample size of 50 is recommended.

Analytical Techniques

Measures of Central Tendency

A measure of central tendency is a numerical value that describes the central position of the data, or how the data tend to build up in the center.
There are 3 measures in common use:
1- The average
2- The median
3- The mode

Measures of Central Tendency

1- Average: the most commonly used measure, especially with symmetrical distributions. It is the sum of the observations divided by their number.

Ungrouped data:

$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \dots + X_n}{n}$

Grouped data:

$\bar{X} = \frac{\sum_{i=1}^{h} f_i X_i}{\sum_{i=1}^{h} f_i}$

where
$\bar{X}$ = average
$n$ = number of observed values
$X_i$ = observed values / midpoints of cells
$f_i$ = frequency of the $i$-th cell
$h$ = number of cells

Ungrouped data - Example

The resistance values of 5 coils, in ohms (Ω), are:

1. $X_1 = 3.35$
2. $X_2 = 3.37$
3. $X_3 = 3.28$
4. $X_4 = 3.34$
5. $X_5 = 3.30$

Average:

$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{3.35 + 3.37 + 3.28 + 3.34 + 3.30}{5} = 3.33$ Ω

Grouped data - Example 1

Given the frequency distribution of the life of 320 automotive tires in 1000 km as shown in Table 4, determine the average.

$\bar{X} = \frac{\sum_{i=1}^{9} f_i X_i}{\sum_{i=1}^{9} f_i} = \frac{11549}{320} = 36.1$ (in 1000 km) = 36 100 km

Table 4 Frequency Distribution of the life of 320 tires in 1000 km

Group         Midpoint $X_i$   Frequency $f_i$   Computation $f_i X_i$
23.6 - 26.5   25               4                 100
26.6 - 29.5   28               36                1008
29.6 - 32.5   31               51                1581
32.6 - 35.5   34               63                2142
35.6 - 38.5   37               58                2146
38.6 - 41.5   40               52                2080
41.6 - 44.5   43               34                1462
44.6 - 47.5   46               16                736
47.6 - 50.5   49               6                 294
Total                          320               11549
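The two averaging formulas can be checked with a short Python sketch. The function name `grouped_mean` is hypothetical. The tire midpoints and frequencies are those of Table 4; the first and last frequencies (4 and 6) are inferred here from the products 100 and 294 divided by their midpoints, which is an assumption about the damaged table rather than a value printed in it.

```python
from statistics import mean


def grouped_mean(midpoints, frequencies):
    """Average of grouped data: sum(f_i * X_i) / sum(f_i)."""
    total_fx = sum(f * x for x, f in zip(midpoints, frequencies))
    return total_fx / sum(frequencies)


# Ungrouped data: resistance of 5 coils.
coils = [3.35, 3.37, 3.28, 3.34, 3.30]
print(round(mean(coils), 2))  # 3.33

# Grouped data: life of 320 tires, in 1000 km.
midpoints = [25, 28, 31, 34, 37, 40, 43, 46, 49]
frequencies = [4, 36, 51, 63, 58, 52, 34, 16, 6]  # 4 and 6 are inferred
print(round(grouped_mean(midpoints, frequencies), 1))  # 36.1
```

Note that the grouped average treats every observation in a cell as if it sat at the cell midpoint, so it is an approximation of the average of the raw data.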

Grouped data - Example 2

The weights of 65 castings (in kg) are distributed as follows:

Midpoint $X_i$   Frequency $f_i$
3.5              6
3.8              9
4.1              18
4.4              14
4.7              13
5.0              5

1. Determine the average.
2. Plot a frequency histogram.
3. Evaluate the production process if the specs are 4.25 ± 0.60 kg.

Grouped data - Example 2 Sol.

Compute the column $f_i X_i$:

Midpoint $X_i$   Frequency $f_i$   Computation $f_i X_i$
3.5              6                 21
3.8              9                 34.2
4.1              18                73.8
4.4              14                61.6
4.7              13                61.1
5.0              5                 25
Total            65                276.7

$\bar{X} = \frac{\sum_{i=1}^{6} f_i X_i}{\sum_{i=1}^{6} f_i} = \frac{276.7}{65} = 4.26$ kg

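[Frequency histogram of casting weight (kg): cell midpoints from 3.5 to 5.0, with the target of 4.25 kg and the specification limits of 3.65 and 4.85 kg marked]

The process is centered and controlled, but not capable of meeting the specifications: some castings fall outside the limits of 3.65 and 4.85 kg.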

Measures of Central Tendency

2- The median: the value that divides a series of ordered observations so that the number of items above it is equal to the number below it. It is an effective measure for skewed distributions.

3- The mode: the value that occurs with the greatest frequency.

A series of numbers is referred to as:
Unimodal: if it has one mode
Bimodal: if it has two modes
Multimodal: if there are more than two modes

The Median

Example 1. Find the median distance for the following data:
85, 125, 130, 65, 100, 70, 75, 50, 140, 95, 70

Sol. Ordered data:
50, 65, 70, 70, 75, 85, 95, 100, 125, 130, 140

There is a single middle value, so Median = 85.

Example 2. Find the median distance for the following data:
85, 125, 130, 65, 100, 70, 75, 50, 140, 135, 95, 70

Sol. Ordered data:
50, 65, 70, 70, 75, 85, 95, 100, 125, 130, 135, 140

There are two middle values (85 and 95), so take their mean: Median = (85 + 95) / 2 = 90.

The Mode

The mode of a set of data is the value in the set that occurs most often.

A set of data can be bimodal. It is also possible to have a set of data with no mode.

Bimodal: 15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26
Unimodal: 6, 8, 9, 9, 9, 10, 11, 14, 15, 18
No mode: 2.7, 3.5, 4.9, 5.1, 8.3
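The median and mode examples can be reproduced with the Python standard library. The helper `modes` is a hypothetical name introduced here; treating a data set in which no value repeats as having "no mode" follows the example above and is encoded as an empty list.

```python
from collections import Counter
from statistics import median


def modes(data):
    """Return the most frequent value(s); an empty list means no mode."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:  # no value occurs more than once
        return []
    return sorted(v for v, c in counts.items() if c == top)


distances_odd = [85, 125, 130, 65, 100, 70, 75, 50, 140, 95, 70]
distances_even = distances_odd + [135]
print(median(distances_odd))   # 85
print(median(distances_even))  # 90.0

print(modes([15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26]))  # [18, 24]
print(modes([6, 8, 9, 9, 9, 10, 11, 14, 15, 18]))            # [9]
print(modes([2.7, 3.5, 4.9, 5.1, 8.3]))                      # []
```

`statistics.median` sorts the data itself and averages the two middle values when the count is even, which matches the manual procedure shown in the examples.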

Measures of Central Tendency


[Figure 10 Relationship among average, median, and mode. Symmetrical: average, median, and mode coincide. Positively skewed: mode, median, average (from left to right). Negatively skewed: average, median, mode (from left to right).]

Measures of Dispersion
Introduction

A second tool of statistics is composed of the measures of dispersion, which describe how the data are spread out or scattered on each side of the central value. Measures of dispersion and measures of central tendency are both needed to describe a collection of data.

Two common measures of dispersion are:
1- Range
2- Standard deviation

Measures of Dispersion
1- Range
The range of a series of numbers is the difference between the largest and smallest values or observations. Symbolically, it is given by the formula

$R = X_H - X_L$

where
$R$ = range
$X_H$ = highest observation in a series
$X_L$ = lowest observation in a series

Example problem
The weights of a sample of 10 bottles of shampoo are recorded as follows (in g):
150, 147, 152, 156, 144, 148, 149, 153, 146, 151
Determine the range of the sample.

Solution
Max weight $X_H$ = 156 g
Min weight $X_L$ = 144 g

Range $R = X_H - X_L = 156 - 144 = 12$ g

Measures of Dispersion
2- Standard deviation

The standard deviation is a numerical value, in the units of the observed values, that measures the spreading tendency of the data. A large standard deviation shows greater variability of the data than does a small standard deviation. In symbolic terms it is given by the formula

$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$

or

$s = \sqrt{\frac{n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n - 1)}}$

where
$s$ = sample standard deviation
$X_i$ = observed value
$\bar{X}$ = average
$n$ = number of observed values

Example Problem 1

Determine the standard deviation of the moisture content of a roll of Kraft paper. The results of six readings across the paper web are 6.7, 6.0, 6.4, 5.9, 6.4, and 5.8%.

$s = \sqrt{\frac{n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n - 1)}} = \sqrt{\frac{6(231.26) - (37.2)^2}{6(6 - 1)}} = 0.35\%$ (see Table 5)

Example Problem 1
Table 5 Measure of standard deviation

$X_i$    $X_i^2$
6.7      44.89
6.0      36.00
6.4      40.96
5.9      34.81
6.4      40.96
5.8      33.64
Total    37.2     231.26

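Example Problem 2

Four readings of the thickness of a paper are 0.076, 0.082, 0.073, and 0.077 mm. Determine the sample standard deviation.

Sol.

$s = \sqrt{\frac{n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n - 1)}} = \sqrt{\frac{4(0.023758) - (0.308)^2}{4(4 - 1)}}$

$= \sqrt{\frac{0.095032 - 0.094864}{12}} = \sqrt{\frac{0.000168}{12}} = \sqrt{0.000014} = 0.0037$ mm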

Example Problem 2
Measure of standard deviation

$X_i$    $X_i^2$
0.076    0.005776
0.082    0.006724
0.073    0.005329
0.077    0.005929
Total    0.308    0.023758
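Both forms of the sample standard deviation formula can be checked against the two example problems with a short sketch. The function names are hypothetical; the data are the six moisture readings and the four thickness readings from the examples.

```python
import math


def sample_std_definition(values):
    """s = sqrt( sum (Xi - Xbar)^2 / (n - 1) )."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


def sample_std_shortcut(values):
    """s = sqrt( (n * sum Xi^2 - (sum Xi)^2) / (n * (n - 1)) )."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))


moisture = [6.7, 6.0, 6.4, 5.9, 6.4, 5.8]      # percent
thickness = [0.076, 0.082, 0.073, 0.077]       # mm

print(round(sample_std_shortcut(moisture), 2))     # 0.35
print(round(sample_std_shortcut(thickness), 4))    # 0.0037
print(round(sample_std_definition(thickness), 4))  # 0.0037
```

The shortcut form is convenient for hand calculation with a table of $X_i$ and $X_i^2$, but it subtracts two nearly equal numbers, so in floating-point code the definition form is usually the safer choice.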

Relationship between the measures of dispersion (range & standard deviation)

The range is useful when the amount of data is small.

The standard deviation is used when a more precise measure of dispersion is desired (number of observations > 10).

As shown in Fig. 11, two distributions may have the same average and range but different standard deviations. The distribution on the bottom is much better: its sample standard deviation is much smaller, which means better quality.

[Fig. 11 Comparison of two distributions with equal average and range R]

Table 6 Analytical Technique Recap

Concept of a population and a sample

A sample is selected to represent the population.

Since the composition of samples will fluctuate, the computed statistics will be larger or smaller than their true population values (parameters).

Sampling is necessary when measuring the entire population is:
- impossible
- too expensive
- destructive
- too dangerous

We use different symbols to differentiate between samples and the population.

Table 7 Comparison of sample and population

Sample                              Population
Statistic                           Parameter
$\bar{X}$: average                  $\mu$ ($\bar{X}_0$): mean
$s$: sample standard deviation      $\sigma$ ($s_0$): standard deviation

Table 8 Results of 8 samples of green & blue spheres

Sample number   Sample size   No. of green spheres   No. of blue spheres   % of green spheres
[rows for samples 1 to 8, each of size 10, are not recoverable]
Total           80            15                     65                    18.8

Comparison of sample and population

Table 8 shows the results of an experiment that illustrates the relationship between samples and the population.

A container holds 800 blue and 200 green spheres. The 1000 spheres are considered the population, with 20% green spheres.

Eight samples of 10 spheres each are selected; each sphere is checked for colour and replaced (one by one).

The table illustrates the difference between the sample results and what should be expected from the known population.
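The sphere experiment is easy to imitate in code. The sketch below is a simulation under stated assumptions, not a reproduction of the recorded results: the function name, the seed, and the use of Python's pseudo-random generator are all choices made here for illustration.

```python
import random


def draw_samples(num_samples=8, sample_size=10, green=200, blue=800, seed=1):
    """Return the number of green spheres found in each simulated sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    population = ["green"] * green + ["blue"] * blue
    results = []
    for _ in range(num_samples):
        # Each sphere is replaced after inspection, so draws are independent.
        sample = [rng.choice(population) for _ in range(sample_size)]
        results.append(sample.count("green"))
    return results


greens = draw_samples()
for number, g in enumerate(greens, start=1):
    print(f"sample {number}: {g} green, {10 - g} blue, {10 * g}% green")
print(f"overall: {100 * sum(greens) / 80:.1f}% green (population value: 20%)")
```

Running it with different seeds shows the same behaviour as the table: individual samples scatter widely around 20% green, while the pooled percentage tends to sit closer to the population value.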

The Normal Curve

One type of population that is quite common is called the normal curve. The normal curve is a symmetrical, unimodal, bell-shaped distribution with the mean, median, and mode having the same value.

A population curve or distribution is developed from a frequency histogram.

As the sample size of a histogram gets larger and larger, the cell interval can be made smaller and smaller.

When the sample size is quite large and the cell interval is very small, the histogram takes on the appearance of a smooth polygon or a curve representing the population.

Much of the variation in nature and in industry follows the frequency distribution of the normal curve.

A curve of the normal population of 1000 observations of the resistance in ohms of an electrical device with population mean, $\mu$, of 90 and population standard deviation, $\sigma$, of 2 is shown in Figure 12. The interval between dotted lines is equal to one standard deviation, $\sigma$.

[Figure 12 The normal curve: frequency versus resistance in ohms, with the horizontal axis marked at 84, 86, 88, 90, 92, 94, and 96]

The standardized normal distribution

All normal distributions of continuous variables can be converted to the standardized normal distribution (see Fig. 13) by using the standardized normal value Z.

For example, consider the value of 92 in Fig. 12, which is one standard deviation above the mean. Conversion to the Z value is

$Z = \frac{X_i - \mu}{\sigma} = \frac{92 - 90}{2} = +1$

[Figure 13 The standardized normal distribution: mean 0, standard deviation 1, horizontal axis in Z units]

The standardized normal distribution

Fig. 13 shows the standardized curve with its mean of zero and standard deviation of 1. The area under the curve is equal to 1.0 or 100% and therefore can easily be used for probability calculations.

A normal area table is provided as Table A in the appendix.

Relationship to the mean and standard deviation

Fig. 14 shows three normal curves with different mean values and the same standard deviation. The only change is in location.

Fig. 15 shows three normal curves with the same mean value but different standard deviations. The figure illustrates the principle that the larger the standard deviation, the flatter the curve, and the smaller the standard deviation, the more peaked the curve.

Note that the two parameters (mean and standard deviation) are independent.

[Figure 14 Normal curves with different means ($\mu$ = 14, $\mu$ = 20, $\mu$ = 29) but identical standard deviations]

[Figure 15 Normal curves with different standard deviations ($\sigma$ = 1.5, $\sigma$ = 3, $\sigma$ = 4.5) but identical means]

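[Figure 16 Percent of items included between certain values of the standard deviation: 68.26% between $\mu - 1\sigma$ and $\mu + 1\sigma$; 95.46% between $\mu - 2\sigma$ and $\mu + 2\sigma$; 99.73% between $\mu - 3\sigma$ and $\mu + 3\sigma$]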

Applications

The areas under the curve for various Z values are given in Table A in the appendix. Table A, "Areas under the Normal Curve," is a left-reading table, which means that the given areas are for that portion of the curve from $-\infty$ to a particular value, $X_i$.

The first step is to determine the Z value using the formula

$Z = \frac{X_i - \mu}{\sigma}$

where
$Z$ = standard normal value
$X_i$ = individual value
$\mu$ = mean
$\sigma$ = population standard deviation

Example problem (1)

The mean value of the weight of a particular brand of cereal for the past year is 0.297 kg (10.5 oz) with a standard deviation of 0.024 kg. Assuming a normal distribution, find the percent of the data that falls below the lower specification limit of 0.274 kg. (Note: Since the mean and standard deviation were determined from a large number of tests during the year, they are considered to be valid estimates of the population values.)

[Figure: normal curve with $\mu$ = 0.297 and $\sigma$ = 0.024; Area1 lies to the left of $X_i$ = 0.274]

Example problem (solution)

$Z = \frac{X_i - \mu}{\sigma} = \frac{0.274 - 0.297}{0.024} = -0.96$

From Table A it is found that for Z = -0.96, Area1 = 0.1685 or 16.85%.
Thus, 16.85% of the data are less than 0.274 kg.

Example problem (2)

Using the data from the preceding problem, determine the percentage of the data that fall above 0.347 kg.

Sol.
Since Table A is a left-reading table, the solution to this problem requires the use of the relationship Area1 + Area2 = AreaT = 1.0000. Therefore, Area2 is determined and subtracted from 1.0000 to obtain Area1.

[Figure: normal curve with $\mu$ = 0.297 and $\sigma$ = 0.024, total area AreaT = 1.0000; Area2 lies to the left of $X_i$ = 0.347 and Area1 to the right]

Example problem (solution)

$Z_2 = \frac{X_i - \mu}{\sigma} = \frac{0.347 - 0.297}{0.024} = +2.08$

From Table A it is found that for Z2 = +2.08, Area2 = 0.9812.
Area1 = AreaT - Area2
= 1.0000 - 0.9812
= 0.0188 or 1.88%
Thus, 1.88% of the data are above 0.347 kg.

Example problem (3)

A large number of tests of line voltage to home residences show a mean of 118.5 V and a population standard deviation of 1.20 V. Determine the percentage of data between 116 and 120 V.

Since Table A is a left-reading table, the solution requires that the area to the left of 116 V be subtracted from the area to the left of 120 V. The graph and calculations show the technique.

[Figure: normal curve with $\mu$ = 118.5 and $\sigma$ = 1.20; Area2 lies to the left of $X_i$ = 116, Area3 to the left of $X_i$ = 120, and Area1 between 116 and 120]

Example problem (solution)

$Z_2 = \frac{X_i - \mu}{\sigma} = \frac{116 - 118.5}{1.20} = -2.08$

$Z_3 = \frac{X_i - \mu}{\sigma} = \frac{120 - 118.5}{1.20} = +1.25$

From Table A it is found that for Z2 = -2.08, Area2 = 0.0188, and for Z3 = +1.25, Area3 = 0.8944.
Area1 = Area3 - Area2
= 0.8944 - 0.0188
= 0.8756 or 87.56%
Thus, 87.56% of the data are between 116 and 120 V.
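The three kinds of table look-up used so far (area below a value, area above a value, area between two values) can be cross-checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of Table A; the variable names are illustrative. Because the table method rounds Z to two decimals first, the computed areas agree with the table answers only to about three decimal places.

```python
from statistics import NormalDist

# Examples 1 and 2: cereal weight, mean 0.297 kg, standard deviation 0.024 kg.
cereal = NormalDist(mu=0.297, sigma=0.024)
below = cereal.cdf(0.274)        # left-tail area, as read from a left-reading table
above = 1 - cereal.cdf(0.347)    # total area minus the left-tail area

# Example 3: line voltage, mean 118.5 V, standard deviation 1.20 V.
voltage = NormalDist(mu=118.5, sigma=1.20)
between = voltage.cdf(120) - voltage.cdf(116)

print(f"below 0.274 kg:       {below:.4f}")
print(f"above 0.347 kg:       {above:.4f}")
print(f"between 116 and 120V: {between:.4f}")
```

`cdf` plays the role of the left-reading table: it returns the area from minus infinity up to the given value, so the "above" and "between" cases are obtained by subtraction exactly as in the worked solutions.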

Example problem (4)

If it is desired to have 12.1% of the line voltage below 115 V, how should the mean voltage be adjusted? The dispersion is $\sigma$ = 1.20 V.

The solution to this type of problem is the reverse of the other problems. First, 12.1%, or 0.1210, is found in the body of Table A. This gives a Z value, and using the formula for Z, we can solve for the mean voltage. From Table A with Area1 = 0.1210, the Z value of -1.17 is obtained.

[Figure: normal curve with $\sigma$ = 1.20 and unknown mean $\bar{X}_0$; Area1 = 0.1210 lies to the left of $X_i$ = 115]

Example problem (solution)

$Z = \frac{X_i - \bar{X}_0}{\sigma}$

$-1.17 = \frac{115 - \bar{X}_0}{1.20}$

$\bar{X}_0 = 116.4$ V

Thus, the mean voltage should be centered at 116.4 V for 12.1% of the values to be less than 115 V.
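The reverse look-up can also be sketched with `statistics.NormalDist`, whose `inv_cdf` method returns the Z value for a given left-tail area. The helper name `mean_for_tail` is hypothetical.

```python
from statistics import NormalDist


def mean_for_tail(limit, sigma, fraction_below):
    """Mean that leaves `fraction_below` of a normal population under `limit`."""
    z = NormalDist().inv_cdf(fraction_below)  # Z for the given left-tail area
    # From Z = (limit - mean) / sigma, solve for the mean.
    return limit - z * sigma


target_mean = mean_for_tail(limit=115, sigma=1.20, fraction_below=0.121)
print(f"{target_mean:.1f} V")
```

Since the area is below 0.5, the Z value is negative, and subtracting a negative multiple of sigma moves the mean above the 115 V limit, as in the worked solution.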

Example problem (5)

The population mean weight of a company's racing bicycle is 9.07 kg with a population standard deviation of 0.4 kg. If the distribution is approximately normal, determine
A) the % of bicycles less than 8.3 kg
B) the % of bicycles greater than 10.00 kg
C) the % of bicycles between 8.3 and 10.00 kg

[Figure: normal curve with $\mu$ = 9.07 and $\sigma$ = 0.4; Area1 lies to the left of $X_a$ = 8.3, Area2 to the left of $X_b$ = 10, Area3 to the right of $X_b$, and Area4 between $X_a$ and $X_b$]

Example problem 5 (solution)

$Z_a = \frac{X_a - \mu}{\sigma} = \frac{8.3 - 9.07}{0.4} = -1.925$

$Z_b = \frac{X_b - \mu}{\sigma} = \frac{10 - 9.07}{0.4} = +2.325$

a) From Table A it is found that for Za = -1.925, Area1 = 0.0271.
Thus, 2.71% of the bicycles weigh less than 8.3 kg.

b) For Zb = +2.325, Area2 = 0.9899.
Thus, 98.99% of the bicycles weigh less than 10 kg.
Bicycles weighing more than 10 kg: Area3 = 1 - 0.9899 = 0.0101 or 1.01%.

c) Area4 = Area2 - Area1
= 0.9899 - 0.0271
= 0.9628 or 96.28%
Thus, 96.28% of the bicycles are between 8.3 and 10 kg.

Example problem (6)

Plastic strips that are used in a sensitive electronic device are manufactured to a maximum specification of 305.70 mm and a minimum specification of 304.55 mm. If the strips are less than the minimum specification, they are scrapped; if greater than the maximum specification, they are reworked. The part dimensions are normally distributed with a population standard deviation of 0.25 mm. What % of the product is scrap? What % is rework? How can the process be centered to eliminate all but 0.1% of the scrap? What is the rework % then?

[Figure: normal curve with $\sigma$ = 0.25; Area1 lies to the left of $X_{min}$ = 304.55 and Area2 to the left of $X_{max}$ = 305.7]

Example problem 6 (solution)

With the process centered midway between the specifications:

$\mu = \frac{X_{min} + X_{max}}{2} = \frac{304.55 + 305.70}{2} = 305.125$ mm

Scrap:

$Z_1 = \frac{X_{min} - \mu}{\sigma} = \frac{304.55 - 305.125}{0.25} = -2.3$

From Table A it is found that for Z1 = -2.3, Area1 = 0.0107.
Thus, 1.07% of the strips are scrapped.

Rework:

$Z_2 = \frac{X_{max} - \mu}{\sigma} = \frac{305.7 - 305.125}{0.25} = +2.3$

From Table A it is found that for Z2 = +2.3, Area2 = 0.9893.
Thus, % of rework = 1 - 0.9893 = 0.0107 = 1.07%.

Recentering for 0.1% scrap:

From Table A it is found that an area of 0.0010 (0.1%) corresponds to Z = -3.09.

$-3.09 = \frac{304.55 - \mu}{0.25}$

$\mu = 304.55 + 3.09(0.25) = 305.32$ mm

Rework with the new centering:

$Z = \frac{X_{max} - \mu}{\sigma} = \frac{305.70 - 305.32}{0.25} = +1.52$

From Table A it is found that for Z = +1.52, area = 0.9357.
Thus, rework % = 1 - 0.9357 = 0.0643 = 6.43%.
