
Statistical Process Control

(SPC)
Lecture outcomes
Describe Categories of SQC
Explain the use of descriptive statistics in
measuring quality characteristics
Identify and describe causes of variation
Describe the use of control charts
Identify the differences between x-bar, R-, p-,
and c-charts
Explain process capability and process
capability index
Explain the concept six-sigma
Explain the process of acceptance sampling
and describe the use of OC curves
Describe the challenges inherent in
measuring quality in service organizations
Three SQC Categories
Statistical quality control (SQC) is the term used to describe the
set of statistical tools used by quality professionals
SQC encompasses three broad categories:
Descriptive statistics
e.g. the mean, standard deviation, and range
Statistical process control (SPC)
Involves inspecting the output from a process
Quality characteristics are measured and charted
Helpful in identifying in-process variations (control charts)
Acceptance sampling
used to randomly inspect a batch of goods to determine
acceptance/rejection
Does not help to catch in-process problems
The Engineering Method and Statistical
Thinking

An engineer is someone who solves problems of
interest to society by the efficient application
of scientific principles. Engineers accomplish this by
either refining an existing product or process or by
designing a new product or process that meets
customers’ needs. The engineering, or scientific,
method is the approach to formulating and solving
these problems. The steps in the engineering
method are as follows:
1. Develop a clear and concise description of the
problem.
2. Identify, at least tentatively, the important factors
that affect this problem or that may play a role in its
solution.
3. Propose a model for the problem, using scientific or
engineering knowledge of the phenomenon being
studied. State any limitations or assumptions of the
model.
4. Conduct appropriate experiments and collect data to
test or validate the tentative model or conclusions
made in steps 2 and 3.
5. Refine the model on the basis of the
observed data.
6. Manipulate the model to assist in developing
a solution to the problem.
7. Conduct an appropriate experiment to
confirm that the proposed solution to the
problem is both effective and efficient.
8. Draw conclusions or make recommendations
based on the problem solution.
The field of statistics deals with the:
collection,
presentation,
analysis,
and use of data to make decisions, solve
problems, and design products and
processes.
Many aspects of engineering practice
involve working with data.
Knowledge of statistics is important to
any engineer.
Statistical techniques are a powerful
aid in designing new products and
systems, improving existing designs,
and designing, developing, and
improving production processes.
Statistical methods are used to help us
describe and understand variability.

By variability, we mean that successive observations
of a system or phenomenon do not produce exactly
the same result.
We all encounter variability in our everyday lives,
and statistical thinking can give us a useful way to
incorporate this variability into our decision-making
processes.
Descriptive Statistics
Descriptive statistics are used to describe the
basic features of the data in a study. They
provide simple summaries about the sample
and the measures. Together with simple
graphics analysis, they form the basis of
virtually every quantitative analysis of data.
Data Categories

Data

Quantitative (numerical)
Qualitative (categorical)
Data, Information and Knowledge
Data... data is raw. It simply exists and has no
significance beyond its existence (in and of itself). It
can exist in any form, usable or not. It does not have
meaning of itself. In computer parlance, a
spreadsheet generally starts out by holding data.
Information... information is data that has been
given meaning by way of relational connection. This
"meaning" can be useful, but does not have to be. In
computer parlance, a relational database makes
information from the data stored within it.
Knowledge... knowledge is the appropriate
collection of information, such that its intent is to be
useful. Knowledge is a deterministic process.
Population vs. Sample
(Diagram: a sample drawn from a population.)
Samples
Sample Defined:
A subset of a population.

A Representative Sample
Has the characteristics of the population

Census - A sample that contains all
items in the population
Frequency Distribution

A table that divides the data into classes
and shows the number of observed values
that fall into each class.
These data represent the record high
temperatures in ˚F for 50 factories. Construct a
grouped frequency distribution for the data
using 7 classes
112 100 127 120 134 118 105 110 109 112
110 118 117 116 118 122 114 114 105 109
107 112 114 115 118 117 118 122 106 110
116 108 110 121 113 120 119 111 104 111
120 113 120 117 105 110 118 112 114 114
The completed frequency table

Class limits   Class boundaries   Tally                  Frequency   Cum. Freq.
100 - 104      99.5 - 104.5       //                     2           2
105 - 109      104.5 - 109.5      //// ///               8           10
110 - 114      109.5 - 114.5      //// //// //// ///     18          28
115 - 119      114.5 - 119.5      //// //// ///          13          41
120 - 124      119.5 - 124.5      //// //                7           48
125 - 129      124.5 - 129.5      /                      1           49
130 - 134      129.5 - 134.5      /                      1           50

n = Σf = 50
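The tallying above can be reproduced programmatically. A minimal Python sketch (variable names are illustrative, not from the slides):

```python
# Raw record-high temperatures (in ˚F) from the example above.
temps = [112, 100, 127, 120, 134, 118, 105, 110, 109, 112,
         110, 118, 117, 116, 118, 122, 114, 114, 105, 109,
         107, 112, 114, 115, 118, 117, 118, 122, 106, 110,
         116, 108, 110, 121, 113, 120, 119, 111, 104, 111,
         120, 113, 120, 117, 105, 110, 118, 112, 114, 114]

width = 5                       # class width: (134 - 100) / 7 classes, rounded up
lows = range(100, 135, width)   # lower class limits: 100, 105, ..., 130

# Count observations falling in each class [low, low + width - 1].
freq = [sum(low <= t <= low + width - 1 for t in temps) for low in lows]

# Running (cumulative) frequencies.
cum = []
total = 0
for f in freq:
    total += f
    cum.append(total)
```

This reproduces the Frequency and Cum. Freq. columns of the completed table.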
Data Measures
Measures of Location
Mean
Median
Mode
Mean
Another name for average.
If describing a population, denoted as µ ,
the Greek letter “mu”.
If describing a sample, denoted as x̄,
called “x-bar”.
Appropriate for describing measurement
data.
Seriously affected by unusual values
called “outliers”.
Calculating Sample Mean
Formula: x̄ = ( Σ xi ) / n
That is, add up all of the data points and
divide by the number of data points.
Data (# of classes skipped): 2 8 3 4 1
Sample Mean = (2+8+3+4+1)/5 = 3.6
Do not round! Mean need not be a whole
number.
With outliers
2 8 3 4 1 95
Sample Mean =
(2+8+3+4+1+95)/6 = 18.8
Without outlier mean = 3.6
The mean is seriously affected
by outliers
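The outlier effect can be checked directly. A minimal Python sketch of the two calculations above:

```python
skipped = [2, 8, 3, 4, 1]
mean = sum(skipped) / len(skipped)                # (2+8+3+4+1)/5 = 3.6

with_outlier = skipped + [95]
mean_out = sum(with_outlier) / len(with_outlier)  # 113/6 ≈ 18.8

# One extreme value drags the mean far from the bulk of the data.
```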
Calculating Population Mean
Formula:
µ = ( Σ xi ) / N
where N is the population size.
Median
Another name for 50th percentile.
Appropriate for describing
measurement data.
“Robust to outliers,” that is, not
affected much by unusual values.
Calculating Sample Median
Order data from smallest to largest.

If odd number of data points, the median is
the middle value.
Data (# of classes skipped): 2 8 3 4 1
Ordered Data: 1 2 3 4 8
Median = 3
Calculating Sample Median
Order data from smallest to largest.

If even number of data points, the median is
the average of the two middle values.
Data (# of classes skipped): 2 8 3 4 1 8

Ordered Data: 1 2 3 4 8 8

Median = (3+4)/2 = 3.5


Sample Median with outlier

2 95 3 4 1 8

Ordered Data: 1 2 3 4 8 95

Median = (3+4)/2 = 3.5
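The odd/even rule and the robustness to outliers can both be seen in a minimal Python sketch:

```python
def median(data):
    """Middle value (odd n) or average of the two middle values (even n)."""
    s = sorted(data)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

m_odd = median([2, 8, 3, 4, 1])          # ordered: 1 2 3 4 8 -> 3
m_even = median([2, 8, 3, 4, 1, 8])      # ordered: 1 2 3 4 8 8 -> 3.5
m_outlier = median([2, 95, 3, 4, 1, 8])  # the outlier leaves the median at 3.5
```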


Mode
The value that occurs most frequently.
One data set can have many modes.
Appropriate for all types of data, but most
useful for categorical data or discrete data
with only a small number of possible values.
Example: 2 8 3 4 1 8
Mode = 8
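Since one data set can have several modes, a sketch should return all of them. A minimal Python version (function name is illustrative):

```python
from collections import Counter

def modes(data):
    """All values tied for the highest frequency (a data set can be multimodal)."""
    counts = Counter(data)
    top = max(counts.values())
    return [value for value, c in counts.items() if c == top]

m = modes([2, 8, 3, 4, 1, 8])   # [8]
mm = modes([1, 1, 2, 2, 3])     # bimodal: 1 and 2 both appear twice
```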
Most appropriate
measure of location
Depends on whether or not data are
“symmetric” or “skewed”.

Depends on whether or not data have one
(“unimodal”) or more (“multimodal”) modes.
Symmetric versus Skewed
Symmetric
Data is symmetric if the left half of its
histogram is roughly a mirror image of
its right half.

Skewed
Data is skewed if it is not symmetric
and if it extends more to one side than
the other.
Skewness
Choosing Appropriate
Measure of Location
If data are symmetric, the mean, median,
and mode will be approximately the same.
If data are multimodal, report the mean,
median and/or mode for each subgroup.
If data are skewed, report the median.
Measures of Variability
Range
Variance and standard deviation
Coefficient of variation

All of these measures are appropriate for
measurement data only.
Range
The difference between largest and
smallest data point.
Highly affected by outliers.
Best for symmetric data with no
outliers.
SAMPLE RANGE
R = largest value - smallest value

or, equivalently

R = xmax - xmin
Variance

1. Find the difference between
each data point and the mean.
2. Square the differences, and
add them up.
3. Divide by one less than the
number of data points.
Variance
If measuring variance of population, denoted
by σ2 (“sigma-squared”).
If measuring variance of sample, denoted by
s2 (“s-squared”).
Measures average squared deviation of data
points from their mean.
Highly affected by outliers. Best for symmetric
data.
SAMPLE VARIANCE
s² = Σ ( xi − x̄ )² / ( n − 1 )
Standard deviation
Sample standard deviation is square root
of sample variance, and so is denoted by
s.
Units are the original units.
Measures average deviation of data points
from their mean.
Also, highly affected by outliers.
What is a standard deviation?
SAMPLE STANDARD
DEVIATION
s = √[ Σ ( xi − x̄ )² / ( n − 1 ) ]
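The three variance steps (deviations, squared sum, divide by n − 1) translate directly to code. A minimal Python sketch, reusing the classes-skipped data:

```python
import math

def sample_variance(data):
    n = len(data)
    xbar = sum(data) / n
    # Squared deviations from the mean, divided by n - 1.
    return sum((x - xbar) ** 2 for x in data) / (n - 1)

def sample_stdev(data):
    # Square root of the variance restores the original units.
    return math.sqrt(sample_variance(data))

classes_skipped = [2, 8, 3, 4, 1]
s2 = sample_variance(classes_skipped)   # 29.2 / 4 = 7.3
s = sample_stdev(classes_skipped)       # ≈ 2.70
```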
Coefficient of Variation
Ratio of sample standard deviation to
sample mean multiplied by 100.
Measures relative variability, that is,
variability relative to the magnitude of the
data.
Unitless, so good for comparing variation
between two groups.
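Because the CV is unitless, it lets us compare spread across groups measured on different scales. A minimal Python sketch with hypothetical data (the two groups are illustrative, not from the slides):

```python
def coeff_variation(data):
    # CV = (s / x̄) * 100: standard deviation relative to the mean, in percent.
    n = len(data)
    xbar = sum(data) / n
    s = (sum((x - xbar) ** 2 for x in data) / (n - 1)) ** 0.5
    return s / xbar * 100

# Same absolute spread, different magnitudes:
weights_kg = [60, 65, 70, 75, 80]
heights_cm = [160, 165, 170, 175, 180]

cv_w = coeff_variation(weights_kg)   # ≈ 11.3%
cv_h = coeff_variation(heights_cm)   # ≈ 4.7%: less variable relative to its mean
```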
The most appropriate measure of
variability depends on …

the shape of the data’s distribution.
Choosing Appropriate
Measure of Variability
Measures of Variation -
Some Comments

Range is the simplest, but is very sensitive
to outliers.
Variance units are the square of the
original units.
We will use the standard deviation as a
measure of variation often in this course.
Statistical Process Control (SPC)
is a technique for measuring,
understanding and controlling variation in a
manufacturing process. It is one tool
among many used in a broader vision of
business operations which includes ideas like
continuous improvement, in-process quality,
teams and the use of statistics to know the
unknowable.
Probability Distributions
Binomial Distribution
The binomial probability distribution is one of the most
widely used discrete probability distributions.
Probability that an event will occur x times out
of n performances of an experiment.
Binomial distribution is applied to experiments that satisfy
the four conditions of
binomial experiment
Each repetition of a binomial experiment is called a
trial or a Bernoulli trial
“x successes in n trials”
Conditions of Binomial
Experiment
Bernoulli Trials

1. There are a fixed number of identical trials
2. Each trial has only two possible outcomes
(success or failure)
3. The probabilities of the 2 outcomes are constant
4. The trials are independent
Binomial distribution
Function
The probability distribution of the random variable X is
called a binomial distribution, and is given by the
binomial formula:
f(x) = C(n, x) p^x (1 − p)^(n−x) , for x = 0, 1, 2, …, n
where
n = the number of trials
x = the number of successes (0, 1, 2, …, n)
p = the probability of success in a single trial
q = the probability of failure in a single trial (i.e. q = 1 − p)
C(n, x) is the number of combinations of n items taken x at a time
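The binomial formula can be written out directly with the combination function. A minimal Python sketch:

```python
from math import comb

def binomial_pmf(x, n, p):
    # f(x) = C(n, x) * p^x * (1 - p)^(n - x)
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

p_two_heads = binomial_pmf(2, 3, 0.5)   # P(2 heads in 3 fair flips) = 3/8
total = sum(binomial_pmf(x, 3, 0.5) for x in range(4))  # probabilities sum to 1
```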
The Mean of a Probability
Distribution
µ = Σ xi · f(xi)
If the random variable has a binomial distribution,
the mean of the binomial distribution is
µ = n · p
where n = number of trials and p = probability of
success on an individual trial.
Example:
Find the mean of the probability distribution
of the number of heads obtained in 3 flips of a
balanced coin.
Solution: n = 3 , p = 1/2
so :
µ = 3 · 1/2 = 3/2 = 1.5
Variance of a Probability
Distribution
If x is a random variable whose probability
distribution has mean µ then its deviation from the mean
is x - µ

We define the variance of the probability distribution
to be the expected value (mathematical expectation) of the
squared deviation from the mean, namely
σ² = Σ ( x − µ )² · f(x)
Standard Deviation of the
Binomial Distribution

σ = √( n · p · (1 − p) )
Example: Find the variance and standard
deviation of the probability distribution of the
number of heads obtained by 3 flips of a coin.
Solution: ( binomial distribution !)
n = 3, p = 1/2, hence 1 - p = 1/2 , so
σ2 = 3 · 1/2 · 1/2 = 3/4
and σ = √3/4 = √3 /2
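The shortcut formulas µ = np and σ² = np(1 − p) can be checked against the defining sums. A minimal Python sketch for the coin example:

```python
from math import comb

n, p = 3, 0.5
pmf = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

mu = sum(x * f for x, f in enumerate(pmf))               # Σ x · f(x)
var = sum((x - mu) ** 2 * f for x, f in enumerate(pmf))  # Σ (x - µ)² · f(x)

# mu = 1.5 = n·p and var = 0.75 = n·p·(1 - p), as in the worked examples.
```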
Continuous Random Variables
Continuous random variable: the outcome
can be any value in an interval or collection
of intervals.

Probability density function for a continuous random
variable X is a curve such that the area under the curve over
an interval equals the probability that X is in that interval.
P(a ≤ X ≤ b) = area under density curve over the
interval between the values a and b.
Continuous probability
density functions
The curve describes the probability of getting
any range of values, say P(X > 120), P(X<100),
P(110 < X < 120)
Area under the curve = probability
Area under whole curve = 1
Probability of getting specific number is 0, e.g.
P(X=120) = 0
The most important continuous probability
distribution is the normal distribution.
Random variables with a normal distribution have a
bell-shaped probability distribution (histogram).
If X has a normal distribution (i.e. X is normally
distributed) with mean µ and standard deviation σ,
then we denote this
X ~ N ( µ, σ )
Characteristics of Normal distribution
Symmetric, bell-shaped curve.
Shape of curve depends on population mean µ
and standard deviation σ.
Center of distribution is µ.
Spread is determined by σ.
Most values fall around the mean, but some
values are smaller and some are larger.
Probability = Area under curve
Probability student scores higher than 75?
(Figure: normal density curve over Grades from 55 to 85, with the
area under the curve to the right of 75 shaded as P(X > 75).)
The Standard Normal Distribution
It makes life a lot easier for us if we
standardize our normal curve, with a mean
of zero and a standard deviation of 1 unit.
If we have the standardized situation of
µ = 0 and σ = 1, then we have the
Standard Normal Curve, Z ~ N(0, 1).
We can transform all the observations of any
normal random variable X with mean µ and
variance σ² to a new set of observations of
another normal random variable Z with mean
0 and variance 1 using the transformation:
Z = ( X − µ ) / σ
In the previous graph, we have indicated the
areas between the regions as follows:
-1 ≤ Z ≤ 1    68.27%
-2 ≤ Z ≤ 2    95.45%
-3 ≤ Z ≤ 3    99.73%
This means that 68.27% of the scores lie
within 1 standard deviation of the mean.
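These coverage percentages follow from the standard normal CDF, which can be evaluated with the error function. A minimal Python sketch:

```python
from math import erf, sqrt

def phi(z):
    # Standard normal CDF, Φ(z), via the error function.
    return 0.5 * (1 + erf(z / sqrt(2)))

coverages = [phi(k) - phi(-k) for k in (1, 2, 3)]
# ≈ 0.6827, 0.9545, 0.9973: the percentages quoted above
```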
Example
Suppose you must establish regulations
concerning the maximum number of people
who can occupy a lift.
You know that the total weight of 8 people
chosen at random follows a normal
distribution with a mean of 550kg and a
standard deviation of 150kg.
What’s the probability that the total weight of
8 people exceeds 600kg?
First sketch a diagram.
The mean is 550kg and we are interested in
the area that is greater than 600kg.
z = ( x − µ ) / σ
Here x = 600kg,
µ, the mean = 550kg
σ, the standard deviation = 150kg
z = ( 600 − 550 ) / 150
z = 50 / 150
z ≈ 0.33
Look in the table down the left hand column
for z = 0.3, and across under 0.03.
The number in the table is the tail area for
z=0.33 which is 0.3707 .
This is the probability that the weight will
exceed 600kg.
Our answer is:
"The probability that the total weight of 8
people exceeds 600kg is 0.37 correct to 2
figures."
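The same answer can be computed without a table by using the error function for the normal CDF. A minimal Python sketch of the lift example:

```python
from math import erf, sqrt

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 550, 150     # total weight of 8 people, in kg
x = 600

z = (x - mu) / sigma     # 50 / 150 ≈ 0.33
p_exceed = 1 - phi(z)    # upper-tail area, ≈ 0.37 as in the table lookup
```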
Statistical Estimation
Parameter Estimation:
Using information from a single
random sample to estimate the
value of an unknown
parameter:
Point Estimator:
A rule or formula used to
calculate a sample statistic
which corresponds to the
unknown parameter, e.g.,
x̄ = (1/n) Σ xi = µ̂
Interval Estimator:
An interval between two values θ̂L and θ̂U that will
enclose the parameter θ with a specified
probability (confidence level):
P ( θ̂L ≤ θ ≤ θ̂U ) = 1 − α
Where,
θ̂L is the lower confidence limit, constructed
such that (α/2)·100% of the sampling
distribution of θ̂ lies to its left.
θ̂U is the upper confidence limit, constructed
such that (α/2)·100% of the sampling
distribution of θ̂ lies to its right.
(1 − α) is the confidence level of the
interval.
{ α = 0.05 for most practical applications }
Note:
The higher the confidence level, the wider the
interval !!
Estimating the mean (µ):
Case I: σ Known

The statistic Z can be used to construct
the interval such that:
P( ȳ − z(α/2)·σ/√n ≤ µ ≤ ȳ + z(α/2)·σ/√n ) = 1 − α
Example:

A manufacturing company produces light
bulbs that have a length of life that is approx.
normally distributed with mean µ and
standard deviation of 40 hr’s.
If a random sample of 16 bulbs has a mean of
780 hr’s, construct a 95% confidence interval
for the population mean.
Using the cumulative Normal Table,
z(α/2) = z(0.025) = 1.96
The interval is given by:
780 ± 1.96 · 40/√16 = 780 ± 19.6
760.4 < µ < 799.6
Conclusion
“At 95% confidence level, we expect the
mean life of all bulbs produced to be within
760.4 and 799.6 hr’s.”
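The interval arithmetic above is a one-liner once the margin of error is written out. A minimal Python sketch of the light-bulb example:

```python
from math import sqrt

xbar, sigma, n = 780, 40, 16   # sample mean, known sigma, sample size
z = 1.96                       # z(0.025) for a 95% interval

margin = z * sigma / sqrt(n)   # 1.96 * 40 / 4 = 19.6
lower, upper = xbar - margin, xbar + margin   # (760.4, 799.6)
```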
Estimating the mean (µ):
Case II: σ Unknown
The statistic t can be used to construct the interval
such that:

P( ȳ − t(α/2)·s/√n ≤ µ ≤ ȳ + t(α/2)·s/√n ) = 1 − α
t(α/2) is the tabulated value of the t statistic
with ν = n − 1 df corresponding to a tail area of (α/2)
Example:

The contents of 7 similar containers of
sulfuric acid are 9.8, 10.2, 10.4, 9.8, 10.0,
10.2, and 9.6 liters. Construct a 95%
confidence interval for the mean of all such
containers assuming that contents are
normally distributed.
From sample data:
x̄ = 10.0 and s = 0.283 liters.
Using the t-distribution Table,
for ν = 7 − 1 = 6 and α = 0.05:
t(α/2) = t(0.025) = 2.447
The interval is given by:
10.0 ± 2.447 · 0.283/√7
9.74 < µ < 10.26

Conclusion:
“At the 95% confidence level, we expect the
mean content of all such containers to lie
within 9.74 and 10.26 liters.”
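The sulfuric-acid example can be reproduced end to end, with the t value taken from the table as in the slides. A minimal Python sketch:

```python
from math import sqrt

acid = [9.8, 10.2, 10.4, 9.8, 10.0, 10.2, 9.6]
n = len(acid)
xbar = sum(acid) / n
s = sqrt(sum((x - xbar) ** 2 for x in acid) / (n - 1))

t = 2.447                      # tabulated t(0.025) with nu = 6 df
margin = t * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin   # ≈ (9.74, 10.26)
```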
Estimation of the Variance (σ²)
The statistic χ² can be used to construct an
interval such that:
P( (n−1)S²/χ²(α/2) ≤ σ² ≤ (n−1)S²/χ²(1−α/2) ) = 1 − α
The chi-squared distribution
Example:

To estimate the variance of fill at a cannery,
10 cans were selected at random and their
contents are weighed. The following data
were obtained (in ounces): 7.96, 7.90, 7.98,
8.01, 7.97, 7.96, 8.03, 8.02, 8.04, 8.02.
Construct a 90% confidence interval
for estimating the variance.
From sample data,
s = 0.0430
At (1 − α) = 0.90 and ν = n − 1 = 9:
χ²(0.05) = 16.92, and
χ²(0.95) = 3.33
Conclusion
At 90% confidence level, we
expect the variance to lie between
0.00098 and 0.0050 (Sq. ounces).

Note:
Taking square roots, 0.0314 < σ < 0.0707 ounces.
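The cannery interval can be checked numerically, using the tabulated chi-squared values given above. A minimal Python sketch:

```python
cans = [7.96, 7.90, 7.98, 8.01, 7.97, 7.96, 8.03, 8.02, 8.04, 8.02]
n = len(cans)
xbar = sum(cans) / n
ss = sum((x - xbar) ** 2 for x in cans)   # (n - 1) * S^2

# Tabulated chi-squared values for nu = 9, from the example above.
chi2_upper, chi2_lower = 16.92, 3.33      # chi2(0.05) and chi2(0.95)

var_lower = ss / chi2_upper   # ≈ 0.00098
var_upper = ss / chi2_lower   # ≈ 0.0050 (square ounces)
```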
Statistical Process Control
The Goal of SPC
The goal of SPC is to do three things:
1. Determine if a process is in "control".
If a process is in "control", we know it will do
the same thing over and over again, reliably.
Every system produces natural variation and
by measuring the variation, we can see if the
process is in control and reliable.
2. Determine if a process operates
within designed specification.
Since every process has natural variation,
how do we know that, on average, the
process is doing what we want. SPC gives us
some tools to use that tells us what the
average results will be, and what the high
and low variation from average will be (as
long as the process stays in "control").
3. Identify reasons for variation.
Once a system is working reliably (is in
control), we can look at the natural variation it
produces, and decide what causes the
variation. Actually, SPC doesn’t tell us what
causes the variation, but it gives us clues
about where to look. SPC can tell us if
variation has "Special" causes (a part of the
system is slipping away from specifications)
or "System" causes (the system itself must
be changed to reduce variation).
How SPC Works
SPC turns the measurements of a
manufacturing process into a visual
graphic. By reading the graph (that is,
recognizing certain patterns or shapes), a
worker can tell if the process is in control, and
if the process is producing within
specification.
All this while the process is happening
- Real time.
Not 24 hours or two weeks or
months later.
In time to avoid errors
instead of fixing them.
Key SPC Assumptions
Variation in a process is normal, natural
and cannot be eliminated entirely.

At some level of measurement, all processes
result in variation.
Manufacturing is a process.
The point of the process is to create or add
value.

Variation in manufacturing process
results = waste.
The higher the variation, the higher the
waste. The higher the waste, the lower the
value added per unit of resource used.
Reduction in variation is "Good".
Reduction in variation reduces waste,
reducing cost per unit.

Reduction in variation makes it possible to meet
competitive specifications, increasing price per unit.
Reduction in variation allows controlled,
continuous process improvement.
Variation in process is statistical.
A single measurement cannot determine
variation in a process.
Measuring a process over time requires
multiple measurements.
Statistical tools are used to describe the
variation.
Sources of Variation
Variation can be categorized as either:
Common (random) causes of variation
Random causes that we cannot identify
Unavoidable
e.g. slight differences in process variables like diameter,
weight, service time, temperature
Assignable causes of variation
Causes that can be identified and eliminated
e.g. poor employee training, worn tool, machine needing repair
