Capability Indices: Cpk
Assume the original target is a Cpk or Ppk of 1.0. The example uses the following values:
X-double-bar (overall average) = 10.00
R-bar (average range) = 4.653
Steps:
4.1 Sketch the distribution
4.2 Calculate the estimated standard deviation.
4.3 Determine the location of the tails for the distribution
4.4 Draw the specification limits on the distribution
4.5 Calculate how much data is outside the specifications
4.6 Calculate and interpret the capability indices
4.7 Analyze the Results
4.1. Sketch the distribution
Sketch a picture of a normal distribution. Begin by drawing a horizontal line (axis). Next, draw a normal
(bell-shaped) curve centered on the horizontal axis. Then draw a vertical line from the horizontal axis
through the center of the curve, cutting it in half. This line represents the overall average of the data and is
always located in the center of a normal distribution. Label the line with the value for the overall average
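and its symbol. The value of the overall average in the example is 10.00 and the symbol for the overall
average from the chart is X-double-bar. The example completed through this step follows.
4.2. Calculate the estimated standard deviation
The estimated standard deviation is found by dividing the average range, R-bar, by d2:
Estimated sigma = R-bar / d2
R-bar is calculated when constructing a control chart. Substitute MR-bar for R-bar if an X-MR chart has been
completed. In the example, R-bar is 4.653. The denominator (d2) is a weighting factor whose value is based on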
the subgroup size, n, from the control chart. The value for d2 in the example, based on a subgroup size of
5, is 2.326. A short listing of the d2 values for other subgroup sizes follows. The full table of values is
given in the appendix.
n 2 3 4 5 6 7 8 9 10
d2 1.128 1.693 2.059 2.326 2.534 2.704 2.847 2.970 3.078
The estimated standard deviation for the example is:
Estimated sigma = 4.653 / 2.326 = 2.00
The estimated standard deviation is calculated to one more decimal place than the original data.
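The calculation above can be sketched in Python. This is a minimal sketch rather than part of the original method: the function name estimated_sigma and the dictionary D2 are illustrative names, and the d2 constants are simply copied from the short table above.

```python
# d2 constants for subgroup sizes 2 through 10, copied from the table above.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534,
      7: 2.704, 8: 2.847, 9: 2.970, 10: 3.078}


def estimated_sigma(r_bar, n):
    """Estimate the standard deviation as R-bar / d2 for subgroup size n."""
    if n not in D2:
        raise ValueError(f"no d2 value listed for subgroup size {n}")
    return r_bar / D2[n]


# Example from the text: R-bar = 4.653 with a subgroup size of 5.
sigma_hat = estimated_sigma(4.653, 5)
print(round(sigma_hat, 2))  # 2.0
```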
4.3. Determine the location of the tails for the distribution
The next step is to determine where (at what value) the tails or ends of the curve are located. These values
can be estimated by adding and subtracting three times the estimated standard deviation from the overall
average. Remember, from the histogram section, that for a normal distribution, plus or minus three times
the standard deviation from the overall average includes 99.73 percent of the area under the curve. The
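calculation for the location of the left tail is:
Left tail = 10.00 - 3(2.00) = 4.00
The calculation for the location of the right tail is:
Right tail = 10.00 + 3(2.00) = 16.00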
Add the values to the distribution drawn earlier. The example completed through this step follows.
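4.4. Draw the specification limits on the distribution
Draw the upper and lower specification limits as vertical lines on the diagram. In the example, the upper
specification is 14 and the lower specification is 0.
The diagram shows whether any portion of the curve is beyond the specifications. In the example, some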
of the distribution is beyond the upper specification. If the overall average of the distribution is outside
the specification, refer to “Variation – Capability analysis where the overall average is outside the
specification” later in this section.
4.5. Calculate how much data is outside the specifications
As indicated in the previous step, some of the distribution is outside the specification limit. The question
is, how much? To determine the percentage that falls outside the specification limits, it is necessary to
find how many estimated standard deviations exist between the overall average and each specification
limit. The number of standard deviations is known as the Z value. Z values are used to determine the
percentage of output that is outside the specification limits using the Standard normal distribution table.
4.5.1. Find the percentage above the upper specification.
The first step in determining the percentage above the upper specification is to calculate the Z value for
the upper specification. This is found by subtracting the overall average from the upper specification, and
then dividing by the estimated standard deviation. The Z value for the upper specification is denoted as
Zupper. The upper specification for the example is 14, the overall average is 10.00, and the estimated
standard deviation is 2.00. Thus, the value of Zupper for the example is:
Zupper = (14 - 10.00) / 2.00 = 2.00
This means that the upper specification is located 2.00 estimated standard deviations away from the
overall average. Look up the Z value in the Standard Normal Distribution Table to find the estimated
proportion of output that is outside the upper specification.
Z values are listed along the left and top of the table. The whole number (number to the left of the
decimal) and the tenths digit (first number to the right of the decimal) are listed on the left hand side of
the table, and the hundredths digit (second number to the right of the decimal) is along the top. The table
shows Z values only up to 4. If the Z value is greater than 4, the proportion outside the specification is
virtually 0. In the example, the Z value is 2.00. To find the percentage outside the specification, go down
the left hand side of the table to 2.0 and then across to the column marked x.x0. The number is 0.0228,
which is the proportion outside the specification. To convert the proportion to a percentage, multiply it by
100. The percentage outside the upper specification is 2.28 percent. Place this percentage on the diagram.
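The table lookup described above can also be checked numerically. The sketch below is an illustration, not part of the original procedure: it replaces the printed table with the complementary error function from Python's standard math module, and the name upper_tail is hypothetical.

```python
import math


def upper_tail(z):
    """Proportion of a standard normal distribution lying above z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


usl, mean, sigma_hat = 14, 10.00, 2.00   # values from the example
z_upper = (usl - mean) / sigma_hat       # number of sigmas to the upper spec
proportion = upper_tail(z_upper)

# Z value, proportion above the upper spec, and the same as a percentage.
print(z_upper, round(proportion, 4), round(100 * proportion, 2))  # 2.0 0.0228 2.28
```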
4.5.2. Find the percentage below the lower specification.
The Z value for the lower specification is found by subtracting the lower specification from the overall
average, and then dividing by the estimated standard deviation. The Z value for the lower specification is
denoted as Zlower. The lower specification for the example is 0, the overall average is 10.00, and the
estimated standard deviation is 2.00. Thus, the value for Zlower for the example is:
Zlower = (10.00 - 0) / 2.00 = 5.00
This means that the lower specification is located 5.00 estimated standard deviations away from the
overall average. Look up the Z value in the Standard normal distribution table as previously described.
Since the table shows Z values up to only 4, the proportion and percentage outside of the specification is
taken as 0. If any of the data is outside the specification, add the percentage to the diagram.
4.5.3. Determining the total percentage outside the specifications
The total percentage outside the specification limits or requirements is found by adding the percentage
outside the upper and lower specification limits. The total percent of output located outside the
specification limits for the example is:
2.28 + 0 = 2.28%
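As a rough check on steps 4.5.1 through 4.5.3, the sketch below computes both tails and their total in one pass. It assumes a normal distribution, as the text does, and uses math.erfc instead of the printed table; percent_outside is an illustrative name.

```python
import math


def percent_outside(mean, sigma, lsl, usl):
    """Return (percent below LSL, percent above USL, total percent)."""
    def tail(z):
        # Area of the standard normal distribution beyond z.
        return 0.5 * math.erfc(z / math.sqrt(2))

    below = 100 * tail((mean - lsl) / sigma)   # uses Zlower
    above = 100 * tail((usl - mean) / sigma)   # uses Zupper
    return below, above, below + above


below, above, total = percent_outside(10.00, 2.00, lsl=0, usl=14)
print(f"{above:.2f} + {below:.2f} = {total:.2f}%")  # 2.28 + 0.00 = 2.28%
```

The computed lower tail is tiny but not exactly zero; the table-based method treats anything beyond a Z of 4 as 0, which gives the same result to two decimal places.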
4.6. Calculate and interpret the capability indices
This step describes the key capability indices.
4.6.1. Calculate Cp.
4.6.2. Calculate Cpk.
4.6.3. Interpret Cp and Cpk.
4.6.4. Calculate Cpu and Cpl.
4.6.1. Calculate Cp.
Cp is an index used to assess the width of the process spread in comparison to the width of the
specification. It is calculated by dividing the allowable spread by the actual spread. The allowable spread
is the difference between the upper and lower specification limits. The actual spread is 6 times the
estimated standard deviation. Plus or minus 3 times the estimated standard deviation contains 99.73
percent of the data and is commonly used to describe actual spread.
Cp for the example is:
Cp = (USL - LSL) / (6 * Est. sigma) = (14 - 0) / (6 * 2.00) = 1.17
A Cp of one indicates that the width of the process and the width of the specification are the same. A Cp
of less than one indicates that the process spread is greater than the specification. This means that some of
the data lies outside the specification. A Cp of greater than one indicates that the process spread is less
than the width of the specification. Potentially this means that the process can fit inside the specification
limits. The following diagrams show this graphically.
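The Cp calculation for the example can be sketched as follows. The function name cp is illustrative; the specification limits (14 and 0) and the estimated standard deviation (2.00) are the example's values.

```python
def cp(usl, lsl, sigma):
    """Cp: allowable spread (USL - LSL) divided by actual spread (6 sigma)."""
    return (usl - lsl) / (6 * sigma)


print(round(cp(usl=14, lsl=0, sigma=2.00), 2))  # 1.17
```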
In fact, the Cp states how many times the process can fit inside the specification. So a Cp of 1.5 means
the process can fit inside the specification 1.5 times. A Cp greater than one is obviously desirable.
However, the example has a Cp greater than one and yet it still has data outside the specification. This is
due to the position of the overall average relative to the specification. When the overall average is away
from the center of the specification, the system can still produce data outside the specification even
though the Cp is greater than one, as in the example below:
To overcome this problem, Cpk was created.
4.6.2. Calculate Cpk
Cpk takes into account the center of the data relative to the specifications, as well as the variation in the
process. Cpk is simple to calculate. The smaller of the two Z values is selected. This is known as Zmin.
When Zmin has been selected, it is divided by 3. The formula is:
Cpk = Zmin / 3
The Z values for the example are Zupper of 2.00 and Zlower of 5.00, therefore Zmin is 2.00. Cpk for the
example is:
Cpk = 2.00 / 3 = 0.67
If the Cpk formula is written in full, it becomes more apparent how Cpk works.
Cpk = smaller of (USL - X-double-bar) / (3 * Est. sigma) and (X-double-bar - LSL) / (3 * Est. sigma)
The diagram clearly shows that the overall average is too close to the upper specification. By taking the
smaller of the two Z values, Cpk is always looking at the worst side, where the specification is closest to
the overall average. Since it looks at only half the picture, Zmin is divided by 3 rather than by 6, as in Cp.
A Cpk value of one indicates that the tail of the distribution and the specification are an equal distance
from the overall average, as shown below:
A Cpk of less than one, as in the example, means that some of the data is beyond the specification limit. A
Cpk greater than one indicates that the data is within the specification. The larger the Cpk, the more
central and within specification the data.
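A minimal sketch of the Cpk calculation, following the Zmin / 3 definition given in this step. The function name cpk is illustrative, and the inputs are the example's values.

```python
def cpk(mean, sigma, lsl, usl):
    """Cpk: the smaller of the two Z values, divided by 3."""
    z_upper = (usl - mean) / sigma
    z_lower = (mean - lsl) / sigma
    z_min = min(z_upper, z_lower)   # the side closest to the overall average
    return z_min / 3


print(round(cpk(10.00, 2.00, lsl=0, usl=14), 2))  # 0.67
```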
4.6.3. Interpret Cp and Cpk.
The Cp and Cpk indices are the primary capability indices. Cp shows whether the distribution can
potentially fit inside the specification, while Cpk shows whether the overall average is centrally located.
If the overall average is in the center of the specification, the Cp and Cpk values will be the same. If the
Cp and Cpk values are different, the overall average is not centrally located. The larger the difference in
the values, the more offset the overall average. This concept is shown graphically below.
Cpk can never exceed Cp, so Cp can be seen as the potential Cpk if the overall average is centrally set. In
the example, Cp is 1.17 and Cpk is 0.67. This shows that the distribution can potentially fit within the
specification. However, the overall average is currently off center. The Cpk value does not state whether
the overall average is offset on the upper or lower side. It is necessary to go to the Z values to discern this.
An alternative is to show the capability indices Cpu and Cpl.
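The relationship between Cp and Cpk described in this step can be illustrated by moving the overall average while holding the spread fixed. In the sketch below, the averages 7.0 (the center of the 0 to 14 specification) and 8.5 are hypothetical values chosen for illustration; only 10.0 comes from the example.

```python
def cp(usl, lsl, sigma):
    """Cp: specification width divided by 6 sigma."""
    return (usl - lsl) / (6 * sigma)


def cpk(mean, sigma, lsl, usl):
    """Cpk: distance to the nearer specification divided by 3 sigma."""
    return min(usl - mean, mean - lsl) / (3 * sigma)


lsl, usl, sigma = 0, 14, 2.00
for mean in (7.0, 8.5, 10.0):
    # Cp stays at 1.17; Cpk falls as the average moves off center.
    print(mean, round(cp(usl, lsl, sigma), 2), round(cpk(mean, sigma, lsl, usl), 2))
```

With the average at the center of the specification, the two indices agree; as the average moves toward the upper specification, Cpk drops while Cp is unchanged.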
4.6.4. Calculate Cpu and Cpl
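Cpu and Cpl are the Cpk values calculated for each of the two Z values: Cpu uses Zupper and Cpl uses Zlower.
Therefore, Cpu is:
Cpu = Zupper / 3 = 2.00 / 3 = 0.67
and Cpl is:
Cpl = Zlower / 3 = 5.00 / 3 = 1.67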
From Cpu and Cpl, it is evident that the smaller value for the example is Cpu, which is the same value as
Cpk. By comparing Cpu to Cpl, it is evident that the overall average is off center and closer to the upper
specification than the lower specification. The larger the difference between the Cpu and Cpl, the more
off center the process.
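The comparison of Cpu and Cpl can be sketched as below. The function name cpu_cpl and the variable closer are illustrative; the inputs are the example's values.

```python
def cpu_cpl(mean, sigma, lsl, usl):
    """Return (Cpu, Cpl): one-sided capability against each specification."""
    cpu = (usl - mean) / (3 * sigma)   # Zupper / 3
    cpl = (mean - lsl) / (3 * sigma)   # Zlower / 3
    return cpu, cpl


cpu, cpl = cpu_cpl(10.00, 2.00, lsl=0, usl=14)

# The smaller index identifies the specification nearest the overall average.
closer = "upper" if cpu < cpl else "lower" if cpl < cpu else "neither"
print(round(cpu, 2), round(cpl, 2), closer)  # 0.67 1.67 upper
```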
4.7. Analyze the Results
The completed analysis for the example is shown below.
Calculations:
Estimated standard deviation = 2.00
Zupper = 2.00 Zlower = 5.00
Cpk = 0.67 Cp = 1.17
Cpu = 0.67 Cpl = 1.67
Examine the capability indices and the distribution. What do they show? Is the process capable? In the
example, the process is off center, reflecting a capability issue. The upper specification of 14 minutes
cannot be achieved consistently. The team must either improve the process or revise the specification. In
the example, the team chose to revise the specification, but this is not an option in many industries.
The aim of capability is to achieve improved Cpk values, resulting in a more capable system. For most
industries, the aim is to achieve a Cpk of at least one. Certainly this is the case for most service
organizations. Some manufacturing companies require Cpk values greater than one. For example, the
minimum is frequently 1.33, providing room for process drift, etc. When parts are being assembled,
reduced variation at the center of the specification gives considerable benefit, namely parts assemble
more quickly and more easily. Motorola, for example, constantly strives for higher and higher Cpk values.
The company’s 6-sigma program has received a great deal of attention, and translates to a Cp of 2.0 (and a Cpk of 2.0 when the process is centered). By
pushing for improved Cpk values, the improvement effort is focused on shrinking variation around the
center of the specifications.
Caution.
If a process is unstable—that is, if special causes are evident in the control chart—capability analysis will
be unreliable. Every time the capability indices are calculated, they will be different. Special causes
should be removed from a process. While special causes are present, the process is unpredictable, causing
it to go in and out of specification. As soon as a special cause occurs, the Cpk is meaningless: special
causes often result in unpredictable defects, so even an apparently good Cpk cannot be relied upon.
If the sampling selected in the control chart is not appropriate for the process, this can also affect the
Cp/Cpk values. For example, sampling too frequently will artificially reduce the range values and cause
the Cp/Cpk values to appear high. Using the wrong sample size can have a similar effect. Refer to the
sampling section for guidance with appropriate sample size and frequency.
If the data being analyzed is not normal, the estimated standard deviation will not be accurate, and the
method shown in this section can give significantly inaccurate results. Nonnormal capability analysis must
be used if the distribution is not normal; refer to the topic "Nonnormal capability analysis," later in this
section for more information.
Standard Normal Distribution Table
z x.x0 x.x1 x.x2 x.x3 x.x4 x.x5 x.x6 x.x7 x.x8 x.x9
4.0 .00003
3.9 .00005 .00005 .00004 .00004 .00004 .00004 .00004 .00004 .00003 .00003
3.8 .00007 .00007 .00007 .00006 .00006 .00006 .00006 .00005 .00005 .00005
3.7 .00011 .00010 .00010 .00010 .00009 .00009 .00008 .00008 .00008 .00008
3.6 .00016 .00015 .00015 .00014 .00014 .00013 .00013 .00012 .00012 .00011
3.5 .00023 .00022 .00022 .00021 .00020 .00019 .00019 .00018 .00017 .00017
3.4 .00034 .00032 .00031 .00030 .00029 .00028 .00027 .00026 .00025 .00024
3.3 .00048 .00047 .00045 .00043 .00042 .00040 .00039 .00038 .00036 .00035
3.2 .00069 .00066 .00064 .00062 .00060 .00058 .00056 .00054 .00052 .00050
3.1 .00097 .00094 .00090 .00087 .00084 .00082 .00079 .00076 .00074 .00071
3.0 .00135 .00131 .00126 .00122 .00118 .00114 .00111 .00107 .00104 .00100
2.9 .0019 .0018 .0018 .0017 .0016 .0016 .0015 .0015 .0014 .0014
2.8 .0026 .0025 .0024 .0023 .0023 .0022 .0021 .0021 .0020 .0019
2.7 .0035 .0034 .0033 .0032 .0031 .0030 .0029 .0028 .0027 .0026
2.6 .0047 .0045 .0044 .0043 .0041 .0040 .0039 .0038 .0037 .0036
2.5 .0062 .0060 .0059 .0057 .0055 .0054 .0052 .0051 .0049 .0048
2.4 .0082 .0080 .0078 .0075 .0073 .0071 .0069 .0068 .0066 .0064
2.3 .0107 .0104 .0102 .0099 .0096 .0094 .0091 .0089 .0087 .0084
2.2 .0139 .0136 .0132 .0129 .0125 .0122 .0119 .0116 .0113 .0110
2.1 .0179 .0174 .0170 .0166 .0162 .0158 .0154 .0150 .0146 .0143
2.0 .0228 .0222 .0217 .0212 .0207 .0202 .0197 .0192 .0188 .0183
1.9 .0287 .0281 .0274 .0268 .0262 .0256 .0250 .0244 .0239 .0233
1.8 .0359 .0351 .0344 .0336 .0329 .0322 .0314 .0307 .0301 .0294
1.7 .0446 .0436 .0427 .0418 .0409 .0401 .0392 .0384 .0375 .0367
1.6 .0548 .0537 .0526 .0516 .0505 .0495 .0485 .0475 .0465 .0455
1.5 .0668 .0655 .0643 .0630 .0618 .0606 .0594 .0582 .0571 .0559
1.4 .0808 .0793 .0778 .0764 .0749 .0735 .0721 .0708 .0694 .0681
1.3 .0968 .0951 .0934 .0918 .0901 .0885 .0869 .0853 .0838 .0823
1.2 .1151 .1131 .1112 .1093 .1075 .1056 .1038 .1020 .1003 .0985
1.1 .1357 .1335 .1314 .1292 .1271 .1251 .1230 .1210 .1190 .1170
1.0 .1587 .1562 .1539 .1515 .1492 .1469 .1446 .1423 .1401 .1379
0.9 .1841 .1814 .1788 .1762 .1736 .1711 .1685 .1660 .1635 .1611
0.8 .2119 .2090 .2061 .2033 .2005 .1977 .1949 .1922 .1894 .1867
0.7 .2420 .2389 .2358 .2327 .2297 .2266 .2236 .2206 .2177 .2148
0.6 .2743 .2709 .2676 .2643 .2611 .2578 .2546 .2514 .2483 .2451
0.5 .3085 .3050 .3015 .2981 .2946 .2912 .2877 .2843 .2810 .2776
0.4 .3446 .3409 .3372 .3336 .3300 .3264 .3228 .3192 .3156 .3121
0.3 .3821 .3783 .3745 .3707 .3669 .3632 .3594 .3557 .3520 .3483
0.2 .4207 .4168 .4129 .4090 .4052 .4013 .3974 .3936 .3897 .3859
0.1 .4602 .4562 .4522 .4483 .4443 .4404 .4364 .4325 .4286 .4247
0.0 .5000 .4960 .4920 .4880 .4840 .4801 .4761 .4721 .4681 .4641
Cpk or Ppk: Which should you use?
Your customer has asked you to report the Cpk of the product you are sending. You know that to compute
the Cpk, you need to have the product specifications, and that you need to have the mean and sigma. As you
gather the information, someone asks, "Which sigma do they want?"
You know that Cpk is calculated by dividing by 3 sigma. But which sigma should you use, estimated or
calculated? Which is correct? Which would you report? Naturally, most of us would use the sigma that makes
the Cpk look the best. But the sigma that makes the Cpk look best may not accurately reflect what you or
your customer need to know about the process.
Estimated sigma: the average range divided by d2 (R-bar / d2).
Sigma of the individuals: the standard deviation calculated from all of the individual values.
Confusion over calculating Cpk by two different methods is one reason that a new index, Ppk, was
developed. Ppk uses the calculated sigma from the individual data. Given that Ppk uses the calculated
sigma, it is no longer necessary to use the calculated sigma in Cpk. The only acceptable formula for Cpk
uses the estimated sigma.
In 1991, the ASQC/AIAG Task Force published the "Fundamental Statistical Process Control" reference
manual, which shows the calculations for Cpk as well as Ppk. These should be used to eliminate confusion
about calculating Cpk.
So which value is best to report, Cpk or Ppk? Although they show similar information, they have slightly
different uses.
Estimated sigma and the related capability indices (Cp, Cpk, and Cr) are used to measure the potential
capability of a system to meet customer needs. Use it when you want to analyze a system's aptitude to
perform.
Actual or calculated sigma (sigma of the individuals) and the related indices (Pp, Ppk, and Pr) are used to
measure the performance of a system to meet customer needs. Use it when you want to measure a system's
actual process performance.
How can Cpk be good with data outside the specifications?
A customer who called our technical support line recently could not understand why his Cpk was above
1.0 when his data was not centered between the specifications and some of the data was outside the
specification. How can you have a good Cpk when you have data outside the specification and/or data
which is not centered on the target/nominal value?
To calculate Cpk, you need to know only three pieces of information: the process average, the variation in
the process, and the specification(s). First, find out if the mean (average) is closest to the upper or lower
specification. If the process is centered, then either Zupper or Zlower can be used, as you will see below.
If you only have one specification, then the mean will be closest to that specification since the other one
does not exist.
To measure the variation in the process, use the estimated sigma (standard deviation). If you decide to use
the standard deviation from the individual data, you should use the Ppk calculation, since Ppk uses this
sigma. To calculate the estimated sigma, divide the average range, R-bar, by d2. The d2 value to use
depends on the subgroup size and will come from a table of constants shown below. If your subgroup size
is one, you will use the average moving range, MR-bar.
d2 values
Subgroup size d2
1 1.128
2 1.128
3 1.693
4 2.059
5 2.326
6 2.534
7 2.704
You, of course, provide the specifications. Now that you have these 3 pieces of information, the Cpk can
be easily calculated. For example, let’s say your process average is closer to the upper specification. Then
Cpk is calculated by the following:
Cpk = (USL - Mean) / (3 * Est. sigma)
As you can see, the raw data is used only indirectly: it determines the mean and the average range, but it
does not appear in the Cpk calculation itself. Here is an example that might serve to clarify. Suppose you
have the following example of 14 subgroups with a subgroup size of 2:
Sample No. Obs. 1 Obs. 2 Average Range
1 0.03 0.06 0.045 0.030
2 0.10 0.20 0.150 0.100
3 0.05 0.10 0.075 0.050
4 1.00 0.00 0.500 1.000
5 1.50 1.50 1.500 0.000
6 1.10 1.50 1.300 0.400
7 1.10 1.00 1.050 0.100
8 1.10 1.01 1.055 0.090
9 1.25 1.20 1.225 0.050
10 1.00 0.30 0.650 0.700
11 0.75 0.76 0.755 0.010
12 0.75 0.50 0.625 0.250
13 1.00 1.10 1.050 0.100
14 1.20 1.40 1.300 0.200
Ppk, on the other hand, uses the standard deviation from all of the data. We can call this the sigma of the
individual values or sigmai. Sigma of the individual values looks at variation within and between
subgroups.
For a process that exhibits drifting, estimated sigma would not pick up the total variation in the process
and thus the Cpk becomes a cloudy statistic. In other words, one cannot be sure it is a valid statistic.
In contrast to Cpk, Ppk, which uses the sigma of the individual values, would pick up all the variation in
the process. Again, sigmai uses between and within subgroup variation. So if there is drifting in the
process, sigmai would typically be larger than the estimated sigma, sigmae, and thus Ppk would, as it
should, be lower than Cpk.
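The difference between the two sigmas can be seen by computing both from the 14 subgroups tabulated earlier. This is a sketch under stated assumptions: sigmai is taken to be the sample standard deviation (n - 1 divisor) as computed by statistics.stdev, and the variable names are illustrative.

```python
import statistics

# (observation 1, observation 2) for the 14 subgroups tabulated earlier.
subgroups = [
    (0.03, 0.06), (0.10, 0.20), (0.05, 0.10), (1.00, 0.00), (1.50, 1.50),
    (1.10, 1.50), (1.10, 1.00), (1.10, 1.01), (1.25, 1.20), (1.00, 0.30),
    (0.75, 0.76), (0.75, 0.50), (1.00, 1.10), (1.20, 1.40),
]
D2_N2 = 1.128  # d2 for a subgroup size of 2

# Estimated sigma: within-subgroup variation only (R-bar / d2).
r_bar = statistics.mean([max(s) - min(s) for s in subgroups])
sigma_e = r_bar / D2_N2

# Sigma of the individuals: within- and between-subgroup variation.
# Assumption: sample standard deviation with an n - 1 divisor.
individuals = [x for s in subgroups for x in s]
sigma_i = statistics.stdev(individuals)

print(round(r_bar, 3), round(sigma_e, 3), round(sigma_i, 3))
print(round(sigma_i / sigma_e, 2))  # how many times larger sigmai is
```

Because Cpk and Ppk share the same mean and specifications, the ratio of Cpk to Ppk equals the ratio of sigmai to sigmae, so the second printed value also shows how much higher Cpk is than Ppk for this data, whatever the specifications are.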
Here is a quick review of the formulae for Cpk and Ppk:
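Cpk = Zmin / 3, where Zmin is the smaller of (USL - Mean) / Est. sigma and (Mean - LSL) / Est. sigma
Ppk = Zmin / 3, where Zmin is the smaller of (USL - Mean) / sigmai and (Mean - LSL) / sigmai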
We should be concerned with how well the process is behaving, therefore Ppk might be preferred over
Cpk. Ppk is a more conservative approach to answering the question, "How good is my process?"
Watch for a future article discussing the relatively new capability index, Cpm, and how it stacks up
against Cpk and Ppk.
Should you calculate Cpk when your process is not in control?
The AIAG Statistical Process Control reference manual (p. 13) states:
"The process must first be brought into statistical control by detecting and acting upon special causes of
variation. Then its performance is predictable, and its capability to meet customer expectations can be
assessed. This is the basis for continual improvement."
True, but to take it one step further, if the process is not in a state of statistical control then the validity of
a Cpk value is questionable.
Suppose your customer requires you to provide a Cpk value and does not require control charts. Or
perhaps the customer is willing to accept lack of control as long as the Cpk is acceptable. You provide a
"good" Cpk number and relax, knowing that your customer is satisfied. But have you really satisfied your
customer’s need, which is to ensure that your product or service is within an acceptable specification
region and consistent over time?
It is certainly possible to calculate Cpk even when a process is not in control, but one might ask what
value this calculation provides. Rather than state "You should never calculate Cpk when the process is out
of control," I prefer to say that the less predictable your process is, the less meaningful Cpk is or the less
value Cpk carries. While it is easy to say that one should never calculate Cpk when the process is out-of-
control, it is not always practical, since customers may dictate otherwise.
One of the reasons that minimal emphasis should be placed on Cpk when the process is not in control is
predictability. Customers want good Cpk values as well as some confidence that in the future, Cpk will be
consistent or improved over previous capability studies. [This topic will be addressed in a future article.]
Another reason that you should not put too much weight on Cpk when the process is not in control is due
to the underlying statistics that are used in calculating Cpk. Since Cpk uses the range, a process can
appear "better" simply because the range is not a fair representation of the process variability when the
process is not in control or predictable. If the process is in control, one could conclude that the range is
sufficient for calculating Cpk.
If you do not have the control chart to evaluate for process control, you might be tempted to select the
second process as being "better" on the basis of the higher Cpk. As this example illustrates, you cannot
fairly evaluate Cpk without first establishing process control.
The capability index dilemma: Cpk, Ppk, or Cpm
Lori, one of our customers, phoned the other day to ask if Cpk is the best statistic to use in a process that
slits metal to exacting widths. I too wondered what index would be best suited for her application.
Perhaps Cpk, Ppk, Cpm, or some other index offers the best means of reporting the capability of her
product or process.
Lori’s process capability has never dipped below 2 and averages above 3. Given this high degree of
capability, she might consider reducing variation about the target. While the Cpk and Ppk are well
accepted and commonly-used indices, they may not provide as much information as Lori needs to
continue to improve the process.
Cpm incorporates the target when calculating the standard deviation. Like the sigma of the individuals
formula, the Cpm sigma compares each observation to a reference value. However, instead of comparing the
data to the mean, the data is compared to the target. These differences are squared. Thus any observation
which is different from the target will increase the standard deviation.
As this difference increases, so does the sigma. And as this sigma becomes larger, the Cpm index gets
smaller. If the difference between the data and the target is small, so too is the sigma. And as this sigma
gets smaller, the Cpm index becomes larger. The higher the Cpm index, the better the process, as shown
in the diagrams below.
In these 3 charts the process is the same, but as the process becomes more centered, the Cpm gets better.
[Figure: three charts of the same process at increasing degrees of centering, captioned "This Cpm is good,"
"This Cpm is better," and "This Cpm is better."]
We can use Lori’s raw data to provide an example of how Cpm is calculated:
      Sample 1 Sample 2 Sample 3 Sample 4 Sample 5 Sample 6 Sample 7 Sample 8 Sample 9
obs 1 90.741 102.711 104.066 106.602 100.904 104.922 112.738 102.388 97.825
obs 2 102.300 100.882 105.620 95.978 108.558 100.243 108.145 104.159 95.209
obs 3 98.642 103.314 96.165 96.265 94.882 97.053 98.679 100.204 91.273
obs 4 106.069 98.569 100.412 95.869 98.573 111.042 103.788 99.328 93.430
obs 5 97.635 96.639 96.316 84.872 108.588 99.068 105.664 94.157 98.263
And the specifications are: USL = 145, Target = 105, LSL = 60
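A Cpm calculation for this data might be sketched as follows. The formulas are assumptions, not taken from the surviving text: the sigma about the target is computed with an n - 1 divisor, and Cpm is taken as the specification width divided by six times that sigma. Other references use a divisor of n or handle asymmetric tolerances differently, so treat the result as illustrative.

```python
import math

# Lori's data as listed above: one inner list per sample (5 observations each).
samples = [
    [90.741, 102.300, 98.642, 106.069, 97.635],
    [102.711, 100.882, 103.314, 98.569, 96.639],
    [104.066, 105.620, 96.165, 100.412, 96.316],
    [106.602, 95.978, 96.265, 95.869, 84.872],
    [100.904, 108.558, 94.882, 98.573, 108.588],
    [104.922, 100.243, 97.053, 111.042, 99.068],
    [112.738, 108.145, 98.679, 103.788, 105.664],
    [102.388, 104.159, 100.204, 99.328, 94.157],
    [97.825, 95.209, 91.273, 93.430, 98.263],
]
USL, TARGET, LSL = 145, 105, 60

values = [x for sample in samples for x in sample]
n = len(values)

# Sigma about the target: squared differences from TARGET rather than the mean.
# Assumption: divisor n - 1, mirroring the usual sample standard deviation.
sigma_cpm = math.sqrt(sum((x - TARGET) ** 2 for x in values) / (n - 1))

# Assumption: Cpm = (USL - LSL) / (6 * sigma about the target).
cpm = (USL - LSL) / (6 * sigma_cpm)
print(round(sigma_cpm, 2), round(cpm, 2))
```

Because every deviation is measured from the target of 105 rather than from the data's own mean, any offset between the process average and the target inflates sigma_cpm and lowers Cpm.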
Glossary
Assignable cause : An assignable cause is a source of variation that is intermittent, not predictable. It is
sometimes called "special cause" variation. On a control chart, an assignable cause is signaled by points
beyond the control limits or nonrandom patterns within the control limits.
Attributes data : Attributes data is data that can be classified and counted. There are two types of
attributes data: counts of defects per item or group of items (nonconformities) and counts of defective
items (nonconforming). For example, yes/no, good/bad, pass/fail, and go/no go.
Average : Another term for a mean, it is an indicator of the center of a set of data points. It is found by
adding all the individual values and dividing by the number of values.
Bell curve : Another term for the shape formed by a normal distribution when drawn as a histogram.
Bias : Something that influences the selection of certain items when collecting a sample.
Bimodal distribution : A distribution that has two modes. Drawn as a histogram, this condition is
reflected by two peaks or high points.
Capability : The capability of a process is how the process performs when compared to specification
limits or requirements. It uses a series of indices: Cp, Cpk, Cr, and Cpm.
Capability Analysis : A set of statistical calculations performed on a set of data to assess how the
distribution formed by the data compares to specifications or requirements.
Capable process : A process is said to be capable if nearly 100% of its output falls within specification
limits.
c-chart : An attributes control chart that is used to monitor the number of nonconformities, such as
defects per subgroup. The subgroup size must remain constant for this type of chart.
Central location : Central location is the center of a set of data points. Mean, median, and mode are the
statistics used to describe it.
Central tendency : Statistics such as the mean, median, and mode are said to be measures of central
tendency.
Characteristic : A distinguishing feature of a process or its output on which variables or attributes data
can be collected.
Chi-square : A goodness-of-fit-test statistic used to test the assumption that the distribution of a set of
data is similar to the expected distribution, such as a normal distribution.
Coefficient of variance : A ratio that measures the significance of the standard deviation in relation to the
mean.
Common cause : A source of variation that is inherent in a system and is predictable. A control chart
identifies a system with only common causes of variation. Common causes of variation affect all
individual values of a system, and can be eliminated only by a systemic change.
Control chart : A control chart is a graphical representation of a characteristic of a process, showing
plotted values of some statistic, a central line, and one or two control limits. It is used to determine
whether a process has been operating in statistical control and is an aid to maintaining statistical control.
Control limits : Lines on a control chart used as a basis for judging whether variation in data on a chart is
due to special or common causes. These limits are calculated from data collected from the system; they
are not specifications or limits set by customers or management.
Cp : A capability index that compares the width of a two-sided specification with the variation in the
process. Estimated standard deviation is used to calculate the process variation. A Cp larger than 1
indicates that the process variation is narrower than the specification.
Cpk : Cpk is a capability index that tells how well a system can meet two-sided specification limits.
Because it takes the location of the overall average into account, the system does not have to be centered
between the specifications for this index to be useful. It is calculated with estimated standard deviation.
A Cpk greater than 1 indicates that the process can meet the specification.
Cpl : A capability index that compares the variation in the process to the lower specification. Estimated
standard deviation is used to calculate the process variation. A Cpl greater than 1 indicates the process is
capable of meeting the lower specification.
Cpm : Cpm is a capability index that shows how well the system can produce output within
specifications while taking the target into account. Its calculation uses sigma calculated from the target
value instead of the mean.
Cpu : A capability index that compares the variation in the process to the upper specification. Estimated
standard deviation is used to calculate the process variation. A Cpu greater than 1 indicates the process is
capable of meeting the upper specification.
Cr : Capability ratio compares the variation in a process with the width of a two-sided specification.
Estimated standard deviation is used to calculate the process variation. It is the inverse of Cp.
Defect : An occurrence such as a blemish, scratch, burn, error, or omission that appears on an object. A
defect does not necessarily make the object unusable or unacceptable.
Defective : A product or service flawed beyond use or acceptability.
Discrimination : This refers to a description of the capability of a measurement system.
Dispersion : Statistics such as the range and standard deviation (sigma) are said to be measures of
dispersion.
Distribution : Distribution is a way of describing the output from a system of variation. The
distribution’s location, shape, and spread may be evaluated by statistics such as the mean, median, sigma,
and range.
Estimated sigma : This is an estimate of the standard deviation calculated by dividing the average range
by the tabular constant d2 (R-bar/d2).
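As a small sketch of the R-bar/d2 calculation, the Python below uses the short d2 listing and the example values given earlier in this document (average range 4.653, subgroup size 5). The dictionary and function names are hypothetical.

```python
# d2 constants for subgroup sizes 2 through 10, from the short listing
# given earlier in this document.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534,
      7: 2.704, 8: 2.847, 9: 2.970, 10: 3.078}


def estimated_sigma(r_bar, subgroup_size):
    """Estimate the standard deviation as R-bar divided by d2."""
    return r_bar / D2[subgroup_size]


# Example values from this document: R-bar = 4.653, n = 5.
sigma_hat = estimated_sigma(4.653, 5)
print(round(sigma_hat, 3))
```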
Histogram : A histogram is a bar chart that represents the frequency distribution of data. The height of
each bar corresponds to the number of items in the class or cell. The width of each bar represents a
measurement interval. The histogram shows basic information such as central location, shape, and spread
of the data being examined.
In control : A process is said to be "in control" or "stable" if it is in statistical control. If a process is in
statistical control, a control chart will have no subgroups falling outside the control limits, no runs, and no
nonrandom patterns.
Individuals control chart : The individual portion of an X-MR control chart. The individual data points
are plotted onto the chart and compared with control limits.
Kurtosis : Kurtosis is a statistic that is used to measure the "flatness" or "peakedness" of a set of data. It
represents a measure of the combined weight of the tails relative to the rest of a distribution. As the tails
of a distribution become heavier, the kurtosis will increase. As the tails become lighter, the kurtosis value
will decrease.
Lower control limit : A line on a control chart used as a basis for judging whether variation from the data
on the chart is due to special or common causes. Any point beyond the lower control limit is an indication
of a special cause occurring. This limit is calculated from data collected on the system; it is not a
specification or limit set by customers or management. The symbol is LCL.
Lower specification limit : The lower limit of a specification. This limit is set as an aim for a system or
process. It is usually set by the customer of the process, engineering, or management. The symbol for the
lower specification limit is LSL.
Maximum acceptable subgroup size : When a varying sample size is being used in a p or u-control
chart, the maximum acceptable sample size is usually a sample size that is twenty-five percent larger than
the average sample size. Any subgroup with a sample size larger than the maximum acceptable subgroup
size has to have control limits calculated specifically for that subgroup.
Mean : Another term for average, it is an indicator of the central location of a set of data. It is found by
adding all the individual values and dividing by the number of values.
Measurement System : A measurement system consists of the people, procedures, systems, and devices
used to take measurements.
Median : The middle number in a set of data when it is ranked from lowest to highest. It is an indicator of
central location in a data set.
Minimum acceptable subgroup size : When a varying sample size is being used in a p or u-control
chart, the minimum acceptable sample size is usually a sample size that is twenty-five percent smaller
than the average sample size. Any subgroup with a sample size smaller than the minimum acceptable
subgroup size has to have control limits calculated specifically for that subgroup.
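The maximum and minimum acceptable subgroup size entries can be illustrated with a short Python sketch. It assumes the usual twenty-five percent band around the average sample size described above. The function names and the sample sizes are hypothetical.

```python
def acceptable_subgroup_range(sample_sizes, tolerance=0.25):
    """Return (minimum, maximum) acceptable subgroup sizes around the average."""
    average = sum(sample_sizes) / len(sample_sizes)
    return average * (1 - tolerance), average * (1 + tolerance)


def subgroups_needing_own_limits(sample_sizes, tolerance=0.25):
    """List the sample sizes that fall outside the acceptable range."""
    low, high = acceptable_subgroup_range(sample_sizes, tolerance)
    return [n for n in sample_sizes if n < low or n > high]


# Hypothetical sample sizes for a p-chart or u-chart.
sizes = [100, 110, 90, 150, 50]
low, high = acceptable_subgroup_range(sizes)
flagged = subgroups_needing_own_limits(sizes)
print(low, high, flagged)
```

Subgroups in the flagged list would have control limits calculated specifically for them, while the rest can share limits based on the average sample size.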
Mode : The number that occurs most frequently in a data set. It is usually an indicator of central
location.
Moving range : The difference between consecutive subgroup values on an X-MR control chart. The
moving range is used as a measure of variability.
Moving range chart : The moving range portion of an individuals and moving range control chart. The
moving ranges are plotted on the chart and compared with control limits.
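A brief Python sketch of the moving range calculation follows. The readings are hypothetical. The last step assumes the common practice of estimating sigma for an X-MR chart as the average moving range divided by d2 for a subgroup size of 2 (1.128).

```python
def moving_ranges(values):
    """Absolute difference between each value and the one before it."""
    return [abs(current - previous)
            for previous, current in zip(values, values[1:])]


# Hypothetical individual readings from an X-MR chart.
readings = [10.0, 12.0, 9.0, 11.0]
ranges = moving_ranges(readings)

# Average moving range, then estimated sigma using d2 = 1.128 (n = 2).
mr_bar = sum(ranges) / len(ranges)
sigma_hat = mr_bar / 1.128
print(ranges, round(mr_bar, 3), round(sigma_hat, 3))
```

Note that a set of k readings yields k - 1 moving ranges, because the first reading has no predecessor.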
Negatively skewed distribution : A distribution of data where most of the data appears on the right hand
side of the distribution and then tails off to the left. Also known as a skewed left distribution.
Nonconforming : Nonconforming data is a count of defective units. It is often described as go/no go,
pass/fail, or yes/no, since there are only two possible outcomes to any given check. You can track either
the number of defective units or the number of nondefective units.
Nonconformities : Nonconformities data is a count of defects per unit or group of units. It can refer to
defects or occurrences that should not be present but are, or any characteristic that should be present but is
not.
Nonnormal data distribution : Any data set that does not show a normal, bell-shaped distribution.
Nonnormal data : Data that does not form a normal distribution.
Nonrandom pattern : A pattern in data that is repeating, or is not due to normal variation.
Normal curve : This bell-shaped curve is used to illustrate the shape of a normal distribution.
Normal distribution : A data distribution that is bell shaped and symmetrical. The normal distribution is
the basis for control chart and capability analysis.
Normal probability plot : A normal probability plot is a graphical method for showing a frequency
distribution. The scaling is set up so that if the distribution is normal, a straight line will result.
np-chart : An attributes control chart that plots the number of items that are defective or possess a
characteristic of interest. The subgroup size must remain constant for this type of chart to be used.
Observation : An observation is a single piece of data, usually a count or a measurement. It is also known
as a reading.
Operational definition : When applied to data collection, it is a clear, concise, and detailed definition of
a measure. It ensures that those collecting data do so consistently.
Outlier : An outlier is a point on a chart that does not fall into the pattern of the rest of the data.
Out-of-control : When applied to a control chart, out of control means that at least one special cause of
variation is present.
Overcontrol : Overreaction to a set of data. For example, in a control chart, it would be reacting to a
common cause as if it were a special cause.
Pareto chart : A Pareto chart is a bar chart for ranking aspects of a problem. Typically, a few aspects
make up a significant portion of the problem while many trivial aspects exist.
p-chart : An attributes control chart that plots the proportion of items possessing a characteristic of interest.
The subgroup size may vary.
Positively skewed distribution : A distribution of data where most of the data appears on the left hand
side of the distribution and then tails off to the right. Also known as a skewed right distribution.
Pp : Pp is a capability index, similar to Cp, that is a measure of process performance. Pp tells how well a
system can meet two-sided specification limits, assuming that the average is centered on the target value.
It is calculated with the actual sigma (using the actual individual values) rather than the estimated sigma.
A Pp larger than 1 indicates that the process variation is narrower than the specification.
Ppk : Similar to Cpk, Ppk is a capability index that indicates whether a process is capable of meeting
two-sided specification limits. However, Ppk uses actual standard deviation to calculate the process
variation, whereas Cpk uses an estimated standard deviation. The target value is taken into account with
Ppk, so the system does not have to be centered on the target value for the index to be useful. A Ppk greater than 1 indicates
that the process can meet the specification.
Ppl : A capability index similar to Cpl in that it compares the variation in the process to the lower
specification. However, Ppl uses the actual standard deviation to calculate the process variation, whereas Cpl uses
an estimated standard deviation. A Ppl greater than 1 indicates the process is capable of meeting the lower
specification.
Ppu : A capability index similar to Cpu in that it compares the variation in the process to the upper
specification. However, Ppu uses the actual standard deviation to calculate the process variation, whereas Cpu uses
an estimated standard deviation. A Ppu greater than 1 indicates the process is capable of meeting the
upper specification.
Pr : A capability ratio similar to Cr in that it compares the variation in a process with the width of a two-sided
specification. However, Pr uses the actual standard deviation to calculate the process variation, whereas Cr
uses an estimated standard deviation. It is the inverse of Pp.
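The performance indices (Pp, Ppk, Ppl, Ppu, and Pr) follow the same pattern as their capability counterparts but use the sigma of the individuals. The Python sketch below assumes the conventional formulas and uses the sample standard deviation (n - 1 divisor) as the actual sigma. The function name, data values, and specification limits are hypothetical.

```python
import statistics


def performance_indices(values, lsl, usl):
    """Return Pp, Ppu, Ppl, Ppk and Pr from the individual data values."""
    mean = statistics.mean(values)
    sigma = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    pp = (usl - lsl) / (6 * sigma)
    ppu = (usl - mean) / (3 * sigma)
    ppl = (mean - lsl) / (3 * sigma)
    return {"Pp": pp, "Ppu": ppu, "Ppl": ppl,
            "Ppk": min(ppu, ppl), "Pr": 1 / pp}


# Hypothetical individual values and specification limits.
data = [8, 9, 10, 11, 12]
result = performance_indices(data, lsl=4, usl=16)
for name, value in result.items():
    print(f"{name} = {value:.3f}")
```

Because the hypothetical data are centered between the limits, Pp and Ppk come out equal here. They would differ for an off-center process.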
Process : A process is the combination of people, equipment, materials, methods, and environment that
produce output—a given product or service. The words process and system are often used
interchangeably.
Process capability : Process capability is the 6 sigma range of common cause variation for statistically
stable processes only. Sigma is usually estimated by R-bar/d2.
Process performance : The process performance is the 6 sigma range of inherent variation for
statistically stable processes only, where sigma is usually estimated by the sample standard deviation.
Random distribution : A distribution that forms no particular shape.
Random sample : A sample that allows every item in a population to have an equal chance of being
selected, with no bias.
Range : Range is an estimate of spread in a set of data points; the difference between the highest and
lowest values in the data set.
Repeatability : Repeatability refers to variation in a series of measurements that have been taken with
one gage measuring one characteristic of the same item by the same person.
Reproducibility : Reproducibility refers to variation in a series of measurements that have been taken
with one gage measuring one characteristic of the same item by different people.
Run chart : A run chart is a simple line chart that plots one characteristic over time. It is used to plot
individual observations and detect patterns in the data.
Sample : A sample is a collection of one or more observations used to analyze the performance of a
process, as opposed to the total population. It is intended to represent the characteristics of the
population. Sample is a synonym for "subgroup" in process control applications.
Sample size : The number of pieces of data taken at one time. For example, if five boxes are checked for
stiffness every hour, the sample size is five. If the temperature of a room is taken every hour,
only one number is collected each time, so the sample size is one.
Sigma of the individuals : Sigma of the individuals is standard deviation calculated from the individual
data values in a data set. It is also known as actual or calculated sigma.
Sigma : Sigma is the Greek letter, σ, used to denote standard deviation. It is a measure of the variation
or spread within a set of data.
Skewed distribution : A distribution that tails off to one side, either to the left or right.
Skewness : Skewness is a statistic that is used to measure the symmetry of the distribution for a set of
data. A process that is skewed tails off to the left or to the right.
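Skewness and kurtosis can both be computed from the central moments of the data. The Python sketch below assumes one common moment-based definition (third moment over the second moment to the power 1.5 for skewness, fourth moment over the second moment squared for kurtosis, which gives about 3 for a normal distribution). Statistical software may apply different corrections, so treat this as an illustration only. The function names and data are hypothetical.

```python
def central_moment(values, order):
    """Average of the deviations from the mean raised to the given power."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** order for x in values) / len(values)


def skewness(values):
    """Zero for symmetrical data, positive when the tail is on the right."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Grows as the tails of the distribution become heavier."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


symmetric = [1, 2, 3, 4, 5]        # hypothetical symmetrical data
right_tailed = [1, 1, 1, 1, 10]    # hypothetical positively skewed data
print(skewness(symmetric), kurtosis(symmetric))
print(skewness(right_tailed), kurtosis(right_tailed))
```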
Special cause : Special cause variation is a source of variation that is intermittent, not predictable.
Sometimes it is called "assignable cause" variation. On a control chart, a special cause is signaled by
points beyond the control limits, runs, or nonrandom patterns within the control limits. A process that has
special cause variation is said to be out-of-control, unstable, or unpredictable.
Specification limits : Specifications are boundaries, usually set by management, engineering, or
customers, within which a system must operate. They are sometimes called engineering tolerances.
Spread : Spread is the range of data from the lowest value to the highest value.
Stable process : A system, analyzed by a control chart, with no special causes of variation present. This
system is also said to be in control. Variation within a stable system is due to common causes, and is
predictable.
Standard deviation : A statistic that describes the variation or spread within a data set. It can be used to
indicate the variation in a process and to compare with specifications.
Statistical control : Statistical control is a condition describing a process from which all special causes of
variation have been removed and only common causes of variation remain. On a control chart, processes
that are in statistical control show no subgroups outside the control limits, no runs, and no nonrandom
patterns. This condition is also referred to as in control, stable, or predictable.
Subgroup : A subgroup is one or more occurrences or measurements taken at one time. Multiple
subgroups are used to analyze the performance of a process. Subgroup is used as a synonym for "sample."
Symmetrical distribution : A distribution that, if cut in half, shows each side to be the mirror image of the other.
Target value : The exact value at which customers, engineering, or management want the system to
operate.
Trial limits : On a control chart, trial limits are calculated when there is insufficient data to calculate
control limits. These give a temporary guide until sufficient data has been collected.
u-chart : An attributes control chart that is used to monitor the number of nonconformities per unit, such
as defects per item. The subgroup size may vary.
Undercontrol : Not reacting to a set of data when the data is showing an issue or problem. For example,
in a control chart, it would be ignoring a special cause of variation.
Uniform distribution : A distribution that, when drawn as a histogram, has each bar at a similar frequency.
Unstable system : A system that contains special and common causes of variation; this system is also
said to be out of control. An unstable system is unpredictable.
Upper control limit : A line on a control chart used as a basis for judging whether variation from the data
on the chart is due to special or common causes. Any point beyond the upper control limit is an indication
of a special cause occurring. This limit is calculated from data collected on the system; it is not a
specification or limit set by customers or management. Its symbol is UCL.
Upper specification limit : The upper limit of a specification. This limit is set as an aim for a system or
process. It is usually set by the customer of the process, engineering, or management. The symbol for the
upper specification limit is USL.
Variability : Variability refers to the differences among individual outputs of a process. In control chart
pairs, it refers to the differences between individual observations and is analyzed in range, sigma, and
moving range charts.
Variables : Variables data is data that is acquired through measurements, such as length, time, diameter,
strength, weight, temperature, density, thickness, pressure, and height. X-bar and range, X-bar and sigma,
and individuals and moving range charts are used to analyze variables data.
Variation : Variation is the inevitable differences that occur among individual outputs of a process.
Sources of variation may be grouped into two major categories: common causes and special causes.
X-bar : X-bar is the average or mean of values in a group of observations.
X-bar chart : The X-bar chart is a variables control chart that shows the subgroup averages. The
subgroup size for this chart must be larger than one and consistent.
Zlower : The symbol for the Z value for the lower specification limit. It represents the number of standard
deviations between the average and the lower specification limit.
Zmin : The minimum of the Z values, either Zupper or Zlower. It is used to calculate the Cpk index in capability
analysis.
Zupper : The symbol for the Z value for the upper specification limit. It represents the number of standard
deviations between the average and the upper specification limit.
Z value : Used in capability analysis, it is the symbol for the number of standard deviations between the
average and a specification limit for a normal distribution.
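The Z value entries can be tied together with a short Python sketch. It assumes the conventional relationship Cpk = Zmin / 3 and uses the standard normal distribution to estimate the fraction of output beyond each specification limit. The function name, the average, the sigma, and the specification limits are hypothetical.

```python
from statistics import NormalDist


def z_analysis(mean, sigma, lsl, usl):
    """Return the Z values, Cpk, and estimated fractions outside each limit."""
    z_upper = (usl - mean) / sigma  # standard deviations from average to USL
    z_lower = (mean - lsl) / sigma  # standard deviations from average to LSL
    z_min = min(z_upper, z_lower)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return {
        "Zupper": z_upper,
        "Zlower": z_lower,
        "Zmin": z_min,
        "Cpk": z_min / 3,
        "fraction_above_usl": 1 - standard_normal.cdf(z_upper),
        "fraction_below_lsl": standard_normal.cdf(-z_lower),
    }


# Hypothetical inputs: average 10.0, sigma 2.0, LSL 2.0, USL 16.0.
analysis = z_analysis(mean=10.0, sigma=2.0, lsl=2.0, usl=16.0)
for name, value in analysis.items():
    print(f"{name} = {value:.5f}")
```

The fractions are only meaningful when the data are approximately normal and the process is in statistical control.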