W5inse6220 PDF

INSE 6220 -- Week 5

Advanced Statistical Approaches to Quality

Process capability
More on Hypothesis Testing
More on Statistical Inference
More on Control Charts: X-bar, R, and S control charts

[Figure: S chart. Standard Deviation versus Sample Number (0 to 40), with UCL, CL, and LCL lines.]

Dr. A. Ben Hamza, Concordia University

Process capability analysis
1. Compute the mean of the sample means, $\bar{\bar{x}}$.
2. Compute the mean of the sample ranges, $\bar{R}$.
3. Estimate the population standard deviation, $\hat{\sigma}_x$:

$\hat{\sigma}_x = \bar{R} / d_2$
4. Estimate the natural tolerance of the process:

Natural tolerance = $6\hat{\sigma}_x$
5. Determine the specification limits:

USL = Upper specification limit
LSL = Lower specification limit
Process capability analysis (cont.)

6. Compute capability indices:
Process capability potential

$C_p = (USL - LSL) / (6\hat{\sigma}_x)$
Upper capability index

$C_{pU} = (USL - \bar{\bar{x}}) / (3\hat{\sigma}_x)$
Lower capability index

$C_{pL} = (\bar{\bar{x}} - LSL) / (3\hat{\sigma}_x)$
Process capability index

$C_{pk} = \min(C_{pU}, C_{pL})$
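
The six steps above can be sketched in Python. This is only an illustration under stated assumptions: the function name `capability_indices`, the summary values, and the specification limits are hypothetical, and `D2` holds the commonly tabulated $d_2$ constants for subgroup sizes 2 to 6.

```python
# Commonly tabulated d2 constants, indexed by subgroup size n.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534}


def capability_indices(xbarbar, rbar, n, lsl, usl):
    """Return the capability indices from the grand mean and the mean range."""
    sigma_hat = rbar / D2[n]                 # step 3: sigma estimate from R-bar
    cpu = (usl - xbarbar) / (3 * sigma_hat)  # upper capability index
    cpl = (xbarbar - lsl) / (3 * sigma_hat)  # lower capability index
    return {
        "sigma_hat": sigma_hat,
        "natural_tolerance": 6 * sigma_hat,  # step 4
        "Cp": (usl - lsl) / (6 * sigma_hat),
        "CpU": cpu,
        "CpL": cpl,
        "Cpk": min(cpu, cpl),
    }


# Hypothetical process summary: subgroups of size 5, made-up spec limits.
result = capability_indices(xbarbar=10.02, rbar=0.2326, n=5, lsl=9.7, usl=10.3)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With these made-up numbers the process is slightly off center, so Cpk comes out below Cp; the two coincide only when the grand mean sits midway between LSL and USL.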
Control Charts
Suppose we have a general statistic W.
We plot W over time.
We specify control limits of the form

UCL = $\mu_W + 3\sigma_W$
CL = $\mu_W$
LCL = $\mu_W - 3\sigma_W$

where $\mu_W$ is the mean of W and $\sigma_W$ is the standard deviation of W.
A control chart based on a number of standard deviations of the statistic
from the mean of the statistic is called a Shewhart control chart.
Some commonly used Ws:
X-bar: Average
R: Range
s: Standard deviation
We can also specify control charts using probability limits.
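
To make the link between 3-sigma limits and probability limits concrete, here is a small Python sketch. It assumes the plotted statistic W is approximately normally distributed, which the slides do not state; the helper name `shewhart_limits` and the numbers passed to it are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def shewhart_limits(mean_w, sd_w, k=3.0):
    """Return (LCL, CL, UCL) placed k standard deviations around the mean of W."""
    return mean_w - k * sd_w, mean_w, mean_w + k * sd_w


# Hypothetical statistic with mean 50 and standard deviation 2.
lcl, cl, ucl = shewhart_limits(mean_w=50.0, sd_w=2.0)

# Probability that an in-control point falls outside 3-sigma limits (normal W).
false_alarm_3sigma = 2 * (1 - std_normal.cdf(3.0))

# Multiplier that leaves probability 0.001 in each tail instead.
k_probability = std_normal.inv_cdf(1 - 0.001)

print(lcl, cl, ucl)
print(round(false_alarm_3sigma, 4), round(k_probability, 2))
```

Under normality the two conventions are close: 3-sigma limits leave about 0.0027 outside the limits in total, while 0.001 probability limits sit at roughly 3.09 standard deviations.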
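X-bar and R Charts
x-bar chart:
UCL = $\bar{\bar{x}} + A_2 \bar{R}$
Central line = $\bar{\bar{x}}$
LCL = $\bar{\bar{x}} - A_2 \bar{R}$

R chart:
UCL = $D_4 \bar{R}$
Central line = $\bar{R}$
LCL = $D_3 \bar{R}$

For each sample, $R = x_{max} - x_{min}$. Over m samples,

$\bar{\bar{x}} = (\bar{x}_1 + \bar{x}_2 + ... + \bar{x}_m) / m$
$\bar{R} = (R_1 + R_2 + ... + R_m) / m$

Typically m = 20 ~ 25 samples, each of size n = 4 ~ 6.
$\bar{\bar{x}}$ estimates the process mean.
A2, D3, D4 = ? These constants depend on the sample size n and are read from a table of control chart factors.
To find the control limits, we need to estimate
the variance, or standard deviation.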
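
The limits above can be computed directly from subgroup data. The sketch below is an illustration only: the function name `xbar_r_limits` and the four tiny subgroups are hypothetical (far fewer than the 20 to 25 samples recommended above), and `FACTORS` lists the commonly tabulated constants for n = 4, 5, 6.

```python
# Commonly tabulated control chart factors by subgroup size n.
FACTORS = {
    4: {"A2": 0.729, "D3": 0.0, "D4": 2.282},
    5: {"A2": 0.577, "D3": 0.0, "D4": 2.114},
    6: {"A2": 0.483, "D3": 0.0, "D4": 2.004},
}


def xbar_r_limits(subgroups):
    """Return (LCL, CL, UCL) for the x-bar chart and the R chart."""
    n = len(subgroups[0])
    m = len(subgroups)
    f = FACTORS[n]
    means = [sum(g) / n for g in subgroups]      # x-bar of each sample
    ranges = [max(g) - min(g) for g in subgroups]  # R of each sample
    xbarbar = sum(means) / m                     # grand mean
    rbar = sum(ranges) / m                       # mean range
    return {
        "xbar": (xbarbar - f["A2"] * rbar, xbarbar, xbarbar + f["A2"] * rbar),
        "R": (f["D3"] * rbar, rbar, f["D4"] * rbar),
    }


# Hypothetical data: 4 subgroups of size 5.
data = [
    [10, 12, 11, 13, 14],
    [9, 11, 10, 12, 13],
    [11, 13, 12, 14, 15],
    [10, 11, 12, 13, 14],
]
limits = xbar_r_limits(data)
print(limits)
```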
Control Charts for X-bar and s
UCL = $\mu_s + 3\sigma_s$
CL = $\mu_s$
LCL = $\mu_s - 3\sigma_s$
If $X_1, X_2, ..., X_n$ is a random sample from a $N(\mu, \sigma^2)$ population, then

$E(s^2) = \sigma^2$ but $E(s) \neq \sigma$
Example
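Summary of Control Charts
X-bar & R chart, with $\hat{\sigma} = \bar{R} / d_2$:
X-bar chart: LCL = $\bar{\bar{x}} - A_2 \bar{R}$, CL = $\bar{\bar{x}}$, UCL = $\bar{\bar{x}} + A_2 \bar{R}$
R chart: LCL = $D_3 \bar{R}$, CL = $\bar{R}$, UCL = $D_4 \bar{R}$

X-bar & S chart, with $\hat{\sigma} = \bar{S} / c_4$:
X-bar chart: LCL = $\bar{\bar{x}} - A_3 \bar{S}$, CL = $\bar{\bar{x}}$, UCL = $\bar{\bar{x}} + A_3 \bar{S}$
S chart: LCL = $B_3 \bar{S}$, CL = $\bar{S}$, UCL = $B_4 \bar{S}$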
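
The S chart constants need not be taken from a table. For a normal population, $E(s) = c_4 \sigma$, and the factors $A_3$, $B_3$, $B_4$ follow from $c_4$. The Python sketch below uses the standard closed forms, which are not given on the slides; the function names are hypothetical.

```python
import math


def c4(n):
    """Bias-correction constant such that E(s) = c4 * sigma for normal data."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)


def s_chart_factors(n):
    """Return c4, A3, B3, B4 for subgroups of size n."""
    c = c4(n)
    spread = 3.0 * math.sqrt(1.0 - c * c) / c  # 3 * sd(s) / E(s)
    return {
        "c4": c,
        "A3": 3.0 / (c * math.sqrt(n)),
        "B3": max(0.0, 1.0 - spread),  # a negative lower limit is set to 0
        "B4": 1.0 + spread,
    }


factors = s_chart_factors(5)
for name, value in factors.items():
    print(f"{name}: {value:.4f}")
```

For n = 5 the results agree with the usual tabulated values to three decimals (c4 about 0.940, A3 about 1.427, B3 = 0, B4 about 2.089).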
Example: S charts with MATLAB

This example plots X-bar and S charts of
measurements on newly machined parts,
taken at one-hour intervals for 36 hours.
Each row of the runout matrix contains
the measurements for 4 parts chosen at
random. The values indicate, in
thousandths of an inch, the amount the
part radius differs from the target radius.
>> load parts

>> controlchart(runout,'chart','xbar','sigma','std');
>> controlchart(runout,'chart','s','sigma','std');
Hypothesis Testing
Null hypothesis (H0, pronounced "H nought"): $H_0: \mu = 1.10$
Alternative hypothesis: $H_1: \mu \neq 1.10$
A hypothesis test is a procedure for determining if an assertion about a characteristic of a
population is reasonable.
Example 1: The mean monthly cell phone bill in this city is $\mu = 42$ dollars.
Example 2: The proportion of adults in this city with cell phones is p = 0.68.
Example 3: Suppose that someone says that the average price of a liter of regular unleaded
gas in Montreal is $1.10. How would you decide whether this statement is true? You could try
to find out what every gas station in the city was charging and how many liters they were
selling at that price. That approach might be definitive, but it could end up costing more than
the information is worth. A simpler approach is to find out the price of gas at a small number of
randomly chosen stations around the city and compare the average price to $1.10.
Of course, the average price you get will probably not be exactly $1.10 due to variability in
price from one station to the next. Suppose your average price was $1.18. Is this eight cent
difference a result of chance variability, or is the original assertion incorrect? A hypothesis test
can provide an answer.
Hypothesis Test Terminology: review

The significance level is related to the degree of certainty you require in order to reject the
null hypothesis in favor of the alternative. By taking a small sample you cannot be certain
about your conclusion. So you decide in advance to reject the null hypothesis if the
probability of observing your sampled result is less than the significance level. For a
typical significance level of 5%, the notation is $\alpha$ = 0.05. For this significance level, the
probability of incorrectly rejecting the null hypothesis when it is actually true is 5%. If you
need more protection from this error, then choose a lower value of $\alpha$.
The p-value is the probability, under the assumption that the null hypothesis is true, of
observing a sample result at least as extreme as the one obtained. If the p-value is less than
$\alpha$, then you reject the null hypothesis. For example, if $\alpha$ = 0.05 and the p-value is
0.03, then you reject the null hypothesis. The converse does not hold: if the p-value is greater
than $\alpha$, you have not shown that the null hypothesis is true; you only have insufficient
evidence to reject it.
The outputs for many hypothesis test functions also include confidence intervals. Loosely
speaking, a confidence interval is a range of values that have a chosen probability of
containing the true hypothesized quantity. Suppose, in the gas price example, that 1.10 is inside a 95%
confidence interval for the mean, $\mu$. That is equivalent to being unable to reject the null
hypothesis at a significance level of 0.05. Conversely, if the 100(1 - $\alpha$)% confidence interval
does not contain 1.10, then you reject the null hypothesis at the $\alpha$ level of significance.
Inference on the mean of a population, variance known

$H_0: \mu = \mu_0$
$H_1: \mu \neq \mu_0$   (3-22)

$Z_0 = (\bar{x} - \mu_0) / (\sigma / \sqrt{n})$   (3-23)

H1 in equation (3-22) is a two-sided alternative hypothesis.

The procedure for testing this hypothesis is to:
take a random sample of n observations on the random variable x,
compute the test statistic (3-23), and
reject H0 if $|Z_0| > Z_{\alpha/2}$, where $Z_{\alpha/2}$ is the upper $\alpha/2$ percentage point of the
standard normal distribution.
In some situations we may wish to reject H0 only if the true mean is larger
than $\mu_0$.
Thus, the one-sided alternative hypothesis is $H_1: \mu > \mu_0$, and we would reject
$H_0: \mu = \mu_0$ only if $Z_0 > Z_\alpha$.
If rejection is desired only when $\mu < \mu_0$,
then the alternative hypothesis is $H_1: \mu < \mu_0$, and we reject H0 only if $Z_0 < -Z_\alpha$.
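
The three rejection rules can be collected into one small Python function. This is a sketch, not part of the course material: the name `z_test`, its argument names, and the sample numbers are hypothetical, and the critical values come from the standard library's `NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, alpha=0.05, alternative="two-sided"):
    """Return (Z0, critical value, reject H0?) for a z-test with known sigma."""
    std_normal = NormalDist()
    z0 = (xbar - mu0) / (sigma / sqrt(n))  # test statistic (3-23)
    if alternative == "two-sided":         # H1: mu != mu0
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z0) > critical
    elif alternative == "greater":         # H1: mu > mu0
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z0 > critical
    elif alternative == "less":            # H1: mu < mu0
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z0 < -critical
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z0, critical, reject


# Hypothetical sample: n = 36, x-bar = 10.3, known sigma = 1, H0: mu = 10.
z_two, crit_two, reject_two = z_test(10.3, 10.0, 1.0, 36)
z_up, crit_up, reject_up = z_test(10.3, 10.0, 1.0, 36, alternative="greater")
print(round(z_two, 2), round(crit_two, 3), reject_two)
print(round(z_up, 2), round(crit_up, 3), reject_up)
```

With these made-up numbers Z0 is about 1.8, which is not significant against the two-sided critical value of about 1.96 but is significant against the one-sided value of about 1.645. This is why the alternative hypothesis should be chosen before looking at the data.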
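Confidence interval on the mean, variance known
A 100(1 - $\alpha$)% two-sided confidence interval on $\mu$ is
$\bar{x} - Z_{\alpha/2} \sigma/\sqrt{n} \le \mu \le \bar{x} + Z_{\alpha/2} \sigma/\sqrt{n}$
Furthermore, a 100(1 - $\alpha$)% upper confidence bound on $\mu$ is
$\mu \le \bar{x} + Z_\alpha \sigma/\sqrt{n}$
whereas a 100(1 - $\alpha$)% lower confidence bound on $\mu$ is
$\bar{x} - Z_\alpha \sigma/\sqrt{n} \le \mu$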
Using the p value

The p-value is the probability of obtaining a sample result
that is at least as unlikely as what is observed.
The p-value can be used to make the decision in a hypothesis
test by noting that:
if the p-value is less than the level of significance $\alpha$, the
value of the test statistic is in the rejection region.
if the p-value is greater than or equal to $\alpha$, the value of the
test statistic is not in the rejection region.
Reject H0 if the p-value < $\alpha$.
Steps of Hypothesis Testing

Using the p value
4. Collect the sample data and compute the value of the test statistic.
5. Use the value of the test statistic to compute the p value.
6. Reject H0 if p-value < $\alpha$.
Using the p-value

Given the observed result for Z0 or t0, and knowing the distribution of Z0 and t0
assuming the null hypothesis is true, it is possible to compute the probability (p-
value) of observing this result. A very small p-value casts doubt on the truth of the
null hypothesis. For example, suppose that the p-value was 0.001, meaning that
the probability of observing the given Z0 or t0 was one in a thousand. That should
make you skeptical enough about the null hypothesis that you reject it rather than
believe that your result was just a lucky 999 to 1 shot.
Example: Glow Toothpaste

Two-Tailed Tests about a Population Mean: Large n
The production line for Glow toothpaste is designed to fill tubes of toothpaste with
a mean weight of 6 ounces. Periodically, a sample of 30 tubes will be selected in
order to check the filling process. Quality assurance procedures call for the
continuation of the filling process if the sample results are consistent with the
assumption that the mean filling weight for the population of toothpaste tubes is 6
ounces; otherwise the filling process will be stopped and adjusted.
A hypothesis test about the population mean can be used to help determine when the
filling process should continue operating and when it should be stopped and corrected.
Hypotheses
$H_0: \mu = 6$

$H_1: \mu \neq 6$
Rejection Rule
Assuming a 0.05 level of significance,
Reject H0 if Z0 < -1.96 or if Z0 > 1.96

Two-Tailed Test about a Population Mean: Large n
Assume that a sample of 30 toothpaste tubes
provides a sample mean of 6.1 ounces and standard
deviation of 0.2 ounces.
Let n = 30, $\bar{x}$ = 6.1 ounces, $\sigma$ = 0.2 ounces

$Z_0 = (\bar{x} - \mu_0) / (\sigma / \sqrt{n}) = (6.1 - 6) / (0.2 / \sqrt{30}) = 2.74$

Since 2.74 > 1.96, we reject H0.
Conclusion: At the 0.05 level of significance, there is evidence that the mean
filling weight of the toothpaste tubes is not 6
ounces. The filling process should be stopped
and the filling mechanism adjusted.

Using the p-Value for a Two-Tailed Hypothesis Test
Suppose we define the p-value for a two-tailed test as double the
area found in the tail of the distribution.
With Z0 = 2.74, the standard normal probability table shows:
$1 - \Phi(2.74) = 1 - 0.996928 = 0.0031$
Considering the same probability of a larger difference in the lower tail of
the distribution, we have
p-value = 2(0.0031) = 0.0062
The p-value 0.0062 is less than $\alpha$ = 0.05, so H0 is rejected.
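
The Glow toothpaste calculation can be checked numerically. The Python sketch below uses the sample values from the example (n = 30, sample mean 6.1, standard deviation 0.2, hypothesized mean 6); the variable names are illustrative. Because it does not round Z0 or the tail area along the way, its p-value agrees with the table-based 0.0062 only to within rounding.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the Glow toothpaste example.
n, xbar, mu0, sigma, alpha = 30, 6.1, 6.0, 0.2, 0.05

z0 = (xbar - mu0) / (sigma / sqrt(n))       # test statistic
upper_tail = 1 - NormalDist().cdf(abs(z0))  # area beyond |Z0| in one tail
p_value = 2 * upper_tail                    # two-tailed p-value
reject_h0 = p_value < alpha

print(f"Z0 = {z0:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```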
Confidence Interval Approach to a Two-Tailed Test about a Population Mean

Select a simple random sample from the population and use the value of the
sample mean $\bar{x}$ to develop the confidence interval for the population mean $\mu$.
If the confidence interval contains the hypothesized value $\mu_0$, do not reject H0.
Otherwise, reject H0.
Confidence Interval Approach to a Two-Tailed Hypothesis Test

The 95% confidence interval for $\mu$ is

$\bar{x} \pm z_{\alpha/2} \sigma/\sqrt{n} = 6.1 \pm 1.96 (0.2/\sqrt{30}) = 6.1 \pm 0.0716$

or 6.0284 to 6.1716
Since the hypothesized value for the population mean, $\mu_0$ = 6, is not
in this interval, the hypothesis-testing conclusion is that the null
hypothesis, $H_0: \mu = 6$, can be rejected.
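
The same decision can be reproduced from the interval itself. This short Python sketch reuses the numbers of the Glow toothpaste example; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the Glow toothpaste example.
n, xbar, sigma, mu0, confidence = 30, 6.1, 0.2, 6.0, 0.95

z_half = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96
half_width = z_half * sigma / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width

# Reject H0 exactly when mu0 lies outside the confidence interval.
reject_h0 = not (lower <= mu0 <= upper)

print(f"95% CI: ({lower:.4f}, {upper:.4f}), reject H0: {reject_h0}")
```

For a two-sided test at level $\alpha$ and a 100(1 - $\alpha$)% interval built from the same statistic, the two approaches always give the same decision.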
Inference on the mean of a normal distribution with variance unknown
The test statistic is $t_0 = (\bar{x} - \mu_0) / (s / \sqrt{n})$, where s is the sample standard deviation.
For the two-sided alternative hypothesis, reject H0 if $|t_0| > t_{\alpha/2,n-1}$, where
$t_{\alpha/2,n-1}$ is the upper $\alpha/2$ percentage point of the t distribution with n - 1 degrees of
freedom.
For the one-sided alternative hypotheses,
If $H_1: \mu > \mu_0$, reject H0 if $t_0 > t_{\alpha,n-1}$, and
If $H_1: \mu < \mu_0$, reject H0 if $t_0 < -t_{\alpha,n-1}$
One could also compute the P-value for a t-test.
Confidence interval on the mean of a normal distribution with variance unknown
p-value:

p-value = $2[1 - F(|t_0|)]$ for a two-tailed test
p-value = $1 - F(t_0)$ for an upper-tailed test
p-value = $F(t_0)$ for a lower-tailed test

where F is the cdf of the t-distribution.
Inference on a population proportion

Hypothesis Testing
Confidence intervals on a population proportion
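The probability of type II error and sample size decisions

For the two-sided test with known variance, if the true mean is $\mu = \mu_0 + \delta$, the probability of type II error is

$\beta = \Phi(Z_{\alpha/2} - \delta\sqrt{n}/\sigma) - \Phi(-Z_{\alpha/2} - \delta\sqrt{n}/\sigma)$
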
Sample size calculation for two-tailed tests:

$n \approx (Z_{\alpha/2} + Z_\beta)^2 \sigma^2 / \delta^2$, where $\delta = \mu - \mu_0$
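
As a rough illustration of the sample size formula, the Python sketch below computes n for a hypothetical shift and then checks the achieved type II error using the standard two-sided expression for a z-test with known variance. The function names and the numbers ($\delta$ = 0.1, $\sigma$ = 0.2, $\alpha$ = 0.05, target $\beta$ = 0.10) are assumptions chosen for the example, not values from the slides.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def type_ii_error(delta, sigma, n, alpha=0.05):
    """P(fail to reject H0) for a two-sided z-test when mu = mu0 + delta."""
    z_half = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n) / sigma
    return STD_NORMAL.cdf(z_half - shift) - STD_NORMAL.cdf(-z_half - shift)


def sample_size(delta, sigma, alpha=0.05, beta=0.10):
    """Approximate n for a two-tailed test, rounded up to a whole sample."""
    z_half = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = STD_NORMAL.inv_cdf(1 - beta)
    return ceil((z_half + z_beta) ** 2 * sigma ** 2 / delta ** 2)


# Hypothetical design: detect a shift of 0.1 when sigma = 0.2.
n_required = sample_size(delta=0.1, sigma=0.2)
beta_achieved = type_ii_error(delta=0.1, sigma=0.2, n=n_required)
print(n_required, round(beta_achieved, 4))
```

Rounding n up guarantees that the achieved $\beta$ does not exceed the target.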
Inference for a difference in means, variances known

Statistical inference for two samples
Hypothesis tests for a difference in means, variances known
Confidence interval on a difference in means, variances known

Inference for a difference in means of two normal distributions: variances unknown
Hypothesis Tests for the Difference in Means
Example 3.9
The top figure shows comparative box plots for the yield data for the two types of catalysts.
These comparative boxplots indicate that there is no obvious difference in the median of the
two samples, although the second sample has a slightly larger sample dispersion or variance.
There are no exact rules for comparing two samples with boxplots; their primary value is in
the visual impression they provide as a tool for explaining the results of a hypothesis test,
as well as in the verification of assumptions.

[Figure (top): comparative box plots of Yield by Catalyst type (1, 2).]

The bottom figure shows the normal probability plot of the two samples of yield data. Note
that both samples plot approximately along straight lines, and the straight lines for each
sample have similar slopes (i.e. similar standard deviations). Hence, we conclude that the
normality and equal variances assumptions are reasonable.

[Figure (bottom): Normal Probability Plot. Probability versus Data for catalyst 1 and catalyst 2.]
Pooled-Variance t-Test Example
You are a financial analyst for a brokerage firm. Is there a
difference in dividend yield between stocks listed on the NYSE
and NASDAQ? You collect the following data:

                  NYSE    NASDAQ
Number            21      25
Sample mean       3.27    2.53
Sample std dev    1.30    1.16

Assuming both populations are approximately normal with
equal variances, is there a difference in mean
yield ($\alpha$ = 0.05)?
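Pooled-Variance t Test Example: Calculating the Test Statistic (continued)
$H_0: \mu_1 - \mu_2 = 0$, i.e. ($\mu_1 = \mu_2$)
$H_1: \mu_1 - \mu_2 \neq 0$, i.e. ($\mu_1 \neq \mu_2$)
The test statistic is:

$t_0 = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 (1/n_1 + 1/n_2)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021 (1/21 + 1/25)}} = 2.040$

where the pooled variance is

$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1) 1.30^2 + (25 - 1) 1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$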
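Pooled-Variance t Test Example: Hypothesis Test Solution
$H_0: \mu_1 - \mu_2 = 0$, i.e. ($\mu_1 = \mu_2$)
$H_1: \mu_1 - \mu_2 \neq 0$, i.e. ($\mu_1 \neq \mu_2$)
$\alpha$ = 0.05
df = 21 + 25 - 2 = 44
Critical Values: t = ± 2.0154
Test Statistic:

$t_0 = \frac{3.27 - 2.53}{\sqrt{1.5021 (1/21 + 1/25)}} = 2.040$

[Figure: t distribution with rejection regions of area 0.025 in each tail, beyond -2.0154 and 2.0154; the test statistic 2.040 falls in the upper rejection region.]

Decision: Reject H0 at $\alpha$ = 0.05
Conclusion: There is evidence of a difference in means.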
Pooled-Variance t Test Example: Confidence Interval for $\mu_1 - \mu_2$
Since we rejected H0, can we be 95% confident that $\mu_{NYSE} > \mu_{NASDAQ}$?
95% Confidence Interval for $\mu_{NYSE} - \mu_{NASDAQ}$:

$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2, n_1+n_2-2} \sqrt{S_p^2 (1/n_1 + 1/n_2)} = 0.74 \pm 2.0154 \times 0.3628 = (0.009, 1.471)$

Since 0 is less than the entire interval, we can be 95% confident that
$\mu_{NYSE} > \mu_{NASDAQ}$
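
The whole NYSE versus NASDAQ example can be reproduced from the summary statistics. The Python sketch below is an illustration only: the variable names are made up, and the critical value 2.0154 (t with 44 degrees of freedom, upper 0.025 point) is copied from the slide rather than computed, because the Python standard library has no t distribution.

```python
from math import sqrt

# Summary statistics from the example: NYSE is sample 1, NASDAQ is sample 2.
n1, xbar1, s1 = 21, 3.27, 1.30
n2, xbar2, s2 = 25, 2.53, 1.16
T_CRIT = 2.0154  # t_{0.025, 44}, taken from the slide

# Pooled variance: weighted average of the two sample variances.
sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
std_err = sqrt(sp2 * (1 / n1 + 1 / n2))  # standard error of xbar1 - xbar2

t0 = (xbar1 - xbar2) / std_err           # test statistic under H0: mu1 = mu2
reject_h0 = abs(t0) > T_CRIT

diff = xbar1 - xbar2
ci_lower, ci_upper = diff - T_CRIT * std_err, diff + T_CRIT * std_err

print(f"Sp^2 = {sp2:.4f}, t0 = {t0:.3f}, reject H0: {reject_h0}")
print(f"95% CI for mu1 - mu2: ({ci_lower:.3f}, {ci_upper:.3f})")
```

The lower end of the interval is only just above 0, which matches a test statistic (2.040) that only narrowly exceeds the critical value (2.0154).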
