
Majd Shhadi, PhD

First Semester 2018/2019


Hypothesis Testing

Hypothesis testing is a decision-making process for evaluating claims about a population.

Hypothesis testing uses sample statistics to judge whether the observed data are consistent with a given hypothesis.

When comparing properties of interest between two populations, the experimenter is confronted with two possibilities:
• Either the properties of interest are essentially the same in both populations,
• Or the properties of interest are significantly different.

In hypothesis testing:
• Define the population under study.
• State the particular hypotheses that will be investigated.
• Give the significance level.
• Select a sample from the population.
• Collect the data.
• Perform the calculations required for the statistical test and reach a conclusion.

This section considers hypotheses concerning parameters such as means.

There are two specific statistical tests used for hypotheses concerning means:
• z test
• t test

• The null hypothesis, symbolized by H0, is a statistical hypothesis that states that there is no difference between a parameter and a specific value, or that there is no difference between two parameters.

• The alternative hypothesis, symbolized by H1, is a statistical hypothesis that states the existence of a difference between a parameter and a specific value, or states that there is a difference between two parameters.

• The null hypothesis is the hypothesis that is always tested.

• The alternative hypothesis is set up as the opposite of the null hypothesis and represents the conclusion supported if the null hypothesis is rejected.

• The null hypothesis always refers to a specified value of the population parameter (such as μ), not to a sample statistic (such as X̄).

• The statement of the null hypothesis always contains an equal sign regarding the specified value of the parameter.

• The statement of the alternative hypothesis never contains an equal sign regarding the specified value of the parameter.

Hypothesis Testing: Situation A

A medical researcher is interested in finding out whether a new medication will have any undesirable side effects. The researcher is particularly concerned with the pulse rate of the patients who take the medication. Will the pulse rate increase, decrease, or remain unchanged after a patient takes the medication?

• Since the researcher knows that the mean pulse rate for the population under study is 82 beats per minute, the hypotheses for this situation are

H0: μ = 82
H1: μ ≠ 82

• The null hypothesis specifies that the mean will remain unchanged, and the alternative hypothesis states that it will be different.

• This test is called a two-tailed test since the possible side effects of the medicine could raise or lower the pulse rate.

Hypothesis Testing: Situation B

• A chemist invents an additive to increase the life of an automobile battery. If the mean lifetime of the automobile battery is 36 months, then his hypotheses are

H0: μ ≤ 36
H1: μ > 36

• In this situation, the chemist is interested only in increasing the lifetime of the batteries, so his alternative hypothesis is that the mean is greater than 36 months.

• The null hypothesis is that the mean is less than or equal to 36 months. This test is called right-tailed, since the interest is in an increase only.

Hypothesis Testing: Situation C

• A process engineer wishes to lower heat loss by using a special type of insulation around pipes.

• If the average heat loss is 78 kW, his hypotheses about heat loss with the use of insulation are:

H0: μ ≥ 78
H1: μ < 78

• This test is a left-tailed test, since the process engineer is interested only in lowering heat loss.

• The null and alternative hypotheses are stated together, and the null hypothesis contains the equals sign.

• A statistical test uses the data obtained from a sample to make a decision about whether the null hypothesis should be rejected.

• The numerical value obtained from a statistical test is called the test value.

• In this type of statistical test, the mean is computed for the data obtained from the sample and is compared with the population mean.

• Then a decision is made to reject or not reject the null hypothesis on the basis of the value obtained from the statistical test.

Region of Rejection and Nonrejection

The sampling distribution of the test statistic is divided into two regions:
• Region of rejection (critical region)
• Region of nonrejection

• If the test statistic falls into the region of nonrejection, the null hypothesis is not rejected.

• If the test statistic falls into the rejection region, the null hypothesis is rejected.

Risk in Decision Making

Depending on the specific decision, either
1. one of the two types of errors can occur, or
2. one of the two types of correct conclusions can be reached.

Type I error: occurs if one rejects the null hypothesis when it is true.

• If a null hypothesis is true and it is rejected, then a type I error is made. The probability that a type I error occurs is α.

• In situation A, the medication might not significantly change the pulse rate of all the users in the population, but it might change the rate of the subjects in the sample (by chance).

• In this case, the researcher will reject the null hypothesis when it is really true, thus committing a type I error.

Type II error: occurs if one does not reject the null hypothesis when it is false.

• A type II error occurs if the null hypothesis H0 is not rejected when in fact it is false and should be rejected. The probability that a type II error occurs is β.

• The medication might not change the pulse rate of the subjects in the sample, but it might cause a significant increase or decrease in the pulse rate of the subjects in the population.

• The researcher, on the basis of the data obtained from the sample, will not reject the null hypothesis, thus committing a type II error.

• The decision is made on the basis of probabilities. That is, when there is a large difference between the mean obtained from the sample and the hypothesized mean, the null hypothesis is probably not true.

• The question is, how large a difference is necessary to reject the null hypothesis? Here is where the level of significance is used.

• The level of significance is the maximum probability of committing a type I error.

• This probability is symbolized by α. That is,

P(type I error) = α

• The probability of a type II error is symbolized by β. That is,

P(type II error) = β

• Statisticians generally agree on using three arbitrary significance levels: the 0.10, 0.05, and 0.01 levels.

• That is, when the null hypothesis is true, the probability of committing a type I error will be 10%, 5%, or 1%, depending on which level of significance is used.

Here is another way of putting it:

• when α = 0.10, there is a 10% chance of rejecting a true null hypothesis;

• when α = 0.05, there is a 5% chance of rejecting a true null hypothesis; and

• when α = 0.01, there is a 1% chance of rejecting a true null hypothesis.

• After a significance level is chosen, a critical value is selected from a table for the appropriate test.
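The reading of α as the chance of rejecting a true null hypothesis can be checked with a small simulation. The sketch below is only an illustration: the function name, the population standard deviation of 10, the sample size of 25, and the number of trials are assumptions that do not come from the slides; only the mean of 82 is borrowed from situation A.

```python
import random
from statistics import NormalDist, fmean

def type_one_error_rate(alpha, mu0=82.0, sigma=10.0, n=25, trials=5000, seed=1):
    """Estimate how often a two-tailed z test rejects a TRUE null hypothesis.

    sigma, n, trials and seed are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population in which H0 (mu = mu0) really holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # a type I error: H0 is true but was rejected
    return rejections / trials

for alpha in (0.10, 0.05, 0.01):
    print(alpha, type_one_error_rate(alpha))
```

Each printed rejection rate should land close to its α, up to simulation noise.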

• If a z test is used, for example, the z table is consulted to find the critical value. The critical value determines the critical and noncritical regions.
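Instead of a printed z table, the critical values can be obtained from the inverse of the standard normal distribution. The following is a minimal sketch using Python's standard library; the helper name `z_critical` and its `tail` argument are illustrative choices, not something defined in these slides.

```python
from statistics import NormalDist

def z_critical(alpha, tail="two"):
    """Critical z value(s) for a significance level alpha and a tail type."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)  # alpha is split between both tails
        return (-z, z)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right', or 'left'")

for alpha in (0.10, 0.05, 0.01):
    lower, upper = z_critical(alpha)
    left = z_critical(alpha, "left")
    print(f"alpha={alpha:.2f}: two-tailed {lower:.3f} and {upper:.3f}, left-tailed {left:.3f}")
```

For α = 0.05 this reproduces the familiar table values of ±1.96 for a two-tailed test and -1.645 for a left-tailed test.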

• The critical value can be on the right side of the mean or on the left side of the mean for a one-tailed test.

• Its location depends on the inequality sign of the alternative hypothesis.

• In situation C, where the process engineer is interested in lowering the heat loss, the alternative hypothesis is

H1: μ < 78

• Hence, the critical value falls to the left of the mean. This test is thus a left-tailed test.

• One way in which we can control the probability of making a type II error in a study is to increase the size of the sample.

• Larger sample sizes generally permit us to detect even very small differences between sample statistics and true population parameters.
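The effect of sample size on the type II error can be illustrated for a two-tailed z test. This is a sketch under stated assumptions: the true shift `delta = 2` and the standard deviation `sigma = 10` are hypothetical numbers chosen only for illustration, and the function name is not from the slides.

```python
from statistics import NormalDist

def power_two_tailed_z(delta, sigma, n, alpha=0.05):
    """Probability of rejecting H0 when the true mean differs from mu0 by delta."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta / (sigma / n ** 0.5)  # true shift in standard-error units
    # H0 is rejected when Z > z_crit or Z < -z_crit; with a true shift, Z ~ N(shift, 1).
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

# Hypothetical scenario: true mean is 2 units away from mu0, sigma = 10.
for n in (10, 25, 100, 400):
    beta = 1 - power_two_tailed_z(delta=2.0, sigma=10.0, n=n)
    print(f"n={n}: beta = {beta:.3f}")
```

With α held fixed, the printed β shrinks as n grows, which is the behaviour the bullets above describe.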

Z Test of Hypothesis for the Mean (σ Known)

• The sampling distribution of the mean follows the normal distribution, resulting in the following test statistic:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

• If a level of significance of 0.05 is selected, the proportion of the area in the rejection region is 0.05 (0.025 in each tail for a two-tailed test), and the critical values of the normal distribution can be determined (expressed as Z values).

• The decision rule is:
Reject H0 if Z > +1.96 or if Z < -1.96
Otherwise do not reject H0

One Sample Test for the Mean

Manufacture of ball bearings: has the average diameter remained at 0.503 inch?
• Take a random sample of 25 ball bearings.

• Measure the diameter of each one.

• Evaluate the difference between the sample statistic and the hypothesized population parameter by comparing the mean diameter in inches from the sample to the expected mean of 0.503 inch:
H0: μ = 0.503
H1: μ ≠ 0.503
• Assume the standard deviation (σ) is known, for large samples or for samples from a normal population.

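Suppose that the sample of 25 ball bearings indicates a sample mean (x̄) of 0.5018 inch and that the population standard deviation (σ) is assumed to remain at 0.004 inch.

$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{0.5018 - 0.503}{0.004 / \sqrt{25}} = -1.50$$

• Because Z = -1.50 lies between -1.96 and +1.96, our decision is not to reject H0.

• There is insufficient evidence to conclude that the average diameter differs from 0.503 inch.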
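The ball-bearing calculation can be reproduced in a few lines. This is a minimal sketch with the numbers from the example (sample mean 0.5018, hypothesized mean 0.503, σ = 0.004, n = 25); the function name and return layout are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(x_bar, mu0, sigma, n, alpha=0.05):
    """Two-tailed z test for a mean with known population standard deviation."""
    z = (x_bar - mu0) / (sigma / sqrt(n))          # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    p_value = 2 * NormalDist().cdf(-abs(z))        # area in both tails beyond |z|
    return z, z_crit, p_value, abs(z) > z_crit

z, z_crit, p_value, reject = one_sample_z_test(x_bar=0.5018, mu0=0.503, sigma=0.004, n=25)
print(f"Z = {z:.2f}, critical values = -{z_crit:.2f} and +{z_crit:.2f}")
print(f"two-tailed P-value = {p_value:.3f}, reject H0: {reject}")
```

The statistic comes out at -1.50, inside the interval from -1.96 to +1.96, so H0 is not rejected at α = 0.05.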

The P-Value Approach to Hypothesis Testing

• State the null hypothesis, H0.
• State the alternative hypothesis, H1.
• Choose the level of significance, α.
• Choose the sample size, n.
• Collect the data and compute the sample value of the appropriate test statistic.
• Using the computed test statistic, read the corresponding tail area, αC.V, from the table.
• Compare αC.V to α/2 (for a two-tailed test).
• Make the statistical decision. If αC.V is greater than or equal to α/2, the null hypothesis is not rejected.
• If αC.V is smaller than α/2, the null hypothesis is rejected.

Confidence Interval and Hypothesis Testing

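• For a confidence level of 95%, the confidence interval for the mean in the ball-bearing example is

$$\bar{x} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 0.5018 \pm 1.96 \cdot \frac{0.004}{\sqrt{25}} = 0.5018 \pm 0.0016$$

that is, 0.5002 ≤ μ ≤ 0.5034.

• Because the interval includes the hypothesized value 0.503 inch, we do not reject the null hypothesis. There is insufficient evidence to conclude that the mean diameter is not 0.503 inch.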
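The link between the confidence interval and the two-tailed test can be sketched with the ball-bearing numbers used earlier (x̄ = 0.5018, σ = 0.004, n = 25). The helper name `z_confidence_interval` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    half_width = z * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width

lower, upper = z_confidence_interval(x_bar=0.5018, sigma=0.004, n=25)
print(f"{lower:.4f} <= mu <= {upper:.4f}")
# H0: mu = 0.503 is not rejected at alpha = 0.05 when 0.503 lies inside the interval.
print("0.503 inside interval:", lower <= 0.503 <= upper)
```

Checking whether the hypothesized value lies inside the 95% interval gives the same decision as the two-tailed z test at α = 0.05.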

One-Tailed Tests

• In some situations, the alternative hypothesis focuses on a particular direction.

• A company that makes processed cheese is interested in determining whether some suppliers who provide milk for the processing operation are adding water to their milk to increase the amount supplied to the processing operation.

• Water reduces the freezing point of the milk.

• The freezing point of milk is normally distributed with a mean of -0.545 °C. The standard deviation is 0.008 °C.

• Because the cheese company is interested only in determining whether the freezing point of the milk is less than that which would be expected from natural milk, the entire rejection region is located in the lower tail of the distribution:

H0: μ ≥ -0.545 °C
H1: μ < -0.545 °C

• If we choose a level of significance α = 0.05 (confidence level P = 0.95),

• then the critical value from the table is -1.645.

• The decision rule is:
Reject H0 if Z < -1.645
Otherwise do not reject H0

• Rejecting H0 here means that the company should pursue an investigation of the milk supplier.

t Test of Hypothesis

t Test of Hypothesis for the Mean (σ Unknown)

• For most hypothesis testing situations dealing with numerical data, the standard deviation σ of the population is unknown.

• However, the standard deviation of the population can be estimated by the standard deviation of the sample, S.

• S gives an estimate of σ, and x̄ gives an estimate of μ.

• For large samples (n ≥ 30) we get a very good approximate answer, but in practice we usually have fewer than 30 results.

• As the number of results goes down, S and x̄ become poorer estimates of σ and μ.

• The uncertainty grows, so the confidence limits must be wider.

• This is allowed for by using a more widely spread distribution called the “t-distribution”.

• The t-distribution has the same general bell shape as the normal distribution, but it is wider.

• The width of the t-distribution is a function of the degrees of freedom (ν).

• As the number of degrees of freedom increases, the width of the distribution decreases.

• When ν becomes very large (n > 30), the distribution curve almost coincides with the normal distribution.

• When we have an estimate of σ with at least 30 degrees of freedom, we can say that σ is known for all practical purposes.

• When ν ≥ 30, use the normal distribution.

• When ν < 30, use the t-distribution.

When to Use the z or t Distribution

t-Distribution

• Any t-value, tα, has a fraction α of the total area under the curve lying above it.

• Similarly, there is a fraction α lying below -tα.

• The test statistic t for determining the difference between the sample mean and the population mean when the sample standard deviation S is used is given by

$$t = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$

• where the test statistic t follows a t distribution having n - 1 degrees of freedom.

• Example: The nominal value of the capacitors (determined by the manufacturer) is 0.330 F.

• Suppose that a sample of 28 capacitors is tested and the results are as follows:

• The manufacturer is interested in whether there is evidence of a change in the average capacitance from the nominal value of 0.330, so the test is two-tailed and the following null and alternative hypotheses are established:

H0: μ = 0.330
H1: μ ≠ 0.330

• If a confidence level of P = 0.95 (α = 0.05, so α/2 = 0.025 in each tail) is selected, the critical values of the t-distribution with 28 - 1 = 27 degrees of freedom can be obtained from the table (t27 = 2.0518).

• The decision rule is as follows:

• Reject H0 if t < -t27 = -2.0518 or if t > +t27 = +2.0518;

• otherwise do not reject H0.

• Because t = 0.882 falls within the non-rejection region between the critical values ±t27 = ±2.0518, we cannot reject H0, and we conclude that there is insufficient evidence to believe that the average capacitance is different from 0.330.

• The observed difference is not significant and is likely due to chance.

• The P-Value Approach to Hypothesis Testing

• If the tail area c.v is greater than or equal to α/2, the null hypothesis is not rejected.

• If c.v is smaller than α/2, the null hypothesis is rejected.

• ν = 27

• t(calculated) = 0.882 → c.v = 0.19 > α/2 = 0.025 (with α = 0.05)

• Therefore the null hypothesis is not rejected.
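The tail area read from the t table can also be approximated numerically. Python's standard library has no t distribution, so the sketch below integrates the t density with Simpson's rule; this is an illustrative substitute for the table lookup, and the function names are assumptions rather than part of the slides.

```python
from math import gamma, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Area above |t|: 0.5 minus the integral of the density over [0, |t|].

    Uses Simpson's rule, so steps must be even.
    """
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

tail = t_upper_tail(0.882, df=27)
print(f"one-tail area = {tail:.3f}, two-tailed P-value = {2 * tail:.3f}")
print(f"area above the critical value 2.0518: {t_upper_tail(2.0518, df=27):.4f}")
```

The one-tail area for t = 0.882 with 27 degrees of freedom comes out near the 0.19 quoted above, and the area above 2.0518 is close to α/2 = 0.025, consistent with the tabulated critical value.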

Independent Means

• Sometimes we wish to compare two sets of data which have been obtained separately from each other:

a. Two sets of samples: one is tested immediately, but the other is subjected to some extra treatment before testing (if the same specimens are measured both times, this becomes a case of related means, i.e., repeated measures).

b. Samples from two different laboratories or plants (unrelated means).

• These cases are said to have independent means.

• Suppose that in case a) one given set of specimens is tested before and after treatment. In this situation, we have related means.

• Another example of related means is when we have several batches of material and we analyze each batch by two different methods.

t Test for the Difference between the Means of Two Independent Groups (Unrelated Means)

• Suppose we have two independent populations, each with a mean and a standard deviation:

Population 1: μ1, σ1
Population 2: μ2, σ2

• Suppose a random sample of size n1 is taken from the first population, and

• a random sample of size n2 is taken from the second population.

• The test statistic used to determine the difference between the population means μ1 − μ2 is based on the difference between the sample means X̄1 − X̄2.

• Because of the central limit theorem, the test statistic follows the standard normal distribution for sufficiently large samples.

• The Z test for the difference between two means is as follows:

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$

Note:
• When n1 ≥ 30 and n2 ≥ 30, s1² and s2² can be used in place of σ1² and σ2².

• In most cases we do not know the actual variance or standard deviation of either of the two populations.

• The information usually obtainable is:
  • the sample means X̄1 and X̄2,
  • the sample variances s1² and s2²,
  • and the sample standard deviations s1 and s2.

• If the assumptions are made that the samples are randomly and independently drawn from respective populations that are normally distributed and that the population variances are equal (σ1² = σ2²), a pooled-variance t test can be used to determine whether there is a significant difference between the means of the two populations.

• To test the null hypothesis of no difference between the means of two independent populations,

H0: μ1 = μ2

• against the alternative that the means are not the same,

H1: μ1 ≠ μ2

• Pooled-variance t test for the difference between two means:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$

• where

$$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$$

• sp² is the pooled variance.

The test statistic t follows a t distribution with (n1 + n2 − 2) degrees of freedom.

• The test statistic requires that we pool, or combine, the two sample variances s1² and s2² to obtain sp², the best estimate of the variance common to both populations under the assumption that the two population variances are equal.

• A pooled estimate of the variance is a weighted average of the two sample variances, using the degrees of freedom of each variance as the weights.

• The pooled estimate of variance is used to calculate the standard error in the t test when the variances are equal.

• For a given level of significance, α, in a two-tailed test, we reject the null hypothesis

• if the computed t test statistic exceeds the upper-tail critical value from the t distribution, or

• if the computed test statistic falls below the lower-tail critical value from the t distribution.

Example: The question to be determined is whether the components classified as unflawed had (on average) a smaller crack size than the components classified as flawed.

Example: Portland Cement Formulation (page 23)

Prof. Amer EL-Hamouz

Dot Diagram and Box Plots

Box Plots

• This display indicates some difference in mean strength between the two formulations.

• It also indicates that both formulations produce reasonably symmetric distributions of strength with similar variability or spread.

• Dot diagrams, histograms, and box plots are useful for summarizing the information in a sample of data. To describe the data more completely, we need the concept of a probability distribution.

The Hypothesis Testing Framework

• Two-sample t-test
• Sampling from a normal distribution

• Statistical hypotheses:
H0: μ1 = μ2
H1: μ1 ≠ μ2

Summary Statistics

                      Formulation 1      Formulation 2
                      “New recipe”       “Original recipe”
Sample mean           ȳ1 = 16.76         ȳ2 = 17.04
Sample variance       S1² = 0.100        S2² = 0.061
Sample std. dev.      S1 = 0.316         S2 = 0.248
Sample size           n1 = 10            n2 = 10

How the Two-Sample t-Test Works

Use the sample means to draw inferences about the population means.

Difference in sample means:
ȳ1 − ȳ2 = 16.76 − 17.04 = −0.28

Variance of a sample mean (the basis for the standard deviation of the difference in sample means):
$$\sigma_{\bar{y}}^2 = \frac{\sigma^2}{n}$$

This suggests a statistic:
$$Z_0 = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$

Use S1² and S2² to estimate σ1² and σ2².

The previous ratio becomes
$$\frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$

However, we have the case where σ1² = σ2² = σ².

Pool the individual sample variances:
$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$$

The test statistic is
$$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

• Values of t0 that are near zero are consistent with the null hypothesis.
• Values of t0 that are very different from zero are consistent with the alternative hypothesis.
• t0 is a “distance” measure: how far apart the averages are, expressed in standard deviation units.
• Notice the interpretation of t0 as a signal-to-noise ratio.

The Two-Sample (Pooled) t-Test

$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2} = \frac{9(0.100) + 9(0.061)}{10 + 10 - 2} = 0.081$$

$$S_p = 0.284$$

$$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{16.76 - 17.04}{0.284 \sqrt{\frac{1}{10} + \frac{1}{10}}} = -2.20$$

The two sample means are a little over two standard deviations apart.
Is this a "large" difference?
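The pooled calculation can be reproduced from the summary statistics of the two cement formulations (means 16.76 and 17.04, variances 0.100 and 0.061, ten observations each). This is a minimal sketch; the function name and return layout are illustrative assumptions.

```python
from math import sqrt

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Pooled-variance t statistic for two independent samples."""
    df = n1 + n2 - 2
    # Weighted average of the sample variances, weights = degrees of freedom.
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    t0 = (mean1 - mean2) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))
    return t0, sp2, df

t0, sp2, df = pooled_t(16.76, 0.100, 10, 17.04, 0.061, 10)
print(f"Sp^2 = {sp2:.4f}, Sp = {sqrt(sp2):.3f}, t0 = {t0:.2f}, df = {df}")
```

Carrying full precision gives t0 of about -2.21; the -2.20 in the worked calculation comes from rounding Sp to 0.284 before dividing.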


• So far, we haven’t really done any “statistics”.
• We need an objective basis for deciding how large the test statistic t0 = -2.20 really is.
• In 1908, W. S. Gosset derived the reference distribution for t0, called the t distribution.
• Tables of the t distribution are given in the text.

• A value of t0 between -2.101 and 2.101 is consistent with equality of means.
• It is possible for the means to be equal and t0 to exceed 2.101 or fall below -2.101, but it would be a “rare event”, which leads to the conclusion that the means are different.
• Could also use the P-value approach.

• The P-value is the smallest level of significance at which the null hypothesis of equal means would be rejected; it measures how rare the observed t0 = -2.20 would be if the means were equal.
• The P-value in our problem is P = 0.042.

Minitab Two-Sample t-Test Results

Checking Assumptions: The Normal Probability Plot

Importance of the t-Test

• Provides an objective framework for simple comparative experiments.

• Could be used to test all relevant hypotheses in a two-level factorial design, because all of these hypotheses involve the mean response at one “side” of the cube versus the mean response at the opposite “side” of the cube.
