8-1 Overview
Ch 8 Hypothesis Testing
A hypothesis is a statement about a population, usually of the form that a certain parameter takes a
particular numerical value or falls in a certain range of values
The main goal in many research studies is to check whether the data support certain hypotheses
Each significance test has two hypotheses:
The null hypothesis is a statement that the parameter takes a particular value. It has a single
parameter value.
The alternative hypothesis states that the parameter falls in some alternative range of values.
Null and Alternative Hypotheses
The value in the null hypothesis usually represents no effect
The symbol H0 denotes the null hypothesis.
The value in the alternative hypothesis usually represents an effect of some type
The symbol Ha (also written H1 or HA) denotes the alternative hypothesis.
The alternative hypothesis should express what the researcher hopes to show.
The hypotheses should be formulated before viewing or analyzing the data!
3. Calculate the test statistic
A test statistic describes how far the point estimate falls from the parameter value given in the null
hypothesis (usually in terms of the number of standard errors between the two).
If the test statistic falls far from the value suggested by the null hypothesis in the direction specified by the
alternative hypothesis, it is good evidence against the null hypothesis and in favor of the alternative
hypothesis.
We use the test statistic to assess the evidence against the null hypothesis by giving a probability, the P-value.
4. P-Value, Critical region, or Confidence Interval
To interpret a test statistic value, we use a probability summary of the evidence against the null hypothesis, H0.
First, we presume that H0 is true
Next, we consider the sampling distribution from which the test statistic comes
We summarize how far out in the tail of this sampling distribution the test statistic falls
We summarize how far out in the tail the test statistic falls by the tail probability of that value and values
even more extreme
This probability is called a P-value
The smaller the P-value, the stronger the evidence is against H0
The P-value is the probability that the test statistic equals the observed value or a value even more extreme
This section presents individual components of a hypothesis test, and the following sections use those components in
comprehensive procedures.
The role of the following should be understood:
null hypothesis
alternative hypothesis
test statistic
critical region
significance level
critical value
P-value
Type I and II error
Example 1: Gender Selection and Probability Page 388
Let's again refer to the Gender Choice product that was once distributed by ProCare Industries. ProCare Industries
claimed that couples using the pink packages of Gender Choice would have girls at a rate that is greater than 50% or
0.5. Let's again consider an experiment whereby 100 couples use Gender Choice in an attempt to have a baby girl;
let's assume that the 100 babies include exactly 52 girls, and let's formalize some of the analysis.
Under normal circumstances the proportion of girls is p = 0.5, so a claim that Gender Choice is effective can be
expressed as p > 0.5. We support the claim of p > 0.5 only if a result such as 52 girls is unlikely (with a small
probability, such as 0.05).
Using a normal distribution as an approximation to the binomial distribution, we find P(52 or more girls in 100
births) = 0.3821.
Figure 8-1, following, shows that with a probability of 0.5, the outcome of 52 girls in 100 births is not unusual.
We do not reject random chance as a reasonable explanation. We conclude that the proportion of girls born to
couples using Gender Choice is not significantly greater than the number that we would expect by random chance.
Observations
Claim: For couples using Gender Choice, the proportion of girls is p > 0.5.
Working assumption: The proportion of girls is p = 0.5 (with no effect from Gender Choice).
The sample resulted in 52 girls among 100 births, so the sample proportion is p̂ = 52/100 = 0.52.
Assuming that p = 0.5, we use a normal distribution as an approximation to the binomial distribution to find
that P(at least 52 girls in 100 births) = 0.3821.
There are two possible explanations for the result of 52 girls in 100 births: Either a random chance event (with
probability 0.3821) has occurred, or the proportion of girls born to couples using Gender Choice is greater than
0.5.
There isn't sufficient evidence to support Gender Choice's claim.
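The probability 0.3821 quoted above can be reproduced with a short calculation. This is a minimal sketch, assuming the normal approximation is applied with a continuity correction (the notes do not say so explicitly, but that choice matches the stated value). The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

n = 100      # number of births in the experiment
p = 0.5      # proportion of girls assumed under "no effect"

# Mean and standard deviation of the binomial count of girls.
mu = n * p
sigma = sqrt(n * p * (1 - p))

# Assumption: continuity correction, so "52 or more" starts at 51.5.
p_at_least_52 = 1 - NormalDist(mu, sigma).cdf(51.5)
print(round(p_at_least_52, 4))
```

Because this probability is far above a cutoff such as 0.05, the result of 52 girls is not unusual under p = 0.5.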
Components of a Formal Hypothesis Test
Null Hypothesis: H0
The null hypothesis (denoted by H0) is a statement that the value of a population parameter (such as
proportion, mean, or standard deviation) is equal to some claimed value.
A statistical hypothesis that contains a statement of equality such as ≤, =, or ≥.
We test the null hypothesis directly.
Either reject H0 or fail to reject H0.
Alternative Hypothesis: Ha
The alternative hypothesis (denoted by H1 or Ha or HA) is the statement that the parameter has a value that
somehow differs from the null hypothesis.
The symbolic form of the alternative hypothesis must use one of these symbols: ≠, <, >.
Must be true if H0 is false.
Note about Forming Your Own Claims (Hypotheses)
If you are conducting a study and want to use a hypothesis test to support your claim, the claim must be worded so
that it becomes the alternative hypothesis.
Example 2: Identifying H0 and Ha
You are testing a new design for air bags used in automobiles, and you are concerned that they might not open
properly. State the null and alternative hypotheses.
Solution: The two opposing possibilities are "Bags open properly" and "Bags do not open properly." Testing could
produce evidence that discredits the hypothesis "Bags open properly"; moreover, your concern is that "Bags do not open
properly." Therefore, "Bags do not open properly" would become the alternative hypothesis and "Bags open
properly" would be the null hypothesis.
Example 3: Identifying H0 and Ha
An engineer wishes to show that the new formula that was just developed results in a quicker-drying paint. State the
null and alternative hypotheses.
Solution: The two opposing possibilities are "does dry quicker" and "does not dry quicker." Because the engineer
wishes to show "does dry quicker," the alternative hypothesis is "Paint made with the new formula does dry quicker"
and the null hypothesis is "Paint made with the new formula does not dry quicker."
Example 4: Identifying H0 and Ha
You suspect that a brand-name detergent outperforms the store's brand of detergent, and you wish to test the two
detergents because you would prefer to buy the cheaper store brand. State the null and alternative hypotheses.
Solution: Your suspicion, "The brand-name detergent outperforms the store brand," is the reason for the test and
therefore becomes the alternative hypothesis.
H0 : There is no difference in detergent performance.
Ha : The brand-name detergent performs better than the store brand.
However, as a consumer, you are hoping not to reject the null hypothesis for budgetary reasons.
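The three possible pairs of hypotheses about a population mean μ and a claimed value k are:
H0: μ ≤ k and Ha: μ > k
H0: μ ≥ k and Ha: μ < k
H0: μ = k and Ha: μ ≠ k
Regardless of which pair of hypotheses you use, you always assume μ = k and examine the sampling distribution on
the basis of this assumption.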
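Test statistic for mean: \( z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \) or \( t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \)
Test statistic for standard deviation: \( \chi^2 = \frac{(n - 1) s^2}{\sigma^2} \)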
Example 8: Find the test statistic Page 392
A survey of n = 880 randomly selected adult drivers showed that 56% (or p̂ = 0.56) of those respondents admitted to
running red lights. Find the value of the test statistic for the claim that the majority of all adult drivers admit to
running red lights.
Solution: The preceding example showed that the given claim results in the following null and alternative
hypotheses: H0: p = 0.5 and Ha: p > 0.5. Because we work under the assumption that the null hypothesis is
true with p = 0.5, we get the following test statistic:
\[ z = \frac{\hat{p} - p}{\sqrt{pq/n}} = \frac{0.56 - 0.5}{\sqrt{(0.5)(0.5)/880}} = 3.56 \]
Interpretation: We know from previous chapters that a z score of 3.56 is exceptionally large. It appears that in
addition to being more than half, the sample result of 56% is significantly more than 50%.
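The arithmetic in Example 8 can be checked with a few lines of code. This is a minimal sketch; the function name `proportion_z` is hypothetical and simply evaluates the proportion test statistic with the standard error computed under H0.

```python
from math import sqrt

def proportion_z(p_hat, p0, n):
    """z statistic for a sample proportion, using the H0 value p0 in the standard error."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Example 8: 56% of 880 drivers, tested against p = 0.5.
z = proportion_z(0.56, 0.5, 880)
print(round(z, 2))
```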
Figure: Critical Region, Critical Value, Test Statistic
Critical Region
The critical region (or rejection region) is the set of all values of the test statistic that cause us to reject the null
hypothesis. For example, see the red-shaded region in the previous figure.
Significance Level
The significance level (denoted by α) is the probability that the test statistic will fall in the critical region when the
null hypothesis is actually true. This is the same α introduced in Section 7-2. Common choices for α are 0.05, 0.01,
and 0.10.
Critical Value
A critical value is any value that separates the critical region (where we reject the null hypothesis) from the values of
the test statistic that do not lead to rejection of the null hypothesis. The critical values depend on the nature of the
null hypothesis, the sampling distribution that applies, and the significance level α. See the previous figure where
the critical value of z = 1.645 corresponds to a significance level of α = 0.05.
Two-tailed, Right-tailed, Left-tailed Tests
The tails in a distribution are the extreme regions bounded by critical values.
Two-tailed Test: α is divided equally between the two tails of the critical region
H0: = and Ha: ≠
≠ means less than or greater than
Right-tailed Test
H0: = and Ha: >
points right
Left-tailed Test
H0: = and Ha: <
points left
b) p < 0.5 (so the critical region is in the left tail of the normal distribution)
Solution: With a left-tail area of 0.05, the critical value is found to be z = −1.645.
c) p > 0.5 (so the critical region is in the right tail of the normal distribution)
Solution: With a right-tail area of 0.05, the critical value is found to be z = 1.645.
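The critical values above come from a table lookup, which can also be done numerically. This is a minimal sketch using the standard normal inverse CDF from Python's standard library; the function name `critical_values` and the tail labels are illustrative, and the two-tailed case is included only for completeness.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return the z critical value(s) for a left-, right-, or two-tailed test."""
    z = NormalDist()  # standard normal distribution
    if tail == "left":
        return (z.inv_cdf(alpha),)
    if tail == "right":
        return (z.inv_cdf(1 - alpha),)
    if tail == "two":
        c = z.inv_cdf(1 - alpha / 2)  # alpha is split equally between the tails
        return (-c, c)
    raise ValueError("tail must be 'left', 'right', or 'two'")

print(critical_values(0.05, "left"))
print(critical_values(0.05, "right"))
```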
P-Value
The P-value (or p-value or probability value) is the probability of getting a value of the test statistic that is at least as
extreme as the one representing the sample data, assuming that the null hypothesis is true. The null hypothesis is
rejected if the P-value is very small, such as 0.05 or less.
Left-tailed Test: The alternative hypothesis Ha contains the less-than inequality symbol (<).
H0: μ ≥ k and Ha: μ < k
Right-tailed Test: The alternative hypothesis Ha contains the greater-than inequality symbol (>).
H0: μ ≤ k and Ha: μ > k
Two-tailed Test: The alternative hypothesis Ha contains the not-equal symbol (≠). Each tail has an area of ½P.
H0: μ = k and Ha: μ ≠ k
Solution: H0: the parameter equals 0.8; Ha: the parameter does not equal 0.8. This is a two-tailed test.
b) A water faucet manufacturer announces that the mean flow rate of a certain type of faucet is less than 2.5
gallons per minute.
Solution: H0: μ ≥ 2.5 gpm and Ha: μ < 2.5 gpm. This is a left-tailed test.
c) A cereal company advertises that the mean weight of the contents of its 20-ounce size cereal boxes is more than
20 ounces.
Solution: H0: μ ≤ 20 oz and Ha: μ > 20 oz. This is a right-tailed test.
Type I Error
A Type I error is the mistake of rejecting the null hypothesis when it is true.
The symbol α (alpha) is used to represent the probability of a type I error.
Type II Error
A Type II error is the mistake of failing to reject the null hypothesis when it is false.
The symbol β (beta) is used to represent the probability of a type II error.
For example, you claim that a coin is not fair. To test your claim, you flip the coin 100 times and get 49 heads and
51 tails. You would probably agree that you don't have enough evidence to support your claim. Even so, it is
possible that the coin is actually not fair and you had an unusual sample. But what if you flip the coin 100 times and
get 21 heads and 79 tails? It would be a rare occurrence to get only 21 heads out of 100 tosses with a fair coin. So,
you probably have sufficient evidence to support your claim that the coin is not fair. However, you can't be 100%
sure. It is possible that the coin is fair and you got an unusual sample. Remember, the only way to be certain whether H0 is
true or false is to test the entire population. Because your decision (to reject H0 or to fail to reject H0) is based on a sample,
you must accept the fact that your decision might be incorrect. You might reject a null hypothesis when it is actually
true. Or, you might fail to reject a null hypothesis when it is actually false.
Example 14: Type I and Type II Errors
Assume that we are conducting a hypothesis test of the claim p > 0.5. Here are the null and alternative hypotheses: H0:
p = 0.5, and H1: p > 0.5.
a) Identify a type I error.
Solution: A type I error is the mistake of rejecting a true null hypothesis, so this is a type I error: Conclude
that there is sufficient evidence to support p > 0.5, when in reality p = 0.5.
b) Identify a type II error.
Solution: A type II error is the mistake of failing to reject the null hypothesis when it is false, so this is a type II
error: Fail to reject p = 0.5 (and therefore fail to support p > 0.5) when in reality p > 0.5.
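To make the Type I error in part (a) concrete, the following sketch computes how often a right-tailed test of H0: p = 0.5 would reject H0 when H0 is in fact true. The sample size n = 100 and the significance level 0.05 are assumptions chosen for illustration; neither is given in Example 14. The rejection rule uses the z statistic without a continuity correction, and the probability itself is computed from the exact binomial distribution.

```python
from math import comb, floor, sqrt
from statistics import NormalDist

n = 100        # assumed sample size (not given in the example)
p0 = 0.5       # value of p under H0
alpha = 0.05   # assumed nominal significance level

# Right-tailed rule: reject H0 when z = (x - n*p0) / sqrt(n*p0*(1 - p0)) exceeds z_crit.
z_crit = NormalDist().inv_cdf(1 - alpha)
cutoff = n * p0 + z_crit * sqrt(n * p0 * (1 - p0))
x_min = floor(cutoff) + 1   # smallest success count that leads to rejection

# Exact binomial probability of rejecting H0 when p really is 0.5 (a Type I error).
type_i_rate = sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(x_min, n + 1))
print(x_min, round(type_i_rate, 4))
```

Because the number of successes is discrete, the actual Type I error rate of this rule is somewhat below the nominal 0.05.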
Example 15: Identifying Type I and Type II Errors
The USDA limit for salmonella contamination for chicken is 20%. A meat inspector reports that the chicken
produced by a company exceeds the USDA limit. You perform a hypothesis test to determine whether the meat
inspector's claim is true. When will a type I or type II error occur? Which is more serious? (Source: United States
Department of Agriculture)
Solution: Let p represent the proportion of chicken that is contaminated.
Hypotheses: H0: p ≤ 0.20 and Ha: p > 0.20 (claim)
The probability of making a Type I error, α, is chosen by the researcher before the sample data are collected. The
level of significance, α, is the probability of making a Type I error. In other words: for a fixed sample size, as the
probability of a Type I error increases, the probability of a Type II error decreases, and vice versa.
Controlling Type I and Type II Errors
For any fixed α, an increase in the sample size n will cause a decrease in β.
For any fixed sample size n, a decrease in α will cause an increase in β. Conversely, an increase in α will
cause a decrease in β.
The power of a hypothesis test is the probability (1 − β) of rejecting a false null hypothesis, which is computed
by using a particular significance level α and a particular value of the population parameter that is an alternative to
the value assumed true in the null hypothesis. That is, the power of the hypothesis test is the probability of
supporting an alternative hypothesis that is true.
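The relationships among α, β, n, and power can be explored numerically. This is a minimal sketch for a right-tailed test of a proportion under the normal approximation; the function name `power_right_tailed` and the values p0 = 0.5 and true p = 0.6 are assumptions chosen for illustration, not figures from the notes.

```python
from math import sqrt
from statistics import NormalDist

def power_right_tailed(p0, p_true, n, alpha):
    """Approximate power (1 - beta) of a right-tailed z-test for a proportion."""
    z = NormalDist()
    # Smallest sample proportion that falls in the critical region under H0.
    p_crit = p0 + z.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Probability of exceeding that cutoff when the true proportion is p_true.
    se_true = sqrt(p_true * (1 - p_true) / n)
    return 1 - z.cdf((p_crit - p_true) / se_true)

# Assumed illustration: H0: p = 0.5 while the true proportion is 0.6.
for n in (100, 400):
    for alpha in (0.01, 0.05):
        power = power_right_tailed(0.5, 0.6, n, alpha)
        print(n, alpha, round(power, 3), round(1 - power, 3))  # n, alpha, power, beta
```

In this sketch, raising n at a fixed α lowers β, and lowering α at a fixed n raises β, in line with the statements above.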
Caution: In some cases, a conclusion based on a confidence interval may be different from a conclusion based on a
hypothesis test.
1. State the claim mathematically and verbally. Identify the null and alternative hypotheses.
H0: ?   Ha: ?
2. Specify the level of significance.
α = ?
3. Determine the standardized sampling distribution and draw its graph.
4. Calculate the test statistic and its standardized value. Add it to your sketch.
7. Write a statement to interpret the decision in the context of the original claim.
To test hypotheses regarding the population mean assuming the population standard deviation is known, two
requirements must be satisfied:
Recall the researcher who believes that the mean length of a cell phone call has increased from its March, 2006
mean of 3.25 minutes. Suppose we take a simple random sample of 36 cell phone calls. Assume the standard
deviation of the phone call lengths is known to be 0.78 minutes. What is the sampling distribution of the sample
mean? (Answer: x̄ is normally distributed with mean 3.25 and standard deviation 0.78/√36 = 0.13.)
Suppose the sample of 36 calls resulted in a sample mean of 3.56 minutes. Do the results of this sample suggest
that the researcher is correct? In other words, would it be unusual to obtain a sample mean of 3.56 minutes from a
population whose mean is 3.25 minutes? What is convincing or statistically significant evidence?
When observed results are unlikely under the assumption that the null hypothesis is true, we say the result is
statistically significant. When results are found to be statistically significant, we reject the null hypothesis.
One criterion we may use for sufficient evidence for rejecting the null hypothesis is if the sample mean is too many
standard deviations from the assumed (or status quo) population mean. For example, we may choose to reject the
null hypothesis if our sample mean is more than 2 standard deviations above the population mean of 3.25 minutes.
Recall that our simple random sample of 36 calls resulted in a sample mean of 3.56 minutes, and that the sample mean
has a standard deviation of 0.13. Thus, the sample mean is \( z = \frac{3.56 - 3.25}{0.13} = 2.38 \) standard
deviations above the hypothesized mean of 3.25 minutes.
Therefore, using our criterion, we would reject the null hypothesis and conclude that the mean cellular call length is
greater than 3.25 minutes.
Why does it make sense to reject the null hypothesis if the sample mean is more than 2 standard deviations above
the hypothesized mean?
If the null hypothesis were true, then 1 − 0.0228 = 0.9772 = 97.72% of all sample means would be less than
3.25 + 2(0.13) = 3.51.
Because sample means greater than 3.51 are unusual if the population mean is 3.25, we are inclined to believe the
population mean is greater than 3.25.
A second criterion we may use for sufficient evidence to support the alternative hypothesis is to compute how likely
it is to obtain a sample mean at least as extreme as that observed from a population whose mean is equal to the value
assumed by the null hypothesis.
We can compute the probability of obtaining a sample mean of 3.56 or more using the normal model.
Recall \( z = \frac{3.56 - 3.25}{0.13} = 2.38 \). So, we compute \( P(\bar{x} \ge 3.56) = P(z \ge 2.38) = 0.0087 \).
The probability of obtaining a sample mean of 3.56 minutes or more from a population whose mean is 3.25 minutes
is 0.0087. This means that fewer than 1 sample in 100 will give us a mean as high or higher than 3.56 if the
population mean really is 3.25 minutes. Since this outcome is so unusual, we take this as evidence against the null
hypothesis.
Assuming that H0 is true, if the probability of getting a sample mean as extreme or more extreme than the one
obtained is small, we reject the null hypothesis.
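Both criteria for the cell phone example can be reproduced in a few lines. This is a minimal sketch using the numbers given above; the variable names are illustrative. The code does not round z before the lookup, so the probability differs slightly from the table value 0.0087.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 3.25     # hypothesized mean call length (minutes)
sigma = 0.78   # known population standard deviation
n = 36         # sample size
x_bar = 3.56   # observed sample mean

se = sigma / sqrt(n)                # standard deviation of the sample mean
z = (x_bar - mu0) / se              # first criterion: standard deviations above mu0
p_value = 1 - NormalDist().cdf(z)   # second criterion: P(sample mean >= 3.56)

print(round(se, 2), round(z, 2), round(p_value, 4))
```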
This section presents complete procedures for testing a hypothesis (or claim) made about a population proportion.
This section uses the components introduced in the previous section for the P-value method, the traditional method
or the use of confidence intervals.
Requirements for Testing Claims About a Population Proportion p
The sample observations are a simple random sample.
The conditions for a binomial distribution are satisfied (Section 5-3):
1. There are independent trials.
2. Each trial has two possible outcomes (success or failure).
3. P(success) = p, P(failure) = q = 1 − p.
The conditions np ≥ 5 and nq ≥ 5 are satisfied, so the binomial distribution of sample proportions can be
approximated by a normal distribution with μ = np and σ = √(npq).
Notation
n = number of trials
p̂ = x/n (sample proportion)
Test Statistic
\[ z = \frac{\hat{p} - p}{\sqrt{pq/n}} \]
P-Value Method
Use the same method as described in Section 8-2 and in Figure 8-8. Use the standard normal distribution (Table A-2).
Traditional Method
Use the same method as described in Section 8-2 and in Figure 8-9.
Use the same method as described in Section 8-2 and in Table 8-2.
An article distributed by the Associated Press included these results from a nationwide survey: Of 880 randomly
selected drivers, 56% admitted that they run red lights. The claim is that the majority of all Americans run red lights.
That is, p > 0.5. The sample data are n = 880, and p̂ = 0.56. Use the sample data with a 0.05 significance level to
test the claim.
Solution:
Verify requirement:
P-value Method
Null Hypothesis: H0: p = 0.5, Alternative hypothesis: H1: p > 0.5 with α = 0.05
Test statistic: \[ z = \frac{\hat{p} - p}{\sqrt{pq/n}} = \frac{0.56 - 0.5}{\sqrt{(0.5)(0.5)/880}} = 3.56 \]
Because the hypothesis test we are considering is right-tailed with test statistic z = 3.56, the P-value is the area to
the right of z = 3.56. Referring to Table A-2, we see that for values of z = 3.50 and higher, we use
0.9999 for the cumulative area to the left of the test statistic. The P-value is 1 − 0.9999 = 0.0001.
Since the P-value of 0.0001 is less than the significance level of α = 0.05, we reject the null hypothesis. There is
sufficient evidence to support the claim.
Traditional Method
This is a right-tailed test, so the critical region is an area of α = 0.05 in the right tail. Referring to Table A-2 and
applying methods of section 6-2, we find that z = 1.645 is the critical value of the critical region. Because the test
statistic z = 3.56 falls in the critical region, we reject the null hypothesis. There is
sufficient evidence to support the claim.
For a one-tailed hypothesis test with significance level α, we will construct a confidence interval with a confidence
level of 1 − 2α. With α = 0.05, we construct a 90% confidence interval (see Table 8-2).
We obtain 0.533 < p < 0.588 (Section 7-2). We are 90% confident that the true value of p is contained within the
limits of 0.533 and 0.588. Because the entire interval lies above 0.5, we support the claim that p > 0.5.
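The 90% interval can be checked as follows. This is a minimal sketch, assuming the usual large-sample interval for a proportion with the standard error estimated from the sample proportion; the function name `proportion_ci` is hypothetical. Depending on rounding, the lower limit may differ from the quoted value in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence):
    """Large-sample confidence interval for a proportion (standard error from p_hat)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Red-light survey: sample proportion 0.56, n = 880, confidence level 1 - 2(0.05) = 0.90.
lower, upper = proportion_ci(0.56, 880, 0.90)
print(round(lower, 4), round(upper, 4))
```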
CAUTION
When testing claims about a population proportion, the traditional method and the P-value method are equivalent
and will yield the same result since they use the same standard deviation based on the claimed proportion p.
However, the confidence interval uses an estimated standard deviation based upon the sample proportion p̂.
Consequently, it is possible that the traditional and P-value methods may yield a different conclusion than the
confidence interval method.
A good strategy is to use a confidence interval to estimate a population proportion, but use the P-value or
traditional method for testing a hypothesis.
When Gregor Mendel conducted his famous hybridization experiments with peas, one such experiment resulted in
offspring consisting of 428 peas with green pods and 152 peas with yellow pods. According to Mendel's theory, 1/4
of the offspring peas should have yellow pods. Use a 0.05 significance level with the P-value method to test the
claim that the proportion of peas with yellow pods is equal to 1/4.
Null Hypothesis: H0: p = 0.25, Alternative hypothesis: H1: p ≠ 0.25 with α = 0.05
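The sample proportion is p̂ = 152/580 ≈ 0.262.
Test statistic: \[ z = \frac{\hat{p} - p}{\sqrt{pq/n}} = \frac{0.262 - 0.25}{\sqrt{(0.25)(0.75)/580}} = 0.67 \]
Since this is a two-tailed test and the test statistic is positive, the P-value is twice the area to the right of the test
statistic. Using Table A-2, the area to the right of z = 0.67 is 1 − 0.7486 = 0.2514.
The P-value is 2(0.2514) = 0.5028. Because the P-value is greater than the significance level of 0.05, we fail to
reject the null hypothesis. There is not sufficient evidence to
warrant rejection of the claim that 1/4 of the peas have yellow pods.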
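The Mendel calculation can be verified directly from the counts. This is a minimal sketch with illustrative variable names. It carries the unrounded sample proportion and test statistic through the calculation, so the P-value differs slightly from the table-based 0.5028.

```python
from math import sqrt
from statistics import NormalDist

yellow, green = 152, 428
n = yellow + green        # 580 offspring peas in total
p0 = 0.25                 # proportion of yellow pods claimed by Mendel's theory

p_hat = yellow / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Two-tailed test: double the tail area beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(p_hat, 3), round(z, 2), round(p_value, 4))
```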
Zogby International claims that 45% of people in the United States support making cigarettes illegal within the next
5 to 10 years. You decide to test this claim and ask a random sample of 200 people in the United States whether they
support making cigarettes illegal within the next 5 to 10 years. Of the 200 people, 49% support this law. At α = 0.05,
is there enough evidence to reject the claim?
Solution:
H0: p = 0.45 (claim)
Ha: p ≠ 0.45
α = 0.05
Rejection Region: z < −1.96 or z > 1.96
Test Statistic: \[ z = \frac{0.49 - 0.45}{\sqrt{(0.45)(0.55)/200}} = 1.14 \]
Decision: Because z = 1.14 is not in the rejection region, fail to reject H0.
At the 5% level of significance, there is not enough evidence to reject the claim that 45% of people in the U.S.
support making cigarettes illegal within the next 5 to 10 years.
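The rejection-region decision for this example can be sketched in code. The function name `in_rejection_region` and its tail labels are hypothetical; it only compares a standardized test statistic with critical value(s) from the standard normal distribution.

```python
from math import sqrt
from statistics import NormalDist

def in_rejection_region(z, alpha, tail):
    """Return True if z falls in the rejection region of a z-test."""
    dist = NormalDist()
    if tail == "left":
        return z < dist.inv_cdf(alpha)
    if tail == "right":
        return z > dist.inv_cdf(1 - alpha)
    if tail == "two":
        return abs(z) > dist.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right', or 'two'")

# Cigarette-law survey: sample proportion 0.49, n = 200, H0: p = 0.45, two-tailed.
z = (0.49 - 0.45) / sqrt(0.45 * 0.55 / 200)
decision = "reject H0" if in_rejection_region(z, 0.05, "two") else "fail to reject H0"
print(round(z, 2), decision)
```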
The Pew Research Center claims that more than 55% of U.S. adults regularly watch their local television news. You
decide to test this claim and ask a random sample of 425 adults in the United States whether they regularly watch
their local television news. Of the 425 adults, 255 respond yes. At α = 0.05, is there enough evidence to support the
claim?
H0: p ≤ 0.55
Ha: p > 0.55 (claim)
α = 0.05
Rejection Region: z > 1.645
Test Statistic: \[ z = \frac{0.60 - 0.55}{\sqrt{(0.45)(0.55)/425}} = 2.07 \]
Decision: Reject H0.
At the 5% level of significance, there is enough evidence to support the claim that more than 55% of U.S. adults
regularly watch their local television news.
In 1997, 46% of Americans said they did not trust the media when it comes to reporting the news fully, accurately
and fairly. In a 2007 poll of 1010 adults nationwide, 525 stated they did not trust the media. At the α = 0.05 level of
significance, is there evidence to support the claim that the percentage of Americans that do not trust the media to
report fully and accurately has increased since 1997?(Source: Gallup Poll)
The sample proportion is p̂ = 525/1010 = 0.52.
The test statistic is \[ z = \frac{0.52 - 0.46}{\sqrt{0.46(0.54)/1010}} = 3.83 \]
Classical approach: Since this is a right-tailed test, we determine the critical value at the α = 0.05 level of
significance to be 1.645. Since the test statistic, z = 3.83, is greater than the critical value 1.645, we reject the null
hypothesis.
P-value approach: Since this is a right-tailed test, the P-value is the area under the standard normal distribution to
the right of the test statistic z = 3.83. That is, P-value = P(z > 3.83) ≈ 0. Since the P-value is less than the level of
significance, we reject the null hypothesis.
There is sufficient evidence at the α = 0.05 level of significance to conclude that the percentage of Americans that
do not trust the media to report fully and accurately has increased since 1997.
For the sampling distribution of p̂ to be approximately normal, we require that np(1 − p) be at least 10. What if this
requirement is not met?
In 2006, 10.5% of all live births in the United States were to mothers under 20 years of age. A sociologist claims
that births to mothers under 20 years of age is decreasing. She conducts a simple random sample of 34 births and
finds that 3 of them were to mothers under 20 years of age. Test the sociologist's claim at the α = 0.01 level of
significance.
Solution:
From the null hypothesis, we have p0 = 0.105. There were 34 mothers sampled, so np0(1 − p0) = 34(0.105)(0.895) ≈ 3.20 < 10.
Thus, the sampling distribution of p̂ is not approximately normal.
Let X represent the number of births in the sample that were to mothers under 20 years of age. We have x = 3
successes in n = 34 trials, so p̂ = 3/34 = 0.088. We want to determine whether this result is unusual if the
population proportion is truly 0.105. Thus,
P-value = P(X ≤ 3 assuming p = 0.105) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 0.51
The P-value = 0.51 is greater than the level of significance α = 0.01, so we do not reject H0. There is insufficient evidence to
conclude that the percentage of live births in the United States to mothers under the age of 20 has decreased below
the 2006 level of 10.5%.
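When the normal approximation is not appropriate, the P-value can be computed from the binomial distribution directly. This is a minimal sketch of that left-tailed calculation; the helper name `binomial_cdf` is hypothetical.

```python
from math import comb

def binomial_cdf(x, n, p):
    """P(X <= x) for a binomial random variable with n trials and success probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p0, x = 34, 0.105, 3

# Check the normality requirement n * p0 * (1 - p0) >= 10 before choosing a method.
requirement = n * p0 * (1 - p0)

# Left-tailed exact P-value: probability of 3 or fewer such births if p = 0.105.
p_value = binomial_cdf(x, n, p0)
print(round(requirement, 2), round(p_value, 2))
```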
The standardized test statistic is z.
In Words | In Symbols
1. State the claim mathematically and verbally. Identify the null and alternative hypotheses. | State H0 and Ha.
2. Specify the level of significance. | Identify α.
3. Sketch the sampling distribution.
4. Determine any critical value(s). | Use Table
5. Determine any rejection region(s).
6. Find the standardized test statistic z.
7. Make a decision to reject or fail to reject the null hypothesis. | If z is in the rejection region, reject H0. Otherwise, fail to reject H0.
8. Interpret the decision in the context of the original claim.
This section presents methods for testing a claim about a population mean, given that the population standard
deviation is a known value. This section uses the normal distribution with the same components of hypothesis tests
that were introduced in Section 8-2.
\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]
Using P-values to Make a Decision
To use a P-value to make a conclusion in a hypothesis test, compare the P-value with α.
1) If P ≤ α, then reject H0.
2) If P > α, then fail to reject H0.
The P-value for a hypothesis test is P = 0.0237. What is your decision if the level of significance is
1. α = 0.05? Solution: Because 0.0237 < 0.05, we should reject the null hypothesis.
2. α = 0.01? Solution: Because 0.0237 > 0.01, we should fail to reject the null hypothesis.
After determining the hypothesis test's standardized test statistic and the test statistic's corresponding area, do one of
the following to find the P-value.
Find the P-value for a left-tailed hypothesis test with a standardized test statistic of z = −2.23. Decide whether to
reject H0 when the level of significance is α = 0.01.
Find the P-value for a two-tailed hypothesis test with a test statistic of z = 2.14. Decide whether to reject H0 if the
level of significance is α = 0.05.
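The two exercises above can be worked with a small helper that turns a standardized test statistic into a P-value. This is a minimal sketch; the function name `p_value_from_z` and the tail labels are illustrative, and the left-tailed statistic is taken to be z = −2.23.

```python
from statistics import NormalDist

def p_value_from_z(z, tail):
    """P-value for a z statistic in a left-, right-, or two-tailed test."""
    dist = NormalDist()
    if tail == "left":
        return dist.cdf(z)                    # area in the left tail
    if tail == "right":
        return 1 - dist.cdf(z)                # area in the right tail
    if tail == "two":
        return 2 * (1 - dist.cdf(abs(z)))     # twice the area in the tail of z
    raise ValueError("tail must be 'left', 'right', or 'two'")

p_left = p_value_from_z(-2.23, "left")
p_two = p_value_from_z(2.14, "two")

# Decision rule: reject H0 when the P-value is at most alpha.
print(round(p_left, 4), "reject H0" if p_left <= 0.01 else "fail to reject H0")
print(round(p_two, 4), "reject H0" if p_two <= 0.05 else "fail to reject H0")
```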
In Words | In Symbols
1. State the claim mathematically and verbally. Identify the null and alternative hypotheses. | State H0 and Ha.
2. Specify the level of significance. | Identify α.
3. Determine the standardized test statistic. | \( z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \)
4. Find the area that corresponds to z.
5. Find the P-value. | Use Table A-2.
a) For a left-tailed test, P = (Area in left tail).
b) For a right-tailed test, P = (Area in right tail).
c) For a two-tailed test, P = 2(Area in tail of test statistic).
6. Make a decision to reject or fail to reject the null hypothesis. | Reject H0 if P-value ≤ α. Otherwise, fail to reject H0.
7. Interpret the decision in the context of the original claim.
In auto racing, a pit stop is where a racing vehicle stops for new tires, fuel, repairs, and other mechanical
adjustments. The efficiency of a pit crew that makes these adjustments can affect the outcome of a race. A pit crew
claims that its mean pit stop time (for 4 new tires and fuel) is less than 13 seconds. A random sample of 32 pit stop
times has a sample mean of 12.9 seconds. Assume the population standard deviation is 0.19 second. Is there enough
evidence to support the claim at α = 0.01? Use a P-value.
Solution:
Because σ is known (σ = 0.19), the sample is random, and n = 32 ≥ 30, we can use the z-test.
The claim is "the mean pit stop time is less than 13 seconds." So, the null and alternative hypotheses
are H0: μ ≥ 13 seconds and Ha: μ < 13 seconds (claim).
The level of significance is α = 0.01. The standardized test statistic is
\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{12.9 - 13}{0.19 / \sqrt{32}} = -2.98 \]
Using Table A-2, the area corresponding to z = −2.98 is 0.0014. Because this test is a left-tailed test, the P-value
is equal to the area to the left of z = −2.98, as shown in the figure below. So, P = 0.0014. Because the P-value
is less than α = 0.01, we reject the null hypothesis.
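The pit stop example follows the P-value steps end to end, which can be sketched as a single function. The name `z_test_mean` and its return format are hypothetical. Because the code does not round z before finding the area, its P-value is slightly larger than the table value 0.0014.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(x_bar, mu0, sigma, n, tail):
    """z-test for a mean with known sigma; returns (z, P-value)."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    dist = NormalDist()
    if tail == "left":
        p = dist.cdf(z)
    elif tail == "right":
        p = 1 - dist.cdf(z)
    elif tail == "two":
        p = 2 * (1 - dist.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p

# Pit stop example: sample mean 12.9 s, H0 value 13 s, sigma = 0.19, n = 32, left-tailed.
z, p = z_test_mean(12.9, 13, 0.19, 32, "left")
alpha = 0.01
print(round(z, 2), round(p, 4), "reject H0" if p <= alpha else "fail to reject H0")
```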
According to a study, the mean cost of bariatric (weight loss) surgery is $21,500. You think this information is
incorrect. You randomly select 25 bariatric surgery patients and find that the mean cost for their surgeries is $20,695.
From past studies, the population standard deviation is known to be $2250 and the population is normally distributed.
Is there enough evidence to support your claim at α = 0.05? Use a P-value.
Solution
Because σ is known (σ = $2250), the sample is random, and the population is normally distributed, you can use
the z-test. The claim is "the mean is different from $21,500." So, the null and alternative hypotheses are
H0: μ = $21,500 and Ha: μ ≠ $21,500 (claim).
The level of significance is α = 0.05. The standardized test statistic is
\[ z = \frac{20{,}695 - 21{,}500}{2250 / \sqrt{25}} = -1.79 \]
In Table A-2, the area corresponding to z = −1.79 is 0.0367. Because the test is a two-tailed test, the P-value is
equal to twice the area to the left of z = −1.79: P = 2(0.0367) = 0.0734 > 0.05.
Because the P-value is greater than α = 0.05, you fail to reject the null hypothesis.
Interpretation There is not enough evidence at the 5% level of significance to support the claim that the mean cost
of bariatric surgery is different from $21,500.
We have a sample of 106 body temperatures having a mean of 98.20°F. Assume that the sample is a simple random
sample and that the population standard deviation is known to be 0.62°F. Use a 0.05 significance level to test the
common belief that the mean body temperature of healthy adults is equal to 98.6°F. Use the P-value method.
\[ z = \frac{98.2 - 98.6}{0.62 / \sqrt{106}} = -6.64 \]
This is a two-tailed test and the test statistic is to the left of the center, so the P-value is twice the area to the left of
z = −6.64. We refer to Table A-2 to find the area to the left of z = −6.64 is 0.0001, so the P-value is
2(0.0001) = 0.0002. Because the P-value is less than the significance level of 0.05, we reject the null hypothesis
that the mean body temperature is 98.6°F.
The volume of a stock is the number of shares traded in the stock in a day. The mean volume of Apple stock in
2007 was 35.14 million shares with a standard deviation of 15.07 million shares. A stock analyst believes that the
volume of Apple stock has increased since then. He randomly selects 40 trading days in 2008 and determines the
sample mean volume to be 41.06 million shares. Test the analyst's claim at the α = 0.10 level of significance using
P-values.
Solution
The analyst wants to know if the stock volume has increased. This is a right-tailed test with
H0: μ = 35.14 versus H1: μ > 35.14 (claim)
α = 0.10
\[ z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{41.06 - 35.14}{15.07/\sqrt{40}} \approx 2.48 \]
P-value = P(z > 2.48) = 0.0066.
Since the P-value = 0.0066 is less than the level of
significance, 0.10, we reject the null hypothesis.
There is sufficient evidence to reject the null hypothesis and to conclude that the mean volume of Apple stock is
greater than 35.14 million shares.
One advantage of using P-values over the classical approach in hypothesis testing is that P-values provide
information regarding the strength of the evidence. Another is that P-values are interpreted the same way regardless
of the type of hypothesis test being performed. The lower the P-value, the stronger the evidence against the
statement in the null hypothesis.
Another method to decide whether to reject the null hypothesis is to determine whether the standardized test statistic
falls within a range of values called the rejection region of the sampling distribution.
A rejection region (or critical region) of the sampling distribution is the range of values for which the null
hypothesis is not probable. If a standardized test statistic falls in this region, then the null hypothesis is rejected. A
critical value z0 separates the rejection region from the nonrejection region.
Note that a standardized test statistic that falls in a rejection region is considered an unusual event.
Finding Critical Values in a Normal Distribution
1. Specify the level of significance α.
2. Decide whether the test is left-, right-, or two-tailed.
3. Find the critical value(s) z0. If the hypothesis test is
a. left-tailed, find the z-score that corresponds to an area of α,
b. right-tailed, find the z-score that corresponds to an area of 1 − α,
c. two-tailed, find the z-scores that correspond to ½α and 1 − ½α.
4. Sketch the standard normal distribution. Draw a vertical line at each critical value and shade the rejection
region(s).
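The steps above can be sketched in Python, using the inverse of the standard normal cumulative distribution in place of a table lookup. The function name `z_critical_values` is hypothetical and chosen only for this illustration; values computed this way carry more decimal places than the table gives.

```python
from statistics import NormalDist


def z_critical_values(alpha, tail):
    """Return a tuple of critical z-values for a left-, right-, or two-tailed test."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        # z-score with cumulative area alpha
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        # z-score with cumulative area 1 - alpha
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        # z-scores with cumulative areas alpha/2 and 1 - alpha/2
        return (std_normal.inv_cdf(alpha / 2), std_normal.inv_cdf(1 - alpha / 2))
    raise ValueError("tail must be 'left', 'right', or 'two'")


for tail in ("left", "right", "two"):
    values = ", ".join(f"{z:.3f}" for z in z_critical_values(0.05, tail))
    print(f"{tail}-tailed, alpha = 0.05: {values}")
```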
Example 8: Finding Critical Values
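Find the critical value and rejection region for a two-tailed test with α = 0.05.
Solution: The area in each tail is ½α = 0.025, so the critical values are −z0 = −1.96 and z0 = 1.96. The rejection
regions are z < −1.96 and z > 1.96.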
In Words | In Symbols
1. State the claim mathematically and verbally. Identify the null and alternative hypotheses. | State H0 and Ha.
2. Specify the level of significance. | Identify α.
3. Sketch the sampling distribution. |
4. Determine the critical value(s). | Use Table A-2.
5. Determine the rejection region(s). |
6. Find the standardized test statistic. | z = (x̄ − μ)/(σ/√n)
7. Make a decision to reject or fail to reject the null hypothesis. | If z is in the rejection region, then reject H0. Otherwise, fail to reject H0.
8. Interpret the decision in the context of the original claim. |
Example 9: Testing with Rejection Regions
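Employees in a large accounting firm claim that the mean salary of the firm's accountants is less than that of its
competitor's, which is $45,000. A random sample of 30 of the firm's accountants has a mean salary of $43,500 with
a standard deviation of $5200. At α = 0.05, test the employees' claim.
Solution:
H0: μ ≥ $45,000 and Ha: μ < $45,000 (claim); α = 0.05
Rejection Region: z < −1.645
Test Statistic:
\[ z = \frac{43500 - 45000}{5200/\sqrt{30}} \approx -1.58 \]
Decision: Fail to reject H0, because z = −1.58 is not in the rejection region.
At the 5% level of significance, there is not enough evidence to support the employees' claim that the mean salary is
less than $45,000.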
The U.S. Department of Agriculture reports that the mean cost of raising a child from birth to age 2 in a rural area is
$10,460. You believe this value is incorrect, so you select a random sample of 900 children (age 2) and find that the
mean cost is $10,345 with a standard deviation of $1540. At α = 0.05, is there enough evidence to conclude that
the mean cost is different from $10,460?
Solution:
H0: μ = $10,460 and Ha: μ ≠ $10,460 (claim); α = 0.05
Rejection Region: z < −1.96 or z > 1.96
Test Statistic:
\[ z = \frac{10345 - 10460}{1540/\sqrt{900}} \approx -2.24 \]
Decision: Reject H0
At the 5% level of significance, you have enough evidence to conclude the mean cost of raising a child from birth to
age 2 in a rural area is significantly different from $10,460.
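The rejection-region procedure used in the two examples above can be sketched as a single function. The name `z_test_rejection_region` is hypothetical. As in the examples, the sample standard deviation stands in for σ because n is at least 30, and the critical values are computed rather than read from Table A-2.

```python
from math import sqrt
from statistics import NormalDist


def z_test_rejection_region(sample_mean, mu0, sd, n, alpha, tail):
    """Return (z, reject) where reject is True if z falls in the rejection region."""
    z = (sample_mean - mu0) / (sd / sqrt(n))
    std_normal = NormalDist()
    if tail == "left":
        reject = z < std_normal.inv_cdf(alpha)
    elif tail == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, reject


# Accountant salaries: Ha: mu < 45000 (left-tailed)
z, reject = z_test_rejection_region(43500, 45000, 5200, 30, 0.05, "left")
print(f"salaries: z = {z:.2f}, reject H0: {reject}")

# Cost of raising a child: Ha: mu != 10460 (two-tailed)
z, reject = z_test_rejection_region(10345, 10460, 1540, 900, 0.05, "two")
print(f"child cost: z = {z:.2f}, reject H0: {reject}")
```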
We have a sample of 106 body temperatures having a mean of 98.20°F. Assume that the sample is a simple random
sample and that the population standard deviation σ is known to be 0.62°F. Use a 0.05 significance level to test the
common belief that the mean body temperature of healthy adults is equal to 98.6°F. Use the traditional method.
\[ z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{98.2 - 98.6}{0.62/\sqrt{106}} \approx -6.64 \]
We now find the critical values to be z = −1.96 and z = 1.96. We would reject the null hypothesis, since the test
statistic of z = −6.64 would fall in the critical region.
There is sufficient evidence to conclude that the mean body temperature of healthy adults differs from 98.6°F.
A can of 7-Up states that the contents of the can are 355 ml. A quality control engineer is worried that the filling
machine is miscalibrated. In other words, she wants to make sure the machine is not under- or over-filling the cans.
She randomly selects 9 cans of 7-Up and measures the contents. She obtains the following data.
351 360 358 356 359 358 355 361 352
Is there evidence at the α = 0.05 level of significance to support the quality control engineer's claim? Prior
experience indicates that σ = 3.2 ml.
Solution: The quality control engineer wants to know if the mean content is different from 355 ml. Since the
sample size is small, we must verify that the data come from a population that is approximately normal with no
outliers.
The hypotheses are H0: μ = 355 versus H1: μ ≠ 355. The sample mean is x̄ = 356.667 ml, so the test statistic is
\[ z = \frac{356.667 - 355}{3.2/\sqrt{9}} \approx 1.56 \]
Since this is a two-tailed test, we determine the critical values at the α = 0.05 level of significance to be −z0.025 =
−1.96 and z0.025 = 1.96.
Since the test statistic, z = 1.56, is less than the critical value 1.96, we fail to reject the null hypothesis.
There is insufficient evidence at the α = 0.05 level of significance to conclude that the mean content differs from
355 ml.
We have a sample of 106 body temperatures having a mean of 98.20°F. Assume that the sample is a simple random
sample and that the population standard deviation σ is known to be 0.62°F. Use a 0.05 significance level to test the
common belief that the mean body temperature of healthy adults is equal to 98.6°F. Use the confidence interval
method.
For a two-tailed hypothesis test with a 0.05 significance level, we construct a 95% confidence interval. Use the
methods of Section 7-2 to construct a 95% confidence interval:
\[ \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 98.2 \pm 1.96 \cdot \frac{0.62}{\sqrt{106}}, \quad \text{so} \quad 98.08 < \mu < 98.32 \]
We are 95% confident that the limits of 98.08 and 98.32 contain the true value of μ, so it appears that 98.6 cannot be
the true value of μ.
A can of 7-Up states that the contents of the can are 355 ml. A quality control engineer is worried that the filling
machine is miscalibrated. In other words, she wants to make sure the machine is not under- or over-filling the cans.
She randomly selects 9 cans of 7-Up and measures the contents. She obtains the following data.
Test the hypotheses at the α = 0.05 level of significance by constructing a 95% confidence interval about μ, the
population mean can content. Prior experience indicates that σ = 3.2 ml.
Solution:
\[ \text{Lower bound: } 356.667 - 1.96 \cdot \frac{3.2}{\sqrt{9}} \approx 354.58 \qquad \text{Upper bound: } 356.667 + 1.96 \cdot \frac{3.2}{\sqrt{9}} \approx 358.76 \]
We are 95% confident that the mean can content is between 354.6 ml and 358.8 ml. Since the mean stated in the
null hypothesis is in this interval, there is insufficient evidence to reject the hypothesis that the mean can content is
355 ml.
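The confidence interval method can be sketched as follows. The function names `z_confidence_interval` and `reject_by_interval` are hypothetical, and the critical value is computed from the standard normal distribution instead of being fixed at 1.96, so the bounds agree with the hand calculation only to about two decimal places.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) bounds of a z-interval for the mean with sigma known."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


def reject_by_interval(mu0, interval):
    """Two-tailed decision: reject H0 when the hypothesized mean lies outside the interval."""
    lower, upper = interval
    return not (lower <= mu0 <= upper)


# 7-Up cans: sample mean 356.667 ml, sigma = 3.2 ml, n = 9, H0: mu = 355
cans = z_confidence_interval(356.667, 3.2, 9)
print(f"cans: ({cans[0]:.2f}, {cans[1]:.2f}), reject H0: {reject_by_interval(355, cans)}")

# Body temperatures: sample mean 98.2, sigma = 0.62, n = 106, H0: mu = 98.6
temps = z_confidence_interval(98.2, 0.62, 106)
print(f"temps: ({temps[0]:.2f}, {temps[1]:.2f}), reject H0: {reject_by_interval(98.6, temps)}")
```

A 95% interval pairs with a two-tailed test at the 0.05 level, which is why the same decision is reached as with the traditional method.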
If, under a given assumption, there is an extremely small probability of getting sample results at least as extreme
as the results that were obtained, we conclude that the assumption is probably not correct.
When testing a claim, we make an assumption (null hypothesis) of equality. We then compare the assumption
and the sample results and we form one of the following conclusions:
If the sample results (or more extreme results) can easily occur when the assumption (null hypothesis) is
true, we attribute the relatively small discrepancy between the assumption and the sample results to chance.
If the sample results cannot easily occur when that assumption (null hypothesis) is true, we explain the
relatively large discrepancy between the assumption and the sample results by concluding that the
assumption is not true, so we reject the assumption.
In many real-life situations, the population standard deviation is not known. When either the population has a
normal distribution or the sample size is at least 30, you can still test the population mean μ. To do so, you can use
the t-distribution with n − 1 degrees of freedom. The methods of this section use the Student t distribution
introduced earlier.
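Requirements for Testing Claims About a Population Mean (with σ Not Known)
1. The sample is a simple random sample.
2. The value of the population standard deviation σ is not known.
3. Either or both of these conditions is satisfied: The population is normally distributed or n > 30.
Test Statistic for Testing a Claim About a Mean (with σ Not Known)
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \]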
Properties of the Student t Distribution
1. The Student t distribution is different for different sample sizes (see Figure 7-5 in Section 7-4).
2. The Student t distribution has the same general bell shape as the normal distribution; its wider shape reflects the
greater variability that is expected when s is used to estimate σ.
3. The Student t distribution has a mean of t = 0 (just as the standard normal distribution has a mean of z = 0).
4. The standard deviation of the Student t distribution varies with the sample size and is greater than 1 (unlike the
standard normal distribution, which has σ = 1).
5. As the sample size n gets larger, the Student t distribution gets closer to the standard normal distribution.
Choosing between the Normal and Student t Distributions when Testing a Claim about a Population Mean μ: Use
the Student t distribution when σ is not known and either or both of these conditions is satisfied: The population is
normally distributed or n > 30.
3. Find the critical value(s) using Table A-3 in the row with n − 1 degrees of freedom. If the hypothesis test
is
c. two-tailed, use the "Two Tails, α" column with a negative and a positive sign.
Find the critical value t0 for a left-tailed test given α = 0.05 and n = 21.
Solution: With d.f. = 21 − 1 = 20,
t0 = −1.725
Find the critical value t0 for a right-tailed test with α = 0.01 and n = 17.
Solution: With d.f. = 17 − 1 = 16,
t0 = 2.583
Find the critical values −t0 and t0 for a two-tailed test given α = 0.05 and n = 26.
Solution: With d.f. = 26 − 1 = 25,
−t0 = −2.060 and t0 = 2.060
Data Set 13 in Appendix B of the text includes weights of 13 red M&M candies randomly selected from a bag
containing 465 M&Ms. The weights (in grams) have a mean x̄ = 0.8635 and a standard deviation s = 0.0576 g.
The bag states that the net weight of the contents is 396.9 g, so the M&Ms must have a mean weight that is 396.9/
465 = 0.8535 g in order to provide the amount claimed. Use the sample data with a 0.05 significance level to test
the claim of a production manager that the M&Ms have a mean that is actually greater than 0.8535 g. Use the
traditional method.
Solution: The sample is a simple random sample and we are not using a known value of σ. The sample size is
n = 13 and a normal quantile plot suggests the weights are normally distributed.
H0: μ = 0.8535 and H1: μ > 0.8535; α = 0.05, x̄ = 0.8635, s = 0.0576, and n = 13.
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{0.8635 - 0.8535}{0.0576/\sqrt{13}} \approx 0.626 \]
With 12 degrees of freedom, the critical region for this right-tailed test is t > 1.782.
Because the test statistic of t = 0.626 does not fall in the critical region, we fail to reject H0. There is not sufficient
evidence to support the claim that the mean weight of the M&Ms is greater than 0.8535 g.
A used car dealer says that the mean price of a 2005 Honda Pilot LX is at least $23,900. You suspect this claim is
incorrect and find that a random sample of 14 similar vehicles has a mean price of $23,000 and a standard deviation
of $1113. Is there enough evidence to reject the dealer's claim at α = 0.05? Assume the population is normally
distributed.
H0: μ ≥ $23,900 (claim)
Ha: μ < $23,900
α = 0.05
df = 14 − 1 = 13
Test Statistic:
\[ t = \frac{23000 - 23900}{1113/\sqrt{14}} \approx -3.026 \]
Decision: Reject H0
At the 0.05 level of significance, there is enough evidence to reject the claim that the mean price of a 2005
Honda Pilot LX is at least $23,900.
An industrial company claims that the mean pH level of the water in a nearby river is 6.8. You randomly select 19
water samples and measure the pH of each. The sample mean and standard deviation are 6.7 and 0.24, respectively.
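Is there enough evidence to reject the company's claim at α = 0.05? Assume the population is normally distributed.
H0: μ = 6.8 (claim)
Ha: μ ≠ 6.8
α = 0.05
df = 19 − 1 = 18
Test Statistic:
\[ t = \frac{6.7 - 6.8}{0.24/\sqrt{19}} \approx -1.816 \]
Decision: Fail to reject H0, because t = −1.816 lies between the critical values −2.101 and 2.101.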
At the 0.05 level of significance, there is not enough evidence to reject the claim that the mean pH is 6.8.
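The t-test examples above can be checked with a short Python sketch. Python's standard library has no Student t distribution, so this illustration integrates the t density numerically with Simpson's rule; that numerical approach and the names `t_pdf`, `t_tail_area`, and `t_test` are assumptions made for this sketch, not part of the notes. Statistical software or a TI-83/84 would normally supply the P-value directly.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coeff = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_tail_area(t, df, steps=2000):
    """Area beyond |t| in one tail, using Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3  # half the curve minus the area between 0 and |t|


def t_test(sample_mean, mu0, s, n, tail="two"):
    """Return (t, df, p_value) for a one-sample t-test with sigma not known."""
    t = (sample_mean - mu0) / (s / sqrt(n))
    df = n - 1
    beyond = t_tail_area(t, df)
    if tail == "two":
        p = 2 * beyond
    elif tail == "right":
        p = beyond if t >= 0 else 1 - beyond
    elif tail == "left":
        p = beyond if t <= 0 else 1 - beyond
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return t, df, p


# River pH example: H0: mu = 6.8, Ha: mu != 6.8, alpha = 0.05
t, df, p = t_test(6.7, 6.8, 0.24, 19, tail="two")
print(f"t = {t:.3f}, df = {df}, P-value = {p:.3f}")
print("reject H0" if p <= 0.05 else "fail to reject H0")
```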
P-Value Method
The American Automobile Association claims that the mean daily meal cost for a family of four traveling on
vacation in Florida is $118. A random sample of 11 such families has a mean daily meal cost of $128 with a
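standard deviation of $20. Is there enough evidence to reject the claim at α = 0.10? Assume the population is
normally distributed.
H0: μ = $118 (claim) and Ha: μ ≠ $118; α = 0.10; df = 11 − 1 = 10
\[ t = \frac{128 - 118}{20/\sqrt{11}} \approx 1.658 \]
In Table A-3 with 10 degrees of freedom, 1.658 falls between 1.372 and 1.812, so the two-tailed P-value is between
0.10 and 0.20. Because the P-value is greater than α = 0.10, we fail to reject H0.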
At the 0.10 level of significance, there is not enough evidence to reject the claim that the mean daily meal cost
for a family of four traveling on vacation in Florida is $118.
Assuming that neither software nor a TI-83/84 Plus calculator is available, use Table A-3 to find a range of values
for the P-value corresponding to the given results.
a) In a left-tailed hypothesis test, the sample size is n = 12, and the test statistic is t = −2.007.
Solution: The test is a left-tailed test with test statistic t = −2.007, so the P-value is the area to the left of −2.007.
Because of the symmetry of the t distribution, that is the same as the area to the right of +2.007. Any test
statistic between 2.201 and 1.796 has a right-tailed P-value that is between 0.025 and 0.05. We conclude that
0.025 < P-value < 0.05.
b) In a right-tailed hypothesis test, the sample size is n = 12, and the test statistic is t = 1.222.
Solution: The test is a right-tailed test with test statistic t = 1.222, so the P-value is the area to the right of
1.222. Any test statistic less than 1.363 has a right-tailed P-value that is greater than 0.10. We conclude that
P-value > 0.10.
c) In a two-tailed hypothesis test, the sample size is n = 12, and the test statistic is t = 3.456.
Solution: The test is a two-tailed test with test statistic t = 3.456. The P-value is twice the area to the right of
3.456. Any test statistic greater than 3.106 has a two-tailed P-value that is less than 0.01. We conclude that
P-value < 0.01.
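The table-bracketing reasoning in parts (a) through (c) can be sketched as a small lookup. The list `T_TABLE_DF11` reproduces the row of a standard t table for 11 degrees of freedom (the values 1.363, 1.796, 2.201, and 3.106 appear in the solutions above; 2.718 is the usual table entry for a one-tail area of 0.01). The function name `p_value_range` is hypothetical, and the sketch assumes that in a one-tailed test the statistic lies in the direction of the alternative hypothesis.

```python
# Table row for df = 11 (n = 12): (area in one tail, critical t), in increasing order of t
T_TABLE_DF11 = [(0.10, 1.363), (0.05, 1.796), (0.025, 2.201), (0.01, 2.718), (0.005, 3.106)]


def p_value_range(t, row, two_tailed=False):
    """Return (lower, upper) bounds on the P-value; None means the table gives no bound."""
    factor = 2 if two_tailed else 1
    lower = None  # the P-value is greater than this
    upper = None  # the P-value is less than this
    for area, critical in row:
        if abs(t) > critical:
            upper = area * factor  # keeps tightening as the critical values grow
        elif abs(t) < critical and lower is None:
            lower = area * factor  # first critical value that exceeds |t|
    return lower, upper


print(p_value_range(-2.007, T_TABLE_DF11))                  # part (a), left-tailed
print(p_value_range(1.222, T_TABLE_DF11))                   # part (b), right-tailed
print(p_value_range(3.456, T_TABLE_DF11, two_tailed=True))  # part (c), two-tailed
```

A result such as `(0.025, 0.05)` reads as 0.025 < P-value < 0.05, and a `None` on either side means the table does not bound the P-value in that direction.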
Assume the resting metabolic rate (RMR) of healthy males in complete silence is 5710 kJ/day. Researchers
measured the RMR of 45 healthy males who were listening to calm classical music and found their mean RMR to be
5708.07 with a standard deviation of 992.05.
At the α = 0.05 level of significance, is there evidence to conclude that the mean RMR of males listening to calm
classical music is different from 5710 kJ/day?
Solution:
The hypotheses are H0: μ = 5710 versus H1: μ ≠ 5710, and the test statistic is
\[ t = \frac{5708.07 - 5710}{992.05/\sqrt{45}} \approx -0.013 \]
Traditional Method
1. Since this is a two-tailed test, we determine the critical values at the α = 0.05 level of significance with
n − 1 = 45 − 1 = 44 degrees of freedom to be approximately −t0.025 = −2.021 and t0.025 = 2.021.
2. Since the test statistic, t = −0.013, is between the critical values, we fail to reject the null hypothesis.
P-value
1. Since this is a two-tailed test, the P-value is the area under the t-distribution with n − 1 = 45 − 1 = 44
degrees of freedom to the left of t = −0.013 and to the right of t = 0.013. That is, P-value =
P(t < −0.013) + P(t > 0.013) = 2P(t > 0.013), so 0.50 < P-value.
2. Since the P-value is greater than the level of significance (0.05 < 0.5), we fail to reject the null hypothesis.
There is insufficient evidence at the = 0.05 level of significance to conclude that the mean RMR of males
listening to calm classical music differs from 5710 kJ/day.
In real life, it is important to produce consistent, predictable results. For instance, consider a
company that manufactures golf balls. The manufacturer must produce millions of golf balls, each
having the same size and the same weight. There is a very low tolerance for variation. For a normally
distributed population, you can test the variance and standard deviation of the process using the chi-
square distribution with n - 1 degrees of freedom.
Chi-Square Distribution
Test Statistic:
\[ \chi^2 = \frac{(n - 1)s^2}{\sigma^2} \]
where n = sample size, s² = sample variance, and σ² = population variance (given in the null hypothesis).
All values of χ² are nonnegative, and the distribution is not symmetric (see Figure 8-13, following).
There is a different distribution for each number of degrees of freedom (see Figure 8-14, above).
The critical values are found in Table A-4 using n − 1 degrees of freedom.
3. The critical values for the χ²-distribution are found in Table A-4. To find the critical value(s) for a
c. two-tailed test, use the values that correspond to d.f. and ½α and d.f. and 1 − ½α.
Find the critical χ²-value for a left-tailed test when n = 11 and α = 0.01.
Solution: With d.f. = 11 − 1 = 10 and an area of 1 − α = 0.99 to the right of the critical value, Table A-4 gives
χ²0 = 2.558.
Find the critical χ²-values for a two-tailed test when n = 13 and α = 0.01.
Solution:
From Table A-4, with d.f. = 13 − 1 = 12, the critical values are χ²L = 3.074 and χ²R = 28.299.
In Words | In Symbols
1. State the claim mathematically and verbally. Identify the null and alternative hypotheses. | State H0 and Ha.
2. Specify the level of significance. | Identify α.
5. Determine any rejection region(s). |
6. Find the standardized test statistic. | χ² = (n − 1)s²/σ²
For a simple random sample of adults, IQ scores are normally distributed with a mean of 100 and a standard
deviation of 15. A simple random sample of 13 statistics professors yields a standard deviation of s = 7.2. Assume
that IQ scores of statistics professors are normally distributed and use a 0.05 significance level to test the claim that
σ = 15.
H0: σ = 15 and H1: σ ≠ 15
Standardized Test Statistic:
\[ \chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{(13 - 1)(7.2)^2}{15^2} \approx 2.765 \]
The critical values of 4.404 and 23.337 are found in Table A-4, in the 12th row (degrees of freedom = n − 1 =
13 − 1 = 12) in the columns corresponding to 0.975 and 0.025. Because the test statistic is in the critical region, we
reject the null hypothesis.
There is sufficient evidence to warrant rejection of the claim that the standard deviation is equal to 15.
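The chi-square calculation above is simple enough to sketch directly. The function names `chi_square_statistic` and `in_two_tailed_rejection_region` are hypothetical. The critical values 4.404 and 23.337 are the table values quoted in the example; the standard library does not provide a chi-square distribution, so they are entered by hand here.

```python
def chi_square_statistic(n, sample_variance, null_variance):
    """Chi-square test statistic (n - 1) * s^2 / sigma^2 for a variance or standard deviation."""
    return (n - 1) * sample_variance / null_variance


def in_two_tailed_rejection_region(chi2, lower_critical, upper_critical):
    """True when the statistic is below the left or above the right critical value."""
    return chi2 < lower_critical or chi2 > upper_critical


# IQ scores of statistics professors: n = 13, s = 7.2, H0: sigma = 15
chi2 = chi_square_statistic(13, 7.2 ** 2, 15 ** 2)
reject = in_two_tailed_rejection_region(chi2, 4.404, 23.337)  # table values for df = 12
print(f"chi-square = {chi2:.3f}, reject H0: {reject}")
```

Because the function takes variances, a claim stated in terms of a standard deviation is squared before it is passed in.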
Example 4: Hypothesis Test for the Population Variance
A dairy processing company claims that the variance of the amount of fat in the whole milk processed by the
company is no more than 0.25. You suspect this is wrong and find that a random sample of 41 milk containers has a
variance of 0.27. At α = 0.05, is there enough evidence to reject the company's claim? Assume the population is
normally distributed.
Solution:
H0: σ² ≤ 0.25 (claim)
Ha: σ² > 0.25
α = 0.05
df = 41 − 1 = 40
Rejection Region: χ² > 55.758
Test Statistic:
\[ \chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{(41 - 1)(0.27)}{0.25} = 43.2 \]
Decision: Fail to reject H0
At the 5% level of significance, there is not enough evidence to reject the company's claim that the variance of the
amount of fat in the whole milk is no more than 0.25.
Example 5: Hypothesis Test for the Standard Deviation
A restaurant claims that the standard deviation in the length of serving times is less than 2.9 minutes. A random
sample of 23 serving times has a standard deviation of 2.1 minutes. At α = 0.10, is there enough evidence to
support the restaurant's claim? Assume the population is normally distributed.
Solution:
H0: σ ≥ 2.9 min
Ha: σ < 2.9 min (claim)
α = 0.10
df = 23 − 1 = 22
Rejection Region: χ² < 14.042
Test Statistic:
\[ \chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{(23 - 1)(2.1)^2}{(2.9)^2} \approx 11.536 \]
Decision: Reject H0
At the 10% level of significance, there is enough evidence to support the claim that the standard deviation for the
length of serving times is less than 2.9 minutes.
Example 6: Hypothesis Test for the Population Variance
A sporting goods manufacturer claims that the variance of the strength in a certain fishing line is 15.9. A random
sample of 15 fishing line spools has a variance of 21.8. At α = 0.05, is there enough evidence to reject the
manufacturer's claim? Assume the population is normally distributed.
Solution:
H0: σ² = 15.9 (claim)
Ha: σ² ≠ 15.9
α = 0.05, so ½α = 0.025
df = 15 − 1 = 14
Rejection Region: χ² < 5.629 or χ² > 26.119
Test Statistic:
\[ \chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{(15 - 1)(21.8)}{15.9} \approx 19.195 \]
Decision: Fail to reject H0
At the 5% level of significance, there is not enough evidence to reject the claim that the variance in the strength of
the fishing line is 15.9.
Example 7:
A can of 7-Up states that the contents of the can are 355 ml. A quality control engineer is worried that the filling
machine is miscalibrated. In other words, she wants to make sure the machine is not under- or over-filling the cans.
She randomly selects 9 cans of 7-Up and measures the contents. She obtains the following data.
351 360 358 356 359 358 355 361 352
In section 8.4, we assumed the population standard deviation was 3.2. Test the claim that the population standard
deviation, σ, is greater than 3.2 ml at the α = 0.05 level of significance.
Solution:
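The worked solution is not reproduced in these notes, so the following Python sketch shows one way the test could be carried out from the raw data. It assumes H0: σ = 3.2 against H1: σ > 3.2 (a right-tailed test), and it uses 15.507 as the critical value for 8 degrees of freedom and a right-tail area of 0.05; that figure is taken from a standard chi-square table and is not stated in the notes. The variable names are hypothetical.

```python
from statistics import variance

cans_ml = [351, 360, 358, 356, 359, 358, 355, 361, 352]  # data from the example
sigma0 = 3.2          # standard deviation stated in the null hypothesis
critical = 15.507     # assumed table value: df = 8, right-tail area 0.05

n = len(cans_ml)
sample_variance = variance(cans_ml)  # s^2, computed with n - 1 in the denominator
chi2 = (n - 1) * sample_variance / sigma0 ** 2

print(f"s^2 = {sample_variance:.3f}, chi-square = {chi2:.3f}")
if chi2 > critical:
    print("reject H0")
else:
    print("fail to reject H0")
```

Under these assumptions the statistic falls below the critical value, so the data would not support the claim that σ is greater than 3.2 ml.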