
CHAPTER 8 ESTIMATION AND HYPOTHESIS TESTING FOR TWO POPULATIONS

Objectives

1. To construct a confidence interval and perform a hypothesis test
about the difference between two population means for large and
independent samples.
2. To construct a confidence interval and perform a hypothesis test
about the difference between two population means for small and
independent samples: equal and unequal standard deviations.
3. To construct a confidence interval and perform a hypothesis test
about the difference between two population proportions for large
and independent samples.

8.1 Inferences About the Difference Between Two Population Means
for Large and Independent Samples
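Mean, Standard Deviation, and Sampling Distribution of $\bar{x}_1 - \bar{x}_2$
The mean of $\bar{x}_1 - \bar{x}_2$, denoted by $\mu_{\bar{x}_1 - \bar{x}_2}$, is given by

$$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$$

The standard deviation of $\bar{x}_1 - \bar{x}_2$, denoted by $\sigma_{\bar{x}_1 - \bar{x}_2}$, is given by

$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$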

We normally do not know the standard deviations $\sigma_1$ and $\sigma_2$ of the two
populations. So, we replace $\sigma_{\bar{x}_1 - \bar{x}_2}$ by its point estimator,
132 Intro to Statistics & Probability

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

The sampling distribution of $\bar{x}_1 - \bar{x}_2$ is approximately normal,
provided that both samples are large ($n_1 \ge 30$ and $n_2 \ge 30$).

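Interval Estimation of $\mu_1 - \mu_2$
The $(1 - \alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is given by

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,\sigma_{\bar{x}_1 - \bar{x}_2} \quad \text{if } \sigma_1 \text{ and } \sigma_2 \text{ are known}$$

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,s_{\bar{x}_1 - \bar{x}_2} \quad \text{if } \sigma_1 \text{ and } \sigma_2 \text{ are unknown}$$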

Example 8.1
The following information is obtained from two independent samples
selected from two populations.

$n_1 = 190$, $\bar{x}_1 = 5.47$, $s_1 = 1.70$
$n_2 = 170$, $\bar{x}_2 = 5.10$, $s_2 = 1.65$

(a) What is the point estimate of $\mu_1 - \mu_2$?
(b) Construct a 99% confidence interval for $\mu_1 - \mu_2$.

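Solution

(a) Point estimate of $\mu_1 - \mu_2$ is $\bar{x}_1 - \bar{x}_2 = 5.47 - 5.10 = 0.37$.

(b) The z-value from the normal table with $\alpha/2 = 0.005$ is 2.58.
The 99% confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,s_{\bar{x}_1 - \bar{x}_2}$$
$$= (5.47 - 5.10) \pm 2.58\sqrt{\frac{1.70^2}{190} + \frac{1.65^2}{170}}$$
$$= 0.37 \pm 2.58(0.1767)$$
$$= (-0.0859,\ 0.8259)$$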
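
The interval in Example 8.1 can be checked numerically. The following is a minimal Python sketch, assuming only the standard library; the function name `large_sample_ci` is illustrative and not part of the text. It uses the exact normal quantile (about 2.576) rather than the table value 2.58, so the endpoints may differ from a hand calculation in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ci(n1, xbar1, s1, n2, xbar2, s2, confidence):
    """Return (lower, upper) for mu1 - mu2 using the normal approximation."""
    # Point estimator of the standard deviation of xbar1 - xbar2.
    std_err = sqrt(s1**2 / n1 + s2**2 / n2)
    # Critical value z_{alpha/2}; printed tables usually round it to two decimals.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = xbar1 - xbar2
    return diff - z * std_err, diff + z * std_err


# Data of Example 8.1.
lower, upper = large_sample_ci(190, 5.47, 1.70, 170, 5.10, 1.65, 0.99)
print(f"99% CI for mu1 - mu2: ({lower:.4f}, {upper:.4f})")
```

Note that the square root is taken over the whole sum of the two variance terms; omitting it makes the interval far too narrow.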
Chapter 8 Estimation and Hypothesis Testing for Two Populations 133

Exercise 8.1
A bottling company has gathered information regarding the diameter of
bottles produced by two machines, I and II. A sample of 70 bottles
taken from machine I gave an average diameter of 8.56 cm with a
standard deviation of 0.5619 cm. Another random sample of 60 bottles
taken from machine II gave an average diameter of 8.78 cm with a
standard deviation of 0.5925 cm.
(a) Let $\mu_1$ and $\mu_2$ be the population means of diameters of bottles
produced by machines I and II, respectively. What is the point
estimate of $\mu_1 - \mu_2$ and its margin of error?
(b) Construct a 97% confidence interval for $\mu_1 - \mu_2$.

Exercise 8.2
According to a technician's report, the mean time spent by students
using the internet is 20.6 hours a week in Lab I and 27.5 hours a week
in Lab II. Suppose that these means are based on random samples of 30
computers in Lab I and 45 computers in Lab II. The standard deviations
are 4.59 and 3.35 hours for Lab I and II, respectively. Construct a 92%
confidence interval for the difference in mean time spent by students
using the internet between the two labs.

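Hypothesis Testing about $\mu_1 - \mu_2$

Null Hypothesis, $H_0$                         Alternative Hypothesis, $H_1$

$\mu_1 = \mu_2$ or $\mu_1 - \mu_2 = 0$         $\mu_1 \ne \mu_2$ or $\mu_1 - \mu_2 \ne 0$

$\mu_1 \ge \mu_2$ or $\mu_1 - \mu_2 \ge 0$     $\mu_1 < \mu_2$ or $\mu_1 - \mu_2 < 0$

$\mu_1 \le \mu_2$ or $\mu_1 - \mu_2 \le 0$     $\mu_1 > \mu_2$ or $\mu_1 - \mu_2 > 0$

The test statistic z for $\bar{x}_1 - \bar{x}_2$ is given by

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}}$$

The value of $\mu_1 - \mu_2$ is substituted from $H_0$. If $\sigma_1$ and $\sigma_2$ are
unknown, $\sigma_{\bar{x}_1 - \bar{x}_2}$ is replaced by its point estimator $s_{\bar{x}_1 - \bar{x}_2}$.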

Example 8.2
By referring to Exercise 8.1, test at the 5% significance level whether
you can conclude that the mean diameters of bottles are different for
the two machines, I and II.

Solution

$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \ne 0$

The critical values are $\pm 1.96$ and the rejection regions are $z < -1.96$ and
$z > 1.96$.

Test statistic:

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{(8.56 - 8.78) - 0}{\sqrt{\frac{0.5619^2}{70} + \frac{0.5925^2}{60}}} = -2.16$$

Since $z = -2.16 < -1.96$, we reject $H_0$.

We conclude that the mean diameter of bottles is different for the two
machines, I and II.
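
The test in Example 8.2 can be reproduced with a few lines of Python. This is a sketch under the large-sample assumption, using only the standard library; `two_sample_z` is a hypothetical helper name, and the two-sided p-value is an extra quantity that the example itself does not compute.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(n1, xbar1, s1, n2, xbar2, s2, hypothesized_diff=0.0):
    """z statistic for H0: mu1 - mu2 = hypothesized_diff (large samples)."""
    std_err = sqrt(s1**2 / n1 + s2**2 / n2)
    return ((xbar1 - xbar2) - hypothesized_diff) / std_err


# Data of Exercise 8.1, as used in Example 8.2.
z = two_sample_z(70, 8.56, 0.5619, 60, 8.78, 0.5925)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # about 1.96 for a two-tailed test
reject = abs(z) > z_crit                          # True means reject H0
p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # two-sided p-value

print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
print(f"two-sided p-value = {p_value:.4f}")
```

For a one-tailed test, the critical value would instead come from `inv_cdf(1 - alpha)` and the comparison would use the sign of z rather than its absolute value.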

Exercise 8.3
A car magazine is comparing the total repair costs incurred during the
first three years on two sports cars, the T-999 and the XPY. Random
samples of 50 T-999 cars and 70 XPY cars are taken. All 120 cars are
three years old and have similar mileages. The mean repair cost for the
50 T-999 cars is RM4500 for the first three years, with a standard
deviation of RM550. For the 70 XPY cars, the mean is RM5000 with a
standard deviation of RM650. Using the 2% significance level, can you
conclude that the mean repair cost is lower for the T-999 than for the XPY?

8.2 Inferences About the Difference Between Two Population Means
for Small and Independent Samples: Equal Standard Deviations
The following assumptions must hold true in order to use the
t-distribution to make inferences about $\mu_1 - \mu_2$:

(a) The two populations from which the two samples are drawn
are normally or approximately normally distributed.
(b) The samples are small ($n_1 < 30$ and $n_2 < 30$) and they are
independent.
(c) The standard deviations $\sigma_1$ and $\sigma_2$ of the two populations are
unknown but they are assumed to be equal.

Pooled Standard Deviation for Two Samples

The pooled standard deviation for two samples is given by

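$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

where $n_1$ and $n_2$ are the sizes of the two samples and $s_1^2$ and $s_2^2$ are
the variances of the two samples. Here $s_p$ is the estimator of $\sigma$, the
common standard deviation of the two populations.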

Estimator of the Standard Deviation of $\bar{x}_1 - \bar{x}_2$

$$s_{\bar{x}_1 - \bar{x}_2} = s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Interval Estimation of $\mu_1 - \mu_2$

The $(1 - \alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is given by

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\,s_{\bar{x}_1 - \bar{x}_2}$$

Note:
i. $t_{\alpha/2}$ is obtained from the t-distribution table for a given
confidence level.
ii. Degrees of freedom = $n_1 + n_2 - 2$

Hypothesis Tests About $\mu_1 - \mu_2$

The test statistic t for $\bar{x}_1 - \bar{x}_2$ is given by

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}}$$

Example 8.3
The following information was obtained from two independent samples
selected from two normally distributed populations with unknown but
equal standard deviations.

$n_1 = 25$, $\bar{x}_1 = 12.78$, $s_1 = 3.55$
$n_2 = 22$, $\bar{x}_2 = 13.88$, $s_2 = 3.29$

Construct a 98% confidence interval for $\mu_1 - \mu_2$.

Solution

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(24)(3.55^2) + (21)(3.29^2)}{25 + 22 - 2}} = 3.4311$$

$$s_{\bar{x}_1 - \bar{x}_2} = s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 3.4311\sqrt{\frac{1}{25} + \frac{1}{22}} = 1.003$$

From the t-table with $\alpha/2 = 0.01$ and degrees of freedom = 25 + 22 − 2 =
45, the t-value is 2.412.

The 98% confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\,s_{\bar{x}_1 - \bar{x}_2}$$
$$= (12.78 - 13.88) \pm 2.412(1.003)$$
$$= (-3.5192,\ 1.3192)$$
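
The pooled calculation in Example 8.3 can be sketched in Python as follows. The standard library has no t-distribution, so the critical value is passed in as read from a t-table; the value 2.412 for 45 degrees of freedom is an assumption of this sketch, and the function names are illustrative only.

```python
from math import sqrt


def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation s_p for two samples with equal population SDs."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def pooled_interval(n1, xbar1, s1, n2, xbar2, s2, t_crit):
    """Return (lower, upper) for mu1 - mu2 with n1 + n2 - 2 degrees of freedom."""
    std_err = pooled_sd(n1, s1, n2, s2) * sqrt(1 / n1 + 1 / n2)
    diff = xbar1 - xbar2
    return diff - t_crit * std_err, diff + t_crit * std_err


# Data of Example 8.3.
sp = pooled_sd(25, 3.55, 22, 3.29)
t_crit = 2.412  # assumed table value of t_{0.01} for 45 degrees of freedom
lower, upper = pooled_interval(25, 12.78, 3.55, 22, 13.88, 3.29, t_crit)

print(f"s_p = {sp:.4f}")
print(f"98% CI for mu1 - mu2: ({lower:.4f}, {upper:.4f})")
```

Because the interval contains zero, the same data would not lead to rejecting $H_0: \mu_1 - \mu_2 = 0$ in a two-tailed test at the 2% significance level.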

Exercise 8.4
The melting points of two substances used in a pharmaceutical process
were investigated by melting 15 samples of each substance. The sample
mean and standard deviation for substance 1 were $\bar{x}_1 = 95$ °C and
$s_1 = 5$ °C, while for substance 2 they were $\bar{x}_2 = 90$ °C and $s_2 = 4$ °C.
Assume that the melting temperature for both substances is normally
distributed with equal but unknown population standard deviations.
(a) Construct a 98% confidence interval for the difference between the
corresponding population means of the two groups.
(b) Using the 4% significance level, can you conclude that the melting
points of the two substances are different?

Exercise 8.5
A random sample of 12 cars driven by men on a highway was taken, and
the mean speed was found to be 70 miles per hour with a standard
deviation of 2.5 miles per hour. Another sample of 15 cars driven by
women on the same highway gave a mean speed of 68 miles per hour
with a standard deviation of 2.2 miles per hour. Assume that the speeds
at which all men and all women drive cars on this highway are both
normally distributed with the same population standard deviation.
(a) Construct a 90% confidence interval for the difference between the
mean speeds of cars driven by all men and all women on this
highway.
(b) Test at the 2.5% significance level whether the mean speed of cars
driven by all men on this highway is greater than that of cars driven
by all women.

8.3 Inferences About the Difference Between Two Population Means
for Small and Independent Samples: Unequal Standard Deviations

The t-distribution is used to make inferences about $\mu_1 - \mu_2$ if

(a) the two populations from which the two samples are drawn
are normally or approximately normally distributed,
(b) the samples are small ($n_1 < 30$ and $n_2 < 30$) and they are
independent,
(c) the standard deviations $\sigma_1$ and $\sigma_2$ of the two populations
are unknown and unequal,

and the degrees of freedom are given by

$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(\frac{s_1^2}{n_1}\right)^2}{n_1 - 1} + \frac{\left(\frac{s_2^2}{n_2}\right)^2}{n_2 - 1}}$$

The degrees of freedom given by the formula are always rounded down to
the nearest integer.

Estimate of the Standard Deviation of $\bar{x}_1 - \bar{x}_2$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Interval Estimation of $\mu_1 - \mu_2$

The $(1 - \alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is given by

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\,s_{\bar{x}_1 - \bar{x}_2}$$

Hypothesis Tests About $\mu_1 - \mu_2$

The test statistic t for $\bar{x}_1 - \bar{x}_2$ is given by

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}}$$

Example 8.4
Assuming that the two populations are normally distributed with unequal
and unknown population standard deviations, construct a 98% confidence
interval for $\mu_1 - \mu_2$ for the following:

$n_1 = 13$, $\bar{x}_1 = 52.78$, $s_1 = 3.55$
$n_2 = 19$, $\bar{x}_2 = 45.55$, $s_2 = 5.28$

Solution

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{3.55^2}{13} + \frac{5.28^2}{19}} = 1.5610$$

$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(\frac{s_1^2}{n_1}\right)^2}{n_1 - 1} + \frac{\left(\frac{s_2^2}{n_2}\right)^2}{n_2 - 1}} = \frac{\left(\frac{3.55^2}{13} + \frac{5.28^2}{19}\right)^2}{\frac{\left(\frac{3.55^2}{13}\right)^2}{12} + \frac{\left(\frac{5.28^2}{19}\right)^2}{18}} = 29.9994$$

which is rounded down to 29.

From the t-table with $\alpha/2 = 0.01$ and 29 degrees of freedom, the t-value
is 2.462.

The 98% confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\,s_{\bar{x}_1 - \bar{x}_2}$$
$$= (52.78 - 45.55) \pm 2.462(1.5610)$$
$$= (3.3868,\ 11.0732)$$
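
The degrees-of-freedom formula is easy to mistype by hand, so a short Python sketch of Example 8.4 may help. It carries full precision through the calculation, which matters here because the unrounded value lands very close to an integer before it is rounded down. The function names and the small lookup of t-table values are assumptions of this sketch, not part of the text.

```python
from math import floor, sqrt


def welch_df(n1, s1, n2, s2):
    """Unrounded degrees of freedom for unequal population standard deviations."""
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


def unequal_sd_interval(n1, xbar1, s1, n2, xbar2, s2, t_crit):
    """Return (lower, upper) for mu1 - mu2 without pooling the variances."""
    std_err = sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - t_crit * std_err, diff + t_crit * std_err


# Assumed excerpt of a t-table: t_{0.01} for selected degrees of freedom.
T_TABLE_001 = {28: 2.467, 29: 2.462, 30: 2.457}

# Data of Example 8.4.
raw_df = welch_df(13, 3.55, 19, 5.28)
df = floor(raw_df)            # always round down
t_crit = T_TABLE_001[df]
lower, upper = unequal_sd_interval(13, 52.78, 3.55, 19, 45.55, 5.28, t_crit)

print(f"df before rounding = {raw_df:.4f}, df used = {df}")
print(f"98% CI for mu1 - mu2: ({lower:.4f}, {upper:.4f})")
```

Rounding intermediate quantities too early can push the computed value just above 30, which is why the sketch rounds only once, at the end.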

Exercise 8.6
A manufacturing company is interested in buying one of two machines.
The company tested the two machines for production purposes. The first
machine was run for 15 days and produced an average of 111 items per
day with a standard deviation of 10 items. The second machine was run
for 18 days and produced an average of 118 items per day with a
standard deviation of 8 items. Assume that the production per day for
each machine is normally distributed and that the standard deviations of
the daily productions of the two populations are unequal.
(a) Make a 97% confidence interval for the difference between the
two population means.
(b) Using the 1% significance level, can you conclude that the
mean number of items produced per day by the first machine is
lower than that of the second machine?

8.4 Inferences About the Difference Between Two Population
Proportions for Large and Independent Samples

Mean, Standard Deviation, and Sampling Distribution of $\hat{p}_1 - \hat{p}_2$

The sampling distribution of $\hat{p}_1 - \hat{p}_2$ is (approximately) normal with

$$\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$$

and

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$

Take note that $q_1 = 1 - p_1$ and $q_2 = 1 - p_2$.

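Interval Estimation of $p_1 - p_2$

The $(1 - \alpha)100\%$ confidence interval for $p_1 - p_2$ is given by

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\,s_{\hat{p}_1 - \hat{p}_2}$$

where

$$s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$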

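Hypothesis Testing About $p_1 - p_2$

The test statistic z for $\hat{p}_1 - \hat{p}_2$ is given by

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{s_{\hat{p}_1 - \hat{p}_2}}$$

where

$$s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

The value of $\bar{p}$ is called the pooled sample proportion and is given by

$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} \quad \text{or} \quad \bar{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$$

where $x_1$ and $x_2$ are the numbers of elements in the two samples that
possess the characteristic of interest, and $\bar{q} = 1 - \bar{p}$.

Null Hypothesis, $H_0$                     Alternative Hypothesis, $H_1$

$p_1 = p_2$ or $p_1 - p_2 = 0$             $p_1 \ne p_2$ or $p_1 - p_2 \ne 0$

$p_1 \ge p_2$ or $p_1 - p_2 \ge 0$         $p_1 < p_2$ or $p_1 - p_2 < 0$

$p_1 \le p_2$ or $p_1 - p_2 \le 0$         $p_1 > p_2$ or $p_1 - p_2 > 0$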
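
The pooled test statistic can be illustrated with a short Python sketch. The counts below (60 out of 200 and 45 out of 250) are hypothetical numbers chosen only for illustration, and `two_proportion_z` is an illustrative name; neither comes from the text.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return (p_bar, z) for H0: p1 - p2 = 0 using the pooled sample proportion."""
    p_hat1 = x1 / n1
    p_hat2 = x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)          # pooled sample proportion
    q_bar = 1 - p_bar
    std_err = sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))
    return p_bar, (p_hat1 - p_hat2) / std_err


# Hypothetical data: 60 successes in 200 trials versus 45 successes in 250 trials.
p_bar, z = two_proportion_z(60, 200, 45, 250)

alpha = 0.01
z_crit = NormalDist().inv_cdf(1 - alpha)   # right-tailed test, H1: p1 > p2
print(f"pooled proportion = {p_bar:.4f}, z = {z:.2f}, critical value = {z_crit:.2f}")
print("reject H0" if z > z_crit else "do not reject H0")
```

The pooled proportion is used only for hypothesis tests in which $H_0$ states $p_1 = p_2$; the confidence interval keeps the two sample proportions separate.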

Example 8.5

Construct a 97% confidence interval for $p_1 - p_2$ for the following:

$n_1 = 300$, $\hat{p}_1 = 0.53$, $n_2 = 200$, $\hat{p}_2 = 0.59$

Solution

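$$s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \sqrt{\frac{(0.53)(0.47)}{300} + \frac{(0.59)(0.41)}{200}} = 0.0452$$

From the normal table with $\alpha/2 = 0.015$, the z-value is 2.17.

The 97% confidence interval for $p_1 - p_2$ is

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\,s_{\hat{p}_1 - \hat{p}_2}$$
$$= (0.53 - 0.59) \pm 2.17(0.0452)$$
$$= (-0.1581,\ 0.0381)$$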
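
As a check on Example 8.5, here is a minimal Python sketch using only the standard library. The function name `two_proportion_ci` is illustrative. Because it uses the exact normal quantile instead of the two-decimal table value, the endpoints may differ from the hand calculation in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(n1, p_hat1, n2, p_hat2, confidence):
    """Return (lower, upper) for p1 - p2 using unpooled sample proportions."""
    std_err = sqrt(p_hat1 * (1 - p_hat1) / n1 + p_hat2 * (1 - p_hat2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p_hat1 - p_hat2
    return diff - z * std_err, diff + z * std_err


# Data of Example 8.5.
lower, upper = two_proportion_ci(300, 0.53, 200, 0.59, 0.97)
print(f"97% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")
```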
Exercise 8.7
A consumer protection agency wanted to check whether the proportions of
luggage lost by two airline companies differ. A sample of 600 pieces of
luggage from airline company P showed that 12 were lost. Another sample
of 700 pieces of luggage from airline company Q showed that 13 were
lost.
(a) What is the point estimate of the difference between the two
population proportions?
(b) Construct a 94% confidence interval for the difference in the
proportions of all luggage lost between airline companies P and Q.
(c) Testing at the 3% significance level, can you conclude that
the proportions of all luggage lost by airline companies P and
Q are different?

Exercise 8.8
According to a survey by the Social Security Organization, 35.9% of
industrial accidents in the XYZ area and 33.4% of industrial accidents in
the PQR area were due to unsafe working conditions. Assume that these
percentages are based on random samples of 750 and 738 industrial
accidents in the XYZ and PQR areas, respectively.
(a) Determine a 93% confidence interval for the difference between the
two population proportions.
(b) At the 2% significance level, can you conclude that the proportion of
all industrial accidents that are due to unsafe working conditions in
the XYZ area is greater than that in the PQR area?

Review Exercises

1. A delivery company wants to check whether there is a difference
in oil consumption when they use one type of fuel enhancer proposed
by a representative of a direct selling company. They have 85
lorries with the same engine that will be driven for 500 km. The
fuel enhancer is used in 50 lorries, which give an average oil
consumption of 40 litres with a standard deviation of 5.4 litres. The
rest of the lorries give an average oil consumption of 45 litres with
a standard deviation of 7.6 litres.
(a) Construct a 98% confidence interval for the difference in
the average oil consumption when they use this type of
fuel enhancer.
(b) Is there evidence to indicate that the average oil consumption is
different when they use this type of fuel enhancer? Base your
answer on the results of part (a).

2. Two types of candies are suitable for use in decorating
chocolate muffins. The melting points of these candies are important.
It is known that $\sigma_1 = \sigma_2 = 1.0$ °C. From random samples of sizes
$n_1 = 10$ and $n_2 = 12$, we obtain $\bar{x}_1 = 62$ °C and $\bar{x}_2 = 50$ °C. The
bakery will not use candy 1 unless its mean melting point exceeds
that of candy 2 by at least 15 °C. Based on the sample information,
should the bakery use candy 1? Use $\alpha = 0.05$ in reaching a decision.

3. The diameter of bottles manufactured on two different machines
is being investigated. Two random samples of sizes $n_1 = 9$ and
$n_2 = 12$ are selected, and the sample means and sample variances are
$\bar{x}_1 = 8.73$, $s_1^2 = 0.35$, $\bar{x}_2 = 8.68$ and $s_2^2 = 0.40$, respectively.
Assume that $\sigma_1^2 = \sigma_2^2$ and that the data are drawn from a normal
distribution.
(a) Is there evidence to support the claim that the two
machines produce bottles with different mean diameters?
Use $\alpha = 0.01$.
(b) Construct a 95% confidence interval for the difference in
mean diameters of the bottles.

4. A manufacturer wants to compare the bursting pressure (in psi) for
two types of pipes. Both pipes have the same diameter and the
manufacturer selects 15 pipes of each type. An experiment was
carried out and the results are given below:

                                                    Type I    Type II
Mean of bursting pressure (in psi)                  383       389
Standard deviation of bursting pressure (in psi)    28.86     27.84

Assume that the two population variances are unequal.

(a) Construct a 98% confidence interval for the difference between
the two population means.
(b) Testing at the 1% significance level, does the experiment show
evidence that the mean bursting pressure is different
for the two types of pipes?
(c) What would your decision be in part (b) if the probability of
making a Type I error was zero? Explain.

5. The mean drying times (in minutes) for two brands of paint
are analyzed. The two brands of paint are applied to a 15 cm² area
under the same environment. The variance of drying time is known to
be 0.57 min² regardless of the brand of paint. 12 areas are
observed for each brand and the drying times are given below:

Brand X Brand Y

1.59 1.66
1.64 1.45
1.78 1.75
1.79 1.79
1.90 1.84
2.01 1.90
2.05 1.90
2.21 2.01
1.74 1.88
1.55 1.85
1.43 1.77
1.65 1.80

(a) Find a 96% confidence interval on the difference in mean
drying time for the brands of paint.
(b) Based on the sample information above, is there any
evidence to indicate that the mean drying time depends on the
choice of brand of paint? Use $\alpha = 0.03$.

6. A technician was asked to choose the type of ceiling fan to be
installed in a new building. He expected that ceiling fans of type A
have a better speed compared to type B. He examined the speed for
each type of ceiling fan with the same input power of 60 watts. 18
ceiling fans of type A and 15 ceiling fans of type B were observed.
The details that he obtained from his observations are given below.

                               Type A    Type B
Mean speed (in rpm)            3485      3000
Variance of speed (in rpm²)    125       155

Assume both populations are normally distributed with unequal
variances. Is there evidence to support the technician's expectation?
Use $\alpha = 0.01$.

7. A consumer protection agency believes that houses with a 1-unit
air conditioner of a brand certified by SIRIM have lower electricity
bills than houses with a 1-unit air conditioner of an uncertified brand.
A research committee selected 50 houses with a 1-unit certified air
conditioner and found that the mean electricity bill is RM75 with a
standard deviation of RM15.40. Houses with a 1-unit uncertified air
conditioner gave a mean electricity bill of RM83 with a standard
deviation of RM12. Is the consumer protection agency's belief
supported? Use $\alpha = 0.03$.

8. A quality control officer of a semiconductor chip factory would
like to know whether there is a difference between the
proportions of defective chips produced in two consecutive
months. A random sample of 1000 chips produced in the first month
was selected and he found 157 defective chips.
A random sample of 1200 chips produced in the second month was
selected and he found 187 defective chips.
(a) Construct a 92% confidence interval on the difference in
the proportions of defective chips produced.
(b) Do these data indicate that there is a difference in the
proportions of defective chips produced in the two
consecutive months? Use $\alpha = 0.04$.

9. Samples of plastic products were selected from two brands and the
tensile strength in pounds was measured with the following results:

Brand X    Brand Y
606        683
601        603
384        604
701        477
463        438
293        371
549        237
520        480

Assume that both population variances are equal.

(a) Find a 90% confidence interval on the difference of average
tensile strength.
(b) Do the data support the claim that the average tensile strength
is the same for both brands? Use $\alpha = 0.01$.

10. In a random sample of 500 buyers in hypermarket X, 28.5% stated
that they are in favour of tea brand LIPBOH, while in another sample
of 400 buyers in hypermarket Y, 35.6% of them stated that they are in
favour of the same brand of tea. At the 1% significance level, is there
any reason to believe that the proportion of buyers who are in favour of
tea brand LIPBOH in hypermarket Y is higher than the
proportion of buyers in hypermarket X? Find the p-value for this test.

11. An interior designer wanted to choose one type of bulb from 2
companies in this area. The first company claims that their bulbs have
a longer average life compared to the second company. The interior
designer decided to check this claim by selecting 20 units of bulbs
from each company and testing them. The results are as follows:

Company 1 (in hours) Company 2 (in hours)

375 345
348 358
380 350
370 368
364 355
335 357
340 344
384 330
373 335
364 338

It is known that the variances of life of all bulbs are unequal. Is there
evidence to support the claim of the first company? Use α = 0.05 .
12. It is believed that the proportion of mice that experienced side effects
once they were injected with vaccine A is less compared to vaccine B.
A researcher had tested each vaccine on 20 healthy white mice under
the same conditions and below are the results obtained:

Vaccine A Vaccine B

E E E E E E E E E E
E E E E E E E E E E
E E E E E E E E E E
E E E E E E E E E E

E – experienced side effects


E – did not experience side effects

From the above results, is there any reason to believe that vaccine A is
better than vaccine B at α = 0.005 ?
