
QMDS 202 Data Analysis and Modeling

Chapter 13 Inference About Comparing Two Populations


Dependent Versus Independent Samples
When making comparisons between the means of two populations, we need to pay
particular attention to how we intend to collect sample data.
1. If there is a definite reason for pairing (matching) corresponding data values, the
two samples are dependent samples.
2. If the two samples were obtained independently and there is no reason for pairing the
data values, the resulting samples are independent samples.
Inference About The Difference Between Two Means: Independent Samples
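When the populations are normally distributed or the sizes of two independent samples
are large (both n1 and n2 are greater than or equal to 30), the sample statistic
$\bar{X}_1 - \bar{X}_2$ is a normal random variable with mean $\mu_1 - \mu_2$ and variance
$\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$. The confidence interval of $(\mu_1 - \mu_2)$ is found by:

$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$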
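The known-variance interval above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `z_interval` is made up here, the critical value $z_{\alpha/2}$ is taken from the standard library's `statistics.NormalDist`, and the numbers reuse the outlet figures from Example 1 with an assumed 95% confidence level.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x1_bar, x2_bar, var1, var2, n1, n2, confidence=0.95):
    """Confidence interval for mu1 - mu2 when both population variances are known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)       # z_{alpha/2}
    margin = z * sqrt(var1 / n1 + var2 / n2)      # z_{alpha/2} * standard error
    diff = x1_bar - x2_bar
    return diff - margin, diff + margin


# Assumed inputs for illustration: sample means 170 and 165,
# known variances 36 and 25, 36 observations in each sample.
lower, upper = z_interval(170, 165, 36, 25, 36, 36)
print(f"{lower:.2f} < mu1 - mu2 < {upper:.2f}")   # 2.45 < mu1 - mu2 < 7.55
```

Because the interval lies entirely above zero, it points the same way as the one-tailed test in Example 1.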

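If $\sigma_1$ and $\sigma_2$ are unknown but equal, the confidence interval of $(\mu_1 - \mu_2)$ is found by:

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

where $t_{\alpha/2}$ is a score obtained from the t-distribution with $v = n_1 + n_2 - 2$ and

$$s_p^2 = \frac{s_1^2(n_1 - 1) + s_2^2(n_2 - 1)}{n_1 + n_2 - 2} = \text{pooled sample variance.}$$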

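If $\sigma_1$ and $\sigma_2$ are unknown and unequal, the confidence interval of $(\mu_1 - \mu_2)$ is found by:

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where $t_{\alpha/2}$ is a score obtained from the t-distribution with

$$v = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$

Round the result of the calculation of v to the nearest integer.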


Example 1
A corporation owns two outlets A and B. A random sample of 36 days at
outlet A had a mean of 170 sales daily. A random sample of 36 days at
outlet B had a mean of 165 sales daily. Assuming $\sigma_A^2 = 36$ and $\sigma_B^2 = 25$, can we
conclude that there are more sales in outlet A at the 0.05 level of significance?
Solution:

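Let population 1 be outlet A and population 2 be outlet B.

H0: $\mu_1 \le \mu_2 \Rightarrow \mu_1 - \mu_2 \le 0$
(D0 = the claimed value of $\mu_1 - \mu_2$ stated in H0 = 0)
H1: $\mu_1 > \mu_2 \Rightarrow \mu_1 - \mu_2 > 0$
$\alpha$ = 0.05
Both sample sizes are large $\Rightarrow$ $\bar{X}_1 - \bar{X}_2$ is a normal random variable.
Independent samples, $\sigma_1$ and $\sigma_2$ are known $\Rightarrow$ z-distribution will be used as the testing
distribution.

Reject H0 if TS > 1.645.

$$TS = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(170 - 165) - 0}{\sqrt{\dfrac{36}{36} + \dfrac{25}{36}}} = 3.84$$

TS = 3.84 > 1.645 $\Rightarrow$ Reject H0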


Conclusion: It is likely that outlet A has more daily sales than outlet B.
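As a quick check of Example 1, the following sketch recomputes the test statistic and the one-tailed critical value. It assumes the critical value is taken from `statistics.NormalDist` rather than a printed z table; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 1 (population 1 = outlet A, population 2 = outlet B)
x1_bar, x2_bar = 170, 165    # sample means
var1, var2 = 36, 25          # known population variances
n1, n2 = 36, 36              # sample sizes
d0 = 0                       # claimed difference under H0
alpha = 0.05

ts = (x1_bar - x2_bar - d0) / sqrt(var1 / n1 + var2 / n2)
z_crit = NormalDist().inv_cdf(1 - alpha)   # right-tail critical value z_alpha
reject = ts > z_crit                       # right-tailed rejection rule

print(round(ts, 2), round(z_crit, 3), reject)   # 3.84 1.645 True
```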

Example 2

A person claims that the rental rates for two bedroom apartments are the
same in sectors A and B of a city. To test this claim, another person
randomly samples apartment complexes in each sector and obtains the
following data:
Sector    Sample Size    Sample Mean    Sample Std. Dev.
A         10             $595           $62
B         12             $580           $32

Perform the test at α = 0.05. Assume that the populations of the rental
rates in the two sectors are normally distributed with unequal variances.
Solution:

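H0: $\mu_1 = \mu_2 \Rightarrow \mu_1 - \mu_2 = 0$
H1: $\mu_1 \ne \mu_2 \Rightarrow \mu_1 - \mu_2 \ne 0$
$\alpha$ = 0.05
The populations are normally distributed $\Rightarrow$ $\bar{X}_1 - \bar{X}_2$ is a normal random variable.

Independent samples, $\sigma_1$ and $\sigma_2$ are unknown and unequal $\Rightarrow$ t-distribution will be used as
the testing distribution.

$$v = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{\left(\dfrac{62^2}{10} + \dfrac{32^2}{12}\right)^2}{\dfrac{\left(62^2/10\right)^2}{10 - 1} + \dfrac{\left(32^2/12\right)^2}{12 - 1}} = 12.92 \approx 13$$

Reject H0 if TS < -2.16 or TS > 2.16.

$$TS = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(595 - 580) - 0}{\sqrt{\dfrac{(62)^2}{10} + \dfrac{(32)^2}{12}}} = 0.692$$

TS = 0.692 is not greater than 2.16 $\Rightarrow$ Cannot reject H0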


Conclusion: The mean rental rates in the two sectors are not significantly different.
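The degrees-of-freedom formula in Example 2 is easy to mistype by hand, so a small sketch may help. The helper names `welch_df` and `unequal_variances_ts` are made up for this illustration, and the critical value 2.160 is simply copied from the solution (a t table lookup), since the Python standard library has no t-distribution.

```python
from math import sqrt


def welch_df(s1, n1, s2, n2):
    """Degrees of freedom v for the unequal-variances t procedure."""
    a = s1 ** 2 / n1
    b = s2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


def unequal_variances_ts(x1_bar, x2_bar, s1, s2, n1, n2, d0=0.0):
    """Test statistic for H0: mu1 - mu2 = d0 with unequal variances."""
    return (x1_bar - x2_bar - d0) / sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


# Data from Example 2 (sector A = population 1, sector B = population 2)
v = welch_df(62, 10, 32, 12)
ts = unequal_variances_ts(595, 580, 62, 32, 10, 12)
t_crit = 2.160                     # t_{0.025, 13}, taken from a t table
reject = abs(ts) > t_crit          # two-tailed rejection rule

print(round(v, 2), round(v), round(ts, 3), reject)   # 12.92 13 0.692 False
```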

Example 3

In one canning plant the average net weight of string beans being packed
in model no. 123 cans for a random sample of 12 cans is 15.97 ounces,
with standard deviation of 0.15 ounce. At another canning plant the
average net weight of string beans being packed in the same model of cans
for a random sample of 15 cans is 16.14 ounces, with standard deviation
of 0.09 ounce. The distributions of the amounts packed are assumed to be
approximately normal and equality of variance is assumed. Construct a
confidence interval for the difference between the average net weight of
beans being packed in the model no. 123 cans at the two plants, using 90%
confidence level.

Solution:

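The populations are normally distributed $\Rightarrow$ $\bar{X}_1 - \bar{X}_2$ is a normal random variable.

Independent samples, $\sigma_1$ and $\sigma_2$ are unknown but equal $\Rightarrow$ t-distribution will be used to
construct the confidence interval.

$$s_p^2 = \frac{s_1^2(n_1 - 1) + s_2^2(n_2 - 1)}{n_1 + n_2 - 2} = \frac{0.15^2(11) + 0.09^2(14)}{12 + 15 - 2} = 0.0144$$

v = 12 + 15 - 2 = 25
The 90% confidence interval of $(\mu_1 - \mu_2)$ is:

$$(15.97 - 16.14) - 1.708\sqrt{0.0144\left(\frac{1}{12} + \frac{1}{15}\right)} < \mu_1 - \mu_2 < (15.97 - 16.14) + 1.708\sqrt{0.0144\left(\frac{1}{12} + \frac{1}{15}\right)}$$

$$-0.249 < (\mu_1 - \mu_2) < -0.091 \quad \text{(ans.)}$$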
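The pooled-variance interval of Example 3 can be reproduced with the sketch below. The function names are illustrative, and the t score 1.708 (t_{0.05, 25}) is passed in as a table value copied from the solution rather than computed.

```python
from math import sqrt


def pooled_variance(s1, n1, s2, n2):
    """Pooled sample variance s_p^2 for two independent samples."""
    return (s1 ** 2 * (n1 - 1) + s2 ** 2 * (n2 - 1)) / (n1 + n2 - 2)


def pooled_interval(x1_bar, x2_bar, s1, s2, n1, n2, t_crit):
    """Equal-variances confidence interval for mu1 - mu2."""
    sp2 = pooled_variance(s1, n1, s2, n2)
    margin = t_crit * sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = x1_bar - x2_bar
    return diff - margin, diff + margin


# Data from Example 3; 1.708 is t_{0.05, 25} from a t table
sp2 = pooled_variance(0.15, 12, 0.09, 15)
lower, upper = pooled_interval(15.97, 16.14, 0.15, 0.09, 12, 15, t_crit=1.708)

print(round(sp2, 4), round(lower, 3), round(upper, 3))   # 0.0144 -0.249 -0.091
```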

Note. We will use the equal-variances test statistic and confidence interval estimator
unless there is evidence (based on the sample variances) to indicate that the population
variances are unequal, in which case we will apply the unequal-variances test statistic and
confidence interval estimator.
Inference About The Difference Between Two Means: Matched Pairs Experiment
When the populations are normally distributed and dependent samples (matched data) are
obtained, the confidence interval of $(\mu_1 - \mu_2)$ is found by:

$$\bar{x}_D - t_{\alpha/2}\frac{s_D}{\sqrt{n_D}} < \mu_1 - \mu_2 < \bar{x}_D + t_{\alpha/2}\frac{s_D}{\sqrt{n_D}}$$

Example 4

An industrial engineer is evaluating a new technique to assemble air
compressors. A sample of 8 employees is selected at random, and the
number of compressors they each produce in one week using the existing
procedure is recorded. The same 8 workers are then trained to use the new
technique and their output for one week is then noted. Conduct a test to
determine whether there is a difference between the two techniques with
α = 0.05.

Employee      A    B    C    D    E    F    G    H
Old Method    80   88   76   90   74   70   81   83
New Method    85   84   80   93   83   71   79   83

Solution:

H0: $\mu_D = 0$ or $\mu_1 = \mu_2$ ($\mu_D$ = population mean of difference = $\mu_1 - \mu_2$)
H1: $\mu_D \ne 0$ or $\mu_1 \ne \mu_2$
$\alpha$ = 0.05
Assume the two populations are normally distributed.
Dependent samples $\Rightarrow$ t-distribution.
v = $n_D - 1$ = 8 - 1 = 7

Reject H0 if TS < -2.365 or TS > 2.365.

$$TS = \frac{\bar{x}_D - D_0}{s_D / \sqrt{n_D}} = \frac{-2 - 0}{4.14 / \sqrt{8}} = -1.37$$

TS = -1.37 is not less than -2.365 $\Rightarrow$ Cannot reject H0


Conclusion: The two techniques are not significantly different.
Let x1 = no. of air compressors produced by the old technique
    x2 = no. of air compressors produced by the new technique

Employee    x1    x2    D = x1 - x2    D - x̄_D    (D - x̄_D)²
A           80    85    -5             -3          9
B           88    84     4              6          36
C           76    80    -4             -2          4
D           90    93    -3             -1          1
E           74    83    -9             -7          49
F           70    71    -1              1          1
G           81    79     2              4          16
H           83    83     0              2          4
Total                   -16                        120

$\bar{x}_D$ = sample mean of difference = $\dfrac{\sum D}{n_D} = \dfrac{-16}{8} = -2$

$n_D$ = number of paired values in the sample

$s_D$ = sample standard deviation of difference = $\sqrt{\dfrac{\sum (D - \bar{x}_D)^2}{n_D - 1}} = \sqrt{\dfrac{120}{7}} = 4.14$
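The matched-pairs arithmetic of Example 4 can be checked with a short sketch that uses only the standard library. The critical value 2.365 (t_{0.025, 7}) is copied from the solution as a table value; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Weekly output of the same 8 employees (Example 4)
old = [80, 88, 76, 90, 74, 70, 81, 83]   # x1: existing procedure
new = [85, 84, 80, 93, 83, 71, 79, 83]   # x2: new technique

d = [x1 - x2 for x1, x2 in zip(old, new)]   # paired differences D = x1 - x2
n_d = len(d)
d_bar = mean(d)            # sample mean of the differences
s_d = stdev(d)             # sample standard deviation (divides by n_d - 1)

ts = (d_bar - 0) / (s_d / sqrt(n_d))   # claimed mean difference under H0 is 0
t_crit = 2.365                         # t_{0.025, 7}, taken from a t table
reject = abs(ts) > t_crit              # two-tailed rejection rule

print(sum(d), d_bar, round(s_d, 2), round(ts, 2), reject)
```

Pairing matters here: the test works on the eight differences as a single sample, so the two columns are never treated as independent samples.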

Inference About The Ratio Of Two Variances


When comparing two population variances $\sigma_1^2$ and $\sigma_2^2$:
1. The data in the two populations must be normally distributed.
2. The two samples must be independent.
F Distribution
An F distribution is the sampling distribution for the variable $\dfrac{s_1^2 / \sigma_1^2}{s_2^2 / \sigma_2^2}$, and it has the
following properties:
1. There are no negative values in an F distribution, so the scale of possible F values
extends from 0 to the right in a positive direction.
2. An F distribution is not symmetrical like the z or t distributions; rather, it is skewed to
the right like a $\chi^2$ distribution.
3. There are many F distributions, and each one is determined by its numerator and
denominator degrees of freedom, $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$, that is, by the number of
observations in the two samples.
Example 5

A company makes ball bearings that are used in tractors and other
equipment. On the 8-to-4 shift, a random sample of 16 ball bearings is
selected and the diameters are measured. The sample variance is 17.39.
Later, a random sample of 13 ball bearings is selected from the 4-to-midnight
shift, and the sample variance for the diameter measures is found
to be 12.83. Test the hypothesis at the 0.05 level that the population
variances for both shifts are equal.
Solution:

H0: $\sigma_1^2 / \sigma_2^2 = 1$
H1: $\sigma_1^2 / \sigma_2^2 \ne 1$
$\alpha$ = 0.05
Assume the two populations are normally distributed.
Independent samples $\Rightarrow$ F-distribution will be used as the testing distribution.
v1 = numerator degrees of freedom = n1 - 1 = 16 - 1 = 15
v2 = denominator degrees of freedom = n2 - 1 = 13 - 1 = 12

Reject H0 if TS < $F_{1-\alpha/2,\,v_1,\,v_2} = \dfrac{1}{F_{\alpha/2,\,v_2,\,v_1}} = \dfrac{1}{2.96} = 0.3378$ or TS > $F_{\alpha/2,\,v_1,\,v_2}$ = 3.18.

$$TS = \frac{s_1^2}{s_2^2} = \frac{17.39}{12.83} = 1.36$$

TS = 1.36, which is not greater than 3.18 $\Rightarrow$ Cannot reject H0.


Conclusion: There is no significant difference between the population variances of the
two shifts.
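The two-tailed F test in Example 5 relies on the identity that a lower-tail critical value is the reciprocal of an upper-tail value with the degrees of freedom swapped. The sketch below retraces that step; the two F table values (3.18 and 2.96) are copied from the solution, not computed, because the standard library has no F-distribution.

```python
# Data from Example 5 (shift 1 = 8-to-4, shift 2 = 4-to-midnight)
s1_sq, n1 = 17.39, 16
s2_sq, n2 = 12.83, 13

v1, v2 = n1 - 1, n2 - 1        # numerator and denominator degrees of freedom
ts = s1_sq / s2_sq             # test statistic: ratio of sample variances

f_upper = 3.18                 # F_{0.025, 15, 12}, from an F table
f_swapped = 2.96               # F_{0.025, 12, 15}, from an F table
f_lower = 1 / f_swapped        # lower critical value F_{0.975, 15, 12}

reject = ts < f_lower or ts > f_upper   # two-tailed rejection rule
print(v1, v2, round(ts, 2), round(f_lower, 4), reject)   # 15 12 1.36 0.3378 False
```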

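Suppose:

H0: $\sigma_1^2 \le \sigma_2^2$
H1: $\sigma_1^2 > \sigma_2^2$
v1 = n1 - 1
v2 = n2 - 1
Reject H0 if TS > $F_{\alpha,\,v_1,\,v_2}$

TS = $s_1^2 / s_2^2$

Suppose:

H0: $\sigma_1^2 \ge \sigma_2^2$
H1: $\sigma_1^2 < \sigma_2^2$
v1 = n1 - 1
v2 = n2 - 1
Reject H0 if TS < $F_{1-\alpha,\,v_1,\,v_2}$

TS = $s_1^2 / s_2^2$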

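If independent samples are obtained from two normal populations, the confidence
interval of $\sigma_1^2 / \sigma_2^2$ can be found by:

$$\frac{s_1^2 / s_2^2}{F_{\alpha/2,\,v_1,\,v_2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2 / s_2^2}{1 / F_{\alpha/2,\,v_2,\,v_1}}$$

In Example 5, the 95% confidence interval of $\sigma_1^2 / \sigma_2^2$ is (using the rounded ratio 17.39 / 12.83 ≈ 1.36):

$$\frac{17.39 / 12.83}{3.18} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{17.39 / 12.83}{1 / 2.96}$$

$$0.43 < \frac{\sigma_1^2}{\sigma_2^2} < 4.03$$

Inference About The Difference Between Two Population Proportions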

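If two samples are taken independently from two populations and the sizes of the samples
are sufficiently large, that is, $n_1\hat{p}_1 \ge 5$, $n_1(1 - \hat{p}_1) \ge 5$, $n_2\hat{p}_2 \ge 5$ and $n_2(1 - \hat{p}_2) \ge 5$,
then the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is approximately normal with the
mean $p_1 - p_2$ and the variance $\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}$.

The confidence interval for $p_1 - p_2$ can be found by:

$$(\hat{p}_1 - \hat{p}_2) - z_{\alpha/2}\,\hat{\sigma}_{\hat{p}_1 - \hat{p}_2} < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + z_{\alpha/2}\,\hat{\sigma}_{\hat{p}_1 - \hat{p}_2}$$

where $\hat{\sigma}_{\hat{p}_1 - \hat{p}_2} = \sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$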

Example 6

In attempting to assess voter sentiment regarding a state bonding proposal,
a legislator has a random sample of 200 people polled in each of two
districts containing a large number of voters. In the first district 120 of the
200 people interviewed expressed their approval of the proposal, and in the
second district 100 of the 200 people interviewed expressed approval.
Using a 95% confidence level, estimate the difference between the
proportions of people in the two districts supporting the bonding proposal.

Solution:

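Independent large samples:

$n_1\hat{p}_1 = 200 \cdot \dfrac{120}{200} = 120 \ge 5$
$n_1(1 - \hat{p}_1) = 200 \cdot \dfrac{80}{200} = 80 \ge 5$
$n_2\hat{p}_2 = 200 \cdot \dfrac{100}{200} = 100 \ge 5$
$n_2(1 - \hat{p}_2) = 200 \cdot \dfrac{100}{200} = 100 \ge 5$

$$\hat{p}_1 - \hat{p}_2 = \frac{120}{200} - \frac{100}{200} = 0.1$$

$$\hat{\sigma}_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} = \sqrt{\frac{0.6 \times 0.4}{200} + \frac{0.5 \times 0.5}{200}} \approx 0.05$$

The 95% confidence interval of $(p_1 - p_2)$ is:

$$0.1 - 1.96(0.05) < p_1 - p_2 < 0.1 + 1.96(0.05)$$

$$0.002 < (p_1 - p_2) < 0.198 \quad \text{(ans.)}$$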
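Example 6 can be retraced with the sketch below, which also checks the large-sample conditions. It assumes $z_{0.025}$ is taken from `statistics.NormalDist` and keeps the standard error unrounded (about 0.0495), so the limits come out near 0.003 and 0.197 rather than the 0.002 and 0.198 obtained with the rounded value 0.05.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 6: approvals out of 200 interviews in each district
x1, n1 = 120, 200
x2, n2 = 100, 200

p1_hat, p2_hat = x1 / n1, x2 / n2

# Large-sample check: every success and failure count must be at least 5
large_enough = all(count >= 5 for count in (x1, n1 - x1, x2, n2 - x2))

se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z = NormalDist().inv_cdf(0.975)        # z_{alpha/2} for 95% confidence
diff = p1_hat - p2_hat
lower, upper = diff - z * se, diff + z * se

print(large_enough, round(se, 4), round(lower, 3), round(upper, 3))
```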

Hypothesis Testing for p1 and p2:


Example 7

An election candidate feels that male and female voters have
the same opinion of him. A random sample of 36 male voters showed that
12 of these voters favored his election. In a random
sample of 50 female voters, 18 were his supporters. Test the validity of the
candidate's assumption, using α = 0.05.

Solution:

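H0: $p_1 = p_2 \Rightarrow p_1 - p_2 = 0$ (D = the claimed value of $p_1 - p_2$ stated in H0 = 0)
H1: $p_1 \ne p_2 \Rightarrow p_1 - p_2 \ne 0$
$\alpha$ = 0.05
Independent large samples:

$n_1\hat{p}_1 = 36 \cdot \dfrac{12}{36} = 12 \ge 5$
$n_1(1 - \hat{p}_1) = 36 \cdot \dfrac{24}{36} = 24 \ge 5$
$n_2\hat{p}_2 = 50 \cdot \dfrac{18}{50} = 18 \ge 5$
$n_2(1 - \hat{p}_2) = 50 \cdot \dfrac{32}{50} = 32 \ge 5$

z-distribution will be used as the testing distribution.

Reject H0 if TS < -1.96 or TS > 1.96.

$$TS = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n_1} + \dfrac{\hat{p}(1 - \hat{p})}{n_2}}} = \frac{0.33 - 0.36}{\sqrt{\dfrac{0.3488 \times 0.6512}{36} + \dfrac{0.3488 \times 0.6512}{50}}} = -0.288$$

where $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{12 + 18}{36 + 50} = 0.3488$ is the pooled sample proportion.

TS = -0.288 is not less than -1.96 $\Rightarrow$ Cannot reject H0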


Conclusion: There is insufficient evidence to reject the candidate's claim.
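A sketch of the pooled-proportion test in Example 7 follows. Note one assumption about rounding: the code keeps $\hat{p}_1 = 12/36$ unrounded, which gives a test statistic of about -0.256, whereas the handout's -0.288 results from rounding $\hat{p}_1$ to 0.33. The decision is the same either way.

```python
from math import sqrt

# Data from Example 7 (sample 1 = male voters, sample 2 = female voters)
x1, n1 = 12, 36
x2, n2 = 18, 50

p1_hat, p2_hat = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)       # pooled proportion, used because H0 says p1 = p2

se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
ts = (p1_hat - p2_hat) / se
z_crit = 1.96                          # z_{0.025}, as in the solution
reject = abs(ts) > z_crit              # two-tailed rejection rule

print(round(p_pooled, 4), round(ts, 3), reject)   # 0.3488 -0.256 False
```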

If D (the claimed value of $p_1 - p_2$ stated in H0) $\ne 0$, then we should use the following test
statistic:

$$TS = \frac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}}$$
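For a nonzero claimed difference, the standard error is not pooled, because H0 no longer says the two proportions are equal. The sketch below illustrates this with a hypothetical helper `proportion_ts`. The counts are borrowed from Example 6, and the claimed difference D = 0.05 is a made-up value chosen only for illustration; it does not appear in the notes.

```python
from math import sqrt


def proportion_ts(x1, n1, x2, n2, d):
    """Test statistic for H0: p1 - p2 = d with d != 0 (unpooled standard error)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return (p1_hat - p2_hat - d) / se


# Counts from Example 6; d = 0.05 is an assumed claim for illustration only
ts = proportion_ts(120, 200, 100, 200, d=0.05)
print(round(ts, 2))   # 1.01
```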

Review Problems: 13.6, 13.7, 13.8 a to c, 13.12, 13.52, 13.56, 13.76, 13.80, 13.88 a to c,
13.89, 13.90, 13.94.
