Chapter 1

(continued)
 z  Count  Percent
 0   17    36.17
 1   22    46.81
 2    6    12.77
 3    1     2.13
 5    1     2.13
 N = 47
.362, .638

b.
 z  Count  Percent
 0   13    27.66
 1   11    23.40
 2    3     6.38
 3    7    14.89
 4    5    10.64
 5    3     6.38
 6    3     6.38
 8    2     4.26
 N = 47
.894, .830

21. a.
 Class      Freq  Rel. freq
 0–<100      21    0.21
 100–<200    32    0.32
 200–<300    26    0.26
 300–<400    12    0.12
 400–<500     4    0.04
 500–<600     3    0.03
 600–<700     1    0.01
 700–<800     0    0.00
 800–<900     1    0.01
            100    1.00
The distribution is skewed to the right, or positively skewed. There is a gap in the histogram, and what appears to be an outlier in the '500–550' interval.
The histogram is skewed right, with a majority of observations between 0 and 300 cycles. The class holding the most observations is between 100 and 200 cycles.

c. .79

23.
 Class    Freq   Class      Freq
 10–<20    8     1.1–<1.2    2
 20–<30   14     1.2–<1.3    6
 30–<40    8     1.3–<1.4    7
 40–<50    4     1.4–<1.5    9
 50–<60    3     1.5–<1.6    6
 60–<70    2     1.6–<1.7    4
 70–<80    1     1.7–<1.8    5
          40     1.8–<1.9    1
                            40
The original distribution is positively skewed. The transformation creates a much more symmetric, mound-shaped histogram.

25. a.
 Class interval  Freq  Rel. freq
 0–<50            9    0.18
 50–<100         19    0.38
 100–<150        11    0.22
 150–<200         4    0.08
 200–<250         2    0.04
 250–<300         2    0.04
 300–<350         1    0.02
 350–<400         1    0.02
 ≥400             1    0.02
                 50    1.00
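The comment on Exercise 23 (a transformation turning a positively skewed histogram into a symmetric, mound-shaped one) can be sketched numerically. This is a minimal illustration with hypothetical lognormal data, not the exercise's data set:

```python
import math
import random

def skewness(xs):
    """Sample skewness: the average cubed z-score."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return sum(((x - mean) / sd) ** 3 for x in xs) / n

random.seed(1)
# Positively skewed sample (lognormal); its log is normal, hence symmetric.
raw = [math.exp(random.gauss(0, 1)) for _ in range(1000)]
logged = [math.log(x) for x in raw]
```

The raw sample has large positive skewness, while the logged sample's skewness is near zero, which is exactly the effect the answer describes.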
69. The mean is .93 and the standard deviation is .081. The distribution is fairly symmetric with a central peak, as shown by the stem-and-leaf display:

Leaf unit = 0.010
  7 | 7
  8 | 11
  8 | 556
  9 | 22333344
  9 | 55
 10 | 04
 10 | 55

71. a. Mode = .93. It occurs four times in the data set.
b. The modal category is the one in which the most observations occur.

73. The measures that are sensitive to outliers are the mean and the midrange. The mean is sensitive because all values are used in computing it. The midrange is the most sensitive because it uses only the most extreme values in its computation. The median, the trimmed mean, and the midfourth are less sensitive to outliers. The median is the most resistant to outliers because it uses only the middle value (or values) in its computation. The midfourth is also quite resistant because it uses the fourths. The resistance of the trimmed mean increases with the trimming percentage.

75. a. sY² = sX² and sY = sX  b. sZ² = 1 and sZ = 1

77. b. .552, .102 c. 30 d. 19

Chapter 2

7. a. {111, 112, 113, 121, 122, 123, 131, 132, 133, 211, 212, 213, 221, 222, 223, 231, 232, 233, 311, 312, 313, 321, 322, 323, 331, 332, 333}
b. {111, 222, 333}
c. {123, 132, 213, 231, 312, 321}
d. {111, 113, 131, 133, 311, 313, 331, 333}

9. a. S = {BBBAAAA, BBABAAA, BBAABAA, BBAAABA, BBAAAAB, BABBAAA, BABABAA, BABAABA, BABAAAB, BAABBAA, BAABABA, BAABAAB, BAAABBA, BAAABAB, BAAAABB, ABBBAAA, ABBABAA, ABBAABA, ABBAAAB, ABABBAA, ABABABA, ABABAAB, ABAABBA, ABAABAB, ABAAABB, AABBBAA, AABBABA, AABBAAB, AABABBA, AABABAB, AABAABB, AAABBBA, AAABBAB, AAABABB, AAAABBB}
b. {AAAABBB, AAABABB, AAABBAB, AABAABB, AABABAB}

13. a. .07 b. .30 c. .57

15. a. They are awarded at least one of the first two projects, .36.
b. They are awarded neither of the first two projects, .64.
c. They are awarded at least one of the projects, .53.
d. They are awarded none of the projects, .47.
e. They are awarded only the third project, .17.
f. Either they fail to get the first two or they are awarded the third, .75.

17. a. .572 b. .879

19. a. SAS and SPSS are not the only packages.
b. .7 c. .8 d. .2
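The sample space in answer 9(a) is the set of all distinct orderings of three B's and four A's, of which there are C(7,3) = 35. It can be regenerated mechanically (the variable name is illustrative):

```python
from itertools import permutations

# All distinct orderings of "BBBAAAA": C(7,3) = 35 outcomes.
arrangements = sorted(set("".join(p) for p in permutations("BBBAAAA")))
```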
99. a. P(G | R1 < R2 < R3) = 2/3, so classify as granite if R1 < R2 < R3.
b. P(G | R1 < R3 < R2) = .294, so classify as basalt if R1 < R3 < R2.
P(G | R3 < R1 < R2) = 1/15, so classify as basalt if R3 < R1 < R2.
c. .175 d. p > 14/17

101. a. 1/24 b. 3/8

103. s = 1

107. a. P(B0 | survive) = b0/[1 − (b1 + b2)cd]
P(B1 | survive) = b1(1 − cd)/[1 − (b1 + b2)cd]
P(B2 | survive) = b2(1 − cd)/[1 − (b1 + b2)cd]
b. .712, .058, .231

Chapter 3

17. a. .81 b. .162
c. The fifth battery must be an A, and one of the first four must also be an A, so p(5) = P(AUUUA or UAUUA or UUAUA or UUUAA) = .00324
d. P(Y = y) = (y − 1)(.1)^(y−2)(.9)², y = 2, 3, 4, 5, . . .

19. c. F(x) = 0 for x < 1; F(x) = log10([x] + 1) for 1 ≤ x ≤ 9; F(x) = 1 for x > 9.
d. .602, .301

21. F(x) = 0, x < 0; .10, 0 ≤ x < 1; .25, 1 ≤ x < 2; .45, 2 ≤ x < 3; .70, 3 ≤ x < 4; .90, 4 ≤ x < 5; .96, 5 ≤ x < 6; 1.00, 6 ≤ x

23. a. p(1) = .30, p(3) = .10, p(4) = .05, p(6) = .15, p(12) = .40
b. .30, .60

25. a. p(x) = (1/3)(2/3)^(x−1), x = 1, 2, 3, . . .
b. p(y) = (1/3)(2/3)^(y−2), y = 2, 3, 4, . . .
c. p(0) = 1/6, p(z) = (25/54)(4/9)^(z−1), z = 1, 2, 3, 4, . . .
29. a. .60 b. $110

31. a. 16.38, 272.298, 3.9936 b. 401 c. 2496 d. 13.66

33. Yes, because Σ(1/x²) is finite.

35. $700

37. E[h(X)] = .408 > 1/3.5 = .286, so you expect to win more if you gamble.

39. V(−X) = V(X)

41. a. 32.5 b. 7.5 c. V(X) = E[X(X − 1)] + E(X) − [E(X)]²

43. a. 1/4, 1/9, 1/16, 1/25, 1/100
b. μ = 2.64, σ = 1.54, P(|X − μ| ≥ 2σ) = .04 < .25, P(|X − μ| ≥ 3σ) = 0 < 1/9
The actual probability can be far below the Chebyshev bound, so the bound is conservative.
c. 1/9, equal to the Chebyshev bound
d. P(−1) = .02, P(0) = .96, P(1) = .02

45. MX(t) = .5e^t/(1 − .5e^t), E(X) = 2, V(X) = 2

47. pY(y) = .75(.25)^(y−1), y = 1, 2, 3, . . .

49. E(X) = 5, V(X) = 4

51. MY(t) = e^(t²/2), E(X) = 0, V(X) = 1

53. E(X) = 0, V(X) = 2

59. a. .850 b. .200 c. .200 d. .701 e. .851 f. .000 g. .570

61. a. .354 b. .114 c. .919

71. For p = .9 the probability is higher for B (.9963 versus .99 for A).
For p = .5 the probability is higher for A (.75 versus .6875 for B).

73. The tabulation for p > .5 is not needed.

75. a. 20, 16 (binomial, n = 100, p = .2) b. 70, 21

77. When p = .5, the true probability for k = 2 is .0414, compared to the bound of .25.
When p = .5, the true probability for k = 3 is .0026, compared to the bound of .1111.
When p = .75, the true probability for k = 2 is .0652, compared to the bound of .25.
When p = .75, the true probability for k = 3 is .0039, compared to the bound of .1111.

79. M(n−X)(t) = [p + (1 − p)e^t]^n, E(n − X) = n(1 − p), V(n − X) = np(1 − p)
Intuitively, the means of X and n − X should add to n and their variances should be the same.

81. a. .114 b. .879 c. .121 d. Use the binomial distribution with n = 15 and p = .1

83. a. h(x; 15, 10, 20) b. .0325 c. .6966

85. a. h(x; 10, 10, 20) b. .0325 c. h(x; n, n, 2n), E(X) = n/2, V(X) = n²/[4(2n − 1)]

87. a. nb(x; 2, .5) = (x + 1).5^(x+2), x = 0, 1, 2, 3, . . .
b. 3/16 c. 11/16 d. 2, 4

89. nb(x; 6, .5), E(X) = 6 = 3(2)

93. a. .932 b. .065 c. .068 d. .491 e. .251

95. a. .011 b. .441 c. .554, .459 d. .944

97. a. .491 b. .133

99. a. .122, .808, .283 b. 12, 3.464 c. .530, .011

101. a. .099 b. .135 c. 2

103. a. 4 b. .215 c. 1.15 years

105. a. .221 b. 6,800,000 c. p(x; 1608.5)

111. b. 3.114, .405, .636

113. a. b(x; 15, .75) b. .6865 c. .313 d. 45/4, 45/16 e. .309

115. .9914

117. a. p(x; 2.5) b. .067 c. .109

119. 1.813, 3.05

121. p(2) = p², p(3) = (1 − p)p², p(4) = (1 − p)p²,
p(x) = [1 − p(2) − . . . − p(x − 3)](1 − p)p², x = 5, 6, 7, . . .
Alternatively, p(x) = (1 − p)p(x − 1) + p(1 − p)p(x − 2), x = 5, 6, 7, . . . ; .99950841

131. b. .6p(x; λ) + .4p(x; μ) c. (λ + μ)/2 d. (λ + μ)/2 + (λ − μ)²/4

133. .5

137. X ~ b(x; 25, p), E(h(X)) = 500p + 750, σh(X) = 100√(p(1 − p))
Independence and constant probability might not be valid because of the effect that customers can have on each other. Also, store employees might affect customer decisions.

139.
 x     0       1       2       3       4
 p(x)  .07776  .10368  .19008  .20736  .17280

 x     5       6       7       8
 p(x)  .13824  .06912  .03072  .01024
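Answer 43 makes the point that the actual probability P(|X − μ| ≥ kσ) can sit far below the Chebyshev bound 1/k². A minimal sketch of that comparison, using a hypothetical pmf rather than the exercise's distribution:

```python
import math

def chebyshev_check(pmf, k):
    """Compare P(|X - mu| >= k*sigma) with the Chebyshev bound 1/k**2."""
    mu = sum(x * p for x, p in pmf.items())
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())
    sd = math.sqrt(var)
    actual = sum(p for x, p in pmf.items() if abs(x - mu) >= k * sd)
    return actual, 1 / k ** 2

# Hypothetical symmetric pmf (not from the exercise): mu = 2, sd ~ .949.
pmf = {0: .05, 1: .25, 2: .40, 3: .25, 4: .05}
actual2, bound2 = chebyshev_check(pmf, 2)  # actual .10 versus bound .25
```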
Chapter 4

3. a. .15 b. .40 c. .22 = P(A) = P(|X1 − X2| ≤ 2)
d. .17, .46
e.
 x1       0    1    2    3    4
 p1(x1)  .19  .30  .25  .14  .12
E(X1) = 1.7
f.
 x2       0    1    2    3
 p2(x2)  .19  .30  .28  .23
g. 0 = p(4, 0) ≠ p1(4)·p2(0) = (.12)(.19), so the two variables are not independent.

5. a. .54 b. .00018

7. a. .030 b. .120 c. .10, .30 d. .38
e. yes, p(x, y) = pX(x)·pY(y)

45. a. E(Y | X = x) = x²/2 b. V(Y | X = x) = x⁴/12
c. fY(y) = y^(−1/2) − 1, 0 < y < 1

47. a. p(1,1) = p(2,2) = p(3,3) = 1/9, p(2,1) = p(3,1) = p(3,2) = 2/9
b. pX(1) = 1/9, pX(2) = 3/9, pX(3) = 5/9
c. pY|X(1|1) = 1, pY|X(1|2) = 2/3, pY|X(2|2) = 1/3, pY|X(1|3) = .4, pY|X(2|3) = .4, pY|X(3|3) = .2
d. E(Y | X = 1) = 1, E(Y | X = 2) = 4/3, E(Y | X = 3) = 1.8, no
e. V(Y | X = 1) = 0, V(Y | X = 2) = 2/9, V(Y | X = 3) = .56

49. a. pX|Y(1|1) = .2, pX|Y(2|1) = .4, pX|Y(3|1) = .4, pX|Y(2|2) = 1/3, pX|Y(3|2) = 2/3, pX|Y(3|3) = 1
b. E(X | Y = 1) = 2.2, E(X | Y = 2) = 8/3, E(X | Y = 3) = 3, no
c. V(X | Y = 1) = .56, V(X | Y = 2) = 2/9, V(X | Y = 3) = 0
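The marginal and conditional quantities in answer 47 follow mechanically from the stated joint pmf (pX(x) = Σy p(x, y) and E(Y | X = x) = Σy y·p(x, y)/pX(x)). A small sketch using exact fractions:

```python
from fractions import Fraction as F

# Joint pmf as stated in answer 47(a).
joint = {(1, 1): F(1, 9), (2, 2): F(1, 9), (3, 3): F(1, 9),
         (2, 1): F(2, 9), (3, 1): F(2, 9), (3, 2): F(2, 9)}

def marginal_x(joint, x):
    """pX(x): sum the joint pmf over y."""
    return sum(p for (xx, y), p in joint.items() if xx == x)

def cond_exp_y_given_x(joint, x):
    """E(Y | X = x): weighted average of y under the conditional pmf."""
    px = marginal_x(joint, x)
    return sum(y * p for (xx, y), p in joint.items() if xx == x) / px
```

This reproduces pX(3) = 5/9 from part (b) and E(Y | X = 2) = 4/3, E(Y | X = 3) = 1.8 from part (d).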
79. f(x) = e^(−x/2) − e^(−x), x ≥ 0; f(x) = 0, x < 0.

87. c. If p(0) = .3, p(1) = .5, p(2) = .2, then 1 is the smaller of the two roots, so extinction is certain in this case with μ < 1.
If p(0) = .2, p(1) = .5, p(2) = .3, then 2/3 is the smaller of the two roots, so extinction is not certain with μ > 1.

89. a. P((X,Y) ∈ A) = F(b, d) − F(b, c) − F(a, d) + F(a, c)
b. P((X,Y) ∈ A) = F(10, 6) − F(10, 1) − F(4, 6) + F(4, 1)
P((X,Y) ∈ A) = F(b, d) − F(b, c−1) − F(a−1, d) + F(a−1, c−1)
c. At each (x*, y*), F(x*, y*) is the sum of the probabilities at points (x, y) such that x ≤ x* and y ≤ y*

F(x, y):
            x
         100   250
 y  200  .50   1
    100  .30   .50
      0  .20   .25

Chapter 6

(continued)
p(x/n)  0.0000  0.0000  0.0001  0.0008  0.0055
p(r)    .30  .40  .22  .08
d. .24

7.
 x    p(x)      x    p(x)      x    p(x)
 0.0  0.000045  1.4  0.090079  2.8  0.052077
 0.2  0.000454  1.6  0.112599  3.0  0.034718
 0.4  0.002270  1.8  0.125110  3.2  0.021699
 0.6  0.007567  2.0  0.125110  3.4  0.012764
 0.8  0.018917  2.2  0.113736  3.6  0.007091
 1.0  0.037833  2.4  0.094780  3.8  0.003732
 1.2  0.063055  2.6  0.072908  4.0  0.001866
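Answer 87 uses the standard result that the extinction probability of a branching process with offspring pmf p(0), p(1), p(2) is the smallest nonnegative root of s = p(0) + p(1)s + p(2)s². A sketch of that root computation:

```python
import math

def extinction_prob(p0, p1, p2):
    """Smaller root of p2*s**2 + (p1 - 1)*s + p0 = 0 (assumes p2 > 0)."""
    a, b, c = p2, p1 - 1.0, p0
    disc = math.sqrt(b * b - 4 * a * c)
    return min((-b - disc) / (2 * a), (-b + disc) / (2 * a))
```

This reproduces the two cases in the answer: extinction probability 1 when p(0) = .3, p(1) = .5, p(2) = .2 (mean offspring .9 < 1), and 2/3 when p(0) = .2, p(1) = .5, p(2) = .3.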
13. a. No, the distribution is clearly not symmetric. A positively skewed distribution, perhaps Weibull, lognormal, or gamma.
b. .0746
c. .00000092. No, 82 is not a reasonable value for μ.

15. a. .8366 b. no

17. 43.29

19. a. .9802, .4802 b. 32

21. a. .9839 b. .8932

27. a. 87,850, 19,100,116
b. In case of dependence, the mean calculation is still valid, but not the variance calculation.
c. .9973

29. a. .2871 b. .3695

31. .0317; Because each piece is played by the same musicians, there could easily be some dependence. If they perform the first piece slowly, then they might perform the second piece slowly, too.

33. a. 45 b. 68.33 c. 1, 13.67 d. 5, 68.33

35. a. 50, 10.308 b. .0076 c. 50 d. 111.56 e. 131.25

37. a. .9615 b. .0617

39. a. .5, n(n + 1)/4 b. .25, n(n + 1)(2n + 1)/24

41. 10:52.74

43. .48

45. b. MY(t) = 1/[1 − t²/(2n)]^n

47. Because χ²n is the sum of n independent random variables, each distributed as χ²1, the Central Limit Theorem applies.

53. a. 3.2 b. 10.24, the square of the answer to (a)

57. a. n2/(n2 − 2), n2 > 2
b. 2n2²(n1 + n2 − 2)/[n1(n2 − 2)²(n2 − 4)], n2 > 4

61. a. 4.32

65. a. The approximate value, .0228, is smaller because of skewness in the chi-squared distribution.
b. This approximation gives the answer .03237, agreeing with the software answer to this number of decimals.

67. No, the sum of the percentiles is not the same as the percentile of the sum, except that they are the same for the 50th percentile. For all other percentiles, the percentile of the sum is closer to the 50th percentile than is the sum of the percentiles.

69. a. 2360, 73.70 b. .9713

71. .9685

73. .9093. Independence is questionable because consumption one day might be related to consumption the next day.

79. 26, 1.64

81. If Z1 and Z2 are independent standard normal observations, then let
X = 5Z1 + 100, Y = 2(.5Z1 + (√3/2)Z2) + 50

Chapter 7

1. a. 113.73, X̄ b. 113, X̃
c. 12.74, S, an estimator for the population standard deviation
d. The sample proportion of students exceeding 100 in IQ is 30/33 = .91
e. .112, S/X̄

3. a. 1.3481, X̄ b. 1.3481, X̄ c. 1.78, X̄ + 1.282S d. .67 e. .0846

5. a. 1,703,000 b. 1,599,730 c. 1,601,438

7. a. 120.6 b. 1,206,000, 10,000X̄ c. .8 d. 120, X̃

9. a. X̄, 2.113 b. √(λ/n), .119

11. b. √(p1(1 − p1)/n1 + p2(1 − p2)/n2)
c. In part (b) replace p1 with X1/n1 and replace p2 with X2/n2

13. a. .9876 b. .6915

15. a. θ̂ = ΣXi²/(2n) b. 74.505

17. b. 4/9

19. a. p̂ = 2λ̂ − .30 = .20 c. p̂ = (100λ̂ − 9)/70

21. a. .15 b. yes c. .4437

23. a. θ̂ = (2x̄ − 1)/(1 − x̄) = 3
b. θ̂ = −[n/Σ ln(xi)] − 1 = 3.12

25. p̂ = r/(r + x) = .15. This is the number of successes over the number of trials, the same as the result in Exercise 21. It is not the same as the estimate of Exercise 17.

27. a. σ̂² = (1/n)ΣXi² b. σ̂² = (1/n)ΣXi²

29. a. θ̂ = ΣXi²/(2n) = 74.505, the same as in Exercise 15
b. √(2θ̂ ln(2)) = 10.16

31. λ̂ = −ln(p̂)/24 = .0120

33. No, statistician A does not have more information.

35. Π xi; Σ xi

37. I(.5·max(x1, x2, . . ., xn) ≤ θ ≤ min(x1, x2, . . ., xn))

39. a. 2X(n − X)/[n(n − 1)]

41. a. X̄ b. Φ((X̄ − c)/√(1 − 1/n))
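The moment estimator in answer 23(a) comes from matching x̄ to the population mean E(X) = (θ + 1)/(θ + 2), assuming the density has the usual form f(x; θ) = (θ + 1)x^θ on (0, 1) behind this answer, and solving for θ:

```python
def theta_mom(xbar):
    """Method-of-moments estimator: solve xbar = (theta+1)/(theta+2)."""
    return (2 * xbar - 1) / (1 - xbar)
```

With x̄ = .8 this gives 3, matching the stated answer; the MLE in part (b) differs slightly (3.12), as is typical.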
43. a. V(θ̃) = θ²/[n(n + 2)] b. θ²/n
c. The variance in (a) is below the bound of (b), but the theorem does not apply because the domain is a function of the parameter.

45. a. x̄ b. N(μ, σ²/n)
c. Yes, the variance is equal to the Cramér–Rao bound.
d. The answer in (b) shows that the asymptotic distribution of the theorem is actually exact here.

47. a. 2/σ²
b. The answer in (a) is different from the answer, 1/(2σ⁴), to 46(a), so the information does depend on the parameterization.

73. a. .00985 b. .0578

75. a. (x̄ − (s/√n)t.025,n−1,d, x̄ − (s/√n)t.975,n−1,d)
b. (3.01, 4.46)

77. a. 1/2^n b. n/2^n c. (n + 1)/2^n, 1 − (n + 1)/2^(n−1), (29.9, 39.3) with confidence level .9785

79. a. P(A1 ∩ A2) = .95² = .9025 b. P(A1 ∩ A2) ≥ .90
c. P(A1 ∩ A2) ≥ 1 − α1 − α2; P(A1 ∩ A2 ∩ . . . ∩ Ak) ≥ 1 − α1 − α2 − . . . − αk

Chapter 8

33. a. (38.081, 38.439) b. (100.55, 101.19), yes

35. a. Assuming normality, a 95% lower confidence bound is 8.11. When the bound is calculated from repeated independent samples, roughly 95% of such bounds should be below the population mean.
b. A 95% lower prediction bound is 7.03. When the bound is calculated from repeated independent samples, roughly 95% of such bounds should be below the value of an independent observation.

37. a. 378.85 b. 413.09 c. (333.88, 407.50)

39. 95% prediction interval: (.0498, .0772)

Chapter 9

1. a. yes b. no c. no d. yes e. no f. yes

5. H0: σ = .05 vs. Ha: σ < .05. Type I error: Conclude that the standard deviation is less than .05 mm when it is really equal to .05 mm. Type II error: Conclude that the standard deviation is .05 mm when it is really less than .05.

7. A type I error here involves saying that the plant is not in compliance when in fact it is. A type II error occurs when we conclude that the plant is in compliance when in fact it isn't. A government regulator might regard the type II error as being more serious.

9. a. R1
b. A type I error involves saying that the two companies are not equally favored when they are. A type II error involves saying that the two companies are equally favored when they are not.
c. binomial, n = 25, p = .5; .0433
d. .3, .4881; .4, .8452; .6, .8452; .7, .4881
e. If only 6 favor the first company, then reject the null hypothesis and conclude that the first company is not preferred.

11. a. H0: μ = 10 vs. Ha: μ ≠ 10 b. .0099
c. .5319, .0076 d. c = 2.58 e. c = 1.96
f. x̄ = 10.02, so do not reject H0
g. Recalibrate if z ≥ 2.58 or z ≤ −2.58

13. b. .00043, .0000075, less than .01

15. a. .0301 b. .0030 c. .0040

17. a. Because z = 2.56 > 2.33, reject H0 b. .84 c. 142 d. .0052

19. a. Because z = −2.27 > −2.58, do not reject H0 b. .22 c. 22

29. a. Because t = .50 < 1.895 = t.05,7, do not reject H0. b. .73

31. Because t = −1.24 > −1.397 = t.10,8, we do not have evidence to question the prior belief.

35. a. The distribution is fairly symmetric, without outliers.
b. Because t = 4.25 ≥ 3.499 = t.005,7, there is strong evidence to say that the amount poured differs from the industry standard, and indeed bartenders tend to exceed the standard.
c. Yes, the test in (b) depends on normality, and a normal probability plot gives no reason to doubt the assumption.
d. .643, .185, .016

37. a. Do not reject H0: p = .10 in favor of Ha: p > .10 because z = 1.33 < 1.645. Because the null hypothesis is not rejected, there could be a type II error.
b. .49, .27 c. 362

39. a. Do not reject H0: p = .02 in favor of Ha: p < .02 because z = −1.1 > −1.645. There is no strong evidence suggesting that the inventory be postponed.
b. .195 c. <.0000001

41. a. Reject H0 because z = 3.08 ≥ 2.58. b. .03

43. Using n = 25, the probability of 5 or more leaky faucets is .0980 if p = .10, and the probability of 4 or fewer leaky faucets is .0905 if p = .3. Thus, the rejection region is 5 or more, α = .0980, and β = .0905.

45. a. reject b. reject c. do not reject d. reject e. do not reject

47. a. .0778 b. .1841 c. .0250 d. .0066 e. .5438

49. a. P = .0403 b. P = .0176 c. P = .1304 d. P = .6532 e. P = .0021 f. P = .000022

51. Based on the given data, there is no reason to believe that pregnant women differ from others in terms of true average serum receptor concentration.

53. a. Because the P-value is .17, no modification is indicated. b. 997

55. Because t = −1.759 and the P-value = .089, which is less than .10, reject H0: μ = 3.0 against a two-tailed alternative at the 10% level. However, the P-value exceeds .05, so do not reject H0 at the 5% level. There
is just a weak indication that the percentage is not equal to 3% (lower than 3%).

57. a. Test H0: μ = 10 vs. Ha: μ < 10
b. Because the P-value is .017 < .05, reject H0, suggesting that the pens do not meet specifications.
c. Because the P-value is .045 > .01, do not reject H0, suggesting there is no reason to say the lifetime is inadequate.
d. Because the P-value is .0011, reject H0. There is good evidence showing that the pens do not meet specifications.

61. a. .98, .85, .43, .004, .0000002
b. .40, .11, .0062, .0000003
c. Because the null hypothesis will be rejected with high probability, even with only slight departure from the null hypothesis, it is not very useful to do a .01 level test.

63. b. 36.61 c. yes

65. a. Σxi ≥ c b. yes

67. Yes, the test is UMP for the alternative Ha: θ > .5 because the tests for H0: θ = .5 vs. Ha: θ = p0 all have the same form for any p0 > .5.

69. b. .05
c. .04345, .05826; Because .04345 < .05, the test is not unbiased.
d. .05114; not most powerful

71. b. The value of the test statistic is 3.041, so the P-value is .081, compared to .089 for Exercise 55.

73. A sample size of 32 should suffice.

75. a. Test H0: μ = 2150 vs. Ha: μ > 2150
b. t = (x̄ − 2150)/(s/√n) c. 1.33 d. .101
e. Do not reject H0 at the .05 level.

77. Because t = .77 and the P-value is .23, there is no evidence suggesting that coal increases the mean heat flux.

79. Conclude that activation time is too slow at the .05 level, but not at the .01 level.

81. A normal probability plot gives no reason to doubt the normality assumption. Because the sample mean is 9.815, giving t = 4.75 and an (upper tail) P-value of .00007, reject the null hypothesis at any reasonable level. The true average flame time is too high.

83. Assuming normality, calculate t = 1.70, which gives a two-tailed P-value of .102. Do not reject the null hypothesis H0: μ = 1.75.

85. The P-value for a lower tail test is .0014 (normal approximation, .0005), so it is reasonable to reject the idea that p = .75 and conclude that fewer than 75% of mechanics can identify the problem.

87. Because t = 6.43, giving an upper tail P-value of .0000002, conclude that the population mean time exceeds 15 minutes. This could result in a type I error.

89. Because the P-value is .013 > .01, do not reject the null hypothesis at the .01 level.

91. a. For the test of H0: μ = μ0 vs. Ha: μ > μ0 at level α, reject H0 if 2Σxi/μ0 ≥ χ²α,2n.
For the test of H0: μ = μ0 vs. Ha: μ < μ0 at level α, reject H0 if 2Σxi/μ0 ≤ χ²1−α,2n.
For the test of H0: μ = μ0 vs. Ha: μ ≠ μ0 at level α, reject H0 if 2Σxi/μ0 ≥ χ²α/2,2n or if 2Σxi/μ0 ≤ χ²1−α/2,2n.
b. Because Σxi = 737, the test statistic is 2Σxi/μ0 = 19.65, which gives a P-value of .52. There is no reason to reject the null hypothesis.

93. a. yes

Chapter 10

1. a. .4; it doesn't b. .0724, .269
c. Although the CLT implies that the distribution will be approximately normal when the sample sizes are each 100, the distribution will not necessarily be normal when the sample sizes are each 10.

3. Do not reject H0 because z = 1.76 < 2.33.

5. a. Ha says that the average calorie output for sufferers is more than 1 cal/cm²/min below that for non-sufferers. Reject H0 in favor of Ha because z = −2.90 ≤ −2.33.
b. .0019 c. .819 d. .66

7. Yes, because z = 1.83 ≥ 1.645.

9. a. x̄ − ȳ = 6.2
b. z = 1.14, two-tailed P-value = .25, so do not reject the null hypothesis that the population means are equal.
c. No, the values are positive and the standard deviation exceeds the mean.
d. 95% CI: (−10.0, 29.8)

11. a. A 95% CI for the true difference, fast food mean − not fast food mean, is (219.6, 538.4).
b. The one-tailed P-value is .014, so reject the null hypothesis of a 200-calorie difference at the .05 level, and conclude that yes, there is strong evidence.

13. 22. No.

15. b. It increases.

17. Because z = 1.36, there is no reason to reject the hypothesis of equal population means (P = .17).

19. Because z = .59, there is no reason to conclude that the population mean is higher for the no-involvement group (P = .28).

21. Because t = 3.35 ≥ 3.30 = t.001,42, yes, there is evidence that experts do hit harder.

23. b. No c. Because |t| = .38 < 2.228 = t.025,10, no, there is no evidence of a difference.

25. Because the one-tailed P-value is .005 ≤ .01, conclude at the .01 level that the difference is as stated.

27. Yes, because t = 2.08 with P-value = .046.

29. b. (127.6, 202.0) c. 131.8
31. Because t = 1.82 with P-value .046 ≤ .05, conclude at the .05 level that the difference exceeds 1.

33. a. (x̄ − ȳ) ± tα/2,m+n−2 · sp·√(1/m + 1/n)
b. (.24, 3.64)
c. (.34, 3.74), which is wider because of the loss of a degree of freedom

35. a. The slender distribution appears to have a lower mean and lower variance.
b. With t = 1.88 and a P-value of .097, there is no significant difference at the .05 level.

37. With t = 2.19 and a two-tailed P-value of .031, there is a significant difference at the .05 level but not the .01 level.

39. With t = 3.89 and one-tailed P-value = .006, conclude at the 1% level that true average movement is less for the TightRope treatment. Normality is important, but the normal probability plot does not indicate a problem.

41. a. The 95% confidence interval for the difference of means is (.000046, .000446), which has only positive values. This omits 0 as a possibility, and says that the conventional mean is higher.
b. With t = 2.68 and P-value = .010, reject at the .05 level the hypothesis of equal means in favor of the conventional mean being higher.

43. With t = 1.87 and a P-value of .049, the difference is (barely) significantly greater than 5 at the .05 level.

45. a. No b. 49.1 c. 49.1

47.
 pair  1   2   3   4
 x     10  20  30  40
 y     11  21  31  41

49. a. Because |z| = 4.84 ≥ 1.96, conclude that there is a difference. Rural residents are more favorable to the increase.
b. .9967

51. (.016, .171)

53. Because z = 4.27 with P-value .000010, conclude that the radiation is beneficial.

55. a. H0: p3 = p2, Ha: p3 > p2
b. (X3 − X2)/n
c. (X3 − X2)/√(X2 + X3)
d. With z = 2.67, P = .004, reject H0 at the .01 level.

57. 769

59. Because z = 3.14 with P = .002, reject H0 at the .01 level. Conclude that lefties are more accident-prone.

61. a. .0175 b. .1642 c. .0200 d. .0448 e. .0035

63. No, because f = 1.814 < 6.72 = F.01,9,7.

65. Because f = 1.2219 with P = .505, there is no reason to question the equality of population variances.

67. 8.10

Here is a macro that can be executed 999 times in MINITAB:

# start with X in C1, Y in C2
let k3 = N(c1)
let k4 = N(c2)
sample k3 c1 c3;
replace.
sample k4 c2 c4;
replace.
let k1 = mean(c3)-mean(c4)
stack k1 c5 c5
end

71. a. Here is a macro that can be executed 999 times in MINITAB:

# start with X in C1, Y in C2
let k3 = N(c1)
let k4 = N(c2)
sample k3 c1 c3;
replace.
sample k4 c2 c4;
replace.
let k2 = medi(c3)-medi(c4)
stack k2 c6 c6
end

73. a. (.593, 1.246)
b. Here is a macro that can be executed 999 times in MINITAB:

# start with X in C1, Y in C2
let k3 = N(c1)
let k4 = N(c2)
sample k3 c1 c3;
replace.
sample k4 c2 c4;
replace.
let k5 = stdev(c3)/stdev(c4)
stack k5 c12 c12
end

75. a. Because t = 2.62 with a P-value of .018, conclude that the population means differ. At the 5% level, blueberries are significantly better.
b. Here is a macro that can be executed repeatedly in MINITAB:

# start with data in C1, group var in C2
let k3 = N(c1)
Sample k3 c1 c3.
unstack c3 c4 c5;
subs c2.
let k9 = mean(c4)-mean(c5)
stack k9 c6 c6
end

77. a. Because f = 4.46 with a two-tailed P-value of .122, there is no evidence of unequal population variances.
b. Here is a macro that can be executed repeatedly in MINITAB:

let k1 = n(C1)
Sample K1 c1 c3.
unstack c3 c4 c5;
subs c2.
let k6 = stdev(c4)/stdev(c5)
stack k6 c6 c6
end
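The MINITAB macros above resample each sample with replacement and stack one bootstrap statistic per run. An equivalent sketch in Python (hypothetical data, seeded for reproducibility):

```python
import random

def boot_mean_diffs(x, y, reps=999, seed=0):
    """Bootstrap the difference of means: resample x and y with
    replacement and record mean(x*) - mean(y*) for each replicate."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        xs = [rng.choice(x) for _ in x]
        ys = [rng.choice(y) for _ in y]
        diffs.append(sum(xs) / len(xs) - sum(ys) / len(ys))
    return diffs

# Hypothetical samples:
d = boot_mean_diffs([5, 7, 8, 9, 12], [4, 6, 6, 7, 9])
```

Sorting the 999 replicates and reading off the 25th and 975th values gives a percentile bootstrap interval, the same idea the macros implement by stacking into a worksheet column.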
85. The difference is significant at the .05, .01, and .001 levels.

89. b. No, given that the 95% CI includes 0, the test at the .05 level does not reject equality of means.

91. (299.2, 1517.8)

93. (1020.2, 1339.9). Because 0 is not in the CI, we would reject equality of means at the .01 level.

95. Because t = 2.61 and the one-tailed P-value is .007, the difference is significant at the .05 level using either a one-tailed or a two-tailed test.

97. a. Because t = 3.04 and the two-tailed P-value is .008, the difference is significant at the .05 level.
b. No, the mean of the concentration distribution depends on both the mean and standard deviation of the log concentration distribution.

99. Because t = 7.50 and the one-tailed P-value is .0000001, the difference is highly significant, assuming normality.

101. The two-sample t is inappropriate for paired data. The paired t gives a mean difference .3, t = 2.67, and the two-tailed P-value is .045, so the means are significantly different at the .05 level. We are concluding tentatively that the label understates the alcohol percentage.

103. Because paired t = 3.88 and the two-tailed P-value is .008, the difference is significant at the .05 and .01 levels, but not at the .001 level.

105. Because z = 2.63 and the two-tailed P-value is .009, there is a significant difference at the .01 level, suggesting better survival at the higher temperature.

107. .902, .826, .029, .00000003

109. Because z = 4.25 and the one-tailed P-value is .00001, the difference is highly significant and companies appear to discriminate.

111. With Z = (X̄ − Ȳ)/√(X̄/n + Ȳ/m), the result is z = 5.33, two-tailed P-value = .0000001, so one should conclude that there is a significant difference in parameters.

113. (i) not bioequivalent (ii) not bioequivalent (iii) bioequivalent

Chapter 11

1. a. Reject H0: μ1 = μ2 = μ3 = μ4 = μ5 in favor of Ha: μ1, μ2, μ3, μ4, μ5 not all the same, because f = 5.57 ≥ 2.69 = F.05,4,30.
b. Using Table A.9, .001 < P-value < .01. (The P-value is .0018.)

3. Because f = 6.43 ≥ 2.95 = F.05,3,28, there are significant differences among the means.

5. Because f = 10.85 ≥ 4.38 = F.01,3,36, there are significant differences among the means.

 Source     DF  SS      MS     F      P
 Formation   3   509.1  169.7  10.85  0.000
 Error      36   563.1   15.6
 Total      39  1072.3

7. a. The Levene test gives f = 1.47, P-value .236, so there is no reason to doubt equal variances.
b. Because f = 10.48 ≥ 4.02 = F.01,4,30, there are significant differences among the means.

 Source        DF  SS     MS     F      P
 Plate length   4  43993  10998  10.48  0.000
 Error         30  31475   1049
 Total         34  75468

11. w = 36.09
 3  1  4  2  5
Splitting the paints into two groups, {3, 1, 4}, {2, 5}, there are no significant differences within groups but the paints in the first group differ significantly (they are lower) from those in the second group.

13.
 3      1      4      2      5
 427.5  462.0  469.3  502.8  532.1

15. w = 5.92; At the 1% level the only significant differences are between formation 4 and the first two formations.
 2      1      3      4
 24.69  26.08  29.95  33.84

17. (.029, .379)

19. 426

21. a. Because f = 22.60 ≥ 3.26 = F.01,5,78, there are significant differences among the means.
b. (−99.1, −35.7), (29.4, 99.1)

23. The nonsignificant differences are indicated by the underscores.
 10    6      3      1
 45.5  50.85  55.40  58.28

25. a. Assume normality and equal variances.
b. Because f = 1.71 < 2.20 = F.10,3,48, P-value = .18, there are no significant differences among the means.

27. a. Because f = 3.75, P-value = .028, there are significant differences among the means.
b. Because the normal plot looks fairly straight and the P-value for the Levene test is .68, there is no reason to doubt the assumptions of normality and constant variance.
c. The only significant pairwise difference is between brands 1 and 4:
 4     3     2     1
 5.82  6.35  7.50  8.27

31. .63

33. arcsin(√(x/n))

35. a. Because f = 1.55 < 3.26 = F.05,4,12, there are no significant differences among the means.
b. Because f = 2.98 < 3.49 = F.05,3,12, there are no significant differences among the means.

37. With f = 5.49 ≥ 4.56 = F.01,5,15, there are significant differences among the stimulus means. Although not all differences are significant in the multiple comparisons analysis, the means for combined stimuli were higher.
Chapter 11 829
Differences among the subject means are not very 51. a. With f ¼ 1.55 < 2.81 ¼ F.10,2,12, there is no
important here. The normal plot of residuals shows no significant interaction at the .10 level.
reason to doubt normality. However, the plot of residuals b. With f ¼ 376.27 18.64 ¼ F.001,2,12, there is a
against the fitted values shows some dependence of the significant difference between the formulation
variance on the mean. If logged response is used in place means at the .001 level.
of response, the plots look good and the F test result is With f ¼ 19.27 12.97 ¼ F.001,1,12, there is a
similar but stronger. Furthermore, the logged response significant difference among the speed means at the
gives more significant differences in the multiple .001 level.
comparisons analysis.
Means:

L1 24.825  L2 27.875  T 29.1  L1 + L2 40.35  L1 + T 41.22  L2 + T 45.05

b. The very significant f for blocks, which shows that blocks differ strongly, implies that blocking was successful.

43. With f = 8.69 ≥ 6.01 = F.01,2,18, there are significant differences among the three treatment means. The normal plot of residuals shows no reason to doubt normality, and the plot of residuals against the fitted values shows no reason to doubt constant variance. There is no significant difference between treatments B and C, but Treatment A differs (it is lower) significantly from the others at the .01 level. Means:

A 29.49  B 31.31  C 31.40

45. Because f = 8.87 ≥ 7.01 = F.01,4,8, reject the hypothesis that the variance for B is 0.

49. a.

Source       df  SS        MS       F
A            2   30763.0   15381.5  3.79
B            3   34185.6   11395.2  2.81
Interaction  6   43581.2   7263.5   1.79
Error        24  97436.8   4059.9
Total        35  205966.6

b. Because 1.79 < 2.04 = F.10,6,24, there is no significant interaction.
c. Because 3.79 ≥ 3.40 = F.05,2,24, there is a significant difference among the A means at the .05 level.
d. Because 2.81 < 3.01 = F.05,3,24, there is no significant difference among the B means at the .05 level.
e. Using w = 64.93,

3 3960.2   1 4010.88   2 4029.10

c. Main effects: Formulation (1) 11.19, (2) –11.19; Speed (60) 1.99, (70) –5.03, (80) 3.04

53. Here is the ANOVA table:

Source  DF  SS  MS  F  P

57. a. F = MSAB/MSE
b. A: F = MSA/MSAB  B: F = MSB/MSAB

59. a. Because f = 3.43 ≥ 2.61 = F.05,4,40, there is a significant difference among the exam means at the .05 level.
b. Because f = 1.65 < 2.61 = F.05,4,40, there is no significant difference among the retention means at the .05 level.

61. a.

Source  DF  SS     MS    F
Diet    4   .929   .232  2.15
Error   25  2.690  .108
Total   29  3.619

Because f = 2.15 < 2.76 = F.05,4,25, there is no significant difference among the diet means at the .05 level.
b. (–.59, .92) Yes, the interval includes 0.
c. .53

63. a. Test H0: μ1 = μ2 = μ3 versus Ha: the three means are not all the same. With f = 4.80 and F.05,2,16 = 3.63 < 4.80 < 6.23 = F.01,2,16, it follows that .01 < P-value < .05 (more precisely, P = .023). Reject H0 in favor of Ha at the 5% level but not at the 1% level.
b. Only the first and third means differ significantly at the 5% level.

1 25.59   2 26.92   3 28.17

65. Because f = 1123 ≥ 4.07 = F.05,3,8, there are significant differences among the means at the .05 level. For Tukey multiple comparisons, w = 7.12:
(continued)
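The F comparisons in these answers (for example, f = 8.69 ≥ 6.01 = F.01,2,18 in Exercise 43) rely on tabled upper-tail critical values. As a minimal sketch, the same values can be checked with scipy (an assumption here — the text itself uses printed F tables):

```python
# Hypothetical check of F critical values quoted in the ANOVA answers above.
# F_{alpha, df1, df2} is the upper-alpha critical value: f.ppf(1 - alpha, df1, df2).
from scipy.stats import f

print(round(f.ppf(0.99, 2, 18), 2))  # F.01,2,18 -> 6.01 (Exercise 43)
print(round(f.ppf(0.95, 2, 24), 2))  # F.05,2,24 -> 3.40 (Exercise 49c)
print(round(f.ppf(0.90, 6, 24), 2))  # F.10,6,24 -> 2.04 (Exercise 49b)
```

A computed f exceeding the corresponding critical value leads to rejection of H0, exactly as in the comparisons above.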
Chapter 12
49. (431.2, 628.6)

51. a. Yes, for the test of H0: β1 = 0 vs. Ha: β1 ≠ 0, we find t = 10.62, with P-value ≈ .000014. At the .001 level conclude that there is a useful linear relationship.
b. (8.24, 12.96) With 95% confidence, when the flow rate is increased by 1 SCCM, the associated expected change in etch rate is in the interval.
c. (36.10, 40.41) This is fairly precise.
d. (31.86, 44.65) This is much less precise than the interval in (c).
e. Because 2.5 is closer to the mean, the intervals will be narrower.
f. Because 6 is outside the range of the data, it is unknown whether the regression will apply there.
g. Use a 99% CI at each value: (23.88, 31.43), (29.93, 35.98), (35.07, 41.45)

53. a. Yes
b. Yes, for the test of H0: β1 = 0 vs. Ha: β1 ≠ 0, we find t = 4.39, with P-value < .001. At the .001 level conclude that there is a useful linear relationship.
c. (403.6, 468.2)

57. a. r = .923, so x and y are strongly correlated.
b. unaffected
c. unaffected
d. The normal plots seem consistent with normality, but the scatter plot shows a slight curvature.
e. For the test of H0: ρ = 0 vs. Ha: ρ ≠ 0, we find t = 7.59, with P-value ≈ .00002. At the .001 level conclude that there is a useful linear relationship.

71. a. The simple linear regression model may not be a perfect fit because the plot shows some curvature.
b. The plot of standardized residuals is very similar to the residual plot. The normal probability plot gives no reason to doubt normality.

73. a. For the test of H0: β1 = 0 vs. Ha: β1 ≠ 0, we find t = 10.97, with P-value ≈ .0004. At the .001 level conclude that there is a useful linear relationship.
b. The residual plot shows curvature, so the linear relationship of part (a) is questionable.
c. There are no extreme standardized residuals, and the plot of standardized residuals is similar to the plot of ordinary residuals.

75. The first data set seems appropriate for a straight-line model. The second data set shows a quadratic relationship, so the straight-line relationship is inappropriate. The third data set is linear except for an outlier, and removal of the outlier will allow a line to be fit. The fourth data set has only two values of x, so there is no way to tell if the relationship is linear.

77. a. To test for lack of fit, we find f = 3.30, with 3 numerator df and 10 denominator df, so the P-value is .079. At the .05 level we cannot conclude that the relationship is poor.
b. The scatter plot shows that the relationship is not linear, in spite of (a). In this case, the plot is more sensitive than the test.

79. a. 77.3
b. 40.4
c. The coefficient b3 is the difference in sales caused by the window, all other things being equal.
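Several answers above (51, 53, 73) report a model utility test of H0: β1 = 0 vs. Ha: β1 ≠ 0 via a t statistic and its two-sided P-value. The data sets behind those answers are not reproduced here, so the following is only a sketch on made-up numbers, using scipy's `linregress` (an assumption — any least-squares routine reporting the slope's P-value would do):

```python
# Model utility test H0: beta1 = 0 vs. Ha: beta1 != 0 on hypothetical data.
from scipy.stats import linregress

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.0, 16.2]  # roughly y = 2x plus small wiggles

res = linregress(x, y)
# res.pvalue is the two-sided P-value for the slope; a tiny value indicates
# a useful linear relationship, the conclusion reached in the answers above.
print(res.slope, res.pvalue)
```

With a strong linear trend like this, the P-value is far below .001, matching the pattern of conclusions in 51 and 73(a).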
105. a. .507% b. .7122
c. To test H0: β1 = 0 vs. Ha: β1 ≠ 0, we have t = 3.93, with P-value .0013. At the .01 level conclude that there is a useful linear relationship.
d. (1.056, 1.275)
e. ŷ = 1.014, y – ŷ = .214

107. –36.18, (–64.43, –7.94)

109. No, if the relationship of y to x is linear, then the relationship of y² to x is quadratic.

111. a. Yes
b. ŷ = 98.293, y – ŷ = .117
c. s = .155
d. .794
e. 95% CI for β1: (.0613, .0901)
f. The new observation is an outlier, and has a major impact: the equation of the line changes from y = 97.50 + .0757x to y = 97.28 + .1603x; s changes from .155 to .291; r² changes from .794 to .616.

113. a. The paired t procedure gives t = 3.54 with a two-tailed P-value of .002, so at the .01 level we reject the hypothesis of equal means.
b. The regression line is y = 4.79 + .743x, and the test of H0: β1 = 0 vs. Ha: β1 ≠ 0 gives t = 7.41 with a P-value < .000001, so there is a significant relationship. However, prediction is not perfect, with r² = .753, so one variable accounts for only 75% of the variability in the other.

117. a. linear
b. After fitting a line to the data, the residuals show a lot of curvature.
c. Yes. The residuals from the logged model show some departure from linearity, but the fit is good in terms of R² = .988. We find â = 411.98, b̂ = .03333.
d. (58.15, 104.18)

119. a. The plot suggests a quadratic model.
b. With f = 25.08 and a P-value < .0001, there is a significant relationship at the .0001 level.
c. CI: (3282.3, 3581.3), PI: (2966.6, 3897.0). Of course, the PI is wider, as in simple linear regression, because it needs to include the variability of a new observation in addition to the variability of the mean.
d. CI: (3257.6, 3565.6), PI: (2945.0, 3878.2). These are slightly wider than the intervals in (c), which is appropriate, given that 25 is slightly closer to the mean and the vertex.
e. With t = 6.73 and a two-tailed P-value < .0001, the quadratic term is significant at the .0001 level, so this term is definitely needed.

121. a. With f = 2.4 < 5.86 = F.05,15,4, there is no significant relationship at the .05 level.
b. No, especially when k is large compared to n
c. .9565

Chapter 13

3. Do not reject H0 because χ² = 1.57 < 7.815 = χ².05,3

5. Because χ² = 6.61 with P-value .68, do not reject H0.

7. Because χ² = 4.03 with P-value > .10, do not reject H0.

9. a. [0, .223), [.223, .510), [.510, .916), [.916, 1.609), [1.609, ∞)
b. Because χ² = 1.25 with P-value > .10, do not reject H0.

11. a. (–∞, –.967), [–.967, –.431), [–.431, 0), [0, .431), [.431, .967), [.967, ∞)
b. (–∞, .49806), [.49806, .49914), [.49914, .50), [.50, .50086), [.50086, .50194), [.50194, ∞)
c. Because χ² = 5.53 with P-value > .10, do not reject H0.

13. Using p̂ = .0843, χ² = 280.3 with P-value < .001, so reject the independence model.

15. The likelihood is proportional to θ^233(1 – θ)^367, from which θ̂ = .3883. This gives estimated probabilities .1400, .3555, .3385, .1433, .0227 and expected counts 21.00, 53.32, 50.78, 21.49, 3.41. Because 3.41 < 5, combine the last two categories, giving χ² = 1.62 with P-value > .10. Do not reject the binomial model.

17. λ̂ = 3.167, which gives χ² = 103.9 with P-value < .001, so reject the assumption of a Poisson model.

19. θ̂1 = .4275, θ̂2 = .2750, which gives χ² = 29.3 with P-value < .001, so reject the model.

21. Yes, the test gives no reason to reject the null hypothesis of a normal distribution.

23. The P-values are both .243.

25. Let pi1 = the probability that a fruit given treatment i matures and pi2 = the probability that a fruit given treatment i aborts, so H0: pi1 = pi2 for i = 1, 2, 3, 4, 5. We find χ² = 24.82 with P-value < .001, so reject the null hypothesis and conclude that maturation is affected by leaf removal.

27. If pij denotes the probability of a type j response when treatment i is applied, then H0: p1j = p2j = p3j = p4j for j = 1, 2, 3, 4. With χ² = 27.66 ≥ 23.587 = χ².005,9, reject H0 at the .005 level. The treatment does affect the response.

29. With χ² = 64.65 ≥ 13.277 = χ².01,4, reject H0 at the .01 level. Political views are related to marijuana usage. In particular, liberals are more likely to be users.

31. Compute the expected counts by ê_ijk = n p̂_ijk = n p̂_i p̂_j p̂_k = n (n_i/n)(n_j/n)(n_k/n). For the χ² statistic, df = 20.

33. a. With χ² = .681 < 4.605 = χ².10,2, do not reject independence at the .10 level.
b. With χ² = 6.81 ≥ 4.605 = χ².10,2, reject independence at the .10 level.
c. 677
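Answers 25–33 are chi-squared tests on contingency tables, each compared against a tabled χ² critical value. The original data sets are not reproduced here, so the sketch below uses a hypothetical 2×3 table with scipy (an assumption — the text works from printed χ² tables):

```python
# Critical values quoted in the answers above, then a hypothetical contingency test.
from scipy.stats import chi2, chi2_contingency

print(round(chi2.ppf(0.95, 3), 3))  # chi-squared .05,3 -> 7.815 (Exercise 3)
print(round(chi2.ppf(0.90, 2), 3))  # chi-squared .10,2 -> 4.605 (Exercise 33)

table = [[20, 30, 50],
         [30, 30, 40]]  # made-up counts, purely illustrative
stat, p, df, expected = chi2_contingency(table)
print(df)  # (2-1)*(3-1) = 2 degrees of freedom, as in the df rule used above
```

Comparing `stat` to `chi2.ppf(1 - alpha, df)`, or `p` to alpha, reproduces the accept/reject logic of these answers.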
c. Because the logistic regression takes into account the order in the professorial ranks, it should be more sensitive, so it should give a lower P-value.
d. There are few female professors but many assistant professors, and the assistant professors will be the professors of the future.

37. With χ² = 13.005 ≥ 9.210 = χ².01,2, reject the null hypothesis of no effect at the .01 level. Oil does make a difference (more parasites).

39. a. H0: The population proportion of Late Game Leader Wins is the same for all four sports; Ha: The proportion of Late Game Leader Wins is not the same for all four sports. With χ² = 10.518 ≥ 7.815 = χ².05,3, reject the null hypothesis at level .05. Sports differ in terms of coming from behind late in the game.
b. Yes (baseball)

41. With χ² = 197.6 ≥ 16.812 = χ².01,6, reject the null hypothesis at the .01 level. The aged are more likely to die in a chronic-care facility.

43. With χ² = .763 < 7.779 = χ².10,4, do not reject the hypothesis of independence at the .10 level. There is no evidence that age influences the need for item pricing.

Chapter 14

5. We form the difference and perform a two-tailed test of H0: μ = 0 at level .05. This gives s+ = 72, and because it does not satisfy 14 < s+ < 64, we reject H0 at the .05 level.

7. Because s+ = 162.5 with P-value .044, reject H0: μ = 75 in favor of Ha: μ > 75 at the .05 level.

9. With w = 38, reject H0 at the .05 level because the rejection region is {w ≥ 36}.

11. Test H0: μ1 – μ2 = 1 vs. Ha: μ1 – μ2 > 1. After subtracting 1 from the original process measurements, we get w = 65. Do not reject H0 because w < 84.

13. b. Test H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 < 0. With a P-value of .002 we reject H0 at the .01 level.

15. With w = 135, z = 2.223, and the approximate P-value is .026, so we would not reject the null hypothesis at the .01 level.

17. (11.15, 23.80)

19. (–.585, .025)

21. (16, 87)
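Answers 5 and 7 use the signed-rank sum s+ computed from data sets not shown here. As a sketch on hypothetical paired differences, scipy's `wilcoxon` (an assumption, not the book's method of hand-ranking) carries out the same test:

```python
# Wilcoxon signed-rank test on hypothetical paired differences.
from scipy.stats import wilcoxon

d = [1.2, 0.8, 2.5, 1.7, 3.1, 0.4, 2.2, 1.9, 2.8, 0.6]  # made-up, all positive
res = wilcoxon(d)
# res.statistic is the smaller of the two signed-rank sums; every difference
# here is positive, so the negative-rank sum is 0 and H0: mu = 0 is rejected
# at any conventional level.
print(res.statistic, res.pvalue)
```

In the book's notation, s+ for this sample equals n(n + 1)/2 = 55 with s– = 0, the most extreme configuration possible for n = 10.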
Index
  Wilcoxon rank-sum, 774–776
  Wilcoxon signed-rank, 772–774
Confidence level
  definition of, 382, 385–388
  simultaneous, 565–570, 578, 589, 591, 658
  in Tukey's procedure, 565–570, 578, 589, 591
Confidence set, 772
Consistency, 304, 357, 375–377
Consistent estimator, 304, 357, 375–377
Contingency tables, two-way, 744–751
Continuity correction, 189–190
Continuous random variable(s)
  conditional pdf for, 254, 789
  cumulative distribution function of, 163–168
  definition of, 99, 159
  vs. discrete random variable, 162
  expected value of, 171–172
  joint pdf of (see Joint probability density functions)
  marginal pdf of, 236–238
  mean of, 171, 172
  moment generating of, 175–177
  pdf of (see Probability density function)
  percentiles of, 166–168
  standard deviation of, 173–175
  transformation of, 220–225, 265–270
  variance of, 173–175
Contrast of means, 570–571
Convenience samples, 7
Convergence
  in distribution, 153, 329
  in mean square, 303
  in probability, 304
Convex function, 231
Correction factor, 141, 560, 568, 577, 582
Correction for the mean, 560
Correlation coefficient
  autocorrelation coefficient and, 674
  in bivariate normal distribution, 258–260, 310, 667
  confidence interval for, 671
  covariance and, 249
  Cramér–Rao inequality and, 374–375
  definition of, 249, 663
  estimator for, 666
  Fisher transformation, 669
  for independent random variables, 250
  in linear regression, 664, 667, 669
  measurement error and, 328
  paired data and, 515–516
  sample (see Sample correlation coefficient)
Covariance
  correlation coefficient and, 249
  Cramér–Rao inequality and, 374–375
  definition of, 247
  of independent random variables, 250–251
  of linear functions, 249
  matrix format for, 711
Covariate, 699
Cramér–Rao inequality, 374–375
Credibility interval, 777–782
Critical values
  chi-squared, 317
  F, 324
  standard normal (z), 184
  studentized range, 565
  t, 322, 409
  tolerance, 406
Cumulative distribution function
  for a continuous random variable, 163–168
  for a discrete random variable, 104–108
  inverse function of, 223–224
  joint, 282
  of order statistics, 272–273
  pdf and, 163
  percentiles and, 167
  pmf and, 105–108
  transformation and, 220–225
Cumulative frequency, 24
Cumulative relative frequency, 24

D
Data
  bivariate, 3, 617, 632, 691
  categorical (see Categorical data)
  censoring of, 32, 343–344
  characteristics of, 3
  collection of, 7–8
  definition of, 2
  multivariate, 3, 220
  qualitative, 19
  univariate, 3
Deductive reasoning, 6
Degrees of freedom (df)
  in ANOVA, 557–559, 587, 599
  for chi-squared distribution, 200, 315–320
  in chi-squared tests, 726, 734, 737, 746
  for F distribution, 323
  in regression, 631, 685
  sample variance and, 35
  for Studentized range distribution, 565
  for t distribution, 320, 390, 500, 504
  type II error and, 574
Delta method, 174
De Morgan's laws, 56
Density
  conditional, 253–257
  curve, 160
  function (pdf), 160
  joint, 235
  marginal, 236
  scale, 17
Dependence, 84–88, 238–242, 250, 257, 747
Dependent events, 84–88
Dependent variable, 614
Descriptive statistics, 1–41
Deviation
  definition of, 33
  minimize absolute deviations principle, 33, 679
Dichotomous trials, 128
Difference statistic, 347
Discrete random variable(s)
  conditional pmf for, 253
  cumulative distribution function of, 104–108
  definition of, 99
  expected value of, 112
  joint pmf of (see Joint probability mass function)
  marginal pmf of, 234
  mean of, 112
  moment generating of, 122
  pmf of (see Probability mass function)
  standard deviation of, 117
  transformation of, 225
  variance of, 117
Disjoint events, 54
Dotplots, 12
Dummy variable, 696
Dunnett's method, 571

E
Efficiency, asymptotic relative, 764, 769
Empirical rule, 187
Erlang distribution, 202, 229
Error(s)
  estimated standard, 344, 646, 713
  estimation, 334
  family vs. individual, 570
  measurement, 179, 211, 337, 477
  prediction, 405, 658, 683
  rounding, 36
  standard, 344, 713
Maximum likelihood estimator (cont.)
  Cramér–Rao inequality and, 375
  data sufficiency for, 369
  Fisher information and, 371, 375
  for geometric distribution parameter, 742
  in goodness-of-fit testing, 733
  in homogeneity test, 745
  in independence test, 748
  in likelihood ratio tests, 475
  in linear regression, 631, 639
  in logistic regression, 650
  sample size and, 357
  score function and, 377
McNemar's test, 526, 550
Mean
  of Cauchy distribution, 322, 342, 761
  conditional, 255–257
  correction for the, 560
  deviations from the, 33, 206, 563, 631, 739
  of a function, 115, 245–246
  vs. median, 28
  moments about, 121
  outliers and, 27, 28
  population, 26
  regression to the, 260, 636
  sample, 25
  of sample total, 296
  See also Average
Mean square
  expected, 573, 589, 593, 594, 600, 604
  lack of fit, 681
  pure error, 681
Mean square error
  definition of, 335
  of an estimator, 335
  MVUE and, 341
  sample size and, 337
Measurement error, 337
Median
  in boxplot, 37–38
  of a distribution, 27, 28
  as estimator, 378, 478
  vs. mean, 28
  outliers and, 26, 28, 29
  population, 28
  sample, 27, 271
  statistic, 378
Mendel's law of inheritance, 726–728
M-estimator, 359, 381
Midfourth, 46
Midrange, 333
Mild outlier, 39, 393
Minimal sufficient statistic, 366–367, 369
Minimize absolute deviations principle, 477, 679
Minimum variance unbiased estimator, 341–343, 358, 369, 375
Mixed effects model, 593–603
Mixed exponential distribution, 229
mle. See Maximum likelihood estimate
Mode
  of a continuous distribution, 228, 229
  of a data set, 46
  of a discrete distribution, 156
Model utility test, 647–649
Moment generating function
  of a Bernoulli rv, 122, 127
  of a binomial rv, 135
  of a chi-squared rv, 315
  CLT and, 329–330
  of a continuous rv, 175–177
  definition of, 122, 175
  of a discrete rv, 122–127
  of an exponential rv, 221
  of a gamma rv, 195
  of a linear combination, 311
  and moments, 124, 176
  of a negative binomial rv, 143
  of a normal rv, 191
  of a Poisson rv, 149
  of a sample mean, 329–330
  uniqueness property of, 123, 176
Moments
  definition of, 121
  method of, 350–352, 358, 740
  and moment generating function, 124, 176
Monotonic, 221, 353
Multimodal histogram, 19
Multinomial distribution, 240, 725
Multinomial experiment, 240, 724
Multiple regression
  additive model, 682, 705
  categorical variables in, 696–699
  coefficient of multiple determination, 686, 709
  confidence intervals in, 712
  covariance matrices in, 711–713
  degrees of freedom in, 685, 696, 708
  diagnostic plots, 691
  fitted values in, 685
  F ratio in, 687, 709
  interaction in models for, 693–698
  leverages in, 714–715
  logistic regression model, 699
  in matrix/vector format, 705–715
  model utility test in, 687, 708–709
  normal equations in, 683, 685, 705–708
  parameters for, 682
  and polynomial regression, 691–693
  prediction interval in, 689
  principle of least squares in, 683–706
  residuals in, 685, 688, 691, 708, 713
  squared multiple correlation in, 686, 709
  sum of squares in, 686, 708–710
  t ratios in, 690, 712
Multiplication rule, 77–88
Multiplicative exponential regression model, 721
Multiplicative power regression model, 721
Multivariate data, 3, 20
Multivariate hypergeometric distribution, 244
Mutually exclusive events, 54, 79
MVUE. See Minimum variance unbiased estimator

N
Negative binomial distribution, 141–144
  definition of, 141
  estimation of parameters, 352, 738
Negative binomial random variable, 141
Newton's binomial theorem, 143
Neyman factorization theorem, 363
Neyman–Pearson theorem, 470–475
Noncentrality parameter, 423, 574, 582
Noncentral t distribution, 423
Nonhomogeneous Poisson process, 156
Nonstandard normal distribution, 185–188
Normal distribution
  asymptotic, 298, 371, 375, 377
  binomial distribution and, 189–190, 302
  bivariate, 258–260, 310, 318, 477, 667–671
  confidence interval for mean of, 383–388, 392, 398, 403
  continuity correction and, 189–190
  density curves for, 180
  and discrete random variables, 188–190
  goodness-of-fit test for, 730, 740
  nonstandard, 185–188
  pdf for, 179
  percentiles for, 182–188, 210
  probability plot, 210, 740
  Ryan–Joiner test for, 747
  standard, 181
  t distribution and, 320–322, 325, 402
  z table, 181–183
Normal equations, 626, 683, 705
Normal probability plot, 210, 740
Normal random variable, 181
Null distribution, 443–444, 760, 780
Null hypothesis, 426
Null set, 54, 57
Null value, 427, 436

O
Observational study, 488
Odds ratio, 621–622, 750–751
One-sided confidence interval, 398–399
Operating characteristic curve, 137
Ordered categories, 749–751
Ordered pairs, 66–67
Order statistics, 271–278, 338, 365–367, 478
  sufficiency and, 365–367
Outliers
  in a boxplot, 37–41
  definition of, 11
  extreme, 39–41
  leverage and, 714
  mean and, 29, 415–417
  median and, 29, 37, 415, 417
  mild, 39
  in regression analysis, 679, 688

P
Paired data
  in before/after experiments, 511, 526
  bootstrap procedure for, 538–540
  confidence interval for, 513–515
  definition of, 509
  vs. independent samples, 515
  in McNemar's test, 550
  permutation test for, 540–541
  t test for, 511–513
  in Wilcoxon signed-rank test, 762–763
Pairwise average, 772, 773, 775
Pairwise independence, 94
Parallel connection, 55, 88, 89, 90, 272, 273
Parameter(s)
  Bayesian approach to, 776–782
  concentration, 779
  confidence interval for, 389, 394
  estimator for a, 332–346
  Fisher information on, 371–377
  goodness-of-fit tests for, 728–729, 732–736
  hypothesis testing for, 427, 450
  location, 217, 367
  maximum likelihood estimate of, 354–359, 369
  moment estimators for, 350–352
  MVUE of, 341–343, 358, 369, 375
  noncentrality, 574
  null value of, 427
  of a probability distribution, 103–104
  in regression, 617–618, 622, 624–636, 658, 666, 682
  scale, 195, 203, 217–218, 365
  shape, 217–218, 365
  sufficient estimation of, 361–369
Pareto diagram, 24
Pareto distribution, 170, 178, 226
pdf. See Probability density function
Percentiles
  for continuous random variables, 166–168
  in hypothesis testing, 458, 740
  in probability plots, 211–216, 740
  sample, 29, 210–211, 216
  of standard normal distribution, 182–184, 211–216
Permutation, 68, 69
Permutation test, 535–541
PERT analysis, 207
Plot
  probability, 210–218, 369, 499, 668, 676, 688, 691, 740
  scatter, 615–617, 632–633, 663, 667
pmf. See Probability mass function
Point estimate/estimator
  biased, 337–342
  bias of, 335–340
  bootstrap techniques for, 345–346, 411–418
  bound on the error of estimation of, 388
  censoring and, 343–344
  consistency, 304, 357, 375–377
  for correlation coefficient, 665–666
  and Cramér–Rao inequality, 373–377
  definition of, 26, 287, 332
  efficiency of, 375
  Fisher information on, 371–377
  least squares, 626–631
  maximum likelihood (mle), 352–359
  mean squared error of, 335
  moments method, 350–352, 358
  MVUE of, 340–342, 358, 369, 375
  notation for, 332, 334
  of a standard deviation, 286, 340
  standard error of, 344–346
  of a variance, 334, 339
Point prediction, 405, 628, 684
Poisson distribution
  Erlang distribution and, 202
  expected value, 149, 152
  exponential distribution and, 199
  gamma distribution and, 783
  goodness-of-fit tests for, 736–738
  in hypothesis testing, 470–472, 474, 482, 550
  mode of, 156
  moment generating function for, 149
  nonhomogeneous, 156
  parameter of, 149
  and Poisson process, 149–151, 199
  variance, 149, 152
Poisson process, 149–151, 194
Polynomial regression model, 691–693
Pooled t procedures
  and ANOVA, 477, 504–505, 576
  vs. Wilcoxon rank-sum procedures, 769
Posterior probability, 79–81, 777, 781
Power curves, 574–575
Power function of a test, 473–475, 574–575
Power model for regression, 721
Power of a test
  Neyman–Pearson theorem and, 473–475
  type II error and, 446–447, 472–476, 505, 593, 749
Precision, 315, 344, 371, 382, 387–388, 397, 405, 417, 514, 516, 592, 781
Prediction interval
  Bonferroni, 659
  vs. confidence interval, 406, 658–659, 690
  in linear regression, 654, 658–659
  in multiple regression, 690
  for normal distribution, 404–406
Prediction level, 405, 659, 689
Predictor variable, 614, 682, 693–696
Principle of least squares, 625–636, 674, 679, 683
Prior probability, 79, 758