Sie sind auf Seite 1von 59

Appendix Tables

787
788 Appendix Tables

Table A.1 Cumulative Binomial Probabilities


Appendix Tables 789

Table A.1 Cumulative Binomial Probabilities (cont.)


790 Appendix Tables

Table A.1 Cumulative Binomial Probabilities (cont.)

Table A.2 Cumulative Poisson Probabilities


Appendix Tables 791

Table A.2 Cumulative Poisson Probabilities (cont.)


792 Appendix Tables

Table A.3 Standard Normal Curve Areas


Appendix Tables 793

Table A.3 Standard Normal Curve Areas (cont.)


794 Appendix Tables

Table A.4 The Incomplete Gamma Function


Appendix Tables 795

Table A.5 Critical Values for t Distributions


796 Appendix Tables

Table A.6 Critical Values for Chi-Squared Distributions


Appendix Tables 797

Table A.7 t Curve Tail Areas


798 Appendix Tables

Table A.7 t Curve Tail Areas (cont.)


Appendix Tables 799

Table A.8 Critical Values for F Distributions


800 Appendix Tables

Table A.8 Critical Values for F Distributions (cont.)


Appendix Tables 801

Table A.8 Critical Values for F Distributions (cont.)


802 Appendix Tables

Table A.8 Critical Values for F Distributions (cont.)


Appendix Tables 803

Table A.8 Critical Values for F Distributions (cont.)


804 Appendix Tables

Table A.8 Critical Values for F Distributions (cont.)


Appendix Tables 805

Table A.9 Critical Values for Studentized Range Distributions


806 Appendix Tables

Table A.10 Chi-Squared Curve Tail Areas


Appendix Tables 807

Table A.10 Chi-Squared Curve Tail Areas (cont.)


808 Appendix Tables

Table A.11 Critical Values for the Ryan–Joiner Test of Normality


Appendix Tables 809

Table A.12 Critical Values for the Wilcoxon Signed-Rank Test


810 Appendix Tables

Table A.13 Critical Values for the Wilcoxon Rank-Sum Test


Appendix Tables 811

Table A.14 Critical Values for the Wilcoxon Signed-Rank Interval


812 Appendix Tables

Table A.15 Critical Values for the Wilcoxon Rank-Sum Interval


Appendix Tables

Table A.16 b Curves for t Tests


813
Answers to
Odd-Numbered Exercises
Chapter 1
This display brings out the gap in the data:
1. a. Houston Chronicle, Des Moines Register, Chicago There are no scores in the high 70’s.
Tribune, Washington Post
b. Capital One, Campbell Soup, Merrill Lynch, 13. a. 2 23 Stem units: 1.0
Prudential 3 2344567789 Leaf units: .10
c. Bill Jasper, Kay Reinke, Helen Ford, David Menendez 4 01356889
d. 1.78, 2.44, 3.50, 3.04 5 00001114455666789
6 0000122223344456667789999
3. a. In a sample of 100 DVD players, what are the chances
that more than 20 need service while under warranty? 7 00012233455555668
What are the chances that none need service while still 8 02233448
under warranty? 9 012233335666788
b. What proportion of all DVD players of this brand and 10 2344455688
model will need service within the warranty period? 11 2335999
12 37
5. a. No, the relevant conceptual population is all scores of
all students who participate in the SI in conjunction 13 8
with this particular statistics course. 14 36
b. The advantage of randomly allocating students to the 15 0035
two groups is that the two groups should then be fairly 16
comparable before the study. If the two groups perform 17
differently in the class, we might attribute this to the 18 9
treatments (SI and control). If it were left to students to
choose, stronger or more dedicated students might b. A representative value could be the median, 7.0.
gravitate toward SI, confounding the results. c. The data appear to be highly concentrated, except for
c. If all students were put in the treatment group there a few values on the positive side.
would be no results with which to compare the d. No, there is skewness to the right, or positive
treatments. skewness.
e. The value 18.9 appears to be an outlier, being more
7. One could generate a simple random sample of all single than two stem units from the previous value.
family homes in the city, or a stratified random sample
by taking a simple random sample from each of the ten 15. a.
district neighborhoods. From each of the homes in the Relative
sample the necessary data would be collected. This
Number frequency
would be an enumerative study because there exists a
finite, identifiable population of objects from which to nonconforming Frequency (Freq/60)
sample. 0 7 0.117
9. a. There could be several explanations for the variability 1 12 0.200
of the measurements. Among them could be measuring 2 13 0.217
error, (due to mechanical or technical changes across 3 14 0.233
measurements), recording error, differences in 4 6 0.100
weather conditions at time of measurements, etc. 5 3 0.050
b. This study involves a conceptual population. There is
6 3 0.050
no sampling frame.
7 1 0.017
11. 6 l 034 8 1 0.017
6h 667899 1.001
7l 00122244 Doesn’t add exactly to 1 because relative
7h Stem ¼ tens frequencies have been rounded
8l 001111122344 Leaf ¼ ones
8h 5557899
9l 03
9h 58

814
Chapter 1 815

b. .917, .867, 1 .867 ¼ .133


c. The center of the histogram is somewhere around b. Class Freq Rel freq Density
2 or 3 and it shows that there is some positive
skewness in the data. 0– < 50 8 0.08 .0016
50– < 100 13 0.13 .0026
17. a. .375
100– < 150 11 0.11 .0022
b. .218
c. .242 150– < 200 21 0.21 .0042
d. The histogram is very positively skewed. 200– < 300 26 0.26 .0026
300– < 400 12 0.12 .0012
19. a. The number of subdivisions having no cul-de-sacs is 400– < 500 4 0.04 .0004
17/47 ¼ .362, or 36.2%. The proportion having at 500– < 600 3 0.03 .0003
least one cul-de-sac is 30/47 ¼ .638, or 63.8%.
600– < 900 2 0.02 .00007
100 1.00
y: Count Percent

0 17 36.17 c. .79
1 22 46.81
2 6 12.77
23. Class Freq Class Freq
3 1 2.13
5 1 2.13 10– < 20 8 1.1– < 1.2 2
N ¼ 47 20– < 30 14 1.2– < 1.3 6
.362, .638 30– < 40 8 1.3– < 1.4 7
40– < 50 4 1.4– < 1.5 9
50– < 60 3 1.5– < 1.6 6
b. z: Count Percent 60– < 70 2 1.6– < 1.7 4
70– < 80 1 1.7– < 1.8 5
0 13 27.66 40 1.8– < 1.9 1
1 11 23.40 40
2 3 6.38
3 7 14.89
4 5 10.64
The original distribution is positively skewed.
5 3 6.38
The transformation creates a much more symmetric,
6 3 6.38
mound-shaped histogram.
8 2 4.26
N ¼ 47
25. a. Class interval Freq Rel. Freq.
.894, .830
0–< 50 9 0.18
21. a. 50–< 100 19 0.38
Class Freq Rel freq 100–< 150 11 0.22
150–< 200 4 0.08
0– < 100 21 0.21 200–< 250 2 0.04
100– < 200 32 0.32 250–< 300 2 0.04
200– < 300 26 0.26 300–< 350 1 0.02
300– < 400 12 0.12 350–< 400 1 0.02
400– < 500 4 0.04 > ¼ 400 1 0.02
500– < 600 3 0.03 50 1.00
600– < 700 1 0.01
700– < 800 0 0.00
800– < 900 1 0.01
100 1.00
The distribution is skewed to the right, or positively
skewed. There is a gap in the histogram, and what
appears to be an outlier in the ‘500–550’ interval.
The histogram is skewed right, with a majority of
observations between 0 and 300 cycles. The class
holding the most observations is between 100 and 200
cycles.
816 Chapter 1

decreased to any value at least 370 without changing


b. Class interval Freq. Rel. Freq. the median.
d. 6.18 min; 6.16 min
2.25 < 2.75 2 0.04
2.75 < 3.25 2 0.04 35. a. 125.
3.25 < 3.75 3 0.06 b. If 127.6 is reported as 130, then the median is 130, a
substantial change. When there is rounding or grouping,
3.75 < 4.25 8 0.16
the median can be highly sensitive to a small change.
4.25 < 4.75 18 0.36
4.75 < 5.25 10 0.20 37. x~ ¼ 92; xtrð25Þ ¼ 95:07; xtrð10Þ ¼ 102:23; x ¼ 119:3
5.25 < 5.75 4 0.08 Positive skewness causes the mean to be larger than the
5.75 < 6.25 3 0.06 median. Trimming moves the mean closer to the median.
39. a. y ¼ x þ c, y~ ¼ x~ þ c
The distribution of the natural logs of the original data b. y ¼ cx, y~ ¼ c~
x
is much more symmetric than the original.
41. a. 25.8 b. 49.31 c. 7.02 d. 49.31
c. .56, .14.
43. a. 2887.6, 2888 b. 7060.3
29. d. The frequency distribution is:
45. 24.36
Relative Relative 47. $1,961,160
Class frequency Class frequency 49. 3.5; 1.3, 1.9, 2.0, 2.3, 2.5
0 < 150 .193 900 < 1050 .019 51. a. 1, 6, 5
150 < 300 .183 1050 < 1200 .029 b. The box plot shows positive skewness. The two
300 < 450 .251 1200 < 1350 .005 longest runs are extreme outliers.
450 < 600 .148 1350 < 1500 .004 c. outlier: greater than 13.5 or less than 6.5
600 < 750 .097 1500 < 1650 .001 extreme outlier: greater than 21 or less than 14
d. The largest observation could be decreased to 6
750 < 900 .066 1650 < 1800 .002
without affecting fs
1800 < 1950 .002
53. a. The mean is 27.82, the median is 26, and the 5%
The relative frequency distribution is almost unimodal trimmed mean is 27.38. The mean exceeds the
and exhibits a large positive skew. The typical middle median, in accord with positive skewness. The
value is somewhere between 400 and 450, although the trimmed mean is between the mean and median, as
skewness makes it difficult to pinpoint more exactly than you would expect.
this. b. There are two outliers at the high end and one at the
e. .775, .014 low end, but there are no extreme outliers. Because
f. .211 the median is in the lower half of the box, the upper
whisker is longer than the lower whisker, and there are
31. a. 5.24 two high outliers compared to just one low outlier, the
b. The median, 2, is much lower because of positive plot suggests positive skewness.
skewness.
c. Trimming the largest and smallest observations yields 55. The two distributions are centered in about the same
the 5.9% trimmed mean, 4.4, which is between the place, but one machine is much more variable than the
mean and median. other. The more precise machine produced one outlier,
but this part would not be an outlier if judged by the
33. a. A stem-and leaf display: distribution of the other machine.
32 55 Stem: ones 57. All of the Indian salaries are below the first quartile of
33 49 Leaf: tenths Yankee salaries. There is much more variability in the
34 Yankee salaries. Neither team has any outliers.
35 6699
61. The three flow rates yield similar uniformities, but the
36 34469 values for the 160 flow rate are a little higher.
37 03345
38 9 63. a. 9.59, 59.41. The standard deviations are large, so it is
39 2347 certainly not true that repeated measurements are
40 23 identical.
b. .396, .323. In terms of the coefficient of variation, the
41
HC emissions are more variable.
42 4
65. 10.65
The display is reasonably symmetric, so the mean and 67. a. y ¼ a
x þ b: s2y ¼ a2 s2x :
median will be close. b. 100.78, .572
b. 370.7, 369.50.
c. The largest value (currently 424) could be increased
by any amount without changing the median. It can be
Chapter 2 817

69. The mean is .93 and the standard deviation is .081. The 7. a. {111, 112, 113, 121, 122, 123, 131, 132, 133, 211,
distribution is fairly symmetric with a central peak, as 212, 213, 221, 222, 223, 231, 232, 233, 311, 312, 313,
shown by the stem and leaf display: 321, 322, 323, 331, 332, 333}
b. {111, 222, 333}
Leaf unit ¼ 0.010 c. {123, 132, 213, 231, 312, 321}
7 7 d. {111, 113, 131, 133, 311, 313, 331, 333}
8 11
8 556
9. a. S ¼ {BBBAAAA, BBABAAA, BBAABAA,
BBAAABA, BBAAAAB, BABBAAA, BABABAA,
9 22333344
BABAABA, BABAAAB, BAABBAA, BAABABA,
9 55 BAABAAB, BAAABBA, BAAABAB, BAAAABB,
10 04 ABBBAAA, ABBABAA, ABBAABA, ABBAAAB,
10 55 ABABBAA, ABABABA, ABABAAB, ABAABBA,
ABAABAB, ABAAABB, AABBBAA, AABBABA,
AABBAAB, AABABBA, AABABAB, AABAABB,
71. a. Mode ¼ .93. It occurs four times in the data set. AAABBBA, AAABBAB, AAABABB, AAAABBB}
b. The Modal Category is the one in which the most b. {AAAABBB, AAABABB, AAABBAB, AABAABB,
observations occur. AABABAB}
73. The measures that are sensitive to outliers are the mean 13. a. .07 b. .30 c. .57
and the midrange. The mean is sensitive because all
values are used in computing it. The midrange is the 15. a. They are awarded at least one of the first two projects,
most sensitive because it uses only the most extreme .36.
values in its computation. b. They are awarded neither of the first two projects, .64.
The median, the trimmed mean, and the midfourth are c. They are awarded at least one of the projects, .53.
less sensitive to outliers. The median is the most resistant d. They are awarded none of the projects, .47.
to outliers because it uses only the middle value (or e. They are awarded only the third project, .17.
values) in its computation. The midfourth is also quite f. Either they fail to get the first two or they are awarded
resistant because it uses the fourths. The resistance of the the third, .75.
trimmed mean increases with the trimming percentage. 17. a. .572 b. .879
75. a. s2y ¼ s2x and sy ¼ sx b. s2z ¼ 1 and sz ¼ 1 19. a. SAS and SPSS are not the only packages.
77. b. .552, .102 c. 30 d. 19 b. .7 c. .8 d. .2

79. a. There may be a tendency to a repeating pattern. 21. a. .8841 b. .0435


b. The value .1 gives a much smoother series. 23. a. .10 b. .18, .19 c. .41 d. .59 e. .31 f. .69
c. The smoothed value depends on all previous values of
the time series, but the coefficient decreases with k. 25. a. 1/15 b. 6/15 c. 14/15 d. 8/15
d. As t gets large, the coefficient (1 – a)t–1 decreases to
zero, so there is decreasing sensitivity to the initial 27. a. .98 b. .02 c. .03 d. .24
value. 29. a. 1/9 b. 8/9 c. 2/9
31. a. 20 b. 60 c. 10
Chapter 2 33. a. 243 b. 3645, 10
1. a. A\B0 b. A[B c. (A\B0 ) [ (B\A0 ) 35. .0679
3. a. S ¼ {1324, 1342, 1423, 1432, 2314, 2341, 2413, 37. .2
2431, 3124, 3142, 4123, 4132, 3214, 3241,
4213, 4231} 39. .0456
b. A ¼ {1324, 1342, 1423, 1432}
41. a. .0839 b. .24975 c. .1998
c. B ¼ {2314, 2341, 2413, 2431, 3214, 3241, 4213,
4231} 43. a. 1/15 b. 1/3 c. 2/3
d. A[B ¼ {1324, 1342, 1423, 1432, 2314, 2341, 2413,
2431, 3214, 3241, 4213, 4231} 45. a. .447, .5, .2
A\B ¼ ∅ b. P(A|C) ¼ .4, the fraction of ethnic group C that has
0
A ¼ {2314, 2341, 2413, 2431, 3124, 3142, 4123, blood type A.
4132, 3214, 3241, 4213, 4231} P(C|A) ¼ .447, the fraction of those with blood group
A that are of ethnic group C.
5. a. A ¼ { SSF, SFS, FSS } c. .211
b. B ¼ { SSS, SSF, SFS, FSS }
c. C ¼ { SSS, SSF, SFS } 47. a. Of those with a Visa card, .5 is the proportion who also
d. C0 ¼ { SFF, FSS, FSF, FFS, FFF } have a Master Card.
A[C ¼ { SSS, SSF, SFS, FSS } b. Of those with a Visa card, .5 is the proportion who do
A\C ¼ { SSF, SFS } not have a Master Card.
B[C ¼ { SSS, SSF, SFS, FSS }
B\C ¼ { SSS SSF, SFS }
818 Chapter 3

c. Of those with Master Card, .625 is the proportion


who also have a Visa Card.
Chapter 3
d. Of those with Master Card, .375 is the proportion
1. S : FFF SFF FSF FFS FSS SFS SSF SSS
who do not have a Visa Card.
e. Of those with at least one of the two cards, .769 is the X: 0 1 1 1 2 2 2 3
proportion who have a Visa card.
3. M ¼ the absolute value of the difference between the
49. .217, .178
outcomes, with possible values 0, 1, 2, 3, 4, 5 or 6;
51. .436, .582 W ¼ 1 if the sum of the two resulting numbers is even
and W ¼ 0 otherwise, a Bernoulli random variable.
53. .0833
5. No, X can be a Bernoulli random variable where a
59. a. .067 b. .509 success is an outcome in B, with B a particular subset
of the sample space.
61. .287
7. a. Possible values are 0, 1, 2, . . ., 12; discrete
63. a. 76.5% b. .235
b. With N ¼ # on the list, values are 0, 1, 2, . . . , N;
65. .466, .288, .247 discrete
c. Possible values are 1, 2, 3, 4, . . . ; discrete
67. a. Because of independence, the conditional probability d. { x: 0 < x < 1 } if we assume that a rattlesnake can
is the same as the unconditional probability, .3. be arbitrarily short or long; not discrete
b. .82 c. .146 e. With c ¼ amount earned per book sold, possible
values are 0, c, 2c, 3c, . . . , 10,000c; discrete
71. .349, .651, (1  p)n, 1  (1  p)n
f. { y: 0  y  14} since 0 is the smallest possible pH
73. .99999969, .226 and 14 is the largest possible pH; not discrete
g. With m and M denoting the minimum and maximum
75. .9981 possible tension, respectively, possible values are {
x: m  x  M }; not discrete
77. Yes, no
h. Possible values are 3, 6, 9, 12, 15, . . . — i.e., 3(1),
79. a. 2p  p2 b. 1  (1  p)n c. (1  p)3 3(2), 3(3), 3(4), . . .giving a first element, etc,; discrete
d. .9 + .1(1  p)3 e. .0137
9. a. X is a discrete random variable with possible values
81. .8588, .9896 {2, 4, 6, 8, . . .}
b. X is a discrete random variable with possible values
83. 2p(1  p) {2, 3, 4, 5, . . .}
85. a. 1/3, .444 b. .15 c. .291 11. a. p(4) ¼ .10 c. .45, .25
87. .45, .32 13. a. .70 b. .45 c. .55 d. .71 e. .65 f. .45
89. a. 1/120 b. 1/5 c. 1/5 15. a. (1,2) (1,3) (1,4) (1,5) (2,3) (2,4) (2,5) (3,4) (3,5) (4,5)
b. p(0) ¼ .3, p(1) ¼ .6, p(2) ¼ .1, p(x) ¼ 0 otherwise
91. .9046
c. F(0) 8¼ .30, F(1) ¼ .90, F(2) ¼ 1. The c.d.f. is
93. a. .904 b. .766 > 0
> x<0
<
:30 0  x<1
95. .008 FðxÞ ¼
> :90 1  x<2
>
:
97. .362, .348, .290 1 2x

99. a. P(G | R1 < R2 < R3) ¼ 2/3, so classify as granite if 17. a. .81 b. .162
R1 < R2 < R3. c. The fifth battery must be an A, and one of the first
b. P(G | R1 < R3 < R2) ¼ .294, so classify as basalt if four must also be an A, so
R1 < R3 < R2. p(5) ¼ P(AUUUA or UAUUA or UUAUA or
P(G | R3 < R1 < R2) ¼ 1/15, so classify as basalt if UUUAA) ¼ .00324
R3 < R1 < R2. d. P(Y ¼ y) ¼ (y  1)(.1)y2(.9)2, y ¼ 2,3,4,5,. . .
c. .175 d. p > 14/17
19. c. F(x) ¼ 0, x < 1, F(x) ¼ log10([x] + 1), 1  x  9,
101. a. 1/24 b. 3/8 F(x) ¼ 1, x > 9.
d. .602, .301
103. s ¼ 1
21. F(x) ¼ 0, x < 0; .10, 0  x < 1; .25, 1  x < 2; .45,
107. a. P(B0|survive) ¼ b0/[1  (b1 + b2)cd] 2  x < 3; .70, 3  x < 4; .90, 4  x < 5; .96,
P(B1|survive) ¼ b1(1  cd)/[1  (b1 + b2)cd] 5  x < 6; 1.00, 6  x
P(B2|survive) ¼ b2(1  cd)/[1  (b1 + b2)cd]
b. .712, .058, .231 23. a. p(1) ¼ .30, p(3) ¼ .10, p(4) ¼ .05, p(6) ¼ .15,
p(12) ¼ .40
b. .30, .60
25. a. p(x) ¼ (1/3)(2/3)x1, x ¼ 1, 2, 3, . . .
b. p(y) ¼ (1/3)(2/3)y2, y ¼ 2, 3, 4, . . .
c. p(0) ¼ 1/6, p(z) ¼ (25/54)(4/9)z1, z ¼ 1, 2, 3, 4, . . .
Chapter 3 819

29. a. .60 b. $110 85. a. h(x; 10, 10, 20) b. .0325 c. h(x; n, n, 2n),
E(X) ¼ n/2, V(X) ¼ n2/[4(2n  1)]
31. a. 16.38, 272.298, 3.9936 b. 401 c. 2496 d. 13.66
87. a. nb(x; 2, .5) ¼ (x + 1).5x+2, x ¼ 0, 1, 2, 3, . . .
33. Yes, because S(1/x2) is finite. b. 3/16 c. 11/16 d. 2, 4
35. $700 89. nb(x; 6, .5), E(X) ¼ 6 ¼ 3(2)
37. E[h(X)] ¼ .408 > 1/3.5 ¼ .286, so you expect to win 93. a. .932 b. .065 c. .068 d. .491 e. .251
more if you gamble.
95. a. .011 b. .441 c. .554, .459 d. .944
39. V(X) ¼ V(X)
97. a. .491 b. .133
41. a. 32.5 b. 7.5
c. V(X) ¼ E[X(X–1)] + E(X)  [E(X)]2 99. a. .122, .808, .283 b. 12, 3.464 c. .530, .011
43. a. 1/4, 1/9, 1/16, 1/25, 1/100 101. a. .099 b. .135 c. 2
b. m ¼ 2.64, s ¼ 1.54, P(|X  m|  2s) ¼ .04 < .25,
P(|X  m|  3s) ¼ 0 < 1/9 103. a. 4 b. .215 c. 1.15 years
The actual probability can be far below the Chebyshev 105. a. .221 b. 6,800,000 c. p(x; 1608.5)
bound, so the bound is conservative.
c. 1/9, equal to the Chebyshev bound 111. b. 3.114, .405, .636
d. P(1) ¼ .02, P(0) ¼ .96, P(1) ¼ .02
113. a. b(x; 15, .75) b. .6865 c. .313 d. 45/4, 45/16
45. MX(t) ¼ .5et/(1–.5et), E(X) ¼ 2, V(X) ¼ 2 e. .309
47. pY(y) ¼ .75(.25)y1, y ¼ 1, 2, 3, . . . 115. .9914
49. E(X) ¼ 5, V(X) ¼ 4 117. a. p(x; 2.5) b. .067 c. .109
2 =2
51. MY ðtÞ ¼ et , E(X) ¼ 0, V(X) ¼ 1 119. 1.813, 3.05
53. E(X) ¼ 0, V(X) ¼ 2 121. p(2) ¼ p2 , p(3) ¼ (1  p)p2 , p(4) ¼ (1  p)p2,
p(x) ¼ [1  p(2)  . . .  p(x  3)](1  p)p2, x ¼ 5,
59. a. .850 b. .200 c. .200 d. .701 6, 7, . . . .
e. .851 f. .000 g. .570 Alternatively, p(x) ¼ (1  p)p(x  1) +
61. a. .354 b. .114 c. .919 p(1  p) p(x  2), x ¼ 5, 6, 7, . . . ; 99950841

63. a. .403 b. .787 c. .773 123. a. 0029 b. 0767, .9702


P
1
65. .1478 125. a. .135 b. .00144 c. ½pðx; 2Þ5
x¼0
67. .4068, assuming independence 127. 3.590

69. a. .0173 b. .8106, .4246 c. .0056, .9022, .5858 129. a. No b. .0273

71. For p ¼ .9 the probability is higher for B (.9963 versus 131. b. .6p(x; l) + .4p(x; m) c. (l + m)/2
.99 for A) d. (l + m)/2 + (l  m)2/4
For p ¼ .5 the probability is higher for A (.75 versus 133. .5
.6875 for B)
137. X ~ b(x; 25,p E(h(X)) ¼ 500p + 750,
p),ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
73. The tabulation for p > .5 is not needed.
shðXÞ ¼ 100 pð1  pÞ
75. a. 20, 16 (binomial, n ¼ 100, p ¼ .2) b. 70, 21 Independence and constant probability might not be
valid because of the effect that customers can have on
77. When p ¼ .5, the true probability for k ¼ 2 is .0414, each other. Also, store employees might affect customer
compared to the bound of .25. decisions.
When p ¼ .5, the true probability for k ¼ 3 is .0026,
compared to the bound of .1111. 139.
When p ¼ .75, the true probability for k ¼ 2 is .0652, X 0 1 2 3 4
compared to the bound of .25. p(x) .07776 .10368 .19008 .20736 .17280
When p ¼ .75, the true probability for k ¼ 3 is .0039,
compared to the bound of .1111. X 5 6 7 8
p(x) .13824 .06912 .03072 .01024
79. MnX(t) ¼ [p + (1  p)et]n, E(n  X) ¼ n(1  p),
V(n  X) ¼ np(1  p)
Intuitively, the means of X and n  X should add to n and
their variances should be the same.
81. a. .114 b. .879 c. .121 d. Use the binomial
distribution with n ¼ 15 and p ¼ .1
83. a. h(x; 15, 10, 20) b. .0325 c. .6966
820 Chapter 4

Chapter 4 55. a. .794 b. 5.88 c. 7.94 d. .265


57. No, because of symmetry.
1. a. 25 b. .5 c. 7/16
59. a. approximate, .0391; binomial, .0437
3. b. .5 c. 11/16 d. .6328 b. approximate, .99993; binomial, .99976
5. a. 3/8 b. 1/8 c. .2969 d. .5781 61. a. .7287 b. .8643, .8159
7. a. f ðxÞ ¼ for 25  x  35 and ¼ 0 otherwise
1
10 63. a. approximate, .9933; binomial, .9905
b. .2 c. .4 d. .2 b. approximate, .9874; binomial, .9837
c. approximate, .8051; binomial, .8066
9. a. .5618 b. .4382, .4382 c. .0709
67. a. .15866 b. .0013499 c. .999936658
11. a. 1/4
pffiffiffi b. 3/16 c. 15/16 Actual: .15866 .0013499 .999936658
d. 2 e. f(x) ¼ x/2 for 0  x < 2, and f(x) ¼ 0
d. .00000028665
otherwise
69. a. 120 b. 1.329 c. .371 d. .735 e. 0
13. a. 3 b. 0 for x  1, 1  1/x3 for x > 1
c. 1/8, .088 71. a. 5, 4 b. .715 c. .411
15. a. F(x) ¼ 0 for x  0, F(x) ¼ x3/8 for 0 < x < 2, 73. a. 1 b. 1 c. .982 d. .129
F(x) ¼ 1 for x  2
b. 1/64 c. .0137, .0137 d. 1.817 75. a. .449, .699, .148 b. .050, .018

17. b. 90th percentile of Y ¼ 1.8(90th percentile of X) + 32 77. a. \ Ai b. Exponential with l ¼ .05


c. 100 pth percentile of Y ¼ a(100 pth percentile of c. Exponential with parameter nl
X) + b
83. a. .8257, .8257, .0636 b. .6637 c. 172.73
19. a. 1.5, .866 b. .9245
87. a. .9296 b. .2975 c. 98.18
21. a. .8182, .1113 b. .044
89. a. 68.03, 122.09 b. .3196 c. .7257, skewness
23. a. A + (B  A)p pffiffiffiffiffi 91. a. 149.157, 223.595 b. .957 c. .0416
b. (A + B)/2, (B  A)2/12, ðB  AÞ= 12
d. 148.41 e. 9.57 f. 125.90
c. (Bn+1  An+1)/[(n + 1)(B  A)]
93. a ¼ b
25. 314.79
95. b. G(a + b) G(m + b) /[G(a + b + m ) G(b)], b/( a + b)
27. 248, 3.6
97. Yes, since the pattern in the plot is quite linear.
29. 1/(1  t/4), 1/4, 1/16
99. Yes
31. 100p, 30p
101. Yes
33. f ðxÞ ¼ 10
1
for 5  x  5 and ¼ 0 otherwise
103. Form a new variable, the logarithms of the rainfall
35. a. M(t) ¼ .15 e.5t/(.15  t), t < .15; E(X) ¼ 7.167,
values, and then construct a normal plot for the new
V(X) ¼ 44.44
variable. Because of the linearity of this plot, normality
b. E(X) ¼ 7.167, V(X) ¼ 44.44
is plausible.
37. M(t) ¼ .15/(.15  t), E(X) ¼ 6.667, V(X) ¼ 44.44
105. The normal plot has a nonlinear pattern showing
This distribution is shifted left by .5, so the mean differs
positive skewness.
by .5 but the variance is the same.
107. The plot deviates from linearity, especially at the low
39. a. .4850 b. .3413 c. .4938 d. .9876
end, where the smallest three observations are too small
e. .9147 f. .9599 g. .9104 h. .0791
relative to the others. The plot works for any l because l
i. .0668 j. .9876
is a scale parameter.
41. a. 1.34 b. 1.34 c. .674 d. .674
109. fY ( y) ¼ 2/y3, y > 1
e. 1.555
111. fY ðyÞ ¼ yey =2
2
,y>0
43. a. .9772 b. .5 c. .9104 d. .8413
e. .2417 f. .6826 113. fY ( y) ¼ 1/16, 0 < y < 16
45. a. .7977 b. .0004 115. fY ( y) ¼ 1/[p(1 + y2)]
c. The top 5% are the values above .3987.
117. Y ¼ X2/16
47. The second machine pffiffiffi
119. fY ðyÞ ¼ 1=½2 y, 0 < y < 1
49. a. .2525 b. 39.96 pffiffiffi pffiffiffi
121. fY ðyÞ ¼ 1=½4 y, 0 < y < 1, fY ðyÞ ¼ 1=½8 y,
51. .0510 1<y<9
53. a. .8664 b. .0124 c. .2718 125. pY ( y) ¼ (1  p)y1p, y ¼ 1, 2, 3, . . .
Chapter 5 821

127. a. .4 b. .6 c. F(x) ¼ x/25, 0  x  25; 9. a. .3/380,000 b. .3024 c. .3593


F(x) ¼0, x < 0; F(x) ¼ 1, x > 25 d. 12.5, 7.22 d. 10Kx2 + .05, 20  x  30 e. no
  
129. b. F(x) ¼ 1  16/(x + 4)2, x  0; F(x) ¼ 0, x < 0 11. a. pðx; yÞ ¼ el lx =x! ey yy =y! for x ¼ 0,  1, 2, . . .;
c. .247 d. 4 e. 16.67 y ¼ 0, 1, 2, . . . b. ely ð1 þ l þ yÞ
c. ely ðl þ yÞm =m!, Poisson with parameter l + y
131. a. .6563 b. 41.55 c. .3179
13. a. exy, x  0, y  0 b. .3996 c. .5940
133. a. .00025, normal approximation; .000859, binomial d. .3298
b. .0888, normal approximation; .0963, binomial
15. a. FðyÞ ¼ 1  2e2ly þ e3ly for y  0, F(y) ¼ 0 for
135. a. F(x) ¼1.5(1  1/x), 1  x  3; F(x) ¼0, x < 1; y < 0; f ðyÞ ¼ 4le2ly  3e3ly for y  0, f(y) ¼ 0
F(x) ¼ 1, x > 3 b. .9, .4 c. 1.6479 for y < 0
d. .5333 e. .2662 b. 2/(3l)
137. a. 1.075, 1.075 b. .0614, .3331 c. 2.476 b.p1/p
17. a. .25 ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffic. 2/p
d. fX ðxÞ ¼ 2pRffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
2  x2 ðpR2 Þ for R  x  R,
139. b. 95,693, 1/3 
fY ðyÞ ¼ 2 R2  y2 ðpR2 Þ for R  y  R, no
141. b. F(x) ¼ .5e.2x, x  0; F(x) ¼ 1  .5e.2x, x > 0
c. .5, .6648, .2555, .6703 19. .15
143. a. k ¼ (a  1)5 a1
b. F(x) ¼ 0, x  5; 21. L2
F(x) ¼ 1  (5/x) a1, x > 5 c. 5(a  1)/(a  2)
23. 1/4 h
145. b. .4602, .3636 c. .5950 d. 140.178
25. 2/3
147. a. Weibull b. .5422
27. a. .1058 b. .0128
149. a. l b. a xa 1/ba 2
c. FðxÞ ¼ 1  eaðxx =ð2bÞÞ , 0  x  b; F(x) ¼ 0, 37. a. fX(x) ¼ 2x, 0 < x < 1, fX(x) ¼ 0 elsewhere
x < 0; F(x) ¼ 1  eab/2, x > b b. fY|X(y| x) ¼ 1/x, 0 < y < x < 1 c. .6
d. no, the domain is not a rectangle
f ðxÞ ¼ að1  x=bÞeaðxx =ð2bÞÞ , 0  x  b;
2

e. E(Y| X ¼ x) ¼ x/2, a linear function of x


f(x) ¼ 0, x < 0, f(x) ¼ 0, x > b f. V(Y| X ¼ x) ¼ x2/12
This gives total probability less than 1, so some
probability is located at infinity (for items that last 39. a. fX(x) ¼ 2e2x, 0 < x < 1, fX(x) ¼ 0, x  0
forever). b. fY|X(y| x) ¼ ey+x, 0 < x < y < 1
c. P(Y > 2| x ¼ 1) ¼ 1/e
151. mR  v/20, sR  v/800 d. no, the domain is not rectangular
155. F(q*) ¼ .818 e. E(Y| X ¼ x) ¼ x + 1, a linear function of x
f. V(Y| X ¼ x) ¼ 1
41. a. E(Y| X ¼ x) ¼ x/2, a linear function of x; V(Y|
Chapter 5 X ¼ x) ¼ x2/12
b. f(x, y) ¼ 1/x, 0 < y < x < 1
1. a. .20 b. .42 c. The probability of at least one c. fY(y) ¼ ln(y), 0 < y < 1
hose being in use at each pump is .70. d. E(Y) ¼ 1/4, V(Y) ¼ 7/144
d. x 0 1 2 y 0 1 2 e. E(Y) ¼ 1/4, V(Y) ¼ 7/144
pX(x) .16 .34 .50 pY(y) .24 .38 .38 43. a. pY|X(0|1) ¼ 4/17, pY|X(1|1) ¼ 10/17, pY|X(2|1) ¼ 3/17
P(X  1) ¼ .50 b. pY|X(0|2) ¼ .12, pY|X(1|2) ¼ .28, pY|X(2|2) ¼ .60
e. dependent, .30 ¼ P(X ¼ 2 and Y ¼ 2) 6¼ P(X ¼ 2) c. .40
P(Y ¼ 2) ¼ (.50)(.38) d. pX|Y(0|2) ¼ 1/19, pX|Y(1|2) ¼ 3/19, pX|Y(2|2) ¼ 15/19

3. a. .15 b. .40 c. .22 ¼ P(A) ¼ P(|X1  X2|  2) 45. a. E(Y| X ¼ x) ¼ x2/2 b. V(Y| X ¼ x) ¼ x4/12
d. .17, .46 c. fY(y) ¼ y–.5  1, 0 < y < 1
e. x1 0 1 2 3 4 47. a. p(1,1) ¼ p(2,2) ¼ p(3,3) ¼ 1/9, p(2,1) ¼ p(3,1)
p1(x1) .19 .30 .25 .14 .12 ¼ p(3,2) ¼ 2/9
E(X1) ¼ 1.7 b. pX(1) ¼ 1/9, pX(2) ¼ 3/9, pX(3) ¼ 5/9
c. pY|X(1|1) ¼ 1, pY|X(1|2) ¼ 2/3, pY|X(2|2) ¼ 1/3,
f. x2 0 1 2 3 pY|X(1|3) ¼ .4, pY|X(2|3) ¼ .4, pY|X(3|3) ¼ .2
d. E(Y| X ¼ 1) ¼ 1, E(Y| X ¼ 2) ¼ 4/3,
p2(x2) .19 .30 .28 .23
E(Y| X ¼ 3) ¼ 1.8, no
g. 0 ¼ p(4 , 0) 6¼ p1(4)  p2(0) ¼ (.12)(.19) so the two e. V(Y| X ¼ 1) ¼ 0, V(Y| X ¼ 2) ¼ 2/9,
variables are not independent. V(Y| X ¼ 3) ¼ .56
5. a. .54 b. .00018 49. a. pX|Y(1|1) ¼ .2, pX|Y(2|1) ¼ .4, pX|Y(3|1) ¼ .4,
pX|Y(2|2) ¼ 1/3, pX|Y(3|2) ¼ 2/3, pX|Y(3|3) ¼ 1
7. a. .030 b. .120 c. .10, .30 d. .38 b. E(X| Y ¼ 1) ¼ 2.2, E(X| Y ¼ 2) ¼ 8/3,
e. yes, p(x,y) ¼ pX(x)  pY(y) E(X| Y ¼ 3) ¼ 3, no
c. V(X| Y ¼ 1) ¼ .56, V(X| Y ¼ 2) ¼ 2/9,
V(X| Y ¼ 3) ¼ 0
822 Chapter 6

51. a. 2x – 10 b. 9 c. 3 d. .0228 d. F(x, y) ¼ .6x2y + .4xy3, 0  x  1; 0  y  1;


F(x, y) ¼ 0, x  0; F(x, y) ¼ 0, y  0;
53. a. pX(x) ¼ .1, x ¼ 0, 1, 2, . . ., 9; pY|X(y| x) ¼ 1/9, y ¼ 0, F(x, y) ¼ .6x2 + .4x, 0  x  1, y > 1;
1, 2, . . ., 9, y 6¼ x; F(x, y) ¼ .6y + .4y3, x > 1, 0  y  1; F(x, y) ¼ 1,
pX,Y(x, y) ¼ 1/90, x, y ¼ 0, 1, 2, . . ., 9, y 6¼ x x > 1, y > 1
b. E(Y| X ¼ x) ¼ 5  x/9, x ¼ 0, 1, 2, . . ., 9, a linear P(.25  X  .75, .25  Y  .75) ¼ .23125
function of x e. F(x, y) ¼ 6x2y2, x + y  1, 0  x  1; 0  y  1,
55. a. .6x, .24x b. 60 c. 60 x  0, y  0
F(x, y) ¼ 3x4  8x3 + 6x2 + 3y4  8y3 + 6y2  1,
57. a. .1410 b. .1165 x + y > 1, x  1, y  1
With positive correlation, the deviations from their means F(x, y) ¼ 0, x  0; F(x, y) ¼ 0, y  0;
of X and Y are likely to have the same sign. F(x, y) ¼ 3x4 – 8x3 + 6x2 , 0  x  1, y > 1
F(x, y) ¼ 3y4 – 8y3 + 6y2 , 0  y  1, x > 1
59. a. If U ¼ X1 + X2, fU(u) ¼ u2, 0 < u < 1, fU(u) ¼ F(x, y) ¼ 1, x > 1, y > 1
2u – u2, 1 < u < 2, fU(u) ¼ 0, elsewhere
b. If V ¼ X2  X1, fV(v) ¼ 2  2v, 0 < v < 1, 91. a. 2x, x b. 40 c. .100
fV(v) ¼ 0, elsewhere
93. MW(t) ¼ 2/[(1–1000t)(2–1000t)], 1500
61. 4y3[(ln(y3)]2, 0 < y3 < 1
65. a. g5(y) ¼ 5y4/105, 25/3 b. 20/3 c. 5
d. 1.409 Chapter 6
3
67. gY5 jY1 ðy5 j4Þ ¼ ½2=3½ðy5  4Þ=6 , 4 < y5 < 10; 8.8 1. a. x 25 32.5 40 45 52.5 65
69. 1/(n + 1), 2/(n + 1), 3/(n + 1), . . ., n/(n + 1) xÞ
pð .04 .20 .25 .12 .30 .09
h i2
Gðnþ1ÞGðiþ1=yÞ Gðnþ1ÞGðiþ2=yÞ Gðnþ1ÞGðiþ1=yÞ
71. GðiÞGðnþ1þ1=yÞ , GðiÞGðnþ1þ2=yÞ  GðiÞGðnþ1þ1=yÞ  ¼ 44:5 ¼ m
EðXÞ
73. a. .0238 b. $2025 b. s2 0 112.5 312.5 800
  2
75. gi;j yi ; yj ¼ ði1Þ!ðji1Þ!ðnjÞ! Fðyi Þ
n! i1
ðFðyj Þ  Fðyi ÞÞ ji1 nj
ð1  Fðyj ÞÞ f ðyi Þf ðyj Þ, p(s ) .38 .20 .30 .12
1 < yi < yj < 1
E(S ) ¼ 212.25 ¼ s
2 2
1 Ð
77. a. fW ðw2 Þ ¼ nðn  1Þ 1
2
ðFðw1 þ w2 Þ  Fðw1 ÞÞn2 f ðw1 Þf ðw1 þ w2 Þdw1
b. fW2 ðw2 Þ ¼ nðn  1Þw2n2 ð1  w2 Þ, 0 < w2 < 1 3. x/n 0 .1 .2 .3 .4

79. f(x) ¼ ex/2  ex, x  0; f(x) ¼ 0, x < 0. p(x/n) 0.0000 0.0000 0.0001 0.0008 0.0055

81. a. 3/81,250 .5 .6 .7 .8 .9 1.0


8 ð 30x
>
>
< kxydy ¼ kð250x  10x2 Þ; 0  x  20 0.0264 0.0881 0.2013 0.3020 0.2684 0.1074
b. fX ðxÞ ¼ ð 30x
20x
>
>
: kxydy ¼ kð450x  30x þ 2x Þ; 20<x  30
2 1 3
0 5. a. x 1 1.5 2 2.5 3 3.5 4
fY(y) ¼ fX(y) dependent
c. .3548 d. 25.969 e. 32.19, .894 xÞ
pð .16 .24 .25 .20 .10 .04 .01
f. 7.651
b. PðX  2:5Þ ¼ :85
83. 7/6 c. r 0 1 2 3

87. c. If p(0) ¼ .3, p(1) ¼ .5, p(2) ¼ .2, then 1 is the smaller of p(r) .30 .40 .22 .08
the two roots, so extinction is certain in this case with m < 1.
If p(0) ¼ .2, p(1) ¼ .5, p(2) ¼ .3, then 2/3 is the smaller of d. .24
the two roots, so extinction is not certain with m > 1. 7.
x pð
xÞ x pð
xÞ x pð

89. a. P((X,Y) ∈ A) ¼ F(b, d)  F(b, c)  F(a, d) + F(a, b)
b. P((X,Y) ∈ A) ¼ F(10, 6)  F(10, 1)  F(4, 6) + F(4, 1) 0.0 0.000045 1.4 0.090079 2.8 0.052077
P((X,Y) ∈ A) ¼ F(b, d)  F(b, c–1) – F(a–1, d) + 0.2 0.000454 1.6 0.112599 3.0 0.034718
F(a–1, b–1) 0.4 0.002270 1.8 0.125110 3.2 0.021699
c. At each (x*, y*), F(x*, y*) is the sum of the 0.6 0.007567 2.0 0.125110 3.4 0.012764
probabilities at points (x, y) such that x  x* and 0.8 0.018917 2.2 0.113736 3.6 0.007091
y  y* 1.0 0.037833 2.4 0.094780 3.8 0.003732
F(x, y) x 1.2 0.063055 2.6 0.072908 4.0 0.001866
100 250
200 .50 1
y 100 .30 .50
0 .20 .25
Chapter 7 823

11. a. 12, .01 75. .8340


b. 12, .005
c. With less variability, the second sample is more 77. a. r ¼ s2W =ðs2W þ s2E Þ
closely concentrated near 12. b. r ¼ .9999

13. a. No, the distribution is clearly not symmetric. 79. 26, 1.64
A positively skewed distribution —perhaps Weibull, 81. If Z1 and Z2 are independent standard normal
lognormal, or gamma. observations, then let
b. .0746 pffiffiffi
X ¼ 5Z1 + 100, Y ¼ 2ð:5Z1 þ ð 3=2ÞZ2 Þ þ 50
c. .00000092. No, 82 is not a reasonable value for m.
15. a. .8366 b. no
17. 43.29
Chapter 7
19. a. .9802, .4802 b. 32 1. a. 113.73, X b. 113, Xe
c. 12.74, S, an estimator for the population standard
21. a. .9839 b. .8932 deviation
d. The sample proportion of students exceeding 100 in
27. a. 87,850, 19,100,116 IQ is 30/33 ¼ .91
b. In case of dependence, the mean calculation is still e. .112, S=X
valid, but not the variance calculation.
c. .9973 3. a. 1.3481, X b. 1.3481, X
c. 1.78, X þ 1:282S
29. a. .2871 b. .3695 d. .67 e. .0846
31. .0317; Because each piece is played by the same 5. a. 1,703,000 b. 1,599,730 c. 1,601,438
musicians, there could easily be some dependence. If
they perform the first piece slowly, then they might 7. a. 120.6 b. 1,206,000, 10,000X c. .8
perform the second piece slowly, too, d. 120, Xe
pffiffiffiffiffiffiffiffi
33. a. 45 b. 68.33 c. 1, 13.67 d. 5, 68.33 9. a. X, 2.113 b. l=n, .119
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
35. a. 50, 10.308 b. .0076 c. 50 d. 111.56 11. b. p1 ð1  p1 Þ=n1 þ p2 ð1  p2 Þ=n2
e. 131.25 c. In part (b) replace p1 with X1/n1 and replace p2 with
X2/n2
37. a. .9615 b. .0617 d. .245 e. .0411
39. a. .5, n(n + 1)/4 b. .25, n(n + 1)(2n + 1)/24 13. a. .9876 b. .6915
41. 10:52.74 ^ ¼ P X2 =ð2nÞ b. 74.505
15. a. y i
43. .48 17. b. 4/9
45. b. MY(t) ¼ 1/[1  t2/(2n)]n 19. a. p^ ¼ 2^
l  :30 ¼ :20 c. p^ ¼ ð100^
l  9Þ=70
w2n
47. Because is the sum of n independent random variables, 21. a. .15 b. yes c. .4437
each distributed as w21 , the Central Limit Theorem applies.
^ ¼ ð2
23. a. y x  1Þ=ð1  xÞ ¼ 3
53. a. 3.2 b. 10.04, the square of the answer to (a) b. ^
y ¼ ½n=S lnðxi Þ  1 ¼ 3:12
57. a. n2/(n2  2), n2 > 2 25. p^ ¼ r=ðr þ xÞ ¼ :15 This is the number of successes
b. 2n22 ðn1 þ n2  2Þ=½n1 ðn2  2Þ2 ðn2  4Þ, n2 > 4 over the number of trials, the same as the result in
61. a. 4.32 Exercise 21. It is not the same as the estimate of
Exercise 17.
65. a. The approximate value, .0228, is smaller because of P 2 P 2
skewness in the chi-squared distribution 27. a. s^2 ¼ 1n ^2 ¼ 1n
Xi b. s Xi
b. This approximation gives the answer .03237, agreeing P
^ ¼ X2 =ð2nÞ ¼ 74:505, the same as in Exercise 15
29. a. y i
with the software answer to this number of decimals. qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
b. ^
2y lnð2Þ ¼ 10:16
67. No, the sum of the percentiles is not the same as the
percentile of the sum, except that they are the same for 31. ^
l ¼  lnð^
pÞ=24 ¼ :0120
the 50th percentile. For all other percentiles, the
percentile of the sum is closer to the 50th percentile 33. No, statistician A does not have more information.
than is the sum of the percentiles Qn Pn
35. i¼1 xi ; i¼1 xi
69. a. 2360, 73.70 b. .9713
37. I(.5 max(x1, x2, . . ., xn)  y  min(x1, x2, . . ., xn))
71. .9685
39. a. 2X(n  X)/[n(n  1)]
73. .9093 Independence is questionable because con- pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
sumption one day might be related to consumption the 41. a. X b. FððX  cÞ= 1  1=nÞ
next day.
824 Chapter 8

43. a. Vð~yÞ ¼ y2 =½nðn þ 2Þ b. y2/n 33. a. (38.081, 38.439) b. (100.55, 101.19), yes
c. The variance in (a) is below the bound of (b), but the
theorem does not apply because the domain is a 35. a. Assuming normality, a 95% lower confidence bound
function of the parameter. is 8.11. When the bound is calculated from repeated
independent samples, roughly 95% of such bounds
45. a. x b. N(m, s2/n) should be below the population mean.
c. Yes, the variance is equal to the Cramér-Rao bound b. A 95% lower prediction bound is 7.03. When the
d. The answer in (b) shows that the asymptotic bound is calculated from repeated independent
distribution of the theorem is actually exact here. samples, roughly 95% of such bounds should be
below the value of an independent observation.
47. a. 2/s2
b. The answer in (a) is different from the answer, 37. a. 378.85 b. 413.09 c. (333.88, 407.50)
1/(2s4), to 46(a), so the information does depend on
the parameterization. 39. 95% prediction interval: (.0498, .0772)

49. ^l ¼ 6=ð6t6  t1  . . .  t5 Þ ¼ 6=ðx1 þ 2x2 þ . . . þ 6x6 Þ ¼ :0436, 41. a. (169.36, .179.37)


where x1 ¼ t1, x2 ¼ t2  t1, . . ., x6 ¼ t6  t5 b. (134.30, 214.43), which includes 152
c. The second interval is much wider, because it allows
53. 1.275, s ¼ 1.462 for the variability of a single observation.
d. The normal probability plot gives no reason to doubt
s2 Þ ¼ s2 =2, so 2^
55. b. no, Eð^ s2 is unbiased normality. This is especially important for part (b), but
59. .416, .448 the large sample size implies that normality is not so
critical for (a).
61. d(X) ¼ (1)X, d(200) ¼ 1, d(199) ¼ 1
 45. a. 18.307 b. 3.940 c. .95 d. .10
^ ¼ P xi yi P x2 ¼ 30:040, the estimated minutes
63. b. b Pi ^ i Þ2 ¼ 16:912; 47. b. (2.34, 5.60)
^ ¼ 1n ðyi  bx
per item; s 2
^ ¼ 751
25b 49. a. (7.91, 12.00)
b. Because of an outlier, normality is questionable for
this data set.
Chapter 8 c. In MINITAB, put the data in C1 and execute the
following macro 999 times
1. a. 99.5% b. 85% c. 2.97 d. 1.15 Let k3 ¼ N(c1)
sample k3 c1 c3;
3. a. A narrower interval has a lower probability b. No, replace.
m is not random let k1 ¼ mean(c3)
c. No, the interval refers to m, not individual observations stack k1 c5 c5
d. No, a probability of .95 does not guarantee 95 end
successes in 100 trials
51. a. (26.61, 32.94)
5. a. (4.52, 5.18) b. (4.12, 5.00) c. 55 d. 94 b. Because of outliers, the weight gains do not seem
7. Increase n by a factor of 4. Decrease the width by a factor normally distributed.
of 5. c. In MINITAB, see Exercise 49(c).
pffiffiffi 53. a. (38.46, 38.84)
9. a. ðx  1:645 s=
pffiffiffi n; 1Þ; (4.57, 1)
x  za  s= n; 1Þ
b. ð b. Although the normal probability plot is not perfectly
pffiffiffi straight, there is not enough deviation to reject
c. ð1; x þ za  s= nÞ; (1, 59.7) normality.
11. 950; .8724 (normal approximation), .8731 (binomial) c. In MINITAB, see Exercise 49(c).

13. a. (.99, 1.07) b. 158 55. a. (169.13, 205.43)


b. Because of an outlier, normality is questionable for
15. a. 80% b. 98% c. 75% this data set.
c. In MINITAB, see Exercise 49(c).
17. .06, which is positive, suggesting that the population
mean change is positive 57. a. In MINITAB, put the data in C1 and execute the
following macro 999 times
19. (.513, .615) Let k3 ¼ N(c1)
21. .218 sample k3 c1 c3;
replace.
23. (.439, .814) let k1 ¼ stdev(c3)
stack k1 c5 c5
25. a. 381 b. 339 end
29. a. 1.341 b. 1.753 c. 1.708 d. 1.684 b. Assuming normality, a 95% confidence interval for
e. 2.704 s is (3.541, 6.578), but the interval is inappropriate
because the normality assumption is clearly not
31. a. 2.228 b. 2.131 c. 2.947 d. 4.604 satisfied.
e. 2.492 f. 2.715
Chapter 9 825

59. a. (.198, .230) b. .048 21. Test H0: m ¼ .5 vs. Ha: m 6¼ .5


c. A 90% prediction interval is (.149, .279) a. Do not reject H0 because t.025,12 ¼ 2.179 > |1.6|
b. Do not reject H0 because t.025,12 ¼ 2.179 > |1.6|
61. 246 c. Do not reject H0 because t.005,24 ¼ 2.797 > |2.6|
63. a. A 95% confidence interval for the mean is (.163, d. Reject H0 because t.005,24 ¼ 2.797 < |3.9|
.174). Yes, this interval is below the interval for 59(a). 23. Because t ¼ 2.24  1.708 ¼ t.05,25, reject H0: m ¼ 360.
b. (.089, .326) Yes, this suggests contradiction of prior belief.
65. (0.1263, 0.3018) 25. Because |z| ¼ 3.37  1.96, reject the null hypothesis.
67. a. yes b. (196.88, 222.62) It appears that this population exceeds the national
pffiffiffiffiffiffiffiffi average in IQ.
^ ¼ s2 =Sx2 , s^ ¼ s= Sx2
69. c. VðbÞ i
i b
27. a. no, t ¼ .02 b. 58
d. Put the xi’s far ffiffiffiffiffiffiffiffi 0 to minimize sb^
pfrom c. n ¼ 20 total observations
^  ta=2;n1 s= Sx2 , (29.93, 30.15)
e. b i

73. a. .00985 b. .0578 29. a. Because t ¼ .50 < 1.895 ¼ t.05,7 do not reject H0.
pffiffiffi pffiffiffi b. .73
x  ðs= nÞt:025;n1;d ; x  ðs= nÞt:975;n1;d Þ
75. a. ð
b. (3.01, 4.46) 31. Because t ¼ 1.24 > 1.397 ¼ t.10,8, we do not have
evidence to question the prior belief.
77. a. 1/2n b. n/2n c. (n + 1)/2n, 1  (n + 1)/2n1,
(29.9, 39.3) with confidence level .9785 35. a. The distribution is fairly symmetric, without outliers.
b. Because t ¼ 4.25  3.499 ¼ t.005,7, there is strong
79. a. P(A1\A2) ¼ .952 b. P(A1\A2)  .90 evidence to say that the amount poured differs from
c. P(A1\A2)  1  a1  a2 ; P(A1\A2\ . . . \ Ak) the industry standard, and indeed bartenders tend to
 1  a1  a2  . . . ak exceed the standard.
c. Yes, the test in (b) depends on normality, and a normal
probability plot gives no reason to doubt the
Chapter 9 assumption.
d. .643, .185, .016
1. a. yes b. no c. no d. yes e. no f. yes 37. a. Do not reject H0: p ¼ .10 in favor of Ha: p > .10
5. H0: s ¼ .05 vs. Ha: s < .05. Type I error: Conclude that because z ¼ 1.33 < 1.645. Because the null
the standard deviation is less than .05 mm when it is hypothesis is not rejected, there could be a type II
really equal to .05 mm. Type II error: Conclude that the error.
standard deviation is .05 mm when it is really less than b. .49, .27. c. 362
.05. 39. a. Do not reject H0: p ¼ .02 in favor of Ha: p < .02
7. A type I error here involves saying that the plant is not in because z ¼ 1.1 > 1.645. There is no strong
compliance when in fact it is. A type II error occurs when evidence suggesting that the inventory be postponed.
we conclude that the plant is in compliance when in fact it b. .195. c. <.0000001.
isn’t. A government regulator might regard the type II 41. a. Reject H0 because z ¼ 3.08  2.58. b. .03
error as being more serious.
43. Using n ¼ 25, the probability of 5 or more leaky faucets
9. a. R1 is .0980 if p ¼ .10, and the probability of 4 or fewer leaky
b. A type I error involves saying that the two companies faucets is .0905 if p ¼ .3. Thus, the rejection region is
are not equally favored when they are. A type II error 5 or more, a ¼ .0980, and b ¼ .0905.
involves saying that the two companies are equally
favored when they are not. 45. a. reject b. reject c. do not reject d. reject
c. binomial, n ¼ 25, p ¼ .5; .0433 e. do not reject
d. .3, .4881; .4, .8452; .6, .8452; .7, .4881
e. If only 6 favor the first company, then reject the null 47. a. .0778 b. .1841 c. .0250 d. .0066 e. .5438
hypothesis and conclude that the first company is not 49. a. P ¼ .0403 b. P ¼ .0176 c. P ¼ .1304
preferred. d. P ¼ .6532 e. P ¼ .0021 f. P ¼ .000022
11. a. H0: m ¼ 10 vs. Ha: m 6¼ 10 b. .0099 51. Based on the given data, there is no reason to believe
c. .5319. .0076 d. c ¼ 2.58 e. c ¼ 1.96 that pregnant women differ from others in terms of true
f. x ¼ 10:02, so do not reject H0 average serum receptor concentration.
g. Recalibrate if z  2.58 or z  2.58
53. a. Because the P-value is .17, no modification is
13. b. .00043, .0000075, less than .01 indicated. b. 997
15. a. .0301 b. .0030 c. .0040 55. Because t ¼ 1.759 and the P-value ¼ .089, which is
17. a. Because z ¼ 2.56 > 2.33, reject H0 b. .84 less than .10, reject H0: m ¼ 3.0 against a two-tailed
alternative at the 10% level. However, the P-value
c. 142 d. .0052 exceeds .05, so do not reject H0 at the 5% level. There
19. a. Because z ¼ 2.27 > 2.58, do not reject H0
b. .22 c. 22
826 Chapter 10

is just a weak indication that the percentage is not equal to 91. a. For the test of H0: m ¼ m0 vs. Ha: m > m0 at level a,
3% (lower than 3%). reject H0 if 2Sxi/m0  w2a;2n
57. a. Test H0: m ¼ 10 vs. Ha: m < 10 For the test of H0: m ¼ m0 vs. Ha: m < m0 at level a,
b. Because the P-value is .017 < .05, reject H0, reject H0 if 2Sxi/m0  w21a;2n
suggesting that the pens do not meet specifications. For the test of H0: m ¼ m0 vs. Ha: m 6¼ m0 at level a,
c. Because the P-value is .045 > .01, do not reject H0, reject H0 if 2Sxi/m0  w2a=2;2n or
suggesting there is no reason to say the lifetime is if 2Sxi/m0  w21a=2;2n
inadequate. b. Because Sxi ¼ 737, the test statistic is 2Sxi/m0
d. Because the P-value is .0011, reject H0. There is good ¼ 19.65, which gives a P-value of .52. There is no
evidence showing that the pens do not meet reason to reject the null hypothesis.
specifications.
93. a. yes
61. a. 98, .85, .43, .004, .0000002
b. .40, .11, .0062, .0000003
c. Because the null hypothesis will be rejected with high
probability, even with only slight departure from the
Chapter 10
null hypothesis, it is not very useful to do a .01 level 1. a. .4; it doesn’t b. .0724, .269
test. c. Although the CLT implies that the distribution will be
63. b. 36.61 c. yes approximately normal when the sample sizes are each
100, the distribution will not necessarily be normal
65. a. Sxi  c b. yes when the sample sizes are each 10.
67. Yes, the test is UMP for the alternative Ha : y > .5 3. Do not reject H0 because z ¼ 1.76 < 2.33
because the tests for H0 : y ¼ .5 vs. Ha : y ¼ p0 all
have the same form for any p0 > .5. 5. a. Ha says that the average calorie output for sufferers is
more than 1 cal/cm2/min below that for non-sufferers.
69. b. .05 Reject H0 in favor of Ha because z ¼ 2.90  2.33
c. .04345, .05826; Because .04345 < .05, the test is not b. .0019 c. .819 d. .66
unbiased.
d. .05114; not most powerful 7. Yes, because z ¼ 1.83  1.645.

71. b. The value of the test statistic is 3.041, so the P-value is 9. a. x  y ¼ 6:2
.081, compared to .089 for Exercise 55. b. z ¼ 1.14, two-tailed P-value ¼ .25, so do not reject
the null hypothesis that the population means are
73. A sample size of 32 should suffice. equal.
c. No, the values are positive and the standard deviation
75. a. Test H0: m ¼ 2150pvs. ffiffiffi Ha: m > 2150 exceeds the mean.
b. t ¼ ð
x  2150Þ=ðs= nÞ c. 1.33 d. .101 d. 95% CI: (10.0, 29.8)
e. Do not reject H0 at the .05 level.
11. a. A 95% CI for the true difference, fast food mean – not
77. Because t ¼ .77 and the P-value is .23, there is no fast food mean is (219.6, 538.4)
evidence suggesting that coal increases the mean heat b. The one-tailed P-value is .014, so reject the null
flux. hypothesis of a 200-calorie difference at the .05
79. Conclude that activation time is too slow at the .05 level, level, and conclude that yes, there is strong evidence.
but not at the .01 level. 13. 22. No.
81. A normal probability plot gives no reason to doubt the 15. b. It increases.
normality assumption. Because the sample mean is 9.815,
giving t ¼ 4.75 and a (upper tail) P-value of .00007, 17. Because z ¼ 1.36, there is no reason to reject the
reject the null hypothesis at any reasonable level. The hypothesis of equal population means (p ¼ .17).
true average flame time is too high.
19. Because z ¼ .59, there is no reason to conclude that the
83. Assuming normality, calculate t ¼ 1.70, which gives a population mean is higher for the no-involvement group
two tailed P-value of .102. Do not reject the null (p ¼ .28).
hypothesis H0: m ¼ 1.75.
21. Because t ¼ 3.35  3.30 ¼ t.001,42, yes, there is
85. The P-value for a lower tail test is .0014 (normal evidence that experts do hit harder.
approximation, .0005), so it is reasonable to reject the
idea that p ¼ .75 and conclude that fewer than 75% of 23. b. No c. Because |t| ¼ |.38| < 2.228 ¼ t.025,10, no,
mechanics can identify the problem. there is no evidence of a difference.

87. Because t ¼ 6.43, giving an upper tail P-value of 25. Because the one-tailed P-value is .005  .01, conclude at
.0000002, conclude that the population mean time the .01 level that the difference is as stated.
exceeds 15 minutes. This could result in a type I error.

89. Because the P-value is .013 > .01, do not reject the null 27. Yes, because t ¼ 2.08 with P-value ¼ .046.
hypothesis at the .01 level. 29 b. (127.6, 202.0) c. 131.8
Chapter 10 827

31. Because t ¼ 1.82 with P-value .046  .05, conclude at # start with X in C1, Y in C2
the .05 level that the difference exceeds 1. let k3 ¼ N(c1)
qffiffiffiffiffiffiffiffiffiffi let k4 ¼ N(c2)
33. a. ðx  yÞ  ta=2;mþn2  sp m1 þ 1n sample k3 c1 c3;
b. (.24, 3.64) replace.
c. (.34, 3.74), which is wider because of the loss of a sample k4 c2 c4;
degree of freedom replace.
let k1 ¼ mean(c3)-mean(c4)
35. a. The slender distribution appears to have a lower mean
stack k1 c5 c5
and lower variance.
end
b. With t ¼ 1.88 and a P-value of .097, there is no
significant difference at the .05 level. 71. a. Here is a macro that can be executed 999 times in
37. With t ¼ 2.19 and a two-tailed P-value of .031, there is a MINITAB:
# start with X in C1, Y in C2
significant difference at the .05 level but not the .01 level.
let k3 ¼ N(c1)
39. With t ¼ 3.89 and one-tailed P-value ¼ .006, conclude let k4 ¼ N(c2)
at the 1% level that true average movement is less for the sample k3 c1 c3;
TightRope treatment. Normality is important, but the replace.
normal probability plot does not indicate a problem. sample k4 c2 c4;
replace.
41. a. The 95% confidence interval for the difference of let k2 ¼ medi(c3)-medi(c4)
means is (.000046, .000446), which has only positive stack k2 c6 c6
values. This omits 0 as a possibility, and says that the end
conventional mean is higher.
b. With t ¼ 2.68 and P-value ¼ .010, reject at the .05 73. a. (.593, 1.246)
level the hypothesis of equal means in favor of the b. Here is a macro that can be executed 999 times in
conventional mean being higher. MINITAB:
# start with X in C1, Y in C2
43. With t ¼ 1.87 and a P-value of .049, the difference is let k3 ¼ N(c1)
(barely) significantly greater than 5 at the .05 level. let k4 ¼ N(c2)
sample k3 c1 c3;
45. a. No b. 49.1 c. 49.1
replace.
47. 1 2 3 4 sample k4 c2 c4;
x 10 20 30 40 replace.
y 11 21 31 41 let k5 ¼ stdev(c3)/stdev(c4)
stack k5 c12 c12
end
49. a. Because |z| ¼ |4.84|  1.96, conclude that there is a
difference. Rural residents are more favorable to the 75. a. Because t ¼ 2.62 with a P-value of .018, conclude
increase. that the population means differ. At the 5% level,
b. .9967 blueberries are significantly better.
b. Here is a macro that can be executed repeatedly in
51. (.016, .171) MINITAB:
53. Because z ¼ 4.27 with P-value .000010, conclude that # start with data in C1, group var in C2
the radiation is beneficial. let k3 ¼ N(c1)
Sample k3 c1 c3.
55. a. H0: p3 ¼ p2, Ha: p3 > p2 unstack c3 c4 c5;
b. (X3  X2)/npffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi subs c2.
c. ðX3  X2 Þ= X2 þ X3 let k9 ¼ mean(c4)-mean(c5)
d. With z ¼ 2.67, P ¼ .004, reject H0 at the .01 level. stack k9 c6 c6
end
57. 769
77. a. Because f ¼ 4.46 with a two-tailed P-value of .122,
59. Because z ¼ 3.14 with P ¼ .002, reject H0 at the .01 there is no evidence of unequal population variances.
level. Conclude that lefties are more accident-prone. b. Here is a macro that can be executed repeatedly in
61. a. .0175 b. .1642 c. .0200 d. .0448 MINITAB:
e. .0035 let k1 ¼ n(C1)
Sample K1 c1 c3.
63. No, because f ¼ 1.814 < 6.72 ¼ F.01,9,7. unstack c3 c4 c5;
subs c2.
65. Because f ¼ 1.2219 with P ¼ .505, there is no reason to let k6 ¼ stdev(c4)/stdev(c5)
question the equality of population variances. stack k6 c6 c6
67. 8.10 end

69. a. (.158, .735) 79. a. A MINITAB macro is given in #75(b).


b. Here is a macro that can be executed 999 times in 81. a. (11.85, 6.40)
MINITAB: b. See Exercise 57(a) in Chapter 8.
828 Chapter 11

85. The difference is significant at the .05, .01, and .001 7. a. The Levene test gives f ¼ 1.47, P-value .236, so there
levels. is no reason to doubt equal variances.
b. Because f ¼ 10.48  4.02 ¼ F.01,4,30, there are
89. b. No, given that the 95% CI includes 0, the test at the significant differences among the means.
.05 level does not reject equality of means.
Source DF SS MS F P
91. (299.2, 1517.8)
Plate 4 43993 10998 10.48 0.000
93. (1020.2, 1339.9). Because 0 is not in the CI, we would
length
reject equality of means at the .01 level.
Error 30 31475 1049
95. Because t ¼ 2.61 and the one-tailed P-value is .007, the Total 34 75468
difference is significant at the .05 level using either a
one-tailed or a two-tailed test. 11. w ¼ 36.09 3 1 4 2 5
97. a. Because t ¼ 3.04 and the two-tailed P-value is .008, Splitting the paints into two groups, {3, 1, 4}, {2, 5},
the difference is significant at the .05 level. there are no significant differences within groups but the
b. No, the mean of the concentration distribution paints in the first group differ significantly (they are
depends on both the mean and standard deviation lower) from those in the second group.
of the log concentration distribution.
13. 3 1 4 2 5
99. Because t ¼ 7.50 and the one-tailed P-value is .0000001, 427.5 462.0 469.3 502.8 532.1
the difference is highly significant, assuming normality.
101. The two-sample t is inappropriate for paired data. The 15. w ¼ 5.92; At the 1% level the only significant
paired t gives a mean difference .3, t ¼ 2.67, and the differences are between formation 4 and the first two
two-tailed P-value is .045, so the means are significantly formations.
different at the .05 level. We are concluding tentatively 2 1 3 4
that the label understates the alcohol percentage. 24.69 26.08 29.95 33.84
103. Because paired t ¼ 3.88 and the two-tailed P-value is
.008, the difference is significant at the .05 and .01 17. (.029, .379)
levels, but not at the .001 level. 19. 426
105. Because z ¼ 2.63 and the two-tailed P-value is .009, 21. a. Because f ¼ 22.60  3.26 ¼ F.01,5,78, there are
there is a significant difference at the .01 level, significant differences among the means.
suggesting better survival at the higher temperature. b. (99.1, 35.7), (29.4, 99.1)
107. .902, .826, .029, .00000003 23. The nonsignificant differences are indicated by the
109. Because z ¼ 4.25 and the one-tailed P-value is .00001, underscores.
the difference is highly significant and companies 10 6 3 1
appear to discriminate. 45.5 50.85 55.40 58.28
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
111. With Z ¼ ðX  YÞ=  X=n þ Y=m  , the result is
z ¼ 5.33, two-tailed P-value ¼ .0000001, so one 25. a. Assume normality and equal variances.
should conclude that there is a significant difference in b. Because f ¼ 1.71 < 2.20 ¼ F.10,3,48, P-value ¼ .18,
parameters. there are no significant differences among the means.

113. (i) not bioequivalent (ii) not bioequivalent (iii) 27. a. Because f ¼ 3.75, P-value ¼ .028, there are
bioequivalent significant differences among the means.
b. Because the normal plot looks fairly straight and the
P-value for the Levene test is .68, there is no reason to
doubt the assumptions of normality and constant
Chapter 11 variance.
c. The only significant pairwise difference is between
1. a. Reject H0: m1 ¼ m2 ¼ m3 ¼ m4 ¼ m5 in favor of Ha: brands 1 and 4:
m1, m2, m3, m4, m5 not all the same, because 4 3 2 1
f ¼ 5.57  2.69 ¼ F.05,4,30. 5.82 6.35 7.50 8.27
b. Using Table A.9, .001 < P-value < .01. (The P-value
is .0018)
31. .63
3. Because f ¼ 6.43  2.95 ¼ F.05,3,28, there are pffiffiffiffiffiffiffi
significant differences among the means. 33. arcsinð x=nÞ

5. Because f ¼ 10.85  4.38 ¼ F.01,3,36, there are 35. a. Because f ¼ 1.55 < 3.26 ¼ F.05,4,12, there are no
significant differences among the means. significant differences among the means.
b. Because f ¼ 2.98 < 3.49 ¼ F.05,3,12, there are no
Source DF SS MS F P significant differences among the means.
Formation 3 509.1 169.7 10.85 0.000
37. With f ¼ 5.49  4.56 ¼ F.01,5,15, there are significant
Error 36 563.1 15.6 differences among the stimulus means. Although not all
Total 39 1072.3 differences are significant in the multiple comparisons
analysis, the means for combined stimuli were higher.
Chapter 11 829
39. With f = 2.56 < 2.61 = F.10,3,12, there are no significant differences among the angle means.

41. a. With f = 1.04 < 3.28 = F.05,2,34, there are no significant differences among the treatment means.

   Source      DF   SS        MS       F
   Treatment    2     28.78    14.39    1.04
   Block       17   2977.67   175.16   12.68
   Error       34    469.56    13.81
   Total       53   3476.00

    b. The very significant f for blocks, which shows that blocks differ strongly, implies that blocking was successful.

43. With f = 8.69 ≥ 6.01 = F.01,2,18, there are significant differences among the three treatment means. The normal plot of residuals shows no reason to doubt normality, and the plot of residuals against the fitted values shows no reason to doubt constant variance. There is no significant difference between treatments B and C, but Treatment A differs (it is lower) significantly from the others at the .01 level.
    Means: A 29.49  B 31.31  C 31.40

45. Because f = 8.87 ≥ 7.01 = F.01,4,8, reject the hypothesis that the variance for B is 0.

49. a.
   Source        df   SS         MS        F
   A              2    30763.0   15381.5   3.79
   B              3    34185.6   11395.2   2.81
   Interaction    6    43581.2    7263.5   1.79
   Error         24    97436.8    4059.9
   Total         35   205966.6

    b. Because 1.79 < 2.04 = F.10,6,24, there is no significant interaction.
    c. Because 3.79 ≥ 3.40 = F.05,2,24, there is a significant difference among the A means at the .05 level.
    d. Because 2.81 < 3.01 = F.05,3,24, there is no significant difference among the B means at the .05 level.
    e. Using w = 64.93,

       3        1         2
       3960.2   4010.88   4029.10

51. a. With f = 1.55 < 2.81 = F.10,2,12, there is no significant interaction at the .10 level.
    b. With f = 376.27 ≥ 18.64 = F.001,1,12, there is a significant difference between the formulation means at the .001 level. With f = 19.27 ≥ 12.97 = F.001,2,12, there is a significant difference among the speed means at the .001 level.
    c. Main effects: Formulation: (1) 11.19, (2) −11.19; Speed: (60) 1.99, (70) −5.03, (80) 3.04

53. Here is the ANOVA table:

   Source        DF   SS        MS        F      P
   Pen            3    1387.5    462.50   0.68   0.583
   Surface        2    2888.1   1444.04   2.11   0.164
   Interaction    6    8100.3   1350.04   1.97   0.149
   Error         12    8216.0    684.67
   Total         23   20591.8

   With f = 1.97 < 2.33 = F.10,6,12, there is no significant interaction at the .10 level. With f = .68 < 2.61 = F.10,3,12, there is no significant difference among the pen means at the .10 level. With f = 2.11 < 2.81 = F.10,2,12, there is no significant difference among the surface means at the .10 level.

57. a. F = MSAB/MSE
    b. A: F = MSA/MSAB   B: F = MSB/MSAB

59. a. Because f = 3.43 ≥ 2.61 = F.05,4,40, there is a significant difference among the exam means at the .05 level.
    b. Because f = 1.65 < 2.61 = F.05,4,40, there is no significant difference among the retention means at the .05 level.

61. a.
   Source   DF   SS      MS     F
   Diet      4    .929   .232   2.15
   Error    25   2.690   .108
   Total    29   3.619

   Because f = 2.15 < 2.76 = F.05,4,25, there is no significant difference among the diet means at the .05 level.
    b. (−.59, .92) Yes, the interval includes 0.
    c. .53

63. a. Test H0: μ1 = μ2 = μ3 versus Ha: the three means are not all the same. With f = 4.80 and F.05,2,16 = 3.63 < 4.80 < 6.23 = F.01,2,16, it follows that .01 < P-value < .05 (more precisely, P = .023). Reject H0 in favor of Ha at the 5% level but not at the 1% level.
    b. Only the first and third means differ significantly at the 5% level.

       1       2       3
       25.59   26.92   28.17

65. Because f = 1123 ≥ 4.07 = F.05,3,8, there are significant differences among the means at the .05 level. For Tukey multiple comparisons, w = 7.12:

       PCM     OCM     RM       PIM
       29.92   33.96   125.84   129.30

   The means split into two groups of two. The means within each group do not differ significantly, but the means in the top group differ strongly from the means in the bottom group.
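Several of the Tukey analyses above (answers 27, 49, and 65) rest on the quantity w = q·√(MSE/n). The sketch below is hedged: the MSE and per-treatment sample size are hypothetical placeholders rather than values from any exercise, and scipy.stats.studentized_range requires SciPy 1.7 or later.

```python
# Sketch of the Tukey honestly-significant-difference computation:
# w = q_{.05,k,df} * sqrt(MSE / n).
import math
from scipy.stats import studentized_range

k, df_error = 4, 8   # number of treatments and error df, as in answer 65
mse, n = 2.0, 3      # hypothetical MSE and common per-treatment sample size
q_crit = studentized_range.ppf(0.95, k, df_error)
w = q_crit * math.sqrt(mse / n)
print(f"q = {q_crit:.2f}, w = {w:.2f}")
# Sample means whose difference exceeds w are declared significantly different.
```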
67. The normal plot is reasonably straight, so there is no reason to doubt the normality assumption.

69.
   Source   DF   SS        MS        F
   A         1   322.667   322.667   980.5
   B         3    35.623    11.874    36.1
   AB        3     8.557     2.852     8.7
   Error    16     5.266      .329
   Total    23   372.113

   With f = 8.7 ≥ 3.24 = F.05,3,16, there is significant interaction at the .05 level. In the presence of significant interaction, main effects are not very useful.

Chapter 12

1. a. Temperature

      17 | 0
      17 | 23
      17 | 445
      17 | 67
      17 |             Stem: hundreds and tens
      18 | 0000011     Leaf: ones
      18 | 2222
      18 | 445
      18 | 6
      18 | 8

   The distribution is fairly symmetric and bell-shaped with a center around 180.

   Ratio

      0 | 889
      1 | 0000
      1 | 3
      1 | 4444
      1 | 66
      1 | 8889        Stem: ones;
      2 | 11          Leaf: tenths
      2 |
      2 | 5
      2 | 6
      2 |
      3 | 00

   The distribution is concentrated between 1 and 2, with some positive skewness.
   b. No, x does not determine y: for a given x there may be more than one y.
   c. No, there is a wide range of y values for a given x; for example, when temperature is 18.2 the ratio ranges from .9 to 2.68.

3. Yes. Yes.

5. b. Yes
   c. The relationship of y to x is roughly quadratic.

7. a. 5050 psi  b. 1.3 psi  c. 130 psi  d. 130 psi

9. a. .095 m³/min  b. .475 m³/min  c. .83 m³/min, 1.305 m³/min  d. .4207, .3446  e. .0036

11. a. .01 h, .10 h  b. 3.0 h, 2.5 h

13. a. ŷ = −.63 + .652x
    b. 23.46, 2.46
    c. 392, 5.72
    d. .956
    e. ŷ = 2.29 + .564x, r² = .688

15. a. ŷ = −15.2 + .0942x
    b. 1.906
    c. −1.006, 0.096, 0.034, 0.774
    d. .451

17. a. Yes
    b. slope, .827; intercept, −1.13
    c. 40.22
    d. 5.24
    e. .975

19. a. ŷ = 75.2 − .209x; 54.274
    b. The coefficient of determination is .791, meaning that the predictor accounts for 79.1% of the variation in y.
    c. The value of s is 2.56, so typical deviations from the regression line will be of this size.

21. b. ŷ = 2.18 + .660x
    c. 7.72
    d. 7.72

25. β̂0′ = 1.8β̂0 + 32; β̂1′ = 1.8β̂1

29. a. Subtracting x̄ from each xi shifts the plot x̄ units to the left. The slope is left unchanged, but the new y intercept is ȳ, the height of the old line at x = x̄.
    b. β̂0′ = Ȳ = β̂0 + β̂1x̄ and β̂1′ = β̂1

31. a. .00189
    b. .7101
    c. No, because here Σ(xi − x̄)² is 24,750, smaller than the value 70,000 in part (a), so V(β̂1) = σ²/Σ(xi − x̄)² is higher here.

33. a. (.51, 1.40)
    b. To test H0: β1 = 1 vs. Ha: β1 < 1, we compute t = −.2258 > −1.383 = −t.10,9, so there is no reason to reject the null hypothesis, even at the 10% level. There is no conflict between the data and the assertion that the slope is at least 1.

35. a. β̂1 = 1.536, and a 95% CI is (.632, 2.440).
    b. Yes, for the test of H0: β1 = 0 vs. Ha: β1 ≠ 0, we find t = 3.62, with P-value .0025. At the .01 level conclude that there is a useful linear relationship.
    c. Because 5 is beyond the range of the data, predicting at a dose of 5 might involve too much extrapolation.
    d. β̂1 = 1.683, and a 95% CI is (.531, 2.835). Eliminating the point causes only moderate change, so the point is not extremely influential.
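The fitted lines quoted in answers 13–21 come from the usual least squares formulas. Below is a self-contained sketch with hypothetical data (not any exercise's data set) showing the slope, intercept, and r² computations.

```python
# Sketch: least squares slope, intercept, and r^2 on hypothetical data.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical predictor values
y = np.array([1.1, 1.9, 2.7, 4.1, 4.9])   # hypothetical responses

sxx = np.sum((x - x.mean())**2)
sxy = np.sum((x - x.mean()) * (y - y.mean()))
b1 = sxy / sxx                    # slope estimate
b0 = y.mean() - b1 * x.mean()     # intercept estimate
resid = y - (b0 + b1 * x)
r2 = 1 - np.sum(resid**2) / np.sum((y - y.mean())**2)
print(f"yhat = {b0:.3f} + {b1:.3f}x, r^2 = {r2:.3f}")
```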
37. a. Yes, for the test of H0: β1 = 0 vs. Ha: β1 ≠ 0, we find t = 6.73, with P-value .00002. At the .01 level conclude that there is a useful linear relationship.
    b. (−2.77, −1.42)

43. No, z = .73 and the P-value is .46, so there is no evidence for a significant impact of age on kyphosis.

45. a. sŶ increases as the distance of x from x̄ increases.
    b. (2.26, 3.19)
    c. (1.34, 4.11)
    d. At least 90%

47. a. The regression equation is ŷ = 1.58 + 2.59x and R² = .838.
    b. A 95% confidence interval for the slope is (2.16, 3.01). In repetitions of the whole process of data collection and calculation of the interval, roughly 95% of the intervals will contain the true slope.
    c. When tannin = .6 the estimated mean astringency is −0.0335 and the 95% confidence interval is (−0.125, 0.058).
    d. When tannin = .6 the predicted astringency is −0.0335 and the 95% prediction interval is (−0.5582, 0.4912).
    e. Our null hypothesis is that true average astringency is 0 when tannin is .7, and the alternative is that the true average is positive. The t for this test is 4.61, with P-value = .000035, so yes, there is compelling evidence.

49. (431.2, 628.6)

51. a. Yes, for the test of H0: β1 = 0 vs. Ha: β1 ≠ 0, we find t = 10.62, with P-value .000014. At the .001 level conclude that there is a useful linear relationship.
    b. (8.24, 12.96) With 95% confidence, when the flow rate is increased by 1 SCCM, the associated expected change in etch rate is in the interval.
    c. (36.10, 40.41) This is fairly precise.
    d. (31.86, 44.65) This is much less precise than the interval in (c).
    e. Because 2.5 is closer to the mean, the intervals will be narrower.
    f. Because 6 is outside the range of the data, it is unknown whether the regression will apply there.
    g. Use a 99% CI at each value: (23.88, 31.43), (29.93, 35.98), (35.07, 41.45)

53. a. Yes
    b. Yes, for the test of H0: β1 = 0 vs. Ha: β1 ≠ 0, we find t = 4.39, with P-value < .001. At the .001 level conclude that there is a useful linear relationship.
    c. (403.6, 468.2)

57. a. r = .923, so x and y are strongly correlated.
    b. unaffected
    c. unaffected
    d. The normal plots seem consistent with normality, but the scatter plot shows a slight curvature.
    e. For the test of H0: ρ = 0 vs. Ha: ρ ≠ 0, we find t = 7.59, with P-value .00002. At the .001 level conclude that there is a useful linear relationship.

59. a. For the test of H0: ρ = 0 vs. Ha: ρ > 0, we find r = .760, t = 4.05, with P-value < .001. At the .001 level conclude that there is a positive correlation.
    b. Because r² = .578, we say that the regression accounts for 57.8% of the variation in endurance. This also applies to prediction of lactate level from endurance.

61. For the test of H0: ρ = 0 vs. Ha: ρ ≠ 0, we find r = .773, t = 2.44, with P-value .072. At the .05 level conclude that there is not a significant correlation. With such a small sample size, a high r is needed for significance.

63. a. Reject the null hypothesis in favor of the alternative.
    b. No, with a large sample size a small r can be significant.
    c. Because t = 2.200 ≥ 1.96 = t.025,9998, the correlation is statistically (but not necessarily practically) significant at the .05 level.

67. a. .184, −.238, −.426
    b. The mean that is subtracted is not the mean x̄(1,n−1) of x1, x2, …, xn−1, or the mean x̄(2,n) of x2, x3, …, xn. Also, the denominator of r1 is not √[Σ from i = 1 to n−1 of (xi − x̄(1,n−1))²] · √[Σ from i = 2 to n of (xi − x̄(2,n))²]. However, if n is large then r1 is approximately the same as the correlation. A similar relationship applies to r2.
    c. No
    d. After performing one test at the .05 level, doing more tests raises the probability of at least one type I error to more than .05.

69. The plot shows no reason for concern about using the simple linear regression model.

71. a. The simple linear regression model may not be a perfect fit because the plot shows some curvature.
    b. The plot of standardized residuals is very similar to the residual plot. The normal probability plot gives no reason to doubt normality.

73. a. For the test of H0: β1 = 0 vs. Ha: β1 ≠ 0, we find t = 10.97, with P-value .0004. At the .001 level conclude that there is a useful linear relationship.
    b. The residual plot shows curvature, so the linear relationship of part (a) is questionable.
    c. There are no extreme standardized residuals, and the plot of standardized residuals is similar to the plot of ordinary residuals.

75. The first data set seems appropriate for a straight-line model. The second data set shows a quadratic relationship, so the straight-line relationship is inappropriate. The third data set is linear except for an outlier, and removal of the outlier will allow a line to be fit. The fourth data set has only two values of x, so there is no way to tell if the relationship is linear.

77. a. To test for lack of fit, we find f = 3.30, with 3 numerator df and 10 denominator df, so the P-value is .079. At the .05 level we cannot conclude that the relationship is poor.
    b. The scatter plot shows that the relationship is not linear, in spite of (a). In this case, the plot is more sensitive than the test.

79. a. 77.3
    b. 40.4
    c. The coefficient β3 is the difference in sales caused by the window, all other things being equal.
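The confidence and prediction intervals in answers 45–51 follow the standard error formulas for a mean response and for a new observation. The sketch below uses hypothetical summary quantities throughout (n, x̄, Sxx, s, and the fitted coefficients are all made up for illustration).

```python
# Sketch: CI for the mean response and PI for a new observation at x*.
import math
from scipy.stats import t

n, xbar, sxx, s = 10, 2.0, 24.0, 0.5   # hypothetical n, mean of x, Sxx, s
b0, b1, xstar = 1.0, 2.0, 2.5          # hypothetical fit and new x value

yhat = b0 + b1 * xstar
tcrit = t.ppf(0.975, n - 2)
se_mean = s * math.sqrt(1/n + (xstar - xbar)**2 / sxx)      # mean response
se_pred = s * math.sqrt(1 + 1/n + (xstar - xbar)**2 / sxx)  # new observation
print(f"95% CI: ({yhat - tcrit*se_mean:.3f}, {yhat + tcrit*se_mean:.3f})")
print(f"95% PI: ({yhat - tcrit*se_pred:.3f}, {yhat + tcrit*se_pred:.3f})")
```

Both intervals are centered at ŷ; the extra 1 under the square root is why the prediction interval is always the wider of the two, as several answers above note.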
81. a. .686, no
    b. We find f = 28.6 ≥ 2.62 = F.001,16,186, so there is a significant relationship at the .001 level.
    c. With all other predictors held constant, the estimated difference in y between class A and not is .364. In terms of $/ft², the effect is multiplicative. Class A buildings are estimated to be worth 44% more dollars per square foot, with all other predictors held constant.
    d. The difference in (c) is highly significant because the two-tailed P-value is .00000013.

83. a. 48.31, 3.69
    b. No, because the interaction term will change.
    c. Yes, f = 18.92, P-value < .0001.
    d. Yes, t = 3.496, P-value = .003 ≤ .01
    e. (21.6, 41.6)
    f. There appear to be no problems with normality or curvature, but the variance may depend on x1.

85. a. No
    b. With f = 5.03 ≥ 3.69 = F.05,5,8, there is a significant relationship at the .05 level.
    c. Yes, the individual hypotheses deal with the issue of whether an individual predictor can be deleted, not the effectiveness of the whole model.
    d. 6.2, 3.3, (16.7, 31.9)
    e. With f = 3.44 < 4.07 = F.05,3,8, there is no reason to reject the null hypothesis, so the quadratic terms can be deleted.

87. a. The quadratic terms are important in providing a good fit to the data.
    b. A 95% PI is (.560, .771).

89. a. rRI = .843 (.000), rRA = .621 (.001), rIA = .843 (.000). Here the P-values are given in parentheses to three decimals.
    b. Rating = 2.24 + 0.0419 IBU − 0.166 ABV. Because the two predictors are highly correlated, one is redundant.
    c. Linearity is an issue.
    e. The regression is quite effective, with R² = .872. The ABV coefficient is not significant, so ABV is not needed. The highly significant positive coefficient for IBU and negative coefficient for its square show that Rating increases with IBU, but the rate of increase is lower at higher IBU.

91. a. X has rows (1, −1, −1), (1, −1, 1), (1, 1, −1), (1, 1, 1); y = (1, 1, 0, 4)′
    b. X′X = diag(4, 4, 4), X′y = (6, 2, 4)′, β̂ = (1.5, .5, 1)′
    c. ŷ = (0, 2, 1, 3)′, y − ŷ = (1, −1, −1, 1)′, SSE = 4, MSE = 4
    d. (−12.2, 13.2)
    e. For the test of H0: β1 = 0 vs. Ha: β1 ≠ 0, we find |t| = .5 < t.025,1 = 12.7, so do not reject H0 at the .05 level. The x1 term does not play a significant role.
    f.
       Source       DF   SS   MS    F
       Regression    2    5   2.5   0.625
       Error         1    4   4.0
       Total         3    9

       With f = .625 < 199.5 = F.05,2,1, there is no significant relationship at the .05 level.

93. β̂0 = ȳ; s = √[Σ(y − ȳ)²/(n − 1)]; c00 = 1/n; ȳ ± t.025,n−1·s/√n

95. a. β̂0 = (y1 + ⋯ + ym+n)/(m + n) = ȳ; β̂1 = (1/m)(y1 + ⋯ + ym) − (1/n)(ym+1 + ⋯ + ym+n) = ȳ1 − ȳ2
    b. ŷi = ȳ1, i = 1, …, m; ŷi = ȳ2, i = m + 1, …, m + n; SSE = Σ(yi − ȳ1)² (i = 1, …, m) + Σ(yi − ȳ2)² (i = m + 1, …, m + n); s = √[SSE/(m + n − 2)]; c11 = 4/(m + n)
    d. β̂0 = 128.17, β̂1 = −14.33; ŷi = 121, i = 1, …, 3; ŷi = 135.33, i = 4, …, 6; SSE = 116.67, s = 5.4006, c11 = 2/3; 95% CI for β1: (−26.58, −2.09)

97. Residual = Dep Var − Predicted Value; Std Error Residual = [MSE − (Std Error Predict)²]^½; Student Residual = Residual/Std Error Residual

101. a. Hij = 1/n + (xi − x̄)(xj − x̄)/Σ(xk − x̄)²; V(Ŷi) = σ²[1/n + (xi − x̄)²/Σ(xk − x̄)²]
     b. V(Yi − Ŷi) = σ²[1 − 1/n − (xi − x̄)²/Σ(xk − x̄)²]
     c. The variance of a predicted value is greater for an x that is farther from x̄.
     d. The variance of a residual is lower for an x that is farther from x̄.
     e. It is intuitive that the variance of prediction should be higher with increasing distance. However, points that are farther away tend to draw the line toward them, so the residual naturally has lower variance.

103. a. With f = 12.04 ≥ 9.55 = F.01,2,7, there is a significant relationship at the .01 level. To test H0: β1 = 0 vs. Ha: β1 ≠ 0, |t| = 2.96 ≥ t.025,7 = 2.36, so reject H0 at the .05 level. The foot term is needed. To test H0: β2 = 0 vs. Ha: β2 ≠ 0, |t| = 0.02 < t.025,7 = 2.36, so do not reject H0 at the .05 level. The height term is not needed.
     b. The highest leverage is .88 for the fifth point. The height for this student is given as 54 inches, too low to be correct for this group of students. Also, this value differs by 8″ from the wingspan, an extreme difference.
     c. Point 1 has leverage .55, and this student has height 75, foot length 13, both quite high. Point 2 has leverage .31, and this student has height 66 and foot length 8.5, at the low end. Point 7 has leverage .31, and this student has both height and foot length at the high end.
     d. Point 2 has the most extreme residual. This student has a height of 66″ and a wingspan of 56″, differing by 10″, so the extremely low wingspan is probably wrong.
     e. For this data set it would make sense to eliminate points 2 and 5 because they seem to be wrong. However, outliers are not always mistakes, and one needs to be careful about eliminating them.
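Because answer 91 supplies the full design matrix and response vector, its normal-equation arithmetic can be verified directly. The following sketch does exactly that with NumPy, using the values given in the answer.

```python
# Sketch: checking the matrix computations of answer 91.
import numpy as np

X = np.array([[1, -1, -1],
              [1, -1,  1],
              [1,  1, -1],
              [1,  1,  1]], dtype=float)
y = np.array([1.0, 1.0, 0.0, 4.0])

XtX = X.T @ X                      # diag(4, 4, 4)
Xty = X.T @ y                      # (6, 2, 4)
beta = np.linalg.solve(XtX, Xty)   # (1.5, 0.5, 1.0)
yhat = X @ beta                    # (0, 2, 1, 3)
sse = np.sum((y - yhat)**2)        # 4.0, so MSE = SSE/(4 - 3) = 4.0
print(beta, yhat, sse)
```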
105. a. .507%  b. .7122
     c. To test H0: β1 = 0 vs. Ha: β1 ≠ 0, we have t = 3.93, with P-value .0013. At the .01 level conclude that there is a useful linear relationship.
     d. (1.056, 1.275)
     e. ŷ = 1.014, y − ŷ = −.214

107. −36.18, (−64.43, −7.94)

109. No, if the relationship of y to x is linear, then the relationship of y² to x is quadratic.

111. a. Yes
     b. ŷ = 98.293, y − ŷ = −.117
     c. s = .155
     d. .794
     e. 95% CI for β1: (.0613, .0901)
     f. The new observation is an outlier and has a major impact: the equation of the line changes from ŷ = 97.50 + .0757x to ŷ = 97.28 + .1603x, s changes from .155 to .291, and r² changes from .794 to .616.

113. a. The paired t procedure gives t = 3.54 with a two-tailed P-value of .002, so at the .01 level we reject the hypothesis of equal means.
     b. The regression line is ŷ = 4.79 + .743x, and the test of H0: β1 = 0 vs. Ha: β1 ≠ 0 gives t = 7.41 with a P-value of < .000001, so there is a significant relationship. However, prediction is not perfect, with r² = .753, so one variable accounts for only 75% of the variability in the other.

117. a. linear
     b. After fitting a line to the data, the residuals show a lot of curvature.
     c. Yes. The residuals from the logged model show some departure from linearity, but the fit is good in terms of R² = .988. We find â = 411.98, b̂ = −.03333.
     d. (58.15, 104.18)

119. a. The plot suggests a quadratic model.
     b. With f = 25.08 and a P-value of < .0001, there is a significant relationship at the .0001 level.
     c. CI: (3282.3, 3581.3), PI: (2966.6, 3897.0). Of course, the PI is wider, as in simple linear regression, because it needs to include the variability of a new observation in addition to the variability of the mean.
     d. CI: (3257.6, 3565.6), PI: (2945.0, 3878.2). These are slightly wider than the intervals in (c), which is appropriate, given that 25 is slightly closer to the mean and the vertex.
     e. With t = 6.73 and a two-tailed P-value of < .0001, the quadratic term is significant at the .0001 level, so this term is definitely needed.

121. a. With f = 2.4 < 5.86 = F.05,15,4, there is no significant relationship at the .05 level.
     b. No, especially when k is large compared to n.
     c. .9565

Chapter 13

1. a. reject H0  b. do not reject H0  c. do not reject H0  d. do not reject H0

3. Do not reject H0 because χ² = 1.57 < 7.815 = χ².05,3.

5. Because χ² = 6.61 with P-value .68, do not reject H0.

7. Because χ² = 4.03 with P-value > .10, do not reject H0.

9. a. [0, .223), [.223, .510), [.510, .916), [.916, 1.609), [1.609, ∞)
   b. Because χ² = 1.25 with P-value > .10, do not reject H0.

11. a. (−∞, −.967), [−.967, −.431), [−.431, 0), [0, .431), [.431, .967), [.967, ∞)
    b. (−∞, .49806), [.49806, .49914), [.49914, .50), [.50, .50086), [.50086, .50194), [.50194, ∞)
    c. Because χ² = 5.53 with P-value > .10, do not reject H0.

13. Using p̂ = .0843, χ² = 280.3 with P-value < .001, so reject the independence model.

15. The likelihood is proportional to θ²³³(1 − θ)³⁶⁷, from which θ̂ = .3883. This gives estimated probabilities .1400, .3555, .3385, .1433, .0227 and expected counts 21.00, 53.32, 50.78, 21.49, 3.41. Because 3.41 < 5, combine the last two categories, giving χ² = 1.62 with P-value > .10. Do not reject the binomial model.

17. λ̂ = 3.167, which gives χ² = 103.9 with P-value < .001, so reject the assumption of a Poisson model.

19. θ̂1 = .4275, θ̂2 = .2750, which gives χ² = 29.3 with P-value < .001, so reject the model.

21. Yes, the test gives no reason to reject the null hypothesis of a normal distribution.

23. The P-values are both .243.

25. Let pi1 = the probability that a fruit given treatment i matures and pi2 = the probability that a fruit given treatment i aborts, so H0: p11 = p21 = p31 = p41 = p51. We find χ² = 24.82 with P-value < .001, so reject the null hypothesis and conclude that maturation is affected by leaf removal.

27. If pij denotes the probability of a type j response when treatment i is applied, then H0: p1j = p2j = p3j = p4j for j = 1, 2, 3, 4. With χ² = 27.66 ≥ 23.587 = χ².005,9, reject H0 at the .005 level. The treatment does affect the response.

29. With χ² = 64.65 ≥ 13.277 = χ².01,4, reject H0 at the .01 level. Political views are related to marijuana usage. In particular, liberals are more likely to be users.

31. Compute the expected counts by êijk = n·p̂i·p̂j·p̂k = n·(ni/n)(nj/n)(nk/n). For the χ² statistic, df = 20.

33. a. With χ² = .681 < 4.605 = χ².10,2, do not reject independence at the .10 level.
    b. With χ² = 6.81 ≥ 4.605 = χ².10,2, reject independence at the .10 level.
    c. 677

35. a. With χ² = 6.45 and P-value .040, reject independence at the .05 level.
    b. With z = −2.29 and P-value .022, reject independence at the .05 level.
    c. Because the logistic regression takes into account the order in the professorial ranks, it should be more sensitive, so it should give a lower P-value.
    d. There are few female professors but many assistant professors, and the assistant professors will be the professors of the future.
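The goodness-of-fit answers above (for example, 15–19) all follow the same pattern: form Σ(observed − expected)²/expected and lose one degree of freedom per estimated parameter. A sketch with hypothetical observed and expected counts (not values from any exercise):

```python
# Sketch: chi-squared goodness-of-fit with one estimated parameter.
import numpy as np
from scipy.stats import chi2

obs = np.array([24, 50, 52, 24])          # hypothetical, after combining cells
exp = np.array([21.0, 53.3, 50.8, 24.9])  # expected counts from a fitted model

chi_sq = np.sum((obs - exp)**2 / exp)
k, m = len(obs), 1            # number of cells, number of estimated parameters
df = k - 1 - m
p_value = chi2.sf(chi_sq, df)
print(f"chi-squared = {chi_sq:.2f}, df = {df}, P-value = {p_value:.3f}")
```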
37. With χ² = 13.005 ≥ 9.210 = χ².01,2, reject the null hypothesis of no effect at the .01 level. Oil does make a difference (more parasites).

39. a. H0: the population proportion of Late Game Leader Wins is the same for all four sports; Ha: the proportion of Late Game Leader Wins is not the same for all four sports. With χ² = 10.518 ≥ 7.815 = χ².05,3, reject the null hypothesis at level .05. Sports differ in terms of coming from behind late in the game.
    b. Yes (baseball)

41. With χ² = 197.6 ≥ 16.812 = χ².01,6, reject the null hypothesis at the .01 level. The aged are more likely to die in a chronic-care facility.

43. With χ² = .763 < 7.779 = χ².10,4, do not reject the hypothesis of independence at the .10 level. There is no evidence that age influences the need for item pricing.

45. a. No, χ² = 9.02 ≥ 7.815 = χ².05,3.
    b. With χ² = .157 < 6.251 = χ².10,3, there is no reason to say the model does not fit.

47. a. H0: p0 = p1 = ⋯ = p9 = .10 vs. Ha: at least one pi ≠ .10, with df = 9.
    b. H0: pij = .01 for i and j = 0, 1, 2, …, 9 vs. Ha: at least one pij ≠ .01, with df = 99.
    c. No, there must be more observations than cells to do a valid chi-square test.
    d. The results give no reason to reject randomness.

Chapter 14

1. For a two-tailed test of H0: μ = 100 at level .05, we find that s+ = 27, and because 14 < s+ < 64, we do not reject H0.

3. For a two-tailed test of H0: μ = 7.39 at level .05, we find that s+ = 18, and because s+ does not satisfy 21 < s+ < 84, we reject H0.

5. We form the differences and perform a two-tailed test of H0: μ = 0 at level .05. This gives s+ = 72, and because it does not satisfy 14 < s+ < 64, we reject H0 at the .05 level.

7. Because s+ = 162.5 with P-value .044, reject H0: μ = 75 in favor of Ha: μ > 75 at the .05 level.

9. With w = 38, reject H0 at the .05 level because the rejection region is {w ≥ 36}.

11. Test H0: μ1 − μ2 = 1 vs. Ha: μ1 − μ2 > 1. After subtracting 1 from the original process measurements, we get w = 65. Do not reject H0 because w < 84.

13. b. Test H0: μ1 − μ2 = 0 vs. Ha: μ1 − μ2 < 0. With a P-value of .002, we reject H0 at the .01 level.

15. With w = 135, z = −2.223, and the approximate P-value .026, we would not reject the null hypothesis at the .01 level.

17. (11.15, 23.80)

19. (−.585, .025)

21. (16, 87)

29. a. (.4736, .6669)
    b. (.4736, .6669)

33. For a two-tailed test at level .05, we find that s+ = 24, and because 4 < s+ < 32, we do not reject the hypothesis of equal means.

35. a. α = .0207; Bin(20, .5)
    b. c = 14; because y = 12, do not reject H0

37. With K = 20.12 ≥ 13.277 = χ².01,4, reject the null hypothesis of equal means at the 1% level. Axial strength does seem to depend on (increase with) plate length.

39. Because fr = 6.45 < 7.815 = χ².05,3, do not reject the null hypothesis of equal emotion means at the 5% level.

41. Because w′ = 26 < 27, do not reject the null hypothesis at the 5% level.
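The signed-rank tests in answers 1–7 can be reproduced with scipy.stats.wilcoxon, though SciPy reports the smaller of the two rank sums rather than s+ itself. A sketch on hypothetical data (not any exercise's sample):

```python
# Sketch: two-tailed Wilcoxon signed-rank test of H0: mu = 100.
import numpy as np
from scipy.stats import wilcoxon

x = np.array([99.0, 101.5, 103.2, 96.9, 104.1, 98.7, 102.3, 100.9])
mu0 = 100.0
stat, p = wilcoxon(x - mu0)   # signed-rank test on the differences
print(f"signed-rank statistic = {stat}, P-value = {p:.3f}")
# Note: the statistic is min(s+, s-); the tabled s+ used in the answers
# above is the sum of the ranks of the positive differences.
```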
Index

A Ansari–Bradley test, 786 Beta functions, incomplete, 207


Additive model Association, causation and, 251, 671 Bias-corrected and accelerated
for ANOVA, 584–6, 589 Asymptotic normal distribution, 298, interval, 415, 417, 538
for linear regression analysis, 624 371, 375, 377, 671 Bimodal histogram, 18, 19
for multiple regression Asymptotic relative efficiency, Binomial distribution
analysis, 682 764, 769 basics of, 128–135
Alternative hypothesis, 426 Autocorrelation coefficient, 674 Bayesian approach to, 777–780
Analysis of covariance, 699 Average multinomial distribution and, 240
Analysis of variance (ANOVA) definition of, 25 normal distribution and,
additive model for, 584–586, deviation, 33 189–190, 302
597 pairwise, 379, 772–773, 775 Poisson distribution and,
data transformation for, 579 rank, 785 147–149
definition of, 552 weighted (see Weighted average) Binomial experiment, 130–131, 134,
expected value in, 556, 573, 147, 240, 302, 724
589, 597 B Binomial random variable
fixed vs. random effects, 579 Bar graph, 9, 19 Bernoulli random variables and,
Friedman test, 785 Bartlett’s test, 562 134, 302
fundamental identity of, 560, Bayesian approach to inference, 758, cdf for, 132
564, 587, 599, 600, 635 776–782 definition of, 130
interaction model for, 597–606 Bayes’ Theorem, 79–81, 777, 780 distribution of, 132
Kruskal–Wallis test, 784 Bernoulli distribution, 104, 122, 134, expected value of, 134, 135
Levene test, 562–563 302 373, 375, 377, 777 in hypergeometric experiment,
linear regression and, 636, 639, Bernoulli random variable 141
664, 708, 717 binomial random variable and, in hypothesis testing, 428–431,
mean in, 553, 555, 557 134, 302 450–454
mixed effects model for, Cramér–Rao inequality for, 375 mean of, 134–135
593, 603 definition of, 98 moment generating function for,
multiple comparisons in, expected value, 113 135
564–571, 578, 589–590, 603 Fisher information on, 372–373, multinomial distribution of, 240
noncentrality parameter for, 377 in negative binomial experiment,
574, 582 Laplace’s rule of succession 142
notation for, 555, 559, 598 and, 782 normal approximation of,
power curves for, 574–575 mean of, 113 189–190, 302
randomized block experiments mle for, 377 pmf for, 132
and, 590–593 moment generating function for, and Poisson distribution,
regression identity of, 635–636 122, 123, 127 147–149
sample sizes in, 574–576 pmf of, 103 standard deviation of, 134
single-factor, 553–582 score function for, 372 unbiased estimation, 335, 337
two-factor, 582–608 in Wilcoxon’s signed-rank variance of, 134, 135
type I error in, 558–559 statistic, 314 Binomial theorem, 135, 142–144
type II error in, 574 Beta distribution, 206–208, 777 Bioequivalence tests, 551

835
836 Index

Birth process, pure, 378 in confidence intervals, Complement of an event, 53, 60


Bivariate data, 3, 617, 623, 632, 389–390, 410 Compound event, 52, 62
691, 721 critical values for, 317, 389, Concentration parameter, 779
Bivariate normal distribution, 409–410, 477, 725, 727, Conceptual population, 6, 113,
258–260, 310, 318, 477, 737–738 287, 487
667–671 definition of, 200 Conditional density, 253
Bonferroni confidence intervals, degrees of freedom for, 200, 315 Conditional distribution, 253–263,
424, 657–659, 689 exponential distribution 361, 369, 667, 735, 758, 777
Bootstrap procedure and, 317 Conditional mean, 255–262
for confidence intervals, F distribution and, 323–325 Conditional probability, 74–81,
411–418, 532–534 gamma distribution and, 200, 315 84–85, 200, 253–255, 362,
for paired data, 538–540 in goodness-of-fit tests, 720–751 365–366
for point estimates, 345–346 Rayleigh distribution and, 226 Conditional probability density
Bound on the error of estimation, 388 standard normal distribution function, 253
Box–Muller transformation, 271 and, 224, 316–317, 325 Conditional probability mass
Boxplot, 37–41 of sum of squares, 317, 557 function, 253, 255
comparative, 40–41 t distribution and, 320, 325 Conditional variance, 255–262, 367
Branching process, 281 in transformation, 224 Confidence bound, 398–399, 403,
Weibull distribution and, 231 440, 494, 500, 513
C Chi-squared random variable Confidence interval
Categorical data in ANOVA, 557 adjustment of, 400
classification of, 30 cdf for, 316 in ANOVA, 565, 570–571, 578,
graphs for, 19 expected value of, 315 589, 591, 603
in multiple regression analysis, in hypothesis testing, 482 based on t distribution, 401–404,
696–699 in likelihood ratio tests, 477, 480 499–501, 505, 513–515,
Pareto diagram, 24 mean of, 315 570–571, 643–646
sample proportion in, 30 moment generating function Bonferroni, 424, 657-659
Cauchy distribution of, 315 bootstrap procedure for,
mean of, 322, 342 pdf of, 200, 315 411–418, 538, 540, 532–534
median of, 342 standard normal random for a contrast, 571
minimal sufficiency for, 367 variables and, 224, for a correlation coefficient, 671
reciprocals and, 231 316–317, 325 vs. credibility interval, 777–781
standard normal distribution in Tukey’s procedure, 565 definition of, 382
and, 271 variance of, 315 derivation of, 389
uniform distribution and, 226 Chi-squared test for difference of means, 493–495,
variance of sample mean degrees of freedom in, 726, 734, 500–501, 505, 513–515,
for, 349 736, 745, 748 532–534, 539–540, 565–569,
Causation, association and, 251, 671 for goodness of fit, 724-730, 578, 589, 591, 603
cdf. See Cumulative distribution for homogeneity, 745–747 for difference of proportions, 524
function for independence, 747–749 distribution-free, 771–776
Cell counts/frequencies, 725–727, P-value for, 727–728 for exponential distribution
729–730, 732–740, 744–750 for specified distribution, parameter, 389
Cell probabilities, 729, 732, 737, 739 729–730 in linear regression, 643–646,
Censored experiments, 32, 343–344 z test and, 752 656–658
Census, 2 Class intervals, 15–17, 278, 293, for mean, 383–387, 392,
Central Limit Theorem 738–739 403–404, 411–415
basics of, 298–303 Coefficient of determination for median, 415–417
Law of Large Numbers and, 305 definition of, 632–634, 686 in multiple regression, 689, 712
proof of, 329–330 F ratio and, 687 one-sided, 398, 500, 513
sample proportion distribution in multiple regression, 686 for paired data, 513–515, 539
and, 190 sample correlation coefficient for ratio of variances, 530–531, 537
Wilcoxon rank-sum test and, 770 and, 664 sample size and, 388
Wilcoxon signed-rank test Coefficient of skewness, 121, Scheffé method for, 610
and, 765 128, 178 sign, 784
Central t distribution, 320–323, 423 Coefficient of variation, 45, 229, 357 for slope coefficient, 643
Chebyshev’s inequality, 120, 138, Cohort, 281 for standard deviation, 409–410
156, 194, 303, 345 Combination, 70–72 for variance, 409–410
Chi-squared distribution Comparative boxplot, 40–41, 502, width of, 385, 387–388, 394, 397,
censored experiment and, 421 503, 554 404, 417, 495
Index 837

Wilcoxon rank-sum, 774–776 paired data and, 515–516 for Studentized range
Wilcoxon signed-rank, 772-774 sample (see Sample correlation distribution, 565
Confidence level coefficient) for t distribution, 320, 390,
definition of, 382, 385–388 Covariance 500, 504
simultaneous, 565–570, 578, correlation coefficient and, 249 type II error and, 574
589, 591, 658 Cramér–Rao inequality and, Delta method, 174
in Tukey’s procedure, 565–570, 374–375 De Morgan’s laws, 56
578, 589, 591 definition of, 247 Density
Confidence set, 772 of independent random conditional, 253–257
Consistency, 304, 357, 375–377 variables, 250–251 curve, 160
Consistent estimator, 304, 357, of linear functions, 249 function (pdf), 160
375–377 matrix format for, 711 joint, 235
Contingency tables, two-way, Covariate, 699 marginal, 236
744–751 Cramér–Rao inequality, 374–375 scale, 17
Continuity correction, 189–190 Credibility interval, 777–782 Dependence, 84–88, 238–242, 250,
Continuous random variable(s) Critical values 257, 747
conditional pdf for, 254, 789 chi-squared, 317 Dependent events, 84-88
cumulative distribution function F, 324 Dependent variable, 614
of, 163–168 standard normal (z), 184 Descriptive statistics, 1–41
definition of, 99, 159 studentized range, 565 Deviation
vs. discrete random variable, 162 t, 322, 409 definition of, 33
expected value of, 171–172 tolerance, 406 minimize absolute deviations
joint pdf of (see Joint probability Cumulative distribution function principle, 33, 679
density functions) for a continuous random Dichotomous trials, 128
marginal pdf of, 236–238 variable, 163–168 Difference statistic, 347
mean of, 171, 172 for a discrete random variable, Discrete random variable(s)
moment generating of, 175–177 104–108 conditional pmf for, 253
pdf of (see Probability density inverse function of, 223–224 cumulative distribution function
function) joint, 282 of, 104–108
percentiles of, 166–168 of order statistics, 272–273 definition of, 99
standard deviation of, 173–175 pdf and, 163 expected value of, 112
transformation of, 220–225, percentiles and, 167 joint pmf of (see Joint probability
265–270 pmf and, 105–108 mass function)
variance of, 173–175 transformation and, 220–225 marginal pmf of, 234
Contrast of means, 570–571 Cumulative frequency, 24 mean of, 112
Convenience samples, 7 Cumulative relative frequency, 24 moment generating of, 122
Convergence pmf of (see Probability mass
in distribution, 153, 329 D function)
in mean square, 303 Data standard deviation of, 117
in probability, 304 bivariate, 3, 617, 632, 691 transformation of, 225
Convex function, 231 categorical (see Categorical data) variance of, 117
Correction factor, 141, 560, 568, censoring of, 32, 343–344 Disjoint events, 54
577, 582 characteristics of, 3 Dotplots, 12
Correction for the mean, 560 collection of, 7–8 Dummy variable, 696
Correlation coefficient definition of, 2 Dunnett’s method, 571
autocorrelation coefficient and, 674 multivariate, 3, 220
in bivariate normal distribution, qualitative, 19 E
258–260, 310, 667 univariate, 3 Efficiency, asymptotic relative,
confidence interval for, 671 Deductive reasoning, 6 764, 769
covariance and, 249 Degrees of freedom (df) Empirical rule, 187
Cramér–Rao inequality and, in ANOVA, 557–559, Erlang distribution, 202, 229
374–375 587, 599 Error(s)
definition of, 249, 663 for chi-squared distribution, estimated standard, 344, 646, 713
estimator for, 666 200, 315–320 estimation, 334
Fisher transformation, 669 in chi-squared tests, 726, 734, family vs. individual, 570
for independent random 737, 746 measurement, 179, 211, 337, 477
variables, 250 for F distribution, 323 prediction, 405, 658, 683
in linear regression, 664, 667, 669 in regression, 631, 685 rounding, 36
measurement error and, 328 sample variance and, 35 standard, 344, 713
838 Index

Error(s) (cont.) observational studies in, 488 Fisher–Irwin test, 525


type I, 429 paired data, 515 Fisher transformation, 669
type II, 429 paired vs. independent samples, Fitted values, 588, 629, 674
Estimated regression function, 520–521 Fixed effects model, 579, 592, 597
676, 685 randomized block, 590–593 Fourth spread, 37, 41, 285
Estimated regression line, 625 randomized controlled, 489 Frequency, 13
Estimated standard error, 344, repeated measures designs in, 591 Frequency distribution, 13
646, 713 with replacement, 69, 141, 287 Friedman’s test, 785
Estimator, 332 retrospective, 488 F test
Event(s) simulation, 291–294 in ANOVA, 558, 580, 587,
complement of, 53 Explanatory variable, 614 593, 600
compound, 52, 62 Exponential distribution Bartlett’s test and, 562
definition of, 52 censored experiments and, 343 coefficient of determination
dependent, 84–88 chi-squared distribution and, 687
disjoint, 54 and, 317 critical values for, 324, 528, 558
exhaustive, 79 confidence interval for distribution and, 323, 527, 558
independent, 84–88 parameter, 389 for equality of variances, 527, 537
indicator function for, 364 double, 477 expected mean squares and, 573,
intersection of, 53 estimators for parameter, 343, 351 589, 593, 600, 604
mutually exclusive, 54 goodness-of-fit test for, 739 Levene test and, 562
mutually independent, 87 mixed, 229 power curves and, 574–575
simple, 52 in pure birth process, 378 P-value for, 529, 537, 559
union of, 53 shifted, 360, 479 in regression, 687, 709
Venn diagrams for, 55 skew in, 277 sample sizes for, 574
Expected mean squares standard gamma distribution single-factor, 558, 580
in ANOVA, 573, 577, 600, 614 and, 198 vs. t test, 576
F test and, 589, 593, 600, 604 Weibull distribution and, 203 two-factor, 587, 593, 600
in mixed effects model, 593, 604 Exponential random variable(s) type II error in, 574
in random effects model, 580, Box–Muller transformation Full quadratic model, 695
593–594 and, 271
in regression, 681 cdf of, 199 G
Expected value expected value of, 198 Galton–Watson branching process,
conditional, 255 independence of, 242 281
of a continuous random mean of, 198 Gamma distribution
variable, 171 in order statistics, 272, 275 chi-squared distribution and, 200
covariance and, 247 pdf of, 198 definition of, 195
of a discrete random variable, 112 transformation of, 220, 267, 270 density function for, 195
of a function, 115, 245–246 variance of, 198 Erlang distribution and, 201
heavy-tailed distribution and, Exponential regression model, 721 estimators of parameters, 351,
114–115, 120 Exponential smoothing, 48 355, 358
of jointly distributed random Extreme outliers, 39–41 exponential distribution and,
variables, 245 Extreme value distribution, 217 198–200
Law of Large Numbers and, 303 Poisson distribution and, 783
of a linear combination, 306 F standard, 195
of mean squares (see Expected Factorial notation, 69 Weibull distribution and, 203
mean squares) Factorization theorem, 363 Gamma function
moment generating function Factors, 552 incomplete, 196, 217
and, 122, 175 Failure rate function, 230 properties of, 195
moments and, 121 Family of probability distributions, Gamma random variables, 195
in order statistics, 272–273, 277 104, 213 Geometric distribution, 143, 225
of sample mean, 277, 296 F distribution Geometric random variables, 143
of sample standard deviation, chi-squared distribution and, 323 Goodness-of-fit test
340, 379 definition of, 323 for composite hypotheses,
of sample total, 296 expected value of, 325 732, 741
of sample variance, 339 for model utility test, 649, 687, 709 definition of, 723
Experiment noncentral, 574–575 for homogeneity, 745–747
binomial, 128, 240, 724 pdf of, 324 for independence, 747–749
definition of, 52 Finite population correction factor, 141 simple, 724–730
double-blind, 523 Fisher information, 371 Grand mean, 555, 584
Index 839

H Intersection of events model utility test and, 721


Half-normal plot, 220 definition of, 53 in Neyman–Pearson theorem, 470
Histogram multiplication rule for probability significance level and, 470, 471
bimodal, 18 of, 77–79 sufficiency and, 380
class intervals in, 15–17 Invariance principle, 357 tests, 475
construction of, 12–20 Inverse matrix, 712 Limiting relative frequency, 58, 59
density, 17–18 Linear combination
multimodal, 19 J distribution of, 309
Pareto diagram, 24 Jacobian, 267 expected value of, 306
for pmf, 103 Jensen’s inequality, 231 independence in, 306
symmetric, 19 Joint cumulative distribution variance of, 307
unimodal, 18 function, 282 Linear probabilistic model, 617, 627
Hodges–Lehmann estimator, 379 Jointly distributed random variables Linear regression
Homogeneity, 745–747 bivariate normal distribution additive model for, 614, 682, 705
Hyperexponential distribution, 229 of, 258–260 ANOVA in, 649, 699, 768
Hypergeometric distribution, conditional distribution of, confidence intervals in, 643, 656
138–141 253–263 correlation coefficient in,
and binomial distribution, 141 correlation coefficients for, 249 662–671
Hypergeometric random variable, covariance between, 248 definition of, 617
138–141 expected value of function of, degrees of freedom in, 631,
Hypothesis 245–246 685, 708
alternative, 426 independence of, 238–239 least squares estimates in,
composite, 732–741, 744 linear combination of, 306–312 625–636, 679
definition of, 426 in order statistics, 274–276 likelihood ratio test in, 721
errors in testing of, 428–434 pdf of (see Joint probability mles in, 631, 639
notation for, 426 density functions) model utility test in, 648, 687, 708
null, 426 pmf of (see Joint probability mass parameters in, 617, 624–636, 682
research, 427 functions) percentage of explained variation
simple, 469 transformation of, 265–270 in, 633–634
Hypothetical population, 6 variance of function of, 252, 307 prediction interval in, 654, 658, 689
Joint marginal density function, 245 residuals in, 629, 674, 685
I Joint probability mass function, summary statistics in, 627
Inclusive inequalities, 136 233–234 sums of squares in, 631–636, 686
Incomplete beta function, 207 Joint probability table, 233 t ratio in, 648, 669, 690
Incomplete gamma function, Line graph, 102–103
196–197, 217 K Location parameter, 217, 367
Independence k-out-of-n system, 153 Logistic distribution, 279
chi-squared test for, 749 Kruskal–Wallis test, 784–785 Logistic regression model
conditional distribution and, k-tuple, 68–69 contingency tables for, 749–751
257–258 definition of, 620–622
correlation coefficient L fit of, 650–651
and, 250 lag 1 autocorrelation coefficient, 674 mles in, 650
covariance and, 250, 252 Laplace distribution, 478 in multiple regression analysis, 699
of events, 84–88 Laplace’s rule of succession, 782 Logit function, 621, 650
of jointly distributed random Largest extreme value distribution, 228 Lognormal distribution, 205–205, 233
variables, 238–239, 241 Law of Large Numbers, 303–304, Lognormal random variables, 205–206
in linear combinations, 306–307 322–323, 376
mutual, 87 Law of total probability, 79 M
pairwise, 90, 94 Least squares estimates, 626, 645, Mann–Whitney test, 766–770
in simple random sample, 287 679, 683–684 Marginal distribution, 234, 236, 253
Independent variable, 614 Level a test, 433 Marginal probability density
Indicator variables, 696 Level of a factor, 552, 583, 593 functions, 236
Inductive reasoning, 6 Levene test, 562–563 Marginal probability mass
Inferential statistics, 5–6 Leverages, 714–715 functions, 234
Inflection point, 180 Likelihood function, 354, 470, 475 Matrices in regression analysis,
Intensity function, 156 Likelihood ratio 705–715
Interaction, 597–602, 603–606, chi-squared statistic for, 477 Maximum likelihood estimator
693–698 definition of, 470 for Bernoulli parameter, 377
Intercept, 214, 617, 627 mle and, 475 for binomial parameter, 377
840 Index

Maximum likelihood estimator Minimize absolute deviations normal equations in, 683, 685,
(cont.) principle, 477, 679 705–708
Cramér–Rao inequality and, 375 Minimum variance unbiased parameters for, 682
data sufficiency for, 369 estimator, 341–343, 358, and polynomial regression,
Fisher information and, 371, 375 369, 375 691–693
for geometric distribution Mixed effects model, 593–603 prediction interval in, 689
parameter, 742 Mixed exponential distribution, 229 principle of least squares in,
in goodness-of-fit testing, 733 mle. See Maximum likelihood 683–706
in homogeneity test, 745 estimate residuals in, 685, 691, 688, 691,
in independence test, 748 Mode 708, 713
in likelihood ratio tests, 475 of a continuous distribution, squared multiple correlation in,
in linear regression, 631, 639 228, 229 686, 709
in logistic regression, 650 of a data set, 46 sum of squares in, 686, 708–710
sample size and, 357 of a discrete distribution, 156 t ratios in, 690, 712
score function and, 377 Model utility test, 647–649 Multiplication rule, 77–88
McNemar’s test, 526, 550 Moment generating function Multiplicative exponential
Mean of a Bernoulli rv, 122, 127 regression model, 721
of Cauchy distribution, 322, of a binomial rv, 135 Multiplicative power regression
342, 761 of a chi-squared rv, 315 model, 721
conditional, 255–257 CLT and, 329–330 Multivariate data, 3, 20
correction for the, 560 of a continuous rv, 175–177 Multivariate hypergeometric
deviations from the, 33, 206, 563, definition of, 122, 175 distribution, 244
631, 739 of a discrete rv, 122–127 Mutually exclusive events, 54, 79
of a function, 115, 245–246 of an exponential rv, 221 MVUE. See Minimum variance
vs. median, 28 of a gamma rv, 195 unbiased estimator
moments about, 121 of a linear combination, 311
outliers and, 27, 28 and moments, 124, 176 N
population, 26 of a negative binomial rv, 143 Negative binomial distribution,
regression to the, 260, 636 of a normal rv, 191 141–144
sample, 25 of a Poisson rv, 149 definition of, 141
of sample total, 296 of a sample mean, 329–330 estimation of parameters, 352, 738
See also Average uniqueness property of, 123, 176 Negative binomial random
Mean square Moments variable, 141
expected, 573, 589, 593, 594, definition of, 121 Newton’s binomial theorem, 143
600, 604 method of, 350–352, 358, 740 Neyman factorization theorem, 363
lack of fit, 681 and moment generating function, Neyman–Pearson theorem, 470–475
pure error, 681 124, 176 Noncentrality parameter, 423,
Mean square error Monotonic, 221, 353 574, 582
definition of, 335 Multimodal histogram, 19 Noncentral t distribution, 423
of an estimator, 335 Multinomial distribution, 240, 725 Nonhomogeneous Poisson
MVUE and, 341 Multinomial experiment, 240, 724 process, 156
sample size and, 337 Multiple regression Nonstandard normal distribution,
Measurement error, 337 additive model, 682, 705 185–188
Median categorical variables in, 696–699 Normal distribution
in boxplot, 37–38 coefficient of multiple asymptotic, 298, 371, 375, 377
of a distribution, 27, 28 determination, 686, 709 binomial distribution and,
as estimator, 378, 478 confidence intervals in, 712 189–190, 302
vs. mean, 28 covariance matrices in, 711–713 bivariate, 258–260, 310, 318,
outliers and, 26, 28, 29 degrees of freedom in, 685, 477, 677–671
population, 28 696, 708 confidence interval for mean of,
sample, 27, 271 diagnostic plots, 691 383–388, 392, 398, 403
statistic, 378 fitted values in, 685 continuity correction and, 189–190
Mendel’s law of inheritance, 726–728 F ratio in, 687, 709 density curves for, 180
M-estimator, 359, 381 interaction in models for, 693–698 and discrete random variables,
Midfourth, 46 leverages in, 714–715 188–190
Midrange, 333 logistic regression model, 699 goodness-of-fit test for, 730, 740
Mild outlier, 39, 393 in matrix/vector format, 705–715 of linear combination, 309
Minimal sufficient statistic, model utility test in, 687, lognormal distribution and,
366–367, 369 708–709 205, 303
Index 841

nonstandard, 185–188 estimator for a, 332–346 mean squared error of, 335
pdf for, 179 Fisher information on, 371–377 moments method, 350–352, 358
percentiles for, 182–188, 210 goodness-of-fit tests for, MVUE of, 340–342, 358, 369,
probability plot, 210, 740 728–729, 732–736 375
Ryan–Joiner test for, 747 hypothesis testing for, 427, 450 notation for, 332, 334
standard, 181 location, 217, 367 of a standard deviation and,
t distribution and, 320–322, 325, maximum likelihood estimate 286, 340
402 of, 354–359, 369 standard error of, 344–346
z table, 181–183 moment estimators for, 350–352 of a variance, 334, 339
Normal equations, 626, 683, 705 MVUE of, 341–343, 358, Point prediction, 405, 628, 684
Normal probability plot, 210, 740 369, 375 Poisson distribution
Normal random variable, 181 noncentrality, 574 Erlang distribution and, 202
Null distribution, 443–444, 760, 780 null value of, 427 expected value, 149, 152
Null hypothesis, 426 of a probability distribution, exponential distribution and, 199
Null set, 54, 57 103–104 gamma distribution and, 783
Null value, 427, 436 in regression, 617–618, 622, goodness-of-fit tests for, 736–738
624–636, 658, 666, 682 in hypothesis testing, 470–472,
O scale, 195, 203, 217–218, 365 474, 482, 550
Observational study, 488 shape, 217–218, 365 mode of, 156
Odds ratio, 621–622, 750–751 sufficient estimation of, 361–369 moment generating function
One-sided confidence interval, Pareto diagram, 24 for, 149
398–399 Pareto distribution, 170, 178, 226 nonhomogeneous, 156
Operating characteristic curve, 137 pdf. See Probability density function parameter of, 149
Ordered categories, 749–751 Percentiles and Poisson process, 149–151, 199
Ordered pairs, 66–67 for continuous random variables, variance, 149, 152
Order statistics, 271–278, 338, 166–168 Poisson process, 149–151, 194
365–367, 478 in hypothesis testing, 458, 740 Polynomial regression model,
sufficiency and, 365–367 in probability plots, 211–216, 740 691–693
Outliers sample, 29, 210–211, 216 Pooled t procedures
in a boxplot, 37–41 of standard normal distribution, and ANOVA, 477, 504–505, 576
definition of, 11 182–184, 211–216 vs. Wilcoxon rank-sum
extreme, 39–41 Permutation, 68, 69, 535–541 procedures, 769
leverage and, 714 Permutation test, 535–541 Posterior probability, 79–81,
mean and, 29, 415–417 PERT analysis, 207 777, 781
median and, 29, 37, 415, 417 Plot Power curves, 574–575
mild, 39 probability, 210–218, 369, 499, Power function of a test, 473–475,
in regression analysis, 679, 688 668, 676, 688, 691, 740 574–575
scatter, 615–617, 632–633, Power model for regression, 721
P 663, 667 Power of a test
Paired data pmf. See Probability mass function Neyman–Pearson theorem and,
in before/after experiments, Point estimate/estimator 473–475
511, 526 biased, 337–342 type II error and, 446–447,
bootstrap procedure for, 538–540 bias of, 335–340 472–476, 505, 593, 749
confidence interval for, 513–515 bootstrap techniques for, Precision, 315, 344, 371, 382,
definition of, 509 345–346, 411–418 387–388, 397, 405, 417, 514,
vs. independent samples, 515 bound on the error of 516, 592, 781
in McNemar’s test, 550 estimation of, 388 Prediction interval
permutation test for, 540–541 censoring and, 343–344 Bonferroni, 659
t test for, 511–513 consistency, 304, 357, 375–377 vs. confidence interval, 406,
in Wilcoxon signed-rank test, for correlation coefficient, 665–666 658–659, 690
762–763 and Cramér–Rao inequality, in linear regression, 654, 658–659
Pairwise average, 772, 773, 775 373–377 in multiple regression, 690
Pairwise independence, 94 definition of, 26, 287, 332 for normal distribution, 404–406
Parallel connection, 55, 88, 89, 90, efficiency of, 375 Prediction level, 405, 659, 689
272, 273 Fisher information on, 371–377 Predictor variable, 614, 682,
Parameter(s) least squares, 626–631 693–696
Bayesian approach to, 776–782 maximum likelihood (mle), Principle of least squares, 625–636,
concentration, 779 352–359 674, 679, 683
confidence interval for, 389, 394 of a mean, 26, 287, 332–333, 366 Prior probability, 79, 758
842 Index

Probability hyperexponential, 229 Randomized controlled


conditional, 74–81, 84–85, 200, hypergeometric, 138–141, experiment, 489
253–255, 362, 365–366 307–308 Randomized response technique, 349
continuous random variables joint, 232–283, 665–667, 732 Random variable
and, 99, 158–225, 235–242, Laplace, 315, 477–478 continuous, 158–231
253–255 of a linear combination, 259, definition of, 97
counting techniques for, 66–72 306–312 discrete, 96–157
definition of, 50 logistic, 279 jointly distributed, 232, 233–283
density function (see Probability lognormal, 205–206, 303 standardizing of, 185
density function) multinomial, 240, 724 types of, 99
of equally likely outcomes, 62–63 negative binomial, 141–144 Range
histogram, 103, 159–160, normal, 179–191, 205, 210–216, definition of, 33
188–190, 289–290 258–260, 297–303, 309, 730 in order statistics, 271–274
inferential statistics and, 6, 9, 284 parameter of a, 103–104 population, 394
Law of Large Numbers and, Pareto, 170, 178, 226 sample, 33, 271–274
303–304, 322–323 Poisson, 146–151, 199 Studentized, 565–566
law of total, 79 Rayleigh, 169, 226, 349, 360 Rank average, 785
mass function (see Probability of a sample mean, 285–294, Ratio statistic, 478
mass function) 296–304 Rayleigh distribution, 226,
of null event, 57 standard normal, 181–184 349, 360
plots, 210–218, 369, 499, 668, of a statistic, 285–304 Regression
676, 688, 691, 740 Studentized range, 565 coefficient, 640–651, 682–685,
posterior/prior, 79–81, 758, symmetric, 19, 28, 121, 168, 705–707, 711–712
777, 781 174, 180 effect, 260, 636
properties of, 56–63 t, 320–323, 325, 401–403, 443, function, 614, 676, 682, 685,
relative frequency and, 58–59, 462, 511 693, 696
291–292 uniform, 161–162, 164 line, 618–620, 624–636,
sample space and, 51–55, 56–57, Weibull, 202–205 640–647, 674–677
63, 66, 95 Probability mass function linear, 617–620, 624–636,
and Venn diagrams, 54–55, 62, conditional, 253–254 640–649, 654–659
75–76 definition of, 101–109 logistic, 620–622, 650–651
Probability density function (pdf) joint, 233–236 matrices for, 705–715
conditional, 254–255, 777 marginal, 234 to the mean, 260
definition of, 161 Product rules, 66–68 multiple, 682–689
joint, 232–278, 310, 354, Proportion multiplicative exponential
363–365, 368, 470, 475 population, 30, 395, 450–454, model, 721
marginal, 236–238, 268–269 519–525 multiplicative power model
vs. pmf, 162 sample, 30, 190, 302, 338, 519, for, 721
Probability distribution 748 plots for, 676–678
Bernoulli, 98, 102–104, 113, trimming, 29, 333, 340, 342–343 polynomial, 691–693
122–123, 127, 134, 302, 304, P-value quadratic, 691–693
308, 360, 373, 375, 377, 777 for chi-squared test, 727–728 through the origin, 381–421
beta, 206–208 definition of, 456 Rejection method, 281
binomial, 128–135, 147–149, for F tests, 529–530 Rejection region
189–190, 302, 352–353, for t tests, 462–465 cutoff value for, 428–433
395–396, 428–431 type I error and, 457–459 definition of, 428
bivariate normal, 258–260, for z tests, 459–461 lower-tailed, 431, 437–438
477, 669 in Neyman–Pearson theorem,
Cauchy, 226, 231, 271, 342 Q 470–474
chi-squared, 200, 224, 315–320 Quadratic regression model, 691–693 two-tailed, 438
conditional, 253–263 Qualitative data, 19 type I error and, 429
continuous, 99, 158–231 Quartiles, 28–29 in union-intersection test, 551
discrete, 96–157 upper-tailed, 429, 437–438
exponential, 198–200, 203, 343 R Relative frequency, 13–19, 30,
extreme value, 217–218 Random effects model, 579–580, 58–59
F, 323–325 593–594, 603–606 Repeated measures designs, 591
family, 104, 213, 216–218, 558 Random interval, 384–386 Replications, 58, 291–293, 386
gamma, 194–200, 217–218 Randomized block experiment, Research hypothesis, 427
geometric, 106–107, 114, 143, 225 590–593 Residual plots, 588, 602, 676–678
Index 843

Residuals
  in ANOVA, 588, 602
  definition of, 556
  leverages and, 714–715
  in linear regression, 629, 674–678
  in multiple regression, 685, 688
  standard error, 674
  standardizing of, 675, 691
  variance of, 675, 713
Response variable, 8, 614, 620
Retrospective study, 488
Ryan–Joiner test, 741

S
Sample
  convenience, 7
  definition of, 2
  outliers in, 38–40
  simple random, 7, 287
  size of (see Sample size)
  stratified, 7
Sample coefficient of variation, 45
Sample correlation coefficient
  in linear regression, 662–664, 669, 719
  vs. population correlation coefficient, 666, 669–671
  properties of, 664–665
  strength of relationship, 665
Sample mean
  definition of, 25
  population mean and, 296–304
  sampling distribution of, 296–304
Sample median
  definition of, 27
  in order statistics, 271–272
  vs. population median, 417
Sample moments, 350–351
Sample percentiles, 210–211
Sample proportion, 30, 335–336, 338, 391–400, 450–455, 519–526
Sample size
  in ANOVA, 574–576
  asymptotic relative efficiency and, 764, 769
  bound on the error of estimation and, 388
  Central Limit Theorem and, 302
  confidence intervals and, 387–388, 394, 396, 403, 495
  definition of, 9
  in finite population correction factor, 140
  for F test, 574–576
  for Levene test, 562–563
  mle and, 357–358, 375
  noncentrality parameter and, 574–576, 582
  Poisson distribution and, 147
  for population proportion, 396–398
  power and, 433, 440–441, 445, 452–454, 489, 505, 523
  probability plots and, 216
  in simple random sample, 287
  t distribution and, 445, 505
  type I error and, 433, 440–441, 445, 489, 523
  type II error and, 433, 440–441, 445, 452–454, 489, 505, 523
  variance and, 303
  z test and, 440–441, 452–453
Sample space
  definition of, 51
  probability of, 56–63
  Venn diagrams for, 54–55
Sample standard deviation
  in bootstrap procedure, 413, 537
  confidence bounds and, 398
  confidence intervals and, 392, 403
  definition of, 33
  as estimator, 340, 379
  expected value of, 340, 379
  independence of, 318–319
  mle and, 357
  population standard deviation and, 286, 340, 379
  sample mean and, 34, 318–319
  sampling distribution of, 288–289, 320, 340, 379, 482
  variance of, 482
Sample total, 296, 306, 560
Sample variance
  in ANOVA, 555–556
  calculation of, 35
  definition of, 33
  distribution of, 287–289, 320
  expected value of, 339
  population variance and, 35, 317, 322–323, 339
Sampling distribution
  bootstrap procedure and, 413, 532, 758
  definition of, 284, 287
  derivation of, 288–291
  of intercept coefficient, 719
  of mean, 288–290, 297–299
  permutation tests and, 758
  simulation experiments for, 291–294
  of slope coefficient, 640–649
Scale parameter, 195, 203–204, 217–218, 365
Scatter plot, 615–617
Scheffé method, 610
Score function, 373–377
Series connection, 272–273
Set theory, 53–55
Shape parameters, 217–218, 366
Siegel–Tukey test, 786
Significance
  practical, 468–469, 727
  statistical, 469, 489, 727
Significance level
  definition of, 433
  joint distribution and, 479
  likelihood ratio and, 475
  observed, 458
Sign interval, 784
Sign test, 784
Simple events, 52, 62, 66
Simple hypothesis, 469, 732
Simple random sample
  definition of, 7, 287
  independence in, 287
  sample size in, 287
Simulation experiment, 288, 291–294, 417, 463
Skewed data
  coefficient of skewness, 121, 178
  definition of, 19
  in histograms, 19, 413
  mean vs. median in, 28
  measure of, 121
  probability plot of, 216, 411–413
Slope, 617–618, 622, 626, 642, 644
Slope coefficient
  confidence interval for, 644
  definition of, 617–618
  hypothesis tests for, 648
  least squares estimate of, 626
  in logistic regression model, 622
Standard deviation
  normal distribution and, 179
  of point estimator, 344–346
  population, 117, 173
  of a random variable, 117, 173
  sample, 33
  z table and, 186
Standard error, 344–346
Standardized variable, 185
Standard normal distribution
  Cauchy distribution and, 271
  chi-squared distribution and, 316, 325
  critical values of, 184
  definition of, 181
  density curve properties for, 181–184
  F distribution and, 323, 325
  percentiles of, 182–184
  t distribution and, 320, 325
Standard normal random variable, 181, 325
Statistic, 286
Statistical hypothesis, 426
Stem-and-leaf display, 10–12
Step function, 106
Stratified samples, 7
Studentized range distribution, 565
Student t distribution, 320–323
Summary statistics, 627, 630, 645, 671
Sum of squares
  error, 557, 631, 708
  interaction, 599
  lack of fit, 681
  pure error, 681
  regression, 636, 699, 708
  total, 559–560, 587, 591, 645, 686
  treatment, 557–560
Symmetric distribution, 19, 121, 168

T
Taylor series, 174, 579
t confidence interval
  heavy tails and, 764, 769, 774
  in linear regression, 643, 656
  in multiple regression, 689, 712
  one-sample, 403–404
  paired, 513–515
  pooled, 505
  two-sample, 500, 515
t distribution
  central, 423
  chi-squared distribution and, 320, 325, 500, 504
  critical values of, 322, 402, 444, 461
  definition of, 320
  degrees of freedom in, 320–321, 401–402
  density curve properties for, 322, 402
  F distribution and, 325, 576
  noncentral, 423
  standard normal distribution and, 320, 322, 403
  Student, 320–323
Test statistic, 428
Time series, 48, 674
Tolerance interval, 406
Treatment, 553, 555–556, 583
Tree diagram, 67–68, 78, 81, 87
Trial, 128–131
Trimmed mean
  definition of, 28–29
  in order statistics, 271–272
  outliers and, 29
  as point estimator, 333, 340, 343
  population mean and, 340, 343
Trimming proportion, 29, 343
True regression function, 615
True regression line, 618–620, 625, 640–641
t test
  vs. F test, 576
  heavy tails and, 764, 769, 774
  likelihood ratio and, 475, 476
  in linear regression, 648
  in multiple regression, 688–690, 712
  one-sample, 443–445, 461, 474–476, 511, 769
  paired, 511
  pooled, 504–505, 576
  P-value for, 461–462
  two-sample, 499–504, 515, 576
  type I error and, 443–445, 501
  type II error and, 445–447, 505
  vs. Wilcoxon rank-sum test, 769
  vs. Wilcoxon signed-rank test, 763–764
Tukey’s procedure, 565–570, 578, 589–590, 603
Two one-sided tests, 551
Type I error
  definition of, 429
  Neyman–Pearson theorem and, 470
  power function of the test and, 473
  P-value and, 457–458
  sample size and, 441
  significance level and, 433
  vs. type II error, 433
Type II error
  definition of, 429
  vs. type I error, 433
Type II error probability
  in ANOVA, 574–576, 596
  degrees of freedom and, 516
  for F test, 574–576, 596
  in linear regression, 653
  Neyman–Pearson theorem and, 469–472
  power of the test and, 446, 473
  sample size and, 440, 468, 477–478, 495, 505
  in tests concerning means, 440, 445, 468, 489, 505
  in tests concerning proportions, 452–453, 522–524
  t test and, 445, 505
  vs. type I error probability, 433
  in Wilcoxon rank-sum test, 769
  in Wilcoxon signed-rank test, 763–764

U
Unbiased estimator, 337–344
  minimum variance, 340–343
Uncorrelated random variables, 251, 307
Uniform distribution
  beta distribution and, 778
  Box–Muller transformation and, 271
  definition of, 161
  discrete, 120
  transformation and, 223–224
Uniformly most powerful test, 473–474
Unimodal histogram, 18–19
Union-intersection test, 551
Union of events, 53
Univariate data, 3

V
Variable(s)
  covariate, 699
  in a data set, 10
  definition of, 3
  dependent, 614
  dummy, 696–699
  explanatory, 614
  independent, 614
  indicator, 696–699
  predictor, 614
  random, 96–231
  response, 614
Variance
  conditional, 255–257
  of a function, 118–119, 174–175, 328
  of a linear function, 118–120, 307
  population, 34–35, 117, 173
  precision and, 781
  of a random variable, 117, 173
  sample, 33–37
Venn diagram, 54–55, 62, 75, 76

W
Weibull distribution
  basics of, 202–205
  chi-squared distribution and, 231
  estimation of parameters, 356, 359–360
  extreme value distribution and, 217
  probability plot, 217–218
Weighted average, 112, 171, 261, 504, 779, 781
Weighted least squares estimates, 679
Wilcoxon rank-sum test, 766–769
Wilcoxon signed-rank test, 759–764
Z
z confidence interval
  for a correlation coefficient, 671
  for a difference between means, 493
  for a difference between proportions, 524
  for a mean, 387, 392
  for a proportion, 395
z curve
  area under, maximizing of, 479
  rejection region and, 438
  t curve and, 322, 402
z test
  chi-squared test and, 752
  for a correlation coefficient, 669
  for a difference between means, 485–493
  for a difference between proportions, 521
  for a mean, 438, 442
  for a Poisson parameter, 400, 482
  for a proportion, 451
  P-value for, 459–461
