Chapter 4
Associated with each independent variable in a multiple regression model is a coefficient. Since
these coefficients are unknown, we estimate them using data from a sample of size n. Each estimate
eliminates one degree of freedom available for estimating $\sigma^2$. Hence, if there are k independent
variables in the model plus the y-intercept, $\beta_0$, then there will be $n - (k + 1)$ degrees of freedom left
to estimate $\sigma^2$.
4.3
4.5
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$
b.
$R^2 = .08$ implies that 8% of the sample variation in frequency of marijuana use is explained by the
model.
c.
d.
e.
f.
Fail to reject $H_0: \beta_3 = 0$ since the p-value for oppositional-defiant and conduct disorder is
greater than .05.
a.
b.
$H_0: \beta_7 = 0$
$H_a: \beta_7 < 0$
The test statistic is $t = \dfrac{\hat{\beta}_7 - 0}{s_{\hat{\beta}_7}} = \dfrac{-.14 - 0}{.14} = -1.00$
The rejection region requires $\alpha = .05$ in the lower tail of the t distribution. From Table 2,
Appendix D, with df $= n - (k + 1) = 234 - (9 + 1) = 224$, $t_{.05} \approx 1.645$. The rejection region is
$t < -1.645$.
Since the observed value of the test statistic does not fall in the rejection region
($t = -1.00 \not< -1.645$), $H_0$ is not rejected. There is insufficient evidence to indicate that there
is a negative linear relationship between the number of runs scored and the number of times
caught stealing at $\alpha = .05$.
For confidence coefficient .95, $\alpha = .05$ and $\alpha / 2 = .05 / 2 = .025$. From Table 2, Appendix D,
with df $= n - (k + 1) = 234 - (9 + 1) = 224$, $t_{.025} \approx 1.96$. The 95% confidence interval is (1.412, 1.608).
We are 95% confident that for each additional home run, the mean number of runs scored will
increase by anywhere from 1.412 to 1.608, holding all the other variables constant.
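The one-tailed test and the confidence interval in this exercise follow the same recipe, which can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the critical values 1.645 and 1.96 are the table values quoted in the solution, the caught-stealing estimate is entered with a negative sign (the printed sign is ambiguous in this copy), and the home-run estimate 1.51 with standard error .05 is back-calculated from the reported interval (1.412, 1.608) rather than quoted from the exercise.

```python
def t_statistic(estimate, std_error, null_value=0.0):
    """t = (estimate - hypothesized value) / standard error."""
    return (estimate - null_value) / std_error


def confidence_interval(estimate, std_error, t_crit):
    """Two-sided interval: estimate +/- t_crit * standard error."""
    half_width = t_crit * std_error
    return (estimate - half_width, estimate + half_width)


# Lower-tailed test of H0: beta_7 = 0 against Ha: beta_7 < 0.
# Assumption: the estimate is -.14 (sign assumed), standard error .14.
t7 = t_statistic(-0.14, 0.14)
reject_h0 = t7 < -1.645  # rejection region t < -t_.05

# 95% interval for the home-run coefficient.
# Assumption: 1.51 and .05 are back-calculated from (1.412, 1.608).
low, high = confidence_interval(1.51, 0.05, 1.96)

print("t =", round(t7, 2), "reject H0:", reject_h0)
print("95% CI:", (round(low, 3), round(high, 3)))
```

Keeping the critical value as an explicit argument mirrors the table lookup used in the solution and avoids depending on a statistics library.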
4.7
a.
$\hat{\beta}_1 = 2.006$: for every 1-unit increase in the proportion of the block with low density ($x_1$),
population density is estimated to increase by about 2, holding $x_2$ constant; $\hat{\beta}_2 = 5.006$: for every 1-unit increase in
the proportion of the block with high density ($x_2$), population density is estimated to increase by about 5, holding $x_1$ constant.
b.
c.
$H_0: \beta_1 = \beta_2 = 0$
$H_a$: at least one of the coefficients is nonzero
d.
e.
4.9
$F = \dfrac{R^2 / k}{(1 - R^2) / [n - (k + 1)]} = \dfrac{.686 / 2}{(1 - .686) / [125 - (2 + 1)]} = 133.27$
Reject $H_0$ since $F_c = 133.27 > F = 4.79$ and conclude that at least one of the two independent variables is
useful in explaining the sample variation in population density.
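The global F statistic used here depends only on $R^2$, the sample size n, and the number of independent variables k. A minimal Python sketch of that formula follows; the function name `global_f` is hypothetical, and the critical value 4.79 is simply the one quoted in the solution.

```python
def global_f(r2, n, k):
    """Global F statistic: (R^2 / k) / ((1 - R^2) / (n - (k + 1)))."""
    return (r2 / k) / ((1.0 - r2) / (n - (k + 1)))


# Values from the population density model: R^2 = .686, n = 125, k = 2.
f_density = global_f(0.686, 125, 2)
reject_h0 = f_density > 4.79  # critical value quoted in the solution

print(round(f_density, 2), reject_h0)
```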
a.
b.
For every 1-mile increase in road length ($x_1$), the number of crashes increases by .109; for every
one-vehicle increase in AADT ($x_2$), the number of crashes increases by .00017.
c.
$.109 \pm .083 \Rightarrow (.026, .192)$. We are 99% confident that the true increase in crashes for each additional mile of
interstate highway is between .026 and .192.
d.
$.00017 \pm .00008 \Rightarrow (.00009, .00025)$. We are 99% confident that the true increase in crashes for each
additional vehicle in AADT is between .00009 and .00025.
e.
$\hat{y} = 1.20785 + .06343 x_1 + .00056 x_2$; for every 1-mile increase in road length ($x_1$), the number of
crashes increases by .063; for every one-vehicle increase in AADT ($x_2$), the number of crashes
increases by .00056; $.063 \pm .047$; $.00056 \pm .00031$.
4.11
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4$
b.
c.
$\hat{\beta}_2 = 557.91$. We estimate that the mean equivalent width will increase by 557.91 for each
additional increase of 1 unit of lineflux, all other variables held constant.
$\hat{\beta}_3 = -340.17$. We estimate that the mean equivalent width will decrease by 340.17 for each
additional increase of 1 unit of line luminosity, all other variables held constant.
$\hat{\beta}_4 = 85.68$. We estimate that the mean equivalent width will increase by 85.68 for each additional
increase of 1 unit of AB1450, all other variables held constant.
d.
Since the p-value is greater than $\alpha$ ($p = .238 > .05$), $H_0$ is not rejected. There is insufficient evidence to indicate that redshift is
a useful linear predictor of equivalent width at $\alpha = .05$.
4.13
e.
f.
$F = 51.72$; reject $H_0$.
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5$
b.
Predictor    Coef       SE Coef    T        P
Constant     13614.5    870.0      15.65    0.000
x1           0.08879    0.01391    6.38     0.000
x2           -9.201     1.499      -6.14    0.000
x3           14.394     3.461      4.16     0.000
x4           0.35       29.56      0.01     0.991
x5           -0.8480    0.4421     -1.92    0.060

S = 458.828   R-Sq = 92.4%   R-Sq(adj) = 91.7%

Analysis of Variance
Source            DF    SS           MS          F         P
Regression        5     155055273    31011055    147.30    0.000
Residual Error    61    12841935     210524
Total             66    167897208
c.
$\hat{\beta}_2 = -9.20$; we estimate that the mean heat rate will decrease by 9.20 kJ/kW-hr with every
one-degree Celsius increase in inlet temperature.
$\hat{\beta}_3 = 14.4$; we estimate that the mean heat rate will increase by 14.4 kJ/kW-hr with every
one-degree Celsius increase in exhaust temperature.
$\hat{\beta}_5 = -.85$; we estimate that the mean heat rate will decrease by .85 kJ/kW-hr with every one
kg/sec increase in mass air flow.
d.
$s = 458.8$; approximately 95% of the sample heat rates fall within $2s = 917.6$ kJ/kW-hr of the model-predicted values.
e.
$R_a^2 = .917$; after adjusting for the sample size and the number of parameters, 91.7% of the sample variation in heat rate is explained by the model.
f.
$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0$
$H_a$: At least one $\beta_i \neq 0$
$F = 147.30$, p-value $\approx 0 < \alpha = .01$. Yes, we reject $H_0$ and conclude that the model is useful
in predicting the heat rate.
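The summary quantities quoted in parts d through f can be recomputed from the Analysis of Variance rows of the printout. The following Python sketch does that arithmetic; the variable names are hypothetical, and the sums of squares and degrees of freedom are copied from the MINITAB output for this exercise.

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression = 155055273
ss_error = 12841935
ss_total = 167897208
df_regression = 5
df_error = 61
df_total = 66

ms_regression = ss_regression / df_regression  # mean square for regression
ms_error = ss_error / df_error                 # mean square for error (s^2)

f_stat = ms_regression / ms_error              # global F statistic
s = math.sqrt(ms_error)                        # estimated standard deviation
r2 = ss_regression / ss_total                  # multiple coefficient of determination
r2_adj = 1.0 - (df_total / df_error) * (1.0 - r2)  # adjusted R^2

print("F =", round(f_stat, 2))
print("s =", round(s, 3), " 2s =", round(2 * s, 1))
print("R-Sq =", round(100 * r2, 1), "%  R-Sq(adj) =", round(100 * r2_adj, 1), "%")
```

Because the mean squares are recomputed from unrounded sums of squares, the last digit of a result can differ slightly from a value derived from the rounded printout.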
4.15
a.
$H_0: \beta_1 = \beta_2 = \cdots = \beta_{18} = 0$
$H_a$: At least one of the coefficients is nonzero
Test statistic: $F = \dfrac{R^2 / k}{(1 - R^2) / [n - (k + 1)]} = \dfrac{.95 / 18}{.05 / 1} = 1.056$
$R_a^2 = 1 - \dfrac{n - 1}{n - (k + 1)} (1 - R^2) = 1 - \dfrac{20 - 1}{20 - (18 + 1)} (1 - .95) = .05$
After adjusting for the sample size and the number of parameters in the model, approximately
5% of the sample variation in IQ is "explained" by the model.
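The gap between $R^2 = .95$ and $R_a^2 = .05$ comes entirely from having only one error degree of freedom. A short Python sketch of the adjustment (the function name `adjusted_r2` is hypothetical) makes this easy to check for other values of n and k.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 = 1 - [(n - 1) / (n - (k + 1))] * (1 - R^2)."""
    return 1.0 - (n - 1) / (n - (k + 1)) * (1.0 - r2)


# Values from this exercise: R^2 = .95, n = 20 observations, k = 18 variables.
r2_adj = adjusted_r2(0.95, 20, 18)
print(round(r2_adj, 2))
```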
4.17
a.
$R^2 = .362$. 36.2% of the sample variation in active caring score is explained by the model.
b.
$H_0: \beta_1 = \beta_2 = \beta_3 = 0$
$H_a$: At least one $\beta_i \neq 0$, $i = 1, 2, 3$
The test statistic is $F = \dfrac{R^2 / k}{(1 - R^2) / [n - (k + 1)]} = \dfrac{.362 / 3}{(1 - .362) / [31 - (3 + 1)]} = 5.11$
The rejection region requires $\alpha = .05$ in the upper tail of the F distribution with $\nu_1 = k = 3$ and
$\nu_2 = n - (k + 1) = 31 - (3 + 1) = 27$. From Table 4, Appendix C, $F_{.05} = 2.96$. The rejection
region is $F > 2.96$.
4.21
a.
The confidence interval is ( 2.68, 5.82). With 95% confidence we can conclude that the mean
gosling weight change for all goslings with 40% digestion efficiency and 15% acid-detergent
fibre will fall between 2.68% and 5.82%.
b.
4.23
The prediction interval is (3.04, 11.54). With 95% confidence we can conclude that the actual
gosling weight change for a gosling with 40% digestion efficiency and 15% acid-detergent fibre
will fall between 3.04% and 11.54%.
According to the analysis on the MINITAB printout below we can predict the arsenic level for the
lowest latitude (23.755), the highest longitude (90.662), and the lowest depth (25.000).
Predicted Values for New Observations
New Obs    Fit       SE Fit    95% CI              95% PI
1          232.33    23.23     (186.63, 278.04)    (24.03, 440.64)X

New Obs    LATITUDE    LONGITUDE    DEPTH-FT
1          23.8        90.7         25.0
With 95% confidence we can predict that the arsenic level is between 24.03 and 440.64 when
latitude is at its lowest value (23.755), longitude is at its highest value (90.662), and depth is at its lowest value (25.0).
4.25
We want to find a 95% prediction interval for the actual voltage when the volume fraction of the
disperse phase is at the high level ( x1 = 80) , the salinity is at the low level ( x2 = 1) , and the amount
of surfactant is at the low level ( x5 = 2).
Using MINITAB, the output is:
The regression equation is
y = 0.933 - 0.0243 x1 + 0.142 x2 + 0.385 x5

Predictor    Coef         StDev       T        P
Constant     0.9326       0.2482      3.76     0.002
x1           -0.024272    0.004900    -4.95    0.000
x2           0.14206      0.07573     1.88     0.080
x5           0.38457      0.09801     3.92     0.001

S = 0.4796   R-Sq = 66.6%   R-Sq(adj) = 59.9%

Analysis of Variance
Source            DF    SS         MS        F       P
Regression        3     6.8701     2.2900    9.95    0.001
Residual Error    15    3.4509     0.2301
Total             18    10.3210

Predicted Values
Fit       StDev Fit    95.0% CI            95.0% PI
-0.098    0.232        (-0.592, 0.396)     (-1.233, 1.038)
The 95% prediction interval is $(-1.233, 1.038)$. We are 95% confident that the actual voltage is
between $-1.233$ and $1.038$ when the volume fraction of the disperse phase is at the high level
($x_1 = 80$), the salinity is at the low level ($x_2 = 1$), and the amount of surfactant is at the low
level ($x_5 = 2$).
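The fitted value and prediction interval reported by MINITAB can be checked by hand from the coefficient table. The Python sketch below does so under two assumptions that are not part of the solution: the helper name `predict_voltage` is hypothetical, and the critical value $t_{.025} = 2.131$ for 15 degrees of freedom is taken from a standard t table. The interval uses the usual form, fit $\pm\ t_{.025} \sqrt{s^2 + (\text{StDev Fit})^2}$.

```python
import math

# Coefficient estimates from the printout.
B0, B1, B2, B5 = 0.9326, -0.024272, 0.14206, 0.38457


def predict_voltage(x1, x2, x5):
    """Fitted value from the first-order model in x1, x2, and x5."""
    return B0 + B1 * x1 + B2 * x2 + B5 * x5


fit = predict_voltage(80, 1, 2)

s = 0.4796       # S from the printout
se_fit = 0.232   # StDev Fit from the printout
t_crit = 2.131   # assumption: t_.025 with 15 df, from a standard t table

half_width = t_crit * math.sqrt(s ** 2 + se_fit ** 2)
pi_low, pi_high = fit - half_width, fit + half_width

print("fit =", round(fit, 3))
print("95% PI =", (round(pi_low, 3), round(pi_high, 3)))
```

Small differences in the third decimal place relative to the printout are expected, since the inputs here are rounded.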
4.27
4.29
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$
b.
The linear relationship between number of defects and speed depends on blade position.
c.
$\beta_3 < 0$.
a.
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$
b.
Linear relationship between negative feelings score and number ahead in line depends on number
behind in line.
c.
Since the p-value > .25 is greater than any reasonable significance level, a test for the usefulness of the
interaction in the model ($H_0: \beta_3 = 0$ versus $H_a: \beta_3 \neq 0$) would not provide sufficient evidence to reject the null
hypothesis. Thus, the interaction would not be useful in the model.
d.
4.31
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_1 x_3 + \beta_5 x_2 x_3$
b.
Predictor    Coef       SE Coef    T        P
Constant     10845      67720      0.16     0.873
x1           -1280      1053       -1.22    0.225
x2           217.4      814.5      0.27     0.790
x3           -1549.2    985.6      -1.57    0.117
x1x3         -11.00     11.86      -0.93    0.355
x2x3         19.98      11.20      1.78     0.076

S = 103.072   R-Sq = 13.7%   R-Sq(adj) = 12.4%

Analysis of Variance
Source            DF     SS         MS        F        P
Regression        5      542303     108461    10.21    0.000
Residual Error    321    3410258    10624
Total             326    3952562
d.
e.
$H_0: \beta_4 = 0$ versus $H_a: \beta_4 \neq 0$; $t = -.93$
$H_0: \beta_5 = 0$ versus $H_a: \beta_5 \neq 0$; $t = 1.78$
Since the p-value for $\beta_4$ is $.355 > \alpha = .05$, the interaction between latitude and depth does not
have a significant effect on the arsenic level. Since the p-value for $\beta_5$ is $.076 > \alpha = .05$, the
interaction between longitude and depth does not have a significant effect on the arsenic level.
4.33
a.
Including the interaction terms implies that the relationship between voltage and volume
fraction of the disperse phase depends on the levels of salinity and surfactant concentration.
The slope depends on $x_2$ and $x_5$. A possible sketch of the relationship is:
[Figure: sketch with one line for each of $(x_2 = 4, x_5 = 2)$, $(x_2 = 4, x_5 = 4)$, $(x_2 = 1, x_5 = 2)$, and $(x_2 = 1, x_5 = 4)$]
b.
Analysis of Variance
Source     DF    Sum of Squares    Mean Square    F Value    Prob>F
Model      5     7.01028           1.40206        5.505      0.0061
Error      13    3.31073           0.25467
C Total    18    10.32101

Root MSE    0.50465     R-square    0.6792
Dep Mean    0.97684     Adj R-sq    0.5558
C.V.        51.66138

Parameter Estimates
Variable    DF    Parameter Estimate    Standard Error    T for H0: Parameter=0
INTERCEP    1     0.905732              0.28546326        3.173
X1          1     -0.022753             0.00831751        -2.736
X2          1     0.304719              0.23660006        1.288
X5          1     0.274741              0.22704807        1.210
X1X2        1     -0.002804             0.00378998        -0.740
X1X5        1     0.001579              0.00394692        0.400

Obs    X1    X2    X5    Dep Var Y    Predict Value    Std Err Predict    Lower95% Predict    Upper95% Predict    Residual
1      40    1     2     0.6400       0.8640           0.185              -0.2969             2.0248              -0.2240
2      80    1     4     0.8000       0.7701           0.309              -0.5082             2.0484              0.0299
3      40    4     4     3.2000       2.1174           0.264              0.8869              3.3480              1.0826
4      80    4     2     0.4800       0.2092           0.305              -1.0645             1.4828              0.2708
5      40    1     4     1.7200       1.5398           0.309              0.2617              2.8178              0.1802
6      80    1     2     0.3200       -0.0320          0.283              -1.2820             1.2181              0.3520
7      40    4     2     0.6400       1.4416           0.292              0.1824              2.7009              -0.8016
8      80    4     4     0.6800       1.0113           0.298              -0.2553             2.2779              -0.3313
9      40    1     2     0.1200       0.8640           0.185              -0.2969             2.0248              -0.7440
10     80    1     4     0.8800       0.7701           0.309              -0.5082             2.0484              0.1099
11     40    4     4     2.3200       2.1174           0.264              0.8869              3.3480              0.2026
12     80    4     2     0.4000       0.2092           0.305              -1.0645             1.4828              0.1908
13     40    1     4     1.0400       1.5398           0.309              0.2617              2.8178              -0.4998
14     80    1     2     0.1200       -0.0320          0.283              -1.2820             1.2181              0.1520
15     40    4     2     1.2800       1.4416           0.292              0.1824              2.7009              -0.1616
16     80    4     4     0.7200       1.0113           0.298              -0.2553             2.2779              -0.2913
17     0     0     0     1.0800       0.9057           0.285              -0.3468             2.1583              0.1743
18     0     0     0     1.0800       0.9057           0.285              -0.3468             2.1583              0.1743
19     0     0     0     1.0400       0.9057           0.285              -0.3468             2.1583              0.1343

Sum of Residuals              0
Sum of Squared Residuals      3.3107
Predicted Resid SS (Press)    6.5833
4.35
$\hat{\beta}_0 = .906$.
$\hat{\beta}_1 = -.023$.
For each unit increase in disperse phase volume, we estimate that the mean
voltage will decrease by .023 units, holding salinity and surfactant
concentration at 0.
$\hat{\beta}_2 = .305$.
For each unit increase in salinity, we estimate that the mean voltage will
increase by .305 units, holding disperse phase volume at 0 and holding
surfactant concentration constant.
$\hat{\beta}_3 = .275$.
For each unit increase in surfactant concentration, we estimate that the mean
voltage will increase by .275 units, holding disperse phase volume at 0 and
salinity constant.
$\hat{\beta}_4 = -.003$.
This estimates the difference in the slope of the relationship between voltage
and disperse phase volume for each unit increase in salinity, holding surfactant
concentration constant.
$\hat{\beta}_5 = .002$.
This estimates the difference in the slope of the relationship between voltage
and disperse phase volume for each unit increase in surfactant concentration,
holding salinity constant.
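These interpretations say that, in the interaction model, the slope of voltage against disperse phase volume ($x_1$) is $\hat{\beta}_1 + \hat{\beta}_4 x_2 + \hat{\beta}_5 x_5$ rather than a single number. A brief Python sketch, using the unrounded SAS estimates for X1, X1X2, and X1X5 and a hypothetical helper name, evaluates that slope at the design levels of salinity and surfactant.

```python
# Unrounded parameter estimates from the SAS printout.
B1 = -0.022753    # X1
B4 = -0.002804    # X1X2
B5 = 0.001579     # X1X5


def slope_in_x1(x2, x5):
    """Estimated change in mean voltage per unit of x1 at fixed x2 and x5."""
    return B1 + B4 * x2 + B5 * x5


for x2 in (1, 4):
    for x5 in (2, 4):
        print("x2 =", x2, "x5 =", x5, "slope =", round(slope_in_x1(x2, x5), 6))
```

As a consistency check, the predicted values at $x_1 = 40$ and $x_1 = 80$ with $x_2 = 1$, $x_5 = 2$ (0.8640 and $-0.0320$) differ by about 40 times the slope computed for that setting.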
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2$
b.
c.
Negative
4.37
a.
[Figure: scatterplot of ENE versus body weight]
Yes, there appears to be a quadratic relationship between ENE and body weight.
b.
$H_0: \beta_2 = 0$
$H_a: \beta_2 \neq 0$
The test statistic is $t = 2.69$.
The p-value is $p = .031$. Since the p-value is less than $\alpha$ ($p = .031 < .10$), $H_0$ is rejected. There
is sufficient evidence to indicate that body weight and ENE are quadratically related at $\alpha = .10$.
4.39
a.
b.
From the plot, $\beta_2$ would be positive because the points appear to form an upward curve.
c.
This value of $r^2$ applies to the linear model, not the quadratic model. The linear model is:
$E(y) = \beta_0 + \beta_1 x_1$
4.41
[Figure: scatterplot of SPRate (vertical axis) versus Time (horizontal axis)]
Curvilinear relationship
b.
Predictor    Coef       SE Coef    T        P
Constant     1.00705    0.07899    12.75    0.000
Time         -1.1671    0.1219     -9.57    0.000
sqrtime      0.28975    0.03937    7.36     0.000

S = 0.101142   R-Sq = 92.7%   R-Sq(adj) = 91.4%

Analysis of Variance
Source            DF    SS         MS         F        P
Regression        2     1.54782    0.77391    75.65    0.000
Residual Error    12    0.12276    0.01023
Total             14    1.67057

Source     DF    Seq SS
Time       1     0.99365
sqrtime    1     0.55416
c.
Yes, there is evidence of upward curvature: for testing $H_a: \beta_2 > 0$, $t = 7.36$ and the p-value (.000)
is less than $\alpha = .05$.
4.43
a.
Let $x_1$ and $x_2$ represent the two quantitative independent variables. A first-order linear model
includes only the first-order terms for the variables:
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
b.
Let $x_1$, $x_2$, $x_3$, and $x_4$ represent the four quantitative independent variables. A first-order linear
model includes only the first-order terms for the variables:
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4$
4.45
a.
$E(y) = \beta_0 + \beta_1 x_1$, where $x_1 = 1$ if A, $0$ if B
b.
For a qualitative independent variable with four levels (A, B, C, and D), the model requires three
dummy variables as shown below. (We have arbitrarily selected level D as the "base" level.)
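$x_1 = 1$ if level A, 0 if not;  $x_2 = 1$ if level B, 0 if not;  $x_3 = 1$ if level C, 0 if not
Then the model is written $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$
The following table shows the values of the dummy variables and the mean response $E(y)$ for
each of the four levels.
Level    x1    x2    x3    E(y)
A        1     0     0     $\beta_0 + \beta_1 = \mu_A$
B        0     1     0     $\beta_0 + \beta_2 = \mu_B$
C        0     0     1     $\beta_0 + \beta_3 = \mu_C$
D        0     0     0     $\beta_0 = \mu_D$
$\beta_0 = \mu_D$, $\beta_1 = \mu_A - \mu_D$, $\beta_2 = \mu_B - \mu_D$,
and $\beta_3 = \mu_C - \mu_D$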
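The correspondence between the dummy-variable coding and the level means can be sketched in Python. Everything numeric below is hypothetical: the four means are made-up illustration values, and the function names are not from the text. The sketch only shows that, with level D as the base, $\beta_0 + \beta_j x_j$ reproduces each level mean.

```python
LEVELS = ("A", "B", "C", "D")  # D is the base level, as in the solution


def dummy_code(level):
    """Return (x1, x2, x3) for a level of the four-level qualitative variable."""
    if level not in LEVELS:
        raise ValueError("unknown level: " + str(level))
    return tuple(1 if level == name else 0 for name in ("A", "B", "C"))


def betas_from_means(means):
    """beta_0 is the base-level mean; the others are differences from it."""
    base = means["D"]
    return (base, means["A"] - base, means["B"] - base, means["C"] - base)


def mean_response(level, betas):
    """E(y) = beta_0 + beta_1*x1 + beta_2*x2 + beta_3*x3."""
    b0, b1, b2, b3 = betas
    x1, x2, x3 = dummy_code(level)
    return b0 + b1 * x1 + b2 * x2 + b3 * x3


# Hypothetical level means, for illustration only.
hypothetical_means = {"A": 10.0, "B": 12.0, "C": 9.0, "D": 7.0}
betas = betas_from_means(hypothetical_means)

for level in LEVELS:
    print(level, dummy_code(level), mean_response(level, betas))
```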
4.47
a.
c.
4.49
a.
For x1 = 1, E ( y ) = 4
For x1 = 2, E ( y ) = 11
For x1 = 1, E ( y ) = 2
For x1 = 2, E ( y ) = 7
2
For x1 = 0, E ( y ) = 1
For x1 = 1, E ( y ) = 5
For x1 = 2, E ( y ) = 13
For x1 = 1, E ( y ) = 1
For x1 = 2, E ( y ) = 5
2
For x1 = 1, E ( y ) = 8
For x1 = 2, E ( y ) = 17
For x1 = 1, E ( y ) = 2
For x1 = 2, E ( y ) =5
b.
The curves are second-order because they contain a squared term ($x_1^2$).
c.
Different shapes.
d.
e.
It changes the shape of the curve for different x2 values by shifting the curve along the x1 axis.
4.51
a.
$x_1 = 1$ if manual, 0 if not;  $x_2 = 1$ if clay, 0 if not;  $x_3 = 1$ if gravel, 0 if not;
$x_4 = 1$ if East, 0 if not;  $x_5 = 1$ if South, 0 if not;  $x_6 = 1$ if West, 0 if not;  $x_7 = 1$ if Southeast, 0 if not
b.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 (0) + \beta_3 (0) + \beta_4 (0) + \beta_5 (0) + \beta_6 (0) + \beta_7 (0)$ if all other factors besides
manual picking are zero. Thus when
$x_1 = 1$, $E(y) = \beta_0 + \beta_1$; $\beta_0 = \mu_{Auto}$, $\beta_1 = \mu_{Manual} - \mu_{Auto}$
c.
$E(y) = \beta_0 + \beta_1 (0) + \beta_2 (1) + \beta_3 (0) + \beta_4 (0) + \beta_5 (0) + \beta_6 (0) + \beta_7 (0)$ if all other factors
besides clay soil type are zero. Thus when $x_2 = 1$, $E(y) = \beta_0 + \beta_2$, which is the mean wine
quality for grapes grown in clay soil. Similarly, when $x_3 = 1$ and all other factors are zero,
$E(y) = \beta_0 + \beta_1 (0) + \beta_2 (0) + \beta_3 (1) + \beta_4 (0) + \beta_5 (0) + \beta_6 (0) + \beta_7 (0) = \beta_0 + \beta_3$,
which is the mean wine quality for grapes grown in gravel. Thus,
$E(y) = \beta_0 + \beta_1 x_2 + \beta_2 x_3$; $\beta_0 = \mu_{Sand}$, $\beta_1 = \mu_{Clay} - \mu_{Sand}$; $\beta_2 = \mu_{Gravel} - \mu_{Sand}$
d.
4.53
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ where $\beta_0 = \mu_{NOHELP}$,
$x_1 = 1$ if FullSolution, 0 otherwise, and $x_2 = 1$ if CheckFigures, 0 otherwise
b.
1.
c.
$H_0: \beta_1 = \beta_2 = 0$
$H_a$: at least one $\beta_i \neq 0$
$F = \dfrac{R^2 / k}{(1 - R^2) / [n - (k + 1)]} = \dfrac{.012 / 2}{(1 - .012) / [75 - 3]} = \dfrac{.006}{.01317} = .45$
p-value $= .637 > \alpha = .05$. We do not reject the null hypothesis; there is insufficient evidence that the model is
useful for predicting the knowledge gained.
4.55
a.
$H_0: \beta_1 = \beta_2 = \cdots = \beta_{12} = 0$
$H_a$: at least one $\beta_i \neq 0$
$F = \dfrac{R^2 / k}{(1 - R^2) / [n - (k + 1)]} = \dfrac{.705 / 12}{(1 - .705) / [148 - 13]} = \dfrac{.05875}{.00219} = 26.83$
With df $\nu_1 = 12$ and $\nu_2 = 135$, $F_{critical} \approx 1.75 < F_c = 26.83$; reject $H_0$ and conclude that the overall
model is useful in predicting the card price.
4.57
b.
Fail to reject $H_0: \beta_1 = 0$.
c.
Reject $H_0: \beta_3 = 0$.
d.
a.
Model 1:
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_1 - 0}{s_{\hat{\beta}_1}} = \dfrac{.0354}{.0137} = 2.58$.
Since no $\alpha$ was given, we will use $\alpha = .05$. The rejection region requires $\alpha / 2 = .05 / 2 = .025$
in each tail of the t distribution. From Table 2, Appendix D, with
df $= n - (k + 1) = 12 - (1 + 1) = 10$, $t_{.025} = 2.228$. The rejection region is $t < -2.228$ or
$t > 2.228$.
Since the observed value of the test statistic falls in the rejection region ($t = 2.58 > 2.228$), $H_0$
is rejected. There is sufficient evidence to indicate that there is a linear relationship between
vintage and the logarithm of price.
Model 2:
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_1 - 0}{s_{\hat{\beta}_1}} = \dfrac{.0238}{.00717} = 3.32$
Since no $\alpha$ was given, we will use $\alpha = .05$. The rejection region requires
$\alpha / 2 = .05 / 2 = .025$ in each tail of the t distribution. From Table 6, Appendix A, with
df $= n - (k + 1) = 12 - (4 + 1) = 7$, $t_{.025} = 2.365$. The rejection region is $t < -2.365$ or
$t > 2.365$.
Since the observed value of the test statistic falls in the rejection region ($t = 3.32 > 2.365$), $H_0$ is
rejected. There is sufficient evidence to indicate that there is a linear relationship between
vintage and the logarithm of price, adjusting for all other variables.
$H_0: \beta_2 = 0$
$H_a: \beta_2 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_2 - 0}{s_{\hat{\beta}_2}} = \dfrac{.616}{.0952} = 6.47$
$H_0: \beta_3 = 0$
$H_a: \beta_3 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_3 - 0}{s_{\hat{\beta}_3}} = \dfrac{-.00386}{.00081} = -4.77$
$H_0: \beta_4 = 0$
$H_a: \beta_4 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_4 - 0}{s_{\hat{\beta}_4}} = \dfrac{.0001173}{.000482} = 0.24$.
Model 3:
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_1 - 0}{s_{\hat{\beta}_1}} = \dfrac{.0240}{.00747} = 3.21$
Since no $\alpha$ was given, we will use $\alpha = .05$. The rejection region requires $\alpha / 2 = .05 / 2 = .025$
in each tail of the t distribution. From Table 6, Appendix A, with
df $= n - (k + 1) = 12 - (5 + 1) = 6$, $t_{.025} = 2.447$. The rejection region is $t < -2.447$ or
$t > 2.447$.
Since the observed value of the test statistic falls in the rejection region ($t = 3.21 > 2.447$), $H_0$ is
rejected. There is sufficient evidence to indicate that there is a linear relationship between
vintage and the logarithm of price, adjusting for all other variables.
$H_0: \beta_2 = 0$
$H_a: \beta_2 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_2 - 0}{s_{\hat{\beta}_2}} = \dfrac{.608}{.116} = 5.24$.
$H_0: \beta_3 = 0$
$H_a: \beta_3 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_3 - 0}{s_{\hat{\beta}_3}} = \dfrac{-.00380}{.00095} = -4.00$
$H_0: \beta_4 = 0$
$H_a: \beta_4 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_4 - 0}{s_{\hat{\beta}_4}} = \dfrac{.00115}{.000505} = 2.28$
Since the observed value of the test statistic does not fall in the rejection region
($t = 2.28 < 2.447$), $H_0$ is not rejected. There is insufficient evidence to indicate that there is a
linear relationship between rainfall in months preceding vintage and the logarithm of price,
adjusting for all other variables.
$H_0: \beta_5 = 0$
$H_a: \beta_5 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_5 - 0}{s_{\hat{\beta}_5}} = \dfrac{.00765}{.565} = 0.014$.
Model 1:
We estimate that the mean price will increase by 2.4% for each 1-unit increase in
$x_1$, vintage year (with all other variables held constant).
$\hat{\beta}_2 = .616$, $e^{.616} - 1 = .852$
We estimate that the mean price will increase by 85.2% for each 1-unit increase in
$x_2$, average growing season temperature in °C (with all other variables held constant).
$\hat{\beta}_3 = -.00386$, $e^{-.00386} - 1 = -.004$
We estimate that the mean price will decrease by .4% for each 1-unit increase in $x_3$,
Sept./Aug. rainfall in cm (with all other variables held constant).
We estimate that the mean price will increase by 2.4% for each 1-unit increase in
$x_1$, vintage year (with all other variables held constant).
We estimate that the mean price will decrease by .4% for each 1-unit increase in $x_3$,
Sept./Aug. rainfall in cm (with all other variables held constant).
We estimate that the mean price will increase by .8% for each 1-unit increase in
$x_5$, average Sept. temperature in °C (with all other variables held constant).
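Because the dependent variable in these models is the logarithm of price, each coefficient is converted to a percentage change in mean price through $e^{\hat{\beta}} - 1$. The Python sketch below applies that conversion to the estimates quoted in the tests above; the function name is hypothetical, and the rainfall coefficient is entered as negative on the assumption that the stated decrease in price reflects its sign.

```python
import math


def percent_change(beta_hat):
    """Percent change in mean price per 1-unit increase in x: 100 * (e^beta - 1)."""
    return 100.0 * (math.exp(beta_hat) - 1.0)


estimates = {
    "vintage year, Model 3": 0.0240,
    "growing season temperature, Model 2": 0.616,
    "Sept./Aug. rainfall, Model 2 (sign assumed negative)": -0.00386,
    "Sept. temperature, Model 3": 0.00765,
}

for name, beta_hat in estimates.items():
    print(name, round(percent_change(beta_hat), 1), "%")
```

For small coefficients, $e^{\hat{\beta}} - 1$ is close to $\hat{\beta}$ itself, which is why .0240 translates to roughly 2.4%.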
4.59
4.61
c.
Model 2 is preferable.
a.
b.
c.
Test $H_0: \beta_2 = 0$ versus $H_a: \beta_2 \neq 0$. We see from the print-out that $t = .34$, p-value $= .734 > \alpha = .05$. There
is insufficient evidence that transcript copy number (y) is curvilinearly related to the proportion
of RNA. When testing $H_0: \beta_5 = 0$ versus $H_a: \beta_5 \neq 0$, this hypothesis is also not rejected.
Models (a) and (b) are nested. The complete model is $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ and the reduced
model is $E(y) = \beta_0 + \beta_1 x_1$.
Models (a) and (d) are nested. The complete model is $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$ and the
reduced model is $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$.
Models (a) and (e) are nested. The complete model is
a.
b.
$H_a$: At least one $\beta_i \neq 0$
c.
Yes, the two models are nested since all of the terms of the first model and four additional terms
are included in the second.
d.
Since the p-value ($< .001$) is less than $\alpha = .05$, we reject $H_0$ and conclude that the additional terms
contribute to the prediction of y.
e.
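$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7 + \beta_8 x_8$
$\quad + \beta_9 x_5 x_6 + \beta_{10} x_5 x_7 + \beta_{11} x_5 x_8 + \beta_{12} x_6 x_7 + \beta_{13} x_6 x_8 + \beta_{14} x_7 x_8$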
f.
4.65
a.
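Fail to reject $H_0: \beta_9 = \beta_{10} = \cdots = \beta_{14} = 0$ since the p-value > .10; this indicates that model 3 is
not significantly more effective in predicting y.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7 + \beta_8 x_8$
$\quad + \beta_9 x_9 + \beta_{10} x_{10} + \beta_{11} x_{11}$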
b.
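$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7$
$\quad + \beta_8 x_8 + \beta_9 x_9 + \beta_{10} x_{10} + \beta_{11} x_{11}$
$\quad + \beta_{12} x_1 x_9 + \beta_{13} x_1 x_{10} + \beta_{14} x_1 x_{11}$
$\quad + \beta_{15} x_2 x_9 + \beta_{16} x_2 x_{10} + \beta_{17} x_2 x_{11}$
$\quad + \beta_{18} x_3 x_9 + \beta_{19} x_3 x_{10} + \beta_{20} x_3 x_{11}$
$\quad + \beta_{21} x_4 x_9 + \beta_{22} x_4 x_{10} + \beta_{23} x_4 x_{11}$
$\quad + \beta_{24} x_5 x_9 + \beta_{25} x_5 x_{10} + \beta_{26} x_5 x_{11}$
$\quad + \beta_{27} x_6 x_9 + \beta_{28} x_6 x_{10} + \beta_{29} x_6 x_{11}$
$\quad + \beta_{30} x_7 x_9 + \beta_{31} x_7 x_{10} + \beta_{32} x_7 x_{11}$
$\quad + \beta_{33} x_8 x_9 + \beta_{34} x_8 x_{10} + \beta_{35} x_8 x_{11}$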
c.
4.67
b.
c.
The test was statistically significant. Thus, H 0 was rejected. There is sufficient evidence to
indicate that at least one of the control variables contributes to the prediction of SAT-Math
scores.
d.
$R^2_{adj} = .79$. 79% of the sample variability of the SAT-Math scores around their means is
explained by the proposed model relating SAT-Math scores to the 10 independent variables,
adjusting for the sample size and the number of parameters in the model.
e.
For confidence coefficient .95, $\alpha = .05$ and $\alpha / 2 = .05 / 2 = .025$. From Table 2, Appendix D,
with df $= n - (k + 1) = 3492 - (10 + 1) = 3481$, $t_{.025} = 1.96$. The 95% confidence interval is (8.12, 19.88).
We are 95% confident that the difference in the mean SAT-Math scores between students who
were coached and those who were not is between 8.12 and 19.88 points, holding all the other
variables constant.
f.
No. From Exercise 4.48, the confidence interval for $\beta_2$ was (13.12, 24.88). In part e, the
confidence interval for $\beta_2$ was (8.12, 19.88). Even though coaching is significant in both
models, the change in the mean SAT-Math scores is not as great if the control variables are
added to the model. Notice that each point estimate is within the confidence interval of the other.
g.
4.69
h.
a.
b.
$H_0: \beta_2 = \beta_5 = 0$
$H_a$: At least one $\beta_i \neq 0$
c.
F=
2961
Since $F = .06$ yields a p-value of .9423, which exceeds any significance level that we
would use, we do not reject the null hypothesis. There is insufficient evidence that the terms
involving $\beta_2$ and $\beta_5$ contribute to predicting the transcript copy number.
4.71
a.
To determine if the AD score is positively related to assertiveness level, once age and length of
disability are accounted for, we test:
$H_0: \beta_1 = 0$
$H_a: \beta_1 > 0$
The test statistic is $t = 5.96$.
The p-value is $p = .0001 / 2 = .00005$. Since the p-value is less than $\alpha$
($p = .00005 < .05$), $H_0$ is rejected. There is sufficient evidence to indicate that the AD score
is positively related to assertiveness level, once age and length of disability are accounted for
with $\alpha = .05$.
b.
To determine if age is related to assertiveness level, once AD score and length of disability are
accounted for, we test:
$H_0: \beta_2 = 0$
$H_a: \beta_2 \neq 0$
The test statistic is $t = 0.01$.
The p-value is $p = .9620$. Since the p-value is greater than $\alpha$ ($p = .9620 > .05$), $H_0$ is not
rejected. There is insufficient evidence to indicate that age is related to assertiveness
level, once AD score and length of disability are accounted for with $\alpha = .05$.
c.
To determine if length of disability is positively related to assertiveness level, once AD score and
age are accounted for, we test:
$H_0: \beta_3 = 0$
$H_a: \beta_3 > 0$
The test statistic is $t = 1.91$.
The p-value is $p = .0576 / 2 = .0288$. Since the p-value is less than $\alpha$ ($p = .0288 < .05$), $H_0$
is rejected. There is sufficient evidence to indicate that length of disability is positively related
to assertiveness level, once the AD score and age are accounted for with $\alpha = .05$.
4.73
4.75
a.
Estimate of $\beta_1$
b.
The prediction equation is less reliable for values of the x's outside the range of the sample data.
a.
The first order model for E ( y ) as a function of the first five independent variables is:
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5$
b.
c.
The first order model for E ( y ) as a function of the first seven independent variables is:
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7$
d.
e.
4.77
$R^2 = .603$
60.3% of the variability in the GSI scores is explained by the model including
the first seven independent variables.
Since the p-values associated with the variables DES and PDEQ-SR are both less than .001,
there is evidence that both variables contribute to the prediction of GSI, adjusted for all the other
variables already in the model, for any $\alpha > .001$.
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$
b.
To determine if the effect of treatment on spelling score depends on disease intensity, we test:
$H_0: \beta_3 = 0$
$H_a: \beta_3 \neq 0$
The beta estimate is $\hat{\beta}_3 = 1.6$.
The p-value is $p = .02$. Since the p-value is less than $\alpha$ ($p = .02 < .05$), $H_0$ is rejected.
There is sufficient evidence to indicate that the effect of treatment on spelling score depends on
disease intensity at $\alpha = .05$.
4.79
c.
Since the two variables interact, the main effects may be covered up by the interaction effect.
Thus, tests on main effects should not be made. Also, the coefficients of the
main effects should be interpreted with caution. Since the independent variables interact, the
effect of one independent variable on the dependent variable depends on the level of the second
independent variable.
a.
Since as the level of x2 goes from 0 to 9, we expect the level of funniness to increase, then level
off, then decrease, we are expecting to see negative or downward curvature. The sign we expect
to see on $\beta_2$ is negative.
b.
c.
$H_a$: At least one $\beta_i \neq 0$
The test statistic is $t = 4.20$, with p-value $= .0085 / 2 = .004$. Since the p-value is less than
$\alpha = .10$, reject $H_0$. There is sufficient evidence of an upward curvilinear relationship between y and x
at $\alpha = .10$.
4.83
a.
The model is $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
where $x_1 = 1$ if Communist, 0 if not
and $x_2 = 1$ if Democratic, 0 if not
b.
4.85
No, income ($x_1$) is not significant; the air carrier dummies ($x_3$ through $x_{30}$) are significant.
4.87
a.
b.
c.
d.
Let
$x_1 = 1$ if benzene, 0 if not
$x_2 = 1$ if toluene, 0 if not
$x_3 = 1$ if chloroform, 0 if not
$x_4 = 1$ if methanol, 0 if not
$\beta_0 = \mu_A$; $\beta_1 = \mu_B - \mu_A$; $\beta_2 = \mu_T - \mu_A$; $\beta_3 = \mu_C - \mu_A$; $\beta_4 = \mu_M - \mu_A$.
f.
To see if the mean retention coefficients of the five organic compounds differ, we test:
$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$
$H_a$: At least one $\beta_i \neq 0$ for $i = 1, 2, 3, 4$
a.
$H_0: \beta_5 = \beta_6 = \beta_7 = \beta_8 = 0$
$H_a$: At least one coefficient is not 0
For males:
Rejection region: $\alpha = .05$, $\nu_1 = 4$, $\nu_2 = n - (k + 1) = 235$, $F_{.05} \approx 2.37$
Reject $H_0$ if $F > 2.37$
For females:
Rejection region: $\alpha = .05$, $\nu_1 = 4$, $\nu_2 = n - (k + 1) = 144$, $F_{.05} \approx 2.45$
Reject $H_0$ if $F > 2.45$
b.
For males, reduced model: $R^2 = .218$. This implies that 21.8% of the sample variability of the
intrinsic job satisfaction scores is explained by the model containing age, education level, firm
experience, and sales experience.
For males, complete model: $R^2 = .408$. This implies that 40.8% of the sample variability of the
intrinsic job satisfaction scores is explained by the model containing age, education level, firm
experience, sales experience, contingent reward behavior, noncontingent reward behavior,
contingent punishment behavior, and noncontingent punishment behavior.
Since the $R^2$ value increased from .218 to .408 after the 4 supervisory behavior variables were
added to the model, it appears that they did have an impact on intrinsic job satisfaction for the
males.
For females, reduced model: $R^2 = .268$. This implies that 26.8% of the sample variability of the
intrinsic job satisfaction scores is explained by the model containing age, education level, firm
experience, and sales experience.
For females, complete model: $R^2 = .496$. This implies that 49.6% of the sample variability of
the intrinsic job satisfaction scores is explained by the model containing age, education level,
firm experience, sales experience, contingent reward behavior, noncontingent reward behavior,
contingent punishment behavior, and noncontingent punishment behavior.
Since the $R^2$ value increased from .268 to .496 after the 4 supervisory behavior variables were
added to the model, it appears that they did have an impact on intrinsic job satisfaction for the
females.
c.
$F_{females} = 9.05$
Rejection region:
Males: $\alpha = .05$, $\nu_1 = 4$, $\nu_2 = 235$, $F_{.05} \approx 2.37$
Reject $H_0$ if $F > 2.37$
Females: $\alpha = .05$, $\nu_1 = 4$, $\nu_2 = 144$, $F_{.05} \approx 2.45$
Reject $H_0$ if $F > 2.45$
Conclusion: For both males and females, reject H 0 at = .05. There is sufficient evidence to
indicate that at least one of the four supervisory behavior variables affects intrinsic job
satisfaction.