Chapter 4
Multiple Regression Models

4.1
Associated with each independent variable in a multiple regression model is a $\beta$ coefficient. Since
these coefficients are unknown, we estimate them using data from a sample of size $n$. Each estimate
eliminates one degree of freedom available for estimating $\sigma^2$. Hence, if there are $k$ independent
variables in the model plus the y-intercept, $\beta_0$, then there will be $n - (k+1)$ degrees of freedom left
to estimate $\sigma^2$.
4.3
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$
b.
$R^2 = .08$ implies that 8% of the sample variation in frequency of marijuana use is explained by the
model.
c.
Reject $H_0: \beta_1 = \beta_2 = \beta_3 = 0$ since the p-value is less than .01.
d.
Reject $H_0: \beta_1 = 0$ since the p-value is less than .01.
e.
Fail to reject $H_0: \beta_2 = 0$ since the p-value for the severity of impulsivity-hyperactivity is
greater than .05.
f.
Fail to reject $H_0: \beta_3 = 0$ since the p-value for oppositional-defiant and conduct disorder is
greater than .05.
4.5
a.
From the output, the least squares prediction equation is
$\hat{y} = 3.70 + .34x_1 + .49x_2 + .72x_3 + 1.14x_4 + 1.51x_5 + .26x_6 - .14x_7 - .10x_8 - .10x_9$.
b.
$H_0: \beta_7 = 0$
$H_a: \beta_7 < 0$
The test statistic is $t = \dfrac{\hat{\beta}_7 - 0}{s_{\hat{\beta}_7}} = \dfrac{-.14 - 0}{.14} = -1.00$.
The rejection region requires $\alpha = .05$ in the lower tail of the t distribution. From Table 2,
Appendix D, with df $= n - (k+1) = 234 - (9+1) = 224$, $t_{.05} \approx 1.645$. The rejection region is
$t < -1.645$.
Since the observed value of the test statistic does not fall in the rejection region
($t = -1.00 \not< -1.645$), $H_0$ is not rejected. There is insufficient evidence to indicate that there
is a negative linear relationship between the number of runs scored and the number of times
caught stealing at $\alpha = .05$.
Copyright 2012 Pearson Education, Inc. Publishing as Prentice Hall.
c.
For confidence coefficient .95, $\alpha = .05$ and $\alpha/2 = .05/2 = .025$. From Table 2, Appendix D,
with df $= n - (k+1) = 234 - (9+1) = 224$, $t_{.025} \approx 1.96$. The 95% confidence interval is:
$\hat{\beta}_5 \pm t_{.025}\, s_{\hat{\beta}_5} \Rightarrow 1.51 \pm 1.96(.05) \Rightarrow 1.51 \pm .098 \Rightarrow (1.412, 1.608)$
We are 95% confident that for each additional home run, the mean number of runs scored will
increase by anywhere from 1.412 to 1.608, holding all the other variables constant.
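The arithmetic in parts b and c of Exercise 4.5 can be checked with a short sketch. The function names below are illustrative, not from the text, and the critical value 1.96 is simply the table value quoted in the solution.

```python
def t_statistic(estimate, std_error, hypothesized=0.0):
    """t = (estimate - hypothesized value) / standard error of the estimate."""
    return (estimate - hypothesized) / std_error


def confidence_interval(estimate, std_error, t_critical):
    """Interval estimate +/- t_critical * standard error."""
    half_width = t_critical * std_error
    return estimate - half_width, estimate + half_width


# Part b: test of beta_7 (estimate -.14, standard error .14)
t7 = t_statistic(-0.14, 0.14)

# Part c: 95% interval for beta_5 (estimate 1.51, standard error .05)
lower, upper = confidence_interval(1.51, 0.05, t_critical=1.96)

print(f"t = {t7:.2f}")
print(f"95% CI for beta_5: ({lower:.3f}, {upper:.3f})")
```

The same two helpers apply to any single-coefficient test or interval in this chapter once the estimate, its standard error, and the table value are read from the printout.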
4.7
a.
$\hat{\beta}_1 = 2.006$: for every 1-unit increase in the proportion of block with low density ($x_1$),
population density is estimated to increase by about 2, holding $x_2$ constant. $\hat{\beta}_2 = 5.006$: for
every 1-unit increase in the proportion of block with high density ($x_2$), population density is
estimated to increase by about 5, holding $x_1$ constant.
b.
68.6% of the sample variation in population density is explained by the model.
c.
$H_0: \beta_1 = \beta_2 = 0$
$H_a$: at least one of the coefficients is nonzero
d.
$F = \dfrac{R^2/k}{(1-R^2)/[n-(k+1)]} = \dfrac{.686/2}{(1-.686)/[125-(2+1)]} = 133.27$
e.
Reject $H_0$ since $F_c = 133.27 > F_\alpha = 4.79$ and conclude that at least one of the two independent
variables is useful in explaining the sample variation in population density.
4.9
a.
$\hat{y} = 1.81231 + .10875x_1 + .00017x_2$
b.
For every 1-mile increase in road length ($x_1$), the estimated number of crashes increases by .109,
holding AADT constant; for every one-vehicle increase in AADT ($x_2$), the estimated number of crashes
increases by .00017, holding road length constant.
c.
$.109 \pm .083 \Rightarrow (.026, .192)$. We are 99% confident that the true increase in crashes per
additional mile of interstate highway is between .026 and .192.
d.
$.00017 \pm .00008 \Rightarrow (.00009, .00025)$. We are 99% confident that the true increase in crashes
per one-vehicle increase in AADT is between .00009 and .00025.
e.
$\hat{y} = 1.20785 + .06343x_1 + .00056x_2$; for every 1-mile increase in road length ($x_1$), the estimated
number of crashes increases by .063; for every one-vehicle increase in AADT ($x_2$), the estimated number
of crashes increases by .00056. The intervals are $.063 \pm .047$ and $.00056 \pm .00031$.
4.11
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4$
b.
$\hat{y} = 21087.95 + 108.45x_1 + 557.91x_2 - 340.17x_3 + 85.68x_4$
c.
$\hat{\beta}_0$ = the estimate of the y-intercept.
$\hat{\beta}_1 = 108.45$. We estimate that the mean equivalent width will increase by 108.45 for each
additional 1-unit increase in redshift, all other variables held constant.
$\hat{\beta}_2 = 557.91$. We estimate that the mean equivalent width will increase by 557.91 for each
additional 1-unit increase in lineflux, all other variables held constant.
$\hat{\beta}_3 = -340.17$. We estimate that the mean equivalent width will decrease by 340.17 for each
additional 1-unit increase in line luminosity, all other variables held constant.
$\hat{\beta}_4 = 85.68$. We estimate that the mean equivalent width will increase by 85.68 for each
additional 1-unit increase in AB1450, all other variables held constant.
d.
To determine if redshift is a useful linear predictor of equivalent width, we test:
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
The test statistic is $t = 1.22$. The p-value is $p = .238$. Since the p-value is greater than $\alpha$
($p = .238 > .05$), $H_0$ is not rejected. There is insufficient evidence to indicate that redshift is
a useful linear predictor of equivalent width at $\alpha = .05$.
e.
$R^2 = .912$, $R_a^2 = .894$; $R_a^2$
f.
$F = 51.72$; reject $H_0$.
4.13
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5$
b.
According to the MINITAB printout below, the model is
$\hat{y} = 13614 + .09x_1 - 9.20x_2 + 14.40x_3 + .35x_4 - .85x_5$

The regression equation is
HEATRATE = 13614 + 0.0888 RPM - 9.20 INLET-TEMP + 14.4 EXH-TEMP
           + 0.4 CPRATIO - 0.848 AIRFLOW

Predictor       Coef   SE Coef      T      P
Constant     13614.5     870.0  15.65  0.000
RPM          0.08879   0.01391   6.38  0.000
INLET-TEMP    -9.201     1.499  -6.14  0.000
EXH-TEMP      14.394     3.461   4.16  0.000
CPRATIO         0.35     29.56   0.01  0.991
AIRFLOW      -0.8480    0.4421  -1.92  0.060

S = 458.828   R-Sq = 92.4%   R-Sq(adj) = 91.7%

Analysis of Variance
Source          DF         SS        MS       F      P
Regression       5  155055273  31011055  147.30  0.000
Residual Error  61   12841935    210524
Total           66  167897208
c.
$\hat{\beta}_0 = 13614$; the estimate of the y-intercept.
$\hat{\beta}_1 = .09$; we estimate that the mean heat rate will increase by .09 kJ/kW-hr for every
1-rpm increase in engine speed, holding the other variables constant.
$\hat{\beta}_2 = -9.20$; we estimate that the mean heat rate will decrease by 9.20 kJ/kW-hr for every
one-degree Celsius increase in inlet temperature, holding the other variables constant.
$\hat{\beta}_3 = 14.4$; we estimate that the mean heat rate will increase by 14.4 kJ/kW-hr for every
one-degree Celsius increase in exhaust temperature, holding the other variables constant.
$\hat{\beta}_5 = -.85$; we estimate that the mean heat rate will decrease by .85 kJ/kW-hr for every
one kg/sec increase in mass air flow, holding the other variables constant.
d.
$s = 458.8$; approximately 95% of the sample heat rates fall within $2s = 917.6$ kJ/kW-hr of the
model-predicted values.
e.
$R_a^2 = .917$; 91.7% of the sample variation in heat rate is explained by the model, after adjusting
for the sample size and the number of parameters.
f.
$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0$
$H_a$: At least one $\beta_i \neq 0$
$F = 147.30$, p-value $\approx 0 < \alpha = .01$. Yes, we reject $H_0$ and conclude that the model is useful
in predicting the heat rate.
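The summary figures quoted in parts d through f can be recomputed from the Analysis of Variance rows of the printout. The sketch below assumes only the degrees of freedom and sums of squares shown there; the variable names are illustrative.

```python
import math

# Values read from the Analysis of Variance table in the printout
k = 5                      # number of independent variables (regression df)
df_error = 61              # residual error df, n - (k + 1)
ss_regression = 155055273
ss_error = 12841935
ss_total = 167897208

n = df_error + k + 1       # sample size implied by the df column

ms_regression = ss_regression / k
ms_error = ss_error / df_error
f_stat = ms_regression / ms_error          # global F statistic
s = math.sqrt(ms_error)                    # estimated standard deviation
r2 = ss_regression / ss_total              # multiple coefficient of determination
r2_adj = 1 - (n - 1) / df_error * (1 - r2)  # adjusted R-squared

print(f"n = {n}, F = {f_stat:.2f}, s = {s:.1f}")
print(f"R-Sq = {r2:.3f}, R-Sq(adj) = {r2_adj:.3f}, 2s = {2 * s:.1f}")
```

Each printed value should agree with the corresponding entry of the MINITAB output up to rounding.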
4.15
a.
$H_0: \beta_1 = \beta_2 = \dots = \beta_{18} = 0$
$H_a$: At least one of the coefficients is nonzero
Test statistic: $F = \dfrac{R^2/k}{(1-R^2)/[n-(k+1)]} = \dfrac{.95/18}{.05/1} = 1.056$
Rejection region: $\alpha = .05$, $\nu_1 = 18$, $\nu_2 = 1$, $F_{.05} \approx 247$
Reject $H_0$ if $F > 247$.
Conclusion: There is insufficient evidence to reject $H_0$ at $\alpha = .05$. What happened? The
inclusion of 18 independent variables in the model reduces the number of degrees of freedom
in the denominator to 1. The result is an inflation of the critical value, which makes it
difficult to reject $H_0$. In order to have more degrees of freedom available for estimating $\sigma^2$,
the researcher should either collect more data or include fewer independent variables in the
model.
b.
$R_a^2 = 1 - \dfrac{n-1}{n-(k+1)}(1-R^2) = 1 - \dfrac{20-1}{20-(18+1)}(1-.95) = .05$
After adjusting for the sample size and the number of parameters in the model, approximately
5% of the sample variation in IQ is "explained" by the model.
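Both formulas used in Exercise 4.15 depend only on $R^2$, $n$, and $k$, so they are easy to check numerically. The following is a minimal sketch with illustrative function names; the last line re-checks the F value reported in Exercise 4.7 as a second data point.

```python
def global_f(r2, n, k):
    """Global F statistic computed from R^2: (R^2/k) / ((1 - R^2)/(n - (k + 1)))."""
    return (r2 / k) / ((1 - r2) / (n - (k + 1)))


def adjusted_r2(r2, n, k):
    """Adjusted R^2: 1 - (n - 1)/(n - (k + 1)) * (1 - R^2)."""
    return 1 - (n - 1) / (n - (k + 1)) * (1 - r2)


# Exercise 4.15: n = 20 observations, k = 18 independent variables, R^2 = .95
f_415 = global_f(0.95, n=20, k=18)
r2_adj_415 = adjusted_r2(0.95, n=20, k=18)

# Exercise 4.7: n = 125, k = 2, R^2 = .686
f_47 = global_f(0.686, n=125, k=2)

print(f"4.15: F = {f_415:.3f}, adjusted R^2 = {r2_adj_415:.2f}")
print(f"4.7:  F = {f_47:.2f}")
```

With only one error degree of freedom, a large $R^2$ still yields an F statistic near 1, which is the point of part a.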
4.17
a.
$R^2 = .362$. 36.2% of the sample variation in active caring score is explained by the model.
b.
To test the utility of the model, we test:
$H_0: \beta_1 = \beta_2 = \beta_3 = 0$
$H_a$: At least one $\beta_i \neq 0$, $i = 1, 2, 3$
The test statistic is:
$F = \dfrac{R^2/k}{(1-R^2)/[n-(k+1)]} = \dfrac{.362/3}{(1-.362)/[31-(3+1)]} = 5.11$
The rejection region requires $\alpha = .05$ in the upper tail of the F distribution with $\nu_1 = k = 3$ and
$\nu_2 = n - (k+1) = 31 - (3+1) = 27$. From Table 4, Appendix C, $F_{.05} = 2.96$. The rejection
region is $F > 2.96$.

Since the observed value of the test statistic falls in the rejection region ($F = 5.11 > 2.96$), $H_0$
is rejected. There is sufficient evidence that the model is useful in predicting AC score at
$\alpha = .05$.
4.19
From the output, the least squares prediction equation is
$\hat{y} = 3.70 + .34x_1 + .49x_2 + .72x_3 + 1.14x_4 + 1.51x_5 + .26x_6 - .14x_7 - .10x_8 - .10x_9$.
Various answers possible
4.21
a.
The confidence interval is ( 2.68, 5.82). With 95% confidence we can conclude that the mean
gosling weight change for all goslings with 40% digestion efficiency and 15% acid-detergent
fibre will fall between 2.68% and 5.82%.
b.
The prediction interval is $(-3.04, 11.54)$. With 95% confidence we can conclude that the actual
gosling weight change for a gosling with 40% digestion efficiency and 15% acid-detergent fibre
will fall between $-3.04$% and 11.54%.
4.23
According to the analysis on the MINITAB printout below, we can predict the arsenic level at the
lowest latitude (23.755), the highest longitude (90.662), and the lowest depth (25.000).

Predicted Values for New Observations
New Obs     Fit  SE Fit            95% CI            95% PI
      1  232.33   23.23  (186.63, 278.04)   (24.03, 440.64)X
X denotes a point that is an outlier in the predictors.

Values of Predictors for New Observations
New Obs  LATITUDE  LONGITUDE  DEPTH-FT
      1      23.8       90.7      25.0

With 95% confidence we can predict that the arsenic level is between 24.03 and 440.64 when the
latitude is 23.755 (the lowest), the longitude is 90.662 (the highest), and the depth is 25.0 (the lowest).
4.25
The first-order model is:
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_5$
We want to find a 95% prediction interval for the actual voltage when the volume fraction of the
disperse phase is at the high level ($x_1 = 80$), the salinity is at the low level ($x_2 = 1$), and the amount
of surfactant is at the low level ($x_5 = 2$).
Using MINITAB, the output is:

y = 0.933 - 0.0243 x1 + 0.142 x2 + 0.385 x5

Predictor       Coef     StDev      T      P
Constant      0.9326    0.2482   3.76  0.002
x1         -0.024272  0.004900  -4.95  0.000
x2           0.14206   0.07573   1.88  0.080
x5           0.38457   0.09801   3.92  0.001

S = 0.4796   R-Sq = 66.6%   R-Sq(adj) = 59.9%

Source          DF       SS      MS     F      P
Regression       3   6.8701  2.2900  9.95  0.001
Residual Error  15   3.4509  0.2301
Total           18  10.3210

Predicted Values
   Fit  StDev Fit          95.0% CI          95.0% PI
-0.098      0.232  (-0.592, 0.396)   (-1.233, 1.038)

The 95% prediction interval is $(-1.233, 1.038)$. We are 95% confident that the actual voltage is
between $-1.233$ and 1.038 when the volume fraction of the disperse phase is at the high level
($x_1 = 80$), the salinity is at the low level ($x_2 = 1$), and the amount of surfactant is at the low
level ($x_5 = 2$).
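The fitted value and both intervals in the printout can be reproduced from the coefficient table, S, and StDev Fit. The sketch below assumes the standard forms $\hat{y} \pm t_{.025}\, s_{\hat{y}}$ for the confidence interval and $\hat{y} \pm t_{.025}\sqrt{s^2 + s_{\hat{y}}^2}$ for the prediction interval; the critical value 2.131 (15 df) comes from a standard t table and is not shown in the printout.

```python
import math

# Coefficients from the MINITAB coefficient table
coef = {"const": 0.9326, "x1": -0.024272, "x2": 0.14206, "x5": 0.38457}
x_new = {"x1": 80, "x2": 1, "x5": 2}

fit = coef["const"] + sum(coef[name] * value for name, value in x_new.items())

s = 0.4796        # S from the printout
se_fit = 0.232    # StDev Fit from the printout
t_crit = 2.131    # assumption: t_.025 with 15 df, taken from a t table

ci_half = t_crit * se_fit
pi_half = t_crit * math.sqrt(s ** 2 + se_fit ** 2)

ci = (fit - ci_half, fit + ci_half)   # interval for the mean voltage
pi = (fit - pi_half, fit + pi_half)   # interval for an individual voltage

print(f"fit = {fit:.3f}")
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% PI = ({pi[0]:.3f}, {pi[1]:.3f})")
```

Because the inputs are rounded, the last digit of the prediction limits may differ slightly from the printout. The prediction interval is wider than the confidence interval because it adds the variance of an individual observation to the variance of the fitted mean.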
4.27
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$
b.
The linear relationship between number of defects and speed depends on blade position.
c.
$\beta_3 < 0$.
4.29
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$
b.
The linear relationship between negative feelings score and number ahead in line depends on the
number behind in line.
c.
Since the p-value ($> .25$) is greater than any conventional significance level, a test for the
usefulness of the interaction in the model ($H_0: \beta_3 = 0$ against $H_a: \beta_3 \neq 0$) would not provide
sufficient evidence to reject the null hypothesis. Thus, the interaction would not be useful in the model.
d.
$\beta_1$ should be positive ($\beta_1 > 0$) and $\beta_2$ should be negative ($\beta_2 < 0$).
4.31
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_1 x_3 + \beta_5 x_2 x_3$
b.

ARSENIC = 10845 - 1280 LATITUDE + 217 LONGITUDE - 1549 DEPTH-FT - 11.0 latdepth + 20.0 longdepth

327 cases used, 1 cases contain missing values

Predictor      Coef  SE Coef      T      P
Constant      10845    67720   0.16  0.873
LATITUDE      -1280     1053  -1.22  0.225
LONGITUDE     217.4    814.5   0.27  0.790
DEPTH-FT    -1549.2    985.6  -1.57  0.117
latdepth     -11.00    11.86  -0.93  0.355
longdepth     19.98    11.20   1.78  0.076

S = 103.072   R-Sq = 13.7%   R-Sq(adj) = 12.4%

Source           DF       SS      MS      F      P
Regression        5   542303  108461  10.21  0.000
Residual Error  321  3410258   10624
Total           326  3952562

$\hat{y} = 10845 - 1280.0x_1 + 217.4x_2 - 1549.2x_3 - 11.0x_1x_3 + 19.98x_2x_3$
c.
$H_0: \beta_4 = 0$
$H_a: \beta_4 \neq 0$
d.
$H_0: \beta_5 = 0$
$H_a: \beta_5 \neq 0$
e.
For $\beta_4$, $t = -.93$; for $\beta_5$, $t = 1.78$.
Since the p-value for $\beta_4$ is $.355 > \alpha = .05$, the interaction between latitude and depth does not
have a significant effect on the arsenic level. Since the p-value for $\beta_5$ is $.076 > \alpha = .05$, the
interaction between longitude and depth does not have a significant effect on the arsenic level.
4.33
a.
Including the interaction terms implies that the relationship between voltage and volume
fraction of the disperse phase depends on the levels of salinity and surfactant concentration:
the slope on $x_1$ depends on $x_2$ and $x_5$. A possible sketch of the relationship is:
[Sketch: voltage plotted against $x_1$, with one line for each of ($x_2 = 4$, $x_5 = 2$), ($x_2 = 4$, $x_5 = 4$),
($x_2 = 1$, $x_5 = 2$), and ($x_2 = 1$, $x_5 = 4$).]
b.
Using SAS, the printout is:

Model: MODEL1
Dependent Variable: Y

                  Sum of      Mean
Source    DF     Squares    Square   F Value   Prob>F
Model      5     7.01028   1.40206     5.505   0.0061
Error     13     3.31073   0.25467
C Total   18    10.32101

Root MSE    0.50465   R-square   0.6792
Dep Mean    0.97684   Adj R-sq   0.5558
C.V.       51.66138

Parameter Estimates
                Parameter     Standard    T for H0:
Variable  DF     Estimate        Error  Parameter=0  Prob > |T|
INTERCEP   1     0.905732   0.28546326        3.173      0.0073
X1         1    -0.022753   0.00831751       -2.736      0.0170
X2         1     0.304719   0.23660006        1.288      0.2202
X5         1     0.274741   0.22704807        1.210      0.2478
X1X2       1    -0.002804   0.00378998       -0.740      0.4725
X1X5       1     0.001579   0.00394692        0.400      0.6956

                     Dep Var  Predict  Std Err  Lower95%  Upper95%
Obs  X1  X2  X5            Y    Value  Predict   Predict   Predict  Residual
  1  40   1   2       0.6400   0.8640    0.185   -0.2969    2.0248   -0.2240
  2  80   1   4       0.8000   0.7701    0.309   -0.5082    2.0484    0.0299
  3  40   4   4       3.2000   2.1174    0.264    0.8869    3.3480    1.0826
  4  80   4   2       0.4800   0.2092    0.305   -1.0645    1.4828    0.2708
  5  40   1   4       1.7200   1.5398    0.309    0.2617    2.8178    0.1802
  6  80   1   2       0.3200  -0.0320    0.283   -1.2820    1.2181    0.3520
  7  40   4   2       0.6400   1.4416    0.292    0.1824    2.7009   -0.8016
  8  80   4   4       0.6800   1.0113    0.298   -0.2553    2.2779   -0.3313
  9  40   1   2       0.1200   0.8640    0.185   -0.2969    2.0248   -0.7440
 10  80   1   4       0.8800   0.7701    0.309   -0.5082    2.0484    0.1099
 11  40   4   4       2.3200   2.1174    0.264    0.8869    3.3480    0.2026
 12  80   4   2       0.4000   0.2092    0.305   -1.0645    1.4828    0.1908
 13  40   1   4       1.0400   1.5398    0.309    0.2617    2.8178   -0.4998
 14  80   1   2       0.1200  -0.0320    0.283   -1.2820    1.2181    0.1520
 15  40   4   2       1.2800   1.4416    0.292    0.1824    2.7009   -0.1616
 16  80   4   4       0.7200   1.0113    0.298   -0.2553    2.2779   -0.2913
 17   0   0   0       1.0800   0.9057    0.285   -0.3468    2.1583    0.1743
 18   0   0   0       1.0800   0.9057    0.285   -0.3468    2.1583    0.1743
 19   0   0   0       1.0400   0.9057    0.285   -0.3468    2.1583    0.1343

Sum of Residuals                  0
Sum of Squared Residuals     3.3107
Predicted Resid SS (Press)   6.5833
The fitted regression line is:

$\hat{y} = .906 - .023x_1 + .305x_2 + .275x_5 - .003x_1x_2 + .002x_1x_5$.
To determine if the model is useful, we test:

$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0$
$H_a$: At least one $\beta_i \neq 0$ for $i = 1, 2, \dots, 5$
The test statistic is $F = 5.505$.

The rejection region requires $\alpha = .05$ in the upper tail of the F distribution with $\nu_1 = k = 5$ and
$\nu_2 = n - (k+1) = 19 - (5+1) = 13$. From Table 4, Appendix C, $F_{.05} = 3.03$. The rejection
region is $F > 3.03$.

Since the observed value of the test statistic falls in the rejection region ($F = 5.505 > 3.03$), $H_0$
is rejected. There is sufficient evidence to indicate the model is useful for predicting voltage at
$\alpha = .05$.
$R^2 = .6792$. Thus, 67.92% of the sample variation of voltage is explained by the model
containing the three independent variables and two interaction terms.
The estimate of the standard deviation is $s = .505$.

Comparing this model to that fit in Exercise 4.14, the model in Exercise 4.14 appears to fit the
data better. The model in Exercise 4.14 has a higher $R^2$ (.7710 vs .6792) and a smaller estimate
of the standard deviation (.437 vs .505).
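Because the fitted model contains the interaction terms $x_1x_2$ and $x_1x_5$, the estimated slope of voltage on $x_1$ is not a single number: it equals $\hat{\beta}_1 + \hat{\beta}_4 x_2 + \hat{\beta}_5 x_5$. The sketch below evaluates that slope at the four $(x_2, x_5)$ combinations in the design, using the unrounded parameter estimates from the SAS printout; the function name is illustrative.

```python
# Parameter estimates from the SAS printout
b1 = -0.022753   # X1
b4 = -0.002804   # X1X2
b5 = 0.001579    # X1X5


def slope_on_x1(x2, x5):
    """Estimated change in mean voltage per unit increase in x1 at fixed x2 and x5."""
    return b1 + b4 * x2 + b5 * x5


for x2, x5 in [(1, 2), (1, 4), (4, 2), (4, 4)]:
    print(f"x2 = {x2}, x5 = {x5}: slope on x1 = {slope_on_x1(x2, x5):.6f}")
```

As a check against the printout, the predicted values for observations 5 and 2 (both with $x_2 = 1$, $x_5 = 4$) differ by $.7701 - 1.5398 = -.7697$ over a 40-unit change in $x_1$, which matches 40 times the slope computed for that combination.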
c.
$\hat{\beta}_0 = .906$. This is simply the estimate of the y-intercept.
$\hat{\beta}_1 = -.023$. For each unit increase in disperse phase volume, we estimate that the mean
voltage will decrease by .023 units, holding salinity and surfactant concentration at 0.
$\hat{\beta}_2 = .305$. For each unit increase in salinity, we estimate that the mean voltage will
increase by .305 units, holding disperse phase volume at 0 and holding surfactant concentration constant.
$\hat{\beta}_3 = .275$. For each unit increase in surfactant concentration, we estimate that the mean
voltage will increase by .275 units, holding disperse phase volume at 0 and salinity constant.
$\hat{\beta}_4 = -.003$. This estimates the difference in the slope of the relationship between voltage
and disperse phase volume for each unit increase in salinity, holding surfactant concentration constant.
$\hat{\beta}_5 = .002$. This estimates the difference in the slope of the relationship between voltage
and disperse phase volume for each unit increase in surfactant concentration, holding salinity constant.
4.35
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2$
b.
$\beta_2$ allows for a curvilinear relationship between $y$ and $x_1$.
c.
Negative
4.37
a.
Using MINITAB, the plot of the data is:
[MINITAB character plot of ENE (vertical axis, about 7.0 to 14.0) against body weight (horizontal
axis, 50 to 250).]
Yes, there appears to be a quadratic relationship between ENE and body weight.
b.
$H_0: \beta_2 = 0$
$H_a: \beta_2 \neq 0$
The test statistic is $t = 2.69$.
The p-value is $p = .031$. Since the p-value is less than $\alpha$ ($p = .031 < .10$), $H_0$ is rejected. There
is sufficient evidence to indicate that body weight and ENE are quadratically related at $\alpha = .10$.
4.39
a.
The quadratic model would be:

$E(y) = \beta_0 + \beta_1 x + \beta_2 x^2$
b.
From the plot, $\beta_2$ would be positive because the points appear to form an upward curve.
c.
This value of $r^2$ applies to the linear model, not the quadratic model. The linear model is:
$E(y) = \beta_0 + \beta_1 x$
4.41
a.
[Scatterplot of SPRate vs Time: SPRate on the vertical axis (0.00 to 1.00), Time on the horizontal
axis (0.0 to 3.0).]
The plot shows a curvilinear relationship.
b.

SPRate = 1.01 - 1.17 Time + 0.290 sqrtime

Predictor     Coef  SE Coef      T      P
Constant   1.00705  0.07899  12.75  0.000
Time       -1.1671   0.1219  -9.57  0.000
sqrtime    0.28975  0.03937   7.36  0.000

S = 0.101142   R-Sq = 92.7%   R-Sq(adj) = 91.4%

Source          DF       SS       MS      F      P
Regression       2  1.54782  0.77391  75.65  0.000
Residual Error  12  0.12276  0.01023
Total           14  1.67057

Source   DF   Seq SS
Time      1  0.99365
sqrtime   1  0.55416
c.
Yes, there is evidence of upward curvature, since $t = 7.36$ and the p-value (.000) for testing
$\beta_2 > 0$ is less than $\alpha = .05$.
4.43
a.
Let $x_1$ and $x_2$ represent the two quantitative independent variables. A first-order linear model
includes only the first-order terms for the variables:
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
b.
Let $x_1$, $x_2$, $x_3$, and $x_4$ represent the four quantitative independent variables. A first-order linear
model includes only the first-order terms for the variables:
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4$
4.45
a.
$E(y) = \beta_0 + \beta_1 x_1$, where $x_1 = 1$ if A, 0 if B
$\beta_0$ = mean of $y$ for the $x$ = B level $= \mu_B$
$\beta_1$ = difference in the mean levels of $y$ for the $x$ = A and $x$ = B levels $= \mu_A - \mu_B$
b.
For a qualitative independent variable with four levels (A, B, C, and D), the model requires three
dummy variables as shown below. (We have arbitrarily selected level D as the "base" level.)
$x_1 = 1$ if level A, 0 if not; $x_2 = 1$ if level B, 0 if not; $x_3 = 1$ if level C, 0 if not
Then the model is written $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$.
The following table shows the values of the dummy variables and the mean response $E(y)$ for
each of the four levels.

Level  x1  x2  x3  E(y)
A       1   0   0  $\beta_0 + \beta_1 = \mu_A$
B       0   1   0  $\beta_0 + \beta_2 = \mu_B$
C       0   0   1  $\beta_0 + \beta_3 = \mu_C$
D       0   0   0  $\beta_0 = \mu_D$

From the table, we obtain the interpretations of the $\beta$'s:
$\beta_0 = \mu_D$, $\beta_1 = \mu_A - \mu_D$, $\beta_2 = \mu_B - \mu_D$, and $\beta_3 = \mu_C - \mu_D$
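The dummy-variable coding for a four-level qualitative variable can be sketched in a few lines. The helper names and the numerical $\beta$ values below are hypothetical and chosen only to show that each non-base coefficient is a difference from the base-level mean.

```python
def dummy_code(level, levels, base):
    """Return the 0/1 dummy variables for `level`, one per non-base level, in order."""
    return [1 if level == other else 0 for other in levels if other != base]


levels = ["A", "B", "C", "D"]
base = "D"

# Hypothetical coefficients: beta_0 (mean at base level D), then beta_1..beta_3
betas = [10.0, 2.0, -1.0, 4.0]


def mean_response(level):
    """E(y) = beta_0 + beta_1*x1 + beta_2*x2 + beta_3*x3 for the given level."""
    x = dummy_code(level, levels, base)
    return betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))


for level in levels:
    print(level, dummy_code(level, levels, base), mean_response(level))
```

With this coding, `mean_response("A") - mean_response("D")` returns the hypothetical $\beta_1$, which mirrors the interpretation $\beta_1 = \mu_A - \mu_D$.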
4.47
a.
$y = 1 + 2x_1 + (1) - 3(3)$

$y = -7 + 2x_1$
For $x_1 = 0$, $y = -7$
For $x_1 = 6$, $y = 5$
Two points determine the line.
b.
$y = 1 + 2x_1 + (-1) - 3(1)$

$y = -3 + 2x_1$
For $x_1 = 0$, $y = -3$
For $x_1 = 6$, $y = 9$
Two points determine the line.
c.
The geometric relationship in a first-order model for E ( y ) as a function of one independent

variable for various combinations of values of the other independent variables is parallel lines.
The y-intercept is determined by the combination of values of the other independent variables.
The slope is the coefficient of the one independent variable.
4.49
a.
With $x_2 = 0$, $E(y) = 1 + x_1 - (0) + x_1(0) + 2x_1^2 + (0)^2 = 1 + x_1 + 2x_1^2$

For $x_1 = 0$, $E(y) = 1$
For $x_1 = 1$, $E(y) = 4$
For $x_1 = 2$, $E(y) = 11$
For $x_1 = -1$, $E(y) = 2$
For $x_1 = -2$, $E(y) = 7$
With $x_2 = 1$, $E(y) = 1 + x_1 - (1) + x_1(1) + 2x_1^2 + (1)^2 = 1 + 2x_1 + 2x_1^2$
For $x_1 = 0$, $E(y) = 1$
For $x_1 = 1$, $E(y) = 5$
For $x_1 = 2$, $E(y) = 13$
For $x_1 = -1$, $E(y) = 1$
For $x_1 = -2$, $E(y) = 5$
With $x_2 = 2$, $E(y) = 1 + x_1 - (2) + x_1(2) + 2x_1^2 + (2)^2 = 3 + 3x_1 + 2x_1^2$

For $x_1 = 0$, $E(y) = 3$
For $x_1 = 1$, $E(y) = 8$
For $x_1 = 2$, $E(y) = 17$
For $x_1 = -1$, $E(y) = 2$
For $x_1 = -2$, $E(y) = 5$
b.
The curves are second-order because they contain a squared term ($x_1^2$).
c.
Different shapes.
d.
Yes, the shape of the $x_1$ curve changes for different $x_2$ values.
e.
It changes the shape of the curve for different $x_2$ values by shifting the curve along the $x_1$ axis.
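The values tabulated in part a can be regenerated with a short sketch. It assumes the model is $E(y) = 1 + x_1 - x_2 + x_1x_2 + 2x_1^2 + x_2^2$, which is the form that reproduces every value listed above; the function names are illustrative.

```python
def expected_y(x1, x2):
    """E(y) under the assumed model 1 + x1 - x2 + x1*x2 + 2*x1^2 + x2^2."""
    return 1 + x1 - x2 + x1 * x2 + 2 * x1 ** 2 + x2 ** 2


def vertex_x1(x2):
    """x1 value that minimizes E(y) for fixed x2: -(1 + x2) / (2 * 2)."""
    return -(1 + x2) / 4


x1_values = (0, 1, 2, -1, -2)
for x2 in (0, 1, 2):
    row = [expected_y(x1, x2) for x1 in x1_values]
    print(f"x2 = {x2}: E(y) = {row}, vertex at x1 = {vertex_x1(x2)}")
```

For fixed $x_2$ the curve is a parabola in $x_1$ with linear coefficient $1 + x_2$, so the interaction term moves the vertex along the $x_1$ axis as $x_2$ changes, as described in part e.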
4.51
a.
$x_1 = 1$ if manual, 0 if not
$x_2 = 1$ if clay, 0 if not; $x_3 = 1$ if gravel, 0 if not
$x_4 = 1$ if East, 0 if not; $x_5 = 1$ if South, 0 if not; $x_6 = 1$ if West, 0 if not;
$x_7 = 1$ if Southeast, 0 if not
b.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2(0) + \beta_3(0) + \beta_4(0) + \beta_5(0) + \beta_6(0) + \beta_7(0)$ if all other factors
besides manual picking are zero. Thus when $x_1 = 1$, $E(y) = \beta_0 + \beta_1$;
$\beta_0 = \mu_{\text{Auto}}$, $\beta_1 = \mu_{\text{Manual}} - \mu_{\text{Auto}}$
c.
$E(y) = \beta_0 + \beta_1(0) + \beta_2(1) + \beta_3(0) + \beta_4(0) + \beta_5(0) + \beta_6(0) + \beta_7(0)$ if all other factors
besides clay soil type are zero. Thus when $x_2 = 1$, $E(y) = \beta_0 + \beta_2$, which is the mean wine
quality when the grapes are grown in clay soil. When $x_3 = 1$ and all other factors are zero,
$E(y) = \beta_0 + \beta_1(0) + \beta_2(0) + \beta_3(1) + \beta_4(0) + \beta_5(0) + \beta_6(0) + \beta_7(0) = \beta_0 + \beta_3$,
which is the mean wine quality when the grapes are grown in gravel. Thus,
$E(y) = \beta_0 + \beta_1 x_2 + \beta_2 x_3$; $\beta_0 = \mu_{\text{Sand}}$, $\beta_1 = \mu_{\text{Clay}} - \mu_{\text{Sand}}$; $\beta_2 = \mu_{\text{Gravel}} - \mu_{\text{Sand}}$
d.
$E(y) = \beta_0 + \beta_1(0) + \beta_2(0) + \beta_3(0) + \beta_4(1) + \beta_5(0) + \beta_6(0) + \beta_7(0)$; with $x_4 = 1$, $E(y) = \beta_0 + \beta_4$.
This is the mean wine quality for the East slope orientation.
$E(y) = \beta_0 + \beta_1(0) + \beta_2(0) + \beta_3(0) + \beta_4(0) + \beta_5(1) + \beta_6(0) + \beta_7(0)$; with $x_5 = 1$, $E(y) = \beta_0 + \beta_5$.
This is the mean wine quality for the South slope orientation.
$E(y) = \beta_0 + \beta_1(0) + \beta_2(0) + \beta_3(0) + \beta_4(0) + \beta_5(0) + \beta_6(1) + \beta_7(0)$; with $x_6 = 1$, $E(y) = \beta_0 + \beta_6$.
This is the mean wine quality for the West slope orientation.
$E(y) = \beta_0 + \beta_1(0) + \beta_2(0) + \beta_3(0) + \beta_4(0) + \beta_5(0) + \beta_6(0) + \beta_7(1)$; with $x_7 = 1$, $E(y) = \beta_0 + \beta_7$.
This is the mean wine quality for the Southeast slope orientation.
Thus, $E(y) = \beta_0 + \beta_1 x_4 + \beta_2 x_5 + \beta_3 x_6 + \beta_4 x_7$; $\beta_0 = \mu_{SW}$; $\beta_1 = \mu_E - \mu_{SW}$;
$\beta_2 = \mu_S - \mu_{SW}$; $\beta_3 = \mu_W - \mu_{SW}$; $\beta_4 = \mu_{SE} - \mu_{SW}$
4.53
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$, where $\beta_0 = \mu_{\text{NOHELP}}$,
$x_1 = 1$ if FullSolution, 0 otherwise; $x_2 = 1$ if CheckFigures, 0 otherwise
b.
$\beta_1$.
c.
The regression model is $\hat{y} = 2.433 - .483x_1 + .287x_2$.
d.
$H_0: \beta_1 = \beta_2 = 0$
$H_a$: at least one $\beta_i \neq 0$
$F = \dfrac{R^2/k}{(1-R^2)/[n-(k+1)]} = \dfrac{.012/2}{(1-.012)/[75-3]} = \dfrac{.006}{.01317} = .45$
p-value $= .637 > \alpha = .05$. We do not reject the null hypothesis and conclude that the model is
not significant in predicting the knowledge gained.
4.55
a.
$H_0: \beta_1 = \beta_2 = \dots = \beta_{12} = 0$
$H_a$: at least one $\beta_i \neq 0$
$F = \dfrac{R^2/k}{(1-R^2)/[n-(k+1)]} = \dfrac{.705/12}{(1-.705)/[148-13]} = \dfrac{.05875}{.00219} = 26.83$
With df $\nu_1 = 12$ and $\nu_2 = 135$, $F_{\text{critical}} \approx 1.75 < F_c = 26.83$; reject $H_0$ and conclude that the
overall model is significantly useful in predicting the card price.
b.
Fail to reject $H_0: \beta_1 = 0$.
c.
Reject $H_0: \beta_3 = 0$.
d.
$E[\ln(y)] = \beta_0 + \beta_1 x_4 + \beta_2 x_5 + \beta_3 x_6 + \dots + \beta_9 x_{12} + \beta_{10} x_4 x_5 + \beta_{11} x_4 x_6 + \dots + \beta_{17} x_4 x_{12}$
4.57
a.
Model 1:
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_1 - 0}{s_{\hat{\beta}_1}} = \dfrac{.0354}{.0137} = 2.58$.
Since no $\alpha$ was given, we will use $\alpha = .05$. The rejection region requires $\alpha/2 = .05/2 = .025$
in each tail of the t distribution. From Table 2, Appendix D, with
df $= n - (k+1) = 12 - (1+1) = 10$, $t_{.025} = 2.228$. The rejection region is $t < -2.228$ or
$t > 2.228$.
Since the observed value of the test statistic falls in the rejection region ($t = 2.58 > 2.228$), $H_0$
is rejected. There is sufficient evidence to indicate that there is a linear relationship between
vintage and the logarithm of price.
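Model 2:
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_1 - 0}{s_{\hat{\beta}_1}} = \dfrac{.0238}{.00717} = 3.32$.
Since no $\alpha$ was given, we will use $\alpha = .05$. The rejection region requires
$\alpha/2 = .05/2 = .025$ in each tail of the t distribution. From Table 6, Appendix A, with
df $= n - (k+1) = 12 - (4+1) = 7$, $t_{.025} = 2.365$. The rejection region is $t < -2.365$ or
$t > 2.365$.
Since the observed value of the test statistic falls in the rejection region ($t = 3.32 > 2.365$), $H_0$ is
rejected. There is sufficient evidence to indicate that there is a linear relationship between
vintage and the logarithm of price, adjusting for all other variables.

$H_0: \beta_2 = 0$
$H_a: \beta_2 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_2 - 0}{s_{\hat{\beta}_2}} = \dfrac{.616}{.0952} = 6.47$.
The rejection region is $t < -2.365$ or $t > 2.365$.
Since the observed value of the test statistic falls in the rejection region ($t = 6.47 > 2.365$), $H_0$ is
rejected. There is sufficient evidence to indicate that there is a linear relationship between
average growing season temperature and the logarithm of price, adjusting for all other variables.

$H_0: \beta_3 = 0$
$H_a: \beta_3 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_3 - 0}{s_{\hat{\beta}_3}} = \dfrac{-.00386}{.00081} = -4.77$.
Since the observed value of the test statistic falls in the rejection region
($t = -4.77 < -2.365$), $H_0$ is rejected. There is sufficient evidence to indicate that there is a
linear relationship between Sept./Aug. rainfall and the logarithm of price, adjusting for all other
variables.

$H_0: \beta_4 = 0$
$H_a: \beta_4 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_4 - 0}{s_{\hat{\beta}_4}} = \dfrac{.0001173}{.000482} = 0.24$.
Since the observed value of the test statistic does not fall in the rejection region
($t = 0.24 \not> 2.365$), $H_0$ is not rejected. There is insufficient evidence to indicate that there is a
linear relationship between rainfall in months preceding vintage and the logarithm of price,
adjusting for all other variables.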

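Model 3:
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_1 - 0}{s_{\hat{\beta}_1}} = \dfrac{.0240}{.00747} = 3.21$.
Since no $\alpha$ was given, we will use $\alpha = .05$. The rejection region requires $\alpha/2 = .05/2 = .025$
in each tail of the t distribution. From Table 6, Appendix A, with
df $= n - (k+1) = 12 - (5+1) = 6$, $t_{.025} = 2.447$. The rejection region is $t < -2.447$ or
$t > 2.447$.
Since the observed value of the test statistic falls in the rejection region ($t = 3.21 > 2.447$), $H_0$ is
rejected. There is sufficient evidence to indicate that there is a linear relationship between
vintage and the logarithm of price, adjusting for all other variables.

$H_0: \beta_2 = 0$
$H_a: \beta_2 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_2 - 0}{s_{\hat{\beta}_2}} = \dfrac{.608}{.116} = 5.24$.
Since the observed value of the test statistic falls in the rejection region ($t = 5.24 > 2.447$), $H_0$ is
rejected. There is sufficient evidence to indicate that there is a linear relationship between
average growing season temperature and the logarithm of price, adjusting for all other variables.

$H_0: \beta_3 = 0$
$H_a: \beta_3 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_3 - 0}{s_{\hat{\beta}_3}} = \dfrac{-.00380}{.00095} = -4.00$.
Since the observed value of the test statistic falls in the rejection region
($t = -4.00 < -2.447$), $H_0$ is rejected. There is sufficient evidence to indicate that there is a
linear relationship between Sept./Aug. rainfall and the logarithm of price, adjusting for all other
variables.

$H_0: \beta_4 = 0$
$H_a: \beta_4 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_4 - 0}{s_{\hat{\beta}_4}} = \dfrac{.00115}{.000505} = 2.28$.
Since the observed value of the test statistic does not fall in the rejection region
($t = 2.28 \not> 2.447$), $H_0$ is not rejected. There is insufficient evidence to indicate that there is a
linear relationship between rainfall in months preceding vintage and the logarithm of price,
adjusting for all other variables.

$H_0: \beta_5 = 0$
$H_a: \beta_5 \neq 0$
The test statistic is $t = \dfrac{\hat{\beta}_5 - 0}{s_{\hat{\beta}_5}} = \dfrac{.00765}{.565} = 0.014$.
Since the observed value of the test statistic does not fall in the rejection region
($t = 0.014 \not> 2.447$), $H_0$ is not rejected. There is insufficient evidence to indicate that there is a
linear relationship between average September temperature and the logarithm of price, adjusting
for all other variables.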
b.
Model 1:
$\hat{\beta}_1 = .0354$, $e^{.0354} - 1 = .036$
We estimate that the mean price will increase by 3.6% for each additional increase of 1 unit of $x_1$,
vintage year.
Model 2:
$\hat{\beta}_1 = .0238$, $e^{.0238} - 1 = .024$
We estimate that the mean price will increase by 2.4% for each additional increase of 1 unit of
$x_1$, vintage year (with all other variables held constant).
$\hat{\beta}_2 = .616$, $e^{.616} - 1 = .852$
We estimate that the mean price will increase by 85.2% for each additional increase of 1 unit of
$x_2$, average growing season temperature in °C (with all other variables held constant).
$\hat{\beta}_3 = -.00386$, $e^{-.00386} - 1 = -.004$
We estimate that the mean price will decrease by .4% for each additional increase of 1 unit of $x_3$,
Sept./Aug. rainfall in cm (with all other variables held constant).
$\hat{\beta}_4 = .0001173$, $e^{.0001173} - 1 = .0001$
We estimate that the mean price will increase by .01% for each additional increase of 1 unit of
$x_4$, rainfall in months preceding vintage in cm (with all other variables held constant).
Model 3:
$\hat{\beta}_1 = .0240$, $e^{.0240} - 1 = .024$
We estimate that the mean price will increase by 2.4% for each additional increase of 1 unit of
$x_1$, vintage year (with all other variables held constant).
$\hat{\beta}_2 = .608$, $e^{.608} - 1 = .837$
We estimate that the mean price will increase by 83.7% for each additional increase of 1 unit of
$x_2$, average growing season temperature in °C (with all other variables held constant).
$\hat{\beta}_3 = -.00380$, $e^{-.00380} - 1 = -.004$
We estimate that the mean price will decrease by .4% for each additional increase of 1 unit of $x_3$,
Sept./Aug. rainfall in cm (with all other variables held constant).
$\hat{\beta}_4 = .00115$, $e^{.00115} - 1 = .001$
We estimate that the mean price will increase by .1% for each additional increase of 1
unit of $x_4$, rainfall in months preceding vintage in cm (with all other variables held constant).
$\hat{\beta}_5 = .00765$, $e^{.00765} - 1 = .008$
We estimate that the mean price will increase by .8% for each additional increase of 1
unit of $x_5$, average Sept. temperature in °C (with all other variables held constant).
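Each interpretation in part b applies the same transformation: when the dependent variable is $\ln(y)$, a coefficient $\hat{\beta}$ corresponds to a percentage change of $(e^{\hat{\beta}} - 1) \times 100$ in the mean of $y$ per unit increase in the predictor. A small sketch, using the Model 2 estimates quoted above (the function and dictionary names are illustrative):

```python
import math


def percent_change(beta_hat):
    """Percentage change in mean y per unit increase in x for a log-linear model."""
    return (math.exp(beta_hat) - 1) * 100


# Model 2 estimates as quoted in the solution
model_2 = {
    "vintage year": 0.0238,
    "growing season temperature": 0.616,
    "Sept./Aug. rainfall": -0.00386,
    "rainfall preceding vintage": 0.0001173,
}

for name, beta_hat in model_2.items():
    print(f"{name}: {percent_change(beta_hat):+.2f}%")
```

For small coefficients the percentage change is close to $100\hat{\beta}$, which is why the rainfall effects round to about $-.4$% and $.01$%; for larger coefficients such as .616 the exponential form matters.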
c.
Model 2 is preferable.
4.59
a.
$\hat{y} = 80.22 + 156.5x_1 - 42.3x_1^2 + 272.84x_2 + 760.1x_1x_2 + 47.0x_1^2x_2$
b.
The global F-test is used to test $H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0$. The test statistic
$F = 417.05$, with p-value $\approx 0 < \alpha = .01$, shows that there is significant evidence that the prediction
equation is useful for predicting transcript copy number ($y$).
c.
Test $H_0: \beta_2 = 0$ against $H_a: \beta_2 \neq 0$. We see from the printout that $t = .34$, p-value $= .734 > \alpha = .05$.
There is insufficient evidence that transcript copy number ($y$) is curvilinearly related to the proportion
of RNA. When testing $H_0: \beta_5 = 0$ against $H_a: \beta_5 \neq 0$, this hypothesis is also not rejected.
4.61
Models (a) and (b) are nested. The complete model is $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ and the reduced
model is $E(y) = \beta_0 + \beta_1 x_1$.
Models (a) and (d) are nested. The complete model is $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$ and the
reduced model is $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$.
Models (a) and (e) are nested. The complete model is
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2$ and the reduced model is
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$.
Models (b) and (c) are nested. The complete model is $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2$ and the reduced
model is $E(y) = \beta_0 + \beta_1 x_1$.
Models (b) and (d) are nested. The complete model is $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$ and the
reduced model is $E(y) = \beta_0 + \beta_1 x_1$.
Models (b) and (e) are nested. The complete model is
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2$ and the reduced model is $E(y) = \beta_0 + \beta_1 x_1$.
Models (c) and (e) are nested. The complete model is
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2$ and the reduced model is
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2$.
Models (d) and (e) are nested. The complete model is
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2$ and the reduced model is
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$.
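4.63
a.
$R^2 = .101$: 10.1% (55.5%) of the sample variation in aggression score is explained
by Model 1 (Model 2).
b.
$H_0: \beta_5 = \beta_6 = \beta_7 = \beta_8 = 0$
c.
Yes, the two models are nested since all of the terms of the first model and four additional terms
are included in the second.
d.
Since the p-value ($< .001$) is less than $\alpha = .05$, we reject $H_0$ and conclude that the additional terms
contribute to the prediction of $y$.
e.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7 + \beta_8 x_8$
$\quad + \beta_9 x_5 x_6 + \beta_{10} x_5 x_7 + \beta_{11} x_5 x_8 + \beta_{12} x_6 x_7 + \beta_{13} x_6 x_8 + \beta_{14} x_7 x_8$
f.
Fail to reject $H_0: \beta_9 = \beta_{10} = \dots = \beta_{14} = 0$ since the p-value $> .10$. This indicates that Model 3 is
not significantly more effective in predicting $y$.
4.65
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7 + \beta_8 x_8$
$\quad + \beta_9 x_9 + \beta_{10} x_{10} + \beta_{11} x_{11}$
b.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7$
$\quad + \beta_8 x_8 + \beta_9 x_9 + \beta_{10} x_{10} + \beta_{11} x_{11}$
$\quad + \beta_{12} x_1 x_9 + \beta_{13} x_1 x_{10} + \beta_{14} x_1 x_{11}$
$\quad + \beta_{15} x_2 x_9 + \beta_{16} x_2 x_{10} + \beta_{17} x_2 x_{11}$
$\quad + \beta_{18} x_3 x_9 + \beta_{19} x_3 x_{10} + \beta_{20} x_3 x_{11}$
$\quad + \beta_{21} x_4 x_9 + \beta_{22} x_4 x_{10} + \beta_{23} x_4 x_{11}$
$\quad + \beta_{24} x_5 x_9 + \beta_{25} x_5 x_{10} + \beta_{26} x_5 x_{11}$
$\quad + \beta_{27} x_6 x_9 + \beta_{28} x_6 x_{10} + \beta_{29} x_6 x_{11}$
$\quad + \beta_{30} x_7 x_9 + \beta_{31} x_7 x_{10} + \beta_{32} x_7 x_{11}$
$\quad + \beta_{33} x_8 x_9 + \beta_{34} x_8 x_{10} + \beta_{35} x_8 x_{11}$
c.
To test for interaction, we would use the hypotheses:

$H_0: \beta_{12} = \beta_{13} = \beta_{14} = \dots = \beta_{35} = 0$
$H_a$: At least one $\beta_i \neq 0$, $i = 12, 13, 14, \dots, 35$

We would compare the complete model in part b to the reduced model in part a.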
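The 24 interaction terms in part b follow a regular pattern: each of $x_1, \dots, x_8$ is crossed with each of $x_9$, $x_{10}$, $x_{11}$, and the coefficients are numbered consecutively from 12. A short sketch that enumerates them, which is one way to double-check the subscripts and the number of parameters tested in part c:

```python
from itertools import product

first_set = range(1, 9)     # x1 .. x8
second_set = range(9, 12)   # x9 .. x11

terms = []
beta_index = 12             # the first interaction coefficient is beta_12
for i, j in product(first_set, second_set):
    terms.append((beta_index, i, j))
    beta_index += 1

for index, i, j in terms:
    print(f"beta_{index} * x{i} * x{j}")

# Number of parameters set to zero under H0 (numerator df of the nested F-test)
num_tested = len(terms)
print("parameters tested:", num_tested)
```

The count of tested parameters, 24, is the numerator degrees of freedom when the complete model of part b is compared with the reduced model of part a.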
4.67

a.
The hypothesized alternative model is

E ( y ) = 0 + 1 x1 + 2 x2 + 3 x3 + 4 x4 + 5 x5 + 6 x6 + 7 x7 + 8 x8 + 9 x9 + 10 x10
b.
The null hypothesis would be H 0 : 3 = 4 = 5 = 6 = 7 = 8 = 9 = 10 = 0
c.
The test was statistically significant. Thus, H 0 was rejected. There is sufficient evidence to
indicate that at least one of the control variables contributes to the prediction of SAT-Math
scores.
d.
R 2adj = .79. 79% of the sample variability of the SAT-Math scores around their means is
explained by the proposed model relating SAT-Math scores to the 10 independent variables,
adjusting for the sample size and the number of parameters in the model.
e.
For confidence coefficient .95, $\alpha = .05$ and $\alpha/2 = .05/2 = .025$. From Table 2, Appendix D,
with df $= n - (k + 1) = 3492 - (10 + 1) = 3481$, $t_{.025} = 1.96$. The 95% confidence interval is:
$\hat{\beta}_2 \pm t_{.025} s_{\hat{\beta}_2} \Rightarrow 14 \pm 1.96(3) \Rightarrow 14 \pm 5.88 \Rightarrow (8.12, 19.88)$

We are 95% confident that the difference in the mean SAT-Math scores between students who
were coached and those who were not is between 8.12 and 19.88 points, holding all the other
variables constant.
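
The interval arithmetic in part e can be reproduced with a few lines of Python. This is only a sketch; the helper name `t_interval` is an assumption, and the critical value 1.96 is taken from the solution rather than computed.

```python
def t_interval(estimate, std_error, t_crit):
    """Return (lower, upper) for estimate +/- t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


# Values from part e: estimate 14, standard error 3, t_.025 = 1.96.
lower, upper = t_interval(14, 3, 1.96)
print(round(lower, 2), round(upper, 2))
```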
f.
No. From Exercise 4.48, the confidence interval for $\beta_2$ was (13.12, 24.88). In part e, the
confidence interval for $\beta_2$ was (8.12, 19.88). Even though coaching is significant in both
models, the change in the mean SAT-Math scores is not as great if the control variables are
added to the model. Notice that each point estimate is within the confidence interval of the other.
g.
The complete model would be:

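$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7 + \beta_8 x_8 + \beta_9 x_9 + \beta_{10} x_{10}$
$\quad + \beta_{11} x_2 x_1 + \beta_{12} x_2 x_3 + \beta_{13} x_2 x_4 + \beta_{14} x_2 x_5 + \beta_{15} x_2 x_6 + \beta_{16} x_2 x_7 + \beta_{17} x_2 x_8 + \beta_{18} x_2 x_9 + \beta_{19} x_2 x_{10}$
h.
The null hypothesis would be

$H_0: \beta_{11} = \beta_{12} = \beta_{13} = \beta_{14} = \beta_{15} = \beta_{16} = \beta_{17} = \beta_{18} = \beta_{19} = 0$. Use a partial (nested model)
F-test.
4.69
a.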
Multiple t-tests result in an increased Type I error rate.
b.
$H_0: \beta_2 = \beta_5 = 0$
c.
$F = \dfrac{(SSE_R - SSE_C)/2}{MSE_C} = \dfrac{(89171 - 88119)/2}{2961} = .06$

Since F = .06 yields a p-value = .9423, which exceeds any significance level that we
would use, we do not reject the null hypothesis. There is insufficient evidence that the two terms
tested in $H_0$ contribute to the prediction of the transcript copy number.
4.71
a.
To determine if the AD score is positively related to assertiveness level, once age and length of
disability are accounted for, we test:
$H_0: \beta_1 = 0$
$H_a: \beta_1 > 0$
The p-value is p = .0001/2 = .00005. Since the p-value is less than $\alpha$
(p = .00005 < .05), $H_0$ is rejected. There is sufficient evidence to indicate that the AD score
is positively related to assertiveness level, once age and length of disability are accounted for,
at $\alpha = .05$.
b.
To determine if age is related to assertiveness level, once AD score and length of disability are
accounted for, we test:
$H_0: \beta_2 = 0$
$H_a: \beta_2 \ne 0$
The p-value is p = .9620. Since the p-value is greater than $\alpha$ (p = .9620 > .05), $H_0$ is not
rejected. There is insufficient evidence to indicate that age is related to assertiveness
level, once AD score and length of disability are accounted for, at $\alpha = .05$.
c.
To determine if length of disability is positively related to assertiveness level, once AD score and
age are accounted for, we test:
$H_0: \beta_3 = 0$
$H_a: \beta_3 > 0$
The p-value is p = .0576/2 = .0288. Since the p-value is less than $\alpha$ (p = .0288 < .05), $H_0$
is rejected. There is sufficient evidence to indicate that length of disability is positively related
to assertiveness level, once the AD score and age are accounted for, at $\alpha = .05$.
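
Parts a and c halve the two-tailed p-value from the printout to get a one-tailed p-value. A minimal Python sketch of that rule follows. The function name `one_tailed_p` is hypothetical, and the sketch assumes the estimated coefficient lies in the direction stated in $H_a$ (positive here), which is what halving presumes; if the estimate pointed the other way, the one-tailed p-value would be 1 minus half the two-tailed value.

```python
def one_tailed_p(two_tailed_p, estimate_in_ha_direction):
    """Convert a two-tailed t-test p-value to a one-tailed p-value.

    estimate_in_ha_direction: True if the sign of the estimate agrees
    with the direction of the alternative hypothesis.
    """
    half = two_tailed_p / 2
    return half if estimate_in_ha_direction else 1 - half


# Two-tailed p-values quoted in parts a and c; a positive estimate is assumed.
p_ad_score = one_tailed_p(0.0001, True)
p_length = one_tailed_p(0.0576, True)
print(p_ad_score, p_length)
```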
4.73
a.
Estimate of $\beta_1$
b.
The prediction equation is less reliable for values of the x's outside the range of the sample data.
4.75
a.
The first order model for $E(y)$ as a function of the first five independent variables is:
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5$
b.
To test the utility of the model, we test:

$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0$
$H_a$: At least one $\beta_i \ne 0$, $i = 1, 2, 3, 4, 5$
The test statistic is F = 34.47.
The p-value is p < .001. Since the p-value is so small, there is sufficient evidence to indicate
the model is useful for predicting GSI for any $\alpha > .001$.
c.
$R^2 = .469$. 46.9% of the variability in the GSI scores is explained by the model
including the first five independent variables.
d.
The first order model for $E(y)$ as a function of the first seven independent variables is:
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7$
e.
$R^2 = .603$. 60.3% of the variability in the GSI scores is explained by the model including
the first seven independent variables.
f.
Since the p-values associated with the variables DES and PDEQ-SR are both less than .001,
there is evidence that both variables contribute to the prediction of GSI, adjusted for all the other
variables already in the model, for any $\alpha > .001$.
4.77
a.
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$
b.
To determine if the effect of treatment on spelling score depends on disease intensity, we test:
$H_0: \beta_3 = 0$
$H_a: \beta_3 \ne 0$
The beta estimate is $\hat{\beta}_3 = 1.6$.
The p-value is p = .02. Since the p-value is less than $\alpha$ (p = .02 < .05), $H_0$ is rejected.
There is sufficient evidence to indicate that the effect of treatment on spelling score depends on
disease intensity at $\alpha = .05$.
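
To see what a nonzero interaction coefficient implies, the following Python sketch evaluates the slope of $E(y)$ with respect to $x_1$ at several values of $x_2$ for a model of the form in part a. The interaction value 1.6 is the estimate as printed in part b; the main-effect coefficient `B1` and the function name `slope_of_x1` are hypothetical, introduced only for illustration.

```python
def slope_of_x1(b1, b3, x2):
    """Change in E(y) per 1-unit increase in x1 at a fixed x2, for
    E(y) = b0 + b1*x1 + b2*x2 + b3*x1*x2."""
    return b1 + b3 * x2


B1 = 2.0   # hypothetical main-effect coefficient (not given in the solution)
B3 = 1.6   # interaction estimate as printed in part b

for x2 in (0, 1, 2):
    print(x2, slope_of_x1(B1, B3, x2))
```

Because the slope changes with `x2`, no single number describes the effect of $x_1$ on its own.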
c.
Since the two variables interact, the main effects may be masked by the interaction effect.
Thus, tests on main effects should not be made, and the coefficients of the
main effects should be interpreted with caution. Since the independent variables interact, the
effect of one independent variable on the dependent variable depends on the level of the second
independent variable.
4.79
a.
Since as the level of $x_2$ goes from 0 to 9, we expect the level of funniness to increase, then level
off, then decrease, we are expecting to see negative or downward curvature. The sign we expect
to see on $\beta_2$ is negative.
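
The link between a negative squared-term coefficient and a rise, plateau, and fall can be illustrated with a short Python sketch of a generic quadratic mean function. All three coefficients below are hypothetical and are not estimates from the exercise; the helper names are likewise assumptions.

```python
def quadratic_mean(x, b0, b1, b2):
    """E(y) = b0 + b1*x + b2*x^2."""
    return b0 + b1 * x + b2 * x ** 2


def turning_point(b1, b2):
    """x-value where the quadratic stops rising (b2 < 0) or falling (b2 > 0)."""
    return -b1 / (2 * b2)


# Hypothetical coefficients with a negative squared term (downward curvature).
B0, B1, B2 = 1.0, 0.9, -0.1

peak = turning_point(B1, B2)
values = [quadratic_mean(x, B0, B1, B2) for x in range(10)]  # x = 0, 1, ..., 9
print(peak, values)
```

With these illustrative values the mean rises up to the turning point inside the 0 to 9 range and declines afterwards, which is the pattern described in part a.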
b.
To test if model 1 is useful for predicting the funniness rating, we test:

$H_0: \beta_1 = \beta_2 = 0$
The test statistic is F = 1.60 (given).
The rejection region requires $\alpha = .05$ in the upper tail of the F distribution with numerator
df $= k = 2$ and denominator df $= n - (k + 1) = 32 - (2 + 1) = 29$. From Table 4, Appendix C,
we find $F_{.05} = 3.33$. Since the observed test statistic does not fall in the rejection region
(F = 1.60 < 3.33), $H_0$ is not rejected. There is not sufficient evidence to indicate the model is
useful at $\alpha = .05$.
c.
To test if model 2 is useful for predicting the funniness rating, we test:

$H_0: \beta_1 = \beta_2 = 0$
The test statistic is F = 1.61 (given).

The rejection region requires $\alpha = .05$ in the upper tail of the F distribution with numerator
df $= k = 2$ and denominator df $= n - (k + 1) = 32 - (2 + 1) = 29$. From Table 4, Appendix
C, we find $F_{.05} = 3.33$. Since the observed test statistic does not fall in the rejection region
(F = 1.61 < 3.33), $H_0$ is not rejected. There is not sufficient evidence to indicate the model is
useful at $\alpha = .05$.
4.81
To determine if there is an upward concave curvilinear relationship, we test:

$H_0: \beta_2 = 0$
$H_a: \beta_2 > 0$
The test statistic is t = 4.20, with p-value = .0085/2 $\approx$ .004. Since the p-value is less than
$\alpha = .10$, reject $H_0$. There is sufficient evidence of an upward curvilinear relationship between y and x
at $\alpha = .10$.
4.83
a.
The model is $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
where $x_1 = 1$ if Communist, 0 if not
$x_2 = 1$ if Democratic, 0 if not
b.
$\beta_0 = \mu_{Dictator}$; $\beta_1 = \mu_{Communist} - \mu_{Dictator}$; $\beta_2 = \mu_{Democratic} - \mu_{Dictator}$
4.85
No. Income ($x_1$) is not significant, whereas the air carrier dummies ($x_3$ through $x_{30}$) are significant.
4.87
a.
Since temperature is measured on a numerical scale, it is quantitative.
b.
Since relative humidity is measured on a numerical scale, it is quantitative.
c.
Since organic compound is not measured on a numerical scale, it is qualitative.
d.
Let
$x_1 = 1$ if benzene, 0 if not
$x_2 = 1$ if toluene, 0 if not
$x_3 = 1$ if chloroform, 0 if not
$x_4 = 1$ if methanol, 0 if not
The model would then be:

$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4$

e.
$\beta_0 = \mu_A$; $\beta_1 = \mu_B - \mu_A$; $\beta_2 = \mu_T - \mu_A$; $\beta_3 = \mu_C - \mu_A$; $\beta_4 = \mu_M - \mu_A$.
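
The interpretation of the betas as differences from the base-level mean can be checked with a small Python sketch. The level codes follow the subscripts above (base level A; B, T, C, M for $x_1$ through $x_4$). The group means in `MEANS` are hypothetical numbers used only to show that the dummy-variable model reproduces each level's mean, and the helper names are assumptions.

```python
LEVELS = ("B", "T", "C", "M")   # levels coded by x1..x4; "A" is the base level


def dummies(level):
    """Return (x1, x2, x3, x4) for a given level; the base level maps to all zeros."""
    return tuple(1 if level == name else 0 for name in LEVELS)


def betas_from_means(means):
    """beta_0 = base-level mean; every other beta = level mean minus base-level mean."""
    b0 = means["A"]
    return [b0] + [means[name] - b0 for name in LEVELS]


def expected_y(level, betas):
    """E(y) = beta_0 + beta_1*x1 + beta_2*x2 + beta_3*x3 + beta_4*x4."""
    x = dummies(level)
    return betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))


# Hypothetical group means, for illustration only.
MEANS = {"A": 0.50, "B": 0.70, "T": 0.65, "C": 0.40, "M": 0.55}
betas = betas_from_means(MEANS)

for level in MEANS:
    print(level, dummies(level), round(expected_y(level, betas), 2))
```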
f.
To see if the mean retention coefficients of the five organic compounds differ, we test:
$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$
$H_a$: At least one $\beta_i \ne 0$ for $i = 1, 2, 3, 4$
The test statistic would be an F statistic.

4.89
a.
$H_0: \beta_5 = \beta_6 = \beta_7 = \beta_8 = 0$
$H_a$: At least one coefficient is not 0
For males:
Rejection region: $\alpha = .05$, $\nu_1 = 4$, $\nu_2 = n - (k + 1) = 235$, $F_{.05} \approx 2.37$
Reject $H_0$ if F > 2.37
For females:
Rejection region: $\alpha = .05$, $\nu_1 = 4$, $\nu_2 = n - (k + 1) = 144$, $F_{.05} \approx 2.45$
Reject $H_0$ if F > 2.45
b.
For males, reduced model: $R^2 = .218$. This implies that 21.8% of the sample variability of the
intrinsic job satisfaction scores is explained by the model containing age, education level, firm
experience, and sales experience.
For males, complete model: $R^2 = .408$. This implies that 40.8% of the sample variability of the
intrinsic job satisfaction scores is explained by the model containing age, education level, firm
experience, sales experience, contingent reward behavior, noncontingent reward behavior,
contingent punishment behavior, and noncontingent punishment behavior.
Since the $R^2$ value increased from .218 to .408 after the 4 supervisory behavior variables were
added to the model, it appears that they did have an impact on intrinsic job satisfaction for the
males.
For females, reduced model: $R^2 = .268$. This implies that 26.8% of the sample variability of the
intrinsic job satisfaction scores is explained by the model containing age, education level, firm
experience, and sales experience.
For females, complete model: $R^2 = .496$. This implies that 49.6% of the sample variability of
the intrinsic job satisfaction scores is explained by the model containing age, education level,
firm experience, sales experience, contingent reward behavior, noncontingent reward behavior,
contingent punishment behavior, and noncontingent punishment behavior.
Since the $R^2$ value increased from .268 to .496 after the 4 supervisory behavior variables were
added to the model, it appears that they did have an impact on intrinsic job satisfaction for the
females.
c.
Test statistic: $F_{males} = 13.00$, $F_{females} = 9.05$
Rejection region:
Males: $\alpha = .05$, $\nu_1 = 4$, $\nu_2 = 235$, $F_{.05} \approx 2.37$
Reject $H_0$ if F > 2.37
Females: $\alpha = .05$, $\nu_1 = 4$, $\nu_2 = 144$, $F_{.05} \approx 2.45$
Reject $H_0$ if F > 2.45
Conclusion: For both males and females, reject $H_0$ at $\alpha = .05$. There is sufficient evidence to
indicate that at least one of the four supervisory behavior variables affects intrinsic job
satisfaction.
