Beruflich Dokumente
Kultur Dokumente
Section 11-2
11-1.
a) y i = 0 + 1x1 + i
S xx = 157.42
432
14
= 25.348571
S xy = 1697.80
1
0
43(572 )
14
= 59.057143
S xy 59.057143
=
=
= 2.330
25.348571
S xx
= y x = 572 ( 2.3298017)( 43 ) = 48.013
1
14
14
b) y = 0 + 1 x
y = 48.012962 2.3298017(4.3) = 37.99
c) y = 48.012962 2.3298017(3.7) = 39.39
d) e = y y = 46.1 39.39 = 6.71
11-5.
a)
Regression Analysis - Linear model: Y = a+bX
Dependent variable: SalePrice
Independent variable: Taxes
-------------------------------------------------------------------------------Standard
T
Prob.
Parameter
Estimate
Error
Value
Level
Intercept
13.3202
2.57172
5.17948
.00003
Slope
3.32437
0.390276
8.518
.00000
-------------------------------------------------------------------------------Analysis of Variance
Source
Sum of Squares
Df Mean Square
F-Ratio Prob. Level
Model
636.15569
1
636.15569
72.5563
.00000
Residual
192.89056
22
8.76775
-------------------------------------------------------------------------------Total (Corr.)
829.04625
23
Correlation Coefficient = 0.875976
R-squared = 76.73 percent
Stnd. Error of Est. = 2.96104
2 = 8.76775
If the calculations were to be done by hand use Equations (11-7) and (11-8).
y = 13.3202 + 3.32437 x
b) y = 13.3202 + 3.32437(7.5) = 38.253
c) y = 13.3202 + 3.32437(58980
.
) = 32.9273
y = 32.9273
e = y y = 30.9 32.9273 = 2.0273
d) All the points would lie along the 45% axis line. That is, the regression model would estimate the values exactly. At
this point, the graph of observed vs. predicted indicates that the simple linear regression model provides a reasonable fit
to the data.
Predicted
45
40
35
30
25
25
30
35
40
45
50
Observed
11-9.
a) Yes, a linear regression would seem appropriate, but one or two points appear to be outliers.
9
8
7
6
5
4
3
2
1
0
60
70
80
90
100
Predictor
Constant
x
S = 1.408
Coef
-9.813
0.17148
SE Coef
2.135
0.02566
R-Sq = 71.3%
T
-4.60
6.68
P
0.000
0.000
R-Sq(adj) = 69.7%
Analysis of Variance
Source
Regression
Residual Error
Total
DF
1
18
19
SS
88.520
35.680
124.200
MS
88.520
1.982
F
44.66
P
0.000
11-11.
40
30
20
10
0
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
1.8
Predictor
Constant
x
S = 3.716
Coef
0.470
20.567
SE Coef
1.936
2.142
R-Sq = 85.2%
T
0.24
9.60
P
0.811
0.000
R-Sq(adj) = 84.3%
Analysis of Variance
Source
Regression
Residual Error
Total
DF
1
16
17
SS
1273.5
220.9
1494.5
b)
2 = 13.81
y = 0.470467 + 20.5673 x
c)
d)
y = 10.1371 e = 1.6629
MS
1273.5
13.8
F
92.22
Section 11-4
11-21.
H 0 : 1 = 0
H1 : 1 0
1
se( 1 )
t0 =
3.32437
= 8.518
0.390276
8) Since 8.518 > 2.074 reject H 0 and conclude the model is useful = 0.05.
P
0.000
MSR
SS R / 1
=
MSE SSE / ( n 2)
=
S xx
c) se (
)=
1
)=
se (
0
8 . 7675
= . 39027
57 . 5631
1
x
2 +
=
n
S
xx
1
6 . 4049 2
8 . 7675
+
= 2 .5717
24 57 . 5631
H 0 : 0 = 0
H 1 : 0 0
0
)
se(
0
H 0 : 1 = 0
H 1 : 1 0
= 0.05
f 0 = 44.6567
f .05 ,1,18 = 4.416
f 0 > f ,1,18
) = 0.0256613
se(
1
) = 2.13526
se(
0
c)
H 0 : 0 = 0
H1 : 0 0
= 0.05
t0 = 4.59573
t.025 ,18 = 2.101
| t0 |> t / 2 ,18
Therefore, reject H0. P-value = 0.00022.
. 025 ,12
Y | x 0 t.025 ,12 2 ( 1n +
( x0 x ) 2
S xx
( 2. 5 3 .0714 ) 2
25 . 3486
y 0 t.025,12 2 (1 + n1 +
( x0 x ) 2
S xx
( 2.5 3.0714 ) 2
25.348571
11-35.
a) 9.10130 1 9.31543
b) 11.6219 0 1.04911
1 +
(2.228) 3.774609( 12
c) 500124
.
(5546.5)2
)
3308.9994
500124
.
.
14037586
498.72024 Y|x 50152776
.
0
1 +
d) 500124
.
(2.228) 3.774609(1 + 12
(55 46.5)2
)
3308.9994
500.124 4.5505644
495.57344 y0 504.67456
It is wider because the prediction interval includes error for both the fitted model and from that associated
with the future observation.
( 20 13 .3375 ) 2
1114 .6618
1
20
( 20 13 . 3375 ) 2
1114 . 6618
Section 11-7
11-43.
4.9176
5.0208
4.5429
4.5573
5.0597
3.8910
5.8980
5.6039
5.8282
5.3003
6.2712
5.9592
5.0500
8.2464
6.6969
7.7841
9.0384
5.9894
7.5422
8.7951
6.0831
8.3607
8.1400
9.1416
29.6681073
30.0111824
28.4224654
28.4703363
30.1405004
26.2553078
32.9273208
31.9496232
32.6952797
30.9403441
34.1679762
33.1307723
30.1082540
40.7342742
35.5831610
39.1974174
43.3671762
33.2311683
38.3932520
42.5583567
33.5426619
41.1142499
40.3805611
43.7102513
-3.76810726
-0.51118237
-0.52246536
-2.57033630
-0.24050041
3.64469225
-2.02732082
-3.04962324
3.20472030
0.55965587
-3.16797616
-2.23077234
-0.10825401
-3.83427422
6.31683901
1.30258260
0.53282376
4.26883165
-0.49325200
1.94164328
4.35733807
-2.21424985
-3.48056112
2.08974865
b) Assumption of normality does not seem to be violated since the data appear to fall along a straight line.
95
80
50
20
5
1
0.1
-4
-2
Residuals
c) No serious departure from assumption of constant variance. This is evident by the random pattern of the
residuals.
Plot of Residuals versus Taxes
Residuals
Residuals
-2
-2
-4
-4
26
29
32
35
38
41
44
3.8
4.8
5.8
Predicted Values
d) R 2 76.73% ;
a) R 2 = 71.27%
b) No major departure from normality assumptions.
Normal Probability Plot of the Residuals
(response is y)
3
Residual
11-47.
-1
-2
-2
6.8
Taxes
-1
Normal Score
7.8
8.8
9.8
ResidualsVersusx
(response is y)
(response is y)
Residual
Residual
-1
-1
-2
-2
60
70
80
90
100
a) R 2 = 85 . 22 %
b) Assumptions appear reasonable, but there is a suggestion that variability increases with
11-49.
Residuals Versus x
y.
(response is y)
(response is y)
Residual
Residual
Fitted Value
-5
-5
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
1.8
10
20
30
40
Fitted Value
c) Normality assumption may be questionable. There is some bending away from a straight line in the
tails of the normal probability plot.
Normal Probability Plot of the Residuals
(response is y)
Residual
-5
-2
-1
Normal Score
Section 11-10
11-55.
a) y = 0.0280411 + 0.990987 x
b) H 0 : 1 = 0
H 1 : 1 0
f 0 = 79 . 838
f . 05 ,1 ,18 = 4 . 41
f 0 >> f ,1 ,18
Reject
H0 .
= 0.05
c) r = 0 . 816 = 0 . 903
d) H 0 : = 0
H1 : 0
t0 =
= 0.05
R n2
2
t .025 ,18
1 R
= 2 . 101
0 . 90334
18
1 0 . 816
= 8 . 9345
t 0 > t / 2 ,18
Reject
H0 .
e) H 0 : = 0 . 5
= 0.05
H 1 : 0 .5
z 0 = 3 . 879
z .025 = 1 . 96
z 0 > z / 2
Reject H0 .
f) tanh(arctanh 0.90334 -
z.025
17
. .
) tanh(arctanh 0.90334 + z.025
) where z.025 = 196
17
0 . 7677 0 . 9615 .
11-59
n = 50 r = 0.62
a) H 0 : = 0
= 0.01
H1 : 0
t0 =
n2
1 r 2
0 . 62
48
= 5 . 475
1 ( 0 . 62 ) 2
t .005 , 48 = 2 . 683
t 0 > t 0 .005 , 48
Reject
H 0 . P-value 0
b) tanh(arctanh 0.62 -
z.005
47
) tanh(arctanh 0.62 +
0 .3358 0 . 8007 .
c) Yes.
11-61.
a) r = 0.933203
a)
H0 : = 0
H1 : 0
t0 =
n2
1 r 2
= 0.05
0 . 933203
15
1 ( 0 . 8709 )
= 10 . 06
H 0 : 1 = 0
H 1 : 1 0
= 0.05
z.005
47
f 0 = 101 . 16
f . 05 ,1 ,15 = 4 . 545
f 0 >> f ,1 ,15
Reject H0. Conclude that the model is significant at = 0.05. This test and the one in part b are identical.
H0 : 0 = 0
d)
H1 : 0 0
= 0.05
t 0 = 0 . 468345
t . 025 ,15 = 2 . 131
t 0 >/ t
/ 2 ,15
Residuals Versus x
(response is y)
(response is y)
Residual
Residual
0
-1
-1
-2
-2
10
20
30
40
50
15
Fitted Value
Residual
2
1
0
-1
-2
-2
-1
Normal Score
Supplemental
11-65.
a)
b)
y = 93.34 + 15.64 x
H 0 : 1 = 0
H 1 : 1 0
= 0.05
f 0 = 12.872
f 0.05,1,14 = 4.60
f 0 > f 0.05,1,14
Reject
H 0 . Conclude that 1 0
c) (7.961 1 23.322)
d) (74.758 0 111.923)
e) y = 93.34 + 15.64(2.5) = 132.44
at = 0.05.
25
( 2.5 2.325 ) 2
7.017
132.44 6.26
126.18 Y | x0 = 2.5 138.70
11-67
a)
15
10
0
0
500
1000
1500
2000
2500
3000
3500
b)
y = 0.8819 + 0.00385 x
c) H 0 : 1 = 0
H1 : 1 0
= 0.05
f 0 = 122.03
f 0 > f ,1, 48
Reject H0 . Conclude that regression model is significant at = 0.05
d) No, it seems the variance is not constant, there is a funnel shape.
Residuals Versus the Fitted Values
(response is y)
3
2
1
Residual
0
-1
-2
-3
-4
-5
0
10
Fitted Value
e)
a)
110
100
90
80
days
11-71
70
60
50
40
30
16
17
index
18
Coef
-193.0
15.296
S = 23.79
SE Coef
163.5
9.421
R-Sq = 15.8%
T
-1.18
1.62
P
0.258
0.127
R-Sq(adj) = 9.8%
Analysis of Variance
Source
Regression
Error
Total
DF
1
14
15
SS
1492.6
7926.8
9419.4
MS
1492.6
566.2
F
2.64
P
0.127
Cannot reject Ho; therefore we conclude that the model is not significant. Therefore the seasonal
meteorological index (x) is not a reliable predictor of the number of days that the ozone level exceeds 0.20
ppm (y).
95% CI on 1
1 t / 2 , n 2 se ( 1 )
d)
The normality plot of the residuals is satisfactory. However, the plot of residuals versus run order
exhibits a strong downward trend. This could indicate that there is another variable should be included in the
model, one that changes with time.
40
30
20
10
Residual
Residual
c)
0
-10
-20
-30
-5
-40
2
10
Observation Order
12
14
16
920
930
Fitted Value
940
11-75 a)
940
930
920
920
930
940
b) y
= 33.3 + 0.9636 x
c) Predictor
Constant
x
S = 4.805
Coef
33.3
0.9639
SE Coef
171.7
0.1848
R-Sq = 77.3%
T
0.19
5.22
P
0.851
0.001
R-Sq(adj) = 74.4%
Analysis of Variance
Source
Regression
Residual Error
Total
DF
1
8
9
SS
628.18
184.72
812.90
MS
628.18
23.09
F
27.21
P
0.001
Reject the hull hypothesis and conclude that the model is significant. 77.3% of the variability is explained
by the model.
d)
H 0 : 1 = 1
H 1 : 1 1
t0 =
=.05
1 1 0.9639 1
=
= 0.1953
0.1848
se( 1 )
> t a / 2,n 2 , we cannot reject Ho and we conclude that there is not enough evidence to reject the
claim that the devices produce different temperature measurements. Therefore, we assume the devices
produce equivalent measurements.
e) The residual plots to not reveal any major problems.
Residual
Normal Score
-1
-1
-2
-5
Residual
Fitted Value