
Jhay Ehidio

Applied Linear Regression


Midterm Examination

1. a.

Regression Analysis: Y versus X


Analysis of Variance

Source      DF  Adj SS  Adj MS  F-Value  P-Value
Regression   1   228.0   228.0     2.15    0.164
  X          1   228.0   228.0     2.15    0.164
Error       14  1483.0   105.9
Total       15  1711.0

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
10.2923  13.32%      7.13%       0.00%

Coefficients

Term       Coef  SE Coef  T-Value  P-Value   VIF
Constant  183.0     12.7    14.38    0.000
X         0.262    0.178     1.47    0.164  1.00

Regression Equation
Y = 183.0 + 0.262 X

Therefore, the estimated regression function is Y = 183.0 + 0.262 X.
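
The summary figures in the output can be cross-checked from the ANOVA table alone. The following is a minimal sketch in R, assuming only the rounded sums of squares and degrees of freedom printed in the table; the variable names are illustrative and do not come from the original session.

```r
# Rounded values copied from the ANOVA table for problem 1
ssr  <- 228.0    # regression sum of squares
sse  <- 1483.0   # error sum of squares
ssto <- 1711.0   # total sum of squares
df_reg <- 1
df_err <- 14
df_tot <- 15

mse      <- sse / df_err                  # error mean square
s        <- sqrt(mse)                     # residual standard deviation S
f_stat   <- (ssr / df_reg) / mse          # F statistic for H0: slope = 0
p_val    <- pf(f_stat, df_reg, df_err, lower.tail = FALSE)
r_sq     <- ssr / ssto                    # coefficient of determination
r_sq_adj <- 1 - mse / (ssto / df_tot)     # adjusted R-squared

round(c(S = s, F = f_stat, p = p_val, R_sq = r_sq, R_sq_adj = r_sq_adj), 4)
```

Because the inputs are rounded, the results should agree with the Minitab output only to about three significant figures.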

> predict(result1, newdata=newdata, interval = "confidence", level = 1-(0.05/2))
       fit      lwr      upr
1 201.2827 194.8251 207.7402
2 198.6670 190.8424 206.4915
> predict(result1, newdata=newdata, interval = "prediction", level = 1-(0.05/2))
       fit      lwr      upr
1 201.2827 174.6585 227.9069
2 198.6670 171.6786 225.6553

Therefore, with 95% family confidence (Bonferroni, each interval computed at the 97.5% level), the confidence interval for the mean of Y when X is 70 is (194.8251, 207.7402), and the confidence interval for the mean of Y when X is 60 is (190.8424, 206.4915).
Likewise, with 95% family confidence, the prediction interval for a new observation of Y when X is 70 is (174.6585, 227.9069), and the prediction interval when X is 60 is (171.6786, 225.6553).
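
The intervals can be reproduced without the raw data. The sketch below assumes that the coefficient vector and covariance matrix printed in problem 5 belong to this same fit (they match the problem 1 coefficients up to rounding), and it uses S from the Model Summary. The names `Xh`, `se_fit`, `se_new`, `conf_int`, and `pred_int` are illustrative.

```r
# Estimates copied from the matrix output in problem 5 (assumed to be the same model)
b    <- c(182.9724994, 0.2615742)
varb <- matrix(c(161.85779, -2.22164,
                 -2.22164,  0.03179449), nrow = 2, byrow = TRUE)
mse  <- 10.2923^2   # S^2 from the Model Summary
df   <- 14          # error degrees of freedom
g    <- 2           # number of simultaneous intervals

Xh     <- cbind(1, c(70, 60))                 # rows: X = 70, X = 60
fit    <- as.vector(Xh %*% b)                 # fitted values
se_fit <- sqrt(diag(Xh %*% varb %*% t(Xh)))   # standard error of the mean response
se_new <- sqrt(mse + se_fit^2)                # standard error for a new observation
tcrit  <- qt(1 - 0.05 / (2 * g), df)          # Bonferroni multiplier, same as level = 1-(0.05/2)

conf_int <- cbind(fit, lwr = fit - tcrit * se_fit, upr = fit + tcrit * se_fit)
pred_int <- cbind(fit, lwr = fit - tcrit * se_new, upr = fit + tcrit * se_new)
conf_int
pred_int
```

The multiplier `qt(1 - 0.05 / (2 * g), df)` is what makes the two intervals hold jointly at the 95% level; a single interval at 95% would use `qt(0.975, df)` instead.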

b.
[Figure: Scatterplot of Y vs X; Y axis from 180 to 220, X axis from 50 to 100]

Looking at the plot, the linear regression function does not appear to give a good fit. Most of the points
are far from the regression line.
c.

The plot displays no clear pattern among the points. This suggests that a linear regression function is appropriate and that the error variance is constant.

d.

Some of the points lie close to the line, so the plot does not clearly indicate a violation of the normality
assumption. To be sure, we use a formal test of the normality of the error terms.

The p-value is greater than the chosen level of 0.05, so we do not reject H0. There is not enough
evidence to conclude that the error terms are not normally distributed.

2.

Regression Analysis: Y versus X


Analysis of Variance

Source         DF   Adj SS   Adj MS  F-Value  P-Value
Regression      1  22426.7  22426.7   235.51    0.000
  X             1  22426.7  22426.7   235.51    0.000
Error          10    952.2     95.2
  Lack-of-Fit   7    645.5     92.2     0.90    0.593
  Pure Error    3    306.8    102.3
Total          11  23378.9

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
9.75833  95.93%     95.52%      94.05%

Coefficients

Term        Coef  SE Coef  T-Value  P-Value   VIF
Constant  153.92     8.07    19.08    0.000
X          2.417    0.157    15.35    0.000  1.00

Regression Equation
Y = 153.92 + 2.417 X

The p-value of 0.593 for lack-of-fit is greater than the chosen level of 0.01, so we do not reject H0: E{Y} = β0 + β1X.
There is not enough evidence to suggest that E{Y} ≠ β0 + β1X, that is, there is no sufficient
evidence of lack of fit of a linear regression function.
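
The lack-of-fit statistic is the ratio of the lack-of-fit mean square to the pure-error mean square. A minimal sketch in R, assuming only the rounded sums of squares from the table for problem 2; `lack_of_fit_test` is a hypothetical helper written for this illustration, not a function from any package.

```r
# Lack-of-fit F test from sums of squares and degrees of freedom
lack_of_fit_test <- function(sslf, df_lf, sspe, df_pe) {
  f_stat  <- (sslf / df_lf) / (sspe / df_pe)   # MSLF / MSPE
  p_value <- pf(f_stat, df_lf, df_pe, lower.tail = FALSE)
  list(f_stat = f_stat, p_value = p_value)
}

# Rounded values from the ANOVA table for problem 2
res <- lack_of_fit_test(sslf = 645.5, df_lf = 7, sspe = 306.8, df_pe = 3)
res$f_stat
res$p_value
res$p_value > 0.01   # TRUE means H0 (linear mean function) is not rejected at 0.01
```

The same helper can be applied to the lack-of-fit rows in the other problems by substituting their sums of squares and degrees of freedom.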
3. a.

Regression Analysis: Y versus X


Analysis of Variance

Source         DF   Adj SS   Adj MS  F-Value  P-Value
Regression      1  2224557  2224557   174.35    0.000
  X             1  2224557  2224557   174.35    0.000
Error          20   255186    12759
  Lack-of-Fit  19   254898    13416    46.58    0.115
  Pure Error    1      288      288
Total          21  2479743

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
112.957  89.71%     89.19%      86.03%

Coefficients

Term        Coef  SE Coef  T-Value  P-Value   VIF
Constant  -595.6     80.5    -7.39    0.000
X          3.567    0.270    13.20    0.000  1.00

Regression Equation
Y = -595.6 + 3.567 X

Therefore, the estimated regression function is Y = -595.6 + 3.567 X.

b.

The points follow a curve-like pattern which indicates a violation of the linearity assumption.
There is also a possible violation of the constancy of variance based on the plots.

Some of the points lie near the line while a few do not, so the plot is inconclusive about a
violation of the normality assumption.

There are no outliers, since no standardized residual is greater than 4 or less
than -4.

4. a.

Regression Analysis: Y versus X


Analysis of Variance

Source         DF   Adj SS   Adj MS  F-Value  P-Value
Regression      1  1715.14  1715.14    92.32    0.000
  X             1  1715.14  1715.14    92.32    0.000
Error          10   185.77    18.58
  Lack-of-Fit   8   151.77    18.97     1.12    0.554
  Pure Error    2    34.00    17.00
Total          11  1900.92

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
4.31014  90.23%     89.25%      83.11%

Coefficients

Term       Coef  SE Coef  T-Value  P-Value   VIF
Constant  19.20     5.17     3.71    0.004
X         3.288    0.342     9.61    0.000  1.00

Regression Equation
Y = 19.20 + 3.288 X

The estimated regression function is Y = 19.20 + 3.288 X.

The residual plot suggests a violation of the constancy of variance.

b.

Regression Analysis: Y versus X


Method

Weights  Weights

Analysis of Variance

Source         DF  Adj SS   Adj MS  F-Value  P-Value
Regression      1  8.0104  8.01045   124.53    0.000
  X             1  8.0104  8.01045   124.53    0.000
Error          10  0.6433  0.06433
  Lack-of-Fit   8  0.4600  0.05750     0.63    0.739
  Pure Error    2  0.1833  0.09163
Total          11  8.6537

Model Summary

       S    R-sq  R-sq(adj)  R-sq(pred)
0.253624  92.57%     91.82%      89.21%

Coefficients

Term       Coef  SE Coef  T-Value  P-Value   VIF
Constant  16.58     4.17     3.97    0.003
X         3.471    0.311    11.16    0.000  1.00

Regression Equation
Y = 16.58 + 3.471 X

The regression coefficients from weighted least squares (b0 = 16.58, b1 = 3.471) are not similar to those obtained using
ordinary least squares (b0 = 19.20, b1 = 3.288). Also, the estimated standard deviations of the weighted least squares coefficients (4.17 and 0.311) are less than
those of the ordinary least squares coefficients (5.17 and 0.342).
c. The transformation using the square root of Y would have yielded the same regression coefficients as
the weighted least squares.
5.

> b
            [,1]
[1,] 182.9724994
[2,]   0.2615742

Therefore, b0 = 182.9724994 and b1 = 0.2615742.


> varb
          [,1]        [,2]
[1,] 161.85779 -2.22163992
[2,]  -2.22164  0.03179449

The estimated variance of b0 is 161.85779 and the estimated variance of b1 is 0.03179449. The estimated covariance between b0 and b1 is -2.22164.
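
The diagonal of the matrix can be tied back to the regression output in problem 1. A short sketch in R, assuming the printed `varb` values as input; the names `se_b`, `cov_b`, and `cor_b` are illustrative.

```r
# Estimated variance-covariance matrix of (b0, b1), copied from the output
varb <- matrix(c(161.85779, -2.22164,
                 -2.22164,  0.03179449), nrow = 2, byrow = TRUE)

se_b  <- sqrt(diag(varb))     # standard errors of b0 and b1
cov_b <- varb[1, 2]           # estimated covariance between b0 and b1
cor_b <- cov2cor(varb)[1, 2]  # corresponding correlation

se_b
cov_b
cor_b
```

The square roots of the diagonal entries should match the SE Coef column for problem 1 (12.7 and 0.178) up to rounding, and the strongly negative correlation reflects that all X values lie well above zero.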
