
STAT 3008: Applied Linear Regression

2014-15 Term 2
Assignment #1
Due: February 3rd, 2014 (Tuesday) at 5:30pm
This assignment covers material from Chapter 1 and Sections 2.1-2.3 of the lecture notes.
You need to show your calculations in detail in order to obtain full marks.
Problem 1 [35 points]: Suppose the following regression model is fitted to a data set with
observations $\{(x_i, y_i),\ i = 1, 2, \ldots, n\}$:

$$y = \beta x^3 + e, \qquad e \sim N(0, \sigma^2)$$
(a) Based on the least squares method and the fact that $RSS \sim \sigma^2 \chi^2_{n-1}$ (df = n-1 since df = n from
the data and df = 1 from $\hat{\beta}$), compute the least squares estimates $\hat{\beta}$ and $\hat{\sigma}^2$.

(b) Is $\hat{\beta}$ an unbiased estimator for $\beta$? Verify.

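(c) Are the points $(\bar{x}, \bar{y})$ and
$$\left( \left(\overline{x^6}\right)^{1/3},\ \overline{x^3 y} \right) = \left( \left( \sum_{i=1}^{n} x_i^6 / n \right)^{1/3},\ \sum_{i=1}^{n} x_i^3 y_i / n \right)$$
on the fitted regression line?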

(d) Derive the maximum likelihood estimates (MLE) $\tilde{\beta}$ and $\tilde{\sigma}^2$.
(e) Suppose $(x_1, x_2, x_3, x_4, x_5) = (1, 2, 3, 4, 5)$ and $(y_1, y_2, y_3, y_4, y_5) = (1, 3, 11, 26, 50)$.
Compute the values of the least squares estimates $\hat{\beta}$ and $\hat{\sigma}^2$. Does the sum of the residuals
equal zero?
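
As a numerical cross-check for part (e), the following is a minimal sketch, assuming the model reads $y = \beta x^3 + e$ and that $\hat{\sigma}^2 = RSS/(n-1)$ as the degrees-of-freedom hint in part (a) suggests. The variable names are illustrative only and do not come from the lecture notes.

```python
# Least squares fit of the one-parameter model y = beta * x**3 + e.
# The model form is an assumption reconstructed from the problem statement.
x = [1, 2, 3, 4, 5]
y = [1, 3, 11, 26, 50]
n = len(x)

# Transformed predictor z_i = x_i^3, so the model is y = beta * z + e.
z = [xi ** 3 for xi in x]

# Minimising sum (y_i - beta * z_i)^2 over beta gives
# beta_hat = sum(z_i * y_i) / sum(z_i^2).
beta_hat = sum(zi * yi for zi, yi in zip(z, y)) / sum(zi ** 2 for zi in z)

# Residuals, residual sum of squares, and the variance estimate.
residuals = [yi - beta_hat * zi for zi, yi in zip(z, y)]
rss = sum(r ** 2 for r in residuals)
sigma2_hat = rss / (n - 1)  # df = n - 1: one estimated parameter

print(f"beta_hat         = {beta_hat:.5f}")
print(f"sigma2_hat       = {sigma2_hat:.5f}")
print(f"sum of residuals = {sum(residuals):.5f}")
```

The last printed value can be compared directly with zero. Note that the model has no intercept term, which is the reason the question about the residual sum is worth asking.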
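Problem 2 [10 points]: Consider the residuals $\{\hat{e}_i\}$ from the simple linear regression:

$$\hat{e}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i, \qquad i = 1, 2, \ldots, n$$

where $\hat{\beta}_1 = SXY/SXX$ and $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$ are the OLS estimates for $\beta_0$ and $\beta_1$.

Show that $\{\hat{e}_i,\ i = 1, 2, \ldots, n\}$ are uncorrelated with the predictor $\{x_i,\ i = 1, 2, \ldots, n\}$. That is,

$$\widehat{\mathrm{Cov}}(x, \hat{e}) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(\hat{e}_i - \bar{\hat{e}}) = 0.$$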
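
The identity can be checked numerically before proving it. The sketch below is only an illustration on made-up data: the helper names `ols_simple` and `sample_cov` and the values in `x_demo` and `y_demo` are hypothetical and not part of the assignment. A numerical check is not a substitute for the algebraic proof the problem asks for.

```python
def ols_simple(x, y):
    """Return (beta0_hat, beta1_hat) using beta1 = SXY/SXX, beta0 = ybar - beta1*xbar."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    beta1 = sxy / sxx
    beta0 = ybar - beta1 * xbar
    return beta0, beta1


def sample_cov(u, v):
    """Sample covariance with the 1/(n-1) normalisation."""
    n = len(u)
    ubar = sum(u) / n
    vbar = sum(v) / n
    return sum((ui - ubar) * (vi - vbar) for ui, vi in zip(u, v)) / (n - 1)


# Hypothetical data, chosen only for illustration.
x_demo = [1.0, 2.0, 4.0, 7.0, 11.0]
y_demo = [2.1, 2.9, 6.2, 8.8, 15.1]

b0, b1 = ols_simple(x_demo, y_demo)
resid = [yi - b0 - b1 * xi for xi, yi in zip(x_demo, y_demo)]

cov_xe = sample_cov(x_demo, resid)
print(f"sample covariance of x and residuals: {cov_xe:.2e}")
```

Up to floating-point rounding, the printed covariance should vanish for any data set with at least two distinct predictor values, which is what the proof has to establish in general.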


Problem 3 (R problem) [20 points]: The R library alr3 includes the segreg data, which
records the electricity consumption (in kWh) and mean temperature (in °F) for one building
on the University of Minnesota's Twin Cities campus for 39 months in 1988-1992.
(http://www.stat.cmu.edu/~roeder/stat707/=data/=data/data/Rlibraries/alr3/html/segreg.html)

Suppose that we are interested in how the electricity consumption (y=segreg$C) is affected
by the monthly mean temperature (x=segreg$Temp), primarily driven by the use of air
conditioning.
(a) Using R code similar to that on page 23 of Chapter 2, obtain the OLS estimates $\hat{\beta}_0$, $\hat{\beta}_1$ and $\hat{\sigma}^2$.
(b) Is there any outlier in the data set, if an outlier is defined as an observation $(x_i, y_i)$ with
$|\hat{e}_i| > 2\hat{\sigma}$?
(Note: A more precise definition of outlier will be introduced in Chapter 7, which
removes the impact of the outlier $(x_i, y_i)$ itself when estimating $\sigma$.)
Problem 4 [35 points]: Suppose we want to fit our data $\{(x_i, y_i),\ i = 1, 2, \ldots, n\}$ based on
the following simple linear regression model:

$$y_i = \beta_0 + \beta_1 x_i + e_i, \quad \text{with } E(e_i) = 0,\ \mathrm{Var}(e_i) = \sigma^2, \text{ and } e_i,\ i = 1, \ldots, n, \text{ uncorrelated.}$$


Given that
$$n = 16, \quad \bar{x} = 19.27496, \quad \bar{y} = 2.26215, \quad \sum_{i=1}^{n} x_i^2 = 8718.558, \quad \sum_{i=1}^{n} y_i^2 = 113.9961, \quad \sum_{i=1}^{n} x_i y_i = 929.8138.$$

(a) Compute SXY, SXX and SYY.


(b) Show that the OLS estimate $\hat{\beta}_1 = 0.08369$. What are the OLS estimates $\hat{\beta}_0$ and $\hat{\sigma}^2$?

(c) Compute $\widehat{\mathrm{Var}}(\hat{\beta}_0 \mid X)$ and $\widehat{\mathrm{Var}}(\hat{\beta}_1 \mid X)$.
(d) Suppose that (x, y)=(48.462, 2.000) is one of the observations in the data set. Based on
the definition of outlier as in Problem 3(b), do you think the point is an outlier? Explain.
Suppose that the point (48.462, 2.000) is removed from the data set, and the new OLS
estimates $\hat{\beta}_0^*$, $\hat{\beta}_1^*$ and $\hat{\sigma}^{*2}$ are obtained based on the remaining 15 observations.

(e) Show that $\hat{\beta}_1^* = 0.12883$.

(f) What are the OLS estimates $\hat{\beta}_0^*$ and $\hat{\sigma}^{*2}$?
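
The computations in parts (a), (b) and (e) use only the summary statistics, through the identities $SXX = \sum x_i^2 - n\bar{x}^2$, $SYY = \sum y_i^2 - n\bar{y}^2$ and $SXY = \sum x_i y_i - n\bar{x}\bar{y}$. The sketch below shows one way to organise that arithmetic. The helper name `centered_sums` and all variable names are hypothetical, and the snippet only checks the two slope values stated in the problem. It does not replace the written derivation.

```python
def centered_sums(n, xbar, ybar, sum_x2, sum_y2, sum_xy):
    """Return (SXX, SYY, SXY) from raw sums of squares and cross-products."""
    sxx = sum_x2 - n * xbar ** 2
    syy = sum_y2 - n * ybar ** 2
    sxy = sum_xy - n * xbar * ybar
    return sxx, syy, sxy


# Summary statistics given in the problem (all 16 observations).
n, xbar, ybar = 16, 19.27496, 2.26215
sum_x2, sum_y2, sum_xy = 8718.558, 113.9961, 929.8138

sxx, syy, sxy = centered_sums(n, xbar, ybar, sum_x2, sum_y2, sum_xy)
beta1_hat = sxy / sxx

# Remove the observation (48.462, 2.000) by subtracting its
# contribution from each raw sum, then recompute on 15 points.
x0, y0 = 48.462, 2.000
n_new = n - 1
xbar_new = (n * xbar - x0) / n_new
ybar_new = (n * ybar - y0) / n_new
sxx_new, syy_new, sxy_new = centered_sums(
    n_new, xbar_new, ybar_new,
    sum_x2 - x0 ** 2, sum_y2 - y0 ** 2, sum_xy - x0 * y0,
)
beta1_star = sxy_new / sxx_new

print(f"beta1_hat  = {beta1_hat:.5f}")
print(f"beta1_star = {beta1_star:.5f}")
```

The remaining quantities in parts (b), (c) and (f) follow from the same centered sums once the formulas for the intercept, the residual sum of squares and the variance estimates from the lecture notes are applied.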

- End of the Assignment -

