
Lecture 17.

Estimation with
heteroskedastic errors; serial correlation

In the linear regression model

Yi = β1 Xi1 + β2 Xi2 + … + βK XiK + ui ,  i = 1, …, n

the random error term is heteroskedastic if

Var(ui) = E(ui²) = σi² ,  i = 1, …, n

i.e. if the variance of the random error is
different for each observation.
Summary of previous lecture

• Errors are heteroskedastic if there are
  omitted variables that have a different
  order of magnitude across observations,
  e.g. cross-section data on US states
• The OLS estimator of the regression
  coefficients is still unbiased
• The OLS estimator is not the best estimator
• The usual formulas for the standard errors
  and t- and F-tests are not correct, i.e. the
  computer output that uses these formulas
  cannot be used
• You can find out whether the errors are
  heteroskedastic by plotting the square of
  the OLS residuals against variables
• Better is to use the LM test, which
  depends on a particular model for the
  heteroskedasticity, e.g. the BP model. The
  test statistic is the number of
  observations times the R² of the regression
  of the squared OLS residuals on a
  number of variables that are suspected to
  affect the variance of the random error
  term
• There is an alternative method to
  compute the standard errors of the OLS
  estimates that is valid even if the random
  errors are heteroskedastic
Improving on OLS

Consider the (simple) linear regression model
with heteroskedastic errors

Yi = β1 + β2 Xi2 + ui

Var(ui) = σi²

How can we transform this into a linear
regression model with a homoskedastic
random error term?

Remember that for a constant c

Var(c·ui) = c² σi²

i.e. if we multiply a variable by a constant, the
variance is multiplied by the square of that
constant.
Hence if we choose c = 1/σi we have

Var(ui/σi) = 1

and the regression model

(1)  Yi/σi = β1·(1/σi) + β2·(Xi2/σi) + ui/σi

with dependent variable Yi/σi and independent
variables 1/σi and Xi2/σi has a homoskedastic
random error term (and the same regression
coefficients β1 and β2).

If we estimate the regression coefficients in


this model by OLS we again get the BLU
estimators of the regression coefficients. Note
this model has no constant term.
The OLS estimators of β1 and β2 minimize

Σi=1..n (Yi/σi − β1/σi − β2·Xi2/σi)²  =  Σi=1..n (1/σi²)·(Yi − β1 − β2 Xi2)²

The last expression is a weighted sum of


squared residuals with weights equal to 1
over the variance of the random error.

The OLS estimators in model (1) are called


Weighted Least Squares (WLS) estimators.
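With known σi the transformation above is a one-liner. A minimal sketch (the function name `wls` is mine; `X` is assumed to contain the constant column):

```python
import numpy as np

def wls(y, X, sigma):
    """Weighted Least Squares with known error s.d. sigma_i:
    divide y and every column of X (including the constant) by
    sigma_i, then run OLS on the transformed model (1)."""
    yt = y / sigma
    Xt = X / sigma[:, None]
    return np.linalg.lstsq(Xt, yt, rcond=None)[0]
```

When all the σi are equal (homoskedastic errors) the weights cancel and this reduces to ordinary OLS.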

This is a special case of a Generalized Least


Squares (GLS) estimator. GLS estimators are
the best estimators if the assumptions 3 or 4
in the CLR model do not hold.

Problem with the WLS estimator: in general σi² is
not known.
Special case in which we can use the WLS
estimator directly: the error variance is
proportional to the square of a size variable Zi

Var(ui) = σ² Zi²

Example: the data are cross-section data on US
states and Zi is the size of the population of
state i.

Now

Var(ui/Zi) = σ²

and if we divide the dependent and
independent variables (including the
constant!) by Zi we obtain a linear regression
model with homoskedastic errors.

The resulting OLS estimator is the WLS


estimator and BLU.
In the general case we start with a model of the
variance of the random error (as we did in
deriving the LM test), e.g. the HG model

log σi² = α1 + α2 Zi2 + … + αL ZiL   (Harvey-Godfrey)

The first 2 steps are as in the LM test:

1. Estimate by OLS and obtain the OLS
   residuals ei, i = 1, …, n
2. Estimate a linear regression of log ei² on a
   constant and Zi2, …, ZiL and compute
   σ̂i² = exp(α̂1 + α̂2 Zi2 + … + α̂L ZiL)
3. Divide the dependent and independent
   variables by σ̂i and estimate the
   regression coefficients by OLS

This estimator is called the Feasible GLS or


Feasible WLS estimator.
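The three steps can be sketched as follows. This is a sketch under the Harvey-Godfrey model, not the course's software; the function name `fgls_hg` is mine, and `X` is assumed to contain the constant column:

```python
import numpy as np

def fgls_hg(y, X, Z):
    """Feasible WLS under the Harvey-Godfrey model
    log sigma_i^2 = alpha_1 + alpha_2 Z_i2 + ... + alpha_L Z_iL."""
    n = len(y)
    # Step 1: OLS residuals e_i
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    # Step 2: regress log e_i^2 on a constant and Z, fit sigma_hat_i
    Z1 = np.column_stack([np.ones(n), Z])
    a = np.linalg.lstsq(Z1, np.log(e ** 2), rcond=None)[0]
    sigma_hat = np.sqrt(np.exp(Z1 @ a))
    # Step 3: OLS on all variables divided by sigma_hat_i
    return np.linalg.lstsq(X / sigma_hat[:, None], y / sigma_hat,
                           rcond=None)[0]
```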
Surprising fact: The standard errors of the
FGLS estimators obtained in step 3 are
correct, as are the t- and F-tests in this model
(strictly this is true if the sample is large). The
fact that we use estimated variances σ̂i² does
not matter!

Application to relation between log salary


and experience (in years since Ph.D.)
Data on 222 university professors for 7
schools (UC Berkeley, UCLA, UCSD, Illinois,
Stanford, Michigan, Virginia)

Model for the error variance

log σi² = α1 + α2 Years + α3 Years²

• Estimates of α1, α2, α3
• Compare OLS estimates and FGLS
  estimates
• Compare OLS standard error,
  heteroskedasticity-consistent standard
  error, FGLS standard error
• FGLS standard error is smallest
Dependent Variable: LNRESID2
Method: Least Squares
Date: 11/12/01 Time: 23:16
Sample: 1 222
Included observations: 222

Variable   Coefficient   Std. Error   t-Statistic   Prob.

C          -6.562664     0.359239     -18.26824     0.0000
YEARS       0.235562     0.041963       5.613595    0.0000
YEARS2     -0.004776     0.001050      -4.547694    0.0000

R-squared            0.159023    Mean dependent var     -4.404772
Adjusted R-squared   0.151342    S.D. dependent var      1.952342
S.E. of regression   1.798549    Akaike info criterion   4.025259
Sum squared resid    708.4163    Schwarz criterion       4.071241
Log likelihood      -443.8037    F-statistic             20.70563
Durbin-Watson stat   1.565414    Prob(F-statistic)       0.000000
Dependent Variable: LNSALARY
Method: Least Squares
Date: 11/07/01 Time: 13:15
Sample: 1 222
Included observations: 222

Variable   Coefficient   Std. Error   t-Statistic   Prob.

C           3.809365     0.041338      92.15104     0.0000
YEARS       0.043853     0.004829       9.081645    0.0000
YEARS2     -0.000627     0.000121      -5.190657    0.0000

R-squared            0.536179    Mean dependent var      4.325410
Adjusted R-squared   0.531943    S.D. dependent var      0.302511
S.E. of regression   0.206962    Akaike info criterion  -0.299140
Sum squared resid    9.380504    Schwarz criterion      -0.253158
Log likelihood       36.20452    F-statistic             126.5823
Durbin-Watson stat   1.434005    Prob(F-statistic)       0.000000

Dependent Variable: LNSALARY


Method: Least Squares
Date: 11/07/01 Time: 13:49
Sample: 1 222
Included observations: 222
White Heteroskedasticity-Consistent Standard Errors & Covariance

Variable   Coefficient   Std. Error   t-Statistic   Prob.

C           3.809365     0.026119      145.8466     0.0000
YEARS       0.043853     0.004361       10.05599    0.0000
YEARS2     -0.000627     0.000118      -5.322369    0.0000

R-squared            0.536179    Mean dependent var      4.325410
Adjusted R-squared   0.531943    S.D. dependent var      0.302511
S.E. of regression   0.206962    Akaike info criterion  -0.299140
Sum squared resid    9.380504    Schwarz criterion      -0.253158
Log likelihood       36.20452    F-statistic             126.5823
Durbin-Watson stat   1.434005    Prob(F-statistic)       0.000000

Dependent Variable: LNSALARYWLS


Method: Least Squares
Date: 11/12/01 Time: 23:24
Sample: 1 222
Included observations: 222

Variable     Coefficient   Std. Error   t-Statistic   Prob.

CWLS          3.827501     0.020303     188.5145     0.0000
YEARSWLS      0.038216     0.003257      11.73423    0.0000
YEARS2WLS    -0.000443     8.27E-05      -5.359798   0.0000

R-squared            0.989833    Mean dependent var      41.66265
Adjusted R-squared   0.989740    S.D. dependent var      16.59118
S.E. of regression   1.680524    Akaike info criterion   3.889510
Sum squared resid    618.4915    Schwarz criterion       3.935492
Log likelihood      -428.7356    Durbin-Watson stat      1.324687
The model estimated in step 3 seems to fit better
(R² is larger). However, the dependent variables
in the OLS model and the transformed model are
different: lnSalary and lnSalary divided by
σ̂i, respectively.

To find the R² for WLS we compute the WLS
residuals

ei = log Salaryi − β̂1 − β̂2 Yearsi − β̂3 Yearsi²

where we use the FGLS estimators. These
residuals are used in the usual formula for R²
with dependent variable log Salaryi.
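In NumPy this comparable R² is a short computation (a sketch; the function name `r2_original_scale` is mine):

```python
import numpy as np

def r2_original_scale(y, X, beta):
    """R^2 from residuals e_i = y_i - x_i'beta measured against the
    ORIGINAL dependent variable, so that OLS and FGLS/WLS fits can
    be compared on the same scale."""
    e = y - X @ beta
    return 1 - np.sum(e ** 2) / np.sum((y - y.mean()) ** 2)
```

For the OLS coefficients this gives the usual R²; for FGLS coefficients it is typically a little lower, since OLS maximizes R² on the original scale by construction.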
Serial Correlation

What we will discuss

• Nature of economic time series


• Consequences of this for random error
term in regression model
• Model for serial correlation of random
error term
• Consequences for OLS estimator of
regression coefficients
• Detecting serial correlation

Most economic time series change gradually


over time

Example: US GNP (billions of 1982$) and


new housing units (thousands) for 1963-1985
(see time series graph)
[Time series graph: GNP and HOUSING, 1964-1984]
A numerical measure of the slowness of
change, or persistence, of a time series is the
autocorrelation coefficient. For a time series
Yt, t = 1, …, n the autocorrelation coefficient of
order 1 is defined as the sample correlation
between Yt and Yt−1.

The time series Yt−1, t = 2, …, n is the time series
Yt lagged by one period. The value of Yt−1 in
period t is the value of Y in period t − 1.

If Y0 is not known, the once-lagged time series
starts in period t = 2.
Define the sample average

Ȳ = (1/n) Σt=1..n Yt

then the autocorrelation coefficient of order 1
is

(2)  ρ̂1 = [ Σt=2..n (Yt − Ȳ)(Yt−1 − Ȳ) ] / [ Σt=1..n (Yt − Ȳ)² ]

Note that by the definition of the sample
correlation the denominator should be

√[ Σt=1..n (Yt − Ȳ)² · Σt=2..n (Yt−1 − Ȳ)² ]

The only difference is in the way the first
observation is included, and that difference
can be neglected. (2) is simpler.
The sample correlation between Yt and Yt−2 is
the autocorrelation coefficient of order 2, ρ̂2,
etc.
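Formula (2), generalized to order k, is straightforward to compute (a sketch; the function name `autocorr` is mine):

```python
import numpy as np

def autocorr(y, k=1):
    """Sample autocorrelation of order k, as in (2): cross-products
    of deviations from the overall mean, divided by the total sum
    of squared deviations."""
    d = y - y.mean()
    return np.sum(d[k:] * d[:-k]) / np.sum(d ** 2)
```

For a smooth, trending series such as GNP the autocorrelations are close to 1 at order 1 and fall off slowly with k.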

The autocorrelation coefficient of order k is a


measure of the (linear) relation between the
time series Yt and Yt − k .

What do you expect for the values ρˆ1 , ρˆ 2 ,K?


Why?

Example: autocorrelation for GNP series.


Correlogram of GNP

Date: 11/13/01 Time: 06:27


Sample: 1963 1985
Included observations: 23

Autocorrelation Partial Correlation AC PAC Q-Stat Prob

1   0.840    0.840   18.426   0.000
2   0.684   -0.071   31.243   0.000
3   0.568    0.042   40.521   0.000
4   0.479    0.018   47.454   0.000
5   0.372   -0.107   51.882   0.000
6   0.271   -0.044   54.358   0.000
7   0.157   -0.125   55.244   0.000
8   0.032   -0.140   55.284   0.000
Consequence for linear regression model

In the linear regression model

Yt = β1 + β2 Xt2 + ut ,  t = 1, …, n

the error term ut captures the omitted
variables that affect Y. These variables are
also economic time series and can have the
same persistence as GNP.

How do we express that ut and ut −1 are


related?

Suggestion: use a linear regression model

(3)  ut = ρ ut−1 + εt

Note: no constant because E (ut ) = E (ut −1 ) = 0


The error term ε t has the same properties as
the error term in the CLR model, in
particular

E (ε t ) = 0

Var (ε t ) = σ 2 (homoskedasticity)

for s ≠ 0 , ε t and ε t − s are uncorrelated

Such a time series is called a white noise


series (if fed to your speakers you will hear
static)

The model in (3) with ε t white noise is called


the first-order autoregressive or AR(1)
process. The parameter ρ is called the first-
order autocorrelation coefficient.
It can be shown that (3) implies that the
correlation between ut and ut−s is equal to

ρ^s

Remember that in economic time series the
correlation becomes smaller as s grows. This
happens if −1 < ρ < 1.
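A quick simulation illustrates that for an AR(1) with white-noise εt the sample autocorrelation of order s is close to ρ^s (a sketch with an arbitrarily chosen ρ = 0.8):

```python
import numpy as np

rng = np.random.default_rng(0)
rho, n = 0.8, 100_000
eps = rng.standard_normal(n)        # white noise eps_t
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + eps[t]  # AR(1): u_t = rho*u_{t-1} + eps_t

d = u - u.mean()
for s in (1, 2, 3):
    r = np.sum(d[s:] * d[:-s]) / np.sum(d ** 2)
    print(s, round(r, 3))           # sample autocorrelation, ≈ rho**s
```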
Consequences of serial correlation

• Assumptions 1 and 2 still hold: OLS


estimators unbiased
• OLS estimator not the best (BLU)
  estimator
• Usual formula for standard error
incorrect
• t- and F-tests cannot be used

Compare with consequences of


heteroskedasticity

Often: standard errors produced by


computer OLS program too small
Detecting serial/autocorrelation

• Graphical method
• Test

Graphical method

Data for 1963-1985. Dependent variable: log
of housing units started per capita.
Independent variables: log of GNP per capita
and log of the mortgage interest rate.

See regression output

To detect serial correlation we look at the


OLS residuals

• Time series graph


• Scatterplot of et and et −1

Problem: indicative but not conclusive.


Dependent Variable: LNHOUSINGCAP
Method: Least Squares
Date: 11/13/01 Time: 00:06
Sample: 1963 1985
Included observations: 23

Variable    Coefficient   Std. Error   t-Statistic   Prob.

C            2.528899     1.180472      2.142278     0.0447
LNGNPCAP    -0.066000     0.540505     -0.122109     0.9040
LNINTRATE   -0.211284     0.202894     -1.041351     0.3101

R-squared            0.094147    Mean dependent var      1.991961
Adjusted R-squared   0.003562    S.D. dependent var      0.226095
S.E. of regression   0.225692    Akaike info criterion  -0.018186
Sum squared resid    1.018735    Schwarz criterion       0.129922
Log likelihood       3.209133    F-statistic             1.039325
Durbin-Watson stat   0.913015    Prob(F-statistic)       0.372027
[Time series graph: LNHOUSINGCAP residuals, 1964-1984]

[Scatterplot: RESID01LAG vs. RESID01]