
# Lecture 17. Estimation with heteroskedastic errors; serial correlation

## Summary of previous lecture

In the linear regression model

$$Y_i = \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_K X_{iK} + u_i, \qquad i = 1, \ldots, n$$

the errors are heteroskedastic if the variance of the random error is different for each observation.

• Errors are often heteroskedastic if there are omitted variables that have a different order of magnitude across observations, e.g. cross-section data on US states
• The OLS estimator of the regression coefficients is still unbiased
• The OLS estimator is not the best estimator
• The usual formulas for the standard errors and for the t- and F-tests are not correct, i.e. computer output that uses these formulas cannot be used
• You can find out whether the errors are heteroskedastic by plotting the square of the OLS residuals against variables
• Better is to use the LM test, which depends on a particular model for the heteroskedasticity, e.g. the BP model. The test statistic is the number of observations times the $R^2$ of the regression of the squared OLS residuals on a number of variables that are suspected to affect the variance of the random error term
• There is an alternative method to compute the standard errors of the OLS estimates that is valid even if the random errors are heteroskedastic
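The LM test from the summary can be sketched in a few lines. This is a minimal illustration on synthetic data: the data-generating process and the size variable `Z` are invented for the example, and 3.84 is the 5% critical value of the chi-squared distribution with one degree of freedom (one suspected variable); only NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic cross-section where the error sd is proportional to a size variable Z
n = 200
Z = rng.uniform(1.0, 10.0, n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
u = rng.normal(size=n) * Z                      # heteroskedastic errors
y = X @ np.array([1.0, 2.0]) + u

def r_squared(y, X):
    """R^2 of an OLS regression of y on X (X includes a constant)."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return 1.0 - (e @ e) / ((y - y.mean()) @ (y - y.mean()))

# Step 1: OLS residuals from the original regression
b = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b

# Step 2: LM statistic = n times the R^2 of the regression of the
# squared residuals on a constant and the suspected variable(s)
Zmat = np.column_stack([np.ones(n), Z])
lm_stat = n * r_squared(e**2, Zmat)

# Compare with the 5% chi-squared(1) critical value of about 3.84
print(lm_stat > 3.84)
```

With errors this strongly heteroskedastic, the statistic comfortably exceeds the critical value and the test rejects homoskedasticity.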
## Improving on OLS

Consider the (simple) linear regression model with heteroskedastic errors

$$Y_i = \beta_1 + \beta_2 X_{i2} + u_i, \qquad \mathrm{Var}(u_i) = \sigma_i^2$$

How can we transform this into a linear regression model with a homoskedastic random error term?

Remember that for a constant $c$

$$\mathrm{Var}(c u_i) = c^2 \sigma_i^2$$

i.e. if we multiply a variable by a constant, the variance is multiplied by the square of that constant. Hence if we choose $c = 1/\sigma_i$ we have

$$\mathrm{Var}\!\left(\frac{u_i}{\sigma_i}\right) = 1$$

and the regression model

$$(1)\qquad \frac{Y_i}{\sigma_i} = \beta_1 \frac{1}{\sigma_i} + \beta_2 \frac{X_{i2}}{\sigma_i} + \frac{u_i}{\sigma_i}$$

with dependent variable $Y_i/\sigma_i$ and independent variables $1/\sigma_i$ and $X_{i2}/\sigma_i$ has a homoskedastic random error term (and the same regression coefficients $\beta_1$ and $\beta_2$).

If we estimate the regression coefficients in this model by OLS we again get the BLU estimators of the regression coefficients. Note that this model has no constant term.

The OLS estimators of $\beta_1$ and $\beta_2$ minimize

$$\sum_{i=1}^{n}\left(\frac{Y_i}{\sigma_i} - \beta_1\frac{1}{\sigma_i} - \beta_2\frac{X_{i2}}{\sigma_i}\right)^{2} = \sum_{i=1}^{n}\frac{1}{\sigma_i^{2}}\left(Y_i - \beta_1 - \beta_2 X_{i2}\right)^{2}$$

The last expression is a weighted sum of squared residuals with weights equal to 1 over the variance of the random error.

The OLS estimators in model (1) are called Weighted Least Squares (WLS) estimators.

This is a special case of a Generalized Least Squares (GLS) estimator. GLS estimators are the best estimators if assumption 3 or 4 of the CLR model does not hold.

Problem with the WLS estimator: in general $\sigma_i^2$ is not known.

Special case in which we can use the WLS estimator directly: the error variance is proportional to the square of a size variable $Z_i$

$$\mathrm{Var}(u_i) = \sigma^2 Z_i^2$$

Example: the data are cross-section data on US states and $Z_i$ is the size of the population of state $i$. Now

$$\mathrm{Var}\!\left(\frac{u_i}{Z_i}\right) = \sigma^2$$

and if we divide the dependent and independent variables (including the constant!) by $Z_i$ we obtain a linear regression model with homoskedastic errors.

The resulting OLS estimator is the WLS estimator and is BLU.
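A sketch of this special case on simulated data (the coefficients, the size variable, and the error scale below are invented for the illustration; only NumPy is assumed):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic state-level data: error sd proportional to a size variable Z
n = 50
Z = rng.uniform(1.0, 20.0, n)                      # e.g. population size
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + 0.1 * Z * rng.normal(size=n)   # Var(u_i) = sigma^2 * Z_i^2

# WLS: divide the dependent variable, the constant, and the regressor by Z,
# then run OLS on the transformed model (no extra constant is added)
Xw = np.column_stack([1.0 / Z, x / Z])
yw = y / Z
b_wls = np.linalg.lstsq(Xw, yw, rcond=None)[0]

print(b_wls)   # estimates of beta1 and beta2, close to (1, 2) here
```

The transformed error $u_i/Z_i$ is homoskedastic, so ordinary OLS on the divided variables is exactly the WLS estimator.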
In the general case we start with a model of the variance of the random error (as we did in deriving the LM test), e.g. the HG (Harvey-Godfrey) model

$$\log \sigma_i^2 = \alpha_1 + \alpha_2 Z_{i2} + \cdots + \alpha_L Z_{iL}$$

1. Estimate the model by OLS and obtain the OLS residuals $e_i$, $i = 1, \ldots, n$
2. Estimate the linear regression of $\log e_i^2$ on a constant and $Z_{i2}, \ldots, Z_{iL}$ and compute $\hat\sigma_i^2 = \exp(\hat\alpha_1 + \hat\alpha_2 Z_{i2} + \cdots + \hat\alpha_L Z_{iL})$
3. Divide the dependent and independent variables by $\hat\sigma_i$ and estimate the regression coefficients by OLS

This estimator is called the Feasible GLS or Feasible WLS estimator.

Surprising fact: the standard errors of the FGLS estimators obtained in step 3 are correct, as are the t- and F-tests in this model (strictly, this is true if the sample is large). The fact that we use estimated variances $\hat\sigma_i^2$ does not matter!
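The three FGLS steps can be sketched on simulated data. Everything below (the single regressor, the coefficients, the variance model with one $Z$ variable) is invented for the illustration; only NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data where log Var(u_i) is linear in Z_i (the HG model)
n = 300
Z = rng.uniform(0.0, 2.0, n)
x = rng.normal(size=n)
sigma = np.exp(0.5 * (-1.0 + 1.5 * Z))     # sd_i = exp((a1 + a2*Z_i) / 2)
y = 1.0 + 2.0 * x + sigma * rng.normal(size=n)

X = np.column_stack([np.ones(n), x])

# Step 1: OLS residuals
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b_ols

# Step 2: regress log(e^2) on a constant and Z, then predict the variances
Zmat = np.column_stack([np.ones(n), Z])
a_hat = np.linalg.lstsq(Zmat, np.log(e**2), rcond=None)[0]
sigma_hat = np.sqrt(np.exp(Zmat @ a_hat))

# Step 3: OLS on all variables divided by sigma_hat (no extra constant)
b_fgls = np.linalg.lstsq(X / sigma_hat[:, None], y / sigma_hat,
                         rcond=None)[0]

print(b_fgls)   # FGLS estimates, close to (1, 2) here
```

Note that a constant bias in $\hat\alpha_1$ only rescales all the $\hat\sigma_i$ by the same factor, which leaves the FGLS point estimates unchanged.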

## Application to relation between log salary and experience (in years since Ph.D.)

Data on 222 university professors at 7 schools (UC Berkeley, UCLA, UCSD, Illinois, Stanford, Michigan, Virginia)

$$\log \sigma_i^2 = \alpha_1 + \alpha_2 \mathrm{Years}_i + \alpha_3 \mathrm{Years}_i^2$$

• Estimates of $\alpha_1, \alpha_2, \alpha_3$
• Compare OLS estimates and FGLS estimates
• Compare OLS standard error, heteroskedasticity-consistent standard error, FGLS standard error
• The FGLS standard error is smallest
Dependent Variable: LNRESID2
Method: Least Squares
Date: 11/12/01 Time: 23:16
Sample: 1 222
Included observations: 222

| Variable | Coefficient | Std. Error | t-Statistic | Prob.  |
|----------|-------------|------------|-------------|--------|
| C        | -6.562664   | 0.359239   | -18.26824   | 0.0000 |
| YEARS    | 0.235562    | 0.041963   | 5.613595    | 0.0000 |
| YEARS2   | -0.004776   | 0.001050   | -4.547694   | 0.0000 |

R-squared 0.159023   Mean dependent var -4.404772
Adjusted R-squared 0.151342   S.D. dependent var 1.952342
S.E. of regression 1.798549   Akaike info criterion 4.025259
Sum squared resid 708.4163   Schwarz criterion 4.071241
Log likelihood -443.8037   F-statistic 20.70563
Durbin-Watson stat 1.565414   Prob(F-statistic) 0.000000
Dependent Variable: LNSALARY
Method: Least Squares
Date: 11/07/01 Time: 13:15
Sample: 1 222
Included observations: 222

| Variable | Coefficient | Std. Error | t-Statistic | Prob.  |
|----------|-------------|------------|-------------|--------|
| C        | 3.809365    | 0.041338   | 92.15104    | 0.0000 |
| YEARS    | 0.043853    | 0.004829   | 9.081645    | 0.0000 |
| YEARS2   | -0.000627   | 0.000121   | -5.190657   | 0.0000 |

R-squared 0.536179   Mean dependent var 4.325410
Adjusted R-squared 0.531943   S.D. dependent var 0.302511
S.E. of regression 0.206962   Akaike info criterion -0.299140
Sum squared resid 9.380504   Schwarz criterion -0.253158
Log likelihood 36.20452   F-statistic 126.5823
Durbin-Watson stat 1.434005   Prob(F-statistic) 0.000000

Dependent Variable: LNSALARY
Method: Least Squares
Date: 11/07/01 Time: 13:49
Sample: 1 222
Included observations: 222
White Heteroskedasticity-Consistent Standard Errors & Covariance

| Variable | Coefficient | Std. Error | t-Statistic | Prob.  |
|----------|-------------|------------|-------------|--------|
| C        | 3.809365    | 0.026119   | 145.8466    | 0.0000 |
| YEARS    | 0.043853    | 0.004361   | 10.05599    | 0.0000 |
| YEARS2   | -0.000627   | 0.000118   | -5.322369   | 0.0000 |

R-squared 0.536179   Mean dependent var 4.325410
Adjusted R-squared 0.531943   S.D. dependent var 0.302511
S.E. of regression 0.206962   Akaike info criterion -0.299140
Sum squared resid 9.380504   Schwarz criterion -0.253158
Log likelihood 36.20452   F-statistic 126.5823
Durbin-Watson stat 1.434005   Prob(F-statistic) 0.000000

Dependent Variable: LNSALARYWLS
Method: Least Squares
Date: 11/12/01 Time: 23:24
Sample: 1 222
Included observations: 222

| Variable  | Coefficient | Std. Error | t-Statistic | Prob.  |
|-----------|-------------|------------|-------------|--------|
| CWLS      | 3.827501    | 0.020303   | 188.5145    | 0.0000 |
| YEARSWLS  | 0.038216    | 0.003257   | 11.73423    | 0.0000 |
| YEARS2WLS | -0.000443   | 8.27E-05   | -5.359798   | 0.0000 |

R-squared 0.989833   Mean dependent var 41.66265
Adjusted R-squared 0.989740   S.D. dependent var 16.59118
S.E. of regression 1.680524   Akaike info criterion 3.889510
Sum squared resid 618.4915   Schwarz criterion 3.935492
Log likelihood -428.7356   Durbin-Watson stat 1.324687
The model estimated in step 3 seems to fit better ($R^2$ is larger). However, the dependent variables in the OLS model and the transformed model are different: lnSalary and lnSalary divided by $\hat\sigma_i$, respectively.

To compare the fits we must compute the residuals in the original model,

$$e_i = \log \mathrm{Salary}_i - \hat\beta_1 - \hat\beta_2 \mathrm{Years}_i - \hat\beta_3 \mathrm{Years}_i^2$$

where we use the FGLS estimators. These residuals are used in the usual formula for $R^2$ with dependent variable $\log \mathrm{Salary}$.
## Serial Correlation

• Nature of economic time series
• Consequences of this for the random error term in the regression model
• Model for serial correlation of the random error term
• Consequences for the OLS estimator of the regression coefficients
• Detecting serial correlation

Economic time series typically change slowly over time.

Example: US GNP (billions of 1982$) and new housing units (thousands) for 1963-1985 (see time series graph).

[Figure: time series plot of GNP and HOUSING, 1964-1984]
A numerical measure for the slowness of change or persistence of a time series is the autocorrelation coefficient. For a time series $Y_t,\ t = 1, \ldots, n$ the autocorrelation coefficient of order 1 is defined as the sample correlation between $Y_t$ and $Y_{t-1}$.

The time series $Y_{t-1},\ t = 2, \ldots, n$ is the time series $Y_t$ lagged by one period. The value of $Y_{t-1}$ in period $t$ is the value of $Y$ in period $t-1$. If $Y_0$ is not known, the lagged time series starts in period $t = 2$.
Define the sample average

$$\bar{Y} = \frac{1}{n}\sum_{t=1}^{n} Y_t$$

then the autocorrelation coefficient of order 1 is

$$(2)\qquad \hat\rho_1 = \frac{\sum_{t=2}^{n} (Y_t - \bar{Y})(Y_{t-1} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2}$$

Note that by the definition of the sample correlation the denominator should be

$$\sqrt{\sum_{t=1}^{n} (Y_t - \bar{Y})^2 \sum_{t=2}^{n} (Y_{t-1} - \bar{Y})^2}$$

The only difference is in the way the first observation is included, and that difference can be neglected; (2) is simpler.
The sample correlation between $Y_t$ and $Y_{t-2}$ is the autocorrelation coefficient of order 2, $\hat\rho_2$, etc.

The autocorrelation coefficient of order $k$ is a measure of the (linear) relation between the time series $Y_t$ and $Y_{t-k}$. Why?
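Formula (2) generalizes directly to order $k$. A minimal sketch in Python (NumPy assumed; the smooth linear trend below is an invented stand-in for a persistent series like GNP):

```python
import numpy as np

def autocorr(y, k):
    """Autocorrelation coefficient of order k, as in formula (2)."""
    y = np.asarray(y, dtype=float)
    ybar = y.mean()
    # Pair Y_t with Y_{t-k} for t = k, ..., n-1
    num = np.sum((y[k:] - ybar) * (y[:-k] - ybar))
    den = np.sum((y - ybar) ** 2)
    return num / den

# A slowly changing (persistent) series has an order-1 coefficient near 1
t = np.arange(20)
trend = 1000.0 + 50.0 * t
print(round(autocorr(trend, 1), 2))   # prints 0.85
```

For a white-noise series the same function returns a value near 0, which is what makes the coefficient useful as a persistence measure.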

Example: autocorrelation coefficients for the GNP series.

Correlogram of GNP

Date: 11/13/01 Time: 06:27
Sample: 1963 1985
Included observations: 23

| Lag | AC    | PAC    | Q-Stat | Prob  |
|-----|-------|--------|--------|-------|
| 1   | 0.840 | 0.840  | 18.426 | 0.000 |
| 2   | 0.684 | -0.071 | 31.243 | 0.000 |
| 3   | 0.568 | 0.042  | 40.521 | 0.000 |
| 4   | 0.479 | 0.018  | 47.454 | 0.000 |
| 5   | 0.372 | -0.107 | 51.882 | 0.000 |
| 6   | 0.271 | -0.044 | 54.358 | 0.000 |
| 7   | 0.157 | -0.125 | 55.244 | 0.000 |
| 8   | 0.032 | -0.140 | 55.284 | 0.000 |
## Consequence for the linear regression model

In the linear regression model

$$Y_t = \beta_1 + \beta_2 X_{t2} + u_t, \qquad t = 1, \ldots, n$$

the error term $u_t$ captures the omitted variables that affect $Y$. These variables are also economic time series and can have the same persistence as GNP. How are $u_t$ and $u_{t-1}$ related?

Suggestion: use a linear regression model

$$(3)\qquad u_t = \rho u_{t-1} + \varepsilon_t$$

Note: there is no constant because $E(u_t) = E(u_{t-1}) = 0$.

The error term $\varepsilon_t$ has the same properties as the error term in the CLR model, in particular

$$E(\varepsilon_t) = 0, \qquad \mathrm{Var}(\varepsilon_t) = \sigma^2 \ \text{(homoskedasticity)}$$

Such a time series is called a white noise series (if fed to your speakers you will hear static).

The model in (3) with $\varepsilon_t$ white noise is called the first-order autoregressive or AR(1) process. The parameter $\rho$ is called the first-order autocorrelation coefficient.

It can be shown that (3) implies that the correlation between $u_t$ and $u_{t-s}$ is equal to $\rho^s$.

Remember that in economic time series the correlation becomes smaller with $s$. This happens if $-1 < \rho < 1$.
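A quick simulation illustrates the $\rho^s$ result (the value of $\rho$ and the sample size are invented for the example; only NumPy is assumed):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate a long AR(1) series u_t = rho*u_{t-1} + eps_t with white-noise eps
rho, n = 0.8, 100_000
eps = rng.normal(size=n)
u = np.empty(n)
u[0] = eps[0]
for t in range(1, n):
    u[t] = rho * u[t - 1] + eps[t]

# The sample correlation between u_t and u_{t-s} should be close to rho**s
corrs = {s: np.corrcoef(u[s:], u[:-s])[0, 1] for s in (1, 2, 3)}
for s in (1, 2, 3):
    print(s, round(corrs[s], 2), round(rho**s, 2))
```

With $-1 < \rho < 1$ the implied correlations $0.8, 0.64, 0.512, \ldots$ shrink geometrically with $s$, matching the pattern in the GNP correlogram.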
## Consequences of serial correlation

• Assumptions 1 and 2 still hold: the OLS estimators are unbiased
• The OLS estimator is not the best (BLU) estimator
• The usual formula for the standard errors is incorrect
• t- and F-tests cannot be used

Compare these with the consequences of heteroskedasticity.

Often the standard errors produced by a computer OLS program are too small.
## Detecting serial/autocorrelation

• Graphical method
• Test

## Graphical method

Data for 1963-1985. The dependent variable is the log of housing units started per capita; the independent variables are the log of GNP per capita and the log of the mortgage interest rate.

OLS residuals:

• Time series graph
• Scatterplot of $e_t$ and $e_{t-1}$

Problem: indicative but not conclusive.
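A numerical companion to the scatterplot is the slope of a regression of $e_t$ on $e_{t-1}$, which estimates the first-order autocorrelation of the errors. The residual series below is simulated, not the actual housing-regression residuals; only NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical stand-in for the 23 OLS residuals: simulate a short,
# positively autocorrelated series
n = 23
e = np.empty(n)
e[0] = rng.normal()
for t in range(1, n):
    e[t] = 0.5 * e[t - 1] + rng.normal()

# Slope of e_t on e_{t-1} without a constant (residuals have mean near 0)
rho_hat = np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)
print(rho_hat)
```

A clearly positive slope, like a cloud of points rising along the 45-degree line in the scatterplot, suggests positive serial correlation, but with 23 observations it remains indicative rather than conclusive.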

Dependent Variable: LNHOUSINGCAP
Method: Least Squares
Date: 11/13/01 Time: 00:06
Sample: 1963 1985
Included observations: 23

| Variable  | Coefficient | Std. Error | t-Statistic | Prob.  |
|-----------|-------------|------------|-------------|--------|
| C         | 2.528899    | 1.180472   | 2.142278    | 0.0447 |
| LNGNPCAP  | -0.066000   | 0.540505   | -0.122109   | 0.9040 |
| LNINTRATE | -0.211284   | 0.202894   | -1.041351   | 0.3101 |

R-squared 0.094147   Mean dependent var 1.991961
Adjusted R-squared 0.003562   S.D. dependent var 0.226095
S.E. of regression 0.225692   Akaike info criterion -0.018186
Sum squared resid 1.018735   Schwarz criterion 0.129922
Log likelihood 3.209133   F-statistic 1.039325
Durbin-Watson stat 0.913015   Prob(F-statistic) 0.372027
[Figure: time series plot of the LNHOUSINGCAP residuals, 1964-1984]
[Figure: scatterplot of RESID01LAG against RESID01]