
ECO2009: Empirical Economic Analysis

Topic 11: Autocorrelation


Gujarati and Porter, Chapter 12
Dougherty, Chapter 12

Autocorrelation

We will continue our discussion of problems which call for the use of the generalized least squares (GLS) estimator by considering an important topic called autocorrelation. Autocorrelation arises with time series data, so we will use t = 1, ..., T to denote observations (rather than i = 1, ..., n).

Classical Assumptions Once More

1. Y_t = β_0 + β_1 X_{1t} + ... + β_k X_{kt} + u_t (model is linear)
2. The X values are fixed. They are not random variables.
3. E(u_t) = 0 (mean zero errors)
4. V(u_t) = E(u_t²) = σ² (constant variance errors: homoskedasticity)
5. Cov(u_t, u_s) = E(u_t u_s) = 0 for t ≠ s (errors uncorrelated with one another)
6. Cov(u_t, X_{jt}) = E(u_t X_{jt}) = 0 (errors uncorrelated with the regressors)
7. Number of observations greater than number of parameters
8. Variability in X values
9. Model is correctly specified
10. No perfect multicollinearity
N. u_t is normally distributed.

Autocorrelated Errors

We will work with the multiple regression model under the classical assumptions, with the exception that the errors follow an autoregressive process of order 1 [AR(1)]:

u_t = ρ u_{t-1} + ε_t,

where it is ε_t which satisfies the classical assumptions. So E(ε_t) = 0, V(ε_t) = σ_ε² and cov(ε_t, ε_s) = 0 (for t ≠ s). We also assume -1 < ρ < 1. To preview later material, this restriction ensures stationarity and means you do not have to worry about problems relating to unit roots and cointegration (definitions will be provided to you later on). We will focus on the AR(1) case, but note that the AR(p) errors case is a simple extension:

u_t = ρ_1 u_{t-1} + ρ_2 u_{t-2} + ... + ρ_p u_{t-p} + ε_t
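The AR(1) process is easy to simulate, which helps build intuition for the graphical methods later in these notes. Below is a minimal Stata sketch; the value ρ = 0.8, the seed and the variable names are illustrative, not part of the lecture:

* Simulate T = 100 draws from u_t = rho*u_{t-1} + eps_t with rho = 0.8
clear
set obs 100
set seed 12345
gen t = _n
tsset t
gen eps = rnormal()                  // eps_t satisfies the classical assumptions
gen u = eps                          // initialize with u_1 = eps_1
replace u = 0.8*L.u + eps if t > 1   // replace runs top-down, so L.u is already updated
tsline u                             // plot the sample path

Because replace processes observations in order, L.u at observation t picks up the value just generated at t-1, which is what makes this one-line recursion work.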

Variances and Covariances of u_t

The assumptions above specified properties of ε_t, but we need to know the properties of u_t. We will derive the following properties in class:

E(u_t) = 0
V(u_t) = σ_ε² / (1 - ρ²)
cov(u_t, u_{t-1}) = ρ V(u_t)
cov(u_t, u_{t-s}) = ρ^s V(u_t)

Thus, we have established that the regression model with autocorrelated errors violates Assumption 5. That is, the regression errors are NOT uncorrelated with one another.
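As a preview of the in-class derivation, the variance formula follows in one step from stationarity (this is a sketch; the full argument is given in class):

V(u_t) = V(ρ u_{t-1} + ε_t)
       = ρ² V(u_{t-1}) + σ_ε²     (since ε_t is uncorrelated with u_{t-1})
       = ρ² V(u_t) + σ_ε²         (stationarity: V(u_t) = V(u_{t-1}))
⇒ V(u_t) = σ_ε² / (1 - ρ²)

This also shows why |ρ| < 1 is needed: otherwise the variance would not be finite and positive.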

Properties of OLS Estimator in the Autocorrelated Errors Case

Model: Y_t = β X_t + u_t

We will derive the mean, the variance and the distribution of the OLS estimator in class. What do these results mean for inference?
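To preview the answer: OLS remains unbiased, but the conventional OLS standard errors can be badly misleading when both the errors and the regressors are autocorrelated. Below is a hypothetical Stata Monte Carlo sketch; all names, the sample size and the parameter values are illustrative:

capture program drop ar1mc
program define ar1mc, rclass
    clear
    set obs 100
    gen t = _n
    tsset t
    gen e1 = rnormal()
    gen e2 = rnormal()
    gen x = e1                          // AR(1) regressor
    replace x = 0.8*L.x + e1 if t > 1
    gen u = e2                          // AR(1) error
    replace u = 0.8*L.u + e2 if t > 1
    gen y = 1 + 2*x + u                 // true slope is 2
    regress y x
    return scalar b  = _b[x]
    return scalar se = _se[x]
end
set seed 12345
simulate b=r(b) se=r(se), reps(500) nodots: ar1mc
summarize b se

The mean of b should be close to 2 (unbiasedness), while the standard deviation of b will exceed the average reported se, illustrating why the usual t tests are unreliable here.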

The GLS Estimator for the Autocorrelated Errors Case

Remember: GLS can be interpreted as OLS on a suitably transformed model. In this case, the appropriate transformation is referred to as quasi-differencing. To explain what this is, consider the regression model:

Y_t = α + β_1 X_{1t} + ... + β_k X_{kt} + u_t

This model will hold for every time period, so we can take it at period t-1 and multiply both sides of the equation by ρ:

ρ Y_{t-1} = ρα + ρβ_1 X_{1,t-1} + ... + ρβ_k X_{k,t-1} + ρ u_{t-1}

Subtract this equation from the original regression equation:

Y_t - ρ Y_{t-1} = α(1 - ρ) + β_1 (X_{1t} - ρX_{1,t-1}) + ... + β_k (X_{kt} - ρX_{k,t-1}) + u_t - ρ u_{t-1}

or

Y*_t = α* + β_1 X*_{1t} + ... + β_k X*_{kt} + u*_t

The GLS Estimator for the Autocorrelated Errors Case (cont.)

But u*_t = u_t - ρ u_{t-1} = ε_t satisfies the classical assumptions, so OLS on this transformed model will be GLS (which will be BLUE).

Note that the transformed variables are quasi-differenced:

Y*_t = Y_t - ρ Y_{t-1}
X*_{1t} = X_{1t} - ρ X_{1,t-1}
...

The case with ρ = 1 (which we do not consider) is called differenced; our transformation is not quite the same, so we say quasi-differenced.

The GLS Estimator for the Autocorrelated Errors Case (cont.)

One (relatively minor) issue: if our original data run from t = 1, ..., T, then Y*_1 = Y_1 - ρ Y_0 will involve Y_0 (and the same issue arises for the explanatory variables). But we do not observe such initial conditions. There are many ways of treating initial conditions. What we do (the simplest, most common thing) is work with data from t = 2, ..., T (and use the t = 1 values of the variables as initial conditions).

Summary: If we knew ρ, then we could quasi-difference the data and do OLS using the transformed data (which is equivalent to GLS), as sketched below. In practice, we rarely (if ever) know ρ. Hence, we replace ρ by an estimate ρ̂. There are several ways of getting a ρ̂; we now turn to one, called the Cochrane-Orcutt procedure.
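To make the summary concrete, here is a minimal Stata sketch of quasi-differencing when ρ is known. The data set, the variables y and x, and the value ρ = 0.5 are hypothetical, and the data are assumed to be tsset so that the lag operator L. works:

* Quasi-difference with a known rho and run OLS, which here equals GLS
scalar rho = 0.5
gen ystar = y - rho*L.y     // L.y is missing at t = 1, so only t = 2,...,T are used
gen xstar = x - rho*L.x
regress ystar xstar         // intercept estimates alpha*(1 - rho), slope estimates beta

Note that Stata automatically drops the t = 1 observation because L.y is missing there, which is exactly the treatment of initial conditions described above.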

The Cochrane-Orcutt Procedure

Remember: with autocorrelated errors, GLS is BLUE. However, OLS (on the original data) is still unbiased. The Cochrane-Orcutt procedure begins with OLS and then uses the OLS residuals to estimate ρ. It goes through the following steps (a software sketch follows the list):

1. Do an OLS regression of Y_t on an intercept, X_{1t}, ..., X_{kt} and produce the OLS residuals, û_t.
2. Do an OLS regression of û_t on û_{t-1}, which will provide a ρ̂.
3. Quasi-difference all variables to produce
   Y*_t = Y_t - ρ̂ Y_{t-1}
   X*_{1t} = X_{1t} - ρ̂ X_{1,t-1}
   ...
4. Do an OLS regression of Y*_t on an intercept, X*_{1t}, ..., X*_{kt}, thus producing GLS estimates of the coefficients.
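In Stata these four steps are packaged in the prais command; with the corc and twostep options it performs the two-step Cochrane-Orcutt procedure above. The variables are those of the empirical example at the end of these notes, and the data are assumed to be tsset on year:

* Two-step Cochrane-Orcutt estimation
tsset year
prais gfr pe ww2 pill, corc twostep
* dropping the twostep option iterates steps 2-4 until rho-hat converges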

Autocorrelation Consistent Estimators

Remember: with heteroskedasticity we discussed the heteroskedasticity consistent estimator (HCE). It is less efficient than GLS, but a correct second-best solution when GLS is difficult to implement. Similar issues hold for autocorrelated errors. There exist autocorrelation consistent estimators which allow for the correct use of OLS methods when you have autocorrelated errors. We will not explain these, but econometrics software packages include them. The most popular is the Newey-West estimator.
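In Stata, Newey-West standard errors are obtained with the newey command; the OLS point estimates are unchanged and only the standard errors are adjusted. The lag length of 4 below is purely illustrative, and the data are again assumed to be tsset on year:

* OLS coefficients with Newey-West (autocorrelation consistent) standard errors
newey gfr pe ww2 pill, lag(4)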

Detecting Autocorrelated Errors

If ρ = 0, then doing OLS on the original data is fine (OLS is BLUE). However, if ρ ≠ 0, then a GLS estimator such as the Cochrane-Orcutt estimator is better. This motivates testing H_0: ρ = 0 against H_1: ρ ≠ 0. There are several such procedures and tests; here we describe some of the most popular.

I. Graphical Method

Positive autocorrelation:
[Figure: scatter plot of û_t against û_{t-1}. With positive autocorrelation the points lie mainly in the first and third quadrants.]

I. Graphical Method (cont.)

Negative autocorrelation:
[Figure: scatter plot of û_t against û_{t-1}. With negative autocorrelation the points lie mainly in the second and fourth quadrants.]
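A sketch of how such a scatter plot can be produced in Stata; the regression variables are placeholders and the data are assumed to be tsset:

* Plot OLS residuals against their own lag
regress y x1 x2
predict uhat, resid
scatter uhat L.uhat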

I. Graphical Method: Example 1

AR(1) model: u_t = ρ u_{t-1} + ε_t, ε_t ~ N(0, 1). Is ρ = -0.8, 0.0 or 0.8?

[Figure: simulated sample path of u_t for t = 0, ..., 100]

I. Graphical Method: Example 2

AR(1) model: u_t = ρ u_{t-1} + ε_t, ε_t ~ N(0, 1). Is ρ = -0.8, 0.0 or 0.8?

[Figure: simulated sample path of u_t for t = 0, ..., 100]

I. Graphical Method: Example 3

AR(1) model: u_t = ρ u_{t-1} + ε_t, ε_t ~ N(0, 1). Is ρ = -0.8, 0.0 or 0.8?

[Figure: simulated sample path of u_t for t = 0, ..., 100]

II. Runs Test

Also called the Geary test. Nonparametric, thus no parametric model is involved.

N_1: number of positive residuals
N_2: number of negative residuals
N = N_1 + N_2
R: number of runs

Example: a sign sequence such as (---)(++++)(-)(++++)(--) contains 5 runs.

Under the H_0 of independent residuals, R is asymptotically normal with

E(R) = 2 N_1 N_2 / N + 1
V(R) = 2 N_1 N_2 (2 N_1 N_2 - N) / (N² (N - 1))

Construct a t test or confidence interval (a software sketch follows).
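Stata implements this as the runtest command; a minimal sketch on OLS residuals (the regression variables are placeholders):

* Runs test on the signs of the OLS residuals
regress y x1 x2
predict uhat, resid
runtest uhat, threshold(0)   // counts runs above/below zero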

III. t Test for AR(1) Serial Correlation

Assumptions:
1. The X are non-stochastic (fixed in repeated sampling).
2. The sample is large.

AR(1) model for the error term: u_t = ρ u_{t-1} + ε_t
H_0: ρ = 0 (errors are serially uncorrelated)

If the u_t were observed, we could immediately use a t test. In large samples we can use the residuals instead of the errors and construct the following test procedure (sketched in code below):

1. Run OLS of Y_t on (1, X_{1t}, ..., X_{kt}) and obtain the residuals û_t for t = 1, ..., T.
2. Run OLS of û_t on û_{t-1} for t = 2, ..., T.
3. Use the t test in the usual way to test H_0: ρ = 0.
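A sketch of steps 2 and 3 in Stata, assuming uhat already contains the OLS residuals as in the runs-test sketch above:

* Regress residuals on their lag and read off the t statistic
regress uhat L.uhat    // the t statistic on L.uhat tests H0: rho = 0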

IV. Durbin-Watson Test

Assumptions:
1. The regression model includes an intercept.
2. The X are non-stochastic (fixed in repeated sampling).
3. The errors u_t follow an AR(1) process: u_t = ρ u_{t-1} + ε_t.
4. The errors u_t are normally distributed.
5. The model does not include lagged values of the dependent variable.

IV. Durbin-Watson Test (cont.)

The Durbin-Watson d statistic:

d = Σ_{t=2}^{T} (û_t - û_{t-1})² / Σ_{t=1}^{T} û_t²

A bit of algebra:

d = (Σ û_t² + Σ û_{t-1}² - 2 Σ û_t û_{t-1}) / Σ û_t² ≈ 2 (1 - Σ û_t û_{t-1} / Σ û_t²)

Remember: û_t = ρ̂ û_{t-1} + ε̂_t, so that Σ û_t û_{t-1} / Σ û_t² ≈ ρ̂.

Thus: 0 ≤ d ≈ 2 (1 - ρ̂) ≤ 4.

IV. Durbin-Watson Test (cont.)

Steps:
1. Run OLS and obtain the residuals.
2. Compute the test statistic d (or let the software do it, as sketched below).
3. Look up the critical values d_L and d_U, which depend on the sample size and the number of regressors.
4. For H_0: no positive autocorrelation, the decision rule is: if 0 < d < d_L, reject; if d_L ≤ d ≤ d_U, no decision; if d > d_U, do not reject.
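After any regression on tsset data, Stata computes d directly (a minimal sketch with placeholder variables):

* Durbin-Watson d statistic after OLS
regress y x1 x2
estat dwatson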

V. Breusch-Godfrey Test

The Breusch-Godfrey test can be used when the regressors are stochastic and allows for higher order autoregressive schemes.

AR(p) errors: u_t = ρ_1 u_{t-1} + ρ_2 u_{t-2} + ... + ρ_p u_{t-p} + ε_t
H_0: ρ_1 = 0, ρ_2 = 0, ..., ρ_p = 0

The Breusch-Godfrey test involves the following steps (a software sketch follows the list):
1. Run OLS of Y_t on (1, X_{1t}, ..., X_{kt}) and produce the residuals û_t.
2. Run OLS of û_t on (1, X_{1t}, ..., X_{kt}, û_{t-1}, ..., û_{t-p}) and produce the R².
3. Calculate the test statistic LM = (T - p) R².

If H_0 is true, then LM has an (asymptotic) χ²(p) distribution.
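Stata automates all three steps through estat bgodfrey, which reports the LM statistic together with its χ²(p) p-value (placeholder variables; data assumed tsset):

* Breusch-Godfrey test with p = 2 lagged residuals
regress y x1 x2
estat bgodfrey, lags(2)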

Empirical Example

You are interested in the determinants of fertility and analyse the following data set, which includes annual observations for the years 1913 through 1984:

Variable   Description
gfr        Fertility rate: births per 1000 women aged 15-44
pe         Average dollar value of personal tax exemption per child
ww2        World War II dummy variable: =1 between 1941 and 1945, =0 otherwise
pill       Birth control pill dummy variable: =1 from 1963 onwards, =0 otherwise

Empirical Example (cont.)

regress gfr pe ww2 pill

      Source |       SS       df       MS              Number of obs =      72
-------------+------------------------------           F(  3,    68) =   20.38
       Model |  13183.6215     3  4394.54049           Prob > F      =  0.0000
    Residual |  14664.2739    68  215.651087           R-squared     =  0.4734
-------------+------------------------------           Adj R-squared =  0.4502
       Total |  27847.8954    71  392.223879           Root MSE      =  14.685

------------------------------------------------------------------------------
         gfr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          pe |     .08254   .0296462     2.78   0.007     .0233819    .1416981
         ww2 |   -24.2384   7.458253    -3.25   0.002    -39.12111   -9.355684
        pill |  -31.59403   4.081068    -7.74   0.000    -39.73768   -23.45039
       _cons |   98.68176   3.208129    30.76   0.000     92.28003    105.0835
------------------------------------------------------------------------------

Empirical Example (cont.)


plot resid year

[ASCII scatter plot of the OLS residuals against year, 1913-1984. The residuals range from about -27.0 to 28.1 and show long runs of same-signed values, consistent with positive autocorrelation.]

Empirical Example (cont.)

regress res pe ww2 pill resid_lag1 resid_lag2

      Source |       SS       df       MS              Number of obs =      70
-------------+------------------------------           F(  5,    64) =   60.53
       Model |  10868.1676     5  2173.63352           Prob > F      =  0.0000
    Residual |  2298.16998    64  35.9089059           R-squared     =  0.8255
-------------+------------------------------           Adj R-squared =  0.8118
       Total |  13166.3376    69  190.816486           Root MSE      =  5.9924

------------------------------------------------------------------------------
         res |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          pe |   .0065419   .0124797     0.52   0.602    -.0183892     .031473
         ww2 |   2.897539   3.092199     0.94   0.352    -3.279839    9.074917
        pill |   .3706119   1.666398     0.22   0.825    -2.958402    3.699625
  resid_lag1 |   .9173621   .1242827     7.38   0.000     .6690789    1.165645
  resid_lag2 |    -.03796   .1224065    -0.31   0.757     -.282495    .2065751
       _cons |  -1.531936   1.389222    -1.10   0.274    -4.307226    1.243354
------------------------------------------------------------------------------
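This regression is exactly step 2 of the Breusch-Godfrey test with p = 2: the residuals (res) are regressed on the original regressors and two lagged residuals. With 70 usable observations and R² = 0.8255, the LM statistic follows directly (a minimal sketch of the arithmetic in Stata):

* Breusch-Godfrey LM statistic from the auxiliary regression above
display "LM      = " 70*0.8255              // = 57.785
display "p-value = " chi2tail(2, 70*0.8255)
* 57.785 far exceeds the 5% chi-squared(2) critical value of 5.99:
* H0 of no serial correlation is strongly rejected

Equivalently, the large t statistic on resid_lag1 (7.38) points to strong positive first-order autocorrelation, consistent with the residual plot above.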
