Introductory
Econometrics with EViews
Asst. Prof. Dr. Kemal Bağzıbağlı
Department of Economics
Part 1 - Outline
1. Violation of Classical Linear Multiple Regression (CLMR) Assumptions
a. Heteroskedasticity
b. Multicollinearity
c. Model Misspecification
d. Autocorrelation
1. Violation of Classical
Linear Multiple
Regression (CLMR)
Assumptions
Multiple Regression Model
● yt = α + β1x1t + β2x2t + … + βkxkt + ut
● n observations on y and the x's
● α & βi: unknown parameters
Assumptions
1) The error term (ut) is a random variable with E(ut) = 0.
2) Common (constant) variance: Var(ut) = σ² for all t.
3) Independence: ut and uj are independent for all t ≠ j.
4) Independence of xj
● ut and xj are independent for all t and j.
5) Normality
● ut is normally distributed for all t.
● In conjunction with assumptions 1, 2 and 3:
ut 〜 IN(0, σ²)
Violation of Basic Model Assumptions
HETEROSKEDASTICITY (nonconstant variance)
Homoskedasticity: Var(ut) = E(ut²) = σ² for all t (identical distribution)
● σ1² = σ2² = … = σn²
● Constant dispersion of the error terms around their mean of zero
Heteroskedasticity (cont.)
● Rapidly increasing or decreasing dispersion ⇒ heteroskedasticity
● Variances differ because of the changing dispersion:
σ1² ≠ σ2² ≠ … ≠ σn², i.e. Var(ut) = σt²
● One of the assumptions is violated!
Heteroskedasticity (cont.)
Residuals increasing with x ⇒ heteroskedasticity
Consequences of Heteroskedasticity
★ The ordinary least squares (OLS) estimators
are still unbiased but inefficient.
➢ Inefficiency: It is possible to find an alternative
unbiased linear estimator that has a lower variance
than the OLS estimator.
Consequences of Heteroskedasticity (cont.)
Effect on the Tests of Hypotheses
★ The estimated variances and covariances of the
OLS estimators are biased and inconsistent
➢ invalidating the tests of hypotheses (significance)
Effect on Forecasting
★ Forecasts based on the estimators will be unbiased
★ Estimators are inefficient
➢ forecasts will also be inefficient
Lagrange Multiplier (LM) Tests for Heteroskedasticity
1. The Park Test is a two-stage procedure.
Stage 1:
● Run the OLS regression disregarding the heteroskedasticity question.
● Obtain the residuals ût from this regression.
Stage 2:
● Regress ln(ût²) on ln(Xt).
● If β is statistically significant, there is heteroskedasticity.
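The two stages can also be sketched outside EViews. Below is a minimal pure-Python illustration; the `ols` helper, the toy data and all variable names are invented for this example:

```python
import math

def ols(x, y):
    """OLS of y on a constant and a single regressor x.
    Returns intercept, slope, residuals and the slope's t-statistic."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in resid) / (n - 2)
    return a, b, resid, b / math.sqrt(s2 / sxx)

# Made-up data: the error spread grows with x (heteroskedastic by construction)
x = [1, 2, 3, 4, 5, 6]
u = [1, -1, 2, -2, -3, 3]
y = [2 + 3 * xi + ui for xi, ui in zip(x, u)]

# Stage 1: run OLS disregarding heteroskedasticity, keep the residuals
_, _, uhat, _ = ols(x, y)

# Stage 2: regress ln(uhat^2) on ln(X); a significant slope flags heteroskedasticity
lnu2 = [math.log(e ** 2) for e in uhat]
lnx = [math.log(xi) for xi in x]
_, beta, _, t_beta = ols(lnx, lnu2)
```

Because |û| grows with x by construction here, the stage-2 t-statistic comes out well above 2, so the test flags heteroskedasticity.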
Park Test in EViews
ls compensation c productivity
Park Test in EViews (cont.)
Park Test in EViews (cont.)
u=0
Park Test in EViews (cont.)
u2=u^2
Park Test in EViews (cont.)
lnu2=log(u2)
lnproductivity=log(productivity)
Park Test in EViews (cont.)
● The probability value (p-value) of lnproductivity (0.5257) is greater than the 0.05 significance level
● Statistically insignificant ⇒ homoskedasticity
Detection of Heteroskedasticity (cont.)
2. The Glejser Test is similar in spirit to the Park test.
● Glejser (1969) suggested estimating regressions of the type:
|ût| = α + βXt
|ût| = α + β/Xt
|ût| = α + β√Xt   and so on
● Test the hypothesis β = 0; a significant estimate indicates heteroskedasticity.
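A minimal sketch of the first Glejser variant, |ût| = α + βXt, again with invented toy data and a hand-rolled `ols` helper:

```python
import math

def ols(x, y):
    """OLS of y on a constant and one regressor; also returns the slope's t-stat."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in resid) / (n - 2)
    return a, b, resid, b / math.sqrt(s2 / sxx)

# Made-up data whose error spread grows with x
x = [1, 2, 3, 4, 5, 6]
u = [1, -1, 2, -2, -3, 3]
y = [2 + 3 * xi + ui for xi, ui in zip(x, u)]

_, _, uhat, _ = ols(x, y)            # first-stage residuals
au = [abs(e) for e in uhat]          # |u-hat|, as in the EViews step genr au=@abs(u)
_, beta, _, t_beta = ols(x, au)      # Glejser regression |u-hat| = alpha + beta*X
```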
Glejser Test in EViews
genr au=@abs(u)
Glejser Test in EViews (cont.)
ls au c productivity
Heteroskedasticity?
Glejser Test in EViews (cont.)
ls au c 1/productivity
ls au c @sqrt(productivity)
Glejser Test in EViews (cont.)
ls compensation c productivity
Glejser Test in EViews (cont.)
Detection of Heteroskedasticity (cont.)
3. White’s Test
● Recommended over all the previous tests
Detection of Heteroskedasticity (cont.)
3. White's Test (cont.)
Step 3: Regress the squared residuals against a constant, X2t, X3t, etc. (auxiliary equation)
● If nR² from the auxiliary equation exceeds the upper α per cent point of the chi-square distribution with 5 d.f., reject homoskedasticity
● If the null hypothesis is not rejected
○ the residuals are homoskedastic
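The nR² statistic from the auxiliary regression can be illustrated as follows. This sketch assumes a single-regressor model, so the auxiliary equation uses only X and X² (2 d.f. rather than the 5 of the slide's example); the data and the solver are made up for illustration:

```python
def ols_multi(X, y):
    """OLS via the normal equations (X'X)b = X'y, solved by Gaussian elimination.
    X is a list of rows; each row already includes the constant 1."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    v = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    for i in range(k):                       # forward elimination with pivoting
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        v[i], v[p] = v[p], v[i]
        for r in range(i + 1, k):
            f = A[r][i] / A[i][i]
            for c in range(i, k):
                A[r][c] -= f * A[i][c]
            v[r] -= f * v[i]
    b = [0.0] * k
    for i in range(k - 1, -1, -1):           # back substitution
        b[i] = (v[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b

# Squared first-stage residuals from a toy heteroskedastic model (assumed)
x = [1, 2, 3, 4, 5, 6]
u2 = [1, 1, 4, 4, 9, 9]

# Auxiliary regression: u-hat^2 on a constant, X and X^2
X = [[1.0, xi, xi ** 2] for xi in x]
b = ols_multi(X, u2)
fitted = [sum(bi * ri for bi, ri in zip(b, r)) for r in X]
ybar = sum(u2) / len(u2)
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(u2, fitted))
ss_tot = sum((yi - ybar) ** 2 for yi in u2)
r2 = 1 - ss_res / ss_tot
stat = len(u2) * r2      # nR-squared, to be compared with a chi-square critical value
```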
White Test in EViews
Solutions to the Heteroskedasticity Problem
Graphical Method
● Check the residuals (i.e. the error variance)
○ e.g. linearly increasing with xt
● Weighted least squares (WLS)
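One common WLS transformation can be sketched as follows, assuming (for illustration only) that Var(ut) = σ²xt², so that dividing the whole equation by xt restores a constant error variance; the data are invented:

```python
def ols(x, y):
    """OLS of y on a constant and one regressor; returns (intercept, slope)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
        sum((xi - xbar) ** 2 for xi in x)
    return ybar - b * xbar, b

# Toy model y = 2 + 3x + u with error spread proportional to x (assumed)
x = [1, 2, 3, 4, 5, 6]
u = [1, -1, 2, -2, -3, 3]
y = [2 + 3 * xi + ui for xi, ui in zip(x, u)]

# If Var(u_t) = sigma^2 * x_t^2, dividing the equation by x_t gives
#   y/x = beta + alpha*(1/x) + u/x,   with a homoskedastic error u/x
ystar = [yi / xi for xi, yi in zip(x, y)]
zstar = [1.0 / xi for xi in x]
beta_hat, alpha_hat = ols(zstar, ystar)  # intercept estimates beta, slope estimates alpha
```

Note the role reversal after the transformation: the slope of the original model becomes the intercept of the transformed one.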
Solutions to the Heteroskedasticity Problem (cont.)
Applications with EViews
ls foodexp c totalexp
eq01.makeresid u
Applications with EViews (cont.)
Command:
scat totalexp u
heteroskedasticity?
Applications with EViews (cont.)
Applications with EViews (cont.)
lnfoodexp=log(foodexp)
lntotalexp=log(totalexp)
Applications with EViews (cont.)
Command:
ls lnfoodexp c lntotalexp
Applications with EViews (cont.)
Multicollinearity
OLS: Ordinary Least Squares
Multicollinearity (cont.)
1. Exact (or Perfect) Multicollinearity
a. An exact linear relationship among the independent variables
2. Near Multicollinearity
a. Explanatory variables are approximately linearly related
For example: an exact linear relation between two regressors ➡ Exact; an approximate one ➡ Near
Theoretical Consequences of
Multicollinearity
Unbiasedness & Forecasts
★ OLS estimators are still BLUE and MLE and hence are
unbiased, efficient and consistent.
★ Forecasts are still unbiased and confidence intervals
are valid
★ Although the standard errors and t-statistics of
regression coefficients are numerically affected,
○ tests based on them are still valid
Theoretical Consequences of
Multicollinearity (cont.)
Standard Errors
★ Standard errors tend to be higher
○ making t-statistics lower
○ thus making coefficients less significant (and
possibly even insignificant)
Identifying Multicollinearity
● High R² with low values for t-statistics
● High values for correlation coefficients
● Regression coefficients sensitive to specification
● Formal tests for multicollinearity
○ Eigenvalues and condition index (CI):
k = max eigenvalue / min eigenvalue
CI = √k ➡ if k is between 100 and 1000 ➡ multicollinearity is suspected
○ High variance inflation factor (VIF)
➡ if VIF > 10 ➡ multicollinearity is suspected
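The VIF rule of thumb can be illustrated with two nearly proportional regressors (made-up numbers). With only two regressors, the auxiliary R² is simply their squared correlation:

```python
import math

def corr(a, b):
    """Pearson correlation between two series."""
    n = len(a)
    abar, bbar = sum(a) / n, sum(b) / n
    cov = sum((x - abar) * (y - bbar) for x, y in zip(a, b))
    va = sum((x - abar) ** 2 for x in a)
    vb = sum((y - bbar) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Two almost linearly related regressors (invented data): x3 is roughly 2*x2
x2 = [1, 2, 3, 4, 5, 6]
x3 = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

# VIF_j = 1 / (1 - R_j^2); here the auxiliary R^2 is just the squared correlation
r2 = corr(x2, x3) ** 2
vif = 1 / (1 - r2)   # far above the rule-of-thumb threshold of 10
```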
Solutions to the Multicollinearity Problem
● Benign Neglect
○ Less interested in interpreting individual coefficients
but more interested in forecasting
● Eliminating Variables
○ The surest way to eliminate or reduce the effects of
multicollinearity
Solutions to the Multicol. Problem (cont.)
Solutions to the Multicol. Problem (cont.)
■ goes up substantially
● there may be no benefit to adding to the sample size
Applications with EViews
The regression is overall statistically significant, yet individual coefficients are insignificant ⇒ multicollinearity problem
Applications with EViews (cont.)
Command: eq01.varinf
Applications with EViews (cont.)
Even if we drop these variables one-by-one from the model, we still have a multicollinearity problem.
Applications with EViews (cont.)
● When we drop both the general price
level and the price of cars, the
multicollinearity problem is solved
○ but R2 is low.
Applications with EViews (cont.)
DROP: General price level and
disposable income
Causes of Autocorrelation
● Direct causes
● Indirect causes
Consequences of Autocorrelation
● OLS estimates are still unbiased and consistent
● OLS estimates are inefficient, i.e. no longer BLUE
○ Forecasts will also be inefficient
● The same as the case of ignoring heteroskedasticity
● The usual formulas give incorrect standard errors for the OLS estimates
● Confidence intervals and hypothesis tests based on the usual standard errors are not valid
Detecting Autocorrelation
❖ Runs Test: Investigate the signs of the residuals. Are they moving randomly? If the (+) and (-) signs come randomly, there is no need to suspect an autocorrelation problem.
❖ Durbin-Watson (DW) d Test: The ratio of the sum of squared differences in successive residuals to the residual sum of squares.
❖ Breusch-Godfrey LM Test: A more general test which does not assume the disturbances to be AR(1).
Durbin-Watson d Test
STEP 3: Construct the table with the calculated DW statistic and the dL, dU, 4-dU and 4-dL critical values.
STEP 4: Conclude.
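The d statistic itself is easy to compute; here is a sketch with an invented, strongly persistent residual series:

```python
def durbin_watson(e):
    """DW statistic: d = sum((e_t - e_{t-1})^2) / sum(e_t^2), roughly 2(1 - rho)."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(et ** 2 for et in e)
    return num / den

# Smoothly oscillating residuals (made up): strong positive autocorrelation
resid = [1, 2, 3, 2, 1, 0, -1, -2, -3, -2, -1, 0]
d = durbin_watson(resid)   # well below 2, signalling positive autocorrelation
```

d always lies between 0 and 4; values near 2 indicate no autocorrelation, values near 0 positive autocorrelation, values near 4 negative autocorrelation.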
Resolving Autocorrelation
The Cochrane-Orcutt Iterative Procedure
Resolving Autocorrelation (cont.)
Step 4: Run the regression again with the transformed variables and obtain a new set of residuals.
Step 5 and on: Continue repeating Steps 2 to 4 for several rounds until the following stopping rule applies:
● the estimates of ⍴ from two successive iterations differ by no more than some preselected small value, such as 0.001.
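Steps 2 to 5 can be sketched as an explicit loop. This is an illustrative pure-Python version with made-up data; the ⍴ re-estimation uses residuals recomputed in the original units after each transformed regression:

```python
def ols(x, y):
    """OLS of y on a constant and one regressor; returns (a, b, residuals)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
        sum((xi - xbar) ** 2 for xi in x)
    a = ybar - b * xbar
    return a, b, [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Made-up data whose errors follow u_t = 0.6 u_{t-1} + shock
x = list(range(1, 21))
shocks = [0.5, 0.4, -0.3, -0.5, 0.2, 0.4, -0.4, -0.2, 0.3, 0.5,
          -0.5, -0.3, 0.2, 0.4, -0.4, -0.2, 0.5, 0.3, -0.3, -0.5]
u, prev = [], 0.0
for s in shocks:
    prev = 0.6 * prev + s
    u.append(prev)
y = [1 + 2 * xi + ui for xi, ui in zip(x, u)]

rho = 0.0
for _ in range(50):
    # Step 3: quasi-difference both variables with the current rho estimate
    ystar = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
    xstar = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
    # Step 4: rerun the regression on the transformed variables
    astar, b, _ = ols(xstar, ystar)
    a = astar / (1 - rho)                  # recover the original intercept
    e = [y[t] - a - b * x[t] for t in range(len(y))]
    # Step 2 again: re-estimate rho from the residuals
    rho_new = sum(e[t] * e[t - 1] for t in range(1, len(e))) / \
              sum(et ** 2 for et in e[:-1])
    if abs(rho_new - rho) < 0.001:         # stopping rule from the slides
        break
    rho = rho_new
```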
Applications with EViews
Variables in natural logarithms:
● LNCO: Copper price
● LNIN: Industrial production
● LNLON: London stock exchange
● LNHS: Housing price
● LNAL: Aluminium price
Critical values: dL = 1.143, dU = 1.739
AUTOCORRELATION?
Applications with EViews (cont.)
H0: No autocorrelation
Applications with EViews (cont.)
To Fix it!
Applications with EViews (cont.)
To Fix it!
u=u(0)
Applications with EViews (cont.)
To Fix it!
Generate the quasi-differenced series (using ⍴̂ = 0.52):
● y=lnco-0.52*lnco(-1)
● x2=lnin-0.52*lnin(-1)
● x3=lnlon-0.52*lnlon(-1)
● x4=lnhs-0.52*lnhs(-1)
● x5=lnal-0.52*lnal(-1)
Applications with EViews (cont.)
To Fix it!
Critical values: dL = 1.124, dU = 1.743
Command:
ls y c x2 x3 x4 x5
Applications with EViews (cont.)
To Fix it!
Summary
Problem: Heteroskedasticity
Source: Nonconstant variance
Detection: Park Test, Glejser Test, White Test
Remedy: Taking logarithms, weighted least squares
[Figure: time series plot of a random walk vs. a random walk with drift]
Example
PDI: Personal Disposable Income
What is Stationarity? (cont.)
● If the variables in a regression model are not stationary,
⇒ the usual "t-ratios" will not follow a t-distribution.
● The use of nonstationary data can lead to spurious regressions.
● Results of the regression do not reflect the real relationship unless the variables are cointegrated.
3. Univariate Time Series
Modelling
Some Stochastic Processes
Random Walk
Autoregressive Process
Autoregressive Integrated MA Process
● Most time series are nonstationary
● Successive differencing ⇒ stationarity
● If a series differenced d times is stationary and can be expressed by an ARMA(p, q),
● the original series can be represented by an ARIMA(p, d, q) model
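The "I" (integrated) step can be illustrated directly; this is a sketch with a made-up upward-drifting series:

```python
def diff(series, d=1):
    """Difference a series d times (the 'I' step of ARIMA)."""
    for _ in range(d):
        series = [series[t] - series[t - 1] for t in range(1, len(series))]
    return series

# A random-walk-with-drift-style series (invented): trending in levels
walk = [10, 11.5, 12.2, 14.0, 15.1, 16.9, 17.5, 19.2, 20.0, 21.8]
dwalk = diff(walk)   # first differences fluctuate around the drift, with no trend
```

Each round of differencing shortens the series by one observation; d is chosen as the smallest number of rounds after which the series looks stationary.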
Estimation and Forecasting with an
ARIMA Model
The Box and Jenkins (1970) Approach
● Identification
● Fitting (Estimation), usually OLS
● Diagnostics
● Refitting if necessary
● Forecasting
Identification
● The process of specifying the orders of differencing, AR modeling, and MA modeling
● What do the data look like?
● What pattern do the data show?
- Are the data stationary?
- What are p, d, and q?
● Tools
- Plots of the data
- Autocorrelation Function (ACF)
- Partial ACF (PACF)
Identification (cont.)
● To determine the value of p and q we use the graphical
properties of the autocorrelation function and the partial
autocorrelation function.
● Again recall the following:
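The sample ACF is straightforward to compute by hand (the PACF is more involved and omitted here); a sketch with an invented persistent series:

```python
def acf(series, k):
    """Sample autocorrelation of a series at lag k."""
    n = len(series)
    m = sum(series) / n
    c0 = sum((v - m) ** 2 for v in series)
    ck = sum((series[t] - m) * (series[t - k] - m) for t in range(k, n))
    return ck / c0

# A slowly oscillating, persistent series (made up): adjacent values move
# together, so low-order autocorrelations are large
z = [1, 2, 3, 4, 5, 4, 3, 2, 1, 2, 3, 4, 5, 4, 3, 2, 1, 2, 3, 4]
acf1 = acf(z, 1)   # clearly positive for this persistent series
acf4 = acf(z, 4)   # negative: lag 4 is out of phase with the cycle
```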
Model Fitting
● Model parameters are estimated by OLS
● Output includes
○ Parameter estimates
○ Test statistics
○ Goodness of fit measures
○ Residuals
○ Diagnostics
Diagnostics
● Determines whether the model fits the data
adequately.
○ The aim is to extract all information and ensure that
residuals are white noise
● Key measures
○ ACF of residuals
○ PACF of residuals
○ Ljung-Box-Pierce Q statistic
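The Q statistic can be sketched in a few lines; the residual series here is invented, and 9.49 is the 5% chi-square critical value with 4 degrees of freedom:

```python
def acf(e, k):
    """Sample autocorrelation of a series at lag k."""
    n = len(e)
    m = sum(e) / n
    c0 = sum((v - m) ** 2 for v in e)
    return sum((e[t] - m) * (e[t - k] - m) for t in range(k, n)) / c0

def ljung_box(e, m):
    """Ljung-Box Q = n(n+2) * sum_{k=1..m} acf_k^2 / (n-k); chi-square with m d.f."""
    n = len(e)
    return n * (n + 2) * sum(acf(e, k) ** 2 / (n - k) for k in range(1, m + 1))

# Strongly autocorrelated "residuals" (made up): Q should be large, so a
# white-noise null is rejected and the model needs refitting
resid = [1, 2, 3, 4, 5, 4, 3, 2, 1, 2, 3, 4, 5, 4, 3, 2, 1, 2, 3, 4]
q = ljung_box(resid, 4)   # far above the 5% critical value of 9.49 with 4 d.f.
```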
Preliminary Analysis with EViews
Select the series “dividends” in the workfile,
then select [Quick/Graph/Line graph]:
Preliminary Analysis with EViews (cont.)
[Quick/Generate Series]:
ddividends=d(dividends)
Preliminary Analysis: Identification
Correlogram
● The graph of autocorrelation function
Preliminary Analysis: Identification
Interpretation of Correlogram
● The function quickly decreases
to zero (a low ⍴)
Correlogram and Stationarity
Preliminary Analysis: Estimation
ARIMA(1,1,1)
Command: ls ddividends c AR(1) MA(1)
Empirical Example
Forecasting Monthly Electricity Sales
Empirical Example (cont.)
Forecasting Monthly Electricity Sales
Empirical Example (cont.)
Forecasting Monthly Electricity Sales
Correlogram for 12-Month Differenced Data (Xt − Xt−12)
Empirical Example (cont.)
Forecasting Monthly Electricity Sales
Superior model:
ARIMA (0, 1, 4)
Bibliography
● Brooks, C. (2008). Introductory Econometrics for Finance. Cambridge University Press.
● Gujarati, D.N. and Porter, D.C. (2004). Basic Econometrics. The McGraw-Hill Companies.
● Maddala, G.S. (2002). Introduction to Econometrics.
● Ramanathan, R. (2002). Introductory Econometrics with Applications. Thomson Learning, Mason, Ohio, USA.
● Wooldridge, J. (2000). Introductory Econometrics: A Modern Approach. South-Western College Publishing.