
LU12

Autoregressive,
Distributed-Lag Models and
Granger Causality Analysis
INTRODUCTION

• If a regression model includes not only the current but also
the lagged (past) values of the explanatory variables (the X's),
it is called a distributed-lag model.
e.g.: $Y_t = \alpha + \beta_0 X_t + \beta_1 X_{t-1} + \beta_2 X_{t-2} + u_t$
• An autoregressive model includes one or more lagged values
of the dependent variable among its explanatory variables.
Such models are also known as dynamic models.
e.g.: $Y_t = \alpha + \beta_0 X_t + \gamma Y_{t-1} + u_t$

THE KOYCK (1954) DISTRIBUTED-LAG
MODELS
• Koyck starts with the following distributed-lag model in one
explanatory variable (an infinite lag model):

$Y_t = \alpha + \beta_0 X_t + \beta_1 X_{t-1} + \beta_2 X_{t-2} + \dots + u_t$ (1)

• Assuming that the β's all have the same sign, Koyck assumes
that they decline geometrically as follows:
$\beta_k = \beta_0 \lambda^k$, k = 0, 1, … (2)

where λ, such that 0 < λ < 1, is known as the rate of decline, or
decay, of the distributed lag, and 1 − λ is known as the speed
of adjustment.
• It implies that each successive β coefficient is numerically smaller
than the preceding one, so that as one goes back into the distant
past, the effect of that lag on $Y_t$ becomes progressively smaller.
• The closer λ is to 1, the slower the rate of decline in $\beta_k$,
whereas the closer it is to zero, the more rapid the decline in $\beta_k$.

• As indicated earlier, the Koyck scheme has three features:
a) by assuming a nonnegative value of λ, Koyck rules out the β's
changing sign;
b) by assuming λ < 1, he gives less weight to the distant β's than to the
current ones;
c) he ensures that the sum of the β's, which gives the long-run multiplier,
is finite, namely:
$\sum_{k=0}^{\infty} \beta_k = \beta_0 \left( \frac{1}{1-\lambda} \right)$
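The geometric decline in (2) and the finite long-run multiplier can be checked numerically. The sketch below is only an illustration: the values of `beta_0` and `lam` and the helper name `koyck_weights` are assumptions, not taken from the slides.

```python
# Koyck weights beta_k = beta_0 * lam**k, using illustrative (assumed) values.
beta_0 = 2.0   # impact multiplier, chosen only for illustration
lam = 0.6      # rate of decline, must satisfy 0 < lam < 1


def koyck_weights(beta_0, lam, n_lags):
    """Return [beta_0, beta_0*lam, ..., beta_0*lam**(n_lags - 1)]."""
    return [beta_0 * lam ** k for k in range(n_lags)]


weights = koyck_weights(beta_0, lam, 200)
partial_sum = sum(weights)          # truncated sum of the beta_k
long_run = beta_0 / (1.0 - lam)     # closed form beta_0 / (1 - lam)

print(weights[:4])                  # each weight is smaller than the one before
print(partial_sum, long_run)        # the truncated sum approaches the closed form
```

With λ closer to 1 the weights die out more slowly and the long-run multiplier grows, which matches the discussion of the rate of decline.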

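• Thus, the infinite lag model (1) may be written as:

$Y_t = \alpha + \beta_0 X_t + \beta_0 \lambda X_{t-1} + \beta_0 \lambda^2 X_{t-2} + \dots + u_t$ (3)

• Model (3) is still not easy to estimate: it contains an infinite number of
lagged terms, and λ enters in a highly nonlinear form.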


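• Koyck lags equation (3) by one period to obtain:
$Y_{t-1} = \alpha + \beta_0 X_{t-1} + \beta_0 \lambda X_{t-2} + \beta_0 \lambda^2 X_{t-3} + \dots + u_{t-1}$ (4)
• Multiplying (4) by λ gives:
$\lambda Y_{t-1} = \lambda\alpha + \lambda\beta_0 X_{t-1} + \beta_0 \lambda^2 X_{t-2} + \beta_0 \lambda^3 X_{t-3} + \dots + \lambda u_{t-1}$ (5)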

• Subtracting equation (5) from equation (3) gives:

$Y_t - \lambda Y_{t-1} = \alpha(1-\lambda) + \beta_0 X_t + (u_t - \lambda u_{t-1})$ (6)

• Rearranging:
$Y_t = \alpha(1-\lambda) + \beta_0 X_t + \lambda Y_{t-1} + v_t$ (7)

where $v_t = u_t - \lambda u_{t-1}$, a moving average of $u_t$ and $u_{t-1}$.


• This procedure is called the Koyck transformation.
• How do equations (1) and (7) differ? A tremendous
simplification has been achieved: whereas before we had to
estimate α and an infinite number of β's, we now have to
estimate only three unknowns: α, $\beta_0$ and λ.
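The algebra of the Koyck transformation can be verified on a small made-up series. The sketch below assumes no disturbance ($u_t = 0$) and sets all pre-sample values of X to zero, so that the infinite lag in (3) becomes a finite sum; the data and parameter values are illustrative only.

```python
alpha, beta_0, lam = 1.0, 0.5, 0.7              # illustrative parameter values
X = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]    # made-up data; X_t = 0 for t < 0


def distributed_lag(t):
    """Y_t from equation (3) with u_t = 0 and pre-sample X set to zero."""
    return alpha + beta_0 * sum(lam ** k * X[t - k] for k in range(t + 1))


Y = [distributed_lag(t) for t in range(len(X))]

# Equation (7) with v_t = 0: Y_t = alpha*(1 - lam) + beta_0*X_t + lam*Y_{t-1}
max_gap = max(
    abs(Y[t] - (alpha * (1 - lam) + beta_0 * X[t] + lam * Y[t - 1]))
    for t in range(1, len(X))
)
print(max_gap)   # difference between the two forms, up to rounding error
```

The two representations agree up to floating-point rounding, which is the point of the transformation: one lagged Y term carries the whole infinite lag in X.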
• Note the following features of the Koyck transformation:
i. We started with a distributed-lag model but ended up with an autoregressive
model, because $Y_{t-1}$ appears as one of the explanatory variables.

ii. The appearance of $Y_{t-1}$ creates some statistical problems:
it is stochastic, whereas the CLRM assumes that the
explanatory variables are either non-stochastic or, if stochastic,
distributed independently of the stochastic disturbance
term.

iii. The disturbance term is now $v_t$ rather than $u_t$, so the statistical
properties of $v_t$ depend on those of $u_t$.

iv. The presence of lagged Y violates one of the assumptions
underlying the Durbin-Watson d test. An alternative is the
Durbin h test (see example 17.10).

• ADAPTIVE EXPECTATION (AE) MODEL
- This is the first rationalization of the Koyck
model. Suppose we postulate the following model:
$Y_t = \beta_0 + \beta_1 X_t^* + u_t$ (8)
where Y = demand for money,
X* = equilibrium, expected long-run or normal rate of interest,
u = error term.
- See Cagan (1956) and Friedman (1957).

• Equation (8) postulates that the demand for money is a function
of the expected rate of interest. Expectations are assumed to be formed as follows:

$X_t^* - X_{t-1}^* = \gamma (X_t - X_{t-1}^*)$ (9)

where γ, such that 0 < γ ≤ 1, is known as the coefficient of expectation.

• Hypothesis (9) is known as the adaptive expectation,
progressive expectation or error learning hypothesis.
• It means that “economic agents will adapt their expectation in
the light of past experience and that in particular they will learn
from their mistakes.”
• The more distant an experience, the smaller its effect
relative to recent experience.

• It can also be stated as:
$X_t^* = \gamma X_t + (1-\gamma) X_{t-1}^*$ (10)

• Equation (10) shows that the expected value of the rate of interest
at time t is a weighted average of the actual value of the
interest rate at time t and its value expected in the previous
period.
• If γ = 1 then $X_t^* = X_t$, meaning that expectations are realized
immediately and fully, that is, within the same time period (the reverse holds when
γ = 0).
• If γ = 0 then $X_t^* = X_{t-1}^*$, meaning that expectations are static, that
is, conditions prevailing today will be maintained in all
subsequent periods. Expected future values then become
identified with current values.
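The updating rule (10) is easy to trace by hand or in code. The following sketch applies it to a made-up series of interest rates; the function name `adaptive_expectations`, the starting expectation and the data are assumptions introduced for illustration.

```python
def adaptive_expectations(x, gamma, x_star_0):
    """Apply X*_t = gamma*X_t + (1 - gamma)*X*_{t-1} along the series x."""
    expected = []
    prev = x_star_0                    # initial expectation X*_0 (assumed known)
    for x_t in x:
        prev = gamma * x_t + (1.0 - gamma) * prev
        expected.append(prev)
    return expected


rates = [5.0, 6.0, 8.0, 7.0]           # made-up observed interest rates

full = adaptive_expectations(rates, 1.0, 5.0)     # gamma = 1: expectation equals actual value
static = adaptive_expectations(rates, 0.0, 5.0)   # gamma = 0: expectation never moves
partial = adaptive_expectations(rates, 0.5, 5.0)  # expectation moves halfway each period

print(full)
print(static)
print(partial)
```

The intermediate case shows the error-learning idea: each period the expectation closes a fraction γ of the gap between the observed rate and the previous expectation.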
• Substituting equation (10) into equation (8) gives:
$Y_t = \beta_0 + \beta_1 [\gamma X_t + (1-\gamma) X_{t-1}^*] + u_t$ (11)
$\quad = \beta_0 + \beta_1 \gamma X_t + \beta_1 (1-\gamma) X_{t-1}^* + u_t$

• Now lag equation (8) one period, multiply it by 1 − γ, and
subtract the product from (11). After simple algebraic
manipulations, we obtain:
$Y_t = \gamma\beta_0 + \gamma\beta_1 X_t + (1-\gamma) Y_{t-1} + u_t - (1-\gamma) u_{t-1}$ (12)
$\quad = \gamma\beta_0 + \gamma\beta_1 X_t + (1-\gamma) Y_{t-1} + v_t$

where $v_t = u_t - (1-\gamma) u_{t-1}$.

• The difference between equations (8) and (12) lies in the
interpretation of the coefficients. In (8), $\beta_1$ measures the average response of Y to
a unit change in X* (the equilibrium or long-run value of X). In (12),
$\gamma\beta_1$ measures the average response of Y to a unit change in the actual
or observed value of X. The two responses are the same only when γ = 1 (current
and long-run values coincide).

• The AE and Koyck models are similar in that both are autoregressive
and both have a moving-average error term of the same form, although the
interpretation of the coefficients is different.

• Criticism of the AE model by proponents of rational expectations (RE)
(J. Muth, Robert Lucas and Thomas Sargent) is available in
the literature. The RE hypothesis assumes that individual economic agents use
currently available and relevant information in forming their
expectations and do not rely purely on past experience.

THE STOCK ADJUSTMENT (PARTIAL
ADJUSTMENT) MODEL

• Due to Marc Nerlove.
• It builds on the flexible accelerator model of economic theory, which
assumes that there is an equilibrium, optimal, desired or long-run amount of
capital stock needed to produce a given output under the given state of
technology.
• First, the desired level of capital $Y_t^*$ is a linear function of
output X as follows:
$Y_t^* = \beta_0 + \beta_1 X_t + u_t$ (13)

• Since the desired level of capital is not directly observable, Nerlove
postulates the following hypothesis, known as partial adjustment:
$Y_t - Y_{t-1} = \delta (Y_t^* - Y_{t-1})$ (14)
in which δ, such that 0 < δ ≤ 1, is the coefficient of adjustment, $Y_t - Y_{t-1}$
is the actual change and $Y_t^* - Y_{t-1}$ is the desired change.
• Equation (14) postulates that the actual change in capital stock
(investment) in any given time period t is some fraction δ of the
desired change for that period.
• If δ = 1, the actual stock equals the desired stock. If δ = 0, nothing
changes, since the actual stock at time t is the same as that observed in
the previous time period. Typically δ is expected to lie between the two
extremes because of rigidity, inertia and contractual obligations;
hence the name partial adjustment model.

• Since $Y_t - Y_{t-1}$ is the change in capital stock between two periods
(that is, investment), equation (14) can be written as:
$I_t = \delta (Y_t^* - Y_{t-1})$ (15)
where $I_t$ is investment in time period t. Equation
(14) can also be written as:
$Y_t = \delta Y_t^* + (1-\delta) Y_{t-1}$ (16)
showing that the observed capital stock at time t is a weighted
average of the desired capital stock at that time and the capital
stock existing in the previous time period, δ and (1 − δ) being
the weights.

• Substituting equation (13) into equation (16) gives:

$Y_t = \delta (\beta_0 + \beta_1 X_t + u_t) + (1-\delta) Y_{t-1}$
$\quad = \delta\beta_0 + \delta\beta_1 X_t + (1-\delta) Y_{t-1} + \delta u_t$ (17)

• Equation (17) is called the short-run demand function for
capital stock.
• Once we estimate equation (17), we obtain the
coefficient of adjustment δ from the coefficient of $Y_{t-1}$. We can then derive the long-run
function by dividing $\delta\beta_0$ and $\delta\beta_1$ by δ and omitting the
lagged Y term, which gives equation (13).
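The step from the short-run equation (17) to the long-run function (13) is simple arithmetic. The sketch below uses hypothetical estimates (intercept 2.0, slope 0.3, lagged-Y coefficient 0.6); the function name and the numbers are assumptions, not results from the slides.

```python
def long_run_coefficients(intercept, slope_x, coef_lagged_y):
    """Recover (delta, beta_0, beta_1) from estimates of equation (17).

    intercept     estimates delta * beta_0
    slope_x       estimates delta * beta_1
    coef_lagged_y estimates 1 - delta
    """
    delta = 1.0 - coef_lagged_y
    if not 0.0 < delta <= 1.0:
        raise ValueError("implied adjustment coefficient outside (0, 1]")
    return delta, intercept / delta, slope_x / delta


# Hypothetical short-run estimates, for illustration only.
delta, beta_0, beta_1 = long_run_coefficients(2.0, 0.3, 0.6)
print(delta, beta_0, beta_1)
```

In this made-up case 40 percent of the gap between desired and actual stock is closed each period, and the long-run response of Y to X (0.75) is larger than the short-run response (0.3).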

• The partial adjustment (PA) model is also an
autoregressive model.
• The PA and AE models are both rationalizations of the
Koyck model. They look similar but are conceptually
different.
• PA is based on technical or institutional rigidities, inertia and the cost
of change, while AE is based on uncertainty.
• Both are nevertheless preferable to the Koyck model, which is a purely algebraic device without a theoretical rationale of its own.

ESTIMATING AUTOREGRESSIVE MODEL

• Koyck model:
$Y_t = \alpha(1-\lambda) + \beta_0 X_t + \lambda Y_{t-1} + v_t$ (18)

• Adaptive expectation model:
$Y_t = \gamma\beta_0 + \gamma\beta_1 X_t + (1-\gamma) Y_{t-1} + [u_t - (1-\gamma) u_{t-1}]$ (19)

• Partial adjustment model:
$Y_t = \delta\beta_0 + \delta\beta_1 X_t + (1-\delta) Y_{t-1} + \delta u_t$ (20)

• All have the common form:
$Y_t = \alpha_0 + \alpha_1 X_t + \alpha_2 Y_{t-1} + v_t$ (21)
• Estimating (21) is not straightforward because of the
presence of a stochastic explanatory variable ($Y_{t-1}$) and the possibility
of serial correlation in the disturbances.

• In the Koyck and AE models, $Y_{t-1}$ is correlated with the disturbance $v_t$,
because both depend on $u_{t-1}$. If an explanatory variable in a regression
model is correlated with the stochastic disturbance term, the OLS
estimators are not only biased but also inconsistent; that is,
even as the sample size increases indefinitely, the estimators do not
converge to their true population values.

• In the PA model the disturbance is $\delta u_t$, so if $u_t$ satisfies the CLRM
assumptions, so does $\delta u_t$. Although $Y_{t-1}$ depends on $u_{t-1}$ and all the
previous disturbance terms, it is not related to the current error term $u_t$.
OLS estimates of the PA model are therefore consistent, although they tend to be
biased in finite samples.

• Thus OLS cannot be applied directly to the Koyck and AE models; an
instrumental variable (IV) approach is used instead.
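The source of the problem can be seen in a small simulation. The sketch below generates data from the Koyck form with white-noise $u_t$ of unit variance and compares the sample covariance of $Y_{t-1}$ with $v_t = u_t - \lambda u_{t-1}$ against its covariance with the current $u_t$. All parameter values, the seed and the sample size are assumptions chosen for illustration.

```python
import random

random.seed(0)
alpha, beta_0, lam = 1.0, 1.0, 0.5      # assumed true parameters
n = 50_000

x = [random.gauss(0.0, 1.0) for _ in range(n)]
u = [random.gauss(0.0, 1.0) for _ in range(n)]   # white noise with variance 1

# Build Y recursively from the Koyck form, with v_t = u_t - lam * u_{t-1}.
y = [alpha + beta_0 * x[0] + u[0]]      # first value, pre-sample X taken as zero
v = [u[0]]
for t in range(1, n):
    v_t = u[t] - lam * u[t - 1]
    v.append(v_t)
    y.append(alpha * (1 - lam) + beta_0 * x[t] + lam * y[t - 1] + v_t)


def sample_cov(a, b):
    """Sample covariance of two equally long lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)


cov_with_v = sample_cov(y[:-1], v[1:])   # Cov(Y_{t-1}, v_t); theory gives -lam * Var(u)
cov_with_u = sample_cov(y[:-1], u[1:])   # Cov(Y_{t-1}, u_t); theory gives 0

print(cov_with_v, cov_with_u)
```

The first covariance settles near −λ rather than zero, which is why OLS is inconsistent for the Koyck and AE forms. The second mimics the PA case, where the error is proportional to the current $u_t$ and the regressor $Y_{t-1}$ is uncorrelated with it.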

THE ALMON (1965) APPROACH (PDL)

• Because the Koyck model makes the restrictive assumption that the β
coefficients decline geometrically as the lag lengthens, Shirley
Almon suggested a method that is much more general (see
Figure 17.7). Start from the finite distributed-lag model
$Y_t = \alpha + \beta_0 X_t + \beta_1 X_{t-1} + \beta_2 X_{t-2} + \dots + \beta_k X_{t-k} + u_t$ (22)
or, in compact form,
$Y_t = \alpha + \sum_{i=0}^{k} \beta_i X_{t-i} + u_t$ (23)

• Following Weierstrass' theorem, Almon assumes that $\beta_i$ can
be approximated by a suitable-degree polynomial in i, the
length of the lag. In general we can write:
$\beta_i = a_0 + a_1 i + a_2 i^2 + \dots + a_m i^m$ (24)
which is an mth-degree polynomial in i. It is assumed that m
(the degree of the polynomial) is less than k (the maximum
length of the lag).
• For example, if the degree of the polynomial is 2, we get:
$Y_t = \alpha + \sum_{i=0}^{k} (a_0 + a_1 i + a_2 i^2) X_{t-i} + u_t$
$\quad = \alpha + a_0 \sum_{i=0}^{k} X_{t-i} + a_1 \sum_{i=0}^{k} i X_{t-i} + a_2 \sum_{i=0}^{k} i^2 X_{t-i} + u_t$ (25)

• Defining:
$Z_{0t} = \sum_{i=0}^{k} X_{t-i}$
$Z_{1t} = \sum_{i=0}^{k} i X_{t-i}$
$Z_{2t} = \sum_{i=0}^{k} i^2 X_{t-i}$

• We may write equation (25) as:

$Y_t = \alpha + a_0 Z_{0t} + a_1 Z_{1t} + a_2 Z_{2t} + u_t$ (26)

• In the Almon scheme, Y is regressed on the constructed Z variables, not
on the original X's, and (26) can be estimated by OLS (unlike the Koyck model).
• Once the a's are estimated from equation (26), we can obtain
the β's as follows:
$\hat\beta_0 = \hat a_0$
$\hat\beta_1 = \hat a_0 + \hat a_1 + \hat a_2$
$\hat\beta_2 = \hat a_0 + 2\hat a_1 + 4\hat a_2$
$\hat\beta_3 = \hat a_0 + 3\hat a_1 + 9\hat a_2$
…
$\hat\beta_k = \hat a_0 + k\hat a_1 + k^2 \hat a_2$

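The construction of the Z variables and the recovery of the β's from the a's can be sketched as follows. The data, the polynomial coefficients `a` and the helper names are assumptions chosen for illustration; the a's here are simply given, whereas in practice they would come from the OLS regression (26).

```python
def almon_z(x, t, k, power):
    """Z_{power,t} = sum over i = 0..k of i**power * X_{t-i}."""
    return sum((i ** power) * x[t - i] for i in range(k + 1))


def almon_betas(a, k):
    """beta_i = a_0 + a_1*i + ... + a_m*i**m for i = 0..k."""
    return [sum(a_j * i ** j for j, a_j in enumerate(a)) for i in range(k + 1)]


x = [2.0, 5.0, 3.0, 8.0, 6.0, 4.0, 7.0]   # made-up regressor values
a = [0.5, 0.4, -0.1]                      # assumed polynomial coefficients (m = 2)
k = 3                                     # maximum lag length
t = 5                                     # period at which both forms are evaluated

betas = almon_betas(a, k)

# The lag part of (22) and the Z part of (26) should give the same number.
lag_form = sum(betas[i] * x[t - i] for i in range(k + 1))
z_form = sum(a[j] * almon_z(x, t, k, j) for j in range(len(a)))

print(betas)
print(lag_form, z_form)
```

Because the two forms coincide, estimating three a's is enough to pin down all k + 1 lag coefficients, which is the economy the polynomial restriction buys.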
• Before we apply the Almon technique, these issues must be
resolved:
– The maximum lag k must be determined. Many criteria are available for
selecting the lag (AIC, SBC, LR). One popular approach is to test
down from the largest lag to the smallest.

– The degree of the polynomial (m) must be determined. It should be at least one
more than the number of turning points in the curve relating $\beta_i$ to i. One
may argue that a lower-degree polynomial gives a better approximation,
but this must be decided carefully (bearing in mind the problem
of multicollinearity). One way to determine m is
a top-down approach following Hendry's method.

– Once m and k are determined, we can proceed with the construction
of the Z's (see equation 17.13.10, p. 648).

ADVANTAGES AND DISADVANTAGES OF
ALMON MODEL
• Advantages:
1. It provides a flexible method of incorporating a variety of lag structures.

2. We do not have to worry about the presence of the lagged dependent variable
as an explanatory variable in the model and the estimation problems it creates
(the model can satisfy all the assumptions of the CLRM).

• Disadvantages:
1. The choice of the degree of the polynomial (m) and the lag length (k) is largely
subjective.

2. The Z's are likely to exhibit multicollinearity (because they are constructed from the
original X's), but this may not cause a serious problem in the estimation
procedure (recall the problem of multicollinearity and its
consequences).
GRANGER CAUSALITY (1969)
• When we identify one variable as the ‘dependent variable’ (Y) and
another as the ‘explanatory variable’ (X), we make an implicit
assumption that changes in the explanatory variable induce changes
in the dependent variable. This is the notion of causality, in which
information about X is expected to affect the conditional distribution
of the future values of Y.

– Example: does money supply cause the interest rate, or is it the other way
round?

• The intuition behind the Granger causality test is quite
straightforward. Suppose X Granger-causes Y but Y does not Granger-cause
X. Then past values of X should help predict
future values of Y, but past values of Y should not be helpful in
forecasting X.

• In short, there are four possibilities while testing the direction:
i. Unidirectional causality (from X to Y, where Y is dependent
variable)
ii. Unidirectional causality (from Y to X)
iii. Feedback effect or bi-directional causality
iv. Independence (no direction of causality)

• Example:
$r_t = \sum_{i=1}^{n} \alpha_i r_{t-i} + \sum_{j=1}^{n} \beta_j m_{t-j} + u_t$ (27)

$m_t = \sum_{i=1}^{n} \lambda_i m_{t-i} + \sum_{j=1}^{n} \delta_j r_{t-j} + v_t$ (28)

where r = interest rate, m = money supply, and $u_t$ and $v_t$ are error terms.


STEPS IN GRANGER CAUSALITY TEST

i. Regress r on all lagged r terms and other variables (if any),
but do not include the lagged m variables in the regression.
Obtain $RSS_R$ (the restricted residual sum of squares).
ii. Run the regression again with the lagged m terms included. Obtain
$RSS_{UR}$ (the unrestricted residual sum of squares).
iii. The null hypothesis is H0: $\beta_j = 0$ for all j, that is, the lagged values
of m do not belong in the equation (see equation 27).

iv. Test the hypothesis using the F-test:

$F = \frac{(RSS_R - RSS_{UR}) / m}{RSS_{UR} / (n - k)}$

where m is the number of lagged m terms, k is the
number of parameters estimated in the unrestricted
regression and n is the number of observations.
v. If the computed F value exceeds the critical F value at the
chosen level of significance, we reject the null hypothesis
and conclude that m Granger-causes r.
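Step iv of the procedure above reduces to a one-line calculation once the two residual sums of squares are available. The sketch below uses hypothetical values (2 lagged m terms, 50 observations, 5 parameters in the unrestricted regression); the function name `granger_f` and all the numbers are assumptions for illustration.

```python
def granger_f(rss_r, rss_ur, m, n, k):
    """F = [(RSS_R - RSS_UR) / m] / [RSS_UR / (n - k)].

    rss_r  restricted residual sum of squares (lagged m terms excluded)
    rss_ur unrestricted residual sum of squares (lagged m terms included)
    m      number of lagged m terms (restrictions under H0)
    n      number of observations
    k      number of parameters in the unrestricted regression
    """
    if n <= k:
        raise ValueError("need more observations than estimated parameters")
    return ((rss_r - rss_ur) / m) / (rss_ur / (n - k))


# Hypothetical numbers, for illustration only.
f_stat = granger_f(rss_r=120.0, rss_ur=100.0, m=2, n=50, k=5)
print(f_stat)
```

The resulting statistic is compared with the critical value of the F distribution with m and n − k degrees of freedom at the chosen significance level.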

• To test whether r Granger-causes m, the steps must be
repeated using equation (28).
– The most significant disadvantage of the Granger causality test is its sensitivity to the selection
of the lag length: different lag choices can lead to different results in
empirical work.

– To address the problem, one may use complementary tests to gain
more insight into the nature of causality in the empirical study, or
combine the Granger causality test with other time series
econometric techniques such as tests for non-stationarity, cointegration,
VECM, VDCs and IRFs.

– Example: Sims (1972) and Toda and Yamamoto (1995).

Example
• An application of the Toda and Yamamoto (1995) Granger non-causality
test to the twin deficits phenomenon between current account
deficits (CAD) and budget deficits (BD) in ASEAN-4 countries [Evan
Lau, Liew Khim Sen and Puah Chin Hong, “The Tale Of The Twin
Deficits Nexus: An Alternative Procedure,” International Journal of
Business and Society, 2004, Vol. 5 No. 2, pp. 33-53].
• The ‘twin deficits’ hypothesis asserts a unidirectional causality from budget
deficits to current account deficits.
• A unidirectional causality from current account deficits to budget deficits is
termed ‘current account targeting’.
• Bilateral causality suggests synchronization of fiscal policy and external policy,
while the absence of causality is in line with the Ricardian
Equivalence Hypothesis.

• The model
– Following the Toda and Yamamoto (1995) Granger non-causality test,
these variables can be causally linked in a two-dimensional VAR
system (assuming p = 3):

$\begin{bmatrix} CAD_t \\ BD_t \end{bmatrix} = A_0 + A_1 \begin{bmatrix} CAD_{t-1} \\ BD_{t-1} \end{bmatrix} + A_2 \begin{bmatrix} CAD_{t-2} \\ BD_{t-2} \end{bmatrix} + A_3 \begin{bmatrix} CAD_{t-3} \\ BD_{t-3} \end{bmatrix} + \begin{bmatrix} \varepsilon_{CAD} \\ \varepsilon_{BD} \end{bmatrix}$

where $A_0$ is an identity matrix. To test whether BD does not Granger cause
movement in CAD (if k = 2 and dmax = 1), we test H0: $a_{12}^{(1)} = a_{12}^{(2)} = 0$, where $a_{12}^{(i)}$ are the
coefficients of $BD_{t-i}$, i = 1, 2, in the first equation of the system.
– The existence of causality from BD to CAD can be established by
rejecting the above H0, which requires the
MWALD statistic for $BD_{t-1}$ and $BD_{t-2}$ to be significant, while $BD_{t-3}$ is left
unrestricted as a long-run correction mechanism.
– Analogous restrictions and testing procedures can be applied in
testing the hypothesis that CAD does not Granger cause movement in BD,
i.e. to test H0: $a_{21}^{(1)} = a_{21}^{(2)} = 0$.
Long Run Granger Non-causality Results

Null hypothesis                        MWALD     p-value   Conclusion

A: Indonesia (k=4, d=1)
  BD does not Granger cause CAD         5.546    0.235     Do not reject H0
  CAD does not Granger cause BD        11.161    0.024     Reject H0

B: Malaysia (k=3, d=1)
  BD does not Granger cause CAD         8.263    0.041     Reject H0
  CAD does not Granger cause BD        10.714    0.013     Reject H0

C: Philippines (k=5, d=1)
  BD does not Granger cause CAD        11.842    0.037     Reject H0
  CAD does not Granger cause BD         7.223    0.204     Do not reject H0

D: Thailand (k=5, d=1)
  BD does not Granger cause CAD         5.165    0.396     Do not reject H0
  CAD does not Granger cause BD        18.729    0.002     Reject H0

Notes: k = optimum lag and d = maximal order of integration. BD = budget deficits; CAD = current account deficits.
[Figure 1: Direction of Causal Relationship. Panels: A: Indonesia, B: Malaysia, C: Philippines, D: Thailand; each panel links BD and CAD.]

Note: BD → CAD implies one-way causality, while BD ↔ CAD indicates a bi-directional causal relationship.