Applications of Econometrics Group Project 2016

By Andrey Mantrov (s1359793), Anton Bochkov (s1371514), Sergei Gumilevskiy (s1339803), Maxim Aleshin (s1327303)

Group number: 21

Country of choice: Australia.

Data for CPI: https://data.oecd.org/price/inflation-cpi.htm

Data for short-term interest rates: https://data.oecd.org/interest/short-term-interest-rates.htm

Data for deficit as % of GDP: https://data.oecd.org/gga/general-government-deficit.htm

Question 1 - Plot each of the three variables (interest rate, inflation and
government deficit) against time on separate line graphs. Be sure to
carefully label the horizontal and vertical axes.

a) Graph of the deficit as a percentage of GDP, annual data for Australia, 1970-2013

[Figure: line graph; vertical axis "Debt" (deficit as % of GDP, roughly -6 to 2), horizontal axis "Year" (1970 to 2010).]

As the graph shows, Australia ran a budget deficit for most of the period from 1970 to 2013, the exceptions being 1996-2000 and 2002-2008, the latter period ending when the global crisis hit. The increasing deficit can be explained by the stagflation that began in 1972-1973 (inflation and unemployment rising simultaneously), which followed Britain's abandonment of its Imperial Preference policy and a decline in US investment during the Vietnam War era.

b) Graph of the CPI change for Australia, 1970-2013

[Figure: line graph; vertical axis "CPI" (annual % change, roughly 0 to 15), horizontal axis "Year" (1970 to 2010).]

Our inflation measure is the annual % change in CPI. As the graph above shows, it soared from 1970 onwards because of the stagflation described above. After 1990 inflation fell and has since stayed between 1% and 5%.

c) Graph of short-term interest rate fluctuations in Australia, 1970-2013

[Figure: line graph; vertical axis "Interest Rate" (%, roughly 0 to 20), horizontal axis "Year" (1970 to 2010).]

Short-term interest rates rose sharply from 1972, reaching 13%, in order to fight high inflation: higher interest rates strengthen the incentive to save, which reduces consumption.

Question 2 - Run a simple regression of the interest rate on the rate of
inflation and government deficit, both with and without a time trend. Do the
coefficients on inflation and government deficit have the expected signs?
Why or why not?

a) We regressed the interest rate on inflation and the deficit without a time trend and obtained the following results:
Table 1

The estimated regression is:

$\widehat{ir} = 4.303 + 0.617\,cpi - 0.303\,dpcgdp$

Since we are asked for a simple regression, we assume that all six OLS assumptions hold in our data.
The coefficient on CPI and the intercept have p-values of 0.000, so they are statistically significant at the 1% significance level. The coefficient on deficit as % of GDP, however, has a p-value of 0.077, which is above 0.05, so at the 5% significance level we cannot reject the hypothesis that it is zero (it is significant only at the 10% level).

b) We then included a time trend and ran the regression again, obtaining the following results:

Table 2

The estimated regression, where t is the time trend, is:

$\widehat{ir} = 6.0857 - 0.0500\,t + 0.5027\,cpi - 0.2994\,dpcgdp$

The coefficient on the time trend has a p-value of 0.348, so it is statistically insignificant: we find no evidence of a time trend (a pattern that is consistent over time) in our data. We were surprised to find no time trend in the inflation data, since prices grow over time, so we ran the regression again using GDP deflator data. The time trend was not significant there either (p-value of 0.558), and we stayed with the CPI change as our inflation measure.

Table 3

The coefficient on inflation has the expected positive sign: when inflation increases we expect the central bank to raise interest rates to combat inflationary pressure. This was the case in 1972, when higher short-term interest rates in Australia led to a credit squeeze during the stagflation period.
For the coefficient on deficit as % of GDP we expected a positive relationship between the interest rate and the deficit. A working paper by Eric Engen and R. Glenn Hubbard found that when government debt increases by 1 percent of GDP, interest rates increase by about two basis points (footnote 1). According to Laubach's estimates (footnote 2), when the projected deficit-to-GDP ratio increases by one percentage point, long-term interest rates increase by roughly 25 basis points, and a rising deficit also puts upward pressure on short-term interest rates. In our regressions, however, the estimated coefficient on the deficit is negative (about -0.30), so it does not have the expected sign, and in Table 1 it is not significant at the 5% level.

1 Eric M. Engen and R. Glenn Hubbard, "Federal Government Debt and Interest Rates", April 2005.
2 https://www.stlouisfed.org/publications/central-banker/summer-2004/budget-deficits-and-interest-rates-what-is-the-link

Question 3 - Test for the presence of serial correlation in the residuals from
the specification in Part 2. What do you conclude?

Without serial correlation, $\mathrm{Corr}(u_t, u_s \mid X) = 0$ for all $t \neq s$, so the error terms are uncorrelated over time. To test for serial correlation we used the regression without a time trend, because we found the coefficient on the time trend to be statistically insignificant.

First we predicted the residuals and created their first lag in order to see whether adjacent errors are correlated. We then ran a regression of $\hat{u}_t$ on $x_t$ and $\hat{u}_{t-1}$ for t = 1, 2, ..., 44 to obtain the coefficient $\hat{\rho}$ on $\hat{u}_{t-1}$:

$\hat{u}_t = \hat{\rho}\,\hat{u}_{t-1} + e_t$

To determine whether there is serial correlation we used an AR(1) t test of whether the coefficient $\rho$ is significantly different from zero. Because the test is justified in large samples and our sample is reasonably large (n = 44), we conducted the version of the test that does not require strictly exogenous regressors. The two hypotheses for our t test are:

$H_0: \rho = 0$ and $H_1: \rho > 0$.

The t statistic for $\hat{\rho}$ is 8.14, which is larger than 1.96 (the two-sided critical value at the 5% significance level, and therefore also above the one-sided value of 1.645), so we reject the null hypothesis that $\rho = 0$, that is, of no serial correlation. The estimated correlation between adjacent errors is $\hat{\rho} = 0.7952706$, so we have positive serial correlation in our model.
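The AR(1) test described above can be sketched in Stata as follows. This is only an illustration under stated assumptions: it uses simulated data (not the project data) so that it runs on its own, and the names `y`, `x`, `u` and `uhat` are hypothetical. With the project data, `y` and `x` would be replaced by `ir` and `cpi dpcgdp`.

```stata
* Simulated data only (NOT the project data): y depends on x, errors follow AR(1)
clear
set seed 2016
set obs 44
gen year = 1969 + _n
tsset year
gen x = rnormal()
gen e = rnormal()
gen u = e
* Build the AR(1) error recursively, observation by observation
replace u = 0.8*u[_n-1] + e in 2/l
gen y = 1 + 0.5*x + u

* Step 1: static regression and its residuals
regress y x
predict double uhat, residuals

* Step 2: regress the residual on the regressor and its own first lag;
* the t statistic on L.uhat tests H0: rho = 0
regress uhat x L.uhat

* Built-in tests of the same hypothesis, run after the original regression
quietly regress y x
estat durbinalt, lags(1)
estat bgodfrey, lags(1)
```

Including the regressor in step 2 is what makes the test valid without strict exogeneity; `estat durbinalt` and `estat bgodfrey` are Stata's packaged versions of this kind of test.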

Question 4 - Discuss two ways in which econometricians have at their
disposal to deal with heteroscedasticity and serial correlation in the
residuals. Re-run the specification in Part 2, this time using Newey-West
standard errors.

Heteroscedasticity - The situation in which the variance of the regression error term, conditional on the regressors, is not constant.
It is written mathematically as: $\mathrm{Var}(U_t \mid X_t) = \sigma_t^2$, which varies with t instead of being equal to a constant $\sigma^2$.
Serial Correlation - Serial correlation occurs when the error terms are correlated over time; i.e., conditional on the explanatory variables, the unobserved factors are correlated from one time period to the next. It is written mathematically as: $\mathrm{Corr}(U_t, U_s \mid X) \neq 0$ for some $t \neq s$.

Generally, econometricians deal with heteroscedasticity and serial correlation in two ways:

1) By transforming the model in order to eliminate the effect of serial correlation

To transform the model we use the Cochrane-Orcutt or Prais-Winsten iterative procedures. The transformation is derived as if the value of $\rho$ were known; replacing $\rho$ with an estimate gives feasible generalized least squares (FGLS) for errors that follow an AR(1) model. To apply it, we assume the first four OLS assumptions, as well as a single, strictly exogenous regressor and AR(1) errors.

The transformation is called quasi-differencing, and the model is:

$y_t = \beta_0 + \beta_1 X_t + U_t$  (1)

$U_t = \rho U_{t-1} + E_t$  (2)

Now we assume that we know $\rho$. We lag equation (1) by one period and multiply it by $\rho$:

$\rho y_{t-1} = \rho\beta_0 + \rho\beta_1 X_{t-1} + \rho U_{t-1}$  (3)

Subtracting (3) from (1) and using (2) gives:

$y_t - \rho y_{t-1} = \beta_0(1-\rho) + \beta_1(X_t - \rho X_{t-1}) + E_t$

The equation above will have non-autocorrelated errors as the error term in this
equation is in fact Et, and it satisfies all the properties needed for applying the OLS
procedure.

In practice, however, the value of the parameter $\rho$ is unknown, so we have to estimate it. To do that, we use the feasible generalized least squares (FGLS) estimator of this model, which replaces $\rho$ with its estimate $\hat{\rho}$ from $\hat{U}_t = \hat{\rho}\hat{U}_{t-1} + \hat{E}_t$. Although FGLS has no known small-sample properties, it is asymptotically appropriate. The procedure is as follows. First we run the regression $y_t = \hat{b}_0 + \hat{b}_1 X_t + \hat{U}_t$, which gives us the residuals $\hat{U}_t$. We then use those residuals to mimic the population error process: we regress $\hat{U}_t$ on $\hat{U}_{t-1}$, and the coefficient from this regression is our estimate of $\rho$, namely $\hat{\rho}$. This regression also gives us estimates $\hat{E}_t$ of the population errors. Finally, we use the estimated $\hat{\rho}$ in the model transformation described above. FGLS is biased in finite samples because it uses the estimated parameter $\hat{\rho}$ instead of the known $\rho$. If the regressors are strictly exogenous, however, FGLS is consistent.
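The FGLS steps above can be sketched in Stata as follows. This is a minimal sketch on simulated data (not the project data); the variable names `y`, `x`, `uhat`, `y_qd`, `x_qd`, `one_qd` and the scalar `rhohat` are hypothetical. The manual version performs a single quasi-differencing step, while `prais` iterates the same idea.

```stata
* Simulated data only (NOT the project data)
clear
set seed 2016
set obs 44
gen year = 1969 + _n
tsset year
gen x = rnormal()
gen e = rnormal()
gen u = e
replace u = 0.8*u[_n-1] + e in 2/l
gen y = 1 + 0.5*x + u

* Step 1: OLS on the original model and its residuals
regress y x
predict double uhat, residuals

* Step 2: estimate rho from uhat_t = rho*uhat_{t-1} + e_t
regress uhat L.uhat, noconstant
scalar rhohat = _b[L.uhat]

* Step 3: quasi-difference and run OLS on the transformed model;
* the coefficient on one_qd estimates beta0, the one on x_qd estimates beta1
gen double y_qd   = y - rhohat*L.y
gen double x_qd   = x - rhohat*L.x
gen double one_qd = 1 - rhohat
regress y_qd one_qd x_qd, noconstant

* Built-in iterated estimators
prais y x, corc
prais y x
```

The `corc` option requests Cochrane-Orcutt, which drops the first observation; without it `prais` uses the Prais-Winsten transformation, which keeps it.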

2) Using Newey-West (HAC) standard errors, which correct for serial correlation and heteroscedasticity.

When the error term in the regression function is serially correlated, the OLS coefficients remain consistent but the classical standard errors are incorrect. Instead of the original formula, we use the Newey-West formula, which makes the standard errors robust to serial correlation and heteroscedasticity.

The formula is written as: $SE_{NW}(\hat{\beta}_1) = \left[ SE_{OLS}(\hat{\beta}_1) / \hat{\sigma} \right]^2 \sqrt{\hat{v}}$, where we take the classical standard error, divide it by $\hat{\sigma}$, square the result and then multiply by $\sqrt{\hat{v}}$.

The value of $\hat{v}$ depends on the integer g, which controls the amount of serial correlation that we allow for in the standard error computation.
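For reference, the quantity $\hat{v}$ has the following standard textbook form. We assume this is the version behind the formula above. Here $\hat{u}_t$ are the OLS residuals and $\hat{r}_t$ are the residuals from regressing the first regressor on the other regressors; both symbols are introduced only for this illustration. The second line writes out the weights for g = 3.

```latex
\begin{align*}
\hat{v} &= \sum_{t=1}^{n} \hat{a}_t^{2}
  + 2 \sum_{h=1}^{g} \left[ 1 - \frac{h}{g+1} \right]
    \left( \sum_{t=h+1}^{n} \hat{a}_t \hat{a}_{t-h} \right),
  \qquad \hat{a}_t = \hat{r}_t \hat{u}_t \\
g = 3:\quad
\hat{v} &= \sum_{t=1}^{n} \hat{a}_t^{2}
  + 2 \left[ \tfrac{3}{4} \sum_{t=2}^{n} \hat{a}_t \hat{a}_{t-1}
  + \tfrac{1}{2} \sum_{t=3}^{n} \hat{a}_t \hat{a}_{t-2}
  + \tfrac{1}{4} \sum_{t=4}^{n} \hat{a}_t \hat{a}_{t-3} \right]
\end{align*}
```

The declining weights mean that correlations between more distant errors contribute less, and correlations beyond lag g are ignored.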

As suggested by Newey and West (1987), we take the integer part of $4(n/100)^{2/9}$ as the value of g. With n = 44 this gives g = 3, which we use for the HAC standard error computation.
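The lag choice can be checked with a one-line calculation. The scalar name `nobs` is hypothetical; the only input is the sample size n = 44 from the text.

```stata
* Newey-West rule of thumb: integer part of 4*(n/100)^(2/9)
scalar nobs = 44
* Raw value, roughly 3.33
display 4*(nobs/100)^(2/9)
* Integer part, which gives g = 3
display floor(4*(nobs/100)^(2/9))
```

This matches the `lag(3)` option used with `newey` in the appendix.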

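We re-ran the specification in Part 2 using Newey-West standard errors instead of the classical ones. As the table shows, the only slope coefficient that is statistically significant at the 5% level is the one on CPI. The coefficient estimates for government deficit and the time trend are the same as before, because Newey-West only changes the standard errors, and both remain insignificant.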

Regression with Newey-West standard errors      Number of obs =     44
maximum lag: 3                                  F(3, 40)      =   5.40
                                                Prob > F      = 0.0032

             |             Newey-West
          ir |     Coef.   Std. Err.      t    P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
      dpcgdp | -.2994178   .2227513    -1.34   0.186   -.7496149    .1507793
         cpi |  .5027423   .2336629     2.15   0.038     .030492    .9749926
           t | -.0500882   .0624588    -0.80   0.427   -.1763222    .0761458
       _cons |  6.085717   2.339402     2.60   0.013    1.357608    10.81383

Question 5 - Test for the presence of a unit root in each of the three
variables, first using the basic Dickey Fuller (DF) test for AR(1)
autocorrelation, and then the Augmented DF test with two lagged changes.

Basic Dickey-Fuller Test:

The Dickey-Fuller test is used in econometrics to determine whether a unit root is present in a time series. If $y_t$ has a unit root, then it contains a stochastic trend and is non-stationary. We begin with an AR(1) model:

$y_t = \alpha + \rho y_{t-1} + e_t$

The next step is to subtract $y_{t-1}$ from both sides, which eliminates the possibility of a unit root on the left-hand side. A unit root exists only if $\rho = 1$.

$y_t - y_{t-1} = \alpha + (\rho - 1) y_{t-1} + e_t$

Simplifying the equation above gives:

$\Delta y_t = \alpha + \theta y_{t-1} + e_t$

* $y_t$ is either the interest rate, CPI or the budget deficit, $e_t$ is an independent and identically distributed random variable with mean zero and variance $\sigma_e^2$, and $\theta$ is defined as $\theta = \rho - 1$.

In the basic Dickey-Fuller test we test the null hypothesis of a unit root:

$H_0: \theta = 0$ against $H_1: \theta < 0$

Under $H_0$ the t statistic does not follow the standard normal distribution, so we use the Dickey-Fuller distribution instead.

To answer the question, we regressed the change in cpi on $cpi_{t-1}$, the change in ir on $ir_{t-1}$ and the change in dpcgdp on $dpcgdp_{t-1}$, which gave the following equations:

$\Delta cpi_t = 0.8361455 - 0.1497307\,cpi_{t-1}$

$\Delta ir_t = 1.107916 - 0.1396161\,ir_{t-1}$

$\Delta dpcgdp_t = -0.4816287 - 0.1870116\,dpcgdp_{t-1}$

Variable   theta_hat    alpha_hat   SE(theta_hat)   t-statistic   R2
cpi        -0.1497307   0.8361455   0.0833979       -1.80         0.0729
ir         -0.1396161   1.107916    0.086297        -1.62         0.06
dpcgdp     -0.1870116   -0.481687   0.0899889       -2.08         0.0953

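We compare each computed t statistic with the Dickey-Fuller distribution. The critical value we used at the 5% significance level is -3.41. All of our computed t statistics (-1.80, -1.62 and -2.08) are higher than the critical value, so we fail to reject the null hypothesis of a unit root, as the data do not provide strong evidence against H0. Note that -3.41 is the asymptotic 5% critical value for a test regression that includes a time trend; for a regression without a trend, as here, the asymptotic value is -2.86, and the conclusion is the same.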
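The link between the manual regression and Stata's built-in test can be sketched as follows, using a simulated random walk (not the project data; the names `y` and `e` are hypothetical). The t statistic on `L.y` in the manual regression is the Dickey-Fuller statistic reported by `dfuller` with `lags(0)`.

```stata
* Simulated random walk only (NOT the project data)
clear
set seed 2016
set obs 44
gen year = 1969 + _n
tsset year
gen e = rnormal()
* Running sum of the shocks gives a unit root process
gen y = sum(e)

* Manual DF regression: change in y on the lagged level of y
regress D.y L.y

* Built-in test: lags(0) is the basic DF test, lags(2) adds two lagged changes
dfuller y, lags(0) regress
dfuller y, lags(2) regress

* Adding a linear trend changes the test regression and its critical values
dfuller y, lags(2) trend regress
```

`dfuller` prints critical values that match the chosen specification and sample size, which avoids mixing up the trend and no-trend tables.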

Augmented Dickey-Fuller Test:

The augmented Dickey-Fuller test cleans up any serial correlation in $\Delta y_t$ by including additional lagged changes.

With two additional lagged changes the test equation becomes:

$\Delta y_t = \alpha + \theta y_{t-1} + \gamma_1 \Delta y_{t-1} + \gamma_2 \Delta y_{t-2} + e_t$

* where $y_t$ is either CPI, the interest rate or the budget deficit, and $e_t$ is an independent and identically distributed random variable with mean zero and variance $\sigma_e^2$.

As in the previous part of the question, we regressed the change in cpi on $cpi_{t-1}$, $\Delta cpi_{t-1}$ and $\Delta cpi_{t-2}$. The same regressions were run for ir and dpcgdp. The estimated equations are:

$\Delta cpi_t = 0.7239598 - 0.1379655\,cpi_{t-1} + 0.1271802\,\Delta cpi_{t-1} - 0.1420959\,\Delta cpi_{t-2}$

$\Delta ir_t = 0.7252646 - 0.0915301\,ir_{t-1} + 0.0140944\,\Delta ir_{t-1} - 0.3472686\,\Delta ir_{t-2}$

$\Delta dpcgdp_t = -0.689726 - 0.277876\,dpcgdp_{t-1} + 0.363144\,\Delta dpcgdp_{t-1} + 0.08658\,\Delta dpcgdp_{t-2}$

As before, we test $H_0: \theta = 0$ against $H_1: \theta < 0$.

Variable   alpha_hat    theta_hat    gamma1_hat   gamma2_hat   SE(theta_hat)   t(theta_hat)
cpi        0.7239598    -0.1379655   0.1271802    -0.1420959   0.0935762       -1.474
ir         0.7252646    -0.0915301   0.0140944    -0.3472686   0.093711        -3.641
dpcgdp     -0.6897259   -0.2778762   0.3631436    0.0865765    0.1016869       -2.733

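From the table above, at the 5% critical value we used (-3.493) the null hypothesis can be rejected only for the interest rate (-3.641). For the other variables we fail to reject the null hypothesis, so a unit root may be present. For the interest rate, however, the ratio of the reported coefficient to its standard error, -0.0915301/0.093711, is about -0.98 rather than -3.641, so this rejection should be checked against the regression output.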

Question 6 - Given your answer to Part 5, does it seem more reasonable to
run the regression in levels or in first differences?

Using time series with strong persistence of the type displayed by a unit root process in a regression equation can lead to very misleading results if the CLM assumptions are violated. Since our variables appear to have unit roots, we need to transform them so that they become weakly dependent. Unit root processes are said to be integrated of order one, or I(1). This means that the first difference of the process is weakly dependent and can be used for regression analysis. In Question 5 the basic Dickey-Fuller test did not reject a unit root for any of our variables (the augmented test rejected it only for the interest rate), so we run our regression in first differences.
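A short derivation shows why differencing helps. It takes the AR(1) model with $\rho = 1$ (a random walk with drift $\alpha$), written with assumed notation in which $e_t$ is i.i.d. with variance $\sigma_e^2$ and $y_0$ is the starting value.

```latex
\begin{align*}
y_t &= \alpha + y_{t-1} + e_t
  \quad\Longrightarrow\quad
  \Delta y_t = y_t - y_{t-1} = \alpha + e_t \\
y_t &= y_0 + \alpha t + \sum_{j=1}^{t} e_j,
  \qquad
  \mathrm{Var}(y_t \mid y_0) = \sigma_e^2\, t
\end{align*}
```

The level accumulates every past shock, so its variance grows with t, whereas the first difference is just a constant plus the current shock and is therefore weakly dependent.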

The term spurious correlation describes a situation where two variables are related only through their correlation with a third variable. Similarly, a simple regression involving two independent I(1) series, which our variables in levels may be, will often produce a significant t statistic even though the series are unrelated (the spurious regression problem). Fortunately, the notion of cointegration, due to Engle and Granger (1987), makes regressions involving I(1) variables potentially meaningful. In the next question we run our regression in first differences and then test for a spurious relationship.

Question 7 - Run your preferred specification (either in levels or in first
differences). Do the coefficients on inflation and deficit have the expected
signs now?

As explained above, we decided to run our regression in first differences and, as in Question 2, we did so both without and with a time trend. Our new models are:

$\Delta ir_t = \beta_0 + \beta_1 \Delta cpi_t + \beta_2 \Delta dpcgdp_t + u_t$

$\Delta ir_t = \beta_0 + \beta_1 \Delta cpi_t + \beta_2 \Delta dpcgdp_t + \beta_3 t + u_t$

The results are summarised in the tables below:


. reg d1_ir d1_cpi d1_dpcgdp

      Source |       SS        df       MS          Number of obs =     43
-------------+------------------------------        F(2, 40)      =   9.55
       Model |  73.022246       2   36.511123       Prob > F      = 0.0004
    Residual | 152.946877      40  3.82367192       R-squared     = 0.3232
-------------+------------------------------        Adj R-squared = 0.2893
       Total | 225.969123      42  5.38021721       Root MSE      = 1.9554

       d1_ir |     Coef.   Std. Err.      t    P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
      d1_cpi |  .5945714   .1369205     4.34   0.000    .3178447    .8712981
   d1_dpcgdp |  .0324116   .2128208     0.15   0.880   -.3977152    .4625384
       _cons |  -.085855   .2983185    -0.29   0.775   -.6887792    .5170692

. reg d1_ir d1_cpi d1_dpcgdp t

      Source |       SS        df       MS          Number of obs =     43
-------------+------------------------------        F(3, 39)      =   6.35
       Model | 74.1804565       3  24.7268188       Prob > F      = 0.0013
    Residual | 151.788666      39  3.89201709       R-squared     = 0.3283
-------------+------------------------------        Adj R-squared = 0.2766
       Total | 225.969123      42  5.38021721       Root MSE      = 1.9728

       d1_ir |     Coef.   Std. Err.      t    P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
      d1_cpi |  .5871272   .1388112     4.23   0.000    .3063552    .8678993
   d1_dpcgdp |  .0343828   .2147447     0.16   0.874   -.3999794     .468745
           t | -.0132961   .0243735    -0.55   0.589    -.062596    .0360039
       _cons |  .2198519   .6361084     0.35   0.731   -1.066799    1.506503

The sign of the coefficient on the change in CPI is the same as in Question 2, as expected, but the sign on the change in the budget deficit has changed and now follows the theory explained in Question 2. It is worth mentioning, however, that the coefficient on the change in the budget deficit is statistically insignificant. Since the coefficient on the time trend is also insignificant and including it barely changes the other coefficients, we prefer the model without the t variable.

As we are running the regression in first differences we should not have a spurious regression problem, but it is worth checking anyway. The Dickey-Fuller t statistic for the residuals from the regression of the change in ir on the change in cpi is -6.878, so we reject a unit root in these residuals. We therefore conclude that the relationship is not spurious and that our t values are not overstated.
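For comparison, the usual Engle-Granger procedure tests for cointegration using the regression in levels rather than in differences. A minimal sketch on a simulated cointegrated pair (not the project data; the names `x`, `y` and `ehat` are hypothetical) might look like this. With the project data, `y` and `x` would be `ir` and `cpi`.

```stata
* Simulated cointegrated pair only (NOT the project data)
clear
set seed 2016
set obs 44
gen year = 1969 + _n
tsset year
* x is a random walk; y shares its stochastic trend plus stationary noise
gen x = sum(rnormal())
gen y = 1 + 0.5*x + rnormal()

* Step 1: regression in levels and its residuals
regress y x
predict double ehat, residuals

* Step 2: Dickey-Fuller regression on the residuals;
* rejecting a unit root in ehat is evidence of cointegration
dfuller ehat, noconstant lags(1) regress
```

Because the residuals come from an estimated regression, the critical values that `dfuller` prints are not the right ones for this step. Residual-based tests use more negative Engle-Granger critical values (roughly -3.34 at the 5% level for one regressor in asymptotic tables), so the printed p-value should be treated with caution.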

Question 8: Estimate your preferred specification using all but the most
recent observation of data. Forecast the interest rate for the most recent
observation, and compare it with the actual rate of interest. How closely did
the model match the data?

We start with the regression from Question 7 with the time trend, estimated without the observation for 2013. This gives the following regression:

$\widehat{\Delta ir}_t = 0.1636686 + 0.5926029\,\Delta cpi_t + 0.0318489\,\Delta dpcgdp_t - 0.009758\,t$

       d1_ir |     Coef.   Std. Err.      t    P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
      d1_cpi |  .5926029   .1405358     4.22   0.000    .3081031    .8771027
   d1_dpcgdp |  .0318489   .2168484     0.15   0.884   -.4071377    .4708356
           t |  -.009758   .0255422    -0.38   0.705   -.0614656    .0419495
       _cons |  .1636686   .6513291     0.25   0.803   -1.154878    1.482215

We can use this model to forecast ir at time t only if we assume that the values of the other variables for time t are already known at time t-1. This assumption is reasonable only if the government sets targets for CPI and the budget deficit and then commits to them, so that they equal the values later observed, which is not realistic. We nevertheless use this approach for simplicity. Plugging the 2013 values for CPI and the budget deficit into the equation above gives:

$\widehat{\Delta ir}_{2013} = -0.91756381$

To obtain the interest rate for 2013 we add the 2012 interest rate to this value: $\widehat{ir}_{2013} = -0.91756381 + 3.728333 = 2.8107692 \approx 2.81$, against the observed value of 2.78. This is quite a close fit, especially given that the adjusted R-squared for this regression is only 0.2779. We need to remember, however, that we treated the 2013 values of our independent variables as known, which is the main reason why the forecast is so accurate. OLS minimizes the in-sample residuals by construction, but 2013 lies outside the estimation sample, so the close fit comes mainly from using the realised regressors.

If we assume that we do not know the independent variables beforehand, we need to specify a different forecasting model that uses only variables known at time t-1:

$\Delta ir_t = \beta_0 + \beta_1 (cpi_{t-1} - cpi_{t-2}) + \beta_2 (dpcgdp_{t-1} - dpcgdp_{t-2}) + \beta_3 t + u_t$

Running this regression on all observations excluding 2013, we obtain approximately the following coefficients:

$\widehat{\Delta ir}_t = 0.57095 - 0.18539\,(cpi_{t-1} - cpi_{t-2}) + 0.72054\,(dpcgdp_{t-1} - dpcgdp_{t-2}) - 0.025836\,t$

       d1_ir |     Coef.   Std. Err.      t    P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
      d1_cpi |
         L1. | -.1718163   .1494231    -1.15   0.257   -.4743076     .130675
             |
   d1_dpcgdp |
         L1. |  .7278565   .2367769     3.07   0.004    .2485268    1.207186
             |
       _cons | -.0226481   .3324987    -0.07   0.946   -.6957564    .6504603

Plugging in the known values we get:

$\widehat{\Delta ir}_{2013} = -0.4626588$, so $\widehat{ir}_{2013} = 3.2656742 \approx 3.27$

This forecast is considerably worse than the previous one: the difference between the predicted value and the observed value is 0.4856742, or around 17%. It is, however, a more realistic prediction, because it uses only the information available at time t-1.
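The two forecasts can be compared with a few lines of arithmetic. All numbers are taken from the text above; the scalar names are hypothetical.

```stata
* Inputs from the text: 2012 rate, observed 2013 rate, two predicted changes
scalar ir_2012   = 3.728333
scalar ir_actual = 2.78
scalar d_f1 = -0.91756381
scalar d_f2 = -0.4626588

* Forecast level = last observed level + predicted change
scalar f1 = ir_2012 + d_f1
scalar f2 = ir_2012 + d_f2

* Forecast, absolute error and relative error for each model
display "Forecast 1: " f1
display "Error 1:    " (f1 - ir_actual)
display "Relative 1: " ((f1 - ir_actual)/ir_actual)
display "Forecast 2: " f2
display "Error 2:    " (f2 - ir_actual)
display "Relative 2: " ((f2 - ir_actual)/ir_actual)
```

The relative error is about 1.1% for the first forecast and about 17.5% for the second, which is the gap discussed above.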

Appendix do file

tsset year

*** Question 1
* Line graphs of each variable against time, with labelled axes
graph twoway line dpcgdp year, ytitle("Debt") xtitle("Year")
graph twoway line cpi year, ytitle("CPI") xtitle("Year")
graph twoway line ir year, ytitle("Interest Rate") xtitle("Year")

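*** Question 2
* OLS with robust standard errors, without and with a time trend
reg ir dpcgdp cpi, r
gen t = _n
reg ir dpcgdp cpi t, r

*** Question 3
* Residuals from the specification without the time trend, as described in the text
quietly reg ir dpcgdp cpi
cap drop u1
predict double u1, residuals
* AR(1) test: regress the residuals on their first lag
reg u1 l1.u1

*** Question 4
* Newey-West (HAC) standard errors with maximum lag g = 3
newey ir dpcgdp cpi, lag(3) level(95)
newey ir dpcgdp cpi t, lag(3) level(95)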

*** Question 5
* Basic Dickey-Fuller regressions: change in y on the lagged level of y
gen l1_cpi = l1.cpi
gen d1_cpi = d1.cpi
reg d1_cpi l1_cpi

gen l1_ir = l1.ir
gen d1_ir = d1.ir
reg d1_ir l1_ir

gen l1_dpcgdp = l1.dpcgdp
gen d1_dpcgdp = d1.dpcgdp
reg d1_dpcgdp l1_dpcgdp

* Augmented Dickey-Fuller tests with two lagged changes
dfuller ir, lags(2) regress
dfuller cpi, lags(2) regress
dfuller dpcgdp, lags(2) regress

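*** Question 7
* Regression in first differences, without and with a time trend
reg d1_ir d1_cpi d1_dpcgdp
reg d1_ir d1_cpi d1_dpcgdp t

*** Test for a spurious relationship
* Dickey-Fuller test on the residuals of the differenced regression
reg d1_ir d1_cpi
predict u3, residuals
dfuller u3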

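*** Question 8

*** Forecast 1: 2013 values of the regressors treated as known
reg d1_ir d1_cpi d1_dpcgdp if year < 2013
scalar b_0 = _b[_cons]
scalar b_1 = _b[d1_cpi]
scalar b_2 = _b[d1_dpcgdp]
* Predicted change in ir for 2013 from the 2013 changes in cpi and dpcgdp
scalar f_2013 = b_0 + b_1*(-1.54107) + b_2*(2.028539)
* Forecast level (2012 rate plus predicted change) and relative error against 2.78
disp f_2013 + 3.728333
disp (f_2013 + 3.728333 - 2.78)/2.78

*** Forecast 2: only information available at t-1
reg d1_ir l.d1_cpi l.d1_dpcgdp if year < 2013
scalar b_00 = _b[_cons]
scalar b_11 = _b[l.d1_cpi]
scalar b_22 = _b[l.d1_dpcgdp]
* Predicted change in ir for 2013 from the lagged changes in cpi and dpcgdp
scalar f1_2013 = b_00 + b_11*(0.38551) + b_22*(0.2065091)
disp f1_2013 + 3.728333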
