Group number: 21
Question 1 - Plot each of the three variables (interest rate, inflation and
government deficit) against time on separate line graphs. Be sure to
carefully label the horizontal and vertical axes.
a) Graph for Deficit as percentage of GDP, 1970-2013, annual data for Australia
[Figure: line graph of the deficit as % of GDP against year; vertical axis labelled "Debt", ranging from 2 to -6]
As the graph shows, Australia ran a budget deficit for most of the period from 1970 to 2013, the exceptions being 1996-2000 and 2002-2008, the latter ending when the worldwide crisis hit. The increasing deficit can be explained by the stagflation that began in Australia in 1972-1973 (inflation and unemployment rising simultaneously), which followed Britain's abandonment of the Imperial Preference policy and a decline in US investment during the Vietnam War era.
Our measure of inflation is the percentage change in the CPI. As its graph shows, inflation soared from 1970 onwards because of the stagflation described above. After 1990 inflation fell and has since stayed between 1% and 5%.
Short-term interest rates soared from 1972, reaching 13%, in order to combat high inflation: higher interest rates increase the incentive to save, which reduces consumption.
a) We have regressed the variables without a time trend and got the following
results:
Table 1
b) We then included a time trend and ran the regression again, obtaining the following results:
Table 2
Regression obtained:
$\widehat{IR} = 6.0857 - 0.0500\,\text{TimeTrend} + 0.5021\,\text{CPI} - 0.2994\,\text{dpcgdp}$
When we include a time trend in the regression, its coefficient has a p-value of 0.348, so it is statistically insignificant: the data show no time trend (a pattern that persists consistently over time). We were surprised that there was no time trend in the inflation data, since prices grow over time, so we ran the regression again using GDP deflator data. The time trend was not significant there either (p-value of 0.558), so we kept the change in CPI as our inflation measure.
Table 3
The coefficient on inflation has the expected sign: when inflation increases, we expect the central bank to raise interest rates to combat inflationary pressure. This was the case in 1972, when higher short-term interest rates in Australia resulted in a credit squeeze during the period of stagflation.
For the coefficient on the deficit as % of GDP, we expected a positive relationship between the interest rate and the deficit. A working paper by Eric Engen and R. Glenn Hubbard found that when government debt increases by 1 percent of GDP, interest rates increase by about two basis points. [1] According to Laubach's [2] estimates, when the projected deficit-to-GDP ratio increases by one percentage point, long-term interest rates increase by roughly 25 basis points, and a rising deficit also puts upward pressure on short-term interest rates.
[1] Eric M. Engen and R. Glenn Hubbard, "Federal Government Debt and Interest Rates", April 2005.
[2] https://www.stlouisfed.org/publications/central-banker/summer-2004/budget-deficits-and-interest-rates-what-is-the-link
Question 3 - Test for the presence of serial correlation in the residuals from
the specification in Part 2. What do you conclude?
Without serial correlation, $\mathrm{Corr}(u_t, u_s \mid X) = 0$ for all $t \neq s$, so the error terms are uncorrelated through time. To test for serial correlation we used the regression without a time trend, since we found the coefficient on the time trend to be statistically insignificant.
First we predicted the residuals and then created their lagged values in order to see whether adjacent errors are correlated. Then a regression of $\hat{u}_t$ on $x_t$ and $\hat{u}_{t-1}$ was run for $t = 2, \dots, 44$ (the first observation has no lagged residual). This was used to obtain the coefficient $\hat{\rho}$ on $\hat{u}_{t-1}$:
$\hat{u}_t = \hat{\rho}\,\hat{u}_{t-1} + e_t$
From our results, the t statistic $t_{\hat{\rho}}$ for $\hat{\rho}$ is 8.14, which is greater than 1.96 (the 5% critical value), so we reject the null hypothesis of no serial correlation, $\rho = 0$. The estimated correlation between adjacent errors is $\widehat{\mathrm{Corr}}(u_t, u_{t-1}) = 0.7952706$, and we conclude that our model has serial correlation.
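The test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ar1_residual_test` and the example residuals are hypothetical (the report's residuals are not reproduced here), and the regression is run without an intercept and without the regressors, matching the displayed equation.

```python
import math

def ar1_residual_test(resid):
    """Regress u_t on u_{t-1} (no intercept); return (rho_hat, t_stat)."""
    lagged = resid[:-1]    # u_{t-1}
    current = resid[1:]    # u_t
    sxx = sum(l * l for l in lagged)
    rho_hat = sum(c * l for c, l in zip(current, lagged)) / sxx
    ssr = sum((c - rho_hat * l) ** 2 for c, l in zip(current, lagged))
    dof = len(current) - 1             # one estimated parameter
    se = math.sqrt(ssr / dof / sxx)    # classical OLS standard error
    return rho_hat, rho_hat / se

# Hypothetical residuals, for illustration only (not the report's data).
example_resid = [1.0, 2.0, 1.0, 2.0, 1.0]
rho_hat, t_stat = ar1_residual_test(example_resid)

# Decision rule applied to the t statistic reported in the text.
reported_t = 8.14
reject_no_serial_correlation = abs(reported_t) > 1.96
```

With the reported t statistic of 8.14 the rule rejects the null of no serial correlation, as concluded in the text.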
Heteroscedasticity: the situation in which the variance of the regression error term, conditional on the regressors, is not constant.
It is written mathematically as: $\mathrm{Var}(u_t \mid X_t) = \sigma_t^2$, which varies with $t$ instead of equalling a constant $\sigma^2$.
Serial correlation: occurs when the error terms are correlated over time; i.e., conditional on the explanatory variables, the unobserved factors are correlated from one time period to the next. It is written mathematically as:
$\mathrm{Corr}(u_t, u_s \mid X) \neq 0$ for some $t \neq s$
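The transformation of the model is called quasi-differencing. Starting from $y_t = \beta_0 + \beta_1 X_t + u_t$ with AR(1) errors $u_t = \rho u_{t-1} + e_t$, we multiply the equation for period $t-1$ by $\rho$:
$\rho y_{t-1} = \rho\beta_0 + \rho\beta_1 X_{t-1} + \rho u_{t-1}$ (3)
and subtract it from the equation for period $t$:
$y_t - \rho y_{t-1} = \beta_0(1 - \rho) + \beta_1(X_t - \rho X_{t-1}) + e_t$
The equation above has non-autocorrelated errors, because its error term is $e_t$, and it satisfies all the properties needed for applying OLS.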
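As a small illustration of the quasi-differencing transformation, the following Python sketch applies it to a short made-up series. The helper name `quasi_difference` and the numbers are assumptions chosen so the arithmetic can be checked by hand; they are not the report's data.

```python
def quasi_difference(series, rho):
    """Return [x_t - rho * x_{t-1}] for t = 2..n (the first observation is dropped)."""
    return [cur - rho * prev for prev, cur in zip(series[:-1], series[1:])]

rho = 0.5                      # illustrative value, not an estimate
y = [1.0, 2.0, 3.0]            # made-up series
y_star = quasi_difference(y, rho)   # [2 - 0.5*1, 3 - 0.5*2]

# The constant is transformed in the same way: every entry becomes (1 - rho).
const_star = quasi_difference([1.0] * len(y), rho)
```

The same helper would be applied to the dependent variable and to each regressor before running OLS on the transformed data.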
In practice, however, the value of the parameter $\rho$ is unknown, so we have to estimate it. To do that, we use the feasible generalized least squares (FGLS) estimator of this model, which replaces $\rho$ with its estimate $\hat{\rho}$ from $\hat{u}_t = \hat{\rho}\,\hat{u}_{t-1} + e_t$. Although FGLS has no known small-sample properties, it is asymptotically appropriate. The process is as follows. The first regression we run is $y_t = \hat{\beta}_0 + \hat{\beta}_1 X_t + \hat{u}_t$, which gives us the residuals $\hat{u}_t$. We then use those residuals to mimic the population error process: we regress $\hat{u}_t$ on $\hat{u}_{t-1}$, and the coefficient from this regression is our estimate of $\rho$, $\hat{\rho}$. This regression also gives us an estimate $\hat{e}_t$ of the population errors. Finally, we use the estimated $\hat{\rho}$ in the model transformation described above. FGLS is biased in finite samples because we use the estimated parameter $\hat{\rho}$ instead of the true $\rho$. Yet, if we assume strict exogeneity of the regressors, FGLS is consistent.
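The three steps just described can be sketched as a single-pass, Cochrane-Orcutt-style FGLS for one regressor. This is a minimal sketch, not the estimation run for the report: the function names `ols_simple` and `fgls_ar1` are hypothetical, the data are made up (a line with slope 3 plus a decaying disturbance), and the first observation is simply dropped.

```python
def ols_simple(x, y):
    """OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def fgls_ar1(x, y):
    """One pass of FGLS with AR(1) errors; returns (rho_hat, b0, b1)."""
    # Step 1: OLS on the original model, keep the residuals.
    b0, b1 = ols_simple(x, y)
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # Step 2: regress u_t on u_{t-1} (no intercept) to estimate rho.
    rho_hat = (sum(c * l for l, c in zip(resid[:-1], resid[1:]))
               / sum(l * l for l in resid[:-1]))
    # Step 3: quasi-difference y and x, then run OLS on the transformed data.
    y_star = [c - rho_hat * l for l, c in zip(y[:-1], y[1:])]
    x_star = [c - rho_hat * l for l, c in zip(x[:-1], x[1:])]
    a_star, b1_fgls = ols_simple(x_star, y_star)
    # The transformed intercept equals b0 * (1 - rho), so rescale it.
    return rho_hat, a_star / (1 - rho_hat), b1_fgls

# Made-up data for illustration only.
x = [float(t) for t in range(10)]
u = [0.5 ** t for t in range(10)]
y = [2.0 + 3.0 * xi + ui for xi, ui in zip(x, u)]
rho_hat, b0_fgls, b1_fgls = fgls_ar1(x, y)
```

In applied work the steps are often iterated until the estimate of rho settles; the sketch stops after one pass to mirror the description above.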
When the error term in the regression function is serially correlated, the OLS coefficients remain consistent but the classical standard errors (SE) are incorrect. Instead of the original formula, we use the Newey-West SE formula, which is robust to both serial correlation and heteroscedasticity.
The value of $\hat{v}$ depends on the integer $g$, which controls how much serial correlation we allow for in the standard error computation.
As suggested by Newey and West (1987), we take the integer part of $4(n/100)^{2/9}$ as the value of $g$. For our data set this gives $g = 3$, which we use for the HAC SE computation.
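The lag choice can be checked with a short calculation. The sketch below assumes n = 44, i.e. one annual observation per year from 1970 to 2013; the function name `newey_west_lag` is hypothetical.

```python
import math

def newey_west_lag(n):
    """Integer part of 4 * (n / 100) ** (2 / 9)."""
    return math.floor(4 * (n / 100) ** (2 / 9))

g = newey_west_lag(44)   # 4 * 0.44 ** (2/9) is about 3.33, so g = 3
```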
We re-run the specification from part 2 using Newey-West standard errors instead of the classical ones. As the table shows, the only statistically significant coefficient is the one on CPI; the coefficient estimates for government deficit and time trend remain the same as before, since only the standard errors change.
Newey-West
ir Coef. Std. Err. t P>|t| [95% Conf. Interval]
Question 5 - Test for the presence of a unit root in each of the three
variables, first using the basic Dickey Fuller (DF) test for AR(1)
autocorrelation, and then the Augmented DF test with two lagged changes.
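$y_t = \alpha + \rho y_{t-1} + e_t$
The next step is to subtract $y_{t-1}$ from both sides of the equation above, which eliminates the possibility of a unit root on the left side. A unit root then exists only if $\rho = 1$.
$y_t - y_{t-1} = \alpha + y_{t-1}(\rho - 1) + e_t$
$\Delta y_t = \alpha + \theta y_{t-1} + e_t$, where $\theta = \rho - 1$
$H_0: \theta = 0 \qquad H_1: \theta < 0$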
Under our H0 the standard normal distribution for t statistic does not apply, so we will
be using the Dickey-Fuller distribution.
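To answer the question, we regressed $\Delta cpi_t$ on $cpi_{t-1}$, $\Delta ir_t$ on $ir_{t-1}$ and $\Delta dpcgdp_t$ on $dpcgdp_{t-1}$, which gave the following equations:
$\widehat{\Delta cpi}_t = 0.8361455 - 0.1497307\, cpi_{t-1}$
$\widehat{\Delta ir}_t = 1.107916 - 0.1396161\, ir_{t-1}$
$\widehat{\Delta dpcgdp}_t = 0.4816287 - 0.1870116\, dpcgdp_{t-1}$
[Table: $\hat{\theta}$, SE($\hat{\theta}$), t-statistic and $R^2$ for each regression]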
We compare the computed t-statistics with the Dickey-Fuller distribution. The critical value at the 5% significance level is -3.41. All of our computed t-statistics (-1.80, -1.62, -2.08) are greater than the critical value, so we fail to reject the null hypothesis of a unit root: the data do not provide strong evidence against H0.
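For illustration, the basic Dickey-Fuller regression and the decision rule can be sketched in Python. The function name `dickey_fuller` and the four-point series are hypothetical and only serve to show the mechanics; the reported t-statistics and the critical value are taken from the text.

```python
import math

def dickey_fuller(y):
    """Regress dy_t on a constant and y_{t-1}; return (alpha, theta, t_stat of theta)."""
    lag = y[:-1]
    dy = [cur - prev for prev, cur in zip(y[:-1], y[1:])]
    n = len(dy)
    lag_bar = sum(lag) / n
    dy_bar = sum(dy) / n
    sxx = sum((l - lag_bar) ** 2 for l in lag)
    theta = sum((l - lag_bar) * (d - dy_bar) for l, d in zip(lag, dy)) / sxx
    alpha = dy_bar - theta * lag_bar
    ssr = sum((d - alpha - theta * l) ** 2 for l, d in zip(lag, dy))
    se_theta = math.sqrt(ssr / (n - 2) / sxx)   # two estimated parameters
    return alpha, theta, theta / se_theta

# Tiny made-up series, for illustration only.
alpha, theta, t_stat = dickey_fuller([0.0, 1.0, 1.0, 2.0])

# Decision rule with the values reported in the text: reject H0 only if
# the t statistic lies below (is more negative than) the critical value.
CRITICAL_5PCT = -3.41
reported_t = [-1.80, -1.62, -2.08]
rejections = [t < CRITICAL_5PCT for t in reported_t]
```

None of the three reported statistics falls below the critical value, so the unit root null is not rejected for any variable.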
The augmented Dickey-Fuller test cleans up any serial correlation in $\Delta y_t$ by including additional lagged changes.
With two additional lagged changes the equation becomes:
$\Delta y_t = \alpha + \theta y_{t-1} + \gamma_1 \Delta y_{t-1} + \gamma_2 \Delta y_{t-2} + e_t$
As in the previous part of the question, we regressed $\Delta cpi_t$ on $cpi_{t-1}$, $\Delta cpi_{t-1}$ and $\Delta cpi_{t-2}$. The same regressions were run for ir and dpcgdp. The following estimates were computed:
[Table: columns $\gamma_1$, $\gamma_2$, SE($\theta$), t($\theta$) for each variable]
From the table above we can see that the null hypothesis at 5% critical value (-3.493)
can be rejected just for the interest rate variable (-3.641). For the other variables we
fail to reject the null hypothesis and a unit root may be present.
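To make the augmented regression concrete, the sketch below builds the rows of its data matrix (the dependent change, the lagged level and two lagged changes) and applies the decision rule to the interest rate statistic quoted above. The helper name `adf_rows` and the five-point series are assumptions for illustration.

```python
def adf_rows(y, lags=2):
    """Build rows (dy_t, y_{t-1}, dy_{t-1}, ..., dy_{t-lags}) for the ADF regression."""
    dy = [cur - prev for prev, cur in zip(y[:-1], y[1:])]   # dy[i] = y[i+1] - y[i]
    rows = []
    for i in range(lags, len(dy)):
        lagged_changes = [dy[i - k] for k in range(1, lags + 1)]
        rows.append((dy[i], y[i], *lagged_changes))
    return rows

# Made-up series: levels 1, 3, 6, 10, 15 have changes 2, 3, 4, 5.
rows = adf_rows([1, 3, 6, 10, 15])

# Decision for the interest rate, using the figures quoted in the text.
reject_unit_root_ir = -3.641 < -3.493
```

Each row lines up one observation of the regression; two observations are lost to the lagged changes in addition to the one lost to differencing.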
Using time series with the strong persistence displayed by a unit root process in a regression equation can lead to very misleading results if the CLM assumptions are violated. Since we have unit roots, we need to transform our model so that the unit root processes become weakly dependent. Unit root processes are said to be integrated of order one, or I(1). This means that the first difference of the process is weakly dependent and can be used for regression analysis. In question 5 the basic Dickey-Fuller test could not reject a unit root in any of our variables (the augmented test rejected it only for the interest rate), so we run our regression in first differences.
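A short sketch may help show why differencing works for an I(1) series: if the level is a running sum of shocks, the first difference recovers the shocks themselves. The helper name `first_difference` and the shock values are made up for illustration.

```python
from itertools import accumulate

def first_difference(series):
    """Return [x_t - x_{t-1}] for t = 2..n."""
    return [cur - prev for prev, cur in zip(series[:-1], series[1:])]

# Made-up shocks; their running sum behaves like a (driftless) random walk.
shocks = [0.5, -1.0, 0.25, 1.0]
levels = list(accumulate(shocks))        # [0.5, -0.5, -0.25, 0.75]
recovered = first_difference(levels)     # the shocks after the first one
```

Applied to ir, cpi and dpcgdp, the same operation produces the differenced variables used in the regression that follows.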
The term spurious correlation describes a situation where two variables are related only through their correlation with a third variable. A related problem, spurious regression, arises with I(1) data: a simple regression involving two independent I(1) series, as could be our case given the unit roots, will often produce a significant t statistic. Fortunately, the notion of cointegration, due to Engle and Granger (1987), makes regressions involving I(1) variables potentially meaningful. In the next question we run our regression in first differences and then test for spurious correlation.
As said above, we decided to run our modified regression in first differences, and as in Question 2 we did so both with and without the time trend. Our new models are:
$\Delta ir_t = \beta_0 + \beta_1 \Delta CPI_t + \beta_2 \Delta dpcgdp_t$
$\Delta ir_t = \beta_0 + \beta_1 \Delta CPI_t + \beta_2 \Delta dpcgdp_t + \beta_3 t$
The sign of the coefficient on the change in CPI is the same as in question 2, as expected, but the sign of the coefficient on the change in the budget deficit has changed and now follows the theory explained in question 2. It is worth mentioning that the coefficient on the change in the budget deficit is statistically insignificant. Also, since the coefficient on the time trend is insignificant and including it does not change the other coefficients by much, we prefer the model without the t variable.
As we are running the regression in first differences we should not have a spurious regression problem, but it is worth checking anyway. Our t statistic for the spurious regression test is -6.878, which means that the variables ir and cpi are co-integrated, so we can rely on our t values not being overstated.
Question 8: Estimate your preferred specification using all but the most
recent observation of data. Forecast the interest rate for the most recent
observation, and compare it with the actual rate of interest. How closely did
the model match the data?
We start with the regression from Question 7 with the time trend, but leave out the year 2013. This results in the following regression.
We can use this model to forecast IR at time t only if we assume that the other variables for time t are known beforehand, at time t-1. This assumption is reasonable only if the government sets targets for CPI and the budget deficit and then commits to them, so that the targets equal the values later observed, which is not realistic. We nevertheless use this approach for simplicity. Plugging the 2013 values for CPI and the budget deficit into the equation above gives the following result.
$\widehat{\Delta ir}_{2013} = -0.91756381$
To get the interest rate for 2013 we add the interest rate for 2012 to this value. This gives $\widehat{ir}_{2013} = -0.91756381 + 3.728333 = 2.8107692 \approx 2.81$, against the observed value of 2.78, which is quite a close fit considering that the adjusted R-squared for this regression is only 0.2779. But we need to remember that we assumed the values of our independent variables to be the ones actually observed at time t, which is the main reason why our forecast is so accurate, as the OLS model used minimizes errors by construction.
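The step from the forecast change back to the forecast level is simple arithmetic, sketched below with the numbers quoted in the text. The variable names are chosen for illustration.

```python
ir_2012 = 3.728333               # observed 2012 interest rate, from the text
predicted_change = -0.91756381   # forecast change in ir for 2013, from the text
actual_2013 = 2.78               # observed 2013 interest rate, from the text

# A model in first differences forecasts the change, so the level forecast
# is last year's level plus the forecast change.
forecast_2013 = ir_2012 + predicted_change
forecast_error = forecast_2013 - actual_2013
```

The forecast is about 2.81, which overshoots the observed rate by roughly 0.03 percentage points.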
If we assume that we do not know the independent variables beforehand, we need to specify a different forecasting model that uses only variables known at time t-1.
Running this regression on all observations excluding 2013, we estimate the following coefficients.
                 Coef.    Std. Err.       t    P>|t|    [95% Conf. Interval]
d1_cpi
   L1.       -.1718163    .1494231    -1.15    0.257    -.4743076    .130675
d1_dpcgdp
   L1.        .7278565    .2367769     3.07    0.004     .2485268   1.207186
This forecast is considerably worse than the previous one: the difference between the predicted and observed values is 0.4856742, or around 17% of the observed value. It is, however, a more realistic prediction, as it uses only information available at time t-1.
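The relative size of the two forecast errors can be compared with a short calculation, again using only figures quoted in the text. The helper name `relative_error` is hypothetical.

```python
def relative_error(abs_error, actual):
    """Absolute forecast error as a fraction of the observed value."""
    return abs(abs_error) / abs(actual)

actual_2013 = 2.78
second_model = relative_error(0.4856742, actual_2013)                 # lagged regressors only
first_model = relative_error(2.8107692 - actual_2013, actual_2013)    # regressors assumed known
```

The lagged-regressor forecast misses by roughly 17% of the observed rate, compared with about 1% for the forecast that assumes the 2013 regressors are known.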
Appendix do file
tsset year
*** Question 1
*** Question 2
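gen t = _n
*** Question 3
cap drop u1
* u1 must hold the residuals of the no-trend regression described in Question 3
reg ir cpi dpcgdp
predict u1, residuals
reg u1 l1.u1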
*** Question 4
*** Question 5
***Question 7
dfuller u3
***Question 8
***Forecast 1
scalar b_0=_b[_cons]
scalar b_1=_b[d1_cpi]
scalar b_2=_b[d1_dpcgdp]
***Forecast
scalar b_00=_b[_cons]
scalar b_11=_b[l.d1_cpi]
scalar b_22=_b[l.d1_dpcgdp]