
ECON 4613 Assignment 2

Joshua Thomas
1. a)

All coefficients are significant at the 5% level.


An increase in horsepower of 1 results in the predicted value of MPG increasing by 0.39, for given values of all other variables. We would not expect the sign of this coefficient to be positive, since MPG likely decreases with more horsepower.
An increase of 1 mile per hour in the top speed of a car decreases the predicted value of MPG by 1.27, for given values of all other variables. This is the sign we would expect.
An increase of 100 lbs in the weight of a car decreases the predicted value of MPG by 1.9, for given values of all other variables. This is the sign we would expect.
We see a very high R-squared value of 0.883, which tells us that 88.3% of the variation in MPG can be explained by the model.
The F-statistic is significant at the 5% level, telling us that the R-squared is statistically different from 0.
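
As a minimal sketch, the regression in a) could be reproduced in Python with statsmodels; the file name cars.csv and the column names MPG, HP, SP, and WT are assumptions rather than names taken from the assignment output.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical file and column names for the car data set.
df = pd.read_csv("cars.csv")

X = sm.add_constant(df[["HP", "SP", "WT"]])   # horsepower, top speed, weight (in 100 lbs)
ols_fit = sm.OLS(df["MPG"], X).fit()
print(ols_fit.summary())                      # coefficients, t-stats, R-squared, F-statistic
```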

b)

[Residual plots: the regression residuals (UT) plotted against WT, SP, and HP.]

In all of the graphs, the variance of the residuals decreases as the explanatory variable increases. This leads us to believe that there is likely heteroscedasticity in our model.
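
A sketch of how these residual-versus-regressor plots could be produced, continuing the statsmodels fit above (column names are still assumptions):

```python
import matplotlib.pyplot as plt

resid = ols_fit.resid
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, var in zip(axes, ["WT", "SP", "HP"]):
    ax.scatter(df[var], resid)                    # residuals against each regressor
    ax.axhline(0, color="grey", linewidth=0.5)
    ax.set_xlabel(var)
    ax.set_ylabel("residual")
plt.tight_layout()
plt.show()
```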
c)
With a very low p-value, we reject the null hypothesis of homoscedasticity and conclude that there is heteroscedasticity in our model. Because all of the variables are significant when regressed on the residuals, it appears that all of the variables are contributing to the problem of heteroscedasticity.
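
The White test behind this conclusion can be run with statsmodels; this is a sketch against the fit above rather than the exact output used in the assignment.

```python
from statsmodels.stats.diagnostic import het_white

lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(ols_fit.resid, X)
print(lm_pvalue)   # a very small p-value rejects the null of homoscedasticity
```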

d)

With a square-root transformation applied to all of the variables, the new model still suffers from heteroscedasticity, since the p-value from the White test is very low.
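
The logic behind this kind of weighted transformation, assuming the error variance is proportional to some variable $Z_i$ (the exact weighting variable used in the assignment is not reproduced here), is

$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \beta_4 X_{4i} + u_i, \qquad \operatorname{Var}(u_i) = \sigma^2 Z_i .$$

Dividing every term by $\sqrt{Z_i}$ gives

$$\frac{Y_i}{\sqrt{Z_i}} = \beta_1 \frac{1}{\sqrt{Z_i}} + \beta_2 \frac{X_{2i}}{\sqrt{Z_i}} + \beta_3 \frac{X_{3i}}{\sqrt{Z_i}} + \beta_4 \frac{X_{4i}}{\sqrt{Z_i}} + \frac{u_i}{\sqrt{Z_i}}, \qquad \operatorname{Var}\!\left(\frac{u_i}{\sqrt{Z_i}}\right) = \sigma^2 ,$$

so the transformed error is homoscedastic only if the assumed variance structure is correct; the low White-test p-value here suggests it is not.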

e)
With a transformation of all the variables that assumes the error variance is proportional to the squared variables, the new model still suffers from heteroscedasticity, since the p-value from the White test is very low.

f)
When we did not correct for heteroscedasticity, we underestimated the standard errors for all three variables.
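
One standard correction keeps the OLS coefficients but replaces the usual standard errors with White heteroscedasticity-robust ones; a sketch continuing the fit above:

```python
robust_fit = sm.OLS(df["MPG"], X).fit(cov_type="HC1")   # White-robust standard errors
print(ols_fit.bse)      # uncorrected standard errors
print(robust_fit.bse)   # robust standard errors (larger in this assignment's results)
```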

g)
We could try using a log-linear model to address the heteroscedasticity; however, as we are now only close to not rejecting homoscedasticity at the 1% level, there still appears to be heteroscedasticity. This specification also results in horsepower and speed no longer being significantly different from 0.
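
A sketch of the log specification tried here, assuming "log-linear" means both MPG and the regressors enter in natural logs (this reading is an assumption), continuing the earlier sketches:

```python
import numpy as np
from statsmodels.stats.diagnostic import het_white

X_log = sm.add_constant(np.log(df[["HP", "SP", "WT"]]))
log_fit = sm.OLS(np.log(df["MPG"]), X_log).fit()
print(het_white(log_fit.resid, X_log)[1])   # White test p-value for the logged model
```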

2) a)
I would expect B1 to be negative, since when the price of cars increases, fewer people are likely to buy.
I would expect B2 to be negative as well: if prices in general are high, consumers are likely to save more and thus not make large purchases such as a car.
I would expect B3 to be positive because as consumers' income rises, they will purchase more.
I would expect B4 to be negative, because as the interest rate rises, it is more costly to take out a loan to purchase a car.
I would expect B5 to be positive because when there is a larger employed population, more cars will be sold.
b)

A 1% increase in the CPI of cars will result in a 1.94% increase in car sales on average for given
values of all other variables. The sign of this coefficient is not consistent with our expectation.
A 1% increase in the CPI will result in a 4.68% decrease in car sales on average for given values
of all other variables. The sign of this coefficient is consistent with our expectation.
A 1% increase in PDI will result in a 2.72% increase in car sales on average for given values of
all other variables. The sign of this coefficient is consistent with our expectation.
A 1% increase in the interest rate will result in a 0.026% decrease in car sales on average for
given values of all other variables. The sign of this coefficient is consistent with our expectation.
A 1% increase in the labour force will result in a 0.58% decrease in car sales on average for
given values of all other variables. The sign of this coefficient is not consistent with our
expectation.
Overall CPI is the only variable that is statistically significant at the 5% level. A high R-squared
value of .855 shows that 85.5% of the variation in car sales can be explained by the model and a
very low p-value of the overall F test tells us that the R-squared number is statistically different
from 0. This seems a little strange considering there is only 1 variable that is statistically
significant and two of the signs of the coefficients are not what we would expect.
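
These percentage interpretations follow from the log-log functional form; assuming the model in b) is estimated with all variables in natural logs,

$$\ln(\text{sales}_t) = \beta_0 + \beta_1 \ln(\text{cpicars}_t) + \beta_2 \ln(\text{cpiall}_t) + \beta_3 \ln(\text{pdi}_t) + \beta_4 \ln(\text{intrate}_t) + \beta_5 \ln(\text{lforce}_t) + u_t ,$$

each slope $\beta_k = \partial \ln(\text{sales}) / \partial \ln(X_k)$ is the elasticity of car sales with respect to that regressor: a 1% increase in $X_k$ changes sales by about $\beta_k$ percent, holding the other regressors fixed.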

c) The results from above suggest that there may be multicollinearity in our model. Most of the VIFs calculated are well above 10, indicating that multicollinearity is present in the sample. The correlation coefficients calculated in the question below also show very high correlation between the variables.
Regressand   r^2        VIF
cpicars      0.996305   270.636
cpiall       0.999504   2016.129
pdi          0.999523   2096.436
intrate      0.873407   7.899331
lforce       0.996093   255.9509

This multicollinearity means that the calculated standard errors in the model are larger than they would be if there were no multicollinearity. As a result, we might report a coefficient as not statistically significant when in fact it is.
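
Each VIF in the table above is VIF_j = 1/(1 - R_j^2), where R_j^2 comes from regressing regressor j on the other regressors (for example, 1/(1 - 0.996305) ≈ 270.6). A sketch of computing the same values with statsmodels; the file name is an assumption, while the column names follow the tables in this question:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

cars_demand = pd.read_csv("carsales.csv")   # hypothetical file name
X2 = sm.add_constant(cars_demand[["cpicars", "cpiall", "pdi", "intrate", "lforce"]])

for i, name in enumerate(X2.columns):
    if name != "const":
        print(name, variance_inflation_factor(X2.values, i))
```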
d) Correlation coefficient table:
          cpicars   cpiall   pdi     intrate   lforce
cpicars   1         0.996    0.993   0.585     0.974
cpiall              1        0.996   0.614     0.974
pdi                          1       0.585     0.987
intrate                              1         0.6
lforce                                         1
We can see that cpicars, cpiall, pdi, and lforce all have very strong correlations with each other. The CPI of cars and the overall CPI would be expected to be correlated, as they are similar measures. It also makes sense that income and prices would be correlated, because as income rises, demand will rise, pushing up prices. The employed labour force is likely so correlated with prices and income because all of these variables vary similarly with the business cycle.
e) Since CPIall and income are so highly correlated with the other variables, it would make sense to drop them from the model. These variables are perhaps the best to drop because keeping only one CPI is likely better, and we would assume that the price of cars specifically affects demand. We would also want to keep the employed labour force in the model because that is also an extremely relevant factor in the demand for cars.

f)
Dropping the two variables mentioned above seems to significantly help with the issue of multicollinearity. All of the variables are now statistically significant, and the signs are also what we would expect. The VIFs are still quite high because of the correlation between CPIcars and the labour force, but they are lower than in the original model.

g) Yes, rethinking the model helps somewhat. For example, in the model where the two variables are dropped, CPIcars and the labour force are still very correlated and we see some fairly high VIFs. However, we probably do not care too much about this because the variables are now in line with economic theory and statistically significant, and, as mentioned before, the collinearity between the variables is likely due to both variables' correlation with the business cycle.

3. a)

When the price of gas goes up by 1 cent per gallon, 8.67 thousand fewer barrels are sold per day on average, for given values of all other variables. This coefficient is statistically significant.
When personal income goes up by a billion dollars, a thousand more barrels of gas are sold per day on average, for given values of all other variables. This coefficient is statistically significant.
When a million more cars are sold in a year, 32.5 thousand more barrels of gas are sold per day on average, for given values of all other variables. This coefficient is not statistically significant at the 5% level.
A high R-squared of 0.812 tells us that 81.2% of the variation in gas sold can be explained by the model, and the p-value of 0 for the F-statistic tells us that the R-squared is statistically different from 0.
b)

[Residual plots: the regression residuals (UT) plotted over the 1978-2002 sample period, and against their lagged values UT(-1).]

Although it is tough to detect serial correlation from the time-series graph, the plot of the residuals against their lagged values shows a fairly strong positive correlation in the errors.

c)
With a very low p-value, we reject the null hypothesis that there is no serial correlation in the model and conclude that there is serial correlation. Since the first two lags are statistically significant, the order of the autocorrelation in the errors is 2.
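
A sketch of the Breusch-Godfrey test in Python; the file name gasoline.csv and the column names barrels, price, income, and carsales are assumptions:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

gas = pd.read_csv("gasoline.csv")             # hypothetical file name
Xg = sm.add_constant(gas[["price", "income", "carsales"]])
gas_fit = sm.OLS(gas["barrels"], Xg).fit()

lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(gas_fit, nlags=2)
print(lm_pvalue)   # a small p-value rejects the null of no serial correlation
```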

d) With a Durbin-Watson stat of 1.04 being lower than the 1.643 critical value, we conclude that
there is positive serial correlation in the model. This is consistent with our findings in c.
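The Durbin-Watson statistic can be computed directly from the residuals of the fit in the sketch above:

```python
from statsmodels.stats.stattools import durbin_watson

print(durbin_watson(gas_fit.resid))   # values well below 2 suggest positive serial correlation
```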
e)
The estimated coefficients on the first and second lags of the residuals are 0.4068 and 0.1388.
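
One way to obtain two such coefficients is to regress the OLS residuals on their first two lags with no intercept; a sketch continuing from the fit above (the assignment's own estimation method may differ):

```python
e = gas_fit.resid
lags = pd.DataFrame({"e_lag1": e.shift(1), "e_lag2": e.shift(2)}).dropna()
ar2_fit = sm.OLS(e.loc[lags.index], lags).fit()   # e_t = rho1*e_{t-1} + rho2*e_{t-2}
print(ar2_fit.params)                             # estimates of rho1 and rho2
```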

f)

[Residual plot: the residuals from the re-estimated model (EI) plotted against their lagged values EI(-1).]

With a 0.145 p-value from the Breusch-Godfrey test, we do not reject the null hypothesis that
there is no serial correlation in our model. Graphically, we can see on a plot of the residuals
versus their lagged values, that there does not appear to be any correlation.
g) For every variable, the standard errors were underestimated in the original specification, and by a fair margin.
h) Yes, the assumptions required for the GLS estimators to be consistent and unbiased are likely to hold. For example, we would expect there to be strong relationships between the price of gas and income, and the amount of gas sold.


i)

Comparing the estimates of the standard errors in the Newey-West corrected model with the original model, we see that the standard errors in the original specification were underestimated for every variable. This is because, although the original estimators are still linear and unbiased when serial correlation is present, not correcting for the autocorrelation means the usual variance formulas are no longer correct. Thus, the standard errors of the OLS estimators will be biased.
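
A sketch of obtaining Newey-West (HAC) standard errors for the same regression with statsmodels, continuing the fit above; the lag length of 2 is an assumption:

```python
nw_fit = sm.OLS(gas["barrels"], Xg).fit(cov_type="HAC", cov_kwds={"maxlags": 2})
print(gas_fit.bse)   # original OLS standard errors
print(nw_fit.bse)    # Newey-West standard errors, larger here for every variable
```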
