
Suggested Solutions: Problem Set 5

May 18, 2015


1. The OLS estimator in the general case with K regressors is
   $$\hat\beta = (X^T X)^{-1} X^T Y$$
   We want to show that when K = 1, we have
   $$\hat\beta_1 = \frac{S_{XY}}{S_X^2}$$
   Note that when K = 1, $X$ will be an $N \times 2$ matrix and $X^T X$ will be of dimension $2 \times 2$. First note that
   $$X^T X = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_{11} & x_{21} & \cdots & x_{N1} \end{pmatrix} \begin{pmatrix} 1 & x_{11} \\ 1 & x_{21} \\ \vdots & \vdots \\ 1 & x_{N1} \end{pmatrix} = \begin{pmatrix} N & \sum_{i=1}^N x_{i1} \\ \sum_{i=1}^N x_{i1} & \sum_{i=1}^N x_{i1}^2 \end{pmatrix}$$
Following the hint, we use the simple formula to find the inverse of a $2 \times 2$ matrix:
$$(X^T X)^{-1} = \frac{1}{N \sum_{i=1}^N x_{i1}^2 - \left( \sum_{i=1}^N x_{i1} \right)^2} \begin{pmatrix} \sum_{i=1}^N x_{i1}^2 & -\sum_{i=1}^N x_{i1} \\ -\sum_{i=1}^N x_{i1} & N \end{pmatrix}$$
As $Y$ is of dimension $N \times 1$, $X^T Y$ will be of dimension $2 \times 1$, and we have
$$X^T Y = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_{11} & x_{21} & \cdots & x_{N1} \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^N y_i \\ \sum_{i=1}^N x_{i1} y_i \end{pmatrix}$$
Putting these results together gives us
$$\hat\beta = (X^T X)^{-1} X^T Y = \frac{1}{N \sum_{i=1}^N x_{i1}^2 - \left( \sum_{i=1}^N x_{i1} \right)^2} \begin{pmatrix} \sum_{i=1}^N x_{i1}^2 & -\sum_{i=1}^N x_{i1} \\ -\sum_{i=1}^N x_{i1} & N \end{pmatrix} \begin{pmatrix} \sum_{i=1}^N y_i \\ \sum_{i=1}^N x_{i1} y_i \end{pmatrix}$$
$$= \frac{1}{N \sum_{i=1}^N x_{i1}^2 - \left( \sum_{i=1}^N x_{i1} \right)^2} \begin{pmatrix} \left( \sum_{i=1}^N x_{i1}^2 \right) \left( \sum_{i=1}^N y_i \right) - \left( \sum_{i=1}^N x_{i1} \right) \left( \sum_{i=1}^N x_{i1} y_i \right) \\ -\left( \sum_{i=1}^N x_{i1} \right) \left( \sum_{i=1}^N y_i \right) + N \sum_{i=1}^N x_{i1} y_i \end{pmatrix}$$
From the expression for $\hat\beta$, we know that the second element of this vector corresponds to $\hat\beta_1$. Evidently, we have
$$\hat\beta_1 = \frac{N \sum_{i=1}^N x_{i1} y_i - \left( \sum_{i=1}^N x_{i1} \right) \left( \sum_{i=1}^N y_i \right)}{N \sum_{i=1}^N x_{i1}^2 - \left( \sum_{i=1}^N x_{i1} \right)^2} = \frac{S_{XY}}{S_X^2}$$
where the last equality follows from dividing both the numerator and the denominator by $N^2$.
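A quick numerical sanity check of this equivalence is easy to run; the sketch below uses Python with arbitrary simulated data (the sample size and data-generating values are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500
x1 = rng.normal(size=N)                       # single regressor
y = 2.0 + 3.0 * x1 + rng.normal(size=N)       # arbitrary data-generating process

X = np.column_stack([np.ones(N), x1])         # N x 2 design matrix
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # (X'X)^{-1} X'Y

s_xy = np.mean((x1 - x1.mean()) * (y - y.mean()))  # S_XY
s_x2 = np.mean((x1 - x1.mean()) ** 2)              # S_X^2

# The slope from the matrix formula matches S_XY / S_X^2
print(beta_hat[1], s_xy / s_x2)
```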
2. In the general homoskedastic multivariate regression model with no autocorrelation, the variance-covariance matrix of the OLS estimator is
   $$\mathrm{Var}(\hat\beta) = \sigma^2 (X^T X)^{-1}$$
From the previous problem, we know that when K = 1,
$$\sigma^2 (X^T X)^{-1} = \frac{\sigma^2}{N \sum_{i=1}^N x_{i1}^2 - \left( \sum_{i=1}^N x_{i1} \right)^2} \begin{pmatrix} \sum_{i=1}^N x_{i1}^2 & -\sum_{i=1}^N x_{i1} \\ -\sum_{i=1}^N x_{i1} & N \end{pmatrix} \tag{1}$$
(a) The first element of the first row of (1) corresponds to $\mathrm{Var}(\hat\beta_0)$. It is given by
    $$\mathrm{Var}(\hat\beta_0) = [\sigma^2 (X^T X)^{-1}]_{11} = \sigma^2 \frac{\sum_{i=1}^N x_{i1}^2}{N \sum_{i=1}^N x_{i1}^2 - \left( \sum_{i=1}^N x_{i1} \right)^2}$$
Using some straightforward algebra:
$$\sigma^2 \frac{\sum_{i=1}^N x_{i1}^2}{N \sum_{i=1}^N x_{i1}^2 - \left( \sum_{i=1}^N x_{i1} \right)^2} = \sigma^2 \frac{\sum_{i=1}^N (x_{i1} - \bar X)^2 + N \bar X^2}{N \sum_{i=1}^N (x_{i1} - \bar X)^2} = \sigma^2 \left[ \frac{1}{N} + \frac{\bar X^2}{\sum_{i=1}^N (x_{i1} - \bar X)^2} \right]$$
where we use $\sum_{i=1}^N x_{i1}^2 = \sum_{i=1}^N (x_{i1} - \bar X)^2 + N \bar X^2$ and $N \sum_{i=1}^N x_{i1}^2 - \left( \sum_{i=1}^N x_{i1} \right)^2 = N \sum_{i=1}^N (x_{i1} - \bar X)^2$.
(b) The second element of the second row of (1) corresponds to $\mathrm{Var}(\hat\beta_1)$. It is given by
    $$\mathrm{Var}(\hat\beta_1) = [\sigma^2 (X^T X)^{-1}]_{22} = \sigma^2 \frac{N}{N \sum_{i=1}^N x_{i1}^2 - \left( \sum_{i=1}^N x_{i1} \right)^2} = \frac{\sigma^2}{\sum_{i=1}^N (x_{i1} - \bar X)^2}$$
(c) The covariance between $\hat\beta_0$ and $\hat\beta_1$ is given by either of the off-diagonal entries of (1). More precisely,
    $$\mathrm{Cov}(\hat\beta_0, \hat\beta_1) = [\sigma^2 (X^T X)^{-1}]_{12} = [\sigma^2 (X^T X)^{-1}]_{21} = \frac{-\sigma^2 \sum_{i=1}^N x_{i1}}{N \sum_{i=1}^N x_{i1}^2 - \left( \sum_{i=1}^N x_{i1} \right)^2} = \frac{-\sigma^2 \bar X}{\sum_{i=1}^N (x_{i1} - \bar X)^2}$$
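All three formulas can be checked numerically against the matrix expression (1); a minimal sketch, with arbitrary values for $\sigma^2$ and the regressor:

```python
import numpy as np

rng = np.random.default_rng(1)
N, sigma2 = 200, 2.5                       # arbitrary sample size and error variance
x1 = rng.uniform(0.0, 10.0, size=N)
X = np.column_stack([np.ones(N), x1])

V = sigma2 * np.linalg.inv(X.T @ X)        # full variance-covariance matrix
ssd = np.sum((x1 - x1.mean()) ** 2)        # sum of squared deviations from the mean

var_b0 = sigma2 * (1.0 / N + x1.mean() ** 2 / ssd)  # formula from (a)
var_b1 = sigma2 / ssd                               # formula from (b)
cov_b0_b1 = -sigma2 * x1.mean() / ssd               # formula from (c)

print(np.allclose([V[0, 0], V[1, 1], V[0, 1]],
                  [var_b0, var_b1, cov_b0_b1]))     # True
```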
3. (a) We want to show that
   $$\mathrm{Cov}[X_1, Y] = \beta_1 \mathrm{Var}[X_1] + \beta_2 \mathrm{Cov}[X_1, X_2]$$
   First, substitute in for $Y$ on the LHS:
   $$\mathrm{Cov}[X_1, Y] = \mathrm{Cov}[X_1, \beta_0 + \beta_1 X_1 + \beta_2 X_2 + U]$$
   Use the definition of covariance and the properties of the expectations operator:
   $$\begin{aligned}
   \mathrm{Cov}[X_1, \beta_0 + \beta_1 X_1 + \beta_2 X_2 + U] &= E[(X_1 - E[X_1])(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + U - E[\beta_0 + \beta_1 X_1 + \beta_2 X_2 + U])] \\
   &= E[(X_1 - E[X_1])(\beta_1 (X_1 - E[X_1]) + \beta_2 (X_2 - E[X_2]) + U - E[U])] \\
   &= \beta_1 E[(X_1 - E[X_1])^2] + \beta_2 E[(X_1 - E[X_1])(X_2 - E[X_2])] + E[(X_1 - E[X_1])(U - E[U])] \\
   &= \beta_1 \mathrm{Var}[X_1] + \beta_2 \mathrm{Cov}[X_1, X_2] + \mathrm{Cov}[X_1, U] \\
   &= \beta_1 \mathrm{Var}[X_1] + \beta_2 \mathrm{Cov}[X_1, X_2]
   \end{aligned}$$
   where the last step uses the assumption $\mathrm{Cov}[X_1, U] = 0$.
(b) We know (see for example problem set 3) that we can express the OLS estimator in the following way:
    $$\hat\beta_1 = \frac{\sum_{i=1}^n (X_{1i} - \bar X_1)(Y_i - \bar Y)}{\sum_{i=1}^n (X_{1i} - \bar X_1)^2}$$
    Substitute in for $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + U_i$ and $\bar Y = (1/n) \sum_{i=1}^n (\beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + U_i)$, and simplify:
    $$\hat\beta_1 = \beta_1 + \beta_2 \frac{\sum_{i=1}^n (X_{1i} - \bar X_1)(X_{2i} - \bar X_2)}{\sum_{i=1}^n (X_{1i} - \bar X_1)^2}$$
    (the analogous term involving the $U_i$ converges in probability to $\mathrm{Cov}[X_1, U]/\mathrm{Var}[X_1] = 0$ and is omitted here).
By the WLLN and Slutsky's theorem (see problem set 3 for the details):
$$\frac{\sum_{i=1}^n (X_{1i} - \bar X_1)(X_{2i} - \bar X_2)}{\sum_{i=1}^n (X_{1i} - \bar X_1)^2} \overset{p}{\to} \frac{\mathrm{Cov}[X_1, X_2]}{\mathrm{Var}[X_1]}$$
so that $\hat\beta_1 \overset{p}{\to} \beta_1 + \beta_2 \, \mathrm{Cov}[X_1, X_2]/\mathrm{Var}[X_1]$. This shows the result we are interested in.
(c) If $\beta_2 = 0$ or $\mathrm{Cov}[X_1, X_2] = 0$, then $\hat\beta_1$ will be consistent. If neither of these two conditions holds, then we have omitted variable bias, as the simulation sketched below illustrates.
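A minimal simulation of this result (all parameter values below are arbitrary): with $\beta_2 \neq 0$ and $\mathrm{Cov}[X_1, X_2] \neq 0$, the short regression converges to $\beta_1 + \beta_2 \, \mathrm{Cov}[X_1, X_2]/\mathrm{Var}[X_1]$ rather than to $\beta_1$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000                                 # large n so the plim is visible
beta0, beta1, beta2 = 1.0, 2.0, 0.5         # arbitrary true coefficients

# Draw (X1, X2) with Var[X1] = 1 and Cov[X1, X2] = 0.6
cov = np.array([[1.0, 0.6],
                [0.6, 1.0]])
X1, X2 = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
Y = beta0 + beta1 * X1 + beta2 * X2 + rng.normal(size=n)

# "Short" regression of Y on X1 alone, omitting X2
b1_short = (np.sum((X1 - X1.mean()) * (Y - Y.mean()))
            / np.sum((X1 - X1.mean()) ** 2))

print(b1_short)   # close to beta1 + beta2 * 0.6 / 1.0 = 2.3, not to beta1 = 2.0
```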

(d) Yes, the fitted residuals satisfy these two conditions by construction. In fact, the FOCs of OLS imply that we set each such expression equal to zero and solve for $\hat\beta$. (Derive the OLS estimator in the case of $\beta = [\beta_0 \ \beta_1]^T$ to see this.)
4. (a) (...)
(b) The average wage paid to a foot soldier in a small gang not at war is found by setting $war = large = 0$. Hence, $\widehat{wage} = \hat\beta_0 = 1.83$.
(c) Introduce an interaction term, $war \times large$, into the model:
    $$wage = \beta_0 + \beta_1 \, war + \beta_2 \, large + \beta_3 \, (war \times large) + U$$
    The estimated difference between large and small gangs in terms of the effect of a war will be given by $\hat\beta_3$; a sketch of this specification follows.
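A hypothetical sketch of how this specification could be fit: only $\hat\beta_0 = 1.83$ comes from the problem, the function below simply mirrors the model, and the actual coefficient values would of course depend on the gang data.

```python
import numpy as np

def fit_interaction_model(wage, war, large):
    """OLS of wage on war, large, and their interaction (war, large are 0/1 dummies).

    Returns [b0, b1, b2, b3]; b3 estimates how much more (or less)
    a war changes wages in large gangs than in small gangs.
    """
    X = np.column_stack([np.ones_like(wage), war, large, war * large])
    return np.linalg.lstsq(X, wage, rcond=None)[0]
```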
5. (a) There may be good reason in this case to suspect that a causal interpretation is appropriate for the population regression model. For example, reading the WSJ may build valuable human capital if readers learn meaningful lessons about the business world which make them more valuable workers later on. Another possibility is that some employers (such as the US State Department or investment banking firms) require their employees to maintain up-to-date knowledge about current trends in world politics and economics; in either of these two cases it may very well be, at least theoretically, that WSJ readership has a direct effect of making one a more marketable employee prospect. In this case, other unobserved determinants of employment probability might be things like fundamental worker productivity differences which are detectable (at least partially) by job interviewers, or college major, which in principle is observable, but which WSJ left out of their study.
(b) In this case there is good reason to suspect that the parameters of the causal model are NOT identified due to probable correlations between X and U: students who are more diligent may read the WSJ (or some other newspaper) more frequently, and if this diligence can be signaled to potential employers in an interview, then we may get an omitted variable bias. Similarly, if demand for economics and/or business degrees is higher than demand for humanities and other social sciences, and if economics/business majors are more likely to choose the WSJ (over, say, the NY Times), then we could also get an omitted variable bias. Therefore, one might suspect that OLS estimates of $\beta_1$ using only data on Y and X are both biased and inconsistent.
(c) Given the endogeneity discussed above, we should only interpret the estimated sample regression function $\hat\beta_0 + \hat\beta_1 X$ as (an approximation of) the conditional expectation function (CEF), not a causal relationship. In other words, even though we may hypothesize that the underlying population regression model is causal in nature, we can only in this case pin down an estimate of the CEF, because the parameters of the causal model are not identified.

(d) The empirical evidence that students who read the WSJ are more likely to be employed post-graduation says nothing conclusive about the causal effect of buying the WSJ on finding a job. It simply tells us that students who read the WSJ also tend to be the types of students who are employable; but this may be driven mostly or entirely by other factors like major or productivity differences. For a given student, buying the WSJ may change her underlying employability very little, if at all.
6. (a) Let $\%\Delta$ denote "percent change". Since the dependent variable is in log form, we have:
   $$\%\Delta wage = \beta_1 + \beta_2 \, pareduc$$
   We see that the effect depends neither on own education nor on experience. However, it does depend on parents' education.
(b) For experience, we have
    $$\%\Delta wage = \beta_3 + 2 \beta_4 \, experience$$
    In this case, the effect depends on the level of experience. However, it does not depend on own or parents' education. A small sketch of both marginal effects follows.
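A small illustration of these two marginal effects; the underlying model $\log(wage) = \beta_0 + \beta_1 educ + \beta_2 (educ \times pareduc) + \beta_3 exper + \beta_4 exper^2 + U$ is inferred from the derivatives above, and the coefficient values below are made up purely for demonstration:

```python
def pct_effect_educ(b1, b2, pareduc):
    # Approximate % change in wage from one more year of own education
    return 100 * (b1 + b2 * pareduc)

def pct_effect_exper(b3, b4, exper):
    # Approximate % change in wage from one more year of experience
    return 100 * (b3 + 2 * b4 * exper)

# Made-up coefficients, purely for illustration:
print(pct_effect_educ(0.05, 0.001, pareduc=24))    # about 7.4% per year
print(pct_effect_exper(0.02, -0.0004, exper=10))   # about 1.2% per year
```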

Stock and Watson Exercises


6.1 $\bar R^2$ = 0.175, 0.189, 0.193 in columns 1, 2, and 3, respectively.


6.2 (a) Yes, controlling for gender, workers with college degrees earn $5.46 more per hour, on
average, than workers with only high school degrees.
(b) Yes, controlling for education level, men earn $2.64 more per hour, on average, than
women.
6.4 (a) Workers in the Northeast earn $0.69 more per hour than workers in the West, on average,
controlling for other variables in the regression. Workers in the Midwest earn $0.60 more
per hour than workers in the West. Workers in the South earn $0.27 less than workers
in the West.
(b) The regressor West is omitted to avoid perfect multicollinearity. If West is included, then the intercept can be written as a perfect linear function of the four regional regressors.
(c) The expected difference in earnings between Juanita (South) and Jennifer (Midwest) is −0.27 − 0.60 = −0.87.
6.5 (a) $23,400.
(b) In this case, ΔBDR = 1 and ΔHsize = 100. The resulting expected change in price is 23.4 + 0.156 × 100 = 39.0 thousand dollars, or $39,000.
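A one-line check of this calculation (23.4 and 0.156 are the coefficients used above, in thousands of dollars):

```python
price_change = 23.4 * 1 + 0.156 * 100   # delta BDR = 1, delta Hsize = 100
print(price_change)                     # 39.0 thousand dollars
```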
(c) The loss is $48,800.
(d) $\bar R^2 = 0.727$.

Figure 1: E6.2 regression results.


E6.2 See Figure 1 for the full regression results.
(a) -0.073
(b) -0.032
(c) The coefficient fell by more than half after adding the additional regressors, so it does seem that the result in (a) suffered from omitted variable bias.
(d) The regression in (b) fits the data much better, as evidenced by the lower SER and the higher $R^2$ and $\bar R^2$. The $R^2$ and $\bar R^2$ are similar because the sample size is large (n = 3796) relative to the number of regressors (k = 10), so the correction factor $\frac{n-1}{n-k-1} = 3795/3785 \approx 1.003$ is very close to 1.
(e) The coefficient of 0.696 on dadcoll means that students whose fathers went to college complete 0.696 more years of education, on average, than students whose fathers did not go to college, holding the other regressors constant.
(f) These terms capture the opportunity costs of attending college. As stwmfg80, the 1980 state hourly wage in manufacturing, increases, forgone wages increase, so that, on average, college attendance declines. The negative sign on the coefficient is consistent with this. As cue80, the county unemployment rate, increases, it is more difficult to find a job, which lowers the opportunity cost of attending college, so that college attendance increases. The positive sign on the coefficient is consistent with this.
(g) Bob's predicted years of education = −0.032 × 2 + 0.094 × 58 + 0.145 × 0 + 0.368 × 1 + 0.399 × 0 + 0.395 × 1 + 0.152 × 1 + 0.696 × 0 + 0.023 × 7.5 − 0.052 × 9.75 + 8.827 = 14.796.
(h) The difference between Jim's and Bob's predicted years of education is −0.032 × (4 − 2) = −0.064, so Jim's expected years of education is 14.796 − 0.064 = 14.732. The short script below reproduces both calculations.
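Every number in this sketch is taken directly from (g) and (h) above:

```python
# Coefficients in the order they appear in (g), followed by Bob's regressor values
coefs = [-0.032, 0.094, 0.145, 0.368, 0.399, 0.395, 0.152, 0.696, 0.023, -0.052]
bob   = [2, 58, 0, 1, 0, 1, 1, 0, 7.5, 9.75]
intercept = 8.827

bob_pred = intercept + sum(b * x for b, x in zip(coefs, bob))
jim_pred = bob_pred + (-0.032) * (4 - 2)   # Jim differs from Bob only in dist

print(bob_pred, jim_pred)   # approximately 14.796 and 14.732
```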
