1. The design matrix $X$ stacks a column of ones and the regressor $x_{i1}$, so

$$
X^T X =
\begin{pmatrix}
1 & 1 & \cdots & 1 \\
x_{11} & x_{21} & \cdots & x_{N1}
\end{pmatrix}
\begin{pmatrix}
1 & x_{11} \\
1 & x_{21} \\
\vdots & \vdots \\
1 & x_{N1}
\end{pmatrix}
=
\begin{pmatrix}
N & \sum_{i=1}^{N} x_{i1} \\
\sum_{i=1}^{N} x_{i1} & \sum_{i=1}^{N} x_{i1}^2
\end{pmatrix}
$$
Following the hint, we use the simple formula for the inverse of a $2 \times 2$ matrix:
$$
(X^T X)^{-1} = \frac{1}{N \sum_{i=1}^{N} x_{i1}^2 - \left( \sum_{i=1}^{N} x_{i1} \right)^2}
\begin{pmatrix}
\sum_{i=1}^{N} x_{i1}^2 & -\sum_{i=1}^{N} x_{i1} \\
-\sum_{i=1}^{N} x_{i1} & N
\end{pmatrix}
$$

We also need

$$
X^T y =
\begin{pmatrix}
1 & 1 & \cdots & 1 \\
x_{11} & x_{21} & \cdots & x_{N1}
\end{pmatrix}
\begin{pmatrix}
y_1 \\
y_2 \\
\vdots \\
y_N
\end{pmatrix}
=
\begin{pmatrix}
\sum_{i=1}^{N} y_i \\
\sum_{i=1}^{N} x_{i1} y_i
\end{pmatrix}
$$
Putting these results together gives us
$$
(X^T X)^{-1} X^T y =
\frac{1}{N \sum_{i=1}^{N} x_{i1}^2 - \left( \sum_{i=1}^{N} x_{i1} \right)^2}
\begin{pmatrix}
\sum_{i=1}^{N} x_{i1}^2 & -\sum_{i=1}^{N} x_{i1} \\
-\sum_{i=1}^{N} x_{i1} & N
\end{pmatrix}
\begin{pmatrix}
\sum_{i=1}^{N} y_i \\
\sum_{i=1}^{N} x_{i1} y_i
\end{pmatrix}
=
\frac{1}{N \sum_{i=1}^{N} x_{i1}^2 - \left( \sum_{i=1}^{N} x_{i1} \right)^2}
\begin{pmatrix}
\sum_{i=1}^{N} x_{i1}^2 \sum_{i=1}^{N} y_i - \sum_{i=1}^{N} x_{i1} \sum_{i=1}^{N} x_{i1} y_i \\
N \sum_{i=1}^{N} x_{i1} y_i - \sum_{i=1}^{N} x_{i1} \sum_{i=1}^{N} y_i
\end{pmatrix}
$$

In particular, the second element gives the slope estimator. Dividing both the numerator and the denominator by $N^2$,

$$
\hat\beta_1 = \frac{N \sum_{i=1}^{N} x_{i1} y_i - \sum_{i=1}^{N} x_{i1} \sum_{i=1}^{N} y_i}{N \sum_{i=1}^{N} x_{i1}^2 - \left( \sum_{i=1}^{N} x_{i1} \right)^2} = \frac{S_{XY}}{S_X^2}
$$
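The derivation above can be checked numerically. The following sketch is not part of the original solution: the data are simulated and the true intercept and slope (2 and 3) are arbitrary illustration values. It confirms that the scalar formulas agree with the matrix expression $(X^T X)^{-1} X^T y$:

```python
import numpy as np

# Simulated data for illustration only; intercept 2 and slope 3 are arbitrary.
rng = np.random.default_rng(0)
N = 50
x1 = rng.normal(size=N)
y = 2.0 + 3.0 * x1 + rng.normal(size=N)

# Design matrix: a column of ones and the regressor x_{i1}
X = np.column_stack([np.ones(N), x1])

# Matrix formula: beta_hat = (X^T X)^{-1} X^T y
beta_matrix = np.linalg.solve(X.T @ X, X.T @ y)

# Closed-form scalar formulas from the derivation above
Sx, Sxx = x1.sum(), (x1**2).sum()
Sy, Sxy = y.sum(), (x1 * y).sum()
den = N * Sxx - Sx**2
beta1_hat = (N * Sxy - Sx * Sy) / den
beta0_hat = (Sxx * Sy - Sx * Sxy) / den

assert np.allclose(beta_matrix, [beta0_hat, beta1_hat])
```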
2. In the general homoskedastic multivariate regression model with no autocorrelation, the variance-covariance matrix of the OLS estimator is

$$
\operatorname{Var}(\hat\beta) = \sigma^2 (X^T X)^{-1}
$$
From the previous problem, we know that when K = 1,
$$
\sigma^2 (X^T X)^{-1} = \frac{\sigma^2}{N \sum_{i=1}^{N} x_{i1}^2 - \left( \sum_{i=1}^{N} x_{i1} \right)^2}
\begin{pmatrix}
\sum_{i=1}^{N} x_{i1}^2 & -\sum_{i=1}^{N} x_{i1} \\
-\sum_{i=1}^{N} x_{i1} & N
\end{pmatrix}
\quad (1)
$$
(a) The first element of the first row of (1) corresponds to $\operatorname{Var}(\hat\beta_0)$. It is given by
$$
\operatorname{Var}(\hat\beta_0) = [\sigma^2 (X^T X)^{-1}]_{11} = \sigma^2 \frac{\sum_{i=1}^{N} x_{i1}^2}{N \sum_{i=1}^{N} x_{i1}^2 - \left( \sum_{i=1}^{N} x_{i1} \right)^2}
$$
Using some straightforward algebra:
$$
\sigma^2 \frac{\sum_{i=1}^{N} x_{i1}^2}{N \sum_{i=1}^{N} x_{i1}^2 - \left( \sum_{i=1}^{N} x_{i1} \right)^2}
= \sigma^2 \frac{\sum_{i=1}^{N} (x_{i1} - \bar X)^2 + N \bar X^2}{N \sum_{i=1}^{N} (x_{i1} - \bar X)^2}
= \frac{\sigma^2}{N} + \frac{\sigma^2 \bar X^2}{\sum_{i=1}^{N} (x_{i1} - \bar X)^2}
$$
(b) The second element of the second row of (1) corresponds to $\operatorname{Var}(\hat\beta_1)$. It is given by
$$
\operatorname{Var}(\hat\beta_1) = [\sigma^2 (X^T X)^{-1}]_{22}
= \sigma^2 \frac{N}{N \sum_{i=1}^{N} x_{i1}^2 - \left( \sum_{i=1}^{N} x_{i1} \right)^2}
= \frac{\sigma^2}{\sum_{i=1}^{N} (x_{i1} - \bar X)^2}
$$
(c) The covariance between $\hat\beta_0$ and $\hat\beta_1$ is given by either of the off-diagonal entries of (1). More precisely,
$$
\operatorname{Cov}(\hat\beta_0, \hat\beta_1) = [\sigma^2 (X^T X)^{-1}]_{12} = [\sigma^2 (X^T X)^{-1}]_{21}
= -\sigma^2 \frac{\sum_{i=1}^{N} x_{i1}}{N \sum_{i=1}^{N} (x_{i1} - \bar X)^2}
= -\frac{\sigma^2 \bar X}{\sum_{i=1}^{N} (x_{i1} - \bar X)^2}
$$
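Parts (a)-(c) can be verified numerically. This is an illustrative sketch, not part of the original solution: $\sigma^2$ and the simulated regressor values are arbitrary choices.

```python
import numpy as np

# Illustrative check of (a)-(c); sigma^2 and the regressor values are arbitrary.
rng = np.random.default_rng(1)
N, sigma2 = 40, 2.5
x1 = rng.normal(loc=1.0, size=N)
X = np.column_stack([np.ones(N), x1])

# Variance-covariance matrix sigma^2 (X^T X)^{-1}
V = sigma2 * np.linalg.inv(X.T @ X)

xbar = x1.mean()
ssd = ((x1 - xbar) ** 2).sum()  # sum of squared deviations

var_b0 = sigma2 / N + sigma2 * xbar**2 / ssd   # part (a)
var_b1 = sigma2 / ssd                          # part (b)
cov_b0b1 = -sigma2 * xbar / ssd                # part (c)

assert np.isclose(V[0, 0], var_b0)
assert np.isclose(V[1, 1], var_b1)
assert np.isclose(V[0, 1], cov_b0b1)
```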
3. (a) We want to show that
$$
\operatorname{Cov}[X_1, Y] = \beta_1 \operatorname{Var}[X_1] + \beta_2 \operatorname{Cov}[X_1, X_2]
$$
First, substitute in for Y on the LHS:
$$
\operatorname{Cov}[X_1, Y] = \operatorname{Cov}[X_1, \beta_0 + \beta_1 X_1 + \beta_2 X_2 + U]
$$
Use the definition of covariance and the properties of the expectations operator:
$$
\begin{aligned}
\operatorname{Cov}[X_1, \beta_0 + \beta_1 X_1 + \beta_2 X_2 + U]
&= E[(X_1 - E[X_1])(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + U - E[\beta_0 + \beta_1 X_1 + \beta_2 X_2 + U])] \\
&= E[(X_1 - E[X_1])(\beta_1 (X_1 - E[X_1]) + \beta_2 (X_2 - E[X_2]) + U)] \\
&= \beta_1 E[(X_1 - E[X_1])^2] + \beta_2 E[(X_1 - E[X_1])(X_2 - E[X_2])] + E[(X_1 - E[X_1])U] \\
&= \beta_1 \operatorname{Var}[X_1] + \beta_2 \operatorname{Cov}[X_1, X_2] + \operatorname{Cov}[X_1, U] \\
&= \beta_1 \operatorname{Var}[X_1] + \beta_2 \operatorname{Cov}[X_1, X_2]
\end{aligned}
$$

where the second line uses $E[U] = 0$ and the last line uses $\operatorname{Cov}[X_1, U] = 0$.
(b) We know (see for example problem set 3) that we can express the OLS estimator in the
following way:
$$
\hat\beta_1 = \frac{\sum_{i=1}^{n} (X_{1i} - \bar X_1)(Y_i - \bar Y)}{\sum_{i=1}^{n} (X_{1i} - \bar X_1)^2}
$$

Substituting $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + U_i$ into this expression gives

$$
\hat\beta_1 = \beta_1 + \beta_2 \frac{\sum_{i=1}^{n} (X_{1i} - \bar X_1)(X_{2i} - \bar X_2)}{\sum_{i=1}^{n} (X_{1i} - \bar X_1)^2} + \frac{\sum_{i=1}^{n} (X_{1i} - \bar X_1) U_i}{\sum_{i=1}^{n} (X_{1i} - \bar X_1)^2}
$$
By the WLLN and Slutsky's theorem (see problem set 3 for the details),

$$
\frac{\sum_{i=1}^{n} (X_{1i} - \bar X_1)(X_{2i} - \bar X_2)}{\sum_{i=1}^{n} (X_{1i} - \bar X_1)^2} \xrightarrow{p} \frac{\operatorname{Cov}[X_1, X_2]}{\operatorname{Var}[X_1]},
\qquad
\frac{\sum_{i=1}^{n} (X_{1i} - \bar X_1) U_i}{\sum_{i=1}^{n} (X_{1i} - \bar X_1)^2} \xrightarrow{p} \frac{\operatorname{Cov}[X_1, U]}{\operatorname{Var}[X_1]} = 0,
$$

so that

$$
\hat\beta_1 \xrightarrow{p} \beta_1 + \beta_2 \frac{\operatorname{Cov}[X_1, X_2]}{\operatorname{Var}[X_1]}.
$$
This shows the result we are interested in.
(c) If $\beta_2 = 0$ or $\operatorname{Cov}[X_1, X_2] = 0$, then $\hat\beta_1$ will be consistent. If neither of these two conditions holds, then we have omitted variable bias.
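A small simulation illustrates the probability limit $\beta_1 + \beta_2 \operatorname{Cov}[X_1, X_2]/\operatorname{Var}[X_1]$. This sketch is not from the problem set; all parameter values are made-up illustration choices.

```python
import numpy as np

# Simulation of omitted variable bias:
#   plim beta1_hat = beta1 + beta2 * Cov[X1, X2] / Var[X1]
# All parameter values are arbitrary illustration choices.
rng = np.random.default_rng(2)
n = 200_000
beta1, beta2 = 1.0, 0.5

# Construct X2 = X1 + noise, so Cov[X1, X2] = Var[X1] = 1 in the population.
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(size=n)
u = rng.normal(size=n)
y = beta1 * x1 + beta2 * x2 + u

# Short regression of Y on X1 alone (omitting X2)
b1_short = np.cov(x1, y)[0, 1] / np.var(x1, ddof=1)

# Theoretical probability limit: 1.0 + 0.5 * (1 / 1) = 1.5
plim_b1 = beta1 + beta2 * 1.0

assert abs(b1_short - plim_b1) < 0.05
```

With a large sample, the short-regression slope lands close to 1.5 rather than the causal $\beta_1 = 1$.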
(d) Yes, the fitted residuals satisfy these two conditions by construction. In fact, the FOCs of OLS imply that we set each expression equal to zero and solve for $\hat\beta$. (Derive the OLS estimator in the case of $\beta = [\beta_0 \; \beta_1]^T$ to see this.)
4. (a) (...)
(b) The average wage paid to a foot soldier in a small gang not at war is found by setting war = large = 0. Hence, $\widehat{wage} = \hat\beta_0 = 1.83$.
(c) Introduce an interaction term, war × large, into the model:

$$
wage = \beta_0 + \beta_1\, war + \beta_2\, large + \beta_3\, (war \times large) + U
$$

The estimated difference between large and small gangs in terms of the effect of a war will be given by $\hat\beta_3$.
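To see mechanically what $\hat\beta_3$ captures, the following sketch shows that in the saturated dummy specification the coefficient on war × large equals the difference-in-differences of group mean wages. Only $\hat\beta_0 = 1.83$ comes from the problem; the other group means are made up for illustration.

```python
import itertools
import numpy as np

# Made-up group mean wages indexed by (war, large); only 1.83 is from the problem.
means = {(0, 0): 1.83, (0, 1): 2.10, (1, 0): 2.50, (1, 1): 3.40}

rows, wages = [], []
for war, large in itertools.product([0, 1], [0, 1]):
    for _ in range(25):                       # balanced cells
        rows.append([1.0, war, large, war * large])
        wages.append(means[(war, large)])     # no noise, so OLS fits exactly

X = np.array(rows)
y = np.array(wages)
b = np.linalg.lstsq(X, y, rcond=None)[0]      # [b0, b1, b2, b3]

# beta3 equals the difference-in-differences of group means
did = (means[(1, 1)] - means[(0, 1)]) - (means[(1, 0)] - means[(0, 0)])
assert np.isclose(b[3], did)
assert np.isclose(b[0], 1.83)                 # small gang, no war
```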
5. (a) There may be good reason in this case to suspect that a causal interpretation is appropriate for the population regression model. For example, reading the WSJ may build valuable human capital if readers learn meaningful lessons about the business world which make them more valuable workers later on. Another possibility is that some employers (such as the US State Department or investment banking firms) require their employees to maintain up-to-date knowledge about current trends in world politics and economics; in either of these two cases it may very well be, at least theoretically, that WSJ readership has a direct effect of making one a more marketable employee prospect. In this case, other unobserved determinants of employment probability might be things like fundamental worker productivity differences which are detectable (at least partially) by job interviewers, or college major, which in principle is observable but which WSJ left out of their study.
(b) In this case there is good reason to suspect that the parameters of the causal model are NOT identified due to probable correlations between X and U: students who are more diligent may read the WSJ (or some other newspaper) more frequently, and if this diligence can be signaled to potential employers in an interview, then we may get an omitted variable bias. Similarly, if demand for economics and/or business degrees is higher than demand for humanities and other social sciences, and if economics/business majors are more likely to choose the WSJ (over, say, the NY Times), then we could also get an omitted variable bias. Therefore, one might suspect that OLS estimates of $\beta_1$ using only data on Y and X are both biased and inconsistent.
(c) Given the endogeneity discussed above, we should only interpret the estimated sample regression function $\hat\beta_0 + \hat\beta_1 X$ as (an approximation of) the conditional expectation function (CEF), not a causal relationship. In other words, even though we may hypothesize that the underlying population regression model is causal in nature, we can only in this case pin down an estimate of the CEF, because the parameters of the causal model are not identified.
(d) The empirical evidence that students who read the WSJ are more likely to be employed post-graduation says nothing conclusive about the causal effect of buying the WSJ on finding a job. It simply tells us that students who read the WSJ also tend to be the types of students who are employable; but this may be driven mostly or entirely by other factors like major or productivity differences. For a given student, buying the WSJ may change her underlying employability very little, if at all.
6. (a) Let %Δ denote percent change. Since the dependent variable is in log form, we have:

$$
\frac{\%\Delta wage}{\Delta educ} = \beta_1 + \beta_2\, pareduc
$$

We see that the effect depends neither on own education nor on experience. However, it does depend on parents' education.
(b) For experience, we have

$$
\frac{\%\Delta wage}{\Delta experience} = \beta_3 + 2\beta_4\, experience
$$

In this case, the effect depends on the level of experience. However, it does not depend on own or parents' education.
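The two marginal effects in (a) and (b) can be checked by finite differences. This sketch assumes the underlying specification is $\log(wage) = \beta_0 + \beta_1 educ + \beta_2\, educ \cdot pareduc + \beta_3 exper + \beta_4 exper^2$, which is consistent with the derivatives above; the coefficient values are made up, not the estimates from the problem.

```python
# Finite-difference check of the marginal effects in (a) and (b).
# Assumed model: log(wage) = b0 + b1*educ + b2*educ*pareduc + b3*exper + b4*exper^2
# Coefficient values below are made-up illustration numbers.
b0, b1, b2, b3, b4 = 0.5, 0.05, 0.002, 0.03, -0.0005

def log_wage(educ, pareduc, exper):
    return b0 + b1 * educ + b2 * educ * pareduc + b3 * exper + b4 * exper**2

educ, pareduc, exper = 12.0, 10.0, 5.0
h = 1e-6

# d log(wage) / d educ should equal b1 + b2 * pareduc
d_educ = (log_wage(educ + h, pareduc, exper)
          - log_wage(educ - h, pareduc, exper)) / (2 * h)
assert abs(d_educ - (b1 + b2 * pareduc)) < 1e-8

# d log(wage) / d exper should equal b3 + 2 * b4 * exper
d_exper = (log_wage(educ, pareduc, exper + h)
           - log_wage(educ, pareduc, exper - h)) / (2 * h)
assert abs(d_exper - (b3 + 2 * b4 * exper)) < 1e-8
```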
(d) $R^2 = 0.727$.
$0.399 \times 0 + 0.395 \times 1 + 0.152 \times 1 + 0.696 \times 0 + 0.023 \times 7.5 - 0.052 \times 9.75 + 8.827 = 14.796$.
(h) The difference between Jim's and Bob's predicted years of education is $0.032 \times (4 - 2) = 0.032 \times 2 = 0.064$, so Jim's expected years of education is given by $14.796 - 0.064 = 14.732$.