
Department of Economics

Columbia University

W3412
Fall 2015

Problem Set 8
Introduction to Econometrics
Profs. Seyhan Erden and Miikka Rokkanen
for all sections.
Part I:
True, False, Uncertain with Explanation:
1. You want to estimate a supply equation for snow boots:
$Q_i = \beta_0 + \beta_1 P_i + \beta_2 T_i + u_i$
where $Q_i$ is the quantity sold in city i, $P_i$ is the price charged in city i, and $T_i$ is the cost of
transportation to city i from a central production facility. The instrument you had in mind
for $P_i$ in this equation, annual snowfall, is correlated with [symbol missing in source]. This correlation is
evidence that your instrument is not valid.
2. One can use the same instrument for two different endogenous explanatory variables in
the same regression.
3. Instrument relevance means that some of the variation in the regressor is related to
variation in the instrument.

Part II:

1. Do you think attendance at lectures affects performance on the final exam? A model to explain the
standardized outcome on a final exam (stndfnl) in terms of percentage of classes attended, prior
college grade point average, and ACT score is
$\text{stndfnl} = \beta_0 + \beta_1\,\text{atndrte} + \beta_2\,\text{priGPA} + \beta_3\,\text{ACT} + u$
where the variables are defined in the following table:

Variable          Definition
stndfnl           Standardized final exam score
atndrte           Percentage of classes attended
priGPA            Prior college grade point average
ACT               Achievement Test score
priGPAxatndrte    Prior GPA times attendance rate

(a) Let dist be the distance from the student's living quarters to the lecture hall. Do you think
dist is uncorrelated with u?

(b) Assuming that dist and u are uncorrelated, what other assumption must dist satisfy in order
to be a valid IV for atndrte?
(c) Suppose we add the interaction term priGPAxatndrte:
$\text{stndfnl} = \beta_0 + \beta_1\,\text{atndrte} + \beta_2\,\text{priGPA} + \beta_3\,\text{ACT} + \beta_4\,\text{priGPAxatndrte} + u$
If atndrte is correlated with u, then in general so is priGPAxatndrte. What might be a good IV for
priGPAxatndrte? [Hint: if $E(u \mid \text{priGPA}, \text{ACT}, \text{dist}) = 0$, as happens when priGPA, ACT, and dist
are all exogenous, then any function of priGPA and dist is uncorrelated with u.]

2. The purpose of this question is to compare the estimates and standard errors obtained by correctly
using 2SLS with those obtained using inappropriate procedures. Use the data file WAGE2.dta;
the variables are:

Variable    Definition
wage        Monthly earnings
educ        Years of education
exper       Years of working experience
tenure      Years with current employer
black       =1 if the person is African-American, 0 otherwise
sibs        Number of siblings

(a) Use a 2SLS routine to estimate the equation
$\log(\text{wage}) = \beta_0 + \beta_1\,\text{educ} + \beta_2\,\text{exper} + \beta_3\,\text{tenure} + \beta_4\,\text{black} + u$
using sibs as the IV for educ. Report your results.
(b) Now carry out 2SLS manually. That is, first regress educ on sibs, exper, tenure, and
black, and obtain the fitted values educ_hat; then run the second-stage regression of log(wage) on
educ_hat, exper, tenure, and black. Verify that the estimated coefficients are identical
to those obtained from part (a) but that the standard errors are somewhat different. The
standard errors obtained from the second-stage regression when manually carrying out
2SLS are generally inappropriate.
(c) Now use the following two-step procedure, which generally yields inconsistent
parameter estimates of the $\beta_j$, and not just inconsistent standard errors. In step one, regress
educ on sibs only and obtain the fitted values, say educ_tilde. (Note that this is an
incorrect first-stage regression.) Then, in the second step, run the regression of log(wage)
on educ_tilde, exper, tenure, and black. How does the estimate from this incorrect two-step
procedure compare with the correct 2SLS estimate of the return to education?
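
The mechanics of parts (b) and (c) can be tried out before opening WAGE2.dta. The Python sketch below uses simulated data only; the variables z (instrument), w (exogenous control), x (endogenous regressor), and y (outcome), all coefficient values, and the helper functions solve, cross, cross_vec, ols, and fitted are assumptions made for illustration and do not come from the problem set. It compares the slope on x from a direct IV formula, from a manual second stage with the correct first stage, and from a manual second stage with the incorrect first stage that omits w. Standard errors are not computed.

```python
import random

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination."""
    k = len(A)
    M = [list(A[i]) + [b[i]] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]

def cross(A, B):
    """Return A'B for two row-major matrices with the same number of rows."""
    return [[sum(a[i] * b[j] for a, b in zip(A, B)) for j in range(len(B[0]))]
            for i in range(len(A[0]))]

def cross_vec(A, y):
    """Return A'y for a row-major matrix A and a vector y."""
    return [sum(a[i] * yi for a, yi in zip(A, y)) for i in range(len(A[0]))]

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    return solve(cross(X, X), cross_vec(X, y))

def fitted(X, b):
    """Fitted values X b."""
    return [sum(xj * bj for xj, bj in zip(row, b)) for row in X]

# Simulated data (hypothetical design): x is endogenous because it shares
# the unobserved factor a with the error term of the y equation.
random.seed(0)
z, w, x, y = [], [], [], []
for _ in range(20000):
    zi = random.gauss(0, 1)                 # instrument
    wi = 0.5 * zi + random.gauss(0, 1)      # exogenous control, correlated with z
    ai = random.gauss(0, 1)                 # unobserved factor
    xi = zi + 0.5 * wi + ai + random.gauss(0, 1)
    yi = 1.0 + 0.5 * xi + wi + ai + random.gauss(0, 1)  # true slope on x is 0.5
    z.append(zi); w.append(wi); x.append(xi); y.append(yi)

Z_full = [[1.0, zi, wi] for zi, wi in zip(z, w)]   # instrument plus exogenous control
X_struct = [[1.0, xi, wi] for xi, wi in zip(x, w)]  # regressors of the structural equation

# Direct IV formula for the just-identified case: b = (Z'X)^{-1} Z'y.
b_iv = solve(cross(Z_full, X_struct), cross_vec(Z_full, y))

# Manual 2SLS with the correct first stage (x on constant, z, and w).
x_hat = fitted(Z_full, ols(Z_full, x))
b_manual = ols([[1.0, xh, wi] for xh, wi in zip(x_hat, w)], y)

# Incorrect first stage (x on constant and z only, omitting w).
Z_only = [[1.0, zi] for zi in z]
x_tilde = fitted(Z_only, ols(Z_only, x))
b_wrong = ols([[1.0, xt, wi] for xt, wi in zip(x_tilde, w)], y)

print("direct IV slope:           ", b_iv[1])
print("manual 2SLS slope:         ", b_manual[1])
print("incorrect two-step slope:  ", b_wrong[1])
```

In this simulated design the first two slopes agree up to floating-point error, while the third settles near a different value because the omitted control w is correlated with the instrument z.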

3. Suppose you are investigating how wage inflation affects price inflation. You hypothesize that
wage costs force prices up. You model this by the following linear relationship,
$p = \beta_0 + \beta_1 w + u_p$
where p = annual rate of growth of prices, w = annual rate of growth in wages, and $u_p$ is an error
term. You are interested in estimating the parameter $\beta_1$. However, you suspect that, at the
same time, workers protect their wages by demanding higher wages as prices rise, but that their ability
to do so becomes weaker the higher the rate of unemployment.
You try to model this through the following second equation,
$w = \gamma_0 + \gamma_1 p + \gamma_2 U + u_w$
Here, U is the unemployment rate, which we assume is an exogenous variable: $\text{Cov}(U, u_p) = 0$ and
$\text{Cov}(U, u_w) = 0$.
(a) From the above economic arguments, what do you expect the signs of the slope
parameters in the two equations to be?
(b) Solve the two equations w.r.t. the endogenous variables, p and w. That is, express these
variables as functions of U, $u_p$, and $u_w$ only.
(c) Show that $\text{Cov}(w, u_p) \neq 0$ and use this to argue that the OLS estimator of $\beta_1$ will be biased
and inconsistent.
(d) Under what additional assumption on U will 2SLS estimation of $\beta_1$ with U as an
instrument be consistent? How can you test this assumption?
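
A small simulation can make the simultaneity problem in this question concrete. The Python sketch below is an illustration under assumed values only: the parameter constants B0, B1, G0, G1, G2, the distribution of U, and the helper cov are hypothetical and are not part of the problem. For each draw it solves the two equations jointly for p and w, then compares the OLS slope of p on w with the simple IV slope that uses U as the instrument.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists (divisor n)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

# Hypothetical parameter values chosen only for this illustration.
B0, B1 = 1.0, 0.5             # price equation: p = B0 + B1*w + u_p
G0, G1, G2 = 2.0, 0.4, -0.5   # wage equation:  w = G0 + G1*p + G2*U + u_w

random.seed(0)
p, w, U = [], [], []
for _ in range(20000):
    Ui = random.gauss(6, 2)   # exogenous unemployment rate (assumed distribution)
    up = random.gauss(0, 1)
    uw = random.gauss(0, 1)
    # Solve the 2x2 system  p - B1*w = r1,  -G1*p + w = r2  by Cramer's rule.
    r1 = B0 + up
    r2 = G0 + G2 * Ui + uw
    det = 1.0 - B1 * G1
    p.append((r1 + B1 * r2) / det)
    w.append((r2 + G1 * r1) / det)
    U.append(Ui)

ols_slope = cov(w, p) / cov(w, w)   # OLS slope of p on w
iv_slope = cov(U, p) / cov(U, w)    # IV slope with U as the instrument for w

print("true B1:  ", B1)
print("OLS slope:", ols_slope)
print("IV slope: ", iv_slope)
```

With these assumed values the IV slope lands close to B1 while the OLS slope does not, because w depends on $u_p$ through the second equation.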

4. Consider the following regression model:
$Y = \beta_0 + \beta_1 X + u$
where $\text{Cov}(X, u) \neq 0$. You want to use the variables $Z_1$ and $Z_2$ as instruments for X. You check
for instrument exogeneity using the overidentifying restrictions test, and the value of the
heteroskedasticity-robust J-statistic you obtain is J = 15.7. Assume that the instruments are not
weak and that your sample is large.
(a) What does the value of the J-statistic mean? Do we reject the null hypothesis of instrument
exogeneity?
(b) Does the value of the J-statistic suggest that $\text{Cov}(u, Z_1) \neq 0$, or $\text{Cov}(u, Z_2) \neq 0$, or both?
Explain.
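
With two instruments and one endogenous regressor there is one overidentifying restriction, so in large samples the J-statistic is compared with a chi-squared distribution with 1 degree of freedom under the null. The Python sketch below shows how a p-value could be computed in that one-degree-of-freedom case. The function name j_test_pvalue_df1 and the value example_j are hypothetical, and example_j is deliberately not the value given in the question.

```python
import math

def j_test_pvalue_df1(j_stat):
    """Upper-tail probability of a chi-squared(1) variable at j_stat.

    Uses P(Z^2 > j) = P(|Z| > sqrt(j)) = erfc(sqrt(j / 2)) for standard normal Z.
    Valid only for one degree of freedom.
    """
    return math.erfc(math.sqrt(j_stat / 2.0))

example_j = 2.5  # hypothetical J value, for illustration only
p_value = j_test_pvalue_df1(example_j)
print("J =", example_j, "p-value =", round(p_value, 4))
```

A p-value below the chosen significance level leads to rejecting the null that all instruments are exogenous; the statistic alone does not indicate which instrument is at fault.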

5. By studying the probability limit (plim) of the IV estimator we can see that when Z and u
are possibly correlated, we can write

$\text{plim}\,\hat{\beta}_{1,IV} = \beta_1 + \frac{\text{Corr}(Z,u)}{\text{Corr}(Z,X)} \cdot \frac{\sigma_u}{\sigma_x}$    (1)

where $\sigma_u$ and $\sigma_x$ are the standard deviations of u and X in the population, respectively. The
interesting part of this equation involves the correlation terms. It shows that, even if
Corr(Z,u) is small, the inconsistency in the IV estimator can be very large if Corr(Z,X) is
also small. Thus, even if we focus only on consistency, it is not necessarily better to use
IV than OLS if the correlation between Z and u is smaller than that between X and u.
Using the fact that $\text{Corr}(X,u) = \text{Cov}(X,u)/(\sigma_x \sigma_u)$, along with the fact that
$\text{plim}\,\hat{\beta}_1 = \beta_1 + \text{Cov}(X,u)/\text{Var}(X)$, which equals $\beta_1$ when $\text{Cov}(X,u) = 0$, we can write the plim of the OLS estimator,
call it $\hat{\beta}_{1,OLS}$, as

$\text{plim}\,\hat{\beta}_{1,OLS} = \beta_1 + \text{Corr}(X,u) \cdot \frac{\sigma_u}{\sigma_x}$    (2)

Assume that $\sigma_u = \sigma_x$, so that the population variance in the error term is the same as it is
in X. Suppose the instrumental variable, Z, is slightly correlated with u: $\text{Corr}(Z,u) = 0.1$.
Suppose also that Z and X have a somewhat stronger correlation: $\text{Corr}(Z,X) = 0.2$.
(i) What is the asymptotic bias in the IV estimator?
(ii) How much correlation would have to exist between X and u before OLS has more
asymptotic bias than IV?
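
The two plim expressions can be turned into a pair of small helper functions. The asymptotic bias of IV is the ratio Corr(Z,u)/Corr(Z,X) scaled by $\sigma_u/\sigma_x$, and the asymptotic bias of OLS is Corr(X,u) scaled by the same ratio. The Python sketch below evaluates both; the function names and the input values are hypothetical and intentionally differ from the numbers in parts (i) and (ii).

```python
def iv_asymptotic_bias(corr_zu, corr_zx, sd_u, sd_x):
    """plim of the IV estimator minus beta_1, as in equation (1)."""
    return (corr_zu / corr_zx) * (sd_u / sd_x)

def ols_asymptotic_bias(corr_xu, sd_u, sd_x):
    """plim of the OLS estimator minus beta_1, as in equation (2)."""
    return corr_xu * (sd_u / sd_x)

# Hypothetical inputs for illustration (not the values in the question).
iv_bias = iv_asymptotic_bias(corr_zu=0.05, corr_zx=0.25, sd_u=1.0, sd_x=1.0)
ols_bias = ols_asymptotic_bias(corr_xu=0.1, sd_u=1.0, sd_x=1.0)

print("IV asymptotic bias: ", iv_bias)
print("OLS asymptotic bias:", ols_bias)
```

In this made-up case the instrument is only weakly correlated with u, yet IV has the larger asymptotic bias because Corr(Z,X) is small as well.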

6. Define the differences-in-differences estimator in terms of observable differences in the treatment and control
groups, before and after the treatment. Explain why this presentation is equivalent to
calculating the coefficient in a regression framework.
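
Assuming the estimator meant here is the differences-in-differences estimator, the equivalence can be checked numerically. The Python sketch below builds a simulated two-group, two-period data set, computes the difference of before/after changes in group means, and compares it with the OLS coefficient on the group-by-period interaction term. The data-generating values and the helpers solve and cell_mean are hypothetical, chosen only for illustration.

```python
import random

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination."""
    k = len(A)
    M = [list(A[i]) + [b[i]] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]

# Simulated data: g = 1 for the treatment group, t = 1 for the period after treatment.
random.seed(1)
data = []
for i in range(400):
    g = i % 2
    t = (i // 2) % 2
    y = 1.0 + 0.5 * g + 0.3 * t + 2.0 * g * t + random.gauss(0, 1)
    data.append((g, t, y))

def cell_mean(g, t):
    """Average outcome in one group-by-period cell."""
    ys = [y for gi, ti, y in data if gi == g and ti == t]
    return sum(ys) / len(ys)

# Change in the treatment group minus change in the control group.
did_means = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

# OLS of y on a constant, g, t, and the interaction g*t via the normal equations.
X = [[1.0, g, t, g * t] for g, t, _ in data]
y = [row[2] for row in data]
k = len(X[0])
XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
b = solve(XtX, Xty)

print("difference of mean changes:", did_means)
print("interaction coefficient:   ", b[3])
```

The two numbers coincide up to floating-point error because the regression with both dummies and their interaction fits each of the four cell means exactly.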

The following questions will not be graded; they are for you to practice and will be discussed at
recitation:

1. SW Exercise 12.2
2. SW Exercise 12.5
3. SW Exercise 12.7
4. SW Exercise 12.8
5. SW Exercise 12.10
6. SW Empirical Exercise 12.1
7. SW Empirical Exercise 12.2
8. SW Exercise 13.4
9. SW Exercise 13.5