
# Introductory Econometrics

Problem set 2
Jan Zouhar
Department of Econometrics, University of Economics, Prague, zouharj@vse.cz

Due date: 30th November (Friday class) / 4th December (Tuesday class)

Problem 2.1. Use the data in wage3.gdt from my website for this problem and the problems below. As
indicated in Data → Dataset info, the dataset comes from the paper by Blackburn and Neumark (1992)
entitled “Unobserved Ability, Efficiency Wages, and Interindustry Wage Differentials”, published in the Quarterly
Journal of Economics (feel free to read the paper for additional information). Unless stated otherwise, use
the 5 % significance level for all tests.
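a ) Estimate the equation

$$\log(wage) = \beta_0 + \beta_1\, exper + \beta_2\, exper^2 + \beta_3\, educ + \beta_4\, south + \beta_5\, black + \beta_6\, (south \cdot black) + u \qquad (1)$$

and report the results using the usual format.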

b ) Use the Breusch-Pagan test to show whether Assumption MLR.5 holds. What do you conclude? (Re-
port the value of the test statistic and the resulting p-value along with your conclusions.)
c ) Using the approximation

$$\%\Delta wage \approx 100\,(\beta_1 + 2\beta_2\, exper)\,\Delta exper,$$

find the approximate return to the fifth year of work experience. What is the approximate return to the
twentieth year of work experience?
d) At what value of exper does additional experience actually begin to lower predicted log(wage)? (Or,
what is the turning point in the effect of experience?) How many people have more experience in this
sample?
e) Is the effect of experience significant in the equation? (Formulate the null hypothesis and report the
value of the test statistic and the resulting p-value along with your conclusions.)
f) What is the expected difference in the wage due to race (black versus non-black) for people living in
the North? And for people living in the South?
g) According to your equation, is the race gap significantly larger in the South than in the North? (State a
suitable null and alternative hypothesis, report the p-value of the test and conclude.)
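
The approximation in part c) and the turning point in part d) both follow from the slope of the quadratic in exper, which is β1 + 2β2·exper and equals zero at exper = −β1/(2β2). A minimal sketch of the arithmetic, assuming made-up coefficient values (they are not estimates from wage3.gdt) and assuming that "the fifth year" means moving from 4 to 5 years of experience; the function names are hypothetical:

```python
def approx_pct_return(beta1, beta2, exper, delta_exper=1.0):
    """Approximate % change in wage: 100 * (beta1 + 2*beta2*exper) * delta_exper."""
    return 100.0 * (beta1 + 2.0 * beta2 * exper) * delta_exper


def turning_point(beta1, beta2):
    """Value of exper at which the slope beta1 + 2*beta2*exper equals zero."""
    if beta2 == 0:
        raise ValueError("no turning point when beta2 is zero")
    return -beta1 / (2.0 * beta2)


# Hypothetical coefficients for illustration only, NOT estimates from the data.
b1, b2 = 0.04, -0.001

# Return to the fifth year, read here as the step from exper = 4 to exper = 5.
print(approx_pct_return(b1, b2, exper=4))
# Return to the twentieth year, read as the step from exper = 19 to exper = 20.
print(approx_pct_return(b1, b2, exper=19))
# Experience level beyond which predicted log(wage) starts to fall.
print(turning_point(b1, b2))
```

With your own estimates, replace `b1` and `b2` by the coefficients on exper and exper² from (1).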
Problem 2.2. Based on (1), you want to predict the salary of a white male person with 5 years of work
experience and 18 years of education. This prediction is made difficult by the presence of logarithms; read
Wooldridge’s section “Predicting y when log(y) is the dependent variable”.
a ) Find the prediction, assuming that u is normally distributed (conditional on all independent variables),
i.e. that assumptions MLR.1 through MLR.6 hold.
b ) Save the residuals from (1) to a new variable uhat, and test for normality (Gretl: Variable → Normality
test); the null is that uhat is normally distributed. Next, look at the Q-Q plot of residual quantiles
against theoretical normal distribution quantiles (Gretl: Variable → Normal Q-Q plot). What do you
conclude?
c ) Find the prediction once again, this time using Duan’s (1983) smearing estimate, described in the same
section of Wooldridge’s book. (Hint: you will need to create a new variable, calculated as exp(uhat),
and find its mean, e.g. by displaying summary statistics.)
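
The two predictions in parts a) and c) differ only in the factor that scales exp of the predicted log: exp(σ̂²/2) under normality, and the sample mean of exp(uhat) for Duan's smearing estimate. A minimal sketch in Python on synthetic data, assuming a one-regressor model; the data, the variable names, and the prediction point `x0` are all hypothetical and are not taken from wage3.gdt:

```python
import numpy as np

# Synthetic data for illustration only (NOT wage3.gdt).
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0, 10, n)
log_y = 1.0 + 0.1 * x + rng.normal(0, 0.4, n)

# OLS of log(y) on a constant and x.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, log_y, rcond=None)
uhat = log_y - X @ beta

# Unbiased estimate of the error variance: SSR / (n - number of coefficients).
sigma2_hat = uhat @ uhat / (n - X.shape[1])

# Scaling factor under normal errors: exp(sigma^2 / 2).
alpha_normal = np.exp(sigma2_hat / 2)
# Duan's smearing factor: sample mean of exp(uhat), no normality needed.
alpha_duan = np.mean(np.exp(uhat))

# Predict y at a hypothetical point x = 5.
x0 = np.array([1.0, 5.0])
log_pred = x0 @ beta
pred_normal = alpha_normal * np.exp(log_pred)
pred_duan = alpha_duan * np.exp(log_pred)

print(alpha_normal, alpha_duan)
print(pred_normal, pred_duan)
```

Both factors exceed 1, so simply exponentiating the predicted log would underpredict y. When the errors are close to normal, the two factors should be similar.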
Problem 2.3. a ) Estimate a modified version of (1) with the level, rather than log, of wage as the
dependent variable:

$$wage = \beta_0 + \beta_1\, exper + \beta_2\, exper^2 + \beta_3\, educ + \beta_4\, south + \beta_5\, black + \beta_6\, (south \cdot black) + u. \qquad (2)$$


b ) Based on (2), obtain the 95 % prediction interval for the wage of the person from Problem 2.2.
c ) Save the residuals ($\hat{u}$) from (2) and find the sample correlation coefficients between $\hat{u}$ and all the
explanatory variables (i.e., 5 correlation coefficients). Explain the results.
d ) Save the fitted values $\widehat{wage}$ from (2) and find the sample correlation coefficient between wage and
$\widehat{wage}$. Is there any relationship between this correlation coefficient and the $R^2$ from the regression
model? (Hint: See Wooldridge, look for the origin of the term ‘R-squared’.)
e ) Based on (1), calculate the predicted wage for all people in the sample ($\widehat{wage}_2$), using Duan’s estimate
as in Problem 2.2. Find the squared correlation between wage and $\widehat{wage}_2$, and use the result to compare
the goodness of fit of (1) and (2). (See Wooldridge, same section as in Problem 2.2, for a comparison
of goodness of fit for models that combine dependent variables in the level and the log form.)
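
Parts c) and d) rest on two algebraic properties of OLS with an intercept: the residuals have zero sample correlation with every regressor, and the squared correlation between the dependent variable and the fitted values equals R². A minimal numerical check in Python on synthetic data, assuming a two-regressor model; the data and variable names are hypothetical and are not taken from wage3.gdt:

```python
import numpy as np

# Synthetic data for illustration only (NOT wage3.gdt).
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 1.5 * x1 - 0.5 * x2 + rng.normal(size=n)

# OLS of y on a constant, x1 and x2.
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ beta
uhat = y - yhat

# R-squared as 1 - SSR/SST.
r2 = 1.0 - (uhat @ uhat) / np.sum((y - y.mean()) ** 2)

# Squared sample correlation between y and the fitted values.
corr_sq = np.corrcoef(y, yhat)[0, 1] ** 2

# Sample correlations between the residuals and each regressor.
corr_u_x1 = np.corrcoef(uhat, x1)[0, 1]
corr_u_x2 = np.corrcoef(uhat, x2)[0, 1]

print(r2, corr_sq)
print(corr_u_x1, corr_u_x2)
```

The same squared correlation can be computed for predictions of the level of y obtained from a log model, which is what makes the comparison in part e) possible even though the two models have different dependent variables.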