
EC403 Applied Econometrics

Unit 3 (Chapter 10 in Text)


Random Regressors and Moment-Based Estimation
Chapter Contents
10.1 Linear Regression with Random xs
10.2 Cases in Which x and e are Correlated
10.3 Estimators Based on the Method of Moments
10.4 Specification Tests
10.1 Linear Regression with Random xs
One classical assumption:
If E(e|x) = 0, then we can show that x and e are uncorrelated,
so that cov(x, e) = 0. Explanatory variables that are
not correlated with the error term are called exogenous
variables.
Relaxation of the above assumption:
Conversely, if x and e are correlated, then cov(x, e) ≠ 0 and we can
show that E(e|x) ≠ 0. Explanatory variables that are correlated
with the error term are called endogenous variables.
If assumption A10.3* is not true, and in particular if cov(x, e) ≠ 0
so that x and e are correlated, then the least squares
estimators are inconsistent.
They do not converge to the true parameter values even in
very large samples.
10.1.2 Large Sample Properties of the Least Squares Estimators
None of our usual hypothesis testing or interval estimation
procedures are valid.
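A short derivation makes the source of the inconsistency explicit. It is a sketch that assumes the simple model $y_i = \beta_1 + \beta_2 x_i + e_i$ and that sample moments converge in probability to their population counterparts. Substituting the model into the least squares slope formula gives:

```latex
\begin{aligned}
b_2 &= \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}
     = \beta_2 + \frac{\tfrac{1}{N}\sum (x_i - \bar{x})\, e_i}{\tfrac{1}{N}\sum (x_i - \bar{x})^2} \\
    &\xrightarrow{\;p\;} \beta_2 + \frac{\operatorname{cov}(x, e)}{\operatorname{var}(x)}
\end{aligned}
```

The second term vanishes only when cov(x, e) = 0, so with an endogenous x the estimator settles on the wrong value no matter how large N becomes.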
10.1.3 Why Least Squares Estimation Fails
FIGURE 10.1 (a) Correlated x and e
FIGURE 10.1 (b) Plot of data, true and fitted regression functions
The statistical consequence of correlation
between x and e is that the least squares estimator
is biased, and this bias will not disappear no
matter how large the sample
Consequently the least squares estimator is
inconsistent when there is correlation between
x and e
10.2 Cases in Which x and e are Correlated
When an explanatory variable and the error term
are correlated, the explanatory variable is said to
be endogenous
This term comes from simultaneous equations
models
It means "determined within the system"
Using this terminology, when an explanatory
variable is correlated with the regression error,
one is said to have an endogeneity problem
10.2.1 Measurement Error
Case 1: Measurement error
The errors-in-variables problem occurs when an
explanatory variable is measured with error
If we measure an explanatory variable with
error, then it is correlated with the error term,
and the least squares estimator is inconsistent
Let y = annual savings and x* = the permanent
annual income of a person
A simple regression model is:
$$y_i = \beta_1 + \beta_2 x_i^* + v_i$$
Eq. 10.1
Current income is a measure of permanent
income, but it does not measure permanent
income exactly.
It is sometimes called a proxy variable
To capture this feature, specify that:
$$x_i = x_i^* + u_i$$
Eq. 10.2
Substituting:
$$y_i = \beta_1 + \beta_2 x_i^* + v_i$$
$$\phantom{y_i} = \beta_1 + \beta_2 (x_i - u_i) + v_i$$
$$\phantom{y_i} = \beta_1 + \beta_2 x_i + (v_i - \beta_2 u_i)$$
$$\phantom{y_i} = \beta_1 + \beta_2 x_i + e_i$$
Eq. 10.3
In order to estimate Eq. 10.3 by least squares, we
must determine whether or not x is uncorrelated
with the random disturbance e
The covariance between these two random
variables, using the fact that E(e) = 0, is:
$$\operatorname{cov}(x_i, e_i) = E(x_i e_i) = E\left[(x_i^* + u_i)(v_i - \beta_2 u_i)\right] = E(-\beta_2 u_i^2) = -\beta_2 \sigma_u^2 \neq 0$$
Eq. 10.4
The least squares estimator $b_2$ is an inconsistent
estimator of $\beta_2$ because of the correlation between
the explanatory variable and the error term
Consequently, $b_2$ does not converge to $\beta_2$ in
large samples
In large or small samples $b_2$ is not
approximately normal with mean $\beta_2$ and
variance
$$\operatorname{var}(b_2) = \frac{\sigma^2}{\sum (x_i - \bar{x})^2}$$
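The errors-in-variables result can be checked with a small simulation. This is only a sketch: the parameter values, the normal distributions and the function names are illustrative assumptions, and the measurement error u is assumed to be independent of both x* and v. The "implied limit" uses the standard large-sample expression β2 + cov(x, e)/var(x) together with Eq. 10.4.

```python
import random

def ols_slope(x, y):
    """Least squares slope: sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

def simulate_measurement_error(n, beta1=1.0, beta2=0.5,
                               sd_xstar=2.0, sd_u=1.0, seed=1):
    """Regress y on the mismeasured x = x* + u and return the OLS slope."""
    rng = random.Random(seed)
    x_obs, y = [], []
    for _ in range(n):
        x_star = rng.gauss(10.0, sd_xstar)  # permanent income (unobserved)
        u = rng.gauss(0.0, sd_u)            # measurement error, Eq. 10.2
        v = rng.gauss(0.0, 1.0)             # error term of Eq. 10.1
        y.append(beta1 + beta2 * x_star + v)
        x_obs.append(x_star + u)
    return ols_slope(x_obs, y)

# Large-sample value implied by Eq. 10.4: beta2 + cov(x, e) / var(x),
# with cov(x, e) = -beta2 * var(u) and var(x) = var(x*) + var(u).
beta2, sd_xstar, sd_u = 0.5, 2.0, 1.0
implied_limit = beta2 + (-beta2 * sd_u ** 2) / (sd_xstar ** 2 + sd_u ** 2)

slope = simulate_measurement_error(100_000)
print("OLS slope:", round(slope, 3), "implied limit:", round(implied_limit, 3))
```

Under these assumed values the slope settles below the true β2 = 0.5, which is the attenuation that the negative covariance in Eq. 10.4 predicts.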

10.2.2 Simultaneous Equations Bias
Case 2: Simultaneous Equations System
Another situation in which an explanatory variable
is correlated with the regression error term arises
in simultaneous equations models
Suppose we write:
$$Q = \beta_1 + \beta_2 P + e$$
Eq. 10.5
There is a feedback relationship between P and Q
Because price and quantity are jointly, or
simultaneously, determined, we can show that
cov(P, e) ≠ 0
The resulting bias (and inconsistency) is called
the simultaneous equations bias
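The feedback between P and Q can be illustrated with a simulated market. The sketch below rests on assumptions that are not in the slides: a hypothetical demand curve Q = 10 - P + e_d, a hypothetical supply curve Q = 2 + P + e_s, and independent standard normal shocks. Equilibrium price depends on both shocks, so P is correlated with the error of either equation.

```python
import random

def ols_slope(x, y):
    """Least squares slope: sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

def simulate_market(n, seed=2):
    """Equilibrium (P, Q) for demand Q = 10 - P + e_d and supply Q = 2 + P + e_s."""
    rng = random.Random(seed)
    prices, quantities = [], []
    for _ in range(n):
        e_d = rng.gauss(0.0, 1.0)  # demand shock
        e_s = rng.gauss(0.0, 1.0)  # supply shock
        # Market clearing: 10 - P + e_d = 2 + P + e_s
        p = (8.0 + e_d - e_s) / 2.0
        q = 2.0 + p + e_s          # quantity read off the supply curve
        prices.append(p)
        quantities.append(q)
    return prices, quantities

prices, quantities = simulate_market(100_000)
slope = ols_slope(prices, quantities)
print("OLS slope of Q on P:", round(slope, 3))
```

In this design cov(P, Q) is zero in the population because the two shock variances are equal, so the least squares slope recovers neither the supply slope (+1) nor the demand slope (-1).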
10.2.3 Omitted Variables
Case 3: Omitted Variable Correlated with x
When an omitted variable is correlated with an
included explanatory variable, then the regression
error will be correlated with the explanatory
variable, making it endogenous
Consider a log-linear regression model explaining
observed hourly wage:
$$\ln(WAGE) = \beta_1 + \beta_2 EDUC + \beta_3 EXPER + \beta_4 EXPER^2 + e$$
Eq. 10.6
What else affects wages? What have we
omitted?
We might expect cov(EDUC, e) ≠ 0
If this is true, then we can expect that the least
squares estimator of the returns to another year
of education will be positively biased,
$E(b_2) > \beta_2$, and inconsistent
The bias will not disappear even in very
large samples
10.2.4 Least Squares Estimation of a Wage Equation
Estimating our wage equation, we have:
$$\widehat{\ln(WAGE)} = -0.5220 + 0.1075\,EDUC + 0.0416\,EXPER - 0.0008\,EXPER^2$$
$$(\text{se})\qquad (0.1986)\qquad (0.0141)\qquad (0.0132)\qquad (0.0004)$$
We estimate that an additional year of education
increases wages by approximately 10.75%,
holding everything else constant
If ability has a positive effect on wages, then
this estimate is overstated, as the
contribution of ability is attributed to the
education variable
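The overstatement can be reproduced in a simulation where "ability" is known to the simulator but left out of the regression. Everything here is an illustrative assumption: the true return to education (0.06), the ability effect (0.10), and the way education depends on ability. The EXPER terms are dropped to keep the sketch short.

```python
import random

def ols_slope(x, y):
    """Least squares slope: sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

def simulate_wages(n, return_to_educ=0.06, ability_effect=0.10, seed=4):
    """Generate (EDUC, ln WAGE) where unobserved ability raises both."""
    rng = random.Random(seed)
    educ, log_wage = [], []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)                       # omitted variable
        years = 12.0 + 2.0 * ability + rng.gauss(0.0, 2.0)  # abler people study longer
        lw = (1.0 + return_to_educ * years
              + ability_effect * ability + rng.gauss(0.0, 0.3))
        educ.append(years)
        log_wage.append(lw)
    return educ, log_wage

educ, log_wage = simulate_wages(100_000)
b2 = ols_slope(educ, log_wage)
print("true return: 0.06, OLS estimate:", round(b2, 4))
```

With these assumed values cov(EDUC, ability)/var(EDUC) = 2/8, so the regression without ability converges to 0.06 + 0.10 × 0.25 = 0.085 rather than 0.06.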
10.3 Estimators Based on the Method of Moments
A moment is a quantitative measure of the shape
of a distribution or set of points, e.g. the mean
(first moment) and the variance (second central moment)
The method of moments is a method of
estimation of population parameters by equating
sample moments with unobservable population
moments and then solving those equations for the
quantities to be estimated.
10.3.1 Method of Moments Estimation of a Population Mean and Variance
The method of moments estimation procedure
equates m population moments to m sample
moments to estimate m unknown parameters
Example:
$$\operatorname{var}(Y) = \sigma^2 = E(Y - \mu)^2 = E(Y^2) - \mu^2$$
Eq. 10.9
The first two population and sample moments of Y
are:
Population Moments            Sample Moments
$E(Y) = \mu_1 = \mu$          $\hat{\mu} = \sum y_i / N$
$E(Y^2) = \mu_2$              $\hat{\mu}_2 = \sum y_i^2 / N$
Eq. 10.10
Solve for the unknown mean and variance
parameters:
$$\hat{\mu} = \frac{\sum y_i}{N} = \bar{y}$$
Eq. 10.11
and
$$\tilde{\sigma}^2 = \hat{\mu}_2 - \hat{\mu}^2 = \frac{\sum y_i^2}{N} - \bar{y}^2 = \frac{\sum y_i^2 - N\bar{y}^2}{N} = \frac{\sum (y_i - \bar{y})^2}{N}$$
Eq. 10.12
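Eqs. 10.11 and 10.12 translate directly into a few lines of code. The function name and the sample values below are made up for illustration; the only substantive point is that the method of moments variance divides by N rather than N - 1.

```python
def method_of_moments_mean_variance(y):
    """Estimate (mu, sigma^2) by equating the first two sample and population moments."""
    n = len(y)
    m1 = sum(y) / n                  # first sample moment, Eq. 10.10
    m2 = sum(v * v for v in y) / n   # second sample moment, Eq. 10.10
    mu_hat = m1                      # Eq. 10.11
    sigma2_tilde = m2 - m1 ** 2      # Eq. 10.12 (divisor N, not N - 1)
    return mu_hat, sigma2_tilde

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data
mu_hat, sigma2_tilde = method_of_moments_mean_variance(sample)
print(mu_hat, sigma2_tilde)  # prints 5.0 4.0
```

For this sample the sum is 40 and the sum of squares is 232, so the moments are 5 and 29 and the variance estimate is 29 - 25 = 4.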
10.3.2 Method of Moments Estimation in the Simple Linear Regression Model
In the linear regression model $y = \beta_1 + \beta_2 x + e$, we
usually assume:
$$E(e_i) = 0 \;\Rightarrow\; E(y_i - \beta_1 - \beta_2 x_i) = 0$$
Eq. 10.13
If x is fixed, or random but not correlated with
e, then:
$$E(xe) = 0 \;\Rightarrow\; E\left[x(y - \beta_1 - \beta_2 x)\right] = 0$$
Eq. 10.14
We have two equations in two unknowns:
$$\frac{1}{N}\sum (y_i - b_1 - b_2 x_i) = 0$$
$$\frac{1}{N}\sum x_i (y_i - b_1 - b_2 x_i) = 0$$
Eq. 10.15
These are equivalent to the least squares normal
equations and their solution is:
$$b_2 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
$$b_1 = \bar{y} - b_2 \bar{x}$$
Eq. 10.16
Under "nice" assumptions, that is, when all the
usual assumptions of the linear model hold, the
method of moments principle of estimation leads us
to the same estimators for the simple linear
regression model as the least squares principle
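The equivalence can be verified numerically by solving the two sample moment conditions as a 2×2 linear system and comparing the answer with the closed-form least squares solution. The data and function names below are hypothetical; this is a minimal sketch, not a general regression routine.

```python
def method_of_moments_regression(x, y):
    """Solve the two sample moment conditions for (b1, b2) by Cramer's rule."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    # Moment conditions rewritten as a linear system:
    #   n*b1  + sx*b2  = sy
    #   sx*b1 + sxx*b2 = sxy
    det = n * sxx - sx * sx
    b1 = (sy * sxx - sx * sxy) / det
    b2 = (n * sxy - sx * sy) / det
    return b1, b2

def least_squares(x, y):
    """Closed-form least squares solution in deviation-from-mean form."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b2 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    return ybar - b2 * xbar, b2

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical data
y = [2.1, 3.9, 6.2, 7.8, 10.1]
mom = method_of_moments_regression(x, y)
ls = least_squares(x, y)
print("method of moments:", mom)
print("least squares:    ", ls)
```

Both routes give the same pair of estimates up to floating-point rounding, and the fitted residuals satisfy both moment conditions by construction.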
If x is random and correlated with the error term,
the method of moments leads to an alternative,
called instrumental variables estimation, or
two-stage least squares estimation, that will
work in large samples
10.3.3 Instrumental Variables Estimation in the Simple Linear Regression Model
IV or 2SLS Estimation
Instrumental Variable
Suppose that there is another variable, z, such that:
1. z does not have a direct effect on y, and thus it
does not belong on the right-hand side of the
model as an explanatory variable
2. z is not correlated with the regression error
term e
Variables with this property are said to be
exogenous
3. z is strongly [or at least not weakly] correlated
with x, the endogenous explanatory variable
A variable z with these properties is called an
instrumental variable
If such a variable z exists, then it can be used to
form the moment condition:
$$E(ze) = 0 \;\Rightarrow\; E\left[z(y - \beta_1 - \beta_2 x)\right] = 0$$
Eq. 10.16
Using Eqs. 10.13 and 10.16, the sample moment
conditions are:
$$\frac{1}{N}\sum (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = 0$$
$$\frac{1}{N}\sum z_i (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = 0$$
Eq. 10.17
Solving these equations leads us to method of
moments estimators, which are usually called the
instrumental variable (IV) estimators:
$$\hat{\beta}_2 = \frac{N\sum z_i y_i - \sum z_i \sum y_i}{N\sum z_i x_i - \sum z_i \sum x_i} = \frac{\sum (z_i - \bar{z})(y_i - \bar{y})}{\sum (z_i - \bar{z})(x_i - \bar{x})}$$
$$\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x}$$
Eq. 10.18
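A simulated example shows the IV slope of Eq. 10.18 at work next to least squares. The data-generating process is an assumption chosen for illustration: a common shock c enters both x and e (making x endogenous), while z affects x but is independent of e. All function names and numbers are hypothetical.

```python
import random

def ols_slope(x, y):
    """Least squares slope: sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    return (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
            / sum((xi - xbar) ** 2 for xi in x))

def iv_slope(z, x, y):
    """IV slope, Eq. 10.18: sum((z - zbar)(y - ybar)) / sum((z - zbar)(x - xbar))."""
    n = len(z)
    zbar, xbar, ybar = sum(z) / n, sum(x) / n, sum(y) / n
    num = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
    den = sum((zi - zbar) * (xi - xbar) for zi, xi in zip(z, x))
    return num / den

def simulate(n, beta1=1.0, beta2=1.0, seed=5):
    """Draw (z, x, y) with x endogenous and z a valid instrument."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0.0, 1.0)                   # instrument
        c = rng.gauss(0.0, 1.0)                    # shock shared by x and e
        xi = 0.8 * zi + c + rng.gauss(0.0, 0.6)    # var(x) = 0.64 + 1 + 0.36 = 2
        ei = c + rng.gauss(0.0, 1.0)               # cov(x, e) = 1, cov(z, e) = 0
        z.append(zi)
        x.append(xi)
        y.append(beta1 + beta2 * xi + ei)
    return z, x, y

z, x, y = simulate(100_000)
b2_ols = ols_slope(x, y)
b2_iv = iv_slope(z, x, y)
print("true slope 1.0 | OLS:", round(b2_ols, 3), "| IV:", round(b2_iv, 3))
```

In this design least squares converges to 1 + cov(x, e)/var(x) = 1.5, while the IV estimator converges to the true slope of 1.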


These new estimators have the following
properties:
They are consistent, if z is exogenous, with
E(ze) = 0
In large samples the instrumental variable
estimators have approximate normal
distributions
In the simple regression model:
$$\hat{\beta}_2 \sim N\!\left(\beta_2,\; \frac{\sigma^2}{r_{zx}^2 \sum (x_i - \bar{x})^2}\right)$$
Eq. 10.19
The error variance is estimated using the estimator:
$$\hat{\sigma}^2_{IV} = \frac{\sum (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i)^2}{N - 2}$$
Note that we can write the variance of the instrumental
variables estimator of $\beta_2$ as:
$$\operatorname{var}(\hat{\beta}_2) = \frac{\sigma^2}{r_{zx}^2 \sum (x_i - \bar{x})^2} = \frac{\operatorname{var}(b_2)}{r_{zx}^2}$$
Because $r_{zx}^2 < 1$, the variance of the instrumental variables
estimator will always be larger than the variance of
the least squares estimator, and thus it is said to be
less efficient
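A quick worked calculation shows how fast efficiency is lost as the instrument weakens. The two correlation values are hypothetical, and var(b_2) denotes the least squares variance formula σ²/Σ(x_i - x̄)²:

```latex
\begin{aligned}
r_{zx} = 0.5:&\quad \operatorname{var}(\hat{\beta}_2) = \frac{\operatorname{var}(b_2)}{0.5^2} = 4\,\operatorname{var}(b_2)
  \;\Rightarrow\; \operatorname{se}(\hat{\beta}_2) = 2\,\operatorname{se}(b_2) \\
r_{zx} = 0.1:&\quad \operatorname{var}(\hat{\beta}_2) = \frac{\operatorname{var}(b_2)}{0.1^2} = 100\,\operatorname{var}(b_2)
  \;\Rightarrow\; \operatorname{se}(\hat{\beta}_2) = 10\,\operatorname{se}(b_2)
\end{aligned}
```

This is why the third requirement on z, a strong correlation with x, matters in practice: a weak instrument buys consistency at the price of very imprecise estimates.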

10.3.4 Instrumental Variables Estimation in the Multiple Regression Model
To extend our analysis to a more general setting,
consider the multiple regression model:
$$y = \beta_1 + \beta_2 x_2 + \cdots + \beta_K x_K + e$$
Let $x_K$ be an endogenous variable correlated
with the error term
The first K - 1 variables are exogenous
variables that are uncorrelated with the error
term e; they are included instruments
We can estimate this equation in two steps with a least
squares estimation in each step
The first stage regression has the endogenous variable
$x_K$ on the left-hand side, and all exogenous and
instrumental variables on the right-hand side
The first stage regression is:
$$x_K = \gamma_1 + \gamma_2 x_2 + \cdots + \gamma_{K-1} x_{K-1} + \theta_1 z_1 + \cdots + \theta_L z_L + v_K$$
Eq. 10.20
The least squares fitted value is:
$$\hat{x}_K = \hat{\gamma}_1 + \hat{\gamma}_2 x_2 + \cdots + \hat{\gamma}_{K-1} x_{K-1} + \hat{\theta}_1 z_1 + \cdots + \hat{\theta}_L z_L$$
Eq. 10.21
The second stage regression is based on the
original specification:
$$y = \beta_1 + \beta_2 x_2 + \cdots + \beta_K \hat{x}_K + e^*$$
Eq. 10.22
The least squares estimators from this equation
are the instrumental variables (IV) estimators
Because they can be obtained by two least
squares regressions, they are also popularly
known as the two-stage least squares (2SLS)
estimators
We will refer to them as IV or 2SLS or
IV/2SLS estimators
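The two-step procedure can be sketched with nothing more than a small least squares helper. The example below is an illustration under assumed values: a model with one exogenous regressor x2, one endogenous regressor x3, and two instruments z1 and z2; the helper `ols` solves the normal equations by Gauss-Jordan elimination and is not meant as a production routine.

```python
import random

def ols(X, y):
    """Least squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def simulate_data(n, seed=3):
    """y = 1 + 2*x2 + 0.5*x3 + e, with x3 endogenous through the shock c."""
    rng = random.Random(seed)
    x2, x3, z1, z2, y = [], [], [], [], []
    for _ in range(n):
        a, i1, i2 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        c = rng.gauss(0, 1)                       # shared by x3 and e
        endog = 0.5 * a + 0.7 * i1 + 0.7 * i2 + c + rng.gauss(0, 0.5)
        e = c + rng.gauss(0, 1)
        x2.append(a); z1.append(i1); z2.append(i2); x3.append(endog)
        y.append(1.0 + 2.0 * a + 0.5 * endog + e)
    return x2, x3, z1, z2, y

def two_stage_least_squares(x2, x3, z1, z2, y):
    """Stage 1: x3 on (1, x2, z1, z2). Stage 2: y on (1, x2, fitted x3)."""
    first_X = [[1.0, a, i1, i2] for a, i1, i2 in zip(x2, z1, z2)]
    gamma = ols(first_X, x3)
    x3_hat = [sum(g * v for g, v in zip(gamma, row)) for row in first_X]
    second_X = [[1.0, a, xh] for a, xh in zip(x2, x3_hat)]
    return ols(second_X, y)

x2, x3, z1, z2, y = simulate_data(20_000)
b_ols = ols([[1.0, a, b] for a, b in zip(x2, x3)], y)
b_2sls = two_stage_least_squares(x2, x3, z1, z2, y)
print("OLS :", [round(b, 3) for b in b_ols])
print("2SLS:", [round(b, 3) for b in b_2sls])
```

The sketch returns coefficient estimates only. Standard errors taken from a plain second-stage regression would be based on residuals computed with the fitted values rather than the original endogenous variable, so they should not be used as they are.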
10.3.4a Using Surplus Instruments in Simple Regression
In the simple regression, if x is endogenous and
we have L instruments, the first stage fitted value is:
$$\hat{x} = \hat{\gamma}_1 + \hat{\theta}_1 z_1 + \cdots + \hat{\theta}_L z_L$$
The two sample moment conditions are:
$$\frac{1}{N}\sum (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = 0$$
$$\frac{1}{N}\sum \hat{x}_i (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = 0$$
Solving using the fact that $\bar{\hat{x}} = \bar{x}$, we get:
$$\hat{\beta}_2 = \frac{\sum (\hat{x}_i - \bar{\hat{x}})(y_i - \bar{y})}{\sum (\hat{x}_i - \bar{\hat{x}})(x_i - \bar{x})} = \frac{\sum (\hat{x}_i - \bar{x})(y_i - \bar{y})}{\sum (\hat{x}_i - \bar{x})(x_i - \bar{x})}$$
$$\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x}$$
Estimation Issue 1: Validity of Instruments
10.3.5a One Instrumental Variable
The first stage regression is a key tool in assessing
whether an instrument is strong or weak in the
multiple regression setting
Suppose the first stage regression equation is:
$$x_K = \gamma_1 + \gamma_2 x_2 + \cdots + \gamma_{K-1} x_{K-1} + \theta_1 z_1 + v_K$$
Eq. 10.24
The key to assessing the strength of the
instrumental variable $z_1$ is the strength of its
relationship to $x_K$ after controlling for the effects of
all the other exogenous variables
With more than one instrument, suppose the first stage regression equation is:
$$x_K = \gamma_1 + \gamma_2 x_2 + \cdots + \gamma_{K-1} x_{K-1} + \theta_1 z_1 + \cdots + \theta_L z_L + v_K$$
Eq. 10.25
We require that at least one of the instruments be
strong
Using FATHEREDUC and MOTHEREDUC, the first stage
equation is:
$$EDUC = \gamma_1 + \gamma_2 EXPER + \gamma_3 EXPER^2 + \theta_1 MOTHEREDUC + \theta_2 FATHEREDUC + v$$
Table 10.1 First-Stage Equation
Obtain the predicted values of education from the first
stage equation and insert them into the log-linear wage
equation to replace EDUC
Then estimate the resulting equation by least squares
The IV/2SLS estimates are:
$$\widehat{\ln(WAGE)} = 0.0481 + 0.0614\,EDUC + 0.0442\,EXPER - 0.0009\,EXPER^2$$
$$(\text{se})\qquad (0.4003)\qquad (0.0314)\qquad (0.0134)\qquad (0.0004)$$
To compare, the IV estimates of the log-linear wage
equation using MOTHEREDUC as the only instrument are:
$$\widehat{\ln(WAGE)} = 0.1982 + 0.0493\,EDUC + 0.0449\,EXPER - 0.0009\,EXPER^2$$
$$(\text{se})\qquad (0.4729)\qquad (0.0374)\qquad (0.0136)\qquad (0.0004)$$
Estimation Issue 2: Identification
10.3.8 Instrumental Variables Estimation in a General Model
The multiple regression model, including all K
variables, is:
$$y = \underbrace{\beta_1 + \beta_2 x_2 + \cdots + \beta_G x_G}_{G \text{ exogenous variables}} + \underbrace{\beta_{G+1} x_{G+1} + \cdots + \beta_K x_K}_{B \text{ endogenous variables}} + e$$
Think of G = Good explanatory variables, B = Bad
explanatory variables and L = Lucky instrumental
variables
It is a necessary condition for IV estimation that
L ≥ B
If L = B then there are just enough instrumental
variables to carry out IV estimation
The model parameters are said to be just identified
or exactly identified in this case
The term identified is used to indicate that the
model parameters can be consistently estimated
If L > B then we have more instruments than are
necessary for IV estimation, and the model is said
to be overidentified
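The counting rule is simple enough to encode as a helper. The function name and the label "not identified" for the case L < B are choices made for this sketch; the slides only state that L ≥ B is necessary.

```python
def identification_status(num_instruments, num_endogenous):
    """Classify a model by comparing L (instruments) with B (endogenous regressors)."""
    if num_instruments < num_endogenous:
        return "not identified"      # necessary condition L >= B fails
    if num_instruments == num_endogenous:
        return "just identified"     # also called exactly identified
    return "overidentified"

# Wage equation: EDUC is the single endogenous regressor (B = 1)
print(identification_status(1, 1))  # MOTHEREDUC only
print(identification_status(2, 1))  # MOTHEREDUC and FATHEREDUC
```

With both parents' education as instruments for EDUC, L = 2 and B = 1, so the wage equation is overidentified. The rule is only a necessary condition; the instruments must also be exogenous and sufficiently strong.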
