EXT 5-2
Web Extension 5
SUR here as a stepping stone to the next section, in which we examine simultaneous equations models with disturbances that are correlated across equations.
The DGP
To illustrate a system of seemingly unrelated equations, consider a sample of
firms, all of which produce two products, quilts and mattresses. The per unit cost
of producing quilts, $C^q_i$, depends on the price of linen, $P^l_i$, and the price of dyes, $P^d_i$.
The per unit cost of producing mattresses, $C^m_i$, depends on the price of linen and
the price of foam, $P^f_i$.
Thus we have a DGP with two equations:

$$C^q_i = \alpha_0 + \alpha_1 P^l_i + \alpha_2 P^d_i + \varepsilon^q_i \qquad (14.14)$$

and

$$C^m_i = \beta_0 + \beta_1 P^l_i + \beta_2 P^f_i + \varepsilon^m_i. \qquad (14.15)$$
Suppose, however, we knew the correlation between the disturbances and the
variances of the disturbances, and we knew the actual value of the first disturbance in the quilt equation, but not the value of the first disturbance in the mattress equation. Let us further assume the first observation has explanators that
take on the sample average values. Would we use OLS to estimate the mattress
equation? Not if we think this observation lies closer to the true line than do
other observations. The assumed information about the quilt disturbance, coupled with the information about the covariance between the disturbances, undermines the presumption that all observations are equally likely to have disturbances of a given size, and therefore undermines the rationale for the optimality
of OLS. Suppose, for example, that the disturbance on the first observation for
the cost of quilts is very small in magnitude, suggesting that the underlying disturbance in the mattresses equation for the first observation is also very small. Because this observation probably lies closer to the true line than we would otherwise think, we want to use that information in estimating the line.
How does this intuition apply in the actual circumstances in which we do not
know the size of a given residual? The choice of weighting the first observation in
the mattress regression will depend in part on how well we think we can mimic
the first observation's disturbance in the quilt regression. If we can mimic the disturbances particularly well, we may decide to consider that information when we
weight the first observation in the mattress equation. Because unbiased estimation
will impose constraints on the weights we use, weighting one observation more
heavily than in OLS requires weighting another observation less heavily, if we are
to maintain unbiasedness.
Residuals mimic disturbances, but they are not equal to them. How reliably
residuals mimic the disturbances for a given observation depends, in part, on how
well we estimate the true parameters (the variances of the parameter estimators),
and, in part, on how far the explanators for that observation are from their average values. (We are less confident in our estimates of the expected value of Y for
Xs far from the mean Xs.) Therefore, how the quilt information influences our
mattress equation weight for the first observation will depend on the values of the
explanators in the quilt equation. Similarly, how much the mattress information influences our weight for the first observation in the quilt equation depends on the values of the explanators in the mattress equation.
There are special cases in which OLS efficiently estimates seemingly unrelated
regressions. For one, if both equations suggest the same weights for an explanator, there is no reason to choose one over the other. One special case is that in
which the explanators in one equation are identical to those in the other equations; OLS is the efficient estimator for seemingly unrelated regressions in this
case. A further special case applies when an equation contains only a subset of the
explanators that appear in other equations. If that subset takes on the same values
in every equation, then OLS efficiently estimates the equation containing the subset of explanators.
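The identical-explanators case can be checked numerically. The sketch below is not from the text; it assumes NumPy and uses made-up data. It estimates two equations that share the same explanators, once by equation-by-equation OLS and once by FGLS on the stacked system, and confirms the two sets of estimates coincide.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200

# Two equations with IDENTICAL explanators and correlated disturbances.
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
e = rng.multivariate_normal([0, 0], [[1.0, 0.9], [0.9, 1.0]], size=n)
y1 = X @ np.array([1.0, 2.0, 3.0]) + e[:, 0]
y2 = X @ np.array([4.0, 5.0, 6.0]) + e[:, 1]

# OLS equation by equation.
b1 = np.linalg.lstsq(X, y1, rcond=None)[0]
b2 = np.linalg.lstsq(X, y2, rcond=None)[0]

# SUR/FGLS on the stacked system, using the OLS residual covariance.
r = np.column_stack([y1 - X @ b1, y2 - X @ b2])
S = r.T @ r / n
Xs = np.block([[X, np.zeros_like(X)], [np.zeros_like(X), X]])
ys = np.concatenate([y1, y2])
Oinv = np.kron(np.linalg.inv(S), np.eye(n))
b_sur = np.linalg.solve(Xs.T @ Oinv @ Xs, Xs.T @ Oinv @ ys)

# With identical explanators in both equations, SUR reproduces OLS exactly.
print(np.allclose(b_sur, np.concatenate([b1, b2])))
```

The cross-equation weighting cancels algebraically when the design matrices are identical, which is why the FGLS solution collapses to OLS here.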
The standard procedure for estimating seemingly unrelated regressions is a
variant of the feasible generalized least squares (FGLS) strategy developed in
Chapters 10 and 11:
1. Estimate the equations using ordinary least squares.
2. Use the least squares residuals from (1) to estimate the variances and contemporaneous covariance of the disturbances:
$$s_q^2 = \frac{1}{n - k_q - 1}\sum e_{qi}^2,$$

$$s_m^2 = \frac{1}{n - k_m - 1}\sum e_{mi}^2,$$

and

$$s_{qm} = \frac{1}{n - \max(k_q, k_m) - 1}\sum e_{qi}e_{mi},$$
where the subscripts q and m refer to the equations being estimated, for example the quilt and mattress cost equations, Equations 14.14 and 14.15.
3. Combine the regressions into one large equation, but allow the coefficient on
each variable to differ across the several equations. For example, in the quilt
and mattress example, we replace Equations 14.14 and 14.15 with the equivalent model

$$C_j = \alpha_0 D_{qj} + \alpha_1 P^{*l}_j + \alpha_2 P^{*d}_j + \beta_0 D_{mj} + \beta_1 P^{**l}_j + \beta_2 P^{*f}_j + e_j, \qquad (14.16)$$

in which

$C_j = C^q_j$ for $j = 1, \dots, n$ and $C_j = C^m_{j-n}$ for $j = n+1, \dots, 2n$;

$D_{qj} = 1$ for $j = 1, \dots, n$ and $= 0$ otherwise;

$D_{mj} = 1$ for $j = n+1, \dots, 2n$ and $= 0$ otherwise;

$P^{*l}_j = P^l_j$ for $j = 1, \dots, n$ and $= 0$ otherwise;

$P^{*d}_j = P^d_j$ for $j = 1, \dots, n$ and $= 0$ otherwise;

$P^{**l}_j = P^l_{j-n}$ for $j = n+1, \dots, 2n$ and $= 0$ otherwise;

$P^{*f}_j = P^f_{j-n}$ for $j = n+1, \dots, 2n$ and $= 0$ otherwise;

$e_j = \varepsilon^q_j$ for $j = 1, \dots, n$ and $e_j = \varepsilon^m_{j-n}$ for $j = n+1, \dots, 2n$.
This is called a stacked regression because several equations are stacked together in a single regression equation. OLS applied to Equation 14.16 obtains
the same results as OLS applied to Equations 14.14 and 14.15 separately.
4. Perform GLS on the stacked regression, using the estimated variances and covariances in place of the actual.
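The four steps above can be sketched in a few lines of linear algebra. This illustration is not from the text: it uses simulated data patterned on the quilt and mattress example, the variable names are invented, and it uses a simple divisor of $n$ in step 2 rather than the degrees-of-freedom corrections shown in the formulas.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated data mimicking Equations 14.14-14.15: two cost equations that
# share the linen price and have contemporaneously correlated disturbances.
P_linen = rng.uniform(1, 3, n)
P_dye = rng.uniform(1, 3, n)
P_foam = rng.uniform(1, 3, n)
cov = np.array([[1.0, 0.8], [0.8, 1.0]])          # disturbance covariance
e = rng.multivariate_normal([0, 0], cov, size=n)  # (e_q, e_m) pairs
C_q = 1.0 + 2.0 * P_linen + 0.5 * P_dye + e[:, 0]
C_m = 3.0 + 1.0 * P_linen + 1.5 * P_foam + e[:, 1]

# Step 1: estimate each equation by OLS.
Xq = np.column_stack([np.ones(n), P_linen, P_dye])
Xm = np.column_stack([np.ones(n), P_linen, P_foam])
bq = np.linalg.lstsq(Xq, C_q, rcond=None)[0]
bm = np.linalg.lstsq(Xm, C_m, rcond=None)[0]

# Step 2: variances and contemporaneous covariance from the OLS residuals
# (simple divisor n here; the text's formulas use d.f. corrections).
eq = C_q - Xq @ bq
em = C_m - Xm @ bm
S = np.array([[eq @ eq, eq @ em],
              [eq @ em, em @ em]]) / n

# Step 3: stack the two equations into one block-diagonal regression.
X = np.block([[Xq, np.zeros_like(Xm)],
              [np.zeros_like(Xq), Xm]])
y = np.concatenate([C_q, C_m])

# Step 4: FGLS on the stacked regression with Omega = S kron I_n.
Oinv = np.kron(np.linalg.inv(S), np.eye(n))
b_sur = np.linalg.solve(X.T @ Oinv @ X, X.T @ Oinv @ y)
print(np.round(b_sur, 2))
```

The first three elements of `b_sur` estimate the quilt equation's coefficients and the last three the mattress equation's; production code would exploit the Kronecker structure instead of building the full $2n \times 2n$ weighting matrix.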
Although FGLS can improve the efficiency with which each equation is estimated, there is a risk incurred by treating equations as a seemingly unrelated system. Specification errors, such as omitted variables, in one equation can bias the
coefficient estimates in all the equations. OLS applied one equation at a time will
be biased when applied to the misspecified equations, but the other equations will
be unbiasedly estimated. Many researchers shy away from SUR estimation because they do not want to risk tainting all their estimates with the misspecification of a single equation.
14.11 HOW DO WE MAKE AN ESTIMATOR?

Three-stage least squares (3SLS) combines 2SLS with the seemingly unrelated regressions strategy:
1. Estimate the reduced form equations by OLS and form fitted values for each
troublesome explanator based on that variable's reduced form equation.
2. Replace each troublesome explanator in each equation by its fitted value
from (1) and perform OLS for each structural equation. As in the second step
of the seemingly unrelated regressions procedure, estimate the variances and
contemporaneous covariances of the equations' disturbances, this time using
the 2SLS residuals instead of the OLS residuals.
3. Perform FGLS as for seemingly unrelated regressions, using the estimated
variances and contemporaneous correlations among the disturbances of the
several structural equations, but replace any endogenous explanators with
their fitted values from (1) before performing the SUR estimation.
Three-stage least squares consistently estimates systems of identified equations but does not consistently estimate systems of equations that include one or
more underidentified equations. Consequently, before beginning the three steps of
3SLS, we must determine which equations are underidentified, and not include
them in the 3SLS procedure.
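Under those caveats, the three stages can be sketched as follows for a hypothetical two-equation supply-and-demand system. This is simulated data, not the whiting data of the tables below, and the variable names (`inc`, `wthr`) are invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400

# Hypothetical structural system (both equations are exactly identified):
#   demand: q = 10 - 1.0*p + 1.0*inc  + e_d
#   supply: q =  2 + 0.5*p + 1.0*wthr + e_s
inc, wthr = rng.normal(size=n), rng.normal(size=n)
e = rng.multivariate_normal([0, 0], [[1.0, 0.3], [0.3, 1.0]], size=n)
# Equilibrium price and quantity solve the two structural equations.
p = (10 - 2 + inc - wthr + e[:, 0] - e[:, 1]) / 1.5
q = 2 + 0.5 * p + wthr + e[:, 1]

Z = np.column_stack([np.ones(n), inc, wthr])  # all predetermined variables

# Stage 1: reduced form for the troublesome explanator p; fitted values.
p_hat = Z @ np.linalg.lstsq(Z, p, rcond=None)[0]

# Stage 2: 2SLS equation by equation, then the residual covariance
# (2SLS residuals are computed with the actual p, not p_hat).
Xd = np.column_stack([np.ones(n), p_hat, inc])
Xs = np.column_stack([np.ones(n), p_hat, wthr])
bd = np.linalg.lstsq(Xd, q, rcond=None)[0]
bs = np.linalg.lstsq(Xs, q, rcond=None)[0]
ed = q - np.column_stack([np.ones(n), p, inc]) @ bd
es = q - np.column_stack([np.ones(n), p, wthr]) @ bs
S = np.cov(np.column_stack([ed, es]), rowvar=False)

# Stage 3: FGLS on the stacked system, endogenous p replaced by p_hat.
X = np.block([[Xd, np.zeros_like(Xs)], [np.zeros_like(Xd), Xs]])
y = np.concatenate([q, q])
Oinv = np.kron(np.linalg.inv(S), np.eye(n))
b_3sls = np.linalg.solve(X.T @ Oinv @ X, X.T @ Oinv @ y)
print(np.round(b_3sls, 2))
```

The estimates in `b_3sls` are ordered demand constant, price, and income, then supply constant, price, and weather; they should lie near the structural values used in the simulation.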
Notice that in estimating the variances and cross-equation correlations of the
disturbances, 3SLS relies on fitted values of the reduced forms in place of the endogenous variables themselves. But these OLS reduced form estimates do not incorporate the information embodied in the exclusion restrictions of the structural
equations. This observation points the way to yet another estimator for simultaneous equations, full information maximum likelihood (FIML). We'll discuss the FIML estimator further after we explore an example of 3SLS.
Table 14.5 3SLS Estimation of the Supply and Demand for Whiting

System: SANDD
Estimation Method: Three-Stage Least Squares
Date: 11/09/02  Time: 16:30
Sample: 1 111
Included observations: 111
Total system (balanced) observations: 222
Linear estimation after one-step weighting matrix

DEMAND       Coefficient   Std. Error   t-Statistic   Prob.
constant     8.527301      0.150803     56.54589      0.0000
PRICE        0.942795      0.335759     2.807956      0.0055
Day1         0.023597      0.200875     0.117470      0.9066
Day2         0.508797      0.195435     2.603406      0.0099
Day3         0.552579      0.200147     2.760865      0.0063
Day4         0.091847      0.195392     0.470066      0.6388

SUPPLY       Coefficient   Std. Error   t-Statistic   Prob.
constant     2.454753      5.599931     0.438354      0.6616
QTY          0.046738      0.120080     0.389219      0.6975
Windspd      1.011353      3.879499     0.260692      0.7946
Rainy        0.008251      0.088971     0.092734      0.9262
Cold         0.044219      0.074296     0.595178      0.5524
Stormy       0.377246      0.136818     2.757289      0.0063
Mixed        0.202540      0.091789     2.206580      0.0284
Windspd2     0.148608      0.672736     0.220901      0.8254

Determinant residual covariance 0.050408

Demand equation:   R-squared 0.182118        Mean dependent var 8.523430
                   Adjusted R-squared 0.143172   S.D. dependent var 0.741672
                   S.E. of regression 0.686529   Sum squared resid 49.48880
                   Durbin-Watson stat 1.341118
Supply equation:   R-squared 0.197158        Mean dependent var 0.193681
                   Adjusted R-squared 0.142596   S.D. dependent var 0.381935
                   S.E. of regression 0.353657   Sum squared resid 12.88252
                   Durbin-Watson stat 0.727180
Table 14.6 FIML Estimates of the Supply and Demand for Whiting

System: SANDD
Estimation Method: Full Information Maximum Likelihood (Marquardt)
Date: 11/09/02  Time: 16:37
Sample: 1 111
Included observations: 111
Total system (balanced) observations: 222
Convergence achieved after 31 iterations

DEMAND       Coefficient   Std. Error   z-Statistic   Prob.
constant     8.535494      0.187009     45.64224      0.0000
PRICE        0.936917      0.370707     2.527375      0.0115
Day1         0.003248      0.223671     0.014522      0.9884
Day2         0.521458      0.225277     2.314741      0.0206
Day3         0.561473      0.236276     2.376338      0.0175
Day4         0.097179      0.253373     0.383541      0.7013

SUPPLY       Coefficient   Std. Error   z-Statistic   Prob.
constant     3.087104      5.762400     0.535732      0.5921
QTY          0.100217      0.147332     0.680212      0.4964
Windspd      1.112579      3.936755     0.282613      0.7775
Rainy        0.010789      0.093329     0.115603      0.9080
Cold         0.048856      0.089252     0.547389      0.5841
Stormy       0.392793      0.157426     2.495097      0.0126
Mixed        0.214600      0.106364     2.017603      0.0436
Windspd2     0.163582      0.687638     0.237889      0.8120

Log likelihood 144.3302
Determinant residual covariance 0.055259

Demand equation:   R-squared 0.183640        Mean dependent var 8.523430
                   Adjusted R-squared 0.144766   S.D. dependent var 0.741672
                   S.E. of regression 0.685890   Sum squared resid 49.39672
                   Durbin-Watson stat 1.346313
Supply equation:   R-squared 0.137879        Mean dependent var 0.193681
                   Adjusted R-squared 0.079289   S.D. dependent var 0.381935
                   S.E. of regression 0.366480   Sum squared resid 13.83371
                   Durbin-Watson stat 0.795456
Estimating the structural equation with its associated reduced form equations is
less apt to contaminate a well-specified structural equation with the misspecification of another equation, because the reduced form equations are much less likely
to be misspecified than the various structural equations. There is no risk of mistakenly omitting a relevant exogenous variable in a reduced form equation because reduced form equations always include all of the exogenous variables. Structural
equations, in contrast, are often misspecified by unwarranted exclusions.
LIML performs FIML on a subset of equations. An intuitive attraction of
LIML is that it can exploit any correlations between one structural equation's
disturbances and the disturbances of the associated reduced form equations. Despite this seeming advantage in theory, LIML and 2SLS have the same asymptotic
properties. In practice, Monte Carlo studies in the literature suggest that LIML
approaches its asymptotic normal distribution more quickly than 2SLS and that
LIML often outperforms 2SLS in small samples. Nonetheless, 2SLS remains the
limited information estimator used more often, perhaps because its theoretical
small-sample statistical properties are somewhat more attractive than those of
LIML, or perhaps because it is easier to implement in some econometric software
packages.
An Organizational Structure
for the Study of Econometrics
1. What Is the DGP?
2. What Makes a Good Estimator?
3. How Do We Create an Estimator?
Limited information methods:
Indirect least squares (ILS)
Two-stage least squares (2SLS)
Limited information maximum likelihood (LIML)
Seemingly unrelated regression estimation (SUR)
Full information methods:
Three-stage least squares (3SLS)
Full information maximum likelihood (FIML)
Summary
This extension of Chapter 14 first introduces an estimation procedure for jointly
estimating several nonsimultaneous regression equations when their disturbances
are correlated. It then introduces two more estimators, three-stage least squares
(3SLS) and full information maximum likelihood (FIML) estimators, which estimate all the parameters of a system of simultaneous equations jointly. Although
3SLS and FIML are more efficient than 2SLS, they also risk transmitting specification biases across equations. A correctly specified equation is consistently estimated by 2SLS, even if all the other equations in the model are misspecified. But
3SLS and FIML estimates of properly specified equations may be biased by a single misspecified equation in the system.
and (iii) a wage equation in which wages depend on current and lagged national
product (Y) and a time trend (T):
$$W_t = \gamma_0 + \gamma_1 Y_t + \gamma_2 Y_{t-1} + \gamma_3 T_t + \varepsilon^W_t.$$
The three identities in Model I are: (i) national private product (Y) equals consumption plus investment plus government spending (G) minus government wages (Wg):
$$Y_t = C_t + I_t + G_t - W^g_t;$$
(ii) national income (N), which equals national product minus net exports and taxes
(X), equals wages plus property income:
$$N_t = Y_t - X_t = W^p_t + P_t;$$
and (iii) the change in the capital stock equals investment:

$$K_t = K_{t-1} + I_{t-1}.$$
The exogenous variables in Klein's system were government spending (G), government wages ($W^g$), indirect business taxes plus net exports (X), and a time trend (T).
Additional predetermined variables were the capital stock, which was measured at
the beginning of the year (K), lagged property income ($P_{t-1}$), and lagged national
product ($Y_{t-1}$). The file Klein1.*** contains the data with which to estimate Klein's
model.
a. Estimate each structural equation in Kleins model by OLS, 2SLS, 3SLS, and
FIML. Briefly compare the results from the four procedures.
b. Using the 2SLS residuals, test any overidentifying restrictions in Klein's three
structural equations.
c. Estimate the reduced form equations for the Klein model, one for each endogenous variable. Use the estimated reduced form to assess the effect of increased
government spending on the level of national income. How do the structural equations add to your understanding of the effect of government spending on national
income beyond what you learned from the reduced form?
Endnotes
1. Arnold Zellner, "An Efficient Method of Estimating Seemingly Unrelated Regressions
and Tests of Aggregation Bias," Journal of the American Statistical Association 57
(1962): 500–509.
Appendix 14.A
A Matrix Algebra Representation of
Systems of Equations
Just as matrix algebra provides compact representation and manipulation of the
data for a single equation, it also provides a compact representation and manipulation of the data for a system of equations. This appendix uses matrix algebra to
examine the identification of equations within systems of equations. The matrix
algebra for representing two-stage least squares (2SLS), the most common procedure for estimating identified equations, is in Appendix 13.B.
14.A.1
A System of Equations
This appendix examines a system of G equations, each accounting for one endogenous variable. The system also contains (K + 1) predetermined variables.
Predetermined variables include both exogenous variables, those determined outside of the system, and lagged dependent variables. This appendix limits its attention to DGPs in which all predetermined variables are nontroublesome. This excludes DGPs with troublesome lagged dependent variable explanators.
We would ordinarily write the structural equation that determines $Y_1$, the
first endogenous variable, as

$$Y_1 = -\sum_{j=2}^{G}\gamma_{j1}Y_j + \sum_{j=0}^{K}\beta_{j1}X_j + E_1, \qquad (14.A.1)$$

if the first endogenous variable depended on both the other endogenous variables
and the predetermined variables. Some coefficients in Equation 14.A.1 might be
equal to zero. Equation 14.A.1 is the starting point for describing a system of
equations. This section adapts Equation 14.A.1 to describe such a system.
The analogous structural equation for the $i$-th endogenous variable is

$$Y_i = -\sum_{j \ne i}\gamma_{ji}Y_j + \sum_{j=0}^{K}\beta_{ji}X_j + E_i. \qquad (14.A.2)$$

Gathering all the endogenous variables on the left-hand side and normalizing the coefficient on $Y_i$ in its own equation to one (the rearrangement recorded in Equations 14.A.3 and 14.A.4) yields

$$\sum_{j=1}^{G}\gamma_{ji}Y_j = \sum_{j=0}^{K}\beta_{ji}X_j + E_i. \qquad (14.A.5)$$

Define
$$Y = [Y_1\ Y_2\ \cdots\ Y_G], \quad X = [X_0\ X_1\ \cdots\ X_K], \quad E = [E_1\ E_2\ \cdots\ E_G],$$

in which the $Y_i$, the $X_i$, and the $E_i$ are $(T \times 1)$ column vectors containing the observations for the corresponding variables, and

$$\Gamma_i = [\gamma_{1i}\ \gamma_{2i}\ \cdots\ \gamma_{Gi}]', \qquad B_i = [\beta_{0i}\ \beta_{1i}\ \cdots\ \beta_{Ki}]'.$$

Notice that in $\Gamma_i$, $\gamma_{ii} = 1$. We can rewrite Equation 14.A.5 as

$$Y\Gamma_i = XB_i + E_i.$$
An even more compact notation combines all G structural equations into one matrix formulation. Define

$$\Gamma = [\Gamma_1\ \Gamma_2\ \cdots\ \Gamma_G]$$

and

$$B = [B_1\ B_2\ \cdots\ B_G]$$

and write

$$Y\Gamma = XB + E, \qquad (14.A.6a)$$

or

$$Y\Gamma - XB - E = 0. \qquad (14.A.6b)$$
Equation 14.A.6a is a matrix representation of G structural equations with (K + 1) predetermined variables in the system. The matrix $(Y\Gamma - XB - E)$ in Equation
14.A.6b is a $(T \times G)$ matrix. Its columns contain the G structural relationships,
with each row corresponding to one observation on all G structural relationships.
Postmultiplying both sides of Equation 14.A.6a by $\Gamma^{-1}$ gives

$$Y = X\Pi + N, \qquad (14.A.7)$$

where $\Pi = B\Gamma^{-1}$ and $N = E\Gamma^{-1}$. Equation 14.A.7 is the reduced form for the
model in Equation 14.A.6. Equation 14.A.7 implies that

$$\Pi\Gamma = B. \qquad (14.A.8)$$
If the predetermined variables are not perfectly collinear, the reduced form
equations are identified. Because the reduced form equations' explanators are all
predetermined variables, we can consistently estimate the coefficients of those
equations, the elements of $\Pi$, by OLS. In contrast, we may be unable to consistently estimate the coefficients of a structural equation like that in Equation
14.A.2, which we can write as

$$\sum_{j=1}^{G}\gamma_{ji}Y_j - \sum_{j=0}^{K}\beta_{ji}X_j - E_i = 0. \qquad (14.A.9)$$

Multiplying $X$ by the $i$-th column of $\Pi$, $\Pi_i$, yields the reduced form equation for the $i$-th endogenous
variable, $Y_i$. We know that the $i$-th element of $\Gamma_i$ is 1. Identification requires further restrictions on $\Gamma_i$ and $B_i$. Consider, for example, the first structural equation.
Not all endogenous variables need appear in the first structural equation, nor
do all predetermined variables need appear in the first equation. Let's divide the endogenous variables into three groups: $Y_1$ itself, $Y_{in1}$, and $Y_{out1}$. $Y_{in1}$ contains all
the endogenous variables that appear as explanators with nonzero coefficients in
the first structural equation. $Y_{out1}$ contains all the endogenous variables with zero
coefficients in the first structural equation. $Y_{in1}$ is a $T \times G_{in1}$ matrix and $Y_{out1}$ is a
$T \times G_{out1}$ matrix. Suppose

$$Y = [Y_1\ Y_{in1}\ Y_{out1}]. \qquad (14.A.10)$$

Similarly, suppose

$$X = [X_{in1}\ X_{out1}], \qquad (14.A.11)$$
in which $X_{in1}$ contains all the predetermined variables that appear with nonzero
coefficients in the first structural equation and $X_{out1}$ contains all the predetermined
variables with zero coefficients in the first structural equation. $X_{in1}$ is a $T \times K_{in1}$
matrix and $X_{out1}$ is a $T \times K_{out1}$ matrix; $K_{in1} + K_{out1} = K + 1$.

It proves informative to rewrite Equation 14.A.9 with the explicit division of Y and X
into the groups given by Equations 14.A.10 and 14.A.11:
$$[Y_1\ Y_{in1}\ Y_{out1}]\begin{bmatrix} 1 \\ \Gamma_{in1} \\ \Gamma_{out1} \end{bmatrix} - [X_{in1}\ X_{out1}]\begin{bmatrix} B_{in1} \\ B_{out1} \end{bmatrix} - E_1 = 0, \qquad (14.A.12)$$

in which $\Gamma_{in1}$ is a $(G_{in1} \times 1)$ vector containing the first structural equation's coefficients for the endogenous variables in $Y_{in1}$, $\Gamma_{out1}$ is a $(G_{out1} \times 1)$ vector containing
the first structural equation's coefficients for the endogenous variables in $Y_{out1}$,
and $B_{in1}$ and $B_{out1}$ are similarly defined. Because excluded variables have zero coefficients, we can rewrite Equation 14.A.12 as

$$[Y_1\ Y_{in1}\ Y_{out1}]\begin{bmatrix} 1 \\ \Gamma_{in1} \\ 0 \end{bmatrix} - [X_{in1}\ X_{out1}]\begin{bmatrix} B_{in1} \\ 0 \end{bmatrix} - E_1 = 0.$$
We can similarly rewrite the reduced form equation

$$Y = X\Pi + N$$

as

$$[Y_1\ Y_{in1}\ Y_{out1}] = [X_{in1}\ X_{out1}]\begin{bmatrix} \Pi_{in11} & \Pi_{in1in} & \Pi_{in1out} \\ \Pi_{out11} & \Pi_{out1in} & \Pi_{out1out} \end{bmatrix} + N,$$

in which

$$\begin{bmatrix} \Pi_{in11} & \Pi_{in1in} & \Pi_{in1out} \\ \Pi_{out11} & \Pi_{out1in} & \Pi_{out1out} \end{bmatrix} = \Pi$$

and
$\Pi_{in11}$ contains the reduced form coefficients for the predetermined variables that
appear in the first structural equation, from the first endogenous variable's reduced form equation; it is $K_{in1} \times 1$;

$\Pi_{out11}$ contains the reduced form coefficients for the predetermined variables that
do not appear in the first structural equation, from the first endogenous variable's reduced form equation; it is $K_{out1} \times 1$;

$\Pi_{in1in}$ contains the reduced form coefficients for the predetermined variables that
appear in the first structural equation, from the reduced form equations for
the endogenous explanators included in the first structural equation; it is
$K_{in1} \times G_{in1}$;

$\Pi_{in1out}$ contains the reduced form coefficients for the predetermined variables that
appear in the first structural equation, from the reduced form equations for
the endogenous explanators excluded from the first structural equation; it is
$K_{in1} \times G_{out1}$;

$\Pi_{out1in}$ contains the reduced form coefficients for the predetermined variables that
do not appear in the first structural equation, from the reduced form equations for the endogenous explanators included in the first structural equation;
it is $K_{out1} \times G_{in1}$; and

$\Pi_{out1out}$ contains the reduced form coefficients for the predetermined variables excluded from the first structural equation, from the reduced form equations for
the endogenous explanators excluded from the first structural equation; it is
$K_{out1} \times G_{out1}$.
With this more elaborate rendering of structural and reduced form equations,
we can determine when Equation 14.A.12 is identified.
Substituting the partitioned matrices into Equation 14.A.8 for the first equation yields

$$\begin{bmatrix} B_{in1} \\ 0 \end{bmatrix} = \begin{bmatrix} \Pi_{in11} & \Pi_{in1in} & \Pi_{in1out} \\ \Pi_{out11} & \Pi_{out1in} & \Pi_{out1out} \end{bmatrix}\begin{bmatrix} 1 \\ \Gamma_{in1} \\ 0 \end{bmatrix}. \qquad (14.A.13)$$

The upper block of Equation 14.A.13 states that

$$B_{in1} = \Pi_{in11} + \Pi_{in1in}\Gamma_{in1}, \qquad (14.A.14)$$

and the lower block states that

$$\Pi_{out11} = -\Pi_{out1in}\Gamma_{in1}. \qquad (14.A.15)$$

When $K_{out1} = G_{in1}$ and $\Pi_{out1in}$ is invertible, Equation 14.A.15 determines $\Gamma_{in1}$ uniquely from the reduced form coefficients, and Equation 14.A.14 then determines $B_{in1}$. In this case, the first structural equation is exactly identified. When $K_{out1} > G_{in1}$,
a similar condition determines whether the first structural equation is identified.
Consider a case in which $K_{out1} > G_{in1}$. Suppose that we can select a subset of
the rows of $\Pi_{out1in}$ such that the resulting square matrix is invertible. Call this
matrix $\Pi^*_{out1in}$ and call its inverse $\Pi^{*-1}_{out1in}$. Remove the same rows from $\Pi_{out11}$
and call the resulting matrix $\Pi^*_{out11}$.

We call the number of linearly independent columns in a matrix the column
rank of the matrix. We call the number of linearly independent rows the row
rank. Equation 14.A.15 tells us the relationship among these matrices:

$$\Pi^*_{out11} = -\Pi^*_{out1in}\Gamma_{in1},$$
$$\beta_1 = \frac{\pi_{q0}}{\pi_{p0}}$$

and

$$\beta_1 = \frac{\pi_{q1}}{\pi_{p1}}.$$

The reduced form parameters lead to $\beta_1$ using either formula. Consequently, $\beta_1$
can be consistently estimated from consistent reduced form estimates: $\beta_1$ is identified. The surfeit of riches rules out using indirect least squares as our estimation
procedure (there is no unique solution for the structural equations from the reduced form equations in finite samples), but $\beta_1$ is identified, so it
can be consistently estimated by some available means.
When, for $M > R$, an $M \times R$ matrix $A$ can have rows removed so as to
form an invertible $R \times R$ matrix, we say that the matrix $A$ is of rank
$R$. A sufficient condition for the $i$-th equation in a system to be identified is that
the rank of $\Pi_{out_i in_i}$ is $G_{in_i}$. This is the rank condition for identification. When the
rank condition fails for an equation, there are multiple sets of coefficient values
for that equation that are consistent with the system's reduced form; the equation
is underidentified. When the rank condition is satisfied, there is only one set of
coefficients for the equation that is consistent with the system's reduced form;
the equation is identified.
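The rank condition can be checked mechanically. The sketch below is our own illustration, not from the text: it builds a hypothetical two-equation system in the appendix's notation (with invented coefficient values), computes $\Pi = B\Gamma^{-1}$, and inspects the rank of the block $\Pi_{out1in}$ for the first equation.

```python
import numpy as np

# Hypothetical 2-equation system; columns of Gamma and B hold each
# structural equation's coefficients, with gamma_ii normalized to 1:
#   Equation 1: Y1 = 0.5*Y2  + X0 + 2*X1   (X2 excluded, so K_out1 = 1)
#   Equation 2: Y2 = 0.25*Y1 + X0 + 3*X2   (X1 excluded)
Gamma = np.array([[1.0, -0.25],
                  [-0.5, 1.0]])
B = np.array([[1.0, 1.0],
              [2.0, 0.0],
              [0.0, 3.0]])

# Reduced form coefficients, as in Equations 14.A.7 and 14.A.8.
Pi = B @ np.linalg.inv(Gamma)

# Pi_out1in: reduced form coefficients of the predetermined variables
# EXCLUDED from equation 1 (the X2 row) in the reduced form equations of
# the endogenous explanators INCLUDED in equation 1 (the Y2 column).
Pi_out1in = Pi[np.ix_([2], [1])]   # a K_out1 x G_in1 = 1 x 1 block
rank = np.linalg.matrix_rank(Pi_out1in)
print(rank)   # equals G_in1 = 1, so equation 1 satisfies the rank condition
```

Here $K_{out1} = G_{in1} = 1$, so equation 1 is exactly identified; repeating the check with the X1 row and Y1 column tests equation 2.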
miss failures of the order condition when we perform IV estimation; the computer
will tell us that we have tried to divide by zero or that some matrix is singular
(that is, does not have an inverse).
When the order condition is satisfied but the rank condition fails, we have
the needed number of potential instruments, but we cannot form enough linearly
independent combinations of them to make $(Z'X)$ invertible in large samples. A
failure of the rank condition when the order condition is satisfied does not stop us
from computing the IV estimator, even in very large samples; the sampling errors
in estimating the reduced form coefficients will probably lead us to construct instruments that are not perfectly collinear within our samples. Nevertheless, the
IV estimates are inconsistent in this case because the equation is not identified.