
Part 12: Asymptotics for the Regression Model

Econometrics I
Professor William Greene
Stern School of Business
Department of Economics

1/38
Part 12: Asymptotics for the Regression Model
Setting
The least squares estimator is

$$\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}
 = (\mathbf{X}'\mathbf{X})^{-1}\sum_i \mathbf{x}_i y_i
 = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\sum_i \mathbf{x}_i \varepsilon_i$$
So, b is a constant vector plus a sum of random
variables. Our finite sample results
established the behavior of the sum according
to the rules of statistics. The question for the
present is: how does this sum of random
variables behave in large samples?
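As a quick numerical check, here is a minimal numpy sketch on simulated data (not the LIMDEP/NLOGIT software used in the applications later in these notes; the design and all values are hypothetical) verifying that the two expressions for b coincide:

import numpy as np

rng = np.random.default_rng(0)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 0.5, -2.0])
eps = rng.normal(scale=0.3, size=n)
y = X @ beta + eps

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ (X.T @ y)                    # least squares
b_decomp = beta + XtX_inv @ (X.T @ eps)    # constant vector + sum of random terms
print(np.allclose(b, b_decomp))            # True: the two expressions coincide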

2/38
Part 12: Asymptotics for the Regression Model
Well Behaved Regressors
A crucial assumption: convergence of the
moment matrix X'X/n to a positive definite
matrix of finite elements, Q.
What kind of data will satisfy this assumption?
What won't?
Does stochastic vs. nonstochastic X matter?
Various conditions for well behaved X.

3/38
Part 12: Asymptotics for the Regression Model
Probability Limit

$$\mathbf{b} - \boldsymbol{\beta} = \left(\frac{1}{n}\mathbf{X}'\mathbf{X}\right)^{-1}\frac{1}{n}\sum_{i=1}^{n}\mathbf{x}_i\varepsilon_i$$

$$E[(\mathbf{b}-\boldsymbol{\beta})(\mathbf{b}-\boldsymbol{\beta})'\mid\mathbf{X}]
 = \left(\frac{1}{n}\mathbf{X}'\mathbf{X}\right)^{-1}
 \left(\frac{1}{n^{2}}\sum_{i=1}^{n}\sum_{j=1}^{n}E[\varepsilon_i\varepsilon_j\mid\mathbf{X}]\,\mathbf{x}_i\mathbf{x}_j'\right)
 \left(\frac{1}{n}\mathbf{X}'\mathbf{X}\right)^{-1}$$

In E[(b - β)(b - β)' | X], the terms in the double sum with unequal subscripts
have expectation zero, so this collapses to

$$E[(\mathbf{b}-\boldsymbol{\beta})(\mathbf{b}-\boldsymbol{\beta})'\mid\mathbf{X}]
 = \left(\frac{1}{n}\mathbf{X}'\mathbf{X}\right)^{-1}\frac{\sigma^{2}}{n}\left(\frac{1}{n}\mathbf{X}'\mathbf{X}\right)\left(\frac{1}{n}\mathbf{X}'\mathbf{X}\right)^{-1}
 = \frac{\sigma^{2}}{n}\left(\frac{1}{n}\mathbf{X}'\mathbf{X}\right)^{-1}$$

We use convergence in mean square. This is adequate for almost all problems,
though not adequate for some time series problems.
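A small Monte Carlo sketch of this result (numpy, hypothetical simulated design; X is held fixed across replications so the conditional variance formula applies):

import numpy as np

rng = np.random.default_rng(1)
n, reps, sigma = 200, 5000, 0.5
beta = np.array([1.0, 2.0])
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # fixed across replications
XtX_inv = np.linalg.inv(X.T @ X)

bs = np.empty((reps, 2))
for r in range(reps):
    eps = rng.normal(scale=sigma, size=n)
    bs[r] = XtX_inv @ (X.T @ (X @ beta + eps))

print(np.cov(bs.T))                                  # simulated Var[b | X]
print(sigma**2 / n * np.linalg.inv(X.T @ X / n))     # (sigma^2/n)(X'X/n)^{-1}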

4/38
Part 12: Asymptotics for the Regression Model
Mean Square Convergence
E[b | X] = β for any X.
Var[b | X] → 0 for any specific well behaved X.
Therefore, b converges in mean square to β.

5/38
Part 12: Asymptotics for the Regression Model
Crucial Assumption of the Model
What must be assumed to get plim (1/n) X'ε = 0?

(1) x_i = a random vector with finite means and variances and identical
distributions.
(2) ε_i = a random variable with a constant distribution with finite mean and
variance and E[ε_i] = 0.
(3) x_i and ε_i statistically independent.

Then, z_i = x_i ε_i = an observation in a random sample, with constant variance
matrix and mean vector 0.

$$\frac{1}{n}\mathbf{X}'\boldsymbol{\varepsilon} = \frac{1}{n}\sum_{i=1}^{n}\mathbf{x}_i\varepsilon_i = \frac{1}{n}\sum_{i=1}^{n}\mathbf{z}_i$$

converges to its expectation, 0, by the law of large numbers.
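A one-line check of the law of large numbers at work here (numpy sketch; the distributions are hypothetical choices satisfying (1)-(3)):

import numpy as np

rng = np.random.default_rng(2)
for n in (100, 10_000, 1_000_000):
    x = rng.normal(loc=1.0, size=n)        # finite mean and variance
    eps = rng.standard_t(df=5, size=n)     # E[eps] = 0, independent of x
    print(n, np.mean(x * eps))             # (1/n) sum of z_i = x_i*eps_i -> 0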

6/38
Part 12: Asymptotics for the Regression Model
Consistency of s²

$$s^2 = \frac{1}{n-K}\,\mathbf{e}'\mathbf{e} = \frac{n}{n-K}\,\frac{1}{n}\,\boldsymbol{\varepsilon}'\mathbf{M}\boldsymbol{\varepsilon}$$

$$\operatorname{plim} s^2 = \operatorname{plim}\frac{1}{n}\boldsymbol{\varepsilon}'\mathbf{M}\boldsymbol{\varepsilon}
 = \operatorname{plim}\frac{1}{n}\left(\boldsymbol{\varepsilon}'\boldsymbol{\varepsilon}
 - \boldsymbol{\varepsilon}'\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\boldsymbol{\varepsilon}\right)$$

$$= \operatorname{plim}\frac{1}{n}\boldsymbol{\varepsilon}'\boldsymbol{\varepsilon}
 - \operatorname{plim}\frac{1}{n}\boldsymbol{\varepsilon}'\mathbf{X}
 \left(\frac{1}{n}\mathbf{X}'\mathbf{X}\right)^{-1}
 \operatorname{plim}\frac{1}{n}\mathbf{X}'\boldsymbol{\varepsilon}
 = \operatorname{plim}\frac{1}{n}\boldsymbol{\varepsilon}'\boldsymbol{\varepsilon} - \mathbf{0}'\mathbf{Q}^{-1}\mathbf{0}
 = \operatorname{plim}\frac{1}{n}\boldsymbol{\varepsilon}'\boldsymbol{\varepsilon} = \sigma^2$$

What must be assumed to claim plim (1/n) ε'ε = E[ε_i²] = σ²?

7/38
Part 12: Asymptotics for the Regression Model
Asymptotic Distribution

$$\mathbf{b} = \boldsymbol{\beta} + \left(\frac{1}{n}\mathbf{X}'\mathbf{X}\right)^{-1}\frac{1}{n}\sum_{i=1}^{n}\mathbf{x}_i\varepsilon_i$$

The limiting behavior of b is the same as that of the statistic that results when
the moment matrix is replaced by its limit. We examine the behavior of the
modified sum

$$\boldsymbol{\beta} + \mathbf{Q}^{-1}\frac{1}{n}\sum_{i=1}^{n}\mathbf{x}_i\varepsilon_i$$

8/38
Part 12: Asymptotics for the Regression Model
Asymptotics

$$\boldsymbol{\beta} + \mathbf{Q}^{-1}\frac{1}{n}\sum_{i=1}^{n}\mathbf{x}_i\varepsilon_i$$

What is the mean of this random vector?
What is its variance?
Do they 'converge' to something? We use this method to find the probability
limit.
What is the asymptotic distribution?

9/38
Part 12: Asymptotics for the Regression Model
Asymptotic Distributions
Finding the asymptotic distribution:
b → β in probability. How to describe the distribution?
b has no limiting distribution: Var[b] → 0; it is O(1/n).
Stabilize the variance? Var[√n b] ≈ σ²Q⁻¹ is O(1).
But E[√n b] = √n β, which diverges.
√n(b - β) → a random variable with finite mean and variance (a stabilizing
transformation).
b ≈ β + (1/√n) times that random variable.


10/38
Part 12: Asymptotics for the Regression Model
Limiting Distribution
$$\sqrt{n}\,(\mathbf{b}-\boldsymbol{\beta}) = \sqrt{n}\,(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\boldsymbol{\varepsilon}
 = \sqrt{n}\left(\frac{\mathbf{X}'\mathbf{X}}{n}\right)^{-1}\left(\frac{\mathbf{X}'\boldsymbol{\varepsilon}}{n}\right)$$

The limiting behavior is the same as that of

$$\sqrt{n}\,\mathbf{Q}^{-1}\left(\frac{\mathbf{X}'\boldsymbol{\varepsilon}}{n}\right)$$

Q is a fixed matrix. The behavior depends on the random vector √n (X'ε/n).




11/38
Part 12: Asymptotics for the Regression Model
Limiting Normality
$$\frac{1}{n}\mathbf{X}'\boldsymbol{\varepsilon} = \frac{1}{n}\sum_{i=1}^{n}\mathbf{x}_i\varepsilon_i
 = \frac{1}{n}\sum_{i=1}^{n}\mathbf{w}_i = \bar{\mathbf{w}}_n$$

This is the mean of a sample of independent observations. The mean converges
to zero (plim (1/n) X'ε = 0 was already assumed), so √n w̄_n is a candidate for
the Lindeberg-Feller Central Limit Theorem.
The variance of each term w_i = x_i ε_i is σ² x_i x_i'. The variance of √n w̄_n is
σ² (X'X/n) → σ²Q.
Based on the CLT,

$$\sqrt{n}\,\bar{\mathbf{w}}_n \xrightarrow{d} N[\mathbf{0},\ \sigma^2\mathbf{Q}]$$
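A simulation sketch of this CLT (numpy; a hypothetical scalar regressor with E[x²] = Q = 2 and deliberately non-normal disturbances with σ² = 1):

import numpy as np

rng = np.random.default_rng(4)
n, reps = 500, 10_000
stats = np.empty(reps)
for r in range(reps):
    x = rng.normal(loc=1.0, size=n)                 # E[x^2] = 1 + 1 = 2 = Q
    eps = rng.uniform(-3**0.5, 3**0.5, size=n)      # mean 0, variance 1, not normal
    stats[r] = n**0.5 * np.mean(x * eps)            # sqrt(n) * w_bar_n

print(stats.mean(), stats.var())   # approximately 0 and sigma^2 * Q = 2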

12/38
Part 12: Asymptotics for the Regression Model
Asymptotic Distribution
The limiting distribution of

$$\sqrt{n}\,(\mathbf{b}-\boldsymbol{\beta}) = \sqrt{n}\left(\frac{\mathbf{X}'\mathbf{X}}{n}\right)^{-1}\left(\frac{\mathbf{X}'\boldsymbol{\varepsilon}}{n}\right)$$

is the same as that of √n Q⁻¹ w̄_n, where √n w̄_n converges in distribution to
N[0, σ²Q]. Therefore,

$$\sqrt{n}\,\mathbf{Q}^{-1}\bar{\mathbf{w}}_n \xrightarrow{d} N[\mathbf{0},\ \mathbf{Q}^{-1}(\sigma^2\mathbf{Q})\mathbf{Q}^{-1}] = N[\mathbf{0},\ \sigma^2\mathbf{Q}^{-1}]$$

Conclude:

$$\sqrt{n}\,(\mathbf{b}-\boldsymbol{\beta}) \xrightarrow{d} N[\mathbf{0},\ \sigma^2\mathbf{Q}^{-1}]$$

Approximately:

$$\mathbf{b} \stackrel{a}{\sim} N\!\left[\boldsymbol{\beta},\ (\sigma^2/n)\,\mathbf{Q}^{-1}\right]$$

13/38
Part 12: Asymptotics for the Regression Model
Asymptotic Properties
Probability Limit and Consistency
Asymptotic Variance
Asymptotic Distribution

14/38
Part 12: Asymptotics for the Regression Model
Root n Consistency
How fast does b → β?
Asy.Var[b] = (σ²/n)Q⁻¹ is O(1/n).
Convergence is at the rate of 1/√n; √n b has variance of O(1).
Is there any other kind of convergence?
x₁,...,xₙ = a sample from an exponential population; the minimum has variance
O(1/n²). This is n-convergent.
Certain nonparametric estimators have variances that are O(1/n^(2/3)), less
than root n convergent.
Kernel density estimators converge more slowly than √n.
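The exponential-minimum example can be checked directly: for a standard exponential sample the minimum is exponential with rate n, so its variance is exactly 1/n². A numpy sketch (hypothetical sample sizes):

import numpy as np

rng = np.random.default_rng(5)
reps = 20_000
for n in (100, 200, 400):
    mins = rng.exponential(scale=1.0, size=(reps, n)).min(axis=1)
    print(n, mins.var(), 1 / n**2)   # simulated variance tracks 1/n^2: O(1/n^2)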

15/38
Part 12: Asymptotics for the Regression Model
Asymptotic Results
The asymptotic distribution of b does not depend on normality of ε.
The estimator of the asymptotic variance (σ²/n)Q⁻¹ is (s²/n)(X'X/n)⁻¹.
(Degrees of freedom corrections are irrelevant but conventional.)
The Slutsky theorem and the delta method apply to functions of b.

16/38
Part 12: Asymptotics for the Regression Model
Test Statistics
We have established the asymptotic distribution of b. We
now turn to the construction of test statistics. In
particular, we base tests on the Wald statistic

$$F[J, n-K] = \frac{1}{J}(\mathbf{Rb}-\mathbf{q})'\left[\mathbf{R}\,s^2(\mathbf{X}'\mathbf{X})^{-1}\mathbf{R}'\right]^{-1}(\mathbf{Rb}-\mathbf{q})$$

This is the usual test statistic for testing linear hypotheses
in the linear regression model, distributed exactly as F if
the disturbances are normally distributed. We now
obtain some general results that will let us construct test
statistics in more general situations.

17/38
Part 12: Asymptotics for the Regression Model
Full Rank Quadratic Form
A crucial distributional result (exact): If the
random vector x has a K-variate normal
distribution with mean vector μ and
covariance matrix Σ, then the random variable
W = (x - μ)'Σ⁻¹(x - μ) has a chi-squared
distribution with K degrees of freedom.

(See Section 5.4.2 in the text.)

18/38
Part 12: Asymptotics for the Regression Model
Building the Wald Statistic-1
Suppose that the same normal distribution
assumptions hold, but instead of the parameter
matrix Σ we do the computation using a matrix
Sₙ which has the property plim Sₙ = Σ. The
exact chi-squared result no longer holds, but
the limiting distribution is the same as if the true
Σ were used.

19/38
Part 12: Asymptotics for the Regression Model
Building the Wald Statistic-2
Suppose the statistic is computed not with an x that has an
exact normal distribution, but with an xₙ which has a
limiting normal distribution, but whose finite sample
distribution might be something else. Our earlier
results for functions of random variables give us the
result

$$(\mathbf{x}_n - \boldsymbol{\mu})'\,\mathbf{S}_n^{-1}(\mathbf{x}_n - \boldsymbol{\mu}) \xrightarrow{d} \chi^2[K]$$

(A very important result.) Note that in fact, nothing in this relies on the
normal distribution. What we used is consistency of a
certain estimator (Sₙ) and the central limit theorem for
xₙ.

20/38
Part 12: Asymptotics for the Regression Model
General Result for Wald Distance
The Wald distance measure: If plim xₙ = μ, xₙ is
asymptotically normally distributed with a
mean of μ and variance Σ, and Sₙ is a
consistent estimator of Σ, then the Wald
statistic, which is a generalized distance
measure between xₙ and μ, converges to a
chi-squared variate:

$$(\mathbf{x}_n - \boldsymbol{\mu})'\,\mathbf{S}_n^{-1}(\mathbf{x}_n - \boldsymbol{\mu}) \xrightarrow{d} \chi^2[K]$$
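A simulation sketch of this general result (numpy/scipy; hypothetical non-normal data with known mean μ). The Wald distance built from a sample mean and a consistent variance estimator behaves like χ²[K] even though nothing is normal:

import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
K, n, reps = 3, 400, 5_000
mu = np.array([1.0, -1.0, 0.5])
wald = np.empty(reps)
for r in range(reps):
    data = rng.exponential(1.0, size=(n, K)) + (mu - 1.0)   # non-normal, mean mu
    xbar = data.mean(axis=0)
    S = np.cov(data.T) / n                   # consistent estimator of Var[xbar]
    d = xbar - mu
    wald[r] = d @ np.linalg.solve(S, d)      # (x_n - mu)' S_n^{-1} (x_n - mu)

print(wald.mean())                           # approximately K = 3
print(stats.kstest(wald, 'chi2', args=(K,)).pvalue)   # consistent with chi2[K]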

21/38
Part 12: Asymptotics for the Regression Model
The F Statistic
An application (familiar): Suppose bₙ is the least
squares estimator of β based on a sample of n
observations. No assumption of normality of the
disturbances or about nonstochastic regressors is
made. The standard F statistic for testing the
hypothesis H0: Rβ - q = 0 is

$$F[J, n-K] = \frac{(\mathbf{e}_*'\mathbf{e}_* - \mathbf{e}'\mathbf{e})/J}{\mathbf{e}'\mathbf{e}/(n-K)}$$

where this is built from two sums of squared residuals.
The statistic does not have an exact F distribution. How can
we test the hypothesis?

22/38
Part 12: Asymptotics for the Regression Model
JF is a Wald Statistic
$$F[J,n-K] = \frac{1}{J}(\mathbf{Rb}_n-\mathbf{q})'\left[\mathbf{R}\,s^2(\mathbf{X}'\mathbf{X})^{-1}\mathbf{R}'\right]^{-1}(\mathbf{Rb}_n-\mathbf{q})$$

Write m = Rbₙ - q. Under the hypothesis, plim m = 0, and

$$\sqrt{n}\,\mathbf{m} \xrightarrow{d} N[\mathbf{0},\ \sigma^2\,\mathbf{R}\mathbf{Q}^{-1}\mathbf{R}']$$

Estimate the variance with s²R(X'X/n)⁻¹R'. Then

$$(\sqrt{n}\,\mathbf{m})'\,[\widehat{\operatorname{Var}}(\sqrt{n}\,\mathbf{m})]^{-1}(\sqrt{n}\,\mathbf{m})$$

fits exactly into the apparatus developed earlier (the factors of n cancel, so this
quadratic form equals JF). If plim bₙ = β, plim s² = σ², and the other asymptotic
results we developed for least squares hold, then

$$JF[J,n-K] \xrightarrow{d} \chi^2[J]$$
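A simulation sketch (numpy; hypothetical design with deliberately non-normal disturbances) confirming that JF behaves like χ²[J] under H0:

import numpy as np

rng = np.random.default_rng(7)
n, K, J, reps = 200, 5, 2, 5_000
R = np.zeros((J, K)); R[0, 3] = R[1, 4] = 1.0     # H0: beta_4 = beta_5 = 0
q = np.zeros(J)
beta = np.array([1.0, 0.5, -0.5, 0.0, 0.0])       # H0 is true
jf = np.empty(reps)
for r in range(reps):
    X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
    y = X @ beta + rng.standard_t(df=8, size=n)   # disturbances are not normal
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ (X.T @ y)
    e = y - X @ b
    s2 = e @ e / (n - K)
    m = R @ b - q
    F = m @ np.linalg.solve(s2 * R @ XtX_inv @ R.T, m) / J
    jf[r] = J * F

print(jf.mean())   # approximately J = 2, as the chi-squared[J] limit implies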

23/38
Part 12: Asymptotics for the Regression Model
Application: Wald Tests
read;nobs=27;nvar=10;names=
Year, G , Pg, Y , Pnc , Puc , Ppt , Pd , Pn , Ps $
1960 129.7 .925 6036 1.045 .836 .810 .444 .331 .302
1961 131.3 .914 6113 1.045 .869 .846 .448 .335 .307
1962 137.1 .919 6271 1.041 .948 .874 .457 .338 .314
1963 141.6 .918 6378 1.035 .960 .885 .463 .343 .320
1964 148.8 .914 6727 1.032 1.001 .901 .470 .347 .325
1965 155.9 .949 7027 1.009 .994 .919 .471 .353 .332
1966 164.9 .970 7280 .991 .970 .952 .475 .366 .342
1967 171.0 1.000 7513 1.000 1.000 1.000 .483 .375 .353
1968 183.4 1.014 7728 1.028 1.028 1.046 .501 .390 .368
1969 195.8 1.047 7891 1.044 1.031 1.127 .514 .409 .386
1970 207.4 1.056 8134 1.076 1.043 1.285 .527 .427 .407
1971 218.3 1.063 8322 1.120 1.102 1.377 .547 .442 .431
1972 226.8 1.076 8562 1.110 1.105 1.434 .555 .458 .451
1973 237.9 1.181 9042 1.111 1.176 1.448 .566 .497 .474
1974 225.8 1.599 8867 1.175 1.226 1.480 .604 .572 .513
1975 232.4 1.708 8944 1.276 1.464 1.586 .659 .615 .556
1976 241.7 1.779 9175 1.357 1.679 1.742 .695 .638 .598
1977 249.2 1.882 9381 1.429 1.828 1.824 .727 .671 .648
1978 261.3 1.963 9735 1.538 1.865 1.878 .769 .719 .698
1979 248.9 2.656 9829 1.660 2.010 2.003 .821 .800 .756
1980 226.8 3.691 9722 1.793 2.081 2.516 .892 .894 .839
1981 225.6 4.109 9769 1.902 2.569 3.120 .957 .969 .926
1982 228.8 3.894 9725 1.976 2.964 3.460 1.000 1.000 1.000
1983 239.6 3.764 9930 2.026 3.297 3.626 1.041 1.021 1.062
1984 244.7 3.707 10421 2.085 3.757 3.852 1.038 1.050 1.117
1985 245.8 3.738 10563 2.152 3.797 4.028 1.045 1.075 1.173
1986 269.4 2.921 10780 2.240 3.632 4.264 1.053 1.069 1.224

24/38
Part 12: Asymptotics for the Regression Model
Data Setup
Create;
G=log(G);
Pg=log(PG);
y=log(y);
pnc=log(pnc);
puc=log(puc);
ppt=log(ppt);
pd=log(pd);
pn=log(pn);
ps=log(ps);
t=year-1960$
Namelist;X=one,y,pg,pnc,puc,ppt,pd,pn,ps,t$
Regress;lhs=g;rhs=X;PrintVC$
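For readers working outside LIMDEP/NLOGIT, here is a rough Python analogue of the Create/Namelist/Regress setup above. This is a sketch: the file name gasoline.txt is an assumption, standing for the data table on the previous slide saved whitespace-delimited with a header row.

import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("gasoline.txt", sep=r"\s+")     # hypothetical file with the table above
for col in ["G", "Pg", "Y", "Pnc", "Puc", "Ppt", "Pd", "Pn", "Ps"]:
    df[col] = np.log(df[col])                    # Create; ... = log(...)
df["t"] = df["Year"] - 1960                      # time trend

X = sm.add_constant(df[["Y", "Pg", "Pnc", "Puc", "Ppt", "Pd", "Pn", "Ps", "t"]])
ols = sm.OLS(df["G"], X).fit()
print(ols.summary())                             # Regress;lhs=g;rhs=X
print(ols.cov_params())                          # ;PrintVC covariance matrix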

25/38
Part 12: Asymptotics for the Regression Model
Regression Model
Based on the gasoline data: The regression equation is

g = β₁ + β₂y + β₃pg + β₄pnc + β₅puc + β₆ppt + β₇pd + β₈pn + β₉ps + β₁₀t + ε

All variables are logs of the raw variables, so that
the coefficients are elasticities. The new variable, t,
is a time trend, 0, 1, ..., 26, so that β₁₀ is the
autonomous yearly proportional growth in G.

26/38
Part 12: Asymptotics for the Regression Model
Least Squares Results
+----------------------------------------------------+
| Ordinary least squares regression |
| LHS=G Mean = 5.308616 |
| Standard deviation = .2313508 |
| Model size Parameters = 10 |
| Degrees of freedom = 17 |
| Residuals Sum of squares = .003776938 |
| Standard error of e = .01490546 |
| Fit R-squared = .9972859 |
| Adjusted R-squared = .9958490 |
| Model test F[ 9, 17] (prob) = 694.07 (.0000) |
| Chi-sq [ 9] (prob) = 159.55 (.0000) |
+----------------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient | Standard Error |t-ratio |P[|T|>t] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
Constant -5.97984140 2.50176400 -2.390 .0287
Y 1.39438363 .27824509 5.011 .0001 9.03448264
PG -.58143705 .06111346 -9.514 .0000 .47679491
PNC -.29476979 .25797920 -1.143 .2690 .28100132
PUC -.20153591 .07415599 -2.718 .0146 .40523616
PPT .08050720 .08706712 .925 .3681 .47071442
PD 1.50606609 .29745626 5.063 .0001 -.44279509
PN .99947385 .27032812 3.697 .0018 -.58532943
PS -.81789420 .46197918 -1.770 .0946 -.62272267
T -.01251291 .01263559 -.990 .3359 13.0000000

27/38
Part 12: Asymptotics for the Regression Model
Covariance Matrix

28/38
Part 12: Asymptotics for the Regression Model
Linear Hypothesis
H0: Aggregate price variables are not significant
determinants of gasoline consumption.
H0: β₇ = β₈ = β₉ = 0
H1: At least one is nonzero.

Rβ - q = 0, where

$$\mathbf{R} = \begin{bmatrix} 0&0&0&0&0&0&1&0&0&0 \\ 0&0&0&0&0&0&0&1&0&0 \\ 0&0&0&0&0&0&0&0&1&0 \end{bmatrix}, \qquad \mathbf{q} = \begin{bmatrix} 0\\0\\0 \end{bmatrix}$$

29/38
Part 12: Asymptotics for the Regression Model
Wald Test
Matrix ; R = [0,0,0,0,0,0,1,0,0,0/
0,0,0,0,0,0,0,1,0,0/
0,0,0,0,0,0,0,0,1,0]
; q = [0 / 0 / 0 ] $
Matrix ; m = R*b - q ; Vm = R*Varb*R'
; List ; Wald = m'<Vm>m $
Matrix WALD has 1 rows and 1 columns.
1
+--------------
1| 66.91506


30/38
Part 12: Asymptotics for the Regression Model
Restricted Regression
Compare sums of squares:
Regress;lhs=g;rhs=X;cls:b(7)=0,b(8)=0,b(9)=0$

+----------------------------------------------------+
| Linearly restricted regression |
| Ordinary least squares regression |
| LHS=G Mean = 5.308616 |
| Standard deviation = .2313508 |
| Residuals Sum of squares = .01864365 | .00377694
| Standard error of e = .3053166E-01 |
| Fit R-squared = .9866028 | .9972859 without restrictions
| Adjusted R-squared = .9825836 |
| Model test F[ 6, 20] (prob) = 245.47 (.0000) |
| Restrictns. F[ 3, 17] (prob) = 22.31 (.0000) | Note: J(=3)*F = Chi-Squared = 66.915 from before
| Not using OLS or no constant. Rsqd & F may be < 0. |
| Note, with restrictions imposed, Rsqd may be < 0. |
+----------------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient | Standard Error |t-ratio |P[|T|>t] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
Constant -4.46504223 4.77789711 -.935 .3631
Y 1.05851456 .55196204 1.918 .0721 9.03448264
PG -.15852276 .05008100 -3.165 .0057 .47679491
PNC .21765564 .18336687 1.187 .2516 .28100132
PUC -.24298315 .10328032 -2.353 .0309 .40523616
PPT -.12617610 .10436708 -1.209 .2432 .47071442
PD .000000 ......(Fixed Parameter)....... -.44279509
PN .222045D-15 ......(Fixed Parameter)....... -.58532943
PS -.444089D-15 ......(Fixed Parameter)....... -.62272267
T .02944666 .02126600 1.385 .1841 13.0000000

31/38
Part 12: Asymptotics for the Regression Model
Nonlinear Restrictions
I am interested in testing the hypothesis that
certain ratios of elasticities are equal. In
particular,

γ₁ = β₄/β₅ - β₇/β₈ = 0
γ₂ = β₄/β₅ - β₉/β₈ = 0

32/38
Part 12: Asymptotics for the Regression Model
Setting Up the Wald Statistic
To do the Wald test, I first need to estimate the asymptotic covariance matrix for the
sample estimates of γ₁ and γ₂. After estimating the regression by least squares, the
estimates are

f₁ = b₄/b₅ - b₇/b₈
f₂ = b₄/b₅ - b₉/b₈

Then, using the delta method, I will estimate the asymptotic variances of f₁ and f₂ and the
asymptotic covariance of f₁ and f₂. For this, write f₁ = f₁(b), that is, a function of the
entire 10×1 coefficient vector. Then, I compute the 1×10 derivative vectors,
d₁ = ∂f₁(b)/∂b' and d₂ = ∂f₂(b)/∂b'. These vectors are

       1  2  3  4      5        6  7      8       9      10
d₁ = [ 0, 0, 0, 1/b₅, -b₄/b₅², 0, -1/b₈, b₇/b₈², 0,     0 ]
d₂ = [ 0, 0, 0, 1/b₅, -b₄/b₅², 0, 0,     b₉/b₈², -1/b₈, 0 ]

33/38
Part 12: Asymptotics for the Regression Model
Wald Statistics
Then, D = the 2×10 matrix with first row d₁ and second row
d₂. The estimator of the asymptotic covariance matrix of
f = [f₁, f₂]' (a 2×1 column vector) is
V = D s²(X'X)⁻¹ D'. Finally, the Wald test of the
hypothesis that γ = 0 is carried out by using the chi-
squared statistic W = (f - 0)'V⁻¹(f - 0). This is a chi-squared
statistic with 2 degrees of freedom. The critical value
from the chi-squared table is 5.99, so if my sample chi-
squared statistic is greater than 5.99, I reject the
hypothesis.
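The same computation as a numpy sketch (a hypothetical helper, not the WALD command itself; b and Vb stand for the 10×1 coefficient vector and the 10×10 estimated covariance matrix s²(X'X)⁻¹ from the regression):

import numpy as np

def delta_wald(b, Vb):
    # f1 = b4/b5 - b7/b8, f2 = b4/b5 - b9/b8 (indices below are zero-based)
    b4, b5, b7, b8, b9 = b[3], b[4], b[6], b[7], b[8]
    f = np.array([b4/b5 - b7/b8, b4/b5 - b9/b8])
    D = np.zeros((2, len(b)))                      # 2x10 matrix of derivatives
    D[0, 3] = 1/b5; D[0, 4] = -b4/b5**2; D[0, 6] = -1/b8; D[0, 7] = b7/b8**2
    D[1, 3] = 1/b5; D[1, 4] = -b4/b5**2; D[1, 7] = b9/b8**2; D[1, 8] = -1/b8
    V = D @ Vb @ D.T                               # V = D s^2 (X'X)^{-1} D'
    return f @ np.linalg.solve(V, f)               # W = f'V^{-1}f ~ chi-squared[2]

# Usage: compare delta_wald(b, Vb) with the 5% critical value 5.99.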

34/38
Part 12: Asymptotics for the Regression Model
Wald Test
In the example below, to make this a little
simpler, I computed the 10-variable regression,
then extracted the 5×1 subvector of the
coefficient vector c = (b₄, b₅, b₇, b₈, b₉)' and its
associated part of the 10×10 covariance matrix.
Then, I manipulated this smaller set of values.

35/38
Part 12: Asymptotics for the Regression Model
Application of the Wald Statistic
? Extract subvector and submatrix for the test
matrix;list ; c =[b(4)/b(5)/b(7)/b(8)/b(9)]$
matrix;list ; vc=[varb(4,4)/
varb(5,4),varb(5,5)/
varb(7,4),varb(7,5),varb(7,7)/
varb(8,4),varb(8,5),varb(8,7),varb(8,8)/
varb(9,4),varb(9,5),varb(9,7),varb(9,8),varb(9,9)]$
? Compute derivatives
calc ;list
; g11=1/c(2); g12=-c(1)*g11*g11; g13=-1/c(4); g14=c(3)*g13*g13 ; g15=0
; g21=g11 ; g22=g12 ; g23=0 ; g24=c(5)/c(4)^2 ; g25=-1/c(4)$
? Move derivatives to matrix
matrix;list; dfdc=[g11,g12,g13,g14,g15 / g21,g22,g23,g24,g25]$
? Compute functions, then move to matrix and compute Wald statistic
calc;list ; f1=c(1)/c(2) - c(3)/c(4)
; f2=c(1)/c(2) - c(5)/c(4) $
matrix ; list; f = [f1/f2]$
matrix ; list; vf=dfdc * vc * dfdc' $
matrix ; list ; wald = f' * <vf> * f$
(This is all automated in the WALD command.)

36/38
Part 12: Asymptotics for the Regression Model
Computations
Matrix C is 5 rows by 1 columns.
1
1 -0.2948 -0.2015 1.506 0.9995 -0.8179
Matrix VC is 5 rows by 5 columns.
1 2 3 4 5
1 0.6655E-01 0.9479E-02 -0.4070E-01 0.4182E-01 -0.9888E-01
2 0.9479E-02 0.5499E-02 -0.9155E-02 0.1355E-01 -0.2270E-01
3 -0.4070E-01 -0.9155E-02 0.8848E-01 -0.2673E-01 0.3145E-01
4 0.4182E-01 0.1355E-01 -0.2673E-01 0.7308E-01 -0.1038
5 -0.9888E-01 -0.2270E-01 0.3145E-01 -0.1038 0.2134
G11 = -4.96184 G12 = 7.25755 G13= -1.00054 G14 = 1.50770 G15 = 0.000000
G21 = -4.96184 G22 = 7.25755 G23 = 0 G24 = -0.818753 G25 = -1.00054
DFDC=[G11,G12,G13,G14,G15/G21,G22,G23,G24,G25]
Matrix DFDC is 2 rows by 5 columns.
1 2 3 4 5
1 -4.962 7.258 -1.001 1.508 0.0000
2 -4.962 7.258 0.0000 -0.8188 -1.001
F1= -0.442126E-01
F2= 2.28098
F=[F1/F2]
VF=DFDC*VC*DFDC'
Matrix VF is 2 rows by 2 columns.
1 2
1 0.9804 0.7846
2 0.7846 0.8648
WALD Matrix Result is 1 rows by 1 columns.
1
1 22.65

37/38
Part 12: Asymptotics for the Regression Model
Noninvariance of the Wald Test

38/38
