The asymptotic power envelope is derived for point-optimal tests of a unit root in the
autoregressive representation of a Gaussian time series under various trend specifications.
We propose a family of tests whose asymptotic power functions are tangent to the power
envelope at one point and are never far below the envelope. When the series has no
deterministic component, some previously proposed tests are shown to be asymptotically
equivalent to members of this family. When the series has an unknown mean or linear
trend, commonly used tests are found to be dominated by members of the family of
1. INTRODUCTION
FOLLOWING THE SEMINAL WORK
for the tests can differ substantially, no general optimality theory has been
developed. In particular, there are few general results (even asymptotic) concerning the relative merits of the competing testing principles and of the various
methods for eliminating trend parameters.
Employing a model common in the previous literature, we assume that the
data $y_1, \ldots, y_T$ were generated as
(1)   $y_t = d_t + u_t, \qquad u_t = \alpha u_{t-1} + v_t \qquad (t = 1, \ldots, T),$
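A minimal sketch of the data-generating process in (1) may help fix ideas. It assumes, purely for illustration, that the errors $v_t$ are Gaussian white noise and that the initial error is zero; the function name `simulate_series` and its arguments are hypothetical and not part of the paper.

```python
import numpy as np

def simulate_series(T, alpha, d=None, seed=0):
    """Draw y_t = d_t + u_t with u_t = alpha * u_{t-1} + v_t and u_0 = 0.

    Assumption (not from the paper): v_t is i.i.d. standard normal.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(T)
    u = np.empty(T)
    prev = 0.0                      # initial error u_0 = 0
    for t in range(T):
        prev = alpha * prev + v[t]  # autoregressive recursion
        u[t] = prev
    d = np.zeros(T) if d is None else np.asarray(d, dtype=float)
    return d + u

# A unit-root series (alpha = 1) around a constant mean of 2.
y = simulate_series(100, 1.0, d=np.full(100, 2.0), seed=1)
```

With `alpha = 1` the stochastic part is a pure random walk, which is the null hypothesis studied in the text.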
G. ELLIOTT, T. J. ROTHENBERG, AND J. H. STOCK
AUTOREGRESSIVE UNIT ROOT
the distribution of the data were otherwise known, the Neyman-Pearson Lemma
gives us the best test against any given point alternative $\bar\alpha$. The power of this
If there
exist feasible tests with the same asymptotic power as the Neyman-Pearson point-optimal tests, the comparison will be appropriate in the nuisance
functions very close to this bound, even when there are additional nuisance
In this section we derive an upper bound to the asymptotic power function for
tests of the hypothesis $\alpha = 1$ when the data are generated by (1) and the
following condition is satisfied.
CONDITION A: The stationary sequence $\{v_t\}$ has a strictly positive spectral density
function; it has a moving average representation $v_t = \sum_{j=0}^{\infty} \delta_j \eta_{t-j}$, where the $\eta_t$ are
independent standard normal random variables and $\sum_{j=0}^{\infty} j|\delta_j| < \infty$. The initial $u_0$ is
0 and the $\delta$'s are known.
The unrealistic assumption of known $u_0$, $\delta$'s and error distribution is made so
we may employ the Neyman-Pearson theory; in Section 3 we show that it may be
dropped without any essential change. Our results, however, are quite sensitive
to the nature of the deterministic components $d_t$. Section 2.1 considers the
simplest case where the $d_t$ are known. Section 2.2 examines the case where the
$d_t$ are "slowly evolving" and Section 2.3 examines the case where $d_t$ is a linear
combination of nonrandom trending regressors. Our purpose here is to derive
the power bound; tests that might be used in practice are discussed later in the
paper. All proofs are given in the Appendix.
2.1. Known Deterministic Component
When the $d_t$ are known, $u_t$ is observable and minus two times the log
likelihood is (except for an additive constant) given by
(2)   $L(\alpha) = [\Delta u - (\alpha - 1)u_{-1}]'\,\Sigma^{-1}\,[\Delta u - (\alpha - 1)u_{-1}]$
$L(\bar\alpha) - L(1)$.
When the sample size is large, any reasonable test will have high power unless
$\alpha$ is close to one. Thus, in obtaining large-sample approximations, it is natural
to employ local-to-unity asymptotics where the parameter space is a shrinking
neighborhood of unity as the sample size grows. In our case the appropriate rate
to get nondegenerate distributions is $T^{-1}$, so we reparameterize the model
writing $c = T(\alpha - 1)$ and take $c$ to be a constant when making limiting arguments. Cf. Chan and Wei (1987), Phillips and Perron (1988). Setting $\bar c = T(\bar\alpha - 1)$,
we can then write the likelihood ratio test statistic as
(3)
¹Using a different maintained model, Robinson (1994) develops a "standard" asymptotic theory
of efficient tests for a unit root. This requires dropping the familiar autoregression framework and
assuming, for example, a fractionally differenced process for the data.
For any given $\bar c$, rejecting when the linear combination (3) is small yields the
most powerful test against the alternative that $c = \bar c$.
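(4)   $\pi(c, \bar c) = \Pr\left[\bar c^{2}\int W_c^{2} - \bar c\,W_c^{2}(1) < b(\bar c)\right],$
where $b(\bar c)$ satisfies $\Pr\left[\bar c^{2}\int W_0^{2} - \bar c\,W_0^{2}(1) < b(\bar c)\right] = \varepsilon$. Because the test indexed by $\bar c$ is optimal against the alternative $c = \bar c$, the envelope
power function for this family of point-optimal tests is $\Pi(c) \equiv \pi(c, c)$.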
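The power function of a point-optimal test of this form can be approximated by simulation. The sketch below is an illustration under stated assumptions, not the authors' procedure: it replaces the limiting process $W_c$ by a discretized Gaussian AR(1) with coefficient $1 + c/T$, takes the critical value from simulated draws under $c = 0$, and rejects for small values of $\bar c^2 \int W_c^2 - \bar c W_c^2(1)$. The function names, the sample size `T`, the replication count and the seeds are all hypothetical choices.

```python
import numpy as np

def simulate_stat(c, c_bar, T=200, reps=4000, seed=0):
    """Simulate c_bar^2 * int(W_c^2) - c_bar * W_c(1)^2 on a grid of T steps."""
    rng = np.random.default_rng(seed)
    alpha = 1.0 + c / T
    u = np.zeros(reps)          # u_0 = 0 in every replication
    sum_sq = np.zeros(reps)     # accumulates sum of u_{t-1}^2
    for _ in range(T):
        sum_sq += u ** 2
        u = alpha * u + rng.standard_normal(reps)
    int_w2 = sum_sq / T ** 2    # approximates the integral of W_c^2
    w1_sq = u ** 2 / T          # approximates W_c(1)^2
    return c_bar ** 2 * int_w2 - c_bar * w1_sq

def power(c, c_bar, size=0.05, **kw):
    """Rejection rate at local alternative c of the test tuned to c_bar."""
    null_draws = simulate_stat(0.0, c_bar, seed=1, **kw)
    crit = np.quantile(null_draws, size)      # reject for small values
    alt_draws = simulate_stat(c, c_bar, seed=2, **kw)
    return float(np.mean(alt_draws < crit))

# One point on the simulated envelope: power of the test tuned to c_bar = c.
envelope_at_minus_7 = power(-7.0, -7.0)
```

Evaluating `power(c, c)` over a grid of `c` traces out an approximation to the envelope, while `power(c, c_bar)` with `c_bar` held fixed traces the power curve of a single point-optimal test.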
where $\int W_c^{2}$ ...
CONDITION B: ... the $\Delta d_t$ are bounded with ...
This will automatically be satisfied if the $d_t$ are constant. It will also be
satisfied by a variety of smooth functions of time. These include low frequency
sinusoids (e.g., $d_t = \cos(2\pi kt/T)$ for finite $k$); slowly increasing time trends
(e.g., $d_t = \ln(t)$ or $d_t = t^{\delta}$ for $\delta < 1/2$); and step functions with finitely many
jumps (e.g., $d_t = \beta_0$ when $t < t_0$ and $d_t = \beta_1$ when $t > t_0$). In the slowly evolving
trend case, the random component of $y_t$ dominates the deterministic component
when $T$ is large. It is tempting therefore to ignore the deterministic term when
constructing the test statistic. In the Appendix we show that, if the $d_t$ evolve
(6)   $L^{*} = \min_{\beta} L(\bar\alpha, \beta) - \min_{\beta} L(1, \beta).$
The test statistic is the difference in (weighted) sum of squared residuals from
two constrained GLS regressions, one imposing $\alpha = \bar\alpha$ and the other imposing
$\alpha = 1$.
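(7)   $V_c(t, \bar c) = W_c(t) - t\left[\lambda W_c(1) + 3(1 - \lambda)\int_0^1 s\,W_c(s)\,ds\right], \qquad \lambda = (1 - \bar c)/(1 - \bar c + \bar c^{2}/3).$
... $d_t$ are
unknown and not slowly evolving is more complicated. Suppose, for example, the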
THEOREM 1: Suppose $\{y_t\}$ is generated by the Gaussian model (1) under Condition A. Consider unit-root tests of size $\varepsilon$ under local-to-unity asymptotics where both
$c = T(\alpha - 1)$ and $\bar c = T(\bar\alpha - 1)$ are fixed as $T$ tends to infinity.
a. When $d_t$ is known or satisfies Condition B, the Neyman-Pearson most powerful test against the alternative $c = \bar c$ has asymptotic power function $\pi(c, \bar c)$ defined in
(4). An upper bound to the asymptotic power of any unit-root test is given by the
power envelope $\Pi(c) \equiv \pi(c, c)$.
b. When $d_t = \beta_0$, the most powerful invariant test against the alternative $c = \bar c$ has
asymptotic power function $\pi(c, \bar c)$. The asymptotic power envelope for this family of
point-optimal invariant tests is $\Pi(c)$.
c. When $d_t = \beta_0 + \beta_1 t$, the most powerful invariant test against the alternative
$c = \bar c$ has asymptotic power function
(8)   $\pi^{\tau}(c, \bar c) = \Pr\left[\bar c^{2}\int V_c^{2}(t, \bar c)\,dt + (1 - \bar c)V_c^{2}(1, \bar c) < b^{\tau}(\bar c)\right],$
where $b^{\tau}(\bar c)$ satisfies $\Pr\left[\bar c^{2}\int V_0^{2}(t, \bar c)\,dt + (1 - \bar c)V_0^{2}(1, \bar c) < b^{\tau}(\bar c)\right] = \varepsilon$. An upper
bound to the asymptotic power of any unit-root test invariant to the trend parameters
$\beta_0$ and $\beta_1$ is given by the power envelope $\Pi^{\tau}(c) \equiv \pi^{\tau}(c, c)$.
POINT-OPTIMAL TESTS
Although the point-optimal test statistics defined in (3) and (6) require $\Sigma$ and
$u_0$ to be known, it is possible to construct tests having the same large-sample
properties even in the absence of this knowledge. Furthermore, the asymptotic
theory is valid under less stringent assumptions than those made in Theorem 1.
In this section, we continue to assume that equation (1) describes the data
generating process but we drop Condition A and consider the properties of
some feasible tests under weaker assumptions. For $0 \le s \le 1$, let $[sT]$ be the
greatest integer less than or equal to $sT$ and let $\Rightarrow$ denote weak convergence of
the underlying probability measures as $T$ tends to infinity.
(9)   $P_T = [S(\bar\alpha) - \bar\alpha S(1)]/\hat\omega^{2},$
where $\hat\omega^{2}$ is an estimator for $\omega^{2}$ and ...
Thus the power functions $\pi(c, \bar c)$ and $\pi^{\tau}(c, \bar c)$ derived in Theorem 1 for
point-optimal tests in the Gaussian model with $\Sigma$ known can be attained by the
simple $P_T$ family of statistics under the much weaker assumptions of this
section. This is important, because, in practice, $\Sigma$ will generally contain unknown parameters and there is often no compelling reason to believe that the
data are normally distributed. If the errors are non-normal, tests exploiting the
form of the actual likelihood and possessing power higher than $\Pi(c)$ and $\Pi^{\tau}(c)$
could be constructed. In the absence of such information, quasi-likelihood tests
based on least-squares regressions are likely to be used in practice. The power
bounds derived under normality are still valid when comparing such tests.
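As an illustration of how such a statistic might be computed from least-squares regressions, the sketch below assumes the statistic takes the form $[S(\bar\alpha) - \bar\alpha S(1)]/\hat\omega^2$, where $S(a)$ is the sum of squared residuals from regressing the quasi-differenced data on the quasi-differenced deterministic regressors. That definition of $S(a)$, the helper names, and the treatment of $\hat\omega^2$ as a user-supplied number are assumptions made for the example.

```python
import numpy as np

def quasi_difference(x, a):
    """Return (x_1, x_2 - a*x_1, ..., x_T - a*x_{T-1}); works on columns of 2-D input."""
    x = np.asarray(x, dtype=float)
    out = x.copy()
    out[1:] = x[1:] - a * x[:-1]
    return out

def constrained_ssr(y, z, a):
    """Sum of squared residuals from regressing quasi-differenced y on quasi-differenced z."""
    ya = quasi_difference(y, a)
    za = quasi_difference(z, a)
    beta, *_ = np.linalg.lstsq(za, ya, rcond=None)
    resid = ya - za @ beta
    return float(resid @ resid)

def pt_statistic(y, z, c_bar, omega2):
    """Feasible point-optimal statistic [S(a_bar) - a_bar * S(1)] / omega2 (assumed form)."""
    T = len(y)
    a_bar = 1.0 + c_bar / T
    return (constrained_ssr(y, z, a_bar) - a_bar * constrained_ssr(y, z, 1.0)) / omega2

# Constant-mean case: z_t = 1, with the local alternative indexed by c_bar = -7.
rng = np.random.default_rng(0)
y = 5.0 + np.cumsum(rng.standard_normal(100))
stat = pt_statistic(y, np.ones((100, 1)), c_bar=-7.0, omega2=1.0)
```

Small values of the statistic count as evidence against the unit root; the value of `omega2` would in practice come from one of the long-run variance estimators discussed later in the paper.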
Although our analysis is based on relatively weak assumptions, two interesting
models considered elsewhere in the literature are ruled out. A problem closely
related to ours is to test the null hypothesis that $\{u_t\}$ is an integrated process
against the alternative that it is a strictly stationary process. Under that
alternative, $u_0$ will have a variance proportional to $(1 - \alpha^{2})^{-1}$, a violation of
Condition C. The tests studied in Section 2 are not point optimal under this
specification and the asymptotic power bounds are no longer valid. Our $P_T$
statistics, however, still have simple local-to-unity limiting representations under
AS
(10)
(1993).
A second, closely related approach to modeling unit roots is also ruled out
here. One way to avoid making an assumption about the initial error $u_0$ is to
base the entire statistical analysis on the conditional distribution of the data
given the first observation $y_1$. When $d_t$ is known, there is no difference
asymptotically between our analysis based on the full likelihood and analysis
based on the conditional likelihood. But when $d_t$ is unknown, the point-optimal
invariant test based on the full likelihood is not asymptotically equivalent to the
point-optimal test based on the conditional likelihood. Invariance under the
Estimators $\hat\omega^{2}$ that are consistent under local alternatives and have nonzero
probability limits under fixed alternatives clearly satisfy this condition. Some
examples of such estimators are given in Section 5.
4. SOME ...
$\bar c = T(\bar\alpha - 1)$ for all $T$, we can find that alternative $\bar\alpha(\pi, T, \varepsilon)$ which yields
(approximate) power $\pi$ when using the point optimal test of level $\varepsilon$ with a
sample of size $T$. Then, for $\varepsilon < \pi < 1$, the family of test statistics can be written
the stationary alternative. Suppose, for example, $u_0$ is normal with mean zero
and variance $(1 - \alpha^{2})^{-1}$ and that the $v_t$ are serially uncorrelated with unit
variance. Then $T^{-1/2}u_{[Tt]} \Rightarrow W_c^{*}(t) \equiv W_c(t) + \eta_0 e^{ct}$, where $\eta_0$ is a normal variate, independent of $W_c(\cdot)$, with mean zero and variance $(-2c)^{-1}$. The $P_T$
statistics can then be written as functionals of the $W_c^{*}(t)$ process. Further
analysis of the stationary alternative testing problem can be found in Elliott
and
$P_T(\pi) =$
(We suppress the dependence of $P_T$ on $\varepsilon$.) Although every member of this family
is admissible, past research suggests that values of $\pi$ near one-half often yield
tests whose power functions lie close to the power envelope over a considerable
range. Cf. King (1988).
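The variance $(-2c)^{-1}$ attached to the initial condition under the stationary alternative can be checked with a short calculation. The following is a sketch under the assumptions stated in that passage (unit-variance, serially uncorrelated errors, $\alpha = 1 + c/T$ with $c < 0$); it is a worked step added for clarity rather than part of the original derivation.

```latex
\begin{align*}
1 - \alpha^{2} &= 1 - \left(1 + \frac{c}{T}\right)^{2} = -\frac{2c}{T} - \frac{c^{2}}{T^{2}}, \\
\operatorname{Var}\!\left(T^{-1/2}u_0\right) &= \frac{1}{T\,(1 - \alpha^{2})}
  = \frac{1}{-2c - c^{2}/T} \;\longrightarrow\; \frac{1}{-2c}, \\
\alpha^{[Tt]} &= \left(1 + \frac{c}{T}\right)^{[Tt]} \;\longrightarrow\; e^{ct}.
\end{align*}
```

The contribution of the initial condition to $T^{-1/2}u_{[Tt]}$ is $\alpha^{[Tt]}\,T^{-1/2}u_0$, which therefore behaves like a normal variate with variance $(-2c)^{-1}$ multiplied by $e^{ct}$.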
For the remainder of the paper we restrict attention to the three standard
cases discussed in the literature where d, is either zero, a constant, or a linear
trend. To distinguish the cases, we follow Dickey and Fuller (1979) and use a
superscript $\mu$ when $d_t$ is constant and a superscript $\tau$ when it is a linear trend.
Since commonly used test statistics have distributions not depending on the
parameters determining the $d_t$, we shall also restrict attention to invariant tests.
When there is no deterministic term, our family of $P_T$ tests includes as special
cases many tests previously proposed. Recall that $P_T(\pi)$ has the asymptotic
representation $\bar c^{2}(\pi)\int W_c^{2} - \bar c(\pi)W_c^{2}(1)$, where $\bar c(\pi)$ is a monotonically decreasing function taking the value zero when $\pi$ is equal to $\varepsilon$ (the size of the test) and
tending to minus infinity as $\pi$ approaches one. Sargan and Bhargava (1983)
suggest $S(0)/S(1)$ as a test statistic when the $v_t$ are white noise; asymptotically it
behaves like $\int W_c^{2}$ and corresponds to $P_T(1)$. The locally most powerful test
described by Dufour and King (1991) behaves asymptotically like $W_c^{2}(1)$ and
corresponds to $P_T(\varepsilon)$. The Dickey-Fuller estimator test (based on their statistic
$\hat\rho$) is also a member, since its rejection region is determined, asymptotically, by a
linear combination of $\int W_c^{2}$ and $W_c^{2}(1)$; computations indicate that it has the
same limiting distribution as our $P_T(1 - \varepsilon)$. The Dickey-Fuller $t$ statistic (denoted by $\hat\tau$) is a nonlinear function of $\int W_c^{2}$ and $W_c^{2}(1)$. Nonetheless, computations indicate that the asymptotic power function of their $t$ test is tangent to the
power envelope when power is about one-half and behaves like the $P_T(.5)$ test.
Likewise, the $Z_\alpha$ and $Z_t$ tests examined in Phillips (1987) and Phillips and
Perron (1988) behave like members of the $P_T$ family since they are asymptotically equivalent to the $\hat\rho$ and $\hat\tau$ tests, respectively.
Figure 1 graphs the asymptotic power functions of these tests along with the
power envelope when the tests have size 0.05. These are based on 20,000 Monte
Carlo replications where $W_c$ was approximated by its discrete realization from a
sample of size 500; simulation standard errors are less than 0.0013. The power
envelope is monotonic and equals one-half when $c = -7$. With the exception of
the locally most powerful test which puts all the weight on $W_c^{2}(1)$, all the tests
have power functions very close to the power envelope. Indeed, it is hard to
distinguish them without vastly changing the scale of the figure. Although none
of these tests is uniformly most powerful even asymptotically, our numerical
FIGURE 1.-Asymptotic power functions of selected unit root tests: no deterministic component. Legend: Solid; A: $P_T(1.0)$, Sargan-Bhargava; B: $P_T(.95)$, Dickey-Fuller $\hat\rho$; C: $P_T(.5)$, Dickey-Fuller $\hat\tau$; D: $P_T(.05)$, Locally most powerful.
well.
Legend recovered for Figure 2: Solid; $P_T^{\mu}(.5)$; B: DF-GLS$^{\mu}(.5)$; C: Sargan-Bhargava; D: Dickey-Fuller $\hat\rho^{\mu}$; E: Dickey-Fuller $\hat\tau^{\mu}$.
Things are rather different, however, when $d_t$ contains parameters that have
to be estimated. The Sargan-Bhargava (1983) test for the constant mean case,
Bhargava's (1986) extension for the linear trend case, the Dickey-Fuller estimator tests (based on their statistics $\hat\rho^{\mu}$ and $\hat\rho^{\tau}$), the Dickey-Fuller $t$ tests (based
on their $\hat\tau^{\mu}$ and $\hat\tau^{\tau}$), and the Phillips-Perron $Z$ tests are no longer asymptotically equivalent to members of the $P_T$ family since they employ OLS estimates
of the $\beta$'s instead of constrained local-to-unity estimates. The power functions
for the $P_T^{\mu}(\pi)$ and $P_T^{\tau}(\pi)$ tests remain very close to the relevant power
envelopes $\Pi(c)$ and $\Pi^{\tau}(c)$ for a broad range of $\pi$ values. The power functions
for the tests which use OLS estimates of $\beta$ are well below the power envelopes.
Some results for tests at the 5% level are presented in Figure 2 for the constant
mean case and in Figure 3 for the linear trend case. The envelope power curve
$\Pi^{\tau}(c)$ has the same shape as $\Pi(c)$, but now takes the value one-half when
$c = -13.5$. The power loss of the commonly used tests is particularly dramatic in
the constant mean case. The same pattern is found for tests at the 1% and 10%
significance levels.
FIGURE 2.-Asymptotic power functions of selected unit root tests: constant mean ($z_t = 1$).
A measure of the difference between two tests is Pitman asymptotic relative
efficiency (ARE), defined as the ratio of the values of $c$ at which the tests
achieve a specified power. Evaluating efficiency at power one-half and using 5%
level tests, we find in the constant mean case the ARE's of the Sargan-Bhargava,
$\hat\rho^{\mu}$ and $\hat\tau^{\mu}$ tests relative to the power envelope are, respectively, 1.40, 1.53, and
1.91. Since $c$ is proportional to $T$, this implies that using the Dickey-Fuller $t$ test
instead of the $P_T(.5)$ test is equivalent in large samples to discarding almost half
of the observations. The corresponding ARE's for the linear trend case are 1.07,
1.13, and 1.25.
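The step from an ARE to a share of "discarded" observations is simple arithmetic, sketched below. Because $c = T(\alpha - 1)$, a test that needs a value of $c$ that is ARE times as large to reach the same power needs ARE times as many observations; equivalently, it uses only a fraction 1/ARE of the sample efficiently. The function name `fraction_discarded` is a hypothetical label for this calculation.

```python
def fraction_discarded(are):
    """Share of the sample effectively wasted by a test with the given ARE (> 1)."""
    return 1.0 - 1.0 / are

# AREs quoted in the text for the constant mean and linear trend cases.
constant_mean = {"Sargan-Bhargava": 1.40, "DF rho": 1.53, "DF t": 1.91}
linear_trend = {"Bhargava": 1.07, "DF rho": 1.13, "DF t": 1.25}

for label, table in [("constant mean", constant_mean), ("linear trend", linear_trend)]:
    for name, are in table.items():
        print(f"{label:14s} {name:16s} ARE={are:.2f} discarded={fraction_discarded(are):.1%}")
```

For the Dickey-Fuller $t$ test in the constant mean case the loss is 1 - 1/1.91, a little under one half, which is the figure the text refers to.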
Since the difficulties with the standard tests are associated with inefficient
estimates of the trend parameters, it is reasonable to expect that modified
estimates could improve their performance. Because of their relatively good size
properties found in small-sample Monte Carlo studies (e.g., Schwert (1989)),
natural tests to modify are those based on the Dickey-Fuller $t$ statistics $\hat\tau^{\mu}$ and
$\hat\tau^{\tau}$. Choosing $\bar\alpha$ to be that alternative where maximal power is approximately
one-half, we propose regressing $y_{\bar\alpha}$ on $Z_{\bar\alpha}$ to obtain the estimate $\tilde\beta$. Then one
can perform the usual augmented Dickey-Fuller $t$ test (without deterministic
regressors) using the residual series $y_t^{d} = y_t - \tilde\beta' z_t$ in place of $y_t$. Thus the
modified test statistic (denoted by DF-GLS($\pi$) in the tables and figures) is the $t$
statistic for testing $a_0 = 0$ in the regression
(11)   $\Delta y_t^{d} = a_0 y_{t-1}^{d} + a_1 \Delta y_{t-1}^{d} + \cdots + a_p \Delta y_{t-p}^{d} + \text{error}.$
... statistic when there is no intercept. In the linear trend case, the detrended series
$y_t - \tilde\beta_0 - \tilde\beta_1 t$ plays the role of $y_t^{d}$. It is shown in the Appendix that
$T^{-1/2} y^{d}_{[Ts]} \Rightarrow \omega V_c(s, \bar c)$ when $\bar\alpha = 1 + \bar c/T$ is used for the estimation of $\beta$; the $t$
... integrals.
... power envelope for $.01 < \varepsilon < .10$. Some critical values for this choice of $\bar c$ are
given in Table I. Note that, although the small-sample values are valid only for
Gaussian white noise $\{v_t\}$, the large-sample critical values do not depend
on ... or normality.

TABLE I
CRITICAL VALUES

                                   Level
   T           1%       2.5%       5%       10%
A. Constant Mean: $P_T^{\mu}$, $\bar c = -7$
   50         1.87      2.39      2.97      3.91
   100        1.95      2.47      3.11      4.17
   200        1.91      2.47      3.17      4.33
   $\infty$   1.99      2.55      3.26      4.48
B. Linear Trend: $P_T^{\tau}$, $\bar c = -13.5$
   50         4.22      4.94      5.72      6.77
   100        4.26      4.90      5.64      6.79
   200        4.05      4.83      5.66      6.86
   $\infty$   3.96      4.78      5.62      6.89
C. Linear Trend: DF-GLS$^{\tau}$, $\bar c = -13.5$
   50        -3.77     -3.46     -3.19     -2.89
   100       -3.58     -3.29     -3.03     -2.74
   200       -3.46     -3.18     -2.93     -2.64
   $\infty$  -3.48     -3.15     -2.89     -2.57

FIGURE 3.-Asymptotic power functions of selected unit root tests: linear trend ($z_t = (1, t)'$). Legend recovered: power envelope; $P_T^{\tau}$; B: DF-GLS$^{\tau}(.5)$; C: Bhargava; D: Dickey-Fuller $\hat\rho^{\tau}$; E: Dickey-Fuller $\hat\tau^{\tau}$.
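The modified Dickey-Fuller procedure described above can be sketched in a few lines. This is an illustration under stated assumptions rather than a reference implementation: it uses $\bar c = -7$ for the constant mean case and $\bar c = -13.5$ for the linear trend case, takes the lag length `p` as given by the user, and the function names `gls_detrend` and `df_gls_stat` are hypothetical.

```python
import numpy as np

def gls_detrend(y, trend=False):
    """Remove the mean (or linear trend) using local-to-unity GLS estimates."""
    y = np.asarray(y, dtype=float)
    T = y.size
    c_bar = -13.5 if trend else -7.0
    a_bar = 1.0 + c_bar / T
    z = np.ones((T, 1))
    if trend:
        z = np.column_stack([z, np.arange(1, T + 1)])
    # Quasi-difference the data and regressors; the first observation is kept as is.
    ya = np.concatenate([y[:1], y[1:] - a_bar * y[:-1]])
    za = np.vstack([z[:1], z[1:] - a_bar * z[:-1]])
    beta, *_ = np.linalg.lstsq(za, ya, rcond=None)
    return y - z @ beta

def df_gls_stat(y, p=1, trend=False):
    """t statistic on the lagged level in an ADF regression without deterministic terms."""
    yd = gls_detrend(y, trend)
    dy = np.diff(yd)                       # dy[i] = yd[i+1] - yd[i]
    idx = np.arange(p, dy.size)
    # Regressors: lagged level yd[i], then p lagged differences dy[i-1], ..., dy[i-p].
    X = np.column_stack([yd[idx]] + [dy[idx - j] for j in range(1, p + 1)])
    target = dy[idx]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    s2 = resid @ resid / (len(target) - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)
    return float(coef[0] / np.sqrt(cov[0, 0]))

rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(200))
stat_trend = df_gls_stat(walk, p=2, trend=True)   # compare with the Table I critical values
```

In the linear trend case the statistic would be compared with the DF-GLS critical values in Table I; a lag length chosen by an information criterion, as in the Monte Carlo design, could replace the fixed `p`.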
5. FINITE SAMPLE PERFORMANCE
A Monte Carlo experiment was conducted to see how well the asymptotic
... based ...
I. MA(1): ($\theta = .8, .5, 0, -.5, -.8$),
II. AR(1): ($\phi = .5, -.5$),
III. GARCH MA(1): ($\theta = .5, 0, -.5$).
In each of these models the initial condition was $u_0 = 0$. Although the null
distributions of the test statistics considered here are invariant to the initial
condition, small-sample power typically depends on $u_0$. This dependence is
investigated by considering a variant of the first model where the $\{u_t\}$ are strictly
stationary under the alternative hypothesis. That is, $u_0$ is normal with mean zero
and variance equal to $(1 + \theta^{2} - 2\theta\alpha)/(1 - \alpha^{2})$, $\alpha \neq 1$. This design violates our
Condition C and is intended to shed light on the importance of that assumption.
The autocorrelation structure of $\{v_t\}$ was assumed to be unknown to the
(13)   $\hat\omega^{2}_{AR} = \hat\sigma^{2}_{\eta} \Big/ \left(1 - \sum_{i=1}^{p} \hat a_i\right)^{2},$
where
(14)   $\Delta y_t = a_0 y_{t-1} + a_1 \Delta y_{t-1} + \cdots + a_p \Delta y_{t-p} + a_{p+1} + \eta_t.$
Two choices of lag length were employed: the AR(8) estimator used $p = 8$ and
the AR(BIC) estimator used $p$ chosen by the Schwarz (1978) Bayesian information criterion constrained so $3 \le p \le 8$. The SC estimators are given by
$\hat\omega^{2}_{SC} = \sum_{m=-l_T}^{l_T} K(m/l_T)\,\hat\gamma(m),$
where $K(\cdot)$ is the Parzen kernel, $\hat\gamma(m) = T^{-1}\sum_{t} e_t e_{t+m}$, and $e_t$ is the residual
from an OLS regression of $y_t$ on $(y_{t-1}, z_t)$. Two variants were employed: SC(12)
using $l_T = 12$ and SC(auto) using Andrews' (1991) optimal automatic procedure
(his equations (6.2) and (6.4)).
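The two families of estimators for $\omega^2$ described here can be sketched as follows. The code assumes the standard Parzen kernel and the usual autoregressive form $\hat\sigma^2/(1 - \sum_i \hat a_i)^2$; since the displayed formulas in this copy are damaged, both should be read as assumptions, and the function names are hypothetical. The automatic bandwidth rule of Andrews (1991) is not implemented.

```python
import numpy as np

def parzen(x):
    """Parzen kernel: 1 - 6x^2 + 6|x|^3 on [0, 1/2], 2(1 - |x|)^3 on (1/2, 1], else 0."""
    x = abs(x)
    if x <= 0.5:
        return 1.0 - 6.0 * x ** 2 + 6.0 * x ** 3
    if x <= 1.0:
        return 2.0 * (1.0 - x) ** 3
    return 0.0

def autocov(e, m):
    """Sample autocovariance T^{-1} * sum_t e_t e_{t+|m|} of the residuals e."""
    e = np.asarray(e, dtype=float)
    m = abs(m)
    return float(e[: e.size - m] @ e[m:]) / e.size

def sc_estimator(e, l):
    """Sum-of-covariances estimator with Parzen weights and truncation lag l."""
    return sum(parzen(m / l) * autocov(e, m) for m in range(-l, l + 1))

def ar_estimator(sigma2, a_hat):
    """Autoregressive estimator sigma2 / (1 - sum of lag coefficients)^2 (assumed form)."""
    return sigma2 / (1.0 - float(np.sum(a_hat))) ** 2

# Example with a fixed truncation lag of 12, as in the SC(12) variant.
rng = np.random.default_rng(0)
resid = rng.standard_normal(100)
omega2_sc = sc_estimator(resid, 12)
```

Here `sigma2` and `a_hat` would come from an OLS fit of the autoregression in first differences, with only the coefficients on the lagged differences entering the sum.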
The results are summarized in Table II for a constant mean and in Table III
for a linear trend. Tests were at the 5% asymptotic significance level and the
sample size $T$ was 100. For $\alpha = 1$, the tables report the observed rejection rates
from 5000 Monte Carlo replications when critical values were based on the
limiting distributions. For $\alpha < 1$, the tables report size-adjusted power; this is the
rejection rate when critical values are estimated from the $\alpha = 1$ Monte Carlo
trials.
The results suggest three conclusions. First, the predicted superiority of the
tests using local-to-unity estimates of the mean and trend parameters is borne
out by the Monte Carlo study. The $P_T$ and modified Dickey-Fuller tests have
higher size-adjusted power than the standard Dickey-Fuller $t$ test for almost all
of the data generating processes and all choices of $\hat\omega^{2}$. The improvement is
largest in the constant mean case. Although the observed power curves tend to
be somewhat below the asymptotic power curves, the results are generally
consistent with the predictions of the asymptotic theory. The main exception is
the poor performance of the point-optimal tests using SC estimates of $\omega^{2}$ when
the MA parameter $\theta$ is large.
Second, the choice of estimator for $\omega^{2}$ has a large effect on the size of the $P_T$
tests, with the AR estimator exhibiting much smaller distortions than the SC
estimator. This mirrors similar results found for other unit-root statistics; see,
for example, DeJong et al. (1992) and Perron (1996). The AR(8) and AR(BIC)
tests have moderate size distortion except in the MA model with large $\theta$. The
modified Dickey-Fuller tests have notably smaller size distortions than those
based on $P_T$. In addition, the tests based on the AR(BIC) estimator have better
size-adjusted power than those based on the AR(8) estimator, which typically
estimates more nuisance parameters. Other experiments not reported in Tables
II or III indicate that the AR(BIC) tests also dominate the ones based on the
AR(4) estimator for $\omega^{2}$. Lag length selection based on sequential likelihood
ratio statistics was also tried; no general improvement over AR(BIC) was found,
although the LR selector appears to improve the size-adjusted power of the
modified Dickey-Fuller test relative to BIC in the linear trend case, at least for
small values of $\theta$.
Third, the powers of the $P_T$ and modified Dickey-Fuller tests deteriorate
substantially when the $u_t$ are stationary. Even so, in the linear trend case with
Asymptotic
Power
Slalistic
Pi$)
AR(8)
1.00
.95
.90
.80
.70
.05
.32
.'t6
1.00
1.00
Pi$)
1.00 .os
AR(BIC) .95 .32
.90 .76
.80 1.00
.70 1.00
1.00
.05
.95
.32
.90
Pr(J)
SC(auto)
.80
r.00
.70
1.00
l.oo
.95
.90
.80
..10
A(.5) 1.00
AR(8)
.95
.90
.80
.70
DF-GLS
P(.5) 1.00
AR(BIC)
.95
.90
.80
.,to
DF-GLS
1.00
.95
.90
.80
.'lo
.os
.32
.76
1.00
1.00
.05
.32
..15
1.00
1.00
0.18
0.31
0.47
0.56
o.t4
0.24
0.50
0.82
0.92
0.o2
0.29
0.64
0.96
1.00
0.04
0.30
0.67
0.97
1.00
0.05
0.21
0.42
0.68
0.80
1.00
0.10
0.26
0.56
0.87
0.96
.05
.12
.31
.85
l 00
0.08
0.11
0.23
0.55
0.76
.05
.32
.'t5
1.00
TEsrs oF THE
CoNsTANT
Mrar (2,:
MAo), A-
0.18
AUTOREGRESSIVE
1),
0J-
REsuLTs
GARCH MA(l),
- 0.5
0.20
o.17
0.29
0.46
0.53
0.21
0.18
0.30
0.48
0.s6
0.11
0.27
0.57
0.89
0.97
0.10
0.27
0.56
0.88
0.96
0.13
0.26
0.s4
0.86
0.95
0.8
0.20
0.18
0.31
0.48
0.65
0.97
1.00
0.04
0.30
0.68
0.98
1.00
0.06
o.23
0.43
0.10
0.28
0.59
0.91
0.98
0.84 0.91
0.98
0.08
0.59
0.92
0.98
0.06
0.10
0.22
0.56
0.7e
0.06
0.10
0.20
0.46
0.67
.99
1.oo
.05
0.i3
.95
.10
0.18
0.3?
.90
.27
0.36
0.69
.80
.81
0.65
0.83
.10
.99
o.82
0.10
o.tl
0.36
0.69
0.86
1.00
.05
.10
.90
.27
.80
.81
.'10
.99
0.00
0.10
0.25
0.65
0.87
0.00
0.11
0.26
0.69
0.91
0.57
.95
1.oo
.05
.10
.90
.27
.80
.81
.'70
.99
0.01
o.12
0.30
o.7'1
0.9'1
0.01
0.11
0.30
0.79
0.97
0.00
0.11
0.27
0.69
0.93
0.26
.95
1.00
.05
,10
0.04
0.08
0.16
0.30
0.41
0.0s
0.09
0.7't
0.31
0.42
0.05
0.09
0.7't
0.33
0.45
0.04 0.09
0.10 0.11
o.20 0.25
0.40 0.s3
0.56 0.68
0.0s
0.09
0.15
0.28
0.37
0.05
.95
0.11
0.11
0.23
0.53
0.75
0.08
0.10
0.23
0.57
0.80
0.07
0.10
0.24
0.61
0.84
0.11 0.58
0.11 0.12
0.28 0.2't
0.72 0.70
0.94 0.91
0.06
0.10
0.22
0.48
0.69
0.07
0.10
0.09
0.16
0.36
0.5'1
0.07
0.08
0.74
0.36
0.s8
0.05
0.08
0.14
0.30
0.48
0.06
0.68
0.64
0.68
0.59
o.97
0.95
0.99
0.98
0.80
1.00
0.74
0.o7
0.34
0.31
0.31
0.73
0.04
0.30
0.68
0.69
0.99
o.97
1.00
0.06
0.87
0.82
,,10
0.51
0.7't
o.gz
0.s1
0.28
0.74
o.li
0.46
0.08
0.28
0.62
0.95
1.00
0.05
0.11
0.2s
0.65
0.89
0.02
o.17
0.36
0.63
0.76
Pi$)
AR(8)
Pi(.s\
0.10
AR(BIC)
0.17
Pi$)
0.07
ss(12)
0.17
0.40
0.'13
0.85
0.70
0.04
0.17
0.39
0.44
0.98
0.98
0.80
1.00
0.'12
1.00
1.00
0.8s
0.91
0.06
o.22
0.43
0.69
o.82
0.06 0.07
0.22 0.23
0.44 0.4'1
0.71 0.78
0.83 0.90
0.06
0.14
o.25
0,40
0.46
0.09
0.26
o.57
0.90
0.98
0.08 0.1 I
0.27 0.26
0.59 0.61
0.92 0.95
0.98 1.00
0.08
o.17
o.37
0.66
o.79
0.80
0.07
0.10
o.23
0.54
o.77
0.06 0.08
0.06
0.06
Pi(s)
0.06
Sc(auto)
0.19
DF-GLS'(.5)
0.06
AR(8)
.
0.14
0.25
0.40
DF-GLS'(.5)
0.07
AR(BIC)
0.16
0.37
0.68
0.10 0.13
0.11
0.r1
0.23 0.29
0.24
0.24
0.58 0.73
0.57
0.60
0.82 0.93
0.80
0.84
replications,
Power
.95
.05
.10
1.00
.90
.80
.81
.'70
.99
1.00
.05
.95
.10
.90
.27
.80
.81
.70
.99
1.00
.05
.9s
.09
.90
.19
.80
.61
.'70
.94
0.5
0.5
0.16
0.35
0.68
0.84
0.11
0.26
0.61
0.71
0.12
0.32
0.86
0.99
0.09
0.18
0.3s
0.48
GARCH MA(l),
Staiionary MA(l),
-0J
0.19
0.14
o.23
o.37
0.46
0.o 05
0.19 0.13
0.15 0.15
0.25 0.25
0.40 0.42
0.48 0.51
-0J
0.18
0.13
0.23
0.38
0.46
0"0
0J
0.18
0.13
0.24
0.39
0.48
0.14
0.13
0.11
0.17
0.34
0.66
0.83
0.08
0.16
0.34
0.68
0.84
0.06
0.17
0.37
0.73
0.89
0.10
0.14
0.29
0.61
0.78
0.07
0.14
0.30
0.63
0.81
0.05
0.00
0.11
0.27
0.68
0.89
0.03
0.12
0.29
0.74
0.93
0.77
0.11
0.20
0.31
0.25
0.00
0.09
0.21
0.51
0.73
0.03
0.10
0.25
0.64
0.85
0.77
0.01
0.11
0.30
0;76
o.97
0.05
0.11
0.30
0.81
0.98
0.51
0.11
0.30
0.80
0.97
0.01
0.10
0.23
0.62
0.87
0.04
0.10
0.25
0.70
0.93
0.49
0.05
0.08
0.15
0.31
0.43
0.05
0.09
0.16
0.32
0.45
0.0s
0.09
0.18
0.38
0.53
0.05
0.08
0.13
0.22
0.30
0.05
0.08
0.13
0.23
0.31
0.04
0.08
0.10
0.23
0.56
0.78
0.06
0.10
0.24
0.59
0.82
0.11
0.11
0.26
0.69
0.91
0.08
0.09
0.19
0.46
0.67
0.07
0.09
0.19
0.49
0.71
0.11
0.07
0.08
0.14
o.34
0.55
0.06
0.08
0.14
0.37
0.60
0.09
0.09
0.18
0.48
0.78
0.o7
0.08
0.15
0.36
0.58
0.05
0.08
0.15
0.39
0.64
0.09
0.25
0.42
0.50
0.14
0.32
0.68
0.86
0.09
0.16
0.25
0.20
0.10
0.24
0.63
0.82
0.09
0.15
0.26
0.33
0.47
DF-z"
AR(BrC)
Notes: For each statistic, entries in the first row are the empirical rejection rate under the null (the size). The remaining entries are
size-adjusted power under the model described in the column heading. The column "Asymptotic Power" is the local-to-unity
asymptotic power for each statistic. The entry below the name of each statistic ... (see Section 5). For the results
in the final three columns, $u_0$ was drawn from its stationary distribution.
Based on 5000 Monte Carlo replications.
0.05
.81
0.30
0.66
o.tz
0.10
0.16
0.37
0.60
0.76
.2't
.80
0.08
0.24
.90
0.13
0.03
0.31
0.18
0.16
0.26
0.41
0.49
AR(l), d 05 -oJ
0.18 0.14 0.13 0.21 0.17
0.16 0.16 0.13 0.15 0.15
0.26 0.2't 0.25 0.2s O.24
0.41 0.44 0.45 0.38 0.38
0.s0 0.53 0.42 0.45 0.46
0.25
0.20
o.29
0.18
MA(l).0:
-
0.42
0.33
0.99
Asymptotic
Tesl
Statistic
0.8
0.16
0.15
0.26
0.40
0.48
0.0
0.29
0.70
0.27
0- Srationary
0.5 - 0.5 0.0
05
0.29
829
100
":
ARo), 6
-o5
0.02
UNIT ROOT
TABLEIII
sslrcreo
LevuTEsrs,
0.8
STOCK
0.10
0.25
0.63
0.88
0.08
0.15
0.42
0.69
0.09
0.21
0.54
0.76
0.09
0.18
0.52
0.81
830
AUTOREGRESSIVE
$\theta = 0$, the size-adjusted powers of the tests using local detrending exceed that of
DF-$\hat\tau^{\tau}$. In the constant mean case, the size-adjusted powers of tests using local
demeaning exceed that of DF-$\hat\tau^{\mu}$ for close but not distant alternatives. The
gains from employing local-to-unity estimates of the intercept appear to depend
crucially on the assumption that, under both the null and the alternative
let
0. If the elements of the $T \times 1$ vector $z$ and of the $T \times T$ matrix $A$ are bounded in absolute value, then,
(AD
Pnoor:
nt
s- l - f
- *)z:
T-@
/(
hence,
lll<KT.
>- | _ V ) zl
lxt (
)f
llOll
O,
A2:
I
(l)
if c : T(a
posiriue and,
- l)
'u',>'il
is
*,
fued as T -
,-a'l
St.,
O.
under Condition
rs
P^
,)ll.
'u',u
O.
Since
<f(l) <M;
T-
94720, U.S.A.,
Let $\gamma(k)$ be the autocovariance function and $f(\lambda)$ the spectral density function for the stationary
process $\{v_t\}$ satisfying Condition A. The $rs$ element of the $T \times T$ covariance matrix $\Sigma$ is $\gamma(r - s) = \int_{-\pi}^{\pi} e^{i(r-s)\lambda} f(\lambda)\,d\lambda$. We shall approximate $\Sigma^{-1}$ by the $T \times T$ matrix $\Psi$ with $rs$ element $\psi(r - s) = \int_{-\pi}^{\pi} e^{i(r-s)\lambda}[4\pi^{2} f(\lambda)]^{-1}\,d\lambda$. The $rs$ element of $D = \Sigma^{-1} - \Psi$
is given by Davies (1973) and Dzhaparidze (1985) as
tim r-rx'(.5-r
T)@
lD'Al<KTt/zllDll, and
D Dia,,l.2Ll-y?)lk
D lp(ilt.-.
j=-@
r-ls:l
k-t
under ConditionA,
accurate, these tests are essentially optimal among tests based on second-order
sample moments and should perform considerably better than tests which
employ OLS estimates of the parameters determining $d_t$. Our Monte Carlo
results suggest that the Dickey-Fuller $t$ test applied to a locally demeaned or
detrended time series, using a data-dependent lag length selection procedure,
has the best overall performance in terms of small-sample size and power.
The numerical finding that, as a practical matter, the asymptotic power
functions of the $P_T(.5)$ and the modified Dickey-Fuller $t$ tests effectively lie on
the Gaussian power envelope indicates that, in large samples, there is little
room for improvement under the stochastic specification made here. Of course,
if the errors have a known non-normal distribution or if the initial error $u_0$ is
large compared to $v_t$, better tests could be constructed. Furthermore, the Monte
Carlo evidence suggests that autocorrelation in the $v_t$ can have very substantial
effects in small samples. Nevertheless, it appears that, when parameters in the
deterministic component of a series have to be estimated, the proposed tests for
a unit root dominate those currently in common use.
final
TT
tlDll:
LEMMA A1: Let $\Sigma$ and $\Psi$ be $T \times T$ Toeplitz matrices formed from $\gamma(k)$ and $\psi(k)$, the Fourier
coefficients of $2\pi f(\lambda)$ and $[2\pi f(\lambda)]^{-1}$, respectively. Let $x$ be a $T \times 1$ vector such that $\lim_{T\to\infty} T^{-1}|x| =$
1992;
B:lbijl,let
=LLlf:lbijl,<
squares regressions.
llBll
(A1)
6. CONCLUSIONS
The $P_T$ and modified Dickey-Fuller $t$ statistics are easily computed from least
and
Kennedy School of Government, Harvard University, 79 John F. Kennedy
Cambridge, MA 02138, U.S.A.
831
hypotheses, only the early observations are informative about that parameter.
cA
UNIT ROOT
'i
:i
a7(k)
T_K T
=T-' L L o,,"o,*0," :
T-tlRAl-
0 as 7+ o. Define
T-K
T-2(7 + cT-r)k
,:1 s:1
a7(k)-("2'-l-2c)/(2c)z
when
ar(k)
E (t + cr-r;2{'-"(T - k - r)'
c*0
and
a7(k)+l/2when c:0,
832
s.=E|=l_r*fl(k)-o2
Since
T-ztt[A,(>-srr)A]:2L
and
AUTOREGRESSIVE
rr=L[*r_r+tp(k)
y(k)lar(&) -a7(0)l -
t-1
T-l
T-2trlA, (V - rrl) A) : 2 | p(k)ta7(k)
_ a1(0)l
k:1
0,
where
T-2
0.
1A
o)-
>A
and B, then
(A4)
(As)
plimT-2 d'- 1 Z-
where
d_,
(0, db
..
.,
dr_
-, :
I-
plim
and Atl
d,_ r
(db d2
S- r 4a
dr,
A, A]
0.
dr_ rl
g,
p(o)r-
,o?
,f
T-1
o?-
[d,d,*o-d,_1d,_111,1
-T-r
ad,\padl
- r)2
- a d'
-,
>-
tA
ZA,
E-
- t < T-
e2ktr ( E ) 12 (
E-
- e
which.implies that plimT-2d'-t>-1u_t:0. For the second part of (A5), note that Au_ul
cT-tu-, so it suffices toverifoihat uur(T-rd,_r2-rul:T-_rd,_1i-t_l]b.;;"li,T-tad,ad
+ 0 and lD'Al37r/z"tctllDll imply T-t ad,(2-t _ v)u_r 5 0 ,in""
)l
d.
rl2
T-
ld,1l,
d, _
rp
u_
rl < T-
o) - a-2lQr(d) - Qr\)l
t<T
k-_@
p&)l L
o,
where $x_T$ is the last column of $X'$. Under Condition A, $T^{-1/2}u_{[Ts]} \Rightarrow \omega W_c(s)$. By the continuous
mapping theorem and the Ito calculus,
|
/2
xE + hO, ilw,o)
Io'
n tt, e) + eh(s,
wherelr(s,c-)isthevectorconsistingofthe4-lfunctions
e))w,( s) ds
:I
h?)ldw,
- ew,l,
h,(s,e),H(s,d) isthevectoroffirst
derivatives dh/s,e)/ds, and the time index is dropped in the final term for notational convenience.
Since Iim T- tZ'Z is block diagonal, we find that Qy(a) + u? + o2Q(d) vrhere
orrrroo -rw))
o@ =Uh.aaw, -ew,)f'
Un<etn,<ttf-'11
Setting a to one, the same argument shows that QrG) - Qr1) * azlQc) - O(0)1.
The argument now follows the proof of Lemma A2. Because the elements 2,i of Z
polynomials in t, for all k and for all (i, j) pairs except i:i:1,
the terms
arc
T-k
2,,i21
a,il
rrZ'Z)
o.
mryl,)
t<T
o.
ana
+0and o2T-txtE-txaG(e).similarly,T-t(x'>x-.2x'x)+0,so?-llRxl2-0,whereR
G10)
k_
QY
and
o- T-
EIT- d'- | 2-
G) -
,,
T-k
+2\ oG)T-, |
k-l
,-1
in (5).
T-2d,_t>-rd_1<r(>-t)T-2dt_i_t<r(D-t)max,.rdlyT-0.
I-I Ad,(r-t - ild_t:0. But, defining do =b, wi have
lzT-t Ad,pd_i:7-rld,vd- d,_tpd_r- Ad,1q.Adl
I
is given
uo
oY
!ZG'Z)-'Z't
(,{6)
Z,
and
pnoor: Define the qxq diagonal matrix Nr: diag(j.1/2,7,r-1,...,7,-n-') and the f x4
matru Z:ZuNr. Then, setting 6: Au-dT-tu-1:uf (c-e)T-tu-, we have QrG):
column of
plim T- | Ad, E- 1 u _,
dr
...,
Qy
2At
uo:(upuz-du1,...tu7-au7-11
(A8)
Leuva A3: If
t
Q7k) : u'oZolZ|Z"l- Z2u,,
LEMMA A4: Suppose the data are generated by (1) under Condition A and $d_t = \beta' z_t$, where
$z_t' = (1, t, t^{2}, \ldots, t^{q-1})$. Then $Q_T(\bar\alpha) - Q_T(1)$ has a limiting distribution when $c = T(\alpha - 1)$ and $\bar c = T(\bar\alpha - 1)$
are fixed as $T$ tends to infinity. Furthermore,
(A3)
Define
(A7)
T-1
So:7-rtz*'rr-t -.-211u !
O,
tr[E(s+sl)]
Definin_g
x'b) 3
(A11)
the chi-square vaiate Xz(u)=(e,rl-tu)2/e\I-re1, we find that 1*,2-tE)z7xtE-txT- tZ'>- tZ is block diagonal, we have
0. Since lim
OYG): a-2er(d)
- .-ru? + oo0).
+ x2(u)
on d,
2T-tu'-, Au=T-11u2
Au'
Aul:f-tul.-
(B2)
e2
T-
Lt
-, E-
-, -
ZeT-
a2T- (d'_ t
2-
d_
i2
d'_
| tt'
-I
E-' Au -
e2
y(0).
W,
/u)
converges in probability
(l) - ll.
u_
r)
2eT-
t(
Ad,
>-
Recalling that
L| :
Ii :
B)
y;l 2-,
minrL(-u, B)
T-
a:
>- | Au + Ad, E-
u_
u,o
E-
uo
- e{
(a).
(B3)
(T-2u'-p
u'
- Et
- 1,7-
2eT-
u' _ t
uzr,grG) - ereD
I4- tt
=e2
Iw,'-aw"z(7)
+ g(o)
- ,r([w,r,w,2(t),eG)- O(0)),
- eG) + c.
For polynomial trend, $L^*$ is a function of $c$, $\bar c$ and $q$; it does not depend on $\Sigma$ at all. When $q = 1$ so
$z_t = 1$, $Q(\bar c) = 0$ and the result is the same as in part (a).
(c) When $y_t = \beta_0 + \beta_1 t + u_t$, the function $h(s,\bar c)$ in (A10) is given by $(1 - \bar c s)$, so $\int h(s,\bar c)^2 = 1 - \bar c + \bar c^2/3$
and $\int h(s,\bar c)(dW_c - \bar c W_c) = (1 - \bar c)W_c(1) + \bar c^2 \int s W_c(s)$. After considerable algebraic manipulation,
we find the following alternative expressions for (B4):
(B5)
t -e : r, I r:
(1,
e)w,2 O)
e2
/3),
azPr -- s(d)
: izT-
u'
,'ouo
- Qrk),
s(1)(1 + er- t)
- ru - t
Or( d).
P,
a-'.'le'lw,'
- a(e)]
-ew,'<t>+ o(o)
= Lx
o2.
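The two integrals quoted in part (c) for the linear-trend function $h(s,\bar c) = 1 - \bar c s$ can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the path is a scaled Gaussian random walk standing in for $W_c$ (the second identity is an integration-by-parts identity and holds for any such path), and $\bar c = -13.5$ is used purely as an example value.

```python
import math
import random


def h(s, c_bar):
    """Linear-trend function h(s, c_bar) = 1 - c_bar * s."""
    return 1.0 - c_bar * s


def integral_h_squared(c_bar, n=10_000):
    """Midpoint-rule approximation of the integral of h^2 over [0, 1]."""
    step = 1.0 / n
    return sum(h((i + 0.5) * step, c_bar) ** 2 for i in range(n)) * step


def closed_form_h_squared(c_bar):
    """Closed form 1 - c_bar + c_bar^2 / 3 stated in the text."""
    return 1.0 - c_bar + c_bar ** 2 / 3.0


def simulate_path(n, seed=0):
    """Scaled Gaussian random walk on n steps (a stand-in for W_c)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n)
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + scale * rng.gauss(0.0, 1.0))
    return path


def both_sides(c_bar, path):
    """Left-endpoint sums for both sides of the second identity.

    lhs approximates  int h (dW - c_bar W ds),
    rhs approximates  (1 - c_bar) W(1) + c_bar^2 int s W ds.
    """
    n = len(path) - 1
    step = 1.0 / n
    lhs = 0.0
    int_s_w = 0.0
    for i in range(n):
        s = i * step
        d_w = path[i + 1] - path[i]
        lhs += h(s, c_bar) * (d_w - c_bar * path[i] * step)
        int_s_w += s * path[i] * step
    rhs = (1.0 - c_bar) * path[-1] + c_bar ** 2 * int_s_w
    return lhs, rhs


if __name__ == "__main__":
    c_bar = -13.5
    print(integral_h_squared(c_bar), closed_form_h_squared(c_bar))
    print(both_sides(c_bar, simulate_path(2000)))
```

With left-endpoint sums the two sides of the second identity differ by a discretization term of order $1/n$, so the agreement tightens as the grid is refined.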
PROOF OF THEOREM 3: It suffices to show that $\hat\omega^2 P_T \to_p 0$ when $T \to \infty$ with $|\alpha| < 1$ and fixed.
Since the initial condition is asymptotically negligible, $\{u_t\}$ behaves like a stationary process. Cf.
Anderson (1971, Section 5.5.2). Thus $T^{-2}u_{-1}'u_{-1} \to_p 0$ and $T^{-1}u_T^2 \to_p 0$. From (A9), T-t/2XE:
0, so both
+ ?-1c-, we find
where Q is defined in (A10). Thus, from (A8) and the argument leading to (B2),
(B4)
(B6)
Z- | Au + ey e)
- Oy G).
For the polynomial trend of Lemma A4, we have from the continuous mapping theorem,
c2
d _ L + d, _ |
to the statistic on the left of (B2). But, from Lemma A3, these terms converge in probability to zero
under Condition B, so the limiting distribution is unchanged.
(b) From standard GLS projection theory and (A7)
min L(o,
(B7)
>-
I : o - d)/(l -
The limiting results (A10) and (B3) follow from the fact that $T^{-1/2}u_{[Ts]} \Rightarrow \omega W_c(s)$ and that
$T^{-1}v'v \to_p \gamma(0)$. Since these limits also follow from Condition C, we have
ftrs,
elw,2
where
$\gamma(0) + o_p(1)$
4(s,e) is the limiting representation of the standardized detrended series 7-rl2o r)1tr,. By Lemma A4, OLS estimates of $\beta$ in the polynomial trend case are asymptotically equivalent to GLS estimates. Thus the same interpretation holds when detrending is done with OLS.
The statistical theory underlying Theorem 1 can be found in Lehmann (1959, Chapter 6). The
limiting representations for the test statistics are derived as follows:
(a) Under Condition A, $T^{-1/2}u_{[Ts]} \Rightarrow \omega W_c(s)$ and hence $\omega^{-2}(T^{-2}u_{-1}'u_{-1}, T^{-1}u_T^2) \Rightarrow$
- 0 - e + e, fil-
tf<r
-.)w,(1) *
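The proof of Theorem 3 relies on the scaled moments $T^{-2}u_{-1}'u_{-1}$ and $T^{-1}u_T^2$ vanishing when $|\alpha| < 1$ is held fixed. A small Monte Carlo sketch can illustrate this. It assumes Gaussian white-noise innovations with unit variance and a zero initial condition $u_0 = 0$, which is a special case of the conditions in the paper; the function names are hypothetical.

```python
import random


def simulate_ar1(alpha, T, seed=0):
    """Simulate u_t = alpha * u_{t-1} + v_t with u_0 = 0 and v_t ~ N(0, 1).

    Returns the list [u_0, u_1, ..., u_T].
    """
    rng = random.Random(seed)
    u = [0.0]
    for _ in range(T):
        u.append(alpha * u[-1] + rng.gauss(0.0, 1.0))
    return u


def scaled_moments(alpha, T, seed=0):
    """Return (T^{-2} * sum_{t=1}^{T} u_{t-1}^2, T^{-1} * u_T^2)."""
    u = simulate_ar1(alpha, T, seed)
    sum_sq_lag = sum(x * x for x in u[:-1])
    return sum_sq_lag / T ** 2, u[-1] ** 2 / T


if __name__ == "__main__":
    # Stationary case: both quantities shrink as T grows.
    # Unit-root case (alpha = 1): the first quantity stays of order one.
    for T in (100, 1_000, 10_000):
        print(T, scaled_moments(0.5, T), scaled_moments(1.0, T))
```

For a stationary series the sum of squares grows like $T$, so dividing by $T^2$ sends it to zero, whereas under a unit root it grows like $T^2$ and the scaled sum has a nondegenerate limit.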
REFERENCES
ANDERSON, T. W. (1971): The Statistical Analysis of Time Series. New York: Wiley.
ANDREWS, D. W. K. (1991): "Heteroskedasticity and Autocorrelation Consistent Covariance Matrix
Estimation," Econometrica, 59, 817-858.
BANERJEE, A., J. J. DOLADO, J. W. GALBRAITH, AND D. F. HENDRY (1993): Co-integration, Error
Correction, and the Econometric Analysis of Non-stationary Data. Oxford: Oxford University Press.
BHARGAVA, A. (1986): "On the Theory of Testing for Unit Roots in Observed Time Series," Review
of Economic Studies, 53, 369-384.
CHAN, N. H., AND C. Z. WEI (1987): "Asymptotic Inference for Nearly Nonstationary AR(1)
Processes," Annals of Statistics, 15, 1050-1063.
DAVIES, R. B. (1973): "Asymptotic Inference in Stationary Gaussian Time-Series," Advances in
Applied Probability, 5, 469-497.
DEJONG, D. N., J. C. NANKERVIS, N. E. SAVIN, AND C. H. WHITEMAN (1992): "The Power Problems
of Unit Root Tests in Time Series with Autoregressive Errors," Journal of Econometrics, 53,
323-343.
r,
l rw,<rlf'
DICKEY, D. A.,
cient in Linear Regressions with Stationary or Nonstationary AR(1) Errors," Journal of Econometrics, 47, 115-143.
DZHAPARIDZE, K. (1985): Parameter Estimation and Hypothesis Testing in Spectral Analysis of Stationary Time Series. New York: Springer-Verlag.
ELLIOTT, G. (1993): "Efficient Tests for a Unit Root when the Initial Observation is Drawn from its
Unconditional Distribution," unpublished manuscript.
ENGLE, R. F. (1984): "Wald, Likelihood Ratio, and Lagrange Multiplier Tests in Econometrics," in
Handbook of Econometrics, Vol. II, ed. by Z. Griliches and M. Intriligator. New York: North
Holland.
SAIKKONEN, P., AND R. LUUKKONEN (1993): "Point Optimal Tests for Testing the Order of
Differencing in ARIMA Models," Econometric Theory, 9, 343-362.
SARGAN, J. D., AND A. BHARGAVA (1983): "Testing Residuals from Least Squares Regression for
Being Generated by the Gaussian Random Walk," Econometrica, 51, 153-174.
SCHWARZ, G. (1978): "Estimating the Dimension of a Model," Annals of Statistics, 6, 461-464.
SCHWERT, G. W. (1989): "Tests for Unit Roots: A Monte Carlo Investigation," Journal of Business
and Economic Statistics, 7, 147-159.
STOCK, J. H. (1994): "Unit Roots and Trend Breaks in Econometrics," in Handbook of Econometrics,
Vol. 4, ed. by R. F. Engle and D. McFadden. New York: North Holland, pp. 2740-2841.
ZYGMUND, A. (1968): Trigonometric Series, Vol. 1. Cambridge: Cambridge University Press.
ECONOMETRICA
JOURNAL OF THE ECONOMETRIC SOCIETY
GRAHAM ELLIOTT, THOMAS J. ROTHENBERG, AND JAMES H. STOCK: Efficient
Tests for an Autoregressive Unit Root
VOL. 64, NO. 4 (July, 1996)