REFERENCES
Linked references are available on JSTOR for this article:
http://www.jstor.org/stable/2336336?seq=1&cid=pdf-reference#references_tab_contents
Biometrika Trust, Oxford University Press are collaborating with JSTOR to digitize, preserve and
extend access to Biometrika
This content downloaded from 200.131.225.130 on Mon, 03 Apr 2017 11:21:25 UTC
All use subject to http://about.jstor.org/terms
Biometrika (1985), 72, 1, pp. 67-90
Printed in Great Britain
BY RICHARD L. SMITH
SUMMARY
Some key words: Extreme value theory; Maximum likelihood; Nonregular estimation; Stable distribution;
Weibull distribution.
1. INTRODUCTION
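(i) Three-parameter Weibull,
$f(x; \theta, \alpha, \beta) = \alpha\beta (x-\theta)^{\alpha-1} \exp\{-\beta (x-\theta)^{\alpha}\}$ $(\theta < x < \infty,\ \alpha > 0,\ \beta > 0)$. (1.2)
(ii) Three-parameter gamma,
$f(x; \theta, \alpha, \beta) = \beta^{\alpha} (x-\theta)^{\alpha-1} \exp\{-\beta (x-\theta)\}/\Gamma(\alpha)$ $(\theta < x < \infty,\ \alpha > 0,\ \beta > 0)$. (1.3)
(iv) Three-parameter log gamma, which arises when $\log(X - \theta + 1)$ has a gamma
distribution; thus
$f(x; \theta, \alpha, \beta) = \beta^{\alpha} \{\log(x-\theta+1)\}^{\alpha-1} (x-\theta+1)^{-\beta-1}/\Gamma(\alpha)$ $(\theta < x < \infty,\ \alpha > 0,\ \beta > 0)$. (1.5)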
Our results also cover certain instances of the Box-Cox transformation family but we
shall not consider these explicitly.
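As an illustrative aside, the densities (1.2), (1.3) and (1.5) can be coded directly. The following Python sketch is not part of the original paper; the function names and the evaluation points are assumptions chosen for illustration. It checks numerically that each density, divided by $(x-\theta)^{\alpha-1}$, approaches a finite positive value as $x \downarrow \theta$, and that the log gamma density agrees with a change of variable applied to the gamma density.

```python
import math


def weibull_pdf(x, theta, alpha, beta):
    """Three-parameter Weibull density, as in (1.2)."""
    if x <= theta:
        return 0.0
    z = x - theta
    return alpha * beta * z ** (alpha - 1) * math.exp(-beta * z ** alpha)


def gamma_pdf(x, theta, alpha, beta):
    """Three-parameter gamma density, as in (1.3)."""
    if x <= theta:
        return 0.0
    z = x - theta
    return beta ** alpha * z ** (alpha - 1) * math.exp(-beta * z) / math.gamma(alpha)


def log_gamma_pdf(x, theta, alpha, beta):
    """Three-parameter log gamma density, as in (1.5)."""
    if x <= theta:
        return 0.0
    z = x - theta + 1.0
    return beta ** alpha * math.log(z) ** (alpha - 1) * z ** (-beta - 1) / math.gamma(alpha)


def ratio_near_endpoint(pdf, alpha, beta, z=1e-8):
    """f(theta + z) / z^(alpha - 1) for small z, with theta = 0 (hypothetical helper)."""
    return pdf(z, 0.0, alpha, beta) / z ** (alpha - 1)


# Illustrative parameter values (assumptions, not taken from the paper).
alpha, beta = 1.5, 2.0
weibull_limit = ratio_near_endpoint(weibull_pdf, alpha, beta)      # close to alpha * beta
gamma_limit = ratio_near_endpoint(gamma_pdf, alpha, beta)          # close to beta^alpha / Gamma(alpha)
log_gamma_limit = ratio_near_endpoint(log_gamma_pdf, alpha, beta)  # same limit as the gamma case
```

In all three cases the density behaves like a constant multiple of $(x-\theta)^{\alpha-1}$ near the endpoint, which is the feature the examples share.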
Of these four examples, the Weibull has been studied the most intensively. For all
four, when $\alpha < 1$ the density tends to infinity as $x \downarrow \theta$ so that, unless the range $\alpha < 1$ is
excluded, the likelihood function always tends to infinity along some path in the
likelihood space as $\theta$ tends to the sample minimum. Therefore it is necessary to
distinguish between global and local maxima of the likelihood. In this paper, by the
maximum likelihood estimator we shall always mean a local maximum, thus satisfying
the likelihood equations. A second point, which applies to all four examples but may not
be true in general, is that for $\alpha < 1$ the density is J-shaped, so there cannot exist a
maximum likelihood estimator for which $\alpha < 1$. In particular, if the true value of $\alpha$ is less
than 1, maximum likelihood estimators either do not exist at all or are inconsistent.
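The unboundedness of the likelihood can be seen numerically. The sketch below is an illustration only: the data values, the parameter choices and the function name are assumptions, and the three-parameter Weibull is used as the example. With the shape parameter fixed at 0.5, the log likelihood grows without bound as the location parameter approaches the sample minimum from below; with the shape parameter fixed at 2, it decreases instead.

```python
import math


def weibull_loglik(xs, theta, alpha, beta):
    """Log likelihood of the three-parameter Weibull density; needs theta < min(xs)."""
    if theta >= min(xs):
        raise ValueError("theta must lie strictly below the sample minimum")
    total = 0.0
    for x in xs:
        z = x - theta
        total += (math.log(alpha) + math.log(beta)
                  + (alpha - 1.0) * math.log(z) - beta * z ** alpha)
    return total


# Hypothetical data, for illustration only.
sample = [1.3, 1.7, 2.2, 2.9, 3.4]
x_min = min(sample)

# Distances between theta and the sample minimum, shrinking towards zero.
gaps = [1e-2, 1e-4, 1e-8, 1e-12]

# Shape below 1: the term (alpha - 1) * log(x_min - theta) tends to +infinity.
values_small_shape = [weibull_loglik(sample, x_min - g, 0.5, 1.0) for g in gaps]

# Shape above 1: the same term tends to -infinity.
values_large_shape = [weibull_loglik(sample, x_min - g, 2.0, 1.0) for g in gaps]
```

This is why a maximum likelihood estimator has to be understood as a local maximum of the likelihood rather than a global one.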
Harter & Moore (1965) described an iterative procedure for finding maximum
likelihood estimators for the Weibull and gamma distributions with possibly censored
data. In cases where maximum likelihood estimators do not exist, they proposed an ad
hoc modification based on treating the smallest observation as if it were censored.
Rockette, Antle & Klimko (1974) showed for the Weibull distribution that, if a local
maximum of the likelihood function exists, then there is a second solution of the
likelihood equations which is a saddlepoint. Their result shows that, in finding maximum
likelihood estimators one must take care to ensure that a solution of the likelihood
equations really is a local maximum. Other relevant references are Cohen (1965) and
Lemon (1975). The Weibull distribution may be reparameterized as the generalized
extreme value distribution (Jenkinson, 1955) for which modern algorithms are available
(NERC, 1975; Prescott & Walden, 1980, 1983).
Thus there is a fair-sized literature on finding maximum likelihood estimators of the
Weibull distribution, but their asymptotic properties are largely unexplored. It is easily
checked that, when $\alpha > 2$, the Fisher information matrix is finite, and it is widely
assumed that the classical properties hold in this case. For $\alpha < 2$ the Fisher information
for $\theta$ is infinite, so the classical results are certainly not valid.
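A heuristic way to see where the threshold at 2 comes from is the following. Near the endpoint the density behaves like a multiple of $z^{\alpha-1}$, with $z = x - \theta$, and the derivative of the log density with respect to $\theta$ behaves like a multiple of $1/z$, so the contribution to the information near the endpoint is governed by the integral of $z^{\alpha-3}$. The Python sketch below is an illustration with constants dropped, not a computation from the paper, and the function name is hypothetical. It evaluates that integral over $[\epsilon, 1]$ in closed form.

```python
import math


def truncated_information_integral(alpha, eps):
    """Integral of z^(alpha - 3) over [eps, 1], evaluated in closed form."""
    if alpha == 2.0:
        return -math.log(eps)  # logarithmic divergence as eps -> 0
    return (1.0 - eps ** (alpha - 2.0)) / (alpha - 2.0)


# Shrinking the lower limit: the value stabilises only when alpha > 2.
for alpha in (3.0, 2.0, 1.5):
    values = [truncated_information_integral(alpha, eps) for eps in (1e-3, 1e-6, 1e-12)]
    print(alpha, values)
```

For a shape parameter of 3 the values settle near 1, for 2 they grow like $\log(1/\epsilon)$, and for 1.5 they grow like $\epsilon^{-1/2}$.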
In this paper it is confirmed that the classical results hold when $\alpha > 2$, and the case
$\alpha \le 2$ is studied in detail. The surprising result is that, in this case, estimation of $\theta$ and
of the other parameters, denoted by $\phi$, in general a vector, are asymptotically independent:
each of the maximum likelihood estimators of $\theta$ and $\phi$ has the same asymptotic
distribution when the other is unknown as when the other is known, and we are also able
to show that these asymptotic distributions are independent. For $1 < \alpha < 2$ we prove the
existence of a consistent sequence of maximum likelihood estimators as the sample size
tends to infinity, while for $\alpha < 1$ no consistent maximum likelihood estimators exist. We
also propose efficient alternatives to maximum likelihood which, in particular, cover the
cases where maximum likelihood estimators do not exist.
Cheng & Amin (1983) have also studied the asymptotic properties of maximum
likelihood estimators for the Weibull and gamma cases, though their Theorem 2 is less
extensive than our results in §§ 3 and 4 and their proofs are unpublished. On the other
hand they introduce a new estimator, the 'maximum product of spacings' estimator,
which deserves to be studied further. Johnson & Haskell (1983) prove consistency
of the maximum likelihood estimator for the three-parameter Weibull with $\alpha > 1$, and
present Monte Carlo results which indicate that, even in the 'regular' case $\alpha > 2$, the
asymptotic normality of the estimators is approached only slowly.
Our main results require a long list of assumptions, which are stated in § 2. The
remainder of the paper is organized so that statements of the main results appear at the
beginning of each section, and may be understood without reference to the proofs.
Results of a purely technical nature are stated as lemmas and contained within the
proofs of the theorems, though Lemmas 6 and 7 may be of independent interest.
2. ASSUMPTIONS
Assumption 1. All second-order partial derivatives of $g(x; \phi)$ exist and are continuous
in $0 < x < \infty$, $\phi \in \Phi$. Moreover $c(\phi) = \alpha^{-1} \lim g(x; \phi)$ as $x \downarrow 0$ exists, is positive and finite
for each $\phi$, and is twice continuously differentiable as a function of $\phi$.
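Assumption 3. For some fixed $\delta^* \ge 0$ and for each $\eta > 0$, $\phi_1 \in \Phi$, we have
$\log g(x - \varepsilon; \phi) - \log g(x - \varepsilon_1; \phi_1) < H_\eta(x; \phi_1)$,
whenever $|\varepsilon| < \eta$, $|\varepsilon_1| < \eta$, $|\phi - \phi_1| < \eta$, $x - \varepsilon > \delta^*$, $x - \varepsilon_1 > \delta^*$ and the function $H_\eta$
satisfies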
100
Assumption 4. There exists a fixed increasing sequence of compact subsets {Km, m > 1}
of RP+l and a fixed constant 6' such that Um Km = DR x (D, and, for 0,0, 04, 4O satisfyin
0 < 00o-'4)40)0o c (D,(0, 0) 0 Km, we have
log f (x; f, 0) - log f (x; f0o, #)) < Ho (x; f0o, 4)) (x > f0o),
where
(00
~i'x(4)) + {xoc(4) -x(4O)} log x + log g(x-e; 4) -log g(x; 4)) < HIJ(x; 4),)v
where
('00
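Assumption 6. If $E_\phi$ denotes expectation with respect to $f(\cdot\,; 0, \phi)$ then for each
$\phi \in \Phi$ $(i, j = 1, \ldots, p)$:
(a)
$E_\phi\{(\partial/\partial\phi_i) \log f(X; 0, \phi)\} = 0$,
$E_\phi\{(\partial/\partial\phi_i) \log f(X; 0, \phi)\,(\partial/\partial\phi_j) \log f(X; 0, \phi)\} = -E_\phi\{(\partial^2/\partial\phi_i\,\partial\phi_j) \log f(X; 0, \phi)\}$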
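$= m_{i0}(\phi) = m_{0i}(\phi)$;
(c) if $\alpha > 2$, then
Assumption 7. If $h(x; \phi)$ is any of $(\partial^2/\partial x\,\partial\phi_i) \log g(x; \phi)$ or $(\partial^2/\partial\phi_i\,\partial\phi_j) \log g(x; \phi)$, then,
as $\theta \to \theta_0$, $\phi \to \phi_0$,
$E_0 |h(X - \theta; \phi) - h(X - \theta_0; \phi_0)| \to 0$,
where $E_0$ is with respect to $f(\cdot\,; \theta_0, \phi_0)$. If $\alpha(\phi_0) > 2$, we require the same of
$h(x; \phi) = (\partial^2/\partial x^2) \log g(x; \phi)$.
Assumption 8. For each $\varepsilon > 0$, $\delta > 0$, there exists a function $h_{\varepsilon,\delta}$ such that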
We now make some remarks on the assumptions. The key assumptions are Assumptions
1 and 6; the rest are there for technical reasons. Assumptions 2-5 are similar to, but
necessarily more complicated than, the classical assumptions of Wald (1949). In
Assumption 3 we may have $\delta^* = 0$ or $\delta^* > 0$; in the Weibull case, Assumption 3 is true
with $\delta^* > 0$ for all $\alpha$ but with $\delta^* = 0$ only for $\alpha > 1$. The distinction is reflected in the
statement of Theorem 2 below. Assumptions 7 and 8 are needed for Lemma 4 in § 3, and
Assumption 9 is taken from Woodroofe (1972, 1974). Assumption 1 could be weakened to
$g(x; \phi)$ slowly varying at $x = 0$ for each $\phi$, but at the cost of an increase in technical
detail.
For all our examples, Assumptions 1-6 and 9 are straightforward, if somewhat
tedious, to check. For examples (ii)-(iv), $\log g$ and its partial derivatives are bounded in $x$
for each $\phi$, so Assumptions 7 and 8 are easy to check as well. For the Weibull distribution
we have
$\log g(x; \alpha, \beta) = \log \alpha + \log \beta - \beta x^{\alpha}$,
a a2t*~R R- nnm*~R- t M
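and all $(\alpha, \beta)$ satisfying $\alpha_1 < \alpha < \alpha_2$, $|\beta - \beta_0| < \delta$. Let $x_\varepsilon = (\varepsilon/K)^{1/\alpha_1}$ which we assume to be
$< 1$. Then
$|(\partial^2/\partial x^2) \log g(x; \alpha, \beta)| < \varepsilon/x^2 + h_{\varepsilon,\delta}(x; \alpha_0, \beta_0)$
where
$$h_{\varepsilon,\delta}(x; \alpha_0, \beta_0) = \begin{cases} 0 & (x \le x_\varepsilon), \\ K x^{\alpha_1 - 2} & (x_\varepsilon < x \le 1), \\ K x^{\alpha_2 - 2} & (x > 1). \end{cases}$$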
From now on we shall let $\theta_0$, $\phi_0$ denote the true values of $\theta$ and $\phi$. The letters $\alpha$ and $c$,
unless indicated otherwise, will always denote $\alpha(\phi_0)$ and $c(\phi_0)$. Let $X_1, \ldots, X_n$ denote a
sample of independent observations from the common density $f(\cdot\,; \theta_0, \phi_0)$. Let $L_n$ denote
the log likelihood divided by $n$, that is
The order statistics will be denoted $X_{n,1} < \ldots < X_{n,n}$; note that $L_n(\theta, \phi)$ is defined only
for $\theta < X_{n,1}$.
The maximum likelihood estimator, when it exists, will be denoted by $(\hat\theta_n, \hat\phi_n)$ and
satisfies
$\partial L_n(\hat\theta_n, \hat\phi_n)/\partial\theta = 0$, $\partial L_n(\hat\theta_n, \hat\phi_n)/\partial\phi_i = 0$ $(i = 1, \ldots, p)$.
For the special case when $\theta = \theta_0$ is known, let $\bar\phi_n \equiv \bar\phi_n(\theta_0)$ denote the maximum
likelihood estimator for $\phi$, satisfying $(\partial L_n/\partial\phi_i)(\theta_0, \bar\phi_n) = 0$. The existence and consistency
of $\bar\phi_n$ follow from the classical results for regular estimation problems.
Similarly, let $\bar\theta_n \equiv \bar\theta_n(\phi_0)$ denote the maximum likelihood estimator for $\theta$ when $\phi = \phi_0$
is known. The asymptotic properties of $\bar\theta_n$ are given by the results in § 1. In particular,
$\bar\theta_n$ exists and is consistent when $\alpha > 1$.
Define $d_{n,\beta}$ $(n \ge 1)$ to be 1 if $\beta > 1$, $\log n$ if $\beta = 1$ and $n^{1/\beta - 1}$ if $0 < \beta < 1$, and write
$Y_n <_p r_n$, for random variables $\{Y_n\}$ and positive constants $\{r_n\}$, if
THEOREM 1. Assume Assumptions 1 and 6-8 are satisfied, and that $\alpha > 1$. Suppose that
Theorem 1 is a result about the local behaviour of the likelihood function near the true
parameter values. There is no guarantee that the local maximum, whose existence is
guaranteed by Theorem 1 when $\alpha > 1$, is unique, and we know already that the local
maximum is not a global maximum. The following result goes part of the way to settling
the uniqueness of the estimator. It is an analogue for this problem of Wald's (1949) result
on the consistency of the maximum likelihood estimator in regular estimation problems.
We remark that the main interest in this result lies in the case $\alpha > 1$, when (i) and (ii)
both hold. The theorem then shows that the region on which $X_{n,1} - \theta$ is exponentially
small and $\alpha(\phi) < 1$ is asymptotically the only region where the likelihood function is
badly behaved. When $\alpha < 1$, for all our four examples the log likelihood is J-shaped,
which shows at once that there can be no consistent maximum likelihood estimator.
Note that we have not settled the question of whether there is a unique local
maximum of the likelihood function. The proof of Theorem 1 makes it clear that the
Hessian of the log likelihood is negative-definite on a small neighbourhood depending on
$n$ around $(\theta_0, \phi_0)$, and results of Mäkeläinen, Schmidt & Styan (1981) then show that
there is a unique local maximum on this neighbourhood. The question of global
uniqueness is much harder to resolve.
We start the proofs with several technical lemmas. In Lemmas 1-3, the only
assumptions made are Assumption 1 and (2.1). Convergence of random variables is
always convergence in distribution unless stated; $\to_p$ means convergence in probability.
LEMMA 1. Let $X_1, \ldots, X_n$ be independent from $f(\cdot\,; \theta_0, \phi_0)$ and let
$S_{n,m} = \sum_{k=1}^{n} (X_k - \theta_0)^{-m}$ $(m > 0)$.
Then $a_n^{-1}(S_{n,m} - b_n) \to W$ for some $a_n$, $b_n$ and $W$ having a stable distribution with index
min (2, x/m). We may take an = (ndn,pim). If m = o then also
LEMMA 2. Let $X_1, \ldots, X_n$ be as in Lemma 1, ordered by $X_{n,1} < \ldots < X_{n,n}$. Define
$S^*_{n,m} = \sum_{k=2}^{n} (X_{n,k} - X_{n,1})^{-m}$ $(m > 0)$.
$(X_{n,k} - X_{n,1})^{-m} - (X_{n,k} - \theta_0)^{-m} < (X_{n,k} - \theta_0)^{-m}\, h\{(X_{n,1} - \theta_0)/(X_{n,k} - \theta_0)\}$,
where $h(t) = m t (1 - t)^{-m-1}$ for $0 < t < 1$. Note that $h(t)$ is increasing in $t$ and tends to 0 as
$t \to 0$. Therefore
n
s*-m - S* (X,k-00) -m h{(Xn,1 -o)/(Xn,k-O)} (3*2)
k=2
Part (i) follows by splitting the sum in (3 2) into two parts, using the monotonicity of h
and (X., k-OO)-MSn*,*m -p O for any fixed k. Part (ii) is similar but easier. O
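The elementary inequality used in this proof, $(a-b)^{-m} - a^{-m} < a^{-m} h(b/a)$ for $0 < b < a$ with $h(t) = m t (1-t)^{-m-1}$, can be checked numerically. The Python sketch below is an illustration only; the function names and the grid of test values are assumptions. Here $a$ plays the role of $X_{n,k} - \theta_0$ and $b$ the role of $X_{n,1} - \theta_0$.

```python
def h(t, m):
    """h(t) = m * t * (1 - t)^(-m - 1), defined for 0 < t < 1."""
    return m * t * (1.0 - t) ** (-m - 1.0)


def power_gap(a, b, m):
    """(a - b)^(-m) - a^(-m) for 0 < b < a."""
    return (a - b) ** (-m) - a ** (-m)


def bound(a, b, m):
    """Upper bound a^(-m) * h(b / a) from the mean value theorem."""
    return a ** (-m) * h(b / a, m)


# Check the inequality on a small grid of hypothetical values.
violations = []
for m in (0.5, 1.0, 2.0, 3.5):
    for a in (0.1, 1.0, 7.0):
        for t in (0.01, 0.1, 0.5, 0.9, 0.99):
            b = t * a
            if not power_gap(a, b, m) < bound(a, b, m):
                violations.append((m, a, b))
```

The bound follows from the mean value theorem applied to $(1-t)^{-m}$, whose derivative $m(1-t)^{-m-1}$ is increasing in $t$.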
For the results which follow, we need some new notation. Suppose $\{Y_n(\lambda_n), n \ge 1\}$ is a
random sequence indexed by $\lambda_n \in \Lambda_n$, and $\{r_n, n \ge 1\}$ is a sequence of positive constants.
We shall say that $Y_n(\lambda_n) <_p r_n$ uniformly in $\Lambda_n$ if the relation (3.1) holds uniformly over
$\lambda_n \in \Lambda_n$. Similarly, we say that $Y_n(\lambda_n) \to_p c$ uniformly in $\Lambda_n$ if convergence in probability
holds uniformly over $\Lambda_n$.
Define
LEMMA 3. Given positive sequences {5,n}, {bn}, the following relations hold uniformly
0on satisfying 0On < Xn, 1- bnI 0| - On I < 6n
(i) I Yn')(O,,)- Y'((0o) | <p max {(nlog I/6n)', n- log n al};
(ii) I Yn(')(On) - Yn )(00) I <p max {(nu,) , n 6, n
(iii) | Yn)(O,) - Yn (()0) <n max {n bn n n21a1, I
n1 E {(Xn,k-On)'-(Xn,k-O0)'}-
k=2
The first term is <p (nbn)'- and the second is <p n1ll- 1. For the th
Hence this third term is <, 5n __. The result follows by putting together these three
bounds. DH
II (i) Suppose $\alpha > 1$. Then $-(\partial^2/\partial\theta\,\partial\phi_i) L_n(\theta, \phi) \to_p m_{0i}(\phi_0)$ for $i = 1, \ldots, p$, uniformly
over
0 I| <nlXn, la, 1 0 001 < bn, 0J < Xn, 1 -n 16n*
II (ii) Suppose $\alpha = 1$. Then $(\partial^2/\partial\theta\,\partial\phi_i) L_n(\theta, \phi) <_p \log n$, for $i = 1, \ldots, p$, uniformly
nY
? y(3) (0 ) + 3,(O)?Xn,'kZ
n-f h (nkf '/))
k= 1
00, 40)
uniformly over $|\theta - \theta_0| < \delta$, $|\phi - \phi_0| < \delta$, where the latter term is $O_p(1)$, by
Assumption 8. The result then follows from Lemmas 1 and 3(iii).
(iii) The first part again follows by a combination of Assumption 8 and Lemma 3(iii),
together with Lemma 1. For the second part, note that $-(\partial^2 L_n/\partial\theta^2)$ is bounded
below, up to an error of at most $O_p(1)$, by
We shall not give the proofs of the remaining parts of Lemma 4. They follow by
arguments similar to those already given, using Assumptions 1, 6 and 7 and the relevant
parts of Lemmas 1 and 3.
Proof of Theorem 1. The case $\alpha > 2$ is straightforward, so that we do this first. Let $\{\delta_n\}$
be any sequence such that n' n -+ 0, n2 (5b -+ oo and define for t E R, y E RP,
$f_n(t, y) = \delta_n^{-2} L_n(\theta_0 + \delta_n t, \phi_0 + \delta_n y)$.
By expanding $\partial f_n/\partial t$ and $\partial f_n/\partial y_j$ as far as the second term and using the results of Lemma
4, we deduce that
where (0*, O*) = A(0, 0) + (1 -iA) (0o, On) for some A between 0 and 1. Thus we may writ
Let 0* satisfy
For the moment, we assume On* exists. Comparing (3 6) with (3 5), we see
The right-hand side is Op(n- 2) while the left is of the form (0-00) (52 La02) (0, v) for
Under the same conditions, we have also n4 en*(On /) ?n 0 For 1 <o c < 2,
and it follows that n"'l en(OX, /)) -+p 0 and n2 e*(On, On)4) - 0 along any sequence (
such that n'l"(0 - -n) +P andn < n'-a.
We are ready for the final step. Suppose $\alpha = 2$. For $t \in \mathbb{R}$, $y \in \mathbb{R}^p$, define
fn(t,y) = I"L + tn 2 2 + yn
Expanding as far as the second term, using (3.7), (3.8) and Lemma 4J(iii), we may show
that $t\,\partial f_n/\partial t + \sum_i y_i\,\partial f_n/\partial y_i$ is strictly negative over $t^2 + |y|^2 = \delta^2$, with probability tending
to 1 as $n \to \infty$. Again Lemma 5 may be applied, and we conclude that $f_n$ has a local
maximum satisfying $t^2 + |y|^2 < \delta^2$, for any $\delta > 0$, with probability tending to 1. Hence
$L_n$ has a local maximum at $(\hat\theta_n, \hat\phi_n)$ satisfying
Proof of Theorem 2. (i) This follows Wald's classical proof very closely; see also Walker
(1969) where similar results are needed as a preliminary to establishing asymptotic
normality of the Bayes estimator in a regular case. There are three steps.
First, for fixed (01, 01) E U there exists b(01, 01) > 0 such that
where the supremum is over all (0, 0) such that 0-01 I < , I k-01 1 < . This is
by bounding Ln(0, ) -Ln(01, 01), using Assumptions 1 and 3 the first step above.
Thirdly, let $K_m$ be a compact subset of $\mathbb{R} \times \Phi$, as in Assumption 4. Extending the result
of the second step above, we have
for some $\eta_m > 0$. This is because $K_m$ can be covered by a finite number of open
neighbourhoods of points $(\theta_1, \phi_1)$. But now Assumption 4 allows us to drop the
restriction to $K_m$, provided $m$ is sufficiently large.
(ii) First we note that, if 0 = 00 is assumed known then, given 6 > 0, there exists 4 > 0
such that
lim pr{ sup L(0O,0) < L(0O,00)-4} = 1 (3.10)
n -oo < 10-: 1<1-I > a
This may be proved by imitating the arguments used to prove (i), using Assumptions 2
and 3 with E = el = 0, and Assumption 5.
Let K' be a compact subset of (F, as in Assumption 5. For
LO(0, 0)-LO(0o, /1) = n {X(+)-1} Ik{lOg (Xn,k -0) -log (Xn, k -Oo)}
+n {oIC()-ok 1) } k log (Xn, k -O)
+n' Xk{log (Xn, k-0; ,) )-log (Xn, k-O, 1)} * (3d11
But if kk-k11 < ij,b <, iv
n Ik{log (Xn, k-O0; 4)-log (Xn, k-o; 0 1?) } < nfl kH(Xn, k fO; 1)
Note that it is essential that Assumption 3 holds with $\delta^* = 0$ for this step.
Define an event 4'n,,(O, b) to hold if and only if
n7 I{xC() - I} Ek{log (Xn,k-0) -log (Xn,k -o)} < e{oC(4) + I}
We claim that, for any E > 0, it is possible to choose 6 sufficiently small so that
lim pr {gn~(, E(fb(l) for all (0, 0b) E V,n} = 1.F (3d 13)
n -o~
To prove this, fix (0, 0) E V,, and consider two cases: (a) oc() > 1, (b) X,,1 < 0-n-Y. We
use the inequality
is negative when 0 > 00, and is bounded by n- 1 6{x(4)- 1} Ik(Xfl k-Oo) - when 0 < 00.
Since E{(X - 0o) -} < oo, we may choose 6 sufficiently small, independently of 0
so that
lim pr[n'16{x(4)-1} Xk(Xn,k-00)' > e{X(4)+1}] = O0
n -~ oo
In case (b),
n
and the last term converges in probability by Lemma 2. Putting the results for (a) and
(b) together, we have (3.13).
We also have E{llog(Xn,k-0o)i} < oo and may make I o(o)-cx(o1)I arbitrarily s
by choosing q sufficiently small. Combining this observation with (3 10), (3-12) and (3
choosing E in (3d13) so that e{ (4)? 1} < 4/4 on the range I 0-01 l < , we have
lim pr {sup Ln( 0) < Ln(00 v0o)- /4} = 1, (3d14)
n-oo
where the supremum is now taken over all (0, 0) such that
(0, 0 1) E- Vn, 0 E- Kmn 01 c- K'5 1 0-01 I < il,
for fixed $\phi_1$. This result may immediately be extended to any finite set of values of $\phi_1$
and hence by compactness to the whole of $K_m$. Thus (3.14) holds if the supremum is taken
over all $(\theta, \phi)$ such that $(\theta, \phi) \in V_n$ and $\phi \in K'_m$.
Now consider the case (0, 0) E Vn, 0 ? Km. Taking e = i' in (3 13), &l = 00 in (31 1), we
have with probability tending to one that
4. ASYMPTOTIC DISTRIBUTIONS
We are now in a position to state our main results about the asymptotic distributions
of $\hat\theta_n$ and $\hat\phi_n$.
THEOREM 3. Under the assumptions of Theorem 1 let $(\hat\theta_n, \hat\phi_n)$ denote a sequence of
maximum likelihood estimators satisfying the conclusions of Theorem 1.
(i) If $\alpha > 2$ then $n^{1/2}(\hat\theta_n - \theta_0, \hat\phi_n - \phi_0)$ converges in distribution to a normal random vector
with mean 0 and covariance matrix $M^{-1}$, where $M$ is as in Theorem 1(i).
(ii) If $\alpha = 2$ then $\{(nc \log n)^{1/2}(\hat\theta_n - \theta_0), n^{1/2}(\hat\phi_n - \phi_0)\}$ converges in distribution to a normal
random vector with covariance matrix of form
$$\begin{pmatrix} 1 & 0 \\ 0 & M^{-1} \end{pmatrix},$$
where $M$ is as in Theorem 1(ii).
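To make the change of normalization concrete, the Python sketch below tabulates the factor by which the error in the location estimator is multiplied in each regime. It is an illustration only: the function name is hypothetical, the rates for the cases with shape parameter above 2 and equal to 2 are those of parts (i) and (ii), and the rate $(cn)^{1/\alpha}$ for shape between 1 and 2 is an assumption read from the normalization used in the proofs later in this section.

```python
import math


def theta_rate(n, alpha, c=1.0):
    """Normalizing factor for the location estimator (hypothetical helper).

    alpha > 2      : n^(1/2)
    alpha == 2     : (n c log n)^(1/2)
    1 < alpha < 2  : (c n)^(1/alpha)   (assumed from the later proofs)
    """
    if alpha > 2.0:
        return math.sqrt(n)
    if alpha == 2.0:
        return math.sqrt(n * c * math.log(n))
    if 1.0 < alpha < 2.0:
        return (c * n) ** (1.0 / alpha)
    raise ValueError("no consistent maximum likelihood estimator is claimed for alpha <= 1")


# Illustrative comparison at a hypothetical sample size.
n = 10_000
rates = {alpha: theta_rate(n, alpha) for alpha in (3.0, 2.0, 1.5, 1.25)}
```

The smaller the shape parameter, the faster the location parameter can be estimated, which is the reverse of what the classical root-$n$ theory would suggest.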
Note that Woodroofe's definition of H is long-winded, but because we do not know any
way to simplify it, we refer to Woodroofe's paper for the definition.
We also have the following Corollary.
COROLLARY. If $\alpha = 2$ then
The corollary shows that the variance of the estimators may be estimated asymptotically
by means of the observed information, in a case where the expected information
does not exist. In regular estimation problems, Efron & Hinkley (1978) argued that the
observed information is superior to the expected information as an estimator of variance,
but their argument depends on second-order approximations and conditional arguments.
It may therefore be of some interest that we have an example in which the
superiority of observed information is very easily demonstrated. Our argument,
however, applies only to the specific case $\alpha = 2$ and therefore is of only slight practical
significance.
Proofs. We require two preliminary lemmas and a remark. Suppose {(Xk, Yk), k > 1} is
a sequence of independent identically distributed random variables and let (Sn, Tn)
denote the sum of (Xk, Yk) (k = 1, ..., n).
LEMMA 7. Suppose X1
density f satisfies f (
Proof. The case where g is the identity or a linear function is dealt with by Chow &
Teugels (1978), and our method closely follows theirs. It suffices to show that
/ 0 ZoanU n
(1 + f
But it follows from the standard proof of the central limit theorem that
00
ranu
n ff(x)dx -- u'.
ranu- anu
n Jf[exp {itg(x)/ In
<L{ t2 2(X
as n -[ oo.
Remark 1. Lemma 6 extends the result of Resnick & Greenwood (1979), Theorem 3,
that if $S_n/a_n \to Z_1$, $T_n/b_n \to Z_2$ with $Z_1$ normal, $Z_2$ stable with index less than 2, then
$(S_n/a_n, T_n/b_n) \to (Z_1, Z_2)$, with $Z_1$, $Z_2$ necessarily independent. The key point in the proof
is the observation that the limit $(Z_1, Z_2)$ must be infinitely divisible and therefore the
sum of independent Gaussian and Lévy components. Our remark is that the same result
holds if $(Z_1, Z_2)$ arises as the limit of renormalized row sums of a triangular array subject
to the usual asymptotic negligibility condition. That is, if $Z_1$ is normal and $Z_2$ has an
infinitely divisible distribution without a Gaussian component, then $Z_1$ and $Z_2$ are
independent.
Proof of Theorem 3. (i) Theorem 1 shows the existence of $(\hat\theta_n, \hat\phi_n)$ with $\hat\theta_n - \theta_0 <_p n^{-1/2}$,
$\hat\phi_n - \phi_0 <_p n^{-1/2}$, and the proof of Theorem 1 shows that the second derivatives of $L_n$ are
asymptotically constant in this region. The result therefore follows by standard
arguments.
(ii) Since $(n \log n)^{1/2}(\hat\theta_n - \bar\theta_n) \to_p 0$, $n^{1/2}(\hat\phi_n - \bar\phi_n) \to_p 0$, it suffices to prove the result with
$\bar\theta_n$, $\bar\phi_n$ in place of $\hat\theta_n$, $\hat\phi_n$. For $\bar\theta_n$ alone, the asymptotic distribution is given by Woodroofe
(1972). For $\bar\phi_n$ alone, the asymptotic distribution is given by the classical results for
regular estimation problems. Therefore the only thing to show is the asymptotic
independence of $\bar\theta_n$ and $\bar\phi_n$.
Now $(nc \log n)^{1/2}(\bar\theta_n - \theta_0)$ may be written as
where $V_n \to_p 1$ and hence plays no role in determining the limit. Now it follows from
Lemma 1 and Assumption 9 (Woodroofe, 1972) that $\partial L_n(\theta_0, \phi_0)/\partial\theta$ has infinite variance
but that $\{n/(c \log n)\}^{1/2}\, \partial L_n(\theta_0, \phi_0)/\partial\theta$ converges to a standard normal variable $Z_0$.
Similar arguments applied to $(\bar\phi_n - \phi_0)$, together with the Cramér-Wold device and
Lemma 6, allow us to assert that $Z_0$ is independent of the asymptotic distribution of
$n^{1/2}(\bar\phi_n - \phi_0)$, as required.
(iii) Since $n^{1/\alpha}(\hat\theta_n - \bar\theta_n) \to_p 0$, $n^{1/2}(\hat\phi_n - \bar\phi_n) \to_p 0$, it again suffices to prove the result with
$\bar\theta_n$, $\bar\phi_n$ in place of $\hat\theta_n$, $\hat\phi_n$, and hence the only thing to show is the independence of the
asymptotic distributions of $\bar\theta_n$ and $\bar\phi_n$. Our proof will make use of Remark 1 as well as
Lemma 7.
Let t > 0, y Ec R and consider
- limpr {(cn) " (On-n-o) > t, n2 a Ej a(#Jn- O) y IXn 1 > IOo+ t(cn)-/}
x pr {Xn 1 > Oo +t(cn)V }
- limpr{(cn)'1 (on-ao) > tIXn 1 > o?+t(cnV I }
x pr {n2 XjaJ($j-4/)jO l yXn, 1 > O0?+t(cn) l/a} pr {Xn, > SO+?t(cn)" }.
Lemma 7 implies the asymptotic independence of $(cn)^{1/\alpha}(X_{n,1} - \theta_0)$ and the score
statistic evaluated at $\phi_0$ for the parameter $\sum a_i \phi_i$. It easily follows that the second factor
in this expression is independent of $t$, and hence that the whole expression equals
The complicated nature of the preceding results when $1 < \alpha < 2$, and the nonexistence
of a consistent estimator when $\alpha < 1$, make it desirable to seek some alternative
estimator. An obvious candidate for a point estimator of $\theta$ is $X_{n,1}$, the sample minimum.
The asymptotic distribution of $n^{1/\alpha}(X_{n,1} - \theta_0)$ is Weibull, and Akahira's results show that
no point estimator of $\theta$ converges at a faster rate when $\alpha < 2$. Thus it seems reasonable to
use $X_{n,1}$ as an estimator of $\theta$ when $\alpha < 2$. The difficulties are, first, that it is generally not
known a priori whether $\alpha < 2$ and, secondly, that we still need an estimator of $\phi$.
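The behaviour of the sample minimum is easy to check by simulation in the Weibull case, where the result is exact rather than asymptotic: if $F(x) = 1 - \exp\{-\beta(x-\theta)^{\alpha}\}$, then $(n\beta)^{1/\alpha}(X_{n,1} - \theta)$ has survival function $\exp(-t^{\alpha})$ for every $n$, and hence mean $\Gamma(1 + 1/\alpha)$. The Python sketch below is an illustration only; the parameter values, sample sizes, seed and function names are assumptions.

```python
import math
import random


def weibull_sample(n, theta, alpha, beta, rng):
    """Draw n values from the three-parameter Weibull by inverting its distribution function."""
    return [theta + (-math.log(1.0 - rng.random()) / beta) ** (1.0 / alpha)
            for _ in range(n)]


def scaled_minimum(n, theta, alpha, beta, rng):
    """(n * beta)^(1/alpha) * (sample minimum - theta)."""
    x_min = min(weibull_sample(n, theta, alpha, beta, rng))
    return (n * beta) ** (1.0 / alpha) * (x_min - theta)


# Hypothetical settings, for illustration only.
theta, alpha, beta = 5.0, 1.5, 2.0
n, replications = 200, 2000
rng = random.Random(0)

draws = [scaled_minimum(n, theta, alpha, beta, rng) for _ in range(replications)]
empirical_mean = sum(draws) / len(draws)
theoretical_mean = math.gamma(1.0 + 1.0 / alpha)

# Empirical and theoretical probability that the scaled minimum exceeds 1.
empirical_tail = sum(d > 1.0 for d in draws) / len(draws)
theoretical_tail = math.exp(-1.0)
```

The error of the sample minimum is therefore of order $n^{-1/\alpha}$, which is smaller than the classical $n^{-1/2}$ whenever the shape parameter is below 2.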
In this section we propose a new estimator $\tilde\phi_n$ of $\phi$. It is consistent, so that it may be
used to discriminate between the cases $\alpha > 2$, $\alpha < 2$, and when $\alpha < 2$ it is asymptotically
efficient, and may therefore be used in place of the maximum likelihood estimator $\hat\phi_n$.
The new estimator is defined as the local maximum of the function
n
COROLLARY. For all $\alpha$, $\tilde\phi_n$ is a consistent estimator of $\phi_0$. If $\alpha < 2$, then $\tilde\phi_n$ is also
asymptotically efficient, and $n^{1/2}(\tilde\phi_n - \phi_0)$ converges to a normal distribution with mean
log (Xn 1- 0) = O(log n), log g(Xn, 10; ) -+, log c(o),
so that the whole expression is Op(n- 1 log n). The same applies to
aLn (Xn 1, b) = ?+ (Xn, 1_00) Ln (0*, Ok*) +?X (k)-n) (0*, 4*)n
where (0*,*) = A(Xn 1,/)+(1- A)(00, () for some A (O < A < 1)
Suppose cx < 1. Let bn, n > 1, be any positive sequence such that bn 0,
nbn/logn -+ x. Define, for y e RlP such that I y = 1,
fn(y) =O/n6-
gnY n 2 n(Xn, 15 + 6n Y)L(5-2)
( 2
Then
afn/aYi = -E jmj+e() (5 3)
Proof of Cor
The results for oc < 2 follow by noting that
6. HYPOTHESIS TESTING
In this section we consider testing hypotheses about $\theta$, making first a few remarks
about the case $\phi$ known before turning to the case $\phi$ unknown. The results given are
of course also relevant to the construction of confidence intervals.
Consider first a simple versus simple test of Ho: 0 = 00, 4 = 00 against HI: 0
0 = (/o, based on sample size n. The Neyman-Pearson test is to reject Ho if
with 0* between an and k0. But a2 Ln/I0 0fi <p 1 uniformly on a region of the
I I I Io (l), O1-0I1 =Iop(rJ2+l), so the whole expression 0 uniformly on a
region of form l 0-001l = O(n - 1a). Similar arguments show that the same result holds
$0 < \alpha < 1$, $\alpha = 1$ and $\alpha = 2$. We therefore conclude that the tests (6.4)-(6.5) remain valid
when $\phi_0$ is unknown, provided that $\phi_0$ is replaced by its estimated value $\tilde\phi_n$ or by some
other $\sqrt{n}$-consistent estimator.
As previously remarked, no claim of optimality is made for these procedures. In one-
The results of this paper may be applied to two particular distributions which are of
importance in the analysis of extreme values. These are the generalized extreme value
distribution, which includes the three-parameter Weibull as a special case, and the
generalized Pareto distribution introduced by Pickands (1975).
The density function of the generalized extreme value distribution is
positive when y > 4u + u/k. Writing ,B = - 1/k, 0 = p + l/k gives the reparam
ACKNOWLEDGEMENTS
One of the referees made many helpful suggestions, in particular a greatly shortened
proof of Lemma 6. I thank N. H. Bingham and J. P. Cohen for references.
REFERENCES
AKAHIRA, M. (1975a). Asymptotic theory for estimation of location in non-regular cases, I: Order of
convergence of consistent estimators. Rep. Statist. Appl. Res. Union Jap. Sci. Eng. 22, 8-26.
AKAHIRA, M. (1975b). Asymptotic theory for estimation of location in non-regular cases, II: Bounds of
asymptotic distributions of consistent estimators. Rep. Statist. Appl. Res. Union Jap. Sci. Eng. 22, 99-
115.
BARNDORFF-NIELSEN, O. (1983). On a formula for the distribution of the maximum likelihood estimator.
Biometrika 70, 343-65.
CHENG, R. C. H. & AMIN, N. A. K. (1981). Maximum likelihood estimation of parameters in the inverse
Gaussian distribution, with unknown origin. Technometrics 23, 257-63.
CHENG, R. C. H. & AMIN, N. A. K. (1983). Estimating parameters in continuous univariate distributions
with a shifted origin. J. R. Statist. Soc. B 45, 394-403.
CHOW, T. L. & TEUGELS, J. L. (1978). The sum and maximum of i.i.d. random variables. In Proc. 2nd
Symp. Asymp. Statist., Ed. P. Mandl and M. Hušková, pp. 81-92. Amsterdam: North Holland.
COHEN, A. C. (1965). Maximum likelihood estimation in the Weibull distribution based on complete and on
censored samples. Technometrics 7, 579-88.
COX, D. R. & HINKLEY, D. V. (1974). Theoretical Statistics. London: Chapman and Hall.
DAVISON, A. C. (1984). Modelling excesses over high thresholds, with an application. In Statistical Extremes
and Applications, Ed. J. Tiago de Oliveira, pp. 461-82. Dordrecht: Reidel.
DAWID, A. P. (1970). On the limiting normality of posterior distributions. Proc. Camb. Phil. Soc. 67, 625-33.
EFRON, B. & HINKLEY, D. V. (1978). Assessing the accuracy of the maximum likelihood estimator: Observed
versus expected Fisher information. Biometrika 65, 457-87.
FELLER, W. (1971). An Introduction to Probability Theory and its Applications, 2, 2nd ed. New York: Wiley.
GRIFFITHS, D. A. (1980). Interval estimation for the three-parameter lognormal distribution via the
likelihood function. Appl. Statist. 29, 58-68.
GUMBEL, E. J. (1958). Statistics of Extremes. New York: Columbia University Press.
HALL, P. (1982). On estimating the endpoint of a distribution. Ann. Statist. 10, 556-68.
HARTER, H. L. & MOORE, A. H. (1965). Maximum-likelihood estimation of the parameters of gamma and
Weibull populations from complete and from censored samples. Technometrics 7, 639-43.
IBRAGIMOV, I. A. & HAS'MINSKII, R. Z. (1981). Statistical Estimation. Berlin: Springer.
JENKINSON, A. F. (1955). Frequency distribution of the annual maximum (or minimum) values of
meteorological elements. Quart. J. R. Met. Soc. 81, 158-71.
JOHNSON, R. A. & HASKELL, J. H. (1983). Sampling properties of estimators of a Weibull distribution of use
in the lumber industry. Can. J. Statist. 11, 155-69.
LE CAM, L. (1970). On the assumptions used to prove asymptotic normality of maximum likelihood
estimates. Ann. Math. Statist. 41, 802-28.
LEMON, G. H. (1975). Maximum likelihood estimation for the three parameter Weibull distribution based on
censored samples. Technometrics 17, 247-54.
MÄKELÄINEN, T., SCHMIDT, K. & STYAN, G. P. H. (1981). On the existence and uniqueness of the maximum
likelihood estimate of a vector-valued parameter in fixed-size samples. Ann. Statist. 9, 758-67.
MANN, N. R. (1984). Statistical estimation of parameters of the Weibull and Fréchet distributions. In
Statistical Extremes and Applications. Ed. J. Tiago de Oliveira, pp. 81-9. Dordrecht: Reidel.
NERC (1975). Flood Studies Report, 1. London: Natural Environment Research Council.
PICKANDS, J. (1975). Statistical inference using extreme order statistics. Ann. Statist. 3, 119-31.
PRESCOTT, P. & WALDEN, A. T. (1980). Maximum likelihood estimation of the parameters of the generalized
extreme-value distribution. Biometrika 67, 723-4.
PRESCOTT, P. & WALDEN, A. T. (1983). Maximum likelihood estimation of the parameters of the three-
parameter generalized extreme-value distribution from censored samples. J. Statist. Comput. Simul. 16,
241-50.
RESNICK, S. & GREENWOOD, P. (1979). A bivariate stable characterization and domains of attraction. J.
Mult. Anal. 9, 206-21.
ROCKETTE, H., ANTLE, C. & KLIMKO, L. A. (1974). Maximum likelihood estimation with the Weibull model.
J. Am. Statist. Assoc. 69, 246-9.
SMITH, R. L. (1984). Threshold methods for sample extremes. In Statistical Extremes and Applications, Ed. J.
Tiago de Oliveira, pp. 621-38. Dordrecht: Reidel.
TIAGO DE OLIVEIRA, J. (1984). Univariate extremes: Statistical choice. In Statistical Extremes and
Applications, Ed. J. Tiago de Oliveira, pp. 91-107. Dordrecht: Reidel.
WALD, A. (1949). Note on the consistency of the maximum likelihood estimate. Ann. Math. Statist. 20, 595-
601.
WALKER, A. M. (1969). On the asymptotic behaviour of posterior distributions. J. R. Statist. Soc. B 31, 80-8.
WEISS, L. (1979). Asymptotic sufficiency in a class of non-regular cases. Selecta Statistica Canadiana 5, 143-
50.
WEISS, L. & WOLFOWITZ, J. (1973). Maximum likelihood estimation of a translation parameter of a
truncated distribution. Ann. Statist. 1, 944-7.
WOODROOFE, M. (1972). Maximum likelihood estimation of a translation parameter of a truncated
distribution. Ann. Math. Statist. 43, 113-22.
WOODROOFE, M. (1974). Maximum likelihood estimation of translation parameter of truncated distribution
II. Ann. Statist. 2, 474-88.