Let $\hat\theta_{(\alpha)}$ be the estimator of the same functional form as $\hat\theta$, but computed from the reduced sample of size $m(k-1)$ obtained by omitting the $\alpha$-th group, and define

$$\hat\theta_\alpha = k\hat\theta - (k-1)\hat\theta_{(\alpha)}. \qquad (4.2.1)$$

Quenouille's estimator is the mean of the $k$ values $\hat\theta_\alpha$,

$$\hat{\bar\theta} = \sum_{\alpha=1}^{k} \hat\theta_\alpha / k, \qquad (4.2.2)$$

and the $\hat\theta_\alpha$ are known as "pseudovalues." The bias of $\hat{\bar\theta}$ is of order $n^{-2}$ whenever the bias of $\hat\theta$ admits the expansion

$$E\{\hat\theta\} = \theta + a_1(\theta)/n + a_2(\theta)/n^2 + \ldots,$$

where $a_1(\theta), a_2(\theta), \ldots$ are functions of $\theta$ but not of $n$. This is easily seen by noting that

$$E\{\hat\theta_{(\alpha)}\} = \theta + a_1(\theta)/m(k-1) + a_2(\theta)/(m(k-1))^2 + \ldots$$

and that

$$E\{\hat\theta_\alpha\} = k[\theta + a_1(\theta)/mk + a_2(\theta)/(mk)^2 + \ldots]$$
$$\qquad\qquad - (k-1)[\theta + a_1(\theta)/m(k-1) + a_2(\theta)/(m(k-1))^2 + \ldots]$$
$$\qquad\qquad = \theta - a_2(\theta)/m^2 k(k-1) + \ldots.$$
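The bias calculation above is easy to check numerically. The sketch below is my own illustration, not from the text: it uses $\hat\theta = \bar y^2$ as an estimator of $\mu^2$, whose bias is exactly $\sigma^2/n$ (so $a_1(\theta) = \sigma^2$ and all higher $a_j$ vanish), and compares the Monte Carlo bias of the plain estimator with that of Quenouille's estimator for $m = 1$, $k = n$.

```python
import random

def quenouille(theta_hat, sample, k):
    # Split into k groups of size m = n // k, form the leave-one-group-out
    # estimates theta_(alpha), and average the pseudovalues (4.2.1)-(4.2.2).
    n = len(sample)
    m = n // k
    full = theta_hat(sample)
    omit = [theta_hat(sample[:a * m] + sample[(a + 1) * m:]) for a in range(k)]
    pseudo = [k * full - (k - 1) * o for o in omit]
    return sum(pseudo) / k

def mean_squared(s):
    # Estimates mu^2; its bias is exactly sigma^2 / n, i.e. a_1(theta)/n.
    return (sum(s) / len(s)) ** 2

random.seed(1)
n, reps, mu = 10, 20000, 3.0
bp = bj = 0.0
for _ in range(reps):
    y = [random.gauss(mu, 1.0) for _ in range(n)]
    bp += mean_squared(y) - mu ** 2
    bj += quenouille(mean_squared, y, k=n) - mu ** 2
bp, bj = bp / reps, bj / reps
print(bp, bj)   # bp should sit near sigma^2/n = 0.1, bj near 0
```

With this choice of estimator the jackknife bias is exactly zero in expectation, so the second printed value reflects only Monte Carlo noise.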
SVNY318-Wolter December 13, 2006
4.2. Some Basic Infinite-Population Methodology 153
In addition, if the estimator can be expressed in the form

$$\hat\theta = \mu_{(n)} + \frac{1}{n}\sum_{i=1}^{n} \alpha_{(n)}(Y_i) + \frac{1}{n^2}\sum_{i<j} \beta_{(n)}(Y_i, Y_j)$$

(i.e., $\hat\theta$ can be expressed in a form that involves the $Y_i$ zero, one, and two at a time only), then $\hat{\bar\theta}$ is an unbiased estimator of $\theta$,

$$E\{\hat{\bar\theta}\} = \theta.$$

See Efron and Stein (1981) and Efron (1982). We shall return to the bias-reducing properties of the jackknife later in this section.
Following Tukey's suggestion, let us treat the pseudovalues $\hat\theta_\alpha$ as approximately independent and identically distributed random variables. Let $\hat\theta_{(\cdot)}$ denote the mean of the $k$ values $\hat\theta_{(\alpha)}$. The jackknife estimator of variance is then

$$v_1(\hat{\bar\theta}) = \frac{1}{k(k-1)} \sum_{\alpha=1}^{k} (\hat\theta_\alpha - \hat{\bar\theta})^2 \qquad (4.2.3)$$
$$\phantom{v_1(\hat{\bar\theta})} = \frac{(k-1)}{k} \sum_{\alpha=1}^{k} (\hat\theta_{(\alpha)} - \hat\theta_{(\cdot)})^2,$$

and the statistic

$$t = \frac{\sqrt{k}\,(\hat{\bar\theta} - \theta)}{\left[\dfrac{1}{k-1}\sum_{\alpha=1}^{k} (\hat\theta_\alpha - \hat{\bar\theta})^2\right]^{1/2}} \qquad (4.2.4)$$

should be distributed approximately as Student's $t$ with $k-1$ degrees of freedom. In practice, $v_1(\hat{\bar\theta})$ is used to estimate the variance not only of $\hat{\bar\theta}$ but also of $\hat\theta$. Alternatively, we may use the estimator

$$v_2(\hat{\bar\theta}) = \frac{1}{k(k-1)} \sum_{\alpha=1}^{k} (\hat\theta_\alpha - \hat\theta)^2. \qquad (4.2.5)$$

This latter form is considered a conservative estimator, since

$$v_2(\hat{\bar\theta}) = v_1(\hat{\bar\theta}) + (\hat{\bar\theta} - \hat\theta)^2/(k-1)$$

and the last term on the right-hand side is guaranteed nonnegative.
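Both the equality of the two forms of (4.2.3) and the identity relating $v_2$ to $v_1$ can be verified directly. A minimal sketch with made-up data, using $\hat\theta = \log(s^2)$ as the functional (a statistic that reappears later in this section):

```python
import math

def jackknife_variances(theta_hat, sample, k):
    n = len(sample)
    m = n // k
    full = theta_hat(sample)
    omit = [theta_hat(sample[:a * m] + sample[(a + 1) * m:]) for a in range(k)]
    pseudo = [k * full - (k - 1) * o for o in omit]
    qbar = sum(pseudo) / k                                       # Quenouille
    v1 = sum((p - qbar) ** 2 for p in pseudo) / (k * (k - 1))    # (4.2.3)
    obar = sum(omit) / k
    v1_alt = (k - 1) / k * sum((o - obar) ** 2 for o in omit)    # 2nd form of (4.2.3)
    v2 = sum((p - full) ** 2 for p in pseudo) / (k * (k - 1))    # (4.2.5)
    return full, qbar, v1, v1_alt, v2

def log_s2(s):
    sbar = sum(s) / len(s)
    return math.log(sum((x - sbar) ** 2 for x in s) / (len(s) - 1))

y = [2.1, 3.4, 1.7, 4.0, 2.9, 3.3, 2.2, 3.8]
full, qbar, v1, v1_alt, v2 = jackknife_variances(log_s2, y, k=len(y))
assert abs(v1 - v1_alt) < 1e-12                  # two forms of (4.2.3) agree
assert abs(v2 - (v1 + (qbar - full) ** 2 / (len(y) - 1))) < 1e-12
```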
4.2.2. Some Properties of the Jackknife Method
A considerable body of theory is now available to substantiate Tukey's conjectures about the properties of the jackknife method, and we now review some of the important results.
Many important parameters are expressible as $\theta = g(\mu)$, where $\mu$ denotes the common mean, $E\{Y_i\} = \mu$. Although

$$\bar Y = n^{-1} \sum_{j=1}^{n} Y_j$$

is an unbiased estimator of $\mu$, $\hat\theta = g(\bar Y)$ is generally a biased estimator of $\theta = g(\mu)$. Quenouille's estimator for this problem is

$$\hat{\bar\theta} = k\,g(\bar Y) - (k-1)k^{-1} \sum_{\alpha=1}^{k} g(\bar Y_{(\alpha)}),$$

where $\bar Y_{(\alpha)}$ denotes the sample mean of the $m(k-1)$ observations after omitting the $\alpha$-th group. Theorem 4.2.1 establishes some asymptotic properties for $\hat{\bar\theta}$.
Theorem 4.2.1. Let $\{Y_j\}$ be a sequence of independent, identically distributed random variables with mean $\mu$ and variance $\sigma^2$, $0 < \sigma^2 < \infty$. Let $g(\cdot)$ be a function defined on the real line that, in a neighborhood of $\mu$, has bounded second derivatives. Then, as $k \to \infty$, $k^{1/2}(\hat{\bar\theta} - \theta)$ converges in distribution to a normal random variable with mean zero and variance $\sigma^2\{g'(\mu)\}^2$, where $g'(\mu)$ is the first derivative of $g(\cdot)$ evaluated at $\mu$.

Proof. See Miller (1964).
Theorem 4.2.2. Let $\{Y_j\}$ be a sequence of independent, identically distributed random variables as in Theorem 4.2.1. Let $g(\cdot)$ be a real-valued function with continuous first derivative in a neighborhood of $\mu$. Then, as $k \to \infty$,

$$k\,v_1(\hat{\bar\theta}) \xrightarrow{p} \sigma^2\{g'(\mu)\}^2.$$

Proof. See Miller (1964).
Taken together, Theorems 4.2.1 and 4.2.2 prove that the statistic $t$ is asymptotically distributed as a standard normal random variable. Thus, the jackknife methodology is correct, at least asymptotically, for parameters of the form $\theta = g(\mu)$. These results do not apply immediately to statistics such as

$$s^2 = (n-1)^{-1} \sum_{j=1}^{n} (Y_j - \bar Y)^2 \quad \text{or} \quad \log(s^2)$$
since they are not of the form $g(\bar Y)$. In a second paper, Miller (1968) showed that when the observations have bounded fourth moments and $\theta = \log(\sigma^2)$, $t$ converges in distribution to a standard normal random variable as $k \to \infty$. In this case, $\hat{\bar\theta}$ is defined by

$$\hat{\bar\theta} = k \log(s^2) - (k-1)k^{-1} \sum_{\alpha=1}^{k} \log(s_{(\alpha)}^2),$$

and $s_{(\alpha)}^2$ is the sample variance after omitting the $m$ observations in the $\alpha$-th group.
Miller's results generalize to U-statistics and functions of vector U-statistics. Let $f(Y_1, Y_2, \ldots, Y_r)$ be a statistic symmetrically defined in its arguments with $r \le n$,

$$E\{f(Y_1, Y_2, \ldots, Y_r)\} = \eta,$$

and

$$E\{(f(Y_1, Y_2, \ldots, Y_r) - \eta)^2\} < \infty.$$

Define the U-statistic

$$U_n = U(Y_1, \ldots, Y_n) = \binom{n}{r}^{-1} \sum f(Y_{i_1}, \ldots, Y_{i_r}), \qquad (4.2.6)$$

where the summation is over all combinations of $r$ variables $Y_{i_1}, \ldots, Y_{i_r}$ out of the full sample of $n$. The following theorems demonstrate the applicability of jackknife methods to such statistics.
Theorem 4.2.3. Let $\Phi$ be a real-valued function with bounded second derivative in a neighborhood of $\eta$, let $\theta = \Phi(\eta)$, and let $\hat\theta = \Phi(U_n)$. Then, as $k \to \infty$,

$$k^{1/2}(\hat{\bar\theta} - \theta) \xrightarrow{d} N(0,\, r^2 \xi_1^2 \{\Phi'(\eta)\}^2),$$

where

$$\hat{\bar\theta} = k^{-1} \sum_{\alpha=1}^{k} \hat\theta_\alpha, \qquad \hat\theta_\alpha = k\Phi(U_n) - (k-1)\Phi(U_{m(k-1),(\alpha)}),$$

$$U_{m(k-1),(\alpha)} = \binom{m(k-1)}{r}^{-1} \sum f(Y_{i_1}, \ldots, Y_{i_r}),$$

the summation extending over all combinations of $r$ variables from the sample after omitting the $\alpha$-th group, and

$$\xi_1^2 = \mathrm{Var}\{E\{f(Y_1, \ldots, Y_r) \mid Y_1\}\}.$$

Theorem 4.2.4. Let the conditions of Theorem 4.2.3 hold, with $\Phi$ having a continuous first derivative in a neighborhood of $\eta$. Then, as $k \to \infty$,

$$k\,v_1(\hat{\bar\theta}) \xrightarrow{p} r^2 \xi_1^2 \{\Phi'(\eta)\}^2.$$

Proof. See Arvesen (1969).
Theorems 4.2.3 and 4.2.4 generalize to functions of vector U-statistics, e.g., $\Phi(U_n^{1}, U_n^{2}, \ldots, U_n^{q})$. Again, the details are given by Arvesen (1969). These results are important because they encompass an extremely broad class of estimators. Important statistics that fall within this framework include ratios, differences of ratios, regression coefficients, correlation coefficients, and the $t$ statistic itself. The theorems show that the jackknife methodology is correct, at least asymptotically, for all such statistics.
The reader will note that for all of the statistics studied thus far, $t$ converges to a standard normal random variable as $k \to \infty$. If $n \to \infty$ with $k$ fixed, it can be shown that $t$ converges to Student's $t$ with $(k-1)$ degrees of freedom.

All of these results are concerned with the asymptotic behavior of the jackknife method, and we have seen that Tukey's conjectures are correct asymptotically. Now we turn to some properties of the estimators in the context of finite samples.
Think of $v_1(\hat\theta)$ as being constructed in two steps. First, an estimator of $\mathrm{Var}\{\hat\theta(Y_1, \ldots, Y_{m(k-1)})\}$ is

$$v_1^{(n)}(\hat\theta(Y_1, \ldots, Y_{m(k-1)})) = \sum_{\alpha=1}^{k} (\hat\theta_{(\alpha)} - \hat\theta_{(\cdot)})^2,$$

and the sample size modification is

$$v_1(\hat\theta(Y_1, \ldots, Y_n)) = \left(\frac{k-1}{k}\right) v_1^{(n)}(\hat\theta(Y_1, \ldots, Y_{m(k-1)})).$$

Applying an ANOVA decomposition to this two-step process, we find that the jackknife method tends to produce conservative estimators of variance.
Theorem 4.2.5. Let $Y_1, \ldots, Y_n$ be independent and identically distributed random variables, let $\hat\theta = \hat\theta(Y_1, \ldots, Y_n)$ be defined symmetrically in its arguments, and let $E\{\hat\theta^2\} < \infty$. The estimator $v_1^{(n)}(\hat\theta(Y_1, \ldots, Y_{m(k-1)}))$ is conservative in the sense that

$$E\{v_1^{(n)}(\hat\theta(Y_1, \ldots, Y_{m(k-1)}))\} - \mathrm{Var}\{\hat\theta(Y_1, \ldots, Y_{m(k-1)})\} = O\!\left(\frac{1}{k^2}\right) \ge 0.$$
Proof. The ANOVA decomposition of $\hat\theta$ is

$$\hat\theta(Y_1, \ldots, Y_n) = \mu + \frac{1}{n}\sum_{i} \alpha_i + \frac{1}{n^2}\sum_{i<i'} \beta_{ii'} + \frac{1}{n^3}\sum_{i<i'<i''} \gamma_{ii'i''} + \ldots + \frac{1}{n^n}\eta_{1,2,3,\ldots,n}, \qquad (4.2.7)$$

where all $2^n - 1$ random variables on the right-hand side of (4.2.7) have zero mean and are mutually uncorrelated with one another. The quantities in the decomposition are

$$\mu = E\{\hat\theta\},$$

the grand mean;

$$\alpha_i = n[E\{\hat\theta \mid Y_i = y_i\} - \mu],$$

the $i$-th main effect;

$$\beta_{ii'} = n^2[E\{\hat\theta \mid Y_i = y_i, Y_{i'} = y_{i'}\} - E\{\hat\theta \mid Y_i = y_i\} - E\{\hat\theta \mid Y_{i'} = y_{i'}\} + \mu],$$

the $(i, i')$-th second-order interaction;

$$\gamma_{ii'i''} = n^3[E\{\hat\theta \mid Y_i = y_i, Y_{i'} = y_{i'}, Y_{i''} = y_{i''}\}$$
$$\qquad - E\{\hat\theta \mid Y_i = y_i, Y_{i'} = y_{i'}\} - E\{\hat\theta \mid Y_i = y_i, Y_{i''} = y_{i''}\} - E\{\hat\theta \mid Y_{i'} = y_{i'}, Y_{i''} = y_{i''}\}$$
$$\qquad + E\{\hat\theta \mid Y_i = y_i\} + E\{\hat\theta \mid Y_{i'} = y_{i'}\} + E\{\hat\theta \mid Y_{i''} = y_{i''}\} - \mu],$$

the $(i, i', i'')$-th third-order interaction; and so forth. See Efron and Stein (1981) both for the derivation of the ANOVA decomposition and for the remainder of the present proof.
The statistic $v_1^{(n)}(\hat\theta(Y_1, \ldots, Y_{m(k-1)}))$ is based upon a sample of size $n = mk$ but estimates the variance of a statistic $\hat\theta(Y_1, \ldots, Y_{m(k-1)})$ associated with the reduced sample size $m(k-1)$. Theorem 4.2.5 shows that $v_1^{(n)}$ tends to overstate the true variance $\mathrm{Var}\{\hat\theta(Y_1, \ldots, Y_{m(k-1)})\}$ associated with the reduced sample size $m(k-1)$. The next theorem describes the behavior of the sample size modification.
Theorem 4.2.6. Let the conditions of Theorem 4.2.5 hold. In addition, let $\hat\theta = \hat\theta(Y_1, \ldots, Y_n)$ be a U-statistic and let $m(k-1) \ge r$. Then

$$E\{v_1(\hat\theta)\} \ge \mathrm{Var}\{\hat\theta\}.$$
Proof. From Hoeffding (1948), we have

$$\frac{(k-1)}{k}\,\mathrm{Var}\{\hat\theta(Y_1, \ldots, Y_{m(k-1)})\} \ge \mathrm{Var}\{\hat\theta(Y_1, \ldots, Y_n)\}.$$

The result follows from Theorem 4.2.5.
Thus, for U-statistics the overstatement of variance initiated in Theorem 4.2.5 for statistics associated with the reduced sample size $m(k-1)$ is preserved by the sample size modification factor $(k-1)/k$. In general, however, it is not possible to guarantee the jackknife variance estimator $v_1(\hat\theta)$ to be nonnegatively biased.
For linear functionals, however, the biases vanish.
Theorem 4.2.7. Let the conditions of Theorem 4.2.5 hold. For linear functionals, i.e., statistics $\hat\theta$ such that the interactions $\beta_{ii'}$, $\gamma_{ii'i''}$, etc. are all zero, the estimator $v_1^{(n)}(\hat\theta(Y_1, \ldots, Y_{m(k-1)}))$ is unbiased for $\mathrm{Var}\{\hat\theta(Y_1, \ldots, Y_{m(k-1)})\}$, and the estimator $v_1(\hat\theta)$ is unbiased for $\mathrm{Var}\{\hat\theta\}$.

Proof. See Efron and Stein (1981).
In summary, Theorems 4.2.5, 4.2.6, and 4.2.7 are finite-sample results, whereas earlier theorems presented asymptotic results. In the earlier theorems, we saw that the jackknife variance estimator was correct asymptotically. In finite samples, however, it tends to incur an upward bias of order $1/k^2$. But, for linear functionals, the jackknife variance estimator is unbiased.
4.2.3. Bias Reduction
We have observed that the jackknife method was originally introduced as a means of reducing bias. Although our main interest is in variance estimation, we shall briefly review some additional ideas of bias reduction in this section.
The reader will recall that $\hat\theta_1$ and $\hat\theta_2$ denote two estimators of $\theta$ whose biases factor as

$$E\{\hat\theta_1\} = \theta + f_1(n)\,a(\theta),$$
$$E\{\hat\theta_2\} = \theta + f_2(n)\,a(\theta),$$

with

$$\begin{vmatrix} 1 & 1 \\ f_1(n) & f_2(n) \end{vmatrix} \ne 0.$$

Then, the generalized jackknife

$$G(\hat\theta_1, \hat\theta_2) = \begin{vmatrix} \hat\theta_1 & \hat\theta_2 \\ f_1(n) & f_2(n) \end{vmatrix} \Bigg/ \begin{vmatrix} 1 & 1 \\ f_1(n) & f_2(n) \end{vmatrix}$$

is exactly unbiased for $\theta$. Quenouille's original estimator (with $m = 1$, $k = n$) is the special case

$$\hat\theta_1 = \hat\theta, \qquad \hat\theta_2 = \sum_{\alpha=1}^{n} \hat\theta_{(\alpha)}/n,$$
$$f_1(n) = 1/n, \qquad f_2(n) = 1/(n-1).$$
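With these choices, $G(\hat\theta_1, \hat\theta_2)$ reduces algebraically to Quenouille's estimator. A short sketch (illustrative data of my own; $\bar y^2$ serves as the biased parent estimator) confirms the reduction:

```python
def gen_jackknife(t1, t2, f1, f2):
    # G(t1, t2) = |t1 t2; f1 f2| / |1 1; f1 f2|; exactly unbiased when
    # the biases of t1 and t2 factor as f_i(n) * a(theta).
    return (t1 * f2 - t2 * f1) / (f2 - f1)

def mean_sq(s):
    return (sum(s) / len(s)) ** 2    # biased estimator of mu^2

y = [1.0, 2.0, 4.0, 3.0, 5.0, 2.5]
n = len(y)
t1 = mean_sq(y)
omit = [mean_sq(y[:i] + y[i + 1:]) for i in range(n)]
t2 = sum(omit) / n                   # mean of the delete-one estimates
G = gen_jackknife(t1, t2, f1=1.0 / n, f2=1.0 / (n - 1))
quenouille = n * t1 - (n - 1) * t2   # classical jackknife, m = 1, k = n
assert abs(G - quenouille) < 1e-9
```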
Now suppose $p + 1$ estimators of $\theta$ are available and that their biases factor as

$$E\{\hat\theta_i\} = \theta + \sum_{j=1}^{\infty} f_{ji}(n)\,a_j(\theta) \qquad (4.2.8)$$

for $i = 1, \ldots, p+1$. If

$$\begin{vmatrix} 1 & \cdots & 1 \\ f_{11}(n) & \cdots & f_{1,p+1}(n) \\ \vdots & & \vdots \\ f_{p1}(n) & \cdots & f_{p,p+1}(n) \end{vmatrix} \ne 0, \qquad (4.2.9)$$

then the generalized jackknife estimator is

$$G(\hat\theta_1, \ldots, \hat\theta_{p+1}) = \begin{vmatrix} \hat\theta_1 & \cdots & \hat\theta_{p+1} \\ f_{11}(n) & \cdots & f_{1,p+1}(n) \\ \vdots & & \vdots \\ f_{p1}(n) & \cdots & f_{p,p+1}(n) \end{vmatrix} \Bigg/ \begin{vmatrix} 1 & \cdots & 1 \\ f_{11}(n) & \cdots & f_{1,p+1}(n) \\ \vdots & & \vdots \\ f_{p1}(n) & \cdots & f_{p,p+1}(n) \end{vmatrix},$$

and its bias is

$$E\{G(\hat\theta_1, \ldots, \hat\theta_{p+1})\} = \theta + B(n, p, \theta),$$

where

$$B(n, p, \theta) = \begin{vmatrix} B_1 & \cdots & B_{p+1} \\ f_{11}(n) & \cdots & f_{1,p+1}(n) \\ \vdots & & \vdots \\ f_{p1}(n) & \cdots & f_{p,p+1}(n) \end{vmatrix} \Bigg/ \begin{vmatrix} 1 & \cdots & 1 \\ f_{11}(n) & \cdots & f_{1,p+1}(n) \\ \vdots & & \vdots \\ f_{p1}(n) & \cdots & f_{p,p+1}(n) \end{vmatrix}$$

and

$$B_i = \sum_{j=p+1}^{\infty} f_{ji}(n)\,a_j(\theta)$$

for $i = 1, \ldots, p+1$.
Proof. See, e.g., Gray and Schucany (1972).
An example of the generalized jackknife $G(\hat\theta_1, \ldots, \hat\theta_{p+1})$ arises when we extend Quenouille's estimator by letting $\hat\theta_1 = \hat\theta$ and letting $\hat\theta_2, \ldots, \hat\theta_{p+1}$ be the statistic $k^{-1}\sum_{\alpha=1}^{k} \hat\theta_{(\alpha)}$ with $m = 1, 2, 3, 4, \ldots, p$, respectively. If the bias in the parent sample estimator $\hat\theta_1 = \hat\theta$ is of the form

$$E\{\hat\theta\} = \theta + \sum_{j=1}^{\infty} a_j(\theta)/n^j,$$

then

$$f_{ji} = 1/(n - i + 1)^j \qquad (4.2.10)$$

and the bias in $G(\hat\theta_1, \ldots, \hat\theta_{p+1})$ is of order $n^{-(p+1)}$. Hence, the generalized jackknife reduces the order of bias from order $n^{-1}$ to order $n^{-(p+1)}$.
4.2.4. Counterexamples
The previous subsections demonstrate the considerable utility of the jackknife method. We have seen how the jackknife method and its generalizations eliminate bias and also how Tukey's conjectures regarding variance estimation and the $t$ statistic are asymptotically correct for a wide class of problems. One must not, however, make the mistake of believing that the jackknife method is omnipotent, to be applied to every conceivable problem.
In fact, there are many estimation problems, particularly in the area of order statistics and nonfunctional statistics, where the jackknife does not work well, if at all. Miller (1974a) gives a partial list of counterexamples. To illustrate, we consider the case $\hat\theta = Y_{(n)}$, the largest order statistic. Miller (1964) demonstrates that $t$ with $k = n$ can be degenerate or nonnormal. Quenouille's estimator for this case is built from the pseudovalues

$$\hat\theta_\alpha = Y_{(n)}, \quad \text{if the deleted observation is not the sample maximum},$$
$$\hat\theta_\alpha = nY_{(n)} - (n-1)Y_{(n-1)}, \quad \text{if the deleted observation is the sample maximum},$$

so that

$$\hat{\bar\theta} = n^{-1} \sum_{\alpha=1}^{n} \hat\theta_\alpha = Y_{(n)} + [(n-1)/n](Y_{(n)} - Y_{(n-1)}).$$
When $Y_1$ is distributed uniformly on the interval $[0, \theta]$, the sample median provides a second counterexample. For $n$ even,

$$\hat\theta = \left(y_{(r)} + y_{(r+1)}\right)\big/\,2,$$

where the $y_{(i)}$ denote the ordered observations and $r = n/2$. After dropping one observation (with $m = 1$ and $k = n$), the estimate is either $\hat\theta_{(\alpha)} = y_{(r)}$ or $y_{(r+1)}$, with each outcome occurring exactly half the time. The sample median is simply not smooth enough, and the jackknife cannot possibly yield a consistent estimator of its variance.
On the other hand, the jackknife does work for the sample median if $m$ is large enough. The jackknife estimator of variance

$$v(\hat\theta) = \frac{(k-1)}{k} \sum_{\alpha=1}^{k} (\hat\theta_{(\alpha)} - \hat\theta_{(\cdot)})^2$$

is consistent and provides an asymptotically correct $t$ statistic

$$t = (\hat\theta - \theta)\big/\sqrt{v(\hat\theta)}$$

if $\sqrt{n} < m < n$. Furthermore, if one chooses $m$ large enough, it may become advantageous to employ all $\binom{n}{m}$ groups instead of just the $k = n/m$ nonoverlapping groups we have studied thus far. In this event, the jackknife estimator of variance becomes

$$v(\hat\theta) = \frac{n-m}{m} \binom{n}{m}^{-1} \sum_{\alpha} (\hat\theta_{(\alpha)} - \hat\theta_{(\cdot)})^2.$$

See Wu (1986) and Shao and Wu (1989) for details.
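A sketch of this all-groups (delete-$m$) variance estimator for the sample median follows; the data are made up, and $m = 4$ is chosen only for illustration:

```python
from itertools import combinations
from statistics import median

def delete_d_jackknife_var(sample, m):
    # Employ all C(n, m) deletion groups of size m (Shao and Wu, 1989):
    # v = (n - m)/m * average of (theta_(alpha) - theta_(.))^2.
    n = len(sample)
    idx = set(range(n))
    leave_out = [median([sample[i] for i in idx - set(drop)])
                 for drop in combinations(range(n), m)]
    grand = sum(leave_out) / len(leave_out)          # theta_(.)
    return (n - m) / m * sum((t - grand) ** 2 for t in leave_out) / len(leave_out)

y = [3.1, 1.2, 5.6, 2.8, 4.4, 3.9, 2.2, 4.8, 1.9, 3.5]
v = delete_d_jackknife_var(y, m=4)
print(v)
```

Unlike the delete-one jackknife, the leave-out medians here take many distinct values, which is what restores consistency.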
4.2.5. Choice of Number of Groups k
There are two primary considerations in the choice of the number of groups $k$: (1) computational costs and (2) the precision or accuracy of the resulting estimators. As regards computational costs, it is clear that the choice $(m, k) = (1, n)$ is most expensive and $(m, k) = (n/2, 2)$ is least expensive. For large data sets, some value of $(m, k)$ between the extremes may be preferred. The grouping, however, introduces a degree of arbitrariness in the formation of groups, a problem not encountered when $k = n$.

As regards the precision of the estimators, we generally prefer the choice $(m, k) = (1, n)$, at least when the sample size $n$ is small to moderate. This choice is supported by much of the research on ratio estimation, including papers by Rao (1965), Rao and Webster (1966), Chakrabarty and Rao (1968), and Rao and Rao (1971). For reasonable models of the form
$$Y_i = \beta_0 + \beta_1 X_i + e_i,$$
$$E\{e_i \mid X_i\} = 0,$$
$$E\{e_i^2 \mid X_i\} = \sigma^2 X_i^t, \qquad t \ge 0,$$
$$E\{e_i e_j \mid X_i, X_j\} = 0, \qquad i \ne j,$$

both the bias and variance of Quenouille's estimator $\hat{\bar\theta}$ based on the ratio $\hat R = \bar y/\bar x$ are smallest for the choice $(m, k) = (1, n)$, as is the bias of the variance estimator $v_1(\hat{\bar\theta})$.

4.3. Basic Applications to the Finite Population

In this section, we apply the jackknife to several standard sampling designs. The population total and mean are denoted by $Y = \sum_{i=1}^{N} Y_i$ and $\bar Y = Y/N$, respectively. It is assumed that we wish to estimate $Y$ or $\bar Y$.
4.3.1. Simple Random Sampling with Replacement (srs wr)
Suppose an srs wr sample of size $n$ is selected from the population of $N$ units. It is known that the sample mean

$$\bar y = \sum_{i=1}^{n} y_i/n$$

is an unbiased estimator of the population mean $\bar Y$ with variance

$$\mathrm{Var}\{\bar y\} = \sigma^2/n,$$

where

$$\sigma^2 = \sum_{i=1}^{N} (Y_i - \bar Y)^2/N.$$

The unbiased textbook estimator of variance is

$$v(\bar y) = s^2/n, \qquad (4.3.1)$$

where

$$s^2 = \sum_{i=1}^{n} (y_i - \bar y)^2/(n-1).$$

By analogy with (4.2.1), let $\hat\theta = \bar y$, and let the sample be divided into $k$ random groups, each of size $m$, $n = mk$. Quenouille's estimator of the mean $\bar Y$ is then

$$\hat{\bar\theta} = \sum_{\alpha=1}^{k} \hat\theta_\alpha/k, \qquad (4.3.2)$$

where the $\alpha$-th pseudovalue is

$$\hat\theta_\alpha = k\bar y - (k-1)\bar y_{(\alpha)}$$

and

$$\bar y_{(\alpha)} = \sum_{i=1}^{m(k-1)} y_i/m(k-1)$$

denotes the sample mean after omitting the $\alpha$-th group of observations. The corresponding variance estimator is

$$v_1(\hat{\bar\theta}) = \sum_{\alpha=1}^{k} (\hat\theta_\alpha - \hat{\bar\theta})^2\big/k(k-1). \qquad (4.3.3)$$

To investigate the properties of the jackknife, it is useful to rewrite (4.3.2) as

$$\hat{\bar\theta} = k\bar y - (k-1)\bar y_{(\cdot)}, \qquad (4.3.4)$$

where

$$\bar y_{(\cdot)} = \sum_{\alpha=1}^{k} \bar y_{(\alpha)}/k.$$

We then have the following lemma.
Lemma 4.3.1. Quenouille's estimator is identically equal to the sample mean, $\hat{\bar\theta} = \bar y$.

Proof. Follows immediately from (4.3.4) since any given $y_i$ appears in exactly $(k-1)$ of the $\bar y_{(\alpha)}$.

From Lemma 4.3.1, it follows that the jackknife estimator of variance is

$$v_1(\hat{\bar\theta}) = \frac{(k-1)}{k} \sum_{\alpha=1}^{k} (\bar y_{(\alpha)} - \bar y)^2. \qquad (4.3.5)$$

The reader will note that when $k = n$, $\bar y_{(\alpha)} = (n\bar y - y_\alpha)/(n-1)$ and, by (4.3.5),

$$v_1(\hat{\bar\theta}) = \frac{(n-1)}{n} \sum_{\alpha=1}^{n} \left[(y_\alpha - \bar y)/(n-1)\right]^2 = v(\bar y),$$

the textbook estimator of variance. In any case, whether $k = n$ or not, we have the following lemma.

Lemma 4.3.2. Given the conditions of this section,

$$E\{v_1(\hat{\bar\theta})\} = \mathrm{Var}\{\hat{\bar\theta}\} = \mathrm{Var}\{\bar y\}.$$

Proof. Left to the reader.
We conclude that for srs wr the jackknife method preserves the linear estimator $\bar y$, gives an unbiased estimator of its variance, and reproduces the textbook variance estimator when $k = n$.
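The reproduction of the textbook estimator at $k = n$ is easy to verify numerically; the sample values below are made up for illustration:

```python
from statistics import mean, variance

y = [4.2, 1.8, 3.6, 5.1, 2.4, 3.9, 4.7, 2.9]
n = len(y)
ybar = mean(y)
textbook = variance(y) / n            # v(ybar) = s^2 / n, equation (4.3.1)

# Delete-one (m = 1, k = n) jackknife variance, equation (4.3.5)
leave_out = [mean(y[:i] + y[i + 1:]) for i in range(n)]
v1 = (n - 1) / n * sum((lo - ybar) ** 2 for lo in leave_out)
assert abs(v1 - textbook) < 1e-12
```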
4.3.2. Probability Proportional to Size Sampling with Replacement
(pps wr)
Suppose now that a pps wr sample of size $n$ is selected from the population of $N$ units using probabilities $\{p_i\}_{i=1}^{N}$, with $\sum_{i=1}^{N} p_i = 1$ and $p_i > 0$ for $i = 1, \ldots, N$. The srs wr sampling design treated in the last section is the special case where $p_i = N^{-1}$, $i = 1, \ldots, N$. The customary estimator of the population total $Y$ and its variance are given by

$$\hat Y = \frac{1}{n} \sum_{i=1}^{n} y_i/p_i$$

and

$$\mathrm{Var}\{\hat Y\} = \frac{1}{n} \sum_{i=1}^{N} p_i (Y_i/p_i - Y)^2,$$

respectively. The unbiased textbook estimator of the variance $\mathrm{Var}\{\hat Y\}$ is

$$v(\hat Y) = \frac{1}{n(n-1)} \sum_{i=1}^{n} (y_i/p_i - \hat Y)^2.$$

Let $\hat\theta = \hat Y$, and suppose that the parent sample is divided into $k$ random groups of size $m$, $n = mk$. Quenouille's estimator of the total $Y$ is then

$$\hat{\bar\theta} = k\hat Y - (k-1)k^{-1} \sum_{\alpha=1}^{k} \hat Y_{(\alpha)},$$

where

$$\hat Y_{(\alpha)} = \frac{1}{m(k-1)} \sum_{i=1}^{m(k-1)} y_i/p_i$$

is the estimator based on the sample after omitting the $\alpha$-th group of observations. The jackknife estimator of variance is

$$v_1(\hat{\bar\theta}) = \frac{1}{k(k-1)} \sum_{\alpha=1}^{k} (\hat\theta_\alpha - \hat{\bar\theta})^2,$$

where

$$\hat\theta_\alpha = k\hat Y - (k-1)\hat Y_{(\alpha)}$$
is the $\alpha$-th pseudovalue. For pps wr sampling, the moment properties of $\hat{\bar\theta}$ and $v_1(\hat{\bar\theta})$ are identical with those for srs wr sampling, provided we replace $y_i$ by $y_i/p_i$.

Lemma 4.3.3. Given the conditions of this section,

$$\hat{\bar\theta} = \hat Y$$

and

$$E\{v_1(\hat{\bar\theta})\} = \mathrm{Var}\{\hat{\bar\theta}\} = \mathrm{Var}\{\hat Y\}.$$

Further, if $k = n$, then

$$v_1(\hat{\bar\theta}) = v(\hat Y).$$

Proof. Left to the reader.
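A numerical check of the $k = n$ case of Lemma 4.3.3 (the $y$ and $p$ values are purely illustrative):

```python
y = [12.0, 7.5, 20.1, 4.4, 9.8, 15.2]      # observed values (made up)
p = [0.10, 0.08, 0.25, 0.05, 0.12, 0.40]   # selection probabilities (made up)
n = len(y)
z = [yi / pi for yi, pi in zip(y, p)]
Yhat = sum(z) / n                                           # customary estimator
vtext = sum((zi - Yhat) ** 2 for zi in z) / (n * (n - 1))   # textbook v(Yhat)

# Delete-one jackknife: with k = n the pseudovalues reduce to z_i = y_i/p_i
omit = [(n * Yhat - zi) / (n - 1) for zi in z]              # Yhat_(alpha)
pseudo = [n * Yhat - (n - 1) * o for o in omit]
qbar = sum(pseudo) / n
v1 = sum((ps - qbar) ** 2 for ps in pseudo) / (n * (n - 1))
assert abs(qbar - Yhat) < 1e-6 and abs(v1 - vtext) < 1e-6
```

This is the substitution $y_i \to y_i/p_i$ noted above: the calculation is exactly the srs wr calculation applied to the ratios $z_i$.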
4.3.3. Simple Random Sampling Without Replacement (srs wor)
If an srs wor sample of size $n$ is selected, then the customary estimator of $\bar Y$, its variance, and the unbiased textbook estimator of variance are

$$\hat\theta = \bar y = \sum_{i=1}^{n} y_i/n,$$
$$\mathrm{Var}\{\bar y\} = (1-f)S^2/n,$$

and

$$v(\bar y) = (1-f)s^2/n,$$

respectively, where $f = n/N$ and

$$S^2 = \sum_{i=1}^{N} (Y_i - \bar Y)^2/(N-1).$$

We suppose that the parent sample is divided into $k$ random groups, each of size $m$, $n = mk$. Because without replacement sampling is used, the random groups are necessarily nonindependent. For this case, Quenouille's estimator $\hat{\bar\theta}$ and the jackknife variance estimator $v_1(\hat{\bar\theta})$ are defined as before. Once again $\hat{\bar\theta} = \bar y$, and thus $\hat{\bar\theta}$ is also an unbiased estimator of $\bar Y$ for srs wor sampling. However, $v_1(\hat{\bar\theta})$ is no longer an unbiased estimator of variance; in fact, it can be shown that

$$E\{v_1(\hat{\bar\theta})\} = S^2/n.$$

Clearly, we may use the jackknife variance estimator with little concern for the bias whenever the sampling fraction $f = n/N$ is negligible. In any case, $v_1(\hat{\bar\theta})$
will be a conservative estimator, overestimating the true variance of $\hat{\bar\theta} = \bar y$ by the amount

$$\mathrm{Bias}\{v_1(\hat{\bar\theta})\} = f S^2/n.$$

If the sampling fraction is not negligible, a very simple unbiased estimator of variance is

$$(1-f)\,v_1(\hat{\bar\theta}).$$

Another method of correcting the bias of the jackknife estimator is to work with

$$\tilde\theta_{(\alpha)} = \bar y + (1-f)^{1/2}(\bar y_{(\alpha)} - \bar y)$$

instead of

$$\hat\theta_{(\alpha)} = \bar y_{(\alpha)}.$$
This results in the following definitions:

$$\tilde\theta_\alpha = k\hat\theta - (k-1)\tilde\theta_{(\alpha)} \quad \text{(pseudovalue)},$$
$$\tilde{\bar\theta} = \sum_{\alpha=1}^{k} \tilde\theta_\alpha/k \quad \text{(Quenouille's estimator)},$$
$$v_1(\tilde{\bar\theta}) = \frac{1}{k(k-1)} \sum_{\alpha=1}^{k} (\tilde\theta_\alpha - \tilde{\bar\theta})^2 \quad \text{(jackknife estimator of variance)}.$$

We state the properties of these modified jackknife statistics in the following lemma.

Lemma 4.3.4. For srs wor sampling, we have

$$\tilde{\bar\theta} = \bar y$$

and

$$E\{v_1(\tilde{\bar\theta})\} = \mathrm{Var}\{\bar y\} = (1-f)S^2/n.$$

Further, when $k = n$,

$$v_1(\tilde{\bar\theta}) = v(\bar y).$$

Proof. Left to the reader.
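The $k = n$ case of Lemma 4.3.4 can be checked directly; in this sketch the population size $N = 40$ and the sample values are made up:

```python
from statistics import mean, variance

N = 40                                         # population size (made up)
y = [4.2, 1.8, 3.6, 5.1, 2.4, 3.9, 4.7, 2.9]
n = len(y)
f = n / N
ybar = mean(y)
vtext = (1 - f) * variance(y) / n              # textbook v(ybar) under srs wor

leave_out = [mean(y[:i] + y[i + 1:]) for i in range(n)]
mod = [ybar + (1 - f) ** 0.5 * (lo - ybar) for lo in leave_out]  # theta~_(alpha)
pseudo = [n * ybar - (n - 1) * mo for mo in mod]
qbar = sum(pseudo) / n
v1 = sum((ps - qbar) ** 2 for ps in pseudo) / (n * (n - 1))
assert abs(qbar - ybar) < 1e-9 and abs(v1 - vtext) < 1e-9
```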
Thus, the jackknife variance estimator defined in terms of the modified pseudovalues $\tilde\theta_\alpha$ is unbiased for srs wor sampling.

4.3.4. πps Sampling

Suppose now that a sample of size $n$ is selected without replacement with inclusion probabilities

$$\pi_i = P\{i \in s\},$$

where $s$ denotes the sample. The Horvitz-Thompson estimator of the population total is then

$$\hat\theta = \hat Y = \sum_{i=1}^{n} y_i/\pi_i. \qquad (4.3.6)$$
Again, we suppose that the parent sample has been divided into $k$ random groups (nonindependent) of size $m$, $n = mk$. Quenouille's estimator is defined in terms of the pseudovalues

$$\hat\theta_\alpha = k\hat Y - (k-1)\hat Y_{(\alpha)}, \qquad (4.3.7)$$

where

$$\hat Y_{(\alpha)} = \sum_{i=1}^{m(k-1)} y_i\big/[\pi_i m(k-1)/n]$$

is the Horvitz-Thompson estimator based on the sample after removing the $\alpha$-th group of observations. As was the case for the three previous sampling methods, $\hat{\bar\theta}$ is algebraically equal to $\hat Y$. The jackknife thus preserves the unbiased character of the Horvitz-Thompson estimator of the total.

To estimate the variance of $\hat{\bar\theta}$ (or $\hat Y$), we use the jackknife estimator

$$v_1(\hat{\bar\theta}) = \frac{1}{k(k-1)} \sum_{\alpha=1}^{k} (\hat\theta_\alpha - \hat{\bar\theta})^2,$$

where the $\hat\theta_\alpha$ are the pseudovalues of (4.3.7). When $k = n$, this becomes

$$v_1(\hat{\bar\theta}) = \frac{1}{n(n-1)} \sum_{i=1}^{n} (y_i/p_i - \hat Y)^2, \qquad (4.3.8)$$

with $p_i = \pi_i/n$. The reader will recognize this as the textbook estimator of variance for pps wr sampling. More generally, when $k < n$ it can be shown that the equality in (4.3.8) does not hold algebraically but does hold in expectation,

$$E\{v_1(\hat{\bar\theta})\} = E\left\{\frac{1}{n(n-1)} \sum_{i=1}^{n} (y_i/p_i - \hat Y)^2\right\},$$
where the expectations are with respect to the πps sampling design. We conclude that the jackknife estimator of variance acts as if the sample were selected with unequal probabilities with replacement rather than without replacement! The bias of this procedure may be described as follows.
Lemma 4.3.5. Let $\mathrm{Var}\{\hat Y_{\pi ps}\}$ denote the variance of the Horvitz-Thompson estimator, and let $\mathrm{Var}\{\hat Y_{wr}\}$ denote the variance of $\hat Y_{wr} = n^{-1}\sum_{i=1}^{n} y_i/p_i$ in pps wr sampling. Let $\pi_i = np_i$ for the without replacement scheme. If the true design features without replacement sampling, then

$$\mathrm{Bias}\{v_1(\hat{\bar\theta})\} = \left(\mathrm{Var}\{\hat Y_{wr}\} - \mathrm{Var}\{\hat Y_{\pi ps}\}\right) n/(n-1).$$

That is, the bias of the jackknife estimator of variance is a factor $n/(n-1)$ times the gain (or loss) in precision from use of without replacement sampling.

Proof. Follows from Durbin (1953). See Section 2.4, particularly Theorem 2.4.6.
We conclude that the jackknife estimator of variance is conservative (upward biased) in the useful applications of πps sampling (applications where πps beats pps wr).
Some practitioners may prefer to use an approximate finite-population correction (fpc) to correct for the bias in $v_1(\hat{\bar\theta})$. Let $\bar\pi = \sum_{i=1}^{n} \pi_i/n$. The fpc may be incorporated in the jackknife calculations by working with

$$\tilde\theta_{(\alpha)} = \hat Y + (1-\bar\pi)^{1/2}(\hat Y_{(\alpha)} - \hat Y)$$

instead of

$$\hat\theta_{(\alpha)} = \hat Y_{(\alpha)}.$$
4.4. Application to Nonlinear Estimators
In Section 4.3, we applied the various jackknifing techniques to linear estimators, an application in which the jackknife probably has no real utility. The reader will recall that the jackknife simply reproduces the textbook variance estimators in most cases. Further, no worthwhile computational advantages are to be gained by using the jackknife rather than traditional formulae. Our primary interest in the jackknife lies in variance estimation for nonlinear statistics, and this is the topic of the present section. At the outset, we note that few finite-sample, distributional results are available concerning the use of the jackknife for nonlinear estimators. See Appendix B for the relevant asymptotic results. It is for this reason that we dealt at some length with linear estimators. In fact, the main justification for the jackknife in nonlinear problems is that it works well and its properties are known in linear problems. If a nonlinear statistic has a local linear quality, then, on the basis of the results presented in Section 4.3, the jackknife method should give reasonably good variance estimates.
To apply the jackknife to nonlinear survey statistics, we

(1) form k random groups and
(2) follow the jackknifing principles enumerated in Section 4.2 for the infinite-population model.

No restrictions on the sampling design are needed for application of the jackknife method. Whatever the design might be in a particular application, one simply forms the random groups according to the rules set forth in Section 2.2 (for the independent case) and Section 2.4 (for the nonindependent case). Then, as usual, the jackknife operates by omitting random groups from the sample. The jackknifed version of a nonlinear estimator $\hat\theta$ of some population parameter $\theta$ is

$$\hat{\bar\theta} = \sum_{\alpha=1}^{k} \hat\theta_\alpha/k,$$

where the pseudovalues are $\hat\theta_\alpha = k\hat\theta - (k-1)\hat\theta_{(\alpha)}$ and $\hat\theta_{(\alpha)}$ is the estimator of the same functional form as $\hat\theta$ obtained after omitting the $\alpha$-th random group. For linear estimators, we found that the estimator $\hat{\bar\theta}$ is equal to $\hat\theta$; for nonlinear estimators, this equality does not hold in general.
The jackknife variance estimator

$$v_1(\hat{\bar\theta}) = \frac{1}{k(k-1)} \sum_{\alpha=1}^{k} (\hat\theta_\alpha - \hat{\bar\theta})^2$$

was first given in (4.2.3). A conservative alternative, corresponding to (4.2.5), is

$$v_2(\hat{\bar\theta}) = \frac{1}{k(k-1)} \sum_{\alpha=1}^{k} (\hat\theta_\alpha - \hat\theta)^2.$$

We may use either $v_1(\hat{\bar\theta})$ or $v_2(\hat{\bar\theta})$ to estimate the variance of either $\hat\theta$ or $\hat{\bar\theta}$.
Little else is known about the relative accuracy of these estimators in finite samples. Brillinger (1966) shows that both $v_1$ and $v_2$ give plausible estimates of the asymptotic variance. The result for $v_2$ requires that $\hat\theta$ and $\hat\theta_{(\alpha)}$ have small biases, while the result for $v_1$ does not, instead requiring that the asymptotic correlation between $\hat\theta_{(\alpha)}$ and $\hat\theta_{(\alpha')}$ ($\alpha \ne \alpha'$) be of the form $(k-2)(k-1)^{-1}$. The latter condition will obtain in many applications because $\hat\theta_{(\alpha)}$ and $\hat\theta_{(\alpha')}$ have $(k-2)$ random groups in common out of $(k-1)$. For additional asymptotic results, see Appendix B.
We close this section by giving two examples.
4.4.1. Ratio Estimation
Suppose that it is desired to estimate

$$R = Y/X,$$

the ratio of two population totals. The usual estimator is

$$\hat R = \hat Y/\hat X,$$
where $\hat Y$ and $\hat X$ are estimators of the population totals based on the particular sampling design. Quenouille's estimator is obtained by working with

$$\hat R_{(\alpha)} = \hat Y_{(\alpha)}/\hat X_{(\alpha)},$$

where $\hat Y_{(\alpha)}$ and $\hat X_{(\alpha)}$ are estimators of $Y$ and $X$, respectively, after omitting the $\alpha$-th random group from the sample. Then, we have the pseudovalues

$$\hat R_\alpha = k\hat R - (k-1)\hat R_{(\alpha)} \qquad (4.4.1)$$

and Quenouille's estimator

$$\hat{\bar R} = k^{-1} \sum_{\alpha=1}^{k} \hat R_\alpha. \qquad (4.4.2)$$

To estimate the variance of either $\hat R$ or $\hat{\bar R}$, we have either

$$v_1(\hat{\bar R}) = \frac{1}{k(k-1)} \sum_{\alpha=1}^{k} (\hat R_\alpha - \hat{\bar R})^2 \qquad (4.4.3)$$

or

$$v_2(\hat{\bar R}) = \frac{1}{k(k-1)} \sum_{\alpha=1}^{k} (\hat R_\alpha - \hat R)^2. \qquad (4.4.4)$$
Specifically, let us assume srs wor sampling. Then,

$$\hat Y = N\bar y, \qquad \hat X = N\bar x,$$
$$\hat Y_{(\alpha)} = N\bar y_{(\alpha)}, \qquad \hat X_{(\alpha)} = N\bar x_{(\alpha)},$$
$$\hat R = \hat Y/\hat X, \qquad \hat R_{(\alpha)} = \hat Y_{(\alpha)}/\hat X_{(\alpha)},$$
$$\hat R_\alpha = k\hat R - (k-1)\hat R_{(\alpha)}, \qquad \hat{\bar R} = k^{-1} \sum_{\alpha=1}^{k} \hat R_\alpha.$$

If the sampling fraction $f = n/N$ is not negligible, then the modification

$$\tilde R_{(\alpha)} = \hat R + (1-f)^{1/2}(\hat R_{(\alpha)} - \hat R)$$

might usefully be applied in place of $\hat R_{(\alpha)}$.
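A sketch of the jackknifed ratio estimator with $m = 1$, $k = n$ (made-up data; the fpc modification is omitted for brevity):

```python
x = [10.0, 14.0, 8.0, 12.0, 9.0, 15.0, 11.0, 13.0]   # auxiliary values (made up)
y = [12.1, 16.8, 9.4, 14.9, 10.2, 18.3, 13.0, 15.5]
k = len(y)                                            # m = 1, k = n
R = sum(y) / sum(x)                                   # Rhat = Yhat / Xhat
omit = [sum(y[:a] + y[a + 1:]) / sum(x[:a] + x[a + 1:]) for a in range(k)]
pseudo = [k * R - (k - 1) * o for o in omit]          # (4.4.1)
Rbar = sum(pseudo) / k                                # (4.4.2)
v1 = sum((ps - Rbar) ** 2 for ps in pseudo) / (k * (k - 1))   # (4.4.3)
v2 = sum((ps - R) ** 2 for ps in pseudo) / (k * (k - 1))      # (4.4.4)
print(Rbar, v1, v2)
```

Note that $v_2 \ge v_1$ always holds here, by the identity following (4.2.5).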
4.4.2. A Regression Coefficient
A second illustrative example of the jackknife is given by the regression coefficient

$$\hat\beta = \frac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2}$$

based on srs wor sampling of size $n$. Quenouille's estimator for this problem is formed by working with

$$\hat\beta_{(\alpha)} = \frac{\sum_{i=1}^{m(k-1)} (x_i - \bar x_{(\alpha)})(y_i - \bar y_{(\alpha)})}{\sum_{i=1}^{m(k-1)} (x_i - \bar x_{(\alpha)})^2},$$

where the summations are over all units not in the $\alpha$-th random group. This gives the pseudovalue

$$\hat\beta_\alpha = k\hat\beta - (k-1)\hat\beta_{(\alpha)}$$

and Quenouille's estimator

$$\hat{\bar\beta} = k^{-1} \sum_{\alpha=1}^{k} \hat\beta_\alpha.$$

To estimate the variance of either $\hat\beta$ or $\hat{\bar\beta}$, we have either

$$v_1(\hat{\bar\beta}) = \frac{1}{k(k-1)} \sum_{\alpha=1}^{k} (\hat\beta_\alpha - \hat{\bar\beta})^2$$

or

$$v_2(\hat{\bar\beta}) = \frac{1}{k(k-1)} \sum_{\alpha=1}^{k} (\hat\beta_\alpha - \hat\beta)^2.$$

An fpc may be incorporated in the variance computations by working with

$$\tilde\beta_{(\alpha)} = \hat\beta + (1-f)^{1/2}(\hat\beta_{(\alpha)} - \hat\beta)$$

in place of $\hat\beta_{(\alpha)}$.
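The same recipe applied to the regression coefficient, with $m = 1$, $k = n$, and made-up data (fpc again omitted):

```python
def slope(x, y):
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
    sxx = sum((xi - xb) ** 2 for xi in x)
    return sxy / sxx

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.1, 2.3, 2.8, 4.5, 5.2, 5.8, 7.1, 8.4]
k = len(x)                                    # m = 1, k = n
b = slope(x, y)
omit = [slope(x[:a] + x[a + 1:], y[:a] + y[a + 1:]) for a in range(k)]
pseudo = [k * b - (k - 1) * o for o in omit]
bbar = sum(pseudo) / k                        # Quenouille's estimator
v1 = sum((ps - bbar) ** 2 for ps in pseudo) / (k * (k - 1))
print(bbar, v1 ** 0.5)                        # jackknifed slope, standard error
```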
4.5. Usage in Stratified Sampling
The jackknife runs into some difficulty in the context of stratified sampling plans because the observations are no longer identically distributed. We shall describe some methods for handling this problem. The reader should be especially careful
not to apply the classical jackknife estimators (see Sections 4.2, 4.3, and 4.4) to stratified sampling problems.
We assume the population is divided into $L$ strata, where $N_h$ denotes the size of the $h$-th stratum. Sampling is carried out independently in the various strata. Within the strata, simple random samples are selected, either with or without replacement, with $n_h$ denoting the sample size in the $h$-th stratum. The population is assumed to be $p$-variate, with

$$\mathbf{Y}_{hi} = (Y_{1hi}, Y_{2hi}, \ldots, Y_{phi})$$

denoting the value of the $i$-th unit in the $h$-th stratum. We let

$$\bar{\mathbf{Y}}_h = (\bar Y_{1h}, \bar Y_{2h}, \ldots, \bar Y_{ph})$$

denote the $p$-variate mean of the $h$-th stratum, $h = 1, \ldots, L$.
The problem we shall be addressing is that of estimating a population parameter of the form

$$\theta = g(\bar{\mathbf{Y}}_1, \ldots, \bar{\mathbf{Y}}_L),$$

where $g(\cdot)$ is a smooth function of the stratum means $\bar Y_{rh}$ for $h = 1, \ldots, L$ and $r = 1, \ldots, p$. The natural estimator of $\theta$ is

$$\hat\theta = g(\bar{\mathbf{y}}_1, \ldots, \bar{\mathbf{y}}_L); \qquad (4.5.1)$$

i.e., the same function of the sample means $\bar y_{rh} = \sum_{i=1}^{n_h} y_{rhi}/n_h$. The class of functions satisfying these specifications is quite broad, including, for example,

$$\hat\theta = \hat R = \frac{\sum_{h=1}^{L} N_h \bar y_{1h}}{\sum_{h=1}^{L} N_h \bar y_{2h}},$$

the combined ratio estimator;

$$\hat\theta = \bar y_{11}/\bar y_{12},$$

the ratio of one stratum mean to another;

$$\hat\theta = \hat\beta = \frac{\sum_{h=1}^{L} N_h \bar y_{4h} - \left(\sum_{h=1}^{L} N_h \bar y_{1h}\right)\left(\sum_{h=1}^{L} N_h \bar y_{2h}\right)\Big/N}{\sum_{h=1}^{L} N_h \bar y_{3h} - \left(\sum_{h=1}^{L} N_h \bar y_{2h}\right)^2\Big/N},$$

the regression coefficient (where $Y_{3hi} = Y_{2hi}^2$ and $Y_{4hi} = Y_{1hi} Y_{2hi}$); and

$$\hat\theta = (\bar y_{11}/\bar y_{21}) - (\bar y_{12}/\bar y_{22}),$$

the difference of ratios.
As in the case of the original jackknife, the methodology for stratified sampling works with estimators of $\theta$ obtained by removing observations from the full sample.
Accordingly, let $\hat\theta_{(hi)}$ denote the estimator of the same functional form as $\hat\theta$ obtained after deleting the $(h, i)$-th observation from the sample. Let

$$\hat\theta_{(h\cdot)} = \sum_{i=1}^{n_h} \hat\theta_{(hi)}/n_h,$$
$$\hat\theta_{(\cdot\cdot)} = \sum_{h=1}^{L} \sum_{i=1}^{n_h} \hat\theta_{(hi)}/n, \qquad n = \sum_{h=1}^{L} n_h,$$

and

$$\hat\theta_{(\cdot)} = \sum_{h=1}^{L} \hat\theta_{(h\cdot)}/L.$$

Then define the pseudovalues $\hat\theta_{hi}$ by setting

$$\hat\theta_{hi} = (Lq_h + 1)\hat\theta - Lq_h\,\hat\theta_{(hi)},$$
$$q_h = (n_h - 1)(1 - n_h/N_h), \quad \text{for without replacement sampling},$$
$$\phantom{q_h} = (n_h - 1), \quad \text{for with replacement sampling},$$

for $i = 1, \ldots, n_h$ and $h = 1, \ldots, L$.¹
In comparison with earlier sections, we see that the quantity $(Lq_h + 1)$ plays the role of the sample size and $Lq_h$ the sample size minus one, although this apparent analogy must be viewed as tenuous at best. The jackknife estimator of $\theta$ is now defined by

$$\hat{\bar\theta}_1 = \sum_{h=1}^{L} \sum_{i=1}^{n_h} \hat\theta_{hi}/(Ln_h) = \left(1 + \sum_{h=1}^{L} q_h\right)\hat\theta - \sum_{h=1}^{L} q_h\,\hat\theta_{(h\cdot)}, \qquad (4.5.2)$$

and its moments are described in the following theorem.
¹ In the case $L = 1$, contrast this definition with the special pseudovalues defined in Section 4.3. Here we have (dropping the $h$ subscript)

$$\hat\theta_i = \hat\theta + (n-1)(1-f)(\hat\theta - \hat\theta_{(i)}),$$

whereas in Section 4.3 we had the special pseudovalues

$$\tilde\theta_i = \hat\theta + (n-1)(1-f)^{1/2}(\hat\theta - \hat\theta_{(i)}).$$

For linear estimators $\hat\theta$, both pseudovalues lead to the same unbiased estimator of $\theta$. For nonlinear $\hat\theta$, the pseudovalue defined here removes both the order $n^{-1}$ and the order $N^{-1}$ (in the case of without replacement sampling) bias in the estimation of $\theta$. The pseudovalue $\tilde\theta_i$ attempts instead to include an fpc in the variance calculations. In this section, fpcs are incorporated in the variance estimators but not via the pseudovalues. See (4.5.3).
Theorem 4.5.1. Let $\hat\theta$ and $\hat{\bar\theta}_1$ be defined by (4.5.1) and (4.5.2), respectively. Let $g(\cdot)$ be a function of the stratum means, which does not explicitly involve the sample sizes $n_h$, with continuous third derivatives in a neighborhood of $\bar{\mathbf{Y}} = (\bar{\mathbf{Y}}_1, \ldots, \bar{\mathbf{Y}}_L)$. Then, the expectations of $\hat\theta$ and $\hat{\bar\theta}_1$ to second-order moments of the $\bar y_{rh}$ are

$$E\{\hat\theta\} = \theta + \sum_{h=1}^{L} \left[\frac{N_h - n_h}{(N_h - 1)n_h}\right] c_h, \quad \text{for without replacement sampling},$$
$$\phantom{E\{\hat\theta\}} = \theta + \sum_{h=1}^{L} c_h/n_h, \quad \text{for with replacement sampling},$$
$$E\{\hat{\bar\theta}_1\} = \theta,$$

where the $c_h$ are constants that do not depend on the $n_h$. Further, their variances to third-order moments are

$$\mathrm{Var}\{\hat\theta\} = \sum_{h=1}^{L} \left[\frac{N_h - n_h}{(N_h - 1)n_h}\right] d_{1h} + \sum_{h=1}^{L} \left[\frac{(N_h - n_h)(N_h - 2n_h)}{(N_h - 1)(N_h - 2)n_h^2}\right] d_{2h},$$

for without replacement sampling,

$$\phantom{\mathrm{Var}\{\hat\theta\}} = \sum_{h=1}^{L} n_h^{-1} d_{1h} + \sum_{h=1}^{L} n_h^{-2} d_{2h}, \quad \text{for with replacement sampling},$$

$$\mathrm{Var}\{\hat{\bar\theta}_1\} = \sum_{h=1}^{L} \left[\frac{N_h - n_h}{(N_h - 1)n_h}\right] d_{1h} - \sum_{h=1}^{L} \left[\frac{N_h - n_h}{(N_h - 1)(N_h - 2)n_h}\right] d_{2h},$$

for without replacement sampling,

$$\phantom{\mathrm{Var}\{\hat{\bar\theta}_1\}} = \sum_{h=1}^{L} n_h^{-1} d_{1h}, \quad \text{for with replacement sampling},$$

where the $d_{1h}$ and $d_{2h}$ are constants, not dependent upon the $n_h$, that represent the contributions of the second- and third-order moments of the $\bar y_{rh}$.

Proof. See Jones (1974) and Dippo (1981).
This theorem shows that the jackknife estimator $\hat{\bar\theta}_1$ is approximately unbiased for $\theta$. In fact, it is strictly unbiased whenever $\theta$ is a linear or quadratic function of the stratum means. This remark applies to estimators such as

$$\hat\theta = \sum_{h=1}^{L} N_h \bar y_{1h},$$

the estimator of the total;

$$\hat\theta = \bar y_{11} - \bar y_{12},$$

the estimated difference between stratum means; and

$$\hat\theta = \left(\sum_{h=1}^{L} (N_h/N)\bar y_{1h}\right)\left(\sum_{h=1}^{L} (N_h/N)\bar y_{2h}\right),$$

the combined product estimator.
As was the case for sampling without stratification (i.e., $L = 1$), the jackknife may be considered for its bias reduction properties. The estimator $\hat\theta_1$, called by Jones the first-order jackknife estimator, eliminates the order $n_h^{-1}$ and order $N_h^{-1}$ terms from the bias of $\hat\theta$ as an estimator of $\theta$. This is the import of the first part of Theorem 4.5.1. Jones also gives a second-order jackknife estimator, say $\hat\theta_2$, which is unbiased for $\theta$ through third-order moments of the $y_{rh}$:

$$\hat\theta_2 = \Bigl(1 + \sum_{h=1}^{L} q^{(h)} - \sum_{h=1}^{L} q^{(hh)}\Bigr)\hat\theta - \sum_{h=1}^{L} q^{(h)}\,\hat{\bar\theta}_{(h)} + \sum_{h=1}^{L} q^{(hh)}\,\hat{\bar\theta}_{(h)(h)},$$

$$q^{(h)} = a_h a^{(hh)}/\{(a^{(h)} - a_h)(a^{(hh)} - a^{(h)})\},$$

$$q^{(hh)} = a_h a^{(h)}/\{(a^{(hh)} - a_h)(a^{(hh)} - a^{(h)})\},$$

$$\hat{\bar\theta}_{(h)} = \sum_{i=1}^{n_h}\hat\theta_{(hi)}/n_h,$$

$$\hat{\bar\theta}_{(h)(h)} = 2\sum_{i<j}^{n_h}\hat\theta_{(hi)(hj)}/\{n_h(n_h - 1)\},$$

$$a_h = n_h^{-1} - N_h^{-1}, \quad \text{for without replacement sampling,} \qquad = n_h^{-1}, \quad \text{for with replacement sampling,}$$

$$a^{(h)} = (n_h - 1)^{-1} - N_h^{-1}, \quad \text{for without replacement sampling,} \qquad = (n_h - 1)^{-1}, \quad \text{for with replacement sampling,}$$

$$a^{(hh)} = (n_h - 2)^{-1} - N_h^{-1}, \quad \text{for without replacement sampling,} \qquad = (n_h - 2)^{-1}, \quad \text{for with replacement sampling,}$$

where $\hat\theta_{(hi)(hj)}$ is the estimator of the same functional form as $\hat\theta$ based upon the sample after removing both the $(h,i)$-th and the $(h,j)$-th observations. The second-order jackknife is strictly unbiased for estimators $\hat\theta$ that are cubic functions of the stratum means $\bar y_{rh}$. For linear functions, we have $\hat\theta_1 = \hat\theta_2$.
The jackknife estimator of variance for the stratified sampling problem is defined by

$$v_1(\hat\theta) = \sum_{h=1}^{L}\frac{q_h}{n_h}\sum_{i=1}^{n_h}\bigl(\hat\theta_{(hi)} - \hat{\bar\theta}_{(h)}\bigr)^2. \qquad(4.5.3)$$
This estimator and the following theorem are also due to Jones.
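To make the computation in (4.5.3) concrete, here is a minimal Python sketch for an arbitrary estimator supplied as a function of the full stratified sample. The data, the estimator, and the default choice $q_h = n_h - 1$ (appropriate for with replacement sampling) are illustrative assumptions, not part of the text.

```python
import random

def jackknife_v1(samples, theta, q=None):
    """Delete-one jackknife variance for stratified sampling, eq. (4.5.3):
    v1 = sum_h (q_h/n_h) sum_i (theta_(hi) - thetabar_(h))^2."""
    v1 = 0.0
    for h, y in enumerate(samples):
        n_h = len(y)
        q_h = (n_h - 1) if q is None else q[h]   # q_h = n_h - 1 for wr sampling
        # theta_(hi): recompute the estimator with observation (h, i) deleted
        reps = [theta([g if k != h else y[:i] + y[i + 1:]
                       for k, g in enumerate(samples)])
                for i in range(n_h)]
        rbar = sum(reps) / n_h                   # thetabar_(h)
        v1 += (q_h / n_h) * sum((r - rbar) ** 2 for r in reps)
    return v1

# Toy check with a linear estimator (the mean of stratum means), for which
# v1 reduces algebraically to sum_h s_h^2 / (n_h L^2).
random.seed(1)
strata = [[random.gauss(10.0, 2.0) for _ in range(8)] for _ in range(3)]
theta = lambda s: sum(sum(y) / len(y) for y in s) / len(s)
print(jackknife_v1(strata, theta))
```

For a linear estimator the delete-one replicates are themselves linear in the deleted observation, which is why the closed-form stratified variance is recovered exactly; nonlinear estimators simply pass a different `theta`.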
Theorem 4.5.2. Let the conditions of Theorem 4.5.1 hold. Then, to second-order moments of the $y_{rh}$, $v_1(\hat\theta)$ is an unbiased estimator of both $\mathrm{Var}\{\hat\theta\}$ and $\mathrm{Var}\{\hat\theta_1\}$. To third-order moments, the expectation is

$$\mathrm{E}\{v_1(\hat\theta)\} = \sum_{h=1}^{L}\left[\frac{N_h - n_h}{(N_h - 1)\,n_h}\right] d_{1h} + \sum_{h=1}^{L}\left[\frac{(N_h - n_h)(N_h - 2n_h + 2)}{(N_h - 1)(N_h - 2)\,n_h(n_h - 1)}\right] d_{2h}, \quad \text{for without replacement sampling,}$$

$$\mathrm{E}\{v_1(\hat\theta)\} = \sum_{h=1}^{L} n_h^{-1} d_{1h} + \sum_{h=1}^{L} n_h^{-1}(n_h - 1)^{-1} d_{2h}, \quad \text{for with replacement sampling,}$$

where $d_{1h}$ and $d_{2h}$ are defined in Theorem 4.5.1.
Proof. See Jones (1974) and Dippo (1981).
Thus, $v_1(\hat\theta)$ is an unbiased estimator of both $\mathrm{Var}\{\hat\theta\}$ and $\mathrm{Var}\{\hat\theta_1\}$ to second-order moments. When third-order moments are included, however, $v_1(\hat\theta)$ is seen to be a biased estimator of variance. Jones gives further modifications to the variance estimator that correct for even these lower-order biases.
In addition to Jones' work, McCarthy (1966) and Lee (1973b) have studied the jackknife for the case $n_h = 2$ $(h = 1, \ldots, L)$. McCarthy's jackknife estimator of variance is

$$v_M(\hat\theta) = \sum_{h=1}^{L}(1/2)\sum_{i=1}^{2}\Bigl(\hat\theta_{(hi)} - \sum_{h'=1}^{L}\hat{\bar\theta}_{(h')}/L\Bigr)^2,$$

and Lee's estimator is

$$v_L(\hat\theta) = \sum_{h=1}^{L}(1/2)\sum_{i=1}^{2}\bigl(\hat\theta_{(hi)} - \hat\theta\bigr)^2.$$
For general sample size $n_h$, the natural extensions of these estimators are

$$v_2(\hat\theta) = \sum_{h=1}^{L}(q_h/n_h)\sum_{i=1}^{n_h}\bigl(\hat\theta_{(hi)} - \hat{\bar\theta}_{(\cdot)}\bigr)^2, \qquad(4.5.4)$$

$$v_3(\hat\theta) = \sum_{h=1}^{L}(q_h/n_h)\sum_{i=1}^{n_h}\bigl(\hat\theta_{(hi)} - \hat{\bar{\bar\theta}}_{(\cdot)}\bigr)^2, \qquad(4.5.5)$$

and

$$v_4(\hat\theta) = \sum_{h=1}^{L}(q_h/n_h)\sum_{i=1}^{n_h}\bigl(\hat\theta_{(hi)} - \hat\theta\bigr)^2, \qquad(4.5.6)$$

where $\hat{\bar\theta}_{(\cdot)} = \sum_{h=1}^{L}\sum_{i=1}^{n_h}\hat\theta_{(hi)}/n$ and $\hat{\bar{\bar\theta}}_{(\cdot)} = \sum_{h=1}^{L}\hat{\bar\theta}_{(h)}/L$.
In the following theorem, we show that these are unbiased estimators of variance to a first-order approximation.

Theorem 4.5.3. Given the conditions of this section, the expectations of the jackknife variance estimators, to second-order moments of the $y_{rh}$, are

$$\mathrm{E}\{v_2(\hat\theta)\} = \sum_{h=1}^{L}\left[\frac{N_h - n_h}{(N_h - 1)\,n_h}\right] d_{1h}, \quad \text{for without replacement sampling,}$$

$$\mathrm{E}\{v_2(\hat\theta)\} = \sum_{h=1}^{L} n_h^{-1} d_{1h}, \quad \text{for with replacement sampling,}$$

and

$$\mathrm{E}\{v_3(\hat\theta)\} = \mathrm{E}\{v_4(\hat\theta)\} = \mathrm{E}\{v_2(\hat\theta)\},$$

where the $d_{1h}$ are as in Theorem 4.5.1.
Proof. Left to the reader.
Theorems 4.5.1, 4.5.2, and 4.5.3 show that, to second-order moments, $v_1$, $v_2$, $v_3$, and $v_4$ are unbiased estimators of the variance of both $\hat\theta$ and $\hat\theta_1$.
Some important relationships exist between the estimators, both for with replacement sampling and for without replacement sampling when the sampling fractions are negligible, $q_h = n_h - 1$. Given either of these conditions, Jones' estimator is

$$v_1(\hat\theta) = \sum_{h=1}^{L}[(n_h - 1)/n_h]\sum_{i=1}^{n_h}\bigl(\hat\theta_{(hi)} - \hat{\bar\theta}_{(h)}\bigr)^2,$$
and we may partition the sum of squares in Lee's estimator as

$$v_4(\hat\theta) = \sum_{h=1}^{L}[(n_h - 1)/n_h]\sum_{i=1}^{n_h}\bigl(\hat\theta_{(hi)} - \hat\theta\bigr)^2$$

$$\phantom{v_4(\hat\theta)} = \sum_{h=1}^{L}[(n_h - 1)/n_h]\sum_{i=1}^{n_h}\bigl(\hat\theta_{(hi)} - \hat{\bar\theta}_{(h)}\bigr)^2 \qquad(4.5.7)$$

$$\quad + \sum_{h=1}^{L}(n_h - 1)\bigl(\hat{\bar\theta}_{(h)} - \hat{\bar\theta}_{(\cdot)}\bigr)^2 + (n + L)\bigl(\hat{\bar\theta}_{(\cdot)} - \hat{\bar{\bar\theta}}_{(\cdot)}\bigr)^2$$

$$\quad + (n - L)\bigl(\hat{\bar{\bar\theta}}_{(\cdot)} - \hat\theta\bigr)^2 + 2n\bigl(\hat{\bar{\bar\theta}}_{(\cdot)} - \hat\theta\bigr)\bigl(\hat{\bar\theta}_{(\cdot)} - \hat{\bar{\bar\theta}}_{(\cdot)}\bigr).$$
The first term on the right-hand side of (4.5.7) is $v_1(\hat\theta)$. The estimator $v_2(\hat\theta)$ is equal to the first two terms on the right-hand side, and $v_3(\hat\theta)$ is equal to the first three terms. It follows that

(i) $v_2(\hat\theta) \ge v_1(\hat\theta)$,
(ii) $v_3(\hat\theta) \ge v_2(\hat\theta) \ge v_1(\hat\theta)$,
(iii) $v_4(\hat\theta) \ge v_3(\hat\theta) = v_2(\hat\theta) \ge v_1(\hat\theta)$, whenever the $n_h$ are roughly equal.

These results hold algebraically irrespective of the particular sample selected. They are important in view of the result (see Theorem 4.5.3) that the four estimators have the same expectation to second-order moments of the $y_{rh}$. We may say that $v_2(\hat\theta)$ and $v_3(\hat\theta)$ are conservative estimators of variance relative to $v_1(\hat\theta)$, and that $v_4(\hat\theta)$ is more conservative still.

Example 4.5.1. If $\hat\theta = \hat R$ is the combined ratio estimator, then the jackknife operates on

$$\hat\theta_{(hi)} = \hat R_{(hi)} = \frac{\displaystyle\sum_{h'\neq h} N_{h'}\,\bar y_{1h'} + N_h\sum_{j\neq i}^{n_h} y_{1hj}/(n_h - 1)}{\displaystyle\sum_{h'\neq h} N_{h'}\,\bar y_{2h'} + N_h\sum_{j\neq i}^{n_h} y_{2hj}/(n_h - 1)}.$$
The jackknife estimator of $\theta = R$ is

$$\hat\theta_1 = \hat R_1 = \Bigl(1 + \sum_{h=1}^{L} q_h\Bigr)\hat R - \sum_{h=1}^{L} q_h\,\hat{\bar R}_{(h)},$$

where $\hat{\bar R}_{(h)} = \sum_{i=1}^{n_h}\hat R_{(hi)}/n_h$. The corresponding variance estimators are
$$v_1(\hat\theta) = v_1(\hat R) = \sum_{h=1}^{L}\frac{q_h}{n_h}\sum_{i=1}^{n_h}\bigl(\hat R_{(hi)} - \hat{\bar R}_{(h)}\bigr)^2,$$

$$v_2(\hat\theta) = v_2(\hat R) = \sum_{h=1}^{L}\frac{q_h}{n_h}\sum_{i=1}^{n_h}\bigl(\hat R_{(hi)} - \hat{\bar R}_{(\cdot)}\bigr)^2,$$

$$v_3(\hat\theta) = v_3(\hat R) = \sum_{h=1}^{L}\frac{q_h}{n_h}\sum_{i=1}^{n_h}\bigl(\hat R_{(hi)} - \hat{\bar{\bar R}}_{(\cdot)}\bigr)^2,$$

$$v_4(\hat\theta) = v_4(\hat R) = \sum_{h=1}^{L}\frac{q_h}{n_h}\sum_{i=1}^{n_h}\bigl(\hat R_{(hi)} - \hat R\bigr)^2,$$
and are applicable to either $\hat R$ or $\hat R_1$. For $n_h = 2$ and $(1 - n_h/N_h) \doteq 1$, the estimators reduce to

$$v_1(\hat R) = \sum_{h=1}^{L}(1/2)\sum_{i=1}^{2}\bigl(\hat R_{(hi)} - \hat{\bar R}_{(h)}\bigr)^2 = \sum_{h=1}^{L}\bigl(\hat R_{(h1)} - \hat R_{(h2)}\bigr)^2/4,$$

$$v_2(\hat R) = \sum_{h=1}^{L}(1/2)\sum_{i=1}^{2}\bigl(\hat R_{(hi)} - \hat{\bar R}_{(\cdot)}\bigr)^2,$$

$$v_3(\hat R) = \sum_{h=1}^{L}(1/2)\sum_{i=1}^{2}\bigl(\hat R_{(hi)} - \hat{\bar{\bar R}}_{(\cdot)}\bigr)^2,$$

$$v_4(\hat R) = \sum_{h=1}^{L}(1/2)\sum_{i=1}^{2}\bigl(\hat R_{(hi)} - \hat R\bigr)^2.$$
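As a quick numerical check of the relations among the four estimators, the following Python sketch computes $v_1$ through $v_4$ for a combined ratio estimator with $n_h = 2$ and negligible sampling fractions (so $q_h/n_h = 1/2$). The strata, data, and stratum sizes are hypothetical.

```python
import random

def combined_ratio(strata, N):
    """R-hat = sum_h N_h ybar_1h / sum_h N_h ybar_2h (combined ratio)."""
    num = sum(Nh * sum(a for a, _ in s) / len(s) for Nh, s in zip(N, strata))
    den = sum(Nh * sum(b for _, b in s) / len(s) for Nh, s in zip(N, strata))
    return num / den

random.seed(7)
L, N = 4, [100, 120, 80, 150]
strata = [[(random.gauss(50, 5), random.gauss(40, 4)) for _ in range(2)]
          for _ in range(L)]

R = combined_ratio(strata, N)
# Replicate estimates R_(hi): delete observation i from stratum h.
reps = [[combined_ratio([t if k != h else s[:i] + s[i + 1:]
                         for k, t in enumerate(strata)], N)
         for i in range(len(s))] for h, s in enumerate(strata)]

Rbar_h = [sum(r) / len(r) for r in reps]                           # Rbar_(h)
Rbar_all = sum(sum(r) for r in reps) / sum(len(r) for r in reps)   # Rbar_(.)
Rbarbar = sum(Rbar_h) / L                                          # Rbarbar_(.)

def v(center_h):
    # sum_h (1/2) sum_i (R_(hi) - center_h)^2, per the reduced n_h = 2 forms
    return sum(0.5 * sum((r - center_h[h]) ** 2 for r in reps[h])
               for h in range(L))

v1 = v(Rbar_h)
v2 = v([Rbar_all] * L)
v3 = v([Rbarbar] * L)
v4 = v([R] * L)
print(v1, v2, v3, v4)
```

With equal $n_h = 2$ the two overall means coincide, so $v_3 = v_2$ up to floating-point noise, illustrating relation (iii).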
All of the results stated thus far have been for the case where the jackknife operates on estimators $\hat\theta_{(hi)}$ obtained by eliminating single observations from the full sample. Valid results may also be obtained if we divide the sample $n_h$ into $k$ random groups of size $m_h$ and define $\hat\theta_{(hi)}$ to be the estimator obtained after deleting the $m_h$ observations in the $i$-th random group from stratum $h$.
The results stated thus far have also been for the case of simple random sampling, either with or without replacement. Now suppose sampling is carried out pps with replacement within the $L$ strata. The natural estimator

$$\hat\theta = g(\bar x_1, \ldots, \bar x_L) \qquad(4.5.8)$$

is now defined in terms of

$$\bar x_{rh} = (1/n_h)\sum_{i=1}^{n_h} x_{rhi},$$

where $x_{rhi} = y_{rhi}/N_h p_{hi}$ and $p_{hi}$ denotes the probability associated with the $(h,i)$-th unit. As usual, $p_{hi} > 0$ for all $h$ and $i$, and $\sum_i p_{hi} = 1$ for all $h$. Similarly, $\hat\theta_{(hi)}$ denotes the estimator of the same form as (4.5.8) obtained after deleting the $(h,i)$-th observation from the sample,

$$\hat{\bar\theta}_{(h)} = \sum_{i=1}^{n_h}\hat\theta_{(hi)}/n_h,$$

$$\hat{\bar\theta}_{(\cdot)} = \sum_{h=1}^{L}\sum_{i=1}^{n_h}\hat\theta_{(hi)}/n,$$

and

$$\hat{\bar{\bar\theta}}_{(\cdot)} = \sum_{h=1}^{L}\hat{\bar\theta}_{(h)}/L.$$
The first-order jackknife estimator of $\theta$ is

$$\hat\theta_1 = \Bigl(1 + \sum_{h=1}^{L} q_h\Bigr)\hat\theta - \sum_{h=1}^{L} q_h\,\hat{\bar\theta}_{(h)},$$

where $q_h = (n_h - 1)$. To estimate the variance of either $\hat\theta$ or $\hat\theta_1$, we may use
$$v_1(\hat\theta) = \sum_{h=1}^{L}\{(n_h - 1)/n_h\}\sum_{i=1}^{n_h}\bigl(\hat\theta_{(hi)} - \hat{\bar\theta}_{(h)}\bigr)^2,$$

$$v_2(\hat\theta) = \sum_{h=1}^{L}\{(n_h - 1)/n_h\}\sum_{i=1}^{n_h}\bigl(\hat\theta_{(hi)} - \hat{\bar\theta}_{(\cdot)}\bigr)^2,$$

$$v_3(\hat\theta) = \sum_{h=1}^{L}\{(n_h - 1)/n_h\}\sum_{i=1}^{n_h}\bigl(\hat\theta_{(hi)} - \hat{\bar{\bar\theta}}_{(\cdot)}\bigr)^2,$$

or

$$v_4(\hat\theta) = \sum_{h=1}^{L}\{(n_h - 1)/n_h\}\sum_{i=1}^{n_h}\bigl(\hat\theta_{(hi)} - \hat\theta\bigr)^2.$$
The reader should note that the earlier results for simple random sampling extend to the present problem.

If sampling is with unequal probability without replacement with inclusion probabilities

$$\pi_{hi} = P\{(h,i)\text{-th unit in sample}\} = n_h p_{hi},$$

i.e., a $\pi$ps sampling scheme, then we may use the jackknife methods just given for pps with replacement sampling. This is a conservative procedure in the sense that the resulting variance estimators will tend to overestimate the true variance. See Section 2.4.5.
Example 4.5.2. If sampling is pps with replacement and $\hat\theta = \hat R$ is the combined ratio estimator, then the jackknife operates on

$$\hat\theta_{(hi)} = \hat R_{(hi)} = \frac{\displaystyle\sum_{h'\neq h} N_{h'}\,\bar x_{1h'} + N_h\sum_{j\neq i}^{n_h} x_{1hj}/(n_h - 1)}{\displaystyle\sum_{h'\neq h} N_{h'}\,\bar x_{2h'} + N_h\sum_{j\neq i}^{n_h} x_{2hj}/(n_h - 1)},$$

where $x_{rhj} = y_{rhj}/N_h p_{hj}$.
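A small Python sketch of this example follows. Only the transformation $x_{rhi} = y_{rhi}/N_h p_{hi}$ and the delete-one recomputation come from the text; the strata, probabilities, and responses are invented for illustration.

```python
import random

# Hypothetical pps-with-replacement data within L strata; p_hi sum to 1 per stratum.
random.seed(3)
L = 3
N = [200, 150, 300]                      # stratum sizes N_h
n = [5, 4, 6]                            # sample sizes n_h
raw = [[random.random() for _ in range(nh)] for nh in n]
p = [[v / sum(row) for v in row] for row in raw]
y1 = [[random.gauss(30, 3) for _ in range(nh)] for nh in n]
y2 = [[random.gauss(25, 2) for _ in range(nh)] for nh in n]

# x_rhi = y_rhi / (N_h p_hi); the combined ratio is a function of the x-means.
x1 = [[y1[h][i] / (N[h] * p[h][i]) for i in range(n[h])] for h in range(L)]
x2 = [[y2[h][i] / (N[h] * p[h][i]) for i in range(n[h])] for h in range(L)]

def R_hat(drop=None):
    """Combined ratio estimator; drop=(h,i) deletes that observation, as in R_(hi)."""
    num = den = 0.0
    for h in range(L):
        idx = [i for i in range(n[h]) if drop != (h, i)]
        num += N[h] * sum(x1[h][i] for i in idx) / len(idx)
        den += N[h] * sum(x2[h][i] for i in idx) / len(idx)
    return num / den

R = R_hat()
reps = [[R_hat(drop=(h, i)) for i in range(n[h])] for h in range(L)]
v1 = sum(((n[h] - 1) / n[h])
         * sum((r - sum(reps[h]) / n[h]) ** 2 for r in reps[h])
         for h in range(L))
print(R, v1)
```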
Finally, there is another variant of the jackknife, which is appropriate for stratified samples. This variant is analogous to the second option of rule (iv), Section 2.4, whereas previous variants of the jackknife have been analogous to the first option of that rule. Let the sample $n_h$ (which may be srs, pps, or $\pi$ps) be divided into $k$ random groups of size $m_h$, for $h = 1, \ldots, L$, and let $\hat\theta_{(\alpha)}$ denote the estimator of $\theta$ obtained after removing the $\alpha$-th group of observations from each stratum. Define the pseudovalues

$$\hat\theta_\alpha = k\hat\theta - (k - 1)\hat\theta_{(\alpha)}.$$

Then, the estimator of $\theta$ and the estimator of its variance are

$$\hat{\bar\theta} = \sum_{\alpha=1}^{k}\hat\theta_\alpha/k,$$

$$v_1(\hat{\bar\theta}) = \frac{1}{k(k-1)}\sum_{\alpha=1}^{k}\bigl(\hat\theta_\alpha - \hat{\bar\theta}\bigr)^2. \qquad(4.5.9)$$

Little is known about the relative merits of this method vis-à-vis earlier methods. It does seem that the estimators in (4.5.9) have computational advantages over earlier methods. Unless $k$ is large, however, (4.5.9) may be subject to greater instability.
4.6. Application to Cluster Sampling
Throughout this chapter, we have treated the case where the elementary units and sampling units are identical. We now assume that clusters of elementary units, comprising primary sampling units (PSUs), are selected, possibly with several stages of subsampling occurring within each selected PSU. We continue to let $\hat\theta$ be the parent sample estimator of an arbitrary parameter $\theta$. Now, however, $N$ and $n$ denote the number of PSUs in the population and sample, respectively.

No new principles are involved in the application of jackknife methodology to clustered samples. When forming $k$ random groups of $m$ units each ($n = mk$), we simply work with the ultimate clusters rather than the elementary units. The reader will recall from Chapter 2 that the term ultimate cluster refers to the aggregate of elementary units, second-stage units, third-stage units, and so on from the same primary unit. See rule (iii), Section 2.4. The estimator $\hat\theta_{(\alpha)}$ is then computed from the parent sample after eliminating the $\alpha$-th random group of ultimate clusters ($\alpha = 1, \ldots, k$). Pseudovalues $\hat\theta_\alpha$, Quenouille's estimator $\hat{\bar\theta}$, and the variance estimators $v_1(\hat{\bar\theta})$ and $v_2(\hat{\bar\theta})$ are then defined in the usual way. As an illustration, consider the estimator

$$\hat\theta = \hat Y = \frac{1}{n}\sum_{i=1}^{n} y_i/p_i \qquad(4.6.1)$$

of the population total $\theta = Y$ based on a pps wr sample of $n$ primaries, where $y_i$ denotes an estimator of the total in the $i$-th selected primary based on sampling at the second and successive stages. For this problem, the $\hat\theta_{(\alpha)}$ are defined by

$$\hat\theta_{(\alpha)} = \hat Y_{(\alpha)} = \frac{1}{m(k-1)}\sum_{i=1}^{m(k-1)} y_i/p_i, \qquad(4.6.2)$$

where the summation is taken over all selected PSUs not in the $\alpha$-th group. Quenouille's estimator is

$$\hat{\bar Y} = \frac{1}{k}\sum_{\alpha=1}^{k}\hat Y_\alpha,$$

where the pseudovalues are

$$\hat\theta_\alpha = \hat Y_\alpha = k\hat Y - (k - 1)\hat Y_{(\alpha)},$$

and

$$v_1(\hat{\bar\theta}) = \frac{1}{k(k-1)}\sum_{\alpha=1}^{k}\bigl(\hat Y_\alpha - \hat{\bar Y}\bigr)^2$$

is an unbiased estimator of the variance of $\hat{\bar\theta}$.
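The steps in (4.6.1)-(4.6.2) can be sketched as follows. The PSU data, selection probabilities, and group assignment are simulated assumptions; because this $\hat Y$ is linear, Quenouille's estimator reproduces the parent estimate exactly.

```python
import random

random.seed(11)
k, m = 5, 4                                        # k random groups of m PSUs
n = k * m
raw = [random.random() for _ in range(n)]
s = sum(raw)
p = [v / s for v in raw]                           # pps wr selection probabilities
y = [random.gauss(500.0, 50.0) for _ in range(n)]  # estimated PSU totals y_i

order = list(range(n))
random.shuffle(order)
groups = [order[a * m:(a + 1) * m] for a in range(k)]   # k random groups

Y_hat = sum(y[i] / p[i] for i in range(n)) / n          # eq. (4.6.1)
Y_alpha = []                                            # Y_(alpha), eq. (4.6.2)
for g in groups:
    keep = [i for i in range(n) if i not in g]
    Y_alpha.append(sum(y[i] / p[i] for i in keep) / (m * (k - 1)))

pseudo = [k * Y_hat - (k - 1) * ya for ya in Y_alpha]   # pseudovalues
Y_quen = sum(pseudo) / k                                # Quenouille's estimator
v1 = sum((pv - Y_quen) ** 2 for pv in pseudo) / (k * (k - 1))
print(Y_quen, v1)
```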
If the sample of PSUs is actually selected without replacement with inclusion probabilities $\pi_i = np_i$, then the estimators $\hat\theta$ and $\hat\theta_{(\alpha)}$ take the same form as indicated in (4.6.1) and (4.6.2) for with replacement sampling. In this case, the estimator $v_1(\hat{\bar\theta})$ will tend to overestimate the true variance. Note that $\hat{\bar\theta} = \hat\theta$ and $v_1(\hat{\bar\theta}) = v_2(\hat{\bar\theta})$ because the parent sample estimator $\hat\theta$ is linear in the $y_i$. For nonlinear $\hat\theta$, the estimators $\hat{\bar\theta}$ and $\hat\theta$ are generally not equal, nor are the estimators of variance $v_1(\hat{\bar\theta})$ and $v_2(\hat{\bar\theta})$. For the nonlinear case, exact results about the moment properties of the estimators generally are not available. Approximations are possible using the theory of Taylor series expansions. See Chapter 6. Also see Appendix B for some discussion of the asymptotic properties of the estimators.
If the sample of PSUs is selected independently within each of $L \ge 2$ strata, then we use the rule of ultimate clusters together with the techniques discussed in Section 4.5. The methods are now based upon the estimators $\hat\theta_{(hi)}$ formed by deleting the $i$-th ultimate cluster from the $h$-th stratum. Continuing the example $\hat\theta = \hat Y$, we have

$$\hat\theta_1 = \Bigl(1 + \sum_{h=1}^{L} q_h\Bigr)\hat\theta - \sum_{h=1}^{L} q_h\,\hat{\bar\theta}_{(h)} = \Bigl(1 + \sum_{h=1}^{L} q_h\Bigr)\hat Y - \sum_{h=1}^{L} q_h\,\hat{\bar Y}_{(h)}, \qquad(4.6.3)$$
the first-order jackknife estimator of $\theta$, and

$$v_1(\hat\theta) = \sum_{h=1}^{L}\frac{n_h - 1}{n_h}\sum_{i=1}^{n_h}\bigl(\hat\theta_{(hi)} - \hat{\bar\theta}_{(h)}\bigr)^2 \qquad(4.6.4a)$$

$$\phantom{v_1(\hat\theta)} = \sum_{h=1}^{L}\frac{n_h - 1}{n_h}\sum_{i=1}^{n_h}\bigl(\hat Y_{(hi)} - \hat{\bar Y}_{(h)}\bigr)^2, \qquad(4.6.4b)$$

the first-order jackknife estimator of the variance, where

$$\hat{\bar\theta}_{(h)} = \hat{\bar Y}_{(h)} = \frac{1}{n_h}\sum_{j=1}^{n_h}\hat Y_{(hj)}$$

is the mean of the $\hat\theta_{(hj)} = \hat Y_{(hj)}$ over all selected primaries in the $h$-th stratum.
Consider an estimator of the population total of the form

$$\hat Y = \sum_{h=1}^{L}\sum_{i=1}^{n_h}\sum_{j=1}^{m_{hi}} w_{hij}\,y_{hij}, \qquad(4.6.5)$$

where $w_{hij}$ is the weight associated with the $j$-th completed interview in the $(h,i)$-th PSU. The estimator after dropping one ultimate cluster is

$$\hat Y_{(hi)} = \sum_{h'=1}^{L}\sum_{i'=1}^{n_{h'}}\sum_{j=1}^{m_{h'i'}} w_{(hi)h'i'j}\,y_{h'i'j}, \qquad(4.6.6)$$

where the replicate weights satisfy

$$w_{(hi)h'i'j} = w_{h'i'j}, \quad \text{if } h' \neq h,$$

$$\phantom{w_{(hi)h'i'j}} = w_{h'i'j}\,\frac{n_h}{n_h - 1}, \quad \text{if } h' = h \text{ and } i' \neq i,$$

$$\phantom{w_{(hi)h'i'j}} = 0, \quad \text{if } h' = h \text{ and } i' = i. \qquad(4.6.7)$$
Then, the first jackknife estimator of variance $v_1(\hat Y)$ is defined by (4.6.4b). Estimators $v_2(\hat Y)$, $v_3(\hat Y)$, and $v_4(\hat Y)$ are defined similarly. For a general nonlinear parameter $\theta$, the estimators $\hat\theta$ and $\hat\theta_{(hi)}$ are defined in terms of the parent sample weights, $w_{hij}$, and the replicate weights, $w_{(hi)h'i'j}$, respectively. And the first jackknife estimator of variance $v_1(\hat\theta)$ is defined by (4.6.4a). In this discussion, nonresponse and poststratification (calibration) adjustments incorporated within the parent sample weights are also used for each set of jackknife replicate weights. A technically better approach, one that would reflect the increases and decreases in the variance brought by the various adjustments, would entail recalculating the adjustments for each set of jackknife replicate weights.
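A minimal Python sketch of the replicate-weight recipe (4.6.7), applied to the weighted total (4.6.5); the strata, cluster sizes, weights, and responses are all hypothetical.

```python
import random

random.seed(5)
L = 3
n = [3, 2, 4]                                   # ultimate clusters per stratum
w = [[[random.uniform(50, 150) for _ in range(random.randint(2, 5))]
      for _ in range(n[h])] for h in range(L)]  # parent weights w_hij
y = [[[random.gauss(10, 2) for _ in cluster] for cluster in w[h]] for h in range(L)]

def total(weights):
    """Weighted total, eq. (4.6.5)."""
    return sum(wk * yk
               for h in range(L)
               for i in range(n[h])
               for wk, yk in zip(weights[h][i], y[h][i]))

def replicate_weights(h0, i0):
    """Eq. (4.6.7): drop cluster (h0, i0), inflate the rest of stratum h0."""
    factor = lambda h, i: (1.0 if h != h0 else
                           0.0 if i == i0 else n[h0] / (n[h0] - 1))
    return [[[factor(h, i) * wk for wk in w[h][i]]
             for i in range(n[h])] for h in range(L)]

Y = total(w)
Y_rep = [[total(replicate_weights(h, i)) for i in range(n[h])] for h in range(L)]
v1 = sum(((n[h] - 1) / n[h])
         * sum((Y_rep[h][i] - sum(Y_rep[h]) / n[h]) ** 2 for i in range(n[h]))
         for h in range(L))                      # eq. (4.6.4b)
print(Y, v1)
```

Here the replicate totals are produced purely by reweighting, which is how production systems usually implement the drop-one-cluster jackknife.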
The discussion in this section has dealt solely with the problems of estimating $\theta$ and estimating the total variance of the estimator. In survey planning, however, we often face the problem of estimating the individual components of variance due to the various stages of selection. This is important in order to achieve an efficient allocation of the sample. See Folsom, Bayless, and Shah (1971) for a jackknife methodology for variance component problems.
4.7. Example: Variance Estimation for the NLSY97
The National Longitudinal Survey of Youth, 1997 (NLSY97) is the latest in a
series of surveys sponsored by the U.S. Department of Labor to examine issues
surrounding youth entry into the workforce and subsequent transitions in and out
of the workforce. The NLSY97 is following a cohort of approximately 9000 youths
who completed an interview in 1997 (the base year). These youths were between 12 and 16 years of age as of December 31, 1996, and are being interviewed annually using a mix of some core questions asked annually and varying subject modules that rotate from year to year.
The sample design involved the selection of two independent area-probability samples: (1) a cross-sectional sample designed to represent the various subgroups of the eligible population in their proper population proportions; and (2) a supplemental sample designed to produce oversamples of Hispanic and non-Hispanic Black youths. Both samples were selected by standard area-probability sampling methods. Sampling was in three essential stages: primary sampling units (PSUs) consisting mainly of Metropolitan Statistical Areas (MSAs) or single counties, segments consisting of single census blocks or clusters of neighboring blocks, and housing units. All eligible youths in each household were then selected for interviewing and testing.
The sampling of PSUs was done using pps systematic sampling on a geographically sorted file. To proceed with variance estimation, we adopt the approximation of treating the NLSY97 as a two-per-stratum sampling design. We take the first-stage sampling units in order of selection and collapse two together to make a pseudostratum. We treat the two PSUs as if they were selected via pps wr sampling within strata.
In total, the NLSY97 sample consists of 200 PSUs, 36 of which were selected
with certainty. Since the certainties do not involve sampling until the segment level,
we paired the segments and call each pair a pseudostratum. Overall, we formed 323
pseudostrata, of which 242 were related to certainty strata and 81 to noncertainty
PSUs (79 of the pseudostrata included two PSUs and two actually included three
PSUs).
Wolter, Pedlow, and Wang (2005) carried out BHS and jackknife variance estimation for NLSY97 data. We created 336 replicate weights for the BHS method (using an order 336 Hadamard matrix). Note that 336 is a multiple of 4 that is larger than the number of pseudostrata, 323. For the jackknife method, we did not compute the 2 x 323 = 646 replicate weights that would have been standard for the drop-out-one method; instead, to simplify the computations, we combined three pseudostrata together to make 108 revised pseudostrata consisting of two groups of three pseudo-PSUs each (one of the revised pseudostrata actually consisted of
Table 4.7.1. Percentage of Students Who Ever Worked During the School Year or Following Summer

Columns: domain estimate (%); jackknife standard error without replicate reweighting (%); BHS standard error without replicate reweighting (%); jackknife standard error with replicate reweighting (%); BHS standard error with replicate reweighting (%).

Total, age 17            89.0   1.00   0.98   0.99   0.96
  Male youths            88.5   1.49   1.35   1.47   1.33
  Female youths          89.6   1.23   1.36   1.20   1.34
  White non-Hispanic     92.2   1.04   1.09   1.04   1.10
  Black non-Hispanic     78.7   3.18   2.76   3.18   2.78
  Hispanic origin        86.8   2.34   2.55   2.34   2.56
  Grade 11               84.8   2.63   2.63   2.64   2.64
  Grade 12               90.9   1.19   1.05   1.18   1.03
Total, age 18            90.5   1.22   1.21   1.24   1.22
  Male youths            89.3   2.05   1.90   2.04   1.89
  Female youths          91.8   1.32   1.55   1.31   1.54
  White non-Hispanic     92.8   1.47   1.42   1.48   1.43
  Black non-Hispanic     82.6   2.96   3.13   2.96   3.15
  Hispanic origin        89.4   2.37   2.35   2.37   2.37
  Grade 12               86.7   1.91   2.16   1.92   2.17
  Freshman in college    93.9   1.27   1.34   1.28   1.35
Total, age 19            94.1   1.20   1.13   1.20   1.14
  Male youths            93.8   1.61   1.64   1.63   1.67
  Female youths          94.4   1.43   1.42   1.42   1.42
  White non-Hispanic     94.8   1.53   1.44   1.53   1.44
  Black non-Hispanic     88.9   3.24   3.04   3.18   3.04
  Hispanic origin        92.2   3.51   3.46   3.56   3.48
  Freshman in college    95.2   1.91   1.89   1.92   1.91
  Sophomore in college   95.1   1.42   1.43   1.41   1.42
the two pseudostrata that included three PSUs). We created 2 x 108 = 216 replicate weights by dropping out one group at a time.

In creating replicate weights for the BHS and jackknife methods, the simple approach is to adjust the final parent sample weights only for the conditional probability of selecting the replicate given the parent sample, as in (3.3.3) and (4.6.7). We calculated the simple approach. We also produced replicate weights according to the technically superior approach of recalculating adjustments for nonresponse and other factors within each replicate.
Table 4.7.1 shows the estimated percentage of enrolled youths (or students) who worked during the 2000-01 school year or the summer of 2001. The weighted estimates are broken down by age (17, 18, or 19) and by age crossed by sex, race/ethnicity, or grade. The data for this table are primarily from rounds 4 and 5 of the NLSY97, conducted in 2000-01 and 2001-02, respectively. Analysis of these data was first presented in the February 18, 2004 issue of BLS News (see Bureau of Labor Statistics, 2004).

The resulting standard error estimates appear in Table 4.7.1. The jackknife and BHS give very similar results. For the simple versions of the estimators without replicate reweighting, of 24 estimates, the jackknife standard error estimate is largest 14.5 times (60%) and the BHS standard error estimate is largest 9.5 times (40%).² The average standard errors are 1.86 and 1.85, and the median standard errors are 1.51 and 1.50, for the jackknife and BHS, respectively.
The means and medians of the replicate-reweighted standard error estimates are 1.85 and 1.51 for the jackknife and 1.85 and 1.49 for the BHS. Theoretically, the replicate-reweighted methods should result in larger estimated standard errors than the corresponding methods without replicate reweighting because they account for the variability in response rates and nonresponse adjustments that is ignored by simply adjusting the final weight for the replicate subsampling rate. However, we find almost no difference between the estimates with and without replicate reweighting by method. Since the estimates with and without replicate reweighting are so close, the method without replicate reweighting is preferred because it is the simpler method computationally.
An illustration of the jackknife and BHS calculations appears in Table 4.7.2.
4.8. Example: Estimating the Size of the U.S. Population
This example is concerned with estimating the size of the U.S. population. The method of estimation is known variously as dual-system or capture-recapture estimation. Although our main purpose is to illustrate the use of the jackknife method for variance estimation, we first describe, in general terms, the two-sample capture-recapture problem.
² When two methods tie, we consider each to be largest 0.5 times.
Table 4.7.2. Illustration of Jackknife and BHS Calculations for the NLSY Example

For a given demographic domain, the estimated proportion is of the form

$$\hat R = \frac{\sum_{h=1}^{L}\sum_{i=1}^{n_h}\sum_{j=1}^{m_{hi}} w_{hij}\,y_{hij}}{\sum_{h=1}^{L}\sum_{i=1}^{n_h}\sum_{j=1}^{m_{hi}} w_{hij}\,x_{hij}},$$

where

$x_{hij}$ = 1, if the interviewed youth is in the specified domain and is a student, and = 0 otherwise;

$y_{hij}$ = 1, if the interviewed youth is in the specified domain, is a student, and ever worked during the school year or following summer, and = 0 otherwise;

and $L$ denotes the number of pseudostrata, $n_h = 2$ is the number of pseudo-PSU groups per stratum, and $m_{hi}$ is the number of completed youth interviews within the $(h,i)$-th group. $L$ is 323 and 108 for the BHS and jackknife methods, respectively.

Without replicate reweighting, we have replicate estimators $\hat R_{(hi)}$ defined for the jackknife method using the replicate weights in (4.6.7) and replicate estimators $\hat R_\alpha$ defined for the BHS method. The BHS and jackknife variance estimators are

$$v_{\mathrm{BHS}}(\hat R) = \frac{1}{336}\sum_{\alpha=1}^{336}\bigl(\hat R_\alpha - \hat R\bigr)^2$$

and

$$v_J(\hat R) = \sum_{h=1}^{108}\frac{1}{2}\sum_{i=1}^{2}\bigl(\hat R_{(hi)} - \hat R\bigr)^2,$$

respectively.

For the domain of females age 17, we find $\hat R \times 100\% = 89.6$ and $v_J(\hat R \times 100\%) = 1.51$ and $v_{\mathrm{BHS}}(\hat R \times 100\%) = 1.85$ without replicate reweighting, and $v_J(\hat R \times 100\%) = 1.44$ and $v_{\mathrm{BHS}}(\hat R \times 100\%) = 1.80$ with replicate reweighting.
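The two variance formulas in Table 4.7.2 can be sketched as follows. The replicate estimates are simulated; their spreads are invented, chosen only so the resulting variances land near the magnitudes reported for females age 17, and are not NLSY97 data.

```python
import random

random.seed(2)
R = 0.896                                    # parent-sample proportion
# Simulated replicate estimates (hypothetical spreads, not real survey output)
R_J = [[R + random.gauss(0.0, 0.0012) for _ in range(2)] for _ in range(108)]
R_BHS = [R + random.gauss(0.0, 0.0136) for _ in range(336)]

# v_J = sum_h (1/2) sum_i (R_(hi) - R)^2 ; v_BHS = (1/336) sum_a (R_a - R)^2
v_J = sum(0.5 * sum((rhi - R) ** 2 for rhi in pair) for pair in R_J)
v_BHS = sum((ra - R) ** 2 for ra in R_BHS) / 336
print(100 ** 2 * v_J, 100 ** 2 * v_BHS)      # on the percentage scale
```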
Let $N$ denote the total number of individuals in a certain population under study. We assume that $N$ is unobservable and to be estimated. This situation differs from the model encountered in classical survey sampling, where the size of the population is considered known and the problem is to estimate other parameters of the population. We assume that there are two lists or frames, that each covers a portion of the total population, but that the union of the two lists fails to include some portion of the population $N$. We further assume the lists are independent in the sense that whether or not a given individual is present on the first list is a stochastic event that is independent of the individual's presence or absence on the second list.
The population may be viewed as follows:

                             Second List
                         Present      Absent
  First List  Present    N_11         N_12        N_1.
              Absent     N_21         --
                         N_.1

The size of the (2,2) cell is unknown, and thus the total size $N$ is also unknown. Assuming that these data are generated by a multinomial probability law and that the two lists are independent, the maximum likelihood estimator of $N$ is

$$\hat N = \frac{N_{1\cdot}\,N_{\cdot 1}}{N_{11}}.$$
See Bishop, Fienberg, and Holland (1975), Marks, Seltzer, and Krotki (1974), or
Wolter (1986) for the derivation of this estimator.
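In code, the point estimate is one line; the counts in the example below are hypothetical.

```python
def dual_system_estimate(n11, n1dot, ndot1):
    """Maximum likelihood dual-system estimate: N-hat = N_1. * N_.1 / N_11."""
    return n1dot * ndot1 / n11

# Hypothetical counts: 800 on list 1, 900 on list 2, 600 matched to both.
print(dual_system_estimate(600, 800, 900))   # -> 1200.0
```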
In practice, the two lists must be matched to one another in order to determine the cell counts $N_{11}$, $N_{12}$, $N_{21}$. This task will be difficult, if not impossible, when the lists are large. Difficulties also arise when the two lists are not compatible with computer matching. In certain circumstances these problems can be dealt with (but not eliminated) by drawing samples from either or both lists and subsequently matching only the sample cases. Dealing with samples instead of entire lists cuts down on work and presumably on matching difficulties. Survey estimators may then be constructed for $N_{11}$, $N_{12}$, and $N_{21}$. This is the situation considered in the present example.

We will use the February 1978 Current Population Survey (CPS) as the sample from the first list and the Internal Revenue Service (IRS) tax returns filed in 1978 as the second list. The population $N$ to be estimated is the U.S. adult population ages 14-64.

The CPS is a household survey that is conducted monthly by the U.S. Bureau of the Census for the U.S. Bureau of Labor Statistics. The main purpose of the CPS is to gather information about characteristics of the U.S. labor force. The CPS sampling design is quite complex, using geographic stratification, multiple stages of selection, and unequal selection probabilities. The CPS estimators are equally complex, employing adjustments for nonresponse, two stages of ratio estimation, and one stage of composite estimation. For exact details of the sampling design and estimation scheme, the reader should see Hanson (1978). The CPS is discussed further in Section 5.5.

Each monthly CPS sample is actually comprised of eight distinct subsamples (known as rotation groups). Each rotation group is itself a national probability sample comprised of a sample of households from each of the CPS primary sampling units (PSUs). The rotation groups might be considered random groups (nonindependent), as defined in Section 2.4, except that the ultimate cluster rule (see rule (iii), Section 2.4) was not followed in their construction.
In this example, the design variance of a dual-system or capture-recapture estimator will be estimated by the jackknife method operating on the CPS rotation groups. Because the ultimate cluster rule was not followed, this method of variance estimation will tend to omit the between component of variance. The omission is probably negligible, however, because the between component is considered to be a minor portion (about 5%) of the total CPS variance.
The entire second list (IRS) is used in this application without sampling. After matching the CPS sample to the entire second list, we obtain the following data:

                              IRS
                         Present          Absent
  CPS      Present       $\hat N_{11}$    $\hat N_{12}$     $\hat N_{1\cdot}$
           Absent        $\hat N_{21}$
                         $\hat N_{\cdot 1}$

for the full CPS sample and

                              IRS
                         Present               Absent
  CPS      Present       $\hat N_{11\alpha}$   $\hat N_{12\alpha}$     $\hat N_{1\cdot\alpha}$
           Absent        $\hat N_{21\alpha}$
                         $\hat N_{\cdot 1\alpha}$

for rotation group $\alpha = 1, \ldots, 8$. The symbol "$\hat{\ }$" denotes a survey estimator prepared in accordance with the CPS sampling design, and the parent sample and rotation group estimators satisfy

$$\hat N_{11} \doteq \frac{1}{8}\sum_{\alpha=1}^{8}\hat N_{11\alpha}, \qquad \hat N_{12} \doteq \frac{1}{8}\sum_{\alpha=1}^{8}\hat N_{12\alpha},$$

$$\hat N_{21} \doteq \frac{1}{8}\sum_{\alpha=1}^{8}\hat N_{21\alpha}, \qquad \hat N_{1\cdot} \doteq \frac{1}{8}\sum_{\alpha=1}^{8}\hat N_{1\cdot\alpha}.^3$$
The dual-system or capture-recapture estimator of $N$ for this example is

$$\hat N = \frac{\hat N_{1\cdot}\,N_{\cdot 1}}{\hat N_{11}},$$

and the estimator obtained by deleting the $\alpha$-th rotation group is

$$\hat N_{(\alpha)} = \frac{\hat N_{1\cdot(\alpha)}\,N_{\cdot 1}}{\hat N_{11(\alpha)}},$$

where

$$\hat N_{11(\alpha)} = \frac{1}{7}\sum_{\beta\neq\alpha}\hat N_{11\beta}, \qquad \hat N_{1\cdot(\alpha)} = \frac{1}{7}\sum_{\beta\neq\alpha}\hat N_{1\cdot\beta}.$$

The pseudovalues are defined by

$$\hat N_\alpha = 8\hat N - 7\hat N_{(\alpha)},$$

Quenouille's estimator by

$$\hat{\bar N} = \frac{1}{8}\sum_{\alpha=1}^{8}\hat N_\alpha,$$
³ These are only approximate equalities. The reader will recall that the estimators $\hat\theta$ and $\hat{\bar\theta}$ are equal if the estimators are linear. In the present example, all of the estimators are nonlinear, involving ratio and nonresponse adjustments. Thus, the parent sample estimators, such as $\hat N_{11}$, are not in general equal to the mean of the random group estimators, $\sum_\alpha \hat N_{11\alpha}/8$. Because the sample sizes are large, however, there should be little difference in this example.
Table 4.8.2. Computation of Pseudovalues, Quenouille's Estimator, and the Jackknife Estimator of Variance

The estimates obtained by deleting the first rotation group are

$$\hat N_{11(1)} = \frac{1}{7}\sum_{\alpha\neq 1}\hat N_{11\alpha} = 106{,}004{,}486,$$

$$\hat N_{1\cdot(1)} = \frac{1}{7}\sum_{\alpha\neq 1}\hat N_{1\cdot\alpha} = 133{,}301{,}315,$$

$$\hat N_{(1)} = \frac{\hat N_{1\cdot(1)}\,N_{\cdot 1}}{\hat N_{11(1)}} = 144{,}726{,}785.$$

The estimate obtained from the parent sample is

$$\hat N = \frac{\hat N_{1\cdot}\,N_{\cdot 1}}{\hat N_{11}} = \frac{(133{,}313{,}591)(115{,}090{,}300)}{106{,}164{,}555} \doteq 144{,}521{,}881.$$

Thus, the first pseudovalue is

$$\hat N_1 = 8\hat N - 7\hat N_{(1)} = 143{,}087{,}555.$$

The remaining $\hat N_{(\alpha)}$ and $\hat N_\alpha$ are computed similarly. Quenouille's estimator and the jackknife variance estimator are

$$\hat{\bar N} = \frac{1}{8}\sum_{\alpha=1}^{8}\hat N_\alpha = 144{,}519{,}648$$

and

$$v_1(\hat{\bar N}) = \frac{1}{8(7)}\sum_{\alpha=1}^{8}\bigl(\hat N_\alpha - \hat{\bar N}\bigr)^2 = 3.1284 \times 10^{11},$$

respectively. The estimated standard error is 559,324.

The conservative estimator of variance is

$$v_2(\hat{\bar N}) = \frac{1}{8(7)}\sum_{\alpha=1}^{8}\bigl(\hat N_\alpha - \hat N\bigr)^2 = 3.1284 \times 10^{11}$$

with corresponding standard error 559,325.
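The arithmetic above can be replayed directly from the printed inputs; the comparison tolerances below are assumptions that allow for the rounding of the published figures.

```python
# Re-running the Table 4.8.2 arithmetic from the printed inputs.
N_dot1 = 115_090_300                          # IRS count N_.1 (used without sampling)
N11, N1dot = 106_164_555, 133_313_591         # parent-sample estimates
N11_1, N1dot_1 = 106_004_486, 133_301_315     # after deleting rotation group 1

N_hat = N1dot * N_dot1 / N11                  # approx. 144,521,881
N_hat_del1 = N1dot_1 * N_dot1 / N11_1         # approx. 144,726,785
pseudo_1 = 8 * N_hat - 7 * N_hat_del1         # first pseudovalue, approx. 143,087,555
print(round(N_hat), round(N_hat_del1), round(pseudo_1))
```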
and the jackknife estimator of variance by

$$v_1(\hat{\bar N}) = \frac{1}{8(7)}\sum_{\alpha=1}^{8}\bigl(\hat N_\alpha - \hat{\bar N}\bigr)^2.$$

The conservative estimator of variance is

$$v_2(\hat{\bar N}) = \frac{1}{8(7)}\sum_{\alpha=1}^{8}\bigl(\hat N_\alpha - \hat N\bigr)^2.$$

In this problem, we are estimating the design variance of $\hat N$, given $N_{11}$, $N_{21}$, $N_{12}$, $N_{1\cdot}$, and $N_{\cdot 1}$. Some illustrative computations are given in Table 4.8.2.
To conclude the example, we comment on the nature of the adjustment for nonresponse and on the ratio adjustments. The CPS uses a combination of hot-deck methods and weighting-class methods to adjust for missing data. The adjustments are applied within the eight rotation groups within cells defined by clusters of primary sampling units (PSUs) by race by type of residence. Because the adjustments are made within the eight rotation groups, the original principles of jackknife estimation (and of random group and balanced half-sample estimation) are being observed, and the jackknife variances should properly include the components of variability due to the nonresponse adjustment.

On the other hand, the principles are violated as regards the CPS ratio estimators. To illustrate the violation, we consider the first of two stages of CPS ratio estimation. Ratio factors (using 1970 census data as the auxiliary variable) are computed within region,⁴ by kind of PSU, by race, and by type of residence. In this example, the ratio factors computed for the parent sample estimators were also used for each of the random group estimators. This procedure violates the original principles of jackknife estimation, which call for a separate ratio adjustment for each random group. See Section 2.8 for additional discussion of this type of violation. The method described here greatly simplifies the task of computing the estimates because only one set of ratio adjustment factors is required instead of nine sets (one for the parent sample and one for each of the eight rotation groups). Some components of variability, however, may be missed by this method.
⁴ The four census regions are Northeast, North Central, South, and West.