Sampling Theory and Methods
S. Sampath
CRC Press
Boca Raton London New York Washington, D.C.
Narosa Publishing House
New Delhi Chennai Mumbai Calcutta
S. Sampath
Department of Statistics
Loyola College, Chennai 600 034, India
Library of Congress Cataloging-in-Publication Data:
A catalog record for this book is available from the Library of Congress.
All rights reserved. No part of this publication may be reproduced, stored
in a system or transmitted in any form or by any means, electronic,
mechanical, photocopying, or otherwise, without the prior permission of the
copyright owner.
This book contains information obtained from authentic and highly regarded sources.
Reprinted material is quoted with permission, and sources are indicated. Reasonable
efforts have been made to publish reliable data and information, but the author and the
publisher cannot assume responsibility for the validity of all materials or for the
consequences of their use.
Neither this book nor any part may be reproduced or transmitted in any form or by any
means, electronic or mechanical, including photocopying, microfilming, and recording,
or by any information storage or retrieval system, without prior permission in writing
from the publisher.
Exclusive distribution in North America only by CRC Press LLC
Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton,
Florida 33431. Email: orders@crcpress.com
Copyright © 2001 Narosa Publishing House, New Delhi 110 017, India
No claim to original U.S. Government works
International Standard Book Number 0849309808
Printed in India.
Dedicated to my
parents
Preface
This book is an outcome of nearly two decades of my teaching experience both
at the graduate and postgraduate level in Loyola College (Autonomous),
Chennai 600 034, during which I came across numerous books and research
articles on "Sample Surveys".
I have made an attempt to present the theoretical aspects of "Sample Surveys" in
a lucid form for the benefit of both undergraduate and postgraduate students of
Statistics.
The first chapter of the book introduces the reader to the basic concepts of Sampling
Theory which are essential to understand the later chapters. Some numerical
examples are also presented to help the reader gain a clear understanding of
the concepts. Simple random sampling design is dealt with in detail in the
second chapter. Several solved examples which consider various competing
estimators for the population total are also included in the same chapter. The
third chapter is devoted to systematic sampling schemes. Various systematic sampling
schemes such as linear, circular, balanced and modified systematic sampling, and their
performances under different superpopulation models, are also discussed. In the
fourth chapter several unequal probability sampling-estimating strategies are
presented. Probability Proportional to Size Sampling With and Without
Replacement are considered with appropriate estimators. In addition to them, the
Midzuno sampling scheme and the Random Group Method are also included.
Stratified sampling, allocation problems and related issues are presented in
full detail in the fifth chapter. Many interesting solved problems are also added.
In the sixth and seventh chapters the use of auxiliary information in ratio and
regression estimation is discussed. Results related to the properties of ratio and
regression estimators under superpopulation models are also given. Cluster
sampling and Multistage sampling are presented in the eighth chapter. The
results presented under two-stage sampling are general in nature. In the ninth
chapter, nonsampling errors, randomised response techniques and related topics
are discussed. Some recent developments in Sample Surveys, namely estimation
of distribution functions, adaptive sampling schemes and randomised response
methods for quantitative data, are presented in the tenth chapter.
Many solved theoretical problems are incorporated into almost all the chapters,
which will help the readers acquire the skills necessary to solve problems of
a theoretical nature on their own.
I am indebted to the authorities of Loyola College for providing what was
necessary to complete this work successfully. I also wish to thank
Dr. P. Chandrasekar, Department of Statistics, Loyola College, for his help during
proof reading. I wish to place on record the excellent work done by the
Production Department of Narosa Publishing House in formatting the
manuscript.
S. Sampath
Contents
Chapter 1 Preliminaries
1.1 Basic Definitions
1.2 Estimation of Population Total
1.3 Problems and Solutions
Chapter 2 Equal Probability Sampling
2.1 Simple Random Sampling
2.2 Estimation of Total
2.3 Problems and Solutions
Chapter 3 Systematic Sampling Schemes
3.1 Introduction
3.2 Linear Systematic Sampling
3.3 Schemes for Populations with Linear Trend
3.4 Autocorrelated Populations
3.5 Estimation of Variance
3.6 Circular Systematic Sampling
3.7 Systematic Sampling in Two Dimensions
3.8 Problems and Solutions
Chapter 4 Unequal Probability Sampling
4.1 PPSWR Sampling Method
4.2 PPSWOR Sampling Method
4.3 Random Group Method
4.4 Midzuno Scheme
4.5 PPS Systematic Scheme
4.6 Problems and Solutions
Chapter 5 Stratified Sampling
5.1 Introduction
5.2 Sample Size Allocation
5.3 Comparison with Other Schemes
5.4 Problems and Solutions
Chapter 6 Use of Auxiliary Information
6.1 Introduction
6.2 Ratio Estimation
6.3 Unbiased Ratio Type Estimators
6.4 Almost Unbiased Ratio Estimators
6.5 Jackknife Ratio Estimator
6.6 Bound for Bias
6.7 Product Estimation
6.8 Two Phase Sampling
6.9 Use of Multiauxiliary Information
6.10 Ratio Estimation in Stratified Sampling
6.11 Problems and Solutions
Chapter 7 Regression Estimation
7.1 Introduction
7.2 Difference Estimation
7.3 Double Sampling in Difference Estimation
7.4 Multivariate Difference Estimator
7.5 Inference under Super Population Models
7.6 Problems and Solutions
Chapter 8 Multistage Sampling
8.1 Introduction
8.2 Estimation under Cluster Sampling
8.3 Multistage Sampling
Chapter 9 Nonsampling Errors
9.1 Incomplete Surveys
9.2 Randomised Response Methods
9.3 Observational Errors
Chapter 10 Recent Developments
10.1 Adaptive Sampling
10.2 Estimation of Distribution Functions
10.3 Randomised Response Methods for Quantitative Data
References
Index
Chapter 1
Preliminaries
1.1 Basic Definitions
Definition 1.1 "Finite Population" A finite population is a set
containing a finite number of distinguishable elements.
The elements of a finite population are entities possessing particular
characteristics in which a sampler is interested, and they will be referred
to as population units. For example, in an agricultural study where one is
interested in finding the total yield, a collection of fields or a collection of plots
may be defined as the population. In a socioeconomic study, population units may
be defined as a group of individuals, streets or villages.
Definition 1.2 "Population Size" The number of elements in a finite population
is called the population size. Usually it is denoted by $N$ and it is always a known
finite number.
With each unit in a population of size $N$, a number from 1 through $N$ is
assigned. These numbers are called labels of the units and they remain
unchanged throughout the study. The values of the population units with respect
to the characteristic $y$ under study will be denoted by $Y_1, Y_2, \ldots, Y_N$. Here $Y_i$
denotes the value of the unit bearing label $i$ with respect to the variable $y$.
Definition 1.3 "Parameter" Any real valued function of the population values
is called a parameter.
For example, the population mean $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$, the quantity $S_y^2 = \frac{1}{N-1}\sum_{i=1}^{N} [Y_i - \bar{Y}]^2$ and the
population range $R = \max\{Y_i\} - \min\{Y_i\}$ are parameters.
Definition 1.4 "Sample" A sample is a subset of the population $S$.
Usually it is denoted by $s$. The number of elements in a sample $s$ is denoted
by $n(s)$ and it is referred to as the sample size.
Definition 1.5 "Probability Sampling" Choosing a subset of the population
according to a probability sampling design is called probability sampling.
Generally a sample is drawn to estimate the parameters whose values are
not known.
Definition 1.6 "Statistic" Any real valued function is called a statistic if it
depends on $Y_1, Y_2, \ldots, Y_N$ only through $s$.
A statistic, when used to estimate a parameter, is referred to as an estimator.
Definition 1.7 "Sampling Design" Let $\Omega$ be the collection of all subsets of $S$
and $P(s)$ be a probability distribution defined on $\Omega$. The probability distribution
$\{P(s), s \in \Omega\}$ is called a sampling design.
A sampling design assigns to each subset $s$ the probability of selecting it as the sample.
For example, let $\Omega$ be the collection of all $\binom{N}{n}$ possible subsets of size $n$ of the
population $S$. The probability distribution
$$P(s) = \begin{cases} \binom{N}{n}^{-1} & \text{if } n(s) = n \\ 0 & \text{otherwise} \end{cases}$$
is a sampling design. This design assigns probability $\binom{N}{n}^{-1}$ to every subset of
size $n$ for being selected as the sample and zero to all other subsets of $S$.
It is pertinent to note that the definition of a sample as a subset of $S$ does not
allow any unit to appear in the sample more than once. That is, the sample will
always contain distinct units. Alternatively one can also define a sequence
whose elements are members of S as a sample, in which case the sample will not
necessarily contain distinct units.
Definition 1.8 "Bias" Let $P(s)$ be a sampling design defined on $\Omega$. An estimator
$T(s)$ is unbiased for the parameter $\theta$ with respect to the sampling design $P(s)$ if
$$E_P[T(s)] = \sum_{s \in \Omega} T(s)P(s) = \theta.$$
The difference $E_P[T(s)] - \theta$ is called the bias of $T(s)$ in estimating $\theta$ with
respect to the design $P(s)$. It is to be noted that an estimator which is unbiased
with respect to a sampling design $P(s)$ is not necessarily unbiased with respect to
some other design $Q(s)$.
Definition 1.9 "Mean Square Error" The mean square error of the estimator
$T(s)$ in estimating $\theta$ with respect to the design $P(s)$ is defined as
$$MSE(T : P) = E_P[T(s) - \theta]^2 = \sum_{s \in \Omega} [T(s) - \theta]^2 P(s).$$
If $E_P[T(s)] = \theta$ then the mean square error reduces to the variance.
Given a parameter $\theta$, one can propose a number of estimators. For
example, to estimate the population mean one can use the sample mean, the
sample median or any other reasonable sample quantity. Hence one requires
some criteria to choose an estimator. In sample surveys, we use either the bias or
the mean square error or both of them to evaluate the performance of an
estimator. Since the bias gives the weighted average of the difference between
the estimator and the parameter, and the mean square error gives the weighted
average of the squared difference between the estimator and the parameter, it is
always better to choose an estimator which has smaller bias (if possible, unbiased)
and smaller mean square error. The following theorem gives the relationship
between the bias and the mean square error of an estimator.
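Theorem 1.1 Under the sampling design $P(s)$, any statistic $T(s)$ satisfies the
relation $MSE(T : P) = V_P(T) + [B_P(T)]^2$, where $V_P(T)$ and $B_P(T)$ are the variance
and bias of the statistic $T(s)$ under the sampling design $P(s)$.
Proof
$$\begin{aligned}
MSE(T : P) &= E_P[T(s) - \theta]^2 \\
&= \sum_{s \in \Omega} [T(s) - \theta]^2 P(s) \\
&= \sum_{s \in \Omega} [T(s) - E_P(T(s)) + E_P(T(s)) - \theta]^2 P(s) \\
&= \sum_{s \in \Omega} [T(s) - E_P(T(s))]^2 P(s) + [E_P(T(s)) - \theta]^2 \\
&= V_P(T) + [B_P(T)]^2
\end{aligned}$$
The cross product term vanishes because $\sum_{s \in \Omega} [T(s) - E_P(T(s))] P(s) = 0$.
Hence the proof.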
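The decomposition in Theorem 1.1 can be checked numerically by enumerating a small design. The sketch below is only an illustration: the population values, the equal-probability design on subsets of size 2 and the deliberately biased statistic `sample_max` are all assumptions chosen for the example, not taken from the text.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population values (illustrative only, not from the text).
Y = [3, 7, 8, 12]
N, n = len(Y), 2

# Design P(s): every subset of size n has probability 1 / C(N, n).
samples = list(combinations(range(N), n))
P = {s: Fraction(1, len(samples)) for s in samples}

def design_moments(T, theta):
    """Return (variance, bias, MSE) of the statistic T under the design P."""
    mean = sum(T(s) * p for s, p in P.items())
    var = sum((T(s) - mean) ** 2 * p for s, p in P.items())
    mse = sum((T(s) - theta) ** 2 * p for s, p in P.items())
    return var, mean - theta, mse

def sample_max(s):
    """A deliberately biased estimator of the population mean."""
    return max(Y[i] for i in s)

theta = Fraction(sum(Y), N)  # population mean, the parameter being estimated
var, bias, mse = design_moments(sample_max, theta)
print(var, bias, mse)
```

Exact fractions are used so that the identity MSE = variance + bias squared holds without rounding error.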
As mentioned earlier, the performance of an estimator is evaluated on the basis
of its bias and mean square error. Another way to assess the
performance of a sampling design is through its entropy.
Definition 1.10 "Entropy" The entropy of the sampling design $P(s)$ is defined as
$$e = -\sum_{s \in \Omega} P(s) \ln P(s).$$
Since the entropy is a measure of information corresponding to the given
sampling design, we prefer a sampling design having maximum entropy.
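As a small illustration of Definition 1.10, the following sketch compares the entropy of two designs on the same collection of subsets. Both designs and their probabilities are hypothetical; the point is only that the equal-probability design attains the larger entropy.

```python
from math import log

def entropy(design):
    """Entropy e = -sum P(s) ln P(s); subsets with P(s) = 0 contribute nothing."""
    return -sum(p * log(p) for p in design.values() if p > 0)

# Two hypothetical designs on the six subsets of size 2 of {1, 2, 3, 4}.
subsets = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
uniform = {s: 1 / 6 for s in subsets}
skewed = dict(zip(subsets, [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]))
print(entropy(uniform), entropy(skewed))
```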
1.2 Estimation of Population Total
In order to introduce the popular Horvitz-Thompson estimator for the
population total, we give the following definitions.
Definition 1.11 "Inclusion indicators" Let $s \ni i$ denote the event that the
sample $s$ contains the unit $i$. The random variables
$$I_i(s) = \begin{cases} 1 & \text{if } s \ni i \\ 0 & \text{otherwise} \end{cases}, \quad 1 \le i \le N$$
are called inclusion indicators.
Definition 1.12 "Inclusion Probabilities" The first and second order inclusion
probabilities corresponding to the sampling design $P(s)$ are defined as
$$\pi_i = \sum_{s \ni i} P(s), \qquad \pi_{ij} = \sum_{s \ni i,j} P(s)$$
where the sum $\sum_{s \ni i}$ extends over all $s$ containing $i$ and the sum $\sum_{s \ni i,j}$ extends
over all $s$ containing both $i$ and $j$.
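Theorem 1.2 For any sampling design $P(s)$,
(a) $E_P[I_i(s)] = \pi_i$, $i = 1, 2, \ldots, N$
(b) $E_P[I_i(s) I_j(s)] = \pi_{ij}$, $i, j = 1, 2, \ldots, N$
Proof (a) Let $\Omega_1$ be the collection of all subsets of $S$ containing the unit with
label $i$ and $\Omega_2 = \Omega - \Omega_1$.
$$\begin{aligned}
E_P[I_i(s)] &= \sum_{s \in \Omega_1} I_i(s)P(s) + \sum_{s \in \Omega_2} I_i(s)P(s) \\
&= \sum_{s \in \Omega_1} 1 \cdot P(s) + \sum_{s \in \Omega_2} 0 \cdot P(s) \\
&= \sum_{s \ni i} P(s) \\
&= \pi_i
\end{aligned}$$
(b) Let $\Omega_1$ be the collection of all subsets of $S$ containing the units with labels $i$
and $j$, and $\Omega_2 = \Omega - \Omega_1$. Note that
$$I_i(s) I_j(s) = \begin{cases} 1 & \text{if } s \in \Omega_1 \\ 0 & \text{otherwise} \end{cases}$$
Therefore
$$\begin{aligned}
E_P[I_i(s) I_j(s)] &= \sum_{s \in \Omega_1} I_i(s) I_j(s) P(s) + \sum_{s \in \Omega_2} I_i(s) I_j(s) P(s) \\
&= \sum_{s \in \Omega_1} P(s) \\
&= \sum_{s \ni i,j} P(s) \\
&= \pi_{ij}
\end{aligned}$$
Hence the proof.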
Theorem 1.3 For any sampling design $P(s)$, $E_P[n(s)] = \sum_{i=1}^{N} \pi_i$.
Proof For any sampling design, we know that
$$n(s) = \sum_{i=1}^{N} I_i(s)$$
Taking expectation on both sides, we get
$$E_P[n(s)] = \sum_{i=1}^{N} E_P[I_i(s)] = \sum_{i=1}^{N} \pi_i$$
Hence the proof.
Theorem 1.4 (a) For $i = 1, 2, \ldots, N$, $V_P[I_i(s)] = \pi_i(1 - \pi_i)$.
(b) For $i, j = 1, 2, \ldots, N$, $\mathrm{cov}_P[I_i(s), I_j(s)] = \pi_{ij} - \pi_i \pi_j$.
The proof of this theorem is straightforward and hence left as an exercise.
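Theorems 1.2 to 1.4 can be verified by direct enumeration for any small design. The sketch below uses a hypothetical design on $S = \{1, 2, 3\}$ with varying sample sizes; the helper names `pi`, `indicator` and `expect` are introduced here only for illustration.

```python
from fractions import Fraction

# A hypothetical design on S = {1, 2, 3} with varying sample sizes.
N = 3
P = {
    frozenset({1}): Fraction(1, 4),
    frozenset({1, 2}): Fraction(1, 4),
    frozenset({2, 3}): Fraction(1, 4),
    frozenset({1, 2, 3}): Fraction(1, 4),
}

def pi(*units):
    """Inclusion probability: total probability of samples containing all given units."""
    return sum(p for s, p in P.items() if set(units) <= s)

def indicator(i, s):
    """Inclusion indicator I_i(s)."""
    return 1 if i in s else 0

def expect(f):
    """Design expectation of f(s)."""
    return sum(f(s) * p for s, p in P.items())

first_order = {i: pi(i) for i in range(1, N + 1)}
expected_size = expect(len)  # E_P[n(s)]

# Variance of I_1 and covariance of I_1, I_2 computed from the definitions.
var_1 = expect(lambda s: indicator(1, s) ** 2) - expect(lambda s: indicator(1, s)) ** 2
cov_12 = (expect(lambda s: indicator(1, s) * indicator(2, s))
          - expect(lambda s: indicator(1, s)) * expect(lambda s: indicator(2, s)))
print(first_order, expected_size, var_1, cov_12)
```

Here the first order inclusion probabilities sum to the expected sample size, as Theorem 1.3 states, even though the sample size is not fixed.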
Theorem 1.5 Under any sampling design satisfying $P[n(s) = n] = 1$ for all $s$,
N N
(a) n=
fl
2
y N1.,. I
i=l
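Proof We have seen in Chapter 1 that, for any sampling design,
$$\hat{Y}_{HT} = \sum_{i \in s} \frac{Y_i}{\pi_i} \tag{2.1}$$
is unbiased for the population total with variance
$$\sum_{i=1}^{N} \sum_{\substack{j=1 \\ i<j}}^{N} \left[ \frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j} \right]^2 (\pi_i \pi_j - \pi_{ij}) \tag{2.2}$$
By Theorem 2.1, we have $\pi_i = \frac{n}{N}$ and $\pi_{ij} = \frac{n(n-1)}{N(N-1)}$.
Substituting these values in (2.1) we notice that
$$\hat{Y}_{srs} = \frac{N}{n} \sum_{i \in s} Y_i \tag{2.3}$$
is unbiased for the population total.
Note that
$$\pi_i \pi_j - \pi_{ij} = \frac{n^2}{N^2} - \frac{n(n-1)}{N(N-1)} = \frac{n}{N} \left[ \frac{N-n}{N(N-1)} \right] \tag{2.4}$$
Therefore by (2.2)
$$V(\hat{Y}_{srs}) = \sum_{i=1}^{N} \sum_{\substack{j=1 \\ i<j}}^{N} \frac{N^2}{n^2} (Y_i - Y_j)^2 \, \frac{n(N-n)}{N^2(N-1)} = \frac{N-n}{n(N-1)} \sum_{i=1}^{N} \sum_{\substack{j=1 \\ i<j}}^{N} (Y_i - Y_j)^2 \tag{2.5}$$
We know that
$$\sum_{i=1}^{N} \sum_{j=1}^{N} a_{ij} = \sum_{i=1}^{N} a_{ii} + 2 \sum_{i=1}^{N} \sum_{\substack{j=1 \\ i<j}}^{N} a_{ij}, \quad \text{if } a_{ij} = a_{ji}. \tag{2.6}$$
Using the above identity in the right hand side of (2.5), we get
$$\begin{aligned}
V(\hat{Y}_{srs}) &= \frac{N-n}{2n(N-1)} \left\{ \sum_{i=1}^{N} \sum_{j=1}^{N} (Y_i - Y_j)^2 - \sum_{i=1}^{N} (Y_i - Y_i)^2 \right\} \\
&= \frac{N-n}{2n(N-1)} \left\{ 2N \sum_{i=1}^{N} Y_i^2 - 2 \sum_{i=1}^{N} \sum_{j=1}^{N} Y_i Y_j \right\} \\
&= \frac{N-n}{n(N-1)} \left\{ N \sum_{i=1}^{N} Y_i^2 - N^2 \bar{Y}^2 \right\} \\
&= \frac{N(N-n)}{n(N-1)} \sum_{i=1}^{N} (Y_i - \bar{Y})^2 = \frac{N^2(N-n)}{Nn} S_y^2
\end{aligned}$$
Hence the proof.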
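The unbiasedness of the expansion estimator and the variance $\frac{N^2(N-n)}{Nn} S_y^2$ can be confirmed by listing every possible simple random sample from a small population. The population values below are hypothetical and serve only to illustrate the result.

```python
from fractions import Fraction
from itertools import combinations

Y = [2, 5, 6, 9, 13]          # hypothetical population values
N, n = len(Y), 3
total = sum(Y)
mean = Fraction(total, N)
S2 = sum((y - mean) ** 2 for y in Y) / (N - 1)   # S_y^2 with divisor N - 1

samples = list(combinations(Y, n))               # all C(N, n) equally likely samples
estimates = [Fraction(N, n) * sum(s) for s in samples]

expectation = sum(estimates) / len(samples)
variance = sum((e - total) ** 2 for e in estimates) / len(samples)
formula = Fraction(N * N * (N - n), N * n) * S2
print(expectation, variance, formula)
```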
The following theorem gives an unbiased estimator for the variance obtained in
Theorem 2.2.
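Theorem 2.3 An unbiased estimator of $V(\hat{Y}_{srs})$ is $v(\hat{Y}_{srs}) = \frac{N^2(N-n)}{Nn} s_y^2$
where $s_y^2$ is the sample analogue of $S_y^2$.
Proof Since $V(\hat{Y}_{srs}) = E(\hat{Y}_{srs}^2) - Y^2$, we have
$$E(\hat{Y}_{srs}^2) = V(\hat{Y}_{srs}) + Y^2 = \frac{N^2(N-n)}{Nn} S_y^2 + Y^2 \tag{2.7}$$
The sample analogue of $S_y^2$ is
$$\begin{aligned}
s_y^2 &= \frac{1}{n-1} \sum_{i \in s} [Y_i - \hat{\bar{Y}}]^2, \quad \text{where } \hat{\bar{Y}} = \frac{1}{n} \sum_{i \in s} Y_i \\
&= \frac{1}{n-1} \left\{ \sum_{i \in s} Y_i^2 - n \hat{\bar{Y}}^2 \right\} \\
&= \frac{1}{n-1} \left\{ \sum_{i \in s} Y_i^2 - n \left[ \frac{\hat{Y}_{srs}^2}{N^2} \right] \right\}
\end{aligned}$$
Taking expectations on both sides we get
$$\begin{aligned}
E(s_y^2) &= \frac{1}{n-1} \left[ \frac{n}{N} \sum_{i=1}^{N} Y_i^2 - \frac{n}{N^2} E(\hat{Y}_{srs}^2) \right] \\
&= \frac{1}{n-1} \left[ \frac{n}{N} \sum_{i=1}^{N} Y_i^2 - n \left\{ \frac{N-n}{Nn} S_y^2 + \bar{Y}^2 \right\} \right] \quad \text{(using (2.7))} \\
&= \frac{n}{n-1} \left[ \frac{1}{N} \sum_{i=1}^{N} Y_i^2 - \left\{ \frac{N-n}{Nn} S_y^2 + \bar{Y}^2 \right\} \right] \\
&= \frac{n}{n-1} \left[ \frac{N-1}{N} S_y^2 - \frac{N-n}{Nn} S_y^2 \right] \\
&= \frac{n}{n-1} \left[ \frac{n-1}{n} S_y^2 \right] = S_y^2
\end{aligned} \tag{2.8}$$
This implies
$$E\left[ \frac{N^2(N-n)}{Nn} s_y^2 \right] = \left[ \frac{N^2(N-n)}{Nn} \right] S_y^2$$
Hence the proof.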
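A brute-force check of Theorem 2.3 follows the same pattern: average $s_y^2$ over all possible samples and compare with $S_y^2$. This is a sketch with hypothetical population values; `s2` is a helper name introduced for the example.

```python
from fractions import Fraction
from itertools import combinations

Y = [2, 5, 6, 9, 13]          # hypothetical population values
N, n = len(Y), 3
pop_mean = Fraction(sum(Y), N)
S2 = sum((y - pop_mean) ** 2 for y in Y) / (N - 1)

def s2(sample):
    """Sample analogue of S_y^2 with divisor n - 1."""
    m = Fraction(sum(sample), len(sample))
    return sum((y - m) ** 2 for y in sample) / (len(sample) - 1)

samples = list(combinations(Y, n))
factor = Fraction(N * N * (N - n), N * n)
expected_s2 = sum(s2(s) for s in samples) / len(samples)            # E(s_y^2)
expected_v = sum(factor * s2(s) for s in samples) / len(samples)    # E[v(Y_srs)]
print(expected_s2, S2, expected_v)
```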
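Theorem 2.4 Let $(X_i, Y_i)$ be the values with respect to the two variables $x$ and
$y$ associated with the unit having label $i$, $i = 1, 2, \ldots, N$. If $\hat{X} = \frac{N}{n} \sum_{i \in s} X_i$,
$\hat{Y} = \frac{N}{n} \sum_{i \in s} Y_i$ and $S_{xy} = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})$, then under simple random sampling
$$\mathrm{cov}(\hat{X}, \hat{Y}) = \frac{N^2(N-n)}{Nn} S_{xy}$$
Proof
$$\begin{aligned}
\mathrm{cov}(\hat{X}, \hat{Y}) &= E[\hat{X}\hat{Y}] - XY \\
&= \frac{N^2}{n^2} \left[ \frac{n}{N} \sum_{i=1}^{N} X_i Y_i + \frac{n(n-1)}{N(N-1)} \left( N^2 \bar{X}\bar{Y} - \sum_{i=1}^{N} X_i Y_i \right) \right] - N^2 \bar{X}\bar{Y} \\
&= \frac{N^2}{n^2} \left[ \left( \frac{n}{N} - \frac{n(n-1)}{N(N-1)} \right) \sum_{i=1}^{N} X_i Y_i + \left[ \frac{n(n-1)}{N(N-1)} \right] (N^2 \bar{X}\bar{Y}) \right] - N^2 \bar{X}\bar{Y} \\
&= \frac{N^2(N-n)}{Nn} \, \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})
\end{aligned}$$
Hence the proof.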
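Theorem 2.5 Under simple random sampling, $s_{xy} = \frac{1}{n-1} \sum_{i \in s} (X_i - \hat{\bar{X}})(Y_i - \hat{\bar{Y}})$
is unbiased for $S_{xy} = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})$, where $\hat{\bar{X}} = \frac{1}{n} \sum_{i \in s} X_i$ and
$\hat{\bar{Y}} = \frac{1}{n} \sum_{i \in s} Y_i$.
Proof
$$\begin{aligned}
s_{xy} &= \frac{1}{n-1} \left[ \sum_{i \in s} X_i Y_i - n \hat{\bar{X}} \hat{\bar{Y}} \right] \\
&= \frac{1}{n-1} \left[ \sum_{i \in s} X_i Y_i - \frac{1}{n} \sum_{i \in s} Y_i \sum_{j \in s} X_j \right] \\
&= \frac{1}{n-1} \left[ \sum_{i \in s} X_i Y_i - \frac{1}{n} \left\{ \sum_{i \in s} Y_i X_i + \sum_{i \in s} \sum_{\substack{j \in s \\ i \ne j}} Y_i X_j \right\} \right] \\
&= \frac{1}{n-1} \left[ \frac{n-1}{n} \sum_{i \in s} Y_i X_i - \frac{1}{n} \sum_{i \in s} \sum_{\substack{j \in s \\ i \ne j}} Y_i X_j \right]
\end{aligned}$$
Taking expectations on both sides, we get
$$\begin{aligned}
E[s_{xy}] &= \frac{1}{n-1} \left[ \frac{n-1}{n} \frac{n}{N} \sum_{i=1}^{N} Y_i X_i - \frac{1}{n} \frac{n(n-1)}{N(N-1)} \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \ne i}}^{N} Y_i X_j \right] \\
&= \frac{1}{N} \sum_{i=1}^{N} Y_i X_i - \frac{1}{N(N-1)} \left[ N^2 \bar{Y}\bar{X} - \sum_{i=1}^{N} Y_i X_i \right] \\
&= \left[ \frac{1}{N} + \frac{1}{N(N-1)} \right] \sum_{i=1}^{N} Y_i X_i - \left[ \frac{N}{N-1} \right] \bar{Y}\bar{X} \\
&= \frac{1}{N-1} \left[ \sum_{i=1}^{N} Y_i X_i - N \bar{Y}\bar{X} \right]
\end{aligned}$$
Hence the proof.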
N 1 i=l
A
Y r ...
Remark 2.1 If Y = ....!...!_ then under simple random sampling Y is unbiased for
N
the population mean and its variance is N  n S ;.
Nn
16 Sampling Theory and Methods
This remark follows from Theorem 2.2.
2.3 Problems and Solutions
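Problem 2.1 After the decision to take a simple random sample had been made,
it was realised that $Y_1$, the value of the unit with label 1, would be unusually low
and $Y_N$, the value of the unit with label $N$, would be unusually high. In such
situations, it is decided to use the estimator
$$\hat{\bar{Y}}^* = \begin{cases}
\hat{\bar{Y}} + C & \text{if the sample contains } Y_1 \text{ but not } Y_N \\
\hat{\bar{Y}} - C & \text{if the sample contains } Y_N \text{ but not } Y_1 \\
\hat{\bar{Y}} & \text{for all other samples}
\end{cases}$$
where the constant $C$ is positive and predetermined. Show that the estimator
$\hat{\bar{Y}}^*$ is unbiased and its variance is
$$V(\hat{\bar{Y}}^*) = \frac{N-n}{N} \left[ \frac{S_y^2}{n} - \frac{2C}{N-1} (Y_N - Y_1 - nC) \right]$$
Also prove that $V(\hat{\bar{Y}}^*) < V(\hat{\bar{Y}})$ if $0 < C < \frac{Y_N - Y_1}{n}$ (Sarndal (1971)).
Solution Let $\Omega_n = \{s \mid n(s) = n\}$. Partition $\Omega_n$ into three disjoint subclasses as
$\Omega_1 = \{s \mid n(s) = n, s \text{ contains } 1 \text{ but not } N\}$,
$\Omega_2 = \{s \mid n(s) = n, s \text{ contains } N \text{ but not } 1\}$
and $\Omega_3 = \Omega_n - \Omega_1 - \Omega_2$.
It is to be noted that the numbers of subsets in $\Omega_1$, $\Omega_2$ and $\Omega_3$ are
respectively $\binom{N-2}{n-1}$, $\binom{N-2}{n-1}$ and $\binom{N}{n} - 2\binom{N-2}{n-1}$.
Under simple random sampling
$$\begin{aligned}
E(\hat{\bar{Y}}^*) &= \sum_{s \in \Omega_n} \hat{\bar{Y}}^* \binom{N}{n}^{-1} \\
&= \binom{N}{n}^{-1} \left\{ \sum_{s \in \Omega_1} [\hat{\bar{Y}} + C] + \sum_{s \in \Omega_2} [\hat{\bar{Y}} - C] + \sum_{s \in \Omega_3} \hat{\bar{Y}} \right\} \\
&= \binom{N}{n}^{-1} \sum_{s \in \Omega_n} \hat{\bar{Y}} = \bar{Y} \quad \text{(refer Remark 2.1)}
\end{aligned}$$
since $\Omega_1$ and $\Omega_2$ contain the same number of subsets, so that the terms involving $C$ cancel.
Therefore the estimator $\hat{\bar{Y}}^*$ is unbiased for the population mean. The variance
of the estimator $\hat{\bar{Y}}^*$ is
$$\begin{aligned}
V(\hat{\bar{Y}}^*) &= \sum_{s \in \Omega_n} [\hat{\bar{Y}}^* - \bar{Y}]^2 \binom{N}{n}^{-1} \quad \text{(by definition)} \\
&= \binom{N}{n}^{-1} \left( \sum_{s \in \Omega_1} [\hat{\bar{Y}} + C - \bar{Y}]^2 + \sum_{s \in \Omega_2} [\hat{\bar{Y}} - C - \bar{Y}]^2 + \sum_{s \in \Omega_3} [\hat{\bar{Y}} - \bar{Y}]^2 \right) \\
&= \binom{N}{n}^{-1} \left[ \sum_{s \in \Omega_n} [\hat{\bar{Y}} - \bar{Y}]^2 + 2C \left\{ \sum_{s \in \Omega_1} \hat{\bar{Y}} - \sum_{s \in \Omega_2} \hat{\bar{Y}} \right\} + 2C^2 \binom{N-2}{n-1} \right]
\end{aligned} \tag{2.9}$$
Note that
$$\binom{N}{n}^{-1} \binom{N-2}{n-1} = \frac{n(N-n)}{N(N-1)} \tag{2.10}$$
Further it may be noted that all the $\binom{N-2}{n-1}$
members of $\Omega_1$ contain the unit with label 1, $\binom{N-3}{n-2}$ of them contain the unit
with label $j$ ($j = 2, 3, \ldots, N-1$) and none of them contains the unit with label $N$.
Therefore
$$\begin{aligned}
\sum_{s \in \Omega_1} \hat{\bar{Y}} &= \frac{1}{n} \left[ \binom{N-2}{n-1} Y_1 + \binom{N-3}{n-2} \sum_{j=2}^{N-1} Y_j \right] \\
&= \frac{1}{n} \left\{ Y_1 + \frac{n-1}{N-2} \sum_{j=2}^{N-1} Y_j \right\} \binom{N-2}{n-1}
\end{aligned} \tag{2.11}$$
Proceeding in the same way, we get
$$\sum_{s \in \Omega_2} \hat{\bar{Y}} = \frac{1}{n} \left\{ Y_N + \frac{n-1}{N-2} \sum_{j=2}^{N-1} Y_j \right\} \binom{N-2}{n-1} \tag{2.12}$$
It can be seen that
$$\frac{1}{n} \binom{N}{n}^{-1} \binom{N-2}{n-1} = \frac{N-n}{N(N-1)} \tag{2.13}$$
Using (2.10)-(2.13) in (2.9) we get
$$V(\hat{\bar{Y}}^*) = \frac{N-n}{N} \left[ \frac{S_y^2}{n} - \frac{2C}{N-1} (Y_N - Y_1 - nC) \right] \tag{2.14}$$
which is the required result. We know that
$$V(\hat{\bar{Y}}) = \frac{N-n}{N} \left[ \frac{S_y^2}{n} \right] \tag{2.15}$$
Therefore
$$\begin{aligned}
V(\hat{\bar{Y}}^*) < V(\hat{\bar{Y}}) &\Rightarrow \left[ \frac{2C}{N-1} \right] [Y_N - Y_1 - nC] > 0 \quad \text{(comparing (2.14) and (2.15))} \\
&\Rightarrow Y_N - Y_1 > nC \quad \text{(when } C \text{ is positive)} \\
&\Rightarrow 0 < C < \left[ \frac{Y_N - Y_1}{n} \right]
\end{aligned}$$
Hence the solution.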
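The variance expression in Problem 2.1 can be checked against a full enumeration of samples. In the sketch below the population values, the sample size and the constant `C` are hypothetical choices satisfying $0 < C < (Y_N - Y_1)/n$; `modified_mean` is a helper name used only here.

```python
from fractions import Fraction
from itertools import combinations

Y = [1, 4, 5, 6, 7, 15]   # hypothetical values; Y_1 is low and Y_N is high
N, n = len(Y), 3
C = Fraction(1)            # assumed constant with 0 < C < (Y_N - Y_1) / n = 14/3
Ybar = Fraction(sum(Y), N)
S2 = sum((y - Ybar) ** 2 for y in Y) / (N - 1)

def modified_mean(idx):
    """Sample mean shifted by +C or -C depending on which extreme unit is present."""
    m = Fraction(sum(Y[i] for i in idx), n)
    has_first, has_last = 0 in idx, N - 1 in idx
    if has_first and not has_last:
        return m + C
    if has_last and not has_first:
        return m - C
    return m

samples = list(combinations(range(N), n))
values = [modified_mean(s) for s in samples]
expectation = sum(values) / len(samples)
variance = sum((v - Ybar) ** 2 for v in values) / len(samples)

plain_variance = Fraction(N - n, N * n) * S2   # variance of the ordinary sample mean
formula = Fraction(N - n, N) * (S2 / n - 2 * C / (N - 1) * (Y[-1] - Y[0] - n * C))
print(expectation, variance, formula, plain_variance)
```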
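Problem 2.2 Given the information in Problem 2.1, an alternative plan is to
include both $Y_1$ and $Y_8$ in every sample, drawing a sample of size 2 from the
units with labels 2, 3, ..., 7, when $N = 8$ and $n = 4$. Let $\hat{\bar{Y}}_2$ be the mean of those 2
units selected. Show that the estimator $\hat{\bar{Y}}' = \frac{Y_1 + 6\hat{\bar{Y}}_2 + Y_8}{8}$ is unbiased for the
population mean with variance $\frac{9V(\hat{\bar{Y}}_2)}{16}$.
Solution
$$\hat{\bar{Y}}' = \frac{Y_1 + 6\hat{\bar{Y}}_2 + Y_8}{8} = \frac{Y_1 + Y_8}{8} + \left[ \frac{6}{8} \right] \frac{1}{2} \sum_{i \in s} Y_i$$
Taking expectations on both sides we get
$$E[\hat{\bar{Y}}'] = \frac{Y_1 + Y_8}{8} + \left[ \frac{6}{8} \right] \frac{1}{2} \sum_{i=2}^{7} Y_i E[I_i] \tag{2.16}$$
where $I_i = 1$ if $i \in s$
$= 0$ otherwise.
Since $E[I_i] = \frac{2}{6}$, we get from (2.16)
$$E[\hat{\bar{Y}}'] = \frac{Y_1 + Y_8}{8} + \frac{1}{8} \sum_{i=2}^{7} Y_i = \bar{Y}$$
Since $Y_1$ and $Y_8$ are constants, $V(\hat{\bar{Y}}') = \left[ \frac{6}{8} \right]^2 V(\hat{\bar{Y}}_2) = \frac{9V(\hat{\bar{Y}}_2)}{16}$.
Hence the solution.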
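Problem 2.3 Show that when $N = 3$, $n = 2$ in simple random sampling, the
estimator
$$\hat{\bar{Y}}^* = \begin{cases}
\frac{1}{2} Y_1 + \frac{1}{2} Y_2 & \text{if } s = \{1, 2\} \\
\frac{1}{2} Y_1 + \frac{2}{3} Y_3 & \text{if } s = \{1, 3\} \\
\frac{1}{2} Y_2 + \frac{1}{3} Y_3 & \text{if } s = \{2, 3\}
\end{cases}$$
is unbiased for the population mean and
$V(\hat{\bar{Y}}) > V(\hat{\bar{Y}}^*)$ if $Y_3[3Y_2 - 3Y_1 - Y_3] > 0$.
Solution By definition
$$\begin{aligned}
E[\hat{\bar{Y}}^*] &= \sum_{s} \hat{\bar{Y}}^* \binom{3}{2}^{-1} = \frac{1}{3} \left\{ \frac{1}{2} Y_1 + \frac{1}{2} Y_2 + \frac{1}{2} Y_1 + \frac{2}{3} Y_3 + \frac{1}{2} Y_2 + \frac{1}{3} Y_3 \right\} \\
&= \frac{1}{3} \{ Y_1 + Y_2 + Y_3 \} = \bar{Y}
\end{aligned}$$
Hence $\hat{\bar{Y}}^*$ is unbiased for the population mean.
$$\begin{aligned}
V[\hat{\bar{Y}}^*] &= \frac{1}{3} \left\{ \left[ \frac{1}{2} Y_1 + \frac{1}{2} Y_2 \right]^2 + \left[ \frac{1}{2} Y_1 + \frac{2}{3} Y_3 \right]^2 + \left[ \frac{1}{2} Y_2 + \frac{1}{3} Y_3 \right]^2 \right\} - \left[ \frac{Y_1 + Y_2 + Y_3}{3} \right]^2 \\
&= \frac{1}{18} Y_1^2 + \frac{1}{18} Y_2^2 + \frac{2}{27} Y_3^2 - \frac{1}{18} Y_1 Y_2 - \frac{1}{9} Y_2 Y_3
\end{aligned}$$
We know that under simple random sampling,
$$\begin{aligned}
V[\hat{\bar{Y}}] &= \frac{3-2}{(3)(2)} \, \frac{1}{3-1} \sum_{i=1}^{3} [Y_i - \bar{Y}]^2 \quad \text{(refer Remark 2.1)} \\
&= \frac{1}{18} \left[ Y_1^2 + Y_2^2 + Y_3^2 - Y_1 Y_2 - Y_1 Y_3 - Y_2 Y_3 \right]
\end{aligned}$$
Therefore
$$V[\hat{\bar{Y}}] - V[\hat{\bar{Y}}^*] = -\frac{1}{54} Y_3^2 + \frac{1}{18} Y_2 Y_3 - \frac{1}{18} Y_1 Y_3 = \frac{1}{54} Y_3 [3Y_2 - 3Y_1 - Y_3]$$
Using the above difference we get
$$V[\hat{\bar{Y}}] - V[\hat{\bar{Y}}^*] > 0 \Rightarrow Y_3[3Y_2 - 3Y_1 - Y_3] > 0$$
Hence the solution. This example helps us to understand that under certain
conditions, one can find estimators better than conventional estimators.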
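The comparison in Problem 2.3 is easy to reproduce numerically. The sketch below evaluates both variances exactly for hypothetical triples $(Y_1, Y_2, Y_3)$, one satisfying the condition $Y_3[3Y_2 - 3Y_1 - Y_3] > 0$ and one violating it; the function name `variances` is introduced for the example.

```python
from fractions import Fraction

def variances(Y1, Y2, Y3):
    """Return (V of the sample mean, V of the modified estimator) for N = 3, n = 2."""
    half, third = Fraction(1, 2), Fraction(1, 3)
    mean = Fraction(Y1 + Y2 + Y3, 3)
    plain = [half * (Y1 + Y2), half * (Y1 + Y3), half * (Y2 + Y3)]
    modified = [half * Y1 + half * Y2,
                half * Y1 + 2 * third * Y3,
                half * Y2 + third * Y3]

    def var(estimates):
        # Each of the three samples has probability 1/3.
        return sum((e - mean) ** 2 for e in estimates) / 3

    assert sum(modified) / 3 == mean   # unbiasedness of the modified estimator
    return var(plain), var(modified)

v_plain, v_mod = variances(1, 5, 3)     # condition holds: 3 * (15 - 3 - 3) > 0
w_plain, w_mod = variances(5, 1, 3)     # condition fails: 3 * (3 - 15 - 3) < 0
print(v_plain - v_mod, w_plain - w_mod)
```

In both cases the difference of the variances equals $\frac{1}{54} Y_3[3Y_2 - 3Y_1 - Y_3]$, so the modified estimator is better only in the first case.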
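Problem 2.4 A simple random sample of size $n = n_1 + n_2$ with mean $\bar{y}$ is
drawn from a finite population, and a simple random subsample of size $n_1$ is
drawn from it with mean $\bar{y}_1$. Show that
(a) $V[\bar{y}_1 - \bar{y}_2] = S_y^2 \left[ \frac{1}{n_1} + \frac{1}{n_2} \right]$ where $\bar{y}_2$ is the mean of the remaining $n_2$
units in the sample,
(b) $V[\bar{y}_1 - \bar{y}] = S_y^2 \left[ \frac{1}{n_1} - \frac{1}{n} \right]$
(c) $\mathrm{cov}(\bar{y}, \bar{y}_1 - \bar{y}) = 0$
Solution Since $\bar{y}_1$ is based on a subsample,
$$V(\bar{y}_1) = E_1 V_2(\bar{y}_1) + V_1 E_2(\bar{y}_1) \tag{2.17}$$
where $E_1$ is the unconditional expectation and $E_2$ the conditional expectation
with respect to the subsample. Similarly $V_1$ is the unconditional variance and
$V_2$ is the conditional variance with respect to the subsample.
It may be noted that $E_2[\bar{y}_1] = \bar{y}$ and $V_2[\bar{y}_1] = \frac{n - n_1}{n n_1} s_y^2$ (refer Remark 2.1).
Therefore $V_1 E_2[\bar{y}_1] = \frac{N-n}{Nn} S_y^2$ and $E_1 V_2[\bar{y}_1] = \frac{n - n_1}{n n_1} S_y^2$.
Substituting these expressions in (2.17) and doing the necessary simplification
we get
$$V[\bar{y}_1] = \left( \frac{1}{n_1} - \frac{1}{N} \right) S_y^2 \tag{2.18}$$
Further
$$\begin{aligned}
\mathrm{cov}(\bar{y}, \bar{y}_1) &= E[\bar{y}\,\bar{y}_1] - E[\bar{y}] E[\bar{y}_1] \\
&= E_1 E_2[\bar{y}\,\bar{y}_1] - \bar{Y} E_1 E_2[\bar{y}_1] \\
&= E_1[\bar{y}\,\bar{y}] - \bar{Y}\bar{Y} \\
&= V[\bar{y}] \\
&= \left[ \frac{N-n}{Nn} \right] S_y^2
\end{aligned} \tag{2.19}$$
We know that
$$\begin{aligned}
\mathrm{cov}(\bar{y}, \bar{y}_1 - \bar{y}) &= \mathrm{cov}(\bar{y}, \bar{y}_1) - \mathrm{cov}(\bar{y}, \bar{y}) \\
&= V[\bar{y}] - V[\bar{y}] \quad \text{(using (2.19))} \\
&= 0
\end{aligned}$$
This proves (c).
Note that
$$\begin{aligned}
V[\bar{y}_1 - \bar{y}] &= V[\bar{y}_1] + V[\bar{y}] - 2\,\mathrm{cov}(\bar{y}, \bar{y}_1) \\
&= V[\bar{y}_1] + V[\bar{y}] - 2V[\bar{y}] \quad \text{(using (2.19))} \\
&= V[\bar{y}_1] - V[\bar{y}] \\
&= \left[ \frac{1}{n_1} - \frac{1}{N} \right] S_y^2 - \left[ \frac{1}{n} - \frac{1}{N} \right] S_y^2 = \left[ \frac{1}{n_1} - \frac{1}{n} \right] S_y^2
\end{aligned}$$
This proves (b).
Since $\bar{y} = \frac{n_1 \bar{y}_1 + n_2 \bar{y}_2}{n}$, we have $\bar{y}_1 - \bar{y} = \frac{n_2}{n} (\bar{y}_1 - \bar{y}_2)$. Therefore
$$\begin{aligned}
V[\bar{y}_1 - \bar{y}_2] &= \frac{n^2}{n_2^2} V[\bar{y}_1 - \bar{y}] \\
&= \frac{n^2}{n_2^2} \left[ \frac{1}{n_1} - \frac{1}{n} \right] S_y^2 \quad \text{(using (b))} \\
&= \frac{n}{n_2 n_1} S_y^2 \quad \text{(since } n_2 = n - n_1 \text{)} \\
&= \frac{n_1 + n_2}{n_1 n_2} S_y^2 = \left[ \frac{1}{n_1} + \frac{1}{n_2} \right] S_y^2
\end{aligned}$$
This proves (a).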
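Problem 2.5 Suppose from a sample of $n$ units selected with simple random
sampling a subsample of $n'$ units is selected with simple random sampling,
duplicated and added to the original sample. Derive the expected value and the
approximate sampling variance of $\bar{y}'$, the sample mean based on the $n + n'$
units. For what value of the fraction $\frac{n'}{n}$ does the efficiency of $\bar{y}'$ compared to
that of $\bar{y}$ attain its minimum value?
Solution Denote by $\bar{y}_0$ the mean of the subsample. The sample mean based on
$n + n'$ units can be written as
$$\bar{y}' = \frac{n\bar{y} + n'\bar{y}_0}{n + n'}$$
Since $\bar{y}'$ is based on the subsample,
$E[\bar{y}'] = E_1 E_2[\bar{y}']$, where $E_2$ is the expectation with respect to the subsample
and $E_1$ the expectation with respect to the original sample.
Therefore
$$E[\bar{y}'] = E_1 E_2 \left[ \frac{n\bar{y} + n'\bar{y}_0}{n + n'} \right] = \frac{n E_1 E_2(\bar{y}) + n' E_1 E_2(\bar{y}_0)}{n + n'} = \frac{n\bar{Y} + n'\bar{Y}}{n + n'} = \bar{Y}$$
Hence the combined mean is unbiased for the population mean.
$$\begin{aligned}
V[\bar{y}'] &= E_1 V_2[\bar{y}'] + V_1 E_2[\bar{y}'] \\
&= E_1 V_2 \left[ \frac{n\bar{y} + n'\bar{y}_0}{n + n'} \right] + V_1 E_2 \left[ \frac{n\bar{y} + n'\bar{y}_0}{n + n'} \right] \\
&= \frac{1}{(n + n')^2} \left\{ E_1 V_2(n'\bar{y}_0) + V_1(n\bar{y} + n'\bar{y}) \right\} \\
&= \frac{1}{(n + n')^2} \left\{ E_1 \left[ n'^2 \frac{n - n'}{nn'} s_y^2 \right] + (n + n')^2 \frac{N-n}{Nn} S_y^2 \right\} \\
&= \left[ \frac{n'}{n + n'} \right]^2 \left[ \frac{n - n'}{nn'} \right] S_y^2 + \frac{N-n}{Nn} S_y^2 \\
&= \left[ \frac{n'}{n + n'} \right]^2 \left[ \frac{1}{n'} - \frac{1}{n} \right] S_y^2 + \frac{S_y^2}{n} \quad \text{(approximately)} \\
&= \left[ \frac{n'/n}{1 + n'/n} \right]^2 \left[ \frac{1}{n'} - \frac{1}{n} \right] S_y^2 + \frac{S_y^2}{n}
\end{aligned} \tag{2.20}$$
By Remark 2.1,
$$V(\bar{y}) = \left[ \frac{N-n}{Nn} \right] S_y^2 = \frac{S_y^2}{n} \quad \text{(approximately)} \tag{2.21}$$
Therefore by (2.20) and (2.21), the ratio of the variance of $\bar{y}'$ to that of $\bar{y}$ is
$$E = \frac{1 + 3\frac{n'}{n}}{\left[ 1 + \frac{n'}{n} \right]^2} = \frac{1 + 3\theta}{[1 + \theta]^2} \quad \text{where } \theta = \frac{n'}{n}.$$
Using calculus methods, it can be seen that $E$ attains its maximum at $\theta = \frac{1}{3}$, where it equals $\frac{9}{8}$.
Since a larger value of $E$ means a less precise $\bar{y}'$, the value of $\frac{n'}{n}$ for which the
efficiency of $\bar{y}'$ compared to that of $\bar{y}$ attains its minimum is $\frac{1}{3}$.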
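Problem 2.6 Let $y_i$ be the $i$th sample observation ($i = 1, 2, \ldots, n$) in simple
random sampling. Find the variance of $y_i$ and the covariance of $y_i$ and $y_j$
($i \ne j$). Using these results derive the variance of the sample mean.
Solution
Claim: In simple random sampling, the probability of drawing the unit with
label $r$ ($r = 1, 2, \ldots, N$) in the $i$th draw is the same as the probability of drawing the
unit with label $r$ in the first draw.
Proof of the Claim
The probability of drawing the unit with label $r$ in the first draw is $\frac{1}{N}$.
The probability of drawing the unit with label $r$ in the $i$th draw is
$$\left[ 1 - \frac{1}{N} \right] \left[ 1 - \frac{1}{N-1} \right] \cdots \left[ 1 - \frac{1}{N-i+2} \right] \left[ \frac{1}{N-i+1} \right]$$
which on simplification reduces to $\frac{1}{N}$. Hence the claim.
Proceeding in the same way it can be seen that the probability of selecting the
units with labels $r$ and $s$ in the $i$th and $j$th draws is the same as the probability of
drawing them in the first and second draws.
Therefore, we infer that $y_i$ can take any one of the $N$ values $Y_1, Y_2, \ldots, Y_N$
with equal probabilities $\frac{1}{N}$ and the product $y_i y_j$ can take the values
$Y_1 Y_2, Y_1 Y_3, \ldots, Y_{N-1} Y_N$ with probabilities $\frac{1}{N(N-1)}$.
Hence we have
$$\text{(i)} \quad E[y_i] = \frac{1}{N} \sum_{k=1}^{N} Y_k \tag{2.22}$$
$$\text{(ii)} \quad E[y_i^2] = \frac{1}{N} \sum_{k=1}^{N} Y_k^2 \tag{2.23}$$
$$\text{(iii)} \quad E[y_i y_j] = \frac{1}{N(N-1)} \sum_{k=1}^{N} \sum_{\substack{l=1 \\ l \ne k}}^{N} Y_k Y_l \tag{2.24}$$
From (2.22), we have $E[y_i] = \bar{Y}$.
Therefore
$$V[y_i] = \frac{1}{N} \sum_{k=1}^{N} Y_k^2 - \bar{Y}^2 \quad \text{(refer (2.23))}$$
$$= \frac{1}{N} \sum_{k=1}^{N} [Y_k - \bar{Y}]^2 = \frac{N-1}{N} S_y^2 \tag{2.25}$$
Using (2.22) and (2.24) we get
$$\begin{aligned}
\mathrm{cov}(y_i, y_j) &= \frac{1}{N(N-1)} \sum_{k=1}^{N} \sum_{\substack{l=1 \\ l \ne k}}^{N} Y_k Y_l - \bar{Y}^2 \\
&= \frac{1}{N(N-1)} \left[ N^2 \bar{Y}^2 - \sum_{k=1}^{N} Y_k^2 \right] - \bar{Y}^2 \\
&= -\frac{1}{N(N-1)} \left[ \sum_{k=1}^{N} Y_k^2 - N \bar{Y}^2 \right] = -\frac{S_y^2}{N}
\end{aligned} \tag{2.26}$$
We know that
$$\begin{aligned}
V[\bar{y}] &= V\left[ \frac{1}{n} \sum_{i=1}^{n} y_i \right] = \frac{1}{n^2} \left[ \sum_{i=1}^{n} V(y_i) + 2 \sum_{i=1}^{n} \sum_{\substack{j=1 \\ i<j}}^{n} \mathrm{cov}(y_i, y_j) \right] \\
&= \frac{1}{n^2} \left[ n \frac{(N-1)}{N} S_y^2 + 2 \frac{n(n-1)}{2} \left( -\frac{S_y^2}{N} \right) \right] \quad \text{(using (2.25) and (2.26))} \\
&= \left[ \frac{N-n}{Nn} \right] S_y^2
\end{aligned}$$
Hence the solution.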
Problem 2.7 If the value of the population coefficient of variation $C = \frac{S_y}{\bar{Y}}$ is
known at the estimation stage, is it possible to improve upon the estimator $\bar{y}$,
the usual sample mean based on a sample of $n$ units selected using simple
random sampling? If so, give the improved estimator and obtain its efficiency
by comparing its mean square error with $V(\bar{y})$.
Solution Consider the estimator
$$\bar{y}_\lambda = \lambda \bar{y}$$
where $\lambda$ is a constant.
The mean square error of the estimator $\bar{y}_\lambda$ is
$$\begin{aligned}
MSE(\bar{y}_\lambda) &= E[\lambda \bar{y} - \bar{Y}]^2 \\
&= E[\lambda(\bar{y} - \bar{Y}) + (\lambda - 1)\bar{Y}]^2 \\
&= \lambda^2 E(\bar{y} - \bar{Y})^2 + (\lambda - 1)^2 \bar{Y}^2 + 2\lambda(\lambda - 1)\bar{Y} E(\bar{y} - \bar{Y}) \\
&= \lambda^2 V(\bar{y}) + (\lambda - 1)^2 \bar{Y}^2
\end{aligned} \tag{2.27}$$
Using differential calculus methods, it can be seen that the above mean square
error is minimum when
$$\lambda = \left[ 1 + \frac{N-n}{Nn} C^2 \right]^{-1} \tag{2.28}$$
Therefore, the population mean can be estimated more precisely by using the
estimator
$$\bar{y}_\lambda^* = \left[ 1 + \frac{N-n}{Nn} C^2 \right]^{-1} \bar{y}$$
whenever the value of $C$ is known.
Substituting (2.28) in (2.27) we get the minimum mean square error
$$M^* = \left[ \frac{N-n}{Nn} S_y^2 \right] \left[ 1 + \frac{N-n}{Nn} C^2 \right]^{-1}$$
Therefore the relative efficiency of the improved estimator $\bar{y}_\lambda^*$ when compared
to $\bar{y}$, measured by the ratio $M^* / V(\bar{y})$, is
$$\left[ 1 + \frac{N-n}{Nn} C^2 \right]^{-1}$$
It may be noted that the above expression will always assume a value less than
one.
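The gain from the multiplier in (2.28) can be seen in a small exact calculation. The sketch below uses hypothetical population values and computes the mean square error of $\lambda \bar{y}$ by enumerating all simple random samples; `lam` and `mse` are names introduced only for this example.

```python
from fractions import Fraction
from itertools import combinations

Y = [2, 5, 6, 9, 13]          # hypothetical population values
N, n = len(Y), 2
Ybar = Fraction(sum(Y), N)
S2 = sum((y - Ybar) ** 2 for y in Y) / (N - 1)
C2 = S2 / Ybar ** 2                       # squared coefficient of variation
V = Fraction(N - n, N * n) * S2           # variance of the sample mean

lam = 1 / (1 + Fraction(N - n, N * n) * C2)   # optimal multiplier from (2.28)

def mse(multiplier):
    """Exact MSE of multiplier * (sample mean) under simple random sampling."""
    means = [Fraction(sum(s), n) for s in combinations(Y, n)]
    return sum((multiplier * m - Ybar) ** 2 for m in means) / len(means)

print(lam, mse(lam), mse(1))
```

With the multiplier set to one the mean square error is just $V(\bar{y})$, and at the optimal multiplier it drops to $M^* = \lambda V(\bar{y})$.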
Remark 2.2 As we have pointed out, a simple random sample of size $n$ is obtained
by drawing $n$ random numbers one by one without replacement and considering the
units with the corresponding labels. If the random numbers are drawn with
replacement and the units corresponding to the drawn numbers are treated as the
sample, we obtain what is known as a "Simple Random Sampling With
Replacement" sample (SRSWR).
Problem 2.8 Show that in simple random sampling with replacement
(a) the sample mean $\bar{y}$ is unbiased for the population mean
(b) $V(\bar{y}) = \left[ \frac{N-1}{Nn} \right] S_y^2$
Solution If $y_i$, $i = 1, 2, \ldots, n$, is the value of the unit drawn in the $i$th draw then
$y_i$ can take any one of the $N$ values $Y_j$ with probabilities $\frac{1}{N}$.
Therefore
$$E(y_i) = \sum_{j=1}^{N} Y_j \frac{1}{N} = \bar{Y} \tag{2.29}$$
In the same way, we get
$$E(y_i^2) = \sum_{j=1}^{N} Y_j^2 \frac{1}{N} = \frac{1}{N} \sum_{j=1}^{N} Y_j^2$$
Hence
$$V(y_i) = \frac{1}{N} \sum_{j=1}^{N} Y_j^2 - \bar{Y}^2 = \frac{N-1}{N} S_y^2 \tag{2.30}$$
Since the draws are independent, $\mathrm{cov}(y_i, y_j) = 0$, and we get
$$E(\bar{y}) = E\left[ \frac{1}{n} \sum_{i=1}^{n} y_i \right] = \frac{1}{n} n\bar{Y} = \bar{Y} \quad \text{(using (2.29))}$$
and
$$V(\bar{y}) = V\left[ \frac{1}{n} \sum_{i=1}^{n} y_i \right] = \frac{1}{n^2} \sum_{i=1}^{n} V(y_i) = \frac{1}{n^2} (n) \frac{N-1}{N} S_y^2 = \frac{N-1}{Nn} S_y^2$$
Hence the solution.
Problem 2.9 A simple random sample of size 3 is drawn from a population of
size $N$ with replacement. As an estimator of $\bar{Y}$ we take $\bar{y}'$, the unweighted mean
over the different units in the sample. Show that the average variance of $\bar{y}'$ is
$$\frac{(2N-1)(N-1) S_y^2}{6N^2}$$
Solution The sample drawn will contain 1, 2 or 3 different units. Let $P_1$, $P_2$ and
$P_3$ be the probabilities of the sample containing 1, 2 and 3 different units
respectively.
$$P_1 = \sum_{r=1}^{N} P(\text{selecting the } r\text{th unit in all the three draws}) = N \frac{1}{N} \frac{1}{N} \frac{1}{N} = \frac{1}{N^2}$$
$P_2 = \sum_{r=1}^{N} P$(selecting the $r$th unit in draw 1 and the same unit, different from the $r$th unit, in
the second and third draws) $+ \sum_{r=1}^{N} P$(selecting the $r$th unit in draw 2 and
the same unit, different from the $r$th unit, in the first and third draws) $+
\sum_{r=1}^{N} P$(selecting the $r$th unit in draw 3 and the same unit, different from the $r$th unit, in
the first and second draws)
$$= N \frac{1}{N} \frac{N-1}{N} \frac{1}{N} + N \frac{1}{N} \frac{N-1}{N} \frac{1}{N} + N \frac{1}{N} \frac{N-1}{N} \frac{1}{N} = \frac{3(N-1)}{N^2}$$
$P_3 = \sum_{r=1}^{N} P$(selecting the $r$th unit in draw 1, a unit different from the $r$th unit in the
second draw and in the third draw a unit different from the
units drawn in the first two draws)
$$= \frac{N(N-1)(N-2)}{N^3} = \frac{(N-1)(N-2)}{N^2}$$
We know that the variance of the sample mean based on $n$ distinct units is
$\frac{N-n}{Nn} S_y^2$.
Therefore the average variance of $\bar{y}'$ is
$$\left( \frac{N-1}{N} S_y^2 \right) \frac{1}{N^2} + \left( \frac{N-2}{2N} S_y^2 \right) \frac{3(N-1)}{N^2} + \left( \frac{N-3}{3N} S_y^2 \right) \frac{(N-1)(N-2)}{N^2}$$
which on simplification reduces to $\frac{(2N-1)(N-1) S_y^2}{6N^2}$.
Hence the solution.
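The probabilities $P_1$, $P_2$, $P_3$ and the resulting variance in Problem 2.9 can be verified by listing all $N^3$ equally likely ordered samples. The population below is hypothetical. The check relies on the fact that, given the number of distinct units, their mean is conditionally unbiased, so the variance of $\bar{y}'$ coincides with the average variance computed in the solution.

```python
from fractions import Fraction
from itertools import product

Y = [2, 3, 5, 10]             # hypothetical population values
N = len(Y)
Ybar = Fraction(sum(Y), N)
S2 = sum((y - Ybar) ** 2 for y in Y) / (N - 1)

draws = list(product(range(N), repeat=3))     # all N^3 equally likely ordered samples

# Probability that the sample contains exactly v distinct units, v = 1, 2, 3.
prob = {v: Fraction(sum(len(set(d)) == v for d in draws), len(draws)) for v in (1, 2, 3)}

def distinct_mean(d):
    """Unweighted mean over the different units appearing in the ordered sample d."""
    units = set(d)
    return Fraction(sum(Y[i] for i in units), len(units))

variance = sum((distinct_mean(d) - Ybar) ** 2 for d in draws) / len(draws)
formula = Fraction((2 * N - 1) * (N - 1), 6 * N * N) * S2
print(prob, variance, formula)
```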
Exercises
2.1 Derive $V(s_y^2)$ and $\mathrm{cov}(\bar{x}, s_y^2)$ in simple random sampling without replacement
under usual notations.
2.2 Let v denote the number of distinct units in a simple random sample drawn
with replacement. Show that the sample mean based on the v distinct units
is also unbiased for the population mean and derive its variance.
2.3 Suggest an unbiased estimator for the population proportion under simple
random sampling without replacement and derive its variance and also
obtain an estimator for the variance.
2.4 Suppose in a population of $N$ units, $NP$ units are known to have value
zero. Obtain the relative efficiency of selecting $n$ units from the $N$ units
with simple random sampling with replacement as compared to the
selection of $n$ units from the $N - NP$ nonzero units with simple random
sampling with replacement in estimating the population mean.
2.5 A sample of size $n$ is drawn from a population having $N$ units by simple
random sampling. A subsample of $n_1$ units is drawn from the $n$ units by
simple random sampling. Let $\bar{y}_1$ denote the mean based on the $n_1$ units and
$\bar{y}_2$ the mean based on the remaining $n - n_1$ units. Show that $w\bar{y}_1 + (1 - w)\bar{y}_2$ is unbiased
for the population mean and derive its variance. Also derive the optimum
value of $w$ for which the variance attains its minimum and the resulting
estimator.
Chapter 3
Systematic Sampling Schemes
3.1 Introduction
In this chapter, a collection of sampling schemes called systematic sampling
schemes, which have several practical advantages, is considered. In these
schemes, instead of selecting $n$ units at random, the sample units are decided by
a single number chosen at random.
Consider a finite population of size $N$, the units of which are identified by
the labels 1, 2, ..., $N$ and ordered in ascending order according to their labels.
Unless otherwise mentioned, it is assumed that the population size $N$ is
expressible as the product of the sample size $n$ and some positive integer $k$, which is
known as the reciprocal of the sampling fraction or the sampling interval.
In the following section we shall describe the most popular linear systematic
sampling scheme abbreviated as LSS.
3.2 Linear Systematic Sampling
A Linear Systematic Sample (LSS) of size n is drawn by using the following
procedure:
Draw at random a number less than or equal to k, say r. Starting from the
rth unit in the population, every kth unit is selected till a sample of size n is
obtained.
For example, when $N = 24$, $n = 6$ and $k = 4$, the four possible linear systematic
samples are:

Sample Number   Random Start   Sampled units
1               1              1, 5, 9, 13, 17, 21
2               2              2, 6, 10, 14, 18, 22
3               3              3, 7, 11, 15, 19, 23
4               4              4, 8, 12, 16, 20, 24
The linear systematic sampling scheme described above can be regarded as
dividing the population of $N$ units into $k$ mutually exclusive and exhaustive
groups $\{S_1, S_2, \ldots, S_k\}$ of $n$ units each and choosing one of them at random,
where the units in the $r$th group are given by
$$S_r = \{r, r + k, \ldots, r + (n-1)k\}, \quad r = 1, 2, \ldots, k$$
30 Sampling Theory and Methods
The following theorem gives an unbiased estimator for the population total and
its variance under LSS.
Theorem 3.1 An unbiased estimator for the population total Y under LSS
corresponding to the random start r is given by
$$\hat{Y}_{LSS} = \frac{N}{n}\sum_{j=1}^{n} Y_{r+(j-1)k}$$
and its variance is
$$V(\hat{Y}_{LSS}) = \frac{1}{k}\sum_{r=1}^{k}\left[\hat{Y}_r - Y\right]^2$$
where $\hat{Y}_r$ is the value of $\hat{Y}_{LSS}$ corresponding to the random start r.
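Proof Note that the estimator $\hat{Y}_{LSS}$ can take any one of the k values
$\hat{Y}_r$, r = 1, 2, ..., k with equal probabilities $\frac{1}{k}$.
Therefore
$$E[\hat{Y}_{LSS}] = \sum_{r=1}^{k}\hat{Y}_r\left(\frac{1}{k}\right) = \frac{1}{k}\sum_{r=1}^{k}\frac{N}{n}\sum_{j=1}^{n} Y_{r+(j-1)k} = \frac{N}{nk}\sum_{r=1}^{k}\sum_{j=1}^{n} Y_{r+(j-1)k} = \sum_{i=1}^{N} Y_i \qquad (3.1)$$
since N = nk and the k groups together cover every unit exactly once.
Hence $\hat{Y}_{LSS}$ is unbiased for the population total Y.
Since the estimator $\hat{Y}_{LSS}$ can take any one of the k values $\hat{Y}_r$, r = 1, 2, ..., k with
equal probabilities $\frac{1}{k}$ and it is unbiased for Y,
$$V(\hat{Y}_{LSS}) = E\left[\hat{Y}_{LSS} - Y\right]^2 = \frac{1}{k}\sum_{r=1}^{k}\left[\hat{Y}_r - Y\right]^2 \qquad (3.2)$$
Hence the proof.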
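Theorem 3.1 can be checked numerically on a small population by enumerating all k random starts. The sketch below uses exact rational arithmetic; the function names and the data are illustrative assumptions, not part of the text.

```python
from fractions import Fraction


def lss_estimates(y, n):
    """Expansion estimates (N/n) * (sample sum), one per random start r = 1..k.

    y[0] holds the value of unit 1, y[1] the value of unit 2, and so on.
    """
    N = len(y)
    k = N // n
    return [Fraction(N, n) * sum(y[r - 1 + j * k] for j in range(n))
            for r in range(1, k + 1)]


def lss_mean_and_variance(y, n):
    """Exact expectation and variance of the estimator over the k equally likely starts."""
    estimates = lss_estimates(y, n)
    k = len(estimates)
    total = sum(y)
    mean = sum(estimates) / k
    variance = sum((e - total) ** 2 for e in estimates) / k
    return mean, variance


y = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]   # hypothetical values, N = 12
mean, variance = lss_mean_and_variance(y, 3)
print(mean == sum(y), variance)
```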
Apart from operational convenience, linear systematic sampling has an
added advantage over simple random sampling: the simple expansion
estimator defined in the above theorem is more precise than the corresponding
estimator in simple random sampling for populations exhibiting linear trend.
That is, if the values $Y_1, Y_2, \ldots, Y_N$ of the units with labels 1, 2, ..., N are
modeled by $Y_i = \alpha + \beta i$, i = 1, 2, ..., N, then systematic sampling is more efficient
than simple random sampling when the simple expansion estimator is used for
estimating the population total. This is proved in the following theorem. Before
the theorem is stated, we shall give a frequently used identity meant for
populations possessing linear trend.
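Identity For populations modeled by $Y_i = \alpha + \beta i$, i = 1, 2, ..., N,
$$\hat{Y}_r - Y = N\beta\left[r - \frac{k+1}{2}\right] \qquad (3.3)$$
where $\hat{Y}_r$ is as defined in Theorem 3.1.
Proof: Note that when $Y_i = \alpha + \beta i$, i = 1, 2, ..., N, we have
$$\hat{Y}_r = \frac{N}{n}\sum_{j=1}^{n} Y_{r+(j-1)k} = \frac{N}{n}\sum_{j=1}^{n}\left\{\alpha + \beta[r+(j-1)k]\right\}$$
$$= \frac{N}{n}\left[n\alpha + \beta n r + \beta k\,\frac{n(n-1)}{2}\right] = N\left[\alpha + \beta r + \frac{\beta k(n-1)}{2}\right] \qquad (3.4)$$
and
$$Y = \sum_{i=1}^{N} Y_i = \sum_{i=1}^{N}[\alpha + \beta i] = N\alpha + \beta\,\frac{N(N+1)}{2} = N\left[\alpha + \frac{\beta(nk+1)}{2}\right] \qquad (3.5)$$
Using (3.4) and (3.5) we get
$$\hat{Y}_r - Y = N\left[\alpha + \beta r + \frac{\beta k(n-1)}{2} - \alpha - \frac{\beta(nk+1)}{2}\right] = N\beta\left[r + \frac{nk - k - nk - 1}{2}\right] = N\beta\left[r - \frac{k+1}{2}\right]$$
Hence the identity given in (3.3) holds good for all r, r = 1, 2, ..., k.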
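Theorem 3.2 For populations possessing linear trend, $V(\hat{Y}_{LSS}) < V(\hat{Y}_{srs})$ where
$\hat{Y}_{LSS}$ and $\hat{Y}_{srs}$ are the conventional expansion estimators under linear
systematic sampling and simple random sampling, respectively.
Proof We know that under simple random sampling
$$V(\hat{Y}_{srs}) = \frac{N^2(N-n)}{Nn}\,\frac{1}{N-1}\sum_{i=1}^{N}\left[Y_i - \bar{Y}\right]^2 \qquad (3.6)$$
Note that if $Y_i = \alpha + \beta i$, i = 1, 2, ..., N, then
$$\bar{Y} = \alpha + \beta\,\frac{N+1}{2} \qquad (3.7)$$
Therefore for i = 1, 2, ..., N
$$\left[Y_i - \bar{Y}\right]^2 = \left[\alpha + \beta i - \alpha - \beta\,\frac{N+1}{2}\right]^2 = \beta^2\left[i - \frac{N+1}{2}\right]^2 = \beta^2\left[i^2 + \frac{(N+1)^2}{4} - i(N+1)\right]$$
Hence
$$\sum_{i=1}^{N}\left[Y_i - \bar{Y}\right]^2 = \beta^2\left[\frac{N(N+1)(2N+1)}{6} + \frac{N(N+1)^2}{4} - \frac{N(N+1)^2}{2}\right] = \beta^2\left[\frac{N(N^2-1)}{12}\right]$$
Substituting this in (3.6) we get
$$V(\hat{Y}_{srs}) = \frac{N^2(N-n)}{Nn}\,\frac{1}{N-1}\,\beta^2\left[\frac{N(N+1)(N-1)}{12}\right] = \frac{N^2\beta^2(k-1)(nk+1)}{12} \quad (\text{using } N = nk) \qquad (3.8)$$
On using the identity given in (3.3), we get
$$\sum_{r=1}^{k}\left[\hat{Y}_r - Y\right]^2 = N^2\beta^2\sum_{r=1}^{k}\left[r - \frac{k+1}{2}\right]^2 = N^2\beta^2\sum_{r=1}^{k}\left[r^2 + \frac{(k+1)^2}{4} - r(k+1)\right]$$
$$= N^2\beta^2\left[\frac{k(k+1)(2k+1)}{6} + \frac{k(k+1)^2}{4} - \frac{k(k+1)^2}{2}\right] = \frac{N^2\beta^2 k(k^2-1)}{12}$$
Therefore
$$V(\hat{Y}_{LSS}) = \frac{N^2\beta^2(k^2-1)}{12} \qquad (3.9)$$
Thus using (3.8) and (3.9) we get
$$V(\hat{Y}_{srs}) - V(\hat{Y}_{LSS}) = \frac{N^2\beta^2(k-1)(nk+1-k-1)}{12} = \frac{N^2\beta^2 k(n-1)(k-1)}{12}$$
Since the right hand side of the above expression is positive for all values of n
greater than one, the result follows.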
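The two closed forms in this proof can be verified by brute force for any small n and k. The following sketch builds a population with an exact linear trend and computes both variances directly from their definitions; the function name and the parameter values are assumptions made for illustration.

```python
from fractions import Fraction


def variances_under_linear_trend(n, k, alpha, beta):
    """Return (V_srs, V_lss) of the expansion estimator when Y_i = alpha + beta * i."""
    N = n * k
    y = [Fraction(alpha) + Fraction(beta) * i for i in range(1, N + 1)]
    total = sum(y)
    mean = total / N
    # Simple random sampling without replacement, formula (3.6)
    s2 = sum((v - mean) ** 2 for v in y) / (N - 1)
    v_srs = Fraction(N * N * (N - n), N * n) * s2
    # Linear systematic sampling: average squared error over the k random starts
    estimates = [Fraction(N, n) * sum(y[r - 1 + j * k] for j in range(n))
                 for r in range(1, k + 1)]
    v_lss = sum((e - total) ** 2 for e in estimates) / k
    return v_srs, v_lss


v_srs, v_lss = variances_under_linear_trend(n=6, k=4, alpha=2, beta=3)
print(v_srs, v_lss, v_srs - v_lss)
```

For n = 6, k = 4 and beta = 3 the two values agree with (3.8) and (3.9), and their difference agrees with the final expression of the proof.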
Yates Corrected Estimator
In Theorem 3.2, it has been proved that linear systematic sampling is more
precise than simple random sampling in the presence of linear trend. Yates
(1948) suggested an estimator that coincides with the population mean under
linear systematic sampling for populations possessing linear trend. The details
are furnished below:
When the rth group $S_r$ is drawn as sample, the first and last units in the
sample are corrected by the weights $\lambda_1$ and $\lambda_2$ respectively (that is, instead of
using $Y_r$ and $Y_{r+(n-1)k}$ in the estimator, the corrected values $\lambda_1 Y_r$ and
$\lambda_2 Y_{r+(n-1)k}$ will be used) and the sample mean is taken as an estimator for the
population mean, where the weights $\lambda_1$ and $\lambda_2$ are selected so that the corrected
mean coincides with the population mean in the presence of linear trend. That is,
the corrected mean
$$\bar{y}_c = \frac{1}{n}\left[\lambda_1 Y_r + \sum_{j=2}^{n-1} Y_{r+(j-1)k} + \lambda_2 Y_{r+(n-1)k}\right]$$
is equated to the population mean $\bar{Y}$ after substituting $Y_i = \alpha + \beta i$, i = 1, 2, ..., N
to get
$$\frac{1}{n}\left[\lambda_1(\alpha + \beta r) + \sum_{j=2}^{n-1}\left\{\alpha + \beta[r+(j-1)k]\right\} + \lambda_2\left\{\alpha + \beta[r+(n-1)k]\right\}\right] = \alpha + \beta\,\frac{N+1}{2} \qquad (3.10)$$
Comparing the coefficients of $\alpha$ in (3.10) we get
$$\frac{1}{n}\left[\lambda_1 + \lambda_2 + n - 2\right] = 1$$
Therefore
$$\lambda_1 + \lambda_2 = 2 \qquad (3.11)$$
Again comparing the coefficients of $\beta$ in (3.10) we get
$$\frac{1}{n}\left[\lambda_1 r + \lambda_2[r+(n-1)k] + (n-2)r + \frac{(n-1)(n-2)}{2}\,k\right] = \frac{N+1}{2}$$
$$2\left[\lambda_1 r + \lambda_2[r+(n-1)k] + (n-2)r + \frac{(n-1)(n-2)}{2}\,k\right] = n(N+1)$$
$$2\lambda_1 r + 2(2-\lambda_1)[r+(n-1)k] + 2(n-2)r + (n-1)(n-2)k = n(N+1) \quad (\text{using (3.11)})$$
Solving for $\lambda_1$ we get
$$\lambda_1 = 1 + \frac{n(2r-k-1)}{2(n-1)k} \qquad (3.12)$$
Using (3.12) in (3.11), we find that
$$\lambda_2 = 1 - \frac{n(2r-k-1)}{2(n-1)k} \qquad (3.13)$$
When the above obtained values of $\lambda_1$ and $\lambda_2$ are used in the Yates corrected
estimator, we get
$$\bar{y}_c = \frac{1}{N}\left[\hat{Y}_r + \frac{N(2r-k-1)}{2(n-1)k}\left[Y_r - Y_{r+(n-1)k}\right]\right] \qquad (3.14)$$
Therefore the estimator
$$\hat{Y}_c = \hat{Y}_r + \frac{N(2r-k-1)}{2(n-1)k}\left[Y_r - Y_{r+(n-1)k}\right]$$
estimates the population total without any error in the presence of linear trend.
Since the estimator coincides with the parameter value, it has mean square error zero.
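A short numerical check of the end-correction is given below. It assumes the total-scale form of the corrected estimator, that is, the expansion estimate plus N(2r - k - 1) / (2(n - 1)k) times the difference between the first and last sampled values; the function name is illustrative.

```python
from fractions import Fraction


def yates_corrected_total(y, n, r):
    """Yates end-corrected estimate of the total for random start r (1-based).

    y[0] is the value of unit 1; requires n >= 2 and len(y) divisible by n.
    """
    N = len(y)
    k = N // n
    sample = [y[r - 1 + j * k] for j in range(n)]
    expansion = Fraction(N, n) * sum(sample)
    # Correction uses only the first and last sampled units
    weight = Fraction(N * (2 * r - k - 1), 2 * (n - 1) * k)
    return expansion + weight * (sample[0] - sample[-1])


# Population with exact linear trend Y_i = 5 + 2i, N = 24, n = 6, k = 4
y = [5 + 2 * i for i in range(1, 25)]
print([yates_corrected_total(y, 6, r) == sum(y) for r in range(1, 5)])
```

For a population that is not exactly linear the correction no longer removes the error completely, so the check applies only under the stated model.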
3.3 Schemes for Populations with Linear Trend
In the previous section, we have seen a method in which the corrected expansion
estimator coincides with the population total in the presence of linear trend.
However, instead of correcting the estimator, several authors have suggested
alternative sampling schemes that are well suited to populations with linear trend.
Three such schemes are presented in this section.
(i) Centered Systematic Sampling (Madow, 1953)
As in the case of linear systematic sampling, in centered systematic sampling
also the population units are divided into k groups $S_1, S_2, \ldots, S_k$ of n units
each, where $S_r = \{r, r+k, \ldots, r+(n-1)k\}$, r = 1, 2, ..., k.
If the sampling interval k is odd then the middlemost group, namely
$S_{(k+1)/2}$, is selected as sample with probability one. On the other hand, if k is
even, one of the two middlemost groups, namely $S_{k/2}$ or $S_{(k+2)/2}$, is randomly
selected as sample.
To estimate the population total, one can use the expansion estimator as in
the case of linear systematic sampling. If $\hat{Y}_{css}$ is the estimator of the
population total under centered systematic sampling, then
(i) when k is odd, $\hat{Y}_{css} = \hat{Y}_{(k+1)/2}$ with probability one
(ii) when k is even,
$$\hat{Y}_{css} = \begin{cases} \hat{Y}_{k/2} & \text{with probability } 1/2 \\ \hat{Y}_{(k+2)/2} & \text{with probability } 1/2 \end{cases}$$
It may be noted that in both the cases $\hat{Y}_{css}$ is not, in general, unbiased for Y.
However, for populations with linear trend, it has some desirable properties as
shown in the following theorem.
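Theorem 3.3 For populations satisfying $Y_i = \alpha + \beta i$, i = 1, 2, ..., N,
(i) when k is odd, $\hat{Y}_{css} = Y$ and $MSE(\hat{Y}_{css}) = 0$ and
(ii) when k is even, $E(\hat{Y}_{css}) = Y$ and $MSE(\hat{Y}_{css}) = \dfrac{N^2\beta^2}{4}$
Proof For populations with linear trend, we have seen in (3.3)
$$\hat{Y}_r - Y = N\beta\left[r - \frac{k+1}{2}\right], \quad r = 1, 2, \ldots, k$$
Therefore
$$\text{(i)}\quad \hat{Y}_{(k+1)/2} - Y = N\beta\left[\frac{k+1}{2} - \frac{k+1}{2}\right] = 0 \qquad (3.15)$$
$$\text{(ii)}\quad \hat{Y}_{k/2} - Y = N\beta\left[\frac{k}{2} - \frac{k+1}{2}\right] = -\frac{N\beta}{2} \qquad (3.16)$$
$$\text{and (iii)}\quad \hat{Y}_{(k+2)/2} - Y = N\beta\left[\frac{k+2}{2} - \frac{k+1}{2}\right] = \frac{N\beta}{2} \qquad (3.17)$$
Hence when k is odd
$$MSE(\hat{Y}_{css}) = \left[\hat{Y}_{(k+1)/2} - Y\right]^2 = 0 \quad (\text{by (3.15)})$$
and when k is even, by (3.16) and (3.17), the two possible errors cancel on
average, so that $E(\hat{Y}_{css}) = Y$, and
$$MSE(\hat{Y}_{css}) = \frac{1}{2}\left[\hat{Y}_{k/2} - Y\right]^2 + \frac{1}{2}\left[\hat{Y}_{(k+2)/2} - Y\right]^2 = \frac{1}{2}\left[\frac{N^2\beta^2}{4} + \frac{N^2\beta^2}{4}\right] = \frac{N^2\beta^2}{4}$$
Thus we have proved the theorem.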
The centered systematic sampling described above is devoid of randomisation.
Hence the results based on a centered systematic sample are likely to be
unreliable, particularly when the assumption regarding the presence of linear
trend is violated. Hence it is desirable to develop a sampling method free from
such a limitation. In the following part of this section, one such scheme developed
by Sethi (1962) is presented.
(ii) Balanced Systematic Sampling (Sethi, 1962)
Under Balanced Systematic Sampling (BSS), the population units are divided
into $\frac{n}{2}$ groups (assuming the sample size n is even) of 2k units each and a pair of
units equidistant from the end points is selected from each group. This is
achieved by using the following procedure:
A random number r is selected from 1 to k and the units with labels r and
2k - r + 1 are selected from the first group; thereafter, the corresponding pairs of
elements are selected from the remaining $\frac{n}{2} - 1$ groups. For example,
when N=24 and n=6, the population units are divided into $\frac{6}{2} = 3$ groups of
(2)(4) = 8 units each as follows:
 1  2  3  4  5  6  7  8
 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24
The four possible balanced systematic samples are listed below:
$s_1 = \{1, 9, 17, 8, 16, 24\}$, $s_2 = \{2, 10, 18, 7, 15, 23\}$,
$s_3 = \{3, 11, 19, 6, 14, 22\}$, $s_4 = \{4, 12, 20, 5, 13, 21\}$
Thus the balanced systematic sample of size n corresponding to the random
start r is given by the units with labels
$$\{r + 2jk,\ 2(j+1)k - r + 1\}, \quad j = 0, 1, 2, \ldots, \frac{n}{2} - 1$$
When the sample size n is odd, the balanced systematic sample of size n
corresponding to the random start r is given by the units with labels
$$\{r + 2jk,\ 2(j+1)k - r + 1\} \cup \{r + (n-1)k\}, \quad j = 0, 1, 2, \ldots, \frac{n-3}{2}$$
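The label rules for even and odd n can be written as one small routine. This is a sketch for illustration only; `balanced_systematic_sample` is a name chosen here, and N is assumed to equal nk.

```python
def balanced_systematic_sample(N, n, r):
    """Labels of the balanced systematic sample for random start r (1 <= r <= k)."""
    k = N // n
    labels = []
    # One pair, equidistant from the ends, from each block of 2k consecutive units
    for j in range(n // 2):
        labels.append(r + 2 * j * k)
        labels.append(2 * (j + 1) * k - r + 1)
    if n % 2 == 1:
        # With odd n a single unit is taken from the last k units
        labels.append(r + (n - 1) * k)
    return sorted(labels)


for r in range(1, 5):
    print(r, balanced_systematic_sample(24, 6, r))
```

With N = 24 and n = 6 the four printed samples are the sets s1 to s4 of the example, listed in increasing label order.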
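Theorem 3.4 Under balanced systematic sampling, the conventional expansion
estimator is unbiased for the population total.
Proof Case 1 "n even"
The expansion estimator $\hat{Y}_{BL}$ corresponding to the random start r can take any
one of the k values
$$\hat{Y}_{BL}^{(r)} = \frac{N}{n}\sum_{j=0}^{(n-2)/2}\left[Y_{r+2jk} + Y_{2(j+1)k-r+1}\right], \quad r = 1, 2, \ldots, k$$
with equal probabilities $\frac{1}{k}$.
Therefore
$$E(\hat{Y}_{BL}) = \frac{1}{k}\sum_{r=1}^{k}\hat{Y}_{BL}^{(r)} = \frac{1}{k}\sum_{r=1}^{k}\frac{N}{n}\sum_{j=0}^{(n-2)/2}\left\{Y_{r+2jk} + Y_{2(j+1)k-r+1}\right\} = \frac{N}{nk}\sum_{r=1}^{k}\sum_{j=0}^{(n-2)/2}\left\{Y_{r+2jk} + Y_{2(j+1)k-r+1}\right\}$$
$$= \sum_{i=1}^{N} Y_i \quad \left(\text{since } \bigcup_{r=1}^{k} s_r = S \text{ and } s_r \cap s_t = \emptyset \text{ for } r \neq t\right)$$
Hence $\hat{Y}_{BL}$ is unbiased for the population total Y.
Case 2 "n odd"
In this case, $\hat{Y}_{BL}$ can take any one of the k values
$$\hat{Y}_{BL}^{(r)} = \frac{N}{n}\left\{\sum_{j=0}^{(n-3)/2}\left[Y_{r+2jk} + Y_{2(j+1)k-r+1}\right] + Y_{r+(n-1)k}\right\}, \quad r = 1, 2, \ldots, k$$
with equal probabilities $\frac{1}{k}$.
Therefore
$$E(\hat{Y}_{BL}) = \frac{1}{k}\sum_{r=1}^{k}\hat{Y}_{BL}^{(r)} = \frac{1}{k}\sum_{r=1}^{k}\frac{N}{n}\left\{\sum_{j=0}^{(n-3)/2}\left[Y_{r+2jk} + Y_{2(j+1)k-r+1}\right] + Y_{r+(n-1)k}\right\}$$
$$= Y \quad \left(\text{since } \bigcup_{r=1}^{k} s_r = S \text{ and } s_r \cap s_t = \emptyset \text{ for } r \neq t\right)$$
Hence in this case also, $\hat{Y}_{BL}$ is unbiased for the population total.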
Thus from the above theorem we infer that the conventional estimator is
unbiased for the population total in balanced systematic sampling. It may also be
noted that the variance of the estimator is
$$V(\hat{Y}_{BL}) = \frac{1}{k}\sum_{r=1}^{k}\left[\hat{Y}_{BL}^{(r)} - Y\right]^2$$
where $\hat{Y}_{BL}^{(r)}$ is as defined in the previous theorem.
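Theorem 3.5 When $Y_i = \alpha + \beta i$, i = 1, 2, ..., N,
$$V(\hat{Y}_{BL}) = \begin{cases} 0 & \text{when } n \text{ is even} \\[4pt] \dfrac{\beta^2 k^2 (k^2-1)}{12} & \text{when } n \text{ is odd} \end{cases}$$
Proof For r = 1, 2, ..., k, when n is even
$$\hat{Y}_{BL}^{(r)} = \frac{N}{n}\sum_{j=0}^{(n-2)/2}\left[Y_{r+2jk} + Y_{2(j+1)k-r+1}\right] = \frac{N}{n}\sum_{j=0}^{(n-2)/2}\left\{\alpha + \beta(r+2jk) + \alpha + \beta[2(j+1)k - r + 1]\right\}$$
$$= \frac{N}{n}\sum_{j=0}^{(n-2)/2}\left\{2\alpha + \beta(4jk + 2k + 1)\right\} = \frac{N}{n}\left\{\frac{n}{2}\left\{2\alpha + \beta(2k+1)\right\} + 4\beta k\,\frac{n(n-2)}{8}\right\}$$
$$= N\left[\alpha + \beta\,\frac{nk+1}{2}\right] = Y$$
Thus we have $\hat{Y}_{BL}^{(r)} = Y$ for all r = 1, 2, ..., k.
Therefore $V(\hat{Y}_{BL}) = 0$.
For r = 1, 2, ..., k, when n is odd
$$\hat{Y}_{BL}^{(r)} = \frac{N}{n}\left\{\sum_{j=0}^{(n-3)/2}\left[Y_{r+2jk} + Y_{2(j+1)k-r+1}\right] + Y_{r+(n-1)k}\right\}$$
$$= \frac{N}{n}\left\{\sum_{j=0}^{(n-3)/2}\left[2\alpha + \beta(4jk + 2k + 1)\right] + \left\{\alpha + \beta[r+(n-1)k]\right\}\right\}$$
$$= \frac{N}{n}\left[n\alpha + \beta\left\{\frac{(n-1)(2k+1)}{2} + \frac{(n-1)(n-3)k}{2} + r + (n-1)k\right\}\right]$$
$$= \frac{N}{n}\left[n\alpha + \beta\left\{\frac{(n-1)(nk+k+1)}{2} + r\right\}\right] \qquad (3.18)$$
Further we know that for populations having linear trend
$$Y = N\alpha + \beta\,\frac{(nk)(nk+1)}{2} \qquad (3.19)$$
Using (3.18) and (3.19) we get, for r = 1, 2, ..., k
$$\hat{Y}_{BL}^{(r)} - Y = \frac{\beta N}{n}\left[r - \frac{k+1}{2}\right]$$
Squaring both sides, summing with respect to r and dividing by k we get
$$\frac{1}{k}\sum_{r=1}^{k}\left[\hat{Y}_{BL}^{(r)} - Y\right]^2 = \frac{\beta^2 N^2}{n^2 k}\sum_{r=1}^{k}\left[r - \frac{k+1}{2}\right]^2 = \frac{\beta^2 N^2}{n^2 k}\left[\sum_{r=1}^{k} r^2 + \frac{k(k+1)^2}{4} - 2\,\frac{k+1}{2}\,\frac{k(k+1)}{2}\right]$$
$$= \frac{\beta^2 N^2}{n^2 k}\left[\frac{k(k+1)(2k+1)}{6} - \frac{k(k+1)^2}{4}\right] = \frac{\beta^2 (k^2-1)k^2}{12}$$
Hence the theorem.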
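Both cases of Theorem 3.5 can be confirmed by enumerating the k balanced systematic samples for a population with exact linear trend. The sketch below is self-contained; the function name and the parameter values are illustrative assumptions.

```python
from fractions import Fraction


def bss_variance_linear_trend(n, k, alpha, beta):
    """Exact variance of the expansion estimator under BSS when Y_i = alpha + beta * i."""
    N = n * k
    y = {i: Fraction(alpha) + Fraction(beta) * i for i in range(1, N + 1)}
    total = sum(y.values())
    squared_errors = []
    for r in range(1, k + 1):
        labels = []
        for j in range(n // 2):
            labels += [r + 2 * j * k, 2 * (j + 1) * k - r + 1]
        if n % 2 == 1:
            labels.append(r + (n - 1) * k)
        estimate = Fraction(N, n) * sum(y[i] for i in labels)
        squared_errors.append((estimate - total) ** 2)
    # Each random start has probability 1/k
    return sum(squared_errors) / k


print(bss_variance_linear_trend(6, 4, 2, 3))   # n even
print(bss_variance_linear_trend(5, 4, 2, 3))   # n odd
```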
(iii) Modified Systematic Sampling (Singh, Jindal and Garg, 1965)
Modified systematic sampling is another scheme meant for populations
exhibiting linear trend. A sample of size n is drawn by selecting pairs of units
equidistant from both the ends of the population in a systematic manner. The
details are furnished below.
As in the case of linear and balanced systematic sampling, here also a
random number r is selected from 1 to k. When the sample size n is even, the
sample corresponding to the random start r (r = 1, 2, ..., k) is given by the set
of units with labels
$$s_r = \{r + jk,\ N - r - jk + 1\}, \quad j = 0, 1, \ldots, \frac{n}{2} - 1$$
When the sample size n is odd, the sample corresponding to the random start r
(r = 1, 2, ..., k) is given by the set of units with labels
$$s_r = \{r + jk,\ N - r - jk + 1\} \cup \left\{r + \frac{(n-1)k}{2}\right\}, \quad j = 0, 1, \ldots, \frac{n-3}{2}$$
For example, when N=16, n=4 and k=4 the four possible modified systematic
samples are
$s_1 = \{1, 5, 12, 16\}$, $s_2 = \{2, 6, 11, 15\}$, $s_3 = \{3, 7, 10, 14\}$, $s_4 = \{4, 8, 9, 13\}$
It is interesting to note that the theorems proved above for balanced
systematic sampling are true even under modified systematic sampling.
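The selection rule for modified systematic sampling can be sketched as follows, again assuming N = nk; the function name `modified_systematic_sample` is chosen here for illustration.

```python
def modified_systematic_sample(N, n, r):
    """Labels of the modified systematic sample for random start r (1 <= r <= k)."""
    k = N // n
    labels = []
    # Pairs of units equidistant from the two ends of the population
    for j in range(n // 2):
        labels.append(r + j * k)
        labels.append(N - r - j * k + 1)
    if n % 2 == 1:
        # (n - 1) is even, so (n - 1) * k / 2 is an integer
        labels.append(r + (n - 1) * k // 2)
    return sorted(labels)


for r in range(1, 5):
    print(r, modified_systematic_sample(16, 4, r))
```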
Theorem 3.6 Under modified systematic sampling, the conventional expansion
estimator is unbiased for the population total.
Proof of this theorem is left as an exercise.
Theorem 3.7 Under modified systematic sampling,
$$V(\hat{Y}_{MOD}) = \begin{cases} 0 & \text{if } n \text{ is even} \\[4pt] \dfrac{\beta^2 k^2 (k^2-1)}{12} & \text{if } n \text{ is odd} \end{cases}$$
when $Y_i = \alpha + \beta i$, i = 1, 2, ..., N, where $\hat{Y}_{MOD}$ is the conventional estimator
under modified systematic sampling.
Proof of this theorem is also left as an exercise.
3.4 Autocorrelated Populations
Generally, it is reasonable to expect units which are nearer to each other to
possess similar characteristics. This property can be represented using a
statistical model assuming that the observations $Y_i$ and $Y_j$ are correlated, the
correlation being a function of the distance between the labels of the units
which decreases as the distance increases. Such populations are called
autocorrelated populations and the graph of the correlation coefficient $\rho_u$
between observations separated by u units, as a function of u, is called the
"correlogram".
Because of the finite nature of the population the correlogram will not be
smooth and it will be difficult to study the relative performances of various
sampling schemes for a single finite population. But the comparison is easy on the
average over a series of finite populations drawn from an infinite super population
to which the model applies. A model which is suitable for populations possessing
autocorrelation is described below.
Model The population values are assumed to be the realized values of N
random variables satisfying the following conditions:
$E_M[Y_i] = \mu$, $E_M[Y_i - \mu]^2 = \sigma^2$
and $E_M[Y_i - \mu][Y_{i+u} - \mu] = \rho_u\sigma^2$ where $\rho_u \geq \rho_v$ whenever u < v.
The subscript M is used to denote the fact that the expectations are with respect
to the superpopulation model. The following theorem gives the average
variance of the expansion estimator under simple random sampling and linear
systematic sampling.
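Theorem 3.8 Under the super population model described above
$$E_M V(\hat{Y}_{srs}) = \frac{\sigma^2(k-1)N^2}{nk}\left[1 - \frac{2}{N(N-1)}\sum_{u=1}^{N-1}(N-u)\rho_u\right]$$
$$E_M V(\hat{Y}_{LSS}) = \frac{\sigma^2(k-1)N^2}{nk}\left\{1 - \frac{2}{nk(k-1)}\sum_{u=1}^{nk-1}(nk-u)\rho_u + \frac{2k}{n(k-1)}\sum_{u=1}^{n-1}(n-u)\rho_{ku}\right\}$$
Proof We know that
$$V(\hat{Y}_{srs}) = \frac{N^2(N-n)}{Nn}\,\frac{1}{N-1}\sum_{i=1}^{N}\left[Y_i - \bar{Y}\right]^2 \qquad (3.20)$$
Note that
$$\sum_{i=1}^{N}[Y_i - \mu]^2 = \sum_{i=1}^{N}[Y_i - \bar{Y} + \bar{Y} - \mu]^2 = \sum_{i=1}^{N}[Y_i - \bar{Y}]^2 + N[\bar{Y} - \mu]^2$$
Therefore
$$\sum_{i=1}^{N}[Y_i - \bar{Y}]^2 = \sum_{i=1}^{N}[Y_i - \mu]^2 - N[\bar{Y} - \mu]^2 = \sum_{i=1}^{N}[Y_i - \mu]^2 - \frac{1}{N}\left[\sum_{i=1}^{N}(Y_i - \mu)\right]^2$$
$$= \sum_{i=1}^{N}[Y_i - \mu]^2 - \frac{1}{N}\left[\sum_{i=1}^{N}[Y_i - \mu]^2 + 2\sum_{i<j}[Y_i - \mu][Y_j - \mu]\right]$$
$$= \frac{N-1}{N}\sum_{i=1}^{N}[Y_i - \mu]^2 - \frac{2}{N}\sum_{u=1}^{N-1}\sum_{i=1}^{N-u}[Y_i - \mu][Y_{i+u} - \mu]$$
Taking expectations on both the sides with respect to the model we obtain
$$E_M\sum_{i=1}^{N}[Y_i - \bar{Y}]^2 = \sigma^2(N-1)\left[1 - \frac{2}{N(N-1)}\sum_{u=1}^{N-1}(N-u)\rho_u\right] \qquad (3.21)$$
Using (3.20) and (3.21) we get
$$E_M V(\hat{Y}_{srs}) = \frac{N^2(N-n)\sigma^2(N-1)}{Nn(N-1)}\left[1 - \frac{2}{N(N-1)}\sum_{u=1}^{N-1}(N-u)\rho_u\right] = \frac{\sigma^2(k-1)N^2}{nk}\left[1 - \frac{2}{N(N-1)}\sum_{u=1}^{N-1}(N-u)\rho_u\right]$$
Thus we have obtained the average variance of the conventional estimator
under simple random sampling with respect to the autocorrelated model
described earlier.
We know that
$$V(\hat{Y}_{LSS}) = \frac{1}{k}\sum_{r=1}^{k}\left[\hat{Y}_r - Y\right]^2 = \frac{N^2}{k}\sum_{r=1}^{k}\left[\bar{y}_r - \bar{Y}\right]^2 \qquad (3.22)$$
where $\bar{y}_r = \frac{1}{n}\sum_{j=1}^{n} Y_{r+(j-1)k}$ and $\bar{Y} = \frac{1}{k}\sum_{r=1}^{k}\bar{y}_r$.
Note that
$$\sum_{i=1}^{N}[Y_i - \bar{Y}]^2 = \sum_{r=1}^{k}\sum_{j=1}^{n}[Y_{r+(j-1)k} - \bar{Y}]^2 = \sum_{r=1}^{k}\sum_{j=1}^{n}[Y_{r+(j-1)k} - \bar{y}_r + \bar{y}_r - \bar{Y}]^2$$
$$= \sum_{r=1}^{k}\sum_{j=1}^{n}[Y_{r+(j-1)k} - \bar{y}_r]^2 + n\sum_{r=1}^{k}[\bar{y}_r - \bar{Y}]^2$$
Therefore
$$\sum_{r=1}^{k}[\bar{y}_r - \bar{Y}]^2 = \frac{1}{n}\sum_{i=1}^{N}[Y_i - \bar{Y}]^2 - \frac{1}{n}\sum_{r=1}^{k}\sum_{j=1}^{n}[Y_{r+(j-1)k} - \bar{y}_r]^2 \qquad (3.23)$$
Applying the argument that led to (3.21) to the n units of the rth group, which are
separated by multiples of k, we get
$$E_M\sum_{j=1}^{n}[Y_{r+(j-1)k} - \bar{y}_r]^2 = \sigma^2(n-1)\left[1 - \frac{2}{n(n-1)}\sum_{u=1}^{n-1}(n-u)\rho_{ku}\right] \qquad (3.24)$$
Using (3.21) and (3.24) in (3.23) we obtain
$$E_M\sum_{r=1}^{k}[\bar{y}_r - \bar{Y}]^2 = \frac{\sigma^2(N-1)}{n}\left\{1 - \frac{2}{N(N-1)}\sum_{u=1}^{N-1}(N-u)\rho_u\right\} - \frac{(n-1)k\sigma^2}{n}\left\{1 - \frac{2}{n(n-1)}\sum_{u=1}^{n-1}(n-u)\rho_{ku}\right\}$$
$$= \frac{\sigma^2}{n}\left\{(k-1) - \frac{2}{nk}\sum_{u=1}^{N-1}(N-u)\rho_u + \frac{2k}{n}\sum_{u=1}^{n-1}(n-u)\rho_{ku}\right\}$$
$$= \frac{\sigma^2(k-1)}{n}\left\{1 - \frac{2}{nk(k-1)}\sum_{u=1}^{nk-1}(nk-u)\rho_u + \frac{2k}{n(k-1)}\sum_{u=1}^{n-1}(n-u)\rho_{ku}\right\}$$
Substituting this in (3.22) we get the average variance of the conventional
estimator in linear systematic sampling.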
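The average variance under linear systematic sampling can be cross-checked against a direct computation from the model covariances. In the sketch below the error of the estimator for random start r is written as a weighted sum of the Y values (weight N/n - 1 for sampled units and -1 otherwise), and its model expectation is evaluated from the covariances sigma^2 * rho_u. The function names and the choice rho_u = (1/2)^u are assumptions made for illustration.

```python
from fractions import Fraction


def expected_var_srs_formula(n, k, sigma2, rho):
    N = n * k
    a = sum((N - u) * rho(u) for u in range(1, N))
    return sigma2 * (k - 1) * N * N / Fraction(n * k) * (1 - Fraction(2, N * (N - 1)) * a)


def expected_var_lss_formula(n, k, sigma2, rho):
    """Closed form of Theorem 3.8 for LSS (requires k >= 2)."""
    N = n * k
    a = sum((N - u) * rho(u) for u in range(1, N))
    b = sum((n - u) * rho(k * u) for u in range(1, n))
    inner = 1 - Fraction(2, N * (k - 1)) * a + Fraction(2 * k, n * (k - 1)) * b
    return sigma2 * (k - 1) * N * N / Fraction(n * k) * inner


def expected_var_lss_direct(n, k, sigma2, rho):
    """Average over random starts of E_M[(estimate - total)^2], from the covariances."""
    N = n * k

    def cov(i, j):
        return sigma2 * (1 if i == j else rho(abs(i - j)))

    result = Fraction(0)
    for r in range(1, k + 1):
        sample = {r + j * k for j in range(n)}
        w = [Fraction(N, n) - 1 if i in sample else Fraction(-1)
             for i in range(1, N + 1)]
        result += sum(w[i] * w[j] * cov(i + 1, j + 1)
                      for i in range(N) for j in range(N))
    return result / k


rho = lambda u: Fraction(1, 2) ** u
print(expected_var_lss_formula(3, 4, 1, rho), expected_var_lss_direct(3, 4, 1, rho))
```

When all correlations are zero the two closed forms coincide, which is a quick sanity check on both expressions.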
A comparison between these two average variances is given in Chapter 5.
3.5 Estimation of Variance
In linear systematic sampling the second order inclusion probabilities are not
positive for all pairs of units in the population. This makes unbiased estimation
of the variance of the estimator impossible. In the absence of a proper estimate
for the variance, several ad hoc procedures are followed to estimate the
variance of the conventional estimator.
One of the methods is to treat the systematic sample as a simple random
sample of size n units and estimate the variance by
$$\frac{N^2(N-n)}{Nn}\,\frac{1}{n-1}\sum_{i=1}^{n}\left[y_i - \bar{y}\right]^2$$
where $y_i$ is the y-value of the ith unit in the sample and $\bar{y}$ is the sample mean.
It may be noted that the above estimator is not unbiased for the variance of the
conventional estimator under linear systematic sampling.
The second approach is to treat systematic sampling as a process of grouping the
N population units into $\frac{n}{2}$ groups of 2k units each and selecting two units from
each group in a systematic manner. In this case the population total can be
estimated by
$$\frac{2N}{n}\sum_{i=1}^{n/2}\left[\frac{y_{2i} + y_{2i-1}}{2}\right] \qquad (3.22)$$
Assuming that the two units have been selected with simple random sampling
without replacement from the 2k units in the ith group, an unbiased estimator of
the variance of the ith term in the brackets on the right hand side of (3.22) will
be given by
$$\frac{k-1}{k}\left\{\frac{y_{2i} - y_{2i-1}}{2}\right\}^2$$
Hence an unbiased estimator of the variance of the estimator given in (3.22) is
$$\frac{N^2(N-n)}{Nn}\,\frac{1}{n}\sum_{i=1}^{n/2}\left\{y_{2i} - y_{2i-1}\right\}^2$$
An alternative variance estimator based on the same principles as those
considered above, which takes into account successive differences of sample
values, is given by
$$\frac{N^2(N-n)}{Nn}\sum_{i=1}^{n-1}\frac{\left\{y_{i+1} - y_i\right\}^2}{2(n-1)}$$
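The three ad hoc variance estimators can be sketched as plain functions of the ordered sample values. The forms used here are the standard ones (simple random sampling analogue, paired differences, successive differences), stated for the estimator of the total; the function names and the sample data are illustrative assumptions.

```python
def var_est_as_srs(y, N):
    """Treat the systematic sample y as a simple random sample of size n."""
    n = len(y)
    ybar = sum(y) / n
    s2 = sum((v - ybar) ** 2 for v in y) / (n - 1)
    return N * N * (N - n) / (N * n) * s2


def var_est_paired(y, N):
    """Paired differences: consecutive non-overlapping pairs (n must be even)."""
    n = len(y)
    pair_sum = sum((y[2 * i + 1] - y[2 * i]) ** 2 for i in range(n // 2))
    return N * N * (N - n) / (N * n) * pair_sum / n


def var_est_successive(y, N):
    """Successive differences of neighbouring sample values."""
    n = len(y)
    diff_sum = sum((y[i + 1] - y[i]) ** 2 for i in range(n - 1))
    return N * N * (N - n) / (N * n) * diff_sum / (2 * (n - 1))


sample = [2, 4, 6, 8]   # hypothetical systematic sample, in label order, from N = 16
print(var_est_as_srs(sample, 16), var_est_paired(sample, 16), var_est_successive(sample, 16))
```

For a sample with a steady trend, as here, the first estimator is much larger than the other two, because it charges the trend to sampling error.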
Singh and Singh (1977) proposed a new type of systematic sampling which
facilitates the estimation of variance under certain conditions. The scheme
suggested by them is described below.
(i) Select a random number r from 1 to N.
(ii) Starting with r, select u continuous units and the remaining n - u = v units
with interval d, where u (less than or equal to n) and d are predetermined.
They have proved that if u + vd is less than or equal to N then the above
sampling scheme will yield distinct units and the second order inclusion
probabilities are positive if (a) d is less than or equal to u and (b) u + vd is greater
than or equal to (N/2) + 1. When these two conditions are satisfied it is possible
to estimate the variance of the conventional estimator. They have observed that
in situations where usual systematic sampling performs better than simple
random sampling the suggested procedure also leads to similar results and for
some situations, it provides better results than even linear systematic sampling.
3.6 Circular Systematic Sampling
It was pointed out at the beginning of this chapter that the population size is
assumed to be a multiple of the sample size. However, in practice this requirement
will not always be satisfied. Some survey practitioners try to take the sampling
interval k as the integer nearest to N/n. When this is followed, sometimes we may not
get a sample of the desired size. For example, when N=20, n=3 and k=7, the
random start 7 yields units with labels 7 and 14 as sample whereas the desired
sample size is 3. In some cases, some units will never appear in the sample,
thereby making unbiased estimation of the population total (mean) impossible. For
example, when N=30, n=7 and k=4, the units with labels 29 and 30 will never
appear as sampled units. These problems can be overcome by adopting a
method known as Circular Systematic Sampling (CSS), due to Lahiri (1952). This
method consists in choosing a random start from 1 to N and selecting the unit
corresponding to the random start and thereafter every kth unit in a cyclical
manner till a sample of size n units is obtained, k being the integer nearest to
N/n. That is, if r is the number selected at random from 1 to N, the sample
consists of the units corresponding to the numbers r + jk if r + jk ≤ N and
r + jk - N if r + jk > N, for j = 0, 1, ..., (n - 1). It is to be noted that, if the
sampling interval is taken as the integer closest to N/n, it is not always possible
to get a sample of the given size, as shown in the following example. Let N=15,
n=6 and k=3. The sample corresponding to the random start 3 has only five
distinct elements, namely 3, 6, 9, 12, 15. Motivated by this, Kunte (1978) suggested
the use of the largest integer smaller than or equal to N/n to avoid the above
mentioned difficulty. The following theorem due to Kunte (1978) gives a
necessary and sufficient condition under which one can always obtain samples
having n distinct elements for any n less than or equal to N.
Theorem 3.9 A necessary and sufficient condition for all elements of the sample
s(r, n) to be distinct for all r ≤ N and n ≤ N is that N and k are
coprime, where $s(r, n) = \{i_1, i_2, \ldots, i_n\}$. Here $i_j = [(j-1)k + r] \bmod N$, with
the convention that 0 is identified with N.
Proof Sufficiency
Suppose N and k are coprime and there exist r and n such that two
elements of s(r, n) are equal.
Without loss of generality assume that $i_1 = i_{j+1} = [jk + r] \bmod N$,
where j < n ≤ N. Then N divides jk with 0 < j < N, which contradicts the fact
that k and N are coprime.
Necessity
Suppose for all r ≤ N and n ≤ N all elements of s(r, n) are distinct and N and
k are not coprime. Let gcd(k, N) = a, with k = b·a and N = c·a, where b and c are
both smaller than N. For any r let us take n ≥ c + 1. Then
$$i_{c+1} = [ck + r] \bmod N = [cb\cdot a + r] \bmod N = [b\cdot N + r] \bmod N = r = i_1$$
This again contradicts our assumption that all elements of s(r, n) are distinct.
Hence the theorem.
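The selection rule of circular systematic sampling and the coprimality condition of Theorem 3.9 can be illustrated with a short sketch. The function names are chosen here for illustration; the brute-force check is only practical for small N.

```python
from math import gcd


def circular_systematic_sample(N, n, k, r):
    """Labels i_j = [(j - 1)k + r] mod N for j = 1..n, with 0 identified with N."""
    return [((r + j * k - 1) % N) + 1 for j in range(n)]


def all_samples_distinct(N, k):
    """True if every s(r, n), 1 <= r <= N and 1 <= n <= N, has n distinct labels."""
    return all(len(set(circular_systematic_sample(N, n, k, r))) == n
               for r in range(1, N + 1) for n in range(1, N + 1))


print(circular_systematic_sample(20, 3, 7, 7))    # wraps around past N = 20
print(circular_systematic_sample(15, 6, 3, 3))    # repeats a label since gcd(15, 3) = 3
print(all_samples_distinct(15, 2), gcd(15, 2))
```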
Under circular systematic sampling, the conventional expansion estimator is
unbiased for the population total (whenever N and k are coprime) and
its variance is given by
$$\frac{1}{N}\sum_{i=1}^{N}\left[\hat{Y}_{ci} - Y\right]^2 \quad \text{where} \quad \hat{Y}_{ci} = \frac{N}{n}\sum_{j=1}^{n} y_j ,$$
$y_j$ being the y-value of the jth unit in the circular systematic sample corresponding to the
random start i.
3.7 Systematic Sampling in Two Dimensions
Linear systematic sampling can be extended to two dimensional populations
in a straightforward manner. Here it is assumed that the nmkl population units
are arranged in the form of ml rows, each containing nk units, and it is planned
to select a systematic sample of mn units. The following procedure is adopted to
draw a sample of size mn.
Two random numbers r and s are independently chosen from 1 to l and 1 to
k respectively. Then the sample of size nm is obtained by using the units with
coordinates (r + (i-1)l, s + (j-1)k), i = 1, 2, ..., m and j = 1, 2, ..., n. For example,
when m=3, l=3, n=3 and k=4, the units corresponding to the random starts 2 and
3 are those placed against the coordinates (2,3), (5,3), (8,3), (2,7), (5,7), (8,7),
(2,11), (5,11) and (8,11). Refer to Diagram 3.1. A systematic sample selected in
this manner is called an aligned sample.
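The coordinates of an aligned sample follow directly from the two random starts. The following sketch reproduces the example above; the function name `aligned_sample` is an illustrative choice.

```python
def aligned_sample(m, l, n, k, r, s):
    """Coordinates (row, column) of the aligned sample for random starts r and s.

    Rows run from 1 to m*l and columns from 1 to n*k; 1 <= r <= l and 1 <= s <= k.
    """
    return [(r + (i - 1) * l, s + (j - 1) * k)
            for j in range(1, n + 1)      # column offsets
            for i in range(1, m + 1)]     # row offsets


print(aligned_sample(m=3, l=3, n=3, k=4, r=2, s=3))
```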
Theorem 3.10 An unbiased estimator for the population total corresponding to
the random starts r and s is given by
$$\hat{Y}_{TD} = \frac{NM}{nm}\sum_{i=1}^{m}\sum_{j=1}^{n} Y_{r+(i-1)l,\,s+(j-1)k}$$
where NM = nk · ml is the population size.
Proof Note that the estimator $\hat{Y}_{TD}$ can take any one of the lk values
$$\frac{NM}{nm}\sum_{i=1}^{m}\sum_{j=1}^{n} Y_{r+(i-1)l,\,s+(j-1)k}, \quad r = 1, 2, \ldots, l,\ s = 1, 2, \ldots, k$$
with equal probabilities $\frac{1}{kl}$.
Therefore
$$E(\hat{Y}_{TD}) = \sum_{r=1}^{l}\sum_{s=1}^{k}\frac{1}{lk}\,\frac{NM}{nm}\sum_{i=1}^{m}\sum_{j=1}^{n} Y_{r+(i-1)l,\,s+(j-1)k} = Y$$
Hence the proof.
Remark It may be noted that the variance of the above estimator is
$$\frac{1}{kl}\sum_{r=1}^{l}\sum_{s=1}^{k}\left[\hat{Y}_{rs} - Y\right]^2$$
where $\hat{Y}_{rs}$ is the expansion estimator defined in the above
theorem corresponding to the random starts r and s.
Theorem 3.11 An aligned sample of size $n^2$ drawn from a population consisting
of $n^2k^2$ units has the same precision as a simple random sample of size $n^2$
when the population values are represented by the relation
$$Y_{ij} = i + j, \quad i = 1, 2, \ldots, nk;\ j = 1, 2, \ldots, nk$$
if the expansion estimator is used for estimating the population total.
Proof The variance of the expansion estimator based on a simple random sample of
size $n^2$ drawn from a population containing $n^2k^2$ units is given by
$$V(\hat{Y}_{srs}) = \frac{n^4k^4(n^2k^2 - n^2)}{n^2k^2\,n^2}\,\frac{1}{n^2k^2 - 1}\sum_{i=1}^{nk}\sum_{j=1}^{nk}\left[Y_{ij} - \bar{Y}\right]^2 \qquad (3.23)$$
Note that
$$\bar{Y} = \frac{1}{n^2k^2}\sum_{i=1}^{nk}\sum_{j=1}^{nk} Y_{ij} = \frac{1}{n^2k^2}\sum_{i=1}^{nk}\sum_{j=1}^{nk}[i + j] = \frac{1}{n^2k^2}\left[\left\{\frac{(nk)(nk)(nk+1)}{2}\right\} + \left\{\frac{(nk)(nk)(nk+1)}{2}\right\}\right] = nk + 1$$
Therefore $[Y_{ij} - \bar{Y}]^2 = [i + j - (nk+1)]^2$.
Summing both sides with respect to i and j from 1 to nk we get
$$\sum_{i=1}^{nk}\sum_{j=1}^{nk}\left[Y_{ij} - \bar{Y}\right]^2 = \sum_{i=1}^{nk}\sum_{j=1}^{nk}\left[i^2 + j^2 + 2ij\right] - 2(nk+1)\sum_{i=1}^{nk}\sum_{j=1}^{nk}(i + j) + n^2k^2(nk+1)^2$$
$$= 2\left\{\frac{(nk)(nk)(nk+1)(2nk+1)}{6}\right\} + 2\left\{\frac{nk(nk+1)}{2}\right\}^2 - 2n^2k^2(nk+1)^2 + n^2k^2(nk+1)^2$$
$$= \frac{n^2k^2(n^2k^2 - 1)}{6} \qquad (3.24)$$
Substituting (3.24) in (3.23) we get
$$V(\hat{Y}_{srs}) = \frac{n^2k^2(n^2k^2 - n^2)}{n^2}\,\frac{1}{n^2k^2 - 1}\,\frac{n^2k^2(n^2k^2 - 1)}{6} = \frac{n^4k^4(k^2 - 1)}{6}$$
Since $\hat{Y}_{TD}$ is unbiased for the population total, we have
$$V(\hat{Y}_{TD}) = E(\hat{Y}_{TD} - Y)^2 = E(\hat{Y}_{TD}^2) - Y^2 = E(\hat{Y}_{TD}^2) - n^4k^4(nk+1)^2$$
Corresponding to the random starts r and s we have
$$\hat{Y}_{TD} = \frac{n^2k^2}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} Y_{r+(i-1)k,\,s+(j-1)k}$$
n n
=
n i=l p,
(4.3)
Note that the quantity $\left[\frac{y_i}{p_i} - Y\right]^2$ can take any one of the N values
$\left[\frac{Y_j}{P_j} - Y\right]^2$ with respective probabilities $P_j$, j = 1, 2, ..., N.
Therefore
$$E\left[\frac{y_i}{p_i} - Y\right]^2 = \sum_{j=1}^{N}\left[\frac{Y_j}{P_j} - Y\right]^2 P_j \qquad (4.4)$$
Using (4.4) in (4.3) we get
$$V[\hat{Y}_{pps}] = \frac{1}{n}\sum_{i=1}^{N}\left\{\frac{Y_i}{P_i} - Y\right\}^2 P_i$$
Hence the proof.
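The following theorem gives an unbiased estimator of $V[\hat{Y}_{pps}]$.
Theorem 4.3 An unbiased estimator of $V[\hat{Y}_{pps}]$ is
$$v[\hat{Y}_{pps}] = \frac{1}{n(n-1)}\sum_{i=1}^{n}\left\{\frac{y_i}{p_i} - \hat{Y}_{pps}\right\}^2$$
Proof By Theorem 4.2,
$$V[\hat{Y}_{pps}] = \frac{1}{n}\sum_{i=1}^{N}\left\{\frac{Y_i}{P_i} - Y\right\}^2 P_i = \frac{1}{n}\left[\sum_{i=1}^{N}\frac{Y_i^2}{P_i} - Y^2\right] \qquad (4.5)$$
If $v[\hat{Y}_{pps}]$ is an unbiased estimator of $V[\hat{Y}_{pps}]$ then
$$E\left\{v[\hat{Y}_{pps}]\right\} = V[\hat{Y}_{pps}] \qquad (4.6)$$
Since $V[\hat{Y}_{pps}] = E[\hat{Y}_{pps}^2] - Y^2$, we have
$$E\left\{v[\hat{Y}_{pps}]\right\} = E[\hat{Y}_{pps}^2] - Y^2 \quad (\text{using (4.6)})$$
Hence
$$Y^2 = E\left[\hat{Y}_{pps}^2 - v(\hat{Y}_{pps})\right] \qquad (4.7)$$
Note that
$$E\left\{\frac{1}{n}\sum_{i=1}^{n}\frac{y_i^2}{p_i^2}\right\} = \sum_{i=1}^{N}\frac{Y_i^2}{P_i} \qquad (4.8)$$
Using (4.6), (4.7) and (4.8) in (4.5) we get
$$E\left\{v[\hat{Y}_{pps}]\right\} = \frac{1}{n}\left\{E\left[\frac{1}{n}\sum_{i=1}^{n}\frac{y_i^2}{p_i^2}\right] - E\left[\hat{Y}_{pps}^2 - v(\hat{Y}_{pps})\right]\right\}$$
Therefore
$$\frac{n-1}{n}\,E\left\{v[\hat{Y}_{pps}]\right\} = E\left\{\frac{1}{n}\left[\frac{1}{n}\sum_{i=1}^{n}\frac{y_i^2}{p_i^2} - \hat{Y}_{pps}^2\right]\right\}$$
$$E\left\{v[\hat{Y}_{pps}]\right\} = E\left\{\frac{1}{n-1}\left[\frac{1}{n}\sum_{i=1}^{n}\frac{y_i^2}{p_i^2} - \hat{Y}_{pps}^2\right]\right\} = E\left\{\frac{1}{n(n-1)}\sum_{i=1}^{n}\left[\frac{y_i}{p_i} - \hat{Y}_{pps}\right]^2\right\}$$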
Unequal Probability Sampling 59
Hence $\dfrac{1}{n(n-1)}\sum_{i=1}^{n}\left(\dfrac{y_i}{p_i} - \hat{Y}_{pps}\right)^2$ is an unbiased estimator of $V[\hat{Y}_{pps}]$.
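Theorems 4.2 and 4.3 can be checked exactly on a small population by enumerating every ordered with-replacement sample together with its probability. The sketch assumes the usual PPSWR estimator, namely the average of y_i / p_i over the n draws; the function names and the data are illustrative.

```python
from fractions import Fraction
from itertools import product


def pps_estimate(sample, Y, P):
    """Average of y_i / p_i over the drawn units (sample holds unit indices)."""
    return sum(Y[i] / P[i] for i in sample) / len(sample)


def pps_variance_estimate(sample, Y, P):
    n = len(sample)
    est = pps_estimate(sample, Y, P)
    return sum((Y[i] / P[i] - est) ** 2 for i in sample) / (n * (n - 1))


def exact_expectations(Y, P, n):
    """Return (E[estimate], V[estimate], E[variance estimate]) by full enumeration."""
    e_est = e_sq = e_var = Fraction(0)
    for sample in product(range(len(Y)), repeat=n):
        prob = Fraction(1)
        for i in sample:
            prob *= P[i]          # independent draws with replacement
        est = pps_estimate(sample, Y, P)
        e_est += prob * est
        e_sq += prob * est * est
        e_var += prob * pps_variance_estimate(sample, Y, P)
    return e_est, e_sq - e_est ** 2, e_var


Y = [3, 7, 10, 4]                                   # hypothetical y-values
P = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
print(exact_expectations(Y, P, 2))
```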
A meaningful equal probability sampling competitor to the PPSWR sampling
scheme is simple random sampling with replacement. The following theorem
gives an estimate for the gain due to PPSWR when compared to simple random
sampling with replacement.
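Theorem 4.4 An unbiased estimator of the gain due to PPSWR sampling as
compared to SRSWR is
$$\frac{1}{n^2}\sum_{i=1}^{n}\left[N - \frac{1}{p_i}\right]\frac{y_i^2}{p_i}$$
Proof We know that under SRSWR,
$$V[\hat{Y}_{swr}] = \frac{N^2(N-1)}{Nn}\,\frac{1}{N-1}\sum_{i=1}^{N}\left[Y_i - \bar{Y}\right]^2 = \frac{1}{n}\left\{N\sum_{i=1}^{N} Y_i^2 - Y^2\right\} \qquad (4.9)$$
Note that, under PPSWR, unbiased estimators of the quantities $\sum_{i=1}^{N} Y_i^2$ and $Y^2$
are $\frac{1}{n}\sum_{i=1}^{n}\frac{y_i^2}{p_i}$ and $\hat{Y}_{pps}^2 - v(\hat{Y}_{pps})$ respectively.
Therefore, by (4.9), an unbiased estimator of $V[\hat{Y}_{swr}]$ under PPSWR is
$$\frac{1}{n^2}\left\{N\sum_{i=1}^{n}\frac{y_i^2}{p_i} - n\hat{Y}_{pps}^2\right\} + \frac{1}{n}\,v(\hat{Y}_{pps}) \qquad (4.10)$$
We have already seen in Theorem 4.3 that an unbiased estimator of $V[\hat{Y}_{pps}]$ is
$$\frac{1}{n(n-1)}\left[\sum_{i=1}^{n}\frac{y_i^2}{p_i^2} - n\hat{Y}_{pps}^2\right] \qquad (4.11)$$
Subtracting (4.11) from (4.10), one can estimate the gain due to PPSWR as
$$\frac{1}{n^2}\sum_{i=1}^{n}\left[N - \frac{1}{p_i}\right]\frac{y_i^2}{p_i}$$
Hence the proof.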
4.2 PPSWOR Sampling Method
When probability proportional to size selection is made in each draw without
replacing the units drawn, we get a Probability Proportional to Size Sample
Without Replacement (PPSWOR). Since the selection probabilities change
from draw to draw, we must devise suitable estimators taking this aspect into
account. In this section, we shall discuss three estimators suitable for PPSWOR.
Desraj Ordered Estimator
Let
$$t_1 = \frac{y_1}{p_1},\quad t_2 = y_1 + \frac{y_2}{p_2}(1 - p_1),\quad t_3 = y_1 + y_2 + \frac{y_3}{p_3}(1 - p_1 - p_2),\ \ldots,$$
$$t_n = y_1 + y_2 + \cdots + y_{n-1} + \frac{y_n}{p_n}(1 - p_1 - \cdots - p_{n-1})$$
where $y_i$ and $p_i$, i = 1, 2, ..., n are as defined in Section 4.1. The Desraj ordered
estimator for the population total is defined as
$$\hat{Y}_{DR} = \frac{1}{n}\sum_{i=1}^{n} t_i$$
Theorem 4.5 Under PPSWOR, Y
0
R = Lt; is unbiased for the population
n i=l
total and an unbiased estimator of its variance is
n n
vCY DR)=
1
L (t;  r)
2
where t =.!.. L,t;
n(n  1) i=l n i=l
y y.
Proof Note that the ratio 
1
can take any one of the N values 
1
, j= 1 ,2, .... N
PI pj
with respective probabilities Pi . Therefore
J11:.]=f; Pr
....,LPr r=l r
= y
Hence t
1
is unbiased for the population total.
For r=l,2, ... ,n, E[tr]=E1E2[tr li1,i2, ... ,ird
where
2
is the conditional expectation after fixing the units with labels
i
1
, i2 , ... , irl for the first (r 1) draws. Now E[t r] can be written as
E[t,)=E.[ Y
1
, +Y
12
+ ... +Y
1
,_, +(1P
1
,P;,  ... P
1
,_
1
)
E{;: lit,i2, ... ,i,_t } ]
(4.12)
Unequal Probability Sampling 61
v
Note that conditionally, the ratio .;.L can take any one of the N r 1 values
Pr
yi , j = 1, 2, ... N, * i
1
, i
2
, ... , 1,_
1
with probabilities pj
Pi (1P;
1
P;
2
 P;,_,)
Th ti
E{
Yr }
ere ore Pr I ', '2 , ... , 'r1 =
N
I
j=l
p. (1P. P.  ... P. )
J ,,
1
2 'r1
Substituting (4.13) in (4.12) we get
E(t,) =
1
Y, + Y, + .... + Y; +
I 2 r1
j=l
; ~ : ; , .i2 ..... ,;,_,
=Y
Thus we have , for r = I, 2. ~ .. , n E[t r] = Y .
I n
Therefore E(YoR>= LE(t;)
n. I
=
I
=nY =Y
n
n
Hence Y DR =! L t; is unbiased for the population total.
n . ~ 
'=
A A., ")
Weknowthat V(YoR>=E(YJ3R)Y
Note that
(4.13)
(4.14)
E&,ts ]= EtE2[t,t_, I it. i2 , ... ,isl] (assuming without any loss of
generality r<s)
=E
1
[t,E
2
(t_, lit.i2, ... ,(,_t)1
=
1
[t,Y]
= YE
1
[t,]
=Y2
E{ I :f,r,t_,} = Y
2
n(n1)
r ; ~ : s
Substituting (4.15) in (4.14) we get
{
}
2 }
A 1 n I n 2
V(YoR)=E..:. L,t; E{ Ltrts Y
n .
1
n(n1)
= rills
(4.15)
62 Sampling Theory arui. Methods
( 1 n 2 [ 1 1 ]n )
=E t + tt
":! ll(nl)Lrsr
l r=l J
r 1 n 1 
1
= El 'r; t
2
J where
n(n 1) L 11 1
r=l
= { 1 i(tr r)2}
n(n 1) r=l
Hence the proof.
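Theorem 4.5 can be verified exactly for a small population by enumerating every ordered without-replacement sample. The sketch below computes the t-values in draw order and weights each ordered sample by its PPSWOR probability; the function names and the data are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations


def desraj_t_values(order, Y, P):
    """t_1, ..., t_n for units drawn in the given order (order holds unit indices)."""
    t = []
    cum_y = 0
    cum_p = Fraction(0)
    for i in order:
        t.append(cum_y + Y[i] / P[i] * (1 - cum_p))
        cum_y += Y[i]
        cum_p += P[i]
    return t


def ordered_sample_probability(order, P):
    """Probability of drawing the units in this order under PPSWOR."""
    prob = Fraction(1)
    remaining = Fraction(1)
    for i in order:
        prob *= P[i] / remaining
        remaining -= P[i]
    return prob


def desraj_exact(Y, P, n):
    """Return (E[estimate], V[estimate], E[variance estimate]) by full enumeration."""
    mean = second = e_var = Fraction(0)
    for order in permutations(range(len(Y)), n):
        prob = ordered_sample_probability(order, P)
        t = desraj_t_values(order, Y, P)
        est = sum(t) / n
        v = sum((x - est) ** 2 for x in t) / (n * (n - 1))
        mean += prob * est
        second += prob * est * est
        e_var += prob * v
    return mean, second - mean ** 2, e_var


Y = [3, 7, 10, 4]                                   # hypothetical y-values
P = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
print(desraj_exact(Y, P, 2))
```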
Murthy's Unordered Estimator
The Desraj ordered estimator depends upon the order in which the units are
drawn. Murthy (1957) obtained the unordered estimator corresponding to the Desraj
ordered estimator. For the sake of simplicity, we shall restrict to samples of size
2 only. Suppose $y_1$ and $y_2$ are the values of the units selected in the first and
second draws and $p_1$ and $p_2$ the corresponding initial selection probabilities.
The ordered estimator is
$$\hat{Y}_{DR}^{(1,2)} = \frac{1}{2}\left[\frac{y_1}{p_1} + y_1 + \frac{y_2}{p_2}(1 - p_1)\right] = \frac{1}{2}\left[(1 + p_1)\frac{y_1}{p_1} + (1 - p_1)\frac{y_2}{p_2}\right]$$
On the other hand, if the same two units are drawn in the other order then the
corresponding ordered estimator is
$$\hat{Y}_{DR}^{(2,1)} = \frac{1}{2}\left[(1 + p_2)\frac{y_2}{p_2} + (1 - p_2)\frac{y_1}{p_1}\right]$$
Their corresponding selection probabilities are
$$P(1,2) = \frac{p_1 p_2}{1 - p_1} \quad \text{and} \quad P(2,1) = \frac{p_1 p_2}{1 - p_2}$$
The unordered estimator based on the ordered estimators $\hat{Y}_{DR}^{(1,2)}$
and $\hat{Y}_{DR}^{(2,1)}$ is given by
$$\hat{Y}_M = \frac{\hat{Y}_{DR}^{(1,2)}P(1,2) + \hat{Y}_{DR}^{(2,1)}P(2,1)}{P(1,2) + P(2,1)} = \frac{(1 - p_2)\dfrac{y_1}{p_1} + (1 - p_1)\dfrac{y_2}{p_2}}{2 - p_1 - p_2}$$
An unbiased estimator of $V(\hat{Y}_M)$ is
$$\frac{(1 - p_1)(1 - p_2)(1 - p_1 - p_2)}{(2 - p_1 - p_2)^2}\left[\frac{y_1}{p_1} - \frac{y_2}{p_2}\right]^2$$
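For samples of size 2 the unordered estimator and its variance estimator are simple enough to check by enumeration. The sketch below assumes the standard forms of both expressions; the function names and the data are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def murthy_estimate(y1, p1, y2, p2):
    """Murthy's unordered estimator of the total for a sample of size 2."""
    return ((1 - p2) * y1 / p1 + (1 - p1) * y2 / p2) / (2 - p1 - p2)


def murthy_variance_estimate(y1, p1, y2, p2):
    factor = (1 - p1) * (1 - p2) * (1 - p1 - p2) / (2 - p1 - p2) ** 2
    return factor * (y1 / p1 - y2 / p2) ** 2


def murthy_exact(Y, P):
    """Return (E[estimate], V[estimate], E[variance estimate]) over all ordered pairs."""
    mean = second = e_var = Fraction(0)
    for i, j in permutations(range(len(Y)), 2):
        prob = P[i] * P[j] / (1 - P[i])      # unit i first, then unit j
        est = murthy_estimate(Y[i], P[i], Y[j], P[j])
        mean += prob * est
        second += prob * est * est
        e_var += prob * murthy_variance_estimate(Y[i], P[i], Y[j], P[j])
    return mean, second - mean ** 2, e_var


Y = [3, 7, 10, 4]                                   # hypothetical y-values
P = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
print(murthy_exact(Y, P))
```

The estimate does not change when the two units are interchanged, which is what makes it an unordered estimator.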
Horvitz-Thompson Estimator
To estimate the population total one can use the Horvitz-Thompson estimator
provided the inclusion probabilities are available. In the case of PPSWOR,
explicit expressions are not available for the inclusion probabilities. With
the help of computers one can enumerate all possible outcomes when n draws are
made and hence calculate the inclusion probabilities. In the following sections
some unequal probability sampling schemes yielding samples of distinct units
are presented.
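The enumeration mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ppswor_inclusion_probabilities` and the numerical values of `p` are hypothetical, and brute-force enumeration is practical only for small N and n.

```python
from itertools import permutations

def ppswor_inclusion_probabilities(p, n):
    """First-order inclusion probabilities under PPSWOR, by brute-force enumeration.

    p : initial selection probabilities (assumed to sum to 1)
    n : number of draws (n < len(p))
    """
    N = len(p)
    pi = [0.0] * N
    for order in permutations(range(N), n):
        prob, remaining = 1.0, 1.0
        for unit in order:
            prob *= p[unit] / remaining   # renormalise over the units still available
            remaining -= p[unit]
        for unit in order:                # every unit in this ordered outcome is included
            pi[unit] += prob
    return pi

p = [0.1, 0.2, 0.3, 0.4]                  # illustrative values only
print(ppswor_inclusion_probabilities(p, 2))
```

The inclusion probabilities obtained this way add up to n, and they can be plugged directly into the Horvitz-Thompson estimator.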
4.3 Random group method
The random group method is due to Rao, Hartley and Cochran (1962). This
method makes use of the size information and always yields a sample containing
distinct units. In this method, the population is randomly divided into n
mutually exclusive and exhaustive groups of sizes $N_1, N_2, \ldots, N_n$ and
one unit is drawn from each group with probabilities proportional to the sizes
of the units in that group. Here the group sizes $N_1, N_2, \ldots, N_n$ are
predetermined constants.
An unbiased estimator of the population total is
$$\hat{Y}_{RHC} = \sum_{i=1}^{n} \frac{y_i}{p_i}$$
where $y_i$ is the y-value of the unit drawn from the ith random group and
$p_i$ is the selection probability of the unit drawn from the ith random group.
Let $Y_{ij}$ and $X_{ij}$ be the y and x values of the jth unit in the ith
random group for a given partition. Then $y_i$ can take any one of the $N_i$
values $Y_{ij}$, $j = 1, 2, \ldots, N_i$, and $p_i$ can take any one of the
$N_i$ values
$$\frac{X_{ij}}{\sum_{k=1}^{N_i} X_{ik}}, \quad j = 1, 2, \ldots, N_i; \; i = 1, 2, \ldots, n$$
Theorem 4.6 The estimator $\hat{Y}_{RHC} = \sum_{i=1}^{n} \frac{y_i}{p_i}$ is unbiased for the population total Y.
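Proof
$$E[\hat{Y}_{RHC}] = E_1 E_2[\hat{Y}_{RHC} \mid G_1, G_2, \ldots, G_n] \qquad (4.16)$$
where $E_2$ is the conditional expectation taken with respect to a given
partitioning of the population and $E_1$ is the overall expectation.
Note that
$$E_2[\hat{Y}_{RHC} \mid G_1, G_2, \ldots, G_n] = E_2\left\{\sum_{i=1}^{n}\frac{y_i}{p_i} \,\Big|\, G_1, G_2, \ldots, G_n\right\} = \sum_{i=1}^{n} E_2\left(\frac{y_i}{p_i} \,\Big|\, G_i\right) \qquad (4.17)$$
Since the ratio $\frac{y_i}{p_i}$ can take any one of the $N_i$ values
$Y_{ij}\Big/\dfrac{X_{ij}}{\sum_{k=1}^{N_i} X_{ik}}$, $j = 1, 2, \ldots, N_i$, with
respective probabilities $\dfrac{X_{ij}}{\sum_{k=1}^{N_i} X_{ik}}$, we have
$$E_2\left(\frac{y_i}{p_i} \,\Big|\, G_i\right) = \sum_{j=1}^{N_i} \frac{Y_{ij}}{X_{ij}\big/\sum_{k=1}^{N_i} X_{ik}} \cdot \frac{X_{ij}}{\sum_{k=1}^{N_i} X_{ik}} = \sum_{j=1}^{N_i} Y_{ij} \qquad (4.18)$$
Substituting (4.18) in (4.17) we get
$$E_2[\hat{Y}_{RHC} \mid G_1, G_2, \ldots, G_n] = \sum_{i=1}^{n}\sum_{j=1}^{N_i} Y_{ij} = Y$$
Therefore, by (4.16), $\hat{Y}_{RHC} = \sum_{i=1}^{n}\frac{y_i}{p_i}$ is unbiased
for the population total under random group method.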
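The conditional step of the proof can be checked numerically. The sketch below fixes one partition (chosen arbitrarily for illustration), lists every way of drawing one unit per group, and accumulates the conditional expectation $E_2$; the function name and the data are hypothetical.

```python
from itertools import product

def rhc_conditional_expectation(groups, y, x):
    """E_2 of the random group estimator for a fixed partition.

    groups : list of lists of unit indices (one list per random group)
    y, x   : study variable and size variable for the whole population
    """
    expectation = 0.0
    # One unit is drawn per group, independently across groups.
    for choice in product(*groups):
        prob, estimate = 1.0, 0.0
        for group, unit in zip(groups, choice):
            p_i = x[unit] / sum(x[j] for j in group)   # within-group selection probability
            prob *= p_i
            estimate += y[unit] / p_i
        expectation += prob * estimate
    return expectation

y = [12.0, 7.0, 30.0, 18.0, 5.0, 22.0]     # illustrative values only
x = [3.0, 2.0, 8.0, 5.0, 1.0, 6.0]
groups = [[0, 3], [1, 4, 5], [2]]          # one arbitrary partition into n = 3 groups
print(rhc_conditional_expectation(groups, y, x), sum(y))
```

Because the conditional expectation equals Y for every partition, averaging over random partitions cannot change it, which is the content of (4.16).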
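The following theorem gives the variance of the estimator $\hat{Y}_{RHC}$.
Theorem 4.7 The variance of the estimator $\hat{Y}_{RHC}$ is
$$V(\hat{Y}_{RHC}) = \frac{\sum_{i=1}^{n} N_i(N_i - 1)}{N(N - 1)} \sum_{r=1}^{N}\left[\frac{Y_r}{P_r} - Y\right]^2 P_r$$
Proof
$$V[\hat{Y}_{RHC}] = E_1 V_2[\hat{Y}_{RHC} \mid G_1, G_2, \ldots, G_n] + V_1 E_2[\hat{Y}_{RHC} \mid G_1, G_2, \ldots, G_n] \qquad (4.19)$$
We have seen in Theorem 4.6 that $E_2[\hat{Y}_{RHC} \mid G_1, G_2, \ldots, G_n] = Y$.
Therefore
$$V_1 E_2[\hat{Y}_{RHC} \mid G_1, G_2, \ldots, G_n] = 0 \qquad (4.20)$$
Since draws are made independently in different groups,
$$V_2[\hat{Y}_{RHC} \mid G_1, G_2, \ldots, G_n] = V_2\left\{\sum_{i=1}^{n}\frac{y_i}{p_i} \,\Big|\, G_1, G_2, \ldots, G_n\right\} = \sum_{i=1}^{n} V_2\left(\frac{y_i}{p_i} \,\Big|\, G_i\right) = \sum_{i=1}^{n}\sum_{j=1}^{N_i}\left[\frac{Y_{ij}}{p_{ij}} - T_i\right]^2 p_{ij} \qquad (4.21)$$
where $p_{ij} = \dfrac{X_{ij}}{\sum_{k=1}^{N_i} X_{ik}}$ and $T_i = \sum_{k=1}^{N_i} Y_{ik}$.
The right hand side of the above expression is obtained by applying Theorem 4.1
with n = 1.
Claim
$$\sum_{i=1}^{N}\left[\frac{Y_i}{P_i} - Y\right]^2 P_i = \sum_{i < j}\left[\frac{Y_i}{P_i} - \frac{Y_j}{P_j}\right]^2 P_i P_j \qquad (4.22)$$
Proof of the claim
We know that $\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} = \sum_{i=1}^{n} a_{ii} + 2\sum_{i < j} a_{ij}$ if $a_{ij} = a_{ji}$.
Therefore
$$\sum_{i=1}^{N}\sum_{j=1}^{N}\left[\frac{Y_i}{P_i} - \frac{Y_j}{P_j}\right]^2 P_i P_j = \sum_{i=1}^{N}\left[\frac{Y_i}{P_i} - \frac{Y_i}{P_i}\right]^2 P_i P_i + 2\sum_{i < j}\left[\frac{Y_i}{P_i} - \frac{Y_j}{P_j}\right]^2 P_i P_j$$
The above expression can be written as
$$2\sum_{i=1}^{N}\sum_{j=1}^{N}\frac{Y_i^2}{P_i^2} P_i P_j - 2\sum_{i=1}^{N}\sum_{j=1}^{N} Y_i Y_j = 2\sum_{i < j}\left[\frac{Y_i}{P_i} - \frac{Y_j}{P_j}\right]^2 P_i P_j$$
Therefore
$$2\sum_{i=1}^{N}\frac{Y_i^2}{P_i} - 2Y^2 = 2\sum_{i < j}\left[\frac{Y_i}{P_i} - \frac{Y_j}{P_j}\right]^2 P_i P_j$$
Since $\sum_{i=1}^{N}\left[\frac{Y_i}{P_i} - Y\right]^2 P_i = \sum_{i=1}^{N}\frac{Y_i^2}{P_i} - Y^2$, we get
$$\sum_{i=1}^{N}\left[\frac{Y_i}{P_i} - Y\right]^2 P_i = \sum_{i < j}\left[\frac{Y_i}{P_i} - \frac{Y_j}{P_j}\right]^2 P_i P_j$$
Thus we have proved the claim.
Making use of (4.22) within each group in (4.21) we get
$$V_2[\hat{Y}_{RHC} \mid G_1, G_2, \ldots, G_n] = \sum_{i=1}^{n}\sum_{j < k}\left[\frac{Y_{ij}}{p_{ij}} - \frac{Y_{ik}}{p_{ik}}\right]^2 p_{ij} p_{ik} = \sum_{i=1}^{n}\sum_{j < k}\left[\frac{Y_{ij}}{P_{ij}} - \frac{Y_{ik}}{P_{ik}}\right]^2 P_{ij} P_{ik}$$
where $P_{ij} = X_{ij}/X$, because $p_{ij} = P_{ij}\big/\sum_{k=1}^{N_i} P_{ik}$. A given
pair of population units falls in the ith random group with probability
$\frac{N_i(N_i - 1)}{N(N - 1)}$. Therefore
$$E_1 V_2[\hat{Y}_{RHC} \mid G_1, G_2, \ldots, G_n] = \sum_{i=1}^{n}\frac{N_i(N_i - 1)}{N(N - 1)}\sum_{r < s}\left[\frac{Y_r}{P_r} - \frac{Y_s}{P_s}\right]^2 P_r P_s = \sum_{i=1}^{n}\frac{N_i(N_i - 1)}{N(N - 1)}\sum_{r=1}^{N}\left[\frac{Y_r}{P_r} - Y\right]^2 P_r \qquad (4.24)$$
Substituting (4.20) and (4.24) in (4.19) we get the required result.
Remark When the groups are all of the same size, then
$$N_1 = N_2 = \cdots = N_n = \frac{N}{n}$$
In such cases we have
$$\sum_{i=1}^{n} N_i(N_i - 1) = \frac{N(N - n)}{n}$$
Substituting this in (4.24), we obtain
$$V(\hat{Y}_{RHC}) = \frac{N(N - n)}{nN(N - 1)}\sum_{r=1}^{N}\left[\frac{Y_r}{P_r} - Y\right]^2 P_r = \frac{N - n}{N - 1} V(\hat{Y}_{pps}) \quad \text{(refer Theorem 4.2)}$$
From this we infer that random group method is better than probability
proportional to size sampling with replacement whenever the groups are of the
same size.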
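The following theorem gives an unbiased estimator of $V(\hat{Y}_{RHC})$.
Theorem 4.8 An unbiased estimator of $V(\hat{Y}_{RHC})$ is
$$v(\hat{Y}_{RHC}) = \frac{\sum_{i=1}^{n} N_i^2 - N}{N^2 - \sum_{i=1}^{n} N_i^2}\left\{\sum_{i=1}^{n}\frac{y_i^2}{p_i P_i} - \hat{Y}_{RHC}^2\right\}$$
where $p_i$ and $P_i$ are as defined in Theorem 4.6 and Theorem 4.2.
Proof From Theorem 4.7 we have
$$V(\hat{Y}_{RHC}) = \lambda\sum_{r=1}^{N}\left[\frac{Y_r}{P_r} - Y\right]^2 P_r \quad \text{where } \lambda = \sum_{i=1}^{n}\frac{N_i(N_i - 1)}{N(N - 1)}$$
$$= \lambda\left\{\sum_{r=1}^{N}\frac{Y_r^2}{P_r} - Y^2\right\} \qquad (4.25)$$
Using the argument given in Theorem 4.6 it can be seen that
$$E\left\{\sum_{i=1}^{n}\frac{y_i^2}{p_i P_i}\right\} = \sum_{r=1}^{N}\frac{Y_r^2}{P_r} \qquad (4.26)$$
If $v[\hat{Y}_{RHC}]$ is an unbiased estimator of $V[\hat{Y}_{RHC}]$ then
$$E\{v[\hat{Y}_{RHC}]\} = V[\hat{Y}_{RHC}] \qquad (4.27)$$
Further $V[\hat{Y}_{RHC}] = E[\hat{Y}_{RHC}^2] - Y^2$.
Hence
$$Y^2 = E[\hat{Y}_{RHC}^2 - v(\hat{Y}_{RHC})] \qquad (4.28)$$
Using (4.26), (4.27) and (4.28) in (4.25) we get
$$E\{v[\hat{Y}_{RHC}]\} = \lambda E\left\{\sum_{i=1}^{n}\frac{y_i^2}{p_i P_i} + v(\hat{Y}_{RHC}) - \hat{Y}_{RHC}^2\right\}$$
Solving for $v(\hat{Y}_{RHC})$, we get as estimator of $V(\hat{Y}_{RHC})$
$$v(\hat{Y}_{RHC}) = \frac{\lambda}{1 - \lambda}\left[\sum_{i=1}^{n}\frac{y_i^2}{p_i P_i} - \hat{Y}_{RHC}^2\right]$$
Since $\frac{\lambda}{1 - \lambda} = \frac{\sum_{i=1}^{n} N_i^2 - N}{N^2 - \sum_{i=1}^{n} N_i^2}$, this is the estimator stated in the theorem.
Hence the proof.
4.4 Midzuno Scheme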
This is another unequal probability sampling scheme due to Midzuno (1952).
Let $\hat{X}$ be an unbiased estimator of the population total X of the size
variable x under simple random sampling. That is,
$\hat{X} = \frac{N}{n}\sum_{i \in s} X_i$. The Midzuno sampling design is defined as
$$P(s) = \begin{cases} \dfrac{\hat{X}}{X}\dbinom{N}{n}^{-1} & \text{if } s \text{ contains } n \text{ distinct units} \\ 0 & \text{otherwise} \end{cases} \qquad (4.29)$$
The above sampling design can be implemented by using the following
sampling method.
To draw a sample of size n, one unit is drawn by probability proportional
to size method and from the remaining (N - 1) units a simple random sample of
size (n - 1) is drawn.
Now we shall prove that the above sampling method will implement the
sampling design defined in (4.29).
Let $s = \{i_1, i_2, \ldots, i_n\}$. The probability of getting the set s as sample is
$$P(s) = \sum_{r=1}^{n} P(A_{i_r}) \qquad (4.30)$$
where $P(A_{i_r})$ is the probability of obtaining the set s as sample with
$i_r$ selected in the first draw. It is to be noted that
$$P(A_{i_r}) = \frac{X_{i_r}}{X}\binom{N-1}{n-1}^{-1}$$
Therefore by (4.30),
$$P(s) = \sum_{i \in s}\frac{X_i}{X}\binom{N-1}{n-1}^{-1} = \frac{\hat{X}}{X}\binom{N}{n}^{-1} \quad \left(\text{using } \hat{X} = \frac{N}{n}\sum_{i \in s} X_i\right)$$
From this, we infer that the sampling scheme described above implements the
sampling design defined in (4.29).
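The equivalence between the design and the two-step drawing procedure can be illustrated numerically. The following is a minimal sketch with hypothetical function names and illustrative sizes; it computes P(s) once from the design formula and once from the procedure (one PPS draw followed by a simple random sample of n - 1 units), for every subset of size n.

```python
from itertools import combinations
from math import comb

def midzuno_design(x, n):
    """P(s) from the design definition, X_hat / (X * C(N, n)), for every subset s of size n."""
    N, X = len(x), sum(x)
    return {s: (N / n) * sum(x[i] for i in s) / (X * comb(N, n))
            for s in combinations(range(N), n)}

def midzuno_procedure(x, n):
    """P(s) implied by one PPS draw followed by SRSWOR of n-1 units from the other N-1."""
    N, X = len(x), sum(x)
    return {s: sum((x[i] / X) / comb(N - 1, n - 1) for i in s)
            for s in combinations(range(N), n)}

x = [2.0, 3.0, 5.0, 7.0, 11.0]            # illustrative sizes only
design, procedure = midzuno_design(x, 3), midzuno_procedure(x, 3)
print(max(abs(design[s] - procedure[s]) for s in design))   # difference is at rounding level
print(sum(design.values()))                                  # probabilities add up to one
```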
The following theorem gives the first order inclusion probabilities
corresponding to the Midzuno sampling design.
Theorem 4.9 Under Midzuno sampling design the first order inclusion
probabilities are
$$\pi_i = \frac{N - n}{N - 1}\frac{X_i}{X} + \frac{n - 1}{N - 1}, \quad i = 1, 2, \ldots, N$$
Proof By definition
$$\pi_i = \sum_{s \ni i} P(s) = \sum_{s \ni i}\frac{\hat{X}}{X}\binom{N}{n}^{-1} = \binom{N}{n}^{-1}\frac{N}{n}\frac{1}{X}\sum_{s \ni i}\sum_{j \in s} X_j \qquad (4.31)$$
Note that (1) the number of subsets of size n containing the label i is
$\binom{N-1}{n-1}$ and (2) the number of subsets of size n containing the label
j along with label i is $\binom{N-2}{n-2}$. Therefore, by (4.31),
$$\pi_i = \frac{1}{X}\binom{N-1}{n-1}^{-1}\left[\binom{N-1}{n-1} X_i + \binom{N-2}{n-2}(X - X_i)\right]$$
$$= \frac{1}{X}\binom{N-1}{n-1}^{-1}\left[\binom{N-2}{n-2}\left(\frac{N - n}{n - 1} X_i + X\right)\right]$$
$$= \frac{N - n}{N - 1}\frac{X_i}{X} + \frac{n - 1}{N - 1}, \quad i = 1, 2, \ldots, N$$
Hence the proof.
The following theorem gives the second order inclusion probabilities under
Midzuno scheme.
Theorem 4.10 Under Midzuno sampling design, the second order inclusion
probabilities are
$$\pi_{ij} = \frac{(N - n)(n - 1)}{(N - 1)(N - 2)}\frac{X_i + X_j}{X} + \frac{(n - 1)(n - 2)}{(N - 1)(N - 2)}$$
Proof By definition,
$$\pi_{ij} = \sum_{s \ni i,j} P(s) = \binom{N}{n}^{-1}\frac{N}{n}\frac{1}{X}\sum_{s \ni i,j}\sum_{l \in s} X_l$$
Note that the number of subsets of size n containing the labels i and j is
$\binom{N-2}{n-2}$ and the number of subsets of size n containing the label k
along with labels i and j is $\binom{N-3}{n-3}$.
Therefore
$$\pi_{ij} = \frac{1}{X}\binom{N-1}{n-1}^{-1}\left[\binom{N-2}{n-2}(X_i + X_j) + \binom{N-3}{n-3}(X - X_i - X_j)\right]$$
$$= \frac{n - 1}{N - 1}\frac{X_i + X_j}{X}\left[1 - \frac{n - 2}{N - 2}\right] + \frac{(n - 1)(n - 2)}{(N - 1)(N - 2)}$$
$$= \frac{(N - n)(n - 1)}{(N - 1)(N - 2)}\frac{X_i + X_j}{X} + \frac{(n - 1)(n - 2)}{(N - 1)(N - 2)}$$
Hence the proof.
Thus we have derived the first and second order inclusion probabilities under
Midzuno sampling scheme. These expressions can be used in the Horvitz-Thompson
estimator to estimate the population total and derive the variance of
the estimator. Midzuno sampling design is one in which the Yates-Grundy
estimator of variance is nonnegative.
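As a numerical cross-check of Theorems 4.9 and 4.10, the sketch below sums the design probabilities over all subsets containing a unit (or a pair of units) and compares the result with the closed forms. The function names and the sizes are hypothetical, and the enumeration is only feasible for small N.

```python
from itertools import combinations
from math import comb

def midzuno_inclusion_by_enumeration(x, n):
    """First and second order inclusion probabilities by summing P(s) over subsets."""
    N, X = len(x), sum(x)
    pi, pij = [0.0] * N, {}
    for s in combinations(range(N), n):
        p_s = sum(x[i] for i in s) / (X * comb(N - 1, n - 1))   # Midzuno P(s)
        for i in s:
            pi[i] += p_s
        for pair in combinations(s, 2):
            pij[pair] = pij.get(pair, 0.0) + p_s
    return pi, pij

def midzuno_pi(x, n, i):
    """Closed form of Theorem 4.9."""
    N, X = len(x), sum(x)
    return (N - n) / (N - 1) * x[i] / X + (n - 1) / (N - 1)

def midzuno_pij(x, n, i, j):
    """Closed form of Theorem 4.10."""
    N, X = len(x), sum(x)
    d = (N - 1) * (N - 2)
    return (N - n) * (n - 1) / d * (x[i] + x[j]) / X + (n - 1) * (n - 2) / d

x = [2.0, 3.0, 5.0, 7.0, 11.0]            # illustrative sizes only
pi, pij = midzuno_inclusion_by_enumeration(x, 3)
print(pi[0], midzuno_pi(x, 3, 0))
print(pij[(0, 1)], midzuno_pij(x, 3, 0, 1))
```

The same dictionaries can be used to inspect the quantities $\pi_i\pi_j - \pi_{ij}$ that govern the sign of the Yates-Grundy variance estimator.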
4.5 PPS Systematic Scheme
As in the case of cumulative total method, in probability proportional to size
systematic sampling, a set of numbers equal in count to its size is associated
with each unit and the units corresponding to a sample of numbers drawn
systematically are selected as the sample. That is, in sampling n units with
this procedure, the cumulative totals $T_i$, $i = 1, 2, \ldots, N$, are
determined and the units corresponding to the numbers $\{r + jk\}$,
$j = 0, 1, 2, \ldots, (n - 1)$ are selected, where $k = \frac{T_N}{n} = \frac{X}{n}$
and r is a random number from 1 to k. This procedure is known as pps
systematic sampling. The unit $U_i$ is included in the sample if
$T_{i-1} < r + jk \le T_i$ for some value of $j = 0, 1, 2, \ldots, (n - 1)$. Since
the random number, which determines the sample, is selected from 1 to k and
since $X_i$ of the numbers are favourable for inclusion of the ith unit in a
sample, the probability $\pi_i$ of inclusion of the ith population unit is
$\frac{X_i}{k} = \frac{nX_i}{X}$, provided $X_i \le k$.
It is to be noted that if $\frac{X}{n}$ is not an integer, the sampling interval
k can be taken as the integer nearest to $\frac{X}{n}$; in this case the actual
sample size differs from the required sample size. This difficulty can be
overcome by selecting the sample in a circular fashion after choosing the
random start from 1 to X instead of from 1 to k. Hartley and Rao (1962) have
considered pps systematic procedure when the units are arranged at random and
derived approximate expressions for the variance and estimated variance which
are given below.
$$V(\hat{Y}_{HR}) = \frac{1}{n}\sum_{i=1}^{N}\left[\frac{Y_i}{P_i} - Y\right]^2 P_i\,[1 - (n - 1)P_i]$$
$$v(\hat{Y}_{HR}) = \frac{1}{n^2(n - 1)}\sum_{i=1}^{n}\sum_{i' < i}\left[1 - n(p_i + p_{i'}) + n\sum_{i''=1}^{N} p_{i''}^2\right]\left[\frac{y_i}{p_i} - \frac{y_{i'}}{p_{i'}}\right]^2$$
It can be shown that even when the units are arranged at random pps systematic
sampling is more efficient than ppswr sampling.
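The selection rule of pps systematic sampling can be sketched as follows, assuming integer sizes, a total X that is a multiple of n, and no size larger than k = X/n. The function names and the sizes are hypothetical; the second function lists all k random starts to recover the inclusion probabilities.

```python
from bisect import bisect_left
from itertools import accumulate

def pps_systematic_sample(sizes, n, r):
    """Units (0-based) selected with random start r in 1..k, where k = X / n."""
    totals = list(accumulate(sizes))            # cumulative totals T_1, ..., T_N
    k, rem = divmod(totals[-1], n)
    if rem != 0:
        raise ValueError("this sketch assumes X / n is an integer")
    # The number m selects the unit i with T_{i-1} < m <= T_i.
    return [bisect_left(totals, r + j * k) for j in range(n)]

def inclusion_probabilities(sizes, n):
    """Inclusion probabilities obtained by listing all k equally likely random starts."""
    k = sum(sizes) // n
    counts = [0] * len(sizes)
    for r in range(1, k + 1):
        for unit in set(pps_systematic_sample(sizes, n, r)):
            counts[unit] += 1
    return [c / k for c in counts]

sizes = [3, 5, 2, 6, 4]                         # illustrative sizes, X = 20
print(pps_systematic_sample(sizes, 2, 4))       # numbers 4 and 14 select units 1 and 3
print(inclusion_probabilities(sizes, 2))        # equals n * X_i / X for each unit
```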
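4.6 Problems and Solutions
Problem 4.1 Derive the variance of Desraj ordered estimator when the sample
size is two.
Solution When n = 2, the estimator $\hat{Y}_{DR}$ can be written as
$$\hat{Y}_{DR} = \frac{1}{2}(t_1 + t_2) = \frac{1}{2}\left[\frac{y_1}{p_1}(1 + p_1) + \frac{y_2}{p_2}(1 - p_1)\right]$$
Note that the above estimator can take the values
$$\frac{1}{2}\left[\frac{Y_i}{P_i}(1 + P_i) + \frac{Y_j}{P_j}(1 - P_i)\right], \quad i, j = 1, 2, \ldots, N; \; i \ne j$$
with respective probabilities $\dfrac{P_i P_j}{1 - P_i}$.
Therefore
$$E[\hat{Y}_{DR}^2] = \frac{1}{4}\sum_{i=1}^{N}\sum_{j \ne i}\left[\frac{Y_i}{P_i}(1 + P_i) + \frac{Y_j}{P_j}(1 - P_i)\right]^2\frac{P_i P_j}{1 - P_i}$$
Using the identity $\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} = \sum_{i=1}^{n} a_{ii} + \sum_{i \ne j} a_{ij}$, we can write
$$E[\hat{Y}_{DR}^2] = \frac{1}{4}\sum_{i=1}^{N}\sum_{j=1}^{N}\left[\frac{Y_i}{P_i}(1 + P_i) + \frac{Y_j}{P_j}(1 - P_i)\right]^2\frac{P_i P_j}{1 - P_i} - \frac{1}{4}\sum_{i=1}^{N}\left[\frac{Y_i}{P_i}(1 + P_i) + \frac{Y_i}{P_i}(1 - P_i)\right]^2\frac{P_i^2}{1 - P_i}$$
On simplification the first and second terms reduce to
$$\frac{1}{4}\sum_{i=1}^{N}\frac{Y_i^2(1 + P_i)^2}{P_i(1 - P_i)} + \frac{1}{4}\left[1 - \sum_{i=1}^{N} P_i^2\right]\sum_{j=1}^{N}\frac{Y_j^2}{P_j} + \frac{1}{2}Y^2 + \frac{1}{2}Y\sum_{i=1}^{N} Y_i P_i$$
and $\sum_{i=1}^{N}\dfrac{Y_i^2}{1 - P_i}$ respectively.
Therefore
$$V(\hat{Y}_{DR}) = \frac{1}{4}\sum_{i=1}^{N}\frac{Y_i^2(1 + P_i)^2}{P_i(1 - P_i)} + \frac{1}{4}\left[1 - \sum_{i=1}^{N} P_i^2\right]\sum_{j=1}^{N}\frac{Y_j^2}{P_j} - \sum_{i=1}^{N}\frac{Y_i^2}{1 - P_i} - \frac{1}{2}Y^2 + \frac{1}{2}Y\sum_{i=1}^{N} Y_i P_i$$
Using $\sum_{i=1}^{N}\dfrac{Y_i^2}{P_i} = \sum_{i=1}^{N}\left[\dfrac{Y_i}{P_i} - Y\right]^2 P_i + Y^2$
in the above expression and simplifying the resulting expression we get
$$V(\hat{Y}_{DR}) = \left[1 - \frac{1}{2}\sum_{i=1}^{N} P_i^2\right]\frac{1}{2}\sum_{i=1}^{N}\left[\frac{Y_i}{P_i} - Y\right]^2 P_i - \frac{1}{4}\sum_{i=1}^{N}\left[\frac{Y_i}{P_i} - Y\right]^2 P_i^2$$
Hence the solution.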
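The closed form for n = 2 can be checked against a direct enumeration of the ordered pairs (i, j). The sketch below uses hypothetical function names and illustrative data; it takes the final expression to be $[1 - \frac{1}{2}\sum P_i^2]\frac{1}{2}\sum[\frac{Y_i}{P_i} - Y]^2 P_i - \frac{1}{4}\sum[\frac{Y_i}{P_i} - Y]^2 P_i^2$.

```python
def desraj_variance_enumerated(Y, P):
    """Exact variance of the Desraj estimator for n = 2, by listing ordered pairs (i, j)."""
    N, total = len(Y), sum(Y)
    second_moment = 0.0
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            prob = P[i] * P[j] / (1 - P[i])      # unit i first, then unit j
            est = 0.5 * (Y[i] * (1 + P[i]) / P[i] + Y[j] * (1 - P[i]) / P[j])
            second_moment += prob * est ** 2
    return second_moment - total ** 2            # the estimator is unbiased for the total

def desraj_variance_formula(Y, P):
    """Closed form obtained in the solution of Problem 4.1."""
    total = sum(Y)
    a = sum((y / p - total) ** 2 * p for y, p in zip(Y, P))
    b = sum((y / p - total) ** 2 * p ** 2 for y, p in zip(Y, P))
    return (1 - 0.5 * sum(p * p for p in P)) * 0.5 * a - 0.25 * b

Y = [12.0, 7.0, 30.0, 18.0]                      # illustrative values only
P = [0.1, 0.2, 0.3, 0.4]
print(desraj_variance_enumerated(Y, P), desraj_variance_formula(Y, P))
```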
Problem 4.2 A finite population of size N is divided randomly into n groups of
equal sizes (assuming the population size is a multiple of the sample size) and
one unit is drawn from each group randomly. Suggest an unbiased estimator for
the population total and derive its variance. Compare the resulting variance
with the variance of the conventional estimator under simple random sampling.
Solution We know that probability proportional to size sampling reduces to
equal probability sampling if the units are of the same size. Therefore, the
sampling scheme described in the given problem can be viewed as a particular
case of random group method. Hence the results given under the random group
method can be used for the sampling scheme given above by taking
$X_1 = X_2 = \cdots = X_N = X_0$ (say). In this case, we have
$$p_{ij} = \frac{X_{ij}}{\sum_{k=1}^{N_i} X_{ik}} = \frac{X_0}{N_i X_0} = \frac{1}{N_i} = \frac{n}{N}, \quad j = 1, 2, \ldots, \frac{N}{n}; \; i = 1, \ldots, n$$
and
$$P_i = \frac{X_i}{X} = \frac{X_0}{N X_0} = \frac{1}{N}$$
Under this set up, by Theorem 4.6 an unbiased estimator of the population total
is given by
$$\hat{Y}_{RHC} = \frac{N}{n}\sum_{i=1}^{n} y_i$$
Further
$$\sum_{r=1}^{N}\left[\frac{Y_r}{P_r} - Y\right]^2 P_r = \sum_{r=1}^{N}\left[N Y_r - N\bar{Y}\right]^2\frac{1}{N} = \sum_{r=1}^{N} N\left[Y_r - \bar{Y}\right]^2$$
Substituting this expression in the variance expression given under the Remark
(stated below Theorem 4.7) we get
$$V(\hat{Y}_{RHC}) = \frac{N^2(N - n)}{Nn}\,\frac{1}{N - 1}\sum_{r=1}^{N}\left[Y_r - \bar{Y}\right]^2$$
It may be noted that the variance expression given above is nothing but the
variance of the expansion estimator under simple random sampling and in fact
the estimator $\hat{Y}_{RHC} = \frac{N}{n}\sum_{i=1}^{n} y_i$ is nothing but the
expansion estimator under simple random sampling.
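Problem 4.3 Show that the Yates-Grundy estimator is nonnegative under
Midzuno sampling design.
Solution We have seen in Chapter 1 that a set of sufficient conditions for the
nonnegativity of the Yates-Grundy estimator is given by
$$\pi_i\pi_j - \pi_{ij} \ge 0, \quad i, j = 1, 2, \ldots, N; \; i \ne j$$
Using the expressions given in Theorems 4.9 and 4.10 we have
$$\pi_i\pi_j - \pi_{ij} = \left[\frac{N - n}{N - 1}\frac{X_i}{X} + \frac{n - 1}{N - 1}\right]\left[\frac{N - n}{N - 1}\frac{X_j}{X} + \frac{n - 1}{N - 1}\right] - \left[\frac{(N - n)(n - 1)}{(N - 1)(N - 2)}\frac{X_i + X_j}{X} + \frac{(n - 1)(n - 2)}{(N - 1)(N - 2)}\right]$$
$$= \left[\frac{N - n}{N - 1}\right]^2\frac{X_i X_j}{X^2} + \frac{(N - n)(n - 1)}{N - 1}\frac{X_i + X_j}{X}\left[\frac{1}{N - 1} - \frac{1}{N - 2}\right] + \frac{n - 1}{N - 1}\left[\frac{n - 1}{N - 1} - \frac{n - 2}{N - 2}\right]$$
$$= \frac{(N - n)^2}{(N - 1)^2}\frac{X_i X_j}{X^2} + \frac{(N - n)(n - 1)}{(N - 1)^2(N - 2)}\left[1 - \frac{X_i + X_j}{X}\right]$$
Since $X_i + X_j \le X$, the right hand side of the above expression is
nonnegative, and we conclude that the Yates-Grundy estimator is always
nonnegative.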
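Problem 4.4 Derive the bias and mean square error of $\frac{N}{n}\sum_{i=1}^{n} y_i$ under
probability proportional to size sampling with replacement.
Solution Bias of the estimator is given by
$$B = \frac{N}{n}\sum_{i=1}^{n} E(y_i) - Y = \frac{N}{n}\sum_{i=1}^{n}\sum_{j=1}^{N} Y_j P_j - N\bar{Y} = \frac{N}{n}\,n\sum_{j=1}^{N}\left[Y_j - \bar{Y}\right]P_j = N\sum_{j=1}^{N}\left[Y_j - \bar{Y}\right]P_j$$
Consider the difference
$$\frac{N}{n}\sum_{i=1}^{n} y_i - Y = \frac{N}{n}\sum_{i=1}^{n}(y_i - \bar{Y})$$
Squaring both the sides and taking expectation on both the sides we get the
mean square error as
$$M = \frac{N^2}{n^2}\left[\sum_{i=1}^{n} E(y_i - \bar{Y})^2 + 2\sum_{i < i'} E(y_i - \bar{Y})(y_{i'} - \bar{Y})\right]$$
Since units are drawn independently one by one with replacement, each cross
product term factorises as
$$E(y_i - \bar{Y})(y_{i'} - \bar{Y}) = \left[\sum_{j=1}^{N}(Y_j - \bar{Y})P_j\right]^2 = \frac{B^2}{N^2}$$
which is zero only when the estimator is unbiased. Therefore
$$M = \frac{N^2}{n^2}\left[n\sum_{j=1}^{N}(Y_j - \bar{Y})^2 P_j + n(n - 1)\frac{B^2}{N^2}\right] = \frac{N^2}{n}\sum_{j=1}^{N}(Y_j - \bar{Y})^2 P_j + \frac{n - 1}{n}B^2$$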
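The bias and mean square error can be verified by enumerating every ordered with-replacement sample for a small population. In this sketch (hypothetical names, illustrative data) the closed form used for the mean square error is $\frac{N^2}{n}\sum_j (Y_j - \bar{Y})^2 P_j + \frac{n-1}{n}B^2$, where the second term collects the cross products, each of which equals $B^2/N^2$ under independent draws.

```python
from itertools import product

def bias_and_mse_enumerated(Y, P, n):
    """Exact bias and MSE of (N/n) * sum(y_i) under PPSWR, by listing all ordered samples."""
    N, total = len(Y), sum(Y)
    bias = mse = 0.0
    for draws in product(range(N), repeat=n):
        prob = 1.0
        for i in draws:
            prob *= P[i]                          # independent draws with replacement
        est = N / n * sum(Y[i] for i in draws)
        bias += prob * (est - total)
        mse += prob * (est - total) ** 2
    return bias, mse

def bias_and_mse_formula(Y, P, n):
    """Closed forms: B = N * sum((Y_j - Ybar) P_j), MSE as stated in the lead-in."""
    N = len(Y)
    ybar = sum(Y) / N
    bias = N * sum((y - ybar) * p for y, p in zip(Y, P))
    mse = N ** 2 / n * sum((y - ybar) ** 2 * p for y, p in zip(Y, P)) + (n - 1) / n * bias ** 2
    return bias, mse

Y = [12.0, 7.0, 30.0, 18.0]                       # illustrative values only
P = [0.1, 0.2, 0.3, 0.4]
print(bias_and_mse_enumerated(Y, P, 3))
print(bias_and_mse_formula(Y, P, 3))
```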
Exercises
4.1 Derive the first and second order inclusion probabilities in PPSWOR when
n = 2.
4.2 Derive the necessary and sufficient condition for the variance estimator to
be nonnegative in PPSWOR when n = 2.
4.3 Suppose the units in a population are grouped on the basis of equality of
their sizes and that each such group has at least n units. Then a sample of n
units is chosen with ppswr from the whole population and repeated units
are replaced by units selected with srswor from the respective groups.
Suggest an unbiased estimator of the population mean and derive its
variance and compare with the usual PPSWR.
4.4 If in a sample of three units, drawn with PPSWR, only two units are
distinct, show that the estimators
$$\frac{1}{3}\left[\frac{y_1}{p_1} + \frac{y_2}{p_2} + \frac{y_1 + y_2}{p_1 + p_2}\right] \quad \text{and} \quad \frac{y_1}{1 - (1 - p_1)^3} + \frac{y_2}{1 - (1 - p_2)^3}$$
are unbiased for the population total.
Chapter 5
Stratified Sampling
5.1 Introduction
In simple random sampling, it has been seen that the precision of the standard
estimator of the population total (mean) depends on two aspects, namely, the
sample size and the variability of the character under study (refer to
expression (2.6) given in Chapter 2). Therefore in order to get an estimator
with increased precision one can increase the sample size. However,
considerations of cost limit the size of the sample. The other possible way to
estimate the population total (mean) with greater precision is to divide the
population into several groups, each of which is more homogeneous than the
entire population, and draw a sample of predetermined size from each of these
groups. The groups into which the population is divided are called strata and
drawing a sample from each of the strata is called stratified sampling. In
stratified sampling, samples are drawn independently from different strata and
it is not necessary to use the same sampling design in all strata. Depending on
the nature of the strata, different sampling designs can be used in different
strata. For example, in the absence of suitable size information, simple random
sampling can be used in a few strata, whereas probability proportional to size
sampling can be used in the remaining strata when size information is available
in those strata.
Notations
$N$ : Population size
$L$ : Number of strata in the population
$N_h$ : Number of units in the stratum h, $h = 1, 2, \ldots, L$
$Y_{hj}$ : y-value of the jth unit in the stratum h, $j = 1, 2, \ldots, N_h$; $h = 1, 2, \ldots, L$
$n_h$ : Sample size corresponding to the stratum h, $h = 1, 2, \ldots, L$
$Y_h$ : Stratum total of the stratum h, $h = 1, 2, \ldots, L$
$\bar{Y}_h$ : Stratum mean of the stratum h, $h = 1, 2, \ldots, L$
$y_{hj}$ : y-value of the jth sampled unit in the stratum h, $j = 1, 2, \ldots, n_h$; $h = 1, 2, \ldots, L$
$\bar{y}_h$ : Stratum sample mean of the stratum h, $h = 1, 2, \ldots, L$
$$S_h^2 = \frac{1}{N_h - 1}\sum_{j=1}^{N_h}\left[Y_{hj} - \bar{Y}_h\right]^2, \quad s_h^2 = \frac{1}{n_h - 1}\sum_{j=1}^{n_h}\left[y_{hj} - \bar{y}_h\right]^2$$
The following theorems help us to identify unbiased estimators for the
population total under different sampling designs and also to obtain their
variances.
Theorem 5.1 If $\hat{Y}_h$, $h = 1, 2, \ldots, L$ is unbiased for the stratum total
$Y_h$ of the stratum h, then an unbiased estimator for the population total Y is
$\hat{Y}_{st} = \sum_{h=1}^{L}\hat{Y}_h$ and its variance is
$V(\hat{Y}_{st}) = \sum_{h=1}^{L} V(\hat{Y}_h)$.
Proof Since $\hat{Y}_h$ is unbiased for the stratum total $Y_h$ of the stratum h, we have
$$E(\hat{Y}_h) = Y_h, \quad h = 1, 2, \ldots, L$$
Therefore
$$E(\hat{Y}_{st}) = \sum_{h=1}^{L} E(\hat{Y}_h) = \sum_{h=1}^{L} Y_h = Y$$
Hence $\hat{Y}_{st} = \sum_{h=1}^{L}\hat{Y}_h$ is unbiased for the population total.
Since samples are drawn independently from different strata,
$cov(\hat{Y}_h, \hat{Y}_k) = 0$ for $h \ne k$.
Therefore
$$V(\hat{Y}_{st}) = \sum_{h=1}^{L} V(\hat{Y}_h) + 2\sum_{h=1}^{L}\sum_{h < k} cov(\hat{Y}_h, \hat{Y}_k) = \sum_{h=1}^{L} V(\hat{Y}_h)$$
Hence the proof.
Corollary 5.1 If $v(\hat{Y}_h)$ is unbiased for $V(\hat{Y}_h)$, $h = 1, 2, \ldots, L$,
then an unbiased estimator of $V(\hat{Y}_{st})$ is
$v(\hat{Y}_{st}) = \sum_{h=1}^{L} v(\hat{Y}_h)$.
Proof of this corollary is straightforward and hence omitted.
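Corollary 5.2 If simple random sampling is used in all the L strata, then an
unbiased estimator of the population total is
$$\hat{Y}_{st} = \sum_{h=1}^{L}\frac{N_h}{n_h}\sum_{j=1}^{n_h} y_{hj}$$
Proof We know that when a simple random sample of size n is drawn from a
population containing N units, $\frac{N}{n}\sum_{i \in s} Y_i$ is unbiased for the
population total. Therefore $\frac{N_h}{n_h}\sum_{j=1}^{n_h} y_{hj}$ is unbiased for
the stratum total $Y_h$, $h = 1, 2, \ldots, L$. Hence we conclude that
$\sum_{h=1}^{L}\frac{N_h}{n_h}\sum_{j=1}^{n_h} y_{hj}$ is unbiased for the population
total Y (refer Theorem 5.1).
Corollary 5.3 If simple random sampling is used in all the L strata, then the
variance of the estimator considered in Corollary 5.2 is
$$V(\hat{Y}_{st}) = \sum_{h=1}^{L}\frac{N_h^2(N_h - n_h)}{N_h n_h} S_h^2$$
Proof Under simple random sampling the variance of
$\hat{Y}_h = \frac{N_h}{n_h}\sum_{j=1}^{n_h} y_{hj}$ is $\frac{N_h^2(N_h - n_h)}{N_h n_h} S_h^2$.
Therefore by Theorem 5.1,
$$V(\hat{Y}_{st}) = \sum_{h=1}^{L}\frac{N_h^2(N_h - n_h)}{N_h n_h} S_h^2$$
Hence the proof.
Corollary 5.4 If simple random sampling is used in all the L strata, then an
unbiased estimator of $V(\hat{Y}_{st})$ considered in Corollary 5.3 is
$$v(\hat{Y}_{st}) = \sum_{h=1}^{L}\frac{N_h^2(N_h - n_h)}{N_h n_h} s_h^2$$
Proof We know that under simple random sampling $s^2$ is unbiased for $S^2$
(refer Theorem 2.4). Therefore an unbiased estimator of $V(\hat{Y}_h)$ considered
in Corollary 5.3 is
$$v(\hat{Y}_h) = \frac{N_h^2(N_h - n_h)}{N_h n_h} s_h^2$$
Hence by Corollary 5.1, an unbiased estimator of $V(\hat{Y}_{st})$ is
$$v(\hat{Y}_{st}) = \sum_{h=1}^{L}\frac{N_h^2(N_h - n_h)}{N_h n_h} s_h^2$$
Hence the proof.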
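For simple random sampling within every stratum, the estimator of the total and the variance estimator with weights $\frac{N_h^2(N_h - n_h)}{N_h n_h}$ can be computed as in the following sketch. The function name and the sample values are hypothetical, and each stratum sample is assumed to contain at least two units so that $s_h^2$ is defined.

```python
def stratified_total_estimate(N_h, samples):
    """Estimate of the population total and of its variance, SRS in every stratum.

    N_h     : list of stratum sizes
    samples : list of lists, the observed y-values in each stratum (n_h >= 2)
    """
    estimate = variance = 0.0
    for N, ys in zip(N_h, samples):
        n = len(ys)
        mean = sum(ys) / n
        s2 = sum((y - mean) ** 2 for y in ys) / (n - 1)   # sample variance s_h^2
        estimate += N * mean                              # (N_h / n_h) * sum of y_hj
        variance += N * (N - n) / n * s2                  # N_h^2 (N_h - n_h) / (N_h n_h) * s_h^2
    return estimate, variance

# Illustrative data: two strata of sizes 10 and 20.
print(stratified_total_estimate([10, 20], [[1.0, 2.0, 3.0], [4.0, 6.0]]))
```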
5.2 Sample Size Allocation
Once the sample size n is fixed, the next question is that of deciding the
sample size $n_h$ meant for the stratum h, $h = 1, 2, \ldots, L$. In this section
some solutions are given assuming that simple random sampling is used in all
the L strata. Two popular allocation techniques are (i) Proportional allocation
and (ii) Neyman allocation.
Proportional Allocation
Under proportional allocation the number of units to be sampled from the
stratum h is made proportional to the stratum size. That is,
$$n_h \propto N_h, \quad h = 1, 2, \ldots, L$$
$$n_h = kN_h$$
where k is the constant of proportionality. Summing both the sides of the above
expression we obtain
$$\sum_{h=1}^{L} n_h = k\sum_{h=1}^{L} N_h \;\Rightarrow\; k = \frac{n}{N}$$
Therefore $n_h = \frac{n}{N}N_h$, $h = 1, 2, \ldots, L$.
The following theorem gives an unbiased estimator for the population total and
its variance under proportional allocation.
Theorem 5.2 Under proportional allocation, an unbiased estimator for the
population total is
$$\hat{Y}_{st} = \frac{N}{n}\sum_{h=1}^{L}\sum_{j=1}^{n_h} y_{hj}$$
and its variance is
$$V(\hat{Y}_{st}) = \frac{N^2(N - n)}{Nn}\sum_{h=1}^{L}\frac{N_h}{N} S_h^2$$
Proof We know that under proportional allocation
$$n_h = \frac{n}{N}N_h, \quad h = 1, 2, \ldots, L$$
Substituting these values in the expressions given in Corollaries 5.2 and 5.3, we
get the required results after simplification.
The above discussion gives the sample sizes under proportional allocation when
the total sample size is known in advance and it does not take into account the
cost involved under the allocation. Normally cost will always be a constraint in
the organisation of any sample survey. Therefore it is of interest to consider
proportional allocation for a given cost. Let $c_h$, $h = 1, 2, \ldots, L$ be the
cost of collecting information from a unit in stratum h. (These costs can differ
substantially between strata. For example, information from large establishments
can be obtained cheaply if we mail them a questionnaire, whereas small
establishments may have to be personally contacted in order to get reliable data.)
Therefore the total cost of the survey can be taken as
$$C = C_0 + \sum_{h=1}^{L} c_h n_h \qquad (5.1)$$
where $C_0$ is the fixed cost. When the sample size $n_h$ is proportional to the
stratum size, we have
$$n_h = kN_h, \quad h = 1, 2, \ldots, L \qquad (5.2)$$
where k is the constant of proportionality. Summing both the sides of (5.2) with
respect to h after multiplying by $c_h$ we get
$$\sum_{h=1}^{L} c_h n_h = k\sum_{h=1}^{L} c_h N_h$$
Using (5.1) in the above expression we get
$$C - C_0 = k\sum_{h=1}^{L} c_h N_h \;\Rightarrow\; k = \frac{C - C_0}{\sum_{h=1}^{L} c_h N_h}$$
Therefore the proportional allocation for a given cost is given by
$$n_h = \frac{C - C_0}{\sum_{h=1}^{L} c_h N_h} N_h \qquad (5.3)$$
Summing both sides with respect to h, $h = 1, 2, \ldots, L$, we get the total sample
size as
$$n = \frac{C - C_0}{\sum_{h=1}^{L} c_h N_h} N \qquad (5.4)$$
Under the above allocation the variance of the estimator defined in Theorem 5.1
is
$$V(\hat{Y}_{st}) = \left[\frac{\sum_{h=1}^{L} c_h N_h}{C - C_0} - 1\right]\sum_{h=1}^{L} N_h S_h^2 \qquad (5.5)$$
Optimum Allocation
The proportional allocations described above do not take into account any factor
other than strata sizes. They completely ignore the internal structure of strata
like within stratum variability etc., and hence it is desirable to consider an
allocation scheme which takes into account these aspects. In this section two
allocation schemes which minimise the variance of the estimator are considered.
Since minimum variance is an optimal property, these allocations are called
"Optimum allocations". Note that under simple random sampling the variance of
$\hat{Y}_{st} = \sum_{h=1}^{L}\hat{Y}_h$ can be expressed as
$$V(\hat{Y}_{st}) = \sum_{h=1}^{L}\frac{N_h^2(N_h - n_h)}{N_h n_h} S_h^2 = \sum_{h=1}^{L}\frac{N_h^2 S_h^2}{n_h} - \sum_{h=1}^{L} N_h S_h^2$$
Global minimisation of the above variance with respect to $n_1, n_2, \ldots, n_L$
does not yield a nontrivial solution (see what happens when the first order
partial derivatives with respect to $n_1, n_2, \ldots, n_L$ are equated to zero).
Therefore in order to get nontrivial solutions for $n_1, n_2, \ldots, n_L$, we
resort to conditional minimisation. Two standard conditional minimisation
techniques are (i) Minimising the variance for a given cost and (ii) Minimising
the variance for a given sample size. The solution given by the latter will be
referred to as "Neyman optimum allocation" and the former allocation will be
referred to as "Cost optimum allocation". The expressions for the sample sizes
under the two types of allocations mentioned above are derived below.
(i) Neyman optimum allocation
As mentioned above, under Neyman allocation the variance of the estimator is
minimised by fixing the total sample size. That is, we need the values of
$n_1, n_2, \ldots, n_L$ which minimise
$$\sum_{h=1}^{L}\frac{N_h^2 S_h^2}{n_h} - \sum_{h=1}^{L} N_h S_h^2$$
subject to the condition $\sum_{h=1}^{L} n_h = n$.
To solve the above problem consider the function
$$\phi = \sum_{h=1}^{L}\frac{N_h^2 S_h^2}{n_h} - \sum_{h=1}^{L} N_h S_h^2 + \lambda\left\{\sum_{h=1}^{L} n_h - n\right\} \qquad (5.6)$$
where $\lambda$ is the Lagrangian multiplier. Differentiating the above function
partially with respect to $n_h$ and equating the derivatives to zero we get
$$-\frac{N_h^2 S_h^2}{n_h^2} + \lambda = 0, \quad h = 1, 2, \ldots, L \qquad (5.7)$$
Differentiating the function $\phi$ with respect to $\lambda$ and equating the
derivative to zero we get
$$\sum_{h=1}^{L} n_h = n \qquad (5.8)$$
By (5.7), $n_h = \frac{N_h S_h}{\sqrt{\lambda}}$. Summing both the sides with respect
to h from 1 to L, we get
$$\sum_{h=1}^{L} n_h = \frac{\sum_{h=1}^{L} N_h S_h}{\sqrt{\lambda}} \;\Rightarrow\; n = \frac{\sum_{h=1}^{L} N_h S_h}{\sqrt{\lambda}}$$
Therefore
$$\sqrt{\lambda} = \frac{\sum_{h=1}^{L} N_h S_h}{n} \qquad (5.9)$$
Using (5.9) in (5.7) we get
$$n_h = \frac{N_h S_h}{\sum_{h=1}^{L} N_h S_h}\,n \qquad (5.10)$$
The expression given in (5.10) can be used to calculate the sample sizes for
different strata. It can be seen that the matrix of second order partial
derivatives is positive definite for the values satisfying (5.7) and (5.8).
Therefore we conclude that the values yielded by (5.10) minimise the variance
of the estimator for the given sample size.
Under the above allocation the variance of the estimator reduces to
$$\frac{1}{n}\left\{\sum_{h=1}^{L} N_h S_h\right\}^2 - \sum_{h=1}^{L} N_h S_h^2 \qquad (5.11)$$
This expression is obtained by using (5.10) in $V(\hat{Y}_{st})$ under simple random
sampling.
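A small numerical sketch of the two fixed-sample-size allocations follows. The function names and the stratum figures are hypothetical; the allocations are returned as real numbers and would need rounding in practice, and the variance function uses $\sum N_h^2 S_h^2/n_h - \sum N_h S_h^2$.

```python
def proportional_allocation(N_h, n):
    """n_h = n * N_h / N."""
    N = sum(N_h)
    return [n * Nh / N for Nh in N_h]

def neyman_allocation(N_h, S_h, n):
    """n_h = n * N_h S_h / sum(N_h S_h), as in (5.10)."""
    weights = [N * S for N, S in zip(N_h, S_h)]
    total = sum(weights)
    return [n * w / total for w in weights]

def stratified_variance(N_h, S_h, n_h):
    """V(Y_st) = sum(N_h^2 S_h^2 / n_h) - sum(N_h S_h^2) under SRS in every stratum."""
    return sum(N ** 2 * S ** 2 / m - N * S ** 2 for N, S, m in zip(N_h, S_h, n_h))

N_h, S_h, n = [100, 200, 300], [5.0, 10.0, 2.0], 60      # illustrative values only
prop = proportional_allocation(N_h, n)
ney = neyman_allocation(N_h, S_h, n)
print(prop, stratified_variance(N_h, S_h, prop))
print(ney, stratified_variance(N_h, S_h, ney))
```

With these figures the stratum with the largest $N_h S_h$ receives the largest share under Neyman allocation, and the resulting variance agrees with the closed form (5.11).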
(ii) Cost Optimum Allocation
Under cost-optimum allocation the sample sizes are determined by minimising
the variance of the estimator by fixing the total cost of the survey. As in the
case of proportional allocation for a given cost, the total cost of the survey
can be taken as $C = C_0 + \sum_{h=1}^{L} c_h n_h$ where $C_0$ is the fixed cost and
$c_h$ is the cost per unit in stratum h, $h = 1, 2, \ldots, L$.
Define
$$\phi = \sum_{h=1}^{L}\frac{N_h^2 S_h^2}{n_h} - \sum_{h=1}^{L} N_h S_h^2 + \lambda\left\{C_0 + \sum_{h=1}^{L} c_h n_h - C\right\} \qquad (5.12)$$
Differentiating the above function partially with respect to $n_h$ and equating
the derivatives to zero we get
$$n_h = \frac{N_h S_h}{\sqrt{\lambda}\sqrt{c_h}} \qquad (5.13)$$
Differentiating the function $\phi$ with respect to $\lambda$ and equating the
derivative to zero we get
$$C_0 + \sum_{h=1}^{L} c_h n_h = C \;\Rightarrow\; \sum_{h=1}^{L} c_h n_h = C - C_0 \qquad (5.14)$$
Multiplying both the sides of (5.13) by $c_h$ and summing with respect to h from
1 to L, we get
$$\sum_{h=1}^{L} c_h n_h = \frac{\sum_{h=1}^{L} N_h S_h\sqrt{c_h}}{\sqrt{\lambda}}$$
$$\sqrt{\lambda} = \frac{\sum_{h=1}^{L} N_h S_h\sqrt{c_h}}{\sum_{h=1}^{L} c_h n_h} = \frac{\sum_{h=1}^{L} N_h S_h\sqrt{c_h}}{C - C_0} \quad \text{(using (5.14))}$$
Using this expression in (5.13) we get
$$n_h = \frac{\dfrac{N_h S_h}{\sqrt{c_h}}(C - C_0)}{\sum_{h=1}^{L} N_h S_h\sqrt{c_h}}, \quad h = 1, 2, \ldots, L \qquad (5.15)$$
The expression given above gives the optimum allocation of the sample for a
given cost. It can be shown that the matrix of the second order partial
derivatives is positive definite. Hence we conclude that this allocation
minimises the variance for a given cost.
The expression given in (5.15) leads to the following conclusions. In a given
stratum, take a larger sample if: (1) the stratum size is larger; (2) the stratum
has more internal variation with respect to the variable under study; (3) sampling
is cheaper in the stratum.
Summing both sides of the equation (5.15) with respect to h from 1 to L,
we get the total sample size under the cost-optimum allocation as
$$n = \frac{(C - C_0)\sum_{h=1}^{L}\dfrac{N_h S_h}{\sqrt{c_h}}}{\sum_{h=1}^{L} N_h S_h\sqrt{c_h}} \qquad (5.16)$$
The variance of the conventional estimator under the cost-optimum allocation
reduces to
$$\frac{\left\{\sum_{h=1}^{L} N_h S_h\sqrt{c_h}\right\}^2}{C - C_0} - \sum_{h=1}^{L} N_h S_h^2 \qquad (5.17)$$
This expression is obtained by using (5.15) in $V(\hat{Y}_{st})$ under simple random
sampling.
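The cost-optimum allocation can be illustrated in the same way. In this sketch the function name, the unit costs and the budget are hypothetical; the allocation is $n_h \propto N_h S_h/\sqrt{c_h}$ scaled so that the variable cost $\sum c_h n_h$ equals $C - C_0$.

```python
from math import sqrt

def cost_optimum_allocation(N_h, S_h, c_h, C, C0=0.0):
    """n_h = (C - C0) * (N_h S_h / sqrt(c_h)) / sum(N_h S_h sqrt(c_h))."""
    denom = sum(N * S * sqrt(c) for N, S, c in zip(N_h, S_h, c_h))
    return [(C - C0) * N * S / sqrt(c) / denom for N, S, c in zip(N_h, S_h, c_h)]

N_h, S_h, c_h = [100, 200, 300], [5.0, 10.0, 2.0], [1.0, 4.0, 9.0]   # illustrative values
n_h = cost_optimum_allocation(N_h, S_h, c_h, C=1000.0, C0=100.0)
spent = sum(c * m for c, m in zip(c_h, n_h))                          # equals C - C0
variance = sum(N ** 2 * S ** 2 / m - N * S ** 2 for N, S, m in zip(N_h, S_h, n_h))
print(n_h, spent, variance)
```

Raising the unit cost of a stratum while keeping $N_h S_h$ fixed lowers its share of the sample, which is the third conclusion drawn from the allocation formula.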
The following theorem compares the variance of conventional estimator
under simple random sampling, proportional allocation for a given sample size
and optimum allocation for a given sample size.
Theorem 5.3 Let $V_{ran}$, $V_{prop}$ and $V_{opt}$ be the variances of the usual
estimators under simple random sampling, proportional allocation and optimum
allocation for a given sample size. If $N_h$ is large then
$$V_{ran} \ge V_{prop} \ge V_{opt}$$
Proof We know that under simple random sampling, the variance of the
conventional estimator for the population total is
$$V_{ran} = \frac{N^2(N - n)}{Nn} S^2 \qquad (5.18)$$
Note that
$$(N - 1)S^2 = \sum_{h=1}^{L}\sum_{j=1}^{N_h}\left[Y_{hj} - \bar{Y}\right]^2 = \sum_{h=1}^{L}\sum_{j=1}^{N_h}\left[Y_{hj} - \bar{Y}_h\right]^2 + \sum_{h=1}^{L} N_h\left[\bar{Y}_h - \bar{Y}\right]^2$$
$$= \sum_{h=1}^{L}(N_h - 1)S_h^2 + \sum_{h=1}^{L} N_h\left[\bar{Y}_h - \bar{Y}\right]^2 \qquad (5.19)$$
Therefore
$$S^2 = \sum_{h=1}^{L} W_h S_h^2 + \sum_{h=1}^{L} W_h\left[\bar{Y}_h - \bar{Y}\right]^2 \qquad (5.20)$$
where $W_h = \frac{N_h}{N}$. This is obtained by using the fact that $N_h$ and hence
N is large.
Using (5.20) in (5.18) we get
$$V_{ran} = \frac{N^2(N - n)}{Nn}\left[\sum_{h=1}^{L} W_h S_h^2 + \sum_{h=1}^{L} W_h\left[\bar{Y}_h - \bar{Y}\right]^2\right] = V_{prop} + \frac{N^2(N - n)}{Nn}\sum_{h=1}^{L} W_h\left[\bar{Y}_h - \bar{Y}\right]^2 \quad \text{(by Theorem 5.2)}$$
Therefore
$$V_{ran} \ge V_{prop} \qquad (*)$$
By expression (5.11) we have
$$V_{opt} = \frac{1}{n}\left\{\sum_{h=1}^{L} N_h S_h\right\}^2 - \sum_{h=1}^{L} N_h S_h^2$$
Therefore
$$V_{prop} - V_{opt} = \frac{N - n}{n}\sum_{h=1}^{L} N_h S_h^2 - \frac{1}{n}\left\{\sum_{h=1}^{L} N_h S_h\right\}^2 + \sum_{h=1}^{L} N_h S_h^2$$
$$= \frac{N}{n}\sum_{h=1}^{L} N_h S_h^2 - \frac{N^2}{n}\bar{S}^2 \quad \text{where } \bar{S} = \frac{1}{N}\sum_{h=1}^{L} N_h S_h$$
$$= \frac{N}{n}\sum_{h=1}^{L} N_h(S_h - \bar{S})^2 \qquad (5.21)$$
Therefore
$$V_{prop} \ge V_{opt} \qquad (**)$$
The result follows from (*) and (**).
Note From (5.21) we have $V_{prop} = V_{opt} + \frac{N}{n}\sum_{h=1}^{L} N_h(S_h - \bar{S})^2$.
Therefore
$$V_{ran} = V_{opt} + \frac{N}{n}\sum_{h=1}^{L} N_h(S_h - \bar{S})^2 + \frac{N - n}{n}\sum_{h=1}^{L} N_h\left[\bar{Y}_h - \bar{Y}\right]^2$$
This expression leads to the conclusion that as we change from simple random
sampling to optimum allocation with fixed sample size, a considerable amount of
precision can be gained by forming the strata such that the variation between
stratum means and between stratum standard deviations is large.
5.3 Comparison with Other Schemes
(1) Comparison under populations with linear trend
Suppose that the population is divided (assuming that N = nk, n and k being
integers) into n strata where the stratum h contains units with labels
$$G_h = \{(h - 1)k + j, \; j = 1, 2, \ldots, k\}, \quad h = 1, 2, \ldots, n$$
and one unit is selected from each stratum randomly to get a sample of size n.
Under the above stratification-sampling scheme an unbiased estimator for the
population total is given by
$$\hat{Y}_0 = \sum_{h=1}^{n}\frac{N_h}{n_h} y_{h1} = k\sum_{h=1}^{n} y_{h1}$$
This estimator is derived from Corollary 5.2 by taking L = n, $N_h = k$, $n_h = 1$.
On applying Corollary 5.3, we obtain the variance of the above estimator as
$$\sum_{h=1}^{n}\frac{k^2(k - 1)}{k}\,\frac{1}{k - 1}\sum_{j=1}^{k}\left[Y_{hj} - \bar{Y}_h\right]^2$$
which reduces to
$$k\sum_{h=1}^{n}\sum_{j=1}^{k}\left[Y_{hj} - \bar{Y}_h\right]^2 \qquad (5.22)$$
When the population values are modeled by the relation
$$Y_i = a + bi, \quad i = 1, 2, \ldots, N$$
under the stratification scheme described earlier, we have
$$Y_{hj} = a + b[(h - 1)k + j]$$
Therefore
$$\bar{Y}_h = \frac{1}{k}\sum_{j=1}^{k}\{a + b[(h - 1)k + j]\} = a + b\left\{(h - 1)k + \frac{k + 1}{2}\right\}$$
$$Y_{hj} - \bar{Y}_h = b\left\{j - \frac{k + 1}{2}\right\}$$
Squaring both the sides and summing with respect to j from 1 to k, we get
$$\sum_{j=1}^{k}\left[Y_{hj} - \bar{Y}_h\right]^2 = b^2\sum_{j=1}^{k}\left\{j^2 + \frac{(k + 1)^2}{4} - (k + 1)j\right\}$$
$$= b^2\left\{\frac{k(k + 1)(2k + 1)}{6} + \frac{k(k + 1)^2}{4} - \frac{k(k + 1)^2}{2}\right\} = b^2\frac{k(k^2 - 1)}{12}$$
Using this in (5.22) we get the variance of the estimator $\hat{Y}_0$ as
$$V(\hat{Y}_0) = b^2\frac{nk^2(k^2 - 1)}{12} \qquad (5.23)$$
Already we have seen in Chapter 3, for populations exhibiting linear trend,
$$V(\hat{Y}_{srs}) = b^2\frac{n^2k^2(k - 1)(nk + 1)}{12} \qquad (5.24)$$
$$V(\hat{Y}_{LSS}) = b^2\frac{n^2k^2(k^2 - 1)}{12} \qquad (5.25)$$
Denoting the variances given in (5.23), (5.24) and (5.25) by $V_{st}$, $V_{ran}$,
$V_{sys}$, we obtain $V_{st} \le V_{sys} \le V_{ran}$.
From the above inequality, we conclude that the stratification-estimation
scheme described in this section is better than both simple random sampling and
systematic sampling for populations exhibiting linear trend.
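The three variances can be computed by brute force for a linear population and compared with the closed forms. In the sketch below the function name and the parameter values are hypothetical; the closed forms assumed are $b^2 nk^2(k^2-1)/12$ for one unit per stratum, $b^2 n^2k^2(k^2-1)/12$ for linear systematic sampling and $b^2 n^2k^2(k-1)(nk+1)/12$ for simple random sampling.

```python
def variances_linear_trend(a, b, n, k):
    """Exact variances (stratified with one unit per stratum, systematic, SRS)
    of the estimators of the total for Y_i = a + b*i, i = 1..N, with N = n*k."""
    N = n * k
    Y = [a + b * i for i in range(1, N + 1)]
    total, mean = sum(Y), sum(Y) / N

    # Simple random sampling: N(N - n)/n * S^2.
    S2 = sum((y - mean) ** 2 for y in Y) / (N - 1)
    v_ran = N * (N - n) / n * S2

    # Linear systematic sampling: k equally likely samples {r, r + k, ...}.
    sys_estimates = [k * sum(Y[r + j * k] for j in range(n)) for r in range(k)]
    v_sys = sum((e - total) ** 2 for e in sys_estimates) / k

    # One unit per stratum of k consecutive labels: k * sum of within-stratum squares.
    v_st = 0.0
    for h in range(n):
        stratum = Y[h * k:(h + 1) * k]
        m = sum(stratum) / k
        v_st += k * sum((y - m) ** 2 for y in stratum)
    return v_st, v_sys, v_ran

print(variances_linear_trend(a=2.0, b=3.0, n=4, k=5))
```

For a = 2, b = 3, n = 4 and k = 5 the closed forms give 1800, 7200 and 25200 respectively.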
(2) Comparison under Autocorrelated trend
Assuming that the N population values are the realized values of N random
variables having a joint distribution such that
$$E_M[Y_i] = \mu, \quad E_M[Y_i - \mu]^2 = \sigma^2 \quad \text{and} \quad E_M[Y_i - \mu][Y_{i+u} - \mu] = \rho_u\sigma^2$$
where $\rho_u \ge \rho_v$ whenever $u < v$, we have proved that (Theorem 3.8)
$$E_M V(\hat{Y}_{srs}) = \frac{\sigma^2(k - 1)N^2}{nk}\left[1 - \frac{2}{N(N - 1)}\sum_{u=1}^{N-1}(N - u)\rho_u\right] \qquad (5.26)$$
The expected variance of the estimator $\hat{Y}_0$ under the above model is given by
$$E_M[V(\hat{Y}_0)] = E_M\left\{k\sum_{h=1}^{n}\sum_{j=1}^{k}\left[Y_{hj} - \bar{Y}_h\right]^2\right\} \qquad (5.27)$$
$$= \frac{\sigma^2(k - 1)N^2}{nk}\left[1 - \frac{2}{k(k - 1)}\sum_{u=1}^{k-1}(k - u)\rho_u\right]$$
(refer Theorem 3.21)
Define
$$L(j) = \frac{2}{j(j - 1)}\sum_{u=1}^{j-1}(j - u)\rho_u, \quad j = 2, 3, \ldots \qquad (5.29)$$
Then
$$E_M V(\hat{Y}_{srs}) = \frac{\sigma^2(k - 1)N^2}{nk}[1 - L(nk)] \qquad (5.30)$$
and
$$E_M V(\hat{Y}_0) = \frac{\sigma^2(k - 1)N^2}{nk}[1 - L(k)] \qquad (5.31)$$
Thus in order to prove $E_M V(\hat{Y}_0) \le E_M V(\hat{Y}_{srs})$ it is enough to show that
$$L(nk) \le L(k) \qquad (5.32)$$
Consider the difference
$$L(j) - L(j + 1) = \frac{2}{j(j^2 - 1)}\sum_{u=1}^{j}(j + 1 - 2u)\rho_u \qquad (5.33)$$
If S stands for the summation term in the right hand side of (5.33), grouping
together the terms equidistant from the beginning and end, S can be written as
$$S = \sum_{u=1}^{m}[2m + 1 - 2u][\rho_u - \rho_{2m+1-u}] \quad \text{if } j = 2m \text{ is even}$$
$$S = \sum_{u=1}^{m}[2m + 2 - 2u][\rho_u - \rho_{2m+2-u}] \quad \text{if } j = 2m + 1 \text{ is odd}$$
Since $\rho_i \ge \rho_{i+1}$ for all i, every term in S is nonnegative. Therefore S
is nonnegative, that is, $L(j) \ge L(j + 1)$. Hence we conclude that L is a
nonincreasing function. Therefore $L(nk) \le L(k)$. This leads to the conclusion
that the average variance of the conventional estimator under simple random
sampling is larger than the average variance of the estimator $\hat{Y}_0$
introduced in this section. However no such general result can be proved about
the efficiency of systematic sampling relative to simple random sampling or
stratified sampling unless further restrictions are imposed on the correlations
$\rho_u$. The following theorem is due to Cochran (1946).
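Theorem 5.4 If $\rho_i \ge \rho_{i+1} \ge 0$, $i = 1, 2, \ldots, N-2$, and
$\delta_i^2 = \rho_{i+2} + \rho_i - 2\rho_{i+1} \ge 0$, $i = 1, 2, \ldots, N-2$, then
$$E_M[V(\hat{Y}_{LSS})] \le E_M[V(\hat{Y}_0)] \le E_M[V(\hat{Y}_{srs})].$$
Furthermore, unless $\delta_i^2 = 0$, $i = 1, 2, \ldots, N-2$,
$E_M[V(\hat{Y}_{LSS})] < E_M[V(\hat{Y}_0)]$.
Proof As $\delta_i^2 = \rho_{i+2} + \rho_i - 2\rho_{i+1} \ge 0$, $i = 1, 2, \ldots, N-2$, we have
$\rho_{i+2} - \rho_{i+1} \ge \rho_{i+1} - \rho_i$. By induction it can be shown that
$\rho_{i+c+1} - \rho_{i+c} \ge \rho_{i+1} - \rho_i$ for any integer $c > 0$.
Hence for any integers $a, c > 0$ we have
$$\sum_{i=a}^{a+c-1} [\rho_{i+c+1} - \rho_{i+c}] \ge \sum_{i=a}^{a+c-1} [\rho_{i+1} - \rho_i]$$
which gives
$$\rho_{a+2c} + \rho_a - 2\rho_{a+c} \ge 0 \tag{5.34}$$
Consider the difference
$$E_M[V(\hat{Y}_0)] - E_M[V(\hat{Y}_{LSS})] = \frac{2\sigma^2 (k-1) N^2}{N\,nk\,(k-1)} \left[\sum_{u=1}^{nk-1} (nk-u)\rho_u - k^2 \sum_{u=1}^{n-1} (n-u)\rho_{ku} - n \sum_{u=1}^{k-1} (k-u)\rho_u\right] \tag{5.35}$$
Note that
$$\sum_{u=1}^{nk-1} (nk-u)\rho_u = \sum_{i=1}^{k} \sum_{j=0}^{n-1} [nk - (i+jk)]\rho_{i+jk}$$
$$= \sum_{i=1}^{k-1} \sum_{j=0}^{n-1} [nk - (i+jk)]\rho_{i+jk} + k \sum_{j=1}^{n-1} (n-j)\rho_{jk}$$
$$= \sum_{i=1}^{k-1} \sum_{j=1}^{n-1} (n-j)(k-i)\rho_{i+jk} + k \sum_{j=1}^{n-1} (n-j)\rho_{jk} + \sum_{i=1}^{k-1} \sum_{j=0}^{n-2} i(n-j-1)\rho_{jk+i} + n \sum_{i=1}^{k-1} (k-i)\rho_i \tag{5.36}$$
Since
$$\sum_{i=1}^{k-1} \sum_{j=0}^{n-2} i(n-j-1)\rho_{jk+i} = \sum_{i=1}^{k-1} \sum_{j=1}^{n-1} i(n-j)\rho_{jk-(k-i)} = \sum_{i=1}^{k-1} \sum_{j=1}^{n-1} (k-i)(n-j)\rho_{jk-i} \tag{5.37}$$
the expression inside the square braces of (5.35) can be written as
$$\sum_{i=1}^{k-1} \sum_{j=1}^{n-1} (k-i)(n-j)\rho_{jk+i} + \sum_{i=1}^{k-1} \sum_{j=1}^{n-1} (k-i)(n-j)\rho_{jk-i} - k(k-1) \sum_{j=1}^{n-1} (n-j)\rho_{jk}$$
which, because $\sum_{i=1}^{k-1} (k-i) = k(k-1)/2$, is equal to
$$\sum_{i=1}^{k-1} \sum_{j=1}^{n-1} (k-i)(n-j)\,[\rho_{jk+i} + \rho_{jk-i} - 2\rho_{jk}]$$
By (5.34) this is clearly nonnegative. Therefore $E_M[V(\hat{Y}_0)] \ge E_M[V(\hat{Y}_{LSS})]$.
Further, from the above expression it can be seen that the inequality will be
strict unless $\delta_i^2 = 0$, $i = 1, 2, \ldots, N-2$. Hence the proof.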
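The ordering in Theorem 5.4 can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the illustrative correlogram $\rho_u = e^{-0.2u}$ (positive, decreasing and convex), drops the common factor $\sigma^2(k-1)N^2/(nk)$, and uses for systematic sampling the bracketed term implied by (5.27) together with (5.35). The function names are hypothetical.

```python
import math

def L(j, rho):
    """L(j) = 2/(j(j-1)) * sum_{u=1}^{j-1} (j-u) rho_u, as in (5.29)."""
    return 2.0 / (j * (j - 1)) * sum((j - u) * rho(u) for u in range(1, j))

def expected_variance_brackets(n, k, rho):
    """Bracketed factors of the model-expected variances (common factor dropped)."""
    nk = n * k
    srs = 1.0 - L(nk, rho)          # simple random sampling, (5.30)
    strat = 1.0 - L(k, rho)         # estimator Y_0, (5.31)
    # Systematic sampling: bracket implied by (5.27) and (5.35) (an assumption
    # of this sketch; the formula itself is not displayed in this section).
    sys_ = (1.0
            - 2.0 / (nk * (k - 1)) * sum((nk - u) * rho(u) for u in range(1, nk))
            + 2.0 * k / (n * (k - 1)) * sum((n - u) * rho(k * u) for u in range(1, n)))
    return srs, strat, sys_

# Illustrative convex, decreasing, positive correlogram (assumed, not from the text).
rho = lambda u: math.exp(-0.2 * u)
srs, strat, sys_ = expected_variance_brackets(n=5, k=4, rho=rho)
print(sys_, strat, srs)
```

Under these assumptions the three printed values should appear in increasing order, in line with the theorem.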
5.4 Problems and Solutions
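Problem 5.1 A sampler has two strata with relative sizes $W_1 = N_1/N$ and
$W_2 = N_2/N$. He believes that $S_1$, $S_2$ can be taken as equal. For a given cost
$C = c_1 n_1 + c_2 n_2$, show that (assuming $N_h$ is large)
$$\frac{V_{prop}}{V_{opt}} = \frac{W_1 c_1 + W_2 c_2}{[W_1 \sqrt{c_1} + W_2 \sqrt{c_2}]^2}$$
Solution When $N_h$ is large,
$$V(\hat{Y}_{st}) = \sum_{h=1}^{2} N_h^2 \left[\frac{N_h - n_h}{N_h n_h}\right] S_h^2 = \sum_{h=1}^{2} N_h^2 \left[\frac{1}{n_h} - \frac{1}{N_h}\right] S_h^2 \approx \sum_{h=1}^{2} \frac{N_h^2 S_h^2}{n_h} \tag{5.38}$$
For the given cost, under proportional allocation we have
$$n_h = \frac{C N_h}{c_1 N_1 + c_2 N_2}, \quad h = 1, 2$$
This expression is obtained from (5.3) by taking $C_0 = 0$ and $L = 2$.
Substituting these values in (5.38) we get
$$V_{prop} = \frac{N_1^2 S_1^2 (c_1 N_1 + c_2 N_2)}{C N_1} + \frac{N_2^2 S_2^2 (c_1 N_1 + c_2 N_2)}{C N_2} = \frac{c_1 N_1 + c_2 N_2}{C}\left\{N_1 S_1^2 + N_2 S_2^2\right\}$$
$$= \frac{c_1 N_1 + c_2 N_2}{C}\, S^2 N \tag{5.39}$$
The above expression is obtained by taking $N_1 + N_2 = N$ and $S_1 = S_2 = S$.
For the given cost, under optimum allocation we have
$$n_h = \frac{C N_h S_h / \sqrt{c_h}}{N_1 S_1 \sqrt{c_1} + N_2 S_2 \sqrt{c_2}}, \quad h = 1, 2$$
This expression is obtained from (5.15) by taking $C_0 = 0$ and $L = 2$. The
variance of the standard estimator can be obtained from (5.38) by substituting the
sample size values given above. It turns out to be
$$V_{opt} = \frac{N_1^2 S_1^2 \sqrt{c_1}\,[N_1 S_1 \sqrt{c_1} + N_2 S_2 \sqrt{c_2}]}{C N_1 S_1} + \frac{N_2^2 S_2^2 \sqrt{c_2}\,[N_1 S_1 \sqrt{c_1} + N_2 S_2 \sqrt{c_2}]}{C N_2 S_2}$$
$$= \frac{[N_1 S_1 \sqrt{c_1} + N_2 S_2 \sqrt{c_2}]^2}{C} = \frac{S^2\,[N_1 \sqrt{c_1} + N_2 \sqrt{c_2}]^2}{C} \tag{5.40}$$
Therefore by (5.39) and (5.40), on dividing the numerator and the denominator
by $N^2$, we get
$$\frac{V_{prop}}{V_{opt}} = \frac{N (c_1 N_1 + c_2 N_2)}{[N_1 \sqrt{c_1} + N_2 \sqrt{c_2}]^2} = \frac{W_1 c_1 + W_2 c_2}{[W_1 \sqrt{c_1} + W_2 \sqrt{c_2}]^2}$$
Hence the solution.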
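The result of Problem 5.1 can be verified numerically. The sketch below is illustrative only: the stratum sizes, unit costs and budget are made-up values, the finite population correction is ignored as in (5.38), and the function names are hypothetical.

```python
import math

def v_prop(N, S, c, C):
    """Variance (5.38) under proportional allocation with total cost C."""
    total = sum(ch * Nh for ch, Nh in zip(c, N))
    n = [C * Nh / total for Nh in N]
    return sum(Nh**2 * Sh**2 / nh for Nh, Sh, nh in zip(N, S, n))

def v_opt(N, S, c, C):
    """Variance (5.38) under cost-optimum allocation with total cost C."""
    denom = sum(Nh * Sh * math.sqrt(ch) for Nh, Sh, ch in zip(N, S, c))
    n = [C * Nh * Sh / math.sqrt(ch) / denom for Nh, Sh, ch in zip(N, S, c)]
    return sum(Nh**2 * Sh**2 / nh for Nh, Sh, nh in zip(N, S, n))

# Made-up inputs: two strata with equal S_h, as the problem assumes.
N, S, c, C = [6000, 4000], [5.0, 5.0], [4.0, 9.0], 1000.0
W = [Nh / sum(N) for Nh in N]
ratio = v_prop(N, S, c, C) / v_opt(N, S, c, C)
claimed = (W[0] * c[0] + W[1] * c[1]) / (W[0] * math.sqrt(c[0]) + W[1] * math.sqrt(c[1])) ** 2
print(ratio, claimed)
```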
Problem 5.2 With two strata, a sampler would like to have $n_1 = n_2$ for
administrative convenience, instead of using the values given by the Neyman
allocation. If $V$ and $V_{opt}$ denote the variances given by the $n_1 = n_2$ and the
Neyman allocations, respectively, show that the fractional increase in variance is
$$\frac{V - V_{opt}}{V_{opt}} = \left[\frac{r-1}{r+1}\right]^2$$
where $r = n_1/n_2$ as given by the Neyman allocation. Assume that $N_1$ and $N_2$
are large.
Solution Under equal allocation we have $n_1 = n_2 = n/2$. Substituting this in
(5.38) we get (with $L = 2$)
$$V = \frac{2}{n}\left[N_1^2 S_1^2 + N_2^2 S_2^2\right] \tag{5.41}$$
Under Neyman allocation we have
$$n_1 = \frac{N_1 S_1}{N_1 S_1 + N_2 S_2}\, n \quad \text{and} \quad n_2 = \frac{N_2 S_2}{N_1 S_1 + N_2 S_2}\, n$$
Substituting these values in (5.38) we get
$$V_{opt} = \frac{1}{n}\left[N_1 S_1 + N_2 S_2\right]^2 \tag{5.42}$$
By the definition of $r$, we have $r = \frac{N_1 S_1}{N_2 S_2}$. Using this in $V$ and $V_{opt}$ given in
(5.41) and (5.42) we get
$$V = \frac{2}{n} N_2^2 S_2^2 (r^2 + 1) \tag{5.43}$$
$$V_{opt} = \frac{1}{n} N_2^2 S_2^2 (r + 1)^2 \tag{5.44}$$
Therefore
$$\frac{V - V_{opt}}{V_{opt}} = \frac{\frac{2}{n} N_2^2 S_2^2 (r^2 + 1) - \frac{1}{n} N_2^2 S_2^2 (r+1)^2}{\frac{1}{n} N_2^2 S_2^2 (r+1)^2} = \frac{(r-1)^2}{(r+1)^2}$$
Hence the solution.
Problem 5.3 If there are two strata and if $\phi$ is the ratio of the actual $n_1/n_2$ to
the Neyman optimum $n_1/n_2$, show that whatever be the values of $N_1$, $N_2$, $S_1$ and
$S_2$, the ratio $V_{opt}/V$ is never less than $\frac{4\phi}{(1+\phi)^2}$ when $N_1$ and $N_2$ are large.
Here $V_{opt}$ is the variance of the usual estimator under Neyman optimum allocation
and $V$ is the variance under the actual allocation.
Solution By (5.8), $V = \frac{N_1^2 S_1^2}{n_1} + \frac{N_2^2 S_2^2}{n_2}$ and by (5.42),
$V_{opt} = \frac{1}{n}[N_1 S_1 + N_2 S_2]^2$. Therefore
$$\frac{V_{opt}}{V} = \frac{\frac{1}{n}[N_1 S_1 + N_2 S_2]^2}{\frac{N_1^2 S_1^2}{n_1} + \frac{N_2^2 S_2^2}{n_2}} \tag{5.45}$$
Under Neyman allocation,
$$\frac{n_1}{n_2} = \frac{N_1 S_1}{N_2 S_2} \quad \text{(refer Problem 5.2)}$$
so that
$$\phi = \frac{N_2 S_2 n_1}{N_1 S_1 n_2} \tag{5.46}$$
Dividing both the numerator and the denominator of (5.45) by $N_1^2 S_1^2$ we get
$$\frac{V_{opt}}{V} = \frac{\frac{1}{n}\left\{1 + \frac{N_2 S_2}{N_1 S_1}\right\}^2}{\frac{1}{n_1} + \frac{N_2^2 S_2^2}{N_1^2 S_1^2 n_2}}$$
Substituting the value of $\phi$ given in (5.46) in the above expression, we get
$$\frac{V_{opt}}{V} = \frac{\frac{1}{n}\left\{1 + \phi \frac{n_2}{n_1}\right\}^2}{\frac{1}{n_1} + \phi^2 \frac{n_2}{n_1^2}} = \frac{\frac{1}{n}(n_1 + n_2 \phi)^2}{n_1 + n_2 \phi^2} \tag{5.47}$$
Replacing $n$ by $n_1 + n_2$ and $(n_1 + n_2 \phi)^2$ by $(n_1 - n_2 \phi)^2 + 4 n_1 n_2 \phi$, the above
ratio can be expressed as
$$\frac{V_{opt}}{V} = \frac{(n_1 - n_2 \phi)^2 + 4 n_1 n_2 \phi}{(n_1 - n_2 \phi)^2 + n_1 n_2 (1 + \phi)^2}$$
Since $4\phi \le (1+\phi)^2$ and $(n_1 - n_2\phi)^2 \ge 0$, adding the same nonnegative
quantity to the numerator and the denominator cannot decrease the ratio. We
conclude that
$$\frac{V_{opt}}{V} \ge \frac{4\phi}{(1+\phi)^2}$$
Hence the solution.
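The bound of Problem 5.3 can be checked over a grid of allocations. This is a minimal sketch with made-up stratum values; the function name is hypothetical and the finite population correction is ignored, as in the solution.

```python
def efficiency_and_bound(N, S, n1, n2):
    """Return (V_opt / V, 4*phi / (1 + phi)^2) for a two-stratum allocation."""
    n = n1 + n2
    v = N[0]**2 * S[0]**2 / n1 + N[1]**2 * S[1]**2 / n2      # actual allocation
    v_opt = (N[0] * S[0] + N[1] * S[1]) ** 2 / n             # Neyman, (5.42)
    phi = (n1 / n2) / ((N[0] * S[0]) / (N[1] * S[1]))        # (5.46)
    return v_opt / v, 4 * phi / (1 + phi) ** 2

# Made-up stratum sizes and standard deviations.
N, S = [3000, 7000], [12.0, 4.0]
results = [efficiency_and_bound(N, S, n1, 100 - n1) for n1 in range(5, 100, 5)]
for eff, bound in results:
    print(round(eff, 4), round(bound, 4))
```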
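Problem 5.4 If the cost function is of the form $C = C_0 + \sum_{h=1}^{L} t_h \sqrt{n_h}$, where $C_0$
and $t_h$ are known numbers, show that in order to minimize the variance of the
estimator for fixed total cost, $n_h$ must be proportional to $\left\{\frac{N_h^2 S_h^2}{t_h}\right\}^{2/3}$.
Solution To find the desired values of $n_h$, we must minimize the function
$$\phi = \sum_{h=1}^{L} N_h^2 \left[\frac{1}{n_h} - \frac{1}{N_h}\right] S_h^2 + \lambda \left\{C_0 + \sum_{h=1}^{L} t_h \sqrt{n_h} - C\right\}$$
where $\lambda$ is the Lagrangian multiplier.
Differentiating partially the above function with respect to $n_h$ and equating the
derivatives to zero we get
$$-\frac{N_h^2 S_h^2}{n_h^2} + \frac{\lambda t_h}{2\sqrt{n_h}} = 0, \quad h = 1, 2, \ldots, L \tag{5.48}$$
Differentiating partially with respect to $\lambda$ and setting the derivative equal to
zero we get
$$C = C_0 + \sum_{h=1}^{L} t_h \sqrt{n_h} \tag{5.49}$$
From (5.48) we have,
$$\sqrt{n_h} = \left\{\frac{2}{\lambda}\right\}^{1/3} \left\{\frac{N_h^2 S_h^2}{t_h}\right\}^{1/3}$$
so that
$$C - C_0 = \left\{\frac{2}{\lambda}\right\}^{1/3} \sum_{h=1}^{L} t_h \left\{\frac{N_h^2 S_h^2}{t_h}\right\}^{1/3} \quad \text{(using (5.49))}$$
and hence
$$\left\{\frac{2}{\lambda}\right\}^{1/3} = \frac{C - C_0}{\sum_{h=1}^{L} t_h \left\{\frac{N_h^2 S_h^2}{t_h}\right\}^{1/3}}$$
Substituting this value in (5.48) we get
$$n_h = \left[\frac{C - C_0}{\sum_{h=1}^{L} t_h \left\{\frac{N_h^2 S_h^2}{t_h}\right\}^{1/3}}\right]^2 \left\{\frac{N_h^2 S_h^2}{t_h}\right\}^{2/3}$$
It can be seen that for these values of $n_h$, the matrix of second order partial
derivatives becomes positive definite. Therefore we conclude that the above
values of $n_h$ minimize the variance of the estimator for a given cost.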
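The allocation derived in Problem 5.4 can be computed directly. The sketch below is illustrative: the function name and the input values are hypothetical, and the resulting $n_h$ are left unrounded.

```python
import math

def allocation_sqrt_cost(N, S, t, C, C0):
    """n_h for the cost function C = C0 + sum_h t_h * sqrt(n_h)."""
    a = [Nh**2 * Sh**2 / th for Nh, Sh, th in zip(N, S, t)]
    scale = (C - C0) / sum(th * ah ** (1.0 / 3.0) for th, ah in zip(t, a))
    return [scale**2 * ah ** (2.0 / 3.0) for ah in a]

# Made-up inputs for three strata.
N, S, t = [500, 300, 200], [4.0, 6.0, 10.0], [2.0, 3.0, 5.0]
C, C0 = 200.0, 20.0
n_alloc = allocation_sqrt_cost(N, S, t, C, C0)
cost = C0 + sum(th * math.sqrt(nh) for th, nh in zip(t, n_alloc))
print(n_alloc, cost)
```

The printed cost should reproduce the budget $C$, which is the constraint (5.49).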
Problem 5.5 In a population consisting of a linear trend, show that a systematic
sample is less precise than a stratified random sample with strata of size $2k$ and
two units per stratum if $n > \frac{4k+2}{k+1}$, when the first stratum contains the first set of $2k$
units, the second stratum contains the second set of $2k$ units in the population and so on.
Solution We know that under systematic sampling in the presence of linear
trend, the variance of the usual estimator is
$$V[\hat{Y}_{LSS}] = \frac{N^2 \beta^2 (k^2 - 1)}{12} \tag{5.50}$$
Under the stratification scheme described above, we have $L = n/2$, $N_h = 2k$ for
$h = 1, 2, \ldots, n/2$, and the labels of the units included in stratum $h$ are given by
$$G_h = \{2(h-1)k + j,\ j = 1, 2, \ldots, 2k\}, \quad h = 1, 2, \ldots, \frac{n}{2}$$
Therefore we have $Y_{hj} = \alpha + \beta\{2(h-1)k + j\}$, $j = 1, 2, \ldots, 2k$, $h = 1, 2, \ldots, \frac{n}{2}$.
Hence
$$\bar{Y}_h = \frac{1}{2k} \sum_{j=1}^{2k} \left[\alpha + \beta\{2(h-1)k + j\}\right] = \alpha + \beta\left[2(h-1)k + \frac{2k+1}{2}\right]$$
$$Y_{hj} - \bar{Y}_h = \beta\left[j - \frac{2k+1}{2}\right]$$
Squaring and summing we get
$$S_h^2 = \frac{1}{2k-1} \sum_{j=1}^{2k} \beta^2 \left[j - \frac{2k+1}{2}\right]^2 = \beta^2\, \frac{k(2k+1)}{6}$$
Since $N_h = 2k$, $n_h = 2$ and $S_h^2 = \beta^2 \frac{k(2k+1)}{6}$, we obtain the variance of the
estimator as
$$V_{st} = \sum_{h=1}^{n/2} N_h^2 \left[\frac{1}{n_h} - \frac{1}{N_h}\right] S_h^2 = \frac{\beta^2 k^2 (k-1)\, n\, (2k+1)}{6} \tag{5.51}$$
Comparing (5.50) and (5.51), with $N = nk$, we infer that systematic sampling is
less precise than stratified sampling if $n > \frac{4k+2}{k+1}$. Hence the solution.
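The comparison in Problem 5.5 can be reproduced by enumerating the $k$ possible systematic samples of a linear-trend population and comparing the resulting variance with (5.51). This is a sketch; the function names are hypothetical and $n$ is taken to be even.

```python
def var_systematic(n, k, alpha=0.0, beta=1.0):
    """Exact variance of the expansion estimator over the k systematic samples."""
    N = n * k
    y = [alpha + beta * i for i in range(1, N + 1)]
    estimates = [N * sum(y[r + j * k] for j in range(n)) / n for r in range(k)]
    mean = sum(estimates) / k
    return sum((e - mean) ** 2 for e in estimates) / k

def var_stratified(n, k, beta=1.0):
    """Variance (5.51): strata of size 2k, two units per stratum."""
    return beta**2 * k**2 * (k - 1) * n * (2 * k + 1) / 6

# Threshold for k = 3 is (4k + 2)/(k + 1) = 3.5.
print(var_systematic(4, 3), var_stratified(4, 3))   # n = 4 exceeds the threshold
print(var_systematic(2, 3), var_stratified(2, 3))   # n = 2 is below the threshold
```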
Problem 5.6 Suggest an unbiased estimator for the population total under
stratified sampling when ppswr sampling is used in all the $L$ strata and also
derive its variance.
Solution Let $X_{hj}$ be the value of the size variable corresponding to the jth unit
in the stratum $h$, $j = 1, 2, \ldots, N_h$; $h = 1, 2, \ldots, L$, $X_h$ be the stratum total of the size
variable corresponding to the stratum $h$ and $P_{hj} = \frac{X_{hj}}{X_h}$. If $p_{hj}$ is the P-value of
the jth sampled unit in the stratum $h$, then an unbiased estimator of the population
total is
$$\hat{Y}_{pt} = \sum_{h=1}^{L} \frac{1}{n_h} \sum_{j=1}^{n_h} \frac{y_{hj}}{p_{hj}}$$
This estimator is constructed by using the fact that $\frac{1}{n_h} \sum_{j=1}^{n_h} \frac{y_{hj}}{p_{hj}}$ is unbiased
for the stratum total $Y_h$ of the study variable $Y$, and hence by Theorem 5.1,
$\hat{Y}_{pt}$ is unbiased for the population total. The variance of the above estimator is
$$V(\hat{Y}_{pt}) = \sum_{h=1}^{L} \frac{1}{n_h} \sum_{j=1}^{N_h} \left\{\frac{Y_{hj}}{P_{hj}} - Y_h\right\}^2 P_{hj} \quad \text{(refer Theorem 4.2)}$$
Hence the solution.
Exercises
5.1 Derive the variance of the estimator considered in Problem 5.6 under
proportional allocation. That is, the sample size is made proportional to the
stratum size $X_h$ rather than the number of units in the stratum $h$.
5.2 A random sample of size $n$ is selected from a population containing $N$ units
and the sample units are allocated to $L$ strata on the basis of information
collected about them. Denoting by $n_h$ the number of sample units falling
in stratum $h$, derive the variance of $\sum_{h=1}^{L} \frac{N_h}{N} \bar{y}_h$ (note that $N_h$ is known).
5.3 A population is divided into $L$ strata, stratum $h$ containing $N_h$ units from
which $n_h$, $h = 1, 2, \ldots, L$ are to be taken into the sample. The following
procedure is used. One unit is selected with pp to $x$ from the entire
population. If the unit comes from the stratum $h$, a simple random sample
of further $n_h - 1$ units is taken from the $N_h - 1$ units that remain. From the
other strata simple random samples of specified sizes are taken. Show that
under usual notations $\frac{\sum_{h=1}^{L} N_h \bar{y}_h}{\sum_{h=1}^{L} N_h \bar{x}_h}$ is an unbiased estimator of $\frac{Y}{X}$.
5.4 For the sampling scheme in which the population is split at random into
substrata containing $N_i$, $i = 1, 2, \ldots, n$ units, and one unit is selected with pp
to $x$ from each substratum, suggest an unbiased estimator for the
population total and derive its variance. (Compare this with the Random group
method described in Chapter 4.)
Chapter 6
Use of Auxiliary Information
6.1 Introduction
So far we have seen many sampling-estimating strategies in which the
knowledge of the variable under study, y, alone is directly used during the
estimation stage. However, in many situations the variable under study, y, will
be closely related to an auxiliary variable x, and information pertaining to it for
all the units in the population is either readily available or can be easily
collected. In such situations, it is desirable to consider estimators of the
population total Y that use the data on x and are more efficient than the
conventional ones. Two such methods are (i) ratio methods and (ii) regression
methods. In the following sections we shall discuss "Ratio estimation".
6.2 Ratio Estimation
Let $\hat{Y}$ and $\hat{X}$ be unbiased for the population totals Y and X of the study and
auxiliary variables respectively. The ratio estimator of the population total is
defined as
$$\hat{Y}_R = \frac{\hat{Y}}{\hat{X}} X \tag{6.1}$$
For example, if y is the number of teak trees in a geographical region and x is
its area in acres, the ratio $\frac{\hat{Y}}{\hat{X}}$ is an estimator of the number of teak trees per acre
of a region in the population. The product of $\frac{\hat{Y}}{\hat{X}}$ with X, the total area in acres,
would provide an estimator of Y, the total number of teak trees in the population.
The estimator proposed above is meant for any sampling design yielding
unbiased estimators for the population totals Y and X. Let P(s) be any sampling
design. It may be noted that
$$E_P[\hat{Y}_R] = \sum_{s} \hat{Y}_R(s) P(s) = \sum_{s} \left[\frac{\hat{Y}(s)}{\hat{X}(s)}\right] X P(s) = X \sum_{s} \left[\frac{\hat{Y}(s)}{\hat{X}(s)}\right] P(s)$$
where the sums extend over all possible samples $s$.
Since the right hand side of the above expression is in general not equal to Y, the
ratio estimator is biased for Y under the given sampling design.
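The following theorem gives the approximate bias and mean square error
of the ratio estimator.
Theorem 6.1 The approximate bias and mean square error of the ratio estimator
are
$$B(\hat{Y}_R) = Y\left\{\left[\frac{V(\hat{X})}{X^2}\right] - \left[\frac{cov(\hat{X}, \hat{Y})}{XY}\right]\right\}$$
and
$$MSE(\hat{Y}_R) = Y^2\left\{\left[\frac{V(\hat{Y})}{Y^2}\right] + \left[\frac{V(\hat{X})}{X^2}\right] - 2\left[\frac{cov(\hat{X}, \hat{Y})}{XY}\right]\right\}$$
Proof Let $e_0 = \frac{\hat{Y} - Y}{Y}$ and $e_1 = \frac{\hat{X} - X}{X}$.
It may be noted that
(i) $E(e_0) = E\left[\frac{\hat{Y} - Y}{Y}\right] = 0$ (6.2)
(ii) $E(e_1) = E\left[\frac{\hat{X} - X}{X}\right] = 0$ (6.3)
(iii) $E(e_0^2) = E\left[\frac{\hat{Y} - Y}{Y}\right]^2 = \frac{V(\hat{Y})}{Y^2}$ (6.4)
(iv) $E(e_1^2) = E\left[\frac{\hat{X} - X}{X}\right]^2 = \frac{V(\hat{X})}{X^2}$ (6.5)
(v) $E(e_0 e_1) = E\left[\frac{(\hat{Y} - Y)(\hat{X} - X)}{XY}\right] = \frac{cov(\hat{X}, \hat{Y})}{XY}$ (6.6)
Assume that the sample size is large enough so that $|e_0| < 1$ and $|e_1| < 1$. This is
equivalent to assuming that for all possible samples $0 < \hat{X} < 2X$ and
$0 < \hat{Y} < 2Y$. Since $\hat{Y} = Y(1 + e_0)$ and $\hat{X} = X(1 + e_1)$, the estimator $\hat{Y}_R = \frac{\hat{Y}}{\hat{X}} X$
can be written as
$$\hat{Y}_R = Y(1 + e_0)(1 + e_1)^{-1} = Y(1 + e_0)(1 - e_1 + e_1^2 - \ldots) = Y(1 + e_0 - e_1 + e_1^2 - e_0 e_1 + \ldots)$$
Using (6.2) and (6.3) and ignoring terms of degree greater than two we get
$$E[\hat{Y}_R - Y] \approx Y E[e_1^2 - e_0 e_1] = Y\left\{\frac{V(\hat{X})}{X^2} - \frac{cov(\hat{Y}, \hat{X})}{XY}\right\} \quad \text{(using (6.5) and (6.6))}$$
Proceeding as above we get (ignoring terms of degree greater than two)
$$E[\hat{Y}_R - Y]^2 \approx Y^2 E[e_0 - e_1]^2 = Y^2 E[e_0^2 + e_1^2 - 2 e_0 e_1] = Y^2\left\{\frac{V(\hat{Y})}{Y^2} + \frac{V(\hat{X})}{X^2} - 2\,\frac{cov(\hat{X}, \hat{Y})}{XY}\right\}$$
Hence the proof.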
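Corollary 6.1 Under simple random sampling,
(i) $\hat{Y}_R = \frac{\sum_{i \in s} Y_i}{\sum_{i \in s} X_i}\, X$
(ii) $B(\hat{Y}_R) = \frac{N^2 (N-n)}{Nn}\, Y \left\{\frac{S_x^2}{X^2} - \frac{S_{xy}}{XY}\right\}$
(iii) $MSE(\hat{Y}_R) = \frac{N^2 (N-n)}{Nn}\, Y^2 \left\{\frac{S_y^2}{Y^2} + \frac{S_x^2}{X^2} - 2\,\frac{S_{xy}}{XY}\right\}$
Proof We know that under simple random sampling $V(\hat{Y}) = \frac{N^2 (N-n)}{Nn} S_y^2$,
$V(\hat{X}) = \frac{N^2 (N-n)}{Nn} S_x^2$ and $cov(\hat{Y}, \hat{X}) = \frac{N^2 (N-n)}{Nn} S_{xy}$. Substituting these
expressions in the results available in Theorem 6.1, we get the required
expressions.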
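The following theorem gives the condition under which the ratio estimator
will be more efficient than the conventional expansion estimator.
Theorem 6.2 Under simple random sampling, the ratio estimator $\hat{Y}_R = \frac{\hat{Y}}{\hat{X}} X$ is
more efficient than the expansion estimator $\hat{Y}$ if $\rho > \frac{1}{2}\frac{C_x}{C_y}$, where
$C_y = \frac{S_y}{\bar{Y}}$, $C_x = \frac{S_x}{\bar{X}}$ and $\rho$ is the coefficient of correlation.
Proof Writing $R = Y/X$ (assumed positive), $V(\hat{Y}) > MSE(\hat{Y}_R)$
$\Leftrightarrow S_y^2 > S_y^2 + R^2 S_x^2 - 2R S_{xy}$
$\Leftrightarrow S_{xy} > \frac{1}{2} R S_x^2$
$\Leftrightarrow \rho S_x S_y > \frac{1}{2}\frac{\bar{Y}}{\bar{X}} S_x^2$
$\Leftrightarrow \rho > \frac{1}{2}\frac{C_x}{C_y}$
Hence the proof.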
Estimated mean square error under simple random sampling
Note that
$$\sum_{i=1}^{N} [Y_i - R X_i]^2 = \sum_{i=1}^{N} [Y_i - \bar{Y} + \bar{Y} - R X_i]^2$$
$$= \sum_{i=1}^{N} [Y_i - \bar{Y} + R\bar{X} - R X_i]^2 \quad \left(\text{since } R = \frac{\bar{Y}}{\bar{X}}\right)$$
$$= \sum_{i=1}^{N} [Y_i - \bar{Y}]^2 + R^2 \sum_{i=1}^{N} [X_i - \bar{X}]^2 - 2R \sum_{i=1}^{N} [Y_i - \bar{Y}][X_i - \bar{X}]$$
Dividing both the sides by $(N-1)$ we get
$$\frac{1}{N-1} \sum_{i=1}^{N} [Y_i - R X_i]^2 = S_y^2 + R^2 S_x^2 - 2R S_{xy} \tag{6.10}$$
Substituting this in the expression for the mean square error, we get an
equivalent expression as
$$MSE(\hat{Y}_R) = \frac{N^2 (N-n)}{Nn}\, \frac{1}{N-1} \sum_{i=1}^{N} [Y_i - R X_i]^2 \quad \text{(using (6.10))}$$
Therefore a reasonable estimate for the mean square error of the ratio estimate is
$$\frac{N^2 (N-n)}{Nn}\, \frac{1}{n-1} \sum_{i \in s} [Y_i - \hat{R} X_i]^2 \quad \text{where } \hat{R} = \frac{\sum_{i \in s} Y_i}{\sum_{i \in s} X_i}$$
The ratio estimator considered in this section is not unbiased for the population
total (mean). In the following section of this chapter, a few ratio type unbiased
estimators meant for simple random sampling are presented.
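The ratio estimate and its estimated mean square error under simple random sampling can be computed as follows. This is a minimal sketch: the function name and the sample values are hypothetical, and the sample is assumed to be a simple random sample without replacement of size n from N units.

```python
def ratio_estimate(y, x, X_total, N):
    """Ratio estimate of the total and its estimated MSE under SRS."""
    n = len(y)
    r_hat = sum(y) / sum(x)                      # R-hat = sum(y_i) / sum(x_i)
    y_r = r_hat * X_total                        # ratio estimator (6.1)
    resid_ss = sum((yi - r_hat * xi) ** 2 for yi, xi in zip(y, x))
    mse_hat = N**2 * (N - n) / (N * n) * resid_ss / (n - 1)
    return y_r, mse_hat

# Made-up sample of n = 3 units from N = 20 units with known X = 50.
est, mse_hat = ratio_estimate([3.0, 4.0, 8.0], [1.0, 2.0, 3.0], X_total=50.0, N=20)
print(est, mse_hat)
```

When y is exactly proportional to x in the sample, the residuals vanish and the estimated mean square error is zero.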
6.3 Unbiased Ratio Type Estimators
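Already we have seen that under simple random sampling, the ratio estimator
takes the form $\frac{\sum_{i \in s} Y_i}{\sum_{i \in s} X_i} X$. As an alternative to this, it is reasonable to take
$$\hat{Y}_{R0} = \left[\frac{1}{n} \sum_{i \in s} \frac{Y_i}{X_i}\right] X$$
as estimator of the population total. Like the ratio
estimator, the above estimator is also biased for the population total. The
following theorem gives an expression for the bias of $\hat{Y}_{R0}$.
Theorem 6.3 The bias of the estimator $\hat{Y}_{R0}$ is $B(\hat{Y}_{R0}) = -[N-1] S_{zx}$
where $S_{zx} = \frac{1}{N-1} \sum_{i=1}^{N} [Z_i - \bar{Z}][X_i - \bar{X}]$, $Z_i = \frac{Y_i}{X_i}$.
Proof Taking $Z_i = \frac{Y_i}{X_i}$, $i = 1, 2, \ldots, N$, the estimator $\hat{Y}_{R0}$ can be written as
$$\hat{Y}_{R0} = \left[\frac{1}{n} \sum_{i \in s} Z_i\right] X = \bar{z} X \quad \text{where } \bar{z} = \frac{1}{n} \sum_{i \in s} Z_i$$
The bias of $\hat{Y}_{R0}$ is
$$B(\hat{Y}_{R0}) = E[\hat{Y}_{R0} - Y] = E(\bar{z} X) - Y = \bar{Z} X - Y \quad \text{where } \bar{Z} = \frac{1}{N} \sum_{i=1}^{N} Z_i \tag{6.11}$$
We know that, with $\hat{Z} = N\bar{z}$ and $\hat{X} = N\bar{x}$,
$$cov(\hat{Z}, \hat{X}) = \frac{N^2 (N-n)}{Nn}\, \frac{1}{N-1} \sum_{i=1}^{N} [Z_i - \bar{Z}][X_i - \bar{X}]$$
$$= \frac{N^2 (N-n)}{Nn}\, \frac{1}{N-1} \left\{\sum_{i=1}^{N} Z_i X_i - N \bar{Z}\bar{X}\right\}$$
$$= \frac{N^2 (N-n)}{Nn}\, \frac{1}{N-1} \left\{N\bar{Y} - N \bar{Z}\bar{X}\right\} \quad \text{using } Z_i = \frac{Y_i}{X_i}$$
$$= -\frac{N^2 (N-n)}{Nn}\, \frac{1}{N-1}\, B(\hat{Y}_{R0}) \quad \text{(using (6.11))}$$
Therefore the bias of the estimator $\hat{Y}_{R0}$ is
$$B(\hat{Y}_{R0}) = -\frac{Nn(N-1)}{N^2 (N-n)}\, cov(\hat{Z}, \hat{X}) \tag{6.12}$$
We know that under simple random sampling, $cov(\hat{Z}, \hat{X}) = \frac{N^2 (N-n)}{Nn} S_{zx}$.
Making use of this result in (6.12) we get the required expression.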
The above theorem helps us to get an unbiased estimator for the population total
as shown below.
We have seen in Chapter 2 that $s_{xy} = \frac{1}{n-1} \sum_{i \in s} (X_i - \bar{x})(Y_i - \bar{y})$ is unbiased for
$S_{xy} = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})$. Therefore an unbiased estimator for the bias
given in Theorem 6.3 is
$$\hat{B}(\hat{Y}_{R0}) = -(N-1) s_{zx}$$
$$= -\frac{N-1}{n-1} \sum_{i \in s} (Z_i - \bar{z})(X_i - \bar{x})$$
$$= -\frac{N-1}{n-1} \left(\sum_{i \in s} Z_i X_i - n \bar{z}\bar{x}\right)$$
$$= -\frac{N-1}{n-1} \left(\sum_{i \in s} Y_i - n \bar{z}\bar{x}\right) \quad \left(\text{using } Z_i = \frac{Y_i}{X_i}\right)$$
$$= -\frac{n(N-1)}{n-1}\, [\bar{y} - \bar{z}\bar{x}]$$
It may be observed that, if $b$ is an unbiased estimator of the bias of the estimator
$T$ (which is meant for estimating the parameter $\theta$) then $T - b$ is unbiased for the
parameter $\theta$. Therefore $\hat{Y}_{R0} - \hat{B}(\hat{Y}_{R0})$ is an unbiased estimator of the population
total. That is, $\hat{Y}_{R0} + \frac{n(N-1)}{n-1}[\bar{y} - \bar{z}\bar{x}]$ is unbiased for the population total Y.
Thus we have obtained an exactly unbiased ratio-type estimator by
considering the mean of the ratios of $Y_i$ to $X_i$ (instead of the ratio of the sum of
$Y_i$ to the sum of $X_i$) to form the estimator and correcting for the bias. The above
estimator is due to Hartley and Ross (1954). In the following section another
corrected estimator is presented.
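The exact unbiasedness of the corrected estimator can be confirmed by enumerating every simple random sample from a small population. The sketch below is illustrative: the population values are made up, the function name is hypothetical, and exact rational arithmetic is used so that the average over all samples can be compared with Y without rounding error.

```python
from fractions import Fraction
from itertools import combinations

def hartley_ross(y, x, X_total, N):
    """Mean-of-ratios estimator corrected for bias: z_bar*X + n(N-1)/(n-1)*(y_bar - z_bar*x_bar)."""
    n = len(y)
    z_bar = sum(Fraction(yi) / xi for yi, xi in zip(y, x)) / n
    y_bar = Fraction(sum(y), n)
    x_bar = Fraction(sum(x), n)
    return z_bar * X_total + Fraction(n * (N - 1), n - 1) * (y_bar - z_bar * x_bar)

# Made-up population of N = 5 units; all samples of size n = 3 are enumerated.
pop_y = [3, 5, 8, 10, 12]
pop_x = [1, 2, 3, 4, 6]
N, n = 5, 3
samples = list(combinations(range(N), n))
average = sum(
    hartley_ross([pop_y[i] for i in s], [pop_x[i] for i in s], sum(pop_x), N)
    for s in samples
) / len(samples)
print(average, sum(pop_y))
```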
6.4 Almost Unbiased Ratio Estimator
Suppose a sample of size n is drawn in the form of m independent subsamples
of the same size, selected according to the same sampling design, and $\hat{Y}_i$ and
$\hat{X}_i$, $i = 1, 2, \ldots, m$ are unbiased estimates of the population totals Y and X based
on the m subsamples. The following two estimates can be considered for Y:
$$\hat{Y}_1 = \frac{\bar{\hat{Y}}}{\bar{\hat{X}}}\, X \tag{6.13}$$
where $\bar{\hat{Y}} = \frac{1}{m} \sum_{i=1}^{m} \hat{Y}_i$ and $\bar{\hat{X}} = \frac{1}{m} \sum_{i=1}^{m} \hat{X}_i$, and
$$\hat{Y}_m = \frac{1}{m} \sum_{i=1}^{m} r_i X \tag{6.14}$$
where $r_i = \frac{\hat{Y}_i}{\hat{X}_i}$.
Under the usual assumptions (stated in the proof of Theorem 6.1), the bias of the
estimator $\hat{Y}_1$ is
$$B_1 = \frac{1}{X}\left[R\, V(\bar{\hat{X}}) - cov(\bar{\hat{X}}, \bar{\hat{Y}})\right] \quad \text{(by Theorem 6.1)}$$
$$= \frac{1}{m^2}\, \frac{1}{X} \sum_{i=1}^{m} \left[R\, V(\hat{X}_i) - cov(\hat{X}_i, \hat{Y}_i)\right]$$
$$= \frac{1}{m^2} \sum_{i=1}^{m} B(r_i) \tag{6.15}$$
where $B(r_i) = \frac{1}{X}\left[R\, V(\hat{X}_i) - cov(\hat{X}_i, \hat{Y}_i)\right]$,
and the bias of the estimator $\hat{Y}_m$ is
$$B_m = B(\hat{Y}_m) = \frac{1}{m} \sum_{i=1}^{m} B(r_i) \tag{6.16}$$
Comparing (6.15) and (6.16) we get
$$m B_1 = B_m \tag{6.17}$$
This shows that the bias of the estimator $\hat{Y}_m$ is m times that of $\hat{Y}_1$. Further it can
be seen that
$$B_m - B_1 = E[\hat{Y}_m - Y] - E[\hat{Y}_1 - Y] = E[\hat{Y}_m - \hat{Y}_1]$$
Therefore $E[\hat{Y}_m - \hat{Y}_1] = (m-1) B_1$.
Hence $\frac{[\hat{Y}_m - \hat{Y}_1]}{m-1}$ is an unbiased estimator of $B_1$.
Thus after correcting the estimator $\hat{Y}_1$ for its bias, we get an unbiased estimator
for the population total
$$\hat{Y}_{AU} = \hat{Y}_1 - \frac{[\hat{Y}_m - \hat{Y}_1]}{m-1} = \frac{[m\hat{Y}_1 - \hat{Y}_m]}{m-1} \tag{6.18}$$
Since the estimator given above is obtained by correcting only the approximate
bias (not the exact bias), it is known as "Almost Unbiased Ratio-Type
Estimator".
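The three quantities in (6.13), (6.14) and (6.18) can be computed from the subsample estimates as follows. This is a sketch: the function name and the input values are hypothetical.

```python
def almost_unbiased_ratio(y_hats, x_hats, X_total):
    """Return (Y_1, Y_m, Y_AU) from m independent subsample estimates."""
    m = len(y_hats)
    y1 = (sum(y_hats) / m) / (sum(x_hats) / m) * X_total             # (6.13)
    ym = sum(yh / xh for yh, xh in zip(y_hats, x_hats)) / m * X_total  # (6.14)
    y_au = (m * y1 - ym) / (m - 1)                                   # (6.18)
    return y1, ym, y_au

# Made-up estimates from m = 2 subsamples, with known X = 100.
y1, ym, y_au = almost_unbiased_ratio([12.0, 20.0], [5.0, 10.0], 100.0)
print(y1, ym, y_au)
```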
6.5 Jackknife Ratio Estimator
As in the previous section, here also it is assumed that a simple random sample
of size n is selected in the form of m independent subsamples of k units each.
Let $\hat{Y}_1 = \frac{\bar{\hat{Y}}}{\bar{\hat{X}}} X$ where $\bar{\hat{Y}} = \frac{1}{m} \sum_{i=1}^{m} \hat{Y}_i$ and $\bar{\hat{X}} = \frac{1}{m} \sum_{i=1}^{m} \hat{X}_i$. Further denote by
$\hat{Y}_{Ri} = \frac{\hat{Y}^{(i)}}{\hat{X}^{(i)}} X$ where $\hat{X}^{(i)}$ and $\hat{Y}^{(i)}$ are unbiased estimators of X and Y
obtained after omitting the ith subsample. That is, $\hat{Y}_{Ri}$ is the ratio estimate
computed after omitting the ith subsample. Combining $\hat{Y}_1$ and $\hat{Y}_{Ri}$, Quenouille
(1956) suggested the estimator
$$\hat{Y}_Q = m\hat{Y}_1 - \frac{m-1}{m} \sum_{i=1}^{m} \hat{Y}_{Ri} \tag{6.19}$$
The above estimator is popularly known as the Jackknife ratio estimator.
In the following theorem it is proved that the above estimator is also
approximately unbiased.
Theorem 6.4 The estimator $\hat{Y}_Q = m\hat{Y}_1 - \frac{m-1}{m} \sum_{i=1}^{m} \hat{Y}_{Ri}$ is approximately
unbiased.
Proof The bias of the estimator $\hat{Y}_Q$ is
$$B(\hat{Y}_Q) = E(\hat{Y}_Q) - Y$$
$$= E\left[m\hat{Y}_1 - \frac{m-1}{m} \sum_{i=1}^{m} \hat{Y}_{Ri}\right] - Y$$
$$= E\left[m(\hat{Y}_1 - Y) - \frac{m-1}{m} \sum_{i=1}^{m} \hat{Y}_{Ri} + (m-1) Y\right]$$
$$= E\left[m(\hat{Y}_1 - Y) - \frac{m-1}{m} \sum_{i=1}^{m} (\hat{Y}_{Ri} - Y)\right]$$
$$= m B(\hat{Y}_1) - \frac{m-1}{m} \sum_{i=1}^{m} B(\hat{Y}_{Ri}) \tag{6.20}$$
We have seen in the previous section that the bias of the estimator $\hat{Y}_1$ is
$$B(\hat{Y}_1) = \frac{1}{m^2} \sum_{i=1}^{m} B(r_i)$$
where $B(r_i)$ is as defined in (6.15).
Since the subsamples are drawn independently and they are of the same size,
$B(r_i) = B_0$ (constant) for $i = 1, 2, \ldots, m$.
Hence
$$B(\hat{Y}_1) = \frac{1}{m^2}\, m B_0 = \frac{B_0}{m} \tag{6.21}$$
Note that for each $i$, $\hat{X}^{(i)} = \frac{1}{m-1} \sum_{j \ne i} \hat{X}_j$ and $\hat{Y}^{(i)} = \frac{1}{m-1} \sum_{j \ne i} \hat{Y}_j$ are unbiased
for the population totals X and Y respectively. Therefore by Theorem 6.1, the bias
of the estimator $\hat{Y}_{Ri}$ is
$$B(\hat{Y}_{Ri}) = \frac{1}{X}\left[R\, V(\hat{X}^{(i)}) - cov(\hat{X}^{(i)}, \hat{Y}^{(i)})\right]$$
$$= \frac{1}{X}\left[R\, V\left(\frac{1}{m-1} \sum_{j \ne i} \hat{X}_j\right) - cov\left(\frac{1}{m-1} \sum_{j \ne i} \hat{X}_j,\ \frac{1}{m-1} \sum_{j \ne i} \hat{Y}_j\right)\right]$$
$$= \frac{(m-1) B_0}{(m-1)^2} = \frac{B_0}{m-1} \tag{6.22}$$
Substituting (6.21) and (6.22) in (6.20) we get
$$B(\hat{Y}_Q) = m\,\frac{B_0}{m} - \frac{m-1}{m}\, m\, \frac{B_0}{m-1} = 0$$
Therefore the Jackknife estimator given in (6.19) is approximately unbiased. It
is pertinent to note that this estimator is not an exactly unbiased estimator.
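The Jackknife ratio estimator (6.19) can be computed from the m subsample estimates as sketched below. The function name and inputs are hypothetical; the leave-one-out ratio uses the fact that the ratio of the two leave-one-out means equals the ratio of the corresponding sums.

```python
def jackknife_ratio(y_hats, x_hats, X_total):
    """Jackknife ratio estimator (6.19) from m independent subsample estimates."""
    m = len(y_hats)
    sum_y, sum_x = sum(y_hats), sum(x_hats)
    y1 = sum_y / sum_x * X_total
    # Ratio estimate with the i-th subsample omitted.
    leave_one_out = [(sum_y - yh) / (sum_x - xh) * X_total
                     for yh, xh in zip(y_hats, x_hats)]
    return m * y1 - (m - 1) / m * sum(leave_one_out)

# Made-up estimates from m = 2 subsamples, with known X = 100.
y_q = jackknife_ratio([12.0, 20.0], [5.0, 10.0], 100.0)
print(y_q)
```

For m = 2 the leave-one-out ratios are just the individual subsample ratios, so the value coincides with $2\hat{Y}_1 - \hat{Y}_m$, the almost unbiased estimator (6.18).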
6.6 Bound for Bias
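In the last three sections, we have seen unbiased ratio-type estimators. In this
section, an upper bound is presented for the bias of the ratio estimator.
We know that the bias of the ratio estimator $\hat{Y}_R$ is
$$B[\hat{Y}_R] = E[\hat{Y}_R] - Y \tag{6.23}$$
and
$$cov(\hat{Y}_R, \hat{X}) = E[\hat{Y}_R \hat{X}] - E[\hat{Y}_R] E[\hat{X}]$$
$$= E\left[\frac{\hat{Y}}{\hat{X}}\, X \hat{X}\right] - E[\hat{Y}_R] E[\hat{X}]$$
$$= X E[\hat{Y}] - E[\hat{Y}_R] X$$
$$= XY - E[\hat{Y}_R] X$$
$$= -X B[\hat{Y}_R] \quad \text{(using (6.23))}$$
Since $|cov(\hat{Y}_R, \hat{X})| \le SD(\hat{Y}_R)\, SD(\hat{X})$, we get
$$X\, |B[\hat{Y}_R]| \le SD(\hat{Y}_R)\, SD(\hat{X})$$
Hence
$$\frac{|B[\hat{Y}_R]|}{SD(\hat{Y}_R)} \le \frac{SD(\hat{X})}{X}$$
The above bound is due to Hartley and Ross (1954).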
6.7 Product Estimation
We have proved that under simple random sampling, the ratio estimator is more
precise than the expansion estimator when the variables x and y have high
positive correlation. In fact, it is not difficult to see that under any sampling design,
$\hat{Y}_R$ is more efficient than $\hat{Y}$ if $\rho(\hat{X}, \hat{Y}) > \frac{1}{2}\left[\frac{C(\hat{X})}{C(\hat{Y})}\right]$ where $C(\hat{Y}) = \frac{SD(\hat{Y})}{Y}$ and
$C(\hat{X}) = \frac{SD(\hat{X})}{X}$. This shows that if the correlation between x and y is negative,
the ratio estimator will not be more precise than the conventional estimator. For such
situations, Murthy (1964) suggested another method of estimation, which is
expected to be more efficient than $\hat{Y}$ in situations where $\hat{Y}_R$ turns out to be
less efficient than $\hat{Y}$. In this method, termed the "Product method of estimation", the
population total is estimated by using the estimator
$$\hat{Y}_P = \frac{\hat{Y}\hat{X}}{X} \tag{6.24}$$
Since the estimator uses the product $\hat{Y}\hat{X}$ rather than the ratio $\frac{\hat{Y}}{\hat{X}}$, it is known as the
product estimator.
The following theorem gives the exact bias and approximate mean square error
of the product estimator.
Theorem 6.5 The exact bias and the approximate mean square error of the
product estimator are given by
$$B[\hat{Y}_P] = \frac{cov(\hat{X}, \hat{Y})}{X}$$
and
$$MSE[\hat{Y}_P] = Y^2\left\{\left[\frac{V(\hat{Y})}{Y^2}\right] + \left[\frac{V(\hat{X})}{X^2}\right] + 2\left[\frac{cov(\hat{X}, \hat{Y})}{XY}\right]\right\}$$
Proof Using the notations and assumptions introduced in Theorem 6.1, the
estimator $\hat{Y}_P$ can be written as
$$\hat{Y}_P = Y(1 + e_0)(1 + e_1) = Y(1 + e_0 + e_1 + e_0 e_1)$$
Therefore
$$\hat{Y}_P - Y = Y[e_0 + e_1 + e_0 e_1] \tag{6.25}$$
Taking expectation on both the sides of (6.25) we get
$$E[\hat{Y}_P] - Y = Y E[e_0 e_1] \quad (\text{since } E(e_0) = E(e_1) = 0)$$
$$= Y\left[\frac{cov(\hat{X}, \hat{Y})}{XY}\right] = \frac{cov(\hat{X}, \hat{Y})}{X} \tag{6.26}$$
Squaring and taking expectation on both the sides of (6.25) and ignoring terms
of degree greater than two, we get the approximate mean square error
$$E[\hat{Y}_P - Y]^2 = Y^2 E[e_0^2 + e_1^2 + 2 e_0 e_1]$$
$$= Y^2\left\{\left[\frac{V(\hat{Y})}{Y^2}\right] + \left[\frac{V(\hat{X})}{X^2}\right] + 2\left[\frac{cov(\hat{X}, \hat{Y})}{XY}\right]\right\}$$
$$= V(\hat{Y}) + R^2 V(\hat{X}) + 2R\, cov(\hat{Y}, \hat{X}) \tag{6.27}$$
Hence the proof.
The following theorem gives the condition under which the estimator $\hat{Y}_P$
will be more efficient than $\hat{Y}$.
Theorem 6.6 The product estimator $\hat{Y}_P$ is more efficient than $\hat{Y}$ if
$$\rho(\hat{X}, \hat{Y}) < -\frac{1}{2}\left[\frac{C(\hat{X})}{C(\hat{Y})}\right]$$
Proof Left as an exercise.
Theorem 6.7 Under simple random sampling, $\hat{Y}_P - \left[\frac{N^2 (N-n)}{Nn}\right]\left(\frac{s_{xy}}{X}\right)$ is
unbiased for the population total.
Proof We know that under simple random sampling
$cov(\hat{Y}, \hat{X}) = \left[\frac{N^2 (N-n)}{Nn}\right] S_{xy}$. Therefore the true bias of the product estimator
under simple random sampling is $\left[\frac{N^2 (N-n)}{Nn}\right]\frac{S_{xy}}{X}$. Since $s_{xy}$ is unbiased
for $S_{xy}$ under simple random sampling, an unbiased estimator of the bias of the
product estimator is $\left[\frac{N^2 (N-n)}{Nn}\right]\frac{s_{xy}}{X}$. Therefore after adjusting the product
estimator for its bias we get $\hat{Y}_P - \left[\frac{N^2 (N-n)}{Nn}\right]\left(\frac{s_{xy}}{X}\right)$ as an unbiased estimator of
the population total. It may be noted that, since the bias given in Theorem 6.5
is exact, the above estimator is exactly unbiased.
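The bias correction of the product estimator can be checked by complete enumeration. The sketch below assumes simple random sampling without replacement, uses a made-up population with negative correlation between x and y, and subtracts the estimated bias $[N^2(N-n)/(Nn)]\, s_{xy}/X$ implied by Theorem 6.5; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def product_bias_corrected(y, x, X_total, N):
    """Product estimator minus the unbiased estimate of its bias under SRS."""
    n = len(y)
    y_bar, x_bar = Fraction(sum(y), n), Fraction(sum(x), n)
    y_hat, x_hat = N * y_bar, N * x_bar
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    y_p = y_hat * x_hat / X_total                        # product estimator (6.24)
    return y_p - Fraction(N**2 * (N - n), N * n) * s_xy / X_total

# Made-up population of N = 5 units; all samples of size n = 3 are enumerated.
pop_y = [12, 10, 8, 5, 3]
pop_x = [1, 2, 3, 4, 6]
N, n = 5, 3
samples = list(combinations(range(N), n))
average = sum(
    product_bias_corrected([pop_y[i] for i in s], [pop_x[i] for i in s], sum(pop_x), N)
    for s in samples
) / len(samples)
print(average, sum(pop_y))
```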
6.8 Two Phase Sampling
The ratio and product estimators introduced in this chapter assume the
knowledge of the population total X of the auxiliary variable x. However, there
are some situations where the population total of the auxiliary variable will not
be known in advance. In such cases, two-phase sampling can be used for getting
a ratio or product estimator. In two-phase sampling, a sample of size n' is selected
initially by using a suitable sampling design and the population total X is
estimated, and then a sample of size n is selected to estimate the population totals
of the study and auxiliary variables. The second phase sample can be either a
subsample of the first phase sample or it can be directly drawn from the given
population. The sampling designs used in the first and second phases need not
be the same. Depending on the situation, different sampling designs can also be
used. Generally, two-phase sampling is recommended only when the first phase
survey is cheaper to conduct than the second phase survey.
Let $\hat{X}_D$ be an unbiased estimator of X based on the first phase sample and
$\hat{X}$, $\hat{Y}$ be unbiased estimators of X, Y based on the second phase sample. Then
the ratio and product estimators based on two-phase sampling are
$$\hat{Y}_{RD} = \frac{\hat{Y}}{\hat{X}}\, \hat{X}_D \tag{6.29}$$
and
$$\hat{Y}_{PD} = \frac{\hat{Y}\hat{X}}{\hat{X}_D} \tag{6.30}$$
The following theorems give the approximate bias and mean square error of the
ratio and product estimators under different cases of two-phase sampling.
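Theorem 6.8 (i) When the samples are drawn independently in the two phases
of sampling, the approximate bias of the ratio estimator is
$$B[\hat{Y}_{RD}] = Y\left\{\left[\frac{V(\hat{X})}{X^2}\right] - \left[\frac{cov(\hat{X}, \hat{Y})}{XY}\right]\right\}$$
(ii) When the second phase sample is a subsample of the first phase sample, the
approximate bias of the ratio estimator is
$$B[\hat{Y}_{RD}] = Y\left\{\left[\frac{V(\hat{X})}{X^2}\right] - \left[\frac{cov(\hat{X}, \hat{Y})}{XY}\right] - \left[\frac{cov(\hat{X}, \hat{X}_D)}{X^2}\right] + \left[\frac{cov(\hat{Y}, \hat{X}_D)}{XY}\right]\right\}$$
Proof When the samples are drawn independently in the two phases of
sampling
$$cov(\hat{X}, \hat{X}_D) = 0 \tag{6.31}$$
$$cov(\hat{Y}, \hat{X}_D) = 0 \tag{6.32}$$
Let $e_0 = \frac{\hat{Y} - Y}{Y}$, $e_1 = \frac{\hat{X} - X}{X}$ and $e_d = \frac{\hat{X}_D - X}{X}$.
It may be noted that
(i) $E(e_0) = E\left[\frac{\hat{Y} - Y}{Y}\right] = 0$, (ii) $E(e_1) = E\left[\frac{\hat{X} - X}{X}\right] = 0$
(iii) $E(e_d) = E\left[\frac{\hat{X}_D - X}{X}\right] = 0$ (iv) $E(e_0^2) = E\left[\frac{\hat{Y} - Y}{Y}\right]^2 = \frac{V(\hat{Y})}{Y^2}$
(v) $E(e_1^2) = E\left[\frac{\hat{X} - X}{X}\right]^2 = \frac{V(\hat{X})}{X^2}$ (vi) $E(e_d^2) = E\left[\frac{\hat{X}_D - X}{X}\right]^2 = \frac{V(\hat{X}_D)}{X^2}$
(vii) $E(e_0 e_1) = E\left[\frac{(\hat{Y} - Y)(\hat{X} - X)}{YX}\right] = \frac{cov(\hat{Y}, \hat{X})}{YX}$
(viii) $E(e_0 e_d) = E\left[\frac{(\hat{Y} - Y)(\hat{X}_D - X)}{YX}\right] = \frac{cov(\hat{Y}, \hat{X}_D)}{YX}$
(ix) $E(e_1 e_d) = E\left[\frac{(\hat{X} - X)(\hat{X}_D - X)}{X^2}\right] = \frac{cov(\hat{X}, \hat{X}_D)}{X^2}$
The ratio estimator can be expressed in terms of $e_0$, $e_1$, $e_d$ as follows:
$$\hat{Y}_{RD} = Y(1 + e_0)(1 + e_1)^{-1}(1 + e_d)$$
$$= Y(1 + e_0)(1 + e_d)(1 - e_1 + e_1^2 - \ldots)$$
$$= Y(1 + e_0 + e_d + e_0 e_d)(1 - e_1 + e_1^2 - e_1^3 + \ldots)$$
$$= Y(1 - e_1 + e_1^2 + e_0 - e_0 e_1 + e_d - e_1 e_d + e_0 e_d)$$
(ignoring terms of degree greater than two)
This implies
$$\hat{Y}_{RD} - Y = Y(e_0 - e_1 + e_d + e_1^2 - e_0 e_1 - e_1 e_d + e_0 e_d) \tag{6.33}$$
Taking expectations on both the sides of (6.33) and using the expressions above,
we get, when the samples are drawn independently, the approximate bias
of the ratio estimator as
$$B[\hat{Y}_{RD}] = Y\left\{\left[\frac{V(\hat{X})}{X^2}\right] - \left[\frac{cov(\hat{X}, \hat{Y})}{XY}\right]\right\}$$
The bias of the ratio estimator when the second phase sample is a subsample of
the first phase sample can be obtained by taking expectations on both the sides
of (6.33) as
$$B[\hat{Y}_{RD}] = Y\left\{\left[\frac{V(\hat{X})}{X^2}\right] - \left[\frac{cov(\hat{X}, \hat{Y})}{XY}\right] - \left[\frac{cov(\hat{X}, \hat{X}_D)}{X^2}\right] + \left[\frac{cov(\hat{Y}, \hat{X}_D)}{XY}\right]\right\}$$
Hence the proof.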
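Theorem 6.9 (i) When the samples are drawn independently in the two phases
of sampling, the approximate mean square error of the ratio estimator is
$$MSE[\hat{Y}_{RD}] = Y^2\left\{\left[\frac{V(\hat{Y})}{Y^2}\right] + \left[\frac{V(\hat{X})}{X^2}\right] + \left[\frac{V(\hat{X}_D)}{X^2}\right] - 2\left[\frac{cov(\hat{X}, \hat{Y})}{XY}\right]\right\}$$
(ii) When the second phase sample is a subsample of the first phase sample, the
approximate mean square error of the ratio estimator is
$$MSE[\hat{Y}_{RD}] = Y^2\left\{\frac{V(\hat{Y})}{Y^2} + \frac{V(\hat{X})}{X^2} + \frac{V(\hat{X}_D)}{X^2} - 2\left[\frac{cov(\hat{X}, \hat{Y})}{XY} + \frac{cov(\hat{X}, \hat{X}_D)}{X^2} - \frac{cov(\hat{Y}, \hat{X}_D)}{XY}\right]\right\}$$
Proof of this theorem is left as exercise.
Theorem 6.10 (i) When the samples are drawn independently, the approximate
bias of the product estimator in two-phase sampling is
$$B[\hat{Y}_{PD}] = Y\left\{\left[\frac{V(\hat{X}_D)}{X^2}\right] + \left[\frac{cov(\hat{X}, \hat{Y})}{XY}\right]\right\}$$
(ii) When the second phase sample is a subsample of the first phase sample, the
approximate bias of the product estimator is
$$B[\hat{Y}_{PD}] = Y\left\{\left[\frac{V(\hat{X}_D)}{X^2}\right] + \left[\frac{cov(\hat{X}, \hat{Y})}{XY}\right] - \left[\frac{cov(\hat{X}, \hat{X}_D)}{X^2}\right] - \left[\frac{cov(\hat{Y}, \hat{X}_D)}{XY}\right]\right\}$$
Proof of this theorem is left as exercise.
Theorem 6.11 (i) When the samples are drawn independently in the two phases
of sampling, the approximate mean square error of the product estimator is
$$MSE[\hat{Y}_{PD}] = Y^2\left\{\left[\frac{V(\hat{Y})}{Y^2}\right] + \left[\frac{V(\hat{X})}{X^2}\right] + \left[\frac{V(\hat{X}_D)}{X^2}\right] + 2\left[\frac{cov(\hat{X}, \hat{Y})}{XY}\right]\right\}$$
(ii) When the second phase sample is a subsample of the first phase sample, the
approximate mean square error of the product estimator is given by
$$MSE[\hat{Y}_{PD}] = Y^2\left\{\frac{V(\hat{Y})}{Y^2} + \frac{V(\hat{X})}{X^2} + \frac{V(\hat{X}_D)}{X^2} + 2\left[\frac{cov(\hat{X}, \hat{Y})}{XY} - \frac{cov(\hat{X}, \hat{X}_D)}{X^2} - \frac{cov(\hat{Y}, \hat{X}_D)}{XY}\right]\right\}$$
Proof of this theorem is left as exercise.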
The theorems stated above are quite general in nature and they are applicable for
any sampling design. Now we shall develop the approximate bias and mean
square error of the ratio estimator under the two cases of two-phase sampling
when simple random sampling is used in the two phases of sampling. Towards
this we observe the following:
Let s' and s be the samples obtained in the two phases of sampling, where
in the first phase a large sample of size n' is drawn to estimate the population
total X and in the second phase a sample of size n is drawn to estimate the
population totals X and Y, both by using simple random sampling. Here n is
assumed to be small when compared to n'.
Let $\hat{X}_D = \frac{N}{n'} \sum_{i \in s'} X_i$, $\hat{X} = \frac{N}{n} \sum_{i \in s} X_i$ and $\hat{Y} = \frac{N}{n} \sum_{i \in s} Y_i$.
Note that when s is a subsample of s',
(i) $E(\hat{X}_D) = \sum_{i=1}^{N} X_i = X$ (6.36)
(ii) $E(\hat{X}) = E_I E_{II}[\hat{X} \mid s'] = E_I[\hat{X}_D] = X$ (6.37)
(iii) $V(\hat{X}_D) = \left[\frac{N^2 (N-n')}{Nn'}\right] S_x^2$ (6.38)
(iv) $V(\hat{X}) = E_I V_{II}[\hat{X} \mid s'] + V_I E_{II}[\hat{X} \mid s']$
$= E_I\left[N^2\, \frac{(n'-n)}{n'n}\, s_x'^2\right] + V_I(\hat{X}_D)$
$= \left[\frac{N^2 (N-n)}{Nn}\right] S_x^2$ (6.39)
(v) $V(\hat{Y}) = \left[\frac{N^2 (N-n)}{Nn}\right] S_y^2$ (6.40)
(vi) $cov(\hat{X}, \hat{X}_D) = E[\hat{X}\hat{X}_D] - E[\hat{X}] E[\hat{X}_D]$
$= E_I E_{II}[\hat{X}\hat{X}_D \mid s'] - X^2$
$= E_I[\hat{X}_D \hat{X}_D] - X^2$
$= \left[\frac{N^2 (N-n')}{Nn'}\right] S_x^2$ (6.41)
(vii) $cov(\hat{Y}, \hat{X}_D) = \left[\frac{N^2 (N-n')}{Nn'}\right] S_{xy}$ (6.42)
(viii) $cov(\hat{Y}, \hat{X}) = E[\hat{X}\hat{Y}] - XY$
$= E_I E_{II}[\hat{X}\hat{Y} \mid s'] - XY$
$= E_I\left[N^2\, \frac{(n'-n)}{n'n}\, s_{xy}' + \hat{Y}_D \hat{X}_D\right] - XY$
$= \left[N^2\, \frac{(n'-n)}{n'n}\right] S_{xy} + E_I[\hat{X}_D \hat{Y}_D] - XY$
$= \left[N^2\, \frac{(n'-n)}{n'n}\right] S_{xy} + cov(\hat{X}_D, \hat{Y}_D)$
$= \left[N^2\, \frac{(n'-n)}{n'n}\right] S_{xy} + \left[\frac{N^2 (N-n')}{Nn'}\right] S_{xy}$
$= \left[\frac{N^2 (N-n)}{Nn}\right] S_{xy}$ (6.43)
Here $\hat{Y}_D$ and $s_{xy}'$ are the analogues of $\hat{Y}$ and $s_{xy}$ respectively based on the
sample s'.
When the samples are drawn independently, the results derived in simple
random sampling can be used directly without any difficulty.
The following theorem gives the approximate bias and mean square error
of the ratio estimator when simple random sampling is used in both the phases.
Theorem 6.12 When simple random sampling is used in both the phases of
sampling and the samples are drawn independently
$$B[\hat{Y}_{RD}] = \left[\frac{N^2 (N-n)}{Nn}\right] Y\, [C_{xx} - C_{xy}]$$
and
$$MSE[\hat{Y}_{RD}] = \left[\frac{N^2 (N-n)}{Nn}\right] Y^2\, [C_{yy} + C_{xx} - 2C_{xy}] + \left[\frac{N^2 (N-n')}{Nn'}\right] Y^2 C_{xx}$$
where $C_{xx} = \frac{S_x^2}{X^2}$, $C_{yy} = \frac{S_y^2}{Y^2}$ and $C_{xy} = \frac{S_{xy}}{XY}$.
Theorem 6.13 When simple random sampling is used in both the phases of
sampling and the second phase sample is a subsample of the first phase sample
$$B[\hat{Y}_{RD}] = \left[N^2\, \frac{(n'-n)}{n'n}\right] Y\, [C_{xx} - C_{xy}]$$
and
$$MSE[\hat{Y}_{RD}] = \left[\frac{N^2 (N-n)}{Nn}\right] Y^2\, [C_{yy} + C_{xx} - 2C_{xy}] + \left[\frac{N^2 (N-n')}{Nn'}\right] Y^2\, [2C_{xy} - C_{xx}]$$
The above theorems can be proved by applying the expressions given in (6.36)-
(6.43) along with Theorems 6.8-6.11.
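The two-phase ratio estimate (6.29) under simple random sampling, and the approximate bias of Theorem 6.13, can be computed as sketched below. The function names and the data are hypothetical; the second function simply evaluates the bias expression for given values of $C_{xx}$ and $C_{xy}$, which in practice would have to be estimated.

```python
def two_phase_ratio(y_second, x_second, x_first, N):
    """Two-phase ratio estimate (Y-hat / X-hat) * X-hat_D under SRS in both phases."""
    n_first, n = len(x_first), len(y_second)
    x_d = N * sum(x_first) / n_first       # first-phase estimate of X
    y_hat = N * sum(y_second) / n          # second-phase estimate of Y
    x_hat = N * sum(x_second) / n          # second-phase estimate of X
    return y_hat / x_hat * x_d

def approx_bias_subsample(N, n, n_first, Y, C_xx, C_xy):
    """Approximate bias when the second-phase sample is a subsample (Theorem 6.13)."""
    return N**2 * (n_first - n) / (n_first * n) * Y * (C_xx - C_xy)

# Made-up data: first phase observes x on 4 units, second phase observes (x, y) on 2.
estimate = two_phase_ratio([2.0, 4.0], [1.0, 2.0], [1.0, 2.0, 3.0, 4.0], N=10)
print(estimate)
```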
6.9 Use of Multi-Auxiliary Information
There are many situations in which, in addition to the study variable, information
on several related auxiliary variables will be available. In such situations, the
ratio estimator can be extended in several ways. In this section, one
straightforward extension due to Olkin (1958) is considered.
Let $\hat{X}_i$ be unbiased for $X_i$, $i = 1, 2, \ldots, k$, the population total of the ith
auxiliary variable, and $\hat{Y}$ be unbiased for Y, the population total of the study
variable. Olkin (1958) suggested a composite estimator of the form
$$\hat{Y}_{Rk} = \sum_{i=1}^{k} W_i \left[\frac{\hat{Y}}{\hat{X}_i}\right] X_i \tag{6.44}$$
where $W_1, W_2, \ldots, W_k$ are predetermined constants satisfying $\sum_{i=1}^{k} W_i = 1$.
Note that if k = 2, the above estimator reduces to
$$\hat{Y}_{R2} = W_1 \frac{\hat{Y}}{\hat{X}_1} X_1 + W_2 \frac{\hat{Y}}{\hat{X}_2} X_2 \tag{6.45}$$
where $W_1 + W_2 = 1$.
The following theorem gives the approximate bias and mean square error of the
estimator $\hat{Y}_{R2}$. In order to make the expressions compact, the following
notations are used.
$$V_0 = \frac{V(\hat{Y})}{Y^2}, \quad V_1 = \frac{V(\hat{X}_1)}{X_1^2}, \quad V_2 = \frac{V(\hat{X}_2)}{X_2^2}$$
$$C_{01} = \frac{cov(\hat{Y}, \hat{X}_1)}{Y X_1}, \quad C_{02} = \frac{cov(\hat{Y}, \hat{X}_2)}{Y X_2}, \quad C_{12} = \frac{cov(\hat{X}_1, \hat{X}_2)}{X_1 X_2}$$
Theorem 6.14 The approximate bias and mean square error of $\hat{Y}_{R2}$ are
$$B(\hat{Y}_{R2}) = Y\{V_2 - C_{02} + W_1 (V_1 - C_{01} + C_{02} - V_2)\}$$
and
$$MSE(\hat{Y}_{R2}) = Y^2\{V_0 + V_2 - 2C_{02} + W_1^2 (V_2 + V_1 - 2C_{12}) - 2W_1 (C_{01} + V_2 - C_{02} - C_{12})\}$$
Proof Let $e_0 = \frac{\hat{Y} - Y}{Y}$, $e_1 = \frac{\hat{X}_1 - X_1}{X_1}$ and $e_2 = \frac{\hat{X}_2 - X_2}{X_2}$.
The estimator $\hat{Y}_{R2}$ can be written as
$$\hat{Y}_{R2} = W_1 Y(1 + e_0)(1 + e_1)^{-1} + W_2 Y(1 + e_0)(1 + e_2)^{-1}$$
$$= Y\{W_1 (1 + e_0)(1 - e_1 + e_1^2 - \ldots) + W_2 (1 + e_0)(1 - e_2 + e_2^2 - \ldots)\}$$
$$= Y\{W_1 (1 - e_1 + e_1^2 + e_0 - e_0 e_1 + \ldots) + (1 - W_1)(1 - e_2 + e_2^2 + e_0 - e_0 e_2 + \ldots)\}$$
Therefore
$$\hat{Y}_{R2} - Y = Y\{(e_0 - e_2 + e_2^2 - e_0 e_2 \ldots) + W_1 (e_2 - e_1 + e_1^2 - e_0 e_1 - e_2^2 + e_0 e_2 \ldots)\} \tag{6.46}$$
Taking expectations on both the sides after ignoring terms of degree greater than
two, we get the approximate bias
$$B(\hat{Y}_{R2}) = Y\{V_2 - C_{02} + W_1 (V_1 - C_{01} + C_{02} - V_2)\} \tag{6.47}$$
Squaring both the sides of (6.46) and taking expectation, on ignoring terms of
degree greater than two, we get the approximate mean square error
$$MSE(\hat{Y}_{R2}) = Y^2\{V_0 + V_2 - 2C_{02} + W_1^2 (V_2 + V_1 - 2C_{12}) - 2W_1 (C_{01} + V_2 - C_{02} - C_{12})\} \tag{6.48}$$
Hence the proof.
Remark Note that the mean square error given in (6.48) attains its minimum if
$$W_1 = \frac{C_{01} + V_2 - C_{02} - C_{12}}{V_1 + V_2 - 2C_{12}} \tag{6.49}$$
The minimum mean square error of the estimator $\hat Y_{R2}$ obtained by substituting
(6.49) in (6.48) is
$$Y^2\left\{V_0 + V_2 - 2C_{02} - \frac{(C_{01} + V_2 - C_{02} - C_{12})^2}{V_1 + V_2 - 2C_{12}}\right\} \tag{6.50}$$
It is pertinent to note that the denominator of the last term is nothing but the
variance of the difference $e_1 - e_2$ and is therefore nonnegative. Hence the minimum
mean square error given in (6.50) is always less than or equal to
$Y^2\{V_0 + V_2 - 2C_{02}\}$, which is nothing but the approximate mean square error of
the ratio estimator based on the auxiliary variable $x_2$. We therefore infer that
the efficiency of the ratio estimator can be increased by using the additional
auxiliary variable. It is to be noted that the optimum value of $W_1$ given in (6.49)
requires the knowledge of some parametric values which in general will not be
known in advance. The usual practice is to use their estimated values.
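The remark above can be checked numerically. The following is a minimal sketch, assuming the relative variances and covariances $V_0, V_1, V_2, C_{01}, C_{02}, C_{12}$ are supplied by the user; the function names `olkin_two_variable` and `relative_mse` are illustrative and do not come from the text.

```python
def olkin_two_variable(v0, v1, v2, c01, c02, c12):
    """Return (optimal W1, relative minimum MSE, relative MSE using x2 only).

    'Relative' means divided by Y**2. The inputs are the relative
    variances and covariances defined before Theorem 6.14.
    """
    denom = v1 + v2 - 2.0 * c12            # relative variance of e1 - e2
    if denom <= 0:
        raise ValueError("V1 + V2 - 2*C12 must be positive")
    numer = c01 + v2 - c02 - c12
    w1 = numer / denom                     # equation (6.49)
    single = v0 + v2 - 2.0 * c02           # ratio estimator based on x2 alone
    minimum = single - numer ** 2 / denom  # equation (6.50) divided by Y**2
    return w1, minimum, single


def relative_mse(w1, v0, v1, v2, c01, c02, c12):
    """Relative approximate MSE (6.48) for an arbitrary weight W1."""
    return (v0 + v2 - 2.0 * c02
            + w1 ** 2 * (v1 + v2 - 2.0 * c12)
            - 2.0 * w1 * (c01 + v2 - c02 - c12))
```

Evaluating `relative_mse` at the weight returned by `olkin_two_variable` reproduces the minimum, and any other weight gives a larger value, which is the content of the remark.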
6.10 Ratio Estimation in Stratified Sampling
When the sample is selected in the form of a stratified sample, the ratio
estimator can be constructed in two different ways.
Let $\hat Y_h$ and $\hat X_h$, $h = 1, 2, \ldots, L$, be unbiased for $Y_h$ and $X_h$, the $h$th stratum
totals of the study and auxiliary variables respectively. Using these estimates,
the population total can be estimated by using any one of the following estimates:
$$\hat Y_{RS} = \sum_{h=1}^{L} \frac{\hat Y_h}{\hat X_h} X_h \tag{6.51}$$
$$\hat Y_{RC} = \frac{\sum_{h=1}^{L} \hat Y_h}{\sum_{h=1}^{L} \hat X_h}\, X \tag{6.52}$$
The estimates $\hat Y_{RS}$ and $\hat Y_{RC}$ are known as the separate ratio estimator and the
combined ratio estimator respectively. The separate estimator can be used to
estimate the population total only when the true stratum total $X_h$ of the
auxiliary variable is known for all strata.
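Theorem 6.15 The approximate bias and mean square error of the separate ratio
estimator are
$$B[\hat Y_{RS}] = \sum_{h=1}^{L} Y_h\left\{\frac{V(\hat X_h)}{X_h^2} - \frac{\mathrm{cov}(\hat X_h, \hat Y_h)}{X_h Y_h}\right\}$$
and
$$MSE[\hat Y_{RS}] = \sum_{h=1}^{L} Y_h^2\left\{\frac{V(\hat Y_h)}{Y_h^2} + \frac{V(\hat X_h)}{X_h^2} - \frac{2\,\mathrm{cov}(\hat X_h, \hat Y_h)}{X_h Y_h}\right\}$$
Proof The bias of the estimator under consideration is
$$B[\hat Y_{RS}] = E[\hat Y_{RS}] - Y = \sum_{h=1}^{L}\left\{E\left[\frac{\hat Y_h}{\hat X_h} X_h\right] - Y_h\right\}$$
$$= \sum_{h=1}^{L} Y_h\left\{\frac{V(\hat X_h)}{X_h^2} - \frac{\mathrm{cov}(\hat X_h, \hat Y_h)}{X_h Y_h}\right\} \quad \text{(using Theorem 6.1)} \tag{6.53}$$
The mean square error of the separate ratio estimator is
$$MSE[\hat Y_{RS}] = E[\hat Y_{RS} - Y]^2 = E\left[\sum_{h=1}^{L}\left\{\frac{\hat Y_h}{\hat X_h} X_h - Y_h\right\}\right]^2$$
$$= \sum_{h=1}^{L} Y_h^2\left\{\frac{V(\hat Y_h)}{Y_h^2} + \frac{V(\hat X_h)}{X_h^2} - \frac{2\,\mathrm{cov}(\hat X_h, \hat Y_h)}{X_h Y_h}\right\} \tag{6.54}$$
The above mean square error is only an approximate expression and it is
obtained by applying Theorem 6.1 under the assumptions stated in the same
theorem. Hence the proof.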
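The combined ratio estimator is constructed by using $\sum_{h=1}^{L} \hat X_h$ and $\sum_{h=1}^{L} \hat Y_h$ as
estimates for the population totals $X$ and $Y$ respectively. Therefore the
approximate bias and mean square error are
$$Y\left\{\frac{V\left(\sum_{h=1}^{L} \hat X_h\right)}{X^2} - \frac{\mathrm{cov}\left(\sum_{h=1}^{L} \hat X_h, \sum_{h=1}^{L} \hat Y_h\right)}{XY}\right\} \tag{6.55}$$
and
$$Y^2\left\{\frac{V\left(\sum_{h=1}^{L} \hat Y_h\right)}{Y^2} + \frac{V\left(\sum_{h=1}^{L} \hat X_h\right)}{X^2} - \frac{2\sum_{h=1}^{L}\mathrm{cov}(\hat X_h, \hat Y_h)}{XY}\right\} \tag{6.56}$$
respectively.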
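The expressions given in (6.53)-(6.56) are applicable for any sampling design.
In particular, if simple random sampling is used in all the $L$ strata then the
expressions for the separate ratio estimator reduce to
$$\text{(i)}\quad B[\hat Y_{RS}] = \sum_{h=1}^{L} \frac{N_h^2(N_h - n_h)}{N_h n_h}\, Y_h\{C_{xxh} - C_{xyh}\}$$
and
$$\text{(ii)}\quad MSE[\hat Y_{RS}] = \sum_{h=1}^{L} \frac{N_h^2(N_h - n_h)}{N_h n_h}\, Y_h^2\{C_{xxh} + C_{yyh} - 2C_{xyh}\}$$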
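The two constructions (6.51) and (6.52) can be sketched as follows. This is only an illustration, assuming simple random sampling within each stratum so that the stratum totals are estimated by $N_h$ times the sample mean; the dictionary keys `N`, `X`, `x`, `y` and the function names are hypothetical choices made for this sketch.

```python
def expansion_total(values, stratum_size):
    """Expansion estimate of a stratum total under SRS: N_h times the sample mean."""
    return stratum_size * sum(values) / len(values)


def separate_ratio(strata):
    """Separate ratio estimate (6.51): sum over strata of (Yhat_h / Xhat_h) * X_h."""
    total = 0.0
    for st in strata:
        y_hat = expansion_total(st["y"], st["N"])
        x_hat = expansion_total(st["x"], st["N"])
        total += y_hat / x_hat * st["X"]   # needs the known stratum total X_h
    return total


def combined_ratio(strata):
    """Combined ratio estimate (6.52): (sum of Yhat_h / sum of Xhat_h) * X."""
    y_hat = sum(expansion_total(st["y"], st["N"]) for st in strata)
    x_hat = sum(expansion_total(st["x"], st["N"]) for st in strata)
    x_total = sum(st["X"] for st in strata)  # only the overall total X is needed
    return y_hat / x_hat * x_total
```

The sketch makes the practical difference visible: `separate_ratio` reads the known total `X` of every stratum, whereas `combined_ratio` only uses their sum.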
6.11 Problems and Solutions
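Problem 6.1 Consider the estimator $\hat Y_\alpha = \hat Y\left[\dfrac{X}{\hat X}\right]^{\alpha}$ which reduces to the ratio
estimator when $\alpha = 1$ and to the conventional expansion estimator $\hat Y$ if $\alpha = 0$.
Derive the approximate bias and mean square error of the above estimator and
also the minimum mean square error with respect to $\alpha$ (Shrivastava, 1967).
Solution Using the notations introduced in Section 6.2, the estimator
$\hat Y_\alpha = \hat Y\left[\dfrac{X}{\hat X}\right]^{\alpha}$ can be written as
$$\hat Y_\alpha = Y(1 + e_0)(1 + e_1)^{-\alpha}$$
$$= Y(1 + e_0)\left\{1 - \alpha e_1 + \frac{\alpha(\alpha + 1)}{2} e_1^2 - \ldots\right\}$$
$$= Y\left\{1 - \alpha e_1 + \frac{\alpha(\alpha + 1)}{2} e_1^2 + e_0 - \alpha e_0 e_1 + \ldots\right\}$$
Therefore
$$\hat Y_\alpha - Y = Y\left\{e_0 - \alpha e_1 + \frac{\alpha(\alpha + 1)}{2} e_1^2 - \alpha e_0 e_1 + \ldots\right\} \tag{6.57}$$
Taking expectation on both the sides after ignoring terms of degree greater than
two, we get the approximate bias as
$$B = Y\left\{\frac{\alpha(\alpha + 1)}{2} E(e_1^2) - \alpha E(e_0 e_1)\right\}$$
$$= Y\left\{\frac{\alpha(\alpha + 1)}{2}\frac{V(\hat X)}{X^2} - \alpha\frac{\mathrm{cov}(\hat X, \hat Y)}{XY}\right\} \tag{6.58}$$
Squaring and taking expectations on both the sides of (6.57) after ignoring terms
of degree greater than two, we get the approximate mean square error as
$$M = Y^2[E(e_0^2) + \alpha^2 E(e_1^2) - 2\alpha E(e_0 e_1)]$$
$$= Y^2\left\{\frac{V(\hat Y)}{Y^2} + \alpha^2\frac{V(\hat X)}{X^2} - 2\alpha\frac{\mathrm{cov}(\hat Y, \hat X)}{XY}\right\} \tag{6.59}$$
By employing the usual calculus methods, we note that the above mean square
error is minimum if
$$\alpha = \frac{X}{Y}\,\frac{\mathrm{cov}(\hat X, \hat Y)}{V(\hat X)} \tag{6.60}$$
Substituting (6.60) in (6.59), we get the minimum mean square error as
$V(\hat Y)(1 - \rho^2)$ where $\rho$ is the correlation coefficient between $\hat Y$ and $\hat X$.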
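Problem 6.2 Consider the estimator $\hat Y_\alpha = \alpha\hat Y + (1 - \alpha)\hat Y\dfrac{\hat X}{X}$ which reduces to
the conventional expansion and product estimators when $\alpha = 1$ and $0$
respectively. Derive its bias, its approximate mean square error and the
minimum mean square error with respect to $\alpha$.
Solution The estimator $\hat Y_\alpha$ can be written as
$$\hat Y_\alpha = \alpha\hat Y + (1 - \alpha)\hat Y\frac{\hat X}{X}$$
$$= \alpha Y(1 + e_0) + (1 - \alpha)Y(1 + e_0)(1 + e_1)$$
$$= Y(1 + e_0)[\alpha + (1 - \alpha)(1 + e_1)]$$
$$= Y[1 + e_1 - \alpha e_1 + e_0 + e_0 e_1 - \alpha e_0 e_1]$$
Therefore
$$\hat Y_\alpha - Y = Y[e_0 + (1 - \alpha)(e_1 + e_0 e_1)] \tag{6.61}$$
Taking expectations on both sides, we get the bias of the estimator
$$B = (1 - \alpha)\,Y\left[\frac{\mathrm{cov}(\hat X, \hat Y)}{XY}\right]$$
Squaring both the sides of (6.61) and taking expectations after ignoring terms of
degree greater than two, we get the approximate mean square error as
$$M = Y^2\left\{\frac{V(\hat Y)}{Y^2} + (1 - \alpha)^2\frac{V(\hat X)}{X^2} + 2(1 - \alpha)\frac{\mathrm{cov}(\hat X, \hat Y)}{XY}\right\}$$
It can be seen that the above mean square error is minimum if
$$\alpha = 1 + \frac{X}{Y}\,\frac{\mathrm{cov}(\hat X, \hat Y)}{V(\hat X)}$$
and the minimum mean square error is $V(\hat Y)(1 - \rho^2)$ where $\rho$ is the
correlation coefficient between $\hat Y$ and $\hat X$.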
Problem 6.3 Derive the approximate bias and mean square error of the estimator
$\alpha\hat Y\dfrac{X}{\hat X}$ and also find the minimum mean square error.
Solution The given estimator can be written as
$$\hat Y_\alpha = \alpha\hat Y\frac{X}{\hat X} = \alpha Y(1 + e_0)(1 + e_1)^{-1}$$
$$= \alpha Y(1 + e_0)(1 - e_1 + e_1^2 - \ldots)$$
$$= \alpha Y(1 - e_1 + e_1^2 + e_0 - e_0 e_1 + \ldots)$$
Therefore
$$\hat Y_\alpha - Y = (\alpha - 1)Y + \alpha Y(e_0 - e_1 + e_1^2 - e_0 e_1 + \ldots)$$
Taking expectations on both sides of the above expression we get the approximate
bias of the estimator as
$$(\alpha - 1)Y + \alpha Y\left\{\frac{V(\hat X)}{X^2} - \frac{\mathrm{cov}(\hat X, \hat Y)}{XY}\right\}$$
Proceeding in the usual way we get the approximate mean square error of the
estimator as
$$M = (\alpha - 1)^2 Y^2 + \alpha^2 Y^2\left\{\frac{V(\hat Y)}{Y^2} + \frac{V(\hat X)}{X^2} - \frac{2\,\mathrm{cov}(\hat Y, \hat X)}{XY}\right\} + 2\alpha(\alpha - 1)Y^2\left\{\frac{V(\hat X)}{X^2} - \frac{\mathrm{cov}(\hat X, \hat Y)}{XY}\right\}$$
Minimising the above mean square error with respect to $\alpha$ and substituting the
optimum value in the mean square error expression we get the minimum mean
square error.
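Problem 6.4 Let $\bar x'$ and $\bar x$ be the sample means of the first and second phase
samples in two phase sampling when the samples are drawn independently and simple
random sampling is used in both the phases of sampling. Show that the estimator
$$\bar y^{*} = \bar r\,\bar x' + \frac{n(N - 1)}{N(n - 1)}(\bar y - \bar r\,\bar x)$$
is unbiased for the population mean, where $\bar y$ is the sample mean of the study
variable and $\bar r$ is the mean of the ratios $y_i/x_i$, both based on the second phase sample.
Solution Since the samples are drawn without replacement and independently
of each other in the two phases of sampling we have
$$E[\bar y^{*}] = E[\bar r]\,E[\bar x'] + \frac{n(N - 1)}{N(n - 1)} E(\bar y - \bar r\,\bar x)$$
Note that
$$E[\bar r\,\bar x] = \mathrm{cov}(\bar r, \bar x) + E[\bar r]E[\bar x]$$
$$= \frac{N - n}{Nn}\,\frac{1}{N - 1}\sum_{i=1}^{N}(R_i - \bar R)(X_i - \bar X) + \bar R\,\bar X$$
$$= \frac{N - n}{n(N - 1)}\bar Y + \frac{(n - 1)N}{n(N - 1)}\bar R\,\bar X \quad \text{where } \bar R = \frac{1}{N}\sum_{i=1}^{N} R_i,\ R_i = \frac{Y_i}{X_i}$$
so that $E(\bar y - \bar r\,\bar x) = \dfrac{N(n - 1)}{n(N - 1)}(\bar Y - \bar R\,\bar X)$. Therefore we have
$$E(\bar y^{*}) = \frac{n(N - 1)}{N(n - 1)}\,\frac{N(n - 1)}{n(N - 1)}(\bar Y - \bar R\,\bar X) + \bar R\,\bar X = \bar Y$$
Hence the solution.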
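The unbiasedness claimed in Problem 6.4 can be verified exactly on a small artificial population by enumerating every pair of first and second phase samples. The sketch below is one way of doing this, assuming simple random sampling without replacement in both phases; the function names and the use of exact rational arithmetic are choices made for this illustration.

```python
from fractions import Fraction
from itertools import combinations


def mean(values):
    """Exact arithmetic mean of a list of Fractions."""
    return sum(values, Fraction(0)) / len(values)


def expected_estimate(x, y, n_first, n_second):
    """Exact expectation of ybar* over all pairs of independent SRSWOR samples."""
    N = len(x)
    factor = Fraction(n_second * (N - 1), N * (n_second - 1))
    first_samples = list(combinations(range(N), n_first))
    second_samples = list(combinations(range(N), n_second))
    total = Fraction(0)
    for s1 in first_samples:
        x_first = mean([x[i] for i in s1])            # first phase mean
        for s2 in second_samples:
            x_bar = mean([x[i] for i in s2])
            y_bar = mean([y[i] for i in s2])
            r_bar = mean([y[i] / x[i] for i in s2])   # mean of the ratios y_i / x_i
            total += r_bar * x_first + factor * (y_bar - r_bar * x_bar)
    # every pair of samples is equally likely under independent SRSWOR
    return total / (len(first_samples) * len(second_samples))
```

For any population with nonzero $x$ values and a second phase sample size of at least two, the value returned should coincide with the population mean of $y$.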
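Problem 6.5 Derive an unbiased ratio-type estimator based on $\bar x_w$, the mean of
the, say, $w$ distinct units in two phase sampling when independent samples are
drawn in the two phases of sampling using simple random sampling.
Solution Let $\bar r\,\bar x_w$ be an estimator of the population mean $\bar Y$ where $\bar r$ is as
defined in the last problem. Note that $E[\bar r\,\bar x_w] = E\{E(\bar r\,\bar x_w \mid s_w)\}$.
Since each unit in a particular subset $s_w$ (containing $w$ distinct units) is given
equal chance for being included in the sample, we get $E[\bar r \mid s_w] = \bar r_w$.
Therefore
$$E[\bar r\,\bar x_w] = E[\bar r_w \bar x_w]$$
$$= E\{\mathrm{cov}(\bar r_w, \bar x_w \mid w) + E[\bar r_w \mid w]\,E[\bar x_w \mid w]\}$$
$$= E\left\{\frac{N - w}{Nw}\,\frac{1}{N - 1}\sum_{i=1}^{N}(R_i - \bar R)(X_i - \bar X) + \bar R\,\bar X\right\} \quad \text{(refer the next problem)}$$
where the expectation in the right hand side of the above expression is with
respect to the distribution of $w$. Hence
$$E[\bar r\,\bar x_w] = E\left\{\frac{N - w}{w(N - 1)}\right\}\bar Y + E\left\{\frac{N(w - 1)}{w(N - 1)}\right\}\bar R\,\bar X$$
Hence the bias of $\bar r\,\bar x_w$ is
$$B(\bar r\,\bar x_w) = E(\bar r\,\bar x_w) - \bar Y = -E\left\{\frac{N(w - 1)}{w(N - 1)}\right\}(\bar Y - \bar R\,\bar X)$$
Further we know that $E[\bar y - \bar r\,\bar x] = \left\{\dfrac{N(n - 1)}{n(N - 1)}\right\}(\bar Y - \bar R\,\bar X)$.
Therefore the bias can be estimated by
$$\hat B(\bar r\,\bar x_w) = -E\left\{\frac{N(w - 1)}{w(N - 1)}\right\}\left\{\frac{n(N - 1)}{N(n - 1)}\right\}(\bar y - \bar r\,\bar x) = -E\left\{\frac{n(w - 1)}{w(n - 1)}\right\}(\bar y - \bar r\,\bar x)$$
Hence an unbiased estimator of the population mean is
$$\bar y_w = \bar r\,\bar x_w + \left\{\frac{n(w - 1)}{w(n - 1)}\right\}(\bar y - \bar r\,\bar x)$$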
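Problem 6.6 Under the notations used in Problem 6.5 derive (a) $E(\bar x_w)$, (b)
$V(\bar x_w)$ and (c) $\mathrm{cov}(\bar x_w, \bar y)$.
Solution (a) $E(\bar x_w) = E\,E(\bar x_w \mid w) = E(\bar X) = \bar X$
(b) $V(\bar x_w) = E\,V(\bar x_w \mid w) + V\,E(\bar x_w \mid w) = E\left\{\dfrac{1}{w} - \dfrac{1}{N}\right\}S_x^2 + V(\bar X) = E\left\{\dfrac{1}{w} - \dfrac{1}{N}\right\}S_x^2$
(c) We can write $E(\bar y \mid s_w) = \bar y_w$.
Therefore
$$\mathrm{cov}(\bar x_w, \bar y) = E(\bar x_w \bar y) - E(\bar x_w)E(\bar y)$$
$$= E\,E(\bar x_w \bar y \mid s_w) - E(\bar x_w)\,E\,E(\bar y \mid s_w)$$
$$= E(\bar x_w \bar y_w) - E(\bar x_w)E(\bar y_w) = \mathrm{cov}(\bar x_w, \bar y_w)$$
Hence the solution.
Exercises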
6.1 Derive under simple random sampling the approximate bias and mean
.. .. s2
square error of the estimator Y RS = Y ; .
Sx
6.2 Derive the approximate bias and minimum mean square error of the
estimator $\hat Y_\alpha = \dfrac{\hat Y}{\hat X}[\alpha X + (1 - \alpha)\hat X]$ and compare the minimum mean square
error with the mean square error of the linear regression estimator.
6.3 Let $\bar x^{*}$ be the mean of distinct units when samples are drawn
independently in two phase sampling. Derive the approximate bias and
mean square error of the estimator $\hat{\bar Y} = \dfrac{\bar y}{\bar x}\,\bar x^{*}$ assuming simple random
sampling is used in both the phases of sampling.
6.4 Derive the minimum mean square error of the estimator
$\hat{\bar Y} = \bar y + b(\bar x^{*} - \bar x)$ under the notations explained in 6.3.
6.5 Show that for a sample of $n$ units selected using srswor,
$$B[\hat R_n \bar X] = \frac{n}{N}\,\frac{N - 1}{n - 1}\,E[\bar x(\hat R_n - \hat R_1)]$$
where $\hat R_1 = \dfrac{\bar y}{\bar x}$ and $\hat R_n = \dfrac{1}{n}\sum_{i=1}^{n}\dfrac{Y_i}{X_i}$, $Y_i$ and $X_i$ being the $y$ and $x$ values of the $i$th
drawn unit.
6.6 Let $C_x$ be the coefficient of variation of $x$. Derive the condition under
which the estimator $\hat Y\,\dfrac{X + C_x}{\hat X + C_x}$ is more efficient than the usual ratio
estimator, assuming simple random sampling is used.
Chapter 7
Regression Estimation
7.1 Introduction
Like ratio estimation, regression estimation is another method of estimation of a
finite population total using the knowledge of an auxiliary variable x which is
closely related to the study variable y. The regression estimator is developed
below.
We know that, when the variables $x$ and $y$ are linearly related, the least squares
estimates of the slope and intercept are respectively $b = \dfrac{s_{xy}}{s_x^2}$ and $a = \hat{\bar Y} - b\hat{\bar X}$,
where $\hat{\bar Y}$ and $\hat{\bar X}$ are the sample means.
The population total $Y$ can be expressed as
$$Y = \sum_{i \in s} Y_i + \sum_{i \in \bar s} Y_i \quad \text{where } \bar s = S - s. \tag{7.1}$$
Once the sample is observed, the first term in the right hand side becomes fully
known. Using the least squares estimates $b = \dfrac{s_{xy}}{s_x^2}$ and $a = \hat{\bar Y} - b\hat{\bar X}$, each
unobserved $y$ value can be estimated by
$$\hat Y_i = a + bX_i,\quad i \in \bar s$$
Summing both the sides over $\bar s = S - s$, we get
$$\sum_{i \in \bar s}\hat Y_i = (N - n)(\hat{\bar Y} - b\hat{\bar X}) + b\sum_{i \in \bar s} X_i = (N - n)(\hat{\bar Y} - b\hat{\bar X}) + b[N\bar X - n\hat{\bar X}]$$
Substituting these estimated values for the unobserved $y$ values in (7.1), we get
an estimator for the population total $Y$ as
$$\hat Y = \sum_{i \in s} Y_i + (N - n)(\hat{\bar Y} - b\hat{\bar X}) + b[N\bar X - n\hat{\bar X}]$$
$$= n\hat{\bar Y} + (N - n)(\hat{\bar Y} - b\hat{\bar X}) + b[N\bar X - n\hat{\bar X}]$$
$$= N\hat{\bar Y} + Nb[\bar X - \hat{\bar X}] \tag{7.2}$$
The above estimator is known as the Linear Regression estimator of the
population total $Y$. It should be noted that the above estimator is not unbiased for
the population total under simple random sampling. The following theorem gives
the approximate mean square error of the linear regression estimator.
Theorem 7.1 The approximate mean square error of the regression estimator
under simple random sampling is $\dfrac{N^2(N - n)}{Nn}S_y^2(1 - \rho^2)$.
Proof Define
$$e_0 = \frac{\hat Y - Y}{Y},\quad e_1 = \frac{\hat X - X}{X},\quad e_2 = \frac{s_{xy} - S_{xy}}{S_{xy}},\quad e_3 = \frac{s_x^2 - S_x^2}{S_x^2}$$
It is to be noted that $E(e_i) = 0$, $i = 0, 1, 2, 3$.
The regression estimator can be expressed as
$$\hat Y_{LR} = N\hat{\bar Y} + Nb[\bar X - \hat{\bar X}]$$
$$= Y(1 + e_0) + \frac{S_{xy}(1 + e_2)}{S_x^2(1 + e_3)}[X - X(1 + e_1)]$$
$$= Y(1 + e_0) - XB(1 + e_2)(1 + e_3)^{-1}e_1 \quad \text{where } B = \frac{S_{xy}}{S_x^2}$$
Assuming $|e_i| < 1$, $i = 0, 1, 2, 3$, the above expression can be modified as
$$\hat Y_{LR} - Y = Ye_0 - XBe_1(1 + e_2)(1 - e_3 + e_3^2 - e_3^3 + \ldots)$$
$$= Ye_0 - XB(e_1 - e_1e_3 + e_1e_2) \quad \text{(ignoring terms of degree greater than two)}$$
Squaring both the sides and taking expectations, again retaining only terms of
degree at most two, we get
$$E(\hat Y_{LR} - Y)^2 = Y^2E(e_0^2) + X^2B^2E(e_1^2) - 2XYB\,E(e_0e_1)$$
$$= \frac{N^2(N - n)}{Nn}[S_y^2 + B^2S_x^2 - 2BS_{xy}]$$
$$= \frac{N^2(N - n)}{Nn}\left[S_y^2 + \frac{S_{xy}^2}{S_x^4}S_x^2 - 2\frac{S_{xy}^2}{S_x^2}\right]$$
$$= \frac{N^2(N - n)}{Nn}S_y^2\left[1 - \frac{S_{xy}^2}{S_x^2S_y^2}\right]$$
$$= \frac{N^2(N - n)}{Nn}S_y^2(1 - \rho^2)$$
Hence the proof.
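Theorem 7.2 Under simple random sampling, $V[\hat Y_{srs}] \ge MSE[\hat Y_{LR}]$ and
$MSE[\hat Y_R] \ge MSE[\hat Y_{LR}]$, where the mean square errors are the approximate ones.
Proof Since $-1 \le \rho \le 1$, we have $(1 - \rho^2) \le 1$.
Therefore $\dfrac{N^2(N - n)}{Nn}S_y^2(1 - \rho^2) \le \dfrac{N^2(N - n)}{Nn}S_y^2$.
Hence $V[\hat Y_{srs}] \ge MSE[\hat Y_{LR}]$.
Consider the difference
$$MSE[\hat Y_R] - MSE[\hat Y_{LR}] = \frac{N^2(N - n)}{Nn}\{S_y^2 + R^2S_x^2 - 2RS_{xy} - S_y^2 + S_y^2\rho^2\}$$
$$= \frac{N^2(N - n)}{Nn}\{S_y^2\rho^2 + R^2S_x^2 - 2RS_{xy}\}$$
$$= \frac{N^2(N - n)}{Nn}\left[\frac{S_{xy}^2}{S_x^2} + R^2S_x^2 - 2RS_{xy}\right]$$
$$= \frac{N^2(N - n)}{Nn}\,\frac{(S_{xy} - RS_x^2)^2}{S_x^2}$$
Since the right hand side of the above expression is always nonnegative, the
result follows.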
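The ordering stated in Theorem 7.2 can be illustrated on a fully known population. The sketch below evaluates the three expressions used in the proof, assuming simple random sampling without replacement; the function name `srs_comparison` is illustrative only.

```python
def srs_comparison(x, y, n):
    """Return (V_srs, approximate MSE of ratio, approximate MSE of regression).

    x and y are the complete population values; n is the sample size.
    """
    N = len(x)
    x_bar = sum(x) / N
    y_bar = sum(y) / N
    s_xx = sum((a - x_bar) ** 2 for a in x) / (N - 1)
    s_yy = sum((b - y_bar) ** 2 for b in y) / (N - 1)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (N - 1)
    k = N ** 2 * (N - n) / (N * n)       # common factor N^2 (N - n) / (N n)
    R = y_bar / x_bar                    # population ratio
    rho_sq = s_xy ** 2 / (s_xx * s_yy)   # squared correlation coefficient
    v_srs = k * s_yy
    mse_ratio = k * (s_yy + R ** 2 * s_xx - 2 * R * s_xy)
    mse_reg = k * s_yy * (1 - rho_sq)
    return v_srs, mse_ratio, mse_reg
```

For any population with nonconstant $x$ and $y$, the third value should not exceed either of the first two, up to rounding error.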
7.2 Difference Estimation
The ratio estimator, which is obtained by multiplying the conventional estimator
$\hat Y$ by the factor $\dfrac{X}{\hat X}$, is an alternative to the estimator $\hat Y$. Here we shall examine
the possibility of improving upon $\hat Y$ by considering the estimator obtained by
adding to $\hat Y$ a constant times the difference $X - \hat X$, whose expected value is
zero. That is, as an estimator of $Y$, we take
$$\hat Y_{DR} = \hat Y + \lambda(X - \hat X) \tag{7.3}$$
where $\lambda$ is a predetermined value. Since the above estimator depends on the
difference $X - \hat X$ rather than the ratio $\dfrac{X}{\hat X}$, it is termed the "Difference
estimator". The difference estimator is unbiased for the population total $Y$ and its
variance is
$$V(\hat Y_{DR}) = E[\hat Y_{DR} - Y]^2$$
$$= E[(\hat Y - Y) + \lambda(X - \hat X)]^2$$
$$= E(\hat Y - Y)^2 + \lambda^2E(\hat X - X)^2 - 2\lambda E(\hat X - X)(\hat Y - Y)$$
$$= V(\hat Y) + \lambda^2V(\hat X) - 2\lambda\,\mathrm{cov}(\hat X, \hat Y) \tag{7.4}$$
The above expression for variance is applicable for any sampling design yielding
unbiased estimators for $Y$ and $X$. It can be seen that the above variance is
minimum if
$$\lambda = \frac{\mathrm{cov}(\hat X, \hat Y)}{V(\hat X)} \tag{7.5}$$
and the resulting minimum variance is $V(\hat Y)[1 - \rho^2(\hat X, \hat Y)]$ where $\rho(\hat X, \hat Y)$ is the
coefficient of correlation between $\hat X$ and $\hat Y$. It is interesting to note that, when
simple random sampling is used, the optimum value of $\lambda$ is $\dfrac{S_{xy}}{S_x^2}$ and the
minimum variance happens to be $\dfrac{N^2(N - n)}{Nn}S_y^2(1 - \rho^2)$, which is nothing but
the approximate mean square error of the linear regression estimator. It is
pertinent to note that the optimum value of $\lambda$ depends on $S_{xy}$, which in general
will not be known. Normally in such situations, survey practitioners use unbiased
estimators for unknown quantities. The value derived from the optimal choice
happens to be the least squares estimate. Therefore the estimator $\hat Y_{DR}$ reduces to
the linear regression estimator. It is also to be noted that the difference estimator
reduces to the ratio estimator when $\lambda = \dfrac{\hat Y}{\hat X}$.
7.3 Double Sampling in Difference Estimation
As in the case of ratio estimation, here also one can employ the double sampling
method to estimate the population total $Y$ whenever the population total $X$ of the
auxiliary variable is not known. The difference estimator for the population total
under double sampling is defined as
$$\hat Y_{DD} = \hat Y + \lambda(\hat X_d - \hat X) \tag{7.6}$$
where $\hat X_d$ is an unbiased estimator of the population total $X$ based on the first
phase sample. Evidently the difference estimator is unbiased for the population
total in both the cases of double sampling.
Note that
$$V(\hat Y_{DD}) = E[\hat Y_{DD} - Y]^2$$
$$= E[(\hat Y - Y) + \lambda(\hat X_d - \hat X)]^2$$
$$= E[(\hat Y - Y) + \lambda\{(\hat X_d - X) - (\hat X - X)\}]^2$$
$$= V(\hat Y) + \lambda^2[V(\hat X) + V(\hat X_d) - 2\,\mathrm{cov}(\hat X, \hat X_d)] + 2\lambda[\mathrm{cov}(\hat Y, \hat X_d) - \mathrm{cov}(\hat X, \hat Y)] \tag{7.7}$$
When the samples are drawn independently, $\mathrm{cov}(\hat X, \hat X_d) = \mathrm{cov}(\hat Y, \hat X_d) = 0$ and the
above variance reduces to
$$V[\hat Y_{DD}] = V(\hat Y) + \lambda^2[V(\hat X) + V(\hat X_d)] - 2\lambda\,\mathrm{cov}(\hat Y, \hat X)$$
The following theorem gives the variance of the difference estimator in double
sampling when the samples are drawn independently in two phases of sampling
using simple random sampling.
Theorem 7.3 When the samples are drawn independently in the two phases of
sampling using simple random sampling the variance of the difference estimator
is
$$V(\hat Y_{DD}) = N^2[fS_y^2 + \lambda^2(f + f')S_x^2 - 2\lambda fS_{xy}]$$
where $f = \dfrac{N - n}{Nn}$ and $f' = \dfrac{N - n'}{Nn'}$. Here $n'$ and $n$ are the sample sizes
corresponding to the first and second phases of sampling. Further the minimum
variance of the difference estimator in this case is
$$N^2fS_y^2\left[1 - \frac{f}{f + f'}\rho^2\right]$$
where $\rho$ is the correlation coefficient between $x$ and $y$.
Proof Using the results stated in Section 6.8 in the variance expression
available in (7.7) we get
$$V(\hat Y_{DD}) = \frac{N^2(N - n)}{Nn}S_y^2 + \lambda^2\left[\frac{N^2(N - n')}{Nn'}S_x^2 + \frac{N^2(N - n)}{Nn}S_x^2\right] - 2\lambda\frac{N^2(N - n)}{Nn}S_{xy}$$
$$= N^2[fS_y^2 + \lambda^2(f + f')S_x^2 - 2\lambda fS_{xy}] \tag{7.8}$$
Differentiating the above variance expression partially with respect to $\lambda$ and
equating the derivative to zero, we get $\lambda = \dfrac{f}{f + f'}\,\dfrac{S_{xy}}{S_x^2}$. Substituting this value in
(7.8) and simplifying the resulting expression we get the minimum variance
$$N^2fS_y^2\left[1 - \frac{f}{f + f'}\rho^2\right]$$
It is to be noted that the second order derivative is always positive. Hence the
proof.
Theorem 7.4 When the second phase sample is a subsample of the first phase
sample and simple random sampling is used in both the phases of sampling the
variance of the difference estimator is
$$V(\hat Y_{DD}) = N^2[fS_y^2 + \lambda^2(f - f')S_x^2 + 2\lambda(f' - f)S_{xy}].$$
The minimum variance of the difference estimator in this case is
$$N^2S_y^2[f'\rho^2 + f(1 - \rho^2)]$$
where $f$ and $f'$ are as defined in Theorem 7.3.
Proof of this theorem is left as an exercise.
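The two variance expressions can be compared numerically. The following sketch, assuming the population quantities $S_y^2$, $S_x^2$ and $S_{xy}$ are known, evaluates the variance of Theorem 7.3 and that of Theorem 7.4 for a given $\lambda$; all function names are hypothetical.

```python
def finite_factor(N, n):
    """f = (N - n) / (N n), the factor used in Theorems 7.3 and 7.4."""
    return (N - n) / (N * n)


def variance_independent(lam, N, n, n_first, s_yy, s_xx, s_xy):
    """Variance (7.8): the two phase samples are drawn independently."""
    f, f1 = finite_factor(N, n), finite_factor(N, n_first)
    return N ** 2 * (f * s_yy + lam ** 2 * (f + f1) * s_xx - 2 * lam * f * s_xy)


def variance_subsample(lam, N, n, n_first, s_yy, s_xx, s_xy):
    """Variance of Theorem 7.4: the second phase is a subsample of the first."""
    f, f1 = finite_factor(N, n), finite_factor(N, n_first)
    return N ** 2 * (f * s_yy + lam ** 2 * (f - f1) * s_xx + 2 * lam * (f1 - f) * s_xy)


def optimum_independent(N, n, n_first, s_xx, s_xy):
    """Minimising value of lambda for the independent case."""
    f, f1 = finite_factor(N, n), finite_factor(N, n_first)
    return f * s_xy / ((f + f1) * s_xx)
```

In the subsample case the minimising value is $\lambda = S_{xy}/S_x^2$, so `variance_subsample` evaluated at that value should agree with the minimum stated in Theorem 7.4.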
7.4 Multivariate Difference Estimation
When information about more than one auxiliary variable is known, the
difference estimator defined in Section 7.2 can be extended in a straightforward
manner.
Let $\hat Y$, $\hat X_1$ and $\hat X_2$ be unbiased estimators for the population totals $Y$, $X_1$ and
$X_2$ of the study variable $y$ and the auxiliary variables $x_1$ and $x_2$ respectively. The
difference estimator of the population total $Y$ is defined as
$$\hat Y_{D2} = \hat Y + \theta_1(X_1 - \hat X_1) + \theta_2(X_2 - \hat X_2) \tag{7.9}$$
where the constants $\theta_1$ and $\theta_2$ are predetermined.
The estimator $\hat Y_{D2}$ is unbiased for the population total and its variance is
$$V(\hat Y_{D2}) = E[(\hat Y - Y) + \theta_1(X_1 - \hat X_1) + \theta_2(X_2 - \hat X_2)]^2$$
$$= V(\hat Y) + \theta_1^2V(\hat X_1) + \theta_2^2V(\hat X_2) - 2\theta_1\mathrm{cov}(\hat Y, \hat X_1) - 2\theta_2\mathrm{cov}(\hat Y, \hat X_2) + 2\theta_1\theta_2\mathrm{cov}(\hat X_1, \hat X_2)$$
Denote by
$$V_0 = V(\hat Y),\quad V_1 = V(\hat X_1),\quad V_2 = V(\hat X_2)$$
$$C_{01} = \mathrm{cov}(\hat Y, \hat X_1),\quad C_{02} = \mathrm{cov}(\hat Y, \hat X_2),\quad C_{12} = \mathrm{cov}(\hat X_1, \hat X_2)$$
Differentiating the variance expression partially with respect to $\theta_1$ and $\theta_2$ and
equating the derivatives to zero, we get the following equations
$$V_1\theta_1 + C_{12}\theta_2 = C_{01} \tag{7.10}$$
$$C_{12}\theta_1 + V_2\theta_2 = C_{02} \tag{7.11}$$
Solving these two equations, we obtain
$$\theta_1 = \frac{C_{01}V_2 - C_{12}C_{02}}{V_1V_2 - C_{12}^2} \tag{7.12}$$
$$\theta_2 = \frac{C_{02}V_1 - C_{12}C_{01}}{V_1V_2 - C_{12}^2} \tag{7.13}$$
Substituting these values in the variance expression, we get after simplification
the minimum variance
$$V(\hat Y)[1 - R^2_{y.x_1x_2}] \tag{7.14}$$
where $R_{y.x_1x_2}$ is the multiple correlation between $\hat Y$ and $\hat X_1, \hat X_2$. Since the
multiple correlation between $\hat Y$ and $\hat X_1, \hat X_2$ is never smaller than the
correlation between $\hat Y$ and $\hat X_1$ or that between $\hat Y$ and $\hat X_2$, we infer that the
use of additional auxiliary information will never decrease the efficiency of the
estimator. However, it should be noted that the values of $\theta_1$ and $\theta_2$ given in
(7.12) and (7.13) depend on $C_{01}$ and $C_{02}$ which in general will not be known.
The following theorem proves that whenever $b_1$ and $b_2$ are used in place of $\theta_1$
and $\theta_2$ given in (7.12) and (7.13), the resulting estimator will have mean square
error that is approximately equal to the minimum variance given in (7.14), where
$$b_1 = \frac{c_{01}V_2 - C_{12}c_{02}}{V_1V_2 - C_{12}^2} \tag{7.15}$$
$$b_2 = \frac{c_{02}V_1 - C_{12}c_{01}}{V_1V_2 - C_{12}^2} \tag{7.16}$$
and $c_{01}$ and $c_{02}$ are unbiased estimators for $C_{01}$ and $C_{02}$ respectively.
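Theorem 7.5 The approximate mean square error of the estimator
$$\hat Y^{*}_{D2} = \hat Y + b_1(X_1 - \hat X_1) + b_2(X_2 - \hat X_2)$$
is the same as the minimum variance of the difference estimator defined in (7.9),
where $b_1$ and $b_2$ are as defined in (7.15) and (7.16) respectively.
Proof Let
$$e_0 = \frac{\hat Y - Y}{Y},\quad e_1 = \frac{\hat X_1 - X_1}{X_1},\quad e_2 = \frac{\hat X_2 - X_2}{X_2},\quad e' = \frac{c_{01} - C_{01}}{C_{01}},\quad e'' = \frac{c_{02} - C_{02}}{C_{02}}$$
and let $B_1$ and $B_2$ denote the optimum values of $\theta_1$ and $\theta_2$ given in (7.12) and (7.13).
It can be seen that
$$b_1 = \frac{C_{01}(1 + e')V_2 - C_{12}C_{02}(1 + e'')}{V_1V_2 - C_{12}^2} = B_1 + \frac{C_{01}V_2e' - C_{12}C_{02}e''}{V_1V_2 - C_{12}^2} \tag{7.17}$$
Similarly it can be seen that
$$b_2 = B_2 + \frac{C_{02}V_1e'' - C_{12}C_{01}e'}{V_1V_2 - C_{12}^2} \tag{7.18}$$
Using (7.17) and (7.18), the estimator $\hat Y^{*}_{D2}$ can be written as
$$\hat Y^{*}_{D2} = \hat Y + \left[B_1 + \frac{C_{01}V_2e' - C_{12}C_{02}e''}{V_1V_2 - C_{12}^2}\right](X_1 - \hat X_1) + \left[B_2 + \frac{C_{02}V_1e'' - C_{12}C_{01}e'}{V_1V_2 - C_{12}^2}\right](X_2 - \hat X_2) \tag{7.19}$$
Replacing $\hat Y$, $\hat X_1$ and $\hat X_2$ by $Y(1 + e_0)$, $X_1(1 + e_1)$ and $X_2(1 + e_2)$ in (7.19)
we obtain
$$\hat Y^{*}_{D2} - Y = Ye_0 - \left[B_1 + \frac{C_{01}V_2e' - C_{12}C_{02}e''}{V_1V_2 - C_{12}^2}\right]X_1e_1 - \left[B_2 + \frac{C_{02}V_1e'' - C_{12}C_{01}e'}{V_1V_2 - C_{12}^2}\right]X_2e_2$$
Squaring both the sides, taking expectations and ignoring terms of degree greater
than two, we obtain
$$E[\hat Y^{*}_{D2} - Y]^2 = Y^2E(e_0^2) + B_1^2X_1^2E(e_1^2) + B_2^2X_2^2E(e_2^2) - 2B_1YX_1E(e_0e_1) - 2B_2YX_2E(e_0e_2) + 2B_1B_2X_1X_2E(e_1e_2)$$
$$= V(\hat Y) + B_1^2V(\hat X_1) + B_2^2V(\hat X_2) - 2B_1\mathrm{cov}(\hat Y, \hat X_1) - 2B_2\mathrm{cov}(\hat Y, \hat X_2) + 2B_1B_2\mathrm{cov}(\hat X_1, \hat X_2)$$
Substituting the values given in (7.12) and (7.13) in the above expression, we get
the approximate mean square error as $V(\hat Y)[1 - R^2_{y.x_1x_2}]$. Hence the
proof.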
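The optimum coefficients and the resulting minimum variance for two auxiliary variables can be computed directly from the normal equations. The sketch below assumes the variances and covariances $V_0, V_1, V_2, C_{01}, C_{02}, C_{12}$ of Section 7.4 are given as numbers; the function names are illustrative.

```python
def optimal_coefficients(v1, v2, c01, c02, c12):
    """Solve the normal equations (7.10)-(7.11) for the two coefficients."""
    det = v1 * v2 - c12 ** 2
    if det <= 0:
        raise ValueError("V1*V2 - C12**2 must be positive")
    theta1 = (c01 * v2 - c12 * c02) / det   # equation (7.12)
    theta2 = (c02 * v1 - c12 * c01) / det   # equation (7.13)
    return theta1, theta2


def difference_variance(theta1, theta2, v0, v1, v2, c01, c02, c12):
    """Variance of the two-variable difference estimator for given coefficients."""
    return (v0 + theta1 ** 2 * v1 + theta2 ** 2 * v2
            - 2 * theta1 * c01 - 2 * theta2 * c02
            + 2 * theta1 * theta2 * c12)
```

Comparing the value at the optimum with $V_0 - C_{01}^2/V_1$ and $V_0 - C_{02}^2/V_2$, the minimum variances attainable with a single auxiliary variable, shows the gain from using both.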
Thus in the last few sections of this chapter, we constructed the linear
regression estimator for the population total by using the fact that the variables x
and y are linearly related and extended this to cover the case of having more than
one auxiliary variable. In the following section, the problem of identifying the
optimal sampling-estimating strategy with the help of superpopulation models is
considered.
7.5 Inference under Superpopulation Models
In the superpopulation approach, the population values are assumed to be the
realised values of $N$ independent random variables. In this section, we shall
assume that the population values are the realised values of $N$ independent
random variables $Y_1, Y_2, \ldots, Y_N$ where $Y_i$ has mean $bx_i$ and variance $\sigma^2v(x_i)$.
The function $v(\cdot)$ is known, with $v(x) \ge 0$ for $x \ge 0$. The constants $b$ and $\sigma^2$ are
unknown. Let $\xi$ denote the joint probability law.
Implied estimator Consider any estimator $\hat T$ for the population total $T$ which
can be uniquely expressed in the form $\hat T = \sum_{s} Y_i + \hat b\sum_{\bar s} X_i$ where $\hat b$ does not
depend on the unobserved $y$-values. Here $\hat b$ is referred to as the implied
estimator of the parameter $b$.
For example, under simple random sampling, the ratio estimator can be
expressed as
$$\hat T_R = \frac{\sum_{s} Y_i}{\sum_{s} X_i}\sum_{i=1}^{N} X_i = \sum_{s} Y_i + \frac{\sum_{s} Y_i}{\sum_{s} X_i}\sum_{\bar s} X_i$$
Therefore, the implied estimator of $b$ corresponding to the ratio estimator is
$\dfrac{\sum_{s} Y_i}{\sum_{s} X_i}$. It is to be noted that the implied estimator does not depend on
unobserved $y$'s. In the following theorem we shall prove that of two statistics
$\hat T$ and $\hat T^0$, the one whose implied estimator for $b$ is better is the better
estimator for $T$.
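Theorem 7.6 For any sampling design $P$, if estimators $\hat T$ and $\hat T^0$ have implied
estimators $\hat b$ and $\hat b^0$ for $b$ which satisfy
$$E_\xi[\hat b - b]^2 \le E_\xi[\hat b^0 - b]^2 \tag{7.21}$$
for each $s$ such that $P(s) > 0$, then
$$MSE(P : \hat T) \le MSE(P : \hat T^0) \tag{7.22}$$
If for some subset, say $s^0$ with $P(s^0) > 0$, the inequality in (7.21) is strict, then
the strict inequality holds in (7.22).
Proof Under the superpopulation model described earlier,
$$MSE(P : \hat T) = E_\xi\left[\sum_{s} P(s)(\hat T - T)^2\right] \tag{7.23}$$
and
$$MSE(P : \hat T^0) = E_\xi\left[\sum_{s} P(s)(\hat T^0 - T)^2\right] \tag{7.24}$$
Therefore $MSE(P : \hat T) \le MSE(P : \hat T^0)$ if and only if
$$E_\xi\left[\sum_{s} P(s)(\hat T - T)^2\right] \le E_\xi\left[\sum_{s} P(s)(\hat T^0 - T)^2\right] \tag{7.25}$$
Consider
$$\hat T - T = \sum_{s} Y_i + \hat b\sum_{\bar s} X_i - \sum_{s} Y_i - \sum_{\bar s} Y_i$$
$$= \hat b\sum_{\bar s} X_i - \sum_{\bar s} Y_i$$
$$= (\hat b - b)\sum_{\bar s} X_i - \sum_{\bar s}(Y_i - bX_i)$$
Squaring both the sides and taking expectations, we obtain
$$E_\xi[\hat T - T]^2 = E_\xi(\hat b - b)^2\left[\sum_{\bar s} X_i\right]^2 + E_\xi\left[\sum_{\bar s}(Y_i - bX_i)\right]^2 - 2\sum_{\bar s} X_i\,E_\xi\left[(\hat b - b)\sum_{\bar s}(Y_i - bX_i)\right] \tag{7.26}$$
Since $\hat b$ is independent of the unobserved $y$'s and $E_\xi(Y_i - bX_i) = 0$,
$$E_\xi\left[(\hat b - b)\sum_{\bar s}(Y_i - bX_i)\right] = E_\xi[\hat b - b]\sum_{\bar s} E_\xi(Y_i - bX_i) = 0 \tag{7.27}$$
Further
$$E_\xi\left[\sum_{\bar s}(Y_i - bX_i)\right]^2 = \sum_{\bar s} E_\xi(Y_i - bX_i)^2 \quad \text{(since the } y\text{'s are independent)}$$
$$= \sigma^2\sum_{\bar s} v(X_i) \tag{7.28}$$
Substituting (7.27) and (7.28) in (7.26) we get
$$E_\xi[\hat T - T]^2 = E_\xi(\hat b - b)^2\left[\sum_{\bar s} X_i\right]^2 + \sigma^2\sum_{\bar s} v(X_i) \tag{7.29}$$
Proceeding in the same way, we obtain
$$E_\xi[\hat T^0 - T]^2 = E_\xi(\hat b^0 - b)^2\left[\sum_{\bar s} X_i\right]^2 + \sigma^2\sum_{\bar s} v(X_i) \tag{7.30}$$
If $E_\xi[\hat b - b]^2 \le E_\xi[\hat b^0 - b]^2$ for each $s$ such that $P(s) > 0$, then from (7.29) and (7.30)
$E_\xi[\hat T - T]^2 \le E_\xi[\hat T^0 - T]^2$ for each $s$ such that $P(s) > 0$.
Hence the proof follows from (7.25).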
Thus we conclude that the estimator which has the better implied estimator is
better. To study the superpopulation approach in more depth, we give the
following definition.
Model Unbiasedness An estimator $\hat T$ is said to be model unbiased or $\xi$-unbiased
if for each $s$, $\hat T - \sum_{s} Y_i$ is an unbiased predictor of the unobserved sum
$\sum_{\bar s} Y_i$, that is,
$$E_\xi\left[\hat T - \sum_{s} Y_i\right] = E_\xi\left[\sum_{\bar s} Y_i\right] \tag{7.31}$$
It may be seen that the above definition is equivalent to $E_\xi[\hat T - T] = 0$.
Theorem 7.7 An estimator is model unbiased if and only if the implied
estimator is unbiased for $b$.
Proof We know that $\hat T - T = (\hat b - b)\sum_{\bar s} X_i - \left[\sum_{\bar s}(Y_i - bX_i)\right]$ (refer the proof of Theorem 7.6).
Taking expectations on both the sides we get
$$E_\xi(\hat T - T) = E_\xi(\hat b - b)\sum_{\bar s} X_i$$
From this we infer that $E_\xi(\hat T - T) = 0 \Leftrightarrow E_\xi(\hat b - b) = 0$.
Hence the proof.
Note A design unbiased estimator is not necessarily model unbiased and in the
same way a model unbiased estimator is not necessarily design unbiased. The
following theorem gives the Best Linear Model unbiased estimator for the
population total.
Theorem 7.8 For any sampling design $P$, let $\hat T$ be a linear estimator satisfying
$E_\xi(\hat T - T) = 0$ for every $s$ such that $P(s) > 0$. Then $MSE(P : \hat T^{*}) \le MSE(P : \hat T)$
where
$$\hat T^{*} = \sum_{s} Y_i + \hat b^{*}\sum_{\bar s} X_i,\qquad \hat b^{*} = \frac{\sum_{s}\dfrac{X_iY_i}{v(X_i)}}{\sum_{s}\dfrac{X_i^2}{v(X_i)}}$$
Proof Since we consider only estimators that can be expressed in the form
$$\hat T = \sum_{s} Y_i + \hat b\sum_{\bar s} X_i$$
it is easily seen that $\hat T$ is a linear function of the observed $y$'s if and only if $\hat b$ is
a linear function of the observed $y$'s. Therefore by the Gauss-Markov Theorem and
Theorem 7.6, the proof follows.
The above theorem helps us to derive the Best Linear Unbiased estimators (with
respect to the model) for different choices of $v(\cdot)$.
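Case (1): $v(x) = 1$
$$\hat b^{*} = \frac{\sum_{s} X_iY_i}{\sum_{s} X_i^2} \tag{7.32}$$
Case (2): $v(x) = x$
$$\hat b^{*} = \frac{\sum_{s}\dfrac{X_iY_i}{X_i}}{\sum_{s}\dfrac{X_i^2}{X_i}} = \frac{\sum_{s} Y_i}{\sum_{s} X_i} \tag{7.33}$$
Case (3): $v(x) = x^2$
$$\hat b^{*} = \frac{\sum_{s}\dfrac{X_iY_i}{X_i^2}}{\sum_{s}\dfrac{X_i^2}{X_i^2}} = \frac{1}{n}\sum_{s}\frac{Y_i}{X_i} \tag{7.34}$$
Thus under the three cases the Best Linear Unbiased Estimators are
$$\hat T_1 = \sum_{s} Y_i + \frac{\sum_{s} X_iY_i}{\sum_{s} X_i^2}\sum_{\bar s} X_i,\quad \hat T_2 = \sum_{s} Y_i + \frac{\sum_{s} Y_i}{\sum_{s} X_i}\sum_{\bar s} X_i \quad\text{and}\quad \hat T_3 = \sum_{s} Y_i + \frac{1}{n}\sum_{s}\frac{Y_i}{X_i}\sum_{\bar s} X_i$$
respectively. It is interesting to note that the estimator $\hat T_2$ is
nothing but the ratio estimator. This proves the ratio estimator is the Best Linear
Unbiased Estimator for the population total when $v(x) = x$.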
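The three cases can be reproduced with one small routine that evaluates the implied estimator of Theorem 7.8 for any variance function. This is a minimal sketch, assuming the sampled $x$ values are positive; the names `blu_slope` and `blu_total` are illustrative.

```python
def blu_slope(x, y, v):
    """Implied estimator b* of Theorem 7.8 for sampled x, y and variance function v."""
    numerator = sum(xi * yi / v(xi) for xi, yi in zip(x, y))
    denominator = sum(xi ** 2 / v(xi) for xi in x)
    return numerator / denominator


def blu_total(x_sample, y_sample, x_unsampled, v):
    """T* = sum of observed y values plus b* times the sum of unsampled x values."""
    return sum(y_sample) + blu_slope(x_sample, y_sample, v) * sum(x_unsampled)
```

With `v = lambda t: t` the routine returns the ratio of the sample totals, so `blu_total` coincides with the ratio estimator, in line with the remark above.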
Now we shall state and prove a lemma which will be used later to prove an
interesting property regarding the estimator $\hat T_3 = \sum_{s} Y_i + \dfrac{1}{n}\sum_{s}\dfrac{Y_i}{X_i}\sum_{\bar s} X_i$.
Lemma 7.1 If $0 \le b_1 \le b_2 \le \ldots \le b_m$ and if $c_1 \le c_2 \le \ldots \le c_m$ satisfy
$c_1 + c_2 + \ldots + c_m \ge 0$ then $b_1c_1 + b_2c_2 + \ldots + b_mc_m \ge 0$.
Proof Let $k$ denote the greatest integer $i$ for which $c_i \le 0$. Then
$$b_1c_1 + b_2c_2 + \ldots + b_mc_m = \sum_{i=1}^{k} b_ic_i + \sum_{i=k+1}^{m} b_ic_i$$
$$\ge b_k\sum_{i=1}^{k} c_i + b_{k+1}\sum_{i=k+1}^{m} c_i$$
$$= b_k\sum_{i=1}^{m} c_i + (b_{k+1} - b_k)\sum_{i=k+1}^{m} c_i \ge 0$$
Hence the proof.
The following theorem proves $\hat T_3$ is better than $\hat T_{pps} = \dfrac{X}{n}\sum_{s}\dfrac{Y_i}{X_i}$ for any
fixed size sampling design under a wide class of variance functions.
Theorem 7.9 If $\mathrm{Max}\{nX_1, nX_2, \ldots, nX_N\} \le \sum_{i=1}^{N} X_i$ and $\dfrac{v(x)}{x^2}$ is nonincreasing
in $x$ then for any sampling plan $P$ for which $P(s) > 0$ only if $n(s) = n$,
$$MSE(P : \hat T_{pps}) \ge MSE(P : \hat T_3).$$
Proof In order to prove the given result, it is enough to show that
$$E_\xi[\hat b_{pps} - b]^2 \ge E_\xi[\hat b_3 - b]^2$$
where $\hat b_{pps}$ is the implied estimator of $b$ corresponding to the estimator $\hat T_{pps}$.
To identify the implied estimator corresponding to the estimator $\hat T_{pps}$, it can be
written as
$$\hat T_{pps} = \sum_{s} Y_i + \frac{\dfrac{X}{n}\sum_{s}\dfrac{Y_i}{X_i} - \sum_{s} Y_i}{\sum_{\bar s} X_i}\sum_{\bar s} X_i$$
Therefore the implied estimator corresponding to the estimator $\hat T_{pps}$ is
$$\hat b_{pps} = \frac{\dfrac{X}{n}\sum_{s}\dfrac{Y_i}{X_i} - \sum_{s} Y_i}{\sum_{\bar s} X_i} = \frac{1}{n}\sum_{s}\frac{Y_ia_i}{X_i}\quad\text{where } a_i = \frac{X - nX_i}{\sum_{\bar s} X_i}$$
Since $\sum_{s} a_i = n$, we have
$$E_\xi[\hat b_{pps} - b]^2 = E_\xi\left[\frac{1}{n}\sum_{s}\frac{Y_ia_i}{X_i} - b\right]^2 = E_\xi\left[\frac{1}{n}\sum_{s}\frac{a_i}{X_i}[Y_i - E_\xi(Y_i)]\right]^2$$
$$= \frac{\sigma^2}{n^2}\sum_{s}\frac{a_i^2v(X_i)}{X_i^2} \tag{7.35}$$
Further note that
$$\hat b_3 - b = \frac{1}{n}\sum_{s}\frac{Y_i}{X_i} - b = \frac{1}{n}\sum_{s}\frac{1}{X_i}[Y_i - E_\xi(Y_i)]$$
Squaring both the sides and taking expectations, we get
$$E_\xi[\hat b_3 - b]^2 = \frac{1}{n^2}\sum_{s}\frac{1}{X_i^2}E_\xi[Y_i - E_\xi(Y_i)]^2 = \frac{\sigma^2}{n^2}\sum_{s}\frac{v(X_i)}{X_i^2} \tag{7.36}$$
From (7.35) and (7.36) we get
$$E_\xi[\hat b_{pps} - b]^2 - E_\xi[\hat b_3 - b]^2 = \frac{\sigma^2}{n^2}\sum_{s}[a_i^2 - 1]\frac{v(X_i)}{X_i^2} \tag{7.37}$$
From the definition of $a_i$, we get $a_i^2 \ge a_j^2 \Rightarrow X_i \le X_j$. Therefore whenever
$a_i^2 - 1 \ge a_j^2 - 1$, we note that $\dfrac{v(X_i)}{X_i^2} \ge \dfrac{v(X_j)}{X_j^2}$. Further under the given
conditions $\sum_{s}[a_i^2 - 1] \ge 0$. Therefore by Lemma 7.1 the proof follows.
Thus we have proved the estimator $\hat T_3$ is better than $\hat T_{pps}$. It is to be noted
that the estimator $\hat T_{pps}$ is not model unbiased even though it is linear which
makes the above theorem meaningful. Thus we have identified the optimal
estimators under the given superpopulation model for different choices of the
variance function appearing in the superpopulation model. Now we shall come
to the problem of identifying the ideal sampling plan with respect to the given
superpopulation model.
If we fix the sample size, then the problem of identifying an optimal
sampling design becomes straightforward. Let $S_n$ denote the set of all subsets
of size $n$ of the population $S$ and $P_n$ be the collection of all sampling designs $P$
for which $P(s) > 0$ only when $s$ is in $S_n$. Since for $P$ in $P_n$,
$$MSE(P : \hat T) = E_\xi\left[\sum_{s} P(s)(\hat T - T)^2\right] \tag{7.37}$$
clearly the optimal sampling plan is one which selects with certainty a subset $s$
which minimises $E_\xi(\hat T - T)^2$. Some insight can be gained when the quantity to
be minimised is expressed in the form
$$E_\xi(\hat T - T)^2 = \left[\sum_{\bar s} X_i\right]^2E_\xi(\hat b - b)^2 + \sigma^2\sum_{\bar s} v(X_i) \tag{7.38}$$
From (7.38) we understand that the sampler has two objectives, namely (1) to
choose a sample which will afford a good estimate of the expected value of the
total of the nonsampled values, that is, to choose a sample $s$ so that
$\left[\sum_{\bar s} X_i\right]^2E_\xi(\hat b - b)^2$ is small, and (2) to observe those units whose values of $y$
have the greatest variance so that only the sum of the least variable values must be
predicted, that is, to choose $s$ so that $\sigma^2\sum_{\bar s} v(X_i)$ is small.
For a wide class of variance functions, if the optimal estimator defined in
Theorem 7.8 is to be used, then determination of the optimal sample is quite
simple. As is shown in the following theorem, the best sample to observe
consists of those units having the largest $x$ values. Let $s^{*}$ be any sample of $n$
units for which $\mathrm{Max}_{s \in S_n}\sum_{s} X_i = \sum_{s^{*}} X_i$ and $P^{*}$ be a sampling plan which
entails selecting $s^{*}$ with certainty. That is, $P^{*}(s = s^{*}) = 1$.
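Theorem 7.10 If $v(x)$ is nondecreasing and $\dfrac{v(x)}{x^2}$ is nonincreasing in $x$ then
$MSE(P^{*} : \hat T^{*}) \le MSE(P : \hat T^{*})$ for any sampling plan $P$ in $P_n$ where
$$\hat T^{*} = \sum_{s} Y_i + \hat b^{*}\sum_{\bar s} X_i\quad\text{and}\quad \hat b^{*} = \frac{\sum_{s}\dfrac{X_iY_i}{v(X_i)}}{\sum_{s}\dfrac{X_i^2}{v(X_i)}}$$
Proof We know that
$$MSE(P : \hat T^{*}) = E_\xi\left[\sum_{s} P(s)(\hat T^{*} - T)^2\right]$$
and
$$E_\xi(\hat T^{*} - T)^2 = \left[\sum_{\bar s} X_i\right]^2E_\xi(\hat b^{*} - b)^2 + \sigma^2\sum_{\bar s} v(X_i)$$
Note that
$$\hat b^{*} - b = \frac{\sum_{s}\dfrac{X_iY_i}{v(X_i)}}{\sum_{s}\dfrac{X_i^2}{v(X_i)}} - b = \frac{\sum_{s}\dfrac{X_i[Y_i - bX_i]}{v(X_i)}}{\sum_{s}\dfrac{X_i^2}{v(X_i)}}$$
Since the $y$'s are independent, squaring both the sides and taking expectations we get
$$E_\xi(\hat b^{*} - b)^2 = \frac{\sigma^2\sum_{s}\left[\dfrac{X_i}{v(X_i)}\right]^2v(X_i)}{\left[\sum_{s}\dfrac{X_i^2}{v(X_i)}\right]^2} = \frac{\sigma^2}{\sum_{s}\dfrac{X_i^2}{v(X_i)}}$$
Therefore
$$E_\xi(\hat T^{*} - T)^2 = \sigma^2\left[\frac{\left[\sum_{\bar s} X_i\right]^2}{\sum_{s}\dfrac{X_i^2}{v(X_i)}} + \sum_{\bar s} v(X_i)\right]$$
Clearly the expression inside the square brackets is minimum for $s = s^{*}$. Since
the sampling plan $P^{*}$ selects the sample $s^{*}$ with certainty we get
$MSE(P^{*} : \hat T^{*}) \le MSE(P : \hat T^{*})$. Hence the proof.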
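For a small population the claim of Theorem 7.10 can be checked by brute force. The sketch below evaluates $E_\xi(\hat T^{*} - T)^2$ for every sample of size $n$ and returns the minimiser; it assumes positive $x$ values and $\sigma^2 = 1$ by default, and the function names are illustrative.

```python
from itertools import combinations


def model_mse(sample, x, v, sigma2=1.0):
    """Model expectation of (T* - T)^2 for a fixed sample of unit indices."""
    chosen = set(sample)
    rest = [x[i] for i in range(len(x)) if i not in chosen]   # unsampled x values
    information = sum(x[i] ** 2 / v(x[i]) for i in sample)
    return sigma2 * (sum(rest) ** 2 / information + sum(v(xi) for xi in rest))


def best_sample(x, n, v):
    """Exhaustively search all samples of size n for the smallest model MSE."""
    return min(combinations(range(len(x)), n), key=lambda s: model_mse(s, x, v))
```

With a variance function such as $v(x) = x$, which is nondecreasing while $v(x)/x^2$ is nonincreasing, the search should return the indices of the $n$ largest $x$ values.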
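The following theorem proves the sampling plan $P^{*}$ is optimal for use with
the estimators $\hat T_3$ and $\hat T_{pps}$ under rather weak conditions.
Theorem 7.11 For any $P$ in $P_n$ and any $v(\cdot)$ for which both $v(X)$ and $\dfrac{X^2}{v(X)}$
are nondecreasing
(i) $MSE(P^{*} : \hat T_{pps}) \le MSE(P : \hat T_{pps})$ if $\mathrm{Max}\{nX_1, nX_2, \ldots, nX_N\} \le \sum_{i=1}^{N} X_i$
(ii) $MSE(P^{*} : \hat T_3) \le MSE(P : \hat T_3)$
Proof From (7.35) we have
$$E_\xi[\hat b_{pps} - b]^2 = \frac{\sigma^2}{n^2}\sum_{s}\frac{a_i^2v(X_i)}{X_i^2}\quad\text{where } a_i = \frac{X - nX_i}{\sum_{\bar s} X_i}$$
Therefore
$$E_\xi[\hat b_{pps} - b]^2 = \frac{1}{\left[\sum_{\bar s} X_i\right]^2}\,\frac{\sigma^2}{n^2}\sum_{s}\frac{v(X_i)}{X_i^2}[X - nX_i]^2$$
$$E_\xi[\hat T_{pps} - T]^2 = \sigma^2\sum_{\bar s} v(X_i) + \frac{\sigma^2}{n^2}\sum_{s}\frac{v(X_i)}{X_i^2}[X - nX_i]^2$$
Hence
$$MSE(P : \hat T_{pps}) = \sum_{s} P(s)\left\{\sigma^2\sum_{\bar s} v(X_i) + \frac{\sigma^2}{n^2}\sum_{s}\frac{v(X_i)}{X_i^2}[X - nX_i]^2\right\}$$
It can be seen that the above expression will be minimum when $s = s^{*}$. Since $P^{*}$
is the sampling plan which selects $s^{*}$ with certainty
$$MSE(P^{*} : \hat T_{pps}) \le MSE(P : \hat T_{pps}).$$
This completes the first part.
We have seen in Theorem 7.9 that $E_\xi[\hat b_3 - b]^2 = \dfrac{\sigma^2}{n^2}\sum_{s}\dfrac{v(X_i)}{X_i^2}$.
Therefore
$$E_\xi[\hat T_3 - T]^2 = \left[\sum_{\bar s} X_i\right]^2\frac{\sigma^2}{n^2}\sum_{s}\frac{v(X_i)}{X_i^2} + \sigma^2\sum_{\bar s} v(X_i)$$
Hence
$$MSE(P : \hat T_3) = \sum_{s} P(s)\left\{\sigma^2\sum_{\bar s} v(X_i) + \left[\sum_{\bar s} X_i\right]^2\frac{\sigma^2}{n^2}\sum_{s}\frac{v(X_i)}{X_i^2}\right\}$$
Clearly under the assumptions stated in the theorem, the right hand side of the
above expression is minimum if $s = s^{*}$. Since the sampling plan $P^{*}$ yields the set
$s^{*}$ as sample with probability one,
$$MSE(P^{*} : \hat T_3) \le MSE(P : \hat T_3)$$
Hence the proof.
The results discussed in this section are due to Royall (1971).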
7.6 Problems and Solutions
Problem 7.1 Derive the Best Linear Unbiased Estimator for the population total
under the superpopulation model
E.;[Y;]=a+bi,i=l, 2, ... , N, V.;[Y;]=a
2
v(i);cov.;[Y;,Yj]=O,i;t;j
where is the joint probability law of Y
1
, f2, ... , Y N Also find its mean square
error when a=O.
138 Sampling Theory and Methods
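Solution By Theorem 7.8, the Best Linear Unbiased Estimator for the population total is given by
$$\hat{T} = \sum_{i \in s} Y_i + (N - n)\hat{a} + \hat{b} \sum_{i \notin s} X_i, \quad X_i = i,$$
where
$$\hat{a} = \frac{\sum_{s} \frac{Y_i}{v(i)} - \hat{b} \sum_{s} \frac{i}{v(i)}}{\sum_{s} \frac{1}{v(i)}} \quad \text{and} \quad \hat{b} = \frac{\sum_{s} \frac{1}{v(i)} \sum_{s} \frac{iY_i}{v(i)} - \sum_{s} \frac{i}{v(i)} \sum_{s} \frac{Y_i}{v(i)}}{\sum_{s} \frac{1}{v(i)} \sum_{s} \frac{i^2}{v(i)} - \left[\sum_{s} \frac{i}{v(i)}\right]^2}.$$
When $a = 0$, the estimator reduces to
$$\hat{T}^* = \sum_{i \in s} Y_i + \hat{b}^* \sum_{i \notin s} i, \quad \hat{b}^* = \frac{\sum_{s} \frac{iY_i}{v(i)}}{\sum_{s} \frac{i^2}{v(i)}}.$$
Consider
$$(\hat{b}^* - b) = \frac{\sum_{s} \frac{i[Y_i - bi]}{v(i)}}{\sum_{s} \frac{i^2}{v(i)}}.$$
Since the y's are independent, squaring both sides and taking expectations we get
$$E_\xi(\hat{b}^* - b)^2 = \frac{\sigma^2}{\sum_{s} \frac{i^2}{v(i)}} \tag{*}$$
Further,
$$\hat{T}^* - T = (\hat{b}^* - b) \sum_{i \notin s} i - \sum_{i \notin s} [Y_i - E_\xi(Y_i)].$$
Squaring and taking expectation with respect to the model we get
$$E_\xi(\hat{T}^* - T)^2 = E_\xi(\hat{b}^* - b)^2 \left[\sum_{i \notin s} i\right]^2 + \sum_{i \notin s} \sigma^2 v(i) \tag{**}$$
Using (*) in (**) we get
$$E_\xi(\hat{T}^* - T)^2 = \frac{\sigma^2 \left[\sum_{i \notin s} i\right]^2}{\sum_{s} \frac{i^2}{v(i)}} + \sum_{i \notin s} \sigma^2 v(i).$$
Hence by definition of mean square error, we get
$$MSE(P: \hat{T}^*) = \sum_{s} P(s)\left\{\frac{\sigma^2 \left[\sum_{i \notin s} i\right]^2}{\sum_{s} \frac{i^2}{v(i)}} + \sum_{i \notin s} \sigma^2 v(i)\right\}.$$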
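The model mean square error obtained in the solution can be checked numerically on a small case. The following is a minimal sketch, assuming a hypothetical population with N = 6, n = 3 and v(i) = i (so that v(i) is nondecreasing and v(i)/i^2 is nonincreasing); the function name `model_mse` is illustrative and not part of the text. It enumerates every sample of size n and reports the one with the smallest model mean square error.

```python
from itertools import combinations


def model_mse(sample, N, v, sigma2=1.0):
    """Model MSE of T* for a fixed sample s under E[Y_i] = b*i, V[Y_i] = sigma2*v(i)."""
    rest = [i for i in range(1, N + 1) if i not in sample]
    denom = sum(i * i / v(i) for i in sample)            # sum over s of i^2 / v(i)
    return sigma2 * (sum(rest) ** 2 / denom + sum(v(i) for i in rest))


# Hypothetical illustration: N = 6 units, samples of size n = 3, v(i) = i.
N, n = 6, 3
v = lambda i: float(i)
best = min(combinations(range(1, N + 1), n), key=lambda s: model_mse(s, N, v))
print(best, model_mse(best, N, v))
```

Under these assumptions the minimising sample consists of the n largest labels, which is the kind of behaviour Exercise 7.1 asks the reader to establish in general.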
Exercises
7.1 Show that the mean square error of $\hat{T}^*$ derived in the above problem is minimum in $\mathcal{P}_n$ under the sampling design $P_0(s)$, where
$$P_0(s) = \begin{cases} 1 & \text{if } s = s^* \\ 0 & \text{otherwise} \end{cases}$$
where $s^*$ is the set containing the units with labels $N - n + 1, N - n + 2, \ldots, N$, provided the function $v(i)$ is nondecreasing in i and $v(i)/i^2$ is nonincreasing in i. Here $\mathcal{P}_n$ is the class of sampling designs yielding samples of size n.
7.2 Extend the difference estimator considered in this chapter to the case of p auxiliary variables and show that the resulting minimum mean square error is
$$\frac{N^2(N - n)}{Nn}\, S_y^2\, (1 - R_{0.123 \ldots p}^2),$$
where $R_{0.123 \ldots p}$ is the multiple correlation coefficient, assuming simple random sampling is used.
7.3 Derive the Best Linear Unbiased Estimator for the population total under the superpopulation model
$$E_\xi[Y_i] = a + bi + ci^2,\ i = 1, 2, \ldots, N; \quad V_\xi[Y_i] = \sigma^2; \quad cov_\xi[Y_i, Y_j] = 0,\ i \ne j$$
where $\xi$ is the joint probability law of $Y_1, Y_2, \ldots, Y_N$. Also derive the mean square error of the estimator.
Chapter 8
Multistage Sampling
8.1 Introduction
So far we have seen a number of sampling methods wherein a sample of units to be investigated is taken directly from the given population. While this is convenient in small scale surveys, it is not so in large scale surveys. The main reason is that no usable list describing the population units to be considered generally exists to select the sample. Even if such a list is available, it would not be economically viable to base the enquiry on a simple random sample or systematic sample, because this would force the interviewer to visit almost each and every part of the population. Therefore it becomes necessary to select clusters of units rather than units directly from the given population. One way of selecting the sample would be to secure a list of clusters, take a probability sample of clusters and observe every unit in the sampled clusters. This is called single stage cluster sampling. For example, to estimate the total yield of wheat in a district during a given season, instead of treating individual fields as sampling units, one can treat clusters of neighbouring fields as sampling units, and instead of selecting a sample of fields one can select clusters of fields. Sometimes, instead of observing every field within each cluster, one can select samples of fields within each cluster. This is called two-stage sampling since now the sample is selected in two stages: first the clusters of fields (called first stage or primary stage units) and then the fields within the clusters. This is also called subsampling. Generally, subsampling is done independently in all the selected primary units.
8.2 Estimation Under Cluster Sampling
Suppose the population is divided into N clusters where the ith cluster contains $N_i$ ($i = 1, 2, \ldots, N$) units. Let $Y_{ij}$ ($j = 1, 2, \ldots, N_i$; $i = 1, 2, \ldots, N$) be the y-value of the jth unit in the ith cluster and $Y_i = \sum_{j=1}^{N_i} Y_{ij}$. That is, $Y_i$ ($i = 1, 2, \ldots, N$) stands for the total of all the units in cluster i. Suppose a cluster sample of n clusters is drawn by using a sampling design with first order inclusion probabilities $\pi_i$ ($i = 1, 2, \ldots, N$) and second order inclusion probabilities $\pi_{ij}$, $i \ne j$. An unbiased estimator for the population total Y of all the units in the population, namely $Y = \sum_{i=1}^{N} \sum_{j=1}^{N_i} Y_{ij}$, is given by
$$\hat{Y}_{cl} = \sum_{i \in s} \frac{Y_i}{\pi_i} \tag{8.1}$$
and its variance is
$$V(\hat{Y}_{cl}) = \sum_{i=1}^{N} Y_i^2 \left[\frac{1 - \pi_i}{\pi_i}\right] + 2 \sum_{i=1}^{N} \sum_{\substack{j=1 \\ i<j}}^{N} Y_i Y_j \left[\frac{\pi_{ij} - \pi_i \pi_j}{\pi_i \pi_j}\right] \tag{8.2}$$
The expressions given in (8.1) and (8.2) can be used for estimating the population total and for obtaining its variance under any sampling design for which the first and second order inclusion probabilities are known. In particular, when simple random sampling is used to get a sample of n clusters, an unbiased estimator of the population total is given by
$$\hat{Y}_{cls} = \frac{N}{n} \sum_{i \in s} Y_i \tag{8.3}$$
and its variance is
$$V[\hat{Y}_{cls}] = \left[\frac{N^2(N - n)}{Nn}\right] S_y^2 \tag{8.4}$$
where $S_y^2 = \frac{1}{N - 1} \sum_{i=1}^{N} [Y_i - \bar{Y}]^2$ and $\bar{Y} = Y/N$.
In the same manner, for all sampling designs appropriate unbiased estimators can be constructed and their variances can also be obtained.
It is interesting to note that the number of units in each cluster can be taken as a measure of size and cluster sampling can be performed with the help of a probability proportional to size scheme. Let $Y_i$ be the total of the ith sampled cluster and $P_i$ be the selection probability of the unit selected in the ith draw, $i = 1, 2, \ldots, n$, when a probability proportional to size sample of size n is drawn with replacement. Note that, when the number of units in each cluster is regarded as size, the selection probability of the rth unit in the population is given by
$$P_r = \frac{N_r}{N_0}, \quad r = 1, 2, \ldots, N, \quad \text{where } N_0 = \sum_{i=1}^{N} N_i.$$
In this case an unbiased estimator of the population total is given by
$$\hat{Y}_{clp} = \frac{1}{n} \sum_{i=1}^{n} \frac{Y_i}{P_i} \tag{8.5}$$
and its variance is
$$V(\hat{Y}_{clp}) = \frac{1}{n} \sum_{i=1}^{N} \left(\frac{Y_i}{P_i} - Y\right)^2 P_i \tag{8.6}$$
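The estimators (8.3) and (8.5) are simple enough to compute directly. The following is a minimal sketch, assuming hypothetical cluster totals and sizes; the function names are illustrative only.

```python
def cluster_total_srs(sample_totals, N):
    """Estimator (8.3): N/n times the sum of the sampled cluster totals."""
    n = len(sample_totals)
    return N * sum(sample_totals) / n


def cluster_total_ppswr(sample_totals, sample_sizes, N0):
    """Estimator (8.5) with selection probabilities P_i = N_i / N0 (one entry per draw)."""
    n = len(sample_totals)
    return sum(y * N0 / size for y, size in zip(sample_totals, sample_sizes)) / n


# Hypothetical data: N = 5 clusters containing N0 = 40 units in all.
est_srs = cluster_total_srs([20, 48], N=5)
est_pps = cluster_total_ppswr([48, 66], [10, 12], N0=40)
print(est_srs, est_pps)
```

Under with-replacement sampling the same cluster may appear in more than one draw, in which case its total is simply listed once per draw.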
Thus we have seen that no new principles are involved in constructing estimators when a probability sample of clusters is taken. A problem to be considered is the optimum size of cluster. No general solution is available for this problem. However, when clusters are of the same size and simple random sampling is used, a partial answer is available. The following theorem gives the variance of the estimator under simple random sampling in terms of the intracluster correlation, when the clusters contain the same number of units.
Theorem 8.1 When the clusters contain M units each and a cluster sample of n clusters is drawn using simple random sampling, the variance of the estimator considered in (8.3) is
$$V[\hat{Y}_{cls}] = \left[\frac{N^2(N - n)}{Nn}\right] \frac{NM - 1}{N - 1}\, S^2\, [1 + (M - 1)\rho]$$
where $\rho$ is the intracluster correlation given by
$$\rho = \frac{2 \sum_{i=1}^{N} \sum_{j<k} [Y_{ij} - \bar{\bar{Y}}][Y_{ik} - \bar{\bar{Y}}]}{(M - 1)(NM - 1) S^2}, \quad \bar{\bar{Y}} = \frac{Y}{NM},$$
and $S^2 = \frac{1}{NM - 1} \sum_{i=1}^{N} \sum_{j=1}^{M} [Y_{ij} - \bar{\bar{Y}}]^2$.
Proof We have seen in (8.4)
$$V[\hat{Y}_{cls}] = \left[\frac{N^2(N - n)}{Nn}\right] \frac{1}{N - 1} \sum_{i=1}^{N} [Y_i - \bar{Y}]^2 \tag{8.7}$$
Note that
$$\sum_{i=1}^{N} [Y_i - \bar{Y}]^2 = \sum_{i=1}^{N} \left[\sum_{j=1}^{M} (Y_{ij} - \bar{\bar{Y}})\right]^2 = \sum_{i=1}^{N} \sum_{j=1}^{M} [Y_{ij} - \bar{\bar{Y}}]^2 + 2 \sum_{i=1}^{N} \sum_{j<k} [Y_{ij} - \bar{\bar{Y}}][Y_{ik} - \bar{\bar{Y}}]$$
$$= (NM - 1) S^2 + (M - 1)(NM - 1) \rho S^2 = (NM - 1) S^2 [1 + (M - 1)\rho] \tag{8.8}$$
Substituting (8.8) in (8.7) we get the required result.
From the above theorem we infer that the variance expression obtained depends on the number of clusters in the sample, the variance $S^2$, the size of the cluster M and the intracluster correlation coefficient.
Remark Since $\frac{NM - 1}{N - 1} \approx M$, the variance expression given in Theorem 8.1 can be written as
$$V[\hat{Y}_{cls}] = \left[\frac{N^2(N - n)}{Nn}\right] M S^2 [1 + (M - 1)\rho] \tag{8.9}$$
Under the conditions stated in Theorem 8.1, if instead of sampling clusters a simple random sample of nM elements is taken directly from the population, which contains NM elements, then
$$V[\hat{Y}_{srs}] = \frac{(NM)^2 (NM - nM)}{NM \cdot nM}\, S^2 = \left[\frac{N^2(N - n)}{Nn}\right] M S^2 \tag{8.10}$$
Comparing (8.9) with (8.10) we note that
$$V[\hat{Y}_{cls}] = V[\hat{Y}_{srs}]\,[1 + (M - 1)\rho] \tag{8.11}$$
Since $\rho$ is generally positive (because clusters are usually formed by putting together geographically contiguous elements), we infer from (8.11) that cluster sampling will give a higher variance than sampling elements directly from the population. But it should be remembered that cluster sampling will be more economical when compared to simple random sampling. However, if $\rho$ is negative, both the cost and the efficiency point to the use of cluster sampling.
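The identity behind Theorem 8.1 can be verified numerically. The sketch below assumes a small hypothetical population of N = 4 clusters with M = 3 units each; it computes the intracluster correlation, the element variance and the variance among cluster totals, and then compares the latter with $\frac{NM-1}{N-1} S^2 [1 + (M-1)\rho]$. The function name is illustrative only.

```python
from itertools import combinations


def cluster_variance_parts(clusters):
    """Return (rho, S2, Sy2) for N clusters of equal size M."""
    N, M = len(clusters), len(clusters[0])
    values = [y for cluster in clusters for y in cluster]
    grand_mean = sum(values) / (N * M)
    S2 = sum((y - grand_mean) ** 2 for y in values) / (N * M - 1)
    # Sum of cross products over pairs j < k within each cluster
    cross = sum((a - grand_mean) * (b - grand_mean)
                for cluster in clusters
                for a, b in combinations(cluster, 2))
    rho = 2 * cross / ((M - 1) * (N * M - 1) * S2)
    totals = [sum(cluster) for cluster in clusters]
    mean_total = sum(totals) / N
    Sy2 = sum((t - mean_total) ** 2 for t in totals) / (N - 1)
    return rho, S2, Sy2


# Hypothetical population: neighbouring values grouped together, so rho > 0.
clusters = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
N, M = len(clusters), len(clusters[0])
rho, S2, Sy2 = cluster_variance_parts(clusters)
deff = 1 + (M - 1) * rho          # variance inflation factor appearing in (8.11)
print(rho, deff, Sy2, (N * M - 1) / (N - 1) * S2 * deff)
```

The last two printed values agree, which is exactly relation (8.8) divided by N - 1.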
8.3 Multistage Sampling
If the population contains a very large number of units, we resort to sampling in several stages. For the first stage, we define a new population whose units are clusters of the original units. The clusters used for the first stage of sampling are called primary stage units (psu). For example, when the population is a collection of individuals living in a city, the psu may be taken as streets. Each psu selected in the first stage may be considered as a smaller population from which we select a certain number of smaller units, namely secondary stage units (ssu). Unless stated otherwise, the second-stage sampling in each psu is carried out independently.
1. Two-stage sampling with simple random sampling in both the stages
As in the case of cluster sampling, let $Y_{ij}$ ($j = 1, 2, \ldots, N_i$; $i = 1, 2, \ldots, N$) denote the y-value of the jth unit in the ith cluster (psu), $\bar{Y}_{i.} = \frac{1}{N_i} \sum_{j=1}^{N_i} Y_{ij}$ the mean per second stage unit of the ith primary stage unit and $\bar{Y} = \frac{1}{N} \sum_{i=1}^{N} Y_i$, with $Y_i = N_i \bar{Y}_{i.}$, the population mean per primary stage unit. Assume that a sample of n primary stage units is selected using simple random sampling in the first stage and a subsample of size $n_i$ is drawn from the ith sampled primary stage unit, $i \in J$, J being the set of indices of the sampled primary stage units. The following theorem gives an unbiased estimator for the population total under two-stage sampling and also its variance.
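Theorem 8.2 An unbiased estimator of the population total Y is given by
$$\hat{Y}_{ms} = \frac{N}{n} \sum_{i \in J} N_i \bar{y}_{i.},$$
$\bar{y}_{i.}$ being the mean of the units sampled from the ith sampled psu, and its variance is
$$V(\hat{Y}_{ms}) = N^2 \left[\frac{1}{n} - \frac{1}{N}\right] S_b^2 + \frac{N}{n} \sum_{i=1}^{N} N_i^2 \left[\frac{1}{n_i} - \frac{1}{N_i}\right] S_{wi}^2$$
where
$$S_{wi}^2 = \frac{1}{N_i - 1} \sum_{j=1}^{N_i} [Y_{ij} - \bar{Y}_{i.}]^2 \quad \text{and} \quad S_b^2 = \frac{1}{N - 1} \sum_{i=1}^{N} [Y_i - \bar{Y}]^2,$$
with $Y_i = N_i \bar{Y}_{i.}$ the total of the ith psu and $\bar{Y} = Y/N$.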
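Proof $E(\hat{Y}_{ms}) = E_1 E_2(\hat{Y}_{ms})$, where $E_1$ and $E_2$ are the overall expectation with respect to the first stage sampling and the conditional expectation with respect to subsampling, respectively.
$$E(\hat{Y}_{ms}) = E_1 E_2 \left[\frac{N}{n} \sum_{i \in J} N_i \bar{y}_{i.}\right] = E_1 \left[\frac{N}{n} \sum_{i \in J} N_i E_2(\bar{y}_{i.})\right] = E_1 \left[\frac{N}{n} \sum_{i \in J} N_i \bar{Y}_{i.}\right] = E_1 \left[\frac{N}{n} \sum_{i \in J} Y_i\right] = Y$$
The variance of $\hat{Y}_{ms}$ is given by
$$V(\hat{Y}_{ms}) = V_1 E_2(\hat{Y}_{ms}) + E_1 V_2(\hat{Y}_{ms}) \tag{8.12}$$
We have seen that $E_2(\hat{Y}_{ms}) = \frac{N}{n} \sum_{i \in J} Y_i$. Therefore
$$V_1 E_2(\hat{Y}_{ms}) = N^2 \left[\frac{1}{n} - \frac{1}{N}\right] S_b^2 \tag{8.13}$$
Further
$$V_2(\hat{Y}_{ms}) = \frac{N^2}{n^2} \sum_{i \in J} N_i^2 \left[\frac{1}{n_i} - \frac{1}{N_i}\right] S_{wi}^2$$
so that
$$E_1 V_2(\hat{Y}_{ms}) = \frac{N^2}{n^2} \frac{n}{N} \sum_{i=1}^{N} N_i^2 \left[\frac{1}{n_i} - \frac{1}{N_i}\right] S_{wi}^2 = \frac{N}{n} \sum_{i=1}^{N} N_i^2 \left[\frac{1}{n_i} - \frac{1}{N_i}\right] S_{wi}^2 \tag{8.14}$$
Using (8.13) and (8.14) in (8.12), we get the required result.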
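The following theorem gives an unbiased estimator of the variance given in the above theorem.
Theorem 8.3 An unbiased estimator of $V(\hat{Y}_{ms})$ is
$$v(\hat{Y}_{ms}) = N^2 \left[\frac{1}{n} - \frac{1}{N}\right] s_b^2 + \frac{N}{n} \sum_{i \in J} N_i^2 \left[\frac{1}{n_i} - \frac{1}{N_i}\right] s_{wi}^2$$
where
$$s_b^2 = \frac{1}{n - 1} \sum_{i \in J} \left[N_i \bar{y}_{i.} - \frac{1}{n} \sum_{i \in J} N_i \bar{y}_{i.}\right]^2 \quad \text{and} \quad s_{wi}^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} [y_{ij} - \bar{y}_{i.}]^2,$$
$y_{ij}$ being the value of the jth sampled unit from the ith sampled psu.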
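Proof We have
$$\sum_{i \in J} \left[N_i \bar{y}_{i.} - \frac{1}{n} \sum_{i \in J} N_i \bar{y}_{i.}\right]^2 = \sum_{i \in J} N_i^2 \bar{y}_{i.}^2 - n \left[\frac{1}{n} \sum_{i \in J} N_i \bar{y}_{i.}\right]^2 \tag{8.15}$$
Consider
$$E\left[\sum_{i \in J} N_i^2 \bar{y}_{i.}^2\right] = E_1 E_2 \left[\sum_{i \in J} N_i^2 \bar{y}_{i.}^2\right] = E_1 \left[\sum_{i \in J} N_i^2 \{V_2(\bar{y}_{i.}) + [E_2(\bar{y}_{i.})]^2\}\right]$$
$$= E_1 \left[\sum_{i \in J} \left\{N_i^2 \left[\frac{1}{n_i} - \frac{1}{N_i}\right] S_{wi}^2 + Y_i^2\right\}\right] = \frac{n}{N} \left[\sum_{i=1}^{N} N_i^2 \left[\frac{1}{n_i} - \frac{1}{N_i}\right] S_{wi}^2 + \sum_{i=1}^{N} Y_i^2\right] \tag{8.16}$$
Further
$$E\left[n \left(\frac{1}{n} \sum_{i \in J} N_i \bar{y}_{i.}\right)^2\right] = \frac{n}{N^2} E[\hat{Y}_{ms}^2] = \frac{n}{N^2} \left[V(\hat{Y}_{ms}) + Y^2\right] = n \bar{Y}^2 + n \left[\frac{1}{n} - \frac{1}{N}\right] S_b^2 + \frac{1}{N} \sum_{i=1}^{N} N_i^2 \left[\frac{1}{n_i} - \frac{1}{N_i}\right] S_{wi}^2$$
Hence by (8.15) and (8.16) we have
$$E\left[\sum_{i \in J} \left(N_i \bar{y}_{i.} - \frac{1}{n} \sum_{i \in J} N_i \bar{y}_{i.}\right)^2\right] = (n - 1) S_b^2 + \frac{n - 1}{N} \left[\sum_{i=1}^{N} N_i^2 \left(\frac{1}{n_i} - \frac{1}{N_i}\right) S_{wi}^2\right]$$
Therefore
$$E\left[N^2 \left(\frac{1}{n} - \frac{1}{N}\right) s_b^2\right] = V(\hat{Y}_{ms}) - \sum_{i=1}^{N} N_i^2 \left(\frac{1}{n_i} - \frac{1}{N_i}\right) S_{wi}^2$$
Since $\frac{N}{n} \sum_{i \in J} N_i^2 \left(\frac{1}{n_i} - \frac{1}{N_i}\right) s_{wi}^2$ is unbiased for $\sum_{i=1}^{N} N_i^2 \left(\frac{1}{n_i} - \frac{1}{N_i}\right) S_{wi}^2$, we conclude that
$$N^2 \left(\frac{1}{n} - \frac{1}{N}\right) s_b^2 + \frac{N}{n} \sum_{i \in J} N_i^2 \left(\frac{1}{n_i} - \frac{1}{N_i}\right) s_{wi}^2$$
is unbiased for $V(\hat{Y}_{ms})$.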
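The estimator of Theorem 8.2 and the variance estimator of Theorem 8.3 can be computed together from the subsample values. The following is a minimal sketch with hypothetical data; the function name and argument layout are illustrative assumptions, and every sampled psu is assumed to contribute at least two units so that the within-psu sample variance is defined.

```python
def two_stage_estimate(samples, psu_sizes, N):
    """Return (Y_ms, v(Y_ms)) for two-stage SRS.

    samples   : list of lists, the y-values subsampled from each sampled psu
    psu_sizes : N_i for each sampled psu, in the same order
    N         : number of psus in the population
    """
    n = len(samples)
    means = [sum(s) / len(s) for s in samples]
    est_totals = [Ni * m for Ni, m in zip(psu_sizes, means)]   # N_i * ybar_i
    y_ms = N / n * sum(est_totals)

    avg = sum(est_totals) / n
    s_b2 = sum((t - avg) ** 2 for t in est_totals) / (n - 1)

    within = 0.0
    for s, Ni, m in zip(samples, psu_sizes, means):
        ni = len(s)
        s_w2 = sum((y - m) ** 2 for y in s) / (ni - 1)
        within += Ni ** 2 * (1 / ni - 1 / Ni) * s_w2

    v = N ** 2 * (1 / n - 1 / N) * s_b2 + N / n * within
    return y_ms, v


# Hypothetical sample: n = 2 psus out of N = 5, of sizes 10 and 20.
y_ms, v = two_stage_estimate([[2, 4], [3, 5, 7]], [10, 20], N=5)
print(y_ms, v)
```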
Remark If all the first stage units have the same number of second stage units, say M, and the same number of second stage units, say m, is sampled from every sampled primary stage unit, then $\hat{Y}_{ms}$ and $V(\hat{Y}_{ms})$ take the following forms:
(i) $\hat{Y}_{ms} = \frac{NM}{n} \sum_{i \in J} \bar{y}_{i.}$
(ii) $V(\hat{Y}_{ms}) = \frac{N^2 M^2}{n} \left(\frac{1}{m} - \frac{1}{M}\right) \bar{S}_w^2 + N^2 M^2 \left(\frac{1}{n} - \frac{1}{N}\right) \bar{S}_b^2$
where
$$\bar{S}_w^2 = \frac{1}{N} \sum_{i=1}^{N} S_{wi}^2 \quad \text{and} \quad \bar{S}_b^2 = \frac{1}{N - 1} \sum_{i=1}^{N} (\bar{Y}_{i.} - \bar{\bar{Y}})^2, \quad \bar{\bar{Y}} = \frac{\bar{Y}}{M}.$$
It is to be noted that $\bar{S}_b^2 = \frac{S_b^2}{M^2}$, where $S_b^2$ is as defined in Theorem 8.2.
Optimum values of n and m
Now we shall find the values of n and m, namely the sample sizes to be used in the first and second stages of sampling, assuming the conditions stated in Remark 8.2. Naturally these values depend on the type of cost function. If travel between primary stage units is not a major component, then the total cost of the survey can be taken as
$$C = c_1 n + c_2 nm \tag{8.17}$$
The above cost function contains two components. The first component is proportional to the number of psu's to be sampled, whereas the second component is proportional to the total number of second stage units to be sampled. Under the above set up, it is possible to find the optimum values of n and m for which $V(\hat{Y}_{ms})$ is minimum for a given cost. Towards this, we consider the function
$$L = V(\hat{Y}_{ms}) + \lambda [c_1 n + c_2 nm - C] = \frac{N^2 M^2}{n} \left(\frac{1}{m} - \frac{1}{M}\right) \bar{S}_w^2 + N^2 M^2 \left(\frac{1}{n} - \frac{1}{N}\right) \bar{S}_b^2 + \lambda [c_1 n + c_2 nm - C] \tag{8.18}$$
Differentiating the above function partially with respect to n and equating the derivative to zero we get
$$-\frac{N^2 M^2}{n^2} \left[\left(\frac{1}{m} - \frac{1}{M}\right) \bar{S}_w^2 + \bar{S}_b^2\right] + \lambda (c_1 + c_2 m) = 0$$
that is,
$$n^2 = \frac{N^2 M^2 \left[\bar{S}_b^2 + \left(\frac{1}{m} - \frac{1}{M}\right) \bar{S}_w^2\right]}{\lambda (c_1 + c_2 m)} \tag{8.19}$$
Again differentiating partially with respect to m and equating the derivative to zero we get
$$-\frac{N^2 M^2 \bar{S}_w^2}{n m^2} + \lambda c_2 n = 0 \tag{8.20}$$
Combining (8.19) and (8.20) we get
$$\frac{\bar{S}_b^2 + \left(\frac{1}{m} - \frac{1}{M}\right) \bar{S}_w^2}{c_1 + c_2 m} = \frac{\bar{S}_w^2}{c_2 m^2}$$
and hence
$$m = \sqrt{\frac{c_1}{c_2}}\, \frac{\bar{S}_w}{\sqrt{\bar{S}_b^2 - \bar{S}_w^2 / M}} \tag{8.21}$$
This value can be substituted in (8.17) and a best solution for n can be found easily. The expression given in (8.21) indicates that m is directly proportional to $\bar{S}_w$. This implies that the number of second stage units to be taken from a primary stage unit should be large if the variability with respect to y is large within primary stage units. Similarly, if the cost per secondary unit $c_2$ is small or the cost per primary unit $c_1$ is large, m should be large.
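The computation of the optimum sample sizes can be sketched as follows. This is a minimal illustration with hypothetical costs and variance components: it evaluates (8.21) for m and then solves the cost equation (8.17) for n. The function name is an assumption, the results are left unrounded, and the case $\bar{S}_b^2 \le \bar{S}_w^2 / M$, for which (8.21) has no real solution, is simply rejected.

```python
import math


def optimum_two_stage(c1, c2, C, Sb2, Sw2, M):
    """Return (m, n): m from (8.21), n from the cost constraint C = c1*n + c2*n*m.

    Sb2 and Sw2 stand for the between- and within-psu components
    (S-bar_b^2 and S-bar_w^2 of the remark); M is the common psu size.
    """
    denom = Sb2 - Sw2 / M
    if denom <= 0:
        raise ValueError("(8.21) has no real solution when Sb2 <= Sw2 / M")
    m = math.sqrt(c1 / c2) * math.sqrt(Sw2 / denom)
    n = C / (c1 + c2 * m)
    return m, n


# Hypothetical inputs: psu cost 100, ssu cost 4, budget 5000.
c1, c2, C = 100.0, 4.0, 5000.0
m, n = optimum_two_stage(c1, c2, C, Sb2=10.0, Sw2=40.0, M=20)
print(round(m, 2), round(n, 2))
```

In practice m and n would be rounded to integers, with m not exceeding M.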
2. Two-stage Sampling Under Unequal Probability Sampling
The results presented above are applicable for the case of using simple random sampling in both the stages of sampling. In this section, we shall discuss some results which are quite general in nature and can be applied for any sampling design with known inclusion probabilities. Here it is assumed that all the second stage units in the population are labelled using running numbers from 1 to $N_0$, where $N_0 = \sum_{r=1}^{N} N_r$, $N_r$ being the number of secondary units in the rth psu. If a unit i belongs to the rth psu, $r = r(i)$, then the unit will be included in the sample if
1. the rth psu is included in the first stage sample; and
2. the unit i is selected in subsampling, provided that case 1 happened.
Denoting the probability of case 1 by $\pi_r^{I}$ and the conditional probability in 2 by $\pi_i^{II}$, we have the following formula for the overall inclusion probability:
$$\pi_i = \pi_r^{I} \pi_i^{II}, \quad i = 1, 2, \ldots, N_0, \quad r = r(i) \tag{8.12}$$
For example, if simple random sampling is used in both the stages of sampling,
$$\pi_i = \frac{n}{N} \frac{n_r}{N_r}, \quad r = 1, 2, \ldots, N, \quad i = 1, 2, \ldots, N_0; \quad r = r(i) \tag{8.13}$$
Here it is assumed that a sample of n psu's is selected in the first stage and a subsample of size $n_r$ is drawn from the rth sampled psu.
Now we shall consider second-order inclusion probabilities in multistage sampling. For two units i and j belonging to the rth and sth psu, respectively, we may write
$$\pi_{ij} = \pi_{rs}^{I} \pi_{ij}^{II}, \quad i, j = 1, 2, \ldots, N_0; \quad r = r(i),\ s = s(j) \tag{8.14}$$
where $\pi_{rs}^{I}$ denotes the probability of simultaneous inclusion of the rth and sth psu in the first stage, and $\pi_{ij}^{II}$ denotes the conditional probability of simultaneous inclusion of the units i and j in subsampling, provided that the rth and sth psu have been selected in the first stage.
If $r \ne s$, the subsampling concerning the unit i is independent of the one concerning the unit j. Therefore
$$\pi_{ij}^{II} = \pi_i^{II} \pi_j^{II} \quad \text{and} \quad \pi_{ij} = \pi_{rs}^{I} \pi_i^{II} \pi_j^{II} \quad \text{if } r(i) \ne s(j).$$
If $r = s$, then we have $\pi_{rr}^{I} = \pi_r^{I}$ and $\pi_{ij} = \pi_r^{I} \pi_{ij}^{II}$ if $r(i) = r(j)$.
Using these first and second order inclusion probabilities, one can easily construct an unbiased estimator for the population total.
The total over the rth psu (with respect to the variable y) will be denoted by $T_{ry}$, $r = 1, 2, \ldots, N$. The set of indices belonging to the psu's selected in the first stage of sampling will be denoted by $J \subset \{1, 2, \ldots, N\}$ and identified with the first stage sample. The sample of second stage units yielded by subsampling in the rth psu will be denoted by $s_r$, $r = 1, 2, \ldots, N$. Consequently the sample of second stage units will be given by $s = \bigcup_{r \in J} s_r$. Under this notation, we have
$$\pi_r^{I} = P(r \in J) \quad \text{and} \quad \pi_i^{II} = P(i \in s \mid r(i) \in J).$$
An unbiased estimator of the population total Y is given by
$$\hat{Y} = \sum_{i \in s} \frac{Y_i}{\pi_i} \tag{8.15}$$
The above estimator can also be expressed as
$$\hat{Y} = \sum_{r \in J} \frac{1}{\pi_r^{I}} \sum_{i \in s_r} \frac{Y_i}{\pi_i^{II}} = \sum_{r \in J} \frac{\hat{T}_{ry}}{\pi_r^{I}}$$
where $\hat{T}_{ry}$ is the estimator of $T_{ry}$ based on subsampling. That is,
$$\hat{T}_{ry} = \sum_{i \in s_r} \frac{Y_i}{\pi_i^{II}}, \quad r = 1, 2, \ldots, N \tag{8.16}$$
The following theorem gives the variance of the above unbiased estimator for the population total.
Theorem 8.4 In two-stage sampling
$$V(\hat{Y}) = E_I[\hat{Y}_I - Y]^2 + \sum_{r=1}^{N} \frac{E_{II}(\hat{T}_{ry} - T_{ry})^2}{\pi_r^{I}} \tag{8.17}$$
where $\hat{Y}_I = \sum_{r \in J} \frac{T_{ry}}{\pi_r^{I}}$ and the expectations $E_I$ and $E_{II}$ refer to the first stage sampling and subsampling respectively.
Proof Fixing the first stage sample J, we have
$$E_{II}(\hat{Y}) = E_{II}\left[\sum_{r \in J} \frac{\hat{T}_{ry}}{\pi_r^{I}}\right] = \sum_{r \in J} \frac{T_{ry}}{\pi_r^{I}} = \hat{Y}_I$$
Furthermore, since subsampling is carried out in each sampled first stage unit independently, we have
$$E_{II}(\hat{Y} - Y)^2 = E_{II}[\hat{Y} - E_{II}(\hat{Y})]^2 + [E_{II}(\hat{Y}) - Y]^2 = \sum_{r \in J} \frac{E_{II}[\hat{T}_{ry} - T_{ry}]^2}{(\pi_r^{I})^2} + [\hat{Y}_I - Y]^2$$
Applying the well-known relation $E[\cdot] = E_I E_{II}[\cdot]$ to both sides of the above expression, we get the required result.
Exercises
8.1 Suggest an unbiased estimator for the population total assuming simple random sampling is used in the first stage and ppswr sampling is used in the second stage, and derive its variance.
8.2 Suppose a population consists of N primary stage units out of which n are selected so that the probability that a sample s of size n is selected is proportional to the sample total of the size variable of the primary stage units. Suppose further that for the ith psu there is an estimator $T_i$ (based on sampling at the second stage and subsequent stages) of the total $Y_i$ of the primary stage unit. Suggest an unbiased estimator of the population total using the $T_i$ and derive its variance.
8.3 In a two-stage design one subunit is selected with pp to x from the entire population. If this happens to come from the ith psu, a without replacement random sample of $m_i - 1$ subunits is taken from the $M_i - 1$ that remain in the psu. From the other $N - 1$ psu's a without replacement random sample of $n - 1$ psu's is taken. Subsampling of the selected psu's is without replacement simple random. Show that
$$\frac{\sum_{i=1}^{n} M_i \bar{y}_i}{\sum_{i=1}^{n} M_i \bar{x}_i}$$
is an unbiased estimator of $\frac{Y}{X}$.
Chapter 9
Nonsampling Errors
9.1 Incomplete Surveys
In many large scale surveys, data cannot always be obtained from all the sampled units for various reasons: the selected respondent may not be available at home, or, even if present, may refuse to cooperate with the investigator. In such cases the available data returns are incomplete, and sometimes this kind of incompleteness, called nonresponse, is so large as to completely vitiate the results. In this section some techniques meant for removing biases arising from incomplete data are presented.
Hansen and Hurwitz Technique
Hansen and Hurwitz (1946) suggested a solution for obtaining unbiased estimates in mail surveys in the presence of nonresponse. In their method, questionnaires are mailed to all the respondents included in a sample and a list of nonrespondents is prepared after the deadline is over. Then a subsample is drawn from the set of nonrespondents, a direct interview is conducted with the selected respondents and the necessary information is collected. The parameters concerned are estimated by combining the data obtained from the two parts of the survey.
Assume that the population is divided into two groups: those who will respond at the first attempt, forming the response class, and those who will not respond, forming the nonresponse class. Let $N_1$ and $N_2$ be the number of units in the population that belong to the response class and the nonresponse class respectively ($N_1 + N_2 = N$). Let $n_1$ be the number of units responding in a simple random sample of size n drawn from the population and $n_2$ be the number of units not responding in the sample. We may regard the sample of $n_1$ respondents as a simple random sample from the response class and the sample of $n_2$ as a simple random sample from the nonresponse class. Let $h_2$ denote the size of the subsample from the $n_2$ nonrespondents to be interviewed and $f = \frac{n_2}{h_2}$. Unbiased estimators of $N_1$ and $N_2$ are given by
$$\hat{N}_1 = \frac{N n_1}{n} \quad \text{and} \quad \hat{N}_2 = \frac{N n_2}{n} \tag{9.1}$$
Let $\bar{y}_1$ denote the mean of the $n_1$ respondents, $\bar{y}_{h_2}$ the mean of the $h_2$ observations in the subsample and
$$\bar{y}_w = \frac{n_1 \bar{y}_1 + n_2 \bar{y}_{h_2}}{n} \tag{9.2}$$
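The following theorem proves that the above estimator is unbiased for the population mean and gives its variance.
Theorem 9.1 The estimator $\bar{y}_w = \frac{n_1 \bar{y}_1 + n_2 \bar{y}_{h_2}}{n}$ is unbiased for the population mean and its variance is
$$V(\bar{y}_w) = \left(\frac{1}{n} - \frac{1}{N}\right) S^2 + \frac{(f - 1)}{n} \frac{N_2}{N} S_2^2$$
where $S_2^2$ is the analogue of $S^2$ based on the nonresponse class.
Proof
$$E[\bar{y}_w] = E\,E[\bar{y}_w \mid n_1, n_2] = E\left[E\left(\frac{n_1 \bar{y}_1 + n_2 \bar{y}_{h_2}}{n} \,\Big|\, n_1, n_2\right)\right]$$
$$= E[\bar{y}] \quad \text{(since } \bar{y}_{h_2} \text{ is unbiased for the mean of the } n_2 \text{ sampled units of the nonresponse class)}$$
$$= \bar{Y}$$
Therefore the estimator is unbiased.
$$V(\bar{y}_w) = V\,E[\bar{y}_w \mid n_1, n_2] + E\,V[\bar{y}_w \mid n_1, n_2] = V[\bar{y}] + E\,V[\bar{y}_w \mid n_1, n_2] \tag{9.3}$$
Note that
$$V[\bar{y}_w \mid n_1, n_2] = V\left[\frac{n_1 \bar{y}_1 + n_2 \bar{y}_{h_2}}{n} \,\Big|\, n_1, n_2\right] = \frac{n_2^2}{n^2} \left(\frac{1}{h_2} - \frac{1}{n_2}\right) s_2^2 = \frac{n_2}{n^2} (f - 1) s_2^2$$
where $s_2^2$ is the sample analogue of $S_2^2$.
Hence
$$E\,V[\bar{y}_w \mid n_1, n_2] = \frac{(f - 1)}{n^2} E\left[n_2\, E(s_2^2 \mid n_2)\right] = \frac{(f - 1)}{n^2} S_2^2\, E[n_2] = \frac{(f - 1)}{n} \frac{N_2}{N} S_2^2 \tag{9.4}$$
Note that
$$V(\bar{y}) = \left(\frac{1}{n} - \frac{1}{N}\right) S^2 \tag{9.5}$$
Substituting (9.4) and (9.5) in (9.3) we get the required result.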
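The estimator (9.2) combines the mail respondents with the interviewed subsample of nonrespondents, each nonrespondent mean being weighted by the full count $n_2$. The following is a minimal sketch with hypothetical data; the function name is illustrative only.

```python
def hansen_hurwitz_mean(respondents, subsample, n2):
    """Estimator (9.2).

    respondents : y-values of the n1 units that answered the mail questionnaire
    subsample   : y-values of the h2 nonrespondents who were interviewed
    n2          : number of nonrespondents in the original sample
    """
    n1 = len(respondents)
    n = n1 + n2
    ybar_1 = sum(respondents) / n1
    ybar_h2 = sum(subsample) / len(subsample)
    return (n1 * ybar_1 + n2 * ybar_h2) / n


# Hypothetical sample of n = 10: 4 respond, 6 do not, 2 of the 6 are interviewed (f = 3).
y_w = hansen_hurwitz_mean([10, 12, 14, 16], [20, 22], n2=6)
print(y_w)
```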
The cost involved in the above technique contains three components: (1) the overhead cost $C_0$, (2) the cost of collecting and processing information per unit in the response class $C_1$ and (3) the cost of interviewing and processing information per unit in the nonresponse class $C_2$. Thus, it is reasonable to consider a cost function of the form $C = C_0 n + C_1 n_1 + C_2 h_2$. Since $n_1$ and $h_2$ are random quantities with expectations $\frac{n N_1}{N}$ and $\frac{n N_2}{N f}$ respectively, the average cost function is
$$C' = \frac{n}{N} \left[C_0 N + C_1 N_1 + C_2 \frac{N_2}{f}\right].$$
The following theorem gives the optimum values of f and n for a given variance.
Theorem 9.2 The values of f and n for which the average cost is minimum when $V[\bar{y}_w] = V_0$ are given by
$$n = \frac{S^2 + \frac{N_2}{N} (f - 1) S_2^2}{V_0 + \frac{S^2}{N}} \quad \text{and} \quad f = \sqrt{\frac{C_2 \left(S^2 - \frac{N_2}{N} S_2^2\right)}{S_2^2 \left(C_0 + C_1 \frac{N_1}{N}\right)}}.$$
Proof of this theorem is left as an exercise.
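The optimum values in Theorem 9.2 follow from minimising the average cost subject to the variance of Theorem 9.1 being fixed at $V_0$. The following is a minimal sketch with hypothetical inputs, where `W2` stands for $N_2/N$; the function names are illustrative, and the second function simply re-evaluates the variance of Theorem 9.1 as a check.

```python
import math


def hansen_hurwitz_optimum(S2, S2_2, W2, C0, C1, C2, V0, N):
    """Return (f, n) minimising average cost for a required variance V0.

    S2   : population variance S^2
    S2_2 : variance S_2^2 within the nonresponse class
    W2   : N2 / N, the proportion of the population in the nonresponse class
    """
    W1 = 1 - W2
    f = math.sqrt(C2 * (S2 - W2 * S2_2) / (S2_2 * (C0 + C1 * W1)))
    n = (S2 + (f - 1) * W2 * S2_2) / (V0 + S2 / N)
    return f, n


def variance_yw(n, f, S2, S2_2, W2, N):
    """Variance of Theorem 9.1 for given n and f."""
    return (1 / n - 1 / N) * S2 + (f - 1) / n * W2 * S2_2


# Hypothetical inputs.
f, n = hansen_hurwitz_optimum(S2=100, S2_2=80, W2=0.4, C0=1, C1=4, C2=36, V0=0.4, N=1000)
print(f, n, variance_yw(n, f, 100, 80, 0.4, 1000))
```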
The problem of incomplete surveys has received the attention of many, including El-Badry (1956), Dalenius (1955), Kish and Hess (1959), Bartholomew (1961) and Srinath (1971).
Deming's Model of the effects of callbacks
Deming (1953) developed a mathematical model to study in detail the consequences of different callback policies. Here the population is divided into r classes according to the probability that the respondent will be found at home. Let $w_{ij}$ = probability that a respondent in the jth class will be reached on or before the ith call, $p_j$ = proportion of the population falling in the jth class, $\bar{Y}_j$ = mean for the jth class and $\sigma_j^2$ = variance for the jth class. Here it is assumed that $w_{ij}$ is positive for all classes. If $\bar{y}_{ij}$ is the mean for those in class j who were reached on or before the ith call, it is assumed that $E[\bar{y}_{ij}] = \bar{Y}_j$. The true population mean for the item is $\bar{Y} = \sum_{j=1}^{r} p_j \bar{Y}_j$.
Suppose a simple random sample of size $n_0$ is drawn. After i calls, the sample is divided into (r+1) classes: in the first class and interviewed; in the second and interviewed; and so on. The (r+1)st class consists of all those not interviewed yet. The numbers falling in these (r+1) classes are distributed according to the multinomial
$$\left[w_{i1} p_1 + w_{i2} p_2 + \cdots + w_{ir} p_r + \left(1 - \sum_{j=1}^{r} w_{ij} p_j\right)\right]^{n_0}$$
where $n_0$ is the initial size of the sample. Therefore $n_i$, the number interviewed after i calls, follows the Binomial distribution with parameters $n_0$ and $\sum_{j=1}^{r} w_{ij} p_j$. For fixed $n_i$, the numbers of interviews $n_{ij}$ ($j = 1, 2, \ldots, r$) follow the multinomial distribution with probabilities $\frac{w_{ij} p_j}{\sum_{j=1}^{r} w_{ij} p_j}$. Therefore
$$E[n_{ij} \mid n_i] = \frac{n_i w_{ij} p_j}{\sum_{j=1}^{r} w_{ij} p_j}$$
If $\bar{y}_i$ is the sample mean obtained after i calls,
$$E[\bar{y}_i \mid n_i] = E\left[\sum_{j=1}^{r} \frac{n_{ij} \bar{y}_{ij}}{n_i} \,\Big|\, n_i\right] = \frac{\sum_{j=1}^{r} w_{ij} p_j \bar{Y}_j}{\sum_{j=1}^{r} w_{ij} p_j} = \bar{Y}_i \text{ (say)}$$
Since the above expected value does not depend on $n_i$, the overall expectation of $\bar{y}_i$ is also $\bar{Y}_i$. Therefore the estimator is biased for the population mean $\bar{Y}$.
The bias of the estimator $\bar{y}_i$ is given by
$$E[\bar{y}_i] - \bar{Y} = \bar{Y}_i - \bar{Y} = \bar{Y}_i - \left[\left(\sum_{j=1}^{r} w_{ij} p_j\right) \bar{Y}_i + \left(1 - \sum_{j=1}^{r} w_{ij} p_j\right) \bar{Y}_i'\right] = \left[1 - \sum_{j=1}^{r} w_{ij} p_j\right] (\bar{Y}_i - \bar{Y}_i')$$
where $\bar{Y}_i' = \frac{\sum_{j=1}^{r} (1 - w_{ij}) p_j \bar{Y}_j}{1 - \sum_{j=1}^{r} w_{ij} p_j}$ is the mean of the units not interviewed yet.
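The bias expression can be evaluated for any assumed set of class proportions and reach probabilities. The following is a minimal sketch with hypothetical values for r = 2 classes; the function name is illustrative, and it assumes that not everyone has been reached (so that the mean of the units not yet interviewed is defined). It returns the bias computed in both of the forms given above.

```python
def callback_bias(w, p, Y):
    """Bias of the mean after i calls under Deming's model.

    w[j] : probability that a class-j respondent is reached on or before call i
    p[j] : proportion of the population in class j
    Y[j] : mean of class j
    Returns (Y_i - Y, (1 - sum w p) * (Y_i - Y_i')), which should coincide.
    """
    reached = sum(wj * pj for wj, pj in zip(w, p))
    mean_reached = sum(wj * pj * yj for wj, pj, yj in zip(w, p, Y)) / reached
    true_mean = sum(pj * yj for pj, yj in zip(p, Y))
    mean_missed = sum((1 - wj) * pj * yj for wj, pj, yj in zip(w, p, Y)) / (1 - reached)
    return mean_reached - true_mean, (1 - reached) * (mean_reached - mean_missed)


# Hypothetical classes: the harder-to-reach class has the larger mean.
direct, factored = callback_bias(w=[0.9, 0.5], p=[0.6, 0.4], Y=[10.0, 20.0])
print(direct, factored)
```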
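The conditional variance of the estimator $\bar{y}_i$ after i calls is
$$V[\bar{y}_i \mid n_i] = \frac{N_i - n_i}{N_i - 1} \frac{1}{n_i} \sum_{j=1}^{r} \left[\frac{N_{ij}}{N_i} [\bar{Y}_{ij} - \bar{Y}_i]^2 + \frac{N_{ij} - 1}{N_i} S_{ij}^2\right]$$
where $S_{ij}^2 = \frac{1}{N_{ij} - 1} \sum_{k=1}^{N_{ij}} [Y_{ijk} - \bar{Y}_{ij}]^2$. The quantities $N_{ij}$, $N_i$ etc. have the usual meaning. Taking $\alpha_{ij} = \frac{N_{ij}}{N_i}$ and $\sigma_{ij}^2 = \frac{N_{ij} - 1}{N_{ij}} S_{ij}^2$, $V[\bar{y}_i \mid n_i]$ can be written as
$$V[\bar{y}_i \mid n_i] = \frac{N_i - n_i}{N_i - 1} \frac{1}{n_i} \sum_{j=1}^{r} \alpha_{ij} \left\{[\bar{Y}_{ij} - \bar{Y}_i]^2 + \sigma_{ij}^2\right\}$$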
If we further assume a ;7 = a} and ignore terms of order ._!, , . it can be seen that
n
r
L {a ij [ f;j  Y; ] 2 + a J }
MSElY;l=(! .! JJ=
1
, 
L WijP j
j=l
Deming has also considered the problem of determining the optimum number of callbacks for a given sample size and cost of the survey. For related results one can refer to Deming (1946).
Politz-Simmons Technique
Politz and Simmons (1949, 1950) developed a technique to reduce the bias due to incomplete surveys without successive callbacks. Their method is described below:
The interviewer makes only one call during a specific time on six weekdays. If the respondent is at home, the required information is collected and he is asked how many times in the preceding five days he was at home at the time of visit. This data is used to estimate the probability of the respondent's availability. If the respondent states that he was at home t nights out of five, the ratio $\frac{t + 1}{6}$ is taken as an estimator of the frequency $\pi$ with which he is at home during interviewing hours.
The results from the first call are sorted into six groups according to the values of t (0, 1, 2, 3, 4, 5). Let $n_t$ be the number of interviews obtained from the tth group and $\bar{y}_t$ the mean based on them. The Politz-Simmons estimate of the population mean is
$$\bar{y}_{ps} = \frac{\sum_{t=0}^{5} \frac{6\, n_t \bar{y}_t}{t + 1}}{\sum_{t=0}^{5} \frac{6\, n_t}{t + 1}}.$$
In this approach, the fact that the first call results are unduly weighted with persons who are at home most of the time is recognised. Since a person who is at home, on the average, a proportion $\pi$ of the time has a relative chance $\pi$ of appearing in the sample, his response should receive a weight $\frac{1}{\pi}$. The quantity $\frac{6}{t + 1}$ is used as an estimate of $\frac{1}{\pi}$. Thus $\bar{y}_{ps}$ is less biased than the sample mean from the first call, but it has greater variance because the estimator happens to be a weighted mean.
Let the population be divided into classes, people in the jth class being at home $\pi_j$ of the time. Note that the tth group will contain persons from various classes. That is, persons at home t nights out of the preceding five belong to various classes. Let $n_{jt}$, $\bar{y}_{jt}$ be the number and the mean for those in class j and group t. Then the estimator $\bar{y}_{ps}$ can be written as
$$\bar{y}_{ps} = \frac{\sum_{t=0}^{5} \sum_{j} \frac{6\, n_{jt} \bar{y}_{jt}}{t + 1}}{\sum_{t=0}^{5} \sum_{j} \frac{6\, n_{jt}}{t + 1}}.$$
If $n_0$ is the initial size of the sample (responses plus not-at-homes) and $n_j$ is the number from class j who are interviewed, the following assumptions are made.
(1) $\frac{n_j}{n_0}$ is a binomial estimate of $p_j \pi_j$
(2) $E[n_{jt} \mid n_j] = n_j \binom{5}{t} \pi_j^t (1 - \pi_j)^{5 - t}$
(3) $E[\bar{y}_{jt}] = \bar{Y}_j$ for any j and t
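It can be shown that under the above assumptions
$$E\left(\sum_{t=0}^{5} \frac{6\, n_{jt}}{t + 1} \,\Big|\, n_j\right) = \frac{n_j [1 - (1 - \pi_j)^6]}{\pi_j}$$
and hence
$$E\left(\sum_{j} \sum_{t=0}^{5} \frac{6\, n_{jt}}{t + 1}\right) = n_0 \sum_{j} p_j [1 - (1 - \pi_j)^6].$$
Since $E[\bar{y}_{jt}] = \bar{Y}_j$ for any j and t, we have
$$E[\bar{y}_{ps}] \doteq \frac{\sum_{j} p_j [1 - (1 - \pi_j)^6]\, \bar{Y}_j}{\sum_{j} p_j [1 - (1 - \pi_j)^6]}.$$
This shows that the estimator $\bar{y}_{ps}$ is biased for $\bar{Y}$. However, in practice the amount of bias is likely to be small when compared to callback surveys. The variance of the estimator is quite complicated. For more details, one can refer to the original paper.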
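The Politz-Simmons estimate is a weighted mean with weights 6/(t + 1). The following is a minimal sketch with hypothetical group counts and means; the function names are illustrative. The second function evaluates the expected weight for a person who is at home a proportion `pi` of the time, assuming t is Binomial(5, pi) as in assumption (2), which gives the factor $[1 - (1 - \pi)^6]/\pi$ used above.

```python
from math import comb


def politz_simmons_mean(counts, means):
    """counts[t], means[t]: number of interviews and their mean for t = 0, ..., 5."""
    weights = [6 / (t + 1) * counts[t] for t in range(6)]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)


def expected_weight(pi):
    """E[6 / (t + 1)] when t ~ Binomial(5, pi)."""
    return sum(6 / (t + 1) * comb(5, t) * pi ** t * (1 - pi) ** (5 - t)
               for t in range(6))


# Hypothetical first-call results: 6 interviews with t = 2 and 12 with t = 5.
# Means of empty groups are entered as 0 and receive zero weight.
y_ps = politz_simmons_mean([0, 0, 6, 0, 0, 12], [0, 0, 10.0, 0, 0, 20.0])
print(y_ps, expected_weight(0.4))
```

In this illustration the unweighted first-call mean would be (6 x 10 + 12 x 20)/18, which is pulled towards the group that is at home most often; the weighting removes part of that pull.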
9.2 Randomised Response Methods
In many sample surveys involving human populations, it is very difficult to get answers which are truthful, and in some cases the respondents fail to cooperate. This is mainly due to the sensitivity of certain questions which are likely to affect the privacy of respondents. To overcome this limitation, Warner (1965) designed a technique to encourage cooperation and truthful answering.
Suppose members of a group A in a population have a socially unacceptable character and we are interested in estimating the proportion $\pi_A$ of the persons belonging to A. Assume that a simple random sample of size n is drawn with replacement from the given population.
Warner's Method
Each selected respondent is given a random device which results in one of the two statements "I belong to group A" and "I do not belong to A". The respondent is asked to conduct the experiment unobserved by the investigator and report only "yes" or "no" according to the outcome of the experiment. He does not report the outcome of the experiment. If $n_1$ persons in the sample report a "yes" answer and $n_2 = n - n_1$ report a "no" answer, then an unbiased estimator of $\theta_w$, the probability of a "yes" answer, is given by $\hat{\theta}_w = \frac{n_1}{n}$. It is to be noted that
$$\theta_w = P \pi_A + (1 - P)(1 - \pi_A) \tag{9.6}$$
where P is the probability of getting the statement "I belong to group A" and $(1 - P)$ is the probability of getting the other statement. It is assumed that P is known. Hence an unbiased estimator of the parameter $\pi_A$ is
$$\hat{\pi}_{Aw} = \frac{\hat{\theta}_w - (1 - P)}{2P - 1}, \quad P \ne \frac{1}{2}.$$
Since $n_1$ has the binomial distribution with parameters n and $\theta_w$,
$$V(\hat{\pi}_{Aw}) = \frac{\theta_w (1 - \theta_w)}{n (2P - 1)^2} \tag{9.7}$$
$$= \frac{\pi_A (1 - \pi_A)}{n} + \frac{P(1 - P)}{n (2P - 1)^2} \tag{9.8}$$
Here it is assumed that the respondent is truthful.
The first term on the right hand side of (9.8) is the usual binomial variance that would be obtained when all the respondents are willing to answer truthfully and a direct question is presented to each respondent included in the sample. The second term represents a sizable addition due to the random device. It is to be noted that the above method is not useful when $P = \frac{1}{2}$, and for $P = 1$ the method reduces to direct questioning.
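Warner's estimator and the variance formulas (9.7) and (9.8) can be sketched as follows. The function names and the numerical inputs are hypothetical; the variance returned by `warner_estimate` is a plug-in value obtained by replacing $\theta_w$ in (9.7) by its sample estimate, which is an assumption of this sketch rather than something stated in the text.

```python
def warner_estimate(n_yes, n, P):
    """Return (pi_A estimate, plug-in variance from (9.7)) for Warner's method."""
    if P == 0.5:
        raise ValueError("the method cannot be used when P = 1/2")
    theta_hat = n_yes / n
    pi_hat = (theta_hat - (1 - P)) / (2 * P - 1)
    var_hat = theta_hat * (1 - theta_hat) / (n * (2 * P - 1) ** 2)
    return pi_hat, var_hat


def warner_variance(pi_A, n, P):
    """True variance computed both as (9.7) and as (9.8)."""
    theta = P * pi_A + (1 - P) * (1 - pi_A)
    form_97 = theta * (1 - theta) / (n * (2 * P - 1) ** 2)
    form_98 = pi_A * (1 - pi_A) / n + P * (1 - P) / (n * (2 * P - 1) ** 2)
    return form_97, form_98


# Hypothetical survey: 70 "yes" answers out of n = 200 with P = 0.7.
pi_hat, var_hat = warner_estimate(70, 200, 0.7)
form_97, form_98 = warner_variance(0.2, 100, 0.7)
print(pi_hat, var_hat, form_97, form_98)
```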
Simmons Randomised Response Model
In order to enhance the confidence of the respondent in the anonymity provided by the randomised response method, Simmons suggested that one of the statements should refer to a nonsensitive attribute, say Y, unrelated to the sensitive attribute A. In this case the respondent gets one of the following two statements, the first with probability $(1 - P)$ and the second with probability P.
1. I belong to group Y
2. I belong to group A
In this case statement 1 would not embarrass the respondent. If $\pi_Y$ is the proportion in the population with the attribute Y and it is known, then the proportion $\pi_A$ can be estimated unbiasedly. Note that the probability of getting the "yes" answer is $\theta_s = P \pi_A + (1 - P) \pi_Y$. If $\hat{\theta}_s$ is the proportion of "yes" answers in the sample of size n, then an unbiased estimator of $\pi_A$ is
$$\hat{\pi}_{As} = \frac{\hat{\theta}_s - (1 - P) \pi_Y}{P}$$
and its variance is
$$V(\hat{\pi}_{As}) = \frac{\theta_s (1 - \theta_s)}{n P^2}$$
When $\pi_Y$ is unknown, the method can be altered to facilitate estimation of both
$\pi_Y$ and $\pi_A$. Here, the sample is drawn in the form of two independent samples
of sizes $n_1$ and $n_2$, again with replacement, with probabilities $P_1$ and $P_2$ of
getting the sensitive statement in the first and second samples respectively. The
same unrelated question is presented with probabilities $1 - P_1$ and $1 - P_2$ in
the first and second samples respectively. If $\theta_1$ and $\theta_2$ are the respective
probabilities of a "yes" answer, then we have
\[ \theta_1 = P_1\pi_A + (1-P_1)\pi_Y \]
\[ \theta_2 = P_2\pi_A + (1-P_2)\pi_Y \]
Solving these two expressions, we get
\[ \pi_Y = \frac{P_2\theta_1 - P_1\theta_2}{P_2 - P_1} \]
\[ \pi_A = \frac{(1-P_2)\theta_1 - (1-P_1)\theta_2}{P_1 - P_2} \]
Let $n'_1$ and $n'_2$ be the number of yes answers in the first and second samples
respectively. Since $n'_1/n_1$ and $n'_2/n_2$ are unbiased for $\theta_1$ and $\theta_2$ respectively, an
unbiased estimator of $\pi_A$ is given by
\[ \hat\pi_{AS} = \frac{(1-P_2)\hat\theta_1 - (1-P_1)\hat\theta_2}{P_1 - P_2} \]
where $\hat\theta_1 = n'_1/n_1$ and $\hat\theta_2 = n'_2/n_2$. Since $n'_1$ and $n'_2$ are independent and
binomially distributed with parameters $(n_1, \theta_1)$ and $(n_2, \theta_2)$, the variance of
$\hat\pi_{AS}$ is found to be
\[ V(\hat\pi_{AS}) = \frac{1}{(P_1 - P_2)^2}\left[\frac{(1-P_2)^2\,\theta_1(1-\theta_1)}{n_1} + \frac{(1-P_1)^2\,\theta_2(1-\theta_2)}{n_2}\right] \]
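A minimal sketch of the two-sample computation follows. It assumes the counts of yes answers and the device probabilities are supplied by the user; the function name is hypothetical. The variance estimate replaces each $\theta_i(1-\theta_i)/n_i$ by its unbiased binomial plug-in $\hat\theta_i(1-\hat\theta_i)/(n_i - 1)$.

```python
def two_sample_unrelated(yes1, n1, p1, yes2, n2, p2):
    """Sketch: estimate pi_A and pi_Y from two independent samples.

    p1, p2 are the probabilities of drawing the sensitive statement in
    samples 1 and 2; they must differ for the equations to be solvable.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must be different")
    t1, t2 = yes1 / n1, yes2 / n2                   # proportions of yes answers
    pi_a = ((1 - p2) * t1 - (1 - p1) * t2) / (p1 - p2)
    pi_y = (p2 * t1 - p1 * t2) / (p2 - p1)
    # plug-in estimate of the variance of the pi_A estimator
    var_a = ((1 - p2) ** 2 * t1 * (1 - t1) / (n1 - 1)
             + (1 - p1) ** 2 * t2 * (1 - t2) / (n2 - 1)) / (p1 - p2) ** 2
    return pi_a, pi_y, var_a


print(two_sample_unrelated(28, 100, 0.8, 48, 100, 0.3))
```

In this illustration the observed proportions 0.28 and 0.48 are exactly what $\pi_A = 0.2$ and $\pi_Y = 0.6$ would produce with $P_1 = 0.8$ and $P_2 = 0.3$, so the two estimates recover those values.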
Folsom's Model with two unrelated characteristics
Folsom et al (1973) developed an unrelated-question model with two non-sensitive
characteristics, $y_1$ and $y_2$, in addition to the sensitive character A.
Assume that the non-sensitive proportions $\pi_{y_1}$ and $\pi_{y_2}$ are unknown. Two
independent simple random samples with replacement of sizes $n_1$ and $n_2$ are
drawn. Each respondent in both the samples answers a direct question on a
non-sensitive topic and also one of two questions selected by a randomised device.
The following table describes the scheme.

Technique used with respondents    Sample 1          Sample 2
Randomised Response (RR)           Question A        Question A
                                   Question $Y_1$    Question $Y_2$
Direct Response (DR)               Question $Y_2$    Question $Y_1$

In both samples let the sensitive question be asked with the probability P, and for
$i = 1, 2$, let $\lambda_i^r$ ($\lambda_i^d$) be the probability of a "yes" answer to the question selected by
RR (DR) in the ith sample. Then
\[ \lambda_1^r = P\pi_A + (1-P)\pi_{y_1} \qquad (9.9) \]
\[ \lambda_2^r = P\pi_A + (1-P)\pi_{y_2} \qquad (9.10) \]
\[ \lambda_1^d = \pi_{y_2} \qquad (9.11) \]
\[ \lambda_2^d = \pi_{y_1} \qquad (9.12) \]
Let $\hat\lambda_1^r$, $\hat\lambda_2^r$, $\hat\lambda_1^d$ and $\hat\lambda_2^d$ denote the usual unbiased estimators of $\lambda_1^r$, $\lambda_2^r$, $\lambda_1^d$
and $\lambda_2^d$ respectively, given by the corresponding sample proportions. Then from
(9.9) and (9.12) we get an unbiased estimator as
\[ \hat\pi_A(1) = \frac{\hat\lambda_1^r - (1-P)\hat\lambda_2^d}{P} \qquad (9.13) \]
Using (9.10) and (9.11) we get another unbiased estimator as
\[ \hat\pi_A(2) = \frac{\hat\lambda_2^r - (1-P)\hat\lambda_1^d}{P} \qquad (9.14) \]
Variances of the estimators defined in (9.13) and (9.14) can be obtained easily.
This is left as an exercise. In addition to these three Randomised Response
methods, several other schemes are available. For details one can refer to
Chaudhuri and Mukerjee ( 1988).
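The two estimators (9.13) and (9.14) of Folsom's model can be computed as in the following sketch. The function and argument names are illustrative assumptions: `lam_r1`, `lam_r2` stand for the sample proportions of yes answers to the randomised question in samples 1 and 2, and `lam_d1`, `lam_d2` for the proportions of yes answers to the direct question in samples 1 and 2.

```python
def folsom_estimates(lam_r1, lam_r2, lam_d1, lam_d2, p):
    """Sketch of the two unbiased estimators of pi_A in Folsom's model.

    Sample 1 uses Y1 in the randomised device and Y2 as the direct question;
    sample 2 uses Y2 in the device and Y1 as the direct question.
    """
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    # (9.13): the RR answers of sample 1 are paired with the DR answers of sample 2
    est1 = (lam_r1 - (1 - p) * lam_d2) / p
    # (9.14): the RR answers of sample 2 are paired with the DR answers of sample 1
    est2 = (lam_r2 - (1 - p) * lam_d1) / p
    return est1, est2


print(folsom_estimates(0.38, 0.26, 0.20, 0.50, p=0.6))
```

The inputs in the call are the exact probabilities implied by $\pi_A = 0.3$, $\pi_{y_1} = 0.5$, $\pi_{y_2} = 0.2$ and $P = 0.6$, so both estimators return 0.3 up to rounding.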
9.3 Observational Errors
So far in all our discussions, it has been assumed that each unit in the population
is attached a fixed value known as the true value of the unit with respect to the
character under study and whenever a population unit is included in the sample, its
value of y is observed. This assumption is an oversimplification of the
problem and actual experience does not support it. There are
plenty of examples to show that errors of measurement of responses are present
when a survey is carried out. In this section we shall consider this problem and
devise methods for the measurement of these errors to plan the survey as
meticulously as possible.
Let us assume that M interviewers are available for the survey. The response
$x_{ijk}$ obtained by interviewer i on unit j is assumed to be a random variable with
$E_2[x_{ijk}] = X_{ij}$ and $V_2[x_{ijk}] = S_{ij}^2$. The average of responses obtained by
interviewer i on all the N units in the population is
\[ \bar X_i = \frac{1}{N}\sum_{j=1}^{N} X_{ij} \]
and the average obtained by all the M interviewers would be
\[ \bar X = \frac{1}{M}\sum_{i=1}^{M} \bar X_i . \]
This value can be taken as the expected value of the survey, whereas the true value is $\bar Y$,
the population mean based on all the units in the population. The difference
$\bar X - \bar Y$ is called the response bias.
The response obtained from a sampled unit depends on the person who
observes the unit. Therefore it is desirable to allocate the sample interviewers
(selected out of the M available) to the sample units (selected out of the N units
in the population). Now consider the situation in which a simple random sample
of $\bar n = n/m$ units is selected from the population of N units and assigned to an
interviewer selected at random from the M available for the study. Another
independent sample of size $\bar n$ is selected and assigned to another interviewer
selected at random from the M. In this process m such subsamples of size $\bar n$ are
selected and assigned to the m selected interviewers. The following theorem gives an
unbiased estimator of $\bar X$ under the above scheme.
Theorem 9.3 Under the sampling scheme described above, an unbiased
estimator of $\bar X$ is given by
\[ \bar x = \frac{1}{m}\sum_{i=1}^{m} \bar x_i , \qquad \text{where } \bar x_i = \frac{1}{\bar n}\sum_{j=1}^{\bar n} x_{ijk} \]
is the sample mean provided by the ith selection of the interviewer.
Proof If a unit is selected at random from the population containing N units and
an interviewer is chosen at random from the M and assigned to the selected unit,
the expected value of the response $x_{ijk}$ will be $\bar X$. It is because for a given
interviewer i and for a given unit j, $E_2[x_{ijk}] = X_{ij}$. This implies that for a fixed i,
the expected response over a randomly selected unit is
$\frac{1}{N}\sum_{j=1}^{N} X_{ij} = \bar X_i$. Therefore
\[ E[x_{ijk}] = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N} X_{ij} = \bar X . \]
This implies that the sample mean $\bar x_i$ is unbiased for $\bar X$. Hence
\[ E[\bar x] = \frac{1}{m}\sum_{i=1}^{m} E[\bar x_i] = \bar X . \]
Hence the proof.
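Theorem 9.4
\[ V(\bar x) = \frac{V[x]}{n} + \left[\frac{1}{m} - \frac{1}{n}\right] C \]
where
\[ V[x] = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N} E[x_{ijk} - \bar X]^2 \]
and
\[ C = \frac{1}{MN(N-1)}\sum_{i=1}^{M}\sum_{j \ne j'} E[x_{ijk} - \bar X][x_{ij'k} - \bar X] . \]
Proof of this theorem follows from routine algebra and hence left as an exercise.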
The variance of the sample mean based on a survey employing interviewers
has two components. One is the variability of all responses over all units to all
interviewers and the other is the covariance between responses obtained from
different units within interviewer assignments. If advance estimates of these two
components are available, one can determine from the variance given in
Theorem 9.4 the optimum number of interviewers to employ for the collection of
data.
Let $c_1$ be the cost per unit in the sample and $c_2$ be the cost per
interviewer, so that the total cost of the survey is
\[ C_T = c_0 + c_1 n + c_2 m \qquad (9.15) \]
The values of n and m can be found by minimising $V(\bar x)$ for a given cost with
the help of the method of Lagrangian multipliers. Setting the partial derivatives
of $V(\bar x) + \lambda(c_0 + c_1 n + c_2 m - C_T)$ with respect to n and m equal to zero, we
get
\[ \lambda c_1 = \frac{V(x) - C}{n^2} \quad\text{and}\quad \lambda c_2 = \frac{C}{m^2} \qquad (9.16) \]
\[ \frac{m}{n} = \frac{\sqrt{c_1 C}}{\sqrt{c_2\,(V(x) - C)}} \qquad (9.17) \]
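The allocation implied by (9.15) to (9.17) can be sketched numerically as follows. The function name and the sample figures are illustrative assumptions; the sketch also assumes $0 < C < V(x)$ so that the square roots are real, and it returns real-valued n and m that would be rounded in practice.

```python
from math import sqrt


def optimal_allocation(v, c, c0, c1, c2, total_cost):
    """Sketch: sample size n and number of interviewers m for a fixed budget.

    v, c        : advance estimates of V(x) and of the covariance component C
    c0, c1, c2  : overhead cost, cost per sampled unit, cost per interviewer
    total_cost  : the available budget C_T
    """
    if not 0 < c < v:
        raise ValueError("the sketch assumes 0 < C < V(x)")
    ratio = sqrt(c1 * c) / sqrt(c2 * (v - c))      # m / n as in (9.17)
    # substitute m = ratio * n into C_T = c0 + c1*n + c2*m and solve for n
    n = (total_cost - c0) / (c1 + c2 * ratio)
    m = ratio * n
    return n, m


n, m = optimal_allocation(v=10.0, c=2.0, c0=0.0, c1=1.0, c2=4.0, total_cost=1000.0)
print(n, m)
```

For these inputs the ratio $m/n$ is $\sqrt{2}/\sqrt{32} = 0.25$, so the budget of 1000 is exhausted by about 500 sampled units and 125 interviewers.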
The actual values of n and m are obtained by substituting the ratio given in
(9.17) in the cost function defined in (9.15). Since the covariance component C
and the variance V depend on the number of interviewers used and the size of the
assignment, the solution obtained should be used for getting an idea of the
magnitudes involved. Thus we have seen the manner in which resources can be
allocated towards the reduction of sampling errors (as provided by n) and
non-sampling errors (interviewer errors). The following theorem gives unbiased
estimates of C, $V(x)$ and $V(\bar x)$ under the sampling scheme described in this
section.
Theorem 9.5 Under the sampling scheme described in this section, unbiased
estimates of C, $V(x)$ and $V(\bar x)$ are given by
\[ \hat C = \frac{s_b^2 - s_w^2}{\bar n}, \qquad \hat V(x) = s_w^2 + \frac{s_b^2 - s_w^2}{\bar n} \]
and
\[ \hat V(\bar x) = \frac{1}{m(m-1)}\sum_{i=1}^{m} (\bar x_i - \bar x)^2 , \]
where
\[ s_b^2 = \frac{1}{m-1}\sum_{i=1}^{m} \bar n\,(\bar x_i - \bar x)^2 \]
and
\[ s_w^2 = \frac{1}{m(\bar n - 1)}\sum_{i=1}^{m}\sum_{j=1}^{\bar n} (x_{ijk} - \bar x_i)^2 . \]
Proof of this theorem is left as an exercise.
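As an illustration of Theorem 9.5, the sketch below computes the between-interviewer and within-interviewer mean squares and the resulting estimates from a rectangular table of responses. The layout of the input (one row per interviewer assignment, each row holding the same number of responses) and the function name are assumptions made for the example.

```python
def interviewer_variance_components(x):
    """Sketch: estimates of C, V(x) and V(x-bar) from m rows of n-bar responses."""
    m = len(x)
    nbar = len(x[0])
    if m < 2 or nbar < 2 or any(len(row) != nbar for row in x):
        raise ValueError("need at least 2 rows of equal length >= 2")
    means = [sum(row) / nbar for row in x]          # interviewer means
    grand = sum(means) / m                          # overall sample mean
    between = sum((xi - grand) ** 2 for xi in means)
    sb2 = nbar * between / (m - 1)                  # between-interviewer mean square
    sw2 = sum((v - means[i]) ** 2
              for i, row in enumerate(x) for v in row) / (m * (nbar - 1))
    c_hat = (sb2 - sw2) / nbar                      # estimate of C
    v_hat = sw2 + c_hat                             # estimate of V(x)
    v_mean_hat = between / (m * (m - 1))            # estimate of V(x-bar)
    return {"sb2": sb2, "sw2": sw2, "C": c_hat, "V": v_hat, "V_mean": v_mean_hat}


print(interviewer_variance_components([[1, 3], [5, 7]]))
```

For the two assignments shown, the interviewer means are 2 and 6, so $s_b^2 = 16$, $s_w^2 = 2$, the estimate of C is 7, the estimate of $V(x)$ is 9 and the estimate of $V(\bar x)$ is 4.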
Exercises
9.1 Extend Warner's method to the case of estimating two proportions.
9.2 Find the mean square error of the estimator $\hat\pi_{ab} = a\hat\theta + b$ where $\hat\theta$ is as
defined in Warner's model.
9.3 Find the minimum mean square error of the estimator suggested in 9.2 and
9.4 Obtain an unbiased estimator for the sensitive proportion under Warner's
method assuming the probability of a respondent being untruthful is L, and
derive the variance of the estimator.
Chapter 10
Recent Developments
10.1 Adaptive Sampling
It has been an untiring endeavour of researchers in sampling theory to seek
estimators with increased precision. In the earlier chapters we have seen a
variety of sampling estimating strategies which use the information of a suitable
auxiliary (size) variable either in the sampling design or in the estimator. There
are very few sampling schemes which use the knowledge of study variable in
the selection stage. Recently Thompson (1990) introduced sampling schemes
which directly use the knowledge of study variable in the selection process. In
this section details of his sampling schemes are presented. Quite often we
encounter surveys where the investigator gathers information regarding the
number of individuals having some specific characteristics. As an example one
can think of a survey involving endangered species in which observers record
data regarding the number of individuals of the species seen or heard at
locations within a study area. In such surveys frequently zero abundance is
encountered. In those cases, whenever substantial abundance is seen, exploration
in nearby locations is likely to yield additional clusters of abundance. These
kinds of patterns are encountered along with others, from whales to insects, from
trees to lichens and so on.
Generally, in sample surveys, survey practitioners decide their sampling strategy
before they actually begin data collection. However, functioning in this
predetermined manner may not be effective always. For example, in
epidemiological studies of contagious diseases, whenever an infected individual
is encountered, it is highly likely that neighbouring individuals will reveal a
higher than expected incidence rate. In such situations, field workers may not
like to stick to their original sampling plan. They will be interested in departing
from the preselected sample plan and add nearby or closely associated units to
the sample. Keeping these points in mind, Thompson (1990) suggested a new
sampling scheme. In the sampling scheme suggested by him namely, "adaptive
sampling", an initial sample of predetermined size is drawn according to a
conventional sampling design. The values of sampled units are scrutinized.
Whenever the observed value of a selected unit satisfies a given condition of
interest, additional units are added to the sample from the neighbourhood of that
unit. The basic idea of the design is illustrated in Figures 10(a) and 10(b).
[Fig. 10(a): Initial sample of 10 units]
[Fig. 10(b): Final sample after neighbouring units are included]
Figure 10(a) shows an initial sample of 10 units. Whenever one or more of the units is
found to satisfy a given condition, the adjacent neighbouring units to the left,
right, top and bottom are added to the sample. When this process is completed,
the sample consists of 45 units, shown in Figure 10(b). It is pertinent to note that
the neighbourhood of a unit may be defined in many ways other than spatial
proximity. The formal definition of adaptive sampling is presented below:
In adaptive sampling an initial set of units is selected by some
probability sampling procedure, and whenever the variable of interest of a selected
unit satisfies the given criterion, additional units in the neighbourhood of that
unit are added to the sample. The criterion for selection of additional
neighbouring units can be framed in several ways, depending on the nature of
study. For example, the criterion for additional selection of neighbouring units
can be taken as an interval or a set C which contains a given range of values with
respect to the variable of interest. The unit i is said to satisfy the condition if
$y_i \in C$. For example, a unit satisfies the condition if the variable of interest $y_i$
is greater than or equal to some constant c. That is, $C = \{x : x \ge c\}$. Here it is
assumed that the initial sample consists of a simple random sample of n units
selected either with or without replacement. To introduce appropriate estimators
under adaptive sampling scheme, we need the following definitions.
Neighbourhood of a unit For any unit $U_i$ in the population, the neighbourhood of
$U_i$ is defined as a collection of units which includes $U_i$, with the property
that if unit $U_j$ is in the neighbourhood of unit $U_i$, then the unit $U_i$ is in the
neighbourhood of unit $U_j$. These neighbourhoods do not depend on the
population values.
Cluster The collection of all units that are observed under the design as a result
of initial selection of unit $U_i$ is termed a cluster. Note that such a collection
may consist of the union of several neighbourhoods.
Network A set of units is known as a network if selection in the initial sample of
any unit in the set will result in inclusion in the final sample of all units in that
network.
It is convenient to consider any unit not satisfying the condition a network of
size one, so that the given y-values may be uniquely partitioned into networks.
Edge unit A population unit is said to be an edge unit if it does not satisfy the
condition but is in the neighbourhood of one that satisfies the condition.
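The partition of a population into networks can be illustrated with a short sketch. It assumes, purely for the example, that the units are the cells of a rectangular grid, that the neighbourhood of a cell consists of the cell itself and the cells to its left, right, top and bottom, and that the condition is $y \ge c$; the function name is hypothetical.

```python
from collections import deque


def find_networks(grid, c):
    """Sketch: partition grid cells into networks under the condition y >= c.

    Cells that satisfy the condition and are linked through neighbouring
    cells that also satisfy it form one network; every other cell is a
    network of size one.
    """
    rows, cols = len(grid), len(grid[0])
    label = [[None] * cols for _ in range(rows)]
    networks = []
    for r in range(rows):
        for s in range(cols):
            if label[r][s] is not None:
                continue
            idx = len(networks)
            label[r][s] = idx
            if grid[r][s] < c:                      # condition fails: singleton network
                networks.append([(r, s)])
                continue
            members, queue = [], deque([(r, s)])
            while queue:                            # breadth-first search over neighbours
                a, b = queue.popleft()
                members.append((a, b))
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    u, v = a + da, b + db
                    if (0 <= u < rows and 0 <= v < cols
                            and label[u][v] is None and grid[u][v] >= c):
                        label[u][v] = idx
                        queue.append((u, v))
            networks.append(members)
    return networks, label


grid = [[0, 0, 0],
        [0, 5, 6],
        [0, 0, 0]]
networks, label = find_networks(grid, c=1)
print(len(networks), sorted(len(net) for net in networks))
```

In this grid the two adjacent cells with values 5 and 6 form one network of size two and the remaining seven cells are networks of size one. The cells adjacent to the pair that do not satisfy the condition are its edge units.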
Notations
$n_1$ : Size of initial sample
$\Psi_k$ : Network which consists of the unit $U_k$
$m_k$ : Number of units in the network to which unit k belongs
$a_i$ : Total number of units in networks of which unit i is an edge unit
Selection procedure and related properties
As mentioned earlier, an initial sample consisting of $n_1$ units is drawn using SRSWR or
SRSWOR. When a selected unit satisfies the condition all units within
its neighbourhood are added to the sample and observed. Note that if any of
the added units also satisfies the condition, the units in its neighbourhood are
included in the sample as well, and so on.
Denote by $m_i$ the number of units in the network to which unit i belongs
and by $a_i$ the total number of units in networks of which unit i is an edge unit.
Note that if unit i satisfies the criterion C then $a_i = 0$, whereas if unit i does not
satisfy the condition then $m_i = 1$. It may be noted that the unit i will be selected
in a given draw if either any one of the $m_i$ units in its network is drawn in the
initial sample or any one of the $a_i$ units for which this is an edge unit is drawn
in the sample. Hence the probability of selection for the unit in a given draw is
\[ p_i = \frac{m_i + a_i}{N} . \]
The number of ways of choosing $n_1$ units out of N is $\binom{N}{n_1}$. Let
$B_i$ be the subset of the population units containing either the units which are in
the network containing the unit i or the units for which unit i is an edge unit.
Clearly $n(B_i) = m_i + a_i$. A sample not containing the unit i can be drawn by
considering the set $S - B_i$ which contains $N - m_i - a_i$ units. Hence the
probability of not including unit i in the sample is
$\binom{N - m_i - a_i}{n_1} \big/ \binom{N}{n_1}$. Therefore the
probability of including the unit i in the sample is
\[ \alpha_i = 1 - \binom{N - m_i - a_i}{n_1} \Big/ \binom{N}{n_1} . \]
When the initial sample is selected by using SRSWR, the probability that the unit i is
included in the sample is $\alpha_i = 1 - (1 - p_i)^{n_1}$. Since some of the $a_i$ may not be
known, the draw by draw probability $p_i$ as well as the inclusion probability
$\alpha_i$ cannot be determined.
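When the network size and the edge count of a unit are both known, the inclusion probability just described is easy to evaluate. The following sketch assumes such knowledge for illustration; the function names are hypothetical, and `math.comb` returns 0 when more units are requested than are available, which covers the case where unit i is certain to be included.

```python
from math import comb


def inclusion_prob_srswor(N, n1, m_i, a_i):
    """Sketch: inclusion probability of unit i, initial sample by SRSWOR."""
    return 1.0 - comb(N - m_i - a_i, n1) / comb(N, n1)


def inclusion_prob_srswr(N, n1, m_i, a_i):
    """Sketch: inclusion probability of unit i, initial sample by SRSWR."""
    p_i = (m_i + a_i) / N                 # draw-by-draw selection probability
    return 1.0 - (1.0 - p_i) ** n1


print(inclusion_prob_srswor(10, 3, 2, 1))   # 1 - 35/120
print(inclusion_prob_srswr(10, 3, 2, 1))    # 1 - 0.7**3
```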
Estimators under Adaptive sampling
Classical estimators such as sample mean are not unbiased under adaptive
sampling. Now we shall describe some estimators suitable for adaptive sampling.
(i) The initial sample mean
If the initial sample in the adaptive design is selected either by SRSWR or
SRSWOR, the mean of the initial observations is unbiased for the population
mean. However, this estimator completely ignores all observations in the sample
other than those initially selected.
(ii) Modified Hansen-Hurwitz estimator
In conventional sampling, the Hansen-Hurwitz estimator, in which each y-value is
divided by the expected number of times the unit is selected, is an unbiased estimator of
the population mean. However, in adaptive sampling, selection probabilities are not
known for every unit in the sample. An unbiased estimator can be formed by
modifying the Hansen-Hurwitz estimator to make use of observations not
satisfying the condition only when they are selected in the initial sample. Let $\Psi_k$
denote the network which consists of the unit $U_k$ and $m_k$ be the number of units
in that network. Let $\bar y^*_k$ be the average of observations in the network that
includes the kth unit of the initial sample. That is
\[ \bar y^*_k = \frac{1}{m_k}\sum_{j \in \Psi_k} y_j . \]
A modified Hansen-Hurwitz estimator can be defined by using $\bar y^*_k$ as
\[ \bar y^*_{HH} = \frac{1}{n_1}\sum_{k=1}^{n_1} \bar y^*_k . \]
Theorem 10.1 The estimator $\bar y^*_{HH} = \frac{1}{n_1}\sum_{k=1}^{n_1} \bar y^*_k$ is unbiased for the population
mean.
Proof Case 1 When SRSWOR is used to select the initial sample.
Let $z_i$ indicate the number of times the ith unit of the population appears in the
estimator. The random variable $z_i$ has a hypergeometric distribution when the
initial sample is selected by SRSWOR, with $E[z_i] = \frac{n_1 m_i}{N}$.
Therefore
\[ \bar y^*_{HH} = \frac{1}{n_1}\sum_{k=1}^{n_1} \frac{1}{m_k}\sum_{j \in \Psi_k} y_j = \frac{1}{n_1}\sum_{k=1}^{N} \frac{z_k\, y_k}{m_k} . \]
Taking expectation on both the sides we get the required result.
Case 2 When SRSWR is used to select the initial sample.
Let $z_i$, as in case 1, indicate the number of times the ith unit of the population
appears in the estimator. It is to be noted that $z_i$ is nothing but the number of
times the network including the unit i is represented in the sample. Note that
\[ P(z_i) = \binom{n_1}{z_i}\left(\frac{m_i}{N}\right)^{z_i}\left(1 - \frac{m_i}{N}\right)^{n_1 - z_i} . \]
Therefore $E[z_i] = \frac{n_1 m_i}{N}$. Expressing the
estimator $\bar y^*_{HH} = \frac{1}{n_1}\sum_{k=1}^{n_1} \bar y^*_k$ in the form considered in case 1 and taking
expectations we get the required result.
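The unbiasedness of the modified Hansen-Hurwitz estimator can be checked by brute force on a toy population. The sketch below assumes five units whose networks are listed explicitly (units 1 and 2 form one network, every other unit is its own network) and averages the estimator over all SRSWOR initial samples of size two; the names and data are illustrative only.

```python
from itertools import combinations


def modified_hh(initial_sample, y, networks):
    """Sketch: mean of the network averages of the initially selected units."""
    network_of = {u: net for net in networks for u in net}
    network_means = []
    for k in initial_sample:
        net = network_of[k]                              # network containing unit k
        network_means.append(sum(y[j] for j in net) / len(net))
    return sum(network_means) / len(network_means)


y = [0, 4, 6, 0, 10]                       # toy population values
networks = [(0,), (1, 2), (3,), (4,)]      # assumed partition into networks
samples = list(combinations(range(len(y)), 2))
average = sum(modified_hh(s, y, networks) for s in samples) / len(samples)
print(average, sum(y) / len(y))
```

Since every one of the ten possible initial samples is equally likely under SRSWOR, the average of the estimator over them is its expectation, and it coincides with the population mean 4.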
The following theorem gives the variance of the estimator
$\bar y^*_{HH} = \frac{1}{n_1}\sum_{k=1}^{n_1} \bar y^*_k$ in
the two cases of simple random sampling.
Theorem 10.2 (a) If the initial sample is selected by SRSWOR, the variance of
$\bar y^*_{HH}$ is given by
\[ \frac{N - n_1}{N n_1}\,\frac{1}{N-1}\sum_{i=1}^{N} [\bar y^*_i - \mu]^2 . \]
(b) If the initial sample is selected by SRSWR, the variance of $\bar y^*_{HH}$ is given by
\[ \frac{N - 1}{N n_1}\,\frac{1}{N-1}\sum_{i=1}^{N} [\bar y^*_i - \mu]^2 , \]
where $\bar y^*_i$ is the average of observations in the network that
includes the ith unit of the population and $\mu = \frac{1}{N}\sum_{i=1}^{N} y_i$.
Proof Taking $\bar y^*_i$ as the variable of interest and applying the results available
under non-adaptive sampling scheme, the desired expressions can be obtained.
(iii) Modified Horvitz-Thompson Estimator
We know that the knowledge of first order inclusion probabilities $\pi_i$ can be used
to construct the Horvitz-Thompson estimator for estimating the population total.
With the adaptive designs, the inclusion probabilities are not known for all units
included in the sample. Hence it cannot be used to estimate the total unbiasedly.
An unbiased estimator can be formed by modifying the Horvitz-Thompson
estimator to make use of observations not satisfying the condition only when
they are included in the initial sample. In this case, the probability that a unit is
used in the estimator can be computed, even though its actual probability of
inclusion in the sample may be unknown.
Define the indicator variable
$J_k = 0$ if the kth unit in the sample does not satisfy the condition and
was not included in the initial sample
$= 1$ otherwise, for $k = 1, 2, \ldots, N$.
The modified estimator is
\[ \bar y^*_{HT} = \frac{1}{N}\sum_{k=1}^{\nu} \frac{y_k J_k}{\alpha_k} \]
where $\nu$ is the number of distinct units in the sample and $\alpha_k$ is the probability
that unit k is included in the estimator. It can
be seen that, whether the unit i satisfies the condition or not, the probability of
including the unit in the estimator, when the initial sample is drawn by SRSWOR, is
\[ \alpha_i = 1 - \binom{N - m_i}{n_1} \Big/ \binom{N}{n_1} . \]
The following theorem gives the variance of the estimator $\bar y^*_{HT}$.
Theorem 10.3 The estimator $\bar y^*_{HT}$ is unbiased for the population mean and its
variance is
\[ \frac{1}{N^2}\sum_{j=1}^{D}\sum_{h=1}^{D} y_j\, y_h \left[\frac{\alpha_{jh} - \alpha_j\alpha_h}{\alpha_j\alpha_h}\right] \]
where D is the number of networks in the population, $y_j$ is the total of the
y-values in network j, $\alpha_j$ is the probability that the initial sample contains
at least one unit of network j (with $\alpha_{jj} = \alpha_j$), and $\alpha_{jh}$ is the probability
that the initial sample contains at least one unit in each of the networks j and h.
Proof of this theorem is straight forward and hence omitted.
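A brute-force check of the modified Horvitz-Thompson estimator is sketched below under stated assumptions: the initial sample is drawn by SRSWOR, the networks of a five-unit toy population are given explicitly, and the probability that a network of size $m$ is intersected by the initial sample is taken to be $1 - \binom{N-m}{n_1}/\binom{N}{n_1}$. The estimator is written as a sum over the distinct networks hit by the initial sample, which is equivalent to summing $y_k/\alpha_k$ over the units used in the estimator.

```python
from itertools import combinations
from math import comb


def modified_ht(initial_sample, y, networks):
    """Sketch: modified Horvitz-Thompson estimate of the population mean."""
    N, n1 = len(y), len(initial_sample)
    network_of = {u: net for net in networks for u in net}
    hit = {network_of[k] for k in initial_sample}        # distinct networks intersected
    total = 0.0
    for net in hit:
        alpha = 1.0 - comb(N - len(net), n1) / comb(N, n1)
        total += sum(y[j] for j in net) / alpha          # network total / its probability
    return total / N


y = [0, 4, 6, 0, 10]                       # toy population values
networks = [(0,), (1, 2), (3,), (4,)]      # assumed partition into networks
samples = list(combinations(range(len(y)), 2))
average = sum(modified_ht(s, y, networks) for s in samples) / len(samples)
print(average, sum(y) / len(y))
```

Here the network of size two is intersected by 7 of the 10 possible samples and each singleton by 4 of them, matching the probabilities 0.7 and 0.4 used as divisors, so the average over all samples equals the population mean 4.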
The results presented above are due to Thompson, and more on adaptive
sampling is available in Thompson (1990, 1991a, 1991b).
10.2 Estimation of Distribution Function
The problem of estimating finite population totals, means and ratios of the
survey variables is widely discussed in sample survey literature. But estimation
of the finite population distribution function has not received that much attention.
Estimation of the distribution function is often an important objective because
sometimes it is necessary to identify subgroups in the population whose values
for particular variables lie below or above the population average, median,
quantiles or any other given value. The estimation of the distribution function of a finite
population has received the attention of Chambers and Dunstan (1986) and
Rao, Kovar and Mantel (1990). In this section their contributions are presented.
Population Distribution Function: Let $\Delta(x)$ be the step function
\[ \Delta(x) = 1 \text{ if } x \ge 0, \qquad = 0 \text{ otherwise.} \]
Let $Y_1, Y_2, \ldots, Y_N$ be the values of the N units in the population with respect to
the variable y. The finite population distribution function of y is defined as
\[ F_N(t) = \frac{1}{N}\sum_{i=1}^{N} \Delta(t - Y_i), \quad t \in R \qquad (10.1) \]
We know that the Horvitz-Thompson estimator for a finite population total Y is
given by
\[ \hat Y_{HT} = \sum_{i \in s} \frac{Y_i}{\pi_i} \qquad (10.2) \]
where the $\pi_i$'s are the inclusion probabilities corresponding to the sampling design
P(s) used to choose the sample. Hence the Horvitz-Thompson estimators
of $\sum_{i=1}^{N} \Delta(t - Y_i)$ and $N = \sum_{i=1}^{N} 1$ are
$\sum_{i \in s} \frac{\Delta(t - Y_i)}{\pi_i}$ and $\sum_{i \in s} \frac{1}{\pi_i}$
respectively. Hence a design-based estimator of $F_N(t)$ is
\[ \hat F_N(t) = \frac{\sum_{i \in s} \Delta(t - Y_i)/\pi_i}{\sum_{i \in s} 1/\pi_i} . \]
Note that $\hat F_N(t)$ reduces to the ordinary sample empirical distribution function
when all the inclusion probabilities are equal, and it is design unbiased under any
sampling design satisfying $\sum_{i \in s} \frac{1}{\pi_i} = N$.
Kuk (1988) compared the performance of the estimator $\hat F_N(t)$ with those of
$\hat F_L(t)$ and $\hat F_R(t)$ where
\[ \hat F_L(t) = \frac{1}{N}\sum_{i \in s} \frac{\Delta(t - Y_i)}{\pi_i} \]
\[ \hat F_R(t) = 1 - \hat S_R(t), \qquad \hat S_R(t) = \frac{1}{N}\sum_{i \in s} \frac{\Delta(Y_i - t)}{\pi_i} . \]
It may be noted that $\hat S_R(t)$ estimates the proportion of units in the population
whose values exceed the given value t. It is interesting to note that $\hat F_L(t)$ is not
necessarily equal to $\hat F_R(t)$. Further it can be easily seen that both $\hat F_L(t)$ and
$\hat F_R(t)$ are unbiased for $F_N(t)$ under all sampling designs for which $\pi_i > 0$ for
every $i = 1, 2, \ldots, N$. Even though both of them are unbiased for $F_N(t)$, they
lack the most important property of being distribution functions. On the other
hand, $\hat F_N(t)$, even though by nature a distribution function, is not unbiased.
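The three estimators of the distribution function compared by Kuk can be sketched as follows. The function names are hypothetical, and the data are made up for the illustration. One assumption is made explicit in the code: the survivor-type estimator counts only values strictly greater than t, that is, it uses $1 - \Delta(t - Y_i)$, so that a sample value equal to t is not counted on both sides.

```python
def step(x):
    """Step function: 1 if x >= 0, else 0."""
    return 1.0 if x >= 0 else 0.0


def f_hat_n(t, y, pi):
    """Ratio-type estimator: weighted count divided by the estimated N."""
    num = sum(step(t - yi) / p for yi, p in zip(y, pi))
    den = sum(1.0 / p for p in pi)
    return num / den


def f_hat_l(t, y, pi, N):
    """Estimator built from the units with values not exceeding t."""
    return sum(step(t - yi) / p for yi, p in zip(y, pi)) / N


def f_hat_r(t, y, pi, N):
    """One minus the estimated proportion of values exceeding t (assumed strict)."""
    s_r = sum((1.0 - step(t - yi)) / p for yi, p in zip(y, pi)) / N
    return 1.0 - s_r


y_sample = [1.0, 2.0, 3.0]
pi_sample = [0.5, 0.5, 0.25]       # inclusion probabilities of the sampled units
print(f_hat_n(2.0, y_sample, pi_sample),
      f_hat_l(2.0, y_sample, pi_sample, N=10),
      f_hat_r(2.0, y_sample, pi_sample, N=10))
```

With these figures the weights $1/\pi_i$ add up to 8 while N is 10, so the three estimates at t = 2 differ (0.5, 0.4 and 0.6 respectively), which illustrates that $\hat F_L(t)$ and $\hat F_R(t)$ need not agree.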
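The following theorem gives the mean square errors of $\hat F_L(t)$ and $\hat F_R(t)$ and the
approximate mean square error of $\hat F_N(t)$. Here $\pi_{ij}$ denotes the joint inclusion
probability of units i and j, with $\pi_{ii} = \pi_i$.
Theorem 10.4 (a) The mean square error of $\hat F_L(t)$ is
\[ \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N} \frac{\pi_{ij} - \pi_i\pi_j}{\pi_i\pi_j}\,\Delta(t - Y_i)\,\Delta(t - Y_j) \]
(b) The mean square error of $\hat F_R(t)$ is
\[ \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N} \frac{\pi_{ij} - \pi_i\pi_j}{\pi_i\pi_j}\,\Delta(Y_i - t)\,\Delta(Y_j - t) \]
(c) The approximate mean square error of $\hat F_N(t)$ is
\[ \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N} \frac{\pi_{ij} - \pi_i\pi_j}{\pi_i\pi_j}\,[\Delta(t - Y_i) - F_N(t)]\,[\Delta(t - Y_j) - F_N(t)] \]
Proof of this theorem is straight forward and hence left as an exercise.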
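Remarks (1) Further it can be seen from (a) and (b) that $\hat F_R(t)$ has a smaller
mean square error than $\hat F_L(t)$ if
\[ \sum_{i=1}^{N} b_i > 2\sum_{i=1}^{N} b_i\,\Delta(Y_i - t) \quad\text{where}\quad b_i = \sum_{j=1}^{N} \frac{\pi_{ij} - \pi_i\pi_j}{\pi_i\pi_j} . \]
(2) The results mentioned above are due to Kuk (1988) and more details can be
obtained from the original paper.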
Rao et al (1990) suggested difference and ratio estimators for the population
distribution function which use the knowledge of auxiliary information. The
design based ratio and difference estimators of $F_N(t)$ are obtained from standard
results for totals or means treating $\Delta(t - Y_i)$ and $\Delta(t - \hat R X_i)$ as y and x variables
respectively, where
\[ \hat R = \frac{\sum_{i \in s} Y_i/\pi_i}{\sum_{i \in s} X_i/\pi_i} \]
is the customary design based estimator of
the population ratio $R = Y/X$. The ratio estimator of the population distribution
function is given by
\[ \hat F_r(t) = \frac{1}{N}\left[\frac{\sum_{i \in s} \Delta(t - Y_i)/\pi_i}{\sum_{i \in s} \Delta(t - \hat R X_i)/\pi_i}\right]\sum_{i=1}^{N} \Delta(t - \hat R X_i) \]
which reduces to $F_N(t)$ when $Y_i$ is proportional to $X_i$ for all i. Hence the
variance will be zero if $Y_i$ is proportional to $X_i$. This suggests that $\hat F_r(t)$
could lead to considerable gains in efficiency over $\hat F_N(t)$, when $Y_i$ is
approximately proportional to $X_i$. The difference estimator with the same
desirable property is given by
\[ \hat F_d(t) = \frac{1}{N}\left[\sum_{i \in s} \frac{\Delta(t - Y_i)}{\pi_i} + \left\{\sum_{i=1}^{N} \Delta(t - \hat R X_i) - \sum_{i \in s} \frac{\Delta(t - \hat R X_i)}{\pi_i}\right\}\right] \]
Using the data of Chambers and Dunstan (1986), the performance of the above
two estimators was studied by Rao et al (1990). They found that the difference
estimator is less biased than the ratio estimator for smaller values of $F_N(t)$.
They also found that $\hat F_d(t)$ is more precise than $\hat F_r(t)$ and $\hat F_N(t)$. The presence
of $\hat R$ in $\hat F_r(t)$ and $\hat F_d(t)$ creates difficulties in evaluating (analytically) the exact
bias and the mean square errors of these estimators. Invoking the results of
Randles (1982), they obtained the approximate design variances of $\hat F_r(t)$ and
$\hat F_d(t)$ which are given below:
\[ V[\hat F_d(t)] = \frac{1}{N^2}\,V[\Delta(t - Y_i) - \Delta(t - R X_i)] \]
\[ V[\hat F_r(t)] = \frac{1}{N^2}\,V\!\left[\Delta(t - Y_i) - \frac{F_y(t)}{F_x(t/R)}\,\Delta(t - R X_i)\right] \]
where
\[ V(Y_i) = \sum_{i=1}^{N}\sum_{\substack{j=1 \\ i<j}}^{N} (\pi_i\pi_j - \pi_{ij})\left[\frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j}\right]^2 . \]
The estimated variances of $\hat F_r(t)$ and $\hat F_d(t)$ are
\[ v[\hat F_d(t)] = \frac{1}{N^2}\,v[\Delta(t - Y_i) - \Delta(t - \hat R X_i)] \]
and
\[ v[\hat F_r(t)] = \frac{1}{N^2}\,v\!\left[\Delta(t - Y_i) - \frac{\hat F_y(t)}{\hat F_x(t/\hat R)}\,\Delta(t - \hat R X_i)\right] \]
where
\[ v(Y_i) = \sum_{i \in s}\sum_{\substack{j \in s \\ i<j}} \frac{\pi_i\pi_j - \pi_{ij}}{\pi_{ij}}\left[\frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j}\right]^2 \]
and $\hat F_x(t)$ and $\hat F_y(t)$ are the
customary estimates of $F_x(t)$ and $F_y(t)$ respectively.
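The ratio and difference estimators of the distribution function can be sketched as below. The sketch assumes that the auxiliary values are available for every population unit and that the sample carries its y-values, x-values and inclusion probabilities; the function name and the toy data are illustrative. The ratio form is undefined when no sampled unit satisfies $\hat R X_i \le t$, and the sketch raises an error in that case.

```python
def step(x):
    """Step function: 1 if x >= 0, else 0."""
    return 1.0 if x >= 0 else 0.0


def ratio_and_difference(t, y_s, x_s, pi_s, x_pop):
    """Sketch: ratio and difference estimators of F_N(t) using auxiliary x."""
    N = len(x_pop)
    r_hat = (sum(y / p for y, p in zip(y_s, pi_s))
             / sum(x / p for x, p in zip(x_s, pi_s)))         # estimated ratio R
    w_y = sum(step(t - y) / p for y, p in zip(y_s, pi_s))      # weighted count, y
    w_x = sum(step(t - r_hat * x) / p for x, p in zip(x_s, pi_s))
    pop_x = sum(step(t - r_hat * x) for x in x_pop)            # known population count
    if w_x == 0:
        raise ValueError("ratio estimator undefined: no sampled R*x at or below t")
    f_ratio = (w_y / w_x) * pop_x / N
    f_diff = (w_y + pop_x - w_x) / N
    return f_ratio, f_diff


x_pop = [1.0, 2.0, 3.0, 4.0]           # auxiliary values, y = 2x in this toy case
y_s, x_s, pi_s = [2.0, 6.0], [1.0, 3.0], [0.5, 0.5]
print(ratio_and_difference(5.0, y_s, x_s, pi_s, x_pop))
```

Because y is exactly proportional to x in this toy population, both estimators return the true value $F_N(5) = 0.5$, in line with the property noted for proportional variables.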
10.3 Randomised Response Method for Quantitative Data
In the last chapter, we have seen several randomised response methods which are
meant for estimating the proportion of units in a population possessing a
sensitive character. In this section, a randomised response method meant for
dealing with quantitative data as developed by Eriksson (1973a,b) is presented.
This problem arises when one is interested in estimating the earnings from illegal
or clandestine activities, expenses towards gambling or consumption of
alcohol and so on. These are some examples where people prefer not to
reveal their exact status. Let $Y_1, Y_2, \ldots, Y_N$ be the unknown values of N units
labelled $i = 1, 2, \ldots, N$ with respect to the sensitive study variable y. To estimate
the population total Y, Eriksson (1973a,b) suggested the following procedure.
A sample of desired size is drawn by using the sampling design P(s). Let
$X_1, X_2, \ldots, X_M$ be predetermined real numbers supposed to cover the
anticipated range of unknown population values $Y_1, Y_2, \ldots, Y_N$. The quantities
$q_j$, $j = 1, 2, \ldots, M$ are suitably chosen non-negative proper fractions and C is a
rightly chosen positive proper fraction such that $C + \sum_{j=1}^{M} q_j = 1$. Each
respondent included in the sample is asked to conduct a random experiment
independently $k\,(> 1)$ times, each to produce random observations
$Z_{ir}$, $r = 1, 2, \ldots, k$, where
\[ Z_{ir} = Y_i \text{ with probability } C \]
\[ \phantom{Z_{ir}} = X_j \text{ with probability } q_j,\; j = 1, 2, \ldots, M . \]
A corresponding device is independently used for every sampled individual so
that the values $Z_{ir}$, $r = 1, 2, \ldots, k$, for $i \in s$ are generated. For theoretical purpose,
the random vectors $Z_r = (Z_{1r}, Z_{2r}, \ldots, Z_{Nr})$, $r = 1, 2, \ldots, k$ are supposed to be
defined for every unit in the population. Let $Z = (Z_1, Z_2, \ldots, Z_k)$. Denote by
$E_R$, $V_R$ and $C_R$ taking expectation, variance and covariance with respect to the
randomisation technique employed to yield the $Z_{ir}$ values. Let
\[ \bar Z_i = \frac{1}{k}\sum_{r=1}^{k} Z_{ir} \quad\text{and}\quad \mu_x = \frac{1}{1 - C}\sum_{j=1}^{M} q_j X_j . \]
Note that
\[ E_R[Z_{ir}] = C Y_i + \sum_{j=1}^{M} q_j X_j = C Y_i + (1 - C)\mu_x . \]
Hence
\[ E_R[\bar Z_i] = C Y_i + (1 - C)\mu_x ;\quad i = 1, 2, \ldots, N . \qquad (10.3) \]
Therefore an estimator of $Y_i$ is given by
\[ \hat Y_i = \frac{\bar Z_i - (1 - C)\mu_x}{C} \qquad (10.4) \]
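The randomising device and the resulting estimator of an individual value can be sketched as follows. The function names, the list of preset values and the probabilities are illustrative assumptions; `random.choices` from the standard library is used to mimic one respondent operating the device k times.

```python
import random


def mu_x(xs, qs, c):
    """Mean of the preset values under the device, scaled by 1 / (1 - C)."""
    return sum(q * x for q, x in zip(qs, xs)) / (1.0 - c)


def randomised_responses(y_i, xs, qs, c, k, rng):
    """Sketch: k independent reports, the true value with probability c,
    otherwise preset value xs[j] with probability qs[j]."""
    return rng.choices([y_i] + list(xs), weights=[c] + list(qs), k=k)


def estimate_y(z, xs, qs, c):
    """Estimator of Y_i from the mean of the k randomised responses."""
    z_bar = sum(z) / len(z)
    return (z_bar - (1.0 - c) * mu_x(xs, qs, c)) / c


xs, qs, c = [0.0, 100.0, 200.0], [0.1, 0.1, 0.1], 0.7
rng = random.Random(2001)                       # fixed seed for reproducibility
z = randomised_responses(50.0, xs, qs, c, k=5, rng=rng)
print(z, estimate_y(z, xs, qs, c))
```

With these settings $\mu_x = 100$, so a respondent whose true value is 50 has expected response $0.7 \times 50 + 0.3 \times 100 = 65$, and feeding a mean response of 65 into the estimator returns 50. Individual estimates from only a few trials can differ considerably from the true value; unbiasedness holds on average over the device.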
A general estimator for the population total and also its variance is given in the
theorem furnished below.
Theorem 10.5 An unbiased estimator for the population total is given by
\[ e(s, Z) = a_s + \sum_{i \in s} b_{si}\,\hat Y_i , \]
where $a_s$ and $b_{si}$ are free of $Y_1, Y_2, \ldots, Y_N$ and
satisfy $\sum_{s} a_s P(s) = 0$ and $\sum_{s \ni i} b_{si} P(s) = 1$, $i = 1, 2, \ldots, N$. The variance of
$e(s, Z)$ is given by
\[ V_P[e(s, Y)] + \frac{1 - C}{kC^2}\sum_{i=1}^{N} \sigma_i^2 \sum_{s \ni i} b_{si}^2\, P(s) . \]
Here $\sum_{s}$ is the sum over all samples and $\sigma_i^2$ is as defined in the proof.
Proof Take $E_P$, $V_P$ and $C_P$ as operators for expectation, variance and
covariance with respect to the design. Assuming commutativity, we write
$E_{PR} = E_P E_R = E_R E_P = E_{RP}$ and $V_{PR} = V_{RP}$ to indicate operators for expectation
and variance respectively, with respect to randomisation followed by sampling,
or vice versa. Taking expectation for the estimator $e(s, Z)$ we get
\[ E_R[e(s, Z)] = a_s + \sum_{i \in s} b_{si}\,E_R[\hat Y_i] = a_s + \sum_{i \in s} b_{si}\,Y_i . \]
Again taking expectation with respect to the sampling design, we note that
\[ E_P E_R[e(s, Z)] = Y . \]
The variance of $e = e(s, Z)$ can be written as
\[ V_{PR}(e) = V_P E_R[e] + E_P V_R[e] \qquad (10.5) \]
Denoting
\[ \sigma_x^2 = \frac{1}{1 - C}\sum_{j=1}^{M} q_j (X_j - \mu_x)^2 \quad\text{and}\quad \sigma_i^2 = \sigma_x^2 + C(Y_i - \mu_x)^2 , \]
we write
\[ V_R[Z_{ir}] = (1 - C)\left[\sigma_x^2 + C(Y_i - \mu_x)^2\right] = (1 - C)\,\sigma_i^2 ,\quad i = 1, 2, \ldots, N;\; r = 1, 2, \ldots, k . \]
Therefore
\[ V_R(e) = \frac{1}{kC^2}\sum_{i \in s} b_{si}^2\, V_R(Z_{ir}) \]
and
\[ V_{PR}(e) = V_P[e(s, Y)] + \frac{1 - C}{kC^2}\sum_{i=1}^{N} \sigma_i^2 \sum_{s \ni i} b_{si}^2\, P(s) \qquad (10.6) \]
Hence the proof.
Note The second term in the right hand side of (10.6) shows how variance
increases (efficiency is lost) when one uses the randomised response method rather
than a direct survey.
Under designs yielding positive first order inclusion probabilities for all
units and positive second order inclusion probabilities for all pairs of units, an
unbiased estimator for the above variance can be found easily, in particular
when $a_s = 0$, as shown below.
When $a_s = 0$, the variance of the estimator with respect to the sampling
design can be written as
\[ V_P[e(s, Y)] = \sum_{i=1}^{N} c_i Y_i^2 + \sum_{i=1}^{N}\sum_{j \ne i} d_{ij}\,Y_i Y_j . \]
Denote by
\[ v(s, Y) = \sum_{i \in s} f_{si}\,Y_i^2 + \sum_{\substack{i, j \in s \\ i \ne j}} g_{sij}\,Y_i Y_j \]
where the $f_{si}$'s and $g_{sij}$'s are
quantities free of Y satisfying $E_P[v(s, Y)] = V_P[e(s, Y)]$. The corresponding
quantity $v(s, Z)$, constructed from the randomised responses,
satisfies
\[ E_{PR}[v(s, Z)] = E_P[E_R\,v(s, Z)] = E_P[v(s, Y)] = V_P[e(s, Y)] . \]
Further if $s_{Zi}^2 = \frac{1}{k - 1}\sum_{r=1}^{k} [Z_{ir} - \bar Z_i]^2$, then $E_R[s_{Zi}^2] = V_R[Z_{ir}]$, $r = 1, 2, \ldots, k$.
Hence
\[ E_R\!\left[\frac{1}{kC^2}\sum_{i \in s} b_{si}^2\, s_{Zi}^2\right] = \frac{1}{kC^2}\sum_{i \in s} b_{si}^2\, V_R(Z_{ir}) = V_R(e) . \]
Taking expectation with respect to the sampling design, we have
\[ E_P E_R\!\left[\frac{1}{kC^2}\sum_{i \in s} b_{si}^2\, s_{Zi}^2\right] = E_P[V_R(e)] . \]
As a result of the above discussion, we have
\[ E_{PR}\!\left[v(s, Z) + \frac{1}{kC^2}\sum_{i \in s} b_{si}^2\, s_{Zi}^2\right] = V_{PR}(e) . \]
Therefore $v(s, Z) + \frac{1}{kC^2}\sum_{i \in s} b_{si}^2\, s_{Zi}^2$ is an unbiased estimator for $V_{PR}(e)$.
For more details about randomised response methods, one can refer to the
monograph by Chaudhuri and Mukerjee (1988).
References
1. Bartholomew, D.J. (1961}: A method of allowing for "notathomes" bias in
sample surveys, App. Stat., 10,5259.
2. Chambers, R.L. and Dunstan, R. ( 1986} : Estimating distribution function
from survey data, Biometrika, 73,3,597604.
3. Cochran, W.G. (1946} : Relative accuracy of systematic and stratified
random samples for a certain class of populations, Ann. Math. Stat., 17, 164
177.
4. Delenius, T. (1955} : The problem of notathomes, Statistisk Tidskrift.,
4,208211.
5. Deming, W.E. (1953} : On a probability mechanism to obtain an economic
balance between the resulting error of response and bias of nonresponse, J.
Amer. Stat. Assoc.,48,743772.
6. Das, A.C. ( 1950} : Twodimensional systematic sampling and the associated
stratified and random sampling, Sankhya. 10,95108.
7. ElBardy, M.A.(1956} : A sampling procedure for mailed questionnaire, J.
Amer. Stat. Assoc.,51,209227.
8. Erikkson, S. (1973a} : Randomised interviews for sensitive questions,Ph.D.
thesis, University of Gothemburg.
9. Erikkson, S. (1973b} : A new model for RR, Internat. Statist. Rev., 1.101
113.
10. Folsom, R.E., Greenberg, B.G.,Horvitz, D.G. and Abernathy, J.R.(1973}:
The two alternate questions RR model for human surveys, J. Amer.,Stat.
Assoc.,68,525530.
11. Hansen, M.H. and Hurwitz, W.N.(1946} : The problem of nonresponse in
sample surveys, J. Amer. Stat. Assoc., 41,517529.
12. Hartley, H.O. and Rao, J.N.K.(1968} : Sampling with unequal probabilities
and without replacement, Ann. Math. Stat. 33,350374.
13. Hartley, H.O. and Ross, A.(1954} : Unbiased ratio type estimators,
Nature,174, 270271.
14. Horvitz, D.G. and Thompson, D.J. (1952} : A generalisation of sampling
without replacement from a finite universe, J. Amer. Stat. Assoc., 47; 663
685.
15. 'Kish, L. and Hess, I. (1959} : A replacement procedure for reducing the
bias of nonresponse, The American Statistician, 13,4,1719.
16. Kuk, A.Y.C. (1988} : Estimation of distribution functions and medians
under sampling with unequal probabilities, Biometrika, 75,1,97103.
17. Kunte, S. (1978} : A note on circular systematic sampling design, Sanlchya
c. 40,7273.
18. Madow, W.G. (1953}: an the theory of systematic sampling lll, Ann. Math.
Stat., 24,101106.
180 References
19. Midzuno (1952) : On the sampling design with probability proportional to
sum of sizes, Ann. Inst. Stat. Math., 3, 99-107.
20. Murthy, M.N. (1957) : Ordered and unordered estimates in sampling
without replacement, Sankhya, 18, 379-390.
21. Murthy, M.N. (1964) : Product methods of estimation, Sankhya, 26, A, 69-74.
22. Olkin, I. (1958) : Multivariate ratio estimation for finite populations,
Biometrika, 45, 154-165.
23. Politz, A.N. and Simmons, W.R. (1949, 1950) : An attempt to get the "not at
home" into the sample without callbacks, J. Amer. Stat. Assoc., 44, 9-31 and
45, 136-137.
24. Quenouille, M.H. (1949) : Problems in plane sampling, Ann. Math. Stat., 20,
355-375.
25. Quenouille, M.H. (1956) : Notes on bias in estimation, Biometrika, 43,
353-360.
26. Rao, J.N.K., Hartley, H.O. and Cochran, W.G. (1962) : A simple procedure
of unequal probability sampling without replacement, Jour. Roy. Stat. Soc.,
B24, 482-491.
27. Rao, J.N.K., Kovar, J.G. and Mantel, H.J. (1990) : On estimating distribution
functions and quantiles from survey data using auxiliary information,
Biometrika, 77, 2, 365-375.
28. Royall, R.M. (1970) : On finite population sampling theory under certain
linear regression models, Biometrika, 57, 377-387.
29. Sethi, V.K. (1965) : On optimum pairing of units, Sankhya B, 27, 315-320.
30. Singh, D., Jindal, K.K. and Garg, J.N. (1968) : On modified systematic
sampling, Biometrika, 55, 541-546.
31. Singh, D. and Singh, P. (1977) : New systematic sampling, Jour. Stat.
Plann. Inference, 1, 163-179.
32. Srinath, K.P. (1971) : Multiphase sampling in nonresponse problems, J.
Amer. Stat. Assoc., 66, 583-586.
33. Shrivastava, S.K. (1967) : An estimator using auxiliary information,
Calcutta Statist. Assoc. Bull., 16, 121-132.
34. Thompson, S.K. (1990) : Adaptive cluster sampling, J. Amer. Stat. Assoc.,
85, 1050-1059.
35. Thompson, S.K. (1991a) : Stratified adaptive cluster sampling, Biometrika,
78, 389-397.
36. Thompson, S.K. (1991b) : Adaptive cluster sampling: designs with primary
and secondary units, Biometrics, 47, 1103-1105.
37. Warner, S.L. (1965) : Randomised response: A survey technique for
eliminating evasive answer bias, J. Amer. Stat. Assoc., 60, 63-69.
38. Yates, F. (1948) : Systematic sampling, Phil. Trans. Roy. Soc., London, A
241, 345-371.
Books
1. Chaudhuri, A. and Mukerjee, R. (1988) : Randomised response theory and
technique, Marcel Dekker Inc.
2. Cochran, W.G. (1977): Sampling techniques, Wiley Eastern Limited.
3. Des Raj and Chandok, P. (1998) : Sampling Theory, Narosa Publishing
House, New Delhi.
4. Hajek, J. (1981) : Sampling from a finite population, Marcel Dekker Inc.
5. Konijn, H.S. (1973) : Statistical Theory of sample survey design and
analysis, North-Holland Publishing Company.
6. Murthy, M.N. (1967) : Sampling Theory and methods, Statistical Publishing
Society, Calcutta.
7. Sukhatme, P.V., Sukhatme, B.V., Sukhatme, S. and Asok, C. (1984) : Sampling
theory of surveys with applications, Iowa State University Press and Indian
Society of Agricultural Statistics, New Delhi.
Index

adaptive sampling, 165-171
almost unbiased ratio type estimator, 104,105
autocorrelated populations, 39,87
auxiliary information, 97-121
balanced systematic sampling, 35-37
Bartholomew, 154
Bellhouse, 47
bias, 2
bound for bias, 105
centered systematic sampling, 34
Chambers, 171,173
Chaudhuri, 161
circular systematic sampling, 43,44
cluster sampling, 140
Cochran, 63,88
cost optimum allocation, 82
cumulative total method, 55
Dalenius, 154
Das, 47
Deming's model, 154
Des Raj ordered estimator, 60
difference estimator, 124-126
distribution function, 171
Dunstan, 171,173
edge unit, 167
El-Badry, 152
entropy, 3
Eriksson, 174
finite population, 1
Folsom's model, 160
Garg, 38
Gauss-Markov, 132
Hansen and Hurwitz, 152
Hartley, 63,70,102,106
Hess, 154
Horvitz-Thompson, 3,6,8,63
implied estimator, 129
inclusion indicators, 4
inclusion probabilities, 4,5
incomplete surveys, 152
Jindal, 38
Kish, 154
Kovar, 171,173
Kuk, 172
Kunte, 44
Lagrangian multipliers, 81,93
Lahiri, 43,56
linear systematic sampling, 29-32
Madow, 34
Mantel, 171,173
mean squared error, 1,3
Midzuno, 67-70
model unbiasedness, 131
modified Hansen-Hurwitz estimator, 168
modified Horvitz-Thompson estimator, 170
modified systematic sampling, 38,39
multiauxiliary information, 113
multistage sampling, 140-150
Murthy's unordered estimator, 62
neighbourhood, 167
network, 167
Neyman optimum allocation, 81
nonsampling errors, 152-164
observational errors, 161
Olkin, 113
parameter, 1,3
Politz-Simmons technique, 156
population size, 1
pps systematic scheme, 70
ppswor, 60
ppswr, 55
probability sampling, 1
product estimation, 106-108
proportional allocation, 79
Quenouille, 47
random group method, 63
randomised response, 158-161,174
Rao, 4,63,70,102,171,173
ratio estimator, 97-105
regression estimation, 122-124
Ross, 106
Royall, 137
sample size allocation, 79-85
sample, 1
sampling design, 2,3,4,5
sampling in two dimensions, 44,45
Sarndal, 16
Sethi, 35
Simmons, 159
simple random sampling, 10-28
Singh, 38
Srinath, 154
srswr, 25
statistic, 2
stratified sampling, 76-96,115
superpopulation model, 129
systematic sampling, 29-54
Thompson, 165,171
two phase sampling, 108-112
two stage sampling, 140-150
unbiased ratio type estimators, 100
unbiasedness, 2
unequal probability sampling, 55
Warner, 158
Yates, 33,73
Yates-Grundy, 7