
Lecture Notes - Finding the MVU Estimator

Date: June 25, 2014


Detection and Estimation Theory
Instructor: Vijay A
1 Introduction
In class we have seen how the CRLB can be a useful tool for verifying whether an estimator is minimum variance unbiased. Although the CRLB is a good verification tool, it does not provide a step-by-step (or algorithmic) procedure for determining the MVU estimator. Here we shall develop a procedure that tells us whether an MVU estimator exists; if it exists, we will see that it is unique and, under some additional restrictions, easily found. In Figure 1 we can see the various sets of estimators and the set of unbiased estimators. Within the set of unbiased estimators the MVU estimator is represented as a unique point.
[Figure 1: The MVU estimator representation. The set of all estimators of $\theta$ contains the subset of unbiased estimators, within which the MVU estimator $\hat{\theta}_{MVU}(y)$ appears as a unique point.]
1.1 A Look back - some examples
For the estimation of a non-random parameter we stressed two essential requirements:
- The estimator should be unbiased, i.e. $E(\hat{\theta}(y)) = \theta$.
- Among the class of unbiased estimators, the estimator having minimum variance is preferred.
To see why these conditions are considered important, we look into the case where the cost of an estimate is the squared error. Thus,
$$C(\theta, \hat{\theta}(y)) = |\theta - \hat{\theta}(y)|^2$$
where $y$ is the observation vector (or scalar). The average cost (i.e. the mean square error) of the estimator is
$$E_{p(y;\theta)}\big[C(\theta, \hat{\theta}(y))\big] = \text{Variance} + \text{Bias}^2,$$
where the bias is $|\theta - E(\hat{\theta})|$. We note that, independent of the variance, the cost can be reduced by making the bias zero. This is the reason for our interest in the class of unbiased estimators. The minimum variance criterion enters because the cost also depends on the variance of the estimator.
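For reference, the decomposition follows by adding and subtracting $E(\hat{\theta}(y))$ inside the square; the cross term vanishes since $E[\hat{\theta}(y) - E(\hat{\theta}(y))] = 0$:
$$E_{p(y;\theta)}\big[(\theta - \hat{\theta}(y))^2\big] = E\big[(\hat{\theta}(y) - E(\hat{\theta}(y)))^2\big] + \big(\theta - E(\hat{\theta}(y))\big)^2 = \text{Variance} + \text{Bias}^2.$$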
It is not always possible to determine an unbiased estimator, and even when one exists it may not be sensible. The reasons are explained through the examples below:
Ex1: (Non-existence) The observation $Y \sim \text{Binomial}(n, \theta)$ with $n$ known. We are required to obtain an estimate of $\alpha = \sin\theta$. The mean of an estimate $\hat{\alpha}(y)$ is
$$E(\hat{\alpha}(Y)) = \sum_{k=0}^{n} \hat{\alpha}(k) \binom{n}{k} \theta^{k} (1 - \theta)^{n-k} = \sin\theta.$$
Since the mean of the estimator is a polynomial in $\theta$ with only a finite number of terms (degree at most $n$), it cannot equal $\sin\theta$ for all $\theta$; hence no unbiased estimator for $\alpha$ exists.
Ex2: (A silly estimator) Consider the observation $Y \sim \text{Poisson}(\lambda)$. We are required to find an unbiased estimator for $\alpha = e^{-2\lambda}$. Thus,
$$\sum_{k=0}^{\infty} \hat{\alpha}(k) \frac{\lambda^{k} e^{-\lambda}}{k!} = e^{-2\lambda} = \sum_{k=0}^{\infty} (-1)^{k} \frac{\lambda^{k} e^{-\lambda}}{k!}.$$
The estimator $\hat{\alpha}(y) = (-1)^{y}$ is therefore unbiased. But for $\lambda > 0$ the parameter $\alpha \in (0, 1)$, while the estimate only takes the values $\pm 1$; hence the estimator $\hat{\alpha}(y)$ is silly.
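As a quick sanity check (a minimal simulation sketch, not part of the original notes; the rate and sample count below are arbitrary choices), one can verify numerically that $\hat{\alpha}(Y) = (-1)^{Y}$ averages to $e^{-2\lambda}$ even though every individual estimate is $\pm 1$:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 1.5                          # arbitrary rate lambda > 0
y = rng.poisson(lam, size=1_000_000)

alpha_hat = (-1.0) ** y            # the "silly" unbiased estimator; each value is +1 or -1
print(alpha_hat.mean())            # close to exp(-2 * lam)
print(np.exp(-2 * lam))            # true parameter alpha = e^{-2 lambda}, which lies in (0, 1)
```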
Ex3: (A non-trivial case) Consider $n$ iid observations $Y_1, Y_2, \ldots, Y_n$ distributed with the density function $p(y; \theta) = e^{-(y - \theta)}$, $y \geq \theta$. We may find two unbiased estimators for the parameter $\theta$:
$$\hat{\theta}_1(y) = y_{(1)} - \frac{1}{n}; \qquad \hat{\theta}_2(y) = \bar{y} - 1,$$
where $y_{(1)} = \min\{y_1, y_2, \ldots, y_n\}$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$. The corresponding variances are $\frac{1}{n^2}$ and $\frac{1}{n}$ respectively. Thus the MVU estimator is $\hat{\theta}_1(y)$. (It is surprising to observe that the one-dimensional sufficient statistic $y_{(1)}$ happens to appear in the MVU estimator.)
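The variance claim can be checked by simulation (a sketch, not part of the original notes; the true $\theta$, sample size, and number of trials are arbitrary choices, and $Y_i = \theta + E_i$ with $E_i$ standard exponential reproduces the shifted density above):

```python
import numpy as np

rng = np.random.default_rng(1)
theta, n, trials = 2.0, 10, 200_000                  # arbitrary parameter, sample size, runs

y = theta + rng.exponential(1.0, size=(trials, n))   # p(y; theta) = exp(-(y - theta)), y >= theta

theta1 = y.min(axis=1) - 1.0 / n                     # y_(1) - 1/n
theta2 = y.mean(axis=1) - 1.0                        # ybar - 1

print(theta1.mean(), theta2.mean())                  # both close to theta (unbiased)
print(theta1.var(), 1.0 / n**2)                      # variance close to 1/n^2
print(theta2.var(), 1.0 / n)                         # variance close to 1/n
```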
2 The Factorization Theorem
The factorization theorem is a good tool for determining a sufficient statistic for the parameter $\theta$. We shall see further that, by determining the sufficient statistic, the variance of an estimator can be reduced using the Rao-Blackwell theorem.
Theorem 2.1 Suppose the observation $X = (X_1, X_2, \ldots, X_n)$ has a joint density $p(x; \theta)$, $\theta \in \Theta$. A statistic $T = T(x)$ is a sufficient statistic for $\theta$ if and only if
$$p(x; \theta) = g(T(x), \theta)\, h(x).$$
Proof Observe that the proof has two parts: the if part and the only if part. Note that
$$p(T(x) = T \mid x; \theta) = I_{\{T(x) = T\}} = \begin{cases} 1 & \text{if } T(x) = T \\ 0 & \text{if } T(x) \neq T \end{cases} \qquad (1)$$
(the if part) Given $p(x; \theta) = g(T(x), \theta)\, h(x)$, we determine the following to check whether $T(x)$ is a sufficient statistic:
$$p(x \mid T(x) = T; \theta) = \frac{p(x, T(x) = T; \theta)}{p(T(x) = T; \theta)} = \frac{I_{\{T(x) = T\}}\, g(T(x), \theta)\, h(x)}{\sum_{x': T(x') = T} g(T(x'), \theta)\, h(x')} = \frac{I_{\{T(x) = T\}}\, h(x)}{\sum_{x': T(x') = T} h(x')} \qquad (2)$$
where the last equality holds because $g(T(x'), \theta) = g(T, \theta)$ for every $x'$ in the sum (and for $x$ itself whenever the indicator is nonzero), so it cancels. Since the result does not depend on $\theta$, $T(x)$ is a sufficient statistic.
(the only if part, [discrete distributions]) Suppose $T(x)$ is a sufficient statistic for $\theta$; then $p(x \mid T(x) = T; \theta)$ is independent of $\theta$. The joint density function of the observations $X_1, X_2, \ldots, X_n$ is as follows:
$$p(x; \theta) = \sum_{T \in \mathcal{R}} p(x, T(x) = T; \theta) = \sum_{T \in \mathcal{R}} p(x \mid T(x) = T; \theta)\, p(T(x) = T; \theta)$$
where $\mathcal{R}$ is the range of $T$. Only the term with $T = T(x)$ is nonzero, so
$$p(x; \theta) = p(x \mid T(x) = T(x))\, p(T(x) = T(x); \theta) = h(x)\, g(T(x), \theta) \qquad (3)$$
with $h(x) = p(x \mid T(x) = T(x))$, which is free of $\theta$ by sufficiency, and $g(T(x), \theta) = p(T(x) = T(x); \theta)$.
QED
Example Consider the observations to be iid Gaussian with mean $\theta$ and known variance $\sigma^2$. Then the joint density can be written in the factorized form as
$$p(x; \theta) = \frac{1}{(2\pi\sigma^2)^{n/2}} \prod_{i=1}^{n} \exp\left(-\frac{(x_i - \theta)^2}{2\sigma^2}\right) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left(-\frac{\sum_{i=1}^{n} x_i^2 - 2\theta\sum_{i=1}^{n} x_i + n\theta^2}{2\sigma^2}\right)$$
$$= \left[\frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left(-\frac{-2\theta\sum_{i=1}^{n} x_i + n\theta^2}{2\sigma^2}\right)\right] \left[\exp\left(-\frac{\sum_{i=1}^{n} x_i^2}{2\sigma^2}\right)\right] \qquad (4)$$
$$= g(T(x), \theta)\, h(x). \qquad (5)$$
The sufficient statistic $T(x) = \sum_{i=1}^{n} x_i$ is hence separated as a function from the remaining part of $p(x; \theta)$.
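One way to see sufficiency concretely (a small numerical sketch, not from the original notes; it assumes the variance $\sigma^2$ is known and the two samples below are arbitrary): two data sets with the same $\sum_i x_i$ have likelihoods whose ratio involves only $h(x)$, so the difference of their log-likelihoods is constant in $\theta$.

```python
import numpy as np

def log_lik(x, theta, sigma2=1.0):
    # log of the joint Gaussian density p(x; theta) with known variance sigma2
    return -0.5 * len(x) * np.log(2 * np.pi * sigma2) - np.sum((x - theta) ** 2) / (2 * sigma2)

# two different samples with the same sufficient statistic T(x) = sum(x)
x1 = np.array([0.0, 1.0, 2.0, 3.0])
x2 = np.array([1.5, 1.5, 1.5, 1.5])
assert x1.sum() == x2.sum()

for theta in [-1.0, 0.0, 2.0, 5.0]:
    # the difference log h(x1) - log h(x2) does not change with theta
    print(theta, log_lik(x1, theta) - log_lik(x2, theta))
```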
Example We are required to find a sufficient statistic $T(x)$ for the estimation of the parameter $\theta$ where the observations $X = (X_1, X_2, \ldots, X_n)$ are iid Uniform random variables on the interval $[0, \theta]$. Then the joint density of $X_1^n$ is
$$p(x; \theta) = \underbrace{\frac{1}{\theta^{n}}\, I_{\{\max_k x_k \leq \theta\}}}_{g(T(x),\,\theta)} \underbrace{I_{\{\min_k x_k \geq 0\}}}_{h(x)} \qquad (6)$$
Thus, $T(x) = \max_k x_k$ is a sufficient statistic for this problem (using the if part of Theorem 2.1).
3 Rao-Blackwell Theorem
If the factorization theorem gave a mechanism to find the sufficient statistic of $\theta$, the Rao-Blackwell theorem gives us a means to find the minimum variance unbiased estimator under some additional conditions. The theorem is stated below:
Theorem 3.1 Suppose $\hat{g}(y)$ is an unbiased estimator for $g(\theta)$, and $T(y)$ is a sufficient statistic. Then
$$\tilde{g}(T(Y)) \triangleq E_{\theta}[\hat{g}(Y) \mid T(Y)]$$
is an unbiased estimator of $g(\theta)$ and satisfies
$$\mathrm{var}_{\theta}[\tilde{g}(T(y))] \leq \mathrm{var}_{\theta}[\hat{g}(y)].$$
The equality is satisfied if and only if $P(\tilde{g}(T(y)) = \hat{g}(y)) = 1$.
Proof Observe that the estimator $\tilde{g}(T(Y))$ is an unbiased estimator since
$$E_{\theta}(\tilde{g}(T(Y))) = E_{\theta}[E_{\theta}(\hat{g}(Y) \mid T(Y))] = E_{\theta}[\hat{g}(Y)] = g(\theta).$$
If $X$ is a random variable, $\mathrm{var}_{\theta}(X) \geq 0$ and hence $E(X^2) \geq \{E(X)\}^2$. Applying this to $X = (\hat{g}(Y) - g(\theta)) \mid T(Y)$,
$$E\big[(\hat{g}(Y) - g(\theta))^2 \mid T(Y)\big] \geq \{E[(\hat{g}(Y) - g(\theta)) \mid T(Y)]\}^2 = \{E[\hat{g}(Y) \mid T(Y)] - g(\theta)\}^2 = \{\tilde{g}(T(Y)) - g(\theta)\}^2 \qquad (7)$$
Taking expectations on both sides,
$$E\big[E\big[(\hat{g}(Y) - g(\theta))^2 \mid T(Y)\big]\big] \geq E\big[(\tilde{g}(T(Y)) - g(\theta))^2\big]$$
$$E\big[(\hat{g}(Y) - g(\theta))^2\big] \geq E\big[(\tilde{g}(T(Y)) - g(\theta))^2\big]$$
$$\mathrm{var}_{\theta}(\hat{g}(Y)) \geq \mathrm{var}_{\theta}(\tilde{g}(T(Y))). \qquad (8)$$
The equality of the variances corresponds to the case
$$\mathrm{var}_{\theta}(X) = 0 \iff X = c \text{ (a constant) with probability 1.}$$
Then $E(X) = X = c$ and hence $\tilde{g}(T(Y)) = \hat{g}(Y)$ with probability 1.
Note: The notation $\mathrm{var}_{\theta}$ represents the variance computed with respect to the distribution $p(y; \theta)$, i.e.
$$\mathrm{var}_{\theta}(f(Y)) = E_{\theta}\big[(f(Y) - E_{\theta}f(Y))^2\big] = \int_{y} (f(y) - E_{\theta}f(Y))^2\, dp(y; \theta).$$
QED
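As a concrete illustration of the theorem (a simulation sketch, not an example from the notes; the Bernoulli model, parameter, and sample size are assumptions made for the demonstration): starting from the crude unbiased estimator $\hat{g}(Y) = Y_1$ of $p$, conditioning on the sufficient statistic $T(Y) = \sum_i Y_i$ gives $\tilde{g}(T) = E[Y_1 \mid T] = T/n$, and the variance drops from $p(1-p)$ to $p(1-p)/n$.

```python
import numpy as np

rng = np.random.default_rng(2)
p, n, trials = 0.3, 20, 200_000            # arbitrary Bernoulli parameter, sample size, runs

y = rng.binomial(1, p, size=(trials, n))

g_crude = y[:, 0].astype(float)            # unbiased but crude: uses only the first observation
g_rb = y.sum(axis=1) / n                   # E[Y_1 | T] = T/n, the Rao-Blackwellized estimator

print(g_crude.mean(), g_rb.mean(), p)      # both unbiased for p
print(g_crude.var(), p * (1 - p))          # variance of the crude estimator
print(g_rb.var(), p * (1 - p) / n)         # much smaller variance after Rao-Blackwellization
```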
The Rao-Blackwell theorem suggests a procedure to find the minimum variance unbiased estimator for $\theta$. But consider a case where there are two sufficient statistics, $T(Y)$ and $S(Y)$. Then the two estimators suggested by Rao-Blackwellization are $\tilde{g}_T = E_{\theta}[\hat{g}(Y) \mid T(Y)]$ and $\tilde{g}_S = E_{\theta}[\hat{g}(Y) \mid S(Y)]$. We ask the following question:
$$\mathrm{var}_{\theta}(\tilde{g}_T) \;\gtrless\; \mathrm{var}_{\theta}(\tilde{g}_S)\,?$$
In general this cannot be answered without some additional information. Intuitively, the most concise sufficient statistic should be the winner; i.e. if $S = h(T)$ then $\mathrm{var}_{\theta}(\tilde{g}_T) \geq \mathrm{var}_{\theta}(\tilde{g}_S)$. In other words, the minimal sufficient statistic for $\theta$ will minimize the variance. To argue the existence of a minimal sufficient statistic we use the notion of completeness, as explained in the section below.
4 Completeness
The additional property of completeness of the sufficient statistic will ensure that there is a termination to the algorithm suggested by the Rao-Blackwell theorem. We define completeness as follows.
Definition A family of distributions $\{p(y; \theta)\}$ is said to be complete if, for any measurable function $g(\cdot)$, $E_{\theta}[g(Y)] = 0$ for all $\theta$ implies that $P(g(Y) = 0) = 1$ for all $\theta$.
Example Consider a binary distribution (also known as the Bernoulli distribution) with parameter $\theta \in [0, 1]$. The density is given as
$$p(x; \theta) = (1 - \theta)\,\delta(x) + \theta\,\delta(x - 1)$$
where $\delta(x) = 1$ for $x = 0$ and $\delta(x) = 0$ for $x = 1$. We take a measurable function $g(\cdot)$ such that $E_{\theta}(g(X)) = 0$ for all $\theta$. Then
$$E_{\theta}(g(X)) = g(0)(1 - \theta) + g(1)\,\theta$$
$$0 = g(0) + (g(1) - g(0))\,\theta,$$
where the right-hand side is a polynomial in $\theta$ that must vanish for every $\theta \in [0, 1]$. Thus $g(0) = g(1) = 0$ and hence the family $p(x; \theta)$ is complete.
Definition A sufficient statistic $T(y)$ is a complete sufficient statistic if, for any measurable function $g(\cdot)$, $E_{\theta}[g(T(y))] = 0$ for all $\theta$ implies $P(g(T(y)) = 0) = 1$ for all $\theta$.
Example Consider the sufficient statistic $T(y) = \sum_i y_i$ for the mean estimation problem with iid Bernoulli($\theta$) observations. We can see that $T(y)$ has a Binomial($n, \theta$) distribution given by
$$P[T(Y) = t] = \binom{n}{t} \theta^{t} (1 - \theta)^{n - t}; \qquad 0 \leq t \leq n.$$
Now,
$$E_{\theta}[g(T(Y))] = \sum_{t=0}^{n} g(t) \binom{n}{t} \theta^{t} (1 - \theta)^{n - t}$$
$$0 = \sum_{t=0}^{n} g(t) \binom{n}{t} \left(\frac{\theta}{1 - \theta}\right)^{t} (1 - \theta)^{n}$$
$$0 = (1 - \theta)^{n} \sum_{t=0}^{n} g(t) \binom{n}{t} \left(\frac{\theta}{1 - \theta}\right)^{t}.$$
Let $a_t = g(t)\binom{n}{t}$ and $x = \frac{\theta}{1 - \theta}$; then the above equation becomes
$$0 = \sum_{t=0}^{n} a_t x^{t}$$
for every $x > 0$, which implies $a_t = 0$ for all $t$. Therefore $g(t) = 0$ for $t = 0, 1, 2, \ldots, n$, and $T(y) = \sum_i y_i$ is a complete sufficient statistic.
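The polynomial argument can be reproduced symbolically for a small $n$ (a sketch using sympy, not part of the original notes; $n = 3$ is an arbitrary choice): requiring $E_{\theta}[g(T)]$ to vanish identically in $\theta$ forces every $g(t)$ to be zero.

```python
import sympy as sp

n = 3
theta = sp.symbols('theta')
g = sp.symbols('g0:%d' % (n + 1))      # unknowns g(0), ..., g(n)

# E[g(T)] written out as a polynomial in theta
expectation = sum(g[t] * sp.binomial(n, t) * theta**t * (1 - theta)**(n - t)
                  for t in range(n + 1))

# setting every coefficient of the polynomial to zero forces g(t) = 0 for all t
coeffs = sp.Poly(sp.expand(expectation), theta).all_coeffs()
print(sp.solve(coeffs, g))             # {g0: 0, g1: 0, g2: 0, g3: 0}
```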
Remark: This lecture note should have given you a holistic idea of the sufficient statistics approach to finding the MVU estimator. It is advised that you work through some problems on this topic in the reference books (Steven Kay, Ch. 5). Also feel free to point out mistakes (grammatical as well as conceptual) by mailing them to anvijay@live.com.