391J
Statistics for Engineers and Scientists Mar 16
MIT, Spring 2006 Handout #9
Solution 4
---------------------------------------------------------------
% Multi-antenna transmitter and receiver system: MMSE estimator
% Number of transmit and receive antennas:
nT = 2; nR = 2;
% possible data symbols:
A = [-1;1];
---------------------------------------------------------------
Problem 2: Let $X_1, X_2, \ldots, X_n$ be independent random variables, each with density
$$f(x \mid \theta) = \begin{cases} \dfrac{x}{\theta}\, e^{-x^2/(2\theta)}, & x > 0; \\ 0, & x \le 0, \end{cases}$$
Solution
Therefore, the maximum likelihood estimator (MLE) of θ is
$$t_n(X_1, X_2, \ldots, X_n) = \frac{\sum_{i=1}^{n} X_i^2}{2n}.$$
(b) The MATLAB command raylrnd(√θ, 1, n) generates n i.i.d. samples from the Rayleigh density with parameter θ. For the data set used to plot Fig. 1, the maximum likelihood estimate approaches θ = 2 when the sample size is large. From the figure, it seems that the estimator is asymptotically unbiased.
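The experiment described in part (b) can be sketched in Python as follows. This is not the handout's MATLAB code: it assumes the density $(x/\theta)e^{-x^2/(2\theta)}$ from the problem statement and draws samples by inverting its CDF instead of calling raylrnd. The seed and the sample sizes are arbitrary illustrative choices.

```python
import math
import random


def rayleigh_samples(theta, n, rng):
    """Draw n samples with density (x/theta) * exp(-x^2 / (2*theta)), x > 0.

    Inverse-CDF method, using F(x) = 1 - exp(-x^2 / (2*theta)).
    """
    return [math.sqrt(-2.0 * theta * math.log(1.0 - rng.random()))
            for _ in range(n)]


def rayleigh_mle(xs):
    """MLE of theta: sum of squared samples divided by 2n."""
    return sum(x * x for x in xs) / (2.0 * len(xs))


rng = random.Random(0)   # fixed seed (arbitrary) so the run is repeatable
theta = 2.0              # true parameter, as in Fig. 1
for n in (10, 100, 1000, 10000):
    print(n, rayleigh_mle(rayleigh_samples(theta, n, rng)))
```

With growing n the printed estimates should settle near θ = 2, which is the behaviour the figure illustrates.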
[Figure 1: MLE of θ versus sample size (0 to 1000).]
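Solution First, we derive the MLE of θ. The likelihood and log-likelihood functions are
$$L(\theta) = \left(1-\frac{1}{\theta}\right)^{X_1-1}\frac{1}{\theta} \times \left(1-\frac{1}{\theta}\right)^{X_2-1}\frac{1}{\theta} \times \cdots \times \left(1-\frac{1}{\theta}\right)^{X_n-1}\frac{1}{\theta} = \left(1-\frac{1}{\theta}\right)^{\left(\sum_{i=1}^{n} X_i\right)-n}\frac{1}{\theta^{n}}$$
and
$$\ln L(\theta) = \left[\left(\sum_{i=1}^{n} X_i\right) - n\right]\ln\left(1-\frac{1}{\theta}\right) - n\ln\theta.$$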
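The first and second derivatives of the log-likelihood function with respect to θ are
$$\frac{\partial}{\partial\theta}\ln L(\theta) = \frac{\left(\sum_{i=1}^{n} X_i\right) - n}{\theta^2-\theta} - \frac{n}{\theta} \qquad (1)$$
$$\frac{\partial^2}{\partial\theta^2}\ln L(\theta) = -\frac{\left[\left(\sum_{i=1}^{n} X_i\right) - n\right](2\theta-1)}{(\theta^2-\theta)^2} + \frac{n}{\theta^2}. \qquad (2)$$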
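Setting the first derivative to zero and solving for an unknown θ ≥ 1 yields a stationary point
$$\theta^* = \frac{\sum_{i=1}^{n} X_i}{n},$$
which maximizes the log-likelihood function:
$$\left.\frac{\partial^2}{\partial\theta^2}\ln L(\theta)\right|_{\theta=\theta^*} = -\frac{n}{\theta^*(\theta^*-1)} < 0.$$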
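As a quick numerical sanity check of this result, the following Python sketch evaluates the log-likelihood on a grid and compares the grid maximizer with the sample mean. The data values and the grid spacing are made up for illustration and do not come from the handout.

```python
import math


def geom_loglik(theta, xs):
    """Log-likelihood of theta > 1 for geometric data with success probability 1/theta."""
    n = len(xs)
    return (sum(xs) - n) * math.log(1.0 - 1.0 / theta) - n * math.log(theta)


xs = [1, 4, 2, 7, 3, 1, 5, 1]        # made-up illustrative data
theta_star = sum(xs) / len(xs)       # closed-form MLE: the sample mean

# Grid search over theta in [1.01, 11.00] with step 0.01 (arbitrary choices).
grid = [1.01 + 0.01 * k for k in range(1000)]
theta_grid = max(grid, key=lambda t: geom_loglik(t, xs))

print("sample mean    :", theta_star)
print("grid maximizer :", theta_grid)
```

The two printed values should agree up to the grid spacing.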
%-----------%
% variables %
%-----------%
m = 10000; % number of points for the histogram
n = 500; % sample size for each histogram point
Figure 2: Histogram for problem 3(a) matches the pdf of a normal random variable. [Plot: histogram of 10000 samples of n^{1/2}(T_n − θ) with n = 500 and θ = 3, overlaid with the pdf of N(0, θ² − θ); x-axis: value.]
theta = 3;
%---------%
% samples %
%---------%
% Note that MATLAB defines geometric distribution to be
% Pr{ Y = k } = (1-p)^k p , for k=0,1,2,...
% In our homework, the r.v. X has the pmf
% Pr{ X = l } = (1-p)^(l-1) p, for l=1,2,3,... . ----(*)
% Note that Y+1 has the pmf (*):
% Pr{ Y+1 = s } = Pr{ Y=s-1 }
% = (1-p)^(s-1) p, for s=1,2,3,... .
% Thus, to generate samples with pmf (*), we simply add one to the
% output of geornd.
x = 1 + geornd(1/theta, n, m );
%-----------------------------------------%
% derive ML estimates T_n(1), ..., T_n(m) %
%-----------------------------------------%
% Structure of x:
% x[1,1] x[1,2] ... x[1,m]
% x[2,1] x[2,2] ... x[2,m]
% ...
% x[n,1] x[n,2] ... x[n,m]
% | | |
% | | V
% | | use this data set for T_n (m)
% | V
% V use this data set for T_n (2)
%
% use this data set for T_n (1)
%-------------------------------------%
% plot histogram and the limiting pdf %
%-------------------------------------%
sigma = sqrt(theta^2-theta); % limiting value of standard dev.
%---------------------------%
% label axes, title, legend %
%---------------------------%
% add title and labels
legend('histogram', 'pdf N(0,\theta^2-\theta)');
xlabel('value');
ylabel('normalized frequency');
% build the title from pieces; the string is split across two lines
txt = ['Histogram for ', num2str(m), ' samples of n^{1/2}', ...
       '(T_n-\theta) (n=', num2str(n), ' and \theta=', num2str(theta), ')'];
title(txt);
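The MATLAB listing above survives only in part, so here is a hedged Python sketch of the same experiment. It assumes that T_n is the sample mean of n geometric samples (the MLE derived earlier) and checks that n^{1/2}(T_n − θ) has mean near 0 and variance near θ² − θ. The sizes n = 200 and m = 2000 are smaller than the handout's values and, like the seed, are arbitrary choices to keep the run short.

```python
import math
import random
import statistics


def geometric_sample(theta, rng):
    """One draw from Pr{X = k} = (1 - 1/theta)^(k-1) * (1/theta), k = 1, 2, ..."""
    p = 1.0 / theta
    u = 1.0 - rng.random()                       # uniform on (0, 1]
    return int(math.log(u) / math.log(1.0 - p)) + 1


def normalized_errors(theta, n, m, rng):
    """Return m values of sqrt(n) * (T_n - theta), with T_n the sample mean."""
    out = []
    for _ in range(m):
        t_n = sum(geometric_sample(theta, rng) for _ in range(n)) / n
        out.append(math.sqrt(n) * (t_n - theta))
    return out


rng = random.Random(0)            # arbitrary fixed seed
theta, n, m = 3.0, 200, 2000      # illustrative sizes, not the handout's
z = normalized_errors(theta, n, m, rng)
print("sample mean    :", statistics.mean(z))
print("sample variance:", statistics.pvariance(z))
print("limit variance :", theta ** 2 - theta)
```

A histogram of z could then be compared with the N(0, θ² − θ) density, as in Figure 2.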
(b) Let X denote a Geometric random variable with the probability of success 1/θ, and let f(· | θ) denote the probability mass function of X:
$$f(k \mid \theta) = \left(1-\frac{1}{\theta}\right)^{k-1}\cdot\frac{1}{\theta}, \qquad k = 1, 2, \ldots.$$
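The Fisher information is given by
$$\begin{aligned}
I(\theta) &\triangleq E\left[\left(\frac{\partial}{\partial\theta}\ln f(X \mid \theta)\right)^2\right] && \text{(the expectation over a random variable } X\text{)}\\
&= E\left[\left(\frac{X-1}{\theta^2-\theta} - \frac{1}{\theta}\right)^2\right] && \text{(from equation (1) with } n = 1\text{)}\\
&= E\left[\left(\frac{X-\theta}{\theta^2-\theta}\right)^2\right]\\
&= \frac{1}{(\theta^2-\theta)^2}\,E\left[(X-\theta)^2\right]\\
&= \frac{1}{(\theta^2-\theta)^2}\operatorname{Var} X\\
&= \frac{1}{\theta^2-\theta}.
\end{aligned}$$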
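The closed form can be cross-checked without any simulation by summing the squared score against the pmf directly. The sketch below does this in Python; the function name and the truncation point kmax are illustrative choices, not part of the handout.

```python
def fisher_info_geometric(theta, kmax=5000):
    """Approximate I(theta) = sum_k f(k|theta) * score(k)^2 by a truncated sum.

    The score for one observation is (k - theta) / (theta^2 - theta).
    """
    p = 1.0 / theta
    pmf = p                      # f(1 | theta)
    total = 0.0
    for k in range(1, kmax + 1):
        score = (k - theta) / (theta * theta - theta)
        total += pmf * score * score
        pmf *= 1.0 - p           # f(k+1 | theta) = f(k | theta) * (1 - p)
    return total


for theta in (2.0, 3.0, 5.0):
    print(theta, fisher_info_geometric(theta), 1.0 / (theta ** 2 - theta))
```

For each θ the two printed numbers should coincide up to rounding, since the neglected tail of the sum is negligible for these values.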
Problem 4:
Show that
$$E_\theta\left[\left(\frac{\partial}{\partial\theta}\ln f_{X_1,X_2,\ldots,X_n}(x_1, x_2, \ldots, x_n \mid \theta)\right)^2\right] = nI(\theta).$$
(b) Let $t_n(X_1, \ldots, X_n)$ be an unbiased estimator of some function of θ, say g(θ). Show that
$$\operatorname{Var}\left(t_n(X_1, \ldots, X_n)\right) \ge \frac{(g'(\theta))^2}{nI(\theta)}$$
Solution
Next, note that, if random variables V and W are independent, then any functions g(V) and g(W) of them are also independent, so that E {g(W )g(V )} = E {g(W )} E {g(V )}. Thus, we have
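$$\begin{aligned}
E\left[\left(\frac{\partial}{\partial\theta}\ln f_{X_1,\ldots,X_n}(x_1,\ldots,x_n \mid \theta)\right)^2\right]
&= E\left[\left(\frac{\partial}{\partial\theta}\sum_{i=1}^{n}\ln f_{X_i}(x_i \mid \theta)\right)^2\right]\\
&= \sum_{i=1}^{n} E\left[\left(\frac{\partial}{\partial\theta}\ln f_{X_i}(x_i \mid \theta)\right)^2\right] + 2\sum_{i<j} E\left[\frac{\partial}{\partial\theta}\ln f_{X_i}(x_i \mid \theta)\cdot\frac{\partial}{\partial\theta}\ln f_{X_j}(x_j \mid \theta)\right]\\
&= \sum_{i=1}^{n} I(\theta) + 0\\
&= nI(\theta)
\end{aligned}$$
Hence, we conclude $E\left[\left(\frac{\partial}{\partial\theta}\ln f_{X_1,\ldots,X_n}(x_1,\ldots,x_n \mid \theta)\right)^2\right] = nI(\theta)$.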
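From (a), we know that
$$\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} g(\theta)\left(\frac{\partial}{\partial\theta}\ln\prod_{i=1}^{n} f_{X_i}(x_i \mid \theta)\right)\prod_{i=1}^{n} f_{X_i}(x_i \mid \theta)\,dx_1\ldots dx_n = g(\theta)\sum_{i=1}^{n} E\left[\frac{\partial}{\partial\theta}\ln f_{X_i}(x_i \mid \theta)\right] = g(\theta)\cdot 0 = 0$$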
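Then, we have
$$\begin{aligned}
g'(\theta) &= \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} t_n(x_1,\ldots,x_n)\,\frac{\partial}{\partial\theta}\prod_{i=1}^{n} f_{X_i}(x_i \mid \theta)\,dx_1\ldots dx_n\\
&= \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} t_n(x_1,\ldots,x_n)\left(\frac{\partial}{\partial\theta}\ln\prod_{i=1}^{n} f_{X_i}(x_i \mid \theta)\right)\prod_{i=1}^{n} f_{X_i}(x_i \mid \theta)\,dx_1\ldots dx_n\\
&= \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} \left(t_n(x_1,\ldots,x_n) - g(\theta)\right)\left(\frac{\partial}{\partial\theta}\ln\prod_{i=1}^{n} f_{X_i}(x_i \mid \theta)\right)\prod_{i=1}^{n} f_{X_i}(x_i \mid \theta)\,dx_1\ldots dx_n\\
&= E\left[\left(t_n(X_1,\ldots,X_n) - g(\theta)\right)\frac{\partial}{\partial\theta}\ln\prod_{i=1}^{n} f_{X_i}(x_i \mid \theta)\right]\\
&\le \left(E\left[\left(t_n(X_1,\ldots,X_n) - E[t_n(X_1,\ldots,X_n)]\right)^2\right]\right)^{1/2}\left(E\left[\left(\frac{\partial}{\partial\theta}\ln\prod_{i=1}^{n} f_{X_i}(x_i \mid \theta)\right)^2\right]\right)^{1/2}\\
&= \left(\operatorname{Var}(t_n(X_1,\ldots,X_n))\right)^{1/2}\left(nI(\theta)\right)^{1/2}
\end{aligned}$$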
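Note that the third equality holds since $\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} g(\theta)\left(\frac{\partial}{\partial\theta}\ln\prod_{i=1}^{n} f_{X_i}(x_i \mid \theta)\right)\prod_{i=1}^{n} f_{X_i}(x_i \mid \theta)\,dx_1\ldots dx_n = 0$, and the inequality follows from the Cauchy-Schwarz inequality. Therefore, we have $\operatorname{Var}\left(t_n(X_1,\ldots,X_n)\right) \ge \frac{(g'(\theta))^2}{nI(\theta)}$.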
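As a worked illustration of the bound (an added example, not part of the original solution), take the geometric model of Problem 3 with $g(\theta) = \theta$ and the sample mean $T_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ as the estimator. It uses $E[X] = \theta$, $\operatorname{Var} X = \theta^2 - \theta$ and $I(\theta) = 1/(\theta^2-\theta)$ from that problem.

```latex
\begin{align*}
g(\theta) &= \theta, \qquad g'(\theta) = 1, \qquad I(\theta) = \frac{1}{\theta^2-\theta},\\
\frac{(g'(\theta))^2}{nI(\theta)} &= \frac{\theta^2-\theta}{n},\\
\operatorname{Var}(T_n) &= \frac{\operatorname{Var} X}{n} = \frac{\theta^2-\theta}{n}.
\end{align*}
```

The variance of the sample mean equals the bound, so in this example the inequality holds with equality.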
Problem 5:
iv. Plot V and CRLB as a function of σ² for α = 2 and N = 20. [Generally, we plot the performance against the so-called signal-to-noise ratio (SNR). To get a similar plot, make your plots as a function of 1/σ².]
(b) Repeat the same exercise as part (a) for the unknown θ in the system model: $\mathbf{r} = e^{j\theta}\mathbf{s} + \mathbf{n}$. You can use the function φ = Arg(x), which gives the phase of any complex x s.t. φ ∈ [−π, π]. You can use simulation to find a numerical solution if closed-form solutions are not easily obtained.
Solution
(a) In this part, we can assume $\mathbf{n} = \mathbf{n}_R$, which makes $\mathbf{r}$ real as well, because the only imaginary part could be due to the imaginary noise component $\mathbf{n}_I$, which can be removed before the estimator.
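i. We have to find the MLE of α. The likelihood function is:
$$L(\alpha) = p(\mathbf{r} \mid \alpha) = \left(\frac{1}{\sqrt{2\pi\,\sigma^2/2}}\right)^{N}\exp\left(-\sum_{k=0}^{N-1}\frac{(r_k-\alpha s_k)^2}{2\,\sigma^2/2}\right)$$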
for α gives:
$$\hat{\alpha}_{\mathrm{MLE}}(\mathbf{r}) = \frac{1}{N}\sum_{k=0}^{N-1} r_k s_k \qquad (4)$$
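ii.
$$\begin{aligned}
E\{\hat{\alpha}_{\mathrm{MLE}}(\mathbf{r})\} &= E\left\{\frac{1}{N}\sum_{k=0}^{N-1} r_k s_k\right\}\\
&= \frac{1}{N}\sum_{k=0}^{N-1} s_k E\{r_k\}\\
&= \frac{1}{N}\sum_{k=0}^{N-1} \alpha s_k^2\\
&= \frac{\alpha}{N}\sum_{k=0}^{N-1} 1\\
&= \alpha \qquad (5)
\end{aligned}$$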
where the second-to-last step holds because $s_k \in \{-1, +1\}$. Thus the estimator is unbiased, that is, its mean estimation error is zero. So, the variance of the estimation error is:
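$$\begin{aligned}
\operatorname{Var}\{\hat{\alpha}_{\mathrm{MLE}} - \alpha\} &= E\left\{(\hat{\alpha}_{\mathrm{MLE}}(\mathbf{r}) - \alpha)^2\right\}\\
&= E\left\{\left(\frac{1}{N}\sum_{k=0}^{N-1} r_k s_k - \alpha\right)^2\right\} \qquad (6)\\
&= E\left\{\left(\frac{1}{N}\sum_{k=0}^{N-1} (\alpha s_k^2 + n_k s_k) - \alpha\right)^2\right\}\\
&= E\left\{\left(\alpha + \frac{1}{N}\sum_{k=0}^{N-1} n_k s_k - \alpha\right)^2\right\}\\
&= \frac{1}{N^2}\,E\left\{\left(\sum_{k=0}^{N-1} n_k s_k\right)^2\right\}\\
&= \frac{1}{N^2}\,E\left\{\sum_{k=0}^{N-1} n_k^2 s_k^2 + \sum_{i,j:\,j\ne i} n_i n_j s_i s_j\right\}\\
&= \frac{1}{N^2}\left[\sum_{k=0}^{N-1} E\{n_k^2\} + \sum_{i,j:\,j\ne i} E\{n_i\}E\{n_j\}\,s_i s_j\right]\\
&= \frac{\sigma^2}{2N} \qquad (7)
\end{aligned}$$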
Note that the estimation error is inversely related with the vector size,
which is expected as the same α scales all the components and estimation
should improve with the increase in sample size.
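iii. To find the lower bound on the mean square estimation error (or the variance of the estimation error, because the mean of the error is zero), we find Fisher's information $I_{\mathbf{r}}(\alpha)$, using (3):
$$\begin{aligned}
I_{\mathbf{r}}(\alpha) &= E\left[\left(\frac{\partial}{\partial\alpha}\ln p(\mathbf{r} \mid \alpha)\right)^2\right]\\
&= \frac{4}{\sigma^4}\,E\left[\left(\sum_{k=0}^{N-1} r_k s_k - N\alpha\right)^2\right]\\
&= \frac{4N^2}{\sigma^4}\,E\left[\left(\frac{1}{N}\sum_{k=0}^{N-1} r_k s_k - \alpha\right)^2\right]
\end{aligned}$$
where we note that the expectation is the same as in (6). Therefore, the result of the expectation is given by (7).
$$I_{\mathbf{r}}(\alpha) = \frac{4N^2}{\sigma^4}\cdot\frac{\sigma^2}{2N} = \frac{2N}{\sigma^2}$$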
Note that we could have used:
$$I_{\mathbf{r}}(\alpha) = -E\left[\frac{\partial^2}{\partial\alpha^2}\ln p(\mathbf{r} \mid \alpha)\right] = \frac{2N}{\sigma^2}$$
which is the same result. For the numerator part of the CRLB, we already found that this is an unbiased estimator of α. Therefore, $\psi(\alpha) = E\{\hat{\alpha}_{\mathrm{MLE}}(\mathbf{r})\} = \alpha$. Thus,
$$\psi'(\alpha) = 1$$
Hence, the lower bound on the mean square estimation error is given by:
$$\frac{(\psi'(\alpha))^2}{I_{\mathbf{r}}(\alpha)} = \frac{\sigma^2}{2N} \qquad (8)$$
From part (ii), eq. (7), we see that the MLE achieves the lower bound. Note that the noise variance for this derivation is σ²/2, which brings the constant 2 into the result.
iv. For N = 20, the plot of the CRLB vs 1/σ² is given below. The variance of the estimation error, obtained by simulating the estimator with sample size 100, is also shown. The figure below shows the mean of the estimate at different noise variances.
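A simulation of this kind can be sketched in Python as follows. This is not the handout's code: it assumes real Gaussian noise of variance σ²/2 per sample, symbols drawn uniformly from {−1, +1}, and 100 trials per noise level. The seed and the grid of 1/σ² values are arbitrary.

```python
import random


def alpha_mle(r, s):
    """MLE of alpha: (1/N) * sum_k r_k s_k."""
    return sum(rk * sk for rk, sk in zip(r, s)) / len(s)


def simulate_error_variance(alpha, sigma2, N, trials, rng):
    """Average of (alpha_hat - alpha)^2 over independent trials.

    Because the estimator is unbiased, this mean squared error estimates
    the variance of the estimation error. Noise variance is sigma2 / 2.
    """
    noise_std = (sigma2 / 2.0) ** 0.5
    sq_err = 0.0
    for _ in range(trials):
        s = [rng.choice((-1.0, 1.0)) for _ in range(N)]
        r = [alpha * sk + rng.gauss(0.0, noise_std) for sk in s]
        sq_err += (alpha_mle(r, s) - alpha) ** 2
    return sq_err / trials


rng = random.Random(0)        # arbitrary fixed seed
alpha, N = 2.0, 20
for inv_sigma2 in (1, 2, 5, 10, 20):
    sigma2 = 1.0 / inv_sigma2
    crlb = sigma2 / (2 * N)
    v = simulate_error_variance(alpha, sigma2, N, 100, rng)
    print(inv_sigma2, crlb, v)
```

With only 100 trials the simulated values scatter around the CRLB; more trials tighten the agreement.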
Figure 3: The plot of the CRLB and the simulated estimator's error variance (sample size 100) as a function of the inverse noise variance 1/σ². [Plot: y-axis: variance of estimation error; x-axis: 1/σ², from 0 to 20.]
(b) $\mathbf{r} = e^{j\theta}\mathbf{s} + \mathbf{n}$
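i. For the MLE, we first write the likelihood function and the log-likelihood function:
$$L(\theta) = p(\mathbf{r} \mid \theta) = \left(\frac{1}{\sqrt{\pi\sigma^2}}\right)^{N} e^{-\frac{1}{\sigma^2}\|\mathbf{r}-\mathbf{s}e^{j\theta}\|^2} = \left(\frac{1}{\sqrt{\pi\sigma^2}}\right)^{N} e^{-\sum_{k=0}^{N-1}|r_k - s_k e^{j\theta}|^2/\sigma^2}$$
$$\ln L(\theta) = -\frac{N}{2}\ln(\pi\sigma^2) - \frac{1}{\sigma^2}\sum_{k=0}^{N-1}|r_k - s_k e^{j\theta}|^2$$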
Here $\Re\{\cdot\}$ and $\Im\{\cdot\}$ denote the real and imaginary parts, and $^*$ indicates the complex conjugate.
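$$\begin{aligned}
\ln L(\theta) &= -\frac{N}{2}\ln(\pi\sigma^2) - \frac{1}{\sigma^2}\sum_{k=0}^{N-1}\left(|r_k|^2 + s_k^2 - 2\Re\{r_k^* s_k e^{j\theta}\}\right)\\
&= -\frac{N}{2}\ln(\pi\sigma^2) - \frac{1}{\sigma^2}\left(\sum_{k=0}^{N-1}|r_k|^2 + \sum_{k=0}^{N-1} s_k^2 - 2\sum_{k=0}^{N-1}\Re\{r_k^* s_k e^{j\theta}\}\right)\\
&= -\frac{N}{2}\ln(\pi\sigma^2) - \frac{1}{\sigma^2}\sum_{k=0}^{N-1}|r_k|^2 - \frac{N}{\sigma^2} + \frac{2}{\sigma^2}\sum_{k=0}^{N-1}\Re\{r_k^* s_k e^{j\theta}\}
\end{aligned}$$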
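$$\begin{aligned}
\frac{\partial^2}{\partial\theta^2}\ln L(\hat{\theta}) &= -\frac{2}{\sigma^2}\sum_{k=0}^{N-1}\Re\{e^{j\hat{\theta}} r_k^* s_k\}\\
&= -\frac{2}{\sigma^2}\left|\sum_{k=0}^{N-1} r_k^* s_k\right| < 0
\end{aligned}$$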
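ii.
$$\begin{aligned}
E\{\hat{\theta}\} &= E\left\{\operatorname{Arg}\left(\sum_{k=0}^{N-1} r_k s_k\right)\right\}\\
&= E\left\{\operatorname{Arg}\left(N e^{j\theta} + \sum_{k=0}^{N-1} n_k s_k\right)\right\}\\
&= E\left\{\operatorname{Arg}\left(e^{j\theta} + \frac{1}{N}\sum_{k=0}^{N-1} n_k s_k\right)\right\}\\
&= E\left\{\theta + \operatorname{Arg}\left(1 + e^{-j\theta}\,\frac{1}{N}\sum_{k=0}^{N-1} n_k s_k\right)\right\}\\
&= \theta
\end{aligned}$$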
The last step has the following justification: the noise term in the summation has zero mean and is circularly symmetric. Therefore, the quantity inside the Arg function is a complex Gaussian random variable with mean 1 whose phase is distributed symmetrically about 0, so its expected phase is 0.
iii. We now determine the CRLB for the mean square estimation error in the unknown parameter θ. From the numerical results of part (ii), we find that the estimator is unbiased. Therefore, the numerator part of the Information Inequality is 1. We calculate the denominator, that is, $I_{\mathbf{r}}(\theta)$. We make use of (10) here.
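$$\begin{aligned}
I_{\mathbf{r}}(\theta) &= -E\left[\frac{\partial^2}{\partial\theta^2}\ln p(\mathbf{r} \mid \theta)\right]\\
&= \frac{2}{\sigma^2}\sum_{k=0}^{N-1} E\left\{\Re\{r_k^* s_k e^{j\theta}\}\right\}\\
&= \frac{2}{\sigma^2}\sum_{k=0}^{N-1} E\left\{\Re\{(s_k e^{-j\theta} + n_k^*)\,s_k e^{j\theta}\}\right\}\\
&= \frac{2}{\sigma^2}\sum_{k=0}^{N-1} E\left\{\Re\{1 + n_k^* s_k e^{j\theta}\}\right\}\\
&= \frac{2}{\sigma^2}\sum_{k=0}^{N-1} 1 + \frac{2}{\sigma^2}\sum_{k=0}^{N-1}\Re\{s_k E\{n_k^*\}\,e^{j\theta}\}\\
&= \frac{2N}{\sigma^2} + 0\\
&= \frac{2N}{\sigma^2}
\end{aligned}$$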
Hence, the CRLB for the variance of the estimation error is $\frac{\sigma^2}{2N}$.
iv. For N = 20, the plot of the CRLB vs 1/σ² is given below in Fig. 5. The variance of the estimation error calculated with a sample size of 100 is also plotted.
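The numerical experiment for the phase estimator can be sketched in Python as follows. The sketch assumes circularly symmetric complex Gaussian noise with total variance σ² (σ²/2 per real component), symbols drawn uniformly from {−1, +1}, the known value θ = π/3 from Figure 4, and 100 trials per noise level. The seed and the grid of 1/σ² values are arbitrary, and the code is not taken from the handout.

```python
import cmath
import math
import random
import statistics


def theta_mle(r, s):
    """MLE of the phase: Arg of sum_k r_k s_k, a value in [-pi, pi]."""
    return cmath.phase(sum(rk * sk for rk, sk in zip(r, s)))


def simulate_theta_hat(theta, sigma2, N, trials, rng):
    """Return a list of phase estimates from independent trials."""
    comp_std = math.sqrt(sigma2 / 2.0)     # std of each real noise component
    rot = cmath.exp(1j * theta)
    estimates = []
    for _ in range(trials):
        s = [rng.choice((-1.0, 1.0)) for _ in range(N)]
        r = [rot * sk + complex(rng.gauss(0.0, comp_std),
                                rng.gauss(0.0, comp_std)) for sk in s]
        estimates.append(theta_mle(r, s))
    return estimates


rng = random.Random(0)                     # arbitrary fixed seed
theta, N = math.pi / 3, 20
for inv_sigma2 in (2, 5, 10, 20):
    sigma2 = 1.0 / inv_sigma2
    est = simulate_theta_hat(theta, sigma2, N, 100, rng)
    mean_est = statistics.mean(est)
    err_var = statistics.pvariance(est, mu=theta)   # mean squared error about theta
    print(inv_sigma2, mean_est, err_var, sigma2 / (2 * N))
```

The second column should stay close to π/3 and the third close to the CRLB in the fourth, with agreement improving as the number of trials grows.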
Figure 4: The plot of the sample mean of the estimator at different variance values; the sample size is 100 in each case. [Plot: sample mean of the MLE of θ, compared with the known value θ = π/3.]
[Figure 5: variance of estimation error versus 1/σ², from 0 to 20.]