
6.434J/16.391J
Statistics for Engineers and Scientists Mar 16
MIT, Spring 2006 Handout #9

Solution 4

Problem 1: Consider the multi-antenna transmission and reception system.


There are $n_T = 2$ transmit and $n_R = 2$ receive antennas. A vector of $n_T$ iid data
symbols $a \in \mathcal{A} = \{-1, +1\}^{n_T}$ is transmitted over an AWGN channel. Assume
that the symbols are uniformly distributed. The channel coefficients between
each transmit and receive antenna pair are grouped in the form of a channel
matrix $H$, which is known in this problem. The observations at the receiver can
be written as:
\[
r = Ha + n
\]
where $n \sim \mathcal{N}(0, \sigma^2 I_{n_R})$. $H$ is specified as follows: $H_{11} = 0.25$, $H_{12} = 0.9$,
$H_{21} = 0.7$, $H_{22} = 0.5$. Implement the above described system in MATLAB
for noise variances of $\sigma^2 = 0.25$, $0.5$ and $1$, respectively. Implement an MMSE
estimator to obtain the transmitted vector of symbols. Record the error performance
by comparing with the transmitted sequence for each value of $\sigma^2$. Be
sure to run your system for a sufficient number of samples in each case.

Solution The MATLAB code below simulates the system. It implements the L-MMSE
(linear MMSE) estimator for the transmitted symbols. One set of simulation
results for the given noise variance values is:

$\sigma_n^2 = 0.25$: mean symbol error = 0.0925
$\sigma_n^2 = 0.50$: mean symbol error = 0.1600
$\sigma_n^2 = 1.00$: mean symbol error = 0.2375

For a realistic communication system these error rates are quite high, because
these noise variances are large for such a MIMO system (they yield SNRs of about
6, 3 and 0 dB, respectively). At $\sigma_n^2 = 0.05$ (almost 13 dB SNR), the
symbol error rate goes below $10^{-3}$.
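The quoted SNR figures can be checked with a line of arithmetic. The block below is a sketch that assumes SNR is defined as symbol power over noise variance per receive antenna, with unit symbol power because the symbols are $\pm 1$; the handout does not state its SNR definition explicitly.

```latex
\begin{align*}
\mathrm{SNR}_{\mathrm{dB}} &= 10\log_{10}\frac{E\{a_i^2\}}{\sigma_n^2} = 10\log_{10}\frac{1}{\sigma_n^2}, \\
\sigma_n^2 = 0.25 &:\; 10\log_{10} 4 \approx 6.02\ \mathrm{dB}, \qquad
\sigma_n^2 = 0.5 :\; 10\log_{10} 2 \approx 3.01\ \mathrm{dB}, \\
\sigma_n^2 = 1 &:\; 10\log_{10} 1 = 0\ \mathrm{dB}, \qquad\quad\;\;
\sigma_n^2 = 0.05 :\; 10\log_{10} 20 \approx 13.01\ \mathrm{dB}.
\end{align*}
```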

---------------------------------------------------------------
% Multi-antenna transmitter and receiver system: MMSE estimator
% Number of transmit and receive antennas:
nT = 2; nR = 2;

% Channel coefficient matrix H:
H = [0.25 0.9; 0.7 0.5];

% Possible data symbols:
A = [-1; 1];

% Noise variances to simulate (can be changed to any desired values):
allVariances = [0.25 0.5 1.0];

RESULT = [];
for index = 1:length(allVariances)
    sigma2 = allVariances(index);
    sampleSize = 1000;
    totalErrors = 0;
    for m = 1:sampleSize
        % Generate transmit vector of size (nT x 1) with entries in {-1,+1}
        a = 2*(round(rand(nT,1)) - 0.5);

        % Generate noise samples
        n = sqrt(sigma2)*randn(nR,1);

        % Received vector:
        r = H*a + n;

        % L-MMSE estimate of transmitted symbols a, based on observed r:
        a_hat = H.' * inv(H*H.' + sigma2*eye(nR)) * r;

        % Quantize the estimate to the nearest symbol in A:
        a_hatQ = 2*((a_hat > 0) - 0.5);

        % Count symbol errors:
        symError = sum(a_hatQ ~= a);
        totalErrors = totalErrors + symError;
    end
    % For each noise variance, record the sample-mean symbol error
    RESULT = [RESULT; [sigma2 totalErrors/(nT*sampleSize)]];
end
% Display
RESULT

---------------------------------------------------------------
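For reference, the estimator line in the code can be read as the standard linear MMSE formula. The short derivation below is a sketch under the problem's assumptions (zero-mean, uncorrelated, unit-power symbols that are independent of the noise); the covariance symbols $C_{aa}$, $C_{ar}$ and $C_{rr}$ are notation introduced here and do not appear in the handout.

```latex
\begin{align*}
C_{aa} &= E\{a a^T\} = I_{n_T}, \\
C_{ar} &= E\{a r^T\} = E\{a (Ha + n)^T\} = C_{aa} H^T = H^T, \\
C_{rr} &= E\{r r^T\} = H C_{aa} H^T + \sigma^2 I_{n_R} = H H^T + \sigma^2 I_{n_R}, \\
\hat{a} &= C_{ar}\, C_{rr}^{-1}\, r = H^T \left( H H^T + \sigma^2 I_{n_R} \right)^{-1} r .
\end{align*}
```

The last line is the expression evaluated for a_hat in the listing, before the estimate is quantized to $\pm 1$.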

Problem 2: Let $X_1, X_2, \ldots, X_n$ be independent random variables, each with
density
\[
f(x \mid \theta) =
\begin{cases}
\dfrac{x}{\theta}\, e^{-\frac{x^2}{2\theta}}, & x > 0; \\[4pt]
0, & x \le 0,
\end{cases}
\]
where $\theta$ is unknown, $\theta \in \Theta \triangleq \{z \mid z > 0\}$. (This is called the Rayleigh
distribution and has been used to model the fluctuation of the received signal amplitude
in wireless transmission.)

(a) Find a maximum likelihood estimator of $\theta$, say, $t_n(X_1, X_2, \ldots, X_n)$.

(b) Use MATLAB to generate $n = 1000$ samples, $X_i = x_i$, of i.i.d. random
variables, each with Rayleigh density with parameter $\theta = 2$. Then, plot
the estimates $t_m(x_1, x_2, \ldots, x_m)$ as a function of the sample size, $m$, for
$m = 1, 2, \ldots, 1000$. Comment on the figure. Based on the plot, do you
think that the estimator is asymptotically unbiased?

Solution

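(a) The likelihood and log-likelihood functions are given by
\[
L(\theta) = \frac{\prod_{i=1}^{n} X_i}{\theta^{n}}\; e^{-\frac{\sum_{i=1}^{n} X_i^2}{2\theta}},
\]
\[
\ln L(\theta) = \sum_{i=1}^{n} \ln X_i - n \ln \theta - \frac{1}{2\theta} \sum_{i=1}^{n} X_i^2 .
\]
We want to find $\theta > 0$ that maximizes the log-likelihood function. The
first and second partial derivatives of the log-likelihood function are given
by
\[
\frac{\partial}{\partial\theta} \ln L(\theta) = -\frac{n}{\theta} + \frac{1}{2\theta^2} \sum_{i=1}^{n} X_i^2 ,
\]
\[
\frac{\partial^2}{\partial\theta^2} \ln L(\theta) = \frac{n}{\theta^2} - \frac{1}{\theta^3} \sum_{i=1}^{n} X_i^2 .
\]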

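Setting the first partial derivative to zero yields a stationary point
\[
\theta^* = \frac{\sum_{i=1}^{n} X_i^2}{2n},
\]
which maximizes the log-likelihood function:
\[
\left. \frac{\partial^2}{\partial\theta^2} \ln L(\theta) \right|_{\theta = \theta^*} = -\frac{n}{(\theta^*)^2} < 0 .
\]
Therefore, the maximum likelihood estimator (MLE) of $\theta$ is
\[
t_n(X_1, X_2, \ldots, X_n) = \frac{\sum_{i=1}^{n} X_i^2}{2n} .
\]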

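(b) The MATLAB command raylrnd(b,1,n) generates n i.i.d. samples from a Rayleigh
density with scale parameter b, that is, with density $(x/b^2)\, e^{-x^2/(2b^2)}$. In the
parameterization used here this corresponds to $b = \sqrt{\theta}$, so the samples are
generated with raylrnd(sqrt(θ),1,n). For the data set used to plot Fig. 1, the
maximum likelihood estimate approaches θ = 2 when the sample size is large.
From the figure, the estimator appears to be asymptotically unbiased.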
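The impression from the plot can be confirmed analytically. The following worked equation is a short check that is not part of the original solution; it uses only the density given in the problem and the substitution $u = x^2/(2\theta)$.

```latex
\begin{align*}
E\{X^2\} &= \int_0^\infty x^2 \,\frac{x}{\theta}\, e^{-\frac{x^2}{2\theta}}\, dx
          = \int_0^\infty 2\theta u \, e^{-u}\, du = 2\theta, \\
E\{t_n(X_1,\ldots,X_n)\} &= \frac{1}{2n} \sum_{i=1}^{n} E\{X_i^2\} = \frac{1}{2n}\cdot n \cdot 2\theta = \theta .
\end{align*}
```

So the estimator is unbiased for every $n$, not only asymptotically.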

[Figure: MLE of unknown parameter θ=2 for Rayleigh density. x-axis: sample size (0 to 1000); y-axis: MLE.]

Figure 1: The estimate is close to 2 for a large sample size.
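A plot like Fig. 1 can be produced along the following lines. This is a minimal sketch, not the code used for the handout's figure: it draws the samples by inverse-transform sampling (so that no toolbox function is needed), and the seed and variable names are arbitrary choices made here.

```matlab
% Sketch: running MLE t_m = sum(x_1^2,...,x_m^2)/(2m) for the density
% f(x|theta) = (x/theta)*exp(-x^2/(2*theta)), x > 0.
rng(0);          % fixed seed (arbitrary) for reproducibility
theta = 2;       % true parameter
n     = 1000;    % number of samples

% Inverse-transform sampling: F(x) = 1 - exp(-x^2/(2*theta)), so
% x = sqrt(-2*theta*log(U)) with U uniform on (0,1).
x = sqrt(-2*theta*log(rand(1,n)));

% Estimates t_m for m = 1..n via a cumulative sum
m  = 1:n;
tm = cumsum(x.^2) ./ (2*m);

plot(m, tm, '-b'); hold on
plot([1 n], [theta theta], '--k');   % reference line at the true theta
xlabel('sample size'); ylabel('MLE');
legend('t_m', 'true \theta');
```

The early estimates fluctuate strongly and settle near θ = 2 as m grows, which is the behaviour described in the caption of Fig. 1.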

Problem 3: Let $X_1, X_2, \ldots, X_n$ be i.i.d. random variables from the Geometric
distribution with the probability of success $0 \le 1/\theta \le 1$, for an unknown
parameter $\theta$:
\[
P\{X_1 = k\} = \left(1 - \frac{1}{\theta}\right)^{k-1} \cdot \frac{1}{\theta}, \qquad k = 1, 2, 3, \ldots .
\]
Let $T_n$ be the maximum likelihood estimator (MLE) of $\theta$ based on a random
sample of size $n$.

(a) Fix $\theta = 3$ and use MATLAB to plot a histogram of $\sqrt{n}\,(T_n - \theta)$, for
$n = 500$.

(b) Find the Fisher information, I(θ).

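Solution First, we derive the MLE of $\theta$. The likelihood and log-likelihood
functions are
\begin{align*}
L(\theta) &= \left(1 - \frac{1}{\theta}\right)^{X_1 - 1} \frac{1}{\theta} \times \left(1 - \frac{1}{\theta}\right)^{X_2 - 1} \frac{1}{\theta} \times \cdots \times \left(1 - \frac{1}{\theta}\right)^{X_n - 1} \frac{1}{\theta} \\
&= \left(1 - \frac{1}{\theta}\right)^{\left(\sum_{i=1}^{n} X_i\right) - n} \frac{1}{\theta^{n}}
\end{align*}
and
\[
\ln L(\theta) = \left[\left(\sum_{i=1}^{n} X_i\right) - n\right] \ln\left(1 - \frac{1}{\theta}\right) - n \ln\theta .
\]
The first and second derivatives of the log-likelihood function with respect to
$\theta$ are
\[
\frac{\partial}{\partial\theta} \ln L(\theta) = \frac{\left(\sum_{i=1}^{n} X_i\right) - n}{\theta^2 - \theta} - \frac{n}{\theta} \tag{1}
\]
\[
\frac{\partial^2}{\partial\theta^2} \ln L(\theta) = -\frac{\left[\left(\sum_{i=1}^{n} X_i\right) - n\right](2\theta - 1)}{(\theta^2 - \theta)^2} + \frac{n}{\theta^2} . \tag{2}
\]
Setting the first derivative to zero and solving for the unknown $\theta \ge 1$ yields a
stationary point
\[
\theta^* = \frac{\sum_{i=1}^{n} X_i}{n},
\]
which maximizes the log-likelihood function:
\[
\left. \frac{\partial^2}{\partial\theta^2} \ln L(\theta) \right|_{\theta = \theta^*} = -\frac{n}{\theta^* (\theta^* - 1)} < 0 .
\]
Therefore, the maximum likelihood estimator of $\theta$ is
\[
T_n = \frac{\sum_{i=1}^{n} X_i}{n} .
\]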

(a) MATLAB code to plot the histogram is shown below:

%-----------%
% variables %
%-----------%
m = 10000;   % number of points for the histogram
n = 500;     % sample size for each histogram point
theta = 3;

%---------%
% samples %
%---------%
% Note that MATLAB defines the geometric distribution as
%   Pr{ Y = k } = (1-p)^k p,        for k = 0,1,2,...
% In the homework, the r.v. X has the pmf
%   Pr{ X = l } = (1-p)^(l-1) p,    for l = 1,2,3,...     ----(*)
% Note that Y+1 has the pmf (*):
%   Pr{ Y+1 = s } = Pr{ Y = s-1 }
%                 = (1-p)^(s-1) p,  for s = 1,2,3,...
% Thus, to generate samples with pmf (*), we simply add one to the
% output of geornd.
x = 1 + geornd(1/theta, n, m);

%-----------------------------------------%
% derive ML estimates T_n(1), ..., T_n(m) %
%-----------------------------------------%
% Structure of x:
%   x[1,1]  x[1,2]  ...  x[1,m]
%   x[2,1]  x[2,2]  ...  x[2,m]
%   ...
%   x[n,1]  x[n,2]  ...  x[n,m]
%     |       |            |
%     |       |            V
%     |       |        use this data set for T_n(m)
%     |       V
%     V   use this data set for T_n(2)
%
%   use this data set for T_n(1)

% sum over each column
mle = sum(x)/n;

%-------------------------------------%
% plot histogram and the limiting pdf %
%-------------------------------------%
sigma = sqrt(theta^2-theta);   % limiting value of standard dev.
hres = 0.5;                    % resolution (bin width) of the histogram
t = -4*sigma:hres:4*sigma;     % bin centers for the histogram
yhist = sqrt(n)*(mle-theta);   % data for histogram plot
[fqcount,xout] = hist(yhist,t);
% plot normalized frequency; last argument 1 = full width of bars
bar(xout, fqcount/m/hres, 1);
% edit the figure color to your choice
hold on

val = -4*sigma:0.1:4*sigma;    % x-axis, finer resolution for the pdf
y = normpdf(val, 0, sigma);
plot(val, y, '-b', 'Linewidth', 2);   % so the line is over the histogram

%---------------------------%
% label axes, title, legend %
%---------------------------%
% add title and labels
legend('histogram', 'pdf N(0,\theta^2-\theta)');
xlabel('value');
ylabel('normalized frequency');
txt = ['Histogram for ', num2str(m), ' samples of n^{1/2}', ...
       '(T_n-\theta) (n=', num2str(n), ' and \theta=', num2str(theta), ')'];
title(txt);

[Figure: Histogram for 10000 samples of n^{1/2}(T_n−θ) (n=500 and θ=3). Legend: histogram; pdf N(0, θ²−θ). x-axis: value (−10 to 10); y-axis: normalized frequency.]

Figure 2: Histogram for problem 3(a) matches the pdf of a normal random
variable.

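(b) Let $X$ denote a Geometric random variable with the probability of success
$1/\theta$, and let $f(\cdot \mid \theta)$ denote the probability mass function of $X$:
\[
f(k \mid \theta) = \left(1 - \frac{1}{\theta}\right)^{k-1} \cdot \frac{1}{\theta}, \qquad k = 1, 2, \ldots .
\]
The Fisher information is given by
\begin{align*}
I(\theta) &\triangleq E\left\{\left(\frac{\partial}{\partial\theta} \ln f(X \mid \theta)\right)^2\right\} && \text{(the expectation is over the random variable } X\text{)} \\
&= E\left\{\left(\frac{X - 1}{\theta^2 - \theta} - \frac{1}{\theta}\right)^2\right\} && \text{(from equation (1) with } n = 1\text{)} \\
&= E\left\{\left(\frac{X - \theta}{\theta^2 - \theta}\right)^2\right\} \\
&= \frac{1}{(\theta^2 - \theta)^2}\, E\left\{(X - \theta)^2\right\} \\
&= \frac{1}{(\theta^2 - \theta)^2}\, \operatorname{Var} X \\
&= \frac{1}{\theta^2 - \theta} .
\end{align*}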
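The last two steps use the mean and variance of the geometric distribution, which the derivation does not state. They are standard facts, spelled out here with $p = 1/\theta$ as shorthand (an abbreviation introduced for this note only).

```latex
\begin{align*}
E\{X\} &= \frac{1}{p} = \theta
  \quad\Longrightarrow\quad E\{(X - \theta)^2\} = \operatorname{Var} X, \\
\operatorname{Var} X &= \frac{1 - p}{p^2} = \theta^2 \left(1 - \frac{1}{\theta}\right) = \theta^2 - \theta, \\
I(\theta) &= \frac{\theta^2 - \theta}{(\theta^2 - \theta)^2} = \frac{1}{\theta^2 - \theta} .
\end{align*}
```

For $\theta = 3$ this gives $I(\theta) = 1/6$, so $1/I(\theta) = 6 = \theta^2 - \theta$, the variance of the normal pdf drawn over the histogram in part (a).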

Problem 4:

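(a) Let $X, X_1, X_2, \ldots, X_n$ be independent random variables, each with the
same pdf $f_X(x|\theta)$ for some $\theta \in \Theta$. Define the Fisher information
\[
I(\theta) = E_\theta\left\{\left(\frac{\partial}{\partial\theta} \ln f_X(x|\theta)\right)^2\right\} .
\]
Show that
\[
E_\theta\left\{\left(\frac{\partial}{\partial\theta} \ln f_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n|\theta)\right)^2\right\} = nI(\theta).
\]

(b) Let $t_n(X_1, \ldots, X_n)$ be an unbiased estimator of some function of $\theta$,
say $g(\theta)$. Show that
\[
\operatorname{Var}\left(t_n(X_1, \ldots, X_n)\right) \ge \frac{(g'(\theta))^2}{nI(\theta)} .
\]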

Solution

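(a) We first find
\begin{align*}
E_\theta\left\{\frac{\partial}{\partial\theta} \ln f_X(x|\theta)\right\}
&= \int_{-\infty}^{\infty} \frac{\partial}{\partial\theta} \ln f_X(x|\theta) \cdot f_X(x|\theta)\, dx \\
&= \int_{-\infty}^{\infty} \frac{\frac{\partial}{\partial\theta} f_X(x|\theta)}{f_X(x|\theta)}\, f_X(x|\theta)\, dx \\
&= \frac{\partial}{\partial\theta} \int_{-\infty}^{\infty} f_X(x|\theta)\, dx \\
&= \frac{\partial}{\partial\theta} (1) \\
&= 0,
\end{align*}
assuming that differentiation and integration can be interchanged.

Next, note that if random variables $V$ and $W$ are independent, then any
functions $g(V)$ and $h(W)$ of them are also independent, so that
$E\{g(V)h(W)\} = E\{g(V)\}\, E\{h(W)\}$. Thus, we have
\begin{align*}
E\left\{\left(\frac{\partial}{\partial\theta} \ln f_{X_1,\ldots,X_n}(x_1, \ldots, x_n|\theta)\right)^2\right\}
&= E\left\{\left(\frac{\partial}{\partial\theta} \sum_{i=1}^{n} \ln f_{X_i}(x_i|\theta)\right)^2\right\} \\
&= \sum_{i=1}^{n} E\left\{\left(\frac{\partial}{\partial\theta} \ln f_{X_i}(x_i|\theta)\right)^2\right\}
 + 2 \sum_{i<j} E\left\{\frac{\partial}{\partial\theta} \ln f_{X_i}(x_i|\theta) \cdot \frac{\partial}{\partial\theta} \ln f_{X_j}(x_j|\theta)\right\} \\
&= \sum_{i=1}^{n} I(\theta) + 0 \\
&= nI(\theta),
\end{align*}
where each cross term vanishes because, by independence, it factors into a
product of two expectations that are both zero.

Hence, we conclude
$E\left\{\left(\frac{\partial}{\partial\theta} \ln f_{X_1,\ldots,X_n}(x_1, \ldots, x_n|\theta)\right)^2\right\} = nI(\theta)$.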

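(b) Since $t_n(X_1, \ldots, X_n)$ is an unbiased estimator of the function $g(\theta)$, we have
\begin{align*}
g(\theta) &= E\{t_n(X_1, \ldots, X_n)\} \\
&= \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} t_n(x_1, \ldots, x_n) \prod_{i=1}^{n} f_{X_i}(x_i|\theta)\, dx_1 \ldots dx_n .
\end{align*}
From (a), we know that
\begin{align*}
&\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(\theta) \left[\frac{\partial}{\partial\theta} \ln \prod_{i=1}^{n} f_{X_i}(x_i|\theta)\right] \prod_{i=1}^{n} f_{X_i}(x_i|\theta)\, dx_1 \ldots dx_n \\
&\qquad = g(\theta) \sum_{i=1}^{n} E\left\{\frac{\partial}{\partial\theta} \ln f_{X_i}(x_i|\theta)\right\} = g(\theta) \cdot 0 = 0 .
\end{align*}
Then, we have
\begin{align*}
g'(\theta) &= \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} t_n(x_1, \ldots, x_n)\, \frac{\partial}{\partial\theta} \prod_{i=1}^{n} f_{X_i}(x_i|\theta)\, dx_1 \ldots dx_n \\
&= \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} t_n(x_1, \ldots, x_n) \left[\frac{\partial}{\partial\theta} \ln \prod_{i=1}^{n} f_{X_i}(x_i|\theta)\right] \prod_{i=1}^{n} f_{X_i}(x_i|\theta)\, dx_1 \ldots dx_n \\
&= \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \left(t_n(x_1, \ldots, x_n) - g(\theta)\right) \left[\frac{\partial}{\partial\theta} \ln \prod_{i=1}^{n} f_{X_i}(x_i|\theta)\right] \prod_{i=1}^{n} f_{X_i}(x_i|\theta)\, dx_1 \ldots dx_n \\
&= E\left\{\left(t_n(X_1, \ldots, X_n) - g(\theta)\right) \frac{\partial}{\partial\theta} \ln \prod_{i=1}^{n} f_{X_i}(x_i|\theta)\right\} \\
&\le \left(E\left\{\left(t_n(X_1, \ldots, X_n) - E[t_n(X_1, \ldots, X_n)]\right)^2\right\}\right)^{\frac{1}{2}} \left(E\left\{\left(\frac{\partial}{\partial\theta} \ln \prod_{i=1}^{n} f_{X_i}(x_i|\theta)\right)^2\right\}\right)^{\frac{1}{2}} \\
&= \left(\operatorname{Var}(t_n(X_1, \ldots, X_n))\right)^{\frac{1}{2}} \left(nI(\theta)\right)^{\frac{1}{2}} .
\end{align*}
Note that the third equality holds since
$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(\theta) \left[\frac{\partial}{\partial\theta} \ln \prod_{i=1}^{n} f_{X_i}(x_i|\theta)\right] \prod_{i=1}^{n} f_{X_i}(x_i|\theta)\, dx_1 \ldots dx_n = 0$,
and the inequality follows from the Cauchy-Schwarz inequality. Squaring both
sides, we have $\operatorname{Var}\left(t_n(X_1, \ldots, X_n)\right) \ge \frac{(g'(\theta))^2}{nI(\theta)}$.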
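As a quick illustration of the bound, it can be applied to the estimator of Problem 3, using only results already derived there: $T_n$ is the sample mean of geometric random variables with mean $\theta$, so $g(\theta) = \theta$, and $I(\theta) = 1/(\theta^2 - \theta)$. The numbers for $\theta = 3$, $n = 500$ are the values used in that problem.

```latex
\begin{align*}
\frac{(g'(\theta))^2}{nI(\theta)} &= \frac{1}{n \cdot \frac{1}{\theta^2 - \theta}} = \frac{\theta^2 - \theta}{n}, \\
\operatorname{Var}(T_n) &= \frac{\operatorname{Var} X}{n} = \frac{\theta^2 - \theta}{n}, \\
\theta = 3,\ n = 500 &:\quad \operatorname{Var}(T_n) = \frac{6}{500} = 0.012 .
\end{align*}
```

The sample mean therefore meets the bound with equality in that example.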

Problem 5:

(a) For the following system:
\[
r = \alpha s + n
\]
where $s$ is a vector of $N$ symbols (known), $s \in \{-1, +1\}^N$, $n = n_R + j n_I$,
and both $n_R$ and $n_I$ are samples of iid zero-mean Gaussian noise with
variance $\frac{\sigma^2}{2}$. $\alpha \in \mathbb{R}$ is the parameter that we want to estimate.

i. Find the maximum-likelihood estimate of $\alpha$, that is $\hat{\alpha}_{ML}$, and show
that it is unbiased.
ii. Find the variance of estimation error, $V(\alpha)$.
iii. Find the lower bound on estimation error for unbiased estimators.
This is the Cramer-Rao Lower Bound (CRLB).
iv. Plot $V$ and CRLB as a function of $\sigma^2$ for $\alpha = 2$ and $N = 20$. [Generally,
we plot the performance against the so-called signal-to-noise ratio (SNR). To get
a similar plot, make your plots as a function of $\frac{1}{\sigma^2}$.]

(b) Repeat the same exercise as part (a) for the unknown $\theta$ in the system
model: $r = e^{j\theta} s + n$. You can use the function $\phi = \operatorname{Arg}(x)$, which gives
the phase of any complex $x$ s.t. $\phi \in [-\pi, \pi]$. You can simulate to find a
numerical solution if the closed-form solutions are not easily obtained.

Solution

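(a) In this part, we can assume $n = n_R$, which makes $r$ real as well: since $\alpha$
and $s$ are real, the only imaginary part of $r$ is due to the noise component $n_I$,
which can be removed before the estimator.
i. We have to find the MLE of $\alpha$. The likelihood function is:
\[
L(\alpha) = p(r|\alpha) = \left(\frac{1}{\sqrt{2\pi \frac{\sigma^2}{2}}}\right)^{N} e^{-\sum_{k=0}^{N-1} (r_k - \alpha s_k)^2 / \left(2 \frac{\sigma^2}{2}\right)}
\]
So the log-likelihood function is
\[
\ln L(\alpha) = -\frac{N}{2} \ln(\pi\sigma^2) - \frac{1}{\sigma^2} \sum_{k=0}^{N-1} (r_k - \alpha s_k)^2
\]
We want to find the $\alpha$ that maximizes the log-likelihood function. Taking partial
derivatives w.r.t. $\alpha$:
\begin{align*}
\frac{\partial}{\partial\alpha} \ln L(\alpha) &= \frac{2}{\sigma^2} \sum_{k=0}^{N-1} (r_k - \alpha s_k) s_k \\
&= \frac{2}{\sigma^2} \left(\sum_{k=0}^{N-1} r_k s_k - \alpha \sum_{k=0}^{N-1} s_k^2\right) \\
&= \frac{2}{\sigma^2} \left(\sum_{k=0}^{N-1} r_k s_k - \alpha \sum_{k=0}^{N-1} 1\right) \\
&= \frac{2}{\sigma^2} \left(\sum_{k=0}^{N-1} r_k s_k - N\alpha\right) \tag{3}
\end{align*}
\[
\frac{\partial^2}{\partial\alpha^2} \ln L(\alpha) = -\frac{2N}{\sigma^2} < 0
\]
Since $\sigma^2$ is always positive, the second derivative is always negative, implying
that we indeed have a maximum here. Setting (3) equal to zero and solving
for $\alpha$ gives:
\[
\hat{\alpha}_{MLE}(r) = \frac{1}{N} \sum_{k=0}^{N-1} r_k s_k \tag{4}
\]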
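ii. We first verify that the estimator is unbiased:
\begin{align*}
E\{\hat{\alpha}_{MLE}(r)\} &= E\left\{\frac{1}{N} \sum_{k=0}^{N-1} r_k s_k\right\} \\
&= \frac{1}{N} \sum_{k=0}^{N-1} s_k E\{r_k\} \\
&= \frac{1}{N} \sum_{k=0}^{N-1} \alpha s_k^2 \\
&= \frac{\alpha}{N} \sum_{k=0}^{N-1} 1 \\
&= \alpha \tag{5}
\end{align*}
where the second-to-last step holds because $s_k \in \{-1, +1\}$. Thus the estimator
is unbiased, that is, its mean estimation error is zero. So, the variance of the
estimation error is:
\begin{align*}
\operatorname{Var}\{\hat{\alpha}_{MLE} - \alpha\} &= E\left\{(\hat{\alpha}_{MLE}(r) - \alpha)^2\right\} \\
&= E\left\{\left(\frac{1}{N} \sum_{k=0}^{N-1} r_k s_k - \alpha\right)^2\right\} \tag{6} \\
&= E\left\{\left(\frac{1}{N} \sum_{k=0}^{N-1} (\alpha s_k^2 + n_k s_k) - \alpha\right)^2\right\} \\
&= E\left\{\left(\alpha + \frac{1}{N} \sum_{k=0}^{N-1} n_k s_k - \alpha\right)^2\right\} \\
&= \frac{1}{N^2} E\left\{\left(\sum_{k=0}^{N-1} n_k s_k\right)^2\right\} \\
&= \frac{1}{N^2} E\left\{\sum_{k=0}^{N-1} n_k^2 s_k^2 + \sum_{i,j:\, j \ne i} n_i n_j s_i s_j\right\} \\
&= \frac{1}{N^2} \left(\sum_{k=0}^{N-1} E\{n_k^2\} + \sum_{i,j:\, j \ne i} E\{n_i\} E\{n_j\} s_i s_j\right) \\
&= \frac{1}{N^2} \left(N \frac{\sigma^2}{2} + 0\right) \\
&= \frac{\sigma^2}{2N} \tag{7}
\end{align*}
Note that the variance of the estimation error is inversely proportional to the
vector size $N$. This is expected: the same $\alpha$ scales all the components, so
estimation should improve as the sample size increases.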
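iii. To find the lower bound on the mean square estimation error (or variance
of the estimation error, because the mean of the error is zero), we find the Fisher
information $I_r(\alpha)$, using (3):
\begin{align*}
I_r(\alpha) &= E\left\{\left(\frac{\partial}{\partial\alpha} \ln p(r|\alpha)\right)^2\right\} \\
&= \frac{4}{\sigma^4} E\left\{\left(\sum_{k=0}^{N-1} r_k s_k - N\alpha\right)^2\right\} \\
&= \frac{4N^2}{\sigma^4} E\left\{\left(\frac{1}{N} \sum_{k=0}^{N-1} r_k s_k - \alpha\right)^2\right\}
\end{align*}
where we note that the expectation is the same as in (6). Therefore, the result
of the expectation is given by (7):
\begin{align*}
I_r(\alpha) &= \frac{4N^2}{\sigma^4} \cdot \frac{\sigma^2}{2N} \\
&= \frac{2N}{\sigma^2}
\end{align*}
Note that we could have used:
\begin{align*}
I_r(\alpha) &= -E\left\{\frac{\partial^2}{\partial\alpha^2} \ln p(r|\alpha)\right\} \\
&= \frac{2N}{\sigma^2}
\end{align*}
which is the same result. For the numerator part of the CRLB, we already
found that this is an unbiased estimator of $\alpha$. Therefore,
\[
\psi(\alpha) = E\{\hat{\alpha}_{MLE}(r)\} = \alpha
\]
Thus,
\[
\psi'(\alpha) = 1
\]
Hence, the lower bound on the mean square estimation error is given by:
\[
\frac{(\psi'(\alpha))^2}{I_r(\alpha)} = \frac{\sigma^2}{2N} \tag{8}
\]
From part (ii), eq. (7), we see that the MLE achieves the lower bound. Note
that the noise variance for this derivation is $\frac{\sigma^2}{2}$, which brings the constant
2 into the result.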

iv. For $N = 20$, the plot of the CRLB vs $\frac{1}{\sigma^2}$ is given below. The variance of the
estimation error, obtained by simulating the estimator with sample size 100, is also
shown. The figure below shows both curves for different noise variances.

[Figure: CRLB for α estimation. Legend: CRLB; MLE simulation. x-axis: 1/σ² (0 to 20); y-axis: Variance of Estimation Error (0 to 0.025).]

Figure 3: The plot of the CRLB and the simulated estimator's error variance (sample
size 100) as a function of the inverse noise variance $\frac{1}{\sigma^2}$.
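A simulation of this kind can be sketched as follows. This is not the code behind Fig. 3: the number of Monte Carlo trials (5000 here rather than 100, to reduce scatter), the seed, the grid of $1/\sigma^2$ values and all variable names are assumptions made for illustration.

```matlab
% Sketch: Monte Carlo variance of the MLE of alpha, eq. (4), vs the CRLB.
rng(1);                               % fixed seed (arbitrary)
alpha  = 2;  N = 20;                  % values given in the problem
trials = 5000;                        % Monte Carlo runs per point (assumed)
invSigma2 = 1:20;                     % grid of 1/sigma^2 (assumed)
s = 2*(rand(N,1) > 0.5) - 1;          % known symbols in {-1,+1}

V = zeros(size(invSigma2));
for i = 1:numel(invSigma2)
    sigma2 = 1/invSigma2(i);
    % only the real noise part matters; its variance is sigma^2/2
    nR = sqrt(sigma2/2) * randn(N, trials);
    r  = alpha*repmat(s, 1, trials) + nR;
    alphaHat = (s.' * r) / N;         % eq. (4), one estimate per column
    V(i) = mean((alphaHat - alpha).^2);
end
crlb = 1 ./ (2*N*invSigma2);          % sigma^2/(2N), eq. (8)

plot(invSigma2, crlb, '-b', invSigma2, V, 'or');
xlabel('1/\sigma^2'); ylabel('Variance of Estimation Error');
legend('CRLB', 'MLE simulation');
```

With enough trials the simulated points should lie on the CRLB curve, as expected from (7) and (8); with only 100 trials, as in the handout, more scatter around the curve is to be expected.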

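(b) $r = e^{j\theta} s + n$
i. For the MLE, we first write the likelihood function and the log-likelihood
function. Each complex noise sample has density $\frac{1}{\pi\sigma^2} e^{-|n_k|^2/\sigma^2}$, so
\begin{align*}
L(\theta) = p(r|\theta) &= \left(\frac{1}{\pi\sigma^2}\right)^{N} e^{-\frac{1}{\sigma^2} \| r - s e^{j\theta} \|^2} \\
&= \left(\frac{1}{\pi\sigma^2}\right)^{N} e^{-\sum_{k=0}^{N-1} |r_k - s_k e^{j\theta}|^2 / \sigma^2}
\end{align*}
\[
\ln L(\theta) = -N \ln(\pi\sigma^2) - \frac{1}{\sigma^2} \sum_{k=0}^{N-1} |r_k - s_k e^{j\theta}|^2
\]
Note that $r_k$ is complex. In the equations below, $\Re$ denotes the real part, $\Im$
denotes the imaginary part and $^*$ indicates the complex conjugate.
\begin{align*}
\ln L(\theta) &= -N \ln(\pi\sigma^2) - \frac{1}{\sigma^2} \sum_{k=0}^{N-1} \left(|r_k|^2 + s_k^2 - 2\Re\{r_k^* s_k e^{j\theta}\}\right) \\
&= -N \ln(\pi\sigma^2) - \frac{1}{\sigma^2} \left(\sum_{k=0}^{N-1} |r_k|^2 + \sum_{k=0}^{N-1} s_k^2 - 2 \sum_{k=0}^{N-1} \Re\{r_k^* s_k e^{j\theta}\}\right) \\
&= -N \ln(\pi\sigma^2) - \frac{1}{\sigma^2} \sum_{k=0}^{N-1} |r_k|^2 - \frac{N}{\sigma^2} + \frac{2}{\sigma^2} \sum_{k=0}^{N-1} \Re\{r_k^* s_k e^{j\theta}\}
\end{align*}
The additive constants do not depend on $\theta$ and do not affect the maximizer.
Taking partial derivatives w.r.t. $\theta$:
\[
\frac{\partial}{\partial\theta} \ln L(\theta) = -\frac{2}{\sigma^2} \sum_{k=0}^{N-1} \Im\{r_k^* s_k e^{j\theta}\} \tag{9}
\]
\[
\frac{\partial^2}{\partial\theta^2} \ln L(\theta) = -\frac{2}{\sigma^2} \sum_{k=0}^{N-1} \Re\{r_k^* s_k e^{j\theta}\} \tag{10}
\]
Equating (9) to 0 and solving for $\theta$,
\begin{align*}
\frac{\partial}{\partial\theta} \ln L(\theta) &= -\frac{2}{\sigma^2} \sum_{k=0}^{N-1} \Im\{r_k^* s_k e^{j\theta}\} \\
&= -\frac{2}{\sigma^2}\, \Im\left\{e^{j\theta} \sum_{k=0}^{N-1} r_k^* s_k\right\} \\
&= 0
\end{align*}
A zero imaginary part requires the angle of the summation term to be $-\theta$.
Therefore, the required MLE, $\hat{\theta}_{MLE}(r)$, is:
\begin{align*}
\hat{\theta}_{MLE}(r) &= -\operatorname{Arg}\left(\sum_{k=0}^{N-1} r_k^* s_k\right) \\
&= -\operatorname{Arg}\left(\left(\sum_{k=0}^{N-1} r_k s_k\right)^{*}\right) \\
&= \operatorname{Arg}\left(\sum_{k=0}^{N-1} r_k s_k\right) \tag{11} \\
&= \operatorname{Arg}\left(r^T s\right) \tag{12}
\end{align*}
which indeed maximizes the log-likelihood function because plugging $\hat{\theta}_{MLE}$
into (10), we get:
\begin{align*}
\frac{\partial^2}{\partial\theta^2} \ln L(\hat{\theta}) &= -\frac{2}{\sigma^2} \sum_{k=0}^{N-1} \Re\{e^{j\hat{\theta}} r_k^* s_k\} \\
&= -\frac{2}{\sigma^2} \left|\sum_{k=0}^{N-1} r_k^* s_k\right| < 0
\end{align*}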

ii.
\begin{align*}
E\{\hat{\theta}\} &= E\left\{\operatorname{Arg}\left(\sum_{k=0}^{N-1} r_k s_k\right)\right\} \\
&= E\left\{\operatorname{Arg}\left(N e^{j\theta} + \sum_{k=0}^{N-1} n_k s_k\right)\right\} \\
&= E\left\{\operatorname{Arg}\left(e^{j\theta} + \frac{1}{N} \sum_{k=0}^{N-1} n_k s_k\right)\right\} \\
&= E\left\{\theta + \operatorname{Arg}\left(1 + e^{-j\theta} \frac{1}{N} \sum_{k=0}^{N-1} n_k s_k\right)\right\} \\
&= \theta
\end{align*}
The last step has the following justification: the noise term in the summation
has zero mean and is circularly symmetric. Therefore, the quantity inside the
Arg function is a complex Gaussian random variable with mean 1 whose phase is
distributed symmetrically about 0, so its mean phase is 0.

The estimator is therefore unbiased. The sample mean of the estimator,
computed with a sample size of 100, is given in Fig. 4. It also shows that the
estimator is unbiased. Calculating the variance of the estimation error
then needs the following evaluation:
\[
E\left\{\left(\operatorname{Arg}(w_n)\right)^2\right\}
\]
where $w_n = 1 + e^{-j\theta} \frac{1}{N} \sum_{k=0}^{N-1} n_k s_k$. However, this is unnecessarily
cumbersome. We compute the variance by simulation. The result is shown in
Fig. 5.
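Although the exact variance is awkward to evaluate, a first-order (high-SNR) approximation is easy and gives a useful sanity check for the simulation. The following is a sketch that assumes the noise term is small compared with 1; the shorthand $v$ is introduced here for illustration only.

```latex
\begin{align*}
v &\triangleq e^{-j\theta}\, \frac{1}{N} \sum_{k=0}^{N-1} n_k s_k, \qquad
\hat{\theta} - \theta = \operatorname{Arg}(1 + v) \approx \Im\{v\} \quad (|v| \ll 1), \\
\operatorname{Var}\{\Im\{v\}\} &= \frac{1}{N^2} \sum_{k=0}^{N-1} s_k^2\, \operatorname{Var}\left\{\Im\{e^{-j\theta} n_k\}\right\}
 = \frac{1}{N^2} \cdot N \cdot \frac{\sigma^2}{2} = \frac{\sigma^2}{2N} .
\end{align*}
```

Here the imaginary part of $e^{-j\theta} n_k$ has variance $\sigma^2/2$ because rotating circularly symmetric noise does not change its statistics. Under this approximation the simulated variance should approach $\sigma^2/(2N)$ for large $1/\sigma^2$ and can exceed it when the noise is strong.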

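iii. We now determine the CRLB for the mean square estimation error
in the unknown parameter $\theta$. From the numerical results of part (ii), we
find that the estimator is unbiased. Therefore, the numerator part of the
Information Inequality is 1. We calculate the denominator, that is, $I_r(\theta)$.
We make use of (10) here.
\begin{align*}
I_r(\theta) &= -E\left\{\frac{\partial^2}{\partial\theta^2} \ln p(r|\theta)\right\} \\
&= \frac{2}{\sigma^2} E\left\{\sum_{k=0}^{N-1} \Re\{r_k^* s_k e^{j\theta}\}\right\} \\
&= \frac{2}{\sigma^2} E\left\{\sum_{k=0}^{N-1} \Re\{(s_k e^{-j\theta} + n_k^*) s_k e^{j\theta}\}\right\} \\
&= \frac{2}{\sigma^2} E\left\{\sum_{k=0}^{N-1} \Re\{1 + n_k^* s_k e^{j\theta}\}\right\} \\
&= \frac{2}{\sigma^2} \sum_{k=0}^{N-1} 1 + \frac{2}{\sigma^2} \sum_{k=0}^{N-1} \Re\{s_k E\{n_k^*\} e^{j\theta}\} \\
&= \frac{2N}{\sigma^2} + 0 \\
&= \frac{2N}{\sigma^2}
\end{align*}
Hence, the CRLB for the variance of the estimation error is $\frac{\sigma^2}{2N}$.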

iv. For $N = 20$, the plot of the CRLB vs $\frac{1}{\sigma^2}$ is given below in Fig. 5. The
variance of the estimation error calculated with a sample size of 100 is also
plotted.

[Figure: sample mean of MLE of θ. Legend: Mean θ; known value = π/3. x-axis: σ² (0.1 to 1); y-axis: mean estimate with sample size 100.]

Figure 4: The plot of the sample mean of the estimator at different variance values;
the sample size is 100 in each case.

[Figure: CRLB for θ estimation. Legend: CRLB; MLE simulation. x-axis: 1/σ² (0 to 20); y-axis: Variance of Estimation Error (0 to 0.03).]

Figure 5: The plot of the CRLB and the simulated estimator's error variance as a
function of the inverse noise variance $\frac{1}{\sigma^2}$.
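The simulations behind plots like Figs. 4 and 5 can be sketched as follows. This is an illustrative reconstruction rather than the handout's code: the true value θ = π/3 is taken from the legend of Fig. 4, while the number of trials (5000 instead of 100), the seed, the grid of $1/\sigma^2$ values and the variable names are assumptions. MATLAB's angle function plays the role of Arg.

```matlab
% Sketch: Monte Carlo mean and error variance of the phase MLE, eq. (11).
rng(2);                               % fixed seed (arbitrary)
theta  = pi/3;  N = 20;               % theta as in the legend of Fig. 4
trials = 5000;                        % Monte Carlo runs per point (assumed)
invSigma2 = 1:20;                     % grid of 1/sigma^2 (assumed)
s = 2*(rand(N,1) > 0.5) - 1;          % known symbols in {-1,+1}

meanTheta = zeros(size(invSigma2));
V         = zeros(size(invSigma2));
for i = 1:numel(invSigma2)
    sigma2 = 1/invSigma2(i);
    % complex noise: real and imaginary parts each have variance sigma^2/2
    n = sqrt(sigma2/2) * (randn(N,trials) + 1j*randn(N,trials));
    r = exp(1j*theta) * repmat(s, 1, trials) + n;
    thetaHat = angle(s.' * r);        % Arg(sum_k r_k s_k), one per column
    meanTheta(i) = mean(thetaHat);
    V(i) = mean((thetaHat - theta).^2);
end
crlb = 1 ./ (2*N*invSigma2);          % sigma^2/(2N)

plot(invSigma2, crlb, '-b', invSigma2, V, 'or');
xlabel('1/\sigma^2'); ylabel('Variance of Estimation Error');
legend('CRLB', 'MLE simulation');
```

The sample mean in meanTheta should stay close to π/3, and the simulated variance should approach the CRLB as $1/\sigma^2$ grows.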