
EECS 229A

Spring 2007

Solutions to Homework 5
1. Problem 8.8 on pg. 258 of the text.
Solution:
Channel with uniformly distributed noise
Consider a probability distribution (π_i, i = 0, ±1, ±2) on X, where Σ_i π_i = 1; the factors
of 1/2 below come from the density of the noise Z, which is uniform on (−1, 1) and independent
of X. This choice of input distribution results in Y = X + Z having the density

f_Y(y) = (1/2) π_{−2}              if y ∈ (−3, −2)
         (1/2) (π_{−2} + π_{−1})   if y ∈ (−2, −1)
         (1/2) (π_{−1} + π_0)      if y ∈ (−1, 0)
         (1/2) (π_0 + π_1)         if y ∈ (0, 1)
         (1/2) (π_1 + π_2)         if y ∈ (1, 2)
         (1/2) π_2                 if y ∈ (2, 3)

The corresponding differential entropy h(Y ) equals the entropy of the probability distribution on 6 points given by
( (1/2) π_{−2}, (1/2)(π_{−2} + π_{−1}), (1/2)(π_{−1} + π_0), (1/2)(π_0 + π_1), (1/2)(π_1 + π_2), (1/2) π_2 ),

since f_Y is constant on each of the six unit-length intervals above.
The largest this can be is log 6, with equality achieved when

π_{−2} = π_0 = π_2 = 1/3,   π_{−1} = π_1 = 0 .
We also note that the conditional differential entropy h(Y | X) = h(Z) = log 2 does not depend
on the probability distribution of X. Hence the capacity of the channel is
C = log 6 − log 2 = log 3.
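As a quick numerical check of this answer (our own addition, not part of the original solution),
the Python sketch below evaluates I(X; Y) = h(Y) − h(Z) in bits directly from the
piecewise-constant density above; the function name and the choice of base-2 logarithms are ours.

import numpy as np

def mutual_information_bits(pi):
    """I(X;Y) in bits for an input distribution pi = (pi_{-2}, ..., pi_2) on
    {-2,-1,0,1,2}, with additive noise Z ~ Uniform(-1, 1).

    On each of the six unit intervals (-3,-2), ..., (2,3) the density of Y
    equals one of the six masses below, so h(Y) is their discrete entropy."""
    pi = np.asarray(pi, dtype=float)
    masses = 0.5 * np.array([
        pi[0],            # y in (-3, -2)
        pi[0] + pi[1],    # y in (-2, -1)
        pi[1] + pi[2],    # y in (-1,  0)
        pi[2] + pi[3],    # y in ( 0,  1)
        pi[3] + pi[4],    # y in ( 1,  2)
        pi[4],            # y in ( 2,  3)
    ])
    nz = masses[masses > 0]
    h_Y = -np.sum(nz * np.log2(nz))   # differential entropy of Y, in bits
    h_Z = 1.0                         # h(Z) = log2(2) for Z ~ Uniform(-1, 1)
    return h_Y - h_Z

print(mutual_information_bits([1/3, 0, 1/3, 0, 1/3]))   # log2(3), about 1.585 bits
print(mutual_information_bits([0.2] * 5))               # strictly smaller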
2. Problem 9.3 on pg. 291 of the text.
Solution:
Output power constraint
We would expect that the capacity of the channel is given by
C = max_{p(x)} I(X; Y)

where the maximum is taken over all distributions p(x) on X for which the resulting Y,
given by Y = X + Z with Z independent of X and Z ~ N(0, σ²), satisfies E[Y²] ≤ P. Assuming
this is true, we can write, for any choice of p(x), and assuming that P ≥ σ²,
I(X; Y) = h(Y) − h(Y | X)
        = h(Y) − (1/2) log 2πeσ²
   (a)  ≤ (1/2) log 2πeP − (1/2) log 2πeσ²
        = (1/2) log(P/σ²) ,
where step (a) comes from applying the output power constraint (together with the fact
that, for a given second moment, differential entropy is maximized by the Gaussian). If
P < σ² it is impossible to meet the output power constraint, so the capacity must be 0 in
this case. We thus get the bound C ≤ ((1/2) log(P/σ²))⁺, where (u)⁺ denotes max(u, 0) for
a real number u. This bound can be achieved when P ≥ σ² by choosing p(x) to be the
Gaussian distribution with mean 0 and variance P − σ². Hence we have

C = ((1/2) log(P/σ²))⁺ .                                    (1)

This discussion only depended on the assumption that

C ≤ max_{p(x)} I(X; Y) ,                                    (2)

since the achievability of the expression in equation (1) may be proved by a random coding
argument, as was done for the AWGN channel with the traditional input power constraint.
However, the truth of the bound in equation (2) follows immediately from equation (9.45)
on pg. 269 of the text, which was derived using Fano's inequality and applies irrespective
of the nature of the constraints imposed on the communication process.
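As a sanity check (ours, not from the text), the sketch below evaluates equation (1) and
verifies numerically that the choice X ~ N(0, P − σ²) meets the output power constraint with
equality; the helper name output_power_capacity is our own.

import numpy as np

rng = np.random.default_rng(0)

def output_power_capacity(P, sigma2):
    """Capacity in nats under the output power constraint E[Y^2] <= P,
    i.e. ((1/2) log(P / sigma^2))^+ as in equation (1)."""
    return max(0.0, 0.5 * np.log(P / sigma2))

# The achieving input: X ~ N(0, P - sigma^2) gives E[Y^2] = P, and the usual
# AWGN formula (1/2) log(1 + Var(X)/sigma^2) collapses to (1/2) log(P/sigma^2).
P, sigma2 = 4.0, 1.0
x = rng.normal(0.0, np.sqrt(P - sigma2), size=1_000_000)
z = rng.normal(0.0, np.sqrt(sigma2), size=1_000_000)
y = x + z
print(np.mean(y**2))                            # close to P = 4
print(0.5 * np.log(1 + (P - sigma2) / sigma2))  # = (1/2) log 4
print(output_power_capacity(P, sigma2))         # same value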
3. Problem 9.4 on pg. 291 of the text.
Solution:
Exponential noise channels
Strictly speaking, the claim of this problem is false, as we will see at the end of this
discussion. What was intended was to also impose the condition that the input be
nonnegative, in which case the claim is true.
We first compute the differential entropy of an exponential density. Suppose W is a
random variable having an exponential density with mean μ, i.e. the density of W is

(1/μ) e^{−w/μ} 1(w ≥ 0) .

Then
h(W) = log μ + 1 .
We next observe that among all nonnegative random variables having a density and
having mean μ, the exponential density with mean μ maximizes the differential entropy.
This is most easily seen by considering the relative entropy between the density and the
corresponding exponential density. Let f(x) be a density (with support on nonnegative
values) having mean μ and write

0 ≤ D( f(x) ‖ (1/μ) e^{−x/μ} 1(x ≥ 0) )
  = ∫_0^∞ f(x) log [ f(x) / ((1/μ) e^{−x/μ}) ] dx
  = −h(f(x)) + log μ + ∫_0^∞ (x/μ) f(x) dx
  = −h(f(x)) + log μ + 1 ,

which gives h(f(x)) ≤ log μ + 1, as claimed.
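As a small numerical check (ours), SciPy's closed-form differential entropies can be used to
compare several nonnegative densities of mean μ against the bound log μ + 1 (in nats); the
particular candidate densities chosen below are arbitrary.

import numpy as np
from scipy import stats

mu = 3.0
candidates = {
    "exponential, mean mu": stats.expon(scale=mu),
    "uniform on [0, 2*mu]": stats.uniform(loc=0.0, scale=2 * mu),
    "gamma, shape 2, mean mu": stats.gamma(a=2.0, scale=mu / 2.0),
}
print("upper bound log(mu) + 1 =", np.log(mu) + 1)
for name, dist in candidates.items():
    # .entropy() is the differential entropy in nats; .mean() confirms that
    # each candidate really has mean mu.
    print(name, float(dist.mean()), float(dist.entropy()))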
We may expect that the capacity of the channel is given by
max_{p(x)} I(X; Y) ,

where the maximum is taken over all input distributions p(x) having mean λ and support
on the nonnegative real numbers. Assuming that this is true, since the output will have
a density, we could write, for any choice of p(x),
I(X; Y) = h(Y) − h(Y | X)
        = h(Y) − (log μ + 1)
   (a)  ≤ (log(λ + μ) + 1) − (log μ + 1)
        = log(1 + λ/μ) ,

where in step (a) we have used the nonnegativity of Y , the fact that its mean is at most
λ + μ, the monotonicity of the logarithm function, and the result we just proved that
the exponential distribution of a given mean maximizes the differential entropy among all
nonnegative densities having that mean.
We now observe that this upper bound (which applies to all choices of input distribution
supported on the nonnegative real numbers and having mean at most λ) is actually
achievable. To see this, it is convenient to recall the Laplace transform of an exponential
distribution of mean μ:

∫_0^∞ e^{−sx} (1/μ) e^{−x/μ} dx = 1/(1 + μs) .
We now seek to identify the distribution of a nonnegative random variable that when
added to a random variable independent of it and having exponential distribution of
mean μ results in a random variable having exponential distribution of mean λ + μ. For
this, we have to solve for the unknown Laplace transform F(s) in the equation

F(s) · 1/(1 + μs) = 1/(1 + (λ + μ)s) .

This equation has a solution, given by

F(s) = μ/(λ + μ) + λ / [ (λ + μ)(1 + (λ + μ)s) ] .

The random variable whose distribution has this Laplace transform is a mixed random

variable: it takes value 0 with probability +


and conditioned on being strictly positive

(which happens with probability + ) it is exponentially distributed with mean + .


Note that its actual mean is then , which meets the constraint.
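A Monte Carlo sketch (our own) of this construction: sampling X from the mixture whose Laplace
transform was just computed and adding independent Exponential(μ) noise should produce an output
indistinguishable from an Exponential(λ + μ) random variable. The parameter values and the use
of a Kolmogorov-Smirnov comparison are our choices.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
lam, mu, n = 2.0, 1.0, 200_000    # lam = input mean constraint, mu = noise mean

# Capacity-achieving input: X = 0 with probability mu/(lam+mu), otherwise
# Exponential with mean lam + mu, so that E[X] = lam.
positive = rng.random(n) < lam / (lam + mu)
x = np.where(positive, rng.exponential(lam + mu, size=n), 0.0)
z = rng.exponential(mu, size=n)    # exponential noise of mean mu
y = x + z

print(x.mean())                    # close to lam = 2
print(y.mean())                    # close to lam + mu = 3
print(stats.kstest(y, "expon", args=(0, lam + mu)))   # Y vs Exponential(lam + mu)
print(np.log(1 + lam / mu))        # claimed capacity log(1 + lam/mu), in nats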

We have therefore proved that the capacity of the channel (assuming nonnegative inputs)
is indeed log(1 + λ/μ), as claimed. This depended on assuming firstly that the capacity of
the channel is bounded above by

max_{p(x)} I(X; Y) ,

where the maximum is taken over all input distributions p(x) having mean λ and support
on the nonnegative real numbers, and secondly that we can actually achieve the capacity
log(1 + λ/μ). The latter can be proved by a random coding argument. The former is a
consequence of equation (9.45) on pg. 269 of the text.
Finally, let us see why the original problem statement is wrong. If we were allowed to use
input random variables that take on negative values, say values that are −K or larger,
where K > 0, the problem becomes equivalent to the one where the inputs are forced to
take on only nonnegative values, but the input mean is allowed to be as big as K + λ.
In this case the channel capacity, as just proved, would be log(1 + (λ + K)/μ). But K can
be arbitrarily large, so without the nonnegativity constraint on the input the capacity of
the channel is ∞.
4. Problem 9.5 on pg. 291 of the text.
Solution:
Fading channel
We write
I(X; Y, V ) = I(X; V ) + I(X; Y | V ) = I(X; Y | V ) ,
where the second step comes from the independence of the fading from the input. We
also have
I(X; Y, V ) = I(X; Y ) + I(X; V | Y ) ≥ I(X; Y ) ,
using the nonnegativity of mutual information. It follows that
I(X; Y | V ) ≥ I(X; Y ) ,
as claimed.
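The inequality can also be confirmed on a toy discrete example. The sketch below (entirely our
own illustration, not the Gaussian fading model of the problem) takes X ~ Bernoulli(1/2), an
independent fading state V ~ Bernoulli(1/2), and a binary symmetric channel whose crossover
probability depends on V; it then evaluates both sides of the inequality exactly.

import numpy as np

def H(p):
    """Shannon entropy in bits of a probability vector p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

eps = {0: 0.5, 1: 0.0}             # BSC crossover: useless if V = 0, clean if V = 1
pXVY = {}                          # joint distribution p(x, v, y)
for x in (0, 1):
    for v in (0, 1):
        for y in (0, 1):
            flip = eps[v] if y != x else 1 - eps[v]
            pXVY[(x, v, y)] = 0.5 * 0.5 * flip

pXY = {(x, y): sum(pXVY[(x, v, y)] for v in (0, 1)) for x in (0, 1) for y in (0, 1)}
pY = [sum(pXY[(x, y)] for x in (0, 1)) for y in (0, 1)]
I_XY = H([0.5, 0.5]) + H(pY) - H(list(pXY.values()))   # I(X;Y)

I_XY_given_V = 0.0                 # I(X;Y|V) = sum_v p(v) I(X;Y|V=v)
for v in (0, 1):
    pxy_v = np.array([[pXVY[(x, v, y)] for y in (0, 1)] for x in (0, 1)]) / 0.5
    I_XY_given_V += 0.5 * (H(pxy_v.sum(axis=1)) + H(pxy_v.sum(axis=0)) - H(pxy_v.ravel()))

print(I_XY_given_V, ">=", I_XY)    # 0.5 >= 1 - H(0.25), about 0.189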
5. Problem 9.7 on pg. 292 of the text.
Solution:
Multipath Gaussian channel
The problem statement does not specify this, but it is implicitly being assumed that the
noise variables have zero mean.
(a) We have
Y = Y1 + Y2 = (X + Z1) + (X + Z2) = 2X + (Z1 + Z2) .

Note that Z1 + Z2 is independent of X. It is a Gaussian random variable with mean 0 and
variance 2(1 + ρ)σ². The input power constraint appears as E[(2X)²] ≤ 4P. The capacity of
the channel is therefore

C = (1/2) log( 1 + 2P / ((1 + ρ)σ²) ) .

(b) When ρ = 0, the capacity becomes (1/2) log(1 + 2P/σ²). In effect the noise power is
being halved.
When ρ = 1, the capacity becomes (1/2) log(1 + P/σ²). Since the noise in the two paths is
coherent, in effect the multipath system becomes a scaled version of a single-path system.
When ρ = −1, the capacity becomes ∞. The noise in the two paths cancels out exactly,
since Z1 + Z2 then has zero variance.
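The three cases can be read off numerically from the formula in part (a); the short sketch below
(ours, with an arbitrary choice of P and σ²) does exactly that.

import numpy as np

def multipath_capacity(P, sigma2, rho):
    """(1/2) log(1 + 2P / ((1 + rho) sigma^2)) in nats, from part (a);
    returns infinity when rho = -1 (the combined noise has zero variance)."""
    noise_var = (1 + rho) * sigma2
    return np.inf if noise_var == 0 else 0.5 * np.log(1 + 2 * P / noise_var)

P, sigma2 = 1.0, 1.0
for rho in (0.0, 1.0, -0.999, -1.0):
    print(rho, multipath_capacity(P, sigma2, rho))
# rho = 0 gives (1/2) log(1 + 2P/sigma^2), rho = 1 gives (1/2) log(1 + P/sigma^2),
# and the capacity grows without bound as rho -> -1.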
6. Problem 9.20 on pp. 296-297 of the text.
Solution:
Robust decoding
(a) The proof depends on the entropy power inequality, stated in equation (9.181) on pg.
298 of the text. Let X be a Gaussian random variable of mean 0 and variance P and
let Z be independent of X with distribution pZ (z). The entropy power inequality
for the differential entropy of X + Z reads

2^{2 h(X+Z)} ≥ 2^{2 h(X)} + 2^{2 h(Z)} .

Dividing through by 2^{2 h(Z)}, we get

2^{2 (h(X+Z) − h(Z))} ≥ 2^{2 h(X)} 2^{−2 h(Z)} + 1 .

Since E[Z²] = N, we have h(Z) ≤ (1/2) log 2πeN. Further, h(X) = (1/2) log 2πeP. We thus get

2^{2 (h(X+Z) − h(Z))} ≥ 1 + P/N .

This can be rearranged to read

h(X + Z) − h(Z) ≥ (1/2) log(1 + P/N) .
Thus we have found a specific input distribution (the Gaussian distribution with mean 0
and variance P) for which I(X; Y) = h(X + Z) − h(Z) is at least as big as (1/2) log(1 + P/N).
This proves that the capacity of the channel is bounded below by (1/2) log(1 + P/N).
(b) The random codebook can be listed as an array with 2^{nR} rows and n columns, the
entry X_i(m) in row m and column i being the i-th coordinate of the m-th codeword. Each
X_i(m) is Gaussian with mean zero and variance P, and all these random variables are
mutually independent. Let W denote the message transmitted; this is uniformly distributed
on {1, . . . , 2^{nR}} and is independent of the random codebook. The received vector
(Y_1, . . . , Y_n) then has coordinates given by Y_i = X_i(W) + Z_i, where (Z_1, . . . , Z_n)
are i.i.d. with marginal distribution p_Z(z) and are independent of W and the random
codebook. Let Ŵ denote the decoded message. It is determined by the minimum-distance rule

Ŵ = argmin_m Σ_{i=1}^n (Y_i − X_i(m))² ,

and since ties occur with probability zero, we can ignore the event that tie-breaking is
needed. The probability of error is P(Ŵ ≠ W). Note that this is automatically being
averaged over the transmitted message and the random codebook. (A simulation sketch of
this decoding rule appears after part (c) below.)
i. The probability of error conditioned on the noise realization being z can be written
as P(Ŵ ≠ W | Z = z), where Z denotes (Z_1, . . . , Z_n).
The suggested symmetry argument works as follows. Let U be any orthogonal n × n real
matrix. Given a random codebook, we can consider another one with the same number of
codewords, but with the m-th row given by (X̃_1(m), . . . , X̃_n(m)), where

(X̃_1(m), . . . , X̃_n(m))^T = U (X_1(m), . . . , X_n(m))^T .

By invariance of Euclidean distance under rotation, for the new codebook the probability
of error conditioned on the noise realization being (U z^T)^T is identical to the
probability of error under the old codebook conditioned on the noise realization being z.
However, the new codebook is statistically indistinguishable from the old codebook,
because the i.i.d. Gaussian codeword distribution is rotation invariant. Hence the
probability of error conditioned on the noise vector being z depends only on the
Euclidean norm ‖z‖.
ii. An error occurs if the vector X(W)^T + Z^T is closer to X(m)^T than to X(W)^T in
Euclidean distance for some m ≠ W; the closest such m is Ŵ. Let α > 1. By considering the
triangle defined by the three points X(W)^T, X(Ŵ)^T, and X(W)^T + Z^T in the
two-dimensional plane that they span, simple geometric arguments show that X(W)^T + αZ^T
is then also closer to X(Ŵ)^T than it is to X(W)^T. Together with part i, this implies
that the probability of error conditioned on the Euclidean norm of the noise vector is
monotonically increasing in this norm.
iii. By the weak law of large numbers, the probability that the norm of the noise vector Z
exceeds that of a Gaussian noise vector (Z′_1, . . . , Z′_n), each of whose coordinates has
mean zero and variance N′, where N′ > N, goes to 0. By the geometric argument of the
preceding part, it follows that the probability of error in the original case is
asymptotically no bigger than that in the Gaussian noise case (where the variance of the
Gaussian noise is strictly bigger than N).
iv. Since the probability of error in the Gaussian case (with noise variance N′ strictly
bigger than N but such that R < (1/2) log(1 + P/N′)) asymptotically goes to zero, it does
so also in the case we are considering.
(c) Every step in the discussion above goes through even if we only assume that the
noise process is stationary and ergodic. The only thing that needs to change is that
instead of a weak law of large numbers for the i.i.d. noise variables Zi , we need to
apply an ergodic theorem.
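The following Monte Carlo sketch (our own illustration, with arbitrary parameter choices)
implements the random Gaussian codebook and the minimum-distance decoder of part (b), driving
the channel with uniformly distributed noise of variance N as one example of a non-Gaussian
noise; with the rate well below (1/2) log(1 + P/N), decoding errors should be rare. The helper
name error_rate is ours.

import numpy as np

rng = np.random.default_rng(2)

def error_rate(n=64, M=256, P=1.0, N=0.1, trials=200):
    """Fraction of decoding errors for a fresh Gaussian codebook per trial,
    minimum-distance decoding, and additive Uniform noise of variance N."""
    a = np.sqrt(3 * N)                                          # Uniform(-a, a) has variance N
    errors = 0
    for _ in range(trials):
        codebook = rng.normal(0.0, np.sqrt(P), size=(M, n))     # X_i(m) ~ N(0, P)
        w = rng.integers(M)                                     # transmitted message
        y = codebook[w] + rng.uniform(-a, a, size=n)            # Y = X(W) + Z
        w_hat = np.argmin(np.sum((codebook - y) ** 2, axis=1))  # nearest codeword
        errors += (w_hat != w)
    return errors / trials

R = np.log2(256) / 64                      # rate in bits per channel use
print("R =", R, " vs  (1/2) log2(1 + P/N) =", 0.5 * np.log2(1 + 1.0 / 0.1))
print("estimated error rate:", error_rate())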
7. Problem 9.21 on pg. 298 of the text.
Solution:

Mutual information game


The claim
I(X; X + Z*) ≤ I(X*; X* + Z*)
(with notation as in the problem statement) was something we already proved in lecture,
when we argued that the Gaussian distribution with mean zero and variance P is
capacity-achieving for the AWGN channel subject to the input power constraint P.
The claim
I(X*; X* + Z*) ≤ I(X*; X* + Z)
was proved in the solution of the preceding problem, where we argued that for a memoryless
additive noise channel with noise variance N, when the input distribution is chosen to be
Gaussian with mean zero and variance P, the mutual information between the input and the
output is at least as large as it is in the case of Gaussian noise of variance N.
The implication is that the transmitter and the noise player are in Nash equilibrium when
each selects a Gaussian distribution with mean zero and variance equal to the respective
power constraint. If you do not understand what this means, look up the definition of
Nash equilibrium in any decent book on Game Theory, e.g. the book of Guillermo Owen
entitled Game Theory.
