ThB6.4
I. INTRODUCTION
The problem of communication at rates above capacity
was traditionally considered in the context of list decoding
(e.g. [6]). List decoding involves the point-to-point communication scenario. Our interest in the problem was motivated
by the following cooperative communication scenario.
Cooperative communication frequently takes the form of
one or more relays which facilitate the communications
between a source and a destination (see Fig. 1). Two
of the most common relay communication strategies (see
e.g. [3], [10]) are decode-and-forward (DF) and compress-and-forward (CF). While DF focuses on the case when
the channel to the relay is strong enough to enable it to
decode, CF involves scenarios when decoding at the relay is
not possible. With CF, the relay helps by retransmitting its
observed signal so that the destination may combine it with
its own channel observation and decode using both^1.
In this paper, we are interested in the following question. When the relay cannot decode (i.e., the communication is at a rate that exceeds the capacity of the channel from the source to the relay), perhaps it can use its knowledge of the code structure to compute an MMSE estimate of the transmitted codeword. The MMSE estimate would enjoy a lower variance in comparison to the noise in the original signal (Similar approaches
Amir Bennatan and A. Robert Calderbank are with the Program in
Applied and Computational Mathematics (PACM), Princeton University
(email: abn@princeton.edu, calderbk@Math.Princeton.EDU).
Shlomo Shamai (Shitz) is with the Department of Electrical
Engineering, Technion - Israel Institute of Technology (email:
sshlomo@ee.technion.ac.il).
1 See [3], [10] for a precise discussion of CF.
Fig. 1. A relay channel.
and Sason [22] to arbitrary binary-input symmetric-output channels, including binary-input AWGN channels. The results of Guo et al. [9], however, involve the derivative of I(X; Y) with respect to the SNR. None of the above-mentioned bounds has been proven to be tight, and thus their derivatives cannot straightforwardly be applied to obtain a bound on the MMSE. In this paper, we nonetheless extend the method of [22] to bound the MMSE. Our bounds are confined to regular LDPC codes (see [17]), and an analysis of irregular codes is deferred to future work.
We begin in Sec. II by introducing our notation and some background. In Sec. III we present our main results, namely bounds on the MMSE of LDPC codes. We also present a theorem bounding the MMSE of good codes, which is a variation of a theorem of Peleg et al. [15], and provide numeric results comparing bad LDPC codes to good codes. In Sec. IV we provide the proof of the validity of our second bound. Finally, Sec. V concludes the paper.
II. BACKGROUND AND NOTATION
A. General Notation and Definitions
Vectors are denoted in boldface (e.g. x) and scalar values in normal face (e.g. x). Random variables are upper cased (e.g. X) and their realizations are lower cased (e.g. x). a ⊕ b denotes the bitwise XOR between two binary variables, and a ⊕ b denotes the component-wise XOR between two binary vectors.
Y = √snr · X + N   (1)

where Y is the channel output, X is as defined above, snr ≥ 0 and N ~ N(0, 1).
Throughout the paper, we will also be interested in transmission of vectors X over the channel. That is,

Y = √snr · X + N   (2)

where X is an n-dimensional vector with BPSK-valued components, N is a Gaussian vector with independent components, N_i ~ N(0, 1), and Y is the channel output.
Throughout the paper, we follow the convention of [9]
and use upper-case and to refer to MMSE and SNR in a
In [9][Equation (17)] it is shown that,

(d/dsnr) I(X; Y) = (1/2) mmse(snr)   (3)

(1/n) BP(j, k, snr) = (1/n) DE(j, k, snr) + O(1/n)   (4)

X̂_i = (+1) · p_i + (−1) · (1 − p_i)   (5)
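Relation (3) can be checked numerically for the scalar BPSK-input AWGN channel (1). The sketch below is our own illustration (not part of [9]); it uses the closed forms I(snr) = log 2 − E[log(1 + e^{−2(snr+√snr N)})] and mmse(snr) = 1 − E[tanh²(snr + √snr N)], both standard for BPSK, and compares a numerical derivative of I against mmse/2:

```python
import numpy as np

# Gauss-Hermite quadrature for E[f(N)], N ~ N(0, 1)
nodes, weights = np.polynomial.hermite_e.hermegauss(201)
weights = weights / np.sqrt(2 * np.pi)

def E(f):
    return np.sum(weights * f(nodes))

def mutual_info(snr):
    # I(X; Y) in nats for BPSK input, Y = sqrt(snr) X + N
    return np.log(2) - E(lambda n: np.logaddexp(0, -2 * (snr + np.sqrt(snr) * n)))

def mmse(snr):
    # MMSE of estimating X from Y: 1 - E[tanh^2(snr + sqrt(snr) N)]
    return 1 - E(lambda n: np.tanh(snr + np.sqrt(snr) * n) ** 2)

snr, d = 0.8, 1e-4
numeric = (mutual_info(snr + d) - mutual_info(snr - d)) / (2 * d)
assert abs(numeric - mmse(snr) / 2) < 1e-6   # d/dsnr I = mmse/2, as in (3)
```

The quadrature order and difference step are illustrative; any sufficiently fine choice exhibits the same agreement.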
Each value p_i is a function of the random channel outputs. As such, it is a random variable. Richardson and Urbanke [17] suggested an efficient algorithm, called density evolution, to compute the distribution of this random variable. The algorithm can thus be extended to obtain the distribution of X̂_i, and of the mean square error of this estimate.
The precise application of density evolution requires a number of assumptions, which are justified in Appendix I. Relying on these assumptions, we obtain the following theorem. This theorem considers the average performance
4 Typically, at each iteration of belief-propagation, several different messages corresponding to each code bit are computed, one for each outgoing edge in the LDPC bipartite graph (see e.g. [7], [17] for a detailed description of the algorithm). The messages are different because of the extrinsic information rule. However, at the very last iteration of the algorithm, this rule need not be obeyed, and thus one value per code bit is available.
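The density-evolution computation described above can be approximated by Monte Carlo sampling of the message distributions ("population dynamics"). The following sketch is our own illustration for a (j, k)-regular ensemble under the all-one codeword assumption of Appendix I; the parameters j = 3, k = 6, snr = 0.5 and the population size are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
j, k, snr, pop, iters = 3, 6, 0.5, 100_000, 30

# All-one codeword assumption: channel LLRs are 2*sqrt(snr)*Y with Y = sqrt(snr) + N
ch = 2 * snr + 2 * np.sqrt(snr) * rng.standard_normal(pop)
v = ch.copy()                                   # variable-to-check messages
for _ in range(iters):
    # check-node update on k-1 messages drawn from the current population
    t = np.prod(np.tanh(rng.choice(v, (k - 1, pop)) / 2), axis=0)
    c = 2 * np.arctanh(np.clip(t, -0.999999, 0.999999))
    # variable-node update: channel LLR plus j-1 incoming check messages
    v = ch + np.sum(rng.choice(c, (j - 1, pop)), axis=0)
# final estimate uses all j edges (the extrinsic rule is dropped, cf. footnote 4)
llr = ch + np.sum(rng.choice(c, (j, pop)), axis=0)
x_hat = np.tanh(llr / 2)                        # soft estimate 2*p_i - 1, as in (5)
mse_bp = np.mean((1 - x_hat) ** 2)              # squared error given X_i = +1
mse_uncoded = np.mean((1 - np.tanh(ch / 2)) ** 2)
assert mse_bp <= mse_uncoded + 1e-3             # BP should not do worse than the raw channel
```

Note that the squared error per bit is bounded by 4, as used in Appendix I.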
(1/n) mmse(C^(n), snr) ≤ mmse(uncoded, snr) − (2/k) Σ_{p=1}^∞ [1 / (2p (2p − 1))] · (d/dsnr) g_p(snr)^k   (6)

where,

g_p(snr) ≜ E[tanh^{2p}(snr + √snr · N)],  N ~ N(0, 1)   (7)
(d/dsnr) g_p(snr)^k = k · g_p(snr)^{k−1} · E[ 2p (1 + N/(2√snr)) tanh^{2p−1}(snr + √snr N) sech²(snr + √snr N) ]   (8)
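Identity (8) follows by differentiating (7) under the expectation; it can be spot-checked numerically, as in this sketch of ours (the values of p, k and snr are arbitrary):

```python
import numpy as np

nodes, weights = np.polynomial.hermite_e.hermegauss(201)
weights = weights / np.sqrt(2 * np.pi)          # E[f(N)] ~ sum(weights * f(nodes))

def g(p, snr):
    # g_p(snr) = E[tanh^{2p}(snr + sqrt(snr) N)], as in (7)
    return np.sum(weights * np.tanh(snr + np.sqrt(snr) * nodes) ** (2 * p))

def dgk(p, k, snr):
    # right-hand side of (8)
    u = snr + np.sqrt(snr) * nodes
    inner = 2 * p * (1 + nodes / (2 * np.sqrt(snr))) \
        * np.tanh(u) ** (2 * p - 1) / np.cosh(u) ** 2
    return k * g(p, snr) ** (k - 1) * np.sum(weights * inner)

p, k, snr, d = 2, 6, 0.9, 1e-5
numeric = (g(p, snr + d) ** k - g(p, snr - d) ** k) / (2 * d)
assert abs(numeric - dgk(p, k, snr)) < 1e-8
```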
C. Good Codes
Fig. 2. Comparison between the MMSE of LDPC (2,4) codes and good codes of the same rate. (The figure plots 1/n MMSE against SNR, showing mmse(uncoded, snr) and (1/n) mmse(C^(n), snr).)
Λ_i = |2√snr · Y_i|,   Ω_i = { 0, Y_i > 0;   1, Y_i < 0;   0 or 1 w.p. 1/2, Y_i = 0 }   (9)

where |x| denotes the absolute value of x. Note that with the channel defined by (1), 2√snr · Y_i coincides with the log-likelihood-ratio (LLR) value computed by the LDPC decoder in its first decoding iteration [22]. Thus, our definitions of Λ_i and Ω_i coincide with those of [22]. We also define Ỹ_i = (Λ_i, Ω_i). Clearly, switching from Y to Ỹ involves no loss of information, and in the sequel we assume that the channel output is Ỹ.
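The claim that 2√snr · Y_i is the channel LLR follows directly from the Gaussian likelihoods of (1); a one-line check (our illustration):

```python
import numpy as np

def llr(y, snr):
    # log p(y | X = +1) - log p(y | X = -1) for Y = sqrt(snr) X + N, N ~ N(0, 1)
    return -0.5 * (y - np.sqrt(snr)) ** 2 + 0.5 * (y + np.sqrt(snr)) ** 2

y, snr = 1.3, 0.7
assert np.isclose(llr(y, snr), 2 * np.sqrt(snr) * y)
```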
Following [22], we also observe that Λ_i is independent of the transmitted X. Combined with the Markov relation {Λ_j}_{j≠i} — X — X_i — Y_i — Λ_i, this implies that the random variables Λ_i, i = 1, ..., n, are independent of one another.
Given a fixed Λ_i = λ_i, the transition probabilities from X_i^b (recall that X_i^b is the binary representation of the BPSK-valued X_i, see Sec. II-A) to Ω_i are equal to those of a binary-symmetric channel (BSC) with crossover probability p(λ_i), where,

p(λ) = (1/2) · (1 − tanh(λ/2))   (10)
We can thus define the noise Θ_i, which is a binary random variable, dependent on Λ_i and distributed as Θ_i ~ Bernoulli(p(Λ_i)), such that Ω_i = X_i^b ⊕ Θ_i. Using similar arguments to the one involving the components of Λ, the random variables Θ_i, i = 1, ..., n, are also independent of one another.
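The BSC equivalence of (9)-(10) can be checked by simulation: conditioned on Λ_i falling in a thin slice around λ, the empirical frequency of a flipped sign bit should approach p(λ) = ½(1 − tanh(λ/2)). A sketch of ours (λ = 2 and the slice width are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
snr, n = 0.7, 2_000_000
x = rng.choice([-1.0, 1.0], n)                  # BPSK-valued X_i
y = np.sqrt(snr) * x + rng.standard_normal(n)   # channel (1)
lam = np.abs(2 * np.sqrt(snr) * y)              # Lambda_i, eq (9)
omega = (y < 0).astype(int)                     # Omega_i (sign bit)
xb = (x < 0).astype(int)                        # binary representation X_i^b
theta = omega ^ xb                              # noise Theta_i = Omega_i xor X_i^b
sel = np.abs(lam - 2.0) < 0.05                  # condition on Lambda ~ 2
empirical = theta[sel].mean()
predicted = 0.5 * (1 - np.tanh(2.0 / 2))        # p(lambda), eq (10)
assert abs(empirical - predicted) < 0.015
```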
In the sequel, we will use normal face (rather than bold face) to correspond to uncoded transmission. That is, X and Y are defined as in Definition 4, and Ỹ, Λ, Ω, Θ are obtained from the scalar X and Y in exactly the same way as their boldface counterparts were obtained from X and Y.
We are now ready to examine I(C, snr).

I(C, snr) = I(X; Y) = I(X; Ỹ) = H(Ỹ) − H(Ỹ | X)   (11)

Considering the second term (normal-face quantities refer to the scalar channel),

H(Ỹ | X) = Σ_{i=1}^n H(Ỹ_i | X_i) = n H(Ỹ | X)
= n H(Ỹ) − n I(X; Ỹ)
= n H(Ỹ) − n C_B(snr)
(d)= n (H(Λ) + H(Ω | Λ)) − n C_B(snr)
(e)= n (H(Λ) + 1) − n C_B(snr)   (12)

where C_B(snr) = I(X; Ỹ) denotes the capacity of the BPSK-input AWGN channel, (d) follows since Ỹ = (Λ, Ω), and (e) follows since, unconditioned on X, Ω is an unbiased bit independent of Λ. Considering the first term,

H(Ỹ) (a)= H(Λ) + H(Ω | Λ) (b)= n H(Λ) + H(Ω | Λ)   (13)

where (b) follows by the independence of the components of Λ.
Combining (11), (12) and (13), we obtain

I(C, snr) = n C_B(snr) − n + H(Ω | Λ)

We next decompose H(Ω | Λ). Note that S = Ω H^T = Θ H^T, since X^b H^T = 0 for any codeword. As S is a deterministic function of Ω,

H(Ω | Λ) = H(Ω, S | Λ)   (14)
= H(Ω | S, Λ) + H(S | Λ)   (15)

Given S and Λ, Ω is uniformly distributed over a coset of the code, and thus

H(Ω | S, Λ) = n − r(H) = nR   (16)

where r(H) denotes the rank of the parity-check matrix H. Combining (14), (15) and (16) we obtain,

I(C, snr) = n C_B(snr) − n(1 − R) + H(S | Λ)   (17)

B. Analysis of H(S | Λ)

We now focus on the second term in (17). We assume the syndrome components S = [S_1, ..., S_L] are ordered as in Appendix II. Let S^(1) = [S_1, ..., S_{n/k}] denote the syndromes that correspond to the first submatrix in Fig. 3, and S^(2) = [S_{n/k+1}, ..., S_{jn/k}] denote the rest of the syndromes. Thus we may write,

H(S | Λ) = H(S^(1), S^(2) | Λ)
= H(S^(1) | Λ) + H(S^(2) | S^(1), Λ)   (18)

By the construction of the first submatrix (see Fig. 4),

S_i = ⊕_{m=(i−1)k+1}^{ik} Θ_m,   i = 1, ..., n/k   (19)

where ⊕ denotes modulo-2 sum. The syndrome components {S_i}_{i=1,...,n/k} are functions of distinct sets of independent random variables {Θ_m}, and thus they are independent. We therefore obtain,

H(S^(1) | Λ) = Σ_{i=1}^{n/k} H(S_i | Λ)   (20)

By [22][Equation (65)]^6,

H(S_i | Λ) = 1 − Σ_{p=1}^∞ [1 / (2p (2p − 1))] · (E[tanh^{2p}(Λ/2)])^k   (21)
6 A ln 2 factor that appears in [22][Equation (65)] was removed, because
in the context of our discussion, I(X; Y) is evaluated in nats rather than
bits as in [22].
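Equation (21) rests on the series expansion of the binary entropy function; in bits (i.e., before removing the ln 2 factor noted in footnote 6) it reads h((1 − t)/2) = 1 − (1/ln 2) Σ_{p≥1} t^{2p}/(2p(2p−1)), which is easy to verify numerically (our sketch):

```python
import math

def h2(q):
    # binary entropy in bits
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def series(t, terms=5000):
    # 1 - (1/ln 2) * sum_{p>=1} t^{2p} / (2p (2p-1))
    s = sum(t ** (2 * p) / (2 * p * (2 * p - 1)) for p in range(1, terms + 1))
    return 1 - s / math.log(2)

for t in (0.0, 0.3, 0.9):
    assert abs(h2((1 - t) / 2) - series(t)) < 1e-9
```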
We now observe that,

E[tanh^{2p}(Λ/2)]
(a)= E[tanh^{2p}(|2√snr Y| / 2)]
(b)= E[tanh^{2p}(|snr X + √snr N|)]
(c)= E[tanh^{2p}(|snr + √snr N|)]
(d)= E[tanh^{2p}(snr + √snr N)]
(e)= g_p(snr)   (22)

where (a) follows by (9), (b) follows by the channel equation (1), (c) follows by the symmetry of the distribution of N, and by the fact that X ∈ {±1}, (d) follows by the fact that tanh is an odd function, and (e) follows by the definition of g_p(snr) (7). Combining (20), (21) and (22), we obtain:

H(S^(1) | Λ) = (n/k) · [1 − Σ_{p=1}^∞ g_p(snr)^k / (2p (2p − 1))]   (23)

Combining (17), (18) and (23), we obtain,

I(C, snr) = n C_B(snr) + (n/k) · [1 − Σ_{p=1}^∞ g_p(snr)^k / (2p (2p − 1))] + H(S^(2) | S^(1), Λ) − n(1 − R)

Differentiating with respect to snr and multiplying by 2/n, we obtain,

(2/n) · (d/dsnr) I(C, snr) = 2 (d/dsnr) C_B(snr) − (2/k) Σ_{p=1}^∞ [1 / (2p (2p − 1))] (d/dsnr) g_p(snr)^k + (2/n) (d/dsnr) H(S^(2) | S^(1), Λ)   (24)

By Theorem 1, the left hand side of the above equality equals (1/n) mmse(C, snr). By Remark 1 and Theorem 1, the first term on the right hand side is equal to mmse(uncoded, snr). Thus, to complete the proof, we must show that the derivative of H(S^(2) | S^(1), Λ) is non-positive. This will be the focus of the following section.

C. Analysis of H(S^(2) | S^(1), Λ)

In this section, we show that H(S^(2) | S^(1), Λ) is a non-ascending function of snr, and thus its derivative is non-positive. We begin by defining,

f(λ_1, ..., λ_n) = H(S^(2) | S^(1), Λ = (λ_1, ..., λ_n))   (25)

We proceed by proving the following lemma,

Lemma 2: Let λ_1, ..., λ_n and λ′_1, ..., λ′_n be non-negative real values, such that λ′_i ≥ λ_i for all i = 1, ..., n. Then,

f(λ′_1, ..., λ′_n) ≤ f(λ_1, ..., λ_n)   (26)

Proof: Since S = Θ H^T (as discussed above), each syndrome is a sum of random variables from the set {Θ_1, ..., Θ_n}. The condition λ′_i ≥ λ_i, i = 1, ..., n, implies that the corresponding crossover probabilities satisfy p(λ′_i) ≤ p(λ_i) ≤ 1/2. We may thus write Θ_i = Θ̄_i ⊕ Θ̃_i, where Θ̄_i ~ Bernoulli(p(λ′_i)) and the Θ̃_i are binary random variables, independent of the Θ̄_i. Letting S̄ and S̃ denote the syndromes computed from Θ̄ and Θ̃, respectively, with these definitions we have S = S̄ ⊕ S̃, where S̄ and S̃ are independent of one another.
Finally, we are ready to examine the function f(·).

f(λ_1, ..., λ_n) = H(S^(2) | S^(1))
≥ H(S^(2) | S^(1), S̃^(1), S̃^(2))
= H(S^(2) ⊕ S̃^(2) | S^(1) ⊕ S̃^(1), S̃^(1), S̃^(2))
= H(S̄^(2) | S̄^(1), S̃^(1), S̃^(2))
= H(S̄^(2) | S̄^(1)) = f(λ′_1, ..., λ′_n)

where the last equality follows since S̄ is independent of S̃. Note that

H(S^(2) | S^(1), Λ) = E[f(Λ_1, ..., Λ_n)]   (27)

By (9) and (1), Λ_i = |2snr X_i + 2√snr N_i|. It is thus clear that with increasing snr, larger values of Λ_i have larger probability. Thus, at a first glance it would appear from (27) and Lemma 2 that H(S^(2) | S^(1), Λ) is a descending function of snr, as desired. However, Lemma 2 requires a simultaneous condition on all of the arguments of f(·) for (26) to be valid.
To obtain our desired result, we apply the following technique. Let F(·; snr) denote the cumulative distribution function (CDF) of Λ_i, for a fixed given value of snr (F(·) is independent of i because the components of Λ are identically distributed).
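The proof of Lemma 2 writes the noise under the larger reliabilities as a degraded version of the noise under the smaller ones. This is the standard degradation argument for binary-symmetric channels: for p′ ≤ p ≤ ½ there exists ε = (p − p′)/(1 − 2p′) such that Bernoulli(p′) ⊕ Bernoulli(ε) ~ Bernoulli(p). A quick simulation check (our illustration):

```python
import numpy as np

def extra_noise(p_small, p_big):
    # Bern(p') xor Bern(eps) ~ Bern(p) when p = p'(1 - eps) + (1 - p') eps
    return (p_big - p_small) / (1 - 2 * p_small)

rng = np.random.default_rng(2)
p_small, p_big, n = 0.1, 0.3, 1_000_000
eps = extra_noise(p_small, p_big)
z = (rng.random(n) < p_small) ^ (rng.random(n) < eps)
assert abs(z.mean() - p_big) < 0.002
```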
with bad ones. In fact, they are frequently capacity-approaching (see e.g. [5]) and therefore good in the sense of Definition 1. Thus, the focus of the discussions of [16], [13] is very different from ours.
APPENDIX I
APPLICATION OF DENSITY EVOLUTION
We now discuss two conditions, which need to be satisfied in order for density evolution to be valid. That is, for the distribution computed by density evolution to correctly correspond to the distribution of each of the p_i, as defined in Sec. III-A.
The first condition is known as the all-one codeword assumption. With density evolution, the distribution of each of the values p_i that is computed, is conditioned on the event that the transmitted codeword was the all-one codeword (assuming BPSK representation of signals, see Sec. II-A). In reality, however, the transmitted codeword may be different. Nevertheless, Richardson and Urbanke [17] have shown that the probability of error, when belief-propagation decoding is used, is the same for all transmitted codewords. Thus, an analysis of the algorithm for one particular case is sufficient. A simple modification of [17][Lemma 1] allows us to extend the all-one codeword assumption to an analysis of the mean-square estimation error as well.
The second condition involves loops. Density-evolution
becomes invalid if the neighborhood graph, according to
which pi is computed, contains a loop [17]. However,
the probability that this occurs for any particular pi , in a
randomly chosen code from an LDPC bipartite ensemble,
diminishes as O(1/n) (n is the block length). Furthermore,
by (5), the square estimation error is upper bounded by 4.
Thus, the mean-square error as predicted by density evolution
is correct up to an additive term of order O(1/n).
APPENDIX II
THE (j, k)-REGULAR GALLAGER LDPC ENSEMBLE
A Gallager (j, k)-regular LDPC matrix is defined for positive integers j and k. Its parity-check matrix is obtained [7] by combining j submatrices (see Fig. 3), constructed as follows. The first submatrix is given by Fig. 4. That is, in the ith row, columns (i − 1)·k + 1, ..., i·k are 1, and all the rest are zero. Each of the other submatrices (numbered 2, ..., j) is obtained by applying a permutation π_l, l = 2, ..., j, on the columns of the first submatrix. The Gallager ensemble contains all codes whose parity-check matrices were constructed this way, i.e., each one corresponds to some choice of permutations π_2, ..., π_j.
Examining this construction, it is easy to observe that the weight (number of ones) in each row is k, and the weight of each column is j. Each submatrix contains n/k rows. The design rate of the LDPC code defined by this parity-check matrix is R = 1 − j/k. The true rate (see [7]) is guaranteed to be at least R.
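The construction above is easy to express in code; the following sketch (our illustration) builds a parity-check matrix from the Gallager ensemble and verifies the stated row and column weights:

```python
import numpy as np

def gallager_H(n, j, k, rng):
    """Parity-check matrix of a Gallager (j, k)-regular LDPC code (n divisible by k)."""
    rows = n // k
    first = np.zeros((rows, n), dtype=np.uint8)
    for i in range(rows):
        first[i, i * k:(i + 1) * k] = 1          # row i covers columns ik+1, ..., (i+1)k
    # submatrices 2, ..., j are column permutations of the first
    blocks = [first] + [first[:, rng.permutation(n)] for _ in range(j - 1)]
    return np.vstack(blocks)

rng = np.random.default_rng(0)
n, j, k = 24, 3, 6
H = gallager_H(n, j, k, rng)
assert H.shape == (j * n // k, n)
assert (H.sum(axis=1) == k).all()                # each row has weight k
assert (H.sum(axis=0) == j).all()                # each column has weight j
# design rate R = 1 - j/k; the true rate is at least R since rank(H) <= j*n/k
```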
REFERENCES
Fig. 3. A Gallager (j, k)-regular parity-check matrix, composed of j submatrices.
Fig. 4. The first submatrix.
APPENDIX III
ANALYSIS OF F^{−1}(α; snr)
In this appendix, we prove that F^{−1}(α; snr), defined in Sec. IV-C, is a non-descending function of snr for fixed α. To do so, we first examine F(λ; snr) in the range λ ≥ 0. As noted in Sec. IV-C, by (9) and (1), Λ_i = |2snr X_i + 2√snr N_i|. Using similar arguments to the ones leading to (22), the distribution of Λ_i is identical to that of |2snr + 2√snr N|, where N ~ N(0, 1). Thus,

F(λ; snr) = Pr[Λ_i ≤ λ] = Q(√snr − λ/(2√snr)) − Q(√snr + λ/(2√snr))

Differentiating with respect to snr, we obtain

(d/dsnr) F(λ; snr) = −[1 / (2√(2π) snr^{3/2})] · exp(−(λ²/4 + snr²)/(2 snr)) · (λ cosh(λ/2) + 2 snr sinh(λ/2))

The above derivative is clearly non-positive in the range λ ≥ 0. Thus, F(λ; snr) is non-ascending as a function of snr in this range. We now turn to F^{−1}(α; snr), for fixed α. Let snr′ ≥ snr, λ = F^{−1}(α; snr) and λ′ = F^{−1}(α; snr′). We wish to show that λ′ ≥ λ.
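The closed form for F(λ; snr) and its monotonicity in snr can be spot-checked numerically; a sketch of ours (the sample size and tolerances are illustrative):

```python
import math

import numpy as np

def Q(x):
    # Gaussian tail function
    return 0.5 * (1 - math.erf(x / math.sqrt(2)))

def F(lam, snr):
    # CDF of Lambda = |2 snr + 2 sqrt(snr) N|, N ~ N(0, 1)
    r = math.sqrt(snr)
    return Q(r - lam / (2 * r)) - Q(r + lam / (2 * r))

rng = np.random.default_rng(3)
snr = 0.6
samples = np.abs(2 * snr + 2 * np.sqrt(snr) * rng.standard_normal(1_000_000))
for lam in (0.5, 1.0, 2.0):
    assert abs(F(lam, snr) - (samples <= lam).mean()) < 0.002
# F non-ascending in snr (for fixed lam) implies F^{-1}(alpha; snr) non-descending
assert F(1.0, 0.6) >= F(1.0, 0.9)
```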
The work of Shlomo Shamai was supported by the US-Israel Binational Science Foundation.