Channel Capacity
Peng-Hua Wang
Graduate Inst. of Comm. Engineering
National Taipei University
Chapter Outline
Chap. 7 Channel Capacity
7.1 Examples of Channel Capacity
7.2 Symmetric Channels
7.3 Properties of Channel Capacity
7.4 Preview of the Channel Coding Theorem
7.5 Definitions
7.6 Jointly Typical Sequences
7.7 Channel Coding Theorem
7.8 Zero-Error Codes
7.9 Fano's Inequality and the Converse to the Coding Theorem
Channel Model
Definition 1 (Discrete channel) A system consisting of an input
alphabet X, an output alphabet Y, and a probability transition matrix
p(y|x).
Definition 2 (Channel capacity) The information channel capacity of
a discrete memoryless channel is
C = max_{p(x)} I(X; Y),
where the maximum is taken over all possible input distributions p(x).
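As a numerical companion to Definition 2 (not part of the original slides), the sketch below computes I(X; Y) from an input pmf and a transition matrix, and approximates C for a binary-input channel by a grid search over p(x); the function names are illustrative.

import numpy as np

def mutual_information(p_x, p_y_given_x):
    """I(X;Y) in bits for input pmf p_x and transition matrix p_y_given_x[x, y]."""
    p_xy = p_x[:, None] * p_y_given_x          # joint pmf p(x, y)
    p_y = p_xy.sum(axis=0)                     # output marginal p(y)
    mask = p_xy > 0                            # skip zero terms: 0 log 0 = 0
    return float((p_xy[mask] *
                  np.log2(p_xy[mask] / (p_x[:, None] * p_y[None, :])[mask])).sum())

def capacity_binary_input(p_y_given_x, grid=10001):
    """Approximate C = max_{p(x)} I(X;Y) for a channel with |X| = 2."""
    best = 0.0
    for a in np.linspace(0.0, 1.0, grid):      # a = p(X = 0)
        best = max(best, mutual_information(np.array([a, 1 - a]), p_y_given_x))
    return best

# Sanity check on a binary symmetric channel with crossover 0.1:
P = np.array([[0.9, 0.1], [0.1, 0.9]])
print(round(capacity_binary_input(P), 3))      # ~0.531 = 1 - H(0.1)

A grid search suffices here because I(X; Y) is concave in p(x); for larger input alphabets the Blahut-Arimoto algorithm is the standard tool.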
Example 1 (Noiseless binary channel)
The output reproduces the input exactly, so
C = max I(X; Y) = 1 bit,
achieved by p(X = 0) = p(X = 1) = 1/2.
Example 2 (Noisy channel with nonoverlapping outputs)
p(X = 0) = π₀, p(X = 1) = π₁ = 1 - π₀
p(Y = 1) = π₀ p, p(Y = 2) = π₀ (1 - p), p = 1/2
p(Y = 3) = π₁ q, p(Y = 4) = π₁ (1 - q), q = 1/3
I(X; Y) = H(Y) - H(Y|X) = H(Y) - π₀ H(p) - π₁ H(q)
Noisy Typewriter
I(X; Y) = H(Y) - H(Y|X)
= H(Y) - Σ_x p(x) H(Y|X = x)
= H(Y) - Σ_x p(x) H(1/2)
= H(Y) - H(1/2)
≤ log 26 - 1 = log 13
C = max I(X; Y) = log 13
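As a check of this computation, assuming the usual noisy-typewriter model in which each of the 26 letters is received as itself or its successor with probability 1/2 each, the uniform input attains H(Y) - 1 = log 26 - 1 = log 13 bits:

import numpy as np

# Noisy typewriter: input letter x is received as x or x+1 (mod 26), each w.p. 1/2.
P = np.zeros((26, 26))
for x in range(26):
    P[x, x] = 0.5
    P[x, (x + 1) % 26] = 0.5

p_x = np.full(26, 1 / 26)                 # uniform input distribution
p_y = p_x @ P                             # output marginal (also uniform)
H_Y = -(p_y * np.log2(p_y)).sum()         # = log2(26)
H_Y_given_X = 1.0                         # H(1/2) = 1 bit for every input letter
print(H_Y - H_Y_given_X, np.log2(13))     # both ~3.70044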
Binary Symmetric Channel
I(X; Y) = H(Y) - H(Y|X)
= H(Y) - Σ_x p(x) H(Y|X = x)
= H(Y) - Σ_x p(x) H(p)
= H(Y) - H(p)
≤ 1 - H(p)
C = max I(X; Y) = 1 - H(p)
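A quick numerical confirmation of C = 1 - H(p) (a sketch; the crossover probability 0.11 is an arbitrary choice), showing that the maximizing input is uniform:

import numpy as np

p = 0.11                                               # BSC crossover probability (arbitrary)
H = lambda q: -q*np.log2(q) - (1-q)*np.log2(1-q)       # binary entropy function

best_I, best_a = 0.0, 0.0
for a in np.linspace(1e-6, 1 - 1e-6, 9999):            # a = p(X = 1)
    p1 = a*(1-p) + (1-a)*p                             # p(Y = 1)
    I = H(p1) - H(p)                                   # I(X;Y) = H(Y) - H(p)
    if I > best_I:
        best_I, best_a = I, a
print(best_a, best_I, 1 - H(p))                        # a ~ 0.5; the two capacities agree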
Binary Erasure Channel
I(X; Y) = H(Y) - H(Y|X)
= H(Y) - Σ_x p(x) H(Y|X = x)
= H(Y) - Σ_x p(x) H(α)
= H(Y) - H(α)
With p(X = 0) = π₀, we have H(Y) = (1 - α) H(π₀) + H(α), so I(X; Y) = (1 - α) H(π₀) ≤ 1 - α.
C = max I(X; Y) = 1 - α
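The same kind of check for the binary erasure channel (a sketch; α = 0.25 is an arbitrary choice): I(X; Y) = (1 - α) H(π₀) peaks at π₀ = 1/2 with value 1 - α.

import numpy as np

alpha = 0.25                                     # erasure probability (arbitrary)
pi0 = np.linspace(1e-6, 1 - 1e-6, 9999)          # candidate values of p(X = 0)
H = -(pi0 * np.log2(pi0) + (1 - pi0) * np.log2(1 - pi0))   # binary entropy H(pi0)
I = (1 - alpha) * H                              # I(X;Y) = (1 - alpha) H(pi0)
print(pi0[I.argmax()], I.max(), 1 - alpha)       # ~0.5  0.75  0.75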
Properties of Channel Capacity
1. C ≥ 0.
2. C ≤ log |X|.
3. C ≤ log |Y|.
Preview of the Channel Coding Theorem
For each (typical) input n-sequence, there are approximately 2^{nH(Y|X)} possible Y sequences, while the total number of (typical) Y sequences is approximately 2^{nH(Y)}. This set has to be divided into sets of size 2^{nH(Y|X)} corresponding to the different input X sequences. Hence at most 2^{n(H(Y)-H(Y|X))} = 2^{nI(X;Y)} distinguishable input sequences can be used.
7.5 Definitions
Communication Channel
A message W is drawn from the index set {1, 2, . . . , M} and encoded into a signal X^n(W) ∈ X^n, where n is the length of the signal. We then transmit the signal by using the channel n times, one symbol per use. From the channel output Y^n, the receiver forms an estimate Ŵ = g(Y^n).
If Ŵ ≠ W, an error occurs.
Definitions
Definition 3 (Discrete channel) A discrete channel, denoted by (X, p(y|x), Y), consists of two finite sets X and Y and a collection of probability mass functions p(y|x), one for each x ∈ X.
Definitions
Definition 5 ((M, n) code) An (M, n) code for the channel
(X, p(y|x), Y) consists of the following:
1. An index set {1, 2, . . . , M}.
2. An encoding function X^n : {1, 2, . . . , M} → X^n, yielding codewords X^n(1), . . . , X^n(M).
3. A decoding function g : Y^n → {1, 2, . . . , M}.
Definitions
Definition 6 (Conditional probability of error)
λ_i = Pr(g(Y^n) ≠ i | X^n = x^n(i)) = Σ_{y^n : g(y^n) ≠ i} p(y^n | x^n(i))
Definitions
Definition 7 (Maximal probability of error)
λ^(n) = max_{i ∈ {1, 2, . . . , M}} λ_i
Definition 8 (Average probability of error)
P_e^(n) = Pr(g(Y^n) ≠ W) = (1/M) Σ_{i=1}^{M} λ_i,
where the message W is drawn uniformly from the index set.
Definitions
Definition 9 (Rate) The rate R of an (M, n) code is
R = (log M) / n bits per transmission.
Jointly Typical Sequences
Definition (Jointly typical set) The set A_ε^(n) of jointly typical sequences {(x^n, y^n)} with respect to the distribution p(x, y) is
A_ε^(n) = { (x^n, y^n) ∈ X^n × Y^n :
|-(1/n) log p(x^n) - H(X)| < ε,
|-(1/n) log p(y^n) - H(Y)| < ε,
|-(1/n) log p(x^n, y^n) - H(X, Y)| < ε },
where p(x^n, y^n) = ∏_{i=1}^{n} p(x_i, y_i).
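The definition translates directly into a membership test. Below is a minimal Python sketch (names are illustrative) that checks the three conditions for a given pair of sequences:

import numpy as np

def is_jointly_typical(xs, ys, p_xy, eps):
    """Test whether (x^n, y^n) lies in A_eps^(n) for joint pmf p_xy[x, y] (logs base 2)."""
    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
    n = len(xs)
    H = lambda q: -(q[q > 0] * np.log2(q[q > 0])).sum()
    ex = -np.log2(p_x[xs]).sum() / n          # -(1/n) log p(x^n)
    ey = -np.log2(p_y[ys]).sum() / n          # -(1/n) log p(y^n)
    exy = -np.log2(p_xy[xs, ys]).sum() / n    # -(1/n) log p(x^n, y^n)
    return (abs(ex - H(p_x)) < eps and abs(ey - H(p_y)) < eps
            and abs(exy - H(p_xy.ravel())) < eps)

# A pair drawn i.i.d. from p(x, y) is jointly typical with high probability:
rng = np.random.default_rng(0)
p_xy = np.array([[0.4, 0.1], [0.1, 0.4]])
flat = rng.choice(4, size=1000, p=p_xy.ravel())
print(is_jointly_typical(flat // 2, flat % 2, p_xy, eps=0.1))   # True w.h.p.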
Joint AEP
Theorem 1 (Joint AEP) Let (X^n, Y^n) be sequences of length n
drawn i.i.d. according to p(x^n, y^n) = ∏_{i=1}^{n} p(x_i, y_i). Then:
1. Pr((X^n, Y^n) ∈ A_ε^(n)) → 1 as n → ∞.
2. |A_ε^(n)| ≤ 2^{n(H(X,Y)+ε)}.
3. If (X̃^n, Ỹ^n) ~ p(x^n) p(y^n) [i.e., X̃^n and Ỹ^n are independent with
the same marginals as p(x^n, y^n)], then
Pr((X̃^n, Ỹ^n) ∈ A_ε^(n)) ≤ 2^{-n(I(X;Y)-3ε)},
and, for sufficiently large n,
Pr((X̃^n, Ỹ^n) ∈ A_ε^(n)) ≥ (1 - ε) 2^{-n(I(X;Y)+3ε)}.
Joint AEP
Theorem 2 (Joint AEP) 1. Pr((X^n, Y^n) ∈ A_ε^(n)) → 1 as n → ∞.
Proof. Given ε > 0, define the events
A = { |-(1/n) log p(X^n) - H(X)| ≥ ε },
B = { |-(1/n) log p(Y^n) - H(Y)| ≥ ε },
C = { |-(1/n) log p(X^n, Y^n) - H(X, Y)| ≥ ε }.
Then, by the weak law of large numbers, there exist n1, n2, n3 such that
Pr(A) < ε/3, Pr(B) < ε/3, and Pr(C) < ε/3 for n > max{n1, n2, n3}. Hence
Pr((X^n, Y^n) ∈ A_ε^(n)) = Pr(A^c ∩ B^c ∩ C^c) ≥ 1 - Pr(A) - Pr(B) - Pr(C) > 1 - ε
for n > max{n1, n2, n3}. □
Joint AEP
Theorem 3 (Joint AEP) 2. |A_ε^(n)| ≤ 2^{n(H(X,Y)+ε)}.
Proof.
1 = Σ_{(x^n, y^n)} p(x^n, y^n) ≥ Σ_{(x^n, y^n) ∈ A_ε^(n)} p(x^n, y^n) ≥ |A_ε^(n)| 2^{-n(H(X,Y)+ε)}.
Thus,
|A_ε^(n)| ≤ 2^{n(H(X,Y)+ε)}. □
Joint AEP
Theorem 4 (Joint AEP) 3. If (X̃^n, Ỹ^n) ~ p(x^n) p(y^n) [i.e., X̃^n and Ỹ^n are independent with the same marginals as X^n and Y^n], then
Pr((X̃^n, Ỹ^n) ∈ A_ε^(n)) ≤ 2^{-n(I(X;Y)-3ε)},
and, for sufficiently large n,
Pr((X̃^n, Ỹ^n) ∈ A_ε^(n)) ≥ (1 - ε) 2^{-n(I(X;Y)+3ε)}.
Joint AEP
Proof.
Pr((X̃^n, Ỹ^n) ∈ A_ε^(n)) = Σ_{(x^n, y^n) ∈ A_ε^(n)} p(x^n) p(y^n)
≤ 2^{n(H(X,Y)+ε)} · 2^{-n(H(X)-ε)} · 2^{-n(H(Y)-ε)} = 2^{-n(I(X;Y)-3ε)}.
For the lower bound: for sufficiently large n, Pr(A_ε^(n)) ≥ 1 - ε, and
p(x^n, y^n) ≤ 2^{-n(H(X,Y)-ε)} for every (x^n, y^n) ∈ A_ε^(n), so
1 - ε ≤ Pr(A_ε^(n)) ≤ |A_ε^(n)| 2^{-n(H(X,Y)-ε)}, i.e.,
|A_ε^(n)| ≥ (1 - ε) 2^{n(H(X,Y)-ε)}.
Therefore,
Pr((X̃^n, Ỹ^n) ∈ A_ε^(n)) = Σ_{(x^n, y^n) ∈ A_ε^(n)} p(x^n) p(y^n)
≥ (1 - ε) 2^{n(H(X,Y)-ε)} · 2^{-n(H(X)+ε)} · 2^{-n(H(Y)+ε)} = (1 - ε) 2^{-n(I(X;Y)+3ε)}. □
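The theorem can be observed empirically. The following Monte Carlo sketch (all parameters are arbitrary choices) draws dependent pairs from p(x, y) and independent pairs from p(x) p(y), and counts how often each lands in A_ε^(n):

import numpy as np

rng = np.random.default_rng(1)
p_xy = np.array([[0.45, 0.05], [0.05, 0.45]])   # joint pmf (arbitrary choice)
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
H = lambda q: -(q[q > 0] * np.log2(q[q > 0])).sum()
I_xy = H(p_x) + H(p_y) - H(p_xy.ravel())        # I(X;Y) ~ 0.531 bits
n, eps, trials = 100, 0.2, 2000

def typical(xs, ys):
    ex = -np.log2(p_x[xs]).sum() / n
    ey = -np.log2(p_y[ys]).sum() / n
    exy = -np.log2(p_xy[xs, ys]).sum() / n
    return (abs(ex - H(p_x)) < eps and abs(ey - H(p_y)) < eps
            and abs(exy - H(p_xy.ravel())) < eps)

hits_dep = hits_ind = 0
for _ in range(trials):
    flat = rng.choice(4, size=n, p=p_xy.ravel())         # (X^n, Y^n) ~ p(x, y), i.i.d.
    hits_dep += typical(flat // 2, flat % 2)
    hits_ind += typical(rng.choice(2, size=n, p=p_x),    # independent pair with
                        rng.choice(2, size=n, p=p_y))    # the same marginals
print(hits_dep / trials)   # part 1: tends to 1 as n grows (roughly 0.96 here)
print(hits_ind / trials)   # part 3: exponentially small in n*I(X;Y); essentially 0 here

With these parameters the dependent pairs are typical most of the time while the independent pairs essentially never are, consistent with the exponential bounds of part 3.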
Channel Coding Theorem
Theorem 5 (Channel coding theorem) For every rate R < C, there exists a sequence of (2^{nR}, n) codes with maximal probability of error λ^(n) → 0.
Conversely, any sequence of (2^{nR}, n) codes with λ^(n) → 0 must have
R ≤ C.
Two directions:
• R < C ⇒ achievable.
• Achievable ⇒ R ≤ C.
Main ideas follow.
Random Code
Generate a (2^{nR}, n) code at random according to a fixed distribution
p(x). That is, the 2^{nR} codewords have the distribution
p(x^n) = ∏_{i=1}^{n} p(x_i).
Arrange the codewords as the rows of a matrix:
C = [ x_1(1)       x_2(1)       · · ·  x_n(1)
      x_1(2)       x_2(2)       · · ·  x_n(2)
      ...
      x_1(2^{nR})  x_2(2^{nR})  · · ·  x_n(2^{nR}) ]
The code C is revealed to both sender and receiver. Both sender and
receiver are also assumed to know the channel transition matrix p(y|x).
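In code, generating such a codebook is essentially one line (a sketch; block length, rate, and p(x) are arbitrary choices):

import numpy as np

rng = np.random.default_rng(0)
n, R = 20, 0.4                        # block length and rate (arbitrary)
M = int(2 ** (n * R))                 # 2^{nR} codewords
p_x = np.array([0.5, 0.5])            # fixed input distribution p(x)

# Codebook matrix: row w is the codeword x^n(w), entries drawn i.i.d. ~ p(x)
codebook = rng.choice(len(p_x), size=(M, n), p=p_x)
print(codebook.shape)                 # (256, 20)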
Random Code
There are |X|^{n·2^{nR}} different codes. The probability of generating a particular code C is
Pr(C) = ∏_{w=1}^{2^{nR}} ∏_{i=1}^{n} p(x_i(w)).
A message W is chosen uniformly from {1, 2, . . . , 2^{nR}}, and the codeword X^n(W) is sent. The receiver observes Y^n according to
P(y^n | x^n(w)) = ∏_{i=1}^{n} p(y_i | x_i(w)).
The receiver declares that the message Ŵ was sent if (X^n(Ŵ), Y^n) is jointly typical and there is no other index W′ ≠ Ŵ such that (X^n(W′), Y^n) is jointly typical; otherwise an error is declared.
If Ŵ ≠ W, an error occurs. Let E be the event {Ŵ ≠ W}.
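Putting the pieces together, here is a minimal end-to-end sketch of the random-coding argument on a binary symmetric channel (all parameters are arbitrary choices): generate a random codebook, send a uniform message, and decode by joint typicality. For uniform binary inputs the two marginal typicality conditions hold exactly, so only the joint condition needs testing.

import numpy as np

rng = np.random.default_rng(2)
n, R, p, eps = 200, 0.05, 0.05, 0.15       # arbitrary; R is well below C = 1 - H(0.05)
M = int(2 ** (n * R))                      # 2^{nR} = 1024 codewords

p_xy = np.array([[0.5 * (1 - p), 0.5 * p],     # joint pmf p(x, y) for uniform input
                 [0.5 * p, 0.5 * (1 - p)]])
Hxy = -(p_xy * np.log2(p_xy)).sum()            # H(X, Y) = 1 + H(p)

errors, trials = 0, 200
for _ in range(trials):
    code = rng.integers(0, 2, size=(M, n))     # fresh random codebook, p(x) uniform
    w = rng.integers(M)                        # uniform message
    y = code[w] ^ (rng.random(n) < p).astype(int)   # BSC output
    # joint-typicality decoding: keep indices whose joint empirical entropy fits
    exy = -np.log2(p_xy[code, y]).sum(axis=1) / n
    hits = np.flatnonzero(np.abs(exy - Hxy) < eps)
    errors += not (len(hits) == 1 and hits[0] == w)
print(errors / trials)     # small (a few percent here); decays as n grows with R < C

The observed error is dominated by the true codeword occasionally failing the typicality test; false matches from the other 2^{nR} - 1 independent codewords are exponentially rare, exactly as the analysis below shows.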
Probability of Error
Averaged over all codebooks,
Pr(E) = Σ_C Pr(C) P_e^(n)(C)
= Σ_C Pr(C) (1/2^{nR}) Σ_{w=1}^{2^{nR}} λ_w(C)
= (1/2^{nR}) Σ_{w=1}^{2^{nR}} Σ_C Pr(C) λ_w(C).
By the symmetry of the code construction, Σ_C Pr(C) λ_w(C) does not depend on w. Therefore,
Pr(E) = Σ_C Pr(C) λ_1(C) = Pr(E | W = 1).
Probability of Error
Define the events
E_i = {(X^n(i), Y^n) ∈ A_ε^(n)}, i ∈ {1, 2, . . . , 2^{nR}},
where Y^n is the channel output when the first codeword X^n(1) was sent.
E_1^c: the transmitted codeword and the received sequence are not
jointly typical.
Then
Pr(E | W = 1) ≤ P(E_1^c | W = 1) + Σ_{i=2}^{2^{nR}} P(E_i | W = 1).
By the joint AEP, P(E_1^c | W = 1) ≤ ε for sufficiently large n,
and since X^n(i) and Y^n are independent for i ≠ 1,
P(E_i | W = 1) ≤ 2^{-n(I(X;Y)-3ε)}.
We have
Pr(E | W = 1) ≤ ε + (2^{nR} - 1) 2^{-n(I(X;Y)-3ε)} ≤ ε + 2^{-n(I(X;Y)-R-3ε)}.
If R < I(X; Y) - 3ε, then for sufficiently large n,
Pr(E | W = 1) ≤ 2ε.
What do we need? If R < C, we can choose p(x) and ε so that this condition holds, and then λ^(n) → 0.
Choose p(x) such that I(X; Y) is maximized, that is, such that I(X; Y)
achieves the channel capacity C. Then the condition R < I(X; Y) - 3ε
becomes R < C - 3ε, and the average error over codebooks is < 2ε.
Hence there exists at least one codebook C* with
Pr(E | C*) = (1/2^{nR}) Σ_{i=1}^{2^{nR}} λ_i(C*) ≤ 2ε,
which implies that the maximal error probability of the better half of the
codewords is less than 4ε.
(Analogy: there are 10 students, and their average score is 40. Then the highest score is at least 40. In the same way, since the average of Pr(E) over codebooks is at most 2ε, the best codebook is at least this good.)
We throw away the worst half of the codewords in the best codebook C*. The remaining half must all have λ_i ≤ 4ε (otherwise the average would exceed 2ε). Reindexing them yields a (2^{nR-1}, n) code of rate R - 1/n ≈ R.
Summary. If R < C, then for sufficiently large n there exists a code of rate arbitrarily close to R whose maximal probability of error satisfies λ^(n) ≤ 4ε.
No error ⇒ R ≤ C
Suppose Ŵ = W with zero error, so H(W | Y^n) = 0. With W uniform on {1, 2, . . . , 2^{nR}},
nR = H(W) = H(W | Y^n) + I(W; Y^n) = I(W; Y^n) ≤ I(X^n; Y^n) ≤ Σ_{i=1}^{n} I(X_i; Y_i) ≤ nC,
so R ≤ C. The bound I(X^n; Y^n) ≤ nC is Lemma 1 below.
No error ⇒ R ≤ C
Lemma 1 Let Y^n be the result of passing X^n through a discrete
memoryless channel of capacity C. Then for all p(x^n),
I(X^n; Y^n) ≤ nC.
Proof.
I(X^n; Y^n) = H(Y^n) - H(Y^n | X^n)
= H(Y^n) - Σ_{i=1}^{n} H(Y_i | Y_1, . . . , Y_{i-1}, X^n)
= H(Y^n) - Σ_{i=1}^{n} H(Y_i | X_i)   (memoryless channel)
≤ Σ_{i=1}^{n} H(Y_i) - Σ_{i=1}^{n} H(Y_i | X_i)
= Σ_{i=1}^{n} I(X_i; Y_i) ≤ nC. □
Fano's Inequality
Theorem 6 (Fano's inequality) Let X and W have the same sample
space X, with |X| = M, and let P_e = Pr(X ≠ W). Then
H(X | W) ≤ H(P_e) + P_e log(M - 1).
Fano's Inequality
Proof. We will prove that H(X | W) - H(P_e) - P_e log(M - 1) ≤ 0. Write
H(X | W) = Σ_x Σ_w p(x, w) log (1 / p(x|w))
= Σ_x Σ_{w≠x} p(x, w) log (1 / p(x|w)) + Σ_x Σ_{w=x} p(x, w) log (1 / p(x|w)).
Since P_e = Σ_x Σ_{w≠x} p(x, w) and 1 - P_e = Σ_x Σ_{w=x} p(x, w),
P_e log(M - 1) = Σ_x Σ_{w≠x} p(x, w) log(M - 1),
H(P_e) = Σ_x Σ_{w≠x} p(x, w) log (1 / P_e) + Σ_x Σ_{w=x} p(x, w) log (1 / (1 - P_e)).
Fano's Inequality
Proof (cont.)
H(X | W) - P_e log(M - 1) - H(P_e)
= Σ_x Σ_{w≠x} p(x, w) log ( P_e / ((M - 1) p(x|w)) )
+ Σ_x Σ_{w=x} p(x, w) log ( (1 - P_e) / p(x|w) )
≤ log [ Σ_x Σ_{w≠x} p(x, w) P_e / ((M - 1) p(x|w))
+ Σ_x Σ_{w=x} p(x, w) (1 - P_e) / p(x|w) ]   (by concavity of the logarithm)
= log [ (P_e / (M - 1)) Σ_x Σ_{w≠x} p(w) + (1 - P_e) Σ_x Σ_{w=x} p(w) ]
= log [P_e + (1 - P_e)] = 0. □
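Fano's inequality is easy to spot-check numerically on random joint distributions (a sketch; the alphabet size is arbitrary):

import numpy as np

rng = np.random.default_rng(3)
M = 5
for _ in range(1000):
    p = rng.random((M, M)); p /= p.sum()          # random joint pmf p(x, w)
    Pe = p.sum() - np.trace(p)                    # Pe = Pr(X != W)
    p_w = p.sum(axis=0)                           # marginal of W
    cond = p / p_w                                # p(x|w); columns indexed by w
    H_X_given_W = -(p * np.log2(cond)).sum()      # H(X|W)
    H_Pe = -Pe*np.log2(Pe) - (1-Pe)*np.log2(1-Pe)
    assert H_X_given_W <= H_Pe + Pe*np.log2(M - 1) + 1e-12
print("Fano's bound held in all trials")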
Fano's Inequality
Corollary 1
1. H(X | W) ≤ 1 + P_e log(M - 1), since H(P_e) ≤ 1.
2. P_e ≥ (H(X | W) - 1) / log(M - 1).
3. If X → Y → W forms a Markov chain (e.g., W = g(Y) is an estimate of X from Y), then H(X | Y) ≤ H(X | W), so the bounds also hold with H(X | Y).
Remark.
1. If X → Y → Z, then
I(X; Z) ≤ I(X; Y).
Proof.
I(X; Z) - I(X; Y)
= H(X) - H(X | Z) - [H(X) - H(X | Y)] = H(X | Y) - H(X | Z)
= Σ_x Σ_y p(x, y) log (1 / p(x|y)) - Σ_x Σ_z p(x, z) log (1 / p(x|z))
= Σ_x Σ_y Σ_z p(x, y, z) log (1 / p(x|y)) - Σ_x Σ_y Σ_z p(x, y, z) log (1 / p(x|z))
= Σ_x Σ_y Σ_z p(x, y, z) log ( p(x|z) / p(x|y) )
≤ log ( Σ_x Σ_y Σ_z p(x, y, z) p(x|z) / p(x|y) )   (by concavity of the logarithm)
Since X → Y → Z, we have
p(x, y, z) = p(x, y) p(z | x, y) = p(x, y) p(z | y) = p(x, y) p(y, z) / p(y),
and
p(x, y, z) p(x|z) / p(x|y) = [p(x, y) p(y, z) / p(y)] · [p(x, z) p(y) / (p(z) p(x, y))]
= p(x, z) p(y, z) / p(z).
Therefore,
Σ_x Σ_y Σ_z p(x, z) p(y, z) / p(z) = Σ_x Σ_z (p(x, z) / p(z)) Σ_y p(y, z)
= Σ_x Σ_z p(x, z) = 1,
so I(X; Z) - I(X; Y) ≤ log 1 = 0. □
Remark.
1. If X → Y → Z, then
I(X; Z) ≤ min{ I(X; Y), I(Y; Z) } and H(X | Y) ≤ H(X | Z).
2. If X → Y → Z → W, then
I(X; Z) + I(Y; W) ≤ I(X; W) + I(Y; Z),
I(X; W) ≤ I(Y; Z).
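Remark 1 can be sanity-checked numerically by building a random Markov chain X → Y → Z (a sketch; alphabet sizes are arbitrary):

import numpy as np

rng = np.random.default_rng(4)

def mi(p_ab):
    """Mutual information in bits of a 2-D joint pmf."""
    pa = p_ab.sum(axis=1, keepdims=True)
    pb = p_ab.sum(axis=0, keepdims=True)
    return float((p_ab * np.log2(p_ab / (pa * pb))).sum())

for _ in range(1000):
    p_x = rng.dirichlet(np.ones(4))            # p(x)
    A = rng.dirichlet(np.ones(4), size=4)      # rows are p(y|x)
    B = rng.dirichlet(np.ones(4), size=4)      # rows are p(z|y)
    p_xy = p_x[:, None] * A                    # p(x, y)
    p_yz = p_xy.sum(axis=0)[:, None] * B       # p(y, z)
    p_xz = p_xy @ B                            # p(x, z) = sum_y p(x, y) p(z|y)
    assert mi(p_xz) <= min(mi(p_xy), mi(p_yz)) + 1e-9
print("I(X;Z) <= min{I(X;Y), I(Y;Z)} held in all trials")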
Achievable ⇒ R ≤ C
Theorem 7 (Converse to the channel coding theorem) Any sequence of
(2^{nR}, n) codes with λ^(n) → 0 must have R ≤ C.
Proof. Since Ŵ = g(Y^n), we have the Markov chain W → X^n(W) → Y^n → Ŵ,
with W uniform on {1, 2, . . . , 2^{nR}} and
Pr(W ≠ Ŵ) = P_e^(n) = (1/2^{nR}) Σ_{i=1}^{2^{nR}} λ_i.
Achievable ⇒ R ≤ C
Proof (cont.)
nR = H(W) = H(W | Ŵ) + I(W; Ŵ)
≤ 1 + P_e^(n) nR + I(W; Ŵ)        (Fano's inequality)
≤ 1 + P_e^(n) nR + I(X^n; Y^n)    (data processing inequality)
≤ 1 + P_e^(n) nR + nC.            (Lemma 1)
Dividing by n gives
R ≤ 1/n + P_e^(n) R + C.
Letting n → ∞, P_e^(n) → 0 forces R ≤ C. Equivalently, P_e^(n) ≥ 1 - C/R - 1/(nR):
if R > C, the probability of error is larger than a positive value for sufficiently
large n, so the error probability cannot be made arbitrarily small. □
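The rearranged bound P_e^(n) ≥ 1 - C/R - 1/(nR) gives a concrete error floor. A tiny sketch (capacity and rates are arbitrary values) of how operating above capacity pushes the error up:

# Converse floor: Pe >= 1 - C/R - 1/(n R) whenever R > C
C = 0.5                                   # capacity in bits per use (arbitrary)
for R in (0.6, 0.75, 1.0):                # rates above capacity
    for n in (10, 100, 1000):
        floor = 1 - C / R - 1 / (n * R)
        print(f"R={R}, n={n}: Pe >= {max(floor, 0.0):.3f}")

For example, at R = 0.6 and n = 1000 the floor is about 0.165: no code of that rate and length can do better, no matter how it is designed.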