
Entropy and Information Rate of Markoff Sources

Session 4
Definition of the entropy of the source
Assume that the probability of being in state i at the beginning of the first
symbol interval is the same as the probability of being in state i at the beginning of the
second symbol interval, and so on.
The probability of going from state i to state j also does not depend on time. The entropy
of state i is defined as the average information content of the symbols emitted from
the i-th state:

$$H_i = \sum_{j=1}^{n} p_{ij} \log_2 \frac{1}{p_{ij}} \quad \text{bits/symbol} \tag{1}$$
Entropy of the source is defined as the average of the entropy of each state.
i.e.,
$$H = E(H_i) = \sum_{i=1}^{n} p_i H_i \tag{2}$$
where $p_i$ is the probability that the source is in state i.
Using eqn. (1), eqn. (2) becomes
$$H = \sum_{i=1}^{n} p_i \left[ \sum_{j=1}^{n} p_{ij} \log_2 \frac{1}{p_{ij}} \right] \quad \text{bits/symbol} \tag{3}$$
The average information rate for the source is defined as
$$R = r_s H \quad \text{bits/sec}$$
where $r_s$ is the number of state transitions per second, i.e. the symbol rate of
the source.
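The three definitions above translate directly into a few lines of code. The following is a minimal sketch, assuming the source is described by a list of state probabilities and a matrix of transition probabilities $p_{ij}$; the function names and the two-state sample source are hypothetical and are not taken from these notes.

```python
import math

def state_entropy(row):
    """H_i in bits/symbol for one state, given its transition probabilities p_ij."""
    # Terms with p_ij = 0 contribute nothing and are skipped.
    return sum(p * math.log2(1.0 / p) for p in row if p > 0)

def source_entropy(state_probs, transitions):
    """H = sum_i p_i * H_i in bits/symbol."""
    return sum(p_i * state_entropy(row)
               for p_i, row in zip(state_probs, transitions))

def information_rate(symbol_rate, state_probs, transitions):
    """R = r_s * H in bits/sec."""
    return symbol_rate * source_entropy(state_probs, transitions)

# Hypothetical source: two equally likely states, each leaving by one of
# two equally likely transitions.
sample_state_probs = [0.5, 0.5]
sample_transitions = [[0.5, 0.5],
                      [0.5, 0.5]]

print(source_entropy(sample_state_probs, sample_transitions))          # 1.0
print(information_rate(1000, sample_state_probs, sample_transitions))  # 1000.0
```

Skipping zero-probability transitions follows the usual convention that such terms contribute nothing to the sum.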
The above concepts can be illustrated with an example
Illustrative Example:
1. Consider an information source modelled by a discrete stationary Markoff random
process shown in the figure. Find the source entropy H and the average information
content per symbol in messages containing one, two and three symbols.
The source emits one of three symbols A, B and C.
A tree diagram can be drawn as illustrated in the previous session to understand
the various symbol sequences and their probabilities.
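[Figure: tree diagram of the three-symbol sequences. Starting from state 1 the sequences are AAA, AAC, ACC, ACB, CCA, CCC, CBC and CBB; starting from state 2 they are CAA, CAC, CCC, CCB, BCA, BCC, BBC and BBB.]
[Figure: state diagram of the source with two states, 1 and 2, and $p_1 = p_2 = 1/2$. State 1 emits A with probability 3/4 and state 2 emits B with probability 3/4; symbol C is emitted, with probability 1/4, on the transitions between the two states.]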
As per the outcome of the previous session we have
Messages of length 1    Messages of length 2    Messages of length 3
A   (3/8)               AA  (9/32)              AAA (27/128)
B   (3/8)               AC  (3/32)              AAC (9/128)
C   (1/4)               CB  (3/32)              ACC (3/128)
                        CC  (2/32)              ACB (9/128)
                        BB  (9/32)              BBB (27/128)
                        BC  (3/32)              BBC (9/128)
                        CA  (3/32)              BCC (3/128)
                                                BCA (9/128)
                                                CCA (3/128)
                                                CCB (3/128)
                                                CCC (2/128)
                                                CBC (3/128)
                                                CAC (3/128)
                                                CBB (9/128)
                                                CAA (9/128)
By definition, $H_i$ is given by
$$H_i = \sum_{j=1}^{n} p_{ij} \log_2 \frac{1}{p_{ij}}$$
Put i = 1 (with n = 2 states):
$$H_1 = \sum_{j=1}^{2} p_{1j} \log_2 \frac{1}{p_{1j}} = p_{11} \log_2 \frac{1}{p_{11}} + p_{12} \log_2 \frac{1}{p_{12}}$$
Substituting the values we get,
$$H_1 = \frac{3}{4} \log_2 \frac{1}{3/4} + \frac{1}{4} \log_2 \frac{1}{1/4} = \frac{3}{4} \log_2 \frac{4}{3} + \frac{1}{4} \log_2 4$$
$$H_1 = 0.8113$$
Similarly,
$$H_2 = \frac{1}{4} \log_2 4 + \frac{3}{4} \log_2 \frac{4}{3} = 0.8113$$
By definition, the source entropy is given by
$$H = \sum_{i=1}^{n} p_i H_i = \sum_{i=1}^{2} p_i H_i = \frac{1}{2}(0.8113) + \frac{1}{2}(0.8113) = 0.8113 \quad \text{bits/symbol}$$
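The arithmetic of this example can be checked numerically. The sketch below assumes the transition probabilities used in the substitutions above ($p_{11} = 3/4$, $p_{12} = 1/4$ for state 1 and $1/4$, $3/4$ for state 2) and state probabilities of 1/2 each; the variable names are illustrative only.

```python
import math

# Rows are states 1 and 2; entries are the transition probabilities p_ij
# as used in the substitutions for H_1 and H_2.
transition_probs = [[0.75, 0.25],
                    [0.25, 0.75]]
state_probs = [0.5, 0.5]

# H_i = sum_j p_ij * log2(1 / p_ij)
state_entropies = [sum(p * math.log2(1 / p) for p in row if p > 0)
                   for row in transition_probs]

# H = sum_i p_i * H_i
H = sum(p_i * H_i for p_i, H_i in zip(state_probs, state_entropies))

print([round(h, 4) for h in state_entropies])  # [0.8113, 0.8113]
print(round(H, 4))                             # 0.8113
```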
Next, calculate the average information content per symbol in messages
containing two symbols.
How many messages of length 2 are present, and what is the information
content of these messages?
There are seven such messages and their information contents are:
$$I(AA) = I(BB) = \log_2 \frac{1}{P(AA)} = \log_2 \frac{1}{P(BB)}$$
i.e.,
$$I(AA) = I(BB) = \log_2 \frac{1}{9/32} = 1.83 \text{ bits}$$
Similarly, calculate the information content of the other messages and verify that
$$I(BC) = I(AC) = I(CB) = I(CA) = \log_2 \frac{1}{3/32} = 3.415 \text{ bits}$$
$$I(CC) = \log_2 \frac{1}{P(CC)} = \log_2 \frac{1}{2/32} = 4 \text{ bits}$$
Compute the average information content of these messages.
Thus, we have
$$H_{(two)} = \sum_{i=1}^{7} P_i \log_2 \frac{1}{P_i} = \sum_{i=1}^{7} P_i I_i \quad \text{bits per message}$$
where $I_i$ are the information contents calculated above for the different messages of length two.
Substituting the values we get,
$$H_{(two)} = \frac{9}{32}(1.83) + \frac{3}{32}(3.415) + \frac{3}{32}(3.415) + \frac{2}{32}(4) + \frac{3}{32}(3.415) + \frac{3}{32}(3.415) + \frac{9}{32}(1.83)$$
$$H_{(two)} = 2.56 \text{ bits}$$
Compute the average information content per symbol in messages containing
two symbols using the relation
$$G_N = \frac{\text{Average information content of the messages of length } N}{\text{Number of symbols in the message}}$$
Here, N = 2:
$$G_2 = \frac{\text{Average information content of the messages of length 2}}{2} = \frac{H_{(two)}}{2} = \frac{2.56}{2} = 1.28 \text{ bits/symbol}$$
$$G_2 = 1.28$$
Similarly, compute the other values of $G_N$ of interest for the problem under discussion, viz. $G_1$ and $G_3$.
You get them as
$G_1 = 1.5612$ bits/symbol
and $G_3 = 1.1394$ bits/symbol
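As a cross-check, the values of $G_N$ can be recomputed from the message probabilities tabulated earlier for messages of length 1, 2 and 3. This is a minimal sketch; the dictionary layout and the function name are assumptions made for illustration.

```python
import math
from fractions import Fraction as F

# Message probabilities copied from the table of messages of length 1, 2 and 3.
MESSAGES = {
    1: {"A": F(3, 8), "B": F(3, 8), "C": F(1, 4)},
    2: {"AA": F(9, 32), "AC": F(3, 32), "CB": F(3, 32), "CC": F(2, 32),
        "BB": F(9, 32), "BC": F(3, 32), "CA": F(3, 32)},
    3: {"AAA": F(27, 128), "AAC": F(9, 128), "ACC": F(3, 128), "ACB": F(9, 128),
        "BBB": F(27, 128), "BBC": F(9, 128), "BCC": F(3, 128), "BCA": F(9, 128),
        "CCA": F(3, 128), "CCB": F(3, 128), "CCC": F(2, 128), "CBC": F(3, 128),
        "CAC": F(3, 128), "CBB": F(9, 128), "CAA": F(9, 128)},
}

def info_per_symbol(n):
    """G_N: average information per message of length n, divided by n."""
    table = MESSAGES[n]
    # Sanity check: the probabilities of all messages of one length sum to 1.
    assert sum(table.values()) == 1
    bits_per_message = sum(float(p) * math.log2(1 / float(p))
                           for p in table.values())
    return bits_per_message / n

G = {n: info_per_symbol(n) for n in MESSAGES}
for n, g in G.items():
    print(f"G_{n} = {g:.4f} bits/symbol")
```

Exact fractions are used for the table so that the sum-to-one check is not affected by rounding.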
What do you understand from the values of $G_N$ calculated?
You note that
$$G_1 > G_2 > G_3 > H$$
How do you state this in words?
It can be stated that the average information per symbol in the message reduces as
the length of the message increases.
What is the generalized form of the above statement?
If $P(m_i)$ is the probability of a sequence $m_i$ of N symbols from the source, with
the average information content per symbol in the messages of N symbols defined
by
$$G_N = -\frac{1}{N} \sum_{i} P(m_i) \log_2 P(m_i)$$
where the sum is over all sequences $m_i$ containing N symbols, then $G_N$ is a
monotonic decreasing function of N, and in the limiting case it becomes
$$\lim_{N \to \infty} G_N = H \quad \text{bits/symbol}$$
Recall H = entropy of the source
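The decrease of $G_N$ towards H can be observed numerically by enumerating all messages of N symbols. The sketch below assumes the two-state model of the illustrative example as read from its state diagram and tree: state 1 emits A (probability 3/4, stay in state 1) or C (probability 1/4, move to state 2), state 2 emits B (probability 3/4, stay) or C (probability 1/4, move to state 1), and each state is the starting state with probability 1/2. The names used are hypothetical.

```python
import math
from collections import defaultdict

# Assumed model: state -> list of (symbol, probability, next_state).
EMISSIONS = {
    1: [("A", 0.75, 1), ("C", 0.25, 2)],
    2: [("C", 0.25, 1), ("B", 0.75, 2)],
}
START = {1: 0.5, 2: 0.5}

def message_probabilities(n):
    """Return {message: probability} for all messages of n symbols."""
    # paths maps (message so far, current state) -> probability.
    paths = {("", state): prob for state, prob in START.items()}
    for _ in range(n):
        extended = defaultdict(float)
        for (msg, state), prob in paths.items():
            for symbol, p, next_state in EMISSIONS[state]:
                extended[(msg + symbol, next_state)] += prob * p
        paths = extended
    # The same message can arise from different starting states; add them up.
    messages = defaultdict(float)
    for (msg, _state), prob in paths.items():
        messages[msg] += prob
    return dict(messages)

def avg_info_per_symbol(n):
    """G_N = -(1/N) * sum_i P(m_i) * log2 P(m_i)."""
    probs = message_probabilities(n)
    return -sum(p * math.log2(p) for p in probs.values()) / n

# Source entropy of the assumed model (both states have the same entropy).
H = 0.75 * math.log2(4 / 3) + 0.25 * math.log2(4)

for n in range(1, 11):
    print(n, round(avg_info_per_symbol(n), 4))
print("H =", round(H, 4))
```

The enumeration grows as $2^{N+1}$ paths, so it is only practical for short messages, which is enough to see the trend.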
What do you understand from this example?
It illustrates the basic concept that the average information content per symbol
from a source emitting dependent symbol sequences decreases as the message length
increases.
Can this be stated in any alternative way?
Alternatively, it tells us that the average number of bits per symbol needed to
represent a message decreases as the message length increases.
Home Work (HW)
The state diagram of the stationary Markoff source is shown below.
[Figure: state diagram of a three-state source with symbol labels L, R and S and P(state 1) = P(state 2) = P(state 3) = 1/3.]
Find (i) the entropy of each state,
(ii) the entropy of the source,
(iii) $G_1$ and $G_2$, and verify that $G_1 \ge G_2 \ge H$, the entropy of the source.
Example 2
For the Markoff source shown, calculate the information rate.
[Figure: state diagram of a three-state source (states 1, 2 and 3) emitting the symbols A, B and C.]
Solution:
By definition, the average information rate for the source is given by
$$R = r_s H \quad \text{bits/sec} \tag{1}$$
where $r_s$ is the symbol rate of the source
and H is the entropy of the source.
To compute H, calculate the entropy of each state using
$$H_i = \sum_{j=1}^{n} p_{ij} \log_2 \frac{1}{p_{ij}} \quad \text{bits/symbol} \tag{2}$$
For this example,
$$H_i = \sum_{j=1}^{3} p_{ij} \log_2 \frac{1}{p_{ij}}; \quad i = 1, 2, 3 \tag{3}$$
Put i = 1:
$$H_1 = -\sum_{j=1}^{3} p_{1j} \log_2 p_{1j} = -p_{11} \log_2 p_{11} - p_{12} \log_2 p_{12} - p_{13} \log_2 p_{13}$$
Substituting the values, we get
$$H_1 = -\frac{1}{2} \log_2 \frac{1}{2} - \frac{1}{2} \log_2 \frac{1}{2} - 0 = \frac{1}{2} \log_2 2 + \frac{1}{2} \log_2 2$$
$$H_1 = 1 \text{ bit/symbol}$$
Put i = 2 in eqn. (2); we get
$$H_2 = -\sum_{j=1}^{3} p_{2j} \log_2 p_{2j}$$
i.e.,
$$H_2 = -\left[ p_{21} \log_2 p_{21} + p_{22} \log_2 p_{22} + p_{23} \log_2 p_{23} \right]$$
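Substituting the values given we get,
$$H_2 = -\left[ \frac{1}{4} \log_2 \frac{1}{4} + \frac{1}{2} \log_2 \frac{1}{2} + \frac{1}{4} \log_2 \frac{1}{4} \right] = \frac{1}{4} \log_2 4 + \frac{1}{2} \log_2 2 + \frac{1}{4} \log_2 4 = \frac{1}{2} \log_2 2 + \frac{1}{2} \log_2 4$$
$$H_2 = 1.5 \text{ bits/symbol}$$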
Similarly, calculate $H_3$; it will be
$$H_3 = 1 \text{ bit/symbol}$$
With $H_i$ computed, you can now compute H, the source entropy, using
$$H = \sum_{i=1}^{3} p_i H_i = p_1 H_1 + p_2 H_2 + p_3 H_3$$
Substituting the values we get,
$$H = \frac{1}{4} \times 1 + \frac{1}{2} \times 1.5 + \frac{1}{4} \times 1 = \frac{1}{4} + \frac{1.5}{2} + \frac{1}{4} = \frac{1}{2} + \frac{1.5}{2} = \frac{2.5}{2} = 1.25 \text{ bits/symbol}$$
H = 1.25 bits/symbol
Now, using equation (1) we have
Source information rate $R = r_s \times 1.25$
Taking $r_s$ as one symbol per second, we get
$$R = 1 \times 1.25 = 1.25 \text{ bits/sec}$$
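The whole of Example 2 can be reproduced in a few lines. In the sketch below, rows 1 and 2 of the transition matrix and the state probabilities (1/4, 1/2, 1/4) are the values substituted in the solution; row 3 is not written out in the solution, so the row (0, 1/2, 1/2) is an assumption chosen to be consistent with the stated $H_3$ = 1 bit/symbol. Variable names are illustrative.

```python
import math

transition_probs = [
    [0.5, 0.5, 0.0],    # state 1: values substituted for H_1
    [0.25, 0.5, 0.25],  # state 2: values substituted for H_2
    [0.0, 0.5, 0.5],    # state 3: ASSUMED, consistent with H_3 = 1 bit/symbol
]
state_probs = [0.25, 0.5, 0.25]  # p_1, p_2, p_3 as used in the solution
r_s = 1.0                        # symbol rate, taken as one symbol per second

# H_i = sum_j p_ij * log2(1 / p_ij), skipping zero-probability transitions.
state_entropies = [sum(p * math.log2(1 / p) for p in row if p > 0)
                   for row in transition_probs]

# H = sum_i p_i * H_i, then R = r_s * H.
H = sum(p_i * H_i for p_i, H_i in zip(state_probs, state_entropies))
R = r_s * H

print(state_entropies)  # [1.0, 1.5, 1.0]
print(H, R)             # 1.25 1.25
```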