Information Theory and Coding

 Source coding theorem
◦ It describes the efficient representation of the symbols generated by a source
◦ The main motivation is data compression
◦ A discrete memoryless source (DMS) outputs a symbol every T seconds
◦ Each symbol is selected from a finite set of symbols $x_i$, $i = 1, 2, \ldots, L$
◦ The symbols occur with probabilities $P(x_i)$, $i = 1, 2, \ldots, L$

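 The entropy of this DMS in bits per source symbol is
$$H(X) = -\sum_{i=1}^{L} P(x_i) \log_2 P(x_i) \le \log_2 L$$
 The equality holds when the symbols are equally likely.
 The entropy is the average information content per source symbol, in bits; it is the lower bound on the average number of bits per symbol needed to represent the source.
 The source rate is $H(X)/T$ bits per second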
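The entropy and source rate defined above can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the function names and the two example probability sets are assumptions chosen so the arithmetic is easy to follow.

```python
import math

def entropy_bits(probabilities):
    """Entropy H(X) in bits per symbol; zero-probability symbols contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def source_rate(probabilities, symbol_period_s):
    """Source rate H(X)/T in bits per second for one symbol every T seconds."""
    return entropy_bits(probabilities) / symbol_period_s

# Assumed example sources with L = 4 symbols
uniform = [0.25, 0.25, 0.25, 0.25]    # equally likely symbols
skewed = [0.5, 0.25, 0.125, 0.125]    # unequal probabilities

print(entropy_bits(uniform))       # 2.0, equal to log2(4)
print(entropy_bits(skewed))        # 1.75, below log2(4)
print(source_rate(skewed, 0.5))    # 3.5 bits per second when T = 0.5 s
```

The uniform source reaches the bound log2 L, while the skewed source falls below it, which is the room that variable-length coding exploits.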
 Suppose we need to represent the 26 letters of the English alphabet using bits
 We know that $2^4 = 16 < 26 \le 32 = 2^5$
 So each letter must be represented by at least 5 bits
 The number of binary digits (bits) R required for unique coding of L symbols:
 When L is a power of 2: $R = \log_2 L$
 When L is not a power of 2: $R = \lfloor \log_2 L \rfloor + 1$
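Both cases of the rule for R can be covered by one integer computation. The sketch below is an illustration only; the function name is an assumption, and it uses Python's integer `bit_length` instead of a floating-point logarithm to avoid rounding issues.

```python
def fixed_length_bits(num_symbols):
    """Smallest R such that 2**R >= num_symbols.

    Equals log2(L) when L is a power of 2 and floor(log2(L)) + 1 otherwise.
    """
    if num_symbols < 1:
        raise ValueError("need at least one symbol")
    return (num_symbols - 1).bit_length()

print(fixed_length_bits(26))    # 5 bits for the English letters
print(fixed_length_bits(32))    # 5 bits, L is a power of 2
print(fixed_length_bits(128))   # 7 bits, as in 7-bit ASCII
```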


 Here we can conclude that
 A fixed-length code treats every letter of the alphabet as equally important (equally probable)
 So each letter requires 5 bits for its representation
 We know that some letters are used less frequently (x, q, z, etc.)
 Some letters are used more frequently (s, t, a, e, etc.)
 Hence using the same number of bits for all the letters is not an efficient way of coding
◦ Such a code is known as a Fixed Length Code (FLC)
 For example, ASCII codes
 A better way of coding is
◦ A more frequent letter is represented by fewer bits
◦ A less frequent letter is represented by more bits
◦ This is known as a Variable Length Code (VLC)

 Fixed Length codes
 Variable Length codes
 Distinct codes
 Uniquely decodable codes
 Prefix free codes
 Instantaneous codes
 Optimal codes
 Entropy coding
 A code is a fixed-length code if the codeword length is the same for every codeword
 A fixed-length code assigns a fixed number of bits to the source symbols irrespective of their statistics of appearance
◦ Example: ASCII codes
 A to Z
 a to z
 0 to 9
 Punctuation marks, commas, etc.
 Each character has a 7-bit codeword
 If there are L source symbols
 If L is a power of 2, the codeword length is $R = \log_2 L$
 If L is not a power of 2, the codeword length is $R = \lfloor \log_2 L \rfloor + 1$

 In a variable-length code the codeword length is not fixed
◦ More frequent symbols are represented by fewer bits
◦ Less frequent symbols are represented by more bits
◦ On average it requires fewer bits than a fixed-length code to encode the same information
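The saving can be made concrete by comparing average codeword lengths. The probabilities and the two codes below are assumed purely for illustration; they do not come from the slides.

```python
def average_length(probabilities, codewords):
    """Average codeword length in bits per symbol: sum of P(x_i) * n_i."""
    return sum(p * len(word) for p, word in zip(probabilities, codewords))

# Assumed four-symbol source, most frequent symbol first
probs = [0.5, 0.25, 0.125, 0.125]
fixed_code = ["00", "01", "10", "11"]       # 2 bits for every symbol
variable_code = ["0", "10", "110", "111"]   # shorter words for frequent symbols

print(average_length(probs, fixed_code))      # 2.0 bits per symbol
print(average_length(probs, variable_code))   # 1.75 bits per symbol
```

With equally likely symbols the same variable-length code would average 2.25 bits per symbol, so the benefit depends on the symbol statistics.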
A code is called distinct if each codeword is distinguishable from every other codeword
Xj Codeword
X1 00
X2 01
X3 10
X4 11
 The coded source symbols are transmitted as a stream of bits
 The code must satisfy certain properties so that the receiver can identify the symbols from the stream of bits
 A distinct code is said to be uniquely decodable if the original source sequence can be reconstructed perfectly from the received encoded binary sequence.
Symbol Code 1 Code 2
A 00 0
B 01 1
C 10 00
D 11 01
Code 1 is a fixed-length code
Code 2 is a variable-length code
The message ‘A BAD CAB’ can be encoded using the above two codes
In Code 1 format it appears as
00 010011 100001
In Code 2 format it appears as
0 1001 0001
 Here Code 1 requires 14 bits to encode the message
 Here Code 2 requires 9 bits to encode the message
 Although Code 2 uses fewer bits, it is not a valid code because there is a decoding problem with this code
 The bit stream 0 1001 0001 can be grouped in different ways, for example
 [0] [1][0][0][1] [0][0][0][1], which means A BAAB AAAB
 [0] [1] [00] [1] [00] [01], which means A B C B C D
 [01] [00] [1] [00] [01], which means D C B C D
 The ambiguity arises because the destination does not know where one codeword ends and the next codeword starts.
 In this case only Code 1 is uniquely decodable
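The decoding problem can be checked by brute force: encode the message with each code and list every way the resulting bit stream splits into codewords. This is a small illustrative sketch; the helper names are assumptions, and the exhaustive search is only practical for short messages.

```python
def encode(message, code):
    """Concatenate the codewords of the symbols in `message`."""
    return "".join(code[symbol] for symbol in message)

def all_parsings(bits, code):
    """Every symbol sequence whose encoding under `code` equals `bits`."""
    if not bits:
        return [""]
    results = []
    for symbol, word in code.items():
        if bits.startswith(word):
            for rest in all_parsings(bits[len(word):], code):
                results.append(symbol + rest)
    return results

code1 = {"A": "00", "B": "01", "C": "10", "D": "11"}
code2 = {"A": "0", "B": "1", "C": "00", "D": "01"}

message = "ABADCAB"   # 'A BAD CAB' without the spaces
bits1 = encode(message, code1)
bits2 = encode(message, code2)

print(len(bits1), all_parsings(bits1, code1))   # 14 bits, a single parsing
parsings2 = all_parsings(bits2, code2)
print(len(bits2), len(parsings2) > 1)           # 9 bits, more than one parsing
print("ABCBCD" in parsings2, "DCBCD" in parsings2)
```

Code 1 yields exactly one parsing, while the Code 2 stream admits many, including the groupings listed above.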
 A code in which no codeword forms the prefix of any other codeword is called a prefix-free code
 An example of a prefix-free code is
Symbol Codeword
A 0
B 10
C 110
D 1110
 In Code 2 of the earlier example (A = 0, B = 1, C = 00, D = 01), if a zero (0) is received, the receiver cannot decide whether it is the entire codeword for ‘A’ or part of the codeword for ‘C’ or ‘D’
 Hence no codeword should be a prefix of any other codeword. Such a code is called a prefix-free code
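The prefix condition is easy to test mechanically. The sketch below, with an assumed function name, compares every pair of codewords; it is meant as an illustration for small codes rather than an efficient implementation.

```python
def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword in the list."""
    words = list(codewords)
    return not any(
        i != j and longer.startswith(shorter)
        for i, shorter in enumerate(words)
        for j, longer in enumerate(words)
    )

print(is_prefix_free(["0", "10", "110", "1110"]))   # True: the prefix-free code above
print(is_prefix_free(["0", "1", "00", "01"]))       # False: 0 is a prefix of 00 and 01
```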
 A uniquely decodable code is said to be an instantaneous code if the end of every codeword is recognizable without checking subsequent code symbols.
 Instantaneous codes are exactly the prefix-free codes (also called prefix codes).
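Instantaneous decoding can be sketched as a decoder that emits a symbol the moment the bits read so far match a codeword, with no look-ahead. The code table is the prefix-free example with codewords 0, 10, 110, 1110; the function names are assumptions made for this illustration.

```python
def encode(message, code):
    """Concatenate the codewords of the symbols in `message`."""
    return "".join(code[symbol] for symbol in message)

def decode_instantaneous(bits, code):
    """Decode bit by bit, emitting a symbol as soon as a codeword is complete.

    This is only correct when `code` is prefix-free.
    """
    inverse = {word: symbol for symbol, word in code.items()}
    decoded, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:          # end of codeword recognized immediately
            decoded.append(inverse[buffer])
            buffer = ""
    if buffer:
        raise ValueError("trailing bits do not form a complete codeword")
    return "".join(decoded)

prefix_code = {"A": "0", "B": "10", "C": "110", "D": "1110"}
stream = encode("ABADCAB", prefix_code)
print(stream)                                      # 01001110110010
print(decode_instantaneous(stream, prefix_code))   # ABADCAB
```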
 A code is called an optimal code if it is instantaneous and has the minimum average codeword length for a given source with a particular probability assignment for the source symbols.
 When a variable-length code is designed such that its average codeword length approaches the entropy of the DMS (discrete memoryless source), it is known as entropy coding
◦ Shannon-Fano coding and Huffman coding are examples.
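As an illustration of entropy coding, here is a compact sketch of binary Huffman coding using Python's standard `heapq` module. The slides do not give the procedure, so this follows the usual textbook construction (repeatedly merge the two least probable nodes); the symbol names and probabilities are assumed for the example.

```python
import heapq
import itertools
import math

def huffman_code(probabilities):
    """Build a binary Huffman code for a dict mapping symbol -> probability."""
    counter = itertools.count()   # tie-breaker so the heap never compares dicts
    heap = [(p, next(counter), {symbol: ""}) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    if len(heap) == 1:            # a single symbol still needs one bit
        return {symbol: "0" for symbol in probabilities}
    while len(heap) > 1:
        p_low, _, low = heapq.heappop(heap)     # least probable node
        p_high, _, high = heapq.heappop(heap)   # second least probable node
        merged = {s: "0" + w for s, w in low.items()}
        merged.update({s: "1" + w for s, w in high.items()})
        heapq.heappush(heap, (p_low + p_high, next(counter), merged))
    return heap[0][2]

# Assumed example source
probs = {"x1": 0.5, "x2": 0.25, "x3": 0.125, "x4": 0.125}
code = huffman_code(probs)
average = sum(probs[s] * len(code[s]) for s in probs)
entropy = -sum(p * math.log2(p) for p in probs.values())

print(code)
print(average, entropy)   # both 1.75 for this source
```

For this particular source the average length equals the entropy because every probability is a power of 1/2; in general the Huffman average length is at least the entropy and less than the entropy plus one bit.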

Xj Code 1 Code 2 Code 3 Code 4 Code 5 Code 6
X1 00 00 0 0 0 1
X2 01 01 1 10 01 01
X3 00 10 00 110 011 001
X4 11 11 11 111 0111 0001
Code 1 and Code 2 are fixed-length codes
Code 3, 4, 5 and 6 are variable-length codes
All codes are distinct except Code 1
Code 2, 4 and 6 are prefix-free (instantaneous) codes
Code 2, 4, 5 and 6 are uniquely decodable codes
Code 5 is not a prefix-free code, yet it is uniquely decodable, since bit 0 indicates the beginning of each codeword
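The classification of the six codes can be reproduced programmatically. The sketch below checks distinctness and the prefix condition directly, and tests unique decodability with the Sardinas-Patterson dangling-suffix procedure, a standard test that the slides do not mention and that is brought in here as an assumption. All function names are illustrative.

```python
def is_distinct(words):
    """True if no two symbols share the same codeword."""
    return len(set(words)) == len(words)

def is_prefix_free(words):
    """True if no codeword is a prefix of a different codeword."""
    return not any(
        i != j and b.startswith(a)
        for i, a in enumerate(words)
        for j, b in enumerate(words)
    )

def dangling_suffixes(prefixes, words):
    """Non-empty strings s such that p + s is in `words` for some p in `prefixes`."""
    return {w[len(p):] for p in prefixes for w in words
            if w.startswith(p) and len(w) > len(p)}

def is_uniquely_decodable(words):
    """Sardinas-Patterson test: fail if any dangling suffix is itself a codeword."""
    code = set(words)
    if len(code) != len(words):
        return False
    pending = dangling_suffixes(code, code)
    seen = set()
    while pending:
        suffix = pending.pop()
        if suffix in code:
            return False
        seen.add(suffix)
        new = dangling_suffixes(code, {suffix}) | dangling_suffixes({suffix}, code)
        pending |= new - seen
    return True

codes = {
    "Code 1": ["00", "01", "00", "11"],
    "Code 2": ["00", "01", "10", "11"],
    "Code 3": ["0", "1", "00", "11"],
    "Code 4": ["0", "10", "110", "111"],
    "Code 5": ["0", "01", "011", "0111"],
    "Code 6": ["1", "01", "001", "0001"],
}
for name, words in codes.items():
    print(name, is_distinct(words), is_prefix_free(words), is_uniquely_decodable(words))
```

The printed columns are distinct, prefix-free and uniquely decodable, in that order, and they agree with the statements above.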
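 Let X be a discrete memoryless source having an alphabet $\{x_i\}$, $i = 1, 2, \ldots, L$
 Let the length of the binary codeword corresponding to $x_i$ be $n_i$
 A necessary and sufficient condition for the existence of an instantaneous binary code with these codeword lengths is
$$K = \sum_{i=1}^{L} 2^{-n_i} \le 1$$
 This is the expression for the Kraft inequality
 It indicates the existence of an instantaneously decodable code with codeword lengths that satisfy the inequality; it does not by itself show that a particular code with those lengths is instantaneous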
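The existence claim can be illustrated constructively: given lengths that satisfy the Kraft inequality, a prefix-free code with exactly those lengths can be built by assigning consecutive binary numbers in order of increasing length. This is the usual canonical construction, sketched here under assumed names; it is not taken from the slides.

```python
from fractions import Fraction

def prefix_code_from_lengths(lengths):
    """Return prefix-free binary codewords with the given positive lengths.

    Raises ValueError when the lengths violate the Kraft inequality.
    """
    if not lengths:
        return []
    if sum(Fraction(1, 2 ** n) for n in lengths) > 1:
        raise ValueError("lengths violate the Kraft inequality")
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    codewords = [None] * len(lengths)
    value = 0
    previous = lengths[order[0]]
    for index in order:
        n = lengths[index]
        value <<= n - previous            # extend the counter to the new length
        codewords[index] = format(value, "0{}b".format(n))
        value += 1                        # next unused codeword of this length
        previous = n
    return codewords

print(prefix_code_from_lengths([1, 2, 3, 3]))   # ['0', '10', '110', '111']
print(prefix_code_from_lengths([1, 3, 3, 3]))   # ['0', '100', '101', '110']
```

Lengths such as 1, 2, 2, 3 give K = 9/8 and are rejected, since no instantaneous code with those lengths exists.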
Xj Code 1 Code 2 Code 3 Code 4
X1 00 0 0 0
X2 01 10 11 100
X3 10 11 100 110
X4 11 110 110 111
 For code 1: $K = 2^{-2} + 2^{-2} + 2^{-2} + 2^{-2} = 1$
 Hence this satisfies the Kraft inequality

 For code 2: $K = 2^{-1} + 2^{-2} + 2^{-2} + 2^{-3} = \tfrac{9}{8} > 1$
 Hence this does not satisfy the Kraft inequality

 For code 3: $K = 2^{-1} + 2^{-2} + 2^{-3} + 2^{-3} = 1$
 Hence this satisfies the Kraft inequality

 For code 4: $K = 2^{-1} + 2^{-3} + 2^{-3} + 2^{-3} = \tfrac{7}{8} < 1$
 Hence this satisfies the Kraft inequality
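The four Kraft sums can be checked with exact rational arithmetic. The sketch below uses Python's standard `fractions` module; the function name is an assumption made for this illustration.

```python
from fractions import Fraction

def kraft_sum(codewords):
    """K = sum over codewords of 2**(-n_i), computed exactly."""
    return sum(Fraction(1, 2 ** len(word)) for word in codewords)

codes = {
    "Code 1": ["00", "01", "10", "11"],
    "Code 2": ["0", "10", "11", "110"],
    "Code 3": ["0", "11", "100", "110"],
    "Code 4": ["0", "100", "110", "111"],
}
for name, words in codes.items():
    k = kraft_sum(words)
    print(name, k, "satisfies" if k <= 1 else "does not satisfy")
```

Note that Code 3 satisfies the inequality even though 11 is a prefix of 110, so Code 3 itself is not instantaneous; the inequality only guarantees that some instantaneous code with lengths 1, 2, 3, 3 exists.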
