
 Source coding theorem

◦ It shows the efficient representation of the symbols generated by the source
◦ The main motivation is compression of data
◦ A discrete memoryless source (DMS) outputs a symbol every T seconds
◦ Each symbol is selected from a finite set of L symbols
◦ The symbols occur with probabilities p_1, p_2, ..., p_L

 The entropy of this DMS in bits per source symbol is
  H(X) = − Σ_{k=1}^{L} p_k log2 p_k ≤ log2 L
 The equality holds when the symbols are equally likely.
 Entropy is the average number of bits per symbol.
 The source rate is H(X)/T bits per second
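The entropy formula above can be checked with a small Python sketch (the two example probability vectors are illustrative, not from the source):

```python
import math

def entropy(probs):
    """Entropy H(X) in bits per symbol of a discrete memoryless source."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A 4-symbol source: equally likely symbols reach the maximum H(X) = log2 L
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.5, 0.25, 0.125, 0.125]

print(entropy(uniform))  # 2.0, equals log2(4)
print(entropy(skewed))   # 1.75, below log2(4)
```

Note how any departure from equal likelihood pulls H(X) below log2 L, which is what makes variable length coding worthwhile.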
 Suppose we need to represent the 26 letters of the English alphabet using bits
 We know that 2^4 = 16 < 26 ≤ 32 = 2^5
 So each of the letters is represented by at least 5 bits

 The number of binary digits (bits) R required for unique coding:
◦ When L is a power of 2: R = log2 L
◦ When L is not a power of 2: R = ⌊log2 L⌋ + 1

 Here we can conclude that R = ⌊log2 26⌋ + 1 = 5
 A fixed length code means each letter in the alphabet is equally important (probable)
 So each one requires 5 bits for representation
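Both cases of the formula for R collapse into a single ceiling operation, as this Python sketch shows:

```python
import math

def fixed_length_bits(L):
    """Bits R needed for a fixed length code over L symbols.

    ceil(log2 L) equals log2 L when L is a power of 2,
    and floor(log2 L) + 1 otherwise."""
    return max(1, math.ceil(math.log2(L)))

print(fixed_length_bits(26))  # 5: 2**4 = 16 < 26 <= 32 = 2**5
print(fixed_length_bits(32))  # 5: L is a power of 2, so R = log2 L
```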

 We know that some letters are used less frequently, e.g. x, q, z
 Some letters are used more frequently, e.g. s, t, a, e

 However, using the same number of bits for all the letters is not an efficient way of coding
◦ This is also known as a Fixed Length Code (FLC)
 For example, ASCII codes
 A better way of coding is:
◦ A more frequent symbol is represented by fewer bits
◦ A less frequent symbol is represented by more bits

This is known as a Variable Length Code (VLC)


 Fixed Length codes
 Variable Length codes
 Distinct codes
 Uniquely decodable codes
 Prefix free codes
 Instantaneous codes
 Optimal codes
 Entropy coding
 A fixed length code is one in which the codeword length is fixed
 A fixed length code assigns a fixed number of bits to the source symbols irrespective of their statistics of appearance
◦ ASCII codes:
 A to Z
 a to z
 0 to 9
 Punctuation marks
 Commas etc. have a 7-bit codeword
 If there are L source alphabets:
◦ If L is a power of 2, the codeword length is R = log2 L
◦ If L is not a power of 2, the codeword length is R = ⌊log2 L⌋ + 1

 In a variable length code the codeword length is not fixed
◦ More frequent symbols are represented by fewer bits
◦ Less frequent symbols are represented by more bits
◦ It requires fewer bits than a fixed length code to encode the same information

 A code is called Distinct if each codeword is distinguishable from the others

Xj Codeword
X1 00
X2 01
X3 10
X4 11
 The coded source symbols are transmitted as a stream of bits
 The codes must satisfy some properties so that the receiver can identify the source symbols from the stream of bits

 A Distinct code is said to be uniquely decodable if the original source sequence can be reconstructed perfectly from the received encoded binary sequence.
Symbol Code 1 Code 2
A 00 0
B 01 1
C 10 00
D 11 01

Code 1 is a fixed length code
Code 2 is a variable length code
The message ‘A BAD CAB’ can be encoded using the above 2 codes
In Code 1 format it appears as
00 010011 100001
In Code 2 format it appears as
0 1001 0001

 Here Code 1 requires 14 bits to encode the message
 Here Code 2 requires 9 bits to encode the message

 Although Code 2 uses fewer bits, it is not a valid code because it has a decoding problem
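The bit counts above can be verified with a short Python sketch (the `encode` helper is illustrative, not from the source):

```python
# Codebooks from the table: Code 1 is fixed length, Code 2 is variable length
code1 = {'A': '00', 'B': '01', 'C': '10', 'D': '11'}
code2 = {'A': '0', 'B': '1', 'C': '00', 'D': '01'}

def encode(msg, code):
    """Concatenate the codewords for each letter, skipping spaces."""
    return ''.join(code[ch] for ch in msg if ch != ' ')

msg = 'A BAD CAB'
print(len(encode(msg, code1)))  # 14 bits
print(len(encode(msg, code2)))  # 9 bits
```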
 The received sequence 0 1001 0001 can be grouped in different ways, e.g.
◦ [0][1][0][0][1][0][0][0][1] which means A B A A B A A A B
◦ [0][1][00][1][00][01] which means A B C B C D
◦ [01][00][1][00][01] which means D C B C D

 The destination does not know where one codeword ends and the next codeword starts.
 In this case only Code 1 is uniquely decodable
 A code in which no codeword forms the prefix of any other codeword is called a prefix-free code
 An example of a prefix-free code is
Symbol Codeword
A 0
B 10
C 110
D 1110

 In Code 2 above, if a zero (0) is received, the receiver cannot decide whether it is the entire codeword for ‘A’ or a partial codeword for ‘C’ or ‘D’
 Hence no codeword should be a prefix of any other codeword. Such a code is called a Prefix Free Code
 A uniquely decodable code is said to be an instantaneous code if the end of any codeword is recognizable without checking subsequent code symbols.
 A prefix-free code is always instantaneous.
 A code is called an optimal code if it is instantaneous and has minimum average codeword length for a given source with a particular probability assignment for the source symbols.
 When a variable length code is designed such that its average codeword length approaches the entropy of the DMS (discrete memoryless source), it is known as entropy coding
◦ Shannon–Fano and Huffman coding are examples.
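As a sketch of entropy coding, here is a minimal Huffman coder in Python. The example source (probabilities 0.5, 0.25, 0.125, 0.125) is illustrative, not from the source text; because its probabilities are powers of 1/2, the average codeword length exactly equals the entropy of 1.75 bits:

```python
import heapq
from itertools import count

def huffman(probs):
    """Build a Huffman code for {symbol: probability}; returns {symbol: codeword}."""
    tick = count()  # unique tie-breaker so heap tuples never compare the dicts
    heap = [(p, next(tick), {s: ''}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees, prefixing '0' and '1'
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: '0' + w for s, w in c1.items()}
        merged.update({s: '1' + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tick), merged))
    return heap[0][2]

probs = {'A': 0.5, 'B': 0.25, 'C': 0.125, 'D': 0.125}
code = huffman(probs)
avg_len = sum(probs[s] * len(w) for s, w in code.items())
print(avg_len)  # 1.75, equal to the source entropy for these dyadic probabilities
```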


Xj Code 1 Code 2 Code 3 Code 4 Code 5 Code 6
X1 00 00 0 0 0 1
X2 01 01 1 10 01 01
X3 00 10 00 110 011 001
X4 11 11 11 111 0111 0001

Code 1 and Code 2 are fixed length codes
Code 3, 4, 5 and 6 are variable length codes
All codes are distinct except Code 1
Codes 2, 4 and 6 are prefix-free, i.e. instantaneous, codes
Codes 2, 4, 5 and 6 are uniquely decodable codes

Code 5 is not a prefix-free code, yet it is still uniquely decodable since the bit 0
indicates the beginning of each codeword
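The unique-decodability claims in the table can be verified with the Sardinas–Patterson test, sketched here in Python (the test is a standard algorithm, not something stated in the source text):

```python
def is_uniquely_decodable(code):
    """Sardinas–Patterson test: a code is uniquely decodable iff no
    dangling suffix generated from it is itself a codeword."""
    c = set(code)
    # Initial dangling suffixes: what is left of b after stripping prefix a
    s = set()
    for a in c:
        for b in c:
            if a != b and b.startswith(a):
                s.add(b[len(a):])
    seen = set(s)
    while s:
        if s & c:          # a dangling suffix is a codeword: ambiguous
            return False
        nxt = set()
        for suf in s:
            for w in c:
                if w.startswith(suf) and w != suf:
                    nxt.add(w[len(suf):])
                if suf.startswith(w):
                    nxt.add(suf[len(w):])
        s = nxt - seen     # stop once no new suffixes appear
        seen |= nxt
    return True

print(is_uniquely_decodable(['0', '1', '00', '11']))     # Code 3: False
print(is_uniquely_decodable(['0', '01', '011', '0111'])) # Code 5: True
```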
 Let X be a discrete memoryless source with alphabet {x_1, x_2, ..., x_L}
 Let the length of the binary codeword corresponding to x_j be n_j

 A necessary and sufficient condition for the existence of an instantaneous binary code is
  K = Σ_{j=1}^{L} 2^(−n_j) ≤ 1

 This expression is the Kraft inequality

 It indicates the existence of an instantaneously decodable code with codeword lengths that satisfy the inequality
Xj Code 1 Code 2 Code 3 Code 4
X1 00 0 0 0
X2 01 10 11 100
X3 10 11 100 110
X4 11 110 110 111

 For code 1: K = 2^−2 + 2^−2 + 2^−2 + 2^−2 = 1 ≤ 1
 Hence this code satisfies the Kraft inequality

 For code 2: K = 2^−1 + 2^−2 + 2^−2 + 2^−3 = 1.125 > 1
 Hence this code does not satisfy the Kraft inequality

 For code 3: K = 2^−1 + 2^−2 + 2^−3 + 2^−3 = 1 ≤ 1
 Hence this code satisfies the Kraft inequality

 For code 4: K = 2^−1 + 2^−3 + 2^−3 + 2^−3 = 0.875 ≤ 1
 Hence this code satisfies the Kraft inequality
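The four Kraft sums above can be reproduced with a short Python sketch:

```python
def kraft_sum(codewords):
    """K = sum of 2**(-n_j) over the codeword lengths n_j."""
    return sum(2 ** -len(w) for w in codewords)

codes = {
    'code 1': ['00', '01', '10', '11'],
    'code 2': ['0', '10', '11', '110'],
    'code 3': ['0', '11', '100', '110'],
    'code 4': ['0', '100', '110', '111'],
}
for name, cw in codes.items():
    k = kraft_sum(cw)
    print(name, k, 'satisfies' if k <= 1 else 'violates')
# code 1: 1.0, code 2: 1.125 (violates), code 3: 1.0, code 4: 0.875
```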
