
TSBK01 Image Coding and Data Compression

Lecture 3: Source Coding Theory

Jörgen Ahlberg, Div. of Sensor Technology, Swedish Defence Research Agency (FOI)

Outline
1. Coding & codes
2. Code trees and tree codes
3. Optimal codes
   - The source coding theorem

Part 1: Coding & codes


Coding: To each source symbol (or group of symbols) we assign a codeword from an extended binary alphabet.

Types of codes:
- FIFO: Fixed input (i.e., # source symbols), fixed output (i.e., # code symbols).
- FIVO: Fixed input, variable output.
- VIFO: Variable input, fixed output.
- VIVO: Variable input, variable output.

FIVO and VIVO are called variable-length codes (VLC). They should be comma-free.

Example
Assume a memoryless source with alphabet A = {a1, ..., a4} and probabilities P(a1) = 1/2, P(a2) = 1/4, P(a3) = P(a4) = 1/8.

        a1   a2   a3   a4
FIFO:   00   01   10   11
FIVO:   0    10   110  111
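As a sanity check, the average codeword length of the two codes can be computed directly. A minimal sketch, assuming the probabilities P(a1) = 1/2, P(a2) = 1/4, P(a3) = P(a4) = 1/8; the names `avg_length`, `fifo`, and `fivo` are illustrative:

```python
# Expected codeword length for the FIFO and FIVO codes above,
# assuming P = (1/2, 1/4, 1/8, 1/8).
probs = [0.5, 0.25, 0.125, 0.125]
fifo = ["00", "01", "10", "11"]
fivo = ["0", "10", "110", "111"]

def avg_length(probs, code):
    """Average codeword length in bits per source symbol."""
    return sum(p * len(cw) for p, cw in zip(probs, code))

print(avg_length(probs, fifo))  # 2.0 bits/symbol
print(avg_length(probs, fivo))  # 1.75 bits/symbol
```

The variable-length code beats the fixed-length one because it gives short codewords to probable symbols.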

Four different classes

The classes are nested: All codes ⊃ Non-singular codes ⊃ Uniquely decodable codes ⊃ Instantaneous codes.

      Singular   Non-singular   Uniquely decodable   Instantaneous
a1    0          0              10                   0
a2    0          010            00                   10
a3    0          01             11                   110
a4    0          10             110                  111

Non-singular: Decoding problem: 010 could mean a1a4 or a2 or a3a1.

Uniquely decodable: Decoding problem: 1100000000000000001 is uniquely decodable, but the first symbol (a3 or a4) cannot be decoded until the third 1 arrives (compare 11010 and 110010).
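The non-singular decoding problem can be made concrete with a small brute-force parser; `parses` is a hypothetical helper written for this example, not a library function:

```python
# Brute-force parsing shows why the non-singular code in the table
# is not uniquely decodable: the string 010 has three valid parses.
code = {"a1": "0", "a2": "010", "a3": "01", "a4": "10"}

def parses(bits):
    """Return all ways to split `bits` into codewords of `code`."""
    if not bits:
        return [[]]
    result = []
    for sym, cw in code.items():
        if bits.startswith(cw):
            result.extend([sym] + rest for rest in parses(bits[len(cw):]))
    return result

print(parses("010"))  # [['a1', 'a4'], ['a2'], ['a3', 'a1']]
```

For a uniquely decodable code this function would return at most one parse for every input string.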

Data Compression
Efficient codes utilize the following properties:
- Uneven symbol probabilities
- Inter-symbol dependence (memory source)
- Acceptable distortion

Examples (one per property):
- Uneven probabilities: the FIVO example above.
- Memory: there's always an a3 after a1.
- Distortion: we don't care whether a symbol is a3 or a4.

Part 2: Code Trees and Tree Codes


Consider, again, our favourite example code {a1, ..., a4} → {0, 10, 110, 111}. The codewords are the leaves in a code tree.

[Figure: binary code tree with leaves a1 (0), a2 (10), a3 (110), a4 (111)]

Tree codes are comma-free and instantaneous. No codeword is a prefix of another!
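The instantaneous property can be illustrated with a short decoder for the prefix-free code above: walking down the tree, a symbol is emitted the moment a leaf is reached. A sketch, with `decode` as an illustrative name:

```python
# Instantaneous decoding of the prefix-free code {0, 10, 110, 111}:
# each codeword is recognized the moment its last bit arrives.
code = {"0": "a1", "10": "a2", "110": "a3", "111": "a4"}

def decode(bits):
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in code:          # a leaf of the code tree is reached
            out.append(code[buf])
            buf = ""             # restart at the root
    return out

print(decode("0101100"))  # ['a1', 'a2', 'a3', 'a1']
```

No lookahead is ever needed, which is exactly what "instantaneous" means.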

Kraft's Inequality
For a uniquely decodable code with codeword lengths l_i we have

    Σ_i 2^(-l_i) ≤ 1.

(Proof: Sayood 2.4)

Conversely, if this is valid for a given set of codeword lengths, it is possible to construct a code tree.
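A quick numeric check of the inequality; `kraft_sum` is an illustrative helper, and the two length sets are the exercise from the next slide:

```python
def kraft_sum(lengths):
    """Kraft sum  sum_i 2^(-l_i)  for binary codeword lengths."""
    return sum(2 ** -l for l in lengths)

print(kraft_sum([1, 2, 3, 3]))  # 1.0   -> a tree code exists
print(kraft_sum([1, 2, 2, 3]))  # 1.125 -> no uniquely decodable code
```

A sum of exactly 1 means the code tree is full: every leaf is used.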

Kraft's Inequality and Tree Codes


If KI is valid for a set of codeword lengths, there is a tree code with those lengths. Proof: Create a maximal tree whose depth is the longest codeword length lmax.
The tree then has 2^lmax leaves. Place the codewords, cut the tree, and use KI to prove that there are enough leaves. Let's illustrate.

Example: lmax = 3 gives a maximal tree with 2^3 = 8 leaves. [Figure: codewords of lengths l1, l2, l3, l4 placed in the tree; the subtree below each placed codeword cannot be used and is cut.]

Try with {li} = {1, 2, 3, 3} and {li} = {1, 2, 2, 3}!

Place l1 in the tree. Then 2^(lmax − l1) leaves disappear...
...and 2^lmax − 2^(lmax − l1) = 2^lmax (1 − 2^(-l1)) leaves remain.
Place l2 in the tree. Then 2^lmax (1 − 2^(-l1) − 2^(-l2)) leaves remain.
After placing all N codeword lengths, 2^lmax (1 − Σ_{i=1}^N 2^(-l_i)) leaves remain.

This is possible whenever KI is valid, i.e., Σ_i 2^(-l_i) ≤ 1.
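The placement argument can be turned into a small constructive sketch: sort the lengths, then assign each codeword the leftmost free position in the maximal tree. One possible construction under the stated assumptions, not the only one; `kraft_code` is an illustrative name:

```python
# Constructive half of Kraft's inequality: given lengths with
# sum 2^(-l_i) <= 1, build prefix-free codewords by walking the
# leaves of the maximal tree from left to right.
def kraft_code(lengths):
    assert sum(2 ** -l for l in lengths) <= 1, "Kraft's inequality violated"
    codewords, acc = [], 0.0
    for l in sorted(lengths):
        # codeword = first l bits of the binary expansion of acc
        # (exact here, since acc is a sum of powers of 1/2)
        codewords.append(format(int(acc * 2 ** l), "0{}b".format(l)))
        acc += 2 ** -l          # skip the subtree cut below this codeword
    return codewords

print(kraft_code([1, 2, 3, 3]))  # ['0', '10', '110', '111']
```

With {1, 2, 3, 3} this recovers the favourite example code; with {1, 2, 2, 3} the assertion fires, matching the exercise above.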

Part 3: Optimal Codes


Average codeword length: L̄ = Σ_i p_i l_i [bits/codeword]

Minimize L̄ subject to Kraft's Inequality Σ_i 2^(-l_i) ≤ 1.

Optimal codeword lengths l_i = −log2 p_i ⇒ the entropy limit L̄ = H(X) is reached!

But what about the integer constraints? l_i = −log2 p_i is not always an integer!
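For the four-symbol example source the ideal lengths happen to be integers, so the entropy limit is met exactly. A minimal check, assuming P = (1/2, 1/4, 1/8, 1/8):

```python
import math

# Ideal (generally non-integer) codeword lengths l_i = -log2 p_i
# and the resulting entropy limit, for the example source.
probs = [0.5, 0.25, 0.125, 0.125]
ideal = [-math.log2(p) for p in probs]            # [1.0, 2.0, 3.0, 3.0]
entropy = sum(p * l for p, l in zip(probs, ideal))
print(entropy)  # 1.75 bits/symbol, met exactly by {0, 10, 110, 111}

# With p = (0.3, 0.7) the ideal lengths are not integers:
print([-math.log2(p) for p in [0.3, 0.7]])
```

When the ideal lengths are not integers, no single-symbol code can reach the entropy, which motivates the next slide.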

The Source Coding Theorem


Assume that the source X is memoryless, and create the tree code for the extended source, i.e., blocks of n symbols. We have:

    H(X) ≤ L̄ < H(X) + 1/n  [bits/source symbol]

We can come arbitrarily close to the entropy!
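The theorem can be illustrated numerically with the lengths ⌈−log2 p⌉ applied to blocks of a binary source; a sketch with assumed probabilities (0.25, 0.75):

```python
import math
from itertools import product

# Sketch of the source coding theorem for a memoryless binary source:
# lengths ceil(-log2 p) on blocks of n symbols give a per-symbol rate
# within 1/n of the entropy.
p = {"0": 0.25, "1": 0.75}
H = -sum(q * math.log2(q) for q in p.values())    # entropy, ~0.811 bits

def rate(n):
    """Average bits per source symbol for the block code of length n."""
    total = 0.0
    for block in product(p, repeat=n):
        pb = math.prod(p[s] for s in block)       # block probability
        total += pb * math.ceil(-math.log2(pb))   # integer codeword length
    return total / n

for n in (1, 2, 4, 8):
    print(n, rate(n))  # squeezed between H and H + 1/n
```

Each rate satisfies H(X) ≤ rate < H(X) + 1/n, so the gap to the entropy shrinks as the block length grows.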

In Practice
Two practical problems need to be solved:
- Bit assignment
- The integer constraint

Theoretically: Choose l_i = −log2 p_i.


Rounding up is not always the best! Example: Binary source with p1 = 0.25, p2 = 0.75 ⇒ l1 = −log2 0.25 = 2, l2 = ⌈−log2 0.75⌉ = 1, but the lengths l1 = l2 = 1 give a shorter average (1 instead of 1.25 bits).

Instead, use, e.g., the Huffman algorithm (D. Huffman, 1952) to create an optimal tree code!
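A minimal sketch of the Huffman algorithm, repeatedly merging the two least probable subtrees; this dictionary-based version is illustrative rather than optimized:

```python
import heapq

# Minimal Huffman coder: merge the two least probable nodes until
# one tree remains, prepending a bit at each merge.
def huffman(probs):
    """Return a dict symbol -> codeword with minimal average length."""
    # heap entries: (probability, tiebreak, {symbol: partial codeword})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

code = huffman({"a1": 0.5, "a2": 0.25, "a3": 0.125, "a4": 0.125})
print(code)  # lengths 1, 2, 3, 3 -- the favourite example code
```

For this source the resulting lengths are {1, 2, 3, 3}, matching both the favourite example code and the entropy limit of 1.75 bits/symbol.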

Summary
- Coding: Assigning binary codewords to (blocks of) source symbols.
- Variable-length codes (VLC) and fixed-length codes.
- Code classes: Instantaneous ⊂ Uniquely decodable ⊂ Non-singular ⊂ All codes.
- Tree codes are instantaneous.
- Tree code ⇔ Kraft's Inequality.
- The Source Coding Theorem.
