
AGENDA: Coding and Compression

Introduction: sampling, Nyquist, transforms

Lossless Data Compression
Run-length, Huffman, dictionary compression

Audio
PCM, DPCM

Image
Hierarchical coding, subband coding, MPEG, JPEG, DCT, wavelet and Haar transforms

Introduction
A key problem with multimedia is the huge quantity of data that results from raw digitized audio, image or video sources. The main goal of coding and compression is to reduce the storage, processing and transmission costs of these data. A variety of coding and compression techniques are commonly used in the Internet and other systems.

Introduction
The components of a system are capturing, transforming, coding and transmitting:

[Pipeline: Sample -> Transform -> Coding]

Introduction
Sampling: analog-to-digital conversion.
An input signal is converted from some continuously varying physical value (e.g. pressure in air, or frequency or wavelength of light) into a continuously varying electrical signal by some electro-mechanical device. This continuously varying electrical signal can then be converted to a sequence of digital values, called samples, by an analog-to-digital conversion circuit.

Two factors determine how accurately the samples represent the original continuous signal:

Introduction
Sampling and the Nyquist theorem
1. The rate at which we sample. By Nyquist's theorem, the digital sampling rate must be at least twice the highest frequency in the continuous signal.

2. The number of bits used in each sample (known as the quantization level). However, it is often not necessary to capture all frequencies in the original signal. For example, voice is comprehensible with a much smaller range of frequencies than we can actually hear.

Introduction - Transform
The goal of a transform is to decorrelate the original signal; this decorrelation results in the signal energy being redistributed among only a small set of transform coefficients. The original data can be transformed in a number of ways to make it easier to apply certain compression techniques. The most common transforms in current techniques are the Discrete Cosine Transform and the wavelet transform.

Compression techniques were developed early in the life of computers to cope with the problems of limited memory and storage capacity. Hardware advances have reduced the need for such techniques in desktop applications, but network and communication capacity restrictions have resulted in continuing work on compression, and the advent of distributed multimedia has driven considerable further development. Problem: real-time, or timely, transmission of audio and video over communications networks.

Image compression - the technique of reducing the number of bits required to store an image.

Image compression is necessitated by:
- a need to store data efficiently in available memory
- a need to transmit data efficiently over available communication channels

Media              Sample Rate                          Data Size & Rate
Speech             8000 samples/sec                     7.8 KB/s
CD Audio           44,100 samples/sec, 2 bytes/sample   172 KB/s
Satellite Images   180x180 km2, 30 m2 resolution        1030 MB/image
VGA Video          25 frames/sec, 640x480 pixels,       22 MB/s
                   3 bytes/pixel

Another View

Data Rate   Size/Hour
128 Kb/s    60 MB
384 Kb/s    170 MB
1.5 Mb/s    680 MB
3.0 Mb/s    1.4 GB
6.0 Mb/s    2.7 GB
25 Mb/s     11.0 GB

Video Data Size

[Table: size of uncompressed video in gigabytes, for image sizes 1920x1080, 1280x720 (aspect 1.77), 640x480 (aspect 1.33), 320x240 and 160x120]

Bandwidth requirements of images in some applications

Fax                  250 KB/image
Digital Cameras      18-150 MB/image
Digital Television   166 MB/second

Image compression standards are necessitated for ease of exchange between software and hardware.
Standards are developed by different standards bodies: ISO, ITU, ANSI, etc. Some popular image compression standards: JPEG, MPEG-1, MPEG-2, MPEG-4, etc. It is important to note that there are also many proprietary compression codecs!

How (and why) can images be compressed?

Images can be compressed by exploiting two characteristics of digital images:
- Redundancy: looks at properties of an image and reduces redundant data
- Irrelevancy: much of the data in an image may be irrelevant to a human observer

Video Bit Rate Calculation

bits/sec = (width * height * depth * fps) / compression factor

- width: pixels (160, 320, 640, 720, 1280, 1920, ...)
- height: pixels (120, 240, 480, 485, 720, 1080, ...)
- depth: bits (1, 4, 8, 15, 16, 24, ...)
- fps: frames per second (5, 15, 20, 24, 30, ...)
- compression factor: (1, 6, 24, ...)
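As a worked check of this formula, here is a minimal Python sketch (the function name and sample values are illustrative, not from the original slides):

    def video_bit_rate(width, height, depth, fps, compression_factor=1):
        """Video bit rate in bits per second, per the formula above."""
        return width * height * depth * fps / compression_factor

    # 640x480, 24-bit colour, 30 frames/sec, uncompressed:
    print(video_bit_rate(640, 480, 24, 30))      # 221184000.0, about 221 Mbit/s
    # the same stream with a 24:1 compression factor:
    print(video_bit_rate(640, 480, 24, 30, 24))  # 9216000.0, about 9.2 Mbit/s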

Effects of Compression

[Table: storage for 1 hour of compressed video in megabytes, at compression ratios from 1:1 up to 100:1, for image sizes from 1920x1080 down to 160x120]

3 bytes/pixel, 30 frames/sec

Categories of Compression Techniques

- Entropy Encoding: Run-Length Encoding, Huffman Coding, LZW, Arithmetic Coding
- Source Encoding: Prediction, Transformation (e.g. DCT), Layered Coding, Vector Quantization
- Hybrid Coding: JPEG, MPEG, H.261, DVI RTV, DVI PLV

Digital Video and Image Coding, Compression

[Diagram: taxonomy of compression techniques]
- Simple: Truncation, CLUT, Run-length
- Interpolative: Subsample
- Predictive: DPCM, ADPCM, Motion Compensation
- Transform: DCT
- Statistical: Huffman (Fixed, Adaptive)

[Diagram: Video Colour Input Components -> Video Compression Algorithm -> Bit Assignment -> Compressed Bit-Stream]

As can be seen from the diagram, the majority of video compression algorithms use a combination of compression techniques to produce the bit-stream. We will consider each of the individual techniques identified in the diagram. We assume that all input to the system is in the form of a PCM (Pulse Code Modulation, discussed later when considering sound sampling) digitised signal in colour component (RGB, YUV) form. Selection of the colour component form can be important where there are differences in colour processing between compression and decompression. Techniques can be made adaptive to the image content.

Simple Compression (Encoding) Techniques

Truncation
- throw away the least significant bits of each pixel
- too much truncation will cause contouring: the image becomes cartoon-like
- for real images, truncation from 24bpp to 16bpp gives good results (RGB = 5:5:5 + keying bit; YUV = 6:5:5)

CLUT (Colour Lookup Table)
- pixel values in the bitmap represent an index into a table of colours
- usually 8bpp, so the image is limited to 256 colours
- a unique CLUT can be created for each image, but this requires non-trivial preprocessing
- bpp can be increased for better quality, but once you reach 16bpp, truncation is better and simpler

Run-length Encoding
- blocks of repeated pixels are replaced with a single value plus a count
- works well on images with large blocks of solid colour, where it can achieve compression rates below 1bpp
- good for computer-generated images, cartoons, etc.; poor for real images, video, etc.

Interpolative Techniques
Interpolative encoding works at the pixel level by transmitting a subset of the pixels and using interpolation to reconstruct the intervening pixels.
- not really compression, as we are reducing the number of pixels rather than the size of their representation
- validly used in colour subsampling: working with luminance-chrominance component images (YUV), it can reduce 24bpp to 9bpp
- also used in motion video compression (i.e. MPEG)

Predictive Techniques
Based on the fact that we can store the previous item (frame, line, pixel, etc.) and use it to help build the next item, allowing us to transmit only the part of the item that has changed.

DPCM
Compare adjacent pixels and transmit only the difference between them. Because adjacent pixels are likely to be similar, the difference values have a high probability of being small and can safely be transmitted with fewer bits; hence we can use 4-bit difference values for 8-bit pixels. In decompression the difference value is used to modify the previous pixel to get the new one, which works well as long as the amplitude change is small. A full-amplitude change, say from black to white, would overload the DPCM system, requiring a number of pixel times to complete the change and causing smearing of the edges in high-contrast images (slope overload).
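A minimal Python sketch of this idea, assuming 8-bit samples and 4-bit signed difference codes (-8..7); the names are illustrative, not from any standard codec. Clamping the difference is exactly what produces slope overload on a black-to-white edge:

    def dpcm_encode(pixels):
        """Encode 8-bit pixels as clamped 4-bit differences."""
        prev, codes = 0, []
        for p in pixels:
            diff = max(-8, min(7, p - prev))      # 4-bit signed difference
            codes.append(diff)
            prev = max(0, min(255, prev + diff))  # track the decoder's reconstruction
        return codes

    def dpcm_decode(codes):
        prev, out = 0, []
        for d in codes:
            prev = max(0, min(255, prev + d))
            out.append(prev)
        return out

    # A full-amplitude step overloads the system: the reconstruction can only
    # climb by 7 per pixel, smearing the edge (slope overload).
    print(dpcm_decode(dpcm_encode([0, 0, 255, 255, 255])))  # [0, 0, 7, 14, 21]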

ADPCM (Adaptive DPCM)
Adapts the step size of the difference values to cope with full-amplitude changes, at the cost of some extra overhead in data and processing to achieve the adaptation. This replaces slope overload with quantisation noise at high-contrast edges.

Since predictive encoding depends on previous pixels to build future ones, any errors are likely to be propagated. To avoid this, predictive schemes typically restart the differencing at intervals, often at the beginning of each scan line or each frame.

Transform Coding Techniques

A transform is a process that converts a bundle of data into an alternate form which is more convenient for some purpose. Transforms are usually reversible, using an inverse transform.

Lossless Data Compression

Lossless means the reconstructed image loses no information with respect to the original. There is a huge range of lossless data compression techniques. The common techniques are:
- run-length encoding
- Huffman coding
- dictionary techniques

Lossless Data Compression

Run-length compression
Removes repetitions of values, replacing them with a counter and a single value. Fairly simple to implement. Its performance depends heavily on the statistics of the input data: the longer the runs of repeated values, the more space we can save.

Lossless Data Compression

Huffman compression
Uses fewer bits to represent the most frequently occurring characters/codeword values, and more bits for the less commonly occurring ones. It is the most widespread way of replacing a set of fixed-size codewords with an optimal set of variable-sized codewords, based on the statistics of the input data. Sender and receiver must share the same codebook, which lists the codes and their compressed representations.

Lossless Data Compression

Dictionary compression
Looks at the data as it arrives and forms a dictionary. When new input arrives, it is looked up in the dictionary. If the input is found, its dictionary position is transmitted; if not, it is added to the dictionary in a new position, and the new position and string are sent out. The dictionary is constructed at the receiver in exactly the same way, so there is no need to gather statistics or share a table separately.

In image and video compression, the bundle of data is usually a two-dimensional array of pixels, e.g. 8x8.

A 2x2 array of pixels:

    A B
    C D

Transform:          Inverse transform:
X0 = A              A = X0
X1 = B - A          B = X1 + X0
X2 = C - A          C = X2 + X0
X3 = D - A          D = X3 + X0

In the simple example shown, if the pixels were 8 bits each then the block would use 32 bits.
Using the transform we could assign 4 bits each to the difference values and 8 bits to the base pixel A. This reduces the data to 8 + (3x4) = 20 bits for the 2x2 block, compressing from 8bpp to 5bpp. This example is too small to be useful; typically transforms operate on 8x8 blocks, and the trick is to develop good transforms whose calculations are easy to implement in hardware or software.
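A small Python sketch of this transform pair (illustrative only):

    def transform_2x2(a, b, c, d):
        """One base pixel plus three differences."""
        return a, b - a, c - a, d - a

    def inverse_2x2(x0, x1, x2, x3):
        return x0, x1 + x0, x2 + x0, x3 + x0

    block = (100, 102, 99, 101)
    assert inverse_2x2(*transform_2x2(*block)) == block
    # The differences (2, -1, 1) fit comfortably in 4 bits each, so the
    # block needs 8 + 3*4 = 20 bits instead of 32.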

The Discrete Cosine Transform

- especially important for video and image compression
- typically used on 8x8 pixel blocks: 64 pixel values are processed and 64 new values are output, representing the amplitudes of the two-dimensional spatial frequency components of the 64-pixel block; these are referred to as DCT coefficients
- the coefficient for zero spatial frequency is called the DC coefficient; the remaining 63 are the AC coefficients, and they represent amplitudes of progressively higher spatial frequencies in the block

As adjacent pixel values tend to be similar or to vary slowly from one to another, the DCT processing provides the opportunity for compression by forcing most of the signal energy into the lower spatial frequency components. In most cases, many of the higher-frequency coefficients will have zero or near-zero values and can be ignored.

Statistical Coding
Uses the statistical distribution of the pixel values in an image, or of the data created by one of the techniques already described. Also known as entropy encoding. Can be used in bit assignment as well as as part of the compression algorithm itself. Because the distribution of pixel values is non-uniform, we can set up a coding technique where the more frequently occurring values are encoded using fewer bits.

A codebook is created which sets out the encodings for the pixel values; this is transmitted separately from the image data and can apply to part of an image, a single image or a sequence of images. Because the most frequently occurring values are transmitted using fewer bits, high compression ratios can be achieved. One of the most widely used forms of statistical coding is Huffman encoding.

Motion Compensation
If we are transmitting video frames by describing the difference between one frame and the next, how do we describe motion?
- Compare frames for differences
- Set a threshold value for motion
- Use a DPCM approach to encode the data
- Use a block structure to determine motion in parts of the image (similar to the transform approach)
- In sophisticated compression systems, motion vectors can be developed to ensure fidelity of reproduction

Classification of Compression Algorithms

Lossless compression
- image is mathematically equivalent to the original
- achieves only modest levels of compression (around 5:1)

Lossy compression
- image shows degradation from the original
- high rates of compression (up to 200:1)
- objective: achieve the highest possible rate of compression while maintaining image quality, ideally virtually lossless

What is ...?
- JPEG (Joint Photographic Experts Group): still image compression, intraframe picture technology; MJPEG is a sequence of images coded with JPEG
- MPEG (Moving Picture Experts Group): many standards (MPEG-1, MPEG-2 and MPEG-4); very sophisticated technology involving intra- and interframe picture coding and many other optimizations, giving high quality at a high cost in time/computation
- H.261/H.263/H.263+: video conferencing; low to medium bit rate, quality and computational cost; used in the H.320 and H.323 video conferencing standards

Image, Video and Audio Standards

JPEG
- good compression, widespread applicability

Lossless compression:
- a predictor for each pixel: comparison with surrounding pixels, difference computed
- the value of the difference is replaced with a code from a code table, developed using Huffman encoding
- the code table forms part of the encoded image

Lossy compression:
- image broken down into 8x8 blocks
- apply DCT and then quantize the image
- encode using the same system as for lossless

JPEG Compression Ratios

Quality                  Bits per pixel
Moderate to good         0.25 - 0.5
Good to very good        0.5 - 0.75
Excellent                0.75 - 1.5
Near original quality    1.5 - 2.0

JPEG can be used for video information (Motion JPEG), but it makes no concession to the nature of video, maintaining the same structure and, more importantly, the same bit rate for each frame of the video.

JBIG
- lossless compression of one-bit/pixel (binary or bi-level) images
- based on a template structure to model redundancy within the image
- uses arithmetic encoding
- intended primarily for use with fax

MPEG
MPEG-1: data rate 1-1.5 Mbps. Features:
- random access to frames, to allow starting the video sequence at any point
- fast forward and reverse searches, to view video in either direction at more than the original speed
- reverse playback, to permit a reverse play mode (not appropriate in situations such as video telephony)
- audio-video synchronisation, to manage lip-synch
- robustness to errors: should be able to recover from errors, and not propagate errors through frames; particularly important when dealing with non-error-free communication channels
- adjustable delay time for real-time operation: not a factor in normal video playback, but of particular importance in video telephony
- editability: permit inclusion of other video in encoded sections
- flexible format: permit different window sizes and frame rates
- implementable in hardware: a dedicated chipset for decoding is desirable

Algorithm
Based on 3 types of frame:
- I-frame: similar to a JPEG still image; the basis of the encoding, as it contains the maximum amount of information
- P-frame: contains less information than an I-frame; obtained by motion-compensated prediction from past I-frames
- B-frame: has the greatest level of compression, or the least information; obtained by interpolation between an I-frame and a P-frame

Audio
Various elements of audio capture are defined in the MPEG standard; see the handout.

Related standards: MPEG-2, MPEG-4, MPEG-7, MPEG-21

Audio
The input audio signal from a microphone is passed through several stages:
- first, a band-pass filter is applied, eliminating frequencies in the signal that we are not interested in
- the signal is then sampled, converting the analog signal into a sequence of values
- each value is then quantised, i.e. mapped onto one of a set of fixed values
- these values are then coded for storage or transmission
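A minimal Python sketch of the sampling and quantisation stages, assuming a uniform quantiser over a fixed input range (function and parameter names are illustrative):

    import math

    def quantize(x, levels=256, lo=-1.0, hi=1.0):
        """Map a continuous sample in [lo, hi] to one of `levels` fixed codes."""
        step = (hi - lo) / levels
        return min(levels - 1, int((x - lo) / step))

    # Sample a 1 kHz tone at 8000 samples/sec, then quantise to 8 bits.
    samples = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(8)]
    print([quantize(s) for s in samples])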

Audio
Some techniques for audio compression: ADPCM, LPC, CELP.

Audio
ADPCM: Adaptive Differential Pulse Code Modulation
ADPCM allows for the compression of PCM-encoded input whose power varies with time. A reconstructed version of the input signal is fed back and subtracted from the actual input signal, and the difference is quantised to give a 4-bit output value. This compression gives a 32 kbit/s output rate.

Audio

[Diagram: ADPCM transmitter and receiver. Transmitter: the predictor output Xm' is subtracted from the original input Xm to give the error Em, which is quantised to Em*, coded, and sent over the channel; Em* is also added back to Xm' to form the reconstructed value Xm* that drives the predictor. Receiver: Em* is decoded and added to the local predictor output Xm' to form the reconstructed signal Xm*.]

Audio
LPC: Linear Predictive Coding
The encoder fits speech to a simple, analytic model of the vocal tract; only the parameters describing the best-fit model are transmitted to the decoder. An LPC decoder uses those parameters to generate synthetic speech that is usually very similar to the original. LPC is used to compress audio at 16 kbit/s and below.

Audio -- CELP
CELP: Code Excited Linear Predictor
CELP does the same LPC modelling but then computes the errors between the original speech and the synthetic model, and transmits both the model parameters and a very compressed representation of the errors. The result is much higher-quality speech at a low data rate.

CODING


Huffman
Uncompressed image, audio and video data require considerable storage capacity. Transferring uncompressed video data over digital networks requires very high bandwidth for a single point-to-point communication. Without compression, a CD with a storage capacity of approximately 600 million bytes could store only about 260 pictures (1024x768, true colour) or, at the 25 frames per second rate of a motion picture, about 10 seconds of a movie.

Compression Terminology
Compression Ratio
The ratio of raw data to compressed data. It is computed by dividing the original number of bits or bytes by the number of bits or bytes remaining after compression, or expressed as a percentage of compressed/original. For lossless compression, ratios of 2:1 (50%) or 3:1 (33%) are typical. For lossy compression of video, ratios of more than 100:1 may be achievable, depending on the effectiveness of the compression algorithm and the acceptable information loss.

Image Compression
A 2-stage coding technique:
1. A linear predictor such as DPCM, or some other linear predicting function, to decorrelate the raw image data
2. A standard coding technique, such as Huffman coding or arithmetic coding

Lossless JPEG:
- version 1: DPCM with arithmetic coding
- version 2: DPCM with Huffman coding

Entropy Encoding
Used regardless of the medium's specific characteristics. The data stream to be compressed is considered a simple digital sequence, and the semantics of the data are ignored. It is a lossless process.

Source Encoding
Takes into account the semantics of the data. The degree of compression that can be reached by source encoding depends on the data contents. It is usually a lossy process.

Run-Length Encoding (RLE)

RLE is mostly useful when we have to deal with palette-based images that contain long runs of equal colours. The idea in RLE is to encode a string of repeated characters as a count of the number of repetitions plus one copy of the character. For example, the string AAAABBBAABBBBBCCCCCCCCDABCBAAABBBBCCCD can be compressed as 4A3BAA5B8CDABCB3A4B3CD, where 4A means four A's, and so forth. This example represents 38 bytes of data in 22 bytes, a compression ratio of 38/22 = 1.73.
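A minimal Python encoder matching the convention of this example (runs of three or more become a count plus the character; shorter runs are left bare, as AA is above; this convention is an assumption, since the slides do not spell it out):

    def rle_encode(s):
        """Encode runs of repeated characters as count + character."""
        out, i = [], 0
        while i < len(s):
            j = i
            while j < len(s) and s[j] == s[i]:
                j += 1                                  # extend the current run
            run = j - i
            out.append(f"{run}{s[i]}" if run > 2 else s[i] * run)
            i = j
        return "".join(out)

    print(rle_encode("AAAABBBAABBBBBCCCCCCCCDABCBAAABBBBCCCD"))
    # 4A3BAA5B8CDABCB3A4B3CD  (38 characters down to 22)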

Run-Length Encoding (RLE)

For binary files, simply store the run lengths, taking advantage of the fact that runs alternate between 0 and 1, assuming a run of 0s comes first. For example, 0000011000 can be compressed as 5#2#3, while 11000 is compressed as 0#2#3. Run-length encoding works very well for images with solid backgrounds, like cartoons; for natural images it doesn't work that well.

The Huffman Coding Algorithm: History

In 1951, David Huffman and his MIT information theory classmates were given the choice of a term paper or a final exam. Huffman hit upon the idea of using a frequency-sorted binary tree, and quickly proved this method the most efficient. In doing so, the student outdid his professor, who had worked with information theory inventor Claude Shannon to develop a similar code. Huffman built the tree from the bottom up instead of from the top down.

Huffman Coding Algorithm

1. Take the two least probable symbols in the alphabet. (These two will get the longest codewords, of equal length, differing only in the last digit.)
2. Combine these two symbols into a single symbol, and repeat.
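A compact Python sketch of this procedure using a heap; the exact codewords depend on how ties are broken, but the total encoded length does not (names are illustrative):

    import heapq
    from collections import Counter

    def huffman_code(text):
        """Repeatedly merge the two least probable symbols, prefixing
        their codes with 0 and 1."""
        heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(Counter(text).items())]
        heapq.heapify(heap)
        tie = len(heap)                        # tiebreaker so dicts are never compared
        while len(heap) > 1:
            f1, _, c1 = heapq.heappop(heap)    # two smallest frequencies
            f2, _, c2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in c1.items()}
            merged.update({s: "1" + c for s, c in c2.items()})
            heapq.heappush(heap, (f1 + f2, tie, merged))
            tie += 1
        return heap[0][2]

    code = huffman_code("ABRACADABRA")
    print(len("".join(code[ch] for ch in "ABRACADABRA")))  # 23 bits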

A simple example
Suppose we have a message built from an alphabet of 5 symbols, e.g. [ ]. How can we code this message using 0/1 so that the coded message has minimum length (for transmission or saving)?
With 5 symbols, a fixed-length code needs at least 3 bits per symbol, so for a simple encoding of a 10-symbol message the length of the code is 10*3 = 30 bits.

A simple example (cont.)

Intuition: symbols that occur more frequently should have shorter codes. Since the code lengths are then not all the same, there must be a way of telling where each codeword ends. With a Huffman code, the length of the encoded message is 3*2 + 3*2 + 2*2 + 3 + 3 = 22 bits.

Definitions

An ensemble X is a triple (x, A_x, P_x):
- x: the value of a random variable
- A_x: the set of possible values for x, A_x = {a_1, a_2, ..., a_I}
- P_x: the probability for each value, P_x = {p_1, p_2, ..., p_I}, where P(x) = P(x = a_i) = p_i, p_i > 0 and $\sum_i p_i = 1$

Shannon information content of x:
$h(x) = \log_2 \frac{1}{P(x)}$

Entropy of X:
$H(X) = \sum_{x \in A_x} P(x) \log_2 \frac{1}{P(x)}$

i     a_i   p_i     h(p_i)
1     a     .0575   4.1
2     b     .0128   6.3
3     c     .0263   5.2
...   ...   ...     ...
26    z     .0007   10.4
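These definitions translate directly into a short Python check against the table values (illustrative):

    import math

    def entropy(probs):
        """H(X) = sum of P(x) * log2(1/P(x)) over the ensemble."""
        return sum(p * math.log2(1 / p) for p in probs if p > 0)

    # Shannon information content of single outcomes, as in the table:
    print(round(math.log2(1 / 0.0575), 1))   # h(a) = 4.1 bits
    print(round(math.log2(1 / 0.0128), 1))   # h(b) = 6.3 bits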

Source Coding Theorem

There exists a variable-length encoding C of an ensemble X such that the average length of an encoded symbol, L(C,X), satisfies
$L(C,X) \in [H(X), H(X)+1)$

Symbol Codes
Notation:
- A^N: all strings of length N; A^+: all strings of finite length
- {0,1}^3 = {000, 001, 010, ..., 111}
- {0,1}^+ = {0, 1, 00, 01, 10, 11, 000, 001, ...}
A symbol code C for an ensemble X is a mapping from A_x (the range of x values) to {0,1}^+. c(x) is the codeword for x; l(x) is the length of the codeword.

Example
Ensemble X: A_x = {a, b, c, d}, P_x = {1/2, 1/4, 1/8, 1/8}

Code C0:
a_i   c(a_i)   l_i
a     1000     4
b     0100     4
c     0010     4
d     0001     4

c(a) = 1000; c+(acd) = 100000100001 (called the extended code)

The code should achieve as much compression as possible.

The expected length L(C,X) of symbol code C for X is
$L(C,X) = \sum_{x \in A_x} P(x)\, l(x) = \sum_{i=1}^{|A_x|} p_i l_i$

Example
Ensemble X: A_x = {a, b, c, d}, P_x = {1/2, 1/4, 1/8, 1/8}

Code C1:
a_i   c(a_i)   l_i
a     0        1
b     10       2
c     110      3
d     111      3

c+(acd) = 0110111 (9 bits, compared with 12 for C0). Is this a prefix code?

Example
A_x = {a, b, c, d, e}, P_x = {0.25, 0.25, 0.2, 0.15, 0.15}

[Huffman tree: d (0.15) and e (0.15) merge into a node of weight 0.3; a (0.25) and that node merge into 0.55; b (0.25) and c (0.2) merge into 0.45; the nodes 0.55 and 0.45 merge into the root 1.0]

Resulting codewords: a = 00, b = 10, c = 11, d = 010, e = 011

Huffman Coding Algorithm for Image Compression

Step 1. Build a Huffman tree by sorting the histogram and successively combining the two bins of lowest value until only one bin remains.
Step 2. Encode the Huffman tree and save it with the coded values.
Step 3. Encode the residual image.

Example Huffman encoding

A = 0, B = 100, C = 1010, D = 1011, R = 11
ABRACADABRA = 01001101010010110100110
This is eleven letters in 23 bits. A fixed-width encoding would require 3 bits for five different letters, or 33 bits for 11 letters. Notice that the encoded bit string can be decoded!

In this example, A was the most common letter. In ABRACADABRA there are:
5 A's, 2 R's, 2 B's, 1 C, 1 D
and correspondingly:
- the code for A is 1 bit long
- the code for R is 2 bits long
- the code for B is 3 bits long
- the code for C is 4 bits long
- the code for D is 4 bits long

Creating a Huffman encoding

For each encoding unit (a letter, in this example), associate a frequency (the number of times it occurs). You can also use a percentage or a probability.
Create a binary tree whose children are the encoding units with the smallest frequencies. The frequency of the root is the sum of the frequencies of the leaves.
Repeat this procedure until all the encoding units are in the binary tree.

Example, step I
Assume that the relative frequencies are: A: 40, B: 20, C: 10, D: 10, R: 20.
The smallest numbers are 10 and 10 (C and D), so connect those.

Example, step II
C and D have already been used, and the new node above them (call it C+D) has value 20. The smallest values are now B, C+D and R, all of which have value 20. Connect any two of these.

Example, step III
The smallest value is now R (20), while A and B+C+D both have value 40. Connect R to either of the others.

Example, step IV
Connect the final two nodes.

Example, step V
Assign 0 to left branches, 1 to right branches. Each encoding is a path from the root, terminating at a leaf:
A = 0, B = 100, C = 1010, D = 1011, R = 11

Example 2
Suppose we want to compress the message ABEACADABEA. The count table is:

character   A   B   C   D   E
frequency   5   2   1   1   2

The following binary tree can then be constructed:

[Huffman tree with leaves A, B, E, C, D]

Example 2
The following table is used to encode the characters:

character        A   B    C      D      E
representation   1   01   0010   0011   000

The message ABEACADABEA can then be encoded as the string 10100010010100111010001.

Total number of bits to code the string = 5*1 + 2*2 + 1*4 + 1*4 + 2*3 = 23.
If the original message uses 8 bits per character, its length is 11*8 = 88; the compression ratio is 88/23 = 3.83 (26.14%).
If the original message uses 3 bits per character, its length is 11*3 = 33; the compression ratio is 33/23 = 1.43 (69.70%).


Example 3
a) Using the Huffman coding algorithm, compress a string with the following probabilities of occurrence:

Character     A       B       C       D       E       F       G
Probability   0.325   0.250   0.150   0.125   0.050   0.075   0.025

b) Assuming that the above string was originally represented by a 3-bit code, calculate the compression ratio achieved.

Example 3

[Huffman tree: E (0.050) and G (0.025) merge into 0.075; that node and F (0.075) merge into 0.15; that node and D (0.125) merge into 0.275; A (0.325) and the 0.275 node merge into 0.6; B (0.250) and C (0.150) merge into 0.4; the nodes 0.6 and 0.4 merge into the root 1.0]

Encode the characters: A(00) B(10) C(11) D(011) E(01010) F(0100) G(01011)

Example 3

A(00) B(10) C(11) D(011) E(01010) F(0100) G(01011)
Original expected length: 3 bits.
Compressed expected length: 0.325(2) + 0.250(2) + 0.150(2) + 0.125(3) + 0.050(5) + 0.075(4) + 0.025(5) = 2.5 bits.
Compression ratio = 3/2.5 = 1.2 (83.33%)

LZW

LZW (Lempel-Ziv-Welch) is a dictionary-based compression algorithm created by Abraham Lempel, Jacob Ziv and Terry Welch. It was published by Welch in 1984 as an improved implementation of the LZ78 algorithm published by Lempel and Ziv in 1978. The algorithm is designed to be fast to implement, but it is not necessarily optimal, since it does not perform any analysis on the data.

LZW
The principle of encoding
The algorithm is surprisingly simple: it replaces strings of characters with single codes. It does not do any analysis of the incoming text; instead, it just adds every new string of characters it sees to a table of strings. Compression occurs whenever a single code is output instead of a string of characters. LZW became very widely used after it became part of the GIF image format in 1987.

LZW
The principle of encoding
Most implementations of LZW use 12-bit codewords to represent 8-bit input characters, so the string table has 4096 locations. The first 256 locations are initialized to the single characters (location 0 stores character 0, location 1 stores character 1, and so on). As new combinations of characters are parsed from the input stream, they are added to the string table and stored in locations 256 to 4095.

Algorithm
The compression algorithm is as follows:

    Initialize table with single character strings
    STRING = first input character
    WHILE not end of input stream
        CHARACTER = next input character
        IF STRING + CHARACTER is not in the string table
            add STRING + CHARACTER to the string table
            output the code for STRING
            STRING = CHARACTER
        ELSE
            STRING = STRING + CHARACTER   // keep extending the current string
        END IF
    END WHILE
    output the code for STRING
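A direct Python translation of this pseudocode (illustrative; a real implementation would also cap the table at 4096 entries and pack the output into 12-bit codes):

    def lzw_encode(data):
        """LZW compression: codes 0-255 are single characters, new
        strings are added from location 256 upward."""
        table = {chr(i): i for i in range(256)}
        next_code, string, out = 256, "", []
        for ch in data:
            if string + ch in table:
                string += ch                 # keep extending the current match
            else:
                out.append(table[string])    # emit the code for STRING
                table[string + ch] = next_code
                next_code += 1
                string = ch
        if string:
            out.append(table[string])
        return out

    print(lzw_encode("BABAABAAA"))   # [66, 65, 256, 257, 65, 260]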

LZW Encoding Example

Compression of the string BABAABAAA yields the following trace:

STRING   CHARACTER   output code   index   entry
B        A           66            256     BA
A        B           65            257     AB
BA       A           256           258     BAA
AB       A           257           259     ABA
A        A           65            260     AA
AA       A           260

The compressed data is <66><65><256><257><65><260>

LZW Decoding Algorithm

    Initialize table with single character strings
    OLD = first input code
    output translation of OLD
    CHARACTER = translation of OLD
    WHILE not end of input stream
        NEW = next input code
        IF NEW is in the string table
            STRING = translation of NEW
        ELSE
            STRING = translation of OLD + CHARACTER
        END IF
        output STRING
        CHARACTER = first character of STRING
        add translation of OLD + CHARACTER to the string table
        OLD = NEW
    END WHILE
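A matching Python translation of the decoder (illustrative, with the same caveats as the encoder sketch above):

    def lzw_decode(codes):
        """Rebuild the string table from the codes themselves, handling
        the one case where a code is not yet in the table."""
        table = {i: chr(i) for i in range(256)}
        next_code = 256
        old = codes[0]
        result = character = table[old]
        for new in codes[1:]:
            if new in table:
                string = table[new]
            else:                            # undefined code: OLD + its first char
                string = table[old] + character
            result += string
            character = string[0]
            table[next_code] = table[old] + character
            next_code += 1
            old = new
        return result

    print(lzw_decode([66, 65, 256, 257, 65, 260]))   # BABAABAAA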

LZW Decoding Algorithm

The LZW algorithm does not need to pass the string table to the decompression code: the table can be rebuilt exactly as it was during compression, using the input stream as data. It starts with the first 256 table entries initialized to single characters, and adds a new string to the string table each time it reads in a new code. When a code is read that is not yet defined in the table, the decoder translates OLD and appends CHARACTER to it.

LZW Decoding Example

The decompression of our compressed data <66><65><256><257><65><260> gives the following trace (the table is initialized with 0..255 = single characters):

OLD = 66 (B), character = B, output = B

NEW        STRING   output   CHARACTER   index   entry
65 (A)     A        A        A           256     BA
256 (BA)   BA       BA       B           257     AB
257 (AB)   AB       AB       A           258     BAA
65 (A)     A        A        A           259     ABA
260 (AA)   AA       AA       A           260     AA

The output is BABAABAAA

DPCM: Differential Pulse Code Modulation

DPCM is an efficient way to encode highly correlated analog signals into binary form suitable for digital transmission, storage, or input to a digital computer. Patented by Cutler (1952).

DPCM

[Figure: block diagram of a DPCM encoder and decoder]

DCT: Discrete Cosine Transform

The DCT converts the information contained in a block (8x8) of pixels from the spatial domain to the frequency domain.
A simple analogy: consider an unsorted list of 12 numbers between 0 and 3: (2, 3, 1, 2, 2, 0, 1, 1, 0, 1, 0, 0). Consider a transformation of the list involving two steps: (1) sort the list; (2) count the frequency of occurrence of each of the numbers, giving (4, 4, 3, 1). Through this transformation we lost the spatial information but captured the frequency information. There are other transformations which retain the spatial information, e.g. the Fourier transform, the DCT, etc., therefore allowing us to move back and forth between the spatial and frequency domains.

1-D DCT (8-point):
$F(u) = \frac{a(u)}{2} \sum_{n=0}^{7} f(n) \cos\frac{(2n+1)u\pi}{16}$

1-D inverse DCT:
$f(n) = \sum_{u=0}^{7} \frac{a(u)}{2} F(u) \cos\frac{(2n+1)u\pi}{16}$

where $a(0) = 1/\sqrt{2}$ and $a(u) = 1$ for $u \neq 0$.

Discrete Cosine Transform

JPEG is a lossy compression scheme based on colour space conversion and the discrete cosine transform (DCT). The DCT is a method of decomposing a block of data into a weighted sum of spatial frequencies. Each of these spatial frequency patterns has a corresponding coefficient: the amplitude needed to represent the contribution of that spatial frequency pattern in the block of data being analyzed.

Discrete Cosine Transform

If only the low-frequency DCT coefficients are nonzero, the data in the block vary slowly with position. If high frequencies are present, the block intensity changes rapidly from pixel to pixel.

Discrete Cosine Transform

[Figure: pixel values along row 256 of the lenna image (left) and the absolute DCT values of lenna row 256 (right)]

Discrete Cosine Transform

MPEG uses a 2-D 8x8 form of the DCT. The coefficients U(i, j) of the DCT for input data V(x, y) are determined by the following formula:

$U(i,j) = \frac{1}{4} C(i) C(j) \sum_{x=0}^{7} \sum_{y=0}^{7} V(x,y) \cos\frac{(2x+1)i\pi}{16} \cos\frac{(2y+1)j\pi}{16}$

where $C(i), C(j) = 1/\sqrt{2}$ for $i, j = 0$, and $C(i), C(j) = 1$ otherwise.
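A direct, unoptimised Python implementation of this formula (illustrative; practical coders use fast factorisations instead of the quadruple loop):

    import math

    def dct_2d(block):
        """8x8 2-D DCT computed straight from the formula above."""
        C = lambda k: 1 / math.sqrt(2) if k == 0 else 1.0
        out = [[0.0] * 8 for _ in range(8)]
        for i in range(8):
            for j in range(8):
                s = sum(block[x][y]
                        * math.cos((2 * x + 1) * i * math.pi / 16)
                        * math.cos((2 * y + 1) * j * math.pi / 16)
                        for x in range(8) for y in range(8))
                out[i][j] = 0.25 * C(i) * C(j) * s
        return out

    # A flat block puts all its energy into the DC coefficient U(0,0):
    coeffs = dct_2d([[128] * 8 for _ in range(8)])
    print(round(coeffs[0][0], 1))   # 1024.0; every AC coefficient is ~0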


DCT vs. Wavelet: Which is Better?

A 3 dB improvement?
- Wavelet compression was claimed to give a 3 dB improvement over DCT-based compression
- but the comparison was done against baseline JPEG

The improvement is not all due to the transforms:
- the main contribution comes from better rate allocation, advanced entropy coding, and smarter redundancy reduction via zero-trees
- a DCT coder can be improved to decrease the gap
