Compression
Need for Image & Video Compression
Uncompressed video
640 x 480 resolution, 8-bit (1 byte) colour, 24 fps
307.2 Kbytes per image (frame)
Given a 100 Gigabyte disk (100 x 10^9 bytes), can store about 1-4 hours of high quality video
MPEG-1 compresses video to 187 Kbytes/second
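The storage figures above follow from simple arithmetic. The sketch below reproduces them, assuming 1 Kbyte = 1000 bytes and the 8-bit, 24 fps case from the slide; the variable names are illustrative only.

```python
# Storage arithmetic for uncompressed 640 x 480, 8-bit, 24 fps video.
WIDTH, HEIGHT = 640, 480
BYTES_PER_PIXEL = 1             # 8-bit colour
FPS = 24
DISK_BYTES = 100 * 10**9        # 100 Gigabyte disk
MPEG1_BYTES_PER_SEC = 187_000   # MPEG-1 rate quoted on the slide

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
raw_bytes_per_sec = frame_bytes * FPS
hours_on_disk = DISK_BYTES / raw_bytes_per_sec / 3600
compression_ratio = raw_bytes_per_sec / MPEG1_BYTES_PER_SEC

print(f"frame size:       {frame_bytes / 1000:.1f} Kbytes")
print(f"raw data rate:    {raw_bytes_per_sec / 1e6:.2f} Mbytes/second")
print(f"hours on disk:    {hours_on_disk:.2f}")
print(f"raw / MPEG-1:     {compression_ratio:.1f} : 1")
```

With these inputs the disk holds a little under 4 hours of raw video; 24-bit colour would cut that to roughly a third.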
Compression
Based on Information Theory as described by
Claude Shannon in 1948
Remove redundancy
Compression Theory
(lossy/lossless)
[Figure: original data passed through lossy and lossless codecs; file sizes shown: 1.37 MB and 85 KB]
[Figure: file sizes shown: 1.37 MB and 16 KB]
Compression Theory (Entropy)
[Figure: a sample data stream with its ENTROPY and Redundancy portions labelled]
Video Compression and Formats
H.261/H.263
Cinepak
Sorenson
Indeo
RealVideo
MPEG-1, MPEG-2, MPEG-4, etc.
QuickTime, AVI
Special Coding Requirements
Viewing real-time source information
End-to-end delay (EED) should not exceed 150-200 ms
Face-to-face applications need an EED of 50 ms (including
compression and decompression)
Interactive viewing - random access
Random access to single images and audio frames, access
time should be less than 0.5 sec
Decompression of images, video, audio should not be linked
to other data units to support random access
Compression Steps
Uncompressed Picture -> Picture Preparation -> Picture Processing -> Quantization -> Entropy Coding -> Compressed Picture
(with an Adaptive Feedback Loop)
Picture Preparation
Analog-to-digital conversion
Generate appropriate digital representation
Divide picture into blocks (usually 8x8 pixels)
Fix the number of bits per pixel
Picture Processing
Transform to frequency domain
E.g., use the Discrete Cosine Transform (DCT)
Quantization and Coding
Map real numbers to integers
E.g., the DC and AC coefficients from the DCT are
real numbers, but we want to store them as integers
Entropy coding
Compress a sequential bit stream without loss
Types of Compression
Symmetric compression
Requires same time for encoding and decoding
Used for dialog mode applications (teleconference)
Asymmetric compression
Performed once when enough time is available
Two Pass Encoding
Used for retrieval mode applications (e.g., an
interactive CD-ROM)
Broad Classification of
Compression Techniques
Entropy Coding
lossless encoding
used regardless of media’s specific characteristics
data taken as a simple digital sequence
decompression process regenerates data completely
e.g. Run-length coding, Huffman coding, LZW, Arithmetic coding
Source Coding
lossy encoding
May take into account the semantics of the data
degree of compression depends on data content
E.g. DPCM, ADPCM
Hybrid Coding (used by most multimedia systems)
combine entropy with source encoding
E.g. JPEG, H.263, MPEG-1, MPEG-2, MPEG-4
Compression Techniques
Statistical techniques (Entropy)
Predictive techniques (Source)
Transform techniques (Hybrid)
Statistical Techniques
Entropy (H) refers to how much variability is in data
Low/high entropy means low/high variability
Zero-order entropy model: $H = \log_2 M$
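The zero-order model can be contrasted with Shannon's frequency-based entropy. This is a minimal sketch, assuming M is the number of distinct symbols in the data (the slide does not define M); the function names are illustrative.

```python
import math
from collections import Counter

def zero_order_entropy(data):
    """H = log2(M) bits per symbol, assuming M = number of distinct symbols."""
    return math.log2(len(set(data)))

def shannon_entropy(data):
    """H = -sum(p * log2(p)) over the observed symbol frequencies."""
    total = len(data)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())

low = "aaaaaaab"    # low variability
high = "abcdefgh"   # high variability
print(shannon_entropy(low), shannon_entropy(high))
print(zero_order_entropy(low), zero_order_entropy(high))
```

The frequency-based figure is much lower for the first string than for the second, which matches the low/high variability reading of entropy.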
Huffman Coding
Source Data
ABABCCDBBABBAAABBAACCDDEEAAC...
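A Huffman code for this kind of source can be built by repeatedly merging the two least frequent nodes. The sketch below is illustrative rather than the slide's own construction: it uses only the visible part of the source data (the elided remainder is left out), and `huffman_codes` is a hypothetical helper name.

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Build a prefix code: frequent symbols get shorter bit strings."""
    freq = Counter(data)
    # Heap entries are (count, tie_breaker, node); a node is a symbol or a pair.
    heap = [(count, i, sym) for i, (sym, count) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, left = heapq.heappop(heap)    # two least frequent nodes
        c2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (c1 + c2, next_id, (left, right)))
        next_id += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix
    walk(heap[0][2], "")
    return codes

source = "ABABCCDBBABBAAABBAACCDDEEAAC"   # visible part of the slide's data
codes = huffman_codes(source)
encoded = "".join(codes[s] for s in source)
print(codes)
print(len(encoded), "bits, versus", 3 * len(source), "bits at 3 bits per symbol")
```

The exact bit patterns depend on how ties are broken, but the total encoded length does not.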
ADPCM
Similar to DPCM, but use a variable number of bits to
transmit differences
E.g., use 8 bits per pixel and use 1-5 bits for each difference
Transform Techniques
Convert data to an alternate form that better
supports specific operations
Typically operates on blocks of data
Larger blocks give better results, but require more
computational overhead
Discrete Cosine Transform
Converts an image from spatial to frequency domain
Operates on 8 x 8 blocks (64 pixels) of data
DC coefficient represents zero spatial frequency, which is
proportional to the average value of all the pixels in the 8 x 8 block
AC coefficients represent amplitudes of progressively higher
horizontal and vertical spatial frequency components
Run Length Encoding (RLE)
Form of entropy coding (lossless)
Content dependent coding scheme
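A minimal run-length sketch, assuming the simplest possible scheme of (count, symbol) pairs. This is not necessarily the implementation used on the following slides; the function names are illustrative.

```python
from itertools import groupby

def rle_encode(data):
    """Collapse each run of identical symbols into a (count, symbol) pair."""
    return [(len(list(run)), symbol) for symbol, run in groupby(data)]

def rle_decode(pairs):
    """Expand (count, symbol) pairs back into the original string."""
    return "".join(symbol * count for count, symbol in pairs)

pairs = rle_encode("aaabccdddd")
print(pairs)
print(rle_decode(pairs))
```

The scheme is content dependent: long runs compress well, while data without runs grows, because every symbol then carries its own count.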
Example Implementation
Use 8 bits (1 byte) data elements
Unsigned [0, 255]; signed [-127, 127]
Exercise
Assume same sequence from before
abbbbbbbccddddeeddd
Given the previous implementation, what would
be the transmitted code?
Differential Encoding (Source)
Consider a sequence of values S1, S2, S3, etc.
that differ in value, but not dramatically
Encode differences from a specific value
E.g., S1, S2-S1, S3-S2, etc.
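The S1, S2-S1, S3-S2 scheme can be sketched in a few lines. The sample values and function names below are illustrative assumptions.

```python
from itertools import accumulate

def diff_encode(samples):
    """Return [S1, S2-S1, S3-S2, ...]."""
    if not samples:
        return []
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

def diff_decode(deltas):
    """Rebuild the samples with a running sum of the differences."""
    return list(accumulate(deltas))

samples = [100, 102, 103, 103, 101]
deltas = diff_encode(samples)
print(deltas)
print(diff_decode(deltas))
```

When neighbouring values differ only slightly, the differences stay close to zero and can be stored in fewer bits than the raw samples.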
Differential Encoding Example
[Figure: differential encoding example; the surviving entries are all 0]
JPEG Compression Steps
Prepare the image for compression
Transform color space
Down-sample components
Interleave the color planes
Partition into smaller blocks
JPEG Algorithm
Source Image (R, G, B) -> 8x8 blocks -> FDCT -> Quantizer -> Entropy Encoder -> Compressed image data
DCT-based encoding: the Quantizer and the Entropy Encoder each take a Table as input
Color Transformation
May want to transform RGB to YUV or YCbCr
This is an optional step, so why do it?
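The RGB to YCbCr conversion itself is a small linear transform. The sketch below assumes the full-range coefficients used by JFIF files (the slide does not say which variant it has in mind), and `rgb_to_ycbcr` is an illustrative name.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr (JFIF-style coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clip each component to the 8-bit range [0, 255].
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))

print(rgb_to_ycbcr(255, 255, 255))   # white
print(rgb_to_ycbcr(0, 0, 0))         # black
print(rgb_to_ycbcr(255, 0, 0))       # pure red
```

Grey pixels map to Cb = Cr = 128, so all brightness ends up in Y and the two colour components can be treated separately.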
Color sub-sampling
[Figures: 4:4:4 sampling compared with 4:2:2, 4:1:1 and 4:2:0 sub-sampling]
How does colour sub-sampling aid
compression?
100 x 100 pixels frame requires:
100 x 100 Y Pixels
100 x 100 Cr Pixels
100 x 100 Cb Pixels
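The counts listed above are the 4:4:4 case. This sketch compares the total number of samples under each scheme, assuming the usual reading of the ratios (4:2:2 halves chroma horizontally, 4:1:1 quarters it horizontally, 4:2:0 halves it in both directions); the names are illustrative.

```python
import math

# (horizontal, vertical) chroma reduction factors for each scheme.
CHROMA_FACTORS = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:1:1": (4, 1),
    "4:2:0": (2, 2),
}

def total_samples(width, height, scheme):
    """Y samples plus Cb and Cr samples for one frame."""
    h, v = CHROMA_FACTORS[scheme]
    luma = width * height
    chroma = math.ceil(width / h) * math.ceil(height / v)
    return luma + 2 * chroma

for scheme in CHROMA_FACTORS:
    print(scheme, total_samples(100, 100, scheme))
```

For the 100 x 100 frame, 4:2:2 needs two thirds of the 4:4:4 samples, while 4:1:1 and 4:2:0 need half, before any further compression is applied.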
Blocks and Pixel Shift
Divide each component into 8x8 blocks
Send to the FDCT
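A sketch of the block split, reading "pixel shift" as the JPEG level shift that subtracts 128 from each 8-bit sample before the FDCT. That reading is an assumption, as is the simplification that the plane dimensions are multiples of 8; the function name is illustrative.

```python
def level_shifted_blocks(plane, size=8, shift=128):
    """Split a 2D plane (list of rows) into size x size blocks, row by row,
    subtracting `shift` so samples are centred on zero."""
    height, width = len(plane), len(plane[0])
    blocks = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            blocks.append([[plane[y][x] - shift
                            for x in range(left, left + size)]
                           for y in range(top, top + size)])
    return blocks

# A 16-wide, 8-high plane yields two 8x8 blocks.
plane = [[x * 16 for x in range(16)] for _ in range(8)]
blocks = level_shifted_blocks(plane)
print(len(blocks), blocks[0][0], blocks[1][0])
```

Each block is then handed to the FDCT independently of its neighbours.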
Forward DCT
Convert from spatial to frequency domain
Convert intensity function into weighted sum of
elementary frequency components
Identify pieces of spectral information that can be
thrown away without noticeable loss of quality
Intensity values in each color plane often
change slowly (see next example)
Contributions from higher frequency bands in the
frequency domain can often be ignored
Better compression with little visible loss of quality
Equations for 2D DCT
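Forward DCT:
$$F(u,v) = \frac{2}{\sqrt{nm}}\, C(u)\, C(v) \sum_{y=0}^{m-1} \sum_{x=0}^{n-1} I(x,y) \cos\left(\frac{(2x+1)u\pi}{2n}\right) \cos\left(\frac{(2y+1)v\pi}{2m}\right)$$
Inverse DCT:
$$I(x,y) = \frac{2}{\sqrt{nm}} \sum_{v=0}^{m-1} \sum_{u=0}^{n-1} F(u,v)\, C(u)\, C(v) \cos\left(\frac{(2x+1)u\pi}{2n}\right) \cos\left(\frac{(2y+1)v\pi}{2m}\right)$$
where $C(k) = 1/\sqrt{2}$ for $k = 0$ and $C(k) = 1$ otherwise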
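A direct, unoptimised transcription of the 2D DCT pair, intended only to make the indices concrete. It assumes the normalisation 2/sqrt(nm) with C(0) = 1/sqrt(2) and C(k) = 1 otherwise, and stores the block as rows indexed by y and columns by x; real codecs use fast factorised versions.

```python
import math

def c(k):
    """Normalisation factor C(k): 1/sqrt(2) for k = 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def basis(pos, freq, size):
    """Cosine basis term cos((2*pos + 1) * freq * pi / (2*size))."""
    return math.cos((2 * pos + 1) * freq * math.pi / (2 * size))

def forward_dct(block):
    """block[y][x] -> coeffs[v][u]."""
    m, n = len(block), len(block[0])
    scale = 2 / math.sqrt(n * m)
    return [[scale * c(u) * c(v) *
             sum(block[y][x] * basis(x, u, n) * basis(y, v, m)
                 for y in range(m) for x in range(n))
             for u in range(n)] for v in range(m)]

def inverse_dct(coeffs):
    """coeffs[v][u] -> block[y][x]."""
    m, n = len(coeffs), len(coeffs[0])
    scale = 2 / math.sqrt(n * m)
    return [[scale *
             sum(c(u) * c(v) * coeffs[v][u] * basis(x, u, n) * basis(y, v, m)
                 for v in range(m) for u in range(n))
             for x in range(n)] for y in range(m)]

flat = [[10] * 8 for _ in range(8)]
coeffs = forward_dct(flat)
print(round(coeffs[0][0], 6))   # DC term of a constant block
```

For a constant 8 x 8 block only the DC coefficient is non-zero, and with this normalisation it equals 8 times the pixel value.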
Visualization of Basis Functions
[Figure: the DCT basis functions, with frequency increasing along both the horizontal and the vertical axis]
Quantization
Divide each coefficient in a block by an integer in the
range [1, 255]
Comes in the form of a table, same size as a block
Divide the block of coefficients element-wise by the table, and then
round the result to the nearest integer
In the decoding process, multiply the quantized
coefficients by the corresponding table entries
Get back a number close to, but not necessarily the same as, the original
Error is never more than half of the quantization number
Larger numbers in quantization table cause more loss
This is the main source of loss in JPEG
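The quantize and dequantize steps can be sketched on a tiny 2 x 2 example. The coefficient values and table entries here are made up for illustration, and the function names are not from any particular library.

```python
def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer.
    Python's round() sends exact halves to the even neighbour, which
    still keeps the error within half a step."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]

def dequantize(levels, table):
    """Multiply each quantized level by its table entry."""
    return [[level * q for level, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]

coeffs = [[100.0, -37.5], [12.0, 3.0]]
table = [[16, 11], [12, 14]]

levels = quantize(coeffs, table)
restored = dequantize(levels, table)
print(levels)
print(restored)
```

Each restored value differs from the original by at most half of its table entry, and the small coefficient 3.0 is quantized to 0, which is where the loss (and the later run of zeros) comes from.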
De facto Quantization Table
(Entries are listed in zig-zag scan order, from low to high frequency)
16 11 12 14 12 10 16 14
13 14 18 17 16 19 24 40
26 24 22 22 24 49 35 37
29 40 58 51 61 60 57 51
56 55 64 72 92 78 64 68
87 69 55 56 80 109 81 87
95 98 103 104 103 62 77 113
121 112 100 120 92 101 103 99
Eye becomes less sensitive toward the later (higher-frequency) entries
DC Encoding
DC represents average intensity of a block
Because image intensity tends to change slowly,
DC values tend to change slowly
Encode using difference encoding scheme
Use 3x3 pattern of blocks
Because the difference tends to be near zero, can
use fewer bits in the encoding
Categorize difference into difference classes
Send the index of the difference class, followed by
the bits representing the difference
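A sketch of the difference-class idea, assuming the baseline JPEG convention: each DC value is predicted from the previous block's DC, the class is the number of bits needed for the magnitude of the difference, and negative differences are sent as the low bits of (difference - 1). The function names are illustrative.

```python
def dc_differences(dc_values):
    """Difference of each DC value from the previous one (first predictor is 0)."""
    diffs, previous = [], 0
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs

def encode_dc_difference(diff):
    """Return (difference class, extra bits as a string)."""
    size = abs(diff).bit_length()        # class 0 means "no change"
    if size == 0:
        return 0, ""
    # Negative values are offset so they fit in `size` unsigned bits.
    value = diff if diff >= 0 else diff + (1 << size) - 1
    return size, format(value, f"0{size}b")

for diff in dc_differences([52, 55, 54, 54, 49]):
    print(diff, encode_dc_difference(diff))
```

In a full encoder the class index would itself be Huffman coded, with the extra bits appended unchanged.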
AC Encoding
Use a zig-zag ordering of coefficients, why?
Orders frequency components from low->high
Should produce maximal series of 0s at the end
Lends itself well to RLE
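The zig-zag scan and the zero-run coding it enables can be sketched as follows. This is a simplified illustration: it emits (zero run, value) pairs and an "EOB" marker for the trailing zeros, and leaves out details of the real JPEG format such as the special code for runs of 16 zeros. The names are illustrative.

```python
def zigzag_indices(n=8):
    """(row, col) positions of an n x n block in zig-zag order."""
    def key(rc):
        r, c = rc
        diagonal = r + c
        # Even diagonals run bottom-left to top-right, odd ones the other way.
        return (diagonal, r if diagonal % 2 else c)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def zigzag_scan(block):
    """Read a square block in zig-zag order, low to high frequency."""
    return [block[r][c] for r, c in zigzag_indices(len(block))]

def rle_ac(ac_coeffs):
    """Encode AC coefficients as (zeros skipped, value) pairs plus "EOB"."""
    out, run = [], 0
    for value in ac_coeffs:
        if value == 0:
            run += 1
        else:
            out.append((run, value))
            run = 0
    if run:
        out.append("EOB")   # everything left is zero
    return out

block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0] = 26, -3, 2
scanned = zigzag_scan(block)
print(rle_ac(scanned[1:]))   # skip the DC coefficient
```

Because quantization zeroes most high-frequency coefficients, nearly the whole block collapses into the single end-of-block marker.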
Huffman Encoding
String together the RLE of the AC coefficients
along with the DC difference indices and values
Apply Huffman encoding to resulting sequence
Attach appropriate headers
Finally have the JPEG image!