
Introduction to Image and Video Compression
Need for Image & Video Compression
• Uncompressed video at 640 x 480 resolution, 8-bit (1 byte) colour, 24 fps
  – 307.2 Kbytes per image (frame)
  – 7.37 Mbytes per second
  – 442 Mbytes per minute
  – 26.5 Gbytes per hour
• Uncompressed video at 640 x 480 resolution, 24-bit (3 bytes) colour, 30 fps
  – 921.6 Kbytes per image (frame)
  – 27.6 Mbytes per second
  – 1.66 Gbytes per minute
  – 99.5 Gbytes per hour
• A 100 Gigabyte disk (100 x 10^9 bytes) can therefore store only about 1-4 hours of such high-quality video
• MPEG-1 compresses video to 187 Kbytes/second
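The figures above follow from simple multiplication. A minimal sketch of that arithmetic, assuming decimal units (1 Kbyte = 1000 bytes) as the slide does; the function names are illustrative only:

```python
def frame_bytes(width, height, bytes_per_pixel):
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


def video_bytes(width, height, bytes_per_pixel, fps, seconds):
    """Size of `seconds` of uncompressed video in bytes."""
    return frame_bytes(width, height, bytes_per_pixel) * fps * seconds


# 640 x 480, 3 bytes per pixel, 30 fps
print(frame_bytes(640, 480, 3) / 1e3, "Kbytes per frame")
print(video_bytes(640, 480, 3, 30, 1) / 1e6, "Mbytes per second")
print(video_bytes(640, 480, 3, 30, 3600) / 1e9, "Gbytes per hour")
```

Dividing the 100 x 10^9 byte disk by the per-hour figures gives roughly 1 hour for the 24-bit case and just under 4 hours for the 8-bit case.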

Need for Image & Video Compression
• Raw video contains an immense amount of data
• Communication and storage capabilities are limited and expensive
• Example: HDTV video signal
  – 720 x 1280 pixels/frame, progressive scanning at 60 frames/s: about 1.3 Gb/s raw, assuming 24 bits/pixel
  – 20 Mb/s HDTV channel bandwidth
  → Requires compression by a factor of about 70 (equivalent to about 0.35 bits/pixel)

Compression
• Based on information theory as described by Claude Shannon in 1948
• Lossless and lossy methods
• Efficient binary representation of information
• Remove redundancy

Compression Theory (lossy/lossless)
[Figure: the original data 5 4 2 passes through three codecs. The lossless codec decodes to 5 4 2 exactly. A lossy codec with mild error decodes to 4.9 3.9 1.9. A lossy codec with high error decodes to 3.1 2.1 0.]

Compression Theory
[Figure: original 800 x 600 picture (1.37 MB) next to the decoded picture with low losses (85 KB)]

Compression Theory
[Figure: original 800 x 600 picture (1.37 MB) next to the decoded picture with high losses (16 KB)]

Compression Theory (Entropy)
[Figure: a stream of data values, part of which is entropy (information) and the rest redundancy]

Compression Theory (Entropy)
[Figure: with optimal lossless compression the full entropy is preserved; with lossy compression only part of the entropy is preserved]


Image Compression and Formats
• RLE
• Huffman
• LZW
• GIF
• JPEG
• Fractals
• TIFF, PICT, BMP, etc.

Video Compression and Formats
• H.261/H.263
• Cinepak
• Sorenson
• Indeo
• Real Video
• MPEG-1, MPEG-2, MPEG-4, etc.
• QuickTime, AVI

Special Coding Requirements
• Viewing real-time source information
  – End-to-end delay (EED) should not exceed 150-200 ms
  – Face-to-face applications need an EED of 50 ms (including compression and decompression)
• Interactive viewing with random access
  – Random access to single images and audio frames; access time should be less than 0.5 s
  – Decompression of images, video, and audio should not depend on other data units, so that random access is possible

Compression Steps
[Figure: Uncompressed Picture → Picture Preparation → Picture Processing → Quantization → Entropy Coding → Compressed Picture, with an adaptive feedback loop]

Picture Preparation
• Analog-to-digital conversion
• Generate appropriate digital representation
• Divide picture into blocks (usually 8x8)
• Fix the number of bits per pixel

Picture Processing
• Transform to frequency domain
  – E.g., use the Discrete Cosine Transform (DCT)
• Compute motion vectors for each block

Quantization and Coding
• Map real numbers to integers
  – E.g., the DC and AC coefficients from the DCT are real numbers, but we want to store only integers
• Entropy coding
  – Compress a sequential bit stream without loss

Types of Compression
• Symmetric compression
  – Requires the same time for encoding and decoding
  – Used for dialog mode applications (e.g., teleconferencing)
• Asymmetric compression
  – Encoding is performed once, when enough time is available
  – Two-pass encoding
  – Used for retrieval mode applications (e.g., an interactive CD-ROM)

Broad Classification of Compression Techniques
• Entropy coding
  – Lossless encoding
  – Used regardless of the media's specific characteristics
  – Data taken as a simple digital sequence
  – Decompression process regenerates data completely
  – E.g., run-length coding, Huffman coding, LZW, arithmetic coding
• Source coding
  – Lossy encoding
  – May take into account the semantics of the data
  – Degree of compression depends on data content
  – E.g., DPCM, ADPCM
• Hybrid coding (used by most multimedia systems)
  – Combines entropy coding with source coding
  – E.g., JPEG, H.263, MPEG-1, MPEG-2, MPEG-4

Compression Techniques
• Statistical techniques (Entropy)
• Predictive techniques (Source)
• Transform techniques (Hybrid)

Statistical Techniques
• Entropy (H) refers to how much variability is in the data
  – Low/high entropy means low/high variability
  – Zero-order entropy model, for M distinct symbols: \( H = \log_2 M \)
  – First-order entropy model, with symbol probabilities P(i): \( H = -\sum_{i=1}^{M} P(i) \log_2 P(i) \)
• Huffman encoding, for example
  – Use statistical profile to determine encoding scheme
  – Use fewer bits to encode more frequent data and more bits to encode less frequent data
  – Must transmit a codebook
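The two entropy models can be evaluated directly. A minimal sketch, assuming the symbol probabilities P(i) are estimated from the symbol counts of the data itself; the function names are illustrative:

```python
import math
from collections import Counter


def zero_order_entropy(num_symbols):
    """H = log2(M): every one of the M symbols is treated as equally likely."""
    return math.log2(num_symbols)


def first_order_entropy(data):
    """H = -sum(P(i) * log2(P(i))), with P(i) estimated from symbol counts."""
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


print(zero_order_entropy(4))          # bits per symbol for 4 equally likely symbols
print(first_order_entropy("aaab"))    # lower, because 'a' dominates
```

The more skewed the symbol distribution, the further the first-order figure falls below the zero-order one, which is the gap a statistical coder such as Huffman exploits.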

Huffman Coding
Source data: ABABCCDBBABBAAABBAACCDDEEAAC...

Huffman table:
Symbol  Count  Code
A       20     0
B       8      100
C       6      101
D       6      110
E       4      111

• Original size: 44 bytes / 352 bits
• Encoded size: 92 bits
• Compression ratio: 3.8
• ≅ 2 bits/symbol
Compressed data (encoder output): 010001001011011101001001100100000100100…
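The table above can be reproduced with the usual greedy construction: repeatedly merge the two least frequent entries. The following is a sketch using Python's `heapq`, not necessarily the procedure used to build the slide; the exact bit patterns depend on tie-breaking and may differ from the table, but the code lengths and the total size do not.

```python
import heapq
from itertools import count


def huffman_code(freqs):
    """Return {symbol: bitstring} for a {symbol: count} table (2+ symbols)."""
    tiebreak = count()  # unique sequence number so equal counts never compare dicts
    heap = [(n, next(tiebreak), {sym: ""}) for sym, n in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n0, _, codes0 = heapq.heappop(heap)   # least frequent subtree
        n1, _, codes1 = heapq.heappop(heap)   # second least frequent subtree
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (n0 + n1, next(tiebreak), merged))
    return heap[0][2]


counts = {"A": 20, "B": 8, "C": 6, "D": 6, "E": 4}
code = huffman_code(counts)
total_bits = sum(counts[s] * len(code[s]) for s in counts)
print(code, total_bits)
```

With these counts, A receives a 1-bit code and the other four symbols 3-bit codes, giving 20 x 1 + 24 x 3 = 92 bits.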

Predictive Techniques
• DPCM
  – Compare adjacent pixels and transmit only the difference between them
  – E.g., use 8 bits per pixel and 4 bits for each difference
• ADPCM
  – Similar to DPCM, but uses a variable number of bits to transmit differences
  – E.g., use 8 bits per pixel and 1-5 bits for each difference
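A minimal DPCM sketch for the "8 bits per pixel, 4 bits per difference" case. It assumes the first sample is sent in full and that differences outside the signed 4-bit range [-8, 7] are simply clamped; both choices are assumptions for illustration, and the function names are hypothetical.

```python
def dpcm_encode(samples, bits=4):
    """Return (first_sample, clamped differences)."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    first = samples[0]
    recon = first                # what the decoder will have reconstructed so far
    diffs = []
    for s in samples[1:]:
        d = max(lo, min(hi, s - recon))   # clamp to the available bits
        diffs.append(d)
        recon += d               # track the decoder state, not the true sample
    return first, diffs


def dpcm_decode(first, diffs):
    out = [first]
    for d in diffs:
        out.append(out[-1] + d)
    return out


first, diffs = dpcm_encode([100, 102, 105, 104, 120, 121])
print(diffs)                      # [2, 3, -1, 7, 7]
print(dpcm_decode(first, diffs))  # [100, 102, 105, 104, 111, 118]
```

The jump from 104 to 120 does not fit in 4 bits, so the reconstruction lags behind for a few samples. An adaptive scheme (ADPCM) spends more bits on such steps and fewer on smooth regions.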

Transform Techniques
• Convert data to an alternate form that better supports specific operations
  – Typically operates on blocks of data
  – Larger blocks give better results, but require more computational overhead
• Discrete Cosine Transform
  – Converts an image from the spatial to the frequency domain
  – Operates on 8 x 8 blocks (64 pixels) of data
  – DC coefficient represents zero spatial frequency, which is proportional to the average value of all the pixels in the 8 x 8 block
  – AC coefficients represent amplitudes of progressively higher horizontal and vertical spatial frequency components

Run Length Encoding (RLE)
• Form of entropy coding (lossless)
• Content-dependent coding scheme
• A series of repeated values is replaced by a single value and a count
• Example:
  – The sequence abbbbbbbccddddeeddd would be replaced by 1a7b2c4d2e3d
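The count-and-value form in the example can be produced in a few lines. A minimal sketch using `itertools.groupby`; the function name is illustrative:

```python
from itertools import groupby


def rle_text(s):
    """Replace each run of identical characters by '<count><char>'."""
    return "".join(f"{len(list(group))}{ch}" for ch, group in groupby(s))


print(rle_text("abbbbbbbccddddeeddd"))  # 1a7b2c4d2e3d
```

Note that isolated values such as the leading "a" grow from one character to two, which is why practical schemes treat non-repeating values specially.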

Example Implementation
• Use 8-bit (1 byte) data elements
  – Unsigned [0, 255]; signed [-128, 127]
• Encode repeated values using two bytes
  – First byte provides a count N in the range [-127, -1]
  – The repetition count is -N + 1
  – Second byte contains the data value to repeat
• A run cannot be longer than 128 values
  – Longer runs must be broken into multiple runs
• A non-repeating data value has a non-negative first byte, which is the data value itself

Exercise
• Assume the same sequence as before: abbbbbbbccddddeeddd
• Given the previous implementation, what would be the transmitted code?
• Answer: a -6b -1c -3d -1e -2d
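The byte-oriented scheme and the exercise answer can be checked with a short sketch. It assumes data values are integers in [0, 127] (for example ASCII codes), so that any negative entry in the output is a count byte; the function names are illustrative.

```python
from itertools import groupby

MAX_RUN = 128  # count byte N = -127 encodes -N + 1 = 128 repeats


def rle_encode(values):
    """Encode ints in [0, 127]; negative output entries are count bytes."""
    out = []
    for value, group in groupby(values):
        run = len(list(group))
        while run > 0:
            chunk = min(run, MAX_RUN)       # break long runs into several
            if chunk == 1:
                out.append(value)                  # literal: the value itself
            else:
                out.extend([-(chunk - 1), value])  # count byte N, then value
            run -= chunk
    return out


def rle_decode(tokens):
    out, i = [], 0
    while i < len(tokens):
        if tokens[i] < 0:                   # count byte followed by a value
            out.extend([tokens[i + 1]] * (-tokens[i] + 1))
            i += 2
        else:                               # literal value
            out.append(tokens[i])
            i += 1
    return out


data = [ord(c) for c in "abbbbbbbccddddeeddd"]
encoded = rle_encode(data)
print(encoded)  # [97, -6, 98, -1, 99, -3, 100, -1, 101, -2, 100]
```

Reading 97..101 as the letters a..e, the output matches the answer above: a -6b -1c -3d -1e -2d.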

Differential Encoding (Source)
• Consider a sequence of values S1, S2, S3, etc. that differ in value, but not dramatically
• Encode each value as its difference from a reference value, such as the previous value
  – E.g., S1, S2-S1, S3-S2, etc.
• E.g., still image
  – Calculate the difference between nearby pixels
  – Areas of rapid color change are characterized by large values, other areas by small values
  – After differential encoding, apply RLE

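Differential Encoding Example
Image (4 rows x 5 columns):
  0    0    0    0    0
  0  255  250  253  251
  0  255  251  254  255
  0    0    0    0    0

DPCM (differences taken along each row; the first pixel of each row is sent as is): 0, 0, 0, 0, 0, 0, 255, -5, 3, -2, 0, 255, -4, 3, 1, …
RLE: 6(0), 255, -5, 3, -2, 0, 255, -4, 3, 1, …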
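The example can be reproduced with a short sketch. It assumes, as the listed values imply, that differences are taken within each row and that the first pixel of every row is sent unchanged; the function names are illustrative.

```python
from itertools import groupby


def row_differences(image):
    """First pixel of each row as is, then differences to the left neighbour."""
    out = []
    for row in image:
        out.append(row[0])
        out.extend(b - a for a, b in zip(row, row[1:]))
    return out


def run_lengths(values):
    """Collapse runs into (count, value) pairs."""
    return [(len(list(group)), v) for v, group in groupby(values)]


image = [
    [0, 0, 0, 0, 0],
    [0, 255, 250, 253, 251],
    [0, 255, 251, 254, 255],
    [0, 0, 0, 0, 0],
]
diffs = row_differences(image)
print(diffs)
print(run_lengths(diffs))
```

The first run-length pair is (6, 0), matching the 6(0) in the slide, and the final row collapses into a single run of five zeros.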

Compression Steps
[Figure: Uncompressed Picture → Picture Preparation → Picture Processing → Quantization → Entropy Coding → Compressed Picture, with an adaptive feedback loop]

JPEG Compression Steps
• Prepare the image for compression
  – Transform color space
  – Down-sample components
  – Interleave the color planes
  – Partition into smaller blocks
• Apply the Discrete Cosine Transform
• Quantize the DC and AC coefficients
• Apply entropy encoding to quantized numbers

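JPEG Algorithm
[Figure: DCT-based encoding. The source image (R, G, B planes) is split into 8x8 blocks, which pass through the FDCT, the quantizer (with its table), and the entropy encoder (with its table) to produce the compressed image data]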

Color Transformation
• May want to transform RGB to YUV or YCbCr
  – This is an optional step, so why do it?
• Human visual system detects changes better in luminance than in chrominance components
  – YUV or YCbCr better supports this than RGB
• Facilitates better compression by enabling
  – Down-sampling of the image in chrominance
  – Elimination of more of the high-frequency changes in the chrominance dimensions
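A sketch of the RGB to YCbCr step. The slide does not give coefficients; the ones below are the full-range constants commonly used with JPEG files (the JFIF convention) and are an assumption here, as is the function name.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr), assuming JFIF-style constants."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                # luminance
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b     # blue-difference chroma
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b     # red-difference chroma
    return y, cb, cr


print(rgb_to_ycbcr(128, 128, 128))  # a grey pixel: chroma sits at about 128
print(rgb_to_ycbcr(255, 0, 0))      # pure red: Cr well above 128, Cb below
```

Any grey pixel maps to Cb and Cr of about 128, so the chrominance planes carry only colour detail, which is what makes them good candidates for down-sampling. A real encoder would also round and clamp the results to [0, 255].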

Down-Sampling Chrominance
• Optional step, but increases compression ratio
  – Only down-sample the chrominance components, never the luminance component
• Average groups of pixels in the horizontal and/or vertical direction of the image
  – Referred to as H2V1, H2V2, 4:2:2, 4:2:0, etc.
  – E.g., H2V2 means average every 2 pixels in both the horizontal and vertical dimensions
  – H1V1 means keep the SAME resolution in the horizontal and vertical dimensions
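H2V2 down-sampling of one chrominance plane can be sketched as a 2x2 average. The sketch assumes the plane has even width and height and is stored as a list of rows; the function name is illustrative.

```python
def downsample_h2v2(plane):
    """Average each 2x2 group of samples (assumes even width and height)."""
    return [
        [
            (plane[r][c] + plane[r][c + 1]
             + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
            for c in range(0, len(plane[0]), 2)
        ]
        for r in range(0, len(plane), 2)
    ]


cb = [
    [100, 102, 110, 110],
    [100, 102, 110, 110],
]
print(downsample_h2v2(cb))  # [[101.0, 110.0]]
```

Only the Cb and Cr planes would be passed through such a function; the Y plane keeps its full resolution.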

Color sub-sampling
[Figure: sub-sampling from 4:4:4 to 4:2:2]

Color sub-sampling
[Figure: sub-sampling from 4:4:4 to 4:1:1]

Color sub-sampling
[Figure: sub-sampling from 4:4:4 to 4:2:0]

How does colour sub-sampling aid compression?
• A 100 x 100 pixel frame requires:
  – 100 x 100 Y pixels
  – 100 x 100 Cr pixels
  – 100 x 100 Cb pixels
• Using 4:2:0 colour sub-sampling:
  – 100 x 100 Y pixels
  – 50 x 50 Cr pixels
  – 50 x 50 Cb pixels
• That is 15,000 samples instead of 30,000: half the data before any further compression

Blocks and Pixel Shift
• Divide each component into 8x8 blocks
  – Send to the FDCT
[Figure: source image divided into 8x8 blocks]
• Shift each pixel from unsigned range [0, 255] into signed range [-128, 127]
  – DCT requires range be centered around 0

Forward DCT
• Convert from spatial to frequency domain
  – Convert intensity function into weighted sum of elementary frequency components
  – Identify pieces of spectral information that can be thrown away without visible loss of quality
• Intensity values in each color plane often change slowly (see next example)
  – Contributions from higher frequency bands in the frequency domain can be ignored
  – Better compression with little visible loss of quality

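Equations for 2D DCT
• Forward DCT:
\[ F(u,v) = \frac{2}{\sqrt{nm}}\, C(u)\, C(v) \sum_{y=0}^{m-1} \sum_{x=0}^{n-1} I(x,y) \cos\!\left(\frac{(2x+1)u\pi}{2n}\right) \cos\!\left(\frac{(2y+1)v\pi}{2m}\right) \]
• Inverse DCT:
\[ I(x,y) = \frac{2}{\sqrt{nm}} \sum_{v=0}^{m-1} \sum_{u=0}^{n-1} F(u,v)\, C(u)\, C(v) \cos\!\left(\frac{(2x+1)u\pi}{2n}\right) \cos\!\left(\frac{(2y+1)v\pi}{2m}\right) \]
• Here \( C(k) = 1/\sqrt{2} \) for k = 0 and \( C(k) = 1 \) otherwise; n is the block width and m the block height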
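The two equations can be evaluated literally, which is slow but makes the roles of the indices easy to follow. A minimal sketch, assuming the usual normalisation 2/sqrt(nm) with C(0) = 1/sqrt(2); the function names are illustrative, and production encoders use fast factorised transforms instead.

```python
import math


def _c(k):
    """Normalisation factor C(k)."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def _cos(pos, freq, size):
    return math.cos((2 * pos + 1) * freq * math.pi / (2 * size))


def dct2(block):
    """Forward 2D DCT of block[y][x]; returns coefficients F[v][u]."""
    m, n = len(block), len(block[0])
    scale = 2 / math.sqrt(n * m)
    return [
        [
            scale * _c(u) * _c(v) * sum(
                block[y][x] * _cos(x, u, n) * _cos(y, v, m)
                for y in range(m) for x in range(n)
            )
            for u in range(n)
        ]
        for v in range(m)
    ]


def idct2(coeffs):
    """Inverse 2D DCT of coeffs[v][u]; returns samples I[y][x]."""
    m, n = len(coeffs), len(coeffs[0])
    scale = 2 / math.sqrt(n * m)
    return [
        [
            scale * sum(
                coeffs[v][u] * _c(u) * _c(v) * _cos(x, u, n) * _cos(y, v, m)
                for v in range(m) for u in range(n)
            )
            for x in range(n)
        ]
        for y in range(m)
    ]


flat = [[10] * 8 for _ in range(8)]
print(round(dct2(flat)[0][0], 6))  # 80.0: only the DC coefficient is non-zero
```

For a constant 8 x 8 block all AC coefficients vanish and the DC coefficient is 8 times the pixel value, which illustrates why smooth image regions compress well.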

Visualization of Basis Functions
[Figure: the grid of DCT basis functions, with frequency increasing along both the horizontal and the vertical axis]

Quantization
• Divide each coefficient in a block by an integer in the range [1, 255]
  – The divisors come in the form of a table, the same size as a block
  – Divide the block of coefficients element by element by the table, then round each result to the nearest integer
• In the decoding process, multiply the quantized coefficients by the corresponding table entries
  – Get back a number close to, but not necessarily the same as, the original
  – Error is at most half of the quantization number
• Larger numbers in the quantization table cause more loss
  – This is the main source of loss in JPEG
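Quantization and its inverse can be sketched on a small block. The 2 x 2 coefficient block and table below are made-up values for illustration (real JPEG blocks are 8 x 8), and the function names are hypothetical.

```python
def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [
        [round(c / q) for c, q in zip(crow, qrow)]
        for crow, qrow in zip(coeffs, table)
    ]


def dequantize(levels, table):
    """Multiply each quantized level by its table entry."""
    return [
        [level * q for level, q in zip(lrow, qrow)]
        for lrow, qrow in zip(levels, table)
    ]


coeffs = [[80.0, -23.4], [7.9, 3.0]]   # illustrative DCT coefficients
table = [[16, 11], [12, 14]]           # illustrative quantization table
levels = quantize(coeffs, table)
restored = dequantize(levels, table)
print(levels)    # [[5, -2], [1, 0]]
print(restored)  # [[80, -22], [12, 0]]
```

Each restored value differs from the original by no more than half of its table entry, and the small coefficient 3.0 is quantized to zero, which is what later makes run-length coding effective.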

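De facto Quantization Table
The 64 values below are listed in zig-zag scan order (lowest frequency first). The divisors generally grow along the scan because the eye becomes less sensitive to higher horizontal and vertical frequencies.

 16  11  12  14  12  10  16  14
 13  14  18  17  16  19  24  40
 26  24  22  22  24  49  35  37
 29  40  58  51  61  60  57  51
 56  55  64  72  92  78  64  68
 87  69  55  56  80 109  81  87
 95  98 103 104 103  62  77 113
121 112 100 120  92 101 103  99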

Entropy Encoding
• Compress the sequence of quantized DC and AC coefficients from the quantization step
  – Further increases compression, but without loss
• Separate DC from AC components
  – DC components change slowly, thus will be encoded using difference encoding

DC Encoding
• DC represents average intensity of a block
  – Because image intensity tends to change slowly, DC values tend to change slowly
  – Encode using difference encoding scheme
  – Use 3x3 pattern of blocks
• Because the difference tends to be near zero, can use fewer bits in the encoding
  – Categorize difference into difference classes
  – Send the index of the difference class, followed by the bits representing the difference
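A sketch of DC difference coding with difference classes. It assumes each DC value is predicted from the DC value of the previous block in scan order (starting from 0) and that the class index is the number of bits needed for the magnitude of the difference; the 3x3 block pattern mentioned above is not modelled, and the function names are illustrative.

```python
def dc_differences(dc_values, initial_predictor=0):
    """Difference of each DC value from the previous block's DC value."""
    diffs, prev = [], initial_predictor
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs


def difference_class(diff):
    """Number of bits needed for |diff| (class 0 means the difference is 0)."""
    return abs(diff).bit_length()


dc_values = [52, 55, 54, 54, 60]       # made-up DC values of successive blocks
diffs = dc_differences(dc_values)
print(diffs)                                 # [52, 3, -1, 0, 6]
print([difference_class(d) for d in diffs])  # [6, 2, 1, 0, 3]
```

After the first block the differences are small, so most of them fall into low classes that need only a few extra bits.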

AC Encoding
• Use a zig-zag ordering of coefficients. Why?
  – Orders frequency components from low to high
  – Should produce a maximal series of 0s at the end
  – Lends itself well to RLE
• Apply RLE to the ordering
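The zig-zag scan and the run-length step can be sketched as follows. The example uses a made-up 4 x 4 block to keep the output short (JPEG uses 8 x 8), and the run-length format is simplified to (zeros skipped, value) pairs with (0, 0) as an end-of-block marker; the limit on run length used by real encoders is ignored. Function names are illustrative.

```python
def zigzag_indices(n=8):
    """(row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):                    # s = row + col of a diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # even diagonals are walked upwards (row decreasing), odd ones downwards
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def run_length_ac(ac):
    """(zero_run, value) pairs for non-zero values, then (0, 0) if zeros remain."""
    pairs, run = [], 0
    for v in ac:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:
        pairs.append((0, 0))                      # end-of-block marker
    return pairs


block = [
    [10, 3, 4, 0],
    [-2, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
scan = [block[r][c] for r, c in zigzag_indices(4)]
print(scan[:6])                 # [10, 3, -2, 1, 0, 4]
print(run_length_ac(scan[1:]))  # [(0, 3), (0, -2), (0, 1), (1, 4), (0, 0)]
```

The first scanned value is the DC coefficient, which is coded separately. The ten trailing zeros of the AC sequence collapse into the single end-of-block marker.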

Huffman Encoding
• String together the RLE of the AC coefficients along with the DC difference indices and values
• Apply Huffman encoding to resulting sequence
• Attach appropriate headers
• Finally have the JPEG image!
