Image Compression

Introduction to Image and Video Compression
Need for Image & Video Compression
Uncompressed video

640 x 480 resolution, 8 bit (1 bytes) colour, 24 fps
307.2 Kbytes per image (frame)
7.37 Mbytes per second
442 Mbytes per minute
26.5 Gbytes per hour
640 x 480 resolution, 24 bit (3 bytes) colour, 30 fps

921.6 Kbytes per image (frame)
27.6 Mbytes per second
1.66 Gbytes per minute
99.5 Gbytes per hour
Given a 100 Gigabyte disk (100 x 10^9 bytes), can store about 1 to 4 hours of high quality video
MPEG-1 compresses video to 187 Kbytes/second
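The uncompressed figures above follow from simple multiplication. A minimal sketch that reproduces them, assuming decimal units (1 Kbyte = 1000 bytes); the function name `raw_video_rates` is illustrative only:

```python
def raw_video_rates(width, height, bytes_per_pixel, fps):
    """Return (bytes per frame, per second, per minute, per hour)."""
    frame = width * height * bytes_per_pixel
    second = frame * fps
    return frame, second, second * 60, second * 3600

# The two cases from the slide: 8-bit colour at 24 fps, 24-bit colour at 30 fps.
for bytes_per_pixel, fps in [(1, 24), (3, 30)]:
    frame, sec, minute, hour = raw_video_rates(640, 480, bytes_per_pixel, fps)
    print(f"{bytes_per_pixel} byte(s)/pixel at {fps} fps: "
          f"{frame / 1e3:.1f} KB/frame, {sec / 1e6:.2f} MB/s, "
          f"{minute / 1e9:.2f} GB/min, {hour / 1e9:.1f} GB/h")
```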
Need for Image & Video Compression
– Raw video contains an immense amount of data

– Communication and storage capabilities are limited and expensive
• Example HDTV video signal:
– 720x1280 pixels/frame, progressive scanning at 60 frames/s: about 1.3 Gb/s uncompressed (assuming 24 bits/pixel)
– 20 Mb/s HDTV channel bandwidth
→ Requires compression by a factor of about 70 (equivalent to 0.35 bits/pixel)
Compression
Based on Information Theory as described by
Claude Shannon in 1948
Lossless and lossy methods
Efficient binary representation of information
Remove redundancy
Compression Theory (lossy/lossless)
Original data: 5 4 2
Lossless codec: decoded data 5 4 2 (no error)
Lossy codec: decoded data 4.9 3.9 1.9 (mild error) or 3.1 2.1 0 (high error)
Compression theory
Original picture, 800 x 600: 1.37 MB
Decoded picture (low losses): compressed to 85 KB
Decoded picture (high losses): compressed to 16 KB
Compression Theory (Entropy)
[Figure: a stream of data values in which the information-carrying part is labelled ENTROPY and the remainder is labelled Redundancy.]
Compression Theory (Entropy)
Lossy compression: only part of the entropy is preserved
Optimal lossless compression: the full entropy is preserved
Image Compression and Formats
RLE
Huffman
LZW
GIF
JPEG
Fractals
TIFF, PICT, BMP, etc.
Video Compression and Formats
H.261/H.263
Cinepak
Sorenson
Indeo
Real Video
MPEG-1, MPEG-2, MPEG-4, etc.
QuickTime, AVI
Special Coding Requirements
Viewing real-time source information
End-to-end delay (EED) should not exceed 150-200 ms
Face-to-face application needs EED of 50ms (including
compression and decompression)
Interactive viewing - random access
Random access to single images and audio frames, access
time should be less than 0.5 sec
Decompression of images, video, audio should not be linked
to other data units to support random access
Compression Steps
Uncompressed Picture → Picture Preparation → Picture Processing → Quantization → Entropy Coding → Compressed Picture
(with an adaptive feedback loop between the stages)
Picture Preparation
Analog-to-digital conversion
Generate appropriate digital representation
Divide picture into blocks (usually 8x8)
Fix the number of bits per pixel
Picture Processing
Transform to frequency domain
E.g., use the Discrete Cosine Transform (DCT)
Compute motion vectors for each block
Quantization and Coding
Map real numbers to integers
E.g., the DC and AC coefficients from the DCT are
real numbers, but only integers are stored
Entropy coding
Compress a sequential bit stream without loss
Types of Compression
Symmetric compression
Requires same time for encoding and decoding
Used for dialog mode applications (teleconference)
Asymmetric compression
Performed once when enough time is available
Two Pass Encoding
Used for retrieval mode applications (e.g., an
interactive CD-ROM)
Broad Classification of
Compression Techniques
Entropy Coding
lossless encoding
used regardless of media’s specific characteristics
data taken as a simple digital sequence
decompression process regenerates data completely
e.g. Run-length coding, Huffman coding, LZW, Arithmetic coding
Source Coding
lossy encoding
May take into account the semantics of the data
degree of compression depends on data content
E.g. DPCM, ADPCM
Hybrid Coding (used by most multimedia systems)
combine entropy with source encoding
E.g. JPEG, H.263, MPEG-1, MPEG-2, MPEG-4
Compression Techniques
Statistical techniques (Entropy)
Predictive techniques (Source)
Transform techniques (Hybrid)
Statistical Techniques
Entropy (H) refers to how much variability is in data
Low/high entropy means low/high variability
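Zero-order entropy model: $H = \log_2 M$, where $M$ is the number of distinct symbols
First-order entropy model: $H = -\sum_{i=1}^{M} P(i) \log_2 P(i)$, where $P(i)$ is the probability of symbol $i$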
Huffman encoding, for example

Use statistical profile to determine encoding scheme
Use fewer/more bits to encode more/less frequent data
Must transmit a codebook
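The two entropy models above can be compared on a short string. This is a minimal sketch, assuming symbol probabilities are estimated from relative frequencies in the data; the function names are illustrative:

```python
import math
from collections import Counter

def zero_order_entropy(data):
    """H = log2(M): every one of the M distinct symbols is treated as equally likely."""
    return math.log2(len(set(data)))

def first_order_entropy(data):
    """H = -sum P(i) log2 P(i), with P(i) estimated from symbol frequencies."""
    total = len(data)
    return -sum(n / total * math.log2(n / total) for n in Counter(data).values())

# A skewed source needs fewer bits per symbol than the zero-order model suggests.
sample = "aaab"
print(zero_order_entropy(sample), first_order_entropy(sample))
```

When all symbols are equally frequent the two models agree; the more skewed the distribution, the further the first-order value drops below the zero-order one.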
Huffman Coding
Source Data: ABABCCDBBABBAAABBAACCDDEEAAC...
Huffman Table
Symbol  Count  Code
A       20     0
B       8      100
C       6      101
D       6      110
E       4      111
Total size of source: 44 bytes / 352 bits
Total size after encoding: 92 bits
Compression ratio: 3.8
≅ 2 bits/pixel
Encoder output (compressed data): 010001001011011101001001100100000100100…
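The sizes in the Huffman table can be checked with a short sketch. It builds a Huffman tree from the symbol counts with Python's `heapq` and tracks only code lengths; the exact bit patterns depend on how ties are broken, so they may differ from the table even though the lengths and the total agree. The function name is illustrative:

```python
import heapq
from itertools import count

def huffman_code_lengths(counts):
    """Return {symbol: code length in bits} for a Huffman code over the counts."""
    tie = count()  # tie-breaker so the heap never has to compare the dicts
    heap = [(n, next(tie), {sym: 0}) for sym, n in counts.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, a = heapq.heappop(heap)   # two least frequent subtrees
        n2, _, b = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**a, **b}.items()}
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]

counts = {"A": 20, "B": 8, "C": 6, "D": 6, "E": 4}
lengths = huffman_code_lengths(counts)
total_bits = sum(counts[s] * lengths[s] for s in counts)
source_bits = 8 * sum(counts.values())   # one byte per source symbol
print(lengths, total_bits, round(source_bits / total_bits, 1))
```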
Predictive Techniques
DPCM
Compare adjacent pixels and only transmit the difference
between them
E.g., use 8 bits per pixel and use 4 bits for each difference
ADPCM
Similar to DPCM, but use a variable number of bits to
transmit differences
E.g., use 8 bits per pixel and use 1-5 bits for each difference
Transform Techniques
Convert data to an alternate form that better
supports specific operations
Typically operates on blocks of data
Larger blocks give better results, but require more
computational overhead
Discrete Cosine Transform
Converts an image from spatial to frequency domain
Operates on 8 x 8 blocks (64 pixels) of data
DC coefficient represents zero spatial frequency, which is
proportional to the average value of all the pixels in the 8 x 8 block
AC coefficients represent amplitudes of progressively higher
horizontal and vertical spatial frequency components
Run Length Encoding (RLE)
Form of entropy coding (lossless)
Content dependent coding scheme
Series of repeated values replaced by a single value and a count
Example:
The sequence abbbbbbbccddddeeddd
would be replaced by 1a7b2c4d2e3d
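A minimal sketch of this count-then-value scheme, assuming the symbols themselves are never digits (otherwise the decoder could not tell counts from data); the function names are illustrative:

```python
from itertools import groupby

def rle_encode(text):
    """Replace each run of a repeated symbol by '<count><symbol>'."""
    return "".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(text))

def rle_decode(code):
    """Invert rle_encode; assumes symbols are non-digit characters."""
    out, digits = [], ""
    for ch in code:
        if ch.isdigit():
            digits += ch          # counts may have several digits
        else:
            out.append(ch * int(digits))
            digits = ""
    return "".join(out)

print(rle_encode("abbbbbbbccddddeeddd"))
```

Note that isolated symbols grow from one character to two under this scheme, which is why practical variants treat non-repeating values specially.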
Example Implementation
Use 8 bits (1 byte) data elements
Unsigned [0, 255]; signed [-128, 127]
Encode repeated values using two bytes
First byte provides count (N) between [-1, -127]
The repetition count is -N + 1
Second byte contains the data value to repeat
Repeated patterns cannot be longer than 128
Must be broken into multiple runs
A non-repeating data value will have a positive first byte, which is the data value
Exercise
Assume same sequence from before
abbbbbbbccddddeeddd
Given the previous implementation, what would be the transmitted code?
Answer: a -6b -1c -3d -1e -2d
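The answer can be reproduced with a small sketch of the signed-count scheme. It emits a Python list rather than packed bytes, which is a simplification for readability; the function name and the `max_run` parameter are illustrative:

```python
from itertools import groupby

def rle_signed(data, max_run=128):
    """Encode runs as [N, value] with N = -(count - 1); single values stay literal."""
    out = []
    for value, run in groupby(data):
        length = len(list(run))
        while length > 0:
            chunk = min(length, max_run)      # long runs are split into pieces
            if chunk == 1:
                out.append(value)             # non-repeating value: sent as is
            else:
                out.extend([-(chunk - 1), value])
            length -= chunk
    return out

print(rle_signed("abbbbbbbccddddeeddd"))
```

A run of 129 identical values is split into one maximal run of 128 (N = -127) followed by a single literal value.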
Differential Encoding (Source)
Consider a sequence of values S1, S2, S3, etc.
that differ in value, but not dramatically
Encode differences from a specific value
E.g., S1, S2-S1, S3-S2, etc.
E.g. still image

Calculate difference between nearby pixels
Areas of rapid color change characterized by large
values, other areas by small values
After differential encoding, apply RLE
Differential Encoding Example
0 0 0 0 0
0 255 250 253 251
0 255 251 254 255
0 0 0 0 0
DPCM (differences taken along each row): 0, 0, 0, 0, 0, 0, 255, -5, 3, -2, 0, 255, -4, 3, 1, …
RLE: 6(0), 255, -5, 3, -2, 0, 255, -4, 3, 1, …
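The example values are consistent with taking differences along each row separately, so that every row starts again from its own first pixel. A minimal sketch under that assumption, with runs written as `(count, value)` tuples; the function names are illustrative:

```python
from itertools import groupby

def dpcm_rows(image):
    """First pixel of each row as is, then differences between horizontal neighbours."""
    out = []
    for row in image:
        out.append(row[0])
        out.extend(cur - prev for prev, cur in zip(row, row[1:]))
    return out

def rle_runs(values):
    """Collapse runs longer than one into (count, value); keep single values as is."""
    out = []
    for value, run in groupby(values):
        n = len(list(run))
        out.append((n, value) if n > 1 else value)
    return out

image = [
    [0, 0, 0, 0, 0],
    [0, 255, 250, 253, 251],
    [0, 255, 251, 254, 255],
    [0, 0, 0, 0, 0],
]
differences = dpcm_rows(image)
print(differences)
print(rle_runs(differences))
```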
Compression Steps
Uncompressed Picture → Picture Preparation → Picture Processing → Quantization → Entropy Coding → Compressed Picture
(with an adaptive feedback loop between the stages)
JPEG Compression Steps
Prepare the image for compression
Transform color space
Down-sample components
Interleave the color planes
Partition into smaller blocks
Apply the Discrete Cosine Transform

Quantize the DC and AC coefficients
Apply entropy encoding to quantized numbers
JPEG Algorithm
Source image (R, G, B components) → 8x8 blocks → DCT-based encoding → compressed image data
DCT-based encoding: FDCT → Quantizer (table) → Entropy Encoder (table)
Color Transformation
May want to transform RGB to YUV or YCbCr
This is an optional step, so why do it?
Human visual system detects changes better in luminance than in chrominance components
YUV or YCbCr better supports this than RGB
Facilitates better compression by enabling
Down-sampling of the image in chrominance
Elimination of more of the high-frequency changes
in the chrominance dimensions
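As an illustration of the colour transform, here is a minimal sketch of an RGB to YCbCr conversion. The coefficients are the full-range ones commonly used with JPEG files (JFIF, derived from ITU-R BT.601); the slides do not state which variant they assume, so treat the exact constants as an assumption:

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (JFIF-style coefficients, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                  # luminance
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b       # blue-difference chroma
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b       # red-difference chroma
    return y, cb, cr

# Grey pixels carry no colour information: both chroma values sit at the midpoint.
print(rgb_to_ycbcr(100, 100, 100))
```

In practice the results are rounded and clamped to [0, 255] before further processing.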
Down-Sampling Chrominance
Optional step, but increases compression ratio
Only down-sample the chrominance components,
never the luminance component
Average groups of pixels in the horizontal
and/or vertical resolution of the image
Referred to as H2V1, H2V2, 4:2:2, 4:2:0, etc.
E.g., H2V2 means average every 2 pixels in
horizontal and vertical dimensions
H1V1 means use the SAME resolution in the
horizontal and vertical dimensions
Color sub-sampling
[Figures: sampling grids comparing 4:4:4 with 4:2:2, 4:1:1 and 4:2:0 sub-sampling.]
How does colour sub-sampling aid compression?
100 x 100 pixels frame requires:
100 x 100 Y Pixels
100 x 100 Cr Pixels
100 x 100 Cb Pixels
Using 4:2:0 colour sub-sampling:

100 x 100 Y Pixels
50 x 50 Cr Pixels
50 x 50 Cb Pixels
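The saving can be made concrete with a short sketch that averages each 2x2 group of a chrominance plane and counts the samples before and after. It assumes even image dimensions and plain averaging; the function names are illustrative:

```python
def subsample_420(plane):
    """Average every 2x2 group of a chroma plane (assumes even width and height)."""
    height, width = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1]
              + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
             for x in range(0, width, 2)]
            for y in range(0, height, 2)]

def sample_count(width, height, subsampled):
    """Total Y + Cb + Cr samples per frame, with or without 4:2:0 sub-sampling."""
    chroma = (width // 2) * (height // 2) if subsampled else width * height
    return width * height + 2 * chroma

cb_plane = [[128] * 100 for _ in range(100)]
small = subsample_420(cb_plane)
print(len(small), len(small[0]))
print(sample_count(100, 100, False), sample_count(100, 100, True))
```

For the 100 x 100 frame the sample count drops from 30,000 to 15,000, a factor of two before any further coding.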
Blocks and Pixel Shift
Divide each component into 8x8 blocks
Send to the FDCT
Shift each pixel from unsigned range [0, 255] into signed range [-128, 127]
DCT requires range be centered around 0
Forward DCT
Convert from spatial to frequency domain
Convert intensity function into weighted sum of
elementary frequency components
Identify pieces of spectral information that can be
thrown away with little visible loss of quality
Intensity values in each color plane often
change slowly (see next example)
Contributions from higher frequency bands in the
frequency domain can often be ignored
Better compression with little visible loss of quality
Equations for 2D DCT
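Forward DCT:
$$F(u,v) = \frac{2}{\sqrt{nm}}\, C(u)\, C(v) \sum_{y=0}^{m-1} \sum_{x=0}^{n-1} I(x,y) \cos\left(\frac{(2x+1)u\pi}{2n}\right) \cos\left(\frac{(2y+1)v\pi}{2m}\right)$$
Inverse DCT:
$$I(x,y) = \frac{2}{\sqrt{nm}} \sum_{v=0}^{m-1} \sum_{u=0}^{n-1} F(u,v)\, C(u)\, C(v) \cos\left(\frac{(2x+1)u\pi}{2n}\right) \cos\left(\frac{(2y+1)v\pi}{2m}\right)$$
where $C(k) = 1/\sqrt{2}$ for $k = 0$ and $C(k) = 1$ otherwise.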
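The forward and inverse transforms can be written directly from the double sums. The sketch below is a deliberately naive, pure-Python version for small blocks; it assumes the scale factor 2/sqrt(nm) and C(0) = 1/sqrt(2), C(k) = 1 otherwise, which is the usual orthonormal convention. Blocks are indexed `block[y][x]` and coefficients `coeffs[v][u]`; all names are illustrative:

```python
import math

def _c(k):
    return 1 / math.sqrt(2) if k == 0 else 1.0

def _basis(x, u, n):
    return math.cos((2 * x + 1) * u * math.pi / (2 * n))

def fdct(block):
    """Forward 2D DCT of an m-row by n-column block."""
    m, n = len(block), len(block[0])
    scale = 2 / math.sqrt(n * m)
    return [[scale * _c(u) * _c(v) * sum(
                 block[y][x] * _basis(x, u, n) * _basis(y, v, m)
                 for y in range(m) for x in range(n))
             for u in range(n)]
            for v in range(m)]

def idct(coeffs):
    """Inverse 2D DCT, the mirror image of fdct."""
    m, n = len(coeffs), len(coeffs[0])
    scale = 2 / math.sqrt(n * m)
    return [[scale * sum(
                 coeffs[v][u] * _c(u) * _c(v) * _basis(x, u, n) * _basis(y, v, m)
                 for v in range(m) for x_unused in [0] for u in range(n))
             for x in range(n)]
            for y in range(m)]

flat = [[10] * 8 for _ in range(8)]       # constant block: only the DC term survives
ramp = [[x + 8 * y - 128 for x in range(8)] for y in range(8)]
restored = idct(fdct(ramp))
print(round(fdct(flat)[0][0], 6))
```

For an 8 x 8 block the DC coefficient comes out as eight times the block average, and applying `idct` to the output of `fdct` recovers the input up to floating-point error. Real codecs use fast factorised algorithms rather than this quadruple loop.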
Visualization of Basis Functions
[Figure: the DCT basis functions of an 8x8 block, with frequency increasing along both the horizontal and the vertical axis.]
Quantization
Divide each coefficient in a block by an integer in the range [1, 255]
Comes in the form of a table, same size as a block
Divide the block of coefficients element-wise by the table, and then round the result to nearest integer
In the decoding process, multiply the quantized coefficients by the corresponding table entries
Get back a number close to, but not the same as, the original
Error is never more than half of the quantization number
Larger numbers in quantization table cause more loss
This is the main source of loss in JPEG
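A minimal sketch of the quantize and dequantize round trip on a small block, showing that the reconstruction error stays within half of each table entry. The coefficient values are made up for illustration and the function names are hypothetical:

```python
def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to the nearest integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]

def dequantize(levels, table):
    """Decoder side: multiply each quantized level by the same table entry."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]

coeffs = [[-415.4, 30.2], [12.0, -7.9]]   # illustrative DCT coefficients
table = [[16, 11], [12, 14]]              # illustrative quantization steps
levels = quantize(coeffs, table)
restored = dequantize(levels, table)
print(levels)
print(restored)
```

The integers in `levels` are what gets entropy coded; larger table entries shrink them toward zero and so increase both compression and loss.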
De facto Quantization Table
16 11 12 14 12 10 16 14
13 14 18 17 16 19 24 40
26 24 22 22 24 49 35 37
29 40 58 51 61 60 57 51
56 55 64 72 92 78 64 68
87 69 55 56 80 109 81 87
95 98 103 104 103 62 77 113
121 112 100 120 92 101 103 99
These values match the example luminance table from the JPEG standard, listed in zig-zag (low to high frequency) order
Eye becomes less sensitive toward higher frequencies, so the divisors grow
Entropy Encoding
Compress the sequence of quantized DC and
AC coefficients from the quantization step
Further increase compression, but without loss
Separate DC from AC components

DC components change slowly, thus will be
encoded using difference encoding
DC Encoding
DC represents average intensity of a block
Because image intensity tends to change slowly,
DC values tend to change slowly
Encode using difference encoding scheme
Use 3x3 pattern of blocks
Because the difference tends to be near zero, can
use fewer bits in the encoding
Categorize difference into difference classes
Send the index of the difference class, followed by
the bits representing the difference
AC Encoding
Use a zig-zag ordering of coefficients, why?
Orders frequency components from low->high
Should produce maximal series of 0s at the end
Lends well to RLE
Apply RLE to ordering
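The zig-zag scan and the run-length step can be sketched as follows. This is a simplified illustration, not the exact JPEG bitstream format: each non-zero AC coefficient is written as a `(zeros_before, value)` pair and a hypothetical `"EOB"` marker stands in for the trailing run of zeros. The function names are illustrative:

```python
def zigzag_order(n=8):
    """(row, column) positions of an n x n block in zig-zag order."""
    cells = [(y, x) for y in range(n) for x in range(n)]
    # Walk the anti-diagonals; alternate the direction on each one.
    return sorted(cells, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else p[1]))

def ac_run_lengths(block):
    """Run-length code the AC coefficients of a quantized block."""
    scan = [block[y][x] for y, x in zigzag_order(len(block))]
    pairs, zeros = [], 0
    for value in scan[1:]:            # scan[0] is the DC coefficient
        if value == 0:
            zeros += 1
        else:
            pairs.append((zeros, value))
            zeros = 0
    if zeros:
        pairs.append("EOB")           # everything after this point is zero
    return pairs

block = [[0] * 8 for _ in range(8)]   # illustrative quantized block
block[0][0], block[0][1], block[1][0], block[1][1] = 10, 3, -2, 1
print(ac_run_lengths(block))
```

Because quantization zeroes most high-frequency coefficients, the zig-zag scan pushes them to the end, where one end-of-block marker replaces them all.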
Huffman Encoding
String together the RLE of the AC coefficients
along with the DC difference indices and values
Apply Huffman encoding to resulting sequence
Attach appropriate headers
Finally have the JPEG image!