
Image Compression

Prepared by
T. RAVI KUMAR NAIDU
Image Compression
• The goal of image compression is to reduce the amount of
data required to represent a digital image.
Image Compression
Need For Image Compression
• Because much of this information is graphical or pictorial in
nature, the storage and communication requirements are
immense.

• Image compression addresses the problem of reducing the
amount of data required to represent a digital image.

• It also plays an important role in video conferencing, remote
sensing, satellite TV, FAX, document imaging, and medical imaging.
Fundamentals
• The term data compression refers to the process of reducing the
amount of data required to represent a given quantity of
information
• Data ≠ Information
• Various amounts of data can be used to represent the same
information
• Data might contain elements that provide no relevant
information: data redundancy
• Data redundancy is a central issue in image compression. It is not
an abstract concept but a mathematically quantifiable entity
Data vs Information
The same amount of information can be represented by various
amounts of data, e.g.:
Ex:1 - Your wife, Helen, will meet you at Logan Airport in Boston
at 5 minutes past 6:00 pm tomorrow night
Ex:2 - Your wife will meet you at Logan Airport at 5 minutes past
6:00 pm tomorrow night
Ex:3 - Helen will meet you at Logan at 6:00 pm tomorrow night
Data Redundancy

Data Redundancy
• Let n1 and n2 denote the number of information-carrying units
in two data sets that represent the same information
• The relative redundancy RD is defined as:

      RD = 1 − 1/CR

where CR, commonly called the compression ratio, is

      CR = n1 / n2
Data Redundancy

• If n1 = n2, CR = 1 and RD = 0: no redundancy

• If n1 >> n2, CR → ∞ and RD → 1: high redundancy

• If n1 << n2, CR → 0 and RD → −∞: undesirable
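The two definitions above can be checked with a short sketch (the bit counts are illustrative numbers, not from the slides):

```python
def compression_ratio(n1, n2):
    """C_R = n1 / n2, where n1 and n2 are the numbers of
    information-carrying units in the two representations."""
    return n1 / n2

def relative_redundancy(n1, n2):
    """R_D = 1 - 1 / C_R."""
    return 1.0 - 1.0 / compression_ratio(n1, n2)

# Suppose a raw image needs 100,000 bits and the coded one 10,000:
cr = compression_ratio(100_000, 10_000)    # C_R = 10
rd = relative_redundancy(100_000, 10_000)  # R_D = 0.9: 90% of the data is redundant
```

Note how the two limiting cases on this slide fall out directly: equal sizes give RD = 0, and a very large n1 pushes RD toward 1.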


Types of Data Redundancy
Redundancy in Digital Images

– Coding redundancy: usually appears as a result of the
uniform representation of each pixel

– Spatial/temporal redundancy: adjacent pixels tend to be
similar in practice

– Irrelevant information: images contain information that is
ignored by the human visual system
Types of Data Redundancy

1. Coding Redundancy

2. Interpixel Redundancy

3. Psychovisual Redundancy

– Data compression attempts to reduce one or


more of these redundancy types.
Coding - Definitions
• Code: a list of symbols (letters, numbers, bits etc.)
• Code word: a sequence of symbols used to
represent some information (e.g., gray levels).
• Code word length: number of symbols in a code
word.
Coding - Definitions (cont’d)
Coding Redundancy
• Assume the gray levels are values of a discrete random variable rk
in the interval [0, 1]. Each rk occurs with probability pr(rk).
• If the number of bits used to represent each value of rk is l(rk),
then the average number of code bits assigned per gray-level value is

      Lavg = Σk l(rk) pr(rk)

• The length of a code word should be inversely proportional to the
probability of occurrence of its gray level.
Coding - Definitions (cont’d)
Coding - Definitions (cont’d)
• N × M image

• rk : k-th gray level

• l(rk) : number of bits for rk

• P(rk) : probability of rk

• Expected value: E(X) = Σx x · P(X = x)
Coding Redundancy
• Case 1: l(rk) = constant length

Example:
Coding Redundancy (cont’d)
• Case 2: l(rk) = variable length

Total number of bits: 2.7NM
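The two cases can be compared by computing Lavg = Σk l(rk) pr(rk). The probabilities below are assumed values from a commonly used 8-level textbook example, chosen so the variable-length code averages 2.7 bits/pixel, matching the 2.7NM total above:

```python
def average_code_length(probs, lengths):
    # L_avg = sum over k of l(r_k) * p_r(r_k)
    return sum(p * l for p, l in zip(probs, lengths))

# Illustrative 8-level gray-scale distribution (assumed values)
probs    = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]
fixed    = [3] * 8                    # Case 1: natural 3-bit binary code
variable = [2, 2, 2, 3, 4, 5, 6, 6]   # Case 2: shorter codes for likelier levels

l_fixed = average_code_length(probs, fixed)     # about 3.0 bits/pixel -> 3NM bits
l_var   = average_code_length(probs, variable)  # about 2.7 bits/pixel -> 2.7NM bits
```

For an N × M image this gives a compression ratio of 3/2.7 ≈ 1.11 from recoding alone.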


Inter pixel Redundancy:
• Inter pixel redundancy is due to the correlation between the
neighboring pixels in an image.
• Inter pixel redundancy depends on the resolution of the image
– The higher the (spatial) resolution of an image, the more
probable it is that two neighboring pixels will depict the
same object
– The higher the frame rate in a video stream, the more
probable it is that the corresponding pixel in the following
frame will depict the same object
• These types of predictions are made more difficult by the
presence of noise
Inter pixel Redundancy:
• The value of any given pixel can be predicted from
the values of its neighbors; that is, they are highly
correlated.

• The information carried by an individual pixel is
relatively small. To reduce interpixel redundancy,
the differences between adjacent pixels can be used
to represent an image.
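A minimal sketch of this difference (predictive) coding on one scan line; the pixel values are made up for illustration:

```python
row = [52, 53, 53, 54, 57, 57, 58, 60]   # neighboring pixels are highly correlated

# Encode: keep the first pixel, then store only differences between neighbors.
diffs = [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]
# The differences are small values, which a variable-length code can
# represent with far fewer bits than the raw 8-bit pixels.

# Decode: a running sum restores the original line exactly (lossless).
recon = [diffs[0]]
for d in diffs[1:]:
    recon.append(recon[-1] + d)
```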
Psychovisual redundancy
• The human eye does not respond with equal
sensitivity to all visual information.

• It is more sensitive to the lower frequencies than to


the higher frequencies in the visual spectrum.

• Idea: discard data that is perceptually insignificant!


Psychovisual redundancy
• Psychovisual redundancy is associated with the characteristics
of the human visual system (HVS).
• In the HVS, visual information is not perceived equally.
• Some information may be more important than other
information.
• If less data is used to represent the less important visual
information, perception is not noticeably affected.
• This means such visual information is psychovisually
redundant. Eliminating psychovisual redundancy leads to
efficient compression.
Psycho-visual Redundancy
Fidelity Criteria

• How close is the reconstructed image f^(x, y) to the original f(x, y)?
• Criteria
– Subjective: based on human observers
– Objective: mathematically defined criteria
Subjective Fidelity Criteria
Objective Fidelity Criteria
The error between the original f(x, y) and the reconstruction f^(x, y) is:

      e(x, y) = f^(x, y) − f(x, y)

So the total error between the two images is

      Σx=0..M−1 Σy=0..N−1 [f^(x, y) − f(x, y)]

The root-mean-square error averaged over the whole image is

      erms = [ (1/MN) Σx Σy [f^(x, y) − f(x, y)]² ]^(1/2)
Objective Fidelity Criteria
• A closely related objective fidelity criterion is the mean-square
signal-to-noise ratio of the compressed-decompressed image:

      SNRms = [ Σx=0..M−1 Σy=0..N−1 f^(x, y)² ] / [ Σx=0..M−1 Σy=0..N−1 [f^(x, y) − f(x, y)]² ]
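Both criteria can be sketched directly from the formulas; the 2×2 images below are arbitrary examples:

```python
import math

def rms_error(f, fhat):
    """Root-mean-square error between original f and reconstruction fhat."""
    M, N = len(f), len(f[0])
    se = sum((fhat[x][y] - f[x][y]) ** 2 for x in range(M) for y in range(N))
    return math.sqrt(se / (M * N))

def snr_ms(f, fhat):
    """Mean-square signal-to-noise ratio of the reconstructed image."""
    num = sum(v ** 2 for row in fhat for v in row)
    den = sum((fhat[x][y] - f[x][y]) ** 2
              for x in range(len(f)) for y in range(len(f[0])))
    return num / den

f    = [[1, 2], [3, 4]]
fhat = [[1, 2], [3, 5]]   # one pixel off by 1
# rms_error(f, fhat) -> 0.5 ; snr_ms(f, fhat) -> 39.0
```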
Subjective vs Objective Fidelity Criteria

(Figure: three reconstructed images with RMSE = 5.17, 15.67, and 14.17)


Compression Model

• The source encoder is responsible for removing redundancy


(coding, inter-pixel, psycho-visual)
• The channel encoder ensures robustness against channel
noise.
Compression Types

Compression divides into two families:

– Error-free (lossless) compression
– Lossy compression
Huffman Coding
(addresses coding redundancy)

• A variable-length coding technique.

• Source symbols are encoded one at a time!

– There is a one-to-one correspondence between
source symbols and code words.

• Optimal code: minimizes code-word length per
source symbol.
Huffman Coding
• Forward Pass
– 1. Sort probabilities per symbol
– 2. Combine the lowest two probabilities
– 3. Repeat Step 2 until only two probabilities remain.
Huffman Coding
• Backward Pass
Assign code symbols going backwards
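The two passes can be sketched with Python's heapq: the heap pops implement the forward pass (combine the two lowest probabilities), and prepending a 0/1 at each merge performs the backward assignment of code symbols. The probabilities are from a commonly used six-symbol textbook example (assumed here), whose optimal average length is 2.2 bits/symbol:

```python
import heapq

def huffman_codes(probs):
    """Build a Huffman code for {symbol: probability}."""
    heap = [[p, i, {s: ""}] for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # lowest probability
        p2, _, c2 = heapq.heappop(heap)   # second lowest
        # Backward pass: prepend a code symbol to every word in each subtree.
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, [p1 + p2, next_id, merged])
        next_id += 1
    return heap[0][2]

probs = {"a2": 0.4, "a6": 0.3, "a1": 0.1, "a4": 0.1, "a3": 0.06, "a5": 0.04}
codes = huffman_codes(probs)
avg = sum(probs[s] * len(codes[s]) for s in probs)   # 2.2 bits/symbol
```

Different tie-breaking orders can produce different (but equally optimal) codes; the average length is always the same.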
Arithmetic coding
• Arithmetic coding is a form of variable-length entropy encoding.
• When a string is converted to arithmetic encoding, frequently used
characters are stored with fewer bits.
• Arithmetic coding encodes the entire message into a single
number, a fraction n where 0.0 ≤ n < 1.0.
• The next figure illustrates the basic arithmetic coding process. Here,
a five-symbol sequence or message, a1a2a3a3a4, from a four-
symbol source is coded.
• At the start of the coding process, the message is assumed to
occupy the entire half-open interval [0, 1).
Arithmetic coding
• As the table shows, this interval is initially subdivided into four
regions based on the probabilities of the source symbols.
• Symbol a1, for example, is associated with subinterval [0, 0.2).
Because it is the first symbol of the message being coded, the
message interval is initially narrowed to [0, 0.2).
• Thus, in the figure, [0, 0.2) is expanded to the full height of the figure
and its end points labeled with the values of the narrowed range.
• The narrowed range is then subdivided in accordance with the
original source symbol probabilities, and the process continues
with the next message symbol.
Arithmetic coding
• In this manner, symbol a2 narrows the subinterval to [0.04,
0.08), a3 further narrows it to [0.056, 0.072), and so on.

• The final message symbol, which must be reserved as a
special end-of-message indicator, narrows the range to
[0.06752, 0.0688).

• Of course, any number within this subinterval (for example,
0.068) can be used to represent the message.
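The interval narrowing can be sketched directly. The subintervals below are the ones implied by this example (a1, a2, a4 with probability 0.2 each; a3 with probability 0.4):

```python
# Cumulative subintervals implied by the example: a1 [0, 0.2), a2 [0.2, 0.4),
# a3 [0.4, 0.8), a4 [0.8, 1.0) -- a4 doubles as the end-of-message symbol.
intervals = {"a1": (0.0, 0.2), "a2": (0.2, 0.4),
             "a3": (0.4, 0.8), "a4": (0.8, 1.0)}

def narrow(message, intervals):
    """Narrow [0, 1) once per symbol; any number in the final
    interval represents the whole message."""
    low, high = 0.0, 1.0
    for s in message:
        s_low, s_high = intervals[s]
        span = high - low
        low, high = low + span * s_low, low + span * s_high
    return low, high

low, high = narrow(["a1", "a2", "a3", "a3", "a4"], intervals)
# (low, high) is approximately (0.06752, 0.0688), and 0.068 lies inside it.
```

A real coder emits bits incrementally and uses integer arithmetic to avoid the precision loss that plain floats suffer on long messages.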
Transform Coding ( Lossy Compression)

The goal of the transformation process is to decorrelate the
pixels of each sub-image, or to pack as much information as
possible into the smallest number of transform coefficients.
Transform Coding ( Lossy Compression)
N 1 N 1
T (u , v)   f ( x, y ) g ( x, y, u, v) u , v  0,1, , N  1
u 0 v 0
N 1 N 1
f ( x, y )   T (u , v)h( x, y, u , v) x, y  0,1,, N  1
u 0 v 0

Forward kernel is Separable if:

g ( x, y, u, v)  g1 ( x, u).g2 ( y, v)

Forward kernel is Symmetric if:

g1  g2  g ( x, y, u, v)  g1 ( x, u).g1 ( y, v)
Transform Coding ( Lossy Compression)

Discrete Fourier Transform (DFT):

      g(x, y, u, v) = (1/N) e^(−j2π(ux + vy)/N)

      h(x, y, u, v) = e^(j2π(ux + vy)/N)

Walsh-Hadamard Transform (WHT):

      g(x, y, u, v) = h(x, y, u, v) = (1/N) (−1)^( Σi=0..m−1 [bi(x)pi(u) + bi(y)pi(v)] ),   N = 2^m

bk(z) is the k-th bit (from right to left) in the
binary representation of z.
Transform Coding ( Lossy Compression)

      p0(u) = bm−1(u)
      p1(u) = bm−1(u) + bm−2(u)
      p2(u) = bm−2(u) + bm−3(u)
      ...
      pm−1(u) = b1(u) + b0(u)

(all sums performed modulo 2)
Transform Coding ( Lossy Compression)
Discrete Cosine Transform (DCT):

      g(x, y, u, v) = h(x, y, u, v)
                    = α(u) α(v) cos[ (2x + 1)uπ / 2N ] cos[ (2y + 1)vπ / 2N ]

where α(u) = √(1/N)   for u = 0
      α(u) = √(2/N)   for u = 1, 2, …, N−1
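A minimal NumPy sketch of this separable, symmetric transform, built from the DCT kernel above (the orthonormal DCT-II). The flat test block is an assumed example, chosen to show the energy-packing behavior:

```python
import numpy as np

def dct_matrix(N):
    """C[u, x] = alpha(u) * cos((2x + 1) * u * pi / (2N))."""
    C = np.zeros((N, N))
    for u in range(N):
        alpha = np.sqrt(1.0 / N) if u == 0 else np.sqrt(2.0 / N)
        for x in range(N):
            C[u, x] = alpha * np.cos((2 * x + 1) * u * np.pi / (2 * N))
    return C

def dct2(block):
    """Separable 2-D transform: T = C f C^T."""
    C = dct_matrix(block.shape[0])
    return C @ block @ C.T

def idct2(coeffs):
    """Inverse transform: f = C^T T C (C is orthogonal)."""
    C = dct_matrix(coeffs.shape[0])
    return C.T @ coeffs @ C

block = np.full((8, 8), 100.0)   # a flat sub-image
T = dct2(block)                  # all energy packs into the DC coefficient T[0, 0]
```

For this flat block T[0, 0] is 800 and every other coefficient is (numerically) zero, which is exactly the packing of information into few coefficients that the slide describes.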
Dictionary-Based Compression
• The compression algorithms we studied so far use a statistical model
to encode single symbols
– Compression: Encode symbols into bit strings that use fewer bits.
• Dictionary-based algorithms do not encode single symbols as
variable-length bit strings; they encode variable-length strings of
symbols as single tokens
– The tokens form an index into a phrase dictionary
– If the tokens are smaller than the phrases they replace,
compression occurs.
• Dictionary-based compression is easier to understand because it uses
a strategy that programmers are familiar with: using indexes into
databases to retrieve information from large amounts of storage.
– Telephone numbers
– Postal codes

Dictionary-Based Compression: Example
• Consider the Random House Dictionary of the English Language,
Second edition, Unabridged. Using this dictionary, the string:
A good example of how dictionary based compression works
can be coded as:
1/1 822/3 674/4 1343/60 928/75 550/32 173/46 421/2
• Coding:
– Uses the dictionary as a simple lookup table
– Each word is coded as x/y, where, x gives the page in the
dictionary and y gives the number of the word on that page.
– The dictionary has 2,200 pages with fewer than 256 entries per
page: therefore x requires 12 bits and y requires 8 bits, i.e.,
20 bits per word (2.5 bytes per word).
– Using ASCII coding the above string requires 48 bytes,
whereas our encoding requires only 20 (= 2.5 × 8) bytes:
a compression ratio of 2.4 : 1.
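A concrete sketch of the same token idea is LZW, one widely used dictionary-based algorithm (used by GIF and Unix compress). Unlike the printed-dictionary example, it builds its phrase dictionary on the fly, so the dictionary never needs to be transmitted:

```python
def lzw_encode(data):
    """Encode variable-length strings of symbols as single dictionary tokens."""
    table = {chr(i): i for i in range(256)}   # start with all 1-character phrases
    next_code, w, out = 256, "", []
    for ch in data:
        if w + ch in table:
            w += ch                       # grow the longest known phrase
        else:
            out.append(table[w])          # emit the token for that phrase
            table[w + ch] = next_code     # learn a new, longer phrase
            next_code += 1
            w = ch
    if w:
        out.append(table[w])
    return out

tokens = lzw_encode("TOBEORNOTTOBEORTOBEORNOT")
# 24 input characters become 16 tokens; repeats like "TOB" collapse to one token.
```

The decoder rebuilds the identical table from the token stream, which is why the tokens alone suffice.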

Dictionary-Based Decompression
Run-Length Coding (RLC)
• Run : the repetition of a symbol.
• Run-length: number of repeated symbols.
• Instead of encoding the consecutive symbols, it is
more efficient to encode the run-length and the value
that these consecutive symbols commonly share.
• Application:
– Adopted in JPEG (multi-level image coding)
– Binary document coding
– Adopted in facsimile coding standards: the CCITT
Recommendations T.4 and T.6.
Run-Length Coding (RLC)
Classification:
• RLC using only the horizontal correlation between pixels on the
same scan line is called 1-D RLC.

• To achieve higher coding efficiency, 2-D RLC utilizes both
horizontal and vertical correlation between pixels.
1-D Run-Length Coding
• In this technique, each scan line is encoded independently.

• Each scan line can be considered a sequence of alternating,
independent white runs and black runs.

• As an agreement between encoder and decoder, the first run in
each scan line is assumed to be a white run. If the first actual
pixel is black, then the run-length of the first white run is set to
zero.

• At the end of each scan line there is a special codeword called
end-of-line (EOL). The decoder knows a scan line has ended
when it encounters an EOL codeword.
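The white-run-first convention can be sketched as follows (here 0 denotes white and 1 black, an assumed convention; the EOL codeword is omitted):

```python
def rlc_encode(line):
    """1-D RLC: emit alternating white/black run-lengths (0 = white, 1 = black).
    The first run is white by convention; if the line starts with black,
    the first white run-length is 0."""
    runs, color, count = [], 0, 0
    for pixel in line:
        if pixel == color:
            count += 1
        else:
            runs.append(count)
            color, count = pixel, 1
    runs.append(count)
    return runs

def rlc_decode(runs):
    line, color = [], 0
    for r in runs:
        line.extend([color] * r)
        color ^= 1          # alternate white -> black -> white ...
    return line

runs = rlc_encode([1, 1, 0, 0, 0, 1])   # starts black -> leading 0-length white run
# runs == [0, 2, 3, 1]
```

Because the colors strictly alternate, only the lengths need to be coded (in the fax standards, with Huffman codes).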
Image Compression Standards
Why Do We Need International Standards?

 International standardization is conducted to achieve
interoperability.
 Only the syntax and the decoder are specified.
 The encoder is not standardized; its optimization is left to the
manufacturer.
 Standards provide state-of-the-art technology, developed
by a group of experts in the field.
 They not only solve current problems, but also anticipate
future application requirements.
 Most of the standards are sanctioned by the International
Organization for Standardization (ISO) and the International
Telegraph and Telephone Consultative Committee (CCITT)
Image Compression Standards
• Binary Compression Standards
• CCITT G3 -> 1D Run Length Encoding
• CCITT G4 -> 2D Run Length encoding
• JBIG1 -> Lossless adaptive binary compression
• JBIG2 -> Lossy/Lossless adaptive binary compression

Software Research
Image Compression Standards
Binary Image Compression Standards
CCITT Group 3 and 4
 They are designed as FAX coding methods.
 Group 3 applies non-adaptive 1-D run-length coding and,
optionally, a 2-D mode.
 Both standards use the same non-adaptive 2-D coding
approach, similar to the relative address coding (RAC) technique.
 They sometimes result in data expansion. Therefore, the
Joint Bi-level Imaging Group (JBIG) has adopted several
other binary compression standards, JBIG1 and JBIG2.
Image Compression Standards
• Continuous Tone Still Image Compression
Standards
• JPEG
• JPEG 2000
• Mixed Raster Content (MRC)
Image Compression Standards
Continuous-Tone Still Image Compression
What Is JPEG?
 "Joint Photographic Experts Group". Voted an international
standard in 1992.

 Works with color and grayscale images, e.g., satellite,
medical, ...

 Supports both lossy and lossless compression.
Image Compression Standards
Continuous-Tone Still Image Compression - JPEG

 First-generation JPEG uses DCT + run-length + Huffman
entropy coding.
 Second-generation JPEG (JPEG 2000) uses the wavelet
transform + bit-plane coding + arithmetic entropy coding.
Image Compression Standards
Continuous-Tone Still Image Compression - JPEG
 Still-image compression standard
 Has 3 lossy modes and 1 lossless mode
 sequential baseline encoding
 encodes in one scan
 input and output data precision is limited to 8 bits, while quantized DCT
values are restricted to 11 bits
 progressive encoding
 hierarchical encoding
 lossless encoding
 Can achieve compression ratios of up to 20:1 without
noticeable reduction in image quality
Image Compression Standards
Continuous-Tone Still Image Compression - JPEG
 Works well for continuous-tone images, but not good for
cartoons or computer-generated images
 Tends to filter out high-frequency data
 A quality level (Q) can be specified
 with too low a Q, resulting images may contain blocking, contouring, and
ringing artifacts
 5 steps of sequential baseline encoding
 transform the image to luminance/chrominance space (YCbCr)
 subsample the color components (optional)
 partition the image into 8x8 pixel blocks and perform the DCT on each block
 quantize the resulting DCT coefficients
 variable-length code the quantized coefficients
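Step 4 (quantization) is where the loss occurs. A sketch using the example luminance quantization table given in Annex K of the JPEG standard; the input DCT coefficients below are made-up values:

```python
import numpy as np

# Example luminance quantization table from the JPEG standard (Annex K).
Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def quantize(dct_block, q_table):
    """Divide each DCT coefficient by its table entry and round;
    small high-frequency coefficients become 0 and are discarded."""
    return np.round(dct_block / q_table).astype(int)

def dequantize(q_block, q_table):
    return q_block * q_table

dct = np.zeros((8, 8))
dct[0, 0] = 1000.0   # large DC term (made-up coefficient)
dct[7, 7] = 40.0     # small high-frequency term (made-up coefficient)
q = quantize(dct, Q)
# q[7, 7] becomes 0: the high-frequency detail is permanently dropped.
```

The big divisors in the lower-right corner are why JPEG "tends to filter out high-frequency data", and scaling Q up or down is how the quality level is implemented.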
Image Compression Standards
MPEG (Interframe Coding)

 Temporal DPCM is used to remove temporal redundancy
first.
 The motion-compensated prediction error is coded with
DCT + run-length + Huffman entropy coding.
Image Compression Standards
MPEG

 Temporal redundancy
 Prediction along the motion trajectories (motion-compensated
prediction)

Motion Estimation
 The accuracy of motion estimation has a big influence on
coding efficiency.
 Motion estimation is very time-consuming.
 Fast algorithms are needed.
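A sketch of the exhaustive (full) search that makes motion estimation so expensive: every candidate displacement within a search radius is scored with the sum of absolute differences (SAD). The frame contents here are synthetic:

```python
import numpy as np

def full_search(ref, cur, bx, by, block=8, radius=4):
    """Find the motion vector (dx, dy) minimizing the SAD between the
    current block at (bx, by) and a displaced block in the reference frame."""
    target = cur[by:by + block, bx:bx + block].astype(int)
    H, W = ref.shape
    best_sad, best_mv = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + block > W or y + block > H:
                continue                      # candidate falls outside the frame
            cand = ref[y:y + block, x:x + block].astype(int)
            sad = int(np.abs(target - cand).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dx, dy)
    return best_mv, best_sad

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, (32, 32))
cur = np.roll(ref, shift=(2, 3), axis=(0, 1))   # scene moved 3 right, 2 down
mv, sad = full_search(ref, cur, bx=8, by=8)
# mv == (-3, -2): the best match in the reference sits 3 left, 2 up; sad == 0
```

Note the cost: (2·radius + 1)² SAD evaluations per block, which is exactly why the fast (non-exhaustive) algorithms mentioned above are needed.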
Image Compression Standards
MPEG
 Frame types
 I frames: intra-coded picture
 P frames: predicted picture
 B frames: bidirectionally predicted picture

Input stream:      Frame 1 | Frame 2 | Frame 3 | Frame 4 | Frame 5 | Frame 6 | Frame 7
Compressed stream: I frame | B frame | B frame | P frame | B frame | B frame | I frame

P frames use forward prediction from the preceding I (or P) frame;
B frames use bidirectional prediction from the I/P frames on both sides.

 Example sequence transmitted as I P B B I B B
Thank you!
