B. Prabhakaran 1
MPEG Audio Features
MPEG Audio Layers
Layer 1 allows a maximal bit rate of 448 kbit/s.
Layer 2 allows 384 kbit/s.
Layer 3 allows 320 kbit/s.
MPEG Audio Coding
MPEG Audio Coding…
Uncompressed audio is transformed into 32 non-interleaved sub-bands by a filter bank; a Fast Fourier Transform (FFT) is used for the spectral analysis.
The amplitude and noise level of the signal in each sub-band are determined using a psycho-acoustic model.
Next, the transformed signals are quantized.
MPEG audio layers 1 and 2 are PCM-encoded. Layer
3: Huffman coded.
Types of channels
Single channel, two independent channels, or one stereo
channel.
Stereo channels: processed independently or jointly. Joint stereo exploits the redundancy between the two channels.
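As an illustration of how joint stereo can exploit inter-channel redundancy, the sketch below uses mid/side coding, i.e., transmitting the sum and difference of the two channels instead of left and right. This is a simplified, hypothetical illustration of the idea, not the exact MPEG bit-stream procedure.

```python
# Mid/side joint stereo sketch (illustrative, not the MPEG bit stream).
# When left and right are similar, the side channel is near zero and
# cheap to code; the transform is perfectly invertible.

def ms_encode(left, right):
    """Transform L/R samples into mid (average) and side (half difference)."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Recover the original L/R samples from mid/side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```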
MPEG-2 Audio
Different channels:
5 full bandwidth channels (left, right, center, and two
surround channels).
Additional low frequency enhancement (LFE) channel.
Up to seven commentary or multilingual channels.
Sampling rates defined for MPEG-2 audio include 16 kHz, 22.05 kHz, and 24 kHz.
JPEG Compression
Modes of compression
Lossy Sequential DCT-based Mode: also known as the baseline process; must be supported by every JPEG implementation.
Expanded Lossy DCT-based Mode: Provides an
additional set of further enhancements to the
baseline mode.
Lossless Mode: Allows perfect reconstruction of
the original image; lower compression ratio.
Hierarchical Mode: consists of images with
different resolutions generated using the methods
described above.
Image Preparation
Source image can have up to 255 components (instead of only the three components Y, U, and V).
E.g., components of an image can be the color
components (Red R, Green G, Blue B) or luminance
components (Luminance Y, chrominance U and V).
Each component of the image may have the same or
different resolution, in terms of the number of pixels
in the horizontal and vertical axis.
Image Preparation …
Each pixel is represented by p bits, with values in the range 0 to 2^p - 1.
The value of p depends on the mode of compression.
Lossy modes use either 8 or 12 bits per pixel.
Lossless modes use 2 up to 12 bits per pixel.
All pixels of all components within the same image are coded
with the same number of bits.
An application can use a different number of bits per pixel, provided the image is first transformed to one of the well-defined bit depths of the JPEG standard.
Non-Interleaved Ordering
Lossy Sequential Mode
The value of each pixel is first shifted into the range -128 to 127, with zero as the center; the shifted block is then transformed by the Discrete Cosine Transform (DCT).
For a block of 8 X 8 pixels, the shifted values are denoted Sxy, 0 ≤ x ≤ 7, 0 ≤ y ≤ 7.
These values are transformed using the Forward DCT (FDCT).
The FDCT has to be evaluated 64 times per block, resulting in 64 coefficients Suv per block.
Cosine expressions depend on x, y, u, and v, but not
on Sxy. Computation can be optimized to take
advantage of this fact.
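The FDCT described above can be sketched directly from its definition. The following is a straightforward, unoptimized Python version of the 8 X 8 JPEG forward DCT (a real codec would precompute the cosine terms, as noted above):

```python
import math

def fdct_8x8(s):
    """JPEG forward DCT: s[x][y] are level-shifted samples of an 8x8
    block; returns the 64 coefficients S[u][v]."""
    def c(k):
        # Normalization term: 1/sqrt(2) for the zero-frequency index.
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    S = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            acc = 0.0
            for x in range(8):
                for y in range(8):
                    # Cosine terms depend only on x, y, u, v, not on s[x][y],
                    # so they could be tabulated once.
                    acc += (s[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / 16)
                            * math.cos((2 * y + 1) * v * math.pi / 16))
            S[u][v] = 0.25 * c(u) * c(v) * acc
    return S
```

For a constant block (one predominant color), only the DC-coefficient S00 is non-zero, which is what the quantization phase then exploits.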
Lossy Sequential Mode ..
• FDCT maps the values from the spatial domain to the frequency domain.
• Each coefficient of Suv can be regarded as a two-dimensional frequency.
• Coefficient S00, the DC-coefficient, corresponds to the lowest frequency in both dimensions. It also describes the fundamental color of the 8 X 8 pixel block.
• Rest of the coefficients known as AC-coefficients.
Quantization Phase
FDCT coefficients in a block may have low or zero
values, if the block has only one predominant color.
Entropy encoding is used for further compression
Each of the 64 coefficient values is scaled by a factor Q, the quantization factor.
E.g., the quantized value of the DC-coefficient, SQ00, is :
SQ00 = S00 / Q.
In most cases, quantization is not done in a uniform
manner.
Low frequencies of the FDCT coefficients describe the
boundaries among regions in the image being compressed.
If low frequency coefficients are quantized in a very coarse
manner (i.e., with high values of Q), boundaries in the
reconstructed image may not be as sharp.
Low frequency coefficients quantized in finer manner
(i.e., with lower values of Q) than higher frequency
ones.
Table with 64 entries used for representing the
values of the quantization factor Q, for each of the 64
FDCT coefficients.
Quantization of each coefficient is: SQuv = Suv / Quv, where Quv is the quantization factor for the uv-th coefficient.
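The table-driven quantization step can be sketched as follows (a simplified illustration assuming rounding to the nearest integer, as in the baseline process):

```python
def quantize(S, Q):
    """Quantize an 8x8 block of FDCT coefficients S using the 64-entry
    quantization table Q: SQ[u][v] = round(S[u][v] / Q[u][v])."""
    return [[int(round(S[u][v] / Q[u][v])) for v in range(8)]
            for u in range(8)]
```

With finer factors (small Quv) for low frequencies and coarser factors for high frequencies, most high-frequency coefficients quantize to zero while the region boundaries stay sharp.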
Entropy Encoding
Lower-frequency AC-coefficients have higher values than higher-frequency ones (which are usually very small or zero).
Hence, zig-zag ordering of AC-coefficients produces
a sequence where similar values will be together.
Such a sequence is highly suitable for efficient
entropy encoding.
Next step: run-length encoding of zero values. JPEG
specifies Huffman and Arithmetic encoding.
(Arithmetic encoding is protected by a patent).
For the lossy JPEG mode (i.e., the baseline process), only Huffman encoding is allowed.
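A minimal sketch of the zig-zag ordering and the run-length encoding of zero-valued AC-coefficients (the final Huffman/arithmetic coding of the resulting pairs is omitted):

```python
def zigzag_order():
    """Return the 64 (u, v) index pairs of an 8x8 block in zig-zag order:
    diagonals of constant u+v, alternating traversal direction."""
    return sorted(((u, v) for u in range(8) for v in range(8)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def rle_ac(block):
    """Run-length encode the 63 AC-coefficients of a quantized block as
    (zero_run, value) pairs, ending with a (0, 0) end-of-block marker."""
    seq = [block[u][v] for (u, v) in zigzag_order()][1:]  # skip the DC term
    pairs, run = [], 0
    for coeff in seq:
        if coeff == 0:
            run += 1          # count consecutive zeros
        else:
            pairs.append((run, coeff))
            run = 0
    pairs.append((0, 0))      # end-of-block
    return pairs
```

Because the zig-zag sequence pushes the (mostly zero) high-frequency coefficients to the end, long zero runs collapse into very few pairs.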
Image Reconstruction
Decompress the data in Huffman/Arithmetic coded
form.
Dequantization is then performed: Ruv = SQuv X Quv.
Must use the same table as the one used in the
quantization process.
Dequantized DCT coefficients are then subject to
IDCT (Inverse DCT).
If FDCT and IDCT could determine the values with full precision, reconstruction would be lossless (assuming lossless quantization).
However, precision is restricted and hence the
reconstruction process is lossy.
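The dequantization and inverse transform steps can be sketched as below (same unoptimized style as the forward direction; the decoder must use the encoder's quantization table):

```python
import math

def dequantize(SQ, Q):
    """R[u][v] = SQ[u][v] * Q[u][v], using the same table Q as the encoder."""
    return [[SQ[u][v] * Q[u][v] for v in range(8)] for u in range(8)]

def idct_8x8(R):
    """JPEG inverse DCT: reconstruct the 8x8 level-shifted samples s[x][y]
    from the dequantized coefficients R[u][v]."""
    def c(k):
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    s = [[0.0] * 8 for _ in range(8)]
    for x in range(8):
        for y in range(8):
            acc = 0.0
            for u in range(8):
                for v in range(8):
                    acc += (c(u) * c(v) * R[u][v]
                            * math.cos((2 * x + 1) * u * math.pi / 16)
                            * math.cos((2 * y + 1) * v * math.pi / 16))
            s[x][y] = 0.25 * acc
    return s
```

The loss comes from the rounding in quantization, not from the transform pair itself: a block whose coefficients survived quantization exactly is reconstructed (up to floating-point precision).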
Expanded Lossy Mode
Progressive representation realized by expansion of
quantization.
Expansion is done by adding an output buffer to the quantizer that stores all coefficients of the quantized DCT.
Encoding process follows either:
Encoder processes DCT-coefficients of low frequencies
successively (low frequencies describe border outlines).
Hence, encoding the low-frequency coefficients first decodes the boundaries of the various objects successively.
Another approach: use all the DCT-coefficients in the
encoding process but single bits are differentiated according
to their significance (i.e., most significant bit first and then
the least significant bits are encoded).
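The second approach (successive approximation) can be illustrated by splitting coefficient values into bit planes and sending the most significant plane first. A simplified sketch, assuming non-negative n-bit magnitudes (real JPEG handles signs and point transforms, which are omitted here):

```python
def bit_planes(coeffs, n):
    """Split non-negative n-bit coefficient values into bit planes,
    most significant plane first."""
    return [[(c >> b) & 1 for c in coeffs] for b in range(n - 1, -1, -1)]

def refine(planes):
    """Progressively recombine planes: each additional plane received
    doubles the precision of the approximation."""
    value = [0] * len(planes[0])
    for plane in planes:
        value = [2 * v + bit for v, bit in zip(value, plane)]
    return value
```

Decoding only the first few planes already yields a coarse version of every coefficient, which is exactly the progressive effect described above.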
Progressive Spectral Selection
DCT coefficients are grouped into several spectral
bands.
Low-frequency DCT coefficients sent first, and then
higher-frequency coefficients. E.g.,
Band 1: DC coefficients only
Band 2: AC1 and AC2 coefficients
Band 3: AC3, AC4, AC5, and AC6 coefficients
Band 4: AC7 … AC63 coefficients
[Figure: spectral selection — bit planes (n-1 down to 0) of the zig-zag coefficients DC, AC1, AC2, ..., AC63, grouped into bands 1-4]
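The example grouping above can be sketched as a simple partition of one block's zig-zag-ordered coefficients (band boundaries taken from the example, not mandated by the standard):

```python
def spectral_bands(zigzag_coeffs):
    """Split one block's 64 zig-zag-ordered coefficients into the four
    example bands: DC; AC1-AC2; AC3-AC6; AC7-AC63."""
    cuts = [(0, 1), (1, 3), (3, 7), (7, 64)]
    return [zigzag_coeffs[a:b] for a, b in cuts]
```

Each band is then transmitted as its own scan, low-frequency bands first.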
Combined Progressive …
Combines both spectral & successive approximations
Scan 1: DC band 1; Scan 2: AC band 1;
Scan 3: AC band 2; Scan 4: AC band 3; Scan 5: AC band 4;
Scan 6: AC band 5; Scan 7: DC band 2; Scan 8: AC band 6
[Figure: combined progressive — the (bit plane) x (zig-zag coefficient) plane, DC through AC63, partitioned into the eight scans listed above]
Lossless Mode
Hierarchical JPEG
Progressive JPEG at multiple resolutions
[Figure: pyramid of the same image at increasing resolutions: Image Res 1, Image Res 2, ..., Res n]
Video Compression
Asymmetric Applications
Compression process is performed only once and at the
time of storage. E.g., on-Demand servers (such as Video-
on-Demand and News-on-Demand) and electronic
publishing (travel guides, shopping guides, and educational
materials).
Symmetric Applications
Equal use of compression and decompression process.
E.g., information generated through video cameras or by
editing pre-recorded material.
Video conferencing, video telephone applications involve
generation, compression, and decompression of information
generated through video cameras.
Desktop video publishing applications require edit
operations on pre-recorded material.
Desirable Features for Video
Compression
Random Access
Fast Forward / Rewind
Reverse Playback
Audio-Visual Synchronization
Robustness to Errors
Coding / Decoding Delay
Editability
Format Flexibility
Cost Tradeoffs
MPEG Standard
MPEG-Video: compression of video signals at about
1.5 Mbits per second.
MPEG-Audio: compression of digital audio signal at
the rates of 64, 128, and 192 kbps per channel.
MPEG-System deals with the issues relating to
audio-visual synchronization.
Also handles multiplexing of multiple compressed
audio and video bit streams.
MPEG Video
Primary aim of MPEG-Video is to compress a video
signal to a bit rate of about 1.5 Mbits/s with an
acceptable quality.
MPEG is often termed a generic standard, implying
that it is independent of a particular application.
Benefited from the following existing standards:
JPEG
H.261: standard was already available during MPEG
standardization. MPEG technique is more advanced than
H.261.
MPEG Video
Two nearly conflicting requirements.
Random access requirements for MPEG video are best
satisfied with pure intra-frame coding.
High compression rates are not possible unless a fair
amount of inter-frame coding is done.
Intra-frame coding: targets spatial redundancy
reduction.
Inter-frame coding: targets temporal redundancy
reduction.
Delicate balance between inter- and intra-frame
coding.
Temporal Redundancy Reduction
Temporal redundancies present in video when
subsequent frames carry similar but slightly varying
content.
E.g., video frames of a person walking in a street show a
gradual variation in the contents based on the walking speed
of the person.
The most widely used technique for achieving temporal redundancy reduction is motion compensation.
Motion information comprises the amplitude and the direction of displacement of the contents.
MPEG uses block-based motion compensation
technique.
Temporal Redundancy Reduction
Significant cost associated with motion information
coding.
Hence, 16 X 16 pixel blocks are chosen as motion compensation units (MCUs), called macro-blocks.
Two types of motion compensation are applied over
these macro-blocks.
Causal predictors (Predictive coding): i.e., generate the
contents of a subsequent frame based on the motion
information and the contents of the current one.
Non-causal predictors (interpolative coding): Frame coded
based on both a previous and a successive frame.
Interpolative Coding
Predictive Coding
MPEG Frame Sequence
Motion Estimation
[Figure: motion estimation — regions A, B, and C tracked across the previous, current, and future frames]
Motion Estimation …
MPEG does not specify the motion estimation
technique.
Block-matching techniques are likely to be used. Goal:
Estimate the motion of an n X m block in the present frame with respect to a previous or a future frame.
The block is compared with corresponding blocks within a search area G of size (m + 2p) X (n + 2p) in the previous/future frame.
Typical values: n = m = 16 (16 X 16 pixels) and parameter p = 6.
[Figure: an n X m block extended by p pixels on each side, forming the search area G]
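Since MPEG leaves the estimation technique open, one common choice is an exhaustive full search over the area G, minimizing the sum of absolute differences (SAD). A small sketch with toy n and p values for brevity (square n x n blocks; frames as nested lists):

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def full_search(cur, ref, bx, by, n=4, p=2):
    """Exhaustively match the n x n block at (bx, by) of the current
    frame against every candidate within +/- p pixels in the reference
    frame; return the best motion vector (dx, dy)."""
    target = [row[by:by + n] for row in cur[bx:bx + n]]
    best, best_mv = float('inf'), (0, 0)
    for dx in range(-p, p + 1):
        for dy in range(-p, p + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + n > len(ref) or y + n > len(ref[0]):
                continue  # candidate block falls outside the frame
            cand = [row[y:y + n] for row in ref[x:x + n]]
            cost = sad(target, cand)
            if cost < best:
                best, best_mv = cost, (dx, dy)
    return best_mv
```

The full search examines (2p + 1)^2 candidates per macro-block, which motivates the faster block-matching approaches discussed next.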
Block Matching Approaches
MPEG Layers
Sequence Layer: context unit
Group of Pictures Layer: random access unit
Picture Layer: primary coding unit
Slice Layer: resynchronization unit
Macro-block Layer: motion compensation unit
Block Layer: DCT unit