
Data compression

Saurabh, ECE/06/144, Electronics and Communication Department


Shri Balwant Institute of Technology, Sonepat
ece06144.sbit@gmail.com

Abstract— In computer science and information theory, data compression or source coding is the process of encoding information using fewer bits (or other information-bearing units) than an unencoded representation would use, through use of specific encoding schemes.

As with any communication, compressed data communication only works when both the sender and receiver of the information understand the encoding scheme. For example, this text makes sense only if the receiver understands that it is intended to be interpreted as characters representing the English language. Similarly, compressed data can only be understood if the decoding method is known by the receiver.

Compression is useful because it helps reduce the consumption of expensive resources, such as hard disk space or transmission bandwidth. On the downside, compressed data must be decompressed to be used, and this extra processing may be detrimental to some applications. For instance, a compression scheme for video may require expensive hardware for the video to be decompressed fast enough to be viewed while it is being decompressed (the option of decompressing the video in full before watching it may be inconvenient, and requires storage space for the decompressed video). The design of data compression schemes therefore involves trade-offs among various factors, including the degree of compression, the amount of distortion introduced (if using a lossy compression scheme), and the computational resources required to compress and uncompress the data.

Lossless versus lossy compression

Lossless compression algorithms usually exploit statistical redundancy in such a way as to represent the sender's data more concisely without error. Lossless compression is possible because most real-world data has statistical redundancy. For example, in English text, the letter 'e' is much more common than the letter 'z', and the probability that the letter 'q' will be followed by the letter 'z' is very small.

Another kind of compression, called lossy data compression or perceptual coding, is possible if some loss of fidelity is acceptable. Generally, a lossy data compression scheme is guided by research on how people perceive the data in question. For example, the human eye is more sensitive to subtle variations in luminance than it is to variations in color. JPEG image compression works in part by "rounding off" some of this less-important information. Lossy data compression provides a way to obtain the best fidelity for a given amount of compression. In some cases, transparent (unnoticeable) compression is desired; in other cases, fidelity is sacrificed to reduce the amount of data as much as possible.

Lossless compression schemes are reversible so that the original data can be reconstructed, while lossy schemes accept some loss of data in order to achieve higher compression.

However, lossless data compression algorithms will always fail to compress some files; indeed, any compression algorithm will necessarily fail to compress any data containing no discernible patterns. Attempts to compress data that has already been compressed will therefore usually result in an expansion, as will attempts to compress all but the most trivially encrypted data.

In practice, lossy data compression will also come to a point where compressing again does not work, although an extremely lossy algorithm, such as one that always removes the last byte of a file, will always compress a file up to the point where it is empty.

An example of lossless vs. lossy compression is the following string:

25.888888888

This string can be compressed as:

25.[9]8

Interpreted as "twenty five point 9 eights", the original string is perfectly recreated, just written in a smaller form. In a lossy system, using

26

instead, the exact original data is lost, with the benefit of a smaller file size.

Applications

The above is a very simple example of run-length encoding, wherein large runs of consecutive identical data values are replaced by a simple code with the data value and the length of the run. This is an example of lossless data compression. It is often used to optimize disk space on office computers, or to make better use of the connection bandwidth in a computer network. For symbolic data such as spreadsheets, text, executable programs, etc., losslessness is essential because changing even a single bit cannot be tolerated (except in some limited cases).
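The run-length idea behind the "25.[9]8" example can be sketched in a few lines. The code below is an illustrative sketch (not from this paper) that round-trips the digit run from the example, showing why the scheme is lossless:

```python
def rle_encode(s):
    # Collapse each run of identical characters into a [char, run_length] pair.
    runs = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1][1] += 1
        else:
            runs.append([ch, 1])
    return runs

def rle_decode(runs):
    # Expand each [char, run_length] pair back into the original string.
    return "".join(ch * n for ch, n in runs)

encoded = rle_encode("25.888888888")
# encoded is [["2", 1], ["5", 1], [".", 1], ["8", 9]] — i.e. "25." then nine 8s
assert rle_decode(encoded) == "25.888888888"  # lossless: perfect round trip
```

Note that on data without long runs, the [char, count] pairs take more space than the input, which is the expansion effect described above.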

For visual and audio data, some loss of quality can be tolerated without losing the essential nature of the data. By taking advantage of the limitations of the human sensory system, a great deal of space can be saved while producing an output which is nearly indistinguishable from the original. These lossy data compression methods typically offer a three-way tradeoff between compression speed, compressed data size and quality loss.

Lossy image compression is used in digital cameras to increase storage capacities with minimal degradation of picture quality. Similarly, DVDs use the lossy MPEG-2 video codec for video compression.

In lossy audio compression, methods of psychoacoustics are used to remove non-audible (or less audible) components of the signal. Compression of human speech is often performed with even more specialized techniques, so that "speech compression" or "voice coding" is sometimes distinguished as a separate discipline from "audio compression". Different audio and speech compression standards are listed under audio codecs. Voice compression is used in Internet telephony, for example, while audio compression is used for CD ripping and is decoded by audio players.
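As a toy illustration of the lossy tradeoff (a sketch for this discussion, not a method from any real audio codec), the snippet below quantizes integer samples to a coarser grid. Fewer distinct values remain, so the data is easier to compress, but the sub-step detail is gone for good:

```python
def quantize(samples, step):
    # Lossy step: snap each sample to the nearest multiple of `step`,
    # so far fewer distinct values need to be stored.
    return [round(s / step) for s in samples]

def dequantize(codes, step):
    # Reconstruction: scale the codes back; fine detail is not recoverable.
    return [c * step for c in codes]

original = [1003, -2047, 512, 7]
codes = quantize(original, step=64)
restored = dequantize(codes, step=64)
# restored == [1024, -2048, 512, 0]: close to the original, but not equal.
```

Quantizing again with the same step changes nothing further, which mirrors the point above that repeated lossy compression eventually stops gaining anything.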

Examples of lossless compression

The Lempel–Ziv (LZ) compression methods are among the most popular algorithms for lossless storage. DEFLATE is a variation on LZ which is optimized for decompression speed and compression ratio, although compression can therefore be slow. DEFLATE is used in PKZIP, gzip and PNG. LZW (Lempel–Ziv–Welch) is used in GIF images. Also noteworthy are the LZR (LZ–Renau) methods, which serve as the basis of the Zip method. LZ methods utilize a table-based compression model where table entries are substituted for repeated strings of data. For most LZ methods, this table is generated dynamically from earlier data in the input. The table itself is often Huffman encoded (e.g. SHRI, LZX). A current LZ-based coding scheme that performs well is LZX, used in Microsoft's CAB format.

The very best compressors use probabilistic models, in which predictions are coupled to an algorithm called arithmetic coding. Arithmetic coding, invented by Jorma Rissanen and turned into a practical method by Witten, Neal, and Cleary, achieves superior compression to the better-known Huffman algorithm, and lends itself especially well to adaptive data compression tasks where the predictions are strongly context-dependent. Arithmetic coding is used in the bilevel image-compression standard JBIG and the document-compression standard DjVu. The text entry system Dasher is an inverse arithmetic coder.

Theory

The theoretical background of compression is provided by information theory (which is closely related to algorithmic information theory) for lossless compression, and by rate–distortion theory for lossy compression. These fields of study were essentially created by Claude Shannon, who published fundamental papers on the topic in the late 1940s and early 1950s. Cryptography and coding theory are also closely related. The idea of data compression is deeply connected with statistical inference.

Many lossless data compression systems can be viewed in terms of a four-stage model. Lossy data compression systems typically include even more stages, including, for example, prediction, frequency transformation, and quantization.

Image compression

Image compression is the application of data compression to digital images. In effect, the objective is to reduce redundancy of the image data in order to be able to store or transmit data in an efficient form.

[Figure: A chart showing the relative quality of various JPEG settings, comparing saving a file as a JPEG normally versus using a "save for web" technique.]

Image compression can be lossy or lossless. Lossless compression is sometimes preferred for medical imaging, technical drawings, icons or comics. This is because lossy compression methods, especially when used at low bit rates, introduce compression artifacts. Lossless compression methods may also be preferred for high-value content, such as medical imagery or image scans made for archival purposes. Lossy methods are especially suitable for natural images such as photos in applications where minor (sometimes imperceptible) loss of fidelity is acceptable to achieve a substantial reduction in bit rate. Lossy compression that produces imperceptible differences can be called visually lossless.
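The table-based LZ model described under "Examples of lossless compression" can be illustrated with a minimal LZW coder (an educational sketch, not production code). The key property is that the encoder and decoder grow identical tables from the data itself, so no table needs to be transmitted:

```python
def lzw_encode(data):
    # Start with single-character entries; grow the table as strings repeat.
    table = {chr(i): i for i in range(256)}
    w, out = "", []
    for ch in data:
        if w + ch in table:
            w += ch                       # extend the current match
        else:
            out.append(table[w])          # emit code for the longest match
            table[w + ch] = len(table)    # new entry for the extended string
            w = ch
    if w:
        out.append(table[w])
    return out

def lzw_decode(codes):
    # Rebuild the same table from the code stream alone.
    table = {i: chr(i) for i in range(256)}
    w = table[codes[0]]
    out = [w]
    for c in codes[1:]:
        entry = table[c] if c in table else w + w[0]  # the one tricky case
        out.append(entry)
        table[len(table)] = w + entry[0]
        w = entry
    return "".join(out)

msg = "TOBEORNOTTOBEORTOBEORNOT"
assert lzw_decode(lzw_encode(msg)) == msg  # lossless round trip
```

On this classic input the encoder emits 16 codes for 24 characters; real implementations additionally pack the codes into a compact bitstream (and, as noted above, often Huffman-encode the output).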

Methods for lossless image compression are:

• Run-length encoding – used as the default method in PCX and as one of the possible methods in BMP, TGA and TIFF
• DPCM and predictive coding
• Entropy encoding
• Adaptive dictionary algorithms such as LZW – used in GIF and TIFF
• Deflation – used in PNG, MNG and TIFF
• Chain codes
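The DPCM/predictive-coding entry can be sketched as follows (an illustrative sketch, not taken from any specific format): each sample is predicted from its predecessor and only the residual is stored. On smooth image rows the residuals are small and repetitive, which entropy coders then compress well:

```python
def dpcm_encode(samples):
    # Store the first sample, then only the difference from the previous one.
    out, prev = [], 0
    for s in samples:
        out.append(s - prev)
        prev = s
    return out

def dpcm_decode(residuals):
    # Undo the prediction by accumulating the differences.
    out, prev = [], 0
    for r in residuals:
        prev += r
        out.append(prev)
    return out

row = [100, 101, 101, 103, 104]                # smooth data -> tiny residuals
assert dpcm_encode(row) == [100, 1, 0, 2, 1]
assert dpcm_decode(dpcm_encode(row)) == row    # lossless round trip
```

DPCM itself is lossless; the lossy variants discussed next arise when the residuals (or transform coefficients) are additionally quantized.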

Methods for lossy compression:

• Reducing the color space to the most common colors in the image. The selected colors are specified in the color palette in the header of the compressed image. Each pixel just references the index of a color in the color palette. This method can be combined with dithering to avoid posterization.
• Chroma subsampling. This takes advantage of the fact that the eye perceives spatial changes of brightness more sharply than those of color, by averaging or dropping some of the chrominance information in the image.
• Transform coding. This is the most commonly used method. A Fourier-related transform such as the DCT or the wavelet transform is applied, followed by quantization and entropy coding.
• Fractal compression.

The best image quality at a given bit rate (or compression rate) is the main goal of image compression. However, there are other important properties of image compression schemes:

Scalability generally refers to a quality reduction achieved by manipulation of the bitstream or file (without decompression and re-compression). Other names for scalability are progressive coding or embedded bitstreams. Despite its contrary nature, scalability can also be found in lossless codecs, usually in the form of coarse-to-fine pixel scans. Scalability is especially useful for previewing images while downloading them (e.g. in a web browser) or for providing variable-quality access to e.g. databases. There are several types of scalability:

• Quality progressive or layer progressive: the bitstream successively refines the reconstructed image.
• Resolution progressive: first encode a lower image resolution, then encode the difference to higher resolutions.
• Component progressive: first encode grey, then color.

Region of interest coding. Certain parts of the image are encoded with higher quality than others. This can be combined with scalability (encode these parts first, others later).

Meta information. Compressed data can contain information about the image which can be used to categorize, search or browse images. Such information can include color and texture statistics, small preview images, and author/copyright information.

Processing power. Compression algorithms require different amounts of processing power to encode and decode. Some high-compression algorithms require high processing power.

The quality of a compression method is often measured by the peak signal-to-noise ratio (PSNR), which measures the amount of noise introduced through a lossy compression of the image. However, the subjective judgement of the viewer is also regarded as an important, perhaps the most important, measure.

Video compression

Video compression refers to reducing the quantity of data used to represent digital video images, and is a combination of spatial image compression and temporal motion compensation. Video compression is an example of the concept of source coding in information theory. Compressed video can effectively reduce the bandwidth required to transmit video via terrestrial broadcast, cable TV, or satellite TV services. Most video compression is lossy: it operates on the premise that much of the data present before compression is not necessary for achieving good perceptual quality. For example, DVDs use a video coding standard called MPEG-2 that can compress around two hours of video data by 15 to 30 times, while still producing a picture quality that is generally considered high quality for standard-definition video. Video compression is a tradeoff between disk space, video quality, and the cost of hardware required to decompress the video in a reasonable time. However, if the video is overcompressed in a lossy manner, visible (and sometimes distracting) artifacts can appear.

Video compression typically operates on square-shaped groups of neighboring pixels, often called macroblocks. These pixel groups or blocks of pixels are compared from one frame to the next, and the video compression codec (encode/decode scheme) sends only the differences within those blocks. This works extremely well if the video has no motion. A still frame of text, for example, can be repeated with very little transmitted data. In areas of video with more motion, more pixels change from one frame to the next. When more pixels change, the video compression scheme must send more data to keep up with the larger number of pixels that are changing. If the video content includes an explosion, flames, a flock of thousands of birds, or any other image with a great deal of high-frequency detail, the quality will decrease, or the variable bitrate must be increased to render this added information with the same level of detail.
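The block-differencing idea above can be sketched as follows (an illustrative sketch with made-up frame data, not the behavior of any actual codec): only the blocks that changed since the previous frame are emitted, so a static scene costs almost nothing:

```python
def changed_blocks(prev_frame, cur_frame, block=2):
    # Compare fixed-size blocks of two equal-sized frames and return only
    # the (row, col, pixels) triples that differ - the "differences" a codec
    # would transmit instead of the whole frame.
    h, w = len(cur_frame), len(cur_frame[0])
    updates = []
    for r in range(0, h, block):
        for c in range(0, w, block):
            cur = [row[c:c + block] for row in cur_frame[r:r + block]]
            old = [row[c:c + block] for row in prev_frame[r:r + block]]
            if cur != old:
                updates.append((r, c, cur))
    return updates

frame1 = [[0, 0, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
frame2 = [[0, 0, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 9, 9],
          [0, 0, 9, 9]]
# Only the single 2x2 block that changed needs to be sent.
assert changed_blocks(frame1, frame2) == [(2, 2, [[9, 9], [9, 9]])]
```

A scene full of motion makes nearly every block differ, which is exactly why high-frequency content forces the bitrate up, as described above.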

The programming provider has control over the amount of video compression applied to their video programming before it is sent to their distribution system. DVDs, Blu-ray discs, and HD DVDs have video compression applied during their mastering process, though Blu-ray and HD DVD have enough disc capacity that most compression applied in these formats is light compared to, for example, most video streamed on the Internet or captured on a cellphone. Software used for storing video on hard drives or various optical disc formats will often produce lower image quality, although not in all cases. High-bitrate video codecs with little or no compression exist for video post-production work, but they create very large files and are thus almost never used for the distribution of finished videos. Once excessive lossy video compression compromises image quality, it is impossible to restore the image to its original quality.

ACKNOWLEDGMENT

I would like to acknowledge the contribution of our lecturers for their help in creating this paper. The college website was consulted for the IEEE paper-presentation format.

REFERENCES

[1] Korn, D.G.; Vo, K.P. (1995), "Vdelta: Differencing and Compression", in B. Krishnamurthy (ed.), Practical Reusable Unix Software, John Wiley & Sons.
[2] www.Wikipedia.org
[3] www.sbit.in
[4] www.data-compression.com/
