
An Introduction to Image Compression

Compressing an image is significantly different from compressing raw binary data. General-purpose compression programs can be used to compress images, but the result is less than optimal, because images have statistical properties that can be exploited by encoders designed specifically for them. Also, some of the finer details in the image can be sacrificed to save a little more bandwidth or storage space, which means lossy compression techniques can be used in this area. Image compression means minimizing the size in bytes of a graphics file without degrading the quality of the image to an unacceptable level. The reduction in file size allows more images to be stored in a given amount of disk or memory space, and also reduces the time required for images to be sent over the Internet or downloaded from Web pages.

There are several different ways in which image files can be compressed. For Internet use, the two most common compressed graphic image formats are JPEG and GIF. The JPEG method is more often used for photographs, while the GIF method is commonly used for line art and other images with relatively simple geometric shapes. Other techniques for image compression include the use of fractals and wavelets. These methods have not gained widespread acceptance for use on the Internet as of this writing, but both offer promise because they achieve higher compression ratios than JPEG or GIF for some types of images. Another newer method that may in time replace the GIF format is the PNG format.

Lossless compression means compressing data which, when decompressed, is an exact replica of the original. This is required for binary data such as executables and documents, which must be reproduced exactly when decompressed. Images (and music), on the other hand, need not be reproduced exactly: an approximation of the original image is enough for most purposes, as long as the error between the original and the compressed image is tolerable.

A text file or program can be compressed without the introduction of errors, but only up to a certain extent; this is called lossless compression. Beyond this point, errors are introduced. In text and program files, it is crucial that compression be lossless, because a single error can seriously damage the meaning of a text file or cause a program not to run. In image compression, by contrast, a small loss in quality is usually not noticeable, and there is no "critical point" up to which compression works perfectly but beyond which it becomes impossible. When there is some tolerance for loss, the achievable compression factor is greater than when there is none. For this reason, graphic images can be compressed more than text files or programs.

Theory
The theoretical background of compression is provided by information theory (which is closely related to algorithmic information theory) for lossless compression, and by rate-distortion theory for lossy compression. These fields of study were essentially created by Claude Shannon, who published fundamental papers on the topic in the late 1940s and early 1950s. Cryptography and coding theory are also closely related. The idea of data compression is deeply connected with statistical inference. Many lossless data compression systems can be viewed in terms of a four-stage model. Lossy data compression systems typically include even more stages, including, for example, prediction, frequency transformation, and quantization.

Classifying image data


An image is represented as a two-dimensional array of coefficients, each coefficient representing the brightness level at that point. From a high-level view, we cannot distinguish more important coefficients from less important ones, but thinking more intuitively, we can. Most natural images have smooth colour variations, with the fine details represented as sharp edges in between the smooth variations. Technically, the smooth variations in colour are low-frequency variations and the sharp variations are high-frequency variations. The low-frequency components (smooth variations) constitute the base of an image, and the high-frequency components (the edges which give the detail) add upon them to refine the image, giving a detailed result. Hence, the smooth variations demand more importance than the details. Separating the smooth variations from the details of the image can be done in many ways; one is decomposition of the image using a Discrete Wavelet Transform (DWT).

The DWT of an image


The procedure goes like this. A low-pass filter and a high-pass filter are chosen such that they exactly halve the frequency range between themselves; this filter pair is called the Analysis Filter pair. First, the low-pass filter is applied to each row of data, yielding the low-frequency components of the row. Since the low-pass filter is a half-band filter, its output contains frequencies only in the first half of the original range, so by Shannon's sampling theorem it can be subsampled by two, leaving half the original number of samples. Then the high-pass filter is applied to the same row of data, and the high-pass components, subsampled likewise, are placed beside the low-pass components. This procedure is repeated for all rows. Next, the same filtering is done on each column of the intermediate data. The resulting two-dimensional array of coefficients contains four bands of data, labelled LL (low-low), HL (high-low), LH (low-high) and HH (high-high). The LL band can be decomposed once again in the same manner, producing even more subbands. This can be done up to any level, resulting in a pyramidal decomposition as shown below; a code sketch of one decomposition level follows the figure.

Fig 1. Pyramidal decomposition of an image.

As mentioned above, the LL band at the highest level can be classified as most important, and the other 'detail' bands can be classified as of lesser importance, with the degree of importance decreasing from the top of the pyramid to the bands at the bottom.
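To make the procedure concrete, here is a minimal one-level sketch in Python with NumPy, using the Haar pair as the analysis filters (the simplest half-band pair) and assuming a greyscale image with even width and height; production codecs such as JPEG 2000 use longer biorthogonal filters:

```python
import numpy as np

def haar_dwt2d(image):
    """One decomposition level of a 2D DWT using the Haar filter pair.

    Rows are filtered and subsampled first, then columns, giving the four
    subbands described above. Assumes even width and height.
    """
    img = image.astype(float)

    # Row pass: half-band low-pass (average) and high-pass (difference),
    # each subsampled by two as permitted by the sampling theorem.
    lo = (img[:, 0::2] + img[:, 1::2]) / 2.0
    hi = (img[:, 0::2] - img[:, 1::2]) / 2.0

    # Column pass on both halves, producing the four subbands.
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0  # low-low: the smooth base
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0  # the three detail subbands
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return ll, hl, lh, hh

# Pyramidal decomposition: recurse on the LL band for further levels.
image = np.random.randint(0, 256, (256, 256))
ll, hl, lh, hh = haar_dwt2d(image)
print(ll.shape)  # (128, 128): each subband has half the resolution
```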

Fig 2. The three-level decomposition of the 'Lena' image.

The Inverse DWT of an image


Just as the forward transform is used to separate the image data into various classes of importance, a reverse transform is used to reassemble those classes into a reconstructed image. A pair of high-pass and low-pass filters is used here as well; this filter pair is called the Synthesis Filter pair. The filtering procedure is just the opposite: we start from the topmost level, apply the filters columnwise first and then rowwise, and proceed to the next level, until we reach the first level.
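Continuing the Haar sketch above, the synthesis step interleaves the sums and differences back together, columns first and then rows. Because the Haar analysis/synthesis pair is exactly invertible, the reconstruction is bit-exact; the loss in a real codec comes from quantizing the coefficients, not from the transform:

```python
def haar_idwt2d(ll, hl, lh, hh):
    """Invert one level of the Haar DWT sketched above: undo the column
    pass first, then the row pass (the opposite order of the analysis)."""
    rows, cols = ll.shape
    lo = np.empty((rows * 2, cols))
    hi = np.empty((rows * 2, cols))
    # From an average a and a difference d, the original pair is (a+d, a-d).
    lo[0::2, :], lo[1::2, :] = ll + lh, ll - lh
    hi[0::2, :], hi[1::2, :] = hl + hh, hl - hh
    img = np.empty((rows * 2, cols * 2))
    img[:, 0::2], img[:, 1::2] = lo + hi, lo - hi
    return img

restored = haar_idwt2d(ll, hl, lh, hh)
print(np.allclose(restored, image))  # True: analysis/synthesis is exact
```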

The following is a demonstration of JPEG/JPEG 2000 compression of a colour image (figures omitted): the original image at 24 bpp, followed by JPEG and JPEG 2000 reconstructions at 2 bpp, 1 bpp and 0.5 bpp each.

Different File Types


For the purposes of this article, we're only going to focus on the three file types most commonly found in web design: PNG, JPEG, and GIF. While there are other image formats out there that take advantage of compression (TIFF, PCX, TGA, etc.), you're unlikely to run across them in most digital design work.

GIF

GIF stands for Graphics Interchange Format; it is a bitmap image format introduced in 1987 by CompuServe. It supports up to 8 bits per pixel, meaning that an image can have up to 256 distinct RGB colors. One of the biggest advantages of the GIF format is that it allows for animated images, something neither of the other formats mentioned here allows.

JPEG

JPEG (Joint Photographic Experts Group) is an image format that uses lossy compression to create smaller file sizes. One of JPEG's big advantages is that it allows the designer to fine-tune the amount of compression used, which, when done correctly, gives better image quality at the smallest reasonable file size. Because JPEG uses lossy compression, images saved in this format are prone to artifacting, where you can see pixelization and strange halos around certain sections of an image. These are most common in areas of an image where there's a sharp contrast between colors. Generally, the more contrast in an image, the higher the quality at which it needs to be saved to yield a decent-looking final result.

PNG

PNG (Portable Network Graphics) is another bitmapped image format, one that uses lossless data compression and was created to replace the GIF format. The PNG format went largely unsupported by Internet Explorer for a long time, making it less commonly used than GIF and JPEG, though it is now supported properly by every major browser. PNG supports palette-based images (with palettes of 24-bit RGB or 32-bit RGBA colors), greyscale images, and full-colour RGB and RGBA images. One of PNG's biggest advantages is that it supports a number of transparency options, including alpha-channel transparency.

Choosing a File Format


Each of the file formats described above is appropriate for different types of images. Choosing the proper format results in higher-quality images and smaller file sizes; choosing the wrong format means your images won't be as high-quality as they could be and their file sizes will likely be larger than necessary.

For simple graphics like logos or line drawings, the GIF format often works best. Because of GIF's limited color palette, graphics with gradients or subtle color shifts often end up posterized; while this can be overcome to some extent by using dithering, it's often better to use a different file format.

For photos or images with gradients where GIF is inappropriate, JPEG may be best suited. JPEG works great for photos with subtle shifts in color and without any sharp contrasts. In areas with a sharp contrast, it's more likely there will be artifacts (a multi-colored halo around the area). Adjusting the compression level of your JPEGs before saving them can often result in a much higher-quality image while maintaining a smaller file size.

For images with high contrast, especially photos, or illustrations with lots of gradients or contrast, the PNG format is often best. It's also the best option for transparent images, especially those that need partial transparency. PNG files are often larger than JPEGs, though it depends on the exact image. PNG files are also lossless, meaning all the original quality of the image remains intact.

Here's an overview of which file types work best for each type of image:

GIF

If animation is required. Line drawings and simple graphics.

JPEG

Photos, especially without high contrast. Screenshots, especially of movies, games, or similar content.

PNG

Line art, illustrations. Photos with high contrast. Transparency, especially alpha channel transparency. Application screenshots or other detailed diagrams.

And here's an overview of which formats to avoid for each type of image:

GIF

Images with gradients. Photos.

JPEG

Images with high contrast. Detailed images, especially diagrams. Simple graphics (the file sizes are larger).

PNG

Photos with low contrast (file sizes are larger).

The Outline
We'll take a close look at compressing greyscale images. The algorithms explained can easily be extended to colour images, either by processing each of the colour planes separately, or by transforming the image from the RGB representation to other convenient representations like YUV, in which the processing is much easier. The usual steps involved in compressing an image are:

1. Specifying the rate (bits available) and distortion (tolerable error) parameters for the target image.
2. Dividing the image data into various classes, based on their importance.
3. Dividing the available bit budget among these classes, such that the distortion is a minimum.
4. Quantizing each class separately, using the bit allocation information derived in step 3 (a minimal quantizer sketch follows below).
5. Encoding each class separately, using an entropy coder, and writing to the file.

Remember, this is how 'most' image compression techniques work, but there are exceptions. One example is fractal image compression, where possible self-similarity within the image is identified and used to reduce the amount of data required to reproduce the image. Traditionally these methods have been time-consuming, but some recent methods promise to speed up the process. Literature regarding fractal image compression can be found at <findout>.

Reconstructing the image from the compressed data is usually a faster process than compression. The steps involved are:

1. Reading in the quantized data from the file, using an entropy decoder (reverse of step 5).
2. Dequantizing the data (reverse of step 4).
3. Rebuilding the image (reverse of step 2).
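To make the quantization step (and its reverse during reconstruction) concrete, here is a minimal uniform scalar quantizer in Python with NumPy; the step size shown is only an illustrative assumption, since a real codec derives a separate step per class from the bit allocation:

```python
import numpy as np

def quantize(coeffs, step):
    """Step 4: map each coefficient to the nearest multiple of the step
    size. Larger steps mean fewer bits and more distortion, so important
    classes get a small step and detail classes a large one."""
    return np.round(coeffs / step).astype(int)

def dequantize(indices, step):
    """Step 2 of reconstruction: the rounding error is the information lost."""
    return indices * step

band = np.array([0.4, 3.7, -2.1, 8.9])
print(dequantize(quantize(band, step=2.0), step=2.0))  # [ 0.  4. -2.  8.]
```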

Error Metrics
Two of the error metrics used to compare the various image compression techniques are the Mean Square Error (MSE) and the Peak Signal to Noise Ratio (PSNR). The MSE is the cumulative squared error between the compressed and the original image, whereas PSNR is a measure of the peak error. The mathematical formulae for the two are

MSE = (1/MN) Σ_{x,y} [I(x,y) − I'(x,y)]^2

PSNR = 20 * log10 (255 / sqrt(MSE))

where I(x,y) is the original image, I'(x,y) is the approximated version (which is actually the decompressed image), and M, N are the dimensions of the images. A lower value of MSE means less error, and, as seen from the inverse relation between MSE and PSNR, this translates to a high value of PSNR. Logically, a higher value of PSNR is good because it means that the ratio of signal to noise is higher. Here, the 'signal' is the original image, and the 'noise' is the error in reconstruction. So, a compression scheme with a lower MSE (and a correspondingly higher PSNR) can be recognised as a better one.
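The two formulae transcribe directly into code; a minimal sketch, assuming 8-bit greyscale images held as NumPy arrays of equal size:

```python
import numpy as np

def mse(original, decompressed):
    """Cumulative squared error averaged over all M*N pixels."""
    diff = original.astype(float) - decompressed.astype(float)
    return np.mean(diff ** 2)

def psnr(original, decompressed, peak=255.0):
    """Peak signal-to-noise ratio in decibels (peak = 255 for 8-bit images).
    Note that identical images give MSE = 0 and hence infinite PSNR."""
    return 20 * np.log10(peak / np.sqrt(mse(original, decompressed)))
```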

RATE DISTORTION CURVE (figure omitted)

COMPRESSION ALGORITHMS

Proper use of image compression can make a huge difference in the appearance and size of your website image files. But compression is an often-misunderstood topic, partly because there's a real lack of understanding of what the different types of compression are good for. If you don't understand which type of compression to use for different types of images, you'll likely end up with one of two results: either images that don't look as good as they could, or image file sizes that are way larger than they need to be. Below is everything you need to know about image compression in relation to web design. We've covered the differences between lossless and lossy compression, the different file types and the compression techniques they use, and guidelines for which file formats work best for different kinds of images.

Lossless vs. Lossy Compression


Many people feel that they should only use image formats that use lossless compression. While lossless compression is superior for many kinds of images, it's not necessary for many others. Basically, lossless image compression means all the data from the original file is preserved. Lossy compression, on the other hand, removes some data from the original file and saves the image with a reduced file size. It's up to you, as the designer, to tell it how much data to disregard by setting the image compression rate.

Lossless Compression

There are a few different methods for lossless compression. There's run-length encoding (used for BMP files), which takes runs of data (consecutive data elements with identical values) and stores them as a single data value and count. It's best suited to simple graphics files, where there are long runs of identical data elements.
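As an illustration, here is a minimal run-length codec over a plain list of pixel values; actual BMP run-length encoding uses a specific byte-level format, so this is a sketch of the idea only:

```python
def rle_encode(values):
    """Collapse runs of identical values into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([v, 1])       # start a new run
    return [tuple(r) for r in runs]

def rle_decode(runs):
    return [v for v, n in runs for _ in range(n)]

row = [255, 255, 255, 255, 0, 0, 255]
print(rle_encode(row))                     # [(255, 4), (0, 2), (255, 1)]
print(rle_decode(rle_encode(row)) == row)  # True: RLE is lossless
```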

DEFLATE is another lossless data compression method, used for PNG images as well as in ZIP and gzip compression; it combines the LZ77 algorithm with Huffman coding, and is optimized for decompression speed and compression ratio, so compression itself can be slow. Lempel-Ziv-Welch (LZW) compression is a lossless algorithm that performs a limited analysis of the data; it is used in GIF and some TIFF file formats. The lossless Lempel-Ziv (LZ) methods are among the most popular algorithms for lossless storage; also noteworthy are the LZR (LZ-Renau) methods, which serve as the basis of the Zip method. LZ methods use a table-based compression model in which table entries are substituted for repeated strings of data. For most LZ methods, this table is generated dynamically from earlier data in the input; the table itself is often Huffman encoded (e.g. SHRI, LZX). A current LZ-based coding scheme that performs well is LZX, used in Microsoft's CAB format. The very best modern lossless compressors use probabilistic models, such as prediction by partial matching. The Burrows-Wheeler transform can also be viewed as an indirect form of statistical modelling.
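The table-based model is easiest to see in a toy LZW encoder. The sketch below assumes byte-string input and an unbounded code table; real implementations cap the code width and reset the table when it fills:

```python
def lzw_encode(data):
    """Minimal LZW sketch: the string table is built dynamically from
    earlier input, so the decoder can rebuild the same table on its own."""
    table = {bytes([i]): i for i in range(256)}  # initial single-byte entries
    next_code = 256
    current = b""
    out = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate           # extend the current match
        else:
            out.append(table[current])    # emit code for longest known string
            table[candidate] = next_code  # add the new string to the table
            next_code += 1
            current = bytes([byte])
    if current:
        out.append(table[current])
    return out

print(lzw_encode(b"TOBEORNOTTOBEORTOBEORNOT"))
```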

Lossy Compression

There are a number of lossy compression methods, some of which can be combined with lossless methods to create even smaller file sizes. One method is reducing the image's color space to the most common colors within the image. This is often used in GIF and sometimes in PNG images to produce smaller file sizes. When used on the right types of images and combined with dithering, it can result in images nearly identical to the originals.
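Here is a sketch of that palette-reduction idea, assuming an RGB image held as a NumPy array; picking the most common colors and mapping to the nearest palette entry are deliberate simplifications (GIF encoders typically use smarter palette selection such as median-cut, plus dithering):

```python
from collections import Counter
import numpy as np

def palette_reduce(img, n_colors=256):
    """Reduce an RGB image to its n_colors most common colors, mapping
    every pixel to the nearest surviving palette entry (no dithering)."""
    flat = img.reshape(-1, 3)
    counts = Counter(map(tuple, flat))
    palette = np.array([c for c, _ in counts.most_common(n_colors)], dtype=float)

    # Nearest palette color by squared Euclidean distance, one palette
    # entry at a time to keep memory use modest.
    pixels = flat.astype(float)
    best_dist = np.full(len(flat), np.inf)
    best_idx = np.zeros(len(flat), dtype=int)
    for i, color in enumerate(palette):
        d = ((pixels - color) ** 2).sum(axis=1)
        closer = d < best_dist
        best_dist[closer] = d[closer]
        best_idx[closer] = i
    return palette[best_idx].reshape(img.shape).astype(np.uint8)
```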

Transform encoding is the type of encoding used for JPEG images. In images, transform coding averages out the color in small blocks of the image, using a discrete cosine transform (DCT), to create an image that has far fewer colors than the original. Chroma subsampling is another type of lossy compression. It takes into account that the human eye perceives changes in brightness more sharply than changes in color, and takes advantage of this by dropping or averaging some chroma (color) information while maintaining luma (brightness) information. It is commonly used in video encoding schemes and in JPEG images. Lossy image compression is also used in digital cameras, to increase storage capacity with minimal degradation of picture quality. Similarly, DVDs use the lossy MPEG-2 video codec for video compression.
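And a minimal 4:2:0 chroma-subsampling sketch, assuming the image has already been converted to the Y'CbCr color space and has even width and height:

```python
import numpy as np

def subsample_420(ycbcr):
    """4:2:0 chroma subsampling: keep luma (Y) at full resolution,
    average each 2x2 block of the Cb and Cr channels."""
    y = ycbcr[..., 0]

    def pool(c):  # mean over each 2x2 block
        return (c[0::2, 0::2] + c[0::2, 1::2] +
                c[1::2, 0::2] + c[1::2, 1::2]) / 4.0

    cb = pool(ycbcr[..., 1].astype(float))
    cr = pool(ycbcr[..., 2].astype(float))
    return y, cb, cr  # each chroma plane now holds a quarter of the samples
```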

In lossy audio compression, methods of psychoacoustics are used to remove non-audible (or less audible) components of the signal. Compression of human speech is often performed with even more specialized techniques, so that "speech compression" or "voice coding" is sometimes distinguished as a separate discipline from "audio compression". Different audio and speech compression standards are listed under audio codecs. Voice compression is used in Internet telephony, for example, while audio compression is used for CD ripping and is decoded by audio players.

Lossless versus lossy compression


Lossless compression algorithms usually exploit statistical redundancy in such a way as to represent the sender's data more concisely without error. Lossless compression is possible because most real-world data has statistical redundancy. For example, in English text, the letter 'e' is much more common than the letter 'z', and the probability that the letter 'q' will be followed by the letter 'z' is very small. Another kind of compression, called lossy data compression or perceptual coding, is possible if some loss of fidelity is acceptable. Generally, lossy data compression is guided by research on how people perceive the data in question. For example, the human eye is more sensitive to subtle variations in luminance than it is to variations in color. JPEG image compression works in part by "rounding off" some of this less-important information. Lossy data compression provides a way to obtain the best fidelity for a given amount of compression. In some cases, transparent (unnoticeable) compression is desired; in other cases, fidelity is sacrificed to reduce the amount of data as much as possible. Lossless compression schemes are reversible, so that the original data can be reconstructed, while lossy schemes accept some loss of data in order to achieve higher compression. However, lossless data compression algorithms will always fail to compress some files; indeed, any compression algorithm will necessarily fail to compress any data containing no discernible patterns. Attempts to compress data that has been compressed already will therefore usually result in an expansion, as will attempts to compress all but the most trivially encrypted data. In practice, lossy data compression will also come to a point where compressing again does not work, although an extremely lossy algorithm (for example, one that always removes the last byte of a file) will keep compressing a file until it is empty. An example of lossless vs. lossy compression is the following string:
25.888888888

This string can be compressed as:


25.[9]8

Interpreted as, "twenty five point 9 eights", the original string is perfectly recreated, just written in a smaller form. In a lossy system, using
26

instead, the exact original data is lost, at the benefit of a shorter representation.

Example algorithms and applications


The above is a very simple example of run-length encoding, wherein large runs of consecutive identical data values are replaced by a simple code with the data value and the length of the run. This is an example of lossless data compression. It is often used to optimize disk space on office computers, or to make better use of connection bandwidth in a computer network. For symbolic data such as spreadsheets, text, executable programs, etc., losslessness is essential, because changing even a single bit cannot be tolerated (except in some limited cases).

For visual and audio data, some loss of quality can be tolerated without losing the essential nature of the data. By taking advantage of the limitations of the human sensory system, a great deal of space can be saved while producing an output which is nearly indistinguishable from the original. These lossy data compression methods typically offer a three-way tradeoff between compression speed, compressed data size and quality loss.

In a further refinement of these techniques, statistical predictions can be coupled to an algorithm called arithmetic coding. Arithmetic coding, invented by Jorma Rissanen and turned into a practical method by Witten, Neal, and Cleary, achieves superior compression to the better-known Huffman algorithm, and lends itself especially well to adaptive data compression tasks where the predictions are strongly context-dependent. Arithmetic coding is used in the bilevel image-compression standard JBIG and the document-compression standard DjVu. The text entry system Dasher is an inverse arithmetic coder.

Image compression is a technique for storing a visual image that reduces the amount of digitized information needed to store the image electronically. When you save a document in either Microsoft Document Imaging Format (MDI) or Tagged Image File Format (TIFF), image compression is used to reduce the size of the file. In general, images saved in the MDI file format take up less disk space than the same images saved as TIFF files, while the image quality of an image saved in MDI is comparable to that of the same image saved in TIFF.

What happens when you change compression options?


When you select a different compression option, the pages in a document are compressed using the new option under the following conditions:

- you save the document with a different file format or file extension (all of the pages in the document are saved with the new compression option)
- you make annotations a permanent part of the document (any page with an annotation is saved with the new compression option)
- you make changes to a page (any page that you make changes to is saved with the new compression option)

So, when you save a file with a different compression option (without changing the file format), only pages that you have made changes to are saved with the new compression option. To compress all the pages with a new compression option, you must make changes to each page. For example, rotate all the pages in a document 90° right, then 90° left, to make sure that each page is marked as changed, but without actually making any visible changes to your document.

Technical details

The following table shows the type of image compression used for each of the image types you can create in Office Document Imaging.

IMAGE TYPE          COMPRESSION FORMAT USED
Monochrome TIFF     TIFF CCITT Group 4 FAX
Grayscale TIFF      TIFF 6.0 JPEG
Color TIFF          TIFF 6.0 JPEG
Monochrome MDI      MODI BW
Grayscale MDI       MODI Color
Color MDI           MODI Color

Example implementations

- DEFLATE (a combination of LZ77 and Huffman coding), used by ZIP, gzip and PNG files
- LZMA, used by 7-Zip
- LZO (very fast LZ variation, speed oriented)
- LZX (an LZ77-family compression algorithm)
- liblzg (a minimal LZ77-based compression library)
- LZW, used by the Unix compress utility (the .Z file format) and GIF
- Huffman coding, used by the Unix pack utility (the .z file format)
- bzip2 (a combination of the Burrows-Wheeler transform and Huffman coding)
- PAQ (very high compression based on context mixing, but extremely slow; competing at the top of the highest-compression competitions)
- JPEG (image compression using optional chroma downsampling, discrete cosine transform, quantization, then Huffman coding)
- MPEG (audio and video compression standards family in wide use, using DCT and motion-compensated prediction for video)
  - MP3 (a part of the MPEG-1 standard for sound and music compression, using subbanding and MDCT, perceptual modeling, quantization, and Huffman coding)
  - AAC (part of the MPEG-2 and MPEG-4 audio coding specifications, using MDCT, perceptual modeling, quantization, and Huffman coding)
- Vorbis (DCT-based AAC-alike audio codec, designed with a focus on avoiding patent encumbrance)
- JPEG 2000 (image compression using wavelets, then quantization, then entropy coding)
- TTA (uses linear predictive coding for lossless audio compression)
- FLAC (linear predictive coding for lossless audio compression)

Image Compression in Print Design


While the bulk of this article has focused on image compression in web design, it's worth mentioning the effect compression can have in print design. For the most part, lossy image compression should be completely avoided in print design. Print graphics are much less forgiving of artifacting and low image quality than on-screen graphics. Where a JPEG saved at medium quality might look just fine on your monitor, when printed out, even on an inkjet printer, the loss in quality is noticeable (as is the artifacting). For print design, file types with lossless compression are preferable. TIFF (Tagged Image File Format) is often the preferred format when compression is necessary, as it offers a number of lossless compression methods (including the LZW method mentioned above). Then again, depending on the image and where it will be printed, it's often better to use a file type with no compression at all (such as the original application file). Talk to your printer about which they prefer.

CONCLUSION
In computer science and information theory, data compression, source coding or bit-rate reduction is the process of encoding information using fewer bits than the original representation would use. Compression is useful because it helps reduce the consumption of expensive resources, such as hard disk space or transmission bandwidth. On the downside, compressed data must be decompressed to be used, and this extra processing may be detrimental to some applications. For instance, a compression scheme for video may require expensive hardware for the video to be decompressed fast enough to be viewed as it is being decompressed (the option of decompressing the video in full before watching it may be inconvenient, and requires storage space for the decompressed video). The design of data compression schemes therefore involves trade-offs among various factors, including the degree of compression, the amount of distortion introduced (if using a lossy compression scheme), and the computational resources required to compress and uncompress the data.

