
Chapter 1

Introduction

1.1

Introducing digital media

In this Unit we will consider two technologies which support digital media. One is the digital disk, used in CD audio, DVD and for computer games. The other is the World Wide Web. Each is now freely available and each has developed greatly from relatively simple beginnings. Both make use of digital technology to store the text, images, movies and sounds that constitute the content of (multi-)media. Each provides a range of mechanisms for interacting with the presentation, ranging from the passive CD audio disk to the high-speed interaction of many computer games.

The technologies themselves shape what we may do with them. For example, the unpredictability of web delivery time makes fast, CD-game-style interaction difficult. On the other hand, the web provides a vast resource which cannot fit on a disk. Disks have guaranteed content; web content may disappear at a moment's notice. Disks are generally published in a formal sense, so the content has some degree of reliability. In contrast, anyone can put anything on the web. Other media are also available, including cameras, iPods, mobile phones and so on, but these are for the most part derivative in how they handle content, so we will concentrate on the web and the disk.

Both revolutionised media representation and portability, for very different reasons. In practice we use these two media in different ways, and they are also marketed very differently. Disks are sold, a conventional marketing model; making money through the web requires more ingenuity. Both approaches rely on digitally encoding their content, and both benefit from having a compact representation of the data. Let's sketch some of the issues concerning compact digital data before moving on to the main technologies themselves. There are three processes involved: encoding, transmission and decoding.


1.2

Digital text data

Alphabetic text has long been the simplest data to represent digitally. In effect we invent a digital code to represent each letter of the alphabet; then we can store or send these codes. Morse code is a simple discrete coding. It associates a unique pattern of dashes and dots with each letter and digit. The sender knows the Morse alphabet by heart, so the encoding process is that person reading the clear (that is, unencoded) text message and tapping the Morse key. The transmission is a simple electrical representation: short pulses for dots, longer ones for dashes. Decoding is done by someone at the receiver hearing the sounds made by these pulses and writing down the clear text message. More sophisticated systems encode, transmit and decode without human intervention. With the arrival of the digital computer, a simple binary alphabet arose: ASCII. This associates a 7-bit pattern of 0s and 1s with each character. The alphabet supports upper- and lower-case letters, digits and some control characters to help layout and transmission. More recently there has been work to support extended alphabets, for example those with accented letters or many more symbols, but the principle remains the same: there is an agreed digital code for each character.
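As a concrete sketch (in Python, purely for illustration), here is the agreed-code idea in miniature: each character maps to its 7-bit ASCII pattern, and the same table maps the patterns back to characters.

import string  # not required; ASCII codes come from ord() and chr()

def encode_ascii(text):
    """Return a list of 7-bit binary strings, one per character."""
    return [format(ord(ch), "07b") for ch in text]

def decode_ascii(codes):
    """Rebuild the original text from 7-bit binary strings."""
    return "".join(chr(int(code, 2)) for code in codes)

codes = encode_ascii("Hi!")
print(codes)                # ['1001000', '1101001', '0100001']
print(decode_ascii(codes))  # Hi!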

1.3

Digitizing analogue data

Of course multimedia needs to encode sound and pictures; and pictures might be single images or a movie. So we need to find digital ways of recording these essentially analogue elements. We also want to do this efficiently, so we would like to compress the data. Compression is usually the single most important aspect of the encoding and will be a recurring theme in this Unit.

1.3.1

Sampling and quantization

Let's take sound as an example; the same principles apply to pictures. Sampling is the basis of most digital schemes. The input is an analogue waveform, such as a sound signal. The output is a stream of digital values. These values represent the heights of the waveform sampled at regular intervals in time. There are thus two parameters which determine the fidelity of the digital data, namely the accuracy with which we capture the height of the signal (the number of bits in each digital sample) and the number of samples per second (the sampling frequency). Let's look at each in turn.
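A minimal sketch (Python; the 1 kHz tone and 8 kHz sampling rate are arbitrary choices for illustration) of what sampling produces: a stream of waveform heights taken at regular time intervals.

import math

TONE_HZ = 1000       # frequency of the analogue signal (illustrative)
SAMPLE_RATE = 8000   # samples per second (illustrative)
N_SAMPLES = 8        # just the first few samples

samples = [
    math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE)
    for n in range(N_SAMPLES)
]
# Each value is the height of the waveform at time n / SAMPLE_RATE.
for n, s in enumerate(samples):
    print(f"t = {n / SAMPLE_RATE:.6f} s  height = {s:+.3f}")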

Bits per sample

Of course, accurate reconstruction would be possible only if each digital sample were perfectly accurate. In practice the digitally-encoded signal can only take one of a finite number of values: we say it is quantized. For example, if we have 8-bit samples, then there are only 256 possible values. Any digital value is represented in a finite number of bits, so there is an inaccuracy equivalent to half the least-significant bit, due to rounding. This inaccuracy is called quantization error and is an unavoidable consequence of digitizing. The useful dynamic range of the signal (the range of loudness) is determined by the number of bits in each sample. The ear has a high dynamic range, partly because it adjusts to the overall volume at any time; it can detect a dynamic range substantially greater than that used on audio CD recordings.
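To make the rounding concrete, here is a small sketch (Python; the 8-bit depth and the sample value are illustrative choices) that quantizes one sample and reports the quantization error, which never exceeds half a step.

BITS = 8
LEVELS = 2 ** BITS    # 256 possible values
STEP = 2.0 / LEVELS   # width of one quantization level over [-1.0, 1.0]

def quantize(x):
    """Round x (in [-1, 1)) to the nearest of the 256 levels."""
    level = round((x + 1.0) / STEP)
    level = min(level, LEVELS - 1)   # clamp the top edge of the range
    return level * STEP - 1.0

x = 0.300001
q = quantize(x)
print(f"original  {x:.6f}")
print(f"quantized {q:.6f}")
print(f"error     {abs(x - q):.6f}  (at most half a step = {STEP / 2:.6f})")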


To replay this digital data as an analogue signal, we use Pulse Code Modulation (PCM). This simply means that each digital code determines the height of the analogue waveform at that point in time.

Sampling frequency

The original analogue signal can be reconstructed accurately if the samples were taken at a rate of at least twice the maximum frequency in the source (the Nyquist sampling criterion). People with good hearing (in fact, the young) can hear up to 20 kHz. The tracks on CDs are sampled at 44.1 kHz, satisfying the Nyquist criterion with something to spare. Usually we have to fix the sampling frequency for a given application. If the analogue signal contains frequencies higher than half the sampling frequency, then these frequencies should be filtered out before sampling occurs. If this is not done, the reconstructed signal will contain spurious lower frequencies derived from those higher frequencies. This is known as aliasing, because the high frequencies appear in the disguise of a lower frequency.
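The folding effect can be computed directly. The sketch below (Python; the test frequencies are arbitrary examples) shows where unfiltered tones above half the 44.1 kHz sampling rate would reappear.

def alias_frequency(f, fs):
    """Frequency at which a tone of f Hz appears when sampled at fs Hz."""
    f = f % fs                          # sampling cannot tell f from f mod fs
    return fs - f if f > fs / 2 else f  # fold into the band [0, fs/2]

FS = 44_100  # CD sampling rate
for f in (10_000, 20_000, 25_000, 30_000):
    print(f"{f:>6} Hz tone appears at {alias_frequency(f, FS):>7.0f} Hz")
# Tones below fs/2 = 22,050 Hz survive; 25 kHz aliases to 19,100 Hz
# and 30 kHz to 14,100 Hz.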

1.3.2

Compression

Various forms of compression are used to get the most information into a limited digital storage medium. There has been a lot of interest in this, partly because computers have reached the stage where video is within their processing grasp, partly because the web has limited bandwidth and partly because of digital television. There are two types of compression: lossy and lossless. Lossy techniques generally give greater compression but only allow approximate reconstruction. They are generally used as part of a perceptual coding scheme; that is, one which takes account of the limitations of the human eye (for video) or ear (for audio). Lossless techniques allow the original to be reconstructed exactly; they are used for computer data, for example. Here is some of the jargon associated with media compression:

JPEG     Joint Photographic Experts Group    Lossy; 10-20:1 (more with visible degradation)
M-JPEG   Motion JPEG                         Lossy: uses a sequence of JPEG images
MPEG     Motion Picture Experts Group        Lossy, for movies
LZW      Lempel-Ziv-Welch encoding           Encodes patterns; used in GIF and TIFF (lossless)
RLE      Run-Length Encoding                 Basis of many lossless schemes

RLE isn't a single standard; it is the term for any of various methods using the same core idea: each item has two values, namely the data value and the number of consecutive samples (the run length) which have that data value. Although the run-value pair requires more bits than a single value, the average amount of data will be reduced if the data has coherence, meaning simply that the next value is likely to be the same as the previous one. Simple RLE therefore does not guarantee compression: the degree of compression depends on the nature of the data. Indeed, in the worst case, RLE encoding can produce more data than the raw data. In a favourable case, you might get five-fold compression. We will look at MPEG and LZW later, in some detail.
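The core idea is easy to sketch (Python, illustrative only): encode a sequence as (value, run-length) pairs, and note how incoherent data can grow rather than shrink.

def rle_encode(data):
    """Return a list of (value, run_length) pairs."""
    runs = []
    for ch in data:
        if runs and runs[-1][0] == ch:
            runs[-1][1] += 1             # extend the current run
        else:
            runs.append([ch, 1])         # start a new run
    return [(value, count) for value, count in runs]

def rle_decode(runs):
    """Rebuild the original sequence from (value, run_length) pairs."""
    return "".join(value * count for value, count in runs)

coherent = "aaaaabbbbbbcc"      # favourable: long runs compress well
incoherent = "abcabc"           # worst case: every run has length 1
print(rle_encode(coherent))     # [('a', 5), ('b', 6), ('c', 2)]
print(rle_encode(incoherent))   # six pairs -- more data than the raw input
print(rle_decode(rle_encode(coherent)) == coherent)   # True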
