
An Introduction to Digital Communications

Costas N. Georghiades Electrical Engineering Department Texas A&M University


These notes are made available for students of EE 455, and they are to be used to enhance understanding of the course. Any unauthorized copy and distribution of these notes is prohibited.

Course Outline
Introduction
  Analog vs. Digital Communication Systems
  A General Communication System
Some Probability Theory
  Probability space, random variables, density functions, independence
  Expectation, conditional expectation, Bayes rule
  Stochastic processes, autocorrelation function, stationarity, spectral density



Outline (contd)
Analog-to-digital conversion
  Sampling (ideal, natural, sample-and-hold)
  Quantization, PCM
Source coding (data compression)
  Measuring information, entropy, the source coding theorem
  Huffman coding, run-length coding, Lempel-Ziv
Communication channels
  Bandlimited channels
  The AWGN channel, fading channels



Outline (contd)
Receiver design
  General binary and M-ary signaling
  Maximum-likelihood receivers
  Performance in an AWGN channel
  The Chernoff and union/Chernoff bounds
  Simulation techniques
  Signal spaces
  Modulation: PAM, QAM, PSK, DPSK, coherent FSK, incoherent FSK


Outline (contd)
Channel coding
  Block codes, hard and soft-decision decoding, performance
  Convolutional codes, the Viterbi algorithm, performance bounds
  Trellis-coded modulation (TCM)
Signaling through bandlimited channels
  ISI, Nyquist pulses, sequence estimation, partial response signaling
  Equalization



Outline (contd)
Signaling through fading channels
  Rayleigh fading, optimum receiver, performance
  Interleaving
Synchronization
  Symbol synchronization
  Frame synchronization
  Carrier synchronization


Introduction
A General Communication System
Source → Transmitter → Channel → Receiver → User

Source: speech, video, etc.
Transmitter: conveys information
Channel: invariably distorts signals
Receiver: extracts the information signal
User: utilizes the information


Digital vs. Analog Communication


Analog systems have an alphabet which is uncountably infinite.

Example: Analog Amplitude Modulation (AM)
[Block diagram: the message signal is multiplied (mixed) with the output of an RF oscillator and transmitted over the channel to the receiver.]


Analog vs. Digital (contd)


Digital systems transmit signals from a discrete alphabet.

Example: binary digital communication systems
[Figure: a binary data stream 0110010... enters the transmitter, which sends one of two waveforms (a "1" waveform or a "0" waveform) every T seconds. Data rate = 1/T bits/s.]


Digital systems are resistant to noise...


[Figure: bit 1 is mapped to waveform s1(t) and bit 0 to waveform s2(t); the channel adds noise, so the receiver observes r(t) = s_i(t) + noise and must decide: 1 or 0?]

Optimum (Correlation) Receiver:
The received signal r(t) is multiplied by s1(t) and integrated over the bit interval,
$$\int_0^T r(t)\, s_1(t)\,dt,$$
the integrator output is sampled at t = T, and a comparator decides 1 if the sample exceeds the threshold (0 in the figure) and 0 otherwise.
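To make the noise-resistance claim concrete, here is a minimal simulation sketch (not part of the original notes) of the correlation receiver above, assuming antipodal signaling, s2(t) = -s1(t), so that the comparator threshold is zero. The carrier frequency, noise level, and bit count are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

T, fs = 1.0, 100                                   # bit duration (s) and samples per bit (assumed)
t = np.arange(0, T, 1 / fs)
s1 = np.sqrt(2 / T) * np.cos(2 * np.pi * 5 * t)    # unit-energy waveform for bit "1"
# antipodal signaling assumed: bit "0" is sent as s2(t) = -s1(t)

n_bits, sigma = 50_000, 5.0                        # number of bits and per-sample noise std (assumed)
bits = rng.integers(0, 2, n_bits)
signs = 2 * bits - 1                               # +1 for "1", -1 for "0"
r = signs[:, None] * s1 + sigma * rng.standard_normal((n_bits, len(t)))  # r(t) = s_i(t) + noise

stats = r @ s1 / fs                                # multiply by s1(t) and integrate over [0, T]
decisions = (stats > 0).astype(int)                # comparator: threshold 0, sampled at t = T
print("simulated bit error rate:", np.mean(decisions != bits))   # around 2% at this noise level
```

Lowering sigma drives the simulated error rate toward zero, which is the point of the slide: small noise rarely flips a hard binary decision.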

Advantages of Digital Systems


Error correction/detection
Better encryption algorithms
More reliable data processing
Easily reproducible designs
Reduced cost
Easier data multiplexing
Facilitate data compression



A General Digital Communication System


Transmit side: Source → A/D Conversion → Source Encoder → Channel Encoder → Modulator → Channel
Receive side: Channel → Demodulator → Channel Decoder → Source Decoder → D/A Conversion → User
The diagram also includes a Synchronization block.


Some Probability Theory


Definition: A non-empty collection of subsets $\mathcal{A} = \{A_1, A_2, \ldots\}$ of a set $\Omega$, i.e., $\mathcal{A} = \{A_i;\ A_i \subseteq \Omega\}$, is called an algebra of sets if:
1) $A_i \in \mathcal{A}$ and $A_j \in \mathcal{A} \Rightarrow A_i \cup A_j \in \mathcal{A}$
2) $A_i \in \mathcal{A} \Rightarrow A_i^c \in \mathcal{A}$

Example: Let $\Omega = \{0, 1, 2\}$.
1) $\mathcal{A} = \{\emptyset, \Omega\}$ is an algebra
2) $\mathcal{A} = \{\emptyset, \Omega, \{1\}, \{2\}, \{0\}, \{1,2\}, \{1,0\}, \{0,2\}\}$ is an algebra
3) $\mathcal{A} = \{\emptyset, \Omega, \{0\}, \{1\}, \{2\}\}$ is not an algebra

Probability Measure
Definition: A class of subsets $\mathcal{F}$ of a space $\Omega$ is a $\sigma$-algebra (or a Borel algebra) if:
1) $A_i \in \mathcal{F} \Rightarrow A_i^c \in \mathcal{F}$
2) $A_i \in \mathcal{F},\ i = 1, 2, 3, \ldots \Rightarrow \bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$

Definition: Let $\mathcal{F}$ be a $\sigma$-algebra of a space $\Omega$. A function $P$ that maps $\mathcal{F}$ onto $[0,1]$ is called a probability measure if:
1) $P[\Omega] = 1$
2) $P[A] \ge 0$ for all $A \in \mathcal{F}$
3) $P\left[\bigcup_{i=1}^{\infty} A_i\right] = \sum_{i=1}^{\infty} P[A_i]$ whenever $A_i \cap A_j = \emptyset$ for $i \ne j$

Probability Measure
Let $\Omega = \mathbb{R}$ (the real line) and let $\mathcal{F}$ be built from the intervals $(x_1, x_2]$ in $\mathbb{R}$. Also, define a real-valued function $f$ on $\mathbb{R}$ such that:
1) $f(x) \ge 0$ for all $x$
2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
Then
$$P[\{x;\ x_1 < x \le x_2\}] = P[(x_1, x_2]] = \int_{x_1}^{x_2} f(x)\,dx$$
is a valid probability measure.


Probability Space
The following conclusions can be drawn from the above definition:
1) $P[\emptyset] = 0$
2) $P[A^c] = 1 - P[A]$   (since $P(A \cup A^c) = P(\Omega) = 1 = P(A) + P(A^c)$)
3) If $A_1 \subseteq A_2$ then $P(A_1) \le P(A_2)$
4) $P[A_1 \cup A_2] = P[A_1] + P[A_2] - P[A_1 \cap A_2]$

Definition: Let $\Omega$ be a space, $\mathcal{F}$ be a $\sigma$-algebra of subsets of $\Omega$, and $P$ a probability measure on $\mathcal{F}$. Then the ordered triple $(\Omega, \mathcal{F}, P)$ is a probability space:
$\Omega$ is the sample space, $\mathcal{F}$ the event space, and $P$ the probability measure.

Random Variables and Density Functions


Definition: A real-valued function $X(\omega)$ that maps $\Omega$ into the real line is a random variable.
Notation: For simplicity, in the future we will refer to $X(\omega)$ by $X$.
Definition: The distribution function of a random variable $X$ is defined by
$$F_X(x) = P[X \le x] = P[-\infty < X \le x].$$
From the previous discussion, we can express the above probability in terms of a non-negative function $f_X(\cdot)$ with $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$ as follows:
$$F_X(x) = P[X \le x] = \int_{-\infty}^{x} f_X(\alpha)\,d\alpha.$$
We will refer to $f_X(\cdot)$ as the density function of the random variable $X$.



Density Functions
We have the following observations based on the above definitions:
1) $F_X(-\infty) = \int_{-\infty}^{-\infty} f_X(x)\,dx = 0$
2) $F_X(\infty) = \int_{-\infty}^{\infty} f_X(x)\,dx = 1$
3) If $x_1 \le x_2$ then $F_X(x_1) \le F_X(x_2)$   ($F_X(x)$ is non-decreasing)

Examples of density functions:
a) The Gaussian (Normal) density function:
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

Example Density Functions


b) Uniform in [0,1]:
$$f_X(x) = \begin{cases} 1, & x \in [0,1] \\ 0, & \text{otherwise} \end{cases}$$

c) The Laplacian density function:
$$f_X(x) = \frac{a}{2}\, \exp(-a|x|)$$

Conditional Probability
Let $A$ and $B$ be two events from the event space $\mathcal{F}$. Then the probability of event $A$, given that event $B$ has occurred, $P[A \mid B]$, is given by
$$P[A \mid B] = \frac{P[A \cap B]}{P[B]}.$$
Example: Consider the tossing of a die:
$$P[\{2\} \mid \text{even outcome}] = 1/3, \qquad P[\{2\} \mid \text{odd outcome}] = 0.$$
Thus, conditioning can increase or decrease the probability of an event, compared to its unconditioned value.

The Law of Total Probability: Let $A_1, A_2, \ldots, A_M$ be a partition of $\Omega$, i.e.,
$$\bigcup_{i=1}^{M} A_i = \Omega \quad \text{and} \quad A_i \cap A_j = \emptyset,\ i \ne j.$$
Then the probability of occurrence of an event $B$ can be expressed as
$$P[B] = \sum_{i=1}^{M} P[B \mid A_i]\, P[A_i], \qquad B \in \mathcal{F}.$$

Illustration, Law of Total Probability


[Venn diagram: an event B intersecting the sets A1, A2, A3 of a partition of $\Omega$, with the conditional probabilities $P(B \mid A_2)$ and $P(B \mid A_3)$ indicated.]

Example, Conditional Probability


A binary channel with equally likely inputs, $\Pr(0) = \Pr(1) = \tfrac{1}{2}$, and transition probabilities
$$P_{00} = P[\text{receive } 0 \mid 0 \text{ sent}], \quad P_{10} = P[\text{receive } 0 \mid 1 \text{ sent}], \quad P_{01} = P[\text{receive } 1 \mid 0 \text{ sent}], \quad P_{11} = P[\text{receive } 1 \mid 1 \text{ sent}].$$
With $P_{01} = P_{10} = 0.01$, we have $P_{00} = 1 - P_{01} = 0.99$ and $P_{11} = 1 - P_{10} = 0.99$.
By the law of total probability, the probability of error is
$$\Pr(e) = \Pr(0)\, P_{01} + \Pr(1)\, P_{10} = \tfrac{1}{2}(0.01) + \tfrac{1}{2}(0.01) = 0.01.$$
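A quick Monte Carlo check of this calculation (an illustrative sketch, not from the notes; the sample size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
p01 = p10 = 0.01                           # crossover probabilities from the example
n = 1_000_000

sent = rng.integers(0, 2, n)               # equally likely 0s and 1s
flip_prob = np.where(sent == 0, p01, p10)  # probability that each particular bit is received in error
errors = rng.random(n) < flip_prob
print("estimated Pr(e):", errors.mean())   # close to 0.01
```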

Bayes Law
Bayes Law: Let $A_i,\ i = 1, 2, \ldots, M$ be a partition of $\Omega$ and $B$ an event in $\mathcal{F}$. Then
$$P[A_j \mid B] = \frac{P[B \mid A_j]\, P[A_j]}{\sum_{i=1}^{M} P[B \mid A_i]\, P[A_i]}.$$
Proof:
$$P[A_j \mid B] = \frac{P[A_j \cap B]}{P[B]} = \frac{P[B \mid A_j]\, P[A_j]}{P[B]} = \frac{P[B \mid A_j]\, P[A_j]}{\sum_{i=1}^{M} P[B \mid A_i]\, P[A_i]},$$
where the last equality uses the law of total probability.
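Applied to the binary channel of the previous example, Bayes' law gives the probability that a 0 was actually sent given that a 1 was received. The arithmetic below is a small illustrative sketch, not part of the notes.

```python
# Priors and transition probabilities from the binary-channel example
p0, p1 = 0.5, 0.5
p_r1_given_0 = 0.01      # P[receive 1 | 0 sent] = P01
p_r1_given_1 = 0.99      # P[receive 1 | 1 sent] = P11

# Denominator: P[receive 1] by the law of total probability
p_r1 = p_r1_given_0 * p0 + p_r1_given_1 * p1

# Bayes' law: P[0 sent | 1 received]
p_0_given_r1 = p_r1_given_0 * p0 / p_r1
print(p_0_given_r1)      # 0.01: an observed 1 almost certainly came from a transmitted 1
```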

Statistical Independence of Events


Two events A and B are said to be statistically independent if
$$P[A \cap B] = P[A]\, P[B].$$
In intuitive terms, two events are independent if the occurrence of one does not affect the occurrence of the other, i.e., $P(A \mid B) = P(A)$ when A and B are independent.
Example: Consider tossing a fair coin twice. Let
A = {heads occurs in the first toss} and B = {heads occurs in the second toss}.
Then
$$P[A \cap B] = P[A]\, P[B] = \tfrac{1}{4}.$$
The assumption we made (which is reasonable in this case) is that the outcome of one toss does not affect the outcome of the other.

Expectation
Consider a random variable $X$ with density $f_X(x)$. The expected (or mean) value of $X$ is given by
$$E[X] = \int_{-\infty}^{\infty} x\, f_X(x)\,dx.$$
In general, the expected value of some function $g(X)$ of a random variable $X$ is given by
$$E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f_X(x)\,dx.$$
When $g(X) = X^n$ for $n = 0, 1, 2, \ldots$, the corresponding expectations are referred to as the $n$-th moments of the random variable $X$. The variance of a random variable $X$ is given by
$$\operatorname{var}(X) = \sigma^2 = \int_{-\infty}^{\infty} [x - E(X)]^2 f_X(x)\,dx = \int_{-\infty}^{\infty} x^2 f_X(x)\,dx - E^2(X) = E(X^2) - E^2(X).$$

Example, Expectation
Example: Let $X$ be Gaussian with
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).$$
Then:
$$E(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} x\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = \mu,$$
$$\operatorname{Var}(X) = E[X^2] - E^2(X) = \int_{-\infty}^{\infty} x^2 f_X(x)\,dx - \mu^2 = \sigma^2.$$

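These two results are easy to confirm numerically by sampling; the sketch below is illustrative and the values of mu and sigma are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma = 3.0, 2.0                              # assumed example values
x = rng.normal(mu, sigma, 1_000_000)              # samples of the Gaussian random variable

print("sample mean:", x.mean())                               # close to mu = 3
print("sample variance:", x.var())                            # close to sigma^2 = 4
print("E[X^2] - E[X]^2:", (x ** 2).mean() - x.mean() ** 2)    # the same quantity, from moments
```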

Random Vectors
Definition: A random vector is a vector whose elements are random variables, i.e., if $X_1, X_2, \ldots, X_n$ are random variables, then $\mathbf{X} = (X_1, X_2, \ldots, X_n)$ is a random vector. Random vectors can be described statistically by their joint density function
$$f_{\mathbf{X}}(\mathbf{x}) = f_{X_1 X_2 \ldots X_n}(x_1, x_2, \ldots, x_n).$$
Example: Consider tossing a coin twice. Let $X_1$ be the random variable associated with the outcome of the first toss, defined by
$$X_1 = \begin{cases} 1, & \text{if heads} \\ 0, & \text{if tails.} \end{cases}$$
Similarly, let $X_2$ be the random variable associated with the second toss, defined as
$$X_2 = \begin{cases} 1, & \text{if heads} \\ 0, & \text{if tails.} \end{cases}$$
The vector $\mathbf{X} = (X_1, X_2)$ is a random vector.

Independence of Random Variables


Definition: Two random variables X and Y are independent if

$$f_{X,Y}(x, y) = f_X(x)\, f_Y(y).$$
The definition can be extended to independence among an arbitrary number of random variables, in which case their joint density function is the product of their marginal density functions. Definition: Two random variables X and Y are uncorrelated if

$$E[XY] = E[X]\, E[Y].$$
It is easily seen that independence implies uncorrelatedness, but not necessarily the other way around. Thus, independence is the stronger property.


The Characteristic Function


Definition: Let $X$ be a random variable with density $f_X(x)$. Then the characteristic function of $X$ is
$$\Phi_X(j\nu) = E\left[e^{j\nu X}\right] = \int_{-\infty}^{\infty} e^{j\nu x}\, f_X(x)\,dx.$$
Example: The characteristic function of a Gaussian random variable $X$ having mean $\mu$ and variance $\sigma^2$ is
$$\Phi_X(j\nu) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} e^{j\nu x}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = e^{\,j\nu\mu - \frac{1}{2}\nu^2\sigma^2}.$$
Definition: The moment-generating function of a random variable $X$ is defined by
$$\Phi_X(s) = E[e^{sX}] = \int_{-\infty}^{\infty} e^{sx}\, f_X(x)\,dx.$$
Fact: The moment-generating function of a random variable $X$ can be used to obtain its moments according to
$$E[X^n] = \left.\frac{d^n \Phi_X(s)}{ds^n}\right|_{s=0}.$$

Stochastic Processes
A stochastic process $\{X(t);\ -\infty < t < \infty\}$ is an ensemble of signals, each of which can be realized (i.e., observed) with a certain statistical probability. The value of a stochastic process at any given time, say $t_1$ (i.e., $X(t_1)$), is a random variable.
Definition: A Gaussian stochastic process is one for which $X(t)$ is a Gaussian random variable for every time $t$.
[Plots: two sample realizations, a fast-varying waveform and a slow-varying waveform, amplitude vs. time.]


Characterization of Stochastic Processes


Consider a stochastic process $\{X(\tau);\ -\infty < \tau < \infty\}$. The random variable $X(t)$ at time $t$ has a density function $f_{X(t)}(x; t)$. The mean and variance of $X(t)$ are
$$E[X(t)] = \mu_X(t) = \int_{-\infty}^{\infty} x\, f_{X(t)}(x; t)\,dx, \qquad \operatorname{VAR}[X(t)] = E\left[\left(X(t) - \mu_X(t)\right)^2\right].$$
Example: Consider the Gaussian random process whose value $X(t)$ at time $t$ is a Gaussian random variable having density
$$f_X(x; t) = \frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{x^2}{2t}\right).$$
We have $E[X(t)] = 0$ (a zero-mean process) and $\operatorname{var}[X(t)] = t$.
[Plot: the density flattens and spreads as t increases, e.g., t = 1 vs. t = 2.]

Autocovariance and Autocorrelation


Definition: The autocovariance function of a random process $X(t)$ is
$$C_{XX}(t_1, t_2) = E\left[\left(X(t_1) - \mu_X(t_1)\right)\left(X(t_2) - \mu_X(t_2)\right)\right], \qquad t_1, t_2 \in \mathbb{R}.$$
Definition: The autocorrelation function of a random process $X(t)$ is defined by
$$R_{XX}(t_1, t_2) = E\left[X(t_1)\, X(t_2)\right], \qquad t_1, t_2 \in \mathbb{R}.$$
Definition: A random process $X$ is uncorrelated if for every pair $(t_1, t_2)$ with $t_1 \ne t_2$,
$$E[X(t_1)\, X(t_2)] = E[X(t_1)]\, E[X(t_2)].$$
Definition: A process $X$ is mean-value stationary if its mean is not a function of time.
Definition: A random process $X$ is correlation stationary if the autocorrelation function $R_{XX}(t_1, t_2)$ is a function only of $\tau = t_1 - t_2$.
Definition: A random process $X$ is wide-sense stationary (W.S.S.) if it is both mean-value stationary and correlation stationary.

Spectral Density
Example (correlation stationary process):
$$R_{XX}(t_1, t_2) = \exp\left(-|t_1 - t_2|\right) = \exp(-|\tau|), \qquad \tau = t_1 - t_2.$$
Definition: For a wide-sense stationary process we can define a spectral density, which is the Fourier transform of the stochastic process's autocorrelation function:
$$S_X(f) = \int_{-\infty}^{\infty} R_{XX}(\tau)\, e^{-j2\pi f \tau}\,d\tau.$$
The autocorrelation function is the inverse Fourier transform of the spectral density:
$$R_{XX}(\tau) = \int_{-\infty}^{\infty} S_X(f)\, e^{j2\pi f \tau}\,df.$$
Fact: For a zero-mean process $X$,
$$\operatorname{var}(X) = R_{XX}(0) = \int_{-\infty}^{\infty} S_X(f)\,df.$$

Linear Filtering of Stochastic Signals


x(t) → H(f) → y(t)
$$S_Y(f) = S_X(f)\, |H(f)|^2$$
The spectral density at the output of a linear filter is the product of the spectral density of the input process and the magnitude squared of the filter transfer function.
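This relation can be illustrated numerically: pass (approximately) white Gaussian noise through a discrete-time low-pass filter and compare the estimated output spectrum with $|H(f)|^2$ times the estimated input spectrum. The sketch below is illustrative; the sample rate, filter order, and cutoff are arbitrary choices, and SciPy's Welch estimator is used for the spectral densities.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(3)
fs = 1000.0                                   # sample rate in Hz (assumed)
x = rng.standard_normal(500_000)              # approximately white Gaussian input

# 4th-order Butterworth low-pass filter with a 100 Hz cutoff (arbitrary choice)
b, a = signal.butter(4, 100, fs=fs)
y = signal.lfilter(b, a, x)

# Estimate the input and output spectral densities with Welch's method
f, Sx = signal.welch(x, fs=fs, nperseg=4096)
_, Sy = signal.welch(y, fs=fs, nperseg=4096)

# Theory: S_Y(f) = |H(f)|^2 S_X(f)
_, H = signal.freqz(b, a, worN=f, fs=fs)
mask = (f > 0) & (f < 250)                    # compare where |H(f)| is not vanishingly small
ratio = Sy[mask] / (np.abs(H[mask]) ** 2 * Sx[mask])
print("mean of S_Y / (|H|^2 S_X):", ratio.mean())   # close to 1
```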

White Gaussian Noise


Definition: A stochastic process $X$ is white Gaussian if:
a) $\mu_X(t) = \mu$ (a constant)
b) $R_{XX}(\tau) = \frac{N_0}{2}\,\delta(\tau)$, where $\tau = t_1 - t_2$
c) $X(t)$ is a Gaussian random variable and $X(t_i)$ is independent of $X(t_j)$ for all $t_i \ne t_j$.
Note:
1) A white Gaussian process is wide-sense stationary.
2) $S_X(f) = \frac{N_0}{2}$ is not a function of $f$ (a flat spectral density).

Analog-to-Digital Conversion
An analog signal is continuous in time and continuous in amplitude. Analog-to-digital conversion takes two steps:
  Discretize time: Sampling
  Discretize amplitude: Quantization
[Plot: an analog waveform, amplitude vs. time.]

Sampling

Signals are characterized by their frequency content. The Fourier transform of a signal describes its frequency content and determines its bandwidth:
$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j2\pi f t}\,dt, \qquad x(t) = \int_{-\infty}^{\infty} X(f)\, e^{j2\pi f t}\,df.$$
[Plots: a time-domain pulse x(t) (time in seconds) and its Fourier transform X(f) (frequency in Hz).]

Ideal Sampling
Mathematically, the sampled version $x_s(t)$ of a signal $x(t)$ is
$$x_s(t) = h(t)\, x(t) \quad\Longleftrightarrow\quad X_s(f) = H(f) * X(f),$$
where the sampling function $h(t)$ is a periodic train of impulses:
$$h(t) = \sum_{k=-\infty}^{\infty} \delta(t - kT_s) = \frac{1}{T_s} \sum_{k=-\infty}^{\infty} e^{\,j2\pi k t / T_s}.$$
[Figure: the impulse train h(t) with spacing $T_s$, and the sampled signal $x_s(t)$.]

Ideal Sampling
The Fourier transform of the sampling function is itself an impulse train in frequency:
$$H(f) = \mathcal{F}\left\{\frac{1}{T_s}\sum_{k=-\infty}^{\infty} e^{\,j2\pi k t/T_s}\right\} = \frac{1}{T_s} \sum_{k=-\infty}^{\infty} \delta\!\left(f - \frac{k}{T_s}\right).$$
Then:
$$X_s(f) = H(f) * X(f) = \frac{1}{T_s} \sum_{k=-\infty}^{\infty} X\!\left(f - \frac{k}{T_s}\right).$$
[Figure: (a) for $f_s > 2W$ the spectral replicas of X(f) do not overlap (no aliasing); (b) for $f_s < 2W$ the replicas overlap (aliasing).]

Ideal Sampling
If $f_s > 2W$, the original signal $x(t)$ can be obtained from $x_s(t)$ through simple low-pass filtering. In the frequency domain, we have
$$X(f) = X_s(f)\, G(f), \qquad G(f) = \begin{cases} T_s, & |f| \le B \\ 0, & \text{otherwise}, \end{cases} \qquad W \le B \le f_s - W.$$
The impulse response of the low-pass filter, $g(t)$, is then
$$g(t) = \mathcal{F}^{-1}\{G(f)\} = \int_{-B}^{B} G(f)\, e^{j2\pi f t}\,df = 2BT_s\, \frac{\sin(2\pi B t)}{2\pi B t}.$$
From the convolution property of the Fourier transform we have
$$x(t) = \int_{-\infty}^{\infty} x_s(a)\, g(t - a)\,da = \sum_k x(kT_s) \int_{-\infty}^{\infty} \delta(a - kT_s)\, g(t - a)\,da = \sum_k x(kT_s)\, g(t - kT_s).$$
Thus, we have the following interpolation formula:
$$x(t) = \sum_k x(kT_s)\, g(t - kT_s).$$

Ideal Sampling
[Figure: the ideal low-pass reconstruction filter G(f), of height $T_s$ over $|f| \le B$ (with $W \le B$), and its sinc-shaped impulse response g(t).]
The Sampling Theorem: A bandlimited signal with no spectral components above W Hz can be recovered uniquely from its samples taken every $T_s$ seconds, provided that
$$T_s \le \frac{1}{2W}, \quad \text{or, equivalently,} \quad f_s \ge 2W \quad \text{(the Nyquist rate)}.$$
Extraction of x(t) from its samples can be done by passing the sampled signal through a low-pass filter. Mathematically, x(t) can be expressed in terms of its samples by
$$x(t) = \sum_k x(kT_s)\, g(t - kT_s).$$
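The interpolation formula can be demonstrated directly: sample a bandlimited signal above the Nyquist rate and rebuild it from its samples with the sinc kernel. The sketch below is illustrative; the test signal and the rates are arbitrary choices, and the sum is truncated to a finite number of samples.

```python
import numpy as np

def x(t):
    """Bandlimited test signal; its highest frequency component is W = 3 Hz."""
    return np.sin(2 * np.pi * 1.0 * t) + 0.5 * np.cos(2 * np.pi * 3.0 * t)

fs = 10.0                                 # sampling rate, chosen above the Nyquist rate 2W = 6 Hz
Ts = 1 / fs
k = np.arange(-2000, 2001)                # sample indices (wide enough to ignore edge effects)
samples = x(k * Ts)

# With B = fs/2, the reconstruction kernel g(t) = 2*B*Ts*sinc(2*B*t) reduces to sinc(t/Ts)
t = np.linspace(-1, 1, 500)
x_rec = np.array([np.sum(samples * np.sinc((ti - k * Ts) / Ts)) for ti in t])

print("max reconstruction error:", np.max(np.abs(x_rec - x(t))))   # small; limited by truncating the sum
```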

Natural Sampling
A delta function can be approximated by a rectangular pulse p(t):
$$p(t) = \begin{cases} \frac{1}{T}, & -\frac{T}{2} \le t \le \frac{T}{2} \\ 0, & \text{elsewhere}, \end{cases} \qquad h_p(t) = \sum_{k=-\infty}^{\infty} p(t - kT_s).$$
It can be shown that in this case as well the original signal can be reconstructed from its samples taken at or above the Nyquist rate through simple low-pass filtering.

Zero-Order-Hold Sampling
$$x_s(t) = p(t) * \left[x(t)\, h(t)\right]$$
[Figure: the signal x(t) sampled and held over each interval $T_s$, and the pulse spectrum P(f).]
Reconstruction: pass $x_s(t)$ through an equalizer with response $1/P(f)$ followed by a low-pass filter $G(f)$ to recover $x(t)$. Reconstruction is possible, but an equalizer may be needed.

Practical Considerations of Sampling


Since practical low-pass filters are not ideal and have a finitely steep roll-off, in practice the sampling frequency $f_s$ is chosen about 20% higher than the Nyquist rate:
$$f_s \approx 2.2\,W.$$
Example: Music in general has a spectrum with frequency components up to about 20 kHz. The ideal (smallest) sampling frequency $f_s$ is then 40 ksamples/s. The smallest practical sampling frequency is 44 ksamples/s. In compact disc players, the sampling frequency is 44.1 ksamples/s.

Summary of Sampling (Nyquist) Theorem


An analog signal of bandwidth W Hz can be reconstructed exactly from its samples taken at a rate at or above 2W samples/s (known as the Nyquist rate).
[Figure: x(t) and its samples $x_s(t)$ taken every $T_s$ seconds, with $f_s = 1/T_s > 2W$.]

Summary of Sampling Theorem (contd)
Signal reconstruction: pass the sampled signal $x_s(t)$ through a low-pass filter to recover $x(t)$.
The amplitude still takes values on a continuum => an infinite number of bits would be needed.
We need a finite number of possible amplitudes => Quantization.

Quantization

Quantization is the process of discretizing the amplitude axis. It involves mapping an infinite number of possible amplitudes to a finite set of values.
N bits can represent $L = 2^N$ amplitudes; this corresponds to N-bit quantization.
Quantization can be:
  Uniform vs. nonuniform
  Scalar vs. vector

Example (Quantization)
Let N = 3 bits. This corresponds to L = 8 quantization levels.
[Figure: a waveform x(t) whose amplitude range is divided into 8 uniform intervals, labeled 000 through 111 (3-bit uniform quantization).]

Quantization (contd)
There is an irrecoverable error due to quantization. It can be made small through appropriate design.
Examples:
  Telephone speech signals: 8-bit quantization
  CD digital audio: 16-bit quantization

Input-Output Characteristic
[Figure: the staircase input-output characteristic $\hat{x} = Q(x)$ of a 3-bit (8-level) uniform quantizer with step size $\Delta$; the output levels are $\pm\Delta/2, \pm 3\Delta/2, \pm 5\Delta/2, \pm 7\Delta/2$.]
Quantization error: $x - Q(x) = x - \hat{x}$.
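A mid-rise uniform quantizer like the one in this characteristic takes only a few lines of code; the sketch below is illustrative (the number of bits, input range, and test signal are arbitrary choices).

```python
import numpy as np

def uniform_quantize(x, n_bits, v_max):
    """Mid-rise uniform quantizer covering [-v_max, v_max] with 2**n_bits levels."""
    L = 2 ** n_bits
    delta = 2 * v_max / L                        # step size
    idx = np.floor(x / delta)                    # index of the interval the sample falls in
    idx = np.clip(idx, -L // 2, L // 2 - 1)      # saturate at the outermost levels
    return (idx + 0.5) * delta                   # output level = midpoint of the interval

x = np.sin(2 * np.pi * np.linspace(0, 1, 1000))  # test signal in [-1, 1]
xq = uniform_quantize(x, n_bits=3, v_max=1.0)    # 8 levels at +/-delta/2, +/-3*delta/2, ...

err = x - xq
print("max |error|:", np.abs(err).max())         # at most delta/2 = 0.125 for in-range inputs
```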

Signal-to-Quantization Noise Ratio (SQNR)


For stochastic signals:
$$\mathrm{SQNR} = \frac{P_X}{D}, \qquad P_X = \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} E\left[X^2(t)\right]dt, \qquad D = \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} E\left[\left(X(t) - Q(X(t))\right)^2\right]dt.$$
For random variables (this form can also be used for stationary processes):
$$\mathrm{SQNR} = \frac{P_X}{D}, \qquad P_X = E\left[X^2\right], \qquad D = E\left[\left(X - Q(X)\right)^2\right].$$

SQNR for Uniform Scalar Quantizers


Let the input x(t) be a sinusoid of amplitude V volts. It can be argued that all amplitudes in [-V, V] are equally likely. Then, if the step size is $\Delta$, the quantization error is uniformly distributed in the interval $[-\Delta/2, \Delta/2]$, so
$$D = \int_{-\Delta/2}^{\Delta/2} e^2\, \frac{1}{\Delta}\,de = \frac{\Delta^2}{12}.$$
The signal power is
$$P_X = \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} V^2 \sin^2(\omega t)\,dt = \frac{V^2}{2}.$$
For an N-bit quantizer, $\Delta = 2V/2^N$, and therefore
$$\mathrm{SQNR} = 10\log_{10}\frac{P_X}{D} = 6.02\,N + 1.76 \text{ dB}.$$
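The 6.02N + 1.76 dB rule is easy to verify by quantizing a full-scale sinusoid with a mid-rise uniform quantizer; the sketch below is illustrative and all parameter values are arbitrary.

```python
import numpy as np

t = np.linspace(0, 1, 200_000, endpoint=False)
V = 1.0
x = V * np.sin(2 * np.pi * 7 * t)                 # full-scale sinusoid

for N in (4, 8, 12):
    L = 2 ** N
    delta = 2 * V / L                             # step size 2V / 2^N
    idx = np.clip(np.floor(x / delta), -L // 2, L // 2 - 1)
    xq = (idx + 0.5) * delta                      # mid-rise uniform quantizer
    sqnr = 10 * np.log10(np.mean(x ** 2) / np.mean((x - xq) ** 2))
    print(f"{N} bits: {sqnr:5.2f} dB   (6.02*N + 1.76 = {6.02 * N + 1.76:5.2f} dB)")
```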

Example
A zero-mean, stationary Gaussian source X(t) having the spectral density given below is to be quantized using a 2-bit quantizer. The quantization intervals and levels are as indicated below. Find the resulting SQNR.
$$S_X(f) = \frac{200}{1 + (2\pi f)^2} \quad\Longleftrightarrow\quad R_{XX}(\tau) = 100\, e^{-|\tau|}, \qquad P_X = R_{XX}(0) = 100.$$
The first-order density of the source is
$$f_X(x) = \frac{1}{\sqrt{200\pi}}\, e^{-x^2/200}.$$
Quantizer: decision boundaries at $-10, 0, 10$ and output levels $\pm 5, \pm 15$.
$$D = E\left[\left(X - Q(X)\right)^2\right] = 2\int_0^{10} (x - 5)^2 f_X(x)\,dx + 2\int_{10}^{\infty} (x - 15)^2 f_X(x)\,dx = 11.885$$
$$\mathrm{SQNR} = 10\log_{10}\frac{100}{11.885} = 9.25 \text{ dB}.$$
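The distortion integral in this example can be evaluated numerically; the sketch below (illustrative, using SciPy's quad) reproduces D of about 11.88 and the 9.25 dB SQNR.

```python
import numpy as np
from scipy.integrate import quad

sigma2 = 100.0                                            # P_X = R_XX(0) = 100
f = lambda x: np.exp(-x ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

# Quantizer from the example: output 5 on (0, 10), output 15 on (10, inf); symmetric for x < 0
D_inner, _ = quad(lambda x: (x - 5) ** 2 * f(x), 0, 10)
D_outer, _ = quad(lambda x: (x - 15) ** 2 * f(x), 10, np.inf)
D = 2 * (D_inner + D_outer)

print("D =", round(D, 3))                                       # about 11.88
print("SQNR =", round(10 * np.log10(sigma2 / D), 2), "dB")      # about 9.25 dB
```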

Non-uniform Quantization
In general, the optimum quantizer is non-uniform.
Optimality conditions (Lloyd-Max):
  The boundaries of the quantization intervals are the mid-points of the corresponding quantized values.
  The quantized values are the centroids of the quantization regions.
Optimum quantizers are designed iteratively using the above rules.
We can also talk about optimal uniform quantizers. These have equal-length quantization intervals (except possibly the two at the boundaries), and the quantized values are at the centroids of the quantization intervals.
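The two conditions suggest a simple alternating (Lloyd-style) iteration: set the boundaries to midpoints of the current levels, then move each level to the centroid of its region, and repeat. The sketch below is an illustrative, sample-based approximation for a zero-mean, unit-variance Gaussian source; the level count, initialization, and iteration count are arbitrary choices rather than anything prescribed in the notes.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.standard_normal(200_000)          # training samples from a N(0, 1) source
L = 4                                     # number of quantization levels (2 bits)

levels = np.linspace(-1.5, 1.5, L)        # initial guess for the output levels
for _ in range(100):
    bounds = (levels[:-1] + levels[1:]) / 2                        # condition 1: boundaries = midpoints
    region = np.digitize(x, bounds)                                # assign each sample to a region
    levels = np.array([x[region == i].mean() for i in range(L)])   # condition 2: levels = centroids

bounds = (levels[:-1] + levels[1:]) / 2
D = np.mean((x - levels[np.digitize(x, bounds)]) ** 2)

print("levels:", np.round(levels, 3))     # roughly +/-0.45 and +/-1.51 for a N(0, 1) source
print("distortion:", round(D, 4))         # roughly 0.12, i.e., an SQNR near 9.3 dB
```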

Optimal Quantizers for a Gaussian Source


Companding (compressing-expanding)
[Block diagram: Compressor, Uniform Quantizer, Expander, Low-pass Filter.]

μ-law companding:
$$g(x) = \frac{\ln(1 + \mu|x|)}{\ln(1 + \mu)}\,\operatorname{sgn}(x), \qquad -1 \le x \le 1,$$
with inverse (expander)
$$g^{-1}(x) = \frac{1}{\mu}\left[(1 + \mu)^{|x|} - 1\right]\operatorname{sgn}(x), \qquad -1 \le x \le 1.$$
[Plots: the compressor and expander characteristics for μ = 0, 10, 255.]
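The compressor and its inverse translate directly into code; the sketch below is illustrative, uses mu = 255, and simply checks that the expander undoes the compressor.

```python
import numpy as np

def mu_compress(x, mu=255.0):
    """mu-law compressor, defined for x in [-1, 1]."""
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_expand(y, mu=255.0):
    """mu-law expander, the inverse of the compressor, for y in [-1, 1]."""
    return np.sign(y) * ((1 + mu) ** np.abs(y) - 1) / mu

x = np.linspace(-1, 1, 1001)
y = mu_compress(x)
print("max round-trip error:", np.max(np.abs(mu_expand(y) - x)))   # essentially zero

# Small amplitudes are boosted before uniform quantization, which is what improves their SQNR
print(mu_compress(0.01), mu_compress(0.5))   # about 0.23 and 0.88
```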

Examples: Sampling, Quantization


Speech signals have a bandwidth of about 3.4 kHz. The sampling rate in telephone channels is 8 kHz. With 8-bit quantization, this results in a bit rate of 64,000 bits/s to represent speech.
In CDs, the sampling rate is 44.1 kHz. With 16-bit quantization, the bit rate needed to represent each channel is 705,600 bits/s (without coding).

Data Compression
Analog source → Sampler → Quantizer (A/D converter) → Source Encoder → 001011001...
Discrete source (01101001...) → Source Encoder → 10011...

The job of the source encoder is to represent the digitized source efficiently, i.e., using the smallest number of bits.


Discrete Memoryless Sources


Definition: A discrete source is memoryless if successive symbols produced by it are independent.

For a memoryless source, the probability of a sequence of symbols being produced equals the product of the probabilities of the individual symbols.


Measuring Information
Not all sources are created equal. Example:
  Discrete Source 1: P(0) = 1, P(1) = 0. No information provided.
  Discrete Source 2: P(0) = 0.99, P(1) = 0.01. Little information is provided.
  Discrete Source 3: P(0) = 0.5, P(1) = 0.5. Much information is provided.

Measuring Information (contd)


The amount of information provided is a function of the probabilities of occurrence of the symbols.
Definition: The self-information of a symbol x which has probability of occurrence p is
$$I(x) = -\log_2(p) \text{ bits.}$$
Definition: The average amount of information, in bits/symbol, provided by a binary source with P(0) = p is
$$H(x) = -p\log_2(p) - (1 - p)\log_2(1 - p).$$
H(x) is known as the entropy of the binary source.

The Binary Entropy Function


Maximum information is conveyed when the probabilities of the two symbols are equal: H(x) reaches its maximum value of 1 bit at p = 0.5.
[Plot: the binary entropy function H(x) vs. p, peaking at 1 for p = 0.5.]

Non-Binary Sources
In general, the entropy of a source that produces L symbols with probabilities $p_1, p_2, \ldots, p_L$ is
$$H(X) = -\sum_{i=1}^{L} p_i \log_2(p_i) \text{ bits.}$$
Property: The entropy function satisfies
$$0 \le H(X) \le \log_2(L),$$
with equality on the right if and only if the source symbols are equally probable.
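Both the binary entropy function and the general L-symbol entropy are one-liners; the short sketch below (illustrative) evaluates them for the example distributions used on the previous slides.

```python
import numpy as np

def entropy(p):
    """Entropy in bits of a discrete distribution p; zero-probability symbols contribute nothing."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                            # 0*log2(0) is taken as 0
    return -np.sum(p * np.log2(p))

print(entropy([1.0, 0.0]))                  # 0.0   : no information
print(entropy([0.99, 0.01]))                # ~0.081: little information
print(entropy([0.5, 0.5]))                  # 1.0   : maximum for a binary source
print(entropy([0.25] * 4))                  # 2.0   = log2(L) for L = 4 equally likely symbols
```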

Encoding of Discrete Sources


Fixed-length coding: assigns source symbols binary sequences of the same length.
Variable-length coding: assigns source symbols binary sequences of different lengths.

Example (variable-length code):

  Symbol   Probability   Codeword   Length
  a        3/8           0          1
  b        3/8           11         2
  c        1/8           100        3
  d        1/8           101        3

Average codeword length:
$$\bar{M} = \sum_{i=1}^{4} m_i p_i = 1\cdot\tfrac{3}{8} + 2\cdot\tfrac{3}{8} + 3\cdot\tfrac{1}{8} + 3\cdot\tfrac{1}{8} = 1.875 \text{ bits/symbol},$$
compared with the source entropy
$$H(X) = -\sum_{i=1}^{4} p_i \log_2(p_i) = 1.811 \text{ bits/symbol.}$$


Theorem (Source Coding)


The smallest possible average number of bits/symbol needed to exactly represent a source equals the entropy of that source.
Example: A binary file of length 1,000,000 bits contains 100,000 1s. This file can be compressed by more than a factor of 2:
$$H(x) = -0.9\log_2(0.9) - 0.1\log_2(0.1) = 0.47 \text{ bits},$$
$$S = 10^6\, H(x) = 4.7 \times 10^5 \text{ bits}, \qquad \text{compression ratio} = 2.13.$$


Some Data Compression Algorithms


Huffman coding
Run-length coding
Lempel-Ziv
There are also lossy compression algorithms that do not exactly represent the source, but do a good job. These provide much better compression ratios (more than a factor of 10, depending on reproduction quality).


Huffman Coding (by example)


A binary source produces bits with P(0) = 0.1. Design a Huffman code that encodes 3-bit sequences from the source.

  Source bits   Probability   Codeword   Length
  111           0.729         1          1
  110           0.081         011        3
  101           0.081         010        3
  011           0.081         001        3
  100           0.009         00011      5
  010           0.009         00010      5
  001           0.009         00001      5
  000           0.001         00000      5

(The code tree is built by repeatedly merging the two smallest probabilities: 0.001 + 0.009 = 0.01, 0.009 + 0.009 = 0.018, 0.01 + 0.018 = 0.028, 0.028 + 0.081 = 0.109, 0.081 + 0.081 = 0.162, 0.109 + 0.162 = 0.271, 0.271 + 0.729 = 1.0.)

$$H(x) = -0.1\log_2(0.1) - 0.9\log_2(0.9) = 0.469 \text{ bits/source bit,}$$
$$\bar{M} = \tfrac{1}{3}\left(1\cdot 0.729 + 3\cdot 3\cdot 0.081 + 5\cdot 3\cdot 0.009 + 5\cdot 0.001\right) = 0.53 \text{ bits/source bit.}$$

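The merging procedure described above is exactly what a general Huffman coder does. The sketch below is illustrative (it uses Python's heapq; the 0/1 labeling of each merge may differ from the table above, but the codeword lengths and the average length are the same).

```python
import heapq
from itertools import count

def huffman(probs):
    """Return a prefix code {symbol: codeword} for a {symbol: probability} dictionary."""
    tiebreak = count()                       # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, code1 = heapq.heappop(heap)   # the two smallest-probability subtrees
        p2, _, code2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in code1.items()}
        merged.update({s: "1" + c for s, c in code2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

p1 = 0.9                                     # P(1) = 0.9, P(0) = 0.1
blocks = [f"{i:03b}" for i in range(8)]      # the eight 3-bit source blocks
probs = {b: p1 ** b.count("1") * (1 - p1) ** b.count("0") for b in blocks}

code = huffman(probs)
for b in sorted(blocks, key=lambda s: -probs[s]):
    print(b, round(probs[b], 3), code[b])

avg = sum(probs[b] * len(code[b]) for b in blocks) / 3
print("average length:", round(avg, 3), "bits/source bit")   # about 0.53
```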

Run-length Coding (by example)


A binary source produces binary digits with P(0) = 0.9. Design a run-length code to compress the source.

  Source bits   Run length   Probability   Codeword
  1             0            0.100         1000
  01            1            0.090         1001
  001           2            0.081         1010
  0001          3            0.073         1011
  00001         4            0.066         1100
  000001        5            0.059         1101
  0000001       6            0.053         1110
  00000001      7            0.048         1111
  00000000      8            0.430         0

Average number of code bits per run:
$$\bar{M}_1 = 4\,(1 - 0.43) + 1\,(0.43) = 2.71.$$
Average number of source bits per run:
$$\bar{M}_2 = 1(0.1) + 2(0.09) + 3(0.081) + 4(0.073) + 5(0.066) + 6(0.059) + 7(0.053) + 8(0.048) + 8(0.430) = 5.710.$$
Average number of code bits per source bit:
$$\bar{M} = \frac{\bar{M}_1}{\bar{M}_2} = \frac{2.710}{5.710} = 0.475,$$
compared with the entropy
$$H(X) = -0.9\log_2(0.9) - 0.1\log_2(0.1) = 0.469 < 0.475.$$

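The table's rule (a 4-bit codeword for each run of 0 to 7 zeros ending in a 1, and the single bit 0 for a full run of 8 zeros) is a few lines of code; the sketch below is illustrative and simply measures the achieved rate on a random source with P(0) = 0.9.

```python
import numpy as np

rng = np.random.default_rng(6)
bits = (rng.random(1_000_000) < 0.1).astype(int)     # P(1) = 0.1, so P(0) = 0.9

encoded = []
run = 0
for b in bits:
    if b == 1:
        encoded.append("1" + format(run, "03b"))     # 4-bit codeword for a run of 0..7 zeros ending in 1
        run = 0
    else:
        run += 1
        if run == 8:
            encoded.append("0")                      # single-bit codeword for 8 consecutive zeros
            run = 0
# (any trailing partial run is ignored here for simplicity)

n_code_bits = sum(len(c) for c in encoded)
print("code bits per source bit:", round(n_code_bits / len(bits), 4))   # about 0.475
```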

Examples: Speech Compression


Toll-quality speech can be produced at 8 kbps (a factor of 8 compression compared to uncompressed telephone speech).
Algorithms that produce speech at 4.8 kbps or even 2.4 kbps are available, but they have reduced quality and require complex processing.


Example: Video Compression


Uncompressed video of a 640x480-pixel image at 8 bits/pixel and 30 frames/s requires a data rate of 72 Mbps.
Video-conference systems operate at 384 kbps.
MPEG-2 (standard) operates at 3 Mbps.

