
9/25/2013

Overview
Sampling
Speech and Audio Signal Processing

ECE554

Nikesh Bajaj
nikesh.14730@lpu.co.in
Asst. Prof. DSP, SECE
Lovely Professional University
2 By: Nikesh Bajaj

CODING OF SPEECH SIGNALS


Speech Coding
Waveform Coding | Hybrid Coding | Analysis-Synthesis or Vocoders
Waveform Coding: an attempt is made to preserve the original waveform.
Vocoders: a theoretical model of the speech production mechanism is considered.
Hybrid Coding: uses techniques from the other two.

Speech Quality vs. Bit Rate for Codecs

Sampling Theorem
Theorem: If the highest frequency contained in an analog
signal xa(t) is Fmax = B, and the signal is sampled at a frequency
Fs > 2B, then the analog signal can be exactly recovered from
its samples using the following reconstruction formula:

$$x_a(t) = \sum_{n=-\infty}^{\infty} x_a(nT)\, \frac{\sin\left[\frac{\pi}{T}(t - nT)\right]}{\frac{\pi}{T}(t - nT)}, \qquad T = \frac{1}{F_s}$$

Note that at the original sample instances (t = nT), the
reconstructed analog signal is equal to the value of the original
analog signal. At times between the sample instances, the
signal is the weighted sum of shifted sinc functions.
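
The reconstruction sum can be tried numerically. The sketch below is only an illustration under stated assumptions: the infinite sum is truncated to a finite record, and the test signal (a 100 Hz cosine sampled at 1000 Hz) and the name `sinc_reconstruct` are hypothetical choices, not taken from the slides.

```python
import math

def sinc_reconstruct(samples, T, t):
    """Evaluate the (truncated) reconstruction sum at time t from x_a(nT), n = 0..N-1."""
    total = 0.0
    for n, x_n in enumerate(samples):
        u = (math.pi / T) * (t - n * T)
        # sin(u)/u tends to 1 as u tends to 0, so treat the sample instant explicitly
        kernel = 1.0 if abs(u) < 1e-12 else math.sin(u) / u
        total += x_n * kernel
    return total

# Assumed test signal: 100 Hz cosine, Fs = 1000 Hz, so Fs > 2B holds
Fs = 1000.0
T = 1.0 / Fs
samples = [math.cos(2 * math.pi * 100.0 * n * T) for n in range(200)]

# At a sample instant the sum returns the sample value itself
at_sample = sinc_reconstruct(samples, T, 50 * T)

# Halfway between two samples the sum is a weighted sum of shifted sinc functions
t_mid = 100.5 * T
between = sinc_reconstruct(samples, T, t_mid)
true_value = math.cos(2 * math.pi * 100.0 * t_mid)

print(at_sample, samples[50])
print(between, true_value)
```

Because only 200 samples are used, the value between samples is close to, but not exactly equal to, the analog signal; the exact recovery in the theorem needs the full infinite sum.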

Sampling Theorem (1920)

Reconstruction of Signal

Sampling
Prove the Theorem
Fs = 2*Fc

TYPICAL SAMPLING FREQUENCIES IN SPEECH RECOGNITION
8 kHz: Popular in digital telephony. Provides coverage of first three formants for most speakers and most sounds.
16 kHz: Popular in speech research.
Sub 8 kHz Sampling:

Problems
Sampling theorem for bandlimited signals
How to change the sample rate of a signal?
How can this be implemented using time domain interpolation (based on the Sampling Theorem)?
How can this be implemented efficiently using digital filters?

PCM


Speech Probability Density Function
Probability density function for x(n) is the same as for xa(t) since x(n) = xa(nT); the mean and variance are the same for both x(n) and xa(t).
Need to estimate probability density and power spectrum from speech waveforms.
Probability density is estimated from a long-term histogram of amplitudes.
A good approximation is a gamma distribution of the form:
$$p(x) = \left[\frac{\sqrt{3}}{8\pi\sigma_x |x|}\right]^{1/2} e^{-\frac{\sqrt{3}\,|x|}{2\sigma_x}}$$
A simpler approximation is the Laplacian density, of the form:
$$p(x) = \frac{1}{\sqrt{2}\,\sigma_x}\, e^{-\frac{\sqrt{2}\,|x|}{\sigma_x}}$$

Measured Speech Densities
Distribution normalized so mean is 0 and variance is 1 ($\bar{x} = 0$, $\sigma_x = 1$).
Gamma density more closely approximates measured distribution for speech than Laplacian.
Laplacian is still a good model and is used in analytical studies.
Small amplitudes much more likely than large amplitudes, by 100:1 ratio.

Correlation of Speech
Can estimate long-term autocorrelation and power spectrum using time-series analysis methods:
$$\hat{\phi}(m) = \frac{1}{L}\sum_{n=0}^{L-1-m} x(n)\,x(n+m), \qquad 0 \le m \le L-1$$
where L is a large integer.
8 kHz sampled speech for several speakers:
high correlation between adjacent samples
lowpass speech more highly correlated than bandpass speech

PCM: Sampling and Quantization
Separating the processes of sampling and quantization.
Assume x(n) obtained by sampling a bandlimited signal at a rate at or above the Nyquist rate.
Assume x(n) is known to infinite precision in amplitude.
Need to quantize x(n) in some suitable manner.
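
A long-term autocorrelation estimate of the kind mentioned on the Correlation of Speech slide can be sketched as follows. This is a minimal sketch that assumes the usual time-average estimate, (1/L) times the sum of x(n)x(n+m) over the available samples. The 200 Hz tone stands in for lowpass speech and is purely hypothetical, as are the function names.

```python
import math

def autocorr_estimate(x, m):
    """Time-average estimate: (1/L) * sum_{n=0}^{L-1-m} x(n) * x(n+m)."""
    L = len(x)
    m = abs(m)
    return sum(x[n] * x[n + m] for n in range(L - m)) / L

def normalized_autocorr(x, m):
    """Autocorrelation at lag m, normalized by the lag-0 value (signal power)."""
    return autocorr_estimate(x, m) / autocorr_estimate(x, 0)

# Assumed stand-in for lowpass speech: a 200 Hz tone sampled at 8 kHz
Fs = 8000
x = [math.sin(2 * math.pi * 200 * n / Fs) for n in range(4000)]

rho1 = normalized_autocorr(x, 1)
print(round(rho1, 3))
```

For a slowly varying signal like this one the lag-1 value is close to 1, which is the "high correlation between adjacent samples" that differential coders exploit.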

Quantization and Encoding
Coding is a two-stage process
quantization process: $x(n) \rightarrow \hat{x}(n)$
encoding process: $\hat{x}(n) \rightarrow c(n)$
where $\Delta$ is the (assumed fixed) quantization step size
Decoding is a single-stage process
decoding process: $c'(n) \rightarrow \hat{x}'(n)$
if $c'(n) = c(n)$ (no errors in transmission), then $\hat{x}'(n) = \hat{x}(n)$
$\hat{x}'(n) \ne x(n)$: coding and quantization lose information.

n-bit Quantization
Use n-bit binary numbers to represent the quantized samples => $2^n$ quantization levels
Information Rate of Coder: I = n FS = total bit rate in bits/second
n=16, FS= 8 kHz => I=128 kbps
n=8, FS= 8 kHz => I=64 kbps
n=4, FS= 8 kHz => I=32 kbps
Goal of waveform coding is to get the highest quality at a fixed value of I (kbps), or equivalently to get the lowest value of I for a fixed quality.
Since FS is fixed, need most efficient quantization methods to minimize I.
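
The information-rate arithmetic I = n FS can be checked with a few lines. The function name is hypothetical; the values of n and FS are the ones quoted on the slide.

```python
def information_rate_bps(n_bits, fs_hz):
    """Total bit rate I = n * FS in bits per second."""
    return n_bits * fs_hz

# Values from the slide: n = 16, 8, 4 bits at FS = 8 kHz
rates_kbps = {n: information_rate_bps(n, 8000) / 1000 for n in (16, 8, 4)}
for n_bits, kbps in rates_kbps.items():
    print(f"n={n_bits}, FS=8 kHz => I={kbps:g} kbps")
```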

Quantization Basics
Assume $|x(n)| \le X_{max}$ (possibly $\infty$)
For Laplacian density (where $X_{max} = \infty$), can show that 0.35% of the samples fall outside the range $-4\sigma_x \le x(n) \le 4\sigma_x$ => large quantization errors for 0.35% of the samples.
Can safely assume that $X_{max}$ is proportional to $\sigma_x$.

Quantization Process
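
The 0.35% figure can be reproduced from the Laplacian density. The sketch assumes the zero-mean form p(x) = exp(-sqrt(2)|x|/sigma) / (sqrt(2) sigma), for which the probability of |x| exceeding k sigma is exp(-sqrt(2) k). The function names are illustrative only.

```python
import math

def laplacian_pdf(x, sigma=1.0):
    """Zero-mean Laplacian density with standard deviation sigma."""
    return math.exp(-math.sqrt(2.0) * abs(x) / sigma) / (math.sqrt(2.0) * sigma)

def laplacian_tail_probability(k):
    """P(|x| > k * sigma), from integrating the density over both tails."""
    return math.exp(-math.sqrt(2.0) * k)

p_out = laplacian_tail_probability(4.0)
print(f"{100 * p_out:.2f}% of samples fall outside +/- 4 sigma")  # prints 0.35%
```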

Uniform Quantizer
The quantization range and levels are chosen such that the signal can easily be processed digitally.

Mid-Riser and Mid-Tread Quantizers
Mid-riser
origin (x=0) in middle of rising part of the staircase
same number of positive and negative levels
symmetrical around origin
Mid-tread
origin (x=0) in middle of quantization level
one more negative level than positive
one quantization level of 0 (where a lot of activity occurs)
Code words have direct numerical significance (sign-magnitude representation for mid-riser, two's complement for mid-tread).
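
The difference between the two staircases is easy to see by listing their output levels. The following is a sketch under assumptions: a 3-bit quantizer with step 0.25, inputs clipped to the available levels, and hypothetical function names.

```python
import math

def mid_riser(x, step, n_bits):
    """Mid-riser: origin on a riser, levels at +/- step/2, +/- 3*step/2, ..."""
    half = 2 ** (n_bits - 1)
    k = math.floor(x / step)
    k = max(-half, min(half - 1, k))      # clip to the 2**n_bits available levels
    return (k + 0.5) * step

def mid_tread(x, step, n_bits):
    """Mid-tread: origin on a tread, levels at 0, +/- step, +/- 2*step, ..."""
    half = 2 ** (n_bits - 1)
    k = math.floor(x / step + 0.5)        # round to nearest level index
    k = max(-half, min(half - 1, k))      # one more negative level than positive
    return k * step

step, n_bits = 0.25, 3
inputs = [v / 100.0 for v in range(-200, 201)]
riser_levels = sorted({mid_riser(v, step, n_bits) for v in inputs})
tread_levels = sorted({mid_tread(v, step, n_bits) for v in inputs})
print(riser_levels)
print(tread_levels)
```

Both quantizers use all 2^n levels. The mid-riser has no zero output, so a small input near zero toggles between +step/2 and -step/2, while the mid-tread maps it to 0.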

Quantizer

Uniform Quantization and SNR
Uniform quantizers are characterized by:
number of levels: $2^n$ (n bits)
quantization step size: $\Delta$
if $|x(n)| \le X_{max}$ and x(n) has a symmetric density, then
$\Delta \cdot 2^n = 2X_{max}$
$\Delta = 2X_{max} / 2^n$
if we let
$\hat{x}(n) = x(n) + e(n)$
with x(n) the unquantized speech sample, and e(n) the quantization error, then
$-\Delta/2 \le e(n) \le \Delta/2$


Quantization Noise Model
quantization noise is a zero-mean, stationary white noise process:
$E[e(n)e(n+m)] = \sigma_e^2$ for $m = 0$, and $0$ otherwise
quantization noise is uncorrelated with the input signal:
$E[x(n)e(n+m)] = 0 \quad \forall m$
distribution of quantization errors is uniform over each quantization interval:
$p_e(e) = 1/\Delta$ for $-\Delta/2 \le e \le \Delta/2$, and $0$ otherwise
$\Rightarrow \bar{e} = 0, \quad \sigma_e^2 = \Delta^2/12$

SNR for Quantization (Ref: 5.3.1)
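
Combining the noise variance (step squared over 12) with a step size of 2 Xmax / 2^n gives, in dB, SNR = 6.02 n + 4.77 - 20 log10(Xmax / sigma_x). The sketch below evaluates this directly from the two definitions. The choice Xmax = 4 sigma_x is an assumption used only for the example, and the function name is hypothetical.

```python
import math

def uniform_quantizer_snr_db(n_bits, xmax_over_sigma):
    """SNR in dB = 10*log10(sigma_x^2 / sigma_e^2), with sigma_e^2 = step^2 / 12."""
    step_over_sigma = 2.0 * xmax_over_sigma / (2 ** n_bits)   # step = 2*Xmax / 2**n
    noise_to_signal = step_over_sigma ** 2 / 12.0
    return -10.0 * math.log10(noise_to_signal)

# Assumed loading: Xmax = 4 * sigma_x
for n_bits in (4, 8, 16):
    print(n_bits, round(uniform_quantizer_snr_db(n_bits, 4.0), 2))

snr_8 = uniform_quantizer_snr_db(8, 4.0)
gain_per_bit = uniform_quantizer_snr_db(9, 4.0) - snr_8
```

Each extra bit halves the step size and so adds about 6 dB, which is why the bit rate and the quality of a uniform quantizer trade off so directly.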

Instantaneous Companding [R:5.3.2]
LOG
Not Practical

μ-Law: Companding/Exp.
Pseudo Logarithmic

μ-Law: Companding/Exp.
Cases
y(n) = 0 for x(n) = 0
If μ = 0 then y(n) = x(n)
For large μ, y(n) is approximately logarithmic in |x(n)|

SNR: Smith [10]
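
The cases listed for the μ-law compressor can be checked numerically. The extracted slides do not show the characteristic itself, so the sketch assumes the standard form y = Xmax * ln(1 + μ|x|/Xmax) / ln(1 + μ) * sign(x). The names `mu_law_compress`, `mu_law_expand`, and `xmax` are hypothetical.

```python
import math

def mu_law_compress(x, mu, xmax=1.0):
    """Assumed standard mu-law compressor; mu = 0 is the no-compression limit."""
    if mu == 0:
        return x
    mag = xmax * math.log1p(mu * abs(x) / xmax) / math.log1p(mu)
    return math.copysign(mag, x)

def mu_law_expand(y, mu, xmax=1.0):
    """Inverse of mu_law_compress (the expander at the decoder)."""
    if mu == 0:
        return y
    mag = (xmax / mu) * math.expm1(abs(y) / xmax * math.log1p(mu))
    return math.copysign(mag, y)

mu = 255
print(mu_law_compress(0.0, mu))            # y = 0 for x = 0
print(mu_law_compress(0.3, 0))             # mu = 0 gives y = x
small = mu_law_compress(0.01, mu)          # small amplitudes are boosted
round_trip = mu_law_expand(small, mu)      # expander undoes the compressor
print(small, round_trip)
```

Quantizing y uniformly and then expanding gives small steps for small amplitudes and large steps for large ones, which is the point of companding.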

μ-Law: Example
μ = 40 and L = 8

A-Law

Delta Modulation
At high sampling rate, signal samples are highly correlated.

Autocorrelation

Linear Delta Modulation
Coder

Linear Delta Modulation
Decoder
Characteristics ??

Linear Delta Modulation
Optimum Prediction Gain

LDM
Slope overload
Avoid slope overload distortion (noise)


LDM
When input is zero or constant
Noise with peak to peak variation (Qnoise)

Granular Noise
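
Both LDM impairments, granular noise on a constant input and slope overload on a fast ramp, show up in a few lines of simulation. This is a minimal sketch assuming an ideal accumulator as predictor and a fixed step. The function names and test inputs are hypothetical.

```python
def ldm_encode(x, step):
    """Linear delta modulation: 1 bit per sample, fixed step size."""
    bits, recon = [], []
    approx = 0.0
    for sample in x:
        bit = 1 if sample >= approx else 0     # sign of the prediction error
        approx += step if bit else -step       # staircase approximation
        bits.append(bit)
        recon.append(approx)
    return bits, recon

def ldm_decode(bits, step):
    """Decoder: accumulate +step or -step for each received bit."""
    out, approx = [], 0.0
    for bit in bits:
        approx += step if bit else -step
        out.append(approx)
    return out

step = 0.1

# Zero input: the output hunts around the input (granular noise)
idle_bits, idle_recon = ldm_encode([0.0] * 10, step)
print(idle_bits)

# Ramp rising 3 steps per sample: the staircase cannot keep up (slope overload)
ramp = [3 * step * n for n in range(20)]
ramp_bits, ramp_recon = ldm_encode(ramp, step)
print(ramp[-1] - ramp_recon[-1])
```

A larger step reduces slope overload but increases granular noise, which is the trade-off that motivates adapting the step size.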

LDM
Bit Rate: ??
Advantages
Simplicity
No sync. req.
Simple circuit cond.
Bit-Pattern

Adaptive Delta Modulation
Encoder

Adaptive Delta Modulation
Decoder

Adaptive DM
Characteristics?

ADM
Parameters P, Q, Dmn, Dmx (Dmn and Dmx: minimum and maximum step size)
Ratio Dmx/Dmn = large for high SNR
PQ <= 1
Another: Continuously variable slope delta modulation (CVSD)

Comparison
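
The roles of P, Q, Dmn and Dmx can be illustrated with a small simulation. The sketch assumes the common one-bit-memory adaptation rule: multiply the step by P when the current bit equals the previous one, by Q otherwise, and limit it to the range Dmn to Dmx. The slides do not spell out this rule, so it is an assumption. The parameter values and function names are hypothetical.

```python
def next_step(step, bit, prev_bit, P, Q, d_min, d_max):
    """Grow the step by P after two equal bits, shrink it by Q after a sign change."""
    if prev_bit is None:                       # first sample: no history yet
        return step
    step *= P if bit == prev_bit else Q
    return min(d_max, max(d_min, step))        # keep Dmn <= step <= Dmx

def adm_encode(x, P=1.5, Q=1.0 / 1.5, d_min=0.01, d_max=1.0):
    bits, recon = [], []
    approx, step, prev_bit = 0.0, d_min, None
    for sample in x:
        bit = 1 if sample >= approx else 0
        step = next_step(step, bit, prev_bit, P, Q, d_min, d_max)
        approx += step if bit else -step
        bits.append(bit)
        recon.append(approx)
        prev_bit = bit
    return bits, recon

def adm_decode(bits, P=1.5, Q=1.0 / 1.5, d_min=0.01, d_max=1.0):
    """The decoder repeats the same step adaptation from the bit stream alone."""
    out = []
    approx, step, prev_bit = 0.0, d_min, None
    for bit in bits:
        step = next_step(step, bit, prev_bit, P, Q, d_min, d_max)
        approx += step if bit else -step
        out.append(approx)
        prev_bit = bit
    return out

# Silence followed by a unit step: small steps while idle, fast growth on the edge
x = [0.0] * 10 + [1.0] * 30
bits, recon = adm_encode(x)
print(recon[19], recon[-1])
```

With the fixed step Dmn = 0.01 a linear delta modulator would need 100 samples to climb the unit step; here the adaptive step reaches it within 10 samples and then shrinks again. The values P = 1.5 and Q = 1/1.5 satisfy PQ <= 1.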

DPCM
Quantize difference rather than sample
DM is 1-bit DPCM
6 dB improvement on SNR
No single predictor can be optimal
Need of Adaptive DPCM

ADPCM: Adaptive Quan.
Adaptive step size
Coder: Feed Forward
5 dB + 6 dB improvement
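
"Quantize difference rather than sample" can be sketched as a DPCM loop with the quantizer inside the prediction loop. Everything specific here is an assumption for illustration: a fixed first-order predictor with coefficient a = 0.9, a 4-bit mid-tread quantizer with step 0.05, a 200 Hz test tone, and the function names.

```python
import math

def quantize_index(d, step, n_bits):
    """Mid-tread uniform quantizer: return the integer code for difference d."""
    half = 2 ** (n_bits - 1)
    k = math.floor(d / step + 0.5)
    return max(-half, min(half - 1, k))

def dpcm_encode(x, a=0.9, step=0.05, n_bits=4):
    codes, recon, prev = [], [], 0.0
    for sample in x:
        pred = a * prev                        # predict from the previous reconstruction
        k = quantize_index(sample - pred, step, n_bits)
        prev = pred + k * step                 # reconstruction the decoder will also form
        codes.append(k)
        recon.append(prev)
    return codes, recon

def dpcm_decode(codes, a=0.9, step=0.05):
    out, prev = [], 0.0
    for k in codes:
        prev = a * prev + k * step
        out.append(prev)
    return out

# Assumed test signal: 200 Hz tone at 8 kHz, full range -1 to 1
x = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(400)]
codes, recon = dpcm_encode(x)
max_err = max(abs(s - r) for s, r in zip(x, recon))
print(max_err)
```

The signal spans a range of 2, but the 16 quantizer levels only have to cover the much smaller prediction error, so the step (and with it the quantization error) is smaller than a 4-bit PCM quantizer covering the full range would allow.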

ADPCM
Decoder

ADPCM - Feedback

ADPCM: Adaptive Predictor

ADPCM

PCM Speech (2)
Companding Example: 5-bit per sample (1-bit polarity, 2-bit segment code, & 2-bit quantization code)
[Figure: signal range from -V to +V divided into linear quantization intervals; polarity bit 1 with segment codes (+) 00 to 11, polarity bit 0 with segment codes (-) 00 to 11; each segment holds quantization codes 00 to 11; label "Narrower intervals for smaller amplitude"]

PCM Speech (3)
Companding Example: 5-bit per sample (1-bit polarity, 2-bit segment code, & 2-bit quantization code)
[Figure: same layout as PCM Speech (2); label "Wider intervals for smaller amplitude"]