
# 9/25/2013

Overview
Sampling
Speech and Audio Signal Processing

ECE554

Nikesh Bajaj
nikesh.14730@lpu.co.in
Asst. Prof. DSP, SECE
Lovely Professional University

## CODING OF SPEECH SIGNALS

Speech Coding
Waveform Coding, Hybrid Coding, Analysis-Synthesis or Vocoders
[Figure: Speech Quality vs. Bit Rate for Codecs]
Waveform Coding: an attempt is made to preserve the original waveform.
Vocoders: a theoretical model of the speech production mechanism is considered.


Sampling Theorem

Theorem: If the highest frequency contained in an analog
signal xa(t) is Fmax = B, and the signal is sampled at a frequency
Fs > 2B, then the analog signal can be exactly recovered from
its samples using the following reconstruction formula:

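$$x_a(t) = \sum_{n=-\infty}^{\infty} x_a(nT)\,\frac{\sin\left[\frac{\pi}{T}(t - nT)\right]}{\frac{\pi}{T}(t - nT)}$$

where T = 1/Fs is the sampling period.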
Note that at the original sample instances (t = nT), the
reconstructed analog signal is equal to the value of the original
analog signal. At times between the sample instances, the
signal is the weighted sum of shifted sinc functions.
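
The reconstruction sum can be tried numerically. The sketch below is only an illustration: the 8 kHz rate, the 500 Hz test tone, and the function name `sinc_reconstruct` are assumptions made for the example, and the infinite sum is truncated to 400 samples, so values between sample instants are approximate.

```python
import math

def sinc_reconstruct(samples, T, t):
    """Evaluate the (truncated) reconstruction sum at time t from x_a(nT), n = 0..N-1."""
    total = 0.0
    for n, x_n in enumerate(samples):
        arg = math.pi / T * (t - n * T)
        # sin(arg)/arg tends to 1 as arg tends to 0, which covers t = nT
        total += x_n * (1.0 if abs(arg) < 1e-12 else math.sin(arg) / arg)
    return total

Fs = 8000.0          # assumed sampling rate in Hz
T = 1.0 / Fs
F0 = 500.0           # assumed test tone, well below Fs / 2
samples = [math.cos(2 * math.pi * F0 * n * T) for n in range(400)]

# At a sample instant the sum returns the sample itself.
at_sample = sinc_reconstruct(samples, T, 200 * T)

# Between sample instants the weighted sinc sum approximates the analog value.
t_mid = 200.5 * T
between = sinc_reconstruct(samples, T, t_mid)
true_mid = math.cos(2 * math.pi * F0 * t_mid)
```

Comparing `between` with `true_mid` shows how closely the truncated sum tracks the original signal halfway between two samples.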

## Sampling Theorem (1920) and Reconstruction of Signal


Sampling
Prove the Theorem
Fs = 2*Fc

## TYPICAL SAMPLING FREQUENCIES IN SPEECH RECOGNITION

8 kHz: Popular in digital telephony. Provides coverage of first three formants for most speakers and most sounds.
16 kHz: Popular in speech research.
Sub 8 kHz Sampling:

Problems
Sampling theorem for bandlimited signals
How to change the sample rate of a signal?
How can this be implemented using time domain interpolation (based on the Sampling Theorem)?
How can this be implemented efficiently using digital filters?

PCM


Speech Probability Density Function
Probability density function for x(n) is the same as for xa(t): since x(n) = xa(nT), the mean and variance are the same for both x(n) and xa(t).
Need to estimate probability density and power spectrum from speech waveforms.
Probability density is estimated from a long-term histogram of amplitudes.
A good approximation is a gamma distribution of the form:

$$p(x) = \left[\frac{\sqrt{3}}{8\pi\sigma_x |x|}\right]^{1/2} e^{-\frac{\sqrt{3}\,|x|}{2\sigma_x}}$$

A simpler approximation is the Laplacian density, of the form:

$$p(x) = \frac{1}{\sqrt{2}\,\sigma_x}\, e^{-\frac{\sqrt{2}\,|x|}{\sigma_x}}$$

Measured Speech Densities
Distribution normalized so mean is 0 and variance is 1 ($\bar{x} = 0$, $\sigma_x = 1$).
Gamma density more closely approximates measured distribution for speech than Laplacian.
Laplacian is still a good model and is used in analytical studies.
Small amplitudes much more likely than large amplitudes by 100:1 ratio.


## Correlation of Speech

Can estimate long-term autocorrelation and power spectrum using time-series analysis methods:

$$\hat{\phi}(m) = \frac{1}{L}\sum_{n=0}^{L-1-m} x(n)\,x(n+m), \qquad 0 \le m \le L-1$$

where L is a large integer.
8 kHz sampled speech for several speakers.
High correlation between adjacent samples.
Lowpass speech more highly correlated than bandpass speech.

PCM
Sampling and Quantization
Separating the processes of sampling and quantization.
Assume x(n) obtained by sampling a bandlimited signal at a rate at or above the Nyquist rate.
Assume x(n) is known to infinite precision in amplitude.
Need to quantize x(n) in some suitable manner.
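
A long-term autocorrelation estimate of the kind mentioned for speech can be sketched in a few lines. This assumes the common form (1/L) Σ x(n) x(n+m) over a long record of L samples. The two sine tones are synthetic stand-ins, not speech, chosen only to contrast a low-frequency signal with a higher-frequency one. All names are hypothetical.

```python
import math

def autocorr_estimate(x, m):
    """Long-term estimate (1/L) * sum_{n=0}^{L-1-m} x(n) * x(n+m)."""
    L = len(x)
    return sum(x[n] * x[n + m] for n in range(L - m)) / L

def normalized_autocorr(x, m):
    """Estimate at lag m divided by the lag-0 value (the signal power)."""
    return autocorr_estimate(x, m) / autocorr_estimate(x, 0)

# Synthetic stand-ins (assumption: no real speech recordings are used here).
Fs = 8000.0
L = 8000
lowpass_like = [math.sin(2 * math.pi * 200.0 * n / Fs) for n in range(L)]
bandpass_like = [math.sin(2 * math.pi * 2000.0 * n / Fs) for n in range(L)]

rho_low = normalized_autocorr(lowpass_like, 1)    # adjacent-sample correlation
rho_band = normalized_autocorr(bandpass_like, 1)
```

With these inputs the low-frequency tone gives an adjacent-sample correlation close to 1, while the 2 kHz tone gives a value near 0. This is in line with the remark that lowpass material is more highly correlated.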


## Quantization and Encoding

Coding is a two-stage process:
quantization process: $x(n) \rightarrow \hat{x}(n)$
encoding process: $\hat{x}(n) \rightarrow c(n)$
where $\Delta$ is the (assumed fixed) quantization step size.
Decoding is a single-stage process:
decoding process: $c'(n) \rightarrow \hat{x}'(n)$
If $c'(n) = c(n)$ (no errors in transmission), then $\hat{x}'(n) = \hat{x}(n)$.
$\hat{x}'(n) \neq x(n)$: coding and quantization lose information.

n-bit Quantization
Use n-bit binary numbers to represent the quantized samples => $2^n$ quantization levels.
Information Rate of Coder: $I = n \cdot F_S$ = total bit rate in bits/second
n = 16, FS = 8 kHz => I = 128 kbps
n = 8, FS = 8 kHz => I = 64 kbps
n = 4, FS = 8 kHz => I = 32 kbps
Goal of waveform coding is to get the highest quality at a fixed value of I (kbps), or equivalently to get the lowest value of I for a fixed quality.
Since FS is fixed, need most efficient quantization methods to minimize I.
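
The bit-rate arithmetic for an n-bit coder at a fixed sampling rate is small enough to check directly. The helper name below is hypothetical; it only restates I = n × FS.

```python
def information_rate_bps(n_bits, fs_hz):
    """I = n * Fs: bits per sample times samples per second."""
    return n_bits * fs_hz

FS = 8000  # sampling rate in Hz (8 kHz)
for n in (16, 8, 4):
    levels = 2 ** n
    rate_kbps = information_rate_bps(n, FS) / 1000
    print(f"n={n:2d}: {levels} levels, I = {rate_kbps:g} kbps")
```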

Quantization Basics
Assume $|x(n)| \le X_{max}$ (possibly $\infty$).
For Laplacian density (where $X_{max} = \infty$), can show that 0.35% of the samples fall outside the range $-4\sigma_x \le x(n) \le 4\sigma_x$.
Can safely assume that $X_{max}$ is proportional to $\sigma_x$.

Quantization Process
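
The 0.35% figure quoted for the Laplacian density can be checked with a short calculation. The sketch assumes the standard Laplacian form p(x) = exp(-√2|x|/σx) / (√2 σx), for which the probability of |x| exceeding kσx is exp(-√2 k). The function name is hypothetical.

```python
import math

def laplacian_tail_probability(k):
    """P(|x| > k * sigma_x) for a zero-mean Laplacian density with variance sigma_x^2."""
    return math.exp(-math.sqrt(2.0) * k)

p_outside = laplacian_tail_probability(4.0)
print(f"{100 * p_outside:.2f}% of samples fall outside +/- 4 sigma_x")
```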


Uniform Quantizer
The choice of quantization range and levels is made such that the signal can easily be processed digitally.

Quantizers
Mid-riser
origin (x=0) in middle of rising part of the staircase
same number of positive and negative levels
symmetrical around origin.
Mid-tread
origin (x=0) in middle of quantization level
one more negative level than positive
one quantization level of 0 (where a lot of activity occurs)
Code words have direct numerical significance (sign-magnitude representation for
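
The two uniform quantizer types can be compared with a small sketch. It assumes a 3-bit quantizer on the range [-1, 1] with step size 2·Xmax/2^n. The function names and the test grid are hypothetical choices for illustration.

```python
import math

def mid_riser(x, delta, n_bits):
    """Uniform mid-riser quantizer: output levels at odd multiples of delta / 2."""
    half = 2 ** (n_bits - 1)
    k = math.floor(x / delta)
    k = max(-half, min(half - 1, k))      # clip to the 2**n_bits available levels
    return (k + 0.5) * delta

def mid_tread(x, delta, n_bits):
    """Uniform mid-tread quantizer: output levels at integer multiples of delta."""
    half = 2 ** (n_bits - 1)
    k = math.floor(x / delta + 0.5)
    k = max(-half, min(half - 1, k))      # leaves one more negative level than positive
    return k * delta

n_bits, x_max = 3, 1.0                    # assumed example values
delta = 2 * x_max / 2 ** n_bits

# Collect the distinct output levels reached over a grid of inputs.
riser_levels = sorted({mid_riser(v / 100, delta, n_bits) for v in range(-150, 151)})
tread_levels = sorted({mid_tread(v / 100, delta, n_bits) for v in range(-150, 151)})
```

Listing `riser_levels` and `tread_levels` makes the structural difference visible: the mid-riser set is symmetric and has no zero level, while the mid-tread set contains zero.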

Quantizer

## Uniform Quantization and SNR

Uniform Quantizers characterized by:
number of levels: $2^n$ (n bits)
quantization step size: $\Delta$
If $|x(n)| \le X_{max}$ and x(n) has a symmetric density, then
$$\Delta \cdot 2^n = 2X_{max}, \qquad \Delta = \frac{2X_{max}}{2^n}$$
If we let
$$\hat{x}(n) = x(n) + e(n)$$
with x(n) the unquantized speech sample and e(n) the quantization error, then
$$-\frac{\Delta}{2} \le e(n) \le \frac{\Delta}{2}$$

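Quantization Noise Model
Quantization noise is a zero-mean, stationary white noise process:
$$E[e(n)\,e(n+m)] = \begin{cases} \sigma_e^2, & m = 0 \\ 0, & \text{otherwise} \end{cases}$$
Quantization noise is uncorrelated with the input signal:
$$E[x(n)\,e(n+m)] = 0 \quad \forall m$$
Distribution of quantization errors is uniform over each quantization interval:
$$p_e(e) = \begin{cases} 1/\Delta, & -\Delta/2 \le e \le \Delta/2 \\ 0, & \text{otherwise} \end{cases} \qquad \bar{e} = 0, \quad \sigma_e^2 = \frac{\Delta^2}{12}$$

SNR for Quantization Ref:5.3.1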
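
Under the uniform-error noise model, the signal-to-quantization-noise ratio follows from the noise variance Δ²/12 and the step size Δ = 2·Xmax/2^n. The sketch below evaluates it for a few word lengths. The loading Xmax = 4σx is an assumption made for the example, and the function name is hypothetical.

```python
import math

def uniform_quantizer_snr_db(n_bits, x_max, sigma_x):
    """SNR = sigma_x^2 / sigma_e^2 with delta = 2 * x_max / 2**n and sigma_e^2 = delta^2 / 12."""
    delta = 2.0 * x_max / 2 ** n_bits
    noise_power = delta ** 2 / 12.0
    return 10.0 * math.log10(sigma_x ** 2 / noise_power)

sigma_x = 1.0
x_max = 4.0 * sigma_x        # assumed loading: X_max = 4 * sigma_x
for n in (4, 8, 16):
    print(f"n={n:2d}: SNR = {uniform_quantizer_snr_db(n, x_max, sigma_x):.2f} dB")
```

Each added bit halves Δ, which raises the SNR by about 6 dB.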

## Instantaneous Companding [R:5.3.2]

LOG
Not Practical

μ-Law: Companding/Exp.
Pseudo Logarithmic

μ-Law: Companding/Exp.
Cases
y(n) = 0 for x(n) = 0
For large
SNR Smith

μ-Law: Example
μ = 40 and L = 8
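
A minimal sketch of μ-law companding with these example values follows. It assumes the usual μ-law characteristic y = Xmax · ln(1 + μ|x|/Xmax) / ln(1 + μ) · sign(x), which is not legible in the extracted slides, and it takes L to mean the number of quantization levels. The function names and the mid-riser quantizer are hypothetical choices.

```python
import math

def mu_law_compress(x, mu, x_max=1.0):
    """y = x_max * ln(1 + mu * |x| / x_max) / ln(1 + mu) * sign(x)."""
    return math.copysign(x_max * math.log1p(mu * abs(x) / x_max) / math.log1p(mu), x)

def mu_law_expand(y, mu, x_max=1.0):
    """Inverse of mu_law_compress."""
    return math.copysign(x_max / mu * math.expm1(abs(y) / x_max * math.log1p(mu)), y)

def quantize_mid_riser(y, levels, x_max=1.0):
    """Uniform mid-riser quantizer with `levels` output levels on [-x_max, x_max]."""
    delta = 2.0 * x_max / levels
    k = min(levels // 2 - 1, max(-(levels // 2), math.floor(y / delta)))
    return (k + 0.5) * delta

MU, L = 40.0, 8   # example values: mu = 40, L = 8 levels (assumed meaning of L)

def compand(x):
    """Compress, quantize uniformly, then expand."""
    return mu_law_expand(quantize_mid_riser(mu_law_compress(x, MU), L), MU)

small = 0.02      # a low-amplitude sample
err_companded_small = abs(compand(small) - small)
err_uniform_small = abs(quantize_mid_riser(small, L) - small)
```

For the low-amplitude sample, the companded path gives a smaller error than the plain uniform quantizer with the same number of levels. Finer resolution at small amplitudes is the purpose of companding.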
A-Law

Delta Modulation
At high sampling rate, signal samples are highly correlated.
Autocorrelation

## Linear Delta Modulation

Coder
Decoder
Characteristics ??

## Linear Delta Modulation

Optimum Prediction Gain

LDM
Avoid distortion (noise)

LDM
When input is zero or constant
Granular Noise
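
The behaviour of linear delta modulation on a constant input can be reproduced with a small sketch. It assumes a first-order accumulator with a fixed step size of 0.1 and an initial estimate of zero. The names and values are hypothetical. A steep ramp is included to show the opposite failure, usually called slope overload.

```python
def ldm_encode(x, delta):
    """1-bit encoder: send 1 if the input is at or above the running estimate, else 0."""
    bits, estimate = [], 0.0
    for sample in x:
        bit = 1 if sample >= estimate else 0
        estimate += delta if bit else -delta   # accumulator mirrors the decoder
        bits.append(bit)
    return bits

def ldm_decode(bits, delta):
    """Accumulate +delta / -delta steps to rebuild the staircase approximation."""
    out, estimate = [], 0.0
    for bit in bits:
        estimate += delta if bit else -delta
        out.append(estimate)
    return out

delta = 0.1                                   # assumed fixed step size

# Constant (zero) input: the output hunts around the input, giving granular noise.
idle_bits = ldm_encode([0.0] * 8, delta)
idle_out = ldm_decode(idle_bits, delta)

# Steep ramp: the staircase can rise by only delta per sample and falls behind.
ramp = [0.5 * n for n in range(8)]
ramp_bits = ldm_encode(ramp, delta)
ramp_out = ldm_decode(ramp_bits, delta)
```

For the constant input the bit stream alternates between 1 and 0, while the ramp produces a run of identical bits and a growing gap between `ramp` and `ramp_out`.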


Bit Rate:??
Simplicity
No sync. Req.
Simple circuit cond.
Bit-Pattern

Encoder

Decoder
Characteristics?

## Parameters P, Q, Dmn, Dmx

Ratio Dmx/Dmn = large for high SNR

PQ <=1
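
The role of P, Q, Dmn and Dmx can be illustrated with one common adaptation rule. The rule is an assumption here, since the slides' own step-size logic is not recoverable from the text: multiply the step by P when the current bit repeats the previous one, multiply by Q when it changes, and clamp the result to [Dmn, Dmx]. All numeric values are hypothetical.

```python
def adm_encode(x, p, q, d_min, d_max):
    """1-bit adaptive DM: grow the step by P on repeated bits, shrink it by Q on a change."""
    bits, steps = [], []
    estimate, step, prev_bit = 0.0, d_min, None
    for sample in x:
        bit = 1 if sample >= estimate else 0
        if prev_bit is not None:
            step *= p if bit == prev_bit else q
            step = min(d_max, max(d_min, step))   # keep Dmn <= step <= Dmx
        estimate += step if bit else -step
        bits.append(bit)
        steps.append(step)
        prev_bit = bit
    return bits, steps

P, Q = 1.5, 1.0 / 1.5          # assumed values chosen so that P * Q <= 1
D_MIN, D_MAX = 0.01, 1.0       # assumed step-size limits (Dmn, Dmx)

# Steep ramp: repeated bits let the step size grow toward Dmx.
ramp = [0.5 * n for n in range(12)]
_, ramp_steps = adm_encode(ramp, P, Q, D_MIN, D_MAX)

# Constant input: alternating bits keep the step size at Dmn.
_, idle_steps = adm_encode([0.0] * 12, P, Q, D_MIN, D_MAX)
```

A large Dmx/Dmn ratio gives the step size room to follow both steep and flat stretches of the input.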

## Another: Continuously variable slope delta modulation (CVSD)

Quantize difference rather than sample
DM is 1-bit DPCM
6 dB improvement on SNR
No single predictor can be optimal

Adaptive step size
Coder: Feed Forward
5 dB + 6 dB improvement
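
The idea of quantizing the difference rather than the sample can be sketched with a first-order differential coder. This is an illustration under stated assumptions, not the coder from the slides: a fixed predictor coefficient of 0.9, a 4-bit mid-riser quantizer with a fixed step size, and a 200 Hz test tone sampled at 8 kHz. All names are hypothetical.

```python
import math

def dpcm_encode(x, delta, n_bits, a=0.9):
    """Quantize d(n) = x(n) - a * x_rec(n-1) instead of x(n) itself."""
    half = 2 ** (n_bits - 1)
    codes, prev_rec = [], 0.0
    for sample in x:
        prediction = a * prev_rec
        k = math.floor((sample - prediction) / delta)
        k = max(-half, min(half - 1, k))           # mid-riser code word
        prev_rec = prediction + (k + 0.5) * delta  # track the decoder's output
        codes.append(k)
    return codes

def dpcm_decode(codes, delta, a=0.9):
    """Rebuild x_rec(n) = a * x_rec(n-1) + quantized difference."""
    out, prev_rec = [], 0.0
    for k in codes:
        prev_rec = a * prev_rec + (k + 0.5) * delta
        out.append(prev_rec)
    return out

Fs = 8000.0                                        # assumed sampling rate in Hz
x = [math.sin(2 * math.pi * 200.0 * n / Fs) for n in range(400)]
delta, n_bits = 0.05, 4                            # assumed step size and word length
x_rec = dpcm_decode(dpcm_encode(x, delta, n_bits), delta)
max_error = max(abs(u - v) for u, v in zip(x, x_rec))
```

Because adjacent samples are highly correlated, the difference signal spans a much smaller range than the signal itself, so the same number of bits can use a finer step size.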


Decoder
PCM Speech(2)
Companding Example: 5 bits per sample (1-bit polarity, 2-bit segment code, and 2-bit quantization code)
[Figure: signal range from -V to +V; polarity bit 1 for the positive half and 0 for the negative half; segment codes 00 to 11 in each half, each divided into linear quantization intervals coded 00 to 11; annotation: narrower intervals for smaller amplitude]

PCM Speech(3)
Companding Example: 5 bits per sample (1-bit polarity, 2-bit segment code, and 2-bit quantization code)
[Figure: same layout as PCM Speech(2); annotation: wider intervals for smaller amplitude]