Robust Pitch Detection Using DCT Based Spectral Autocorrelation

Under the guidance of
Dr. Rajesh Kumar Dubey
Submitted by:
Sudhakar Rai (15316004)
Nikhil Singh Gaur (15316024)
Pitch can be defined as the extent to which a sound is high or low.
Pitch is the perceived fundamental frequency of sound.
Pitch detection is the task of estimating the fundamental frequency of a voice signal. It is very important in several voice-processing tasks and is a crucial step in singing-voice separation. Pitch detection also plays an important role in music information retrieval, singer identification and lyric recognition.
Pitch can help identify the gender of a singing voice.
Pitch can also help determine the time, or time slot, of a voice recording.
Techniques in pitch extraction
Time domain approaches
(1) ACF (autocorrelation function) and MACF (modified autocorrelation function)
(2) NCCF (normalized cross-correlation function)
(3) AMDF (average magnitude difference function)
Frequency domain approaches
(4) CPD (cepstrum pitch determination)
(5) DCT (discrete cosine transform) based spectral autocorrelation
Method 1: Autocorrelation function (ACF)
By definition, the autocorrelation is

$$R(m) = \lim_{N \to \infty} \frac{1}{2N+1} \sum_{n=-N}^{N} x(n)\, x(n+m), \qquad 0 \le m \le M_0$$

R is symmetric for positive and negative n, so only n ≥ 0 is used:

$$R(m) = \frac{1}{N} \sum_{n=0}^{N-1-m} x(n)\, x(n+m), \qquad 0 \le m \le M_0$$
What is autocorrelation, R(m)?
E.g.
x = [1 5 7 1 4], N = 5 (the 1/N scaling factor is omitted in this example)
R(0) = x(0)*x(0) + x(1)*x(1) + x(2)*x(2) + x(3)*x(3) + x(4)*x(4)
R(0) = 1 + 25 + 49 + 1 + 16 = 92
R(1) = x(0)*x(1) + x(1)*x(2) + x(2)*x(3) + x(3)*x(4)
x         = [1 5 7 1 4]
x shifted =   [1 5 7 1 4]
R(1) = 5 + 35 + 7 + 4 = 51
And so on:
R = [92 51 40 21 4]
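The worked example above can be checked with a few lines of Python. This is only a minimal sketch: the function name `autocorrelation` and the `normalize` flag are illustrative and do not come from the slides. With `normalize=False` it reproduces the unscaled sums of the example; with `normalize=True` it divides each sum by N, as in the definition.

```python
def autocorrelation(x, normalize=False):
    """Return R(m) for m = 0 .. len(x) - 1 using only non-negative lags."""
    n = len(x)
    r = []
    for m in range(n):
        # Sum of products of the sequence with a copy of itself shifted by m samples.
        total = sum(x[i] * x[i + m] for i in range(n - m))
        r.append(total / n if normalize else total)
    return r


x = [1, 5, 7, 1, 4]
print(autocorrelation(x))  # [92, 51, 40, 21, 4]
```

For a periodic signal, R(m) has a strong peak at the lag equal to the period, which is why the ACF is used as a time-domain pitch detector.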
Importance of linear prediction analysis in speech
The speech signal is produced by the convolution of the excitation source and the time-varying vocal tract system components.
These excitation and vocal tract components must be separated from the available speech signal so that they can be studied independently. Linear prediction (LP) analysis was developed to deconvolve the given speech into its excitation and vocal tract system components.
The speech samples s(n) are related to the excitation u(n) by the simple difference equation

$$s(n) = \sum_{k=1}^{p} a_k\, s(n-k) + G\, u(n)$$

where p is the order of the model and G is the gain.
Between the pitch pulses Gu(n) is zero, so the present speech sample is predicted from a linearly weighted sum of the past speech samples.
The excitation is zero between pitch pulses, so u(n) = 0 there.
We process the speech signal through the linear predictor with predictor coefficients \alpha_k, and the output is:

$$\hat{s}(n) = \sum_{k=1}^{p} \alpha_k\, s(n-k)$$

The error between the actual signal and the predicted value is given by:

$$e(n) = s(n) - \hat{s}(n) = s(n) - \sum_{k=1}^{p} \alpha_k\, s(n-k)$$

e(n) consists of a train of impulses. Before computing the spectral autocorrelation function we perform linear prediction analysis, so that the residual of the LP analysis is a train of impulses whose spectral envelope is flat.
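The LP preprocessing can be sketched as follows. The slides do not say how the predictor coefficients are obtained, so this sketch assumes the autocorrelation method solved with the Levinson-Durbin recursion. All function names, the second-order synthetic "vocal tract" (coefficients 1.3 and -0.6) and the 40-sample pitch period are illustrative assumptions, not values from the slides.

```python
def autocorr(x, max_lag):
    """Unnormalized autocorrelation R(0) .. R(max_lag)."""
    n = len(x)
    return [sum(x[i] * x[i + m] for i in range(n - m)) for m in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations; returns predictor coefficients alpha_1 .. alpha_p."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k  # prediction error shrinks at every order
    return a[1:]


def lp_residual(s, order):
    """Return (alpha, e) where e(n) = s(n) - sum_k alpha_k * s(n - k)."""
    alpha = levinson_durbin(autocorr(s, order), order)
    e = []
    for n in range(len(s)):
        pred = sum(alpha[k] * s[n - 1 - k] for k in range(order) if n - 1 - k >= 0)
        e.append(s[n] - pred)
    return alpha, e


def synth_voiced(n_samples=400, period=40, a1=1.3, a2=-0.6):
    """Hypothetical test signal: an impulse train filtered by a 2nd-order all-pole model."""
    s = []
    for n in range(n_samples):
        u = 1.0 if n % period == 0 else 0.0
        s1 = s[n - 1] if n >= 1 else 0.0
        s2 = s[n - 2] if n >= 2 else 0.0
        s.append(a1 * s1 + a2 * s2 + u)
    return s


signal = synth_voiced()
alpha, residual = lp_residual(signal, order=2)
print(alpha)                       # close to the assumed model coefficients
print(residual[40], residual[41])  # large at a pitch pulse, near zero right after it
```

On this idealized signal the residual is close to the original impulse train, which is the behaviour the slide describes. Real speech would need a higher predictor order and a windowed frame.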
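Example 2: Discrete Cosine Transform (DCT)

The N × N DCT matrix is

$$A_{N \times N} = [a_{kl}] = \begin{bmatrix} a_{11} & \cdots & a_{1N} \\ \vdots & \ddots & \vdots \\ a_{N1} & \cdots & a_{NN} \end{bmatrix}$$

with entries

$$a_{kl} = \begin{cases} \sqrt{\dfrac{1}{N}}, & k = 1,\ 1 \le l \le N \\[2ex] \sqrt{\dfrac{2}{N}}\, \cos\dfrac{(2l-1)(k-1)\pi}{2N}, & 2 \le k \le N,\ 1 \le l \le N \end{cases}$$

Writing C for this matrix, it is real and orthogonal:

$$C = C^{*}, \qquad C^{-1} = C^{T}$$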
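As a quick numerical check of the orthogonality property, the sketch below builds the standard orthonormal DCT-II matrix (first row scaled by sqrt(1/N), remaining rows by sqrt(2/N)) and multiplies it by its own transpose. The helper names `dct_matrix` and `gram` are illustrative, and only the standard library is used.

```python
import math


def dct_matrix(n):
    """Orthonormal N x N DCT-II matrix with 1-based indices k (row) and l (column)."""
    rows = []
    for k in range(1, n + 1):
        scale = math.sqrt(1.0 / n) if k == 1 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * l - 1) * (k - 1) * math.pi / (2 * n))
                     for l in range(1, n + 1)])
    return rows


def gram(a):
    """Return A * A^T, which should be the identity for an orthogonal matrix."""
    n = len(a)
    return [[sum(a[i][t] * a[j][t] for t in range(n)) for j in range(n)]
            for i in range(n)]


product = gram(dct_matrix(8))
max_dev = max(abs(product[i][j] - (1.0 if i == j else 0.0))
              for i in range(8) for j in range(8))
print(max_dev)  # deviation from the identity, at floating-point rounding level
```

Because the inverse is just the transpose, the inverse DCT costs no more than the forward transform.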
However, the Fourier transform has strong disadvantages for some applications:
it is complex-valued
it has poor energy compaction
Energy compaction is the ability to pack the energy of the input sequence into as few frequency coefficients as possible.
If compaction is high, we only have to transmit a few coefficients.
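Energy compaction can be illustrated on a small example. The sketch below compares the fraction of total energy held by the three largest coefficients of an orthonormal DCT-II and of a unitary DFT for a 16-point ramp. The ramp, the choice of three coefficients and the helper names are assumptions made for illustration; the result for one test sequence is not a general proof.

```python
import cmath
import math


def dct2(x):
    """Orthonormal DCT-II of a real sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x[t] * math.cos((2 * t + 1) * k * math.pi / (2 * n))
                               for t in range(n)))
    return out


def dft(x):
    """Unitary DFT (scaled by 1/sqrt(N)) so that both transforms preserve energy."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / math.sqrt(n)
            for k in range(n)]


def top_k_energy_fraction(coeffs, k):
    """Fraction of the total energy carried by the k largest-magnitude coefficients."""
    energies = sorted((abs(c) ** 2 for c in coeffs), reverse=True)
    return sum(energies[:k]) / sum(energies)


ramp = [float(t) for t in range(16)]
dct_fraction = top_k_energy_fraction(dct2(ramp), 3)
dft_fraction = top_k_energy_fraction(dft(ramp), 3)
print(dct_fraction, dft_fraction)
```

For this ramp the DCT fraction comes out above 0.99 while the DFT fraction stays below 0.95, and the DCT coefficients are real while the DFT coefficients are complex.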
Algorithm
1. Record a speech signal.
2. Preprocess the speech signal through linear prediction analysis to flatten the spectrum.
3. Use a frame size of 20 ms with an overlap of 10 ms to obtain a pitch contour.
4. Compute the DCT magnitude spectrum of each analysis frame. The DCT spectrum is smoothed by the following window:
   W(k) = 1                            for 0 ≤ k ≤ N/2
   W(k) = 0.5*(1 - cos(2*pi*k/N))      for N/2 < k < N
5. Apply the spectral autocorrelation function (SAF) to the smoothed DCT spectrum.
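Steps 3 to 5 can be sketched as below. This is a rough illustration under several assumptions that are not in the slides: the LP preprocessing of step 2 is skipped, an orthonormal DCT-II is used (so bin k corresponds to k*fs/(2N) Hz), the pitch search range is limited to 50 to 500 Hz, and the test signal is an idealized sum of eight harmonics of 200 Hz with a half-sample phase offset so that every harmonic falls exactly on a DCT bin. All function names are hypothetical.

```python
import math


def dct_magnitude(frame):
    """Magnitude of the orthonormal DCT-II of one frame."""
    n = len(frame)
    mags = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        c = scale * sum(frame[t] * math.cos((2 * t + 1) * k * math.pi / (2 * n))
                        for t in range(n))
        mags.append(abs(c))
    return mags


def smoothing_window(n):
    """W(k) = 1 up to N/2, then a raised-cosine taper down towards zero at k = N."""
    return [1.0 if k <= n / 2 else 0.5 * (1.0 - math.cos(2.0 * math.pi * k / n))
            for k in range(n)]


def spectral_autocorrelation(spec, max_lag):
    """SAF: autocorrelation of the spectrum along the frequency axis."""
    n = len(spec)
    return [sum(spec[k] * spec[k + lag] for k in range(n - lag))
            for lag in range(max_lag + 1)]


def estimate_f0(frame, fs, f_min=50.0, f_max=500.0):
    """Pick the SAF peak inside the assumed search range and convert the lag to Hz."""
    n = len(frame)
    spec = [m * w for m, w in zip(dct_magnitude(frame), smoothing_window(n))]
    hz_per_bin = fs / (2.0 * n)  # DCT-II bin spacing
    lo = max(1, math.ceil(f_min / hz_per_bin))
    hi = min(n - 1, math.floor(f_max / hz_per_bin))
    saf = spectral_autocorrelation(spec, hi)
    best_lag = max(range(lo, hi + 1), key=lambda lag: saf[lag])
    return best_lag * hz_per_bin


def pitch_contour(signal, fs, frame_ms=20, hop_ms=10):
    """One f0 estimate per 20 ms frame, frames advanced by 10 ms."""
    size = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    return [estimate_f0(signal[i:i + size], fs)
            for i in range(0, len(signal) - size + 1, hop)]


fs = 8000
test_signal = [sum(math.cos(2.0 * math.pi * 200.0 * h * (t + 0.5) / fs) for h in range(1, 9))
               for t in range(400)]
contour = pitch_contour(test_signal, fs)
print(contour)
```

The harmonics of a voiced frame are spaced f0 apart, so the spectrum correlates strongly with itself when shifted by that spacing; the lag of the SAF peak therefore gives the fundamental frequency. On real speech the peaks are broader and the frequency resolution is limited to fs/(2N) per bin.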
Result:
The algorithm was tested on a number of speech segments and was found to be a robust tool for obtaining a good estimate of the fundamental frequency, as shown in the graph.
THANK YOU
ANY QUESTIONS?
