Lecture 1:
Course Introduction
1. Course Structure
2. DSP: The Short-Time Fourier Transform
Dan Ellis
http://www.ee.columbia.edu/~dpwe/e4896/
2014-01-22 - 1 /21
[Figure: stacked time-frequency panels; axes: freq / kHz (0 to 4) vs. time / sec (10 to 20)]
1. Course Goals
Survey of applications
Course Structure
Monday:
presentations, practical exposition, discussion
Wednesday:
presentations, practical, sharing
Grade structure
20% practicals participation
10% one presentation
30% three mini-project assignments
40% one final project
Flipped Classroom
Hands-on Practicals
Matlab
Python
Processing
Projects
Project Examples
Mood Classification
Beat Tracking
2.2 Tempo Selection
Each of the six onset pulse signals (one for each band) is fed into its own
bank of 150 comb filters; you can think of these filters as mapping to the 150
possible tempos that we want to detect. A comb filter is essentially just a delay
implemented as a circular buffer. The delay time of each filter corresponds to
the tempo that the filter is meant to detect. For example, the filter that
detects a tempo of 90 bpm corresponds to a beat rate of 90/60 = 1.5 Hz, which
yields a buffer length of 44100/1.5 = 29400 samples at a 44.1 kHz sampling rate.
The tempo selector then, for each bpm, sums the energy across all 6 subbands at
that tempo. The tempo with the highest total energy is chosen as the tempo of
the piece.
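The tempo-selection idea in this excerpt can be sketched in a few lines of NumPy. This is only an illustration, not the project's PD external: the feedback form of the comb filter, the gain `alpha`, and the names `comb_filter_energy` and `select_tempo` are all assumptions made for the example.

```python
import numpy as np

def comb_filter_energy(x, delay, alpha=0.9):
    """Energy of the feedback comb filter y[n] = (1 - alpha) x[n] + alpha y[n - delay]."""
    y = (1.0 - alpha) * np.asarray(x, dtype=float)
    # Work in blocks of `delay` samples: each block only needs the block before it.
    for start in range(delay, len(y), delay):
        stop = min(start + delay, len(y))
        y[start:stop] += alpha * y[start - delay:stop - delay]
    return float(np.sum(y ** 2))

def select_tempo(bands, fs, tempos_bpm, alpha=0.9):
    """Return the candidate tempo whose comb filters collect the most energy."""
    totals = []
    for bpm in tempos_bpm:
        delay = int(round(fs * 60.0 / bpm))  # one beat period, in samples
        # Sum the filter output energy over all subbands at this tempo.
        totals.append(sum(comb_filter_energy(b, delay, alpha) for b in bands))
    return tempos_bpm[int(np.argmax(totals))]

# Buffer length for the 90 bpm filter at 44.1 kHz, as in the text.
print(round(44100 * 60 / 90))  # 29400
```

A filter whose delay matches the beat period reinforces every pulse, so its output energy grows, while mismatched delays scatter the pulses. Filters at half or double the true tempo also resonate to some extent, which is one reason the candidate range matters.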
Because of the sheer number of filters (150 × 6 = 900), it was impossible to
implement this using standard PD objects. Therefore I created a PD external
written in C [3] and used it within my PD patch. The external has six inputs,
one for each subband, and two float outlets, one indicating the detected
tempo and the other indicating the phase. I added a signal outlet as well for
Music Transcription
Presentations
Encourage discussion
Web Site
http://www.ee.columbia.edu/~dpwe/e4896/
Course Outline
Jan 22: DSP
Anything missing?
E4896 Music Signal Processing (Dan Ellis)
2. Digital Signals
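Discrete-time sampling limits bandwidth
Discrete-level quantization limits dynamic range
$x_d[n] = Q(x_c(nT))$
Sampling interval $T$
Sampling frequency $\Omega_T = 2\pi / T$
Quantizer $Q(x) = \epsilon \cdot \mathrm{round}(x / \epsilon)$
[Figure: sampled and quantized waveform vs. time, marking the sampling interval $T$ and the quantization step $\epsilon$]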
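As a rough illustration of the two steps on this slide, the sketch below samples a continuous-time function every T seconds and rounds each sample to a multiple of a step size. The 8 kHz rate, the step `eps`, the 440 Hz test tone, and the function name are assumptions chosen for the example.

```python
import numpy as np

def sample_and_quantize(xc, T, num_samples, eps):
    """Return x_d[n] = Q(x_c(nT)) with Q(x) = eps * round(x / eps)."""
    n = np.arange(num_samples)
    samples = xc(n * T)                    # discrete-time sampling
    return eps * np.round(samples / eps)   # discrete-level quantization

fs = 8000.0        # assumed sampling rate in Hz, so T = 1 / fs
eps = 2.0 ** -7    # assumed quantization step
tone = lambda t: np.sin(2 * np.pi * 440.0 * t)  # hypothetical input signal
xd = sample_and_quantize(tone, 1.0 / fs, 64, eps)
print(xd[:4])
```

With a rounding quantizer the error on each sample is at most half a step, so a smaller `eps` means a wider dynamic range for the same peak level.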
Fourier Series
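Observation: a periodic signal can be built as a sum of sinusoids
$x(t) = x(t + T) \;\Rightarrow\; x(t) \approx \sum_{k=0}^{M} a_k \cos\left(\frac{2\pi k}{T} t + \phi_k\right)$
Example (square wave): $\phi_k = 0$; $a_k = (-1)^{(k-1)/2} \, \frac{1}{k}$ for $k = 1, 3, 5, \ldots$, and $a_k = 0$ otherwise
[Figure: $x(t)$ vs. $t$, and $|a_k|$ vs. $k$ for $k = 1, \ldots, 7$]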
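The series on this slide can be tried out numerically. The sketch below sums the odd cosine harmonics with alternating signs and 1/k amplitudes. The function name and the choice of M are assumptions for illustration, and the resulting wave has plateau values of about ±π/4 rather than ±1.

```python
import numpy as np

def square_wave_partial_sum(t, T, M):
    """Sum a_k cos(2 pi k t / T) over odd k <= M, with a_k = (-1)^((k-1)/2) / k."""
    t = np.asarray(t, dtype=float)
    x = np.zeros_like(t)
    for k in range(1, M + 1, 2):               # a_k = 0 for even k
        a_k = (-1.0) ** ((k - 1) // 2) / k
        x += a_k * np.cos(2 * np.pi * k * t / T)
    return x

t = np.linspace(-0.5, 0.5, 11)
print(np.round(square_wave_partial_sum(t, T=1.0, M=199), 2))
```

Raising M sharpens the edges of the wave, although the overshoot near each jump (the Gibbs effect) does not go away.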
Fourier Transform
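$x(t) \approx \sum_{k=-M}^{M} c_k \, e^{j \frac{2\pi k}{T} t}$
$c_k = \frac{1}{T} \int_{-T/2}^{T/2} x(t) \, e^{-j \frac{2\pi k}{T} t} \, dt$
Let $T \to \infty$:
$X(j\Omega) = \int_{-\infty}^{\infty} x(t) \, e^{-j\Omega t} \, dt$
[Figure: $x(t)$ vs. time / sec (0 to 0.008); $|X(j\Omega)|$, level / dB (-80 to -20) vs. freq / Hz (0 to 8000)]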
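The coefficient integral can be approximated by averaging over a uniform grid spanning one period. The sketch below does this for a single cosine, where the only nonzero coefficients should be at k = ±1 with value 1/2. The period, the grid size, and the function name are assumptions for illustration.

```python
import numpy as np

def fourier_coefficient(x, T, k, num_points=4096):
    """Approximate c_k = (1/T) * integral over one period of x(t) e^{-j 2 pi k t / T} dt."""
    # Uniform grid over [-T/2, T/2); the mean is a Riemann sum divided by T.
    t = -T / 2 + T * np.arange(num_points) / num_points
    return np.mean(x(t) * np.exp(-2j * np.pi * k * t / T))

T = 0.01  # assumed period in seconds (a 100 Hz fundamental)
cosine = lambda t: np.cos(2 * np.pi * t / T)
for k in (-1, 0, 1, 2):
    print(k, np.round(fourier_coefficient(cosine, T, k), 6))
```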
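$x(t) = x(t + T) \;\Rightarrow\; X(j\Omega) = 2\pi \sum_k c_k \, \delta\left(\Omega - \frac{2\pi k}{T}\right)$
[Figure: $x(t)$, amplitude (-0.1 to 0.1) vs. time / s (1.52 to 1.58); $X(j\Omega)$, level / dB (-100 to -40) vs. freq / Hz (1000 to 4000)]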
n.b.: |X(j
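$x[n] = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(e^{j\omega}) \, e^{j\omega n} \, d\omega$
$X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x[n] \, e^{-j\omega n}$
[Figure: $\arg\{X(e^{j\omega})\}$ (phase plot)]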
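For a finite-length sequence, the analysis sum above can be evaluated directly on any grid of frequencies. This is a minimal sketch that assumes x[n] is zero outside n = 0 ... len(x)-1. The function name `dtft` and the four-sample example are illustrative choices.

```python
import numpy as np

def dtft(x, omega):
    """Evaluate X(e^{j omega}) = sum_n x[n] e^{-j omega n} for n = 0 .. len(x)-1."""
    n = np.arange(len(x))
    # One row per frequency, one column per time index.
    return np.exp(-1j * np.outer(omega, n)) @ np.asarray(x, dtype=complex)

omega = np.linspace(-np.pi, np.pi, 9)
X = dtft([1.0, 1.0, 1.0, 1.0], omega)
print(np.round(np.abs(X), 3))    # magnitude
print(np.round(np.angle(X), 3))  # phase, arg{X(e^{j omega})}
```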
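$X[k] = \sum_{n=0}^{N-1} x[n] \, W_N^{nk}$
$x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k] \, W_N^{-nk}$
$W_N = e^{-j \frac{2\pi}{N}}$
$$
\begin{bmatrix} X[0] \\ X[1] \\ X[2] \\ \vdots \\ X[N-1] \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & W_N^{1} & W_N^{2} & \cdots & W_N^{(N-1)} \\
1 & W_N^{2} & W_N^{4} & \cdots & W_N^{2(N-1)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & W_N^{(N-1)} & W_N^{2(N-1)} & \cdots & W_N^{(N-1)^2}
\end{bmatrix}
\begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ \vdots \\ x[N-1] \end{bmatrix}
$$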
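The matrix form of the DFT is easy to check numerically. The sketch below assumes the common convention W_N = e^{-j 2π/N} with positive exponents in the analysis matrix, which is also what `np.fft.fft` uses. The test vector and the name `dft_matrix` are arbitrary choices for the example.

```python
import numpy as np

def dft_matrix(N):
    """N x N matrix with entries W_N^{nk}, where W_N = e^{-j 2 pi / N}."""
    n = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(n, n) / N)

x = np.array([1.0, 2.0, 0.0, -1.0])
D = dft_matrix(len(x))
X = D @ x                          # analysis: X[k] = sum_n x[n] W_N^{nk}
x_back = (D.conj() @ X) / len(x)   # synthesis: x[n] = (1/N) sum_k X[k] W_N^{-nk}
print(np.allclose(X, np.fft.fft(x)), np.allclose(x_back, x))
```

The matrix product costs on the order of N² operations, so in practice an FFT routine is used for anything but small N.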
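$X[k, m] = \sum_{n=0}^{N-1} x[n + mL] \, w[n] \, e^{-j \frac{2\pi k n}{N}}$
[Figure: a short-time window slides along the waveform (time / s, 2.35 to 2.6) in hops of $L$ (positions $L$, $2L$, $3L$); each windowed frame $m = 0, 1, 2, 3$ goes through a DFT to give one column of the freq / Hz (0 to 4000) display]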
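A short-time Fourier transform along these lines can be sketched with NumPy by cutting the signal into overlapping frames, applying a window, and taking the DFT of each frame. The Hann window default, the 8 kHz sampling rate, and the function name `stft` are assumptions, and the sketch keeps only frames that fit entirely inside the signal.

```python
import numpy as np

def stft(x, N, L, window=None):
    """X[k, m] = sum_{n=0}^{N-1} x[n + mL] w[n] e^{-j 2 pi k n / N}."""
    x = np.asarray(x, dtype=float)
    w = np.hanning(N) if window is None else window   # assumed default window
    num_frames = 1 + (len(x) - N) // L                # frames that fit entirely
    frames = np.stack([x[m * L : m * L + N] * w for m in range(num_frames)], axis=1)
    return np.fft.fft(frames, axis=0)                 # rows: k, columns: m

fs = 8000                                             # assumed sampling rate in Hz
t = np.arange(fs) / fs
X = stft(np.sin(2 * np.pi * 1000 * t), N=256, L=128)
print(X.shape)
```

For a real signal, only the first N/2 + 1 rows carry independent information. Plotting 20 log10 of their magnitude against frame index gives the usual spectrogram image.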
The Spectrogram
[Figure: spectrogram panels; axes: freq / Hz (0 to 4000) vs. time / s; color scale: intensity / dB (-50 to 10)]
Time-Frequency Tradeoff
Shorter window
Window = 48 pt: Wideband
Window = 256 pt: Narrowband
[Figure: two spectrograms of the same signal; axes: freq / Hz (0 to 4000) vs. time / s (1.4 to 2.6); color scale: level / dB (-50 to 10)]
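The tradeoff can be put in numbers: an N-point window spans N / fs seconds, and its DFT bins are fs / N Hz apart. The sketch below assumes an 8 kHz sampling rate, which is only a guess based on the 0 to 4000 Hz frequency axis. The function name is illustrative.

```python
def window_resolution(num_points, fs):
    """Return (window duration in s, DFT bin spacing in Hz) for an N-point window."""
    return num_points / fs, fs / num_points

fs = 8000.0  # assumption: the 0-4000 Hz axis suggests an 8 kHz sampling rate
for n_pts in (48, 256):
    duration, spacing = window_resolution(n_pts, fs)
    print(f"{n_pts:3d} pt: {duration * 1e3:.1f} ms window, {spacing:.1f} Hz bin spacing")
```

Under that assumption, the 48-point window covers 6 ms with bins about 167 Hz apart (fine in time, coarse in frequency), while the 256-point window covers 32 ms with bins about 31 Hz apart.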
Spectrogram in Matlab
http://www.wakayama-u.ac.jp/~kawahara/MatlabRealtimeSpeechTools/
Spectrogram in Processing
Summary + Assignment
Course
participation, practicals, projects, presentations
Digital Signal Processing
signals on computers
Fourier analysis & spectrogram
Assignment