
ELEN E4896 MUSIC SIGNAL PROCESSING

Lecture 10:
Beat Tracking

1. Rhythm Perception
2. Onset Extraction
3. Beat Tracking
4. Dynamic Programming

Dan Ellis

Dept. Electrical Engineering, Columbia University


dpwe@ee.columbia.edu
E4896 Music Signal Processing (Dan Ellis)

http://www.ee.columbia.edu/~dpwe/e4896/
2013-04-01 - 1 /19

1. Rhythm Perception

What is rhythm?
aspects, origin?


Rhythm Perception Experiments

McKinney & Moelants 2006

Tapping experiment
ambiguous; hierarchy

[Figure: mirex06/train10 (Harnoncourt) example, plotted against time / sec (10 to 30 sec), with a Frequency axis]


Rhythm Tracking Systems

Two main components

front end: extract events from audio


back end: find plausible beat sequence to match

[Block diagram: Audio -> Onset/Event detection -> Beat marking -> Beat times etc., with Musical knowledge as a further input]

Other outputs
tempo
time signature
metrical level(s)


2. Onset detection
Simplest thing is
energy envelope

Bello et al. 2005

$e(n_0) = \sum_{n=-W/2}^{W/2} w[n] \, |x(n + n_0)|^2$

emphasis on
high frequencies?
$\sum_f |X(f, t)|$ vs. $\sum_f f \, |X(f, t)|$

[Figure: spectrograms (freq / kHz) and the two onset envelopes (level / dB) against time / sec for the Harnoncourt and Maracatu examples]

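The energy envelope defined above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's code: the function name `energy_envelope`, the Hann-shaped window, and the `win_len` and `hop` parameters are all assumptions made for the example.

```python
import math

def energy_envelope(x, win_len=8, hop=4):
    """Short-time energy e(n0) = sum_n w[n] * |x(n + n0)|^2.

    x is a list of samples; the window covers n = -win_len/2 .. win_len/2.
    The Hann-like window shape is an assumption made for this sketch.
    """
    half = win_len // 2
    # Raised-cosine window, strictly positive over the whole span
    w = [0.5 + 0.5 * math.cos(math.pi * n / (half + 1))
         for n in range(-half, half + 1)]
    env = []
    # Slide the window centre n0 along the signal in steps of `hop`
    for n0 in range(half, len(x) - half, hop):
        env.append(sum(w[k] * x[n0 - half + k] ** 2 for k in range(len(w))))
    return env
```

Onsets then show up as rises in the returned envelope; a detector would typically look at its (half-wave rectified) difference from frame to frame.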

Multiband Derivative

Puckette et al. 1998

Sometimes energy just shifts

calculate & sum onset in multiple bands


use ratio instead of difference - normalize energy

freq / kHz

$o(t) = \sum_f W(f) \, \max\left(0, \frac{|X(f, t)|}{|X(f, t-1)|} - 1\right)$

bonk~

[Figure: spectrograms (freq / kHz) and the multiband onset function (level / dB) against time / sec]

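A small Python sketch of the ratio-based multiband onset function may help. It assumes the band magnitudes |X(f, t)| have already been computed and are passed in as a list of frames; the name `multiband_onset` and the `floor` guard against division by zero are assumptions for this example and are not taken from bonk~.

```python
def multiband_onset(mags, weights=None, floor=1e-6):
    """o(t) = sum_f W(f) * max(0, |X(f,t)| / |X(f,t-1)| - 1).

    mags: list of frames, each a list of per-band magnitudes.
    weights: per-band weights W(f); defaults to all ones.
    floor: lower bound on the denominator (an assumption of this sketch).
    """
    n_bands = len(mags[0])
    if weights is None:
        weights = [1.0] * n_bands
    onset = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(mags, mags[1:]):
        total = 0.0
        for w, p, c in zip(weights, prev, cur):
            # Only increases count: half-wave rectify the relative change
            total += w * max(0.0, c / max(p, floor) - 1.0)
        onset.append(total)
    return onset
```

For instance, when energy moves from one band to another (frames `[2, 1]` then `[1, 2]`), the summed magnitude is unchanged but this function still reports an onset, which is the case the slide describes as energy that "just shifts".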

Phase Deviation

Bello et al. 2005

When amplitudes don't change much,
phase discontinuity may signal new note

[Figure: waveform excerpt, 8.96 to 9.06 sec]

Can detect by comparing
actual phase with
extrapolation from past

$\hat{X}(f, t_{n+1}) = X(f, t_n) \, \frac{X(f, t_n)}{X(f, t_{n-1})}$

combine with amplitude?

[Figure: complex plane (Re{X}, Im{X}) showing X(f, t_{n-1}), X(f, t_n), the prediction \hat{X}(f, t_{n+1}), the actual X(f, t_{n+1}), and the complex deviation between prediction and actual]

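The prediction-versus-actual comparison can be sketched as follows. This is an illustrative reading of the slide's formula rather than a reference implementation: the helper names `predict_next` and `complex_deviation` are hypothetical, and bins are assumed to be nonzero so the division is defined.

```python
import cmath

def predict_next(x_prev, x_cur):
    """Extrapolate X(f, t_{n+1}) = X(f, t_n) * X(f, t_n) / X(f, t_{n-1})."""
    return x_cur * x_cur / x_prev

def complex_deviation(frames):
    """Per-frame sum over bins of |actual - prediction|.

    frames: list of frames, each a list of nonzero complex bin values.
    The result has one entry per frame from the third frame onward.
    """
    dev = []
    for prev, cur, nxt in zip(frames, frames[1:], frames[2:]):
        dev.append(sum(abs(a - predict_next(p, c))
                       for p, c, a in zip(prev, cur, nxt)))
    return dev
```

Because the difference is taken between complex values, it responds to both phase jumps and amplitude changes, which is one way to "combine with amplitude". A steady sinusoid, whose phase advances by a fixed amount per frame, gives a deviation near zero.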

3. Rhythm Tracking

Desain & Honing 1999

Earliest systems were rule based

based on musicology Longuet-Higgins and Lee, 1982


inspired by linguistic grammars - Chomsky

input: event sequence (MIDI)


output: quarter notes, downbeats

Resonators

Scheirer 1998

How to address:
build-up of rhythmic evidence
ghost events (audio input)

Seems more like a comb filter...
resonant filterbank of
$y(t) = \alpha \, y(t - T) + (1 - \alpha) \, x(t)$
for all possible T

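A toy version of the resonator idea: run the comb filter for each candidate period T and keep the one with the most output energy. This is a simplified sketch under stated assumptions (a single onset envelope as input, one fixed alpha for every T, hypothetical function names); Scheirer's system is considerably more elaborate.

```python
def comb_filter_energy(x, period, alpha=0.8):
    """Run y(t) = alpha * y(t - T) + (1 - alpha) * x(t); return sum of y^2."""
    y = [0.0] * len(x)
    for t in range(len(x)):
        # Feedback from one period ago (zero before the filter has history)
        feedback = y[t - period] if t >= period else 0.0
        y[t] = alpha * feedback + (1.0 - alpha) * x[t]
    return sum(v * v for v in y)

def best_period(x, periods, alpha=0.8):
    """Pick the candidate period whose resonator accumulates most energy."""
    return max(periods, key=lambda T: comb_filter_energy(x, T, alpha))
```

The feedback term is what gives the build-up of evidence: output at a matching period grows over successive beats, and it keeps ringing for a while when an expected event is missing.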

Multi-Hypothesis Systems
Goto & Muraoka 1994
Goto 2001
Dixon 2001

Beat is ambiguous
develop several alternatives

[Figure: block diagram of a multi-agent beat tracking system. Blocks: Compact disc, Musical audio signals, A/D conversion, Frequency analysis (Frequency spectrum), Onset-time finders, Onset-time vectorizers, Drum-sound finder, Agents, Higher-level checkers, Manager, Beat prediction, Beat information transmission, Beat information]

inputs: music audio


outputs: beat times, downbeats, BD/SD patterns...

4. Dynamic Programming

Re-cast beat tracking as optimization:

Ellis 2007

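Find beat times {t_i} to maximize

$C(\{t_i\}) = \sum_{i=1}^{N} O(t_i) + \alpha \sum_{i=2}^{N} F(t_i - t_{i-1}, \tau_p)$

O(t) is onset strength function
F(Δt, τ) is tempo consistency score, e.g.
$F(\Delta t, \tau) = -\left( \log \frac{\Delta t}{\tau} \right)^2$
α is the weight on the tempo consistency term; τ_p is the target beat period

Looks like an exponential search over all {t_i}
... but Dynamic Programming saves us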

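To make the objective concrete, here is a short Python sketch that scores a given beat sequence. The function names `tempo_consistency` and `beat_score` are chosen for this example, and beat times are assumed to be integer frame indices into the onset envelope.

```python
import math

def tempo_consistency(delta_t, tau):
    """F(dt, tau) = -(log(dt / tau))^2: zero when dt == tau, and equally
    negative for an interval twice too long or twice too short."""
    return -(math.log(delta_t / tau) ** 2)

def beat_score(beats, onset, tau_p, alpha=100.0):
    """C({t_i}) = sum_i O(t_i) + alpha * sum_i F(t_i - t_{i-1}, tau_p).

    beats: increasing frame indices; onset: list of onset strengths;
    tau_p: target beat period in frames.
    """
    local = sum(onset[t] for t in beats)
    transition = sum(tempo_consistency(b - a, tau_p)
                     for a, b in zip(beats, beats[1:]))
    return local + alpha * transition
```

Scoring one sequence is cheap; the difficulty is that the number of possible sequences grows exponentially with the length of the signal, which is what dynamic programming avoids.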

Dynamic Programming (DP)

DP is a general algorithm for optimizing


optimal substructure problems

i.e. where optimal total solution can be built from


optimal partial solutions

e.g. best path through cost matrix


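$D(i,j) = d(i,j) + \min \{ D(i-1,j) + T_{10}, \; D(i,j-1) + T_{01}, \; D(i-1,j-1) + T_{11} \}$

D(i,j): lowest cost to (i,j)
d(i,j): local match cost
min{...}: best predecessor (including transition cost)

[Figure: cost grid of input frames f_i against reference frames r_j, showing the three predecessor steps T_10, T_01, T_11 into cell (i,j)]
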

path after (i,j) is independent of how we got there


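The best-path recursion can be written out directly. The sketch below is a generic minimum-cost path through a local cost matrix, with the start fixed at (0,0) and the end at the last cell; those boundary conditions and the name `best_path_cost` are assumptions for this example.

```python
def best_path_cost(d, t10=0.0, t01=0.0, t11=0.0):
    """Fill D(i,j) = d(i,j) + min(D(i-1,j)+T10, D(i,j-1)+T01, D(i-1,j-1)+T11)
    and trace back the lowest-cost path from (0,0) to the last cell."""
    rows, cols = len(d), len(d[0])
    inf = float("inf")
    D = [[inf] * cols for _ in range(rows)]
    back = [[None] * cols for _ in range(rows)]
    D[0][0] = d[0][0]
    for i in range(rows):
        for j in range(cols):
            if i == 0 and j == 0:
                continue
            # Candidate predecessors, each with its transition cost
            cands = []
            if i > 0:
                cands.append((D[i - 1][j] + t10, (i - 1, j)))
            if j > 0:
                cands.append((D[i][j - 1] + t01, (i, j - 1)))
            if i > 0 and j > 0:
                cands.append((D[i - 1][j - 1] + t11, (i - 1, j - 1)))
            best, back[i][j] = min(cands)
            D[i][j] = d[i][j] + best
    # Follow the stored best predecessors back to the start
    path = [(rows - 1, cols - 1)]
    while back[path[0][0]][path[0][1]] is not None:
        path.insert(0, back[path[0][0]][path[0][1]])
    return D[-1][-1], path
```

Each cell is visited once and looks at only three predecessors, so the cost is proportional to the size of the matrix rather than to the number of possible paths.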

Tempo Estimation

Algorithm needs global tempo period

otherwise problem is not optimal substructure


Pick peak in onset envelope autocorrelation
after applying "human preference" window
check for subbeat

[Figure: three panels. Onset Strength Envelope (part) against time / s; Raw Autocorrelation against lag / s; Windowed Autocorrelation against lag / s, with the Primary Tempo Period and Secondary Tempo Period marked]

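A rough Python sketch of the tempo estimate: autocorrelate the onset envelope, weight each lag by a preference window, and take the largest value. The log-Gaussian window shape, its centre `tau0` (0.5 s) and width (1.4 octaves), the 4 s lag limit, and the function names are illustrative assumptions, and the subbeat check mentioned on the slide is left out.

```python
import math

def estimate_period(onset, osr, tau0=0.5, width_octaves=1.4):
    """Return the lag (in frames) with the largest windowed autocorrelation.

    onset: onset strength envelope; osr: its frame rate in frames/sec.
    tau0 and width_octaves define the assumed preference window.
    """
    n = len(onset)
    max_lag = min(n - 1, int(4 * osr))  # search lags up to 4 seconds
    best_lag, best_val = None, -float("inf")
    for lag in range(1, max_lag + 1):
        # Raw autocorrelation at this lag
        raw = sum(onset[t] * onset[t - lag] for t in range(lag, n))
        # Log-Gaussian preference window, peaking at tau0 seconds
        octaves = math.log2((lag / osr) / tau0)
        window = math.exp(-0.5 * (octaves / width_octaves) ** 2)
        if raw * window > best_val:
            best_lag, best_val = lag, raw * window
    return best_lag

def period_to_bpm(lag, osr):
    """Convert a beat period in frames to beats per minute."""
    return 60.0 * osr / lag
```

Without the window, a lag of one beat and a lag of two beats give almost equally strong raw peaks for a steady pulse; the window is what breaks the tie in favour of the period closest to the preferred tempo.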

Beat Tracking by DP

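To optimize

$C(\{t_i\}) = \sum_{i=1}^{N} O(t_i) + \alpha \sum_{i=2}^{N} F(t_i - t_{i-1}, \tau_p)$

define C*(t) as best score up to time t
then build up recursively (with traceback P(t))

$C^*(t) = O(t) + \max_{\tau} \{ \alpha F(t - \tau, \tau_p) + C^*(\tau) \}$
$P(t) = \arg\max_{\tau} \{ \alpha F(t - \tau, \tau_p) + C^*(\tau) \}$

[Figure: O(t) and C*(t) plotted against time]

final beat sequence {t_i} is best C* + back-trace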



beat_simple

Beat tracking in 15 lines of Matlab

function beats = beat_simple(onset, osr, tempo, alpha)
% beats = beat_simple(onset, osr, tempo, alpha)
%   Core of the DP-based beat tracker
%   <onset> is the onset strength envelope at frame rate <osr>
%   <tempo> is the target tempo (in BPM)
%   <alpha> is weight applied to transition cost
%   <beats> returns the chosen beat sample times (in sec).
% 20070619 Dan Ellis dpwe@ee.columbia.edu

if nargin < 4; alpha = 100; end

% backlink(time) is best predecessor for this point
% cumscore(time) is total cumulated score to this point
localscore = onset;
backlink = -ones(1,length(localscore));
cumscore = zeros(1,length(localscore));

% convert bpm to samples
period = (60/tempo)*osr;
% Search range for previous beat
prange = round(-2*period):-round(period/2);
% Log-gaussian window over that range
txwt = (-alpha*abs((log(prange/-period)).^2));

for i = max(-prange + 1):length(localscore)
  timerange = i + prange;
  % Search over all possible predecessors
  % and apply transition weighting
  scorecands = txwt + cumscore(timerange);
  % Find best predecessor beat
  [vv,xx] = max(scorecands);
  % Add on local score
  cumscore(i) = vv + localscore(i);
  % Store backtrace
  backlink(i) = timerange(xx);
end

% Start backtrace from best cumulated score
[vv,beats] = max(cumscore);
% .. then find all its predecessors
while backlink(beats(1)) > 0
  beats = [backlink(beats(1)),beats];
end
% convert to seconds
beats = (beats-1)/osr;

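For readers without Matlab, the same recursion can be sketched in plain Python. This is an unofficial port written for illustration: it uses 0-based frame indices, so the backtrace stops at a backlink below zero and no "-1" is needed when converting to seconds, and Python's `round` may differ from Matlab's on exact .5 values.

```python
import math

def beat_simple(onset, osr, tempo, alpha=100.0):
    """DP beat tracker sketch: returns beat times in seconds.

    onset: onset strength per frame; osr: frames per second;
    tempo: target tempo in BPM; alpha: weight on the transition cost.
    """
    period = 60.0 / tempo * osr  # beat period in frames
    # Offsets back to a possible previous beat: -2*period .. -period/2
    prange = list(range(round(-2 * period), -round(period / 2) + 1))
    # Log-Gaussian transition weight, zero at exactly one period back
    txwt = [-alpha * math.log(p / -period) ** 2 for p in prange]
    n = len(onset)
    backlink = [-1] * n
    cumscore = [0.0] * n
    for i in range(-prange[0], n):
        # Score every candidate predecessor, then keep the best one
        scores = [w + cumscore[i + p] for w, p in zip(txwt, prange)]
        best = max(range(len(scores)), key=scores.__getitem__)
        cumscore[i] = scores[best] + onset[i]
        backlink[i] = i + prange[best]
    # Start the backtrace from the best cumulated score
    beats = [max(range(n), key=cumscore.__getitem__)]
    while backlink[beats[0]] >= 0:
        beats.insert(0, backlink[beats[0]])
    return [b / osr for b in beats]
```

As in the Matlab version, frames within the first two beat periods have no predecessor to link to, so the backtrace ends at the first beat it reaches there and any earlier beat is not reported.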

Results

Verify against human tapping data
vary tempo estimate
vary tradeoff weight α

[Figure: bonus6 (Jazz) and bonus1 (Choral), BPM target vs. BPM out, for α = 400 and α = 100. Panels: mean output BPM, beat accuracy, and σ/μ of inter-beat-intervals, each against target BPM]


Downbeat Detection

Downbeat = start of bar


one level up in metrical hierarchy
Approaches
Goto94 BTS:
Pop music
SD/BD template

Jehan05:
Trained classifier


Summary

Rhythm perception

Innate and strong, hierarchic

Beat tracking models

Need to account for buildup & persistence

Dynamic Programming

Neat way to maintain multiple hypotheses


References

J. P. Bello, L. Daudet, S. Abdallah, C. Duxbury, M. Davies, M. B. Sandler, "A Tutorial on Onset Detection in Music Signals," IEEE Tr. Speech and Audio Proc., vol. 13, no. 5, pp. 1035-1047, September 2005.
P. Desain & H. Honing, "Computational models of beat induction: The rule-based approach," J. New Music Research, vol. 28, no. 1, pp. 29-42, 1999.
Simon Dixon, "Automatic extraction of tempo and beat from expressive performances," J. New Music Research, vol. 30, no. 1, pp. 39-58, 2001.
D. Ellis, "Beat Tracking by Dynamic Programming," J. New Music Research, vol. 36, no. 1, pp. 51-60, March 2007.
Tristan Jehan, "Creating Music By Listening," Ph.D. Thesis, MIT Media Lab, 2005.
Masataka Goto & Yoichi Muraoka, "A Beat Tracking System for Acoustic Signals of Music," ACM Multimedia, pp. 365-372, October 1994.
Masataka Goto, "An Audio-based Real-time Beat Tracking System for Music With or Without Drum-sounds," J. New Music Research, vol. 30, no. 2, pp. 159-171, June 2001.
M. F. McKinney and D. Moelants, "Ambiguity in tempo perception: What draws listeners to different metrical levels?" Music Perception, vol. 24, no. 2, pp. 155-166, December 2006.
M. Puckette, T. Apel, D. Zicarelli, "Real-time audio analysis tools for Pd and MSP," Proc. Int. Comp. Music Conf., Ann Arbor, pp. 109-112, October 1998.
Eric D. Scheirer, "Tempo and beat analysis of acoustic musical signals," J. Acoust. Soc. Am., vol. 103, pp. 588-601, 1998.
