
ELEN E4896 MUSIC SIGNAL PROCESSING

Lecture 1:
Course Introduction

1. Course Structure
2. DSP: The Short-Time Fourier Transform

Dan Ellis

Dept. Electrical Engineering, Columbia University


dpwe@ee.columbia.edu
E4896 Music Signal Processing (Dan Ellis)

http://www.ee.columbia.edu/~dpwe/e4896/
2014-01-22 - 1 /21

Signal Processing and Music

[Figure: four stacked spectrogram panels; freq / kHz (0 to 4) vs. time / sec (10 to 20)]

Music is a very rich signal



Signal Processing can expose some of it

We want to get inside it

1. Course Goals

Survey of applications of signal processing to music audio
  music synthesis
  music/audio processing (modification)
  music audio analysis

Connect basic DSP theory to sound phenomena and effects

Hands-on, live investigations of audio processing algorithms
  using Matlab, Pd, ...


Course Structure

Two weekly sessions
  Monday: presentations, practical exposition, discussion
  Wednesday: presentations, practical, sharing

Grade structure
  20%  practicals participation
  10%  one presentation
  30%  three mini-project assignments
  40%  one final project


Flipped Classroom

Video Lectures are required viewing before Monday session
  about 30 mins each
  recorded last year
  bring one question (at least) to class


Hands-on Practicals

Music Signal Processing involves connecting algorithms with perceptions

Our goal is to be able to play with algorithms to feel how they work

In class, on laptops, in small groups

We will use several platforms:
  PureData (Pd)
  Matlab
  Python
  Processing

Projects

There will be 3 mini projects assigned
  approx. week 3, week 5, week 7
  provide basic pieces, but open-ended
  involve implementation & experimentation

Also, Final project
  on a topic of your choice
  implement some kind of music signal processing
  due at end of semester

Good projects ...
  ... start with clear goals
  ... apply ideas from class
  ... evaluate performance & modify accordingly

Could continue into summer...


Project Examples
Mood Classification

Fig. 4. Comparison of the effectiveness of different combinations of features.

the most effective combination of features. The result is shown in Figure 4. Although some features are especially useful, I also observe that increasing the number of features enhances the overall classification accuracy. Therefore, the best combination is to use all the features.

Beat Tracking


Figure 1: Screenshot of the PD patch implementing the Scheirer algorithm. Much of the logic is hidden inside the combfilterbank object at the bottom.

2.2 Tempo Selection

Each of the six onset pulse signals (one for each band) is fed into its own bank of 150 comb filters; you can think of these filters as mapping to 150 possible tempos that we want to detect. A comb filter is essentially just a delay implemented as a circular buffer. The delay time on each filter corresponds to the tempo that that filter is meant to detect; so, for example, the filter that is detecting a tempo of 90 bpm corresponds to a beat rate of 90/60 = 1.5 Hz, which yields a buffer length of 44100/1.5 = 29400 samples. The tempo selector then, for each bpm, sums the energy across all 6 subbands at that tempo. The tempo with the highest total energy is chosen as the tempo of the piece.
Because of the sheer number of filters (150 × 6 = 900), it was impossible to implement this using standard PD objects. Therefore I created a PD External written in C [3] and used this within my PD patch. The external has six inputs, one for each subband, and has two float outlets, one indicating the detected tempo, and the other indicating the phase. I added a signal outlet as well for
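
The delay arithmetic and the tempo selection step described in this excerpt can be sketched in a few lines of Python. This is only an illustration, not the student's C external: the function names and the dictionary layout used for the per-band energies are hypothetical, and the 44100 Hz rate is taken from the excerpt's own arithmetic.

```python
SAMPLE_RATE = 44100  # Hz, the rate used in the excerpt's arithmetic


def comb_delay_samples(bpm, sample_rate=SAMPLE_RATE):
    """Delay (in samples) of a comb filter tuned to one beat at `bpm`."""
    beats_per_second = bpm / 60.0  # e.g. 90 bpm -> 1.5 Hz
    return round(sample_rate / beats_per_second)


def pick_tempo(energy_by_band):
    """Sum energy across subbands for each tempo; return the strongest tempo.

    `energy_by_band` holds one dict per subband, mapping bpm to the output
    energy of the comb filter for that tempo (hypothetical data layout).
    """
    totals = {}
    for band in energy_by_band:
        for bpm, energy in band.items():
            totals[bpm] = totals.get(bpm, 0.0) + energy
    return max(totals, key=totals.get)
```

For example, `comb_delay_samples(90)` reproduces the 29400-sample buffer length quoted in the excerpt.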


Fig. 5. Composition of hits in each class.

VI. EVALUATION AND RESULTS

In the evaluation part, I will use 100 test samples (20 samples for each mood class) to evaluate the effectiveness of my classifier. Essentially, there are two tasks I would like to investigate in the evaluation. First, I use the best combination of features found in the feature selection part, which is to use all of the features. The average hit rate is 0.64. However, I noticed that the accuracies of the individual mood classes are dramatically different. The hit rates for angry and

Music Transcription


Presentations

Everyone will make at least one in-class presentation
  e.g. presentation of a relevant paper
  ... or your latest mini-project
  ... or just something you're curious about

We will assign time slots
  first-come, first-served
  slots each class
  check course calendar to choose a time
  email TAs

Encourage discussion
  informal presentations of practical findings


Web Site

Information distributed via:

http://www.ee.columbia.edu/~dpwe/e4896/


Course Outline
Jan 22: DSP
Jan 27: Acoustics
Feb 03: Perception
Feb 10: Analog synthesis
Feb 17: Sinusoidal models
Feb 24: LPC models
Mar 03: Filtering & Reverb
Mar 10: Pitch tracking
Mar 24: Time & pitch scaling
Mar 31: Beat tracking
Apr 07: Chroma & Chords
Apr 14: Audio alignment
Apr 21: Music fingerprinting
Apr 28: Source separation

Anything missing?

2. Digital Signals
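Discrete-time sampling limits bandwidth
Discrete-level quantization limits dynamic range

$x_d[n] = Q(x_c(nT))$

[Figure: sampled and quantized waveform vs. time, with sample spacing $T$ and level spacing $\epsilon$]

Sampling interval: $T$
Sampling frequency: $\Omega_0 = 2\pi/T$
Quantizer: $Q(x) = \epsilon \cdot \mathrm{round}(x/\epsilon)$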
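
As a rough illustration of sampling followed by quantization, the sketch below evaluates a continuous-time function at multiples of the sampling interval and snaps each value to the nearest multiple of a step size. The step-size name `eps`, the helper names, and the example values (1 kHz sine, 8 kHz sampling, step 1/8) are assumptions made for the example, not values from the slide.

```python
import math


def quantize(x, eps):
    """Uniform quantizer: nearest multiple of eps.

    Note: Python's round() sends exact ties to the even integer.
    """
    return eps * round(x / eps)


def sample_and_quantize(xc, T, eps, num_samples):
    """Return x_d[n] = Q(x_c(nT)) for n = 0 .. num_samples - 1."""
    return [quantize(xc(n * T), eps) for n in range(num_samples)]


# Assumed example values: 1 kHz sine, 8 kHz sampling rate, step size 1/8.
T = 1.0 / 8000.0
eps = 0.125
xd = sample_and_quantize(lambda t: math.sin(2 * math.pi * 1000.0 * t), T, eps, 8)
```

Each output value differs from the true sample by at most half a step, which is the sense in which quantization limits dynamic range.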

Fourier Series

Observation: Any periodic signal can be constructed from sinusoids at integer multiples of the fundamental frequency

$x(t) = x(t + T)$

$x(t) \approx \sum_{k=0}^{M} a_k \cos\left(\frac{2\pi k}{T} t + \phi_k\right)$

E.g., square wave:

$\phi_k = 0; \quad a_k = \begin{cases} (-1)^{\frac{k-1}{2}} \frac{1}{k} & k = 1, 3, 5, \ldots \\ 0 & \text{otherwise} \end{cases}$

[Figure: square wave $x(t)$ vs. time, and coefficient magnitudes $|a_k|$ vs. $k$ for $k = 1 \ldots 7$]
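
The square-wave example can be tried numerically with a short sketch. It assumes zero phases and coefficients of magnitude 1/k on the odd harmonics with alternating sign, which is the standard series for an even (cosine-phase) square wave; the function names are made up for this illustration.

```python
import math


def square_wave_coeff(k):
    """a_k = (-1)^((k-1)/2) / k for odd k, and 0 otherwise (assumed form)."""
    if k % 2 == 0:
        return 0.0
    return (-1.0) ** ((k - 1) // 2) / k


def fourier_synth(t, T, M):
    """Partial sum of a_k * cos(2*pi*k*t/T) for k = 0 .. M, all phases zero."""
    return sum(square_wave_coeff(k) * math.cos(2 * math.pi * k * t / T)
               for k in range(M + 1))
```

With these coefficients the partial sums approach a square wave of height pi/4, so evaluating `fourier_synth(0.0, 1.0, M)` for growing `M` shows the sum settling toward that level, while values near the jumps keep oscillating.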

Fourier Transform

Complex form of Fourier Series + analysis:

$x(t) \approx \sum_{k=-M}^{M} c_k\, e^{j\frac{2\pi k}{T} t}$

$c_k = \frac{1}{T} \int_{-T/2}^{T/2} x(t)\, e^{-j\frac{2\pi k}{T} t}\, dt$

Let $T \to \infty$: Harmonics become infinitely close:

$x(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} X(j\Omega)\, e^{j\Omega t}\, d\Omega$

$X(j\Omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\Omega t}\, dt$

[Figure: waveform $x(t)$ vs. time / sec, and spectrum $|X(j\Omega)|$, level / dB vs. freq / Hz]

$x(t)$, $X(j\Omega)$ are equivalent, dual descriptions

Spectrum of Periodic Sounds

If sound is (nearly) periodic (i.e., pitched), Fourier Transform approaches Fourier Series

$x(t) \approx x(t + T) \;\Rightarrow\; X(j\Omega) \approx \sum_k c_k\, \delta\left(\Omega - \frac{2\pi k}{T}\right)$

[Figure: waveform (amplitude vs. time / s) and its spectrum (level / dB vs. freq / Hz), showing peaks at the harmonics]

n.b.: $|X(j\Omega)|$ plotted in log units: $\mathrm{dB}(x) = 20 \log_{10}(x)$
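
A small numerical check of this idea: the sketch below builds a periodic test signal from three harmonics, computes its magnitude spectrum in dB with a direct DFT, and lists the bins that stand out as peaks. The test signal, its length, and the floor inside `db` (added only to avoid taking the log of zero) are assumptions of the example.

```python
import cmath
import math


def db(x, floor=1e-12):
    """dB(x) = 20 log10(x); `floor` is an added safeguard against log(0)."""
    return 20.0 * math.log10(max(x, floor))


def magnitude_spectrum_db(x):
    """|X[k]| in dB for k = 0 .. N/2, using a direct O(N^2) DFT."""
    N = len(x)
    mags = []
    for k in range(N // 2 + 1):
        Xk = sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        mags.append(db(abs(Xk)))
    return mags


# Assumed test signal: period of 16 samples, three harmonics, 128 samples long.
N, period = 128, 16
x = [sum(math.cos(2 * math.pi * h * n / period) / h for h in (1, 2, 3))
     for n in range(N)]
spec = magnitude_spectrum_db(x)
peaks = [k for k in range(1, len(spec) - 1)
         if spec[k] > spec[k - 1] and spec[k] > spec[k + 1] and spec[k] > 0.0]
```

Because the signal repeats every 16 samples and the analysis length is 128, the energy falls on bins that are multiples of 128/16 = 8, and everything in between is numerical noise far below 0 dB.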


Discrete-Time Fourier Transf. (DTFT)

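Sampled $x[n] = x(nT)$ has same FT form

$x[n] = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(e^{j\omega})\, e^{j\omega n}\, d\omega$

$X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x[n]\, e^{-j\omega n}$

... but results in spectrum with period $2\pi$:

[Figure: $|X(e^{j\omega})|$ and $\arg\{X(e^{j\omega})\}$ vs. $\omega$]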


Discrete Fourier Transform (DFT)

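N-point finite-length sampled signal
  the kind you can have on a computer!

$x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, W_N^{nk}, \qquad W_N = e^{j\frac{2\pi}{N}}$

$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{-nk}$

Just a matrix-multiply between vectors

$$
\begin{bmatrix} X[0] \\ X[1] \\ X[2] \\ \vdots \\ X[N-1] \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & W_N^{-1} & W_N^{-2} & \cdots & W_N^{-(N-1)} \\
1 & W_N^{-2} & W_N^{-4} & \cdots & W_N^{-2(N-1)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & W_N^{-(N-1)} & W_N^{-2(N-1)} & \cdots & W_N^{-(N-1)^2}
\end{bmatrix}
\begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ \vdots \\ x[N-1] \end{bmatrix}
$$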
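
To make the matrix view concrete, here is a minimal sketch that builds the DFT matrix explicitly and applies it as a matrix-vector product. It assumes the convention in which the forward transform uses the negative exponent; the function names are illustrative, and a real application would use an FFT rather than this O(N^2) form.

```python
import cmath
import math


def dft_matrix(N):
    """N x N matrix with entry [k][n] = W_N^(-nk), where W_N = exp(j*2*pi/N)."""
    W = cmath.exp(2j * math.pi / N)
    return [[W ** (-n * k) for n in range(N)] for k in range(N)]


def dft(x):
    """Forward DFT: X[k] = sum over n of x[n] * W_N^(-nk)."""
    D = dft_matrix(len(x))
    return [sum(row[n] * x[n] for n in range(len(x))) for row in D]


def idft(X):
    """Inverse DFT: x[n] = (1/N) * sum over k of X[k] * W_N^(nk)."""
    N = len(X)
    D = dft_matrix(N)
    # Conjugating a unit-magnitude entry W^(-nk) gives W^(nk).
    return [sum(D[k][n].conjugate() * X[k] for k in range(N)) / N
            for n in range(N)]
```

Applying `idft(dft(x))` to any short list should return the original values up to rounding error, which is a quick way to check the sign and scaling conventions.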

Short-Time Fourier Transform

To localize energy in time and frequency...
  chop signal into N-point windows every L points
  take DFT of each one

$X[k, m] = \sum_{n=0}^{N-1} x[n + mL]\, w[n]\, e^{-j\frac{2\pi k n}{N}}$

[Figure: waveform (time / s, 2.35 to 2.6) with short-time windows at offsets L, 2L, 3L; the DFT of each window forms one column (m = 0, 1, 2, 3) of a freq / Hz (0 to 4000) image]
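
The windowing-and-DFT recipe can be written out directly. The sketch below is deliberately simple and slow (a direct DFT per frame); the Hann window is an assumption, since the slide only refers to a generic window w[n], and the result is indexed as `frames[m][k]` rather than X[k, m].

```python
import cmath
import math


def hann(N):
    """Hann window; the window shape is an assumption of this sketch."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]


def stft(x, N, L, window=None):
    """Return frames[m][k] = sum_n x[n + m*L] * w[n] * exp(-j*2*pi*k*n/N)."""
    w = window if window is not None else hann(N)
    num_frames = 1 + (len(x) - N) // L if len(x) >= N else 0
    frames = []
    for m in range(num_frames):
        # Take N points starting at m*L and apply the window.
        seg = [x[n + m * L] * w[n] for n in range(N)]
        # Direct DFT of the windowed segment.
        frames.append([sum(seg[n] * cmath.exp(-2j * math.pi * k * n / N)
                           for n in range(N)) for k in range(N)])
    return frames
```

Only complete windows are analyzed here, so a signal shorter than N points yields no frames; padding the ends is a common alternative that this sketch leaves out.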


The Spectrogram

Plot STFT $|X[k, m]|$ as a grayscale image

[Figure: waveform and spectrograms; freq / Hz (0 to 4000) vs. time / s, intensity / dB (-50 to 10)]

Time-Frequency Tradeoff

Shorter window
  better resolution of time detail
  worse resolution of frequency detail

[Figure: two spectrograms of the same signal, Window = 48 pt (Wideband) and Window = 256 pt (Narrowband); freq / Hz (0 to 4000) vs. time / s, level / dB (-50 to 10)]
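
The tradeoff can be put into numbers for the two window lengths shown. The sketch assumes a sampling rate of 8000 Hz, which is not stated on the slide (it is only suggested by the 4000 Hz upper limit of the frequency axis), and uses the DFT bin spacing as a rough proxy for frequency resolution.

```python
def window_resolution(N, sample_rate):
    """Return (window duration in seconds, DFT bin spacing in Hz) for N points."""
    return N / sample_rate, sample_rate / N


FS = 8000.0  # Hz; assumed, not stated on the slide
for n_points in (48, 256):
    duration, bin_hz = window_resolution(n_points, FS)
    print(f"{n_points:4d} pt: {duration * 1e3:5.1f} ms long, {bin_hz:6.2f} Hz per bin")
```

Under that assumed rate, the 48-point window spans 6 ms with bins about 167 Hz apart, while the 256-point window spans 32 ms with bins 31.25 Hz apart: the product of the two quantities is fixed, so improving one necessarily worsens the other.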

Spectrogram in Matlab


Live Matlab Spectrogram

http://www.wakayama-u.ac.jp/~kawahara/MatlabRealtimeSpeechTools/


Spectrogram in Processing


Summary + Assignment

Music Signal Processing
  synthesis, modification, analysis
  hands-on investigation

Course
  participation, practicals, projects, presentations

Digital Signal Processing
  signals on computers
  Fourier analysis & spectrogram

Assignment
  Watch Acoustics video & prepare question
  Do Pd tutorial http://en.flossmanuals.net/pure-data/ through to Amplitude Modulation
