Veton Kpuska
Fourier-Transform View
Recall (from Chapter 3):
$$X(n,\omega) = \sum_{m=-\infty}^{\infty} x[m]\,w[n-m]\,e^{-j\omega m}$$
Fourier-Transform View
x[n]: time-domain signal.
fn[m] = x[m]w[n-m]: short-time section of x[m] at point n, that is, the signal at frame n.
X(n,ω): Fourier transform of fn[m], the short-time windowed signal data.
Computing the DFT:
$$X(n,k) = X(n,\omega)\big|_{\omega = \frac{2\pi}{N}k}$$
Fourier-Transform View
Thus X(n,k) is the STFT for every ω = (2π/N)k
Frequency sampling interval = 2π/N
Frequency sampling factor = N
DFT:
$$X(n,k) = \sum_{m=-\infty}^{\infty} x[m]\,w[n-m]\,e^{-j\frac{2\pi}{N}km}$$
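The discrete STFT sum can be evaluated directly for one frame. The following is a minimal NumPy sketch, not the author's implementation: the function name `discrete_stft` is hypothetical, and it assumes x[m] starts at m = 0 and w[n] is causal with support 0..Nw-1, so that w[n-m] is non-zero only for n-Nw+1 ≤ m ≤ n.

```python
import numpy as np

def discrete_stft(x, w, N, n):
    """Return X(n,k) = sum_m x[m] w[n-m] exp(-j 2 pi k m / N) for k = 0..N-1.

    Assumes x[m] is defined for m = 0..len(x)-1 and w[i] for i = 0..len(w)-1.
    """
    Nw = len(w)
    k = np.arange(N)
    X = np.zeros(N, dtype=complex)
    # Only samples under the window at time n contribute to the sum.
    for m in range(max(0, n - Nw + 1), min(len(x), n + 1)):
        X += x[m] * w[n - m] * np.exp(-2j * np.pi * k * m / N)
    return X

# Usage: a single unit sample at m0 gives X(n,k) = w[n-m0] exp(-j 2 pi k m0 / N).
x = np.zeros(16)
m0 = 5
x[m0] = 1.0
w = np.hamming(8)
N, n = 16, 7
X = discrete_stft(x, w, N, n)
expected = w[n - m0] * np.exp(-2j * np.pi * np.arange(N) * m0 / N)
```

Note that the phase reference here is absolute time m, as in the definition above, rather than time measured from the start of the frame.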
Example 7.1
Let x[n] be a periodic impulse train sequence:
$$x[n] = \sum_{l=-\infty}^{\infty} \delta[n - lP]$$
[Figure: impulse train x[n] with period P and a P-point analysis window w[n]]
December 30, 2015
Example 7.1
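$$X(n,\omega) = \sum_{m=-\infty}^{\infty} x[m]\,w[n-m]\,e^{-j\omega m} = \sum_{m=-\infty}^{\infty}\sum_{l=-\infty}^{\infty} \delta[m-lP]\,w[n-m]\,e^{-j\omega m} = \sum_{l=-\infty}^{\infty} w[n-lP]\,e^{-j\omega lP}$$
The sum over m is non-zero only for m = lP.
Each term is a window located at lP with linear phase -ωlP.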
Example 7.1
Since the shifted windows w[n-lP] do not overlap, for each n the magnitude |X(n,ω)| is
constant in ω and the phase of X(n,ω) is linear in ω.
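Computation of the DFT for N = P gives:
$$X(n,k) = \sum_{m=-\infty}^{\infty}\sum_{l=-\infty}^{\infty} \delta[m-lP]\,w[n-m]\,e^{-j\frac{2\pi}{P}km} = \sum_{l=-\infty}^{\infty} w[n-lP]\,e^{-j\frac{2\pi}{P}k\,lP} = \sum_{l=-\infty}^{\infty} w[n-lP]$$
For fixed n, X(n,k) is constant in k: it is the DFT of translated, non-overlapping windows with a phase shift of zero (due to sampling), since $e^{-j2\pi kl} = 1$.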
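The result of Example 7.1 can be checked numerically. The sketch below is illustrative only: the period P = 8, the frame time n = 19, and the choice of a strictly positive P-point Hann-shaped window are assumptions made for the demonstration, not values from the slides.

```python
import numpy as np

P = 8                                  # impulse-train period (assumed value)
x = np.zeros(5 * P)
x[::P] = 1.0                           # x[n] = sum_l delta[n - lP]

# P-point window with no zero samples, w[i] defined for i = 0..P-1.
w = np.hanning(P + 2)[1:-1]

n = 19                                 # frame time; window covers m = 12..19
m = np.arange(max(0, n - P + 1), n + 1)
k = np.arange(P)                       # N = P frequency samples

# X(n,k) = sum_m x[m] w[n-m] exp(-j 2 pi k m / P)
X = (x[m] * w[n - m]) @ np.exp(-2j * np.pi * np.outer(m, k) / P)

# Exactly one impulse (at m = 16 = 2P) falls under the window, so
# X(n,k) = w[n - 2P] for every k: constant magnitude and zero phase.
print(np.round(X, 6))
```

With N = P the linear phase term is sampled at multiples of 2π, which is why every frequency bin carries the same real value w[n-lP].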
Spectrogram |X(n,ω)|²
If the analysis window length is on the order of a pitch period or shorter:
wideband spectrogram
vertical striations
Otherwise (window spanning several pitch periods):
narrowband spectrogram
horizontal striations
How often should the analysis window be applied to the signal?
X(n,k) is decimated by a temporal decimation
factor L:
X(nL,k) = DFT{fnL[m]}
The sections fnL[m] are a subset of fn[m].
[Figure: analysis window w[pL-m] positioned on x[m] at p = 1, 2, 3, advancing L samples per frame]
Spectrogram |X(n,ω)|²
Fourier-Transform View
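For a shifted input x[n-n0]:
$$\tilde{X}(n,\omega) = \sum_{m=-\infty}^{\infty} x[m-n_0]\,w[n-m]\,e^{-j\omega m} = \sum_{q=-\infty}^{\infty} x[q]\,w[n-n_0-q]\,e^{-j\omega(q+n_0)}$$
$$= e^{-j\omega n_0}\sum_{q=-\infty}^{\infty} x[q]\,w[(n-n_0)-q]\,e^{-j\omega q} = e^{-j\omega n_0}\,X(n-n_0,\omega)$$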
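The time-shift relation between the STFT of x[n-n0] and X(n-n0,ω) can be checked numerically. This is a small sketch under stated assumptions: the helper name `stft_at`, the random test signal, the Hamming window, and the values n0 = 3, n = 20, ω = 0.7 are all illustrative choices.

```python
import numpy as np

def stft_at(x, w, n, omega):
    """X(n, omega) = sum_m x[m] w[n-m] exp(-j omega m); x and w start at index 0."""
    total = 0j
    for m in range(len(x)):
        if 0 <= n - m < len(w):        # w[n-m] is zero outside its support
            total += x[m] * w[n - m] * np.exp(-1j * omega * m)
    return total

rng = np.random.default_rng(0)
x = rng.standard_normal(40)
w = np.hamming(8)

n0 = 3
x_shifted = np.concatenate([np.zeros(n0), x])    # samples of x[n - n0]

n, omega = 20, 0.7
lhs = stft_at(x_shifted, w, n, omega)                          # STFT of the shifted signal
rhs = np.exp(-1j * omega * n0) * stft_at(x, w, n - n0, omega)  # shifted STFT with linear phase
print(abs(lhs - rhs))
```

A delay of n0 samples therefore delays the STFT in time by n0 and multiplies it by the linear phase factor exp(-jωn0).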
Filtering View
In this interpretation w[n] is treated as the impulse response of a filter.
Thus w[n] is referred to as the analysis filter.
Let's fix the value of ω = ω0:
$$X(n,\omega_0) = \sum_{m=-\infty}^{\infty}\left(x[m]\,e^{-j\omega_0 m}\right)w[n-m]$$
$$X(n,\omega_0) = \left(x[n]\,e^{-j\omega_0 n}\right) * w[n]$$
Filtering View
The product:
x[n]e^{-jω0n}
is a modulation of x[n] by frequency ω0: the spectral content of x[n] at ω0 is shifted to the origin.
Filtering View
Alternate view:
$$X(n,\omega_0) = e^{-j\omega_0 n}\left(x[n] * w[n]\,e^{j\omega_0 n}\right)$$
$$X(n,k) = e^{-j\frac{2\pi}{N}kn}\left(x[n] * w[n]\,e^{j\frac{2\pi}{N}kn}\right)$$
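Both filtering interpretations (modulate then lowpass filter, versus bandpass filter then demodulate) can be compared against the defining sum. The sketch below uses `np.convolve` on complex sequences; the signal, window, and ω0 = 2π·5/32 are assumed values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(64)
w = np.hamming(16)
omega0 = 2 * np.pi * 5 / 32            # fixed analysis frequency (assumed value)

n_x = np.arange(len(x))
n_w = np.arange(len(w))

# View 1: modulate x[n] by exp(-j omega0 n), then filter with w[n].
lowpass_view = np.convolve(x * np.exp(-1j * omega0 * n_x), w)

# View 2: filter x[n] with the bandpass filter w[n] exp(j omega0 n), then demodulate.
n_out = np.arange(len(lowpass_view))   # full convolution: len(x) + len(w) - 1 samples
bandpass_view = np.exp(-1j * omega0 * n_out) * np.convolve(x, w * np.exp(1j * omega0 * n_w))

def direct(n):
    """Defining sum X(n, omega0) = sum_m x[m] w[n-m] exp(-j omega0 m)."""
    return sum(x[m] * w[n - m] * np.exp(-1j * omega0 * m)
               for m in range(len(x)) if 0 <= n - m < len(w))

print(np.max(np.abs(lowpass_view - bandpass_view)))
```

The two views differ only in where the complex exponential is applied, so they agree sample by sample along n.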
Filtering View
General Properties:
1. If x[n] has length N and w[n] has
length M, then X(n,ω) has length
N+M-1 along n.
2. The bandwidth of X(n,ω0) is less than or
equal to that of w[n].
3. The sequence X(n,ω0) has its spectrum
centered at the origin.
Example 7.2
Consider a Gaussian window of the form:
$$w[n] = e^{-a(n-n_0)^2}$$
The corresponding bandpass filters are:
$$h_k[n] = e^{-a(n-n_0)^2}\,e^{j\frac{2\pi}{N}kn}$$
Example 7.2
For k = 0, 5, 10, 15 (with N = 50) the following is obtained:
$$h_0[n] = e^{-a(n-n_0)^2}\,e^{j\frac{2\pi}{50}0n} = e^{-a(n-n_0)^2}$$
$$h_5[n] = e^{-a(n-n_0)^2}\,e^{j\frac{2\pi}{50}5n}$$
$$h_{10}[n] = e^{-a(n-n_0)^2}\,e^{j\frac{2\pi}{50}10n}$$
$$h_{15}[n] = e^{-a(n-n_0)^2}\,e^{j\frac{2\pi}{50}15n}$$
Example 7.2
Example 7.3
Consider the filter bank of Example 7.2, designed
with a Gaussian window of the form:
$$w[n] = e^{-a(n-n_0)^2}$$
Figure 7.7 shows the Fourier transform magnitudes of the outputs of the
four complex bandpass filters hk[n] for k = 0, 5, 10, and 15, as presented
on the previous slide and depicted in Figure 7.6.
Example 7.3
After Demodulation the resulting bandpass outputs
have the same spectral shape as in the figure but
centered at the origin.
Time-Frequency Resolution
Tradeoffs
$$X(n,\omega) = \frac{1}{2\pi}\int_{-\pi}^{\pi} W(\theta)\,e^{j\theta n}\,X(\omega+\theta)\,d\theta$$
Since both X(ω) and W(ω) are periodic with period 2π, the linear convolution is
essentially circular.
From the equation above:
W(ω) smears (smooths) X(ω).
We want W(ω) as narrow as possible, ideally W(ω) = δ(ω), for good
frequency resolution.
W(ω) = δ(ω) requires an infinitely long w[n],
which gives poor time resolution.
These are conflicting goals.
Example 7.4
Figure 7.8 depicts time-frequency resolution
tradeoff:
Time-Frequency Resolution
Tradeoffs
From the previous example, the smoothing interpretation of the
STFT is not valid for non-stationary sequences.
For steady signals, long analysis windows are appropriate
and yield good frequency resolution, as depicted in
the next figure.
Time-Frequency Resolution
Tradeoffs
However, for short and transient signals, plosive
speech, flaps, diphthongs, etc. , short windows are
preferred in order to capture temporal events.
Shorter windows yield poor frequency resolution.
Short-Time Synthesis
How can the original sequence be recovered from its
discrete-time STFT?
The inversion is represented mathematically by a
synthesis equation, which expresses a sequence in
terms of its discrete-time STFT.
Recall that for fn[m] = x[m]w[n-m]:
$$X(n,\omega) = \sum_{m=-\infty}^{\infty} f_n[m]\,e^{-j\omega m}$$
Thus:
$$f_n[m] = \frac{1}{2\pi}\int_{-\pi}^{\pi} X(n,\omega)\,e^{j\omega m}\,d\omega$$
Short-Time Synthesis
For each n, taking the inverse Fourier transform of the
corresponding function of frequency yields the
sequence fn[m].
Evaluating fn[m] at m = n gives:
fn[n] = x[n]w[0].
For w[0] ≠ 0, x[n] can be obtained by dividing fn[n] by w[0]:
$$x[n] = \frac{1}{2\pi\,w[0]}\int_{-\pi}^{\pi} X(n,\omega)\,e^{j\omega n}\,d\omega$$
Short-Time Synthesis
In contrast to the discrete-time STFT X(n,ω), the
discrete STFT X(n,k) is not always invertible.
Example 1.
Consider the case when w[n] is bandlimited with
bandwidth B.
Short-Time Synthesis
Note that if there are frequency components of x[n] which
do not pass through any of the filter regions of the
discrete STFT, then
the discrete STFT is not a unique representation of x[n], and
x[n] cannot be recovered from it.
Example 2.
Consider X(n,k) decimated in time by a factor L, i.e.,
the STFT is applied every L samples.
w[n] is non-zero over its length Nw.
If L > Nw then there are gaps in time where x[n] is not
represented.
Thus in such cases again x[n] cannot be recovered.
[Figure: case L > Nw. Windows w[pL-m] of length Nw spaced L samples apart leave gaps in x[m] between consecutive windows]
Short-Time Synthesis
Conclusion:
Constraints must be adopted to ensure
uniqueness and invertibility:
1. Proper/Adequate frequency sampling:
B2/Nw (B - Window bandwidth)
2. Proper temporal decimation: L ≤ Nw
$$x[n] = \frac{1}{2\pi\,w[0]}\int_{-\pi}^{\pi} X(n,\omega)\,e^{j\omega n}\,d\omega$$
The FBS method carries out a discrete version of this
equation by utilizing the discrete STFT X(n,k):
$$y[n] = \frac{1}{N\,w[0]}\sum_{k=0}^{N-1} X(n,k)\,e^{j\frac{2\pi}{N}kn}$$
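Thus, analysis followed by synthesis gives:
$$y[n] = \frac{1}{N\,w[0]}\sum_{k=0}^{N-1}\underbrace{\left[\sum_{m=-\infty}^{\infty} x[m]\,w[n-m]\,e^{-j\frac{2\pi}{N}km}\right]}_{X(n,k)} e^{j\frac{2\pi}{N}kn}$$
$$y[n] = \frac{1}{N\,w[0]}\,x[n] * \sum_{k=0}^{N-1} w[n]\,e^{j\frac{2\pi}{N}kn}$$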
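$$y[n] = \frac{1}{N\,w[0]}\,x[n] * \left(w[n]\sum_{k=0}^{N-1} e^{j\frac{2\pi}{N}nk}\right)$$
$$y[n] = \frac{1}{N\,w[0]}\,x[n] * \left(w[n]\,N\sum_{r=-\infty}^{\infty}\delta[n-rN]\right)$$
where $N\sum_{r}\delta[n-rN]$ is a periodic impulse train with period N.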
$$y[n] = \frac{1}{w[0]}\,x[n] * \left(w[n]\sum_{r=-\infty}^{\infty}\delta[n-rN]\right)$$
$$\sum_{k=0}^{N-1} W\!\left(\omega - \frac{2\pi}{N}k\right) = N\,w[0]$$
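The FBS synthesis sum can be tried on a short signal to see when it returns x[n]. This is a minimal sketch, not the author's code: the helper names `stft_frame` and `fbs_synthesis` are hypothetical, x[m] is assumed to start at m = 0, and the window is assumed causal with support 0..Nw-1 and w[0] ≠ 0 (a 12-point Hamming window here).

```python
import numpy as np

def stft_frame(x, w, N, n):
    """X(n,k) = sum_m x[m] w[n-m] exp(-j 2 pi k m / N), k = 0..N-1."""
    m = np.arange(max(0, n - len(w) + 1), n + 1)
    k = np.arange(N)
    return (x[m] * w[n - m]) @ np.exp(-2j * np.pi * np.outer(m, k) / N)

def fbs_synthesis(x, w, N):
    """y[n] = 1/(N w[0]) * sum_k X(n,k) exp(j 2 pi k n / N)."""
    k = np.arange(N)
    y = np.empty(len(x), dtype=complex)
    for n in range(len(x)):
        X = stft_frame(x, w, N, n)
        y[n] = np.sum(X * np.exp(2j * np.pi * k * n / N)) / (N * w[0])
    return y.real

rng = np.random.default_rng(1)
x = rng.standard_normal(50)
w = np.hamming(12)                     # Nw = 12, w[0] = 0.08 is non-zero

y_ok = fbs_synthesis(x, w, N=16)       # N > Nw: w[rN] = 0 for r != 0
y_bad = fbs_synthesis(x, w, N=8)       # N < Nw: w[8] != 0 aliases x[n-8] into y[n]
print(np.max(np.abs(y_ok - x)), np.max(np.abs(y_bad - x)))
```

With N = 16 the periodic impulse train samples w[n] only at n = 0, so y[n] = x[n]; with N = 8 the non-zero sample w[8] adds a scaled copy of x[n-8].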
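$$x[n] = \frac{1}{2\pi}\int_{-\pi}^{\pi}\left[\sum_{r=-\infty}^{\infty} f[n,n-r]\,X(r,\omega)\right]e^{j\omega n}\,d\omega$$
$$\sum_{m=-\infty}^{\infty} f[n,-m]\,w[m] = 1$$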
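$$y[n] = \frac{L}{N}\sum_{r=-\infty}^{\infty}\sum_{k=0}^{N-1} f[n,n-rL]\,X(rL,k)\,e^{j\frac{2\pi}{N}nk}$$
$$y[n] = \frac{L}{N}\sum_{r=-\infty}^{\infty} f[n-rL]\sum_{k=0}^{N-1} X(rL,k)\,e^{j\frac{2\pi}{N}nk}$$
$$L\sum_{r=-\infty}^{\infty} f[n-rL]\,w[rL-n+pN] = \delta[p]$$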
The FBS method was motivated by the filtering view of the STFT.
The OLA method was motivated by the Fourier transform view of
the STFT:
the inverse DFT is taken for each fixed time in the discrete STFT, and
an overlap-and-add operation between the short-time sections is
performed.
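$$x[n] = \frac{1}{2\pi\,w[0]}\int_{-\pi}^{\pi} X(n,\omega)\,e^{j\omega n}\,d\omega$$
If x[n] is averaged over many short-time segments
and normalized by W(0), then
$$x[n] = \frac{1}{2\pi\,W(0)}\int_{-\pi}^{\pi}\sum_{p=-\infty}^{\infty} X(p,\omega)\,e^{j\omega n}\,d\omega$$
where
$$W(0) = \sum_{n=-\infty}^{\infty} w[n]$$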
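$$y[n] = \frac{1}{W(0)}\sum_{p=-\infty}^{\infty}\left[\frac{1}{N}\sum_{k=0}^{N-1} X(p,k)\,e^{j\frac{2\pi}{N}kn}\right]$$
The bracketed term is an IDFT: fp[n] = x[n]w[p-n]
Note that the above IDFT is true provided that N > Nw. The
expression for y[n] thus becomes:
$$y[n] = \frac{1}{W(0)}\sum_{p=-\infty}^{\infty} x[n]\,w[p-n] = x[n]\,\frac{1}{W(0)}\sum_{p=-\infty}^{\infty} w[p-n]$$
If
$$\sum_{p=-\infty}^{\infty} w[p-n] = W(0)$$
then
y[n] = x[n]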
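With temporal decimation by a factor L:
$$\sum_{p=-\infty}^{\infty} w[pL-n] = \frac{W(0)}{L}$$
$$y[n] = \frac{L}{W(0)}\sum_{p=-\infty}^{\infty}\left[\frac{1}{N}\sum_{k=0}^{N-1} X(pL,k)\,e^{j\frac{2\pi}{N}kn}\right]$$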
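The OLA procedure and its window constraint can be sketched with FFTs. The code below is an illustration under assumptions that are not in the slides: the function name `ola_roundtrip` is hypothetical, each section's DFT is taken relative to the start of the section (a different phase reference from X(pL,k), which does not affect the round trip), the signal is zero-padded so every sample sees a full set of windows, and a periodic Hann window with Nw = 32 is used.

```python
import numpy as np

def ola_roundtrip(x, w, L, N):
    """Analyze every L samples with an N-point DFT, then overlap-add the inverse DFTs."""
    Nw = len(w)
    xp = np.concatenate([np.zeros(Nw), x, np.zeros(Nw)])   # pad so edges are fully covered
    y = np.zeros(len(xp))
    for start in range(0, len(xp) - Nw + 1, L):
        section = xp[start:start + Nw] * w     # short-time section
        X = np.fft.fft(section, N)             # N-point DFT, requires N >= Nw
        f = np.fft.ifft(X).real[:Nw]           # inverse DFT recovers the section
        y[start:start + Nw] += f               # overlap and add
    return (L / w.sum()) * y[Nw:Nw + len(x)]   # normalize by W(0)/L

Nw = 32
w = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(Nw) / Nw)     # periodic Hann window

rng = np.random.default_rng(2)
x = rng.standard_normal(200)

L = 8
# OLA constraint: shifted copies of the window sum to W(0)/L at every n.
shifted_sum = w.reshape(-1, L).sum(axis=0)
y_ok = ola_roundtrip(x, w, L=8, N=64)
y_bad = ola_roundtrip(x, w, L=20, N=64)        # shifts by 20 do not sum to a constant
print(shifted_sum[:3], w.sum() / L)
```

For L = 8 the shifted Hann windows sum to the constant W(0)/L = 2, so the normalized overlap-add returns x[n]; for L = 20 the sum varies with n and the output is an amplitude-modulated version of x[n].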
FBS constraint:
$$\sum_{k=0}^{N-1} W\!\left(\omega - \frac{2\pi}{N}k\right) = N\,w[0]$$
OLA constraint:
$$\sum_{p=-\infty}^{\infty} w[pL-n] = \frac{W(0)}{L}$$
The FBS method requires that finite-length windows have a length Nw less than the
number of analysis filters N to satisfy the FBS constraint (N > Nw).
Analogously, for the OLA method it can be shown that its constraint is satisfied by all finite-bandwidth analysis windows whose maximum frequency is less than 2π/L
(where L is the temporal decimation factor).
The OLA constraint also holds when
$$W(\omega) = 0 \quad \text{at} \quad \omega = \frac{2\pi}{L}k,\; k \neq 0$$
This is analogous to the FBS constraint for Nw > N, where the window w[n] is required to take
on the value zero at n = N, 2N, 3N, ...
Time-Frequency Sampling
Time-Frequency Sampling
Consider windowed/short-time signal:
fn[m]=w[m]x[n-m], and
X(n,ω): Fourier transform of fn[m]
Analysis window duration of Nw
Time-Frequency Sampling
Time-Frequency Sampling
Sufficient (but not necessary) conditions for
signal reconstruction are:
1.
2.
3.
I.
II.
Rectangular window of length Nw
Assuming bandwidth equal
to the extent of the main lobe:
B = [-2π/Nw, 2π/Nw], i.e. B = 4π/Nw
Temporal decimation: L ≤ 2π/B = Nw/2
For a window with twice this bandwidth, 2B = 8π/Nw: L ≤ Nw/4
Summary
FBS Method
1. No frequency-domain aliasing occurs if the
decimation factor L meets the Nyquist criterion, i.e.,
L Nw (2/c) where c is the w[n] bandwidth.
2. No time-domain aliasing occurs if 2π/N ≤ 2π/Nw,
i.e., Nw ≤ N.
3. If zeros in w[n] are allowed then condition 2 can be
relaxed. In this case we can under-sample in time
and still recover the sequence.
Time-scale modification
Speech Enhancement
$$r[n,m] = \frac{1}{2\pi}\int_{-\pi}^{\pi} |X(n,\omega)|^2\,e^{j\omega m}\,d\omega$$
$$|X(n,\omega)|^2 = \sum_{m=-\infty}^{\infty} r[n,m]\,e^{-j\omega m}$$
r[n,m]: short-time autocorrelation
|X(n,ω)|: short-time magnitude
m: autocorrelation lag
fn[m]=x[m]w[n-m]
Signal Representation
Under what conditions can the STFTM be used to
represent a sequence uniquely?
Note that:
|F{x[n]}| = |F{-x[n]}|
This is an ambiguity, so the STFTM is not a unique representation
in all cases.
However, by imposing certain mild restrictions on:
the analysis window and
the signal,
unique signal representation is indeed possible with
the discrete-time STFTM.
Signal Representation
If successive STFTM frames correspond to overlapping
signal segments, then:
If the short-time spectral magnitude of the signal segment at
time n is known, then
the spectral magnitude of the adjacent section at time
n+1 must be consistent in the region of overlap with
the known short-time section.
If the analysis window is non-zero and of length Nw,
then after dividing out the analysis window, the first
Nw-1 samples of the segment at time n+1 must equal
the last Nw-1 samples of the segment at time n (as illustrated on
the next slide).
If the last sample of a segment can be extrapolated from
its first Nw-1 values, one can repeat this process to
obtain the entire signal x[n].
Signal Representation
To develop the procedure for extrapolating the next
sample of a sequence using its STFTM, assume that the
first Nw-1 samples under the analysis window positioned
at time n are known.
Signal Representation
$$x[n] = \frac{r[n,N_w-1]}{w[0]\,w[N_w-1]\,x[n-(N_w-1)]}$$
where x[n] is the last sample of the present segment.
Signal Representation
Note that:
$$|X(n,\omega)|^2 = \sum_{m=-\infty}^{\infty} r[n,m]\,e^{-j\omega m}$$
Signal Representation
Sequential extrapolation algorithm
1. Initialize with x[0]
2. Update time n
3. Compute r[n,Nw-1] from the inverse DFT
of |X(n,k)|².
4. Compute:
$$x[n] = \frac{r[n,N_w-1]}{w[0]\,w[N_w-1]\,x[n-(N_w-1)]}$$
5. Return to step (2) and repeat.
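The sequential extrapolation steps can be sketched as follows. Several choices here are assumptions rather than part of the slides: the function names are hypothetical, x[n] is taken to be zero for n < 0, the DFT length is N = 2Nw so the inverse DFT of |X(n,k)|² gives an unaliased autocorrelation, the test signal is strictly positive so no division by zero occurs, and during start-up (n < Nw-1) the largest available lag n is used in place of Nw-1.

```python
import numpy as np

def stft_magnitude_sq(x, w, N):
    """|X(n,k)|^2 for every n, with x[m] = 0 for m < 0 and f_n[m] = x[m] w[n-m]."""
    Nw = len(w)
    xp = np.concatenate([np.zeros(Nw - 1), x])
    mags = np.empty((len(x), N))
    for n in range(len(x)):
        seg = xp[n:n + Nw] * w[::-1]           # samples x[n-Nw+1..n] times w[Nw-1..0]
        mags[n] = np.abs(np.fft.fft(seg, N)) ** 2
    return mags

def sequential_extrapolation(mag_sq, w, x0):
    """Rebuild x[n] from |X(n,k)|^2 and the known first sample x[0]."""
    Nw = len(w)
    T = mag_sq.shape[0]
    x = np.zeros(T)
    x[0] = x0                                   # step 1: initialize
    for n in range(1, T):                       # step 2: update time n
        r = np.fft.ifft(mag_sq[n]).real         # step 3: short-time autocorrelation r[n, m]
        lag = min(n, Nw - 1)                    # start-up uses lag n (assumption)
        x[n] = r[lag] / (w[0] * w[lag] * x[n - lag])   # step 4
    return x

rng = np.random.default_rng(3)
x = rng.uniform(0.5, 1.5, size=40)              # strictly positive test signal
w = np.hamming(8)                               # non-zero over its full length
mag_sq = stft_magnitude_sq(x, w, N=2 * len(w))
x_rec = sequential_extrapolation(mag_sq, w, x0=x[0])
print(np.max(np.abs(x_rec - x)))
```

Because each new sample is obtained by dividing by a previously estimated one, errors propagate along the recursion; the exact magnitudes used here hide that sensitivity.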
Limitations:
Example 7.6
At time n:
Suppose a time-decimated
STFT, X(nL,ω), is multiplied
by a linear phase factor
e^{jωn0} to obtain
Y(nL,ω) = X(nL,ω)e^{jωn0}
At time (n+1):
X((n+1)L,ω) is multiplied
by the negative of this linear
phase factor, e^{-jωn0}, to obtain
Y((n+1)L,ω) = X((n+1)L,ω)e^{-jωn0}
The overlapping sections of the
inverse Fourier transforms,
denoted by gnL[m] and
g(n+1)L[m], are not consistent.
$$Y(n,k) = Y(n,\omega)\big|_{\omega=\frac{2\pi}{N}k} = X(n,k)\,H(n,k)$$
$$\tilde{h}[n,m] = \sum_{l=-\infty}^{\infty} h[n,m+lN], \quad \text{periodic over } N$$
$$y[n] = \sum_{m=-\infty}^{\infty} x[n-m]\,h[n,m]$$
$$X(n,\omega) = \sum_{m=-\infty}^{\infty} x[m]\,w[n-m]\,e^{-j\omega m}$$
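$$y[n] = \frac{1}{W(0)}\sum_{p=-\infty}^{\infty}\left[\frac{1}{N}\sum_{k=0}^{N-1} X(p,k)\,H(k)\,e^{j\frac{2\pi}{N}kn}\right]$$
$$y[n] = \sum_{m=-\infty}^{\infty} x[m]\left[\frac{1}{N}\sum_{k=0}^{N-1} H(k)\,e^{j\frac{2\pi}{N}k(n-m)}\right]\frac{1}{W(0)}\sum_{p=-\infty}^{\infty} w[p-m]$$
The bracketed term is an IDFT equal to $\sum_{r} h[n-m+rN]$, and $\sum_{p} w[p-m] = W(0)$, so
$$y[n] = \sum_{m=-\infty}^{\infty} x[m]\sum_{r=-\infty}^{\infty} h[n-m+rN]$$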
Time-Scale Modification
Time-Scale Modification
Methods:
Select frame size & location synchronous to pitch periods. Problem of pitch
period mismatch is avoided.
Problem:
STFTM Synthesis
1.
2.
3.
Time-Scale Modification
Noise Reduction
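A number of techniques have been developed to remove/reduce
additive noise:
The noise-corrupted signal is given by:
y[n] = x[n] + b[n]
STFT Synthesis:
Subtract the noise spectrum Sb(ω):
$$|\hat{X}(nL,\omega)| = \begin{cases}\left[\,|Y(nL,\omega)|^2 - S_b(\omega)\right]^{1/2} & \text{if } |Y(nL,\omega)|^2 - S_b(\omega) \ge 0 \\ 0 & \text{otherwise}\end{cases}$$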
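The subtraction rule can be sketched for a single STFT frame. This is a minimal illustration, not the author's implementation: the function name `spectral_subtract` is hypothetical, the noise power spectrum Sb is assumed to be known, and reusing the phase of the noisy STFT for the cleaned estimate is an assumption added here so that a complex frame is returned.

```python
import numpy as np

def spectral_subtract(Y, Sb):
    """Subtract a noise power spectrum Sb from |Y|^2, flooring negative results at zero.

    Y  : complex STFT values of the noisy signal (one frame or an array of frames)
    Sb : noise power spectrum estimate, broadcastable to Y's shape
    """
    power = np.abs(Y) ** 2 - Sb
    magnitude = np.sqrt(np.maximum(power, 0.0))      # zero where subtraction goes negative
    return magnitude * np.exp(1j * np.angle(Y))      # keep the noisy phase (assumption)

# Usage on three frequency bins:
Y = np.array([3 + 4j, 1 + 0j, 0.5j])
Sb = np.array([9.0, 4.0, 0.25])
X_hat = spectral_subtract(Y, Sb)
print(X_hat)
```

In the first bin the power drops from 25 to 16, so the magnitude goes from 5 to 4 with unchanged phase; the other two bins are set to zero because the noise estimate is at least as large as the observed power.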
Noise Reduction
STFTM Synthesis:
Ignore phase and use Sequential Extrapolation or
Least-Squared Error estimation method to construct
clean signal.