[Figure: the analysis operator maps x(t) to the sequence of inner products ⟨x(t), ψ₁(t)⟩, ⟨x(t), ψ₂(t)⟩, … .]
Notes by J. Romberg
x(t) = Σ_{k∈ℤ} α(k) e^{j2πkt},

where

α(k) = ∫₀¹ x(t) e^{−j2πkt} dt = ⟨x(t), e^{j2πkt}⟩.
Fourier series has two nice properties:
1. The {α(k)} carry semantic information about which frequencies are in the signal.
2. If x(t) is smooth, the magnitudes |α(k)| fall off quickly as k increases. This energy compaction provides a kind of implicit compression.
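To see this energy compaction numerically, here is a small sketch (our own illustration, not from the notes; the two test signals are arbitrary choices) that estimates α(k) by a Riemann sum for a smooth periodic signal and for a discontinuous one:

```python
import numpy as np

# Estimate the Fourier series coefficients alpha(k) = <x(t), e^{j 2 pi k t}>
# by a Riemann sum, for a smooth periodic signal and a discontinuous one.
t = np.linspace(0.0, 1.0, 4096, endpoint=False)
dt = t[1] - t[0]
smooth = np.exp(np.cos(2 * np.pi * t))   # infinitely differentiable, periodic
step = (t < 0.5).astype(float)           # jump discontinuity at t = 0.5

def fs_coeff(x, k):
    # alpha(k) = integral_0^1 x(t) e^{-j 2 pi k t} dt  (Riemann sum)
    return np.sum(x * np.exp(-2j * np.pi * k * t)) * dt

ks = np.arange(1, 32, 2)                 # odd k (nonzero for the step signal)
mag_smooth = np.array([abs(fs_coeff(smooth, k)) for k in ks])
mag_step = np.array([abs(fs_coeff(step, k)) for k in ks])

# The smooth signal's |alpha(k)| collapse by many orders of magnitude,
# while the step's coefficients decay only like 1/k.
print(mag_smooth[-1] / mag_smooth[0], mag_step[-1] / mag_step[0])
```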
If x(t) is real, it might be sort of annoying that we are representing
it using a list of complex numbers. An equivalent decomposition
is
x(t) = α(0)ψ_{0,0}(t) + Σ_{m∈{0,1}} Σ_{k=1}^{∞} α(m, k) ψ_{m,k}(t),

where

ψ_{0,k}(t) = { 1, k = 0; √2 cos(2πkt), k ≥ 1 }
ψ_{1,k}(t) = √2 sin(2πkt).
For a bandlimited signal x(t) with samples x[n] = x(nT),

x(t) = Σ_{n=−∞}^{∞} x[n] · sin(π(t − nT)/T) / (π(t − nT)/T)
     = Σ_{n=−∞}^{∞} α(n) ψ_n(t)

with

ψ_n(t) = √T · sin(π(t − nT)/T) / (π(t − nT)) and α(n) = √T x(nT).
If x(t) is bandlimited, then the α(n) are also inner products against the ψ_n(t):

α(n) = √T x(nT)
     = (√T/2π) ∫_{−π/T}^{π/T} x̂(ω) e^{jωnT} dω
     = (1/2π) ⟨x̂(ω), ψ̂_n(ω)⟩,

where

ψ̂_n(ω) = { √T e^{−jωnT}, |ω| ≤ π/T; 0, |ω| > π/T },

whose inverse Fourier transform is ψ_n(t) = √T · sin(π(t − nT)/T) / (π(t − nT)).
Thus we can interpret the Shannon-Nyquist sampling theorem as an expansion of a bandlimited signal in an orthobasis of shifted sinc functions. We offer two additional notes about this result:
Sampling a signal is a fundamental operation in applications. Analog-to-digital converters (ADCs) are prevalent and relatively cheap: ADCs operating at 10s of MHz cost on the order of a few dollars/euros.
The sinc representation for bandlimited signals is mathematically the same as the Fourier series for signals with finite
support, just with the roles of time and frequency reversed.
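As a small numerical sketch of the sinc expansion (our own construction; the sampling interval and the bandlimited test signal are arbitrary choices for illustration):

```python
import numpy as np

# Reconstruct a bandlimited signal from its samples x(nT) with the
# shifted-sinc expansion. np.sinc(u) = sin(pi u)/(pi u), so
# np.sinc((t - n*T)/T) is exactly the interpolation kernel.
T = 0.1                                  # band limit is pi/T ~ 31.4 rad/s

def x(t):
    return np.sinc(2 * t) ** 2           # bandlimited to 4*pi rad/s, decays fast

n = np.arange(-200, 201)                 # truncate the (infinite) sample sum
t = np.linspace(-1.0, 1.0, 501)
x_rec = np.sinc((t[:, None] - n * T) / T) @ x(n * T)

err = np.max(np.abs(x_rec - x(t)))
print(err)                               # small: only truncation error remains
```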
Orthobasis expansions
Fourier series and the sampling theorem are both examples of
expansions in an orthonormal basis (orthobasis expansion for
short). The set of signals { } is an orthobasis for a space H
if
1.
h , 0 i =
1 = 0
.
0 =
6 0
The second operator Ψ*: ℓ₂(Γ) → H takes a sequence of coefficients in ℓ₂(Γ) and uses them to build up a signal. The mapping Ψ* is called the synthesis operator, and its action is given by

Ψ*[{α(γ)}] = Σ_{γ∈Γ} α(γ) ψ_γ(t).
⟨x, y⟩_H = Σ_{γ∈Γ} α(γ) β(γ)*,

where

α(γ) = ⟨x, ψ_γ⟩_H and β(γ) = ⟨y, ψ_γ⟩_H.

Proof.

⟨x, y⟩_H = ⟨ Σ_{γ} α(γ) ψ_γ , Σ_{γ′} β(γ′) ψ_{γ′} ⟩_H
         = Σ_{γ} Σ_{γ′} α(γ) β(γ′)* ⟨ψ_γ, ψ_{γ′}⟩_H
         = Σ_{γ} α(γ) β(γ)*.

In particular, taking y = x,

∫ |x(t)|² dt = ∫ x(t) x(t)* dt = Σ_{γ∈Γ} α(γ) α(γ)* = Σ_{γ∈Γ} |α(γ)|² = ‖α‖²_{ℓ₂(Γ)}.
Everything is discrete
An amazing consequence of the Parseval theorem is that every
space of signals for which we can find any orthobasis can be discretized. That the mapping from (continuous) signal space into
(discrete) coefficient space preserves inner products essentially means
that it preserves all of the geometrical relationships between the
signals (i.e. distances and angles). In some sense, this means
that all signal processing can be done by manipulating discrete
sequences of numbers.
For our primary continuous spaces of interest, L2(R) and L2([0, 1])
which are equipped with the standard inner product, there are
many orthobases from which to choose, and so many ways in which
we can sample the signal to make it discrete.
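A finite-dimensional sketch of this discretization principle (a randomly generated orthobasis, our own illustration): the map to coefficients preserves inner products exactly.

```python
import numpy as np

# An orthobasis maps signals to discrete coefficient sequences while
# preserving inner products (and hence distances and angles).
rng = np.random.default_rng(0)
N = 64
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
Psi = Q.T                                # rows of Psi form an orthobasis of R^N

x = rng.standard_normal(N)
y = rng.standard_normal(N)
alpha, beta = Psi @ x, Psi @ y           # coefficient sequences

print(x @ y, alpha @ beta)               # identical up to rounding
```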
Here is an example of the power of the Parseval theorem. Suppose that I have samples {x[n] = x(nT)}_n of a bandlimited signal x(t). Suppose one of the samples is perturbed by a known amount δ, forming

x̃[n] = { x[n] + δ, n = n₀; x[n], otherwise }.

What is the effect on the reconstructed signal? That is, if

x̃(t) = Σ_{n∈ℤ} x̃[n] · sin(π(t − nT)/T) / (π(t − nT)/T),

what is

∫ |x̃(t) − x(t)|² dt ?
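The question can be probed numerically (a sketch of our own; T and δ are arbitrary values). The reconstruction difference is δ · sinc((t − n₀T)/T), and by the Parseval theorem its energy should be T δ², since the perturbation changes the single coefficient α(n₀) by √T δ:

```python
import numpy as np

# Integrate the energy of the difference signal delta * sinc((t - n0*T)/T)
# numerically and compare with the Parseval prediction T * delta^2.
T, delta = 0.5, 0.3                      # arbitrary illustrative values
t = np.linspace(-400 * T, 400 * T, 2_000_001)
dt = t[1] - t[0]
err = delta * np.sinc(t / T)             # difference signal, centered at n0*T = 0
energy = np.sum(err ** 2) * dt
print(energy, T * delta ** 2)            # approximately equal (truncated tails)
```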
Cosine transforms
The cosine-I transform is an alternative to Fourier series; it is an
expansion in an orthobasis for functions on [0, 1] (or any interval on
the real line) where the basis functions look like sinusoids. There
are two main differences that make it more attractive than Fourier
series for certain applications:
1. the basis functions and the expansion coefficients are real-valued;
2. the basis functions have different symmetries.
Definition. The cosine-I basis functions for t ∈ [0, 1] are

ψ_k(t) = { 1, k = 0; √2 cos(πkt), k > 0 }.  (2)
(2)
We can derive the cosine-I basis from the Fourier series in the following manner. Let x(t) be a signal on the interval [0, 1]. Let x̃(t) be its reflected extension to [−1, 1]. That is,

x̃(t) = { x(−t), −1 ≤ t ≤ 0; x(t), 0 ≤ t ≤ 1 }.
[Figure: x(t) on [0, 1] and its reflected extension x̃(t) on [−1, 1].]
Expanding x̃(t) in a Fourier series on [−1, 1],

x̃(t) = Σ_{k=−∞}^{∞} β_k e^{jπkt}.

Since x̃(t) is real, this can be rewritten as

x̃(t) = a₀ + Σ_{k=1}^{∞} a_k cos(πkt) + Σ_{k=1}^{∞} b_k sin(πkt),

and since x̃(t) is even, b_k = 0 for all k = 1, 2, 3, …, leaving

x̃(t) = a₀ + Σ_{k=1}^{∞} a_k cos(πkt).

Since we can use this expansion to build up any symmetric function on [−1, 1], the restriction of the function to [0, 1] is arbitrary, so any x(t) on [0, 1] can be written as

x(t) = a₀ + Σ_{k=1}^{∞} a_k cos(πkt).
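A quick numerical check of this reflection argument (our own, with an arbitrary test signal): the Fourier coefficients of the even extension are real, i.e. the sine terms really do vanish.

```python
import numpy as np

# Sample x(t) on [0, 1], reflect it to an even signal on [-1, 1], and compute
# the Fourier coefficients of the (period-2) extension: since the extension is
# real and even, the coefficients are real (no sine terms).
M = 512
t = np.linspace(0.0, 1.0, M + 1)
x = np.exp(-3 * t) * (1 + t)             # arbitrary signal on [0, 1]
xe = np.concatenate([x, x[-2:0:-1]])     # even extension: xe[m] == xe[2M - m]

beta = np.fft.fft(xe) / (2 * M)          # Fourier coefficients beta_k
print(np.max(np.abs(beta.imag)))         # ~ machine precision
```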
[Figure: the first few cosine-I basis functions ψ₁(t), ψ₂(t), ψ₃(t), … on [0, 1].]
The discrete cosine basis vectors for ℝᴺ are

ψ_k[n] = { 1/√N, k = 0; √(2/N) cos( (πk/N)(n + 1/2) ), k = 1, …, N − 1 },  n = 0, 1, …, N − 1.  (3)
ψ_k(t) = √2 cos( π(k + 1/2) t ),  k = 0, 1, 2, … .  (4)
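The discrete basis of equation (3) can be checked directly; a short sketch building the N × N matrix and verifying orthonormality:

```python
import numpy as np

# Build the basis vectors of equation (3) as rows of an N x N matrix and
# check orthonormality: Psi @ Psi.T should be the identity.
N = 8
n = np.arange(N)
Psi = np.zeros((N, N))
Psi[0] = 1.0 / np.sqrt(N)
for k in range(1, N):
    Psi[k] = np.sqrt(2.0 / N) * np.cos(np.pi * k * (n + 0.5) / N)

print(np.max(np.abs(Psi @ Psi.T - np.eye(N))))   # ~ machine precision
```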
[Figure: the 2D DCT basis functions ψ_{j,k}[m, n] for j, k = 0, …, 7.]

2D DCT coefficients are indexed by two integers, and so are naturally arranged on a grid as well:

γ_{0,0}     γ_{0,1}     ⋯  γ_{0,N−1}
γ_{1,0}     γ_{1,1}     ⋯  γ_{1,N−1}
  ⋮           ⋮         ⋱     ⋮
γ_{N−1,0}  γ_{N−1,1}  ⋯  γ_{N−1,N−1}
[Figure: pixel values of an 8 × 8 image block and its 2D DCT coefficients, along with the ordering in which the coefficients are scanned.]
To get a rough feel for how closely this model matches reality, let's look at a simple example. Here we have an original image of size 2048 × 2048, and a zoom into a 256 × 256 piece of the image:
[Figure: the original image and a zoomed-in 256 × 256 piece.]
Here is the same piece after using 1 of the 64 coefficients per block (1/64 ≈ 1.6%), 3/64 ≈ 4.6% of the coefficients, and 10/64 ≈ 15.6%:
[Figure: reconstructions from 1/64, 3/64, and 10/64 of the DCT coefficients per block.]
You can see that the coefficients at low frequencies (upper left) are
being treated much more gently than those at higher frequencies
(lower right).
The decoder simply reconstructs each 8 × 8 block x̂_b using the synthesis formula

x̂_b[m, n] = Σ_{k=0}^{7} Σ_{ℓ=0}^{7} γ̂_{k,ℓ} ψ_{k,ℓ}[m, n].
By the Parseval theorem, we know exactly what the effect of quantizing each coefficient is going to be on the total error, as

‖x_b − x̂_b‖₂² = ‖γ − γ̂‖₂² = Σ_{k=0}^{7} Σ_{ℓ=0}^{7} |γ_{k,ℓ} − γ̂_{k,ℓ}|².
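Here is a small sketch of that accounting (the quantization step sizes below are arbitrary stand-ins, not the actual JPEG tables): quantize the 2D DCT coefficients of a random 8 × 8 block and compare the pixel-domain and coefficient-domain squared errors.

```python
import numpy as np

# Quantize the 2D DCT coefficients of an 8x8 block; by Parseval the squared
# error in the pixel domain equals the squared error in the coefficients.
N = 8
n = np.arange(N)
Psi = np.zeros((N, N))                     # orthonormal 1D DCT matrix, eq. (3)
Psi[0] = 1.0 / np.sqrt(N)
for k in range(1, N):
    Psi[k] = np.sqrt(2.0 / N) * np.cos(np.pi * k * (n + 0.5) / N)

rng = np.random.default_rng(1)
block = rng.integers(0, 256, (N, N)).astype(float)

gamma = Psi @ block @ Psi.T                # 2D DCT coefficients gamma_{k,l}
step = 1.0 + 4.0 * np.add.outer(n, n)      # coarser steps at higher frequencies
gamma_q = step * np.round(gamma / step)    # uniform quantization
block_q = Psi.T @ gamma_q @ Psi            # decoder's synthesis formula

err_pix = np.sum((block - block_q) ** 2)
err_coef = np.sum((gamma - gamma_q) ** 2)
print(err_pix, err_coef)                   # equal up to rounding
```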
Video compression
The DCT also plays a fundamental role in video compression (e.g.
MPEG, H.264, etc.), but in a slightly different way. Video codecs
are complicated, but here is essentially what they do:
1. Estimate, describe, and quantize the motion in between
frames.
2. Use the motion estimate to predict the next frame.
3. Use the (block-based) DCT to code the residual.
Here is an example video frame, along with the differences between
this frame and the next two frames (in false color):
[Figure: a video frame x_{t₀}, and the frame differences x_{t₁} − x_{t₀} and x_{t₂} − x_{t₀} (in false color).]
The only activity is where the car is moving from left to right.
Each interval [a_n, a_{n+1}] is assigned a window g_n(t), where g_n(t) is a window that is flat in the middle, monotonic on its ends, and obeys

Σ_n g_n(t)² = 1.
[Figure: overlapping window functions; a windowed pulse, with a zoom on one transition region.]
Non-orthogonal bases in RN
When x ∈ ℝᴺ, basis representations fall squarely into the realm of linear algebra. Let ψ₀, ψ₁, …, ψ_{N−1} be a set of N linearly independent vectors in ℝᴺ. Since the ψ_k are linearly independent, every x ∈ ℝᴺ produces a unique sequence of inner products against the {ψ_k}. That is, we can recover x from the sequence of inner products

[ α₀, α₁, …, α_{N−1} ]ᵀ = [ ⟨x, ψ₀⟩, ⟨x, ψ₁⟩, …, ⟨x, ψ_{N−1}⟩ ]ᵀ.
Stacking up the (transposed) ψ_k as rows in an N × N matrix Ψ,

Ψ = [ ψ₀ᵀ ; ψ₁ᵀ ; ⋯ ; ψ_{N−1}ᵀ ],

we have α = Ψx and x = Ψ⁻¹α. Writing the columns of Ψ⁻¹ as the synthesis vectors ψ̃_k,

Ψ⁻¹ = [ ψ̃₀ | ψ̃₁ | ⋯ | ψ̃_{N−1} ].
x[n] = Σ_{k=0}^{N−1} ⟨x, ψ_k⟩ ψ̃_k[n].

The expansion is stable in that

σ₁² ‖x‖₂² ≤ ‖α‖₂² ≤ σ_N² ‖x‖₂²,

where σ₁ is the smallest singular value of the analysis operator matrix Ψ and σ_N is its largest singular value.
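A finite-dimensional sketch of the above (a random basis, which is linearly independent with probability 1): analysis with the rows of Ψ, synthesis with the columns of Ψ⁻¹, and the singular-value energy bounds.

```python
import numpy as np

# Analysis with a non-orthogonal basis (rows of Psi), synthesis with the dual
# vectors (columns of Psi^{-1}), and the singular-value energy bounds.
rng = np.random.default_rng(2)
N = 16
Psi = rng.standard_normal((N, N))        # rows psi_k, linearly independent w.p. 1
Dual = np.linalg.inv(Psi)                # columns are the dual vectors

x = rng.standard_normal(N)
alpha = Psi @ x                          # alpha_k = <x, psi_k>
x_rec = Dual @ alpha                     # perfect reconstruction

s = np.linalg.svd(Psi, compute_uv=False)
print(np.max(np.abs(x_rec - x)))         # ~ 0
print(s.min() ** 2 * (x @ x) <= alpha @ alpha <= s.max() ** 2 * (x @ x))
```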
To extend these ideas to infinite dimensions, we need to use the
language of linear operators in place of matrices (which introduces
a few interesting complications). Before doing this, we will take a
first look at overcomplete expansions.
Overcomplete frames in RN
A sequence of vectors ψ₀, ψ₁, …, ψ_{M−1} in ℝᴺ is a frame if there is no x ∈ ℝᴺ, x ≠ 0, that is orthogonal to all of the ψ_k. This means that the sequence of inner products

α = [ ⟨x, ψ₀⟩, ⟨x, ψ₁⟩, …, ⟨x, ψ_{M−1}⟩ ]ᵀ = Ψx,

where Ψ is the M × N matrix with the ψ_kᵀ as rows, uniquely determines x. With M > N, this means that Ψ is overdetermined and has no null space (and hence has full column-rank). Of course, Ψ does not have an inverse, so we must take a little more caution with the reproducing formula.
Since the M × N matrix Ψ has full column rank, we know that the N × N matrix ΨᵀΨ is invertible. The reproducing formula then comes from

x = (ΨᵀΨ)⁻¹ΨᵀΨx.

Now define the synthesis basis vectors ψ̃_k as the columns of the pseudo-inverse (ΨᵀΨ)⁻¹Ψᵀ:

ψ̃_k = (ΨᵀΨ)⁻¹ψ_k.
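A sketch with a random overcomplete frame (full column rank with probability 1), applying the pseudo-inverse to recover x from the redundant inner products:

```python
import numpy as np

# Overcomplete frame: M > N analysis vectors. Reconstruction applies the
# pseudo-inverse (Psi^T Psi)^{-1} Psi^T, whose columns are the synthesis
# vectors psi~_k.
rng = np.random.default_rng(3)
M, N = 12, 5
Psi = rng.standard_normal((M, N))            # rows are the frame vectors psi_k
Dual = np.linalg.inv(Psi.T @ Psi) @ Psi.T    # N x M pseudo-inverse

x = rng.standard_normal(N)
alpha = Psi @ x                              # M redundant inner products
x_rec = Dual @ alpha                         # x = (Psi^T Psi)^{-1} Psi^T Psi x
print(np.max(np.abs(x_rec - x)))             # ~ 0
```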
x[n] = Σ_{k=0}^{M−1} ⟨x, ψ_k⟩ ψ̃_k[n].

The expansion is again stable:

σ₁² ‖x‖₂² ≤ ‖α‖₂² ≤ σ_N² ‖x‖₂²,

where now σ₁ is the smallest singular value of the analysis operator matrix Ψ and σ_N is its largest singular value (i.e. σ_N² is the largest eigenvalue of the symmetric positive-definite matrix ΨᵀΨ).
If the rows of Ψᵀ (the columns of Ψ) are orthogonal and all have the same energy A, then ΨᵀΨ = A · Identity and we have a Parseval relation

⟨Ψx, Ψy⟩ = ⟨ΨᵀΨx, y⟩ = A ⟨x, y⟩,

and so

Σ_{k=0}^{M−1} ⟨x, ψ_k⟩ ⟨y, ψ_k⟩ = A ⟨x, y⟩.
As an example, take the three vectors in ℝ²

ψ₀ = [0, 1]ᵀ, ψ₁ = [−√3/2, −1/2]ᵀ, ψ₂ = [√3/2, −1/2]ᵀ.

Thus

ΨᵀΨ = (3/2) · Identity,

and so

Σ_{k=0}^{2} |⟨x, ψ_k⟩|² = (3/2) ‖x‖₂²,

and A = B = 3/2.
Example: Unions of orthobases in ℝᴺ. Suppose our sequence {ψ_γ} is a union of L sequences, each of which is an orthobasis:

{ψ¹_{γ₁}}_{γ₁∈Γ₁} ∪ {ψ²_{γ₂}}_{γ₂∈Γ₂} ∪ ⋯ ∪ {ψᴸ_{γ_L}}_{γ_L∈Γ_L}.

Then

Σ_{γ₁∈Γ₁} |⟨x, ψ¹_{γ₁}⟩|² + Σ_{γ₂∈Γ₂} |⟨x, ψ²_{γ₂}⟩|² + ⋯ + Σ_{γ_L∈Γ_L} |⟨x, ψᴸ_{γ_L}⟩|² = L ‖x‖₂²,

since each of the L sums equals ‖x‖₂² by the Parseval theorem.
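A numerical check of this computation (our own sketch; the orthobases are generated randomly via QR):

```python
import numpy as np

# Stack L orthobases of R^N into one analysis matrix; the coefficient energy
# of any x is exactly L times its signal energy (a tight frame with A = L).
rng = np.random.default_rng(4)
N, L = 16, 3
rows = []
for _ in range(L):
    Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
    rows.append(Q.T)                     # rows of Q.T form an orthobasis
Psi = np.vstack(rows)                    # (L*N) x N analysis matrix

x = rng.standard_normal(N)
energy = np.sum((Psi @ x) ** 2)
print(energy, L * (x @ x))               # equal up to rounding
```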