An important consequence of the increase in smoothness of the wavelet function ψ(x) with increasing
zero moments is the reduction in magnitude of wavelet and/or scaling coefficients of a (sufficiently
smooth) function f (x). We discuss this idea briefly below, and follow it up with an example.
First, suppose that ψ(x) has compact support in [a, b] and that it has M > 0 vanishing moments,
i.e.,
$$m_k = \int_a^b x^k\, \psi(x)\, dx = 0\,, \qquad 0 \le k \le M-1. \tag{1}$$
If f(x) is constant, or linear, or even a polynomial of degree less than M on the support D_{jk} of ψ_{jk},
then the above integral vanishes, since the moments vanish.
Let us investigate this situation more generally. First of all, since ψ(x) has compact support in
[a, b], i.e., ψ(x) is zero outside the interval [a, b], the support D_{jk} of the wavelet ψ_{jk} is given by all
x ∈ R which satisfy
$$a \le 2^j x - k \le b\,, \quad \text{i.e.,} \quad 2^{-j}(a+k) \le x \le 2^{-j}(b+k). \tag{2}$$
The wavelet coefficient of f associated with ψ_{jk} is
$$b_{jk} = \langle f, \psi_{jk} \rangle = 2^{j/2} \int_{D_{jk}} f(x)\, \psi(2^j x - k)\, dx. \tag{3}$$
We now shift the variable x to x + 2^{-j}k so that the above integral becomes
$$b_{jk} = 2^{j/2} \int_{a_j}^{b_j} f(2^{-j}k + x)\, \psi(2^j x)\, dx\,, \tag{4}$$
where a_j = 2^{-j}a and b_j = 2^{-j}b. We now expand f in a Taylor series about the point 2^{-j}k:
$$f(2^{-j}k + x) = f(2^{-j}k) + f'(2^{-j}k)\,x + \cdots + \frac{1}{(M-1)!}\, f^{(M-1)}(2^{-j}k)\, x^{M-1} + \frac{1}{M!}\, f^{(M)}(c)\, x^M\,, \tag{5}$$
where c lies between 2^{-j}k and 2^{-j}k + x. Now substitute this expression into (4),
$$b_{jk} = 2^{j/2} \sum_{i=0}^{M-1} \frac{f^{(i)}(2^{-j}k)}{i!} \int_{a_j}^{b_j} x^i\, \psi(2^j x)\, dx \;+\; 2^{j/2}\, \frac{f^{(M)}(c)}{M!} \int_{a_j}^{b_j} x^M\, \psi(2^j x)\, dx\,, \tag{6}$$
where f^{(i)} denotes the ith derivative of f. We now show that all integrals in the summation from
i = 0 to i = M − 1 vanish because of the vanishing moments in Eq. (1). If we make the change of
variable s = 2^j x, then x = 2^{-j}s, dx = 2^{-j}ds, etc., so that the integrals become
$$\int_{a_j}^{b_j} x^i\, \psi(2^j x)\, dx = 2^{-j} \int_a^b (2^{-j}s)^i\, \psi(s)\, ds = 2^{-j(i+1)} \int_a^b s^i\, \psi(s)\, ds = 0\,, \qquad 0 \le i \le M-1. \tag{7}$$
Using the same change of variable for the final integral, we arrive at the following result,
$$b_{jk} = \frac{2^{-j(M+1/2)}\, f^{(M)}(c)}{M!} \int_a^b x^M\, \psi(x)\, dx\,. \tag{8}$$
From this result, we can obtain the following upper bound to b_{jk},
$$|b_{jk}| \le \frac{2^{-j(M+1/2)}\, K}{M!} \left| \int_a^b x^M\, \psi(x)\, dx \right|\,, \tag{9}$$
where
$$K = \max_{x \in D_{jk}} |f^{(M)}(x)|. \tag{10}$$
Note that the maximum of the derivative f^{(M)} is taken only over the support D_{jk}: recall that the
Taylor expansion in Eq. (5) was taken about the point 2^{-j}k with x ∈ [a_j, b_j], which implies that the
intermediate point c lies in D_{jk}.
For some standard wavelets, e.g., Daubechies-N, the moment integrals are known exactly in
closed form. (For example, see the book by Boggess and Narcowich, Proposition 6.1, p. 232.) But
even without these results, one can, from the appearance of the factor 2^{-jM} in the numerator and
the factor M! in the denominator of Eq. (9), conjecture that the magnitude |b_{jk}| of a given wavelet
coefficient will decrease as M, the number of vanishing moments of ψ(x), increases, at least for
sufficiently large M.
In fact, we can work a little harder to obtain the following upper bounds on the integral on the right:
$$\left| \int_a^b x^M \psi(x)\, dx \right| \le \int_a^b |x^M \psi(x)|\, dx \le \left( \int_a^b x^{2M}\, dx \right)^{1/2} \left( \int_a^b \psi(x)^2\, dx \right)^{1/2} \quad \text{(Cauchy-Schwarz)}$$
$$= \frac{1}{\sqrt{2M+1}} \left( b^{2M+1} - a^{2M+1} \right)^{1/2}\,, \tag{11}$$
where we have used the fact that ψ(x) is normalized, i.e., ‖ψ‖₂ = 1. Substitution of this result into
(9) yields the following inequality,
$$|b_{jk}| \le \frac{2^{-j(M+1/2)}\, K}{M!\, \sqrt{2M+1}} \left( b^{2M+1} - a^{2M+1} \right)^{1/2}. \tag{12}$$
In most cases, one of a or b, and often both, will have magnitude greater than unity. Nevertheless, the
decaying exponential 2^{-jM} and the factorial M! in the denominator ensure that |b_{jk}| eventually
decreases with M.
Example: In the following table are listed the magnitudes of a selected branch of wavelet coefficients
b_{j, 2^{j-1}-1} for the function sin(πx), x ∈ [0, 1], for Daubechies-N wavelets, N = 4, 6, 8, 10. For each j,
the coefficient b_{j, 2^{j-1}-1} lies near the middle of the row of coefficients b_{jk}, 0 ≤ k ≤ 2^j − 1 – the last
element on the left half of the row.
In each row of the table, we see that the magnitude of the wavelet coefficient decreases with N.
Recall that the Daubechies-N wavelet has M = N/2 vanishing moments.
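Although we cannot reproduce the Daubechies-N computations here without a filter-bank library, the M = 1 case can be checked directly, since the Haar wavelet is explicit. The following sketch (our own illustration, not part of the notes) computes a branch of Haar coefficients of sin(πx) by simple quadrature, along a branch near x = 1/4 where f′ does not vanish, and checks the 2^{-j(M+1/2)} = 2^{-3j/2} decay predicted by Eq. (8):

```python
import numpy as np

def haar_coeff(f, j, k, n=4096):
    # b_jk = <f, psi_jk> for the Haar wavelet psi = +1 on [0,1/2), -1 on [1/2,1),
    # computed by the midpoint rule on the support of psi_jk.
    a, b = k / 2**j, (k + 1) / 2**j
    x = a + (np.arange(n) + 0.5) * (b - a) / n
    psi = 2**(j / 2) * np.where(2**j * x - k < 0.5, 1.0, -1.0)
    return np.sum(f(x) * psi) * (b - a) / n

f = lambda x: np.sin(np.pi * x)
# Branch of coefficients whose supports sit near x = 1/4, where f' does not vanish.
mags = [abs(haar_coeff(f, j, 2**(j - 2))) for j in range(3, 8)]
ratios = [m2 / m1 for m1, m2 in zip(mags, mags[1:])]
# With M = 1 vanishing moment, Eq. (8) predicts |b_jk| ~ 2^{-3j/2}, so the
# successive ratios should approach 2^{-3/2} ~ 0.354.
```

The same experiment with a Daubechies-N filter bank (larger M) would show correspondingly faster decay.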
Vanishing moments and the approximation of functions
The decrease in wavelet coefficient magnitudes afforded by wavelets with larger numbers of vanishing
moments is very important in applications. It implies that more of the “energy” of a (sufficiently
smooth) signal f (x) is stored in the low frequency (i.e., low j) coefficients bjk . As such, more of
the higher frequency coefficients are negligible, implying that we may achieve the same accuracy in
approximating f (x) with fewer coefficients, i.e., data compression.
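This compression effect is easy to demonstrate with the discrete Haar transform. The following is a minimal sketch of our own (the function names are not from the notes): a smooth signal is transformed, only the 5% largest-magnitude coefficients are kept, and the reconstruction error remains small.

```python
import numpy as np

def haar_fwd(v):
    # Full orthonormal Haar analysis of a length-2^n vector.
    out = v.astype(float).copy()
    n = len(out)
    while n > 1:
        a = (out[:n:2] + out[1:n:2]) / np.sqrt(2)   # averages -> coarser V space
        d = (out[:n:2] - out[1:n:2]) / np.sqrt(2)   # details  -> W space
        out[:n//2], out[n//2:n] = a, d
        n //= 2
    return out

def haar_inv(w):
    # Inverse transform: undo the averaging/differencing steps in reverse order.
    out = w.astype(float).copy()
    n = 2
    while n <= len(out):
        a, d = out[:n//2].copy(), out[n//2:n].copy()
        out[:n:2] = (a + d) / np.sqrt(2)
        out[1:n:2] = (a - d) / np.sqrt(2)
        n *= 2
    return out

x = np.sin(np.pi * np.linspace(0, 1, 1024, endpoint=False))
w = haar_fwd(x)
# Keep only the 5% largest-magnitude coefficients (crude compression).
thresh = np.quantile(np.abs(w), 0.95)
w_c = np.where(np.abs(w) >= thresh, w, 0.0)
rel_err = np.linalg.norm(haar_inv(w_c) - x) / np.linalg.norm(x)
```

With a smoother wavelet (more vanishing moments), the same budget of retained coefficients would yield an even smaller error.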
Theorem: Let {V_j}_{j∈Z} be a multiresolution analysis of L²(R) with scaling function φ. Furthermore,
suppose that the associated wavelet function ψ has M vanishing moments, according to the definition
in Eq. (1). Given an f ∈ L²(R), let f_j = P_j f denote the projection of f onto V_j, i.e., f_j ∈ V_j is the best
approximation of f in V_j in the L² sense,
$$f_j = \sum_{k \in \mathbb{Z}} \langle f, \phi_{jk} \rangle\, \phi_{jk}\,. \tag{14}$$
Then
$$\| f - f_j \|_2 \le C\, 2^{-jM}\,, \tag{15}$$
where C is a constant independent of j and M but dependent on f (x) and the wavelet system.
At any given scale j, the error in approximation is seen to decrease with the number of zero
moments of the wavelet function ψ. Moreover, the decrease in error is exponential in the number M
of zero moments. This is consistent with the upper bound of Eq. (12). Of course, as stated in the
theorem, as we move from one MRA whose wavelet has M₁ zero moments to another one with M₂ zero
moments, the constant C will change, but it will not depend on M – overall, therefore, the
exponential decrease in error with respect to M is unaffected.
A final comment regarding the effect of regularity on the above approximation error: We have
seen elsewhere that the more regular a function f is, the easier it is to approximate. In fact, the
approximation error in Eq. (15) also depends on the Hölder exponent α discussed earlier. If
the function f is Hölder-α over the interval [a, b] of concern, then the exponent α will appear in the
exponent along with j and M in a multiplicative way, implying that as α increases, the error decreases.
If such a signal f(x) is corrupted with additive noise, i.e., f̃ = f + n, then the noise will be present
throughout the entire wavelet coefficient tree – it will not decrease in magnitude at higher frequencies,
just as in the case of the discrete Fourier transform. Since a great many of the high frequency coefficients
of the noiseless signal f are negligible, they may be discarded, removing a significant amount of the
noise component of f̃ but not much of the signal f.
Multiresolution analysis and Fourier transforms
Much of the development of MRA and construction of wavelet bases was actually based on the
Fourier analysis of scaling and wavelet functions. For example, the construction of the famous
Daubechies-N wavelets was done via Fourier transforms. In this section, we examine briefly the
Fourier-based analysis of MRA.
As has been the case before, there is a problem with notation – different books use different
notations. Here, we shall adopt the notation of Boggess and Narcowich, which actually differs slightly
from the notation that has been used in this course to date. From time to time, we shall provide
some “translation” of results into the notation used by researchers in multiresolution analysis and
signal/image processing, and employed by S. Mallat in his classic book, A Wavelet Tour of Signal
Processing, The Sparse Way.
Recall that the scaling function φ of an MRA satisfies a scaling relation of the form,
$$\phi(x) = \sum_k h_k \sqrt{2}\, \phi(2x - k)\,. \tag{16}$$
Boggess and Narcowich write this relation as
$$\phi(x) = \sum_k p_k\, \phi(2x - k)\,, \tag{17}$$
i.e.,
$$p_k = \sqrt{2}\, h_k\,. \tag{18}$$
We shall adopt this notation in this section so that the results conform with those presented in the
B&N book.
First, let us recall the definition of the Fourier transform (FT) of a function f : R → R:
$$F(\omega) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(x)\, e^{-i\omega x}\, dx\,. \tag{19}$$
The FT of the scaling function φ is then
$$\Phi(\omega) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \phi(x)\, e^{-i\omega x}\, dx\,, \tag{20}$$
so that
$$\Phi(0) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \phi(x)\, dx. \tag{21}$$
In the following treatment, we adopt the following additional normalization condition that is used in
B&N,
$$\int_{\mathbb{R}} \phi(x)\, dx = 1\,. \tag{22}$$
As such,
$$\Phi(0) = \frac{1}{\sqrt{2\pi}}\,. \tag{23}$$
Here is our first result.
Theorem 1: The set of functions {φ(x − l)}, l ∈ Z, is orthonormal if and only if
$$\sum_{k \in \mathbb{Z}} |\Phi(\omega + 2\pi k)|^2 = \frac{1}{2\pi} \quad \text{for all } \omega \in \mathbb{R}. \tag{24}$$
In addition, the wavelet function ψ(x) is orthogonal to φ(x − l) for all l ∈ Z if and only if
$$\sum_{k \in \mathbb{Z}} \Phi(\omega + 2\pi k)\, \overline{\Psi(\omega + 2\pi k)} = 0 \quad \text{for all } \omega \in \mathbb{R}\,, \tag{26}$$
Lecture 34
Proof: We shall prove the first part – the proof of the second part is similar in form. By replacing
x − k with x and relabelling n = l − k, the orthonormality relation for φ becomes
$$\int_{\mathbb{R}} \phi(x)\, \phi(x - n)\, dx = \delta_{0n}\,. \tag{27}$$
From this, along with the following consequence of the Scaling Theorem for FTs,
Now divide the real line R into the intervals I_k = [2πk, 2π(k + 1)], so that the above equation can be
written as
$$\sum_{k \in \mathbb{Z}} \int_{2\pi k}^{2\pi(k+1)} |\Phi(\omega)|^2\, e^{in\omega}\, d\omega = \delta_{0n}\,. \tag{32}$$
(Summation and integration can be interchanged because the result is finite.) For each k ∈ Z, we
make the change of variable ω → ω + 2πk, which changes the limits of integration of all integrals to 0
and 2π, i.e.,
$$\int_0^{2\pi} \sum_{k \in \mathbb{Z}} |\Phi(\omega + 2\pi k)|^2\, e^{in(\omega + 2\pi k)}\, d\omega = \delta_{0n}\,. \tag{33}$$
Now define
$$F(\omega) = 2\pi \sum_{k \in \mathbb{Z}} |\Phi(\omega + 2\pi k)|^2\,. \tag{35}$$
The above result looks like a Fourier series-type expansion over the finite interval [0, 2π] involving the
basis functions e^{−inω}. We now check that F(ω) is 2π-periodic:
$$F(\omega + 2\pi) = 2\pi \sum_{k \in \mathbb{Z}} |\Phi(\omega + 2\pi(k+1))|^2 = 2\pi \sum_{k' \in \mathbb{Z}} |\Phi(\omega + 2\pi k')|^2 \quad (k' = k+1)$$
$$= F(\omega)\,. \tag{37}$$
with
$$a_n = \frac{1}{2\pi} \int_0^{2\pi} F(\omega)\, e^{-in\omega}\, d\omega\,. \tag{39}$$
This completes the first part of the proof. The second part is proved in a similar manner.
Let us now consider the scaling equation for a scaling function φ of an MRA, written as
$$\phi(x) = \sum_k p_k\, \phi(2x - k)\,, \tag{42}$$
recalling that
$$p_k = \sqrt{2}\, h_k\,. \tag{43}$$
Now take Fourier transforms of both sides and use the Scaling Theorem to arrive at
$$\Phi(\omega) = \frac{1}{2} \sum_k p_k\, \Phi\!\left(\frac{\omega}{2}\right) e^{-ik\omega/2} = \frac{1}{2}\, \Phi\!\left(\frac{\omega}{2}\right) \sum_k p_k\, e^{-ik\omega/2}\,. \tag{44}$$
so that
$$\Phi(\omega) = P(e^{-i\omega/2})\, P(e^{-i\omega/4})\, \Phi(\omega/4)\,. \tag{48}$$
Continuing this procedure n times, we obtain
$$\Phi(\omega) = P(e^{-i\omega/2}) \cdots P(e^{-i\omega/2^n})\, \Phi(\omega/2^n) = \Phi(\omega/2^n) \prod_{k=1}^{n} P(e^{-i\omega/2^k})\,. \tag{49}$$
implies that
$$\Phi(0) = \frac{1}{\sqrt{2\pi}}\,, \tag{52}$$
so that Eq. (50) becomes
$$\Phi(\omega) = \frac{1}{\sqrt{2\pi}} \prod_{k=1}^{\infty} P(e^{-i\omega/2^k})\,. \tag{53}$$
This result expresses the FT of the scaling function φ in terms of the scaling polynomial P . It is of
limited practical use since infinite products are difficult to compute. Nevertheless, this result will be
important for some later theoretical developments.
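Despite the practical difficulty of infinite products in general, for the Haar system the product can actually be checked numerically: P(z) = (1 + z)/2, and the FT of the indicator function of [0, 1] has modulus |sin(ω/2)/(ω/2)| (omitting the common 1/√(2π) factor). A short sketch of our own, with a truncation level chosen purely for illustration:

```python
import numpy as np

def P_haar(z):
    # Scaling polynomial of the Haar system: p_0 = p_1 = 1, so P(z) = (1 + z)/2.
    return (1 + z) / 2

def Phi_truncated(omega, n=40):
    # Truncation of the infinite product (53), without the 1/sqrt(2*pi) factor.
    prod = 1.0 + 0j
    for k in range(1, n + 1):
        prod *= P_haar(np.exp(-1j * omega / 2**k))
    return prod

w = 1.7
exact = abs(np.sin(w / 2) / (w / 2))   # modulus of the FT of the Haar phi
approx = abs(Phi_truncated(w))
```

The truncated product converges very rapidly, since the remaining factors differ from 1 by O(ω²/4^n).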
Recall that the wavelet function ψ associated with the scaling function φ satisfies the following
scaling equation,
$$\psi(x) = \sum_k g_k \sqrt{2}\, \phi(2x - k)\,. \tag{54}$$
Also recall that the orthogonality condition ⟨ψ, φ⟩ = 0 led to the following solution for the g_k,
$$g_k = (-1)^k\, \overline{h_{1-k}}\,. \tag{55}$$
Note that we are now allowing the coefficients h_k, and therefore the p_k, to be complex-valued.
Now take Fourier transforms of both sides of the above equation to arrive at the result,
$$\Psi(\omega) = \frac{1}{2}\, \Phi(\omega/2) \sum_k (-1)^k\, \bar p_{1-k}\, e^{-ik\omega/2} = \frac{1}{2}\, \Phi(\omega/2) \sum_l (-1)^{1-l}\, \bar p_l\, e^{-i(1-l)\omega/2}$$
$$= -\frac{1}{2}\, \Phi(\omega/2)\, e^{-i\omega/2} \sum_l (-1)^l\, \bar p_l\, e^{il\omega/2} = -\frac{1}{2}\, \Phi(\omega/2)\, e^{-i\omega/2} \sum_l \bar p_l\, (-e^{i\omega/2})^l\,. \tag{57}$$
This expression may be written compactly in terms of the polynomial
$$Q(z) = -z\, P(-z)\,. \tag{59}$$
The above results may be combined to give the following necessary condition on the polynomial
P (z) for the existence of a multiresolution analysis. This result provided the basis for I. Daubechies’
construction of wavelets with a given number of vanishing moments.
Equivalently,
$$|P(e^{-it})|^2 + |P(e^{-i(t+\pi)})|^2 = 1 \quad \text{for } 0 \le t \le 2\pi\,. \tag{64}$$
Proof: Since φ satisfies the orthonormality condition, we have, from Theorem 1 above, that
$$\sum_k |\Phi(\omega + 2\pi k)|^2 = \frac{1}{2\pi} \quad \text{for all } \omega \in \mathbb{R}\,. \tag{65}$$
Furthermore, since φ satisfies the scaling equation, we have, from the previous Theorem, that
$$\Phi(\omega) = P(e^{-i\omega/2})\, \Phi(\omega/2)\,. \tag{66}$$
Now divide the sum in Eq. (65) into even- and odd-indexed terms and use Eq. (66) to perform the
following operations,
$$\frac{1}{2\pi} = \sum_{l \in \mathbb{Z}} |\Phi(\omega + (2l)2\pi)|^2 + \sum_{l \in \mathbb{Z}} |\Phi(\omega + (2l+1)2\pi)|^2$$
$$= \sum_{l \in \mathbb{Z}} |P(e^{-i(\omega/2 + 2l\pi)})|^2\, |\Phi(\omega/2 + 2l\pi)|^2 + \sum_{l \in \mathbb{Z}} |P(e^{-i(\omega/2 + (2l+1)\pi)})|^2\, |\Phi(\omega/2 + (2l+1)\pi)|^2$$
$$= |P(e^{-i\omega/2})|^2 \sum_{l \in \mathbb{Z}} |\Phi(\omega/2 + 2\pi l)|^2 + |P(-e^{-i\omega/2})|^2 \sum_{l \in \mathbb{Z}} |\Phi((\omega/2 + \pi) + 2\pi l)|^2\,. \tag{67}$$
From Theorem 2 above, each of the sums in the final line is equal to 1/2π, with ω replaced by ω/2
and ω/2 + π, respectively. Therefore the above equation reduces to the result,
$$|P(e^{-i\omega/2})|^2 + |P(-e^{-i\omega/2})|^2 = 1\,.$$
It is instructive to examine a special case of the above result because of its connections with some
previous results. In the special case z = 1, we have
$$|P(1)|^2 + |P(-1)|^2 = 1\,.$$
But recall the following condition involving the h_k, which was derived from the finite L¹ condition on
φ(x):
$$\sum_k h_k = \sqrt{2}\,. \tag{71}$$
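For the Haar system, all of these conditions can be verified directly. A small numerical sketch of our own:

```python
import numpy as np

# Check the condition (64) for the Haar scaling polynomial P(z) = (1 + z)/2
# (p_0 = p_1 = 1), together with the special values P(1) and P(-1).
def P(z):
    return (1 + z) / 2

t = np.linspace(0, 2 * np.pi, 1001)
lhs = np.abs(P(np.exp(-1j * t)))**2 + np.abs(P(np.exp(-1j * (t + np.pi))))**2
# P(1) = 1 and P(-1) = 0 are consistent with sum(h_k) = sqrt(2)
# for the Haar coefficients h_0 = h_1 = 1/sqrt(2).
h = np.array([1.0, 1.0]) / np.sqrt(2)
```

Here lhs should be identically 1 over the whole interval [0, 2π].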
Fourier transforms and vanishing moments of wavelets
Let us recall the definition of a wavelet function ψ that has M vanishing moments:
$$m_k = \int_{\mathbb{R}} x^k\, \psi(x)\, dx = 0\,, \qquad k = 0, 1, 2, \ldots, M-1\,. \tag{75}$$
This result can be connected to the behaviour of the Fourier transform Ψ(ω) of the wavelet ψ(x) by
means of the following result established earlier in the course,
$$F^{(n)}(\omega) = \frac{d^n}{d\omega^n} \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(x)\, e^{-i\omega x}\, dx = (-i)^n \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} x^n f(x)\, e^{-i\omega x}\, dx\,. \tag{76}$$
Setting ω = 0 in (76) applied to ψ gives Ψ^{(k)}(0) = (−i)^k m_k/√(2π). Therefore, if
$$\Psi^{(k)}(0) = 0\,, \qquad 0 \le k \le M-1\,, \tag{78}$$
then the wavelet function ψ(x) has M vanishing moments. In the next lecture, we’ll take a look at
how such wavelets can be constructed with the help of their Fourier transforms.
With regard to this lecture (34), you are responsible only for the section immediately
above, i.e., Eqs. (75)-(78). You are not responsible for Theorems 1 and 2 presented
earlier.
Lecture 35
We’ll repeat the short discussion from the end of the previous lecture. Recall the definition of a wavelet
function ψ that has M vanishing moments:
$$m_k = \int_{\mathbb{R}} x^k\, \psi(x)\, dx = 0\,, \qquad k = 0, 1, 2, \ldots, M-1\,. \tag{79}$$
This result can be connected to the behaviour of the Fourier transform Ψ(ω) of the wavelet ψ(x) by
means of the following result established earlier in the course,
$$F^{(n)}(\omega) = \frac{d^n}{d\omega^n} \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(x)\, e^{-i\omega x}\, dx = (-i)^n \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} x^n f(x)\, e^{-i\omega x}\, dx\,. \tag{80}$$
From this result,
$$\Psi^{(k)}(0) = 0 \ \text{implies that}\ m_k = 0\,. \tag{81}$$
Therefore, if
$$\Psi^{(k)}(0) = 0\,, \qquad 0 \le k \le M-1\,, \tag{82}$$
then the wavelet function ψ(x) has M vanishing moments.
Let us now recall Eq. (58) from an earlier section, which relates the FT of ψ to the FT of φ, where
$$Q(z) = -z\, P(-z)\,. \tag{84}$$
With an eye to Eq. (82), we are going to take derivatives of both sides of Eq. (85) with respect to ω,
and evaluate these derivatives at ω = 0, also recalling that
$$\Phi(0) = \frac{1}{\sqrt{2\pi}} \ne 0\,. \tag{86}$$
Let us compute the first derivative of Ψ(ω) and evaluate it at ω = 0:
$$\Psi'(0) = (i/2)\, \Phi(0)\, P(-1) - \Phi'(0)\, P(-1) + (i/2)\, \Phi(0)\, P'(-1)\,. \tag{88}$$
Also recall that the finite L¹ norm condition on the h_k coefficients requires that
$$P(1) = 1\,. \tag{90}$$
This means that we can guarantee that the first two moments of ψ vanish if we can find coefficients
p_k (and therefore h_k) such that P′(−1) = 0. This implies that P(z) has the form,
$$P(z) = \frac{1}{2}\, (1+z)^2\, Q(z)\,, \tag{93}$$
where Q(z) is a polynomial. The factor 1/2 has been added in order to conform with the polynomial
P(z) from Eq. (46).
If we once again assume that P′(−1) = 0, so that the first moment of ψ vanishes, then
$$\Psi^{(n)}(0) = (i/2)^n\, \Phi(0)\, P^{(n)}(-1)\,. \tag{98}$$
Theorem: The following statements are equivalent:
1. The wavelet function ψ(x) has M vanishing moments;
2. Ψ^{(k)}(0) = 0 , 0 ≤ k ≤ M − 1;
3. P^{(k)}(−1) = 0 , 0 ≤ k ≤ M − 1.
This theorem is presented in many books on wavelet theory, but in a slightly different form with
regard to the third condition involving the polynomial P(z). Recall that the coefficients of P are, up
to a constant, the scaling coefficients h_k. Also recall that z = e^{−iω/2}. Let us rewrite P in terms of the
h_k and ω,
$$P(z) = \frac{1}{2} \sum_k p_k z^k = \frac{1}{\sqrt{2}} \sum_k h_k\, e^{-ik\omega/2} = \frac{1}{\sqrt{2}}\, H(\omega)\,. \tag{101}$$
The function H(ω) may be viewed as a Fourier series – or, equivalently, the Fourier transform of a train
of delta functions with weights h_k situated at the integers in k-space. Note that z = −1 corresponds to
ω = 2π. As such, Condition No. 3 above may be restated in terms of H(ω) as follows:
• H^{(k)}(2π) = 0 , 0 ≤ k ≤ M − 1.
Indeed, the entire discussion in this section on Fourier transforms and MRA can be formulated in
terms of the function H(ω). This is what is done in most books on wavelet theory and signal/image
processing, including S. Mallat’s classic book, A Wavelet Tour of Signal Processing: The Sparse Way.
(There is a further complication, however, in that most books employ a slight variation of H(ω) in
which ω/2 is replaced by ω.) However, we have decided to follow the method and notation of Boggess
and Narcowich.
Daubechies’ approach was to construct polynomials P_M(z) that satisfy Condition No. 3 for given
values of M, i.e.,
$$P_M^{(k)}(-1) = 0\,, \qquad 0 \le k \le M-1\,, \tag{102}$$
which would guarantee that the wavelets ψ_M defined by the coefficients p_k would have M vanishing
moments. These polynomials have the form
$$P_M(z) = \frac{1}{2}\, (1+z)^M\, Q_M(z)\,, \tag{103}$$
where Q_M(z) is a polynomial. (Once again, we have inserted the factor 1/2 so that P(z) has the form in
Eq. (46).) Clearly, the factor (1 + z)^M ensures that Condition No. 3 is satisfied. The coefficients – and
degree – of Q_M(z) are determined by the other conditions that have to be satisfied by the coefficients
p_k of P_M(z).
We now investigate a couple of simple cases. Of course, the simplest possible case is M = 1.
Case No. 1: M = 1. In this case, the wavelet ψ will have one vanishing moment, which is characteristic
of all wavelets, namely,
$$\int_{\mathbb{R}} \psi(x)\, dx = 0\,. \tag{104}$$
The polynomial P₁(z) will have the form,
$$P_1(z) = \frac{1}{2}\, (1+z)\, Q_1(z)\,. \tag{105}$$
Actually, Q₁(z) = C, where C is a constant, will work. Clearly, P(z) satisfies the condition
$$P(-1) = 0\,. \tag{106}$$
But it also needs to satisfy Eq. (90) (finite L¹ norm condition), so that
$$P(1) = C = 1\,. \tag{107}$$
This leads to
$$P_1(z) = \frac{1}{2} + \frac{1}{2}\, z\,. \tag{108}$$
From Eq. (46), the coefficients p_k in this case are
$$p_0 = p_1 = 1\,. \tag{109}$$
Also recall that the p_k are related to the scaling coefficients h_k as follows,
$$p_k = \sqrt{2}\, h_k \implies h_k = \frac{1}{\sqrt{2}}\, p_k\,. \tag{110}$$
As such, we have the result,
$$h_0 = h_1 = \frac{1}{\sqrt{2}}\,, \tag{111}$$
which, as the reader may have already anticipated, corresponds to the Haar MRA.
The above result does not preclude the existence of more complicated polynomials Q₁(z) in Eq.
(105). These would presumably yield coefficient strings p_k or h_k of greater length. Nothing significant
is gained, however, since all wavelets have one vanishing moment. For this reason, we move to the
case M = 2.
Case No. 2: M = 2 (two vanishing moments). The polynomial P₂(z) will have the form,
$$P_2(z) = \frac{1}{2}\, (1+z)^2\, Q_2(z)\,, \tag{112}$$
where Q₂(z) is to be determined. Recalling the fact that the number of nonzero coefficients h_k = p_k/√2
must be even, a first-degree polynomial,
$$Q_2(z) = a_0 + a_1 z\,, \tag{113}$$
can, at least in principle, produce coefficients h_k, 0 ≤ k ≤ 3. Let us attempt to solve for a₀ and a₁.
Firstly, we write
$$(1+z)^2 (a_0 + a_1 z) = \sum_{k=0}^{3} p_k z^k\,, \tag{114}$$
where
$$p_0 = a_0\,, \quad p_1 = 2a_0 + a_1\,, \quad p_2 = a_0 + 2a_1\,, \quad p_3 = a_1\,. \tag{115}$$
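The expansion (114)-(115) is easy to verify with a polynomial multiplication. A quick check of our own, using arbitrary test values for a₀ and a₁:

```python
import numpy as np

# Expand (1 + z)^2 (a0 + a1 z) and compare with Eq. (115).
# Coefficients are listed lowest degree first.
a0, a1 = 0.3, 0.2                                       # arbitrary test values
coeffs = np.polynomial.polynomial.polymul([1.0, 2.0, 1.0], [a0, a1])
expected = [a0, 2 * a0 + a1, a0 + 2 * a1, a1]           # p_0, p_1, p_2, p_3
```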
In general, the finite L¹ norm condition P(1) = 1 for polynomials of the form in Eq. (46) implies that
$$\sum_k p_k = 2\,. \tag{116}$$
Here,
$$4a_0 + 4a_1 = 2 \implies a_0 + a_1 = \frac{1}{2}\,. \tag{117}$$
In this case, the odd–even condition of Eq. (118) – the sums of the even- and odd-indexed coefficients
must be equal – reads
$$p_0 + p_2 = 2a_0 + 2a_1\,, \qquad p_1 + p_3 = 2a_0 + 2a_1\,. \tag{119}$$
This condition is automatically satisfied and yields no new information. Actually, this is not a coincidence:
The original condition
$$P(-1) = 0\,, \tag{120}$$
is built into the factored form (103), which, in turn, implies the odd–even condition in Eq. (118). As a
result, all polynomials P_M(z) of the form in (103) will produce coefficients that satisfy the odd–even
relation.
We also recall the L² normalization condition
$$\sum_k h_k^2 = 1 \implies \sum_k p_k^2 = 2\,. \tag{122}$$
If we use Eq. (117) to write
$$a_1 = \frac{1}{2} - a_0\,, \tag{123}$$
then substitution of this result into Eq. (122) yields the following quadratic equation in a₀,
$$a_0^2 - \frac{1}{2}\, a_0 - \frac{1}{8} = 0\,. \tag{124}$$
The root a₀ = (1 + √3)/4 yields the coefficients
$$h_0 = \frac{1+\sqrt{3}}{4\sqrt{2}}\,, \quad h_1 = \frac{3+\sqrt{3}}{4\sqrt{2}}\,, \quad h_2 = \frac{3-\sqrt{3}}{4\sqrt{2}}\,, \quad h_3 = \frac{1-\sqrt{3}}{4\sqrt{2}}\,.$$
These are the scaling coefficients of the Daubechies-4 wavelet system presented in an earlier lecture.
If we use the other root a₀ = (1 − √3)/4 of the quadratic, the resulting h_k values are
$$h_0 = \frac{1-\sqrt{3}}{4\sqrt{2}}\,, \quad h_1 = \frac{3-\sqrt{3}}{4\sqrt{2}}\,, \quad h_2 = \frac{3+\sqrt{3}}{4\sqrt{2}}\,, \quad h_3 = \frac{1+\sqrt{3}}{4\sqrt{2}}\,. \tag{127}$$
This is a reversal of the first h-vector. The effect of this reversal is to invert the graphs of the original
Daubechies-4 functions with respect to the y-axis and translate them so that they are supported on
the same intervals as the original functions. (They are also inverted with respect to the x-axis.)
Finally, recall from a previous lecture that the general solution for a scaling coefficient vector of
length N = 4 could be expressed as the following one-parameter family,
$$h_0 = \frac{1}{2\sqrt{2}}\, (1 - \cos\alpha + \sin\alpha)\,, \qquad h_1 = \frac{1}{2\sqrt{2}}\, (1 + \cos\alpha + \sin\alpha)\,,$$
$$h_2 = \frac{1}{2\sqrt{2}}\, (1 + \cos\alpha - \sin\alpha)\,, \qquad h_3 = \frac{1}{2\sqrt{2}}\, (1 - \cos\alpha - \sin\alpha)\,. \tag{128}$$
We now ask which value of α will produce the Daubechies-4 system. The second zero-moment condition,
$$P_M'(-1) = 0\,, \tag{129}$$
implies that
$$\sum_k (-1)^k\, k\, h_k = 0\,. \tag{130}$$
Substituting the h_k from Eq. (128) into the above relation yields the equation,
$$-2 + 4\cos\alpha = 0 \implies \cos\alpha = \frac{1}{2} \implies \alpha = \frac{\pi}{3}\,, \tag{131}$$
which reproduces the Daubechies-4 coefficients above.
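This value of α is easily checked numerically. A small sketch of our own:

```python
import numpy as np

# Evaluate the one-parameter family (128) at alpha = pi/3 and check that it
# reproduces the Daubechies-4 coefficients and the conditions derived above.
alpha = np.pi / 3
c, s = np.cos(alpha), np.sin(alpha)
h = np.array([1 - c + s, 1 + c + s, 1 + c - s, 1 - c - s]) / (2 * np.sqrt(2))

# Daubechies-4 scaling coefficients.
d4 = np.array([1 + np.sqrt(3), 3 + np.sqrt(3),
               3 - np.sqrt(3), 1 - np.sqrt(3)]) / (4 * np.sqrt(2))

moment = sum((-1)**k * k * h[k] for k in range(4))   # condition (130)
```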
Another important class of wavelets developed shortly after the Daubechies-N wavelets are the Coif-
man wavelets, or “Coiflets”, named after Ronald Coifman, a mathematician from Yale who also played
an important role in the development of wavelet theory and applications.
In addition to satisfying zero-moment conditions involving the wavelet, the Coifman systems also
satisfy zero-moment conditions involving the scaling function. For example, one family of Coifman
wavelets satisfies the two conditions,
$$m_k = \int_{\mathbb{R}} x^k\, \psi(x)\, dx = 0\,, \qquad 0 \le k \le L-1\,,$$
$$m_k' = \int_{\mathbb{R}} x^k\, \phi(x)\, dx = 0\,, \qquad 1 \le k \le L-1\,. \tag{132}$$
(The condition m₀′ = 0 cannot be satisfied.) We have already seen how the zero wavelet-moment
condition is imposed by setting derivatives of the Fourier transform Ψ(ω) to zero. As might be
expected, the zero scaling-moment condition is imposed in a similar way. We go back to the expression
of the Fourier transform Φ(ω) of the scaling function φ:
$$\Phi(\omega) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \phi(x)\, e^{-i\omega x}\, dx\,. \tag{133}$$
From Eq. (80) for the derivatives of a Fourier transform, we see that the scaling moments m_k′ vanish
if and only if the corresponding derivatives Φ^{(k)}(0) vanish. To impose these conditions, we use Eq. (45)
derived earlier in this section,
$$\Phi(\omega) = P(e^{-i\omega/2})\, \Phi(\omega/2)\,.$$
Let us now investigate the imposition of a couple of the zero-derivative conditions. First of all,
differentiating once and setting ω = 0 gives
$$\Phi'(0) = (1/2)\, \Phi'(0) - (i/2)\, \Phi(0)\, P'(1)\,. \tag{137}$$
Therefore,
$$\Phi'(0) = 0 \iff P'(1) = 0\,. \tag{138}$$
This relation should be compared with the condition Ψ′(0) = 0 in Eq. (92). The condition P′(1) = 0
implies that the coefficients p_k of P(z) satisfy the relation
$$\sum_k k\, p_k = 0\,. \tag{139}$$
Differentiating a second time,
$$\Phi''(\omega) = (1/4)\, \Phi''(\omega/2)\, P(e^{-i\omega/2}) + (i/2)\, \Phi'(\omega/2)\, P'(e^{-i\omega/2}) - (1/4)\, \Phi(\omega/2)\, P''(e^{-i\omega/2})\,. \tag{140}$$
Therefore,
$$\Phi''(0) = (1/4)\, \Phi''(0) + (i/2)\, \Phi'(0)\, P'(1) - (1/4)\, \Phi(0)\, P''(1)\,. \tag{141}$$
It follows that
$$P'(1) = P''(1) = 0 \iff \Phi'(0) = \Phi''(0) = 0\,. \tag{142}$$
The same kind of pattern seems to be developing as was seen for the zero wavelet-moment conditions.
As such, we conjecture the following result.
Theorem: The following statements are equivalent:
1. The scaling function φ(x) satisfies m_k′ = 0 , 1 ≤ k ≤ L − 1;
2. Φ^{(k)}(0) = 0 , 1 ≤ k ≤ L − 1;
3. P^{(k)}(1) = 0 , 1 ≤ k ≤ L − 1.
As we have discussed earlier, wavelet systems with length 4 scaling coefficient vectors, i.e., h =
(h0 , h1 , h2 , h3 ), have only one degree of freedom. As such, we can impose only one vanishing moment
condition. Imposition of the zero wavelet moment condition m1 = 0 yields the Daubechies-4 wavelets.
A natural question is, “What happens if we impose the zero scaling moment condition m′1 = 0
instead?”
To answer this, let us go directly to the one-parameter family of solutions involving the α parameter,
presented in Eq. (128). Imposition of the condition P′(1) = 0 implies Eq. (139) which, in turn,
implies that
$$\sum_k k\, h_k = h_1 + 2h_2 + 3h_3 = 0\,. \tag{144}$$
Substituting the h_k from Eq. (128) into this relation yields
$$3 - 4\sin\alpha = 0 \implies \sin\alpha = \frac{3}{4}\,. \tag{145}$$
The graphs of the “Coifman-4” scaling function φ(x) and wavelet function ψ(x) are presented in the
figure below. (As expected, they are very similar to the plots for the scaling and wavelet functions
associated with a = 0.25, which were included in the notes for Lecture 31 (Week 12).)
“Coifman-4” scaling function φ(x) (left) and wavelet function ψ(x) (right) associated with α ≈ 0.269946π.
One noteworthy feature of this result is that the Coifman-4 wavelet function ψ(x) exhibits more
symmetry with respect to its central peak than does the Daubechies-4 wavelet. This was one of the
motivations to construct this system of wavelets.
In the following figure are shown the graphs of the “Coiflet-6” scaling and associated wavelet
function. Here we state, without proof, that the wavelet ψ(x) has L = 2 vanishing moments, and the
scaling function has L − 1 = 1 vanishing moment. The “6” in “Coiflet-6” signifies that six non-zero
scaling coefficients are used, i.e., h₀, h₁, · · · , h₅. Recall that for coefficient vectors h of length N = 6,
there are two degrees of freedom. One of the degrees of freedom was used to make the first-order
moment, m₁, of ψ(x) vanish. The other degree of freedom was used to make the first-order moment,
m₁′, of φ(x) vanish.
“Coiflet-6” scaling function φ(x) (left) and associated wavelet function ψ(x) (right).
Lecture 36
The following discussion is not intended to be complete – neither intellectually nor pedagogically!
A proper treatment would require a few more lectures, and many more pages. The intent of this
presentation is simply to give the reader an idea of how the ideas on wavelet representations of
functions of a single real variable may be carried over to the representation of functions of two variables,
in particular, images. For more information, the reader is referred to the references at the end of this
section.
We now outline briefly the standard approach for constructing wavelet expansions for real-valued
functions of two variables, i.e., f (x, y), with applications to image processing. The approach is based
on the method of tensor products discussed earlier in the course (e.g., two-dimensional Discrete Fourier
Transform). Let us recall this idea for the case of functions with compact support: If the set of
functions {u_k(x)}_{k=1}^∞ forms an orthonormal basis for the space L²(I), where I = [a, b], then the
set of products {u_j(x) u_k(y)}_{j,k=1}^∞ forms an orthonormal basis for L²(I × I).
First, let us recall some of the basic ingredients of a multiresolution analysis (MRA) on the space
of functions L²(R). We have a nested sequence of subspaces V_j ⊂ L²(R) such that
$$V_j \subset V_{j+1}\,, \quad j \in \mathbb{Z}. \tag{148}$$
Furthermore, we have a scaling function φ(x) ∈ V₀ such that the set of functions,
$$\phi_{0k}(x) = \phi(x - k)\,, \quad k \in \mathbb{Z}\,, \tag{149}$$
forms an orthonormal basis in V₀. This, along with the scaling property for the MRA, implies that
the set of functions
$$\phi_{jk}(x) = 2^{j/2}\, \phi(2^j x - k)\,, \quad k \in \mathbb{Z}\,, \tag{150}$$
forms an orthonormal basis in V_j for any j ∈ Z. It will be convenient to express this result as follows,
$$V_j = \overline{\mathrm{span}\{\phi_{jk}\,,\ k \in \mathbb{Z}\}}\,. \tag{151}$$
The line over the expression indicates the closure of the set – when an infinity of elements is involved,
we have to include the limit points of all possible infinite sequences in the set.
Now recall the fundamental decomposition
$$V_1 = V_0 \oplus W_0\,, \tag{152}$$
where W₀ ⊂ V₁ and W₀ = V₀^⊥. There exists a mother wavelet function ψ(x) ∈ W₀, ⟨ψ, φ⟩ = 0, such
that the set of functions,
$$\psi_{0k}(x) = \psi(x - k)\,, \quad k \in \mathbb{Z}\,, \tag{153}$$
forms an orthonormal basis in W₀. Once again, the scaling property implies that the set of functions
$$\psi_{jk}(x) = 2^{j/2}\, \psi(2^j x - k)\,, \quad k \in \mathbb{Z}\,, \tag{154}$$
forms an orthonormal basis in W_j for any j ∈ Z. It will also be convenient to express this result as
follows,
$$W_j = \overline{\mathrm{span}\{\psi_{jk}\,,\ k \in \mathbb{Z}\}}\,. \tag{155}$$
We now consider the space of square-integrable functions of two variables, L²(R²). It is first necessary
to define a nested sequence of subspaces of this space. We shall denote these spaces in bold to
distinguish them from the one-dimensional case above, i.e., **V**_k ⊂ L²(R²), with
$$\mathbf{V}_k \subset \mathbf{V}_{k+1}\,, \quad k \in \mathbb{Z}. \tag{156}$$
We shall also assume these spaces satisfy the separability and density properties for an MRA in
L²(R²), namely,
$$\bigcap_{k \in \mathbb{Z}} \mathbf{V}_k = \{0\} \quad \text{(separability)}\,, \tag{157}$$
and
$$\overline{\bigcup_{k \in \mathbb{Z}} \mathbf{V}_k} = L^2(\mathbb{R}^2) \quad \text{(density)}. \tag{158}$$
The detail spaces are defined by the decomposition
$$\mathbf{V}_k \oplus \mathbf{W}_k = \mathbf{V}_{k+1}\,, \tag{159}$$
so that
$$\mathbf{V}_0 \oplus \mathbf{W}_0 \oplus \mathbf{W}_1 \oplus \cdots \oplus \mathbf{W}_{k-1} = \mathbf{V}_k\,, \quad k > 0. \tag{160}$$
The question remains, “How do we define the spaces **V**_k?” (And, of course, **W**_k?) The answer is
that **V**_k will be a tensor product of the one-dimensional spaces V_k, i.e.,
$$\mathbf{V}_k = V_k \otimes V_k\,. \tag{161}$$
Let us start with the space **V**₀. The first step is to construct the tensor product function,
$$\Phi(x, y) = \phi(x)\, \phi(y)\,. \tag{162}$$
In the case of the Haar MRA, the function Φ(x, y) is nonzero, namely 1, on the unit square 0 ≤ x ≤
1, 0 ≤ y ≤ 1, and zero everywhere else.
The next step is to consider all possible integer translates of the function Φ(x, y) – this means
integer translates in both the x and y directions. The result is the following set of functions,
$$\Phi_{0,ij}(x, y) = \phi(x - i)\, \phi(y - j)\,, \quad i, j \in \mathbb{Z}. \tag{163}$$
In the case of the Haar MRA, the function Φ_{0,ij}(x, y) is nonzero, namely 1, on the square
i ≤ x ≤ i + 1, j ≤ y ≤ j + 1.
We now state, without formal proof, that the space **V**₀ is spanned by the above functions, i.e.,
$$\mathbf{V}_0 = \overline{\mathrm{span}\{\Phi_{0,ij}\,,\ i, j \in \mathbb{Z}\}}\,. \tag{164}$$
The general resolution spaces **V**_k are defined by tensor products of the appropriately scaled functions
φ_{ki}(x) and φ_{kj}(y), i.e.,
$$\mathbf{V}_k = \overline{\mathrm{span}\{\phi_{ki}(x)\, \phi_{kj}(y)\,,\ i, j \in \mathbb{Z}\}}\,. \tag{165}$$
In other words, the basis functions for **V**_k are constructed by forming all possible tensor products
from the functions φ_{kl} that belong to the one-dimensional space V_k. Note that we don’t have to use
functions from other spaces V_l, l ≠ k.
The next question is, “How do we define the space **W**_k?” Remember that this space must (i)
belong to **V**_{k+1} and (ii) be orthogonal to **V**_k. It might seem reasonable to consider the basis elements
formed from the tensor products involving the wavelet functions ψ_{kl}, i.e., basis functions of the form
$$\psi_{ki}(x)\, \psi_{kj}(y)\,, \quad i, j \in \mathbb{Z}. \tag{166}$$
After all, these functions are orthogonal to the basis functions of **V**_k. But this is not enough! The
reason is that there are also other possible tensor products that are orthogonal to the basis functions
of **V**_k, namely,
$$\phi_{ki}(x)\, \psi_{kj}(y)\,, \qquad \psi_{ki}(x)\, \phi_{kj}(y)\,, \quad i, j \in \mathbb{Z}. \tag{167}$$
As a result, our space **W**_k becomes a direct sum of three tensor product spaces, i.e.,
$$\mathbf{W}_k = \mathbf{W}_k^h \oplus \mathbf{W}_k^v \oplus \mathbf{W}_k^d\,, \tag{168}$$
where
$$\mathbf{W}_k^h = \overline{\mathrm{span}\{\phi_{ki}(x)\, \psi_{kj}(y)\}}\,, \quad \mathbf{W}_k^v = \overline{\mathrm{span}\{\psi_{ki}(x)\, \phi_{kj}(y)\}}\,, \quad \mathbf{W}_k^d = \overline{\mathrm{span}\{\psi_{ki}(x)\, \psi_{kj}(y)\}}\,. \tag{169}$$
The superscripts h, v and d stand for horizontal, vertical and diagonal, respectively. These terms will
be explained shortly.
The net result of this procedure is that a function f(x, y) ∈ L²(R²) may now be expanded as
follows,
$$f(x, y) = a_{000}\, \Phi_{000}(x, y) + \sum_{k=0}^{\infty} \sum_{i \in \mathbb{Z}} \sum_{j \in \mathbb{Z}} \left[ b_{kij}^h\, \Psi_{kij}^h(x, y) + b_{kij}^v\, \Psi_{kij}^v(x, y) + b_{kij}^d\, \Psi_{kij}^d(x, y) \right]. \tag{171}$$
In applications to imaging, it is sufficient to consider image functions f(x, y) that are supported
over finite regions – without loss of generality, we consider the spatial region of support to be [0, 1]²,
i.e., (x, y) ∈ [0, 1] × [0, 1]. In this case, it is not necessary to let the i and j indices run over all integers:
Our image functions will have expansions of the form,
$$f(x, y) = a_{000}\, \Phi_{000}(x, y) + \sum_{k=0}^{\infty} \sum_{i=0}^{2^k - 1} \sum_{j=0}^{2^k - 1} \left[ b_{kij}^h\, \Psi_{kij}^h(x, y) + b_{kij}^v\, \Psi_{kij}^v(x, y) + b_{kij}^d\, \Psi_{kij}^d(x, y) \right]. \tag{172}$$
The set of all functions defined on [0, 1]² and admitting the above expansions will be denoted as
L₀²(R²) ⊂ L²(R²).
Recall that in the one-dimensional case, i.e., functions f(x) of a single variable x with compact
support, the expansion coefficients could be arranged as a binary tree. In the case of two dimensions,
the coefficients may be arranged in the form of quadtrees. The arrangement of the first three blocks
is shown in Figure 1 below. The blocks B_k^h, B_k^v, B_k^d, k ≥ 0, each contain 2^{2k} coefficients b_{kij}^h, b_{kij}^v, b_{kij}^d,
respectively. The three collections of blocks
$$B^h = \bigcup_{k} B_k^h\,, \qquad B^v = \bigcup_{k} B_k^v\,, \qquad B^d = \bigcup_{k} B_k^d\,, \tag{173}$$
comprise the fundamental horizontal, vertical and diagonal quadtrees of the coefficient tree, which we
shall discuss a bit later.
Figure 1: Arrangement of the first blocks of the coefficient quadtree. The 2 × 2 corner [A0, B0^h; B0^v, B0^d]
sits in the upper left; the blocks B1^h (upper right), B1^v (lower left) and B1^d (lower right) complete the
4 × 4 corner; the blocks B2^h, B2^v and B2^d then complete the 8 × 8 corner, and so on.
With reference to the expansion of f(x, y) in Eq. (172), the block A0 in the upper left corner of the
matrix in Figure 1 consists of the single scaling coefficient a₀₀₀, which represents the **V**₀ component of
f. Each of the three blocks B0^v, B0^d and B0^h that surround A0 also consists of one coefficient, namely,
b₀₀₀^v, b₀₀₀^d and b₀₀₀^h, respectively. These three blocks comprise the **W**₀ component of f. This means
that the A0 block (**V**₀ component) along with these three B0^λ blocks (**W**₀ component) comprise the
**V**₁ component of f. More on this later.
Each of the next set of blocks B1^λ, λ ∈ {v, d, h}, is composed of the set of 4 coefficients b_{1ij}^λ,
0 ≤ i, j ≤ 1, from Eq. (172). These comprise the **W**₁ component of f.
And then each of the next set of blocks B2^λ, λ ∈ {v, d, h}, is composed of the set of 16 coefficients
b_{2ij}^λ, 0 ≤ i, j ≤ 3, from Eq. (172). These comprise the **W**₂ component of f. And so on.
Now consider any wavelet coefficient b_{kij}^λ, λ ∈ {h, v, d}, in this matrix and the unique (infinite)
quadtree with this element as its root. We shall denote this (sub)quadtree as B_{kij}^λ. In the Haar case,
for a fixed set of indices {k, i, j}, the three quadtrees B_{kij}^h, B_{kij}^v and B_{kij}^d correspond to the same spatial
region of the image.
One step of the decomposition, in the block layout of Figure 1:

    A_N → [ A_{N−1}  B^h_{N−1} ; B^v_{N−1}  B^d_{N−1} ]
Let us now explain the terms, “horizontal,” “vertical,” and “diagonal.” We start with “horizontal”.
Recall that the basis elements spanning the horizontal subspace W^h_k have the following form,

    Ψ^h_{kij}(x, y) = φ_{ki}(x) ψ_{kj}(y).

The x-dependent part of Ψ^h_{kij}(x, y) is the scaling function φ_{ki}(x) and its y-dependent part is the
wavelet, or detail, function ψ_{kj}(y). Recalling the Haar case, this implies that the basis function
Ψ^h_{kij}(x, y) should be able to detect more detail in the y-direction than in the x-direction. But detecting
more detail in the y-direction implies that we should be able to detect horizontal edges better than
vertical edges.
In the same way, we expect that the basis functions of W^v_k, namely,

    Ψ^v_{kij}(x, y) = ψ_{ki}(x) φ_{kj}(y),

are able to pick up more detail in the x-direction than in the y-direction. As such, they will be able
to detect vertical edges. Finally, the basis functions of W^d_k, namely,

    Ψ^d_{kij}(x, y) = ψ_{ki}(x) ψ_{kj}(y),

will pick up detail in both the x- and y-directions combined, which implies a detection of diagonal
edges.
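To make the subband interpretation concrete, here is an illustrative numpy sketch (my own, not from the notes) of a single 2D Haar analysis step; it assumes rows index the y-direction.

```python
import numpy as np

# One 2D Haar analysis step: average (phi) and difference (psi) along each
# axis.  With rows indexing y, the four subbands are
#   a  ~ phi(x)phi(y)  (approximation),
#   bh ~ phi(x)psi(y)  (horizontal detail),
#   bv ~ psi(x)phi(y)  (vertical detail),
#   bd ~ psi(x)psi(y)  (diagonal detail).
def haar_step_2d(img):
    s = (img[0::2, :] + img[1::2, :]) / np.sqrt(2)  # row averages
    d = (img[0::2, :] - img[1::2, :]) / np.sqrt(2)  # row differences
    a = (s[:, 0::2] + s[:, 1::2]) / np.sqrt(2)      # then averages/differences
    bh = (d[:, 0::2] + d[:, 1::2]) / np.sqrt(2)     # across columns
    bv = (s[:, 0::2] - s[:, 1::2]) / np.sqrt(2)
    bd = (d[:, 0::2] - d[:, 1::2]) / np.sqrt(2)
    return a, bh, bv, bd

# A horizontal edge: rows change, columns are constant.
img = np.zeros((8, 8))
img[5:, :] = 1.0
a, bh, bv, bd = haar_step_2d(img)
```

For this image all detail energy lands in the horizontal block `bh`, while `bv` and `bd` vanish identically, in line with the edge interpretation above; the step is orthonormal, so the total energy of the four subbands equals that of the image.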
We now illustrate the wavelet decomposition algorithm, along with the concepts of horizontal,
vertical and diagonal subspaces, using the 512 × 512 pixel Boat test image displayed below.
The 512 × 512 greyscale values of the noiseless Boat image may be considered as forming a V9
representation (2^9 = 512). As such, these values will comprise the block A9. In the next figure are
presented the partial wavelet decompositions of this block having the form A9 = A7 ⊕ B7 ⊕ B8. The
A7 representation of Boat appears in the upper left of each figure. In the B7 and B8 detail blocks, the
magnitudes of the wavelet coefficients are plotted. These magnitudes have been magnified in order to
accentuate their differences.
Recalling Figure 1, vertical blocks appear in the lower left, horizontal blocks in the upper right
and diagonal blocks in the lower right. The accentuation of vertical edges in the lower left blocks is quite
visible, particularly in the Haar case. As well, the accentuation of horizontal edges in the upper right
blocks can also be seen. In both figures, there is very little visible structure in the lower right B^d_8
blocks, indicating that most of the “energy” of edges is contained in the other B8 blocks.

Partial wavelet decomposition of the 512 × 512 pixel Boat image with respect to Haar (left) and
Daubechies-4 (right) basis functions.
We now show the effects of truncating the wavelet expansions of the Boat image for both the Haar
and Daubechies-4 case. Recall that the original (noiseless) Boat test image, being a 512 × 512 pixel
image, may be considered as a V9 resolution image, represented by the block A9 . We may decompose
this image as follows,
A9 = A8 + B8 (corresponding to V9 = V8 ⊕ W8 ). (177)
By “throwing away” the wavelet coefficients in B8 , we have the lower resolution approximation A8 ∈
V8 . We may now continue this procedure, i.e.,
A8 = A7 + B7 (corresponding to V8 = V7 ⊕ W7 ), (178)
to produce the lower resolution approximation A7 ∈ V7 . In the figure below are shown the resulting
A8 , A7 and A6 lower-resolution approximations to the Boat image in the Haar and Daubechies-4
basis sets. In the Haar case, the approximations exhibit considerable blockiness due to both the
nonoverlapping and discontinuous nature of the Haar wavelet functions.
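In the Haar case, setting a detail block Bk to zero and inverting the transform step amounts to replacing each 2 × 2 pixel block by its average. A minimal numpy sketch (my own names, exploiting this Haar-specific shortcut rather than an explicit inverse transform):

```python
import numpy as np

# Lower-resolution Haar approximation by repeated 2x2 block averaging;
# equivalent to zeroing the finest `levels` detail blocks B_k and inverting.
def haar_approximation(img, levels):
    out = np.asarray(img, dtype=float)
    for _ in range(levels):
        n = out.shape[0]
        out = out.reshape(n // 2, 2, n // 2, 2).mean(axis=(1, 3))
    # tile back up so the coarse approximation displays at the original size
    return np.kron(out, np.ones((2 ** levels, 2 ** levels)))

img = np.arange(64, dtype=float).reshape(8, 8)
coarse = haar_approximation(img, 1)   # analogue of A8 rendered at full size
```

Each 2 × 2 block of `coarse` is constant and equal to the mean of the corresponding block of `img`; with `levels=2` the constant blocks grow to 4 × 4, reproducing the increasing blockiness seen in the Haar approximations.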
Increasingly lower-resolution approximations to the Boat image obtained by setting appropriate detail
coefficient blocks Bk = 0, for the Haar basis (left) and Daubechies-4 basis (right). Top: A8. Middle:
A7. Bottom: A6.
Denoising by wavelet thresholding
The idea of denoising by thresholding discussed earlier for functions/signals carries over to images.
Very briefly, the magnitudes of wavelet coefficients decay as we move down wavelet coefficient “trees.”
In the case of a noisy image, the coefficients eventually become “drowned out” by the noise which, at
least ideally, has the same magnitude throughout the wavelet tree. This suggests that thresholding
may accomplish at least some image denoising.
The subject of denoising by wavelet thresholding has received a great deal of attention in the
research literature. More sophisticated methods employ different thresholding levels for the different
decomposition levels Ak. In the example presented below, a relatively simple method of thresholding was
performed. Once again, the 512 × 512 pixel Boat test image, displayed at the left of the figure below, was
used. At the right is shown a noisy version of this test image.
Left: 512 × 512 pixel Boat test image. Right: Noisy Boat test image.
A simple thresholding method was applied to both Haar and Daubechies-4 representations of the noisy
Boat image shown earlier.
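A minimal sketch of such a scheme (my own code, single Haar level only for brevity, hard thresholding with a hypothetical threshold T; the actual experiment's threshold and decomposition depth are not specified here):

```python
import numpy as np

# Forward 2D Haar step: row average/difference, then column average/difference.
def haar2(img):
    s = (img[0::2, :] + img[1::2, :]) / np.sqrt(2)
    d = (img[0::2, :] - img[1::2, :]) / np.sqrt(2)
    return ((s[:, 0::2] + s[:, 1::2]) / np.sqrt(2),   # A  (approximation)
            (d[:, 0::2] + d[:, 1::2]) / np.sqrt(2),   # Bh (horizontal)
            (s[:, 0::2] - s[:, 1::2]) / np.sqrt(2),   # Bv (vertical)
            (d[:, 0::2] - d[:, 1::2]) / np.sqrt(2))   # Bd (diagonal)

# Exact inverse of haar2 (the step is orthonormal); assumes a square image.
def ihaar2(a, bh, bv, bd):
    n = a.shape[0]
    s = np.empty((n, 2 * n)); d = np.empty((n, 2 * n))
    s[:, 0::2] = (a + bv) / np.sqrt(2); s[:, 1::2] = (a - bv) / np.sqrt(2)
    d[:, 0::2] = (bh + bd) / np.sqrt(2); d[:, 1::2] = (bh - bd) / np.sqrt(2)
    out = np.empty((2 * n, 2 * n))
    out[0::2, :] = (s + d) / np.sqrt(2); out[1::2, :] = (s - d) / np.sqrt(2)
    return out

# Hard thresholding: zero detail coefficients with magnitude <= T.
def denoise(img, T):
    a, bh, bv, bd = haar2(img)
    bh, bv, bd = (np.where(np.abs(b) > T, b, 0.0) for b in (bh, bv, bd))
    return ihaar2(a, bh, bv, bd)

img = np.arange(16, dtype=float).reshape(4, 4)
recon = denoise(img, 0.0)   # T = 0 keeps everything: exact reconstruction
```

In practice T would be tied to the noise level (e.g. a multiple of its standard deviation) and the thresholding applied across several decomposition levels, as the more sophisticated methods mentioned above do.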
There remains the interesting question: Are either, or both, of the denoised images actually more
pleasing visually than the original noisy image? Sometimes, an observer will prefer a noisy image,
in which the edges are well-defined, to a denoised version in which the edges have been blurred or
degraded in some other way.
Denoising of noisy Boat image via wavelet coefficient thresholding. Left: Haar basis. Right:
Daubechies-4 basis.