Gasquet, Witomski

Texts in Applied Mathematics 3
Editors
J.E. Marsden
L. Sirovich
M. Golubitsky
W. Jäger
Advisor
G. Iooss
P. Holmes
Springer Science+Business Media, LLC

Texts in Applied Mathematics
1. Sirovich: Introduction to Applied Mathematics.

2. Wiggins: Introduction to Applied Nonlinear Dynamical Systems and Chaos.
3. Hale/Koçak: Dynamics and Bifurcations.
4. Chorin/Marsden: A Mathematical Introduction to Fluid Mechanics, 3rd ed.
5. Hubbard/West: Differential Equations: A Dynamical Systems Approach:
Ordinary Differential Equations.
6. Sontag: Mathematical Control Theory: Deterministic Finite Dimensional
Systems, 2nd ed.
7. Perko: Differential Equations and Dynamical Systems, 2nd ed.
8. Seaborn: Hypergeometric Functions and Their Applications.
9. Pipkin: A Course on Integral Equations.
10. Hoppensteadt/Peskin: Mathematics in Medicine and the Life Sciences.
11. Braun: Differential Equations and Their Applications, 4th ed.
12. Stoer/Bulirsch: Introduction to Numerical Analysis, 2nd ed.
13. Renardy/Rogers: A First Graduate Course in Partial Differential Equations.
14. Banks: Growth and Diffusion Phenomena: Mathematical Frameworks and
Applications.
15. Brenner/Scott: The Mathematical Theory of Finite Element Methods.
16. Van de Velde: Concurrent Scientific Computing.
17. Marsden/Ratiu: Introduction to Mechanics and Symmetry.
18. Hubbard/West: Differential Equations: A Dynamical Systems Approach:
Higher-Dimensional Systems.
19. Kaplan/Glass: Understanding Nonlinear Dynamics.
20. Holmes: Introduction to Perturbation Methods.
21. Curtain/Zwart: An Introduction to Infinite-Dimensional Linear Systems
Theory.
22. Thomas: Numerical Partial Differential Equations: Finite Difference
Methods.
23. Taylor: Partial Differential Equations: Basic Theory.
24. Merkin: Introduction to the Theory of Stability of Motion.
25. Naber: Topology, Geometry, and Gauge Fields: Foundations.
26. Polderman/Willems: Introduction to Mathematical Systems Theory:
A Behavioral Approach.
27. Reddy: Introductory Functional Analysis with Applications to
Boundary-Value Problems and Finite Elements.
28. Gustafson/Wilcox: Analytical and Computational Methods of Advanced
Engineering Mathematics.
29. Tveito/Winther: Introduction to Partial Differential Equations: A Computational
Approach.
30. Gasquet/Witomski: Fourier Analysis and Applications: Filtering, Numerical
Computation, Wavelets.
31. Brémaud: Markov Chains: Gibbs Fields, Monte Carlo Simulations, and Queues.
32. Durran: Numerical Methods for Wave Equations in Geophysical Fluid Dynamics.
C. Gasquet P. Witomski
Fourier Analysis
and Applications
Filtering, Numerical Computation, Wavelets
Translated by R. Ryan
With 99 Illustrations
Springer
Claude Gasquet†
Université Joseph Fourier (Grenoble I)

Patrick Witomski
Directeur du Laboratoire LMC-IMAG
Tour IRMA, BP 53
38041 Grenoble, Cedex 09
France

Translator
Robert Ryan
12, Blvd. Edgar Quinet
75014 Paris
France

Series Editors
J.E. Marsden
Control and Dynamical Systems, 107-81
California Institute of Technology
Pasadena, CA 91125
USA

L. Sirovich
Division of Applied Mathematics
Brown University
Providence, RI 02912
USA

M. Golubitsky
Department of Mathematics
University of Houston
Houston, TX 77204-3476
USA

W. Jäger
Department of Applied Mathematics
Universität Heidelberg
Im Neuenheimer Feld 294
69120 Heidelberg
Germany

†Deceased.
Mathematics Subject Classification (1991): 42-01, 28-XX

Library of Congress Cataloging-in-Publication Data
Gasquet, Claude.
Fourier analysis and applications : filtering, numerical
computation, wavelets / Claude Gasquet, Patrick Witomski.
p. cm. - (Texts in applied mathematics; 30)
Includes bibliographical references and index.
ISBN 978-1-4612-7211-3 ISBN 978-1-4612-1598-1 (eBook)
DOI 10.1007/978-1-4612-1598-1
1. Fourier analysis. I. Witomski, Patrick. II. Title.
III. Series.
QA403.5.G37 1998
515'.2433-dc21 98-4682
Printed on acid-free paper.
© 1999 Springer Science+Business Media New York
Originally published by Springer-Verlag New York, Inc. in 1999
Softcover reprint of the hardcover 1st edition 1999
All rights reserved. This work may not be translated or copied in whole or in part without the
written permission of the publisher (Springer Science+Business Media, LLC), except for brief
excerpts in connection with reviews or scholarly analysis. Use in connection with any form of
information storage and retrieval, electronic adaptation, computer software, or by similar or
dissimilar methodology now known or hereafter developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even
if the former are not especially identified, is not to be taken as a sign that such names, as
understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely
by anyone.
Production managed by A.D. Orrantia; manufacturing supervised by Jacqui Ashri.
Camera-ready copy prepared from the authors' LaTeX files.
9 8 7 6 5 4 3 2 1
SPIN 10658148
Translator's Preface
This book combines material from two sources: Analyse de Fourier et
applications: Filtrage, Calcul numérique, Ondelettes by Claude Gasquet and
Patrick Witomski (Masson, Paris, second printing, 1995) and Analyse de
Fourier et applications: Exercices corrigés by Robert Delmasso and Patrick
Witomski (Masson, Paris, 1996). The translation of the first book forms
the core of this Springer edition; to this have been added all of the exercises
from the second book. The exercises appear at the end of the lessons to
which they apply. The solutions to the exercises were not included because
of space constraints.
When Springer offered me the opportunity to translate the book by
Gasquet and Witomski, I readily accepted because I liked both the book's
content and its style. I particularly liked the structure in 42 lessons and
12 chapters, and I agree with the authors that each lesson is a
"chewable piece," which can be assimilated relatively easily. Believing that the
structure is important, I have maintained as much as possible the "look and
feel" of the original French book, including the page format and numbering
system. I believe that this page structure facilitates study, understanding,
and assimilation. With regard to content, again I agree with the authors:
Mathematics students who have worked through the material will be well
prepared to pursue work in many directions and to explore the proofs of
results that have been assumed, such as the development of measure theory
and the representation theorems for distributions. Physics and engineering
students, who perhaps have a different outlook and motivation, will be well
equipped to manipulate Fourier transforms and distributions correctly and
to apply correctly results such as the Poisson summation formula.
Translating is perhaps the closest scrutiny a book receives. The process
of working through the mathematics and checking in-text references always
uncovers typos, and a number of these have been corrected. On the other
hand, I have surely introduced a few. I have also added material: I have
occasionally added details to a proof where I felt a few more words of
explanation were appropriate. In the case of Proposition 31.1.3 (which is
a key result), Exercise 31.12 was added to complete the proof. I have also
completed the proofs in Lesson 42 and added some comments. Several new
references on wavelets have been included in the bibliography, a few of
them with annotations. All of these modifications have been made with
the knowledge and concurrence of Patrick Witomski.
Although the book was written as a textbook, it is also a useful reference
book for theoretical and practical results on Fourier transforms and
distributions. There are several places where the Fourier transforms of specific
functions and distributions are summarized, and there are also summaries
of general results. These summaries have been indexed for easy reference.
The French edition was typeset in Plain TeX and printed by Louis-Jean
in Gap, France. Monsieur Albert at Louis-Jean kindly sent me a copy of
the TeX source for the French edition, thus allowing many of the equations
and arrays to be copied. This simplified the typesetting and helped to avoid
introducing errors. My sincere thanks to M. Albert. Similarly, thanks go
to Anastis Antoniadis (IMAG, Grenoble) for providing the LaTeX source
for the exercises, which was elegantly prepared by his wife. I had the good
fortune to have had the work edited by David Kramer, a mathematician
and freelance editor. He not only did a masterful job of straightening out
the punctuation and other language-based lapses, but he also added many
typesetting suggestions, which, I believe, manifestly improved the
appearance of the book. I also thank David for catching a few of the typos that I
introduced; those that remain are my responsibility and embarrassment.
Robert Ryan
Paris, July 14, 1998
Preface to the French Edition
This is a book of applied mathematics whose main topics are Fourier anal-
ysis, filtering, and signal processing.
The development proceeds from the mathematics to its applications,
while trying to make a connection between the two perspectives. On one
hand, specialists in signal processing constantly use mathematical concepts,
often formally and with considerable intuition based on experience. On the
other hand, mathematicians place more priority on the rigorous
development of the mathematical concepts and tools.
Our objective is to give mathematics students some understanding of
the uses of the fundamental notions of analysis they are learning and to
provide the physicists and engineers with a theoretical framework in which
the "well known" formulas are justified.
With this in mind, the book presents a development of the fundamentals
of analysis, numerical computation, and modeling at levels that extend
from the junior year through the first year of graduate school. One aim is
to stimulate students' interest in the coherence among the following three
domains:
Fourier analysis;
signal processing;
numerical computation.
On completion, students will have a general background that allows them
to pursue more specialized work in many directions.
The general concept

We have chosen a modular presentation in lessons of an average size that
can be easily assimilated . . . or passed over. The density and the level of
the material vary from lesson to lesson. We have purposefully modulated
the pace and the concentration of the book, since as lecturers know, this
is necessary to capture and maintain the attention of their audience. Each
lesson is devoted to a specific topic, which facilitates reading "à la carte."
The lessons are grouped into twelve chapters in a way that allows one to
navigate easily within the book.
A progressive approach
The program we have adopted is progressive; it is written on levels that
range from the third year of college through the first year of graduate
school.
JUNIOR LEVEL
Lessons 1 through 7 are accessible to third-year students. They intro-
duce, at a practical level, Fourier series and the basic ideas of filtering.
Here one finds some simple examples that will be re-examined and studied
in more depth later in the book. The Lebesgue integral is introduced for
convenience, but in a superficial way. On the other hand, emphasis is placed
on the geometric aspects of mean quadratic approximation, in contrast to
the point of view of pointwise representation. The notion of frequency is
illustrated in Lesson 7 using musical scales.
SENIOR LEVEL
The reader will find a presentation and overview of the Lebesgue integral
in Chapter IV, where the objective is to master the practical use of the
integral. The lesson on measure theory has been simplified. This chapter,
however, serves as a good guide for a more thorough study of measure and
integration. Chapter VI contains concentrated applications of integration
techniques that lead to the Fourier transform and convolution of functions.
One can also include at this level the algorithmic aspects of the discrete
Fourier transform via the fast Fourier transform (Chapter III), the concepts
of filtering and linear differential equations (Chapter VII), an easy version
of Shannon's theorem, and an introduction to distributions (Chapter VIII).
MASTER LEVEL
According to our experience, the rest of the book, which is a good half
of it, demands more maturity. Here one finds precise results about the
fundamental relation $\widehat{f * g} = \hat{f} \cdot \hat{g}$, the Young inequalities (Chapter VI),
and various aspects of Poisson's formula related to sampling (Chapter XI).
Finally, time-frequency analysis based on Gabor's transform and wavelet
analysis (Chapter XII) call upon all of the tools developed in the first eleven
chapters and lead to recent applications in signal processing.
The content of this book is not claimed to be exhaustive. We have, for
example, simply treated the z-transform without speaking of the Laplace
transform. We chose not to deal with signals of several variables in spite of
the fact that they are clearly important for image processing.
Possible uses of time

This book is an extension of a course given for engineering students during
their second year at E.N.S.I.M.A.G.¹ and at I.U.P.² We have been
confronted, as are all teachers, with class schedules that constrain the time
available for instruction. The 40 hours available to us per semester at
E.N.S.I.M.A.G. or at I.U.P., which are divided equally between lectures and
work in sections, provide enough time to present the essential material.
Nevertheless, the material is very rich and requires a certain level of
maturity on the part of the students. We are thus led to assume in our
lectures some of the results that are proved in the book. This is facilitated
by the partition of the book into lessons, and it is not incompatible with a
good mathematics education. The time thus saved is more usefully invested
in practicing proofs and the use of the available tools. The material is
written at a level that leads to a facility in manipulating distributions, to a
rigorous formulation of the fundamental formula $\widehat{f * g} = \hat{f} \cdot \hat{g}$ under various
assumptions, to an exploration of the formulas of Poisson and Shannon, and
finally, to precise ideas about the wavelet decomposition of a signal.
Our presentation contrasts with those that simply introduce certain
formulas such as
$$\int_{-\infty}^{+\infty} e^{-2i\pi(\lambda - a)t}\,dt = \delta(\lambda - a)$$
out of thin air, where one ignores all of the fundamental background for a
very short-term advantage.
Different possible courses

One can work through the book linearly, or it is possible to enter at other
places as suggested below:
Juniors
Chapters I, II, and III.
Seniors and Masters in Mathematics
Chapters IV, V, VI, VIII, and IX.
Seniors and Masters in Physics
Chapters VII, X, XI, and XII.
This book comes from many years of teaching students at E.N.S.I.M.A.G.
and I.U.P. and pre-doctoral students. In fact, it was for pre-doctoral
instruction that a course in applied mathematics oriented toward signal
processing was established by Raoul Robert. His initiative in this subject,
which was not his area of research, has played a decisive role, and the current
explosion of numerical work based on wavelets shows that his vision was correct.
Our thanks go equally to Pierre Baras for the numerous animated
discussions we have had. Their ideas and comments have been a valuable aid and
irreplaceable inspiration for us.

¹ École Nationale Supérieure d'Informatique et de Mathématiques Appliquées de
Grenoble (Institut National Polytechnique de Grenoble)
² Institut Universitaire Professionnalisé de Mathématiques Appliquées et Industrielles
(Université Joseph Fourier Grenoble I)

The second printing of this book is an opportunity to make several
remarks. We have chosen not to include any new developments. We have
listed at the end of the book several references on wavelets, which show
that this area has exploded during these last years. But for the student or
the teacher to whom we address the book, the path to follow remains the
same, and the basics must be even more solidly established to understand
these new areas of applications. It seems to us that our original objective
continues to be appropriate today.
We have made the necessary corrections to the original text, and a book
of exercises with solutions will soon be available to complete the project.
Claude Gasquet
Patrick Witomski
Grenoble, June 30, 1994
Contents
Translator's Preface
Preface to the French Edition
Chapter I Signals and Systems
Lesson 1 Signals and Systems
1.1 General considerations
1.2 Some elementary signals
1.3 Examples of systems
Lesson 2 Filters and Transfer Functions
2.1 Algebraic properties of systems
2.2 Continuity of a system
2.3 The filter and its transfer function
2.4 A standard analog filter: the RC cell
2.5 A first-order discrete filter
Chapter II Periodic Signals
Lesson 3 Trigonometric Signals
3.1 Trigonometric polynomials
3.2 Representation in sines and cosines
3.3 Orthogonality
3.4 Exercises
Lesson 4 Periodic Signals and Fourier Series
4.1 The space $L_p^2(0, a)$
4.2 The idea of approximation
4.3 Convergence of the approximation
4.4 Fourier coefficients of real, odd, and even functions
4.5 Formulary
4.6 Exercises
Lesson 5 Pointwise Representation
5.1 The Riemann-Lebesgue theorem
5.2 Pointwise convergence?
5.3 Uniform convergence of Fourier series
5.4 Exercises
Lesson 6 Expanding a Function in an Orthogonal Basis
6.1 Fourier series expansions on a bounded interval
6.2 Expansion of a function in an orthogonal basis
6.3 Exercises
Lesson 7 Frequencies, Spectra, and Scales
7.1 Frequencies and spectra
7.2 Variations on the scale
7.3 Exercises
Chapter III The Discrete Fourier Transform and Numerical Computations
Lesson 8 The Discrete Fourier Transform
8.1 Computing the Fourier coefficients
8.2 Some properties of the discrete Fourier transform
8.3 The Fourier transform of real data
8.4 A relation between the exact and approximate Fourier coefficients
8.5 Exercises
Lesson 9 A Famous, Lightning-Fast Algorithm
9.1 The Cooley-Tukey algorithm
9.2 Evaluating the cost of the algorithm
9.3 The mirror permutation
9.4 A recursive program
9.5 Exercises
Lesson 10 Using the FFT for Numerical Computations
10.1 Computing a periodic convolution
10.2 Nonperiodic convolution
10.3 Computations on high-order polynomials
10.4 Polynomial interpolation and the Chebyshev basis
10.5 Exercises
Chapter IV The Lebesgue Integral
Lesson 11 From Riemann to Lebesgue
11.1 Some history
11.2 Another point of view
11.3 By way of transition
Lesson 12 Measuring Sets
12.1 Measurable sets and measure
12.2 Sets of measure zero
12.3 Measurable functions
12.4 Exercises
Lesson 13 Integrating Measurable Functions
13.1 Constructing the integral
13.2 Elementary properties of the integral
13.3 The integral and sets of measure zero
13.4 Comparing the Riemann and Lebesgue integrals
13.5 Exercises
Lesson 14 Integral Calculus
14.1 Lebesgue's dominated convergence theorem
14.2 Integrals that depend on a parameter
14.3 Fubini's theorem
14.4 Changing variables in an integral
14.5 The indefinite Lebesgue integral and primitives
14.6 Exercises
Chapter V Spaces
Lesson 15 Function Spaces
15.1 Spaces of differentiable functions
15.2 Spaces of integrable functions
15.3 Inclusion and density
15.4 Exercises
Lesson 16 Hilbert Spaces
16.1 Definitions and geometric properties
16.2 Best approximation in a vector subspace
16.3 Orthogonal systems and Hilbert bases
16.4 Exercises
Chapter VI Convolution and the Fourier Transform of Functions
Lesson 17 The Fourier Transform of Integrable Functions
17.1 The Fourier transform on $L^1(\mathbb{R})$
17.2 Rules for computing with the Fourier transform
17.3 Some standard examples
17.4 Exercises
Lesson 18 The Inverse Fourier Transform
18.1 An inversion theorem for $L^1(\mathbb{R})$
18.2 Some Fourier transforms obtained by the inversion formula
18.3 The principal value Fourier inversion formula
18.4 Exercises
Lesson 19 The Space $\mathcal{S}(\mathbb{R})$
19.1 Rapidly decreasing functions
19.2 The space $\mathcal{S}(\mathbb{R})$
19.3 The inverse Fourier transform on $\mathcal{S}$
19.4 Exercises
Lesson 20 The Convolution of Functions
20.1 Definitions and examples
20.2 Convolution in $L^1(\mathbb{R})$
20.3 Convolution in $L^p(\mathbb{R})$
20.4 Convolution of functions with limited support
20.5 Summary
20.6 Exercises
Lesson 21 Convolution, Derivation, and Regularization
21.1 Convolution and continuity
21.2 Convolution and derivation
21.3 Convolution and regularization
21.4 The convolution $\mathcal{S}(\mathbb{R}) * \mathcal{S}(\mathbb{R})$
21.5 Exercises
Lesson 22 The Fourier Transform on $L^2(\mathbb{R})$
22.1 Extension of the Fourier transform
22.2 Application to the computation of certain Fourier transforms
22.3 The uncertainty principle
22.4 Exercises
Lesson 23 Convolution and the Fourier Transform
23.1 Convolution and the Fourier transform in $L^1(\mathbb{R})$
23.2 Convolution and the Fourier transform in $L^2(\mathbb{R})$
23.3 Convolution and the Fourier transform: Summary
23.4 Exercises
Chapter VII Analog Filters
Lesson 24 Analog Filters Governed by a Differential Equation
24.1 The case where the input and output are in $\mathcal{S}$
24.2 Generalized solutions of the differential equation
24.3 The impulse response when deg P < deg Q
24.4 Stability
24.5 Realizable systems
24.6 Gain and response time
24.7 The Routh criterion
24.8 Exercises
Lesson 25 Examples of Analog Filters
25.1 Revisiting the RC filter
25.2 The RLC circuit
25.3 Another second-order filter: $-\frac{1}{\omega^2} g'' + g = f$
25.4 Integrator and differentiator filters
25.5 The ideal low-pass filter
25.6 The Butterworth filters
25.7 The general approximation problem
25.8 Exercises
Chapter VIII Distributions
Lesson 26 Where Functions Prove to Be Inadequate
26.1 The impulse in physics
26.2 Uncontrolled skid on impact
26.3 A new-look derivation
26.4 The birth of a new theory
Lesson 27 What Is a Distribution?
27.1 The basic idea
27.2 The space $\mathcal{D}(\mathbb{R})$ of test functions
27.3 The definition of a distribution
27.4 Distributions as generalized functions
27.5 Exercises
Lesson 28 Elementary Operations on Distributions
28.1 Even, odd, and periodic distributions
28.2 Support of a distribution
28.3 The product of a distribution and a function
28.4 The derivative of a distribution
28.5 Some new distributions
28.6 Exercises
Lesson 29 Convergence of a Sequence of Distributions
29.1 The limit of a sequence of distributions
29.2 Revisiting Dirac's impulse
29.3 Relations with the convergence of functions
29.4 Applications to the convergence of trigonometric series
29.5 The Fourier series of Dirac's comb
29.6 Exercises
Lesson 30 Primitives of a Distribution
30.1 Distributions whose derivatives are zero
30.2 Primitives of a distribution
30.3 Exercises
Chapter IX Convolution and the Fourier Transform of Distributions
Lesson 31 The Fourier Transform of Distributions
31.1 The space $\mathcal{S}'(\mathbb{R})$ of tempered distributions
31.2 The Fourier transform on $\mathcal{S}'(\mathbb{R})$
31.3 Examples of Fourier transforms in $\mathcal{S}'(\mathbb{R})$
31.4 The space $\mathcal{E}'(\mathbb{R})$ of distributions with compact support
31.5 The Fourier transform on $\mathcal{E}'(\mathbb{R})$
31.6 Formulary
31.7 Exercises
Lesson 32 Convolution of Distributions
32.1 The convolution of a distribution and a $C^\infty$ function
32.2 The convolution $\mathcal{E}' * \mathcal{D}'$
32.3 The convolution $\mathcal{E}' * \mathcal{S}'$
32.4 The convolution $\mathcal{D}'_+ * \mathcal{D}'_+$
32.5 The associativity of convolution
32.6 Exercises
Lesson 33 Convolution and the Fourier Transform of Distributions
33.1 The Fourier transform and convolution $\mathcal{S} * \mathcal{S}'$
33.2 The Fourier transform and convolution $\mathcal{E}' * \mathcal{S}'$
33.3 The Fourier transform and convolution $L^2 * L^2$
33.4 The Hilbert transform
33.5 The analytic signal associated with a real signal
33.6 Exercises
Chapter X Filters and Distributions
Lesson 34 Filters, Differential Equations, and Distributions
34.1 Filters revisited
34.2 Realizable, or causal, filters
34.3 Tempered solutions of linear differential equations
34.4 Exercises
Lesson 35 Realizable Filters and Differential Equations
35.1 Representation of the causal solution
35.2 Examples
35.3 Exercises
Chapter XI Sampling and Discrete Filters
Lesson 36 Periodic Distributions
36.1 The Fourier series of a locally integrable periodic function
36.2 The Fourier series of a periodic distribution
36.3 The product of a periodic function and a periodic distribution
36.4 Exercises
Lesson 37 Sampling Signals and Poisson's Formula
37.1 Poisson's formula in $\mathcal{E}'$
37.2 Poisson's formula in $L^1(\mathbb{R})$
37.3 Application to the study of the spectrum of a sampled signal
37.4 Application to accelerating the convergence of a Fourier series
37.5 Exercises
Lesson 38 The Sampling Theorem and Shannon's Formula
38.1 Shannon's theorem
38.2 The case of a function $f(t) = \sum_{n=-N}^{N} c_n e^{2i\pi\lambda_n t}$
38.3 Shannon's formula fails in $\mathcal{S}'$
38.4 The cardinal sine functions
38.5 Sampling and the numerical evaluation of a spectrum
38.6 Exercises
Lesson 39 Discrete Filters and Convolution
39.1 Discrete signals and filters
39.2 The convolution of two discrete signals
39.3 Cases where the two supports are not bounded
39.4 Summary
39.5 Causality and stability of a discrete filter
39.6 Exercises
Lesson 40 The z-Transform and Discrete Filters
40.1 The z-transform of a discrete signal
40.2 Applications to discrete filters
40.3 Exercises
Chapter XII Current Trends: Time-Frequency Analysis
Lesson 41 The Windowed Fourier Transform
41.1 Limitations of standard Fourier analysis
41.2 Opening windows
41.3 Dennis Gabor's formulas
41.4 Comparing the methods of Fourier and Gabor
41.5 Exercises
Lesson 42 Wavelet Analysis
42.1 The basic idea: the accordion
42.2 The wavelet transform
42.3 Orthogonal wavelets
42.4 Multiresolution analysis of $L^2(\mathbb{R})$
42.5 Multiresolution analysis and wavelet bases
42.6 Afternotes
42.7 Exercises
References
Index
Chapter I
Signals and Systems
Lesson 1
Signals and Systems
1.1 General considerations

The purpose of signal theory is to study signals and the systems that
transmit them. The notion of signal is extensive. The observation of some
phenomenon yields certain quantities that depend on time (on space, on
frequency, or on something else!). These quantities, which are assumed to be
measurable, will be called signals. They correspond in mathematics to the
notion of function (of one or more variables of time, space, etc.), and thus
signals are modeled by functions. We will see later that the notion of
distribution provides a model for signals that is both more general and more
satisfactory than that of function.
Examples of signals:
Intensity of an electric current
Potential difference between two points in a circuit
Position of an object, located with respect to time, M = M(t), or with
respect to space, M = M(x, y, z)
Gray levels of the points of an image g(i, j)
Components of a field V(x, y, z)
A sound
There are different ways to think about a signal:
(i) It can be modeled deterministically or statistically. The deterministic
point of view will be the only one used here.
(ii) The variable can be continuous; one is then said to have an analog
signal $x = x(t)$. If the variable is discrete, one is said to have a discrete
signal $x = (x_n)_{n \in \mathbb{Z}}$. A discrete signal will most often result from sampling
(also called discretizing) an analog signal. (See Figure 1.1.)
(iii) Finally, we will consider the values $x = x(t)$ of a signal to be exact
real or complex numbers. However, for computer processing, it is necessary
to store these numbers in some finite form, for example, as multiples of
an elementary quantity $q$. This approximation of the exact values is called
quantization. We will not examine the effects of this operation. A discrete,
quantized signal is called a digital signal.
FIGURE 1.1. Sampling an analog signal.
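The chain from analog to discrete to digital signal can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the signal `analog`, the sampling instants, and the value q = 0.25 are arbitrary choices that do not come from the text, and rounding to the nearest multiple of q is just one possible quantization rule.

```python
import math

def sample(x, n_values, step=1.0):
    """Sample the analog signal x(t) at the instants t = n * step."""
    return [x(n * step) for n in n_values]

def quantize(values, q):
    """Replace each value by the nearest multiple of the elementary quantity q."""
    return [q * round(v / q) for v in values]

def analog(t):
    # Hypothetical analog signal, chosen only for illustration.
    return math.sin(0.5 * t)

discrete = sample(analog, range(-2, 4))  # discrete signal (x_n), n = -2, ..., 3
digital = quantize(discrete, q=0.25)     # discrete and quantized: a digital signal
```

With this rule, each quantized value differs from the exact one by at most q/2.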
Any entity, or apparatus, where one can distinguish input signals and
output signals will be called a (transmission) system (Figure 1.2). The
input and output signals are not necessarily of the same kind (see, for
example, Section 1.3.7).
FIGURE 1.2. Diagram of a system: the input signal x(t) enters the
transmission system, which delivers the output signal y(t).
When there are several input or output signals, the functions x(t) and
y(t) are vectors. We will limit our discussion to the scalar case, where there
is a single input signal and a single output signal.
In signal theory, one is not necessarily interested in the system's
components, but rather in the way it transforms the input signal into the output
signal. It is a "black box." It will be modeled by an operator acting on
functions, and we write
$$y = Ax,$$
where $x \in X$, the set of input signals, and $y \in Y$, the set of output signals.
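To make the "black box" point of view concrete, here is a minimal Python sketch in which a system is a function that takes a signal and returns a signal. The ideal amplifier with a constant gain is a hypothetical example chosen for illustration; the names `amplifier`, `gain`, and `Signal` are assumptions, not notation from the text.

```python
from typing import Callable

Signal = Callable[[float], float]  # a signal is modeled as a function t -> x(t)

def amplifier(gain: float) -> Callable[[Signal], Signal]:
    """Return the operator A of a hypothetical ideal amplifier: (Ax)(t) = gain * x(t)."""
    def A(x: Signal) -> Signal:
        return lambda t: gain * x(t)
    return A

A = amplifier(gain=3.0)

def x(t: float) -> float:
    # Input signal, an arbitrary choice.
    return t * t

y = A(x)  # output signal y = Ax
```

Only the input-output relation is visible here; nothing is said about the components that would realize it.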
Examples of systems:
An electric circuit
An amplifier
The telephone
The Internet
1.1 General considerations 5
FIGURE 1.3. Analog system.
One distinguishes:
Analogsystems that transform an analog signal into another analog sig-
nal (Figure 1.3)
Discrete systems that transform a discrete signal into another discrete
signal (Figure 1.4)
FIGURE 1.4. Discrete system.
One can go from a discrete signal to an analog signal, or conversely, using
converters that are called hybrid systems:
An analog-to-digital converter, like a sampler, for example;
A digital-to-analog converter, which produces an analog signal from a
digital signal. We mention as an example the clamper, or clamping circuit
(Figure 1.5). This device yields the last value of the digital signal until
the point when the next value arrives.
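The clamper can be sketched as a zero-order hold. In the fragment below, the function name `clamp`, the unit sampling step, and the choice to hold the edge values outside the sampled range are assumptions made for the illustration.

```python
import math

def clamp(samples, t, step=1.0, start=0):
    """Zero-order hold: return the most recent sample at time t.

    samples[j] is the value that arrives at time (start + j) * step.
    """
    index = math.floor(t / step) - start
    # Outside the sampled range we hold the first or last value (an assumption).
    index = min(max(index, 0), len(samples) - 1)
    return samples[index]

values = [1.0, 3.0, 2.0]   # hypothetical digital signal arriving at t = 0, 1, 2
for t in (0.0, 0.5, 1.0, 1.9, 2.5):
    print(t, clamp(values, t))
```

The output is a piecewise constant analog signal whose jumps occur at the sampling instants.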
FIGURE 1.5. The clamper.
1.2 Some elementary signals

1.2.1 The Heaviside function
The Heaviside function is the signal, denoted throughout the book by u(t),
defined by
$$u(t) = \begin{cases} 0 & \text{if } t < 0, \\ 1 & \text{if } t > 0. \end{cases}$$
(See Figure 1.6.) The value at t = 0 can be specified or not. This value is not
important for integration. The Heaviside signal models the instantaneous
establishment of a steady state.
FIGURE 1.6. The Heaviside function.
1.2.2 A rectangular window

The (centered) rectangular signal r(t) (Figure 1.7) is defined, for a > 0, by
$$r(t) = \begin{cases} 1 & \text{if } |t| < a, \\ 0 & \text{if } |t| > a. \end{cases}$$
FIGURE 1.7. A rectangular window.
1.2.3 A pure sinusoidal, or monochromatic, signal

A sinusoidal signal is of the form
$$x(t) = \alpha\cos(\omega t + \varphi),$$
where the parameters have the following interpretations:
$|\alpha| = \max |x(t)|$ is the amplitude of the signal;
$\omega$ is the angular rate;
$a = 2\pi/\omega$ is the (smallest) period;
$\lambda = 1/a$ is the frequency;
$\varphi$ is the initial phase.
Signal values are, in principle, real numbers, and the frequency is a pos-
itive number. However, for reasons of convenience (Fresnel representation,
derivation, multiplication, ... ) a complex-valued function
$$z(t) = \alpha e^{i(\omega t + \varphi)}$$
is often used, and one has
$$x(t) = \operatorname{Re}(z(t)) = \frac{1}{2}\bigl(z(t) + \overline{z(t)}\bigr),$$
where $\bar z$ is the complex conjugate of z. Writing the signal this way involves
negative frequencies, which make no sense physically. Nevertheless, this is a
useful convention. It is always understood that the frequencies of opposite
sign will be combined to reproduce the real signal.
A signal of the form $z(t) = c\,e^{2i\pi\lambda t}$, where $\lambda \in \mathbb{R}$ and where $c = |c|e^{i\varphi}$
can be complex and can thus include a phase $\varphi$, is often
represented by plotting the modulus and argument of c in frequency space,
that is, as a function of the frequency $\lambda$. This is illustrated in Figures 7.1
and 7.2 for more general functions of the form
$$z(t) = \sum_{n=-\infty}^{+\infty} c_n e^{2i\pi\lambda_n t}.$$
1.3 Examples of systems

1.3.1 Ideal amplifier
y(t) = kx(t), where k is a fixed constant.
1.3.2 Delay line
y(t) = x(t- a), where a is a real constant.
1.3.3 Differentiator
y(t) = x'(t), where x' is the derivative of x.

1.3.4 A discrete system

One taps the output $y_k$, subjects it to a unit time delay, multiplies it by a,
and adds it to the input $x_k$. This gives the recursion equation
$$y_k = x_k + a\,y_{k-1}, \quad k \in \mathbb{Z}. \tag{1.1}$$

Such a system is typically represented by the diagram in Figure 1.8.
FIGURE 1.8.
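The recursion (1.1) translates directly into a loop. The sketch below assumes a finite input that starts at k = 0 and a stored value y_{-1} = 0 before the first sample; the text itself lets k range over all of Z, so this is a simplification, and the name `first_order_recursion` is hypothetical.

```python
def first_order_recursion(x, a, y_init=0.0):
    """Return y with y[k] = x[k] + a * y[k-1], taking y[-1] = y_init."""
    y = []
    previous = y_init
    for xk in x:
        previous = xk + a * previous   # tap, delay, multiply by a, add the input
        y.append(previous)
    return y

impulse = [1.0, 0.0, 0.0, 0.0]
response = first_order_recursion(impulse, a=0.5)
print(response)
```

With a unit impulse as input, the output is the geometric sequence 1, a, a^2, and so on.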
1.3.5 An RC circuit
FIGURE 1.9. An RC circuit.
The input to the circuit shown in Figure 1.9 is the voltage x(t); the
output is the voltage v(t) across the capacitor. Thus,
$$A : x \mapsto v.$$
The potential difference across a capacitor with charge Q is $v = Q/C$,

so by Ohm's law
Ri(t) + v(t) = x(t).
By writing i(t) = Q'(t), this becomes
RCv'(t) + v(t) = x(t). (1.2)

This system is governed by a first-order linear differential equation with
constant coefficients. In general, the solution v will depend on a parameter
that is computed using an additional condition, for example, the initial
condition v(0) = 0.
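Equation (1.2) with the initial condition v(0) = 0 can be integrated numerically. The sketch below uses the forward Euler scheme, which is our choice for the illustration and not a method from the text; the function name `simulate_rc` and the constant input are likewise assumptions. For a constant input x = 1 the exact solution is 1 - exp(-t/RC), which gives a check.

```python
import math

def simulate_rc(x, rc, t_end, dt=1e-4, v0=0.0):
    """Integrate RC v'(t) + v(t) = x(t) from t = 0 with forward Euler steps."""
    steps = int(round(t_end / dt))
    v = v0
    for k in range(steps):
        t = k * dt
        v += dt * (x(t) - v) / rc      # v'(t) = (x(t) - v(t)) / RC
    return v

v_end = simulate_rc(lambda t: 1.0, rc=1.0, t_end=1.0)
exact = 1.0 - math.exp(-1.0)
print(v_end, exact)
```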
1.3.6 A mechanical example
FIGURE 1.10.
The mechanical system shown in Figure 1.10 consists of two bodies A
and B that slide in one direction on a fixed surface. The mass B has a
coefficient of sliding friction a. A and B are connected by a spring with
restoring constant k. A is driven by the controlled movement x(t) (the
input), and this causes the motion of B, which is measured by the distance
y(t) (the output).
If B has mass m and acceleration $\gamma(t)$, then by Newton's law, $m\gamma(t)$ is
equal at each instant t to the sum of the forces acting on B. These are the
restoring force, $-k(y(t) - x(t))$, and the friction force, $-a\,y'(t)$. Combining
these forces gives the equation
$$m y''(t) + a y'(t) + k y(t) = k x(t), \tag{1.3}$$
and the output is completely determined if two initial conditions are known,
for example, y(0) = 0 and y'(0) = 0.
1.3.7 A system of resistors
FIGURE 1.11.
Consider the electrical circuit in Figure 1.11. The input is the constant
voltage E, and the currents $i_0, i_1, \dots, i_N$ constitute the output:
$$A : E \mapsto (i_0, i_1, \dots, i_N).$$
FIGURE 1.12.
Summing the currents at the kth node gives
$$i_k = i_{k+1} + j_k,$$
where $j_k$ denotes the current through the kth resistor R'.
On the other hand, by summing the voltage drops around the loop
DABC we have (see Figure 1.12)
$$0 = V_A - V_B + V_B - V_C + V_C - V_A,$$
$$0 = R\,i_k + R'j_k - R'j_{k-1},$$
$$0 = R\,i_k + R'(i_k - i_{k+1}) - R'(i_{k-1} - i_k).$$
Finally, taking $i_{N+1} = 0$ shows that for k = 1, ..., N,
$$E = (R + R')i_0 - R'i_1,$$
$$0 = R'i_{k-1} - (2R' + R)i_k + R'i_{k+1},$$
$$0 = i_{N+1}.$$
This is a second-order linear recursion system with boundary conditions.
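The recursion with its two boundary conditions is a tridiagonal linear system in the unknowns i_0, ..., i_N. As a sketch, it can be solved by forward elimination and back substitution (the Thomas algorithm, which is our choice and is not named in the text). The function name `ladder_currents` and the numerical values are hypothetical; `Rp` stands for R'.

```python
def ladder_currents(E, R, Rp, N):
    """Solve E = (R+R') i_0 - R' i_1 and 0 = R' i_{k-1} - (2R'+R) i_k + R' i_{k+1}
    for k = 1..N, with i_{N+1} = 0. Returns [i_0, ..., i_N]."""
    n = N + 1
    lower = [0.0] + [Rp] * N                 # coefficient of i_{k-1} in row k
    diag = [R + Rp] + [-(2 * Rp + R)] * N    # coefficient of i_k in row k
    upper = [-Rp] + [Rp] * N                 # coefficient of i_{k+1}; last entry unused
    rhs = [E] + [0.0] * N
    # Forward elimination.
    for k in range(1, n):
        m = lower[k] / diag[k - 1]
        diag[k] -= m * upper[k - 1]
        rhs[k] -= m * rhs[k - 1]
    # Back substitution.
    i = [0.0] * n
    i[-1] = rhs[-1] / diag[-1]
    for k in range(n - 2, -1, -1):
        i[k] = (rhs[k] - upper[k] * i[k + 1]) / diag[k]
    return i

currents = ladder_currents(E=10.0, R=1.0, Rp=2.0, N=4)
print(currents)
```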
Lesson 2
Filters and Transfer Functions
Systems have properties, at least sometimes. We are going to review several

of the more standard properties of systems.
2.1 Algebraic properties of systems

The set of input signals X and the set of output signals Y are assumed to
be vector spaces (real or complex). A system A can have several properties:
2.1.1 Linearity
Consider the system
$$A : X \to Y.$$
A is said to be linear if
$$A(x + u) = A(x) + A(u)$$
and
$$A(\lambda x) = \lambda A(x)$$
for all $x, u \in X$ and all $\lambda \in \mathbb{R}$ (or $\mathbb{C}$ if X is complex). This is also called the
principle of superposition. The systems in Section 1.3 are all linear, which
is easily verified by examining the governing equations (1.1) through (1.4).
2.1.2 Causality
A is said to be realizable (or causal) if the equality of any two input signals
up to time $t = t_0$ implies the equality of the two output signals at least to
time $t_0$.
This property is completely natural for a physical system in which the

variable is time. It says that the response at time t depends only on what
has happened before t. In particular, the system does not respond before
there is an input. Thus causality is a necessary condition for the system to
be physically realizable.
2.1.3 Invariance
A is said to be invariant, or stationary, if a translation in time of the input
leads to the same translation of the output; that is,
$$x(t) \to y(t) \implies x(t - a) \to y(t - a).$$
Let $\tau_a$ be the delay operator defined by
$$\tau_a x(t) = x(t - a).$$
If the system A is invariant, then
$$A(\tau_a x) = \tau_a(Ax)$$
for all $x \in X$ and $a \in \mathbb{R}$. Thus, for all $a \in \mathbb{R}$,
$$A\tau_a = \tau_a A,$$
which says that A commutes with all translations. For discrete systems,
one considers only a that are multiples of the sampling interval.
2.2 Continuity of a system

The system
$$A : X \to Y$$
is said to be continuous if the sequence $Ax_n$ ($= y_n$) tends to $Ax$ ($= y$) when
the sequence $x_n$ tends to x. This concept assumes that there exists some
notion of sequential limit for signals in both X and Y.
Continuity is a natural hypothesis; it expresses the idea that if two input
signals are close, then the output signals are also close.
2.2.1 The analog case

When the signals are functions, the notion of limit is often defined in terms
of a norm $\|\cdot\|$ defined on each of the vector spaces X and Y. In this case
$x_n \to x$ means that $\|x_n - x\| \to 0$.
These are the three most frequently used norms:
(i) The norm for uniform convergence:
$$\|x\|_\infty = \sup_{t \in I} |x(t)|.$$
(ii) The norm for mean convergence:
$$\|x\|_1 = \int_I |x(t)|\,dt.$$
(iii) The norm for convergence "in energy" (mean quadratic convergence):
$$\|x\|_2 = \Bigl(\int_I |x(t)|^2\,dt\Bigr)^{1/2}.$$
In all cases, I is the interval of interest.
This last norm has the advantage over the other two of being derived
from a scalar product,
$$(x, y) = \int_I x(t)\overline{y(t)}\,dt,$$
where $\overline{y(t)}$ denotes the complex conjugate of y(t). Thus, $\|x\|_2 = \sqrt{(x, x)}$.
Such a structure allows one to introduce the notion of orthogonality between
two signals. This generalizes the concept of orthogonality in $\mathbb{R}^n$ and
is expressed by the relation
$$(x, y) = 0.$$
One often uses a less restrictive notion of continuity, namely, continuity
in the sense of distributions. This concept will be studied in Chapter VIII.
2.2.2 The discrete case

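When the signals are discrete, one can use the analogous norms:
$$\|x\|_\infty = \sup_{n\in\mathbb{Z}} |x_n|; \quad \|x\|_1 = \sum_{n=-\infty}^{+\infty} |x_n|; \quad \|x\|_2 = \Bigl(\sum_{n=-\infty}^{+\infty} |x_n|^2\Bigr)^{1/2}.$$
The simple convergence of a sequence of signals,
$$x_n = (x_{nk})_{k\in\mathbb{Z}},$$
is also used. By this we mean that the sequence $x_n$ tends to x if the limit
exists for each of the components:
$$x_n \to x \iff x_{nk} \to x_k \text{ for each } k \in \mathbb{Z} \text{ as } n \to +\infty.$$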

Example: A differentiator is not a continuous system in the uniform convergence
norm. Indeed, if we take $x_n(t) = (1/n)\sin(nt)$, then $x_n \to 0$ uniformly
in t. But $y_n(t) = x_n'(t) = \cos(nt)$ does not tend to zero as $n \to +\infty$. On
the other hand, we will show that the integrator
$$y(t) = \int_{-\infty}^{t} e^{-(t-s)}x(s)\,ds$$
is continuous with respect to uniform convergence.
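The differentiator example can be observed numerically. The sketch below approximates the uniform norm by sampling on a grid over one interval; the helper `sup_norm`, the interval [0, 2π], and the grid size are assumptions made for the illustration.

```python
import math

def sup_norm(f, a=0.0, b=2 * math.pi, points=2001):
    """Approximate sup |f(t)| on [a, b] by sampling on a uniform grid."""
    return max(abs(f(a + (b - a) * k / (points - 1))) for k in range(points))

for n in (1, 10, 100):
    x_n = lambda t, n=n: math.sin(n * t) / n   # input, tends to 0 uniformly
    y_n = lambda t, n=n: math.cos(n * t)       # derivative of x_n
    print(n, sup_norm(x_n), sup_norm(y_n))
```

The first column of norms decreases like 1/n while the second stays equal to 1, which is the failure of continuity described above.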
2.3 The filter and its transfer function

The term filter refers both to a physical system having certain properties
and to its mathematical model defined in terms of the following objects:
(i) two vector spaces X and Y of input and output signals, respectively,
that are endowed with a notion of convergence;
(ii) a linear operator $A : X \to Y$ that is continuous and translation-invariant.
We will say informally that a filter is a continuous, translation-invariant
linear system.
Such a system satisfies the principle of superposition, which is another
name for linearity. Thus,
$$A\Bigl(\sum_{n=0}^{k} a_n x_n\Bigr) = \sum_{n=0}^{k} a_n A x_n,$$
and by continuity, one can pass to the limit when the infinite sums converge:
$$A\Bigl(\sum_{n=0}^{+\infty} a_n x_n\Bigr) = \sum_{n=0}^{+\infty} a_n A x_n.$$
Later we will see that a periodic signal (under rather general conditions)
can be written as an infinite sum of monochromatic signals in the form
$$x(t) = \sum_{n=-\infty}^{+\infty} c_n e^{2i\pi\lambda_n t}.$$
Hence, at the output of a filter we will have
$$y = Ax = \sum_{n=-\infty}^{+\infty} c_n A(e_{\lambda_n}), \quad \text{where } e_\lambda(t) = e^{2i\pi\lambda t}.$$
It is thus sufficient to know the outputs for each of the inputs $(e_{\lambda_n})_{n\in\mathbb{Z}}$ to
know the image of an arbitrary periodic signal. Furthermore, it is easy to
determine the image $f_\lambda$ of the signal $e_\lambda$, assuming that the latter belongs
to the space of input signals for the filter. Indeed, for all values of t and u,
$$e_\lambda(t + u) = e_\lambda(t)e_\lambda(u) = \tau_{-t}e_\lambda(u),$$
where t is considered to be a parameter and u to be the variable. The image
of this signal is $f_\lambda(t + u)$. As a result, we see that
$$f_\lambda(t + u) = A(e_\lambda(t)e_\lambda)(u) = e_\lambda(t)f_\lambda(u)$$
for all $u \in \mathbb{R}$. Hence, for u = 0,
$$f_\lambda(t) = e_\lambda(t)f_\lambda(0),$$
which we write as
$$A(e_\lambda) = H(\lambda)e_\lambda, \quad \text{where } H(\lambda) = f_\lambda(0).$$
This result can be expressed as follows:
2.3.1 Proposition Assume that $e_\lambda$ is an admissible input function for
the filter A, which is otherwise arbitrary. Then $e_\lambda$ is an eigenfunction of
the filter A. That is to say, there exists a scalar function $H(\lambda)$ such that
for $\lambda \in \mathbb{R}$,
$$A(e_\lambda) = H(\lambda)e_\lambda.$$
The function $H : \mathbb{R} \to \mathbb{C}$ is called the transfer function of the filter A.

There will be many occasions in the rest of the book where we will see
the essential role that this function plays in the action of a filter on the
spectrum of an input signal.
2.4 A standard analog filter: the RC cell

We will illustrate the general ideas presented above with the RC circuit
shown in Section 1.3.5.
2.4.1 System response
Writing the unknown function v as $v(t) = w(t)e^{-t/RC}$ reduces the initial
equation
$$RCv'(t) + v(t) = x(t)$$
to
$$w'(t) = \frac{1}{RC}e^{t/RC}x(t).$$
Assuming that the input signal x(t) is such that the right-hand side is
integrable on every interval $(-\infty, t)$, we have
$$w(t) = \frac{1}{RC}\int_{-\infty}^{t} e^{s/RC}x(s)\,ds + K$$
and
$$v(t) = \frac{1}{RC}\int_{-\infty}^{t} e^{-(t-s)/RC}x(s)\,ds + Ke^{-t/RC}.$$
The constant K is determined by an auxiliary condition. For example, if
we assume that the response to the zero input is zero, we see that K = 0.
One can define the response of the system A to the input x to be
$$v(t) = Ax(t) = \frac{1}{RC}\int_{-\infty}^{t} e^{-(t-s)/RC}x(s)\,ds. \tag{2.1}$$
It is clear from this expression that A is linear, realizable, and invariant. It
is also continuous, for example, in the uniform norm, since
$$|Ax(t)| \le \frac{1}{RC}\int_{-\infty}^{t} e^{-(t-s)/RC}\,ds\;\|x\|_\infty = \|x\|_\infty,$$
and thus
$$\|Ax\|_\infty \le \|x\|_\infty.$$
This shows that the RC cell is a filter.
2.4.2 An expression for the output

If we write
$$h(t) = \frac{1}{RC}e^{-t/RC}u(t),$$
where u is the Heaviside function, then we can express (2.1) as
$$Ax(t) = \int_{-\infty}^{+\infty} h(t - s)x(s)\,ds = (h * x)(t). \tag{2.2}$$
This operation is, by definition, the convolution of the two signals h and
x. It is denoted by h * x, and we have
$$Ax = h * x.$$
In this situation, one is said to have a convolution system. The function h,
called the impulse response of the system, characterizes the filter because
knowing h implies that the output of the filter is known for any input x.
Throughout the book, we will use h to denote the impulse response of a
system. A companion notion, the response of a system to the unit step
function u(t), will be defined in Lesson 24.
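The convolution (2.2) can be approximated by a quadrature rule. The sketch below uses the midpoint rule on a truncated interval; the function names, the truncation to [-5, 5], and the value RC = 1 are assumptions made for the illustration. For the Heaviside input, the exact output of the RC cell is 1 - exp(-t/RC) for t > 0, which serves as a check.

```python
import math

def rc_impulse_response(t, rc):
    """h(t) = (1/RC) exp(-t/RC) u(t)."""
    return math.exp(-t / rc) / rc if t >= 0 else 0.0

def convolve_at(h, x, t, lower, upper, steps=20000):
    """Midpoint-rule approximation of the integral of h(t - s) x(s) over [lower, upper]."""
    ds = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        s = lower + (k + 0.5) * ds
        total += h(t - s) * x(s)
    return total * ds

rc = 1.0
heaviside = lambda s: 1.0 if s >= 0 else 0.0
v = convolve_at(lambda t: rc_impulse_response(t, rc), heaviside, t=2.0,
                lower=-5.0, upper=5.0)
print(v, 1.0 - math.exp(-2.0))
```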
2.4.3 The transfer function of the RC filter

The response to the input $x(t) = e_\lambda(t)$ is $v(t) = H(\lambda)e_\lambda(t)$. Substitution
in equation (1.2) gives
$$(2i\pi\lambda RC + 1)H(\lambda)e_\lambda(t) = e_\lambda(t),$$
and we have
$$H(\lambda) = \frac{1}{1 + 2i\pi\lambda RC}.$$
We see that signals for which $|\lambda|$ is small, the low-frequency signals, are
transmitted by the filter almost as if it were the identity mapping (see
Figure 2.1). On the other hand, the high-frequency signals, for which $|\lambda|$
is large, are almost completely attenuated. This explains why this filter is
called a low-pass filter. The action of the filter on different frequencies is
clearly apparent from the graph of the function
$$|H(\lambda)|^2 = \frac{1}{1 + 4\pi^2\lambda^2R^2C^2},$$
which is called the energy spectrum of the filter. The function $|H(\lambda)|$ is
called the spectral amplitude.
FIGURE 2.1. Energy spectrum of the low-pass RC filter.
The frequency $\lambda_c = 1/(2\pi RC)$, beyond which the amplitudes of the
input frequencies are reduced by more than the factor $1/\sqrt{2}$, is considered
to be the cutoff frequency.
We will return to the RC filter in Lesson 25. In fact, the analysis of
this filter and of the systems described by generalizing the equation that
governs the RC filter will be the main application of the mathematical
tools that are developed in the book.
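The low-pass behaviour and the cutoff frequency can be checked with a few evaluations of H. In this sketch the function names and the time constant RC = 1 ms are hypothetical values chosen for the example.

```python
import math

def rc_transfer(lam, rc):
    """H(lambda) = 1 / (1 + 2 i pi lambda RC)."""
    return 1.0 / complex(1.0, 2.0 * math.pi * lam * rc)

def energy_spectrum(lam, rc):
    """|H(lambda)|^2."""
    return abs(rc_transfer(lam, rc)) ** 2

rc = 1e-3                               # hypothetical time constant, in seconds
cutoff = 1.0 / (2.0 * math.pi * rc)     # lambda_c
for lam in (0.0, cutoff, 10 * cutoff):
    print(lam, abs(rc_transfer(lam, rc)), energy_spectrum(lam, rc))
```

At the cutoff frequency the spectral amplitude equals 1/sqrt(2) and the energy spectrum equals 1/2.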
2.5 A first-order discrete filter

The discrete analogue of the last case is the example in Section 1.3.4, where
for $a \ne 0$,
$$y_k = x_k + a\,y_{k-1}, \quad k \in \mathbb{Z}. \tag{2.3}$$
The analysis of this system follows that of the last example. We try to
express the output explicitly as a function of the input. Thus we change
the unknown by letting $y_k = a^k v_k$, which transforms (2.3) into
$$v_k - v_{k-1} = a^{-k}x_k. \tag{2.4}$$
By successive additions, we see that
$$v_k - v_{k-p} = \sum_{l=k-p+1}^{k} a^{-l}x_l, \quad p \in \mathbb{N}^*. \tag{2.5}$$
Suppose that the input signal $(x_k)$ is such that the series
$$\sum_{n=-\infty}^{0} a^{-n}x_n \tag{2.6}$$
is absolutely convergent. Then from (2.5), the sequence $(v_k)_{k<0}$ is a Cauchy
sequence and thus converges to some limit b as $k \to -\infty$. Letting $p \to +\infty$
in (2.5) shows that for each fixed $k \in \mathbb{Z}$
$$v_k = b + \sum_{l=-\infty}^{k} a^{-l}x_l$$
and
$$y_k = b\,a^k + \sum_{n=-\infty}^{k} a^{k-n}x_n.$$
Conversely, for each complex constant b, the sequence $(y_k)$ of this form is
a solution of (2.3). It is logical that $y_k$ is not completely determined, since
we have not specified an initial condition. Now assume that the response
to a null input is null; then necessarily b = 0. If we define $(h_n)$ by
$$h_n = \begin{cases} a^n & \text{if } n \ge 0, \\ 0 & \text{if } n < 0, \end{cases}$$
the output $(y_k)$ can be written as
$$y_k = \sum_{n=-\infty}^{+\infty} h_{k-n}x_n, \quad \text{or } y = h * x,$$
which is called the discrete convolution of the two signals $x = (x_n)$ and
$h = (h_n)$. This system is linear and invariant. One can easily verify that
the condition $h_n = 0$ if n < 0 implies that it is realizable. The system is
continuous in the uniform norm whenever $|a| < 1$, and we have
$$\|y\|_\infty \le \frac{1}{1 - |a|}\|x\|_\infty.$$
The signal h is called the impulse response of the system. It is the response
to a unit impulse at time t = 0 defined by
$$(e_k)_{k\in\mathbb{Z}}, \quad \text{where } e_k = \begin{cases} 1 & \text{if } k = 0, \\ 0 & \text{if } k \ne 0. \end{cases}$$
As in the analog case, we examine the response to an exponential input
$x = (z^k)$, where z is a fixed complex number with $|z| > |a|$ so that the
hypothesis concerning the convergence of (2.6) is satisfied. We obtain the
relation
$$y_k = a^k\sum_{n=-\infty}^{k}\Bigl(\frac{z}{a}\Bigr)^n = \frac{z}{z - a}\,x_k.$$
Thus the output is proportional to the input. The exponential signals are
eigenfunctions of the filter, as expected, and the eigenvalue
$$H(z) = \frac{z}{z - a}$$
is again called the transfer function of the discrete filter. It is a function of
the complex variable z that is defined and analytic in the domain $|z| > |a|$.
We return to this function, from another point of view, in Lesson 40.
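The eigenfunction relation for the discrete filter can be checked by truncating the convolution sum. In the sketch below the function names, the truncation to 200 terms, and the values a = 0.5, z = i are assumptions chosen so that |z| > |a|.

```python
def discrete_output(a, z, k, terms=200):
    """Truncated sum of a**(k-n) * z**n over n = k, k-1, ..., k-terms+1."""
    return sum(a ** (k - n) * z ** n for n in range(k, k - terms, -1))

def transfer(a, z):
    """H(z) = z / (z - a), valid for |z| > |a|."""
    return z / (z - a)

a, z, k = 0.5, 1j, 3
y_k = discrete_output(a, z, k)
print(y_k, transfer(a, z) * z ** k)
```

The two printed values agree up to the truncation error, which decays like |a/z| raised to the number of retained terms.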
Chapter II
Periodic Signals
Lesson 3
Trigonometric Signals
We have seen that the pure sinusoidal signals are eigenfunctions for all
filters. They are also the simplest periodic signals. These two facts explain
their importance. We will see in the next two lessons that they enter into
the structure of all periodic signals.
3.1 Trigonometric polynomials

A function f is said to be periodic with period a, a > 0, if for all $t \in \mathbb{R}$,
$$f(t + a) = f(t).$$
(Note that a is not necessarily the smallest period.) Since $e_n(t) = e^{2i\pi n t/a}$
has period a for each integer n, the same is true for functions p of the form
$$p(t) = \sum_{n \in I} c_n e^{2i\pi n t/a},$$
where I is any fixed, finite set of integers and the $c_n$ are arbitrary complex
numbers. By adding zero terms if necessary, we may assume that
$$p(t) = \sum_{n=-N}^{N} c_n e^{2i\pi n t/a}. \tag{3.1}$$
This function is called a trigonometric polynomial of degree less than or
equal to N. These functions model the superposition of a finite number of
monochromatic signals; the real part of such a function can be represented
graphically as a function of time as in Figure 3.1.
FIGURE 3.1.
3.2 Representation in sines and cosines

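Expression (3.1) can be transformed to express p(t) as a linear combination
of sines and cosines. Thus,
$$p(t) = c_0 + \sum_{n=1}^{N}\bigl(c_n e^{2i\pi n t/a} + c_{-n}e^{-2i\pi n t/a}\bigr),$$
and by expanding the exponentials, this becomes
$$p(t) = \frac{a_0}{2} + \sum_{n=1}^{N}\Bigl(a_n\cos\Bigl(2\pi n\frac{t}{a}\Bigr) + b_n\sin\Bigl(2\pi n\frac{t}{a}\Bigr)\Bigr), \tag{3.2}$$
where, for $n \ge 0$,
$$a_n = c_n + c_{-n}, \quad b_n = i(c_n - c_{-n}). \tag{3.3}$$
The inverse formulas are
$$c_n = \frac{1}{2}(a_n - ib_n), \quad c_{-n} = \frac{1}{2}(a_n + ib_n). \tag{3.4}$$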
3.3 Orthogonality
A simple computation shows that the following important relation holds
for the functions $e_n(t)$:
$$\int_0^a e_n(t)\overline{e_m(t)}\,dt = \begin{cases} 0 & \text{if } n \ne m, \\ a & \text{if } n = m. \end{cases} \tag{3.5}$$
We let $T_N$ denote the set of all trigonometric polynomials p of degree
less than or equal to N. $T_N$ is obtained by letting the $c_n$ in formula (3.1)
vary over all possible values. If we endow this vector space, which has finite
dimension $\le 2N + 1$, with the scalar product
$$(p, q) = \int_0^a p(t)\overline{q(t)}\,dt,$$
the relation (3.5) expresses the fact that the functions $e_n$, $n \in \mathbb{Z}$, are orthogonal:
$$(e_n, e_m) = 0 \text{ if } n \ne m, \quad \text{and } \|e_n\|_2 = \sqrt{a}.$$
It follows that the vectors $e_n$ are independent and that the dimension of
$T_N$ is exactly 2N + 1. If p is of the form (3.1), we have
$$(p, e_n) = c_n\|e_n\|_2^2 = a\,c_n,$$
and
$$c_n = \frac{1}{a}\int_0^a p(t)e^{-2i\pi n t/a}\,dt. \tag{3.6}$$
This is called Fourier's formula; it gives the coefficients $c_n$ explicitly in
terms of the function p. One easily obtains the following formulas for the
coefficients $a_n$ and $b_n$, $n \ge 0$:
$$a_n = \frac{2}{a}\int_0^a p(t)\cos\Bigl(2\pi n\frac{t}{a}\Bigr)dt, \quad b_n = \frac{2}{a}\int_0^a p(t)\sin\Bigl(2\pi n\frac{t}{a}\Bigr)dt. \tag{3.7}$$
Theoretically, this has solved the problem of spectral analysis for a
trigonometric signal: knowing the values of p, calculate the coefficients $c_n$
in (3.1). Later we will see how to do this important calculation efficiently.
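Fourier's formula (3.6) can be evaluated numerically. The sketch below replaces the integral by a uniform Riemann sum over one period, which is a plain quadrature and not the efficient method the text announces for later. The function name, the period a = 2, and the test polynomial are assumptions made for the example; for a trigonometric polynomial the sum is exact (up to rounding) once the number of samples exceeds twice the degree.

```python
import cmath
import math

def fourier_coefficient(p, n, a, samples=64):
    """Approximate c_n = (1/a) * integral over (0, a) of p(t) exp(-2 i pi n t / a) dt."""
    total = 0j
    for k in range(samples):
        t = a * k / samples
        total += p(t) * cmath.exp(-2j * math.pi * n * t / a)
    return total / samples

a = 2.0   # hypothetical period
# Test polynomial with c_0 = 3, c_1 = c_{-1} = 1, c_2 = -i/2, c_{-2} = i/2.
p = lambda t: 3.0 + 2.0 * math.cos(2 * math.pi * t / a) + math.sin(4 * math.pi * t / a)
for n in range(-2, 3):
    print(n, fourier_coefficient(p, n, a))
```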
Since p is periodic, the integral (3.6) can be taken over any interval of
length a. By taking it to be $(-a/2, a/2)$, we immediately have the following
properties:
p even $\iff c_{-n} = c_n,\ n \in \mathbb{Z} \iff b_n = 0,\ n \in \mathbb{N}$;
p odd $\iff c_{-n} = -c_n,\ n \in \mathbb{Z} \iff a_n = 0,\ n \in \mathbb{N}$.
Finally, computing the quadratic norm of p from (3.1) gives
$$\|p\|_2^2 = \sum_{n=-N}^{N}\sum_{m=-N}^{N} c_n\overline{c_m}(e_n, e_m),$$
and this, combined with (3.5), yields Parseval's equality for trigonometric
polynomials:
$$\sum_{n=-N}^{N} |c_n|^2 = \frac{1}{a}\int_0^a |p(t)|^2\,dt. \tag{3.8}$$
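Parseval's equality (3.8) can be verified on a small example. In this sketch the coefficients, the period, and the helper names are arbitrary choices made for the illustration; the mean of |p|^2 over one period is computed by a uniform Riemann sum, which is exact for trigonometric polynomials of low degree.

```python
import cmath
import math

A = 2.0                                                   # hypothetical period
COEFFS = {-2: 0.5j, -1: 1.0, 0: 3.0, 1: 1.0, 2: -0.5j}    # arbitrary c_n

def p(t):
    """Trigonometric polynomial (3.1) built from COEFFS."""
    return sum(c * cmath.exp(2j * math.pi * n * t / A) for n, c in COEFFS.items())

def mean_square(func, period, samples=64):
    """(1/a) * integral of |func|^2 over one period, by a uniform Riemann sum."""
    return sum(abs(func(period * k / samples)) ** 2 for k in range(samples)) / samples

lhs = sum(abs(c) ** 2 for c in COEFFS.values())   # sum of |c_n|^2
rhs = mean_square(p, A)                           # (1/a) * integral of |p|^2
print(lhs, rhs)
```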
3.4 Exercises
Exercise 3.1 If $f : \mathbb{R} \to \mathbb{R}$ is a periodic function with period a, integrable
on bounded intervals, show that the integral $\int_x^{x+a} f(t)\,dt$ does not depend on x.
Lesson 4
Periodic Signals and Fourier Series
The question is this: If $f : \mathbb{R} \to \mathbb{C}$ is an arbitrary function with period a,
can we find a decomposition of f of the form
$$f(t) = \sum_{n} c_n e^{2i\pi n t/a}\,? \tag{4.1}$$
The immediate answer is "no" if one considers only finite sums. The sum
on the right is infinitely differentiable, while there is no reason for f to be.
For example, f could be the periodic window function in Figure 4.1.
FIGURE 4.1. Periodic window.
4.1 The space $L^2_p(0, a)$

In a famous paper dated 1807, Joseph Fourier asserted that the answer to
this question was "yes," provided that infinite sums are allowed. He arrived
at his results by a very circuitous route using the "tools at hand,"
which is to say, the mathematical techniques available at that time. Recall
that at the beginning of the nineteenth century, not only was the notion of
convergence rather vague, but the definition of function itself was open to
controversy. For example, the following question was debated: Is a function
defined on two consecutive intervals by two different formulas still a function?
(For more about this subject, we recommend the interesting popular
book [DH82].)
Today we approach this problem with two centuries of experience, during
which the tools of mathematical analysis have been considerably developed
and refined. In particular, we now understand the parts of the problem that
are simple and those that are difficult. In this section, we will approach the
problem from a geometric point of view.
Note first that a periodic function defined on $\mathbb{R}$ with period a is completely
determined by its values on any interval $[x, x + a)$ of length a.
In addition to periodicity, we need to assume that the functions considered
are such that the integral
$$\int_0^a |f(t)|^2\,dt$$
exists and is finite. We will not dwell on issues of integration at this point;
they will be discussed in Lessons 11 through 15. We introduce the notation
$$L^2_p(0, a) = \Bigl\{ f : \mathbb{R} \to \mathbb{C} \Bigm| f \text{ has period } a \text{ and } \int_0^a |f(t)|^2\,dt < +\infty \Bigr\}.$$
The index "p" is to remind us that the functions are periodic.
This set, endowed with the usual addition for functions and scalar multiplication,
is a complex vector space. (At this point, it is not obvious that
the sum of two such functions is a member of the set; see Proposition
16.1.4.) We define a scalar product (or in the complex case, a Hermitian
form) on this set by
$$(f, g) = \int_0^a f(t)\overline{g(t)}\,dt.$$
The associated norm is given by
$$\|f\|_2 = \sqrt{(f, f)} = \Bigl(\int_0^a |f(t)|^2\,dt\Bigr)^{1/2}.$$
It is important to note that the norm of f can be zero even though the
function $f \in L^2_p(0, a)$ is not zero at every point. For example, f could be
zero at all but a finite number of points. Thus, to have a true norm, it is
necessary to identify such a function with the function that is identically
zero, which we sometimes call the null function. Generally, we must identify
any two functions f and g for which $\int_0^a |f(t) - g(t)|\,dt = 0$ (see Section 13.3).
In this case, we say that the two functions are equal almost everywhere,
and we write f = g a.e. At this point, it is sufficient to remember that
$$\int_0^a |f(t)|\,dt = 0 \iff f = 0 \text{ a.e. on } (0, a).$$

4.2 The idea of approximation

If equality (4.1) cannot hold exactly for a finite sum, one can try to have
it hold "as well as possible." More precisely, we can try to answer the
following question: Given an integer N, is it possible to find coefficients $x_n$
such that
$$\Bigl\|f - \sum_{n=-N}^{N} x_n e_n\Bigr\|_2 \text{ attains a minimum?}$$
Geometrically, this amounts to finding an element $f_N$ in the subspace $T_N$
of $L^2_p(0, a)$ that has minimum distance from f. When such an element $f_N$
exists, we say that it is the best approximation of f in $T_N$ (see Figure 4.2).
FIGURE 4.2. Orthogonal projection on a subspace.
To solve this approximation problem, we first try to evaluate the distance
between f and an arbitrary trigonometric polynomial in $T_N$,
$$p(t) = \sum_{n=-N}^{N} x_n e_n(t).$$
Thus,
$$\|f - p\|_2^2 = \|f\|_2^2 - 2\operatorname{Re}(f, p) + \|p\|_2^2.$$
We know from (3.8) that
$$\|p\|_2^2 = a\sum_{n=-N}^{N} |x_n|^2 \quad \text{and} \quad (f, p) = \sum_{n=-N}^{N} \overline{x_n}(f, e_n).$$
By writing
$$c_n = c_n(f) = \frac{1}{a}(f, e_n),$$
we have
$$\|f - p\|_2^2 = \|f\|_2^2 + a\sum_{n=-N}^{N}\bigl(|c_n - x_n|^2 - |c_n|^2\bigr). \tag{4.2}$$
From (4.2) it is perfectly clear that the minimum is attained when Xn = Cn,
and only for this value.
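The passage to (4.2) is a completion of squares. As a sketch of the intermediate steps, using only $(f, e_n) = a\,c_n$ and the identity $|c_n - x_n|^2 = |c_n|^2 - 2\operatorname{Re}(\overline{x_n}c_n) + |x_n|^2$:

```latex
\begin{aligned}
\|f - p\|_2^2
  &= \|f\|_2^2 - 2\operatorname{Re}\sum_{n=-N}^{N} \overline{x_n}\,(f, e_n)
     + a\sum_{n=-N}^{N} |x_n|^2 \\
  &= \|f\|_2^2 + a\sum_{n=-N}^{N} \bigl(|x_n|^2 - 2\operatorname{Re}(\overline{x_n}\,c_n)\bigr) \\
  &= \|f\|_2^2 + a\sum_{n=-N}^{N} \bigl(|c_n - x_n|^2 - |c_n|^2\bigr).
\end{aligned}
```

Only the nonnegative terms $|c_n - x_n|^2$ depend on the $x_n$, which is why the minimum is reached exactly when every one of them vanishes.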
In summary, the best approximation $f_N$ exists and is unique, and it is
given by
$$f_N(t) = \sum_{n=-N}^{N} c_n e_n(t).$$
4.2.1 Theorem There exists a unique trigonometric polynomial $f_N$ in
$T_N$ such that
$$\|f - f_N\|_2 = \min\{\|f - p\|_2 : p \in T_N\}.$$
This polynomial is given by
$$f_N(t) = \sum_{n=-N}^{N} c_n e^{2i\pi n t/a}, \tag{4.3}$$
where
$$c_n = \frac{1}{a}\int_0^a f(t)e^{-2i\pi n t/a}\,dt. \tag{4.4}$$
4.2.2 Bessel's inequality

For $x_n = c_n$, equality (4.2) becomes
$$a\sum_{n=-N}^{N} |c_n|^2 + \|f - f_N\|_2^2 = \|f\|_2^2. \tag{4.5}$$
An immediate consequence is the inequality
$$\sum_{n=-N}^{N} |c_n|^2 \le \frac{1}{a}\int_0^a |f(t)|^2\,dt, \quad N \in \mathbb{N},$$
which is traditionally known as Bessel's inequality.
4.2.3 Corollary For any $f \in L^2_p(0, a)$, we have the inequality
$$\sum_{n=-\infty}^{+\infty} |c_n(f)|^2 \le \frac{1}{a}\int_0^a |f(t)|^2\,dt,$$
and hence $c_n(f) \to 0$ as $|n| \to +\infty$.

4.3 Convergence of the approximation

One can ask what happens to $f_N$ as $N \to +\infty$. Here is an example. Take
$a = 2\pi$ and define
$$f(t) = \begin{cases} +1 & \text{if } 0 \le t < \pi, \\ -1 & \text{if } \pi \le t < 2\pi. \end{cases}$$
By writing the exponentials in terms of sines, we have the following approximations
for N = 1, 3, 5:
$$f_1(t) = \frac{4}{\pi}\sin t;$$
$$f_3(t) = \frac{4}{\pi}\Bigl(\sin t + \frac{1}{3}\sin 3t\Bigr);$$
$$f_5(t) = \frac{4}{\pi}\Bigl(\sin t + \frac{1}{3}\sin 3t + \frac{1}{5}\sin 5t\Bigr).$$
These functions are shown graphically in Figures 4.3-4.5.
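The approximations above follow one pattern, a sum of sin(nt)/n over odd n up to N, scaled by 4/π. The sketch below evaluates such partial sums; the function name `square_wave_partial_sum` is hypothetical, and the general formula for arbitrary odd N is an extrapolation of the three cases listed.

```python
import math

def square_wave_partial_sum(t, N):
    """f_N(t) = (4/pi) * sum of sin(n t)/n over odd n <= N."""
    return (4.0 / math.pi) * sum(math.sin(n * t) / n for n in range(1, N + 1, 2))

for N in (1, 3, 5, 51):
    print(N, square_wave_partial_sum(math.pi / 2, N), square_wave_partial_sum(math.pi, N))
```

At t = π/2, where f equals 1, the values oscillate around 1 as N grows; at the jump t = π every partial sum is zero up to rounding.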

FIGURE 4.3. $f_1(t) = \frac{4}{\pi}\sin t$.
From this example it appears that $f_N$ tends to f as N increases. In fact,
we have the following important general result.
4.3.1 Theorem If $f \in L^2_p(0, a)$, then the best approximation of f in
$T_N$, which is given by
$$f_N(t) = \sum_{n=-N}^{N} c_n e^{2i\pi n t/a}$$
with the $c_n$ defined by (4.4), tends to f in $L^2_p(0, a)$ as $N \to +\infty$. Expressed
otherwise,
$$\int_0^a |f(t) - f_N(t)|^2\,dt \to 0 \quad \text{as } N \to +\infty.$$
FIGURE 4.4. $f_3(t) = \frac{4}{\pi}(\sin t + \frac{1}{3}\sin 3t)$.
FIGURE 4.5. $f_5(t) = \frac{4}{\pi}(\sin t + \frac{1}{3}\sin 3t + \frac{1}{5}\sin 5t)$.
The proof of this theorem requires more background than is available
in these early lessons. We will prove it in Lesson 16 as an illustration of
results about Lebesgue integration.
It is this theorem that gives meaning to the formula
$$f(t) = \sum_{n=-\infty}^{+\infty} c_n e^{2i\pi n t/a}, \tag{4.6}$$
or, as in (3.2), to the expression
$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{+\infty}\Bigl(a_n\cos\Bigl(2\pi n\frac{t}{a}\Bigr) + b_n\sin\Bigl(2\pi n\frac{t}{a}\Bigr)\Bigr). \tag{4.7}$$
Two remarks are indicated at this point:
(a) Formulas (4.6) and (4.7) are equalities in the $L^2_p(0, a)$ norm. In particular,
they do not mean that for each value of t, the complex number f(t)
is equal to the sum of the series on the right. In the last example, the sum
of the series is zero for $t = \pi$, whereas $f(\pi) = -1$. Similarly, if f is modified
at a point $t_0$, the Fourier series is unchanged for $t = t_0$. This touches on
the problem of pointwise representation that will be studied in the next
lesson.
(b) A more scholarly way to express Theorem 4.3.1 is to say that the
family of functions $(e_n)_{n\in\mathbb{Z}}$ is a topological basis for the space $L^2_p(0, a)$ and
that the series
$$\sum_{n=-\infty}^{+\infty} c_n e_n$$
is summable to f in this space. The meaning of this will be explained in
Lesson 16, where we develop the theory of Hilbert spaces.
The $c_n$ are called the Fourier coefficients of the periodic function f.
4.3.2 Parseval's equality

Equation (4.5) and Theorem 4.3.1 imply that
$$\sum_{n=-\infty}^{+\infty} |c_n|^2 = \frac{1}{a}\int_0^a |f(t)|^2\,dt. \tag{4.8}$$
This is called Parseval's equality. In fact, in view of (4.5), it is equivalent
to the statement of Theorem 4.3.1. Another way to express (4.8) is to say
that the energy of a periodic signal is equal to the sum of the energies of
its harmonic components. If the development in sines and cosines (4.7) is
used, then it follows from (3.4) that
$$\frac{1}{a}\int_0^a |f(t)|^2\,dt = \frac{|a_0|^2}{4} + \frac{1}{2}\sum_{n=1}^{+\infty}\bigl(|a_n|^2 + |b_n|^2\bigr). \tag{4.9}$$
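Parseval's equality can be checked on the square wave of Section 4.3. There $b_n = 4/(\pi n)$ for odd n and all other coefficients vanish, so by (3.4) $|c_n|^2 = |c_{-n}|^2 = 4/(\pi^2 n^2)$ for odd n, while the right-hand side of (4.8) equals 1. The sketch below sums the left-hand side up to a cutoff N; the function name and the cutoff are assumptions made for the illustration.

```python
import math

def parseval_partial_sum(N):
    """Sum of |c_n|^2 over |n| <= N for the square wave: 2 * (2/(pi n))^2 per odd n."""
    return sum(2 * (2.0 / (math.pi * n)) ** 2 for n in range(1, N + 1, 2))

for N in (1, 11, 101, 100001):
    print(N, parseval_partial_sum(N))
```

The partial sums increase toward 1 and never exceed it, in agreement with Bessel's inequality.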
4.3.3 Uniqueness of the Fourier coefficients

Are two functions that have the same Fourier coefficients equal? This is the
question of uniqueness. One might expect to have at least equality almost
everywhere, since
$$f = g \text{ a.e.} \implies c_n(f) = c_n(g) \text{ for all } n \in \mathbb{Z}.$$
By linearity, an affirmative answer to the uniqueness question reduces to
the following property:
$$f \in L^2_p(0, a) \text{ and } c_n(f) = 0 \text{ for all } n \in \mathbb{Z} \implies f = 0 \text{ a.e.} \tag{4.10}$$
But this is an immediate consequence of (4.8), knowing that
$$\int_0^a |f(t)|^2\,dt = 0 \implies f = 0 \text{ a.e. on } (0, a).$$
In fact, in the context of the space $L^2_p(0, a)$ there is an equivalence between
Parseval's formula (and thus Theorem 4.3.1) and the uniqueness of the
Fourier coefficients. One of the implications is immediate, as we have just
seen; the converse is more difficult to establish. It will be done in Lesson 16.
A direct proof (which does not use Theorem 4.3.1) of the uniqueness of the
Fourier coefficients for a piecewise continuous function is the content of
Exercise 4.7.
4.4 Fourier coefficients of real, odd, and even functions
At the end of Lesson 3, we showed that certain properties of a trigonometric
polynomial p are reflected as conditions on its coefficients, and conversely.
It is easy to prove the same relations for a periodic function f. These
relations are important in practice because they can be used to reduce the
number of numerical operations in certain calculations. We note specifically
the following relations:
(a) f real $\iff c_{-n} = \overline{c_n},\ n \in \mathbb{Z} \iff a_n$ and $b_n$ real, $n \in \mathbb{N}$
(b) f even $\iff c_{-n} = c_n,\ n \in \mathbb{Z} \iff b_n = 0,\ n \in \mathbb{N}$
(c) f odd $\iff c_{-n} = -c_n,\ n \in \mathbb{Z} \iff a_n = 0,\ n \in \mathbb{N}$
(d) f real, even $\iff$ the sequence $(c_n)$ is real and even
(e) f real, odd $\iff$ the sequence $(c_n)$ is pure imaginary and odd
The properties of f on the left are to be understood in the sense of almost
everywhere. For example, if $c_{-n} = c_n$ for all n, then the two functions f(t)
and f(-t) have the same Fourier series development and are equal almost
everywhere by (4.10). Thus one must understand by "f even" the property
$$f(-t) = f(t) \text{ for a.e. } t,$$
where "a.e. t" means "almost every t" or almost everywhere.

4.5 Formulary
$$f(t) = \sum_{n=-\infty}^{+\infty} c_n e^{2i\pi n t/a}$$
$$c_n = \frac{1}{a}\int_0^a f(t)e^{-2i\pi n t/a}\,dt$$
4.6 Exercises
Exercise 4.1 Calculate the Fourier series expansions of the following functions
and verify the symmetric properties of the coefficients:
(a) f has period 2 and $f(t) = |t|$ if $|t| < 1$.
(b) f has period a and $f(t) = t/a$ if $0 \le t < a$.
(c) $f(t) = |\sin t|$.
(d) $f(t) = \sin^3 t$.
Exercise 4.2 If the Fourier coefficients of f(t) are $c_n$, what are the Fourier
coefficients of the translated function $f(t - t_0)$? Deduce from Exercise 4.1 the
Fourier series expansion of $f(t) = |\cos t|$.
Exercise 4.3 Prove relation (4.9). (Hint: Use (4.8) and (3.4).)
Exercise 4.4 Write Parseval's equality for each function in Exercise 4.1.
Exercise 4.5 Assume that $f \in L^2_p(0, a)$ and $c_n$ are its Fourier coefficients.
Then f is also in $L^2_p(0, 2a)$ with Fourier coefficients $c_n'$. How are the coefficients
$c_n$ and $c_n'$ related? Verify that the two Fourier series are identical.
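Exercise 4.6 Find the Fourier series expansion of the function f with period
a = 2 defined on [-1, +1) for $z \in \mathbb{C}\setminus\mathbb{Z}$ by
$$f(t) = e^{i\pi z t}.$$
Deduce the relation
$$\sum_{n=-\infty}^{+\infty} \frac{1}{(x - n)^2} = \frac{\pi^2}{\sin^2 \pi x}$$
for all $x \in \mathbb{R}\setminus\mathbb{Z}$ from Parseval's equality.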
Exercise 4.7 (Uniqueness of the Fourier coefficients for a periodic,
piecewise continuous function.) Assume for simplicity that the piecewise
continuous function f has period 1. We wish to show that if all the Fourier
coefficients are zero, that is, if
$$(H_1) \qquad c_n(f) = \int_0^1 f(t)e^{-2i\pi n t}\,dt = 0, \quad n \in \mathbb{Z},$$
then f vanishes everywhere except perhaps at points where f is discontinuous.
Let $s \in [0, 1]$ be a point where f is continuous and suppose that $f(s) \ne 0$. One
can assume (by a translation and multiplication by -1, if necessary) that s = 0
and f(0) > 0.
(1) Show that $(H_1)$ implies
$$(H_2) \qquad \int_{-1/2}^{1/2} f(t)p(t)\,dt = 0$$
for all trigonometric polynomials p with period 1.
(2) Take $a \in (0, 1/4]$ such that
$$|t| \le a \implies f(t) \ge \tfrac{1}{2}f(0). \quad \text{(Why does such an } a \text{ exist?)}$$
Define
$$\sigma_N(t) = \frac{1}{N}\sum_{k=0}^{N-1}\sum_{n=-k}^{k} e^{2i\pi n t}.$$
(a) Calculate $\sigma_N(t)$ and show that
$$\int_{-1/2}^{1/2} \sigma_N(t)\,dt = 1.$$
(b) Show that
$$\lim_{N\to\infty}\int_{a \le |t| \le 1/2} \sigma_N(t)\,dt = 0.$$
(c) Deduce from this that
$$\int_{-a}^{a} \sigma_N(t)f(t)\,dt \ge \frac{f(0)}{4}$$
for sufficiently large N.
(3) Show that
$$\int_{-1/2}^{1/2} \sigma_N(t)f(t)\,dt > 0$$
for sufficiently large N and deduce the result.
Lesson 5
Pointwise Representation
A function used for numerical computation is necessarily evaluated at only
a finite number of points. It is therefore important to determine whether
formulas (4.6) and (4.7) can express equality at a given point $t$. This is the
problem of pointwise representation. We begin by extending the notion of
Fourier series beyond the space $L^2_p(0,a)$.
5.1 The Riemann-Lebesgue theorem

The Lebesgue integral has the remarkable property that a function is
integrable on an interval $I$ if and only if its modulus is integrable on $I$:
$$f \text{ is Lebesgue-integrable on } I \iff \int_I |f(x)|\,dx < \infty. \tag{5.1}$$
This is false for the Riemann integral; take, for example, the function equal
to 1 on the rationals and $-1$ on the irrationals.
It follows immediately that the Fourier coefficients of a periodic function
$$c_n(f) = \frac{1}{a}\int_0^a f(t)\, e^{-2i\pi n t/a}\,dt$$
exist if and only if $f$ is integrable on $(0,a)$. We introduce the notation
$$L^1_p(0,a) = \Big\{ f : \mathbb{R} \to \mathbb{C} \;\Big|\; f \text{ has period } a \text{ and } \int_0^a |f(t)|\,dt < +\infty \Big\}.$$
Note that $L^2_p(0,a) \subset L^1_p(0,a)$, so saying $f$ is in $L^1_p(0,a)$ is less restrictive
than saying $f$ is in $L^2_p(0,a)$. For any $f$ in $L^1_p(0,a)$, we can consider the
Fourier series
$$\sum_{n=-\infty}^{+\infty} c_n(f)\, e^{2i\pi n t/a}.$$
We do not know whether this series converges, and if it does converge, we
do not know the value of its limit.
That $c_n(f) \to 0$ as $|n| \to +\infty$ is an important necessary, but not sufficient,
condition for convergence. This limit was established for $f \in L^2_p(0,a)$ using
Bessel's inequality (4.2.3), which makes sense only for functions in $L^2_p(0,a)$.
However, the property $c_n(f) \to 0$ remains true for $f$ in $L^1_p(0,a)$.
5.1.1 Theorem (Riemann-Lebesgue) Let $(a,b)$ be a bounded
interval and assume that $f$ is integrable on $(a,b)$. Then the integral
$$I_n = \int_a^b f(x)\, e^{2i\pi n x}\,dx$$
tends to 0 as $|n| \to +\infty$.

Proof. This is easy to establish when $f$ is continuously differentiable on
$[a,b]$. Integration by parts shows that
$$I_n = \frac{1}{2i\pi n}\Big[f(x)\, e^{2i\pi n x}\Big]_a^b - \frac{1}{2i\pi n}\int_a^b f'(x)\, e^{2i\pi n x}\,dx,$$
which yields the estimate
$$|I_n| \le \frac{1}{2\pi |n|}\Big(|f(a)| + |f(b)| + \int_a^b |f'(x)|\,dx\Big).$$
The right-hand side, and thus $I_n$, tends to 0 as $|n| \to +\infty$.
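We now use a density argument that is based on the following provisional
assumption: The functions that are continuously differentiable on $[a,b]$ are
dense in $L^1(a,b)$. This means that given $\varepsilon > 0$, there exists a $g_\varepsilon \in C^1([a,b])$
such that
$$\int_a^b |f(x) - g_\varepsilon(x)|\,dx \le \frac{\varepsilon}{2},$$
which implies that
$$|I_n| \le \int_a^b |f(x) - g_\varepsilon(x)|\,dx + \Big|\int_a^b g_\varepsilon(x)\, e^{2i\pi n x}\,dx\Big| \le \frac{\varepsilon}{2} + \Big|\int_a^b g_\varepsilon(x)\, e^{2i\pi n x}\,dx\Big|.$$
From the previous argument, there exists $N > 0$ such that the last integral
is dominated by $\varepsilon/2$ if $|n| \ge N$. Thus, $|n| \ge N$ implies $|I_n| \le \varepsilon$, and this
proves the theorem. □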
5.2 Pointwise convergence?

We have already noted following Theorem 4.3.1 that the mean quadratic
convergence of the Fourier series $f_N$ to $f$ gives no information about the
convergence of $f_N$ at a given point. For this, one needs more refined
assumptions about the function. In practice, these hypotheses will generally hold.
However, we emphasize that these additional hypotheses are essential, since
several "natural" results that one might expect to hold for $f \in L^1_p(0,a)$ are
indeed false. We cite three of these:
(i) $f_N \to f$ in the $L^1_p(0,a)$ norm.
(ii) $f_N(t) \to f(t)$ for almost all $t$.
(iii) If $f$ is also continuous on $\mathbb{R}$, then $f_N(t) \to f(t)$ for all $t \in \mathbb{R}$.
It has even been shown (see [KF74]) that there exists an $f$ in $L^1_p(0,a)$ such
that $f_N(t)$ diverges for all $t$ as $N \to +\infty$! These examples represent difficult
problems that have played a central role during the last century in research
on the theory of functions.
5.2.1 Piecewise continuous functions on [a, b]

Let $f : [a,b] \to \mathbb{C}$ be a complex-valued function. We say that $f$ is piecewise
continuous on $[a,b]$ if it is continuous on $[a,b]$ except at a finite (possibly
zero) number of points and if both the right and left limits exist and are
finite at these exceptional points. These limits are defined by
$$f(t+) = \lim_{h \to 0^+} f(t+h), \qquad f(t-) = \lim_{h \to 0^-} f(t+h).$$
At the end points $a$ and $b$ we require that only the one-sided limits exist. We
denote this function space by $C_{pw}[a,b]$, where "pw" stands for "piecewise."
Then
$$f \in C_{pw}[a,b] \implies \begin{cases} f \text{ is bounded on } [a,b], \\ f \text{ is integrable on } [a,b]. \end{cases}$$
The integral of $f$ involves only the integration of continuous functions, since
$$\int_a^b f(t)\,dt = \sum_{k=0}^{n} \int_{a_k}^{a_{k+1}} f(t)\,dt,$$
where $a_0 = a$, $a_{n+1} = b$, and where $a_1, \dots, a_n$ are the points of discontinuity
of $f$ in $(a,b)$. Each integral in the sum is understood to be the integral
of the continuous extension of $f$ to the interval $[a_k, a_{k+1}]$. Note that as far
as the integral is concerned, $f$ does not need to be defined at the points $a_k$.
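EXAMPLE: Define $f$ on $[-1,1]$ by
$$f(t) = \begin{cases} t^2 + 1 & \text{if } -1 \le t < 0, \\ -t + 2 & \text{if } 0 \le t \le 1. \end{cases}$$
Then $f$ has a derivative
$$f'(t) = \begin{cases} 2t & \text{if } -1 \le t < 0, \\ -1 & \text{if } 0 < t \le 1, \end{cases}$$
and $f'$ is in $C_{pw}[-1,+1]$; however, $f'$ is not defined at $t = 0$.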
5.2.2 Functions of bounded variation on [a, b]

We say that a function $f$ is of bounded variation on $[a,b]$, denoted by
$f \in BV[a,b]$, if there exists an $M$ such that
$$\sum_{k=0}^{n} |f(t_{k+1}) - f(t_k)| \le M$$
for all subdivisions $a = t_0 < t_1 < \cdots < t_{n+1} = b$, where $n \in \mathbb{N}$ is arbitrary.
For $f$ to be of bounded variation, it is necessary and sufficient that its real
and imaginary parts be of bounded variation. Any real function that is
monotonic on $[a,b]$ is of bounded variation on $[a,b]$. In fact, we have the
following characterization for real functions:
$$f \in BV[a,b] \iff \text{there exist } g \text{ and } h, \text{ monotonic on } [a,b], \text{ such that } f = g - h.$$
The implication from right to left is immediate. For the other direction,
see, for example, [KF74]. We deduce from this last equivalence that
$$f \in BV[a,b] \implies \begin{cases} f \text{ is Riemann integrable on } [a,b], \text{ and} \\ f(t-) \text{ and } f(t+) \text{ exist for all } t \in (a,b). \end{cases} \tag{5.2}$$
5.2.3 An expression for the remainder $f_N(t_0) - f(t_0)$

Let $f$ be in $L^1_p(0,a)$ and assume that at a point $t_0$ the limits $f(t_0+)$ and
$f(t_0-)$ exist. From (4.3) and (4.4) we have
$$f_N(t_0) = \frac{1}{a}\int_{-a/2}^{a/2} \Big(\sum_{n=-N}^{N} e^{2i\pi n (t_0 - x)/a}\Big) f(x)\,dx.$$
An easy computation shows that
$$\sum_{n=-N}^{N} e^{2i\pi n t} = \frac{\sin \pi(2N+1)t}{\sin \pi t}. \tag{5.3}$$
Thus we obtain
$$f_N(t_0) = \frac{1}{a}\int_{-a/2}^{a/2} \frac{\sin \pi(2N+1)\frac{t_0 - x}{a}}{\sin \pi\frac{t_0 - x}{a}}\, f(x)\,dx,$$
which, by a change of variable, becomes
$$f_N(t_0) = \frac{1}{a}\int_{-a/2 - t_0}^{a/2 - t_0} \frac{\sin \pi(2N+1)\frac{x}{a}}{\sin \pi\frac{x}{a}}\, f(x + t_0)\,dx.$$
Since the integrand has period $a$, the integral can be taken over $(-a/2, a/2]$.
By making the variable change $x \mapsto -x$ in the integral over $[-a/2, 0]$, we
have the expression
$$f_N(t_0) = \frac{1}{a}\int_0^{a/2} \big[f(t_0 + x) + f(t_0 - x)\big]\, \frac{\sin \pi(2N+1)\frac{x}{a}}{\sin \pi\frac{x}{a}}\,dx.$$
From (4.3) and (4.4), we see that $f = 1$ implies $f_N = 1$ for all $N$. Thus, for
all $N \ge 0$,
$$\frac{1}{a}\int_0^{a/2} \frac{\sin \pi(2N+1)\frac{x}{a}}{\sin \pi\frac{x}{a}}\,dx = \frac{1}{2}.$$
If we write $y_0 = \frac{1}{2}[f(t_0+) + f(t_0-)]$, we obtain
$$f_N(t_0) - y_0 = \frac{1}{a}\int_0^{a/2} \big[f(t_0 + x) - f(t_0+) + f(t_0 - x) - f(t_0-)\big]\, \frac{\sin \pi(2N+1)\frac{x}{a}}{\sin \pi\frac{x}{a}}\,dx. \tag{5.4}$$
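Identity (5.3) and the normalization of the kernel over half a period can be checked numerically. The following is only a sketch: the function names and the midpoint-rule quadrature are illustrative choices, not part of the text.

```python
import cmath
import math


def dirichlet_sum(N, t):
    """Left side of (5.3): sum of exp(2*i*pi*n*t) for n = -N, ..., N."""
    return sum(cmath.exp(2j * math.pi * n * t) for n in range(-N, N + 1))


def dirichlet_closed(N, t):
    """Right side of (5.3); only meaningful when t is not an integer."""
    return math.sin(math.pi * (2 * N + 1) * t) / math.sin(math.pi * t)


def half_period_mean(N, a=1.0, cells=2000):
    """Midpoint-rule estimate of (1/a) * integral over (0, a/2) of the kernel at x/a."""
    h = (a / 2) / cells
    total = sum(dirichlet_closed(N, (j + 0.5) * h / a) for j in range(cells))
    return total * h / a


# Both sides of (5.3) at a sample point, then the half-period mean.
print(dirichlet_sum(5, 0.13), dirichlet_closed(5, 0.13))
print(half_period_mean(7))
```

The second printed value should be close to 1/2 for every $N$, which is the normalization used to pass from the expression for $f_N(t_0)$ to (5.4).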
5.2.4 Theorem (Dirichlet's theorem) Let $f$ be in $L^1_p(0,a)$. If
the limits $f(t_0+)$ and $f(t_0-)$ exist at a point $t_0$, and if the left- and
right-hand derivatives also exist at $t_0$, then
$$f_N(t_0) \to \frac{1}{2}\big[f(t_0+) + f(t_0-)\big]$$
as $N \to +\infty$. If, in addition, $f$ is continuous at $t_0$, then $f_N(t_0) \to f(t_0)$.

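Proof. The result will follow from (5.4). The assumption that the left and
right derivatives exist means that the quantities
$$\frac{1}{x}\big(f(t_0 + x) - f(t_0+)\big) \quad \text{and} \quad \frac{1}{x}\big(f(t_0 - x) - f(t_0-)\big)$$
tend to finite limits as $x \to 0^+$. Hence, the same is true for
$$\varphi(x) = \frac{f(t_0 + x) - f(t_0+) + f(t_0 - x) - f(t_0-)}{\sin \pi\frac{x}{a}}.$$
Thus, there exist $\alpha > 0$ and $M > 0$ such that $|\varphi(x)| \le M$ for all $x \in (0, \alpha]$.
Since $f \in L^1_p(0,a)$, $\varphi$ is integrable on $[\alpha, a/2]$, and
$$|\varphi(x)| \le M + |\varphi(x)|\,\chi_{[\alpha, a/2]}(x)$$
for all $x \in (0, a/2]$. The function on the right is integrable because it is the
sum of two integrable functions. From (5.1) we know that
$$\left.\begin{array}{l} |g(x)| \le h(x) \text{ for all } x \in I, \\ h \text{ Lebesgue integrable on } I \end{array}\right\} \implies g \text{ Lebesgue integrable on } I. \tag{5.5}$$
This implies that $\varphi$ is integrable on $(0, a/2)$. Since (5.4) reads
$$f_N(t_0) - y_0 = \frac{1}{a}\int_0^{a/2} \varphi(x) \sin \pi(2N+1)\frac{x}{a}\,dx,$$
the Riemann-Lebesgue theorem shows that $f_N(t_0) - y_0 \to 0$ as $N \to +\infty$. □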
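As an illustration of Dirichlet's theorem, consider the period-1 sawtooth $f(t) = t$ on $[0,1)$ (an example chosen here, not taken from the text). Its coefficients are $c_0 = 1/2$ and $c_n = i/(2\pi n)$ for $n \neq 0$, so the symmetric partial sum reduces to a real sine sum. The sketch below evaluates it at the jump $t_0 = 0$, where the theorem predicts the midpoint value $1/2$, and at $t_0 = 1/4$, where $f$ is continuous.

```python
import math


def sawtooth_partial_sum(N, t):
    """f_N(t) for the period-1 sawtooth f(t) = t on [0, 1).

    Pairing the terms n and -n of the complex series gives
    f_N(t) = 1/2 - sum_{n=1}^{N} sin(2*pi*n*t) / (pi*n).
    """
    return 0.5 - sum(math.sin(2 * math.pi * n * t) / (math.pi * n)
                     for n in range(1, N + 1))


for N in (10, 100, 1000):
    # At t = 0 the value is the midpoint of the jump; at t = 1/4 it approaches 1/4.
    print(N, sawtooth_partial_sum(N, 0.0), sawtooth_partial_sum(N, 0.25))
```

The convergence at $t_0 = 1/4$ is slow, of order $1/N$, which is consistent with coefficients that decay only like $1/n$.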
This result shows that the convergence of the Fourier series of $f$ at a
point $t_0$ depends only on the behavior of $f$ in a neighborhood of $t_0$.
Here is a global result that we will not prove.
5.2.5 Theorem Assume that $f : \mathbb{R} \to \mathbb{C}$ is a periodic function of
bounded variation with period $a$. Then we have the following results:
(i) For all $t_0 \in \mathbb{R}$, $f_N(t_0) \to \frac{1}{2}(f(t_0+) + f(t_0-))$ as $N \to +\infty$.
(ii) If, in addition, $f$ is continuous on a closed and bounded interval $[\alpha, \beta]$,
then $f_N$ converges uniformly to $f$ on $[\alpha, \beta]$.
REMARK: The fact that $f_N(t_0)$ has a limit for each fixed $t_0$ as $N \to +\infty$
does not imply that either of the series
$$\sum_{n=0}^{+\infty} c_n e^{2i\pi n t_0/a} \quad \text{or} \quad \sum_{n=-\infty}^{-1} c_n e^{2i\pi n t_0/a}$$
is convergent. On the other hand, this does imply the convergence of the
series (4.7) of sines and cosines because it is obtained by symmetrically
regrouping the terms of $f_N$.
5.3 Uniform convergence of Fourier series

5.3.1 Theorem Assume that $f$ has period $a$ and is continuous on $\mathbb{R}$;
that $f$ is differentiable on $[0,a]$, except possibly at a finite number of points;
and that $f'$ is piecewise continuous.
(i) The Fourier series of $f'$ is obtained by differentiating the Fourier series
of $f$ term by term.
(ii) The Fourier coefficients of $f$ satisfy
$$\sum_{n=-\infty}^{+\infty} |c_n(f)| < +\infty.$$
(iii) The Fourier series of $f$ converges uniformly to $f$ on $\mathbb{R}$.

Proof. The hypotheses have been fashioned so that the expression for
$c_n(f)$ can be integrated by parts and so that $f'$ is in $L^2_p(0,a)$, and hence
in $L^1_p(0,a)$. Integration by parts shows that, for $n \neq 0$,
$$c_n(f) = \frac{1}{2i\pi n}\int_0^a f'(t)\, e^{-2i\pi n t/a}\,dt,$$
since $f(0+) = f(a-)$, and hence the Fourier coefficients $c_n(f')$ of $f'$ are
given by
$$c_n(f') = \frac{2i\pi n}{a}\, c_n(f),$$
which proves (i). We deduce (ii) directly from the inequality
$$|c_n(f)| = \frac{a}{2\pi |n|}\,|c_n(f')| \le \frac{a}{4\pi}\Big(\frac{1}{n^2} + |c_n(f')|^2\Big).$$
To prove (iii), note that $\sum_{n=-\infty}^{+\infty} |c_n(f)| < +\infty$ implies that $(f_N)$ is a
Cauchy sequence in the uniform norm on $[0,a]$ (and hence on $\mathbb{R}$), so $f_N$
converges uniformly on $\mathbb{R}$ to some continuous function $g$. Since uniform
convergence implies convergence in the $L^2_p(0,a)$ norm, it follows from Theorem 4.3.1
and the uniqueness of the limit that $f = g$ almost everywhere. But both $f$
and $g$ are continuous, so $f = g$ everywhere. □
The last part of this proof established the following corollary.
5.3.2 Corollary If $f \in L^2_p(0,a)$ and if its Fourier coefficients satisfy
$$\sum_{n=-\infty}^{+\infty} |c_n(f)| < +\infty,$$
then $f$ is equal to a continuous function $\tilde{f}$ almost everywhere and the
Fourier series of $f$ converges uniformly to $\tilde{f}$ on $\mathbb{R}$.
5.3.3 Conclusions
We use the terms "regular" and "smooth" informally to mean that a function
has a number (undetermined) of derivatives. Thus the more regular,
or the smoother, a function $f$ is, the faster the coefficients $c_n(f)$ tend to 0.
This can be seen by repeatedly integrating by parts. This is summarized
in the following display, where the regularity of $f$ is increasing and where
$C^k_p[0,a]$ denotes the space of functions $f \in C^k(\mathbb{R})$ that are $a$-periodic.
(a) $f \in L^1_p(0,a) \implies c_n(f) \to 0$
(b) $f \in L^2_p(0,a) \implies \sum_{n=-\infty}^{+\infty} |c_n(f)|^2 < +\infty$
(c) $f \in C^1_p[0,a] \implies \sum_{n=-\infty}^{+\infty} |c_n(f)| < +\infty$
(d) $f \in C^2_p[0,a] \implies |c_n(f)| \le K/n^2$
(e) $f \in C^\infty_p[0,a] \implies |n^k c_n(f)| \to 0$, $k \in \mathbb{N}$,
as $|n| \to +\infty$.
The property
$$|n^k c_n(f)| \to 0 \quad \text{for all } k \in \mathbb{N} \text{ as } |n| \to +\infty$$
will be abbreviated by the expression the sequence $c_n(f)$ is rapidly decreasing
or variations such as $c_n(f)$ decreases rapidly. Note that these expressions,
although widely used, are a slight abuse of the language: They in no
way imply that the sequence $|c_n(f)|$ is monotonic.
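The link between regularity and decay can be observed numerically. The sketch below is only an illustration: the two period-1 test functions and the midpoint-rule quadrature are choices made here, not taken from the text. The sawtooth has a jump, and its coefficients decay like $1/n$; the triangle wave is continuous with a piecewise continuous derivative (the setting of Theorem 5.3.1), and for this particular example the odd-index coefficients decay like $1/n^2$.

```python
import cmath
import math


def fourier_coefficient(f, n, a=1.0, samples=4096):
    """Midpoint-rule estimate of c_n(f) = (1/a) * int_0^a f(t) exp(-2*i*pi*n*t/a) dt."""
    total = 0j
    for j in range(samples):
        t = (j + 0.5) * a / samples
        total += f(t) * cmath.exp(-2j * math.pi * n * t / a)
    return total / samples


def sawtooth(t):
    """Period-1 function with a jump at the integers."""
    return t - math.floor(t)


def triangle(t):
    """Period-1 continuous function whose derivative is piecewise continuous."""
    return abs((t - math.floor(t)) - 0.5)


for n in (1, 3, 9, 27):
    cs = abs(fourier_coefficient(sawtooth, n))
    ct = abs(fourier_coefficient(triangle, n))
    # n*|c_n| for the sawtooth and n^2*|c_n| for the triangle stay roughly constant.
    print(n, n * cs, n * n * ct)
```

For these two examples the exact values are $|c_n| = 1/(2\pi n)$ for the sawtooth and $|c_n| = 1/(\pi^2 n^2)$ for odd $n$ for the triangle, so the two printed columns should hover near $1/(2\pi)$ and $1/\pi^2$.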
We now show that implication (e) is in fact an equivalence.
5.3.4 Proposition Assume that $f \in L^1_p(0,a)$. Then the following two
properties are equivalent.
(i) The Fourier coefficients of $f$ are rapidly decreasing.
(ii) The function $f$ is infinitely differentiable.
Proof. We have seen that (ii) $\implies$ (i). Conversely, if the $c_n(f)$ are rapidly
decreasing, then in particular, $n^2 |c_n(f)| \to 0$. Hence $\sum_{n=-\infty}^{+\infty} |c_n(f)| < +\infty$,
and the sequence of functions
$$f_N(t) = \sum_{n=-N}^{N} c_n e^{2i\pi n t/a}$$
converges uniformly to $f$ (Corollary 5.3.2). In the same way,
$$f_N'(t) = \sum_{n=-N}^{N} \frac{2i\pi n}{a}\, c_n e^{2i\pi n t/a}$$
converges uniformly to a continuous function $g$. Then it is a classical result
that $f$ is differentiable and that $f' = g$. By iterating this argument, $f$ is
shown to have derivatives up to any finite order. □
EXAMPLE: The Fourier coefficients of $f(t) = \dfrac{\sin t}{2 + \cos t}$ decrease rapidly.
5.4 Exercises
Exercise 5.1 Prove that a piecewise continuous function on $[a,b]$ is bounded.
Note that $a$ and $b$ must be finite! Find a counterexample if $a = -\infty$ or $b = +\infty$.
Exercise 5.2 Show that a function in $C^1[a,b]$ is of bounded variation on
$[a,b]$. (As in Exercise 5.1, the bounds $a$ and $b$ must be finite.)
*Exercise 5.3 Define $f$ by
$$f(x) = \begin{cases} x \sin\dfrac{1}{x} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}$$
Show that $f$ is continuous on $[0,1]$ and differentiable on $(0,1]$ but that it is not of
bounded variation on $[0,1]$.
Exercise 5.4 Prove formula (5.3).
Exercise 5.5 Develop the Fourier series of the function $f$, with period $a = 2$,
defined on $[-1,1)$ by
$$f(t) = \cos \pi z t, \quad z \in \mathbb{C}\setminus\mathbb{Z}.$$
From this deduce the equalities
$$\pi \cot \pi z = \frac{1}{z} + 2z\sum_{n=1}^{\infty} \frac{1}{z^2 - n^2},$$
$$\frac{\pi}{\sin \pi z} = \frac{1}{z} + 2z\sum_{n=1}^{\infty} \frac{(-1)^n}{z^2 - n^2}.$$
Exercise 5.6 Show that
$$\sum_{n=1}^{\infty} \frac{\sin nx}{n} = \frac{\pi}{2} - \frac{x}{2}, \quad x \in (0, 2\pi).$$
Derive from this the value of
$$f(x) = \sum_{\substack{n=-\infty \\ n \neq 0}}^{+\infty} \frac{1}{n}\, e^{2i\pi n x/a}, \quad 0 < x < a.$$
**Exercise 5.7 Let $f$ be the $2\pi$-periodic function defined on $(0, 2\pi)$ by
$$f(x) = \ln\Big(2\sin\frac{x}{2}\Big).$$
(a) Verify that $f$ is even.
(b) Show that $f \in L^1_p(0, 2\pi)$.
(c) Is $f$ of bounded variation on $(0, 2\pi)$? Is it in $L^2_p(0, 2\pi)$?
(d) We wish to determine the Fourier series expansion of $f$:
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx.$$
Compute $a_n$, $n \ge 1$, by noticing that the integral
$$I_n = \int_0^{\pi} \cot\frac{x}{2}\, \sin nx\,dx$$
does not depend on $n$ (compute $I_n - I_{n-1}$).
(e) Determine the value of $a_0$ and prove that
$$\sum_{n=1}^{\infty} \frac{1}{n}\cos nx = -\ln\Big(2\sin\frac{x}{2}\Big), \quad x \in (0, 2\pi).$$
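*Exercise 5.8 Let $x$ be a real parameter and let $f$ be defined by
$$f(t) = e^{x e^{it}}.$$
(a) What is the period of $f$? Show that
$$c_n(f) = \begin{cases} 0 & \text{if } n < 0, \\ \dfrac{x^n}{n!} & \text{if } n \ge 0. \end{cases}$$
(b) Deduce that
$$\int_0^{2\pi} e^{2x\cos t}\,dt = 2\pi\sum_{n=0}^{\infty} \frac{x^{2n}}{(n!)^2}.$$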
(c) Define
$$I_n = \int_0^{2\pi} \cos^n t\,dt.$$
Show that
$$\int_0^{2\pi} e^{2x\cos t}\,dt = \sum_{n=0}^{\infty} \frac{2^n}{n!}\, I_n\, x^n,$$
and from this determine $I_n$.
Note that $I_{2p+1} = 0$. Was this foreseeable?

**Exercise 5.9 Consider the sequence of polynomials $B_k$ defined by
$$B_0(x) = 1, \qquad B_k'(x) = k B_{k-1}(x) \quad \text{and} \quad \int_0^1 B_k(x)\,dx = 0, \quad k \ge 1.$$
(a) Compute $B_1$, $B_2$, $B_3$ and draw their graphs in the same coordinate system.
(b) Show that each of these graphs is mapped into itself by reflection about
one or two axes. Is this generally true for the polynomials $B_k$? Express this
property algebraically for $B_k$.
(c) Let $f_k$ be the function with period 1 that coincides with $B_k$ on $[0,1)$. Show
that for $k \ge 3$,
$$f_k \in C^1(\mathbb{R}) \quad \text{and} \quad f_k' = k f_{k-1},$$
and that $f_2$ satisfies the hypotheses of Theorem 5.3.1.

(d) Show that
k 2: 2.
(e) Compute the Fourier series of $f_1$ and use it to determine the Fourier series
of $f_k$ for all $k \ge 2$.
Exercise 5.10 Let $f$ be the $2\pi$-periodic function defined on $[-\pi, \pi)$ by
$$f(x) = \cosh(ax), \quad a > 0.$$
(a) Show that the Fourier series of $f$ converges uniformly to $f$.
(b) Compute the expansion of $f$ in a series of cosines.
(c) Conclude from this that
$$\sum_{n=1}^{\infty} \frac{1}{a^2 + n^2} = \frac{\pi}{2a}\Big[\coth(\pi a) - \frac{1}{\pi a}\Big], \quad a \in \mathbb{R}\setminus\{0\}.$$
(d) Justify the term-by-term differentiation of the series for $f$ and show that
$$\sinh(ax) = \frac{2\sinh(a\pi)}{\pi}\sum_{n=1}^{\infty} (-1)^{n+1}\frac{n}{n^2 + a^2}\,\sin nx, \quad x \in (-\pi, \pi).$$
Exercise 5.11
(a) Show that if $f \in C^2_p[0,a]$, then $|c_n(f)| \le K/n^2$.
(b) Show that $f \in C^\infty_p[0,a]$ implies $\lim_{|n| \to \infty} |n^k c_n(f)| = 0$ for all $k \in \mathbb{N}$.
Exercise 5.12 Take $f \in L^1_p(0,a)$ and let $f_k$ be a sequence in $L^1_p(0,a)$ such
that
$$\lim_{k \to \infty} \int_0^a |f(t) - f_k(t)|\,dt = 0.$$
Show that for fixed $n$, $\lim_{k \to \infty} c_n(f_k) = c_n(f)$.
Exercise 5.13 (Fourier series of a product) Suppose $f$ and $g$ are in
$L^2_p(0,a)$. We wish to compute the Fourier series of the product $fg$.
(a) Verify that $fg \in L^1_p(0,a)$.
(b) Write
$$f_N(t) = \sum_{n=-N}^{N} c_n(f)\, e^{2i\pi n t/a}, \qquad g_N(t) = \sum_{n=-N}^{N} c_n(g)\, e^{2i\pi n t/a}.$$
Show that
$$c_n(f_N g_N) = \sum_{k=-N}^{N} c_{n-k}(f)\, c_k(g).$$
(c) Prove that $f_N g_N$ tends to $fg$ in $L^1_p(0,a)$ and use Exercise 5.12 to show
that
$$c_n(fg) = \sum_{k=-\infty}^{+\infty} c_{n-k}(f)\, c_k(g), \quad n \in \mathbb{Z},$$
where the series on the right is absolutely convergent.
Lesson 6

Expanding a Function in an
Orthogonal Basis
6.1 Fourier series expansions on a bounded

interval
Let $(a,b)$ be a bounded interval and let $f$ be a complex function defined
on $(a,b)$. A priori, this function does not have a Fourier series expansion,
since this notion has been defined only for functions that are defined and
periodic on all of $\mathbb{R}$. However, $f$ can be extended periodically, with period
$b - a$, to all of $\mathbb{R}$ as in Figure 6.1.
FIGURE 6.1.
The extended function $\tilde{f}$ has a Fourier series expansion if $f$ is in $L^2(a,b)$,
and this is called the expansion of $f$ on the interval $(a,b)$. Here are two
other ways to obtain a Fourier expansion of $f$. For simplicity, we assume
that $f$ is defined on $(0,a)$.
6.1.1 Expansion of f in a series of sines

Define the function $\tilde{f}$ by $\tilde{f}(t) = f(t)$ for $t \in (0,a)$, $\tilde{f}(t) = -f(-t)$ for
$t \in (-a,0)$, and extend $\tilde{f}$ periodically to $\mathbb{R}$ with period $2a$. (See Figure 6.2.)
The Fourier series expansion of $\tilde{f}$ is then of the form (4.7) and contains
only sine terms.
FIGURE 6.2. Example where $f(t) = t/a$.
6.1.2 Expansion of f in a series of cosines

The construction of the cosine series is similar, except that here $\tilde{f}$ is the
even extension of $f$ defined by $\tilde{f}(t) = f(-t)$ for $t \in (-a,0)$ (see Figure 6.3).
This time the series expansion contains only cosines.
FIGURE 6.3. Example where $f(t) = t/a$.
If nothing else is said, the expansion of $f$ on the interval $(0,a)$ leads to
the expansion of the $a$-periodic function illustrated in Figure 6.4.
FIGURE 6.4.
AN IMPORTANT CONSEQUENCE: Although the Fourier series expansion of
a periodic function is unique (Section 4.3.3), the Fourier series expansion
of a function $f$ defined on a bounded interval is not unique. It depends
fundamentally on the way one extends $f$ periodically to the whole line.
6.2 Expansion of a function in an orthogonal

basis
6.2.1 General method
What has been done here and in Section 4.2 for periodic functions using
complex exponentials as the basis can be done for functions $f$ in $L^2(a,b)$
using other basis functions. The simplicity of the formulas and their
similarity to those of Section 4.2 are consequences of the fact that the basis
functions are orthogonal.
Assume that there is a family of functions $\{\Phi_n\}_{n \in \mathbb{N}}$ that are orthogonal
in the space $L^2(a,b)$, which we assume to be endowed with either the
usual scalar product or perhaps a scalar product weighted with a positive
function $w(t)$. Then it is always possible to project $f$ orthogonally onto the
subspace $V_N$ generated by $\{\Phi_0, \Phi_1, \dots, \Phi_N\}$ and thereby obtain the best
approximation $f_N$ of $f$ in this subspace. We find that
$$f_N = \sum_{n=0}^{N} c_n \Phi_n$$
with
$$c_n = \frac{(f, \Phi_n)}{\|\Phi_n\|_2^2}$$
and
$$\|f - f_N\|_2 = \min_{p \in V_N} \|f - p\|_2.$$
If, for all $f \in L^2(a,b)$, $f_N$ tends to $f$ in the norm of $L^2(a,b)$ as $N$ tends
to infinity, the family $\{\Phi_n\}_{n \in \mathbb{N}}$ is said to be a topological basis for $L^2(a,b)$.
One also says that this family is a complete system or a total family (see
Section 16.3). In this case, we have the Parseval relation
$$\|f\|_2^2 = \sum_{n=0}^{+\infty} |c_n|^2\, \|\Phi_n\|_2^2,$$
and we write, in the sense of the $L^2(a,b)$ norm,
$$f = \sum_{n=0}^{+\infty} c_n \Phi_n.$$
One can also use functions defined on an unbounded interval, for example
$(a,b) = \mathbb{R}$ or $(a,b) = \mathbb{R}_+$, where $\mathbb{R}_+ = \{x \in \mathbb{R} \mid x \ge 0\}$. A problem in
these cases is that the polynomials are no longer in $L^2(a,b)$. This can be
solved, however, by multiplying the polynomials by an appropriate weighting
function that tends to 0 sufficiently fast at infinity. One then obtains a
family of functions in $L^2(a,b)$ that can be used to expand $f \in L^2(a,b)$.
6.2.2 Examples
(a) LEGENDRE POLYNOMIALS
Take $(a,b) = (-1,1)$ and consider only real-valued functions. An orthogonal
basis is obtained by "orthogonalizing" the set $\{1, t, t^2, \dots\}$ of linearly
independent polynomials with respect to the inner product
$$(f,g) = \int_{-1}^{1} f(t)\,g(t)\,dt.$$
This process yields a family of orthogonal polynomials that are, up to a
factor depending only on $n$, the Legendre polynomials:
$$P_0(t) = 1, \quad P_1(t) = t, \quad P_2(t) = \tfrac{1}{2}(3t^2 - 1), \ \dots$$
In general,
$$P_n(t) = \frac{1}{n!\,2^n}\,\frac{d^n}{dt^n}(t^2 - 1)^n.$$
These polynomials are orthogonal, and
$$(P_n, P_n) = \frac{2}{2n+1}.$$
Thus, for $f \in L^2(-1,1)$,
$$f(t) = \sum_{n=0}^{+\infty} c_n P_n(t) \quad \text{with} \quad c_n = \frac{2n+1}{2}\,(f, P_n).$$
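The coefficient formula $c_n = \frac{2n+1}{2}(f, P_n)$ can be tried out numerically. The sketch below is illustrative only: it evaluates $P_n$ by the standard three-term recurrence and approximates the inner product by a midpoint rule; the function names and the test function $f(t) = |t|$ are choices made here.

```python
def legendre(n, t):
    """P_n(t) via the recurrence (k+1) P_{k+1} = (2k+1) t P_k - k P_{k-1}."""
    p_prev, p = 1.0, t
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * t * p - k * p_prev) / (k + 1)
    return p


def inner(f, g, cells=2000):
    """Midpoint-rule estimate of the inner product (f, g) on (-1, 1)."""
    h = 2.0 / cells
    total = 0.0
    for j in range(cells):
        t = -1.0 + (j + 0.5) * h
        total += f(t) * g(t)
    return total * h


def legendre_coefficient(f, n, cells=2000):
    """c_n = (2n+1)/2 * (f, P_n)."""
    return (2 * n + 1) / 2 * inner(f, lambda t: legendre(n, t), cells)


# Coefficients of f(t) = |t|; the odd ones vanish because f is even.
for n in range(4):
    print(n, legendre_coefficient(abs, n))
```

For $f(t) = |t|$ a direct integration gives $c_0 = 1/2$ and $c_2 = 5/8$, so the best quadratic approximation in the $L^2(-1,1)$ norm is $\frac{1}{2} + \frac{5}{8}P_2(t)$.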
(b) CHEBYSHEV POLYNOMIALS

Suppose that $f \in L^2(-1,1)$ and associate with $f$ the function
$$F(x) = f(\cos x),$$
which is even and $2\pi$-periodic. The scalar product for $L^2(0,\pi)$ is
$$(F,G) = \int_0^{\pi} F(x)\,G(x)\,dx = \int_0^{\pi} f(\cos x)\,g(\cos x)\,dx;$$
by letting $t = \cos x$, this becomes
$$(F,G) = \int_{-1}^{1} \frac{f(t)\,g(t)}{\sqrt{1 - t^2}}\,dt.$$
This formula defines the new scalar product on $L^2(-1,1)$
$$(f,g)_w = \int_{-1}^{1} \frac{f(t)\,g(t)}{\sqrt{1 - t^2}}\,dt$$
involving the weight
$$w(t) = \frac{1}{\sqrt{1 - t^2}}.$$
The functions $F_n(x) = \cos nx$ form a basis for the subspace of even functions
in $L^2(-\pi,\pi)$. The corresponding functions
$$T_n(t) = \cos(n \arccos t)$$
form an orthogonal basis in $L^2(-1,1)$ with respect to the scalar product
$(\cdot,\cdot)_w$. These polynomials $T_n$ are called the Chebyshev polynomials, and
they play an important role in the theory of approximation (see Section
10.4).
(c) HERMITE POLYNOMIALS IN $L^2(\mathbb{R})$
One uses the scalar product
$$(f,g)_H = \int_{-\infty}^{+\infty} f(t)\,g(t)\,e^{-t^2}\,dt,$$
which leads to an orthogonal family $\{\phi_0, \phi_1, \dots\}$ of functions of the form
$$\phi_n(t) = H_n(t)\,e^{-t^2/2},$$
where $H_n$ is a polynomial of degree $n$ called a Hermite polynomial. The
Hermite polynomials are orthogonal with respect to the scalar product
$(\cdot,\cdot)_H$. One can show that the family $\{\phi_n\}$ forms a topological basis for the
Hilbert space $L^2(\mathbb{R})$ [KF74].
(d) LAGUERRE POLYNOMIALS IN $L^2(0,+\infty)$
The basis $\{1, t, t^2, \dots\}$ is orthogonalized with respect to the scalar product
$$(f,g)_L = \int_0^{+\infty} f(t)\,g(t)\,e^{-t}\,dt.$$
One obtains a topological basis for the space $L^2(0,+\infty)$ [KF74]. This basis
consists of the functions $L_n e^{-t/2}$, where the $L_n$ are the Laguerre polynomials,
which are orthogonal with respect to the scalar product $(\cdot,\cdot)_L$.
6.2.3 Comments and references

Many facts about orthogonal polynomials and about their use can be found
in books on numerical analysis or the theory of approximation. See, for
example, [Lau72] and [Sze59].
The rather sophisticated proofs that these families form topological bases
can be found in the book by Kolmogorov and Fomin [KF74]. These examples
serve to justify, among other things, the theoretical study of Hilbert
spaces and their usual topological bases. This will be done in Lesson 16.
6.3 Exercises
Exercise 6.1
(a) Expand the functions in Figures 6.2-6.4 in Fourier series.
(b) Determine the rates at which their Fourier coefficients Cn converge to 0.
(c) Write Parseval's equality for these three cases.
(d) Express the pointwise convergence in the three cases for $t = a/2$ and $t = a$.
In which cases do (c) and (d) produce interesting identities?
Exercise 6.2
(a) Use the Legendre polynomials $P_0$, $P_1$, $P_2$, $P_3$ to compute the best
approximations $f_i$, $i = 0, 1, 2, 3$, to $f(t) = |t|$ on $[-1,+1]$ in the sense of the usual
$L^2(-1,1)$ norm.
(b) Represent $f$, $f_1$, $f_2$ on the same graph.
Exercise 6.3 Compute the Hermite polynomials $H_0$, $H_1$, and $H_2$.
Exercise 6.4 Let $f$ be defined on $[0,1]$ by $f(x) = x(1-x)$.

(a) We wish to consider the expansion of f in a series of sines. Sketch the graph
of the periodic (period 2) extension g of f. Is the sine series expansion of
g uniformly convergent? Can it be differentiated term by term?
(b) Compute the expansion of g.
(c) Deduce from (b) that
(d) Compute $\int_0^1 f(x)\,dx$ and deduce that
(e) Compute the expansion in cosines of
$$f'(x) = 1 - 2x, \quad x \in [0,1],$$
and the expansion of
$$f''(x) = -2, \quad x \in (0,1).$$
(f) Deduce from (e) that
$$\sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = \frac{\pi^2}{8} \quad \text{and} \quad \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1} = \frac{\pi}{4}.$$
(g) Expand f in a series of cosines and address the same questions as above.
Lesson 7
Frequencies, Spectra, and Scales
7.1 Frequencies and spectra

7.1.1 The notion of the spectrum of a periodic signal
If $f$ is a periodic signal with period $a$, and if it has the Fourier series
expansion
$$f(t) = \sum_{n=-\infty}^{+\infty} c_n e^{2i\pi n t/a}, \tag{7.1}$$
then the spectrum of $f$ is defined to be the set of pairs $(n/a, c_n)$, $n \in \mathbb{Z}$.
7.1.2 Amplitude spectrum and phase spectrum

In Section 1.2.3, we mentioned the frequency representation of a pure
sinusoidal signal in terms of amplitude and phase. By superposition, we
have a representation of the amplitude and phase spectra as a function of
frequency for all periodic signals by representing the various values with
arrows parallel to the $y$-axis.
FIGURE 7.1. Amplitude spectrum.
FIGURE 7.2. Phase spectrum (of a real signal).
The amplitude spectrum (Figure 7.1) consists of spectral lines regularly
spaced at the frequencies $n/a$. For $|n| = 1$, the two lines correspond to the
fundamental frequency. The other lines are called harmonics of the signal.
The phase spectrum (Figure 7.2) is the set of pairs $(n/a, \theta_n)$, $n \in \mathbb{Z}$, where
$$c_n = |c_n|\,e^{i\theta_n}, \quad \theta_n \in [-\pi, \pi).$$
7.1.3 Action of a filter on a periodic signal

We saw in Section 2.3 that the action of a filter on a sinusoidal signal with
frequency $\lambda$ was simply to multiply the signal by $H(\lambda)$. For a periodic
signal $f$ given by (7.1) the output from the filter with transfer function $H$
will be
$$g(t) = (Af)(t) = \sum_{n=-\infty}^{+\infty} c_n H\Big(\frac{n}{a}\Big)\, e^{2i\pi n t/a}.$$
The positions of the spectral lines are not changed; only their relative values
change as they are multiplied by $H(n/a)$. Thus, this process is properly
called frequency filtering.
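In terms of data, frequency filtering is a pointwise multiplication of the spectrum. The following sketch represents a finite spectrum as a dictionary $\{n: c_n\}$ and applies a transfer function to it. The transfer function `lowpass` (a first-order low-pass with a hypothetical time constant `rc`) and all names are assumptions made for illustration; they are not taken from the text.

```python
import cmath
import math


def filter_coefficients(coeffs, H, a):
    """Multiply each c_n by H(n/a); the frequencies n/a themselves are unchanged."""
    return {n: c * H(n / a) for n, c in coeffs.items()}


def synthesize(coeffs, a, t):
    """Evaluate the finite sum of c_n * exp(2*i*pi*n*t/a)."""
    return sum(c * cmath.exp(2j * math.pi * n * t / a) for n, c in coeffs.items())


def lowpass(lam, rc=0.1):
    """Hypothetical transfer function H(lam) = 1 / (1 + 2*i*pi*lam*rc)."""
    return 1 / (1 + 2j * math.pi * lam * rc)


a = 1.0
signal = {-1: 0.5, 1: 0.5}            # spectrum of cos(2*pi*t/a)
output = filter_coefficients(signal, lowpass, a)
print(output)
print(synthesize(output, a, 0.3))     # real up to rounding, since H(-lam) = conj(H(lam))
```

Because this particular $H$ satisfies $H(-\lambda) = \overline{H(\lambda)}$, a real input signal gives a real output; the amplitude of each line is scaled by $|H(n/a)|$ and its phase is shifted by $\arg H(n/a)$.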
7.1.4 Orders of magnitude

Vibrating or oscillating phenomena are widespread in nature. In fact, some
would claim that everything reduces to vibrations! Listed below are some
common oscillatory phenomena encountered in the physical sciences along
with their frequencies (1 hertz (Hz) = 1 cycle per second):
Household current: 60 Hz
Quartz in a watch: $10^5$ Hz
Radar wave: $10^{10}$ Hz
Vibration of a caesium atom: $10^{14}$ Hz
Electromagnetic waves:
very long: (telegraph) $1.5 \times 10^4$ to $6 \times 10^4$ Hz
long: (radio) $6 \times 10^4$ to $3 \times 10^5$ Hz
medium: (radio) $3 \times 10^5$ to $3 \times 10^6$ Hz
short: (radio) $3 \times 10^6$ to $3 \times 10^7$ Hz
meters: (television) $3 \times 10^7$ to $3 \times 10^8$ Hz
centimeters: (radar) $3 \times 10^8$ to $10^{11}$ Hz
visible light: $3.7 \times 10^{14}$ to $7.5 \times 10^{14}$ Hz
The human ear can, in the best cases, detect sounds whose frequencies
range from 20 to 20,000 Hz.
7.2 Variations on the scale

Sound is measured, as a function of time, as variations in air pressure either
by the ear or other sensors. A pure tone with a fixed frequency f (or a note)
is sensed by the ear as a periodic variation in air pressure where the maxima
occur every 1/ f seconds.
7.2.1 The octave

An octave is the interval between two notes, one with frequency f and
the other with frequency 2f. This definition may seem arbitrary, but it is
clearly based on a spectral decomposition: An octave is the interval that
separates the fundamental frequency and its first harmonic.
When one hears middle C, say, one also hears, hidden immediately be-
hind, its first harmonic, which is the C in the next (higher) octave. These
two notes are closely related, and one has the impression that they are from
the same family. And indeed, we give them the same name, C. A note on
the scale is thus determined modulo multiplication by a power of 2; it is
the exponent of 2 that determines its octave.
EXAMPLE: Standard pitch assigns 440 Hz to $A_3$. This standard leads to
the following A-ladder (in hertz):
27.5     55     110    220    440    880    ...
$A_{-1}$   $A_0$   $A_1$   $A_2$   $A_3$   $A_4$   ...
7.2.2 The harmonic scale

When we hear a C with frequency $f$, we also hear the harmonics $2f$, $3f$,
etc., and it is in this progression that we encounter G and E:
f        2f       3f       4f       5f       6f
C        C        G        C        E        G
Thus the common chord (C, G, E) is found within the C-scale. Bringing
these notes back to the same octave, we have the chord (C, E, G, C):
f        (5/4)f   (3/2)f   2f
C        E        G        C
If we similarly analyze a G, we find the chord (G, D, B):
(3/2)f   3f       (9/2)f   6f       (15/2)f
G        G        D        G        B
Brought back to the octave $(f, 2f)$, we obtain the following:
f        (9/8)f   (5/4)f   (3/2)f   (15/8)f  2f
C        D        E        G        B        C
Starting with E leads to no new note, at least in the first three harmonics:
(5/4)f   (5/2)f   (15/4)f  5f
E        E        B        E
The frequencies of all of these notes are simple fractions of the fundamental
frequency $f$. Furthermore, all of the denominators are powers of 2. The first
simple fraction with denominator 3 leads to the discovery of the F with
frequency $\frac{4}{3}f$, from which the chord (F, C, A) is constructed:
(4/3)f   (8/3)f   4f       (16/3)f  (20/3)f
F        F        C        F        A
We now have all 7 notes of the harmonic scale:
f        (9/8)f   (5/4)f   (4/3)f   (3/2)f   (5/3)f   (15/8)f  2f
C        D        E        F        G        A        B        C
This scale is also called the physicists' scale.
7.2.3 Tones and semitones

The sound of two tones whose frequencies have a fixed ratio is perceived
by the ear as a fixed interval between two notes. Thus fixed intervals are
expressed by frequencies in geometric progression.
EXAMPLE: For the interval called a fifth, the ratio is 3 to 2:
f        (3/2)f   (9/4)f
C        G        D
If one wishes to relate these ratios between frequencies to the length
of the intervals, the thing to do is to take the logarithms (for example
to base 2) of these ratios. This shows that there are intervals with three
different lengths: the major interval, the minor interval, and the semitone
or half-step. The following table illustrates this idea:
(C,D)   log 9/8     0.170   Major interval
(D,E)   log 10/9    0.152   Minor interval
(E,F)   log 16/15   0.093   Half-step
(F,G)   log 9/8     0.170   Major interval
(G,A)   log 10/9    0.152   Minor interval
(A,B)   log 9/8     0.170   Major interval
(B,C)   log 16/15   0.093   Half-step
The difference between a major interval and a minor interval is called the
comma. It is the minimum interval perceived by the ear.
1 comma = log 81/80 = 0.018.
This is roughly a ninth of the major second: 0.170/9 = 0.0188. Although
a very approximate definition, it is generally taught in courses on music
theory.
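The interval table can be reproduced from the frequency ratios of the harmonic scale. The sketch below uses exact fractions for the ratios and base-2 logarithms for the interval lengths; the variable and function names are illustrative.

```python
import math
from fractions import Fraction

# Frequency ratios of the harmonic scale relative to the fundamental C.
HARMONIC_SCALE = [
    ("C", Fraction(1)), ("D", Fraction(9, 8)), ("E", Fraction(5, 4)),
    ("F", Fraction(4, 3)), ("G", Fraction(3, 2)), ("A", Fraction(5, 3)),
    ("B", Fraction(15, 8)), ("C'", Fraction(2)),
]


def interval_table(scale):
    """Return (lower note, upper note, frequency ratio, log2 of the ratio) for consecutive notes."""
    rows = []
    for (name1, r1), (name2, r2) in zip(scale, scale[1:]):
        ratio = r2 / r1
        rows.append((name1, name2, ratio, math.log2(float(ratio))))
    return rows


# The comma: difference between a major interval (9/8) and a minor one (10/9).
COMMA = math.log2(float(Fraction(9, 8) / Fraction(10, 9)))

for lower, upper, ratio, length in interval_table(HARMONIC_SCALE):
    print(f"({lower},{upper})  {ratio}  {length:.3f}")
print("comma", round(COMMA, 3))
```

With base-2 logarithms the seven interval lengths add up to exactly one octave, that is, to 1.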
7.2.4 The tempered scale

The two kinds of intervals, major and minor, have the disadvantage that
they do not partition the octave into 12 equal half-steps, nor do they have
one fixed value for sharps and flats. This can be remedied directly by
dividing the octave into 12 equal intervals. This process defines the tempered
scale:
f      af     a^2 f   ...   a^11 f   a^12 f = 2f
C                                    C
where
$$a = 2^{1/12}.$$
The semitone, or half-step, is thus defined by the interval
$$(f,\ 2^{1/12} f).$$
It is easier to see the nuances between the harmonic scale (H.S.) and the
tempered scale (T.S.) by looking at the decimal values:
        C     D        E        F        G        A        B        C
H.S.    f     1.125f   1.250f   1.333f   1.5f     1.667f   1.875f   2f
T.S.    f     1.122f   1.260f   1.335f   1.498f   1.682f   1.888f   2f
The tempered scale has no need for the comma. It is this scale that is,
in principle, used for the piano.
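The decimal values of the tempered scale follow from the number of half-steps separating each note from C. The sketch below assumes the usual layout of the major scale (2, 2, 1, 2, 2, 2, 1 half-steps), which the text does not state explicitly; the names are illustrative.

```python
# Half-steps above C for each note of the scale (assumed major-scale layout).
SEMITONES_FROM_C = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11, "C'": 12}

# Harmonic-scale ratios, for comparison.
HARMONIC_RATIOS = {"C": 1, "D": 9 / 8, "E": 5 / 4, "F": 4 / 3,
                   "G": 3 / 2, "A": 5 / 3, "B": 15 / 8, "C'": 2}


def tempered_ratio(semitones):
    """Frequency ratio of a note lying `semitones` half-steps above the fundamental."""
    return 2 ** (semitones / 12)


for note, steps in SEMITONES_FROM_C.items():
    print(note, round(HARMONIC_RATIOS[note], 3), round(tempered_ratio(steps), 3))
```

The fifth illustrates how close the two scales are: the harmonic value is 3/2, while the tempered value $2^{7/12}$ differs from it by roughly one part in a thousand.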
7.3 Exercises
Exercise 7.1 The Pythagorean scale is built on the "fifth," which is defined
by two vibrating strings whose lengths are in the ratio 3 to 2:
F        C        G
(2/3)f   f        (3/2)f
(a) Bring these frequencies back to the same octave and compare this scale
with those described in Sections 7.2.2 and 7.2.4.
(b) Note that this scale has only one major tone T and a very "tight" half-tone
m. Describe the succession of tones and half-tones.
Exercise 7.2 Let f be a real periodic signal. Show that its amplitude spec-
trum is even. Investigate the properties of its phase spectrum.
Chapter III
The Discrete Fourier Transform and Numerical Computations
Lesson 8
The Discrete Fourier Transform
8.1 Computing the Fourier coefficients

We will work with the following assumptions: We know the period $a$ of the
function $f$ as well as $N$ of its values that are regularly spaced over one
period:
$$f\Big(k\frac{a}{N}\Big) = y_k, \quad k = 0, 1, 2, \dots, N-1.$$
The signal $f(t)$ is thus assumed to have been sampled at regularly spaced
times separated by $a/N$ units. Using this information, we wish to approximate
the Fourier coefficients of $f$. We also assume that the Fourier series
of $f$ converges pointwise to $f$ and that at points of discontinuity
$$f(t) = \frac{1}{2}\big(f(t+) + f(t-)\big).$$
Given $N$ data points, it is logical to try to compute $N$ Fourier coefficients
$c_n$. Since these coefficients tend to zero as $|n|$ tends to infinity, we choose
to compute $c_n$ for $n = -N/2, \dots, N/2 - 1$ (or a centered interval if $N$ is
odd). These considerations lead one to compute an approximation of the
integral
$$c_n = \frac{1}{a}\int_0^a f(t)\, e^{-2i\pi n t/a}\,dt. \tag{8.1}$$
FIRST METHOD: Integrating (8.1) by the trapezoid formula gives the approximate
value
$$c'_n = \frac{1}{N}\sum_{k=0}^{N-1} y_k\, e^{-2i\pi n k/N},$$
or
$$c'_n = \frac{1}{N}\sum_{k=0}^{N-1} y_k\, \omega_N^{-nk} \quad\text{with}\quad \omega_N = e^{2i\pi/N}. \tag{8.2}$$
SECOND METHOD: One can also compute the Fourier coefficients, denoted
by $c_n^N$, of the trigonometric polynomial
$$p(t) = \sum_{n=-N/2}^{N/2-1} c_n^N\, e^{2i\pi n t/a} \tag{8.3}$$
that interpolates f at the points k(a/N), k = 0, 1, 2, ..., N - 1. Note that
with the notation $c_n^N$, N is fixed; it is not a running index like n. $c_n^N$, like
$c'_n$, is destined to be an approximation of $c_n$. One is thus led to solve the
linear system of order N,
$$\sum_{n=-N/2}^{N/2-1} c_n^N\, \omega_N^{nk} = y_k, \qquad k = 0, 1, 2, \dots, N-1. \tag{8.4}$$
For convenience, we bring all of the indices n into the interval [0, N - 1] by
translating the negative indices to the right by N. This is possible because
the functions involved are N-periodic. Thus,
$$\sum_{n=-N/2}^{-1} c_n^N\, \omega_N^{nk} = \sum_{n=N/2}^{N-1} c_{n-N}^N\, \omega_N^{nk}.$$
By defining
$$Y_n = \begin{cases} c_n^N & \text{if } 0 \le n \le \dfrac{N}{2} - 1,\\[4pt] c_{n-N}^N & \text{if } \dfrac{N}{2} \le n \le N - 1,\end{cases}$$
the system (8.4) can be written as
$$\sum_{n=0}^{N-1} Y_n\, \omega_N^{nk} = y_k, \qquad k = 0, 1, 2, \dots, N-1.$$
This system has the advantage that it can be solved explicitly. Let p be
an integer between 0 and N - 1 and compute the sum
$$\sum_{k=0}^{N-1} y_k\, \omega_N^{-kp} = \sum_{k=0}^{N-1}\sum_{n=0}^{N-1} Y_n\, \omega_N^{k(n-p)} = \sum_{n=0}^{N-1} Y_n \sum_{k=0}^{N-1} \omega_N^{k(n-p)}.$$
The last sum is a geometric series:
$$\sum_{k=0}^{N-1} \omega_N^{k(n-p)} = \begin{cases} 0 & \text{if } p \ne n,\\ N & \text{if } p = n.\end{cases}$$
This shows that
$$\sum_{k=0}^{N-1} y_k\, \omega_N^{-kp} = N\, Y_p,$$
and hence the unknowns $Y_n$ are given by
$$Y_n = \frac{1}{N}\sum_{k=0}^{N-1} y_k\, \omega_N^{-nk}, \qquad n = 0, 1, 2, \dots, N-1.$$
We see that this is the same as formula (8.2)! After a change of indices, we
discover that
$$c_n^N = c'_n, \qquad n = -\frac{N}{2}, \dots, \frac{N}{2} - 1.$$
8.1.1 Conclusions
Integrating (8.1) using the trapezoid method yields N approximate Fourier
coefficients $c_n^N$ that are equal to the Fourier coefficients of the trigonometric
polynomial (8.3) that interpolates f at the points $t_k = k(a/N)$. We have
the equivalent formulas
$$\begin{aligned}
y_k &= \sum_{n=0}^{N-1} Y_n\, \omega_N^{nk}, & k &= 0, 1, 2, \dots, N-1,\\
Y_n &= \frac{1}{N}\sum_{k=0}^{N-1} y_k\, \omega_N^{-nk}, & n &= 0, 1, 2, \dots, N-1,
\end{aligned} \tag{8.5}$$
and the approximate Fourier coefficients are
$$c_n^N = \begin{cases} Y_n & \text{if } 0 \le n < \dfrac{N}{2},\\[4pt] Y_{n+N} & \text{if } -\dfrac{N}{2} \le n < 0.\end{cases}$$
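The two formulas in (8.5) translate directly into code. The following is a minimal sketch in plain Python, assuming the normalization used here (the factor 1/N sits on the forward transform); the function names `dft` and `idft` are ours, and the cost is the naive N² operations.

```python
import cmath

def dft(y):
    """Second formula of (8.5): Y_n = (1/N) * sum_k y_k * w^(-n k), w = exp(2*i*pi/N)."""
    N = len(y)
    return [sum(y[k] * cmath.exp(-2j * cmath.pi * n * k / N) for k in range(N)) / N
            for n in range(N)]

def idft(Y):
    """First formula of (8.5): y_k = sum_n Y_n * w^(n k)."""
    N = len(Y)
    return [sum(Y[n] * cmath.exp(2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

y = [1.0, 2.0, 0.0, -1.0]   # arbitrary sample values y_0..y_3
Y = dft(y)
back = idft(Y)              # should reproduce y up to rounding
```

Note that Y[0] is the mean of the samples, as expected for an approximation of the coefficient c_0.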
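8.1.2 Definition The second formula in (8.5) defines a transformation
$\mathcal{F}_N$ from $\mathbb{C}^N$ into itself
that is called the discrete Fourier transform of order N.
$\mathcal{F}_N$ is linear and bijective; its inverse $\mathcal{F}_N^{-1}$ is given by the first
formula in (8.5).
If $\Omega_N$ is the matrix representation of $\mathcal{F}_N^{-1}$, then
$$\Omega_N = \begin{pmatrix}
1 & 1 & \cdots & 1\\
1 & \omega_N & \cdots & \omega_N^{N-1}\\
1 & \omega_N^{2} & \cdots & \omega_N^{2(N-1)}\\
\vdots & \vdots & & \vdots\\
1 & \omega_N^{N-1} & \cdots & \omega_N^{(N-1)^2}
\end{pmatrix}$$
and
$$\Omega_N^{-1} = \frac{1}{N}\,\overline{\Omega}_N.$$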
8.1.3 Remarks
(a) Be careful to note that the computation of the $Y_n$ yields the Fourier
coefficients $c_n^N$ in the following order (N = 8):
$$\begin{array}{cccccccc}
Y_0 & Y_1 & Y_2 & Y_3 & Y_4 & Y_5 & Y_6 & Y_7\\
c_0^8 & c_1^8 & c_2^8 & c_3^8 & c_{-4}^8 & c_{-3}^8 & c_{-2}^8 & c_{-1}^8
\end{array}$$
(b) It is convenient, particularly for manipulating the formulas, to extend
the vector $y \in \mathbb{C}^N$ to a periodic sequence with period N, which just comes
back to its original definition, since $y_k = f(ka/N)$, where f has period a.
This allows us to write $y_k$ with $k \in \mathbb{Z}$. We will use this convention in the
rest of the book. Formula (8.5) shows that the $Y_n$ are also periodic with
period N. We will use the same convention and define $Y_n$ for $n \in \mathbb{Z}$.
These conventions mean that the approximate Fourier coefficients $c_n^N$
also form a periodic sequence. Here it is important to remember that $c_n^N$ is
an approximation (see Figure 8.1) of $c_n$ only for
$$-\frac{N}{2} \le n < \frac{N}{2}.$$
(There is no reason to believe otherwise, but an easy way to remember this
is to keep in mind that $c_n \to 0$ as $|n| \to +\infty$.)
With these conventions, $Y_n = c_n^N$ for all $n \in \mathbb{Z}$ (although the $c_n^N$ are
still out of order), and the sums in (8.5) can be taken over any set of N
consecutive integers.
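The reordering described in Remark (a) is a frequent source of indexing mistakes, so a small sketch may help. It assumes N is even and that `Y` holds Y_0, ..., Y_{N-1} in natural order; the helper name `centered_coefficients` is hypothetical.

```python
def centered_coefficients(Y):
    """Return {n: c_n^N} for -N/2 <= n < N/2, using the N-periodicity of Y."""
    N = len(Y)
    # Python's % always returns a value in 0..N-1, so Y[n % N] is Y_{n+N} for n < 0.
    return {n: Y[n % N] for n in range(-N // 2, N // 2)}

# Use the index itself as a stand-in value to display the ordering for N = 8.
order = centered_coefficients(list(range(8)))
```

Here `order[-4]` is 4 and `order[-1]` is 7, which matches the ordering displayed in Remark (a).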
8.2 Some properties of the discrete Fourier transform
All the sequences in this section will be complex and periodic with the
same period N. The sequence $(y_n)$ is said to be even if $y_{-n} = y_n$ for all
$n \in \mathbb{Z}$. It is said to be odd if $y_{-n} = -y_n$ for all $n \in \mathbb{Z}$.
FIGURE 8.1. Fourier coefficients: exact and computed.
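We note without further comment that the properties described in this
section are also true for the inverse transform $\mathcal{F}_N^{-1}$.
8.2.1 Proposition If $(y_k) \xrightarrow{\mathcal{F}_N} (Y_n)$, then
(i) $(y_{-k}) \xrightarrow{\mathcal{F}_N} (Y_{-n})$;
(ii) $(\overline{y}_k) \xrightarrow{\mathcal{F}_N} (\overline{Y}_{-n})$;
(iii) $(\overline{y}_{-k}) \xrightarrow{\mathcal{F}_N} (\overline{Y}_n)$.
Proof. For example, for (i), write $\mathcal{F}_N(y_{-k}) = (Y'_n)$. Then
$$Y'_n = \frac{1}{N}\sum_{k=0}^{N-1} y_{-k}\, \omega_N^{-nk} = \frac{1}{N}\sum_{k=-N+1}^{0} y_k\, \omega_N^{nk} = Y_{-n}. \qquad \square$$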
8.2.2 Proposition If $(y_k) \xrightarrow{\mathcal{F}_N} (Y_n)$, the following relations hold:
(i) $(y_k)$ is even (odd) $\Leftrightarrow$ $(Y_n)$ is even (odd);
(ii) $(y_k)$ is real $\Leftrightarrow$ $Y_{-n} = \overline{Y}_n$ for all $n \in \mathbb{Z}$;
(iii) $(y_k)$ is real and even $\Leftrightarrow$ $(Y_n)$ is real and even;
(iv) $(y_k)$ is real and odd $\Leftrightarrow$ $(Y_n)$ is imaginary and odd.
Proof. These follow directly from Proposition 8.2.1. $\square$
We now have a result that explains the important relation between the
discrete Fourier transform and the periodic discrete convolution.
8.2.3 Theorem Let $(x_k)$ and $(y_k)$ be two complex sequences with period N
and let $(X_n)$ and $(Y_n)$ denote their discrete Fourier transforms.
(i) The sequence defined by circular convolution,
$$z_k = \sum_{q=0}^{N-1} x_q\, y_{k-q}, \qquad k \in \mathbb{Z},$$
has as its transform
$$Z_n = N\, X_n Y_n, \qquad n \in \mathbb{Z}.$$
(ii) The transform of the pointwise product $(x_k y_k)$ of the sequences $(x_k)$ and
$(y_k)$ is $(P_n)$,
where
$$P_n = \sum_{q=0}^{N-1} X_q\, Y_{n-q}.$$
Proof. By definition,
$$Z_n = \frac{1}{N}\sum_{k=0}^{N-1}\sum_{q=0}^{N-1} x_q\, y_{k-q}\, \omega_N^{-nk},$$
and interchanging the order of summation shows that
$$Z_n = \frac{1}{N}\sum_{q=0}^{N-1} x_q\, \omega_N^{-nq} \sum_{k=0}^{N-1} y_{k-q}\, \omega_N^{-n(k-q)} = N\, X_n Y_n.$$
If we do the "same" computation for the inverse transform applied to the
vector $(P_n)$, we see that
$$\mathcal{F}_N^{-1}(P_n) = (x_k y_k),$$
which proves (ii). $\square$

We will see later that a similar property holds in the continuous case.
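Both parts of Theorem 8.2.3 are easy to check numerically. The sketch below uses the naive transform with the normalization of (8.5); the sample vectors are arbitrary and the helper names are ours, not the book's.

```python
import cmath

def dft(y):
    """Y_n = (1/N) * sum_k y_k * w^(-n k), as in (8.5)."""
    N = len(y)
    return [sum(y[k] * cmath.exp(-2j * cmath.pi * n * k / N) for k in range(N)) / N
            for n in range(N)]

def circular_convolution(x, y):
    """z_k = sum_q x_q * y_{k-q}, indices taken modulo N."""
    N = len(x)
    return [sum(x[q] * y[(k - q) % N] for q in range(N)) for k in range(N)]

x = [1, 2, 3, 4]
y = [0, 1, 0.5, -1]
N = len(x)
X, Y = dft(x), dft(y)

# (i) transform of the circular convolution versus N * X_n * Y_n
Z = dft(circular_convolution(x, y))
err_i = max(abs(Z[n] - N * X[n] * Y[n]) for n in range(N))

# (ii) transform of the pointwise product versus the convolution of X and Y
P = circular_convolution(X, Y)
Q = dft([x[k] * y[k] for k in range(N)])
err_ii = max(abs(P[n] - Q[n]) for n in range(N))
```

The factor N in (i) and its absence in (ii) both come from placing 1/N on the forward transform; other normalizations move the factor around.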
8.2.4 Proposition If $(y_k) \xrightarrow{\mathcal{F}_N} (Y_n)$, then
$$\sum_{k=0}^{N-1} |y_k|^2 = N \sum_{n=0}^{N-1} |Y_n|^2.$$
The proof is left as an exercise.
This result means that the discrete Fourier transform is, up to a factor N,
an isometry of the Euclidean space $\mathbb{C}^N$ into itself. Furthermore, a quadratic
error of $\varepsilon$ in the data $(y_k)$ appears as a quadratic error $\varepsilon/N$ in the result.
This means that the computation is stable.
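As a quick sanity check of Proposition 8.2.4, one can compare the two sums for an arbitrary complex vector. This is only a numerical illustration under the normalization of (8.5), not a proof; the data are made up.

```python
import cmath

def dft(y):
    """Y_n = (1/N) * sum_k y_k * w^(-n k), as in (8.5)."""
    N = len(y)
    return [sum(y[k] * cmath.exp(-2j * cmath.pi * n * k / N) for k in range(N)) / N
            for n in range(N)]

y = [1 + 2j, -0.5, 3j, 2.0, 0.25 - 1j]          # arbitrary data, N = 5
Y = dft(y)

lhs = sum(abs(v) ** 2 for v in y)               # sum |y_k|^2
rhs = len(y) * sum(abs(V) ** 2 for V in Y)      # N * sum |Y_n|^2
```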
8.3 The Fourier transform of real data

The discrete Fourier transform applies to complex-valued vectors, hence to
the case where $(y_k)$ is real-valued. However, in this case it is possible to
reduce the cost of computation by half by treating two sets of real data
with a single complex transformation. We wish to compute the transforms
of two real vectors $(x_k)$ and $(y_k)$:
$$(x_k) \xrightarrow{\mathcal{F}_N} (X_n), \qquad (y_k) \xrightarrow{\mathcal{F}_N} (Y_n).$$
We know from Proposition 8.2.2 that
$$X_{-n} = \overline{X}_n, \qquad Y_{-n} = \overline{Y}_n.$$
Let $z_k = x_k + i y_k$ and denote the transform of $(z_k)$ by $(Z_n)$:
$$(z_k) \xrightarrow{\mathcal{F}_N} (Z_n).$$
By linearity,
$$Z_n = X_n + i Y_n.$$
But note that $X_n$ and $Y_n$ are not necessarily real! With this notation,
$$X_n = \frac{1}{2}\bigl(Z_n + \overline{Z}_{N-n}\bigr), \qquad Y_n = \frac{1}{2i}\bigl(Z_n - \overline{Z}_{N-n}\bigr).$$
It is only necessary to compute these values for n = 0, 1, ..., N/2, since the
values for n between N/2 and N - 1 will have already appeared as their
conjugates. Thus it is sufficient to compute the transform of $(z_k)$ to obtain
the transforms of $(x_k)$ and $(y_k)$.
It is also possible to compute the Fourier transform of a single real vector
with N components by using a transform of length N/2 (see [CLW70]).
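The trick of packing two real vectors into one complex transform can be sketched as follows. The naive `dft` stands in for any transform with the normalization of (8.5); the function name `dft_of_two_real_vectors` is hypothetical.

```python
import cmath

def dft(y):
    """Y_n = (1/N) * sum_k y_k * w^(-n k), as in (8.5)."""
    N = len(y)
    return [sum(y[k] * cmath.exp(-2j * cmath.pi * n * k / N) for k in range(N)) / N
            for n in range(N)]

def dft_of_two_real_vectors(x, y):
    """Transforms of real x and y from a single transform of z = x + i*y."""
    N = len(x)
    Z = dft([complex(a, b) for a, b in zip(x, y)])
    # Z_{N-n} is read modulo N so that n = 0 pairs with itself.
    X = [(Z[n] + Z[(N - n) % N].conjugate()) / 2 for n in range(N)]
    Y = [(Z[n] - Z[(N - n) % N].conjugate()) / 2j for n in range(N)]
    return X, Y

x = [1.0, 0.0, 2.0, -1.0]
y = [3.0, 1.0, 0.0, 5.0]
X, Y = dft_of_two_real_vectors(x, y)
```

For clarity the sketch fills all N entries; as noted above, only n = 0, ..., N/2 are needed in practice.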
8.4 A relation between the exact and approximate Fourier coefficients
Assume for simplicity that the periodic function f is expressed by
$$f(t) = \sum_{n=-\infty}^{+\infty} c_n\, e^{2i\pi n t/a} \tag{8.6}$$
and that the series is absolutely convergent:
$$\sum_{n=-\infty}^{+\infty} |c_n| < +\infty.$$
It is sufficient, for example, that f satisfy the hypotheses of Theorem 5.3.1.
Since (8.6) is absolutely convergent, we can, for each t, rearrange the order
of summation. In particular, we can first sum all of the terms whose indices
have a fixed residue modulo N and then sum these N terms. Thus, by taking
t = k(a/N),
$$y_k = \sum_{m=-\infty}^{+\infty} c_m\, \omega_N^{mk} = \sum_{n=0}^{N-1}\Bigl(\sum_{q=-\infty}^{+\infty} c_{n+qN}\Bigr)\omega_N^{nk},$$
where we have written m = n + qN with $n \in \{0, 1, \dots, N-1\}$ and $q \in \mathbb{Z}$.
We deduce from this, using the inverse transform, that
$$c_n^N = \sum_{q=-\infty}^{+\infty} c_{n+qN}. \tag{8.7}$$
A surprising relation! It expresses the approximate coefficients in terms of
the exact coefficients, and we obtain an expression for the approximation
error:
$$c_n^N - c_n = \sum_{q \ne 0} c_{n+qN}.$$
From this we see that for a fixed N, the faster $c_n$ tends to zero as $|n|$ tends
to infinity, the better will be the approximation
$$c_n^N \approx c_n \quad\text{for}\quad -\frac{N}{2} \le n \le \frac{N}{2} - 1.$$
Consequently, the smoother f is, the better the approximation (compare
with Section 5.3.3). On the other hand, the approximation for a discontinuous
function can be rather bad. The determination of the rate at which
the $c_n$ tend to zero and, if possible, of a bound on the sums
$$\bigl|\cdots + c_{-2N+n} + c_{-N+n} + c_{N+n} + c_{2N+n} + \cdots\bigr|$$
are serious issues for the numerical analysis of this approximation problem.
The sum in (8.7) is, of course, finite if f is a trigonometric polynomial.
Take, for example, the general trigonometric polynomial of degree 6:
$$f(t) = \sum_{n=-6}^{+6} c_n\, e^{2i\pi n t/a}.$$
Then we have for N = 4:
$$c_0^4 = c_{-4} + c_0 + c_4,$$
$$c_1^4 = c_{-3} + c_1 + c_5,$$
$$c_{-2}^4 = c_{-6} + c_{-2} + c_2 + c_6,$$
$$c_{-1}^4 = c_{-5} + c_{-1} + c_3, \text{ etc.}$$
For N = 8:
$$c_0^8 = c_0,$$
$$c_1^8 = c_1,$$
$$c_2^8 = c_{-6} + c_2, \text{ etc.}$$
For $N \ge 13$:
$$c_n^N = c_n, \qquad n = -6, \dots, 6.$$
This simple example illustrates the following general result, which follows
directly from (8.7):
For a trigonometric polynomial of degree P, the values of the approximate
coefficients computed with the discrete Fourier transform are exact
whenever $N \ge 2P + 1$.
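Relation (8.7) can be observed on a trigonometric polynomial of degree 6. In this sketch the period is taken as a = 1 and the coefficients `c` are arbitrary made-up values; the helper `aliased_sum` evaluates the right-hand side of (8.7) over the finitely many nonzero terms.

```python
import cmath

def dft(y):
    """Y_n = (1/N) * sum_k y_k * w^(-n k), as in (8.5)."""
    N = len(y)
    return [sum(y[k] * cmath.exp(-2j * cmath.pi * n * k / N) for k in range(N)) / N
            for n in range(N)]

# Hypothetical exact coefficients c_{-6}, ..., c_6 of a degree-6 polynomial.
c = {n: 1.0 / (1 + n * n) + 0.1j * n for n in range(-6, 7)}

def f(t):
    """The trigonometric polynomial with period a = 1."""
    return sum(cn * cmath.exp(2j * cmath.pi * n * t) for n, cn in c.items())

def aliased_sum(n, N):
    """Right-hand side of (8.7): sum of all c_m with m congruent to n modulo N."""
    return sum(cm for m, cm in c.items() if (m - n) % N == 0)

def sampled_dft(N):
    return dft([f(k / N) for k in range(N)])

Y4 = sampled_dft(4)     # heavy aliasing: every Y_n mixes several c_m
Y13 = sampled_dft(13)   # N = 2P + 1: the coefficients come out exactly
```

With N = 4 each computed value is a sum of three or four exact coefficients, while with N = 13 the entry `Y13[n % 13]` agrees with `c[n]` up to rounding.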
8.5 Exercises
Exercise 8.1 Consider two consecutive discrete Fourier transforms:
$$(y_k) \xrightarrow{\mathcal{F}_N} (Y_n) \xrightarrow{\mathcal{F}_N} (Z_q).$$
Compute $Z_q$ as a function of $y_k$.
Exercise 8.2 Let $(x_k)$ and $(y_k)$ be two complex periodic sequences (with
period N) such that
$$x_{N-k} = \overline{x}_k \quad\text{and}\quad y_{N-k} = \overline{y}_k$$
for all $k \in \mathbb{Z}$. Show that the discrete Fourier transforms $(X_n)$ and $(Y_n)$ are real
and that they can be computed with a single transform of order N.
Exercise 8.3 Compute the successive powers of the matrix $\Omega_N$.
Exercise 8.4 Prove Proposition 8.2.4 by computing $(\Omega_N Y)^t\,(\overline{\Omega_N Y})$.
Exercise 8.5 Calculate the discrete Fourier transform of the vector $x_k = k$,
k = 0, 1, ..., N - 1.
Lesson 9
A Famous, Lightning-Fast Algorithm
Computing the vector $(Y_0, Y_1, \dots, Y_{N-1})$ using formula (8.5) requires
$(N-1)^2$ complex multiplications,
$N(N-1)$ complex additions,
assuming that the values of $\omega_N^{j}$, the sines and cosines of the given angles,
have already been computed and stored.
A typical value of N is of the order of 1000, which implies about a million
operations of each kind. Considering the frequency of this computation, it
was natural to seek to lower the cost. In 1965, two American scientists,
J. W. Cooley and J. W. Tukey, developed a much more efficient algorithm
that has since been known as the fast Fourier transform (FFT). This algorithm
takes into consideration the special form of the transformation
matrix, which is constructed from the roots of unity. From the beginning,
the FFT, including its many extensions, has enjoyed enormous success. In
fact, it is safe to say that it has been the backbone of signal and image processing
in the last half of the twentieth century. Furthermore, it has been
the inspiration for numerous investigations in algebra independently of its
intensive use in signal processing. It was indeed a marvelous discovery. The
fast Fourier transform marked an important step in the theory of the complexity
of algorithms. This field of research is concerned with determining
and minimizing the cost of a given computation or class of computations,
where the cost is measured by the number of numerical operations. For
example, we will see that the cost of the FFT is of the order N log N.
9.1 The Cooley-Tukey algorithm

Assume that N is even, N = 2m, and rearrange the terms of (8.5) into two
groups: those with even indices and those with odd indices. Then
$$Y_k = \frac{1}{2}\bigl(P_k + \omega_N^{-k} I_k\bigr),$$
where the $P_k$ and $I_k$ are given by the formulas
$$P_k = \frac{1}{m}\bigl(y_0 + y_2\, \omega_N^{-2k} + \cdots + y_{N-2}\, \omega_N^{-(N-2)k}\bigr),$$
$$I_k = \frac{1}{m}\bigl(y_1 + y_3\, \omega_N^{-2k} + \cdots + y_{N-1}\, \omega_N^{-(N-2)k}\bigr).$$
Note that we have the relations
$$P_{k+m} = P_k, \qquad I_{k+m} = I_k, \qquad \omega_N^{-(k+m)} = -\omega_N^{-k}$$
for k = 0, 1, ..., m - 1. These identities provide the key to the algorithm,
whose essential idea is this:
Step 1: Compute $P_k$ and $\omega_N^{-k} I_k$;
Step 2: Form $Y_k = \frac{1}{2}(P_k + \omega_N^{-k} I_k)$;
Step 3: Deduce $Y_{k+m} = \frac{1}{2}(P_k - \omega_N^{-k} I_k)$.
These computations are done successively only for k = 0, 1, ..., m - 1.
This scheme is illustrated schematically in Figure 9.1, where the arrows
indicate dependent relations in the calculations.
FIGURE 9.1.
The cost of Step 1 is $2(m-1)^2 + m - 1$ (or roughly $N^2/2$) multiplications.
Steps 2 and 3 cost nothing in complex multiplications. Thus one obtains the
same result for about half the work. (Note we have neglected the divisions
by 2 and m. In practice, m is a power of 2, and these are binary shifts.)
One could consider that this saving is sufficient and stop here. But most
readers probably have noticed that $P_k$ and $I_k$ are themselves two independent
discrete Fourier transforms of order m = N/2. In any case, it takes
only a moment to be convinced that
$$(y_0, y_2, \dots, y_{2m-2}) \xrightarrow{\mathcal{F}_{N/2}} (P_0, P_1, \dots, P_{m-1}),$$
$$(y_1, y_3, \dots, y_{2m-1}) \xrightarrow{\mathcal{F}_{N/2}} (I_0, I_1, \dots, I_{m-1}).$$
An obvious strategy is to repeat this clever decomposition, provided that
m is even. The best case is where N is a power of 2, $N = 2^p$. We can then
iterate the process until we arrive at discrete Fourier transforms of order
2. These are particularly simple computations, since they are of the form
$$Y = (y + z)/2,$$
$$Z = (y - z)/2.$$
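A single level of this decomposition can be checked against the direct formula. In the sketch below the two half-size transforms are still computed naively, so only Steps 1 to 3 are illustrated, not the full recursion; the names `dft` and `split_step` are ours.

```python
import cmath

def dft(y):
    """Direct transform: Y_n = (1/N) * sum_k y_k * w^(-n k), as in (8.5)."""
    N = len(y)
    return [sum(y[k] * cmath.exp(-2j * cmath.pi * n * k / N) for k in range(N)) / N
            for n in range(N)]

def split_step(y):
    """One even/odd split of a transform of even order N = 2m."""
    N = len(y)
    m = N // 2
    P = dft(y[0::2])   # transform of order m of the even-indexed samples
    I = dft(y[1::2])   # transform of order m of the odd-indexed samples
    Y = [0j] * N
    for k in range(m):
        t = cmath.exp(-2j * cmath.pi * k / N) * I[k]   # Step 1: w_N^(-k) * I_k
        Y[k] = (P[k] + t) / 2                           # Step 2
        Y[k + m] = (P[k] - t) / 2                       # Step 3
    return Y

y = [1.0, -2.0, 0.5, 3.0, 0.0, 1.0]   # N = 6: only evenness of N is required
Y_split = split_step(y)
Y_direct = dft(y)
```

The example uses N = 6 on purpose: one split only needs N to be even, whereas repeating the split down to order 2 needs a power of 2.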
We illustrate the algorithm for N = 8. The first step is to rearrange the
sequence $(y_0, y_1, \dots, y_7)$ into two sequences of length 4, the first having the
even indices and the second the odd indices. The process is repeated, and
we obtain the four vectors of length 2 shown in Figure 9.2. The computation
begins with the vectors of length 2. As in Figure 9.1, the arrows indicate
the dependencies of the Y-vectors on the data. To simplify the notation,
we have written $P_0$, $I_0$, $Y_0$, $Y_1$ four times, but they are clearly not the same
values. The wiggly lines separate independent computations.
y0 y1 y2 y3 y4 y5 y6 y7
y0 y2 y4 y6 | y1 y3 y5 y7
y0 y4 | y2 y6 | y1 y5 | y3 y7
P0 I0 | P0 I0 | P0 I0 | P0 I0
FIGURE 9.2. Rearrangement of the data.
Going from one level (vectors of length m) to the next (vectors of length 2m) is done
using the formulas
$$Y_k = \frac{1}{2}\bigl(P_k + \omega_N^{-k} I_k\bigr), \qquad Y_{k+m} = \frac{1}{2}\bigl(P_k - \omega_N^{-k} I_k\bigr), \tag{9.1}$$
for k = 0, 1, ..., m - 1. Figure 9.3 illustrates the complete algorithm for
$N = 2^3$. The wiggly lines separate the independent computations. In an
actual program, a single vector is used. This is ultimately the output vector
$(Y_0, Y_1, \dots, Y_{N-1})$; it is the result of successively transforming the vector
obtained by appropriately rearranging the original data.
9.2 Evaluating the cost of the algorithm

The only arithmetic operations that appear in the FFT are multiplications
and additions of complex numbers. (We neglect the successive divisions by
2; these reduce to a single division by $N = 2^p$, at the outset, for example.)
We denote the cost of r complex multiplications and of s complex additions
by [r; s].
For $N = 2^p$, let $M_p$ be the number of multiplications used in the algorithm
and let $A_p$ be the number of additions. Formulas (9.1) are used to
evaluate the cost for $N = 2^p$ in terms of the cost for $N = 2^{p-1}$:
Cost of computing the $P_k$: $[M_{p-1}; A_{p-1}]$;
Cost of computing the $I_k$: $[M_{p-1}; A_{p-1}]$;
Multiplications by $\omega_N^{-k}$ ($k \ge 1$): $[2^{p-1} - 1; 0]$;
Additions: $[0; 2^p]$.
FIGURE 9.3. The FFT algorithm of order 8.
From these relations we have
$$M_1 = 0, \qquad A_1 = 2,$$
$$M_p = 2M_{p-1} + 2^{p-1} - 1, \qquad A_p = 2A_{p-1} + 2^p.$$
A computation, which is left as an exercise, provides explicit expressions
for $M_p$ and $A_p$, namely,
$$M_p = (p-2)\,2^{p-1} + 1,$$
$$A_p = p\,2^p.$$
We see from this that the global cost, as a function of N, is
$$\Bigl[\tfrac{1}{2}N(\log_2 N - 2) + 1;\; N\log_2 N\Bigr]. \tag{9.2}$$
Table 9.1 compares the FFT with the "old" method. It shows the savings
for the two operations as a function of N. For N = 1024, we see that the
FFT divides the cost of the multiplications by about 250: a fantastic gain.
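The closed forms for M_p and A_p can be checked against the recurrences with a short script. This sketch only verifies the arithmetic; the function names `fft_cost` and `closed_form` are ours.

```python
def fft_cost(p):
    """(M_p, A_p) from the recurrences, starting at M_1 = 0, A_1 = 2."""
    M, A = 0, 2
    for q in range(2, p + 1):
        M = 2 * M + 2 ** (q - 1) - 1   # two half-size FFTs plus the twiddle products
        A = 2 * A + 2 ** q             # two half-size FFTs plus N additions
    return M, A

def closed_form(p):
    """Explicit expressions M_p = (p - 2) * 2^(p-1) + 1 and A_p = p * 2^p."""
    return (p - 2) * 2 ** (p - 1) + 1, p * 2 ** p

costs = {2 ** p: fft_cost(p) for p in range(1, 11)}
```

For N = 1024 this gives 4097 multiplications and 10240 additions, which are the values listed in Table 9.1.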
9.3 The mirror permutation

             Multiplications                  Additions
N         Old Method    FFT   Ratio     Old Method     FFT   Ratio
2                  0      0                      2       2     1
4                  0      0                     12       8     1.5
8                 49      5     10              56      24     2.3
16               225     17     13             240      64     3.8
32               961     49     20             992     160     6.2
64             3,969    129     31           4,032     384    10
128           16,129    321     50          16,256     896    18
256           65,025    769     85          65,280   2,048    32
512          261,121  1,793    145         261,632   4,608    57
1,024      1,046,529  4,097    255       1,047,552  10,240   102
TABLE 9.1.
If we wish to obtain the values $Y_0, Y_1, \dots, Y_{N-1}$ in this order, it is clear
from Figure 9.3 that we must begin with a vector $(y_n)$, n = p(k), where
p is a permutation of the indices k = 0, 1, ..., N - 1. This permutation of
the data at the outset is an important issue, particularly for programming
the algorithm. There are a number of ways to do this, and it is a problem
that generally stimulates much imagination from students. The only
restriction is not to introduce so many operations that the gain realized by
the algorithm is compromised.
For these consecutive even-odd permutations, one feels that the representation
of the indices in base 2 must come into play. Take the case N = 8
and notice what happens:
0 = 000  <->  0 = 000
1 = 001  <->  4 = 100
2 = 010  <->  2 = 010
3 = 011  <->  6 = 110
4 = 100  <->  1 = 001
5 = 101  <->  5 = 101
6 = 110  <->  3 = 011
7 = 111  <->  7 = 111
Each number has been written in binary form using three places, which
is possible, since we stop at 7 ($N = 2^3$). We notice a surprising property:
The required permutation of an index is given by reversing the order of its
binary representation. It is as if they were reflected in a mirror. One can
verify that this holds for N = 16, and it is an excellent exercise to show
that it is true in general.
This "mirror" permutation leads to a method for programming the initial
permutation. For this, it is necessary to work with the binary representations
of the indices. These, however, are not directly accessible in high-level
languages like PASCAL; consequently, this is not the best method.
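In a language with integer bit operations the mirror permutation is straightforward. The sketch below reverses the bits of each index and then runs the level-by-level computation of Figure 9.3 on a single working vector; it assumes N is a power of 2 and the normalization of (8.5). The names `bit_reverse` and `fft_iterative` are ours, and `dft` is included only for comparison.

```python
import cmath

def bit_reverse(j, bits):
    """Reverse the `bits`-digit binary representation of j."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (j & 1)   # move the lowest bit of j onto the end of r
        j >>= 1
    return r

def fft_iterative(y):
    """Y_n = (1/N) sum_k y_k w_N^(-nk), computed level by level on one vector."""
    N = len(y)
    bits = N.bit_length() - 1
    v = [complex(y[bit_reverse(j, bits)]) for j in range(N)]   # mirror permutation
    m = 1
    while m < N:                                  # combine blocks of length m into 2m
        w_step = cmath.exp(-2j * cmath.pi / (2 * m))
        for start in range(0, N, 2 * m):
            w = 1.0
            for k in range(m):
                P = v[start + k]
                I = w * v[start + k + m]
                v[start + k] = (P + I) / 2
                v[start + k + m] = (P - I) / 2
                w *= w_step
        m *= 2
    return v

def dft(y):
    """Direct formula (8.5), used here only as a reference."""
    N = len(y)
    return [sum(y[k] * cmath.exp(-2j * cmath.pi * n * k / N) for k in range(N)) / N
            for n in range(N)]

mirror = [bit_reverse(j, 3) for j in range(8)]
y = [1.0, 2.0, -1.0, 0.5, 3.0, 0.0, -2.0, 4.0]
Y_fast = fft_iterative(y)
```

The list `mirror` reproduces the right-hand column of the table above: 0, 4, 2, 6, 1, 5, 3, 7.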
9.4 A recursive program

Here, to finish the chapter, is a program (written in a simplified pseudo-language)
for computing the FFT of a vector y. It is taken from [Lip81].
We include it because it is astonishingly simple to program and because it
follows step by step the approach we have taken. The particularity of this
procedure is that it is recursive, which means that calls are made within
the program to the program itself.
Procedure FFT(n,w,y,Y);
begin
  if n=1 then Y[0]:=y[0] else
  begin
    m:=n div 2;
    for k:=0 to m-1 do
    begin
      b[k]:=y[2*k];       { even-indexed data }
      c[k]:=y[2*k+1]      { odd-indexed data }
    end;
    w2:=w*w;
    FFT(m,w2,b,B);        { B holds the P_k }
    FFT(m,w2,c,C);        { C holds the I_k }
    wk:=1;                { wk is the k-th power of w }
    for k:=0 to m-1 do
    begin
      X:=B[k]; T:=wk*C[k];
      Y[k]:=(X+T)/2;
      Y[k+m]:=(X-T)/2;
      wk:=wk*w
    end
  end
end.
We note that compilers deal with these recursions more or less well, particularly
on microcomputers. While the program itself is concisely written,
which is very attractive to the programmer, its execution, by contrast, requires
a great deal of processing and a large amount of memory: At each
call to FFT, the procedure is completely recopied with new parameters.
Finally, it is not obvious that this procedure does indeed compute the
desired FFT. For example, the second call to FFT is executed only after
many other such calls.
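For readers who want to run the procedure, here is a transliteration into Python. It is a sketch rather than the program of [Lip81]: we assume that the parameter w is meant to be $\omega_n^{-1} = e^{-2i\pi/n}$, which is what formulas (9.1) require, and the function returns the result instead of filling an output array.

```python
import cmath

def fft(y, w=None):
    """Recursive FFT of a vector whose length n is a power of 2.

    Assumption: w = exp(-2*i*pi/n), so that the result is
    Y_k = (1/n) * sum_j y_j * exp(-2*i*pi*j*k/n), as in (8.5).
    """
    n = len(y)
    if w is None:
        w = cmath.exp(-2j * cmath.pi / n)
    if n == 1:
        return [complex(y[0])]
    m = n // 2
    B = fft(y[0::2], w * w)   # transform of the even-indexed data (the P_k)
    C = fft(y[1::2], w * w)   # transform of the odd-indexed data (the I_k)
    Y = [0j] * n
    wk = 1.0                  # successive powers of w
    for k in range(m):
        X, T = B[k], wk * C[k]
        Y[k] = (X + T) / 2
        Y[k + m] = (X - T) / 2
        wk *= w
    return Y

Y = fft([1.0, 2.0, 0.0, -1.0])
```

Each call allocates new lists for the two halves, which mirrors the memory overhead mentioned above; the iterative, in-place organization avoids it.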
9.5 Exercises
Exercise 9.1 Consider the discrete Fourier transform of order N defined by
the formulas (8.5)
$$Y_n = \frac{1}{N}\sum_{k=0}^{N-1} y_k\, \omega_N^{-nk}, \qquad n = 0, 1, \dots, N-1.$$
Write the discrete Fourier transform in its matrix form $Y = S_N y$. What is the
matrix associated with the inverse transform?
**Exercise 9.2 Let u and v be two complex periodic sequences with period
N. Consider the periodic convolution w = u * v defined by (10.1):
$$w_n = \sum_{q=0}^{N-1} u_{n-q}\, v_q, \qquad n = 0, 1, \dots, N-1.$$
(a) Write this convolution in matrix form,
w = C(u)v where $w = (w_0, w_1, \dots, w_{N-1})^t$, $v = (v_0, v_1, \dots, v_{N-1})^t$,
and C(u) is an N x N circulant matrix.
(b) Let U and V denote, respectively, the discrete Fourier transforms of u and
v. Express W, the discrete Fourier transform of w, in matrix form as a
function of $S_N$, C(u), and V.
(c) Show that W = D(U)V, where D(U) is a diagonal matrix with $(D(U))_{ii} = N U_i$,
i = 0, 1, ..., N - 1. (Write C(u) as a linear combination of permutation
matrices.)
Exercise 9.3 (binary mirror permutation) For $M \in \mathbb{N}^*$, let
$N = 2^M$ and define $E_M = \{j \in \mathbb{N} \mid 0 \le j \le 2^M - 1\}$. Each integer $j \in E_M$ can
be represented as a binary number of length M:
$$j = \sum_{k=0}^{M-1} a_k 2^k,$$
where $a_k$ is 0 or 1. With each $j \in E_M$ we associate $j^* \in E_M$ obtained by reversing
the order of the $a_k$:
$$j^* = a_0 a_1 \cdots a_{M-2} a_{M-1}, \quad\text{or}\quad j^* = \sum_{k=0}^{M-1} a_{M-k-1} 2^k.$$
This defines a permutation on $E_M$.
(a) Write the matrix $P_N$ associated with this permutation for N = 8.
(b) Verify that $P_N$ is symmetric and that $P_N^2 = I$, the identity matrix.
(c) How does this generalize for arbitrary M?
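**Exercise 9.4 (matrix version of the FFT) The exercise is for
the case M = 3, that is, $N = 2^3$. It generalizes to an arbitrary M. (A description
of the general algorithm is given in [Ebe70].)
We will use the results of Exercises 9.1 and 9.3 with modified notation: In place
of $S_N$, $P_N$, $\Omega_N$ we write $S_3$, $P_3$, $\Omega_3$ ($N = 2^3$).
We will use the mirror permutation matrix $P_3$ to compute $Y = S_3 y$; in fact,
we compute $Y = P_3(P_3 S_3)y$. The work proceeds in two phases:
Compute $Y^* = (P_3 S_3)y$.
Compute $Y = P_3 Y^*$.
(1) Phase 1: Compute $Y^*$.
Let $T_3 = P_3 S_3$. Define $E_3(S_3)$ and $E_3(T_3)$ to be the 8 x 8 matrices formed
from the exponents (nk mod 8) of the terms $(\omega_N^{-1})^{nk}$ appearing in $S_3$ and
$T_3$ respectively.
(a) Express $E_3(S_3)$ and $E_3(T_3)$ explicitly.
(b) Show that $T_3$ can be written as
$$T_3 = \frac{1}{2}\begin{pmatrix} T_2 & T_2\\ T_2 L_2 & -T_2 L_2 \end{pmatrix}$$
with $T_2 = P_2 S_2$ and
$$L_2 = \begin{pmatrix}
1 & 0 & 0 & 0\\
0 & e^{-2i\pi\frac{1}{2^3}} & 0 & 0\\
0 & 0 & e^{-2i\pi\frac{2}{2^3}} & 0\\
0 & 0 & 0 & e^{-2i\pi\frac{3}{2^3}}
\end{pmatrix}.$$
(c) Now consider the algorithm for computing $Y^* = T_3 y$.
First step:
Write $R^0 = T_3$. Verify that $R^0$ can be decomposed as a product
of 3 matrices:
$$R^0 = R^1 \Lambda^0 \Sigma^0,$$
where
$$R^1 = \begin{pmatrix} T_2 & 0\\ 0 & T_2 \end{pmatrix}, \qquad
\Lambda^0 = \begin{pmatrix} I_2 & 0\\ 0 & L_2 \end{pmatrix}, \qquad
\Sigma^0 = \frac{1}{2}\begin{pmatrix} I_2 & I_2\\ I_2 & -I_2 \end{pmatrix},$$
$I_2$ being the 4 x 4 identity matrix.
Write $v^0 = y$ and cut $v^0$, a vector of length 8, into two vectors,
$v^0_0$ and $v^0_1$, of length 4:
$$v^0 = \begin{pmatrix} v^0_0\\ v^0_1 \end{pmatrix}.$$
Compute $v^1 = \Lambda^0 \Sigma^0 v^0$ as a function of $v^0_0$, $v^0_1$, and $L_2$. How
many complex multiplications are needed to compute $v^1$?
Second step:
This step is to compute $Y^* = R^1 v^1$.
Show that $R^1$ can be written as the product of 3 matrices:
$$R^1 = R^2 \Lambda^1 \Sigma^1,$$
where
$$R^2 = \begin{pmatrix} T_1 & 0 & 0 & 0\\ 0 & T_1 & 0 & 0\\ 0 & 0 & T_1 & 0\\ 0 & 0 & 0 & T_1 \end{pmatrix}, \qquad
\Lambda^1 = \begin{pmatrix} I_1 & 0 & 0 & 0\\ 0 & L_1 & 0 & 0\\ 0 & 0 & I_1 & 0\\ 0 & 0 & 0 & L_1 \end{pmatrix},$$
and
$$\Sigma^1 = \frac{1}{2}\begin{pmatrix} I_1 & I_1 & 0 & 0\\ I_1 & -I_1 & 0 & 0\\ 0 & 0 & I_1 & I_1\\ 0 & 0 & I_1 & -I_1 \end{pmatrix}.$$
$I_1$ is the (2 x 2) identity matrix, $T_1 = P_1 S_1$ is a 2 x 2 matrix,
and
$$L_1 = \begin{pmatrix} 1 & 0\\ 0 & e^{-2i\pi\frac{1}{2^2}} \end{pmatrix}.$$
Cut the vector $v^1$ into 4 vectors $v^1_0$, $v^1_1$, $v^1_2$, and $v^1_3$ of length 2.
Compute $v^2 = \Lambda^1 \Sigma^1 v^1$ as a function of $v^1_i$, i = 0, ..., 3.
How many complex multiplications are needed to compute $v^2$?
Third step:
This is the last step when M = 3. Here we compute $Y^* = R^2 v^2$.
Show that $R^2$ can be decomposed as a product of 3 matrices:
$$R^2 = R^3 \Lambda^2 \Sigma^2,$$
where $R^3$ is the (8 x 8) identity matrix, $\Lambda^2 = R^3$, and
$$\Sigma^2 = \frac{1}{2}\begin{pmatrix}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\
1 & -1 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & 1 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & -1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 0\\
0 & 0 & 0 & 0 & 1 & -1 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1\\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -1
\end{pmatrix}.$$
From this, deduce the value of $Y^*$.
What is the total number of complex multiplications needed to
compute $Y^*$?
(2) Phase 2: Rearrange the components of $Y^*$.
Is it necessary to compute the matrix product $Y = P_3 Y^*$?
(3) How does one proceed to obtain $Y_0, Y_1, \dots, Y_7$ in this order?
(4) How must one modify the algorithm to compute the inverse discrete Fourier
transform?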
Exercise 9.5 Show that it is possible to compute the periodic convolution

(10.1) using a form of the FFT that does not involve the mirror permutation.
Lesson 10
Using the FFT for Numerical Computations
We present several examples to indicate the many possible numerical applications
of the fast Fourier transform (FFT). It is widely used in signal
processing for spectral analysis and for computing convolutions. We will
see other important uses in computations involving high-degree polynomials
and in interpolation problems.
10.1 Computing a periodic convolution

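10.1.1 Complex data
Let $(x_n)_{n\in\mathbb{Z}}$ and $(h_q)_{q\in\mathbb{Z}}$ be two complex periodic sequences having the
same period N. The periodic convolution of these two sequences is the
complex sequence $(y_n)_{n\in\mathbb{Z}}$ defined by
$$y_n = \sum_{q=0}^{N-1} h_q\, x_{n-q} = \sum_{q=0}^{N-1} h_{n-q}\, x_q. \tag{10.1}$$
This sequence is clearly periodic with period N. The transformation defined
by (10.1) is also a linear transformation $X \mapsto Y = HX$ of $\mathbb{C}^N$ into itself
with $X = (x_0, x_1, \dots, x_{N-1})^t$ and $Y = (y_0, y_1, \dots, y_{N-1})^t$. The matrix of
this transformation is called a circulant matrix and is given by
$$H = \begin{pmatrix}
h_0 & h_{N-1} & h_{N-2} & \cdots & h_1\\
h_1 & h_0 & h_{N-1} & \cdots & h_2\\
h_2 & h_1 & h_0 & \cdots & h_3\\
\vdots & \vdots & \vdots & & \vdots\\
h_{N-1} & h_{N-2} & h_{N-3} & \cdots & h_0
\end{pmatrix}.$$
Computing the convolution directly from the definition (10.1) requires
$$\begin{array}{l} N^2 \text{ complex multiplications},\\ N(N-1) \text{ complex additions}. \end{array} \tag{10.2}$$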
Theorem 8.2.3 points to another way to proceed: Let $(X_k)$, $(H_k)$, and $(Y_k)$
be the discrete Fourier transforms of the sequences $(x_n)$, $(h_n)$, and $(y_n)$.
Equation (10.1) becomes
$$Y_k = N\, H_k X_k, \qquad k = 0, 1, \dots, N-1. \tag{10.3}$$
If we assume that the length N of complex vectors is a power of 2, $N = 2^p$,
then this computation proceeds as follows:
Computation                                          Cost
Step 1: Compute the transforms $\mathcal{F}_N$       [N(p - 2); 2Np]
        $(x_n) \to (X_k)$
        $(h_n) \to (H_k)$
Step 2: Compute the products (10.3)                  [N; 0]
Step 3: Compute the transform $\mathcal{F}_N^{-1}$   [(N/2)(p - 2); Np]
        $(Y_k) \to (y_n)$
The total cost is
$$\begin{array}{l} \dfrac{N}{2}(3\log_2 N - 4) \text{ complex multiplications},\\[4pt] 3N\log_2 N \text{ complex additions}, \end{array} \tag{10.4}$$
which is an appreciable savings compared with (10.2). If N = 64, the cost
is [448; 1,152] complex operations in place of [4,096; 4,032]. (Note that
here and elsewhere we neglect the constant term that appears in (9.2).)
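The three steps can be sketched as follows. The recursive `fft` is a compact stand-in for any FFT with the normalization of (8.5), and the inverse is obtained from it by conjugation; this is one convenient choice, not necessarily how a production library does it. All function names are ours.

```python
import cmath

def fft(y):
    """Y_n = (1/N) sum_k y_k w_N^(-nk); len(y) must be a power of 2."""
    n = len(y)
    if n == 1:
        return [complex(y[0])]
    P, I = fft(y[0::2]), fft(y[1::2])
    m = n // 2
    Y = [0j] * n
    for k in range(m):
        t = cmath.exp(-2j * cmath.pi * k / n) * I[k]
        Y[k], Y[k + m] = (P[k] + t) / 2, (P[k] - t) / 2
    return Y

def ifft(Y):
    """y_k = sum_n Y_n w_N^(nk), computed as N * conj(fft(conj(Y)))."""
    n = len(Y)
    return [n * v.conjugate() for v in fft([c.conjugate() for c in Y])]

def periodic_convolution_fft(h, x):
    N = len(x)
    H, X = fft(h), fft(x)                               # Step 1
    products = [N * H[k] * X[k] for k in range(N)]      # Step 2, formula (10.3)
    return ifft(products)                               # Step 3

def periodic_convolution_direct(h, x):
    """Definition (10.1), used as a reference."""
    N = len(x)
    return [sum(h[q] * x[(n - q) % N] for q in range(N)) for n in range(N)]

h = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, -0.5]
x = [2.0, 1.0, 0.0, -1.0, 3.0, 0.5, 1.5, -2.0]
y_fast = periodic_convolution_fft(h, x)
y_slow = periodic_convolution_direct(h, x)
```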
10.1.2 Real data

In this case, the two discrete Fourier transforms (DFT) in Step 1 can be
computed with a single DFT of order N on complex data (see Section 8.3).
In Step 3 the inverse DFT acts on the periodic sequence $(Y_k)$ that satisfies
$Y_{-k} = \overline{Y}_k$, so this can be computed with a single DFT of order N/2. Since
the products in (10.3) can be obtained with N/2 complex multiplications,
the total cost is
$$\frac{N}{4}(3\log_2 N - 5) \text{ complex multiplications},$$
$$\frac{N}{2}(3\log_2 N - 1) \text{ complex additions}.$$
One complex multiplication can be done with 4 real multiplications and 2
real additions. Thus in the real case the cost is
$$N(3\log_2 N - 5) \text{ real multiplications},$$
$$\frac{N}{2}(9\log_2 N - 7) \text{ real additions}.$$
For N = 64, the cost is [832; 1,504] versus [16,384; 16,256] given by (10.2).
10.2 Nonperiodic convolution

Let $(x_n)_{n\in\mathbb{Z}}$ and $(h_n)_{n\in\mathbb{Z}}$ be two nonperiodic signals that have compact
support. In particular we assume that
$$x_n = 0 \text{ if } n < 0 \text{ or } n \ge M,$$
$$h_n = 0 \text{ if } n < 0 \text{ or } n \ge Q \quad (Q \le M).$$
The problem is to compute the nonperiodic convolution
$$y_n = \sum_{q=0}^{Q-1} h_q\, x_{n-q}. \tag{10.5}$$
The $y_n$ are zero if n < 0 or if $n \ge M + Q - 1$. Let N be the smallest power
of 2 such that $N \ge M + Q - 1$. By making the original sequences periodic
with period N, we come back to the problem of computing a periodic
convolution using the FFT, where the cost is given by (10.4):
$$\frac{N}{2}(3\log_2 N - 4) \text{ complex multiplications},$$
$$3N\log_2 N \text{ complex additions}.$$
EXAMPLE: Take Q = 200, M = 500, and N = 1024. Then the cost of
computing (10.5) is
$MQ = 10^5$ multiplications,
$MQ - (M + Q - 1) \approx 10^5$ additions.
The cost using the FFT is
$1024 \times 13 \approx 1.3 \cdot 10^4$ multiplications,
$1024 \times 30 \approx 3 \cdot 10^4$ additions.
We see that the FFT method is still advantageous in this case. On the
other hand, this advantage is lost when the lengths of the two signals are
disproportionate. This happens frequently in "real-time" signal processing,
where the sequence $(x_n)$ is practically "infinite" and where the support of
the filter $(h_q)$ is relatively small. This is the case, for example, when one
smoothes data with a "sliding window."
EXAMPLE: Suppose
$$y_n = \sum_{q=0}^{4} h_q\, x_{n-q} \tag{10.6}$$
with Q = 5 and M = 1000. The cost of computing (10.6) is
5000 multiplications,
4000 additions.
The cost using the FFT method with N of the order 1024 is
$1.3 \cdot 10^4$ multiplications,
$3 \cdot 10^4$ additions.
The direct application of the FFT method is clearly more costly. There
are, however, specific methods for this case. They involve cutting the vector
$(x_n)$ into shorter pieces (see, for example, [CLW67] and [Nus81]).
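The zero-padding step that turns a nonperiodic convolution into a periodic one can be sketched as follows. The helper names are ours, the recursive `fft` is the same compact stand-in as before (normalization of (8.5)), and the returned values are complex even for real input, with imaginary parts at rounding level.

```python
import cmath

def fft(y):
    """Y_n = (1/N) sum_k y_k w_N^(-nk); len(y) must be a power of 2."""
    n = len(y)
    if n == 1:
        return [complex(y[0])]
    P, I = fft(y[0::2]), fft(y[1::2])
    m = n // 2
    Y = [0j] * n
    for k in range(m):
        t = cmath.exp(-2j * cmath.pi * k / n) * I[k]
        Y[k], Y[k + m] = (P[k] + t) / 2, (P[k] - t) / 2
    return Y

def ifft(Y):
    n = len(Y)
    return [n * v.conjugate() for v in fft([c.conjugate() for c in Y])]

def nonperiodic_convolution_fft(h, x):
    """y_n = sum_q h_q x_{n-q} for finite sequences h (length Q) and x (length M)."""
    L = len(h) + len(x) - 1          # number of possibly nonzero outputs, M + Q - 1
    N = 1
    while N < L:                     # smallest power of 2 with N >= M + Q - 1
        N *= 2
    hp = list(h) + [0.0] * (N - len(h))   # zero padding prevents wrap-around
    xp = list(x) + [0.0] * (N - len(x))
    H, X = fft(hp), fft(xp)
    y = ifft([N * H[k] * X[k] for k in range(N)])
    return y[:L]

y = nonperiodic_convolution_fft([1.0, 1.0], [1.0, 2.0, 3.0])
```

With h = (1, 1) and x = (1, 2, 3) the result is (1, 3, 5, 3), as a hand computation of (10.5) confirms.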
10.3 Computations on high-order polynomials
A polynomial P of degree less than or equal to p can be coded in several
different ways. The most common of these is to represent P as the
vector of its coordinates in a given basis, for example, the canonical basis
$(1, x, \dots, x^p)$. Thus
$$P(x) = a_0 + a_1 x + \cdots + a_p x^p$$
is represented by
$$(a_0, a_1, \dots, a_p). \tag{10.7}$$
This representation is convenient for computing the values of P for different
values of x (Horner's algorithm). It is less convenient if one wishes to
compute the product of P and another polynomial
$$Q(x) = b_0 + b_1 x + \cdots + b_q x^q.$$
The coefficients of the product
$$PQ(x) = c_0 + c_1 x + \cdots + c_{p+q} x^{p+q}$$
are given by the convolution
$$c_k = \sum_{n=0}^{k} a_n b_{k-n}, \qquad k = 0, 1, \dots, p+q. \tag{10.8}$$
This computation requires
$$\tfrac{1}{2}(p+q+1)(p+q+2) \text{ multiplications},$$
$$\tfrac{1}{2}(p+q)(p+q+1) \text{ additions}.$$
There is another representation that is better adapted to the computation
of a product. We know that a polynomial of degree less than or equal
to N - 1 is uniquely determined by its values at N distinct points in the
complex plane:
$$y_j = P(x_j), \qquad j = 0, 1, \dots, N-1. \tag{10.9}$$
In this case, the product PQ is simply coded by the numbers
$$P(x_j)Q(x_j), \qquad j = 0, 1, \dots, N-1.$$
(We have assumed that $p + q \le N - 1$.) On the other hand, the computation
of P(x) for an arbitrary value x is more complicated in this representation.
We are going to examine these two representations and the problem of
going from one to the other.
10.3.1 Polynomials represented in the canonical basis

Formula (10.8) expresses the coordinates of the product PQ as a nonperiodic
convolution. Thus it can be computed using the method described in
Section 10.2: The vectors $(a_0, a_1, \dots, a_p)$ and $(b_0, b_1, \dots, b_q)$ are extended
with zeros to obtain two vectors of length N, where N is a power of 2 and
$N \ge p + q + 1$. These vectors are extended periodically to all of $\mathbb{Z}$ to obtain
two N-periodic sequences; formula (10.8) then becomes
$$c_k = \sum_{n=0}^{N-1} a_n b_{k-n}, \qquad k = 0, 1, \dots, N-1.$$
The computation using the FFT technique costs
$$\frac{N}{2}(3\log_2 N - 4) \text{ multiplications},$$
$$3N\log_2 N \text{ additions}.$$
EXAMPLE: Take p = 13, q = 15, and N = 32. The direct method costs
406 additions,
while the FFT method costs
320 additions.
However, as indicated in Section 10.2, the FFT method loses its advan-
tage when p and q are not of the same order of magnitude.
10.3.2 Polynomials represented by N values

A choice for the points $x_k$ in (10.9) that is particularly interesting is
$$x_k = \omega_N^k = e^{2i\pi k/N}, \qquad k = 0, 1, \dots, N-1. \tag{10.10}$$
The two representations (10.7) and (10.9) are then related by the equations
$$y_k = \sum_{n=0}^{N-1} a_n\, \omega_N^{nk}, \qquad k = 0, 1, \dots, N-1. \tag{10.11}$$
This is a discrete Fourier transform whose inverse is given by
$$a_n = \frac{1}{N}\sum_{k=0}^{N-1} y_k\, \omega_N^{-nk}, \qquad n = 0, 1, \dots, N-1.$$
We know from Section 9.2 that the cost of going from one representation
to the other is
$$\frac{N}{2}(\log_2 N - 2) \text{ multiplications},$$
$$N\log_2 N \text{ additions}.$$
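The passage between the two representations, and the resulting product rule, can be illustrated on small polynomials. For brevity this sketch evaluates (10.11) with Horner's rule and inverts it with the direct formula rather than with an FFT, and it takes N = p + q + 1 points instead of a power of 2; the function names are ours.

```python
import cmath

def values_at_roots_of_unity(a, N):
    """y_k = P(w_N^k) for k = 0..N-1, by Horner's rule; a = (a_0, ..., a_p)."""
    values = []
    for k in range(N):
        x = cmath.exp(2j * cmath.pi * k / N)
        acc = 0j
        for coef in reversed(a):
            acc = acc * x + coef
        values.append(acc)
    return values

def coefficients_from_values(y):
    """Inverse of (10.11): a_n = (1/N) * sum_k y_k * w_N^(-n k)."""
    N = len(y)
    return [sum(y[k] * cmath.exp(-2j * cmath.pi * n * k / N) for k in range(N)) / N
            for n in range(N)]

def multiply(a, b):
    """Coefficients of PQ obtained from the pointwise products P(x_k) Q(x_k)."""
    N = len(a) + len(b) - 1                       # p + q + 1 values determine PQ
    pa = values_at_roots_of_unity(a, N)
    pb = values_at_roots_of_unity(b, N)
    return coefficients_from_values([u * v for u, v in zip(pa, pb)])

# (1 + 2x)(3 + x + x^2) = 3 + 7x + 3x^2 + 2x^3
c = multiply([1, 2], [3, 1, 1])
```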
10.4 Polynomial interpolation and the Chebyshev basis
We will see that the FFT can be used to reduce the cost of computing a
polynomial interpolation. This is made possible by representing the poly-
nomial in the Chebyshev basis.
10.4.1 The Chebyshev polynomials

The Chebyshev polynomials are the polynomials $T_0, T_1, T_2, \dots$ that are
defined for all $\theta \in [0, \pi]$ by the relations
$$T_n(\cos\theta) = \cos n\theta. \tag{10.12}$$
That $T_n$ is a polynomial follows from the de Moivre formulas. Furthermore,
the degree of $T_n$ is exactly n, and its coefficients are integers.
$$T_0(x) = 1,$$
$$T_1(x) = x,$$
$$T_2(x) = 2x^2 - 1,$$
$$T_3(x) = 4x^3 - 3x,$$
$$T_4(x) = 8x^4 - 8x^2 + 1, \text{ etc.}$$
For example, the expression for $T_3$ comes from the identity
$$\cos 3\theta = 4\cos^3\theta - 3\cos\theta.$$
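The defining relation (10.12) is easy to test numerically for the five polynomials listed above. The sketch below simply evaluates both sides on a grid of angles; the grid is an arbitrary choice.

```python
import math

# T_0 .. T_4 exactly as listed in the text.
chebyshev = [
    lambda x: 1.0,
    lambda x: x,
    lambda x: 2 * x**2 - 1,
    lambda x: 4 * x**3 - 3 * x,
    lambda x: 8 * x**4 - 8 * x**2 + 1,
]

thetas = [j * math.pi / 10 for j in range(11)]   # sample angles in [0, pi]

# Largest deviation between T_n(cos(theta)) and cos(n*theta) over the grid.
max_error = max(abs(chebyshev[n](math.cos(t)) - math.cos(n * t))
                for n in range(5) for t in thetas)
```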
We are concerned here with the vector space of polynomials with real
coefficients. This is a vector space over JR, and the Tn form a basis for this
space. More precisely, if P has degree less than or equal to N, it can be
represented uniquely as
N
P(x) = L anTn(x). (10.13)
n=O
This representation in terms of the Tn and that of P(x) in terms of its
values at N + 1 points are widely used in pseudo-spectral methods for
approximating solutions of certain partial differential equations. The poly-
nomial P is an approximation of the unknown function. It is obtained by
computing its values $y_k$ at $N + 1$ points. When using this technique, one
must constantly pass from one to the other of the two representations $(a_n)$
and $(y_k)$.
10.4.2 Choosing the xk
Since we wish to remain in the real domain, we cannot use the $x_k$ given by
(10.10) directly, but our choices are derived from (10.10). In particular, we
take the abscissas $x_k$ (which are called the Chebyshev abscissas) to be
$$x_k = \cos\Bigl(k\frac{\pi}{N}\Bigr), \qquad k = 0, 1, \ldots, N. \tag{10.14}$$
From (10.12),
$$T_n(x_k) = \cos\Bigl(nk\frac{\pi}{N}\Bigr) \tag{10.15}$$
and
$$y_k = P(x_k) = \sum_{n=0}^{N} a_n \cos\Bigl(nk\frac{\pi}{N}\Bigr). \tag{10.16}$$
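This formula is not exactly a DFT, but it is not far from one. By writing
the cosines in terms of exponentials, we see that
$$y_k = \frac{1}{2}\sum_{n=0}^{N} a_n\,\omega_{2N}^{nk} + \frac{1}{2}\sum_{n=-N}^{0} a_{-n}\,\omega_{2N}^{nk} = \sum_{n=-N}^{N} c_n\,\omega_{2N}^{nk}, \tag{10.17}$$
where
$$c_n = \begin{cases} \frac{1}{2}a_n & \text{if } 0 < n \le N, \\ a_0 & \text{if } n = 0, \\ \frac{1}{2}a_{-n} & \text{if } -N \le n < 0. \end{cases} \tag{10.18}$$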
The expression (10.16) defines $y_k$ for $0 \le k \le N$, but since the functions
$\cos(nk\pi/N)$ are defined for all $k \in \mathbb{Z}$, we can use the right-hand side of
(10.16) to extend the function $k \mapsto y_k$ to all $k \in \mathbb{Z}$. Furthermore, the
functions $\cos(nk\pi/N)$ are even and $2N$-periodic in $k$, so the same is true for the
sequence $(y_k)_{k \in \mathbb{Z}}$. Thus $y_{-k} = y_k$ and $y_{2N-k} = y_k$, and the system (10.17)
can be written
$$y_k = \sum_{n=-N}^{N} c_n\,\omega_{2N}^{nk}, \qquad k = 0, 1, \ldots, 2N-1. \tag{10.19}$$
Applying the technique used in Section 8.1, we try to invert this system by
computing
$$\gamma_p = \sum_{k=0}^{2N-1} y_k\,\omega_{2N}^{-pk}, \qquad p = 0, 1, \ldots, N.$$
By substituting (10.19) for $y_k$, this becomes
$$\gamma_p = \sum_{k=0}^{2N-1}\sum_{n=-N}^{N} c_n\,\omega_{2N}^{(n-p)k} = \sum_{n=-N}^{N} c_n \sum_{k=0}^{2N-1} \omega_{2N}^{(n-p)k}.$$
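The last sum is equal to $2N$ if $n \equiv p \pmod{2N}$ and 0 otherwise, so that
$$\gamma_p = 2N c_p, \qquad p = 0, 1, 2, \ldots, N-1,$$
$$\gamma_N = 2N(c_{-N} + c_N) = 4N c_N.$$
Finally, in view of (10.18), we have
$$a_n = \frac{\gamma_n}{N\varepsilon_n} = \frac{1}{N\varepsilon_n}\sum_{k=0}^{2N-1} y_k\,\omega_{2N}^{-nk}, \qquad n = 0, 1, \ldots, N, \tag{10.20}$$
with
$$\varepsilon_0 = \varepsilon_N = 2; \qquad \varepsilon_1 = \cdots = \varepsilon_{N-1} = 1.$$
Formulas (10.16) and (10.20) are inverses of each other; they resolve
theoretically the problem of going from one representation to the other.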
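The passage between the coefficients $(a_n)$ and the values $(y_k)$ can be sketched as follows. This is a direct $O(N^2)$ illustration of (10.16) and of the inversion through the even $2N$-periodic extension, not the FFT-based procedure. It assumes $\omega_{2N} = e^{2i\pi/2N}$ and $N \ge 1$, and the function names are illustrative.

```python
import cmath
import math

def values_from_chebyshev(a):
    """(10.16): y_k = sum_n a_n cos(n k pi / N) for k = 0..N, with N = len(a) - 1."""
    N = len(a) - 1
    return [sum(a[n] * math.cos(n * k * math.pi / N) for n in range(N + 1))
            for k in range(N + 1)]

def chebyshev_from_values(y):
    """Recover a_0..a_N from y_0..y_N via a direct DFT of order 2N.

    Assumes w = exp(2*pi*i/(2N)) and N >= 1.
    """
    N = len(y) - 1
    # Even extension: y_{2N-k} = y_k for k = N+1, ..., 2N-1.
    ext = list(y) + [y[2 * N - k] for k in range(N + 1, 2 * N)]
    w = cmath.exp(2j * cmath.pi / (2 * N))
    a = []
    for n in range(N + 1):
        gamma = sum(ext[k] * w ** (-n * k) for k in range(2 * N))
        eps = 2 if n in (0, N) else 1          # eps_0 = eps_N = 2, otherwise 1
        a.append(gamma.real / (N * eps))       # gamma is real up to rounding
    return a

a = [0.5, -1.0, 2.0, 0.25]                 # arbitrary Chebyshev coefficients, N = 3
y = values_from_chebyshev(a)               # values at x_k = cos(k pi / N)
a_back = chebyshev_from_values(y)          # should reproduce a up to rounding
```

Because $x_0 = 1$ and $T_n(1) = 1$, the value `y[0]` equals the sum of the coefficients, which is a convenient check.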
10.4.3 Practical computation

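The work proceeds as follows:
(a) COMPUTE $(a_n)$ KNOWING $(y_k)$:
Step 1: Compute $y_{2N-k} = y_k$ for $k = 1, 2, \ldots, N-1$.
Step 2: The FFT algorithm of order $2N$:
$$(y_0, y_1, \ldots, y_{2N-1}) \xrightarrow{\ \mathscr{F}\ } (Y_0, Y_1, \ldots, Y_{2N-1}).$$
Step 3: $a_n = Y_n$ if $n = 1, 2, \ldots, N-1$; $a_0 = \frac{1}{2}Y_0$; $a_N = \frac{1}{2}Y_N$.
(We leave it to the reader to explain why the $Y_n$ are real.)
(b) COMPUTE $(y_k)$ KNOWING $(a_n)$:
Step 1: Compute the $c_n$ from (10.18).
Step 2:
$$Y_n = \begin{cases} c_n & \text{if } 0 \le n \le N, \\ c_{n-2N} & \text{if } N < n \le 2N-1. \end{cases}$$
Step 3: The inverse FFT algorithm of order $2N$:
$$(Y_0, Y_1, \ldots, Y_{2N-1}) \xrightarrow{\ \mathscr{F}^{-1}\ } (y_0, y_1, \ldots, y_{2N-1}).$$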
10.4.4 Cost of the computation

If we compute the cost with reference to Section 9.1, the total cost of the
computation is
$N(\log_2 N - 1)$ complex multiplications,
$2N(\log_2 N + 1)$ complex additions.
It is possible to reduce this cost by taking into consideration the fact that
the sequences are real and even: In (a) $y_{2N-k} = y_k$, and in (b) $Y_{2N-n} = Y_n$.
Thus the computation can be done with a discrete Fourier transform of
order $N/2$ rather than order $2N$. The cost is
$N(\log_2 N - 3)$ real multiplications,
$2N(3\log_2 N - 5)$ real additions.

EXAMPLE: Take $N = 32$.
The cost using formula (10.16) is $32^2 = 1024$ real multiplications.
The cost using the reduced FFT is $32(\log_2 32 - 3) = 64$ real multiplications.
10.4.5 A trigonometric interpolation problem

The function
$$f(\theta) = P(\cos\theta)$$
has by (10.12) and (10.13) an expression of the form
$$f(\theta) = \sum_{n=0}^{N} a_n \cos n\theta.$$
The $a_n$ are the Fourier coefficients of the even trigonometric polynomial $f$,
which has degree less than or equal to $N$. Equation (10.16) becomes
$$f\Bigl(k\frac{\pi}{N}\Bigr) = y_k, \qquad k = 0, 1, \ldots, N,$$
and (10.20) shows us how to find the coefficients $a_n$ of $f$ given the values
$y_k$ and thus how to solve numerically this particular interpolation problem.
10.4.6 Theorem Given any $N + 1$ real numbers $y_0, y_1, \ldots, y_N$, there
exists a unique trigonometric polynomial of the form
$$f(\theta) = \sum_{n=0}^{N} a_n \cos n\theta$$
that satisfies
$$f\Bigl(k\frac{\pi}{N}\Bigr) = y_k, \qquad k = 0, 1, \ldots, N.$$
The coefficients $a_n$ are given by (10.20).
10.5 Exercises
Exercise 10.1 Take two vectors of lengths M = 4 and Q = 3. Extend
them as described in Section 10.2 and verify that the computations with the new
vectors do indeed give the original convolution.
Exercise 10.2 Let $f(\theta)$ be the trigonometric polynomial in Theorem 10.4.6.
Show that integrating $f$ on $[0, 2\pi]$ using the trapezoidal method yields the correct
value of the integral; that is,
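Exercise 10.3
(a) Show that the function $f(\theta)$ in Theorem 10.4.6 can be written
$$f(\theta) = \sum_{k=0}^{N} f(\theta_k)\,g_k(\theta),$$
where
$$g_k(\theta) = \sum_{n=0}^{N} a_n^k \cos n\theta, \quad\text{and}\quad a_n^k = \frac{1}{N\varepsilon_n}\cos n\theta_k,$$
with $\varepsilon_0 = \varepsilon_N = 2$ and $\varepsilon_1 = \cdots = \varepsilon_{N-1} = 1$.

(b) Find expressions for the $g_k$ in terms of tangents and cotangents.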
Chapter IV
The Lebesgue Integral

Lesson 11
From Riemann to Lebesgue
We are going to introduce the Lebesgue integral here and in the next three
lessons. Experience has shown that this notion of integration is particularly
well suited for operations such as
taking limits under the integral sign,
taking derivatives under the integral sign,
interchanging the order of integration.
We do not intend to give a complete and rigorous development of the the-
ory of Lebesgue integration. Readers wishing a deeper understanding of
the fundamentals can consult any of numerous references such as [KF74],
[Hal64], and [Roy63]. We wish to go as directly as possible to the applica-
tions while at the same time presenting the essential ideas.
11.1 Some history

The idea of integration is based intuitively on the notion of area. Given a
positive, continuous function $f$ defined on $[a, b]$, the integral
$$\int_a^b f(x)\,dx$$
is the area of the region bounded by the curve y = f(x), the x-axis, and
the two lines x = a and x = b. It was in 1853 that Bernhard Riemann gave
a rigorous definition of the integral that bears his name.
For a fixed $n$, consider a partition $\Delta_n$ of the interval $[a, b]$ of the form
$$a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b$$
and define the size of the partition to be
$$\mu(\Delta_n) = \sup_i\,(x_i - x_{i-1}).$$
Form the sums
$$I(n) = \sum_{i=1}^{n} f(\xi_i)\,(x_i - x_{i-1}),$$
where $\xi_i \in (x_{i-1}, x_i)$ but is otherwise arbitrary. The integral of $f$ in the
sense of Riemann, or the Riemann integral of $f$, is the limit, if it exists, of
the $I(n)$ as $n \to \infty$ and $\mu(\Delta_n) \to 0$.
Riemann showed that the partial sums I(n) (called Riemann sums)
have a limit not only for continuous functions but also for a larger class
of functions, some of which have an infinite number of discontinuities. He
introduced a generalized integral of a function that is unbounded in the
neighborhood of an isolated point $c \in [a, b]$ as the limit, if it exists, of
$$\int_a^{c-\alpha} f(x)\,dx + \int_{c+\beta}^{b} f(x)\,dx$$
as $\alpha, \beta \to 0$, $\alpha, \beta > 0$. Riemann also proved that the Fourier coefficients of a
periodic, integrable function tend to zero. This result was later generalized
by Henri Lebesgue.
Riemann's work stimulated numerous studies aimed at generalizing the
definition of the integral to include the widest possible class of functions
(T. Stieltjes, E. Borel, H. Lebesgue, F. Riesz). It was around 1900 that
Lebesgue proposed his theory of integration. The Riemann sums are well
behaved for only a particular class of discontinuous functions; they require
that $f(x)$ not vary too much in the intervals $(x_{i-1}, x_i)$. Lebesgue inverted
the situation: He considered the range of values $[m, M]$ taken by $f(x)$ and
partitioned this interval into segments $(y_{i-1}, y_i)$. He then considered the
set of $x$ such that $y_{i-1} \le f(x) < y_i$. He gave this set a measure $m_i$ and
formed the sums
$$\sum_i m_i\,\eta_i, \qquad\text{where } y_{i-1} \le \eta_i < y_i.$$
The integral of $f$ on $[a, b]$ is obtained by passing to the limit as the size of
the partition of $[m, M]$ tends to zero.
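The two kinds of sums can be compared on a simple case. The sketch below takes $f(x) = x^2$ on $[0, 1]$, chosen for illustration because the set $\{x : y_{i-1} \le f(x) < y_i\}$ is then the interval $[\sqrt{y_{i-1}}, \sqrt{y_i})$, whose length is known exactly. The function names and the choice $\eta_i = y_{i-1}$ are assumptions of the sketch.

```python
import math

def riemann_sum(f, a, b, n):
    """Riemann sum on a uniform partition of [a, b], with xi_i the midpoints."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) * h for i in range(n))

def lebesgue_sum_square(n):
    """Lebesgue-type sum for f(x) = x**2 on [0, 1].

    The range [0, 1] is cut into n equal segments [y_{i-1}, y_i); m_i is the
    length of {x : y_{i-1} <= x**2 < y_i}, and eta_i is taken to be y_{i-1}.
    """
    total = 0.0
    for i in range(1, n + 1):
        y_lo, y_hi = (i - 1) / n, i / n
        m_i = math.sqrt(y_hi) - math.sqrt(y_lo)   # measure of the level set
        total += m_i * y_lo
    return total

r = riemann_sum(lambda x: x * x, 0.0, 1.0, 1000)
l = lebesgue_sum_square(1000)
```

Both sums approach $1/3$. With $\eta_i = y_{i-1}$ the Lebesgue-type sum approaches from below, with an error of at most the size $1/n$ of the partition of the range.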
This description, while simplistic, introduces the notion of the measure
of a set of real numbers whose structure can be quite complicated. In fact,
Lebesgue's work on integration was a natural continuation of the theory
of measure initiated by E. Borel and the researches of R. Baire on the
structure of sets. Classical presentations of the theory of integration follow
this historical development: measure theory followed by integration theory.
We will follow this plan.
11.2 Another point of view

It is very convenient when using arguments that involve taking limits to
be working in complete spaces. If not, the "limit" of a Cauchy sequence
is a foreign object. The situation for arguments about sets of functions is
analogous to that for sets of numbers: While practical computations may
be made within the rationals $\mathbb{Q}$ (or even a subset of $\mathbb{Q}$), the complete set of
reals $\mathbb{R}$ is very useful, indeed almost indispensable, for studying convergent
sequences. In the same way, we use the Riemann integral for practical
computations. On the other hand, taking limits and differentiating under
the integral sign are theoretical operations that are greatly facilitated with
the Lebesgue integral.
The space $C^0[a, b]$ of continuous functions on $[a, b]$ endowed with the
norm
$$\|f\|_1 = \int_a^b |f(x)|\,dx$$
is not complete. For continuous functions, the Riemann and Lebesgue
integrals are the same. To complete this space it is necessary to extend the
class of functions for which
$$\int_a^b |f(x)|\,dx$$
exists and is finite. The set of Riemann integrable functions is included in
this completion, but there are Cauchy sequences of continuous functions
in the norm $\|\cdot\|_1$ that do not converge to Riemann integrable functions.
The theory of Lebesgue integration provides a "constructive" process to
complete $C^0[a, b]$ in the $\|\cdot\|_1$ norm.
11.3 By way of transition

At the end of the day, the waiter, B. Riemann, and the owner of a café, H.
Lebesgue, both verify the day's receipts. M. Riemann has kept a copy of
each customer's bill. He computes the day's total income as
$$S_R = t_1 + t_2 + \cdots + t_N.$$
M. Lebesgue, on the other hand, must prepare the bank deposit, and he
sorts the money from the cash register by denomination. (It's a French café
that rounds things off to the nearest franc, so only 1 F, 2 F, 5 F, etc. appear
in the register.) Thus, M. Lebesgue counts how many items $n_k$ there are of
each denomination $D_k$. He computes the total for the deposit as
$$S_L = n_0 D_0 + n_1 D_1 + \cdots + n_M D_M.$$
Assuming no pilfering or added tips, M. Riemann and M. Lebesgue will
agree that $S_R = S_L$.
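The two ways of counting can be mimicked in a few lines. The bills below are hypothetical amounts invented for this sketch; only the two methods of summation come from the text.

```python
from collections import Counter

# Hypothetical bills: each inner list holds the coins and notes (in francs)
# handed over for one customer's bill.
bills = [[5, 2], [10, 1], [20, 5], [10, 5, 2], [10, 5, 1]]

# Riemann: add up the bills one after another, t_1 + t_2 + ... + t_N.
s_r = sum(sum(bill) for bill in bills)

# Lebesgue: count the number n_k of items of each denomination D_k,
# then form n_0*D_0 + n_1*D_1 + ... + n_M*D_M.
counts = Counter(coin for bill in bills for coin in bill)
s_l = sum(n_k * d_k for d_k, n_k in counts.items())
```

Here `counts[d]` plays the role of the measure of the set $f^{-1}(d)$: it records how many items carry the value $d$, without regard to which bill they came from.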
Figures 11.1 and 11.2 represent the two situations. We see that the two
men measure the x-axis differently. In particular, M. Lebesgue measures
$f^{-1}(1)$, $f^{-1}(2)$, $f^{-1}(5)$, .... More precisely, he counts these sets; for
example, his measure for $f^{-1}(10)$ is 3.
[Figure: items in the register plotted by denomination (1 F, 2 F, 5 F, 10 F, 20 F) and grouped bill by bill; first bill 7 F, second bill 11 F; $S_R$ = sum of the $a_i$.]
FIGURE 11.1. Riemann's point of view.
[Figure: the same items counted by denomination; $S_L$ = (1 F × 2) + (2 F × 2) + (5 F × 4) + (10 F × 3) + (20 F × 1).]
FIGURE 11.2. Lebesgue's point of view.

Lesson 12
Measuring Sets
The goal of measure theory is to extend the notions of length (of an interval
in $\mathbb{R}$), of area (of a rectangle in $\mathbb{R}^2$), of volume, and so on, to more complex
sets. In general, given a set $X$ (which in applications will be a part of $\mathbb{R}^n$),
one considers a restricted family of subsets of $X$ called a σ-algebra, and it is
on the elements of this σ-algebra that one defines a measure. A measure is
a function with values in $\mathbb{R}_+ \cup \{+\infty\} = \overline{\mathbb{R}}_+$, where $\mathbb{R}_+ = \{x \in \mathbb{R} \mid x \ge 0\}$,
that has certain desirable additivity properties.
12.1 Measurable sets and measure

Intuitively, if the sets $D_1$ and $D_2$ have a measure, then the sets $D_1 \cup D_2$,
$D_1 \cap D_2$, $D_1 \setminus D_2$, and $D_2 \setminus D_1$ should also have a measure.
12.1.1 Definition Given a set $X$, let $\mathcal{P}(X)$ be the class of all subsets
of $X$. $\mathcal{T} \subset \mathcal{P}(X)$ is a σ-algebra if the following hold:
(i) $\emptyset$ and $X$ are in $\mathcal{T}$;
(ii) $S \in \mathcal{T}$ implies $X \setminus S \in \mathcal{T}$;
(iii) $S_1, S_2, \ldots \in \mathcal{T} \Rightarrow \bigcup_{n=1}^{\infty} S_n \in \mathcal{T}$.
$\mathcal{T}$ is said to be closed under the formation of complements and countable
unions. Note that $\mathcal{P}(X)$ is itself a σ-algebra, but in general it contains
too many sets to be of interest.
12.1.2 Definition Let $\mathcal{T}$ be a σ-algebra on $X$. The elements of $\mathcal{T}$
are called measurable sets and the pair $(X, \mathcal{T})$ is called a measurable
space.
A σ-algebra is typically generated from a given collection of subsets of
$X$ called the generators of the algebra. Let $\mathcal{C}$ be a subset of $\mathcal{P}(X)$. The
intersection of all the σ-algebras containing $\mathcal{C}$ is a σ-algebra. It is the
smallest σ-algebra containing $\mathcal{C}$, and this leads to the next definition.
12.1.3 Definition Suppose $\mathcal{C} \subset \mathcal{P}(X)$. The smallest σ-algebra containing
$\mathcal{C}$ is called the σ-algebra generated by $\mathcal{C}$.
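On a finite set the generated σ-algebra can be computed explicitly, because countable unions reduce to finite ones. The following is a minimal sketch for that finite case only; it says nothing about how Borel sets are built. It closes the generators under complements and pairwise unions until nothing new appears, and the function name is illustrative.

```python
from itertools import combinations

def generated_sigma_algebra(X, generators):
    """Smallest family containing the generators, the empty set and X that is
    closed under complement and union (finite X only)."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for s in current:                       # close under complements
            if X - s not in family:
                family.add(X - s)
                changed = True
        for s, t in combinations(current, 2):   # close under pairwise unions
            if s | t not in family:
                family.add(s | t)
                changed = True
    return family

sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
```

In this example the generators $\{1\}$ and $\{2\}$ cannot separate 3 from 4, so the result consists of the $2^3 = 8$ unions of the blocks $\{1\}$, $\{2\}$, $\{3, 4\}$.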
As an example, take $X = \mathbb{R}^n$ and let $\mathcal{C}$ be the collection of open sets
in $\mathbb{R}^n$. $\mathcal{C}$ is not a σ-algebra because the complement of an open set is
not, in general, open. The elements of the σ-algebra generated by $\mathcal{C}$ are called Borel
sets. It is not possible to give an explicit description of this algebra using
intersections and unions of open sets.
We now come to the definition of a measure on a σ-algebra.
12.1.4 Definition Let $\mathcal{T}$ be a σ-algebra on $X$. A measure on $\mathcal{T}$
is a function $\mu : \mathcal{T} \to \overline{\mathbb{R}}_+$ having the following properties:
(i) $\mu(\emptyset) = 0$.
(ii) If $S_n$ is a sequence of disjoint measurable sets, then
$$\mu\Bigl(\bigcup_{n=1}^{\infty} S_n\Bigr) = \sum_{n=1}^{\infty} \mu(S_n).$$
The triple $(X, \mathcal{T}, \mu)$ is called a measure space.
12.1.5 Examples Let $X$ be an arbitrary set, $\mathcal{T}$ a σ-algebra on $X$,
and $a$ an element of $X$. Define $\mu_a$ and $\mu_d$ as follows:
(a) $\mu_a(S) = \begin{cases} 1 & \text{if } a \in S, \\ 0 & \text{otherwise.} \end{cases}$
(b) $\mu_d(S) = \begin{cases} \text{the number of elements of } S & \text{if } S \text{ is finite,} \\ +\infty & \text{otherwise.} \end{cases}$
Then $\mu_a$ and $\mu_d$ are measures on $\mathcal{T}$. However, these measures do not
generalize the notion of length when $X = \mathbb{R}$!
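The two examples are easy to experiment with on finite sets. This is a minimal sketch in which Python sets stand in for measurable sets and the function names are illustrative. Infinite sets, for which $\mu_d$ equals $+\infty$, are outside its scope.

```python
def dirac_measure(a):
    """Return mu_a, the map S -> 1 if a is in S, 0 otherwise."""
    return lambda S: 1 if a in S else 0

def counting_measure(S):
    """mu_d restricted to finite sets: the number of elements of S."""
    return len(S)

mu_a = dirac_measure(3)
S1, S2 = {1, 2}, {3, 4, 5}          # two disjoint sets

# Additivity on a disjoint union, checked for both measures.
dirac_additive = mu_a(S1 | S2) == mu_a(S1) + mu_a(S2)
counting_additive = counting_measure(S1 | S2) == counting_measure(S1) + counting_measure(S2)
```

Neither measure sees lengths: an interval of $\mathbb{R}$ containing $a$ has $\mu_a$-measure 1 whatever its size, and any nondegenerate interval has $\mu_d$-measure $+\infty$.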
In practice, measures are not generally defined directly on a σ-algebra.
Usually a measure is defined on a smaller collection of sets and then
extended to the σ-algebra generated by these sets. This is the case for
Lebesgue measure on $\mathbb{R}^n$.
12.1.6 Definition Given a set $X$, consider $\mathcal{A} \subset \mathcal{P}(X)$. $\mathcal{A}$ is an
algebra if and only if it satisfies the following conditions:
(i) $\emptyset$ and $X$ are in $\mathcal{A}$;
(ii) $S \in \mathcal{A}$ implies $X \setminus S \in \mathcal{A}$;
(iii) $S_1, S_2 \in \mathcal{A}$ implies $S_1 \cup S_2 \in \mathcal{A}$.
Thus an algebra is closed under complements and finite unions. A function
$\nu$ defined on an algebra $\mathcal{A}$ with values in $\overline{\mathbb{R}}_+$ is said to be a measure
on $\mathcal{A}$ if it satisfies the following conditions:
(i) $\nu(\emptyset) = 0$.
(ii) If $S_n$ is a sequence of pairwise disjoint sets in $\mathcal{A}$ such that $\bigcup_{n=1}^{\infty} S_n$
is in $\mathcal{A}$, then
$$\nu\Bigl(\bigcup_{n=1}^{\infty} S_n\Bigr) = \sum_{n=1}^{\infty} \nu(S_n).$$
Given a measure on an algebra $\mathcal{A}$, it can be extended to a σ-algebra
containing $\mathcal{A}$. This is the content of the following theorem, which we will
assume (see, for example, [Hal64]).
12.1.7 Theorem Let $\nu$ be a measure on an algebra $\mathcal{A}$ of $X$. There
exists a measure $\mu$ and a σ-algebra $\mathcal{T}$ containing $\mathcal{A}$ with the following
properties:
(i) For all $S \in \mathcal{A}$, $\nu(S) = \mu(S)$.
(ii) If $S_1 \in \mathcal{T}$ and $\mu(S_1) = 0$, then for all $S_2 \subset S_1$, $S_2 \in \mathcal{T}$ and
$\mu(S_2) = 0$.
(iii) If $X = \bigcup_{n=1}^{\infty} S_n$ with $S_n \in \mathcal{T}$ and $\mu(S_n) < +\infty$, then $\mu$ is uniquely
determined on the smallest σ-algebra that contains $\mathcal{A}$ and satisfies
condition (ii).
This procedure is used on $\mathbb{R}$ to define Lebesgue measure. One can take
the elements of $\mathcal{A}$ to be the sets that are finite unions of intervals. The
measure $\nu$ is defined for intervals by
$$\nu\bigl((a, b)\bigr) = \begin{cases} |b - a| & \text{if } (a, b) \text{ is a bounded interval,} \\ +\infty & \text{otherwise.} \end{cases} \tag{12.1}$$
For a finite union of disjoint intervals $I_n$, $\nu$ is defined by
$$\nu\Bigl(\bigcup_{n=1}^{p} I_n\Bigr) = \sum_{n=1}^{p} \nu(I_n).$$
It is possible to show that $\nu$ is a measure on $\mathcal{A}$. (Note that the delicate
part of the argument is to prove condition (ii), the countable additivity.)
From Theorem 12.1.7 we know that $\nu$ can be extended to a measure $\mu$
on a σ-algebra $\mathcal{L}$ containing $\mathcal{A}$ and hence containing the Borel sets. This
measure $\mu$ is constructed by defining
$$\mu(A) = \inf\Bigl\{\sum_{n=1}^{\infty} \nu(I_n) \Bigm| A \subset \bigcup_{n=1}^{\infty} I_n\Bigr\},$$
where $I_n$ is an open interval in $\mathbb{R}$.
$\mathcal{L}$ denotes the σ-algebra of Lebesgue-measurable subsets of $\mathbb{R}$, and $\mu$
is Lebesgue measure on $\mathbb{R}$. Recall from (ii) of Theorem 12.1.7 that every
subset $T$ of a measurable set $S$ for which $\mu(S) = 0$ is a measurable set and
$\mu(T) = 0$. Since the sets of measure zero play such an important role in
integration theory, we devote a section to discussing them.
12.2 Sets of measure zero

Given a measure space $(X, \mathcal{T}, \mu)$, if $S_1, S_2 \in \mathcal{T}$ and if $S_1 \subset S_2$, then
it follows from Definition 12.1.4 that $\mu(S_1) \le \mu(S_2)$. Now suppose that
$S \in \mathcal{T}$ and that $\mu(S) = 0$. If $G \subset S$ belongs to $\mathcal{T}$, then $G$ is also a set
of measure zero. However, if $G$ is not a member of $\mathcal{T}$, we can say nothing
about $\mu(G)$ because it is not defined. The measure space $(X, \mathcal{T}, \mu)$ is said
to be complete if all the subsets of sets of measure zero are measurable, that
is, if $G \subset S$ and $\mu(S) = 0$ imply that $G \in \mathcal{T}$. The Lebesgue measure space
$(\mathbb{R}, \mathcal{L}, \mu)$ is complete. On the other hand, the measure space $(\mathbb{R}, \mathcal{B}, \mu)$,
where $\mathcal{B}$ is the σ-algebra of Borel sets and $\mu$ is Lebesgue measure, is not
complete.
In practice, we deal mostly with sets of measure zero (null sets) that
are finite or countably infinite. There are, however, null sets that have the
cardinality of the continuum; the Cantor middle third set is an example.
12.2.1 Examples of sets of measure zero in $\mathbb{R}$

(a) $S = \{a\}$, $a \in \mathbb{R}$. Let $I_n = (a - 1/n, a + 1/n)$. $S = \bigcap_{n=1}^{\infty} I_n$ and
$\mu(I_n) = 2/n$ by (12.1). Since $\mu(S) \le 2/n$ for all $n \in \mathbb{N}$, $\mu(S) = 0$.
(b) $S = \{a_1, a_2, \ldots, a_p\}$, $a_n \in \mathbb{R}$. Then $\mu(S) = \sum_{n=1}^{p} \mu(\{a_n\}) = 0$.
(c) $S = \bigcup_{n=1}^{\infty} S_n$, where $\mu(S_n) = 0$ and $S_i \cap S_j = \emptyset$ for all $i \ne j$. From
Definition 12.1.4(ii) we have
$$\mu(S) = \sum_{n=1}^{\infty} \mu(S_n) = 0.$$
For example, the set of rational numbers, $\mathbb{Q}$, has measure zero.
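The fact that a countable set has measure zero can also be seen through coverings by open intervals, in the spirit of the definition of $\mu$ as an infimum over such coverings. The sketch below illustrates this standard argument, which is different from the disjoint-union argument in (c): the $n$-th point of a list is covered by an interval of length $\varepsilon/2^n$, so the total length stays below $\varepsilon$ however many points are listed. The enumeration of rationals and the function names are illustrative.

```python
from fractions import Fraction

def covering_intervals(points, eps):
    """Open intervals I_n centred at the n-th point, of length eps / 2**n (n = 1, 2, ...)."""
    return [(p - eps / 2 ** (n + 1), p + eps / 2 ** (n + 1))
            for n, p in enumerate(points, start=1)]

def total_length(intervals):
    """Sum of the lengths of the intervals (an upper bound for the measure of their union)."""
    return sum(hi - lo for lo, hi in intervals)

# The rationals in [0, 1] with denominator at most 7, each listed once in lowest terms.
rationals = [Fraction(p, q) for q in range(1, 8) for p in range(q + 1)
             if Fraction(p, q).denominator == q]
eps = Fraction(1, 100)
cover = covering_intervals(rationals, eps)
length = total_length(cover)
```

Exact rational arithmetic is used so that the comparison with $\varepsilon$ involves no rounding. Since $\varepsilon$ is arbitrary, the same construction applied to a full enumeration of $\mathbb{Q}$ shows that its measure is 0.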
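12.2.2 Proposition If $S_n$ is a sequence of sets of measure zero, then
$S = \bigcup_{n \in \mathbb{N}} S_n$ has measure zero.
Proof. We reduce this to the case where the sets are pairwise disjoint by
defining
$$T_1 = S_1, \qquad T_n = S_n \setminus \bigcup_{k=1}^{n-1} S_k, \quad n \ge 2.$$
We have $S = \bigcup_{n=1}^{\infty} T_n$ and $\mu(S) = \sum_{n=1}^{\infty} \mu(T_n)$. By construction, $T_n \subset S_n$,
so $0 = \mu(S_n) \ge \mu(T_n)$ and therefore $\mu(T_n) = 0$. Hence $\mu(S) = 0$. $\square$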
12.2.3 Definition We say that a property $P$ is true (holds) almost
everywhere, which we denote by a.e., if the set where $P$ is not true (does
not hold) is a set of measure zero.
For example, a function $f$ is said to be zero a.e. if $S = \{x \in X \mid f(x) \ne 0\}$
is measurable and $\mu(S) = 0$. The function defined by
$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}, \\ 0 & \text{otherwise} \end{cases}$$
is zero almost everywhere. We write $f = 0$ a.e.
We see in this example that it is necessary to measure sets of the form
$f^{-1}(S) = \{x \in X \mid f(x) \in S\}$. In fact, this requirement appeared in the
brief description of the Lebesgue integral given in Lesson 11.
12.3 Measurable functions

It is useful to consider functions taking values in $\overline{\mathbb{R}} = \mathbb{R} \cup \{+\infty\} \cup \{-\infty\}$.
We add the following conventions to the usual operations:
(i) For all $a \in \mathbb{R}$, $a \pm \infty = \pm\infty$.
(ii) For all $a > 0$, $a \times (\pm\infty) = \pm\infty$.
(iii) $0 \times (\pm\infty) = 0$.
The operations $+\infty + (-\infty)$ and $+\infty - (+\infty)$ remain undefined.
We deal with complex-valued functions $f$ by decomposing them into their
real and imaginary parts: $f = g + ih$.
12.3.1 Definition Let $(X, \mathcal{T})$ be a measurable space.
(i) A function $f : X \to \overline{\mathbb{R}}$ is said to be measurable if for all real $a$ the set
$f^{-1}((a, +\infty]) = \{x \in X \mid f(x) > a\}$ belongs to $\mathcal{T}$.
(ii) A function $f : X \to \mathbb{C}$ of the form $g + ih$ is measurable if $g$ and $h$ are
measurable.
12.3.2 Proposition Let $\mathcal{B}$ denote the σ-algebra of Borel sets on $\mathbb{R}^n$,
and let $f : \mathbb{R}^n \to \mathbb{R}$ be a continuous function. Then $f$ is measurable.
Proof. Since $f$ is continuous, $f^{-1}((a, +\infty]) = f^{-1}((a, +\infty))$ is an open
set in $\mathbb{R}^n$; hence it is a Borel set. $\square$
Since every Borel set is Lebesgue measurable, this proves that all continuous
functions are measurable on $(\mathbb{R}, \mathcal{L})$, where $\mathcal{L}$ is the Lebesgue σ-algebra.
We note, however, that there exist many measurable functions that are not continuous.
12.3.3 Example The characteristic function of a measurable set $A$ is
denoted by $\chi_A$ and is defined by
$$\chi_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases}$$
In practice, it is easy to verify that the functions being used are measurable.
The following properties are particularly useful in this regard.
12.3.4 Proposition Let $(X, \mathcal{T})$ be a measurable space and suppose
that $f$ and $g$ are measurable functions on $X$ with values in $\overline{\mathbb{R}}$. With the
conventions for computing in $\overline{\mathbb{R}}$, the following functions are also measurable:
(i) $af$ for all $a$ in $\mathbb{R}$;
(ii) $f + g$ (when this sum is defined);
(iii) $fg$;
(iv) $\max(f, g)$ and $\min(f, g)$;
(v) $f^+ = \max(f, 0)$ and $f^- = \max(-f, 0)$;
(vi) $|f|$.
Proof. All of these functions can be shown to be measurable by direct
reference to Definition 12.3.1. $\square$
The next result addresses limit processes and measurability.
12.3.5 Proposition Let $(X, \mathcal{T})$ be a measurable space and assume
that $f_n$, $n \in \mathbb{N}$, is a sequence of measurable functions from $X$ to $\overline{\mathbb{R}}$. Then
we have the following results:
(i) $f(x) = \inf_{n \in \mathbb{N}} f_n(x)$ and $g(x) = \sup_{n \in \mathbb{N}} f_n(x)$ are measurable functions.
(ii) If the pointwise limit of $f_n$ exists, it is measurable.
Proof. We show as an example that $g$ is measurable. Note first that $g$ is
defined for all $x \in X$ (possibly with $g(x) = +\infty$). The set
$$S = \{x \in X \mid g(x) > a\} = \bigcup_{n=1}^{\infty} \{x \in X \mid f_n(x) > a\}$$
is the union of a countable number of measurable sets. Hence $S \in \mathcal{T}$, and
$g$ is measurable. $\square$
The class of measurable step functions plays a fundamental role in the
theory of integration.
12.3.6 Definition Let $(X, \mathcal{T})$ be a measurable space. A function $e$
defined on $X$ is called a step function, or simple function, if there exists
a finite number of measurable sets $S_1, S_2, \ldots, S_n$ and $n$ finite real values
$\alpha_1, \alpha_2, \ldots, \alpha_n$ such that
$$e = \sum_{i=1}^{n} \alpha_i \chi_{S_i}, \tag{12.2}$$
where $\chi_{S_i}$ is the characteristic function of $S_i$.

Note that the representation (12.2) is by no means unique. A step function
is a measurable function that takes a finite number of values. The sum,
product, absolute value, etc. of step functions are again step functions. The
next result highlights the importance of step functions.
12.3.7 Proposition Let $(X, \mathcal{T})$ be a measurable space and assume
that the function $f : X \to \overline{\mathbb{R}}$ is measurable. Then the following hold:
(i) $f$ is the pointwise limit of a sequence $e_n$ of step functions. If $f$ is
bounded, the $e_n$ can be chosen such that the limit is uniform.
(ii) If $f$ is positive, the $e_n$ can be chosen to be positive and increasing.
Proof. Assume that $f$ is positive. Define the sequence $e_n$ by
$$e_n(x) = \begin{cases} 2^{-n}k & \text{if } 2^{-n}k \le f(x) < 2^{-n}(k+1), \text{ where } k = 0, 1, \ldots, 2^{2n} - 1, \\ 2^n & \text{if } 2^n \le f(x). \end{cases}$$
Then $e_n$ is a step function that takes at most $2^{2n} + 1$ nonnegative values. It is
easy to see that the sequence $e_n$ is increasing. If $f(x) < +\infty$, then there
is an $N$ such that for all $n \ge N$, $0 \le f(x) - e_n(x) \le 1/2^n$, which proves
convergence. If $f(x) = +\infty$, we have $e_n(x) = 2^n$ for all $n \in \mathbb{N}$, and again
$$\lim_{n \to \infty} e_n(x) = f(x).$$
This proves (ii). To establish (i), it is sufficient to decompose $f$ into its
positive and negative parts, $f = f^+ - f^-$. We leave it to the reader to show
that the convergence is uniform when $f$ is bounded. $\square$
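The construction in the proof is explicit enough to be computed pointwise. The following minimal sketch evaluates $e_n(x)$ from the value $f(x)$ alone; the function name is illustrative, and `math.inf` stands in for $+\infty$.

```python
import math

def step_approx(value, n):
    """e_n(x) from the proof of Proposition 12.3.7, given value = f(x) >= 0.

    Returns 2**n when f(x) >= 2**n (including f(x) = +inf); otherwise returns
    k / 2**n for the unique integer k with k / 2**n <= f(x) < (k + 1) / 2**n.
    """
    if value >= 2 ** n:
        return float(2 ** n)
    k = math.floor(value * 2 ** n)
    return k / 2 ** n

# Successive approximations of the value f(x) = pi.
seq = [step_approx(math.pi, n) for n in range(1, 20)]
```

For $f(x) = \pi$ the first terms are 2, 3, 3.125: the cap $2^n$ is active for $n = 1$ only, after which the error is at most $2^{-n}$.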
The integration of measurable functions will be studied in the next lesson.
The set of measurable functions is extensive, and the functions that occur in
practice are always measurable. The question of measurability is not a point
of difficulty for the practical application of the theorems of integration.
12.4 Exercises
Exercise 12.1 Let $X$ be a set and $\mathcal{C} \subset \mathcal{P}(X)$. Show that the intersection
of all the σ-algebras containing $\mathcal{C}$ is a σ-algebra.
Exercise 12.2
(a) Show that every closed set $F$ of $\mathbb{R}^n$ is a Borel set of $\mathbb{R}^n$. For example,
$\{a\}$ with $a \in \mathbb{R}^n$ is a Borel set.
(b) Show that $\mathbb{Q}$ and $\mathbb{R} \setminus \mathbb{Q}$ are Borel sets in $\mathbb{R}$.
Exercise 12.3 Show that the Borel σ-algebra on $\mathbb{R}$ is generated by each of
the following families of sets in $\mathbb{R}$, $a, b \in \mathbb{R}$:
(a) $(-\infty, a)$; (b) $(-\infty, a]$; (c) $[a, b)$; (d) the closed sets.
*Exercise 12.4 Let $(X, \mathcal{T}, \mu)$ be a measure space.
(a) Let $A_n$ be a decreasing sequence of measurable sets. Write
$$A = \bigcap_{n=1}^{\infty} A_n,$$
and assume that $\mu(A_1) < \infty$. Show that $\mu(A) = \lim_{n \to \infty} \mu(A_n)$.
(b) Let $B_n$ be an increasing sequence of measurable sets. Show that
$$\mu\Bigl(\bigcup_{n=1}^{\infty} B_n\Bigr) = \lim_{n \to \infty} \mu(B_n).$$
Hint for (a): Write $A_n$ as a countable union of pairwise disjoint measurable sets:
Exercise 12.5 Show that in $\mathbb{R}$ the family $\mathcal{A}$ of finite unions of intervals is an
algebra.
Exercise 12.6 Verify that $\mu_a$ and $\mu_d$ defined in Section 12.1.5 are measures.
Exercise 12.7 Let $(X, \mathcal{T}, \mu)$ be a measure space. Consider the space $E$ of
measurable functions with values in $\overline{\mathbb{R}}$. Show that the relation $f \sim g$ if $f - g = 0$
almost everywhere (a.e.) is an equivalence relation on $E$.
Exercise 12.8 What is the Lebesgue measure of $\mathbb{Q} \cap (0, 1]$? Of the irrationals
in $(0, 1]$?
Exercise 12.9 Let $(X, \mathcal{T})$ be a measurable space and let $f$ be a measurable
function from $X$ to $\overline{\mathbb{R}}$. Show that the following sets are measurable for all
$a \in \mathbb{R}$:
$\{x \in X \mid f(x) \ge a\}$; $\{x \in X \mid f(x) \le a\}$;
$\{x \in X \mid f(x) < a\}$; $\{x \in X \mid f(x) = a\}$.
Exercise 12.10 Let $(X, \mathcal{T})$ be a measurable space and assume that $f$, $g$
are measurable functions from $X$ to $\mathbb{R}$. Show that the following sets are measurable:
$S_1 = \{x \in X \mid f(x) < g(x)\}$;
$S_2 = \{x \in X \mid f(x) = g(x)\}$;
$S_3 = \{x \in X \mid f(x) \le g(x)\}$.
Hint: Write
$$S_1 = \bigcup_{q \in \mathbb{Q}} \bigl(\{x \in X \mid f(x) < q\} \cap \{x \in X \mid g(x) > q\}\bigr).$$
Exercise 12.11 Use Exercise 12.9 to prove Proposition 12.3.4.
Hint for (iii): Write
$$fg = \frac{1}{4}\bigl((f + g)^2 - (f - g)^2\bigr).$$
Exercise 12.12 Show that $\chi_A$ is measurable if and only if $A$ is measurable.

Lesson 13
Integrating Measurable
Functions
Measure theory is the difficult part of developing the Lebesgue integral.

Now that we have a measure space $(X, \mathcal{T}, \mu)$ at our disposal, we are
going to define the integral for measurable functions on X with real or
complex values. Here and in the rest of the book, when we speak of a
function we will mean a measurable function; most of the time we will not
mention specifically that the function is measurable. This lesson contains
the elementary properties of the Lebesgue integral, including a statement
of the monotone convergence theorem.
13.1 Constructing the integral

We are going to develop the integral in the following three steps:
(i) Define the integral of a nonnegative simple function.
(ii) Define the integral of a function with values in $\overline{\mathbb{R}}_+$.
(iii) Define the integral of real- or complex-valued functions.
13.1.1 Definition Assume that $(X, \mathcal{T}, \mu)$ is a measure space and
that $e = \sum_{i=1}^{n} \alpha_i \chi_{S_i}$ is a nonnegative simple function on $X$ ($e(x) \ge 0$).
The integral of $e$ on $X$ with respect to the measure $\mu$ is the nonnegative
number (possibly $+\infty$) denoted by $\int_X e\,d\mu$ and defined by
$$\int_X e\,d\mu = \sum_{i=1}^{n} \alpha_i\,\mu(S_i).$$
We say that $e$ is integrable if $\int_X e\,d\mu$ is finite.
We will assume that the value of $\sum_{i=1}^{n} \alpha_i\,\mu(S_i)$ does not depend on the
particular representation of $e$ (see Definition 12.3.6), hence that the integral
of $e$ is well-defined.
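The independence from the representation can be observed on a small case. The sketch below assumes that each $S_i$ is a finite union of disjoint bounded intervals of $\mathbb{R}$, so that its Lebesgue measure is a sum of lengths. The data and function names are illustrative.

```python
def length(intervals):
    """Lebesgue measure of a finite union of disjoint bounded intervals (lo, hi)."""
    return sum(hi - lo for lo, hi in intervals)

def integral_simple(terms):
    """sum_i alpha_i * mu(S_i) for a simple function given as [(alpha_i, S_i), ...]."""
    return sum(alpha * length(S) for alpha, S in terms)

# Two representations of the same function, equal to 2 on [0, 1) and 3 on [1, 2):
rep1 = [(2, [(0, 1)]), (3, [(1, 2)])]     # 2*chi_[0,1) + 3*chi_[1,2)
rep2 = [(2, [(0, 2)]), (1, [(1, 2)])]     # 2*chi_[0,2) + 1*chi_[1,2)

i1 = integral_simple(rep1)
i2 = integral_simple(rep2)
```

Both representations give the value 5, even though the sets $S_i$ in the second one overlap.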
If $E \subset X$ is measurable, the integral of $e$ on $E$ is defined by
$$\int_E e\,d\mu = \int_X e\,\chi_E\,d\mu.$$
For the characteristic function of $E$ we have
$$\int_X \chi_E\,d\mu = \mu(E).$$
Proposition 12.3.7 is used to extend this definition to a function defined on
$X$ with values in $\overline{\mathbb{R}}_+ = \mathbb{R}_+ \cup \{+\infty\}$.
13.1.2 Definition (integral of a nonnegative function)
Let $(X, \mathcal{T}, \mu)$ be a measure space and assume that $f : X \to \overline{\mathbb{R}}_+$ is a
nonnegative measurable function. The integral of $f$ with respect to the
measure $\mu$ is the nonnegative number (possibly $+\infty$) denoted by $\int_X f\,d\mu$
and defined by
$$\int_X f\,d\mu = \sup\Bigl\{\int_X e\,d\mu \Bigm| 0 \le e \le f,\ e \text{ simple}\Bigr\}.$$
We say that $f$ is integrable if $\int_X f\,d\mu$ is finite.
If $E$ is measurable, we define the integral of $f$ on $E$ as before:
$$\int_E f\,d\mu = \int_X f\,\chi_E\,d\mu.$$
13.1.3 Examples The measure $\mu$ is Lebesgue measure in the following
examples.
(a) $X = (0, 1]$ and
$$f(x) = \begin{cases} 0 & \text{if } x \in \mathbb{Q}, \\ 1 & \text{otherwise.} \end{cases}$$
$\int_X f\,d\mu = 1$, and one can show that $f$ is not integrable in the sense of
Riemann.
(b) $X = [0, 1]$ and
$$f(x) = \begin{cases} 1/x & \text{if } x > 0, \\ 0 & \text{if } x = 0. \end{cases}$$
One can show that $\int_X f\,d\mu = +\infty$ by applying Definition 13.1.2.
(c) $X = \mathbb{R}$, $f : X \to \overline{\mathbb{R}}_+$, and $E$ is a set of measure zero. Then it follows
from the definition that $\int_E f\,d\mu = 0$: Given a simple function $e$ such that
$0 \le e \le f$, one has $\int_E e\,d\mu = 0$.
The integral of a function defined on $X$ with values in $\overline{\mathbb{R}}$ is obtained
by decomposing $f$ into its positive and negative parts: $f = f^+ - f^-$. The
functions $f^+$ and $f^-$ are measurable and nonnegative (Proposition 12.3.4).
It is then natural to define the integral of $f$ on $E$ by
$$\int_E f\,d\mu = \int_E f^+\,d\mu - \int_E f^-\,d\mu$$
for all measurable sets $E$. The two integrals on the right make sense; however,
in case they are both $+\infty$, $\int_E f\,d\mu$ is not defined. This leads to the
following definition.
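13.1.4 Definition Let $(X, \mathcal{T}, \mu)$ be a measure space and $f : X \to \overline{\mathbb{R}}$
a measurable function. $f$ is said to be integrable on the set $E$ if $\int_E f^+\,d\mu$
and $\int_E f^-\,d\mu$ are finite. In this case, the integral on $E$ is defined by
$$\int_E f\,d\mu = \int_E f^+\,d\mu - \int_E f^-\,d\mu.$$
When $f$ is complex-valued, the integral of $f = g + ih$ on $E$ is defined by
$$\int_E f\,d\mu = \int_E g\,d\mu + i\int_E h\,d\mu.$$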
13.2 Elementary properties of the integral

The usual properties of the integral (linearity, monotonicity, order) follow
directly from the definitions for simple functions. These properties are ex-
tended to nonnegative measurable functions by using Definition 13.1.2 and
by taking limits using Proposition 12.3.7 and the monotone convergence
theorem (Theorem 13.2.2). This theorem is one of the central results in
Lebesgue integration theory.
13.2.1 Proposition Assume that $(X, \mathcal{T}, \mu)$ is a measure space, $f$
and $g$ are defined on $X$ with values in $\overline{\mathbb{R}}_+$, and $E$ and $F$ are in $\mathcal{T}$.
(i) If $0 \le f \le g$, then $0 \le \int_E f\,d\mu \le \int_E g\,d\mu$.
(ii) If $E \cap F = \emptyset$, then $\int_{E \cup F} f\,d\mu = \int_E f\,d\mu + \int_F f\,d\mu$.
(iii) If $E \subset F$, then $\int_E f\,d\mu \le \int_F f\,d\mu$.
The proof is a direct application of Definition 13.1.2 and is left as an
exercise.
Proving the linearity of the integral for nonnegative functions is more
delicate. One first establishes the result for simple functions; then one uses
Proposition 12.3.7 and the next theorem to prove the result for arbitrary
nonnegative measurable functions.
13.2.2 Theorem (monotone convergence) Let $(X, \mathcal{T}, \mu)$ be
a measure space. Suppose that $(f_n)$ is a sequence of measurable functions
on $X$ with values in $\overline{\mathbb{R}}_+$ such that $0 \le f_n(x) \le f_{n+1}(x)$ for all $x \in X$
and all $n \in \mathbb{N}$. Then the function $f$ defined by $f(x) = \lim_{n \to \infty} f_n(x)$ is
nonnegative and measurable, and for all $E \in \mathcal{T}$ we have
$$\int_E f\,d\mu = \lim_{n \to \infty} \int_E f_n\,d\mu.$$
The technique for proving this theorem can be found, for example, in
[Ber70]. This result is used to prove linearity for nonnegative functions,
which in turn is used to prove the next result.
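As an illustration of how the theorem is used, here is a short worked example that is not taken from the text. Take $X = (0, 1]$ with Lebesgue measure, $f(x) = 1/\sqrt{x}$, and the truncations $f_n = \min(f, n)$, which increase pointwise to $f$. Each $f_n$ is bounded and Riemann integrable, and its Lebesgue integral is taken here to coincide with its Riemann integral.

```latex
\[
\int_X f_n \, d\mu
  = \int_0^{1/n^2} n \, dx + \int_{1/n^2}^{1} \frac{dx}{\sqrt{x}}
  = \frac{1}{n} + \Bigl( 2 - \frac{2}{n} \Bigr)
  = 2 - \frac{1}{n},
\qquad\text{hence}\qquad
\int_X f \, d\mu = \lim_{n \to \infty} \Bigl( 2 - \frac{1}{n} \Bigr) = 2 .
\]
```

So $f$ is integrable on $(0, 1]$ although it is unbounded near 0. The theorem replaces the limiting argument that the generalized Riemann integral of Lesson 11 would require.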
13.2.3 Proposition (linearity) Let $(X, \mathcal{T}, \mu)$ be a measure space.
If $f$ and $g$ are integrable on the set $E \in \mathcal{T}$, then we have the following
results that express the linearity of the integral:
(i) $\int_E af\,d\mu = a\int_E f\,d\mu$ for all $a \in \mathbb{R}$;
(ii) $\int_E (f + g)\,d\mu = \int_E f\,d\mu + \int_E g\,d\mu$.
Proof. To prove (i), write $f = f^+ - f^-$ and use the linearity of the integral
for nonnegative functions. For (ii), write $h = f + g$. Then $0 \le h^+ \le f^+ + g^+$
and $0 \le h^- \le f^- + g^-$. It follows from Proposition 13.2.1(i) that $h$ is
integrable. Since
$$h^+ - h^- = f^+ - f^- + g^+ - g^-,$$
we have
$$f^+ + g^+ + h^- = h^+ + f^- + g^-.$$
Using the linearity of the integral for nonnegative functions and rearranging
terms shows that
$$\int_E h\,d\mu = \int_E f\,d\mu + \int_E g\,d\mu. \qquad \square$$
13.2.4 Proposition Let $(X, \mathcal{T}, \mu)$ be a measure space and $E \in \mathcal{T}$.
(i) If $f$ and $g$ are integrable and $f \le g$, then
$$\int_E f\,d\mu \le \int_E g\,d\mu.$$
(ii) If $f$ is integrable, then $\Bigl|\int_E f\,d\mu\Bigr| \le \int_E |f|\,d\mu$.
(iii) If there exists an integrable function $g$ such that $|f| \le g$, then $f$ is
integrable and
$$\int_E |f|\,d\mu \le \int_E g\,d\mu.$$
Proof. These results follow from Proposition 13.2.1 and linearity. $\square$
These results lead to the next proposition, which is another key result in
the Lebesgue theory.
13.2.5 Proposition Let $(X, \mathcal{T}, \mu)$ be a measure space. If $f : X \to \overline{\mathbb{R}}$
is a measurable function, then $f$ is integrable if and only if $|f|$ is integrable.
Proof. By definition, $\int_X f^+\,d\mu$ and $\int_X f^-\,d\mu$ are finite if $f$ is integrable.
Since $|f| = f^+ + f^-$, $\int_X |f|\,d\mu < +\infty$ by linearity.
Conversely, since $0 \le f^+ \le |f|$ and $0 \le f^- \le |f|$, the integrability of $|f|$
implies that $\int_X f^+\,d\mu < +\infty$ and $\int_X f^-\,d\mu < +\infty$ (Proposition 13.2.1),
and hence $f$ is integrable. $\square$
13.2.6 Remark Proposition 13.2.5 is false for the Riemann integral.
A simple counterexample is given by the function $f$ defined on $[0, 1]$ by
$$f(x) = \begin{cases} +1 & \text{if } x \notin \mathbb{Q}, \\ -1 & \text{if } x \in \mathbb{Q}. \end{cases}$$
It is clear that $f$ is not Riemann integrable. On the other hand, $|f| = 1$ on
$[0, 1]$ and is Riemann integrable.
13.3 The integral and sets of measure zero

Given two measurable functions f and g, we say that f = g almost every-
where (a.e.) if the set on which the functions differ has measure zero. Thus
if $M = \{x \mid f(x) \ne g(x)\}$, $f = g$ a.e. if and only if $\mu(M) = 0$. We will see
that when f is integrable, g is also integrable and their integrals are equal.
13.3.1 Proposition Let $(X, \mathcal{T}, \mu)$ be a measure space. If $f : X \to \overline{\mathbb{R}}_+$
is a nonnegative measurable function, then $\int_X f\,d\mu = 0$ if and only if
$f = 0$ almost everywhere.
Proof. Let $N = \{x \in X \mid f(x) \ne 0\}$ and write $X = (X \setminus N) \cup N$. If $f = 0$
a.e., then $\mu(N) = 0$; this and the fact that $f$ is zero on $X \setminus N$ imply that
$\int_X f\,d\mu = 0$ (Proposition 13.2.1(ii) and Section 13.1.3(c)).
Now suppose that $\int_X f\,d\mu = 0$ and write $N = \bigcup_{n=1}^{\infty} S_n$, where
$S_n = \{x \in X \mid f(x) > 1/n\}$. Then $0 = \int_X f\,d\mu \ge \int_{S_n} f\,d\mu \ge \mu(S_n)/n$, and
hence $\mu(S_n) = 0$ for all $n \in \mathbb{N}$. We conclude from Proposition 12.2.2 that
$\mu(N) = 0$ and hence that $f = 0$ a.e. □
One consequence of this proposition is that an integrable function can be
modified on a set of measure zero without changing the value of its integral.
From the point of view of the integral, we cannot distinguish two functions
that are equal almost everywhere. Proposition 13.3.1 shows that "equal
a.e." is an equivalence relation on the vector space of integrable functions
on a measure space $(X, \mathcal{T}, \mu)$.
13.3.2 Definition Assume that $(X, \mathcal{T}, \mu)$ is a measure space. Define
$L^1(X, \mathcal{T}, \mu)$ to be the vector space of (classes of) measurable functions
defined on $X$ and integrable with respect to $\mu$. We also write $L^1(X)$, or
even $L^1$, when there is no chance of misunderstanding.
The quantity $\int_X |f|\,d\mu$ is a norm on $L^1(X, \mathcal{T}, \mu)$. Technically, this norm
is defined on the equivalence classes. We will not distinguish between the
class of functions for which $f$ is a representative and the function $f$.
13.4 Comparing the Riemann and Lebesgue integrals

We consider R to be endowed with Lebesgue measure. We have seen that
a function can be Lebesgue integrable without being Riemann integrable.
On the other hand, one can prove the following result (see, for example,
[KF74]).
13.4.1 Theorem If the Riemann integral $\int_a^b f(x)\,dx$ exists, then the
Lebesgue integral $\int_{[a,b]} f\,d\mu$ exists and the two integrals are equal.
The proof of this theorem is based on the definitions of the two integrals
and on the monotone convergence theorem. The following sufficient condi-
tion for a function to be Lebesgue integrable is much easier to establish.
13.4.2 Proposition Let $(X, \mathcal{T}, \mu)$ be a finite measure space, which
means that $\mu(X) < +\infty$. If $f : X \to \mathbb{R}$ is bounded almost everywhere on a
measurable set $E$, then $f$ is Lebesgue integrable on $E$.
Beware that the converse is not true: $f$ integrable does not imply the
existence of a number $M$ such that $|f| \le M$ a.e. Consider, for example, the
function $f(x) = 1/\sqrt{x}$ on $(0, 1)$. However, we do have the following result.
13.4.3 Proposition Let $(X, \mathcal{T}, \mu)$ be a measure space. If $f : X \to \overline{\mathbb{R}}$
is Lebesgue integrable on $X$, then $f$ is finite a.e.
Proof. Let $N = \{x \in X \mid |f(x)| = +\infty\}$. If $\mu(N) \ne 0$, we would have
$$\int_X |f|\,d\mu \ge \int_N |f|\,d\mu = +\infty.$$ □
It follows directly from the definition of the Riemann integral in terms of
Riemann sums that an unbounded function cannot be Riemann integrable.
As we have seen, certain unbounded functions are Lebesgue integrable; fur-
thermore, their Lebesgue integrals can often be computed using Riemann
integrals.
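13.4.4 Proposition Let $[a, b]$ be a bounded interval of $\mathbb{R}$. If the function
$f : [a, b] \to \overline{\mathbb{R}}_+$ is such that for all $\varepsilon > 0$ the Riemann integral
$I_\varepsilon = \int_{a+\varepsilon}^b f(x)\,dx$ exists and if $\lim_{\varepsilon \to 0} I_\varepsilon = I < +\infty$, then $f$ is Lebesgue
integrable on $[a, b]$ and $\int_{[a,b]} f\,d\mu = I$.
Proof. Take a decreasing sequence $\varepsilon_n > 0$, $\varepsilon_n \to 0$, and define $f_n = f\chi_{[a+\varepsilon_n, b]}$.
Clearly, $f_n(x)$ converges to $f(x)$ as $n \to \infty$ for all $x \in (a, b]$, hence almost
everywhere on $[a, b]$. Furthermore, the sequence $f_n$ is increasing. By the
monotone convergence theorem,
$$\int_{[a,b]} f\,d\mu = \lim_{n\to\infty} \int_{[a,b]} f_n\,d\mu = \lim_{n\to\infty} \int_{a+\varepsilon_n}^b f(x)\,dx = I.$$ □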
When $f$ is not positive, we use this result for $|f|$ to conclude integrability
but not, for the moment, to compute the value of $\int_{[a,b]} f\,d\mu$. We will see
further results relating the Riemann and Lebesgue integrals in Lesson 14.
13.4.5 Examples
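(a) $f(x) = \frac{1}{\sqrt{x}}$ on $(0, 1]$ is not Riemann integrable on $[0, 1]$. It is
integrable on every interval $[\varepsilon, 1]$ with $0 < \varepsilon < 1$, and
$$\int_\varepsilon^1 \frac{dx}{\sqrt{x}} = \left[2\sqrt{x}\right]_\varepsilon^1 = 2 - 2\sqrt{\varepsilon}.$$
Thus $f$ is Lebesgue integrable on $[0, 1]$ and
$$\int_{(0,1]} f\,d\mu = 2.$$
(b) $f(x) = \frac{1}{\sqrt{x}} \sin\frac{1}{x}$ on $(0, 1]$ is Lebesgue integrable on $[0, 1]$ because
$$|f(x)| \le \frac{1}{\sqrt{x}}.$$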
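Proposition 13.4.4 and Example (a) can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `midpoint_integral` and `truncated_integrals` and the sample values of $\varepsilon$ are illustrative assumptions. It approximates the Riemann integrals $I_\varepsilon$ of $1/\sqrt{x}$ on $[\varepsilon, 1]$ and compares them with the closed form $2 - 2\sqrt{\varepsilon}$, which increases to 2 as $\varepsilon \to 0$.

```python
import math

def midpoint_integral(f, a, b, n=100_000):
    """Approximate the Riemann integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def f(x):
    # Unbounded near 0, hence not Riemann integrable on [0, 1].
    return 1.0 / math.sqrt(x)

def truncated_integrals(epsilons):
    """Return (eps, numerical I_eps, closed form 2 - 2*sqrt(eps)) for each eps."""
    return [(eps, midpoint_integral(f, eps, 1.0), 2.0 - 2.0 * math.sqrt(eps))
            for eps in epsilons]

if __name__ == "__main__":
    for eps, numeric, exact in truncated_integrals([1e-1, 1e-2, 1e-4]):
        print(f"eps={eps:g}  I_eps~{numeric:.6f}  2-2*sqrt(eps)={exact:.6f}")
```

The numerical values only illustrate the limit; the conclusion that $f$ is Lebesgue integrable with integral 2 rests on the monotone convergence argument in the proof of Proposition 13.4.4.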
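13.4.6 Remark If $f$ is such that
$$\lim_{\varepsilon_n \to 0^+} \int_{a+\varepsilon_n}^b |f(x)|\,dx = +\infty,$$
then $f$ is not Lebesgue integrable on $[a, b]$. In the case where $f$ is nonnegative,
it is clear that the (generalized) Riemann integral
$$\int_a^b f(x)\,dx = \lim_{\varepsilon_n \to 0^+} \int_{a+\varepsilon_n}^b f(x)\,dx$$
does not exist either. If, however, $f$ takes both positive and negative values,
the generalized Riemann integral can exist without $f$ being Lebesgue
integrable. Take as an example $f(x) = \frac{1}{x}\sin\frac{1}{x}$. Set
$$I_n = \int_{\varepsilon_n}^1 \frac{1}{x}\sin\frac{1}{x}\,dx = \int_1^{1/\varepsilon_n} \frac{\sin u}{u}\,du = \left[-\frac{\cos u}{u}\right]_1^{1/\varepsilon_n} - \int_1^{1/\varepsilon_n} \frac{\cos u}{u^2}\,du.$$
The term
$$\left[-\frac{\cos u}{u}\right]_1^{1/\varepsilon_n} = -\varepsilon_n \cos\frac{1}{\varepsilon_n} + \cos 1$$
converges as $\varepsilon_n \to 0$. The integral $\int_1^{1/\varepsilon_n} \frac{\cos u}{u^2}\,du$ converges absolutely
because
$$\int_1^{1/\varepsilon_n} \frac{|\cos u|}{u^2}\,du \le \left[-\frac{1}{u}\right]_1^{1/\varepsilon_n} = 1 - \varepsilon_n.$$
Thus $I_n$ converges as $\varepsilon_n \to 0$. On the other hand, $J_\varepsilon = \int_1^{1/\varepsilon} \left|\frac{\sin u}{u}\right| du$
tends to $+\infty$ as $\varepsilon \to 0$ because
$$J_\varepsilon \ge \int_\pi^{n\pi} \left|\frac{\sin u}{u}\right| du, \qquad n\pi \le \frac{1}{\varepsilon} < (n+1)\pi,$$
and
$$\int_\pi^{n\pi} \left|\frac{\sin u}{u}\right| du \ge \sum_{k=1}^{n-1} \frac{1}{(k+1)\pi} \int_{k\pi}^{(k+1)\pi} |\sin u|\,du = \frac{2}{\pi} \sum_{k=1}^{n-1} \frac{1}{k+1},$$
which diverges with the harmonic series as $n \to \infty$.
Proposition 13.4.4 can be proved for a generalized Riemann integral on
an interval $[a, +\infty)$, and Remark 13.4.6 is also true in this case.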
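The contrast described in Remark 13.4.6 can be observed numerically. The sketch below is an illustration only; the function names and the cut-off values $T$ are assumptions introduced here. After the substitution $u = 1/x$, it compares $\int_1^T \frac{\sin u}{u}\,du$, which settles down as $T$ grows, with $\int_1^T \frac{|\sin u|}{u}\,du$, which grows roughly like $\frac{2}{\pi}\ln T$.

```python
import math

def midpoint_integral(f, a, b, n=200_000):
    """Approximate the Riemann integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def signed_integral(T):
    """Approximate the integral of sin(u)/u over [1, T]."""
    return midpoint_integral(lambda u: math.sin(u) / u, 1.0, T)

def absolute_integral(T):
    """Approximate the integral of |sin(u)|/u over [1, T]."""
    return midpoint_integral(lambda u: abs(math.sin(u)) / u, 1.0, T)

if __name__ == "__main__":
    for k in (10, 100, 1000):
        T = k * math.pi
        print(f"T={k}*pi  signed~{signed_integral(T):.4f}  "
              f"absolute~{absolute_integral(T):.4f}")
```

Doubling $T$ barely changes the signed integral, while the absolute integral increases by about $\frac{2}{\pi}\ln 2$ each time, in line with the harmonic-series lower bound.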
13.4.7 A convention
For the Riemann integral, the symbol $\int_a^b f(x)\,dx$ makes sense when $b < a$
by the relation $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$, which comes directly from the
Riemann sums. On the other hand, the Lebesgue integral is taken over a
nonoriented set $(a, b)$. When the integrals in a given context are all Lebesgue
integrals, we will adopt this sign convention. For example, $\int_1^0 f(x)\,dx$ will
denote $-\int_{[0,1]} f\,d\mu$.
13.5 Exercises
*Exercise 13.1 Prove Proposition 13.2.1 for simple functions. Extend this
result to nonnegative measurable functions.
Exercise 13.2 Use Theorem 13.2.2 to prove the linearity of the integral for
nonnegative functions.
Exercise 13.3 Let $A_n$ be a sequence of disjoint measurable sets and let $A$
denote the union $\bigcup_{n=1}^{\infty} A_n$. If $f$ is a nonnegative measurable function on $A$, show
that
$$\int_A f\,d\mu = \sum_{n=1}^{\infty} \int_{A_n} f\,d\mu.$$
Exercise 13.4 (absolute continuity) Let $(X, \mathcal{T}, \mu)$ be a measure
space and let $f$ be integrable on $A \in \mathcal{T}$. Show that for all $\varepsilon > 0$, there exists a
$\delta > 0$ such that for all measurable sets $E$ in $A$ with $\mu(E) < \delta$, one has
$$\int_E |f|\,d\mu < \varepsilon.$$
Hints:
(a) Establish the result for $f$ bounded.

(b) Write $A = B \cup \left(\bigcup_{n=0}^{\infty} A_n\right)$ with
$$B = \{x \in A \mid |f(x)| = +\infty\}$$
and
$$A_n = \{x \in A \mid n \le |f(x)| < n + 1\},$$
and decompose $A$ as $A = B \cup B_N \cup (A \setminus (B \cup B_N))$ with $B_N = \bigcup_{n=0}^{N} A_n$.
Exercise 13.5 (Chebyshev's inequality) Let $f$ be a nonnegative
function defined on a measurable set $E$. For $\alpha > 0$ show that
$$\mu\{x \in E \mid f(x) \ge \alpha\} \le \frac{1}{\alpha} \int_E f\,d\mu.$$
Exercise 13.6 Show that $f(x) = x^\alpha$ is Lebesgue integrable on
(a) $(0, 1]$ for $\alpha > -1$;
(b) $(1, +\infty)$ for $\alpha < -1$.
Exercise 13.7 (Beppo Levi's theorem) Let $f_n$ be an increasing
sequence of integrable functions on a measurable set $E$ such that for some $M > 0$,
$$\int_E f_n\,d\mu \le M$$
for all $n \in \mathbb{N}$. Show that the sequence $f_n$ converges almost everywhere to an
integrable function $f$ on $E$ and that
$$\int_E f\,d\mu = \lim_{n\to\infty} \int_E f_n\,d\mu.$$
Hint: Write $g_n = f_n - f_1$ and use Theorem 13.2.2.
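Exercise 13.8 Let $f_n$ be a sequence of nonnegative measurable functions
on a measurable set $E$. Assume that $\sum_{n=1}^{\infty} \int_E f_n\,d\mu < +\infty$. Show that the series
$\sum_{n=1}^{\infty} f_n(x)$ converges almost everywhere and that
$$\int_E \sum_{n=1}^{\infty} f_n\,d\mu = \sum_{n=1}^{\infty} \int_E f_n\,d\mu.$$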
*Exercise 13.9 (Fatou's lemma) Let $f_n$ be a sequence of nonnegative
measurable functions defined on a measurable set $E$.
(a) Show that
$$\int_E \liminf_{n\to\infty} f_n\,d\mu \le \liminf_{n\to\infty} \int_E f_n\,d\mu.$$
Hint: Use Theorem 13.2.2 on the functions $g_n = \inf_{k \ge n} f_k$. Recall that for a
sequence of real numbers $a_n$, $n \in \mathbb{N}$, the limit inferior, denoted by $\liminf_{n\to\infty} a_n$,
is the quantity $\sup_{k \in \mathbb{N}} \{\inf_{l \in \mathbb{N}} a_{k+l}\}$.
(b) Investigate the sequence $f_n = -\frac{1}{n}\chi_{[0,n]}$, which does not satisfy the
nonnegativity hypothesis, and verify that Fatou's inequality does not hold.
Exercise 13.10 Consider the $\sigma$-algebra $\mathcal{P}(\mathbb{N})$ of all subsets of $\mathbb{N}$ and endow
the elements $E$ of $\mathcal{P}(\mathbb{N})$ with the counting measure defined by
$$\mu(E) = \text{the number of integers in } E.$$
Show that $f : \mathbb{N} \to \mathbb{R}$ is integrable with respect to the measure $\mu$ if and only if
$$\sum_{n=0}^{\infty} |f(n)| < +\infty.$$
Exercise 13.11 Let $(X, \mathcal{T}, \mu)$ be a measure space. For all $E \in \mathcal{T}$ define
$$\nu(E) = \int_E f\,d\mu,$$
where $f$ is a given nonnegative integrable function. Show that $\nu$ is a measure on
$\mathcal{T}$. (Use Exercise 13.3.)
Lesson 14
Integral Calculus
This lesson contains the essential tools for putting into practice integral
computations: it is the Lebesgue version of integral calculus. We present
rules for manipulating integrals that depend on a parameter. In particular,
we discuss continuity and derivation with a view toward applications to
the Fourier transform. The lesson also contains the formulas for changing
variables and the rules for interchanging the order of integration in double
integrals, the celebrated Fubini's theorem.
14.1 Lebesgue's dominated convergence theorem

We saw one way to pass to a limit under the integral sign in Lesson 13
(Theorem 13.2.2). Note that this result applies only to an increasing se-
quence of nonnegative functions. One should take care not to confuse the
theorem on monotone convergence with the following more powerful result.
14.1.1 Theorem (Lebesgue) Let $(X, \mathcal{T}, \mu)$ be a measure space.
Let $f_n$, $n \in \mathbb{N}$, be a sequence of measurable functions defined on $X$ that
converges almost everywhere to a function $f$. Suppose that there exists an
integrable function $g$ such that for each $n \in \mathbb{N}$, $|f_n(x)| \le g(x)$ a.e. on $X$.
Then
(i) $f$ is integrable;
(ii) $\lim_{n\to\infty} \int_E f_n\,d\mu = \int_E f\,d\mu$ for all $E \in \mathcal{T}$.
The proof of (i) is immediate: $f$ is measurable because it is the limit a.e.
of measurable functions; $f$ is integrable because $|f|$ is bounded (dominated)
a.e. by an integrable function $g$. We will assume (ii). The proof, which is
more technical, can be found, for example, in [Ber70].
14.1.2 Remark This theorem has two consequences: an integrability

criterion and a method to compute the integral.
We use an example to illustrate the advantage of the Lebesgue integral
over the Riemann integral as regards passing to the limit under the integral
sign. Take $X = (0, 1]$, let $\mu$ be Lebesgue measure, and suppose that the
rationals in $(0, 1]$ are ordered in a sequence $q_1, q_2, \ldots, q_n, \ldots$. Define
$$f_n(x) = \begin{cases} 1 & \text{if } x \in \{q_1, q_2, \ldots, q_n\}, \\ 0 & \text{otherwise.} \end{cases}$$
The sequence converges pointwise to the function
$$f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \cap (0, 1], \\ 0 & \text{otherwise.} \end{cases}$$
We have $|f_n| \le 1$ for all $n \in \mathbb{N}$. Theorem 14.1.1 implies that $f$ is integrable
and that
$$\int_X f\,d\mu = \lim_{n\to\infty} \int_X f_n\,d\mu = 0.$$
(Of course, we already knew this because $f = 0$ a.e.)
We note that in this example fn is Riemann integrable for all n E N,
while the limit f is not Riemann integrable. This shows that the pointwise
limit of a sequence of Riemann-integrable functions is not always Riemann
integrable. In a sense, Lebesgue's dominated convergence theorem resolves
this issue. We will see many other applications.
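The role of the domination hypothesis in Theorem 14.1.1 can be illustrated with two simple sequences on $[0, 1]$. This is a minimal sketch under assumptions of our own: the sequences `dominated` and `spike` and the helper names are not taken from the text. The first sequence, $f_n(x) = x^n$, is dominated by the integrable function $g = 1$ and its integrals tend to $0 = \int \lim f_n$. The second, $f_n = n\chi_{(0,1/n)}$, also converges pointwise to 0, but it has no integrable dominating function and its integrals stay equal to 1.

```python
def midpoint_integral(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def dominated(n):
    """f_n(x) = x**n on [0, 1]: |f_n| <= 1 and f_n -> 0 almost everywhere."""
    return lambda x: x ** n

def spike(n):
    """f_n = n on (0, 1/n), 0 elsewhere: f_n -> 0 pointwise, yet
    sup_n f_n behaves like 1/x near 0 and is not integrable."""
    return lambda x: float(n) if 0.0 < x < 1.0 / n else 0.0

def integrals(family, ns):
    """Integrals over [0, 1] of family(n) for each n in ns."""
    return [midpoint_integral(family(n), 0.0, 1.0) for n in ns]

if __name__ == "__main__":
    ns = [1, 10, 100]
    print("dominated:", integrals(dominated, ns))  # close to 1/(n+1)
    print("spike:    ", integrals(spike, ns))      # stays close to 1
```

In the second case the limit of the integrals (1) differs from the integral of the limit (0), which shows that pointwise convergence alone does not justify passing to the limit under the integral sign.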
14.2 Integrals that depend on a parameter

We are given a measure space $(X, \mathcal{T}, \mu)$ and an arbitrary interval $(a, b)$,
bounded or not, of $\mathbb{R}$. Let $f$ be defined on $(a, b) \times X$ with values in $\mathbb{R}$ or
$\mathbb{C}$. Assume that for all $t \in (a, b)$ the function $x \mapsto f(t, x)$ is integrable. We
define
$$I(t) = \int_X f(t, x)\,d\mu, \quad t \in (a, b).$$
EXAMPLE: The Fourier transform of a function of $L^1(\mathbb{R})$,
$$\hat{f}(t) = \int_{\mathbb{R}} e^{-2i\pi tx} f(x)\,dx.$$
We intend to examine the continuity and differentiability of the function
$$I : t \mapsto \int_X f(t, x)\,d\mu.$$

14.2.1 Proposition (continuity) If for almost all $x \in X$ the function
$t \mapsto f(t, x)$ is continuous at $t^* \in (a, b)$ and if there exists an integrable
function $g$ such that for all $t$ in a neighborhood $V$ of $t^*$
$$|f(t, x)| \le g(x) \quad \text{a.e.},$$
then $I$ is continuous at $t^*$.
Proof. Let $t_n$ be an arbitrary sequence in $V$ that converges to $t^*$. Define
$f_n(x) = f(t_n, x)$. From the hypotheses, $\lim_{n\to\infty} f_n(x) = f(t^*, x)$ for almost
all $x$ in $X$ and $|f_n(x)| = |f(t_n, x)| \le g(x)$ a.e. Applying Theorem 14.1.1
shows that
$$\lim_{n\to\infty} \int_X f_n(x)\,d\mu = \int_X \lim_{n\to\infty} f_n(x)\,d\mu,$$
or
$$\lim_{n\to\infty} I(t_n) = \int_X f(t^*, x)\,d\mu = I(t^*).$$ □
14.2.2 Proposition (derivation) Suppose that $V$ is a neighborhood
of $t^*$, $V \subset (a, b)$, such that the following two conditions hold:
(i) For almost all $x$, $t \mapsto f(t, x)$ is continuously differentiable on $V$.
(ii) There exists an integrable function $g$ such that for all $t \in V$,
$$\left|\frac{\partial f}{\partial t}(t, x)\right| \le g(x) \quad \text{a.e.}$$
Then $I$ is differentiable at $t^*$, and $I'(t^*) = \int_X \frac{\partial f}{\partial t}(t^*, x)\,d\mu$.
Proof. The proof is essentially the same as the one above. Here we write
$$f_n(x) = \frac{f(t_n, x) - f(t^*, x)}{t_n - t^*}$$
and use the mean value theorem. □
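Differentiation under the integral sign can be checked on a simple case. The following sketch is an illustration under assumptions of our own: it takes $X = [0, 1]$ with Lebesgue measure and $f(t, x) = \cos(tx)$, for which $\left|\frac{\partial f}{\partial t}(t, x)\right| = |x \sin(tx)| \le g(x) = x$, an integrable function. The helper names are hypothetical. The derivative of $I$ is computed twice, once by a finite difference of $I$ and once by integrating $\frac{\partial f}{\partial t}$.

```python
import math

def midpoint_integral(f, a, b, n=2_000):
    """Approximate the integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def I(t):
    """I(t) = integral over [0, 1] of cos(t*x) dx (equal to sin(t)/t for t != 0)."""
    return midpoint_integral(lambda x: math.cos(t * x), 0.0, 1.0)

def dI_under_integral(t):
    """Integral over [0, 1] of the partial derivative -x*sin(t*x)."""
    return midpoint_integral(lambda x: -x * math.sin(t * x), 0.0, 1.0)

def dI_finite_difference(t, h=1e-5):
    """Central finite-difference approximation of I'(t)."""
    return (I(t + h) - I(t - h)) / (2.0 * h)

if __name__ == "__main__":
    t = 2.0
    exact = (t * math.cos(t) - math.sin(t)) / t ** 2  # derivative of sin(t)/t
    print(dI_under_integral(t), dI_finite_difference(t), exact)
```

The three printed values should agree to several decimal places, as Proposition 14.2.2 predicts for this choice of $f$ and $g$.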
14.2.3 Remark The last two results apply to complex-valued func-

tions by taking real and imaginary parts.
14.2.4 Example Suppose that $f : \mathbb{R} \to \mathbb{R}$ is integrable. The Fourier
transform
$$\hat{f}(t) = \int_{\mathbb{R}} e^{-2i\pi tx} f(x)\,dx$$
is well-defined for all $t \in \mathbb{R}$ because $|e^{-2i\pi tx} f(x)| \le |f(x)|$. Formally, the
derivative of $\hat{f}$ is
$$\hat{f}'(t) = \int_{\mathbb{R}} e^{-2i\pi tx}(-2i\pi x) f(x)\,dx.$$
To apply Proposition 14.2.2 we must show that the integrand on the right-hand
side is dominated by an integrable function. A simple sufficient condition is that
$x \mapsto xf(x)$ be integrable. In this case we have
$$\hat{f}'(t) = -2i\pi\,\widehat{(xf(x))}(t).$$
14.3 Fubini's theorem

This section deals with rules for interchanging the order of integration in
double integrals. Fubini's theorem, which addresses this problem, will be
essential for our work on the Fourier transform and convolutions. We will
be concerned with functions of two variables and Lebesgue measure and
integration on $\mathbb{R}^2$. The development of these theories for $\mathbb{R}^2$ is similar in
most respects to that for $\mathbb{R}$. In the case of $\mathbb{R}^2$ one begins with a measure $\nu$
defined on rectangles $[a_1, b_1] \times [a_2, b_2]$ (recall (12.1)).
14.3.1 Theorem (Fubini) Assume that $f : \mathbb{R} \times \mathbb{R} \to \overline{\mathbb{R}}$ is measurable
and that $E \times F$ is a measurable set in $\mathbb{R} \times \mathbb{R}$.
(i) If $f$ is nonnegative on $E \times F$, then
$$\int_{E \times F} f(x, y)\,dx\,dy = \int_E dx \int_F f(x, y)\,dy = \int_F dy \int_E f(x, y)\,dx. \tag{14.1}$$
The three integrals can possibly be equal to $+\infty$.

(ii) If $f$ is integrable on $E \times F$, the function $x \mapsto f(x, y)$ is integrable
for almost every $y$, the function $y \mapsto f(x, y)$ is integrable for almost
every $x$, and the three integrals in (14.1) are finite and equal.
(iii) $f$ is integrable if and only if
$$\int_E dx \int_F |f(x, y)|\,dy \quad \text{or} \quad \int_F dy \int_E |f(x, y)|\,dx$$
is finite.
For the proof see [Hal64] or [Roy63].

The practical aspect is that one can compute the double integral by
choosing a convenient order of integration if at least one of the iterated
integrals of $|f(x, y)|$ exists. Note, however, that the existence of the two
integrals
$$\int_E dx \int_F f(x, y)\,dy \quad \text{and} \quad \int_F dy \int_E f(x, y)\,dx$$
does not imply the integrability of $f$ on $E \times F$.
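Part (i) of Fubini's theorem can be illustrated numerically on a nonnegative function. The sketch below is not from the text: the choice $f(x, y) = x e^{-xy}$ on $E \times F = [0, 1] \times [0, 2]$ and all helper names are assumptions made for illustration. Both iterated integrals are approximated by nested midpoint rules and compared with the exact value $(1 + e^{-2})/2$, obtained by integrating first in $y$.

```python
import math

def midpoint_integral(f, a, b, n=400):
    """Approximate the integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def f(x, y):
    # Nonnegative and continuous on E x F = [0, 1] x [0, 2].
    return x * math.exp(-x * y)

def dy_then_dx():
    """Integral over E of (integral over F of f(x, y) dy) dx."""
    return midpoint_integral(
        lambda x: midpoint_integral(lambda y: f(x, y), 0.0, 2.0), 0.0, 1.0)

def dx_then_dy():
    """Integral over F of (integral over E of f(x, y) dx) dy."""
    return midpoint_integral(
        lambda y: midpoint_integral(lambda x: f(x, y), 0.0, 1.0), 0.0, 2.0)

if __name__ == "__main__":
    exact = (1.0 + math.exp(-2.0)) / 2.0
    print(dy_then_dx(), dx_then_dy(), exact)
```

For finite sums the interchange of summation order is trivial, so the two numerical values agree up to rounding; the content of the theorem is that this agreement survives the passage to the integral when $f$ is nonnegative or integrable.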
14.4 Changing variables in an integral

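We give the formula for $\mathbb{R}^n$ knowing that in practice it is used mostly for
$1 \le n \le 4$.
Let $\Delta$ and $\Omega$ be two domains in $\mathbb{R}^n$ related by a 1-to-1 mapping $\Phi$.
Suppose that $\Phi$ and $\Phi^{-1}$ are continuously differentiable on $\Omega$ and $\Delta$,
respectively. The Jacobian of $\Phi$ at $x = (x_1, x_2, \ldots, x_n)$ is the matrix of partial
derivatives of $\Phi = (\varphi_1, \varphi_2, \ldots, \varphi_n)$ with respect to $x$:
$$\operatorname{Jac}\Phi(x) = \begin{pmatrix}
\dfrac{\partial \varphi_1}{\partial x_1}(x) & \dfrac{\partial \varphi_1}{\partial x_2}(x) & \cdots & \dfrac{\partial \varphi_1}{\partial x_n}(x) \\
\dfrac{\partial \varphi_2}{\partial x_1}(x) & \dfrac{\partial \varphi_2}{\partial x_2}(x) & \cdots & \dfrac{\partial \varphi_2}{\partial x_n}(x) \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial \varphi_n}{\partial x_1}(x) & \dfrac{\partial \varphi_n}{\partial x_2}(x) & \cdots & \dfrac{\partial \varphi_n}{\partial x_n}(x)
\end{pmatrix}.$$
$J_\Phi(x)$ denotes the determinant of the Jacobian, and $|J_\Phi(x)|$ denotes its
absolute value.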
14.4.1 Theorem Suppose that $f$ is defined on $\Delta$.
(i) $f$ is measurable if and only if $f \circ \Phi$ is measurable.
(ii) If $f$ is measurable and positive, then
$$\int_\Delta f(y)\,d\mu(y) = \int_\Omega f(\Phi(x))\,|J_\Phi(x)|\,d\mu(x).$$
(iii) $f$ is integrable on $\Delta$ if and only if $(f \circ \Phi)|J_\Phi|$ is integrable on $\Omega$, in
which case
$$\int_\Delta f(y)\,d\mu(y) = \int_\Omega f(\Phi(x))\,|J_\Phi(x)|\,d\mu(x).$$
See [Her86] for a proof.
If $E$ is a measurable subset of $\Omega$, its image under $\Phi$ is also measurable,
and
$$\mu(\Phi(E)) = \int_{\Phi(E)} d\mu(y) = \int_E |J_\Phi(x)|\,d\mu(x).$$
In particular, this shows that Lebesgue measure is invariant under
translation and symmetry.
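The change of variables formula can be illustrated with polar coordinates in $\mathbb{R}^2$. In the sketch below, which is an illustration and not part of the text, $\Phi(r, \theta) = (r\cos\theta, r\sin\theta)$ maps $\Omega = (0, 1) \times (0, 2\pi)$ onto the unit disk minus a segment of measure zero, and $|J_\Phi(r, \theta)| = r$. The test function $e^{-(x^2+y^2)}$, the grid sizes, and the function names are assumptions chosen for the example.

```python
import math

def f(x, y):
    return math.exp(-(x * x + y * y))

def cartesian_integral(n=400):
    """Midpoint sum of f over the unit disk, using a grid on [-1, 1]^2."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        for j in range(n):
            y = -1.0 + (j + 0.5) * h
            if x * x + y * y < 1.0:   # keep only grid points inside the disk
                total += f(x, y)
    return total * h * h

def polar_integral(n=400):
    """Midpoint sum of f(Phi(r, theta)) * |J_Phi| over (0, 1) x (0, 2*pi)."""
    hr, ht = 1.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            theta = (j + 0.5) * ht
            total += f(r * math.cos(theta), r * math.sin(theta)) * r
    return total * hr * ht

if __name__ == "__main__":
    exact = math.pi * (1.0 - math.exp(-1.0))
    print(cartesian_integral(), polar_integral(), exact)
```

The polar sum is much more accurate for the same grid size because the domain becomes a rectangle; the Cartesian sum carries an extra error from cells cut by the boundary of the disk.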
14.5 The indefinite Lebesgue integral and primitives

In this section we consider $\mathbb{R}$ with Lebesgue measure. Given a function
$f \in L^1(a, b)$, we are going to study the function
$$I(x) = \int_{[a,x]} f\,d\mu,$$
which we denote by $I(x) = \int_a^x f(t)\,dt$. $I$ is called the indefinite Lebesgue
integral of $f$.
The following results are true for the Riemann integral:
(i) If $f$ is continuous on $[a, b]$, then
$$J(x) = \int_a^x f(t)\,dt$$
is differentiable for all $x \in [a, b]$ and $J'(x) = f(x)$ for all $x \in [a, b]$.
(ii) If $f$ is continuously differentiable on $[a, b]$, then for all $x \in (a, b)$
$$f(x) = f(a) + \int_a^x f'(t)\,dt. \tag{14.2}$$
What is the situation for the Lebesgue integral?

The first result generalizes to Lebesgue-integrable functions thanks to
the following result [KF74].
14.5.1 Theorem A function that is monotone on an interval [a,b] is

differentiable almost everywhere on [a, b].
When $f$ is nonnegative, $I$ is monotone. For an arbitrary $f$ we make the
usual decomposition $f = f^+ - f^-$ and see that $I$ is the difference of two
monotone functions. Hence $I$ is differentiable a.e.
14.5.2 Proposition Suppose that $f : [a, b] \to \mathbb{R}$ is integrable. The
function $I$ defined on $[a, b]$ by $I(x) = \int_a^x f(t)\,dt$ is differentiable a.e., and
$I'(x) = f(x)$ a.e.
A proof can be found in [KF74] and other books on integration.
We now turn to formula (14.2). For this to make sense, $f$ must be
differentiable almost everywhere with $f'$ integrable on $[a, b]$. The conclusion is
that $f$ is continuous on $[a, b]$. Beware! These conditions are not sufficient. It
is possible to construct a function $f : [0, 1] \to [0, 1]$ that is continuous and
nondecreasing, with $f(0) = 0$ and $f(1) = 1$, and such that $f'$ exists and
is zero almost everywhere. (The classic example, due to Cantor, is called
the Devil's Staircase.) In this case, (14.2) is clearly not true. The problem
posed by this situation leads to the introduction of a new class of functions.
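The Devil's Staircase can be evaluated from the ternary expansion of $x$. The sketch below is an illustration, not the authors' construction: the function name `cantor` and the truncation depth are assumptions. The Cantor function is nondecreasing and constant on each removed middle-third interval, so its derivative vanishes there, yet it climbs from 0 to 1; formula (14.2) would give $f(1) = f(0) + \int_0^1 0\,dt = 0$, which is false.

```python
def cantor(x, depth=40):
    """Approximate the Cantor function on [0, 1] from the ternary digits of x."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    value, weight = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        digit = int(x)       # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:       # x lies in a removed middle third: plateau reached
            return value + weight
        if digit == 2:       # ternary digit 2 becomes binary digit 1
            value += weight
        weight /= 2.0
    return value

if __name__ == "__main__":
    for x in (0.0, 0.25, 0.4, 0.5, 0.6, 1.0):
        print(x, cantor(x))
```

On the plateau $(1/3, 2/3)$ every difference quotient is zero, for instance `cantor(0.6) - cantor(0.4)` vanishes, while the total increase over $[0, 1]$ is 1.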
14.5.3 Definition A function $f : [a, b] \to \mathbb{R}$ is said to be absolutely
continuous (AC) on $[a, b]$ if it satisfies the following conditions:
(i) $f$ is differentiable a.e.
(ii) $f'$ is Lebesgue integrable on $[a, b]$.
(iii) For all $x \in [a, b]$, $f(x) = f(a) + \int_a^x f'(t)\,dt$.
An AC function is continuous; the converse is false. A continuously dif-
ferentiable function is absolutely continuous; again the converse is false.
(Take f(x) = xu(x), where u is the Heaviside function.)
14.5.4 Proposition Suppose that $f : [a, b] \to \mathbb{R}$ is integrable. The
function $I$ defined on $[a, b]$ by $I(x) = \int_a^x f(t)\,dt$ is absolutely continuous.
Proof. This follows directly from Proposition 14.5.2 and the definition. □
14.5.5 Definition A function $F$ is said to be a primitive of $f$ on $(a, b)$
if $F$ is absolutely continuous and $F' = f$ a.e.
We apply these results to integration by parts. The formula that is true
for continuously differentiable functions extends to AC functions.
14.5.6 Theorem (integration by parts) Let $u$ and $v$ be two
AC functions on $(a, b)$. Then
$$\int_a^b u(x)v'(x)\,dx = u(b)v(b) - u(a)v(a) - \int_a^b u'(x)v(x)\,dx.$$
Proof. $u(x) = u(a) + \int_a^x u'(t)\,dt$. Multiplying both sides by $v'$ and
integrating with respect to $x$ shows that
$$\int_a^b u(x)v'(x)\,dx = u(a)[v(b) - v(a)] + \int_a^b v'(x)\left(\int_a^x u'(t)\,dt\right)dx.$$
Write
$$\int_a^b v'(x)\left(\int_a^x u'(t)\,dt\right)dx = \int_a^b \left(\int_a^b \chi_{[a,x]}(t)\,u'(t)\,v'(x)\,dt\right)dx.$$
Apply Fubini's theorem and interchange the order of integration; the last
integral becomes
$$\int_a^b u'(t)\left(\int_t^b v'(x)\,dx\right)dt = \int_a^b u'(t)[v(b) - v(t)]\,dt = u(b)v(b) - u(a)v(b) - \int_a^b u'(t)v(t)\,dt,$$
and this proves the result. (We leave it to the reader to show that the
hypotheses of Fubini's theorem are fulfilled.) □
14.6 Exercises
Exercise 14.1 Consider the sequence of functions $f_n : \mathbb{R} \to \mathbb{R}_+$ defined by
$$f_n(x) = \begin{cases} \dfrac{1}{|x|} & \text{if } |x| \le \dfrac{1}{n}, \\ 0 & \text{otherwise.} \end{cases}$$
Discuss this sequence in the context of Lebesgue's Theorem 14.1.1.
Exercise 14.2 Assume that $f$ is Lebesgue integrable on $\mathbb{R}$.

(a) Let $(a_n)$ and $(b_n)$ be any two sequences of real numbers such that $a_n$ and
$b_n$ tend to $+\infty$ as $n \to +\infty$. Show that
$$\lim_{n\to\infty} \int_{-a_n}^{b_n} f(x)\,dx = \int_{\mathbb{R}} f(x)\,dx.$$
Give an example to show that the converse is false.
(b) Show that
$$\int_{\mathbb{R}} f(x)\,dx = \sum_{n=-\infty}^{\infty} \int_{na}^{(n+1)a} f(x)\,dx$$
for all $a > 0$.

(c) Define $I(t) = \int_{-\infty}^t f(x)\,dx$. Show that $I$ is continuous on $\mathbb{R}$ and that
$$\lim_{t\to+\infty} I(t) = \int_{\mathbb{R}} f(x)\,dx.$$
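Exercise 14.3 Let $f_n : \mathbb{R} \to \mathbb{R}$ be a sequence of Lebesgue-integrable
functions on $\mathbb{R}$. Show that if
$$\sum_{n=1}^{\infty} \int_{\mathbb{R}} |f_n(x)|\,dx < +\infty,$$
then the series $\sum_{n=1}^{\infty} f_n(x)$ converges for almost all $x$ and
$$\int_{\mathbb{R}} \sum_{n=1}^{\infty} f_n(x)\,dx = \sum_{n=1}^{\infty} \int_{\mathbb{R}} f_n(x)\,dx.$$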
Exercise 14.5 Let $\Delta$ be a bounded domain in $\mathbb{R}^2$. Discuss
$$\lim_{n\to\infty} \iint_\Delta \left(1 + \frac{x + y}{n}\right)^n dx\,dy.$$
Exercise 14.6 Let $f : \mathbb{R} \to \mathbb{R}$ be such that the function $x \mapsto x^n f(x)$ is
integrable on $\mathbb{R}$ for all $n \ge 0$. Show that the Fourier transform
$$\hat{f}(t) = \int_{\mathbb{R}} e^{-2i\pi tx} f(x)\,dx$$
is infinitely differentiable.
*Exercise 14.7 Show that $f(x, t) = e^{-tx}$ is integrable on $\mathbb{R}_+$ for all $t > 0$.
Use the following two methods to verify that $I(t) = \int_0^\infty e^{-tx}\,dx$ is infinitely
differentiable:
(a) Compute $I(t)$ explicitly.
(b) Apply Proposition 14.2.2.
Use this result to deduce that
Exercise 14.8 Consider the function
$$f(x, y) = \frac{xy}{(x^2 + y^2)^2}$$
on the square $[-1, 1] \times [-1, 1]$. Show that
$$\int_{-1}^1 dx \int_{-1}^1 f(x, y)\,dy = \int_{-1}^1 dy \int_{-1}^1 f(x, y)\,dx.$$
Show that $f$ is not Lebesgue integrable on this square (use polar coordinates). Conclusion?
Exercise 14.9 Consider the function
defined on the square $[0, 1] \times [0, 1]$ (except at $(0, 0)$). Show that the iterated integrals
exist and are different. Conclusion?
Exercise 14.10 Show that if $f$ and $g$ are in $L^1(\mathbb{R})$, then $h(x, y) = f(x)g(y)$
is in $L^1(\mathbb{R}^2)$.
Exercise 14.11 Compute the integral of
$$f(x, y) = e^{-x-y}(x + y)^n$$
on $\mathbb{R}_+ \times \mathbb{R}_+$ by making the change of variables $u = x$ and $v = x + y$.
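Exercise 14.12 Let $f : \mathbb{R}^n \to \mathbb{R}$ be defined by
$$f(x_1, x_2, \ldots, x_n) = \frac{1}{(x_1^2 + x_2^2 + \cdots + x_n^2)^{\alpha/2}} \quad \text{with } \alpha \in \mathbb{R}.$$
Change variables $x_i$ to polar coordinates:
$$\begin{aligned}
x_1 &= \rho \cos\theta_1 \cos\theta_2 \cdots \cos\theta_{n-2} \cos\theta_{n-1}, \\
x_2 &= \rho \cos\theta_1 \cos\theta_2 \cdots \cos\theta_{n-2} \sin\theta_{n-1}, \\
x_3 &= \rho \cos\theta_1 \cos\theta_2 \cdots \cos\theta_{n-3} \sin\theta_{n-2}, \\
&\;\;\vdots \\
x_{n-1} &= \rho \cos\theta_1 \sin\theta_2, \\
x_n &= \rho \sin\theta_1,
\end{aligned}$$
with $\rho > 0$, $-\frac{\pi}{2} < \theta_1, \ldots, \theta_{n-2} < \frac{\pi}{2}$ and $0 < \theta_{n-1} < 2\pi$. Let
Show that $f$ is integrable on $B$ if and only if $\alpha < n$. Show that $f$ is integrable on
$\mathbb{R}^n \setminus B$ if and only if $\alpha > n$.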
Hint: The determinant of the Jacobian of the transformation is
Chapter V
Spaces
Lesson 15
Function Spaces
We have collected in this lesson the definitions and essential results for the
commonly encountered spaces of functions (function spaces or functional
spaces). The lesson is somewhat like a catalog, and it can be used as a refer-
ence to find one's way around function spaces that are perhaps unfamiliar.
Several proofs are technical and can safely be skipped on first reading.
15.1 Spaces of differentiable functions

15.1.1 Definition Let $I$ be an arbitrary interval (bounded or not) of
$\mathbb{R}$. For $p \in \mathbb{N}$ we define the space of functions $C^p(I)$ by
$$C^p(I) = \{f : I \to \mathbb{R} \text{ (or } \mathbb{C}) \mid f \text{ is } p\text{-times continuously differentiable}\}.$$
When $f \in C^p(I)$, we say that $f$ is of class $C^p$.
When we say that $f$ is continuously differentiable, we mean that $f$ is
differentiable and that $x \mapsto f'(x)$ is continuous. If $f$ has values in $\mathbb{C}$, we
are speaking of the differentiability of $\operatorname{Re}(f)$ and $\operatorname{Im}(f)$. $f \in C^0(I)$ simply
means that $f$ is continuous. For functions of several variables, the continuity
of the derivative of order p is replaced by the continuity of all the partial
derivatives whose total order is p. Finally, the adjective regular is often
used to indicate some (unspecified) degree of differentiability of a function.
15.1.2 Proposition If $I$ is a closed and bounded interval of $\mathbb{R}$, then
$C^p(I)$ is a complete normed vector space in each of the following norms:
(i) $N_1(f) = \sum_{k=0}^{p} \|f^{(k)}\|_\infty$;
(ii) $N_2(f) = \left(\sum_{k=0}^{p} \|f^{(k)}\|_\infty^2\right)^{1/2}$;
(iii) $N_\infty(f) = \max_{k=0,1,\ldots,p} \|f^{(k)}\|_\infty$,
where $\|f^{(k)}\|_\infty = \max_{x \in I} |f^{(k)}(x)|$.
Proof. This is a classical result. The completeness of these spaces is based
on the fact that $C^0(I)$ is complete in the norm for uniform convergence,
$\|\cdot\|_\infty$ (see, for example, [KF74]). □
15.1.3 Remark When $I$ is not a bounded interval, for example $I = \mathbb{R}$,
none of the norms above make sense if $f, f', \ldots, f^{(p)}$ are not bounded. Take,
for example, $f(x) = e^x$. The same is true if $I$ is bounded but not closed
($f(x) = 1/x$ on $(0, 1]$).
15.1.4 Definition A function $f$ is said to be infinitely differentiable,
or of class $C^\infty$, on $I$ if $f$ is in $C^p(I)$ for all $p \in \mathbb{N}$. This space is denoted by
$C^\infty(I)$, which is read "C-infinity."
We are often going to need the notion of the support of a function.
Here we define the support of a continuous function. We will extend the
definition to measurable functions in Lesson 20, where we will need it for
studying convolution.
15.1.5 Definition Assume that $f : I \to \mathbb{C}$ is continuous. The support
of $f$, which is denoted by $\operatorname{supp}(f)$, is defined to be the complement of the
largest open set on which $f$ is zero.
Since $f$ is continuous, $\operatorname{supp}(f) = \overline{\{x \mid x \in I,\ f(x) \ne 0\}}$, the closure of the
set where $f(x) \ne 0$.
15.1.6 Definition $C_c^p(I)$ denotes the space of functions in $C^p(I)$ that
have bounded support in $I$.
To introduce distributions, we will need to use functions with bounded
support that have as much regularity as possible.
15.1.7 Definition $\mathcal{D}(\mathbb{R})$ (or $\mathcal{D}(I)$) will denote the space of functions
in $C^\infty(\mathbb{R})$ (or $C^\infty(I)$) that have bounded support.
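15.1.8 Example The function
$$\varphi(x) = \begin{cases} \exp\left(-\dfrac{1}{1 - x^2}\right) & \text{if } |x| < 1, \\ 0 & \text{otherwise} \end{cases}$$
is in $\mathcal{D}(\mathbb{R})$.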
In the last definition it is important to be precise about the interval $I$. For
example, if $I = [a, b]$, one can have $f \in \mathcal{D}(I)$ without $f$ being zero at $a$ and
$b$. If $I = (a, b)$, the support of $f$ must be a closed set in $(a, b)$. In this case,
$f$ can be extended by continuity to all of $\mathbb{R}$ ($f(x) = 0$, $x \in \mathbb{R} \setminus \operatorname{supp}(f)$),
and we have $\mathcal{D}((a, b)) \subset \mathcal{D}(\mathbb{R})$.
15.2 Spaces of integrable functions

In Lesson 13 we introduced the vector space of Lebesgue-integrable
functions, or more precisely, classes of functions. Recall that two functions that
are equal almost everywhere are equivalent from the point of view of
integration. In what follows we will not make a distinction between the class
of functions and a representative; we will speak simply of integrable
functions. However, if the class is "continuous," that is, if there is a continuous
function $f$ in the class, we will generally choose $f$ as the representative.
15.2.1 Definition Let $p > 0$ be an arbitrary positive number and let
$I$ be an interval (bounded or not) of $\mathbb{R}$. Then $L^p(I)$ denotes the space of
measurable functions $f : I \to \mathbb{R}$ (or $\mathbb{C}$) for which $|f(x)|^p$ is integrable on $I$.
15.2.2 Definition Let $I$ be an interval (bounded or not) of $\mathbb{R}$. Then
$L^\infty(I)$ denotes the space of measurable functions $f : I \to \mathbb{R}$ (or $\mathbb{C}$) that
are bounded almost everywhere.
15.2.3 Proposition The spaces $L^p(I)$, $1 \le p \le +\infty$, are complete
normed linear vector spaces when endowed with the following norms:
(i) $\|f\|_p = \left(\int_I |f(t)|^p\,dt\right)^{1/p}$ if $1 \le p < +\infty$;
(ii) $\|f\|_\infty = \inf\{c \mid \text{measure}\,\{x \mid |f(x)| \ge c\} = 0\}$.
When $p = \infty$ it is easy to show that $|f(x)| \le \|f\|_\infty$ except on a set of
measure zero.
We assume from now on that 1 :::; p, q :::; +oo. The proof of Proposition
15.2.3 is not immediate; we first establish the following result.
15.2.4 Lemma (Hölder's inequality) Assume that $f \in L^p(I)$
and $g \in L^q(I)$, where $\frac{1}{p} + \frac{1}{q} = 1$. Then $fg \in L^1(I)$, and
$$\int_I |f(t)g(t)|\,dt \le \|f\|_p \|g\|_q.$$
Proof. If $p = 1$ or $p = +\infty$, the inequality is clearly true. Assume that
$1 < p < +\infty$. By Young's inequality,
$$ab \le \frac{1}{p}a^p + \frac{1}{q}b^q$$
for all $a \ge 0$ and $b \ge 0$. We apply this to $|f(t)||g(t)|$ and integrate both
sides over $I$ to obtain the inequality
$$\int_I |f(t)g(t)|\,dt \le \frac{1}{p}\|f\|_p^p + \frac{1}{q}\|g\|_q^q.$$
By replacing $f$ by $\alpha f$, $\alpha > 0$, we see that
$$\int_I |f(t)g(t)|\,dt \le \frac{\alpha^{p-1}}{p}\|f\|_p^p + \frac{1}{q\alpha}\|g\|_q^q;$$
taking $\alpha = \|g\|_q^{q/p} / \|f\|_p$ yields Hölder's inequality. □
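Hölder's inequality can be checked numerically on a concrete pair of functions. The sketch below is an illustration under assumptions of our own: the interval $I = [0, 1]$, the functions $f(t) = t$ and $g(t) = e^t$, the exponents tried, and the helper names are not from the text. For each $p$ it computes the conjugate exponent $q$ and compares $\int_I |fg|\,dt$ (which equals 1 here) with $\|f\|_p \|g\|_q$.

```python
import math

def midpoint_integral(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def lp_norm(f, p, a, b):
    """Approximate the L^p norm of f on [a, b] for 1 <= p < +infinity."""
    return midpoint_integral(lambda t: abs(f(t)) ** p, a, b) ** (1.0 / p)

def holder_sides(f, g, p, a=0.0, b=1.0):
    """Return (integral of |f g|, ||f||_p * ||g||_q) with 1/p + 1/q = 1."""
    q = p / (p - 1.0)  # conjugate exponent, defined for p > 1
    lhs = midpoint_integral(lambda t: abs(f(t) * g(t)), a, b)
    rhs = lp_norm(f, p, a, b) * lp_norm(g, q, a, b)
    return lhs, rhs

if __name__ == "__main__":
    for p in (1.5, 2.0, 3.0):
        lhs, rhs = holder_sides(lambda t: t, math.exp, p)
        print(f"p={p}: integral |fg| ~ {lhs:.6f} <= ||f||_p ||g||_q ~ {rhs:.6f}")
```

The gap between the two sides depends on $p$; equality would require $|g|^q$ to be proportional to $|f|^p$, which is not the case for this pair.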
Proof of Proposition 15.2.3. It follows immediately that $L^p(I)$ is a
normed vector space when $p = 1$ or $p = +\infty$, so we assume that $1 < p < \infty$.
Suppose $f, g \in L^p(I)$. Then $f + g$ is in $L^p(I)$ since by the convexity of $x^p$
on $\mathbb{R}_+$ we have
$$(|f(t)| + |g(t)|)^p \le 2^{p-1}(|f(t)|^p + |g(t)|^p) \quad \text{a.e.}$$
To show that $\|\cdot\|_p$ is a norm on $L^p(I)$, we need only to prove the triangle
inequality; the other properties of $\|\cdot\|_p$ are obvious. Write
$$\|f + g\|_p^p = \int_I |f(t) + g(t)|^{p-1} |f(t) + g(t)|\,dt \le \int_I |f(t) + g(t)|^{p-1} |f(t)|\,dt + \int_I |f(t) + g(t)|^{p-1} |g(t)|\,dt.$$
Note that $|f + g|^{p-1} \in L^q$ when $\frac{1}{p} + \frac{1}{q} = 1$; thus by Hölder's inequality
$$\|f + g\|_p^p \le \left(\int_I |f(t) + g(t)|^{q(p-1)}\,dt\right)^{1/q} (\|f\|_p + \|g\|_p) = \|f + g\|_p^{p-1} (\|f\|_p + \|g\|_p),$$
from which we have $\|f + g\|_p \le \|f\|_p + \|g\|_p$.
Showing that the spaces $L^p(I)$ are complete is more technical; we refer
to [Bre83], for example, for a proof. □
15.2.5 Remark When $I$ is not a bounded interval, for example $I = \mathbb{R}$,
the bounded functions need not be in $L^p(I)$ for $p < +\infty$. Take, for example,
$f = 1$. The bounded functions are, however, integrable on all bounded
intervals of $\mathbb{R}$.
15.2.6 Definition $L^p_{\mathrm{loc}}(\mathbb{R})$, $p \ge 1$, denotes the space of measurable
functions from $\mathbb{R}$ to $\mathbb{R}$ or $\mathbb{C}$ such that $|f|^p$ is integrable on every bounded
interval of $\mathbb{R}$.
Clearly, $L^p(\mathbb{R}) \subset L^p_{\mathrm{loc}}(\mathbb{R})$. The periodic functions studied in the earlier
lessons on Fourier series with $p = 1$ or 2 are examples of functions in
$L^p_{\mathrm{loc}}(\mathbb{R})$.
15.2.7 Definition $L^1_p(0, a)$ denotes the space of functions with period
$a$ that are integrable on $(0, a)$. Similarly, $L^2_p(0, a)$ denotes the space of
$a$-periodic functions that are square integrable.
The vector space $L^1_p(0, a)$ was introduced in Lesson 5. We now know that
$$L^1_p(0, a) = \left\{f : \mathbb{R} \to \mathbb{C} \;\middle|\; f \text{ has period } a \text{ and } \int_0^a |f(t)|\,dt < +\infty\right\}$$
is complete in the norm $\|f\|_1 = \int_0^a |f(t)|\,dt$. Since $f$ is periodic, this integral
can be computed over any interval of the form $(\alpha, \alpha + a)$.
15.3 Inclusion and density

We are going to indicate the inclusion relations among the spaces we have
just defined. These relations for the spaces of differentiable functions need
no proof.
15.3.1 Proposition The following inclusions hold for all $p \in \mathbb{N}$ and
for all intervals $I$ of $\mathbb{R}$:
(i) $C^{p+1}(I) \subset C^p(I)$.
(ii) $C^\infty(I) \subset C^p(I)$.
(iii) $\mathcal{D}(I) \subset C_c^p(I)$.
It is necessary to consider the measure of I when discussing the inclusions

of the LP spaces.
15.3.2 Proposition Assume that $\mu(I) < +\infty$. Then the following
relations hold:
(i) $L^\infty(I) \subset L^p(I)$ for all $p \ge 1$.
(ii) $L^q(I) \subset L^p(I)$ for all $q \ge p \ge 1$.
(iii) There exists a constant $c = c(p, q)$ such that $\|h\|_p \le c\|h\|_q$ for all
$h \in L^q(I)$, $1 \le p \le q \le +\infty$.
Proof.
(i) For $f \in L^\infty(I)$, $\int_I |f(t)|^p\,dt \le \mu(I)\|f\|_\infty^p < +\infty$.
(ii) Take $p < q$ and $f \in L^q(I)$. Write $S = \{t \in I \mid |f(t)| \ge 1\}$. For $t \in S$,
$|f(t)|^p \le |f(t)|^q$; thus
$$\int_I |f(t)|^p\,dt \le \int_S |f(t)|^q\,dt + \int_{I \setminus S} |f(t)|^p\,dt \le \int_I |f(t)|^q\,dt + \int_{I \setminus S} dt \le \|f\|_q^q + \mu(I) < +\infty,$$
which proves that $f \in L^p(I)$.
(iii) The inequality $\|h\|_p \le c\|h\|_q$ is another way of saying that the
injection of $L^q(I)$ in $L^p(I)$ is continuous. Take $h$ in $L^q(I)$ with $p < q < +\infty$ and
apply Hölder's inequality with
$$f(t) = |h(t)|^p, \quad g(t) = 1, \quad r = \frac{q}{p} \quad \text{and} \quad s = \frac{q}{q - p}.$$
Since $\frac{1}{r} + \frac{1}{s} = 1$, we see that
$$\int_I |h(t)|^p\,dt \le \left(\int_I |h(t)|^q\,dt\right)^{p/q} \mu(I)^{(q-p)/q}.$$
It follows that $\|h\|_p \le c\|h\|_q$ with $c = (\mu(I))^{\frac{q-p}{pq}}$. □
When $\mu(I) = +\infty$ these results are false. The spaces $L^1(\mathbb{R})$ and $L^2(\mathbb{R})$
are not comparable.
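The constant in Proposition 15.3.2(iii) can be checked on a bounded interval. The following sketch is illustrative only: the interval $I = (0, 4)$, the test functions, and the helper names are assumptions introduced here. It evaluates $c = \mu(I)^{(q-p)/(pq)}$ for $p = 1$, $q = 2$ and verifies $\|h\|_1 \le c\|h\|_2$ for $h(t) = t$; for a constant function the inequality becomes an equality, which suggests that this value of $c$ cannot be improved.

```python
def midpoint_integral(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def lp_norm(h, p, a, b):
    """Approximate the L^p norm of h on (a, b) for 1 <= p < +infinity."""
    return midpoint_integral(lambda t: abs(h(t)) ** p, a, b) ** (1.0 / p)

def embedding_constant(measure, p, q):
    """c = mu(I) ** ((q - p) / (p * q)), for 1 <= p <= q < +infinity."""
    return measure ** ((q - p) / (p * q))

if __name__ == "__main__":
    a, b, p, q = 0.0, 4.0, 1.0, 2.0
    c = embedding_constant(b - a, p, q)
    for name, h in (("h(t) = t", lambda t: t), ("h(t) = 1", lambda t: 1.0)):
        print(name, lp_norm(h, p, a, b), "<=", c * lp_norm(h, q, a, b))
```

Since $c$ grows without bound as $\mu(I) \to +\infty$, the computation is consistent with the statement that no such inequality survives on an interval of infinite measure.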
Turning to the integrability properties of the regular functions, we see
immediately that if $I$ is not bounded, then there is no inclusion of $C^m(I)$
in $L^p(I)$ for any $m \ge 0$ and $p \ge 1$. It is sufficient to take $f = 1$. If $I$ is
closed and bounded, then $C^m(I) \subset L^\infty(I) \subset L^p(I)$ for all $m \ge 0$ and all
$p \ge 1$.
The following theorem about approximation is more precise and will be
used often in the rest of the book.
15.3.3 Theorem Let I be an open (bounded or unbounded) interval

oflR. The space C~(J) is densein L 1 (I). In other words, given f E L 1 (I)
and E: > 0 there exists 'Pc E C~(J) suchthat llf- 'Pclll < E:.
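Proof. Assume that $I$ is bounded. Given $f \in L^1(I)$ and $\varepsilon > 0$, it is
possible to find a closed interval $K \subset I$ such that $\int_I |f - \chi_K f|\,d\mu < \varepsilon/4$.
By definition of the Lebesgue integral, we know that the simple functions
are dense in $L^1(K)$. Let $s = \sum_{n=1}^{N} \alpha_n \chi_{S_n}$ be a simple function such that
$\int_K |f - s|\,d\mu < \varepsilon/4$. Then $\int_I |f - s|\,d\mu < \varepsilon/2$ (with $s$ extended by zero outside $K$).
The next step is to show that we can approximate each function $\chi_{S_n}$ with
a continuous function $\varphi_n \in C_c^0(I)$ so that $\int_I |\chi_{S_n} - \varphi_n|\,d\mu < \eta$, where
$\eta > 0$ is arbitrary. If we do this with $\eta = \varepsilon / \big(2\sum |\alpha_n|\big)$, then
$\int_I |s - \sum \alpha_n \varphi_n|\,d\mu \le \sum |\alpha_n| \int_I |\chi_{S_n} - \varphi_n|\,d\mu < \varepsilon/2$, and we will have shown
that $\|f - \varphi_\varepsilon\|_1 < \varepsilon$ for $\varphi_\varepsilon = \sum \alpha_n \varphi_n$ and proved the theorem.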
To simplify the notation, let $E$ denote any one of the $S_n$ and take $\eta > 0$
as indicated above. $E$ is a measurable set in $K$. From the construction of
Lebesgue measure, we know that there exist an open set $\Omega$ and a closed
set $F$ such that $F \subset E \subset \Omega \subset I$, where the last inclusion is proper, and
such that $\mu(\Omega \setminus F) < \eta/2$. Consider the function $g$ defined by
$$g(x) = \frac{d(x, I \setminus \Omega)}{d(x, I \setminus \Omega) + d(x, F)},$$
where $d(x, A) = \inf\{d(x, a) \mid a \in A\}$ is the distance from $x$ to $A$. The
denominator of this fraction cannot vanish, because $I \setminus \Omega$ and $F$ are disjoint,
closed, and bounded sets. The function $x \mapsto d(x, A)$ is continuous; thus $g$
is continuous. For all $x \in I$, we have $0 \le g(x) \le 1$, and $\chi_E(x) - g(x) = 0$
if $x \in F$ or if $x \in I \setminus \Omega$. It follows that
$$\int_I |\chi_E - g|\,d\mu = \int_{\Omega \setminus F} |\chi_E - g|\,d\mu \le 2\mu(\Omega \setminus F) < \eta.$$
This proves the result for the case when $I$ is bounded. If $I$ is unbounded,
we first approximate $f$ in the $L^1$ norm with $\chi_J f$, where $J$ is a bounded
open interval. □
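The construction of $g$ in this proof can be tried out numerically. The sketch below is an illustration added here, not part of the original text; it assumes $I = (0, 1)$, $E = F = [0.4, 0.6]$, and $\Omega = (0.35, 0.65)$, and it uses a midpoint rule to estimate $\int_I |\chi_E - g|\,d\mu$. All function names are hypothetical.

```python
def dist_to_complement(x, lo, hi):
    """Distance from x to I minus Omega, where Omega = (lo, hi) sits inside I = (0, 1)."""
    return max(0.0, min(x - lo, hi - x))

def dist_to_closed_interval(x, c, d):
    """Distance from x to the closed interval F = [c, d]."""
    return max(c - x, x - d, 0.0)

def g(x, c=0.4, d=0.6, delta=0.05):
    """g = d(x, I minus Omega) / (d(x, I minus Omega) + d(x, F)), with Omega = (c - delta, d + delta)."""
    d_out = dist_to_complement(x, c - delta, d + delta)
    d_in = dist_to_closed_interval(x, c, d)
    return d_out / (d_out + d_in)   # the two distances never vanish together

def indicator_E(x, c=0.4, d=0.6):
    return 1.0 if c <= x <= d else 0.0

n = 100_000
step = 1.0 / n
l1_error = step * sum(abs(indicator_E((k + 0.5) * step) - g((k + 0.5) * step))
                      for k in range(n))
measure_gap = 2 * 0.05              # mu(Omega minus F) for the sets chosen above
print(l1_error, 2 * measure_gap)
```

With these choices $g$ is a trapezoid: it equals 1 on $F$, 0 outside $\Omega$, and is linear in between, so the $L^1$ error is about $0.05$, well below the bound $2\mu(\Omega \setminus F) = 0.2$.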
15.3.4 Remark Theorem 15.3.3 is also true for $1 < p < +\infty$. It is false
for $p = +\infty$, as can be seen by taking $I = (0, 1)$ and $f = 1$ (see [Bre83]).
We will see later (Lesson 21) that $\mathscr{D}(\mathbb{R})$ is dense in $L^1(\mathbb{R})$. For this we will
use the convolution.
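The display below summarizes the inclusion relations among the function
spaces that we have introduced so far.
$$\begin{array}{ccccccc}
C^\infty(I) & \subset \cdots \subset & C^{p+1}(I) & \subset & C^p(I) & \subset \cdots \subset & C^0(I)\\
\cup & & \cup & & \cup & & \cup\\
\mathscr{D}(I) & \subset \cdots \subset & C_c^{p+1}(I) & \subset & C_c^p(I) & \subset \cdots \subset & C_c^0(I)
\end{array}$$
If $\mu(I) < +\infty$,
$$L^\infty(I) \subset \cdots \subset L^p(I) \subset \cdots \subset L^2(I) \subset L^1(I) \subset L^1_{\mathrm{loc}}(I).$$
The space $L^2(\mathbb{R})$ plays a central role in Fourier analysis. Its norm is
derived from a scalar product, and it is a Hilbert space. This will be the
subject of the next lesson.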
15.4 Exercises
Exercise 15.1 Let $\Omega$ be an open set in $\mathbb{R}^n$ and let $f : \Omega \to \mathbb{R}$ be measurable.
Assume that $f$ satisfies the following property (B):
(B) There exists $c > 0$ such that $|f(x)| \le c$ for almost all $x \in \Omega$.
Define
$$\|f\|_\infty = \inf\{\alpha \mid |f(x)| \le \alpha \ \text{a.e. on } \Omega\}.$$
(a) Show that $|f(x)| \le \|f\|_\infty$ a.e. on $\Omega$.
(b) Let $L^\infty(\Omega)$ be the set of functions defined on $\Omega$ with values in $\mathbb{R}$ that satisfy
(B). Show that $\|\cdot\|_\infty$ is a norm on $L^\infty(\Omega)$.
Exercise 15.2 Let $f : [-1, 1] \to \mathbb{R}$ be defined by $f(x) = x^2$ and let $g$ be
equal to $f$ except at $x = 0$, where $g(0) = 2$, and at $x = 4$, where $g(4) = 4$.
Compare $\sup_{-1 \le x \le 1} |f(x)|$ with $\sup_{-1 \le x \le 1} |g(x)|$ and $\|f\|_\infty$ with $\|g\|_\infty$. Conclusion?
Exercise 15.3 Let $f$ and $g$ be in $L^p(I)$, $1 \le p \le \infty$. Show that $\|f - g\|_p = 0$
if and only if $f = g$ a.e.
Exercise 15.4 Show that if $f$ and $g$ are in $L^2(I)$, the product $fg$ is integrable.
Give an example where $f, g \in L^1(I)$ but $fg$ is not integrable on $I$.
Exercise 15.5 Let $f : \mathbb{R} \to \mathbb{R}^+$ be an integrable function on $\mathbb{R}$. Show that
if $g : \mathbb{R} \to \mathbb{R}$ is equivalent to $f$ at $+\infty$ ($\lim_{x \to +\infty} g(x)/f(x) = \lambda \ne 0$), then $g$ is
integrable in a neighborhood of $+\infty$.
Exercise 15.6 Let $f(x) = P(x)/Q(x)$ be a rational function with coefficients
in $\mathbb{C}$. Assume that $Q$ has no real roots. Show that:
(a) $\deg(P) \le \deg(Q) \implies f \in L^\infty(\mathbb{R})$.
(b) $\deg(P) \le \deg(Q) - 1 \implies f \in L^2(\mathbb{R})$.
(c) $\deg(P) \le \deg(Q) - 2 \implies f \in L^1(\mathbb{R})$.
Study the implications in the other direction. (deg = degree.)
**Exercise 15.7 ($L^1(\mathbb{R})$ is complete) Let $f_n$ be a Cauchy sequence in
$L^1(\mathbb{R})$.
(a) Show that one can extract a subsequence $f_{\sigma(n)}$ such that for all $n \in \mathbb{N}$,
$$\|f_{\sigma(n+1)} - f_{\sigma(n)}\|_1 < \frac{1}{2^n}.$$
(b) Write
$$g_n(x) = |f_{\sigma(1)}(x)| + \sum_{k=1}^{n} |f_{\sigma(k+1)}(x) - f_{\sigma(k)}(x)|.$$
Show that $g_n$ converges almost everywhere to a function $g$ of $L^1(\mathbb{R})$.
(c) Show that $f_{\sigma(n)}$ converges almost everywhere to a function $f$ in $L^1(\mathbb{R})$.
Verify that $\lim_{n \to \infty} \|f_{\sigma(n)} - f\|_1 = 0$ and deduce that $\lim_{n \to \infty} \|f_n - f\|_1 = 0$.
Conclusions?
**Exercise 15.8 ($L^\infty$ is complete) Let $f_n$ be a Cauchy sequence in
$L^\infty(\mathbb{R})$.
(a) Use the Cauchy criterion to show that the sequence $f_n$ converges uniformly
on $\mathbb{R} \setminus M$, where $M$ is a set of measure zero. Show that the limit $f$ is in
$L^\infty(\mathbb{R})$.
(b) Prove that $\lim_{n \to \infty} \|f - f_n\|_\infty = 0$.
Lesson 16
Hilbert Spaces
In this lesson we present the basic elements of the theory of Hilbert spaces.
These spaces generalize several aspects of $\mathbb{R}^n$. Hilbert spaces are endowed
with a "Euclidean" geometry in the sense that there is a distance function
and the notion of angle between two vectors. Hilbert spaces are complete,
and this allows one to develop the notion of an infinite-dimensional basis.
The prototypic Hilbert space is $L^2(I)$. Some of the points introduced in
Lesson 4, including orthogonal projections, will be formalized.
16.1 Definitions and geometric properties

16.1.1 Definition Let $E$ be a vector space over $K$ ($K = \mathbb{R}$ or $\mathbb{C}$). A
scalar product on $E$ is a mapping from $E \times E$ to $K$, denoted by $(\cdot, \cdot)$, that
satisfies the following properties for all $x, y, z \in E$ and $\alpha \in K$:
(S1) $(x, x) \ge 0$ and $(x, x) = 0 \Rightarrow x = 0$;
(S2) $(x, y) = \overline{(y, x)}$;
(S3) $(x + y, z) = (x, z) + (y, z)$ and $(\alpha x, y) = \alpha (x, y)$.
In (S2), $\overline{(y, x)}$ is the complex conjugate of $(y, x)$. (S2) and (S3) imply
that $(x, \alpha y) = \overline{\alpha}\,(x, y)$. When $K = \mathbb{R}$, the conjugation bars are clearly
superfluous. Since we will be working most of the time in $\mathbb{C}$, we present the
results for this field.
16.1.2 Definition A vector space endowed with a scalar product is
called a pre-Hilbert space.
Define $\|x\| = \sqrt{(x, x)}$. We will see that this is a norm once the following
lemma is established.
16.1.3 Lemma Let $H$ be a pre-Hilbert space over $\mathbb{C}$. The following
geometric relations hold for all $x, y \in H$:
(i) Schwarz inequality:
$$|(x, y)| \le \|x\|\,\|y\|.$$
(ii) Parallelogram identity:
$$\|x + y\|^2 + \|x - y\|^2 = 2\big(\|x\|^2 + \|y\|^2\big).$$
Proof. If $y = 0$, then (i) is trivial, so assume $y \ne 0$. We must show that
$$\Big|\Big(x, \frac{y}{\|y\|}\Big)\Big| \le \|x\|.$$
This reduces to the case $\|y\| = 1$; but if $\|y\| = 1$, then
$$0 \le \|x - (x, y)y\|^2 = \|x\|^2 + |(x, y)|^2 - (x, (x, y)y) - ((x, y)y, x) = \|x\|^2 - |(x, y)|^2,$$
which proves (i).
To prove (ii), just expand the left-hand side:
$$\|x + y\|^2 + \|x - y\|^2 = \|x\|^2 + (x, y) + (y, x) + \|y\|^2 + \|x\|^2 - (x, y) - (y, x) + \|y\|^2 = 2\big(\|x\|^2 + \|y\|^2\big). \qquad \square$$
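Both relations of the lemma are easy to check numerically in $\mathbb{C}^n$. The sketch below is an added illustration; the vectors `x` and `y` are arbitrary sample data, and `inner` implements the standard scalar product on $\mathbb{C}^n$, linear in the first argument and conjugate-linear in the second.

```python
import math

def inner(x, y):
    """Scalar product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

x = [1 + 2j, -1j, 3 + 0j]       # arbitrary sample vectors
y = [2 - 1j, 1 + 1j, -2j]

# Schwarz inequality: |(x, y)| <= ||x|| ||y||
schwarz_gap = norm(x) * norm(y) - abs(inner(x, y))

# Parallelogram identity: ||x+y||^2 + ||x-y||^2 = 2(||x||^2 + ||y||^2)
s = [a + b for a, b in zip(x, y)]
d = [a - b for a, b in zip(x, y)]
parallelogram_gap = norm(s) ** 2 + norm(d) ** 2 - 2 * (norm(x) ** 2 + norm(y) ** 2)

print(schwarz_gap, parallelogram_gap)
```

For these vectors $(x, y) = -1 + 10i$, $\|x\|^2 = 15$, and $\|y\|^2 = 11$, so the Schwarz gap is $\sqrt{165} - \sqrt{101} > 0$, while the parallelogram gap vanishes up to rounding.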
16.1.4 Proposition A pre-Hilbert space is a normed vector space
with the norm $\|f\| = \sqrt{(f, f)}$.
Proof. The only axiom that is not obvious is the triangle inequality. We
have $\|f + g\|^2 = \|f\|^2 + \|g\|^2 + 2\,\mathrm{Re}(f, g)$. From the Schwarz inequality, we
deduce that $2\,\mathrm{Re}(f, g) \le 2\|f\|\,\|g\|$; hence, $\|f + g\|^2 \le (\|f\| + \|g\|)^2$. □
16.1.5 Definition A pre-Hilbert space $H$ that is complete with respect
to its norm $\|f\| = \sqrt{(f, f)}$ is called a Hilbert space.
16.1.6 Examples
Hilbert spaces:
$\mathbb{R}^n$ with the scalar product $(x, y) = \sum_{i=1}^{n} x_i y_i$, where $x = (x_1, \ldots, x_n)$
and $y = (y_1, \ldots, y_n)$.
$\mathbb{C}^n$ with the scalar product $(x, y) = \sum_{i=1}^{n} x_i \overline{y_i}$.
$L_p^2(0, a)$ with the scalar product $(f, g) = \int_0^a f(t)\overline{g(t)}\,dt$ (see Section 4.1).
$L_p^2(0, a)$ is complete by Proposition 15.2.3.
Pre-Hilbert spaces:
$C^0([a, b]; \mathbb{R})$ with the scalar product $(f, g) = \int_a^b f(t) g(t)\,dt$.
$C^0([a, b]; \mathbb{C})$ with the scalar product $(f, g) = \int_a^b f(t)\overline{g(t)}\,dt$.
16.1.7 Definition Let $H$ be a pre-Hilbert space. We say that $x$ and
$y \in H$ are orthogonal if $(x, y) = 0$. Let $S$ be a subset of $H$. The orthogonal
complement of $S$ is the set $S^\perp$ defined by
$$S^\perp = \{y \in H \mid (x, y) = 0 \ \text{for all } x \in S\}.$$
16.1.8 Proposition (Pythagorean identity) Let $H$ be a pre-Hilbert
space. If $x$ and $y$ are orthogonal, then
$$\|x + y\|^2 = \|x\|^2 + \|y\|^2.$$
More generally, if $\phi_1, \phi_2, \ldots, \phi_n$ are pairwise orthogonal, then
$$\Big\|\sum_{i=1}^{n} \phi_i\Big\|^2 = \sum_{i=1}^{n} \|\phi_i\|^2.$$
Proof. It is sufficient to expand the left-hand sides of these equations. □
16.1.9 Remark When $K = \mathbb{R}$, the Pythagorean relation implies that
$x$ and $y$ are orthogonal. Thus $(x, y) = 0 \iff \|x + y\|^2 = \|x\|^2 + \|y\|^2$. The
converse is false if $K = \mathbb{C}$, since $\mathrm{Re}(x, y)$ can be zero without $(x, y)$ being
zero. (Take $x = (1, i)$ and $y = (-i, 1)$.)
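The counterexample in the remark can be verified directly. The short sketch below is an added illustration that uses the scalar product of $\mathbb{C}^2$ from Examples 16.1.6; the helper name `inner` is a choice made here.

```python
def inner(x, y):
    """Scalar product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

x = [1 + 0j, 1j]        # x = (1, i)
y = [-1j, 1 + 0j]       # y = (-i, 1)

product = inner(x, y)                                  # purely imaginary, not zero
s = [a + b for a, b in zip(x, y)]
norm_sum_sq = inner(s, s).real                         # ||x + y||^2
sum_norms_sq = inner(x, x).real + inner(y, y).real     # ||x||^2 + ||y||^2

print(product, norm_sum_sq, sum_norms_sq)
```

Here $(x, y) = 2i$, so $x$ and $y$ are not orthogonal, yet $\|x + y\|^2 = 4 = \|x\|^2 + \|y\|^2$ because only $\mathrm{Re}(x, y)$ enters the expansion of $\|x + y\|^2$.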
16.2 Best approximation in a vector subspace

Given a pre-Hilbert space $H$, a linear subspace $V \subset H$, and an element
$f \in H$, we can ask the following questions:
(i) Does there exist an $f^* \in V$ such that
$$\|f - f^*\| = \min_{v \in V} \|f - v\|\,?$$
(ii) Can we characterize $f^*$?

If $f^*$ exists, it is called the best approximation of $f$ in $V$. For example,
take $H = L_p^2(0, 2\pi)$ and $V$ the subspace of trigonometric polynomials
generated by $1, \sin x, \cos x, \ldots, \sin nx, \cos nx$ (see Section 4.2).
The next theorem answers the first question.
16.2.1 Theorem (orthogonal projection) Suppose $H$ is a pre-Hilbert
space and $V$ is a complete linear subspace of $H$. Given $f \in H$, there
exists a unique $f^* \in V$ such that
$$\|f - f^*\| = \min_{v \in V} \|f - v\|.$$
Proof. Let $d = \inf\{\|f - v\| \mid v \in V\}$. $V$ is complete and hence closed. If
$d = 0$, we have $f = f^*$. Thus assume $d > 0$ and consider the sets
$$C_n = \Big\{v \in V \ \Big|\ \|f - v\| \le d + \frac{1}{n}\Big\}.$$
We wish to estimate the size of $C_n$, so let $v_1$ and $v_2$ be arbitrary elements
of $C_n$. By the parallelogram identity,
$$\|v_1 - v_2\|^2 + \|v_1 + v_2 - 2f\|^2 = 2\|v_1 - f\|^2 + 2\|v_2 - f\|^2.$$
The right-hand side is bounded above by $4d^2 + 8d/n + 4/n^2$. On the other
side, $\|v_1 + v_2 - 2f\|^2 \ge 4d^2$ because $(v_1 + v_2)/2 \in V$ and hence is a contender
in the definition of $d$. These inequalities imply that $\|v_1 - v_2\|^2 \le 8d/n + 4/n^2$,
and thus the diameter of $C_n$ tends to 0 as $n \to \infty$.
Since $C_n$ is not empty, we can choose $v_n \in C_n$. The sequence $v_n$ is a
Cauchy sequence because $\|v_n - v_{n+p}\|^2 \le 8d/n + 4/n^2$ for all $p \ge 0$. Thus
$v_n$ tends to a limit $f^*$ in $V$, and $\|f - f^*\| \le d$, from which it follows that
$$\|f - f^*\| = d = \min_{v \in V} \|f - v\|.$$
The uniqueness comes from the scalar product via the parallelogram
identity. Suppose $f_1$ and $f_2$ are two solutions; then $d = \|f - f_1\| = \|f - f_2\|$.
Since $(f_1 + f_2)/2$ is in $V$,
$$d \le \Big\|f - \frac{1}{2}(f_1 + f_2)\Big\| \le \frac{1}{2}\|f - f_1\| + \frac{1}{2}\|f - f_2\| = d.$$
By writing $u = \frac{1}{d}(f - f_1)$ and $v = \frac{1}{d}(f - f_2)$ we see that
$$\|u\| = \|v\| = 1 \quad \text{and} \quad \|u + v\| = 2,$$
and $\|u - v\| = 0$ by the parallelogram identity; hence $f_1 = f_2$. □
16.2.2 Remark Note that the proof of Theorem 16.2.1 remains valid
when $V$ is any set of vectors that is convex and closed in the norm.
Computing the best approximation is easy when working with a norm
derived from a scalar product. The success of least squares techniques rests
on the next result.
16.2.3 Proposition Let $H$ be a pre-Hilbert space and let $V$ be a
linear subspace of $H$. If $f \in H$, then $f^*$ is the best approximation of $f$ in
$V$ if and only if
$$(f - f^*, v) = 0 \quad \text{for all } v \in V. \tag{16.1}$$
Proof. We first prove the sufficient condition. Since $f^* - v \in V$ for all
$v \in V$, $(f - f^*, f^* - v) = 0$, and by the Pythagorean identity,
$$\|f - v\|^2 = \|f - f^*\|^2 + \|f^* - v\|^2.$$
Hence $\|f - f^*\| \le \|f - v\|$ for all $v \in V$, and $f^*$ is the best approximation.
The necessary condition is obtained by examining vectors in $V$ in a neighborhood
of $f^*$. The idea is that if $(f - f^*, v) \ne 0$, there are perhaps vectors
nearby that lower the value of $\|f - f^*\|$. Thus take $v_\alpha = f^* + \alpha(v - f^*)$,
where $\alpha \in \mathbb{C}$ and $v \in V$ are arbitrary. By the definition of $f^*$ we have
$$\|f - f^*\|^2 \le \|f - v_\alpha\|^2 = \|f - f^*\|^2 + |\alpha|^2 \|v - f^*\|^2 - \alpha(v - f^*, f - f^*) - \overline{\alpha}(f - f^*, v - f^*).$$
Consequently,
$$\alpha(v - f^*, f - f^*) + \overline{\alpha}(f - f^*, v - f^*) \le |\alpha|^2 \|v - f^*\|^2$$
for all $v \in V$ and all $\alpha \in \mathbb{C}$. Write $\alpha = |\alpha| e^{i\theta}$ and $w = v - f^*$. Dividing
both sides by $|\alpha|$ and letting $|\alpha| \to 0$, we have
$$e^{i\theta}(w, f - f^*) + e^{-i\theta}(f - f^*, w) \le 0$$
for all $\theta$ and all $w \in V$. This implies that $(w, f - f^*) = 0$, which can be
seen by taking $\theta = 0, \pi/2, \pi$, and $3\pi/2$. □
This necessary and sufficient condition allows one to compute $f^*$ when
$V$ is a subspace of dimension $n$. If $\phi_1, \phi_2, \ldots, \phi_n$ is a basis for $V$ and we
write
$$f^* = \sum_{k=1}^{n} \lambda_k \phi_k,$$
then condition (16.1) translates into a system of linear equations in the $\lambda_k$:
$$\sum_{k=1}^{n} \lambda_k (\phi_k, \phi_j) = (f, \phi_j), \qquad j = 1, \ldots, n.$$
The matrix $G$ of this system, with $G_{ij} = (\phi_i, \phi_j)$, $i, j = 1, \ldots, n$, is called
the Gram matrix associated with the basis $\phi_1, \ldots, \phi_n$. This clearly shows
one reason why we want to have an orthogonal basis: In this case the matrix
is diagonal. (Recall the orthogonal polynomials in Section 6.2.)
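To see the Gram system at work, here is a small worked example that is not taken from the text. It assumes $H = L^2(0, 1)$ over $\mathbb{R}$, $V$ spanned by $\phi_1 = 1$ and $\phi_2 = t$, and $f(t) = t^2$; since $(t^i, t^j) = 1/(i + j + 1)$, every entry is an exact rational number. The helper names `solve` and `monomial_inner` are hypothetical.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination over the rationals."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def monomial_inner(i, j):
    """(t^i, t^j) in L^2(0, 1): the integral of t^(i+j) over (0, 1)."""
    return Fraction(1, i + j + 1)

n = 2                                                    # V = span{1, t}
gram = [[monomial_inner(k, j) for k in range(n)] for j in range(n)]
rhs = [monomial_inner(2, j) for j in range(n)]           # (f, phi_j) for f(t) = t^2
coeffs = solve(gram, rhs)                                # lambda_1, lambda_2
residuals = [rhs[j] - sum(coeffs[k] * gram[j][k] for k in range(n))
             for j in range(n)]                          # (f - f*, phi_j)
print(coeffs, residuals)
```

The solution is $f^*(t) = t - \tfrac{1}{6}$, and the residuals $(f - f^*, \phi_j)$ vanish exactly, which is condition (16.1).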
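16.2.4 Proposition Assume that the $\phi_i$ form a basis for $V$. The Gram
matrix with general term $(\phi_i, \phi_j)$, $i, j = 1, \ldots, n$, is Hermitian and positive
definite.
Proof. $G$ is clearly Hermitian: $G_{ij} = (\phi_i, \phi_j) = \overline{(\phi_j, \phi_i)} = \overline{G_{ji}}$. Now let
$X = (x_1, \ldots, x_n)$ be an element of $\mathbb{C}^n$ and compute $(X, GX)$, where $G$ acts
as the matrix of the linear system above, whose entry in row $i$ and column $j$
is $(\phi_j, \phi_i)$ (this matrix is the conjugate of $G$, so the two are positive
definite together). Since
$$(GX)_i = \sum_{j=1}^{n} (\phi_j, \phi_i)\,x_j = \Big(\sum_{j=1}^{n} x_j \phi_j, \phi_i\Big),$$
it follows that
$$(X, GX) = \sum_{i=1}^{n} x_i \overline{(GX)_i} = \sum_{i=1}^{n} x_i\,\overline{\Big(\sum_{j=1}^{n} x_j \phi_j, \phi_i\Big)} = \Big\|\sum_{i=1}^{n} x_i \phi_i\Big\|^2 \ge 0.$$
If $(X, GX) = 0$, then $\sum_{i=1}^{n} x_i \phi_i = 0$, and hence $X = 0$ because the $\phi_i$ are
linearly independent. □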
This proof provides another way to show that the best approximation
exists and is unique when $V$ has finite dimension. When the basis is orthogonal,
$f^*$ is given by
$$f^* = \sum_{i=1}^{n} \frac{(f, \phi_i)}{(\phi_i, \phi_i)}\,\phi_i. \tag{16.2}$$
16.2.5 Definition The quantity $c_i(f) = (f, \phi_i)/(\phi_i, \phi_i)$ is called the
Fourier coefficient of $f$ relative to $\phi_i$. The coefficient $c_i(f) = (f, \phi_i)$ if the
$\phi_i$ are orthonormal.
This definition is a generalization of what was developed in Section 4.2
for the basis $\phi_k(x) = e^{2i\pi k \frac{x}{a}}$, which is indeed orthogonal. Thus (16.2) is
the trigonometric polynomial that best approximates $f$ in the quadratic
norm. It is also the truncated Fourier series of $f$. We will see how these
ideas generalize to an arbitrary Hilbert space.
16.3 Orthogonal systems and Hilbert bases

In this section we generalize to Hilbert spaces the notions of orthogonal
and orthonormal bases found in the Euclidean spaces $\mathbb{R}^n$.
16.3.1 Definition Assume that $H$ is a pre-Hilbert space. A countable
subset $\mathscr{B} = \{\phi_n \mid n \in \mathbb{N}\}$ is
orthogonal if $(\phi_n, \phi_m) = 0$ for all $n \ne m$;
orthonormal if $(\phi_n, \phi_m) = 1$ if $n = m$, and $0$ otherwise;
total if $\mathscr{B}^\perp = \{0\}$.
The last condition means that $(f, \phi_n) = 0$ for all $n \in \mathbb{N}$ implies $f = 0$.
When $\mathscr{B}$ is an orthogonal system, the numbers $c_n(f) = (f, \phi_n)/(\phi_n, \phi_n)$
are called the Fourier coefficients of $f$ relative to $\mathscr{B}$.
16.3.2 Proposition (Bessel's inequality) Suppose that $H$ is a
pre-Hilbert space and that $\mathscr{B} = \{\phi_n \mid n \in \mathbb{N}\}$ is an orthogonal system. For
all $f \in H$,
$$\sum_{n=1}^{\infty} |c_n(f)|^2 \|\phi_n\|^2 \le \|f\|^2.$$
Proof. Let $f_p = \sum_{n=1}^{p} c_n(f)\phi_n$. Then $(f - f_p, f_p) = 0$ for all $p \in \mathbb{N}$ by
Proposition 16.2.3. The Pythagorean relation implies that
$$\Big\|f - \sum_{n=1}^{p} c_n(f)\phi_n\Big\|^2 = \|f\|^2 - \sum_{n=1}^{p} |c_n(f)|^2 \|\phi_n\|^2 \ge 0$$
for all $p \in \mathbb{N}$, and this proves the result. □
Based on the generalization of the idea of a basis, we would like to write
$$f = \sum_{n=1}^{\infty} c_n(f)\phi_n.$$
Writing this expression immediately raises two questions:

Q1: Does the series $S(f) = \sum_{n=1}^{\infty} c_n(f)\phi_n$ converge? More precisely, do
the partial sums $S_p = \sum_{n=1}^{p} c_n(f)\phi_n$ converge in $H$?
Q2: If the answer to Q1 is yes, is $f = S(f)$?
16.3.3 Definition (Fourier series) Let $H$ be a pre-Hilbert space
and let $\mathscr{B} = \{\phi_n \mid n \in \mathbb{N}\}$ be an orthogonal system. $S(f) = \sum_{n=1}^{\infty} c_n(f)\phi_n$
is called the Fourier series of $f$.
The answer to Q1 is given in the general context of Hilbert spaces by
the next result.
16.3.4 Proposition Let $H$ be a Hilbert space, $\mathscr{B} = \{\phi_n \mid n \in \mathbb{N}\}$
an orthogonal system, and $\alpha_n$, $n \in \mathbb{N}$, a sequence of scalars. The series
$\sum_{n=1}^{\infty} \alpha_n \phi_n$ converges if and only if $\sum_{n=1}^{\infty} |\alpha_n|^2 \|\phi_n\|^2 < +\infty$.
Proof. Write $S_p = \sum_{n=1}^{p} \alpha_n \phi_n$ and assume that $S_p$ converges to some
element $g$ in $H$. We have
$$\sum_{n=1}^{p} |\alpha_n|^2 \|\phi_n\|^2 = \Big\|\sum_{n=1}^{p} \alpha_n \phi_n\Big\|^2 = \|S_p\|^2 \le \big(\|S_p - g\| + \|g\|\big)^2$$
for all $p \in \mathbb{N}$. Letting $p$ tend to $+\infty$ shows that
$$\sum_{n=1}^{+\infty} |\alpha_n|^2 \|\phi_n\|^2 \le \|g\|^2 < +\infty.$$
To prove the converse, assume that $\sum_{n=1}^{\infty} |\alpha_n|^2 \|\phi_n\|^2 < +\infty$. We wish to
show that $S_p$ is a Cauchy sequence in $H$. For $p < q$,
$$\|S_q - S_p\|^2 = \Big\|\sum_{n=p+1}^{q} \alpha_n \phi_n\Big\|^2 = \sum_{n=p+1}^{q} |\alpha_n|^2 \|\phi_n\|^2.$$
The right-hand side is the Cauchy remainder of a convergent series in $\mathbb{R}$.
Consequently, $S_p$ is a Cauchy sequence in $H$, and since $H$ is complete, it
converges to some $g \in H$. □
16.3.5 Proposition Let $\mathscr{B} = \{\phi_n \mid n \in \mathbb{N}\}$ be an orthogonal system
in a Hilbert space $H$. The Fourier series $S(f) = \sum_{n=1}^{\infty} c_n(f)\phi_n$ converges
for all $f \in H$.
Proof. By Bessel's inequality, $\sum_{n=1}^{\infty} |c_n(f)|^2 \|\phi_n\|^2 \le \|f\|^2 < +\infty$; the
Fourier series converges by Proposition 16.3.4. □
We now move to question Q2.
16.3.6 Definition An orthogonal system $\mathscr{B} = \{\phi_n \mid n \in \mathbb{N}\}$ in a
Hilbert space $H$ is said to be an orthogonal basis (or a Hilbert basis) if for
all $f \in H$ we have $f = \sum_{n=1}^{\infty} c_n(f)\phi_n$.
The next theorem characterizes these bases.
16.3.7 Theorem Let $H$ be a Hilbert space and $\mathscr{B} = \{\phi_n \mid n \in \mathbb{N}\}$ an
orthogonal system. The following assertions are equivalent:
(i) The system $\mathscr{B}$ is total.
(ii) The finite linear combinations of elements of $\mathscr{B}$ are dense in $H$.
(iii) $\mathscr{B}$ is an orthogonal basis.
(iv) For all $f \in H$, $\|f\|^2 = \sum_{n=1}^{\infty} |c_n(f)|^2 \|\phi_n\|^2$ (Parseval's equality).
Proof.
(i) $\Rightarrow$ (ii) Let $[\mathscr{B}]$ be the set of finite linear combinations of elements of
$\mathscr{B}$. Suppose that $[\mathscr{B}]$ is not dense in $H$. Then there exists $f \in H$ such
that $f \notin \overline{[\mathscr{B}]}$, the closure of $[\mathscr{B}]$. $\overline{[\mathscr{B}]}$ is a closed subspace of $H$, so by
Theorem 16.2.1 there exists a unique $f^*$ in $\overline{[\mathscr{B}]}$ such that $f - f^* \in (\overline{[\mathscr{B}]})^\perp$.
Since $\mathscr{B} \subset \overline{[\mathscr{B}]}$, we have $(\overline{[\mathscr{B}]})^\perp \subset \mathscr{B}^\perp$, and since $\mathscr{B}$ is total, $\mathscr{B}^\perp = \{0\}$.
Thus $f = f^*$, contradicting the assumption $f \notin \overline{[\mathscr{B}]}$.
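(ii) $\Rightarrow$ (iii) Take $f \in H = \overline{[\mathscr{B}]}$ and $\varepsilon > 0$. There exists a finite number of
scalars $\alpha_1, \alpha_2, \ldots, \alpha_m$ such that
$$\Big\|f - \sum_{k=1}^{m} \alpha_k \phi_k\Big\| < \varepsilon.$$
By Theorem 16.2.1,
$$\Big\|f - \sum_{k=1}^{m} c_k(f)\phi_k\Big\| \le \Big\|f - \sum_{k=1}^{m} \alpha_k \phi_k\Big\| < \varepsilon.$$
If we let $\alpha_k = 0$ for $k = m + 1, \ldots, p$, this relation remains true for all
$p \ge m$, again by Theorem 16.2.1. Thus
$$\Big\|f - \sum_{k=1}^{p} c_k(f)\phi_k\Big\| < \varepsilon$$
for all $p \ge m$, which proves that the Fourier series $S(f)$ converges to $f$.
Since $f$ was arbitrary, this means that $\mathscr{B}$ is an orthogonal basis.
(iii) $\Rightarrow$ (iv) We saw in the proof of Proposition 16.3.2 that
$$\Big\|f - \sum_{n=1}^{p} c_n(f)\phi_n\Big\|^2 = \|f\|^2 - \sum_{n=1}^{p} |c_n(f)|^2 \|\phi_n\|^2.$$
The left-hand term tends to 0 as $p$ tends to $+\infty$. In the limit we have
Parseval's relation:
$$\|f\|^2 = \sum_{n=1}^{\infty} |c_n(f)|^2 \|\phi_n\|^2.$$
(iv) $\Rightarrow$ (i) Take $f \in \mathscr{B}^\perp$. Then for all $n \in \mathbb{N}$, $(f, \phi_n) = 0$, and Parseval's
relation implies $\|f\| = 0$. Hence $f = 0$, which proves that $\mathscr{B}$ is a total
system. □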
16.3.8 Corollary Two elements in $H$ having the same Fourier coefficients
are equal.
Proof. It suffices to show that an element $f$ of $H$ for which $c_n(f) = 0$ for
all $n \in \mathbb{N}$ is the zero element. This follows from Parseval's identity. □
In summary,
$$f = \sum_{n=1}^{\infty} \frac{(f, \phi_n)}{(\phi_n, \phi_n)}\,\phi_n$$
if and only if $\phi_n$ is an orthogonal basis. In general, the difficult step in this
theory is to show that a given family of functions is a basis, for example,
to show that the $\phi_n(x) = e^{2i\pi n x}$ form a basis in $H = L_p^2(0, 1)$.
16.3.9 Theorem The trigonometric system $\{e^{2i\pi n \frac{x}{a}}\}_{n \in \mathbb{Z}}$ is a basis for
the Hilbert space $L_p^2(0, a)$.
Proof. Take $f \in L_p^2(0, a)$. Let $f_N$ denote the best approximation of $f$ in
the finite-dimensional subspace generated by $e^{2i\pi n \frac{x}{a}}$, $n = -N, \ldots, +N$. We
know from Bessel's inequality that the series whose general term is $|c_n(f)|^2$
is summable and from the Pythagorean relation that
$$a \sum_{n=-N}^{N} |c_n(f)|^2 + \int_0^a |f(t) - f_N(t)|^2\,dt = \int_0^a |f(t)|^2\,dt. \tag{16.3}$$
Assume for the moment that $f$ is continuous on $\mathbb{R}$. The function
$$\varphi(x) = \int_0^a f(x + t)\overline{f(t)}\,dt$$
is $a$-periodic and continuous on $\mathbb{R}$ by Proposition 14.2.1. We compute the
Fourier coefficients of $\varphi$:
$$c_n(\varphi) = \frac{1}{a}\int_0^a \overline{f(t)}\,e^{2i\pi n \frac{t}{a}} \Big(\int_0^a f(x + t)\,e^{-2i\pi n \frac{x + t}{a}}\,dx\Big)\,dt = a\,|c_n(f)|^2.$$
This implies by Corollary 5.3.2 that the Fourier series of $\varphi$ converges
uniformly to some continuous, periodic function $\psi$. Then $\varphi$ and $\psi$ have
the same Fourier coefficients. Since they are both continuous, we know
from Exercise 4.7 (which is completely independent of other results) that
$\varphi(x) = \psi(x)$ for all $x \in \mathbb{R}$. In particular,
$$\sum_{n=-\infty}^{+\infty} c_n(\varphi)\,e^{2i\pi n \frac{x}{a}} = \varphi(x) = \int_0^a f(x + t)\overline{f(t)}\,dt.$$
By taking $x = 0$, we have Parseval's relation
$$\int_0^a |f(t)|^2\,dt = \sum_{n=-\infty}^{+\infty} c_n(\varphi) = a \sum_{n=-\infty}^{+\infty} |c_n(f)|^2;$$
from this and (16.3) we deduce that $\lim_{N \to +\infty} \int_0^a |f(t) - f_N(t)|^2\,dt = 0$.
The result for $f \in L_p^2(0, a)$ follows from the fact that $C_c^0(0, a)$ is dense
in $L_p^2(0, a)$ (Section 15.3.4). □
16.3.10 Corollary Each function $f \in L_p^2(0, a)$ can be written as a
unique Fourier series
$$f(t) = \sum_{n=-\infty}^{\infty} c_n(f)\,e^{2i\pi n \frac{t}{a}}$$
that converges in the norm of $L_p^2(0, a)$.

As promised in Lesson 4, this provides a proof of Theorem 4.3.1.
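Bessel's inequality and Parseval's relation can be observed on a concrete function. The sketch below is an added illustration that assumes $a = 1$ and $f(t) = t$ on $(0, 1)$; an integration by parts gives $c_0(f) = 1/2$ and $c_n(f) = i/(2\pi n)$ for $n \ne 0$, and $\int_0^1 |f|^2 = 1/3$. The function names are hypothetical.

```python
import math

def coefficient_modulus_squared(n):
    """|c_n(f)|^2 for f(t) = t on (0, 1): c_0 = 1/2, c_n = i / (2 pi n) for n != 0."""
    if n == 0:
        return 0.25
    return 1.0 / (4.0 * math.pi ** 2 * n ** 2)

def bessel_sum(N):
    """a * sum of |c_n(f)|^2 over |n| <= N, with a = 1, as on the left of (16.3)."""
    return sum(coefficient_modulus_squared(n) for n in range(-N, N + 1))

norm_squared = 1.0 / 3.0        # integral of t^2 over (0, 1)
for N in (1, 10, 100, 10_000):
    # The gap equals the squared L^2 error of the truncated Fourier series.
    print(N, bessel_sum(N), norm_squared - bessel_sum(N))
```

The partial sums increase toward $1/3$ without exceeding it (Bessel), and the gap, which by (16.3) is $\int_0^1 |f - f_N|^2\,dt$, shrinks roughly like $1/(2\pi^2 N)$ (Parseval).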
16.4 Exercises
Exercise 16.1 Verify that $C^0[-1, +1]$ endowed with the scalar product
$$(f, g) = \int_{-1}^{1} f(t) g(t)\,dt$$
is not a Hilbert space over $\mathbb{R}$. (Construct a counterexample.)

Remark: Consider the sequence $f_n$ defined by
$$f_n(x) = \begin{cases} 0 & \text{if } \frac{1}{n} \le x \le 1,\\[2pt] 1 - nx & \text{if } 0 \le x \le \frac{1}{n}. \end{cases}$$
It is instructive to see that this is not a counterexample.
**Exercise 16.2 Define
$$l^2(\mathbb{N}) = \Big\{f : \mathbb{N} \to \mathbb{C} \ \Big|\ \sum_{n=1}^{\infty} |f(n)|^2 < +\infty\Big\}.$$
Define a scalar product on $l^2(\mathbb{N})$ by
$$(f, g) = \sum_{n=1}^{\infty} f(n)\overline{g(n)}.$$
We are going to show that $l^2(\mathbb{N})$ is a Hilbert space:

(a) Let $f_k$ be a Cauchy sequence in $l^2(\mathbb{N})$. Show that for all $n \in \mathbb{N}$, the sequence
$f_k(n)$ converges in $\mathbb{C}$. Denote this limit by $f(n)$.
(b) Verify that $n \mapsto f(n)$ is in $l^2(\mathbb{N})$ and show that $f_k$ converges to $f$ in $l^2(\mathbb{N})$.
*Exercise 16.3 Let $H$ be a Hilbert space and let $S$ be a subset of $H$.

(a) Show that
$$S^\perp = \{y \in H \mid (s, y) = 0 \ \text{for all } s \in S\}$$
is a closed linear subspace of $H$.
(b) Show that $(S^\perp)^\perp = \overline{S}$ if $S$ is a linear subspace.
Exercise 16.4 Show that if $\phi_1, \phi_2, \ldots, \phi_n$ are $n$ nonzero elements of a pre-Hilbert
space that are pairwise orthogonal, then they are linearly independent.
*Exercise 16.5 (the parallelogram identity) Let $E$ be a normed
vector space over $\mathbb{C}$. Assume that the norm satisfies the parallelogram identity
(Section 16.1.3(ii)).
(a) Show that the quantity
$$(x, y) = \frac{1}{4}\big(\|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2\big)$$
defines a scalar product on $E$.

(b) Show that among the spaces $L^p(0, 1)$, $1 \le p < \infty$, the only pre-Hilbert
space is for $p = 2$. (Consider the functions $x(t) = t$ and $y(t) = 1 - t$.)
Exercise 16.6 Show that a Hilbert space $H$ that has a countable orthonormal
basis is isomorphic to $l^2(\mathbb{N})$. (Consider the mapping $H \to l^2(\mathbb{N})$ defined
by $f \mapsto (c_n(f))_{n \in \mathbb{N}}$.)
Exercise 16.7 Let $\phi_n$ be an orthonormal basis for the Hilbert space $H$.
Show that for all $f$ and $g$ in $H$,
$$(f, g) = \sum_{n=1}^{\infty} c_n(f)\,\overline{c_n(g)}.$$
Chapter VI
Convolution and the Fourier Transform of Functions

Lesson 17
The Fourier Transform of Integrable Functions

In this lesson we begin to develop properties of the Fourier transform of
functions defined on $\mathbb{R}$. Our main concern is with the basic rules for
manipulating these integrals. The inverse Fourier transform and properties
involving the convolution will be studied later.
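17.1 The Fourier transform on $L^1(\mathbb{R})$

17.1.1 Definition Given $f \in L^1(\mathbb{R})$ we write
$$\mathscr{F} f(\xi) = \hat{f}(\xi) = \int_{\mathbb{R}} e^{-2i\pi \xi x} f(x)\,dx, \tag{17.1}$$
$$\overline{\mathscr{F}} f(\xi) = \int_{\mathbb{R}} e^{2i\pi \xi x} f(x)\,dx. \tag{17.2}$$
By definition, the function $\mathscr{F} f$ is the Fourier transform of $f$, and $\overline{\mathscr{F}} f$
is the conjugate Fourier transform of $f$.
These integrals make sense if and only if $f \in L^1(\mathbb{R})$, since $|e^{2i\pi x \xi}| = 1$.
We will see later that $\overline{\mathscr{F}}$ is the inverse of the Fourier transform $\mathscr{F}$
whenever $\mathscr{F} f \in L^1(\mathbb{R})$.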
17.1.2 Example Let $f = \chi_{[a,b]}$ be the characteristic function of the
interval $[a, b]$. A simple computation shows that
$$\hat{f}(\xi) = \begin{cases} b - a & \text{if } \xi = 0,\\[4pt] \dfrac{\sin \pi (b - a)\xi}{\pi \xi}\,e^{-i\pi (a + b)\xi} & \text{if } \xi \ne 0. \end{cases}$$
In this case, $\hat{f}$ is not in $L^1(\mathbb{R})$ because $\Big|\dfrac{\sin \pi (b - a)\xi}{\xi}\Big|$ is not integrable (Section
13.4.6). We will refer to this example several times.

17.1.3 Theorem (Riemann-Lebesgue) If $f \in L^1(\mathbb{R})$, then $\hat{f}$
satisfies the following conditions:
(i) $\mathscr{F} f$ is continuous and bounded on $\mathbb{R}$.
(ii) $\mathscr{F}$ is a continuous linear operator from $L^1(\mathbb{R})$ to $L^\infty(\mathbb{R})$, and
$$\|\hat{f}\|_\infty \le \|f\|_1. \tag{17.3}$$
(iii) $\lim_{|\xi| \to +\infty} |\hat{f}(\xi)| = 0$.
Proof.
(i) The continuity of $\hat{f}$ follows directly from the continuity of the integral
(17.1) with respect to the parameter $\xi$. The function $\xi \mapsto e^{-2i\pi \xi x} f(x)$ is
continuous on $\mathbb{R}$ and is dominated by $|f(x)|$, which is in $L^1(\mathbb{R})$. Proposition
14.2.1 applies.
(ii) For all $\xi \in \mathbb{R}$ we have $|\hat{f}(\xi)| \le \int |f(x)|\,dx = \|f\|_1$. Thus $\hat{f}$ is bounded,
and $\mathscr{F}$ is continuous from $L^1(\mathbb{R})$ to $L^\infty(\mathbb{R})$.
(iii) For $f = \chi_{[a,b]}$ we have $|\hat{f}(\xi)| \le 1/(\pi |\xi|)$ for $\xi \ne 0$ (Section 17.1.2). Thus
$\lim_{|\xi| \to \infty} \hat{f}(\xi) = 0$; clearly this is true for all simple functions. Now take
$f$ in $L^1(\mathbb{R})$. Since the simple functions are dense in $L^1(\mathbb{R})$, there exists a
sequence $g_n$ of simple functions such that $\lim_{n \to \infty} \|f - g_n\|_1 = 0$ and, for
each fixed $n$, $\lim_{|\xi| \to \infty} |\hat{g}_n(\xi)| = 0$. From (17.3), $|\hat{f}(\xi) - \hat{g}_n(\xi)| \le \|f - g_n\|_1$
uniformly in $\xi \in \mathbb{R}$ for each fixed $n$. It follows that $\lim_{|\xi| \to \infty} \hat{f}(\xi) = 0$. □
The following formula is essential for introducing the inverse Fourier
transform.
17.1.4 Proposition Let $f$ and $g$ be two functions in $L^1(\mathbb{R})$. Then $f\hat{g}$
and $\hat{f}g$ are in $L^1(\mathbb{R})$ and
$$\int f(t)\hat{g}(t)\,dt = \int \hat{f}(x) g(x)\,dx. \tag{17.4}$$
Proof. We saw in the last theorem that $\hat{g}$ is bounded; thus $f\hat{g}$ is in $L^1(\mathbb{R})$.
Similarly, $\hat{f}g \in L^1(\mathbb{R})$. Equality (17.4) comes from a direct application of
Fubini's theorem (Theorem 14.3.1). Since $e^{-2i\pi t x} f(t) g(x) \in L^1(\mathbb{R}^2)$, we
have
$$\int f(t)\hat{g}(t)\,dt = \int f(t)\Big(\int e^{-2i\pi t x} g(x)\,dx\Big)dt = \int g(x)\Big(\int e^{-2i\pi t x} f(t)\,dt\Big)dx = \int g(x)\hat{f}(x)\,dx. \qquad \square$$
17.1.5 Remark The mapping $\overline{\mathscr{F}}$ has the same properties as those
announced for $\mathscr{F}$ in Theorem 17.1.3 and Proposition 17.1.4; to see this,
just change $i$ to $-i$.
17.2 Rules for computing with the Fourier transform

One of the remarkable properties of the Fourier transform is the relation
between derivation and multiplication by a monomial.
17.2.1 Proposition (derivation)

(i) If $x^k f(x)$ is in $L^1(\mathbb{R})$, $k = 0, 1, 2, \ldots, n$, then $\hat{f}$ is $n$ times differentiable,
and
$$\hat{f}^{(k)}(\xi) = \big[(-2i\pi x)^k f(x)\big]^{\wedge}(\xi) \quad \text{for } k = 1, 2, \ldots, n, \tag{17.5}$$
where $[(-2i\pi x)^k f(x)]^{\wedge}(\xi)$ denotes $\mathscr{F}[(-2i\pi x)^k f(x)](\xi)$.

(ii) If $f \in C^n(\mathbb{R}) \cap L^1(\mathbb{R})$ and if all the derivatives $f^{(k)}$, $k = 1, 2, \ldots, n$,
are in $L^1(\mathbb{R})$, then
$$\widehat{f^{(k)}}(\xi) = (2i\pi \xi)^k \hat{f}(\xi) \quad \text{for } k = 1, 2, \ldots, n. \tag{17.6}$$
(iii) If $f \in L^1(\mathbb{R})$ has bounded support, then $\hat{f} \in C^\infty(\mathbb{R})$.
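Proof.
(i) The function $h : \xi \mapsto e^{-2i\pi \xi x} f(x)$ is infinitely differentiable; furthermore,
$h^{(k)}(\xi) = (-2i\pi x)^k e^{-2i\pi \xi x} f(x)$ and $|h^{(k)}(\xi)| \le (2\pi)^k |x^k f(x)|$. Proposition
14.2.2 applies for $k = 1, 2, \ldots, n$, and
$$\hat{f}^{(k)}(\xi) = \int (-2i\pi x)^k e^{-2i\pi \xi x} f(x)\,dx.$$
(ii) We prove this for $n = 1$; the result for $n \ge 2$ is obtained by induction.
Since $f' \in L^1(\mathbb{R})$, we can compute $\widehat{f'}$ by the formula
$$\widehat{f'}(\xi) = \lim_{a \to +\infty} \int_{-a}^{a} e^{-2i\pi \xi x} f'(x)\,dx.$$
Integrating by parts shows that
$$\int_{-a}^{a} e^{-2i\pi \xi x} f'(x)\,dx = \Big[e^{-2i\pi \xi x} f(x)\Big]_{-a}^{a} + 2i\pi \xi \int_{-a}^{a} e^{-2i\pi \xi x} f(x)\,dx. \tag{17.7}$$
Assume for the moment that $f(a)$ has a limit as $a \to +\infty$. Since $f$ is
integrable, this limit must be zero. As $a \to +\infty$, (17.7) becomes
$$\int e^{-2i\pi \xi x} f'(x)\,dx = \int (2i\pi \xi)\,e^{-2i\pi \xi x} f(x)\,dx,$$
which is formula (17.6) for $k = 1$.

It remains to show that $\lim_{a \to +\infty} f(a)$ exists. Since $f$ is continuously
differentiable, we have
$$f(a) = f(0) + \int_0^a f'(t)\,dt.$$
Since $f' \in L^1(\mathbb{R})$, $\lim_{a \to +\infty} \int_0^a f'(t)\,dt$ exists, and hence $\lim_{a \to +\infty} f(a)$ exists.
A similar argument shows that $\lim_{a \to +\infty} f(-a)$ exists.
(iii) If $f \in L^1(\mathbb{R})$ has bounded support ($f(x) = 0$ a.e. for $|x|$ greater than
some $K > 0$), it is clear that $x^k f(x)$ is integrable for all $k \in \mathbb{N}$; thus by (i),
$\hat{f} \in C^\infty(\mathbb{R})$. □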
We are now going to examine how the Fourier transform behaves with
respect to translation and parity. We will use the following notation.
17.2.2 Notation
(i) If $f$ has values in $\mathbb{C}$, then $\overline{f}(x) = \overline{f(x)}$, the complex conjugate of $f(x)$.
(ii) $f_\sigma$ denotes the reflection of $f$ defined by $f_\sigma(x) = f(-x)$.
(iii) The translate $\tau_a f$ of $f$ is defined by $\tau_a f(x) = f(x - a)$.
17.2.3 Proposition (conjugation and parity) For $f \in L^1(\mathbb{R})$
we have the following relations:
(i) $\overline{\mathscr{F}(f)} = \overline{\mathscr{F}}(\overline{f})$.
(ii) $(\mathscr{F}(f))_\sigma = \overline{\mathscr{F}}(f) = \mathscr{F}(f_\sigma)$.
(iii) $f$ even (odd) $\implies$ $\hat{f}$ even (odd).
(iv) $f$ real and even (real and odd) $\implies$ $\hat{f}$ real and even (imaginary and
odd).
Proof.
(i) and (ii) follow directly from the definitions.
(iii) $f$ is even $\iff f = f_\sigma$. Thus $\mathscr{F}(f) = \mathscr{F}(f_\sigma) = (\mathscr{F}(f))_\sigma$ from (ii),
and $\hat{f}$ is even. Similarly, $f$ odd implies $\hat{f}$ is odd.
(iv) Suppose $f$ is real and even. It suffices to show that $\hat{f}$ is real. We
have $\overline{\mathscr{F}(f)} = \overline{\mathscr{F}}(\overline{f}) = \overline{\mathscr{F}}(f) = \mathscr{F}(f_\sigma) = \mathscr{F}(f)$. If $f$ is real and
odd, we have $\overline{\mathscr{F}(f)} = \mathscr{F}(f_\sigma) = -\mathscr{F}(f)$, which shows that $\hat{f}$ is purely
imaginary. □
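17.2.4 Proposition (translation) For $f \in L^1(\mathbb{R})$,

(i) $\widehat{\tau_a f}(\xi) = e^{-2i\pi a \xi}\,\hat{f}(\xi)$;
(ii) $\tau_a \hat{f}(\xi) = \mathscr{F}[e^{2i\pi a x} f(x)](\xi) = [e^{2i\pi a x} f(x)]^{\wedge}(\xi)$.
Proof. To prove (i) we have
$$\widehat{\tau_a f}(\xi) = \int e^{-2i\pi \xi x} f(x - a)\,dx = \int e^{-2i\pi \xi (a + t)} f(t)\,dt = e^{-2i\pi a \xi}\,\hat{f}(\xi).$$
The proof of (ii) is similar. □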

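17.3 Some standard examples

Recall that $u$ denotes the Heaviside function, which is defined by $u(x) = 1$
for $x > 0$ and $u(x) = 0$ for $x \le 0$. Several useful Fourier transforms are
presented in the next section.
17.3.1 Direct computation of the Fourier transform

(a) $f_1(x) = e^{-ax} u(x)$, $\mathrm{Re}(a) > 0$.
$$\hat{f}_1(\xi) = \int_0^\infty e^{-2i\pi x \xi} e^{-ax}\,dx = \lim_{b \to +\infty} \Big[\frac{-e^{-x(a + 2i\pi \xi)}}{a + 2i\pi \xi}\Big]_0^b = \frac{1}{a + 2i\pi \xi}.$$
(b) $f_2(x) = e^{ax} u(-x)$, $\mathrm{Re}(a) > 0$.
$$\hat{f}_2(\xi) = \int_{-\infty}^0 e^{-2i\pi x \xi} e^{ax}\,dx = \frac{-1}{-a + 2i\pi \xi}.$$
(c) $f_3(x) = \dfrac{x^k}{k!}\,e^{-ax} u(x)$, $\mathrm{Re}(a) > 0$.
$$f_3(x) = \frac{1}{(-2i\pi)^k}\,\frac{1}{k!}\,(-2i\pi x)^k f_1(x), \quad \text{and} \quad \hat{f}_3(\xi) = \frac{1}{k!}\,\frac{1}{(-2i\pi)^k}\,\hat{f}_1^{(k)}(\xi)$$
by Proposition 17.2.1(i). Since $\hat{f}_1^{(k)}(\xi) = k!\,(a + 2i\pi \xi)^{-(k+1)}(-2i\pi)^k$,
$$\hat{f}_3(\xi) = \frac{1}{(a + 2i\pi \xi)^{k+1}}.$$
(d) $f_4(x) = \dfrac{x^k}{k!}\,e^{ax} u(-x)$, $\mathrm{Re}(a) > 0$.
The computation is the same as in (c), and we have
$$\hat{f}_4(\xi) = \frac{-1}{(-a + 2i\pi \xi)^{k+1}}.$$
(e) $f_5(x) = e^{-a|x|}$, $\mathrm{Re}(a) > 0$. From (a) and (b) we deduce
$$\hat{f}_5(\xi) = \frac{2a}{a^2 + 4\pi^2 \xi^2}.$$
(f) $f_6(x) = \mathrm{sign}(x)\,e^{-a|x|}$, $\mathrm{Re}(a) > 0$. We have
$$\hat{f}_6(\xi) = \frac{-4i\pi \xi}{a^2 + 4\pi^2 \xi^2}.$$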
17.3.2 A computation using a differential equation

We are going to compute the Fourier transform of $f(x) = e^{-ax^2}$, $a > 0$.
A direct method is to evaluate a contour integral in the complex plane,
and this is often the best way to proceed when other attempts fail. We
present another way to do this evaluation. Observe that $f'(x) = -2ax f(x)$
and take the Fourier transform of both sides of this equality. By using
equations (17.5) and (17.6) with $k = 1$, we see that
$$2i\pi \xi \hat{f}(\xi) = \frac{a}{i\pi}\big[-2i\pi x f(x)\big]^{\wedge}(\xi) = \frac{a}{i\pi}\,\hat{f}'(\xi).$$
Thus
$$\hat{f}'(\xi) + \frac{2\pi^2}{a}\,\xi \hat{f}(\xi) = 0. \tag{17.8}$$
A particular solution of (17.8) is $e^{-\frac{\pi^2}{a}\xi^2}$. When we look for a general solution
of the form $\hat{f}(\xi) = K(\xi)\,e^{-\frac{\pi^2}{a}\xi^2}$, we see that $K'(\xi) = 0$, so $K(\xi) = K$,
a constant, and $\hat{f}(0) = K$. But $\hat{f}(0) = \int_{\mathbb{R}} e^{-ax^2}\,dx = (\pi/a)^{1/2}$. Hence
$$\hat{f}(\xi) = \sqrt{\frac{\pi}{a}}\;e^{-\frac{\pi^2}{a}\xi^2}.$$
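The result of this computation, $\hat{f}(\xi) = \sqrt{\pi/a}\,e^{-\pi^2 \xi^2 / a}$, can be checked against a direct quadrature of (17.1). The sketch below is an added illustration; the value $a = 1.5$, the truncation of the integral to $[-10, 10]$, and the function names are assumptions made for the example.

```python
import cmath
import math

def gaussian_transform_numeric(a, xi, half_width=10.0, n=40_000):
    """Midpoint rule for the integral of exp(-2 i pi xi x) exp(-a x^2) over [-L, L]."""
    step = 2.0 * half_width / n
    total = 0j
    for k in range(n):
        x = -half_width + (k + 0.5) * step
        total += cmath.exp(-2j * math.pi * xi * x) * math.exp(-a * x * x)
    return total * step

def gaussian_transform_closed(a, xi):
    """sqrt(pi / a) * exp(-pi^2 xi^2 / a), the formula derived in Section 17.3.2."""
    return math.sqrt(math.pi / a) * math.exp(-(math.pi * xi) ** 2 / a)

a = 1.5
gaps = [abs(gaussian_transform_numeric(a, xi) - gaussian_transform_closed(a, xi))
        for xi in (0.0, 0.4, 1.0)]
print(gaps)
```

The truncation is harmless because $e^{-ax^2}$ is negligible for $|x| > 10$; the numerical transform is real up to rounding, as Proposition 17.2.3(iv) predicts for a real even function.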
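17.3.3 Summary of useful formulas
(i) $\hat{f}^{(k)}(\xi) = [(-2i\pi x)^k f(x)]^{\wedge}(\xi)$,
$\widehat{f^{(k)}}(\xi) = (2i\pi \xi)^k \hat{f}(\xi)$.
(ii) $f(x - a) \overset{\mathscr{F}}{\longmapsto} e^{-2i\pi a \xi}\,\hat{f}(\xi)$,
$e^{2i\pi a x} f(x) \overset{\mathscr{F}}{\longmapsto} \hat{f}(\xi - a)$.
(iii) $a \ne 0$: $f(ax) \overset{\mathscr{F}}{\longmapsto} \dfrac{1}{|a|}\,\hat{f}\Big(\dfrac{\xi}{a}\Big)$.
(iv) $a \in \mathbb{C}$, $\mathrm{Re}(a) > 0$, $k = 0, 1, 2, \ldots$:
$\dfrac{x^k}{k!}\,e^{-ax} u(x) \overset{\mathscr{F}}{\longmapsto} \dfrac{1}{(a + 2i\pi \xi)^{k+1}}$,
$\dfrac{x^k}{k!}\,e^{ax} u(-x) \overset{\mathscr{F}}{\longmapsto} \dfrac{-1}{(-a + 2i\pi \xi)^{k+1}}$,
$e^{-a|x|} \overset{\mathscr{F}}{\longmapsto} \dfrac{2a}{a^2 + 4\pi^2 \xi^2}$,
$\mathrm{sign}(x)\,e^{-a|x|} \overset{\mathscr{F}}{\longmapsto} \dfrac{-4i\pi \xi}{a^2 + 4\pi^2 \xi^2}$.
(v) $a \in \mathbb{R}$, $a > 0$:
$\chi_{[-a,+a]}(x) \overset{\mathscr{F}}{\longmapsto} \dfrac{\sin 2a\pi \xi}{\pi \xi}$.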
17.3.4 Remark When attempts to compute a Fourier transform directly
seem to lead nowhere, it is often useful to try to evaluate the integral
using the residue theorem and a contour integral in the complex plane.
Standard texts on functions of a complex variable discuss this technique,
and two of the exercises are done this way.
17.4 Exercises
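Exercise 17.1 Assume that $f \in L^1(\mathbb{R})$.
(a) For $\lambda \in \mathbb{R} \setminus \{0\}$, define $g(x) = f(\lambda x)$. Show that
$$\hat{g}(\xi) = \frac{1}{|\lambda|}\,\hat{f}\Big(\frac{\xi}{\lambda}\Big).$$
(b) For $\lambda \in \mathbb{R} \setminus \{0\}$ and $\mu \in \mathbb{R}$, define $g(x) = f(\lambda x - \mu)$. Show that
$$\hat{g}(\xi) = \frac{1}{|\lambda|}\,e^{-2i\pi \frac{\mu}{\lambda}\xi}\,\hat{f}\Big(\frac{\xi}{\lambda}\Big).$$
Exercise 17.2 Let $f(t) = (1 - t^2)\chi_{[-1,1]}(t)$. Show that
$$\hat{f}(\xi) = \frac{1}{\pi^2 \xi^2}\Big(\frac{\sin 2\pi \xi}{2\pi \xi} - \cos 2\pi \xi\Big).$$
Exercise 17.3 If $x^k f(x)$ is in $L^1(\mathbb{R})$ for $k = 0, \ldots, p$, the numbers
$$M_k = \int_{\mathbb{R}} x^k f(x)\,dx, \qquad k = 0, \ldots, p,$$
exist. $M_k$ is called the moment of order $k$ of $f$, or the $k$th moment of $f$. Show
that
$$M_k = \frac{1}{(-2i\pi)^k}\,\hat{f}^{(k)}(0).$$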
Exercise 17.4 Compute the Fourier transform of $f(x) = \dfrac{1}{a^2 + x^2}$, $a > 0$,
using the calculus of residues.
*Exercise 17.5 Use contour integration to compute the Fourier transform of
$f(x) = e^{-ax^2}$, $a > 0$.
Hint: It is sufficient to compute $\hat{f}(\xi)$ for $\xi > 0$, since $\hat{f}$ is even (Proposition
17.2.3). Consider the contour $\Gamma_R$ formed by the lines joining $-R$ to $R$, $R$ to
$R + i\frac{\pi \xi}{a}$, $R + i\frac{\pi \xi}{a}$ to $-R + i\frac{\pi \xi}{a}$, and $-R + i\frac{\pi \xi}{a}$ to $-R$. By Cauchy's formula,
$\int_{\Gamma_R} e^{-az^2}\,dz = 0$. Let $R$ tend to $+\infty$ to show that $\hat{f}(\xi) = \sqrt{\frac{\pi}{a}}\,e^{-\frac{\pi^2 \xi^2}{a}}$.
Exercise 17.6 Show by a simple example that the hypothesis $f \in C^n(\mathbb{R})$ is
essential in Proposition 17.2.1(ii).
Lesson 18
The Inverse Fourier Transform
The Fourier transform allows us to pass from the time domain to the
frequency domain. It is remarkable that the inverse operation is obtained
very simply from $\mathscr{F}$ itself. In fact, it is just $\overline{\mathscr{F}}$. However, one must be
cautious, for as we have seen in the last lesson, $f$ being integrable does
not imply that $\hat{f}$ is integrable (Section 17.1.2). We will need additional
hypotheses on $f$ to invert $f \mapsto \hat{f}$.
18.1 An inversion theorem for L 1 (JR)

1
18.1.1 Theorem If fand arebothin L 1 (R), then 5 f(t) = f(t)
at all points where f is continuous.
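Proof. For each $n > 0$, we introduce the function $g_n(x) = e^{-\frac{2\pi}{n}|x|}$, whose Fourier transform is
$$\hat g_n(\xi) = \frac{1}{\pi}\,\frac{n}{1 + n^2\xi^2}.$$
The functions $g_n$ and $\hat g_n$ are in $L^1(\mathbb{R})$. We can apply formula (17.4) to the two functions $f$ and $e^{2i\pi tx}g_n(x)$, which in view of Proposition 17.2.4(ii) is
$$\int_{\mathbb{R}} \hat f(x)g_n(x)e^{2i\pi tx}\,dx = \int_{\mathbb{R}} f(u)\hat g_n(u - t)\,du. \qquad (18.1)$$
For all $x \in \mathbb{R}$, $|\hat f(x)g_n(x)e^{2i\pi tx}| \le |\hat f(x)|$, and $\lim_{n\to\infty} g_n(x) = 1$. Since $\hat f$ is in $L^1(\mathbb{R})$, we can apply Lebesgue's theorem and pass to the limit under the integral sign. Thus
$$\lim_{n\to\infty}\int_{\mathbb{R}} \hat f(x)g_n(x)e^{2i\pi tx}\,dx = \int_{\mathbb{R}} \hat f(x)e^{2i\pi tx}\,dx = \overline{\mathscr{F}}\hat f(t).$$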
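Assume that $f$ is continuous at $t$; we need to show that the integral on the right-hand side of (18.1) tends to $f(t)$. Since $\hat g_n \in L^1(\mathbb{R})$,
$$\int_{\mathbb{R}} \hat g_n(\xi)\,d\xi = \lim_{a\to+\infty}\int_{-a}^{+a}\frac{1}{\pi}\,\frac{n}{1 + n^2\xi^2}\,d\xi = 1.$$
Thus we can write
$$\int_{\mathbb{R}} f(u)\hat g_n(u - t)\,du - f(t) = \int_{\mathbb{R}} \big(f(\xi + t) - f(t)\big)\hat g_n(\xi)\,d\xi. \qquad (18.2)$$
Given $\varepsilon > 0$, there exists $\eta > 0$ such that $|y - t| \le \eta$ implies $|f(y) - f(t)| \le \varepsilon$. We decompose (18.2) as follows:
$$\int_{\mathbb{R}} \big(f(\xi + t) - f(t)\big)\hat g_n(\xi)\,d\xi = \int_{|\xi|\le\eta} \big(f(\xi + t) - f(t)\big)\hat g_n(\xi)\,d\xi + \int_{|\xi|>\eta} \big(f(\xi + t) - f(t)\big)\hat g_n(\xi)\,d\xi.$$
For all $n > 0$,
$$\Big|\int_{|\xi|\le\eta} \big(f(\xi + t) - f(t)\big)\hat g_n(\xi)\,d\xi\Big| \le \varepsilon\int_{\mathbb{R}} \hat g_n(\xi)\,d\xi = \varepsilon.$$
The last step is to show that
$$\lim_{n\to\infty}\int_{|\xi|>\eta} \big(f(t + \xi) - f(t)\big)\hat g_n(\xi)\,d\xi = 0.$$
For this we have
$$\Big|f(t)\int_{|\xi|>\eta} \hat g_n(\xi)\,d\xi\Big| = |f(t)|\Big(1 - \frac{2}{\pi}\arctan n\eta\Big), \qquad (18.3)$$
and since $\hat g_n$ is even and decreasing on $\mathbb{R}_+$,
$$\Big|\int_{|\xi|>\eta} f(t + \xi)\hat g_n(\xi)\,d\xi\Big| \le \hat g_n(\eta)\,\|f\|_1. \qquad (18.4)$$
As $n$ tends to $+\infty$, the right-hand sides of (18.3) and (18.4) tend to 0, and this proves the theorem. □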
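The estimate (18.3) says that the kernel $\hat g_n$ keeps total mass 1 while pushing almost all of it into $[-\eta, \eta]$ as $n$ grows. The following is a minimal numerical sketch of that fact, assuming a plain midpoint rule; the function names `kernel` and `mass_inside` are illustrative and do not come from the text.

```python
import math

def kernel(n, xi):
    """The kernel n / (pi * (1 + n^2 xi^2)) used in the proof of Theorem 18.1.1."""
    return n / (math.pi * (1.0 + (n * xi) ** 2))

def mass_inside(n, eta, steps=200_000):
    """Midpoint-rule approximation of the integral of kernel(n, .) over [-eta, eta]."""
    h = 2.0 * eta / steps
    return h * sum(kernel(n, -eta + (k + 0.5) * h) for k in range(steps))

eta = 0.1
for n in (1, 10, 100, 1000):
    numeric = mass_inside(n, eta)
    closed_form = (2.0 / math.pi) * math.atan(n * eta)
    # The tail mass 1 - closed_form is the factor appearing in (18.3).
    print(n, numeric, closed_form, 1.0 - closed_form)
```

The printed tail mass shrinks toward 0 as $n$ increases, which is the behavior the proof relies on.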
As a consequence of this result, the Fourier transform $\hat f$ of a function $f$ in $L^1(\mathbb{R})$ that is continuous except for a jump discontinuity at $x = a$ cannot be integrable: If $\hat f$ were in $L^1(\mathbb{R})$, Theorem 18.1.1 implies that $\overline{\mathscr{F}}\hat f(x) = f(x)$ except for $x = a$. But $\overline{\mathscr{F}}\hat f(x)$ is continuous everywhere, so this is impossible (see, for example, Section 17.3.1(a)).
We will see in Lesson 23 that $f$ and $\hat f$ in $L^1(\mathbb{R})$ implies that $f$ is equal almost everywhere to the continuous function $\overline{\mathscr{F}}\hat f$.
The next proposition gives sufficient conditions on $f$ so that $\hat f \in L^1(\mathbb{R})$.
18.1.2 Proposition If $f$ belongs to $C^2(\mathbb{R})$ and if $f$, $f'$, and $f''$ are in $L^1(\mathbb{R})$, then $\hat f$ is integrable.
Proof. According to (17.6), $\widehat{f''}(\xi) = -4\pi^2\xi^2\hat f(\xi)$. On the other hand, $\lim_{|\xi|\to\infty}|\widehat{f''}(\xi)| = 0$ (Theorem 17.1.3). Thus there is an $M > 0$ such that for all $|\xi| > M$, $4\pi^2|\xi|^2|\hat f(\xi)| \le 1$. Since $|\hat f(\xi)|$ is continuous on $\mathbb{R}$ and dominated by $1/(4\pi^2|\xi|^2)$ at infinity, it follows that $\hat f$ is in $L^1(\mathbb{R})$. □
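18.2 Some Fourier transforms obtained by the inversion formula

When the hypotheses of Theorem 18.1.1 are satisfied, one can often easily compute $\mathscr{F}\hat f$.
18.2.1 Proposition If $f$ is continuous and integrable and if $\hat f$ is in $L^1(\mathbb{R})$, then for all $x \in \mathbb{R}$,
$$\mathscr{F}\hat f(x) = f_\sigma(x) = f(-x).$$
Proof. Let $g = \hat f$. We have
$$\hat g(-x) = \int_{\mathbb{R}} e^{2i\pi x\xi}g(\xi)\,d\xi = \overline{\mathscr{F}}\hat f(x).$$
By Theorem 18.1.1, $\hat g(-x) = f(x)$ for all $x \in \mathbb{R}$; thus $\mathscr{F}\hat f(x) = f_\sigma(x)$. □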
We will now use the results of Section 17.3.4 to compute several inverse transforms. We take $a \in \mathbb{C}$ with $\operatorname{Re}(a) > 0$.
(a) $f_1(x) = \dfrac{x^k}{k!}e^{-ax}u(x)$ is integrable for all $k \in \mathbb{N}$, but it is continuous only for $k \ge 1$. We have $g_1(\xi) = \hat f_1(\xi) = \dfrac{1}{(a + 2i\pi\xi)^{k+1}} \in L^1(\mathbb{R})$ for $k \ge 1$. Thus
$$\hat g_1(x) = \frac{(-x)^k}{k!}e^{ax}u(-x) \quad\text{for } k \ge 1.$$
(b) In the same way, $f_2(x) = \dfrac{x^k}{k!}e^{ax}u(-x)$ for $k \ge 1$, and we have
$$g_2(\xi) = \hat f_2(\xi) = \frac{-1}{(-a + 2i\pi\xi)^{k+1}} \quad\text{and}\quad \hat g_2(x) = \frac{(-x)^k}{k!}e^{-ax}u(x).$$
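(c) $f_3(x) = e^{-a|x|}$ is in $L^1(\mathbb{R})$ and continuous on $\mathbb{R}$.
$$g_3(\xi) = \hat f_3(\xi) = \frac{2a}{a^2 + 4\pi^2\xi^2}, \qquad \hat g_3(x) = e^{-a|x|}.$$
(d) $f_4(x) = \operatorname{sign}(x)e^{-a|x|}$ is not continuous at 0. We note that
$$\hat f_4(\xi) = \frac{-4i\pi\xi}{a^2 + 4\pi^2\xi^2}$$
is not integrable at infinity.
(e) $f_5(x) = e^{-ax^2}$ (with $a$ real, $a > 0$) is in $L^1(\mathbb{R})$ and continuous on $\mathbb{R}$.
$$g_5(\xi) = \hat f_5(\xi) = \sqrt{\frac{\pi}{a}}\,e^{-\pi^2\xi^2/a}$$
is in $L^1(\mathbb{R})$. Hence
$$\hat g_5(x) = e^{-ax^2}.$$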
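18.2.2 Summary
(i) $a \in \mathbb{C}$, $\operatorname{Re}(a) > 0$, $k = 1, 2, \dots$
$$\frac{1}{(a + 2i\pi x)^{k+1}} \;\overset{\mathscr{F}}{\longmapsto}\; \frac{(-\xi)^k}{k!}\,e^{a\xi}u(-\xi)$$
$$\frac{-1}{(-a + 2i\pi x)^{k+1}} \;\overset{\mathscr{F}}{\longmapsto}\; \frac{(-\xi)^k}{k!}\,e^{-a\xi}u(\xi)$$
$$\frac{1}{a^2 + x^2} \;\overset{\mathscr{F}}{\longmapsto}\; \frac{\pi}{a}\,e^{-2\pi a|\xi|}$$
(ii) $a \in \mathbb{R}$, $a > 0$.
$$\sqrt{\frac{\pi}{a}}\,e^{-\pi^2x^2/a} \;\overset{\mathscr{F}}{\longmapsto}\; e^{-a\xi^2}$$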
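The pair (c) above rests on the closed form $2a/(a^2 + 4\pi^2\xi^2)$ for the transform of $e^{-a|x|}$. As a hedged sanity check, the sketch below approximates the defining integral $\int f(x)e^{-2i\pi\xi x}\,dx$ with a midpoint rule on a truncated interval and compares it with that closed form. The helper name `fourier_transform`, the truncation width, and the value $a = 1.5$ are assumptions made for this illustration only.

```python
import cmath
import math

def fourier_transform(f, xi, half_width=20.0, steps=200_000):
    """Midpoint-rule approximation of the integral of f(x) exp(-2 i pi xi x)
    over [-half_width, half_width]; f is assumed negligible outside."""
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * xi * x)
    return total * h

a = 1.5  # illustrative real value with a > 0

def two_sided_exponential(x):
    return math.exp(-a * abs(x))

def closed_form(xi):
    return 2.0 * a / (a * a + 4.0 * math.pi ** 2 * xi * xi)

for xi in (0.0, 0.3, 1.0):
    print(xi, fourier_transform(two_sided_exponential, xi), closed_form(xi))
```

The imaginary part of the numerical value is close to zero because the function is even, in line with Proposition 17.2.3.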
18.3 The principal value Fourier inversion formula

We have remarked several times that the Fourier transform of an integrable function is not necessarily integrable. If this is the case, the integral
$$\int_{\mathbb{R}} e^{2i\pi t\xi}\hat f(\xi)\,d\xi$$
is not defined. This does not exclude the possible existence of the limit
$$\lim_{a\to+\infty}\int_{-a}^{a} e^{2i\pi t\xi}\hat f(\xi)\,d\xi.$$
18.3.1 Theorem Assume that $f \in L^1(\mathbb{R})$ satisfies the following two conditions:
(i) There is a finite number of real numbers $a_1, a_2, \dots, a_p$ such that $f$ is continuously differentiable on $(-\infty, a_1), (a_1, a_2), \dots, (a_p, +\infty)$.
(ii) $f' \in L^1(\mathbb{R})$.
Then
$$\lim_{a\to\infty}\int_{-a}^{a} e^{2i\pi t\xi}\hat f(\xi)\,d\xi = \frac{1}{2}\big(f(t+) + f(t-)\big).$$
Proof. Note first that (i) and (ii) imply that the limits $f(t+)$ and $f(t-)$ exist for all $t$. Let $g(\xi) = e^{2i\pi t\xi}\chi_{[-a,a]}(\xi)$. Since $f$ and $g$ are in $L^1(\mathbb{R})$, it follows from Theorem 17.1.4 that
$$v(a) = \int_{-a}^{a} e^{2i\pi t\xi}\hat f(\xi)\,d\xi = \int_{\mathbb{R}} g(\xi)\hat f(\xi)\,d\xi = \int_{\mathbb{R}} \hat g(x)f(x)\,dx.$$
We compute $\hat g(x)$ using Proposition 17.2.4(ii) and Section 17.1.2:
$$\hat g(x) = \tau_t\,\hat\chi_{[-a,a]}(x) = \frac{\sin 2\pi a(x - t)}{\pi(x - t)}.$$
Thus,
$$v(a) = \int_{\mathbb{R}} f(t + u)\,\frac{\sin 2\pi au}{\pi u}\,du = \int_0^{\infty}\big(f(t + u) + f(t - u)\big)\frac{\sin 2\pi au}{\pi u}\,du. \qquad (18.5)$$
The function $\dfrac{\sin x}{x}$ has the following properties:
(i)
$$\lim_{R\to\infty}\int_0^R \frac{\sin x}{x}\,dx = \frac{\pi}{2}. \qquad (18.6)$$
(ii) $s(y) = \displaystyle\int_y^{\infty}\frac{\sin x}{x}\,dx$ is well-defined and differentiable on $\mathbb{R}$; its derivative is
$$s'(y) = -\frac{\sin y}{y}, \qquad (18.7)$$
and
$$\lim_{y\to+\infty} s(y) = 0. \qquad (18.8)$$
Consequently, $s$ is bounded on $[0, +\infty)$, and we can write
$$M = \sup_{y\ge 0}|s(y)|. \qquad (18.9)$$
We return to (18.5). By hypothesis, the function
$$h_t(u) = f(t + u) + f(t - u)$$
is integrable on $[0, +\infty)$, it has at most a finite number of discontinuities $b_1, b_2, \dots, b_q$, it is continuously differentiable on $(0, b_1), (b_1, b_2), \dots, (b_q, \infty)$ (possibly $b_1 = 0$), and $h_t'$ is integrable. Thus we can integrate (18.5) by parts. By letting $b_0 = 0$ and $b_{q+1} = +\infty$, we see that
$$v(a) = -\frac{1}{\pi}\sum_{j=0}^{q}\Big[\,s(2\pi au)h_t(u)\Big|_{b_j^+}^{b_{j+1}^-} - \int_{b_j}^{b_{j+1}} h_t'(u)s(2\pi au)\,du\Big].$$
The term $s(2\pi ab_{q+1})h_t(b_{q+1})$ is actually the limit of $s(2\pi au)h_t(u)$ as $u$ tends to $+\infty$. This limit is 0, since both $s(2\pi au)$ and $h_t(u)$ tend to 0 by (18.8) and the proof of Proposition 17.2.1, respectively. For $j = 1, 2, \dots, q$, $\lim_{a\to\infty} s(2\pi ab_j) = 0$, again by (18.8). Now consider the limits
$$\lim_{a\to\infty}\int_{b_j}^{b_{j+1}} h_t'(u)s(2\pi au)\,du$$
for $j = 1, 2, \dots, q$. We have $\lim_{a\to\infty} h_t'(u)s(2\pi au) = 0$ for almost every $u$, and
$$|h_t'(u)s(2\pi au)| \le M|h_t'(u)|$$
by (18.9). Since $f'$ is integrable, $h_t'(u)$ is in $L^1(\mathbb{R})$; we can apply the dominated convergence theorem and conclude that
$$\lim_{a\to\infty}\int_{b_j}^{b_{j+1}} h_t'(u)s(2\pi au)\,du = 0.$$
The remaining term is $\frac{1}{\pi}s(0)h_t(0+)$; thus $\lim_{a\to\infty} v(a) = \frac{1}{\pi}s(0)h_t(0+)$. Since $s(0) = \pi/2$ by (18.6), returning to the original notation, we have
$$\lim_{a\to\infty}\int_{-a}^{a} e^{2i\pi t\xi}\hat f(\xi)\,d\xi = \frac{1}{2}\big(f(t+) + f(t-)\big),$$
which completes the proof. □
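18.3.2 Example We saw in Section 17.1.2 that $\dfrac{\sin\xi}{\xi}$ is the Fourier transform of $f(x) = \pi\chi_{[-a,a]}(x)$ with $a = 1/(2\pi)$. Thus,
$$\lim_{a\to+\infty}\int_{-a}^{a} e^{2i\pi\xi t}\,\frac{\sin\xi}{\xi}\,d\xi = \begin{cases} \pi & \text{if } |t| < 1/(2\pi), \\ \pi/2 & \text{if } |t| = 1/(2\pi), \\ 0 & \text{if } |t| > 1/(2\pi). \end{cases}$$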
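The three limiting values in Example 18.3.2 can be observed numerically. The sketch below is an illustration under stated assumptions: it uses a midpoint rule, a finite cutoff of 1000 in place of the limit, and the hypothetical helper name `truncated_inverse`. Because the convergence is only of order 1/cutoff, the agreement is to a few decimal places, not to machine precision.

```python
import math

def truncated_inverse(t, cutoff, h=0.01):
    """Midpoint approximation of the integral over [-cutoff, cutoff] of
    exp(2 i pi xi t) sin(xi)/xi. The odd (imaginary) part cancels, so we
    integrate 2 cos(2 pi xi t) sin(xi)/xi over [0, cutoff]."""
    steps = int(round(cutoff / h))
    total = 0.0
    for k in range(steps):
        xi = (k + 0.5) * h
        total += math.cos(2.0 * math.pi * xi * t) * math.sin(xi) / xi
    return 2.0 * h * total

jump = 1.0 / (2.0 * math.pi)
for t, expected in ((0.0, math.pi), (jump, math.pi / 2.0), (2.0 * jump, 0.0)):
    print(t, truncated_inverse(t, 1000.0), expected)
```

At the jump $|t| = 1/(2\pi)$ the truncated integrals approach the midpoint $\pi/2$ of the one-sided limits, as Theorem 18.3.1 predicts.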
18.4 Exercises
Exercise 18.1 Consider the following two statements:
(a) f is equal almost everywhere to a continuous function.
(b) f is continuous almost everywhere.
Show that (b) implies (a) but that the converse is false.
Exercise 18.2 Compute the Fourier transform of $f(x) = \dfrac{1}{1 + x^2}$. Deduce from this the transforms of $g(x) = \dfrac{x}{(1 + x^2)^2}$ and $h(x) = \dfrac{1}{1 + (x - a)^2}$.
Exercise 18.3 Compute
$$\lim_{R\to+\infty}\int_{-R}^{R}\frac{e^{2i\pi\xi t}}{a + 2i\pi\xi}\,d\xi, \qquad a \in \mathbb{C},\ \operatorname{Re}(a) > 0.$$
Hint: Use Section 17.3.1(a) and Theorem 18.3.1. The result is 1/2.
Exercise 18.4 The computation in Section 17.3.2 of the Fourier transform of $f(x) = e^{-ax^2}$, $a > 0$, showed that $\hat f(\xi) = Ke^{-\pi^2\xi^2/a}$. Determine the constant $K$ using Proposition 18.2.1, and use this result to evaluate $\int_{\mathbb{R}} e^{-ax^2}\,dx$ directly.
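*Exercise 18.5 (Shannon's formula for $f \in L^1 \cap C^0(\mathbb{R})$)
Assume that $f \in L^1(\mathbb{R}) \cap C^0(\mathbb{R})$ and that $\operatorname{supp}(\hat f) \subset [-\lambda_c, \lambda_c]$, $\lambda_c > 0$. Let $a$ be real with $0 < a \le 1/(2\lambda_c)$. Define $g$ to be the function with period $1/a$ that coincides with $\hat f$ on $(-1/(2a), 1/(2a))$.
(a) Show that the Fourier coefficients of $g$ are
$$c_n(g) = a f(-na), \qquad n \in \mathbb{Z}.$$
(b) For $t$ real and fixed, let $h$ be the function with period $1/a$ defined by
$$h(\lambda) = e^{2i\pi t\lambda}, \qquad \lambda \in \Big(-\frac{1}{2a}, \frac{1}{2a}\Big).$$
Show that
$$c_n(h) = \begin{cases} \dfrac{\sin\frac{\pi}{a}(t - na)}{\frac{\pi}{a}(t - na)} & \text{if } t \ne na, \\[2ex] 1 & \text{if } t = na. \end{cases}$$
(c) Use the expression for the Fourier coefficients of a product (Exercise 5.13) to deduce Shannon's formula: For all $t \in \mathbb{R}$,
$$f(t) = \sum_{n=-\infty}^{+\infty} f(na)\,\frac{\sin\frac{\pi}{a}(t - na)}{\frac{\pi}{a}(t - na)}, \qquad 0 < a \le \frac{1}{2\lambda_c}.$$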
Lesson 19
The Space $\mathscr{S}(\mathbb{R})$
We have seen in the last few lessons how it is necessary to restrict the choice of functions in $L^1(\mathbb{R})$ if we wish to use the differentiation formulas and define the inverse Fourier transform. In this lesson, we are going to introduce a subspace of $L^1(\mathbb{R})$ that is invariant under the Fourier transform, differentiation, and multiplication by polynomials.
19.1 Rapidly decreasing functions

19.1.1 Definition A function $f : \mathbb{R} \to \mathbb{C}$ is said to decay rapidly, or be rapidly decreasing, if for all $p \in \mathbb{N}$,
$$\lim_{|x|\to\infty}|x^p f(x)| = 0.$$
For example, the function $f(x) = e^{-|x|}$ decays rapidly. It is important to note that in spite of the name, this definition does not imply that the function is monotonic in a neighborhood of infinity ($f(x) = e^{-|x|}\sin x$ is also rapidly decreasing).
The following is a useful property about the integrability of rapidly decreasing functions.
19.1.2 Proposition If $f$ is locally integrable, $f \in L^1_{\mathrm{loc}}(\mathbb{R})$, and rapidly decreasing, then $x^p f(x)$ is in $L^1(\mathbb{R})$ for all $p \in \mathbb{N}$.
Proof. Since $f$ decays rapidly, there is an $M > 0$ such that for all $|x| > M$ we have $|x^{p+2}f(x)| \le 1$. Thus
$$\int_{\mathbb{R}}|x^p f(x)|\,dx = \int_{|x|\le M}|x^p f(x)|\,dx + \int_{|x|>M}\frac{1}{x^2}\,|x^{p+2}f(x)|\,dx \le M^p\int_{|x|\le M}|f(x)|\,dx + \int_{|x|>M}\frac{1}{x^2}\,dx < +\infty.$$
□
This leads to an important property of the Fourier transform of a rapidly decreasing function; it generalizes Proposition 17.2.1(iii).
19.1.3 Proposition If $f \in L^1(\mathbb{R})$ decays rapidly, then $\hat f$ is infinitely differentiable.
Proof. $x^p f(x)$ is in $L^1(\mathbb{R})$ for all $p \in \mathbb{N}$ by Proposition 19.1.2. This implies by Proposition 17.2.1(i) that $\hat f$ is in $C^\infty$. □
Conversely, what does $f \in C^\infty$ imply about $\hat f$? A partial answer is given by the next result.
19.1.4 Proposition Assume that $f$ is in $C^\infty$. If $f^{(k)}$ is in $L^1(\mathbb{R})$ for all $k \in \mathbb{N}$, then $\hat f$ decays rapidly.
Proof. $\widehat{f^{(k)}}(\xi) = (2i\pi\xi)^k\hat f(\xi)$ for all $k \in \mathbb{N}$ by Proposition 17.2.1(ii). By the Riemann-Lebesgue theorem, $\lim_{|\xi|\to+\infty}|\xi|^k|\hat f(\xi)| = 0$. □
Said another way, we have just proved the following results:
(i) The faster $f$ decreases at infinity, the greater the regularity of $\hat f$.
(ii) The more regular $f$ is, the faster $\hat f$ decays.
In particular, if $f \in C^\infty(\mathbb{R})$ and decreases rapidly, the same is true for $\hat f$. Note the similarity of this result and Proposition 5.3.4 about the Fourier coefficients of a periodic function in $C^\infty(\mathbb{R})$.
19.2 The space $\mathscr{S}(\mathbb{R})$

19.2.1 Definition $\mathscr{S}(\mathbb{R})$, or simply $\mathscr{S}$, denotes the vector space of functions $f : \mathbb{R} \to \mathbb{C}$ that have the following two properties:
(i) $f$ is infinitely differentiable.
(ii) $f$ and all of its derivatives decay rapidly.
The space $\mathscr{S}(\mathbb{R})$ is called the Schwartz class of functions, and it is named for the French mathematician Laurent Schwartz.
19.2.2 Proposition The space $\mathscr{S}$ has the following properties:
(i) $\mathscr{S}$ is invariant under multiplication by a polynomial.
(ii) $\mathscr{S}$ is invariant under derivation; that is, $f \in \mathscr{S} \Rightarrow f' \in \mathscr{S}$.
(iii) $\mathscr{S} \subset L^1(\mathbb{R})$.
The proof is left as an exercise. The main result of this lesson is contained in the next theorem.
19.2.3 Theorem The space $\mathscr{S}$ is invariant under the Fourier transform; that is, $f \in \mathscr{S} \Rightarrow \hat f \in \mathscr{S}$.
Proof. Assume $f \in \mathscr{S}$. Then $f$ is in $L^1(\mathbb{R})$ and decays rapidly. Thus $\hat f \in C^\infty$ by Proposition 19.1.3. Since $f^{(k)}$ is rapidly decreasing for all $k \in \mathbb{N}$, it is integrable for all $k$ by Proposition 19.1.2. We deduce from Proposition 19.1.4 that $\hat f$ is rapidly decreasing. We need to show that the derivatives of $\hat f$ are all rapidly decreasing. Since $\big((-2i\pi x)^q f(x)\big)^{(p)}$ is integrable, we see from (17.5) and (17.6) that
$$\Big(\frac{1}{2i\pi}\Big)^p\,\mathscr{F}\Big(\big((-2i\pi x)^q f(x)\big)^{(p)}\Big)(\xi) = \xi^p\,\mathscr{F}\big((-2i\pi x)^q f(x)\big)(\xi) = \xi^p\hat f^{(q)}(\xi). \qquad (19.1)$$
By (19.1), $\xi^p\hat f^{(q)}(\xi)$ is a constant multiple of the Fourier transform of an integrable function, so by the Riemann-Lebesgue theorem $\lim_{|\xi|\to\infty}|\xi^p\hat f^{(q)}(\xi)| = 0$. □
We will need the notion of convergence in $\mathscr{S}$ in later parts of the book.
19.2.4 Definition A sequence $(f_n)_{n\in\mathbb{N}}$ of elements in $\mathscr{S}$ tends to 0 as $n$ tends to infinity if
$$\lim_{n\to\infty}\,\sup_{x\in\mathbb{R}}|x^p f_n^{(q)}(x)| = 0$$
for all $p$ and $q$ in $\mathbb{N}$. We write $f_n \to 0$ in $\mathscr{S}$, or simply $f_n \to 0$ if there is no chance for misunderstanding.
Clearly, $f_n \to f$ in $\mathscr{S}$ if and only if $f_n - f \to 0$ in $\mathscr{S}$. Definition 19.2.4 implies that the sequence $f_n$ and the sequences of all derivatives $f_n^{(q)}$ converge uniformly on $\mathbb{R}$ (take $p = 0$). We will often make use of the following consequences of convergence in $\mathscr{S}$.
19.2.5 Proposition If the sequence $(f_n)_{n\in\mathbb{N}}$ tends to 0 in $\mathscr{S}$, then
(i) $f_n' \to 0$ in $\mathscr{S}$ (continuity of derivation);
(ii) $Pf_n \to 0$ in $\mathscr{S}$ for all polynomials $P$;
(iii) $f_n \to 0$ in $L^1(\mathbb{R})$;
(iv) $\hat f_n \to 0$ in $\mathscr{S}$ (continuity of the Fourier transform).
Proof.
(i) This is part of the definition.
(ii) It is sufficient to prove this for $P(x) = x^k$; thus we must show that $x^p\big(x^k f_n(x)\big)^{(q)}$ converges uniformly to 0 on $\mathbb{R}$. But this follows directly from the definition and from Leibniz's formula for the derivatives of a product.
(iii) Take $\varepsilon > 0$. Since $f_n \to 0$ in $\mathscr{S}$, there is an $N > 0$ such that for all $n \ge N$ and all $x \in \mathbb{R}$, $|(1 + x^2)f_n(x)| \le \varepsilon$. Thus for all $n \ge N$, $\int|f_n(x)|\,dx \le \varepsilon\int(1 + x^2)^{-1}\,dx = \varepsilon\pi$, which proves that $f_n \to 0$ in $L^1(\mathbb{R})$.
(iv) We use (19.1); thus $|\xi^p\hat f_n^{(q)}(\xi)| = (2\pi)^{q-p}\,\big|\mathscr{F}\big((x^q f_n(x))^{(p)}\big)(\xi)\big|$. Let $g_n(x) = (x^q f_n(x))^{(p)}$. We know from Proposition 19.2.2 that $g_n$ is in $\mathscr{S}$ and from (i) and (ii) that $g_n \to 0$ in $\mathscr{S}$; hence $\|g_n\|_1 \to 0$ by (iii). Since $|\hat g_n(\xi)| \le \|g_n\|_1$, (iv) is proved. □
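19.3 The inverse Fourier transform on $\mathscr{S}$

We have seen (Theorem 18.1.1) that if $f$ and $\hat f$ are integrable, then at all points where $f$ is continuous,
$$f(x) = \int_{\mathbb{R}} e^{2i\pi x\xi}\hat f(\xi)\,d\xi. \qquad (19.2)$$
If $f$ is in $\mathscr{S}$, then $\hat f$ is in $\mathscr{S}$ and hence is integrable. Since $f$ is continuous everywhere, (19.2) is true for all $x \in \mathbb{R}$. In other notation,
$$f = \overline{\mathscr{F}}(\mathscr{F}f) \qquad (19.3)$$
for all $f \in \mathscr{S}$. In the same way, $f = \mathscr{F}(\overline{\mathscr{F}}f)$. This means that $\mathscr{F}$ is a 1-to-1 mapping on $\mathscr{S}$, and its inverse is
$$\mathscr{F}^{-1} = \overline{\mathscr{F}}. \qquad (19.4)$$
19.3.1 Theorem The Fourier transform $\mathscr{F}$ is a linear 1-to-1 mapping from $\mathscr{S}$ onto $\mathscr{S}$ that is continuous in the sense of convergence on $\mathscr{S}$. The inverse mapping is $\mathscr{F}^{-1} = \overline{\mathscr{F}}$; in other words, the relations
$$\hat f(\xi) = \int_{\mathbb{R}} e^{-2i\pi\xi x}f(x)\,dx, \qquad f(x) = \int_{\mathbb{R}} e^{+2i\pi x\xi}\hat f(\xi)\,d\xi$$
are true for all $f \in \mathscr{S}$ and all $x, \xi \in \mathbb{R}$.
Proof. We have proved everything except the continuity of $\mathscr{F}$ and $\overline{\mathscr{F}}$. The continuity of $\mathscr{F}$ is given by Proposition 19.2.5(iv); the continuity of $\overline{\mathscr{F}}$ is proved in the same way. □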
An often cited representative of the space $\mathscr{S}$ is a Gaussian, which is a function of the form
$$g(x) = e^{-a(x - m)^2}.$$
The function $g(x) = e^{-\pi x^2}$ plays a special role in analysis.
19.3.2 Proposition The Fourier transform of $g(x) = e^{-\pi x^2}$ is the same function, $\hat g(\xi) = e^{-\pi\xi^2}$.
This result was established in Section 17.3.2. The function $g(x) = e^{-\pi x^2}$ is a fixed point of the Fourier transform.
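The fixed-point property of $e^{-\pi x^2}$ is easy to observe numerically. The sketch below is only an illustration: it assumes a midpoint rule on a truncated interval, uses the fact that the transform of an even real function reduces to a cosine integral, and introduces the hypothetical helper `fourier_transform_even`.

```python
import math

def gaussian(x):
    """The function g(x) = exp(-pi x^2) of Proposition 19.3.2."""
    return math.exp(-math.pi * x * x)

def fourier_transform_even(f, xi, half_width=8.0, steps=20_000):
    """Midpoint approximation of the integral of f(x) cos(2 pi xi x) over
    [-half_width, half_width]. For an even real f this equals the Fourier
    transform, because the sine part integrates to zero."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += f(x) * math.cos(2.0 * math.pi * xi * x)
    return total * h

for xi in (0.0, 0.5, 1.0, 2.0):
    print(xi, fourier_transform_even(gaussian, xi), gaussian(xi))
```

The two printed columns agree to many digits, since the integrand is smooth and decays very fast, which is exactly the Schwartz-class behavior discussed in this lesson.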
19.4 Exercises
Exercise 19.1 Find an example of a function in $C^\infty(\mathbb{R})$ that decays rapidly but whose derivative does not decay rapidly.
*Exercise 19.2
(a) Show that if $f$ and $g$ are in $\mathscr{S}(\mathbb{R})$, then the product $fg$ belongs to $\mathscr{S}(\mathbb{R})$.
(b) Show that the mapping $f \mapsto fg$ is continuous from $\mathscr{S}(\mathbb{R})$ to $\mathscr{S}(\mathbb{R})$.
Exercise 19.3 Let $f : \mathbb{R} \to \mathbb{C}$ be measurable. We say that $f$ is slowly increasing if there exist $C > 0$ and $N \in \mathbb{N}$ such that
$$|f(x)| \le C(1 + x^2)^N \quad\text{for all } x \in \mathbb{R}.$$
Suppose that $f \in C^\infty(\mathbb{R})$ and all of its derivatives are slowly increasing, and suppose that $g \in \mathscr{S}(\mathbb{R})$. Show that $fg \in \mathscr{S}(\mathbb{R})$.
Exercise 19.4 Assume that $g \in \mathscr{S}(\mathbb{R})$ and that $f = P/Q$, where $P$ and $Q$ are polynomials. Show that if $Q$ has no real zeros, then $fg \in \mathscr{S}(\mathbb{R})$.
Exercise 19.5 Prove Proposition 19.2.2.
Exercise 19.6 Show that $\mathscr{S}(\mathbb{R}) \subset L^p(\mathbb{R})$, $p \ge 1$.
Hint: Write $f(x) = (1 + x^2)^{-1/p}(1 + x^2)^{1/p}f(x)$ and note that $(1 + x^2)^{1/p}f(x)$ is bounded on $\mathbb{R}$.
*Exercise 19.7 (the density of $\mathscr{D}(\mathbb{R})$ in $\mathscr{S}(\mathbb{R})$)
Assume that $\psi$ is in $\mathscr{S}(\mathbb{R})$ and that $\alpha$ is in $\mathscr{D}(\mathbb{R})$ with $\alpha = 1$ on $[-1, 1]$. Define $\alpha_n(x) = \alpha(x/n)$ for $n \in \mathbb{N}^*$. Show that the sequence $\psi_n = \alpha_n\psi$ is in $\mathscr{D}(\mathbb{R})$ and that it converges to $\psi$ in the topology of $\mathscr{S}(\mathbb{R})$.
Lesson 20
The Convolution of Functions
The convolution, like the Fourier transform, is one of the essential tools
of signal processing. The results that we establish in this lesson will be
restricted to functions. Our development will often rely on the theorems on
integration established in Lesson 14. The limits of the notion of "function"
and practical applications lead naturally to generalize the Fourier trans-
form, convolution, and associated concepts to distributions. The study of
distributions will begin with Lesson 26; the convolution for distributions
will be developed in Lesson 32.
20.1 Definitions and examples

20.1.1 Definition The convolution of two functions $f$ and $g$ from $\mathbb{R}$ to $\mathbb{C}$ is the function $f * g$, if it exists, defined by
$$f * g(x) = \int_{\mathbb{R}} f(x - t)g(t)\,dt = \int_{\mathbb{R}} f(u)g(x - u)\,du.$$
If no assumptions are made about $f$ and $g$, the convolution is clearly not defined. Take, for example, $f = g = 1$. We give assumptions in Section 20.2 that imply the existence of $f * g$. But first we examine two examples that allow us to visualize some properties of the convolution.
20.1.2 Examples
(a) Let $f = g = \chi_{[0,1]}$. Then
$$\int_{\mathbb{R}} f(x - t)g(t)\,dt = \int_0^1 \chi_{[0,1]}(x - t)\,dt = \operatorname{measure}\big([0,1] \cap [x - 1, x]\big),$$
which is the "hat" function
$$f * g(x) = \begin{cases} 0 & \text{if } x \le 0, \\ x & \text{if } 0 \le x \le 1, \\ 2 - x & \text{if } 1 \le x \le 2, \\ 0 & \text{if } x \ge 2. \end{cases}$$
This convolution is illustrated in Figure 20.1.
FIGURE 20.1. The convolution $f * g$ is continuous (panels: $f = g = \chi_{[0,1]}$ and $f * g$).

(b) Take $f \in L^1(\mathbb{R})$ and $g = \frac{1}{2h}\chi_{[-h,+h]}$ with $h > 0$. Then
$$f * g(x) = \frac{1}{2h}\int_{-h}^{+h} f(x - t)\,dt = \frac{1}{2h}\int_{x-h}^{x+h} f(u)\,du,$$
which is the average of $f$ on the interval $[x - h, x + h]$. The continuity of $f * g$ is a direct consequence of Proposition 14.5.4.
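Example (a) can be checked with a direct Riemann sum. The following minimal sketch assumes a midpoint rule over $[0, 1]$ (where $g$ is nonzero); the names `indicator`, `convolve_at`, and `hat` are illustrative only. Because the integrand is discontinuous, the agreement is to about the step size and not better.

```python
def indicator(x):
    """Characteristic function of the interval [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def convolve_at(x, steps=10_000):
    """Midpoint Riemann sum for (f * g)(x) with f = g = indicator.
    Since g vanishes outside [0, 1], only t in [0, 1] contributes."""
    h = 1.0 / steps
    return h * sum(indicator(x - (k + 0.5) * h) for k in range(steps))

def hat(x):
    """Closed form of the convolution given in Example 20.1.2(a)."""
    if x <= 0.0 or x >= 2.0:
        return 0.0
    return x if x <= 1.0 else 2.0 - x

for x in (-0.5, 0.25, 1.0, 1.5, 2.5):
    print(x, convolve_at(x), hat(x))
```

The output also illustrates how the supports add: the result vanishes outside $[0, 2]$.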
These two examples illustrate an important property of the convolution:
It regularizes a function by averaging. We will study this in detail in Les-
son 21, but first we are going to present some conditions that imply the
existence of the convolution. For this we will need the notion of the support
of a function.
We have already given the definition of the support of a continuous func-
tion (Definition 15.1.5). We are now dealing with measurable functions that
are defined almost everywhere, and it is necessary to proceed cautiously.
For example, if we simply extend the previous definition of support to measurable functions, we have $\overline{\{x \in \mathbb{R} \mid \chi_{\mathbb{Q}}(x) \ne 0\}} = \mathbb{R}$ for the support of the characteristic function of the rationals. This definition is not what we want, for it clearly depends on the particular function we have taken as the representative of its class. We need a definition that gives the same result for all functions that are equal almost everywhere.
20.1.3 Definition (support of a measurable function)
Let $f : \mathbb{R} \to \mathbb{C}$ be a measurable function. Let $\mathscr{O}_i$, $i \in I$, be the family of open sets in $\mathbb{R}$ such that for all $i \in I$, $f = 0$ a.e. on $\mathscr{O}_i$. Let $\mathscr{O} = \bigcup_{i\in I}\mathscr{O}_i$ and define the support of $f$, $\operatorname{supp}(f)$, to be the closed set $\mathbb{R}\setminus\mathscr{O}$, that is,
$$\operatorname{supp}(f) = \mathbb{R}\setminus\mathscr{O}.$$
It is left as an exercise to verify that this definition extends Definition 15.1.5, that $f = 0$ a.e. on $\mathscr{O}$, and that $f = g$ a.e. implies $\operatorname{supp}(f) = \operatorname{supp}(g)$.
In the first example of Section 20.1.2, $\operatorname{supp}(f) = \operatorname{supp}(g) = [0, 1]$, and we saw that the convolution $f * g$ spreads these supports with the result that $\operatorname{supp}(f * g) = [0, 2]$. In general, we have the following result.
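20.1.4 Lemma Let $f$ and $g$ be two functions for which $f * g$ exists. Then
$$\operatorname{supp}(f * g) \subset \overline{\operatorname{supp}(f) + \operatorname{supp}(g)}.$$
Proof. Let $S$ denote $\mathbb{R}\setminus(\operatorname{supp}(f) + \operatorname{supp}(g))$ and let $S^\circ$ denote the interior of $S$. Suppose that $x \in S$. Then for all $t \in \operatorname{supp}(f)$ we have $(x - t) \notin \operatorname{supp}(g)$, and consequently $\int_{\mathbb{R}} g(x - t)f(t)\,dt = 0$. Now let $\mathscr{O}_{f*g}$ be the largest open set on which $f * g = 0$ a.e. We have just seen that $x \in S^\circ \Rightarrow x \in \mathscr{O}_{f*g}$. Thus $x \in \mathbb{R}\setminus\mathscr{O}_{f*g} = \operatorname{supp}(f * g)$ implies that $x \in \mathbb{R}\setminus S^\circ$. Since $\mathbb{R}\setminus S^\circ = \overline{\mathbb{R}\setminus S}$, this proves the result. □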
20.2 Convolution in $L^1(\mathbb{R})$

We will establish the existence of the convolution for integrable functions.
20.2.1 Proposition If $f$ and $g$ are in $L^1(\mathbb{R})$, then the following hold:
(i) $f * g$ is defined almost everywhere and $f * g$ belongs to $L^1(\mathbb{R})$.
(ii) The convolution is a continuous bilinear operator from $L^1(\mathbb{R}) \times L^1(\mathbb{R})$ to $L^1(\mathbb{R})$ with
$$\|f * g\|_1 \le \|f\|_1\,\|g\|_1. \qquad (20.1)$$
Proof.
(i) Since $f$ and $g$ are in $L^1(\mathbb{R})$, Fubini's theorem implies that the function $(y, z) \mapsto f(y)g(z)$ is in $L^1(\mathbb{R}^2)$. By making the change of variables $y = x - t$ and $z = t$, we have
$$\iint_{\mathbb{R}\times\mathbb{R}} f(y)g(z)\,dy\,dz = \iint_{\mathbb{R}\times\mathbb{R}} f(x - t)g(t)\,dx\,dt.$$
The function $x \mapsto \int_{\mathbb{R}} f(x - t)g(t)\,dt$ is thus defined almost everywhere and belongs to $L^1(\mathbb{R})$, again by Fubini's theorem.
(ii) To establish the inequality (20.1) we write
$$|f * g(x)| \le \int_{\mathbb{R}}|f(x - t)||g(t)|\,dt = |f| * |g|(x).$$
Thus
$$\int_{\mathbb{R}}|f * g|(x)\,dx \le \int_{\mathbb{R}}|f| * |g|(x)\,dx = \int_{\mathbb{R}} dx\int_{\mathbb{R}}|f(x - t)||g(t)|\,dt = \int_{\mathbb{R}}|g(t)|\Big(\int_{\mathbb{R}}|f(x - t)|\,dx\Big)dt = \|g\|_1\,\|f\|_1. \quad □$$
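Inequality (20.1) has an exact discrete analogue: for sampled functions, the Riemann sum of $|f * g|$ is bounded by the product of the Riemann sums of $|f|$ and $|g|$. The sketch below illustrates this with two arbitrarily chosen sample functions; the grid, the functions, and the helper names are assumptions made for the illustration and are not taken from the text.

```python
import math

def samples(func, lo, hi, n):
    """Midpoint samples of func on [lo, hi] together with the step size."""
    h = (hi - lo) / n
    return [func(lo + (k + 0.5) * h) for k in range(n)], h

def discrete_convolution(u, v):
    """Plain O(len(u) * len(v)) convolution of two finite sequences."""
    out = [0.0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] += ui * vj
    return out

def l1_norm(values, h):
    """Riemann-sum approximation of the L^1 norm."""
    return h * sum(abs(v) for v in values)

f_vals, h = samples(lambda x: math.exp(-abs(x)) * math.sin(3.0 * x), -10.0, 10.0, 500)
g_vals, _ = samples(lambda x: x * math.exp(-x * x), -10.0, 10.0, 500)

# h * (discrete convolution) approximates the integral defining f * g.
conv_vals = [h * c for c in discrete_convolution(f_vals, g_vals)]

lhs = l1_norm(conv_vals, h)
rhs = l1_norm(f_vals, h) * l1_norm(g_vals, h)
print(lhs, rhs)
```

Because both sample functions change sign, cancellation inside the convolution makes the left-hand value strictly smaller than the right-hand one here.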
Can the hypotheses of this last result be weakened? If $f$ and $g$ are in $L^1_{\mathrm{loc}}(\mathbb{R})$, the result is false (take $f = g \equiv 1$). However, we have the following result.
20.2.2 Proposition Assume that $f \in L^1_{\mathrm{loc}}(\mathbb{R})$ and that $g \in L^1(\mathbb{R})$.
(i) If $\operatorname{supp}(g)$ is bounded, then $f * g(x)$ exists a.e. and belongs to $L^1_{\mathrm{loc}}(\mathbb{R})$.
(ii) If $f$ is bounded, then $f * g(x)$ exists for all $x$ and belongs to $L^\infty(\mathbb{R})$.
Proof.
(i) $g$ is zero a.e. outside some interval $[-a, a]$. Take $x$ in a finite interval $[\alpha, \beta]$. For all $t \in [-a, a]$ and all $x \in [\alpha, \beta]$,
$$f(x - t)g(t) = \chi_{[\alpha-a,\,\beta+a]}(x - t)f(x - t)g(t),$$
and thus
$$f * g(x) = \int_{-a}^{+a} f(x - t)g(t)\,dt = \big(\chi_{[\alpha-a,\,\beta+a]}f\big) * g(x).$$
$f * g$ coincides on $[\alpha, \beta]$ with the convolution of two functions in $L^1(\mathbb{R})$, so by Proposition 20.2.1(i) it is defined a.e. and is integrable. Thus $f * g$ is defined a.e. and is integrable on all compact sets.
(ii) If $f \in L^\infty(\mathbb{R})$, then
$$\Big|\int_{\mathbb{R}} f(u)g(x - u)\,du\Big| \le \|f\|_\infty\int_{\mathbb{R}}|g(x - u)|\,du = \|f\|_\infty\|g\|_1$$
for all $x$, and $\|f * g\|_\infty \le \|f\|_\infty\|g\|_1$. □
20.3 Convolution in $L^p(\mathbb{R})$

If $p$ and $q$ are two real positive numbers (perhaps $+\infty$) such that $\frac{1}{p} + \frac{1}{q} = 1$, we say that $p$ and $q$ are harmonic conjugates, or simply conjugates.
20.3.1 Proposition Assume that $f \in L^p(\mathbb{R})$ and $g \in L^q(\mathbb{R})$ ($p$ and $q$ conjugates). Then the following hold:
(i) $f * g$ is defined everywhere and is continuous and bounded on $\mathbb{R}$.
(ii)
$$\|f * g\|_\infty \le \|f\|_p\,\|g\|_q. \qquad (20.2)$$
Proof. We will prove this result in two particularly important cases: $p = 1$, $q = +\infty$ and $p = 2$, $q = 2$.
First case: $p = 1$, $q = +\infty$.
Since we have already seen (Proposition 20.2.2(ii)) that $f * g$ is defined everywhere and bounded, we only need to prove continuity. For this we write
$$|f * g(x) - f * g(y)| \le \int|f(x - t) - f(y - t)||g(t)|\,dt \le \|g\|_\infty\int|f(x - t) - f(y - t)|\,dt.$$
We first establish the continuity when $f$ is continuous with compact support. Thus let $(-a, a)$ be an open interval containing $\operatorname{supp}(f)$. For $|x - y|$ sufficiently small,
$$\int_{\mathbb{R}}|f(x - t) - f(y - t)|\,dt = \int_{\mathbb{R}}|f(x - y + u) - f(u)|\,du = \int_{-a}^{a}|f(x - y + u) - f(u)|\,du \le 2a\sup_{|u|\le a}|f(x - y + u) - f(u)|.$$
Since $f$ is uniformly continuous on $[-a, a]$ it follows that $f * g$ is continuous on $\mathbb{R}$, in fact, uniformly continuous on $\mathbb{R}$.
When $f \in L^1(\mathbb{R})$, we argue using the density of $C_c^0$ in $L^1(\mathbb{R})$ (Theorem 15.3.3). Thus let $f_n$ be a sequence of continuous functions with compact support such that $\lim_{n\to\infty}\|f_n - f\|_1 = 0$. Adding and subtracting $f_n * g$ at $x$ and $y$, we see that
$$|f * g(x) - f * g(y)| \le |f * g(x) - f_n * g(x)| + |f_n * g(y) - f * g(y)| + |f_n * g(x) - f_n * g(y)|,$$
so
$$|f * g(x) - f * g(y)| \le 2\|g\|_\infty\|f - f_n\|_1 + |f_n * g(x) - f_n * g(y)|.$$
The first term on the right-hand side tends to 0 as $n$ tends to infinity, and $f_n * g$ is uniformly continuous for each fixed $n$; it follows directly that $f * g$ is uniformly continuous on $\mathbb{R}$.
Second case: $p = 2$, $q = 2$.
From Schwarz's inequality we have
$$|f * g(x)| \le \int|f(x - t)||g(t)|\,dt \le \Big(\int|f(x - t)|^2\,dt\Big)^{1/2}\Big(\int|g(t)|^2\,dt\Big)^{1/2},$$
and hence $\|f * g\|_\infty \le \|f\|_2\|g\|_2$. The continuity is established as in the first case using Schwarz's inequality and the density of $C_c^0(\mathbb{R})$ in $L^2(\mathbb{R})$ (Section 15.3.4).
For $p \ne 1, 2$ one uses Hölder's inequality (Lemma 15.2.4) and imitates the arguments given above. □
We next examine the convolution of a function of $L^1(\mathbb{R})$ with a function of $L^2(\mathbb{R})$, which is a case not included in the last result.
20.3.2 Proposition If $f \in L^1(\mathbb{R})$ and $g \in L^2(\mathbb{R})$, then the following hold:
(i) $f * g(x)$ exists almost everywhere.
(ii) $f * g$ is in $L^2(\mathbb{R})$, and
$$\|f * g\|_2 \le \|f\|_1\,\|g\|_2. \qquad (20.3)$$
Proof.
(i) Write
$$|f(u)g(x - u)| = \big(|f(u)||g(x - u)|^2\big)^{1/2}\big(|f(u)|\big)^{1/2}. \qquad (20.4)$$
Since $|f| \in L^1(\mathbb{R})$ and $|g|^2 \in L^1(\mathbb{R})$, the function $u \mapsto |f(u)||g(x - u)|^2$ is integrable for almost all $x$ (Proposition 20.2.1(i)). The right-hand term of (20.4), being the product of two square integrable functions, is integrable. Thus $f * g(x)$ is defined for almost all $x$.
(ii) Using the Schwarz inequality and (20.4) we see that
$$|f * g(x)| \le \int|f(u)||g(x - u)|\,du \le \Big(\int|f(u)||g(x - u)|^2\,du\Big)^{1/2}\Big(\int|f(u)|\,du\Big)^{1/2},$$
and thus
$$|f * g(x)|^2 \le \big(|f| * |g|^2(x)\big)\,\|f\|_1.$$
Integrating both sides of the last inequality shows that
$$\int|f * g(x)|^2\,dx \le \|f\|_1\int|f| * |g|^2(x)\,dx \le \|f\|_1\,\|f\|_1\,\big\||g|^2\big\|_1,$$
and finally,
$$\|f * g\|_2 \le \|f\|_1\,\|g\|_2. \quad □$$
20.3.3 Remark The last result can be generalized to the convolution $L^p(\mathbb{R}) * L^q(\mathbb{R})$ with $\frac{1}{p} + \frac{1}{q} - 1 = \frac{1}{r}$, where $p$, $q$, $r$ are $\ge 1$. For $f \in L^p(\mathbb{R})$ and $g \in L^q(\mathbb{R})$, $f * g$ is in $L^r(\mathbb{R})$ [Kho72]. In Proposition 20.3.2 we have $p = 1$, $q = 2$, and $r = 2$.
20.4 Convolution of functions with limited support

When one observes a signal, it exists from some time $t_i$ to time $t_f$, with the possibility that $t_i = -\infty$ and $t_f = +\infty$. Signals whose support is limited on the left (lies to the right of some finite point) are of particular interest.
20.4.1 Definition
$C_+^0 = \{f \in C^0(\mathbb{R}) \mid \operatorname{supp}(f) \subset [a, +\infty) \text{ for some } a \in \mathbb{R}\}$.
$C_{pw+} = \{f \text{ is piecewise continuous} \mid \operatorname{supp}(f) \subset [a, +\infty) \text{ for some } a \in \mathbb{R}\}$.
The function spaces $C_-^0$ and $C_{pw-}$ are defined similarly for functions whose support is limited on the right.
Recall that $C_c^0(\mathbb{R})$ denotes the continuous functions with bounded support and that $C_{pw}$ denotes the functions that are piecewise continuous, that is, $f$ is continuous except for a finite number of points $a_1, \dots, a_k$ where $f(a_j+)$ and $f(a_j-)$ exist (see Section 5.2.1).
20.4.2 Proposition If $f$ and $g$ are in $C_c^0(\mathbb{R})$, the convolution $f * g$ exists and belongs to $C_c^0(\mathbb{R})$.
Proof. Consider this to be a case of the convolution $L^1(\mathbb{R}) * L^\infty(\mathbb{R})$ (Proposition 20.3.1): $f * g$ is defined, continuous, and bounded on $\mathbb{R}$. By Lemma 20.1.4,
$$\operatorname{supp}(f * g) \subset \operatorname{supp}(f) + \operatorname{supp}(g).$$
Thus $f * g$ has bounded support and belongs to $C_c^0(\mathbb{R})$. □
20.4.3 Remark In the statement of this last result one can assume that $f$ and $g$ are in $C_{pw}$ and have bounded support. The convolution $f * g$ is again in $C_c^0(\mathbb{R})$.
20.4.4 Proposition Given $f$ and $g$ in $C_{pw+}$, the convolution $f * g$ exists and belongs to $C_+^0$.
Proof. Suppose that $\operatorname{supp}(f) \subset [a, +\infty)$ and $\operatorname{supp}(g) \subset [b, +\infty)$. Then $f(x - t) = 0$ if $x - t < a$, and $g(t) = 0$ if $t < b$. Thus
$$f * g(x) = 0 \quad\text{if } x < a + b. \qquad (20.5)$$
If $x \ge a + b$ and $M > x$, then
$$f * g(x) = \int_b^{M-a} f(x - t)g(t)\,dt. \qquad (20.6)$$
In (20.6), only the values of $f$ on the interval $[2a + b - M, M - b]$ are used, so the convolution can be written
$$f * g(x) = \big(f\chi_{[2a+b-M,\,M-b]}\big) * \big(g\chi_{[b,\,M-a]}\big)(x). \qquad (20.7)$$
Hence $f * g$ agrees on $[a + b, M)$ with the convolution of two functions in $C_{pw}$ that have bounded support. $f * g$ is thus in $C_+^0$. □
20.5 Summary
$L^1 * L^1 \subset L^1$
$L^1 * L^\infty \subset L^\infty \cap C^0$
$L^2 * L^2 \subset L^\infty \cap C^0$
$L^2 * L^1 \subset L^2$
$C_{pw+} * C_{pw+} \subset C_+^0$
$C_c^0 * C_c^0 \subset C_c^0$
20.6 Exercises
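**Exercise 20.1 Let $f : \mathbb{R} \to \mathbb{C}$ be a measurable function. With the notation of Definition 20.1.3, $\operatorname{supp}(f) = \mathbb{R}\setminus\mathscr{O}$. Show that $f = 0$ a.e. on $\mathscr{O}$.
Hint: The proof of this theoretical result is delicate. Write $\mathscr{O} = \bigcup_{n=1}^{\infty} K_n$, where
$$K_n = \Big\{x \in \mathscr{O} \;\Big|\; \operatorname{distance}(x, \mathbb{R}\setminus\mathscr{O}) \ge \frac{1}{n} \text{ and } |x| \le n\Big\}.$$
Note that $K_n$ is compact and hence is in the union of a finite number of the open sets $\mathscr{O}_i$.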
Exercise 20.2 Let $f = \chi_{[0,1]}$. Show that $h = f * (f * f)$ makes sense and compute $h$. What is the regularity of $h$?
Exercise 20.3 Show that
$$\chi_{[-a,a]} * \sin x = 2\sin a\,\sin x,$$
$$\chi_{[-a,a]} * \cos x = 2\sin a\,\cos x.$$
Exercise 20.4 Compute $u * u$, where $u$ denotes the Heaviside function.
Exercise 20.5 Suppose $f \in L^1(\mathbb{R})$ and $g \in L^p(\mathbb{R})$, $1 \le p < +\infty$. Show that $f * g \in L^p(\mathbb{R})$ and that $\|f * g\|_p \le \|f\|_1\|g\|_p$.
Hint: The case $p = 1$ is done in Proposition 20.2.1, and the case $p = 2$ is done in Proposition 20.3.2. For $p \ne 1, 2$, imitate the proof in Proposition 20.3.2 by writing
$$|f(u)g(x - u)| = \big(|f(u)||g(x - u)|^p\big)^{\frac{1}{p}}\,|f(u)|^{1-\frac{1}{p}};$$
then use Hölder's inequality (Lemma 15.2.4).
Exercise 20.6 Show that the convolution of a slowly increasing function $f$ with a rapidly decreasing function $g$ is well-defined.
Hint: Write $|f(x - t)g(t)| \le C\,(1 + (x - t)^2)^N\,(1 + t^2)^{-N-2}$.

*Exercise 20.7 Assume that $f$ and $g$ are in $L_p^2(0, a)$ and that their periodic convolution is
$$f * g(x) = \int_0^a f(x - t)g(t)\,dt = \int_0^a f(t)g(x - t)\,dt.$$
(a) Show that $f * g$ exists and belongs to $L_p^2(0, a) \cap C^0([0, a])$ and that
(b) Show that $c_n(f * g) = a\,c_n(f)\,c_n(g)$.
Lesson 21
Convolution, Derivation, and

Regularization
We saw in Lesson 20 conditions under which the convolution of two func-

tions is well-defined. We turn now to several important properties of the
convolution, some of which will be extended to distributions in Lesson 32.
In the current lesson we focus on regularization.
21.1 Convolution and continuity

We have shown that $f * g$ is continuous on $\mathbb{R}$ when $f \in L^p(\mathbb{R})$, $g \in L^q(\mathbb{R})$, and $\frac{1}{p} + \frac{1}{q} = 1$ (Proposition 20.3.1). Here is a consequence of that result.
21.1.1 Proposition Suppose that $f \in L^p(\mathbb{R})$ has bounded support and that $g$ is in $L^q_{\mathrm{loc}}(\mathbb{R})$ with $\frac{1}{p} + \frac{1}{q} = 1$. Then the convolution $f * g$ is defined and continuous for all $x \in \mathbb{R}$.
Proof. $f$ is zero a.e. outside some interval $[-a, a]$. Suppose $x$ is in a bounded interval $[\alpha, \beta]$. $f * g$ agrees on $[\alpha, \beta]$ with $f * \big(g\chi_{[\alpha-a,\,\beta+a]}\big)$, and this reduces to a convolution $L^p * L^q$ as in Proposition 20.3.1. Thus $f * g$ is defined and continuous everywhere. □
21.1.2 Example The convolution of a function in $L^1(\mathbb{R})$ with a bounded function having compact support is continuous.
21.2 Convolution and derivation

The last result can be generalized: Convolution with a function of class $C^p$ yields a function in $C^p(\mathbb{R})$.
21.2.1 Proposition Let $f$ be in $L^1(\mathbb{R})$ and let $g$ be in $C^p(\mathbb{R})$. Assume that $g^{(k)}$ is bounded for $k = 0, 1, \dots, p$. Then (i) $f * g \in C^p(\mathbb{R})$ and (ii) $(f * g)^{(k)} = f * g^{(k)}$ for $k = 1, 2, \dots, p$.
Proof. By applying Proposition 20.3.1 with $p = 1$ and $q = +\infty$ we see that $f * g^{(k)}$ is continuous for $k = 0, 1, \dots, p$. The function $x \mapsto f(t)g(x - t)$ is $p$-times differentiable, and for $k = 0, 1, \dots, p$,
$$\Big|\frac{\partial^k}{\partial x^k}\big(f(t)g(x - t)\big)\Big| = |f(t)g^{(k)}(x - t)| \le M_k|f(t)|,$$
where $M_k = \sup_{y\in\mathbb{R}}|g^{(k)}(y)|$. Since $f \in L^1(\mathbb{R})$, we can differentiate under the integral sign (Proposition 14.2.2); hence
$$(f * g)^{(k)}(x) = \int_{\mathbb{R}} f(t)g^{(k)}(x - t)\,dt = f * g^{(k)}(x). \quad □$$
21.2.2 Remark $(f * g)^{(k)}$ is bounded on $\mathbb{R}$ for $k = 0, 1, \dots, p$ because $(f * g)^{(k)} = f * g^{(k)}$ is a convolution of the type $L^1 * L^\infty$.

21.3.1 Definition A sequence of functions Pn in .! (JR) (Definition
15.1. 7) is called a regularizing sequence if it satisfies the following condi-
tions:
L
(i) Pn(x) 2': 0 for all x ER
(ii) Pn(x) dx = 1.
(iii) The Support of Pn is in [-cn, cn], cn > 0, and lim cn

n-->oo
= 0.
To see that such a sequence exists, take p E .! (JR) defined by
-1 e -~
{ ~
1-x if lxl :::; 1,
p(x) =
if lxl > 1,
1
with
1 1
c= e dx,
-~
1-x
-1
and let Pn(x) = np(nx). In practice, regularizing sequences are used with-
out defining them explicitly. As we will see, the details are not important;
one uses only properties (i), (ii), and (iii).
21.3.2 Definition If f E L 1 (1R), the functions f * Pn are called regu-

larizations of f.
It is clear from the properties of Pn and Proposition 21.2.1 that f * Pn is

in C 00 (!R). But what is the relation between fand its regularizations?
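21.3.3 Theorem (density of $\mathscr{D}(\mathbb{R})$ in $L^1(\mathbb{R})$) Let $f$ be a function
in $L^1(\mathbb{R})$. For every $\varepsilon > 0$ there exists $g_\varepsilon$ in $\mathscr{D}(\mathbb{R})$ such that $\|f - g_\varepsilon\|_1 \le \varepsilon$.
Proof. First choose $f_\varepsilon$ in $C_c^0(\mathbb{R})$ such that $\|f - f_\varepsilon\|_1 \le \varepsilon/2$ (Theorem
15.3.3). Assume that $\mathrm{supp}(f_\varepsilon) \subset [a, b]$. Now consider the regularizations $g_n$
of $f_\varepsilon$, namely, $g_n = f_\varepsilon * \rho_n$. Let $K = [a-1, b+1]$. Then $\mathrm{supp}(g_n) \subset K$ for
sufficiently large $n$ (Lemma 20.1.4). Since $\rho_n \in C^\infty(\mathbb{R})$, $g_n$ is in $\mathscr{D}(\mathbb{R})$.
We wish to estimate $\|f_\varepsilon - g_n\|_1$. For sufficiently large $n$,
\[ \int_{\mathbb{R}} |f_\varepsilon(x) - g_n(x)|\,dx \le (b - a + 2) \sup_{x \in K} |f_\varepsilon(x) - g_n(x)|. \tag{21.1} \]
Since $\int_{\mathbb{R}} \rho_n(t)\,dt = 1$, we can write
\[ f_\varepsilon(x) - g_n(x) = \int_{\mathbb{R}} \bigl(f_\varepsilon(x) - f_\varepsilon(x-t)\bigr)\rho_n(t)\,dt. \]
Thus
\[ |f_\varepsilon(x) - g_n(x)| \le \sup_{|t| \le \varepsilon_n} |f_\varepsilon(x) - f_\varepsilon(x-t)|, \]
and
\[ \sup_{x \in K} |f_\varepsilon(x) - g_n(x)| \le \sup_{x \in K,\ |t| \le \varepsilon_n} |f_\varepsilon(x) - f_\varepsilon(x-t)|. \tag{21.2} \]
$f_\varepsilon$ is uniformly continuous, so the right-hand side of (21.2) tends to 0 as
$n \to +\infty$. Returning to (21.1) we see that
\[ \lim_{n\to\infty} \|f_\varepsilon - g_n\|_1 = 0. \tag{21.3} \]
In particular, there is an $N$ such that $n > N$ implies $\|f_\varepsilon - g_n\|_1 \le \varepsilon/2$.
Thus for all sufficiently large $n$, $\|f - g_n\|_1 \le \|f - f_\varepsilon\|_1 + \|f_\varepsilon - g_n\|_1 \le \varepsilon$,
which proves that $\mathscr{D}(\mathbb{R})$ is dense in $L^1(\mathbb{R})$. $\square$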
21.3.4 Remark The proof shows that if $f$ is continuous, the sequence
$f * \rho_n$ tends to $f$ uniformly on all compact sets.
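21.3.5 Remark One can prove in the same way that $\mathscr{D}(\mathbb{R})$ is dense in
$L^p(\mathbb{R})$, $1 \le p < +\infty$.
21.3.6 Theorem If $f \in L^1(\mathbb{R})$, then $f * \rho_n$ tends to $f$ in $L^1(\mathbb{R})$.
Proof. Take $\varepsilon > 0$. By Theorem 21.3.3 there is a $g_\varepsilon$ in $\mathscr{D}(\mathbb{R})$ such that
$\|f - g_\varepsilon\|_1 \le \varepsilon/4$. From (20.1) we see that
\[ \|f * \rho_n - g_\varepsilon * \rho_n\|_1 \le \|f - g_\varepsilon\|_1 \|\rho_n\|_1 = \|f - g_\varepsilon\|_1, \]
and hence that
\[ \|f - f * \rho_n\|_1 \le \|f - g_\varepsilon\|_1 + \|g_\varepsilon - g_\varepsilon * \rho_n\|_1 + \|g_\varepsilon * \rho_n - f * \rho_n\|_1 \le 2\|f - g_\varepsilon\|_1 + \|g_\varepsilon - g_\varepsilon * \rho_n\|_1. \]
We know from (21.3) that
\[ \lim_{n\to\infty} \|g_\varepsilon - g_\varepsilon * \rho_n\|_1 = 0. \]
Thus there is an $N$ such that for all $n > N$, $\|g_\varepsilon - g_\varepsilon * \rho_n\|_1 \le \varepsilon/2$, in which
case $\|f - f * \rho_n\|_1 \le \varepsilon$. This proves that $f * \rho_n$ tends to $f$ in $L^1(\mathbb{R})$. $\square$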
21.3.7 Remark A similar argument can be used to show that the regularizations
$f * \rho_n$ of a function $f \in L^p(\mathbb{R})$, $1 < p < \infty$, tend to $f$ in $L^p(\mathbb{R})$.
(See Exercise 21.4.)
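The convergence of regularizations in $L^1(\mathbb{R})$ can be observed on a discontinuous example. The sketch below is only an illustration under stated assumptions: $f$ is taken to be the indicator function of $[0, 1]$, the regularizing sequence is the bump-based one $n\rho(nx)$, and all integrals are approximated by a midpoint rule (the function names are hypothetical).

```python
import math

def bump(x):
    """Unnormalized bump exp(-1/(1 - x^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def midpoint(g, a, b, steps):
    """Composite midpoint rule for g on [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

C = midpoint(bump, -1.0, 1.0, 4000)  # normalizing constant of the bump

def f(x):
    """Indicator function of [0, 1]: integrable but not continuous."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def regularized(n, x, steps=200):
    """(f * rho_n)(x): integral of f(x - t) rho_n(t) over t in [-1/n, 1/n]."""
    return midpoint(lambda t: f(x - t) * n * bump(n * t) / C,
                    -1.0 / n, 1.0 / n, steps)

def l1_error(n, steps=600):
    """Approximate ||f - f * rho_n||_1; the integrand vanishes outside [-1, 2]."""
    return midpoint(lambda x: abs(f(x) - regularized(n, x)), -1.0, 2.0, steps)

errors = {n: l1_error(n) for n in (2, 4, 8)}

if __name__ == "__main__":
    for n, err in errors.items():
        print(n, round(err, 4))
```

For this particular $f$ the error is concentrated near the two jumps and shrinks roughly in proportion to $1/n$, the width of the support of $\rho_n$.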
21.4 The convolution $\mathscr{S}(\mathbb{R}) * \mathscr{S}(\mathbb{R})$

Since $\mathscr{S}(\mathbb{R})$ is in $L^1(\mathbb{R})$, we know that the convolution of two functions in
$\mathscr{S}(\mathbb{R})$ is in $L^1(\mathbb{R})$. There is a better result.
21.4.1 Proposition Assume that $f$ and $g$ are in $\mathscr{S}(\mathbb{R})$. Then the
following hold:
(i) $f * g$ is in $\mathscr{S}(\mathbb{R})$.
(ii) The convolution is a continuous operator from $\mathscr{S}(\mathbb{R}) \times \mathscr{S}(\mathbb{R})$ to
$\mathscr{S}(\mathbb{R})$.
Proof.
(i) $f * g \in C^\infty(\mathbb{R})$ (Proposition 21.2.1). We look at the behavior at infinity.
First,
\[ \lim_{|x|\to\infty} f * g(x) = \lim_{|x|\to\infty} \int_{\mathbb{R}} f(x-t)g(t)\,dt = 0 \]
by dominated convergence: $f \in \mathscr{S}$ and $|f(x-t)g(t)| \le \|f\|_\infty |g(t)|$, which
is integrable. To study $\lim_{|x|\to\infty} x^p (f * g)^{(q)}(x)$ we use the formula
\[ x^p (f * g)^{(q)}(x) = \sum_{j=0}^{p} \binom{p}{j} (x^{p-j} f) * (x^j g^{(q)}), \]
where the $\binom{p}{j}$ are binomial coefficients. Thus $x^p (f * g)^{(q)}$ is written as a
sum of convolutions of elements in $\mathscr{S}(\mathbb{R})$, which is invariant under
differentiation and multiplication by polynomials (Proposition 19.2.2). Thus, by
what we have just shown for $f * g$, $\lim_{|x|\to\infty} x^p (f * g)^{(q)}(x) = 0$.
(ii) To prove continuity, consider two sequences $f_n$ and $g_m$ in $\mathscr{S}(\mathbb{R})$ that
converge in $\mathscr{S}$ to $f$ and $g$. Adding and subtracting $f_n * g$, we have
\[ \|f_n * g_m - f * g\|_\infty \le \|(f_n - f) * g\|_\infty + \|f_n * (g_m - g)\|_\infty. \]
Using (20.2), this becomes
\[ \|f_n * g_m - f * g\|_\infty \le \|f_n - f\|_\infty \|g\|_1 + \|f_n\|_1 \|g_m - g\|_\infty, \]
and it follows that $f_n * g_m$ converges to $f * g$. For expressions of the form
$x^p (f_n * g_m)^{(q)}(x)$ we use the decomposition given in (i). $\square$
21.4.2 Remark Note that we did not use the regularity of $f$ in the last
proof. By modifying this proof we can show that $g \mapsto f * g$ is continuous
from $\mathscr{S}(\mathbb{R})$ to $\mathscr{S}(\mathbb{R})$ when $g \in \mathscr{S}(\mathbb{R})$ and $f \in L^1_{loc}(\mathbb{R})$ decreases rapidly.
21.5 Exercises
Exercise 21.1 Assume that $f$ is in $L^1(\mathbb{R})$ and $g(x) = e^{2i\pi x}$. Compute $f * g$.
Exercise 21.2 Suppose $f \in C^0(\mathbb{R})$ and $h > 0$. Show that $g = \frac{1}{2h}\chi_{[-h,h]} * f$
is in $C^1(\mathbb{R})$ and compute $g'$.
*Exercise 21.3 Show that $\mathscr{D}(\mathbb{R})$ is dense in $L^p(\mathbb{R})$, $1 \le p < +\infty$. Deduce
from this that $\mathscr{S}(\mathbb{R})$ is dense in $L^p(\mathbb{R})$, $1 \le p < +\infty$.
Exercise 21.4 Take $f \in L^p(\mathbb{R})$, $1 \le p < +\infty$. Use Exercise 20.5 to show
that $\lim_{n\to+\infty} \|f - f * \rho_n\|_p = 0$.
*Exercise 21.5 Let $I$ be an open interval in $\mathbb{R}$. Show that $\mathscr{D}(I)$ is dense in
$L^p(I)$, $1 \le p < +\infty$.
*Exercise 21.6 Suppose $f \in L^1_{loc}(\mathbb{R})$ is such that
\[ \int_{\mathbb{R}} f(t)\varphi(t)\,dt = 0 \quad \text{for all } \varphi \in \mathscr{D}(\mathbb{R}). \]
(a) Show that the regularizations $\rho_n * f$ are zero.
(b) Take $a > 0$ and $b = a + 1$. Show that for all $|x| \le a$,
\[ \rho_n * f(x) = \rho_n * (\chi_{[-b,b]} f)(x) = 0, \]
and from this deduce that
\[ \int_{-a}^{a} |f(x)|\,dx \le \|\chi_{[-b,b]} f - \rho_n * \chi_{[-b,b]} f\|_1. \]
(c) Conclude that $f = 0$ a.e. on $\mathbb{R}$.
*Exercise 21.7 Suppose $f$ and $g$ are in $L^2(\mathbb{R})$. Prove that $\lim_{|x|\to\infty} f * g(x) = 0$.
Exercise 21.8 If $f$ and $g$ are in $\mathscr{S}(\mathbb{R})$, show that
\[ x^p (f * g)^{(q)}(x) = \sum_{j=0}^{p} \binom{p}{j} (x^{p-j} f) * (x^j g^{(q)})(x). \]

Lesson 22
The Fourier Transform on $L^2(\mathbb{R})$
In signal processing, $L^2(\mathbb{R})$ models the space of signals that are functions
of a continuous variable (usually time) and that have finite energy. Until
now, the Fourier transform has been defined only for integrable functions,
and $L^2(\mathbb{R})$ is not included in $L^1(\mathbb{R})$. The purpose of this lesson is to extend
the Fourier transform to $L^2(\mathbb{R})$; we will do this using results that have been
established for $\mathscr{S}(\mathbb{R})$.
22.1 Extension of the Fourier transform

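We proved in Section 19.2 that $\mathscr{F}$ is a continuous linear operator from
$\mathscr{S}$ to $\mathscr{S}$. We are going to extend $\mathscr{F}$ to $L^2(\mathbb{R})$ using the following density
result.
22.1.1 Proposition (density of $\mathscr{S}$ in $L^2(\mathbb{R})$)
$\mathscr{S}$ is a dense linear subspace of $L^2(\mathbb{R})$.
Proof. We need to show that $\mathscr{S} \subset L^2(\mathbb{R})$. For $f \in \mathscr{S}$, there exists an
$A > 0$ such that $|(1 + x^2)f(x)| \le A$ for all $x \in \mathbb{R}$. Hence
\[ \int_{\mathbb{R}} |f(x)|^2\,dx \le A^2 \int_{\mathbb{R}} \frac{dx}{(1 + x^2)^2} < +\infty. \]
The density follows from the density of $\mathscr{D}(\mathbb{R})$ in $L^2(\mathbb{R})$ (Remark 21.3.5). $\square$
The Fourier transform is an isometry from $\mathscr{S}$ to $\mathscr{S}$ in the $L^2$ norm.
22.1.2 Proposition (The Plancherel-Parseval equality)
For $f$ and $g$ in $\mathscr{S}$,
(i) $\int_{\mathbb{R}} \hat f(\xi)\,\overline{\hat g(\xi)}\,d\xi = \int_{\mathbb{R}} f(x)\,\overline{g(x)}\,dx$,
(ii) $\int_{\mathbb{R}} |\hat f(\xi)|^2\,d\xi = \int_{\mathbb{R}} |f(x)|^2\,dx$.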
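Proof. The first equation follows directly from (17.4): Let $h(\xi) = \overline{\hat g(\xi)}$.
From (17.4),
\[ \int_{\mathbb{R}} \hat f(\xi)\,h(\xi)\,d\xi = \int_{\mathbb{R}} f(x)\,\hat h(x)\,dx. \]
But $\overline{\hat g(\xi)} = \int_{\mathbb{R}} e^{2i\pi\xi x}\,\overline{g(x)}\,dx = \overline{\mathscr{F}}\,\overline{g}(\xi)$. Thus $\hat h = \overline{g}$, which proves (i). The
second relation is derived from (i) by taking $f = g$. $\square$
$\mathscr{F}$ is extended to $L^2(\mathbb{R})$ using the density of $\mathscr{S}$ in $L^2(\mathbb{R})$ and the fact
that $L^2(\mathbb{R})$ is complete. This will be an application of the following result.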
22.1.3 Proposition Let $E$ and $F$ be two normed vector spaces. Assume
that $F$ is complete and that $G$ is a dense linear subspace of $E$. If $A$
is a continuous linear operator from $G$ to $F$, then $A$ has a unique
continuous linear extension, denoted by $\tilde A$, from $E$ to $F$. Furthermore,
the norm of $\tilde A$ is equal to the norm of $A$.
Proof. Let $f$ be an element of $E$. Since $G$ is dense in $E$, there is a sequence
$f_n$ in $G$ such that $\lim_{n\to\infty} \|f - f_n\| = 0$. Being convergent, $f_n$ is a Cauchy
sequence, and since $A$ is continuous,
\[ \|Af_n - Af_m\| \le \|A\|\,\|f_n - f_m\|. \]
This shows that $Af_n$ is a Cauchy sequence in $F$. Since $F$ is complete,
$Af_n$ converges to some element $g$ in $F$. It is easy to show that $g$ does not
depend on the sequence $f_n$ that converges to $f$. Thus by letting $\tilde A f = g$, $\tilde A$
is well-defined on $E$.
$\tilde A$ is linear by definition, and
\[ \|\tilde A f\| = \|g\| = \lim_{n\to\infty} \|Af_n\| \le \lim_{n\to\infty} \|A\|\,\|f_n\| = \|A\|\,\|f\|. \]
Thus $\|\tilde A\| \le \|A\|$. Since $\tilde A f = Af$ for all $f \in G$, we have
\[ \|\tilde A\| = \sup_{f \in E,\ f \ne 0} \frac{\|\tilde A f\|}{\|f\|} \ge \sup_{f \in G,\ f \ne 0} \frac{\|Af\|}{\|f\|} = \|A\|, \]
and consequently $\|\tilde A\| = \|A\|$. Finally, $G$ being dense in $E$, it is clear that
$\tilde A$ is unique. $\square$
$\mathscr{F}$ is an isometry on $\mathscr{S}$ in the $L^2$ norm. By applying the last result
with $E = F = L^2(\mathbb{R})$ and $G = \mathscr{S}$ we have the next theorem.
22.1.4 Theorem The Fourier transform $\mathscr{F}$ and its inverse $\overline{\mathscr{F}}$ extend
uniquely to isometries on $L^2(\mathbb{R})$. Using the same notation for these
extensions, we have the following results for all $f$ and $g$ in $L^2(\mathbb{R})$:
(i) $\mathscr{F}\,\overline{\mathscr{F}} f = \overline{\mathscr{F}}\,\mathscr{F} f = f$ a.e.
(ii) $\int_{\mathbb{R}} f(x)\,\overline{g(x)}\,dx = \int_{\mathbb{R}} \mathscr{F}f(\xi)\,\overline{\mathscr{F}g(\xi)}\,d\xi$.
(iii) $\|f\|_2 = \|\mathscr{F}f\|_2$.
Proof. Having extended $\mathscr{F}$ from $\mathscr{S}$ to $L^2$, the other results follow by
using the density of $\mathscr{S}$ in $L^2(\mathbb{R})$ and taking limits. $\square$
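We now examine some properties of this extension. The first result is
that (17.4) is true in $L^2(\mathbb{R})$.
22.1.5 Proposition If $f$ and $g$ are in $L^2(\mathbb{R})$, then $\mathscr{F}f \cdot g$ and $f \cdot \mathscr{F}g$ are
in $L^1(\mathbb{R})$, and
\[ \int_{\mathbb{R}} \mathscr{F}f(t)\,g(t)\,dt = \int_{\mathbb{R}} f(u)\,\mathscr{F}g(u)\,du. \tag{22.1} \]
Proof. We have just seen that $\mathscr{F}f$ is in $L^2(\mathbb{R})$, so that the product of
$\mathscr{F}f$ and $g$ is in $L^1(\mathbb{R})$. The same is true for $f$ and $\mathscr{F}g$. Let $f_n$ and $g_n$
be sequences in $\mathscr{S}$ that tend to $f$ and $g$ respectively in $L^2(\mathbb{R})$. Since $\mathscr{F}f_n = \hat f_n$
and $\mathscr{F}g_n = \hat g_n$, and since $\mathscr{S} \subset L^1(\mathbb{R})$, it follows from (17.4) that
\[ \int_{\mathbb{R}} \mathscr{F}f_n(t)\,g_n(t)\,dt = \int_{\mathbb{R}} f_n(u)\,\mathscr{F}g_n(u)\,du. \]
Equation (22.1) follows by passing to the limit. $\square$
22.1.6 Proposition
(i) The Fourier transform defined on $L^1(\mathbb{R})$ and the one obtained by extension
to $L^2(\mathbb{R})$ coincide on $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$.
(ii) If $f \in L^2(\mathbb{R})$, then $\mathscr{F}f$ is the limit in $L^2(\mathbb{R})$ of the sequence $g_n$ defined by
\[ g_n(\xi) = \int_{-n}^{n} e^{-2i\pi\xi x} f(x)\,dx. \]
Proof. Denote the Fourier transform on $L^1(\mathbb{R})$ by $\hat f$ and that on $L^2(\mathbb{R})$
by $\mathscr{F}f$, as we have been doing. Take $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ and $\psi \in \mathscr{S}(\mathbb{R})$.
Applying (17.4) and (22.1) we have
\[ \int_{\mathbb{R}} \hat f(\xi)\,\psi(\xi)\,d\xi = \int_{\mathbb{R}} f(x)\,\hat\psi(x)\,dx = \int_{\mathbb{R}} \mathscr{F}f(\xi)\,\psi(\xi)\,d\xi. \]
Thus $\int_{\mathbb{R}} (\hat f - \mathscr{F}f)\psi = 0$ for all $\psi \in \mathscr{S}(\mathbb{R})$. Since $\hat f - \mathscr{F}f$ is in $L^1_{loc}(\mathbb{R})$,
we conclude (Exercise 21.6) that $\hat f = \mathscr{F}f$ a.e.
Let $f_n = f\chi_{[-n,n]}$. By dominated convergence (Theorem 14.1.1), we
know that $\lim_{n\to\infty} \|f_n - f\|_2 = 0$. Since $f_n \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, we have
$g_n = \hat f_n = \mathscr{F}f_n$, and $\lim_{n\to\infty} \|\mathscr{F}f - g_n\|_2 = 0$ by the continuity of $\mathscr{F}$. $\square$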
22.1.7 Remark If $f \in L^2(\mathbb{R})$, then $\overline{\mathscr{F}}f$ is the limit in $L^2(\mathbb{R})$ of the
sequence $h_n$ defined by $h_n(\xi) = \int_{-n}^{n} e^{2i\pi\xi x} f(x)\,dx$.
22.1.8 Remark We will continue to denote the Fourier transform by
$\hat f$ or $\mathscr{F}f$. The meaning of these notations is now clear, depending on
whether $f \in L^1(\mathbb{R})$ or $f \in L^2(\mathbb{R})$.
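22.2 Application to the computation of certain Fourier transforms

When we know that the Fourier transform $\hat f$ of a function $f \in L^1(\mathbb{R})$ is in
$L^1(\mathbb{R})$, we can compute $\hat{\hat f}$ and obtain new transforms. This is the way we
obtained the table in Section 18.2.2 from that in Section 17.3.3. However,
for $f_\varepsilon(x) = e^{-\varepsilon a x}u(\varepsilon x)$ with $\varepsilon = \pm 1$ and $\mathrm{Re}(a) > 0$,
\[ \hat f_\varepsilon(\xi) = \frac{\varepsilon}{\varepsilon a + 2i\pi\xi}, \]
which is not in $L^1(\mathbb{R})$. It is, however, in $L^2(\mathbb{R})$, and in this case we can
compute $\mathscr{F}(\hat f_\varepsilon)$.
22.2.1 Proposition
(i) If $f \in L^2(\mathbb{R})$, then $\mathscr{F}\mathscr{F}f = f_\sigma$ a.e.
(ii) If $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then $\mathscr{F}\hat f = f_\sigma$ a.e.
Proof. To prove (i), we first show that $\mathscr{F}f = \overline{\mathscr{F}}f_\sigma$. Thus take a sequence
$f_n$ in $\mathscr{S}(\mathbb{R})$ such that $\lim_{n\to\infty} \|f - f_n\|_2 = 0$. We have $\mathscr{F}f_n = \overline{\mathscr{F}}(f_n)_\sigma$
by Proposition 18.2.1, and in the limit, $\mathscr{F}f = \overline{\mathscr{F}}f_\sigma$; applying $\mathscr{F}$ to both
sides gives (i) by Theorem 22.1.4(i). If $f$ is also in $L^1(\mathbb{R})$, then $\hat f = \mathscr{F}f$, and this implies (ii). $\square$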
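We are now able to compute the Fourier transform of $\hat f_\varepsilon$.
22.2.2 The completion of Section 18.2.2
(i) $a \in \mathbb{C}$, $\mathrm{Re}(a) > 0$:
\[ \frac{1}{a + 2i\pi x} \ \overset{\mathscr{F}}{\longmapsto}\ e^{a\xi}u(-\xi), \qquad \frac{1}{a - 2i\pi x} \ \overset{\mathscr{F}}{\longmapsto}\ e^{-a\xi}u(\xi). \]
(ii)
\[ \frac{\sin x}{x} \ \overset{\mathscr{F}}{\longmapsto}\ \pi\,\chi_{[-\frac{1}{2\pi}, \frac{1}{2\pi}]}(\xi). \]
With these results and those of Lessons 17 and 18, we can compute the
Fourier transform of any rational function $P(x)/Q(x)$ by decomposing it
into partial fractions.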
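As a numerical illustration of why $L^2(\mathbb{R})$ is the natural setting here, the sketch below takes $f(x) = e^{-ax}u(x)$ with the illustrative value $a = 2$, compares a quadrature of its Fourier integral with $1/(a + 2i\pi\xi)$, and checks that the energies of $f$ and $\hat f$ agree. The truncation bounds, step counts, and function names are assumptions made for the example; the frequency integral is truncated, so the agreement is only approximate.

```python
import cmath
import math

A = 2.0  # decay rate a > 0 (illustrative value)

def f(x):
    """f(x) = exp(-a x) u(x), with u the unit step."""
    return math.exp(-A * x) if x >= 0.0 else 0.0

def f_hat_exact(xi):
    """Closed form of the transform: 1 / (a + 2 i pi xi)."""
    return 1.0 / (A + 2j * math.pi * xi)

def trapezoid(g, a, b, steps):
    """Composite trapezoidal rule (works for real or complex values)."""
    h = (b - a) / steps
    total = 0.5 * (g(a) + g(b))
    for k in range(1, steps):
        total += g(a + k * h)
    return total * h

def f_hat_numeric(xi, upper=20.0, steps=40000):
    """Quadrature of the integral of exp(-2 i pi xi x) f(x) over [0, upper]."""
    return trapezoid(lambda x: cmath.exp(-2j * math.pi * xi * x) * f(x),
                     0.0, upper, steps)

# Energy of f (exactly 1/(2a)) and energy of its transform on [-500, 500].
energy_time = trapezoid(lambda x: f(x) ** 2, 0.0, 20.0, 40000)
energy_freq = trapezoid(lambda xi: abs(f_hat_exact(xi)) ** 2,
                        -500.0, 500.0, 200000)

if __name__ == "__main__":
    print(f_hat_numeric(1.0), f_hat_exact(1.0))
    print(energy_time, energy_freq)
```

The transform decays only like $1/|\xi|$, so it is not integrable, yet its square is: the truncated frequency energy approaches $1/(2a)$ slowly, with a tail of order $1/L$ when the integral is cut at $|\xi| = L$.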
22.3 The uncertainty principle

The purpose of this section is to develop the relation that exists between
the localization of a signal and the localization of its spectrum.
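Given a function $f : \mathbb{R} \to \mathbb{C}$ such that $f$, $xf$, and $\xi\hat f$ are in $L^2(\mathbb{R})$, we
introduce the following definitions and notation:
\[ \sigma_f^2 = \int_{\mathbb{R}} x^2 |f(x)|^2\,dx \quad \text{(energy dispersion of $f$ in time)}, \]
\[ \sigma_{\hat f}^2 = \int_{\mathbb{R}} \xi^2 |\hat f(\xi)|^2\,d\xi \quad \text{(energy dispersion in frequency)}, \]
\[ E_f = \int_{\mathbb{R}} |f(x)|^2\,dx \quad \text{(energy of $f$)}. \]
The value $\Delta t$, defined by
\[ \Delta t^2 = \frac{\sigma_f^2}{E_f}, \]
is called the effective duration of the signal $f$; $\Delta\lambda$, defined by
\[ \Delta\lambda^2 = \frac{\sigma_{\hat f}^2}{E_f}, \]
is called the effective bandwidth. The uncertainty principle is a relation
between $\Delta t$ and $\Delta\lambda$ that says that one cannot arbitrarily localize a signal
in both time and frequency. This relation is
\[ \Delta t \cdot \Delta\lambda \ge \frac{1}{4\pi}, \tag{22.2} \]
which is the content of the next result.
22.3.1 Proposition Let $f : \mathbb{R} \to \mathbb{C}$ be a function in $C^1(\mathbb{R})$ such that
$f$, $f'$ and $xf$ are in $L^2(\mathbb{R})$. Then
\[ \sigma_f\,\sigma_{\hat f} \ge \frac{E_f}{4\pi}. \tag{22.3} \]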
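Proof. We assume the following two results (see Exercises 22.6 and 22.7):
(i) $\lim_{|x|\to\infty} x|f(x)|^2 = 0$.
(ii) $\widehat{f'}(\xi) = 2i\pi\xi\,\hat f(\xi)$.
The second formula will be proved in the more general context of tempered
distributions (Proposition 31.2.4). Also, note that $f$ being differentiable
almost everywhere does not imply (ii) (take $f(x) = \chi_{[-1,1]}(x)$).
Using (ii) and Theorem 22.1.4(iii) we see that
\[ \sigma_{\hat f}^2 = \frac{1}{4\pi^2} \int_{\mathbb{R}} |\widehat{f'}(\xi)|^2\,d\xi = \frac{1}{4\pi^2} \int_{\mathbb{R}} |f'(x)|^2\,dx. \]
On the other hand, $(f\bar f)' = f'\bar f + f\bar f'$, and by the Schwarz inequality
\[ \left| \int_{\mathbb{R}} x\,(f(x)\bar f(x))'\,dx \right| \le \int_{\mathbb{R}} |x f'(x)\bar f(x)|\,dx + \int_{\mathbb{R}} |x f(x)\bar f'(x)|\,dx \]
\[ \le \left( \int_{\mathbb{R}} |x\bar f(x)|^2\,dx \right)^{1/2} \left( \int_{\mathbb{R}} |f'(x)|^2\,dx \right)^{1/2} + \left( \int_{\mathbb{R}} |x f(x)|^2\,dx \right)^{1/2} \left( \int_{\mathbb{R}} |\bar f'(x)|^2\,dx \right)^{1/2} \]
\[ = 2 \left( \int_{\mathbb{R}} |f'(x)|^2\,dx \right)^{1/2} \left( \int_{\mathbb{R}} |x f(x)|^2\,dx \right)^{1/2} = 4\pi\,\sigma_f\,\sigma_{\hat f}. \]
But, integrating by parts,
\[ \int_{\mathbb{R}} x\,(|f(x)|^2)'\,dx = \Bigl[ x|f(x)|^2 \Bigr]_{-\infty}^{+\infty} - \int_{\mathbb{R}} |f(x)|^2\,dx = -E_f, \]
since $\lim_{|x|\to\infty} x|f(x)|^2 = 0$. Thus $\sigma_f\,\sigma_{\hat f} \ge E_f/(4\pi)$. $\square$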
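The next proposition shows that a Gaussian signal has the minimum
effective duration for a given effective bandwidth.
22.3.2 Proposition Let the effective bandwidth $\Delta\lambda$ be fixed. Then
the signal
\[ f(t) = a\,e^{-4\pi^2 \Delta\lambda^2 t^2} \]
minimizes the effective duration.
Proof. In the proof of Proposition 22.3.1 we used the Schwarz inequality
to obtain $E_f \le 4\pi\,\sigma_f\,\sigma_{\hat f}$. One has equality in the real case when $tf$ and $f'$
are proportional. This implies that $f$ is of the form $f(t) = ae^{-ct^2}$, where
$c > 0$ because $f \in L^2(\mathbb{R})$. We know that $\hat f(\lambda) = a\sqrt{\pi/c}\;e^{-\pi^2\lambda^2/c}$ (Section
17.3.4), so that
\[ \Delta\lambda^2 = \frac{\int_{\mathbb{R}} \lambda^2 e^{-2\pi^2\lambda^2/c}\,d\lambda}{\int_{\mathbb{R}} e^{-2\pi^2\lambda^2/c}\,d\lambda} = \frac{c}{4\pi^2}, \]
that is, $c = 4\pi^2\Delta\lambda^2$. $\square$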
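The bound $\Delta t \cdot \Delta\lambda \ge 1/(4\pi)$ and its Gaussian equality case can be illustrated numerically. The sketch below relies on the identity $\sigma_{\hat f}^2 = \frac{1}{4\pi^2}\int |f'|^2$ used in the proof of Proposition 22.3.1, so no Fourier transform has to be computed; the second test signal $1/\cosh t$, the truncation interval, and the function names are illustrative assumptions.

```python
import math

def integrate(g, a, b, steps):
    """Composite trapezoidal rule for g on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (g(a) + g(b))
    for k in range(1, steps):
        total += g(a + k * h)
    return total * h

def duration_bandwidth(f, fprime, half_width=12.0, steps=24000):
    """Return (effective duration, effective bandwidth) of a real signal f.

    The frequency dispersion is obtained from f' through
    sigma_fhat^2 = (1 / (4 pi^2)) * integral of |f'|^2.
    """
    a, b = -half_width, half_width
    energy = integrate(lambda t: f(t) ** 2, a, b, steps)
    sigma_t2 = integrate(lambda t: (t * f(t)) ** 2, a, b, steps)
    sigma_l2 = integrate(lambda t: fprime(t) ** 2, a, b, steps) / (4 * math.pi ** 2)
    return math.sqrt(sigma_t2 / energy), math.sqrt(sigma_l2 / energy)

c = 3.0  # Gaussian parameter (illustrative value)
gauss = duration_bandwidth(lambda t: math.exp(-c * t * t),
                           lambda t: -2 * c * t * math.exp(-c * t * t))
sech = duration_bandwidth(lambda t: 1.0 / math.cosh(t),
                          lambda t: -math.tanh(t) / math.cosh(t))

BOUND = 1.0 / (4.0 * math.pi)

if __name__ == "__main__":
    print("bound   ", BOUND)
    print("gaussian", gauss[0] * gauss[1])
    print("sech    ", sech[0] * sech[1])
```

The Gaussian product sits on the bound, while the $1/\cosh$ signal, although also smooth and rapidly decreasing, gives a strictly larger product.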
22.4 Exercises
*Exercise 22.1
(a) Let $a$ and $b$ be two real numbers with $a < b$. Compute the Fourier transform
of
\[ f(\xi) = \frac{\sin \pi(b-a)\xi}{\pi\xi}\,e^{-i\pi(a+b)\xi}. \]
Compute $\mathscr{F}f$ when $a = -b = -\frac{1}{2\pi}$.
(b) Compute $\mathscr{F}f$ using Proposition 22.1.6(ii).
Exercise 22.2 Let $f(x) = \dfrac{1}{a^2 + x^2}$ with $a \in \mathbb{R}$, $a \ne 0$. Compute the Fourier
transform of $f$ two different ways: by direct computation (see Section 18.2.2) and
by decomposing $f$ into partial fractions and using the Fourier transform on $L^2(\mathbb{R})$
(see Section 22.2.2).
Exercise 22.3 Let $f(x) = \dfrac{x}{a^2 + x^2}$ with $a \in \mathbb{R}$, $a \ne 0$. Show that $f \notin L^1(\mathbb{R})$.
Compute the Fourier transform of $f$.
Exercise 22.4 In Exercise 18.5 on Shannon's formula, show that the assumptions
on $f$ imply that $f \in L^2(\mathbb{R})$ and that $\sum_{n=-\infty}^{+\infty} |f(na)|^2 < +\infty$.
Exercise 22.5 Evaluate
\[ \int_{\mathbb{R}} \frac{\sin^2 x}{x^2}\,dx \quad \text{and} \quad \int_{\mathbb{R}} \frac{dx}{(1 + x^2)^2}. \]
Hint: Use Theorem 22.1.4(iii). The results are $\pi$ and $\frac{\pi}{2}$.
Exercise 22.6 Take $f \in C^1(\mathbb{R}) \cap L^2(\mathbb{R})$ and suppose in addition that $f'$ and
$xf$ are in $L^2(\mathbb{R})$.
(a) Show that $x|f|^2$ and $(x|f|^2)'$ are in $L^1(\mathbb{R})$.
(b) Use this (and [Bre83] p. 130) to prove that $\lim_{|x|\to+\infty} x|f(x)|^2 = 0$.
**Exercise 22.7 Assume that $f \in C^1(\mathbb{R}) \cap L^2(\mathbb{R})$ with $f' \in L^2(\mathbb{R})$. We wish
to establish the formula
\[ \mathscr{F}f'(\xi) = 2i\pi\xi\,\mathscr{F}f(\xi) \quad \text{a.e.} \]
(a) Let $h_n(\xi) = \int_{-n}^{n} e^{-2i\pi\xi x} f'(x)\,dx$. Show that $h_n$ converges to $\mathscr{F}f'$ in $L^2(\mathbb{R})$.
From this deduce the existence of a strictly increasing sequence $(n_k)_{k \in \mathbb{N}}$
such that $h_{n_k}$ converges a.e. to $\mathscr{F}f'$ (use [Bre83] p. 58).
(b) By integrating by parts and using [Bre83] p. 130, show that there is a
strictly increasing sequence $(k_j)$ such that $h_{n_{k_j}}$ converges a.e. to $2i\pi\xi\,\mathscr{F}f$.
Lesson 23
Convolution and the Fourier Transform
The Fourier transform has the remarkable property that it interchanges
convolution and multiplication. Formally, we have these relations:
\[ \widehat{f * g}(\xi) = \hat f(\xi)\,\hat g(\xi), \]
\[ \widehat{fg}(\xi) = \hat f * \hat g(\xi). \]
We will establish conditions under which these formulas are valid.
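23.1 Convolution and the Fourier transform in $L^1(\mathbb{R})$

First we are going to complete a result about the Fourier transform. We
saw (Theorem 18.1.1) that $\overline{\mathscr{F}}\hat f(t) = f(t)$ at every point $t$ where $f$ is
continuous when $f$ and $\hat f$ are in $L^1(\mathbb{R})$. In particular, if $f \in \mathscr{S}(\mathbb{R})$, then
$\overline{\mathscr{F}}\hat f(t) = f(t)$ for all $t \in \mathbb{R}$. We use the density of $\mathscr{S}(\mathbb{R})$ in $L^1(\mathbb{R})$ to
prove the next result.
23.1.1 Proposition If $f$ and $\hat f$ are in $L^1(\mathbb{R})$, then $\overline{\mathscr{F}}\hat f = f$ a.e.
Proof. Since $\mathscr{S}(\mathbb{R})$ is dense in $L^1(\mathbb{R})$, there exists a sequence $f_n$ in $\mathscr{S}(\mathbb{R})$
such that $\lim_{n\to\infty} \|f - f_n\|_1 = 0$. As we have noted, $\overline{\mathscr{F}}\hat f_n(t) = f_n(t)$ for all
$n \in \mathbb{N}$ and all $t \in \mathbb{R}$. We are going to show that $\int_{\mathbb{R}} (f(t) - \overline{\mathscr{F}}\hat f(t))\varphi(t)\,dt = 0$
for all $\varphi \in \mathscr{S}(\mathbb{R})$, and this implies (by Exercise 21.6 or by (27.6)) that
$\overline{\mathscr{F}}\hat f = f$ a.e.
Since $f_n$ is in $\mathscr{S}(\mathbb{R}) \subset L^1(\mathbb{R})$, we have by (17.4)
\[ \int_{\mathbb{R}} f_n(t)\varphi(t)\,dt = \int_{\mathbb{R}} \overline{\mathscr{F}}\hat f_n(t)\varphi(t)\,dt = \int_{\mathbb{R}} \hat f_n(u)\,\overline{\mathscr{F}}\varphi(u)\,du. \]
It is clear that $\lim_{n\to\infty} \int_{\mathbb{R}} f_n(t)\varphi(t)\,dt = \int_{\mathbb{R}} f(t)\varphi(t)\,dt$, and with (17.3) we
have $\lim_{n\to\infty} \|\hat f_n - \hat f\|_\infty = 0$. Thus
\[ \lim_{n\to\infty} \int_{\mathbb{R}} \hat f_n(u)\,\overline{\mathscr{F}}\varphi(u)\,du = \int_{\mathbb{R}} \hat f(u)\,\overline{\mathscr{F}}\varphi(u)\,du. \]
Now, $\hat f \in L^1(\mathbb{R})$ and $\overline{\mathscr{F}}\varphi \in \mathscr{S}(\mathbb{R})$, so by (17.4),
\[ \int_{\mathbb{R}} \hat f(u)\,\overline{\mathscr{F}}\varphi(u)\,du = \int_{\mathbb{R}} \overline{\mathscr{F}}\hat f(t)\,\varphi(t)\,dt. \]
Finally, for all $\varphi \in \mathscr{S}(\mathbb{R})$,
\[ \int_{\mathbb{R}} f(t)\varphi(t)\,dt = \int_{\mathbb{R}} \overline{\mathscr{F}}\hat f(t)\,\varphi(t)\,dt, \]
and this proves the result. $\square$
This result implies that if $f$ and $\hat f$ are in $L^1(\mathbb{R})$, then $f$ is continuous, or
more precisely, the equivalence class to which $f$ belongs contains a continuous
representative, namely, $\overline{\mathscr{F}}\hat f$.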
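23.1.2 Proposition Given $f$ and $g$ in $L^1(\mathbb{R})$, we have
(i) $\widehat{f * g}(\xi) = \hat f(\xi)\,\hat g(\xi)$ for all $\xi \in \mathbb{R}$.
(ii) If in addition $\hat f$ and $\hat g$ are in $L^1(\mathbb{R})$, then
$\widehat{fg}(\xi) = \hat f * \hat g(\xi)$ for all $\xi \in \mathbb{R}$.
Proof.
(i) $f * g$ is in $L^1(\mathbb{R})$ by 20.2.1. The computation of $\widehat{f * g}(\xi)$ is a direct
application of Fubini's theorem:
\[ \int_{\mathbb{R}} e^{-2i\pi\xi x}\,f * g(x)\,dx = \int_{\mathbb{R}} e^{-2i\pi\xi x} \left( \int_{\mathbb{R}} f(x-t)g(t)\,dt \right) dx \]
\[ = \int_{\mathbb{R}} g(t) \left( \int_{\mathbb{R}} e^{-2i\pi\xi x} f(x-t)\,dx \right) dt = \int_{\mathbb{R}} g(t)\,e^{-2i\pi\xi t}\,\hat f(\xi)\,dt = \hat g(\xi)\,\hat f(\xi). \]
(ii) Note that (i) is true for $\overline{\mathscr{F}}$ by changing $i$ to $-i$. Since $\hat f$ and $\hat g$ are
in $L^1(\mathbb{R})$, we can apply (i) and Proposition 23.1.1:
\[ \overline{\mathscr{F}}(\hat f * \hat g)(x) = \overline{\mathscr{F}}\hat f(x)\;\overline{\mathscr{F}}\hat g(x) = f(x)g(x) \quad \text{a.e.} \]
Note that $fg$ is in $L^1(\mathbb{R})$ because both $f$ and $g$ are in $L^1(\mathbb{R}) \cap C(\mathbb{R})$ and are
bounded, being inverse Fourier transforms of integrable functions.
Taking the Fourier transform of both sides of the last equation shows that
$\hat f * \hat g(\xi) = \widehat{fg}(\xi)$ for all $\xi \in \mathbb{R}$. $\square$
This result is particularly important for functions in $\mathscr{S}(\mathbb{R})$.
23.1.3 Proposition If $f$ and $g$ are in $\mathscr{S}(\mathbb{R})$, then
(i) $\widehat{f * g} = \hat f \cdot \hat g$;
(ii) $\widehat{fg} = \hat f * \hat g$.
Proof. Proposition 23.1.2 applies directly because $\mathscr{S}$ is invariant under
the Fourier transform. $\square$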
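The exchange of convolution and multiplication can be checked on a pair of integrable functions whose transforms are known in closed form. The following sketch assumes $f(x) = e^{-x}u(x)$ and $g(x) = e^{-2x}u(x)$, for which $f * g(x) = (e^{-x} - e^{-2x})u(x)$; the quadrature parameters and function names are illustrative choices.

```python
import cmath
import math

def f(x):
    return math.exp(-x) if x >= 0.0 else 0.0

def g(x):
    return math.exp(-2.0 * x) if x >= 0.0 else 0.0

def trapezoid(func, a, b, steps):
    """Composite trapezoidal rule (real or complex values)."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for k in range(1, steps):
        total += func(a + k * h)
    return total * h

def conv(x, steps=2000):
    """(f * g)(x) by quadrature; the integrand vanishes outside [0, x]."""
    if x <= 0.0:
        return 0.0
    return trapezoid(lambda t: f(x - t) * g(t), 0.0, x, steps)

def conv_exact(x):
    """Closed form of f * g for this particular pair."""
    return math.exp(-x) - math.exp(-2.0 * x) if x >= 0.0 else 0.0

def fourier(h, xi, upper=40.0, steps=40000):
    """Fourier integral of a function supported in [0, +inf), truncated at `upper`."""
    return trapezoid(lambda x: cmath.exp(-2j * math.pi * xi * x) * h(x),
                     0.0, upper, steps)

def f_hat(xi):
    return 1.0 / (1.0 + 2j * math.pi * xi)

def g_hat(xi):
    return 1.0 / (2.0 + 2j * math.pi * xi)

if __name__ == "__main__":
    for xi in (0.0, 0.5, 2.0):
        print(xi, fourier(conv_exact, xi), f_hat(xi) * g_hat(xi))
```

Here the transform of the convolution is computed by direct quadrature and compared with the pointwise product $\hat f(\xi)\hat g(\xi)$, which is statement (i) of Proposition 23.1.2 for this pair.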
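23.2 Convolution and the Fourier transform in $L^2(\mathbb{R})$

We extended the Fourier transform from $\mathscr{S}$ to $L^2(\mathbb{R})$ in Lesson 22. The
convolution is a continuous operator from $L^2(\mathbb{R}) \times L^2(\mathbb{R})$ to $L^\infty \cap C^0$
(Proposition 20.3.1).
23.2.1 Proposition Given $f$ and $g$ in $L^2(\mathbb{R})$, we have
(i) $f * g(t) = \overline{\mathscr{F}}(\hat f \cdot \hat g)(t)$ for all $t$ in $\mathbb{R}$;
(ii) $\widehat{fg}(t) = \hat f * \hat g(t)$ for all $t$ in $\mathbb{R}$.
Proof.
(i) We establish the result using the density of $\mathscr{S}$ in $L^2(\mathbb{R})$ and applying
Proposition 23.1.2(i). Thus let $f_n$ and $g_n$ be two sequences in $\mathscr{S}$ such that
\[ \lim_{n\to\infty} \|f - f_n\|_2 = 0 \quad \text{and} \quad \lim_{n\to\infty} \|g - g_n\|_2 = 0. \]
We see that $f_n * g_n = \overline{\mathscr{F}}(\hat f_n \cdot \hat g_n)$ by taking the inverse Fourier transform
of both sides of Proposition 23.1.2(i). On the other hand,
\[ \|\hat f \cdot \hat g - \hat f_n \cdot \hat g_n\|_1 \le \|\hat f - \hat f_n\|_2 \|\hat g\|_2 + \|\hat f_n\|_2 \|\hat g - \hat g_n\|_2 = \|f - f_n\|_2 \|g\|_2 + \|f_n\|_2 \|g - g_n\|_2 \]
by Theorem 22.1.4(iii), and hence $\lim_{n\to\infty} \|\hat f \cdot \hat g - \hat f_n \cdot \hat g_n\|_1 = 0$. By applying
the Riemann-Lebesgue theorem (Theorem 17.1.3) to the inverse Fourier
transform, we see that $\overline{\mathscr{F}}(\hat f_n \cdot \hat g_n)$ tends to $\overline{\mathscr{F}}(\hat f \cdot \hat g)$ uniformly on $\mathbb{R}$.
The last step is to determine the limit of $f_n * g_n$. From (20.2) we have
\[ \|f * g - f_n * g_n\|_\infty \le \|f - f_n\|_2 \|g\|_2 + \|f_n\|_2 \|g - g_n\|_2; \]
thus $f_n * g_n$ converges uniformly to $f * g$, which is continuous. We conclude
that
\[ f * g(t) = \overline{\mathscr{F}}(\hat f \cdot \hat g)(t) \]
for all $t$ in $\mathbb{R}$.
(ii) The proof is similar to that of (i) and is left as an exercise. $\square$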
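23.2.2 Remark With reference to the last proposition, note that the
formula $\widehat{f * g} = \hat f \cdot \hat g$ does not make sense a priori, since $f * g$ is only in
$L^\infty(\mathbb{R})$. This formula is true whenever $f * g$ is in $L^1(\mathbb{R})$.
When $f \in L^2(\mathbb{R})$ and $g \in L^1(\mathbb{R})$, the convolution and the Fourier transform
are well defined, and we have the next result.
23.2.3 Proposition If $f \in L^2(\mathbb{R})$ and $g \in L^1(\mathbb{R})$, then $\hat f \cdot \hat g$ is in $L^2(\mathbb{R})$
and $f * g = \overline{\mathscr{F}}(\hat f \cdot \hat g)$, with equality in $L^2(\mathbb{R})$.
Proof. We proceed as in Proposition 23.2.1. Take two sequences $f_n$ and
$g_n$ in $\mathscr{S}$ such that $\lim_{n\to\infty} \|f - f_n\|_2 = 0$ and $\lim_{n\to\infty} \|g - g_n\|_1 = 0$. We
know that $\overline{\mathscr{F}}(\hat f_n \cdot \hat g_n) = f_n * g_n$; we first study the convergence of $\hat f_n \cdot \hat g_n$.
Since $\hat f$ is in $L^2(\mathbb{R})$ and $\hat g$ is in $L^\infty(\mathbb{R})$, $\hat f \cdot \hat g$ is in $L^2(\mathbb{R})$ and
\[ \|\hat f \cdot \hat g - \hat f_n \cdot \hat g_n\|_2 \le \|\hat f - \hat f_n\|_2 \|\hat g\|_\infty + \|\hat f_n\|_2 \|\hat g_n - \hat g\|_\infty = \|f - f_n\|_2 \|\hat g\|_\infty + \|f_n\|_2 \|\hat g_n - \hat g\|_\infty. \]
Since $\lim_{n\to\infty} \|g - g_n\|_1 = 0$ implies $\lim_{n\to\infty} \|\hat g - \hat g_n\|_\infty = 0$, $\hat f_n \cdot \hat g_n$ converges
to $\hat f \cdot \hat g$ in $L^2(\mathbb{R})$. Consequently, $\overline{\mathscr{F}}(\hat f_n \cdot \hat g_n)$ tends to $\overline{\mathscr{F}}(\hat f \cdot \hat g)$ in $L^2(\mathbb{R})$.
Finally, we must examine the convergence of $f_n * g_n$. The convolution is
continuous from $L^2(\mathbb{R}) \times L^1(\mathbb{R})$ to $L^2(\mathbb{R})$ (Proposition 20.3.2). Hence $f_n * g_n$
converges to $f * g$ in $L^2(\mathbb{R})$, and we have $f * g = \overline{\mathscr{F}}(\hat f \cdot \hat g)$ in $L^2(\mathbb{R})$. $\square$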
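23.3 Convolution and the Fourier transform: Summary

23.3.1 The Fourier transform on $L^1(\mathbb{R})$
The Fourier transform of a function in $L^1(\mathbb{R})$ is denoted by $\hat f$ or $\mathscr{F}f$.
Riemann-Lebesgue theorem:
\[ \mathscr{F} : L^1(\mathbb{R}) \to L^\infty(\mathbb{R}) \cap C^0(\mathbb{R}), \qquad \lim_{|x|\to\infty} \mathscr{F}f(x) = 0. \]
Exchange formula: $\int_{\mathbb{R}} f(t)\,\hat g(t)\,dt = \int_{\mathbb{R}} \hat f(u)\,g(u)\,du$.
Derivation formulas: $\hat f^{(k)} = [(-2i\pi x)^k f]^{\wedge}$; $\widehat{f^{(k)}} = (2i\pi\xi)^k \hat f$.
Translation formulas: $\tau_a \hat f = [e^{2i\pi a x} f]^{\wedge}$; $\widehat{\tau_a f} = e^{-2i\pi\xi a} \hat f$.
Properties:
$f$ even $\Longrightarrow$ $\hat f$ even,
$f$ odd $\Longrightarrow$ $\hat f$ odd,
$f$ real, even $\Longrightarrow$ $\hat f$ real, even,
$f$ real, odd $\Longrightarrow$ $\hat f$ imaginary, odd.
Fourier transforms:
(i) $a \in \mathbb{C}$, $\mathrm{Re}(a) > 0$, $\varepsilon = \pm 1$, $k = 0, 1, 2, \dots$;
\[ \frac{x^k}{k!}\,e^{-\varepsilon a x}u(\varepsilon x) \ \overset{\mathscr{F}}{\longmapsto}\ \frac{\varepsilon}{(\varepsilon a + 2i\pi\xi)^{k+1}}, \qquad e^{-a|x|} \ \overset{\mathscr{F}}{\longmapsto}\ \frac{2a}{a^2 + 4\pi^2\xi^2}, \]
(ii) $a \in \mathbb{R}$, $a > 0$;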
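23.3.2 The inverse Fourier transform

The operator $\overline{\mathscr{F}}$ is the conjugate of $\mathscr{F}$, and $\mathscr{F}^{-1} = \overline{\mathscr{F}}$. If $f$ and $\hat f$
are in $L^1(\mathbb{R})$, then
\[ \overline{\mathscr{F}}\hat f(t) = f(t) \ \text{a.e.}, \qquad \mathscr{F}\hat f = f_\sigma. \]
This last formula leads one to find new Fourier transforms (see the table
in Section 18.2.2).
The space $\mathscr{S}(\mathbb{R})$ of functions in $C^\infty$ that decay rapidly is dense in $L^p(\mathbb{R})$
for $1 \le p < +\infty$.
$\mathscr{F}$ is linear, 1-to-1 onto, and bicontinuous from $\mathscr{S}(\mathbb{R})$ to $\mathscr{S}(\mathbb{R})$.
23.3.3 The Fourier transform on $L^2(\mathbb{R})$

$\mathscr{F}$ is an isometry from $L^2(\mathbb{R})$ onto $L^2(\mathbb{R})$: $\|f\|_2 = \|\mathscr{F}f\|_2$. In particular,
$\mathscr{F}$ preserves the scalar product in $L^2(\mathbb{R})$:
\[ \int_{\mathbb{R}} f(x)\,\overline{g(x)}\,dx = \int_{\mathbb{R}} \hat f(\xi)\,\overline{\hat g(\xi)}\,d\xi. \]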
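23.3.4 Convolution
f                g                 f * g                  Continuity
$L^1$            $L^1$             $L^1$                  $\|f * g\|_1 \le \|f\|_1 \|g\|_1$
$L^1$            $L^\infty$        $L^\infty \cap C^0$    $\|f * g\|_\infty \le \|f\|_1 \|g\|_\infty$
$L^2$            $L^2$             $L^\infty \cap C^0$    $\|f * g\|_\infty \le \|f\|_2 \|g\|_2$
$L^2$            $L^1$             $L^2$                  $\|f * g\|_2 \le \|f\|_2 \|g\|_1$
$\mathscr{S}$    $\mathscr{S}$     $\mathscr{S}$
Regularization: If $f \in L^p(\mathbb{R})$, $1 \le p < +\infty$, then
\[ \lim_{n\to\infty} \|\rho_n * f - f\|_p = 0. \]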
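23.3.5 Convolution and the Fourier transform
$f \in L^1(\mathbb{R})$, $g \in L^1(\mathbb{R})$ $\Longrightarrow$ $\widehat{f * g}(\xi) = \hat f(\xi)\,\hat g(\xi)$ for all $\xi \in \mathbb{R}$.
$f, \hat f \in L^1(\mathbb{R})$, $g, \hat g \in L^1(\mathbb{R})$ $\Longrightarrow$ $\widehat{fg}(\xi) = \hat f * \hat g(\xi)$ for all $\xi \in \mathbb{R}$.
$f \in L^2(\mathbb{R})$, $g \in L^2(\mathbb{R})$ $\Longrightarrow$ $f * g(t) = \overline{\mathscr{F}}(\hat f \cdot \hat g)(t)$ and $\widehat{fg}(t) = \hat f * \hat g(t)$ for all $t \in \mathbb{R}$.
$f \in L^2(\mathbb{R})$, $g \in L^1(\mathbb{R})$ $\Longrightarrow$ $f * g(t) = \overline{\mathscr{F}}(\hat f \cdot \hat g)(t)$ for a.e. $t \in \mathbb{R}$.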
23.4 Exercises
Exercise 23.1 Compute N for f = X[o,IJ
Exercise 23.2 Compute $f * f$ when $f(t) = \dfrac{\sin 2\pi\lambda t}{\pi t}$ with $\lambda > 0$.
Exercise 23.3 Compute the Fourier transform of
\[ g(x) = \frac{\sin^2 x}{x^2} \]
and deduce that
\[ \int_{\mathbb{R}} \frac{\sin^2 x}{x^2}\,dx = \pi. \]
Exercise 23.4 Let $g(x) = e^{-\pi x^2}$. Compute $g * g$.
Exercise 23.5 Let $f_a(x) = \dfrac{2a}{a^2 + 4\pi^2 x^2}$ with $a \in \mathbb{C}$ and $\mathrm{Re}(a) > 0$. If $b \in \mathbb{C}$
and $\mathrm{Re}(b) > 0$, compute $f_a * f_b$.
Hint: Use the Fourier transform to show that $f_a * f_b = f_{a+b}$.
Exercise 23.6 If $f$ and $g$ are in $L^2(\mathbb{R})$, show that $f * g$ is in $C^0(\mathbb{R})$ and that
$\lim_{|x|\to\infty} f * g(x) = 0$.
Chapter VII
Analog Filters
Lesson 24
Applications to Analog Filters Governed by a Differential Equation
The tools we have just developed (convolution and the Fourier transform
for functions) are going to be used to study analog filters that are governed
by a linear differential equation with constant coefficients,
\[ \sum_{k=0}^{q} b_k g^{(k)} = \sum_{j=0}^{p} a_j f^{(j)}, \qquad a_p b_q \ne 0, \tag{24.1} \]
where $f$ is the input and $g = A(f)$ is the output. Other conditions must
be given to eliminate ambiguity among the possible solutions of (24.1).
24.1 The case where the input and output are in $\mathscr{S}$

This case is very special. The input has no reason to be so regular, but we
will see that this is a step toward more general cases.
We assume that $f \in \mathscr{S}$ and look for a solution $g$ in $\mathscr{S}$. If such a $g$
exists, we can take the Fourier transform of both sides of (24.1). Thus
\[ \sum_{k=0}^{q} b_k (2i\pi\lambda)^k\,\hat g(\lambda) = \sum_{j=0}^{p} a_j (2i\pi\lambda)^j\,\hat f(\lambda). \tag{24.2} \]
Consider the two polynomials
\[ P(x) = \sum_{j=0}^{p} a_j x^j \quad \text{and} \quad Q(x) = \sum_{k=0}^{q} b_k x^k \]
and assume that the rational function $P(x)/Q(x)$ has no poles on the imaginary
axis. Then $P(2i\pi\lambda)/Q(2i\pi\lambda)$ has no poles for real $\lambda$, and (24.2) is
equivalent to
\[ \hat g(\lambda) = \frac{P(2i\pi\lambda)}{Q(2i\pi\lambda)}\,\hat f(\lambda). \tag{24.3} \]
This equality completely determines $g$ in $\mathscr{S}$, if it exists, and thus proves
the uniqueness of a solution of (24.1) in $\mathscr{S}$. The existence of a solution
also follows from (24.3), since the function
\[ G(\lambda) = \frac{P(2i\pi\lambda)}{Q(2i\pi\lambda)}\,\hat f(\lambda) \]
is in $\mathscr{S}$ whenever $f$ is in $\mathscr{S}$. By applying Theorem 19.3.1, we see that
$g = \mathscr{F}^{-1}(G)$ is a solution of (24.1) in $\mathscr{S}$.
24.1.1 Proposition If $P(x)/Q(x)$ has no poles on the imaginary axis
and if $f$ is in $\mathscr{S}$, then (24.1) has a unique solution $g \in \mathscr{S}$. In this case,
the system
\[ A : \mathscr{S} \to \mathscr{S}, \qquad f \mapsto g \]
is a filter.
Proof. We have proved the first part of the result and thus need only
to show that $A$ is a filter on $\mathscr{S}$. The linearity and invariance present
no difficulty. To prove continuity in the topology of $\mathscr{S}$, suppose that a
sequence $f_n$ tends to 0 in $\mathscr{S}$. Then $\hat f_n$ tends to 0 in $\mathscr{S}$, as does $\hat g_n$ given
by (24.3). Thus $g_n$ tends to 0 by Theorem 19.3.1. $\square$
The differential equation (24.1) has a unique solution without initial
conditions being specified. This is because we require the solution $g$ to be
in $\mathscr{S}$, which means that $g$ and all of its derivatives vanish at infinity.
We assume in what follows that $P/Q$ has no poles on the imaginary axis.
Also, note that $P \not\equiv 0$, since we assume that $a_p \ne 0$.
24.1.2 The output expressed as a convolution ($p < q$)

If we assume that $\deg P < \deg Q$, then the transfer function
\[ H(\lambda) = \frac{P(2i\pi\lambda)}{Q(2i\pi\lambda)} \tag{24.4} \]
is in $L^2(\mathbb{R}) \cap L^\infty(\mathbb{R})$. By decomposing this rational function into partial
fractions, we see from Sections 18.2.2 and 22.2.2 that it has an inverse
Fourier transform $h = \mathscr{F}^{-1}H$ that is bounded, rapidly decreasing, and
continuous except perhaps at the origin. With this $h$, (24.3) reads
\[ \hat g(\lambda) = \hat h(\lambda)\,\hat f(\lambda), \]
which by Proposition 23.2.1(i) implies that
\[ g = h * f. \tag{24.5} \]
This is the same kind of formula that we obtained in Section 2.4 for
the RC filter. The response is the convolution of the input with a fixed
function $h$ that is called the impulse response. Note that if $\deg P \ge \deg Q$,
the computations we have just made no longer make sense.
24.2 Generalized solutions of the differential equation

The formula $g = h * f$, obtained when $f$ is in $\mathscr{S}$, makes sense in the
following more general cases.
24.2.1 If $f$ is in $L^1(\mathbb{R})$, then $g$ is in $L^1(\mathbb{R}) \cap L^2(\mathbb{R}) \cap L^\infty(\mathbb{R})$ (Propositions
20.2.1, 20.3.2, and 20.3.1) and
\[ \|g\|_1 \le \|h\|_1 \|f\|_1, \qquad \|g\|_2 \le \|h\|_2 \|f\|_1, \qquad \|g\|_\infty \le \|h\|_\infty \|f\|_1. \tag{24.6} \]
24.2.2 If $f$ is in $L^2(\mathbb{R})$, then $g$ is in $L^2(\mathbb{R})$, it is bounded and continuous
(Proposition 20.3.1), it tends to 0 at infinity (Proposition 23.2.1(i)), and
\[ \|g\|_2 \le \|h\|_1 \|f\|_2, \qquad \|g\|_\infty \le \|h\|_2 \|f\|_2. \tag{24.7} \]
24.2.3 If $f$ is in $L^\infty(\mathbb{R})$, then $g$ is also bounded and (Proposition 20.3.1)
\[ \|g\|_\infty \le \|h\|_1 \|f\|_\infty. \tag{24.8} \]
The system $A$ defined in Proposition 24.1.1 is continuous from $L^\infty(\mathbb{R})$
to $L^\infty(\mathbb{R})$, and thus it is a filter. Similarly, (24.6) and (24.7) show that $A$
is continuous from $L^1(\mathbb{R})$ to $L^p(\mathbb{R})$ ($p = 1, 2, \infty$), and from $L^2(\mathbb{R})$ to $L^q(\mathbb{R})$
($q = 2, \infty$).
24.2.4 Definition The response of a filter to the unit step function is
called the step response of the filter. This response, $h_1$, is well-defined as a
generalized solution of (24.1). It is bounded by (24.8) and is given by
\[ h_1(t) = h * u(t) = \int_{-\infty}^{t} h(s)\,ds. \tag{24.9} \]
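24.3 The impulse response when $\deg P < \deg Q$

The impulse response $h = \mathscr{F}^{-1}H$ is computed by decomposing $H$ into
partial fractions. The poles of $P/Q$ are assumed to lie off the imaginary
axis. There are two cases to consider: $P/Q$ has only simple poles or $P/Q$
has multiple poles.
24.3.1 The case where $P(x)/Q(x)$ has only simple poles

In this case, $H$ can be decomposed in the form
\[ H(\lambda) = \sum_{k=1}^{q} \frac{\beta_k}{2i\pi\lambda - z_k}, \tag{24.10} \]
where $z_1, \dots, z_q$ are the poles and the $\beta_k$ are constants. From Section 22.2.2, read for $\mathscr{F}^{-1}$, we
conclude that
\[ h(t) = \Bigl( \sum_{k \in K_-} \beta_k e^{z_k t} \Bigr) u(t) - \Bigl( \sum_{k \in K_+} \beta_k e^{z_k t} \Bigr) u(-t), \tag{24.11} \]
where we have defined
\[ K_- = \{ k \in \{1, 2, \dots, q\} \mid \mathrm{Re}(z_k) < 0 \}, \qquad K_+ = \{ k \in \{1, 2, \dots, q\} \mid \mathrm{Re}(z_k) > 0 \}. \]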
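The simple-pole case lends itself to a short computation. The sketch below is an illustration under stated assumptions: the poles are supplied by the caller (they are not computed), each is simple and off the imaginary axis, $\deg P < \deg Q$, and the partial-fraction coefficient of $P/Q$ at a pole $z$ is obtained as $P(z)/Q'(z)$. The function names are hypothetical; polynomials are given by their coefficients in increasing degree.

```python
import cmath
import math

def polyval(coeffs, x):
    """Evaluate a polynomial with coefficients in increasing degree (Horner)."""
    result = 0j
    for c in reversed(coeffs):
        result = result * x + c
    return result

def polyder(coeffs):
    """Coefficients (increasing degree) of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def impulse_response(p_coeffs, q_coeffs, poles):
    """Return h(t) for H = P/Q with simple poles off the imaginary axis.

    Poles with negative real part contribute a causal term on t >= 0;
    poles with positive real part contribute an anticausal term on t < 0.
    """
    dq = polyder(q_coeffs)
    poles = [complex(z) for z in poles]
    residues = [polyval(p_coeffs, z) / polyval(dq, z) for z in poles]

    def h(t):
        total = 0j
        for beta, z in zip(residues, poles):
            if z.real < 0 and t >= 0:
                total += beta * cmath.exp(z * t)
            elif z.real > 0 and t < 0:
                total -= beta * cmath.exp(z * t)
        return total

    return h

# g'' + 3 g' + 2 g = f : P = 1, Q(x) = x^2 + 3x + 2, poles -1 and -2.
h_causal = impulse_response([1], [2, 3, 1], [-1, -2])
# g'' - g = f : P = 1, Q(x) = x^2 - 1, poles +1 and -1.
h_two_sided = impulse_response([1], [-1, 0, 1], [1, -1])

if __name__ == "__main__":
    print(h_causal(1.0), math.exp(-1) - math.exp(-2))
    print(h_two_sided(-2.0), h_two_sided(2.0), -0.5 * math.exp(-2))
```

In the first example both poles lie in the left half-plane and $h$ vanishes for $t < 0$; in the second, the pole at $+1$ produces a response on $t < 0$, giving $h(t) = -\tfrac12 e^{-|t|}$.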
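24.3.2 The case where $P(x)/Q(x)$ has multiple poles

Let $z_1, z_2, \dots, z_l$ be the poles and let $m_1, m_2, \dots, m_l$ be their multiplicities.
Then we can write $H$ as
\[ H(\lambda) = \sum_{k=1}^{l} \sum_{m=1}^{m_k} \frac{\beta_{k,m}}{(2i\pi\lambda - z_k)^m}, \tag{24.12} \]
where the $\beta_{k,m}$ are constants. By using the results in Section 17.3.4, we see that
\[ h(t) = \Bigl( \sum_{k \in K_-} P_k(t)\,e^{z_k t} \Bigr) u(t) - \Bigl( \sum_{k \in K_+} P_k(t)\,e^{z_k t} \Bigr) u(-t), \tag{24.13} \]
where
\[ P_k(t) = \sum_{m=1}^{m_k} \beta_{k,m}\,\frac{t^{m-1}}{(m-1)!}. \]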
24.3.3 The case of purely imaginary poles

What we have done so far does not allow us to treat an equation like
\[ g'' + \omega^2 g = f, \]
where $P(x)/Q(x) = 1/(x^2 + \omega^2)$ has two poles on the imaginary axis.
In this case $h$ is a sinusoid and the Fourier transform of $H$ (when $H$ is
considered to be a function) is no longer defined. This problem will be
resolved in Section 35.2.3 in the context of distributions.
24.3.4 The case where $\deg P = \deg Q$

Take for example the equation
\[ g'' - \omega^2 g = f''. \]
Again, what we have done so far does not apply. Nevertheless, we can still
manage to solve the equation. Changing the unknown function to $g_0 = g - f$
lowers the order of the right-hand side:
\[ g_0'' - \omega^2 g_0 = \omega^2 f. \]
Then we have $g_0 = h_0 * f$ and $g = f + h_0 * f$. This is no longer a convolution
like (24.5), but it will serve the same purpose. On the other hand, it is clear
that we can obtain $g$ as
\[ g = h_1 * f' \quad \text{or} \quad g = h_2 * f''. \]
In the general case, we change the unknown function to $g_0 = g - \lambda f$ and
find that
\[ \sum_{k=0}^{q} b_k g_0^{(k)} = \sum_{k=0}^{q} (a_k - \lambda b_k) f^{(k)}. \]
Taking $\lambda = a_q/b_q$ reduces the degree of the right-hand side and brings us
back to the case $p < q$. We can then write
\[ g = \lambda f + h_0 * f. \tag{24.14} \]
(Note that it is possible that $h_0 = 0$; this happens when $P(x) = \lambda Q(x)$
for all $x$ (see Exercise 24.2).) The representation (24.14) leads to estimates
like those given in Section 24.2. In Section 35.2 we will give an expression
for $g$ as a convolution without the condition $p < q$, but in this case $h$ will
be a distribution.
24.3.5 Summary
When $P/Q$ has no poles on the imaginary axis and $\deg P \le \deg Q$, a unique
generalized solution of (24.1) can be defined under the sole condition that
$f \in L^1(\mathbb{R}) \cup L^2(\mathbb{R}) \cup L^\infty(\mathbb{R})$. $A(f) = g$ is a filter that we will call the
generalized filter $A$ associated with (24.1). The output $g$ is given by $g = h * f$
or possibly by a formula like (24.14).
24.4 Stability
24.4.1 Definition An analog system $A : X \to Y$ is said to be stable if
there exists an $M > 0$ such that $\|Af\|_\infty \le M\|f\|_\infty$ for all $f \in L^\infty(\mathbb{R}) \cap X$.
By (24.8), the generalized filter $A$ is stable when $\deg P < \deg Q$. If
$\deg P = \deg Q$, the system is still stable from what we have seen in Section
24.3.4.
24.4.2 Theorem The generalized filter governed by equation (24.1),
whose output $g$ is defined by (24.5) or (24.14), is stable when $\deg P \le \deg Q$
and the poles of $P(x)/Q(x)$ are not on the imaginary axis.
$\deg P \le \deg Q$ and $P/Q$ has no poles on the imaginary axis $\Longrightarrow$ the generalized filter $A$ is stable.
24.5 Realizable systems

24.5.1 Definition A system is said to be realizable (or causal) if the
equality of two input signals for $t < t_0$ implies the equality of the two
output signals for $t < t_0$ (see Section 2.1.2).
For a filter, which is by definition linear and invariant, this condition
becomes the following: For all $t_0 \in \mathbb{R}$,
\[ f(t) = 0 \ \text{for } t < t_0 \ \Longrightarrow\ Af(t) = 0 \ \text{for } t < t_0. \]
We will see that the realizability of the filter defined in Section 24.3.5
depends simply on its impulse response or on the position of the poles.
Assume that $\deg P \le \deg Q$.
The generalized filter $A$ is realizable $\Longleftrightarrow$ $\mathrm{supp}(h) \subset [0, +\infty)$.
If $\mathrm{supp}(h) \subset [0, +\infty)$, the output
\[ g(t) = \int_{0}^{+\infty} h(s) f(t-s)\,ds \]
is 0 for $t < t_0$ when $f(t) = 0$ for $t < t_0$. We prove the other direction by
contradiction. Thus suppose that there is a $t_1 < 0$ such that $h(t_1) > 0$.
Since $h$ is continuous at $t_1$, there is an interval $(a, b)$ such that $b < 0$,
$a < t_1 < b$, and $h > 0$ on $(a, b)$. For the causal input signal
\[ f(t) = \chi_{[0, b-a]}(t), \]
we have an output signal
\[ g(t) = \int_{t-b+a}^{t} h(s)\,ds \]
with $g(b) > 0$. This contradicts the fact that $A$ is causal. This is the proof
when $\deg P < \deg Q$. In case $\deg P = \deg Q$, one uses the trick introduced
in Section 24.3.4.
From formulas (24.11) and (24.13) we see that $\mathrm{supp}(h) \subset [0, +\infty)$ if and
only if $K_+$ is empty. Thus if $\deg P \le \deg Q$, we have the following result:
The generalized filter $A$ is realizable $\Longleftrightarrow$ the poles of $P/Q$ are located to the left of the imaginary axis.
24.5.2 Theorem For the generalized filter defined in Section 24.3.5
with $\deg P \le \deg Q$ to be realizable, it is necessary and sufficient that all
the poles of $P/Q$ have strictly negative real parts.
For $\deg P = \deg Q$, the property results from the fact that the output
can be written as $g = \lambda f + h_0 * f$, $\lambda \in \mathbb{C}$. In summary, if $\deg P \le \deg Q$,
we have the following result:
The real parts of all the poles of $P/Q$ are negative $\Longleftrightarrow$ the generalized filter $A$ is realizable and stable.
24.6 Gain and response time

The gain of a filter of the type described in Section 24.3.5 is defined to be
the constant
$$K = H(0).$$
From (24.9) we see that
$$K = \hat{h}(0) = \lim_{t \to +\infty} h_1(t),$$
which is the ratio between the asymptotic value of the step response and
the height of the input step function. The response time is defined to be the
time it takes the step response to reach and maintain a certain percentage
of its limit, in general 95%:
$$t_r = \min\left\{ \tau \;\middle|\; \left|\frac{h_1(t) - K}{K}\right| \le \frac{5}{100} \text{ for all } t > \tau \right\}.$$

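24.7 The Routh criterion

The stability of a system depends on the location of the roots of the
characteristic equation Q(x) = 0 in the complex plane. We note that it is not
necessary to compute the roots of this equation to determine whether all
their real parts are negative. It is possible to use the Routh criterion: The
roots of the equation
$$a_0 x^p + a_1 x^{p-1} + \cdots + a_{p-1} x + a_p = 0$$
with real coefficients will all have strictly negative real parts if and only if
the elements of the first column of the following array all have the same
sign:
$$\begin{array}{cccc} a_0 & a_2 & a_4 & \cdots \\ a_1 & a_3 & a_5 & \cdots \\ b_1 & b_2 & b_3 & \cdots \\ c_1 & c_2 & c_3 & \cdots \\ \vdots & & & \end{array}$$
with
$$b_k = \frac{a_1 a_{2k} - a_0 a_{2k+1}}{a_1}, \qquad c_k = \frac{b_1 a_{2k+1} - a_1 b_{k+1}}{b_1},$$
etc., for k = 1, 2, ..., where coefficients with index greater than p are taken to be 0.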
EXAMPLES
(a) $Q(x) = x^4 + 3x^3 + 6x^2 + 9x + 12$.
The Routh matrix is
$$\begin{bmatrix} 1 & 6 & 12 \\ 3 & 9 & 0 \\ 3 & 12 & 0 \\ -3 & 0 & 0 \\ 12 & 0 & 0 \end{bmatrix},$$
and thus the real parts of the roots are not all negative.
(b) $Q(x) = x^3 + (2k + 1)x^2 + (k + 1)^2 x + k^2 + 1 = 0$.
The Routh matrix is
$$\begin{bmatrix} 1 & (k+1)^2 & 0 \\ 2k+1 & k^2+1 & 0 \\ \dfrac{2k(k^2+2k+2)}{2k+1} & 0 & 0 \\ k^2+1 & 0 & 0 \end{bmatrix}.$$
For the elements in the first column all to have the same sign, we must
have 2k + 1 > 0 and 2k > 0. Thus the real parts of the roots of Q are
strictly negative if and only if k > 0.
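The construction of the Routh array can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `routh_first_column` and `all_roots_in_left_half_plane` are hypothetical, coefficients beyond $a_p$ are taken to be 0, and the degenerate case of a zero pivot in the first column is not handled (it raises an error).

```python
from fractions import Fraction


def routh_first_column(coeffs):
    """Return the first column of the Routh array for a_0 x^p + ... + a_p.

    `coeffs` lists a_0, ..., a_p (highest degree first). Exact rational
    arithmetic is used. A zero pivot raises ZeroDivisionError; the special
    cases of the criterion are not handled in this sketch.
    """
    a = [Fraction(c) for c in coeffs]
    top, second = a[0::2], a[1::2]
    width = len(top)
    second = second + [Fraction(0)] * (width - len(second))
    rows = [top, second]
    for _ in range(len(a) - 2):
        prev2, prev1 = rows[-2], rows[-1]
        pivot = prev1[0]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot in the Routh array")
        # Same cross-product rule for every new row (b_k, c_k, ...).
        new_row = [(pivot * prev2[k + 1] - prev2[0] * prev1[k + 1]) / pivot
                   for k in range(width - 1)]
        new_row.append(Fraction(0))
        rows.append(new_row)
    return [row[0] for row in rows]


def all_roots_in_left_half_plane(coeffs):
    """True when all first-column entries are nonzero with the same sign."""
    column = routh_first_column(coeffs)
    return all(c > 0 for c in column) or all(c < 0 for c in column)


# Example (a): x^4 + 3x^3 + 6x^2 + 9x + 12
print(routh_first_column([1, 3, 6, 9, 12]))
# Example (b) with k = 1: x^3 + 3x^2 + 4x + 2
print(all_roots_in_left_half_plane([1, 3, 4, 2]))
```

For example (a) the first column contains a sign change, and for example (b) the check succeeds for k = 1 and fails for k = -1, in agreement with the condition k > 0.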
24.8 Exercises
Exercise 24.1 Compute explicitly the output g of the generalized filter
defined by
$$g' - ag = f, \qquad a > 0,$$
and show that it is stable. Is it realizable? Compute the step response.
Exercise 24.2 Let $a, b \in \mathbb{R}$. We wish to study the differential equation
$$g' - ag = f' - bf.$$
In which cases (a = b and $a \ne b$) can one define a generalized filter? Discuss
stability and causality.
Exercise 24.3 Consider the generalized filter determined by
$$g'' + 2ag' + bg = f$$
given $a, b \in \mathbb{R}$.
(a) Determine the regions of the (a, b)-plane where the poles of Q are not on
the imaginary axis.
(b) Determine the regions corresponding to a realizable filter.
(c) Show that the filter is unstable if b = 0.
Exercise 24.4 Does (24.5) define a function when f is slowly increasing?
Hint: See Exercise 20.6.
Exercise 24.5 Compute the transfer function and the impulse response of
the generalized filter
$$g''' + g = f'' + f.$$
Is the filter stable? Realizable?
Lesson 25
Examples of Analog Filters
25.1 Revisiting the RC filter

The RC filter was studied in Section 2.4. The equation is
$$RCg' + g = f,$$
and
$$\frac{P(x)}{Q(x)} = \frac{1}{1 + RCx}, \qquad z_1 = -\frac{1}{RC}.$$
The filter is stable and realizable (fortunately!). Formula (24.11) shows that
$$h(t) = \frac{1}{RC}\, e^{-t/RC}\, u(t)$$
and
$$g(t) = \frac{1}{RC}\int_{-\infty}^{t} e^{-(t-s)/RC} f(s)\,ds.$$
By taking f = u, we obtain the step response
$$h_1(t) = \left(1 - e^{-t/RC}\right)u(t).$$
The gain is K = 1. At the times t = RC and t = 3RC,
$$h_1(RC) = 1 - e^{-1} \approx 0.63, \qquad h_1(3RC) = 1 - e^{-3} \approx 0.95.$$
The response time is $t_r \approx 3RC$. The number RC is called the time constant

of the filter, or RC-constant; it provides a good characterization of the
time it takes the filter to respond to an abrupt change in the input. In this
sense, it characterizes the system's dynamics. The impulse response and
step response are illustrated, respectively, in Figures 25.1 and 25.2.
FIGURE 25.1. Impulse response of the RC filter.
FIGURE 25.2. Step response of the RC filter.
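As a small numerical check of the 95% response time, the following Python sketch evaluates the step response $h_1(t) = (1 - e^{-t/RC})u(t)$. The function names and the value RC = 1 ms are assumptions made for the illustration. Since $h_1$ is increasing with gain K = 1, the response time solves $e^{-t/RC} = 0.05$, that is, $t_r = RC \ln 20$, which is just under 3RC.

```python
import math


def rc_step_response(t, rc):
    """Step response h1(t) = (1 - exp(-t/RC)) u(t) of the RC filter."""
    return 1.0 - math.exp(-t / rc) if t >= 0 else 0.0


def rc_response_time(rc, tolerance=0.05):
    """Smallest t with |h1(s) - K| / K <= tolerance for all s > t (gain K = 1).

    Because h1 is increasing, it suffices to solve exp(-t/RC) = tolerance.
    """
    return rc * math.log(1.0 / tolerance)


rc = 1e-3  # hypothetical time constant: 1 ms
t_r = rc_response_time(rc)
print(t_r / rc)                   # ln 20, slightly below 3
print(rc_step_response(t_r, rc))  # 95% of the gain, up to rounding
```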
25.2 The RLC circuit

If v is the voltage across the capacitance and f is the applied voltage, by
Ohm's law,
$$LCv'' + RCv' + v = f,$$
which defines a second-order filter (Figure 25.3).
FIGURE 25.3. RLC circuit.
Thus
$$\frac{P(x)}{Q(x)} = \frac{1}{LCx^2 + RCx + 1},$$
and there are three cases to consider that depend on the sign of
$$\Delta = R^2C^2 - 4LC.$$
First case: $\Delta < 0$ $\left(R < 2\sqrt{L/C}\right)$.

Let
$$\omega = \sqrt{\frac{4L}{C} - R^2}, \qquad \alpha = \frac{R}{2L}, \qquad \beta = \frac{\omega}{2L}.$$
The two poles are complex conjugates and have negative real parts:
$$z = -\alpha + i\beta \quad\text{and}\quad \bar{z} = -\alpha - i\beta.$$
The partial fraction representation of H is
$$H(\lambda) = \frac{1}{i\omega C}\left[\frac{1}{2i\pi\lambda - z} - \frac{1}{2i\pi\lambda - \bar{z}}\right],$$
and we have the representation of h from (24.11) (Figure 25.4):
$$h(t) = \frac{2}{\omega C}\, e^{-\frac{R}{2L}t} \sin\left(\frac{\omega}{2L}t\right) u(t). \tag{25.1}$$
The response to the input f is thus
$$v(t) = \frac{2}{\omega C}\int_{-\infty}^{t} e^{-\alpha(t-x)} \sin\beta(t-x)\, f(x)\,dx$$
and the step response is
$$h_1(t) = \frac{2}{\omega C}\left(\int_0^t e^{-\alpha x}\sin\beta x\,dx\right)u(t).$$
This integral is evaluated by integrating by parts two times:
$$h_1(t) = \left[1 - e^{-\alpha t}\left(\cos\beta t + \frac{\alpha}{\beta}\sin\beta t\right)\right]u(t).$$
The step response oscillates around the limit value K = 1 (Figure 25.5).
FIGURE 25.4. Impulse response of the RLC circuit (R small).
FIGURE 25.5. Step response of the RLC circuit (R small).
Second case: $\Delta = 0$ $\left(R = 2\sqrt{L/C}\right)$.

In this case,
$$\frac{P(x)}{Q(x)} = \frac{1}{LC\left(x + \frac{R}{2L}\right)^2}$$
has a double real negative pole:
$$z = -\frac{R}{2L}.$$
The impulse response is
$$h(t) = \frac{1}{LC}\, t\, e^{-\frac{R}{2L}t}\, u(t), \tag{25.2}$$
and
$$v(t) = \frac{1}{LC}\int_{-\infty}^{t} (t-s)\, e^{-\frac{R}{2L}(t-s)} f(s)\,ds.$$
The step response is
$$h_1(t) = \left[1 - \left(1 + \frac{R}{2L}t\right)e^{-\frac{R}{2L}t}\right]u(t).$$
This response no longer oscillates around its asymptote (Figure 25.6).
FIGURE 25.6. Step response of the RLC circuit in the critical case.
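Third case: $\Delta > 0$ $\left(R > 2\sqrt{L/C}\right)$.

Here we have
$$H(\lambda) = \frac{1}{LC(2i\pi\lambda - z_1)(2i\pi\lambda - z_2)},$$
and P(x)/Q(x) has two real negative poles:
$$z_1 = -\frac{R+\omega}{2L}, \qquad z_2 = -\frac{R-\omega}{2L},$$
where $\omega = \sqrt{R^2 - \frac{4L}{C}}$.
H is decomposed as
$$H(\lambda) = \frac{-1}{\omega C}\left[\frac{1}{2i\pi\lambda - z_1} - \frac{1}{2i\pi\lambda - z_2}\right],$$
and
$$h(t) = \frac{-1}{\omega C}\left[e^{z_1 t} - e^{z_2 t}\right]u(t). \tag{25.3}$$
The step response is (Figure 25.7)
$$h_1(t) = \left[1 + \frac{2L}{C\omega(R+\omega)}\, e^{z_1 t} - \frac{2L}{C\omega(R-\omega)}\, e^{z_2 t}\right]u(t).$$
The response is slower than in the critical case $\Delta = 0$. The gain is 1 in
all three cases. The RLC filter is stable and realizable.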
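The three regimes can be compared numerically. The sketch below is an illustration only: the function name `rlc_step_response` and the component values are assumptions, and the critical-case expression is the one obtained by integrating the impulse response (25.2) from 0 to t. In each regime the step response starts at 0 and tends to the gain K = 1, and only the first case overshoots.

```python
import math


def rlc_step_response(R, L, C, t):
    """Step response h1(t) of LC v'' + RC v' + v = f in the three regimes."""
    if t < 0:
        return 0.0
    disc = R * R - 4.0 * L / C   # same sign as the discriminant of Q
    alpha = R / (2.0 * L)
    if disc < 0:                 # first case: complex conjugate poles
        beta = math.sqrt(-disc) / (2.0 * L)
        return 1.0 - math.exp(-alpha * t) * (
            math.cos(beta * t) + alpha / beta * math.sin(beta * t))
    if disc == 0:                # second case: double real pole
        return 1.0 - (1.0 + alpha * t) * math.exp(-alpha * t)
    w = math.sqrt(disc)          # third case: two real negative poles
    z1 = -(R + w) / (2.0 * L)
    z2 = -(R - w) / (2.0 * L)
    return (1.0
            + 2.0 * L / (C * w * (R + w)) * math.exp(z1 * t)
            - 2.0 * L / (C * w * (R - w)) * math.exp(z2 * t))


# Hypothetical values L = C = 1; R = 1, 2, 3 give the three cases.
for R in (1.0, 2.0, 3.0):
    print(R, [round(rlc_step_response(R, 1.0, 1.0, t), 4) for t in (0.0, 1.0, 5.0, 50.0)])
```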
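FIGURE 25.7. Step response of the RLC filter (R large).
25.3 Another second-order filter: $-\frac{1}{\omega^2}g'' + g = f$

In this example,
$$\frac{P(x)}{Q(x)} = \frac{\omega^2}{\omega^2 - x^2}, \qquad \omega > 0,$$
so that
$$H(\lambda) = \frac{\omega^2}{4\pi^2\lambda^2 + \omega^2}.$$
From Section 18.2.2, the impulse response is (Figure 25.8)
$$h(t) = \frac{1}{2}\,\omega\, e^{-\omega|t|}.$$
Thus the output is
$$g(t) = \frac{1}{2}\,\omega \int_{\mathbb{R}} e^{-\omega|t-s|} f(s)\,ds,$$
and the step response is (Figure 25.9)
$$h_1(t) = \begin{cases} \frac{1}{2}\, e^{\omega t} & \text{if } t \le 0, \\ 1 - \frac{1}{2}\, e^{-\omega t} & \text{if } t \ge 0. \end{cases}$$
The filter is stable but not realizable.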

FIGURE 25.8. Impulse response of the filter $-\frac{1}{\omega^2}g'' + g = f$.
FIGURE 25.9. Step response of the filter $-\frac{1}{\omega^2}g'' + g = f$.
FIGURE 25.10. Step response of the integrator.
25.4 Integrator and differentiator filters

25.4.1 The integrator g' = f
In this case, we have
$$\frac{P(x)}{Q(x)} = \frac{1}{x}.$$
There is one pole at the origin, and the results of Lesson 24 do not apply.
If f is in $\mathscr{S}$, one cannot in general find a g in $\mathscr{S}$. It is easy to study this
directly: g is a primitive of f, and if we limit the search to causal signals,
then g is determined by having to be causal. In this case,
$$g(t) = \int_{-\infty}^{t} f(s)\,ds,$$
which can be written in terms of the Heaviside function:
$$g = u * f.$$
This is a convolution system whose impulse response is the unit step
function. The step response is defined and is the ramp $h_1(t) = tu(t)$ (see
Figure 25.10). The gain is infinite; the system is unstable but realizable if
limited to causal signals.
25.4.2 The differentiator g = f'

Here we have
$$\frac{P(x)}{Q(x)} = x \quad\text{and}\quad \deg P > \deg Q.$$
This filter is clearly realizable but unstable. Neither the impulse response
nor the step response can be defined with the tools developed so far. These
will be defined later in the context of distributions.
25.5 The ideal low-pass filter

It is customary to describe a filter by the way it modifies the frequencies of
the input signal. This is just to say that a filter is described by its transfer
function H, since the frequencies of the input and output are related by
$$\hat{g}(\lambda) = H(\lambda)\hat{f}(\lambda). \tag{25.4}$$
The ideal low-pass filter does not change the frequencies $\lambda$ for $|\lambda| < \lambda_c$
($\lambda_c$ is the cutoff frequency) and completely suppresses the others. Thus the
transfer function of the ideal filter is
$$H(\lambda) = \begin{cases} 1 & \text{if } |\lambda| < \lambda_c, \\ 0 & \text{otherwise.} \end{cases}$$
From Section 22.2.2, the h in $L^2(\mathbb{R})$ for which $\hat{h} = H$ is
$$h(t) = \frac{\sin 2\pi\lambda_c t}{\pi t}.$$
If we consider only input signals with finite energy, then f, h, and H are
in $L^2(\mathbb{R})$, and (25.4) can be expressed as (Proposition 23.2.1(i))
$$g = h * f.$$
We know that g is continuous, bounded, and zero at infinity. The right-hand
side of (25.4) is in $L^2(\mathbb{R})$ because $\hat{f}$ is in $L^2(\mathbb{R})$. Thus $\hat{g}$ and g are in
$L^2(\mathbb{R})$. The important issue here is the form of h; it tells us that the ideal
low-pass filter is not realizable, since the support of h is not contained in
$[0, +\infty)$. This is indeed troublesome, but not at all
surprising. Faced with the impossibility of having an ideal low-pass filter,
the best we can expect is to find realizable filters whose transfer functions
approximate that of the ideal filter. In general, the transfer functions of
these "real filters" will have a bell-shaped amplitude (see Figure 2.1) and
unbounded support. These ideas are illustrated in Figure 25.11.
The better $|H(\lambda)|$ approximates the centered rectangular window, the
better will be the performance of the realizable filter. In Section 2.4 we saw
that the RC filter acts as a crude low-pass filter. We will see in the next
section that the Butterworth filters provide better realizable approximations
to the ideal low-pass filter.
FIGURE 25.11. A real low-pass filter can only attenuate higher frequencies.
(Top: cutoff of higher frequencies, impulse response of an unrealizable filter.
Bottom: attenuation of higher frequencies, impulse response of a realizable filter.)
25.6 The Butterworth filters

The Butterworth filters are the filters whose energy spectra have the form
$$|H(\lambda)|^2 = \frac{1}{1 + (\lambda/\lambda_c)^{2n}}, \qquad \lambda_c > 0. \tag{25.5}$$
For n = 1 we have the RC filter with $\lambda_c = 1/(2\pi RC)$. The motivation
for increasing n is to produce a cleaner cutoff around $\lambda_c$.
As n increases, frequencies in the pass band $|\lambda| < \lambda_c$ are less attenuated,
and frequencies in the attenuation band $|\lambda| > \lambda_c$ are more suppressed (see
Figure 25.12). Since we have some freedom to choose the phase (only the
modulus has been given), we will determine $H(\lambda)$ to obtain a stable and
realizable filter. If we require h to be real, then
$$|H(\lambda)|^2 = H(\lambda)\overline{H(\lambda)} = H(\lambda)H(-\lambda). \tag{25.6}$$
FIGURE 25.12. Energy spectra of the Butterworth filters.
The poles of $|H(\lambda)|^2$ are the complex numbers
$$p_k = \lambda_c\, e^{i\frac{\pi}{2n}(2k+1)}, \qquad k = 0, 1, \ldots, 2n-1,$$
and they occur in conjugate pairs. We want $H(\lambda)$ to be a rational function
$$H(\lambda) = \frac{P(2i\pi\lambda)}{Q(2i\pi\lambda)} = F(2i\pi\lambda),$$
and furthermore, we want the poles
$$z_k = 2i\pi p_k$$
of F to lie to the left of the imaginary axis. This means that the poles $p_k$
must lie above the real axis. Thus, for the poles of $H(\lambda)$ we select those $p_k$
whose imaginary parts are positive. The remaining $p_k$ (the conjugates of
the ones selected) are the poles of $H(-\lambda)$. Here are two examples.
Case n = 2:
In this case (Figure 25.13),
$$H(\lambda) = \frac{p_0 p_1}{(\lambda - p_0)(\lambda - p_1)}.$$
FIGURE 25.13. Butterworth filter of order 2 (poles selected for $H(\lambda)$).
Case n = 3:
Here we have (Figure 25.14)
$$H(\lambda) = \frac{-p_0 p_1 p_2}{(\lambda - p_0)(\lambda - p_1)(\lambda - p_2)}.$$
FIGURE 25.14. Butterworth filter of order 3 (poles selected for $H(\lambda)$).
We will compute the impulse response for the case n = 2. Thus
$$p_0 = p = \frac{\lambda_c}{\sqrt{2}}(1 + i) \quad\text{and}\quad p_1 = -\bar{p}.$$
If we let $a = \pi\lambda_c\sqrt{2}$, we have
$$H(\lambda) = -2i\pi\,\frac{|p|^2}{p + \bar{p}}\left[\frac{1}{2i\pi\lambda + \alpha} - \frac{1}{2i\pi\lambda + \bar{\alpha}}\right],$$
where $\alpha = a(1 - i)$. Referring to Section 22.2.2,
$$h(t) = -ia\left(e^{-\alpha t} - e^{-\bar{\alpha} t}\right)u(t) = 2a\, e^{-at} \sin at\; u(t).$$
This impulse response has the same form as that of the RLC circuit,
which is equation (25.1).
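The pole selection can be checked numerically. In the following Python sketch (the function names are hypothetical, and the normalization H(0) = 1 is assumed, as in the cases n = 2 and n = 3), the n poles with positive imaginary part are kept, H is formed as the product of the factors $-p_k/(\lambda - p_k)$, and $|H(\lambda)|^2$ is compared with the energy spectrum $1/(1 + (\lambda/\lambda_c)^{2n})$.

```python
import cmath
import math


def butterworth_poles(n, lam_c):
    """Poles p_k = lam_c * exp(i*pi*(2k+1)/(2n)), k = 0..n-1 (upper half-plane)."""
    return [lam_c * cmath.exp(1j * math.pi * (2 * k + 1) / (2 * n))
            for k in range(n)]


def butterworth_H(lam, n, lam_c):
    """Transfer function built from the selected poles, normalized so H(0) = 1."""
    H = 1.0 + 0.0j
    for p in butterworth_poles(n, lam_c):
        H *= -p / (lam - p)
    return H


def energy_spectrum(lam, n, lam_c):
    """Target energy spectrum 1 / (1 + (lam/lam_c)^(2n))."""
    return 1.0 / (1.0 + (lam / lam_c) ** (2 * n))


lam_c = 100.0  # hypothetical cutoff frequency
for n in (1, 2, 3, 4):
    row = [abs(butterworth_H(lam, n, lam_c)) ** 2 for lam in (50.0, 100.0, 200.0)]
    print(n, [round(v, 4) for v in row])
```

At $\lambda = \lambda_c$ the energy spectrum equals 1/2 for every n, while the values in the pass band approach 1 and those in the attenuation band approach 0 as n grows.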
25.7 The general approximation problem

There are many ways to approximate the ideal low-pass filter with stable,
realizable filters. The Butterworth filters belong to the class of polynomial
filters (P(x) = 1). The Chebyshev filters are also in this class. These are
obtained by letting
$$|H(\lambda)|^2 = \frac{1}{1 + a^2 T_n^2(\lambda)},$$
where $T_n(\lambda)$ is the Chebyshev polynomial of degree n and a is a parameter
that determines the amplitude of the oscillations in the pass band. We also
mention the elliptic filters: $|H(\lambda)|^2$ has the same form as above, but $T_n(\lambda)$
is replaced by a rational function. For an account of this we refer to [BL80].
The general approximation problem, given the frequency specifications,
amounts to looking for a rational function that falls within a predetermined
template (see Figure 25.15).
FIGURE 25.15. Approximation of the ideal low-pass filter with given frequency
specifications.
25.8 Exercises
Exercise 25.1 Show that it is possible to choose the constants R, L, and C
such that the RLC circuit is a Butterworth filter of order 2.
Hint: Take $R = \sqrt{2L/C}$ and compute $|H(\lambda)|^2$ as in Section 25.2, First case. One
finds that
$$|H(\lambda)|^2 = \frac{1}{1 + (\lambda/\lambda_c)^4} \quad\text{with}\quad \lambda_c = \frac{1}{2\pi\sqrt{LC}}.$$
Exercise 25.2 Discuss the stability of the generalized system
$$g^{(4)} + 6g^{(3)} + 11g'' + 6g' + kg = f$$
as a function of k.
Exercise 25.3 Consider the following electric filter
[Circuit diagram with components R, C, and r; input x(t), current i(t), output v(t).]
where x is the input and where the voltage v across the resistance r is the output.
(a) Show that x and v are related by $RrCv' + (R + r)v = rx + RrCx'$.
(b) Compute the transfer function and the step response.
(c) Assume that r is small with respect to R. What is the role of this filter?
Chapter VIII
Distributions
Lesson 26
Where Functions Prove to Be Inadequate
We are going to take a turn here that will lead to a new environment in
which signals are no longer modeled solely by functions. The two themes
for this heuristic introduction are impulse and derivation.
26.1 The impulse in physics

Intuitively, an impulse is a very strong signal having a very short duration.
It is like a sharp "right to the jaw," or, less personally, like the collision
of two solid bodies, one large and one small. The acceleration experienced
by the smaller one is short and intense, and its velocity appears to be
discontinuous, since it changes rapidly from one value to another. We first
consider a simple example.
26.1.1 The notion of a point mass

We restrict the example to a one-dimensional mass distribution. Thus imagine
a unit mass distributed on the x-axis between the values -h and h with
a density $d_h(x)$ (see Figure 26.1). This density function has the following
properties:
(i) $d_h(x) \ge 0$ for all $x \in \mathbb{R}$.
(ii) $d_h(x) = 0$ if $|x| > h$.
(iii) $\int_{\mathbb{R}} d_h(x)\,dx = 1$, the total mass.
If we imagine that this constant mass is compressed into the point x = 0,

that is, if we let h tend to 0, then we have, in the limit, what is called a
point mass at the origin. This is like the situation where a physical body is
observed from so far away that it seems to have no dimension and appears
as a point.
FIGURE 26.1. Density distribution of a mass.
But what happens mathematically to the density $d_h$ as h tends to zero?
If we assume that some sort of limit density d(x) exists, then we would like
it to satisfy the following conditions:
(i) $d(x) \ge 0$ for all $x \in \mathbb{R}$.
(ii) $d(x) = 0$ if $x \ne 0$.
(iii) $\int_{\mathbb{R}} d(x)\,dx = 1$.
One has the idea that at x = 0 the value d(O) is infinite. The situation
is similar to that of a point charge carried by an elementary particle.
26.1.2 A collision between two solid bodies

Let us try to see what happens in mechanics when a force becomes more
and more intense and brief. Let S be a solid body of mass m at rest on a
surface where it can slide without friction; think of a hockey puck about to
be hit. Between the instants t = -h and t = h the stick applies a force $f_h$
whose graph has, for example, the shape shown in Figure 26.2.
FIGURE 26.2. Force applied to a puck.
We imagine that the duration of the force becomes shorter and shorter
($h \to 0$) while always imparting the same energy $E_f$ to S. These applied
forces become more and more intense, and in the limit we have an
instantaneous shock at time t = 0. If $v_h$ and $\gamma_h = v_h'$ are respectively the velocity
and the acceleration of S, its kinetic energy at time t is
$$\frac{1}{2}\, m\, v_h^2(t),$$
which is constant and equal to $E_f$ after time t = h:
$$\frac{1}{2}\, m\, v_h^2(h) = E_f.$$
Thus $v_h(h)$ is a constant as a function of h, and we have
$$v_h(h) = \int_{-h}^{h} \gamma_h(t)\,dt = C, \quad\text{a constant.}$$
Newton's second law, $f_h = m\gamma_h$, implies that
$$\int_{-h}^{h} f_h(t)\,dt = c,$$
where c = mC is again a constant. By taking c = 1, we see that the forces $f_h$
satisfy the following three conditions:
(i) $f_h(t) \ge 0$ for all $t \in \mathbb{R}$.
(ii) $f_h(t) = 0$ if $|t| \ge h$.
(iii) $\int_{\mathbb{R}} f_h(t)\,dt = 1$.
At the limit, we will have a shock f(t) that has the following properties:
(i) $f(t) \ge 0$ for all $t \in \mathbb{R}$.
(ii) $f(t) = 0$ if $t \ne 0$.
(iii) $\int_{\mathbb{R}} f(t)\,dt = 1$.
26.2 Uncontrolled skid on impact

From what we have just seen, the unit impulse at the origin will be an
ideal signal, which we denote for the moment by imp(t), that satisfies the
three conditions (i), (ii), and (iii). Unfortunately, even with the Lebesgue
integral, these three conditions are incompatible for a function: The integral
of a function that is zero almost everywhere is necessarily zero.
What is to be done?
People working in mechanics and theoretical particle physics around
1920-1930 (notably P. Dirac) were deterred by neither the question nor
the contradiction. They had a useful tool, even if it was not conceptually
satisfying. They used the imp(t) "function" (it was, in fact, called $\delta(t)$,
but we change the name temporarily for clarity), which, desirable or not,
was thought of as satisfying the conditions
$$\operatorname{imp}(t) = \begin{cases} 0 & \text{if } t \ne 0, \\ +\infty & \text{if } t = 0, \end{cases} \qquad \int_{\mathbb{R}} \operatorname{imp}(t)\,dt = 1, \tag{26.1}$$
and whose graphic representation is shown in Figure 26.3.
FIGURE 26.3. Unit impulse at the origin.
If we set aside rigor but respect the usual properties of functions, it is
easy to exploit the formalism of (26.1). Take, for example, the computation
of the integral
$$I = \int_{\mathbb{R}} \operatorname{imp}(t) f(t)\,dt.$$
Assuming that f is differentiable, we integrate by parts by letting
$$\int_{-\infty}^{x} \operatorname{imp}(t)\,dt = u(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1 & \text{if } x > 0, \end{cases} \tag{26.2}$$
which is quite natural in view of (26.1). (u(0) is not defined, but this is not
important.) Here is the evaluation of I:
$$I = \big[u(x)f(x)\big]_{-\infty}^{+\infty} - \int_{\mathbb{R}} u(x) f'(x)\,dx;$$
$$I = f(+\infty) - \int_0^{+\infty} f'(x)\,dx = f(+\infty) - f(+\infty) + f(0);$$
$$I = f(0).$$
This relation makes sense even if f is only continuous at the origin, and
it is thus possible to make practical use of integrals containing the impulse
function by letting
$$\int_{\mathbb{R}} \operatorname{imp}(t) f(t)\,dt = f(0)$$
for all continuous functions.

In passing, we have "deduced" from (26.2) that the impulse is the deriva-
tive of Heaviside's unit step function, and everything is working out quite
nicely. All the better, since we will see that this new derivation is far more
satisfying than the old one.
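The rule $\int \operatorname{imp}(t) f(t)\,dt = f(0)$ can be illustrated numerically by replacing the impulse with a density of total mass 1 concentrated on [-h, h]. The sketch below is only an illustration: the rectangular density, the function names, and the choice f = cos are assumptions. For this choice the integral equals $\sin(h)/h$, which tends to f(0) = 1 as h tends to 0.

```python
import math


def rectangle_density(x, h):
    """Density d_h: total mass 1 spread uniformly over [-h, h]."""
    return 1.0 / (2.0 * h) if abs(x) <= h else 0.0


def integrate_against(f, h, steps=10_000):
    """Midpoint-rule approximation of the integral of d_h(t) f(t) over [-h, h]."""
    width = 2.0 * h / steps
    total = 0.0
    for i in range(steps):
        t = -h + (i + 0.5) * width
        total += rectangle_density(t, h) * f(t) * width
    return total


for h in (1.0, 0.1, 0.01):
    print(h, integrate_against(math.cos, h))  # approaches cos(0) = 1
```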
26.3 A new-look derivation

With the example of the unit step, we are faced with two derivatives: the
usual derivative, which is zero except at the origin, where it does not exist,
and a derivation denoted by D that leads to the formula (see Figure 26.4)
$$Du = \operatorname{imp}.$$
FIGURE 26.4. The two kinds of derivation (usual derivation: u'(t); new derivation: Du(t) = imp(t)).
FIGURE 26.5. Establishment of an electric current.
From the modeling point of view, the unit step represents, for example,
the instantaneous establishment of a constant electric current. We consider
this phenomenon from a microscopic point of view, without going to the
level of electrons, where the model would necessarily be discrete. Physically,
there is no discontinuity at t = 0, but rather the continuous and very rapid
(of order $10^{-7}$ seconds) establishment of the current. A more precise model
would thus be a function $u_h(t)$ like the one shown in Figure 26.5.
For convenience, we put the time origin at the center of the transition
phase. The usual derivative $u_h'(t)$ must satisfy the following conditions:
(i) $u_h'(t) \ge 0$ for all $t \in \mathbb{R}$.
(ii) $u_h'(t) = 0$ if $|t| > h$.
(iii) $\int_{\mathbb{R}} u_h'(t)\,dt = 1$.
Thus the functions $u_h'$ have all the characteristics of approximations of
the unit impulse. On the other hand, it is clear that taking the usual
derivative and passing to the limit wipes out the event that occurs at time t = 0.
The information disappears. Thus it is more in keeping with the physical
phenomenon to say "u is differentiable, and its derivative is the impulse"
than to say "u is differentiable except at the origin, and its derivative is
the zero function."
26.4 The birth of a new theory

It was necessary to wait until 1947 for the creation, by Laurent Schwartz,
of a complete mathematical theory of these new objects. This is the theory
of distributions. Since then, it has become an almost indispensable tool in
theoretical physics and signal processing. Originally motivated by the study
of partial differential equations, distribution theory has had an impact on
most areas of mathematical analysis.
Distributions generalize the notion of function. We have already seen
with the Lebesgue integral one way that functions needed to be generalized:
Starting with an ordinary process that allows a well-defined value f(x) to
be associated with each x, we arrived at equivalence classes of functions
that are equal almost everywhere, where the value of f at a given point is no
longer significant.
The new generalization will include the impulse as well as many other
"generalized functions." The theory of distributions also contains the new
derivation, which is called "derivation in the sense of distributions." This is
a global concept, whereas the usual derivation applies only to "differentiable
functions." Historically, it was around 1937 that the Soviet mathematician
S.L. Sobolev first introduced the idea of a generalized derivative. Roughly,
g is the generalized derivative of f if
$$\int_{\mathbb{R}} g(x)\varphi(x)\,dx = -\int_{\mathbb{R}} f(x)\varphi'(x)\,dx \tag{26.3}$$
for all regular functions $\varphi$ that have bounded support. This point of view
is taken in distribution theory, which was officially born in 1947 with the
publication of Schwartz's first article in the Annals of the Fourier Insti-
tute at Grenoble. Little by little the idea that all continuous functions
were differentiable spread throughout the mathematics community, to the
general amazement of all! While the theory at first seemed rather esoteric
and complicated (probably because of its heavy use of topology and the
Lebesgue integral), mathematicians quickly realized that its actual use was
much simpler than the theory: One could work formally and quickly with-
out worrying about whether functions satisfied certain conditions, such as
differentiability. A distribution is always differentiable, and in fact infinitely
differentiable. A series divergent in the usual sense will often be convergent
in the sense of distributions. One important property that we have already
verified for the Heaviside function is the continuity of derivation:
$$f_n \to f \implies Df_n \to Df,$$
where the limits are taken in the sense of distributions. Finally, the Fourier
transform, an indispensable tool in so many areas, was until the advent
of distributions defined only for integrable and square-integrable functions.
One was not able to speak, for example, of the Fourier transform of the
Heaviside function. This restriction will be removed; in addition, we will

see the theories of Fourier series and of the Fourier integral unified.
We are going to present only the elementary aspects of distribution the-
ory. Our goal is to develop the tools needed for the principal applications
of Fourier analysis.
Lesson 27
What Is a Distribution?
27.1 The basic idea

The basic idea for generalizing the notion of function in the context of
distributions is to regard a function as an operator $T_f$ (called a functional)
acting by integration on functions themselves:
$$T_f(\varphi) = \int_{-\infty}^{+\infty} f(x)\varphi(x)\,dx. \tag{27.1}$$
This idea is analogous to that of identifying a real number a with the linear
function
$$x \mapsto ax.$$
It is this concept that allows one to go from the notion of derivative to that
of differential.
Clearly, the integral in (27.1) does not always exist. If we want it to exist
for rather general functions f, it is necessary to impose severe restrictions
on the function $\varphi$, which is called a test function. We first require that $\varphi$
vanish outside a bounded interval so that there will be no problem with
convergence at infinity. Thus all test functions $\varphi$ have bounded support.
To generalize derivation (or the derivative), we examine what happens
when f is continuously differentiable. The functional associated with f' is
$$T_{f'}(\varphi) = \int_{-\infty}^{+\infty} f'(x)\varphi(x)\,dx,$$
and integration by parts shows (the boundary terms vanish because $\varphi$ has
bounded support) that
$$T_{f'}(\varphi) = -\int_{-\infty}^{+\infty} f(x)\varphi'(x)\,dx. \tag{27.2}$$
This formula has the advantage that the derivative of f no longer appears,
and thus the derived functional can be defined even though the
function f is not differentiable. However, the test functions $\varphi$ must be
differentiable; indeed, they must be infinitely differentiable if we wish to
iterate the operation.
27.2 The space $\mathscr{D}(\mathbb{R})$ of test functions

One is led naturally to require that test functions be infinitely differentiable
and have bounded support. The space of these functions is denoted by
$\mathscr{D}(\mathbb{R})$ or simply $\mathscr{D}$ (recall Definition 15.1.7).
$$\varphi \in \mathscr{D}(\mathbb{R}) \iff \begin{cases} \text{(i) } \varphi \text{ vanishes outside a bounded interval (which depends on } \varphi\text{).} \\ \text{(ii) } \varphi \text{ is infinitely differentiable in the usual sense on } \mathbb{R}. \end{cases}$$
In other words,
$$\mathscr{D}(\mathbb{R}) = \{\varphi : \mathbb{R} \to \mathbb{C} \mid \varphi \in C^\infty(\mathbb{R}),\ \operatorname{supp}(\varphi) \text{ is bounded}\}.$$
It is not immediately obvious how to construct such a function or whether
such functions exist, except for $\varphi = 0$. The "usual" functions (polynomials,
rational functions, trigonometric functions, etc.) satisfy (ii) but not (i).
In addition, the elementary infinitely differentiable functions are analytic.
Thus if they are zero on some interval, no matter how small, they are
identically zero. This is almost the opposite of what we want. To fix this
situation, we can try to define $\varphi$ a piece at a time:
$$\varphi(x) = \begin{cases} \theta(x) & \text{if } x \in (a, b), \\ 0 & \text{if } x \notin (a, b), \end{cases}$$
where $\theta$ is an elementary function. The difficulty here is that all of the
derivatives of $\theta$ must vanish to the left of a and to the right of b. There
exists, however, at least one explicit example that is always presented in
this context. It is the following function, which we have already seen in
Lessons 15 and 21:
$$\theta(x) = \begin{cases} \exp\left(-\dfrac{1}{1 - x^2}\right) & \text{if } |x| < 1, \\ 0 & \text{if } |x| \ge 1. \end{cases} \tag{27.3}$$
One can verify, with a little patience, that this function is infinitely
differentiable and that the derivatives all vanish at $x = \pm 1$. (Show that the nth
derivative of $\theta$ is of the form $F_n(x)\theta(x)$ for $|x| < 1$, where $F_n$ is a rational function.)
It follows that $\mathscr{D} \ne \{0\}$, but this is still a modest result.
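The function of (27.3) is easy to experiment with. The Python sketch below (the names `bump`, `bump_derivative`, and `derived_heaviside` are hypothetical) evaluates the right-hand side of (27.2) when f is the unit step u, which is not differentiable at the origin: the value $-\int_0^{+\infty} \varphi'(x)\,dx$ should equal $\varphi(0) = e^{-1}$, in line with the heuristic formula Du = imp of Lesson 26. A plain midpoint rule is assumed for the integral.

```python
import math


def bump(x):
    """The function of (27.3): exp(-1/(1 - x^2)) on (-1, 1), zero elsewhere."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))


def bump_derivative(x):
    """Usual derivative of the bump: bump(x) * (-2x / (1 - x^2)^2) on (-1, 1)."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x / (1.0 - x * x) ** 2)


def derived_heaviside(phi_prime, upper=1.0, steps=20_000):
    """Midpoint-rule value of minus the integral of phi' over [0, upper].

    This is the right-hand side of (27.2) with f = u (the unit step), for a
    test function phi whose support lies in [-upper, upper].
    """
    width = upper / steps
    return -sum(phi_prime((i + 0.5) * width) for i in range(steps)) * width


print(derived_heaviside(bump_derivative))  # close to bump(0)
print(bump(0.0))                           # exp(-1)
```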
By translation and change of scale, we can construct a function in $\mathscr{D}$
whose support is an arbitrary bounded interval (a, b). In this way we find
an infinite number of test functions whose supports are disjoint, and this
shows that $\mathscr{D}$ is an infinite-dimensional vector space. In practice, we never
need to use the explicit expression for a test function.
For those who have skipped over Lesson 21, we mention that it was shown
there that $\mathscr{D}$ is dense in $L^1(\mathbb{R})$ and, more generally, in $L^p(\mathbb{R})$:
$$\overline{\mathscr{D}(\mathbb{R})} = L^p(\mathbb{R}), \qquad 1 \le p < +\infty.$$
27.3 The definition of a distribution

In formula (27.1), the functional $T_f$ is linear on $\mathscr{D}$. We will add a continuity
condition.
27.3.1 Definition A distribution is any mapping
$$T : \mathscr{D}(\mathbb{R}) \to \mathbb{C}$$
that is linear and continuous.
The variable of a distribution is a test function, and $T(\varphi)$ is a complex
number. T is said to be a continuous linear functional (or linear form) on
$\mathscr{D}$. The value of T at $\varphi$ will be denoted in either of two ways:
$$T(\varphi) \quad\text{or}\quad \langle T, \varphi \rangle.$$
This second notation brings to mind a close relative of $T(\varphi)$, namely, the
scalar product in $L^2(\mathbb{R})$ expressed by (27.1).
The continuity of T means the following:
$$\text{If } \varphi_n \to \varphi \text{ in } \mathscr{D}, \text{ then } T(\varphi_n) \to T(\varphi).$$
We still must specify the meaning of "$\varphi_n \to \varphi$ in $\mathscr{D}$"; thus we need to
define a topology (or at least the concept of convergence) for $\mathscr{D}$. Here, and
elsewhere in the book, we settle for a direct definition of convergence
(the notion of a limit of a sequence) and avoid discussing the underlying
topology. We will define what $\varphi_n \to 0$ means; by linearity, $\varphi_n \to \varphi$ will
mean that $\varphi_n - \varphi \to 0$.
27.3.2 Definition (limits in $\mathscr{D}$) A sequence of elements $(\varphi_n)_{n \in \mathbb{N}}$
of $\mathscr{D}$ tends to 0 in $\mathscr{D}$ if the following hold:
(i) The supports of all the $\varphi_n$ are contained in a fixed compact interval.
(ii) $(\varphi_n)$ as well as all of the derived sequences $(\varphi_n^{(k)})$ tend to 0 uniformly on $\mathbb{R}$
as $n \to +\infty$.
There does not exist a distance function, much less a norm, on $\mathscr{D}$ that
gives this notion of convergence. There is, however, a well-defined topology
on $\mathscr{D}$ (see [Sch65b]). It is sufficient for our purposes to have the notion of
convergence.
The following useful property follows directly from the definition.
27.3.3 Proposition If a sequence of elements $(\varphi_n)$ in $\mathscr{D}$ tends to 0
in $\mathscr{D}$, then the sequence $(\varphi_n')$ is in $\mathscr{D}$ and tends to 0 in $\mathscr{D}$.
It is clear that the set of distributions has the structure of a complex
vector space with the obvious addition of two linear forms and multiplication
by scalars. This space is called the topological dual of $\mathscr{D}$ or the space of
continuous linear functionals on $\mathscr{D}$. We denote this dual space by $\mathscr{D}'(\mathbb{R})$
or simply $\mathscr{D}'$. This conforms with historical and customary notation.
27.3.4 Examples
(a) Point distributions: Let a be a real number and let $\delta_a$ be the mapping
defined on $\mathscr{D}$ by
$$\delta_a(\varphi) = \varphi(a).$$
It is clear that $\delta_a$ is linear and continuous on $\mathscr{D}$ (convergence in $\mathscr{D}$ is
uniform and hence pointwise). Thus $\delta_a$ is a distribution. For a = 0 we
write simply $\delta$.
(b) Let $(\lambda_n)_{n \in \mathbb{Z}}$ be a complex sequence and let a > 0. We write
$$T = \sum_{n=-\infty}^{+\infty} \lambda_n \delta_{na}$$
for the linear functional on $\mathscr{D}$ defined by
$$T(\varphi) = \sum_{n=-\infty}^{+\infty} \lambda_n \varphi(na).$$
This sum is, in fact, finite for each $\varphi$; hence there is no problem about
convergence. Furthermore, T is continuous: If $\varphi_p \to 0$ and if [A, B] is a
bounded interval containing the supports of all the $\varphi_p$, then
$$T(\varphi_p) = \sum_{A \le na \le B} \lambda_n \varphi_p(na)$$
is a finite sum that tends to 0 as $p \to +\infty$. Thus T is a distribution. For
$\lambda_n = 1$, we have Dirac's comb, which will be used often in the sequel.
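Example (b) can be made concrete with a short Python sketch. It is an illustration under assumptions: the names `bump` and `comb` are hypothetical, the test function is the one defined by (27.3), and its support [-1, 1] is passed explicitly so that the sum is visibly finite. With a = 0.5 and $\lambda_n = 1$ only n = -2, ..., 2 contribute, and the two end terms vanish.

```python
import math


def bump(x):
    """Test function of (27.3), supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))


def comb(phi, a, lambdas=lambda n: 1.0, support=(-1.0, 1.0)):
    """T(phi) = sum of lambda_n * phi(n a) over the n with n*a in `support`.

    With lambda_n = 1 this is Dirac's comb applied to phi. Only finitely
    many terms are nonzero because phi has bounded support.
    """
    A, B = support
    n_min = math.ceil(A / a)
    n_max = math.floor(B / a)
    return sum(lambdas(n) * phi(n * a) for n in range(n_min, n_max + 1))


print(comb(bump, 0.5))                          # phi(-0.5) + phi(0) + phi(0.5)
print(comb(bump, 0.5, lambdas=lambda n: n))     # odd weights against an even phi
```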
27.4 Distributions as generalized functions

We wish to show that many functions can be identified as distributions via
the relation (27.1). The integral in this formula is well-defined whenever f is
locally integrable. The integrability of $f\varphi$ is a consequence of the inequality
$$\int_{\mathbb{R}} |f(x)||\varphi(x)|\,dx \le \|\varphi\|_\infty \int_a^b |f(x)|\,dx, \tag{27.4}$$
where $\operatorname{supp}(\varphi) \subset (a, b)$.
27.4.1 Proposition If f is locally integrable on $\mathbb{R}$, then the functional
$T_f$ defined by
$$T_f(\varphi) = \int_{\mathbb{R}} f(x)\varphi(x)\,dx \tag{27.5}$$
is a distribution.
Proof. $T_f$ is clearly linear. The continuity on $\mathscr{D}$ follows directly from
(27.4): Let (a, b) be an interval containing the supports of the elements $\varphi_n$
of a sequence in $\mathscr{D}$ that tends to zero. Then
$|T_f(\varphi_n)| \le \|\varphi_n\|_\infty \int_a^b |f(x)|\,dx \to 0$. $\square$
It is clear from (27.5) that if f and g are locally integrable and equal
almost everywhere, they define the same distribution on $\mathscr{D}$; that is,
$$f = g \text{ a.e.} \implies T_f = T_g.$$
We are going to investigate the converse of this implication.
27.4.2 A question of identification

The mapping in Proposition 27.4.1 of $L^1_{loc}(\mathbb{R})$ into $\mathscr{D}'(\mathbb{R})$ given by $f \mapsto T_f$
is well-defined. The identification of $L^1_{loc}(\mathbb{R})$ with its image in $\mathscr{D}'$ is possible
if this mapping, which is linear, is also 1-to-1; that is, if
$$T_f = 0 \implies f = 0 \text{ a.e.}$$
(See Figure 27.1.) The proof of this property was given as an exercise
(Exercise 21.6). Thus for all f in $L^1_{loc}(\mathbb{R})$
$$T_f = 0 \iff f = 0 \text{ a.e.} \tag{27.6}$$
The distribution Tt is thus identified uniquely with the locally integrable

function I. From now on we will make the identification I +--+ Tt using any
of several notations:
FIGURE 27.1. Embedding of Lfoc(R) in~'.
L/oc
0
FIGURE 27.2. Identification of Lloc as a subspace of .! '.
The distributions $T_f$ associated with locally integrable functions are said to be regular. With this identification we have the situation illustrated in Figure 27.2.
It is based on this embedding that distributions are called generalized functions. More precisely, distributions generalize locally integrable functions. Here are two examples:
The unit step function (Heaviside's function) $u$ is identified with the distribution
$$T_u(\varphi) = \int_0^{+\infty} \varphi(x)\,dx.$$
The constant function $f = K$, $K \in \mathbb{C}$, is identified with the constant distribution of the form
$$T_f(\varphi) = K \int_{\mathbb{R}} \varphi(x)\,dx.$$
Unfortunately, a function as simple as $f(x) = x^{-1}$, which is not locally integrable around the origin, cannot at this point be considered a distribution. We will see in Section 28.5 how this situation can be remedied.
27.4.3 There are nonregular distributions

A simple example of a nonregular distribution is given by the point distributions
$$\delta_a(\varphi) = \varphi(a).$$
We will prove this for $a = 0$; that is, we will show that there does not exist a locally integrable function $f$ such that
$$\int_{\mathbb{R}} f\varphi = \varphi(0)$$
for all $\varphi \in \mathcal{D}$. We argue indirectly. Thus, assume that such an $f$ exists and let $\rho \in \mathcal{D}$ be the function defined by (27.3). Since $\rho(nx)$ vanishes for $|x| \ge 1/n$, we have
$$\int_{\mathbb{R}} f(x)\rho(nx)\,dx = \rho(0)$$
for all $n \in \mathbb{N}$. This implies that
$$|\rho(0)| \le \int_{-1/n}^{1/n} |f(x)|\,|\rho(nx)|\,dx \le \int_{-1/n}^{1/n} |f(x)|\,dx.$$
As $n \to \infty$, the right-hand integral tends to 0, whereas $\rho(0) \neq 0$; this contradiction proves that $f$ does not exist.
27.5 Exercises
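Exercise 27.1 Are the following functionals distributions?
(a) $T(\varphi) = |\varphi(0)|$.
(b) $T(\varphi) = a$, $a \in \mathbb{C}$.
(c) $T(\varphi) = \sum_{n=0}^{+\infty} \varphi^{(n)}(0)$.
(d) $T(\varphi) = \int_{\mathbb{R}} |x|^{\alpha}\varphi(x)\,dx$, $\alpha \in \mathbb{R}$.
Exercise 27.2 Suppose $\varphi \in \mathcal{D}(\mathbb{R})$ and $n \in \mathbb{N}^*$. Show that $\varphi$ can be written in the form
$$\varphi(x) = \sum_{k=0}^{n-1} \frac{x^k}{k!}\,\varphi^{(k)}(0) + x^n\psi_n(x)$$
with $\psi_n \in C^\infty(\mathbb{R})$.
Hint: Use Taylor's formula with integral remainder.
Exercise 27.3
(a) Take $y_j \in \mathbb{C}$, $0 \le j \le n$, and $a \in \mathbb{R}$. We wish to find $\varphi \in \mathcal{D}(\mathbb{R})$ such that
$$\varphi^{(j)}(a) = y_j, \quad 0 \le j \le n. \tag{I}$$
Let $\theta_a = k\,\chi_{[a-1,a+1]} * \theta$, where $k \in \mathbb{R}$ and $\theta$ is defined by (27.3). Show that $\theta_a \in \mathcal{D}(\mathbb{R})$ and that $\theta_a^{(j)}(a) = 0$ for all $j \ge 1$. Choose $k$ such that $\theta_a(a) = 1$.
Find a polynomial $P$ of degree $n$ such that $\varphi = P\theta_a$ satisfies (I).
(b) Let $(y_j)_{j \in \mathbb{N}}$ be a sequence that is dominated by a geometric sequence: $|y_j| \le A B^j$ for some $A, B > 0$. For $a \in \mathbb{R}$, show that there exists $\varphi \in \mathcal{D}(\mathbb{R})$ such that $\varphi^{(j)}(a) = y_j$ for all $j \in \mathbb{N}$.
Exercise 27.4 (truncation) For $f \in C^\infty(\mathbb{R})$, show that there exists $\varphi \in \mathcal{D}(\mathbb{R})$ that agrees with $f$ on $[-1, 1]$.
Exercise 27.5 Assume that $f \in C^\infty(\mathbb{R}\setminus\{0\})$ satisfies the following condition: There is a $p \in \mathbb{N}$ such that for all $n \in \mathbb{N}$, $x^p f^{(n)}(x) \to 0$ when $x \to 0$.
(a) Give an example of such a function.
(b) Show that there exists $g \in C^\infty(\mathbb{R})$ that agrees with $f$ outside $[-1, 1]$.
Exercise 27.6 Suppose $g \in C^\infty(\mathbb{R})$, $a \in \mathbb{R}$, and $n \in \mathbb{N}$. Prove the following equivalence: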
Lesson 28
Elementary Operations on
Distributions
In this lesson we intend to extend to distributions several concepts regularly applied to functions, such as parity and periodicity, as well as the fundamental notion of derivation. Indeed, it is in the area of differential equations (ordinary and partial) that the use of distributions has been most fruitful. We will illustrate a general method for extending to distributions certain concepts that are known for functions. The difficulty comes from the fact that the argument of a distribution is a function and not a real variable.
28.1 Even, odd, and periodic distributions

The reflection $f_\sigma$ of a function $f$ is defined by
$$f_\sigma(x) = f(-x).$$
The function $f$ is said to be even if $f_\sigma = f$; it is said to be odd if $f_\sigma = -f$. We will try to define the reflection $T_\sigma$ of a distribution $T$. For a regular distribution $T_f$, the identification with the function $f$ imposes the relation
$$(T_f)_\sigma = T_{f_\sigma}.$$
This means that
$$(T_f)_\sigma(\varphi) = \int_{\mathbb{R}} f_\sigma(t)\varphi(t)\,dt = \int_{\mathbb{R}} f(t)\varphi_\sigma(t)\,dt$$
for all $\varphi \in \mathcal{D}$, or written differently, that
$$(T_f)_\sigma(\varphi) = T_f(\varphi_\sigma).$$
The last formula is our guide for extending "reflection" to all distributions.
28.1.1 Definition The reflection $T_\sigma$ of a distribution $T$ is defined by
$$T_\sigma(\varphi) = T(\varphi_\sigma) \tag{28.1}$$
for all $\varphi \in \mathcal{D}$. The distribution $T$ is said to be even if $T_\sigma = T$ and odd if $T_\sigma = -T$.
It is easy to show that $T_\sigma$ thus defined is indeed a distribution.

EXAMPLE: $\delta$ is an even distribution.
28.1.2 Periodic distributions

Here again we note what happens in the case of functions. A function $f$ has real period $a$, $a \neq 0$, if
$$f(x - a) = f(x)$$
for all $x \in \mathbb{R}$. In terms of the translation operator (Section 2.1.3) this is
$$\tau_a f = f.$$
Thus it is sufficient to define the translation operator for distributions to establish the desired definition. For a regular distribution $T_f$, its identification with $f$ forces the relation
$$\tau_a T_f = T_{\tau_a f}.$$
As before, this expands to
$$(\tau_a T_f, \varphi) = (T_{\tau_a f}, \varphi) = \int f(x - a)\varphi(x)\,dx = \int f(x)\varphi(x + a)\,dx,$$
which means that
$$(\tau_a T_f, \varphi) = (T_f, \tau_{-a}\varphi)$$
for all $\varphi \in \mathcal{D}$. The last formula can be extended to all distributions.
28.1.3 Definition The translate $\tau_a T$ of a distribution $T$ is defined by
$$(\tau_a T, \varphi) = (T, \tau_{-a}\varphi) \tag{28.2}$$
for all $\varphi \in \mathcal{D}$. A distribution $T$ is said to be periodic with period $a \neq 0$ if $\tau_a T = T$.
EXAMPLES:
(i) $\tau_a \delta_b = \delta_{a+b}$.
(ii) Dirac's comb, $T(\varphi) = \sum_{n=-\infty}^{+\infty} \varphi(na)$, is periodic with period $a$.
28.2 Support of a distribution

It is again a matter of extending to distributions a concept that has been defined for functions (see Definition 15.1.5). If $\varphi$ is continuous, then
$$\operatorname{supp}(\varphi) = \overline{\{x \in \mathbb{R} \mid \varphi(x) \neq 0\}}.$$
28.2.1 Definition A distribution $T$ is said to be null (be zero, vanish) on an open set $O$ if $T(\varphi) = 0$ for all test functions $\varphi$ for which $\operatorname{supp}(\varphi) \subset O$.
For example, $\delta$ is null on $(0, +\infty)$. It is possible to prove the following equivalence for a regular distribution $T_f$:
$$T_f \text{ is null on } O \iff f(x) = 0 \text{ for a.e. } x \in O. \tag{28.3}$$
We will assume (28.3). This is slightly more general than the condition discussed in Section 27.4.2, which was that
$$T_f = 0 \iff f(x) = 0 \text{ for a.e. } x \in \mathbb{R}.$$
28.2.2 Definition The support of a distribution $T$, $\operatorname{supp}(T)$, is defined to be the complement of the largest open set on which $T$ is null.
This is equivalent to saying that $\operatorname{supp}(T) = \mathbb{R} \setminus \bigcup_i O_i$, where the union is taken over all open sets $O_i$ on which $T$ is null. Clearly, $\operatorname{supp}(T)$ is a closed subset of $\mathbb{R}$.
EXAMPLE: $\operatorname{supp}\bigl(\sum_{i=1}^{n} \lambda_i\delta_{a_i}\bigr) \subset \{a_1, a_2, \ldots, a_n\}$, with equality in case none of the $\lambda_i$ are zero. It is sufficient to show that $T = \sum_{i=1}^{n} \lambda_i\delta_{a_i}$ is null on the complement of $\{a_1, a_2, \ldots, a_n\}$. Thus take $\varphi \in \mathcal{D}$ with $\operatorname{supp}(\varphi) \cap \{a_1, \ldots, a_n\} = \emptyset$; clearly,
$$T(\varphi) = \sum_{i=1}^{n} \lambda_i\varphi(a_i) = 0.$$
To prove equality, suppose there is an index $k$ such that $a_k$ is not in $\operatorname{supp}(T)$. Then there is a neighborhood $I$ of $a_k$ that contains none of the other $a_j$ and on which $T$ is null. Let $\varphi$ be a test function such that
$$\operatorname{supp}(\varphi) \subset I \quad\text{and}\quad \varphi(a_k) = 1.$$
We have
$$0 = T(\varphi) = \lambda_k\varphi(a_k) = \lambda_k,$$
which proves the result.
28.2.3 Proposition If $T_f$ is a regular distribution, then
$$\operatorname{supp}(T_f) = \operatorname{supp}(f).$$
Proof. This follows directly from Definition 20.1.3 of the support of a measurable function and (28.3). □
28.2.4 Definition The space of distributions whose supports lie to the right of some finite point is denoted by $\mathcal{D}'_+$:
$$T \in \mathcal{D}'_+ \iff \operatorname{supp}(T) \subset [t_0, +\infty) \text{ for some } t_0 \in \mathbb{R}.$$
This space, which corresponds to the space of causal signals, will be used often in the sequel.
28.3 The product of a distribution and a function

One should not conclude from what we have done so far that all operations on functions extend naturally to distributions. We do not know how, in general, to define the product of two distributions. If we could do this, and if we wish this operation to be consistent with the identification $f \mapsto T_f$, we would have the relation
$$T_f T_g = T_{fg}.$$
But $T_{fg}$ does not necessarily exist, since $f$ and $g$ can be locally integrable without $fg$ being locally integrable (take $f(x) = g(x) = |x|^{-1/2}$). There is no problem if one of the functions, say $g$, is continuous. In this case,
$$(T_{fg}, \varphi) = \int_{\mathbb{R}} fg\varphi = (T_f, g\varphi)$$
for all $\varphi \in \mathcal{D}$. This suggests the formula
$$(gT, \varphi) = (T, g\varphi).$$
For the expression on the right to be well-defined, we must have $g\varphi \in \mathcal{D}$ for all $\varphi$, and this implies that $g$ must be infinitely differentiable.
28.3.1 Definition The product of a distribution $T$ by an infinitely differentiable function $g$, denoted by $gT$, is defined by
$$(gT, \varphi) = (T, g\varphi) \quad \text{for all } \varphi \in \mathcal{D}. \tag{28.4}$$
It is easy to see that $gT$ is a well-defined distribution.
EXAMPLE: Take $T = \delta_a$. Then
$$(g\delta_a, \varphi) = (\delta_a, g\varphi) = g(a)\varphi(a) = (g(a)\delta_a, \varphi),$$
and hence
$$g\delta_a = g(a)\delta_a. \tag{28.5}$$
As a particular case,
$$x\delta = 0.$$
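The definitions (28.1), (28.2), and (28.4) share one pattern: an operation on $T$ is defined by transferring it to the test function. The following is a minimal Python sketch of that pattern, treating a distribution as a callable that maps a test function to a number. The helper names (`delta`, `reflect`, `translate`, `multiply`) and the sample test function are illustrative assumptions, not notation from the text.

```python
import math

def bump(x):
    """Smooth function supported in [-1, 1] (an assumed sample test function)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def delta(a):
    """Point distribution: <delta_a, phi> = phi(a)."""
    return lambda phi: phi(a)

def reflect(T):
    """(28.1): <T_sigma, phi> = <T, phi_sigma>, where phi_sigma(x) = phi(-x)."""
    return lambda phi: T(lambda x: phi(-x))

def translate(a, T):
    """(28.2): <tau_a T, phi> = <T, tau_{-a} phi>, where tau_{-a} phi(x) = phi(x + a)."""
    return lambda phi: T(lambda x: phi(x + a))

def multiply(g, T):
    """(28.4): <gT, phi> = <T, g phi>."""
    return lambda phi: T(lambda x: g(x) * phi(x))

# A test function that is neither even nor odd.
phi = lambda x: (1.0 + x) * bump(x)

x_delta = multiply(lambda x: x, delta(0.0))   # the distribution x * delta
g_delta = multiply(math.cos, delta(0.5))      # cos * delta_{0.5}
shifted = translate(0.2, delta(0.3))          # tau_{0.2} delta_{0.3}
mirrored = reflect(delta(0.0))                # reflection of delta

print(x_delta(phi))                             # pairing of x * delta with phi
print(g_delta(phi), math.cos(0.5) * phi(0.5))   # the two sides of (28.5)
print(shifted(phi), phi(0.5))                   # tau_a delta_b against delta_{a+b}
print(mirrored(phi), delta(0.0)(phi))           # delta is even
```

Representing a distribution as a closure keeps each definition a one-line transcription of its formula; it says nothing about continuity, which a numerical sketch of this kind cannot check.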
28.4 The derivative of a distribution

From what we have seen in Section 27.1, we should define the derivative $T'$ of the distribution $T$ by the formula
$$(T', \varphi) = -(T, \varphi'), \tag{28.6}$$
which we deduce directly from (27.2). We need to show that $T'$ is a distribution. It is clearly a linear mapping defined on $\mathcal{D}$, since $\varphi'$ is in $\mathcal{D}$ for all $\varphi \in \mathcal{D}$. It is also continuous: If $\varphi_n \to 0$ in $\mathcal{D}$, then $\varphi_n' \to 0$ in $\mathcal{D}$ (Proposition 27.3.3), and hence $T'(\varphi_n) \to 0$.
Since $T'$ is a distribution, it has a derivative $T'' : \mathcal{D} \to \mathbb{C}$ given by
$$(T'', \varphi) = (T, \varphi''),$$
and so on.
28.4.1 Proposition Each distribution $T$ is infinitely differentiable, and the $n$th derivative of $T$ satisfies the relation
$$(T^{(n)}, \varphi) = (-1)^n (T, \varphi^{(n)})$$
for all $\varphi \in \mathcal{D}$.
28.4.2 Examples
(a) The derivative of the point distribution $\delta_a$ is given by
$$\delta_a'(\varphi) = -\varphi'(a). \tag{28.7}$$
(b) For the derivative of the unit step $u$ we have
$$(T_u, \varphi) = \int_0^{+\infty} \varphi(x)\,dx,$$
and hence
$$(T_u', \varphi) = -(T_u, \varphi') = -\int_0^{+\infty} \varphi'(x)\,dx = \varphi(0) = (\delta, \varphi).$$
Thus we see that distribution theory leads us to the derivative of Heaviside's unit step being the point distribution at the origin. It was our heuristic derivation of this result that prompted us to use the point distribution as a model for the unit impulse. We will see in the next lesson a more direct (mathematical) reason for modeling the unit impulse at $x = 0$ by $\delta$.
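As a numerical sanity check of example (b), the following Python sketch evaluates $-\int_0^{+\infty} \varphi'(x)\,dx$ for one sample test function and compares it with $\varphi(0)$. The bump function, its hand-computed derivative, and the midpoint rule are assumptions chosen for illustration.

```python
import math

def bump(x):
    """Smooth function supported in [-1, 1] (an assumed sample test function)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def bump_prime(x):
    """Derivative of bump, computed by the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def midpoint(f, lo, hi, n=20000):
    """Composite midpoint rule on [lo, hi] with n panels."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# <T_u', phi> = -<T_u, phi'> = -(integral of phi' over [0, +inf));
# phi vanishes beyond 1, so integrating over [0, 1] suffices.
lhs = -midpoint(bump_prime, 0.0, 1.0)
rhs = bump(0.0)  # <delta, phi> = phi(0)
print(lhs, rhs)
```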
28.4.3 The usual derivative and the derivative in the sense of distributions

If $f$ is a locally integrable function, then the associated distribution $T_f$ has a derivative, which is the distribution $T_f'$. We call this distribution the derivative of $f$ in the sense of distributions. When $f$ is absolutely continuous on all compact intervals $[a, b]$, integration by parts (Theorem 14.5.6) shows that
$$(T_f', \varphi) = -\int_{\mathbb{R}} f\varphi' = \int_{\mathbb{R}} f'\varphi = (T_{f'}, \varphi),$$
and hence that
$$T_f' = T_{f'}.$$
The identification of a locally integrable function with its associated distribution leads to the following result: For a function that is absolutely continuous on all compact intervals $[a, b]$, the derivative of $f$ in the sense of distributions agrees a.e. with the usual, or ordinary, derivative. For example,
$$f(x) = |x|$$
and
$$f'(x) = \operatorname{sign}(x) \text{ for a.e. } x.$$
In the general case of a locally integrable function $f$, this derivative, whether or not it is a function, will be denoted by $f'$ when there is no ambiguity. The derivative in the sense of distributions is expressed by the relation
$$(f', \varphi) = -(f, \varphi') \quad \text{for all } \varphi \in \mathcal{D}. \tag{28.8}$$
For example,
$$u' = \delta. \tag{28.9}$$
28.4.4 The derivative in the sense of distributions of a discontinuous function

As an example, we take a function $f$ that is continuously differentiable on the intervals $(-\infty, a)$, $(a, b)$, and $(b, +\infty)$ and that has finite left and right limits at $a$ and $b$ (see Figure 28.1). Taking the derivative of $f$ in the sense of distributions, we have
$$(T_f', \varphi) = -\int_{-\infty}^{+\infty} f\varphi' = -\int_{-\infty}^{a} f(x)\varphi'(x)\,dx - \int_a^b f(x)\varphi'(x)\,dx - \int_b^{+\infty} f(x)\varphi'(x)\,dx.$$
FIGURE 28.1. Derivative in the sense of distributions of a discontinuous function.
We integrate by parts on each of the intervals; this shows that
$$(T_f', \varphi) = -f(a^-)\varphi(a) + \int_{-\infty}^{a} f'(x)\varphi(x)\,dx - f(b^-)\varphi(b) + f(a^+)\varphi(a) + \int_a^b f'(x)\varphi(x)\,dx + f(b^+)\varphi(b) + \int_b^{+\infty} f'(x)\varphi(x)\,dx.$$
This can be written
$$(T_f', \varphi) = \lambda\varphi(a) + \mu\varphi(b) + (T_{f'}, \varphi)$$
with
$$\lambda = f(a^+) - f(a^-), \qquad \mu = f(b^+) - f(b^-).$$
Written in different notation, we have
$$T_f' = T_{f'} + \lambda\delta_a + \mu\delta_b. \tag{28.10}$$
$T_{f'}$ denotes the distribution associated with the usual derivative of $f$, which is defined everywhere except at $a$ and $b$. In this case, we see that the usual derivative and the derivative in the sense of distributions give different results: The discontinuity of $f$ at a point, say $a$, causes a point distribution $\lambda\delta_a$ to appear in the (distribution) derivative; the coefficient $\lambda$ is equal to the size of the "jump" of the function at $a$. Formula (28.9) is a particular case.
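The jump formula can be checked numerically on one concrete case. The Python sketch below uses an assumed piecewise function with jumps $\lambda = 2$ at $a = -0.5$ and $\mu = -3$ at $b = 0.25$ (and $f' = 1$ elsewhere), pairs both sides with a sample bump test function, and compares the results. The function, the test function, and the midpoint rule are illustrative choices only.

```python
import math

def bump(x):
    """Smooth function supported in [-1, 1] (an assumed sample test function)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def bump_prime(x):
    """Derivative of bump, computed by the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def midpoint(f, lo, hi, n=20000):
    """Composite midpoint rule on [lo, hi] with n panels."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

a, b = -0.5, 0.25

def f(x):
    """Piecewise C^1 sample: jump +2 at a, jump -3 at b, derivative 1 elsewhere."""
    if x < a:
        return x
    if x < b:
        return x + 2.0
    return x - 1.0

lam, mu = 2.0, -3.0  # f(a+) - f(a-) and f(b+) - f(b-)

# Left side: <T_f', phi> = -(integral of f * phi'), integrated piece by piece
# so that no panel straddles a discontinuity.
pieces = [(-1.0, a), (a, b), (b, 1.0)]
lhs = -sum(midpoint(lambda x: f(x) * bump_prime(x), lo, hi) for lo, hi in pieces)

# Right side: lam * phi(a) + mu * phi(b) + <T_{f'}, phi>, with f' = 1 a.e.
rhs = lam * bump(a) + mu * bump(b) + midpoint(bump, -1.0, 1.0)
print(lhs, rhs)
```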
28.4.5 An infinite number of discontinuities

If, for example, a function $f$ has infinitely many points of discontinuity $(na)_{n \in \mathbb{Z}}$ equally spaced on $\mathbb{R}$, then formula (28.10) can be generalized. Since every test function $\varphi$ has compact support, one is led to the finite case described above.
EXAMPLE: Consider the periodic function $f$ with period $a$ defined on $(0, a)$ by $f(x) = x/a$ (Figure 28.2).
FIGURE 28.2.
Then we have (see Section 27.3.4(b))
$$f' = \frac{1}{a} - \sum_{n=-\infty}^{+\infty} \delta_{na}.$$
We are now going to see what can be done, from the point of view of distributions, with certain functions that are unbounded and not integrable in the neighborhood of a point.
28.5 Some new distributions

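28.5.1 The distribution $\operatorname{pv}(\frac{1}{x})$
The function $f(x) = 1/x$, defined for $x \neq 0$, is not integrable around the origin. Thus we cannot associate a distribution with $f$, and we will have the same problem with any rational function having a real pole. This is something of a wrench in our distribution machinery, for one of our main goals with distributions is to extend the notion of function. Furthermore, the functions in question often arise in practical problems. We will see here how this problem can be resolved. Although $f$ is not locally integrable, its primitive $F(x) = \ln(|x|)$ is locally integrable. The distribution that comes to our aid is simply the derivative of $F$ in the sense of distributions. This distribution is well-defined (Section 28.4.3) and is denoted by $\operatorname{pv}(\frac{1}{x})$, where pv stands for "principal value." But what is the value of $(\operatorname{pv}(\frac{1}{x}), \varphi)$?
We have
$$(T_F', \varphi) = -(T_F, \varphi') = -\int_{\mathbb{R}} \ln(|x|)\varphi'(x)\,dx.$$
This integral can be written as a limit,
$$-\int_{\mathbb{R}} \ln(|x|)\varphi'(x)\,dx = \lim_{\varepsilon \to 0} J_\varepsilon,$$
with
$$J_\varepsilon = -\int_{-\infty}^{-\varepsilon} \ln(|x|)\varphi'(x)\,dx - \int_{\varepsilon}^{+\infty} \ln(|x|)\varphi'(x)\,dx.$$
Integrating by parts gives
$$J_\varepsilon = [\varphi(\varepsilon) - \varphi(-\varepsilon)]\ln(\varepsilon) + \int_{|x| \ge \varepsilon} \frac{\varphi(x)}{x}\,dx.$$
The mean value theorem, $\varphi(\varepsilon) - \varphi(-\varepsilon) = 2\varepsilon\varphi'(c)$ for some $c \in (-\varepsilon, \varepsilon)$, shows that the integrated term tends to 0 as $\varepsilon \to 0$. Thus we see that
$$\Bigl(\operatorname{pv}\bigl(\tfrac{1}{x}\bigr), \varphi\Bigr) = \lim_{\varepsilon \to 0} \int_{|x| \ge \varepsilon} \frac{\varphi(x)}{x}\,dx = \lim_{\varepsilon \to 0} \int_{\varepsilon}^{+\infty} \frac{\varphi(x) - \varphi(-x)}{x}\,dx. \tag{28.11}$$
Note that the integral
$$\int_{\mathbb{R}} \frac{\varphi(x)}{x}\,dx$$
does not exist in general, but taking symmetric limits ($-\varepsilon$ and $\varepsilon$) around the origin guarantees the existence of the limit (28.11) as $\varepsilon \to 0$. This is a particular case of what is called the "principal value" of an integral.
One easily deduces from (28.11) that
$$x\operatorname{pv}\bigl(\tfrac{1}{x}\bigr) = 1. \tag{28.12}$$
Indeed, by using (28.11) again, we see that
$$\Bigl(x\operatorname{pv}\bigl(\tfrac{1}{x}\bigr), \varphi\Bigr) = \Bigl(\operatorname{pv}\bigl(\tfrac{1}{x}\bigr), x\varphi\Bigr) = \lim_{\varepsilon \to 0} \int_{|x| \ge \varepsilon} \varphi(x)\,dx = \int_{\mathbb{R}} \varphi = (1, \varphi).$$
Finally, we repeat the defining relation:
$$\operatorname{pv}\bigl(\tfrac{1}{x}\bigr) = (\ln|x|)'. \tag{28.13}$$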
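The two expressions for the principal value, the defining one $-\int \ln(|x|)\varphi'(x)\,dx$ and the symmetric form in (28.11), can be compared numerically. The Python sketch below does this for an assumed test function, a bump shifted to be centered at $0.3$ so that the pairing is not trivially zero. The midpoint rule is an illustrative choice; it converges only at first order near the logarithmic singularity, so the agreement is checked to a loose tolerance.

```python
import math

def bump(x):
    """Smooth function supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def bump_prime(x):
    """Derivative of bump, computed by the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def midpoint(f, lo, hi, n=100000):
    """Composite midpoint rule; the nodes never coincide with lo or hi."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

c = 0.3                                  # assumed shift; support is [-0.7, 1.3]
phi = lambda x: bump(x - c)
dphi = lambda x: bump_prime(x - c)

# Symmetric form of (28.11): integral over (0, +inf) of (phi(x) - phi(-x)) / x.
pv_sym = midpoint(lambda x: (phi(x) - phi(-x)) / x, 0.0, 1.3)

# Defining form: -(integral of ln|x| * phi'(x)), split at the singularity x = 0.
log_term = lambda x: math.log(abs(x)) * dphi(x)
pv_log = -(midpoint(log_term, -0.7, 0.0) + midpoint(log_term, 0.0, 1.3))

print(pv_sym, pv_log)
```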
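28.5.2 The distribution $\operatorname{fp}(\frac{1}{x^2})$

Here is a second example, which will give a better idea about the generality of the technique used in the last section. This time the function that is not locally integrable is $g(x) = 1/x^2$. The associated distribution, which remains to be defined, will be denoted by $\operatorname{fp}(\frac{1}{x^2})$, where fp stands for "finite part." There is no sense in writing
$$\Bigl(\operatorname{fp}\bigl(\tfrac{1}{x^2}\bigr), \varphi\Bigr) = \lim_{\varepsilon \to 0} \int_{|x| \ge \varepsilon} \frac{\varphi(x)}{x^2}\,dx,$$
since this time the limit generally does not exist. Again, the plan is to consider a primitive of $g$, but of a higher order. The function $F(x) = \ln(|x|)$ is, up to a sign, a second primitive of $g$, so we define
$$\operatorname{fp}\bigl(\tfrac{1}{x^2}\bigr) = -F'',$$
where the derivative is taken in the sense of distributions. The new distribution is thus the negative of the derivative of $\operatorname{pv}(\frac{1}{x})$:
$$\operatorname{fp}\bigl(\tfrac{1}{x^2}\bigr) = -\Bigl(\operatorname{pv}\bigl(\tfrac{1}{x}\bigr)\Bigr)'. \tag{28.14}$$
This and (28.11) show that
$$\Bigl(\operatorname{fp}\bigl(\tfrac{1}{x^2}\bigr), \varphi\Bigr) = \lim_{\varepsilon \to 0} \int_{\varepsilon}^{+\infty} \frac{\varphi'(x) - \varphi'(-x)}{x}\,dx.$$
Integrating by parts, we obtain
$$\Bigl(\operatorname{fp}\bigl(\tfrac{1}{x^2}\bigr), \varphi\Bigr) = \lim_{\varepsilon \to 0} \Bigl\{ -\frac{1}{\varepsilon}[\varphi(\varepsilon) + \varphi(-\varepsilon) - 2\varphi(0)] + \int_{\varepsilon}^{+\infty} \frac{\varphi(x) + \varphi(-x) - 2\varphi(0)}{x^2}\,dx \Bigr\}.$$
It is not difficult to see that the integrated term tends to 0 as $\varepsilon \to 0$, and we are left with the formula
$$\Bigl(\operatorname{fp}\bigl(\tfrac{1}{x^2}\bigr), \varphi\Bigr) = \lim_{\varepsilon \to 0} \int_{\varepsilon}^{+\infty} \frac{\varphi(x) + \varphi(-x) - 2\varphi(0)}{x^2}\,dx. \tag{28.15}$$
As before, one easily shows that
$$x^2\operatorname{fp}\bigl(\tfrac{1}{x^2}\bigr) = 1. \tag{28.16}$$
In summary:
$$\Bigl(\operatorname{pv}\bigl(\tfrac{1}{x}\bigr), \varphi\Bigr) = -\int_{\mathbb{R}} \ln|x|\,\varphi'(x)\,dx = \lim_{\varepsilon \to 0} \int_{\varepsilon}^{+\infty} \frac{\varphi(x) - \varphi(-x)}{x}\,dx,$$
$$\Bigl(\operatorname{fp}\bigl(\tfrac{1}{x^2}\bigr), \varphi\Bigr) = -\int_{\mathbb{R}} \ln|x|\,\varphi''(x)\,dx = \lim_{\varepsilon \to 0} \int_{\varepsilon}^{+\infty} \frac{\varphi(x) + \varphi(-x) - 2\varphi(0)}{x^2}\,dx.$$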
28.6 Exercises
Exercise 28.1
(a) Compute the successive derivatives in the sense of distributions of the following functions:
$$u(x); \quad f(x) = x^k u(x),\ k \in \mathbb{N}^*; \quad g(x) = |x|.$$
(b) Compute the first two derivatives in the sense of distributions of the functions
$$h(x) = |\sin x| \quad\text{and}\quad k(x) = u(x)\sin x.$$
Exercise 28.2 What is the parity of $\delta^{(n)}$?
Exercise 28.3 Suppose $T \in \mathcal{D}'(\mathbb{R})$ and $\alpha \in C^\infty(\mathbb{R})$. Prove the following results:
$$(\alpha T)' = \alpha' T + \alpha T',$$
$$(\alpha T)^{(n)} = \sum_{k=0}^{n} \binom{n}{k} \alpha^{(k)} T^{(n-k)}.$$
Exercise 28.4 Compare $(T')_\sigma$ and $(T_\sigma)'$. Compute $(\delta')_\sigma$ and $(\delta_\sigma)'$.
Exercise 28.5 Suppose $f \in L^1_{\mathrm{loc}}(\mathbb{R})$. Show that $F(x) = \int_a^x f(t)\,dt$, $a \in \mathbb{R}$, has $f$ as its derivative in the sense of distributions.
Exercise 28.6 Compute $x\delta'$, $x^2\delta'$, and $x\delta''$.
Exercise 28.7 For $f : \mathbb{R} \to \mathbb{C}$ measurable and $\lambda \in \mathbb{R}^*$, the operator $h_\lambda$ is defined by
$$h_\lambda f(x) = f(\lambda x), \quad x \in \mathbb{R}.$$
Extend this definition to distributions.
Exercise 28.8
(a) Define the conjugate $\overline{T}$ of a distribution $T \in \mathcal{D}'(\mathbb{R})$.
(b) Define $\operatorname{Re}(T)$ and $\operatorname{Im}(T)$. Characterize a distribution as being real or imaginary.
(c) Verify that $\delta_a$, $a \in \mathbb{R}$, and $\operatorname{pv}(\frac{1}{x})$ are real.
(d) Prove that the derivative of a real distribution is real.
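Exercise 28.9 Prove that $x\operatorname{fp}(\frac{1}{x^2}) = \operatorname{pv}(\frac{1}{x})$ in two different ways.
*Exercise 28.10 Prove the equality
$$(e^x - 1)\operatorname{pv}\bigl(\tfrac{1}{x}\bigr) = \frac{e^x - 1}{x}.$$
How can this be generalized? Deduce the decomposition
$$\operatorname{pv}\bigl(\tfrac{1}{x}\bigr) = S + f,$$
where $S$ is a distribution with bounded support and $f \in L^2(\mathbb{R})$.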
*Exercise 28.11 (solution of $xT = 0$, $T \in \mathcal{D}'(\mathbb{R})$)

(a) Show that a function $\chi \in \mathcal{D}(\mathbb{R})$ satisfies $\chi(0) = 0$ if and only if $\chi = x\psi$ with $\psi \in \mathcal{D}(\mathbb{R})$.
(b) Take $\theta \in \mathcal{D}(\mathbb{R})$ such that $\theta(0) = 1$. Show that for $\varphi \in \mathcal{D}(\mathbb{R})$ there exists a $\psi \in \mathcal{D}(\mathbb{R})$ such that $\varphi = \varphi(0)\theta + x\psi$.
(c) Deduce from this that $T \in \mathcal{D}'(\mathbb{R})$ is a solution of $xT = 0$ if and only if $T$ is of the form $K\delta$ with $K \in \mathbb{C}$.
(d) Give an example of a distribution $T$ such that $\operatorname{supp}(T) \subset \{0\}$ and $xT \neq 0$.
Exercise 28.12 (solution of $xT = 1$, $T \in \mathcal{D}'(\mathbb{R})$) Deduce from Exercise 28.11 and equation (28.13) that $T \in \mathcal{D}'(\mathbb{R})$ is a solution of the equation $xT = 1$ if and only if $T$ is of the form $\operatorname{pv}(\frac{1}{x}) + K\delta$ with $K \in \mathbb{C}$.
Exercise 28.13 Solve the equations $x^2 T = 0$ and $x^2 T = 1$, $T \in \mathcal{D}'(\mathbb{R})$.

Hint: Use the methods of Exercises 28.11-28.12 and equation (28.17).
Exercise 28.14 (solution of $xT = U$; $T, U \in \mathcal{D}'(\mathbb{R})$)

(1) Suppose $U \in \mathcal{D}'(\mathbb{R})$ and $0 \notin \operatorname{supp}(U)$.
(a) Show that there exists an $\alpha > 0$ such that $\operatorname{supp}(U) \cap (-\alpha, \alpha) = \emptyset$.
(b) Show that the equation $xT = U$ has at least one solution $T_0 \in \mathcal{D}'(\mathbb{R})$.
(c) Give the general form of all solutions $T \in \mathcal{D}'(\mathbb{R})$.
(d) Find all distributions $T \in \mathcal{D}'(\mathbb{R})$ such that
$$xT = \sum_{k=-\infty,\ k \neq 0}^{+\infty} \delta_k.$$
(2) Show that one can find $T, U \in \mathcal{D}'(\mathbb{R})$ such that $0 \in \operatorname{supp}(U)$ and $xT = U$.
Exercise 28.15 Take $b \in L^1_{\mathrm{loc}}(\mathbb{R})$ and define $a$ by $a(x) = xb(x)$.
(a) Solve the equation $xT = a$, $T \in \mathcal{D}'(\mathbb{R})$.
(b) Apply this to $a(x) = \sin x$.
Hint: For (a), use Exercise 28.11(c) to show that $T = b + K\delta$, $K \in \mathbb{C}$.
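Exercise 28.16 Assume that $a \in L^1_{\mathrm{loc}}(\mathbb{R})$ is such that the function $m$ defined by
$$m(x) = \frac{a(x) - a(0)}{x}$$
is also in $L^1_{\mathrm{loc}}(\mathbb{R})$.

(a) Solve the equation $xT = a$, $T \in \mathcal{D}'(\mathbb{R})$.
(b) Apply this to $a(x) = e^x$.
Hint: Use the relation $x\operatorname{pv}(\frac{1}{x}) = 1$ to transform the equation and reduce it to the case in Exercise 28.15.
Exercise 28.17
(a) Assume that $\varphi \in \mathcal{D}(\mathbb{R})$ and that there exists $\alpha > 0$ such that $\varphi(x) = 0$ for all $x \in [-\alpha, \alpha]$. Deduce from this that for each $k \in \mathbb{N}$ there exists $\psi \in \mathcal{D}(\mathbb{R})$ such that $\varphi(x) = x^k\psi(x)$ for all $x$.
(b) Show that
$$x^n T = 0 \text{ for some } n \in \mathbb{N} \implies \operatorname{supp}(T) \subset \{0\}.$$
Hint: If $\varphi \in \mathcal{D}(\mathbb{R})$ is such that $\operatorname{supp}(\varphi) \subset \mathbb{R}\setminus\{0\}$, by (a) there exists $\psi \in \mathcal{D}(\mathbb{R})$ such that $\varphi(x) = x^n\psi(x)$ for all $x$. Then $(T, \varphi) = (T, x^n\psi) = (x^n T, \psi) = 0$, and hence $\operatorname{supp}(T) \subset \{0\}$.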
**Exercise 28.18 Let $g \in C^\infty(\mathbb{R})$ and let its derivative be odd and bounded.
(a) For $x < 0$, show that the integral
$$\int_A^x \frac{g'(t)}{t}\,dt, \quad A < 0,$$
has a finite limit as $A \to -\infty$. Let $f$ be the even function defined for $x < 0$ by
$$f(x) = \int_{-\infty}^{x} \frac{g'(t)}{t}\,dt = \lim_{A \to -\infty} \int_A^x \frac{g'(t)}{t}\,dt.$$
(b) Show that
$$|f(x)| \le C(1 - \log|x|) \quad \text{for } 0 < |x| \le 1.$$
Deduce that $f \in L^1_{\mathrm{loc}}(\mathbb{R})$.
(c) Show that the derivative of $f$ in the sense of distributions is
$$f'(x) = g'(x)\operatorname{pv}\bigl(\tfrac{1}{x}\bigr).$$
Lesson 29
Convergence of a Sequence of
Distributions
We mentioned in Section 26.4 that an essential property of derivation in the sense of distributions is its continuity:
$$T_n \to T \implies T_n' \to T'.$$
We saw in Section 26.2 that from the point of view of physics, the impulse is a limit. For these and other reasons it is important to investigate the notion of limit in $\mathcal{D}'$.
29.1 The limit of a sequence of distributions

The limit of a sequence of distributions is simply the "pointwise limit" where the "points" are test functions.
29.1.1 Definition A sequence of distributions $(T_n)_{n \in \mathbb{N}}$ is said to converge to the distribution $T$ if
$$(T_n, \varphi) \to (T, \varphi)$$
for all $\varphi \in \mathcal{D}$.
29.1.2 Examples
(a) If the real sequence $(a_n)_{n \in \mathbb{N}}$ converges to $a$, then $\delta_{a_n}$ converges to $\delta_a$, since $\varphi(a_n) \to \varphi(a)$ for all $\varphi$ in $\mathcal{D}$.
(b) Consider the sequence of functions $(f_n)_{n \in \mathbb{N}}$ defined by
$$f_n(x) = \sin 2\pi nx.$$
For fixed $x$ (not an integer or half-integer), the sequence $f_n(x)$ does not converge as $n \to +\infty$. Thus the sequence of functions $(f_n)$ does not tend pointwise to a limit. We know, however, from the Riemann-Lebesgue theorem that
$$\int_{\mathbb{R}} \varphi(x)\sin 2\pi nx\,dx \to 0$$
for all $\varphi \in \mathcal{D}$. We conclude that the sequence $(f_n)$ tends to 0 in $\mathcal{D}'$, and we have the following negative result:
$$f_n \to f \text{ in } \mathcal{D}' \ \not\Longrightarrow\ f_n(x) \to f(x) \text{ for a.e. } x \in \mathbb{R}.$$
Finally, we wish to emphasize the convergence of a sequence of functions in the sense of distributions:
$$f_n \to f \text{ in } \mathcal{D}' \iff \int f_n\varphi \to \int f\varphi \text{ for all } \varphi \in \mathcal{D}. \tag{29.1}$$
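Example (b) can be observed numerically. The Python sketch below pairs $f_n(x) = \sin 2\pi nx$ with one assumed test function (a bump centered at $0.3$, chosen so that the pairing is not zero by symmetry) and prints the pairings for increasing $n$. The test function and the midpoint rule are illustrative choices.

```python
import math

def bump(x):
    """Smooth function supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def midpoint(f, lo, hi, n=20000):
    """Composite midpoint rule on [lo, hi] with n panels."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

phi = lambda x: bump(x - 0.3)  # assumed test function, support [-0.7, 1.3]

# <f_n, phi> = integral of phi(x) * sin(2 pi n x) dx for a few values of n.
values = {}
for n in (1, 2, 4, 8, 16):
    values[n] = midpoint(
        lambda x, n=n: phi(x) * math.sin(2.0 * math.pi * n * x), -0.7, 1.3
    )

for n, v in values.items():
    print(n, v)
```

The pairings shrink toward 0 even though $\sin 2\pi nx$ itself keeps oscillating with amplitude 1, which is the content of (29.1) for this sequence.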
29.1.3 Theorem (continuity of derivation) If the sequence of distributions $(T_n)_{n \in \mathbb{N}}$ converges to the distribution $T$, then the sequence of derivatives $(T_n')_{n \in \mathbb{N}}$ converges to $T'$.
The proof is read directly from Definition 29.1.1 and the definition of the derivative (28.6). This theorem formalizes our statement in Section 26.4.
29.2 Revisiting Dirac's impulse

In Lesson 26, physical reasoning led us to consider an impulse as the limit as $n \to +\infty$ of a family of functions like this (see Figure 29.1):
$$f_n(x) = \begin{cases} \dfrac{n}{2} & \text{if } |x| \le \dfrac{1}{n}, \\ 0 & \text{otherwise.} \end{cases}$$
We wish to see whether this sequence has a limit in the sense of distributions. For $\varphi \in \mathcal{D}$, the mean value theorem implies that
$$(f_n, \varphi) = \frac{n}{2}\int_{-1/n}^{1/n} \varphi(x)\,dx = \varphi(c_n)$$
for some $c_n$, $-1/n < c_n < 1/n$. As $n \to +\infty$, $\varphi(c_n) \to \varphi(0)$; hence,
$$(f_n, \varphi) \to \varphi(0) = (\delta, \varphi)$$
for all $\varphi \in \mathcal{D}$. This means that
$$f_n \to \delta \text{ in } \mathcal{D}'.$$
Although we have taken a particular form for the functions $f_n$, this argument is easily generalized to more general sequences, and it shows that the point distribution at the origin is the model of the impulse that we sought in Lesson 26.
FIGURE 29.1. The impulse as the limit of functions $f_n$.
The distribution $\delta$ models the unit impulse at the origin.

This argument justifies our calling $\delta$ the Dirac unit impulse (or mass) at the origin, and it agrees with the fact that $\delta$ is the derivative of Heaviside's unit step function (Sections 26.2 and 26.3).
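The convergence $(f_n, \varphi) \to \varphi(0)$ can be watched numerically. The Python sketch below evaluates $\frac{n}{2}\int_{-1/n}^{1/n} \varphi(x)\,dx$ for an assumed bump test function and a few values of $n$, and records the distance to $\varphi(0)$. The test function and the midpoint rule are illustrative choices.

```python
import math

def bump(x):
    """Smooth function supported in [-1, 1] (an assumed sample test function)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def midpoint(f, lo, hi, n=2000):
    """Composite midpoint rule on [lo, hi] with n panels."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def pairing(n, phi):
    """<f_n, phi> with f_n = n/2 on [-1/n, 1/n] and 0 elsewhere."""
    return 0.5 * n * midpoint(phi, -1.0 / n, 1.0 / n)

ns = (1, 10, 100, 1000)
errors = [abs(pairing(n, bump) - bump(0.0)) for n in ns]
for n, err in zip(ns, errors):
    print(n, err)
```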
29.3 Relations with the convergence of functions

The convergence of a sequence of functions in the sense of distributions is generally "weaker" than the notions of convergence we have previously encountered. Here are several results (not an exhaustive list) that allow us to conclude convergence in the sense of distributions.
29.3.1 Proposition Let $(f_n)_{n \in \mathbb{N}}$ be a sequence of integrable functions that converges to $f$ in $L^1(\mathbb{R})$. Then $(f_n)$ converges to $f$ in the sense of distributions.
Proof. This is a consequence of the following inequalities:
$$\Bigl|\int f_n\varphi - \int f\varphi\Bigr| \le \int |f_n - f|\,|\varphi| \le \|\varphi\|_\infty \|f_n - f\|_1. \qquad \Box$$
29.3.2 Proposition If a sequence $(f_n)_{n \in \mathbb{N}}$ of square-integrable functions converges in $L^2(\mathbb{R})$ to a function $f \in L^2(\mathbb{R})$, then $(f_n)$ converges to $f$ in the sense of distributions.
Proof. This is a consequence of Schwarz's inequality:
$$\Bigl|\int (f_n - f)\varphi\Bigr| \le \|f_n - f\|_2 \|\varphi\|_2.$$
29.3.3 Proposition Let $(f_n)_{n \in \mathbb{N}}$ be a sequence of measurable functions such that $f_n(x) \to f(x)$ almost everywhere. If there is a function $g \in L^1_{\mathrm{loc}}(\mathbb{R})$ such that for all $n \in \mathbb{N}$
$$|f_n(x)| \le g(x) \text{ for a.e. } x,$$
then $(f_n)$ tends to $f$ in the sense of distributions.
This proposition is a direct consequence of the theorem on dominated convergence.
29.3.4 Proposition Suppose $(f_n)_{n \in \mathbb{N}}$ is a sequence of functions in $L^1_{\mathrm{loc}}(\mathbb{R})$ that converges to $f$ uniformly on every bounded interval. Then $(f_n)$ tends to $f$ in the sense of distributions.
29.4 Applications to the convergence of trigonometric series

Consider an arbitrary trigonometric series
$$\sum_{n=-\infty}^{+\infty} \alpha_n e^{2i\pi n\frac{x}{a}}, \tag{29.2}$$
where the $\alpha_n$ do not necessarily tend to 0. This series is not, a priori, the Fourier series of a function, and furthermore, it is not, in general, pointwise convergent. Nevertheless, this series converges in $\mathcal{D}'$ under rather general conditions and thus defines a distribution. Recall that the series (29.2) converges if the sums
$$f_N(x) = \sum_{n=-N}^{N} \alpha_n e^{2i\pi n\frac{x}{a}}$$
converge as $N \to +\infty$.
29.4.1 Definition A complex sequence $(\alpha_n)_{n \in \mathbb{Z}}$ is said to be slowly increasing if there exist a positive constant $A$ and an integer $k$ such that
$$|\alpha_n| \le A|n|^k \tag{29.3}$$
for all sufficiently large $|n|$.
This is equivalent to saying there exist a constant $C$ and an integer $k \in \mathbb{N}$ such that
$$|\alpha_n| \le C(1 + |n|)^k \tag{29.4}$$
for all $n \in \mathbb{Z}$.
29.4.2 Theorem If the sequence $(\alpha_n)$ is slowly increasing, then the series (29.2) converges in $\mathcal{D}'$ to a periodic distribution with period $a$.
Proof. Let $k$ be the integer in (29.3) and consider the series
$$\sum_{n \neq 0} n^{-k-2}\alpha_n e^{2i\pi n\frac{x}{a}}. \tag{29.5}$$
By hypothesis, the modulus of the general term is dominated by $A/n^2$ for all sufficiently large $|n|$. Thus the series converges uniformly on $\mathbb{R}$ to a continuous function $F$:
$$F(x) = \sum_{n \neq 0} n^{-k-2}\alpha_n e^{2i\pi n\frac{x}{a}}.$$
From Proposition 29.3.4 we have convergence in $\mathcal{D}'$, and by continuity of derivation, the series differentiated $k + 2$ times tends to $F^{(k+2)}$ in the sense of distributions:
$$\sum_{\substack{n=-N \\ n \neq 0}}^{N} \Bigl(\frac{2i\pi}{a}\Bigr)^{k+2} \alpha_n e^{2i\pi n\frac{x}{a}} \to F^{(k+2)}$$
as $N \to +\infty$. This proves that the series (29.2) converges in the sense of distributions to the distribution
$$T = \Bigl(\frac{a}{2i\pi}\Bigr)^{k+2} F^{(k+2)} + \alpha_0. \qquad \Box$$
We will see in Proposition 36.1.3 that (29.2) converges in $\mathcal{D}'$ to $f$ when it is the Fourier series of a function $f \in L^1_{\mathrm{loc}}(0, a)$.
REMARKS:
Note that the technique used here, going to a higher-order primitive and then descending by differentiation in the sense of distributions, is the same as we used to define $\operatorname{pv}(\frac{1}{x})$ and $\operatorname{fp}(\frac{1}{x^2})$.
The general term of the series (29.2) need not tend to 0 in the sense of functions, but it evidently tends to 0 in $\mathcal{D}'$.
29.4.3 Theorem (term-by-term differentiation)

Let $(u_n)_{n \in \mathbb{N}}$ be a sequence of functions that are absolutely continuous on bounded intervals. Assume that the series
$$\sum_{n=0}^{+\infty} u_n$$
converges in $\mathcal{D}'$ to a locally integrable function $f$. Then the differentiated series
$$\sum_{n=0}^{+\infty} u_n'$$
converges in $\mathcal{D}'$ to the derivative $f'$ (in the sense of distributions) of $f$.
The proof is a direct application of the continuity of derivation.
29.5 The Fourier series of Dirac's comb

29.5.1 Given a real positive number $a$, Dirac's comb (Figure 29.2) with mesh $a$ is the distribution defined by (Section 27.3.4)
$$\Delta_a = \sum_{n=-\infty}^{+\infty} \delta_{na}.$$
Thus
$$(\Delta_a, \varphi) = \sum_{n=-\infty}^{+\infty} \varphi(na)$$
for all $\varphi \in \mathcal{D}$, and it is clear that $\Delta_a$ is periodic with period $a$.
29.5.2 The product $f\Delta_a$ is well-defined whenever $f \in C^\infty$ (Definition 28.3.1), and
$$(f\Delta_a, \varphi) = (\Delta_a, f\varphi) = \sum_{n=-\infty}^{+\infty} f(na)\varphi(na).$$
FIGURE 29.2. Dirac's comb.
FIGURE 29.3. Sampling $f$.
FIGURE 29.4. Approximating $f$.
Written as distributions,
$$f\Delta_a = \sum_{n=-\infty}^{+\infty} f(na)\delta_{na}.$$
The right-hand side is a sequence of impulses that represents the sampling of the signal $f$ every $a$ "seconds." (See Figure 29.3.) From the point of view of approximating $f$ by a sequence of Dirac masses and letting $a$ tend to zero, it is in fact the sequence of impulses $af\Delta_a$ that approximates $f$ by step functions (Figure 29.4). In the sense of distributions,
$$f \approx \sum_{n=-\infty}^{+\infty} f(na)\,\tau_{na}\chi_{[-\frac{a}{2}, \frac{a}{2})} \approx a\sum_{n=-\infty}^{+\infty} f(na)\delta_{na}, \quad\text{or}\quad f \approx af\Delta_a.$$
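The approximation $f \approx af\Delta_a$ says that $a\sum_n f(na)\varphi(na)$ approaches $\int f\varphi$ as $a \to 0$, which is a Riemann sum. The Python sketch below checks this for the assumed choices $f = \cos$ and a bump test function; the function names and the reference quadrature are illustrative.

```python
import math

def bump(x):
    """Smooth function supported in [-1, 1] (an assumed sample test function)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def midpoint(f, lo, hi, n=20000):
    """Composite midpoint rule on [lo, hi] with n panels."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def sampled_pairing(f, phi, a):
    """<a f Delta_a, phi> = a * sum over n of f(na) phi(na).

    phi vanishes for |x| >= 1, so only finitely many n contribute.
    """
    n_max = math.ceil(1.0 / a) + 1
    return a * sum(f(n * a) * phi(n * a) for n in range(-n_max, n_max + 1))

# Reference value of <f, phi> = integral of cos(x) * phi(x) dx.
exact = midpoint(lambda x: math.cos(x) * bump(x), -1.0, 1.0)

meshes = (0.5, 0.1, 0.01)
errors = [abs(sampled_pairing(math.cos, bump, a) - exact) for a in meshes]
for a, err in zip(meshes, errors):
    print(a, err)
```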
29.5.3 We will show that $\Delta_a$ can be expressed as a trigonometric series just as for a periodic function. We first develop the Fourier series of the function with period $a$ defined on $(0, a)$ by $f(x) = x/a$ (Figure 29.5).
FIGURE 29.5.
An easy computation shows that
$$f(x) = \frac{1}{2} + \frac{i}{2\pi}\sum_{n \neq 0} \frac{1}{n}\,e^{2i\pi n\frac{x}{a}}.$$
This series converges in the norm of $L^2(0, a)$ (Proposition 16.3.4). By an argument similar to that used in Proposition 29.3.2, this series also converges in $\mathcal{D}'$; by continuity of derivation, we can differentiate term by term; hence
$$f'(x) = -\frac{1}{a}\sum_{n \neq 0} e^{2i\pi n\frac{x}{a}}.$$
On the other hand, from Section 28.4.5,
$$f' = \frac{1}{a} - \sum_{n=-\infty}^{+\infty} \delta_{na},$$
and we see that
$$\Delta_a = \sum_{n=-\infty}^{+\infty} \delta_{na} = \frac{1}{a}\sum_{n=-\infty}^{+\infty} e^{2i\pi n\frac{x}{a}}. \tag{29.6}$$
The series on the right diverges in the sense of functions, but it converges in the sense of distributions to Dirac's comb. This series is the Fourier series expansion of Dirac's comb, and it illustrates the general situation described in Section 29.4.
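Formula (29.6) can be tested against a single test function: pairing the partial sums of the right-hand series with $\varphi$ should approach $\sum_n \varphi(na)$. The Python sketch below does this for the assumed values $a = 0.5$ and a bump centered at $0.1$. Because the terms for $n$ and $-n$ are complex conjugates, only cosine integrals are needed. All names and parameter values are illustrative.

```python
import math

def bump(x):
    """Smooth function supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def midpoint(f, lo, hi, n=4000):
    """Composite midpoint rule on [lo, hi] with n panels."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

a = 0.5                          # assumed mesh of the comb
phi = lambda x: bump(x - 0.1)    # assumed test function, support [-0.9, 1.1]

def fourier_pairing(N):
    """Pairing of (1/a) * sum_{|n| <= N} exp(2 i pi n x / a) with phi.

    The terms n and -n are conjugate, so their sum is twice a cosine integral.
    """
    total = midpoint(phi, -0.9, 1.1)
    for n in range(1, N + 1):
        total += 2.0 * midpoint(
            lambda x: phi(x) * math.cos(2.0 * math.pi * n * x / a), -0.9, 1.1
        )
    return total / a

# <Delta_a, phi> = sum of phi(na); only n = -1, 0, 1, 2 fall inside the support.
comb = sum(phi(n * a) for n in range(-4, 5))
approx = fourier_pairing(20)
print(comb, approx)
```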
29.5.1 Remark The proof we used for the term-by-term differentiation of the Fourier series applies to any $a$-periodic function $f$ that is square integrable over a period.
29.5.2 Proposition The Fourier series of a function in $L^2_p(0, a)$ can be differentiated term by term in the sense of distributions.
29.6 Exercises
Exercise 29.1 Let $T$ be a distribution and let $(h_n)$ be a sequence of real numbers that tends to 0 ($h_n \neq 0$). Show that $\frac{1}{h_n}(T - \tau_{h_n}T) \to T'$ in $\mathcal{D}'(\mathbb{R})$ as $n \to +\infty$.
Exercise 29.2 Show that every sequence of functions $(f_n)$ that satisfies the following three conditions converges to $\delta$ as $n \to +\infty$:
$f_n \in L^1(\mathbb{R})$ and $\int_{\mathbb{R}} f_n(x)\,dx = 1$.
$f_n \ge 0$ a.e.
$f_n = 0$ a.e. on $(-\infty, -1/n) \cup (1/n, +\infty)$.
*Exercise 29.3 Let $(f_n)_{n \in \mathbb{N}}$ be the sequence of functions defined by
$$f_n(x) = \frac{\sin nx}{x} \quad\text{for } x \neq 0.$$
(a) Show that $(f_n)_{n \in \mathbb{N}}$ does not converge pointwise.
(b) Using Exercise 27.2, show that $f_n$ converges in $\mathcal{D}'(\mathbb{R})$ to $K\delta$, where $K$ is a constant that can be computed from the property
$$\lim_{x \to \infty} \int_0^x \frac{\sin t}{t}\,dt = \frac{\pi}{2}.$$
Exercise 29.4 Given $a \in \mathbb{R}$, consider the sequence of distributions
$$T_n = \sum_{k=-\infty}^{\infty} \lambda_k(n)\delta_{ka}.$$
Show that $T_n$ tends to 0 in $\mathcal{D}'(\mathbb{R})$ if and only if for all $k \in \mathbb{Z}$,
$$\lim_{n \to \infty} \lambda_k(n) = 0.$$
Exercise 29.5 Prove Proposition 29.3.4.
Exercise 29.6 Assume that $f : \mathbb{R} \to \mathbb{C}$ is piecewise continuous on all closed, bounded intervals.
(a) Prove that the sequence of distributions
n=-oo
tends to f in the sense of distributions as N --+ +oo.

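(b) Show that this is also true for the sequence
$$s_N = \frac{1}{N}\sum_{n=-N^2}^{N^2} f\Bigl(\frac{n}{N}\Bigr)\delta_{\frac{n}{N}}.$$
Exercise 29.7 Prove that
$$\sum_{n=-\infty}^{+\infty} (-1)^n e^{i\pi nx} = 2\sum_{n=-\infty}^{+\infty} \delta_{2n+1}.$$
Hint: Use the Fourier series expansion of the function with period 2 defined on $(-1, 1)$ by $f(x) = x$ (see Section 29.5.3).
Exercise 29.8 Show that the sequence of functions $f_n(x) = ne^{-\pi n^2x^2}$ tends to $\delta$ as $n \to +\infty$.
Hint: First compute $\int_{\mathbb{R}} f_n(x)\,dx$ knowing that $\int_{\mathbb{R}} e^{-\pi x^2}\,dx = 1$; then show that
$$\lim_{n \to +\infty} \int_{\mathbb{R}} f_n(x)(\varphi(x) - \varphi(0))\,dx = 0 \quad\text{for all } \varphi \in \mathcal{D}(\mathbb{R}).$$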
Exercise 29.9 Let $(a_n)_{n \in \mathbb{Z}}$ be a real sequence and let $(\lambda_n)_{n \in \mathbb{Z}}$ be a complex sequence. Give a sufficient condition on $(a_n)$ that implies that
n=-oo
is a distribution.
Exercise 29.10 Prove directly that the sequence
where $(\alpha_n)$ is slowly increasing, tends to 0 in $\mathcal{D}'(\mathbb{R})$.

Lesson 30
Primitives of a Distribution
When studying physical systems governed by differential equations, physicists often consider the derivatives to be taken in the sense of distributions. This is necessary, for example, when the inputs are discontinuous. This leads one to define generalized solutions of differential equations in terms of distributions.
30.1 Distributions whose derivatives are zero

The question of whether a distribution has a primitive reduces to the following problem:
Given $T \in \mathscr{D}'$, find $U \in \mathscr{D}'$ such that $U' = T$.

We will see that the results for distributions are the same as those for functions. We first consider the case $T = 0$.
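30.1.1 Theorem The derivative of a distribution $U$ is the zero element of $\mathscr{D}'$ if and only if $U$ is a constant; that is, if and only if $U(\varphi) = K\int_{\mathbb{R}}\varphi$ for some constant $K$.
Proof. Assume that $U' = 0$. Then
$$\langle U, \varphi'\rangle = 0$$
for all $\varphi \in \mathscr{D}$.
If $\varphi \in \mathscr{D}$, then $\varphi' \in \mathscr{D}$ and $\int_{\mathbb{R}}\varphi' = 0$. Conversely, if $\psi \in \mathscr{D}$ and $\int_{\mathbb{R}}\psi = 0$, then the function
$$\varphi(x) = \int_{-\infty}^{x}\psi(t)\,dt$$
is in $\mathscr{D}$ and is a primitive of $\psi$. This prompts us to define
$$\mathscr{D}_0 = \Big\{\psi \in \mathscr{D} : \int_{\mathbb{R}}\psi = 0\Big\}.$$
Then for all $\psi \in \mathscr{D}_0$,
$$\langle U, \psi\rangle = 0.$$
Let $\theta$ be an arbitrary but fixed function in $\mathscr{D}$ such that
$$\int_{\mathbb{R}}\theta = 1 \quad\text{and}\quad \theta(x) = 0 \ \text{ if } |x| \ge 1. \tag{30.1}$$
For $\varphi \in \mathscr{D}$, define $\psi_\varphi = \varphi - I(\varphi)\theta$, where $I(\varphi) = \int_{\mathbb{R}}\varphi$. Then
$$\varphi = \psi_\varphi + I(\varphi)\theta, \tag{30.2}$$
and $\psi_\varphi \in \mathscr{D}_0$. Applying $U$ to (30.2), we have
$$\langle U, \varphi\rangle = I(\varphi)\,\langle U, \theta\rangle.$$
This shows that $U$ is the constant distribution
$$\langle U, \varphi\rangle = K\int_{\mathbb{R}}\varphi$$
with $K = \langle U, \theta\rangle$. Conversely, if $U$ is a constant, it is clear that $U' = 0$. □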
30.1.2 Remark Let $f$ be a continuous function on $\mathbb{R}$ whose derivative in the sense of distributions is also a continuous function. It is not obvious that this implies that $f$ is continuously differentiable in the usual sense. This is, however, a consequence of Theorem 30.1.1:
Let $g$ be the continuous function for which $T_f' = T_g$ and let
$$G(x) = \int_0^x g(t)\,dt.$$
$G$ is continuously differentiable in the usual sense and $G' = g$. By Theorem 30.1.1,
$$T_f = G + \text{a constant},$$
and so
$$f(x) = f(0) + \int_0^x g(t)\,dt.$$
30.2 Primitives of a distribution

30.2.1 Theorem Every distribution $T$ has a primitive $U$ in $\mathscr{D}'$, and all the primitives of $T$ are of the form $U + C$, where $C$ is some constant.
Proof. If $U$ and $V$ are primitives of $T$, then $V' - U' = 0$, so by Theorem 30.1.1,
$$V = U + C.$$
We need to show that $T$ has at least one primitive $U$. If $U$ exists, then necessarily
$$\langle U', \varphi\rangle = -\langle U, \varphi'\rangle = \langle T, \varphi\rangle \tag{30.3}$$
for all $\varphi \in \mathscr{D}$. To determine $U$, we must know the value of $\langle U, \varphi\rangle$ for each $\varphi$ in $\mathscr{D}$. We know how $U$ acts on $\psi$ in $\mathscr{D}_0$ from (30.3). Now suppose that $\varphi$ is in $\mathscr{D}$ and write
$$\psi = \psi_\varphi = \varphi - I(\varphi)\theta$$
using the notation of Theorem 30.1.1. One primitive of $\psi$ is
$$F_\varphi(x) = \int_{-\infty}^{x}\big[\varphi(t) - I(\varphi)\theta(t)\big]\,dt.$$
Clearly,
$$F_{\varphi'} = \varphi \quad\text{and}\quad (F_\varphi)' = \varphi - I(\varphi)\theta = \psi.$$
Since the derivative of $F_\varphi$ is in $\mathscr{D}_0$, we know that $F_\varphi$ is in $\mathscr{D}$.
Now define $U : \mathscr{D} \to \mathbb{C}$ by
$$\langle U, \varphi\rangle = -\langle T, F_\varphi\rangle.$$
(At this point we do not know that $U$ is a distribution, so writing $\langle U, \varphi\rangle$ is a slight abuse of notation.) If $U$ is a distribution, then it is a primitive of $T$, since
$$\langle U', \varphi\rangle = -\langle U, \varphi'\rangle = \langle T, F_{\varphi'}\rangle = \langle T, \varphi\rangle.$$
To prove that $U$ is a distribution, it is sufficient to show that the mapping $\varphi \mapsto F_\varphi$ is linear and continuous from $\mathscr{D}$ to $\mathscr{D}$.
The mapping is clearly linear, so we focus on continuity. Let $(\varphi_n)$ be a sequence in $\mathscr{D}$ that tends to 0. The sequence of integrals $(I(\varphi_n))$ tends to 0; hence
$$\psi_n = \varphi_n - I(\varphi_n)\theta \tag{30.4}$$
tends to 0 in $\mathscr{D}$. If $[A, B]$ is an interval containing the supports of all the $\psi_n$, then the supports of the functions
$$F_{\varphi_n}(x) = \int_{-\infty}^{x}\psi_n(t)\,dt$$
are also in $[A, B]$. Since
$$\|F_{\varphi_n}\|_\infty \le \|\psi_n\|_1 \le (B - A)\,\|\psi_n\|_\infty,$$
it follows that $F_{\varphi_n}(x) \to 0$ uniformly on $\mathbb{R}$. For any integer $p \ge 1$,
$$(F_{\varphi_n})^{(p)} = \psi_n^{(p-1)},$$
which tend to 0 uniformly on $\mathbb{R}$. Thus the sequence $(F_{\varphi_n})$ tends to 0 in $\mathscr{D}$, and this proves continuity. □
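The construction used in this proof can be traced numerically. The sketch below makes several assumptions that are not in the text: `bump` is one hypothetical smooth function supported in $[-1, 1]$, `theta` is its normalization (so it satisfies (30.1)), `phi` is an arbitrary test function supported in $[0.5, 3.5]$, and all integrals are approximated by a midpoint rule. It checks that $\psi = \varphi - I(\varphi)\theta$ has zero integral and that its primitive $F_\varphi$ vanishes to the right of the supports, which is why $F_\varphi$ is again a test function.

```python
import math

def bump(x):
    """Smooth function supported in [-1, 1] (hypothetical choice)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def integrate(g, a, b, steps=20_000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

BUMP_MASS = integrate(bump, -1.0, 1.0)

def theta(x):
    """Fixed function as in (30.1): integral 1, zero for |x| >= 1."""
    return bump(x) / BUMP_MASS

def phi(x):
    """A test function supported in [0.5, 3.5] (hypothetical choice)."""
    return bump((x - 2.0) / 1.5)

I_phi = integrate(phi, 0.5, 3.5)

def psi(x):
    """psi = phi - I(phi) * theta, an element of D_0."""
    return phi(x) - I_phi * theta(x)

def F(x):
    """F_phi(x): integral of psi from -infinity to x (psi vanishes left of -1)."""
    return integrate(psi, -1.0, x) if x > -1.0 else 0.0

print(F(0.0), F(3.5), F(10.0))
```

The value at 0 is close to $-I(\varphi)/2$ because `theta` is even and `phi` vanishes on $[-1, 0]$; the values at 3.5 and 10 are zero up to quadrature error.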
30.3 Exercises
*Exercise 30.1
(a) Solve the differential equation
$$T' + aT = 0 \quad (a \ne 0)$$
in $\mathscr{D}'$. (One can make the change of unknown distribution $T = e^{-ax}S$.)

(b) Deduce from this the general solutions of
$$T' + aT = \delta \quad\text{and}\quad T' + aT = u \quad (a \ne 0).$$
*Exercise 30.2 Use Exercise 28.1 to derive the general solutions of the following differential equations:
T l/ +T -- v", T" +T -- u"71', T" + w2 T =wu;;;

A (w > 0).
*Exercise 30.3 For $-\infty < a < b < +\infty$ we consider the space
$$W^{1,1}(a, b) = \{f \in L^1(a, b) : f' \in L^1(a, b)\},$$
where the derivative is taken in the sense of distributions. Show that $W^{1,1}(a, b)$ is the space of absolutely continuous (AC) functions on $[a, b]$.
Remark: If $f \in W^{1,1}(a, b)$, one shows that $f$ has a unique representative in the space of absolutely continuous functions by using $g(x) = \int_a^x f'(t)\,dt$, $x \in [a, b]$.
**Exercise 30.4 (the Sobolev space $H^1(a, b)$) Let $(a, b)$ be an interval, bounded or not. The vector space $H^1(a, b)$ is defined by
$$H^1(a, b) = \{f \in L^2(a, b) : f' \in L^2(a, b)\},$$
where the derivative $f'$ is taken in the sense of distributions.

(a) Show that $u \notin H^1(-1, 1)$, that $g(t) = \sqrt{t}$ is not in $H^1(0, 1)$, and that the polynomials are in $H^1(a, b)$ if $b - a < +\infty$ but not in $H^1(\mathbb{R})$ (except 0). Give some examples of elements of $H^1(\mathbb{R})$.
(b) Assume that $(a, b)$ is bounded.
Use Schwarz's inequality to show that $L^2(a, b) \subset L^1(a, b)$.

If $g \in L^2(a, b)$, show that $f$ defined by $f(x) = \int_a^x g(t)\,dt$ is in $H^1(a, b)$.
(c) Let $(a, b)$ be arbitrary, $f \in H^1(a, b)$, and $c \in (a, b)$. Define $h(x) = \int_c^x f'(t)\,dt$ for $x \in (c, b)$.
Show that $h - f$ is constant on $(c, b)$ (use Theorem 30.1.1). Deduce from this that $f$ is absolutely continuous on all intervals $[c, d] \subset (a, b)$.
Is $h$ in $H^1(a, b)$?
(d) A scalar product can be defined on $H^1(a, b)$ by
$$((f, g))_1 = \int_a^b f(t)\,\overline{g(t)}\,dt + \int_a^b f'(t)\,\overline{g'(t)}\,dt.$$
We wish to show that $H^1(a, b)$ is a Hilbert space. If $(f_n)$ is a Cauchy sequence in $H^1(a, b)$, show that there are $f$ and $g$ in $L^2(a, b)$ such that
$$f_n \to f \quad\text{and}\quad f_n' \to g \quad\text{in } L^2(a, b).$$
Show that $g = f'$ using convergence in the sense of distributions as in Proposition 29.3.2 and finish the proof.
*Exercise 30.5 Let $f$ be a function in $H^1(0, a)$ and let $c_n = c_n(f)$, $n \in \mathbb{Z}$, be its Fourier coefficients (see Lesson 6).
(a) Show that $\sum_{n=-\infty}^{+\infty} |c_n|^2 < +\infty$.
(b) Show that $\sum_{n=-\infty}^{+\infty} n^2 |c_n|^2 < +\infty$ if and only if $f(0+) = f(a-)$.
*Exercise 30.6 Show that if $f \in H^1(\mathbb{R})$, then $\lim_{|x|\to\infty} f(x) = 0$ (apply the argument used in Proposition 17.2.1(ii) to the function $f$). Verify that $f \in H^1(\mathbb{R})$ does not imply $f \in L^1(\mathbb{R})$.
Chapter IX
Convolution and the

Fourier Transform of
Distributions
Lesson 31
The Fourier Transform of

Distributions
We are going to extend the Fourier transform to distributions by interpreting the transpose formula we obtained for functions. Consider the following formal computation. Let $f$ be a function in $L^1(\mathbb{R})$ and let $\varphi$ be in $\mathscr{D}$. Then $\hat f$ is a continuous function, and in the sense of distributions we have
$$\langle \hat f, \varphi\rangle = \int_{\mathbb{R}}\Big(\int_{\mathbb{R}} e^{-2i\pi\xi x} f(x)\,dx\Big)\varphi(\xi)\,d\xi = \int_{\mathbb{R}}\int_{\mathbb{R}} e^{-2i\pi x\xi}\varphi(\xi) f(x)\,d\xi\,dx = \int_{\mathbb{R}} f(x)\hat\varphi(x)\,dx = \langle f, \hat\varphi\rangle.$$
This suggests that the Fourier transform for distributions should be defined by transposition:
$$\langle \hat T, \varphi\rangle = \langle T, \hat\varphi\rangle. \tag{31.1}$$
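The transpose formula behind (31.1) can be checked numerically on one explicit pair. This is a sketch under assumptions not made in the text: $f(x) = e^{-|x|}$, whose transform for the kernel $e^{-2i\pi\xi x}$ is $2/(1 + 4\pi^2\xi^2)$, and the Gaussian $\varphi(x) = e^{-\pi x^2}$, which is its own transform. The Gaussian is rapidly decreasing rather than compactly supported, and the integrals over $\mathbb{R}$ are truncated to $[-12, 12]$ and approximated by a midpoint rule.

```python
import math

def f(x):
    """f(x) = exp(-|x|), a function in L^1(R)."""
    return math.exp(-abs(x))

def f_hat(xi):
    """Closed-form Fourier transform of f for the kernel exp(-2 i pi xi x)."""
    return 2.0 / (1.0 + (2.0 * math.pi * xi) ** 2)

def phi(x):
    """Gaussian exp(-pi x^2); it equals its own Fourier transform."""
    return math.exp(-math.pi * x * x)

phi_hat = phi

def integrate(g, a=-12.0, b=12.0, steps=48_000):
    """Midpoint rule on [a, b]; the integrands here are negligible outside."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

lhs = integrate(lambda xi: f_hat(xi) * phi(xi))  # <f_hat, phi>
rhs = integrate(lambda x: f(x) * phi_hat(x))     # <f, phi_hat>
print(lhs, rhs)
```

The two printed numbers should agree up to quadrature error, which is the identity $\langle \hat f, \varphi\rangle = \langle f, \hat\varphi\rangle$ for this pair.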
We know that the expression $\langle T, \hat\varphi\rangle$ makes sense when $\hat\varphi \in \mathscr{D}(\mathbb{R})$, and we have seen that $\hat\varphi \in C^\infty$ (Proposition 17.2.1). But does $\hat\varphi$ have compact support? We will see in Proposition 31.5.4 that this is never the case for $\varphi \ne 0$. We have, however, shown that the space of rapidly decreasing functions $\mathscr{S}(\mathbb{R})$ is invariant under the Fourier transform (Theorem 19.2.3). This leads to the introduction of the subspace of tempered distributions.
31.1 The space $\mathscr{S}'(\mathbb{R})$ of tempered distributions

The function $\hat\varphi$ in equation (31.1) is in $\mathscr{S}(\mathbb{R})$. Although $\hat\varphi$ does not have compact support, it is very small at infinity.
31.1.1 Definition $\mathscr{S}'(\mathbb{R})$ denotes the vector space of continuous linear functionals $T$ defined on $\mathscr{S}(\mathbb{R})$. Thus
$$\varphi_n \to 0 \ \text{in } \mathscr{S} \implies \langle T, \varphi_n\rangle \to 0 \ \text{in } \mathbb{C}.$$
If we take $\varphi$ in $\mathscr{D}(\mathbb{R})$, then $T(\varphi)$ is well-defined, since $\mathscr{D}(\mathbb{R}) \subset \mathscr{S}(\mathbb{R})$ (Figure 31.1). Furthermore, convergence in $\mathscr{D}(\mathbb{R})$ implies convergence in $\mathscr{S}(\mathbb{R})$ (see Definitions 19.2.4 and 27.3.2). This means that the elements of $\mathscr{S}'(\mathbb{R})$ restricted to $\mathscr{D}(\mathbb{R})$ are distributions. Since $\mathscr{D}(\mathbb{R})$ is dense in $\mathscr{S}(\mathbb{R})$ (Exercise 19.7), we can identify $\mathscr{S}'(\mathbb{R})$ with a subspace of $\mathscr{D}'(\mathbb{R})$ (see Exercise 31.1).
31.1.2 Definition The elements of $\mathscr{S}'(\mathbb{R})$ are called tempered distributions.
FIGURE 31.1.
The following result provides a practical way to characterize the tempered distributions.
31.1.3 Proposition Suppose that $T$ is a distribution, i.e., $T \in \mathscr{D}'(\mathbb{R})$. Then $T$ is a tempered distribution, $T \in \mathscr{S}'(\mathbb{R})$, if and only if $T$ is continuous on $\mathscr{D}(\mathbb{R})$ in the topology of $\mathscr{S}(\mathbb{R})$.
Proof. Clearly the condition is necessary: If $T$ is a tempered distribution, then $\varphi_n \in \mathscr{D} \subset \mathscr{S}$ and $\varphi_n \to 0$ in $\mathscr{S}$ imply that $T(\varphi_n) \to 0$.
The proof in the other direction depends on the fact that $\mathscr{D}(\mathbb{R})$ is dense in $\mathscr{S}(\mathbb{R})$ in the topology of $\mathscr{S}(\mathbb{R})$. We have already seen in Proposition 22.1.3 how a linear operator that is defined and continuous on a dense subspace can be uniquely extended to a continuous linear operator on the whole space. Proposition 22.1.3 was about normed spaces, which is not the case for $\mathscr{D}$ and $\mathscr{S}$. The topologies of $\mathscr{D}$ and $\mathscr{S}$ (which have not been discussed, since we are mainly interested in convergence of sequences) are more complicated than a topology given by a norm. With a full understanding of the topology of $\mathscr{S}$, the result follows directly. However, it is also possible to give a proof using sequences. This is left as Exercise 31.12, where there are plenty of hints. We also refer to [Kho72]. □
We will use the following notion of convergence for sequences in $\mathscr{S}'(\mathbb{R})$.

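31.1.4 Definition Suppose $T_n$ is a sequence in $\mathscr{S}'(\mathbb{R})$. We say that $T_n$ tends to 0 in $\mathscr{S}'(\mathbb{R})$ if
$$\lim_{n\to\infty}\langle T_n, \varphi\rangle = 0 \quad\text{for all } \varphi \in \mathscr{S}(\mathbb{R}).$$
Note that convergence in $\mathscr{S}'(\mathbb{R})$ implies convergence in $\mathscr{D}'(\mathbb{R})$, since $\mathscr{D}(\mathbb{R}) \subset \mathscr{S}(\mathbb{R})$.
Here are some important properties that follow directly from the definitions.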
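31.1.5 Proposition If $T$ is a tempered distribution, then we have the following results:
(i) For each $k \in \mathbb{N}$, $x^k T$ is in $\mathscr{S}'(\mathbb{R})$.
(ii) For each $k \in \mathbb{N}$, the derivative $T^{(k)}$ is in $\mathscr{S}'(\mathbb{R})$.
(iii) The mappings $T \mapsto x^k T$ and $T \mapsto T^{(k)}$ are continuous from $\mathscr{S}'(\mathbb{R})$ to $\mathscr{S}'(\mathbb{R})$.
Proof.
(i) The mapping $\varphi \mapsto x^k\varphi$ is continuous in $\mathscr{S}(\mathbb{R})$ (Proposition 19.2.5). We have $\langle x^k T, \varphi\rangle = \langle T, x^k\varphi\rangle$ for all $\varphi \in \mathscr{D}(\mathbb{R})$. If a sequence $\varphi_n \in \mathscr{D}(\mathbb{R})$ tends to 0 in $\mathscr{S}$, then $x^k\varphi_n \to 0$ in $\mathscr{S}$. Thus $\langle x^k T, \varphi_n\rangle \to 0$, which shows that $x^k T \in \mathscr{S}'(\mathbb{R})$ (Proposition 31.1.3).
(ii) Take $\varphi_n \in \mathscr{D}(\mathbb{R})$ such that $\varphi_n \to 0$ in $\mathscr{S}$. The mapping $\varphi \mapsto \varphi^{(k)}$ is continuous in $\mathscr{S}(\mathbb{R})$ (Proposition 19.2.5). Since $\langle T^{(k)}, \varphi_n\rangle = (-1)^k\langle T, \varphi_n^{(k)}\rangle$, we can pass to the limit and conclude, as in (i), that $T^{(k)} \in \mathscr{S}'(\mathbb{R})$.
(iii) Suppose that $T_n$ is in $\mathscr{S}'(\mathbb{R})$ and that $T_n \to T$ in $\mathscr{S}'(\mathbb{R})$. Then for all $\varphi$ in $\mathscr{S}(\mathbb{R})$, $\langle x^k T_n, \varphi\rangle = \langle T_n, x^k\varphi\rangle \to \langle T, x^k\varphi\rangle = \langle x^k T, \varphi\rangle$. This proves the continuity of $T \mapsto x^k T$. The same technique is used to show that $T \mapsto T^{(k)}$ is continuous in $\mathscr{S}'(\mathbb{R})$. □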
We will present several useful examples of tempered distributions.
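31.1.6 Definition Suppose $f : \mathbb{R} \to \mathbb{C}$ is a measurable function. Then $f$ is said to be slowly increasing if there exist $C > 0$ and $N \in \mathbb{N}$ such that
$$|f(x)| \le C(1 + x^2)^N \quad\text{for all } x \in \mathbb{R}.$$
If $f$ is slowly increasing, then it is clearly in $L^1_{\mathrm{loc}}(\mathbb{R})$.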
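31.1.7 Proposition Every slowly increasing function $f$ is a tempered distribution.
Proof. Since $f$ is in $L^1_{\mathrm{loc}}(\mathbb{R})$, we know that $T_f$ is a distribution (Proposition 27.4.1). We use Proposition 31.1.3 to show that $T_f$ is tempered. Thus let $\varphi_n$ be a sequence in $\mathscr{D}(\mathbb{R})$ that tends to 0 in $\mathscr{S}(\mathbb{R})$. Taking $N \ge 1$ in Definition 31.1.6, which is always possible, we have
$$|\langle T_f, \varphi_n\rangle| \le \int_{\mathbb{R}} |f(x)|\,|\varphi_n(x)|\,dx = \int_{\mathbb{R}} \frac{|f(x)|}{(1 + x^2)^N}\,(1 + x^2)^N |\varphi_n(x)|\,dx \le C \sup_{x\in\mathbb{R}} \big|(1 + x^2)^{2N}\varphi_n(x)\big| \int_{\mathbb{R}} \frac{dx}{(1 + x^2)^N}.$$
From Proposition 19.2.5 we know that $(1 + x^2)^{2N}\varphi_n(x)$ tends to 0 in $\mathscr{S}(\mathbb{R})$. Hence $\lim_{n\to\infty} |\langle T_f, \varphi_n\rangle| = 0$, and $T_f$ is tempered by Proposition 31.1.3. □
Being a slowly increasing function is not a necessary condition for $f$ to be in $\mathscr{S}'(\mathbb{R})$. Here is another criterion.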
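31.1.8 Proposition The functions in $L^p(\mathbb{R})$, $p \ge 1$, are tempered distributions.
Proof. Since $L^p(\mathbb{R}) \subset L^1_{\mathrm{loc}}(\mathbb{R})$, every element $f$ of $L^p(\mathbb{R})$ is a distribution. We proceed as in the proof of Proposition 31.1.7. Let $\varphi_n$ be a sequence in $\mathscr{D}(\mathbb{R})$ that tends to 0 in $\mathscr{S}(\mathbb{R})$. Then, by Hölder's inequality with $1/p + 1/q = 1$,
$$|\langle T_f, \varphi_n\rangle| \le \Big(\int_{\mathbb{R}} |f(x)|^p\,dx\Big)^{1/p}\Big(\int_{\mathbb{R}} |\varphi_n(x)|^q\,dx\Big)^{1/q} = \|f\|_p\,\|\varphi_n\|_q$$
(for $p = 1$ the second factor is $\|\varphi_n\|_\infty$). If $\varphi_n \to 0$ in $\mathscr{S}(\mathbb{R})$, then $\lim_{n\to\infty}\|\varphi_n\|_q = 0$, and this proves the result. □
So far, the examples of tempered distributions have been functions. Here is a different example.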
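31.1.9 Proposition If the complex sequence $(y_n)_{n\in\mathbb{Z}}$ is slowly increasing and $a > 0$, then
$$T = \sum_{n=-\infty}^{+\infty} y_n\,\delta_{na}$$
is a tempered distribution.
Proof. We have already seen that $T$ is a distribution (Section 27.3.4(b)). Let $\varphi_p$ be a sequence in $\mathscr{D}(\mathbb{R})$ that tends to 0 in $\mathscr{S}(\mathbb{R})$. We must show that $\lim_{p\to\infty} |\langle T, \varphi_p\rangle| = 0$.
Since $(y_n)_{n\in\mathbb{Z}}$ is slowly increasing (see Definition 29.4.1), there is an $N \in \mathbb{N}$ such that $|y_n| \le C(1 + n^2)^N$ for all $n \in \mathbb{Z}$ (take any $N \ge k/2$ for the $k$ in (29.4)). With this we have
$$|y_n\langle\delta_{na}, \varphi_p\rangle| \le C(1 + n^2)^N |\varphi_p(na)| = C\,\frac{(1 + n^2)^N}{(1 + (an)^2)^N}\cdot\frac{(1 + (an)^2)^{N+1}\,|\varphi_p(na)|}{1 + (an)^2}.$$
Since $\varphi_p$ is in $\mathscr{S}(\mathbb{R})$,
$$(1 + (an)^2)^{N+1}\,|\varphi_p(na)| \le \sup_{x\in\mathbb{R}} \big|(1 + x^2)^{N+1}\varphi_p(x)\big| = M_p$$
for all $p \in \mathbb{N}$ and all $n \in \mathbb{Z}$. Also, the ratio $(1 + n^2)^N/(1 + (an)^2)^N$ is bounded independently of $n$:
$$\frac{(1 + n^2)^N}{(1 + (an)^2)^N} \le \max(1, a^{-2N}).$$
Combining these inequalities, we have $|y_n\langle\delta_{na}, \varphi_p\rangle| \le \dfrac{C_1 M_p}{1 + (an)^2}$; hence
$$|\langle T, \varphi_p\rangle| \le C_2 M_p,$$
where $C_2 = C_1\sum_{n=-\infty}^{+\infty}\dfrac{1}{1 + (an)^2}$. By hypothesis, $\lim_{p\to\infty} M_p = 0$, and this proves that $T \in \mathscr{S}'(\mathbb{R})$. □
As an example, Dirac's comb $\Delta_a = \sum_{n=-\infty}^{+\infty}\delta_{na}$ is a tempered distribution.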
To finish this section, we state without proof the structure theorem for tempered distributions [Sch65b, Kho72].
31.1.10 Theorem If $T \in \mathscr{S}'(\mathbb{R})$, then there are integers $n_1, n_2, \ldots, n_p$ and slowly increasing continuous functions $f_1, f_2, \ldots, f_p$ such that
$$T = \sum_{k=1}^{p} f_k^{(n_k)}.$$
31.2 The Fourier transform on $\mathscr{S}'(\mathbb{R})$

We are now in a position to validate equation (31.1).
31.2.1 Definition Suppose that $T \in \mathscr{S}'(\mathbb{R})$. The Fourier transform of $T$, which is denoted by $\hat T$ or by $\mathscr{F}(T)$, is defined by
$$\langle \hat T, \varphi\rangle = \langle T, \hat\varphi\rangle \quad\text{for all } \varphi \in \mathscr{S}(\mathbb{R}). \tag{31.2}$$
$\hat T$ is a tempered distribution because the Fourier transform is a continuous operator on $\mathscr{S}(\mathbb{R})$ (Proposition 19.2.5). Formula (31.2) extends the Fourier transform from $L^1(\mathbb{R})$ or $L^2(\mathbb{R})$ to tempered distributions.
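31.2.2 Proposition If $f$ is in $L^1(\mathbb{R})$ or $L^2(\mathbb{R})$, then $\widehat{T_f} = T_{\hat f}$.
Proof. Take $f \in L^1(\mathbb{R})$ and $\varphi \in \mathscr{S}(\mathbb{R})$. Then $\hat f \in L^\infty(\mathbb{R})$ is a tempered distribution and $\hat\varphi \in \mathscr{S}(\mathbb{R})$. By Proposition 17.1.4,
$$\langle T_{\hat f}, \varphi\rangle = \int_{\mathbb{R}} \hat f(\xi)\varphi(\xi)\,d\xi = \int_{\mathbb{R}} f(x)\hat\varphi(x)\,dx = \langle T_f, \hat\varphi\rangle = \langle \widehat{T_f}, \varphi\rangle.$$
If $f \in L^2(\mathbb{R})$, then $\hat f \in L^2(\mathbb{R})$ is a tempered distribution. For all $\varphi \in \mathscr{S}(\mathbb{R})$, $\varphi, \hat\varphi \in L^2(\mathbb{R})$. Thus by Proposition 22.1.5,
$$\langle T_{\hat f}, \varphi\rangle = \int_{\mathbb{R}} \hat f(\xi)\varphi(\xi)\,d\xi = \int_{\mathbb{R}} f(x)\hat\varphi(x)\,dx = \langle T_f, \hat\varphi\rangle = \langle \widehat{T_f}, \varphi\rangle. \ \square$$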
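The Fourier transform is a linear 1-to-1 mapping on $\mathscr{S}(\mathbb{R})$ that is continuous in both directions (bicontinuous). The next result follows from transposition.
31.2.3 Theorem The Fourier transform is a linear, 1-to-1, bicontinuous mapping from $\mathscr{S}'(\mathbb{R})$ to $\mathscr{S}'(\mathbb{R})$. The inverse mapping, $\mathscr{F}^{-1} = \overline{\mathscr{F}}$, is defined for all $\varphi \in \mathscr{S}(\mathbb{R})$ by
$$\langle \overline{\mathscr{F}} T, \varphi\rangle = \langle T, \overline{\mathscr{F}}\varphi\rangle.$$
For all $T \in \mathscr{S}'(\mathbb{R})$,
$$\mathscr{F}\,\overline{\mathscr{F}}\,T = \overline{\mathscr{F}}\,\mathscr{F}\,T = T. \tag{31.3}$$
Proof. The mapping $T \mapsto \mathscr{F}T$ from $\mathscr{S}'(\mathbb{R})$ to $\mathscr{S}'(\mathbb{R})$ is clearly linear. It is also continuous: If $T_n \to 0$ in $\mathscr{S}'(\mathbb{R})$, then for every $\varphi \in \mathscr{S}(\mathbb{R})$,
$$\langle \mathscr{F}T_n, \varphi\rangle = \langle T_n, \mathscr{F}\varphi\rangle \to 0$$
as $n \to \infty$.
The same proof works for $\overline{\mathscr{F}}$. For all $\varphi \in \mathscr{S}(\mathbb{R})$,
$$\langle \mathscr{F}\,\overline{\mathscr{F}}\,T, \varphi\rangle = \langle T, \overline{\mathscr{F}}\,\mathscr{F}\varphi\rangle = \langle T, \mathscr{F}\,\overline{\mathscr{F}}\varphi\rangle = \langle \overline{\mathscr{F}}\,\mathscr{F}\,T, \varphi\rangle = \langle T, \varphi\rangle,$$
which proves (31.3). Thus $\mathscr{F}$ is 1-to-1 and $\mathscr{F}^{-1} = \overline{\mathscr{F}}$. □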
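Here are several important properties of the Fourier transform on $\mathscr{S}'(\mathbb{R})$; they are direct consequences of the definition.
31.2.4 Proposition Let $T$ be a tempered distribution.

(i) For all $k \in \mathbb{N}$,
$$\hat T^{(k)} = \big[(-2i\pi x)^k T\big]^{\wedge}, \qquad \widehat{T^{(k)}} = (2i\pi\xi)^k\,\hat T.$$
(ii) For $a \in \mathbb{R}$,
$$\tau_a \hat T = \big[e^{2i\pi a x} T\big]^{\wedge}, \qquad \widehat{\tau_a T} = e^{-2i\pi a\xi}\,\hat T.$$
Proof.
(i) Since $T \in \mathscr{S}'(\mathbb{R})$, $x^k T \in \mathscr{S}'(\mathbb{R})$ (Proposition 31.1.5(i)). Thus $\widehat{x^k T}$ exists, and we have $\langle \widehat{x^k T}, \varphi\rangle = \langle T, x^k\hat\varphi\rangle$ for all $\varphi \in \mathscr{S}(\mathbb{R})$. From Proposition 17.2.1(ii) we have $x^k\hat\varphi = \dfrac{1}{(2i\pi)^k}\widehat{\varphi^{(k)}}$; hence
$$\langle \widehat{x^k T}, \varphi\rangle = \frac{1}{(2i\pi)^k}\langle T, \widehat{\varphi^{(k)}}\rangle = \frac{(-1)^k}{(2i\pi)^k}\langle \hat T^{(k)}, \varphi\rangle,$$
and $\hat T^{(k)} = [(-2i\pi x)^k T]^{\wedge}$. The relation $\widehat{T^{(k)}} = (2i\pi\xi)^k\hat T$ is obtained similarly from Proposition 17.2.1(i).
(ii) The function $x \mapsto e^{2i\pi a x}$ is in $C^\infty(\mathbb{R})$ and is bounded. Thus $e^{2i\pi a x} T$ is in $\mathscr{S}'(\mathbb{R})$, and
$$\langle [e^{2i\pi a x} T]^{\wedge}, \varphi\rangle = \langle T, e^{2i\pi a\xi}\hat\varphi\rangle = \langle T, \widehat{\tau_{-a}\varphi}\rangle$$
for all $\varphi \in \mathscr{S}(\mathbb{R})$ (Proposition 17.2.4(i)). But we know that
$$\langle \tau_a \hat T, \varphi\rangle = \langle \hat T, \tau_{-a}\varphi\rangle = \langle T, \widehat{\tau_{-a}\varphi}\rangle$$
(Definition 28.1.3), so $\tau_a\hat T = [e^{2i\pi a x} T]^{\wedge}$. The proof that $\widehat{\tau_a T} = e^{-2i\pi a\xi}\hat T$ is similar. □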
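31.2.5 Proposition For $T \in \mathscr{S}'(\mathbb{R})$ we have the following relations:

(i) $\mathscr{F}(T_\sigma) = (\mathscr{F}T)_\sigma = \overline{\mathscr{F}}\,T$.
(ii) $\mathscr{F}(\mathscr{F}T) = T_\sigma$.
Proof.
(i) Take $\varphi \in \mathscr{S}(\mathbb{R})$. By Proposition 17.2.3(ii),
$$\mathscr{F}(\varphi_\sigma) = (\mathscr{F}\varphi)_\sigma = \overline{\mathscr{F}}\varphi,$$
so for $T \in \mathscr{S}'(\mathbb{R})$ we have
$$\langle \mathscr{F}(T_\sigma), \varphi\rangle = \langle T_\sigma, \mathscr{F}\varphi\rangle = \langle T, (\mathscr{F}\varphi)_\sigma\rangle = \langle T, \overline{\mathscr{F}}\varphi\rangle = \langle \overline{\mathscr{F}}\,T, \varphi\rangle.$$
Similarly, $\langle (\mathscr{F}T)_\sigma, \varphi\rangle = \langle T, \mathscr{F}(\varphi_\sigma)\rangle = \langle T, \overline{\mathscr{F}}\varphi\rangle = \langle \overline{\mathscr{F}}\,T, \varphi\rangle$, which completes the proof of (i).
(ii) For $\varphi \in \mathscr{S}(\mathbb{R})$, we know from Proposition 18.2.1 that $\mathscr{F}(\mathscr{F}\varphi) = \varphi_\sigma$. Consequently,
$$\langle \mathscr{F}(\mathscr{F}T), \varphi\rangle = \langle T, \mathscr{F}\mathscr{F}\varphi\rangle = \langle T, \varphi_\sigma\rangle = \langle T_\sigma, \varphi\rangle,$$
and $\mathscr{F}(\mathscr{F}T) = T_\sigma$. □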
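31.3 Examples of Fourier transforms in $\mathscr{S}'(\mathbb{R})$
31.3.1 Dirac's impulse
It is easy to verify that $\delta_a$ is a tempered distribution for all $a \in \mathbb{R}$. We wish to compute the Fourier transform $\hat\delta$. For $\varphi \in \mathscr{S}(\mathbb{R})$,
$$\langle \hat\delta, \varphi\rangle = \langle \delta, \hat\varphi\rangle = \hat\varphi(0) = \int_{\mathbb{R}}\varphi(x)\,dx = \langle 1, \varphi\rangle;$$
hence
$$\hat\delta = 1. \tag{31.4}$$
In the same way,
$$\langle \hat\delta_a, \varphi\rangle = \hat\varphi(a) = \int_{\mathbb{R}} e^{-2i\pi a x}\varphi(x)\,dx,$$
so
$$\hat\delta_a = e^{-2i\pi a\xi}. \tag{31.5}$$
We note that the Fourier transform of $\delta_a$ is a $C^\infty$ function. For the derivatives of the Dirac impulse we have (Proposition 31.2.4(i))
$$\widehat{\delta^{(k)}} = (2i\pi\xi)^k, \tag{31.6}$$
and by taking the Fourier transform of both sides of (31.6) we see that
$$\widehat{x^k} = \frac{1}{(-2i\pi)^k}\,\delta^{(k)}. \tag{31.7}$$
The Fourier transform of a polynomial (which cannot be computed by integration because $x^k \notin L^1(\mathbb{R})$) is a linear combination of the derivatives of the Dirac distribution at the origin.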
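31.3.2 Sinusoidal signals

Let $f(x) = e^{2i\pi a x}$ with $a$ real and fixed. $f$ is in $L^\infty(\mathbb{R})$ and is a tempered distribution. For $\varphi \in \mathscr{S}(\mathbb{R})$,
$$\langle \widehat{T_f}, \varphi\rangle = \langle T_f, \hat\varphi\rangle = \int_{\mathbb{R}} e^{2i\pi a x}\hat\varphi(x)\,dx = \overline{\mathscr{F}}\hat\varphi(a) = \varphi(a)$$
by Theorem 19.3.1. Thus $\langle \widehat{T_f}, \varphi\rangle = \langle \delta_a, \varphi\rangle$ and
$$\widehat{T_f} = \delta_a. \tag{31.8}$$
In particular, the Fourier transform of the constant function 1 (which is in neither $L^1(\mathbb{R})$ nor $L^2(\mathbb{R})$) is $\delta$:
$$\hat 1 = \delta. \tag{31.9}$$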
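31.3.3 Dirac's comb

We have seen that $\Delta_a = \sum_{n=-\infty}^{+\infty}\delta_{na}$ is in $\mathscr{S}'(\mathbb{R})$. Since the Fourier transform is continuous,
$$\hat\Delta_a = \sum_{n=-\infty}^{+\infty}\hat\delta_{na} = \sum_{n=-\infty}^{+\infty} e^{-2i\pi n a\xi}.$$
But from (29.7) we know that
$$\sum_{n=-\infty}^{+\infty} e^{-2i\pi n a\xi} = \frac{1}{a}\sum_{n=-\infty}^{+\infty}\delta_{n/a};$$
thus
$$\hat\Delta_a = \frac{1}{a}\,\Delta_{1/a}. \tag{31.10}$$
The Fourier transform of the Dirac comb with grid $a$ is $a^{-1}$ times the Dirac comb with grid $a^{-1}$. For $a = 1$, it is a distribution equal to its Fourier transform, which is a property shared by the Gaussian $g(t) = e^{-\pi t^2}$
(Proposition 1(.3.2).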
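Applied to a test function, formula (31.10) reads $\sum_n \hat\varphi(na) = \frac{1}{a}\sum_n \varphi(n/a)$, and this can be checked on a Gaussian. The sketch below assumes the test function $\varphi(x) = e^{-\pi s x^2}$ with the hypothetical parameter $s = 2$, whose transform is $s^{-1/2} e^{-\pi\xi^2/s}$, and truncates both series to $|n| \le 200$, which is harmless because the terms decay like a Gaussian.

```python
import math

def phi(x, s=2.0):
    """Gaussian test function exp(-pi * s * x^2) in S(R)."""
    return math.exp(-math.pi * s * x * x)

def phi_hat(xi, s=2.0):
    """Its Fourier transform: s^(-1/2) * exp(-pi * xi^2 / s)."""
    return math.exp(-math.pi * xi * xi / s) / math.sqrt(s)

def comb_pairing(g, a, terms=200):
    """<Delta_a, g> = sum over n of g(n * a), truncated to |n| <= terms."""
    return sum(g(n * a) for n in range(-terms, terms + 1))

def both_sides(a):
    """Return (<Delta_a, phi_hat>, <(1/a) Delta_{1/a}, phi>)."""
    lhs = comb_pairing(phi_hat, a)        # <F(Delta_a), phi> by transposition
    rhs = comb_pairing(phi, 1.0 / a) / a  # right-hand side of (31.10) applied to phi
    return lhs, rhs

for a in (0.5, 1.0, 1.7):
    print(a, both_sides(a))
```

For each grid $a$ the two numbers should coincide to machine precision.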
31.4 The space $\mathscr{E}'(\mathbb{R})$ of distributions with compact support

We have seen that the Fourier transform of a function $f$ in $L^1(\mathbb{R})$ is continuous and zero at infinity (Theorem 17.1.3). If in addition $f$ has compact support, then $\hat f \in C^\infty(\mathbb{R})$ (Proposition 17.2.1(iii)). We will show that $\hat T \in C^\infty(\mathbb{R})$ for distributions with compact support.
31.4.1 Definition $\mathscr{E}'(\mathbb{R})$ denotes the subspace of $\mathscr{D}'(\mathbb{R})$ of those distributions that have compact support.
It can be shown that $\mathscr{E}'(\mathbb{R})$ is the dual of $\mathscr{E}(\mathbb{R}) = C^\infty(\mathbb{R})$ when $C^\infty(\mathbb{R})$ is endowed with the following topology: A sequence $\varphi_n$ tends to 0 if and only if for each $p \in \mathbb{N}$, $\varphi_n^{(p)}$ tends to 0 uniformly on every compact set $K \subset \mathbb{R}$.
$T$ is a continuous linear functional on $\mathscr{E}(\mathbb{R})$ if and only if there exist a compact set $K$, a constant $C > 0$, and an integer $m \in \mathbb{N}$ such that
$$|\langle T, \varphi\rangle| \le C \sup_{k\le m}\,\sup_{x\in K} |\varphi^{(k)}(x)| \tag{31.11}$$
for all $\varphi \in C^\infty(\mathbb{R})$. $\mathscr{E}'(\mathbb{R})$ is a linear subspace of $\mathscr{S}'(\mathbb{R})$ [Kho72].
31.4.2 Examples
(a) A function $f \in L^1_{\mathrm{loc}}(\mathbb{R})$ with compact support is in $\mathscr{E}'(\mathbb{R})$.
(b) Let $\delta_a$ be the Dirac distribution at $a$. For all $p \in \mathbb{N}$, $\delta_a^{(p)}$ is in $\mathscr{E}'(\mathbb{R})$.
Working with the elements of $\mathscr{E}'(\mathbb{R})$ is facilitated by knowing their structure; thus we are going to assume the following theorem [Sch65b, Kho72].
31.4.3 Theorem (representation of $\mathscr{E}'(\mathbb{R})$) If $T \in \mathscr{E}'(\mathbb{R})$ and the support of $T$ is in the interior of some compact set $K$, there exist integers $n_1, n_2, \ldots, n_p$ and continuous functions $f_1, f_2, \ldots, f_p$ whose supports are in $K$, and
$$T = \sum_{j=1}^{p} f_j^{(n_j)}.$$
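31.5 The Fourier transform on $\mathscr{E}'(\mathbb{R})$

Since $\mathscr{E}'(\mathbb{R})$ is a subspace of $\mathscr{S}'(\mathbb{R})$, the Fourier transform of a distribution with compact support is well-defined. In fact, it is a well-behaved function.
31.5.1 Theorem If $T$ is in $\mathscr{E}'(\mathbb{R})$, then $\hat T$ and all of its derivatives are slowly increasing functions in $C^\infty(\mathbb{R})$. Furthermore, $\hat T^{(k)}(\xi) = \langle T, \varphi_\xi^{(k)}\rangle$ for $k = 0, 1, 2, \ldots$, where $\varphi_\xi(x) = e^{-2i\pi x\xi}$ and the derivative is taken with respect to $\xi$; that is,
$$\hat T^{(k)}(\xi) = \langle T_x, (-2i\pi x)^k e^{-2i\pi x\xi}\rangle,$$
where $T_x$ indicates that the "integration" is with respect to $x$.
Proof. We use Theorem 31.4.3 and write $T = \sum_{j=1}^{p} f_j^{(n_j)}$. Since $f_j^{(n_j)}$ is in $\mathscr{E}'(\mathbb{R})$ for $j = 1, 2, \ldots, p$, we have
$$\langle T_x, e^{-2i\pi x\xi}\rangle = \sum_{j=1}^{p}\langle f_j^{(n_j)}, e^{-2i\pi x\xi}\rangle = \sum_{j=1}^{p} (-1)^{n_j}\int_{\mathbb{R}} (-2i\pi\xi)^{n_j} e^{-2i\pi x\xi} f_j(x)\,dx = \sum_{j=1}^{p} (2i\pi\xi)^{n_j}\,\hat f_j(\xi).$$
Since $f_j$ is continuous and has compact support, we know from Proposition 17.2.1(iii) that the function $\xi \mapsto \langle T_x, e^{-2i\pi x\xi}\rangle$ is infinitely differentiable. On the other hand, $\hat T = \sum_{j=1}^{p}\widehat{f_j^{(n_j)}} = \sum_{j=1}^{p} (2i\pi\xi)^{n_j}\hat f_j(\xi)$. We conclude that $\hat T(\xi) = \langle T_x, e^{-2i\pi x\xi}\rangle$, which proves the case for $k = 0$.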
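For $k \ge 1$ we compute $\hat T^{(k)}$ in the sense of distributions. If $\varphi \in \mathscr{D}(\mathbb{R})$, then
$$\langle \hat T^{(k)}, \varphi\rangle = (-1)^k\langle \hat T, \varphi^{(k)}\rangle = (-1)^k\langle T, \widehat{\varphi^{(k)}}\rangle = \langle T, (-2i\pi x)^k\hat\varphi(x)\rangle$$
$$= \sum_{j=1}^{p}\int_{\mathbb{R}} f_j^{(n_j)}(x)\,(-2i\pi x)^k\Big(\int_{\mathbb{R}} e^{-2i\pi x\xi}\varphi(\xi)\,d\xi\Big)dx = \int_{\mathbb{R}}\varphi(\xi)\Big(\sum_{j=1}^{p}\int_{\mathbb{R}} f_j^{(n_j)}(x)\,(-2i\pi x)^k e^{-2i\pi x\xi}\,dx\Big)d\xi,$$
which shows that $\hat T^{(k)}(\xi) = \langle T_x, (-2i\pi x)^k e^{-2i\pi x\xi}\rangle$.

Finally, to show that $\hat T^{(k)}$ is slowly increasing, we use the continuity of $T$ (31.11): There exist $C > 0$, $K$ compact, and $m \in \mathbb{N}$ such that
$$|\langle T, \varphi\rangle| \le C \sup_{j\le m}\,\sup_{x\in K} |\varphi^{(j)}(x)|$$
for all $\varphi \in C^\infty(\mathbb{R})$. By taking $\varphi(x) = e^{-2i\pi x\xi}(-2i\pi x)^k$ we see that
$$|\hat T^{(k)}(\xi)| \le C \sup_{j\le m}\,\sup_{x\in K}\Big|\frac{d^j}{dx^j}\big((-2i\pi x)^k e^{-2i\pi x\xi}\big)\Big|.$$
Each $x$-derivative of order $j \le m$ on the right is $e^{-2i\pi x\xi}$ times a polynomial in $\xi$ of degree at most $j$ whose coefficients are bounded on $K$, so the right-hand side is bounded by a polynomial in $|\xi|$ of degree at most $m$. It follows that $\hat T^{(k)}(\xi)$ is slowly increasing. □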
31.5.2 Theorem (the Paley-Wiener theorem)

Suppose that $T \in \mathscr{E}'(\mathbb{R})$ and that $\operatorname{supp} T \subset [-M, +M]$ for some $M > 0$. Then $f(\xi) = \hat T(\xi)$ can be extended to a holomorphic function $f : \mathbb{C} \to \mathbb{C}$ that satisfies the following estimate: There exist $C > 0$ and $N \in \mathbb{N}$ such that for all $z \in \mathbb{C}$,
$$|f(z)| \le C(1 + |z|)^N e^{2\pi M |\operatorname{Im} z|}.$$
Proof. We know that $f(\xi) = \hat T(\xi)$ is the $C^\infty$ function $f(\xi) = \langle T_x, e^{-2i\pi x\xi}\rangle$. Define $f(z) = \langle T_x, e^{-2i\pi x z}\rangle$ for $z \in \mathbb{C}$. One shows by direct computation (as above) that $f$ is holomorphic on $\mathbb{C}$, i.e., it is infinitely differentiable on $\mathbb{C}$. The inequality follows from the continuity of $T$. □
31.5.3 Remark The converse of the Paley-Wiener theorem is also true [Kho72]. The Paley-Wiener theorem has the following important consequence.
31.5.4 Proposition The Fourier transform of a distribution $T$ with compact support ($T \ne 0$) cannot have compact support.
Proof. If $T \in \mathscr{E}'(\mathbb{R})$ has compact support, the function $f = \hat T$ is the restriction to $\mathbb{R}$ of a holomorphic function on $\mathbb{C}$. Thus $f$ is analytic on $\mathbb{R}$. If $f$ had compact support, it would vanish on some nonempty open interval. Since $f$ is analytic, this would imply that $f$ vanishes everywhere, and hence that $T = 0$, contrary to the hypothesis. Thus $f$ cannot have compact support. □
This result will be used in Section 38.5.1.
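A simple instance of Theorem 31.5.1 and Proposition 31.5.4 is the indicator function of $[-1/2, 1/2]$, viewed as a distribution with compact support. The sketch below is an illustration under assumptions of our own choosing: it approximates $\langle T_x, e^{-2i\pi x\xi}\rangle$ by a midpoint rule and compares it with the closed form $\sin(\pi\xi)/(\pi\xi)$, a bounded $C^\infty$ function that is still nonzero at arbitrarily large $\xi$.

```python
import cmath
import math

def t_hat(xi, steps=20_000):
    """<T_x, exp(-2 i pi x xi)> for T = indicator of [-1/2, 1/2], by the midpoint rule."""
    h = 1.0 / steps
    total = 0j
    for k in range(steps):
        x = -0.5 + (k + 0.5) * h
        total += cmath.exp(-2j * math.pi * x * xi)
    return total * h

def sinc(xi):
    """Closed form sin(pi xi) / (pi xi), with value 1 at xi = 0."""
    return 1.0 if xi == 0.0 else math.sin(math.pi * xi) / (math.pi * xi)

for xi in (0.0, 0.3, 2.5, 10.5, 100.5):
    print(xi, t_hat(xi).real, sinc(xi))
```

The transform decays like $1/|\xi|$ but does not vanish identically on any interval, in agreement with Proposition 31.5.4.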
31.6 Formulary
(i)
(ii)
~: !T -2i1rae
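(iii) $\delta_a \ \overset{\mathscr{F}}{\longmapsto}\ e^{-2i\pi a\xi}$
$e^{2i\pi x a} \ \overset{\mathscr{F}}{\longmapsto}\ \delta_a$
$\delta^{(k)} \ \overset{\mathscr{F}}{\longmapsto}\ (2i\pi\xi)^k$
$x^k \ \overset{\mathscr{F}}{\longmapsto}\ (-2i\pi)^{-k}\,\delta^{(k)}$
(iv) $u(x) \ \overset{\mathscr{F}}{\longmapsto}\ \dfrac{1}{2}\delta + \dfrac{1}{2i\pi}\operatorname{pv}\Big(\dfrac{1}{\xi}\Big)$
$\operatorname{sign}(x) \ \overset{\mathscr{F}}{\longmapsto}\ \dfrac{1}{i\pi}\operatorname{pv}\Big(\dfrac{1}{\xi}\Big)$
$\operatorname{pv}\Big(\dfrac{1}{x}\Big) \ \overset{\mathscr{F}}{\longmapsto}\ -i\pi\operatorname{sign}(\xi)$
(v) $\mathscr{F}(\mathscr{F}T) = T_\sigma$
$(\mathscr{F}T)_\sigma = \mathscr{F}T_\sigma = \overline{\mathscr{F}}\,T$
$\mathscr{F}\,\overline{\mathscr{F}}\,T = \overline{\mathscr{F}}\,\mathscr{F}\,T = T$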
31.7 Exercises
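Exercise 31.1 ($\mathscr{S}'(\mathbb{R})$ as a subspace of $\mathscr{D}'(\mathbb{R})$)
For $T \in \mathscr{S}'(\mathbb{R})$ we write $j(T) = T|_{\mathscr{D}(\mathbb{R})}$ to denote the restriction of $T$ to $\mathscr{D}(\mathbb{R})$.
(a) Show that $j(T) \in \mathscr{D}'(\mathbb{R})$.
(b) Show that $j : \mathscr{S}'(\mathbb{R}) \to \mathscr{D}'(\mathbb{R})$ is injective ($j(T) = 0$ implies $T = 0$).
(c) Show that $j$ is continuous.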
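*Exercise 31.2
(1) Suppose that $f \in \mathscr{S}(\mathbb{R})$ and that $T \in \mathscr{S}'(\mathbb{R})$. Show that the mapping $(f, T) \mapsto fT$ from $\mathscr{S}(\mathbb{R}) \times \mathscr{S}'(\mathbb{R})$ to $\mathscr{S}'(\mathbb{R})$ is continuous with respect to each variable.
(2) Show that $fT \in \mathscr{S}'(\mathbb{R})$ if $T$ is in $\mathscr{S}'(\mathbb{R})$ and $f$ is in $C^\infty(\mathbb{R})$ with $f^{(j)}$ slowly increasing for all $j$.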
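Exercise 31.3 Show that a primitive of a tempered distribution is a tempered distribution.
Exercise 31.4 Show that $\log|x|$ is a tempered distribution. Deduce from this that $\operatorname{pv}\big(\frac{1}{x}\big)$ and $\operatorname{fp}\big(\frac{1}{x^2}\big)$ are also tempered distributions.
*Exercise 31.5 (the Fourier transform of $u$) Take the Fourier transform of the equality $u' = \delta$ and use Exercise 28.12 to show that $\hat u$ is of the form $\hat u(\xi) = \frac{1}{2i\pi}\operatorname{pv}\big(\frac{1}{\xi}\big) + \lambda\delta$. Show that $\lambda = \frac{1}{2}$ by establishing that $\operatorname{pv}\big(\frac{1}{\xi}\big)$ is an odd distribution.
Use this to show that $\operatorname{pv}\big(\frac{1}{\xi}\big) \in \mathscr{S}'(\mathbb{R})$.
Exercise 31.6 Compute the Fourier transform of $\operatorname{pv}\big(\frac{1}{x}\big)$. Use this to compute the Fourier transform of $\operatorname{sign}(x)$.
Exercise 31.7 Compute the Fourier transform of $\operatorname{fp}\big(\frac{1}{x^2}\big)$. Use this to compute the Fourier transform of $|x|$.
Exercise 31.8 Show that
$$[\cos 2\pi\lambda t]^{\wedge} = \frac{1}{2}(\delta_\lambda + \delta_{-\lambda}) \quad\text{and}\quad [\sin 2\pi\lambda t]^{\wedge} = \frac{1}{2i}(\delta_\lambda - \delta_{-\lambda}).$$
Hint: Use Section 31.3.2.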
Exercise 31.9 Let $f(x) = \arctan x$.

(a) To which of the spaces $L^1(\mathbb{R})$, $L^2(\mathbb{R})$, $\mathscr{S}(\mathbb{R})$, $\mathscr{D}'(\mathbb{R})$, $\mathscr{S}'(\mathbb{R})$ does $f$ belong?
(b) Compute $\widehat{f'}(\xi)$.
(c) Deduce from this and Exercise 28.16 that
$$\hat f(\xi) = \frac{1}{2i}\operatorname{pv}\Big(\frac{1}{\xi}\Big) - \frac{1 - e^{-2\pi|\xi|}}{2i\xi}.$$
Exercise 31.10
(a) Compute the Fourier transform of $\arctan\frac{1}{x}$ from Exercises 31.6 and 31.9 and the formula
$$\arctan x + \arctan\frac{1}{x} = \frac{\pi}{2}\operatorname{sign}(x), \quad x \ne 0.$$
(b) Prove this result starting with the derivative of $\arctan\frac{1}{x}$ and proceeding as in Exercise 31.9.
Exercise 31.11 We wish to study the limit, in the sense of distributions, of the sequence of functions $f_n(x) = \big(x + \frac{i}{n}\big)^{-1}$.
(a) Verify that $f_n \in \mathscr{S}'(\mathbb{R})$ and compute $\hat f_n$.
(b) Prove that $\hat f_n$ converges in $\mathscr{S}'(\mathbb{R})$ to $-2i\pi u$ and deduce that $f_n$ converges in $\mathscr{S}'(\mathbb{R})$ to $\operatorname{pv}\big(\frac{1}{x}\big) - i\pi\delta$.
*Exercise 31.12 We wish to prove Proposition 31.1.3. Thus assume that $T \in \mathscr{D}'(\mathbb{R})$ is continuous on $\mathscr{D}(\mathbb{R})$ in the topology of $\mathscr{S}(\mathbb{R})$. The plan is to extend $T$ to a continuous linear functional on $\mathscr{S}(\mathbb{R})$ using the fact that $\mathscr{D}(\mathbb{R})$ is dense in $\mathscr{S}(\mathbb{R})$.
(a) Extending $T$ to $\mathscr{S}(\mathbb{R})$: Let $\psi$ be an arbitrary element of $\mathscr{S}(\mathbb{R})$ and let $\varphi_n$ be a sequence in $\mathscr{D}(\mathbb{R})$ that converges to $\psi$ in the topology of $\mathscr{S}(\mathbb{R})$. Show that $T(\varphi_n)$ converges and define $T(\psi) = \lim_{n\to\infty} T(\varphi_n)$. (Show that $T(\varphi_n)$ is a Cauchy sequence by arguing indirectly.)
(b) $T$ is well-defined: Show that if $\phi_n$ is another sequence in $\mathscr{D}(\mathbb{R})$ that converges to $\psi$ in the topology of $\mathscr{S}(\mathbb{R})$, then $\lim_{n\to\infty} T(\phi_n) = T(\psi)$.
(c) $T$ is continuous on $\mathscr{S}(\mathbb{R})$: We must show that $T(\psi_n) \to 0$ whenever $\psi_n \to 0$ in $\mathscr{S}(\mathbb{R})$. Define $\alpha_m$ as in Exercise 19.7. Then $\alpha_m\psi_n \in \mathscr{D}(\mathbb{R})$, and for each $n$, $\alpha_m\psi_n \to \psi_n$ in $\mathscr{S}(\mathbb{R})$ as $m \to \infty$. Hence, given $\varepsilon > 0$, for each $n$ there is an $m(n)$ such that
Use this and estimates developed from the equation
$$x^p(\alpha_{m(n)}\psi_n)^{(q)}(x) = x^p\sum_{j=0}^{q}\binom{q}{j}\,m(n)^{-j}\,\alpha^{(j)}\Big(\frac{x}{m(n)}\Big)\,\psi_n^{(q-j)}(x)$$
to show that $T(\psi_n) \to 0$.
Lesson 32
Convolution of Distributions
We discussed the convolution of functions in Lesson 20. There we saw that

it is not always possible to take the convolution of two functions; it is the
same for distributions. We will study the convolution of distributions and
its basic properties for the more important cases.
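32.1 The convolution of a distribution and a $C^\infty$ function

When $f$ and $g$ are in $L^1(\mathbb{R})$, the convolution $f * g(x) = \int_{\mathbb{R}} f(x - t)g(t)\,dt$ is well-defined and $f * g \in L^1(\mathbb{R})$. Now consider $f * g$ as a distribution. For $\varphi \in \mathscr{D}(\mathbb{R})$ we have
$$\langle f * g, \varphi\rangle = \int_{\mathbb{R}}\Big(\int_{\mathbb{R}} f(x - t)g(t)\,dt\Big)\varphi(x)\,dx$$
$$= \int_{\mathbb{R}}\Big(\int_{\mathbb{R}} f(x - t)\varphi(x)\,dx\Big)g(t)\,dt \tag{32.1}$$
$$= \int_{\mathbb{R}}\Big(\int_{\mathbb{R}} g(x - u)\varphi(x)\,dx\Big)f(u)\,du. \tag{32.2}$$
The inner integrals are $(f_\sigma * \varphi)(t)$ and $(g_\sigma * \varphi)(u)$, so it appears that one should study the quantities $f_\sigma * \varphi$ and $g_\sigma * \varphi$ when $f$ and $g$ are distributions.
32.1.1 Proposition Suppose that $\varphi \in C^\infty(\mathbb{R})$ and $T \in \mathscr{D}'(\mathbb{R})$ satisfy one of the following three conditions:
(i) $\varphi \in \mathscr{D}(\mathbb{R})$ and $T \in \mathscr{D}'(\mathbb{R})$.
(ii) $\varphi \in \mathscr{S}(\mathbb{R})$ and $T \in \mathscr{S}'(\mathbb{R})$.
(iii) $\varphi \in C^\infty(\mathbb{R})$ and $T \in \mathscr{E}'(\mathbb{R})$.
Then the function $\psi$ defined by
$$\psi(x) = \langle \tau_x T, \varphi\rangle \tag{32.3}$$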
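is infinitely differentiable, and
$$\psi^{(k)}(x) = \langle \tau_x T, \varphi^{(k)}\rangle \quad\text{for } k = 1, 2, \ldots. \tag{32.4}$$
Proof.
(i) When $\varphi \in \mathscr{D}(\mathbb{R})$, the expression $\langle \tau_x T, \varphi\rangle$ makes sense for all $x \in \mathbb{R}$. We wish to show that the function $\psi(x) = \langle \tau_x T, \varphi\rangle$ is differentiable. Thus let $h_n$ be a sequence of nonzero reals that tends to 0 as $n \to \infty$. Define
$$\alpha_n(y) = \frac{1}{h_n}\big[\varphi(y + x + h_n) - \varphi(y + x)\big];$$
then
$$\frac{1}{h_n}\big[\psi(x + h_n) - \psi(x)\big] = \langle T, \alpha_n\rangle.$$
Now, $\lim_{n\to\infty}\alpha_n(y) = \varphi'(x + y) = \tau_{-x}\varphi'(y)$. To prove that $\psi$ is differentiable it is sufficient to show that $\alpha_n$ converges to $\tau_{-x}\varphi'$ in $\mathscr{D}(\mathbb{R})$.
We consider the support of the $\alpha_n$. If $\operatorname{supp}(\varphi) \subset [-M, M]$ and $|h_n| \le 1$, then $\operatorname{supp}(\alpha_n) \subset [-x - M - 1, -x + M + 1]$, which is a fixed compact interval $K$. The following inequality, which is based on the mean value theorem, shows that $\alpha_n$ and all of its derivatives converge uniformly on $K$: For each $q \in \mathbb{N}$,
$$|\alpha_n^{(q)}(y) - \varphi^{(q+1)}(x + y)| = |\varphi^{(q+1)}(x + y + \theta_n h_n) - \varphi^{(q+1)}(x + y)| \le |h_n|\,\|\varphi^{(q+2)}\|_\infty, \quad 0 < \theta_n < 1.$$
This proves that $\psi$ is differentiable and that
$$\psi'(x) = \langle T, \tau_{-x}\varphi'\rangle = \langle \tau_x T, \varphi'\rangle.$$
Similarly, one proves that $\psi \in C^\infty(\mathbb{R})$ and that equation (32.4) holds for $k > 1$.
(ii) If $\varphi \in \mathscr{S}(\mathbb{R})$ and $T \in \mathscr{S}'(\mathbb{R})$, then $\langle \tau_x T, \varphi\rangle$ makes sense. To show that $\psi$ is differentiable it is sufficient to verify that $\alpha_n$ converges to $\tau_{-x}\varphi'$ in $\mathscr{S}(\mathbb{R})$. As in (i), the mean value theorem leads to the inequality
$$|y^p(\alpha_n - \tau_{-x}\varphi')^{(q)}(y)| \le |h_n|\,|y^p\varphi^{(q+2)}(x + y + \rho_n h_n)| = |h_n|\,\frac{|y|^p}{1 + |x + y + \rho_n h_n|^p}\,\Big[(1 + |x + y + \rho_n h_n|^p)\,|\varphi^{(q+2)}(x + y + \rho_n h_n)|\Big],$$
where $0 < \rho_n < 1$. Consequently,
$$\sup_{y\in\mathbb{R}} |y^p(\alpha_n - \tau_{-x}\varphi')^{(q)}(y)| \le C|h_n|\Big(\|\varphi^{(q+2)}\|_\infty + \sup_{t\in\mathbb{R}} |t^p\varphi^{(q+2)}(t)|\Big)$$
for some constant $C$. Since $\varphi$ is in $\mathscr{S}(\mathbb{R})$, we see that $\alpha_n$ converges in $\mathscr{S}(\mathbb{R})$ to $\tau_{-x}\varphi'$ as $n \to \infty$. That $\psi \in C^\infty(\mathbb{R})$ and (32.4) are proved similarly.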
(iii) Again, $\langle \tau_x T, \varphi\rangle$ makes sense because $\varphi \in C^\infty(\mathbb{R})$ and $T \in \mathscr{E}'(\mathbb{R})$. To prove that $\psi$ is differentiable it is sufficient to show that $\alpha_n^{(q)}$ converges to $(\tau_{-x}\varphi')^{(q)}$ uniformly on all compact subsets of $\mathbb{R}$. This is done using inequalities similar to those used in the proofs of (i) and (ii). □
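32.1.2 Definition Assume that $\varphi \in C^\infty(\mathbb{R})$ and $T \in \mathscr{D}'(\mathbb{R})$ satisfy one of the conditions in Proposition 32.1.1. The convolution of $\varphi$ and $T$ is the function $\varphi * T$ defined by
$$\varphi * T(x) = \langle T_y, \varphi(x - y)\rangle. \tag{32.5}$$
The "$y$" in this definition indicates the variable of "integration." Written differently,
$$\langle T_y, \varphi(x - y)\rangle = \langle \tau_{-x} T, \varphi_\sigma\rangle,$$
which is the function we studied in Proposition 32.1.1. Thus we know the meaning of the convolutions
$$\mathscr{D} * \mathscr{D}', \qquad \mathscr{S} * \mathscr{S}', \qquad C^\infty * \mathscr{E}'.$$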
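32.1.3 Proposition (convolution $\mathscr{S} * \mathscr{S}'$) Assume $\varphi \in \mathscr{S}(\mathbb{R})$ and $T \in \mathscr{S}'(\mathbb{R})$. The convolution $\varphi * T$ and all of its derivatives are slowly increasing $C^\infty$ functions.
Proof. By definition $\varphi * T(x) = \langle T_y, \varphi(x - y)\rangle$. We use Theorem 31.1.10 and write $T = \sum_{k=1}^{p} f_k^{(n_k)}$, where the continuous slowly increasing functions $f_k$ satisfy $|f_k(x)| \le C_k(1 + x^2)^{N_k}$. Then
$$\varphi * T(x) = \sum_{k=1}^{p}\langle f_k^{(n_k)}(y), \varphi(x - y)\rangle = \sum_{k=1}^{p}\int_{\mathbb{R}} f_k(y)\,\varphi^{(n_k)}(x - y)\,dy = \sum_{k=1}^{p}\int_{\mathbb{R}} f_k(x - y)\,\varphi^{(n_k)}(y)\,dy,$$
and
$$|f_k(x - y)| \le C_k\big(1 + (x - y)^2\big)^{N_k} = C_k\sum_{j=0}^{2N_k} k_j(y)\,x^j,$$
where $k_j(y)$ is a polynomial in $y$.

Since $\varphi \in \mathscr{S}(\mathbb{R})$, $k_j\varphi^{(n_k)}$ is in $\mathscr{S}(\mathbb{R})$ and hence in $L^1(\mathbb{R})$. Thus $|\varphi * T(x)|$ is bounded by a polynomial in $|x|$, which proves that it is slowly increasing.
One obtains similar estimates for the derivatives of $\varphi * T(x)$ by repeating the computation for $(\varphi * T)^{(k)}(x) = \langle T_y, \varphi^{(k)}(x - y)\rangle$. □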
We are now going to state a few of the convolution's essential properties. The convolution defined in Definition 32.1.2 is an operator that is continuous in each variable. The proof of this result [Sch65b], which we assume, depends on the topologies of the three spaces involved.
32.1.4 Proposition (continuity) The mapping $(\varphi, T) \mapsto \varphi * T$ defined under one of the hypotheses of Proposition 32.1.1 is continuous with respect to each variable.
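32.1.5 Proposition (derivation) If $\varphi \in C^\infty(\mathbb{R})$ and $T \in \mathscr{D}'(\mathbb{R})$ satisfy one of the hypotheses of Proposition 32.1.1, then $\varphi * T \in C^\infty(\mathbb{R})$ and
$$(\varphi * T)^{(k)} = \varphi^{(k)} * T = \varphi * T^{(k)}, \quad k = 1, 2, \ldots.$$
Proof. From (32.4), the $k$th derivative of the function
$$\varphi * T(x) = \langle \tau_{-x} T, \varphi_\sigma\rangle$$
is
$$(\varphi * T)^{(k)}(x) = (-1)^k\langle \tau_{-x} T, (\varphi_\sigma)^{(k)}\rangle = \langle \tau_{-x} T, (\varphi^{(k)})_\sigma\rangle = \langle T_y, \varphi^{(k)}(x - y)\rangle = \varphi^{(k)} * T(x).$$
On the other hand,
$$\varphi * T^{(k)}(x) = \langle \tau_{-x} T^{(k)}, \varphi_\sigma\rangle = (-1)^k\langle T, (\varphi_\sigma)^{(k)}(y - x)\rangle = \langle T, \varphi^{(k)}(x - y)\rangle = \varphi^{(k)} * T(x). \ \square$$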
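The derivation rule can be tested on the unit step $u$, whose derivative in the sense of distributions is $\delta$: the rule predicts $(\varphi * u)' = \varphi * u' = \varphi * \delta = \varphi$. The sketch below is only a numerical illustration under assumptions of our own: `phi` is a hypothetical bump function supported in $[-1, 1]$, the pairing $\langle u_y, \varphi(x - y)\rangle$ is computed by a midpoint rule, and the derivative is replaced by a central finite difference.

```python
import math

def phi(x):
    """Hypothetical test function in D(R): smooth bump supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def conv_with_heaviside(g, x, steps=20_000):
    """(g * u)(x) = <u_y, g(x - y)> = integral of g(x - y) over y in [0, x + 1].

    Assumes g vanishes outside [-1, 1], so g(x - y) = 0 once y > x + 1.
    """
    upper = x + 1.0
    if upper <= 0.0:
        return 0.0
    h = upper / steps
    return h * sum(g(x - (k + 0.5) * h) for k in range(steps))

def derivative(func, x, eps=1e-3):
    """Central finite difference, a numerical stand-in for d/dx."""
    return (func(x + eps) - func(x - eps)) / (2.0 * eps)

def check(x):
    """Return ((phi * u)'(x), (phi * delta)(x)) = (numerical derivative, phi(x))."""
    lhs = derivative(lambda s: conv_with_heaviside(phi, s), x)
    return lhs, phi(x)

for x in (-0.4, 0.0, 0.3):
    print(x, check(x))
```

The two entries of each printed pair should agree up to the discretization error of the finite difference.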
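32.1.6 Proposition (support) If $\varphi \in C^\infty(\mathbb{R})$ and $T \in \mathscr{E}'(\mathbb{R})$, then $\operatorname{supp}(\varphi * T) \subset \operatorname{supp}(\varphi) + \operatorname{supp}(T)$, where "+" denotes the algebraic sum of the two sets.
Proof. Since $\operatorname{supp}(T)$ is compact, $\operatorname{supp}(\varphi) + \operatorname{supp}(T)$ is closed. Define $\Omega = \mathbb{R} \setminus (\operatorname{supp}(\varphi) + \operatorname{supp}(T))$. For $x \in \Omega$ and $y \in \operatorname{supp}(\varphi)$, $(x - y) \notin \operatorname{supp}(T)$; that is, the support of the function $y \mapsto \varphi(x - y)$ does not meet $\operatorname{supp}(T)$, and hence $\langle T_y, \varphi(x - y)\rangle = 0$. This proves the required inclusion. □
32.1.7 Corollary If $\varphi \in \mathscr{D}(\mathbb{R})$ and $T \in \mathscr{E}'(\mathbb{R})$, then the convolution $\varphi * T$ has compact support.
The results of this section show that the convolution of a distribution and a $C^\infty$ function is a smoothing operation. We will see in the next section, where we introduce the convolution of two distributions, that $\mathscr{D}(\mathbb{R})$ is even dense in $\mathscr{D}'(\mathbb{R})$.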
32.2 The convolution $\mathcal{E}' * \mathcal{D}'$

The expressions (32.1) and (32.2) suggest a way to generalize the convolution to distributions. Given two distributions $S$ and $T$ we write
$$\langle S * T, \varphi \rangle = \langle S_t, \langle T_x, \varphi(x+t) \rangle \rangle = \langle T_u, \langle S_x, \varphi(x+u) \rangle \rangle.$$
We have seen that the function $\psi(t) = \langle T_x, \varphi(x+t) \rangle$ is $C^\infty$ when $T \in \mathcal{D}'(\mathbb{R})$. Thus the expression $\langle S_t, \psi(t) \rangle$ makes sense when $S \in \mathcal{E}'(\mathbb{R})$. Similarly, the function $\alpha(u) = \langle S_x, \varphi(x+u) \rangle$ is in $\mathcal{D}(\mathbb{R})$ by Proposition 32.1.6, and the expression $\langle T_u, \alpha(u) \rangle$ makes sense for all $T \in \mathcal{D}'(\mathbb{R})$. It is not clear, however, that $\langle S_t, \psi(t) \rangle = \langle T_u, \alpha(u) \rangle$, as is the case for functions. This is, in fact, true [Sch65b]; we state the next result without proof.
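32.2.1 Theorem ($\mathcal{E}' * \mathcal{D}'$) Assume $S \in \mathcal{E}'(\mathbb{R})$ and $T \in \mathcal{D}'(\mathbb{R})$.
(i) There exists a distribution called the convolution of $S$ and $T$ and denoted by $S * T$ such that for all $\varphi \in \mathcal{D}(\mathbb{R})$,
$$\langle S * T, \varphi \rangle = \langle S_t, \langle T_x, \varphi(x+t) \rangle \rangle = \langle T_u, \langle S_x, \varphi(x+u) \rangle \rangle. \tag{32.7}$$
(ii) The mapping $(S, T) \mapsto S * T$ from $\mathcal{E}'(\mathbb{R}) \times \mathcal{D}'(\mathbb{R})$ to $\mathcal{D}'(\mathbb{R})$ is continuous with respect to each variable.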

Formula (32.7) is important because it allows us to develop a calculus
for the convolution. The convolution is a commutative operation; we next
consider its other important properties.
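32.2.2 Proposition (Dirac distributions) Take $T \in \mathcal{D}'(\mathbb{R})$.
(i) Then
$$\delta_a * T = T * \delta_a = \tau_a T, \tag{32.8}$$
and in particular, $\delta$ acts like a unit element for convolution.
(ii)
$$\delta^{(k)} * T = T * \delta^{(k)} = T^{(k)}, \quad k = 1, 2, \ldots. \tag{32.9}$$
Proof. One needs to be careful not to confuse the index $a$ in $\delta_a$ with the "dummy variables" $u$ and $x$ in (32.7). Both results follow from simple computations. To prove (i) we have
$$\langle \delta_a * T, \varphi \rangle = \langle T_u, \langle \delta_a, \varphi(x+u) \rangle \rangle = \langle T_u, \varphi(a+u) \rangle = \langle T, \tau_{-a}\varphi \rangle = \langle \tau_a T, \varphi \rangle.$$
For the proof of (ii) we have
$$\langle \delta^{(k)} * T, \varphi \rangle = \langle T_u, \langle \delta^{(k)}, \varphi(x+u) \rangle \rangle = (-1)^k \langle T_u, \varphi^{(k)}(u) \rangle = \langle T^{(k)}, \varphi \rangle.$$ □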
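32.2.3 Proposition (derivatives) If $S \in \mathcal{E}'(\mathbb{R})$ and $T \in \mathcal{D}'(\mathbb{R})$, then
$$(S * T)^{(k)} = S^{(k)} * T = S * T^{(k)}, \quad k = 1, 2, \ldots. \tag{32.10}$$
Proof. For $\varphi \in \mathcal{D}(\mathbb{R})$,
$$\langle (S * T)^{(k)}, \varphi \rangle = (-1)^k \langle S * T, \varphi^{(k)} \rangle = (-1)^k \langle S_t, \langle T_x, \varphi^{(k)}(x+t) \rangle \rangle = \langle S_t, \langle T_x^{(k)}, \varphi(x+t) \rangle \rangle = \langle S * T^{(k)}, \varphi \rangle.$$
Similarly, $(S * T)^{(k)} = S^{(k)} * T$. □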
32.2.4 Remark If $T \in \mathcal{E}'(\mathbb{R})$, one can find a primitive of $T$ by writing $T * U$, where $U$ is the Heaviside distribution. Indeed, from (32.10) we see that $(T * U)' = T * U' = T * \delta = T$.
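The remark has a simple discrete analog, sketched below under stated assumptions: a finitely supported sequence stands in for $T$, a truncated sequence of ones stands in for the Heaviside distribution, and the backward difference `[1, -1]` stands in for $\delta'$. The names `convolve`, `primitive`, and `recovered` are illustrative only, and the discretization is an analogy rather than the distributional statement itself.

```python
def convolve(a, b):
    """Full discrete convolution of two finite sequences indexed from 0."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

T = [3, -1, 4, 1, -5]   # finitely supported sequence (stand-in for T in E')
N = 12                  # truncation length, an assumption of this sketch
U = [1] * N             # discrete Heaviside step, truncated to N samples
D = [1, -1]             # backward difference, discrete stand-in for delta'

# T * U gives the running sums of T; only the first N samples are exact
# because U was truncated.
primitive = convolve(T, U)[:N]

# (T * U)' = T * U' = T * delta = T: differencing the running sums gives T back.
recovered = convolve(primitive, D)[:N]
```

Here `primitive` is the sequence of partial sums of `T`, and `recovered` agrees with `T` padded by zeros, mirroring $(T * U)' = T$.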
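32.2.5 Proposition (support of a convolution)
(i) Assume $S \in \mathcal{E}'(\mathbb{R})$ and $T \in \mathcal{D}'(\mathbb{R})$. If $\mathrm{supp}(S) = A$ and $\mathrm{supp}(T) = B$, then $\mathrm{supp}(S * T) \subset A + B$.
(ii) If $S$ and $T$ are in $\mathcal{E}'(\mathbb{R})$, then $S * T \in \mathcal{E}'(\mathbb{R})$.
Proof.
(i) Since $A$ is compact, $A + B$ is closed. Let $\Omega = \mathbb{R} \setminus (A + B)$ and take $\varphi \in \mathcal{D}(\mathbb{R})$ with $\mathrm{supp}(\varphi) \subset \Omega$. We will show that $\langle S * T, \varphi \rangle = 0$. We have $\langle S * T, \varphi \rangle = \langle T_u, \langle S_x, \varphi(x+u) \rangle \rangle$ and $\psi(u) = \langle S_x, \varphi(x+u) \rangle = S * \varphi_\sigma(-u)$. Thus we wish to show that $\mathrm{supp}(\psi) \cap B = \emptyset$.
If $u \in \mathrm{supp}(\psi) \cap B$, then $-u \in \mathrm{supp}(\varphi_\sigma) + \mathrm{supp}(S)$ (recall that by Proposition 32.1.6, $\mathrm{supp}(S * \varphi_\sigma) \subset \mathrm{supp}\,\varphi_\sigma + \mathrm{supp}\,S$). This means that $-u = y + x$ with $-y \in \mathrm{supp}(\varphi)$ and $x \in \mathrm{supp}(S)$. But then $-y = u + x$ with $u \in B$, $x \in A$, and hence
$$\mathrm{supp}(\varphi) \cap (A + B) \ne \emptyset,$$
which is a contradiction, since $\mathrm{supp}(\varphi) \subset \Omega = \mathbb{R} \setminus (A + B)$.
(ii) If $S$ and $T$ are in $\mathcal{E}'(\mathbb{R})$, then $A + B$ is compact. □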
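The inclusion $\mathrm{supp}(S * T) \subset A + B$ and the translation rule (32.8) can be illustrated on finitely supported sequences, which behave like finite combinations of Dirac distributions at integer points. This is only a sketch: the dictionary representation (index to coefficient) and the names `convolve`, `S`, `T`, and `delta_a` are assumptions made for illustration.

```python
from itertools import product

def convolve(s, t):
    """Convolve two finitely supported sequences stored as {index: value}."""
    out = {}
    for (i, si), (j, tj) in product(s.items(), t.items()):
        out[i + j] = out.get(i + j, 0) + si * tj
    # Drop exact zeros so that the keys are the support of the result.
    return {k: v for k, v in out.items() if v != 0}

S = {0: 1, 2: -1}        # support A = {0, 2}
T = {5: 2, 6: 3, 9: 1}   # support B = {5, 6, 9}

support_sum = {a + b for a in S for b in T}   # algebraic sum A + B
conv = convolve(S, T)

# delta_a * T = tau_a T: convolving with a unit mass at a shifts T by a.
delta_a = {3: 1}
shifted = convolve(delta_a, T)

# The inclusion can be strict when terms cancel.
cancel = convolve({0: 1, 1: -1}, {0: 1, 1: 1})
```

In the last line the supports add up to `{0, 1, 2}`, but the coefficient at index 1 cancels, so the support of the convolution is only `{0, 2}`.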
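32.2.6 Proposition (density of $\mathcal{D}(\mathbb{R})$ in $\mathcal{D}'(\mathbb{R})$) If $T \in \mathcal{D}'(\mathbb{R})$, then there exists a regularizing sequence $\theta_n \in \mathcal{D}(\mathbb{R})$ such that $\theta_n$ converges to $T$ in $\mathcal{D}'(\mathbb{R})$.
Proof. Choose the usual regularizing sequence $\rho_n$ (Definition 21.3.1). We know that $\rho_n$ tends to $\delta$ in $\mathcal{D}'(\mathbb{R})$. Let $\alpha_n = \rho_n * T$. By Theorem 32.2.1(ii), $\alpha_n$ converges to $T$ in $\mathcal{D}'(\mathbb{R})$. This proves that $C^\infty(\mathbb{R})$ is dense in $\mathcal{D}'(\mathbb{R})$.
To prove that $\mathcal{D}(\mathbb{R})$ is dense in $\mathcal{D}'(\mathbb{R})$, we must show that the functions $\alpha_n$ can be chosen with compact support. We fix this by multiplying $\alpha_n$ by a function $\chi_n \in \mathcal{D}(\mathbb{R})$ such that $\chi_n(x) = 1$ for $|x| \le n$ and $\chi_n(x) = 0$ for $|x| \ge n + 1$. Then $\theta_n = \alpha_n \chi_n \in \mathcal{D}(\mathbb{R})$, and it converges to $T$ in $\mathcal{D}'(\mathbb{R})$. □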
32.3 The convolution $\mathcal{E}' * \mathcal{S}'$
The convolution $\mathcal{E}' * \mathcal{S}'$ is a particular case of the convolution $\mathcal{E}' * \mathcal{D}'$. But in this case the distribution that one obtains is tempered.
Referring to (32.7) we see that the functions $\psi(t) = \langle T_x, \varphi(x+t) \rangle$ and $\alpha(u) = \langle S_x, \varphi(x+u) \rangle$ are well-defined when $\varphi \in \mathcal{S}(\mathbb{R})$, $T \in \mathcal{S}'(\mathbb{R})$, and $S \in \mathcal{E}'(\mathbb{R})$ (Proposition 32.1.1). We establish a preliminary result.
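32.3.1 Proposition If $S \in \mathcal{E}'(\mathbb{R})$, the mapping $\varphi \mapsto \alpha$ defined by $\alpha(u) = \langle S_x, \varphi(x+u) \rangle$ is continuous from $\mathcal{S}(\mathbb{R})$ to $\mathcal{S}(\mathbb{R})$.
Proof. We use the theorem about the structure of elements in $\mathcal{E}'(\mathbb{R})$ (Theorem 31.4.3). Thus we write $S = \sum_{j=1}^{p} f_j^{(n_j)}$, where the $f_j$ are continuous with support in some compact set $K$. Then
$$\alpha(u) = \sum_{j=1}^{p} (-1)^{n_j} \int_K f_j(x)\, \varphi^{(n_j)}(x+u)\, dx,$$
and it is not difficult to verify that $\alpha \in \mathcal{S}(\mathbb{R})$.
To prove continuity, we assume that $\varphi_n \to 0$ in $\mathcal{S}(\mathbb{R})$ and show that the corresponding sequence $\alpha_n$ converges to 0 in $\mathcal{S}(\mathbb{R})$. It is clear that we can differentiate under the integral sign, and consequently
$$|u^m \alpha_n^{(q)}(u)| \le \sum_{j=1}^{p} \int_K |f_j(x)|\, |u^m \varphi_n^{(n_j+q)}(x+u)|\, dx$$
$$\le \sum_{j=1}^{p} \int_K |f_j(x)|\, \frac{|u|^m}{1 + |x+u|^m}\, (1 + |x+u|^m)\, |\varphi_n^{(n_j+q)}(x+u)|\, dx$$
$$\le C \sum_{j=1}^{p} \Big( \|\varphi_n^{(n_j+q)}\|_\infty + \sup_{t \in \mathbb{R}} |t^m \varphi_n^{(n_j+q)}(t)| \Big)$$
for some constant $C$. This shows that $\varphi_n \to 0$ in $\mathcal{S}(\mathbb{R})$ implies that $\alpha_n$ converges to 0 in $\mathcal{S}(\mathbb{R})$. □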
We use this result to prove the next one.
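32.3.2 Proposition If $S \in \mathcal{E}'(\mathbb{R})$ and $T \in \mathcal{S}'(\mathbb{R})$, then the convolution $S * T$ is a tempered distribution.
Proof. We know that $S * T$ is a distribution. Let $\varphi_n$ be a sequence in $\mathcal{D}(\mathbb{R})$ that tends to 0 in $\mathcal{S}(\mathbb{R})$. Then $\langle S * T, \varphi_n \rangle = \langle T_u, \langle S_x, \varphi_n(x+u) \rangle \rangle$, and by Proposition 32.3.1 the sequence $\alpha_n(u) = \langle S_x, \varphi_n(x+u) \rangle$ is in $\mathcal{S}(\mathbb{R})$ and converges to 0. Hence $\lim_{n \to \infty} \langle T, \alpha_n \rangle = 0$. The result follows from Proposition 31.1.3. □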
The next step is to examine continuity.
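32.3.3 Proposition
(i) Let $S_n$ be a sequence in $\mathcal{E}'(\mathbb{R})$ that converges to 0 in $\mathcal{E}'(\mathbb{R})$; that is, $\langle S_n, \varphi \rangle \to 0$ for all $\varphi \in C^\infty(\mathbb{R})$. Then $S_n * T \to 0$ in $\mathcal{S}'(\mathbb{R})$, and hence in $\mathcal{D}'(\mathbb{R})$, for all $T \in \mathcal{S}'(\mathbb{R})$.
(ii) Let $T_n$ be a sequence in $\mathcal{S}'(\mathbb{R})$ that converges to 0 in $\mathcal{S}'(\mathbb{R})$. Then for all $S \in \mathcal{E}'(\mathbb{R})$, $S * T_n \to 0$ in $\mathcal{S}'(\mathbb{R})$, and hence in $\mathcal{D}'(\mathbb{R})$.
Proof.
(i) By Proposition 32.1.1, the function $\psi(t) = \langle T_x, \varphi(x+t) \rangle$ is in $C^\infty(\mathbb{R})$ for all $\varphi$ in $\mathcal{S}(\mathbb{R})$. Thus
$$\langle S_n * T, \varphi \rangle = \langle S_n, \psi \rangle \to 0 \quad \text{as } n \to \infty.$$
(ii) Similarly, $\alpha(u) = \langle S_x, \varphi(x+u) \rangle$ is in $\mathcal{S}(\mathbb{R})$ for all $\varphi$ in $\mathcal{S}(\mathbb{R})$ (Proposition 32.3.1). Hence $\lim_{n \to \infty} \langle S * T_n, \varphi \rangle = \lim_{n \to \infty} \langle T_n, \alpha \rangle = 0$. □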
It is necessary to pay attention to the various notions of convergence. Here is a simple example:
$$\delta_n * 1 = 1$$
for all $n$, and $\delta_n$ converges to 0 in $\mathcal{D}'(\mathbb{R})$. In this case the result of Proposition 32.3.3 is not true. This is because $\delta_n$ does not converge in $\mathcal{E}'(\mathbb{R})$.
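32.3.4 Proposition
(i) Let $S_n$ be a sequence in $\mathcal{E}'(\mathbb{R})$ that converges to 0 in $\mathcal{D}'(\mathbb{R})$. Assume that there exists a compact set $K$ such that $\mathrm{supp}(S_n) \subset K$ for all $n$. Then $S_n * T \to 0$ in $\mathcal{D}'(\mathbb{R})$ for all $T \in \mathcal{S}'(\mathbb{R})$.
(ii) Let $T_n$ be a sequence in $\mathcal{S}'(\mathbb{R})$ that converges to 0 in $\mathcal{D}'(\mathbb{R})$. Then for all $S \in \mathcal{E}'(\mathbb{R})$, $S * T_n \to 0$ in $\mathcal{D}'(\mathbb{R})$.
Proof.
(i) Take $\varphi \in \mathcal{D}(\mathbb{R})$. The function $\psi(t) = \langle T_x, \varphi(x+t) \rangle$ is in $C^\infty(\mathbb{R})$. Since $\mathrm{supp}(S_n) \subset K$, $\langle S_n, \psi \rangle = \langle S_n, \theta\psi \rangle$, where $\theta$ is a function in $\mathcal{D}(\mathbb{R})$ such that $\theta(x) = 1$ for $x \in K$. Then $\theta\psi$ is in $\mathcal{D}(\mathbb{R})$ and $\langle S_n, \theta\psi \rangle \to 0$.
(ii) If $\varphi \in \mathcal{D}(\mathbb{R})$, then the function $\alpha(u) = \langle S_x, \varphi(x+u) \rangle$ is in $\mathcal{D}(\mathbb{R})$, and hence $\langle T_n, \alpha \rangle \to 0$. □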
32.4 The convolution $\mathcal{D}'_+ * \mathcal{D}'_+$

We have studied the convolution of two distributions where at least one
of them has compact support. Without this condition on the support, the
convolution is not generally defined. However, as is the case for functions,
the convolution is defined when both distributions are in $\mathcal{D}'_+$ (or $\mathcal{D}'_-$).
We begin with a preliminary result about $\mathcal{D}'_+$.
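32.4.1 Proposition Suppose that $T \in \mathcal{D}'_+$ and $\varphi \in C^\infty(\mathbb{R})$ and that $\mathrm{supp}(T) \subset [a, +\infty)$ and $\mathrm{supp}(\varphi) \subset (-\infty, b]$. Then $\langle T, \varphi \rangle$, defined by
$$\langle T, \varphi \rangle = \langle T, \theta\varphi \rangle, \tag{32.11}$$
where $\theta$ is a function in $\mathcal{D}(\mathbb{R})$ equal to 1 on an interval $[-M, M]$ containing $a$ and $b$ in its interior, is well-defined.
Proof. $\theta\varphi \in \mathcal{D}(\mathbb{R})$, so $\langle T, \theta\varphi \rangle$ makes sense. We must show that the definition of $\langle T, \varphi \rangle$ does not depend on the choice of $\theta$. Let $\theta_1$ be another function in $\mathcal{D}(\mathbb{R})$ equal to 1 on $[-M_1, M_1]$ containing $a$ and $b$. Then $(\theta - \theta_1)\varphi$ vanishes on $[-m, +\infty)$, where $m = \min\{M, M_1\}$. Since $\mathrm{supp}(T) \subset [a, +\infty)$, we have $\mathrm{supp}(T) \cap \mathrm{supp}((\theta - \theta_1)\varphi) = \emptyset$ and $\langle T, (\theta - \theta_1)\varphi \rangle = 0$. □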
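To define the convolution, it is necessary to give meaning to the expressions $\langle S_t, \langle T_x, \varphi(x+t) \rangle \rangle$ and $\langle T_u, \langle S_x, \varphi(x+u) \rangle \rangle$ for $S$ and $T$ in $\mathcal{D}'_+$ and $\varphi$ in $\mathcal{D}(\mathbb{R})$.
32.4.2 Proposition Suppose that $T \in \mathcal{D}'_+$ and $\varphi \in C^\infty(\mathbb{R})$ and that $\mathrm{supp}(T) \subset [a, +\infty)$ and $\mathrm{supp}(\varphi) \subset (-\infty, b]$. Then $\psi(t) = \langle T_x, \varphi(x+t) \rangle$ is defined for all $t \in \mathbb{R}$, $\mathrm{supp}(\psi) \subset (-\infty, b-a]$, and $\psi \in C^\infty(\mathbb{R})$.
Proof. The function $\tau_{-t}\varphi$ is in $C^\infty(\mathbb{R})$ with $\mathrm{supp}(\tau_{-t}\varphi) \subset (-\infty, b-t]$ for all $t \in \mathbb{R}$. Thus by Proposition 32.4.1, $\langle T_x, \varphi(x+t) \rangle$ is well-defined. Now, $\psi(t) = 0$ if $\mathrm{supp}(\tau_{-t}\varphi) \cap \mathrm{supp}(T) = \emptyset$, which is the case when $b - t < a$. Hence, $\mathrm{supp}(\psi) \subset (-\infty, b-a]$. That $\psi \in C^\infty$ is a consequence of (32.11) and Proposition 32.1.1(i). □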
These preliminary results lead to the next theorem, which we state with-
out proof (see, for example, [Sch65b]).
32.4.3 Theorem (the convolution $\mathcal{D}'_+ * \mathcal{D}'_+$) Suppose that $S$ and $T$ are in $\mathcal{D}'_+$.
(i) There exists a distribution called the convolution of $S$ and $T$ and denoted by $S * T$ such that for all $\varphi \in \mathcal{D}(\mathbb{R})$,
$$\langle S * T, \varphi \rangle = \langle S_t, \langle T_x, \varphi(x+t) \rangle \rangle = \langle T_u, \langle S_x, \varphi(x+u) \rangle \rangle. \tag{32.12}$$
(ii) $(S * T)^{(k)} = S^{(k)} * T = S * T^{(k)}$, $k = 1, 2, 3, \ldots$.
(iii) The mapping $(S, T) \mapsto S * T$ of $\mathcal{D}'_+ \times \mathcal{D}'_+$ into $\mathcal{D}'$ is continuous with respect to each variable. (The convergence of $S_n$ to 0 in $\mathcal{D}'_+$ means that $S_n \to 0$ in $\mathcal{D}'(\mathbb{R})$ and that there exists a $c$ such that for all $n$, $\mathrm{supp}(S_n) \subset [c, +\infty)$.)
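32.4.4 Proposition (support of $S * T$) If $S$ and $T \in \mathcal{D}'_+$ with $\mathrm{supp}(S) \subset [a_1, +\infty)$ and $\mathrm{supp}(T) \subset [a_2, +\infty)$, then
$$\mathrm{supp}(S * T) \subset [a_1 + a_2, +\infty),$$
and hence $S * T$ is in $\mathcal{D}'_+$.
Proof. Take $\varphi \in \mathcal{D}(\mathbb{R})$ with $\mathrm{supp}(\varphi) \subset (-\infty, a_1 + a_2)$. The support of $\psi(t) = \langle T_x, \varphi(x+t) \rangle$ is in $(-\infty, a_1)$ by Proposition 32.4.2. Thus
$$\mathrm{supp}(\psi) \cap \mathrm{supp}(S) = \emptyset$$
and $\langle S * T, \varphi \rangle = 0$, which proves that $\mathrm{supp}(S * T) \subset [a_1 + a_2, +\infty)$. □
32.4.5 Remark As in Section 32.2.4, we can obtain a primitive of $T$ in $\mathcal{D}'_+$ by taking the convolution of $T$ with the Heaviside distribution $U$. Since $U \in \mathcal{D}'_+$, the primitive of $T$ is in $\mathcal{D}'_+$.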
32.5 The associativity of convolution

We have defined the convolution of two distributions and seen that it is a commutative operation. If we wish to convolve three or more distributions, we run into two problems: existence and associativity. Here is a classic example: We wish to compute $1 * \delta' * u$. From Proposition 32.2.3,
$$1 * \delta' = 1' * \delta = 0.$$
Thus $(1 * \delta') * u$ makes sense and is equal to 0. On the other hand, by Proposition 32.2.2,
$$\delta' * u = \delta * u' = \delta * \delta = \delta;$$
hence $1 * (\delta' * u) = 1 * \delta$ makes sense and is equal to 1. This shows that the convolution product is not associative in general. Nevertheless, convolution is associative in several cases.
32.5.1 Proposition The convolution of $n$ distributions of which at least $n - 1$ have compact support is associative and commutative.
Proof. The proof is based directly on the definitions. We take $S$ and $T$ in $\mathcal{E}'(\mathbb{R})$ and $U$ in $\mathcal{D}'(\mathbb{R})$; we show, for example, that $(S * T) * U = S * (T * U)$. First, these convolutions make sense, since $S * T \in \mathcal{E}'(\mathbb{R})$ (Proposition 32.2.5(ii)) and $(S * T) * U$ is an $\mathcal{E}' * \mathcal{D}'$ convolution. Similarly, $T * U$ is an $\mathcal{E}' * \mathcal{D}'$ convolution, so $T * U \in \mathcal{D}'$; and $S * (T * U)$ is another $\mathcal{E}' * \mathcal{D}'$ convolution. Take $\varphi \in \mathcal{D}(\mathbb{R})$. From equation (32.7),
$$\langle (S * T) * U, \varphi \rangle = \langle (S * T)_t, \langle U_x, \varphi(x+t) \rangle \rangle.$$
Now, $\psi(t) = \langle U_x, \varphi(x+t) \rangle$ is in $C^\infty$ and $S * T \in \mathcal{E}'(\mathbb{R})$. Then
$$\langle S * T, \psi \rangle = \langle S_z, \langle T_u, \psi(u+z) \rangle \rangle = \langle S_z, \langle T_u, \langle U_x, \varphi(x+u+z) \rangle \rangle \rangle = \langle S_z, \langle (T * U)_t, \varphi(t+z) \rangle \rangle = \langle S * (T * U), \varphi \rangle,$$
from which we see that $(S * T) * U = S * (T * U)$. □
The space $\mathcal{E}'(\mathbb{R})$ endowed with the convolution operation is a convolution algebra, since $T, S \in \mathcal{E}'(\mathbb{R})$ implies that $T * S \in \mathcal{E}'(\mathbb{R})$. This algebra is commutative and associative, and it contains a unit element, $\delta$. $\mathcal{D}'_+(\mathbb{R})$ is another important convolution algebra.
32.5.2 Proposition The convolution in $\mathcal{D}'_+(\mathbb{R})$ is associative.
The proof is similar to that of Proposition 32.5.1, and $\delta$ is also a unit element for this algebra.
Since these two distribution algebras have unit elements, it is natural to ask whether a distribution $S$ in $\mathcal{E}'(\mathbb{R})$ or $\mathcal{D}'_+(\mathbb{R})$ has an inverse; that is, is there a $T$ such that
$$S * T = T * S = \delta?$$
This question is important for solving differential equations with constant coefficients. With this in mind, we introduce the differential operator
$$P = \sum_{m=0}^{p} a_m D^{(m)},$$
where $a_m \in \mathbb{C}$ and $D^{(m)}$ denotes the $m$th derivative. Given $U \in \mathcal{D}'(\mathbb{R})$, we wish to find a $T \in \mathcal{D}'(\mathbb{R})$ such that
$$P(T) = U. \tag{32.13}$$
We saw in Proposition 32.2.2(ii) that $T^{(k)} = \delta^{(k)} * T$, and hence we can write (32.13) as
$$\sum_{m=0}^{p} a_m \delta^{(m)} * T = U. \tag{32.14}$$
A distribution $E$ is said to be an elementary solution of (32.14) if
$$\sum_{m=0}^{p} a_m \delta^{(m)} * E = \delta.$$
This means that $E$ is the inverse of $S = \sum_{m=0}^{p} a_m \delta^{(m)}$. (We will develop methods for computing elementary solutions in Lesson 35.) If we have associativity, then knowing $E$ yields all the solutions of (32.13), since
$$S * (E * U) = (S * E) * U = \delta * U = U,$$
and hence $T = E * U$. Furthermore, this solution is unique: If $T_1$ and $T_2$ satisfy (32.13), then we have
$$S * T_1 = S * T_2 = U.$$
By taking the convolution with $E$ this becomes, thanks to associativity,
$$(E * S) * T_1 = (E * S) * T_2,$$
and since $E * S = \delta$, we are left with
$$T_1 = T_2.$$
In the language of filters, we will see that $E$ is the impulse response (Lessons 34 and 35). These techniques also apply to finding solutions of partial differential equations [Sch65a].
32.6 Exercises
Exercise 32.1 Let $S$ and $T$ be two even (or two odd) distributions. Show that $S * T$ is even.
Exercise 32.2 If $T$ is a distribution in $\mathcal{D}'_+(\mathbb{R})$, show that it has a unique primitive in $\mathcal{D}'_+(\mathbb{R})$, which is $P = u * T$.
Exercise 32.3 Let $P$ be a polynomial of degree less than or equal to $m$. For which distributions $S$ does $S * P$ make sense? Show that $S * P$ is a polynomial of degree less than or equal to $m$.
Exercise 32.4 Suppose $f(x) = (1 - x)u(x)$ and $g(x) = e^x u(x)$. Show that $f * g$ makes sense and compute this convolution.
Exercise 32.5 Consider the differential operator defined by
$$P(f) = f'' + a^2 f, \quad a \in \mathbb{R}.$$
Show that $u(x)\,\dfrac{\sin ax}{a}$ is an elementary solution of $P$ in $\mathcal{D}'_+(\mathbb{R})$ if $a \ne 0$ and that $u(x)x$ is an elementary solution if $a = 0$.
Exercise 32.6 Keeping in mind the last exercise, show that it is possible to find an elementary solution for the operator
$$P = \frac{d^m}{dx^m} + a_{m-1}\frac{d^{m-1}}{dx^{m-1}} + \cdots + a_1 \frac{d}{dx} + a_0$$
of the form $E(x) = u(x)g(x)$, where $g$ is a $C^\infty$ function that is the solution of a differential equation in the classical sense. What is this equation and what are the boundary conditions?
Exercise 32.7
(a) If $A_1$ and $A_2$ have inverses in $\mathcal{D}'_+(\mathbb{R})$, show that $A_1 * A_2$ is invertible. What is the inverse of $A_1 * A_2$?
(b) Use this result to find the elementary solution of the operator
$$P = \frac{d^2}{dx^2} - 2\lambda \frac{d}{dx} + \lambda^2.$$
Exercise 32.8 What are the inverses in $\mathcal{D}'_+(\mathbb{R})$ of $u$, $\delta'$, and $\delta' - a\delta$?
Lesson 33
Convolution and the Fourier Transform of Distributions

As is the case for functions (Lesson 23), the Fourier transform interchanges convolution and multiplication of distributions. We wish to determine under what conditions the relations $\widehat{T * U} = \hat{T} \cdot \hat{U}$ and $\widehat{T \cdot U} = \hat{T} * \hat{U}$ are true. The first thing to notice is that one must be careful manipulating these relations, since the product of two distributions is not generally defined. We faced a similar problem with the convolution in the last lesson. There we were able to establish several conditions under which the convolution is well-defined and consistent with the convolution for functions.
33.1 The Fourier transform and convolution $\mathcal{S} * \mathcal{S}'$

The interchange properties established for $\mathcal{S}(\mathbb{R})$ and the definition of the Fourier transform on $\mathcal{S}'(\mathbb{R})$ lead to our first result.
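33.1.1 Proposition If $\psi \in \mathcal{S}(\mathbb{R})$ and $T \in \mathcal{S}'(\mathbb{R})$, then
(i) $\widehat{\psi * T} = \hat{\psi} \cdot \hat{T}$; (33.1)
(ii) $\widehat{\psi \cdot T} = \hat{\psi} * \hat{T}$. (33.2)
Proof.
(i) $\psi * T$ is in $\mathcal{S}'(\mathbb{R})$ by Proposition 32.1.3. For all $\varphi \in \mathcal{S}(\mathbb{R})$,
$$\langle \widehat{\psi * T}, \varphi \rangle = \langle \psi * T, \hat{\varphi} \rangle = \langle T_u, \langle \psi_x, \hat{\varphi}(x+u) \rangle \rangle.$$
On the other hand, since $\hat{\psi} \in \mathcal{S}(\mathbb{R})$ and $\hat{T} \in \mathcal{S}'(\mathbb{R})$, the product $\hat{\psi} \cdot \hat{T}$ makes sense. Thus
$$\langle \hat{\psi} \cdot \hat{T}, \varphi \rangle = \langle \hat{T}, \hat{\psi} \cdot \varphi \rangle = \langle T, \widehat{\hat{\psi} \cdot \varphi} \rangle.$$
Applying Proposition 23.1.2(ii), we see that
$$\widehat{\hat{\psi}\,\varphi} = \hat{\hat{\psi}} * \hat{\varphi} = \psi_\sigma * \hat{\varphi} = \langle \psi_x, \hat{\varphi}(x+u) \rangle;$$
hence $\widehat{\psi * T} = \hat{\psi} \cdot \hat{T}$.
(ii) $\hat{\psi} \in \mathcal{S}(\mathbb{R})$ and $\hat{T} \in \mathcal{S}'(\mathbb{R})$ imply that $\hat{\psi} * \hat{T}$ is defined and is in $\mathcal{S}'(\mathbb{R})$. Applying the operator $\overline{\mathcal{F}}$ as in (i), we have
$$\overline{\mathcal{F}}(\hat{\psi} * \hat{T}) = \overline{\mathcal{F}}\hat{\psi} \cdot \overline{\mathcal{F}}\hat{T} = \psi \cdot T.$$
Since $\mathcal{F}$ is an isomorphism on $\mathcal{S}'(\mathbb{R})$ (Theorem 31.2.3), $\hat{\psi} * \hat{T} = \widehat{\psi \cdot T}$. □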

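33.2 The Fourier transform and convolution $\mathcal{E}' * \mathcal{S}'$

If we take $S \in \mathcal{E}'(\mathbb{R})$ and $T \in \mathcal{S}'(\mathbb{R})$, then $\hat{S}$ is in $C^\infty(\mathbb{R})$ (Theorem 31.5.1) and $\hat{T}$ is in $\mathcal{S}'(\mathbb{R})$. The product $\hat{S}\hat{T}$ makes sense, $S * T \in \mathcal{S}'(\mathbb{R})$ (Proposition 32.3.2), and we can compute $\widehat{S * T}$.
33.2.1 Proposition If $S \in \mathcal{E}'(\mathbb{R})$ and $T \in \mathcal{S}'(\mathbb{R})$, then
$$\widehat{S * T} = \hat{S} \cdot \hat{T}. \tag{33.3}$$
Proof. For all $\varphi \in \mathcal{S}(\mathbb{R})$,
$$\langle \hat{S} \cdot \hat{T}, \varphi \rangle = \langle \hat{T}, \hat{S} \cdot \varphi \rangle.$$
Since $S \in \mathcal{E}'(\mathbb{R})$, the Fourier transform $\hat{S}$ and all of its derivatives are slowly increasing $C^\infty$ functions (Theorem 31.5.1). Thus the product $\hat{S}\varphi$ is in $\mathcal{S}(\mathbb{R})$ and
$$\langle \hat{T}, \hat{S}\varphi \rangle = \langle T, \widehat{\hat{S}\varphi} \rangle.$$
By applying (33.2) we see that
$$\widehat{\hat{S}\varphi} = \hat{\hat{S}} * \hat{\varphi} = S_\sigma * \hat{\varphi}.$$
Thus
$$\langle T, \widehat{\hat{S}\varphi} \rangle = \langle T_u, \langle S_\sigma(x), \hat{\varphi}(u-x) \rangle \rangle = \langle T_u, \langle S_x, \hat{\varphi}(u+x) \rangle \rangle = \langle T * S, \hat{\varphi} \rangle = \langle \widehat{T * S}, \varphi \rangle.$$ □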
33.3 The Fourier transform and convolution $L^2 * L^2$

We studied the convolution $L^2 * L^2$ in Section 23.2, but it was not possible then to prove the formula $\widehat{f * g} = \hat{f} \cdot \hat{g}$ (Section 23.2.2). In fact, when $f$ and $g$ are in $L^2(\mathbb{R})$, the best one can say in general is that $f * g$ is in $L^\infty \cap C^0$ (Section 20.5), and it is not possible to take the Fourier transform of such a function. Now, however, we know that $f * g$ is in $\mathcal{S}'(\mathbb{R})$ (Proposition 31.1.8), and it is possible to take the Fourier transform of such a distribution. Since $f * g(t) = \overline{\mathcal{F}}(\hat{f} \cdot \hat{g})(t)$ for all $t \in \mathbb{R}$ (Proposition 23.2.1(i)), we have $\widehat{f * g} = \hat{f} \cdot \hat{g}$ simply by taking the Fourier transform.
33.3.1 Proposition If $f$ and $g$ are in $L^2(\mathbb{R})$, then
(i) $\widehat{f \cdot g} = \hat{f} * \hat{g}$;
(ii) $\widehat{f * g} = \hat{f} \cdot \hat{g}$.
Equation (i) is just a restatement of Proposition 23.2.1(ii).
33.4 The Hilbert transform

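Let $f$ be a function in $L^2(\mathbb{R})$ with $\mathrm{supp}(f) \subset [0, +\infty)$, which means that $f$ is a causal signal with finite energy. Suppose in addition that $f$ is real-valued. We intend to study the real and imaginary parts of its Fourier transform
$$\hat{f}(\xi) = A(\xi) + iB(\xi).$$
In particular, we wish to find expressions for the functions $A(\xi)$ and $B(\xi)$. Since $\mathrm{supp}(f) \subset [0, +\infty)$, we can write $f(x) = \mathrm{sign}(x) f(x)$. Operating formally, we have
$$\hat{f}(\xi) = \widehat{\mathrm{sign}} * \hat{f}(\xi) = \frac{1}{i\pi}\,\mathrm{pv}\Big(\frac{1}{\xi}\Big) * \hat{f}(\xi). \tag{33.4}$$
This is an $\mathcal{S}' * L^2$ convolution, which is not one of the cases we have discussed. We can, however, decompose $\frac{1}{i\pi}\mathrm{pv}\big(\frac{1}{x}\big)$ and write it as
$$\frac{1}{i\pi}\,\mathrm{pv}\Big(\frac{1}{x}\Big) = S + g$$
with $S \in \mathcal{E}'(\mathbb{R})$ and $g \in L^2(\mathbb{R})$ (see Exercise 28.10). Equation (33.4) is valid because $\widehat{\mathrm{sign}} * \hat{f}$ makes sense, and
$$\overline{\mathcal{F}}(\widehat{\mathrm{sign}} * \hat{f}) = \overline{\mathcal{F}}((S + g) * \hat{f}) = \overline{\mathcal{F}}(S)\, f + \overline{\mathcal{F}}(g)\, f = \overline{\mathcal{F}}(S + g)\, f = \mathrm{sign} \cdot f = f.$$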
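Thus we see that
$$A(\xi) + iB(\xi) = \frac{1}{i\pi}\,\mathrm{pv}\Big(\frac{1}{\xi}\Big) * (A(\xi) + iB(\xi)),$$
and thus
$$A = \frac{1}{\pi}\,\mathrm{pv}\Big(\frac{1}{\xi}\Big) * B, \tag{33.5}$$
$$B = -\frac{1}{\pi}\,\mathrm{pv}\Big(\frac{1}{\xi}\Big) * A. \tag{33.6}$$
33.4.1 Proposition (the Hilbert transform)
(i) The operator $H : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ defined by $Hf = \frac{1}{\pi}\,\mathrm{pv}\big(\frac{1}{x}\big) * f$ is linear, 1-to-1, and bicontinuous. It is an isometry on $L^2(\mathbb{R})$, and $H^{-1} = -H$.
(ii) If $f \in L^2(\mathbb{R})$ is real and if $\mathrm{supp}(f) \subset (0, +\infty)$, then the real part of $\hat{f}$ is the Hilbert transform of the imaginary part of $\hat{f}$.
Proof.
(i) We have just seen that $H$ is well-defined on $L^2(\mathbb{R})$. $H(f)$ is in $L^2(\mathbb{R})$, since
$$\mathcal{F}\Big[\frac{1}{\pi}\,\mathrm{pv}\Big(\frac{1}{x}\Big) * f\Big] = \mathcal{F}\Big[\frac{1}{\pi}\,\mathrm{pv}\Big(\frac{1}{x}\Big)\Big] \cdot \hat{f} = -i\,\mathrm{sign}(\xi)\hat{f}(\xi) \tag{33.7}$$
and $\hat{f} \in L^2(\mathbb{R})$. $H$ is clearly linear; from (33.7) and Theorem 22.1.4(iii) we have
$$\|Hf\|_2 = \|\widehat{Hf}\|_2 = \|\hat{f}\|_2 = \|f\|_2,$$
which proves that $H$ is an isometry. To find the inverse, apply (33.7) to $H(f)$:
$$\mathcal{F}[H(H(f))](\xi) = -i\,\mathrm{sign}(\xi)\,\widehat{H(f)}(\xi) = -\hat{f}(\xi).$$
Thus $H(H(f)) = -f$ for all $f \in L^2(\mathbb{R})$, and $H^{-1} = -H$.
(ii) This is a restatement of (33.5). □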
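Relation (33.7) suggests a discrete experiment, sketched here under explicit assumptions: a periodic sampled signal replaces $f \in L^2(\mathbb{R})$, a naive discrete Fourier transform (with the same sign convention $e^{-2i\pi\xi x}$) replaces $\mathcal{F}$, and the multiplier $-i\,\mathrm{sign}(\xi)$ is set to 0 on the zero-frequency and Nyquist bins. The names `dft`, `idft`, `sgn`, and `hilbert` are illustrative, and the identities hold only for signals with no component on those two bins.

```python
import cmath
import math

def dft(x, sign=-1):
    """Naive discrete Fourier transform; sign=-1 mimics exp(-2*i*pi*xi*x)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * math.pi * j * k / n)
                for k in range(n)) for j in range(n)]

def idft(X):
    """Inverse of dft (conjugate kernel, divided by the length)."""
    n = len(X)
    return [v / n for v in dft(X, sign=+1)]

def sgn(j, n):
    """Sign of the frequency carried by bin j (0 for DC and Nyquist)."""
    if j == 0 or (n % 2 == 0 and j == n // 2):
        return 0
    return 1 if j < n / 2 else -1

def hilbert(x):
    """Discrete analog of (33.7): multiply the spectrum by -i*sign(xi)."""
    n = len(x)
    X = dft(x)
    return idft([-1j * sgn(j, n) * X[j] for j in range(n)])

n = 16
f = [math.cos(2 * math.pi * 3 * k / n) + 0.5 * math.sin(2 * math.pi * 5 * k / n)
     for k in range(n)]
Hf = hilbert(f)     # expected shape: sin(.) - 0.5*cos(.)
HHf = hilbert(Hf)   # discrete counterpart of H(H(f)) = -f
```

For this test signal the multiplier sends $\cos$ to $\sin$ and $\sin$ to $-\cos$, so `HHf` is `-f` and the sum of squares of `Hf` equals that of `f`, in line with the isometry and $H^{-1} = -H$.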
33.5 The analytic signal associated with a real signal

If $f \in L^2(\mathbb{R})$ is a real signal (real-valued), then the real part of its Fourier transform is even and the imaginary part is odd:
$$\hat{f}(\xi) = A(\xi) + iB(\xi), \quad A \text{ even},\ B \text{ odd}.$$
It follows that $\hat{f}$, and hence $f$, is completely determined by the restriction of $\hat{f}$ to $[0, +\infty)$. Let $G$ denote this restriction:
$$G = \hat{f} \cdot u.$$
33.5.1 Definition The signal whose Fourier transform is $2G$, which is the signal $g = 2\overline{\mathcal{F}}G$, is called the analytic signal associated with the real signal $f$.
From the definition we have
$$g = 2\overline{\mathcal{F}}(\hat{f} \cdot u) = 2 f * \overline{\mathcal{F}}u = f * \Big(\delta - \frac{1}{i\pi}\,\mathrm{pv}\Big(\frac{1}{x}\Big)\Big);$$
hence
$$g = f + iHf.$$
Thus the analytic signal associated with $f$ is obtained by adding an imaginary part to $f$ equal to its Hilbert transform. In summary,
$$g \text{ is the analytic signal associated with } f \iff \hat{g}(\xi) = \begin{cases} 0 & \text{if } \xi < 0, \\ 2\hat{f}(\xi) & \text{if } \xi > 0 \end{cases} \iff g = f + iHf.$$
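The summary above can be tried on a sampled periodic signal. The sketch below is an assumption-laden discrete analog, not the $L^2(\mathbb{R})$ construction: it uses a naive DFT, doubles the positive-frequency bins, zeroes the negative ones, and keeps the zero-frequency and Nyquist bins once (a common discrete convention chosen here for illustration). The names `dft` and `analytic_signal` are hypothetical.

```python
import cmath
import math

def dft(x, sign=-1):
    """Naive discrete Fourier transform; sign=+1 gives the unnormalized inverse."""
    n = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * math.pi * j * k / n)
                for k in range(n)) for j in range(n)]

def analytic_signal(x):
    """Discrete analog of g = 2 * Fbar(f_hat * u)."""
    n = len(x)
    X = dft(x)
    G = [0j] * n
    for j in range(n):
        if 0 < j < n / 2:
            G[j] = 2 * X[j]          # positive frequencies are doubled
        elif j == 0 or 2 * j == n:
            G[j] = X[j]              # DC and Nyquist kept once (assumption)
        # negative frequencies (j > n/2) stay at zero
    return [v / n for v in dft(G, sign=+1)]

n = 16
f = [math.cos(2 * math.pi * 3 * k / n) for k in range(n)]
g = analytic_signal(f)
```

For the cosine above, `g[k]` is the complex exponential $e^{2i\pi \cdot 3k/n}$: its real part is `f` and its imaginary part is the sine, which is the discrete Hilbert transform of the cosine, matching $g = f + iHf$.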
33.6 Exercises
Exercise 33.1 Compute the Fourier transform of f(x) = cos ~x X[-l,lJ(x)
and justify the equation J= c~x * X[-l,l]
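Exercise 33.2 Show that the convolution $\mathrm{pv}\big(\frac{1}{x}\big) * \mathrm{pv}\big(\frac{1}{x}\big)$ makes sense and compute it.
Exercise 33.3 Compute the Fourier transform of $\mathrm{pv}\big(\frac{1}{x}\big)$ from the relation $x\,\mathrm{pv}\big(\frac{1}{x}\big) = 1$.
Exercise 33.4 Consider the sequence of distributions defined by
$$T_1 = \frac{1}{2}(\delta_{-1} + \delta_1), \qquad T_n = T_{n-1} * T_1, \quad n \ge 2.$$
(a) Express $T_n$ in terms of Dirac distributions.
(b) Compute $\hat{T}_n$.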
(c) Define fn(e) = Tn( 2 1l"~;rJ Show that fn converges in .'7 '(R) and find its
limit.
(d) Study the convergence of T!n in .'7 '(R).
2
Remark: Exercise 33.4 shows that any function of the form e-ax , a > 0, is the
limit in the sense of ~ '(R) of a sequence of finite linear combinations of Dirac
distributions. In fact, this is an immediate result of Exercise 28.7 and the fact
that forT E .'7 '(R) and A E R\0, 5 (h>..T) = l~l ht.!T T.
Exercise 33.5 Suppose $f \in L^2(\mathbb{R})$, $\mathrm{supp} f \subset [0, +\infty)$, and $\mathcal{F}f = A + iB$ with $A$ and $B$ real-valued. Also assume that $A$ is even. Show that $f = 2\,\overline{\mathcal{F}}A \cdot u$.
Chapter X
Filters and Distributions

Lesson 34
Filters, Differential Equations, and Distributions
Filters for functions have been studied in Lessons 1, 2, 24, and 25. We
are going to recast and complete this analysis in the light of what we now
know about distributions. We will see that the basic tools developed so far,
namely, convolution and the Fourier transform, play an essential role in the
study of generalized filters, in the same way they did in the study of filters
for functions.
34.1 Filters revisited

34.1.1 Definition (analog filter) Let $X$ be a translation-invariant linear subspace of $\mathcal{D}'(\mathbb{R})$. Assume that $X$ has a topology that is at least as fine as the topology induced by $\mathcal{D}'(\mathbb{R})$. An analog filter is a mapping $A : X \to \mathcal{D}'$ that is linear, invariant, and continuous (recall the definitions in Sections 2.1 and 2.2).
We want $X$ to be endowed with a topology that gives the best chance for a mapping $A$ to be continuous: We want the topology on $X$ to be as fine as possible and that on $\mathcal{D}'$ to be as coarse as is reasonable. Note that Definition 34.1.1 is more general than the one given in Section 2.2. Thus $X$ can be a function space ($X = L^p(\mathbb{R})$, $1 \le p \le +\infty$, ...) or a distribution space ($X = \mathcal{E}', \mathcal{S}', \mathcal{D}'_+, \mathcal{D}', \ldots$), and in particular, $X$ can contain discrete signals.
34.1.2 Proposition (examples: convolution filters) The convolution system
$$A : X \to \mathcal{D}',$$
where $Af = h * f$, is an analog filter in the following cases:
(i) $h$ is a distribution with bounded support, that is, $h \in \mathcal{E}'$.
(ii) $h \in \mathcal{D}'_+$ and $X \subset \mathcal{D}'_+$.
Proof. In case (i), $A$ is well-defined for any subspace $X \subset \mathcal{D}'$, and it is linear. The invariance comes from Propositions 32.5.1 and 32.2.2: For all $a \in \mathbb{R}$,
$$\tau_a(h * f) = \delta_a * (h * f) = h * (\delta_a * f) = h * \tau_a f,$$
and in terms of $A$,
$$\tau_a(Af) = A(\tau_a f)$$
for all $f \in X$. $A$ is continuous by Theorem 32.2.1. For case (ii), $Af$ is in $\mathcal{D}'_+$. $A$ is linear and invariant by Proposition 32.5.2; for continuity, we invoke Theorem 32.4.3. □
34.1.3 Definition The impulse response of a filter $A$ is its response $h = A\delta$ to the Dirac impulse (when $\delta \in X$). Similarly, its step response is $h_1 = Au$ (when $u \in X$), and its transfer function is $H = \hat{h}$ (when $h \in \mathcal{S}'$).
When $\delta$ is not in $X$, for example when $X = L^2(\mathbb{R})$, one can still define the impulse response because in practice all of the filters encountered will be convolution systems.
34.1.4 Consistency of the two definitions of transfer function
In Section 2.3 we gave another definition for the transfer function. It was the function $H$ such that for all $\lambda \in \mathbb{R}$,
$$A(e_\lambda) = H(\lambda) e_\lambda,$$
where $e_\lambda(t) = e^{2i\pi\lambda t}$. These two definitions are equivalent for convolution systems. One assumes, of course, that the sinusoidal signals $e_\lambda$ are in the set of input signals. We look at several cases.
Case 1: $h \in \mathcal{E}'$.
From Proposition 33.2.1, Section 31.6, and (28.5),
$$\widehat{A(e_\lambda)} = \widehat{h * e_\lambda} = \hat{h} \cdot \hat{e}_\lambda = \hat{h} \cdot \delta_\lambda = \hat{h}(\lambda)\delta_\lambda \tag{34.1}$$
for all $\lambda \in \mathbb{R}$. By taking the inverse Fourier transform, we have
$$A(e_\lambda) = \hat{h}(\lambda) e_\lambda,$$
so
$$H = \hat{h}. \tag{34.2}$$
Case 2: $h \in \mathcal{S}$.
The equalities in (34.1) are valid in this case by Proposition 33.1.1; hence (34.2) is also true.
The consistency of the two definitions thus extends to all convolution systems where $h \in \mathcal{E}' + \mathcal{S}$. This is the case for the RC and RLC filters, and more generally for all of the filters studied in Lessons 24 and 25.
Case 3: $h \in L^1(\mathbb{R})$.
We have
$$(h * e_\lambda)(t) = \int_{\mathbb{R}} h(s)\, e^{2i\pi\lambda(t-s)}\, ds = \hat{h}(\lambda)\, e_\lambda(t).$$
(See Section 20.3 regarding the convolution $L^1 * L^\infty$.)
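The identity in Case 3 can be checked numerically for one concrete impulse response. The sketch below assumes $h(s) = e^{-s}u(s)$ (chosen only for illustration; it is in $L^1(\mathbb{R})$, and with the convention $\hat{h}(\lambda) = \int h(s)e^{-2i\pi\lambda s}\,ds$ its transform is $1/(1 + 2i\pi\lambda)$), and it approximates the convolution integral by a midpoint rule on a truncated interval. The function names and the truncation parameters are assumptions of this sketch.

```python
import cmath
import math

def h(s):
    """Assumed causal impulse response h(s) = exp(-s) u(s)."""
    return math.exp(-s) if s >= 0 else 0.0

def e(lam, t):
    """Sinusoidal input e_lambda(t) = exp(2 i pi lambda t)."""
    return cmath.exp(2j * math.pi * lam * t)

def filter_response(lam, t, s_max=40.0, steps=40000):
    """Midpoint-rule approximation of (h * e_lambda)(t) on [0, s_max]."""
    ds = s_max / steps
    total = 0j
    for k in range(steps):
        s = (k + 0.5) * ds
        total += h(s) * e(lam, t - s)
    return total * ds

def h_hat(lam):
    """Closed-form Fourier transform of the assumed h."""
    return 1.0 / (1.0 + 2j * math.pi * lam)

lam, t = 0.5, 1.3
lhs = filter_response(lam, t)   # (h * e_lambda)(t), computed numerically
rhs = h_hat(lam) * e(lam, t)    # h_hat(lambda) * e_lambda(t)
```

The two values `lhs` and `rhs` agree up to quadrature error, which illustrates that $e_\lambda$ is an eigenfunction of the convolution system with eigenvalue $\hat{h}(\lambda)$.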
34.2 Realizable, or causal, filters

The next definition generalizes the definition in Section 2.1.2 and Definition
24.5.1.
34.2.1 Definition An analog filter is said to be realizable if
$$\mathrm{supp}(f) \subset [t_0, +\infty) \implies \mathrm{supp}(Af) \subset [t_0, +\infty)$$
for all $t_0 \in \mathbb{R}$.
34.2.2 Proposition For the convolution system
$$A : X \to \mathcal{D}', \quad Af = h * f$$
to be realizable, it is necessary and sufficient that $\mathrm{supp}(h) \subset [0, +\infty)$.
Proof. We prove the result in the simple case where $\delta \in X$. If $A$ is realizable, the relations $h = A\delta$ and $\mathrm{supp}(\delta) \subset [0, +\infty)$ imply that the support of $h$ is in $[0, +\infty)$. Conversely, if $\mathrm{supp}(h) \subset [0, +\infty)$ and $\mathrm{supp}(f) \subset [t_0, +\infty)$, then we have $\mathrm{supp}(h * f) \subset [t_0, +\infty)$ by Proposition 32.4.4. □
34.3 Tempered solutions of linear differential

equations
We will generalize, in the larger context of tempered distributions, what was done in Lesson 24 under the assumption that $p < q$. This limitation was imposed because the rational function $P(x)/Q(x)$ is not in $L^2(\mathbb{R})$ when $p \ge q$. This condition is no longer necessary, since we can now work in the space $\mathcal{S}'$ of tempered distributions.
Let $A$ be a system whose input $f$ and output $g = Af$ are related by the differential equation
$$\sum_{k=0}^{q} b_k g^{(k)} = \sum_{j=0}^{p} a_j f^{(j)}, \quad a_p b_q \ne 0, \tag{34.3}$$
taken in the sense of distributions. The coefficients $a_j$ and $b_k$ are fixed complex numbers, and $f$ is a given tempered distribution. We will see that in general (34.3) has a unique solution $g$ in $\mathcal{S}'$, which defines the output of the system. As before, we write
$$P(x) = \sum_{j=0}^{p} a_j x^j \quad \text{and} \quad Q(x) = \sum_{k=0}^{q} b_k x^k.$$
34.3.1 Proposition If $f \in \mathcal{S}'$ and if $P(x)/Q(x)$ has no poles on the imaginary axis, then (34.3) has a unique solution $g \in \mathcal{S}'$.
Proof. The proof copies that of Proposition 24.1.1. If there exists a solution $g \in \mathcal{S}'$, then by taking the Fourier transform, we obtain
$$\hat{g}(\lambda) = H(\lambda)\hat{f}(\lambda), \tag{34.4}$$
where
$$H(\lambda) = \frac{P(2i\pi\lambda)}{Q(2i\pi\lambda)}. \tag{34.5}$$
This shows that $\hat{g}$ is uniquely determined, and Theorem 31.1.7 implies that $g$ is uniquely determined. Thus (34.3) has at most one solution in $\mathcal{S}'$. On the other hand, $H$ is slowly increasing, and it is easy to see that $H\hat{f} \in \mathcal{S}'$ and that (34.4) defines a solution $g$ in $\mathcal{S}'$. □
It is easy to show that $A : f \mapsto g$ is a filter. However, as in Lesson 24, our main interest is in computing the impulse response.
34.3.2 The solution of (34.3) is a convolution

Since $H$ is slowly increasing, it is a tempered distribution (Proposition 31.1.7) and has an inverse Fourier transform $h = \mathcal{F}^{-1}H$, which we can compute by decomposing $H$ into partial fractions.
Case 1: $H$ has only simple poles.
Then
$$H(\lambda) = \sum_{j=0}^{p-q} \alpha_j (2i\pi\lambda)^j + \sum_{k=1}^{q} \frac{\beta_k}{2i\pi\lambda - z_k},$$
where we define $\alpha_j = 0$ if $j < 0$ (the polynomial part is zero if $p < q$) and where $z_1, \ldots, z_q$ are the simple poles of $P(x)/Q(x)$ in $\mathbb{C}$. Then
$$h(t) = \sum_{j=0}^{p-q} \alpha_j \delta^{(j)} + \sum_{k \in K_-} \beta_k e^{z_k t} u(t) - \sum_{k \in K_+} \beta_k e^{z_k t} u(-t), \tag{34.6}$$
where $K_-$ and $K_+$ are defined in Section 24.3.1.
Case 2: $H$ has multiple poles.
The polynomial part contributes a sum $\sum_{j=0}^{p-q} \alpha_j \delta^{(j)}$ to $h$ as in Case 1. Thus we can limit ourselves to the case $p < q$. Using the same notation we used in Section 24.3, we obtain the same result:
$$h(t) = \Big( \sum_{k \in K_-} P_k(t) e^{z_k t} \Big) u(t) - \Big( \sum_{k \in K_+} P_k(t) e^{z_k t} \Big) u(-t) \tag{34.7}$$
with
$$P_k(t) = \sum_{m=1}^{m_k} \beta_{k,m} \frac{t^{m-1}}{(m-1)!}.$$
At this point we know that $g = Af$ satisfies the relation
$$\hat{g}(\lambda) = \hat{h}(\lambda)\hat{f}(\lambda).$$
We would like to take the inverse Fourier transform and apply the results of Sections 33.1 and 33.2, but first we need to represent $h$ as an element of $\mathcal{E}' + \mathcal{S}$. For this we write
$$h = h\theta + h(1 - \theta) = h_1 + h_2,$$
where $\theta \in \mathcal{D}$ satisfies
$$\theta(t) = \begin{cases} 1 & \text{if } |t| \le 1, \\ 0 & \text{if } |t| \ge 2. \end{cases}$$
Now apply Propositions 33.1.1 and 33.2.1:
$$\widehat{h * f} = \widehat{h_1 * f} + \widehat{h_2 * f} = (\hat{h}_1 + \hat{h}_2)\hat{f},$$
and
$$\hat{g} = \widehat{h * f}.$$
Taking the inverse Fourier transform, we obtain
$$g = h * f,$$
which shows that $A$ is a convolution system.
34.3.3 Remarks
(a) The generalized solution in terms of functions that was given for this equation in Section 24.3.5 is the tempered solution that we have just obtained. This does not lessen our interest in Lesson 24, where we found that the solution is a function when the input $f$ is a function ($p \le q$). However, in Lesson 24 we assumed that $f \in L^1 \cap L^2 \cap L^\infty$; here we have a much wider range of inputs $f$, even if we restrict $f$ to be a function.
(b) If $P(x)/Q(x)$ has a pole on the imaginary axis, then the solution $g$ in $\mathcal{S}'$ is no longer unique. For example,
$$g'' + \omega^2 g = \delta, \quad \omega > 0,$$
has as solutions in $\mathcal{S}'$ all of the functions
$$g(t) = \begin{cases} A\cos\omega t + B\sin\omega t, & t < 0, \\ A\cos\omega t + \Big(B + \dfrac{1}{\omega}\Big)\sin\omega t, & t > 0. \end{cases}$$
This equation will be studied in Lesson 35. Here we merely note that the moderated growth at infinity imposed by the space $\mathcal{S}'$ is not enough to guarantee uniqueness. It is causality that will be the determining factor.
34.3.4 Causality
Proposition 34.2.2 and equations (34.6) and (34.7) give us a necessary and sufficient condition for causality.
34.3.5 Proposition The filter $A : \mathcal{S}' \to \mathcal{S}'$ defined by equation (34.3) is realizable (or causal) if and only if the real parts of the poles of $P(x)/Q(x)$ are strictly negative.
34.4 Exercises
Exercise 34.1 Suppose the filter $A : \mathcal{S}'(\mathbb{R}) \to \mathcal{S}'(\mathbb{R})$ is governed by the equation
$$g'' - \omega^2 g = f'', \quad \omega > 0.$$
(a) Compute its impulse response.
(b) Compute and represent graphically the step response.
Exercise 34.2 Suppose the filter $A : \mathcal{S}'(\mathbb{R}) \to \mathcal{S}'(\mathbb{R})$ is governed by the equation
$$g'' + \sqrt{2}\,g' + g = f.$$
(a) Compute the impulse response of $A$.
(b) Is the filter $A$ realizable?
Lesson 35
Realizable Filters and Differential Equations

This lesson is a direct continuation of the last one. We are going to look for the causal solutions of a linear differential equation with constant coefficients; thus by assumption, the filter will be realizable (Section 34.2). For convenience we write the equation with $b_q = 1$:
$$\sum_{k=0}^{q-1} b_k g^{(k)} + g^{(q)} = \sum_{j=0}^{p} a_j f^{(j)}. \tag{35.1}$$
We assume $f \in \mathcal{D}'_+$, and we wish to find a solution $g \in \mathcal{D}'_+$.

We cannot use the Fourier transform here as we did in Section 34.3 because
the signals f and g are not assumed to be tempered. This lack of restriction
is essential, since we will find solutions that grow exponentially.
35.1.1 Existence and uniqueness of a causal solution

We are going to transform equation (35.1) into a first-order linear system.
By introducing the auxiliary functions g 1 , .. , 9q- 1 and calling the right-
hand side <p, (35.1) becomes
(35.2)
I
9q-2 = 9q-1,
9~-1 = -(bq-19q-1 ++bog)+ <p.
326 Lesson 35. Realizable Filters and Differential Equations
If we define
0 1 0 0
0 0 1 0
gl
M=
0 0 0 1
G=
rgq~l. J'
g h flJ,
-bo -bl -b2 -bq-1
then (35.2) can be written as the matrix cquation
G' =MG+ . (35.3)
The procedure for transforming a linear differential equation of order $q$
into a first-order system of linear equations is well known. The advantage
here is that (35.3) can be solved as a first-order scalar equation. The only
difference is that the exponents are matrices. We review this technique.
One proves that the series of matrices
$$e^{tM} = I + tM + \frac{t^2}{2!} M^2 + \cdots$$
converges for all real $t$ in the (normed) space of $q \times q$ matrices. One also
shows that $e^{tM}$ is invertible, that the inverse is $e^{-tM}$, and that the
derivative of the matrix-valued function $t \mapsto e^{tM}$ is the function $t \mapsto M e^{tM}$.
Finally, one has $e^{tM} e^{sM} = e^{(t+s)M}$.
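The properties just listed can be checked numerically. The following is a minimal sketch, not part of the original text: it sums a truncated exponential series for a 2 x 2 companion matrix and compares $e^{tM}e^{sM}$ with $e^{(t+s)M}$. The helper names `mat_mul` and `mat_exp`, the number of terms, and the sample coefficients $b_0 = 2$, $b_1 = 3$ are all illustrative assumptions.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(M, t, terms=40):
    """Truncated series I + tM + (tM)^2/2! + ... (hypothetical helper)."""
    n = len(M)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # holds (tM)^k / k!
    tM = [[t * x for x in row] for row in M]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, tM)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

# Companion matrix of g'' + 3g' + 2g = phi (b0 = 2, b1 = 3), an arbitrary example.
M = [[0.0, 1.0], [-2.0, -3.0]]
t, s = 0.7, 0.4
lhs = mat_mul(mat_exp(M, t), mat_exp(M, s))
rhs = mat_exp(M, t + s)
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
inv_check = mat_mul(mat_exp(M, t), mat_exp(M, -t))   # should be close to I
```

For matrices of larger norm a plain truncated series loses accuracy, so this sketch is only meant for small examples like the one shown.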
Changing the unknown function to $G(t) = e^{tM} X(t)$ reduces (35.3) to
$$e^{tM} X'(t) = \Phi(t),$$
and we have
$$X'(t) = e^{-tM}\Phi(t) \quad\text{and}\quad X(t) = X_0 + \int_{-\infty}^{t} e^{-sM}\Phi(s)\,ds. \tag{35.4}$$
All solutions of (35.3) are of the form
$$G(t) = e^{tM}X_0 + \int_{-\infty}^{t} e^{(t-s)M}\Phi(s)\,ds, \tag{35.5}$$
where $X_0$ is an arbitrary fixed vector. Assume $\operatorname{supp}(\varphi) \subset [t_0, +\infty)$. Then
$\Phi(t) = 0$ and $G(t) = e^{tM}X_0$ for all $t < t_0$. If $G(t)$ is causal, then necessarily
$X_0 = 0$. Equation (35.3) thus has the unique causal solution
$$G(t) = \int_{-\infty}^{t} e^{(t-s)M}\Phi(s)\,ds, \tag{35.6}$$
and (35.1) has a unique causal solution $g$. Since $g(t) = 0$ for $t \in (-\infty, t_0)$,
the system is realizable.
We have acted here as if $\varphi$ were a function. If it is a distribution, then
(35.4) is solved the same way using Theorem 30.2.1. One obtains a solution
$G$ whose support is limited on the left (Section 32.4). We can now state
the main result.
35.1.2 Proposition If $f \in \mathscr{D}'_+$, then equation (35.1) has a unique
solution $g$ in $\mathscr{D}'_+$. The system
$$B : \mathscr{D}'_+ \to \mathscr{D}'_+, \qquad f \mapsto g$$
is a convolution system and hence a filter.

Proof. Formula (35.6) shows that $B$ is a convolution system. However, one
can obtain the response $g$ by computing the impulse response $h$ of (35.1)
directly. If $f = \delta$, then
$$\sum_{k=0}^{q-1} b_k h^{(k)} + h^{(q)} = \sum_{j=0}^{p} a_j \delta^{(j)}.$$
Taking the convolution with the input $f$ gives (Theorem 32.4.3)
$$\sum_{k=0}^{q-1} b_k (h * f)^{(k)} + (h * f)^{(q)} = \sum_{j=0}^{p} a_j f^{(j)}.$$
Thus $h * f$ is a causal solution of (35.1), and indeed it is the unique solution
$g$ that was sought:
$$g = h * f. \tag{35.7}$$
35.1.3 Remark Notice once again that the fact that the differential
equation has a unique solution is a consequence of a constraint on $g$; in this
case, it is that $g$ have support limited on the left. This restriction takes the
place of initial conditions.
We will look at some examples of how to find actual solutions.
35.2 Examples
35.2.1 The RC filter $RCg' + g = f$
We are going to find the same result as we did in Section 25.1 but by a
different method. The impulse response $h$ is the solution in $\mathscr{D}'_+$ of
$$RCh' + h = \delta. \tag{35.8}$$
Since $h = 0$ on $(-\infty, 0)$ and since $\operatorname{supp}(\delta) = \{0\}$, it follows that
$$RCh' + h = 0$$
on the interval $(0, +\infty)$. Thus for all $t > 0$,
$$h(t) = k e^{-t/RC}$$
for some constant $k$. These relations imply that
$$h(t) = k e^{-t/RC} u(t)$$
for all $t \in \mathbb{R}$. The derivative of $h$ is (Section 28.4.4)
$$h' = -\frac{k}{RC} e^{-t/RC} u + k\delta,$$
and substitution in (35.8) shows that
$$kRC\,\delta = \delta,$$
so
$$k = \frac{1}{RC}.$$
The impulse response (Figure 25.1) is
$$h(t) = \frac{1}{RC} e^{-t/RC} u(t). \tag{35.9}$$
Thus by (35.7), the causal solution of the equation is
$$g(t) = \frac{1}{RC} \int_{-\infty}^{t} e^{-\frac{t-s}{RC}} f(s)\,ds,$$
which is indeed what we found in Section 25.1. Taking $f = u$ gives the step
response (Figure 25.2)
$$h_1(t) = \left(1 - e^{-t/RC}\right) u(t).$$
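As a hedged numerical illustration (not from the original text), the convolution integral for the RC filter with input $f = u$ can be approximated by the midpoint rule and compared with the closed form $1 - e^{-t/RC}$ obtained by integrating directly. The function name `rc_step_response`, the step count, and the sample values of `rc` and `t` are assumptions made for this sketch.

```python
import math

def rc_step_response(t, rc, steps=20000):
    """Midpoint-rule value of (1/RC) * integral_0^t exp(-(t - s)/RC) ds,
    i.e. the causal output for the unit step input f = u (hypothetical helper)."""
    if t <= 0:
        return 0.0                      # causal: no output before the input starts
    ds = t / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        total += math.exp(-(t - s) / rc)
    return total * ds / rc

rc = 0.5                                # arbitrary time constant RC
t = 1.2
numeric = rc_step_response(t, rc)
closed_form = 1.0 - math.exp(-t / rc)   # step response from direct integration
```

The two values agree up to the quadrature error of the midpoint rule, which is small for this smooth integrand.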
35.2.2 The filter $-\frac{1}{\omega^2} g'' + g = f$ ($\omega > 0$)
This second-order filter was studied in Section 25.3 where the conditions
imposed by the use of the Fourier transform led to noncausal, slowly
increasing solutions. Here, in contrast, the causality assumption will lead to
a solution that grows exponentially.
The impulse response $h$ is the solution in $\mathscr{D}'_+$ of
$$h'' - \omega^2 h = -\omega^2 \delta. \tag{35.10}$$
To find $h$, we use the same method that we used above. This time, however,
we will look directly for a solution $h$ of the form $h = yu$ where $y$ is a function
in $C^\infty(\mathbb{R})$. Thus
$$h' = y'u + y(0)\delta, \qquad h'' = y''u + y'(0)\delta + y(0)\delta',$$
and (35.10) becomes
$$(y'' - \omega^2 y)u + y'(0)\delta + y(0)\delta' = -\omega^2 \delta.$$
This prompts us to look for a function $y$ that satisfies the equation
$$y'' - \omega^2 y = 0$$
with the initial conditions
$$y(0) = 0, \qquad y'(0) = -\omega^2,$$
which is a problem completely in terms of functions. The general solution is
$$y(t) = \lambda \cosh \omega t + \mu \sinh \omega t;$$
with the initial conditions this becomes
$$y(t) = -\omega \sinh \omega t.$$
The impulse response is
$$h(t) = -\omega \sinh \omega t \; u(t),$$
and the solution $g$ is given by
$$g(t) = (h * f)(t) = -\omega \int_{-\infty}^{t} \sinh \omega(t - s)\, f(s)\,ds$$
when $f$ is locally integrable with support limited on the left. If $f$ is a
distribution, then $g$ is given by a convolution in $\mathscr{D}'_+$.
Note that $h$ is not in $\mathscr{S}'$ and that the solution found here is completely
different from that found in Section 25.3. The step response (Figure 35.1)
is obtained by taking $f = u$:
$$h_1(t) = (1 - \cosh \omega t)\,u(t).$$
This filter is realizable but unstable: The input is bounded, but the out-
put is not. We note in this regard that the implications of the positions of
the poles given in Sections 24.4 and 24.5 do not apply here.
FIGURE 35.1. Causal step response of $-\frac{1}{\omega^2} g'' + g = f$ (the input $u(t)$ and the output $h_1(t)$).
35.2.3 The resonator $\frac{1}{\omega^2} g'' + g = f$
We encountered this filter in Lesson 24 (Section 24.3.3), but we were not
able to analyze it using the Fourier transform because the poles were on
the imaginary axis. This equation describes the mechanical example of Section
1.3.6 where the friction is negligible (zero in the equation). The equation
also represents a weight suspended on a spring where there is no "air"
friction, which means that the coefficient of the first derivative is zero.
We wish to find the impulse response $h$ using the method we used in the
last section. Thus
$$h'' + \omega^2 h = \omega^2 \delta,$$
and by assuming $h = yu$, we have the system
$$y'' + \omega^2 y = 0, \qquad y(0) = 0, \qquad y'(0) = \omega^2.$$
The solution is
$$y(t) = \lambda \cos \omega t + \mu \sin \omega t$$
with $\lambda = 0$ and $\mu = \omega$. Thus
$$h(t) = \omega \sin \omega t \; u(t),$$
and the output is
$$g(t) = \omega \int_{-\infty}^{t} \sin \omega(t - s)\, f(s)\,ds$$
for a locally integrable, causal input $f$. The step response (Figure 35.2) is
$$h_1(t) = (h * u)(t) = (1 - \cos \omega t)\,u(t).$$
FIGURE 35.2. Resonator (step response $h_1(t)$ for the input $u(t)$).
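It is interesting to study the response to a sinusoidal input. For the input
$f(t) = \sin \omega_0 t \; u(t)$, the output $g$ is
$$g(t) = \begin{cases}
\dfrac{\omega}{\omega^2 - \omega_0^2}\,(\omega \sin \omega_0 t - \omega_0 \sin \omega t)\,u(t) & \text{if } \omega_0 \neq \omega,\\[2ex]
\dfrac{1}{2}\,(\sin \omega t - \omega t \cos \omega t)\,u(t) & \text{if } \omega_0 = \omega.
\end{cases}$$
If $\omega_0 = \omega$, the amplitude of the oscillations tends to infinity. This is an
instability, and it is an example of the phenomenon called resonance. It
can cause the system to "explode."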
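The two cases of the resonator response can be compared with a direct numerical evaluation of the convolution $\omega \int_0^t \sin \omega(t-s) \sin \omega_0 s \, ds$. The sketch below is an illustration under stated assumptions: the function names, the midpoint rule, and the sample frequencies are not from the text.

```python
import math

def resonator_response(t, w, w0):
    """Closed-form causal output for the input sin(w0 t) u(t)."""
    if t < 0:
        return 0.0
    if w0 != w:
        return w / (w * w - w0 * w0) * (w * math.sin(w0 * t) - w0 * math.sin(w * t))
    return 0.5 * (math.sin(w * t) - w * t * math.cos(w * t))

def resonator_numeric(t, w, w0, steps=20000):
    """Midpoint-rule value of w * integral_0^t sin(w (t - s)) sin(w0 s) ds."""
    ds = t / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        total += math.sin(w * (t - s)) * math.sin(w0 * s)
    return w * total * ds

off_resonance_gap = abs(resonator_response(3.0, 2.0, 1.0) - resonator_numeric(3.0, 2.0, 1.0))
resonance_gap = abs(resonator_response(3.0, 2.0, 2.0) - resonator_numeric(3.0, 2.0, 2.0))
# At resonance the term w*t*cos(w t)/2 grows linearly in t.
envelope = [abs(resonator_response(k * math.pi / 2.0, 2.0, 2.0)) for k in (10, 100, 1000)]
```

At resonance the sampled values in `envelope` grow roughly in proportion to $t$, which is the unbounded growth described above.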
35.2.4 The integrator $g' = f$
We saw in Section 25.4.1 that causality imposed the solution
$$g(t) = \int_{-\infty}^{t} f(s)\,ds.$$
We now see directly that the impulse response must satisfy $h' = \delta$, so $h = u$
and $g = u * f$.
35.2.5 The differentiator $g = f'$
The impulse response is $h = \delta'$, and we have $g = f * \delta' = f'$. No surprise!
The step response is $h_1 = \delta' * u = u' = \delta$. Thus we have a realizable filter, but it is not
stable.
35.3 Exercises
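Exercise 35.1 We wish to solve the equation
$$-\frac{1}{\omega^2} g'' + g = f$$
by the matrix method described in Section 35.1.
(a) Write the equation as the first-order matrix system
$$G'(t) = MG(t) + \Phi(t) \quad\text{with}\quad G = \begin{pmatrix} g\\ g_1 \end{pmatrix} \text{ and } \Phi = \begin{pmatrix} 0\\ -\omega^2 f \end{pmatrix}.$$
(b) To compute $e^{tM}$, write $M = P \Lambda P^{-1}$, where $\Lambda$ is the diagonal matrix of eigenvalues
$\lambda_1 = -\omega$, $\lambda_2 = \omega$ of $M$. Show that
$$e^{t\Lambda} = \begin{pmatrix} e^{-\omega t} & 0\\ 0 & e^{\omega t} \end{pmatrix}$$
and that $e^{tM} = P e^{t\Lambda} P^{-1}$.
(c) Compute $e^{tM}$.
(d) Deduce the integral expression for $g(t)$ found in Section 35.2.2.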
Exercise 35.2 Consider the differential equation
$$g'' + \omega^2 g = f'', \qquad \omega > 0. \tag{1}$$
(1) Solve (1) by the matrix method of Section 35.1 (see Exercise 35.1).
(2) Show that changing the unknown to $g_0 = g - f$ leads to the differential
equation
$$g_0'' + \omega^2 g_0 = -\omega^2 f. \tag{2}$$
(a) Use Exercise 32.6 to compute the impulse response $h_0 \in \mathscr{D}'_+(\mathbb{R})$ of
the filter (from $\mathscr{D}'_+(\mathbb{R})$ to $\mathscr{D}'_+(\mathbb{R})$) governed by equation (2).
(b) Deduce the impulse response $h \in \mathscr{D}'_+(\mathbb{R})$ of the filter (from $\mathscr{D}'_+(\mathbb{R})$
to $\mathscr{D}'_+(\mathbb{R})$) governed by (1). Compute its step response.
Chapter XI
Sampling and Discrete Filters
Lesson 36
Periodic Distributions
We are going to return to the topics of Lessons 4 and 5 armed with what we
now know about tempered distributions and their Fourier transforms. Our
objective is to show the connection between Fourier series and the Fourier
transform.
36.1 The Fourier series of a locally integrable periodic function
36.1.1 Review of Lesson 4
We saw in Lesson 4 that each periodic function $f \in L^2_p(0, a)$ can be represented
as a Fourier series
$$f(t) = \sum_{n=-\infty}^{+\infty} c_n e^{2i\pi n t/a} \tag{36.1}$$
and that this series converges to $f$ in the norm of $L^2_p(0, a)$. The Fourier
coefficients are given by
$$c_n = \frac{1}{a} \int_0^a f(t)\, e^{-2i\pi n t/a}\,dt. \tag{36.2}$$
For these coefficients to exist, it is necessary and sufficient that $f$ be
integrable on $(0, a)$. Thus if $f \in L^1_p(0, a)$, we can associate with $f$ a
trigonometric series that is called its Fourier series:
$$\sum_{n=-\infty}^{+\infty} c_n e^{2i\pi n t/a}.$$
In this general case, we no longer know how to interpret the sum in (36.1).
And even if we knew that the series converged in some sense, we would still
have to show that its limit is $f$. We recall a negative result from Section 5.2:
$$f \in L^1_p(0, a) \;\not\Longrightarrow\; \sum_{n=-N}^{N} c_n e^{2i\pi n t/a} \to f \ \text{ in } L^1_p(0, a) \text{ as } N \to +\infty.$$
36.1.2 A brief preview

If the Fourier series of $f$ did converge, in a sense to be determined, and if
its limit was $f$, we could (probably) take the Fourier transform of (36.1)
and get the formula
$$\widehat{f} = \sum_{n=-\infty}^{+\infty} c_n \delta_{n/a}. \tag{36.3}$$
It is this formula that establishes the connection between the Fourier
transform of $f$ and its Fourier series.
Several questions arising from this scenario need to be addressed. Let $f$
be periodic and locally integrable.
Q1: Does the Fourier series of $f$ converge in some sense?
Q2: If yes, does it converge to $f$?
Q3: Is $f$ tempered?
Q4: Does the Fourier series of $f$ converge to $f$ in $\mathscr{S}'$?
We need to answer "yes" to Q4 (thus also to Q1, Q2, Q3) if we are to
write (36.3) and expect it to make sense. In fact, this formula requires that
$f$ be tempered and that the Fourier transform and the infinite summation
can be interchanged.
The answers, all positive, are given in the next result.
36.1.3 Proposition Let $f$ be a periodic locally integrable function
with period $a > 0$. Then we have the following results:
(i) $f$ is a tempered distribution.
(ii) The equalities
$$f(t) = \sum_{n=-\infty}^{+\infty} c_n e^{2i\pi n t/a} \tag{36.4}$$
and
$$\widehat{f} = \sum_{n=-\infty}^{+\infty} c_n \delta_{n/a} \tag{36.5}$$
hold in $\mathscr{S}'$ with
$$c_n = \frac{1}{a} \int_0^a f(t)\, e^{-2i\pi n t/a}\,dt.$$
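Proof. Let $f_0 = f \chi_{[0,a)}$. Note that
$$f = f_0 * \Delta_a. \tag{36.6}$$
The convolution is well-defined, since $f_0$ is in $\mathscr{E}'$ and $\Delta_a$ is in $\mathscr{S}'$
(Proposition 32.3.2). On the other hand,
$$(f_0 * \Delta_a)(t) = \sum_{n=-\infty}^{+\infty} (f_0 * \delta_{na})(t) = \sum_{n=-\infty}^{+\infty} f_0(t - na) = f(t).$$
This proves (i). Next take the Fourier transform of (36.6); applying
Proposition 33.2.1 and (31.10), we have
$$\widehat{f} = \widehat{f_0} \cdot \widehat{\Delta_a} = \frac{1}{a}\, \widehat{f_0}\, \Delta_{1/a},$$
which by (28.5) is
$$\widehat{f} = \frac{1}{a} \sum_{n=-\infty}^{+\infty} \widehat{f_0}\!\left(\frac{n}{a}\right) \delta_{n/a}. \tag{36.7}$$
This proves (36.5), since
$$\widehat{f_0}\!\left(\frac{n}{a}\right) = \int_0^a f(t)\, e^{-2i\pi n t/a}\,dt = a c_n.$$
We obtain (36.4) by taking the inverse Fourier transform of (36.5). $\square$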
36.2 The Fourier series of a periodic distribution
The last proof works equally well if the function $f_0$ is replaced by a
distribution $S$ with compact support.
36.2.1 Proposition Suppose $S$ is a distribution with compact support
and $a > 0$. Then
$$T = S * \Delta_a$$
is a periodic tempered distribution with period $a$. It can be decomposed in
a Fourier series
$$T = \frac{1}{a} \sum_{n=-\infty}^{+\infty} \widehat{S}\!\left(\frac{n}{a}\right) e^{2i\pi n t/a} \tag{36.8}$$
with equality in $\mathscr{S}'$.
Recall that $\widehat{S}$ is a $C^\infty$ function (Theorem 31.5.1). For $S = \delta$ we get
the Fourier series representation of Dirac's comb, which was established in
Section 29.5.
REMARK: For a periodic function $f$ with period $a$ represented as a Fourier
series, we called (Section 7.1) the set of pairs
$$\left(\frac{n}{a},\, c_n\right)_{n \in \mathbb{Z}}$$
the spectral lines of $f$. The representation of these pairs by arrows parallel
to the y-axis (Figures 7.1 and 7.2) was just a graphic convenience. Formula
(36.5) shows that this representation agrees with that adopted in Section
26.1 for representing Dirac masses (Figure 26.3).
At this point it is natural to ask (and important to answer) the following
question: Do all periodic distributions have a Fourier series representation
that converges in $\mathscr{S}'$? The next theorem provides a sort of converse of
Proposition 36.2.1, and also of Theorem 29.4.2.
36.2.2 Theorem If $T$ is a periodic distribution with period $a > 0$,
then the following results hold:
(i) $T$ is tempered.
(ii) There is a distribution $S$ with compact support such that
$$T = S * \Delta_a.$$
(iii) $T$ has a unique Fourier series development that converges to $T$ in $\mathscr{S}'$:
$$T = \sum_{n=-\infty}^{+\infty} \alpha_n e^{2i\pi n t/a}, \tag{36.9}$$
$$\widehat{T} = \sum_{n=-\infty}^{+\infty} \alpha_n \delta_{n/a}. \tag{36.10}$$
(iv) The sequence of Fourier coefficients $(\alpha_n)$ is slowly increasing, and
$$\alpha_n = \frac{1}{a}\, \widehat{S}\!\left(\frac{n}{a}\right).$$
Proof. The idea is the same as that used in Proposition 36.1.3. We would
like to define $S$ as the "restriction" of $T$ to $(0, a)$, that is, to write
$$S = \chi_{[0,a]}\, T.$$
Unfortunately, as we saw in Section 28.3, such a product is not defined.
We get around this difficulty by using a function $\theta \in \mathscr{D}$ that approximates
$\chi_{[-1,1]}$ and that also satisfies the relation
$$\sum_{n=-\infty}^{+\infty} \theta(t - na) = 1 \tag{36.11}$$
for all $t \in \mathbb{R}$. Assume for the moment that we have such a function. The
distribution $S = \theta T$ will do the job: $S$ has bounded support and
$$S * \Delta_a = \sum_{n=-\infty}^{+\infty} S * \delta_{na} = \sum_{n=-\infty}^{+\infty} \tau_{na} S.$$
For all $\varphi \in \mathscr{D}$,
$$\langle S * \Delta_a, \varphi \rangle = \sum_{n=-\infty}^{+\infty} \langle S, \tau_{-na}\varphi \rangle = \sum_{n=-\infty}^{+\infty} \langle T, \theta\, \tau_{-na}\varphi \rangle.$$
Since $T$ is periodic,
$$\langle T, \psi \rangle = \langle T, \tau_{na}\psi \rangle$$
for all $\psi \in \mathscr{D}$ and all $n \in \mathbb{Z}$; using (36.11) shows that
$$\langle S * \Delta_a, \varphi \rangle = \sum_{n=-\infty}^{+\infty} \langle T, (\tau_{na}\theta)\, \varphi \rangle = \langle T, \varphi \rangle,$$
which proves (ii). Statements (i) and (iii) follow from Proposition 36.2.1.
The Fourier coefficients of $T$ are
$$\alpha_n = \frac{1}{a}\, \widehat{S}\!\left(\frac{n}{a}\right). \tag{36.12}$$
These coefficients are slowly increasing because the function $\widehat{S}$ is slowly
increasing (Theorem 31.5.1).
One might think that the coefficients $\alpha_n$ depend on $S$, that is, on the
choice of $\theta$. This is not the case: The Fourier coefficients are unique, as they
are for functions in $L^1_p(0, a)$. In view of (36.12) and linearity, it is sufficient
to show that
$$S * \Delta_a = 0 \;\Longrightarrow\; \widehat{S}\!\left(\frac{n}{a}\right) = 0$$
for all $n \in \mathbb{Z}$. But $S * \Delta_a = 0$ implies that $\widehat{S}\, \Delta_{1/a} = 0$, and this in turn
implies that
$$\sum_{n=-\infty}^{+\infty} \widehat{S}\!\left(\frac{n}{a}\right) \delta_{n/a} = 0.$$
The only way for this to hold is to have $\widehat{S}(n/a) = 0$ for all $n$.
To finish the proof, we need to show the existence of a function $\theta \in \mathscr{D}$
satisfying (36.11). Let
$$\varphi(t) = \rho\!\left(\frac{t}{a}\right),$$
where $\rho$ is the function in $\mathscr{D}$ defined by (27.3). The sum
$$\tilde{\varphi}(t) = \sum_{n=-\infty}^{+\infty} \varphi(t - na)$$
exists because for each $t$ there are at most two nonzero terms. For the same
reason, $\tilde{\varphi}$, like $\varphi$, is infinitely differentiable. $\tilde{\varphi}$ is $a$-periodic and strictly
positive; thus a suitable choice for $\theta$ is
$$\theta(t) = \frac{\varphi(t)}{\tilde{\varphi}(t)}. \qquad \square$$
REMARK: The Fourier coefficients $c_n$ of a periodic locally integrable
function tend to 0 as $|n| \to +\infty$ by the Riemann-Lebesgue theorem. Those of
a periodic distribution are only slowly increasing.
36.2.3 Corollary Let $(y_n)_{n \in \mathbb{Z}}$ be a complex sequence and define the
distribution $T$ by
$$T = \sum_{n=-\infty}^{+\infty} y_n \delta_{na} \tag{36.13}$$
for $a > 0$. $T$ is a tempered distribution if and only if the sequence $(y_n)$ is
slowly increasing.
Proof. It was shown in Proposition 31.1.9 that the condition is sufficient.
If $T$ is tempered, then the series (36.13) converges in $\mathscr{S}'$, and we can take
its Fourier transform, which is
$$\widehat{T} = \sum_{n=-\infty}^{+\infty} y_n e^{-2i\pi n a \lambda}.$$
$\widehat{T}$ is a periodic tempered distribution with period $1/a$. Thus its Fourier
coefficients $y_{-n}$ are slowly increasing. $\square$
36.3 The product of a periodic function and a periodic distribution
36.3.1 Theorem Let $f$ be a periodic $C^\infty$ function with period $a$ and
let $T$ be a distribution with the same period. Then the distribution $fT$ is
represented by the Fourier series
$$fT = \sum_{n=-\infty}^{+\infty} \gamma_n e^{2i\pi n t/a},$$
and the coefficients are given by
$$\gamma_n = \sum_{k=-\infty}^{+\infty} c_k \alpha_{n-k}.$$
The $c_k$ are the Fourier coefficients of $f$, and the $\alpha_n$ are those of $T$. The
series for $\gamma_n$ is absolutely convergent.
Proof. Write
$$e_n(t) = e^{2i\pi n t/a} \quad\text{and}\quad f_N(t) = \sum_{k=-N}^{N} c_k e_k(t).$$
Then
$$f_N T = \sum_{k=-N}^{N} \sum_{n=-\infty}^{+\infty} c_k \alpha_n e_{n+k} = \sum_{n=-\infty}^{+\infty} \gamma_n(N)\, e_n,$$
where
$$\gamma_n(N) = \sum_{k=-N}^{N} c_k \alpha_{n-k}.$$
We will study what happens to $\gamma_n(N)$ as $N \to +\infty$.
(1) The series whose general term is $c_k \alpha_{n-k}$ is absolutely convergent for
each fixed $n$. Indeed, the sequence $(\alpha_{n-k})$ is slowly increasing in $k$ (Theorem
36.2.2), so
$$|\alpha_{n-k}| \le A |k|^m$$
for sufficiently large $|k|$. By Section 5.3.3(e), we also have
$$|c_k| \le B |k|^{-m-2}.$$
Thus $|c_k \alpha_{n-k}| \le C |k|^{-2}$ for sufficiently large $|k|$, and the sequence $\gamma_n(N)$
converges as $N \to +\infty$.
(2) We wish to show that $f_N T \to fT$ in $\mathscr{S}'$. If $\varphi \in \mathscr{S}$, then
$$\langle (f - f_N) T, \varphi \rangle = \langle T, (f - f_N)\varphi \rangle,$$
and it is sufficient to show that $\varphi_N = (f - f_N)\varphi$ tends to 0 in $\mathscr{S}$. By
Leibniz's formula, this reduces to showing, for arbitrary given integers $p$,
$q$, and $l$, that
$$t^p (f - f_N)^{(l)} \varphi^{(q)} \to 0$$
uniformly on $\mathbb{R}$. We know from Theorem 5.3.1 that $(f - f_N)^{(l)}$ tends to 0
uniformly on $\mathbb{R}$. Since $t^p \varphi^{(q)}$ is bounded, the result follows.
(3) Denote the Fourier coefficients of the periodic distribution $fT$ by $(\gamma_n)$.
From (2) we know that
$$(f - f_N) T = \sum_{n=-\infty}^{+\infty} (\gamma_n - \gamma_n(N))\, e_n \to 0$$
in $\mathscr{S}'$. Taking the Fourier transform shows that
$$\sum_{n=-\infty}^{+\infty} (\gamma_n - \gamma_n(N))\, \delta_{n/a} \to 0$$
in $\mathscr{S}'$ and hence in $\mathscr{D}'$. This implies that
$$\gamma_n(N) \to \gamma_n$$
for each $n \in \mathbb{Z}$ as $N \to +\infty$, which means that $\gamma_n = \sum_{k=-\infty}^{+\infty} c_k \alpha_{n-k}$. $\square$
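The coefficient formula $\gamma_n = \sum_k c_k \alpha_{n-k}$ is a discrete convolution, and it can be tried on a case where the product is known in advance. The sketch below is an illustration under assumptions: it takes $T = \Delta_a$ (whose Fourier coefficients are $\alpha_n = 1/a$) and trigonometric polynomials $f$ with finitely many nonzero $c_k$. Since $f \Delta_a = \sum f(na)\delta_{na}$, the product with $\cos(2\pi t/a)$ is $\Delta_a$ itself, and the product with $\sin(2\pi t/a)$ is 0. The function name `product_coefficient` and the dictionary encoding of $(c_k)$ are hypothetical choices.

```python
def product_coefficient(n, c, alpha):
    """gamma_n = sum_k c_k * alpha_{n-k}; c is a dict {k: c_k} with finitely
    many entries, alpha is a function n -> alpha_n (hypothetical helper)."""
    return sum(ck * alpha(n - k) for k, ck in c.items())

a = 2.0

def comb(n):
    """Fourier coefficients of the Dirac comb Delta_a: alpha_n = 1/a for all n."""
    return 1.0 / a

cos_coeffs = {1: 0.5, -1: 0.5}        # f(t) = cos(2 pi t / a)
sin_coeffs = {1: -0.5j, -1: 0.5j}     # f(t) = sin(2 pi t / a)

gamma_cos = [product_coefficient(n, cos_coeffs, comb) for n in range(-3, 4)]
gamma_sin = [product_coefficient(n, sin_coeffs, comb) for n in range(-3, 4)]
```

In the first case every $\gamma_n$ equals $1/a$, the coefficients of $\Delta_a$; in the second they all vanish, in agreement with $f(na) = 0$.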
36.4 Exercises
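*Exercise 36.1 Let $x = \sum_{n=-\infty}^{+\infty} x_n \delta_{nh}$ be a discrete periodic signal with
period $a = Nh$ and grid $h > 0$, and let
$$X_k = \frac{1}{N} \sum_{n=0}^{N-1} x_n \omega_N^{-nk}, \qquad k \in \mathbb{Z},$$
be its discrete Fourier transform (see (8.5)).
(a) By writing
$$x = \sum_{j=0}^{N-1} x_j \tau_{jh} \Delta_a$$
(with $\Delta_a = \sum_{n=-\infty}^{+\infty} \delta_{na}$), show that
$$\widehat{x}(\lambda) = \frac{1}{a} \sum_{k=0}^{N-1} x_k e^{-2i\pi \lambda k h}\, \Delta_{1/a}.$$
(b) Deduce from this that
$$\widehat{x} = \frac{1}{h} \sum_{k=-\infty}^{+\infty} X_k \delta_{k/a}.$$
Discuss the relation between the signals $x$ and $\widehat{x}$.
Exercise 36.2 Assume $S \in \mathscr{E}'(\mathbb{R})$. Under what condition does there exist
a periodic distribution $T$ with period $a$ such that $T' = S * \Delta_a$?
Exercise 36.3 Let $T$ be a periodic distribution with period $a$ and Fourier
coefficients $\alpha_n$. Show that
$$\langle T, \varphi \rangle = \sum_{n=-\infty}^{+\infty} \alpha_n \widehat{\varphi}\!\left(-\frac{n}{a}\right)$$
for all $\varphi \in \mathscr{S}(\mathbb{R})$.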
Lesson 37
Sampling Signals and Poisson's Formula
We are now going to tackle the problem of sampling analog signals. This
operation is a prerequisite of digital signal processing. For example, an
analog speech signal must be sampled before it can enter a digital
telephone system. A sampler records the level of the signal every $a$ seconds
and transforms it into a sequence of impulses (Figure 37.1). An
analog-to-digital converter (ADC) codes these impulses as numbers that can be
processed digitally.
[Block diagram: $f(t)$ → sampler → $x_n = f(na)$ → ADC → digital signal processor → $(y_n)$ → DAC → $g(t)$]
FIGURE 37.1. Processing an analog signal.
Mathematically, sampling has already been defined (Section 29.5.2) as
multiplication of the signal by Dirac's comb:
$$f \mapsto a f \Delta_a = a \sum_{n=-\infty}^{+\infty} f(na)\, \delta_{na}.$$
At first glance it would appear that information in the original signal is

thereby lost. This is true to some extent, but what we will see in this lesson
and the rest of the book is that the sampling rate can be high enough
that for all practical purposes the loss of information is not important.
On the other hand, digital processing offers many technical and economic
advantages. Here is a simplified look at two of them:
(a) During the time between the arrival of two consecutive sample values,
a serial processor can be doing calculations based on the value of the last
sample to arrive. This enables "real time" processing for applications like
automatic control and process control.
(b) Sampling and subsequent digital processing lead to sophisticated
ways to compress signals. This means, for example, that speech signals
can be compressed, transmitted, and reconstructed digitally without
perceptible loss of quality. The economic advantage is that the compressed
signal occupies less bandwidth, and so more signals can be transmitted
over a given channel.
We are particularly interested in the spectra of signals and thus in
spectral analysis. In this context, it is essential to ask what happens to the
spectrum of a sampled signal $a f \Delta_a$, or, put another way, to ask how the
spectra of the sampled and original signals are related. We hasten to note
that the spectrum obtained for the sampled signal is not a sampling of the
original spectrum.
We will see in the next lesson something that at first seems quite
astonishing: For a large (and in practice, important) class of signals, it is possible
to sample without losing information. This is the essence of Shannon's
theorem, which remains one of the most important theoretical and practical
results in the theory of signal processing.
There is, however, an older result that is fundamental for all of the work
in this area: It is Poisson's formula, and it establishes the connection be-
tween Fourier series and the Fourier transform.
37.1 Poisson's formula in $\mathscr{D}'$
The equation
$$\sum_{n=-\infty}^{+\infty} f(t - na) = \frac{1}{a} \sum_{n=-\infty}^{+\infty} \widehat{f}\!\left(\frac{n}{a}\right) e^{2i\pi n t/a}, \tag{37.1}$$
where $a > 0$ is arbitrary and fixed, or its dual version
$$\sum_{n=-\infty}^{+\infty} \widehat{g}\!\left(\lambda - \frac{n}{a}\right) = a \sum_{n=-\infty}^{+\infty} g(na)\, e^{-2i\pi \lambda n a}, \tag{37.2}$$
is called Poisson's summation formula or simply Poisson's formula. We
intend to prove this formula for several classes of signals that will be modeled
by either functions or distributions.
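Formula (37.1) can be checked numerically on a function for which both sides converge very quickly. The sketch below is only an illustration: it uses the Gaussian $f(t) = e^{-\pi t^2}$, which is its own Fourier transform under the convention $\widehat{f}(\lambda) = \int f(t) e^{-2i\pi\lambda t}\,dt$. The function names, the truncation at 50 terms on each side, and the sample values of $a$ and $t$ are assumptions of this sketch.

```python
import cmath
import math

def gaussian(t):
    """f(t) = exp(-pi t^2), equal to its own Fourier transform."""
    return math.exp(-math.pi * t * t)

def poisson_left(t, a, terms=50):
    """Symmetric partial sum of sum_n f(t - n a)."""
    return sum(gaussian(t - n * a) for n in range(-terms, terms + 1))

def poisson_right(t, a, terms=50):
    """Symmetric partial sum of (1/a) sum_n f^(n/a) exp(2 i pi n t / a)."""
    total = sum(gaussian(n / a) * cmath.exp(2j * math.pi * n * t / a)
                for n in range(-terms, terms + 1))
    return total / a

a, t = 0.7, 0.2
left = poisson_left(t, a)
right = poisson_right(t, a)
```

The right-hand side comes out real up to rounding because the terms for $n$ and $-n$ are complex conjugates.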
37.1.1 Preliminary remarks
If formula (37.1) is to make sense, the numbers $\widehat{f}(n/a)$ must be defined;
this means that $\widehat{f}$ must be a function that can be evaluated at a given
point. This is the case when $\widehat{f}$ is continuous. This is not generally the case
if $\widehat{f} \in L^2(\mathbb{R})$, unless, of course, $\widehat{f}$ belongs to an equivalence class that
contains a continuous representative.
In spite of appearances, this is not an issue for the left-hand sum; it is
to be interpreted as one of the expressions
$$\sum_{n=-\infty}^{+\infty} \tau_{na} f = \sum_{n=-\infty}^{+\infty} f * \delta_{na} = f * \Delta_a, \tag{37.3}$$
where specific values of the function do not appear. Thus $f$ can be any
distribution for which the series converges, say, in $\mathscr{D}'$.
Finally, we note once again that the convergence of these series must be
interpreted as the (symmetric) limit of $\sum_{n=-N}^{N}$ as $N \to +\infty$. We know,
for example, that the series on the right in (37.1) converges in $\mathscr{D}'$ if $\widehat{f}$ is a
slowly increasing function (Theorem 29.4.2). As in the case of the left-hand
side, the variable $t$ plays only a symbolic role in the expression on the right.
There are, however, cases where the Poisson formula holds for all $t \in \mathbb{R}$.
This happens, for example, when $f \in \mathscr{S}$ (see Exercise 37.1).
37.1.2 The case where $f$ is a distribution with compact support
If the distribution $f$ has compact support, $\widehat{f}$ is a slowly increasing $C^\infty$
function (Theorem 31.5.1), and both sides of (37.1) make sense.
Furthermore, in view of (37.3), the Poisson formula is just equation (36.8), which
was established for periodic distributions. This proves the following result:
The Poisson summation formula (37.1) is true for distributions $f$ with
compact support, $f \in \mathscr{E}'$. The dual formula (37.2) is true for all functions
$g$ such that $\widehat{g} \in \mathscr{E}'$.
37.2 Poisson's formula in $L^1(\mathbb{R})$
When $f \in L^1(\mathbb{R})$, $\widehat{f}$ is continuous and bounded by Theorem 17.1.3. The
right-hand side of (37.1) is then a trigonometric series that converges in
$\mathscr{D}'$ to a periodic distribution (Theorem 29.4.2), which, being periodic, is
tempered.
We first prove a lemma about the series on the left of (37.1).
37.2.1 Lemma Assume $f \in L^1(\mathbb{R})$, and for $a > 0$ define
$$F(t) = \sum_{n=-\infty}^{+\infty} f(t - na). \tag{37.4}$$
Then the following results hold:
(i) The series (37.4) converges in $L^1(0, a)$, and $F \in L^1_p(0, a)$. The Fourier
coefficients of $F$ are
$$c_k(F) = \frac{1}{a}\, \widehat{f}\!\left(\frac{k}{a}\right), \qquad k \in \mathbb{Z}.$$
(ii) If in addition, $f' \in L^1(\mathbb{R})$ (where the derivative $f'$ is taken in the
sense of distributions), then the series (37.4) converges uniformly on
$\mathbb{R}$, and thus $F$ is continuous on $\mathbb{R}$.
Proof.
(i) We show that the restriction of the series (37.4) to $(0, a)$ converges in
the complete space $L^1(0, a)$ by showing that the sequence
$$F_N(t) = \sum_{n=-N}^{N} f(t - na)$$
is a Cauchy sequence. We have
$$\|F_{N+P} - F_N\|_{L^1(0,a)} \le \sum_{N < |n| \le N+P} \int_0^a |f(t - na)|\,dt \le \int_{|x| \ge Na} |f(x)|\,dx.$$
Since $f$ is in $L^1(\mathbb{R})$, the last integral tends to 0 as $N$ and $P$ tend to $+\infty$.
Thus the sequence $(F_N)$ converges to $F$ on $(0, a)$, and $F \in L^1_p(0, a)$ by
periodicity. The Fourier coefficients of $F$ are the limits of those of $F_N$:
$$c_k(F) = \lim_{N \to +\infty} c_k(F_N).$$
But
$$c_k(F_N) = \frac{1}{a} \sum_{|n| \le N} \int_0^a f(t - na)\, e^{-2i\pi k t/a}\,dt = \frac{1}{a} \int_{-Na}^{(N+1)a} f(x)\, e^{-2i\pi k x/a}\,dx$$
tends to
$$\frac{1}{a} \int_{-\infty}^{+\infty} f(x)\, e^{-2i\pi k x/a}\,dx = \frac{1}{a}\, \widehat{f}\!\left(\frac{k}{a}\right),$$
which proves (i).

(ii) Take $a = 1$ to simplify the notation. We will show that there is a
sequence of positive numbers $u_n$ such that
$$|f(t - n)| \le u_n \tag{37.5}$$
for all $t \in [0, 1]$ and all $n \in \mathbb{Z}$, and such that $\sum_{n=-\infty}^{+\infty} u_n < +\infty$.
The assumption in (ii) means that $(T_f)' = T_{f'}$, and an argument like that
given in Section 30.1.2 shows that $f$ is absolutely continuous on all bounded
intervals. Thus for $t \in [0, 1]$,
$$f(t - n) - \int_n^{n+1} f(t - x)\,dx = \int_n^{n+1} [f(t - n) - f(t - x)]\,dx,$$
and
$$f(t - n) - f(t - x) = \int_n^x f'(t - y)\,dy.$$
From these relations we see that
$$|f(t - n) - f(t - x)| \le \int_n^{n+1} |f'(t - y)|\,dy$$
and
$$\left| f(t - n) - \int_n^{n+1} f(t - x)\,dx \right| \le \int_n^{n+1} |f'(t - y)|\,dy.$$
The last inequality implies that
$$|f(t - n)| \le \int_n^{n+1} [\,|f(t - x)| + |f'(t - x)|\,]\,dx \le \int_{-(n+1)}^{-(n-1)} [\,|f(y)| + |f'(y)|\,]\,dy.$$
Let $u_n = \int_{-(n+1)}^{-(n-1)} [\,|f(y)| + |f'(y)|\,]\,dy$; then
$$\sum_{n=-\infty}^{+\infty} u_n = 2 \int_{\mathbb{R}} [\,|f(y)| + |f'(y)|\,]\,dy < +\infty. \tag{37.6}$$
This proves that the series (37.4) converges uniformly on $[0, 1]$ (or $[0, a]$
in general). Since $F$ is periodic, the convergence is uniform on $\mathbb{R}$, which
proves the result. $\square$
37.2.2 Theorem
(i) If $f \in L^1(\mathbb{R})$, then Poisson's formula (37.1) expresses the equality in
$\mathscr{S}'$ between the function $F \in L^1_p(0, a)$ and its Fourier series.
(ii) If, in addition, $f' \in L^1(\mathbb{R})$ (where the derivative $f'$ is taken in the
sense of distributions), equality (37.1) holds for all $t \in \mathbb{R}$. More
precisely, the series on the left converges uniformly on $\mathbb{R}$ to a continuous
periodic function $F$, and the Fourier series of $F$, which is the series
on the right, converges uniformly on $\mathbb{R}$ to $F$.
Proof. The first result is a restatement of Lemma 37.2.1 and Proposition
36.1.3. For (ii), it is sufficient to show that $F$ is of bounded variation on
$[0, a]$ (Theorem 5.2.5). Again, take $a = 1$.
Let $0 = t_0 < t_1 < \cdots < t_P < t_{P+1} = 1$ be any subdivision of $[0, 1]$. Then
$$M_P = \sum_{i=0}^{P} |F(t_{i+1}) - F(t_i)| \le \sum_{i=0}^{P} \sum_{n=-\infty}^{+\infty} |f(t_{i+1} - n) - f(t_i - n)|$$
$$= \sum_{n=-\infty}^{+\infty} \sum_{i=0}^{P} |f(t_{i+1} - n) - f(t_i - n)| \le \sum_{n=-\infty}^{+\infty} \sum_{i=0}^{P} \int_{t_i}^{t_{i+1}} |f'(t - n)|\,dt.$$
Finally,
$$M_P \le \sum_{n=-\infty}^{+\infty} \int_0^1 |f'(t - n)|\,dt = \sum_{n=-\infty}^{+\infty} \int_{-n}^{-(n-1)} |f'(y)|\,dy = \int_{\mathbb{R}} |f'(y)|\,dy.$$
The numbers $M_P$ are bounded independently of the subdivision; thus $F$ is
of bounded variation, which proves the theorem. $\square$
37.3 Application to the study of the spectrum of a sampled signal
Let $f$ be a tempered signal whose spectrum contains no frequencies greater
than some limiting value $\lambda_c$:
$$\operatorname{supp}(\widehat{f}) \subset [-\lambda_c, \lambda_c].$$
In this situation, $f$ is said to be band limited, which is to say that $\widehat{f} \in \mathscr{E}'$.
We saw in Section 29.5 that sampling $f$ every $a$ seconds can be expressed by
$$a f \Delta_a = a \sum_{n=-\infty}^{+\infty} f(na)\, \delta_{na}.$$
This makes sense because $\widehat{f} \in \mathscr{E}'$ implies $f \in C^\infty(\mathbb{R})$.
We wish to determine how sampling modifies the spectrum of $f$. For this
we need to compute the Fourier transform of $a f \Delta_a$. By Proposition 33.2.1,
$$\overline{\mathscr{F}}(\widehat{f} * \widehat{\Delta_a}) = f \Delta_a.$$
Taking the Fourier transform and using (31.10) shows that the spectrum
of the sampled function is
$$\widehat{a f \Delta_a} = \widehat{f} * \Delta_{1/a}.$$
FIGURE 37.2. What is the effect of sampling on the spectrum? (Left: the signal $f(t)$ and its samples at $-a, 0, a, 2a$; right: the spectrum $\widehat{f}(\lambda)$.)
Formula (37.2) with $g = f$ gives the following key result:
$$\sum_{n=-\infty}^{+\infty} \widehat{f}\!\left(\lambda - \frac{n}{a}\right) = a \sum_{n=-\infty}^{+\infty} f(na)\, e^{-2i\pi \lambda n a}. \tag{37.7}$$
We draw several conclusions:
(a) The spectrum of the sampled signal $a f \Delta_a$ of a band-limited signal $f$
is periodic with period $1/a$.
(b) The spectrum of the sampled signal is obtained by summing all
translates of the spectrum of the original signal $f$, the translations being
the integer multiples of $1/a$.
We are now able to answer the question asked in Figure 37.2. There are
two cases:
Case 1: $a > 1/(2\lambda_c)$ (Figure 37.3).
The translates of the spectrum of $f$ overlap, and the spectrum of $a f \Delta_a$
does not agree with the spectrum of $f$ on the interval $[-\lambda_c, \lambda_c]$.
Case 2: $a \le 1/(2\lambda_c)$ (Figure 37.4).
The sampling rate is large enough so that the translates of the spectrum
of $f$ are separated. The spectrum of the sampled signal is a simple periodic
repetition of the initial spectrum $\widehat{f}$, and the two spectra agree on $[-\lambda_c, \lambda_c]$.
The critical sampling rate $a = 1/(2\lambda_c)$ is called the Nyquist rate.
If you sample a signal, you periodize its spectrum.
FIGURE 37.3. Sampling a signal periodizes its spectrum (case $a > \frac{1}{2\lambda_c}$).
FIGURE 37.4. A high sampling rate separates the components of the spectrum (case $a \le \frac{1}{2\lambda_c}$).
37.4 Application to accelerating the convergence of a Fourier series
We illustrate the idea with an example. Consider the function $F$ defined
by its Fourier series:
$$F(t) = \sum_{n=-\infty}^{+\infty} \frac{1}{n^2 + b^2}\, e^{2i\pi n t}. \tag{37.8}$$
This series converges uniformly on $\mathbb{R}$, and $F$ is continuous with period 1.
If $f$ is the function defined by
$$\widehat{f}(\lambda) = \frac{1}{\lambda^2 + b^2}, \qquad b > 0,$$
then we see that the right-hand side of (37.8) is exactly the right-hand side
of (37.1) with $a = 1$. We will compute $F$ using $f$. From Section 18.2.2 we
know that
$$f(t) = \frac{\pi}{b}\, e^{-2\pi b |t|},$$
and clearly $f$ and $f'$ are integrable. Thus
$$F(t) = \frac{\pi}{b} \sum_{n=-\infty}^{+\infty} e^{-2\pi b |t - n|} \tag{37.9}$$
for all $t \in \mathbb{R}$. This series converges much faster than (37.8), and in this case
we can compute the sum explicitly. For $t \in [0, 1]$,
$$\frac{b}{\pi} F(t) = \sum_{n=-\infty}^{0} e^{-2\pi b (t - n)} + \sum_{n=1}^{+\infty} e^{-2\pi b (n - t)} = \frac{e^{-2\pi b t} + e^{-2\pi b (1 - t)}}{1 - e^{-2\pi b}},$$
and hence, for $t \in [0, 1]$,
$$\sum_{n=-\infty}^{+\infty} \frac{1}{n^2 + b^2}\, e^{2i\pi n t} = \frac{\pi}{b \sinh \pi b} \cosh\!\left[2\pi b \left(t - \frac{1}{2}\right)\right].$$
Taking $t = 0$, we see that
$$\sum_{n=-\infty}^{+\infty} \frac{1}{n^2 + b^2} = \frac{\pi}{b} \coth(\pi b).$$
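The gain in speed can be made concrete at $t = 0$, where the exact value $\frac{\pi}{b}\coth(\pi b)$ is available. The sketch below is an illustration with assumed names and truncation levels: it compares a symmetric partial sum of the original series with 1000 terms on each side against a partial sum of the transformed series (37.9) with only 5 terms on each side.

```python
import math

def slow_sum(b, terms):
    """Symmetric partial sum of sum_n 1/(n^2 + b^2) over |n| <= terms."""
    return sum(1.0 / (n * n + b * b) for n in range(-terms, terms + 1))

def fast_sum(b, t, terms):
    """Symmetric partial sum of (pi/b) sum_n exp(-2 pi b |t - n|), as in (37.9)."""
    return math.pi / b * sum(math.exp(-2 * math.pi * b * abs(t - n))
                             for n in range(-terms, terms + 1))

b = 1.0
exact = math.pi / b / math.tanh(math.pi * b)    # (pi/b) coth(pi b)
slow_error = abs(slow_sum(b, 1000) - exact)     # tail behaves like 2/terms
fast_error = abs(fast_sum(b, 0.0, 5) - exact)   # tail decays like exp(-2 pi b terms)
```

With these choices the original series is still off in the third decimal place, while the transformed series is accurate to roughly machine precision.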
37.5 Exercises
Exercise 37.1 (Poisson's formula in $\mathscr{S}(\mathbb{R})$)
(a) Use equation (31.10) to show that
$$\sum_{n=-\infty}^{+\infty} \widehat{\varphi}\!\left(\frac{n}{a}\right) = a \sum_{n=-\infty}^{+\infty} \varphi(na), \qquad a > 0,$$
for all $\varphi \in \mathscr{S}(\mathbb{R})$.
(b) Deduce from this that
t > 0.
Exercise 37.2 (sampling a sinusoidal signal)
Consider the signal $g(t) = \cos(2\pi \lambda t + \varphi)$ that is sampled with a frequency $\tau$. Let
$g_k$ denote the values of $g$ at the times $t_k = k/\tau$, $k \in \mathbb{N}$.
(a) Show that there exists a frequency $f \in [-\tau/2, \tau/2)$ such that the signal
$$h(t) = \cos(2\pi f t + \varphi)$$
gives the same samples as $g$ at the times $t_k$.
(b) Deduce from this result a necessary condition (involving $\lambda$ and $\tau$) for $g$ to
be completely determined by its samples.
Exercise 37.3 Use the method illustrated in Section 37.4 to transform the
Fourier series
$$F(t) = \sum_{n=-\infty}^{+\infty} e^{-b n^2} e^{2i\pi n t}, \qquad b > 0.$$
Does the transformed series converge faster than the original series?
Lesson 38
The Sampling Theorem and Shannon's Formula
Shannon's formula is an interpolation formula that expresses the value $f(t)$
of a signal at any time $t$ in terms of its values $f(na)$ at the discrete points
$na$, $n \in \mathbb{Z}$. The signal $f$ is thus completely determined by the sampled
signal $a f \Delta_a$. This is what we had in mind when we mentioned in the last
lesson that sampling does not destroy information. Since this property is
patently false for arbitrary signals, some restrictive assumptions must be
made about $f$. Our point of departure will be Poisson's formula (37.7);
thus the first assumption is that $f$ is band limited.
Let $f$ be a band-limited signal:
$$\operatorname{supp}(\widehat{f}) \subset [-\lambda_c, \lambda_c]. \tag{38.1}$$
Then $f$ is infinitely differentiable and slowly increasing (Theorem 31.5.1),
and we have Poisson's formula (37.7)
$$\sum_{n=-\infty}^{+\infty} \widehat{f}\!\left(\lambda - \frac{n}{a}\right) = a \sum_{n=-\infty}^{+\infty} f(na)\, e^{-2i\pi \lambda n a} \tag{38.2}$$
with equality in $\mathscr{S}'$. When the sampling rate is high enough, which is
when
$$a \le \frac{1}{2\lambda_c},$$
the translates of the spectrum $\widehat{f}$ in the left-hand side of (38.2) do not
overlap; they are separated by $a^{-1} - 2\lambda_c \ge 0$.
The idea behind Shannon's formula is to isolate the central copy of $\widehat{f}$
and use it to reconstruct $f$, which will then be expressed in terms of its
values $f(na)$ (see Figure 38.1). The next assumption is that $\widehat{f}$, and hence
$f$, is square integrable:
$$\widehat{f} \in L^2(\mathbb{R}). \tag{38.3}$$
The left-hand side of (38.2) is then a periodic function $F(\lambda)$ with period
$1/a$ that is square integrable over one period. Thus $F$ can be expanded in
a Fourier series
$$F(\lambda) = \sum_{n=-\infty}^{+\infty} c_n e^{2i\pi \lambda n a}. \tag{38.4}$$
FIGURE 38.1. The periodic function $F(\lambda)$.
The sequence $(c_n)$ is square summable, and the equality (38.4) holds in
$L^2(0, 1/a)$. Since Fourier expansions of periodic tempered distributions are
unique (Theorem 36.2.2),
$$c_{-n} = a f(na).$$
First conclusion: Under the assumptions (38.1) and (38.3) the equality
(38.2) holds in $L^2_p(0, 1/a)$ and
$$\sum_{n=-\infty}^{+\infty} |f(na)|^2 < +\infty.$$
If we multiply $F(\lambda)$ by the characteristic function
$$r(\lambda) = \chi_{[-\frac{1}{2a}, \frac{1}{2a}]}(\lambda),$$
we see that
$$\hat f(\lambda) = a \sum_{n=-\infty}^{+\infty} f(na)\, r(\lambda)\, e^{-2i\pi\lambda na}, \tag{38.5}$$
which holds in $L^2(\mathbb{R})$. From the continuity of $\overline{\mathscr{F}}$ on $L^2(\mathbb{R})$ and from the
relation
$$\overline{\mathscr{F}}\big[r(\lambda) e^{-2i\pi\lambda na}\big] = (\overline{\mathscr{F}} r)(t - na) = \frac{\sin\frac{\pi}{a}(t - na)}{\pi(t - na)},$$
we finally obtain the interpolation formula
$$f(t) = \sum_{n=-\infty}^{+\infty} f(na)\, \frac{\sin\frac{\pi}{a}(t - na)}{\frac{\pi}{a}(t - na)}. \tag{38.6}$$
The series on the right converges to $f$ in $L^2(\mathbb{R})$. If in addition,
$$\sum_{n=-\infty}^{+\infty} |f(na)| < +\infty,$$
then the series in (38.6) converges uniformly on $\mathbb{R}$ to a continuous function
$g$. But this implies that the series converges to $g$ in $L^2(J)$, where $J$ is any
bounded interval. The conclusion is that $f = g$ a.e. on $\mathbb{R}$; hence $f(t) = g(t)$
for all $t \in \mathbb{R}$ because $f$ and $g$ are continuous.
38.1 Shannon's theorem

Shannon's theorem. Let $f$ be a signal that contains no frequencies
greater than some value $\lambda_c$ and assume that $f$ has finite energy:
$$\operatorname{supp}(\hat f) \subset [-\lambda_c, \lambda_c] \quad\text{and}\quad f \in L^2(\mathbb{R}).$$
Then for all $a > 0$,
$$\sum_{n=-\infty}^{+\infty} |f(na)|^2 < +\infty, \tag{38.7}$$
and for all $a \le \dfrac{1}{2\lambda_c}$,
$$f(t) = \sum_{n=-\infty}^{+\infty} f(na)\, \frac{\sin\frac{\pi}{a}(t - na)}{\frac{\pi}{a}(t - na)}. \tag{38.8}$$
This equality holds in $L^2(\mathbb{R})$. If in addition,
$$\sum_{n=-\infty}^{+\infty} |f(na)| < +\infty, \tag{38.9}$$
then the series converges uniformly on $\mathbb{R}$ and equality holds for all $t \in \mathbb{R}$.
REMARK: Shannon's formula can also be written as
$$f(t) = \frac{a}{\pi} \sin\frac{\pi}{a}t \sum_{n=-\infty}^{+\infty} f(na)\, \frac{(-1)^n}{t - na}. \tag{38.10}$$
This causes poles, which do not belong to $f$, to appear in the series at the
points $t_n = na$.
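The interpolation formula (38.8) can be checked numerically. The following is a minimal sketch, not part of the original text: the test signal $f(t) = (\sin \pi t / \pi t)^2$, the step $a = 0.4$, the truncation index `N`, and the helper name `shannon_interpolate` are all assumptions chosen for illustration. The spectrum of this test signal is supported in $[-1, 1]$, so $\lambda_c = 1$ and any $a \le 1/2$ is admissible; since (38.9) also holds for it, the only error in the computation should come from truncating the series.

```python
import numpy as np

def shannon_interpolate(samples, n_indices, a, t):
    """Truncated Shannon sum: sum_n f(na) * sin(pi(t-na)/a) / (pi(t-na)/a)."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    # np.sinc(x) = sin(pi x)/(pi x), so the cardinal sine is np.sinc((t - na)/a).
    kernel = np.sinc((t[:, None] - n_indices[None, :] * a) / a)
    return kernel @ samples

def f(t):
    # Test signal (an assumption of this sketch): sinc^2, spectrum in [-1, 1].
    return np.sinc(t) ** 2

a = 0.4                      # sampling step, a <= 1/(2*lambda_c) = 0.5
N = 2000                     # truncation index (hypothetical choice)
n = np.arange(-N, N + 1)
t = np.array([0.0, 0.13, 1.7, -2.45])
approx = shannon_interpolate(f(n * a), n, a, t)
max_error = np.max(np.abs(approx - f(t)))
```

The large value of `N` needed for a small `max_error` reflects the slow decay of the cardinal sine.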
38.2 The case of a function $f(t) = \sum_{n=-N}^{N} c_n e^{2i\pi\lambda_n t}$

The spectrum of this function is $\sum_{n=-N}^{N} c_n \delta_{\lambda_n}$, so it has bounded support.
Although this function is not square integrable, it is easy to see using the
theory of Fourier series that Shannon's formula is true for trigonometric
series. By linearity it is sufficient to prove (38.8) for the function
$$f(t) = e^{2i\pi\lambda t}, \qquad \lambda \in \mathbb{R}.$$
Let $g$ be the $1/a$-periodic function that agrees with $f$ on $(-1/(2a), 1/(2a))$.
For $\lambda$ real and fixed, the Fourier coefficients of $g$ are
$$c_n = \frac{a \sin\frac{\pi}{a}(\lambda - na)}{\pi(\lambda - na)}, \qquad n \in \mathbb{Z},$$
and
$$g(t) = \sum_{n=-\infty}^{+\infty} \frac{\sin\frac{\pi}{a}(\lambda - na)}{\frac{\pi}{a}(\lambda - na)}\, e^{2i\pi nat}. \tag{38.11}$$
This equality holds in $L^2(0, 1/a)$, but in view of Theorem 5.2.4, it is also
true for all $t$ in the open interval $(-1/(2a), 1/(2a))$. Hence, for all $\lambda \in \mathbb{R}$,
$$e^{2i\pi\lambda t} = \sum_{n=-\infty}^{+\infty} e^{2i\pi nat}\, \frac{\sin\frac{\pi}{a}(\lambda - na)}{\frac{\pi}{a}(\lambda - na)}, \qquad |t| < \frac{1}{2a}. \tag{38.12}$$
Shannon's formula (38.8) is obtained by interchanging $t$ and $\lambda$.
REMARK: We note that here it is not possible to take $a$ to be one of the
extreme values $1/(2\lambda)$. The relation would clearly be false in this case.
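38.2.1 Theorem For a trigonometric signal
$$f(t) = \sum_{n=-N}^{N} c_n e^{2i\pi\lambda_n t}, \qquad \lambda_n \in \mathbb{R},$$
Shannon's formula
$$f(t) = \sum_{n=-\infty}^{+\infty} f(na)\, \frac{\sin\frac{\pi}{a}(t - na)}{\frac{\pi}{a}(t - na)} \tag{38.13}$$
holds for all $a \in (0, 1/(2\lambda_c))$ and all $t \in \mathbb{R}$ with $\lambda_c = \max_{|n| \le N} |\lambda_n|$, and
the convergence is pointwise on $\mathbb{R}$.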
38.3 Shannon's formula fails in $\mathscr{S}'$

It is natural to ask whether Shannon's formula (38.8) is true in $\mathscr{S}'$ for all
signals $f$ such that $\hat f \in \mathscr{E}'$ ($f$ is band limited). If this equality were true in
$\mathscr{S}'$ for all such signals, then we would have
$$\sum_{n=-N}^{N} f(na) \int_{\mathbb{R}} \frac{\sin\frac{\pi}{a}(t - na)}{\frac{\pi}{a}(t - na)}\, \varphi(t)\, dt \;\longrightarrow\; \int_{\mathbb{R}} f(t)\varphi(t)\, dt$$
for all $\varphi \in \mathscr{S}$. This, however, is not the case: One can find such signals $f$
and functions $\varphi$ for which the sum does not converge to the integral on the
right (see Exercise 38.5).
38.4 The cardinal sine functions

The function $s_a(t) = \sin\frac{\pi}{a}t \,/\, (\frac{\pi}{a}t)$, whose translates $s_{an}(t) = s_a(t - na)$
appear in Shannon's formula
$$f = \sum_{n=-\infty}^{+\infty} f(na)\, s_{an}, \tag{38.14}$$
is called the cardinal sine. We have $s_{an}(na) = 1$, and $s_{an}(ma) = 0$ for $m \in \mathbb{Z}$
and $m \ne n$. The cardinal sine and its translates are in $L^2(\mathbb{R})$, and (38.14)
suggests that the $s_{an}$ might form a basis for the Hilbert space $L^2(\mathbb{R})$. One
must remember, however, that (38.14) was developed only for functions
in $L^2(\mathbb{R})$ that have bounded spectra contained in $[-1/(2a), 1/(2a)]$. This
leads us to introduce the following definition:
$$V_a = \Big\{ v \in L^2(\mathbb{R}) \;\Big|\; \operatorname{supp}(\hat v) \subset \Big[-\frac{1}{2a}, \frac{1}{2a}\Big] \Big\}.$$
It is easy to show that $V_a$ is a closed subspace of $L^2(\mathbb{R})$.
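38.4.1 Proposition
(i) The family of functions $(s_{an})_{n \in \mathbb{Z}}$ is an orthogonal basis for the Hilbert
space $V_a$.
(ii) If $(a_j)_{j \in \mathbb{N}}$ is any sequence such that $\lim_{j \to +\infty} a_j = 0$, then $\bigcup_{j \in \mathbb{N}} V_{a_j}$ is
dense in $L^2(\mathbb{R})$.
Proof. We first prove orthogonality. By Proposition 22.1.2,
$$\int_{\mathbb{R}} s_{an}\, \overline{s_{ap}} = \int_{\mathbb{R}} \widehat{s_{an}}\, \overline{\widehat{s_{ap}}}.$$
We know how to compute the Fourier transform of $s_{an}$:
$$\widehat{s_{an}}(\lambda) = \widehat{\tau_{na} s_a}(\lambda) = \widehat{s_a}(\lambda)\, e^{-2i\pi\lambda na} = a\, r(\lambda)\, e^{-2i\pi\lambda na},$$
where $r$ is the characteristic function of $[-1/(2a), 1/(2a)]$. Consequently,
$$\int_{\mathbb{R}} s_{an}\, \overline{s_{ap}} = a^2 \int_{-\frac{1}{2a}}^{\frac{1}{2a}} e^{-2i\pi\lambda(n-p)a}\, d\lambda = \begin{cases} a & \text{if } n = p, \\ 0 & \text{if } n \ne p, \end{cases} \tag{38.15}$$
which proves orthogonality. Next we show that linear combinations of the
$s_{an}$ are dense in $V_a$.
Take $g \in V_a$ and $\varepsilon > 0$. By (38.14) and (38.15),
$$\Big\| g - \sum_{n=-N}^{N} g(na) s_{an} \Big\|_2^2 = \Big\| \sum_{|n| > N} g(na) s_{an} \Big\|_2^2 = a \sum_{|n| > N} |g(na)|^2,$$
and hence by (38.7) there is an $N_0 \in \mathbb{N}$ such that
$$\Big\| g - \sum_{n=-N_0}^{N_0} g(na) s_{an} \Big\|_2 < \varepsilon.$$
This proves density and completes the proof of (i).
To prove (ii), take $f \in L^2(\mathbb{R})$, $\varepsilon > 0$, and define $g_n$ by
$$\widehat{g_n}(\lambda) = \begin{cases} \hat f(\lambda) & \text{if } |\lambda| \le n, \\ 0 & \text{otherwise.} \end{cases}$$
There exists an $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$,
$$\| f - g_n \|_2^2 = \int_{|\lambda| \ge n} |\hat f(\lambda)|^2\, d\lambda < \varepsilon,$$
and $g_{n_0} \in V_{a_j}$ for sufficiently large $j$. □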
38.4.2 Remark It happens that the decomposition (38.8) is not particularly
useful in practice for numerical computation. The cardinal sine tends
to zero too slowly. Figure 38.2 illustrates a representation using (38.8).
It is nevertheless true that the function
$$f_N(t) = \sum_{n=-N}^{N} f(na)\, s_{an}(t),$$
which interpolates $f$ at the points $t_n = na$, $-N \le n \le N$, and is zero at
the other subdivision points, is the best approximation of $f$ in the subspace
of $L^2(\mathbb{R})$ spanned by $\{s_{a,-N}, \ldots, s_{a,0}, \ldots, s_{a,N}\}$.
FIGURE 38.2. The cardinal sine basis.
We have just encountered the problem of looking for a "good" orthogonal
basis for representing a signal $f \in L^2(\mathbb{R})$, where "good" is related to the
kind of signal processing we have in mind. We will see in Lesson 42 how
this question is being dealt with today in view of results on wavelets that
began to appear in the 1980s.
38.5 Sampling and the numerical evaluation of a spectrum

38.5.1 The sampling problem
Suppose we wish to compute the spectrum $\hat f(\lambda)$ of a signal $f$ that is
presented to us in some form (for example, as an analog recording) where we
have access to the function values at "all times" $t$. If we have no explicit
formula for the function or other information, the best we can do is sample $f$
and try to compute its spectrum from the sampled function. But what
sampling rate should be used? Without more information, there is no answer.
Experts in signal processing can tell us in concrete cases what frequencies
are essential for carrying information; this means that in a given, well-known
situation, an expert can specify a limit $\lambda_c$ above which the higher
frequencies are considered to be noise. The simplest example is perhaps the case
of sound in the human audio range. We know, in general, that humans do
not hear frequencies beyond about 20,000 Hz. Thus frequencies higher than
this can be suppressed in transmission and reproduction systems without
perceptible loss of quality. The limit $\lambda_c = 20{,}000$ Hz corresponds to a basic
sampling rate of 40,000 times per second, which is approximately what is
used in digital recordings. In fact, four times this rate, or 160,000 Hz, is
used for the production of compact discs. In cases where the signal varies
slowly, it is possible to sample at a lower rate. It is the specific situation
with its specific definition of "quality" that determines the sampling rate.
While it is up to the expert to define what is considered to be the bound
$\lambda_c$ of the spectrum, one must always keep in mind that this assumption
of a limited spectrum implies that the signal itself is an analytic function.
In particular, we cannot assume without contradiction that both the
signal and its spectrum have bounded support, since an analytic function
that vanishes on an interval must vanish identically (Theorem 31.5.2). It is
important to keep these facts in mind.

To assume that the signal $f$ is band limited implies that $f$ is an analytic
function and that $\operatorname{supp}(f) = \mathbb{R}$. (In particular, $f$ cannot be causal.)
Conversely, to assume that $f$ has bounded support implies that its
spectrum cannot have bounded support.
38.5.2 The phenomenon of aliasing

If one is not careful, computing the spectrum from samples taken directly
from a recorded signal can lead to unpleasant surprises. Any recorded
physical signal is going to be contaminated by noise. In addition to the "real"
signal $f$, the recorded signal will typically look like $g = f + r$, where $r$
has relatively small amplitude but contains relatively high frequencies, and
the spectrum of $g$ will be broader than the spectrum of $f$. This means that
even though one has a priori an idea about the bandwidth of $f$, a sampling
rate based on this knowledge will have a good chance of being too low and
will lead to the situation illustrated in Figure 37.3. This phenomenon is
called aliasing. When this happens, the computed spectrum will not be the
one that is sought. To avoid this problem, it is necessary to filter the signal
before it is sampled. By passing the signal through a well-designed low-pass
filter, one gains two advantages: High-frequency noise is eliminated,
and one has a better idea about the appropriate sampling rate.

To compute the spectrum of a physical signal numerically, it is necessary
to filter the signal before it is sampled. This is to avoid the
problem of aliasing.

Aliasing appears when processing a sampled signal in formula (8.7),
$$c_n^N = \cdots + c_{n-2N} + c_{n-N} + c_n + c_{n+N} + c_{n+2N} + \cdots,$$
which we saw in connection with the discrete Fourier transform. Here the
approximate spectrum $c_n^N$ is "contaminated" with extra copies of the real
spectrum $c_n$ that appear as the terms $c_{n+pN}$, $p \ne 0$. Prefiltering eliminates
these terms, which can be too large for practical computations, even though
they eventually tend to zero.
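A short numerical illustration of aliasing, offered as a sketch rather than as part of the original text: the frequencies (9 Hz and 1 Hz) and the sampling step $a = 0.1$ s are hypothetical values chosen so that one frequency lies above $1/(2a) = 5$ Hz and the other below it. At the sample points $t_n = na$ the two cosines take exactly the same values, so no computation based on the samples alone can tell them apart.

```python
import numpy as np

a = 0.1                          # sampling step: 10 samples per second
n = np.arange(0, 50)
t_n = n * a

high = np.cos(2 * np.pi * 9.0 * t_n)   # 9 Hz component, above 1/(2a) = 5 Hz
low = np.cos(2 * np.pi * 1.0 * t_n)    # 1 Hz component, below 1/(2a)

# The 9 Hz cosine is indistinguishable from the 1 Hz cosine on the sampling grid.
samples_coincide = np.allclose(high, low)
```

A low-pass prefilter with cutoff below 5 Hz would remove the 9 Hz component before sampling, which is the remedy described above.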
38.5.3 Computation using the FFT

Assume that the signal $f$ has been filtered and that
$$\operatorname{supp}(\hat f) \subset [-\lambda_c, \lambda_c].$$
Then by (38.5), for $a < \dfrac{1}{2\lambda_c}$,
$$\hat f(\lambda) = a \sum_{n=-\infty}^{+\infty} f(na)\, e^{-2i\pi na\lambda}$$
for all $\lambda \in [-\lambda_c, \lambda_c]$.

Suppose the signal is observed during the time $t \in [-Na, (N-1)a]$.
The approximation of the spectrum that is based on the samples $x_n = f(na)$,
$n = -N, \ldots, N-1$, will be
$$S_N(\lambda) = a \sum_{n=-N}^{N-1} x_n e^{-2i\pi\lambda na},$$
and its values at the points $\lambda_k = \dfrac{k}{2Na} = \dfrac{k}{T}$
are easily computed using the FFT as described in Lesson 9.

We see that the mesh of the grid on which the spectrum is computed is
$1/T$, where $T$ is the length of observation.
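The evaluation of $S_N(\lambda_k)$ can be sketched with NumPy's FFT. This is an illustrative sketch under stated assumptions, not the procedure of Lesson 9: the function name `spectrum_via_fft`, the range $k = -N, \ldots, N-1$, and the test signal $e^{-\pi t^2}$ (whose spectrum is $e^{-\pi\lambda^2}$) are choices made here. The Gaussian is not strictly band limited, but its spectrum is negligible outside the computed band, so it serves as a convenient check.

```python
import numpy as np

def spectrum_via_fft(x, a):
    """Return (lambda_k, S_N(lambda_k)) for k = -N..N-1, given x_n = f(na), n = -N..N-1."""
    two_n = len(x)                       # 2N samples, assumed even
    N = two_n // 2
    # Move the sample n = 0 to array position 0, as np.fft.fft expects.
    shifted = np.fft.ifftshift(x)
    values = a * np.fft.fft(shifted)     # k = 0..2N-1, periodic in k
    values = np.fft.fftshift(values)     # reorder to k = -N..N-1
    lam = np.arange(-N, N) / (two_n * a) # lambda_k = k / T with T = 2Na
    return lam, values

a, N = 0.125, 64
n = np.arange(-N, N)
x = np.exp(-np.pi * (n * a) ** 2)        # samples of exp(-pi t^2)
lam, S = spectrum_via_fft(x, a)
fft_error = np.max(np.abs(S - np.exp(-np.pi * lam ** 2)))
```

Here $T = 2Na = 16$, so the frequency grid has mesh $1/16$, in agreement with the remark on the mesh $1/T$.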
In practice, one avoids cutting the signal abruptly at the two extremes,
since this operation, which amounts to multiplying the function by some
characteristic function $\chi_{[a,b]}$, introduces perturbations on the spectrum.
Replacing $\chi_{[a,b]}$ by a smooth window lessens these effects. We will return
to this question in Lesson 41.
38.6 Exercises
Exercise 38.1 Let $f$ be an element of $\mathscr{S}'(\mathbb{R})$ such that
(a) Show that
$$f(t) = \frac{-2\cos 2\pi t}{16t^2 - 1}$$
and verify that $f$ is infinitely differentiable.
(b) Use Shannon's formula with $a = 1/2$ to show that
$$\pi \cot 2\pi t = (16t^2 - 1) \sum_{n=-\infty}^{+\infty} \frac{1}{(4n^2 - 1)(2t - n)}$$
when $2t$ is not an integer.
(c) Write the general term of the last series in partial fractions (in the variable
$n$) and show that
$$\cot x = \frac{1}{x} + \sum_{n=1}^{+\infty} \Big( \frac{1}{x - n\pi} + \frac{1}{x + n\pi} \Big)$$
when $x$ is not a multiple of $\pi$.
Exercise 38.2 Apply Shannon's formula to the function $f(t) = \cos 2\pi t$ with
$a = 1/2$ and verify that one obtains directly the expression for $\cot x$ found in the
last exercise.
Hint: Use the proof of Theorem 38.2.1 with
$$f(t) = \cos 2\pi t = \frac{e^{2i\pi t} + e^{-2i\pi t}}{2}$$
and notice that in this case one can apply Theorem 5.2.4 for all $t \in \mathbb{R}$.
Exercise 38.3 Suppose $f \in L^2(\mathbb{R})$ and $\operatorname{supp}(\hat f) \subset [\alpha - \lambda_0, \alpha + \lambda_0]$ with
$\alpha \in \mathbb{R}$ and $\lambda_0 > 0$. Show that $f$ is determined by a sampling $(f(na))$ with
$0 < a \le 1/(2\lambda_0)$.
Exercise 38.4 Write equality (38.12) at the points $t = \pm 1/(2a)$ using
Theorem 5.2.4.
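Exercise 38.5 (Shannon's formula fails in $\mathscr{S}'$)
We use the notation of Section 38.3 with $a = 1$. For $\varphi \in \mathscr{S}$, define
$$S_N(\varphi) = \sum_{n=-N}^{N} f(n)\, I_n(\varphi)$$
with
$$I_n(\varphi) = \int_{\mathbb{R}} \hat r(t)\, \varphi(t + n)\, dt \quad\text{and}\quad r = \chi_{[-\frac12, \frac12]}.$$
(a) Show that
$$I_n(\varphi) = \int_{-\frac12}^{\frac12} (\overline{\mathscr{F}}\varphi)(x)\, e^{-2i\pi nx}\, dx.$$
(b) Take $f(t) = t$. Show that
$$S_N(\varphi) = \sum_{n=-N}^{N} n\, c_n(\psi),$$
where $\psi$ is the function with period 1 that agrees with $\overline{\mathscr{F}}\varphi$ on the interval
$(-1/2, 1/2)$.
(c) Take $\varphi = \hat g$, where $g$ is an element of $\mathscr{D}(\mathbb{R})$ such that
$$g(x) = x \quad\text{if } |x| < \tfrac12$$
(Exercise 27.4). Compute $S_N(\varphi)$ and conclude that Shannon's formula is
not generally true in $\mathscr{S}'$.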
Lesson 39
Discrete Filters and Convolution
We are going to study several specific questions about discrete signals and
filters in this and the following lesson. The current lesson concentrates on
the convolution of discrete signals and its application to discrete filters.
39.1 Discrete signals and filters

39.1.1 Discrete signals
Let $a$ be a positive real number. Any distribution of the form
$$x = \sum_{n=-\infty}^{+\infty} x_n \delta_{na}, \qquad x_n \in \mathbb{C},$$
will be called a discrete signal; we denote the set of discrete signals by $X_a$.
This is a vector space that is usually endowed with the topology induced
by that of $\mathscr{D}'$, which is the topology of pointwise convergence:
$$\lim_{N \to \infty} x^N = x \text{ in } \mathscr{D}' \iff \lim_{N \to \infty} x^N_n = x_n \text{ for all } n \in \mathbb{Z}.$$
39.1.2 Definition (discrete filter) Any mapping $D : X \to X_a$
that is linear, continuous, and commutes with the translations $\tau_{ka}$, $k \in \mathbb{Z}$,
(see Section 2.1.3) will be called a discrete filter whenever the space $X$
satisfies the following conditions:
(i) $X$ is a subspace of $X_a$ that contains $\delta$ and that is invariant under the
translations $\tau_{ka}$, $k \in \mathbb{Z}$.
(ii) $X$ is endowed with a topology that is at least as fine as the topology
induced by $\mathscr{D}'$.
Unless otherwise indicated, the topology of $X$ will be that induced by
$\mathscr{D}'$. This definition is modeled on the one given for analog filters (Definition
34.1.1). Here, however, the translations must be limited to integer multiples
of the step $a$, which is fixed once and for all. The spaces most frequently
encountered in practice are the following:
$X = X_a$: all of the discrete signals.
$X = X_a \cap \mathscr{D}'_+$: the discrete causal signals.
$X = X_a \cap \mathscr{E}' = Y_a$: the discrete signals with finite support.
$X = X_a \cap \mathscr{S}'$: the discrete tempered signals.
39.1.3 Examples of discrete filters

(a) A delay (or shift): $X = X_a$, $Dx = y$ with $y_n = x_{n-k}$.
(b) An average: $X = X_a$, $Dx = y$ with, for example,
(c) The recursive system defined by (1.1): Yk = Xk + G.Yk-1

(d) Any convolution system: $Dx = h * x$.
Naturally, one assumes that the convolution is defined on the space of
input signals $X$. It is clear that $D$ is linear and invariant; it will generally
be continuous, but this depends on the space $X$ (see Lesson 32). As in the
case of analog filters, most discrete filters are convolution systems.
39.1.4 Proposition Let $D : X \to X_a$ be a discrete filter and let
$h = D\delta$. Then $D$ is a convolution system
$$Dx = h * x, \qquad x \in X,$$
in the following two cases:
(i) $X = X_a$ and $h$ is finite.
(ii) $X = X_a \cap \mathscr{D}'_+$ and $h$ is a causal signal.
Proof. The result follows immediately in case (i), and it is obtained by
the density of $Y_a$ in $X$ for case (ii). □
The next step is to examine several frequently encountered cases where

the convolution of two discrete signals is defined and to determine how to
compute the convolution in these cases.
39.2 The convolution of two discrete signals

We first look at two simple cases:
(a) $h \in \mathscr{E}'$; that is, $h$ is finite.
(b) The supports of $h$ and $x$ are limited on the left.
We have
$$h = \sum_{m=-\infty}^{+\infty} h_m \delta_{ma}, \qquad x = \sum_{k=-\infty}^{+\infty} x_k \delta_{ka}, \qquad y = h * x = \sum_{n=-\infty}^{+\infty} y_n \delta_{na}.$$
Operating formally, we compute $y_n$ in terms of $(h_m)$ and $(x_k)$:
$$h * x = \sum_{m} \sum_{k} h_m x_k\, \delta_{ma} * \delta_{ka} = \sum_{m} \sum_{k} h_m x_k\, \delta_{(m+k)a}.$$
By regrouping the terms $m + k = n$, we see that
$$y_n = \sum_{k=-\infty}^{+\infty} h_k x_{n-k}. \tag{39.1}$$
We are going to investigate the validity of this computation.
39.2.1 Proposition If $h$ is finite, or if the supports of $h$ and $x$ are
limited on the left, then $y = h * x$ is given by equation (39.1), which is a
finite sum.
Proof. The proof in the finite case follows directly from the formal
computation using the distributivity of the convolution. For the other case, we
have
$$\langle h, \varphi \rangle = \sum_{k=k_0}^{+\infty} h_k \varphi(ka), \qquad \langle x, \varphi \rangle = \sum_{m=m_0}^{+\infty} x_m \varphi(ma)$$
for all $\varphi \in \mathscr{D}$, and these sums have only a finite number of terms. Hence,
$$\langle h * x, \varphi \rangle = \langle h_k, \langle x_m, \varphi((m+k)a) \rangle \rangle = \sum_{k=k_0}^{+\infty} h_k \psi_k,$$
where
$$\psi_k = \sum_{m=m_0}^{+\infty} x_m \varphi((m+k)a).$$
The result follows by interchanging the (finite) sums and by the change of
variable $n = m + k$:
$$\langle h * x, \varphi \rangle = \sum_{m=m_0}^{+\infty} \sum_{k=k_0}^{+\infty} h_k x_m \varphi((m+k)a) = \sum_{n=m_0+k_0}^{+\infty} \Big( \sum_{k=k_0}^{n-m_0} h_k x_{n-k} \Big) \varphi(na).$$
This proves equation (39.1). In addition, $y_n = 0$ if $n < m_0 + k_0$, and the
series for $y_n$ is indeed a finite sum. □
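Formula (39.1) for signals whose supports are limited on the left translates directly into a double loop. The sketch below is illustrative only: the function name `convolve_left_limited` and the storage convention (an array together with the index of its first entry) are assumptions of this example, and the signals are taken finite so that they fit in memory. The result can be compared with NumPy's `np.convolve`, which computes the same sums without tracking the starting index.

```python
import numpy as np

def convolve_left_limited(h, k0, x, m0):
    """y_n = sum_k h_k x_{n-k} for finitely stored signals.

    h[i] holds h_{k0+i} and x[j] holds x_{m0+j}; returns (y, n0) with y[i] = y_{n0+i}.
    """
    n0 = k0 + m0                         # y_n = 0 for n < m0 + k0
    y = np.zeros(len(h) + len(x) - 1, dtype=complex)
    for i, hk in enumerate(h):           # k = k0 + i
        for j, xm in enumerate(x):       # m = m0 + j, contributes to n = m + k
            y[i + j] += hk * xm
    return y, n0

h = np.array([1.0, -2.0, 0.5])           # h_{-1}, h_0, h_1
x = np.array([3.0, 1.0, 4.0, 1.0])       # x_2, ..., x_5
y, n0 = convolve_left_limited(h, -1, x, 2)
```

The returned index `n0` equals $m_0 + k_0$, the first index at which $y_n$ can be nonzero.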
39.3 Cases where the two supports are not bounded

Here we will see the first of several cases where $h * x$ exists when the supports
of $h$ and $x$ extend from $-\infty$ to $+\infty$. This case is the discrete version of the
continuous convolution $\mathscr{S} * \mathscr{S}'$ in the same way that the last two cases
were the discrete versions of the convolutions $\mathscr{E}' * \mathscr{D}'$ and $\mathscr{D}'_+ * \mathscr{D}'_+$. The
condition that $h$ have finite support is replaced by the condition that the
sequence $(h_n)$ be rapidly decreasing; $x$ must be tempered, which means
that $(x_n)$ is slowly increasing (Corollary 36.2.3).
39.3.1 Proposition Suppose that the two discrete signals
$$h = \sum_{k=-\infty}^{+\infty} h_k \delta_{ka} \quad\text{and}\quad x = \sum_{n=-\infty}^{+\infty} x_n \delta_{na}$$
are such that $(h_k)$ is rapidly decreasing and $(x_n)$ is slowly increasing.
Then the convolution $h * x$ is well defined. Furthermore,
(i) $h * x$ is a tempered distribution.
(ii) $h * x = \sum_{n=-\infty}^{+\infty} y_n \delta_{na}$ with $y_n = \sum_{k=-\infty}^{+\infty} h_k x_{n-k}$, and the series for $y_n$
converges absolutely.
Proof. The proof is based on Fubini's theorem for the discrete measure
space $(\mathbb{Z} \times \mathbb{Z}, \mathscr{T}, \mu)$, where $\mathscr{T}$ is the $\sigma$-algebra generated by the finite
subsets of $\mathbb{Z} \times \mathbb{Z}$ and $\mu$ is the measure defined by $\mu(S)$ = the number of points in $S$.
A function $u : \mathbb{Z} \times \mathbb{Z} \to \mathbb{C}$ is integrable (or summable) if and only if $|u|$ is
integrable. One part of Fubini's theorem states that the condition
$$\sum_{n=-\infty}^{+\infty} \Big( \sum_{k=-\infty}^{+\infty} |u_{n,k}| \Big) < +\infty$$
implies that $u$ is integrable (summable) and that
$$\int_{\mathbb{Z}^2} u = \sum_{(n,k) \in \mathbb{Z}^2} u_{n,k} = \sum_{n=-\infty}^{+\infty} \Big( \sum_{k=-\infty}^{+\infty} u_{n,k} \Big) = \sum_{k=-\infty}^{+\infty} \Big( \sum_{n=-\infty}^{+\infty} u_{n,k} \Big).$$
This is the discrete version of Theorem 14.3.1.
Now let $\varphi$ be an element of $\mathscr{S}$. The same formal computation that was
done in Section 39.2 shows that if $h * x$ is to exist as a tempered distribution,
we must have
$$\langle h * x, \varphi \rangle = \sum_{k=-\infty}^{+\infty} h_k \psi_k$$
with
$$\psi_k = \sum_{m=-\infty}^{+\infty} x_m \varphi((m+k)a).$$
Thus, for the convolution to exist, it is sufficient that the function
$$(k, m) \mapsto h_k x_m \varphi((m+k)a) \tag{39.2}$$
be summable on $\mathbb{Z} \times \mathbb{Z}$, or, after renaming the indices, that the function
$$(n, k) \mapsto h_k x_{n-k} \varphi(na)$$
be summable. This leads us to examine the double sum
$$\sum_{n=-\infty}^{+\infty} \Big( \sum_{k=-\infty}^{+\infty} |h_k|\,|x_{n-k}| \Big) |\varphi(na)|. \tag{39.3}$$
If (39.3) is finite, Fubini's theorem tells us that all of the series involved in
the formal computations are absolutely convergent and summation in any
order gives the same answer.
The sequence $(|h_k|)$ is rapidly decreasing and $(|x_n|)$ is slowly increasing. If
we define
$$f(t) = \sum_{k=-\infty}^{+\infty} |h_k|\, e^{2i\pi k \frac{t}{a}} \quad\text{and}\quad T = \sum_{n=-\infty}^{+\infty} |x_n|\, e^{2i\pi n \frac{t}{a}},$$
then $f$ is an infinitely differentiable function (Proposition 5.3.4) and $T$ is
a tempered distribution. By Theorem 36.3.1, the Fourier coefficients $\gamma_n$ of
$fT$ are
$$\gamma_n = \sum_{k=-\infty}^{+\infty} |h_k|\,|x_{n-k}|,$$
and they are slowly increasing (Theorem 36.2.2(iv)). Thus there exist $A > 0$
and $\alpha > 0$ such that
$$\gamma_n \le A(1 + |n|^{\alpha})$$
for all $n \in \mathbb{Z}$. On the other hand, since $\varphi \in \mathscr{S}$, there is a $B > 0$ such that
$$|\varphi(na)| \le \frac{B}{1 + |n|^{\alpha+2}}$$
for all $n$. These two inequalities imply that
$$\sum_{n=-\infty}^{+\infty} \gamma_n |\varphi(na)| \le AB \sum_{n=-\infty}^{+\infty} \frac{1 + |n|^{\alpha}}{1 + |n|^{\alpha+2}} < +\infty.$$
This shows that the sum (39.3) is finite and hence that (39.2) is summable.
We thus can sum (39.2) in any order, and in particular,
$$\langle h * x, \varphi \rangle = \sum_{n=-\infty}^{+\infty} \Big( \sum_{k=-\infty}^{+\infty} h_k x_{n-k} \Big) \varphi(na),$$
which proves that $h * x$ makes sense and is given by (ii). The estimate
$$|y_n| \le \gamma_n \le A(1 + |n|^{\alpha})$$
shows that $(y_n)$ is slowly increasing, so (i) follows from Corollary 36.2.3. □
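39.3.2 Corollary (periodic convolution and $\mathscr{F}$)
(i) If $f$ is a periodic $C^\infty$ function with period $a > 0$ and if $T$ is a periodic
distribution with the same period, then
$$\widehat{fT} = \hat f * \hat T.$$
(ii) Let $(h_n)$ be a rapidly decreasing complex sequence, let $(x_n)$ be a
slowly increasing sequence, and let
$$h = \sum_{n=-\infty}^{+\infty} h_n \delta_{na} \quad\text{and}\quad x = \sum_{n=-\infty}^{+\infty} x_n \delta_{na}$$
be the associated distributions. Then
$$\widehat{h * x} = \hat h \cdot \hat x.$$
Proof. $T$ is tempered, $\hat T = \sum_{n=-\infty}^{+\infty} \alpha_n \delta_{n/a}$, and the sequence $(\alpha_n)$ is slowly
increasing (Theorem 36.2.2). $f$ is tempered, $\hat f = \sum_{n=-\infty}^{+\infty} c_n \delta_{n/a}$, and the
coefficients $c_n$ are rapidly decreasing (Proposition 36.1.3 and Section 5.3.3).
From Proposition 39.3.1 we know that $\hat f * \hat T$ is tempered and that
$\hat f * \hat T = \sum_{n=-\infty}^{+\infty} \gamma_n \delta_{n/a}$, where the $\gamma_n$ are equal to $\sum_{k=-\infty}^{+\infty} c_k \alpha_{n-k}$. This and
Theorem 36.3.1 imply that $\hat f * \hat T = \widehat{fT}$, which proves (i). To prove (ii),
first observe that (i) is true if we replace $\mathscr{F}$ by $\overline{\mathscr{F}}$. The result follows
by applying (i) to $f = \hat h$ and $T = \hat x$ with $\mathscr{F}$ replaced by $\overline{\mathscr{F}}$. □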
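39.3.3 The convolution $l^1_a * l^\infty_a$

If we define
$$l^1_a = \Big\{ h = \sum_{n=-\infty}^{+\infty} h_n \delta_{na} \;\Big|\; \sum_{n=-\infty}^{+\infty} |h_n| < +\infty \Big\}$$
and
$$l^\infty_a = \Big\{ x = \sum_{n=-\infty}^{+\infty} x_n \delta_{na} \;\Big|\; \sup_n |x_n| < +\infty \Big\},$$
then the convolution $l^1_a * l^\infty_a$ is well-defined in the same way the convolution
$L^1 * L^\infty$ is well-defined in the continuous case. In fact, going back to the
proof of Proposition 39.3.1, for $h \in l^1_a$ and $x \in l^\infty_a$ we have
$$\sum_{k=-\infty}^{+\infty} |h_k|\,|x_{n-k}| \le \sup_n |x_n| \sum_{k=-\infty}^{+\infty} |h_k| < +\infty,$$
and thus we have formula (39.1) with $h * x \in l^\infty_a$.
39.3.4 The convolution $l^2_a * l^2_a$

From Schwarz's inequality
$$\sum_{k=-\infty}^{+\infty} |h_k|\,|x_{n-k}| \le \Big( \sum_{k=-\infty}^{+\infty} |h_k|^2 \Big)^{1/2} \Big( \sum_{k=-\infty}^{+\infty} |x_k|^2 \Big)^{1/2}$$
we see (by a computation similar to the one above) that $h * x$ exists for all
$h, x \in l^2_a$ and that $h * x \in l^\infty_a$. (The convolution $l^1_a * l^2_a$ does not need to be
studied as a special case since $l^2_a \subset l^\infty_a$.)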
39.4 Summary
The convolution $h * x$ is defined for the distributions
$$h = \sum_{n=-\infty}^{+\infty} h_n \delta_{na} \quad\text{and}\quad x = \sum_{n=-\infty}^{+\infty} x_n \delta_{na}$$
in the following cases:
(a) $h$ (or $x$) is finite.
(b) $h$ and $x$ have their supports bounded on the left (or on the right).
(c) $(h_n)$ is rapidly decreasing and $(x_n)$ is slowly increasing.
(d) $h \in l^1_a$ and $x \in l^\infty_a$ ($h * x \in l^\infty_a$).
(e) $h \in l^2_a$ and $x \in l^2_a$ ($h * x \in l^\infty_a$).
In all of these cases,
$$h * x = \sum_{n=-\infty}^{+\infty} y_n \delta_{na} \quad\text{with}\quad y_n = \sum_{k=-\infty}^{+\infty} h_k x_{n-k}.$$
In cases (a) and (b), the series for $y_n$ is a finite sum; in the other cases, the
series is absolutely convergent. In case (c),
$$\widehat{h * x} = \hat h \cdot \hat x.$$
These results show that the mapping
$$D : X \to X_a, \qquad x \mapsto D(x) = h * x$$
is a discrete filter in the following cases:
Case 1: $h$ is finite, and $X = X_a$.
Case 2: $h$ is causal, and $X = X_a \cap \mathscr{D}'_+$.
Case 3: $h$ is rapidly decreasing, and $X = X_a \cap \mathscr{S}'$ (slowly increasing).
Case 4: $h \in l^1_a$, and $X = l^\infty_a$.
Case 5: $h \in l^2_a$, and $X = l^2_a$.
Case 6: $h \in l^\infty_a$, and $X = l^1_a$.
Case 7: $h \in X_a$, and $X = X_a \cap \mathscr{E}' = Y_a$ (finite inputs).
In Cases 1, 2, and 3, the topology on $X$ is that induced by $\mathscr{D}'$. In Cases
4, 5, and 6, one can take the topologies of the $l^p_a$ spaces. In Case 7, one has
many choices.
39.5 Causality and stability of a discrete filter

The general definition of causality of a system was given in Section 2.1.2.
As in the analog case, linearity and invariance reduce the definition to the
following:
$$\big[\text{The filter } D \text{ is realizable (or causal).}\big] \iff \big[x_n = 0 \text{ for all } n < 0 \implies y_n = 0 \text{ for all } n < 0.\big]$$
We define stability as follows:
$$\big[\text{The filter } D : X \to X_a \text{ is stable.}\big] \iff \big[\text{There is an } A > 0 \text{ such that } \|Dx\|_\infty \le A \|x\|_\infty \text{ for all } x \in X \cap l^\infty_a.\big]$$
In particular, a bounded input produces a bounded output.
The next result characterizes these two properties in terms of the impulse
response.
39.5.1 Theorem Let $D : X \to X_a$ belong to one of the 7 cases listed
above and let $h$ be its impulse response. Then the following hold:
(i) $D$ is stable if and only if $\sum_{n=-\infty}^{+\infty} |h_n| < +\infty$.
(ii) $D$ is realizable if and only if $h_n = 0$ for all $n < 0$.
Proof. If $h \in l^1_a$, then from (39.1),
$$|y_n| \le \sum_{k=-\infty}^{+\infty} |h_k|\,|x_{n-k}| \le \sup_n |x_n| \sum_{k=-\infty}^{+\infty} |h_k|$$
and
$$\|Dx\|_\infty \le \Big( \sum_{k=-\infty}^{+\infty} |h_k| \Big) \|x\|_\infty.$$
Hence $D$ is stable. To prove the converse, assume that $D$ is stable. In
Cases 1, 3, and 4, there is nothing to prove, since $h \in l^1_a$. For the other
cases, consider the sequence of signals $x^p$, $p \in \mathbb{N}$, defined by
$$x^p_n = \begin{cases} \operatorname{sign}(h_{p-n}) & \text{if } 0 \le n \le 2p \text{ and } h_{p-n} \ne 0, \\ 0 & \text{otherwise.} \end{cases}$$
(For $c = |c| e^{i\theta}$, $\operatorname{sign}(c) = e^{-i\theta}$.) The signals $x^p$ are finite, so they are in
$X \cap l^\infty_a$ for Cases 2, 5, 6, and 7, and $\|x^p\|_\infty \le 1$. Then
$$y^p_n = \sum_{k=-\infty}^{+\infty} h_k x^p_{n-k} = \sum_{k=-2p+n}^{n} h_k \operatorname{sign}(h_{p-n+k}),$$
and
$$y^p_p = \sum_{k=-p}^{p} |h_k|$$
for all $p \ge 0$. From the definition of stability we conclude that
$$|y^p_p| = \sum_{k=-p}^{p} |h_k| \le A$$
for all $p \ge 0$; hence
$$\sum_{k=-\infty}^{+\infty} |h_k| < +\infty,$$
which proves (i).
If $D$ is realizable, then $h = D\delta$, and the definition shows that $h_n = 0$
for all $n < 0$. Conversely, if this property holds, then formula (39.1) shows
that
$$x_n = 0 \text{ for all } n < 0 \implies y_n = 0 \text{ for all } n < 0,$$
and this proves (ii). □
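The test signals $x^p$ used in the proof of (i) can be tried out numerically. The following sketch is only an illustration: the impulse response values, the choice $p = 3$, and the function name `worst_case_input` are hypothetical. For a response supported on $-p \le k \le p$, the bounded input $x^p$ (of sup norm at most 1) produces the output sample $y^p_p = \sum_{k=-p}^{p} |h_k|$, so the bound in the stability estimate is attained.

```python
import numpy as np

def worst_case_input(h_centered):
    """Build x^p from h_{-p..p}: x^p_n = sign(h_{p-n}), 0 <= n <= 2p, sign(c) = conj(c)/|c|."""
    reversed_h = np.asarray(h_centered, dtype=complex)[::-1]   # entry n is h_{p-n}
    magnitude = np.abs(reversed_h)
    x = np.zeros_like(reversed_h)
    nonzero = magnitude > 0
    x[nonzero] = np.conj(reversed_h[nonzero]) / magnitude[nonzero]
    return x

p = 3
h = np.array([0.5, -1.0, 0.0, 2.0j, 1.0 + 1.0j, -0.25, 0.1])  # h_{-3}, ..., h_3
x = worst_case_input(h)
y = np.convolve(h, x)            # y[i] is y_n with n = i - p
y_p = y[2 * p]                   # output sample y^p_p
l1_norm = np.sum(np.abs(h))
```

The index bookkeeping follows the storage convention stated in the comments: entry `i` of `h` holds $h_{i-p}$, while entry `n` of `x` holds $x^p_n$.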
39.6 Exercises
Exercise 39.1 Let $x = \sum_{n=-\infty}^{+\infty} x_n \delta_{na}$, $a > 0$, be a discrete signal. Compute
the impulse responses of the following filters $y = Dx$.
(a) $y_n = x_{n-1}$.
(b) $y_n = \frac{1}{2}(x_n + x_{n-1})$.
(c) $y_n = \frac{1}{3}(x_{n+1} + x_n + x_{n-1})$.
Which of these filters are realizable?
Exercise 39.2 Show that $\widehat{h * x} = \hat h \hat x$ when $h$ and $x$ are in $l^2_a$ (use the result
in Section 39.3.4).
Exercise 39.3 Consider the discrete filter whose impulse response h = (hn)
is given by
if n::::; 0,
if n > 0,
and that belongs to Case 7 in Section 39.4. Show that the response of every finite
signal (which is necessarily bounded) is bounded but that the filter is not stable.
Exercise 39.4 Show that the sequence

n-1
Yn = L k(n ~ k)'
k=1
n ~ 2,
is bounded.
Exercise 39.5 Can the proof of Proposition 39.1.4 be adapted to Cases 4,
5, 6, and 7 of Section 39.4?
Hint: Yes for 5, 6, and 7; no for 4, since $Y_a$ is not dense in $l^\infty_a$ in the topology
induced by the sup norm $\|\cdot\|_\infty$.
Lesson 40
The z-Transform and Discrete Filters
40.1 The z-transform of a discrete signal

The spectrum of a discrete tempered signal $x = \sum_{n=-\infty}^{+\infty} x_n \delta_{na}$ is the
periodic distribution
$$\hat x(\lambda) = \sum_{n=-\infty}^{+\infty} x_n e^{-2i\pi\lambda na}. \tag{40.1}$$
The change of variable $z = e^{2i\pi\lambda a}$ transforms $\hat x$ into the function
$$X(z) = \sum_{n=-\infty}^{+\infty} x_n z^{-n}, \tag{40.2}$$
which is represented as a Laurent series in the complex variable $z$. By
freeing this variable from the constraint $|z| = 1$, we obtain what is called
the z-transform of the discrete signal $x$. We know from elementary results
on power series that this Laurent expansion defines a function $X$ that is
holomorphic in an annulus (which is possibly empty)
$$r < |z| < R$$
with $0 \le r \le R \le +\infty$ (Figure 40.1). The series diverges outside this
annulus, and the behavior on $|z| = r$ or $|z| = R$ is uncertain. It is clear,
however, that $R = +\infty$ for causal signals.
For discrete signals, it is customary to study the complex function $X(z)$
rather than the Fourier transform $\hat x(\lambda)$. These two functions are related
through the equation
$$\hat x(\lambda) = X(e^{2i\pi\lambda a}). \tag{40.3}$$
The z-transform of a discrete signal does not always exist. For example,
there is no z-transform for Dirac's comb.
FIGURE 40.1. Annulus of convergence for $X(z)$.
EXAMPLES:
(a) For $\alpha > 0$ and $\beta > 0$, define
$$x_n = \begin{cases} \beta^n & \text{if } n < 0, \\ \alpha^n & \text{if } n \ge 0. \end{cases}$$
Then
$$X(z) = \sum_{n=-\infty}^{-1} \beta^n z^{-n} + \sum_{n=0}^{+\infty} \alpha^n z^{-n} = \frac{z}{\beta - z} + \frac{z}{z - \alpha}$$
for values of $z$ satisfying $|z/\beta| < 1$ and $|\alpha/z| < 1$. Thus the z-transform
exists if $\alpha < \beta$. It is defined and holomorphic in the annulus $\alpha < |z| < \beta$.
(b) The discrete version of the unit step function (Heaviside function) is
defined by
$$x_n = u_n = \begin{cases} 0 & \text{if } n < 0, \\ 1 & \text{if } n \ge 0, \end{cases}$$
and
$$U(z) = \sum_{n=0}^{+\infty} z^{-n} = \frac{1}{1 - z^{-1}}$$
if $|z| > 1$. Here the annulus of convergence is the exterior of the unit disk:
$r = 1$, $R = +\infty$.
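The two closed forms in these examples can be compared with partial sums of the Laurent series. This is a rough numerical sketch: the helper `partial_z_transform`, the parameter names `alpha` and `beta` for the two positive constants of example (a), their values, and the evaluation points are all assumptions made for illustration. The two-sided signal is evaluated inside its annulus, and the unit step outside the unit disk.

```python
import numpy as np

def partial_z_transform(x_of_n, z, n_min, n_max):
    """Partial sum of X(z) = sum_n x_n z^{-n} over n_min <= n <= n_max."""
    n = np.arange(n_min, n_max + 1)
    return np.sum(x_of_n(n) * z ** (-n.astype(float)))

# Example (a): x_n = beta^n for n < 0 and alpha^n for n >= 0 (hypothetical values).
alpha, beta = 0.5, 2.0
two_sided = lambda n: np.where(n < 0, beta ** n.astype(float), alpha ** n.astype(float))
z_a = 1.2 * np.exp(0.7j)                       # alpha < |z| < beta
sum_a = partial_z_transform(two_sided, z_a, -200, 200)
closed_a = z_a / (beta - z_a) + z_a / (z_a - alpha)

# Example (b): unit step, X(z) = 1 / (1 - 1/z) for |z| > 1.
step = lambda n: (n >= 0).astype(float)
z_b = 1.5 * np.exp(0.3j)
sum_b = partial_z_transform(step, z_b, 0, 200)
closed_b = 1.0 / (1.0 - 1.0 / z_b)
```

Moving `z_a` outside the annulus makes one of the two geometric series diverge, and the partial sums then no longer approach the closed form.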
40.1.1 Elementary properties of the z-transform

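(a) Linearity
The transform $x \mapsto X$ is clearly linear.
(b) Effect of a delay
If the z-transform of $x = \sum_{n=-\infty}^{+\infty} x_n \delta_{na}$ is $X(z)$, then $z^{-1}X(z)$ is the
transform of $\tau_a x$ and $z^{-k}X(z)$ is the transform of $\tau_{ka} x$.
(c) Transform of a convolution
Assume that $h = \sum_{n=-\infty}^{+\infty} h_n \delta_{na}$ and $x = \sum_{n=-\infty}^{+\infty} x_n \delta_{na}$ are discrete
signals that belong to one of the cases in Section 39.4 where the convolution
$h * x$ exists. Their respective z-transforms $H$ and $X$ exist in the annuli $A_1$
and $A_2$. In the annulus $A = A_1 \cap A_2$, assumed to be nonempty, we have
$$\sum_{k=-\infty}^{+\infty} |h_k z^{-k}| < +\infty \quad\text{and}\quad \sum_{n=-\infty}^{+\infty} |x_n z^{-n}| < +\infty.$$
The function $(n, k) \mapsto h_k x_{n-k} z^{-n}$ is summable (integrable) on $\mathbb{Z}^2$, and by
Fubini's theorem, it can be summed in any order. Thus for $y = h * x$,
$$\sum_{n=-\infty}^{+\infty} y_n z^{-n} = \sum_{n=-\infty}^{+\infty} \sum_{k=-\infty}^{+\infty} h_k x_{n-k} z^{-n} = \sum_{k=-\infty}^{+\infty} h_k z^{-k} \sum_{n=-\infty}^{+\infty} x_{n-k} z^{-(n-k)}.$$
It follows that
$$Y(z) = H(z)\, X(z)$$
for all $z \in A$. One should not be surprised that the z-transform of a
convolution of two signals is the product of their z-transforms!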
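For finite causal signals the identity $Y(z) = H(z)X(z)$ is just multiplication of polynomials in $z^{-1}$, which is easy to check. The sketch below uses made-up coefficient values and a hypothetical helper `z_transform_causal`; it is meant only to make the statement concrete.

```python
import numpy as np

def z_transform_causal(coeffs, z):
    """X(z) = sum_{n>=0} x_n z^{-n} for a finite causal signal stored as coeffs[n] = x_n."""
    n = np.arange(len(coeffs))
    return np.sum(np.asarray(coeffs) * z ** (-n.astype(float)))

h = np.array([1.0, 0.5, 0.25])
x = np.array([2.0, -1.0, 0.0, 3.0])
y = np.convolve(h, x)                     # y_n = sum_k h_k x_{n-k}
z = 0.8 * np.exp(1.1j)
lhs = z_transform_causal(y, z)
rhs = z_transform_causal(h, z) * z_transform_causal(x, z)
```

Since both signals are finite, every $z \ne 0$ lies in the annulus of convergence and no truncation error is involved.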
40.1.2 Inverting the z-transform

Given the z-transform of a signal $x$, one can recover $x$ by either of two
methods: (a) by expanding $X(z)$ in a Laurent series, or (b) by using the
residue theorem to compute
$$x_n = \frac{1}{2i\pi} \int_{\Gamma} X(z)\, z^{n-1}\, dz, \tag{40.4}$$
where $\Gamma$ is a contour around the origin situated in the annulus of
convergence and taken in the positive direction (Figure 40.2).
EXAMPLE: Let $X(z) = z(z - r)^{-1}$, $r > 0$, and take the annulus of
convergence to be the exterior of the disk $|z| \le r$.
The first method gives
$$X(z) = \frac{1}{1 - \frac{r}{z}} = \sum_{n=0}^{+\infty} r^n z^{-n},$$
so
$$x = \sum_{n=0}^{+\infty} r^n \delta_{na}.$$
By the second method,
$$x_n = \frac{1}{2i\pi} \int_{\Gamma} \frac{z^n}{z - r}\, dz.$$
FIGURE 40.2.
If $n \ge 0$, the residue of $f(z) = z^n (z - r)^{-1}$ at $z = r$ is $r^n$. If $n < 0$, another
pole appears at $z = 0$. The residue at $z = 0$ is obtained by expanding $f(z)$
around $z = 0$:
$$f(z) = -\frac{z^n}{r}\, \frac{1}{1 - \frac{z}{r}} = -\sum_{p=0}^{+\infty} \frac{z^{n+p}}{r^{p+1}}.$$
The residue at $z = 0$ is the coefficient of $z^{-1}$, which is equal to $-r^n$. The
two residues cancel each other, and we have $x_n = 0$ for $n < 0$. Thus
$$x = \sum_{n=0}^{+\infty} r^n \delta_{na}.$$
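The contour integral (40.4) can also be approximated numerically, which gives a quick check of the example. The sketch below is an assumption-laden illustration: it takes the contour to be the circle $|z| = 1$, discretizes it with equally spaced points, and uses $r = 0.5$; the function name `inverse_z_transform` and the number of points are choices made here. With $z = \rho e^{i\theta}$ one has $dz = iz\,d\theta$, so the integral reduces to the average of $X(z)z^n$ over the circle.

```python
import numpy as np

def inverse_z_transform(X, n, radius, num_points=256):
    """Approximate x_n = (1/(2 i pi)) * contour integral of X(z) z^(n-1) dz on |z| = radius."""
    theta = 2 * np.pi * np.arange(num_points) / num_points
    z = radius * np.exp(1j * theta)
    # With z = radius * e^{i theta}, dz = i z d(theta), so the integral is the mean of X(z) z^n.
    return np.mean(X(z) * z ** n)

r = 0.5
X = lambda z: z / (z - r)
causal_part = [inverse_z_transform(X, n, radius=1.0) for n in range(0, 6)]
anticausal_part = [inverse_z_transform(X, n, radius=1.0) for n in range(-5, 0)]
```

The values in `causal_part` should be close to $r^n$ and those in `anticausal_part` close to zero, matching the residue computation. Choosing a radius smaller than $r$ would instead select the other annulus and a different (anticausal) signal.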
40.2 Applications to discrete filters

In most applications, a discrete filter $D : x \mapsto y$ will be a convolution
system; thus $Dx = h * x$ for some $h \in X_a$. This is established either by
applying one of the results from Lesson 39 or by direct verification.
When this is the case, the z-transform $H(z)$ of the impulse response $h$ is
called the transfer function of the discrete filter $D$. The next result relates
the stability and realizability of $D$ to properties of $H$.
40.2.1 Theorem Assume that the filter $D$ is a convolution system
with transfer function $H(z)$ that converges in a nonempty annulus $A$.
(i) $D$ is stable if and only if the unit circle $|z| = 1$ is in $A$.
(ii) If $D$ is realizable, then it is stable if and only if the poles of $H(z)$ are
in the interior of the unit disk.
Proof. The filter $D$ is stable if and only if (Theorem 39.5.1)
$$\sum_{n=-\infty}^{+\infty} |h_n| < +\infty.$$
This is equivalent to saying that the series
$$\sum_{n=-\infty}^{+\infty} h_n z^{-n}$$
is absolutely convergent for $|z| = 1$. This proves (i).
If $D$ is realizable, the annulus of convergence is the exterior ($|z| > r$) of
some disk $|z| \le r$ (Figure 40.3). If the poles $p_k$ of $H(z)$ are in the interior of
the unit disk, that is, if $|p_k| < 1$, then $r < 1$. Conversely, if $D$ is realizable
and stable, then $H(z)$ converges absolutely on $|z| = 1$, so the poles must
be in the interior of the disk $|z| \le 1$. □
FIGURE 40.3. A realizable and stable filter.
EXAMPLE: Let $H(z) = z(z - r)^{-1}$ with the annulus of convergence $|z| > r$.
This corresponds to a realizable filter. The pole is $r$. Thus the filter is stable
if $r < 1$.
40.2.2 Filters governed by linear difference equations

with constant coefficients
In the same way that analog filters are often governed by differential equa-
tions, discrete filters can be governed by linear difference equations with
constant coefficients:
q p
LbkYn-k = LajXn-j, bo = 1. (40.5)

k=O j=O
The output y is completely determined by some additional condition, for

example, that the filter is realizable.
COMPUTING THE TRANSFER FUNCTION: By taking the z-transform of
both sides of (40.5) and using Section 40.1.1(b), we have
The transfer function is the rational function

p
:LajZ-j
H(z) = .::....j:=-0- - (40.6)
Lbkz-k
k=O
COMPUTING THE IMPULSE RESPONSE: This is the inversion problem for
the z-transform that we examined in Section 40.1.2. We can obtain the
Laurent expansion of H(z) from the relation
$$\Big(\sum_{k=0}^{q} b_k z^{-k}\Big)\Big(\sum_{n=-\infty}^{+\infty} h_n z^{-n}\Big) = \sum_{j=0}^{p} a_j z^{-j}.$$
When the filter is realizable, $h_n = 0$ for all n < 0. In this case, the $h_n$
are obtained from the recurrence
$$h_0 = a_0, \qquad h_n = a_n - \sum_{k=1}^{n} b_k h_{n-k}, \quad n = 1, 2, \ldots,$$
where we define $a_n = 0$ if n > p and $b_k = 0$ if k > q, and we have
$$y_n = \sum_{k=0}^{+\infty} h_k x_{n-k}$$
for all $n \in \mathbb{Z}$.
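The recurrence for the $h_n$ can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample coefficients are illustrative assumptions, and the input is taken to be causal ($x_n = 0$ for n < 0) so that finite lists suffice. It computes the impulse response from the recurrence and compares the convolution $y_n = \sum_k h_k x_{n-k}$ with a direct evaluation of (40.5).

```python
def impulse_response(a, b, n_terms):
    """h_0, ..., h_{n_terms-1} from h_0 = a_0, h_n = a_n - sum_{k=1}^{n} b_k h_{n-k}."""
    if b[0] != 1:
        raise ValueError("the recurrence assumes b_0 = 1")
    h = []
    for n in range(n_terms):
        acc = a[n] if n < len(a) else 0.0          # a_n = 0 for n > p
        for k in range(1, n + 1):
            b_k = b[k] if k < len(b) else 0.0      # b_k = 0 for k > q
            acc -= b_k * h[n - k]
        h.append(acc)
    return h


def filter_direct(a, b, x):
    """Solve (40.5) for y_n step by step, assuming x_n = y_n = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
        acc -= sum(b[k] * y[n - k] for k in range(1, len(b)) if n - k >= 0)
        y.append(acc)
    return y


def filter_by_convolution(h, x):
    """y_n = sum_{k=0}^{n} h_k x_{n-k} for a causal input."""
    return [sum(h[k] * x[n - k] for k in range(n + 1)) for n in range(len(x))]


# Illustrative coefficients (hypothetical, chosen only for the check).
a = [0.5, 0.25]
b = [1.0, -0.3, 0.02]
x = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5]

h = impulse_response(a, b, len(x))
y_direct = filter_direct(a, b, x)
y_conv = filter_by_convolution(h, x)
```

For these coefficients the first values are $h_0 = 0.5$, $h_1 = 0.25 + 0.3 \cdot 0.5 = 0.4$, and $h_2 = 0.3 \cdot 0.4 - 0.02 \cdot 0.5 = 0.11$, and the two ways of computing y agree up to rounding.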
40.2.3 Example
The discrete form of the realizable RC filter, RCv' + v = f, is
$$RC\,\frac{y_n - y_{n-1}}{a} + y_n = x_n, \qquad n \in \mathbb{Z}.$$
In this case, the annulus of convergence is the exterior of the unit disk:
r = 1, R = +oo. The discrete filter has the form
$$y_n - b\,y_{n-1} = c\,x_n$$
with
$$b = \frac{RC}{RC + a} \quad \text{and} \quad c = \frac{a}{RC + a}.$$
The transfer function is
$$H(z) = \frac{c}{1 - b z^{-1}} = \frac{cz}{z - b}, \qquad |z| > b.$$
The series expansion is
$$H(z) = c \sum_{n=0}^{+\infty} b^n z^{-n},$$
and the impulse response is
$$h = c \sum_{n=0}^{+\infty} b^n \delta_{na}.$$
The filter is stable, since $|b| < 1$.
40.3 Exercises
Exercise 40.1 Invert the z-transform defined by $X(z) = z(z - r)^{-1}$ in the
annulus $0 < |z| < r$ and compare the result with the example in 40.1.2.
Exercise 40.2 Invert the z-transform defined by
z2 + 1
H(z)=~1, z -
knowing that the associated filter is realizable.
Exercise 40.3 Let X(z) be the z-transform of the signal
$$x = (x_n) = \sum_{n=-\infty}^{+\infty} x_n \delta_{na}.$$
(a) Compute the z-transforms of the signals
(b) Use (a) to compute the z-transforms of the signals
X = (nun) and y = (n2nun),

where Un = u(n).
Exercise 40.4 Give an example of a noncausal signal for which R = +oo.
Chapter XII
Current Trends:
Time-Frequency Analysis
Lesson 41
The Windowed Fourier Transform
41.1 Limitations of standard Fourier analysis

Current research is to a large extent motivated by industrial applications
of mathematical analysis and signal processing. Seismic exploration, the
analysis and synthesis of sound, medical imaging, and the digital telephone
are a few of the applications that come to mind. In all cases, one wishes
to extract from the signal the pertinent information as discrete numerical
values. This set of digital information must be rich enough to characterize
the signal, but it should be no larger than necessary for the task at hand. If,
for example, it is a question of speech and the digital telephone, one wants
enough numerical information at the receiver to reconstruct a recognizable
voice, but economy dictates the need to minimize the amount of information
that must be transmitted.
Fourier analysis is the oldest of the various techniques available for signal
analysis and synthesis. Since the invention of the fast Fourier transform
(FFT), it has become an efficient tool, particularly for analyzing sufficiently
smooth periodic signals (Lesson 9). In these cases, the Fourier coefficients $c_n$
decrease rapidly as $|n| \to +\infty$, and relatively few numerical coefficients are
needed to reconstruct the signal for most practical purposes. Unfortunately,
as soon as the signal becomes irregular, like, for example, a transient, the
number of coefficients necessary to reconstruct the signal (and hence the
amount of data that must either be stored or transmitted) becomes large
and often economically impractical.
Before the advent of the FFT, Fourier analysis was mainly a theoreti-
cal tool-indeed, one of the most important and pervasive. This quickly
changed with the arrival of the FFT and efficient digital computing, and
these twin techniques have had widespread applications in the last third
of the twentieth century. Nevertheless, even with the FFT and modern
computing, Fourier analysis does not provide a satisfactory analysis for all
kinds of signals. Although the Fourier transform $\hat{f}$ contains all of the
information about f, much of this information is "hidden." For example, none
of the temporal aspects of f are revealed by $\hat{f}$. If f is a finite signal, the
spectrum does not indicate the beginning and the end of the signal, and if
there is a singularity, the time of occurrence is hidden throughout $\hat{f}$.
Faced with these kinds of issues, one would like to have an analytic tool
that provides information both in time and in frequency. The model that is
often cited is musical notation: the horizontal position of a note (its "start
time"), its duration, and its frequency are all represented.
There is another problem that has surely not escaped the reader's notice:
To compute the spectrum $\hat{f}(\lambda)$ it is necessary to know f(t) for all real
values of t. This is impossible in the case of analysis in "real time" where the
signal must be processed as it arrives. One cannot know the spectrum, even
approximately, of a signal when one knows nothing of its future; the inter-
esting information may be yet to arrive. We should not despair, however;
the previous eleven chapters retain their value today both theoretically
and numerically in spite of the cited problems. These technical constraints
simply motivate us to refine existing tools and to develop new ones.
41.2 Opening windows

One of the first ideas was to truncate the signal and to analyze only what
happens on a finite interval [-A, A]. One is forced to do this when making
numerical computations. Mathematically, this amounts to multiplying the
signal f(t) by a characteristic function $\chi_{[-A,A]} = r_A$ (or a translate) and
taking the Fourier transform of the product $g = r_A f$. The result is
$$\hat{g}(\lambda) = (\hat{f} * s_A)(\lambda), \qquad s_A(\lambda) = \frac{\sin 2\pi A\lambda}{\pi\lambda}.$$
Thus truncating the signal results in convolving its spectrum with the
cardinal sine (Figure 41.1).
FIGURE 41.1. The cardinal sine (peak value 2A at the origin).
The approximation of $\hat{f}$ by $\hat{g}$ becomes better as A increases, that is, as $s_A$
better approximates the Dirac impulse. Unfortunately, the computations
for this process quickly become very voluminous. The cardinal sine decays
slowly and has important lobes near the origin. To avoid these problems,
one replaces $\chi_{[-A,A]}$ with a more regular function. These functions are all
called windows, and they are concentrated around the origin.
EXAMPLES:
(a) Triangular window (Figure 41.2)
FIGURE 41.2. Triangular window in time, w(t), and in frequency, $\hat{w}(\lambda)$.
(b) Hamming and Hanning windows (Figure 41.3)
These are of the form $w(t) = [\alpha + (1 - \alpha)\cos(2\pi t/A)]\,r(t)$. For $\alpha = 0.54$
we have Hamming's window and for $\alpha = 0.50$ Hanning's window. These
coefficients have been computed to minimize certain criteria (see [Kun84]).
FIGURE 41.3. Hamming and Hanning windows on $[-A/2, A/2]$ (the curve shown is labeled $\alpha = 0.54$).
(c) Gaussian window $w(t) = Ae^{-\alpha t^2}$ ($\alpha, A > 0$) (Figure 41.4)
These windows are used in practice, and they significantly improve the
computation of the spectrum.
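The effect of the window on the side lobes can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the method of [Kun84]: it samples the windows $\alpha + (1 - \alpha)\cos(2\pi t/A)$ on $[-A/2, A/2]$ with A = 1 at 32 points (the value 32, the frequency grid, and all function names are arbitrary choices), evaluates the magnitude of the discrete spectrum directly, and reports the highest side lobe relative to the main lobe in decibels.

```python
import cmath
import math


def cosine_window(alpha, n):
    """Samples of alpha + (1 - alpha) cos(2 pi t) for t in [-1/2, 1/2]."""
    return [alpha + (1.0 - alpha) * math.cos(2.0 * math.pi * (k / (n - 1) - 0.5))
            for k in range(n)]


def spectrum_magnitude(w, n_freq):
    """|sum_k w_k exp(-2 i pi nu k)| for nu in [0, 1/2], normalized to 1 at nu = 0."""
    mags = []
    for m in range(n_freq + 1):
        nu = 0.5 * m / n_freq
        s = sum(wk * cmath.exp(-2j * math.pi * nu * k) for k, wk in enumerate(w))
        mags.append(abs(s))
    return [v / mags[0] for v in mags]


def peak_sidelobe_db(w, n_freq=1024):
    spec = spectrum_magnitude(w, n_freq)
    # Walk down the main lobe until the spectrum starts to rise again.
    i = 0
    while i + 1 < len(spec) and spec[i + 1] < spec[i]:
        i += 1
    # Everything from the first minimum onward consists of side lobes.
    return 20.0 * math.log10(max(spec[i:]))


n = 32
rect_db = peak_sidelobe_db([1.0] * n)                 # truncation only
hann_db = peak_sidelobe_db(cosine_window(0.50, n))    # Hanning
hamming_db = peak_sidelobe_db(cosine_window(0.54, n)) # Hamming
```

The rectangular window leaves side lobes only about 13 dB below the main lobe, which is the discrete counterpart of the slowly decaying cardinal sine, while the two cosine windows push them considerably lower at the price of a wider main lobe.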
One is led naturally to slide this window along the graph of the func-
tion and thereby analyze the whole function. One then obtains a family of
coefficients depending on two real variables $\lambda$ and b given by
$$W_f(\lambda, b) = \int_{-\infty}^{+\infty} f(t)\,\overline{w}(t - b)\,e^{-2i\pi\lambda t}\,dt. \tag{41.1}$$
$W_f(\lambda, b)$ replaces $\hat{f}(\lambda)$. The mapping $f \mapsto W_f$ is called the sliding window
Fourier transform or simply the windowed Fourier transform.
FIGURE 41.4. Gaussian window $w(t) = Ae^{-\alpha t^2}$ in time and frequency (peak values A and $A\sqrt{\pi/\alpha}$).
The parameter $\lambda$ plays the role of a frequency, localized around the
abscissa b of the temporal signal. $W_f(\lambda, b)$ thus provides an indication of how
the signal behaves at time t = b for the frequency $\lambda$. We use the function $\overline{w}$
rather than w in (41.1) for reasons of convenience and because we wish to
allow complex-valued windows. Thus, $W_f$ becomes a scalar product in $L^2$:
$$W_f(\lambda, b) = (f, w_{\lambda b}), \qquad w_{\lambda b}(t) = w(t - b)\,e^{2i\pi\lambda t}. \tag{41.2}$$
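As an illustration of (41.1), the following sketch approximates $W_f(\lambda, b)$ by a Riemann sum for a test signal that oscillates at 5 Hz before t = 1 and at 20 Hz afterward. The signal, the Gaussian window width 0.2, the frequency grid, and the function names are all assumptions made for this example. For a fixed b, the frequency $\lambda$ that maximizes $|W_f(\lambda, b)|$ gives an estimate of the local frequency near t = b.

```python
import cmath
import math


def signal(t):
    """Test signal: 5 Hz before t = 1, 20 Hz afterward (defined on [0, 2])."""
    freq = 5.0 if t < 1.0 else 20.0
    return math.cos(2.0 * math.pi * freq * t)


def window(s):
    """Real Gaussian window of width about 0.2, so it equals its conjugate."""
    return math.exp(-math.pi * (s / 0.2) ** 2)


def windowed_ft(lam, b, t0=0.0, t1=2.0, n=4000):
    """Riemann-sum approximation of W_f(lam, b) in (41.1) over [t0, t1]."""
    dt = (t1 - t0) / n
    total = 0j
    for i in range(n + 1):
        t = t0 + i * dt
        total += signal(t) * window(t - b) * cmath.exp(-2j * math.pi * lam * t)
    return total * dt


lams = [0.5 * m for m in range(61)]   # frequencies 0, 0.5, ..., 30


def dominant_frequency(b):
    return max(lams, key=lambda lam: abs(windowed_ft(lam, b)))


early = dominant_frequency(0.5)   # window centered in the 5 Hz part
late = dominant_frequency(1.5)    # window centered in the 20 Hz part
```

The ordinary Fourier transform of the same signal would show both frequencies but not when each one occurs; here the two values of b separate them.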
41.3 Dennis Gabor's formulas

Intuitively, one might expect that knowing $W_f(\lambda, b)$ for all values of $\lambda$ and
b completely determines the signal f. One could even conjecture that the
information contained in $W_f(\lambda, b)$ is redundant, since we have replaced a
one-parameter family $\hat{f}$ with a two-parameter family. We will see below
that these speculations are well founded.
In his 1946 paper [Gab46], Dennis Gabor used a window that was essentially
the Gaussian $w(t) = \pi^{-1/4} e^{-t^2/2}$. Such a function has the advantage
of approximating a square window while avoiding the disadvantage of
introducing abrupt discontinuities. One of Gabor's important contributions
was to show that $W_f(\lambda, b)$ can be inverted to recover f.
41.3.1 Theorem Suppose that $w \in L^1 \cap L^2$ is a window such that $|\hat{w}|$
is even and $\|w\|_2 = 1$. Write
$$w_{\lambda b}(t) = w(t - b)\,e^{2i\pi\lambda t}, \qquad \lambda, b \in \mathbb{R}.$$
For all signals $f \in L^2$ we define the coefficients
$$W_f(\lambda, b) = \int_{-\infty}^{+\infty} f(t)\,\overline{w}_{\lambda b}(t)\,dt.$$
Under these conditions, we have the following two results:
(a) Conservation of energy:
$$\iint_{\mathbb{R}^2} |W_f(\lambda, b)|^2\,d\lambda\,db = \int_{-\infty}^{+\infty} |f(t)|^2\,dt. \tag{41.3}$$
(b) Reconstruction formula:
$$f(x) = \iint_{\mathbb{R}^2} W_f(\lambda, b)\,w_{\lambda b}(x)\,d\lambda\,db \tag{41.4}$$
in the sense that if
$$g_A(x) = \iint_{|\lambda| \le A,\ b \in \mathbb{R}} W_f(\lambda, b)\,w_{\lambda b}(x)\,d\lambda\,db,$$
then $g_A \to f$ in $L^2$ as $A \to +\infty$.
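Proof. We first give another expression for $W_f(\lambda, b)$. By Parseval's relation,
$$W_f(\lambda, b) = (f, w_{\lambda b}) = (\hat{f}, \hat{w}_{\lambda b}) = \int_{-\infty}^{+\infty} \hat{f}(\xi)\,\overline{\hat{w}_{\lambda b}(\xi)}\,d\xi.$$
Since
$$\hat{w}_{\lambda b}(\xi) = e^{-2i\pi(\xi - \lambda)b}\,\hat{w}(\xi - \lambda), \tag{41.5}$$
this becomes
$$W_f(\lambda, b) = \int_{-\infty}^{+\infty} \hat{f}(\xi)\,\overline{\hat{w}}(\xi - \lambda)\,e^{2i\pi(\xi - \lambda)b}\,d\xi,$$
so
$$W_f(\lambda, b) = e^{-2i\pi\lambda b}\,\overline{\mathcal{F}}\big[\hat{f}(\xi)\,\overline{\hat{w}}(\xi - \lambda)\big](b). \tag{41.6}$$
The function of $\xi$ in brackets is in $L^1$, since it is the product of two
functions in $L^2$. It is also in $L^2$ because, w being in $L^1$, $\hat{w}$ is bounded. Thus
we have
$$\iint_{\mathbb{R}^2} |W_f(\lambda, b)|^2\,d\lambda\,db = \int_{-\infty}^{+\infty} \Big( \int_{-\infty}^{+\infty} \big|\overline{\mathcal{F}}[\hat{f}(\xi)\,\overline{\hat{w}}(\xi - \lambda)](b)\big|^2\,db \Big)\,d\lambda$$
$$= \int_{-\infty}^{+\infty} \Big( \int_{-\infty}^{+\infty} |\hat{f}(\xi)\,\overline{\hat{w}}(\xi - \lambda)|^2\,d\xi \Big)\,d\lambda \quad \text{(Parseval)}$$
$$= \int_{-\infty}^{+\infty} \Big( |\hat{f}(\xi)|^2 \int_{-\infty}^{+\infty} |\hat{w}(\xi - \lambda)|^2\,d\lambda \Big)\,d\xi$$
$$= \|f\|_2^2\,\|w\|_2^2 = \|f\|_2^2.$$
This establishes (a).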
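To prove (b), we first show that $g_A$ is well-defined for all A > 0 by showing
that $(\lambda, b) \mapsto W_f(\lambda, b)\,w_{\lambda b}(x)$ is integrable on the strip $[-A, A] \times \mathbb{R}$. Let
$$J_A(x) = \int_{-A}^{A} \Big( \int_{-\infty}^{+\infty} |W_f(\lambda, b)\,w_{\lambda b}(x)|\,db \Big)\,d\lambda.$$
By Schwarz's inequality and Parseval's relation, we have (Theorem 22.1.4)
$$J_A(x) \le \int_{-A}^{A} \big\|\overline{\mathcal{F}}[\hat{f}(\xi)\,\overline{\hat{w}}(\xi - \lambda)](b)\big\|_2\,\|w\|_2\,d\lambda = \int_{-A}^{A} \big\|\hat{f}(\xi)\,\overline{\hat{w}}(\xi - \lambda)\big\|_2\,d\lambda.$$
The function $h(\lambda)$ under the last integral sign satisfies
$$h(\lambda)^2 = \int_{-\infty}^{+\infty} |\hat{f}(\xi)|^2\,|\hat{w}(\xi - \lambda)|^2\,d\xi = \big(|\hat{f}|^2 * |\hat{w}|^2\big)(\lambda),$$
where the last equality uses the fact that $|\hat{w}|$ is even.
Since $L^1 * L^1 \subset L^1$, it follows that $|h|^2 \in L^1$ and hence that $h \in L^2$. Finally,
$$J_A(x) \le \int_{-A}^{A} h(\lambda)\,d\lambda \le \sqrt{2A}\,\|h\|_2 < +\infty$$
for all $x \in \mathbb{R}$ and A > 0. Integrability allows us to choose the order of
integration in the definition of $g_A$, so in view of (41.6), we have
$$g_A(x) = \int_{-A}^{A} g(\lambda)\,d\lambda$$
with
$$g(\lambda) = \int_{-\infty}^{+\infty} \overline{\mathcal{F}}\big[\hat{f}(\xi)\,\overline{\hat{w}}(\xi - \lambda)\big](b)\,w(x - b)\,e^{2i\pi\lambda(x - b)}\,db,$$
which by Proposition 22.1.5 is
$$g(\lambda) = \int_{-\infty}^{+\infty} \hat{f}(\xi)\,\overline{\hat{w}}(\xi - \lambda)\,\overline{\mathcal{F}}_b\big[w(x - b)\,e^{2i\pi\lambda(x - b)}\big](\xi)\,d\xi.$$
After computing the transform $\overline{\mathcal{F}}_b[w(x - b)\,e^{2i\pi\lambda(x - b)}](\xi) = e^{2i\pi\xi x}\,\hat{w}(\xi - \lambda)$, we see
that
$$g(\lambda) = \int_{-\infty}^{+\infty} \hat{f}(\xi)\,|\hat{w}(\xi - \lambda)|^2\,e^{2i\pi\xi x}\,d\xi,$$
so
$$g_A(x) = \int_{-A}^{A} \Big( \int_{-\infty}^{+\infty} \hat{f}(\xi)\,|\hat{w}(\xi - \lambda)|^2\,e^{2i\pi\xi x}\,d\xi \Big)\,d\lambda. \tag{41.7}$$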
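The next step is to verify that the function of $(\lambda, \xi)$ under the double
integral (41.7) is integrable on $[-A, A] \times \mathbb{R}$. Since $|\hat{w}|$ is even,
$$\int_{-\infty}^{+\infty} |\hat{f}(\xi)|\,|\hat{w}(\xi - \lambda)|^2\,d\xi = \big(|\hat{f}| * |\hat{w}|^2\big)(\lambda).$$
Since $|\hat{f}| \in L^2$ and $|\hat{w}|^2 \in L^1$, it follows (Proposition 20.3.2) that
$h = |\hat{f}| * |\hat{w}|^2 \in L^2(\mathbb{R})$ and hence that $h \in L^1[-A, A]$. Thus the integral is
well-defined, and we can interchange the order of integration in (41.7):
$$g_A(x) = \int_{-\infty}^{+\infty} \hat{f}(\xi)\,e^{2i\pi\xi x} \Big( \int_{-A}^{A} |\hat{w}(\xi - \lambda)|^2\,d\lambda \Big)\,d\xi.$$
Denote the second integral by $\alpha_A(\xi)$. Then $0 \le \alpha_A(\xi) \le 1$, since $\|w\|_2 = 1$.
Since $\alpha_A$ is bounded, $\hat{f}\alpha_A$ is in $L^2$ and $g_A = \overline{\mathcal{F}}(\hat{f}\alpha_A)$. The last step
is to show that $g_A$ tends to f in $L^2$ as $A \to +\infty$. For this we evaluate the
norm of the difference:
$$\|f - \overline{\mathcal{F}}(\hat{f}\alpha_A)\|_2^2 = \|\overline{\mathcal{F}}[(1 - \alpha_A)\hat{f}]\|_2^2 = \|(1 - \alpha_A)\hat{f}\|_2^2 = \varepsilon(A).$$
We estimate the integral
$$\varepsilon(A) = \int_{-\infty}^{+\infty} [1 - \alpha_A(\xi)]^2\,|\hat{f}(\xi)|^2\,d\xi \tag{41.8}$$
in two parts. If $|\xi| \le A/2$, then
$$\alpha_A(\xi) = \int_{\xi - A}^{\xi + A} |\hat{w}(y)|^2\,dy \ge \int_{-A/2}^{A/2} |\hat{w}(y)|^2\,dy,$$
so
$$0 \le 1 - \alpha_A(\xi) \le \int_{-\infty}^{-A/2} |\hat{w}(y)|^2\,dy + \int_{A/2}^{+\infty} |\hat{w}(y)|^2\,dy = \varepsilon_1(A),$$
which tends to 0 as $A \to +\infty$. As a consequence,
$$\int_{-A/2}^{A/2} [1 - \alpha_A(\xi)]^2\,|\hat{f}(\xi)|^2\,d\xi \le \varepsilon_1^2(A)\,\|f\|_2^2.$$
If $|\xi| \ge A/2$, then
$$\int_{|\xi| \ge A/2} [1 - \alpha_A(\xi)]^2\,|\hat{f}(\xi)|^2\,d\xi \le \int_{|\xi| \ge A/2} |\hat{f}(\xi)|^2\,d\xi,$$
which also tends to 0 as $A \to +\infty$. These two estimates show that $\varepsilon(A)$
(41.8) tends to 0 as A tends to infinity, and this proves (b). □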
This result shows that for the windowed Fourier transform in $L^2$ we
have formulas analogous to those for the ordinary Fourier transform in $L^2$:
conservation of energy (Parseval's formula) and an inversion formula. There
is a nice harmony in these formulas; this will also appear in the theory of
wavelets.
In practice, one generally uses a function w that is well localized around
the origin t = 0, for example, a Gaussian. The function $w_{\lambda b}$ is then localized
around the point t = b, while $\hat{w}_{\lambda b}$, given by (41.5), is localized around the
point $\xi = \lambda$. This means that
$$W_f(\lambda, b) = (f, w_{\lambda b}) = (\hat{f}, \hat{w}_{\lambda b})$$
contains information in both time and frequency around the point $(b, \lambda)$.
For numerical computations, the coefficients $W_f(\lambda, b)$ are evaluated on
a grid $(m\lambda_0, nb_0)$ with $m, n \in \mathbb{Z}$ and $\lambda_0, b_0 > 0$. One thus obtains a double
sequence $W_{m,n}(f) = W_f(m\lambda_0, nb_0)$, which is a discretized version of the
function of the two real variables $\lambda$ and b.
41.4 Comparing the methods of Fourier and Gabor
The transforms of Fourier and Gabor, which we can write formally as
$$f(x) = \int_{-\infty}^{+\infty} \hat{f}(\xi)\,e^{2i\pi x\xi}\,d\xi,$$
$$f(x) = \iint_{\mathbb{R}^2} W_f(\lambda, b)\,w_{\lambda b}(x)\,d\lambda\,db,$$
can be interpreted as decomposing the signal f in terms of functions that
play the role of basis functions, except that sums are replaced by integrals.
In the Fourier transform, these functions are sinusoids; in the Gabor
transform, they are strongly attenuated sinusoids, or looked at the other
way, modulated Gaussians (Figure 41.5). In the frequency space, we have
the representations illustrated in Figure 41.6.
With Fourier's method, the "basis functions" are completely concentrated
in frequency (Dirac impulses) and totally distributed in time (unattenuated
sinusoids extending from $-\infty$ to $+\infty$). This is another way to
explain that taking the Fourier transform gives the maximum amount of
information about the distribution of the frequencies but completely loses
information relative to time.
With Gabor's method, the figures show that time-frequency information
remains coupled, although there is always a compromise: The uncertainty
principle limits the simultaneous localization in time and frequency. In spite
of this, which is a fact of life for any time-frequency analysis, Gabor's
method has advantages over Fourier analysis for certain applications.
FIGURE 41.5. Basis functions for Fourier and Gabor decompositions (Fourier: $\mathrm{Re}[e^{2i\pi x\xi}]$; Gabor: $\mathrm{Re}[w_{\lambda b}(x)]$ under the envelope $y = w(x - b)$).
FIGURE 41.6. Basis functions for Fourier and Gabor in frequency space (Gabor: $\mathrm{Re}[\hat{w}_{\lambda b}(\xi)]$, centered at $\lambda$).
A signal f of finite duration provides one of the best illustrations of
the difference between the two methods. The reconstruction of f using
the inverse Fourier formula necessitates knowing the values of $\hat{f}(\xi)$ with
considerable precision over a very large range of values, for although $\hat{f}(\xi)$
tends to zero, it can do so frustratingly slowly (consider the transform of
$\chi_{[a,b]}$). The effects of all the sinusoids must come together to give zero
outside the support of f.
The situation is quite different for Gabor analysis. If f vanishes on a
long enough interval $(b_0 - a, b_0 + a)$ and if w(t) is small for $|t| \ge 1$, then
the coefficients $W_f(\lambda, b)$ will be negligible for b in a neighborhood of $b_0$,
since
$$W_f(\lambda, b) \approx \int_{b-1}^{b+1} f(t)\,\overline{w}_{\lambda b}(t)\,dt = 0.$$
On the other hand, if f oscillates strongly at $t = b_0$, the value of $W_f(\lambda, b)$
will be large for b near $b_0$ when the values of $\lambda$ "match" the frequency of f
near $b_0$. This gives an idea about the "local frequency" of f.
In spite of its advantages for certain applications, the Gabor method has
the major disadvantage that the size of the window is fixed. In terms of the
uncertainty principle, this means that $\Delta t$ is fixed (Section 22.3), and this
limits the ability to localize events in time. Problems arise when one wishes
to analyze signals that contain features on scales that range over several
orders of magnitude. This is the case, for example, with speech. Consider
the word "school." It begins with a short high-frequency attack followed by
a longer relatively lower-frequency component. Fluid mechanics provides
another important example. In fully developed turbulence, one observes
events on scales that range from the macroscopic to the microscopic.
The geophysicist Jean Morlet encountered these kinds of problems in
connection with seismic exploration for oil. Here it is necessary to analyze
signals that result from a pulse being reflected (and delayed and com-
pressed) from various layers in the earth. This led Morlet to introduce a
new method where the window is not only translated but is also dilated
and contracted. This was the beginning of the use of wavelets for numerical
signal processing.
41.5 Exercises
Exercise 41.1 With the notation and hypotheses of Theorem 41.3.1, show
that for f and $g \in L^2(\mathbb{R})$,
$$\iint_{\mathbb{R}^2} W_f(\lambda, b)\,\overline{W_g(\lambda, b)}\,d\lambda\,db = \int_{\mathbb{R}} f(t)\,\overline{g}(t)\,dt.$$
Exercise 41.2 Consider the signal $f(t) = e^{2i\pi at}$, $a \in \mathbb{R}$, and the Gaussian
window $w(t) = e^{-\pi t^2}$.
(a) Verify that
$$W_f(\lambda, b) = \int_{\mathbb{R}} f(t)\,w(t - b)\,e^{-2i\pi\lambda t}\,dt$$
is well-defined (even though $f \notin L^2(\mathbb{R})$).
(b) Compute $W_f(\lambda, b)$ using the following result:
For a > 0 and $x \in \mathbb{R}$, $\int_{\mathbb{R}} e^{-\pi a(t + ix)^2}\,dt = a^{-1/2}$.
(c) Show that $|W_f(\lambda, b)|^2$ attains its maximum when $\lambda = a$.
Exercise 41.3 Consider the Gaussian window $w(t) = Ae^{-\alpha t^2}$ with $A, \alpha > 0$
and the signal $f(t) = Be^{-\beta t^2}$ with $B, \beta > 0$. Use the result in Exercise 41.2(b)
to compute
$$W_f(\lambda, b) = \int_{\mathbb{R}} f(t)\,w(t - b)\,e^{-2i\pi\lambda t}\,dt.$$
Lesson 42
Wavelet Analysis
Gabor's method dates from the 1940s. With wavelets we enter a dynamic
contemporary research environment; what is now known as the modern
theory of wavelets emerged in the 1980s, notably with the article [GM84]
by Alex Grossmann and Jean Morlet. We say "modern" wavelet theory
because looking back over the mathematical landscape from a late twentieth
century perspective we can identify many earlier ideas and techniques that
are now logically included in this theory. Work by Haar in 1909; work in the
late 1920s by Strömberg; results from the 1930s by Littlewood and Paley,
Lusin, and Franklin; and later work in the 1960s, particularly the result of
Calderón on operators with singular kernels: all these efforts and others
are now interpreted in the language of wavelets.
What happened in the 1980s was qualitatively different; there occurred
a conjunction of requirement and solution. Jean Morlet, a geophysicist,
wished to analyze a particular class of signals associated with seismic
exploration, and he had an idea about how this should be done. He sought
the collaboration of Alex Grossmann, who, being a theoretical physicist,
had command of certain mathematical tools, particularly those associated
with coherent states and group representations from quantum theory. The
immediate result was their celebrated 1984 paper; it was also the begin-
ning of a productive collaboration between mathematics and other sectors
of science and technology. We will say more about contemporary research
at the end of the lesson, once some basic results have been established.
42.1 The basic idea: the accordion

Starting with a function $\psi$, called the analyzing wavelet or "mother"
wavelet, we construct the family of functions
$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\Big(\frac{t - b}{a}\Big), \qquad b \in \mathbb{R},\ a > 0.$$
The wavelet coefficients of a signal f are the numbers
$$C_f(a, b) = (f, \psi_{a,b}) = \int_{-\infty}^{+\infty} f(t)\,\overline{\psi}_{a,b}(t)\,dt.$$
The properties of $\psi$ are quite different from those of a window, which
has more or less the aspect of a characteristic function, while $\psi$, on the
other hand, oscillates and its integral is zero. We also want $\psi$ and $\hat{\psi}$ to
be well localized, which means that they both converge to zero at infinity
fairly rapidly. In this way one obtains a function that looks like a wave: It
oscillates and quickly decays. This is the source of its name. Morlet used
the function
$$\psi(t) = e^{-t^2/2} \cos 5t,$$
which is now known as Morlet's wavelet; derivatives of the Gaussian are
widely used in practice. Figures 42.1-42.4 illustrate differences in the
behavior of the Gabor functions $w_{\lambda b}(t)$, which have a rigid envelope, and
wavelets, which are dilated and contracted. With wavelets one sees the
action of an accordion. (The factor $a^{-1/2}$ has not been used in the figures.)
Unlike Gabor functions, wavelets do not have a rigid envelope.
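The role of the factor $a^{-1/2}$ can be seen numerically: it keeps the $L^2$ norm of every dilated and translated copy equal to that of the mother wavelet. The sketch below is an illustration only; the function names, the grid, and the sample values of a and b are assumptions. It uses Morlet's function and also compares the squared norm with the closed form $\frac{\sqrt{\pi}}{2}(1 + e^{-25})$ obtained from $\int e^{-t^2}\cos^2 5t\,dt$.

```python
import math


def morlet(t):
    """Morlet's function exp(-t^2/2) cos(5t) (not normalized)."""
    return math.exp(-t * t / 2.0) * math.cos(5.0 * t)


def dilate(psi, a, b):
    """Return t -> a^(-1/2) psi((t - b)/a) for a > 0."""
    if a <= 0:
        raise ValueError("this sketch assumes a > 0")
    return lambda t: psi((t - b) / a) / math.sqrt(a)


def l2_norm(g, lo=-30.0, hi=30.0, n=24000):
    """Riemann-sum approximation of the L2 norm of g on [lo, hi]."""
    dt = (hi - lo) / n
    return math.sqrt(sum(g(lo + i * dt) ** 2 for i in range(n + 1)) * dt)


norm_mother = l2_norm(morlet)
norm_wide = l2_norm(dilate(morlet, 3.0, 2.0))      # dilated, shifted right
norm_narrow = l2_norm(dilate(morlet, 0.5, -1.0))   # contracted, shifted left
closed_form_sq = 0.5 * math.sqrt(math.pi) * (1.0 + math.exp(-25.0))
```

The three norms agree up to rounding, while the shapes differ only by stretching and shifting: the number of oscillations under the envelope is unchanged.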
FIGURE 42.1. A wavelet $\psi(t)$ oscillates and decays.
FIGURE 42.2. Gabor functions $w_{\lambda b}(t) = e^{-\frac{1}{2}(t-b)^2} e^{2i\pi\lambda t}$: The envelope is rigid,
and the number of oscillations varies with frequency.
FIGURE 42.3. A mother wavelet (8th derivative of a Gaussian), a = 1.
42.2 The wavelet transform

42.2.1 Theorem Suppose that the function $\psi \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ satisfies
the following conditions:
(i) $\displaystyle\int_{-\infty}^{+\infty} \frac{|\hat{\psi}(\lambda)|^2}{|\lambda|}\,d\lambda = K < +\infty.$
(ii) $\|\psi\|_2 = 1$.
Construct the family of wavelets
$$\psi_{ab}(t) = \frac{1}{\sqrt{|a|}}\,\psi\Big(\frac{t - b}{a}\Big), \qquad a, b \in \mathbb{R},\ a \ne 0,$$
and for any signal $f \in L^2(\mathbb{R})$ consider the wavelet coefficients
$$C_f(a, b) = \int_{-\infty}^{+\infty} f(t)\,\overline{\psi}_{ab}(t)\,dt.$$
Under these conditions we have the following results:
(a) Conservation of energy:
$$\frac{1}{K} \iint_{\mathbb{R}^2} |C_f(a, b)|^2\,\frac{da\,db}{a^2} = \int_{-\infty}^{+\infty} |f(t)|^2\,dt.$$
(b) Reconstruction formula:
$$f(x) = \frac{1}{K} \iint_{\mathbb{R}^2} C_f(a, b)\,\psi_{ab}(x)\,\frac{da\,db}{a^2}$$
in the sense that if
$$f_\varepsilon(x) = \frac{1}{K} \iint_{|a| \ge \varepsilon,\ b \in \mathbb{R}} C_f(a, b)\,\psi_{ab}(x)\,\frac{da\,db}{a^2},$$
then $f_\varepsilon \to f$ in $L^2$ as $\varepsilon \to 0$.
FIGURE 42.4. Wavelets at low and high frequency (a = 3 and a = 0.5): They have the same form
and the same number of oscillations; they are dilated for large a and contracted
for small a.
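Proof. First two observations: The $\psi_{ab}$ are normalized so that $\|\psi_{ab}\|_2 = 1$,
and the proof is similar to that of Theorem 41.3.1. Thus, as before, we find
another expression for $C_f(a, b)$:
$$C_f(a, b) = (f, \psi_{ab}) = (\hat{f}, \hat{\psi}_{ab}),$$
and since
$$\hat{\psi}_{ab}(\lambda) = \sqrt{|a|}\,e^{-2i\pi\lambda b}\,\hat{\psi}(a\lambda), \tag{42.1}$$
we have
$$C_f(a, b) = \sqrt{|a|}\;\overline{\mathcal{F}}_\lambda\big[\hat{f}(\lambda)\,\overline{\hat{\psi}}(a\lambda)\big](b). \tag{42.2}$$
The function of $\lambda$ in brackets is in $L^1(\mathbb{R})$ because it is a product of two
functions in $L^2(\mathbb{R})$; it is also in $L^2(\mathbb{R})$, since $\psi \in L^1(\mathbb{R})$ implies $\hat{\psi}$ is bounded.
To prove (a), we compute the double integral using (42.2). Hence,
$$I = \int_{-\infty}^{+\infty} \Big( \int_{-\infty}^{+\infty} |C_f(a, b)|^2\,db \Big) \frac{da}{a^2} = \int_{-\infty}^{+\infty} \Big( \int_{-\infty}^{+\infty} \big|\overline{\mathcal{F}}_\lambda[\hat{f}(\lambda)\,\overline{\hat{\psi}}(a\lambda)](b)\big|^2\,db \Big) \frac{da}{|a|}.$$
Using this and Parseval's relation, we obtain
$$I = \int_{-\infty}^{+\infty} \Big( \int_{-\infty}^{+\infty} |\hat{f}(\lambda)|^2\,|\hat{\psi}(a\lambda)|^2\,d\lambda \Big) \frac{da}{|a|} = \int_{-\infty}^{+\infty} |\hat{f}(\lambda)|^2 \Big( \int_{-\infty}^{+\infty} \frac{|\hat{\psi}(a\lambda)|^2}{|a|}\,da \Big)\,d\lambda.$$
By the change of variable $\xi = a\lambda$, we see that the last integral is constant
and equal to K, which proves the result.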
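To prove (b), we first compute
$$J(a) = \int_{-\infty}^{+\infty} C_f(a, b)\,\psi_{ab}(x)\,db = \sqrt{|a|} \int_{-\infty}^{+\infty} \overline{\mathcal{F}}_\lambda\big[\hat{f}(\lambda)\,\overline{\hat{\psi}}(a\lambda)\big](b)\,\psi_{ab}(x)\,db.$$
Using Parseval's relation again, we have
$$J(a) = \sqrt{|a|} \int_{-\infty}^{+\infty} \hat{f}(\lambda)\,\overline{\hat{\psi}}(a\lambda)\,\overline{\mathcal{F}}_b[\psi_{ab}(x)](\lambda)\,d\lambda,$$
and since
$$\overline{\mathcal{F}}_b[\psi_{ab}(x)](\lambda) = \sqrt{|a|}\,e^{2i\pi\lambda x}\,\hat{\psi}(a\lambda),$$
it follows that
$$J(a) = |a| \int_{-\infty}^{+\infty} \hat{f}(\lambda)\,|\hat{\psi}(a\lambda)|^2\,e^{2i\pi\lambda x}\,d\lambda. \tag{42.3}$$
Define
$$g_\varepsilon(x) = \int_{|a| \ge \varepsilon} J(a)\,\frac{da}{a^2} = \int_{|a| \ge \varepsilon} \Big( \int_{-\infty}^{+\infty} \hat{f}(\lambda)\,|\hat{\psi}(a\lambda)|^2\,e^{2i\pi\lambda x}\,d\lambda \Big) \frac{da}{|a|}. \tag{42.4}$$
The next step is to show that the function of $(a, \lambda)$ under the integral signs
is integrable on $(|a| \ge \varepsilon) \times \mathbb{R}$.
By the change of variable $\xi = a\lambda$, we see that
$$A = \int_{-\infty}^{+\infty} |\hat{f}(\lambda)| \Big( \int_{|a| \ge \varepsilon} \frac{|\hat{\psi}(a\lambda)|^2}{|a|}\,da \Big)\,d\lambda = \int_{-\infty}^{+\infty} |\hat{f}(\lambda)| \Big( \int_{|\xi| \ge \varepsilon|\lambda|} \frac{|\hat{\psi}(\xi)|^2}{|\xi|}\,d\xi \Big)\,d\lambda.$$
A is estimated in two parts. For $|\lambda| \le 1$,
$$A_1 = \int_{-1}^{1} |\hat{f}(\lambda)| \Big( \int_{|\xi| \ge \varepsilon|\lambda|} \frac{|\hat{\psi}(\xi)|^2}{|\xi|}\,d\xi \Big)\,d\lambda \le K \int_{-1}^{1} |\hat{f}(\lambda)|\,d\lambda \le K\sqrt{2}\,\|f\|_2.$$
For $|\lambda| \ge 1$,
$$\int_{|\xi| \ge \varepsilon|\lambda|} \frac{|\hat{\psi}(\xi)|^2}{|\xi|}\,d\xi \le \frac{1}{\varepsilon|\lambda|}\,\|\psi\|_2^2,$$
so
$$A_2 \le \frac{1}{\varepsilon}\,\|\psi\|_2^2 \int_{|\lambda| \ge 1} \frac{|\hat{f}(\lambda)|}{|\lambda|}\,d\lambda \le \frac{1}{\varepsilon}\,\|\psi\|_2^2\,\|f\|_2 \Big( \int_{|\lambda| \ge 1} \frac{d\lambda}{\lambda^2} \Big)^{1/2} < +\infty.$$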
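This means that we can interchange the order of integration in (42.4); thus
$$g_\varepsilon(x) = \int_{-\infty}^{+\infty} \hat{f}(\lambda)\,e^{2i\pi\lambda x} \Big( \int_{|a| \ge \varepsilon} \frac{|\hat{\psi}(a\lambda)|^2}{|a|}\,da \Big)\,d\lambda = \overline{\mathcal{F}}[\hat{f} \cdot \theta_\varepsilon](x)$$
with
$$\theta_\varepsilon(\lambda) = \int_{|a| \ge \varepsilon} \frac{|\hat{\psi}(a\lambda)|^2}{|a|}\,da.$$
To show that $g_\varepsilon \to Kf$ in $L^2$, we evaluate the norm of the difference:
$$\|Kf - g_\varepsilon\|_2^2 = \|\overline{\mathcal{F}}(K\hat{f} - \hat{f} \cdot \theta_\varepsilon)\|_2^2 = \|\hat{f}(K - \theta_\varepsilon)\|_2^2,$$
or
$$\|Kf - g_\varepsilon\|_2^2 = \int_{-\infty}^{+\infty} |\hat{f}(\lambda)|^2\,[K - \theta_\varepsilon(\lambda)]^2\,d\lambda. \tag{42.5}$$
Again we examine two cases depending on the relation of $|\lambda|$ to $\varepsilon^{-1/2}$. If
$|\lambda| \le \varepsilon^{-1/2}$, then
$$\theta_\varepsilon(\lambda) = \int_{|\xi| \ge \varepsilon|\lambda|} \frac{|\hat{\psi}(\xi)|^2}{|\xi|}\,d\xi \ge \int_{|\xi| \ge \sqrt{\varepsilon}} \frac{|\hat{\psi}(\xi)|^2}{|\xi|}\,d\xi = K(\varepsilon).$$
Thus $0 \le K - \theta_\varepsilon(\lambda) \le K - K(\varepsilon)$, and $K(\varepsilon) \to K$ as $\varepsilon \to 0^+$ by (i). If
$|\lambda| \ge \varepsilon^{-1/2}$, it is sufficient to note that $0 \le \theta_\varepsilon(\lambda) \le K$. Then from (42.5)
we have
$$\|Kf - g_\varepsilon\|_2^2 \le [K - K(\varepsilon)]^2\,\|f\|_2^2 + K^2 \int_{|\lambda| \ge \varepsilon^{-1/2}} |\hat{f}(\lambda)|^2\,d\lambda,$$
and these two terms tend to zero as $\varepsilon \to 0$. □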
42.2.2 Remarks
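(a) Hypothesis (i) implies that $\hat{\psi}(0) = \int_{\mathbb{R}} \psi(t)\,dt = 0$, since $\hat{\psi}$ is continuous.
In all practical cases this condition is also sufficient. For example, if $\psi$
and $x\psi$ are integrable, then $\hat{\psi} \in C^1(\mathbb{R})$ and $\xi \mapsto |\xi|^{-1}|\hat{\psi}(\xi)|^2$ is continuous
at $\xi = 0$. There is no problem with the integral at infinity, since $\psi \in L^2(\mathbb{R})$.
(b) For signals f belonging to $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ such that $\hat{f}$ is also in $L^1(\mathbb{R})$,
the proof of the theorem is simplified because all of the integrals exist in
the usual sense when $\varepsilon = 0$. From (42.3) we deduce that
$$\frac{1}{K} \iint_{\mathbb{R}^2} C_f(a, b)\,\psi_{ab}(x)\,\frac{da\,db}{a^2} = \frac{1}{K} \int_{-\infty}^{+\infty} \hat{f}(\lambda)\,e^{2i\pi\lambda x} \Big( \int_{-\infty}^{+\infty} \frac{|\hat{\psi}(a\lambda)|^2}{|a|}\,da \Big)\,d\lambda = f(x)$$
by the Fourier inversion formula (Theorem 18.1.1). The reconstruction
formula then holds for almost all $x \in \mathbb{R}$, or for all x if f is the continuous
representative of its class.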
42.2.3 Examples
(a) The wavelet first used by Morlet (Figures 42.5 and 42.6),
$$\psi(t) = e^{-t^2/2} \cos 5t, \tag{42.6}$$
is not normalized, but this is not a problem. On the other hand, the
hypothesis (i) is not satisfied, since
$$\hat{\psi}(0) = \int_{\mathbb{R}} \psi(t)\,dt = \sqrt{2\pi}\,e^{-25/2} \ne 0.$$
Thus $K = +\infty$! However, the value of $\hat{\psi}(0)$ is on the order of $10^{-5}$. For
numerical computations this is essentially zero, and in practice things work
well. Nevertheless, the theorem does not apply to Morlet's wavelet.
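The size of $\hat{\psi}(0)$ for Morlet's function is easy to confirm numerically. The following sketch is an illustration, with the integration interval and step chosen arbitrarily: it compares a Riemann sum for $\int \psi(t)\,dt$ with the value $\sqrt{2\pi}\,e^{-25/2}$ given by the standard Gaussian integral $\int e^{-t^2/2}\cos\omega t\,dt = \sqrt{2\pi}\,e^{-\omega^2/2}$ at $\omega = 5$.

```python
import math


def morlet(t):
    """Morlet's function exp(-t^2/2) cos(5t)."""
    return math.exp(-t * t / 2.0) * math.cos(5.0 * t)


# Riemann sum on [-12, 12]; the integrand is below exp(-72) outside this interval.
n = 4800
lo, hi = -12.0, 12.0
dt = (hi - lo) / n
numeric = sum(morlet(lo + i * dt) for i in range(n + 1)) * dt

# Closed form of the same integral.
closed_form = math.sqrt(2.0 * math.pi) * math.exp(-12.5)
```

Both values are close to $9.3 \times 10^{-6}$: small enough to be ignored in many computations, but not zero, which is why condition (i) fails.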
(b) The simplest example of a wavelet is the piecewise constant function
$\psi$ defined by
$$\psi(x) = \begin{cases} 1 & \text{if } 0 < x < \tfrac{1}{2}, \\ -1 & \text{if } \tfrac{1}{2} < x < 1, \\ 0 & \text{elsewhere.} \end{cases}$$
This is the Haar wavelet (Figure 42.7), and
$$\hat{\psi}(\lambda) = i\,e^{-i\pi\lambda}\,\frac{\sin^2(\pi\lambda/2)}{\pi\lambda/2}.$$
The convergence of $\hat{\psi}(\lambda)$ to 0 at infinity is very slow due to the irregularity
of $\psi$, and this is a considerable problem for applications.
FIGURE 42.5. The Morlet wavelet.
FIGURE 42.6. Spectrum of Morlet's wavelet.
FIGURE 42.7. The Haar wavelet and the modulus of its spectrum.
(c) Almost any function $\psi$ that oscillates and has a zero integral and is
such that both $\psi$ and $\hat{\psi}$ are well localized can be used as a mother wavelet.
Examples include the derivatives of the Gaussian. The second derivative
$$\psi(x) = \frac{2}{\sqrt{3}}\,\pi^{-1/4}\,(1 - x^2)\,e^{-\frac{1}{2}x^2} \tag{42.7}$$
is called the Mexican hat (Figure 42.8). Its spectrum is
$$\hat{\psi}(\lambda) = \frac{8\sqrt{2}}{\sqrt{3}}\,\pi^{9/4}\,\lambda^2\,e^{-2\pi^2\lambda^2}.$$
Both $\psi$ and $\hat{\psi}$ belong to $\mathscr{S}$ and are well localized. Figure 42.9 illustrates
the 8th derivative of the Gaussian and its spectrum.
FIGURE 42.8. The Mexican hat and its spectrum.
FIGURE 42.9. The 8th derivative of the Gaussian and its spectrum.
42.2.4 The wavelet transform as an analytic tool

Like the Fourier transform, the wavelet transform is both a theoretical
and practical tool, but unlike the Fourier transform, there is the opportu-
nity to choose different analyzing wavelets depending on the job at hand.
There are, however, common properties shared by large classes of analyzing
wavelets.
The wavelet coefficients of the signal $f = e^{2i\pi\lambda_0 t}$ are
$$C_f(a, b) = \sqrt{|a|}\;\overline{\hat{\psi}}(a\lambda_0)\,e^{2i\pi\lambda_0 b}.$$
In this case, $|C_f(a, b)|$ depends only on a, and when $\hat{\psi}$ is real, the argument
of $C_f(a, b)$ is proportional to b modulo $2\pi$. Here we see that the wavelet
coefficients tell us something about the behavior of the function f, and
this provides a simple illustration of how the wavelet transform can be
used as an analytic tool. It will not have gone unnoticed that $f = e^{2i\pi\lambda_0 t}$
is not in $L^2(\mathbb{R})$. In general, the extension of Theorem 42.2.1 to functions
not in $L^2(\mathbb{R})$, or to distributions, is a difficult problem, but this does not
mean that the wavelet transform has no applications outside the context of
$L^2(\mathbb{R})$. On the contrary, the wavelet transform has proved to be a powerful
tool for investigating the local behavior of functions that are, for example,
assumed to be only in $L^\infty(\mathbb{R})$. Here are a few other important properties.
Consider the function f(t) = 1. Its wavelet coefficients are all zero,
$$f(t) = 1 \implies C_f(a, b) = 0,$$
which means that the wavelet transform "ignores" constants. For f(t) = t,
$$C_f(a, b) = a|a|^{1/2} \int_{-\infty}^{+\infty} x\,\overline{\psi}(x)\,dx$$
if $x\psi$ is integrable, which is most often the case. We then have
$$f(t) = t \implies C_f(a, b) = \frac{a|a|^{1/2}}{2i\pi}\,\overline{\hat{\psi}}{}'(0).$$
If $\psi$ is a derivative of order $m \ge 2$ of a Gaussian, then the coefficients
of f are again all zero. The general result is this: The wavelet transform
$C_f(a, b)$ will vanish for all polynomials f of degree $\le p$ when $m \ge p + 1$.
One implication is that by "ignoring" the "smooth" part of a function, the
wavelet transform "sees" only the "rough" part. In particular, by choosing
an analyzing wavelet with sufficiently many vanishing moments, the wavelet
transform will ignore the polynomial "trends" in a signal. Finally, we note
that the wavelet transform is invariant with respect to translations of the
signal,
$$g(t) = f(t - t_0) \implies C_g(a, b) = C_f(a, b - t_0),$$
and that for $k \ne 0$,
$$g(t) = f(kt) \implies C_g(a, b) = \frac{1}{\sqrt{|k|}}\,C_f(ka, kb).$$
This last property plays an important role when the wavelet transform is
used to analyze the singularities of a function.
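The statement about polynomials can be tested numerically for the Mexican hat, which is a second derivative of a Gaussian (m = 2), so its coefficients should vanish for polynomials of degree at most 1 but not for $t^2$. The sketch below is an illustration under assumptions: the function names, the sample values a = 2 and b = 1, and the integration grid are arbitrary. It uses the substitution $t = b + ax$, which gives $C_f(a, b) = |a|^{1/2} \int f(b + ax)\,\psi(x)\,dx$ for real $\psi$.

```python
import math


def mexican_hat(x):
    """Normalized Mexican hat (2/sqrt(3)) pi^(-1/4) (1 - x^2) exp(-x^2/2)."""
    c = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return c * (1.0 - x * x) * math.exp(-x * x / 2.0)


def wavelet_coeff(f, a, b, half_width=12.0, n=4800):
    """C_f(a, b) = |a|^(1/2) * integral of f(b + a x) psi(x) dx (Riemann sum)."""
    dx = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        x = -half_width + i * dx
        total += f(b + a * x) * mexican_hat(x)
    return math.sqrt(abs(a)) * total * dx


a, b = 2.0, 1.0
c_constant = wavelet_coeff(lambda t: 1.0, a, b)
c_linear = wavelet_coeff(lambda t: t, a, b)
c_quadratic = wavelet_coeff(lambda t: t * t, a, b)
```

The first two coefficients are zero up to rounding, while the quadratic one is not, because the second moment of the Mexican hat does not vanish.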
42.2.5 Numerical computation

Throughout the text we have computed explicitly the Fourier transform
of functions and distributions. In contrast, the wavelet transform is hardly
ever computed explicitly, even for simple functions f and $\psi$. Naturally, for
numerical computations it is necessary to restrict the parameters a and b
to a discrete (indeed finite) subset of $\mathbb{R}$. For example, with
$$a_m = 2^{-m} \quad \text{and} \quad b_n = n2^{-m}, \qquad m, n \in \mathbb{Z},$$
we have
$$\psi_{a_m,b_n}(x) = 2^{m/2}\,\psi(2^m x - n);$$
or more generally, with $\alpha > 1$ and $\beta > 0$,
$$\psi_{a_m,b_n}(x) = \alpha^{m/2}\,\psi(\alpha^m x - n\beta).$$
The information in this set becomes more redundant, and the computations
become more voluminous, the closer $\alpha$ is to 1 and $\beta$ is to 0. The
choice $\alpha = 2$ corresponds to different octaves in music.
Ingrid Daubechies studied under what conditions the mapping
$$C : f \mapsto C_f = (C_f(a_m, b_n))_{(m,n) \in \mathbb{Z}^2}$$
from $L^2(\mathbb{R})$ into $l^2(\mathbb{Z}^2)$ is 1-to-1, which means that the wavelet coefficients
characterize the signal. She also asked when the inverse of this mapping is
continuous on its domain, which is important for numerical stability. For
any reasonable analyzing wavelet $\psi$ (good decay in both time and frequency
and $\int \psi(x)\,dx = 0$), these requirements are equivalent to the existence of
two positive constants A and B such that
$$A\|f\|_2^2 \le \sum_{m,n \in \mathbb{Z}} |(\psi_{a_m,b_n}, f)|^2 \le B\|f\|_2^2.$$
Daubechies showed that in this case the reconstruction formula can be
written
$$f = \frac{2}{A + B} \sum_{m,n \in \mathbb{Z}} (f, \psi_{a_m,b_n})\,\psi_{a_m,b_n} + Rf.$$
If the remainder, or error term, Rf is small enough, it can be neglected. If
not, then Daubechies has a reconstruction algorithm that converges
exponentially (see [Dau92] for a complete discussion).
42.3 Orthogonal wavelets

Since the information contained in the coefficients $C_f(a, b)$ (which are
determined in terms of the "basis" functions $\psi_{ab}$) is redundant, a natural
challenge for the early researchers was to find a family of orthogonal wavelets
$\{\psi_{jk}\}$, $j, k \in \mathbb{Z}$, on which every signal $f \in L^2(\mathbb{R})$ could be decomposed in
a double series
$$f(x) = \sum_{j,k \in \mathbb{Z}} (f, \psi_{jk})\,\psi_{jk}(x)$$
with
$$\psi_{jk}(x) = 2^{j/2}\,\psi(2^j x - k).$$
One would thus have an orthogonal basis, in the usual sense, for the Hilbert
space $L^2(\mathbb{R})$, where the coefficients $c_{jk} = C_f(j, k)$ are independent of one
another. What is special about this basis is that the functions are all
determined from one wavelet $\psi$ by dilations and translations.
42.3.1 The Haar system
Such a family of functions $\psi_{jk}$ has been known since the beginning of the
century. It is the Haar system introduced by Alfred Haar in 1910 [Haa10].
The mother wavelet for this system is defined in Section 42.2.3(b) and
shown in Figure 42.7; a graph of $\psi_{jk}$ is shown in Figure 42.10.
FIGURE 42.10. The orthogonal Haar system: $\psi_{jk}(x)$ takes the values $\pm 2^{j/2}$ on $[k2^{-j}, (k+1)2^{-j}]$.
The Haar system is an orthonormal basis for $L^2(\mathbb{R})$. We have seen,
however, that the absence of regularity of $\psi$ causes $\hat{\psi}$ to be poorly localized.
More to the point, we will see below that the Haar coefficients $c_{jk}$ converge
to 0 very slowly as $j \to +\infty$ even for $C^\infty$ signals.
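Orthonormality of the Haar functions can be checked directly for a few indices. The sketch below is an illustration: it assumes the usual form $\psi_{jk}(x) = 2^{j/2}\psi(2^j x - k)$, and the index ranges, grid, and names are arbitrary choices. It computes the Gram matrix of twelve Haar functions with a midpoint rule on $[-4, 4]$. Because each function is constant on cells of length 1/64 for the scales used, the midpoint rule is exact up to rounding.

```python
def haar(x):
    """Haar mother wavelet: 1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0


def haar_jk(j, k):
    """psi_jk(x) = 2^(j/2) psi(2^j x - k)."""
    scale = 2.0 ** (j / 2.0)
    return lambda x: scale * haar(2.0 ** j * x - k)


# Midpoints of cells of length 1/64 covering [-4, 4].
cells_per_unit = 64
dx = 1.0 / cells_per_unit
grid = [-4.0 + (i + 0.5) * dx for i in range(8 * cells_per_unit)]

indices = [(j, k) for j in (-1, 0, 1, 2) for k in (-1, 0, 1)]
samples = [[haar_jk(j, k)(x) for x in grid] for (j, k) in indices]

# Gram matrix of inner products (psi_jk, psi_j'k').
gram = [[sum(u * v for u, v in zip(row_p, row_q)) * dx for row_q in samples]
        for row_p in samples]
```

The resulting matrix is the identity up to rounding: functions at the same scale have disjoint supports, and a finer-scale function sits inside an interval where the coarser one is constant, so the products integrate to zero.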
42.3.2 The problern of moments for wavelets

Assurne that f E 0 00 n L 2 (1R.) and to simplify the notation, write n = 21.
We wish to study the rate at which
converges to 0 as n --> +oo. Taylor's formula with integral remainder ap-

plied to f at x = 0 shows that at order q,
Un
q
= ~ J(ll(o) -oo lT:jj}(nx) dx + l+oo
l+oo xl
-oo R(x):jj}(nx) dx
with
R(x) =
lo
r (x- t)q j(q+ll(t) dt.
q!
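Denoting the moments of $\overline{\psi}$ by $M_l$ and the remainder by $r_n$, we have
$$M_l = \int_{-\infty}^{+\infty} x^l\,\overline{\psi}(x)\,dx, \quad l \in \mathbb{N},$$
$$u_n = \sum_{l=0}^{q} \frac{f^{(l)}(0)\,M_l}{l!\; n^{l+1}} + r_n.$$
An easy computation shows that $|r_n| \le C n^{-(q+2)}$ for some constant $C$.
Since a wavelet has zero mean, $M_0 = 0$, and thus
$$u_n = \frac{f'(0)}{n^2}\frac{M_1}{1!} + \frac{f''(0)}{n^3}\frac{M_2}{2!} + \cdots + \frac{f^{(q)}(0)}{n^{q+1}}\frac{M_q}{q!} + O\!\left(\frac{1}{n^{q+2}}\right),$$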
and we see that the rate of convergence of $u_n$ to 0 is controlled by the first
nonzero moment of $\overline{\psi}$. For the Haar wavelet, $M_1 \neq 0$, and this is the
source of the numerical problems related to the lack of concentration of the
coefficients. These considerations lead to the definition of a wavelet with a
certain amount of regularity and localization [Mey90].
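The claim about the Haar moments can be checked by hand or with a few lines of exact arithmetic. The sketch below assumes the usual convention for the Haar mother wavelet ($+1$ on $[0, 1/2)$, $-1$ on $[1/2, 1)$, zero elsewhere); the definition in Section 42.2.3(b) may differ by a sign or a shift, which would change the values but not the conclusion that $M_0 = 0$ and $M_1 \neq 0$. The function name `haar_moment` is illustrative.

```python
from fractions import Fraction


def haar_moment(l: int) -> Fraction:
    """Exact moment M_l = integral of x^l * psi(x) dx for the Haar wavelet.

    Assumed convention: psi = +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere.
    """
    half = Fraction(1, 2)
    first = half ** (l + 1) / (l + 1)          # integral of x^l over [0, 1/2]
    second = (1 - half ** (l + 1)) / (l + 1)   # integral of x^l over [1/2, 1]
    return first - second


# M_0 vanishes (zero mean), but M_1 already does not.
moments = [haar_moment(l) for l in range(3)]
print(moments)
```

With $M_1 \neq 0$, the expansion of $u_n$ above starts at the $1/n^2$ term, however smooth $f$ is.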
42.3.3 Definition Suppose $r \in \mathbb{N}$. A wavelet of order $r$ is any function
$\psi : \mathbb{R} \to \mathbb{C}$ such that $\psi$ and its derivatives up to order $r$ belong to $L^\infty(\mathbb{R})$
and that satisfies the following two conditions:
(a) $\psi$ and its derivatives up to order $r$ decrease rapidly. (42.8)
(b) $\int_{-\infty}^{+\infty} x^q \psi(x)\,dx = 0$ for $0 \le q \le r$. (42.9)
42.3.4 Definition We say that the family $\{\psi_{jk}\}_{j,k \in \mathbb{Z}}$ is an orthonormal
wavelet basis for $L^2(\mathbb{R})$ if the $\psi_{jk}$ are of the form
$$\psi_{jk}(t) = 2^{j/2}\,\psi(2^j t - k)$$ (42.10)
for $\psi \in L^2(\mathbb{R})$ with $\|\psi\|_2 = 1$, and $\psi$ is a wavelet of some order $r \ge 0$.
Later in the lesson we will see how to construct orthonormal bases of the
form (42.10) where all we know about $\psi$ is that it is in $L^2(\mathbb{R})$. This is not
particularly interesting for applications, as we have seen in the case of the
Haar wavelet (which is of order 0). For efficient numerical computations
it is necessary to use higher-order wavelets, which means that the wavelet
and its Fourier transform have reasonably good localization and regularity.
42.3.5 Yves Meyer's $C^\infty$ wavelet

We know from Proposition 17.2.1 that saying that the first few moments
of $\psi$ vanish is the same as saying that
$$\hat\psi^{(q)}(0) = 0, \quad q = 0, 1, 2, \ldots.$$
For a wavelet of order $r$, condition (b) of Definition 42.3.3 on the moments
is equivalent to $\xi = 0$ being a zero of order $r + 1$ of $\hat\psi$. Thus if one wishes
to find a wavelet of infinite order, one could start with a function $\hat\psi$ that
vanishes in a neighborhood of the origin. But this is easier said than done.
In 1985 Yves Meyer was able to produce such a wavelet, which is in
$\mathscr{S}$, by first constructing $\hat\psi$ belonging to $\mathscr{D}$. This construction is rather
subtle; the details can be found in [Dau92, p. 116]. What are we to think
when Professor Meyer confesses modestly to have made this discovery "by
accident"? In fact, Meyer was quite dissatisfied that the construction of $\psi$
did not fit into a general framework; this was created a little later with the
advent of multiresolution analysis. Here are the steps in the construction
of this $C^\infty$ wavelet.
One starts with a real, even function $\omega \in \mathscr{D}$ having the shape shown in
Figure 42.11.
FIGURE 42.11. The function $\omega(\xi)$, with marked points A, B, and C.
The curve AB is required to have a certain symmetry:
The curve BC is required to have the same form as AB, but reversed and
stretched:
w(2~) = ~- w(2(1- ~)), ~ E [~, ~].
Then $\hat\psi$ is defined by
$$\hat\psi(\xi) = e^{-i\pi\xi}\,\sin[\omega(\xi)], \quad \xi \in \mathbb{R}.$$ (42.11)
It is easy to see that $\hat\psi$ is in $\mathscr{D}$. From this it follows that $\psi \in \mathscr{S}$ and is
given by
(42.12)
Observe that $\psi$ is real and its graph is symmetric with respect to $t = 1/2$.
It turns out that there is not much leeway in the choice of $\omega$, and the
wavelets constructed all have about the same appearance as the one shown
in Figure 42.12. We note that although this function decreases rapidly,
it has a rather large "numerical" support.
FIGURE 42.12. Meyer's $C^\infty$ wavelet.
Meyer proved (and this is not simple [LM86]) that the $\psi_{jk}$ form an
orthonormal basis for $L^2(\mathbb{R})$. Thus for all $f \in L^2(\mathbb{R})$,
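$$f = \sum_{j=-\infty}^{+\infty} \sum_{k=-\infty}^{+\infty} (f, \psi_{jk})\,\psi_{jk}$$ (42.13)
and
$$\sum_{j=-\infty}^{+\infty} \sum_{k=-\infty}^{+\infty} |(f, \psi_{jk})|^2 = \int_{-\infty}^{+\infty} |f(t)|^2\,dt.$$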
This wavelet basis has been tested numerically by Stephane Jaffard by

approximating the curve AB with a polynomial. Equation (42.13) is a series
expansion similar to a Fourier series, except that here the series is double
and f is not required to be periodic.
With an orthonormal wavelet basis, we obtain a decomposition of a signal
$f$ in "voices" $f_j$:
$$f = \sum_{j=-\infty}^{+\infty} f_j \quad \text{with} \quad f_j = \sum_{k=-\infty}^{+\infty} (f, \psi_{jk})\,\psi_{jk}.$$
Here we have a chorus with infinitely many voices. The approximation
$$F_n = \sum_{j=-\infty}^{n-1} \sum_{k=-\infty}^{+\infty} (f, \psi_{jk})\,\psi_{jk},$$
which is a projection of $f$ on a certain subspace $V_n$, tends to $f$ as $n \to +\infty$.
The voice $f_j$ represents exactly the detail that must be added to $F_j$ to
obtain the finer approximation $F_{j+1}$. These ideas led to the notion of a
multiresolution analysis of the space $L^2(\mathbb{R})$. This concept was introduced
by Stephane Mallat and Yves Meyer in 1987 [Mal89].
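The relation "finer approximation = coarser approximation + voice" has a simple discrete analogue for the Haar system. The sketch below is an illustration and not the book's algorithm: it assumes that a finite list of samples plays the role of the coefficients of $F_{j+1}$, and the helper names `haar_split` and `haar_merge` are hypothetical. One orthonormal Haar step splits the samples into a coarse part and a detail part; adding the detail back restores the samples exactly, and the energy is preserved as in the Parseval relation above.

```python
import numpy as np


def haar_split(fine):
    """One orthonormal Haar step: coarse approximation and detail ("voice")."""
    fine = np.asarray(fine, dtype=float)
    even, odd = fine[0::2], fine[1::2]
    approx = (even + odd) / np.sqrt(2.0)   # local averages (coarser scale)
    detail = (even - odd) / np.sqrt(2.0)   # local differences (the voice)
    return approx, detail


def haar_merge(approx, detail):
    """Inverse step: add the detail back to recover the finer approximation."""
    fine = np.empty(2 * len(approx))
    fine[0::2] = (approx + detail) / np.sqrt(2.0)
    fine[1::2] = (approx - detail) / np.sqrt(2.0)
    return fine


signal = np.array([4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0])
approx, detail = haar_split(signal)
restored = haar_merge(approx, detail)
energy_in = np.sum(signal ** 2)
energy_out = np.sum(approx ** 2) + np.sum(detail ** 2)
```

Applying `haar_split` again to `approx` produces the next coarser approximation and the next voice.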
42.4 Multiresolution analysis of $L^2(\mathbb{R})$

42.4.1 An introductory example
We begin with a uniform subdivision of the real line defined, for simplicity,
by $t_k = k$ for all $k \in \mathbb{Z}$. (In approximation theory, the points $t_k$ are known
as knots.) An approximation $F_0$ of a signal $f \in L^2(\mathbb{R})$ can be defined in
terms of an orthogonal projection on a subspace $V_0$ of approximations that
are defined with respect to the given subdivision. For example, let $V_0$ be the
subspace of $L^2(\mathbb{R})$ consisting of the continuous functions in $L^2(\mathbb{R})$ whose
restrictions to the intervals $[k, k + 1]$ are polynomials of degree $\le 1$ (see
Figure 42.13).
FIGURE 42.13. A cardinal spline of degree 1.
Any such function, square integrable or not, is called a cardinal spline
of degree 1. It is not difficult to show that $V_0$ is isomorphic to the Hilbert
space $l^2(\mathbb{Z})$. Since $l^2(\mathbb{Z})$ is complete, $V_0$ is a closed subspace of $L^2(\mathbb{R})$ on
which the orthogonal projection $F_0$ of $f$ is well-defined. Having defined such
a function, we can improve the approximation of $f$ by projecting $f$ onto
a larger subspace $V_1$ that contains $V_0$. Then $V_1$ is defined the same way
we defined $V_0$, except that this time we refine the subdivision by adding
all of the mid-points of the original intervals $[k, k + 1]$. Then $V_1$ is easily
characterized in terms of $V_0$, namely,
$$v(t) \in V_0 \iff v(2t) \in V_1.$$

In the same way we define the spaces $V_2, V_3, \ldots$ by taking finer subdivisions,
and the spaces $V_{-1}, V_{-2}, \ldots$ by taking coarser subdivisions. In the latter
construction, the knots for $V_{-1}$ are the points $2k$, $k \in \mathbb{Z}$; those for $V_{-2}$ are
$2^2 k$, and so on. In this way we obtain a sequence of closed, nested subspaces
of $L^2(\mathbb{R})$
$$\cdots \subset V_{-2} \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \cdots$$
such that for all $j \in \mathbb{Z}$,
$$v(t) \in V_0 \iff v(2^j t) \in V_j.$$

FIGURE 42.14. The hat-function basis.
$V_0$ is invariant under integer translations of the variable, and one can
show that the translates $\tau_k g$ of the hat function $g$ (Figure 42.14) form a
basis for the Hilbert space $V_0$. Thus every $v \in V_0$ can be expressed as
$$v = \sum_{k=-\infty}^{+\infty} v(k)\,\tau_k g.$$
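The expansion $v = \sum_k v(k)\,\tau_k g$ says that a cardinal spline of degree 1 is determined by its values at the integer knots. As a small numerical sketch (assuming the hat function is $g(t) = \max(0, 1 - |t|)$, which is what Figure 42.14 suggests, and using only finitely many nonzero values $v(k)$), the sum of translated hats coincides with ordinary piecewise-linear interpolation. The helper names are illustrative.

```python
import numpy as np


def hat(t):
    """Hat function g: 1 at t = 0, linear down to 0 at t = -1 and t = 1."""
    return np.maximum(0.0, 1.0 - np.abs(t))


def spline_from_knots(values, first_knot, t):
    """Evaluate v(t) = sum_k v(k) g(t - k) for finitely many nonzero v(k)."""
    t = np.asarray(t, dtype=float)
    knots = first_knot + np.arange(len(values))
    return sum(v * hat(t - k) for v, k in zip(values, knots))


values = np.array([0.0, 1.0, -0.5, 2.0, 0.0])   # v(k) for k = -2, ..., 2
t = np.linspace(-2.0, 2.0, 41)
v = spline_from_knots(values, -2, t)
# Reference: straight-line interpolation through the points (k, v(k)).
reference = np.interp(t, np.arange(-2, 3), values)
```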
This example serves as a model for the definition of a multiresolution
analysis of $L^2(\mathbb{R})$.
42.4.2 Definition A multiresolution analysis of $L^2(\mathbb{R})$ is an increasing
sequence $\{V_j\}_{j \in \mathbb{Z}}$ of closed subspaces of $L^2(\mathbb{R})$ that have the following
properties:
(i) $v(t) \in V_j \iff v(2t) \in V_{j+1}$ for all $j \in \mathbb{Z}$.
(ii) $V_0$ is invariant under integer translations of the variable: $v \in V_0$
implies that $\tau_k v \in V_0$ for all $k \in \mathbb{Z}$.
(iii) $\bigcup_{j \in \mathbb{Z}} V_j$ is dense in $L^2(\mathbb{R})$ and $\bigcap_{j \in \mathbb{Z}} V_j = \{0\}$.
(iv) There is a function $g$ in $V_0$ such that the family $\{\tau_k g\}_{k \in \mathbb{Z}}$ is an
unconditional basis for $V_0$.
The subspace $V_j$ can be interpreted, as in the example, as the space of
all possible approximations at the scale $2^{-j}$. Property (iii) means that the
sequence of orthogonal projections $F_j$ of $f$ tends to $f$ in $L^2(\mathbb{R})$ as $j \to +\infty$
and that $F_j \to 0$ as $j \to -\infty$.
The example of the spline functions of degree 1 clearly satisfies properties
(i) and (ii). If $v$ is in $V_j$ for all $j$, then $v$ would have to be a linear function,
and being in $L^2$, this means that it must vanish identically. For the density,
it is sufficient to show that the union of the $V_j$ is dense in $\mathscr{D}$, since $\mathscr{D}$ itself
is dense in $L^2(\mathbb{R})$. Suppose $f$ is in $\mathscr{D}$, and consider the function $v_j \in V_j$
that agrees with $f$ at the points $k2^{-j}$, $k \in \mathbb{Z}$. We know that $v_j$ converges
to $f$ uniformly on $\mathbb{R}$ as $j \to +\infty$. Since the supports of $f$ and the $v_j$ are all
contained in some bounded interval, the $v_j$ also converge to $f$ in $L^2(\mathbb{R})$.
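The uniform convergence of the interpolants $v_j$ is easy to observe numerically. In the sketch below a Gaussian is used as a stand-in for $f$; this is an assumption made for convenience, since a Gaussian is smooth and rapidly decreasing but not compactly supported, so the error is measured only on a bounded interval. The function `interpolation_error` is a hypothetical helper.

```python
import numpy as np


def interpolation_error(f, j, a=-4.0, b=4.0):
    """Sup-norm error on [a, b] of the linear spline with knots k * 2**(-j)."""
    knots = np.arange(a * 2 ** j, b * 2 ** j + 1) / 2 ** j
    t = np.linspace(a, b, 20001)
    return np.max(np.abs(f(t) - np.interp(t, knots, f(knots))))


def f(t):
    return np.exp(-t ** 2)


# Error for knot spacings 1, 1/2, 1/4, 1/8, 1/16.
errors = [interpolation_error(f, j) for j in range(5)]
print(errors)
```

Each refinement halves the knot spacing, and for a twice-differentiable $f$ the error is expected to shrink by roughly a factor of 4 per level.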
To understand point (iv) it is necessary to define an unconditional basis
(or Riesz basis) for a Hilbert space, since the definition of a topological
basis given in Lesson 16 was only for the case where the basis elements were
orthogonal. We do not assume that the vectors $\tau_k g$ in Definition 42.4.2 are
orthogonal, and in important cases they are not.
42.4.3 Definition A sequence of elements $\{e_k\}_{k \in \mathbb{Z}}$ in a Hilbert space
$H$ is called an unconditional basis for $H$ if the following conditions are
satisfied:
(i) For each $f \in H$ there exists a unique complex sequence $(c_k)_{k \in \mathbb{Z}}$ in
$l^2(\mathbb{Z})$ such that
$$\Big\| f - \sum_{k=-N}^{N} c_k e_k \Big\| \to 0 \quad \text{as } N \to +\infty.$$ (42.14)
(ii) There are two positive constants $A$ and $B$ such that
$$A\|f\|^2 \le \sum_{k=-\infty}^{+\infty} |c_k|^2 \le B\|f\|^2,$$ (42.15)
which means that $f \mapsto \big(\sum_{k \in \mathbb{Z}} |c_k|^2\big)^{1/2}$ defines a norm on $H$ that is
equivalent to the original norm on $H$.
Having an unconditional basis for $H$ is equivalent to having an isomorphism
$T$ between the two Hilbert spaces $l^2(\mathbb{Z})$ and $H$. If $A = B = 1$, we
have the definition of a Hilbert basis. It is left as an exercise to show that
the hat functions $\tau_k g$ in Section 42.4.1 form an unconditional basis for $V_0$.
42.4.4 Cardinal spline functions

The example in Section 42.4.1 is easily generalized by simultaneously
increasing the degree of the polynomials and the global regularity of the
approximations. Thus $V_0$ can be expanded to the subspace of continuously
differentiable functions in $L^2(\mathbb{R})$ whose restrictions to the intervals $[k, k + 1]$
are polynomials of degree less than or equal to 2. Note that $V_0$ is not trivial,
since it is easy to exhibit nonzero functions of this sort. These are the
cardinal splines of degree 2.
If $r$ denotes the function that equals 1 on $[-1/2, 1/2]$ and zero elsewhere,
then the hat function $g$ in Figure 42.14 is equal to $r * r$. Similarly, one can show
that $g = r * r * r$ is in $V_0$ and that the sequence of translates of $g$ forms an
unconditional basis for the new space $V_0$.
One can continue this process and consider the spaces $V_0$ of cardinal
splines of degrees $3, 4, \ldots, n, \ldots$ that are in $C^{n-1} \cap L^2$, and in this way
create a family of multiresolution analyses of $L^2(\mathbb{R})$.
42.4.5 A different multiresolution analysis

Here is an example of a multiresolution analysis that is not based on spline
functions. Let
Vo = { v E L 2 (R) 1 supp(v) c [-1, 11}.
$V_0$ is closed in $L^2(\mathbb{R})$ and invariant under translation. We have
Vj = { v E L2 (R) 1 supp(v) c [-21, 211},
and the spaces $V_j$ are closed and nested. The density of $\bigcup V_j$ in $L^2(\mathbb{R})$ was
proved in Section 38.4, and it is clear that the intersection reduces to $\{0\}$.
For the function $g \in V_0$ we can take the cardinal sine
( ) sin 27rt
g t = -----;t'
and we have seen with Shannon's formula that the translates $\tau_k g$ are a
basis for $V_0$: For all $v \in V_0$,
$$v(t) = \sum_{k=-\infty}^{+\infty} v(k)\,g(t - k).$$
It happens in this case that the $g(t - k)$ are orthonormal. On the other
hand, $g$ decays slowly at infinity and is not integrable. In the sense of
Definitions 42.3.3 and 42.3.4, the $g(t - k)$ do not form a wavelet basis.
42.5 Multiresolution analysis and wavelet bases
We are going to see how it is possible, given a multiresolution analysis of
$L^2(\mathbb{R})$, to construct an orthonormal basis for $L^2(\mathbb{R})$ of the form
$$\psi_{j,k}(t) = 2^{j/2}\,\psi(2^j t - k), \quad j, k \in \mathbb{Z}.$$
To have a wavelet basis in the sense of Definition 42.3.4, it is then sufficient
to verify that the wavelet $\psi$ satisfies Definition 42.3.3 for some $r$.
Finding such a function $\psi$ is not an easy problem, as we have seen in the
case of Meyer's $C^\infty$ wavelet. We will first look for orthonormal bases of the
form $\{\varphi_{jk}\}_{k \in \mathbb{Z}}$ for the subspaces $V_j$. This will not solve the problem
directly, but it is an important initial step. Based on the relations
$$v(t) \in V_0 \iff v(2^j t) \in V_j,$$
it is sufficient to find an orthonormal basis of the form $\{\varphi(t - k)\}_{k \in \mathbb{Z}}$ for
$V_0$. The corresponding bases for the $V_j$ will be
$$\varphi_{jk}(t) = 2^{j/2}\,\varphi(2^j t - k), \quad j, k \in \mathbb{Z}.$$
We are assuming that we have a multiresolution analysis, so by definition we
have an unconditional basis for $V_0$, namely, the functions $g_k(t) = g(t - k)$.
One way to proceed is to transform $\{g_k\}$ into an orthonormal basis using
the Gram-Schmidt method. Meyer used instead a method, attributed to
Henri Poincare, that does not depend on the way the $g_k$ are indexed and
that transforms a basis of the form $g_k(x) = g(x - k)$ into one of the same
form [Mey90].
Let $\{g_k\}_{k \in \mathbb{Z}}$ be an unconditional basis for a Hilbert space $H$. Define the
linear operator $T : H \to H$ by
$$Tx = \sum_{k=-\infty}^{+\infty} (x, g_k)\,g_k.$$
This operator is continuous and self-adjoint, and there is an $a > 0$ such
that for all $x \in H$,
$$(Tx, x) \ge a\|x\|^2.$$
Thus $T$ has an inverse that is also positive and self-adjoint. By a result of
Schur, it is possible to define a positive self-adjoint operator $U$ such that
$U^2 = T^{-1}$. It is then easy to verify that the vectors $\{U g_k\}_{k \in \mathbb{Z}}$ form the
desired orthonormal basis.
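In finite dimensions the same recipe can be carried out with a matrix square root. The following sketch is only a finite-dimensional analogue and makes several assumptions: $H = \mathbb{R}^4$, the basis vectors $g_k$ are the columns of an invertible matrix `G`, and $U = T^{-1/2}$ is computed from an eigendecomposition of $T = GG^*$. The function name `symmetric_orthonormalize` is illustrative.

```python
import numpy as np


def symmetric_orthonormalize(G):
    """Columns of G are the basis vectors g_k; returns the columns U g_k."""
    T = G @ G.conj().T                      # T x = sum_k (x, g_k) g_k
    eigvals, eigvecs = np.linalg.eigh(T)    # T is Hermitian positive definite
    # U is the positive self-adjoint square root of T^{-1}, so U^2 = T^{-1}.
    U = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.conj().T
    return U @ G


rng = np.random.default_rng(0)
G = rng.normal(size=(4, 4)) + 2.0 * np.eye(4)   # a generic non-orthogonal basis
Q = symmetric_orthonormalize(G)
gram = Q.conj().T @ Q                            # Gram matrix of the new vectors
```

Unlike Gram-Schmidt, the result does not depend on the order of the columns, which is the property the text attributes to Poincare's method.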
Although this is an abstract result, it can be used as a guide to attack
the problem directly to produce an orthonormal basis expressed in terms
of the $g_k$. This is the approach we now take.
42.5.1 Proposition The set of functions $\{\tau_k \varphi\}_{k \in \mathbb{Z}}$ is an orthonormal
family in $V_0$ if and only if
$$\sum_{k=-\infty}^{+\infty} |\hat\varphi(\lambda + k)|^2 = 1$$ (42.16)
for almost every $\lambda \in \mathbb{R}$.

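Proof. By definition, the functions $\tau_k \varphi$ are orthonormal if and only if
$$\int_{-\infty}^{+\infty} \varphi(t - p)\,\overline{\varphi(t - q)}\,dt = \begin{cases} 0 & \text{if } p \neq q, \\ 1 & \text{if } p = q \end{cases}$$
for all $p, q \in \mathbb{Z}$. This is equivalent to
$$\int_{-\infty}^{+\infty} \hat\varphi(\lambda)\,\overline{\hat\varphi(\lambda)}\,e^{-2i\pi(p - q)\lambda}\,d\lambda = \begin{cases} 0 & \text{if } p \neq q, \\ 1 & \text{if } p = q, \end{cases}$$
which, by letting $n = p - q$, is equivalent to having
$$\int_{-\infty}^{+\infty} |\hat\varphi(\lambda)|^2\,e^{-2i\pi n\lambda}\,d\lambda = \begin{cases} 0 & \text{if } n \neq 0, \\ 1 & \text{if } n = 0. \end{cases}$$ (42.17)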
Finally, we show that (42.17) is equivalent to (42.16). Since $|\hat\varphi(\lambda)|^2 \in L^1$,
we know from Theorem 37.2.2 that $\sum_{k=-\infty}^{+\infty} |\hat\varphi(\lambda + k)|^2 \in L^1_p(0, 1)$ and that
$$\sum_{k=-\infty}^{+\infty} |\hat\varphi(\lambda + k)|^2 = \sum_{n=-\infty}^{+\infty} \mathscr{F}(|\hat\varphi|^2)(n)\,e^{2i\pi n\lambda}$$
in the sense of $\mathscr{S}'$. If (42.16) holds (in the sense of $L^1$), then the Fourier
coefficients $\mathscr{F}(|\hat\varphi|^2)(n)$ satisfy condition (42.17). On the other hand, if we have
(42.17), then
$$\sum_{k=-\infty}^{+\infty} |\hat\varphi(\lambda + k)|^2 = 1$$
in the sense of $\mathscr{S}'$. But by Exercise 21.6, this implies that the relation
holds for almost every $\lambda \in \mathbb{R}$. □
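Condition (42.16) can be tested numerically on a simple example. The sketch below assumes $\varphi$ is the indicator function of $[0, 1)$, whose integer translates are clearly orthonormal; this choice is an illustration and is not taken from the text. With the convention $\hat\varphi(\lambda) = \int \varphi(t)\,e^{-2i\pi\lambda t}\,dt$, one has $|\hat\varphi(\lambda)| = |\sin \pi\lambda / (\pi\lambda)|$, which is what `np.sinc` computes. The infinite sum is truncated at `kmax`, so the result is only approximate.

```python
import numpy as np


def periodized_energy(lam, kmax=20000):
    """Truncated sum over k of |phi_hat(lam + k)|^2 for the box function."""
    k = np.arange(-kmax, kmax + 1)
    # np.sinc(x) = sin(pi x) / (pi x), the modulus of phi_hat up to a phase.
    return np.sum(np.sinc(lam + k) ** 2)


# The sum should be close to 1 at every lambda, as (42.16) requires.
values = [periodized_energy(lam) for lam in (0.0, 0.1, 0.25, 0.5)]
print(values)
```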
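42.5.2 Proposition If the family $\{\tau_k \varphi\}_{k \in \mathbb{Z}}$ is an orthonormal basis
for $V_0$, then there exists a function $M$ in $L^2_p(0, 1)$ such that
$$\hat\varphi(\lambda) = M(\lambda)\,\hat g(\lambda)$$
and for a.e. $\lambda \in \mathbb{R}$,
$$|M(\lambda)|^2 \sum_{k=-\infty}^{+\infty} |\hat g(\lambda + k)|^2 = 1.$$ (42.18)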
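Proof. Since $\varphi \in V_0$, there exists a sequence $(m_k)$ in $l^2(\mathbb{Z})$ such that
$$\varphi(t) = \sum_{k=-\infty}^{+\infty} m_k\,g(t - k)$$
in $L^2(\mathbb{R})$. By taking the Fourier transform of both sides, we see that
$$\hat\varphi(\lambda) = \sum_{k=-\infty}^{+\infty} m_k\,e^{-2i\pi k\lambda}\,\hat g(\lambda) = M(\lambda)\,\hat g(\lambda),$$
where
$$M(\lambda) = \sum_{k=-\infty}^{+\infty} m_k\,e^{-2i\pi k\lambda}.$$
Clearly, $M(\lambda)$ is in $L^2_p(0, 1)$. Since $M$ has period 1, Proposition 42.5.1 gives
$$\sum_{k=-\infty}^{+\infty} |\hat\varphi(\lambda + k)|^2 = |M(\lambda)|^2 \sum_{k=-\infty}^{+\infty} |\hat g(\lambda + k)|^2 = 1,$$
which proves the result. □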

One can show that the Poincare process leads essentially to the relation
$$\hat\varphi(\lambda) = \Big[\sum_{k=-\infty}^{+\infty} |\hat g(\lambda + k)|^2\Big]^{-1/2} \hat g(\lambda).$$ (42.19)
The following theorem is due to Meyer.
42.5.3 Theorem Assume that $g \in V_0$ and that $\{\tau_k g\}$ is an unconditional
basis for $V_0$. If $\varphi$ is defined by (42.19), then $\{\tau_k \varphi\}$ is an orthonormal
basis for $V_0$.
Proof. The proof follows directly from the abstract argument, but we are
going to give a more explicit argument.
We must first show that $\varphi$ is well-defined and that $\{\tau_k \varphi\}$ is an orthonormal
family in $L^2(\mathbb{R})$. We use the assumption about $g$ to prove that
$$0 < C \le \sum_{k=-\infty}^{+\infty} |\hat g(\lambda + k)|^2 \le D$$
for some positive constants $C$ and $D$.

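For any sequence $(a_k)$ in $l^2(\mathbb{Z})$ the function
$$f(t) = \sum_{k=-\infty}^{+\infty} a_k\,g(t - k)$$
is in $L^2(\mathbb{R})$, and, as in the proof of Proposition 42.5.2,
$$\hat f(\lambda) = \sum_{k=-\infty}^{+\infty} a_k\,e^{-2i\pi k\lambda}\,\hat g(\lambda) = m(\lambda)\,\hat g(\lambda),$$
where $m \in L^2_p(0, 1)$. The hypothesis that $\{\tau_k g\}$ is an unconditional basis
implies that
$$0 < C \sum_{k=-\infty}^{+\infty} |a_k|^2 \le \|f\|^2 \le D \sum_{k=-\infty}^{+\infty} |a_k|^2$$
for some strictly positive constants $C$ and $D$, and in Fourier space,
$$0 < C \int_0^1 |m(\lambda)|^2\,d\lambda \le \int_{-\infty}^{+\infty} |m(\lambda)|^2 |\hat g(\lambda)|^2\,d\lambda \le D \int_0^1 |m(\lambda)|^2\,d\lambda.$$
The middle integral is equal to $\int_0^1 |m(\lambda)|^2 \sum_{k=-\infty}^{+\infty} |\hat g(\lambda + k)|^2\,d\lambda$, so we have
$$0 < C \int_0^1 |m(\lambda)|^2\,d\lambda \le \int_0^1 |m(\lambda)|^2 \sum_{k=-\infty}^{+\infty} |\hat g(\lambda + k)|^2\,d\lambda \le D \int_0^1 |m(\lambda)|^2\,d\lambda$$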
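for all $(a_k)$ in $l^2(\mathbb{Z})$, which is to say, for all $m \in L^2_p(0, 1)$. But this can be
true if and only if
$$0 < C \le \sum_{k=-\infty}^{+\infty} |\hat g(\lambda + k)|^2 \le D$$
for almost every $\lambda \in \mathbb{R}$, which is what we wished to prove. For convenience
we write
$$N(\lambda) = \Big[\sum_{k=-\infty}^{+\infty} |\hat g(\lambda + k)|^2\Big]^{1/2} \quad \text{and} \quad M(\lambda) = \frac{1}{N(\lambda)}.$$
Then
$$\sqrt{C} \le N(\lambda) \le \sqrt{D} \quad \text{and} \quad \frac{1}{\sqrt{D}} \le M(\lambda) \le \frac{1}{\sqrt{C}},$$
and we conclude that both $M$ and $N$ are in $L^\infty_p(0, 1)$ and hence in $L^2_p(0, 1)$.
This implies that $\hat\varphi(\lambda) = M(\lambda)\hat g(\lambda)$ is well-defined as an element of $L^2(\mathbb{R})$.
It is clear from this definition that $\hat\varphi$ satisfies (42.16); hence $\{\tau_k \varphi\}$ is an
orthonormal family.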
It remains to show that $\{\tau_k \varphi\}$ spans $V_0$, which is by now close to obvious:
Since $\hat g(\lambda) = N(\lambda)\hat\varphi(\lambda)$ and $N \in L^2_p(0, 1)$, there is a sequence $(n_k)$ in $l^2(\mathbb{Z})$
such that
$$g(t) = \sum_{k=-\infty}^{+\infty} n_k\,\varphi(t - k).$$
Since the functions $g(t - k)$ span $V_0$, this shows that the functions $\varphi(t - k)$
span $V_0$ and completes the proof. □
42.5.4 Finding an orthonormal wavelet basis for $L^2(\mathbb{R})$

Starting with a multiresolution analysis of $L^2(\mathbb{R})$, we have managed to
construct a basis $\varphi_k(t) = \varphi(t - k)$ for the space $V_0$ and thus for the spaces
$V_j$. However, we have yet to find an orthonormal wavelet basis for $L^2(\mathbb{R})$.
For this, let $W_j$ be the orthogonal complement of $V_j$ in $V_{j+1}$,
$$V_{j+1} = V_j \oplus W_j.$$
The spaces $W_j$ provide decompositions of the spaces $V_n$ (since $V_j \downarrow \{0\}$)
and of $L^2$ (since $V_n \uparrow L^2$) as direct sums of orthogonal subspaces:
$$V_n = \bigoplus_{j=-\infty}^{n-1} W_j \quad \text{and} \quad L^2(\mathbb{R}) = \bigoplus_{j=-\infty}^{+\infty} W_j.$$
Thus, if we find an orthonormal basis for each $W_j$, then we will have an
orthonormal basis for $L^2(\mathbb{R})$. As was the case with the $V_j$, it is sufficient to
solve the problem for $W_0$, since
$$v(t) \in W_0 \iff v(2^j t) \in W_j.$$
The plan is to look for a function $\psi \in W_0$ such that the functions $\tau_k \psi$ form
an orthonormal basis for $W_0$.
42.5.5 Proposition Assume that $\varphi$ is defined by (42.19). Then there
exists a function $A \in L^2_p(0, 1)$ such that, for almost all $\lambda$,
$$\hat\varphi(2\lambda) = A(\lambda)\,\hat\varphi(\lambda)$$
and
$$|A(\lambda)|^2 + |A(\lambda + 1/2)|^2 = 1.$$ (42.20)
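Proof. The function $\frac{1}{2}\varphi(\frac{t}{2})$ is in $V_{-1} \subset V_0$. Thus there exists a sequence
$(a_k)$ in $l^2(\mathbb{Z})$ such that
$$\frac{1}{2}\,\varphi\Big(\frac{t}{2}\Big) = \sum_{k=-\infty}^{+\infty} a_k\,\varphi(t - k).$$
Taking the Fourier transform, this becomes
$$\hat\varphi(2\lambda) = \sum_{k=-\infty}^{+\infty} a_k\,e^{-2i\pi k\lambda}\,\hat\varphi(\lambda) = A(\lambda)\,\hat\varphi(\lambda),$$
and $A$ is clearly in $L^2_p(0, 1)$. From (42.16) we see that
$$\sum_{k=-\infty}^{+\infty} |\hat\varphi(2\lambda + 2k)|^2 = |A(\lambda)|^2 \sum_{k=-\infty}^{+\infty} |\hat\varphi(\lambda + k)|^2 = |A(\lambda)|^2.$$
Replacing $\lambda$ with $\lambda + 1/2$ we have
$$\sum_{k=-\infty}^{+\infty} |\hat\varphi(2\lambda + 2k + 1)|^2 = |A(\lambda + 1/2)|^2 \sum_{k=-\infty}^{+\infty} |\hat\varphi(\lambda + k + 1/2)|^2 = |A(\lambda + 1/2)|^2.$$
The result is obtained by adding the last two equations: the two left-hand
sides add up to $\sum_k |\hat\varphi(2\lambda + k)|^2$, which equals 1 by (42.16). □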
We next investigate the conditions that $\psi$ must satisfy.
42.5.6 Proposition If $\psi$ exists, then there exists a function $B$ in
$L^2_p(0, 1)$ that satisfies the following conditions:
(i) $\hat\psi(2\lambda) = B(\lambda)\,\hat\varphi(\lambda)$. (42.21)
(ii) $|B(\lambda)|^2 + |B(\lambda + 1/2)|^2 = 1$. (42.22)
(iii) $A(\lambda)\overline{B(\lambda)} + A(\lambda + 1/2)\overline{B(\lambda + 1/2)} = 0$. (42.23)
Proof. First note that, as for $\varphi$ in Proposition 42.5.1, if the $\psi(t - k)$ form
an orthonormal basis for $W_0$, then
$$\sum_{k=-\infty}^{+\infty} |\hat\psi(\lambda + k)|^2 = 1.$$ (42.24)
On the other hand, all of the functions $\varphi(t - k)$ are orthogonal to $\psi$, and
therefore
$$\int_{-\infty}^{+\infty} \varphi(t - k)\,\overline{\psi(t)}\,dt = \int_{-\infty}^{+\infty} \hat\varphi(\lambda)\,\overline{\hat\psi(\lambda)}\,e^{-2i\pi k\lambda}\,d\lambda = 0$$
for all $k \in \mathbb{Z}$. By Poisson's formula (or direct manipulation), this implies
that
$$\sum_{k=-\infty}^{+\infty} \hat\varphi(\lambda + k)\,\overline{\hat\psi(\lambda + k)} = 0.$$ (42.25)
The existence of $B$ satisfying (42.21) follows from the same argument we
used in Proposition 42.5.5 to show the existence of $A$ for $\varphi$. Thus
$$B(\lambda) = \sum_{k=-\infty}^{+\infty} b_k\,e^{-2i\pi k\lambda}$$
is derived from the relation
$$\frac{1}{2}\,\psi\Big(\frac{t}{2}\Big) = \sum_{k=-\infty}^{+\infty} b_k\,\varphi(t - k).$$ (42.26)
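Similarly, identity (42.22) follows from (42.24). With $\lambda$ replaced by $2\lambda$
and the sum split into even and odd indices, we can write (42.25) as
$$\sum_{k=-\infty}^{+\infty} \Big[\hat\varphi(2\lambda + 2k)\,\overline{\hat\psi(2\lambda + 2k)} + \hat\varphi(2\lambda + 2k + 1)\,\overline{\hat\psi(2\lambda + 2k + 1)}\Big] = 0.$$
Using the relations $\hat\varphi(2\lambda) = A(\lambda)\hat\varphi(\lambda)$ and $\hat\psi(2\lambda) = B(\lambda)\hat\varphi(\lambda)$ shows that
$$A(\lambda)\overline{B(\lambda)} \sum_{k=-\infty}^{+\infty} \hat\varphi(\lambda + k)\,\overline{\hat\varphi(\lambda + k)} + A(\lambda + 1/2)\overline{B(\lambda + 1/2)} \sum_{k=-\infty}^{+\infty} \hat\varphi(\lambda + k + 1/2)\,\overline{\hat\varphi(\lambda + k + 1/2)} = 0,$$
which is
$$A(\lambda)\overline{B(\lambda)} \sum_{k=-\infty}^{+\infty} |\hat\varphi(\lambda + k)|^2 + A(\lambda + 1/2)\overline{B(\lambda + 1/2)} \sum_{k=-\infty}^{+\infty} |\hat\varphi(\lambda + k + 1/2)|^2 = 0.$$
This and (42.16) yield (42.23). □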
42.5.7 Computing the functions $B$ and $\psi$

To find $\psi$, we look for a $B$ that satisfies (42.22) and (42.23). By solving
equations (42.20) and (42.23) for $B(\lambda)$ we see that
$$B(\lambda) = -\overline{A(\lambda + 1/2)}\,\big[A(\lambda)B(\lambda + 1/2) - A(\lambda + 1/2)B(\lambda)\big].$$
We write this as
$$B(\lambda) = e^{-2i\pi\lambda}\,\overline{A(\lambda + 1/2)}\,\theta(\lambda)$$
with
$$\theta(\lambda) = -e^{2i\pi\lambda}\big[A(\lambda)B(\lambda + 1/2) - A(\lambda + 1/2)B(\lambda)\big]$$
and make two observations. Note that $\theta(\lambda + 1/2) = \theta(\lambda)$ and, since $A$ and
$B$ must satisfy (42.20) and (42.22), that
$$|\theta(\lambda)| = 1.$$ (42.27)
Conversely, it is easy to show that any function $\theta$ with period 1/2
satisfying (42.27) will work. A simple family of functions $\theta$ is
but in practice one usually takes $a = 0$. Thus let $B$ be defined by
$$B(\lambda) = e^{-2i\pi\lambda}\,\overline{A(\lambda + 1/2)};$$ (42.28)
then $\psi$ is defined in terms of its Fourier transform by (42.21).
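As a quick sanity check of the choice $B(\lambda) = e^{-2i\pi\lambda}\,\overline{A(\lambda + 1/2)}$, the sketch below evaluates the three relations (42.20), (42.22), and (42.23) on a grid. It assumes a specific low-pass filter, $A(\lambda) = (1 + e^{-2i\pi\lambda})/2$, which corresponds to taking $\varphi$ to be the indicator function of $[0, 1)$ (the Haar case); this example is an assumption made for illustration and is not taken from the text.

```python
import numpy as np


def A(lam):
    """Assumed example: Haar low-pass filter (1 + exp(-2 i pi lam)) / 2."""
    return 0.5 * (1.0 + np.exp(-2j * np.pi * lam))


def B(lam):
    """High-pass filter exp(-2 i pi lam) * conj(A(lam + 1/2))."""
    return np.exp(-2j * np.pi * lam) * np.conj(A(lam + 0.5))


lam = np.linspace(0.0, 1.0, 101)
# Left-hand sides of (42.20), (42.22), and (42.23) on the grid.
norm_A = np.abs(A(lam)) ** 2 + np.abs(A(lam + 0.5)) ** 2
norm_B = np.abs(B(lam)) ** 2 + np.abs(B(lam + 0.5)) ** 2
cross = A(lam) * np.conj(B(lam)) + A(lam + 0.5) * np.conj(B(lam + 0.5))
```

The same three checks can be applied to any other candidate filter $A$ by replacing the body of `A`.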
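42.5.8 Theorem If $\psi$ is defined by (42.21) and (42.28), then the set
of functions $\{\tau_k \psi\}_{k \in \mathbb{Z}}$ is an orthonormal basis for $W_0$ and the functions
$$\psi_{jk}(t) = 2^{j/2}\,\psi(2^j t - k), \quad j, k \in \mathbb{Z},$$
are an orthonormal basis for $L^2(\mathbb{R})$.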
Proof. The first task is to sort out what has been proved and what remains
to be proved. We assume that we have a multiresolution analysis of $L^2(\mathbb{R})$
and that we have in hand a function $\varphi$ such that the $\varphi_k(t) = \varphi(t - k)$ form an
orthonormal basis for $V_0$. By definition, $W_0$ is the orthogonal complement
of $V_0$ in $V_1$, so $V_1 = V_0 \oplus W_0$, and this implies by a change of scale that
$V_{j+1} = V_j \oplus W_j$ for all $j \in \mathbb{Z}$. The fact that
$$L^2 = \bigoplus_{j=-\infty}^{+\infty} W_j$$
is then a direct consequence of Definition 42.4.2(iii). Thus, to prove that
$\{\psi_{jk}\}_{j,k \in \mathbb{Z}}$ is an orthonormal basis for $L^2(\mathbb{R})$, it is sufficient to show that
the $\psi_{0k} = \psi_k$ form an orthonormal basis for $W_0$.
This is how we proceed: Define $B$ by (42.28) and $\psi$ by (42.21); then work
the arguments of Proposition 42.5.6 backwards to show that $B$ satisfies
(42.22) and (42.23) and that $\psi$ satisfies relations (42.24) and (42.25). These
are straightforward computations, and we leave this part as an exercise.
This proves that the functions $\psi_k$ are in $W_0$ and that they form an orthonormal
family in $W_0$. What remains to be shown is that the $\psi_k$ span $W_0$. To
do this, we will show the existence of sequences $(c_k), (d_k), (e_k), (f_k) \in l^2(\mathbb{Z})$
such that the functions $\varphi(t)$, $\varphi(2t)$, and $\psi(t)$ are related by the following
equations:
$$\varphi(2t) = \sum_{k=-\infty}^{+\infty} c_k\,\varphi(t - k) + \sum_{k=-\infty}^{+\infty} d_k\,\psi(t - k),$$
$$\varphi(2t - 1) = \sum_{k=-\infty}^{+\infty} e_k\,\varphi(t - k) + \sum_{k=-\infty}^{+\infty} f_k\,\psi(t - k).$$ (42.29)
Once we have (42.29) we have the result: These relations show that for each
$n \in \mathbb{Z}$, the function $\varphi(2t - n)$ can be expressed as a linear combination of
the $\varphi_k$ and the $\psi_k$. If the $\psi_k$ do not span $W_0$, there is a nonzero element
$h \in W_0$ such that $(h, \psi_k) = 0$ for all $k$. Since $W_0 \perp V_0$, $(h, \varphi_k) = 0$ for all
$k$. But $h \in V_1 = V_0 \oplus W_0$ and the $\varphi(2t - k)$ form a basis for $V_1$; this and
(42.29) imply that $h = 0$. Hence the $\psi_k$ must span $W_0$.
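The last step is to show that we do indeed have (42.29). In the Fourier
domain, (42.29) is equivalent, up to a constant factor, to the existence of
four functions $C, D, E, F$ belonging to $L^2_p(0, 1)$ such that
$$\hat\varphi(\lambda/2) = C(\lambda)\,\hat\varphi(\lambda) + D(\lambda)\,\hat\psi(\lambda),$$
$$e^{-i\pi\lambda}\,\hat\varphi(\lambda/2) = E(\lambda)\,\hat\varphi(\lambda) + F(\lambda)\,\hat\psi(\lambda).$$ (42.30)
Using the properties of $A$ and the definitions of $B$ and $\psi$ it is easy to
show that the following system satisfies the requirements. (It is slightly more
difficult to find these relations "from scratch.")
$$C(\lambda) = \overline{A(\lambda/2)} + \overline{A(\lambda/2 + 1/2)},$$
$$D(\lambda) = \overline{B(\lambda/2)} + \overline{B(\lambda/2 + 1/2)},$$
$$E(\lambda) = e^{-i\pi\lambda}\big[\overline{A(\lambda/2)} - \overline{A(\lambda/2 + 1/2)}\big],$$
$$F(\lambda) = e^{-i\pi\lambda}\big[\overline{B(\lambda/2)} - \overline{B(\lambda/2 + 1/2)}\big].$$
While this completes the proof, much can be said about this result and the
questions it raises. A few comments are given below. □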
42.5.9 Remarks
(a) Formula (42.28) provides a relation between the Fourier coefficients
$a_k$ and $b_k$ of $A$ and $B$:
$$b_k = (-1)^{1-k}\,\overline{a_{1-k}}.$$
This allows one to obtain $\psi$ in terms of $\varphi$ without a Fourier transform,
since by (42.26),
$$\psi(t) = 2 \sum_{k=-\infty}^{+\infty} (-1)^{1-k}\,\overline{a_{1-k}}\,\varphi(2t - k).$$
However, if one starts with a multiresolution analysis with
$$\sum_{k=-\infty}^{+\infty} |\hat g(\lambda + k)|^2 \neq 1,$$
then a Fourier transform is needed to construct $\varphi$ via (42.19).
Once we have an orthonormal basis for each $W_j$, a signal $f$ is decomposed
as the sum of its projections on the spaces $W_j$:
$$f = \sum_{j=-\infty}^{+\infty} f_j,$$
with
$$f_j = \sum_{k=-\infty}^{+\infty} (f, \psi_{jk})\,\psi_{jk},$$
and the approximation of $f$ at the resolution $2^{-n}$ is given by its orthogonal
projection on $V_n$, which is
$$F_n = \sum_{j=-\infty}^{n-1} f_j.$$
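The coefficient relation $b_k = (-1)^{1-k}\,\overline{a_{1-k}}$ and the formula $\psi(t) = 2\sum_k b_k\,\varphi(2t - k)$ from remark (a) can be tried on the simplest case. The sketch below assumes the Haar setting, with $\varphi$ the indicator function of $[0, 1)$ and $a_0 = a_1 = 1/2$; these values are an illustrative assumption, and the helper names are hypothetical.

```python
import numpy as np


def wavelet_coefficients(a):
    """Map {k: a_k} to {k: b_k} with b_k = (-1)**(1 - k) * conj(a_(1 - k))."""
    # Substituting k -> 1 - k gives b_(1 - k) = (-1)**k * conj(a_k).
    return {1 - k: (-1) ** k * np.conj(ak) for k, ak in a.items()}


def box(t):
    """Assumed scaling function: indicator of [0, 1)."""
    return np.where((t >= 0.0) & (t < 1.0), 1.0, 0.0)


def psi(t, b):
    """psi(t) = 2 * sum_k b_k * phi(2t - k), with phi = box."""
    return 2.0 * sum(bk * box(2.0 * t - k) for k, bk in b.items())


a = {0: 0.5, 1: 0.5}
b = wavelet_coefficients(a)
t = np.array([0.25, 0.75, 1.5])
print(psi(t, b))
```

With these conventions the result takes the value $-1$ on $[0, 1/2)$ and $+1$ on $[1/2, 1)$, that is, a Haar wavelet up to sign; the sign does not affect orthonormality.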
(b) The approach has been to start with a multiresolution analysis of
$L^2(\mathbb{R})$ and to construct $\varphi$ and $\psi$. It is possible to begin with a function $\varphi$
in $L^2(\mathbb{R})$ and to consider the closed subspace $V_0$ spanned by the translates of
$\varphi$. A natural question arises: What assumptions about $\varphi$ will guarantee that
$\{V_j\}_{j \in \mathbb{Z}}$ is a multiresolution analysis of $L^2(\mathbb{R})$? Clearly, we want $\{\varphi(t - k)\}$
to be an orthonormal family, so we must assume that $\varphi$ satisfies (42.16).
(Otherwise, we must assume $0 < C \le \sum_{k \in \mathbb{Z}} |\hat\varphi(\lambda + k)|^2 \le D$ and transform
$\{\varphi(t - k)\}$ into an orthonormal family.) The function $\varphi$ must also satisfy
$$\frac{1}{2}\,\varphi\Big(\frac{t}{2}\Big) = \sum_{k=-\infty}^{+\infty} a_k\,\varphi(t - k)$$ (42.31)
for some sequence $(a_k)$ in $l^2(\mathbb{Z})$. With these assumptions it is easy to
show that the closed subspaces $V_j$ generated by the orthonormal families
$\{\varphi_{jk}\}_{k \in \mathbb{Z}} = \{2^{j/2}\varphi(2^j t - k)\}_{k \in \mathbb{Z}}$ fulfill conditions (i), (ii), and (iv) of
Definition 42.4.2. It is also not difficult to prove that $\bigcap_{j \in \mathbb{Z}} V_j = \{0\}$. The
difficult part is to show that $\bigcup_{j \in \mathbb{Z}} V_j$ is dense in $L^2$. This can be proved
with the additional assumptions that $\hat\varphi(\lambda)$ is bounded for all $\lambda$ and that
it is continuous near $\lambda = 0$ with $|\hat\varphi(0)| = 1$ (see [Dau92, p. 142]). These
conditions are fulfilled, for example, if ... If $\hat\varphi$ is continuous near
$\lambda = 0$ with $|\hat\varphi(0)| = 1$, then by Proposition
42.5.5, $|A(0)| = 1$ and $A(1/2) = 0$, and from the definition of $B$, we have
$B(0) = 0$ and $|B(1/2)| = 1$. If $A$ and $B$ are interpreted as transfer functions
of filters, then $A$ passes frequencies near $\lambda = 0$ and attenuates frequencies
near $\lambda = 1/2$. Thus $A$ acts like a low-pass filter. Similarly, $B$ acts like
a high-pass filter. The impulse responses of the two filters are $(a_k)$ and
$(b_k)$ respectively. Filters $A$ and $B$ that satisfy the relations of Propositions
42.5.5 and 42.5.6, which can be summarized by saying that the matrix
$$\begin{bmatrix} A(\lambda) & B(\lambda) \\ A(\lambda + 1/2) & B(\lambda + 1/2) \end{bmatrix}$$
is unitary for almost all $\lambda$, are called conjugate quadrature filters.
(c) As indicated several times, regularity and localization of the scaling
function $\varphi$ and the wavelets $\psi_{jk}$ are necessary for efficient numerical
computation. Thus the "minimal" assumptions made about $\varphi$ in (b) do not
lead to practical wavelets. If we assume, however, that
for all $m \in \mathbb{Z}$, in addition to assuming that $\varphi$ satisfies (42.16) and (42.31)
and that $|\hat\varphi(0)| = 1$, the whole situation becomes much "smoother." In
this case, the coefficients $a_k$ decrease rapidly at infinity and $A \in C^\infty$.
Furthermore, not only do assumptions about the regularity of $\varphi$ and the
localization of its derivatives lead to the regularity of $\psi$, but they also imply
that $\psi$ has vanishing moments. This analysis can be found in [CR95].
42.5.10 Spline wavelets

We began the discussion of multiresolution analysis by describing the
multiresolution analysis of $L^2(\mathbb{R})$ based on the space $V_0$ that is spanned by the
integer translates of the "hat" function (Figure 42.14). We also mentioned
that this example could be generalized by taking $V_0$ to be the space spanned
by the cardinal splines of degree $n$. More precisely, if $r$ is the characteristic
function of the interval $[-1/2, 1/2]$ and if $g_n$ denotes the convolution
$r * r * \cdots * r$ containing $n + 1$ terms, the functions $\tau_k g_n$, $k \in \mathbb{Z}$, form a Riesz
basis for $V_0$, and the nested spaces $V_j$ constitute a multiresolution analysis
of $L^2(\mathbb{R})$. We wish to continue this example in light of what we now know
about the wavelets associated with a multiresolution analysis. For simplicity,
we limit the discussion to the case $n = 1$ and write $g = g_1 = r * r$. It is
clear from Figure 42.14 that the functions $\tau_k g$ are not orthogonal. However,
it is easy to see that $g$ satisfies the equation
$$g(t) = \frac{1}{2}\,g(2t + 1) + g(2t) + \frac{1}{2}\,g(2t - 1).$$ (42.32)
This relation implies that $V_0 \subset V_1$ and thus that $V_j \subset V_{j+1}$ by a change
of scale. It was argued following Definition 42.4.2 that $\{V_j\}$ is a
multiresolution analysis of $L^2(\mathbb{R})$.
For later use, we take the Fourier transform of both sides of (42.32) and
write
$$\hat g(2\lambda) = G(\lambda)\,\hat g(\lambda),$$ (42.33)
where $G(\lambda) = (1 + \cos 2\pi\lambda)/2$. Note that $G$ is real and even.
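Both the two-scale relation for the hat function and its Fourier-side form $\hat g(2\lambda) = G(\lambda)\,\hat g(\lambda)$ can be verified on a grid. The sketch assumes $g(t) = \max(0, 1 - |t|)$ and uses $\hat g(\lambda) = \sin^2(\pi\lambda)/(\pi\lambda)^2$, written with `np.sinc`; the function names are illustrative.

```python
import numpy as np


def hat(t):
    """Hat function g = r * r, supported on [-1, 1]."""
    return np.maximum(0.0, 1.0 - np.abs(t))


def hat_hat(lam):
    """Fourier transform of the hat function: (sin(pi lam) / (pi lam))**2."""
    return np.sinc(lam) ** 2


def G(lam):
    return 0.5 * (1.0 + np.cos(2.0 * np.pi * lam))


t = np.linspace(-2.0, 2.0, 801)
two_scale = 0.5 * hat(2 * t + 1) + hat(2 * t) + 0.5 * hat(2 * t - 1)

lam = np.linspace(-3.0, 3.0, 601)
lhs, rhs = hat_hat(2 * lam), G(lam) * hat_hat(lam)
```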
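Since the translates $\tau_k g$, $k \neq 0$, are not orthogonal to $g$, it is necessary
to transform the $\tau_k g$ into an orthonormal family. For this, we use Theorem
42.5.3 and define $\varphi$ by (42.19):
$$\hat\varphi(\lambda) = \Big[\sum_{k=-\infty}^{+\infty} |\hat g(\lambda + k)|^2\Big]^{-1/2} \hat g(\lambda).$$
Since $\hat r(\lambda) = \sin \pi\lambda/(\pi\lambda)$, we have
$$\hat g(\lambda) = \frac{\sin^2 \pi\lambda}{\pi^2\lambda^2}.$$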
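An expression for the function $\sum_{k=-\infty}^{+\infty} |\hat g(\lambda + k)|^2$ is computed by
evaluating its Fourier coefficients as follows:
$$\int_0^1 \sum_{k=-\infty}^{+\infty} |\hat g(\lambda + k)|^2\,e^{2i\pi n\lambda}\,d\lambda = \int_{\mathbb{R}} |\hat g(\lambda)|^2\,e^{2i\pi n\lambda}\,d\lambda = \int_{\mathbb{R}} \hat g(\lambda)\,\overline{\hat g(\lambda)}\,e^{2i\pi n\lambda}\,d\lambda = \int_{\mathbb{R}} g(t)\,g(t - n)\,dt.$$
A simple computation shows that
$$\int_{\mathbb{R}} g(t)\,g(t - n)\,dt = \begin{cases} 2/3 & \text{if } n = 0, \\ 1/6 & \text{if } n = \pm 1, \\ 0 & \text{otherwise.} \end{cases}$$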
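Thus, this infinite sum has the simple expression
$$\sum_{k=-\infty}^{+\infty} |\hat g(\lambda + k)|^2 = \frac{2}{3} + \frac{1}{3}\cos 2\pi\lambda = \frac{1}{3}\big[1 + 2\cos^2 \pi\lambda\big],$$
and we can write $\hat\varphi$ (42.19) as
$$\hat\varphi(\lambda) = M(\lambda)\,\hat g(\lambda) = \frac{\sqrt{3}}{[1 + 2\cos^2 \pi\lambda]^{1/2}}\,\frac{\sin^2 \pi\lambda}{\pi^2\lambda^2}.$$ (42.34)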
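The closed form for the periodized sum can be compared with a direct (truncated) summation. The sketch below relies on $|\hat g(\lambda)|^2 = (\sin \pi\lambda / (\pi\lambda))^4$, computed with `np.sinc`; the truncation parameter `kmax` and the function names are assumptions of the example. The terms decay like $k^{-4}$, so a modest truncation is already very accurate.

```python
import numpy as np


def periodized_hat_energy(lam, kmax=2000):
    """Truncated sum over k of |g_hat(lam + k)|^2 for the hat function."""
    k = np.arange(-kmax, kmax + 1)
    return np.sum(np.sinc(lam + k) ** 4)


def closed_form(lam):
    """(1 + 2 cos^2(pi lam)) / 3, the expression derived in the text."""
    return (1.0 + 2.0 * np.cos(np.pi * lam) ** 2) / 3.0


grid = np.linspace(0.0, 1.0, 21)
numeric = np.array([periodized_hat_energy(lam) for lam in grid])
exact = closed_form(grid)
```

The sum ranges between $1/3$ (at $\lambda = 1/2$) and $1$ (at $\lambda = 0$), which gives explicit Riesz bounds for the hat-function basis.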
The function $M(\lambda) = \sqrt{3}\,[1 + 2\cos^2 \pi\lambda]^{-1/2}$ is in $C^\infty(\mathbb{R}) \cap L^2_p(0, 1)$. It is a
periodic tempered distribution, and by Theorem 36.2.2, $\overline{\mathscr{F}} M$, which we
denote by $m$, can be expressed as
$$\overline{\mathscr{F}} M = m = \sum_{n=-\infty}^{+\infty} \alpha_n\,\delta_n,$$
where $(\alpha_n)$ is a slowly increasing sequence. In fact, since $M \in L^2_p(0, 1)$, $\alpha_n$
tends to zero as $|n| \to +\infty$. An application of Proposition 33.2.1 shows
that
$$\widehat{g * \overline{\mathscr{F}} M} = \hat g\,M,$$
and since the Fourier transform is 1-to-1 on $\mathscr{S}'$, we must have
$$\varphi(t) = g * m(t).$$ (42.35)
We can draw several conclusions from this representation of $\varphi$. First, it is
clear from (42.35) that $\varphi$ is a cardinal spline of degree 1. To be precise,
$\varphi$ is the spline one obtains by connecting the points $(n, \alpha_n)$ with straight
lines. The second observation is that $\varphi$ does not have compact support, or
more to the point, $m$ does not have compact support. If it did, then $M$
would be a trigonometric polynomial, but this assumption leads quickly to
a contradiction. The conclusion is that the support of the scaling function
$\varphi$ is all of $\mathbb{R}$. Finally, note that $(\alpha_n)$ is real and even, since $M$ is real and
even.
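The coefficients $\alpha_n$ can be approximated by sampling $M(\lambda) = \sqrt{3}\,[1 + 2\cos^2 \pi\lambda]^{-1/2}$ on a uniform grid and applying a discrete Fourier transform. This is a numerical sketch under stated assumptions: the grid size `N` is arbitrary, the index `n` is understood modulo `N`, and the orthonormality of the translates of $\varphi = \sum_n \alpha_n\,g(\cdot - n)$ is checked through the inner products $(g, g(\cdot - n)) = 2/3, 1/6, 0$ of the hat function.

```python
import numpy as np

N = 256
lam = np.arange(N) / N
M = np.sqrt(3.0 / (1.0 + 2.0 * np.cos(np.pi * lam) ** 2))

# alpha[n] approximates alpha_n (indices mod N); M is real and even, so the
# sign convention of the transform does not matter and the result is real.
alpha = np.fft.ifft(M).real

# c[j] = sum_n alpha_n * (g(. - n), g(. - j)), using the hat-function Gram values.
c = (2.0 / 3.0) * alpha + (1.0 / 6.0) * (np.roll(alpha, 1) + np.roll(alpha, -1))

# gram[p] approximates the inner product of phi with phi(. - p).
gram = [float(np.dot(alpha, np.roll(c, p))) for p in range(4)]
print(alpha[:4], gram)
```

The computed $\alpha_n$ alternate in sign and decay geometrically without ever vanishing, in line with the observation that $m$ does not have compact support.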
As a step toward constructing the wavelet $\psi$, we need to describe the
filter $A$ that appears in Proposition 42.5.5. From (42.33) and (42.34), it
follows that
$$A(\lambda) = \frac{M(2\lambda)\,G(\lambda)}{M(\lambda)},$$
and from what we know about $M$ and $G$, it is not difficult to see that $A$ is
real and even. A quick computation shows that $A(\lambda + 1/2)$ is also even.
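The filter $A(\lambda) = M(2\lambda)\,G(\lambda)/M(\lambda)$ can be evaluated directly from the closed forms $G(\lambda) = (1 + \cos 2\pi\lambda)/2$ and $M(\lambda) = \sqrt{3}\,[1 + 2\cos^2 \pi\lambda]^{-1/2}$. The sketch below checks numerically that it is even, that it behaves like a low-pass filter ($A(0) = 1$, $A(1/2) = 0$), and that it satisfies the relation $|A(\lambda)|^2 + |A(\lambda + 1/2)|^2 = 1$ of Proposition 42.5.5. The grid is an arbitrary choice.

```python
import numpy as np


def G(lam):
    return 0.5 * (1.0 + np.cos(2.0 * np.pi * lam))


def M(lam):
    return np.sqrt(3.0 / (1.0 + 2.0 * np.cos(np.pi * lam) ** 2))


def A(lam):
    """Low-pass filter of the degree-1 spline multiresolution analysis."""
    return M(2.0 * lam) * G(lam) / M(lam)


lam = np.linspace(-1.0, 1.0, 401)
total = A(lam) ** 2 + A(lam + 0.5) ** 2   # A is real, so |A|^2 = A^2
print(A(0.0), A(0.5))
```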
The wavelet $\psi$ is defined by $\hat\psi(2\lambda) = e^{-2i\pi\lambda} A(\lambda + 1/2)\,\hat\varphi(\lambda)$ ((42.21) and
(42.28), with $A$ real), and since $\hat\varphi = M\hat g$ by (42.34), this can be written as
$$\hat\psi(2\lambda) = e^{-2i\pi\lambda} A(\lambda + 1/2)\,M(\lambda)\,\hat g(\lambda).$$ (42.36)
FIGURE 42.15. Spline wavelet of degree 1 (Lemarie-Battle).
As a last step, we wish to show that $\psi$ is a real-valued cardinal spline of
degree 1 and that it is symmetric about $t = 1/2$. For ease of notation,
write $S(\lambda) = A(\lambda + 1/2)\,M(\lambda)$. The argument regarding $M$ applies to $S$,
and consequently $\overline{\mathscr{F}} S$ can be expressed as
$$\overline{\mathscr{F}} S = s = \sum_{n=-\infty}^{+\infty} \sigma_n\,\delta_n,$$ (42.37)
where $\sigma_n$ tends to zero as $|n| \to +\infty$. Equation (42.36), written as
$$e^{2i\pi\lambda}\,\hat\psi(2\lambda) = S(\lambda)\,\hat g(\lambda),$$
implies that
$$\frac{1}{2}\,\psi\Big(\frac{t + 1}{2}\Big) = (s * g)(t).$$
Both $s$ and $g$ are even, and it follows that $\psi((t + 1)/2)$ is even. The function
$\psi(t + 1/2)$ obtained by replacing $t$ with $2t$ is also even; thus its translate
$\psi(t)$ is symmetric around 1/2.
To summarize, starting with a multiresolution analysis of $L^2(\mathbb{R})$ generated by
the function $g = r * r$, we have used the constructions described in this
lesson to generate the scaling function $\varphi$ and the spline wavelet $\psi$. We
have shown that $\varphi$ and $\psi$ are real spline functions of degree 1, that $\varphi$ is
even, and that $\psi$ is symmetric about $t = 1/2$. We also argued that the
support of $\varphi$ is $\mathbb{R}$; similarly, since $S$ cannot be a trigonometric polynomial,
the support of $\psi$ is $\mathbb{R}$. However, both functions decay exponentially (for a
proof see [Dau92]). These results generalize to the multiresolution analysis
generated by $g_n$.
For a systematic discussion of spline wavelets, we suggest the article by
Charles Chui in [RBC+92]. The spline wavelet $\psi$ and the modulus of its
spectrum are shown, respectively, in Figures 42.15 and 42.16. The spline
wavelet of degree 3 is shown in Figure 42.17 and the modulus of its spectrum
is illustrated in Figure 42.18.
FIGURE 42.16. Amplitude of the spectrum of spline wavelet of degree 1.
FIGURE 42.17. Spline wavelet of degree 3.
FIGURE 42.18. Amplitude of the spectrum of spline wavelet of degree 3.

42.6 Afternotes
This lesson has been but a brief introduction to the theory and applications
of wavelets. We have presented only a few topics from what has become
a dynamic and productive area of research with a rich theory and a wide
range of applications. In this last section we indicate some other aspects
of the field and provide a few pointers to the literature, which is now
substantial.
A first point concerns history and the sociology of science. Since the be-
ginning in the 1980s of what we call "modern wavelet theory," the field
has been characterized by a healthy interplay between theory and applica-
tions. Simply put, mathematicians have worked in close collaboration with
researchers from other areas of science and engineering, and wavelet theory
has been strongly influenced by applied problems. These revolve naturally
around signal and image processing, but the signals and images arrive from
diverse fields: astronomy, biology, medicine, hydrodynamics, geophysics,
and, of course, telecommunications, to mention but a few. The challenge
is to find a field of science or engineering where wavelet techniques have not
been applied, or at least tried. This was not always the case. As mentioned
at the beginning of the lesson, we now see that many older results in
mathematics and in signal processing can be interpreted in the language
of wavelet theory. These results were for the most part unknown outside
their respective communities. Since the initial collaboration between Morlet
and Grossmann, the tradition of interaction and cross-fertilization among
disciplines continues, and there is a resonance in this when we recall that
Fourier was motivated by problems in heat conduction.
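We have introduced two kinds of wavelet analysis: continuous wavelet
analysis associated with a family of the form
$$\psi_{ab}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t - b}{a}\right), \quad b \in \mathbb{R},\ a > 0,$$
and discrete wavelet analysis using wavelets of the form
$$\psi_{jk}(t) = 2^{j/2}\,\psi(2^j t - k), \quad j, k \in \mathbb{Z}. \tag{42.38}$$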
Both of these analyses can be extended to higher dimensions; this is
particularly important in two dimensions for image processing.
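The two families are related: taking $a = 2^{-j}$ and $b = k\,2^{-j}$ in $\frac{1}{\sqrt{a}}\psi\bigl(\frac{t-b}{a}\bigr)$ gives $2^{j/2}\psi(2^j t - k)$, and the factor $1/\sqrt{a}$ keeps the $L^2$ norm unchanged. The following sketch checks both facts numerically. The Mexican hat profile, its normalization, the integration window, and all function names are illustrative assumptions, not definitions taken from this lesson.

```python
import math

def mexican_hat(t):
    """Illustrative analyzing wavelet: (1 - t^2) * exp(-t^2 / 2), unnormalized."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def psi_ab(psi, a, b):
    """Continuous family: psi_ab(t) = psi((t - b) / a) / sqrt(a), with a > 0."""
    if a <= 0:
        raise ValueError("the scale a must be positive")
    return lambda t: psi((t - b) / a) / math.sqrt(a)

def psi_jk(psi, j, k):
    """Discrete (dyadic) family: psi_jk(t) = 2^(j/2) * psi(2^j * t - k)."""
    return lambda t: 2.0 ** (j / 2.0) * psi(2.0 ** j * t - k)

def l2_norm(f, lo=-60.0, hi=60.0, steps=60_000):
    """Midpoint-rule approximation of the L2 norm of f over [lo, hi]."""
    h = (hi - lo) / steps
    total = sum(f(lo + (i + 0.5) * h) ** 2 for i in range(steps))
    return math.sqrt(total * h)

base_norm = l2_norm(mexican_hat)
scaled_norm = l2_norm(psi_ab(mexican_hat, a=3.0, b=-2.0))

# The dyadic wavelet with (j, k) = (2, 5) is the continuous one with a = 1/4, b = 5/4.
dyadic = psi_jk(mexican_hat, j=2, k=5)
same = psi_ab(mexican_hat, a=2.0 ** -2, b=5 * 2.0 ** -2)

print(base_norm, scaled_norm, dyadic(1.3), same(1.3))
```

The two norms agree to within the quadrature error, and the last two printed values coincide up to rounding. This is the sense in which the discrete family is a sampling of the continuous one on a dyadic grid of scales and positions.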
Continuous wavelet analysis, including variations involving the modulus
of the wavelet transform, has been developed as a sensitive tool for
analyzing local properties of a signal. These techniques have been used to
analyze the singularities of "mathematical signals" such as the celebrated
continuous, "nowhere-differentiable" function
$$R(x) = \sum_{n=1}^{\infty} \frac{\sin(\pi n^2 x)}{n^2}$$
attributed to Riemann as well as various "experimental signals,"
particularly fully developed turbulence. General information on the continuous
point of view can be found in [Dau92] and [Tor95]. Applications to the
analysis of fractal objects in physics can be found in [AAB+95]. A great
deal of work on continuous wavelet analysis has been done by the group
at Marseille under the general guidance of Alex Grossmann, who from the
very beginning has been a leader in the field.
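Riemann's function is easy to explore numerically, since the series converges uniformly. The sketch below evaluates partial sums; the function name and the number of terms are hypothetical choices made for illustration only.

```python
import math

def riemann_partial_sum(x, terms=2000):
    """Partial sum of R(x) = sum over n >= 1 of sin(pi * n^2 * x) / n^2.

    The neglected tail is bounded by the sum of 1/n^2 for n > terms,
    which is less than 1/terms, so the truncation error is below
    1/terms uniformly in x.
    """
    return sum(math.sin(math.pi * n * n * x) / (n * n) for n in range(1, terms + 1))

for x in (0.0, 0.25, 0.5, 1.0):
    print(x, riemann_partial_sum(x))
```

Two simple checks follow directly from the formula: $R(2 - x) = -R(x)$, and $R(1/2) = \sum_{n \text{ odd}} 1/n^2 = \pi^2/8$, because $\sin(\pi n^2/2)$ equals 1 for odd $n$ and 0 for even $n$.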
We proved the reconstruction formula using the same wavelet that was
used for the analysis. It is possible, however, to use different wavelets for
the analysis and synthesis. This technique was used profitably by Matthias
Holschneider and Philippe Tchamitchian for their analysis of Riemann's
function [HT91].
On the discrete side, the discovery by Ingrid Daubechies of wavelets
having compact support stands as a landmark in the theory. Remarkably,
given $r \in \mathbb{N}$, there exists an orthonormal basis for $L^2(\mathbb{R})$ of the form
$$2^{j/2}\,\psi_r(2^j t - k), \quad j, k \in \mathbb{Z},$$
such that the support of $\psi_r$ is in $[0, 2r + 1]$, the moments $\int t^n \psi_r(t)\,dt$ vanish for
$0 \le n \le r$, and $\psi_r$ has about $r/5$ continuous derivatives. A complete
account can be found in Daubechies's book [Dau92]. Another significant step
was the discovery by Daubechies, Cohen, and Feauveau of a general way to
generate biorthogonal wavelet bases. (A particular example had previously
been constructed by Philippe Tchamitchian.) This means there are two
families $\{\psi_{jk}\}$ and $\{\tilde\psi_{jk}\}$, each of the form (42.32), that are unconditional
bases for $L^2(\mathbb{R})$ and such that
$$\int_{\mathbb{R}} \psi_{jk}(t)\,\overline{\tilde\psi_{j'k'}(t)}\,dt = 0$$
except when $j = j'$ and $k = k'$, in which case it equals 1. A complete
discussion of this construction and of why biorthogonal wavelets are interesting
for applications is given in [CR95].
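For $r = 1$ the support $[0, 3]$ corresponds to a filter with four coefficients. As an illustration, the sketch below uses the standard closed-form four-tap Daubechies low-pass filter (these values are quoted from general knowledge, not from this text, and the normalization $\sum_k h_k = \sqrt{2}$ is an assumed convention). It checks the filter-level conditions that correspond to orthonormality and to the vanishing of the moments of order 0 and 1. The helper `dot_shift` is a hypothetical name.

```python
import math

SQRT3 = math.sqrt(3.0)

# Standard four-tap Daubechies low-pass filter (assumed normalization: sum = sqrt(2)).
h = [c / (4.0 * math.sqrt(2.0))
     for c in (1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3)]

# High-pass filter obtained by the alternating flip g_k = (-1)^k * h_{3-k}.
g = [(-1) ** k * h[3 - k] for k in range(4)]

def dot_shift(u, v, shift):
    """Sum of u[k] * v[k + shift] over the indices where both entries exist."""
    return sum(u[k] * v[k + shift] for k in range(len(u)) if 0 <= k + shift < len(v))

# Orthonormality of the even translates of h: expect (1, 0).
orthonormality = (dot_shift(h, h, 0), dot_shift(h, h, 2))

# Orthogonality between low-pass and high-pass filters: expect zeros.
cross_terms = (dot_shift(h, g, 0), dot_shift(h, g, 2), dot_shift(g, h, 2))

# Discrete moments of g of order 0 and 1: expect zeros.
discrete_moments = [sum(k ** n * g[k] for k in range(4)) for n in (0, 1)]

print(orthonormality, cross_terms, discrete_moments)
```

The vanishing of $\sum_k k^n g_k$ for $n = 0, 1$ is the discrete counterpart of the condition $\int t^n \psi_r(t)\,dt = 0$ for $0 \le n \le r$ with $r = 1$. Longer filters of the same family add further vanishing moments at the price of a larger support.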
The original French version of this lesson appeared in 1990 at a time when
the only book on wavelets was Yves Meyer's Ondelettes et opérateurs I:
Ondelettes [Mey90]. Professor Meyer and his students have played a central
role in the development of wavelet theory, and Meyer's books, both the
technical work cited above and his more widely accessible account [Mey93],
have had an influence on both sides of the Atlantic.
Ten Lectures on Wavelets [Dau92] by Ingrid Daubechies was the first
book in English, and it has deservedly become a "best seller." Full accounts
of most of the material in this lesson can be found there.
There are now many books on wavelets in English. Furthermore, all are
accessible to anyone who has understood the material of these 42 lessons.
We have included several books in the References, usually annotated, that
have not been cited in the text.
Finally, there is a large amount of information and software available via
the Internet. The Wavelet Digest is a free monthly newsletter edited by
Wim Sweldens that provides general information on publications,
conferences, software, etc. A subscription is available by visiting the Web page
http://www.wavelet.org. Information about software in the public
domain can be found in the article Wavelet analysis by Andrew Bruce, David
Donoho, and Hong-Ye Gao, IEEE Spectrum, October 1996.
42.7 Exercises
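Exercise 42.1 With the notation and hypotheses of Theorem 42.2.1, show
that
$$\frac{1}{K} \iint_{\mathbb{R}^2} C_f(a, b)\,\overline{C_g(a, b)}\,\frac{da\,db}{a^2} = \int_{\mathbb{R}} f(t)\,\overline{g(t)}\,dt$$
for $f$ and $g$ in $L^2(\mathbb{R})$.

Hint: Show that
$$C_h(a, b) = \sqrt{|a|}\;\overline{\mathscr{F}}_{\lambda}\!\left[\hat h(\lambda)\,\overline{\hat\psi(a\lambda)}\right](b)$$
for $h \in L^2(\mathbb{R})$ and use the proof of Exercise 41.1.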
Exercise 42.2 Consider the Haar system $\psi_{jk}$ defined by
$$\psi_{jk}(x) = 2^{j/2}\,\psi(2^j x - k), \quad x \in \mathbb{R},\ j, k \in \mathbb{Z},$$
where $\psi(x) = 1$ on $(0, 1/2)$; $\psi(x) = -1$ on $[1/2, 1)$; and $\psi(x) = 0$ otherwise.
(1) Show that $\{\psi_{jk}\}_{j,k \in \mathbb{Z}}$ is an orthonormal system in $L^2(\mathbb{R})$.
(2) We know that $\{\psi_{jk}\}$ is an orthonormal basis for $L^2(\mathbb{R})$. Consider the scaling
function $\varphi = \chi_{[0,1)}$ associated with the wavelet basis $\{\psi_{jk}\}$. If $n \in \mathbb{N}^*$ and
$A = (a_0, a_1, \ldots, a_{2^n - 1})$, we define a scaling function $e$ associated with $A$ by
$$e(x) = \sum_{k=0}^{2^n - 1} a_k\,\chi_{[k 2^{-n},\,(k+1) 2^{-n})}(x), \quad x \in \mathbb{R}.$$
Verify that $e \in L^2(\mathbb{R})$ and that the wavelet decomposition of $e$ is of the
form
$$e(x) = d_{00}\,\varphi(x) + \sum_{j=0}^{n-1} \sum_{k=0}^{2^j - 1} c_{jk}\,\psi_{jk}(x). \tag{1}$$
(3) For $n = 2$ take $A = [\,\ldots\,]$ and write $B = [d_{00},\ c_{00},\ c_{10},\ c_{11}]^{T}$.
(a) Draw the graph of $e$.
(b) Find the matrix $M \in M_{\mathbb{R}}(4, 4)$ such that $A = MB$.
(c) Find $B$.
(d) Show explicitly that one indeed has the solution by computing the
values of the two terms of (1) for each $x$.
(4) Treat explicitly the case $n = 3$ for an $A$ of your choice.
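A numerical check of decomposition (1) can be sketched as follows. This is an illustration under stated assumptions, not a solution taken from the text: the vector `A` below is hypothetical data (the values used in the exercise are not reproduced here), the function names are our own, and $\psi$ is taken equal to 1 on $[0, 1/2)$, which differs from the definition above only at the single point $x = 0$.

```python
import math

def haar_decompose(a):
    """Return d00 and {(j, k): c_jk} for e = sum_k a[k] * indicator of [k/2^n, (k+1)/2^n)."""
    size = len(a)
    n = size.bit_length() - 1
    if size != 2 ** n:
        raise ValueError("len(a) must be a power of two")
    width = 2.0 ** -n                    # length of each dyadic cell
    d00 = sum(a) * width                 # inner product of e with phi = indicator of [0, 1)
    coeffs = {}
    for j in range(n):
        cells = size // 2 ** j           # number of cells covered by the support of psi_jk
        half = cells // 2
        for k in range(2 ** j):
            block = a[k * cells:(k + 1) * cells]
            # psi_jk is +2^(j/2) on the first half of its support and -2^(j/2) on the second.
            coeffs[(j, k)] = 2.0 ** (j / 2.0) * width * (sum(block[:half]) - sum(block[half:]))
    return d00, coeffs

def haar(x):
    """Haar wavelet, with the value at x = 0 taken to be 1."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def reconstruct(d00, coeffs, x):
    """Evaluate d00 * phi(x) + sum of c_jk * psi_jk(x)."""
    value = d00 if 0.0 <= x < 1.0 else 0.0
    for (j, k), c in coeffs.items():
        value += c * 2.0 ** (j / 2.0) * haar(2.0 ** j * x - k)
    return value

A = [4.0, 2.0, 5.0, 3.0]                 # hypothetical data for n = 2
d00, coeffs = haar_decompose(A)
print(d00, coeffs)
print([reconstruct(d00, coeffs, (k + 0.5) / len(A)) for k in range(len(A))])
```

Evaluating the reconstruction at the midpoint of each dyadic cell returns the entries of `A`, and the energy identity $\int |e|^2 = d_{00}^2 + \sum c_{jk}^2$ holds as well. The same functions accept any input whose length is a power of two, so the case $n = 3$ only requires a list of eight values.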
References
[AAB+95] A. Arneodo, F. Argoul, E. Bacry, J. Elezgaray, and J.-F. Muzy.
Ondelettes, multifractales et turbulences, de l'ADN aux croissances
cristallines. Diderot Editeur, Arts et Sciences, Paris,
1995. English translation, Diderot Publishers, New York, 1997.
[Bas78] J. Bass. Cours de Mathématiques, volume I. Masson, Paris, 1978.
[Bel81] M. Bellanger. Traitement numérique du signal. Masson, Paris, 1981.
[Ber70] J.P. Bertrandias. Analyse fonctionnelle. Armand Colin, Paris, 1970.
[BL80] R. Boite and H. Leich. Les filtres numériques. Masson, Paris, 1980.
[Bre83] H. Brezis. Analyse fonctionnelle. Théorie et applications. Masson,
Paris, 1983.
[Car63] H. Cartan. Théorie élémentaire des fonctions analytiques d'une
ou plusieurs variables complexes. Hermann, Paris, 1963.
[CLW67] J.W. Cooley, P.A.W. Lewis, and P.D. Welch. The Fast Fourier
Transform algorithm and its applications. Technical report,
I.B.M. Research, 1967.
[CLW70] J.W. Cooley, P.A.W. Lewis, and P.D. Welch. The Fast Fourier
Transform algorithm. Programming considerations in the calculation
of sine, cosine and Laplace transforms. J. Sound Vibrations,
12(3):315-337, 1970.
[Cou84] F. De Coulon. Théorie et traitement des signaux. Dunod, Paris, 1984.
[CR95] A. Cohen and R.D. Ryan. Wavelets and Multiscale Signal Processing.
Chapman & Hall, London, 1995.
[Dau92] I. Daubechies. Ten Lectures on Wavelets. Society for Industrial
and Applied Mathematics, Philadelphia, PA, 1992.
[DH82] P.J. Davis and R. Hersh. The Mathematical Experience.
Houghton Mifflin, Boston, 1982.
[Ebe70] A. Eberhard. Algorithmes de l'analyse harmonique numérique.
PhD thesis, University of Grenoble, June 1970.
[Gab46] D. Gabor. Theory of communication. J. Inst. Elec. Eng. (London),
93:429-457, 1946.
[GM84] A. Grossmann and J. Morlet. Decomposition of Hardy functions
into square integrable wavelets of constant shape. SIAM J.
Math. Anal., 15:723-736, 1984.
[Haa10] A. Haar. Zur Theorie der orthogonalen Funktionensysteme.
Math. Ann., 69:331-337, 1910.
[Hal64] P.R. Halmos. Measure Theory. D. Van Nostrand Company,
Inc., New York, 1964.
[Her86] M. Hervé. Distributions et transformée de Fourier. P.U.F.,
Paris, 1986.
[HT91] M. Holschneider and Ph. Tchamitchian. Pointwise regularity
of Riemann's "nowhere differentiable" function. Inventiones
Mathematicae, 105:157-175, 1991.
[Hub96] B.B. Hubbard. The World According to Wavelets. A.K. Peters,
Wellesley, MA, 1996. A popular account of the basic ideas of
wavelets, their history, and the people involved.
[Jac63] D. Jackson. Fourier Series and Orthogonal Polynomials. Number 6
in Carus Mathematical Monographs. Mathematical Association
of America, Washington, D.C., 1963.
[KF74] A. Kolmogorov and S. Fomine. Éléments de la théorie des fonctions
et de l'analyse fonctionnelle. Editions du Moscou, 1974.
[Kho72] Vo Khac Khoan. Distributions, Analyse de Fourier. Opérateurs
aux dérivées partielles. Vuibert, 1972.
[Kun84] M. Kunt. Traitement numérique des signaux. Dunod, Paris, 1984.
[Lau72] P.J. Laurent. Approximation et optimisation. Hermann, Paris, 1972.
[Lip81] J.D. Lipson. Elements of algebra and algebraic computing.
Addison-Wesley, 1981.
[LM86] P.G. Lemarié and Y. Meyer. Ondelettes et bases hilbertiennes.
Revista Ibero-Americana, 2:1-18, 1986.
[Mal89] S. Mallat. A theory for multiresolution signal decomposition:
The wavelet representation. IEEE Trans. Pattern Anal. Machine
Intell., 11:674-693, 1989.
[Mey90] Y. Meyer. Ondelettes et Opérateurs I: Ondelettes. Masson,
Paris, 1990. English translation, Wavelets and operators,
Cambridge University Press, 1992.
[Mey93] Y. Meyer. Wavelets: Algorithms & Applications. SIAM,
Philadelphia, 1993.
[MJR87] Y. Meyer, S. Jaffard, and O. Rioul. L'analyse par ondelettes.
Pour la Science, Sept. 1987.
[Nus81] H.J. Nussbaumer. Fast Fourier Transform and Convolution
Algorithms. Springer-Verlag, 1981.
[RBC+92] M. B. Ruskai, G. Beylkin, R. Coifman, I. Daubechies, S. Mallat,
Y. Meyer, and L. Raphael, editors. Wavelets and their Applications.
Jones and Bartlett, Boston, 1992.
[Roy63] H.L. Royden. Real Analysis. The Macmillan Company, New
York, 1963.
[Sch65a] L. Schwartz. Méthodes mathématiques pour les sciences physiques.
Hermann, Paris, 1965.
[Sch65b] L. Schwartz. Théorie des distributions. Dunod, Paris, 1965.
[SN96] G. Strang and T. Nguyen. Wavelets and Filter Banks.
Wellesley-Cambridge Press, Wellesley, MA, 1996.
[Sze59] G. Szegő. Orthogonal polynomials, volume 23. A.M.S. Colloquium
Publications, 1959.
[Tor95] B. Torrésani. Analyse continue par ondelettes.
InterEditions/CNRS Editions, Paris, 1995.
[VK95] M. Vetterli and J. Kovacevic. Wavelets and Subband Coding.
Prentice Hall, Englewood Cliffs, NJ, 1995. Written in the language
of signal processing, this book presents an integrated view
of wavelets and subband coding.
[Wic94] M.V. Wickerhauser. Adapted Wavelet Analysis from Theory to
Software. A.K. Peters, Wellesley, MA, 1994. A detailed treatment
for engineers and applied mathematicians with an emphasis
on the analysis of real signals. A good place to learn about
wavelet packets.
Index
Algebra cardinal spline functions, 412

of distributions, 307 Chebyshev polynomials, 54, 90
of sets, 102 Chebyshev's inequality, 119
cr-algebra, 101 Chui, Charles, 426
aliasing, 360 circulant matrix, 85
almost everywhere (a.e.), 28, 105 Cohen, Albert, 428
amplifier, 7 conjugate quadrature filters, 423
analog filters convergence
action on a periodic signal, 58 in !JJ ( R ) , 245
Butterworth filters, 229 in .9' (R), 173
Chebyshev filters, 231 mean, 13
definitions, 14, 319 mean quadratic (in energy), 13
differentiator, 7, 228, 331 of discrete signals, 13
examples, 221-232 of distributions, 265
governed by a differential of tempered distributions, 285
equation, 211-219 uniform, 13
generalized solutions, 213 convolution of distributions,
solution in .9' , 212 297-309
integrator, 227, 331
c= * sz; 1 , 297
low-pass filters, 17, 228
,9' * ,9' I> 299
RC filter, 8, 15-17, 221, 327 /5' I * !}) I> 301
RLC circuit, 222-225
/5' I * .9' I> 303
realizable, 321
g;~ * !JJ~, 304, 305
See also differential equations
associativity of, 306, 307
analyzing ( "mother") wavelet, 395
continuity of, 300
approximation
derivation of, 300, 302
in L~(O, a), 29-33
support of, 300, 302, 305
in Hilbert space, 143-146
unit element for, 301
Band-limited signals, 348 convolution of functions, 16,
Beppo-Levi's theorem, 120 177-185
Bessel's inequality, 30, 147 L 1 * L 1 , 179
Bore! sets, 102-104 LP*U, 180
L 1 * L 2 , 182
Cardinal sine functions, 357 .9' * y ' 190
continuity of, 187 distributions

derivation of, 187 convergence of, 265
having limited support, 183 definition, 245
regularization, 188 derivation of, 255
summary, 184 continuity of, 266
convolution system, 16 term-by-term, 270
See also analog filters even and odd, 252
Cooley, J.W., 75 Fourier transform ( see Fourier
Cooley-Thkey algorithm. See fast transform of distributions)
Fourier transform history and heuristics, 235-242
null (zero, vanishing), 253
Daubechies's wavelets, 428 periodic, 252, 335-342
Daubechies, Ingrid, 405, 428 Fourier series of, 337, 338
delay line, 7 product with a periodic
delay operator, Ta, 12 function, 340
density primitives of, 275-279
of C~(I) in L 1 (I), 138 product with a function, 254
of~(R) inS"' (R), 175 regular, 248
of ~ (R) in L 1 (R), 189 tempered, .9" 1 (R)
of .9" (R) in L 2(R ), 193 characterization of, 284
of ~ (R) in~ 1 (R), 302 convergence of, 285
derivative definition, 284
generalized, 241 representation theorem, 287
of a distribution, 255 with compact support, g' 1 (R)
relation between usual and definition, 291
distribution, 256 representation theorem, 292
differential equations
causal solutions, 325-327 Eigenfunction, 15, 19
tempered solutions, 321-324 eigenvalue, 19
differentiator, 7, 228, 331 expansion of a function
Dirac's comb, 246, 252, 270, 287 in a series of sines, 51
Dirac's impulse, 8, 237, 266, 290 in a series of cosines, 52
Dirichlet 's theorem, 43 in an orthogonal basis, 53
discrete filters
an example, 18 Fast Fourier transform, 75-80
definition, 365 cost, 76, 77
RC filter, 380 matrix version, 82
governed by difference used for computing
equations, 379 high-order polynomials, 88
realizable, 372, 378 nonperiodic convolutions,
stable, 372, 378 87
discrete Fourier transform, 65-73 periodic convolutions, 85
inverse, 69 polynomial interpolation, 90
of real data, 71 spectrum of a signal, 361
properties of, 69 Fatou's lemma, 120
discrete signals, 365 Feauveau, J.C., 428
FFT. See fast Fourier transform
convolution of, 19, 70, 367, 370
l~ * l':', 371 finite part, fp( _;. ), 260, 261
X
l~ * l~, 371 Fourier analysis
a critique of, 385 derivation, 157

compared with Gabor's inverse ( see inverse Fourier
method, 392 transform)
Fourier coefficients, 33 isometry on L 2 (R), 194
approximation of summar~ 204-206
by interpolation, 66 translation, 158
by trapezoid formula, 65 Fourier, Joseph, 27
of real, odd, and even Fubini's theorem, 124, 368
functions, 34 function spaces, 133-140
relation between the exact and CP(J), 133
approximate, 71 0 00 (1), 134
summary of behaviors, 46 ~ (R), 134
uniqueness .'/' (R), 171
for functions in L~(O, a), 33 completeness of LP(I), 135,
for piecewise continuous 140
functions, 36 locally integrable, LfacCR), 136
Fourier series of differentiable functions, 133
accelerating convergence of, of integrable functions, 135
350 summary of inclusions, 139
of a locally integrable periodic functions
function, 335 absolutely continuous, 127
of a periodic distribution, 337 characteristic, 106
of a product, 50 generalized, 241, 248
pointwise representation, measurable, 105
39-50 of bounded variation, 42
uniform convergence of, 45 periodic, 23
Fourier transform of convolutions piecewise continuous, 41
of distributions primitive of, 127
.'/' *.'/' ', 311 rapidly decreasing, 171
15' * .'/' ', 312 regular, 46, 133
L 2 * L 2 , 313 slowly increasing, 175, 285
of functions, 201-207 step, 107
L 1 * L 1 , 202 fundamental frequency, 58
L 2 * L 2 , 203
.'/' *.'/', 203 Gabor functions, 396
L 2 * L 2 , 313 Gabor's formulas, 388, 389, 392
Fourier transform of distributions Gabor, Dennis, 388
defined for tempered gain of a filter, 217
distributions, 287-291 Gram matrix, 145
forT E i5'(R), 292 Grossmann, Alex, 395, 427
of Dirac's comb, 291
of Dirac's impulse, 290 Haar system, the, 406, 407, 429
of sinusoidal signals, 290 Haar wavelet, the, 401
summary, 294 Haar, Alfred, 406
Fourier transform of functions harmonics, 58
L 1 (R), 155 Heaviside's function, u, 6
L 2 (R), 193 Hermite polynomials, 55
.'/' (R), 173 Hermitian form, 28
conjugation and parity, 158 Hilbert bases, 146, 148
Hilbert spaces, 141-152 Mexican hat wavelet, 403

Fourier coefficients, 146 Meyer's c= wavelet, 407, 409
Fourier series, 147 Meyer, Yves, 408, 409, 428
convergence of, 148 mirror permutations, 78-81
pre-Hilbert space, 141, 142 monotone convergence theorem,
Hilbert transform, 313, 314 114
Hlder's inequality, 135 Morlet's wavelet, 396, 401
Holschneider, Matthias, 428 Morlet, Jean, 394, 395, 427
multiresolution analysis of L 2 (R),
Impulse response, h, 16, 19, 213, 410-413
320 based on cardinal sines 413
integrator, 227, 331 based on splines, 410-412
inverse Fourier transform, 163
for L 1 (R), 163 Norm, 12-13
on.? (R), 174 for CP(J), 133
principal value formula, 166 for LP(I), 135
notation
Jacobian matrix, 125 BV[a,b], 42
Jaffard, Stephane, 409 c~ (also c~, 183
Cpw+ (also Cpw-), 183
Laguerre polynomials, 55
c;[o,a], 46
Laurent series, 375
C~(I), 134
Lebesgue integral CP(J), 133
change of variable, 125
c=(I), 134
compared with Riemann
Cpw[a, b], 41
integral, 116
!Z' ~. 254
derivation with respect to a
!Z' (R), 134, 244
parameter, 123
LP(I), 135
elementary properties,
113-116
Lfoc(R ), 136
L~(O, a), 28
history, 97-98
.? (R), 172
indefinite, 126
Nyquist rate, 349
derivative of, 126
integration by parts, 127
of measurable functions, 113 Octave, 59
of nonnegative simple orthogonal complement, 143
functions, 111 orthogonal projection, 143
Lebesgue measure, 102, 103 orthogonal systems. See Hilbert
Lebesgue's dominated convergence bases
theorem, 121 orthogonal vectors, 143
Lebesgue, Henri, 98 orthogonal wavelets, 405
Legendre polynomials, 54 orthorrormal system, 146
oscillating phenomena, 58
Mallat, Stephane, 409
measure theory, 101-109 Paley-Wiener theorem, 293
measurable set, 101 parallelogram identity, 142, 152
measurable space, 101 Parseval's equality, 25, 33, 53
measure space, 102, 104 Plancherel-Parseval equality, 193
measure, a, 102 Poincare, Henri, 414
Poisson's formula, 344, 345, 347, spectrum

351 energy, 17
pre-Hilbert space. See Hilbert of a periodic signal, 57
spaces of a sampled signal, 348, 349
principal value, pv ( .!_), 258, 261, spline wavelets, 423-426
X step response, h1, 213, 320
313
Pythagorean identity, 143 summaries
Fourier series relations, 35
Quantization, 4 properties of Fourier
coefficients, 46
Rapidly decreasing sequence, 46 inclusion relations for function
recursion equation, 8, 10 spaces, 139
regularization of a function, 188 inclusions for convolutions of
regularizing sequence, 188 functions, 184
resonator, 330 Fourier transforms of
response time, 217 functions, 160, 166, 196
Riemann's function, 427 Fourier transform and
Riemann, Bernhard, 97 convolution of functions,
Riemann-Lebesgue theorem, 40, 204-206
156 Fourier transforms of
Routh criterion, 218 distributions, 294
existence of discrete
Sampling, 343, 344, 351, 359 convolutions, h * x, 371
scalar product, 13, 28, 141 Superposition, principle of, 11
scales, 59-61 support
harmonic scale, 59 of a distribution, 253
tempered scale, 61 of a measurable function, 179
Schwartz dass, 172 of a continuous function, 134
Schwartz, Laurent, 172, 241 of the convolution of two
Schwarz inequality, 142 functions, 179
sets of measure zero, 104 Sweldens, Wim, 428
Shannon's formula, 353, 355 systems
fails in .'? ', 35 7, 362 analog, 5
for f E L 1 n C 0 (R), 169 definition, 4
for a trigonometric signal, 356 discrete, 5, 8
Shannon's theorem. See Shannon's hybrid, 5
formula properties of
signals, 3 causality, 11
analog, 3 continuity, 12
analytic, 314, 315 invariance, 12
digital, 4 linearity, 11
discrete, 3 realizability, 11, 216
rectangular, 6 stability, 216
sinusoidal, 6 stationarity, 12
slowly increasing sequence, 269 See also analog filters; discrete
Sobolev, S.L., 241 filters
Sobolev space H 1 (a,b), 278
spectral amplitude, 17 Tchamitchian, Philippe, 428
spectral !irres, 58 test functions, !Z (R), 244
time-frequency analysis, 393 wavelet coefficients, 396

topological basis, 33, 53 Wavelet Digest, the, 428
topological dual, 246 wavelet transform, 397-405
total system, 146 as an analytic tool, 403, 404
transfer function, 15, 19, 320, 378 fundamental theorem, 397
trigonometric polynomials, 23-25 numerical computation, 404,
trigonometric signals. See 405
trigonometricpolynomials wavelets
trigonometric system, 150 moments of, 404, 406, 407
Tukey, J.W., 75 of order r, 407
windowed Fourier transform, 386,
Uncertainty principle, 197 387
unconditional basis, 412
Young's inequality, 135
W avelet bases
for L 2 , 407 Z-transform, 375-381
biorthogonal, 428
derived from a multiresolution
analysis, 413-421
