
Gheorghe Moroşanu

Functional

Analysis for

the Applied

Sciences

Universitext


Series editors

Sheldon Axler

San Francisco State University, San Francisco, CA, USA

Carles Casacuberta

Universitat de Barcelona, Barcelona, Spain

John Greenlees

University of Warwick, Coventry, UK

Angus MacIntyre

Queen Mary University of London, London, UK

Kenneth Ribet

University of California, Berkeley, CA, USA

Claude Sabbah

École Polytechnique, CNRS, Université Paris-Saclay, Palaiseau, France

Endre Süli

University of Oxford, Oxford, UK

Wojbor A. Woyczyński

Case Western Reserve University, Cleveland, OH, USA

Universitext is a series of textbooks that presents material from a wide variety of mathematical disciplines at master's level and beyond. The books, often well class-tested by their author, may have an informal, personal, even experimental approach to their subject matter. Some of the most successful and established books in the series have evolved through several editions, always following the evolution of teaching curricula, into very polished texts.

Thus as research topics trickle down into graduate-level teaching, first textbooks

written for new, cutting-edge courses may make their way into Universitext.

Gheorghe Moroşanu

Functional Analysis

for the Applied Sciences

Gheorghe Moroşanu

Romanian Academy of Sciences

Bucharest, Romania

Department of Mathematics

Babes-Bolyai University

Cluj-Napoca, Romania

Universitext

ISBN 978-3-030-27152-7 ISBN 978-3-030-27153-4 (eBook)

https://doi.org/10.1007/978-3-030-27153-4

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of

the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,

broadcasting, reproduction on microfilms or in any other physical way, and transmission or information

storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology

now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publica-

tion does not imply, even in the absence of a specific statement, that such names are exempt from the

relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book

are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or

the editors give a warranty, expressed or implied, with respect to the material contained herein or for any

errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional

claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG.

The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Dedicated to my wife, Carmen

Preface

The goal of this book is to present the main results and techniques in Functional Analysis and use them to

explore various areas in mathematics and its applications. Special

attention is paid to creating appropriate frameworks towards solving

diﬀerent problems in the ﬁeld of diﬀerential and integral equations. In

fact, the ﬂavor of this book is given by the ﬁne interplay between the

tools oﬀered by Functional Analysis and some speciﬁc problems which

are of interest in the Applied Sciences.

The table of contents of the book (see below) oﬀers a fairly good

description of the material. In contrast with other books in the ﬁeld,

we present in Chap. 1 the real number system, describing the Cantor–

Méray model which is most appropriate for our purposes here. Indeed,

it is based on a completion procedure, allowing the extension from ra-

tional numbers to real numbers. This procedure involves the concepts

of limit and inﬁnity that are speciﬁc to analysis. We consider the

Cantor–Méray construction as the cornerstone of mathematical analysis, which is why we pay attention to this subject, which is usually assumed to be well known.

In order to help the reader to understand the richness of ideas and

methods oﬀered by Functional Analysis, we have included a section of

exercises at the end of each chapter. Some of these exercises supple-

ment the theoretical material discussed in the corresponding chapter,

while others are mathematical problems that are related to the real

world. Some of the exercises are borrowed from other books, being

reformulated and/or presented in a form adapted to the needs of the

corresponding chapter. We do not indicate the books where individual

exercises come from, but all those sources are included in the reference list of our book. In any event, we do not claim originality in such

cases. Other exercises were invented by us to oﬀer the reader enough


material to understand the theoretical part of the book and gain ex-

pertise in solving practical problems. In the last chapter of the book

(Chap. 12), we provide solutions to almost all exercises. This is in con-

trast to many other books which include exercises without solutions.

For easy exercises, we provide hints or ﬁnal solutions, and answers to

very easy exercises are left to the reader. I encourage everybody to

spend some time working on an exercise before looking at its solution.

We shall refer to an exercise by indicating the chapter and exercise

numbers (and not the section number). For example, Exercise 11.3

will mean Exercise 3 in the last section of Chap. 11 (which is Sect. 11.3

in this case).

The book is addressed to graduate students and researchers in

applied mathematics and neighboring ﬁelds of science.

I would like to thank the anonymous reviewers whose pertinent

comments improved the initial version of the book.

Special thanks are due to a former American student of mine, Ivan

Andrus, who wrote the ﬁrst draft of the present book as lecture notes

for my Functional Analysis lectures in 2010. He also carefully checked

the ﬁnal version of the book and suggested several minor changes.

I am also indebted to my former student Liviu Nicolaescu for read-

ing the ﬁrst part of the book and correcting some errors.

Last but not least, I would like to thank Mrs. Elizabeth Loew,

Executive Editor at Springer, for our very kind cooperation that led

to the successful completion of this book project.

Contents

1 Introduction 1

1.1 Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Sequences . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.3 Real Numbers . . . . . . . . . . . . . . . . . . . . . . . 3

1.4 Complex Numbers . . . . . . . . . . . . . . . . . . . . 15

1.5 Linear Spaces . . . . . . . . . . . . . . . . . . . . . . . 16

1.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 26

2 Metric Spaces 31

2.1 Deﬁnitions . . . . . . . . . . . . . . . . . . . . . . . . . 31

2.2 Completeness . . . . . . . . . . . . . . . . . . . . . . . 34

2.3 Compact Sets . . . . . . . . . . . . . . . . . . . . . . . 40

2.4 Continuous Functions on Compact Sets . . . . . . . . . 44

2.5 The Banach Contraction Principle . . . . . . . . . . . 55

2.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 58

3.1 Measurable Sets in Rk . . . . . . . . . . . . . . . . . . 65

3.2 Measurable Functions . . . . . . . . . . . . . . . . . . . 71

3.3 The Lebesgue Integral . . . . . . . . . . . . . . . . . . 75

3.4 Lp Spaces . . . . . . . . . . . . . . . . . . . . . . . . . 82

3.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 86

4.1 Deﬁnitions, Examples, Operator Norm . . . . . . . . . 89

4.2 Main Principles of Functional Analysis . . . . . . . . . 93

4.3 Compact Linear Operators . . . . . . . . . . . . . . . . 96

4.4 Linear Functionals, Dual Spaces, Weak Topologies . . 97

4.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 104


5.1 Test Functions . . . . . . . . . . . . . . . . . . . . . . . 107

5.2 Friedrichs’ Molliﬁcation . . . . . . . . . . . . . . . . . . 112

5.3 Scalar Distributions . . . . . . . . . . . . . . . . . . . . 119

5.3.1 Some Operations with Distributions . . . . . . 121

5.3.2 Convergence in Distributions . . . . . . . . . . 122

5.3.3 Diﬀerentiation of Distributions . . . . . . . . . 125

5.3.4 Diﬀerential Equations for Distributions . . . . . 131

5.4 Sobolev Spaces . . . . . . . . . . . . . . . . . . . . . . 143

5.5 Bochner’s Integral . . . . . . . . . . . . . . . . . . . . . 149

5.6 Vector Distributions, W m,p (a, b; X) Spaces . . . . . . . 155

5.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 160

6.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 165

6.2 Jordan–von Neumann Characterization Theorem . . . 168

6.3 Projections in Hilbert Spaces . . . . . . . . . . . . . . 171

6.4 The Riesz Representation Theorem . . . . . . . . . . . 175

6.5 Lax–Milgram Theorem . . . . . . . . . . . . . . . . . . 180

6.6 Fourier Series Expansions . . . . . . . . . . . . . . . . 186

6.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 195

Operators 201

7.1 The Adjoint of a Linear Operator . . . . . . . . . . . . 201

7.2 Adjoints of Operators on Hilbert Spaces . . . . . . . . 204

7.2.1 The Case of Compact Operators . . . . . . . . 205

7.3 Symmetric Operators and Self-adjoint Operators . . . 209

7.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 212

8.1 Deﬁnition and Examples . . . . . . . . . . . . . . . . . 217

8.2 Main Results . . . . . . . . . . . . . . . . . . . . . . . 219

8.3 Eigenvalues of −Δ Under the Dirichlet Boundary

Condition . . . . . . . . . . . . . . . . . . . . . . . . . 226

8.4 Eigenvalues of −Δ Under the Robin Boundary

Condition . . . . . . . . . . . . . . . . . . . . . . . . . 228

8.5 Eigenvalues of −Δ Under the Neumann Boundary

Condition . . . . . . . . . . . . . . . . . . . . . . . . . 230

8.6 Some Comments . . . . . . . . . . . . . . . . . . . . . 232

8.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 239


9.1 Deﬁnitions . . . . . . . . . . . . . . . . . . . . . . . . . 244

9.2 Some Properties of C0 -Semigroups . . . . . . . . . . . 246

9.3 Uniformly Continuous Semigroups . . . . . . . . . . . 252

9.4 Groups of Linear Operators. Deﬁnitions and Link

to Operator Semigroups . . . . . . . . . . . . . . . . . 254

9.5 Translation Semigroups . . . . . . . . . . . . . . . . . . 257

9.6 The Hille–Yosida Generation Theorem . . . . . . . . . 260

9.7 The Lumer–Phillips Theorem . . . . . . . . . . . . . . 265

9.8 The Feller–Miyadera–Phillips Theorem . . . . . . . . . 268

9.9 A Perturbation Result . . . . . . . . . . . . . . . . . . 271

9.10 Approximation of Semigroups . . . . . . . . . . . . . . 273

9.11 The Inhomogeneous Cauchy Problem . . . . . . . . . . 279

9.12 Applications . . . . . . . . . . . . . . . . . . . . . . . . 283

9.12.1 The Heat Equation . . . . . . . . . . . . . . . . 283

9.12.2 The Wave Equation . . . . . . . . . . . . . . . 286

9.12.3 The Transport Equation . . . . . . . . . . . . . 288

9.12.4 The Telegraph System . . . . . . . . . . . . . . 291

9.13 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 293

by the Fourier Method 297

10.1 First Order Linear Evolution

Equations . . . . . . . . . . . . . . . . . . . . . . . . . 297

10.2 Second Order Linear Evolution

Equations . . . . . . . . . . . . . . . . . . . . . . . . . 304

10.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 308

10.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 309

11.1 Volterra Equations . . . . . . . . . . . . . . . . . . . . 315

11.2 Fredholm Equations . . . . . . . . . . . . . . . . . . . 325

11.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 336

12.1 Answers to Exercises for Chap. 1 . . . . . . . . . . . . 341

12.2 Answers to Exercises for Chap. 2 . . . . . . . . . . . . 343

12.3 Answers to Exercises for Chap. 3 . . . . . . . . . . . . 354

12.4 Answers to Exercises for Chap. 4 . . . . . . . . . . . . 359

12.5 Answers to Exercises for Chap. 5 . . . . . . . . . . . . 365


12.7 Answers to Exercises for Chap. 7 . . . . . . . . . . . . 383

12.8 Answers to Exercises for Chap. 8 . . . . . . . . . . . . 390

12.9 Answers to Exercises for Chap. 9 . . . . . . . . . . . . 398

12.10 Answers to Exercises for Chap. 10 . . . . . . . . . . . . 407

12.11 Answers to Exercises for Chap. 11 . . . . . . . . . . . . 417

Bibliography 429

Chapter 1

Introduction

This introductory chapter is devoted to set theory, real and complex numbers, and linear spaces.

1.1 Sets

We assume that the reader is familiar with the basic concepts and

results of set theory. However, we are going to recall or specify some

concepts and symbols that will be frequently used in this book.

First of all, in this book the notation A ⊂ B or B ⊃ A indicates that

every element (member) of the set A is also an element of the set B.

In particular, A ⊂ A. The empty set, i.e., the set with no elements,

will be denoted as usual by ∅. The empty set is a subset of every set

A, ∅ ⊂ A. The sets A, B are equal, A = B, if and only if A ⊂ B and

B ⊂ A.

We assume that the sets

N = {1, 2, . . . } (natural numbers),

Z = {. . . , −2, −1, 0, 1, 2, . . . } (integers), and

Q = {0} ∪ {±m/n; m, n ∈ N, (m, n) = 1} (rational numbers)

are well known, including their axiomatic deﬁnitions.

A set A is called countable if there exists an injective function from A

to N. If one can find a bijective function from A to N, then A is called

countably inﬁnite. In particular, N, Z, and Q are countably inﬁnite

sets. In fact, a countable set is either ﬁnite or countably inﬁnite.

G. Moroşanu, Functional Analysis for the Applied Sciences,

Universitext, https://doi.org/10.1007/978-3-030-27153-4 1
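The fact that Z is countably infinite can be made concrete with an explicit bijection. The following sketch (our own illustration; the function name is ours, not from the text) enumerates Z by alternating signs:

```python
def nat_to_int(n):
    """Bijection from N = {1, 2, 3, ...} onto Z:
    even n map to the positive integers, odd n to 0 and the negatives."""
    return n // 2 if n % 2 == 0 else -(n - 1) // 2

# 1, 2, 3, 4, 5, 6, 7 are sent to 0, 1, -1, 2, -2, 3, -3, so Z is
# enumerated without repetition, i.e., Z is countably infinite.
print([nat_to_int(n) for n in range(1, 8)])  # [0, 1, -1, 2, -2, 3, -3]
```

A similar (slightly more involved) enumeration works for Q, via the classical diagonal argument.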


A partial order on a nonempty set A is a binary relation ≤ over A satisfying the following conditions for x, y, z ∈ A: (a) x ≤ x;

(b) if x ≤ y and y ≤ x, then x = y; (c) if x ≤ y and y ≤ z, then x ≤ z.

We say that x < y if x ≤ y and x ≠ y. The symbols ≥ and > have natural meanings: x ≥ y iff y ≤ x, and x > y iff y < x.

If A is endowed with a partial order, then A is called a partially ordered

set. For example, N is partially ordered with respect to the divisibility

relation (m ≤ n if m is a divisor of n); also, the set of subsets of a given

set S is partially ordered by the inclusion relation. Note that in these

examples there are pairs of elements which are not comparable with

respect to the corresponding order, which is why the order is called

partial.

If A is a set with a partial order ≤, then a subset B ⊂ A is said

to be totally ordered (or a chain) if any two elements x, y ∈ B are

comparable, i.e., either x ≤ y or y ≤ x (including the case x = y).

Let B be a subset of A. An element z ∈ A is an upper bound for B if

x ≤ z for all x ∈ B. If B has an upper bound, it is said to be bounded

above. An element m ∈ A is a maximal element of A if there is no

x ∈ A, x ≠ m, such that m ≤ x. A maximal element of A is not

necessarily an upper bound for A.

The set A is called inductive if any totally ordered subset of A has an

upper bound.

Now, let us recall an important result which is known as Zorn’s

Lemma1 :

Any nonempty inductive set has a maximal element.

Given a subset B of a partially ordered set A, the supremum of B, denoted sup B, is defined as the least upper bound of B. An element b ∈ A is the least upper bound of B if and only if (i) b is an upper bound of B (i.e., x ≤ b for all x ∈ B), and (ii) for every a ∈ A with a < b there exists an x ∈ B such that a < x (so that no a < b is an upper bound of B); in this case b = sup B.

1 Max August Zorn, German mathematician, 1906–1993.


1.2 Sequences

A sequence in a nonempty set X is an ordered list of elements from

X, and can be deﬁned as a function f : D → X whose domain D

is a countable, totally ordered set. The case when D is ﬁnite is not

considered in this book. We shall mostly consider that D = N and

the sequence is usually denoted (an )n∈N , or simply (an ), where an =

f (n) for all n ∈ N. Sometimes we consider inﬁnite subsets of N, for

instance, D = {m, m + 1, . . . }, m ∈ N, m > 1, and in this case the

sequence is denoted (an )n≥m . A sequence can also be indicated by

listing its terms: (an )n∈N = (a1 , a2 , . . . ). For example, (1, 3, 5, 7, . . . )

is the sequence of odd natural numbers. It is worth pointing out

that a term (element) can appear several times in a sequence, e.g.,

(an )n∈N = (0, 1, 0, 1, 0, 1, . . . ), where a2k−1 = 0 and a2k = 1 for all

k ∈ N.

A subsequence of a given sequence (an )n∈N = (a1 , a2 , . . . ) is a new

sequence (bk )k∈N , obtained by removing some terms from (a1 , a2 , . . . )

and preserving the order of the remaining terms, i.e.,

bk = ank , k ∈ N, where (nk )k∈N is a strictly increasing sequence in N.
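In programming terms (a sketch of ours, not from the text), a subsequence is obtained by selecting values at a strictly increasing list of indices:

```python
def subsequence(a, idx):
    """Return [a[n] for n in idx]; idx plays the role of (n_k) and must be
    strictly increasing, so the order of the remaining terms is preserved."""
    assert all(idx[i] < idx[i + 1] for i in range(len(idx) - 1))
    return [a[n] for n in idx]

# the 0-1 sequence from the text (first six terms, 0-based indexing):
a = [0, 1, 0, 1, 0, 1]
print(subsequence(a, [1, 3, 5]))  # [1, 1, 1] -- the constant subsequence a_2, a_4, a_6
```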

We close this section by noting that further details on sequences will

be discussed later.

1.3 Real Numbers

While everybody feels comfortable dealing with rational numbers, some effort is needed in order to understand the larger set of real numbers.

Real numbers are needed since the set of rational numbers Q is not sufficiently large for many purposes. For example, the equation p^2 = 2 has no solution in Q. This assertion was first proved by Euclid.2 In fact, it was observed that the diagonal and the side of any square are incommensurable, i.e., the length p of the diagonal of the unit square is not a rational number. Indeed, p must satisfy the equation p^2 = 2.

One needs to ﬁnd a number p (which cannot be a rational one) to

represent the length of that diagonal. Many other similar examples

2 Greek mathematician, known as the father of Geometry, born around 330 BC, presumably in Alexandria, Egypt.


appear when trying to express areas, volumes, weights, etc. So, it was

really necessary to enlarge the set Q to obtain a set R, called the set of

real numbers, within which inconveniences as those described above do

not occur. The elements of R \ Q will be called irrational numbers. In particular, the irrational number √2 will be the precise representation for the length of the diagonal of the unit square. In fact, we will see that the equation p^2 = 2 discussed above has two solutions in R, +√2 and −√2.

Roughly speaking, R is the completion of Q, as we will explain below.

First of all, let us recall an axiomatic deﬁnition of R: R is an ordered

ﬁeld, containing Q as a subﬁeld, and having the least upper bound

property. More precisely, R, endowed with two internal operations,

addition and multiplication, denoted “+” and “·”, and a total order,

denoted “≤”, satisﬁes the following axioms:

(A1) x + (y + z) = (x + y) + z for all x, y, z ∈ R;
(A2) x + y = y + x for all x, y ∈ R;
(A3) there exists an element 0 ∈ R such that x + 0 = x for all x ∈ R;
(A4) for every x ∈ R there exists −x ∈ R (the opposite of x) such that x + (−x) = 0;
(M1) x · (y · z) = (x · y) · z for all x, y, z ∈ R (x · y is also denoted xy);
(M2) x · y = y · x for all x, y ∈ R;
(M3) there exists an element 1 ∈ R, 1 ≠ 0, such that x · 1 = x for all x ∈ R;
(M4) for every x ∈ R, x ≠ 0, there exists x^{−1} ∈ R (the inverse of x, also denoted 1/x) such that x · x^{−1} = 1;
(D) x · (y + z) = x · y + x · z for all x, y, z ∈ R;
(O1) if x ≤ y, then x + z ≤ y + z for all z ∈ R;
(O2) if 0 ≤ x and 0 ≤ y, then 0 ≤ x · y;
(LUBP) for every nonempty set A ⊂ R which is bounded above (i.e., A has an upper bound) there exists sup A ∈ R.


The axiom (LUBP) is called the least upper bound property (which

is why it is so denoted) or the completeness axiom (this name will be

clariﬁed in the following).

Remark 1.2. The fact that Q is a subﬁeld of R means that Q ⊂ R and

the operations of addition and multiplication in R are also internal

operations in Q. In fact, any ordered ﬁeld K contains a subﬁeld QK

which is isomorphic to Q. Indeed, the function g : Q → K, deﬁned by

g(m/n) = (m · 1K ) · (n · 1K )−1 , is an injective morphism, so g(Q) is a

subﬁeld of K isomorphic to Q. Thus, the condition from the deﬁnition

above, that R contains Q as a subﬁeld is superﬂuous if we admit that

Q is unique up to isomorphism. We merely wanted to make it clear

that R is an extension of Q.

Remark 1.3. It is worth pointing out that the extension from rational

numbers to real numbers is the result of a long investigative process

extended over more than 2000 years. The problem was clariﬁed in

the nineteenth century. There are several models for R deﬁned by

the above system of axioms, such as the Stolz–Weierstrass model,3

based on decimal expansions; Dedekind’s model,4 based on the so-

called Dedekind cuts and the Cantor–Méray model.5 All these

models are based on approximation (as are all models of R). We shall

describe the Cantor–Méray construction which involves Cauchy se-

quences of rational numbers and uses the basic properties of Q as an

ordered ﬁeld. Intuitively speaking, according to this construction, R

will consist of all rational numbers, plus “limits” of Cauchy6 sequences

in Q which are not rational numbers. The most important step in this

construction (completion procedure) will be to show that the com-

pleteness axiom is satisﬁed by this model, denoted RC−M (C − M

comes from Cantor–Méray), thus ensuring that any Cauchy sequence

of rational numbers is “convergent” (has a “limit”) in RC−M . But

such “limits” cannot be used in this construction (one cannot deﬁne

real numbers by themselves!), so instead we consider as elements of

RC−M the equivalence classes of Cauchy rational sequences (two sequences being equivalent if the corresponding sequence of differences converges to zero).

3 Otto Stolz, Austrian mathematician, 1842–1905; Karl Weierstrass, German, known as the father of modern analysis, 1815–1897.
4 Richard Dedekind, German mathematician, 1831–1916.
5 Georg Cantor, German mathematician, 1845–1918; Charles Méray, French mathematician, 1835–1911.
6 Augustin-Louis Cauchy, French mathematician, engineer and physicist, 1789–1857.


Note that the Cauchy sequence which is supposed to define (“converge to”) a given real number is not unique.

Finally, we will prove that any two copies of R are isomorphic, thus

concluding that R is unique up to isomorphism.

Before presenting in detail the Cantor–Méray model, we will make a

few comments and derive some abstract results regarding R as deﬁned

by the axioms above.

Remark 1.4. It is easily seen that (LUBP) implies that for any non-

empty set A ⊂ R which is bounded below (i.e., has a lower bound),

there exists the greatest lower bound of A, denoted inf A ∈ R. In fact,

inf A = − sup {x ∈ R; −x ∈ A}. The converse implication is also true,

so one may replace (LUBP) by this equivalent statement.

Remark 1.5. It is worth pointing out that the (LUBP) is precisely

what makes the diﬀerence between R and Q. Indeed, Q is an ordered

ﬁeld, but does not satisfy the (LUBP), as illustrated by the following

counterexample:

Let A ⊂ Q denote the set {p ∈ Q : p > 0, p2 < 3}. A is nonempty,

since 1 ∈ A. Obviously, A is bounded above (e.g., 2 is an upper bound

of A). Assume by contradiction that there exists a number α ∈ Q

which is the least upper bound of A, α = sup A. Then α ≥ 1 and we

need to examine the following three possibilities: α^2 < 3, α^2 = 3, and α^2 > 3.

If α^2 < 3, then (2α + 3)/(α + 2) > α and (2α + 3)/(α + 2) ∈ A, so α is not even an upper bound of A.

The case α ∈ Q, α^2 = 3 is impossible (prove it!).

Finally, if α^2 > 3, then β := (2α + 3)/(α + 2) ∈ Q, β > 0 (since α ∈ Q, α ≥ 1), and α − β = (α^2 − 3)/(α + 2) > 0, hence β < α. On the other hand, 3 − β^2 = (3 − α^2)/(α + 2)^2 < 0, so β^2 > 3. It follows that β is an upper bound for A, with β < α. This contradicts the fact that α = sup A.

Since none of the above cases is possible, there is no rational number

α such that α = sup A. Therefore Q does not satisfy the (LUBP).

Note that if A is considered as a subset of R, then there exists sup A = √3 ∈ R \ Q (see below).
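The map β = (2α + 3)/(α + 2) used in this counterexample can be iterated numerically. The sketch below (our own illustration, using exact rational arithmetic) starts from the rational upper bound 2 and produces strictly smaller rational upper bounds whose squares approach 3, exhibiting concretely why no rational least upper bound exists:

```python
from fractions import Fraction

def step(a):
    # the map from the text: for rational a with a^2 > 3 it returns a
    # smaller rational that is still an upper bound of A = {p > 0 : p^2 < 3}
    return (2 * a + 3) / (a + 2)

a = Fraction(2)          # 2 is a rational upper bound of A
for _ in range(5):
    a = step(a)
    assert a * a > 3     # every iterate remains an upper bound
print(a)                 # 1351/780, whose square exceeds 3 by only 1/608400
```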

Now, we present a result known as the Archimedean7 property:

7 Archimedes of Syracuse, 287–212 BC.


For all x, y ∈ R with x > 0, there exists an n ∈ N such that nx > y.

Proof. Assume that, on the contrary, nx ≤ y for all n ∈ N, so the set

A = {nx; n ∈ N} is bounded above. Then the (LUBP) implies that

there exists α = sup A ∈ R. Since α − x < α, there exists an element

of A, say mx, with m ∈ N, such that α − x < mx which is equivalent

to α < (m + 1)x ∈ A. This contradicts the fact that α is an upper

bound of A.

Theorem 1.7. Between any two distinct real numbers there is a rational number.

Proof. Let x, y ∈ R, x < y. Since y − x > 0 it follows by the

Archimedean property that there exists an n ∈ N such that

n(y − x) > 1. (1.3.1)

By the same Archimedean property there exist w, z ∈ N such that

−w < −nx < z. In fact, w can be replaced by m := − sup{r ∈ Z :

−w ≤ r < −nx}, so nx < m. Moreover,

nx < m ≤ nx + 1. (1.3.2)

By (1.3.1) and (1.3.2) we can conclude that x < m/n < y.
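The proof is constructive. The following sketch (ours, using floating point and ignoring rounding effects at the edges) finds a rational m/n strictly between two given reals exactly as in the argument above:

```python
import math

def rational_between(x, y):
    """Follow the proof: choose n with n(y - x) > 1, then the least integer
    m with m > n*x satisfies nx < m <= nx + 1 < ny, i.e., x < m/n < y."""
    n = math.floor(1 / (y - x)) + 1   # Archimedean choice of n
    m = math.floor(n * x) + 1         # least integer exceeding n*x
    return m, n

m, n = rational_between(math.sqrt(2), math.sqrt(3))
print(m, n)  # 6 4, i.e., sqrt(2) < 3/2 < sqrt(3)
```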

For every x ∈ R, x > 0, and for all n ∈ N, n ≥ 2, there exists a unique y ∈ R, y > 0, such that y^n = x.

Proof. The uniqueness of y follows from the implication 0 < y1 < y2 ⇒ y1^n < y2^n. To prove the existence of y, consider the set A = {t ∈ R; t > 0, t^n < x}. A is nonempty, since it contains t1 = x/(1 + x). Indeed, t1^n < t1 < x. A is also bounded above (for example, 1 + x is an upper bound for A). By the (LUBP) there exists y = sup A ∈ R, y > 0. Let us prove that y^n = x. Assuming that y^n < x, we have for 0 < ε < 1,

(y + ε)^n − y^n = ε[(y + ε)^{n−1} + y(y + ε)^{n−2} + · · · + y^{n−1}] < εn(y + 1)^{n−1}.

Hence
(y + ε)^n < y^n + εz, (1.3.3)

where z = n(y + 1)^{n−1}. By the Archimedean property, there is a k ∈ N, k ≥ 2, such that ε = 1/k satisfies

εz < x − y^n. (1.3.4)


Then (1.3.3) and (1.3.4) give (y + ε)^n < y^n + εz < x, so y + ε ∈ A, which contradicts the fact that y = sup A. We can also show that y^n > x leads to a contradiction. Hence, y^n = x.

The n-th root y of the real number x > 0 is denoted ⁿ√x (√x if n = 2) or x^{1/n}. At this moment, we can see that in particular the equation p^2 = 2 can be solved in R: p^2 = 2 ⇔ p^2 − (√2)^2 = 0 ⇔ (p − √2)(p + √2) = 0, so there are two solutions, p = √2 and p = −√2. The number √2, which is irrational, represents in particular the length of the diagonal of the unit square. So, the difficulty pointed out by Euclid can be handled in R. Similarly, √3 is an irrational number representing the length of the diagonal of the unit cube.
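The supremum construction in the existence proof above suggests a simple numerical scheme. The bisection sketch below (our illustration, in floating point) approximates sup{t > 0 : t^n < x} and hence the n-th root:

```python
def nth_root(x, n, iters=60):
    """Bisection for sup A, A = {t > 0 : t**n < x}, with x > 0 and n >= 2.
    Invariant: lo is not an upper bound of A, hi is an upper bound."""
    lo, hi = 0.0, max(1.0, x)       # 1 + x would also do as an upper bound
    for _ in range(iters):
        mid = (lo + hi) / 2
        if mid ** n < x:
            lo = mid                # mid belongs to A
        else:
            hi = mid                # mid is an upper bound of A
    return hi

print(nth_root(2.0, 2))  # close to 1.41421356..., i.e., sqrt(2)
```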

Remark 1.9. Sometimes it is useful to represent numbers by points on

a straight line. First, let us mark arbitrarily two distinct points O

and A on the straight line to represent the numbers 0 and 1. The line

segment OA is called the unit segment. If we choose a point P to the

right of A, such that OP consists of m unit segments, m ∈ N, m ≥ 2,

then P represents the natural number m. The negative integers are

similarly represented by points on the left of O, following the natural

order . . . , −3, −2, −1. So now we have a directed straight line, called

the number line, including the positive half-line (on the right of O)

and the negative half-line. One can also associate with any rational

number a point on the number line. For example, if one divides OA

into 2 equal parts and chooses a point R on the positive half-line such

that OR is equal to 3 such parts, then R represents 3/2. Obviously,

the points corresponding to distinct rational numbers are distinct too.

Note that the set of points on the number line corresponding to all the rational numbers does not cover the number line. For example, the point D corresponding to the length of the diagonal of the unit square (i.e., √2) is on the number line (D being constructible by using a ruler and compass), yet it corresponds to no rational number. We will discuss later the representation of irrational numbers by points on the number line.

A sequence (an )n∈N in R is said to be increasing (or nondecreasing) if an ≤ an+1 for all n ∈ N. If an < an+1

for all n ∈ N, then (an ) is called strictly increasing.

Similarly, if the order relations “≤” and “<” are replaced by “≥”

and “>”, we obtain the deﬁnitions for a decreasing (or nonincreasing)

sequence, and a strictly decreasing sequence, respectively.


A sequence (an )n∈N in R is said to be bounded above (bounded below) if there exists an M ∈ R such that an ≤ M (an ≥ M , respectively) for

all n ∈ N. If (an ) is bounded both above and below, then it is called

bounded.

A sequence (an )n∈N in R is said to be convergent if there exists a

number a ∈ R (called limit of (an )) such that

∀ε ∈ R, ε > 0, ∃N = N (ε) ∈ N such that ∀ n > N, |an − a| < ε.

Here, | · | means the absolute value function, i.e., |x| = x if x ≥ 0, and

|x| = −x if x < 0. The above deﬁnition (of a convergent sequence)

will be discussed again later in a more general framework. Here we

are interested in some properties of sequences of real numbers.

It is easily seen that any convergent sequence is bounded, and its limit

is unique.

Next, we state the so-called Monotone Convergence Theorem:

Theorem 1.10. Any sequence (an )n∈N in R which is increasing (decreasing) and bounded is convergent.

Proof. We consider the case when (an ) is increasing and bounded (the

other case is similar). Since the set of all an ’s (where repetitions are

eliminated) is bounded above, it follows by (LUBP) that there exists

its supremum a ∈ R. Thus, for all ε ∈ R, ε > 0, there exists an N ∈ N,

such that a − ε < aN . Since (an ) is increasing, we have a − ε < an for

all n > N , so

|an − a| = a − an < ε ∀n > N.

Next, we state the Bolzano–Weierstrass Theorem.8

Theorem 1.11. Any bounded sequence in R has a convergent subsequence.

Proof. Let (an )n∈N be a bounded sequence in R. Call k ∈ N a peak index if it is a number with the property ak > am for all m > k. Assume there are

inﬁnitely many such k’s, say k = nj , n1 < n2 < · · · < nj < · · · . Then,

the subsequence (anj )j∈N is strictly decreasing, hence convergent since

it is also bounded (cf. Theorem 1.10).

8 Bernard Bolzano, Bohemian mathematician, logician, philosopher, and theologian, 1781–1848.


Otherwise, let K ∈ N be the maximum of such k’s. Since n1 := K + 1 is not such a k, there exists an n2 > n1 such that an1 ≤ an2 . Now, since n2 does not belong to the set of k’s, there exists an n3 > n2 such that an2 ≤ an3 . Continuing this

procedure we obtain a subsequence (anj )j∈N which is increasing and

bounded, hence convergent (cf. Theorem 1.10).

A sequence (an )n∈N in R is called a Cauchy sequence if for all ε ∈ R, ε > 0, there exists an N = N (ε) ∈ N such that ∀ n, m > N, |an − am | < ε.

Theorem 1.12. A sequence in R is a Cauchy sequence if and only if it is convergent.

Proof. Let (an )n∈N be a Cauchy sequence in R. It is easily seen that

(an ) is bounded. Thus, by Theorem 1.11, there is a convergent subse-

quence, say (ank )k∈N . Let a ∈ R be its limit. By the triangle inequality

(which obviously holds in R), we have |an − a| ≤ |an − ank | + |ank − a|, and both terms on the right-hand side can be made arbitrarily small for n and nk large enough, since (an ) is a Cauchy sequence and ank → a (so the whole sequence (an ) converges to the same limit a).

The converse implication is trivial.

Cauchy sequences are important in real analysis and also help us understand the Cantor–Méray model for R.

Assume now that only Q (the ordered field of rational numbers) is known. We want to extend Q to obtain

a larger ordered ﬁeld satisfying in addition the (LUBP). Denote by

SQ the collection of all Cauchy sequences of rational numbers. When

deﬁning a Cauchy sequence in Q we require ε ∈ Q, ε > 0 (since the

extension of Q is not yet known). Deﬁne the following equivalence

relation in SQ : (an ) ∼ (bn ) if and only if for all ε ∈ Q, ε > 0, there exists an N ∈ N such that ∀ n > N, |an − bn | < ε. For example, the sequences given by bn = n/(n^2 + 1) and cn = 0 for all n ≥ 1 belong to the same equivalence


class, i.e., the class of the constant sequence (0, 0, . . . ), which can be

identiﬁed with 0 ∈ Q. We identify any r ∈ Q with the equivalence

class of the constant sequence (r, r, . . . ).
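The equivalence just described can be probed experimentally with exact rational arithmetic. The sketch below (ours; it only tests finitely many indices, of course) checks that bn = n/(n^2 + 1) and the constant zero sequence stay within ε = 1/k of each other beyond N = k:

```python
from fractions import Fraction

def within_eps_beyond(b, c, eps, N, sample=200):
    """Check |b_n - c_n| < eps for n = N+1, ..., N+sample (a finite probe
    of the condition defining the Cantor-Meray equivalence relation)."""
    return all(abs(b(n) - c(n)) < eps for n in range(N + 1, N + sample + 1))

b = lambda n: Fraction(n, n * n + 1)   # b_n = n/(n^2 + 1)
c = lambda n: Fraction(0)              # the constant sequence (0, 0, ...)

# since n/(n^2 + 1) < 1/n, the choice N = k works for eps = 1/k
for k in (10, 100, 1000):
    assert within_eps_beyond(b, c, Fraction(1, k), N=k)
```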

Let us denote by RC−M the set of all equivalence classes in SQ (with

respect to the equivalence relation deﬁned above). Obviously, Q can

be regarded as a subset of RC−M (in view of the natural identiﬁcation

mentioned above).

Now, one deﬁnes in a natural manner the operations of addition and

multiplication in RC−M . Speciﬁcally, if a, b are classes in RC−M with

representatives (an ), (bn ) ∈ SQ , then a + b and ab are deﬁned as the

equivalence classes of (an + bn ) and (an bn ), respectively. Also, a ≤ b

if for all ε ∈ Q, ε > 0, there exists an N ∈ N such that bn − an ≥ −ε

for all n ≥ N . Note that the strict inequality a < b (i.e., a ≤ b

and a = b) can be equivalently expressed as follows: there exists an

ε0 ∈ Q, ε0 > 0, such that bn −an ≥ ε0 for all n large enough. Likewise,

these deﬁnitions do not depend on speciﬁc representatives.

It is easily seen that RC−M is an ordered ﬁeld satisfying axioms (A1)−

(A4), (M 1) − (M 4), (D), and (O1) − (O2).

Let us now prove that RC−M also satisﬁes the (LUBP). Let Ω be a

nonempty subset of RC−M which is bounded above, with upper bound a ∈ RC−M . One may assume that a is the class of a constant

sequence (u0 , u0 , . . . ) with u0 ∈ Q (if this is not the case, we can use

the information that a Cauchy sequence in Q has an upper bound in Q,

so a can be replaced by the class of a constant sequence (u0 , u0 , . . . ),

where u0 is a large rational number).

Let us pick an s0 ∈ Ω and a rational number l0 such that l0 < s0 ,

where l0 is identiﬁed with the class of the constant sequence (l0 , l0 , . . . ).

Next, we construct two sequences of rational numbers (un ) and (ln ) as

follows: u1 = u0 and l1 = l0 , then, successively, for n = 1, 2, . . . , either

un+1 = (un + ln )/2, ln+1 = ln if (un + ln )/2 is an upper bound of Ω, or

un+1 = un and ln+1 = (un + ln )/2 if (un + ln )/2 is not an upper bound

of Ω. By induction we can see that un is an upper bound of Ω for all

n ∈ N, while ln is not an upper bound of Ω for any n ∈ N. Obviously,

(un ) and (ln ) are Cauchy sequences, so their classes u, l ∈ RC−M , and

in fact u = l, since |un − ln | = un − ln = (u0 − l0)/2^{n−1} → 0, n ≥ 1. It

is also obvious that u is an upper bound of Ω. Let us prove that u is

the least upper bound: u = sup Ω. Assume that there exists a smaller

upper bound, say v ∈ RC−M , v < u = l. Since lk ≤ lk+1 for all k ∈ N,

there exists an N ∈ N such that v < lN . But lN is not an upper

bound of Ω, hence v cannot be an upper bound of Ω either, leading to a contradiction. Therefore u = sup Ω, which shows that RC−M satisfies the (LUBP), i.e., RC−M is a model for R.
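The bisection that produced (un) and (ln) can be run concretely in exact rational arithmetic; here is a small Python sketch (the choices Ω = {x ∈ Q : x² < 2}, u0 = 2, l0 = 1 are illustrative, not from the text):

```python
from fractions import Fraction

# The bisection above for Omega = {x in Q : x^2 < 2}, with u0 = 2 (an
# upper bound) and l0 = 1 (not an upper bound); u_n and l_n squeeze the
# least upper bound sqrt(2), which is not a rational number.
def is_upper_bound(m):          # valid test for positive rationals m
    return m * m >= 2

u, l = Fraction(2), Fraction(1)
for _ in range(30):
    m = (u + l) / 2
    if is_upper_bound(m):
        u = m
    else:
        l = m

assert is_upper_bound(u) and not is_upper_bound(l)
assert u - l == Fraction(1, 2 ** 30)     # the gap (u0 - l0)/2^n
print(float(l), float(u))                # both close to 1.41421356...
```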

Remark 1.13. Let us summarize: any element x ∈ RC−M is the equiv-

alence class of a Cauchy sequence in Q, say (rn ) (this could be a

constant sequence if x ∈ Q); since RC−M is a model for R (a complete

ordered ﬁeld), we know that (rn ) is convergent (see Theorem 1.12); its

limit (which is independent of the choice of (rn ) in the class x) can

be identiﬁed with x. So now we have a clear representation of RC−M ,

including rational and irrational numbers.

Let us now show that the real number system is unique up to isomorphism. Let R̂ be another model for R. As before, we admit that Q

is unique up to isomorphism, so Q is a subﬁeld of both RC−M and R̂.

Since Q is dense in R̂ (see Theorem 1.7), for any x ∈ R̂, there exists a

sequence of rational numbers (rn ) that converges to x (this sequence

can be the constant sequence (x, x, . . . ) if x ∈ Q). Of course, (rn ) is a

Cauchy sequence. We associate with such an x the class of (rn ) with

respect to the equivalence relation “∼” defined above. So we have defined

a mapping φ : R̂ → RC−M , φ(x) = the class of (rn ). It is easily seen

that φ is a bijection, and

φ(x + y) = φ(x) + φ(y) ∀x, y ∈ R̂,

φ(x · y) = φ(x) · φ(y) ∀x, y ∈ R̂,

x > 0 =⇒ φ(x) > 0.

Therefore, R̂ is isomorphic to RC−M , hence any two real number mod-

els are isomorphic. So in what follows we will consider that the real

number system is unique and denote it by R.

The Real Line. We discussed in Remark 1.9 how to represent rational numbers

on a directed straight line. Now, taking into account the Cantor–Méray

construction, we can complete the procedure by representing irrational

numbers. We see that to every real number there corresponds a unique

point of the directed straight line, and the correspondence is one-to-

one. The Dedekind–Cantor axiom stipulates that there are no gaps on

the line after representing all real numbers, that is there is a one-to-

one correspondence between R and the points of the directed straight

line. The directed straight line will be called the real line, and real

numbers will be sometimes called points.


It is often useful to describe mathematically what happens “beyond” real numbers. For

example, 1/x gets closer and closer to zero when x gets larger and

larger. Having in mind that the point on the real line corresponding

to x goes far away to the right, we usually say that x tends to inﬁnity,

and write x → +∞. The fact that 1/x tends to zero as x → ∞ can be written as 1/(+∞) = 0.

Similar situations require the introduction of the symbol −∞. So we

are led to the so-called extended real number system,

R̄ := R ∪ {−∞, +∞} .

We extend the usual order of R to R̄ by setting

−∞ < x < +∞ ∀x ∈ R.

With this convention, +∞ is an upper bound of every nonempty subset of R̄. Moreover, any nonempty subset of R̄ has a least upper bound. For instance, E = {x + 1/x : x ∈ R, x ≠ 0} has sup E = +∞ and inf E = −∞. The symbol +∞ is also denoted by ∞.

In accordance with our intuition, we adopt the following conventions:

x + ∞ = ∞, x − ∞ = −∞, x/∞ = x/(−∞) = 0 ∀x ∈ R;

x · ∞ = ∞, x · (−∞) = −∞ ∀x ∈ R, x > 0;

x · ∞ = −∞, x · (−∞) = +∞ ∀x ∈ R, x < 0;

∞ + ∞ = ∞, −∞ − ∞ = −∞,

∞ · ∞ = ∞, ∞ · (−∞) = −∞, (−∞) · (−∞) = +∞.

On the other hand, operations such as

0 · (±∞), ∞ − ∞, ∞/∞

are not accepted. For example, x/(1 + x²) approaches 0 as x → ∞, while x/(1 + √x) approaches +∞ as x → ∞. Thus, the quotient of two large numbers may approach either 0 or ∞. That is why we say that ∞/∞ does not make sense.

Note that R̄ does not form a field (why?). Let us also mention that there exist other extensions of the system of real numbers. For information see, for example, [33, 41, 42].
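As an aside (an illustration, not from the book), IEEE floating-point arithmetic adopts essentially the same conventions, and represents the excluded operations by NaN:

```python
import math

inf = math.inf
# The adopted conventions hold for Python floats:
assert 5 + inf == inf and 5 - inf == -inf
assert 5 / inf == 0 and 5 / -inf == 0
assert 2 * inf == inf and -3 * inf == -inf
assert inf + inf == inf and (-inf) * (-inf) == inf
# The operations left undefined above surface as NaN ("not a number"):
for x in (inf - inf, 0 * inf, inf / inf):
    assert math.isnan(x)
```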


Proving that a given number is irrational is not a trivial task. The number known as e is an example

in this respect. It is deﬁned as the sum of a series, namely

e = Σ_{n=0}^{∞} 1/n! ,

where n! = 1 · 2 · 3 ··· n for n ≥ 1, and 0! = 1. Let sn denote the partial sum of the series, i.e., sn = Σ_{k=0}^{n} 1/k!. By the ratio test we see that the

series converges, hence e ∈ R. More precisely,

2 < e < 1 + Σ_{k=0}^{∞} 1/2^k = 3. (1.3.5)

Note that

e − sn < (1/(n + 1)!) Σ_{k=0}^{∞} 1/2^k = 2/(n + 1)! ,

hence (since (n + 1)! > n · n!)

0 < e − sn < 2/(n! n). (1.3.6)

Let us now prove that e is irrational. Assume the contrary, that

e = p/q, where p, q ∈ N, (p, q) = 1. In fact, q > 1 (see (1.3.5)).

From (1.3.6) we infer that

0 < q! (p/q − sq) < 2/q . (1.3.7)

Observing that q!sq ∈ N, we have m := q! (p/q − sq) ∈ N. So we deduce

from (1.3.7) that 0 < m < 1 which is impossible (there is no integer

between 0 and 1). Therefore e is irrational, as claimed.
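The error bound (1.3.6) can be verified numerically; a short Python check (illustration only, using floating-point values of e and sn):

```python
import math

# Numerical check of the bound (1.3.6): 0 < e - s_n < 2/(n! n),
# where s_n = sum_{k=0}^{n} 1/k! is the partial sum of the series.
for n in range(1, 13):
    s_n = sum(1 / math.factorial(k) for k in range(n + 1))
    err = math.e - s_n
    assert 0 < err < 2 / (math.factorial(n) * n)
print("bound (1.3.6) holds for n = 1, ..., 12")
```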

Remark 1.14. By an argument from Rudin [42, p. 64] we see that e is also the limit of the sequence (xn)n∈N defined by xn = (1 + 1/n)^n. Using the binomial formula we can write

xn = 1 + 1 + (1/2!)(1 − 1/n) + (1/3!)(1 − 1/n)(1 − 2/n) + ··· + (1/n!)(1 − 1/n)(1 − 2/n) ··· (1 − (n−1)/n).

In particular, xn ≤ sn. Then, for all n, m ∈ N, n ≥ m ≥ 2, we have

1 + 1 + (1/2!)(1 − 1/n) + ··· + (1/m!)(1 − 1/n)(1 − 2/n) ··· (1 − (m−1)/n) ≤ xn ≤ sn ,

which implies (first letting n → ∞ with m fixed)

1 + 1 + 1/2! + ··· + 1/m! ≤ lim inf xn ≤ lim sup xn ≤ e.

Letting now m → ∞ gives e ≤ lim inf xn. Therefore, e = lim xn, as claimed.
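A quick numerical look (an illustration, not from the text) at the convergence described in Remark 1.14:

```python
import math

# x_n = (1 + 1/n)^n increases towards e; the sampled terms below stay
# under e, and the last one is already within 2e-6 of it.
xs = [(1 + 1 / n) ** n for n in (1, 10, 100, 10_000, 1_000_000)]
assert all(a < b for a, b in zip(xs, xs[1:]))   # increasing along sample
assert all(x < math.e for x in xs)
assert math.e - xs[-1] < 2e-6
```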

1.4 Complex Numbers

We assume that the reader is familiar with the complex field. In what

follows we just recall its construction and some notation.

Let C denote the Cartesian9 product R×R equipped with two internal

operations, addition and multiplication, defined as follows:

(x1, y1) + (x2, y2) = (x1 + x2, y1 + y2),
(x1, y1) · (x2, y2) = (x1 x2 − y1 y2, x1 y2 + x2 y1).

It is easy to check that C is a field, with (0, 0) and (1, 0) in the role of

0 and 1, respectively.

In particular, for any z = (x, y) ∈ C, z ≠ (0, 0), we have

z⁻¹ = ( x/(x² + y²), −y/(x² + y²) ).
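These operations can be transcribed directly (a Python sketch with exact rational coordinates, so the checks below are exact equalities):

```python
from fractions import Fraction

# Complex numbers as ordered pairs (x, y), with the operations above.
def add(z, w):
    return (z[0] + w[0], z[1] + w[1])

def mul(z, w):
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])

def inv(z):
    x, y = z
    d = x * x + y * y               # nonzero since z != (0, 0)
    return (x / d, -y / d)

i = (Fraction(0), Fraction(1))
assert mul(i, i) == (-1, 0)         # i^2 = -1
z = (Fraction(3), Fraction(4))
assert mul(z, inv(z)) == (1, 0)     # z * z^{-1} is the unit (1, 0)
```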

The set R1 := {(x, 0); x ∈ R} is closed with respect to the above operations, which read in this case

(x1, 0) + (x2, 0) = (x1 + x2, 0), (x1, 0) · (x2, 0) = (x1 x2, 0).

Thus any (x, 0) can be identified with x, and R1 with these operations can be identified with R with the usual operations of addition and multiplication. So R can be viewed as a subfield of C.

Any z = (x, y) ∈ C can be decomposed as z = (x, 0) + (y, 0) · (0, 1),

so in view of the above identiﬁcation, we can write z = x + yi, where

i := (0, 1). Note that (0, 1)·(0, 1) = (−1, 0), thus we can write i2 = −1;

i is called the imaginary unit.

Summarizing, we can write C = {x + yi; x, y ∈ R} and observe that

the two operations initially deﬁned can be viewed as the addition and

multiplication similar to those used for real numbers if we admit that

i2 = −1.

The elements z = x + yi of C are called complex numbers and C is

known as the complex ﬁeld or complex number system. For a complex

number z = x + yi, the real numbers x and y are called the real

part and the imaginary part of z, respectively (denoted x = Re z, y = Im z).

9 René Descartes, latinized Renatus Cartesius, French mathematician, philosopher, and scientist, 1596–1650.


Any complex number z = x + yi can be represented as a point (of coordinates x, y) in the complex plane which is determined by two

orthogonal directed straight lines with the same unit, the x-axis (real

axis) and the y-axis (imaginary axis).

Let z̄ = x − yi be the complex conjugate of z = x + yi. Note that z · z̄ = x² + y² ∈ R. The number |z| = √(x² + y²) is called the magnitude of z,

and it represents the length of the segment connecting the origin O of

the complex plane and the point of coordinates x and y corresponding

to z.

1.5 Linear Spaces

Recall that a nonempty set X is said to be a linear space (or vector

space) over a ﬁeld K if there exist a binary operation on X, called

addition, + : X × X → X, and an external binary operation, called

scalar multiplication, · : K × X → X, such that the following axioms

are satisﬁed

(A1) (x + y) + z = x + (y + z) ∀x, y, z ∈ X;

(A2) x + y = y + x ∀x, y ∈ X;

(A3) ∃0 ∈ X, called zero, such that x + 0 = x ∀x ∈ X;

(A4) ∀x ∈ X ∃ − x ∈ X such that x + (−x) = 0;

(A5) 1 · x = x ∀x ∈ X, where 1 is the unit element of

the ﬁeld K;

(A6) α(βx) = (αβ)x ∀α, β ∈ K, ∀x ∈ X;

(A7) (α + β)x = αx + βx ∀α, β ∈ K, ∀x ∈ X;

(A8) α(x + y) = αx + αy ∀α ∈ K, ∀x, y ∈ X.

The ﬁrst four axioms ensure that X is an Abelian10 group with respect

to addition.

In the following K will be either the field R of real numbers or the field C of complex numbers, and X will be called a real or complex space,

respectively.

A nonempty subset Y of X which is a linear space with respect to the

same operations is called a subspace of X. In fact, a necessary and

10 Niels Henrik Abel, Norwegian mathematician, 1802–1829.

sufficient condition for a nonempty subset Y ⊂ X to be a subspace is that Y be closed under the operations, i.e.,

∀ x, y ∈ Y, ∀ α ∈ K, x + y ∈ Y, αx ∈ Y.

Given a nonempty subset S of X, denote by Span S the collection of all finite linear combinations of elements of S, i.e.,

Span S = { Σ_{i=1}^{k} αi xi = α1 x1 + ··· + αk xk ; k ∈ N, αi ∈ K, xi ∈ S, i = 1, . . . , k }.

Span S is a subspace of X, called the subspace generated by S (and S is said to be a system of generators).

We recall that x1 , x2 , . . . , xk ∈ X (where X is a linear space) are said

to be linearly dependent if there exist some scalars α1 , . . . , αk ∈ K,

not all zero, such that α1 x1 + · · · + αk xk = 0. Otherwise, the vectors

x1 , x2 , . . . , xk are called linearly independent (and {x1 , x2 , . . . , xk } is

said to be a linearly independent system). In this case, S = {x1 , x2 ,

. . . , xk } is a basis of the space Y = Span S (which could be the whole

of X), and we say that Y has dimension k, dim Y = k, and any vector

x ∈ Y can be uniquely expressed as a linear combination,

x = Σ_{i=1}^{k} αi xi = α1 x1 + ··· + αk xk ,

where α1, . . . , αk ∈ K are called the coordinates of x with respect to the basis S.

A basis is not unique.

A linear space X is inﬁnite dimensional (written as dim X = ∞) if

for any k ∈ N there exist k vectors in X which are linearly indepen-

dent. If X contains only the null vector, then by convention we deﬁne

dim X = 0.

Recall that two linear spaces X, Y (over the same field K) are isomorphic if there exists a bijection φ : X → Y which satisfies

φ(x + y) = φ(x) + φ(y) ∀x, y ∈ X, φ(αx) = αφ(x) ∀α ∈ K, ∀x ∈ X.

If two linear spaces X, Y are isomorphic and one of them is finite dimensional, then the other is also finite dimensional and dim X = dim Y (prove it!).


A concept that allows the extension of some properties of classical Euclidean geometry to general linear spaces is the scalar product.

A scalar product (or inner product) on a linear space X is a mapping

from X × X to K, denoted (·, ·), which satisfies the following axioms:

(a1) (x, y) = \overline{(y, x)} ∀ x, y ∈ X (hence (x, y) = (y, x) if K = R) ;
(a2) (x + y, z) = (x, z) + (y, z) ∀ x, y, z ∈ X ;
(a3) (αx, y) = α(x, y) ∀ α ∈ K, ∀ x, y ∈ X ;
(a4) (x, x) ≥ 0 ∀x ∈ X, and (x, x) = 0 ⇐⇒ x = 0 .

A space X together with such a product is

called an inner product space. It is easily seen that (x, αy) = ᾱ(x, y)

for all α ∈ K and all x, y ∈ X.

Two vectors x, y ∈ X are called orthogonal if their scalar product is equal to zero: (x, y) = 0 (this is sometimes denoted x ⊥ y). One can also define the length of a vector x ∈ X as ‖x‖ = √(x, x). The mapping x ↦ ‖x‖ satisfies the following properties:

(i) ‖x‖ = 0 ⇐⇒ x = 0 ;
(ii) ‖αx‖ = |α| · ‖x‖ ∀α ∈ K, ∀ x ∈ X ;
(iii) ‖x + y‖ ≤ ‖x‖ + ‖y‖ ∀ x, y ∈ X .

A linear space X endowed with such a mapping ‖·‖ is a normed space. In general a mapping from X to [0, ∞) satisfying (i), (ii), (iii) is called a norm on X. A given space may have many different norms, but the above is a special norm, being generated by a scalar product.

While (i) and (ii) are trivial, property (iii) (called triangle inequality) follows from the Bunyakovsky–Cauchy–Schwarz11 inequality

|(x, y)| ≤ ‖x‖ · ‖y‖ ∀x, y ∈ X, (1.5.8)

which holds in any space equipped with a scalar product. Indeed,

11 Viktor Y. Bunyakovsky, Russian mathematician, 1804–1889; Karl Hermann Amandus Schwarz, German mathematician, 1843–1921.


‖x + y‖² = ‖x‖² + 2 Re (x, y) + ‖y‖²
 ≤ ‖x‖² + 2|(x, y)| + ‖y‖²
 ≤ ‖x‖² + 2‖x‖ · ‖y‖ + ‖y‖²
 = (‖x‖ + ‖y‖)² ,

based on the inequality (1.5.8). To prove (1.5.8), assume y ≠ 0 (otherwise (1.5.8) is trivial) and note that for every α ∈ K

0 ≤ ‖x + αy‖² = ‖x‖² + 2 Re [ᾱ(x, y)] + |α|² ‖y‖². (1.5.9)

Now replacing in (1.5.9) α = −(x, y)/‖y‖² yields (1.5.8).
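A random numerical sanity check of (1.5.8) and the triangle inequality (an illustration, of course not a proof), in R⁵ with the usual scalar product:

```python
import math
import random

# Check |(x,y)| <= ||x|| ||y|| and ||x+y|| <= ||x|| + ||y|| for
# random vectors in R^5 (small tolerance absorbs rounding error).
random.seed(0)

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(5)]
    y = [random.uniform(-1, 1) for _ in range(5)]
    assert abs(dot(x, y)) <= norm(x) * norm(y) + 1e-12
    s = [a + b for a, b in zip(x, y)]
    assert norm(s) <= norm(x) + norm(y) + 1e-12
```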

Example 1. For a given n ∈ N, consider X = Rn , which is the set of all

ordered n-tuples (here arranged as n × 1 matrices) x = (x1 , . . . , xn )T ,

where x1 , . . . , xn ∈ R. It is easily seen that X = Rn is a linear space

over R with respect to the usual operations of addition and scalar

multiplication:

x + y = (x1 + y1, . . . , xn + yn)T, αx = (αx1, . . . , αxn)T.

The null (zero) element of X is (0, 0, . . . , 0)T, while the inverse of any x = (x1, . . . , xn)T ∈ X with respect to the addition is (−x1, . . . , −xn)T.

The usual scalar product of X = Rn is defined by

(x, y) = Σ_{i=1}^{n} xi yi ∀ x = (x1, . . . , xn)T, y = (y1, . . . , yn)T ∈ X ,

while the corresponding norm is

‖x‖ = √(x, x) = ( Σ_{i=1}^{n} xi² )^{1/2} .

If n = 1 the scalar product above is the usual product of real numbers, while the corresponding norm is the absolute value. If n = 2 or n = 3

then the above scalar product is nothing else but the scalar product

(dot product) of two vectors in the Euclidean plane or space, respec-

tively, while the corresponding norm of a vector represents its length.


In this case (1.5.8) becomes the classical Cauchy inequality. For this reason X = Rn so equipped is called Euclidean n-space.

By extension, a general normed space whose norm is generated by a

scalar (inner) product is called a generalized Euclidean space (or inner

product space, as previously mentioned).

Analogously, Cn is a linear space over C with respect to the usual

operations of addition and scalar multiplication. Here, the usual scalar

product is defined by

(x, y) = Σ_{i=1}^{n} xi \overline{yi} ∀ x = (x1, . . . , xn)T, y = (y1, . . . , yn)T ∈ Cn ,

while the corresponding norm is

‖x‖ = √(x, x) = ( Σ_{i=1}^{n} |xi|² )^{1/2} .

Any n-dimensional linear space X over K is isomorphic to Kn. Indeed, an isomorphism φ : X → Kn is the mapping which associates

with any x ∈ X the vector constructed with the coordinates of x with

respect to a basis in X. Thus any such space X can be equipped with

a scalar product as follows:

(x, y)X := (φ(x), φ(y)) ∀x, y ∈ X ,

where (·, ·) is the usual scalar product of Kn. Thus X becomes a generalized Euclidean (inner product) space.

X is a linear space with respect to the usual operations of addition

and scalar multiplication

(f + g)(t) = f (t) + g(t), (αf )(t) = αf (t) ∀ t ∈ [0, 1], ∀ f, g ∈ X, ∀ α ∈ K.

Denote by Y the subset of X consisting of all polynomial functions (i.e., functions of the form f (t) = a0 + a1 t + a2 t² + ··· + ak t^k, a0, . . . , ak ∈ K, k ∈ {0} ∪ N).

Obviously Y is a (proper) subspace of X. Note that Y is inﬁnite

dimensional (dim Y = ∞) and hence so is X. Indeed, for any k ∈ N, the system {1, t, t², . . . , t^k} is linearly independent.

We can define on Y the scalar product

(f, g) = ∫₀¹ f (t) \overline{g(t)} dt ∀f, g ∈ Y,

and the corresponding norm ‖f ‖ = √(f, f ) = ( ∫₀¹ |f (t)|² dt )^{1/2}.

Another norm on Y is the following:

‖f ‖∗ = sup_{t∈[0,1]} |f (t)| ∀f ∈ Y .

This norm is not generated by a scalar product. Indeed, if we assume that ‖ · ‖∗ is generated by a scalar product, then it must satisfy the parallelogram law

‖f + g‖∗² + ‖f − g‖∗² = 2( ‖f ‖∗² + ‖g‖∗² ) ∀f, g ∈ Y , (1.5.10)

which is valid in any inner product space. But, for example, the poly-

nomial functions f (t) = t, g(t) = 1 − t do not satisfy (1.5.10), which

conﬁrms our assertion above.
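The counterexample can be made concrete with a short Python computation (sampling the sup norm on a grid of [0, 1]):

```python
# For f(t) = t and g(t) = 1 - t the parallelogram law (1.5.10) fails
# for the sup norm: the left side is 2, the right side is 4, so
# ||.||_* comes from no scalar product.
grid = [k / 1000 for k in range(1001)]       # sample of [0, 1]

def sup_norm(h):
    return max(abs(h(t)) for t in grid)

f = lambda t: t
g = lambda t: 1 - t
lhs = sup_norm(lambda t: f(t) + g(t)) ** 2 + sup_norm(lambda t: f(t) - g(t)) ** 2
rhs = 2 * (sup_norm(f) ** 2 + sup_norm(g) ** 2)
assert (lhs, rhs) == (2, 4)                  # 2 != 4
```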

Now, let Z be the set of polynomials f ∈ Y of degree less than or

equal to n − 1, for a given natural number n. This is a ﬁnite dimen-

sional subspace of Y , with basis {1, t, t2 , . . . , tn−1 } and dimension n.

Therefore, Z is isomorphic to Kn . A natural isomorphism between Z

and Kn is the mapping which associates with any polynomial func-

tion f (t) = a0 + a1 t + a2 t2 + · · · + an−1 tn−1 the n-dimensional vector

(a0 , a1 , a2 , . . . , an−1 )T ∈ Kn . Thus, besides the scalar product above,

one can deﬁne on Z another scalar product

(f, g)Z = Σ_{i=0}^{n−1} ai \overline{bi} ,

where ai, bi ∈ K, i ∈ {0, 1, . . . , n − 1}, denote the coefficients of f and g, respectively. This scalar product generates a new norm on Z,

‖f ‖Z = √(f, f )Z = ( Σ_{i=0}^{n−1} |ai|² )^{1/2} ∀ f (t) = a0 + a1 t + ··· + a_{n−1} t^{n−1} ∈ Z.


Let S ⊂ X be a nonempty, finite or countably infinite subset, where X is an inner product space with a scalar product (·, ·) and

the norm · generated by it. We further assume that S is a linearly

independent system (otherwise we eliminate those vectors which are

linear combinations of other vectors from S). Recall that an inﬁnite

set is linearly independent if all ﬁnite subsets of it are independent.

Consider first the case when S is a finite independent system, S = {u1, . . . , un}. Starting from S one can construct an orthogonal system S′ = {v1, . . . , vn}, i.e., (vi, vi) ≠ 0 and (vi, vj) = 0 if i ≠ j.

In what follows we present the Gram–Schmidt12 orthogonalization method. To create S′ let

v1 = u1 ,
v2 = u2 + αv1 ,

where α is chosen such that (v2, v1) = 0, i.e.,

0 = (v2, v1) = (u2, v1) + α‖v1‖² ,

giving

α = − (u2, v1) / ‖v1‖² .

Note that v1 = u1 ≠ 0 (by assumption) and also v2 ≠ 0. To see that, suppose by contradiction that v2 = 0, i.e., u2 + αu1 = 0. But this is impossible since u1, u2 are independent vectors.

After having determined the first p members of S′, define

v_{p+1} = u_{p+1} + Σ_{j=1}^{p} βj vj ,

where the coefficients βk are chosen such that (v_{p+1}, vk) = 0 for k = 1, . . . , p:

0 = (u_{p+1}, vk) + βk ‖vk‖² ,

i.e.,

βk = − (u_{p+1}, vk) / ‖vk‖² , k = 1, . . . , p.

12 Jorgen Pedersen Gram, Danish actuary and mathematician, 1850–1916; Erhard Schmidt, Baltic German mathematician, 1876–1959.


Note that v_{p+1} ≠ 0, since v_{p+1} can be written as a linear combination of u_{p+1} and u1, . . . , up with suitable scalars θk, k = 1, . . . , p, namely

v_{p+1} = u_{p+1} + Σ_{k=1}^{p} θk uk ,

and u1, u2, . . . , up, u_{p+1} are independent vectors. Continue the process until finished.

Since S′ is an orthogonal system, it follows that it is independent (prove it!). S′ can be simply replaced by an orthonormal (independent) system S′′ = {w1, . . . , wn}, by defining wj = ‖vj‖⁻¹ vj, j = 1, . . . , n.

In particular, any n-dimensional inner product space possesses an or-

thonormal basis (since any basis can be replaced by an orthonormal

one).
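The procedure above transcribes directly into code; here is a Python sketch for the usual scalar product on R³ (the vectors chosen are illustrative):

```python
import math

# Gram-Schmidt: v_{p+1} = u_{p+1} + sum_j beta_j v_j with
# beta_j = -(u_{p+1}, v_j)/||v_j||^2, followed by normalization.
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(us):
    vs = []
    for u in us:
        v = list(u)
        for w in vs:
            beta = -dot(u, w) / dot(w, w)
            v = [vi + beta * wi for vi, wi in zip(v, w)]
        vs.append(v)
    # w_j = v_j / ||v_j||
    return [[vi / math.sqrt(dot(v, v)) for vi in v] for v in vs]

ws = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
for i, wi in enumerate(ws):
    for j, wj in enumerate(ws):
        target = 1.0 if i == j else 0.0
        assert abs(dot(wi, wj) - target) < 1e-12   # (w_i, w_j) = delta_ij
```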

If S is a countably infinite, independent system in X, S = {u1, u2, . . . , un, . . . }, then using the same Gram–Schmidt method, one can construct an orthonormal system S′′ = {w1, w2, . . . , wn, . . . }, i.e., (wi, wj) = δij, where δij is the Kronecker13 symbol, δii = 1 and δij = 0 for i ≠ j.

Let X be a linear space over K. A function f : X → K is said to be a linear form on X if

f (αx + βy) = αf (x) + βf (y) ∀α, β ∈ K, ∀x, y ∈ X.

The set X ∗ of all linear forms on X is a linear space with respect to the usual operations on functions and is called the dual of X.

If X is ﬁnite dimensional, with a basis B = {u1 , . . . , un }, n ∈ N, then

any linear form f has a specific expression:

f (x) = Σ_{i=1}^{n} ai αi ∀x = Σ_{i=1}^{n} αi ui ∈ X ,

where ai = f (ui ) are called coeﬃcients of the linear form with respect

to the basis B. X ∗ is isomorphic to Kn (hence dim X ∗ = n), since the

mapping associating each f ∈ X ∗ to the vector (f (u1 ), . . . , f (un ))T ∈

Kn is an isomorphism (prove it!).

A function a : X ×X → K which is linear with respect to each variable

is called a bilinear form on X (more precisely, a(·, y) is a linear form

for all y ∈ X, and a(x, ·) is also a linear form for all x ∈ X).

13

Leopold Kronecker, German mathematician, 1823–1891.


If K = C and, for each x ∈ X, linearity holds for the complex conjugate function \overline{a(x, ·)} (instead of a(x, ·) itself), then a is said to be a sesquilinear form on X. For example, a scalar product on X is a sesquilinear form.

If X is finite dimensional, with a basis B = {u1, . . . , un}, and a is a bilinear form on X, then for all x = Σ_{i=1}^{n} αi ui, y = Σ_{j=1}^{n} βj uj ∈ X we have

a(x, y) = Σ_{i,j=1}^{n} cij αi βj , (1.5.11)

where cij = a(ui, uj) are the entries of the matrix C = (cij) (which depends on the basis of X). If a is a sesquilinear form, then instead of (1.5.11) we have

a(x, y) = Σ_{i,j=1}^{n} cij αi \overline{βj} .

A bilinear form a is called symmetric if a(x, y) = a(y, x) for all x, y ∈ X. If X is finite dimensional, then the symmetry of

a bilinear form a is expressed by the symmetry of the matrix associated

with a (the symmetry of that matrix being independent of the basis

of space X). Any symmetric bilinear form a deﬁnes a quadratic form

F : X → R by setting F (x) = a(x, x). Given a quadratic form F , one

can recover the corresponding bilinear form a. Indeed, from

a(x + y, x + y) = a(x, x) + 2a(x, y) + a(y, y)

we deduce

a(x, y) = (1/2)[a(x + y, x + y) − a(x, x) − a(y, y)]
        = (1/2)[F (x + y) − F (x) − F (y)] .

F is called positive definite (positive semidefinite) if F (x) > 0 for all x ∈ X, x ≠ 0 (F (x) ≥ 0 for all x ∈ X, respectively).

F is called negative deﬁnite (negative semideﬁnite) if −F is positive

deﬁnite (positive semideﬁnite, respectively).

If F is a positive deﬁnite quadratic form on the real linear space X

then the corresponding a is a scalar product on X.


If X is an n-dimensional real linear space and F is a quadratic form on X, then

F (x) = a(x, x) = Σ_{i,j=1}^{n} cij αi αj ,

where α1, . . . , αn are the coordinates of x with respect to a given basis (in particular, the components of x if X = Rn with its usual basis). It can

be shown (using the well-known Gauss14 method), that for any such quadratic form F , there is a convenient basis of X such that F can be written as follows:

F (x) = Σ_{i=1}^{n} λi γi² ,

where γ1, . . . , γn are the coordinates of x with respect to that basis, and λ1, . . . , λn ∈ R (some of these λ's could be zero). In fact, starting from the new basis, one can simply define another basis, such that F can be written under the following canonical form:

F (x) = Σ_{i=1}^{n} εi γi² , εj ∈ {−1, 0, +1}, j = 1, . . . , n ,

where the γi now denote the coordinates of x with respect to this last basis.

Obviously, F is positive deﬁnite (positive semideﬁnite) if and only if

εj = 1, j = 1, . . . , n (εj ∈ {0 , +1}, j = 1, . . . , n, respectively).

Let us also recall that for a quadratic form F : X → R (X being an n-

dimensional real linear space), whose matrix C with respect to a basis

of X has nonzero NW principal minors (i.e., Δi ≠ 0, i = 1, . . . , n) there

always exists a decomposition (called Jacobi’s formula15 ) as follows:

F (x) = Σ_{i=1}^{n} (Δ_{i−1} / Δi) βi² ,

where Δ0 := 1 and β1, . . . , βn are the coordinates of x with respect to

a new basis of X. Therefore, F is positive deﬁnite (negative deﬁnite)

if and only if Δi > 0, i = 1, . . . , n (respectively, (−1)i Δi > 0, i =

1, . . . , n). These are known as Sylvester’s conditions.16
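Sylvester's conditions are easy to test mechanically; here is a small Python sketch (the matrix C below is an illustrative choice, not from the text):

```python
# Sylvester's conditions on R^3: F is positive definite iff all
# NW principal minors Delta_i of its matrix C are positive.
def det(m):
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def nw_minors(C):
    # Delta_i = determinant of the upper-left i x i block
    return [det([row[:i] for row in C[:i]]) for i in range(1, len(C) + 1)]

C = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]    # symmetric matrix of a form F
minors = nw_minors(C)
assert minors == [2, 3, 4]
assert all(d > 0 for d in minors)            # hence F is positive definite
```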

14 Carl Friedrich Gauss, German mathematician and physicist, 1777–1855.
15 Carl Gustav Jacob Jacobi, German mathematician, 1804–1851.
16 James Joseph Sylvester, English mathematician, 1814–1897.


If X is a complex linear space and a : X × X → C is a sesquilinear form, then a is called Hermitian17 if a(x, y) = \overline{a(y, x)} for all x, y ∈ X. Such a form a defines a quadratic form F (x) = a(x, x), x ∈ X, with values in R. If X is an n-dimensional complex linear space, then one

can find a basis in X such that F takes the form

F (x) = Σ_{i=1}^{n} λi \overline{βi} βi = Σ_{i=1}^{n} λi |βi|² ,

where λ1, . . . , λn ∈ R and β1, . . . , βn are the coordinates of x with respect to that basis. The Jacobi formula also works in this

complex case, and Sylvester’s conditions remain valid.

We close this chapter by inviting the reader to consult other books to

ﬁnd more information on the topics addressed in this chapter, such as

[6, 16, 28, 33, 37, 41, 42, 51].

1.6 Exercises

1. Let A, B, C be some arbitrary subsets of a universe U . Show

that

(a) A \ (B ∪ C) = (A \ B) ∩ (A \ C) = (A \ B) \ C ;

(b) A \ (B ∩ C) = (A \ B) ∪ (A \ C) ;

(c) (A ∩ B) \ C = A ∩ (B \ C) = (A \ C) ∩ B ;

(d) (A ∪ B) \ C = (A \ C) ∪ (B \ C) ;

2. Given sets A, B, C ⊂ U , determine the sets X ⊂ U satisfying
A ∩ X = B and A ∪ X = C ,
and then the sets X ⊂ U satisfying
A \ X = B and X \ A = C .

3. Show that if A ∩ C = B ∩ C and A ∪ C = B ∪ C, then A = B.

17 Charles Hermite, French mathematician, 1822–1901.


4. Let A, B, C, D be arbitrary sets. Which of the following statements are true?

(a) (A ∩ B) × (C ∩ D) = (A × C) ∩ (B × D);

(b) (A ∪ B) × (C ∪ D) = (A × C) ∪ (B × D);

(c) (A \ B) × C = (A × C) \ (B × C),

5. Show that if a partially ordered set A has a minimum a = min A, then a is the unique minimal element of A.

6. Let A = { an ; an = 1/(1 · 2) + 1/(2 · 3) + ··· + 1/(n(n + 1)), n ∈ N }. Find inf A and sup A.

7. Define the relation ≼ on C as follows:
z1 = x1 + y1 i ≼ z2 = x2 + y2 i ⇐⇒ x1 ≤ x2 and y1 ≤ y2 .
Show that
(a) ≼ is a partial order on C;
(b) for each a ≥ 0, ≼ is a total order on Xa = {z = x + yi ∈ C; y = ax} (i.e., Xa is a chain);
(c) there exists a partial order on C such that, for each a < 0, Xa defined as above is a chain of C with respect to this partial order.

8. Consider the sequence (an) defined by
a1 = √2, an = √(2 + an−1), n ≥ 2 .
Show that (an) is convergent and find its limit.

9. Let a ∈ R and let (an) be a sequence in R such that any subsequence of it has a convergent subsequence whose limit is a. Show that an → a.

10. If {v1, v2, v3} is a linearly independent system in a linear space, then show that {v1 + v2, v2 + v3, v3 + v1} is linearly independent too.


11. Show that each of the following systems of functions in X is linearly independent.

12. (a) Let X be the linear space of all continuous functions f : [0, 1] → R. Consider on X the scalar product
(f, g) = ∫₀¹ f (t) · g(t) dt ∀f, g ∈ X .
Which of the following systems of functions are linearly independent?

(i) f1 (t) = 1, f2 (t) = t, f3 (t) = t2 ;

(ii) f1 (t) = 1 − t, f2 (t) = t(1 − t), f3 (t) = 1 − t2 ;

(iii) f1 (t) = 1, f2 (t) = et , f3 (t) = 2e−t ;

(iv) f1 (t) = 3t, f2 (t) = t + 5, f3 (t) = −2t2 , f4 (t) = (t + 1)2 ;

(v) f1 (t) = (t + 1)2 , f2 (t) = t2 − 1, f3 (t) = 2t2 + 2t − 3;

(vi) f1 (t) = 1, f2 (t) = 1 + t, f3 (t) = 1 + t + t2 , . . . , fk (t) =

1+t+t2 +· · ·+tk−1 , where k is a given natural number.

(b) Let Y be the vector subspace of X generated by B =

{f1 , f2 , f3 }, where f1 (t) = 1, f2 (t) = t, f3 (t) = t2 for

t ∈ [0, 1]. By using the Gram–Schmidt method, construct

an orthonormal basis in Y with respect to the above scalar

product.

13. Show that a given system B of four polynomials forms a basis of the real vector space X of all polynomials of degree ≤ 3

with real coeﬃcients, and ﬁnd the coordinates of a polynomial

p = p(t) ∈ X with respect to this basis.

14. Let X be a linear space equipped with a scalar product (·, ·).

Show that a system S = {x1 , x2 , . . . , xk } ⊂ X is linearly inde-

pendent if and only if the following determinant (called the Gram determinant) is different from zero:
det ( (xi , xj) )_{1≤i,j≤k} ≠ 0.


15. Consider on R³ the scalar product
(x, y) = (1/2) Σ_{i=1}^{3} xi yi − (1/4)(x1 y2 + x1 y3 + x2 y3)

with respect to this scalar product.

16. Let X be the real vector space of polynomials of degree ≤ m

with real coeﬃcients, where m is a given natural number. Find

the expression of the linear form f : X → R deﬁned by

f (p) = ∫₀¹ p(t) dt ∀p(t) = a0 + a1 t + a2 t² + ··· + am t^m ∈ X

with respect to each of the bases

B = {1, t, t², . . . , t^m}, B′ = { 1, 1 + t, 1 + t + t², . . . , 1 + t + t² + ··· + t^m }.

17. A bilinear form a : X × X → R, where X is a real linear space, is said to be antisymmetric if

a(x, y) = −a(y, x) ∀x, y ∈ X .

Show that

(i) a bilinear form a : X × X → R is antisymmetric ⇐⇒

a(x, x) = 0 ∀x ∈ X.

(ii) any bilinear form on X is the sum of a symmetric bilinear

form and an antisymmetric one.

18. Let A be an n × n matrix with real entries, and let B = aIn +

AT A, where AT denotes the transpose of A, In denotes the n × n

identity matrix, and a > 0. Show that the quadratic form F :

Rn → R whose matrix with respect to the canonical basis of Rn

is B is positive deﬁnite. What about the case a = 0?

19. Consider the quadratic form F : R3 → R,

F (x) = x21 + x22 + 3x23 + 4x1 x2 + 2x1 x3 + 2x2 x3 ∀x ∈ R3 .

Determine a basis of R³ such that F (x) = Σ_{i=1}^{n} εi ξi² , where

ξ1 , . . . , ξn are the coordinates of x with respect to this basis, and

εj ∈ {−1, 0, +1}, j = 1, . . . , n. Check whether F is positive

deﬁnite, negative deﬁnite, or neither.


20. For a given quadratic form F : R⁴ → R, determine a basis of R⁴ such that F can be written as a sum of squares with respect to this basis.

Chapter 2

Metric Spaces

Metric spaces oﬀer a suﬃciently large framework for most of the prob-

lems we discuss in this book.

2.1 Deﬁnitions

Deﬁnition 2.1. A metric (or a distance function) on a nonempty

set X is a function d : X × X → [0, ∞) satisfying

(M 1) d(x, y) = 0 ⇐⇒ x = y ;

(M 2) d(x, y) = d(y, x) ∀x, y ∈ X ;

(M 3) d(x, y) ≤ d(x, z) + d(z, y) ∀x, y, z ∈ X .

sometimes denoted (X, d).

Any set X ≠ ∅ can be equipped with a metric. The “simplest” metric is the so-called discrete metric, which is defined by d(x, y) = 1 if x ≠ y and d(x, x) = 0 for all x ∈ X.

Note that this metric is not very useful in practice, but is suitable for

counterexamples.
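The axioms (M1)–(M3) for the discrete metric are easy to verify exhaustively on a finite sample (a Python illustration):

```python
import itertools

# The discrete metric d(x, y) = 0 if x == y else 1 satisfies (M1)-(M3).
def d(x, y):
    return 0 if x == y else 1

pts = ["a", "b", "c", "d"]
for x, y in itertools.product(pts, repeat=2):
    assert (d(x, y) == 0) == (x == y)          # (M1)
    assert d(x, y) == d(y, x)                  # (M2)
for x, y, z in itertools.product(pts, repeat=3):
    assert d(x, y) <= d(x, z) + d(z, y)        # (M3)
```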

Any normed linear space (X, ‖·‖) is a metric space with the metric

d(x, y) = ‖x − y‖ ∀x, y ∈ X. (2.1.1)

G. Moroşanu, Functional Analysis for the Applied Sciences,

Universitext, https://doi.org/10.1007/978-3-030-27153-4 2


Note also that any ﬁnite dimensional linear space can be equipped with

a norm (e.g., with the Euclidean norm—see the previous chapter), and

hence with the metric generated by that norm (cf. (2.1.1)).

Any nonempty subset Y of X is itself a metric space with respect to d restricted to Y × Y .

For x0 ∈ X and r > 0 define

B(x0 , r) := {x ∈ X; d(x, x0 ) < r} ,

which is called the open ball centered at x0 with radius r.

A set A ⊂ X is called open if for each x ∈ A there exists an ε > 0 such that B(x, ε) ⊂ A. By convention

the empty set is considered open.

The collection τ of all open subsets of (X, d) satisfies: (a) ∅, X ∈ τ ;

(b) the union of any sub-collection of τ is in τ ;

(c) the intersection of any ﬁnite sub-collection of τ is in τ.

Note that the intersection of an infinite sub-collection of τ may not be open. For example, in X = R, with d(x, y) = |x − y|, we have

for a ﬁxed x0 ∈ R

∩_{n=1}^{∞} ( x0 − 1/n, x0 + 1/n ) = {x0},

and obviously {x0 } does not belong to the (usual) topology of R deﬁned

by | · |.

Thus, any metric space (X, d) is endowed with the topology τ generated by its metric d (see above), called the metric topology. If d is defined by a norm, i.e., d(x, y) = ‖x − y‖ (x, y ∈ X), then τ is

called a norm topology.

A set V ⊂ X is called a neighborhood of a point p ∈ X if there is an r > 0 such that B(p, r) ⊂ V . In particular, any open set D is a

neighborhood of any p ∈ D.


A set C ⊂ X is said to be closed if its complement X \ C is open (i.e., X \ C ∈ τ ). In particular, for any x0 ∈ X and r > 0, we have B(x0, r) ∈ τ , and

B̄(x0, r) := {x ∈ X; d(x, x0) ≤ r}

is closed, i.e., X \ B̄(x0, r) ∈ τ (prove these assertions!).

A subset A of a metric space (X, d) is said to be bounded (with

respect to d) if it is contained in a closed ball (equivalently, in an open

ball). Otherwise, A is called unbounded (with respect to d). For

example, N ⊂ R is bounded with respect to the discrete metric on R,

but is unbounded with respect to the usual norm topology (the norm

being the absolute value function | · |).

A sequence (an )n∈N in (X, d) is said to be convergent if there exists

a ∈ X such that d(an , a) → 0. This is denoted an → a, or limn→∞ an =

a, or lim an = a, and we say that (an ) converges to a, or that a is the

limit of (an ). It is easily seen that the limit is unique.

Let S be a nonempty subset of a metric space (X, d). S is closed if

and only if the limit of any convergent sequence of points in S is also

a point of S (prove it!).

A point p ∈ (X, d) is called an accumulation point (or limit point)

of a set S ⊂ X if (V ∩ S) \ {p} ≠ ∅ for every neighborhood V of p.

Note that p is not necessarily an element of S. If q ∈ S and q is not

an accumulation point of S, then q is an isolated point of S.

Obviously, p is an accumulation point of S if and only if there exists a sequence (pn) in S \ {p} such that pn → p. By the above assertion, S is

closed if and only if S contains all its accumulation points.

Let (an)n∈N be a sequence in (X, d). A point p ∈ X is called a cluster

point of (an ) if for every ε > 0 there are inﬁnitely many an such that

d(an , p) < ε (in other words, (an ) has a subsequence converging to p).

A point p ∈ S is called an interior point of S if there is an r > 0

such that B(p, r) ⊂ S. The set of all interior points of S is called the

interior of S, and is denoted Int S.

Obviously,

• Int S is the union of all open subsets of S, and hence Int S is an open

set (possibly ∅);

• S is open if and only if S = Int S.


The closure of a set S ⊂ X, denoted Cl S, is the intersection of all closed sets containing S.

Clearly, Cl S is a closed set, and

• S is closed if and only if S = Cl S;

• Cl S = S ∪ {accumulation points of S}.

A metric space (X, d) is said to be separable if it has a countable dense subset S, i.e., Cl S = X (the closure being related to the metric topology generated by d).

For example, R is separable with respect to its usual topology (since Q

is dense in R with respect to this topology), but is not separable with

respect to the discrete topology, i.e., the topology associated with the

discrete metric on R. This is because any subset of R is closed with

respect to the discrete topology, so there is no dense countable subset

of R.

The boundary ∂S of a set S ⊂ X is defined by

∂S := Cl S ∩ Cl (X \ S).

In other words, p ∈ ∂S if and only if B(p, ε) ∩ S ≠ ∅ and B(p, ε) ∩ (X \ S) ≠ ∅ for all ε > 0.

2.2 Completeness

We start this section with the deﬁnition of a Cauchy sequence which

is essential in what follows.

A sequence (an)n∈N in (X, d) is said to be a Cauchy sequence if for all ε > 0 there exists an N = N (ε) ∈ N

such that d(an , am ) < ε for all m, n > N .

Obviously, every convergent sequence is a Cauchy sequence. The converse implication is not true in general.

A metric space (X, d) is said to be complete if every Cauchy sequence (an)n∈N in X converges (i.e., there exists a point

a ∈ X such that d(an , a) → 0).

For example, R with its usual metric is a complete metric space (as shown in the previous chapter, see Theorem 1.12). More generally, for any n ∈ N, Rn , equipped with the Euclidean metric, is complete, since a Cauchy sequence in Rn is Cauchy in each coordinate. In fact, we will see later

that Rn endowed with any norm is complete.

On the other hand, the metric space (Q, d), where d(x, y) = |x − y| (x, y ∈ Q), is not complete. For example, the sequence in Q defined by

a1 = 2, a_{n+1} = (1/2)( an + 2/an ), n = 1, 2, . . .

is convergent in (R, |·|) (since an ≥ √2 and a_{n+1}/an ≤ 1, n = 1, 2, . . . ), hence Cauchy with respect to | · |, but its limit is √2 ∉ Q.
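This sequence can be followed in exact rational arithmetic (a Python sketch):

```python
from fractions import Fraction

# a_1 = 2, a_{n+1} = (a_n + 2/a_n)/2 stays in Q, but its limit sqrt(2)
# does not: a Cauchy sequence in (Q, |.|) with no limit in Q.
a = Fraction(2)
terms = [a]
for _ in range(5):
    a = (a + 2 / a) / 2
    terms.append(a)

assert all(t * t > 2 for t in terms)                  # a_n > sqrt(2)
assert all(s >= t for s, t in zip(terms, terms[1:]))  # nonincreasing
assert terms[-1] ** 2 - 2 < Fraction(1, 10 ** 10)     # a_n^2 -> 2
print(terms[-1], "~", float(terms[-1]))
```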

Let S be a nonempty set, and define

B(S; R) = {f : S → R; f (S) is bounded} ,

where the boundedness condition on f (S) means: ∃M > 0 such that

|f (s)| ≤ M for all s ∈ S. Obviously, X = B(S; R) is a real linear

space with respect to the usual operations (addition and scalar multi-

plication). It can be equipped with a norm · deﬁned by

f := sup |f (s)| ∀f ∈ X ,

s∈S

Moreover, it is easily seen that (X, d) is a complete metric space. The

key condition ensuring the completeness of X is the completeness of

R with respect to its usual metric.

Convergence in X = B(S; R) is called uniform convergence on S. It is stronger than pointwise convergence: if fn → f in X, then fn(s) → f(s) as n → ∞ for every s ∈ S.
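The gap between the two notions can be seen numerically; a small sketch (assuming, as an illustration of my own, S = [0, 1) and fn(x) = xⁿ):

```python
# Pointwise vs uniform convergence on S = [0, 1): f_n(x) = x**n tends to 0
# at every fixed x, but sup_{x in S} |f_n(x)| stays near 1 for every n,
# so f_n does not converge to 0 in the sup-norm of B(S; R).
def sup_norm_on_grid(f, n_points=10_000):
    # crude approximation of the sup over a finite grid in [0, 1)
    return max(abs(f(i / n_points)) for i in range(n_points))

for n in (1, 10, 100):
    fn = lambda x, n=n: x ** n
    print(n, fn(0.5), sup_norm_on_grid(fn))
# fn(0.5) shrinks to 0 while the (approximate) sup-norm stays close to 1.
```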

A normed linear space (X, ‖·‖) which is complete with respect to the metric generated by its norm (i.e., (X, d) is complete for d(x, y) = ‖x − y‖, x, y ∈ X) is called a Banach space.

Thus B(S; R) (equipped with the uniform convergence norm) is a Banach space. The subset XK = {f ∈ B(S; R) : |f(s)| ≤ K ∀s ∈ S}, where K is a given positive constant, is a complete metric space with respect to the same metric (generated by the sup-norm). Note, however, that XK is not a Banach space because it is not a linear space.

In general, if (X, d) is a complete metric space, then any nonempty

closed set Y ⊂ X is also a complete metric space with the metric d

restricted to Y × Y .

Deﬁnition 2.7. Two metric spaces (X1 , d1 ), (X2 , d2 ) are isometric if

there exists a bijection φ : X1 → X2 such that d2 (φ(x), φ(y)) = d1 (x, y)

for all x, y ∈ X1.

Any metric space can be extended (uniquely up to isometry) to a complete metric space (see [44, Chapter 2]). More precisely, we have

Theorem 2.8. For any metric space (X, d) there exists a complete metric space (X̄, d̄) such that

(j) (X, d) is isometric to a subspace (X1, d̄) of (X̄, d̄);

(jj) X1 is dense in (X̄, d̄).

(X̄, d̄) with the above properties is unique up to isometry.

Proof. One can construct an extension (completion) of (X, d) by a pro-

cedure similar to that used in the previous chapter to construct the

Cantor–Méray model for R starting from rational numbers. Specif-

ically, let E denote the set of all Cauchy sequences in (X, d). E is

nonempty as it contains constant sequences (c, c, . . . ), c ∈ X. We

deﬁne an equivalence relation in E as follows: (an ), (bn ) ∈ E are

equivalent iﬀ d(an , bn ) → 0. In other words, two sequences convergent

in (X, d) with the same limit are equivalent. It is easily seen that

the relation deﬁned above is indeed an equivalence relation. Let X̄

be the collection of all equivalence classes in E with respect to this

equivalence relation. Denote by A, B, C, . . . the classes of sequences

(an ), (bn ), (cn ), . . . Now, deﬁne d¯ : X̄ × X̄ → [0, ∞) by

    d̄(A, B) = lim_{n→∞} d(an, bn)   ∀A, B ∈ X̄.   (2.2.2)

¹ Felix Hausdorff, German mathematician, 1868–1942.


The limit in (2.2.2) exists since (d(an, bn))n∈N is a Cauchy sequence in R. Note also that it does not depend on the choice of representatives, as the following inequality shows:

    |d(an, bn) − d(a′n, b′n)| ≤ d(an, a′n) + d(bn, b′n) → 0.

Thus d¯ is well deﬁned. It is easy to check that d¯ is a metric.

Now, let ψ : X → X̄ be the mapping which associates with every

a ∈ X the class A of the constant sequence (a, a, . . . ): ψ(a) = A.

Obviously, ψ is injective, so if we denote X1 = ψ(X), then ψ is a

bijection between X and X1 . Moreover, for any A, B ∈ X1 we have

    d̄(ψ(a), ψ(b)) = d̄(A, B) = lim_{n→∞} d(a, b) = d(a, b).

Hence (X, d) and (X1, d̄) are isometric, i.e., (j) holds true.

Let us now prove (jj). To this purpose, let A be an arbitrary element

of X̄ and let (an ) be a representative of A. For each k ∈ N denote

by Ak the class of the constant sequence (ak , ak , . . . ). Since (an ) is a

Cauchy sequence in (X, d), we can write

∀ε > 0, ∃N ∈ N : d(ak+p , ak ) < ε ∀k > N, p ∈ N.

Therefore,

    d̄(A, Ak) = lim_{m→∞} d(am, ak) ≤ ε   ∀k > N,

so Ak → A in (X̄, d̄); hence X1 is dense in X̄, i.e., (jj) holds.

In order to prove that (X̄, d̄) is complete, let (An) be a Cauchy sequence in (X̄, d̄). For each class Ak there is a class Bk ∈ X1 such that d̄(Ak, Bk) < 1/k (see (jj)). Notice that Bk is the class of some constant sequence (bk, bk, . . . ) with bk ∈ X. We can show that (bk) is a Cauchy sequence in (X, d):

    d(bk, bm) = d̄(Bk, Bm)
              ≤ d̄(Bk, Ak) + d̄(Ak, Am) + d̄(Am, Bm)
              ≤ 1/k + 1/m + d̄(Ak, Am) → 0,

as k, m → ∞, so the class B of the sequence (bk ) belongs to X̄. We

claim that B is the limit of (Ak) with respect to d̄. Indeed, given ε > 0,

    d̄(B, Ak) ≤ d̄(B, Bk) + d̄(Bk, Ak) = lim_{m→∞} d(bm, bk) + 1/k < ε

for all k sufficiently large. Hence (X̄, d̄) is complete.

Finally, we need to show that any two complete metric spaces (X̄, d̄) and (X̂, d̂) satisfying (j) and (jj) are isometric. Let X1 ⊂ X̄ and X2 ⊂ X̂ be such that each of these spaces is isometric to (X, d). Let g : (X, d) → (X1, d̄) and h : (X, d) → (X2, d̂) be the corresponding isometries. Then (X1, d̄) and (X2, d̂) are isometric, and θ = h ∘ g⁻¹ is an isometry between these spaces.

Let A be an arbitrary element of X̄. By (jj) there exists a sequence (An) in X1 such that d̄(An, A) → 0. Obviously, Bn = θ(An) ∈ X2 and (Bn) is a Cauchy sequence in (X̂, d̂), so it is convergent since (X̂, d̂) is complete. Let B ∈ X̂ be its limit: d̂(Bn, B) → 0. Denote by θ̃ the mapping that takes A to B. Note that B does not depend on the choice of (An), so it is unique for each A, i.e., θ̃ is well defined. In fact, θ̃ is an extension of θ to the whole X̄. It is easily seen that θ̃ is a bijection between X̄ and X̂.

It remains to prove that θ̃ is an isometry. Let A, A′ ∈ X̄ and let (An), (A′n) be sequences in X1 which converge, respectively, to A and A′ with respect to d̄. Let B, B′ be the limits of Bn = θ(An) and B′n = θ(A′n) in (X̂, d̂). By letting n tend to infinity in the equation

    d̂(Bn, B′n) = d̄(An, A′n),

we obtain

    d̂(θ̃(A), θ̃(A′)) = d̂(B, B′) = d̄(A, A′),

by using the inequality

    |d̄(An, A′n) − d̄(A, A′)| ≤ d̄(An, A) + d̄(A′n, A′),

and a similar one for d̂. Therefore (X̄, d̄) and (X̂, d̂) are indeed isometric. □

Suppose the metric space (X, d) is a subspace of a complete metric space (Z, d). Then (Cl X, d) is also a complete metric space, where Cl X is the closure of X in (Z, d), also denoted X̄. Clearly, (Cl X, d) plays the role of (X̄, d̄) in Theorem 2.8, so (Cl X, d) can be regarded as the completion of X with respect to d.

To illustrate this case, consider X = (0, 1] and Z = R with d(x, y) =

|x − y|. Then, Cl X = [0, 1] and so ([0, 1], d) is the

completion of ((0, 1], d) (which is not itself complete). Further ex-

amples will be discussed later, including examples involving function

spaces.


This provides the completion whenever a complete space containing X is a priori known.

The following result (Theorem 2.10 below) is needed to prove three important principles of Functional Analysis: the Uniform Boundedness Principle, the Open Mapping Theorem, and the Closed Graph Theorem (see Theorems 4.7, 4.8, and 4.10).

Theorem 2.10 (Baire). Let (X, d) be a complete metric space and let

Xn ⊂ X, n ∈ N, be closed sets satisfying

Int Xn = ∅ ∀n ∈ N . (2.2.3)

Then,

    Int ( ⋃_{n=1}^{∞} Xn ) = ∅.   (2.2.4)

Proof. Set Dn := X \ Xn, n ∈ N. Since each Xn is closed with empty interior, each Dn is open and dense in X; indeed, for a closed set F,

    Cl (X \ F) = X ⟺ Int F = ∅.

We have to show that (2.2.4) holds or, equivalently, that M := ⋂_{n=1}^{∞} Dn is dense in X, i.e., for every nonempty open set D ⊂ X we have D ∩ M ≠ ∅. Fix

such an open set D and choose some x0 ∈ D and r0 > 0 such that

the closed ball B(x0 , r0 ) ⊂ D. Since D1 is open and dense in X there

exist x1 ∈ B(x0 , r0 ) ∩ D1 and r1 > 0 such that

    B(x1, r1) ⊂ B(x0, r0) ∩ D1,   0 < r1 < r0/2.

By induction one can ﬁnd sequences (xn ) and (rn ) such that

    B(xn+1, rn+1) ⊂ B(xn, rn) ∩ Dn+1,   0 < rn+1 < rn/2,

for n = 0, 1, 2, . . . It is easily seen that (xn) is Cauchy, hence convergent (since (X, d) is complete). If a denotes its limit, then a ∈ B(xn, rn) for all n, so a ∈ D ∩ M; hence D ∩ M ≠ ∅, as claimed. □

² René-Louis Baire, French mathematician, 1874–1932.


2.3 Compact Sets

Let A be a subset of a metric space (X, d). A cover of A is a collection

of sets {Di }i∈I whose union contains A:

    A ⊂ ⋃_{i∈I} Di,

where I is a ﬁnite or inﬁnite index set. If all Di are open sets then

{Di }i∈I is called an open cover.

The set A is called compact if every open cover of A has a finite subcover.

Theorem 2.12. A subset A of a metric space (X, d) is compact if and only if every sequence in A has a subsequence that converges to a point of A (in other words, A is sequentially compact).

Proof. Step 1: If A is compact, then A is closed.
We need to show that X \ A is open. Let x ∈ X \ A. If y ∈ A we have d(y, x) > 0, and so y belongs to Dn := {z ∈ X; d(z, x) > 1/n} for some n ∈ N. Thus {Dn}n∈N is an open cover of A and, since A is compact, there is a finite subcover of A. In fact, since Dn ⊂ Dn+1, this subcover can be reduced to one set DN with N large. By construction B(x, 1/N) ⊂ X \ A, and hence X \ A is open; therefore A is closed, as claimed.

Step 2: If A is compact and B ⊂ A is closed, then B is compact.
If {Di}i∈I is an open cover of B, then {Di}i∈I ∪ {X \ B} is an open

cover of A. Since A is compact, we can extract a ﬁnite subcover of A,

say {Di1 , Di2 , . . . , Dim , X \ B}. Thus {Di1 , Di2 , . . . , Dim } is a ﬁnite

subcover of B extracted from {Di }i∈I .

Step 3: If A is compact, then A is sequentially compact.
Assume, by contradiction, that there is a sequence (xn) in A that

has no convergent subsequence. So (xn ) has inﬁnitely many distinct

points y1 = xn1, y2 = xn2, . . . such that for each ym there is an open ball B(ym, rm) containing no point of the sequence (yi) other than ym. The set C = {y1, y2, . . . } is closed since all its

points are isolated. By Step 2, C is compact. On the other hand,

{B(ym, rm)}m∈N is an open cover of C which has no finite subcover, a contradiction. Hence (xn) must have a convergent subsequence. Its limit belongs to A, since A is closed (see Step 1).

Step 4: If A is sequentially compact, then for every open cover {Di }i∈I

of A, there exists an r > 0 such that ∀y ∈ A, B(y, r) is contained in

some Di .

Assume to the contrary that this is not the case. Thus there exists

an open cover {Di } of A such that ∀n ∈ N there is some yn ∈ A

so that B(yn , n1 ) is not contained in any Di . By hypothesis (yn ) has

a subsequence (z1 = yn1 , z2 = yn2 , . . . ) converging to some z ∈ A.

Obviously, z belongs to some Di0 and since Di0 is open and zn → z,

we can choose some large N such that B(zN , N1 ) ⊂ Di0 , which is a

contradiction.

Step 5: A being sequentially compact implies that for all ε > 0 there is

a ﬁnite number of open balls of radius ε covering A (i.e., A is totally

bounded).

We need to analyze the case when A is not ﬁnite, otherwise the conclu-

sion is obvious. Assume that A is not totally bounded, i.e., for some

ε > 0 we cannot cover A with ﬁnitely many open balls of radius ε.

Choose y1 ∈ A and y2 ∈ A \ B(y1, ε). By the same assumption there exists a point y3 ∈ A \ (B(y1, ε) ∪ B(y2, ε)). Repeating this process we

obtain a sequence

    yn ∈ A \ ⋃_{i=1}^{n−1} B(yi, ε),

so that d(yn, ym) ≥ ε whenever n ≠ m. In other words, (yn) has no Cauchy subsequence and hence has no convergent subsequence, thus contradicting sequential compactness.

Step 6: If A is sequentially compact, then A is compact.
Let {Di} be an open cover of A. Associate with this cover a positive r

given by Step 4. By Step 5 (see also its proof) there is a ﬁnite number

of points, say y1 , y2 , . . . , yp ∈ A, such that

    A ⊂ ⋃_{j=1}^{p} B(yj, r).

By Step 4, each ball B(yj, r) is contained in some Dij. Hence {Di1, Di2, . . . , Dip} is a finite (open) subcover of A. □


A set A ⊂ (X, d) is called relatively compact if Cl A is compact.

Corollary 2.14. A subset A of a metric space (X, d) is relatively compact if and only if every sequence in A has a convergent subsequence (its limit being a point of Cl A).

Proof. Necessity follows from Theorem 2.12. Conversely, assume that every sequence in A has a convergent subsequence. Then Cl A is sequentially compact

(hence compact) in (X, d). Indeed, if (xn ) is a sequence in Cl A, then

there exists a sequence (yn ) in A such that d(xn , yn ) < 1/n for all

n ∈ N. As (yn ) has a convergent subsequence (ynk ), it follows that

(xnk ) is also convergent. So the statement of the corollary holds true

by Theorem 2.12.

Theorem 2.15. Every bounded sequence in Rk, k ∈ N, endowed with the Euclidean norm has a convergent subsequence.

Proof. This theorem is known for k = 1 (see Theorem 1.11) and ex-

tends easily to Rk : a bounded sequence in Rk is bounded in each

coordinate.

From the proof of Theorem 2.12 we see that every compact set in a

metric space is closed and bounded. The converse implication is not

true in general. However, we have the following result attributed to

Heine and Borel.3

Theorem (Heine–Borel). Let A be a subset of Rk equipped with the usual Euclidean metric. Then A is compact if and only if A is closed and bounded (with respect to the same metric).

Proof. If A is compact, then it is closed and bounded, as observed above. Conversely, assume that A is closed and bounded. Then

any sequence in A is bounded so it has a convergent subsequence (cf.

Theorem 2.15). Its limit belongs to A because A is closed. Thus A is

sequentially compact, hence compact by Theorem 2.12.

The Heine–Borel Theorem is valid in any finite dimensional space with the Euclidean metric but may not be true for other

³ Heinrich Eduard Heine, German mathematician, 1821–1881; Émile Borel, French mathematician, 1871–1956.


metrics. For example, consider X = R equipped with the discrete metric

    d0(x, y) = 0 if x = y,   d0(x, y) = 1 if x ≠ y.

A = N is closed with respect to d0 , but it is not compact because the

open cover {B(n, 1/2)}n∈N has no ﬁnite subcover.

A collection of subsets of (X, d) is said to have the ﬁnite intersection

property if the intersection of every ﬁnite sub-collection of the family

is nonempty.

Theorem. If a collection of compact subsets of (X, d), say {Ki}i∈I, has the finite intersection property, then ⋂_{i∈I} Ki ≠ ∅.

Proof. The conclusion is obvious when I is finite, so assume I is infinite. Suppose, to the contrary, that ⋂_{i∈I} Ki = ∅. Fixing some i0 ∈ I and setting I1 := I \ {i0}, we have

    X = ⋃_{i∈I} (X \ Ki) = (X \ Ki0) ∪ ⋃_{i∈I1} (X \ Ki),   (2.3.5)

and it follows that

    Ki0 ⊂ ⋃_{i∈I1} (X \ Ki).

As Ki0 is compact and {X \ Ki }i∈I1 is an open cover of Ki0 , there is

a ﬁnite set J ⊂ I1 such that

Ki0 ⊂ ∪i∈J (X \ Ki ).

Setting J1 := J ∪ {i0}, we get X = ⋃_{i∈J1} (X \ Ki), i.e.,

    ∅ = ⋂_{i∈J1} Ki,

which contradicts the finite intersection property. □


2.4 Continuous Functions on Compact Sets

Let (X, d) and (X1, d1) be two metric spaces. A function

f : D ⊂ (X, d) → (X1 , d1 )

is said to be continuous at a point x0 ∈ D if for every neighborhood V ⊂ (X1, d1) of f(x0) there exists a neighborhood U ⊂ (X, d) of x0 such that f(U ∩ D) ⊂ V, or equivalently,

    ∀ε > 0, ∃δ > 0 : x ∈ D, d(x, x0) < δ ⟹ d1(f(x), f(x0)) < ε.   (2.4.6)

In general U and δ depend on ε and x0. The continuity of f at x0 ∈ D can also be expressed equivalently by using sequences: xn ∈ D, xn → x0 implies f(xn) → f(x0).

If f is continuous at every point of D, it is said to be continuous on D (or simply continuous). The function f is called uniformly

continuous on D if δ can be the same for all x0 ∈ D, i.e., δ is

independent of x0 ∈ D (it depends only on ε).

Theorem 2.19. If D is a compact subset of (X, d) and f : D → (X1, d1) is continuous (on D), then the following hold:

• f(D) is compact in (X1, d1);

• f is uniformly continuous on D;

• the set C(D; X1) of continuous functions from D to (X1, d1) is a metric space with respect to the metric d̃(f, g) = sup_{x∈D} d1(f(x), g(x)).

If in addition (X1, d1) is complete, then (C(D; X1), d̃) is also complete.

Theorem 2.20. If D ⊂ (X, d) is a nonempty compact set and f : D → R (R being equipped with the usual metric), then f(D) is closed and bounded, and there exist x0, y0 ∈ D such that f(x0) = inf f(D) and f(y0) = sup f(D).


Proof. The ﬁrst part of the theorem follows from Theorem 2.19 which

in particular says that f (D) is compact (in R), hence closed and

bounded. So the inﬁmum and supremum of f (D), denoted m and

M , are ﬁnite numbers. Now, for all n ∈ N there exists an xn ∈ D such

that

    m ≤ f(xn) < m + 1/n.   (2.4.7)

As D is a compact set, (xn ) has a subsequence which converges to

some x0 ∈ D. This fact combined with (2.4.7) implies m = f (x0 ).

Similarly, there is a point y0 ∈ D such that M = f (y0 ).

Let X be a linear space over K (either R or C). Two norms on X, say ‖·‖ and ‖·‖∗, are said to be equivalent if there exist two positive constants C1, C2 such that

    C1 ‖x‖ ≤ ‖x‖∗ ≤ C2 ‖x‖   ∀x ∈ X.   (2.4.8)

Obviously, two equivalent norms on X generate the same topology

on X.

If X is a k-dimensional linear space, k ∈ N, with a basis B = {u1 , . . . ,

uk }, then X can be equipped with diﬀerent norms, such as

    ‖x‖max = max_{1≤i≤k} |αi|,

    ‖x‖p = ( Σ_{i=1}^{k} |αi|^p )^{1/p},   p ∈ [1, ∞),

for all x = Σ_{i=1}^{k} αi ui ∈ X. Note that ‖·‖2 is precisely the Euclidean norm of X introduced before.
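These norms are easy to compute directly; a small sketch (my own helper names, with coordinates taken in the standard basis of R³) also checks the elementary comparisons ‖x‖max ≤ ‖x‖2 ≤ ‖x‖1 ≤ k ‖x‖max, a first instance of equivalence of norms:

```python
# Compare ||.||_max, ||.||_2 and ||.||_1 on R^k (standard basis) and
# verify the elementary inequalities max <= 2-norm <= 1-norm <= k * max.
def norm_max(x):
    return max(abs(c) for c in x)

def norm_p(x, p):
    return sum(abs(c) ** p for c in x) ** (1 / p)

x = [3.0, -4.0, 1.0]
nmax, n2, n1 = norm_max(x), norm_p(x, 2), norm_p(x, 1)
assert nmax <= n2 <= n1 <= len(x) * nmax
print(nmax, n2, n1)  # 4.0, sqrt(26) ~ 5.099, 8.0
```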

Theorem 2.21. If X is a k-dimensional linear space, k ∈ N, then

any two norms on X are equivalent.

Proof. It is enough to show that any norm ‖·‖ on X is equivalent to the Euclidean norm ‖·‖2. On the one hand, for any x = Σ_{i=1}^{k} αi ui ∈ X, we have

    ‖x‖ ≤ Σ_{i=1}^{k} |αi| · ‖ui‖
        ≤ max_{1≤i≤k} ‖ui‖ · Σ_{i=1}^{k} |αi|
        ≤ √k · max_{1≤i≤k} ‖ui‖ · ‖x‖2,   (2.4.9)

where the last estimate follows from the Cauchy–Schwarz inequality. Denoting C := √k · max_{1≤i≤k} ‖ui‖, we can derive from (2.4.9)

    ‖x‖ ≤ C ‖x‖2   ∀x ∈ X.   (2.4.10)

In order to get the other inequality we use Theorem 2.20. Observe that ‖·‖ is a continuous function on (X, ‖·‖2), since |‖x‖ − ‖y‖| ≤ ‖x − y‖ ≤ C ‖x − y‖2 for all x, y ∈ X. Hence ‖·‖ attains its infimum m on the unit sphere S2(0, 1) = {x ∈ X; ‖x‖2 = 1} (which is compact in (X, ‖·‖2)), i.e.,

    m = ‖x̃‖ for some x̃ ∈ S2(0, 1), with m > 0 since x̃ ≠ 0.   (2.4.11)

From (2.4.11) we easily derive

    m ‖x‖2 ≤ ‖x‖   ∀x ∈ X,

as claimed. □

On an infinite dimensional linear space, however, one can define norms which are not equivalent. For instance, let us consider the following two norms on the real linear space X = C[a, b] := C([a, b]; R), −∞ < a < b < +∞:

    ‖f‖ = sup{|f(t)|; a ≤ t ≤ b},   ‖f‖1 = ∫_a^b |f(t)| dt.

We have

    ‖f‖1 ≤ (b − a) ‖f‖   ∀f ∈ X,

i.e., the sup-norm ‖·‖ is stronger than ‖·‖1. But the two norms are

not equivalent. Indeed, let (fn) be the sequence in X defined by

    fn(t) = 0 for a ≤ t ≤ b − 1/n,   fn(t) = nt + 1 − nb for b − 1/n < t ≤ b.

Then ‖fn‖ = 1 for all n, while

    ‖fn‖1 = ∫_{b−1/n}^{b} |nt + 1 − nb| dt = 1/(2n),

so there does not exist C such that ‖fn‖ ≤ C ‖fn‖1, because ‖fn‖1 → 0 as n → ∞.
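The two norms of fn can be compared numerically; a minimal sketch (taking [a, b] = [0, 1] for concreteness, with a simple midpoint quadrature of my own):

```python
# On [a, b] = [0, 1]: f_n has sup-norm 1 but L1-norm 1/(2n), so the
# sup-norm and the integral norm on C[0, 1] are not equivalent.
def f(n, t):
    return 0.0 if t <= 1 - 1 / n else n * t + 1 - n

def l1_norm(n, steps=200_000):
    # midpoint-rule approximation of the integral of |f_n| over [0, 1]
    h = 1 / steps
    return sum(abs(f(n, (i + 0.5) * h)) * h for i in range(steps))

for n in (1, 10, 100):
    print(n, f(n, 1.0), l1_norm(n), 1 / (2 * n))
# The sup value stays 1 while the L1-norm shrinks like 1/(2n).
```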


Remark 2.23. It follows from Theorem 2.21 that any norm on a ﬁnite

dimensional linear space generates the same topology as that deﬁned

by the Euclidean norm, so any topological result involving the Eu-

clidean norm is also valid with respect to any other norm. In particu-

lar, the Heine–Borel Theorem is valid in any ﬁnite dimensional linear

space equipped with any norm. Throughout the rest of this book,

Rk and any other ﬁnite dimensional linear space is always considered

as a normed space, equipped with the norm topology (generated by

any convenient norm), unless otherwise speciﬁed. The next result is a

characterization (due to Riesz4 ) of the ﬁnite dimensionality of normed

spaces clarifying the Heine–Borel Theorem.

Theorem 2.24 (Riesz). A normed linear space X is finite dimensional if and only if every closed bounded subset of X is compact.

Lemma 2.25 (Riesz). Let (X, ‖·‖) be a normed linear space and let X1 be a proper, closed linear subspace of X. Then there exists x0 ∈ X \ X1 such that

    ‖x0‖ = 1,   ‖x − x0‖ ≥ 1/2   ∀x ∈ X1.

Proof. Choose x1 ∈ X \ X1 and let ρ = d(x1, X1) := inf{‖x1 − z‖; z ∈ X1}. We first prove ρ > 0. Suppose ρ = 0. Then there exists a sequence zn ∈ X1 such that ‖x1 − zn‖ < 1/n, hence zn → x1. As X1 is closed, this implies x1 ∈ X1, which is a contradiction.

By the definition of ρ there exists x2 ∈ X1 such that ‖x1 − x2‖ < 2ρ.

Let

    x0 = (1/‖x1 − x2‖) (x1 − x2).

⁴ Frigyes Riesz, Hungarian mathematician, 1880–1956.


Then ‖x0‖ = 1 and, for every x ∈ X1,

    ‖x0 − x‖ = (1/‖x1 − x2‖) ‖x1 − v‖
             ≥ (1/(2ρ)) ‖x1 − v‖
             ≥ ρ/(2ρ) = 1/2,

where v = x2 + ‖x1 − x2‖ x ∈ X1. □

Proof of Theorem 2.24. The necessity part follows from the Heine–

Borel Theorem extended to ﬁnite dimensional linear spaces (see Re-

mark 2.23).

To prove suﬃciency, assume by way of contradiction that X is not

ﬁnite dimensional, i.e., there exist inﬁnitely many distinct points in

X, say x1 , x2 , . . . , such that for all n ∈ N, Bn = {x1 , x2 , . . . , xn } is a

linearly independent system. Let Xn = Span Bn. Each Xn is a closed subspace of X, and Xn ⊂ Xn+1 (proper inclusion) for all n ∈ N. By Lemma 2.25, for each n ∈ N there exists yn ∈ Xn+1 \ Xn such that ‖yn‖ = 1 and

    ‖yn − x‖ ≥ 1/2   ∀x ∈ Xn.

In particular ‖yn − ym‖ ≥ 1/2 for all m, n ∈ N, m ≠ n. So (yn) has

no Cauchy subsequence, hence no convergent subsequence. On the

other hand, yn ∈ Cl B(0, 1) ∀n ∈ N, so (yn ) should have a conver-

gent subsequence (since Cl B(0, 1) is compact by assumption). This

contradiction completes the proof.

Arzelà–Ascoli Criterion⁵

Let (X, d) and (X1 , d1 ) be metric spaces and let ∅ = A ⊂ X. Denote as

usual by C(A; X1 ) the set of all continuous functions from A ⊂ (X, d)

to (X1 , d1 ).

Deﬁnition 2.26. A family of functions F ⊂ C(A; X1 ) is called

equicontinuous if for all ε > 0 and all x ∈ A there exists δ > 0

such that y ∈ A and d(x, y) < δ implies d1 (f (x), f (y)) < ε for all

f ∈ F, i.e., δ = δ(ε, x) is independent of f .

⁵ Cesare Arzelà, Italian mathematician, 1847–1912; Giulio Ascoli, Italian mathematician, 1843–1896.


If δ can be chosen independently of x (so that δ = δ(ε) depends only on ε, not on x and f), then F is uniformly equicontinuous, i.e., ∀ε > 0 there exists δ > 0 such that x, y ∈ A, d(x, y) < δ implies d1(f(x), f(y)) < ε for all f ∈ F.

If A is compact and F ⊂ C(A; X1) is equicontinuous, then F is uniformly equicontinuous (see Exercise 2.22 below).

Note also that if A is compact then C(A; X1) is a metric space with respect to the metric d̃(f, g) = sup_{x∈A} d1(f(x), g(x)); if in addition (X1, d1) is complete then (C(A; X1), d̃) is complete too, and in particular C(A; Rk), k ∈ N, is a Banach space with respect to the sup-norm.

Theorem (Arzelà–Ascoli). Let A ⊂ (X, d) be compact. Assume that F ⊂ C(A; Rk) is equicontinuous and bounded in C(A; Rk) (i.e., ∃M > 0 such that ‖f(x)‖ ≤ M, ∀x ∈ A, ∀f ∈ F). Then F is relatively compact in C(A; Rk) equipped with the sup-norm.

Proof. For any δ > 0 we have A ⊂ ∪x∈A B(x, δ) and since A is com-

pact, there exists a ﬁnite subcover, so that A ⊂ ∪pj=1 B(yj , δ). Let

Cδ = {y1 , y2 , . . . , yp } and consider C = ∪i∈N C1/i . C is dense in A and

countable so C = {x1 , x2 , . . . }.

In order to prove that F is relatively compact in C(A; Rk ) it suﬃces

to show that any sequence in F has a convergent subsequence in this

space (cf. Corollary 2.14). So let (fn )n∈N be a sequence in F. Since

F is bounded in C(A; Rk), the sequence (fn(x1))n∈N is bounded in Rk, so there

exists a subsequence of (fn ),

f11 , f12 , . . . , f1n , . . .

which is convergent at x = x1 . By the same assumption this subse-

quence has a subsequence

f21 , f22 , . . . , f2n , . . .

which is convergent at x = x2 (and at x = x1 as well). Continuing the

process we obtain successive subsequences

fm1 , fm2 , . . . , fmn , . . .

. . .

Think of it as an inﬁnite matrix and consider the diagonal sequence

(gn ) = (f11 , f22 , . . . , fnn , . . . ) which converges at any point of C. On

the other hand, as F is equicontinuous and A is compact, F is in fact


uniformly equicontinuous, i.e., for every ε > 0 there exists a δ = δ(ε) > 0 such that

    x, y ∈ A, d(x, y) < δ ⟹ ‖f(x) − f(y)‖ < ε   ∀f ∈ F.   (2.4.13)

Now, for a given ε fix a δ = 1/i ≤ δ(ε), so Cδ = C1/i is a finite set Cδ =

{y1 , . . . , yp } ⊂ C. If x ∈ A then it belongs to a ball B(yj , δ) for some

j ∈ {1, . . . , p} and we have, by (2.4.13) and the convergence of (gn (yj )),

    ‖gn(x) − gm(x)‖ ≤ ‖gn(x) − gn(yj)‖ + ‖gn(yj) − gm(yj)‖ + ‖gm(yj) − gm(x)‖ < ε + ε + ε = 3ε   ∀n, m > N(ε, j).

Therefore,

    ‖gn − gm‖ ≤ 3ε   ∀n, m > max_{j∈{1,...,p}} N(ε, j),

so (gn) is a Cauchy sequence, hence convergent in C(A; Rk), which is a Banach space. □

Notice that in the above proof we have used two essential arguments:

the completeness of the space (Rk , · ) (implying that C(A; Rk ) is

a Banach space) and the fact that the set {f(x); f ∈ F} is bounded in Rk (equivalently, relatively compact in this space) for all x ∈ A. So the following generalization holds true:

Theorem. Suppose A is a compact subset of (X, d) and (X1, d1) is a complete metric space. Assume that F ⊂ C(A; X1) is

equicontinuous and {f (x); f ∈ F} is relatively compact in (X1 , d1 ) for

all x ∈ A. Then F is relatively compact in C(A; X1 ).

In what follows we illustrate the Arzelà–Ascoli Criterion with Peano’s

Existence Theorem which is a fundamental result in the theory of

ordinary diﬀerential equations.

Let Rk be equipped with the norm ‖v‖ = max_{1≤i≤k} |vi|, let (t0, x0) ∈ R × Rk, and let a, b > 0. Let D be the set

    D = {(t, v) ∈ R × Rk; |t − t0| ≤ a, ‖v − x0‖ ≤ b} ⊂ Rk+1.

⁶ Giuseppe Peano, Italian mathematician, 1858–1932.

Theorem 2.31 (Peano). If f ∈ C(D; Rk), then there exists at least one continuously differentiable function x : [t0 − δ, t0 + δ] → Rk satisfying the equation

    x′(t) = f(t, x(t)),   t ∈ [t0 − δ, t0 + δ],   (2.4.14)

    x(t0) = x0,   (2.4.15)

where δ := min{a, b/M} and M := max{‖f(t, v)‖; (t, v) ∈ D}, assumed to be a positive number, because the case M = 0 ⟺ f ≡ 0 is trivial.

Proof. We shall use Euler’s method of polygonal lines.7

Since f ∈ C(D; Rk ) and D is compact, f is uniformly continuous, i.e.,

∀ε > 0, ∃δ1 = δ1(ε) > 0 such that

    |t − s| ≤ δ1, ‖v1 − v2‖ ≤ δ1 ⟹ ‖f(t, v1) − f(s, v2)‖ < ε.

It suffices to construct a solution on [t0, t0 + δ]; a similar argument works on the other side. However, we then have to check that the solution is differentiable at t = t0. Given

    x(t) = xr(t) for t ∈ [t0, t0 + δ],   x(t) = xl(t) for t ∈ [t0 − δ, t0],

where xr and xl denote the solutions to the right and to the left of t0, we have

    x′−(t0) = (dxl/dt)(t0) = f(t0, x0) = (dxr/dt)(t0) = x′+(t0).

Consider the uniform subdivision t0 < t1 < · · · < tn = t0 + δ of I := [t0, t0 + δ], with step hε := tj+1 − tj chosen so small that hε ≤ δ1(ε) and M hε ≤ δ1(ε). Now, for a given ε > 0, construct φε : I → Rk as

    φε(t0) = x0,   φε(t) = φε(tj) + (t − tj) f(tj, φε(tj)) for tj < t ≤ tj+1.

⁷ Leonhard Euler, Swiss mathematician, physicist, astronomer, logician, and engineer, 1707–1783.


The graph of φε is Euler's polygonal line; we shall see that for ε small it approximates the trajectory of a solution of problem (2.4.14) and (2.4.15). For k = 1 Euler's polygonal line can be visualized in the (t, x) coordinate plane.

Consider the family F = {φε ; ε > 0}. Let us ﬁrst show that φε

is well deﬁned on I for all ε > 0. On the interval [t0 , t1 ], φε (t) =

x0 + (t − t0) f(t0, x0) and

    ‖φε(t) − x0‖ ≤ M (t − t0) ≤ M δ ≤ b,   (2.4.16)

so (t, φε(t)) ∈ D for t ∈ [t0, t1]. Hence on [t1, t2], φε(t) = φε(t1) + (t − t1) f(t1, φε(t1)) is well defined and

    ‖φε(t) − x0‖ ≤ (t − t1) M + (t1 − t0) M ≤ (t − t0) M ≤ M δ ≤ b,

and so on; by induction φε is well defined on all of I.

Moreover, F is bounded in C(I; Rk). In order to apply the Arzelà–Ascoli Theorem, we also need to show that F is equicontinuous.

If t, s ∈ [tj, tj+1] then ‖φε(t) − φε(s)‖ ≤ M |t − s|. If t, s are in different intervals, say t ∈ [tp, tp+1], s ∈ [tq, tq+1] with p < q, then

    ‖φε(t) − φε(s)‖ ≤ ‖φε(s) − φε(tq)‖ + · · · + ‖φε(tp+1) − φε(t)‖
                    ≤ M (s − tq) + M (tq − tq−1) + · · · + M (tp+1 − t)
                    ≤ M (s − t) = M |t − s|,

so F is equicontinuous. Hence,

by the Arzelà–Ascoli Criterion there is a sequence εn → 0+ such that (φεn) converges in C(I; Rk) to some function φ. Passing to the limit in the bound above (see (2.4.16)),

    ‖φ(t) − x0‖ ≤ b,

so (t, φ(t)) ∈ D for all t ∈ I.

Now it simply remains to prove that x = φ(t) is a solution of problem (2.4.14) and (2.4.15). Define

    gεn(t) = φ′εn(t) − f(t, φεn(t)) if t ≠ tnj,   gεn(t) = 0 otherwise;

then for t ∈ (tnj, tnj+1) we have φ′εn(t) = f(tnj, φεn(tnj)), |t − tnj| ≤ hεn ≤ δ1(εn), and ‖φεn(t) − φεn(tnj)‖ ≤ M hεn ≤ δ1(εn), hence ‖gεn(t)‖ < εn, since f is uniformly continuous.

On the other hand, for all t ∈ I,

    ∫_{t0}^{t} gεn(s) ds = φεn(t) − x0 − ∫_{t0}^{t} f(s, φεn(s)) ds.   (2.4.17)

Since ‖gεn(t)‖ < εn, the left-hand side tends to 0 as n → ∞. Since φεn → φ uniformly on I and f is uniformly continuous on D, f(s, φεn(s)) → f(s, φ(s)) uniformly on I as n → ∞. Therefore, passing to the limit in (2.4.17), we get

    φ(t) = x0 + ∫_{t0}^{t} f(s, φ(s)) ds,   t ∈ I,

which shows that φ is a continuously differentiable solution of (2.4.14) and (2.4.15). □

Peano's Theorem says nothing about uniqueness. For example, the Cauchy problem

    x′(t) = 2 √|x(t)|,

    x(0) = 0,


with a = b = 1, D = [−1, 1] × [−1, 1], f(t, v) = 2√|v|, has the following solutions:

    x1(t) = 0,   −1 ≤ t ≤ 1;

    x2(t) = −t² for −1 ≤ t ≤ 0,   x2(t) = 0 for 0 < t ≤ 1;

    x3(t) = 0 for −1 ≤ t ≤ 0,   x3(t) = t² for 0 < t ≤ 1;

    x4(t) = −t² for −1 ≤ t ≤ 0,   x4(t) = t² for 0 < t ≤ 1.

Note that all these solutions are deﬁned on the whole interval [−1, 1],

even if the existence interval given by Peano’s Theorem is smaller:

δ = min{a, b/M } = min{1, 1/2} = 1/2. A solution which is deﬁned on

the whole initial interval [t0 − a, t0 + a] (in the case of problem (2.4.14) and (2.4.15)) is called a global solution. In particular, the above

four solutions are global solutions. In fact, there are inﬁnitely many

solutions of the above Cauchy problem (see Exercise 2.28 below).
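One can verify directly that x(t) = −t² for t ≤ 0 and x(t) = t² for t ≥ 0 (in any combination with the zero function) satisfies x′ = 2√|x| with x(0) = 0; a quick check at sample points, using the exact derivatives:

```python
import math

# Verify that x(t) = -t^2 (t <= 0), t^2 (t >= 0) solves x' = 2*sqrt(|x|),
# x(0) = 0, one of the infinitely many solutions of this Cauchy problem.
def x4(t):
    return -t * t if t < 0 else t * t

def dx4(t):                      # exact derivative of x4
    return -2 * t if t < 0 else 2 * t

for t in [-1.0, -0.5, -0.1, 0.0, 0.1, 0.5, 1.0]:
    lhs, rhs = dx4(t), 2 * math.sqrt(abs(x4(t)))
    assert abs(lhs - rhs) < 1e-12

print("the ODE holds at all sampled points")
```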

Peano’s Theorem provides only a local solution, i.e., a solution de-

ﬁned on an interval around t0 which in general is smaller than the

initial interval. If f in (2.4.14) is deﬁned on an open set Ω ⊂ Rk+1

then one can associate with each pair (t0 , x0 ) ∈ Ω a box D ⊂ Ω so

that Peano’s Theorem gives a local solution to problem (2.4.14) and

(2.4.15) deﬁned on an interval which depends on (t0 , x0 ).

For example, we get uniqueness if, in addition, f satisfies a Lipschitz condition: ∃L > 0 such that

    ‖f(t, v1) − f(t, v2)‖ ≤ L ‖v1 − v2‖   ∀(t, v1), (t, v2) ∈ D.   (2.4.18)

Indeed, let φ and ψ be two solutions of problem (2.4.14) and (2.4.15). Then

    ‖φ(t) − ψ(t)‖ ≤ L ∫_{t0}^{t} ‖φ(s) − ψ(s)‖ ds,

or, equivalently,

    (d/dt) [ e^{−Lt} ∫_{t0}^{t} ‖φ(s) − ψ(s)‖ ds ] ≤ 0,


for all t ∈ I. It follows easily that φ(t) = ψ(t) for all t ∈ I. Uniqueness

on [t0 − δ, t0 ] follows by converting problem (2.4.14) and (2.4.15) on

[t0 − δ, t0 ] into a similar Cauchy problem on [0, δ] by using the change

τ = t0 − t. Therefore, we can state the following result.

Theorem. Under the assumptions of Peano's Existence Theorem (Theorem 2.31), plus (2.4.18), there exists a unique function x ∈ C¹([t0 − δ, t0 + δ]; Rk) satisfying (2.4.14) and (2.4.15), where δ is the same as in Theorem 2.31.

Note that Peano's Theorem is no longer valid in infinite dimensions, i.e., if Rk is replaced by an infinite dimensional Banach space (see [18]).

Euler’s Diﬀerence Scheme.

If x = φ(t) is unique, then φε → φ in C(I; Rk ) as ε → 0+ so the

polygonal line corresponding to φε approximates the graph of φ. Let

Δ : t0 < t1 < · · · < tN = t0 + δ with tj = t0 + jh and h = δ/N. The

points (tj , φε (tj )) give us the polygonal line approximation. Denoting

φj := φε (tj ) we have

φj+1 = φj + hf (tj , φj ), j = 0, 1, . . . , N − 1,

φ0 = x0 .

This iteration provides the vertices of a polygonal line approximation, so Euler's

scheme is important for the numerical analysis of the solutions of dif-

ferential equations.

2.5 The Banach Contraction Principle

We saw in the previous section that under the assumptions of Peano's

Existence Theorem (Theorem 2.31) plus the Lipschitz condition (2.4.18)

the Cauchy problem

    x′(t) = f(t, x(t)),   x(t0) = x0,   (2.5.19)

has a unique solution x defined on I := [t0 − δ, t0 + δ], with δ as defined in the statement of Theorem 2.31. This (existence and uniqueness) result can also be derived by applying the general Banach⁸

⁸ Stefan Banach, Polish mathematician, 1892–1945.


Contraction Principle (the Banach Fixed Point Theorem) we present below. Before stating this principle let us explain how

problem (2.5.19) can be reduced to a ﬁxed point problem. Note that

problem (2.5.19) is equivalent to the integral equation

    x(t) = x0 + ∫_{t0}^{t} f(s, x(s)) ds.   (2.5.20)

Denote X := {v ∈ C(I; Rk); ‖v(t) − x0‖ ≤ b ∀t ∈ I}. This is a complete metric space since it is a closed subset of the Banach space C(I; Rk) equipped with the sup-norm, denoted ‖·‖C, which gives the metric d(u, v) = ‖u − v‖C. Define on X the map (operator) T by

    (T v)(t) = x0 + ∫_{t0}^{t} f(s, v(s)) ds,   ∀v ∈ X.

Under the assumptions above, T v ∈ X for all v ∈ X, i.e., T : X → X. Equation (2.5.20) can be simply written as

    x = T x,   (2.5.21)

so solving problem (2.5.19) amounts to solving Eq. (2.5.21) in X. In other words, the Cauchy problem (2.5.19) has

a unique solution x deﬁned on I if and only if T has a unique ﬁxed

point x: x = T x. We do not go into further details concerning the

above Cauchy problem, or Eq. (2.5.20), since later on we will address

Volterra equations which are more general. We simply wanted to mo-

tivate the Banach Contraction Principle which is applicable to many

other problems.
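The fixed point formulation can also be iterated directly (Picard's successive approximations). A minimal sketch (my own test problem x′ = x, x(0) = 1, with v represented by its values on a uniform grid and the integral in (T v)(t) approximated by the trapezoidal rule):

```python
import math

# Picard iteration for x' = x, x(0) = 1: (T v)(t) = 1 + integral_0^t v(s) ds.
# Starting from v = 1, the n-th iterate is (up to quadrature error) the
# degree-n Taylor polynomial of exp(t), so the iterates converge to exp.
N = 1000
h = 1.0 / N

def apply_T(v):
    out, integral = [1.0], 0.0
    for j in range(N):
        integral += 0.5 * (v[j] + v[j + 1]) * h   # trapezoidal rule
        out.append(1.0 + integral)
    return out

v = [1.0] * (N + 1)
for _ in range(20):
    v = apply_T(v)

print(v[-1], math.e)  # v(1) is close to e
```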

Theorem 2.35 (Banach Contraction Principle). Let (X, d) be a complete metric space, and assume T : X → X is a contraction, i.e., ∃α ∈ (0, 1) such that d(T x, T y) ≤ α d(x, y) for all x, y ∈ X. Then T has a unique fixed point (i.e., ∃! x∗ ∈ X such that T x∗ = x∗).

Proof. Define a sequence xn = T xn−1 for n ∈ N, with x0 ∈ X arbitrary. We have by induction

    d(xn+1, xn) ≤ α^n d(T x0, x0),   n = 0, 1, . . .   (2.5.22)

Hence, for all n, p ∈ N,

    d(xn+p, xn) ≤ d(xn+p, xn+p−1) + · · · + d(xn+1, xn),

which by (2.5.22) is

    ≤ α^n (1 + α + · · · + α^{p−1}) d(T x0, x0)
    = α^n ((1 − α^p)/(1 − α)) d(T x0, x0)
    ≤ (α^n/(1 − α)) d(T x0, x0).

So it is Cauchy in (X, d) (as αn → 0), and since (X, d) is complete, xn

converges to some x∗ ∈ X: d(xn, x∗) → 0. Now,

    d(T x∗, x∗) ≤ d(T x∗, xn) + d(xn, x∗)
               = d(T x∗, T xn−1) + d(xn, x∗)
               ≤ α d(x∗, xn−1) + d(xn, x∗) → 0,

hence d(T x∗, x∗) = 0, i.e., x∗ is a fixed point of T.

We now wish to show that x∗ is unique. Suppose that y ∗ is also a

ﬁxed point of T , then d(x∗ , y ∗ ) = d(T x∗ , T y ∗ ) ≤ αd(x∗ , y ∗ ), so (1 −

α)d(x∗ , y ∗ ) ≤ 0 which implies x∗ = y ∗ .

The contraction condition cannot be weakened to d(T x, T y) < d(x, y) for x ≠ y, as the following counterexample from Natanson⁹ [38, p. 571] shows. If

X = R, and T : R → R is given by T x = x + π/2 − arctan x, then T has no fixed point because π/2 − arctan x > 0 ∀x ∈ R. On the other hand,

by the Mean Value Theorem, we have for all x, y ∈ R, x ≠ y,

    |T x − T y| = |x − y − arctan x + arctan y|
                = | x − y − (x − y)/(1 + z²) |   (for some z between x and y)
                = |x − y| · (1 − 1/(1 + z²))
                < |x − y|,

⁹ Isidor P. Natanson, Russian mathematician, 1906–1963.


so T strictly decreases distances; nevertheless, T is not a contraction, since 1 − 1/(1 + z²) can be arbitrarily close to 1. Thus, the fact that this T has no fixed point is not

surprising.
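A short numerical sketch of this counterexample: T strictly shrinks distances, yet its orbit drifts off to +∞ instead of settling at a fixed point.

```python
import math

# T x = x + pi/2 - arctan(x): |Tx - Ty| < |x - y| for x != y, but T is not
# a contraction and has no fixed point; the iterates grow without bound.
def T(x):
    return x + math.pi / 2 - math.atan(x)

x, y = 0.0, 1.0
assert abs(T(x) - T(y)) < abs(x - y)      # distances strictly decrease

z = 0.0
for _ in range(1000):
    z = T(z)
print(z)  # the orbit keeps increasing (roughly like sqrt(2n))
```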

Remark 2.37. From the above proof we see that

    d(xn, x∗) ≤ (α^n/(1 − α)) d(T x0, x0),

which gives us an approximation of x∗.
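This a priori error estimate can be checked numerically; a sketch with a contraction of my own choosing, T x = cos x on the complete metric space X = [−1, 1], where α = sin 1 < 1 bounds |cos′|:

```python
import math

# Successive approximations for T x = cos x on X = [-1, 1], a contraction
# with constant alpha = sin(1). The a priori bound from Remark 2.37,
# d(x_n, x*) <= alpha**n / (1 - alpha) * d(T x0, x0), is verified below.
alpha = math.sin(1.0)
x0 = 1.0
x, xs = x0, [x0]
for _ in range(50):
    x = math.cos(x)
    xs.append(x)

x_star = x  # after 50 steps, essentially the fixed point cos(x*) = x*
d1 = abs(math.cos(x0) - x0)
for n in range(20):
    bound = alpha ** n / (1 - alpha) * d1
    assert abs(xs[n] - x_star) <= bound + 1e-9

print(x_star)  # ~0.739085, the unique fixed point of cos on [-1, 1]
```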

Remark 2.38. Suppose that T^k = T ∘ · · · ∘ T (k factors), k ≥ 2, is a contraction (even though T itself may not be); then T has a unique fixed point. Indeed, if x∗ is the fixed point of T^k (which exists and is unique by Theorem 2.35), then T x∗ = T^{k+1} x∗ = T^k (T x∗), so both x∗ and T x∗ are fixed points of T^k, and consequently T x∗ = x∗.

2.6 Exercises

1. Let A1, A2, . . . be subsets of a metric space. Prove that Cl (⋃_{i=1}^{n} Ai) = ⋃_{i=1}^{n} Cl Ai for all n ∈ N, and Cl (⋃_{i=1}^{∞} Ai) ⊃ ⋃_{i=1}^{∞} Cl Ai. Show by an example that the latter inclusion can be proper.

2. Do A and Cl A always have the same interior? Do A and Int A always have the same closure?

3. Let d0 be the discrete metric on a set X. Show that any subset of (X, d0) is open.

4. Let A be a nonempty subset of a metric space (X, d). Prove that

    p ∈ Cl A ⟺ inf{d(p, x) : x ∈ A} = 0.

5. Let A be a nonempty subset of a metric space (X, d) and let Y be a Banach space. Denote BC(A; Y) = {f : A → Y; f continuous and bounded}. Show that BC(A; Y) is a Banach space with respect to the sup-norm: ‖f‖sup = sup_{x∈A} ‖f(x)‖.


6. Determine the closures of the following subsets of R² (equipped with the Euclidean metric):

(a) Z × Z;

(b) Q × Q;

(c) {(m/n, 1/n); m, n ∈ Z, n ≠ 0};

(d) {(1/n + 1/m, 0); m, n ∈ Z \ {0}}.

7. Find the interior, the closure, and the boundary of each of the following sets:

(a) A = [0, 1] ∩ Q;

(b) B = {1/n; n ∈ N};

(c) C = {(x, y) ∈ R²; x² − y² > 1}.

8. Let (X, ‖·‖) be a normed linear space with the metric generated by the norm (i.e., d(x, y) = ‖x − y‖, ∀x, y ∈ X). Prove that the closure of any open ball B(x, r) := {v ∈ X; d(v, x) < r} in (X, d) is the closed ball B̄(x, r) := {v ∈ X; d(v, x) ≤ r}. Show that this property fails to be true if X is equipped with the discrete metric d0.

9. Show that a Cauchy sequence in a metric space has at most one cluster point.

10. Find the cluster points of the following sequences:

(a) xn = sin (2π √(n² + 3n)), n = 1, 2, . . . ;

(b) yn = sin (π √(n² + n)), n = 1, 2, . . .

11. Show that B := {f ∈ C([0, 1]; R) : f (x) > 0 for all x ∈ [0, 1]}

is open in C([0, 1]; R) equipped with the metric generated by

the sup-norm. What is the closure of B in this metric (in fact

Banach) space?

12. Denote BC(R; R) = {f : R → R; f continuous and f(R) is bounded}.

Let D := {f ∈ BC(R; R); f (x) > 0 for all x ∈ R}. Is D open in

BC(R; R) equipped with the sup-norm? If not, what is Int D?

What is Cl D?


13. Find an open cover of (0, 1] ⊂ (R, | · |) which has no ﬁnite sub-

cover.

14. Show that a discrete subset S of a metric space (X, d) is compact if and only if it is finite. [Recall that S ⊂ (X, d) is discrete if all its elements are isolated.]

15. Show that every compact subset A of a metric space is separable (i.e., there exists a countable subset S of A such that A = Cl S).

16. Let X be a normed linear space equipped with the topology given by the metric d defined by d(x, y) = ∥x − y∥, x, y ∈ X. We have the following:
(a) If A and B are compact subsets of X, then A + B := {u + v; u ∈ A, v ∈ B} is compact, too;
(b) If A is closed and B is compact, then A + B is closed, but not necessarily compact (give a counterexample).

f(x) = sin(π(2x − 1)) for x ∈ [1/2, 1], and f(x) = 0 otherwise.

{fn ; n ∈ N} is closed and bounded in C[0, 1] := C([0, 1], R)

equipped with the sup-norm, but not compact.

18. If A ⊂ Rk is a bounded set, show that A is relatively compact.

19. Let l¹ denote the set of all sequences of real numbers a = (an)n∈N satisfying Σ_{n=1}^∞ |an| < ∞. Show that
(a) l¹ is a Banach space with respect to the norm ∥a∥ = Σ_{n=1}^∞ |an|, a ∈ l¹;
(b) the set A = {a = (an)n∈N ∈ l¹; Σ_{n=1}^∞ n|an| ≤ 1} is compact in (l¹, ∥ · ∥) (i.e., in (X, d), where d is the metric generated by ∥ · ∥: d(a, b) = ∥a − b∥, a, b ∈ l¹).


20. Let F be the set of all functions f : [0, 1] → R of the form

f(x) = Σ_{n=1}^∞ an sin(nπx),

where a = (an)n∈N is a sequence in R satisfying Σ_{n=1}^∞ n|an| ≤ 1. Show that F is a compact subset of C[0, 1] := C([0, 1]; R) equipped with the sup-norm. Does the result hold if the domain of the f’s is D = R?

21. Let −∞ < a < b < ∞, un ∈ C¹([a, b]; R), n = 1, 2, . . . , such that (un)n∈N and (u′n)n∈N are bounded in Lp([a, b]; R), p ∈ (1, ∞), equipped with the usual norm. Show that (un) has a subsequence which is convergent in C([a, b]; R) with respect to the sup-norm.

(Information on Lp spaces is available in Chap. 3 below.)

22. Let (X, d), (Y, ρ) be metric spaces, and let F ⊂ C(A; Y ), where

∅ = A ⊂ X. If A is compact (with respect to d) and F is an

equicontinuous family, then F is uniformly equicontinuous.

23. For a ∈ R consider fa : [0, 1] → R, fa (x) = 1+ax2 x2 . Show that

F = {fa ; a ∈ R} is relatively compact in C[0, 1] := C([0, 1]; R)

equipped with the sup-norm, but not compact.

24. (a) Prove Gronwall’s lemma, namely: given continuous functions u, a, b : I = [t0, T] → R with b ≥ 0 and

u(t) ≤ a(t) + ∫_{t0}^{t} b(s)u(s) ds, t ∈ I,

then

u(t) ≤ a(t) + ∫_{t0}^{t} a(s)b(s) e^{∫_s^t b(τ) dτ} ds ∀t ∈ I.

(b) Derive Bellman’s lemma: in the case a is a constant function, i.e., a(t) = C ∀t ∈ I, then

u(t) ≤ C e^{∫_{t0}^{t} b(s) ds}, t ∈ I.

25. Consider again Theorem 2.31 (Peano’s Theorem). Assume (in addition to continuity on D) that f satisfies the Lipschitz condition (2.4.18). Use Bellman’s lemma to prove that x is the unique solution of the corresponding Cauchy problem.


26. Let x, f ∈ C(I; R), f ≥ 0, and c ∈ R be such that

(1/2) x(t)² ≤ (1/2) c² + ∫_{t0}^{t} f(s)x(s) ds ∀t ∈ I = [t0, T].

Show that

|x(t)| ≤ |c| + ∫_{t0}^{t} f(s) ds ∀t ∈ I.

x′(t) = 1 + t² + x(t)²/(1 + x(t)²); x(0) = 0,

has a unique solution defined on R.

x′(t) = 2e^{−t²} + ln(1 + x(t)²); x(0) = 0.

x′(t) = 2√|x(t)|, t ∈ R; x(0) = 0,

x′(t) = 1 + t(1 + x(t)²), t ≥ 0; x(0) = x0,

has a solution whose maximal interval of existence is (−T, T), with √2/2 < T < ∞.

a continuous function. Then, for any (t0 , x0 ) ∈ Ω, the Cauchy

problem


addition, f satisfies the condition: ∀ compact K ⊂ Ω, ∃LK > 0 such that |f(t, u) − f(t, v)| ≤ LK |u − v| ∀(t, u), (t, v) ∈ K, then the solution of the Cauchy problem is unique.

x(t0) = x0, where A(t) = (aij(t)) is a k × k matrix and b(t) = (b1(t), . . . , bk(t))^T, with aij, bj ∈ C(I) := C(I; R), i, j = 1, 2, . . . , k. Show that the above Cauchy problem has a unique solution on the whole interval I.

Euclidean metric. Show that T has at least one ﬁxed point.

34. Prove that for every f ∈ C[0, 1] := C([0, 1]; R) and α ∈ (0, 1) the integral equation

x(t) = f(t) + ∫_0^1 e^{−ts} cos(αx(s)) ds, t ∈ [0, 1],

has a unique solution x ∈ C[0, 1].

a continuous function satisfying

Chapter 3

The Lebesgue Integral and Lp Spaces

This chapter presents the basics of Lebesgue measure, measurable functions, Lebesgue integration, and Lp spaces. These spaces, equipped with appropriate norms, are significant examples of Banach spaces.

Here we essentially follow [46]. First of all, for any closed cube C ⊂ Rk, denote by v(C) the product of its edge lengths (called the volume of C).

A collection of cubes in Rk is said to be almost disjoint if the interiors

of the cubes are disjoint.

It is easily seen that every open set D ⊂ Rk (equipped with the usual

norm topology) can be written as a countable union of almost disjoint

closed cubes: D = ∪_{j=1}^∞ Cj. To prove this, consider a grid in Rk of closed cubes of side length 1/n, with n sufficiently large, retaining the

cubes of the grid that are completely contained in D. Then, we bisect

each cube of the above grid into 2k cubes with side length 1/(2n) and

1 Henri Léon Lebesgue, French mathematician, 1875–1941.

G. Moroşanu, Functional Analysis for the Applied Sciences,

Universitext, https://doi.org/10.1007/978-3-030-27153-4 3


retain those new cubes that are contained in D. Thus, repeating in-

deﬁnitely the procedure, we construct a countable collection of almost

disjoint closed cubes whose union equals D, as claimed.
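The grid argument above can be illustrated numerically. The following one-dimensional sketch (an illustrative assumption, not from the book: D is taken as a finite union of open intervals) counts grid intervals of length 1/2^n fully contained in D; their total length approaches m(D) as the grid is refined.

```python
# 1-D sketch of the grid argument: count grid intervals of length 1/2**n that
# are fully contained in the open set D; their total length approaches m(D).
def inside(a, b, D):
    """Is the closed interval [a, b] contained in the open set D?"""
    return any(lo < a and b < hi for (lo, hi) in D)

D = [(0.0, 1.0), (2.0, 2.5)]          # an open set with m(D) = 1.5
for n in [4, 8, 12]:
    h = 1.0 / 2**n
    total = sum(h for k in range(int(3 / h)) if inside(k * h, (k + 1) * h, D))
    assert abs(total - 1.5) <= 4 * h + 1e-9   # error shrinks with the grid
```

In Rk the same idea applies with cubes of side 1/n, followed by repeated bisection near the boundary.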

Now, for any set M ⊂ Rk, we define the exterior measure of M by

me(M) = inf Σ_{j=1}^∞ v(Cj),

where the infimum is taken over all countable covers ∪_{j=1}^∞ Cj ⊃ M with closed cubes Cj.

me (∅) = 0.

Indeed, we clearly have me(C) ≤ v(C), and in order to prove the converse inequality it suffices to show that for any cover by closed cubes ∪_{j=1}^∞ Cj ⊃ C, we have

v(C) ≤ Σ_{j=1}^∞ v(Cj). (3.1.1)

Let ε > 0 be arbitrary but fixed. Choose for each j an open cube Cj′ ⊃ Cj such that v(Cj′) ≤ (1 + ε)v(Cj). Since {Cj′}_{j=1}^∞ is an open cover of the compact set C, there exists a finite subcover {C′_{j1}, . . . , C′_{jm}}, C ⊂ ∪_{i=1}^m C′_{ji}. It follows that

v(C) ≤ (1 + ε) Σ_{i=1}^m v(C_{ji}) ≤ (1 + ε) Σ_{j=1}^∞ v(Cj),

and, since ε was arbitrary, (3.1.1) follows.

(e) If M = ∪_{j=1}^∞ Mj, then

me(M) ≤ Σ_{j=1}^∞ me(Mj). (3.1.2)

3.1 Measurable Sets in Rk 67

Indeed, if me(Mj) = ∞ for some j, the inequality is trivially satisfied. Otherwise, for arbitrary ε > 0 we can choose for each j a cover by closed cubes Mj ⊂ ∪_{q=1}^∞ Cj,q such that

Σ_{q=1}^∞ v(Cj,q) < me(Mj) + ε/2^j.

Then, M ⊂ ∪_{j,q=1}^∞ Cj,q, hence

me(M) ≤ Σ_{j,q=1}^∞ v(Cj,q) ≤ Σ_{j=1}^∞ (me(Mj) + ε/2^j) = Σ_{j=1}^∞ me(Mj) + ε,

and, since ε > 0 was arbitrary, (3.1.2) follows.

(f) For every M ⊂ Rk,

me(M) = inf{me(D); D open, D ⊃ M}.

Clearly, me(M) ≤ me(D) for every open D ⊃ M. For the converse inequality, let ε > 0 and choose a cover by closed cubes, M ⊂ ∪_{j=1}^∞ Cj, such that

Σ_{j=1}^∞ v(Cj) < me(M) + ε/2.

Choose for every j an open cube Cj′, such that Cj ⊂ Cj′ and

v(Cj′) ≤ v(Cj) + ε/2^{j+1}.


Then, denoting D′ = ∪_{j=1}^∞ Cj′, we have that D′ is an open set containing M, and by (e)

me(D′) ≤ Σ_{j=1}^∞ me(Cj′) = Σ_{j=1}^∞ v(Cj′) ≤ Σ_{j=1}^∞ (v(Cj) + ε/2^{j+1}) = Σ_{j=1}^∞ v(Cj) + ε/2 < me(M) + ε.

If M = ∪_{j=1}^∞ Cj with almost disjoint closed cubes Cj, then me(M) = Σ_{j=1}^∞ v(Cj).

Indeed, by (c) and (e), me(M) ≤ Σ_{j=1}^∞ v(Cj), and for the converse inequality we consider, for a fixed m ∈ N and an arbitrary but fixed ε > 0, closed cubes C̃j ⊂ Int(Cj), j = 1, . . . , m, such that

v(Cj) < v(C̃j) + ε/2^j, j = 1, . . . , m.

Then,

me(M) ≥ me(∪_{j=1}^m C̃j) = Σ_{j=1}^m v(C̃j) ≥ Σ_{j=1}^m v(Cj) − ε,

which implies me(M) ≥ Σ_{j=1}^∞ v(Cj).

A set M ⊂ Rk is called Lebesgue measurable (or simply measurable) if for every ε > 0 there exists an open set D such that D ⊃ M and me(D \ M) < ε. If M is measurable, we define the Lebesgue measure (or measure) of M by m(M) := me(M).

(A) It follows from the above definition that every open set is measurable.


(B) If me(M) = 0, then M is measurable. Indeed, we know (see (f) above) that

0 = me(M) = inf{me(D); D open, D ⊃ M},

so for any ε > 0 there exists an open set Dε such that Dε ⊃ M and me(Dε) < ε. As Dε \ M ⊂ Dε, we have me(Dε \ M) < ε.

(C) If M = ∪_{j=1}^∞ Mj, where each Mj is measurable, then M is measurable.

Indeed, for a given ε > 0, we can choose for each j an open set Dj, Dj ⊃ Mj, such that me(Dj \ Mj) < ε/2^j. Hence D = ∪_{j=1}^∞ Dj is open, D ⊃ M and D \ M ⊂ ∪_{j=1}^∞ (Dj \ Mj), which implies me(D \ M) ≤ Σ_{j=1}^∞ me(Dj \ Mj) < ε.

(D) Every compact set K ⊂ Rk is measurable. Indeed, since K is compact, hence bounded, we have me(K) < ∞. For

any ε > 0 there exists an open set D, D ⊃ K, such that me (D) <

me (K) + ε/2 (cf. (f)). The open set D \ K can be written as a

countable union of almost disjoint closed cubes: D\K = ∪∞j=1 Cj .

Now, for a given p ∈ N, K1 = ∪pj=1 Cj is a compact set with

K1 ∩ K = ∅, K ∪ K1 ⊂ D, and

me(D) ≥ me(K ∪ K1) = me(K) + me(K1) = me(K) + Σ_{j=1}^p v(Cj),

so that

Σ_{j=1}^p v(Cj) ≤ me(D) − me(K) < ε/2 for all p ∈ N,

hence

me(D \ K) ≤ me(∪_{j=1}^∞ Cj) ≤ Σ_{j=1}^∞ me(Cj) = Σ_{j=1}^∞ v(Cj) ≤ ε/2 < ε,

and therefore K is measurable.


Every closed set F ⊂ Rk is measurable. Indeed, F can be written as a countable union of compact sets,

F = ∪_{n=1}^∞ (F ∩ B̄(0, n)),

hence F is measurable by (C) and the previous property.

(E) If M is measurable, then so is its complement Rk \ M. To prove this, observe first that for all n ∈ N there exists an open set Dn such that M ⊂ Dn and me(Dn \ M) < 1/n. Since Rk \ Dn is a closed set, it is measurable, hence E := ∪_{n=1}^∞ (Rk \ Dn) is measurable. As Rk \ (M ∪ E) ⊂ Dn \ M for all n ∈ N, we have me(Rk \ (M ∪ E)) = 0, so Rk \ (M ∪ E) is measurable (cf. (B)). Since

Rk \ M = [Rk \ (M ∪ E)] ∪ E,

we conclude by (C) that Rk \ M is measurable, as claimed.

A countable intersection of measurable sets is a measurable set. This follows easily from ∩_{j=1}^∞ Mj = Rk \ [∪_{j=1}^∞ (Rk \ Mj)] (see also (C) and (E)).

Theorem 3.2. If {Mn}_{n=1}^∞ is any collection of disjoint measurable sets, then m(∪_{n=1}^∞ Mn) = Σ_{n=1}^∞ m(Mn).

Proof. Assume first that each Mn is bounded, and let ε > 0 be arbitrary but fixed. Since Rk \ Mn is measurable, for any n ∈ N there exists a closed set Fn ⊂ Mn such that me(Mn \ Fn) < ε/2^n. For each fixed p ∈ N, F1, . . . , Fp are compact and disjoint, and, denoting M = ∪_{n=1}^∞ Mn, we have

m(M) ≥ m(∪_{n=1}^p Fn) = Σ_{n=1}^p m(Fn) ≥ Σ_{n=1}^p m(Mn) − ε,

which implies (p and ε being arbitrary) m(M) ≥ Σ_{n=1}^∞ m(Mn). This concludes the proof in the case when each Mn is bounded, since the converse inequality is also satisfied. In the general case, we consider the closed

3.2 Measurable Functions 71

cubes C1 ⊂ C2 ⊂ · · · centered at the origin and set Mn,1 = Mn ∩ C1, Mn,i = Mn ∩ (Ci \ Ci−1), i = 2, 3, . . . Then

Mn = ∪_i Mn,i, M = ∪_{n,i} Mn,i,

so, as each Mn,i is bounded, we can use what we obtained above to write

m(M) = Σ_{n,i} m(Mn,i) = Σ_n Σ_i m(Mn,i) = Σ_n m(Mn).

Remark 3.3. There are subsets of Rk which are not Lebesgue measur-

able. See, for example, [46, p. 24].

Remark 3.4. Denote by A the collection of all measurable subsets of

Rk. According to the usual terminology, as ∅ ∈ A and (E) and (C) hold, A is a σ-algebra (and the pair (Rk, A) is a measurable space). As the Lebesgue measure m is a nonnegative function on A satisfying m(∅) = 0 and Theorem 3.2, the triple (Rk, A, m) is a measure space. This definition of a measure

space can be also used for sets other than Rk . In particular, if Ω ⊂ Rk

is a Lebesgue measurable set, and deﬁne B = {B ∩ Ω; B ∈ A}, then

(Ω, B, m) is a measure space (where m is the restriction to B of the

Lebesgue measure deﬁned above).

In what follows we consider the measure space (Rk, A, m) defined in the previous section. Note that similar considerations apply to any other measure space. Assume that R = R¹ is equipped with the usual topology.

Deﬁnition 3.5. A function f : Rk → R is called measurable if for

all λ ∈ R the set {f > λ} := {x ∈ Rk ; f (x) > λ} is measurable (i.e.,

it belongs to A).

Remark 3.6. Equivalent deﬁnitions are obtained if the set {f > λ} is

replaced by {f ≥ λ}, {f < λ}, or {f ≤ λ}, λ ∈ R. Indeed, if {f > λ}

is measurable for all λ ∈ R then so is

{f ≥ λ} = ∩_{n=1}^∞ {f > λ − 1/n} ∀λ ∈ R,

hence so is

{f < λ} = Rk \ {f ≥ λ} ∀λ ∈ R,

and so on (the other implications are trivially satisfied).


A function f : Rk → R is measurable if and only if for every open set D ⊂ R the set f−1(D) := {x ∈ Rk; f(x) ∈ D} is measurable. Indeed, if f−1(D) is measurable for every open set D ⊂ R then, taking D = (λ, ∞), the set {f > λ} = f−1(D) is measurable for every λ ∈ R, so f is measurable. Conversely, let us assume that f is measurable. If ∅ ≠ D ⊂ R is an open set, then it can be represented as a countable union of disjoint open intervals. Indeed, for x ∈ D denote by I(x) the maximal open interval containing x and included in D. If x, y are distinct points in D, then I(x), I(y) either coincide or are disjoint. Obviously, D = ∪_{x∈D} I(x). Since each I(x) contains a rational number, the number of distinct I(x) must be countable, so D = ∪_{n=1}^∞ In. Since f is measurable, we have f−1(In) ∈ A for all n ∈ N, which implies f−1(D) = ∪_{n=1}^∞ f−1(In) ∈ A.

A property (P) is said to hold almost everywhere (abbreviated a.e.) in Ω ⊂ Rk if it holds in Ω \ E with m(E) = 0; in other words, (P) holds for almost all (abbreviated a.a.) x ∈ Ω.

If f is measurable and g = f a.e. (say f = g on Rk \ E, where m(E) = 0), then g is also measurable. Indeed, for every λ ∈ R,

{g > λ} ∪ E = {f > λ} ∪ E ∈ A,

and since {g > λ} differs from this set by a subset of E, which has measure zero, it follows that {g > λ} ∈ A.

all measurable functions.

If f : Rk → R is measurable and g : R → R is continuous, then g ◦ f is measurable. Indeed, for any open set D ⊂ R, g−1(D) is open, too. Hence, as f is measurable, we conclude that (g ◦ f)−1(D) = f−1(g−1(D)) is measurable for any open set D ⊂ R.

In particular, if f is measurable, then so are the functions λf (λ ∈ R), |f|^p (p > 0), f+ = max{f, 0}, f− = −min{f, 0}, etc.


If f, g : Rk → R are measurable functions, then so are f + g and fg; if, in addition, g ≠ 0 a.e., then f/g is measurable.

Proof. For any λ ∈ R we have

{f + g > λ} = ∪_{q∈Q} {f > q > λ − g} = ∪_{q∈Q} ({f > q} ∩ {g > λ − q}),

hence f + g is measurable. The function fg is also measurable since

fg = (1/4)[(f + g)² − (f − g)²].

In order to prove the last statement, it suffices to prove that 1/g is measurable. This follows from

{1/g > λ} = ({g > 0} ∩ {λg < 1}) ∪ ({g < 0} ∩ {λg > 1}).

Theorem 3.12. If (fn )n∈N is a sequence of measurable functions,

then all of supn∈N fn , inf n∈N fn , lim supn→∞ fn , and lim inf n→∞ fn

are measurable. In particular, if fn → f a.e. then f is measurable.

Proof. For any λ ∈ R we have {supn∈N fn > λ} = ∪n∈N {fn > λ} which

implies that supn∈N fn is measurable. The function inf n∈N fn is also

measurable since it is equal to − supn∈N (−fn ). The other statements

follow from

lim sup_{n→∞} fn = inf_{i∈N} {sup_{n≥i} fn}, lim inf_{n→∞} fn = sup_{i∈N} {inf_{n≥i} fn}.

Now, let us recall the deﬁnition of the characteristic function of a set

E, denoted χE ,

χE(x) = 1 if x ∈ E, and χE(x) = 0 if x ∉ E.

Let E ⊂ Rk . It is easily seen that χE is measurable if and only if E is

measurable.

Deﬁnition 3.13. A function f : Rk → R is called a simple function

if it has the form

f(x) = Σ_{i=1}^p yi χMi(x), (3.2.3)

where p ∈ N, yi ∈ R, i = 1, . . . , p, and the Mi ’s are disjoint, measurable

subsets of Rk , with m(Mi ) < ∞, i = 1, . . . , p.


So a simple function is a linear combination of characteristic functions of measurable sets. Normally, in the above definition y1, . . . , yp are distinct numbers.

Theorem 3.14. If f : Rk → R is a measurable function, then there exists a sequence of simple functions (fn)n∈N such that

|fn(x)| ≤ |f(x)|, x ∈ Rk, n = 1, 2, . . . , (3.2.4)

and

lim_{n→∞} fn(x) = f(x), x ∈ Rk. (3.2.5)

If, in addition, f ≥ 0, then one can find fn ≥ 0, n = 1, 2, . . .

Proof. We assume first that f is a nonnegative measurable function. For a given n ∈ N, define the following subsets of Rk:

Mj = {(j − 1)/2^n ≤ f < j/2^n}, j = 1, 2, . . . , n2^n, and Pn = {f ≥ n},

which are all measurable. Let

gn(x) = Σ_{j=1}^{n2^n} ((j − 1)/2^n) χMj(x) + n χPn(x), x ∈ Rk, n = 1, 2, . . .

Note that 0 ≤ f(x) − gn(x) ≤ 2^{−n} whenever f(x) ≤ n, hence gn → f pointwise. Thus, the sequence (gn) satisfies all the properties required of the sequence (fn) in the statement of the theorem, except for m(Pn) < ∞, m(Mj) < ∞ for all n ∈ N, j = 1, 2, . . . , n2^n (see Definition 3.13). This inconvenience can be

easily removed as follows. For any n ∈ N, consider the closed cube Cn centered at the origin with side length n and define

fn(x) = Σ_{j=1}^{n2^n} ((j − 1)/2^n) χ_{Mj∩Cn}(x) + n χ_{Pn∩Cn}(x), x ∈ Rk.

It is easily seen that (fn) satisfies all the desired properties. In the general case, write f = f+ − f−, which implies |f| = f+ + f−. Since f+ and f− are both

3.3 The Lebesgue Integral 75

measurable and nonnegative, it follows from the proof above that there

exist sequences (fn+ ) and (fn− ) that satisfy the properties mentioned

above and approximate f + and f − , respectively. Then (fn = fn+ −fn− )

is a sequence of simple functions satisfying (3.2.4) and (3.2.5).
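The dyadic construction from the proof above can be sketched in a few lines. The following illustration (the sample function f and test points are assumptions, not from the book) evaluates g_n pointwise and checks the 2^{−n} accuracy on the set where f ≤ n.

```python
import math

# Pointwise evaluation of the dyadic approximation from the proof of Thm 3.14:
# g_n = (j-1)/2^n on {(j-1)/2^n <= f < j/2^n}, j = 1..n*2^n, and n on {f >= n}.
def g_n(f, x, n):
    v = f(x)
    if v >= n:
        return n
    return math.floor(v * 2**n) / 2**n

f = lambda x: x * x                      # a nonnegative measurable function
for n in range(1, 8):
    for k in range(100):
        x = k / 10.0
        err = f(x) - g_n(f, x, n)
        assert err >= 0                  # g_n <= f
        if f(x) <= n:
            assert err <= 2.0 ** (-n)    # uniform 2^{-n} accuracy where f <= n
```

The monotonicity 0 ≤ g1 ≤ g2 ≤ · · · ≤ f (used later with the Monotone Convergence Theorem) also follows from the dyadic refinement.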

Remark 3.15. Taking into account Theorems 3.12 and 3.14, one can

say that a function f : Rk → R is measurable (in the sense of Deﬁni-

tion 3.5) if and only if f is the limit of a sequence of simple functions

(fn ), i.e., fn (x) → f (x), as n → ∞, for a.a. x ∈ Rk . This equivalent

condition can be used to deﬁne the notion of an X-valued measurable

function, where X is a Banach space.

If f : Rk → R is a simple function as in (3.2.3), the Lebesgue integral of f is defined by

∫_{Rk} f(x) dx := Σ_{i=1}^p yi · m(Mi). (3.3.6)

If Ω ⊂ Rk is a measurable set, then fχΩ is also a simple function, and we define

∫_Ω f(x) dx := ∫_{Rk} f(x)χΩ(x) dx.

Denote by S the set of all simple functions f : Rk → R. It is easily seen

that S is a linear space over R with respect to the usual operations:

addition of functions and scalar multiplication. We have the following

statements:

• ∫_{Rk} (αf + βg) dx = α ∫_{Rk} f dx + β ∫_{Rk} g dx ∀f, g ∈ S, α, β ∈ R;
• f, g ∈ S, f ≤ g =⇒ ∫_{Rk} f dx ≤ ∫_{Rk} g dx;
• If Ω1, Ω2 ⊂ Rk are disjoint measurable sets with m(Ωi) < ∞, i = 1, 2, then

∫_{Ω1∪Ω2} f dx = ∫_{Ω1} f dx + ∫_{Ω2} f dx;

• If f ∈ S, then so is |f| and

|∫_{Rk} f dx| ≤ ∫_{Rk} |f| dx.


In what follows we are concerned with the Lebesgue integration of non-

negative measurable functions. Denote by S + the set of all nonnegative

simple functions f : Rk → R (i.e., functions of the form (3.2.3), where

each yi ≥ 0).

Definition 3.16. A nonnegative measurable function f : Rk → R is called integrable in the sense of Lebesgue (or simply integrable) if

sup{∫_{Rk} s dx; s ∈ S+, s ≤ f} < +∞,

and we denote

∫_{Rk} f dx := sup{∫_{Rk} s dx; s ∈ S+, s ≤ f}.

If sup{∫_{Rk} s dx; s ∈ S+, s ≤ f} = ∞, we write ∫_{Rk} f dx = ∞.

Note that if f is a nonnegative simple function, i.e., a function of the form (3.2.3) with yi ≥ 0, i = 1, . . . , p, then using this definition we reobtain ∫_{Rk} f(x) dx = Σ_{i=1}^p yi · m(Mi).

If f : Rk → R is a nonnegative integrable function and Ω ⊂ Rk is a measurable set, then we define ∫_Ω f dx := ∫_{Rk} fχΩ dx.

We have the following immediate statements for f, g : Rk → R nonnegative measurable functions and α ≥ 0:
• f ≤ g =⇒ ∫_{Rk} f dx ≤ ∫_{Rk} g dx;
• if Ω1 ⊂ Ω2 are measurable sets, then ∫_{Ω1} f dx ≤ ∫_{Ω2} f dx.

We also have: f = 0 a.e. if and only if ∫_{Rk} f dx = 0.

Proof. Observe first that if f = 0 a.e., then for any s ∈ S+, with s ≤ f, we have s = 0 a.e., so ∫_{Rk} s dx = 0. Therefore ∫_{Rk} f dx = 0. Conversely, let us assume that ∫_{Rk} f dx = 0. Define Ωn = {x ∈ Rk; f(x) ≥ 1/n}, n ∈ N. We have for all n ∈ N

0 = ∫_{Rk} f dx ≥ ∫_{Rk} (1/n)χΩn dx = (1/n) m(Ωn).

So m(Ωn) = 0 for all n ∈ N =⇒ m({f > 0}) = m(∪_{n=1}^∞ Ωn) = 0 =⇒ f = 0 almost everywhere.


The next result is known as Beppo Levi’s theorem.²

Theorem 3.17 (Monotone Convergence Theorem). Let 0 ≤ f1 ≤ f2 ≤ · · · ≤ fn ≤ · · · be a sequence of measurable functions. Denote f(x) := lim_{n→∞} fn(x). Then

lim_{n→∞} ∫_{Rk} fn dx = ∫_{Rk} f dx.

Proof. Since fn ≤ f for all n ∈ N, we have

lim_{n→∞} ∫_{Rk} fn dx ≤ ∫_{Rk} f dx.

For the converse inequality, let s ∈ S+ with s ≤ f, and let ε ∈ (0, 1). Define Mn = {x ∈ Rk; fn(x) ≥ εs(x)}, n ∈ N. We have Rk = ∪_{n=1}^∞ Mn. Indeed, if x ∈ Rk and f(x) = 0, then s(x) = 0, so x ∈ M1; if f(x) > 0, then f(x) > εs(x), so fn(x) ≥ εs(x) for n large enough.

Next,

∫_{Rk} fn dx ≥ ∫_{Mn} fn dx ≥ ε ∫_{Mn} s dx.

Since Mn ⊂ Mn+1 for all n ∈ N, the last inequality implies

lim_{n→∞} ∫_{Rk} fn dx ≥ ε ∫_{Rk} s dx,

and, ε ∈ (0, 1) being arbitrary,

lim_{n→∞} ∫_{Rk} fn dx ≥ ∫_{Rk} s dx ∀s ∈ S+, s ≤ f.

This implies

lim_{n→∞} ∫_{Rk} fn dx ≥ ∫_{Rk} f dx,

as claimed.

Remark 3.18. Combining Theorems 3.14 and 3.17, we infer that for any nonnegative integrable function f : Rk → R, there exists an increasing sequence (sn)n∈N in S+ such that sn → f pointwise (or a.e.) and ∫_{Rk} sn dx → ∫_{Rk} f dx. Using this observation, one can readily deduce that

2 Beppo Levi, Italian mathematician, 1875–1961.


for nonnegative measurable functions f and g, f + g satisfies

∫_{Rk} (f + g) dx = ∫_{Rk} f dx + ∫_{Rk} g dx.

We also have
• ∫_{Rk} αf dx = α ∫_{Rk} f dx ∀α ≥ 0.

The next result is Fatou’s lemma.³ Let (fn)n∈N be a sequence of nonnegative measurable functions. Set f = lim inf_{n→∞} fn. Then,

∫_{Rk} f dx ≤ lim inf_{n→∞} ∫_{Rk} fn dx. (3.3.7)

Proof. Set gn = inf_{i≥n} fi. As (gn) is an increasing sequence, we have

f = sup_{n∈N} gn = lim_{n→∞} gn.

By the Monotone Convergence Theorem,

lim_{n→∞} ∫_{Rk} gn dx = ∫_{Rk} f dx. (3.3.8)

Since gn ≤ fn,

∫_{Rk} gn dx ≤ ∫_{Rk} fn dx, n ∈ N. (3.3.9)

Combining (3.3.8) and (3.3.9), we obtain (3.3.7).

Now, we are going to define the Lebesgue integral for a general measurable function f : Rk → R. One can use the decomposition f = f+ − f−. Obviously, f is measurable if and only if both f+ and f− are measurable. Such a function f is said to be integrable if both f+ and f− are integrable, and then

∫_{Rk} f dx := ∫_{Rk} f+ dx − ∫_{Rk} f− dx.

3 Pierre Joseph Louis Fatou, French mathematician, 1878–1929.


Denote by L(Rk) the set of all integrable functions f : Rk → R.

One can prove by elementary arguments the following statements:
• f ∈ L(Rk) ⇐⇒ |f| ∈ L(Rk);
• if f is measurable, g ∈ L(Rk), and |f| ≤ g a.e., then f ∈ L(Rk);
• if f ∈ L(Rk) and α ∈ R, then αf ∈ L(Rk) and

∫_{Rk} αf dx = α ∫_{Rk} f dx.

We also have
• If f, g ∈ L(Rk), then f + g ∈ L(Rk) and

∫_{Rk} (f + g) dx = ∫_{Rk} f dx + ∫_{Rk} g dx.

Indeed, (f + g)+ and (f + g)− are measurable, and f+, f−, g+, g− ∈ L(Rk). From (f + g)+ ≤ f+ + g+ and (f + g)− ≤ f− + g− we infer that (f + g)+, (f + g)− ∈ L(Rk), which implies f + g ∈ L(Rk). On the other hand,

(f + g)+ − (f + g)− = f + g = f+ − f− + g+ − g−,

so

(f + g)+ + f− + g− = (f + g)− + f+ + g+,

which involves only nonnegative integrable functions. Hence,

∫_{Rk} (f + g)+ dx + ∫_{Rk} f− dx + ∫_{Rk} g− dx = ∫_{Rk} (f + g)− dx + ∫_{Rk} f+ dx + ∫_{Rk} g+ dx,

and the desired equality follows.

• If f ∈ L(Rk) and g = f a.e., then g ∈ L(Rk) and ∫_{Rk} g dx = ∫_{Rk} f dx. Indeed, g+ = f+ and g− = f− a.e., so

∫_{Rk} g+ dx = ∫_{Rk} f+ dx, ∫_{Rk} g− dx = ∫_{Rk} f− dx,

and the conclusion follows.

• If f, g ∈ L(Rk) and f ≤ g a.e., then ∫_{Rk} f dx ≤ ∫_{Rk} g dx.

The proof is easy.

• For every f ∈ L(Rk) we have

|∫_{Rk} f dx| ≤ ∫_{Rk} |f| dx.

Indeed,

∫_{Rk} f dx = ∫_{Rk} f+ dx − ∫_{Rk} f− dx ≤ ∫_{Rk} f+ dx + ∫_{Rk} f− dx = ∫_{Rk} |f| dx.

Similarly,

−∫_{Rk} f dx ≤ ∫_{Rk} |f| dx,

so the result follows.

Theorem 3.21. Let f ∈ L(Rk). Then, for every ε > 0 there exists δ > 0 such that for every measurable set M ⊂ Rk with m(M) < δ we have ∫_M |f| dx < ε.

Proof. For n ∈ N define

gn(x) = |f(x)| if |f(x)| ≤ n, and gn(x) = n if |f(x)| > n.

Then (gn) is an increasing sequence converging pointwise to |f|. By Beppo Levi’s theorem,

lim_{n→∞} ∫_{Rk} gn dx = ∫_{Rk} |f| dx,


so there exists N ∈ N such that

∫_{Rk} (|f| − gN) dx < ε/2. (3.3.10)

Now take δ := ε/(2N). Then

∀M ∈ A with m(M) < δ, ∫_M gN dx ≤ ∫_M N dx = N m(M) < ε/2. (3.3.11)

Now, we derive from (3.3.10) and (3.3.11),

∫_M |f| dx = ∫_M (|f| − gN) dx + ∫_M gN dx < ε.

Equality a.e. defines an equivalence relation on any space of measurable functions, in particular on L(Rk). Denote by L1(Rk) the quotient space L(Rk)/∼, where ∼ stands for the equivalence relation we are talking about. In general, any equivalence class in L1(Rk) is identified with a representative of the corresponding class, which is usually selected to be the most regular one. If Ω ⊂ Rk is a measurable set, we can similarly define L1(Ω) := L(Ω)/∼. Based on this identification, we can say that the above theory works for functions (in fact classes of functions) belonging to L1(Rk) or to L1(Ω).

The next result is known as Lebesgue’s Dominated Convergence Theorem.

Theorem 3.22. Let Ω ⊂ Rk be a measurable set, possibly Ω = Rk. Let (fn)n∈N be a sequence in L1(Ω) such that fn → f a.e. on Ω and there exists g ∈ L1(Ω) with |fn| ≤ g a.e. on Ω for all n ∈ N. Then, f ∈ L1(Ω) and lim_{n→∞} ∫_Ω |fn(x) − f(x)| dx = 0.

Proof. Passing to the limit in |fn| ≤ g, we get |f| ≤ g a.e., so f ∈ L1(Ω). Set hn := |fn − f|. We have hn → 0 a.e. on Ω and hn ≤ g̃ := g + |f| ∈ L1(Ω). Applying Fatou’s lemma to the sequence (g̃ − hn), we get

∫_Ω g̃ dx ≤ lim inf_{n→∞} ∫_Ω (g̃ − hn) dx = ∫_Ω g̃ dx − lim sup_{n→∞} ∫_Ω hn dx,

which implies

lim sup_{n→∞} ∫_Ω hn dx ≤ 0.

Thus

lim_{n→∞} ∫_Ω hn dx = 0.
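A numerical illustration of dominated convergence can be made with a concrete sequence (all choices below — the functions fn, the cutoff at x = 40, and the midpoint rule — are assumptions for illustration, not from the book): 0 ≤ fn(x) = (1 + x/n)^n e^{−2x} ≤ e^{−x} ∈ L1(0, ∞), and fn → e^{−x} pointwise, so the integrals tend to 1.

```python
import math

# Numerical illustration of dominated convergence: f_n is dominated by e^{-x}
# and converges pointwise to e^{-x}, whose integral over (0, oo) is 1.
def integral(f, a, b, N=100000):
    """Midpoint Riemann sum on [a, b]."""
    h = (b - a) / N
    return sum(f(a + (k + 0.5) * h) for k in range(N)) * h

for n in [1, 10, 100]:
    fn = lambda x, n=n: (1 + x / n) ** n * math.exp(-2 * x)
    In = integral(fn, 0.0, 40.0)
    assert In < 1.0                       # dominated by the integral of e^{-x}
assert abs(In - 1.0) < 0.02               # already close to the limit for n = 100
```

This is exactly the situation of Exercise 9 in Sect. 3.5.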

3.4 Lp Spaces

Throughout this section Ω denotes a measurable subset of Rk (possibly

Ω = Rk ). As usual, any class of measurable functions with respect to

the equality a.e. will be identiﬁed with one of its representatives.

We have already defined the space L1(Ω) as the set of all functions f : Ω → R which are integrable over Ω, i.e., f is measurable and ∫_Ω |f| dx < ∞. This definition can be extended as follows: for 1 < p < ∞,

Lp(Ω) := {f : Ω → R; f is measurable and ∫_Ω |f|^p dx < ∞},

L∞(Ω) := {f : Ω → R; f is measurable and there exists C ≥ 0 such that |f(x)| ≤ C a.e. on Ω}.

It is easily seen that, for every 1 ≤ p ≤ ∞, Lp (Ω) is a linear space

over R.

Now, for 1 < p < ∞ denote by q the conjugate of p, i.e.,

1/p + 1/q = 1.

Recall the so-called Young’s inequality: for all a, b ≥ 0,

ab ≤ a^p/p + b^q/q. (3.4.12)

This inequality follows from the fact that the log function is concave on (0, ∞), so

log(a^p/p + b^q/q) ≥ (1/p) log a^p + (1/q) log b^q = log(ab).
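A quick random spot-check of (3.4.12) (the exponent p, the sample range, and the seed are arbitrary illustrative choices):

```python
import random

# Spot-check of Young's inequality ab <= a^p/p + b^q/q for conjugate p, q.
random.seed(1)
p = 2.5
q = p / (p - 1)                            # conjugate exponent: 1/p + 1/q = 1
for _ in range(1000):
    a, b = random.uniform(0, 10), random.uniform(0, 10)
    assert a * b <= a**p / p + b**q / q + 1e-9
```

Equality holds exactly when a^p = b^q, in accordance with the strict concavity of log.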

p q p q

Now, we set for 1 ≤ p < ∞

1/p

f Lp (Ω) := |f (x)|p dx ∀f ∈ Lp (Ω) ,

Ω

3.4 Lp Spaces 83

and

We are going to prove that these are norms. To this purpose, we need

the following auxiliary result which is known as Hölder’s inequality.4

Lemma 3.23 (Hölder’s Inequality). Let 1 < p < ∞. If f ∈ Lp(Ω) and g ∈ Lq(Ω), then fg ∈ L1(Ω) and

∫_Ω |fg| dx ≤ ∥f∥Lp(Ω) ∥g∥Lq(Ω). (3.4.13)

Proof. If f = 0 a.e. on Ω, then (3.4.13) is trivially satisfied, so we can assume ∥f∥Lp(Ω) > 0. By Young’s inequality we have

|fg| ≤ (1/p)|f|^p + (1/q)|g|^q a.e. on Ω.

This shows that fg ∈ L1(Ω) and

∫_Ω |fg| dx ≤ (1/p)∥f∥^p_{Lp(Ω)} + (1/q)∥g∥^q_{Lq(Ω)}.

By replacing in this inequality f by αf with α > 0, we obtain

∫_Ω |fg| dx ≤ (α^{p−1}/p)∥f∥^p_{Lp(Ω)} + (1/(qα))∥g∥^q_{Lq(Ω)},

whose right-hand side achieves its minimum for α = ∥g∥^{q/p}_{Lq(Ω)}/∥f∥_{Lp(Ω)}; thus (3.4.13) follows.
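A discrete spot-check of (3.4.13) can be made with Riemann sums on (0, 1) (the random samples and resolution N below are illustrative assumptions; Hölder's inequality for finite sums is the same statement with counting measure).

```python
import random

# Discrete spot-check of Hoelder's inequality via Riemann sums on (0, 1).
random.seed(0)
p, q = 3.0, 1.5                             # conjugate exponents: 1/p + 1/q = 1
N = 1000
f = [random.uniform(-1, 1) for _ in range(N)]
g = [random.uniform(-1, 1) for _ in range(N)]

int_fg = sum(abs(a * b) for a, b in zip(f, g)) / N          # ~ integral of |fg|
norm_f = (sum(abs(a)**p for a in f) / N) ** (1 / p)         # ~ Lp norm of f
norm_g = (sum(abs(b)**q for b in g) / N) ** (1 / q)         # ~ Lq norm of g
assert int_fg <= norm_f * norm_g + 1e-12
```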

Theorem 3.24. For every 1 ≤ p ≤ ∞, ∥ · ∥Lp(Ω) is a norm on Lp(Ω).

Proof. The result is trivial for p = 1.

Now, if f ∈ L∞(Ω), then

|f(x)| ≤ ∥f∥L∞(Ω) for a.a. x ∈ Ω. (3.4.14)

Indeed, we infer from the definition of ∥ · ∥L∞(Ω) that, for each n ∈ N, there exists a constant Cn such that

∥f∥L∞(Ω) ≤ Cn < ∥f∥L∞(Ω) + 1/n and |f(x)| ≤ Cn a.e. on Ω.

4 Otto Ludwig Hölder, German mathematician, 1859–1937.


Hence there exists a set A ⊂ Ω with m(A) = 0 and

|f(x)| ≤ Cn, x ∈ Ω \ A, n ∈ N.

As Cn → ∥f∥L∞(Ω), we derive (3.4.14) by passing to the limit in the last inequality.

Using (3.4.14) one can easily prove that ∥ · ∥L∞(Ω) is a norm on L∞(Ω).

Now, let us consider the case 1 < p < ∞. We have only to prove the triangle inequality (since the other axioms are trivially satisfied). For f, g ∈ Lp(Ω), we have

∥f + g∥^p_{Lp(Ω)} = ∫_Ω |f + g|^{p−1} |f + g| dx ≤ ∫_Ω |f + g|^{p−1} |f| dx + ∫_Ω |f + g|^{p−1} |g| dx. (3.4.15)

Applying Hölder’s inequality to each term on the right-hand side (note that |f + g|^{p−1} ∈ Lq(Ω)), we obtain

∥f + g∥^p_{Lp(Ω)} ≤ ∥f + g∥^{p−1}_{Lp(Ω)} (∥f∥Lp(Ω) + ∥g∥Lp(Ω)),

which implies the triangle inequality.

Theorem 3.25. For every 1 ≤ p ≤ ∞, (Lp(Ω), ∥ · ∥Lp(Ω)) is a Banach space.

Proof. The fact that · Lp (Ω) is a norm was shown before (see Theo-

rem 3.24). So we only need to prove that this norm is complete. We

distinguish two cases.

Case 1: 1 ≤ p < ∞. Let (fn)n∈N be a Cauchy sequence in Lp(Ω). Then there exists a subsequence (fnm)m∈N which satisfies

∥f_{n_{m+1}} − f_{n_m}∥Lp(Ω) ≤ 1/2^m, m = 1, 2, . . . (3.4.16)

Indeed, one may ﬁrst choose n1 ∈ N such that fm − fn Lp (Ω) ≤

1/2 ∀m, n ≥ n1 ; then choose n2 ∈ N, n2 ≥ n1 , such that fm −

fn Lp (Ω) ≤ 1/22 ∀m, n ≥ n2 , and so on. We are going to show that

there is a function f ∈ Lp (Ω) such that fnm −f Lp (Ω) → 0, as m → ∞.

If we show this, the initial sequence (fn ) will be convergent in Lp (Ω),


too. For simplicity, we redenote fm := fnm, so (3.4.16) becomes

∥fm+1 − fm∥Lp(Ω) ≤ 1/2^m, m = 1, 2, . . . (3.4.17)

Set

gn(x) = Σ_{i=1}^n |fi+1(x) − fi(x)|.

According to (3.4.17), we have

∥gn∥Lp(Ω) ≤ 1, n = 1, 2, . . .

By the Monotone Convergence Theorem, gn (x) converges a.e. to a

ﬁnite limit g(x), and g ∈ Lp (Ω). Now, for m ≥ n ≥ 2 and for almost

all x ∈ Ω,

|fm(x) − fn(x)| ≤ |fm(x) − fm−1(x)| + · · · + |fn+1(x) − fn(x)| = gm−1(x) − gn−1(x) ≤ g(x) − gn−1(x). (3.4.18)

It follows that for almost all x ∈ Ω, (fn (x))n∈N is Cauchy, so it con-

verges to some f (x). We also obtain for almost all x ∈ Ω

|f (x) − fn (x)| ≤ g(x), n = 2, 3, . . .

so, in particular, f ∈ Lp (Ω). As |fn −f |p → 0 a.e. on Ω and |fn −f |p ≤

g p ∈ L1 (Ω), we are in a position to apply the Dominated Convergence

Theorem to conclude that fn − f Lp (Ω) → 0.

Case 2: p = ∞. Let (fn ) be a Cauchy sequence in L∞ (Ω). So, for

any j ∈ N, there exists Nj ∈ N such that

∥fn − fm∥L∞(Ω) ≤ 1/j ∀n, m ≥ Nj.

Hence, there exists a set Mj with m(Mj ) = 0 such that

|fn(x) − fm(x)| ≤ 1/j ∀x ∈ Ω \ Mj, m, n ≥ Nj. (3.4.19)

Obviously, the set M = ∪∞ j=1 Mj has measure zero. For each x ∈

Ω \ M the sequence (fn (x)) is Cauchy and therefore convergent to

some f (x) ∈ R. Now, we deduce from (3.4.19)

|fn(x) − f(x)| ≤ 1/j ∀x ∈ Ω \ M, n ≥ Nj,


hence f ∈ L∞(Ω) and

∥fn − f∥L∞(Ω) ≤ 1/j ∀n ≥ Nj,

which shows that fn → f in L∞(Ω).

3.5 Exercises

1. A set Ω ⊂ Rk is measurable ⇐⇒ for every ε > 0 there exists a

closed set F ⊂ Ω such that m(Ω \ F ) < ε.

for every ε > 0, there exists a compact set K ⊂ Ω such that

m(Ω \ K) < ε.

3. Let A ⊂ Ω ⊂ B ⊂ Rk, with A, B measurable and m(A) = m(B) < ∞. Then Ω is measurable.

4. Show that for any h ∈ Rk, α ∈ R \ {0}, and any measurable set Ω ⊂ Rk we have
(a) Ω + h := {x + h; x ∈ Ω} is measurable and m(Ω + h) = m(Ω) (translation invariance);
(b) αΩ := {αx; x ∈ Ω} is measurable and m(αΩ) = |α|^k m(Ω).

5. Show that for f ∈ L1(Rk) := L1(Rk; R) the functions x → f(x − h), x → f(αx) belong to L1(Rk) and

∫_{Rk} f(x − h) dx = ∫_{Rk} f(x) dx, ∫_{Rk} f(αx) dx = (1/|α|^k) ∫_{Rk} f(x) dx.

6. Let f : [a, b] → R be a bounded function. If f is Riemann integrable, then f ∈ L1(a, b) := L1((a, b); R), and the two integrals coincide:

(L) ∫_a^b f(x) dx = (R) ∫_a^b f(x) dx.


7. Show that the Dirichlet function

D(x) = 1 if x ∈ Q ∩ [0, 1], D(x) = 0 if x ∈ [0, 1] \ Q,

is Lebesgue integrable (with ∫_0^1 D(x) dx = 0) but not Riemann integrable.

8. Let

fn(x) = n x^{n−1}/(1 + x), x ∈ [0, 1], n ∈ N.

Show that

lim_{n→∞} ∫_0^1 fn(x) dx = 1/2.

∫_1^∞ f(x) dx = 1.

9. Show that

lim_{n→∞} ∫_0^n (1 + x/n)^n e^{−2x} dx = 1.

10. Let f : [0, ∞) → R be continuous, with

lim_{x→∞} f(x) = a.

Show that, for any b > 0,

lim_{n→∞} ∫_0^b f(nx) dx = ab.

f(x) = 0 if x = 0, and f(x) = √n if x ∈ (1/(n + 1), 1/n], n ∈ N.


Show that f ∈ Lp(0, 1) for 1 ≤ p < 2, and f ∉ Lp(0, 1) for 2 ≤ p ≤ ∞.

12. Show that the following functions are not Lebesgue integrable:

(b) g(x) = sin x + cos x, x ∈ (0, ∞) .

13. Let f ∈ C[0, 1] := C([0, 1]; R), such that f (0) = 0, and f is

diﬀerentiable at x = 0. Then prove that g : (0, 1) → R, deﬁned

by

g(x) = x−3/2 f (x), x ∈ (0, 1) ,

belongs to L1 (0, 1).

14. If f ∈ L1(0, 1), show that ∫_0^1 x^n f(x) dx → 0 as n → ∞.

15. Let Ω ⊂ Rk be measurable with m(Ω) < ∞ and let 1 ≤ p < q ≤ ∞. Prove that Lq(Ω) ⊂ Lp(Ω) and ∥f∥Lp(Ω) ≤ m(Ω)^{1/p − 1/q} ∥f∥Lq(Ω) for all f ∈ Lq(Ω).

16. Let Ω ⊂ Rk be measurable with m(Ω) < ∞ and let f ∈ L∞(Ω). Prove that ∥f∥Lp(Ω) → ∥f∥L∞(Ω) as p → ∞.

Chapter 4

Continuous Linear Operators and Functionals

Our presentation is restricted at this stage to the space of continuous (bounded) linear operators between normed spaces. When the target space is either R or C, they are called (continuous linear) functionals and are used to define dual spaces and weak topologies.

Unless otherwise speciﬁed, this chapter only considers linear spaces

over the ﬁeld K, with K being R or C. When two or more linear

spaces are involved then all of them will be over the same ﬁeld.

4.1 Definitions, Examples, Operator Norm

We begin this section with some basic deﬁnitions.

Deﬁnition 4.1. Let X, Y be linear spaces and let A : D(A) ⊂ X → Y .

A is called a linear operator if D(A) is a linear subspace of X and

A(αx + βy) = αAx + βAy, ∀α, β ∈ K, ∀x, y ∈ D(A) .

We denote the range of A by R(A), i.e., R(A) = {Ax; x ∈ D(A)}. The

range R(A) is a linear subspace of Y .

We say that A is injective or one-to-one if N (A), the nullspace of

A, deﬁned by N (A) = {x ∈ D(A); Ax = 0}, is precisely {0}. The

operator A is called surjective or onto if R(A) = Y .



Example 1.

Let X = Rn , Y = Rm with n, m ∈ N. Let M be an m × n matrix with

real entries, then A : D(A) = X → Y deﬁned by

Au = M u ∀u = (u1 , . . . , un )T ∈ X

is a linear operator, and in fact all linear maps between these spaces

can be represented in this way. Here we consider that the elements of

both X and Y are column vectors. If m = 1 then A is a linear form

on X, as deﬁned in Chap. 1.

Example 2.

For X = Y = C[a, b] := C([a, b]; R) with −∞ < a < b < ∞, the

derivative operator Af = f′ is defined on D(A) = C¹[a, b] (which is the set of all continuously differentiable functions f : [a, b] → R), and its range is R(A) = C[a, b] = Y, so A is surjective. Note that A is not injective because its nullspace N(A) := {f ∈ D(A); Af = 0} ≠ {0} (more precisely, N(A) consists of all constant functions).

Example 3.

For X = Y = C[a, b], −∞ < a < b < ∞, the antiderivative operator (Af)(t) = ∫_a^t f(s) ds is defined on D(A) = C[a, b] = X. It is injective

because Af = 0 implies f = 0. However A is not surjective because

(Af )(a) = 0 for all f ∈ D(A) = C[a, b], and thus R(A) is a proper

subset of Y = C[a, b].

Let (X, ∥ · ∥X), (Y, ∥ · ∥Y) be normed spaces and let A : X → Y be a linear operator. Then the following are equivalent:

1. A is continuous on X;

2. A is continuous at x = 0.

Proof. An exercise.

If X and Y are finite dimensional linear spaces, they can be equipped with norms, and every linear operator between the two spaces is continuous (prove it!). In fact, any such operator can be represented by a matrix, as in Example 1, after fixing bases in the two spaces. So continuity of linear operators is interesting only in the case of infinite dimensional linear spaces.

Remark 4.4. A linear operator A : D(A) ⊂ X → Y is said to be bounded if

sup{∥Ax∥Y ; x ∈ D(A), ∥x∥X ≤ 1} < ∞. (4.1.1)

Obviously, any continuous linear operator from (X, ∥ · ∥X) to (Y, ∥ · ∥Y) is bounded. Conversely, if A : D(A) ⊂ X → Y is a bounded linear operator, then denoting by ĉ the supremum in (4.1.1) we have

∥Ax∥Y ≤ ĉ ∥x∥X ∀x ∈ D(A),

hence A is continuous. That is why continuous linear operators are also called bounded.

Note that if A is a continuous (bounded) linear operator from (D(A), ∥ · ∥X) to (Y, ∥ · ∥Y), then A can be extended by continuity to a continuous linear operator A1 : D(A1) = X1 → Y1, where X1, Y1 denote the completions of D(A) and Y with respect to ∥ · ∥X and ∥ · ∥Y, respectively.

Denote by L(X, Y) the set of all continuous (bounded) linear operators A : X → Y, equipped with the so-called operator norm

∥A∥ := sup{∥Ax∥Y ; x ∈ X, ∥x∥X ≤ 1}.

Clearly, we have ∥Ax∥Y ≤ ∥A∥ · ∥x∥X for all x ∈ X, and if A ∈ L(X, Y) and B ∈ L(Z, X), then AB ∈ L(Z, Y) and ∥AB∥ ≤ ∥A∥ · ∥B∥.


For example, let X = C[0, 1] with the sup-norm. The antiderivative operator A : X → X, (Af)(t) = ∫₀ᵗ f(s) ds, t ∈ [0, 1], f ∈ X, is linear and continuous (hence bounded) with ‖A‖ = 1.

On the other hand, for the same space X, the derivative operator B : D(B) = C¹[0, 1] ⊂ X → X, Bf = f′, is linear but unbounded, because for fn(t) = tⁿ, t ∈ [0, 1], n ∈ N, we have ‖fn‖ = 1, while ‖Bfn‖ = n → ∞.
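This unboundedness is easy to observe numerically: the sup-norms of fn(t) = tⁿ stay at 1 while those of the derivatives grow like n. A small sketch, sampling sup-norms on a grid:

```python
# Sketch: on C[0,1] with the sup-norm, f_n(t) = t^n has ||f_n|| = 1,
# while its derivative f_n'(t) = n t^(n-1) has ||f_n'|| = n,
# so the derivative operator is unbounded.

def sup_norm(g, n_pts=2001):
    return max(abs(g(i / (n_pts - 1))) for i in range(n_pts))

for n in (1, 5, 25):
    f = lambda t, n=n: t ** n
    df = lambda t, n=n: n * t ** (n - 1)
    assert abs(sup_norm(f) - 1.0) < 1e-9       # ||f_n|| = 1 (attained at t = 1)
    assert abs(sup_norm(df) - n) < 1e-9        # ||B f_n|| = n -> ∞
```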

Remark 4.5. If X ≠ {0}, then

‖A‖ = sup{‖Au‖Y ; u ∈ X, ‖u‖X = 1}.  (4.1.3)

Indeed, denote by a the right-hand side of (4.1.3). Since the unit sphere is contained in the closed unit ball of X, we have

a ≤ ‖A‖.  (4.1.4)

On the other hand, from

‖A(‖u‖X⁻¹ u)‖Y ≤ a ∀u ∈ X \ {0}

we derive

‖Au‖Y ≤ a‖u‖X ∀u ∈ X.  (4.1.5)

By taking the supremum in (4.1.5) over all u ∈ X, ‖u‖X ≤ 1, we find ‖A‖ ≤ a, which combined with (4.1.4) proves (4.1.3).
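Formula (4.1.3) suggests a crude numerical estimate of ‖A‖: sample the unit sphere and take the largest ‖Au‖Y. The sketch below does this for an arbitrarily chosen 2×2 diagonal matrix on R² with the Euclidean norm, where the exact operator norm is the largest diagonal entry in absolute value:

```python
import math

# Sketch: estimate ||A|| = sup{ ||Au|| : ||u|| = 1 } by sampling the
# unit circle. The diagonal matrix below is an arbitrary choice whose
# Euclidean operator norm is exactly 3.

A = [[3.0, 0.0],
     [0.0, 1.0]]

def norm2(v):
    return math.sqrt(sum(c * c for c in v))

def apply(A, u):
    return [sum(a * c for a, c in zip(row, u)) for row in A]

estimate = max(
    norm2(apply(A, [math.cos(th), math.sin(th)]))
    for th in (2 * math.pi * i / 10_000 for i in range(10_000))
)
assert abs(estimate - 3.0) < 1e-6   # sup attained at u = (±1, 0)
```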

Theorem 4.6. If (X, ‖·‖X) is a normed space and (Y, ‖·‖Y) is a Banach space, then L(X, Y) is a Banach space with respect to the operator norm.

Proof. We only need to show that L(X, Y) is complete. For the sake of simplicity we redenote by ‖·‖ both the norms ‖·‖X and ‖·‖Y. Consider a Cauchy sequence (An) in L(X, Y), i.e.,

∀ε > 0 ∃Nε ∈ N such that ‖An − Am‖ ≤ ε ∀n, m > Nε.

Then, for every v ∈ X, (An v) is a Cauchy sequence in Y, hence convergent. Define A : X → Y, Av = limn→∞ An v; because each An is linear, A is as well. Letting m → ∞ in ‖An v − Am v‖ ≤ ε‖v‖ yields

‖An v − Av‖ ≤ ε‖v‖ ∀v ∈ X, ∀n > Nε.

Since for all v ∈ X

‖Av‖ ≤ ‖Av − A_{Nε+1} v‖ + ‖A_{Nε+1} v‖ ≤ ε‖v‖ + ‖A_{Nε+1}‖·‖v‖ = (ε + ‖A_{Nε+1}‖)‖v‖,

A is bounded, i.e., A ∈ L(X, Y), and

‖An − A‖ ≤ ε ∀n > Nε,

which implies that An → A in L(X, Y).

4.2 Main Principles of Functional Analysis

In this section we present some important principles of Functional

Analysis: the Uniform Boundedness Principle, the Open Mapping

Theorem, and the Closed Graph Theorem. We begin with the Uniform

Boundedness Principle, which was proven by Banach and Steinhaus.1

Theorem 4.7 (Uniform Boundedness Principle). Let (X, ‖·‖X) and (Y, ‖·‖Y) be Banach spaces and let {Ti}i∈I ⊂ L(X, Y) be a collection of operators satisfying

sup_{i∈I} ‖Ti x‖Y < ∞ ∀x ∈ X.  (4.2.6)

Then,

sup_{i∈I} ‖Ti‖ < ∞.  (4.2.7)

Proof. Denote

Xn = {x ∈ X; sup_{i∈I} ‖Ti x‖Y ≤ n}, n ∈ N.

Each Xn is a closed subset of X, and by (4.2.6),

X = ⋃_{n=1}^{∞} Xn.

By Baire's Theorem there exists n₀ ∈ N such that Int Xn₀ ≠ ∅, i.e., there is a ball B(x₀, r₀) ⊂ Xn₀, r₀ > 0. Hence,

‖Ti(x₀ + r₀z)‖Y ≤ n₀ ∀z ∈ B(0, 1), ∀i ∈ I,

which implies

r₀‖Ti‖ ≤ n₀ + ‖Ti x₀‖Y ∀i ∈ I.

This shows that (4.2.7) holds true (see also (4.2.6)).

¹Hugo Steinhaus, Polish mathematician, 1887–1972.

Theorem 4.8 (Open Mapping Theorem). Let (X, ‖·‖X), (Y, ‖·‖Y) be Banach spaces. If A : D(A) ⊂ X → Y is a linear, continuous, and surjective operator, then A maps open sets in X to open sets in Y.

Proof. It suffices to prove that there exists a constant r > 0 such that

BY(0, r) ⊂ A(BX(0, 1)),  (4.2.8)

where BX(0, 1), BY(0, r) denote the open balls in X and Y centered at 0 with radii 1 and r, respectively. In order to prove (4.2.8) we shall first show the existence of a constant r₁ > 0 such that

BY(0, r₁) ⊂ Cl A(BX(0, 1)).  (4.2.9)

Denote Yn = n Cl A(BX(0, 1)), n ∈ N. Since A is surjective, we have Y = ⋃_{n∈N} Yn. By Baire's Theorem (Theorem 2.10), Int Yn₀ ≠ ∅ for some n₀ ∈ N, hence Int Cl A(BX(0, 1)) ≠ ∅. So, for some y₀ ∈ Y and some r₁ > 0, we have

BY(y₀, 2r₁) ⊂ Cl A(BX(0, 1)).  (4.2.10)

Adding the fact that −y0 ∈ Cl A(BX (0, 1)) to (4.2.10), we obtain

BY (0, 2r1 ) ⊂ Cl A(BX (0, 1)) + Cl A(BX (0, 1))

= 2 Cl A(BX (0, 1))


(since Cl A(BX (0, 1)) is a convex set), hence (4.2.9) holds true.

Now we are going to prove (4.2.8) by using (4.2.9) with r₁ = 2r, i.e.,

BY(0, 2r) ⊂ Cl A(BX(0, 1)).  (4.2.11)

From (4.2.11) we deduce, by scaling, that for every ε > 0

BY(0, 2rε) ⊂ Cl A(BX(0, ε)).  (4.2.12)

Let y ∈ BY(0, r). Choosing ε = 1/2 in (4.2.12), we can find some v₁ ∈ BX(0, 1/2) such that

‖y − Av₁‖Y < r/2.

Now choosing y − Av₁ instead of y and ε = 1/2² in (4.2.12), we can find some v₂ ∈ BX(0, 1/2²) with

‖(y − Av₁) − Av₂‖Y < r/2².

Continuing the process we find vn ∈ BX(0, 1/2ⁿ) such that

‖y − A(v₁ + v₂ + · · · + vn)‖Y < r/2ⁿ.  (4.2.13)

Obviously, xn = v₁ + v₂ + · · · + vn defines a Cauchy sequence in X, hence xn converges to some x ∈ X with ‖x‖X < 1 and y = Ax, since A ∈ L(X, Y) (see (4.2.13)). As y was an arbitrary vector in BY(0, r), the proof of (4.2.8) is complete.

Remark 4.9. If A ∈ L(X, Y) is bijective, then A⁻¹ ∈ L(Y, X). This follows from (4.2.8).

Theorem 4.10 (Closed Graph Theorem). Let (X, · X ), (Y, · Y )

be Banach spaces. If A : X → Y is a linear operator and its graph

G(A) := {(x, Ax); x ∈ X} is closed in X × Y (in other words, A is a

closed operator), then A ∈ L(X, Y ).

Proof. Define on X the norm

‖x‖A = ‖x‖X + ‖Ax‖Y, x ∈ X,

which is called the graph norm. Since G(A) is a closed set in (X, ‖·‖X) × (Y, ‖·‖Y), it follows that (X, ‖·‖A) is a Banach space. Obviously,

‖x‖X ≤ ‖x‖A ∀x ∈ X,

so the identity operator I : (X, ‖·‖A) → (X, ‖·‖X) is linear, continuous, and bijective. By Remark 4.9, its inverse I⁻¹ = I ∈ L((X, ‖·‖X), (X, ‖·‖A)), i.e., there exists a constant C > 0 such that

‖x‖A ≤ C‖x‖X ∀x ∈ X.

In particular,

‖Ax‖Y ≤ C‖x‖X ∀x ∈ X,

which means A is continuous from (X, ‖·‖X) to (Y, ‖·‖Y).

4.3 Compact Operators

If X, Y are normed spaces and A : X → Y is a linear operator, then A is called compact or completely continuous if A takes bounded sets of X into relatively compact subsets of Y.

Example.

Let X = Y = C[a, b], −∞ < a < b < +∞, equipped with the usual

sup-norm, and let A : X → X be defined by

(Af)(t) = ∫ₐᵇ k(t, s)f(s) ds ∀f ∈ X, ∀t ∈ [a, b],

where k : [a, b] × [a, b] → R is a continuous function.

Obviously A is a linear operator. Moreover, it follows from the Arzelà–Ascoli Criterion that A is a compact operator. The key argument here is that the equicontinuity condition is a consequence of the uniform continuity of k.
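As a concrete sanity check (not from the book), take the kernel k(t, s) = t·s on [0, 1]² and f ≡ 1; then (Af)(t) = t ∫₀¹ s ds = t/2, which a discretized version of the operator reproduces:

```python
# Sketch: discretize (Af)(t) = ∫_0^1 k(t, s) f(s) ds with the kernel
# k(t, s) = t*s (an arbitrary continuous choice) and f ≡ 1, for which
# the exact result is (Af)(t) = t/2.

def integral_op(k, f, t, n=20_000):
    """Trapezoidal approximation of ∫_0^1 k(t, s) f(s) ds."""
    h = 1.0 / n
    g = lambda s: k(t, s) * f(s)
    return h * (0.5 * (g(0.0) + g(1.0)) + sum(g(i * h) for i in range(1, n)))

k = lambda t, s: t * s
f = lambda s: 1.0
for t in (0.0, 0.3, 1.0):
    assert abs(integral_op(k, f, t) - t / 2) < 1e-6
```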

Denote by K(X, Y) the set of all compact linear operators from X into Y. We have the following theorem.

Theorem 4.11. If X is a normed space and Y is a Banach space, then K(X, Y) is a closed linear subspace of L(X, Y), i.e., K(X, Y) is a Banach space with respect to the operator norm (see Theorem 4.6).

Proof. It is easily seen that K(X, Y) is a linear subspace of L(X, Y). Let (An) be a sequence in K(X, Y) which converges to some A ∈ L(X, Y), namely ‖An − A‖ → 0. So, for ε > 0 there exists m ∈ N sufficiently large such that

‖Am − A‖ < ε/(3r).  (4.3.14)

Let (xn) be a sequence in the ball B(0, r) ⊂ X, where r > 0 is arbitrary but fixed. Since Am is compact, there exists a subsequence of (xn), say (xnk)k≥1, such that (Am xnk)k≥1 is convergent, hence Cauchy. Thus, for any ε > 0 (which can be the same as above), there exists N ∈ N such that

‖Am xnk − Am xnj‖ < ε/3 ∀k, j > N.  (4.3.15)

Using (4.3.14) and (4.3.15) we deduce

‖Axnk − Axnj‖ ≤ ‖Axnk − Am xnk‖ + ‖Am xnk − Am xnj‖ + ‖Am xnj − Axnj‖
             ≤ ‖A − Am‖·‖xnk‖ + ‖Am xnk − Am xnj‖ + ‖Am − A‖·‖xnj‖
             < r·ε/(3r) + ε/3 + r·ε/(3r) = ε,

in other words, (Axnk) is Cauchy, hence convergent, and therefore A ∈ K(X, Y).

Remark 4.12. If A ∈ K(X, Y), where X is a normed space and Y is a Hilbert space (see Chap. 6), then there exists a sequence (An)n≥1 in L(X, Y) such that the range of An is finite dimensional (hence An is compact) for all n ≥ 1 and ‖An − A‖ → 0. For the proof of this nice result see Brezis² [6, Remark 1, pp. 157–158].

4.4 Linear Functionals, Dual Spaces, Weak Topologies

We begin this section by defining the important concept of a dual space.

Deﬁnition 4.13. Let (X, · ) be a normed space. Deﬁne the dual

of X, denoted X ∗ , by

X ∗ = {f : X → K; f is linear and continuous },

²Haim Brezis, French mathematician, born 1944.


Since (K, |·|) is a Banach space, X* is also a Banach space with respect to

‖f‖ = sup{|f(v)|; v ∈ X, ‖v‖ ≤ 1}.

By definition,

|f(v)| ≤ ‖f‖·‖v‖ ∀v ∈ X, ∀f ∈ X*.

Example 1.

Let X be the linear space of all sequences of real numbers (xn)n≥1 satisfying

Σ_{n=1}^{∞} |xn| < ∞.

X is usually denoted by l¹ and is a Banach space (over R) with respect to the norm

‖(xn)‖ = Σ_{n=1}^{∞} |xn|.

See Exercise 2.19.

It is easily seen that any functional f ∈ X* has the form

f((xn)) = Σ_{n=1}^{∞} an xn,

where (an)n≥1 is a bounded real sequence. The space of all bounded real sequences is denoted by l∞ and is a Banach space with the norm

‖(an)‖∞ = sup_{n≥1} |an|.
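The bound behind this duality, |f((xn))| ≤ (sup_n |an|) · Σ_n |xn|, is easy to check on truncated sequences; a small sketch with arbitrarily chosen data:

```python
# Sketch: for x ∈ l¹ and a ∈ l∞ (truncated here to finitely many terms),
# |Σ a_n x_n| ≤ (sup_n |a_n|) · Σ |x_n|.  The sequences are arbitrary choices.

x = [1.0, -0.5, 0.25, -0.125, 0.0625]      # absolutely summable
a = [0.3, -2.0, 1.5, 0.0, 0.7]             # bounded

f_x = sum(ai * xi for ai, xi in zip(a, x))
bound = max(abs(ai) for ai in a) * sum(abs(xi) for xi in x)
assert abs(f_x) <= bound + 1e-12
```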

Example 2.

Let X = C[a, b], −∞ < a < b < +∞, with the sup-norm, denoted

‖·‖. For a fixed v ∈ X define f : X → R by

f(u) = ∫ₐᵇ u(t)v(t) dt ∀u ∈ X.

Clearly, f is linear and

|f(u)| ≤ (b − a)‖v‖·‖u‖ ∀u ∈ X,

and therefore f ∈ X*.

Consider now the same linear space X, equipped with another norm, namely the L²-norm, and the same functional f, which can be expressed as the scalar product

f(u) = (u, v)L²(a,b) ∀u ∈ X.

Again, f is linear and by the Bunyakovsky–Cauchy–Schwarz inequality

|f(u)| ≤ ‖v‖L²(a,b) · ‖u‖L²(a,b) ∀u ∈ X,

so f ∈ (X, ‖·‖L²(a,b))*.

Question: Given f ∈ (X, ‖·‖L²(a,b))*, does there exist v ∈ X = C[a, b] such that f(u) = (u, v)L²(a,b) for all u ∈ X? We shall show later (Theorem 6.10) that there exists such a v in L²(a, b), but not necessarily in X = C[a, b].

The celebrated Hahn³–Banach Theorem below concerns the extension of linear (not necessarily continuous) R-valued functionals.

Theorem (Hahn–Banach). Let X be a real linear space and let p : X → R be a map which satisfies

p(x + y) ≤ p(x) + p(y) ∀x, y ∈ X,
p(λx) = λp(x) ∀λ > 0, ∀x ∈ X.

If Y is a linear subspace of X and f : Y → R is a linear functional satisfying

f(x) ≤ p(x) ∀x ∈ Y,

then there exists a linear functional g : X → R such that

g(x) = f(x) ∀x ∈ Y,
g(x) ≤ p(x) ∀x ∈ X.

Proof. For a linear functional h, denote by D(h) its domain of definition, a linear subspace of X. Consider the collection E of all linear extensions of f in the above sense, i.e., h ∈ E if and only if D(h) is a linear subspace of X, Y ⊂ D(h), h is linear, h extends f, and h(x) ≤ p(x) ∀x ∈ D(h).

Clearly f ∈ E, so E is nonempty. Define on E the order relation

h₁ ≼ h₂ ⟺ D(h₁) ⊂ D(h₂) and h₂(x) = h₁(x) ∀x ∈ D(h₁).

³Hans Hahn, Austrian mathematician, 1879–1934.

Let G be a totally ordered subset of E and consider the functional h defined by D(h) = ⋃_{g∈G} D(g), h(x) = g(x) for x ∈ D(g), g ∈ G. This h is well defined and is an upper bound for G. Hence E is inductive, so by Zorn's Lemma E has a maximal element g ∈ E.

To complete the proof let us show that D(g) = X. Assume by contradiction that this is not the case, so there exists x₀ ∈ X \ D(g). Consider Z = Span({x₀} ∪ D(g)), and define on Z a linear functional g̃ of the form

g̃(tx₀ + x) = αt + g(x), t ∈ R, x ∈ D(g),

where α is a real parameter. We shall prove that there exists an α such that g̃ ∈ E, i.e.,

g̃(tx₀ + x) ≤ p(tx₀ + x) ∀t ∈ R, ∀x ∈ D(g).  (4.4.16)

In particular, (4.4.16) requires

g(y) − α ≤ p(y − x₀) ∀y ∈ D(g),
α + g(x) ≤ p(x + x₀) ∀x ∈ D(g),

which is equivalent to

sup_{y∈D(g)} [g(y) − p(y − x₀)] ≤ α ≤ inf_{x∈D(g)} [p(x + x₀) − g(x)].

Such an α exists, since for all x, y ∈ D(g)

g(y) − p(y − x₀) ≤ p(x + x₀) − g(x) ⟺ g(x + y) ≤ p(x + x₀) + p(y − x₀),

and the latter holds because g(x + y) ≤ p(x + y) ≤ p(x + x₀) + p(y − x₀). Conversely, by the positive homogeneity of p, these inequalities imply (4.4.16) for every t ∈ R, so g̃ with this α satisfies (4.4.16), hence g̃ ∈ E. But g̃ is a proper extension of g (since D(g) is a proper subset of D(g̃) = Z) and this contradicts the maximality of g.

Theorem 4.15. Let (X, ‖·‖) be a normed space and let Y be a linear subspace of X. If f ∈ Y* := (Y, ‖·‖)*, then there exists an extension g of f such that g ∈ X* := (X, ‖·‖)* and

‖g‖X* = ‖f‖Y*.

Proof. If K = R, we apply the Hahn–Banach Theorem with p(x) = ‖f‖Y*‖x‖ to derive the existence of a linear extension g : X → R satisfying

g(x) ≤ ‖f‖Y*‖x‖ ∀x ∈ X.

Since −g(x) = g(−x) satisfies a similar inequality, we have g ∈ X* and

‖g‖X* ≤ ‖f‖Y*.

Obviously, the converse inequality is also satisfied, so ‖g‖X* = ‖f‖Y*.

If K = C, define

q(x) := Re f(x) ∀x ∈ Y.

Then,

f(x) = q(x) − iq(ix) ∀x ∈ Y,

and

|q(x)| ≤ ‖f‖Y*‖x‖ ∀x ∈ Y.  (4.4.17)

Now, if we regard X, Y as real linear spaces and take into account

(4.4.17), we deduce from the ﬁrst part of the proof the existence of a

continuous linear functional h : X → R which extends q and satisfies

|h(x)| ≤ ‖f‖Y*‖x‖ ∀x ∈ X.  (4.4.18)

Set

g(x) = h(x) − ih(ix), x ∈ X.

Functional g : X → C is an extension of f and is linear on the complex space X. Let us prove that

|g(x)| ≤ ‖f‖Y*‖x‖ ∀x ∈ X.

Indeed, writing g(x) = re^{iθ} with r ≥ 0, θ ∈ R, we have

|g(x)| = r = Re(e^{−iθ}g(x)) = Re g(e^{−iθ}x) = h(e^{−iθ}x) ≤ ‖f‖Y*‖e^{−iθ}x‖ = ‖f‖Y*‖x‖,

by (4.4.18). Since the converse inequality is trivially satisfied, we have ‖g‖X* = ‖f‖Y*.

Remark. The Hahn–Banach Theorem thus extends to the complex case K = C by a similar procedure.

Corollary 4.17. Let (X, ‖·‖) be a normed space. For every x₀ ∈ X \ {0} there exists a functional g ∈ X* such that ‖g‖X* = 1 and g(x₀) = ‖x₀‖.

Proof. Let Y = Span{x₀} = {tx₀; t ∈ K} and let f : Y → K be defined by

f(x) = t‖x₀‖ for x = tx₀, t ∈ K.

Then ‖f‖Y* = 1, and any norm-preserving extension g ∈ X* of f (which exists by the previous theorem) satisfies ‖g‖X* = 1 and g(x₀) = ‖x₀‖.

Consequently, for every x ∈ X we have

‖x‖ = sup{|f(x)|; f ∈ X*, ‖f‖X* ≤ 1}.  (4.4.19)

Indeed, let a denote the right-hand side of (4.4.19). Clearly, a ≤ ‖x‖. In fact, a = ‖x‖ by virtue of Corollary 4.17.

For each x ∈ X define

J(x) = {x* ∈ X*; x*(x) = ‖x‖² = ‖x*‖²X*},

which is nonempty by Corollary 4.17. In general, J(x) is not a singleton, but there are cases when this happens for all x ∈ X (e.g., if X is a Hilbert space, as will be shown later). The set-valued map x ↦ J(x) is called the duality map from X to X*.

Recall that, given a normed space (X, ‖·‖), the strong (norm) topology of X is the metric topology generated by d(x, y) = ‖x − y‖ for

x, y ∈ X. In fact, we can consider that X is a Banach space (in other

words, · is complete, or d is complete), otherwise we can use the

completion procedure (see Theorem 2.8) to reach this framework.

The weak topology w on X is defined by the basis of neighborhoods of the origin of the form

V(x*₁, . . . , x*m; ε) = {x ∈ X; |x*i(x)| < ε, i = 1, . . . , m},

for all finite systems of functionals {x*₁, x*₂, . . . , x*m} ⊂ X* and for all ε > 0.

We write xn →w x or xn ⇀ x to mean convergence in the weak topology, i.e., x*(xn) → x*(x) for all x* ∈ X*.

Remark 4.21. If xn → x strongly, i.e., ‖xn − x‖ → 0, then xn ⇀ x. Indeed, for all x* ∈ X*,

|x*(xn) − x*(x)| ≤ ‖x*‖·‖xn − x‖,

which tends to 0. The converse is not true in general, and we shall see

some examples later.

However, if X is ﬁnite dimensional then strong and weak convergence

are equivalent. Indeed, by choosing particular functionals, one can see

that weak convergence reduces to convergence on coordinates.
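To preview one of those infinite-dimensional examples (an assumption-level illustration, worked out later in the book): in l² the unit vectors en converge weakly to 0 while ‖en‖ = 1, since pairing with any fixed y ∈ l² just reads off the n-th coordinate yn → 0:

```python
# Sketch (a preview, not from this section): in l², e_n ⇀ 0 because
# <e_n, y> = y_n -> 0 for every y ∈ l², although ||e_n|| = 1 for all n.
# Here y is an arbitrary square-summable sequence, truncated for the demo.

y = [1.0 / (j + 1) for j in range(1000)]      # y_j = 1/(j+1), in l²

def pair_with_e(n, y):
    """<e_n, y> picks out the n-th coordinate of y."""
    return y[n]

values = [abs(pair_with_e(n, y)) for n in (0, 9, 99, 999)]
assert values == sorted(values, reverse=True)  # |<e_n, y>| decreases here
assert values[-1] < 1e-2                       # and tends to 0 as n grows
```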

Definition 4.22. In X*, besides the strong topology and the weak topology, defined by means of functionals from X** := (X*)* (the bidual of X), we have the so-called weak-star topology w*, starting from another neighborhood basis consisting of

V(x₁, . . . , xm; ε) = {x* ∈ X*; |x*(xi)| < ε, i = 1, . . . , m},

for all finite systems {x₁, x₂, . . . , xm} ⊂ X, and for all ε > 0. So convergence x*n →w* x* means x*n(x) → x*(x) for all x ∈ X, i.e., pointwise convergence for a sequence of functionals. In general this is different from w-convergence.

In general X is embedded into X**, which is to say that there is an injection i : x ↦ fx defined by fx(x*) = x*(x) for all x* ∈ X*. Clearly, i ∈ L(X, X**) since

|fx(x*)| ≤ ‖x*‖·‖x‖.

Moreover, using Corollary 4.17, we see that i is an isometry.

If i : X → X ∗∗ is onto (surjective), then X is said to be reﬂexive. In

particular Hilbert spaces are reﬂexive, as will be shown later.

Remark 4.23. It is easily seen that if X is reﬂexive then w = w∗ on

X ∗.


4.5 Exercises

1. Let X, Y be linear spaces. Find a necessary and suﬃcient condi-

tion for a subset G ⊂ X × Y to be the graph of a linear operator

from X into Y .

operator satisfying the condition

3. Let −∞ < a < b < +∞. Find the operator norm of A ∈ L(X)

given by

(Af )(t) = tf (t), t ∈ [a, b], f ∈ X,

when

(i) X = C[a, b] with the sup-norm;

(ii) X = Lp (a, b), with the usual norm, for some 1 ≤ p < ∞.

4. Let X = C[a, b], where −∞ < a < b < +∞. Assume that X is equipped with the usual sup-norm and consider the operator A defined by

(Af)(t) = ∫ₐᵗ g(s)f(s) ds, f ∈ X, t ∈ [a, b],

where g ∈ L¹(a, b), g(s) ≥ 0 for almost all s ∈ (a, b). Show that A is a compact linear operator from X into itself (i.e., A ∈ K(X) ⊂ L(X)) and calculate ‖A‖.

operator A : X → Y is continuous if and only if the following

implication holds

6. Let (X, ‖·‖X) be a normed space and let (Y, ‖·‖Y) be a Banach space. Show that, for any sequence (An)n∈N in L(X, Y) satisfying ‖An‖ ≤ an ∀n ∈ N with Σ_{n=1}^{∞} an < ∞, the series Σ_{n=1}^{∞} An is convergent in L(X, Y).


7. Let (X, ‖·‖) be a Banach space. Show that:

(i) for all A ∈ L(X) the series

I + (1/1!)A + (1/2!)A² + · · · + (1/n!)Aⁿ + · · ·

is convergent in L(X) with its usual operator norm (the sum of the series being denoted e^A). Here I denotes the identity operator on X.

(ii) for all A ∈ L(X) with ‖A‖ < 1, I − A is invertible and (I − A)⁻¹ ∈ L(X).
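For part (ii), the inverse is given by the Neumann series (I − A)⁻¹ = Σ_{n≥0} Aⁿ, which converges because ‖Aⁿ‖ ≤ ‖A‖ⁿ. A numerical sketch with an arbitrarily chosen 2×2 matrix of small norm:

```python
# Sketch: for ||A|| < 1, the partial sums I + A + A² + ... approximate
# (I - A)^{-1}.  We verify (I - A) · S_N ≈ I for the arbitrary matrix below.

A = [[0.2, 0.1],
     [0.0, 0.3]]                     # small enough that the series converges

def mat_mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(P, Q):
    return [[P[i][j] + Q[i][j] for j in range(2)] for i in range(2)]

I = [[1.0, 0.0], [0.0, 1.0]]
S, power = I, I
for _ in range(60):                   # S_N = Σ_{n=0}^{N} A^n
    power = mat_mul(power, A)
    S = mat_add(S, power)

IminusA = [[1.0 - 0.2, -0.1], [0.0, 1.0 - 0.3]]
check = mat_mul(IminusA, S)           # should be ≈ identity
assert all(abs(check[i][j] - I[i][j]) < 1e-9 for i in range(2) for j in range(2))
```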

8. Let (X, ‖·‖) be a Banach space. Show that for every pair of operators A, B ∈ L(X) that commute (i.e., AB = BA) one has e^A e^B = e^{A+B} (for the notation see the previous exercise).

9. Let (X, · X ), (Y, · Y ) be Banach spaces. Let (Tn )n∈N be a

sequence in L(X, Y ) which is pointwise convergent, i.e.,

∀x ∈ X ∃yx ∈ Y such that Tn x − yx Y → 0.

Deﬁne T : X → Y by T x = yx , x ∈ X. Show that

(a) (Tn)n∈N is bounded in L(X, Y);

(b) T ∈ L(X, Y);

(c) ‖T‖ ≤ lim inf_{n→∞} ‖Tn‖.

10. Let (X, · ) be a Banach space and let S be a nonempty subset

of X such that for all f ∈ X ∗ the set

f(S) = {f(x); x ∈ S} is bounded in (K, | · |).

Prove that S is bounded in (X, ‖·‖).

11. Let X be a Banach space and let A : X → X ∗ be a linear

operator satisfying

(Ax)(y) = (Ay)(x) ∀x, y ∈ X.

Show that A is a continuous operator, i.e. A ∈ L(X, X ∗ ).

12. Let (X, · X ), (Y, · Y ) be Banach spaces. If A : D(A) ⊂ X →

Y is a linear closed operator with D(A) closed in (X, ‖·‖X), then prove there exists a constant C > 0 such that

‖Ax‖Y ≤ C‖x‖X ∀x ∈ D(A).


13. Let X be a Banach space and let A : X → X* be a linear operator satisfying

(Ax)(x) ≥ 0 ∀x ∈ X.

Show that A ∈ L(X, X*).

14. Let X be a linear space, equipped with two norms, ‖·‖₁ and ‖·‖₂, such that X is a Banach space with respect to both norms. Assume there exists a constant C > 0 such that

‖x‖₂ ≤ C‖x‖₁ ∀x ∈ X.

Show that the two norms are equivalent, i.e., there is also a constant C′ > 0 such that ‖x‖₁ ≤ C′‖x‖₂ ∀x ∈ X.

15. Let X be a finite dimensional linear space over K and let {u₁, u₂, . . . , un} be a basis in X. For any linear functional f : X → K we have

f(u) = Σ_{i=1}^{n} αi fi ∀u = Σ_{i=1}^{n} αi ui ∈ X,

where fi := f(ui), i = 1, . . . , n. Show that f is continuous with respect to any norm of X, i.e., f ∈ X*. Compute the norm of f, ‖f‖X*, explicitly, in terms of the fi's, when the norm of X is defined by

(i) ‖u‖∞ = max_{1≤i≤n} |αi| ∀u = Σ_{i=1}^{n} αi ui ∈ X;

(ii) ‖u‖₁ = Σ_{i=1}^{n} |αi| ∀u = Σ_{i=1}^{n} αi ui ∈ X;

(iii) ‖u‖p = (Σ_{i=1}^{n} |αi|^p)^{1/p} ∀u = Σ_{i=1}^{n} αi ui ∈ X, where p ∈ (1, ∞).

16. Let X = {u ∈ C[0, 1]; u(0) = 0} with the usual sup-norm. Let f : X → R be defined by

f(u) = ∫₀¹ u(s) ds ∀u ∈ X.

Show that f ∈ X* and compute ‖f‖X*. Is the norm of f attained, i.e., does there exist u ∈ X such that ‖u‖sup = 1 and f(u) = ‖f‖X*?

Chapter 5
Distributions, Sobolev Spaces

In this chapter we ﬁrst present test functions, which are then used

to introduce scalar distributions. The space D′(Ω) of distributions is analyzed in detail and some related applications are discussed: the interpretation of the density of a mass concentrated at a point by means of the Dirac distribution, solving the Poisson equation in D′(Ω), solving ordinary differential equations in D′(R), solving the equation

of the vibrating string with non-smooth initial displacement function,

and the boundary controllability for a problem associated with the

same wave equation. We also introduce and discuss Sobolev spaces.

In order to introduce vector distributions we shall present in a separate

section the Bochner integral for vector functions. Vector distributions

and W k,p (a, b; X) spaces are then presented. These will later be used

in solving problems associated with parabolic and hyperbolic PDE’s.

5.1 Test Functions

Let Ω ⊂ Rᵏ be a nonempty open set in Rᵏ (which is equipped with

the usual topology).

For u : Ω → R deﬁne the support of u by

supp u = Cl{x ∈ Ω; u(x) ≠ 0}.

For a given m ∈ N, let Cᵐ(Ω) denote the set of all functions u : Ω → R such that u, and all its n-th order partial derivatives, 1 ≤ n ≤ m, exist and are continuous on Ω. Denote C₀ᵐ(Ω) = {u ∈ Cᵐ(Ω); supp u is a

G. Moroşanu, Functional Analysis for the Applied Sciences,

Universitext, https://doi.org/10.1007/978-3-030-27153-4 5


compact (bounded) set ⊂ Ω}. For m = ∞ extend the deﬁnitions

above in the obvious way. The elements (functions) in C0∞ (Ω) are

called test functions since they serve as arguments of distributions

that will be deﬁned later.

A typical example of a test function is φ : Ω = Rᵏ → R defined by

φ(x) = exp(1/(‖x‖₂² − 1)) if ‖x‖₂ < 1,  φ(x) = 0 if ‖x‖₂ ≥ 1.

This function is infinitely differentiable (check it!) and so φ ∈ C₀^∞(Rᵏ) with supp φ = Cl B(0, 1). For later use we also

deﬁne

ω(x) = Cφ(x) with C > 0 such that ∫_{Rᵏ} ω(x) dx = 1.  (5.1.1)

Obviously, C0∞ (Ω) is a real linear space with respect to the usual op-

erations (addition of functions and scalar multiplication).

In what follows, we introduce the usual topology on C0∞ (Ω). To this

purpose, we must ﬁrst discuss a few important concepts.

Let X be a linear space over K (as usual, K is either R or C). A function p : X → R is called a seminorm if the following conditions are satisfied:

p(x + y) ≤ p(x) + p(y) ∀x, y ∈ X,
p(αx) = |α|p(x) ∀α ∈ K, ∀x ∈ X.

It follows that p(x) ≥ 0 for all x ∈ X, but p(x) = 0 for some x ≠ 0 is not excluded. We also have

|p(x₁) − p(x₂)| ≤ p(x₁ − x₂) ∀x₁, x₂ ∈ X,  (5.1.2)

since p(x₁) − p(x₂) ≤ p(x₁ − x₂) and p(x₁ − x₂) = p(x₂ − x₁) ≥ p(x₂) − p(x₁); obviously, (5.1.2) follows from these two inequalities.

We will use seminorms to equip X with a topology. If p is a seminorm and M is the set {x ∈ X; p(x) < ε}, where ε is a positive constant, then, obviously, 0 ∈ M and M is convex, balanced (i.e., x ∈ M and |α| ≤ 1 imply αx ∈ M), and absorbing (i.e., for every x ∈ X there exists an α > 0 such that α⁻¹x ∈ M).

Let F = {pi : X → R; i ∈ I} be a family of seminorms satisfying the axiom of separation: for any y ∈ X, y ≠ 0, there exists j ∈ I such that pj(y) ≠ 0. Consider the collection V(0) of all sets which are finite intersections of sets {x ∈ X; pi(x) < εi}, i ∈ I, εi > 0. Such an intersection looks like

V = {x ∈ X; pi₁(x) < ε₁, . . . , pim(x) < εm},

a convex, balanced, and absorbing set. Each V ∈ V(0) is considered

to be a neighborhood of 0 ∈ X and y + V := {y + v; v ∈ V } a

neighborhood of any y ∈ X.

Now, a set D ⊂ X which is a neighborhood of any y ∈ D is called

open. Indeed, the collection τ of all such sets, plus ∅ ⊂ X, satisﬁes

the axioms of a topology, so (X, τ ) is a topological space.

Using the separating property of F we can infer that singletons are

closed sets. Indeed, let y ∈ X be a given point. For each x ∈ X, x ≠ y, let Dx be an open set containing x but not y. Then D = ⋃_{x≠y} Dx is

open and its complement is {y}, so the singleton {y} is closed, as

claimed. Note that if F does not satisfy the axiom of separation then

the closedness of singletons is not guaranteed.

It is easily seen that the mappings X × X ∋ (x, y) ↦ x + y ∈ X and K × X ∋ (α, x) ↦ αx ∈ X are both continuous, so X is a topological

linear space.

Moreover, X is locally convex: every open set containing 0 includes a convex, balanced, and absorbing

open subset.

To summarize, we can say that any linear space X equipped (as above)

with the topology generated by a family of seminorms {pi ; i ∈ I}

satisfying the axiom of separation is a locally convex space in which

any seminorm pi is continuous (cf. (5.1.2)).

Conversely, any locally convex space X is a topological linear space

whose topology is generated by a collection of seminorms. In order to

show this, we deﬁne for a convex, balanced, and absorbing set M ⊂ X

the so-called Minkowski functional:

pM(x) = inf{λ > 0; λ⁻¹x ∈ M}, x ∈ X,

and check that pM is a seminorm.

Indeed, by the convexity of M and the obvious relations

x/(pM(x) + ε) ∈ M,  y/(pM(y) + ε) ∈ M,  ε > 0,

we deduce

[(pM(x) + ε)/(pM(x) + pM(y) + 2ε)] · x/(pM(x) + ε) + [(pM(y) + ε)/(pM(x) + pM(y) + 2ε)] · y/(pM(y) + ε) ∈ M.

Hence

pM(x + y) ≤ pM(x) + pM(y) + 2ε ∀ε > 0 ⟹ pM(x + y) ≤ pM(x) + pM(y).

We also have pM(αx) = |α|pM(x), since M is balanced.

So, the topology of a given locally convex space X is the one generated

by the collection of seminorms obtained as the Minkowski functionals

associated with convex, balanced, and absorbing open subsets of X.
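For a concrete feel (an illustrative assumption, not from the text): if M ⊂ R² is the open box {|x₁| < a, |x₂| < b}, then pM(x) = max(|x₁|/a, |x₂|/b), a weighted max-norm. The sketch compares this closed form with a direct bisection for inf{λ > 0; λ⁻¹x ∈ M}:

```python
# Sketch: Minkowski functional p_M(x) = inf{λ > 0 : x/λ ∈ M} for the
# open box M = {|x1| < a, |x2| < b} in R².  Closed form: max(|x1|/a, |x2|/b).
# The box half-widths a, b are arbitrary illustrative choices.

def p_M(x, a=2.0, b=0.5):
    return max(abs(x[0]) / a, abs(x[1]) / b)

def p_M_by_search(x, a=2.0, b=0.5, lo=1e-9, hi=1e9):
    """Bisection for inf{λ : |x1|/λ < a and |x2|/λ < b}."""
    inside = lambda lam: abs(x[0]) / lam < a and abs(x[1]) / lam < b
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if inside(mid) else (mid, hi)
    return lo

x = (3.0, 0.25)
assert abs(p_M(x) - 1.5) < 1e-12             # max(3/2, 0.25/0.5) = 1.5
assert abs(p_M_by_search(x) - p_M(x)) < 1e-6
```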

Let X be a linear space and let {Xα}α∈J be a collection of linear subspaces of X such that X = ⋃_{α∈J} Xα. Suppose

that each Xα is a locally convex space such that, if Xα1 ⊂ Xα2 , then

the topology of Xα1 coincides with the relative topology of Xα1 as a

subset of Xα2 . Every convex, balanced, and absorbing set D ⊂ X is

considered open ⇐⇒ D∩Xα is an open set of Xα containing 0 ∈ Xα for

all α ∈ J. If X is a locally convex space with respect to the topology

deﬁned in this way, then X is called the inductive limit of the Xα ’s.

Now let us return to C0∞ (Ω). For any compact K ⊂ Ω deﬁne the set

DK (Ω) = {φ ∈ C0∞ (Ω); supp φ ⊂ K},

which is a linear subspace of C0∞ (Ω).

For m ∈ N₀ = N ∪ {0} and K ⊂ Ω compact, define the seminorm

pK,m(φ) = sup_{x∈K, |α|≤m} |Dᵅφ(x)|,

where α = (α₁, α₂, . . . , αk) ∈ N₀ᵏ is a multi-index, |α| = α₁ + α₂ + · · · + αk, and the α-derivative of φ is defined as

Dᵅφ = ∂^{|α|}φ / (∂x₁^{α₁} ∂x₂^{α₂} · · · ∂xk^{αk}).


These derivatives exist for any sufficiently smooth function. If α = (0, 0, . . . , 0), then Dᵅφ = φ by convention.

Then DK (Ω) is a locally convex space and, if K1 ⊂ K2 the topology

of DK1 (Ω) coincides with the relative topology of DK1 (Ω) as a subset

of DK2 (Ω). Then C0∞ (Ω) can be regarded as the inductive limit of the

DK (Ω)’s, where K ranges over all compact subsets of Ω. The space

C0∞ (Ω), topologized in this way, is denoted by D(Ω).

One of the seminorms defining the topology of D(Ω) is

p(φ) = sup_{x∈Ω} |φ(x)|, φ ∈ C₀^∞(Ω);

for the corresponding neighborhood D = {φ ∈ C₀^∞(Ω); p(φ) < 1} of the origin we have

D ∩ DK(Ω) = {φ ∈ DK(Ω); pK(φ) := sup_{x∈K} |φ(x)| < 1}.

Let (φn) be a sequence converging to φ in D(Ω). One can show that the following conditions are satisfied:

(a) there exists a compact set K ⊂ Ω such that supp φn ⊂ K for all n;

(b) Dᵅφn → Dᵅφ uniformly on K as n → ∞ for all α ∈ N₀ᵏ.

All we need to do is to prove (a). Assume by contradiction that (a) is not satisfied.

So there exists a sequence (xj)j≥1 in Ω with no cluster point in Ω and a subsequence (φnj)j≥1 such that φnj(xj) ≠ 0 for all j ≥ 1. Define a seminorm p : C₀^∞(Ω) → R by

p(φ) = Σ_{j=1}^{∞} 2 sup_{x∈Kj\Kj−1} |φ(x)|/|φnj(xj)|, φ ∈ C₀^∞(Ω),

where (Kj)j≥1 is an increasing sequence of compact sets whose union is Ω and xj ∈ Kj \ Kj−1, j = 1, 2, . . . , K₀ = ∅. Clearly, the set V = {φ ∈ C₀^∞(Ω); p(φ) < 1} is a neighborhood of 0 ∈ C₀^∞(Ω) and none of the φnj belongs to V, which gives a contradiction.

Summarizing, φn → φ in D(Ω) if and only if (φn) satisfies condition (a) with some compact K ⊂ Ω, and Dᵅφn → Dᵅφ uniformly on K as n → ∞ for all α ∈ N₀ᵏ.

For example, let Ω = Rᵏ and φn = (1/n)ω, where ω is the test function defined by (5.1.1). Then K = Cl B(0, 1) and all derivatives of φn converge uniformly to 0, so φn → 0 in D(Ω).


On the other hand, one can choose functions ψn ∈ C₀^∞(Rᵏ) such that Dᵅψn → 0 uniformly as n → ∞ for all α ∈ N₀ᵏ, but with no K satisfying (a): in fact supp ψn = Cl B(0, n), therefore ψn does not converge in D(Ω).

5.2 Friedrichs' Mollification

Friedrichs'¹ mollification will allow us to associate with "bad functions" very good approximate functions.

Consider again the test function ω : Rᵏ → R defined in the previous section, i.e.,

ω(x) = C exp(1/(‖x‖₂² − 1)) if ‖x‖₂ < 1,  ω(x) = 0 if ‖x‖₂ ≥ 1,

with C > 0 such that ∫_{Rᵏ} ω(x) dx = ∫_{B(0,1)} ω(x) dx = 1. For ε > 0 define

ωε(x) = (1/εᵏ) ω(x/ε) for all x ∈ Rᵏ.

This is called the mollifier.

1. ωε ∈ C^∞(Rᵏ);

2. supp ωε = Cl B(0, ε);

3. ∫_{Rᵏ} ωε(x) dx = ∫_{B(0,ε)} ωε(x) dx = 1.
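These properties can be checked numerically in one dimension. The sketch below builds ω on R, computing the normalizing constant C by quadrature (it has no elementary closed form), and verifies that each ωε still integrates to 1:

```python
import math

# Sketch (1-D): ω(x) = C·exp(1/(x²−1)) on (−1, 1), zero outside, with C
# chosen so that ∫ω = 1; then ω_ε(x) = (1/ε)·ω(x/ε) also integrates to 1.

def bump(x):
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def integrate(g, a, b, n=200_000):
    h = (b - a) / n
    return h * (0.5 * (g(a) + g(b)) + sum(g(a + i * h) for i in range(1, n)))

C = 1.0 / integrate(bump, -1.0, 1.0)        # normalizing constant
omega = lambda x: C * bump(x)

for eps in (1.0, 0.5, 0.1):
    omega_eps = lambda x, e=eps: omega(x / e) / e
    mass = integrate(omega_eps, -eps, eps)  # supp ω_ε ⊂ [−ε, ε]
    assert abs(mass - 1.0) < 1e-3
```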

Let f : Rᵏ → R be a measurable function such that f ∈ L¹(K) for any compact K ⊂ Rᵏ. For ε > 0 define fε(x), the Friedrichs' mollification of f, as

fε(x) = ∫_{Rᵏ} ωε(x − y)f(y) dy = ∫_{Rᵏ} ωε(y)f(x − y) dy = ∫_{B(0,ε)} ωε(y)f(x − y) dy.

¹Kurt Otto Friedrichs, German-American mathematician, 1901–1982.

If f ∈ L1loc (Ω), then f can be extended as f = 0 for x ∈ Rk \ Ω, and

we can deﬁne fε as before.

1. fε ∈ C^∞(Rᵏ);

2. supp fε ⊂ supp f + Cl B(0, ε), i.e., not much larger than supp f.

Proposition 5.4. If f ∈ C0 (Ω), then fε (x) → f (x) uniformly as

ε → 0+ in Ω, where C0 (Ω) = {u ∈ C(Ω); u has compact (bounded)

support ⊂ Ω}.

Proof. Since supp f is a compact subset of Ω, there is a compact set K′ such that supp fε ⊂ K′ ⊂ Ω for 0 < ε ≤ ε₀, if ε₀ is small enough. For 0 < ε ≤ ε₀ and x ∈ K′,

|fε(x) − f(x)| = |∫_{Rᵏ} f(x − y)ωε(y) dy − ∫_{Rᵏ} f(x)ωε(y) dy|
              ≤ ∫_{B(0,ε)} |f(x − y) − f(x)| ωε(y) dy.

Since f is uniformly continuous, for every η > 0 we have |f(x − y) − f(x)| < η for all y ∈ B(0, ε) with ε > 0 small. Thus sup_{x∈Ω} |fε(x) − f(x)| ≤ η for all ε > 0 sufficiently small, hence fε → f uniformly in Ω as ε → 0⁺.
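Proposition 5.4 can be watched in action: mollifying the continuous, compactly supported hat function in one dimension and measuring the sup-distance on a grid, the error shrinks as ε → 0⁺ (the 1-D bump is rebuilt here so the block is self-contained):

```python
import math

# Sketch (1-D): f_ε = ω_ε * f converges uniformly to a continuous,
# compactly supported f; here f is the hat function on [−1, 1].

def bump(x):
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def integrate(g, a, b, n=2_000):
    h = (b - a) / n
    return h * (0.5 * (g(a) + g(b)) + sum(g(a + i * h) for i in range(1, n)))

C = 1.0 / integrate(bump, -1.0, 1.0, n=20_000)
f = lambda x: max(0.0, 1.0 - abs(x))        # hat function

def mollify(f, eps, x):
    return integrate(lambda y: (C / eps) * bump(y / eps) * f(x - y),
                     -eps, eps)

def sup_error(eps, grid=81):
    pts = [-2.0 + 4.0 * i / (grid - 1) for i in range(grid)]
    return max(abs(mollify(f, eps, x) - f(x)) for x in pts)

e1, e2 = sup_error(0.4), sup_error(0.1)
assert e2 < e1          # smaller ε gives a better uniform approximation
assert e2 < 0.05
```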

Theorem 5.5. Let 1 ≤ p < ∞ and f ∈ Lᵖ(Ω). Then (the restriction to Ω of) fε is in Lᵖ(Ω) for all ε > 0 and

1. ‖fε‖Lᵖ(Ω) ≤ ‖f‖Lᵖ(Ω) for all ε > 0,

2. ‖fε − f‖Lᵖ(Ω) → 0 as ε → 0⁺.


Proof. Extend f as f = 0 on Rᵏ \ Ω, as before; the two conclusions of the theorem for the extension of f will imply the same conclusions for f ∈ Lᵖ(Ω).

Consider first the case p = 1, i.e., f ∈ L¹(Rᵏ). Note that the function

(x, y) ↦ ωε(x − y)f(y)  (5.2.3)

is measurable on Rᵏ × Rᵏ and

∫_{Rᵏ} |f(y)| ωε(x − y) dx = |f(y)| ∫_{Rᵏ} ωε(x − y) dx = |f(y)|,

hence

∫_{Rᵏ} (∫_{Rᵏ} |f(y)| ωε(x − y) dx) dy = ∫_{Rᵏ} |f(y)| dy = ‖f‖L¹(Rᵏ) < ∞.  (5.2.4)

Thus, by the Fubini–Tonelli Theorem (see, e.g., [51, p. 18]), function (5.2.3) is a member of L¹(Rᵏ × Rᵏ) and

∫_{Rᵏ} |fε(x)| dx = ∫_{Rᵏ} |∫_{Rᵏ} ωε(x − y)f(y) dy| dx ≤ ∫_{Rᵏ} (∫_{Rᵏ} |f(y)| ωε(x − y) dx) dy = ‖f‖L¹(Rᵏ),

so that

‖fε‖L¹(Rᵏ) ≤ ‖f‖L¹(Rᵏ),

as claimed.

We now consider the case 1 < p < ∞ for the same function (5.2.3). Then fε ∈ Lᵖ(Rᵏ) and, denoting by p′ the conjugate of p (i.e., (1/p) + (1/p′) = 1), we have by Hölder's inequality

|fε(x)| ≤ ∫_{Rᵏ} |f(y)| ωε(x − y) dy = ∫_{Rᵏ} ωε(x − y)^{1/p′} ωε(x − y)^{1/p} |f(y)| dy
       ≤ (∫_{Rᵏ} ωε(x − y) dy)^{1/p′} (∫_{Rᵏ} ωε(x − y)|f(y)|ᵖ dy)^{1/p},

so that

|fε(x)|ᵖ ≤ ∫_{Rᵏ} ωε(x − y)|f(y)|ᵖ dy,

and integrating,

∫_{Rᵏ} |fε(x)|ᵖ dx ≤ ∫_{Rᵏ} (∫_{Rᵏ} ωε(x − y)|f(y)|ᵖ dy) dx = ∫_{Rᵏ} |f(y)|ᵖ (∫_{Rᵏ} ωε(x − y) dx) dy = ∫_{Rᵏ} |f(y)|ᵖ dy = ‖f‖ᵖLᵖ(Rᵏ),

so that

‖fε‖Lᵖ(Rᵏ) ≤ ‖f‖Lᵖ(Rᵏ),

which concludes the proof of the first statement of the theorem. Before we continue the proof of the theorem we shall prove two auxiliary results.

Lemma 5.6. Let K be a compact subset of the open set Ω ⊂ Rᵏ. Then there exist an open neighborhood V of K such that Cl V ⊂ Ω and a continuous map g : Ω → R satisfying

g(x) = 1 for all x ∈ K,
g(x) = 0 for all x ∈ Ω \ V, and
0 ≤ g(x) ≤ 1 for all x ∈ Ω.

Proof. For δ > 0 small enough, let V be the δ-neighborhood of K whose closure Cl V lies in Ω. Let W = Ω \ V and ρ(x) = d(x, W) := inf_{w∈W} ‖x − w‖₂, which is a continuous function.


Now let α = inf_{x∈K} ρ(x) > 0, and let g(x) = min{1, (1/α)ρ(x)}, which is

also a continuous function. Clearly g(x) = 1 for x ∈ K, g(x) = 0 for

x ∈ W = Ω \ V , and 0 ≤ g(x) ≤ 1 for x ∈ V \ K.
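The construction in this proof is directly computable; a 1-D sketch with K = [0, 1] inside Ω = (−1, 2) and V = (−δ, 1 + δ) (the interval choices are illustrative):

```python
# Sketch (1-D): the Urysohn-type cutoff g(x) = min(1, ρ(x)/α) from
# Lemma 5.6, for K = [0, 1] inside Ω = (−1, 2) with V = (−δ, 1 + δ).

delta = 0.25
K = (0.0, 1.0)

def rho(x):
    """Distance from x to W = Ω \\ V, i.e. to the points −δ and 1 + δ."""
    return max(0.0, min(x - (-delta), (1.0 + delta) - x))

alpha = min(rho(K[0]), rho(K[1]))          # inf over K, attained at its ends

def g(x):
    return min(1.0, rho(x) / alpha)

assert all(g(x) == 1.0 for x in (0.0, 0.5, 1.0))     # g ≡ 1 on K
assert g(-0.25) == 0.0 and g(1.25) == 0.0            # g ≡ 0 off V
assert 0.0 < g(-0.1) < 1.0                           # transition zone
```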

Lemma 5.7. C₀(Ω) is dense in Lᵖ(Ω) for all 1 ≤ p < ∞ (i.e., every Lᵖ(Ω) function can be approximated by C₀(Ω) functions with respect to the usual norm of Lᵖ(Ω)).

Proof. Note that f⁺ = max{f, 0} and f⁻ = max{−f, 0} are nonnegative Lᵖ(Ω) functions and f = f⁺ − f⁻. So, it suffices to consider nonnegative Lᵖ(Ω) functions u, which we approximate by simple functions

s = Σ_{i=1}^{m} yi χMi,

where the Mi ⊂ Ω are Lebesgue measurable sets with m(Mi) < ∞, and the χMi are their characteristic functions. Consider

a sequence of simple functions (sn) such that 0 ≤ sn ≤ u and sn → u as n → ∞ for almost all x ∈ Ω, so sn → u in Lᵖ(Ω). Thus u can be approximated by simple functions, and so everything reduces to approximating characteristic functions u = χM, where M ⊂ Ω is a Lebesgue measurable set with m(M) < ∞. In fact, we only need to consider K ⊂ M compact such that m(M \ K) = m(M) − m(K) is small (see Exercise 3.2), so

∫_Ω |χK − χM|ᵖ dx = ∫_{M\K} 1 dx = m(M \ K) < εᵖ.

If V and g are given by Lemma 5.6, with the neighborhood V of K chosen so small that m(V \ K) < εᵖ, then g ∈ C₀(Ω) and

∫_Ω |g − χK|ᵖ dx = ∫_{V\K} gᵖ dx ≤ ∫_{V\K} 1 dx = m(V \ K) < εᵖ,

so

‖g − χK‖Lᵖ(Ω) < ε.

Thus the characteristic functions u = χM can indeed be approximated by C₀(Ω) functions.


Consider f ∈ Lᵖ(Ω) and approximate it using Lemma 5.7: for θ > 0 there exists g ∈ C₀(Ω) such that

‖f − g‖Lᵖ(Ω) < θ/3.  (5.2.5)

We have

‖fε − f‖Lᵖ(Ω) ≤ ‖fε − gε‖Lᵖ(Ω) + ‖gε − g‖Lᵖ(Ω) + ‖g − f‖Lᵖ(Ω),

so by (5.2.5) and the first statement of the theorem (applied to f − g),

‖fε − f‖Lᵖ(Ω) < (2/3)θ + ‖gε − g‖Lᵖ(Ω),

which by Proposition 5.4 yields

‖fε − f‖Lᵖ(Ω) < (2/3)θ + constant · ‖gε − g‖C(K′) < (2/3)θ + θ/3 = θ

for all ε > 0 small. Therefore,

lim sup_{ε→0⁺} ‖fε − f‖Lᵖ(Ω) ≤ θ for every θ > 0 ⟹ lim_{ε→0⁺} ‖fε − f‖Lᵖ(Ω) = 0.

Theorem 5.8. Let Ω ⊂ Rᵏ be a nonempty open set. We have Cl_{Lᵖ(Ω)} C₀^∞(Ω) = Lᵖ(Ω) for all 1 ≤ p < ∞ (i.e., every Lᵖ(Ω) function can be approximated by test functions).

Proof. Let f ∈ Lᵖ(Ω). By Lemma 5.7, for all η > 0 there exists g ∈ C₀(Ω) such that ‖f − g‖Lᵖ(Ω) < η/2. On the other hand, gε ∈ C₀^∞(Ω), and by Theorem 5.5, ‖gε − g‖Lᵖ(Ω) < η/2 for ε > 0 small. Therefore,

‖f − gε‖Lᵖ(Ω) ≤ ‖f − g‖Lᵖ(Ω) + ‖gε − g‖Lᵖ(Ω) < η/2 + η/2 = η

for ε > 0 small.

Theorem 5.9. Let f ∈ L¹loc(Ω) be such that

∫_Ω f(x)φ(x) dx = 0 ∀φ ∈ C₀^∞(Ω).  (5.2.6)

Then f = 0 a.e. on Ω.

Proof. We first show that

∫_Ω f(x)g(x) dx = 0  (5.2.7)

for every g ∈ L^∞(Ω) which vanishes a.e. on Ω \ K, where K ⊂ Ω is a compact set. Obviously, such a function g belongs in particular to L¹(Ω) and (by Theorem 5.5)

‖gε − g‖L¹(Ω) → 0 as ε → 0⁺.

Hence, there exists a sequence εj → 0 such that

gεj (x) → g(x) as j → ∞ for a.a. x ∈ Ω . (5.2.8)

Therefore, by (5.2.6), we have

∫_Ω f(x)gεj(x) dx = 0,  (5.2.9)

since supp gεj is contained in a compact set K′ with K ⊂ K′ ⊂ Ω. We also have

|f(x)gεj(x)| ≤ |f(x)| ∫_Ω |g(y)|ωεj(x − y) dy ≤ |f(x)|·‖g‖L∞(Ω),

for j large enough and for almost all x ∈ K′. So we can apply

the Lebesgue Dominated Convergence Theorem (see also (5.2.8) and

(5.2.9)) to get (5.2.7) for all g ∈ L∞ (Ω) such that g vanishes a.e. on

Ω \ K.

Now choose an arbitrary compact set K ⊂ Ω and let g = sign f · χK. Then by (5.2.7) we have

∫_Ω f g dx = ∫_Ω |f|χK dx = ∫_K |f| dx = 0,

which implies f = 0 for almost all x ∈ K. Since K is arbitrary, f = 0

a.e. on Ω.

5.3 Scalar Distributions

Let Ω ⊂ Rk be a nonempty open set. Recall that C0∞ (Ω), topologized

as the inductive limit of the DK (Ω)’s, where K runs over all compact

subsets of Ω, is denoted by D(Ω) (see Sect. 5.1).

A functional u : D(Ω) → R is called a distribution (on Ω) if u is linear and continuous, i.e.,

u(αφ₁ + βφ₂) = αu(φ₁) + βu(φ₂) ∀α, β ∈ R, ∀φ₁, φ₂ ∈ D(Ω),

and u(φn) → u(φ) whenever φn → φ in D(Ω); by linearity, continuity at 0 suffices. Let D′(Ω) denote the set of all distributions on Ω. For u ∈ D′(Ω) and φ ∈ D(Ω) we often write (u, φ) instead of u(φ).

Notice that, in general, a distribution is not deﬁned point-wise on Ω,

unless it is a regular distribution, i.e., a distribution deﬁned by a usual

function, as explained below.

Regular Distributions

Let u ∈ L¹loc(Ω) and define ũ : D(Ω) → R by

ũ(φ) = ∫_Ω u(x)φ(x) dx ∀φ ∈ D(Ω).

Clearly, ũ is linear and continuous and therefore a distribution. Note that the mapping i : L¹loc(Ω) → D′(Ω), i(u) = ũ, is injective. Since i is linear, injectivity can be seen by showing the implication ũ = i(u) = 0 ⟹ u = 0 for a.a. x ∈ Ω. This is indeed the case by Theorem 5.9.

We now simply identify ũ with u and write

(u, φ) = ∫_Ω u(x)φ(x) dx ∀φ ∈ D(Ω).


Let Ω = Rᵏ and define (δ, φ) = δ(φ) = φ(0) for all φ ∈ D(Ω). It is linear and continuous, so δ ∈ D′(Ω). δ is called the Dirac² distribution, or delta function, to follow the original denomination, even though it is not in fact a function.

Claim: The distribution δ is not a regular distribution.

Indeed, assume by contradiction that there exists f ∈ L¹loc(Rᵏ) such that

(δ, φ) = ∫_{Rᵏ} f(x)φ(x) dx ∀φ ∈ D(Rᵏ).

This means

∫_{Rᵏ} f φ dx = φ(0) ∀φ ∈ D(Rᵏ),

and, in particular,

∫_{Rᵏ} f φ dx = 0 ∀φ ∈ D(Rᵏ) with supp φ ⊂ Rᵏ \ {0},

hence

∫_{Rᵏ\{0}} f φ dx = 0 ∀φ ∈ D(Rᵏ \ {0}).

By Theorem 5.9, f = 0 a.e. on Rᵏ \ {0}, hence f = 0 for almost all x ∈ Rᵏ; thus φ(0) = (δ, φ) = 0 for all φ ∈ D(Rᵏ),

which is false.
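Although δ is not a function, it is the D′-limit of the mollifiers ωε, since ∫ωε(x)φ(x) dx → φ(0) as ε → 0⁺. A 1-D numerical sketch of this classical fact (the test function is an arbitrary smooth choice):

```python
import math

# Sketch (1-D): (ω_ε, φ) = ∫ ω_ε(x) φ(x) dx → φ(0), i.e. ω_ε → δ in D'.

def bump(x):
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def integrate(g, a, b, n=50_000):
    h = (b - a) / n
    return h * (0.5 * (g(a) + g(b)) + sum(g(a + i * h) for i in range(1, n)))

C = 1.0 / integrate(bump, -1.0, 1.0)
phi = lambda x: math.cos(x) + x ** 3        # smooth test function, φ(0) = 1

def pairing(eps):
    return integrate(lambda x: (C / eps) * bump(x / eps) * phi(x), -eps, eps)

errs = [abs(pairing(e) - phi(0.0)) for e in (0.5, 0.1, 0.02)]
assert errs[0] > errs[1] > errs[2]          # convergence as ε shrinks
assert errs[2] < 1e-3
```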

For a given x0 ∈ Rk one can define a similar Dirac distribution, denoted δx0, by
$$(\delta_{x_0}, \varphi) = \varphi(x_0) \quad \forall\varphi\in\mathcal{D}(\mathbb{R}^k).$$
The Dirac distribution associated with x0 = 0 is precisely δ. Of course, linear combinations of Dirac distributions are also distributions. In fact, the space of distributions is a large one, as shown below.

² Paul Adrien Maurice Dirac, English theoretical physicist, 1902–1984.


Besides addition and scalar multiplication there are some further operations we can perform on distributions.

• Multiplication by a C∞ function.
For a ∈ C∞(Ω) and u ∈ D′(Ω), define au by (au)(φ) := u(aφ) for all φ ∈ D(Ω). The map φ ↦ u(aφ) is linear and continuous on D(Ω), so au ∈ D′(Ω).

This is a generalization of the usual multiplication of functions. Indeed, if u ∈ L1loc(Ω) (i.e., u is a regular distribution), then
$$(au)(\varphi) = u(a\varphi) = \int_\Omega u(x)a(x)\varphi(x)\,dx = \int_\Omega (au)\varphi\,dx \quad \forall\varphi\in\mathcal{D}(\Omega),$$
so au coincides with the regular distribution generated by the usual product function au.

• Reflection.
Sometimes we write u(x) instead of u even though u is not a function. For example, this helps denote the reflection of u ∈ D′(Rk), written u(−x) and defined by
$$(u(-x), \varphi(x)) := (u(x), \varphi(-x)) \quad \forall\varphi\in\mathcal{D}(\mathbb{R}^k).$$
If u ∈ L1loc(Rk), then the reflection of u in D′(Rk) is precisely the regular distribution generated by the function x ↦ u(−x), since
$$\int_{\mathbb{R}^k} u(-x)\varphi(x)\,dx = \int_{\mathbb{R}^k} u(x)\varphi(-x)\,dx \quad \forall\varphi\in\mathcal{D}(\mathbb{R}^k),$$
and this explains the notation for the reflection of the distribution u.


• Translation by a vector.

For u ∈ D′(Rk) and h ∈ Rk, define u(x + h) by
$$(u(x+h), \varphi(x)) := (u(x), \varphi(x-h)) \quad \forall\varphi\in\mathcal{D}(\mathbb{R}^k).$$

It is clear that u(x + h) ∈ D′(Rk). Again, the notation u(x + h) is justified by the case when u is a locally integrable function. Note that the Dirac distribution δx0 defined before is precisely δ(x − x0) in terms of the above notation.

Let (un)n∈N be a sequence in D′(Ω). We say that (un) converges in D′(Ω) if there exists u ∈ D′(Ω) such that
$$\lim_{n\to\infty}(u_n, \varphi) = (u, \varphi) \quad \forall\varphi\in\mathcal{D}(\Omega). \tag{5.3.10}$$
In fact, the limit functional need not be assumed in advance to be a distribution: it is automatically in D′(Ω). More precisely, if the limit in (5.3.10) exists for every φ ∈ D(Ω), then u : D(Ω) → R defined by (5.3.10) is linear and continuous.

Proof. While the linearity of u follows trivially from (5.3.10), its continuity is not immediate (see Gel'fand and Shilov [17]³). Assume that

u is not continuous, i.e., there exists a sequence φn → 0 in D(Ω) such that (u, φn) does not converge to 0; then, on a subsequence again denoted (φn), we have
$$|u(\varphi_n)| \ge \delta > 0 \quad \forall n\in\mathbb{N} \tag{5.3.11}$$
for some δ > 0, and (since φn → 0 in D(Ω))
$$\sup_{x\in\Omega} |D^\alpha \varphi_n(x)| < \frac{1}{2^{2n}} \quad \forall\, |\alpha| \le n. \tag{5.3.12}$$
We consider ψn = 2ⁿφn. By (5.3.12) we get
$$\psi_n \to 0 \ \text{ in } \mathcal{D}(\Omega), \tag{5.3.13}$$
while by (5.3.11)
$$|u(\psi_n)| \ge 2^n\delta \to \infty. \tag{5.3.14}$$

³ Israel M. Gel'fand, Russian mathematician, 1913–2009; Georgiy E. Shilov, Russian mathematician, 1917–1975.


Let us now extract new subsequences, say (ũn) and (ψ̃n). In view of (5.3.14) we can pick a ψ̃1 such that |(u, ψ̃1)| > 1. Thus, by virtue of (5.3.10), we can choose ũ1 such that |(ũ1, ψ̃1)| > 1. Now, assuming that ũj and ψ̃j have been chosen for j = 1, 2, …, n − 1, we can pick (by the continuity of the ũj's and by (5.3.14)) a test function ψ̃n such that
$$|(\tilde u_k, \tilde\psi_n)| < \frac{1}{2^{\,n-k}}, \quad k = 1, 2, \ldots, n-1, \tag{5.3.15}$$
and
$$|(u, \tilde\psi_n)| > \sum_{j=1}^{n-1} |(u, \tilde\psi_j)| + n + 1. \tag{5.3.16}$$
Taking into account (5.3.10) and (5.3.16), we can pick ũn such that
$$|(\tilde u_n, \tilde\psi_n)| > \sum_{j=1}^{n-1} |(\tilde u_n, \tilde\psi_j)| + n + 1. \tag{5.3.17}$$

Set
$$\psi = \sum_{n=1}^{\infty}\tilde\psi_n.$$
This series converges in D(Ω) (see (5.3.12)), hence ψ ∈ D(Ω). Now, let us estimate |(ũn, ψ)| by using the decomposition
$$(\tilde u_n, \psi) = \sum_{j\ne n}(\tilde u_n, \tilde\psi_j) + (\tilde u_n, \tilde\psi_n). \tag{5.3.18}$$
By (5.3.15),
$$\sum_{j=n+1}^{\infty}|(\tilde u_n, \tilde\psi_j)| < \sum_{j=n+1}^{\infty}\frac{1}{2^{\,j-n}} = 1. \tag{5.3.19}$$

Example. Consider the sequence un = ω1/n, i.e., un(x) = nᵏ ω(nx) for x ∈ Ω = Rk, n ∈ N, where ω is


the test function deﬁned before in (5.1.1). The graphs of the un ’s for

k = 1 or k = 2 can be visualized in corresponding coordinate systems

to observe the behavior of the un ’s as n gets larger and larger. The

pointwise limit of (un) is as follows:
$$\lim_{n\to\infty} u_n(x) = \begin{cases} 0, & x \ne 0,\\ +\infty, & x = 0.\end{cases}$$
However, regarding the un's as distributions, we have un → δ in D′(Rk):

$$(u_n, \varphi) = \int_{\mathbb{R}^k} u_n(x)\varphi(x)\,dx = \int_{B(0,\frac1n)} u_n(x)\varphi(x)\,dx \to \varphi(0) = (\delta, \varphi)$$
for every φ ∈ D(Rk).

The Dirac distribution represents, for instance, the density of a unit mass concentrated at some point. To explain that, let us suppose that a unit mass, which is concentrated at the origin of a coordinate system in R³, is distributed uniformly in B(0, 1/n) ⊂ R³. Thus the corresponding

mass density is given by
$$\delta_n(x) = \begin{cases} \dfrac{3n^3}{4\pi}, & \|x\|_2 \le \frac1n,\\[4pt] 0, & \text{otherwise},\end{cases}$$

and obviously the total mass ∫_{R³} δn dx = 1. For n → ∞ the mass concentrates at x = 0. Obviously, δn(x) → 0 as n → ∞ for all x ≠ 0, and δn(0) → +∞, so δn does not converge pointwise to a function. However, δn → δ in D′(R³) as n → ∞:
$$(\delta_n, \varphi) = \frac{3n^3}{4\pi}\int_{B(0,\frac1n)}\varphi(x)\,dx \to \varphi(0) = (\delta, \varphi) \quad \forall\varphi\in\mathcal{D}(\mathbb{R}^3),$$
so δ can indeed be interpreted as the density of the concentrated unit mass.
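This convergence can also be checked numerically. The following sketch (our own illustration, not from the text) uses the one-dimensional analogue of δn, namely the box kernel δn(x) = n/2 for |x| ≤ 1/n and 0 otherwise, paired against the smooth function cos standing in for a test function:

```python
import math

def delta_n_pairing(n, phi, pts=20001):
    # (delta_n, phi) = (n/2) * integral of phi over [-1/n, 1/n], midpoint rule
    a, b = -1.0 / n, 1.0 / n
    h = (b - a) / pts
    return (n / 2.0) * h * sum(phi(a + (i + 0.5) * h) for i in range(pts))

# the pairing values approach phi(0) = 1 as n grows
vals = [delta_n_pairing(n, math.cos) for n in (10, 100, 1000)]
```

For φ = cos the pairing equals n·sin(1/n) exactly, so the error decays like 1/(6n²).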


For u ∈ C¹(Ω) and φ ∈ D(Ω) one can write
$$\Big(\frac{\partial u}{\partial x_i}, \varphi\Big) = \int_\Omega \frac{\partial u}{\partial x_i}\,\varphi\,dx = \int_{\operatorname{supp}\varphi} \frac{\partial u}{\partial x_i}\,\varphi\,dx, \quad i = 1,\ldots,k,$$
and, enclosing supp φ in a cell and using integration by parts (φ vanishes on the boundary of the cell),
$$= \int_{\text{cell}} \frac{\partial u}{\partial x_i}\,\varphi\,dx = -\int_{\text{cell}} u\,\frac{\partial\varphi}{\partial x_i}\,dx = -\int_\Omega u\,\frac{\partial\varphi}{\partial x_i}\,dx = -\Big(u, \frac{\partial\varphi}{\partial x_i}\Big).$$
Therefore
$$\Big(\frac{\partial u}{\partial x_i}, \varphi\Big) = -\Big(u, \frac{\partial\varphi}{\partial x_i}\Big) \quad \forall\varphi\in\mathcal{D}(\Omega),\ i = 1,\ldots,k. \tag{5.3.20}$$

If u is an arbitrary distribution, then we use (5.3.20) as the definition of ∂u/∂xi, which is also an element of D′(Ω). Whenever u is a smooth function, its distributional derivative defined by (5.3.20) coincides with the classical derivative of u.

Since u ∈ D′(Ω) implies ∂u/∂xi ∈ D′(Ω) for i = 1, …, k, we deduce by induction that every distribution u ∈ D′(Ω) is infinitely differentiable, and we have
$$(D^\alpha u, \varphi) = (-1)^{|\alpha|}\,(u, D^\alpha\varphi) \quad \forall\varphi\in\mathcal{D}(\Omega),\ \alpha = (\alpha_1,\ldots,\alpha_k)\in\mathbb{N}_0^k. \tag{5.3.21}$$
By convention D^{(0,0,…,0)}u = u. It is clear from (5.3.21) that mixed derivatives in the sense of distributions do not depend on the order of differentiation.
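As a quick illustration of (5.3.21) (our addition, not part of the original text), the derivatives of the Dirac distribution are computed directly from the definition:

```latex
(D^\alpha \delta, \varphi) = (-1)^{|\alpha|}\,(\delta, D^\alpha \varphi)
                           = (-1)^{|\alpha|}\,(D^\alpha \varphi)(0)
\qquad \forall \varphi \in \mathcal{D}(\mathbb{R}^k),\ \alpha \in \mathbb{N}_0^k.
```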


Example 1. Consider the Heaviside function H, defined on Ω = R by
$$H(x) = \begin{cases} 1, & x \ge 0,\\ 0, & x < 0.\end{cases}$$
Since H ∈ L1loc(R), it generates a regular distribution, whose derivative H′ we compute in D′(R). Obviously Ḣ(x) = 0 for x ≠ 0, hence the pointwise derivative satisfies Ḣ = 0 a.e. On the other hand, H′ = δ: for all φ ∈ D(R) we have
$$(H', \varphi) = -(H, \dot\varphi) = -\int_{-\infty}^{\infty} H(x)\dot\varphi(x)\,dx = -\int_0^{\infty}\dot\varphi(x)\,dx = -\varphi(x)\Big|_{x=0}^{x=\infty} = \varphi(0) = (\delta, \varphi).$$
So the distributional derivative H′ = δ does not coincide with the pointwise derivative.
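A companion computation (our addition, following the same integration-by-parts pattern) for u(x) = |x|:

```latex
(|x|', \varphi) = -\int_{-\infty}^{\infty} |x|\,\varphi'(x)\,dx
                = \int_{0}^{\infty} \varphi(x)\,dx - \int_{-\infty}^{0} \varphi(x)\,dx
                = (\operatorname{sign} x, \varphi),
```

so |x|′ = sign x in D′(R), and hence |x|″ = (sign x)′ = (2H − 1)′ = 2δ.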

Example 2. Consider the function
$$u = x_1 H(x_2) = \begin{cases} x_1, & x_2 \ge 0,\\ 0, & x_2 < 0.\end{cases}$$
It has the distributional derivative D^{(1,0)}u = H(x₂), which coincides with the classical partial derivative ∂u/∂x₁. On the other hand,

$$\big(D^{(0,1)}u, \varphi\big) = -\int_{\mathbb{R}^2} u\,\frac{\partial\varphi}{\partial x_2}\,dx_1\,dx_2 = -\int_{-\infty}^{+\infty} x_1\Big[\int_0^{+\infty}\frac{\partial\varphi}{\partial x_2}\,dx_2\Big]dx_1 = \int_{-\infty}^{+\infty} x_1\,\varphi(x_1, 0)\,dx_1 \quad \forall\varphi\in\mathcal{D}(\mathbb{R}^2).$$


This shows that D^{(0,1)}u is not a regular distribution: assuming the contrary, we would obtain D^{(0,1)}u = 0 almost everywhere in R² by using test functions with support in R × (0, +∞) and in R × (−∞, 0), while by the computation above D^{(0,1)}u cannot be zero. So D^{(0,1)}u is different from the classical partial derivative ∂u/∂x₂ (which is zero almost everywhere).

Example 3. Let Ω = R³ and consider u = 1/r, where r = ‖x‖₂ = √(x₁² + x₂² + x₃²). We want to calculate, in the sense of distributions,
$$\Delta u = \sum_{i=1}^{3}\frac{\partial^2 u}{\partial x_i^2},$$
where Δ is called the Laplace operator (or Laplacian).⁴ Note that u has a singularity at the origin, so Δu cannot be computed by differentiating pointwise (the classical Laplacian of 1/r vanishes for r > 0). We replace u with
$$u_n = \begin{cases}\dfrac1r, & r \ge \frac1n,\\[4pt] 0, & r < \frac1n,\end{cases}$$
which belongs to L1loc(R³) for all n ∈ N. For any test function φ ∈ D(R³) we have

$$(\Delta u_n, \varphi) = (u_n, \Delta\varphi) = \int_{r\ge\frac1n}\frac{\Delta\varphi}{r}\,dx.$$

We wish to accept
$$(\Delta u, \varphi) = \Big(\Delta\frac1r, \varphi\Big) = \lim_{n\to\infty}\int_{r\ge\frac1n}\frac{\Delta\varphi}{r}\,dx$$
as the definition of Δu, but of course we must show that this limit exists. For a fixed φ ∈ D(R³), define the spherical shell
$$S_n = \Big\{x\in\mathbb{R}^3;\ \frac1n \le r \le a\Big\},$$

⁴ Pierre-Simon Laplace, French mathematician and astronomer, 1749–1827.


where a is large enough that supp φ ⊂ B(0, a). We then use the second Green formula⁵ (see, for example, [14, p. 628]), together with the fact that Δ(1/r) = 0 on Sn, to deduce that
$$\int_{r\ge\frac1n}\frac{\Delta\varphi}{r}\,dx = \int_{S_n}\Big(\frac1r\,\Delta\varphi - \varphi\,\Delta\frac1r\Big)dx = \int_{\partial S_n}\Big(\frac1r\,\frac{\partial\varphi}{\partial n} - \varphi\,\frac{\partial}{\partial n}\frac1r\Big)d\sigma,$$
and, changing the direction of the normal and consequently the sign in the surface integral below (we can ignore the outer edge of the shell since φ vanishes there),
$$= -\int_{r=\frac1n}\Big(\frac1r\,\frac{\partial\varphi}{\partial r} - \varphi\,\frac{\partial}{\partial r}\frac1r\Big)d\sigma,$$
and since r = 1/n on the edge,
$$= -n\int_{r=\frac1n}\frac{\partial\varphi}{\partial r}\,d\sigma - n^2\int_{r=\frac1n}\varphi\,d\sigma,$$
and because ∂φ/∂r is bounded,
$$= -n\,O\Big(\frac1{n^2}\Big) - n^2\int_{r=\frac1n}\varphi\,d\sigma,$$
which, as n → ∞, converges to −4πφ(0) = −4π(δ, φ) (the sphere {r = 1/n} has area 4π/n²). Therefore
$$\lim_{n\to\infty}\int_{r\ge\frac1n}\frac{\Delta\varphi}{r}\,dx = -4\pi(\delta, \varphi).$$
Hence
$$\Big(\Delta\frac1r, \varphi\Big) = -4\pi(\delta, \varphi) \quad \forall\varphi\in C_0^\infty(\mathbb{R}^3),$$
that is to say, Δ(1/r) = −4πδ.

⁵ George Green, British mathematical physicist, 1793–1841.


that for k ≥ 3, 1

Δ k−2 = −(k − 2)ak δ

r

in D (R ), where ak is the “area” of the unit hyper-sphere in Rk .

k

1

Δ ln = −2πδ

r

in D (R2 ), so that deﬁning for k ≥ 2

⎧

⎨− (k−2)ak rk−2

⎪ k ≥ 3,

1 1

E(x) =

⎪

⎩ 1

− 2π ln 1r k = 2,

we have

ΔE = δ in D (Rk ) . (5.3.22)

E is called the fundamental solution of the Laplacian Δ. In particular, it can be used to find a solution to the Poisson equation⁶
$$\Delta u = f(x), \quad x\in\mathbb{R}^k. \tag{5.3.23}$$

Assume that f ∈ L∞ (Rk ) and vanishes almost everywhere outside a

compact set. Then the function

$$u(x) = (E * f)(x) = \int_{\mathbb{R}^k} E(x-y)\,f(y)\,dy \tag{5.3.24}$$

is well defined (since E is locally summable) and satisfies Eq. (5.3.23) in the sense of distributions. Indeed, we first notice that for all y ∈ Rk and φ ∈ C0∞(Rk)
$$\int_{\mathbb{R}^k} E(x-y)\,\Delta\varphi(x)\,dx = (E(x-y), \Delta\varphi(x)) = (\Delta_x E(x-y), \varphi(x)) = (\delta(x), \varphi(x+y)) = \varphi(y).$$

⁶ Siméon Denis Poisson, French mathematician, engineer, and physicist, 1781–1840.


Now,
$$(\Delta u, \varphi) = (u, \Delta\varphi) = \int_{\mathbb{R}^k} u(x)\Delta\varphi(x)\,dx = \int_{\mathbb{R}^k}\Big[\int_{\mathbb{R}^k} E(x-y)f(y)\,dy\Big]\Delta\varphi(x)\,dx$$
$$= \int_{\mathbb{R}^k} f(y)\Big[\int_{\mathbb{R}^k} E(x-y)\Delta\varphi(x)\,dx\Big]dy = \int_{\mathbb{R}^k} f(y)\varphi(y)\,dy = (f, \varphi) \quad \forall\varphi\in\mathcal{D}(\mathbb{R}^k),$$
hence Δu = f in D′(Rk).

Remark 5.11. We point out (without proof) the following result, known as Weyl's regularity lemma⁷ (see, e.g., [47]): if ∅ ≠ Ω ⊂ Rk is open, f ∈ C∞(Ω), u ∈ D′(Ω) and Δu = f in D′(Ω), then u ∈ C∞(Ω).

We show next that differentiation is a continuous operation in D′(Ω).

Proposition 5.12. Suppose that ∅ ≠ Ω ⊂ Rk is open. If un → u in D′(Ω), then ∂un/∂xi → ∂u/∂xi in D′(Ω) for all i = 1, …, k.

Proof. For every φ ∈ D(Ω) we have
$$\Big(\frac{\partial u_n}{\partial x_i}, \varphi\Big) = -\Big(u_n, \frac{\partial\varphi}{\partial x_i}\Big),$$
and since ∂φ/∂xi is a test function, as n → ∞ the right-hand side converges to
$$-\Big(u, \frac{\partial\varphi}{\partial x_i}\Big) = \Big(\frac{\partial u}{\partial x_i}, \varphi\Big). \qquad\square$$

⁷ Hermann Weyl, German mathematician, theoretical physicist, and philosopher, 1885–1955.


Consequently (by induction), if un → u in D′(Ω), then Dαun → Dαu in D′(Ω) for all α = (α₁, …, αk) ∈ N₀ᵏ with |α| > 0.

Series in D′(Ω)

Suppose (un)n∈N is a sequence in D′(Ω). Then we can associate with this sequence the series
$$u_1 + u_2 + \cdots + u_n + \cdots.$$
We say that the series converges in D′(Ω) to u ∈ D′(Ω) if the sequence of partial sums sn = u₁ + ⋯ + un converges to u in D′(Ω), and in this case we write
$$u_1 + u_2 + \cdots + u_n + \cdots = u.$$
Since differentiation is continuous in D′(Ω), we can differentiate the series term by term as many times as we wish, i.e.,
$$D^\alpha u_1 + D^\alpha u_2 + \cdots + D^\alpha u_n + \cdots = D^\alpha u$$
for every multi-index α.

For example, with Ω = R, un(x) = (1/n) sin(nx) converges uniformly to 0 as n → ∞ (and uniform convergence implies convergence in D′(Ω)), but u′n(x) = cos(nx) does not converge, even pointwise. However, it does converge (to 0) in D′(R). In fact, un^{(j)} → 0 as n → ∞ in D′(R) for all j = 1, 2, ….
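The convergence cos(nx) → 0 in D′(R) can be illustrated numerically (our own sketch; the bump below has the same shape as ω in (5.1.1), rescaled to unit support, and stands in for a test function): the pairing ∫ cos(nx)φ(x) dx tends to 0 as n grows.

```python
import math

def bump(x):
    # a C-infinity function with compact support [-1, 1]
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def pairing(n, pts=200001):
    # (u_n', phi) = integral of cos(n x) * bump(x) over [-1, 1], midpoint rule
    h = 2.0 / pts
    return h * sum(math.cos(n * (-1.0 + (i + 0.5) * h)) * bump(-1.0 + (i + 0.5) * h)
                   for i in range(pts))
```

For smooth compactly supported φ the decay is in fact faster than any power of 1/n, by repeated integration by parts.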

Consider Ω = R, u, b ∈ D′(R) and smooth functions a₁, a₂, …, an ∈ C∞(R). Then, if u^{(j)} indicates the j-th derivative of u in D′(R), the differential equation
$$u^{(n)} + a_1 u^{(n-1)} + \cdots + a_{n-1}u^{(1)} + a_n u = b \tag{E}$$
and its homogeneous counterpart
$$u^{(n)} + a_1 u^{(n-1)} + \cdots + a_{n-1}u^{(1)} + a_n u = 0 \tag{E_0}$$
make sense in D′(R). Classically, there are nice solutions u to (E₀) and they are solutions in the sense of distributions as well. In fact, there are no other distributional solutions, as shown below.

The equation u′ = 0 has the constant distributions C as solutions in D′(R), since for all φ ∈ D(R)
$$(C', \varphi) = -(C, \varphi') = -C\int_{-\infty}^{\infty}\varphi'\,dt = 0.$$
Are there any other solutions in the sense of distributions? We answer this question in the following way. If u ∈ D′(R) and u′ = 0 in D′(R), we have
$$0 = (u', \varphi) = -(u, \varphi') \quad \forall\varphi\in\mathcal{D}(\mathbb{R}). \tag{5.3.25}$$

Given φ ∈ D(R) define
$$\psi(t) = \varphi(t) - \omega(t)\int_{-\infty}^{\infty}\varphi(s)\,ds$$
for all t ∈ R, with ω = ω(t) defined as in (5.1.1) (where k = 1). Note that ψ ∈ D(R) and ∫_{−∞}^{+∞} ψ dt = 0. Define
$$\varphi_1(t) = \int_{-\infty}^{t}\psi(s)\,ds, \quad t\in\mathbb{R},$$
and notice that φ₁ ∈ D(R) and φ₁′ = ψ. Now for all φ ∈ D(R)

$$(u, \varphi) = (u, \psi) + \Big[\int_{-\infty}^{\infty}\varphi(s)\,ds\Big](u, \omega) = (u, \varphi_1') + \int_{-\infty}^{\infty} C\varphi\,ds = (C, \varphi),$$
where C := (u, ω) is a constant and (u, φ₁′) = 0 by (5.3.25); thus u = C. Therefore, any distributional solution of the equation u′ = 0 is a constant distribution (i.e., a distribution generated by a constant function).

Consider now the first-order linear homogeneous differential system
$$\begin{cases} u_1' = a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n,\\ u_2' = a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n,\\ \ \ \vdots\\ u_n' = a_{n1}u_1 + a_{n2}u_2 + \cdots + a_{nn}u_n, \end{cases} \tag{5.3.26}$$

5.3 Scalar Distributions 133

where the aij ∈ C∞(R). Denoting by A(t) the matrix (aij(t)) and by u the column vector (u₁, …, un)ᵀ, we can rewrite (5.3.26) as
$$u' = Au. \tag{5.3.27}$$
Let X = X(t) be a fundamental matrix of classical solutions of (5.3.27). We know from the classical theory of linear differential systems (see, e.g., [8, 11]) that X is invertible and X′ = AX for all t ∈ R. Consider the transformation u = Xz; then, in the sense of distributions,
$$u' = X'z + Xz' = AXz + Xz',$$
and by (5.3.27) this equals Au = AXz, so Xz′ = 0; since X(t) is invertible for every t, we deduce that z′ = 0. We have denoted by u′ and z′ the column vectors whose components are the distributional derivatives of u₁, …, un and z₁, …, zn, respectively. As z′ = 0, z must be a constant vector z = c ∈ Rⁿ, and we find u = Xc. Therefore, there are no solutions in (D′(R))ⁿ to system (5.3.26) other than the classical ones.

Finally, consider the homogeneous equation (E₀). Since it can be written in the vector form (5.3.27), which has only classical solutions, so does (E₀). The non-homogeneous case (E) has a general solution which is obtained by adding to the general solution of (E₀) a particular solution to (E) in the sense of distributions. Indeed, if up ∈ D′(R) is such a particular solution, and u is an arbitrary solution in D′(R) of (E), then u − up is a (classical) solution of (E₀), hence a linear combination of the functions belonging to the fundamental system of solutions.

Example. Consider the equation u″ − 2u′ + u = 2δ(t − 1) in D′(R). If u is a distributional solution of it, then u must coincide with a classical solution of the corresponding homogeneous equation within each of the two intervals (−∞, 1) and (1, +∞), i.e., u is a function (regular distribution) of the form
$$u(t) = \begin{cases} (c_1 t + c_2)e^t, & t\in(-\infty, 1),\\ (c_3 t + c_4)e^t, & t\in(1, +\infty),\end{cases}$$
where c₁, c₂, c₃, c₄ are real constants. Not all these functions u are solutions of the given differential equation. The fact that such a function u is a solution means

$$\int_{-\infty}^{1}\big(u\varphi'' + 2u\varphi' + u\varphi\big)\,dt + \int_{1}^{\infty}\big(u\varphi'' + 2u\varphi' + u\varphi\big)\,dt = 2\varphi(1) \quad \forall\varphi\in\mathcal{D}(\mathbb{R}).$$

Integrating by parts and using the fact that u is a classical solution of the homogeneous equation in (−∞, 1) and also in (1, ∞), plus the fact that φ(1) and φ′(1) can be any real numbers, we obtain that u must be continuous at t = 1 while u′ must jump by 2 there:
$$c_3 + c_4 = c_1 + c_2, \qquad (2c_3 + c_4) - (2c_1 + c_2) = 2e^{-1},$$
so
$$c_3 = c_1 + 2e^{-1}, \qquad c_4 = c_2 - 2e^{-1}.$$

Thus the general solution of the given equation is
$$u(t) = (c_1 t + c_2)e^t + \begin{cases} 2(t-1)e^{t-1}, & t > 1,\\ 0, & t < 1.\end{cases}$$

It is worth pointing out that there is no classical solution of the given

equation, more precisely there is a jump at t = 1 in the ﬁrst derivative

of any solution (which is caused by the Dirac distribution in the right-

hand side of the equation).
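This jump can be verified directly. The sketch below (our addition; the constants c1, c2 are chosen arbitrarily) uses the piecewise solution found above and its piecewise-computed derivative to confirm that u is continuous at t = 1 while u′ jumps there by exactly 2:

```python
import math

C1, C2 = 0.3, -1.2  # arbitrary constants in the general solution

def u(t):
    # (c1 t + c2) e^t, plus the extra term 2(t-1)e^{t-1} for t > 1
    extra = 2.0 * (t - 1.0) * math.exp(t - 1.0) if t > 1.0 else 0.0
    return (C1 * t + C2) * math.exp(t) + extra

def du(t):
    # derivative, computed piecewise; d/dt [2(t-1)e^{t-1}] = 2 t e^{t-1}
    base = (C1 * t + C1 + C2) * math.exp(t)
    extra = 2.0 * t * math.exp(t - 1.0) if t > 1.0 else 0.0
    return base + extra

jump = du(1.0 + 1e-12) - du(1.0 - 1e-12)
```

The extra term vanishes at t = 1 (continuity of u), while its derivative equals 2·1·e⁰ = 2 there (the jump of u′).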

Remark 5.14. Note that in equation (E) above the coefficient of u^{(n)} is 1, i.e., we do not have any singularity in the coefficient of the leading term. Otherwise, some difficulties may occur. For example, consider the simple equation
$$t\,u' = 0 \quad \text{in } \mathcal{D}'(\mathbb{R}). \tag{5.3.29}$$


Any solution of (5.3.29) must be constant in (−∞, 0) and in (0, ∞) as well. So the general solution is
$$u(t) = c_1 + c_2 H(t), \quad t\in\mathbb{R},$$

where c1 , c2 are real constants. Note that in this case there are two

independent solutions (e.g., u1 (t) = 1, u2 (t) = H(t)), even if the given

equation is of order one.

To illustrate the use of distributions in solving problems associated with partial differential equations, consider the following examples.

Example 1.
Consider the equation of an infinite vibrating string with no external force acting on it,
$$u_{tt} = u_{xx}, \quad t > 0,\ x\in\mathbb{R}, \tag{5.3.30}$$
subject to the initial conditions
$$u(0, x) = \psi(x), \qquad u_t(0, x) = 0, \quad x\in\mathbb{R}, \tag{5.3.31}$$
where
$$u_t := \frac{\partial u}{\partial t}, \qquad u_{tt} := \frac{\partial^2 u}{\partial t^2}, \qquad u_{xx} := \frac{\partial^2 u}{\partial x^2}.$$

First assume that ψ ∈ C²(R). Recall that using the change of variables α = x + t, β = x − t, the equation (5.3.30) becomes
$$u_{\alpha\beta} = 0.$$
So it is easily seen that any solution of the Eq. (5.3.30) has the form u = g(x + t) + h(x − t), and so, applying (5.3.31), we find
$$u = \frac12\big[\psi(x+t) + \psi(x-t)\big] \quad (\text{D'Alembert's formula}^8). \tag{5.3.32}$$

⁸ Jean-Baptiste le Rond d'Alembert, French mathematician, mechanician, physicist, philosopher, and music theorist, 1717–1783.


Thus, for ψ ∈ C²(R), u given by (5.3.32) is a classical solution of problem (5.3.30), (5.3.31). On the other hand, if ψ is only of class C¹(R), then u given by (5.3.32) is no longer a classical solution of Eq. (5.3.30). However, this u still satisfies conditions (5.3.31).

Now, assume that ψ ∈ C(R). In this case, the function u given by (5.3.32) only satisfies classically the condition u(0, x) = ψ(x), x ∈ R. However, there should still be some relation between this u and problem (5.3.30), (5.3.31). Indeed, we can show that this u satisfies (5.3.30) and the condition ut(0, x) = 0 (x ∈ R) in a weak sense, that is, in the sense of distributions.

If ψ′, ψ″ denote the first and second derivatives of ψ in D′(R), then it is easily seen that
$$D^{(2,0)}\psi(x+t) = D^{(0,2)}\psi(x+t) = \psi''(x+t) \quad \text{in } \mathcal{D}'(\mathbb{R}^2).$$
Similarly,
$$D^{(2,0)}\psi(x-t) = D^{(0,2)}\psi(x-t) = \psi''(x-t) \quad \text{in } \mathcal{D}'(\mathbb{R}^2).$$

Therefore
$$\big(D^{(2,0)}u(t,x),\, \varphi(t,x)\big) = \frac12\big(\psi''(x+t) + \psi''(x-t),\, \varphi(t,x)\big) = \big(D^{(0,2)}u(t,x),\, \varphi(t,x)\big)$$
for all φ ∈ D(R²), which shows that u given by (5.3.32) satisfies Eq. (5.3.30) in the sense of distributions.

We also have
$$D^{(1,0)}u(t,x) = \frac12\big[\psi'(x+t) - \psi'(x-t)\big] = 0 \quad \text{if } t = 0,$$
so the second initial condition holds in the sense of distributions.


Example 2.

Here we discuss the boundary controllability of the 1-dimensional wave

equation describing the vibrations of a ﬁnite string. Speciﬁcally, let us

consider the following initial-boundary value problem:

$$\begin{cases} u_{tt} - u_{xx} = 0, & 0 < x < 1,\ t > 0,\\[2pt] u(t, 0) = 0,\quad u(t, 1) = f(t), & t > 0,\\[2pt] u(0, x) = u^0(x),\quad u_t(0, x) = u^1(x), & 0 < x < 1, \end{cases} \tag{5.3.33}$$
where f is a boundary control function. We shall prove that f can be chosen so that the solution u vanishes from some time T > 0 on; in fact, we shall see that there exists a lowest time instant T with this property, precisely T = 2. Obviously, any T > 2 satisfies the same property.

This result is in accordance with similar results previously obtained

by other authors by using diﬀerent arguments (see, e.g., [34, p. 57]).

Our direct approach is more advantageous since it provides the solution

u (in a generalized sense, under weak assumptions on the data) as a

function of u0 , u1 , and f and allows us to determine the minimal time

interval (0, 2) and an explicit control function f (depending on u0 and

u1 ) which steers the solution u to zero. It may happen that this direct

approach is known, but we could not ﬁnd anything about it in the

literature. Nevertheless, we present it here as a nice application and

do not claim originality.

Existence of Solutions to Problem (5.3.33)

Denote R = {(t, x); t ≥ 0, 0 ≤ x ≤ 1}. Consider in the ﬁrst instance

that u = u(t, x) is a classical solution of problem (5.3.33) corresponding

to regular u0 , u1 , and f . Obviously, the solution of the above wave

equation has the general form u = g(x + t) + h(x − t). We shall determine g(x + t) and h(x − t) (hence u = u(t, x)) within different subsets (triangles or squares) of R, as follows.

From the initial conditions we get
$$g(x) + h(x) = u^0(x), \qquad g(x) - h(x) = \int_0^x u^1 + c, \quad 0 < x < 1,$$
where c is a constant, hence


$$g(x) = \frac12\Big(u^0(x) + \int_0^x u^1 + c\Big), \qquad h(x) = \frac12\Big(u^0(x) - \int_0^x u^1 - c\Big), \quad 0 < x < 1. \tag{5.3.35}$$

From the boundary conditions we obtain
$$g(t) + h(-t) = 0, \quad t > 0, \tag{5.3.36}$$
and
$$g(1 + t) + h(1 - t) = f(t), \quad t > 0. \tag{5.3.37}$$

Now (5.3.36) yields (see also (5.3.35))
$$h(-t) = -g(t) = -\frac12\Big(u^0(t) + \int_0^t u^1 + c\Big), \quad 0 < t < 1. \tag{5.3.38}$$

Similarly, (5.3.37) combined with (5.3.35) gives
$$g(1+t) = f(t) - h(1-t) = f(t) - \frac12\Big(u^0(1-t) - \int_0^{1-t} u^1 - c\Big), \quad 0 < t < 1. \tag{5.3.39}$$

Then, by (5.3.36),
$$h(-t-1) = -g(1+t) = -f(t) + \frac12\Big(u^0(1-t) - \int_0^{1-t} u^1 - c\Big), \quad 0 < t < 1, \tag{5.3.40}$$

and, by (5.3.37) and (5.3.38),
$$g(2+t) = f(1+t) - h(-t) = f(1+t) + \frac12\Big(u^0(t) + \int_0^t u^1 + c\Big), \quad 0 < t < 1, \tag{5.3.41}$$

$$h(-t-2) = -g(t+2) = -f(1+t) - \frac12\Big(u^0(t) + \int_0^t u^1 + c\Big), \quad 0 < t < 1, \tag{5.3.42}$$


$$g(3+t) = f(2+t) - h(-t-1) = f(2+t) + f(t) - \frac12\Big(u^0(1-t) - \int_0^{1-t} u^1 - c\Big), \quad 0 < t < 1, \tag{5.3.43}$$
and so on.

These formulas determine g and h, hence u, in R. We decompose R into triangles and squares as in Fig. 5.1.

[Fig. 5.1. The half-strip R = {t ≥ 0, 0 ≤ x ≤ 1} decomposed by the characteristic lines x + t = 1, 2, 3, … and x − t = 0, −1, −2, … into triangles and squares labeled A, B, C, D, E, F, G, H, I, J, …; for instance, A is the triangle {0 ≤ x + t ≤ 1, 0 ≤ x − t ≤ 1}.]

We first determine g. In the intersection of R and the strip {0 ≤ x + t ≤ 1} we have (see (5.3.35)):

$$g(x+t) = \frac12\Big(u^0(x+t) + \int_0^{x+t} u^1 + c\Big). \tag{5.3.44}$$

Next, in the intersection of R and the strip {1 ≤ x + t ≤ 2} (see (5.3.39)),
$$g(x+t) = g\big((x+t-1)+1\big) = f(x+t-1) - \frac12\Big(u^0(2-x-t) - \int_0^{2-x-t} u^1 - c\Big). \tag{5.3.45}$$


Thus, choosing f : (0, 1) → R,
$$f(y) = \frac12\Big(u^0(1-y) - \int_0^{1-y} u^1 - c\Big) \quad \forall y\in(0,1),$$
implies g(x + t) = 0 in C ∪ D ∪ E.

Similarly, in the strip {2 ≤ x + t ≤ 3} (see (5.3.41)),
$$g(x+t) = g\big((x+t-2)+2\big) = f(x+t-1) + \frac12\Big(u^0(x+t-2) + \int_0^{x+t-2} u^1 + c\Big). \tag{5.3.46}$$

If we choose f : (1, 2) → R,
$$f(z) = -\frac12\Big(u^0(z-1) + \int_0^{z-1} u^1 + c\Big) \quad \forall z\in(1,2),$$
then g(x + t) = 0 in F ∪ G ∪ H.

We proceed similarly with h. First, in the triangle A ∪ C (see (5.3.35)),
$$h(x-t) = \frac12\Big(u^0(x-t) - \int_0^{x-t} u^1 - c\Big). \tag{5.3.47}$$

Next, in the strip {−1 ≤ x − t ≤ 0}, by (5.3.36) and (5.3.35),
$$h(x-t) = h\big(-(t-x)\big) = -g(t-x) = -\frac12\Big(u^0(t-x) + \int_0^{t-x} u^1 + c\Big). \tag{5.3.48}$$

In the strip {−2 ≤ x − t ≤ −1}, by (5.3.40),
$$h(x-t) = h\big(-(t-x-1)-1\big) = -f(t-x-1) + \frac12\Big(u^0(2+x-t) - \int_0^{2+x-t} u^1 - c\Big). \tag{5.3.49}$$


With f chosen as above on (0, 1), this yields h(x − t) = 0 in E ∪ G ∪ I.

In the strip {−3 ≤ x − t ≤ −2}, by (5.3.42),
$$h(x-t) = h\big(-(t-x-2)-2\big) = -f(t-x-1) - \frac12\Big(u^0(t-x-2) + \int_0^{t-x-2} u^1 + c\Big), \tag{5.3.50}$$
and so on.

In fact all the above computations are valid for u0 , u1 ∈ L1 (0, 1) and

f ∈ L1loc [0, ∞).

These calculations lead to the following theorem:

Theorem 5.15. For all u0, u1 ∈ L¹(0, 1) and f ∈ L¹loc[0, ∞) (i.e., f is Lebesgue summable on (0, m) for all m > 0), problem (5.3.33) has a unique weak solution u. If u0 ∈ C[0, 1], u1 ∈ L¹(0, 1), f ∈ C[0, ∞), and the compatibility conditions
$$u^0(0) = 0, \qquad u^0(1) = f(0) \tag{5.3.51}$$
are satisfied, then u is continuous on [0, ∞) × [0, 1].

Proof. The solution is constructed piecewise, as above, in the form u = g(x + t) + h(x − t). Obviously, u satisfies the wave equation in the distribution sense on the interior of each of the sets A, B, C, D, E, F, G, and so on. In this sense, u is a weak solution of the wave equation. By construction, the initial and boundary conditions are also satisfied. It is easily seen that the constant c disappears when constructing u = g(x + t) + h(x − t) in A, B, C, …, so the solution u is unique.

If u0 ∈ C[0, 1], u1 ∈ L¹(0, 1), f ∈ C[0, ∞) and u0, f satisfy (5.3.51), then u is continuous on [0, ∞) × [0, 1]. It suffices to observe that u is continuous on the characteristic lines {x − t = −i}, i = 0, 1, …, and {x + t = j}, j = 1, 2, …, restricted to the infinite strip R.

Better regularity of u can be obtained under higher regularity of the data and additional compatibility conditions.


A careful analysis of the above computations shows that there are pairs

(u0 , u1 ) for which there are no functions f : (0, T ) → R, T < 2, making

u = 0 in the trapezoid A ∪ B ∪ C ∪ D ∪ F . In other words, the waves

cannot be controlled in [0, T ] if T < 2. On the other hand, we have

Theorem 5.17. For any pair (u0, u1) ∈ L¹(0, 1) × L¹(0, 1) there exists a control function f : (0, +∞) → R defined by
$$f(y) = \begin{cases} \dfrac12\Big(u^0(1-y) - \displaystyle\int_0^{1-y} u^1 - c\Big), & y\in(0,1),\\[12pt] -\dfrac12\Big(u^0(y-1) + \displaystyle\int_0^{y-1} u^1 + c\Big), & y\in(1,2),\\[12pt] 0, & y > 2, \end{cases} \tag{5.3.52}$$
with c = −u⁰(1) − ∫₀¹ u¹, which makes u = 0 in the infinite trapezoid {x ≤ t − 1} ∩ {0 < x < 1}.

Proof. The proof follows easily from the computations performed above, including the remarks on the regions where g(x + t) = 0 or h(x − t) = 0 (based on the fact that f is that given in (5.3.52)). Of course, u vanishes in {x < t − 1} ∩ {0 < x < 1} since f(t) = 0 for t > 2.

Remark 5.18. With the control function f defined by (5.3.52), the corresponding (unique) solution u vanishes starting from the line segment defined as the intersection of R and the characteristic line {x − t = −1} and remains zero everywhere on the right side of that segment, which can be interpreted as a threshold. So the waves can be controlled in the minimal time interval (0, 2) and in fact in any interval (0, T) with T ≥ 2.

Remark 5.19. While the solution u is unique in Theorem 5.15, the control function f is not, since the constant c in (5.3.52) can be chosen arbitrarily. Indeed, the restriction of f to the interval (0, 2) is unique up to an additive constant, as follows from the computations above. We chose c = −u⁰(1) − ∫₀¹ u¹ in Theorem 5.17 in order to obtain a continuous control function f.
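The continuity of this control can be sanity-checked numerically. The sketch below (our own, with illustrative data u⁰(x) = sin πx, which satisfies u⁰(0) = 0, and u¹(x) = x) builds f according to (5.3.52) with c = −u⁰(1) − ∫₀¹ u¹ and checks continuity at y = 1 and y = 2:

```python
import math

u0 = lambda x: math.sin(math.pi * x)  # illustrative initial displacement
I1 = lambda x: 0.5 * x * x            # ∫_0^x u1 with u1(s) = s
c = -u0(1.0) - I1(1.0)                # the choice of c made in Theorem 5.17

def f(y):
    # the control function (5.3.52)
    if 0.0 < y < 1.0:
        return 0.5 * (u0(1.0 - y) - I1(1.0 - y) - c)
    if 1.0 < y < 2.0:
        return -0.5 * (u0(y - 1.0) + I1(y - 1.0) + c)
    return 0.0
```

With a different constant c the formula still defines a control, but f is then discontinuous at y = 2.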

5.4 Sobolev Spaces

Let ∅ ≠ Ω ⊂ Rk be an open set. For m ∈ N, 1 ≤ p ≤ ∞, define the Sobolev space of order m to be⁹
$$W^{m,p}(\Omega) = \big\{u\in L^p(\Omega);\ D^\alpha u\in L^p(\Omega)\ \ \forall\alpha\in\mathbb{N}_0^k,\ |\alpha|\le m\big\},$$
where the Dαu denote derivatives in the sense of distributions.

Obviously, W^{m,p}(Ω) is a linear space with respect to the usual operations of addition and scalar multiplication. In particular,

$$W^{1,p}(\Omega) = \Big\{u\in L^p(\Omega);\ \frac{\partial u}{\partial x_i}\in L^p(\Omega)\ \ \forall i = 1,\ldots,k\Big\},$$
where ∂u/∂xi is the partial derivative of u with respect to xi in the sense of distributions.

W^{m,p}(Ω) is a Banach space with respect to the norm
$$\|u\|_{m,p} = \Big(\sum_{|\alpha|\le m}\|D^\alpha u\|_{L^p(\Omega)}^p\Big)^{1/p}, \quad 1 \le p < \infty,$$
and
$$\|u\|_{m,\infty} = \max_{|\alpha|\le m}\|D^\alpha u\|_{L^\infty(\Omega)}.$$

Let (un)n∈N be a Cauchy sequence in W^{m,p}(Ω), i.e., for all ε > 0 there exists N = N(ε) ∈ N such that
$$\|D^\alpha u_n - D^\alpha u_l\|_{L^p(\Omega)} < \varepsilon \quad \forall n, l > N,\ |\alpha| \le m.$$
Since Lᵖ(Ω) is a Banach space (with respect to ‖·‖_{Lᵖ(Ω)}), there exist u, uα ∈ Lᵖ(Ω) such that
$$u_n \to u, \quad D^\alpha u_n \to u_\alpha \ \text{ in } L^p(\Omega),\ 1 \le |\alpha| \le m. \tag{5.4.53}$$

Claim: In general, if vn → v in Lp (Ω), 1 ≤ p ≤ ∞, then vn → v in

D (Ω).

⁹ Sergei L. Sobolev, Russian mathematician, 1908–1989.


Indeed, for every φ ∈ D(Ω), by Hölder's inequality,
$$|(v_n - v, \varphi)| = \Big|\int_\Omega (v_n - v)\varphi\,dx\Big| \le \|v_n - v\|_{L^p(\Omega)}\,\|\varphi\|_{L^{p'}(\Omega)} \to 0,$$
where p′ is the conjugate exponent of p (with the obvious modifications for p = 1 and p = ∞).

By the above claim, it follows that the convergences in (5.4.53) also hold in D′(Ω). Since Dα is a closed operation in D′(Ω), it follows that uα = Dαu for |α| ≤ m; hence u ∈ W^{m,p}(Ω) and un → u in W^{m,p}(Ω).

We define W₀^{m,p}(Ω) as the closure of C0∞(Ω) in (W^{m,p}(Ω), ‖·‖m,p). Obviously, W₀^{m,p}(Ω) is a Banach space with respect to ‖·‖m,p for all m ∈ N, 1 ≤ p ≤ ∞.

For p = 2 there are specific notations: H^m(Ω) := W^{m,2}(Ω) and H₀^m(Ω) := W₀^{m,2}(Ω). These are Hilbert spaces with the scalar product
$$(u, v)_m := \sum_{|\alpha|\le m}\big(D^\alpha u,\, D^\alpha v\big)_{L^2(\Omega)}$$
(recall that a Hilbert space is a Banach space whose norm is defined by a scalar product (·, ·), i.e., ‖x‖ = √(x, x), x ∈ X; see Chap. 6 for more information on Hilbert spaces).

In particular, the scalar product of H¹(Ω) (and of H₀¹(Ω) as well) is
$$(u, v)_1 = (u, v)_{L^2(\Omega)} + \sum_{j=1}^{k}\Big(\frac{\partial u}{\partial x_j}, \frac{\partial v}{\partial x_j}\Big)_{L^2(\Omega)} = \int_\Omega uv\,dx + \sum_{j=1}^{k}\int_\Omega\frac{\partial u}{\partial x_j}\,\frac{\partial v}{\partial x_j}\,dx,$$
with the associated norm given by
$$\|u\|_1^2 = \|u\|_{L^2(\Omega)}^2 + \sum_{j=1}^{k}\int_\Omega\Big(\frac{\partial u}{\partial x_j}\Big)^2 dx.$$

For the details of the above facts we refer the reader to [1, p. 47], where further results on Sobolev spaces can be found. See also [2, 6, 14].

Let us recall (without proof) the following approximation result (cf., e.g., [14, p. 252]).

Theorem 5.21. Let ∅ ≠ Ω ⊂ Rk be an open, bounded set of class C¹, and let 1 ≤ p < ∞. Then for every u ∈ W^{m,p}(Ω) there exists a sequence (un) in C∞(Ω̄) such that un → u in W^{m,p}(Ω).

For the definition of a C¹ open set see [6, p. 272]. Generally, in applications ∂Ω is smooth enough and consequently Ω is of class C¹.

Notice also that W0m,p (Rk ) = W m,p (Rk ), i.e., C0∞ (Rk ) is dense in

W m,p (Rk ) (see, e.g., [1, p. 56]). But, in general, W0m,p (Ω) is a proper

subspace of W m,p (Ω).

Let us also state (without proof) a unified version of some results due to Sobolev, Rellich & Kondrashov¹⁰ (see, e.g., [2, pp. 3–4]).

Theorem 5.22. Let ∅ ≠ Ω ⊂ Rk be an open set of class C¹ and let m ∈ N, 1 ≤ p < ∞. Then there are the continuous embeddings:
(a) if m < k/p, then W^{m,p}(Ω) ↪ L^q(Ω) ∀q ∈ [p, p*], where p* = kp/(k − mp);
(b) if m = k/p, then W^{m,p}(Ω) ↪ L^q(Ω) ∀q ∈ [p, ∞);
(c) if m > k/p, then W^{m,p}(Ω) ↪ C^{0,α}(Ω̄) (which is the space of Hölder continuous functions defined on Ω̄ with exponent α ∈ (0, 1), and with α = 1 if m − k/p > 1).

If, in addition, Ω is bounded, then all the above embeddings are com-

pact except for the case q = p∗ in (a), and furthermore, if we replace

W m,p (Ω) by W0m,p (Ω), then all these embeddings (including the com-

pact ones) hold without any regularity condition on ∂Ω.

¹⁰ Vladimir I. Kondrashov, Russian mathematician, 1909–1971; Franz Rellich, Austrian-German mathematician, 1906–1955.


The above embeddings are the natural linear injective maps between the corresponding spaces. In particular, the embedding (c) above associates with every u ∈ W^{m,p}(Ω) (which is a class of functions with respect to the a.e. equality) its continuous representative. Continuity and compactness of the above embeddings are understood in the usual sense.

We continue with a few words on the trace of functions from W^{m,p}(Ω) on the boundary ∂Ω of Ω. The concept of trace is important for applications to boundary value problems for partial differential equations. We restrict our attention to W^{1,p}(Ω), 1 ≤ p < ∞, since this case is sufficient for the applications that will be discussed later.

Clearly, for a function u ∈ C(Ω) its restriction to ∂Ω, denoted u|∂Ω , is

well deﬁned. But if u ∈ W 1,p (Ω) then u is only deﬁned a.e. on Ω so it

does not make sense to speak about the restriction of u to ∂Ω because

the k-dimensional Lebesgue measure of ∂Ω is zero; however, there is

a trace of u on ∂Ω which plays the role of the restriction u|∂Ω . More

precisely, we have the following theorem (cf. [14, pp. 258–259]):

Theorem 5.23. Let ∅ ≠ Ω ⊂ Rk be an open, bounded set of class C¹, and let 1 ≤ p < ∞. There exists a continuous linear operator γ : W^{1,p}(Ω) → L^p(∂Ω) such that γ(u) = u|∂Ω for all u ∈ W^{1,p}(Ω) ∩ C(Ω̄). Moreover, u ∈ W₀^{1,p}(Ω) if and only if u ∈ W^{1,p}(Ω) and γ(u) = 0.

The function γ(u) is called the trace of u on ∂Ω; the operator γ extends by continuity the classical restriction to ∂Ω from W^{1,p}(Ω) ∩ C(Ω̄) to L^p(∂Ω). This extension is unique since W^{1,p}(Ω) ∩ C(Ω̄) is dense in (W^{1,p}(Ω), ‖·‖₁,p) (see Theorem 5.21). If u ∈ W₀^{1,p}(Ω), hence γ(u) = 0, we say that u = 0 on ∂Ω in a generalized sense. For details on traces and L^p(∂Ω), 1 ≤ p < ∞, see [14].

The case k = 1

If Ω = (a, b) ⊂ R, −∞ ≤ a < b ≤ +∞, we write
$$L^p(a,b) := L^p\big((a,b)\big), \quad W^{m,p}(a,b) := W^{m,p}\big((a,b)\big), \quad W_0^{m,p}(a,b) := W_0^{m,p}\big((a,b)\big),$$
$$H^m(a,b) := H^m\big((a,b)\big), \quad H_0^m(a,b) := H_0^m\big((a,b)\big).$$

In the one-dimensional case more can be said about Sobolev functions and their derivatives in the sense of distributions. In particular, we shall see that for 1 ≤ p < ∞ and −∞ < a < b < ∞, every u ∈ W^{1,p}(a, b) has a representative which is an absolutely continuous function on [a, b]; so, identifying u with this representative, u(a) and u(b) make sense classically. According to Theorem 5.23, u is in W₀^{1,p}(a, b) if and only if u ∈ W^{1,p}(a, b) and u(a) = 0 = u(b). This shows in particular that W₀^{1,p}(a, b) is a proper subspace of W^{1,p}(a, b).

Green’s Identity

Let ∅ = Ω ⊂ Rk be an open and bounded set of class C 1 . Recall the

classical divergence (Gauss–Ostrogradski11 ) formula

∇ · F dx = F · n ds

Ω ∂Ω

∀F = (f1 , . . . , fk ), fi ∈ C 1 (Ω), i = 1, . . . , k, (5.4.54)

F = g∇f , with f ∈ C 2 (Ω) and g ∈ C 1 (Ω), one obtains the classical

Green identity

∂f

gΔf dx + ∇f · ∇g dx = g ds . (5.4.55)

Ω Ω ∂Ω ∂n

Taking into account Theorems 5.21 and 5.23, the identity (5.4.55) can easily be extended by density to
$$\int_\Omega g\,\Delta f\,dx + \int_\Omega \nabla f\cdot\nabla g\,dx = \int_{\partial\Omega} g\,\frac{\partial f}{\partial n}\,ds \quad \forall f\in W^{2,p}(\Omega),\ g\in W^{1,q}(\Omega), \tag{5.4.56}$$
where 1 < p < ∞ and q is the conjugate of p, i.e., q = p/(p − 1). Here, the functions in the integral over ∂Ω actually represent their traces on ∂Ω.

Poincaré’s Inequality12

Now we present an important inequality which holds in W01,p (Ω) for

1 ≤ p < ∞ and Ω open and bounded.

¹¹ Mikhail V. Ostrogradski, Russian-Ukrainian mathematician, mechanician, and physicist, 1801–1862.

¹² Henri Poincaré, French mathematician, theoretical physicist, engineer, and philosopher of science, 1854–1912.


Theorem 5.24 (Poincaré's inequality). Let ∅ ≠ Ω ⊂ Rk be an open, bounded set, and let 1 ≤ p < ∞. Then there exists a constant C = C(Ω, p) > 0 such that
$$\|u\|_{L^p(\Omega)} \le C\,\|\nabla u\|_{L^p(\Omega)} \quad \forall u\in W_0^{1,p}(\Omega), \tag{5.4.57}$$
where
$$\|\nabla u\|_{L^p(\Omega)} := \Big(\sum_{i=1}^{k}\big\|\partial u/\partial x_i\big\|_{L^p(\Omega)}^p\Big)^{1/p}.$$
Proof. Since C0∞(Ω) is dense in W₀^{1,p}(Ω), it suffices to prove (5.4.57) for all u ∈ C0∞(Ω).

Consider first the case k = 1, i.e., Ω = (a, b), −∞ < a < b < ∞. If u ∈ C0∞(a, b) := C0∞((a, b)), then
$$u(x) = \int_a^x u'(t)\,dt \ \Longrightarrow\ |u(x)| \le \int_a^b |u'(t)|\,dt \quad \forall x\in[a,b].$$
For p = 1, (5.4.57) with C = b − a follows by integrating the latter inequality over [a, b]. If 1 < p < ∞ then we can derive from the same inequality, by using Hölder's inequality,
$$|u(x)| \le (b-a)^{1/p'}\,\|u'\|_{L^p(a,b)} \quad \forall x\in[a,b],$$
where p′ = p/(p − 1). It follows that
$$\int_a^b |u(x)|^p\,dx \le (b-a)^p\,\|u'\|_{L^p(a,b)}^p,$$
i.e., (5.4.57) holds for k = 1 with C = b − a.
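A quick numerical illustration of the one-dimensional case (our own sketch, not from the text): for u(x) = sin πx, which belongs to W₀^{1,2}(0, 1), the inequality ‖u‖_{L²} ≤ (b − a)‖u′‖_{L²} holds with room to spare.

```python
import math

def l2_norm(fn, a=0.0, b=1.0, pts=100001):
    # midpoint-rule approximation of the L^2(a,b) norm
    h = (b - a) / pts
    return math.sqrt(h * sum(fn(a + (i + 0.5) * h) ** 2 for i in range(pts)))

u  = lambda x: math.sin(math.pi * x)            # u(0) = u(1) = 0
du = lambda x: math.pi * math.cos(math.pi * x)  # classical derivative of u
```

Here ‖u‖ = 1/√2 and ‖u′‖ = π/√2, so the ratio is 1/π, well below C = b − a = 1; in fact 1/π is the best constant on (0, 1), attained precisely by this u.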

Consider now the case k = 2, and let D = [a, b] × [c, d] be a rectangle in the xy-plane such that Ω̄ ⊂ D. Take u ∈ C0∞(Ω) and extend it as zero in D \ Ω. We have
$$u(x, y) = \int_a^x \frac{\partial}{\partial s}u(s, y)\,ds \ \Longrightarrow\ |u(x, y)| \le \int_a^b\Big|\frac{\partial}{\partial s}u(s, y)\Big|\,ds \quad \forall (x, y)\in D,$$
and integrating over D,
$$\|u\|_{L^1(D)} \le (b-a)\,\Big\|\frac{\partial u}{\partial x}\Big\|_{L^1(D)} \ \Longrightarrow\ \|u\|_{L^1(\Omega)} \le (b-a)\,\Big\|\frac{\partial u}{\partial x}\Big\|_{L^1(\Omega)}.$$
Arguing as in the case k = 1 (via Hölder's inequality), we obtain for 1 < p < ∞
$$\|u\|_{L^p(\Omega)} \le (b-a)\,\Big\|\frac{\partial u}{\partial x}\Big\|_{L^p(\Omega)}, \tag{5.4.58}$$
so, in fact, (5.4.58) is valid for p ∈ [1, ∞).

Similarly,
$$\|u\|_{L^p(\Omega)} \le (d-c)\,\Big\|\frac{\partial u}{\partial y}\Big\|_{L^p(\Omega)}. \tag{5.4.59}$$
By (5.4.58) and (5.4.59) it follows that (5.4.57) holds with C = 2 max{b − a, d − c}. The proof is similar for k ≥ 3. □

Remark 5.25. An inspection of the above proof shows that the Poincaré

inequality still holds if the Lebesgue measure of Ω is ﬁnite, and also if

the projection of Ω on some coordinate plane is bounded.

Remark 5.26. If Ω is bounded or satisfies one of the conditions in the previous remark, then, according to the Poincaré inequality, W₀^{1,p}(Ω) can be equipped with the new norm u ↦ ‖∇u‖_{L^p(Ω)}, which is equivalent to ‖·‖₁,p on W₀^{1,p}(Ω).

5.5 Bochner's Integral

Let ∅ ≠ Ω ⊂ Rk be a Lebesgue measurable set, and let (X, ‖·‖) be a real Banach space.

As in the case of R-valued functions, a function g : Ω → X is a simple function if it is of the form
$$g(s) = \sum_{i=1}^{p} \chi_{M_i}(s)\,y_i,$$
where yi ∈ X and the Mi ⊂ Ω are Lebesgue measurable sets with m(Mi) < ∞, and Mi ∩ Mj = ∅ if i ≠ j. Here, we prefer to use s to denote a generic point in Ω (instead of x, which could be used to designate points of X).

A function f : Ω → X is called strongly measurable (or simply measurable) if there exists a sequence of simple functions gn : Ω → X such that
$$\lim_{n\to\infty}\|g_n(s) - f(s)\| = 0 \quad \text{for a.a. } s\in\Omega.$$


For a simple function g as above, we define its integral over Ω to be

∫_Ω g(s) ds := Σ_{i=1}^p m(M_i) y_i .

Note that s ↦ ‖g(s)‖ is a simple function as well (hence Lebesgue integrable over Ω) and the following inequality holds:

‖ ∫_Ω g(s) ds ‖ ≤ ∫_Ω ‖g(s)‖ ds .
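For X = R² with the Euclidean norm, the integral of a simple function and the inequality ‖∫_Ω g ds‖ ≤ ∫_Ω ‖g‖ ds can be checked directly; the measures m(M_i) and vectors y_i below are arbitrary sample data of my own:

```python
import numpy as np

# A simple function g = sum_i chi_{M_i} y_i with values in X = R^2,
# for two disjoint measurable sets with m(M_1) = 1, m(M_2) = 2.
measures = [1.0, 2.0]
ys = [np.array([3.0, 0.0]), np.array([-1.0, 1.0])]

integral = sum(m * y for m, y in zip(measures, ys))       # sum_i m(M_i) y_i
lhs = np.linalg.norm(integral)                            # || integral of g ||
rhs = sum(m * np.linalg.norm(y) for m, y in zip(measures, ys))  # integral of ||g||
print(lhs, rhs)
```

The gap between the two sides reflects the triangle inequality applied to the finite sum defining the integral.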

The set of all X-valued simple functions on Ω is a real linear space with respect to the usual operations (addition of functions and scalar multiplication), and the integral is linear:

∫_Ω (α1 g1 + α2 g2) ds = α1 ∫_Ω g1 ds + α2 ∫_Ω g2 ds .

Definition 5.27. A function f : Ω → X is said to be Bochner integrable (over Ω)13 if there exists a sequence of simple functions g_n : Ω → X converging strongly to f a.e. in Ω (so f is measurable) and

lim_{n,m→∞} ∫_Ω ‖g_n(s) − g_m(s)‖ ds = 0 . (5.5.60)

In this case, the Bochner integral of f over Ω is defined by

∫_Ω f(s) ds := lim_{n→∞} ∫_Ω g_n(s) ds . (5.5.61)

This definition makes sense. Indeed,

‖ ∫_Ω g_n ds − ∫_Ω g_m ds ‖ = ‖ ∫_Ω (g_n − g_m) ds ‖ ≤ ∫_Ω ‖g_n − g_m‖ ds .

13 Salomon Bochner, American mathematician, 1899–1982.


So (5.5.60) implies

lim_{n,m→∞} ‖ ∫_Ω g_n ds − ∫_Ω g_m ds ‖ = 0 ,

i.e., the limit in (5.5.61) exists. To prove that the limit does not depend on the choice of (g_n), consider another sequence (g̃_n) satisfying the same properties. Then, by (5.5.60), we have for all ε > 0

∫_Ω ‖g_n − g̃_n − g_m + g̃_m‖ ds ≤ ∫_Ω ‖g_n − g_m‖ ds + ∫_Ω ‖g̃_n − g̃_m‖ ds ≤ ε ∀n, m > Nε .

Letting m → ∞ and using Fatou's lemma (note that g_m − g̃_m → f − f = 0 a.e. in Ω), we obtain

∫_Ω ‖g_n − g̃_n‖ ds ≤ ε ∀n > Nε . (5.5.62)

On the other hand,

‖ ∫_Ω g_n ds − ∫_Ω g̃_n ds ‖ = ‖ ∫_Ω (g_n − g̃_n) ds ‖ ≤ ∫_Ω ‖g_n − g̃_n‖ ds . (5.5.63)

From (5.5.62) and (5.5.63) we deduce

lim_{n→∞} ∫_Ω g̃_n ds = lim_{n→∞} ∫_Ω g_n ds = ∫_Ω f ds ,

so the integral in (5.5.61) does not depend on the approximating sequence.

Remark 5.28. Note that if X = RN , N ∈ N, then f = (f1 , . . . , fN ) is

measurable in the sense above if and only if fi is Lebesgue measurable

for all i = 1, . . . , N , and integrability of f in the sense of Bochner

means integrability of all fi ’s in the sense of Lebesgue. If (X, · ) is

an inﬁnite dimensional Banach space, then, in addition to the concept

of strong measurability of a function from Ω to X as deﬁned before,

there is also a concept of weak measurability, namely f : Ω → X is

said to be weakly measurable if s → x∗ (f (s)) is Lebesgue measurable

for every continuous linear functional x∗ : (X, · ) → R. If X is a

separable Banach space, then the weak measurability of f is equiv-

alent to its strong measurability. In fact, this equivalence holds if f

is almost separably valued, that is {f (s); s ∈ Ω \ M } is a separable

set, where M ⊂ Ω has zero Lebesgue measure. This result belongs to


Pettis,14 see, e.g., [51, p. 131]. It is worth mentioning that, in all the

applications discussed in this book, X will always stand for separable

Banach spaces.

The next result says that Bochner integrability of an X-valued function f reduces to Lebesgue integrability of the real-valued function ‖f‖.

Theorem 5.29 (Bochner). Let (X, ‖·‖) be a real Banach space and let Ω ⊂ Rk be a measurable set. If f : Ω → X is strongly measurable, then f is Bochner integrable if and only if ‖f‖ is Lebesgue integrable, where ‖f‖(s) := ‖f(s)‖ for almost all s ∈ Ω.

Proof. Since f is strongly measurable, ‖f‖ is also (Lebesgue) measurable, because any sequence of simple functions converging a.e. to f gives, upon taking norms, a sequence of simple real-valued functions converging a.e. to ‖f‖.

To prove necessity, assume that f is Bochner integrable. If (g_n) is a sequence of simple functions as in Definition 5.27, we can write (see (5.5.60))

∫_Ω ‖g_n − g_m‖ ds ≤ ε ∀n, m > Nε .

Letting m → ∞ (as before, via Fatou's lemma) we get

∫_Ω ‖g_n − f‖ ds ≤ ε ∀n > Nε ,

and using the obvious inequality

‖f‖ ≤ ‖f − g_n‖ + ‖g_n‖

we obtain

∫_Ω ‖f‖ ds ≤ ∫_Ω ‖f − g_n‖ ds + ∫_Ω ‖g_n‖ ds < ∞ ∀n > Nε ,

i.e., ‖f‖ is Lebesgue integrable.

In order to prove sufficiency, assume that ‖f‖ is Lebesgue integrable and consider a sequence of simple functions h_n : Ω → X such that

lim_{n→∞} ‖h_n(s) − f(s)‖ = 0 for almost all s ∈ Ω .

For a fixed δ > 0, define

g_n(s) = h_n(s) if ‖h_n(s)‖ ≤ (1 + δ)‖f(s)‖, and g_n(s) = 0 otherwise.

14 Billy James Pettis, American mathematician, 1913–1979.


Then each g_n is a simple function and

lim_{n→∞} ‖g_n(s) − f(s)‖ = 0 for a.a. s ∈ Ω . (5.5.64)

We must show that

lim_{n,m→∞} ∫_Ω ‖g_n − g_m‖ ds = 0 . (5.5.65)

To this end we apply the Lebesgue Dominated Convergence Theorem to the sequence (‖g_n − f‖). The first condition of this theorem is satisfied (see (5.5.64)), and

‖g_n(s) − f(s)‖ ≤ ‖g_n(s)‖ + ‖f(s)‖ ≤ (1 + δ)‖f(s)‖ + ‖f(s)‖ = (2 + δ)‖f(s)‖ ,

so (‖f‖ being Lebesgue integrable) the second condition of this theorem is also satisfied, hence

lim_{n→∞} ∫_Ω ‖g_n − f‖ ds = 0 .

Finally, the triangle inequality

∫_Ω ‖g_n − g_m‖ ds ≤ ∫_Ω ‖g_n − f‖ ds + ∫_Ω ‖g_m − f‖ ds

implies (5.5.65), so f is Bochner integrable.

Remark 5.30. If f : Ω → X is Bochner integrable, we have

‖ ∫_Ω f ds ‖ ≤ ∫_Ω ‖f‖ ds ,

because this inequality holds for simple functions. In general, the usual

properties of the Lebesgue integral are also satisﬁed by the Bochner

integral.

Remark 5.31. Let (X, ‖·‖) and (Y, ‖·‖∗) be real Banach spaces. If f : Ω → X is Bochner integrable over Ω and A is a continuous linear operator from (X, ‖·‖) to (Y, ‖·‖∗), then A◦f is also Bochner integrable and

∫_Ω A◦f ds = A ( ∫_Ω f ds ) .


Indeed, if (g_n) is a sequence of simple functions associated with f as in Definition 5.27, then (A◦g_n) is also a sequence of simple functions, which converges to A◦f a.e. in Ω. Moreover,

∫_Ω ‖A◦g_n − A◦g_m‖∗ ds ≤ ‖A‖ ∫_Ω ‖g_n − g_m‖ ds → 0 as n, m → ∞ .

It follows that

∫_Ω A◦f ds = lim_{n→∞} ∫_Ω A◦g_n ds = lim_{n→∞} A ∫_Ω g_n ds = A ∫_Ω f ds ,

as claimed.
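For X = Y = R² and A a 2 × 2 matrix, the identity ∫_Ω A◦f ds = A(∫_Ω f ds) can be verified with a quadrature; the integrand f below is an arbitrary illustrative choice, not from the text:

```python
import numpy as np

A = np.array([[2.0, 1.0], [0.0, -1.0]])    # a continuous linear operator R^2 -> R^2
s = np.linspace(0.0, 1.0, 10001)
f = np.stack([np.cos(s), s ** 2], axis=1)  # f : (0, 1) -> R^2, sampled on a grid

def integrate(vals):
    # componentwise trapezoidal rule over (0, 1)
    return np.sum((vals[1:] + vals[:-1]) / 2 * np.diff(s)[:, None], axis=0)

lhs = integrate(f @ A.T)   # integral of A∘f (each value of f transformed by A)
rhs = A @ integrate(f)     # A applied to the integral of f
print(lhs, rhs)
```

The two sides agree up to quadrature rounding, since A commutes with the finite sums approximating the integral.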

For 1 ≤ p < ∞ we set

L^p(Ω; X) = {f : Ω → X; f is measurable and ∫_Ω ‖f‖^p ds < ∞} .

We also define

L^∞(Ω; X) = {f : Ω → X; f is measurable and ess sup_{s∈Ω} ‖f(s)‖ < ∞} ,

where

ess sup_{s∈Ω} ‖f(s)‖ := inf{C; ‖f(s)‖ ≤ C a.e. on Ω} .

As usual, functions which coincide a.e. in Ω are identified (call this equivalence ∼), and we keep the notation L^p(Ω; X) for the corresponding space of equivalence classes, L^p(Ω; X) := L^p(Ω; X)/∼.

For 1 ≤ p ≤ ∞, L^p(Ω; X) is a Banach space with respect to the norm

‖f‖_{Lp(Ω;X)} := ( ∫_Ω ‖f‖^p ds )^{1/p} , 1 ≤ p < ∞ ,

‖f‖_{L∞(Ω;X)} := ess sup_{s∈Ω} ‖f(s)‖ .

The proof follows by arguments similar to those from the proof of the

classical theorem corresponding to the case X = R (Theorem 3.25),

so we leave it to the reader as an exercise. The key condition is the

completeness of X.

If Ω = (a, b) with −∞ ≤ a < b ≤ ∞ denote Lp (a, b; X) := Lp ((a, b); X).

5.6 Vector Distributions, W^{m,p}(a, b; X) Spaces

Let X be a Banach space and let −∞ ≤ a < b ≤ ∞. Denote as before D(a, b) = C0∞(a, b), equipped with the inductive limit topology.

A vector distribution on (a, b) with values in X is a map u : D(a, b) → X which is linear and continuous (in the sense that if φn → 0 in D(a, b) then u(φn) → 0 in X). The set of all such vector distributions is denoted D′(a, b; X).

As in the scalar case, vector distributions can be generated by locally integrable functions: let u ∈ L1loc(a, b; X), i.e., u : (a, b) → X is strongly measurable and u ∈ L1(K; X) for all K ⊂ (a, b) compact. Define ũ : D(a, b) → X by

ũ(φ) := ∫_a^b φ(t)u(t) dt ∀φ ∈ D(a, b) .

This ũ is linear and continuous, and the map u ↦ ũ is one-to-one. Indeed, assume v ∈ L1loc(a, b; X) satisfies, for all φ ∈ D(a, b),

∫_a^b φ(t)v(t) dt = 0 .

Then

∫_a^b φ(t)x∗(v(t)) dt = 0 ∀x∗ ∈ X∗ ,

and, since t ↦ x∗(v(t)) is a locally summable function, it follows by Theorem 5.9 that x∗(v(t)) = 0 for a.a. t ∈ (a, b) and every x∗ ∈ X∗, hence v = 0 a.e. in (a, b).

Consequently, one can identify the (regular) distribution ũ with the

locally summable function u, and write

u(φ) := ∫_a^b φ(t)u(t) dt ∀φ ∈ D(a, b) .


Of course, not every vector distribution can be represented in this way; consider, e.g., u : D(R) → X defined by u(φ) = φ(0)x for all φ ∈ D(R) and a fixed x ∈ X \ {0}.

The derivative of a vector distribution u ∈ D′(a, b; X) is defined, as in the scalar case, by u′(φ) := −u(φ′) for all φ ∈ D(a, b); inductively, u^{(j)} := (u^{(j−1)})′, j ∈ N, and by convention

u^{(0)} = u .

All of the above extends naturally to Ω ⊂ Rk.

For m ∈ N, 1 ≤ p ≤ ∞, we set

W^{m,p}(a, b; X) = {u ∈ D′(a, b; X); u^{(j)} ∈ L^p(a, b; X), j = 0, 1, . . . , m} ,

where u^{(j)} ∈ L^p(a, b; X) means that u is a regular distribution generated by a function in L^p(a, b; X) and all (distributional) derivatives above are regular as well. For 1 ≤ p ≤ ∞, W^{m,p}(a, b; X) is a Banach space with respect to the norm

‖u‖_{W m,p(a,b;X)} := ( Σ_{j=0}^m ‖u^{(j)}‖^p_{Lp(a,b;X)} )^{1/p} , 1 ≤ p < ∞ ,

‖u‖_{W m,∞(a,b;X)} := max_{0≤j≤m} ‖u^{(j)}‖_{L∞(a,b;X)} .


W^{m,p}_{loc}(a, b; X) denotes the set of all u ∈ D′(a, b; X) such that u ∈ W^{m,p}(t1, t2; X) for every bounded interval (t1, t2) ⊂ (a, b).

For p = 2 denote H^m(a, b; X) = W^{m,2}(a, b; X). If X is a Hilbert space, then so is H^m(a, b; X) with respect to the inner product

(u, v)_{H m(a,b;X)} = Σ_{j=0}^m ∫_a^b ( u^{(j)}(t), v^{(j)}(t) )_X dt .

Now, for −∞ < a < b < +∞, denote by A^{m,p}(a, b; X) the space of all functions f : [a, b] → X which are absolutely continuous on [a, b], whose pointwise derivatives d^j f/dt^j exist and are absolutely continuous on [a, b] for j = 1, 2, . . . , m − 1, and d^m f/dt^m ∈ L^p(a, b; X).

Remark 5.34. If X is reflexive, it follows by a well-known theorem due to Kōmura15 (see [25]; see also [45, p. 105]) that every X-valued absolutely continuous function on [a, b] is differentiable a.e.; in particular, A^{1,1}(a, b; X) coincides with the space of X-valued absolutely continuous functions on [a, b].

Let m ∈ N, 1 ≤ p ≤ ∞, and −∞ < a < b < ∞. If u ∈ L^p(a, b; X) then the following are equivalent:

(j) u ∈ W^{m,p}(a, b; X);

(jj) there exists u1 ∈ A^{m,p}(a, b; X) such that u1(t) = u(t) for almost all t ∈ (a, b).

Proof. We shall prove the case m = 1, and then the result follows by

induction.

To prove the implication (j) ⇒ (jj) ﬁx u ∈ W 1,p (a, b; X) and extend

it as zero in R \ (a, b). For ε > 0 small deﬁne uε as before, i.e.,

uε(t) = ∫_R ωε(t − s)u(s) ds ,

where ωε(t) = (1/ε) ω(t/ε) and

ω(t) = C e^{−1/(1−t²)} for |t| < 1 , ω(t) = 0 for |t| ≥ 1 ,

15 Yukio Kōmura, Japanese mathematician, born 1931.


with C > 0 such that ∫_R ω(t) dt = 1. We have

u̇ε(t) = (d/dt) uε(t) = ∫_R ω̇ε(t − s)u(s) ds ∀t ∈ R ,

and, regarding u̇ε as a (vector) distribution and applying it to a test function φ ∈ C0∞(R),

(u̇ε, φ) = ∫_R φ(t)u̇ε(t) dt
= ∫_R φ(t) ( ∫_R ω̇ε(t − s)u(s) ds ) dt
= ∫_R ( ∫_R ω̇ε(t − s)φ(t) dt ) u(s) ds
= − ∫_R φ̇ε(s)u(s) ds
= −(u, φ̇ε)
= u′(φε)
= ∫_R φε(t)u′(t) dt
= ∫_R ( ∫_R ωε(t − s)φ(s) ds ) u′(t) dt
= ∫_R φ(s) ( ∫_R ωε(t − s)u′(t) dt ) ds
= ∫_R φ(s)(u′)ε(s) ds ,

where φε denotes the mollification of φ (here we have used the fact that ω is even),

so that u̇ε = (u′)ε, and therefore

uε(t) − uε(s) = ∫_s^t (u′)ε(τ) dτ ∀s, t ∈ R . (5.6.67)

Now let ε → 0 in (5.6.67): uε → u and (u′)ε → u′ a.e. along a subsequence (the proof of these convergences of mollifications is the same as in the scalar case).

Hence, there exists a function u1 such that

u1(t) − u1(s) = ∫_s^t u′(τ) dτ for a.a. s, t ∈ (a, b) .

Therefore, u1 ∈ AC([a, b]; X) and u̇1(t) = u′(t) for almost all t ∈ (a, b), i.e., the pointwise derivative u̇1 is a representative of the distributional derivative u′ ∈ L^p(a, b; X). So u̇1 ∈ L^p(a, b; X), which together with absolute continuity implies that u1 ∈ A^{1,p}(a, b; X).

For the implication (jj) =⇒ (j), assume there exists u1 ∈ A^{1,p}(a, b; X) which is an element of the class of u. We must show that u ∈ W^{1,p}(a, b; X). Since u1 ∈ AC([a, b]; X), u ∈ L^p(a, b; X), and it remains to show that u′ ∈ L^p(a, b; X). We start with u̇1 and interpret it as a distribution. For all φ ∈ D(a, b), we have

(u̇1, φ) = ∫_a^b φ u̇1 dt = − ∫_a^b φ̇ u1 dt

(integrating by parts; the boundary terms vanish since φ has compact support) and, since changing u1 to another element of its class won't affect the integral,

(u̇1, φ) = − ∫_a^b φ̇ u dt = −u(φ̇) = u′(φ) .

Thus u′ = u̇1 as distributions, and u̇1 ∈ L^p(a, b; X), so u′ ∈ L^p(a, b; X), i.e., u ∈ W^{1,p}(a, b; X).

Note that usually good representatives are preferred since their values

at particular points make sense.
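The equivalence can be illustrated numerically with u(t) = |t − 1/2| on (0, 1): this absolutely continuous representative satisfies u1(t) − u1(s) = ∫_s^t u′(τ) dτ with the step function u′ = sign(· − 1/2) ∈ L^p(0, 1). This particular u is my own example, not the book's:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 100001)
u = np.abs(t - 0.5)            # the (good) absolutely continuous representative
du = np.sign(t - 0.5)          # its a.e. pointwise = distributional derivative

# recover u from u(0) + integral_0^t u'(tau) dtau (cumulative trapezoidal rule)
cum = np.concatenate([[0.0], np.cumsum((du[1:] + du[:-1]) / 2 * np.diff(t))])
recovered = u[0] + cum
err = np.max(np.abs(recovered - u))
print(err)
```

The residual `err` is pure discretization error (concentrated at the kink), confirming that u is recovered from its L^p derivative.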


5.7 Exercises

1. Let Ω = R × (−1, +1) ⊂ R2 and let u : Ω → R be deﬁned by

that the topology generated by F coincides with the topology of pointwise convergence.

3. For K ⊂ Ω compact and m ∈ N ∪ {0} define the seminorm p_{K,m} : C∞(Ω) → R,

p_{K,m}(f) = max_{x∈K, |α|≤m} |Dα f(x)| , where Dα f(x) = (∂^{|α|} / ∂x1^{α1} · · · ∂xk^{αk}) f(x1, . . . , xk) .

Choose compact sets K1 ⊂ K2 ⊂ · · · ⊂ Ω such that Ω = ∪_{n=1}^∞ Kn. Define for each j ∈ N

dj(f, g) = Σ_{m=0}^j (1/2^m) · p_{Kj,m}(f − g) / (1 + p_{Kj,m}(f − g)) , f, g ∈ C^j(Ω) ,

and

d(f, g) = Σ_{j=1}^∞ (1/2^j) · dj(f, g) / (1 + dj(f, g)) , f, g ∈ C∞(Ω) .

maxR φ = 1.

5. Let φ ∈ C0∞(Rk). Prove that there exists ψ ∈ C0∞(Rk) such that φ = ∂^k ψ / (∂x1 · · · ∂xk) if and only if ∫_{Rk} φ(x) dx = 0.

6. Given a sequence (an)n∈N of real numbers, prove that there exists a function φ ∈ C0∞(R) such that φ(n) = an ∀n ∈ N if and only if there exists an n0 ∈ N such that an = 0 ∀n > n0.


7. Let ψ ∈ C0∞(Rk) and h = (h1, . . . , hk) ∈ Rk, and consider the sequence (φn)n∈N, where

φn(x) = n [ ψ(x + (1/n)h) − ψ(x) ] , x ∈ Rk , n ∈ N .

Prove that

φn → Σ_{j=1}^k hj (∂ψ/∂xj) in D(Rk) .

8. Formulate and prove a similar result for the sequence (γn)n∈N defined by

γn(x) = n [ ψ(x + (1/n)h) − ψ(x − (1/n)h) ] , x ∈ Rk , n ∈ N .

9. Let φ ∈ D(Ω). Prove that the sequence (φn), where

φn(x) = ∫_Ω ω1/n(x − y)φ(y) dy , x ∈ Ω , n ∈ N sufficiently large,

converges to φ in D(Ω).

10. For a fixed x0 ∈ Ω and a multi-index α ∈ N0^k, define u : D(Ω) → R by u(φ) = (Dα φ)(x0). Show that u is a distribution.

11. Show that if φ ∈ D(Ω) satisfies

u(φ) = 0 ∀u ∈ D′(Ω) ,

then φ = 0.


12. Define u : D(R) → R by

u(φ) = Σ_{i=1}^∞ [ φ(1/i²) − φ(0) ] , φ ∈ D(R) .

Show that u is a distribution.

13. Show that for any distribution u ∈ D′(Ω) the mixed derivatives Dα u do not depend on the order of differentiation.

14. Let a ∈ C∞(Ω) and u ∈ D′(Ω). Show that

∂(au)/∂xi = (∂a/∂xi) u + a (∂u/∂xi) .

Extend this formula to Dα(au) for a general multi-index α.

15. Compute the distributional derivatives of the functions f, g : R → R,

f(x) = (1/2) x|x| , x ∈ R ,
g(x) = H(x) · cos x , x ∈ R ,

where H is the Heaviside function.

16. Find a sequence (Hn)n∈N in C0∞(R) such that Hn → H in D′(R), where H is the Heaviside function.

17. Define u : D(R2) → R by

u(φ) = ∫_{−∞}^∞ φ(x1, 0) dx1 ∀φ ∈ D(R2) .

(i) Show that u is a distribution;
(ii) Show that u is not a regular distribution;
(iii) Check that ∂u/∂x1 = 0.

18. Let S ⊂ Ω be an infinite set of isolated points, S = {x1, x2, . . . , xn, . . . }. Show that for any sequence of real numbers (an)n∈N the series Σ_{n=1}^∞ an δ_{xn} converges in D′(Ω).


19. Solve the following equations in D′(R):
(a) u′ = χ (where χ denotes the characteristic function of [0, 1]);
(b) u″ + u = H + δ (where H denotes the Heaviside function and δ is the Dirac distribution);
(c) u″ − 2u′ + u = 2δ(t − 1) + δ(t − 2);
(d) u″ − 4u = δ′ − δ − 8.

20. Solve the Cauchy problem

u″ − u = δ(t − 1) + 2δ(t − 3) − 2t − 1 in D′(R) ,
u(0) = 1 , u′(0) = 0 .

21. Solve the equation (sin t) · u = 0 in D′(R).

22. Solve in D′(R) the system

u1′ = 4u1 − u2 + H ,
u2′ = 3u1 + u2 − u3 + δ ,
u3′ = u1 + u3 + H .

(a, ∞; R), prove that there exists a function v ∈ C[a, ∞) which

is a representative of the class u, and v(a) = 0.

25. Let p ∈ (1, ∞). Show that W 2,p (0, 1) is compactly embedded

into C 1 [0, 1]. The Sobolev space W 2,p (0, 1) is equipped with the

usual norm, and C1[0, 1] is equipped with the norm

‖u‖_{C1[0,1]} = max_{0≤t≤1} |u(t)| + max_{0≤t≤1} |u′(t)| .


26. Let φ ∈ C0∞(R), φ ≢ 0, and define (un)n∈N by un(t) = φ(t + n), t ∈ R, n ∈ N. Prove that

(i) un converges weakly to 0 in L^q(R) for every 1 < q < ∞;
(ii) there exists no subsequence of (un) converging strongly in L^q(R) for any 1 ≤ q ≤ ∞.

27. If u, v ∈ W1,2(Ω), show that uv ∈ W1,1(Ω) and

∂(uv)/∂xi = (∂u/∂xi) · v + u · (∂v/∂xi) , i = 1, 2, . . . , k ,

in the sense of distributions.

∂xi ∂xi ∂xi

Chapter 6

Hilbert Spaces

Let X be a linear space over K endowed with a scalar product (·, ·) (i.e., X is an inner product space or a generalized Euclidean space, as defined in Chap. 1). As usual, throughout this chapter K is either R or C. Define the norm

‖x‖ = √(x, x) , x ∈ X .

If (X, d) is a complete metric space (where d(x, y) = ‖x − y‖, x, y ∈ X), then X is said to be a Hilbert1 space. In other words, a Hilbert space is a Banach space (X, ‖·‖) whose norm is given by a scalar product.

6.1 Examples

We have already met some Hilbert spaces, such as the Euclidean space

Rk , Ck , L2 (Ω), H m (Ω), m ∈ N, these spaces being equipped with their

usual scalar products, i.e.,

1 David Hilbert, German mathematician, 1862–1943.

G. Moroşanu, Functional Analysis for the Applied Sciences,

Universitext, https://doi.org/10.1007/978-3-030-27153-4 6


(x, y) = Σ_{i=1}^k xi yi , x = (x1, . . . , xk), y = (y1, . . . , yk) ∈ Rk ,

(x, y) = Σ_{i=1}^k xi ȳi , x = (x1, . . . , xk), y = (y1, . . . , yk) ∈ Ck ,

(u, v)_{L2(Ω)} = ∫_Ω uv dx , u, v ∈ L2(Ω) ,

(u, v)_m = Σ_{|α|≤m} ( Dα u, Dα v )_{L2(Ω)} , u, v ∈ Hm(Ω) ,

with the associated norms

‖x‖² = Σ_{i=1}^k xi² , x = (x1, . . . , xk) ∈ Rk ,

‖x‖² = Σ_{i=1}^k |xi|² , x = (x1, . . . , xk) ∈ Ck ,

‖u‖²_{L2(Ω)} = ∫_Ω u² dx , u ∈ L2(Ω) ,

‖u‖²_m = Σ_{|α|≤m} ‖Dα u‖²_{L2(Ω)} , u ∈ Hm(Ω) ,

in the four cases, respectively.

Obviously, every Cauchy sequence in Rk is convergent since the cor-

responding coordinate sequences are Cauchy in (R, | · |), hence con-

vergent in that space. So the Euclidean space Rk equipped with the

above scalar product and norm is a Hilbert space over R. Similarly, Ck

equipped with the above scalar product and norm is a Hilbert space

over C.

Note also that Lp (Ω) equipped with the usual norm is a Banach space

for all 1 ≤ p ≤ ∞ (see Theorem 3.25). So (L2 (Ω), · L2 (Ω) ) is a real

Hilbert space. Also, H m (Ω) equipped with the above scalar product

and norm is a real Hilbert space, and so is its closed subspace H0m (Ω),

m ∈ N.

It is worth pointing out that H01(Ω) can be equipped with a different scalar product,

(u, v)1∗ = ∫_Ω ∇u · ∇v dx , u, v ∈ H01(Ω) ,


whose induced norm is equivalent to the usual one whenever Ω is bounded, or has finite Lebesgue measure, or its projection on some coordinate plane is bounded (see Theorem 5.24 and Remarks 5.25 and 5.26).

If X is a real Hilbert space with scalar product (·, ·)_X, then so is L2(a, b; X) equipped with the scalar product

(u, v)_{L2(a,b;X)} = ∫_a^b ( u(t), v(t) )_X dt , u, v ∈ L2(a, b; X) ,

and the induced norm given by

‖u‖²_{L2(a,b;X)} = ∫_a^b ‖u(t)‖²_X dt ,

and the same is true for H^m(a, b; X), m ∈ N, with respect to the scalar product

(u, v)_m = Σ_{j=0}^m ∫_a^b ( u^{(j)}(t), v^{(j)}(t) )_X dt , u, v ∈ Hm(a, b; X) ,

and norm

‖u‖²_m = Σ_{j=0}^m ∫_a^b ‖u^{(j)}(t)‖²_X dt , u ∈ Hm(a, b; X) .

Let us point out that any inner product space can be extended

(uniquely up to isomorphism) to a Hilbert space, by a completion pro-

cedure similar to that used in the proof of Theorem 2.8. To illustrate

this consider the space C[0, 2] endowed with the scalar product

2

u, v = u(t)v(t) dt , u, v ∈ C[0, 2] ,

0

2

u2L2 = u, u = u(t)2 dt , u ∈ C[0, 2] .

0


The space (C[0, 2], ‖·‖L2) is not complete (i.e., it is not a Hilbert space), as can be seen by using the sequence (un)n≥2 defined by

un(t) = 0 for 0 ≤ t ≤ 1 − 1/n ; un(t) = nt − n + 1 for 1 − 1/n < t < 1 ; un(t) = 1 for 1 ≤ t ≤ 2 ,

but it can be extended to the Hilbert space (L2 (0, 2), · L2 ) (each

element ∈ C[0, 2] being identiﬁed with its L2 equivalence class).
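The failure of completeness can be observed numerically: the distances ‖un − um‖L2 shrink (so the sequence is Cauchy), while the pointwise limit is the discontinuous step function, which has no continuous representative. A minimal sketch of this computation:

```python
import numpy as np

def u(n, t):
    # the sequence from the text: 0 on [0, 1 - 1/n], linear in between, 1 on [1, 2]
    return np.clip(n * t - n + 1.0, 0.0, 1.0)

t = np.linspace(0.0, 2.0, 200001)

def l2(f):
    vals = f ** 2   # trapezoidal approximation of the L^2(0, 2) norm
    return np.sqrt(np.sum((vals[1:] + vals[:-1]) / 2 * np.diff(t)))

gap_small = l2(u(100, t) - u(200, t))   # far along the sequence: tiny
gap_big = l2(u(2, t) - u(200, t))       # early vs late terms: larger
print(gap_small, gap_big)
```

The gaps are of order 1/√n, consistent with the exact computation on the ramp interval (1 − 1/n, 1).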

If X is a ﬁnite dimensional, inner product space, then it is a Hilbert

space with respect to the norm induced by the corresponding inner

product, so no extension is needed (in particular, Rk and Ck are Hilbert

spaces).

6.2 Jordan–von Neumann Characterization Theorem

Our aim in this chapter is to present the main properties of Hilbert

spaces which are of course common to all the particular spaces men-

tioned above. First of all, we state the following characterization result

due to Jordan and von Neumann.2

Theorem 6.2 (Jordan–von Neumann). Let (H, ‖·‖) be a normed linear space. Then the norm ‖·‖ is given by a scalar product (i.e., there exists a scalar product (·, ·) : H × H → K such that ‖x‖ = √(x, x), x ∈ H) if and only if ‖·‖ satisfies the parallelogram law. (Hence, a Banach space (H, ‖·‖) is Hilbert ⇐⇒ its norm ‖·‖ satisfies the parallelogram law.)

Proof. Necessity is immediate: assuming that ‖·‖ is generated by a scalar product (·, ·), we have for all x, y ∈ H

‖x + y‖² + ‖x − y‖² = (x + y, x + y) + (x − y, x − y) = 2(‖x‖² + ‖y‖²) , (6.2.1)

2 Pascual Jordan, German theoretical and mathematical physicist, 1902–1980; John von Neumann, Hungarian-American mathematician, physicist, and computer scientist, 1903–1957.


which is the parallelogram law. To prove sufficiency, assume that ‖·‖ satisfies the parallelogram law (see (6.2.1)).

Consider first the case K = R. Define f : H × H → R by

f(x, y) = (1/4) ( ‖x + y‖² − ‖x − y‖² ) , x, y ∈ H ,

which we will show is a scalar product on H. Clearly,

f(x, x) = (1/4) ‖2x‖² = ‖x‖² ∀x ∈ H , (6.2.2)
f(x, y) = f(y, x) ∀x, y ∈ H , (6.2.3)
f(x, 0) = 0 ∀x ∈ H . (6.2.4)

Obviously, for any x1, x2, y ∈ H, we have

f(x1 + x2, y) = (1/4) ( ‖x1 + x2 + y‖² − ‖x1 + x2 − y‖² ) ,
f(x1 − x2, y) = (1/4) ( ‖x1 − x2 + y‖² − ‖x1 − x2 − y‖² ) .

Add the two equations and apply the parallelogram law to get

f(x1 + x2, y) + f(x1 − x2, y) = (1/2) ( ‖x1 + y‖² + ‖x2‖² − ‖x1 − y‖² − ‖x2‖² )
= (1/2) ( ‖x1 + y‖² − ‖x1 − y‖² )
= 2f(x1, y) . (6.2.5)

In particular, taking x2 = x1 and using (6.2.3) and (6.2.4), (6.2.5) gives

f(2x1, y) = 2f(x1, y) ∀x1, y ∈ H . (6.2.6)

Writing (6.2.5) with x1 = (x + x′)/2, x2 = (x − x′)/2, we get

f(x, y) + f(x′, y) = 2f( (x + x′)/2 , y ) ,

which by (6.2.6) gives

f(x + x′, y) = f(x, y) + f(x′, y) ∀x, x′, y ∈ H . (6.2.7)

By induction, and since f(−x, y) = −f(x, y), this additivity can be extended to

f(mx, y) = mf(x, y) ∀x, y ∈ H, ∀m ∈ Z . (6.2.8)


Then, for a rational number r = m/n, m, n ∈ Z, n ≠ 0, we have (by (6.2.8))

f( (m/n) x, y ) = m f( (1/n) x, y ) = (m/n) f(x, y) ,

so

f(rx, y) = rf(x, y) ∀x, y ∈ H, ∀r ∈ Q .

Since f is continuous on H × H, this extends to r ∈ R, i.e., f(λx, y) = λf(x, y) for all x, y ∈ H and λ ∈ R; so f(·, ·) is a scalar product and generates the given norm: ‖x‖² = f(x, x), x ∈ H.

Sufficiency in the complex case K = C can be treated similarly, with f : H × H → C defined by

f(x, y) = (1/4) Σ_{m=0}^3 i^m ‖x + i^m y‖² , x, y ∈ H .

Note also that the scalar product generating a given norm is unique. Indeed, if (·, ·) and ⟨·, ·⟩ are two scalar products such that (x, x) = ⟨x, x⟩ = ‖x‖², x ∈ H, then we easily derive from

(x + y, x + y) = ⟨x + y, x + y⟩ ∀x, y ∈ H

that

Re(x, y) = Re⟨x, y⟩ ∀x, y ∈ H , (6.2.10)

and this completes the proof in the real case. If K = C, then by replacing y by iy in (6.2.10), we also get Im(x, y) = Im⟨x, y⟩ ∀x, y ∈ H, hence (x, y) = ⟨x, y⟩ for all x, y ∈ H.

Remark 6.3. We have already noticed that Rk equipped with the usual

Euclidean norm is a Hilbert space, but Rk is not Hilbert with respect

to other norms, such as

‖u‖₁ = Σ_{i=1}^k |ui| , or ‖u‖max = max_{1≤i≤k} |ui| , u = (u1, . . . , uk) ∈ Rk .


Indeed, one can easily ﬁnd pairs of vectors that do not satisfy the

parallelogram law expressed in terms of these norms.

Similarly, L1 (a, b), −∞ ≤ a < b ≤ ∞, equipped with its usual norm,

is not a Hilbert space, as can be seen by ﬁnding a pair of functions

f, g ∈ L1 (a, b) that does not satisfy the parallelogram law (do it!).
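The concrete pair x = (1, 0), y = (0, 1) (my own choice) already does the job for Rk: the parallelogram law holds for the Euclidean norm and fails for ‖·‖1 and ‖·‖max:

```python
import numpy as np

x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])

def defect(norm):
    # parallelogram law: this expression vanishes iff the law holds for x, y
    return norm(x + y) ** 2 + norm(x - y) ** 2 - 2 * (norm(x) ** 2 + norm(y) ** 2)

eucl = defect(np.linalg.norm)                 # 2 + 2 - 4 = 0: law holds
l1 = defect(lambda v: np.sum(np.abs(v)))      # 4 + 4 - 4 = 4: law fails
lmax = defect(lambda v: np.max(np.abs(v)))    # 1 + 1 - 4 = -2: law fails
print(eucl, l1, lmax)
```

By the Jordan–von Neumann theorem, a single failing pair is enough to rule out any generating scalar product.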

A Hilbert space is similar in many respects to k-dimensional Euclidean

space. That is why Hilbert spaces are more useful in applications than

general Banach spaces.

6.3 Projections in Hilbert Spaces

Theorem 6.4. Let H be a Hilbert space with scalar product (·, ·) and induced norm ‖·‖, and let C be a nonempty, convex, closed subset of H. Then for all x ∈ H there exists a unique y ∈ C such that

‖x − y‖ = min_{v∈C} ‖x − v‖ (= d(x, C)) . (6.3.11)

Proof. If x ∈ C, then a good candidate is y = x.

Assume x ∈ H \ C. Denote ρ = d(x, C). By the definition of inf, for all n ∈ N there exists yn ∈ C such that

ρ ≤ ‖x − yn‖ < ρ + 1/n ,

which gives

lim_{n→∞} ‖x − yn‖ = ρ . (6.3.12)

We have ρ > 0. Indeed if ρ = 0, then by (6.3.12) yn → x and C is

closed, so x ∈ C, contradiction.

Apply the parallelogram law (see (6.2.1)) to x − yn and x − ym to get

‖2x − (yn + ym)‖² + ‖yn − ym‖² = 2( ‖x − yn‖² + ‖x − ym‖² ) (6.3.13)

for all n, m. Consider the first term of the left-hand side of (6.3.13) and factor out a 4:

‖2x − (yn + ym)‖² = 4 ‖x − (yn + ym)/2‖² ≥ 4ρ² , (6.3.14)

because (yn + ym)/2 is the midpoint of the segment joining yn, ym and therefore is in C by convexity. Hence (see (6.3.13) and (6.3.14)),

‖yn − ym‖² ≤ 2( ‖x − yn‖² + ‖x − ym‖² ) − 4ρ² . (6.3.15)

172 6 Hilbert Spaces

Using (6.3.12) we get that (yn) is Cauchy, because the right-hand side of (6.3.15) converges to 0 as n, m → ∞. Therefore (yn) converges strongly to some y, and y ∈ C because C is closed. It follows from (6.3.12) that

‖x − y‖ = ρ .

We now prove uniqueness. Suppose ‖x − y‖ = ρ = ‖x − y′‖ for some y, y′ ∈ C. We use the parallelogram law for x − y, x − y′ to obtain

‖2x − (y + y′)‖² + ‖y − y′‖² = 2( ‖x − y‖² + ‖x − y′‖² ) ,

which (since ‖2x − (y + y′)‖² ≥ 4ρ², as before) implies

‖y − y′‖² ≤ 4ρ² − 4ρ² = 0 ,

and thus y = y′.

For example, if C is an open disc in R2 , then there is no y for x ∈ R2 \C.

On the other hand, if C is not convex there may exist more (possibly

inﬁnitely many) y’s for the same x, as the reader can easily imagine.

The element y as above is called the projection of x on C and is denoted y = PC x. Since a projection exists and is unique for any x ∈ H, we can define a projection operator PC : H → C : x ↦ y = PC x.

Theorem 6.7. Let H be a Hilbert space and let C ⊂ H be a nonempty, closed and convex set. For x ∈ H, y ∈ C the following are equivalent:

(a) y = PC x;
(b) ‖x − y‖ ≤ ‖x − v‖ ∀v ∈ C;
(c) Re(x − y, y − v) ≥ 0 ∀v ∈ C;
(d) Re(x − v, y − v) ≥ 0 ∀v ∈ C.


If H is a real Hilbert space, then the “Re” from (c) and (d) can be

removed.

Proof.

(a) ⇐⇒ (b) : Trivial.

(b) =⇒ (c) : We have ‖x − y‖² ≤ ‖x − v‖² for all v ∈ C. Let v = (1 − λ)y + λw for 0 < λ < 1 and w ∈ C. Since v is a convex combination, v is in C. We have

‖x − y‖² ≤ ‖x − y + λ(y − w)‖² = ‖x − y‖² + 2λ Re(x − y, y − w) + λ² ‖y − w‖² ,

so that

0 ≤ 2 Re(x − y, y − w) + λ ‖y − w‖² .

Let λ → 0+ to find

Re(x − y, y − w) ≥ 0 for all w ∈ C .

(c) =⇒ (b) : Since Re(x − y, y − x + x − v) ≥ 0 we have

‖x − y‖² ≤ Re(x − y, x − v) ≤ |(x − y, x − v)| ≤ ‖x − y‖ · ‖x − v‖ ∀v ∈ C ,

hence

‖x − y‖ ≤ ‖x − v‖ ∀v ∈ C .

(c) =⇒ (d) : Re(x − v + v − y, y − v) ≥ 0 for all v ∈ C, so that

Re(x − v, y − v) ≥ ‖y − v‖² ≥ 0 ∀v ∈ C .

(d) =⇒ (c) : Replacing v in (d) by (1 − λ)y + λw, λ ∈ (0, 1), w ∈ C, we get

Re( x − y + λ(y − w), λ(y − w) ) ≥ 0 ,

and after dividing by λ > 0, letting λ → 0+ we obtain

Re(x − y, y − w) ≥ 0 ∀w ∈ C .


The operator PC is also Lipschitz continuous with constant 1. Indeed, by (c) of Theorem 6.7 (applied to x1 with v = PC x2, and to x2 with v = PC x1), we have

Re(x1 − PC x1, PC x1 − PC x2) ≥ 0 ,
Re(x2 − PC x2, PC x2 − PC x1) ≥ 0 .

Adding these inequalities yields

Re(PC x1 − PC x2, x1 − PC x1 − x2 + PC x2) ≥ 0 ,

which implies

Re(PC x1 − PC x2, x1 − x2) ≥ ‖PC x1 − PC x2‖² ,

and hence, by the Cauchy–Schwarz inequality, ‖PC x1 − PC x2‖ ≤ ‖x1 − x2‖ for all x1, x2 ∈ H.

PC is also called nonexpansive.
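For C the closed unit disc in R² the projection is explicit, PC x = x if ‖x‖ ≤ 1 and PC x = x/‖x‖ otherwise, so both property (c) and nonexpansiveness can be sampled at random points. This finite-dimensional illustration is my own, not the book's:

```python
import numpy as np

def proj_disc(x):
    # projection onto the closed unit disc C = {v in R^2; ||v|| <= 1}
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

rng = np.random.default_rng(0)
for _ in range(1000):
    x1, x2 = rng.normal(size=2) * 3, rng.normal(size=2) * 3
    p1, p2 = proj_disc(x1), proj_disc(x2)
    # nonexpansiveness: ||PC x1 - PC x2|| <= ||x1 - x2||
    assert np.linalg.norm(p1 - p2) <= np.linalg.norm(x1 - x2) + 1e-12
    # property (c): (x1 - PC x1, PC x1 - v) >= 0 for every v in C
    v = rng.normal(size=2)
    v = v / max(1.0, np.linalg.norm(v))       # some point of C
    assert np.dot(x1 - p1, p1 - v) >= -1e-12
print("nonexpansiveness and (c) hold on all random samples")
```

Random sampling is of course no proof, but it makes the geometry of the variational inequality tangible.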

Assume now that C is a closed linear subspace of H and let y = PC x. By (c) of Theorem 6.7 we have for all v ∈ C, Re(x − y, y − v) ≥ 0, and in fact we can write it as Re(x − y, v) ≥ 0 for all v ∈ C since C is a linear subspace (y − v runs over C as v does). Both v, −v ∈ C because of linearity and this gives the equality Re(x − y, v) = 0 for all v ∈ C. If K = C we can also replace v with iv, and so Im(x − y, v) = 0, therefore

(x − y, v) = 0 ∀v ∈ C . (6.3.17)

Two elements w1, w2 ∈ H with (w1, w2) = 0 are said to be orthogonal, by analogy with orthogonality in Euclidean space, and we write w1 ⊥ w2. So, (6.3.17) can be expressed as (x − y) ⊥ C. The reader is invited to imagine what the orthogonality relation (6.3.17) looks like in the Euclidean space R³ equipped with the usual scalar product and norm.

6.4 The Riesz Representation Theorem

Let (H, (·, ·), · ) be a Hilbert space and let M ⊂ H be a closed linear

subspace. The orthogonal complement M ⊥ of M is deﬁned as

M⊥ = {u ∈ H; (u, v) = 0 ∀v ∈ M} .

Note that M⊥ is also a closed linear subspace of H, because the scalar product is continuous.

Theorem 6.9. Let M be a closed linear subspace of the Hilbert space H. Then every u ∈ H can be written as u = u1 + u2 with u1 ∈ M and u2 ∈ M⊥, and this decomposition is unique. We write H = M ⊕ M⊥ and call it a direct sum.

Proof. Take u1 = PM u ∈ M, while u2 = u − u1 = u − PM u is in M⊥ because (u − PM u, v) = 0 for all v ∈ M (see (6.3.17)). Let us now prove that this decomposition (u = u1 + u2) is unique.

Suppose that u = u1 + u2 = u′1 + u′2 with u1, u′1 ∈ M and u2, u′2 ∈ M⊥. Then

0 = (u1 − u′1 + u2 − u′2, u1 − u′1) = ‖u1 − u′1‖² + (u2 − u′2, u1 − u′1) = ‖u1 − u′1‖² ,

so u1 = u′1, which in turn implies u2 = u′2.

Theorem 6.10 (Riesz). Let (H, (·, ·), ‖·‖) be a Hilbert space. For all f ∈ H∗ (i.e., f is a continuous linear functional from H to K) there exists a unique v ∈ H such that

f(u) = (u, v) ∀u ∈ H .

Moreover, ‖f‖H∗ = ‖v‖.

Proof. Uniqueness: if (u, v) = (u, v′) for all u ∈ H, then (u, v − v′) = 0 for all u ∈ H, and in particular (v − v′, v − v′) = 0, so v = v′.


Existence: if f = 0 then v = 0 works. If f ≠ 0, consider the nullspace N(f) = {z ∈ H; f(z) = 0}. It is a closed linear subspace, so H = N(f) ⊕ N(f)⊥. In fact N(f) ≠ H because f is not identically 0. Thus there exists u0 ∈ N(f)⊥ \ {0}. We may assume f(u0) = 1 by scaling. Let u ∈ H be arbitrary and define

w = u − f(u)u0 .

Now consider

f(w) = f(u) − f(u)f(u0) = 0 ,

showing that w ∈ N(f). So (w, u0) = 0, i.e., (u, u0) = f(u)‖u0‖², hence

f(u) = ( u, (1/‖u0‖²) u0 ) ∀u ∈ H ,

i.e., v = (1/‖u0‖²) u0 works; the uniqueness of v follows by the previous step.

It remains to check that ‖f‖H∗ = ‖v‖. This is trivial for f = 0, so assume that f ≠ 0, which implies v ≠ 0. By the Bunyakovsky–Cauchy–Schwarz inequality, |f(u)| = |(u, v)| ≤ ‖u‖ ‖v‖, hence

‖f‖H∗ ≤ ‖v‖ . (6.4.18)

On the other hand,

f( (1/‖v‖) v ) = ‖v‖ ,

so ‖f‖H∗ ≥ ‖v‖, and thus ‖f‖H∗ = ‖v‖.


Remark 6.11. One may ask whether all continuous linear functionals f from the dual of (C[a, b], ‖·‖L2(a,b)), −∞ < a < b < ∞, can be expressed as f(u) = (u, v)L2(a,b), u ∈ C[a, b], with v ∈ C[a, b]. The answer is, in general, no. First of all, any f ∈ (C[a, b], ‖·‖L2(a,b))∗ can be extended by continuity to (L2(a, b), ‖·‖L2(a,b)), which

is a Hilbert space. By the Riesz Representation Theorem, for each

such f (extended to L2 (a, b)) there exists a unique v ∈ L2 (a, b) such

that f (u) = (u, v)L2 (a,b) , ∀u ∈ L2 (a, b), but this v is not necessarily

an element of C[a, b] (i.e., v has no representative in C[a, b]). In fact,

we can consider f (u) = (u, v)L2 (a,b) , u ∈ L2 (a, b), with v ∈ L2 (a, b) \

C[a, b]; this f is continuous on (C[a, b], ·L2 (a,b) ) and its representation

as a scalar product, f (u) = (u, v)L2 (a,b) , is unique (i.e., v is unique);

but this v is not an element of C[a, b], so the answer to the above

question is negative.

Remark 6.12. In the proof of Theorem 6.10 we saw that for all u ∈ H and 0 ≠ f ∈ H∗ we have the decomposition u = w + f(u)u0 with w ∈ N(f), u0 ∈ N(f)⊥, f(u0) = 1, so that dim N(f)⊥ = 1. Another way to say this is that the codimension of N(f) is 1. For such a functional f and for some a ∈ K we have an affine subspace of H,

Y := {u ∈ H; f(u) = a} = au0 + N(f) ,

called a hyperplane; it is a usual hyperplane if H is the Euclidean space.

Conversely, given a closed aﬃne subspace Y of H of codimension 1,

i.e., Y = u1 + Z, for some u1 ∈ H, Z ⊂ H a closed linear subspace

with codimension 1, there exists u0 ∈ H \ {0} which is orthogonal to Z, i.e., (u, u0) = 0, u ∈ Z. Define f : H → K,

f(u) = (u, u0) ∀u ∈ H ,

which is linear and continuous, with N(f) = Z, so Y can be expressed by means of this f as follows:

Y = u1 + N(f) = {u ∈ H; f(u) = f(u1)} .

A simple example is H = L2(0, 1), Z = {u ∈ H; ∫_0^1 u(t) dt = 0}. Clearly, Z is a closed linear subspace of H with codim Z = 1. Indeed, any v ∈ H can be uniquely decomposed into

v(t) = ∫_0^1 v(s) ds + ( v(t) − ∫_0^1 v(s) ds ) = C + u(t) for a.a. t ∈ (0, 1) ,

with the constant C := ∫_0^1 v(s) ds and u ∈ Z. We can choose u0 to be the constant function 1, so f(u) = ∫_0^1 u(t) dt.

The Weak Topology of H

Taking into account the Riesz Representation Theorem, we see that the weak topology of H is generated by the neighborhood system

V(x; v1, . . . , vp; ε) = {u ∈ H; |(u − x, vi)| < ε, i = 1, . . . , p} , ε > 0, v1, . . . , vp ∈ H, p ∈ N .

In particular, the weak convergence xn ⇀ x in H means (xn, v) → (x, v) for all v ∈ H.

If dim H = ∞ then we can use the Gram–Schmidt method (see Chap. 1)

to construct an inﬁnite orthonormal sequence

(x1 , x2 , . . . , xn , . . . ). This sequence converges weakly to 0. Indeed,

for v ∈ H arbitrary, we have

‖ Σ_{n=1}^N (v, xn)xn − v ‖² = Σ_{n=1}^N |(xn, v)|² − 2 Σ_{n=1}^N |(xn, v)|² + ‖v‖²
= ‖v‖² − Σ_{n=1}^N |(xn, v)|²
≥ 0 ,

so that

Σ_{n=1}^N |(xn, v)|² ≤ ‖v‖² ∀N ∈ N ,

which is known as Bessel's inequality.3 So the series Σ_{n=1}^∞ |(xn, v)|² is convergent and consequently

3 Friedrich Wilhelm Bessel, German astronomer, mathematician, physicist and geodesist, 1784–1846.


(xn, v) → 0 ∀v ∈ H ,

i.e., xn ⇀ 0. On the other hand, (xn) does not converge strongly (to 0) since ‖xn‖ = 1 for all n ∈ N. Therefore, weak convergence in any infinite dimensional Hilbert space is different from strong convergence.
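In H = L²(0, 1) the functions xn(t) = √2 sin(nπt) form an orthonormal sequence, so Bessel's inequality and the decay (xn, v) → 0 can be observed on a grid; the choice v(t) = t² is my own:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 20001)

def ip(f, g):
    vals = f * g   # trapezoidal approximation of the L^2(0, 1) scalar product
    return np.sum((vals[1:] + vals[:-1]) / 2 * np.diff(t))

v = t ** 2
coeffs = np.array([ip(np.sqrt(2.0) * np.sin(n * np.pi * t), v)
                   for n in range(1, 51)])

bessel_sum = np.sum(coeffs ** 2)   # partial sum of |(x_n, v)|^2
norm_v_sq = ip(v, v)               # ||v||^2 = 1/5 for v(t) = t^2
print(bessel_sum, norm_v_sq)
```

The partial sums stay below ‖v‖² (Bessel) while the individual coefficients (xn, v) decay toward 0, illustrating xn ⇀ 0 even though ‖xn‖ = 1.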

By the Riesz Representation Theorem we can define the so-called Riesz operator R : H → H∗ by v ↦ (·, v), so that (Rv)(u) = (u, v) for all u, v ∈ H and ‖Rv‖ = ‖v‖. As seen before, R is also bijective.

Theorem 6.13. Every Hilbert space H is reflexive.

Proof. Let φ : H → H ∗∗ , v → fv ∈ H ∗∗ such that fv (x∗ ) = x∗ (v) for

all x∗ ∈ H ∗ .

As we have already seen, φ is injective. For the convenience of the

reader, let us prove this again in the present context. If fv = 0,

x∗ (v) = 0 for all x∗ ∈ H ∗ which implies, by the Riesz Representation

Theorem, that (v, w) = 0 for all w ∈ H so that v = 0. Thus φ is

injective.

We now prove that φ is surjective. Let x∗∗ ∈ H∗∗ and define u∗ ∈ H∗ by u∗(v) := x∗∗(Rv) for all v ∈ H. Denote u = R−1u∗ and calculate, for arbitrary x∗ ∈ H∗,

x∗∗(x∗) = x∗∗(R(R−1x∗)) = u∗(R−1x∗) = (R−1x∗, u) = (u, R−1x∗) = x∗(u) = fu(x∗) ,

so that all the functionals x∗∗ are of the form fu, and φ is onto, i.e., for all x∗∗ ∈ H∗∗ there exists u ∈ H such that x∗∗ = fu.

Remark 6.14. The above proof is a direct one. In fact, Theorem 6.13

follows from the Milman–Pettis4 general result we state without proof:

every uniformly convex Banach space is reﬂexive.

Recall that a normed space (H, ‖·‖) is said to be uniformly convex if ∀ε ∈ (0, 2) ∃δ > 0 such that ∀x, y ∈ H with ‖x‖ ≤ 1, ‖y‖ ≤ 1, ‖x − y‖ > ε we have (1/2)‖x + y‖ < 1 − δ.

4 David P. Milman, Soviet and later Israeli mathematician, 1912–1982.


Using the parallelogram law, one can easily check that any Hilbert space H is uniformly convex, hence reflexive (by Milman–Pettis).

6.5 Lax–Milgram Theorem

We begin this section with a preparatory lemma whose proof is based on the Banach Contraction Principle.

Lemma 6.15. Let (H, (·, ·), ‖·‖) be a real Hilbert space and let A : H → H be a not necessarily linear operator satisfying

(a) (Au − Av, u − v) ≥ c‖u − v‖² for all u, v ∈ H (strong monotonicity);

(b) ‖Au − Av‖ ≤ L‖u − v‖ for all u, v ∈ H (Lipschitz condition),

where c and L are given positive constants. Then for all w ∈ H there exists a unique u∗ ∈ H such that Au∗ = w, i.e., A is a bijection.

Proof. We ﬁrst prove uniqueness: Suppose u1 , u2 ∈ H such that Au1 =

w = Au2 . Then by (a),

0 = (Au1 − Au2, u1 − u2) ≥ c‖u1 − u2‖² ,

which implies u1 = u2 .

We now prove existence: First we note that c ≤ L by using (a) and

(b) together with Bunyakovsky–Cauchy–Schwarz. For a ﬁxed w ∈ H,

deﬁne B : H → H by

Bu = u − t(Au − w), t > 0, u ∈ H .

Note that if there is a ﬁxed point of B then it is u∗ as desired. We wish

to apply the Banach Contraction Principle in (H, d), where d(u, v) = ‖u − v‖. We have for all u, v ∈ H

d(Bu, Bv)² = ‖Bu − Bv‖²
= ‖u − v‖² − 2t(u − v, Au − Av) + t² ‖Au − Av‖²
≤ ‖u − v‖² − 2tc‖u − v‖² + t²L² ‖u − v‖² (by (a) and (b))
= (1 − 2tc + t²L²) ‖u − v‖² (call this factor m)
= m ‖u − v‖²
= m d(u, v)² .


The factor m = m(t) = 1 − 2tc + t²L² is minimized at t = c/L². Thus the minimum value of m is

m = 1 − 2(c²/L²) + (c²/L²) = 1 − c²/L² ≥ 0 ,

since c ≤ L. If c = L, then m = 0, so B is constant, i.e., Bu = w0 for some w0 ∈ H, so that w0 = u − (c/L²)(Au − w) for all u ∈ H. In this case A is affine, namely

Au = (L²/c)(u − w0) + w ,

so that u∗ = w0.

When c < L then 0 < m < 1 so that B is a contraction and hence by

the Banach Contraction Principle (see Sect. 2.5) B has a unique ﬁxed

point u∗ .
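For H = Rⁿ and A a symmetric positive definite matrix, conditions (a) and (b) hold with c = λmin(A) and L = λmax(A), and the map B from the proof can be iterated explicitly; a minimal sketch (the matrix and right-hand side are sample data of my own):

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])   # symmetric positive definite
w = np.array([1.0, -1.0])

eigs = np.linalg.eigvalsh(A)             # eigenvalues in ascending order
c, L = eigs[0], eigs[-1]                 # monotonicity / Lipschitz constants
t = c / L ** 2                           # the step from the proof (minimizes m)

u = np.zeros(2)
for _ in range(500):
    u = u - t * (A @ u - w)              # Bu = u - t(Au - w) is a contraction

print(u, np.linalg.norm(A @ u - w))      # u approximates the solution of Au = w
```

The iteration contracts errors by the factor √m = √(1 − c²/L²) per step, so a few hundred iterations suffice here.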

Theorem 6.16 (Lax–Milgram5). Let (H, (·, ·), ‖·‖) be a real Hilbert space and consider two functionals a : H × H → R and b : H → R satisfying

1. For all u ∈ H the map v ↦ a(u, v) is linear and continuous on H (i.e., it belongs to H∗);

2. a(u, u − v) − a(v, u − v) ≥ c‖u − v‖² for all u, v ∈ H and some c > 0;

3. |a(u, w) − a(v, w)| ≤ L‖u − v‖ · ‖w‖ for all u, v, w ∈ H and some L > 0;

4. b is a continuous linear functional (i.e., b ∈ H∗).

Then there exists a unique u ∈ H such that

a(u, v) = b(v) ∀v ∈ H . (6.5.19)

Proof. By the first assumption and the Riesz Representation Theorem, for all u ∈ H there exists a unique z ∈ H such that a(u, v) = (v, z) for all v ∈ H. So there exists an operator A : H → H defined by Au := z. We now rewrite the second condition:

a(u, u − v) − a(v, u − v) = (u − v, Au) − (u − v, Av) = (u − v, Au − Av)

5 Peter D. Lax, Hungarian-born American mathematician, born 1926; Arthur N. Milgram, American mathematician, 1912–1961.


and, since K = R,

(u − v, Au − Av) = (Au − Av, u − v) ≥ c‖u − v‖² ,

for all u, v ∈ H, so A satisﬁes condition (a) of the previous lemma.

From the third assumption we have for all u, v, z ∈ H

|a(u, z) − a(v, z)| = |(z, Au) − (z, Av)| = |(z, Au − Av)| ≤ L‖u − v‖ · ‖z‖ .

Choosing z = Au − Av we see that operator A also satisﬁes condition

(b) of Lemma 6.15.

On the other hand, by the fourth assumption and the Riesz Represen-

tation Theorem there exists a unique w such that b(v) = (v, w) for all

v ∈ H. Now (6.5.19) can be written as

(v, Au) = (v, w), ∀v ∈ H ⇐⇒ Au = w ,

so the conclusion of the theorem follows by Lemma 6.15.

Corollary 6.17. Let (H, (·, ·), ‖·‖) be a real Hilbert space and consider two functionals a : H × H → R and b : H → R satisfying

1. a is bilinear;

2. a is bounded (continuous) on H × H, namely |a(u, v)| ≤ L‖u‖ · ‖v‖ for all u, v ∈ H, for some L > 0;

3. a is strongly positive (or coercive), i.e., there exists c > 0 such that a(v, v) ≥ c‖v‖² for all v ∈ H;

4. b is linear and continuous (i.e., b ∈ H∗).

Then there exists a unique u ∈ H satisfying

a(u, v) = b(v) ∀v ∈ H . (6.5.19′)

If, in addition, a is symmetric (i.e., a(u, v) = a(v, u) for all u, v ∈ H) then u is a solution of (6.5.19′) if and only if it is a solution (minimizer) of the quadratic minimization problem

min_{v∈H} { (1/2) a(v, v) − b(v) } . (6.5.20)

Proof. The existence and uniqueness of u follow from Theorem 6.16, so all that remains is to prove the final statement.

Define

F(v) = (1/2)a(v, v) − b(v), v ∈ H.

Assume first that u is a solution of (6.5.20). Fix v ∈ H and define φ(t) = F(u + tv) for t ∈ R. We have

φ(t) = (1/2)a(u + tv, u + tv) − b(u + tv)
     = (1/2)a(u, u) + t a(u, v) + (1/2)t² a(v, v) − b(u) − t b(v)
     = F(u) + t[a(u, v) − b(v)] + (1/2)t² a(v, v),

where we have used the symmetry of a. Therefore,

φ′(t) = a(u, v) − b(v) + t a(v, v),

hence

a(u, v) − b(v) = φ′(0) = 0,

since t = 0 is a minimizer of φ, so that u satisfies (6.5.19′) because v is arbitrary.

Conversely, suppose that u satisfies (6.5.19′). We must show F(u) ≤ F(v) for all v ∈ H. It is enough to prove that F(u + v) − F(u) is nonnegative for every v ∈ H:

F(u + v) − F(u) = (1/2)a(u + v, u + v) − b(u + v) − (1/2)a(u, u) + b(u)
               = (1/2)a(u, u) + a(u, v) + (1/2)a(v, v) − b(v) − (1/2)a(u, u)   (by symmetry)
               = [a(u, v) − b(v)] + (1/2)a(v, v)
               = (1/2)a(v, v)   (since a(u, v) − b(v) = 0 by (6.5.19′))
               ≥ 0.

Thus u is a solution to (6.5.20).
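In finite dimensions this equivalence can be checked directly. The following sketch (my own illustration; the matrix and data are arbitrary choices, not from the book) takes a(u, v) = v · (Au) for a symmetric positive definite 2 × 2 matrix A and b(v) = v · f, solves a(u, v) = b(v), and verifies that F(u) = (1/2)a(u, u) − b(u) does not exceed F at a few other points:

```python
# a(u, v) = v . (A u) with A symmetric positive definite, b(v) = v . f.
# The solution of a(u, v) = b(v) for all v is u = A^{-1} f, and it
# minimizes the quadratic functional F(v) = a(v, v)/2 - b(v).
A = [[2.0, 1.0], [1.0, 3.0]]
f = [1.0, 2.0]

def mat_vec(M, x):
    return [M[0][0]*x[0] + M[0][1]*x[1], M[1][0]*x[0] + M[1][1]*x[1]]

def F(v):
    Av = mat_vec(A, v)
    return 0.5 * (v[0]*Av[0] + v[1]*Av[1]) - (v[0]*f[0] + v[1]*f[1])

# solve A u = f by Cramer's rule (det = 5, so u = (1/5, 3/5))
det = A[0][0]*A[1][1] - A[0][1]*A[1][0]
u = [(f[0]*A[1][1] - A[0][1]*f[1]) / det,
     (A[0][0]*f[1] - A[1][0]*f[0]) / det]

# F attains its minimum at u
for v in ([0.0, 0.0], [1.0, 1.0], [0.3, 0.5]):
    assert F(u) <= F(v)
print(u, F(u))
```

Here F(u) = −(1/2) f · u, in agreement with the identity F(u + v) − F(u) = (1/2)a(v, v) ≥ 0 from the proof.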

Theorem 6.18 (Dirichlet's Principle⁶). For all f ∈ L²(Ω) there exists a unique u ∈ H₀¹(Ω) which is a solution to the following minimization problem:

min_{v∈H₀¹(Ω)} { (1/2) ∫_Ω ∇v · ∇v dx − ∫_Ω f v dx }, (6.5.21)

or, equivalently, to the problem

u ∈ H₀¹(Ω), ∫_Ω ∇u · ∇v dx = ∫_Ω f v dx ∀v ∈ H₀¹(Ω). (6.5.22)

The strong (pointwise) form of (6.5.22) turns out to be

u ∈ H₀¹(Ω), −Δu = f in Ω, (6.5.23)

which is known as the Euler–Lagrange equation⁷ associated with the minimization problem (6.5.21) (being a Poisson equation in this example); u being 0 on the boundary is interpreted as meaning the trace of u on the boundary ∂Ω is 0. Indeed, for every test function φ ∈ C₀∞(Ω) we have

∫_Ω ∇u · ∇φ dx = ∫_Ω f φ dx ⟺ (−Δu, φ) = (f, φ),

so u satisfies the equation −Δu = f for a.a. x ∈ Ω. In fact, if ∂Ω is smooth enough, then u ∈ H₀¹(Ω) ∩ H²(Ω) (see [39, Theorem 3.1, p. 212]). Moreover, if f ∈ C∞(Ω) then so is u. Actually, the following regularity result holds.

Theorem 6.19 (Weyl's Lemma). If f ∈ C∞(Ω) and u ∈ D′(Ω) satisfies the equation −Δu = f in the sense of distributions, then u ∈ C∞(Ω).

Proof (of Dirichlet's Principle). We wish to use the classical Lax–Milgram Theorem 6.17. Denote H := H₀¹(Ω). Recall that H is a real Hilbert space as a closed subspace of H¹(Ω). According to Remark 5.26, H can be equipped with the norm

6 Johann Peter Gustav Lejeune Dirichlet, German mathematician, 1805–1859.
7 Joseph-Louis Lagrange, Italian mathematician and astronomer, 1736–1813.

‖u‖∗ = ( ∫_Ω |∇u|² dx )^{1/2}, u ∈ H = H₀¹(Ω).

Define a : H × H → R and b : H → R by

a(u, v) := ∫_Ω ∇u · ∇v dx, b(v) := ∫_Ω f v dx.

Clearly a is bilinear, continuous (bounded),

|a(u, v)| ≤ ‖u‖∗ · ‖v‖∗ ∀u, v ∈ H,

and coercive,

a(v, v) = ∫_Ω ∇v · ∇v dx = ‖v‖²∗ ∀v ∈ H.

The functional b is linear, and it is continuous on (H, ‖ · ‖∗) by Poincaré's inequality. Thus all the conditions of Theorem 6.17 are fulfilled, so the proof of Dirichlet's Principle is complete.
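To see the Dirichlet principle at work numerically, here is a hypothetical one-dimensional illustration (my own sketch, not part of the book): the weak problem for −u″ = f on (0, 1) with u(0) = u(1) = 0 is discretized by central differences and the resulting tridiagonal system is solved by the Thomas algorithm; with f(x) = π² sin(πx) the exact minimizer is u(x) = sin(πx).

```python
import math

# Discretize -u'' = f on (0, 1), u(0) = u(1) = 0, on n interior grid points.
n = 99
h = 1.0 / (n + 1)
x = [(i + 1) * h for i in range(n)]
f = [math.pi**2 * math.sin(math.pi * xi) for xi in x]

# tridiagonal system (1/h^2) * (-u[i-1] + 2 u[i] - u[i+1]) = f[i]
a = [-1.0 / h**2] * n       # sub-diagonal
b = [2.0 / h**2] * n        # diagonal
c = [-1.0 / h**2] * n       # super-diagonal

# Thomas algorithm: forward elimination, then back substitution
for i in range(1, n):
    m = a[i] / b[i - 1]
    b[i] -= m * c[i - 1]
    f[i] -= m * f[i - 1]
u = [0.0] * n
u[-1] = f[-1] / b[-1]
for i in range(n - 2, -1, -1):
    u[i] = (f[i] - c[i] * u[i + 1]) / b[i]

err = max(abs(u[i] - math.sin(math.pi * x[i])) for i in range(n))
print(err)   # O(h^2) discretization error
```

The discrete energy (1/2)Σ|∇v|² − Σfv is a quadratic form in the nodal values, so the discrete problem is exactly the finite-dimensional version of (6.5.21).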

Example. Consider next the semilinear boundary value problem

−Δu(x) + β(u(x)) = f(x), x ∈ Ω,
u = 0, x ∈ ∂Ω, (6.5.24)

where f ∈ L²(Ω) and β : R → R is a nonlinear Lipschitz continuous, nondecreasing function. We wish to prove that problem (6.5.24) has a unique solution u ∈ H₀¹(Ω). To this purpose we can apply Theorem 6.16 with H = H₀¹(Ω) equipped with the norm ‖ · ‖∗ as above, and with a : H × H → R, b : H → R defined by

a(u, v) = ∫_Ω ∇u · ∇v dx + ∫_Ω β(u)v dx, b(v) = ∫_Ω f v dx.

One can check that the conditions 1–4 of Theorem 6.16 are fulfilled, so there is a unique u ∈ H = H₀¹(Ω) satisfying

a(u, v) = b(v) ∀v ∈ H,

i.e., −Δu + β(u) = f in the sense of distributions. Since f − β(u) is in L²(Ω) as well, u satisfies the given equation for a.a. x ∈ Ω. In fact, if ∂Ω is smooth enough, then u ∈ H²(Ω) (cf. [39, Theorem 3.1, p. 212]).

6.6 Fourier Series Expansions

Let (H, (·, ·), ‖ · ‖) be a Hilbert space with m := dim H ≥ 1. If m < ∞ then, starting from a basis of H, say B = {e₁, . . . , e_m}, one can construct by the Gram–Schmidt procedure (see Chap. 1) an orthonormal basis B′ = {u₁, . . . , u_m}, i.e., (u_i, u_j) = δ_ij, i, j = 1, . . . , m. So every u ∈ H can be written as

u = Σ_{i=1}^m c_i u_i, c_i ∈ K, i = 1, . . . , m.

Taking the scalar product with each u_i shows that c_i = (u, u_i), so

u = Σ_{i=1}^m (u, u_i)u_i ∀u ∈ H. (6.6.25)

The numbers (u, u_i) are called Fourier coefficients.⁸

Consider now the case m = ∞. A set S ⊂ H is said to be an orthonormal set if ‖u‖ = 1 for all u ∈ S and for any pair u, v ∈ S, u ≠ v, we have (u, v) = 0. An orthonormal set S is called a complete orthonormal system in H if it is not properly included in any other orthonormal set in H.

8 Jean-Baptiste Joseph Fourier, French mathematician and physicist, 1768–1830.

Remark 6.20. Every nonzero Hilbert space H possesses a complete orthonormal system, and any orthonormal set can be extended to a complete orthonormal system. Indeed, choosing x ∈ H \ {0} and denoting u₁ = (1/‖x‖)x, we see that {u₁} is an orthonormal system in H. Consider the collection of all orthonormal systems in H which contain {u₁}. This collection is partially ordered with respect to the usual inclusion relation. By Zorn's Lemma there exists a maximal element of the collection, which is a complete orthonormal system in H. If m = ∞ then this system is infinite, be it countable or not (this issue will be clarified later).

Theorem 6.21. Let (H, (·, ·), ‖ · ‖) be an infinite dimensional Hilbert space and let S = {u_n}_{n∈N} ⊂ H be a countably infinite orthonormal system. Then the following are equivalent:

(a) S is complete;
(b) u = Σ_{n=1}^∞ (u, u_n)u_n ∀u ∈ H;
(c) Σ_{n=1}^∞ |(u, u_n)|² = ‖u‖² ∀u ∈ H (Parseval's relation⁹);
(d) Span S is dense in H.

Proof. First of all, using the orthonormality of the system S, we have for all u ∈ H and N ∈ N

0 ≤ ‖Σ_{n=1}^N (u, u_n)u_n − u‖² = ‖u‖² − Σ_{n=1}^N |(u, u_n)|². (6.6.26)

Let us prove that (b) ⟹ (a). Assume by contradiction that (b) holds, but S is not complete, i.e., there exists a vector û ∈ H \ S such that ‖û‖ = 1 and (û, u_n) = 0 ∀n ∈ N. From (b) with u = û it then follows û = 0, which is a contradiction.

Now, we prove that (a) ⟹ (b). Fix u ∈ H. By a standard computation we get

‖Σ_{n=m}^{m+p} (u, u_n)u_n‖² = Σ_{n=m}^{m+p} |(u, u_n)|². (6.6.27)

Since the numerical series Σ_{n=1}^∞ |(u, u_n)|² is convergent (see (6.6.26)), we deduce from (6.6.27) that the sequence of partial

9 Marc-Antoine Parseval, French mathematician, 1755–1836.

sums of the series Σ_{n=1}^∞ (u, u_n)u_n is a Cauchy sequence, hence convergent to some ũ ∈ H, so we can write

ũ = Σ_{n=1}^∞ (u, u_n)u_n. (6.6.28)

We compute

(ũ, u_j) = lim_{N→∞} ( Σ_{n=1}^N (u, u_n)u_n , u_j ) = (u, u_j) ∀j ∈ N,

so

(ũ − u, uj ) = 0 ∀j ∈ N,

which implies ũ = u by the completeness of S. Therefore ũ in (6.6.28)

can be replaced by u.

By (6.6.26), the equivalence (b) ⟺ (c) is immediate. It is also clear that (b) ⟹ (d). To complete the proof it suffices to show (d) ⟹ (a). Assume by contradiction that (d) holds but S is not complete, i.e., there exists a vector v ∈ H \ S such that ‖v‖ = 1 and (v, u_n) = 0 ∀n ∈ N. According to (d), we obtain (v, w) = 0 ∀w ∈ H, hence v = 0, another contradiction.

Remark 6.22. If S = {u_n}_{n∈N} is a (countable) complete orthonormal system in H, then every u ∈ H is the sum of the Fourier series associated with it (see (b)), similar to the finite dimensional case m < ∞. That is why S is also called a countable orthonormal basis of H. The next result is a characterization of the Hilbert spaces possessing countable orthonormal bases.

Theorem 6.23. A Hilbert space has a countable orthonormal basis if

and only if it is separable.

Proof. Let H be a Hilbert space. Denote m := dim H.

If m < ∞, then the result is trivial, so let us assume m = ∞.

Let S = {un }n∈N be a (countable) orthonormal basis in H. Then

Span S is dense in H (cf. Theorem 6.21). On the other hand, using

the fact that Q is dense in R, we can show that there exists a countable

subset of Span S which is dense in Span S, hence in H. Indeed, for any

u ∈ Span S, say u = Σ_{k=1}^p α_k u_k, and any ε > 0, there are numbers r_k ∈ Q if H is a real Hilbert space, or r_k ∈ Q + iQ if H is a complex Hilbert space, such that

|r_k − α_k| < ε/p, k = 1, . . . , p ⟹ ‖u − Σ_{k=1}^p r_k u_k‖ < ε.


Thus H is separable.

Conversely, assume H is separable, i.e., there exists a countably in-

ﬁnite set, say M = {x1 , x2 , . . . , xn , . . . } such that M = H. Using

Gram–Schmidt (see Chap. 1) we can construct with vectors from M

an orthonormal system S = {u1 , u2 , . . . , un , . . . } eliminating depen-

dent vectors of M if any. An inspection of the Gram–Schmidt method

shows that in fact M ⊂ Span S so that

H = Cl M ⊂ Cl Span S ⊂ H ⟹ Cl Span S = H,

i.e., S is complete (cf. Theorem 6.21), so it is a countable orthonormal basis of H.
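The Gram–Schmidt step used in this proof can be sketched as follows (my own illustration in R³; the tolerance is an arbitrary choice): vectors that are dependent on the ones already processed produce a zero remainder and are skipped.

```python
# Gram-Schmidt orthonormalization that drops dependent vectors.
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors, tol=1e-12):
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:                      # subtract projections on earlier u's
            c = dot(w, u)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        n = dot(w, w) ** 0.5
        if n > tol:                          # skip vectors dependent on the basis
            basis.append([wi / n for wi in w])
    return basis

# the middle vector is a multiple of the first one, so it is eliminated
S = gram_schmidt([[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [0.0, 1.0, 1.0]])
print(len(S))   # 2 orthonormal vectors
```

Applying this to a countable dense set M, as in the proof, yields an orthonormal system S with M ⊂ Span S.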

Remark 6.24. If H is a non-separable Hilbert space, the existence of a complete orthonormal system S = {u_i}_{i∈I} in H is still valid (cf. Remark 6.20). Obviously, the index set I is no longer countable. Surprisingly, in this case, for every u ∈ H there is a sequence of indices i₁, i₂, . . . such that

u = Σ_{j=1}^∞ (u, u_{i_j})u_{i_j},

i.e., u has a Fourier series expansion as in the separable case. For the proof of this result, see [51, pp. 86–87].

Example (the trigonometric system). Let H = L²(−π, π) with the usual scalar product

(f, g) = ∫_{−π}^{π} f(x)g(x) dx, f, g ∈ H,

and ‖f‖ = (f, f)^{1/2} for all f ∈ H. Let S = {u_n}_{n=0}^∞, where

u₀ = 1/√(2π), u_{2k−1}(x) = (1/√π) cos kx, u_{2k}(x) = (1/√π) sin kx, k = 1, 2, . . . .

An easy computation shows that S is an orthonormal system in H. Moreover, S is complete, as stated in the following result.

Theorem 6.25 (Fischer¹⁰–Riesz). The trigonometric system S defined above is a basis in H = L²(−π, π).

10 Ernst Sigismund Fischer, Austrian mathematician, 1875–1954.

Proof. By Theorem 6.21, it suffices to show that Cl Span S = H. We know that C₀∞(−π, π) is dense in L²(−π, π) (see Theorem 5.8). To conclude we can use Weierstrass' lemma below (cf. [52, p. 205]). This is an approximation result with respect to the sup-norm of C[−π, π], which is obviously stronger than the norm of H = L²(−π, π).

Lemma (Weierstrass). Any f ∈ X := {g ∈ C[−π, π]; g(−π) = g(π)} can be approximated, uniformly on [−π, π], by elements of Span S, where S is the trigonometric system defined above.

Proof. Let f ∈ X be even, i.e., f(−x) = f(x), x ∈ [−π, π]. Since the function y ↦ f(arccos y) is continuous on [−1, 1], for all ε > 0 there exists a Bernstein¹¹ polynomial p such that

sup_{y∈[−1,1]} |f(arccos y) − p(y)| < ε, i.e., sup_{x∈[0,π]} |f(x) − p(cos x)| < ε. (6.6.29)

In fact, since both f and x ↦ p(cos x) are even, we can extend (6.6.29) to [−π, π],

sup_{x∈[−π,π]} |f(x) − p(cos x)| < ε. (6.6.30)

Since each power cosᵏx is a linear combination of 1, cos x, . . . , cos kx, we have p(cos(·)) ∈ Span S, so (6.6.30) concludes the proof in the case when f is even.

Now, consider an odd function f ∈ X, so f(−π) = f(π) = f(0) = 0. Then x ↦ f(x)/sin x is an even function, but has singularities at x = 0, ±π. So we consider, for δ > 0 small,

f̃(x) = f( π(x − δ)/(π − 2δ) ) for x ∈ (δ, π − δ), f̃(x) = 0 for x ∈ [0, δ] ∪ [π − δ, π],

extended as an odd function to [−π, π]; f̃ is a continuous function which approximates f uniformly (for δ small). Now define

ψ(x) = f̃(x)/sin x for x ∈ [−π, π] \ {0, ±π}, ψ(x) = 0 for x ∈ {0, ±π}.

11 Sergei N. Bernstein, Russian mathematician, 1880–1968.

Then ψ is even and continuous, so by the first part of the proof it can be uniformly approximated by elements of Span S; since f̃(x) = ψ(x) sin x, and products of trigonometric functions belong to Span S, f̃ (hence f) can be approximated as well by elements in Span S.

To conclude the proof, it is enough to notice that any function f ∈ X can be decomposed into f = f_e + f_o, where

f_e(x) = (1/2)[f(x) + f(−x)], f_o(x) = (1/2)[f(x) − f(−x)]

are even and odd, respectively.

Some Comments

1. Since the trigonometric system is a countable orthonormal basis of L²(−π, π), it follows that L²(−π, π) is separable (by Theorem 6.23). Obviously, L²(a, b) is separable for any a < b. In fact, for any measurable set Ω ⊂ Rᵏ, Lᵖ(Ω) is separable for all p ∈ [1, ∞) (see, e.g., [6, p. 95]).

2. By Theorem 6.21, every u ∈ L²(−π, π) is the sum of the Fourier series associated with it, i.e.,

u = Σ_{n=0}^∞ (u, u_n)u_n, (6.6.31)

meaning that s_n(u) = Σ_{k=0}^n (u, u_k)u_k converges strongly to u in L²(−π, π). Taking into account the structure of the basis S, (6.6.31) can be written as

u(x) = a₀/2 + Σ_{n=1}^∞ (a_n cos nx + b_n sin nx), (6.6.32)

where

a_k = (1/π) ∫_{−π}^{π} u(t) cos(kt) dt (k ≥ 0), b_k = (1/π) ∫_{−π}^{π} u(t) sin(kt) dt (k ≥ 1), (6.6.33)

are the classical Fourier coefficients of the Fourier series associated with u. For the moment, we know (by Fischer–Riesz) that for u ∈ L²(−π, π) the series expansion (6.6.32) is valid in L²(−π, π), i.e.,

s_n(u)(x) = a₀/2 + Σ_{k=1}^n (a_k cos kx + b_k sin kx) (6.6.34)

converges to u in L²(−π, π); hence there is a subsequence of (s_n(u)) that converges to u for a.a. x ∈ (−π, π). There is a question whether the sequence (s_n(u)) itself converges a.e., i.e., (6.6.32) holds for a.a. x ∈ (−π, π). This question was posed in 1920 by Luzin.¹² In 1966, Carleson¹³ proved that this is indeed the case. The proof is not trivial and is omitted. Later, Hunt¹⁴ extended the result to Lᵖ-functions, i.e., the series expansion (6.6.32) holds a.e. for every Lᵖ-function u, for 1 < p < ∞. On the other hand, in 1922 Kolmogorov¹⁵ gave a counterexample showing that it does not hold for p = 1. However, the Fourier expansion (6.6.32) holds for L¹-functions in the sense of distributions, as explained below.
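As a quick numerical check of the coefficient formulas (6.6.33) and of Parseval's relation (my own illustration, not from the book), take u(x) = x on (−π, π), whose known sine coefficients are b_k = 2(−1)^{k+1}/k while all a_k vanish:

```python
import math

# Trapezoid rule for the integrals in (6.6.33).
def trapezoid(g, a, b, n=4000):
    h = (b - a) / n
    return h * (g(a) / 2 + sum(g(a + i * h) for i in range(1, n)) + g(b) / 2)

u = lambda x: x
K = 50
b_coef = [trapezoid(lambda x, k=k: u(x) * math.sin(k * x),
                    -math.pi, math.pi) / math.pi
          for k in range(1, K + 1)]

assert abs(b_coef[0] - 2.0) < 1e-4        # b_1 = 2
assert abs(b_coef[1] + 1.0) < 1e-4        # b_2 = -1

# Parseval: sum_k b_k^2 -> ||u||^2 / pi = 2*pi^2/3 (slowly, since b_k ~ 1/k)
partial = sum(b * b for b in b_coef)
print(partial, 2 * math.pi**2 / 3)
```

With the normalization u_{2k}(x) = sin(kx)/√π one has (u, u_{2k}) = √π b_k, so Parseval's relation Σ|(u, u_n)|² = ‖u‖² reads π Σ b_k² = ∫x² dx = 2π³/3 here.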

Recall that in general L¹ functions do not admit Fourier series expansions in classical theory. However, the Fourier coefficients of u (see (6.6.33)) are still well defined if u ∈ L¹(−π, π). Fix such a function u ∈ L¹(−π, π) and associate with it the series

u(x) ≈ a₀/2 + Σ_{n=1}^∞ (a_n cos nx + b_n sin nx).

We claim that this series converges to u in the sense of distributions, i.e.,

u(x) = a₀/2 + Σ_{n=1}^∞ (a_n cos nx + b_n sin nx) in D′(−π, π), (6.6.35)

12 Nikolai N. Luzin, Russian mathematician, 1883–1950.
13 Lennart Axel Edvard Carleson, Swedish mathematician, born 1928.
14 Richard Allen Hunt, American mathematician, 1937–2009.
15 Andrey N. Kolmogorov, Russian mathematician, 1903–1987.

Recall that distributions are not defined pointwise, and the appearance of x in (6.6.35) is simply for convenience.

In order to prove (6.6.35), consider the series

(a₀/4)x² + Σ_{n=1}^∞ ( −(a_n/n²) cos nx − (b_n/n²) sin nx ),

obtained by formally integrating twice, term by term, the series in (6.6.35). This series is uniformly and absolutely convergent, since for all n ≥ 1

| −(a_n/n²) cos nx − (b_n/n²) sin nx | ≤ (1/n²)(|a_n| + |b_n|) ≤ (4/(n²π)) ∫_{−π}^{π} |u(t)| dt = C (1/n²).
n

Let

s(x) = (a₀/4)x² + Σ_{n=1}^∞ ( −(a_n/n²) cos nx − (b_n/n²) sin nx ). (6.6.36)

Uniform convergence implies convergence in D′(−π, π), so (6.6.36) also holds in D′(−π, π). Differentiating (6.6.36) twice in the sense of distributions we get

a₀/2 + Σ_{n=1}^∞ (a_n cos nx + b_n sin nx) = s″ in D′(−π, π). (6.6.37)

Finally we must show that s″ = u, i.e., s″ is the distribution generated by the function u. We consider the partial sums

s_l(u)(x) = a₀/2 + Σ_{n=1}^l (a_n cos nx + b_n sin nx).

For every φ ∈ C₀∞(−π, π) we have

(s_l(u), φ) = ∫_{−π}^{π} s_l(u)(x)φ(x) dx

= ∫_{−π}^{π} φ(x) [ a₀/2 + Σ_{n=1}^l (a_n cos nx + b_n sin nx) ] dx

= ∫_{−π}^{π} φ(x) [ (1/2π) ∫_{−π}^{π} u(t) dt + Σ_{n=1}^l ( (1/π) cos nx ∫_{−π}^{π} u(t) cos nt dt + (1/π) sin nx ∫_{−π}^{π} u(t) sin nt dt ) ] dx,

hence, by Fubini's theorem,

(s_l(u), φ) = ∫_{−π}^{π} u(t) s_l(φ)(t) dt. (6.6.38)

We claim that

s_l(φ) → φ uniformly on [−π, π] as l → ∞. (6.6.39)

Indeed, if we denote

A_k = (1/π) ∫_{−π}^{π} φ(t) cos kt dt (k ≥ 0), B_k = (1/π) ∫_{−π}^{π} φ(t) sin kt dt (k ≥ 1),

then integrating by parts twice (the boundary terms vanish since φ has compact support) we see that for k ≥ 1

A_k = −(1/(kπ)) ∫_{−π}^{π} φ′(t) sin kt dt = −(1/(k²π)) ∫_{−π}^{π} φ″(t) cos kt dt,

and similarly

B_k = −(1/(k²π)) ∫_{−π}^{π} φ″(t) sin kt dt.

Hence

|A_k| ≤ C₁/k², |B_k| ≤ C₁/k² ∀k ≥ 1. (6.6.40)

It follows from (6.6.40) that the Fourier series of φ is uniformly convergent (see Weierstrass' M-Test) and its sum is φ (by the classical theory, or by Theorem 6.25), i.e., (6.6.39) holds. Finally, taking into account (6.6.39) and letting l → ∞ in (6.6.38), we get

(s″, φ) = ∫_{−π}^{π} u(t)φ(t) dt = (u, φ) ∀φ ∈ C₀∞(−π, π),

i.e., s″ = u, which proves (6.6.35).

6.7 Exercises

1. Let ∅ ≠ Ω ⊂ Rᵏ be an open set and let p ∈ (1, ∞). It is well known that Lᵖ(Ω) is a Banach space with respect to the usual norm

‖u‖_{Lᵖ(Ω)} = ( ∫_Ω |u(x)|ᵖ dx )^{1/p}, u ∈ Lᵖ(Ω).

Prove that (Lᵖ(Ω), ‖ · ‖_{Lᵖ(Ω)}) is a Hilbert space if and only if p = 2.

2. Let H be a linear space equipped with a scalar product (·, ·) and the induced norm ‖ · ‖. Show that for x, y ∈ H we have |(x, y)| = ‖x‖ · ‖y‖ if and only if x and y are linearly dependent.

3. Let −∞ < a < b < ∞. Show that C[a, b] with the sup-norm is

not a Hilbert space.

4. Let n ∈ N and let C ⊂ L²(0, 1) be the set of all polynomials with real coefficients of degree ≤ n. Show that for any u ∈ L²(0, 1) there exists a unique p_u ∈ C such that

‖u − p_u‖_{L²(0,1)} ≤ ‖u − p‖_{L²(0,1)} ∀p ∈ C.

5. Let (H, ‖ · ‖) be a Hilbert space and define P : H → H by

Pu = u if ‖u‖ ≤ 1, Pu = ‖u‖⁻¹u if ‖u‖ > 1.

Show that:

(i) P is Lipschitzian with constant L = 1;
(ii) if H is a general Banach space, then P is Lipschitzian with L = 2.

norm. Set

and for x = (1, 2, −1)T determine PM x and write x as a direct

sum of vectors in M and M ⊥ , i.e., x = x1 +x2 , x1 ∈ M, x2 ∈ M ⊥ .

7. Let H = L²(a, b) be equipped with the usual scalar product and norm. Show that

M = { u ∈ L²(a, b); ∫_a^b u(t) dt = 0 }

is a closed subspace of H, determine M⊥, and write any u ∈ L²(a, b) as a direct sum of vectors in M and M⊥, i.e., u = u₁ + u₂, u₁ ∈ M, u₂ ∈ M⊥.

9. Show that for any linear subspace Y of a Hilbert space (H, (·, ·)) one has

(Y⊥)⊥ = Cl Y.

10. Let H = L²(0, 1) be the real Hilbert space equipped with the usual scalar product and norm. Is the subspace Y = { u ∈ H; ∫₀¹ (u(t)/t) dt = 0 } closed in H?


11. Prove that the dual of any Hilbert space is a Hilbert space, too.

12. Let {u_n}_{n=1}^∞ be an orthonormal basis in a Hilbert space H and let (a_n)_{n∈N} be a bounded sequence in R. Prove that:

(i) the sequence v_n = (1/n) Σ_{i=1}^n a_i u_i, n ∈ N, converges strongly to zero;

(ii) the sequence (√n v_n)_{n∈N} converges weakly to zero.

13. Let (H, · ) be a Hilbert space and A ∈ L(H). Show that the

following two conditions are equivalent:

H;

where I is the identity operator on H.

14. Let (H, (·, ·), ‖ · ‖) be a real Hilbert space. For any A ∈ L(H) satisfying (Ax, x) ≥ 0 ∀x ∈ H, we have

lim_{t→∞} (I + tA)⁻¹u = P_{N(A)}u ∀u ∈ H,

where N(A) denotes the nullspace of A.

15. Let (u_n)_{n∈N} be a sequence in a Hilbert space (H, ‖ · ‖) which is weakly convergent to a point u ∈ H. If, in addition, lim sup_{n→∞} ‖u_n‖ ≤ ‖u‖, then show that ‖u_n − u‖ → 0.

16. Prove that for any f ∈ L¹(0, 1) there exists a unique u ∈ H₀¹(0, 1) satisfying

∫₀¹ u′(t)v′(t) dt + ∫₀¹ u(t)v(t) dt = ∫₀¹ f(t)v(t) dt ∀v ∈ H₀¹(0, 1),

and that u satisfies

−u″ + u = f a.e. in (0, 1), u(0) = 0, u(1) = 0.

17. Let f ∈ L²(0, 1) and let α > 0. Consider the problem

(P)  u ∈ H²(0, 1), −u″ + αu = f a.e. in (0, 1), u′(0) = 0, u′(1) = u(1).

Show that u is a solution of (P) if and only if

u ∈ H¹(0, 1), −u(1)v(1) + ∫₀¹ u′v′ + α ∫₀¹ uv = ∫₀¹ fv ∀v ∈ H¹(0, 1).

Prove that, for α large enough, there exists a unique solution u of problem (P), and show that u is the minimizer of a functional defined on H¹(0, 1).

18. Let (H, (·, ·)) be a Hilbert space and let Y ⊂ H be a closed subspace with an orthonormal basis {u_n}_{n=1}^∞. Prove that for every y ∈ H the closest point to y in Y is Σ_{n=1}^∞ (y, u_n)u_n.

19. Let H be an infinite dimensional Hilbert space. Show that for any x ∈ H, ‖x‖ ≤ 1, there exists a sequence (x_n)_{n∈N} in H such that ‖x_n‖ = 1 for all n ∈ N and x_n → x weakly.

20. Find the Fourier series expansions of the following functions (on the indicated intervals):

f₂(x) = −3x + sin x, −π ≤ x ≤ π,

f₃(x) = −1 for −π ≤ x ≤ 0, f₃(x) = x + 1 for 0 ≤ x ≤ π,

f₄(x) = x + 1 for −1 ≤ x ≤ 0, f₄(x) = x² − 1 for 0 ≤ x ≤ 1.

Chapter 7
Adjoint, Symmetric, and Self-adjoint Linear Operators

In this chapter we define the adjoint of a densely defined linear operator and discuss some related results. Then we shall address the case of compact operators A : H → H, where H is a Hilbert space, and present the Fredholm theorem as an application. The last section is devoted to symmetric operators and self-adjoint operators.

Throughout this chapter we consider linear operators between linear spaces over K, where K is either R or C, unless otherwise specified.

7.1 The Adjoint of a Linear Operator

Let X, Y be Banach spaces with duals X∗ and Y∗ and let A : D(A) ⊂ X → Y be a linear operator that is densely defined: Cl D(A) = X. The adjoint of A is an operator A∗ : D(A∗) ⊂ Y∗ → X∗ defined as follows. The domain of A∗ is the set of all y∗ ∈ Y∗ for which the functional f(x) = y∗(Ax) is continuous on D(A) (equipped with the norm ‖ · ‖ of X), i.e., |f(x)| ≤ c‖x‖ for all x ∈ D(A). According to the Hahn–Banach Theorem, f can be extended to a functional g ∈ X∗,

G. Moroşanu, Functional Analysis for the Applied Sciences,

Universitext, https://doi.org/10.1007/978-3-030-27153-4 7

such that |g(x)| ≤ c‖x‖ for all x ∈ X. This extension is unique since D(A) is dense in X. We now define

A∗y∗ = g for y∗ ∈ D(A∗).

Example.

Let X = Y = l1 (for the deﬁnition of l1 see Chap. 4). Let A : D(A) ⊂

l1 → l1 be deﬁned by

Note that both A and A∗ are closed operators, i.e., their graphs are closed in l¹ × l¹ and l∞ × l∞, respectively. In fact, we have the following general result:

Theorem 7.1. Let X, Y be Banach spaces and let A : D(A) ⊂ X → Y be a densely defined, linear operator. Then A∗ is closed.

Proof. Let (y_n∗) be a sequence in D(A∗) such that y_n∗ → y∗ in Y∗ and A∗y_n∗ → x∗ in X∗. We have

y∗(Ax) = lim y_n∗(Ax) = lim (A∗y_n∗)(x) = x∗(x) ∀x ∈ D(A),

which shows that y∗ ∈ D(A∗) and A∗y∗ = x∗, i.e., A∗ is closed.

Theorem 7.2. If A ∈ L(X, Y), then A∗ ∈ L(Y∗, X∗) and ‖A‖ = ‖A∗‖ (here we use the same symbol ‖ · ‖ for different norms).

Let us collect some properties of adjoint operators. Let X, Y, Z be three Banach spaces over K, where K is the same (either R or C) for all the three spaces. Then the following properties hold:

(a) if B : D(B) ⊂ X → Y is another densely defined linear operator, such that A ⊂ B (i.e., D(A) ⊂ D(B) and Bx = Ax ∀x ∈ D(A)), then B∗ ⊂ A∗;

7.2 Adjoints of Operators on Hilbert Spaces

Let (H, (·, ·), ‖ · ‖) be a Hilbert space. Let A : D(A) ⊂ H → H be a densely defined, linear operator. Taking into account the Riesz Representation Theorem, the adjoint of A can be redefined as an operator from H into itself, as follows: for y ∈ D(A∗), the functional x ↦ (Ax, y) (which is linear and continuous on (D(A), ‖ · ‖)) can be extended uniquely to a functional belonging to H∗, so (by Riesz) there is a corresponding element in H, denoted A∗y. Thus we have a linear operator A∗ : D(A∗) ⊂ H → H, such that

(Ax, y) = (x, A∗y) ∀x ∈ D(A), y ∈ D(A∗).

If R denotes the Riesz isomorphism, this new adjoint is nothing else but the operator R⁻¹ ∘ A∗ ∘ R, with A∗ being the adjoint defined in the previous section. Whenever we deal with a densely defined linear operator A : D(A) ⊂ H → H, we shall associate with A the A∗ defined in this section. It is easily seen that all the properties discussed in the previous section remain valid, except for (b) which now takes the form

(b′) For all α, β ∈ K and A, B ∈ L(X, Y),

(αA + βB)∗ = ᾱA∗ + β̄B∗.

More generally, for α ∈ K and any densely defined, linear operator A : D(A) ⊂ X → Y, we have

(αA)∗ = ᾱA∗.

Note that if H = Kⁿ, the matrix associated with this A∗ is the transposed conjugate of the matrix corresponding to A (while the matrix associated with the adjoint of A as defined in the previous section is just the transpose of the matrix corresponding to A. This shows the difference between the two notions of adjoint).

Note also that every A ∈ L(H) satisfies A∗∗ = A. Indeed,

(Ax, y) = (x, A∗y) = conj (A∗y, x) = conj (y, A∗∗x) = (A∗∗x, y) ∀x, y ∈ H,

where conj denotes complex conjugation (the identity map if K = R), hence A∗∗ = A.

Denote by K(H) := K(H, H) the space of compact linear operators from H into itself. This is a closed subspace of L(H) := L(H, H) with respect to the operator norm, hence K(H) is a Banach space with respect to this norm (see Theorem 4.11).

Theorem 7.3. If (H, (·, ·), ‖ · ‖) is a Hilbert space and A ∈ K(H), then the nullspace of I − A, denoted N = N(I − A), is a finite dimensional subspace of H, where I denotes the identity operator of H.

Proof. Let Q := {x ∈ N; ‖x‖ ≤ 1} be the closed unit ball of N, a bounded subset of N. Since A is compact and Q = AQ, we deduce that Q is relatively compact in (N, ‖ · ‖). According to Theorem 2.24, N is finite dimensional.

Theorem 7.4 (Schauder¹). If A ∈ K(H), then A∗ ∈ K(H), too.

Proof. Let r > 0 be arbitrary but fixed. Since A∗ ∈ L(H), the set A∗B(0, r) is bounded: ‖x‖ < r ⟹ ‖A∗x‖ ≤ r‖A∗‖. As A is compact, it follows that for any sequence (x_n)_{n≥1} in B(0, r) the sequence ((A ∘ A∗)x_n)_{n≥1} has a convergent subsequence, say ((A ∘ A∗)x_{n_k})_{k≥1}. We also have

‖A∗x_{n_k} − A∗x_{n_j}‖² = ( A∗(x_{n_k} − x_{n_j}), A∗(x_{n_k} − x_{n_j}) )
= ( x_{n_k} − x_{n_j}, A(A∗(x_{n_k} − x_{n_j})) )
≤ 2r ‖(A ∘ A∗)x_{n_k} − (A ∘ A∗)x_{n_j}‖,

so (A∗x_{n_k})_{k≥1} is a Cauchy sequence, hence convergent. Thus A∗ is compact.

1 Juliusz Pawel Schauder, Polish mathematician, 1899–1943.

Remark 7.5. Conversely, if A ∈ L(H) and A∗ is compact, then A is compact. This follows from Schauder's Theorem above combined with (A∗)∗ = A.

Remark 7.6. If A, B ∈ L(H) and at least one is compact, then A ◦ B

is compact as well.

We continue with an important result, essentially due to Fredholm,2

that provides a necessary and suﬃcient condition for an operator equa-

tion involving a compact linear operator to have a solution.

Theorem 7.7 (Fredholm). Let (H, (·, ·), ‖ · ‖) be a Hilbert space and let A ∈ K(H). The equation x − A∗x = f has a solution if and only if f ∈ N⊥, where N = N(I − A) (the nullspace of I − A).

Corollary 7.8. If (H, (·, ·), ‖ · ‖) is a Hilbert space and A ∈ K(H), then the equation x − Ax = f has a solution if and only if f ∈ N(I − A∗)⊥.

Lemma 7.9. Let (H, (·, ·), ‖ · ‖) be a Hilbert space and let A ∈ K(H). Then there exists a constant C > 0 such that

C‖x‖ ≤ ‖(I − A)x‖ ∀x ∈ N⊥, (7.2.3)

where N = N(I − A).

Proof. Assume by contradiction that (7.2.3) is not true, i.e., for all n ∈ N there exists an x_n ∈ N⊥ such that ‖x_n‖ = 1 and

‖(I − A)x_n‖ < 1/n.

Therefore,

x_n − Ax_n → 0. (7.2.4)

As A is compact there is a subsequence of (x_n)_{n≥1}, say (x_{n_k})_{k≥1}, such that (Ax_{n_k})_{k≥1} is convergent. By (7.2.4) we deduce that (x_{n_k})_{k≥1} is also convergent, and its limit x ∈ N⊥ (since N⊥ is closed). Using again (7.2.4), we infer that x − Ax = 0, i.e., x ∈ N. Since N ∩ N⊥ = {0}, we have x = 0, which contradicts ‖x_n‖ = 1 ∀n ≥ 1.

2 Erik Ivar Fredholm, Swedish mathematician, 1866–1927.

Proof of Theorem 7.7. Necessity. Assume that the equation x − A∗x = f has a solution x ∈ H. Then, for all y ∈ N, we have

(f, y) = (x, y) − (A∗x, y) = (x, y) − (x, Ay) = (x, (I − A)y) = (x, 0) = 0.

Therefore f ∈ N⊥.

Sufficiency. Since N⊥ is a closed subspace of (H, (·, ·), ‖ · ‖), N⊥ is a Hilbert space with the same scalar product and norm. According to Lemma 7.9, ‖ · ‖ is equivalent (on N⊥) with the norm defined by the scalar product

⟨x, y⟩ = (Tx, Ty) ∀x, y ∈ N⊥,

where T = I − A. Since the functional x ↦ (x, f) is linear and continuous on N⊥, it follows by the Riesz Representation Theorem that there exists x_f ∈ N⊥ such that

(x, f) = ⟨x, x_f⟩ = (Tx, Tx_f) ∀x ∈ N⊥. (7.2.5)

In fact (7.2.5) holds for all x ∈ H: for x ∈ N both sides vanish, since f ∈ N⊥ and Tx = 0. Denoting x̃ = Tx_f, we can write (see (7.2.5) extended to H)

(Tx, x̃) = (x, x̃ − A∗x̃) = (x, f) ∀x ∈ H,

so

x̃ − A∗x̃ = f.

This completes the proof of Theorem 7.7.

Theorem 7.10. Let (H, (·, ·), ‖ · ‖) be a Hilbert space and let A ∈ K(H). Then

R(I − A) = H ⟺ N = {0} ⟺ N∗ = {0} ⟺ R(I − A∗) = H,

where N = N(I − A), N∗ = N(I − A∗), and R(I − A), R(I − A∗) denote the ranges of I − A, I − A∗.

Proof. Taking into account Theorem 7.7 and Corollary 7.8, it suffices to prove that

R(I − A) = H ⟺ R(I − A∗) = H. (7.2.6)

Assume R(I − A) = H. Let us prove that N = {0}. Assume by way of contradiction that N ≠ {0}, i.e., there exists an x₀ ∈ N, x₀ ≠ 0. As R(I − A) = H we can construct a sequence (x_n)_{n≥1} in D(A) such that

Tx_n = x_{n−1} ∀n ≥ 1,

where T := I − A. We have

Tⁿx_n = x₀ ≠ 0, and Tⁿ⁺¹x_n = 0,

so, denoting H_n := N(Tⁿ), we have that H_n is a proper linear subspace of H_{n+1} for all n ∈ N.

According to Theorem 7.3, every H_n is a finite dimensional space, hence closed, since

Tⁿ = (I − A)ⁿ = I − Σ_{k=1}^n (−1)^{k+1} C(n, k) Aᵏ,

and the sum on the right (C(n, k) being the binomial coefficients) is a compact operator. By Riesz's Lemma, for every n there exists u_n such that

u_n ∈ H_{n+1}, ‖u_n‖ = 1, ‖u_n − u‖ ≥ 1/2 ∀u ∈ H_n.

Since for 1 ≤ m < n

Tⁿ(Tu_n + Au_m) = Tⁿ⁺¹u_n + ATⁿu_m = 0,

i.e., Tu_n + Au_m ∈ H_n, we obtain

‖Au_n − Au_m‖ = ‖u_n − (Tu_n + Au_m)‖ ≥ 1/2.

Thus the sequence (Au_n)_{n≥1} cannot have Cauchy (hence convergent) subsequences. This contradicts the fact that A is compact combined with ‖u_n‖ = 1 for all n ≥ 1. Therefore, N = {0}, which (by Theorem 7.7) implies that R(I − A∗) = H. Thus we have proved the implication

R(I − A) = H ⟹ R(I − A∗) = H.

The converse implication follows by replacing A with A∗.

Remark 7.11. From Corollary 7.8 and Theorem 7.10 we deduce that if the equation x − Ax = f has a solution u_f for all f ∈ H then u_f is unique (since N(I − A) = {0}). So we can now state the so-called Fredholm's alternative regarding the equation x − Ax = f with A ∈ K(H), namely one of the following must hold:

• for every f ∈ H the equation x − Ax = f has a unique solution (equivalently, N(I − A) = {0});

• N(I − A) ≠ {0}, in which case the equation x − Ax = f is solvable if and only if f ⊥ N(I − A∗) (i.e., f satisfies m orthogonality relations, where m = dim N(I − A∗) = dim N(I − A)).

We shall later apply Fredholm's alternative to a class of integral equations that are named after him.
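A deliberately tiny finite-dimensional illustration (mine, not from the book): take the compact rank-one operator A = e₁e₁ᵀ on R², so that I − A = diag(0, 1) and N(I − A) = N(I − A∗) = span{e₁}. The equation x − Ax = f is then solvable exactly when f is orthogonal to e₁, which is the second branch of the alternative.

```python
# Fredholm alternative for A = e1 e1^T on R^2.
# (I - A)x = (0, x2), so f = (f1, f2) lies in the range iff f1 = 0,
# i.e. iff f is orthogonal to N(I - A*) = span{e1}.
def solvable(f, tol=1e-12):
    return abs(f[0]) < tol

assert solvable([0.0, 3.0])        # f ⟂ N(I - A*): solvable, e.g. x = (0, 3)
assert not solvable([1.0, 0.0])    # f has a component along e1: no solution
print("Fredholm alternative verified on a rank-one example")
```

Here dim N(I − A) = dim N(I − A∗) = 1, so solvability amounts to exactly one orthogonality relation, as stated in the remark.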

Remark 7.12. In fact, the above theory is valid in a general Banach

space H (see, e.g., [6, Chapter 6] or [15, Chapter 5]).

7.3 Symmetric Operators and Self-adjoint Operators

We begin this section with the following definition.

Definition 7.13. Let (H, (·, ·), ‖ · ‖) be a Hilbert space and let A : D(A) ⊂ H → H be a densely defined, linear operator.

(a) A is called symmetric if A ⊂ A∗, i.e., (Ax, y) = (x, Ay) for all x, y ∈ D(A);

(b) A is called self-adjoint if A = A∗.

Note that if A is symmetric with D(A) = H, then A is automatically self-adjoint, and in this case A is closed (by Theorem 7.1), hence A ∈ L(H) (by the Closed Graph Theorem).

Example 1. Let X = L²(a, b; K), where −∞ < a < b < +∞, and let A : X → X be defined by

(Af)(t) = ∫_a^b k(t, s)f(s) ds, a ≤ t ≤ b,

where k ∈ C([a, b] × [a, b]; K). The space X equipped with the usual scalar product and norm is a Hilbert space and A ∈ L(X). Moreover, A is a compact operator.

Note that for all f, g ∈ X we have

(Af, g)_{L²(a,b;K)} = ∫_a^b (Af)(t) · g(t) dt

= ∫_a^b ( ∫_a^b k(t, s)f(s) ds ) g(t) dt

= ∫_a^b f(s) ( ∫_a^b k(t, s)g(t) dt ) ds

= ∫_a^b f(t) ( ∫_a^b k(s, t)g(s) ds ) dt,

thus

(A∗g)(t) = ∫_a^b k(s, t) · g(s) ds ∀g ∈ X.

Obviously,

A = A∗ ⟺ k(t, s) = k(s, t) ∀t, s ∈ [a, b].
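Discretizing the integral operator turns this computation into plain matrix algebra. The following sketch (my own illustration, with an arbitrarily chosen real kernel and a midpoint quadrature on [0, 1]) checks numerically that the adjoint corresponds to the transposed kernel k(s, t):

```python
import math

# midpoint grid on [0, 1]; the operator becomes (Af)(t_i) ~ h * sum_j k(t_i, s_j) f(s_j)
n = 200
h = 1.0 / n
grid = [(i + 0.5) * h for i in range(n)]
kern = lambda t, s: t + 2.0 * s          # an arbitrary non-symmetric kernel
f = [math.sin(t) for t in grid]
g = [t * t for t in grid]

Af = [h * sum(kern(t, s) * fs for s, fs in zip(grid, f)) for t in grid]
Ag = [h * sum(kern(s, t) * gs for s, gs in zip(grid, g)) for t in grid]   # transposed kernel

lhs = h * sum(x * y for x, y in zip(Af, g))   # (Af, g)
rhs = h * sum(x * y for x, y in zip(f, Ag))   # (f, A*g)
print(lhs, rhs)                               # agree up to rounding
```

Both quantities equal the same double sum h² Σ_{i,j} k(t_i, t_j) f_j g_i, which is the discrete analogue of the Fubini argument above.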

Example 2. Let X = L²(R) be equipped with the usual scalar product and Hilbertian norm, and let A : D(A) ⊂ X → X be given by

D(A) = {f ∈ X; tf(t) ∈ X}, (Af)(t) := tf(t) ∀t ∈ R, f ∈ D(A).

It is easily seen that A is self-adjoint.

Example 3. Let H = L²(Ω) be equipped with the usual scalar product and norm, where ∅ ≠ Ω ⊂ R^N, N ≥ 2, is a bounded domain with smooth boundary. Let A : D(A) ⊂ H → H, where

D(A) = C₀∞(Ω), Au = Δu ∀u ∈ D(A).

Obviously, D(A) is dense in H. By Green's identity, we have

∫_Ω vΔu dx = ∫_Ω uΔv dx ∀u ∈ D(A) = C₀∞(Ω), v ∈ H²(Ω),

which shows that H²(Ω) ⊂ D(A∗). Hence A is symmetric but not self-adjoint, because D(A) is a proper subset of D(A∗). If the domain of A = Δ is extended to H₀¹(Ω) ∩ H²(Ω) then A becomes self-adjoint. More precisely, we have the following proposition.

Proposition 7.14. Let H = L²(Ω) be equipped with the usual scalar product (·, ·) and the induced norm ‖ · ‖, where ∅ ≠ Ω ⊂ R^N, N ≥ 2, is a bounded domain with smooth boundary. Let B : D(B) ⊂ H → H be defined by D(B) = H²(Ω) ∩ H₀¹(Ω), Bu = Δu for all u ∈ D(B). Then B is self-adjoint.

Proof. Obviously, D(B) is dense in H, and for all u, v ∈ D(B)

(Bu, v) = ∫_Ω Δu · v dx = ∫_Ω u · Δv dx = (u, Bv),

so B ⊂ B∗ (B is symmetric). Let us prove that D(B∗) = D(B). Using the Lax–Milgram Theorem, we can see that R(I − B) = H. In addition, since −B is positive, I − B is invertible and J := (I − B)⁻¹ ∈ L(H). As B is symmetric, so is J. Now, let v be an arbitrary function in D(B∗). Denoting g = v − B∗v, we have

((I − B)u, v) = (u, v − B∗v) = (u, g) = (u, (I − B)Jg) = ((I − B)u, Jg) ∀u ∈ D(B),

and since R(I − B) = H, it follows that v = Jg ∈ R(J) = D(B).

In particular, if A is self-adjoint and R(A) = H, then A is invertible and (A∗)⁻¹ = (A⁻¹)∗. In fact, the following more general result holds.

Theorem 7.15. Let (H, (·, ·), ‖ · ‖) be a Hilbert space and let A : D(A) ⊂ H → H be a symmetric linear operator, with R(A) = H. Then

(A⁻¹)∗ = (A∗)⁻¹,

where all operations are permitted. If, in addition, A is self-adjoint, then so is A⁻¹.

Proof. Since A is symmetric and R(A) = H, both A and A∗ are injective. Therefore, A⁻¹ and (A∗)⁻¹ exist, with D(A⁻¹) = R(A) and D((A∗)⁻¹) = R(A∗). Since D(A⁻¹) is dense in H, B := (A⁻¹)∗ is well defined, and

(A⁻¹u, A∗w) = (u, w) ∀u ∈ D(A⁻¹) = R(A), w ∈ D(A∗). (7.3.8)

On the other hand, by (7.3.8), A∗w ∈ D((A∗)⁻¹) = D(B) and B(A∗w) = w, from which (A⁻¹)∗ = (A∗)⁻¹ follows. If, in addition, A is self-adjoint, then A⁻¹ = (A⁻¹)∗, i.e., A⁻¹ is self-adjoint, too.

7.4 Exercises

1. Let X, Y be Banach spaces. Let A : D(A) ⊂ X → Y be a

densely deﬁned, closed linear operator and B ∈ L(X, Y ). Deﬁne

T : D(T ) = D(A) ⊂ X → Y by T x = Ax + Bx ∀x ∈ D(A).

Prove that

(i) T is a closed operator;

(ii) D(T ∗ ) = D(A∗ ) and T ∗ = A∗ + B ∗ .

2. Let X, Y be Banach spaces and let A : D(A) ⊂ X → Y be a densely defined linear operator. Show that A∗ is injective if and only if Cl R(A) = Y.

3. Show that if A : D(A) ⊂ H → H is a symmetric linear operator with R(A) = H, then A is self-adjoint, i.e., A = A∗.

4. Let H be a Hilbert space, with the scalar product denoted (·, ·), and let A, B ∈ L(H). Show that (A ∘ B)∗ = B∗ ∘ A∗.

5. Let H be a Hilbert space and A ∈ L(H). Show that ‖A∗ ∘ A‖ = ‖A‖².

6. Let (H, (·, ·)) be a Hilbert space over C and let A ∈ L(H). Prove

that

A is symmetric (hence self-adjoint) ⇐⇒ (Ax, x) ∈ R ∀x ∈ H.

7. Let (H, (·, ·)) be a Hilbert space over R. Prove that for any a > 0

and any A ∈ L(H) the operator T = I + aA∗ A is invertible and

T −1 ∈ L(H), where I denotes the identity operator on H.

8. Let (H, (·, ·)) be a Hilbert space over C and let A ∈ L(H) be a symmetric (hence self-adjoint) operator. Denote T = A + iI, where i² = −1 and I is the identity operator on H. Prove that

(a) ‖Tx‖² = ‖Ax‖² + ‖x‖² for all x ∈ H;

(b) T is invertible and T⁻¹ ∈ L(H).

9. For A ∈ L(H) and a polynomial P(λ) = a₀ + a₁λ + · · · + a_nλⁿ with coefficients in C, denote by P(A) the operator polynomial a₀I + a₁A + · · · + a_nAⁿ, where I stands for the identity operator. Prove that:

(j) If A is symmetric and a₀, . . . , a_n ∈ R, then P(A) is symmetric, too;

(jj) If A is a normal operator (i.e., A∗A = AA∗), then so is P(A).

10. Let H₁, H₂ be Hilbert spaces over the same K and let H = H₁ × H₂ be the Hilbert space consisting of all pairs (x₁, x₂)ᵀ, x₁ ∈ H₁ and x₂ ∈ H₂, with

(x₁, x₂)ᵀ + (y₁, y₂)ᵀ = (x₁ + y₁, x₂ + y₂)ᵀ, α(x₁, x₂)ᵀ = (αx₁, αx₂)ᵀ ∀α ∈ K,

and a scalar product defined by

⟨(x₁, x₂)ᵀ, (y₁, y₂)ᵀ⟩ = (x₁, y₁)_{H₁} + (x₂, y₂)_{H₂}.

For A₁ ∈ L(H₁) and A₂ ∈ L(H₂), define on H the matrix operator

A = [ A₁ 0 ; 0 A₂ ].

Prove that A ∈ L(H) and ‖A‖ = max{‖A₁‖, ‖A₂‖}. Find A∗.

11. Let H be a Hilbert space and A ∈ L(H). With the notation of the previous exercise, define Y = H × H to be the Hilbert space consisting of all pairs (x₁, x₂)ᵀ, x₁ ∈ H and x₂ ∈ H, with the corresponding operations and scalar product. Define on Y the matrix operator B by

B = [ 0 iA ; −iA∗ 0 ],

where i = √−1. Prove that B ∈ L(Y), ‖B‖ = ‖A‖, and that B∗ = B.

Now, assume that A : D(A) ⊂ H → H is a linear, densely defined operator. Prove that B : D(A∗) × D(A) ⊂ Y → Y is symmetric.

12. Let H be a Hilbert space and let A ∈ L(H) with ‖A‖ ≤ 1. Prove that Ax = x if and only if A∗x = x.

13. Let H be the real Hilbert space L²(0, 1) equipped with the usual scalar product and induced norm. Define A : D(A) ⊂ H → H by

D(A) = {u ∈ H¹(0, 1); u(0) = 0}, Au = u′.

(a) Show that D(A) is dense in H;

(b) Compute N(A) and R(A);

(c) Determine A∗ and show that D(A∗) is dense in H.

14. Let H be the real Hilbert space L2 (0, 1) equipped with the usual

scalar product and induced norm. Let A : D(A) ⊂ H → H be

the operator defined by Au = u′, where
(b) D(A) = {u ∈ H^1(0, 1); u(0) = αu(1)} for some α ∈ R \ {0}.
Determine A^* in both cases.

15. Let H be the real Hilbert space L2 (0, 1) equipped with the usual

scalar product and induced norm. Let A : D(A) ⊂ H → H,

Au = u″, where D(A) is specified below. Determine A^* in each

of the following cases:

(b) D(A) = {u ∈ H^2(0, 1); u(0) = u(1) = u′(0) = u′(1) = 0};
(c) D(A) = {u ∈ H^2(0, 1); u(0) = u′(1) = 0};
(d) D(A) = {u ∈ H^2(0, 1); u(0) = u(1)};

16. Let H = ℓ² be the Hilbert space of all sequences of complex numbers x = (x_n)_{n∈N} satisfying Σ_{n=1}^∞ |x_n|² < ∞, with the usual scalar product

⟨x, y⟩ = Σ_{n=1}^∞ x_n ȳ_n ∀x = (x_n), y = (y_n) ∈ H,

and define the operators A : H → H and B : D(B) ⊂ H → H, where B is given by

B(x_n) = ( (n^α i^n/(1 + n)) x_n )_{n∈N}, for a given α ∈ R.


(b) Show that if α ≤ 1 then D(B) = H and B ∈ L(H); compute

B;

(c) For α > 1 ﬁnd (the maximal domain) D(B) and prove that

D(B) is dense in H;

(d) Compute B ∗ for all α ∈ R;

(e) Check whether A and B with α ≤ 1 are normal operators.

Chapter 8

Eigenvalues and

Eigenvectors

This chapter is mainly devoted to eigenvalues and eigenvectors of compact and/or symmetric operators. This includes

the Hilbert–Schmidt Theorem and its applications to the main eigen-

value problems for the Laplacian.

Throughout this chapter we consider linear operators deﬁned on linear

spaces over K, where K is either R or C, unless otherwise speciﬁed.

We ﬁrst introduce the concept of an eigenpair (i.e., eigenvector + the

corresponding eigenvalue).

Definition 8.1. A vector u ∈ X \ {0} is said to be an eigenvector of a linear operator A : X → X if there exists λ ∈ K such that Au = λu. Such a λ is called an eigenvalue corresponding to u, and the pair (u, λ) is called an eigenpair.

Remark 8.2. For a given eigenvector u, the corresponding eigenvalue λ is unique. Indeed,

λu = Au = λ_1 u =⇒ (λ − λ_1)u = 0 =⇒ λ − λ_1 = 0,

since u ≠ 0.

G. Moroşanu, Functional Analysis for the Applied Sciences,

Universitext, https://doi.org/10.1007/978-3-030-27153-4 8


Note that the set of all eigenvectors corresponding to an eigenvalue λ is N(λI − A) \ {0}, where I is the identity operator of X.

Remark 8.3. Note also that a set of eigenvectors u1 , u2 , . . . , um of A

corresponding to distinct eigenvalues λ1 , λ2 , . . . , λm (m ∈ N) is a lin-

early independent system. The proof is by induction.

Example 1. Let X = Cn , A : X → X, Au = M u ∀u = (u1 , . . . , un )T ∈

X, where M = (aij ) is an n × n matrix with entries aij ∈ C. Then,

λ is an eigenvalue of A if and only if det(λI − M ) = 0, where I is the

n × n identity matrix.

Example 2. Let H = ℓ² be the Hilbert space of all sequences of complex numbers x = (x_n)_{n∈N} satisfying Σ_{n=1}^∞ |x_n|² < ∞, with the usual scalar product

⟨x, y⟩ = Σ_{n=1}^∞ x_n ȳ_n ∀x = (x_n), y = (y_n) ∈ H,

and define A : H → H by

A(x_n) = ( 2x_2, (3/2)x_3, . . . , (n/(n − 1))x_n, . . . ).

We have for all x = (x_n) ∈ H

‖Ax‖² = Σ_{n=2}^∞ |n(n − 1)^{−1} x_n|² ≤ 4 Σ_{n=2}^∞ |x_n|² ≤ 4‖x‖²,

so A ∈ L(H) and ‖A‖ ≤ 2; in fact ‖A‖ = 2, since for x̃ = (0, 1, 0, 0, . . . ) we have ‖x̃‖ = 1 and ‖Ax̃‖ = 2.

Consider the equation Ax = λx, or, equivalently,

((n + 1)/n) x_{n+1} = λ x_n, n = 1, 2, . . . (8.1.1)

Observe that λ = 0 is an eigenvalue of A with eigenvectors (x_1, 0, 0, . . . ), x_1 ∈ C \ {0}.
If λ ≠ 0, then it follows easily from (8.1.1) that

x_n = (1/n) λ^{n−1} x_1, n = 1, 2, . . .

8.2 Main Results

The condition (x_n) ∈ H is equivalent to |λ| ≤ 1. So the set {λ ∈ C; |λ| ≤ 1} is the set of all eigenvalues of A.
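As a quick numerical illustration (a sketch of mine, not part of the text), one can check in Python that, for a given λ with |λ| ≤ 1, the sequence x_n = λ^{n−1}/n satisfies the recursion (8.1.1) and is square-summable:

```python
# Sanity check: for |lam| <= 1 the sequence x_n = lam**(n-1)/n satisfies the
# eigenvalue recursion ((n+1)/n) x_{n+1} = lam * x_n from (8.1.1) and its
# squared absolute values are summable. lam is chosen arbitrarily.
def eigvec(lam, N=2000):
    return [lam ** (n - 1) / n for n in range(1, N + 1)]

lam = 0.9 + 0.3j            # |lam| < 1
x = eigvec(lam)
# the recursion holds for the first indices (x[n] is x_{n+1}, 0-based list)
for n in range(1, 50):
    lhs = (n + 1) / n * x[n]
    assert abs(lhs - lam * x[n - 1]) < 1e-12
# square-summability: the tail of sum |x_n|^2 is negligible
tail = sum(abs(t) ** 2 for t in x[1000:])
assert tail < 1e-3
```

The same computation with |lam| > 1 would produce a rapidly growing tail, consistent with the claim that no such λ is an eigenvalue.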

Theorem 8.4. Let X be a Banach space and let A : X → X be a compact linear operator (i.e., A is linear and sends bounded sets to relatively compact sets). Then A has a countable set of eigenvalues, and the only possible accumulation point of the set of eigenvalues is λ = 0. Moreover, for any eigenvalue λ ≠ 0, dim N(λI − A) < ∞ (one says that λ has finite rank or finite multiplicity).

Proof. We may assume that X is infinite dimensional. To prove the first statement of the

theorem, it suﬃces to show that for all r > 0 the set {λ ∈ K; |λ| ≥ r}

contains a ﬁnite number of eigenvalues. Suppose not, i.e., there exists

r0 > 0 and inﬁnitely many distinct eigenvalues λ1 , λ2 , . . . such that

|λn | ≥ r0 ∀n ≥ 1. Then there exists a sequence un ∈ X \ {0} such that

Au_n = λ_n u_n ∀n ≥ 1, and we may assume that ‖u_n‖ = 1 ∀n ≥ 1.

Because the λn ’s are distinct, Bn = {u1 , u2 , . . . , un } are independent

systems. Set Xn = Span Bn , n = 1, 2, . . . By Lemma 2.25, there exists

y_n ∈ X_n \ X_{n−1} such that ‖y_n‖ = 1 ∀n ≥ 2 and

‖y_n − v‖ ≥ 1/2 ∀v ∈ X_{n−1}, n ≥ 2 =⇒ ‖y_n − y_m‖ ≥ 1/2 ∀n ≠ m.

On the other hand, assuming that 1 ≤ m < n, we have

Ay_n − Ay_m = λ_n y_n − v_{nm}, where v_{nm} := (λ_n y_n − Ay_n) + Ay_m.

Writing y_n = Σ_{i=1}^n α_i^n u_i, we get

Ay_n − λ_n y_n = A( Σ_{i=1}^n α_i^n u_i ) − λ_n Σ_{i=1}^n α_i^n u_i
             = Σ_{i=1}^n α_i^n λ_i u_i − λ_n Σ_{i=1}^n α_i^n u_i
             = Σ_{i=1}^n α_i^n (λ_i − λ_n) u_i
             = Σ_{i=1}^{n−1} α_i^n (λ_i − λ_n) u_i ∈ X_{n−1},

and Ay_m ∈ X_m ⊂ X_{n−1}, so v_{nm} ∈ X_{n−1}. Therefore,

‖Ay_n − Ay_m‖ = ‖λ_n y_n − v_{nm}‖ = |λ_n| · ‖y_n − λ_n^{−1} v_{nm}‖ ≥ r_0 ‖y_n − λ_n^{−1} v_{nm}‖ ≥ r_0/2,

the last inequality by the choice of y_n, since λ_n^{−1} v_{nm} ∈ X_{n−1};

so (Ayn ) has no Cauchy (hence no convergent) subsequence. But A

is compact and yn = 1 ∀n ≥ 1 so (Ayn ) must have a convergent

subsequence. This contradiction shows that {λ ∈ K; |λ| ≥ r} contains

a ﬁnite number of eigenvalues of A for all r > 0, as claimed.

The proof of the latter statement of the theorem is similar to the proof

of Theorem 7.3.

Proposition 8.5. Let (H, (·, ·), · ) be a Hilbert space and let A :

H → H be a symmetric (hence self-adjoint) operator. Then,

(i) every eigenvalue of A is real, even if K = C;

(ii) every two eigenvectors of A corresponding to distinct eigenvalues

are orthogonal.

Proof. (i) Let λ be an eigenvalue of A and let u be a corresponding eigenvector, i.e., Au = λu. Then

(u, Au) = (u, λu) = λ‖u‖².

By the symmetry of A, (u, Au) = (Au, u), which is the complex conjugate of (u, Au); thus λ‖u‖² is real, and so λ ∈ R.


(ii) Let (u_1, λ_1), (u_2, λ_2) be eigenpairs of A, with λ_1, λ_2 ∈ R (from (i)) and λ_1 ≠ λ_2. We have

λ_1(u_1, u_2) = (Au_1, u_2) = (u_1, Au_2) = λ_2(u_1, u_2),

thus

(λ_1 − λ_2)(u_1, u_2) = 0,

and since λ_1 − λ_2 ≠ 0, (u_1, u_2) = 0.
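In finite dimensions (H = R^n with a symmetric matrix) both conclusions of Proposition 8.5 can be verified directly; the following sketch (my addition, assuming numpy is available) checks that the eigenvalues are real and the eigenvectors pairwise orthogonal:

```python
# Finite-dimensional illustration of Proposition 8.5: a real symmetric matrix
# has real eigenvalues and an orthogonal system of eigenvectors.
# numpy.linalg.eigh is the solver for symmetric/Hermitian matrices.
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                    # symmetrize
vals, vecs = np.linalg.eigh(A)       # vals real, vecs has orthonormal columns
assert np.all(np.isreal(vals))
G = vecs.T @ vecs                    # Gram matrix of the eigenvectors
assert np.allclose(G, np.eye(5), atol=1e-10)
```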

Proposition 8.6. Let (H, (·, ·), ‖ · ‖) be a Hilbert space, H ≠ {0}, and let A ∈ L(H) be a symmetric operator. Then,

‖A‖ = sup { |(Ax, x)|; x ∈ H, ‖x‖ = 1 }.

Proof. The statement is trivial if A = 0 (equivalently ‖A‖ = 0). Assume A ≠ 0 (‖A‖ > 0) and set

a = sup { |(Ax, x)|; x ∈ H, ‖x‖ = 1 }.

Since

|(Ax, x)| ≤ ‖Ax‖ · ‖x‖ ≤ ‖A‖ · ‖x‖² ∀x ∈ H,

we infer that

a ≤ ‖A‖. (8.2.2)

Now, for given b > 0 and x ∈ H such that ‖x‖ = 1 and ‖Ax‖ > 0, we have (using the symmetry of A)

‖Ax‖² = (1/4)[ (A(bx + b^{−1}Ax), bx + b^{−1}Ax) − (A(bx − b^{−1}Ax), bx − b^{−1}Ax) ]. (8.2.3)

We also have

|(Av, v)| ≤ a‖v‖² ∀v ∈ H. (8.2.4)

Combining (8.2.3) and (8.2.4) we obtain

‖Ax‖² ≤ (a/4)[ ‖bx + b^{−1}Ax‖² + ‖bx − b^{−1}Ax‖² ] = (a/2)[ b²‖x‖² + b^{−2}‖Ax‖² ],

so for ‖x‖ = 1 and b² = ‖Ax‖ > 0 we have

‖Ax‖² ≤ a‖Ax‖.

Therefore,

‖Ax‖ ≤ a ∀x ∈ H, ‖x‖ = 1 =⇒ ‖A‖ ≤ a.

This together with (8.2.2) implies ‖A‖ = a.
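For a symmetric matrix the identity ‖A‖ = sup |(Ax, x)| can be tested numerically; here is a hedged sketch (my addition, assuming numpy) comparing the operator norm with the largest |eigenvalue|, which is exactly this supremum in finite dimensions:

```python
# Numerical check of Proposition 8.6 for H = R^n: for a symmetric matrix the
# operator norm equals max |(Ax,x)| over unit x, i.e. the spectral radius.
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((6, 6))
A = (M + M.T) / 2
op_norm = np.linalg.norm(A, 2)          # largest singular value of A
a = max(abs(np.linalg.eigvalsh(A)))     # sup of |(Ax,x)| over the unit sphere
assert abs(op_norm - a) < 1e-10

# random unit vectors never exceed the supremum
for _ in range(100):
    x = rng.standard_normal(6)
    x /= np.linalg.norm(x)
    assert abs(x @ A @ x) <= a + 1e-10
```

For a nonsymmetric matrix (e.g. a nilpotent one) the two quantities can differ, which is why the symmetry assumption matters.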


Note that the symmetry assumption in Proposition 8.6 is essential.

We have the following central theorem.

Theorem 8.7 (Hilbert–Schmidt). Let (H, (·, ·), ‖ · ‖) be an infinite dimensional, separable Hilbert space and let A : H → H be a symmetric (equivalently, self-adjoint), compact linear operator, with N(A) = {0}. Then there exist a sequence (λ_1, λ_2, . . . , λ_n, . . . ) of eigenvalues of A, such that (|λ_n|) is a decreasing sequence of positive numbers converging to 0, and a complete orthonormal system (basis) {u_n}_{n=1}^∞ in H of corresponding eigenvectors (i.e., Au_n = λ_n u_n for n = 1, 2, . . . ).

Proof. Let us prove that either ‖A‖ or −‖A‖ is an eigenvalue of A. By Proposition 8.6 there exists a sequence (v_n)_{n≥1}, with ‖v_n‖ = 1 ∀n ≥ 1, such that |(Av_n, v_n)| → ‖A‖. In fact, one can extract from (v_n) a subsequence, again denoted (v_n), such that (Av_n, v_n) converges to either ‖A‖ or −‖A‖, say

(Av_n, v_n) → λ_1 := ‖A‖. (8.2.5)

Since A is compact we can now take another subsequence, also denoted

(vn ), such that

Avn → u1 , (8.2.6)

and this is the subsequence we keep. Now, passing to the limit in

|(Av_n, v_n)| ≤ ‖Av_n‖ ≤ ‖A‖ · ‖v_n‖ = ‖A‖, (8.2.7)

we get by (8.2.5) and (8.2.6)

|λ_1| ≤ ‖u_1‖ ≤ ‖A‖ = |λ_1|.

Therefore,

‖u_1‖ = |λ_1| = ‖A‖. (8.2.8)

From (8.2.7) (see also (8.2.5), (8.2.6) and (8.2.8)) we derive

Avn − λ1 vn → 0 . (8.2.9)

8.2 Main Results 223

Since λ_1 ≠ 0, it follows from (8.2.6) and (8.2.9) that v_n → λ_1^{−1} u_1; hence, by the continuity of A, we get

Au_1 = λ_1 u_1,

i.e., (u1 , λ1 ) is an eigenpair of A. We normalize without changing

notation, u1 := |λ1 |−1 u1 , since we want an orthonormal system of

eigenvectors.

It is worth pointing out that any other eigenvalue λ satisﬁes |λ| ≤ |λ1 |.

Indeed, if we assume by contradiction the existence of an eigenpair

(u, λ), with |λ| > |λ_1| and ‖u‖ = 1, then |(Au, u)| = |λ|, which contradicts the fact that |λ_1| = ‖A‖ is the supremum from Proposition 8.6.

We now construct the eigenpairs (u_n, λ_n) for n = 2, 3, . . .

Denote by Y the orthogonal complement of Span{u1 }, i.e.,

Y = { u ∈ H; (u, u1 ) = 0 } .

Y is a closed subspace of H (with the scalar product and norm of H), and it is invariant under A, in the sense that AY ⊂ Y, because for y ∈ Y,

(Ay, u_1) = (y, Au_1) = (y, λ_1 u_1) = λ_1(y, u_1) = 0.

The restriction A|_Y is not 0, since otherwise we would have Y ⊂ N(A) = {0}. In fact, all the properties are inherited (A|_Y is symmetric, compact, and N(A|_Y) = {0}), so by the previous step we obtain an eigenpair (u_2, λ_2) with

u_2 ∈ Y, ‖u_2‖ = 1, Au_2 = λ_2 u_2.

Moreover, |λ_2| ≤ |λ_1|.


Next, take

Z = { u ∈ Y; (u, u_2) = 0 } = (Span{u_1, u_2})^⊥,

and repeat the argument to obtain a new eigenpair (u_3, λ_3), with ‖u_3‖ = 1, |λ_3| ≤ |λ_2|. We may continue

doing this, each time obtaining an infinite dimensional subspace. We thus construct a sequence of eigenvalues (λ_n) such that

Au_n = λ_n u_n, ‖u_n‖ = 1, n ≥ 1,

with |λ_{n+1}| ≤ |λ_n| for all n ≥ 1. (8.2.10)

Let us show that

Au = Σ_{n=1}^∞ λ_n (u, u_n) u_n ∀u ∈ H. (8.2.11)

Set

V_m := {u ∈ H; (u, u_j) = 0, j = 1, . . . , m} = (Span{u_1, . . . , u_m})^⊥,

a closed subspace of H, invariant under A (i.e., Av ∈ V_m ∀v ∈ V_m). By the previous step of our proof, there is an eigenpair (u_{m+1}, λ_{m+1}) of A such that |λ_{m+1}| = ‖A|_{V_m}‖. In particular,

‖Av‖ ≤ |λ_{m+1}| · ‖v‖ ∀v ∈ V_m. (8.2.12)

Now, choose an arbitrary u ∈ H and define

w_m = u − Σ_{n=1}^m (u, u_n) u_n,

which satisfies (w_m, u_j) = 0 for j = 1, . . . , m, i.e., w_m ∈ V_m. Calculate

‖w_m‖² = ‖u‖² − Σ_{n=1}^m |(u, u_n)|² ≤ ‖u‖². (8.2.13)


Then

Aw_m = Au − Σ_{n=1}^m (u, u_n) Au_n = Au − Σ_{n=1}^m λ_n (u, u_n) u_n,

so that by (8.2.12) and (8.2.13)

‖Aw_m‖ ≤ |λ_{m+1}| · ‖w_m‖ ≤ |λ_{m+1}| · ‖u‖. (8.2.14)

Since (|λ_n|) is nonincreasing (see (8.2.10)), there exists

lim_{n→∞} |λ_n| = α ≥ 0.

If α > 0, then |λ_n| ≥ α for all n ≥ 1, and so

‖λ_n^{−1} u_n‖ = ‖u_n‖/|λ_n| = 1/|λ_n| ≤ 1/α ∀n ≥ 1.

Since A is compact, (A(λ_n^{−1} u_n)) = (u_n) has a convergent subsequence. But this is impossible because, by orthonormality, ‖u_n − u_m‖² = 2 for all n ≠ m. So α = 0, i.e., λ_n → 0, as claimed.
Consequently, we have by (8.2.14) that ‖Aw_m‖ → 0 as m → ∞, i.e., (8.2.11) holds true.

It remains to show that {u_n}_{n=1}^∞ is a basis in H. We know from the proof of Theorem 6.21 that for all u ∈ H the series Σ_{n=1}^∞ (u, u_n) u_n converges (as {u_n}_{n=1}^∞ is an orthonormal system), so we can write

v = Σ_{n=1}^∞ (u, u_n) u_n,


and it remains to check that u = v. Consider the sequence of partial sums s_m = Σ_{n=1}^m (u, u_n) u_n, which converges strongly to v as m → ∞, so As_m → Av. On the other hand, by (8.2.11) we have that

As_m = Σ_{n=1}^m λ_n (u, u_n) u_n → Au as m → ∞.

Hence,

Av = Au =⇒ A(v − u) = 0 =⇒ v = u,

since ker A = {0}. Thus the system {u_n}_{n=1}^∞ is complete, i.e., a basis in H (cf. Theorem 6.21).

Remark 8.8. If we assume in addition that A is positive (i.e., (Av, v) > 0 for all v ∈ H \ {0}), then it has eigenvalues λ_1 ≥ λ_2 ≥ · · · ≥ λ_n ≥ · · · , with λ_n > 0 ∀n ≥ 1. This follows from

(Au_n, u_n) = λ_n‖u_n‖² = λ_n, n ≥ 1.

Note also that

λ_1 = ‖A‖ = sup{(Av, v); v ∈ H, ‖v‖ = 1} and
λ_{n+1} = ‖A|_{V_n}‖ = sup{(Av, v); v ∈ V_n, ‖v‖ = 1} ∀n ≥ 1,

where V_n = (Span{u_1, u_2, . . . , u_n})^⊥, n ≥ 1.
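This "sup over the orthogonal complement" characterization can be illustrated in finite dimensions; the following sketch (my addition, assuming numpy) checks that for a symmetric positive definite matrix, maximizing (Av, v) over unit vectors orthogonal to the top eigenvector recovers the second eigenvalue:

```python
# Finite-dimensional illustration of Remark 8.8: lambda_{n+1} is the max of
# (Av,v) over unit v in Span{u_1,...,u_n}^perp. Here n = 1.
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)              # symmetric positive definite
vals, vecs = np.linalg.eigh(A)
vals, vecs = vals[::-1], vecs[:, ::-1]   # sort decreasing: lambda_1 >= lambda_2 >= ...

u1 = vecs[:, 0]
P = np.eye(5) - np.outer(u1, u1)         # orthogonal projector onto Span{u1}^perp
sub = P @ A @ P                          # A restricted to the complement
lam2 = np.max(np.linalg.eigvalsh(sub))   # largest eigenvalue on the complement
assert abs(lam2 - vals[1]) < 1e-10
```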

8.3 Eigenvalues of −Δ Under the Dirichlet Boundary Condition

In what follows we apply the Hilbert–Schmidt Theorem to an eigen-

value problem for the Laplace operator. Speciﬁcally, let ∅ = Ω ⊂ RN ,

N ≥ 2, be a bounded domain with smooth boundary ∂Ω. Consider

the Dirichlet eigenvalue problem

−Δu = λu in Ω,
u = 0 on ∂Ω. (8.3.15)

Deﬁnition 8.9. A real number λ is said to be an eigenvalue of the

Dirichlet problem (8.3.15) if there is a function u ∈ H01 (Ω) \ {0} such

that the problem is satisﬁed in the sense that

∫_Ω ∇u · ∇v dx = λ ∫_Ω uv dx ∀v ∈ H_0^1(Ω), (8.3.16)

or, equivalently, −Δu = λu in D′(Ω).


Any eigenfunction u ∈ H_0^1(Ω) is in fact more regular (see [6, Theorem 9.25, p. 298]).

Theorem 8.11. Let ∅ = Ω ⊂ RN be a bounded domain with smooth

boundary ∂Ω. Then there exist an increasing sequence of positive

eigenvalues λn for (8.3.15) such that λn → +∞ and a complete or-

thonormal system (in H = L2 (Ω)) of eigenfunctions un satisfying prob-

lem (8.3.15) with λ = λn , n = 1, 2, . . .

Proof. Let H = L2 (Ω) equipped with the usual inner product and

norm. H is an inﬁnite dimensional, separable Hilbert space (over R).

We know that for every f ∈ H = L2 (Ω) the problem

−Δu = f in Ω ,

u=0 on ∂Ω ,

has a unique solution u ∈ H01 (Ω) (by Dirichlet’s Principle, Chap. 6).

Deﬁne an operator A : H → H by assigning f → u. Note that A is

linear and N (A) = {0}. Moreover, A is symmetric (hence self-adjoint

since D(A) = H). Indeed, if v = Ag with g ∈ H, i.e.,

−Δv = g in Ω ,

v=0 on ∂Ω ,

then, by Green's formula, we can write

∫_Ω ∇u · ∇v dx = ∫_Ω f v dx = ∫_Ω f Ag dx,
∫_Ω ∇v · ∇u dx = ∫_Ω g u dx = ∫_Ω g Af dx,

so ∫_Ω f Ag dx = ∫_Ω g Af dx, as desired.

Let us show that operator A is also compact, i.e., for every constant

M > 0, the set

SM := {Af ; f ∈ L2 (Ω), f L2 (Ω) ≤ M }

is relatively compact in H = L2 (Ω). Indeed, if u = Af ∈ SM it follows

from (8.3.16) with v = u that

‖∇u‖²_{L²(Ω)} = ∫_Ω f u dx ≤ ‖f‖_{L²(Ω)} · ‖u‖_{L²(Ω)}.

By Poincaré's inequality, ‖u‖_{L²(Ω)} ≤ C‖∇u‖_{L²(Ω)} for a constant C depending only on Ω. Finally,

‖∇u‖_{L²(Ω)} ≤ C‖f‖_{L²(Ω)} ≤ CM,

so that ‖Af‖_{H_0^1(Ω)} is less than or equal to some constant. We know that bounded sets in H_0^1(Ω) are relatively compact in L²(Ω), so S_M is relatively compact in this space.

We can apply the Hilbert–Schmidt Theorem which guarantees the ex-

istence of a sequence of eigenpairs for A, {(un , μn )}∞

n=1 , such that |μn |

decreases to zero and {un }∞ n=1 is a complete orthonormal system (ba-

sis) in H = L2 (Ω). Note that Aun = μn un says that un satisﬁes the

problem

−Δun = λn un in Ω ,

un = 0 on ∂Ω ,

where λn = 1/μn , i.e., (un , λn ) is an eigenpair of problem (8.3.15).

Note also that

λ_n = λ_n ∫_Ω u_n² dx = ∫_Ω |∇u_n|² dx > 0 ∀n ≥ 1,

so (λ_n) is an increasing sequence of positive numbers diverging to +∞ (since |μ_n| = μ_n decreases to 0).

8.4 Eigenvalues of −Δ Under the Robin Boundary Condition

Let again ∅ = Ω ⊂ RN , N ≥ 2, be a bounded domain with smooth

boundary ∂Ω. Consider the classical Robin eigenvalue problem

−Δu = λu in Ω,
∂u/∂ν + αu = 0 on ∂Ω, (8.4.17)

where α is a given positive constant and ν denotes the outward unit normal to ∂Ω. In this case we have the following natural definition:


Definition 8.12. A real number λ is said to be an eigenvalue of the Robin problem (8.4.17) if there is a function u ∈ H^1(Ω) \ {0} such that

∫_Ω ∇u · ∇v dx + α ∫_{∂Ω} uv ds = λ ∫_Ω uv dx ∀v ∈ H^1(Ω). (8.4.18)

As before, any eigenfunction u is, in fact, more regular.

Theorem 8.14. Assume ∅ = Ω ⊂ RN is a bounded domain with

smooth boundary ∂Ω and α is a positive constant. Then there exists

an increasing sequence of positive eigenvalues λn for (8.4.17) such that

λn → +∞ and a complete orthonormal system (in H = L2 (Ω)) of

eigenfunctions un satisfying problem (8.4.17) with λ = λn , n = 1, 2, . . .

Proof. Again, let H = L2 (Ω) equipped with the usual inner product

and norm. By the Lax–Milgram Theorem (see Chap. 6) we easily infer

that for every f ∈ H = L2 (Ω) the problem

−Δu + u = f in Ω,
∂u/∂ν + αu = 0 on ∂Ω,

has a unique solution u ∈ H^1(Ω). Define the operator A : H → H by f ↦ u. It is an easy exercise to check that A is positive and satisfies

all the conditions of the Hilbert–Schmidt Theorem. In contrast with

the previous Dirichlet case, we have replaced −Δ by −Δ + I in order

to ensure the strong positivity (coercivity) of the corresponding bilin-

ear form as well as the compactness of A (based on Theorem 5.22).

Therefore there exists a sequence of eigenpairs for A, {(un , μn )}∞

n=1 ,

such that |μn | = μn decreases to 0 and {un }∞ n=1 is an orthonormal

basis in H. The fact that Aun = μn un can be written as

−Δu_n = λ_n u_n in Ω,
∂u_n/∂ν + αu_n = 0 on ∂Ω,

where λ_n = 1/μ_n − 1, n ≥ 1.

Note that

λ_n = λ_n ∫_Ω u_n² dx = ∫_Ω |∇u_n|² dx + α ∫_{∂Ω} u_n² ds > 0 ∀n ≥ 1, (8.4.19)

so (λn )n≥1 is an increasing sequence of positive numbers converging to

∞ (since |μn | = μn decreases to 0).


8.5 Eigenvalues of −Δ Under the Neumann Boundary Condition

Under the same conditions on Ω we consider the Neumann eigenvalue

problem

−Δu = λu in Ω,
∂u/∂ν = 0 on ∂Ω, (8.5.20)

i.e., α > 0 in the Robin eigenvalue problem is replaced by α = 0. The

deﬁnition of an eigenvalue is the same as before (see Deﬁnition 8.12)

with α = 0 in (8.4.18). We have a result similar to Theorem 8.14

which we explain in what follows. One can again consider H = L2 (Ω)

with its usual scalar product and norm, and A : H → H the operator

which associates with each f ∈ H the unique solution u ∈ H 1 (Ω) of

the problem

−Δu + u = f in Ω,
∂u/∂ν = 0 on ∂Ω.

The Hilbert–Schmidt Theorem is again applicable (see also Remark 8.8),

thus there exist a decreasing sequence of positive eigenvalues of oper-

ator A, say (μn )n≥0 , μn → 0, and a corresponding complete orthonor-

mal system {un }∞ n=0 , i.e., Aun = μn un for n = 0, 1, 2, . . . So denoting

λn = −1 + 1/μn we have

−Δu_n = λ_n u_n in Ω,
∂u_n/∂ν = 0 on ∂Ω,

We also have (8.4.19) with α = 0, hence λn ≥ 0 for all n ≥ 0. Note that

λ0 = 0 is the ﬁrst eigenvalue of problem (8.5.20), the corresponding

eigenfunctions being the nonzero constant functions. Thus λ0 = 0 has

multiplicity one (so λ0 = 0 is said to be a simple eigenvalue)

and the corresponding normalized eigenfunction is u_0 = ±1/√m(Ω), where m(Ω) denotes the Lebesgue measure of Ω. Consequently, a result

similar to Theorem 8.14 holds, with the only diﬀerence that the ﬁrst

eigenvalue is no longer a positive number (it is λ0 = 0).

In fact, the proof can also be done as in the Dirichlet case, as explained below. Denote by V_0 the one-dimensional space generated by u_0 = 1/√m(Ω): V_0 = Span{u_0} = Span{1}. Obviously, the space H = L²(Ω) can be written as a direct sum

H = V_0 ⊕ V_1, V_1 = V_0^⊥ = {v ∈ H; ∫_Ω v dx = 0}.

Note that V_1 is itself a Hilbert space with respect to the same scalar product and norm. We can use

V1 as a basic space to show the existence of (λn , un ) for n = 1, 2, . . .

Note that W = V1 ∩ H 1 (Ω) is a real Hilbert space with respect to the

scalar product (see (8.5.21) below)

⟨v, w⟩ = ∫_Ω ∇v · ∇w dx ∀v, w ∈ W.

Consider

β = inf { ∫_Ω |∇v|² dx; v ∈ W, ∫_Ω v² dx = 1 } = inf_{v ∈ W\{0}} ( ∫_Ω |∇v|² dx ) / ( ∫_Ω v² dx ),

the latter expression being the so-called Rayleigh quotient. Assume by way of contradiction that β = 0;

then there exists a minimizing sequence (v_k)_{k≥1} in W, ‖v_k‖_{L²(Ω)} = 1 ∀k ≥ 1, such that (v_k) converges to some v̂ weakly in H^1(Ω) and strongly in V_1.

From

‖∇v̂‖²_{L²(Ω)} = ∫_Ω ∇v̂ · ∇(v̂ − v_k) dx + ∫_Ω ∇v̂ · ∇v_k dx

we derive

∫_Ω |∇v̂|² dx ≤ lim inf_{k→∞} ∫_Ω |∇v_k|² dx = 0,

which implies

∫_Ω |∇v̂|² dx = 0,

and so v̂ is a constant function. Since v̂ ∈ V_1 it follows that v̂ = 0.

On the other hand, one can derive from ‖v_k‖_{L²(Ω)} = 1, k ≥ 1, and the strong convergence v_k → v̂ in V_1 that ‖v̂‖_{L²(Ω)} = 1, a contradiction. Therefore β > 0, and this implies the following Poincaré-type inequality:

β ∫_Ω v² dx ≤ ∫_Ω |∇v|² dx ∀v ∈ W. (8.5.21)

It follows (by the Lax–Milgram Theorem) that for every f ∈ V_1 the problem

−Δu = f in Ω,
∂u/∂ν = 0 on ∂Ω,

has a unique solution u ∈ W . Moreover the operator A : V1 → V1

deﬁned by Af = u, f ∈ V1 (i.e., A = (−Δ)−1 ), is positive and sat-

isﬁes the conditions of the Hilbert–Schmidt Theorem. Therefore the

existence of {(λn , un )}∞

n=1 is again guaranteed.

Summarizing what we have done so far, we obtain the following result.

Theorem 8.15. Assume ∅ = Ω ⊂ RN is a bounded domain with

smooth boundary ∂Ω. Then there exist a sequence of eigenvalues for

(8.5.20), 0 = λ0 < λ1 ≤ λ2 ≤ · · · ≤ λn ≤ · · · , such that λn → ∞

and a complete orthonormal system (in H = L2 (Ω)) of eigenfunctions

u_n verifying problem (8.5.20) with λ = λ_n, n = 0, 1, 2, . . . ; in addition, λ_0 = 0 is simple and u_0 = ±1/√m(Ω).

1. Let f ∈ L2 (Ω). The Neumann problem

−Δu = f in Ω,
∂u/∂ν = 0 on ∂Ω,

is solvable in H^1(Ω) if and only if f ∈ V_1 (i.e., ∫_Ω f dx = 0). Sufficiency follows by the Lax–Milgram Theorem, as noticed before, while the converse implication follows by Green's Identity.

2. Define

λ_1^D = inf { ∫_Ω |∇v|² dx; v ∈ H_0^1(Ω), ∫_Ω v² dx = 1 } = inf_{v ∈ H_0^1(Ω)\{0}} ( ∫_Ω |∇v|² dx ) / ( ∫_Ω v² dx ),

the latter expression being the Rayleigh quotient.

8.6 Some Comments

It is easily seen that λ_1^D > 0 and that the infimum is attained at some u_1^D ∈ H_0^1(Ω), ‖u_1^D‖_{L²(Ω)} = 1, which is an eigenfunction corresponding to λ_1^D. Moreover, λ_1^D is the first eigenvalue (or principal eigenvalue), i.e., λ_1^D = λ_1 given by Theorem 8.11, λ_1^D is simple, and u_1^D is positive within Ω (see [14, Theorem 2, p. 336]). If we define

W_1^D = {v ∈ H_0^1(Ω); ∫_Ω u_1^D v dx = 0},

then

λ_2^D = inf { ∫_Ω |∇v|² dx; v ∈ W_1^D, ∫_Ω v² dx = 1 }

is attained at some u_2^D ∈ W_1^D, ‖u_2^D‖_{L²(Ω)} = 1, which is an eigenfunction corresponding to λ_2^D, with u_2^D ⊥ u_1^D. In general, setting

W_{n−1}^D = {v ∈ H_0^1(Ω); ∫_Ω u_j^D v dx = 0, j = 1, . . . , n − 1}, n ≥ 2,

λ_n^D = inf { ∫_Ω |∇v|² dx; v ∈ W_{n−1}^D, ∫_Ω v² dx = 1 },

one obtains eigenpairs (λ_n^D, u_n^D), such that

λ_1^D < λ_2^D ≤ · · · ≤ λ_n^D ≤ · · · , λ_n^D = λ_n → ∞,

and {u_n^D}_{n=1}^∞ is an orthonormal basis in L²(Ω). This provides a variational approach which avoids the Hilbert–Schmidt Theorem.

Similar arguments work for the Robin and Neumann eigenvalue prob-

lems. We just recall that the lowest positive eigenvalues are given

by

λ_1^R = inf { ∫_Ω |∇v|² dx + α ∫_{∂Ω} v² ds; v ∈ H^1(Ω), ∫_Ω v² dx = 1 }
      = inf_{v ∈ H^1(Ω)\{0}} ( ∫_Ω |∇v|² dx + α ∫_{∂Ω} v² ds ) / ( ∫_Ω v² dx ), (8.6.22)

λ_1^N = inf { ∫_Ω |∇v|² dx; v ∈ W = V_1 ∩ H^1(Ω), ∫_Ω v² dx = 1 }
      = inf_{v ∈ W\{0}} ( ∫_Ω |∇v|² dx ) / ( ∫_Ω v² dx ).


Both λ_1^R and λ_1^N (the latter being equal to β defined before) are positive numbers and are attained at functions u_1^R ∈ H^1(Ω) and u_1^N ∈ W, respectively.

3. For every f ∈ L²(Ω) the Robin problem

−Δu = f in Ω,
∂u/∂ν + αu = 0 on ∂Ω,

has a unique solution u ∈ H^1(Ω).

Indeed, by (8.6.22) we have the inequality

λ_1^R ∫_Ω v² dx ≤ ∫_Ω |∇v|² dx + α ∫_{∂Ω} v² ds ∀v ∈ H^1(Ω), (8.6.23)

which (combined with the continuity of the trace operator from H^1(Ω) into L²(∂Ω)) shows that its right-hand side defines a norm equivalent to the usual norm in H^1(Ω). So the claim follows from the Lax–Milgram Theorem applied to the bilinear form

(u, v) ↦ ∫_Ω ∇u · ∇v dx + α ∫_{∂Ω} uv ds.

4. In some cases the eigenpairs can be explicitly calculated. In the one-dimensional case (N = 1), if Ω = (0, 1), the three eigenvalue problems look as follows:

−u″ = λu, 0 < x < 1,
u(0) = u(1) = 0;

−u″ = λu, 0 < x < 1,
u′(0) = u′(1) = 0;

−u″ = λu, 0 < x < 1,
−u′(0) + αu(0) = 0 = u′(1) + αu(1),

where α is a given positive constant. In the ﬁrst two cases (Dirichlet

and Neumann) we obtain by easy computations

λ_n^D = π²n², u_n^D(x) = √2 sin(nπx), n = 1, 2, . . . ;

λ_0^N = 0, u_0^N(x) = 1; λ_n^N = π²n², u_n^N(x) = √2 cos(nπx), n = 1, 2, . . .

In the Robin case we cannot calculate by elementary methods the corresponding eigenpairs (u_n, λ_n), n ≥ 1.
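The Dirichlet formula λ_n^D = π²n² can be checked numerically; the following sketch (my addition, assuming numpy) approximates the eigenvalues of −u″ on (0, 1) with u(0) = u(1) = 0 by the standard second-order finite-difference matrix:

```python
# Hedged numerical check: the finite-difference discretization of -u'' with
# Dirichlet conditions on (0,1) has eigenvalues close to pi^2 n^2.
import numpy as np

N = 500                        # number of interior grid points
h = 1.0 / (N + 1)
# tridiagonal matrix representing -u'' with u(0) = u(1) = 0
main = 2.0 * np.ones(N) / h**2
off = -1.0 * np.ones(N - 1) / h**2
L = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
vals = np.linalg.eigvalsh(L)   # ascending eigenvalues
for n in range(1, 4):          # compare the first three with pi^2 n^2
    exact = (np.pi * n) ** 2
    assert abs(vals[n - 1] - exact) / exact < 1e-3
```

The same matrix with modified first and last rows would model the Neumann and Robin conditions, whose Robin eigenvalues indeed have no elementary closed form.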

5. In the Dirichlet case above, the system {w_n = λ_n^{−1/2} u_n}_{n=1}^∞ is an orthonormal basis in W_D = H_0^1(Ω). Indeed, we can deduce from

−Δu_n = λ_n u_n in Ω,
u_n = 0 on ∂Ω, (8.6.24)

that

∫_Ω ∇w_n · ∇w_k dx = (λ_n/λ_k)^{1/2} ∫_Ω u_n u_k dx = δ_{nk} ∀n, k ≥ 1,

which shows that {w_n}_{n=1}^∞ is an orthonormal system in W_D. Now, since {u_n}_{n=1}^∞ is complete in H = L²(Ω), any u ∈ H can be written as (see Theorem 6.21)

u = Σ_{n=1}^∞ (u, u_n)_{L²(Ω)} u_n = Σ_{n=1}^∞ ( ∫_Ω u u_n dx ) u_n,

and, for u ∈ W_D,

u = Σ_{n=1}^∞ ( ∫_Ω ∇u · ∇w_n dx ) w_n.

Thus {w_n}_{n=1}^∞ is complete in W_D.

Similar statements hold true for the other two cases (Neumann and

Robin) within WN = V1 ∩ H 1 (Ω) and WR = H 1 (Ω) equipped with the

scalar products

(w_1, w_2)_N = ∫_Ω ∇w_1 · ∇w_2 dx,
(w_1, w_2)_R = ∫_Ω ∇w_1 · ∇w_2 dx + α ∫_{∂Ω} w_1 w_2 ds.

6. The above results on the eigenvalues of −Δ under Dirichlet, Neumann, or Robin boundary conditions can be derived from the abstract framework we describe below, related to the so-called energetic extension of a linear operator Q satisfying the following assumptions:


(a) Q : D(Q) ⊂ H → H is a linear, densely defined, self-adjoint, and strongly positive operator, where (H, (·, ·), ‖ · ‖) is a real, infinite dimensional, separable Hilbert space.

Deﬁne on the vector subspace D(Q) the so-called energetic scalar prod-

uct

(u, v)E = (Qu, v) ∀u, v ∈ D(Q).

It induces the energetic norm on D(Q): ‖u‖²_E = (u, u)_E, u ∈ D(Q). Denote by H_E the completion of (D(Q), ‖ · ‖_E). Then H_E is a Hilbert space with respect to the scalar product

(u, v)_E = lim_{k→∞} (u_k, v_k)_E,

where (u_k), (v_k) ⊂ D(Q) are Cauchy sequences in ‖ · ‖_E representing u and v,

respectively. Since Q is strongly positive, i.e., there exists a constant c > 0 such that

(Qu, u) ≥ c‖u‖² ∀u ∈ D(Q), (8.6.25)

we have

‖u‖ ≤ (1/√c)‖u‖_E ∀u ∈ H_E,

so the identity map from HE to H is continuous (i.e., HE is contin-

uously embedded in H). Denote by Q_E the Riesz isomorphism from (H_E, ‖ · ‖_E) onto its dual H_E^*, namely Q_E(u)(v) = (u, v)_E ∀u, v ∈ H_E. Under the identifications

D(Q) ⊂ H_E ⊂ H ⊂ H_E^*,

Q_E is an extension of Q:

Q_E u = Qu ∀u ∈ D(Q).

The term energetic will become clear later when we discuss examples.

We also assume that

(b) the identity map from HE into H is compact (i.e., HE is com-

pactly embedded into H).

Now we can state the following abstract spectral result.


Theorem 8.16. Assume (a) and (b) above are fulﬁlled. Then there

exist an increasing sequence (λn )n≥1 in (0, ∞) converging to ∞, and

an orthonormal basis {un }∞

n=1 in H such that

Qu_n = λ_n u_n, n = 1, 2, . . .

In addition, {λ_n^{−1/2} u_n}_{n=1}^∞ is an orthonormal basis in H_E (the energetic space defined above).

Proof. We shall adapt the proof of Theorem 8.11 to the present ab-

stract framework.

First of all, note that Q : D(Q) ⊂ H → H is bijective since its

extension Q_E : H_E → H_E^* is. Denote A = Q^{−1}. Obviously, A ∈ L(H), N(A) = {0}, and A is self-adjoint. The operator A is also compact. Indeed, if for some M > 0 we take f ∈ H such that ‖f‖ ≤ M, then we have for u = Af (equivalently Qu = f),

‖u‖²_E = (Qu, u) = (f, u) ≤ ‖f‖ · ‖u‖ ≤ M‖u‖. (8.6.27)

Combining (8.6.27) with (8.6.25) yields

‖Af‖_E = ‖u‖_E ≤ M/√c,

i.e., A sends bounded sets in H to bounded sets in H_E, hence A is compact (cf. (b)). According to the Hilbert–Schmidt Theorem, there exists a sequence of eigenpairs {(μ_n, u_n)}_{n≥1} for A = Q^{−1} with the known properties, and μ_n > 0, n ≥ 1, since Q is strongly positive. Thus, the first part of the theorem follows with {(λ_n, u_n)}_{n≥1}, where λ_n = 1/μ_n, n = 1, 2, . . .

In order to prove the second part, denote w_n = λ_n^{−1/2} u_n, n ≥ 1. It follows from (8.6.26) that

(w_n, w_k)_E = (λ_n λ_k)^{−1/2}(Qu_n, u_k) = δ_{nk} ∀n, k ≥ 1,

i.e., the system {w_n}_{n=1}^∞ is orthonormal in H_E. Now, since {u_n}_{n=1}^∞ is complete in H, any u ∈ H can be expressed as (cf. Theorem 6.21)

u = Σ_{n=1}^∞ (u, u_n) u_n.

In particular, for u ∈ H_E,

u = Σ_{n=1}^∞ (u, w_n)_E w_n,

so {w_n}_{n=1}^∞ is complete in H_E.

238 8 Eigenvalues and Eigenvectors

Similarly, one can show that {λ_n^{1/2} u_n}_{n=1}^∞ is an orthonormal basis in H_E^*.

For more details on energetic spaces and extensions we refer the reader

to [52, Chapter 5]. See also [22, Chapter 1, p. 18].

Remark 8.18. One can reobtain from Theorem 8.16 the previous state-

ments related to Q = −Δ with Dirichlet, Neumann or Robin boundary

condition.

In the Dirichlet case we have H = L²(Ω) with its usual scalar product and induced norm, D(Q) = H_0^1(Ω) ∩ H²(Ω), and H_E = H_0^1(Ω) with the energetic scalar product (u, v)_E = ∫_Ω ∇u · ∇v dx and ‖u‖²_E = (u, u)_E. Note that H_E is equal to W_D defined above.

In the Neumann case, H = V_1 := {v ∈ L²(Ω); ∫_Ω v dx = 0} with the scalar product and norm inherited from L²(Ω), D(Q) = {v ∈ V_1 ∩ H²(Ω); ∂v/∂ν = 0 on ∂Ω}, and H_E = V_1 ∩ H^1(Ω) with (u, v)_E = ∫_Ω ∇u · ∇v dx, ‖u‖²_E = (u, u)_E. Of course, in this case we have an additional eigenvalue λ_0 = 0, as specified before.

Finally, in the case of the Robin boundary condition, H = L²(Ω) with its usual scalar product and norm, D(Q) = {v ∈ H²(Ω); ∂v/∂ν + αv = 0 on ∂Ω}, and H_E = H^1(Ω) (denoted above by W_R) with (u, v)_E = ∫_Ω ∇u · ∇v dx + α ∫_{∂Ω} uv ds and ‖u‖²_E = (u, u)_E.

There are also many other speciﬁc examples covered by Theorem 8.16,

in particular the case Q = −Δ with diﬀerent conditions on parts of

the boundary of Ω.

Remark 8.19. In order to develop the above theory on energetic exten-

sions we can begin with an operator Q which satisﬁes all the assump-

tions in (a), with one exception: Q is only symmetric, not self-adjoint.

Everything works similarly and HE and QE can be constructed by us-

ing the same arguments. Now deﬁne an operator Q̂ : D(Q̂) ⊂ H → H

as follows:

D(Q̂) = {v ∈ HE ; QE v ∈ H}, Q̂v = QE v ∀v ∈ D(Q̂) .

Obviously, Q̂ is an extension of Q so D(Q̂) is dense in H. It is also

easily seen that Q̂ is strongly positive. As QE is bijective so is Q̂ since

it is a restriction of QE . Note also that Q̂−1 ∈ L(H) and is symmetric,

hence self-adjoint. Thus Q̂ is self-adjoint as well. Operator Q̂ is called

the Friedrichs extension of Q. It is easily seen that the energetic space

and the energetic extension deﬁned by Q̂ are exactly HE and QE .

Summarizing, we see that Q̂ satisfies all the conditions in (a) and plays the role of the former Q. So assuming in (a) that Q is self-adjoint is not really a restriction; in fact, in this case the Friedrichs extension of Q is Q itself.

For example, if we choose H = L2 (Ω) (where Ω ⊂ RN is an open

bounded set with smooth boundary) and D(Q) = C0∞ (Ω), Qu = −Δu,

then Q is symmetric in H (not self-adjoint), the corresponding ener-

getic space is H_E = H_0^1(Ω), and Q_E : H_E → H_E^* is given by

Q_E(u)(v) = ∫_Ω ∇u · ∇v dx ∀u, v ∈ H_0^1(Ω),

i.e., the same energetic extension we had before (see Remark 8.18).

Obviously, the corresponding Friedrichs extension of Q is given by Q̂u = −Δu, D(Q̂) = {u ∈ H_0^1(Ω); Δu ∈ L²(Ω)}.

8.7 Exercises

1. Let X denote the real linear space of all polynomials with real

coeﬃcients of degree ≤ 3. Deﬁne A : X → X by

(b) Find all the eigenpairs of A.

the sup-norm. Deﬁne on X the operator A by

3. Let X be a linear space, let A, B : X → X be linear operators, and let λ ≠ 0. Prove that λ is an eigenvalue of AB := A ◦ B if and only if λ is an eigenvalue of BA := B ◦ A.


4. Let X denote the real Banach space C[0, 1] with the usual sup-norm. Let k = k(t, s) ∈ C([0, 1] × [0, 1]), with ∂k/∂t ∈ C([0, 1] × [0, 1]) and k(t, t) ≠ 0 ∀t ∈ [0, 1]. Define on X the operator A by

(Au)(t) = ∫_0^t k(t, s)u(s) ds, t ∈ [0, 1].

Show that

(a) A ∈ L(X);

(b) A has no eigenvalue.

Solve the same exercise for X = L2 (0, 1) with the usual norm.

5. Let H = ℓ² be the Hilbert space of all sequences x = (x_1, x_2, . . . ) in C satisfying Σ_{n=1}^∞ |x_n|² < ∞, with the inner product

⟨x, y⟩ = Σ_{i=1}^∞ x_i ȳ_i, x = (x_1, x_2, . . . ), y = (y_1, y_2, . . . ) ∈ H.

For a given bounded sequence (λ_n) in C, define the multiplication operator A by

Ax = (λ_1 x_1, λ_2 x_2, . . . ) ∀x = (x_1, x_2, . . . ) ∈ H.

(b) Show that A is symmetric (hence self-adjoint) ⇐⇒ λn ∈ R

for all n ∈ N;

(c) Find all the eigenvalues of A.

6. Let H = L²(0, 1) be the real Hilbert space equipped with the usual scalar product and induced norm, denoted ‖ · ‖. Define A : H → H by

(Au)(t) = t ∫_t^1 u(s) ds + ∫_0^t s u(s) ds, 0 ≤ t ≤ 1, u ∈ H.

(b) Prove that A is a compact operator;

(c) Prove that A is symmetric (hence self-adjoint);


(d) Find all the eigenpairs of A and use this information to determine an orthonormal basis of H.

7. Let H be a Hilbert space. Prove that x ∈ H \ {0} is an eigenvector of A ∈ L(H) ⇐⇒ |(Ax, x)| = ‖Ax‖ · ‖x‖.

8. Let H be a Hilbert space and let u, v ∈ H be two orthogonal vectors (i.e., (u, v) = 0). Define A : H → H by Ax = (x, u)v + (x, v)u, x ∈ H. Obviously, A ∈ L(H).

(b) Show that A is symmetric (hence self-adjoint);

(c) Using (a), calculate ‖A‖, where A : L²(−π, π) → L²(−π, π) is the linear operator defined by

(Af)(t) = sin t ∫_{−π}^{π} f(s) cos s ds + cos t ∫_{−π}^{π} f(s) sin s ds, t ∈ [−π, π];

(d) Find all the eigenpairs of A.

9. Let H be a Hilbert space over K and let {e_1, . . . , e_m} ⊂ H be an orthonormal system, where m is a given natural number. Define A : H → H by

Ax = Σ_{i=1}^m c_i (x, e_i) e_i, x ∈ H,

where ci ∈ K \ {0}, i = 1, . . . , m.

(a) Show that A ∈ L(H) and determine A, R(A) and N (A);

(b) Show that A is symmetric ⇐⇒ ci ∈ R ∀i ∈ {1, . . . , m};

(c) Determine all the eigenvalues of A.


10. Let H = L2 (0, 1) be the real Hilbert space equipped with the

usual scalar product and norm. Deﬁne A : H → H by

(Au)(t) = (t/(1 + t)) ∫_0^1 (s/(1 + s)) u(s) ds, t ∈ [0, 1], u ∈ H.

(b) Determine R(A) and N (A);

(c) Determine all the eigenpairs of A.

11. Let H = L2 (0, 1) be the real Hilbert space equipped with the

usual scalar product and norm. For u ∈ H consider the problem

v″(t) = u(t) a.e. in (0, 1),
v′(0) = 0, v(1) = 0.

(a) Define A : H → H by Au = v, where v is the unique solution of the above problem corresponding to u.

(b) Prove that A is symmetric and compact;

(c) Find all the eigenpairs of A and use this information to

determine an orthonormal basis of H.

−Δu = λu in Ω ⊂ R2 ,

u=0 on ∂Ω,

Consider also the eigenvalue problem for −Δ with Neumann conditions on all sides of the rectangle Ω, or with combinations of Dirichlet and Neumann conditions on different sides of Ω. Solve all these eigenvalue problems.

Chapter 9

Semigroups of Linear

Operators

Consider the Cauchy problem

u′(t) = Au(t), t ≥ 0, (E)
u(0) = x, (IC)

where A is an n × n matrix with complex entries and x is a given (column) vector in Cⁿ. It is well known that problem (E), (IC) has a unique solution, namely

u(t) = e^{tA}x, t ≥ 0, (9.0.1)

where e^{tA} denotes the fundamental matrix of the linear differential system (E) which equals I (the n × n identity matrix) for t = 0. We have

e^{tA} = Σ_{k=0}^∞ (t^k/k!) A^k, (9.0.2)

the series being convergent in L(X) for all linear operators A, with e^{tA} ∈ L(X), where X = Cⁿ, equipped with one of

its equivalent norms, and L(X) denotes, as usual, the space of bounded

linear operators from X into itself. As we will see later, the family of

matrices (operators) {T (t) = etA ; t ≥ 0} is a uniformly continuous

semigroup on X = Cn . What’s more, the family {T (t); t ≥ 0} extends



to a group of operators {T(t) = e^{tA}; t ∈ R}. Representing the solution u(t) as

u(t) = T (t)x, t ≥ 0 (9.0.3)

allows the derivation of some properties of solutions from the proper-

ties of the family {T (t); t ≥ 0}. This idea extends easily to the case

when X is a general Banach space and A is a bounded (continuous)

linear operator, A ∈ L(X).
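In the matrix case the exponential series (9.0.2) and the semigroup property T(t + s) = T(t)T(s) are easy to verify numerically; the following sketch (my addition, assuming numpy; `expm_series` is a name I introduce here) builds e^{tA} from a truncated series:

```python
# Sketch: truncated power series for e^{tA} (cf. (9.0.2)), and a check of the
# semigroup property T(t+s) = T(t) T(s) with T(0) = I.
import numpy as np

def expm_series(A, t, terms=60):
    """Truncated series sum_{k<terms} (t^k/k!) A^k."""
    n = A.shape[0]
    result = np.zeros((n, n))
    term = np.eye(n)                 # current term (tA)^k / k!
    for k in range(terms):
        result = result + term
        term = term @ (t * A) / (k + 1)
    return result

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
T = lambda t: expm_series(A, t)
assert np.allclose(T(0.0), np.eye(4))                    # T(0) = I
assert np.allclose(T(0.8), T(0.3) @ T(0.5), atol=1e-8)   # T(t+s) = T(t)T(s)
```

For a fixed t the truncation error decays factorially, so 60 terms are far more than enough for moderate ‖tA‖; in production one would rather use a dedicated routine such as a scaling-and-squaring matrix exponential.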

If A is not an element of L(X), then the operator exponential etA no

longer makes sense. This case is not trivial; rather, it is much more

interesting and very useful in applications. If A : D(A) ⊂ X → X

satisﬁes certain conditions, then one can associate with A a so-called

C0 -semigroup of linear operators {T (t); t ≥ 0} ⊂ L(X) (see Deﬁ-

nition 9.1 below), so that the solution of the Cauchy problem (E),

(IC) can again be represented by the above formula (9.0.3). Indeed,

there is a central result in the linear semigroup theory, known as the

Hille–Yosida theorem,1 which establishes the necessary and suﬃcient

conditions for a linear operator A to “generate” a C0 -semigroup of

linear operators {T (t); t ≥ 0} ⊂ L(X). In this way, one can solve

linear partial differential equations of the form (E), where A represents an unbounded linear differential operator with respect to the space variables, defined on a convenient function space.

The linear semigroup theory received considerable attention in the

1930s as a new approach in the study of parabolic and hyperbolic

linear partial differential equations. It has since developed into an independent theory with applications in several other fields, such as ergodic theory, the theory of Markov processes, etc.

In this chapter we present some of the most important results of the

linear semigroup theory and provide some related applications.

9.1 Deﬁnitions

Throughout this chapter X will be a Banach space over K with norm

· , where K is either R or C. Denote as usual by L(X) the space

of all bounded (continuous) linear operators T : X → X, which is a

Banach space with respect to the operator norm

‖T‖ = sup {‖T x‖ : x ∈ X, ‖x‖ ≤ 1}.

¹ Carl Einar Hille, American mathematician, 1894–1980; Kosaku Yosida, Japanese mathematician, 1909–1990.


Definition 9.1. A family {T(t); t ≥ 0} ⊂ L(X) is said to be a semigroup if

(i) T(0) = I (the identity operator on X);
(ii) T(t + s) = T(t)T(s) for all t, s ≥ 0.

If, in addition,

(iii) lim_{t→0+} ‖T(t)x − x‖ = 0 for all x ∈ X,

then {T(t); t ≥ 0} is said to be a semigroup of class C0 (or a strongly continuous semigroup). Condition (iii) expresses the strong continuity of T(·) at t = 0; that is why {T(t); t ≥ 0} is called a C0-semigroup.

A family {T(t); t ≥ 0} ⊂ L(X) is called a uniformly continuous semigroup if it satisfies conditions (i) and (ii) above, and

(iii)′ lim_{t→0+} ‖T(t) − I‖ = 0.

Since ‖T(t)x − x‖ ≤ ‖T(t) − I‖ · ‖x‖ for any x ∈ X, every uniformly continuous semigroup is also a C0-semigroup.

The infinitesimal generator of a semigroup is defined by

Ax := lim_{h→0+} (1/h)[T(h)x − x],   (9.1.4)

for all x ∈ X for which the above limit exists. If D(A) is the set of all

such x’s, then we have a linear operator A : D(A) ⊂ X → X, which is

called the inﬁnitesimal generator of the semigroup {T (t); t ≥ 0}.

Theorem 9.5. For any operator A ∈ L(X) the family {T (t) = etA ; t ≥

0} is a uniformly continuous semigroup whose inﬁnitesimal generator

is A.


Proof. By definition,

e^{tA} = ∑_{k=0}^{∞} (t^k/k!) A^k,

meaning that for any t ≥ 0 this series is convergent in L(X) and its

sum is etA . It is easily seen that the family {T (t) = etA ; t ≥ 0} satisﬁes

(i) and (ii). Condition (iii) is also satisﬁed since

‖T(t) − I‖ ≤ ∑_{k=1}^{∞} (t^k/k!) ‖A‖^k ≤ t‖A‖ · e^{t‖A‖} → 0 as t → 0+.

A similar estimate shows that (1/h)[T(h)x − x] → Ax as h → 0+ for every x ∈ X, so A is the infinitesimal generator of {T(t) = e^{tA};
t ≥ 0}.

Remark 9.6. We will see later that, in fact, every uniformly continuous

semigroup is a family of operator exponentials {etA ; t ≥ 0} with A ∈

L(X). Note that A can be obtained from the right derivative of T (t) =

etA calculated at t = 0. This explains the above deﬁnition of the

generator of a C0 -semigroup {T (t); t ≥ 0}. In this case, we can expect

only the existence of the right derivative at t = 0 of T (t)x for some

points x ∈ X.

Examples of C0 -semigroups (that do not belong to the class of uni-

formly continuous semigroups) will be provided later.

9.2 Some Properties of C0-Semigroups

We start this section with a basic result in the linear semigroup theory:

Theorem 9.7. Let {T(t); t ≥ 0} ⊂ L(X) be a C0-semigroup. Then the following hold:

(a) there exist constants M ≥ 1 and ω ≥ 0 such that

‖T(t)‖ ≤ M e^{ωt}  ∀t ≥ 0;   (9.2.5)

(b) for every x ∈ X, the function t → T(t)x is continuous from [0, ∞) to X.

Proof. Assertion (a): Let us ﬁrst prove that there exists a constant

δ > 0 such that ‖T(t)‖ is bounded on [0, δ], i.e., there exists a constant C > 0 with

‖T(t)‖ ≤ C  ∀t ∈ [0, δ].   (9.2.6)

Assume, by way of contradiction, that this is not the case, i.e., there
exists a sequence of real numbers tk ↓ 0 such that ‖T(tk)‖ → ∞. On

the other hand, condition (iii) of Deﬁnition 9.1 implies that for each

x ∈ X there exists a natural number N = N(x) such that ‖T(tk)x‖ ≤ N for all k (since T(tk)x → x as k → ∞). By the Uniform Boundedness Principle, the sequence (‖T(tk)‖) is bounded, which contradicts the assumption above. Thus
(9.2.6) holds true for some δ > 0. Since ‖T(0)‖ = ‖I‖ = 1, we have
C ≥ 1.

Now, for all t ≥ 0 we have the decomposition (division with remainder)

t = nδ + r, n ∈ N, 0 ≤ r < δ.

So, by using condition (ii) of Definition 9.1, we can derive the estimate

‖T(t)‖ = ‖T(δ)^n T(r)‖ ≤ ‖T(δ)‖^n ‖T(r)‖ ≤ C^{n+1}.

Therefore, since n ≤ t/δ and C ≥ 1,

‖T(t)‖ ≤ C · C^{t/δ},  t ≥ 0,

which shows that (9.2.5) holds true with M = C and ω = (ln C)/δ.

Assertion (b): Let t0 > 0 and x ∈ X be arbitrary but ﬁxed. For any

h > 0 we have (cf. condition (ii) from Definition 9.1)

‖T(t0 + h)x − T(t0)x‖ = ‖T(t0)[T(h)x − x]‖ ≤ ‖T(t0)‖ · ‖T(h)x − x‖,

which shows that the function t → T (t)x is continuous from the right

at t = t0 (cf. condition (iii) of Deﬁnition 9.1). Now, for 0 < h < t0 ,

we can write (cf. (ii) and (9.2.5))

‖T(t0 − h)x − T(t0)x‖ = ‖T(t0 − h)[x − T(h)x]‖ ≤ M e^{ω(t0−h)} ‖x − T(h)x‖,

which tends to 0 as h → 0+ (cf. (iii)), so t → T(t)x is also continuous from the left at t = t0.


Remark 9.8. Using (9.2.5) and the semigroup property, one can easily derive the following property that is stronger than (b)

above: the map (t, x) → T (t)x is continuous from [0, ∞) × X to X

(see Exercise 9.4).

Remark 9.9. The constant ω in (9.2.5) determined in the proof above

is nonnegative, but this is not the best constant. Indeed, sometimes ω

can be negative (e.g., this is the case if T (t) = etA , where A is a real

square matrix whose eigenvalues have negative real parts).

Theorem 9.10. Let {T (t) : t ≥ 0} ⊂ L(X) be a C0 -semigroup and

let A be its infinitesimal generator. Then,

(c) A is densely defined: D(A) = X;
(d) A is a closed operator;
(e) for every x ∈ D(A) and t ≥ 0 we have T(t)x ∈ D(A), the function t → T(t)x is differentiable on [0, ∞), and

(d/dt) T(t)x = AT(t)x = T(t)Ax.   (9.2.8)

Proof of (c): Since t → T(t)x is continuous, we have

x = lim_{t→0+} (1/t) ∫_0^t T(s)x ds,  ∀x ∈ X.

Hence, it suffices to show that

∫_0^t T(s)x ds ∈ D(A),  ∀t > 0, x ∈ X.   (9.2.9)

Indeed, for some given t > 0, x ∈ X, and for all h > 0, we have

h^{-1}[T(h) − I] ∫_0^t T(s)x ds = h^{-1} ∫_0^t [T(s+h)x − T(s)x] ds
 = h^{-1} ∫_h^{t+h} T(s)x ds − h^{-1} ∫_0^t T(s)x ds
 = h^{-1} ∫_t^{t+h} T(s)x ds − h^{-1} ∫_0^h T(s)x ds.


Letting h → 0+ and using the continuity of s → T(s)x, we obtain

lim_{h→0+} h^{-1}[T(h) − I] ∫_0^t T(s)x ds = T(t)x − x,   (9.2.10)

which proves (9.2.9), and hence (c).

Proof of (d): Let (xn ) be a sequence in D(A) such that xn → x and

Axn → y. Using (9.2.10), we can write

T(t)xn − xn = lim_{h→0+} ∫_0^t T(s) h^{-1}[T(h)xn − xn] ds = ∫_0^t T(s)Axn ds  ∀t > 0.

It follows that

T(t)x − x = ∫_0^t T(s)y ds  ∀t > 0,

so

lim_{t→0+} t^{-1}[T(t)x − x] = y,

i.e., x ∈ D(A) and Ax = y. Thus A is closed.

Proof of (e): Let t ≥ 0 and x ∈ D(A). We have

lim_{h→0+} h^{-1}[T(t+h)x − T(t)x] = lim_{h→0+} h^{-1}[T(h)T(t)x − T(t)x]
 = lim_{h→0+} T(t) h^{-1}[T(h)x − x]
 = T(t)Ax.   (9.2.12)

This shows that T(t)x ∈ D(A), AT(t)x = T(t)Ax, and

(d+/dt) T(t)x = AT(t)x = T(t)Ax.   (9.2.13)


We have used d+/dt to denote the right derivative. To conclude, we need
to show that the left derivative of T(t)x exists and equals its right
derivative at any t > 0. For 0 < h < t, we have

h^{-1}[T(t)x − T(t−h)x] − T(t)Ax = T(t−h){h^{-1}[T(h)x − x] − T(h)Ax},

hence

‖h^{-1}[T(t)x − T(t−h)x] − T(t)Ax‖ ≤ M e^{ω(t−h)} {‖h^{-1}[T(h)x − x] − Ax‖ + ‖Ax − T(h)Ax‖}.

Letting h → 0+, we conclude that

(d−/dt) T(t)x = T(t)Ax.   (9.2.14)

Obviously, (e) follows from (9.2.13) and (9.2.14).

Theorem 9.11. If A is the infinitesimal generator of a C0-semigroup {T(t); t ≥ 0} ⊂ L(X), then the subspace Y := ∩_{n=1}^{∞} D(A^n) is dense in X, where
the operators A^n : D(A^n) → X are inductively defined as follows: A^1 = A and, for n ≥ 2,

D(A^n) = {x ∈ D(A^{n−1}); A^{n−1}x ∈ D(A)},  A^n x = A(A^{n−1}x)  ∀x ∈ D(A^n).

Proof. For arbitrary x ∈ X and φ ∈ C_0^∞(R) with supp φ ⊂ (0, +∞), define

x(φ) = ∫_0^∞ φ(t) T(t)x dt.

For h > 0 we have

(1/h)[T(h) − I] x(φ) = (1/h) [∫_0^∞ φ(t) T(t+h)x dt − ∫_0^∞ φ(t) T(t)x dt]
 = (1/h) [∫_h^∞ φ(t−h) T(t)x dt − ∫_0^∞ φ(t) T(t)x dt]
 = ∫_0^∞ ((φ(t−h) − φ(t))/h) T(t)x dt,

which converges to −x(φ′) as h → 0+. Hence x(φ) ∈ D(A) and Ax(φ) =
−x(φ′). We infer by induction that x(φ) ∈ D(A^n) and A^n x(φ) =


(−1)^n x(φ^{(n)}) for all n ∈ N, hence x(φ) ∈ Y. Now, let us prove that

any x ∈ X can be approximated by x(φ) for suitable φ’s (see [49, p.

44]). If ω ∈ C0∞ (R) is the usual test function with supp ω = [−1, +1]

and ∫_{−1}^{+1} ω(t) dt = 1, define the mollifier

φε(t) = (1/ε) ω(t/ε − 2)  ∀t ∈ R, ε > 0.

Since supp φε = [ε, 3ε] and ∫_ε^{3ε} φε(t) dt = 1, we have

‖x(φε) − x‖ = ‖∫_ε^{3ε} φε(t)[T(t)x − x] dt‖
 ≤ ∫_ε^{3ε} φε(t) ‖T(t)x − x‖ dt
 ≤ sup_{t∈[ε,3ε]} ‖T(t)x − x‖ · ∫_ε^{3ε} φε(t) dt
 = sup_{t∈[ε,3ε]} ‖T(t)x − x‖.

Therefore,

lim_{ε→0+} ‖x(φε) − x‖ = 0.

Theorem 9.12. If two C0-semigroups have the same infinitesimal generator, then they coincide.

Proof. Let A be the common generator of two C0 -semigroups, say

{T (t); t ≥ 0} and {S(t); t ≥ 0}. For any t > 0 and x ∈ D(A) we have

(see Theorem 9.10, (e))

(d/ds)[T(t−s)S(s)x] = −T(t−s)AS(s)x + T(t−s)AS(s)x = 0,  ∀ 0 ≤ s < t.

Hence the function s → T(t−s)S(s)x is constant on the interval [0, t]. In particular, T(t)x = S(t)x on D(A)

for all t ≥ 0. This concludes the proof since D(A) = X.

Remark 9.13. Property (e) of Theorem 9.10 says that for every x ∈

D(A) the function u(t) = T (t)x is continuously diﬀerentiable on [0, ∞)

and satisfies the Cauchy problem

u′(t) = Au(t),  t ≥ 0,   (CP)
u(0) = x.

Moreover, the solution of (CP) (i.e., a function u which is a
solution on every bounded interval [0, r] in the sense of Definition 9.44

below) is unique. Indeed, if ũ is also a C 1 -solution of problem (CP ),

then for any t > 0 we have

(d/ds)[T(t−s)ũ(s)] = −T(t−s)Aũ(s) + T(t−s)ũ′(s) = 0  ∀ s ∈ (0, t),

hence s → T (t − s)ũ(s) is a constant function on [0, t]. In particular,

its values at s = 0 and s = t coincide:

ũ(t) = T (t)ũ(0) = T (t)x,

which proves that the solution of (CP ) is unique and is given by u(t) =

T (t)x, t ≥ 0.

Now, if x ∈ X\D(A), then the function u(t) = T (t)x satisﬁes the initial

condition u(0) = x, but is no longer diﬀerentiable (see Sect. 9.5 below),

so it cannot satisfy the Cauchy problem above in a classical sense.

However, u can be regarded as a generalized solution (or mild solution,

as it will be called later, see Sect. 9.11) since the initial condition is still

satisﬁed, u(0) = x, and there exists a sequence (un ) of C 1 -solutions

of equation (CP )1 , such that un → u in C([0, r]; X) for all r > 0.

Indeed, one can choose a sequence (xn ) in D(A), such that xn → x (cf.

Theorem 9.10, (c)), and obviously un (t) = T (t)xn are all C 1 -solutions

satisfying the required condition:

T (t)xn − T (t)x ≤ T (t) · xn − x ≤ M eωr xn − x,

for all t ∈ [0, r]. Clearly, the deﬁnition of the generalized solution is

independent of the choice of the sequence (un ) (or (xn = un (0))).

It is worth pointing out that in the discussion above A was assumed to

be the infinitesimal generator of a C0-semigroup. Now, given a linear

operator A we want to know the conditions on A ensuring the existence

of a C0 -semigroup whose generator is precisely A. This will allow us

to solve Cauchy problems like (CP ) above. From Theorem 9.10 we

know that such an A necessarily has to be densely defined and closed.

The complete answer will be provided later.

9.3 Uniformly Continuous Semigroups

Uniformly continuous semigroups have been defined before. We have
also seen that for any A ∈ L(X), the family {T(t) = e^{tA}; t ≥ 0} is
a uniformly continuous semigroup whose generator is A. According
to Theorem 9.12, {e^{tA}; t ≥ 0} is the only
uniformly continuous semigroup having A as its generator. The next

result shows that, in fact, the class of uniformly continuous semigroups

reduces to {{etA ; t ≥ 0}; A ∈ L(X)}.

Theorem 9.14. Let {T(t); t ≥ 0} ⊂ L(X) be a uniformly continuous semigroup. If A is its infinitesimal generator, then A ∈ L(X).

Proof. Since

lim_{t→0+} ‖I − (1/t) ∫_0^t T(s) ds‖ = 0,

there exists a t0 > 0 such that

‖I − B‖ < 1,  where B = (1/t0) ∫_0^{t0} T(s) ds.

Therefore, B is invertible and B^{-1} = [I − (I − B)]^{-1} ∈ L(X). Now,

for all h > 0, we have

(1/h)[T(h) − I] B = (1/(h t0)) [∫_0^{t0} T(s+h) ds − ∫_0^{t0} T(s) ds]
 = (1/t0) [(1/h) ∫_{t0}^{t0+h} T(s) ds − (1/h) ∫_0^h T(s) ds].

Hence

lim_{h→0+} (1/h)[T(h) − I] B = (1/t0)[T(t0) − I],   (9.3.15)

with respect to the topology of L(X). Since the generator of {T (t); t ≥

0} is A, it follows from (9.3.15) that

AB = (1/t0)[T(t0) − I].   (9.3.16)

Since B^{-1} ∈ L(X), it follows that

A = (1/t0)[T(t0) − I] B^{-1} ∈ L(X).

Note that any uniformly continuous semigroup {e^{tA}; t ≥ 0} can naturally be extended to the group {e^{tA}; t ∈ R} (see the next

section).


Assume now that {T(t); t ≥ 0} ⊂ L(X) is a C0-semigroup whose infinitesimal generator A : D(A) ⊂ X → X is bounded, i.e., there
exists a constant c > 0 such that ‖Ax‖ ≤ c‖x‖ for all x ∈ D(A).

Then, D(A) = X, A ∈ L(X) and so the semigroup is in fact uniformly

continuous: T (t) = etA , t ≥ 0. Indeed, since D(A) = X, A has an

extension Ã ∈ L(X). Denote by {T̃ (t) = etÃ ; t ≥ 0} the (uniformly

continuous) semigroup with generator Ã. For an arbitrary t > 0 and

x ∈ D(A), we have

(d/ds)[T̃(t−s)T(s)x] = −T̃(t−s)ÃT(s)x + T̃(t−s)AT(s)x = 0  ∀s ∈ (0, t),

where we have used the fact that (d/ds)T(s)x = AT(s)x for all

s ∈ (0, t). It follows that the function s → T̃ (t − s)T (s)x is constant

on [0, t], and hence T̃(t)x = T(t)x for all x ∈ D(A), which shows (since D(A) is dense in X) that
T̃(t)x = T(t)x for all x ∈ X. Therefore, A coincides with Ã and the

assertion follows.

9.4 Groups of Linear Operators: Definitions and Link to Operator Semigroups

Deﬁnition 9.16. A family {G(t); t ∈ R} ⊂ L(X) is called a group

if

(j) G(0) = I (the identity operator on X);
(jj) G(t + s) = G(t)G(s) for all t, s ∈ R.

If, in addition,

(jjj) lim_{t→0} ‖G(t)x − x‖ = 0 for all x ∈ X,

then {G(t); t ∈ R} is called a C0 -group (or a group of class C0 ).

The inﬁnitesimal generator A of a group {G(t); t ∈ R} is deﬁned

by

Ax = lim_{h→0} (1/h)[G(h)x − x]  ∀x ∈ D(A),

where D(A) is the set of all x ∈ X for which the limit above exists.

If {G(t); t ∈ R} satisfies conditions (j), (jj) and, in addition, lim_{t→0} ‖G(t) − I‖ = 0, then {G(t); t ∈ R} is called a uniformly continuous group.

Remark 9.17. If {G(t); t ∈ R} is a C0 -group, then the families {G(t);

t ≥ 0} and {G(−t); t ≥ 0} are both C0 -semigroups, with generators

A and −A, respectively (prove it!). Conversely, if {T+ (t); t ≥ 0},

{T− (t); t ≥ 0} are C0 -semigroups with generators A and −A, respec-

tively, then one can deﬁne a C0 -group

G(t) = T+(t) if t ≥ 0,  G(t) = T−(−t) if t < 0,

whose generator is A. To see this, the key point is the
identity

T+ (t)T− (t) = T− (t)T+ (t) = I ∀t ≥ 0. (9.4.17)

Indeed, for any x ∈ D(A) = D(−A) and t ≥ 0, we have (cf. Theo-

rem 9.10, (e))

(d/dt)[T+(t)T−(t)x] = T+(t)AT−(t)x − T+(t)AT−(t)x = 0,

hence t → T+(t)T−(t)x is a constant function. Since it takes the

value x for t = 0, it follows that

T+ (t)T− (t) = I ∀t ≥ 0.

Similarly,

T− (t)T+ (t) = I ∀t ≥ 0,

so (9.4.17) holds true. Identity (9.4.17) shows that T+ (t) and T− (t) are

invertible for all t ≥ 0, being inverse to each other. Thus {G(t); t ∈ R}

satisﬁes the group property (jj). Since (j) and (jjj) are trivially

satisﬁed, we conclude that {G(t); t ∈ R} constructed above is indeed

a C0 -group, and its generator is A, as claimed.

Note that all the members G(t) of any group are necessarily invertible

operators, since G(t)G(−t) = I = G(−t)G(t). The next result shows

that invertibility allows one to extend any C0 -semigroup to a C0 -group.


Note that if {T(t); t ≥ 0} ⊂ L(X) is a semigroup and T(t0) is a bijection from X to itself (hence T(t0) is invertible) for

some t0 > 0, then so is T (t) for all t ≥ 0. Indeed, for t ∈ (0, t0 ), we

have

T (t0 ) = T (t)T (t0 − t) = T (t0 − t)T (t),

which shows that T (t) is bijective. For t > t0 we write t as t = nt0 + s,

where n ∈ N and 0 ≤ s < t0 (division with remainder) and so T (t) =

T (t0 )n T (s), which clearly shows that T (t) is also bijective in this case.

Theorem 9.18. Let {T(t); t ≥ 0} ⊂ L(X) be a C0-semigroup and let A denote its infinitesimal generator. If T(t) is a bijection from X to itself

for all t > 0 (equivalently, T (t0 ) is a bijection for some t0 > 0),

then {T (t)−1 ; t ≥ 0} is a C0 -semigroup with the generator −A, so

{G(t); t ∈ R} deﬁned by

G(t) = T(t) if t ≥ 0,  G(t) = T(−t)^{-1} if t < 0,

is a C0-group whose generator is A.

Proof. By the preceding observation, every T(t) is invertible; it is easy to check that {T(t)^{-1}; t ≥ 0} is a semigroup
and {G(t); t ∈ R} defined in the statement above is a group. Now, let

us prove that the semigroup {S(t) = G(−t); t ≥ 0} satisﬁes condition

(iii) of Deﬁnition 9.1. Let x ∈ X and s > 1. Denote y := T (s)−1 x.

For 0 < t < 1, we have

S(t)x − x = G(−t)x − x

= G(−t)G(s)y − T (s)y

= T (s − t)y − T (s)y → 0 as t → 0+ ,

by the strong continuity of {T(t); t ≥ 0}. Hence {S(t); t ≥ 0} satisfies condition (iii) as claimed, i.e., it is a C0-semigroup. Let B be

the inﬁnitesimal generator of {S(t) = T (t)−1 ; t ≥ 0}. For x ∈ D(A)

we have

lim_{h→0+} {(1/h)[x − T(h)x] + Ax} = 0.


Hence

lim_{h→0+} S(h){(1/h)[x − T(h)x] + Ax} = 0,

since the norms ‖S(h)‖, h ∈ [0, 1], are uniformly bounded (cf. Theorem 9.7,
(a)). Therefore,

lim_{h→0+} (1/h)[S(h)x − x] = −Ax,

so x ∈ D(B) and Bx = −Ax; thus D(A) ⊂ D(B). Interchanging the roles of S(t) and
T(t) = (S(t))^{-1}, t ≥ 0, we also have D(B) ⊂ D(A). Hence, D(A) = D(B)
and Bx = −Ax ∀x ∈ D(A), i.e., B = −A.

Remark 9.19. If {G(t); t ∈ R} ⊂ L(X) is a group such that, for every x ∈ X, the function t → G(t)x is continuous from the right (or from the left)

at some point t = t0 ∈ R, then there exist constants M ≥ 1 and ω ∈ R

such that

‖G(t)‖ ≤ M e^{ω|t|}  ∀t ∈ R.   (9.4.19)

This follows by the Uniform Boundedness Principle (see the proof of

Theorem 9.7). Moreover, using this estimate and the invertibility of

every G(t), one can easily see that t → G(t)x is continuous on R; even

more, the function (t, x) → G(t)x is continuous from R × X to X.

Remark 9.20. If A ∈ L(X), then {G(t) = etA ; t ∈ R} is a uniformly

continuous group. In fact, it follows from the discussion above that

the class of uniformly continuous groups is precisely {{etA ; t ∈ R}; A ∈

L(X)}.

9.5 Translation Semigroups

In this section we present the first examples of C0-semigroups which

are not uniformly continuous ones.

Let X be the space of all functions f : [0, ∞) → R which are uniformly

continuous and bounded. The space X is a real Banach space with

respect to the norm

‖f‖_∞ = sup_{t≥0} |f(t)|.

For each t ≥ 0, define T(t) : X → X by

(T(t)f)(s) = f(t + s),  s ∈ [0, ∞), f ∈ X.


Then {T(t); t ≥ 0} is a C0-semigroup whose infinitesimal generator is defined by

D(A) = {f ∈ X; f is differentiable on [0, ∞) and f′ ∈ X},   (9.5.20)
Af = f′  ∀f ∈ D(A).   (9.5.21)

Indeed, if f ∈ X, f is differentiable on [0, ∞), and f′ ∈ X, then for all h > 0 and s ≥ 0

(h^{-1}[T(h)f − f])(s) = h^{-1}[f(s + h) − f(s)] = f′(θ)

for some θ ∈ (s, s + h), by the Mean Value Theorem. Hence

|(h^{-1}[T(h)f − f])(s) − f′(s)| = |f′(θ) − f′(s)| → 0 as h → 0+,

uniformly with respect to s (since f′ is uniformly continuous). Therefore, f ∈ D(A) and Af = f′.

To conclude the proof, we need to show that (9.5.20) holds true, i.e.,

the converse inclusion relation is valid. To this end, let f ∈ D(A),

which means that there exists

Af = lim_{h→0+} h^{-1}[T(h)f − f] in X.

In particular, the right derivative f′+(s) exists for all s ≥ 0, and f′+ = Af ∈ X. We need to show that
f is differentiable on [0, ∞) so that f′+ = f′. For an arbitrary ε > 0
define

g(t) = f(t) − f(0) − ∫_0^t f′+(s) ds − εt.

We have g(0) = 0 and

g′+(t) = −ε < 0  ∀t ≥ 0,

which implies that g is nonincreasing, so g(t) ≤ g(0) = 0, i.e.,

f(t) ≤ f(0) + ∫_0^t f′+(s) ds + εt  ∀t ≥ 0.

Since ε > 0 was arbitrary, we obtain

f(t) ≤ f(0) + ∫_0^t f′+(s) ds  ∀t ≥ 0.

Replacing −ε

by +ε, we obtain the converse inequality, so

f(t) = f(0) + ∫_0^t f′+(s) ds  ∀t ≥ 0,

which shows that f is differentiable on [0, ∞), with f′ = f′+ ∈

X, as claimed.

The semigroup defined above is called a translation semigroup. Obviously,

‖T(t)f‖_∞ ≤ ‖f‖_∞  ∀t ≥ 0,

which shows that ‖T(t)‖ ≤ 1 for all t ≥ 0, i.e., the estimate in Theorem 9.7 holds with M = 1 and ω = 0.
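The translation semigroup lends itself to a quick numerical illustration. In the following sketch (the sample function f is an arbitrary choice, not from the text), the difference quotients of (9.1.4) are compared with f′ on a grid, in the spirit of (9.5.21):

```python
import numpy as np

f  = lambda s: np.sin(s) * np.exp(-s)            # f, f' bounded, uniformly continuous on [0, inf)
fp = lambda s: (np.cos(s) - np.sin(s)) * np.exp(-s)

def T(t, g):
    """Translation semigroup: (T(t)g)(s) = g(t + s)."""
    return lambda s: g(t + s)

s = np.linspace(0.0, 10.0, 2001)

# Semigroup property T(t + r) = T(t)T(r), and the contraction estimate
assert np.allclose(T(1.2, f)(s), T(0.5, T(0.7, f))(s))
assert np.max(np.abs(T(1.2, f)(s))) <= np.max(np.abs(f(s))) + 1e-12

# Generator: (1/h)[T(h)f - f] -> f' in the sup norm, cf. (9.5.21)
for h in (1e-2, 1e-3, 1e-4):
    err = np.max(np.abs((T(h, f)(s) - f(s)) / h - fp(s)))
    assert err < 2 * h                           # O(h) error, since |f''| <= 2 here
```

The O(h) convergence rate reflects the Taylor remainder (h/2)f″(ξ) in the Mean Value argument used above.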

It is worth pointing out that A is not a member of L(X) in this case, so {T(t); t ≥ 0} is not a uniformly continuous semigroup (see Theorem 9.14). This reflects the fact that the convergence in condition (iii) is not uniform with respect to x in the unit sphere of X (equivalently, condition (iii)′ is not valid).

Remark 9.21. If f ∈ D(A) (see (9.5.20)), then

u(t) = u(t, ·) = T(t)f(·) = f(t + ·)

is the solution of the Cauchy problem

u′(t) = Au(t)  ∀t ≥ 0,
u(0) = f,

i.e.,

∂u/∂t (t, s) = ∂u/∂s (t, s),  t, s ≥ 0,
u(0, s) = f(s),  s ≥ 0.

If f ∈ X \ D(A), then u(t) = T(t)f does not satisfy the above partial differential equation in a classical sense; it has to be

interpreted as a generalized solution of the Cauchy problem above.

If X is replaced by the space of all functions f : R → R which are

uniformly continuous and bounded, with the norm

‖f‖_∞ = sup_{t∈R} |f(t)|,

then one can similarly define the translation semigroup of operators T(t) : X → X, t ≥ 0,

(T(t)f)(s) = f(t + s)  ∀s ∈ R, f ∈ X.

This semigroup satisfies ‖T(t)‖ = 1 for all t ≥ 0, and its infinitesimal generator A is given by


D(A) = {f ∈ X; f is differentiable and f′ ∈ X},  Af = f′  ∀f ∈ D(A).

In this case, {T(t); t ≥ 0} extends to the C0-group {G(t); t ∈ R} defined by

(G(t)f)(s) = f(t + s)  ∀t, s ∈ R, f ∈ X.

This group is not uniformly continuous, since its generator does not belong to L(X).

9.6 The Hille–Yosida Generation Theorem

Let X be a Banach space and let A : D(A) ⊂ X → X be a linear

closed operator, not necessarily bounded. The set

ρ(A) = {λ ∈ K; λI − A is a bijective operator from D(A) to X}   (9.6.22)

is called the resolvent set of A. For λ ∈ ρ(A), denote

R(λ, A) = (λI − A)^{-1},   (9.6.23)

the resolvent (operator) of A. Since A is a closed operator, so is
R(λ, A) for all λ ∈ ρ(A). If we also take into account the fact that

D(R(λ, A)) = X, we infer that R(λ, A) ∈ L(X) for all λ ∈ ρ(A) (cf.

Theorem 4.10 (Closed Graph Theorem)).

Now, let us state a central result in the theory of semigroups of linear

operators, which belongs to E. Hille and K. Yosida.

Theorem 9.22 (Hille–Yosida). A linear operator A : D(A) ⊂ X → X is the infinitesimal generator of a C0-semigroup of contractions {T(t); t ≥ 0}
(i.e., ‖T(t)‖ ≤ 1 ∀t ≥ 0) if and only if

(k) A is densely defined and closed;
(kk) (0, ∞) ⊂ ρ(A) and ‖R(λ, A)‖ ≤ 1/λ  ∀λ > 0.


Proof. Necessity: If A generates a C0-semigroup, then both conditions of (k) are fulfilled (cf. Theorem 9.10). It remains to prove

(kk), under the assumption that {T (t); t ≥ 0} is a C0 -semigroup of

contractions. To this purpose, deﬁne

Rλ x = ∫_0^∞ e^{−λt} T(t)x dt  ∀λ > 0, x ∈ X.   (9.6.24)

We have

‖Rλ x‖ ≤ ∫_0^∞ e^{−λt} ‖T(t)‖ · ‖x‖ dt ≤ (∫_0^∞ e^{−λt} dt) ‖x‖ = (1/λ) ‖x‖  ∀x ∈ X, λ > 0,

which implies that

‖Rλ‖ ≤ 1/λ  ∀λ > 0.   (9.6.25)

Let us prove that for all λ > 0 and x ∈ X, Rλ x ∈ D(A). For all h > 0

we have

h^{-1}[T(h) − I] Rλ x = h^{-1} [∫_0^∞ e^{−λt} T(t+h)x dt − ∫_0^∞ e^{−λt} T(t)x dt]
 = ((e^{λh} − 1)/h) Rλ x − (e^{λh}/h) ∫_0^h e^{−λτ} T(τ)x dτ.

The right-hand side converges to λRλ x − x as h → 0+. Therefore, Rλ x ∈ D(A) and

ARλ x = λRλ x − x,

i.e.,

(λI − A)Rλ = I ∀λ > 0. (9.6.26)


On the other hand, for all x ∈ D(A) and t ≥ 0, T (t)x ∈ D(A) (cf.

Theorem 9.10, (e)) and

ARλ x = lim_{h→0+} ∫_0^∞ e^{−λt} h^{-1}[T(t+h)x − T(t)x] dt
 = ∫_0^∞ e^{−λt} T(t)Ax dt
 = Rλ Ax,

hence (see also (9.6.26))

Rλ (λI − A) = ID(A) ∀λ > 0, (9.6.27)

where ID(A) is the identity operator on D(A). From (9.6.26) and

(9.6.27) we infer that λI − A is a bijective operator from D(A) to X

and

Rλ = (λI − A)−1 ∀λ > 0.

Therefore, (0, ∞) ⊂ ρ(A) and

Rλ = R(λ, A) ∀λ > 0,

so (9.6.25) implies that

‖R(λ, A)‖ ≤ 1/λ  ∀λ > 0.

Thus the proof of necessity is complete.

Suﬃciency: Assume that both (k) and (kk) hold. For the convenience

of the reader, the proof will be divided into several steps.

Step 1: limλ→∞ λR(λ, A)x = x ∀x ∈ X.

If x ∈ D(A), then, according to (kk), we have

‖λR(λ, A)x − x‖ = ‖R(λ, A)Ax‖ ≤ (1/λ) ‖Ax‖,

which shows that

lim_{λ→∞} λR(λ, A)x = x  ∀x ∈ D(A).   (9.6.28)

Now, for arbitrary x ∈ X, choose a sequence (xn) in D(A) such that xn → x. Since

‖λR(λ, A)x − x‖ ≤ ‖λR(λ, A)(x − xn)‖ + ‖λR(λ, A)xn − xn‖ + ‖xn − x‖
 ≤ ‖λR(λ, A)xn − xn‖ + 2‖xn − x‖,

we have (see (9.6.28))

limsup_{λ→∞} ‖λR(λ, A)x − x‖ ≤ 2‖xn − x‖ → 0 as n → ∞,

which concludes Step 1.

Step 2: Deﬁne Aλ := λAR(λ, A), λ > 0 (the Yosida approximation of

A); then, for all λ > 0, Aλ ∈ L(X), and

lim_{λ→∞} Aλ x = Ax  ∀x ∈ D(A).

Indeed, since Aλ = λ²R(λ, A) − λI ∈ L(X) and Aλ x = λR(λ, A)Ax for all x ∈ D(A), Step 1 shows that Aλ x → Ax as λ → ∞, and Step 2 is
complete.
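The objects in Steps 1 and 2 can be illustrated numerically in the finite-dimensional case. The sketch below (with an arbitrarily chosen symmetric negative semidefinite matrix, so that condition (kk) holds) computes the Yosida approximation and checks the two convergence claims:

```python
import numpy as np

# A symmetric negative definite matrix: a generator for which
# (0, inf) is in rho(A) and ||R(lam, A)|| <= 1/lam, as in (kk).
A = np.array([[-2.0, 1.0], [1.0, -2.0]])
I = np.eye(2)

def yosida(lam):
    """Yosida approximation A_lam = lam * A * R(lam, A), R(lam, A) = (lam I - A)^{-1}."""
    R = np.linalg.inv(lam * I - A)
    return lam * A @ R, R

x = np.array([1.0, 2.0])
for lam in (1e1, 1e3, 1e5):
    A_lam, R = yosida(lam)
    # the identity A_lam = lam^2 R(lam, A) - lam I used in Steps 2 and 3
    assert np.allclose(A_lam, lam**2 * R - lam * I)
    # Step 1: lam R(lam, A)x -> x, with error at most ||Ax||/lam
    assert np.linalg.norm(lam * R @ x - x) <= np.linalg.norm(A @ x) / lam + 1e-12
    # Step 2: A_lam x -> Ax, with error at most ||A^2 x||/lam
    assert np.linalg.norm(A_lam @ x - A @ x) <= np.linalg.norm(A @ (A @ x)) / lam + 1e-12
```

Both error bounds follow from λR(λ, A)y − y = R(λ, A)Ay and ‖R(λ, A)‖ ≤ 1/λ, exactly as in the proof.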

Step 3: For all t ≥ 0, x ∈ X, and λ, ν > 0, we have

‖e^{tAλ} x − e^{tAν} x‖ ≤ t ‖Aλ x − Aν x‖.

Note first that, since Aλ = λ²R(λ, A) − λI and ‖R(λ, A)‖ ≤ 1/λ,

‖e^{tAλ}‖ = ‖e^{−λt} e^{tλ²R(λ,A)}‖ ≤ e^{−λt} · e^{tλ²‖R(λ,A)‖} ≤ 1.

It is also easily seen that etAλ , etAν , Aλ , Aν commute with each other.

Using this information, we infer that

‖e^{tAλ} x − e^{tAν} x‖ ≤ ∫_0^1 ‖(d/ds)[e^{tsAλ} e^{t(1−s)Aν} x]‖ ds
 ≤ ∫_0^1 ‖t e^{tsAλ} e^{t(1−s)Aν} (Aλ x − Aν x)‖ ds
 ≤ t ‖Aλ x − Aν x‖,   (9.6.31)

as claimed.

Step 4: The limit limλ→∞ etAλ x =: T (t)x, t ≥ 0, x ∈ X exists, and

{T (t); t ≥ 0} is a C0 -semigroup of contractions having A as its gener-

ator.


First of all, according to Steps 2 and 3, the above limit exists for each

x ∈ D(A), uniformly on compact subintervals of [0, ∞), thus t → T (t)x

is a continuous function on [0, ∞). It is also easily seen that

‖T(t)x‖ ≤ ‖x‖  ∀t ≥ 0,   (9.6.32)

for x ∈ D(A). Since D(A) is dense in X and the e^{tAλ} are contractions, the limit defining T(t)x exists for every x ∈ X; moreover, each T(t) is a linear
operator and (9.6.32) are satisfied for all x ∈ X.

is continuous on [0, ∞) for all x ∈ X. Indeed, if (xn ) is a sequence in

D(A) converging to x, then

T(·)xn → T(·)x uniformly on [0, ∞) (by (9.6.32)), and a uniform limit of continuous functions is continuous, so t → T(t)x is indeed continuous on [0, ∞). On the other hand,

‖T(t)x − e^{tAλ} x‖ ≤ ‖T(t)x − T(t)xn‖ + ‖T(t)xn − e^{tAλ} xn‖ + ‖e^{tAλ}(xn − x)‖
 ≤ 2‖x − xn‖ + ‖T(t)xn − e^{tAλ} xn‖,

so e^{tAλ} x → T(t)x as λ → ∞ for every x ∈ X, uniformly for t in compact intervals. Passing to the limit in the identity e^{(t+s)Aλ} x = e^{tAλ} e^{sAλ} x, we have

T(t + s)x = T(t)T(s)x  ∀t, s ≥ 0, x ∈ X.

Thus we have already proved that {T (t); t ≥ 0} is a C0 -semigroup of

contractions, and all we have to prove next is that its generator, say

B, coincides with the given operator A. If x ∈ D(A), we have

T(t)x − x = lim_{λ→∞} (e^{tAλ} x − x)
 = lim_{λ→∞} ∫_0^t e^{sAλ} Aλ x ds
 = ∫_0^t T(s)Ax ds,   (9.6.33)


since e^{sAλ} Aλ x → T(s)Ax, uniformly with respect to s ∈ [0, t], as λ → ∞. From (9.6.33) we easily see that D(A) ⊂ D(B) and

Bx = Ax for all x ∈ D(A). Now, by assumption 1 ∈ ρ(A). On

the other hand, according to the forward implication, we also have

1 ∈ ρ(B) (since B is the generator of a C0 -semigroup of contractions).

So both I − A and I − B are bijections from D(A) and, respectively, D(B) to X. Since Ax = Bx for all x ∈ D(A), it follows that D(A) = D(B) and A = B.
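In the matrix case, formula (9.6.24), which represents the resolvent as the Laplace transform of the semigroup, can be verified directly. The sketch below (matrix, vector, and step sizes are illustrative choices) compares a numerical Laplace transform of t → T(t)x with (λI − A)⁻¹x:

```python
import numpy as np

A = np.array([[-1.0, 0.5], [0.0, -2.0]])   # a stable matrix (eigenvalues -1, -2)
I = np.eye(2)
x = np.array([1.0, 1.0])

def expm(M):
    """Matrix exponential via eigendecomposition (M assumed diagonalizable)."""
    w, V = np.linalg.eig(M)
    return ((V * np.exp(w)) @ np.linalg.inv(V)).real

lam = 1.5
# (9.6.24): R(lam, A)x = int_0^inf e^{-lam t} T(t)x dt, by the trapezoidal rule;
# the integrand decays like e^{-2.5 t}, so truncating at t = 25 is harmless
ts = np.linspace(0.0, 25.0, 5001)
vals = np.array([np.exp(-lam * t) * (expm(t * A) @ x) for t in ts])
dt = ts[1] - ts[0]
R_integral = dt * (0.5 * vals[0] + vals[1:-1].sum(axis=0) + 0.5 * vals[-1])

R_direct = np.linalg.solve(lam * I - A, x)  # (lam I - A)^{-1} x
assert np.allclose(R_integral, R_direct, atol=1e-4)
```

The quadrature error is O(h²) for the trapezoidal rule, comfortably inside the tolerance used here.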

9.7 The Lumer–Phillips Theorem

In this section we discuss another result which also provides neces-

sary and suﬃcient conditions for a linear operator to generate a C0 -

semigroup of contractions. This result belongs to Lumer and Phillips2

and is useful in applications. Before stating this result we need the

following definition.

Definition 9.23. A linear operator A : D(A) ⊂ X → X is said to be dissipative if

‖(λI − A)x‖ ≥ λ‖x‖  ∀x ∈ D(A), λ > 0.   (9.7.34)

If, in addition, R(λI − A) = X for all λ > 0, then A is said to be m-dissipative.

Note that a dissipative operator A is m-dissipative if and only if there exists a λ0 > 0 such that R(λ0 I − A) = X. Indeed,

by the dissipativity condition (9.7.34) it follows that λ0 I − A is a bijec-

tion between D(A) and X, (λ0 I − A)−1 ∈ L(X) and (λ0 I − A)−1 ≤

1/λ0 . Using this information and Banach’s Fixed Point Theorem it

follows easily that R(λI − A) = X for all λ ∈ (0, 2λ0 ). Obviously, this

interval can be extended indeﬁnitely to the right and so R(λI −A) = X

for all λ > 0.

Theorem (Lumer–Phillips). A linear operator A : D(A) ⊂ X → X is the infinitesimal generator of a C0-semigroup of contractions if

and only if the following conditions hold: (a) D(A) = X, and (b) A is

m-dissipative.

² Günter Lumer, German-born mathematician, 1929–2005; Ralph S. Phillips, American mathematician, 1913–1998.


Proof. Suﬃciency: Assume that both (a) and (b) hold. From (b) it

follows that for every λ > 0 we have λ ∈ ρ(A), R(λ, A) ∈ L(X), and

R(λ, A) ≤ 1/λ (see the remark above). Also, A is a closed operator

since (λI − A)−1 ∈ L(X) for all λ > 0. It follows by the Hille–Yosida

Theorem that A generates a C0 -semigroup of contractions.

Necessity: Assume that A is the generator of a C0 -semigroup of con-

tractions {T (t); t ≥ 0}. According to the Hille–Yosida Theorem, it

suﬃces to show that A is dissipative. Let x ∈ D(A) and x∗ ∈ J(x),

where J is the duality mapping of X. We have

Re x*(Ax) = lim_{h→0+} h^{-1} Re x*(T(h)x − x)
 = lim_{h→0+} h^{-1} [Re x*(T(h)x) − ‖x‖²]
 ≤ 0,

since

Re x*(T(h)x) ≤ ‖x*‖ · ‖T(h)‖ · ‖x‖ ≤ ‖x‖²,

where we have used x*(x) = ‖x‖² and ‖x*‖ = ‖x‖. Consequently,

Re x*((λI − A)x) = λ‖x‖² − Re x*(Ax) ≥ λ‖x‖²  ∀λ > 0,

whence ‖(λI − A)x‖ ≥ λ‖x‖ ∀λ > 0, i.e., A is dissipative.

Remark 9.26. A linear operator A : D(A) ⊂ X → X is dissipative if and only if

for every x ∈ D(A) there exists x* ∈ J(x) such that Re x*(Ax) ≤ 0.   (9.7.35)

From the proof above we see that (9.7.35) implies (9.7.34). For the

proof of the converse implication, see [13, p. 81] or [39, p. 14]. If X

is a Hilbert space, then this implication follows easily. If X is a real

Hilbert space, then (9.7.35) means that A is negative semideﬁnite:

(Ax, x) ≤ 0 ∀x ∈ D(A) (equivalently, −A is positive semideﬁnite or

monotone).
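In a finite-dimensional real Hilbert space the dissipativity conditions are easy to check numerically. The sketch below (with an arbitrarily chosen matrix A whose symmetric part is −I, so that (Ax, x) = −‖x‖²) verifies the Hilbert-space form of (9.7.35), the inequality (9.7.34), and the contraction property of the generated semigroup:

```python
import numpy as np

def expm(M):
    """Matrix exponential via eigendecomposition (M assumed diagonalizable)."""
    w, V = np.linalg.eig(M)
    return ((V * np.exp(w)) @ np.linalg.inv(V)).real

# symmetric part of A is -I, the rest is skew, so (Ax, x) = -|x|^2 <= 0
A = np.array([[-1.0, 2.0], [-2.0, -1.0]])

rng = np.random.default_rng(0)
for _ in range(100):
    x = rng.standard_normal(2)
    # Hilbert-space dissipativity: (Ax, x) <= 0, cf. (9.7.35)
    assert (A @ x) @ x <= 1e-12
    # (9.7.34): ||(lam I - A)x|| >= lam ||x|| for lam > 0
    for lam in (0.5, 1.0, 10.0):
        assert np.linalg.norm(lam * x - A @ x) >= lam * np.linalg.norm(x) - 1e-9

# ... and the generated semigroup consists of contractions: ||e^{tA}|| <= 1
for t in (0.1, 1.0, 5.0):
    assert np.linalg.norm(expm(t * A), 2) <= 1 + 1e-9
```

Here e^{tA} is e^{−t} times a rotation, so its spectral norm is exactly e^{−t} ≤ 1, matching the Lumer–Phillips conclusion.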

Note that if X is assumed to be reﬂexive then condition (a) in the

Lumer–Phillips Theorem becomes superﬂuous, so we have


Theorem 9.27. Let X be a reflexive Banach space. Then a linear operator A : D(A) ⊂ X → X is the infinitesimal generator of a C0-

semigroup of contractions if and only if A is m-dissipative.

Proof. Bearing in mind the Lumer–Phillips Theorem, we need to prove

that if X is reﬂexive and A is m-dissipative (equivalently, A satisﬁes

(9.7.35) and R(λ0 I − A) = X for some λ0 > 0), then D(A) = X.

Obviously, (0, ∞) ⊂ ρ(A), R(λ, A) ∈ L(X), and R(λ, A) ≤ 1/λ for

all λ > 0. Now, for x ∈ D(A) and λ > 0 denote xλ := λR(λ, A)x. As

in the proof of the Hille–Yosida Theorem, we can see that

‖xλ − x‖ ≤ (1/λ) ‖Ax‖,

hence xλ converges to x as λ → +∞. (Note that this property cannot

be extended, for the time being, to all x ∈ X, as in the proof of

the Hille–Yosida Theorem, since D(A) = X is now a target, not a

hypothesis). It is also easily seen that for x ∈ D(A) and λ > 0, ‖Axλ‖ = ‖λR(λ, A)Ax‖ ≤ ‖Ax‖, so (Axλ) is bounded. Since X is reflexive, there exists a
sequence λn → ∞ such that (Axλn) converges weakly. Moreover, since

A is m-dissipative, its graph is closed in X × X, hence weakly closed,

so we have

lim_{n→∞} x*(Axλn) = x*(Ax)  ∀x* ∈ X*,   (9.7.36)

for all x ∈ D(A). Since Axλ = λR(λ, A)Ax ∈ D(A) for all λ > 0, we

derive from (9.7.36) that

Therefore D(A) = X as claimed.

The reflexivity assumption cannot be dropped in Theorem 9.27, as the following counterexample shows: X = C[0, 1] equipped

with the usual sup-norm (which is a non-reﬂexive Banach space),

A : D(A) ⊂ X → X, D(A) = {u ∈ C¹[0, 1]; u(0) = 0}, Au = −u′. It

is easily seen that A is m-dissipative, but not densely defined (the closure of D(A) is {u ∈ C[0, 1]; u(0) = 0} ≠ X). So, according to the Hille–Yosida
Theorem, A cannot be the generator of a C0-semigroup in X. This

counterexample clearly shows that Theorem 9.27 fails to hold in non-

reﬂexive Banach spaces.

We close this section with the following result which is valid in a general

Banach space X.

Theorem 9.29. If A : D(A) ⊂ X → X is a closed linear operator such that D(A) = X and both A and A* are dissipative (where A*

denotes the adjoint of A), then A is m-dissipative (hence, according to

the Lumer–Phillips Theorem, A is the generator of a C0 -semigroup of

contractions).

Proof. Let x* ∈ X* be such that x*((I − A)x) = 0 for all x ∈ D(A). It follows that x* ∈ D(A*) and x* − A*x* = 0. Since A* is assumed

to be dissipative, we infer that x* = 0, so R(I − A) is dense in X. In fact,

R(I − A) is a closed subspace of X (since A is dissipative and closed),

hence R(I − A) = X.

9.8 The Feller–Miyadera–Phillips Theorem

The Hille–Yosida theorem has the following signiﬁcant generalization

that belongs to Feller, Miyadera, and Phillips.3

Theorem 9.30. A linear operator A : D(A) ⊂ X → X is the infinitesimal generator of a C0-semigroup {T(t); t ≥ 0} satisfying ‖T(t)‖ ≤
M e^{ωt}, t ≥ 0, with M ≥ 1, ω ∈ R, if and only if

(k) A is densely defined and closed;
(kk) (ω, ∞) ⊂ ρ(A) and ‖R(λ, A)^n‖ ≤ M/(λ − ω)^n  ∀λ > ω, n = 1, 2, . . .

Proof. Necessity: One can proceed as in the proof of the necessity part of the Hille–Yosida Theorem. Here Rλ is well defined for λ > ω
and one can similarly prove that (ω, ∞) ⊂ ρ(A), R(λ, A) = Rλ, and
‖R(λ, A)‖ ≤ M/(λ − ω)
for all λ > ω. Then, for all λ > ω and x ∈ X,

we have

³ William S. Feller, Croatian-American mathematician, 1906–1970; Isao Miyadera, Japanese mathematician, born 1926.


R(λ, A)² x = ∫_0^∞ e^{−λt} T(t) Rλ x dt
 = ∫_0^∞ e^{−λt} T(t) (∫_0^∞ e^{−λs} T(s)x ds) dt
 = ∫_0^∞ ∫_0^∞ e^{−λ(t+s)} T(t + s)x ds dt
 = ∫_0^∞ (∫_t^∞ e^{−λr} T(r)x dr) dt
 = ∫_0^∞ r e^{−λr} T(r)x dr,

after the substitution r = t + s and an interchange of the order of integration. By induction,

R(λ, A)^n x = (1/(n−1)!) ∫_0^∞ t^{n−1} e^{−λt} T(t)x dt.   (9.8.38)

It then follows from (9.8.38) and the exponential estimate for the

semigroup that for all λ > ω, x ∈ X and n = 1, 2, . . .

‖R(λ, A)^n x‖ ≤ (M/(n−1)!) ∫_0^∞ t^{n−1} e^{(ω−λ)t} dt · ‖x‖ = (M/(λ − ω)^n) ‖x‖,

which completes the proof of necessity.

Suﬃciency: To simplify the proof, we note that, in general, if {T (t); t ≥

0} is a C0-semigroup satisfying ‖T(t)‖ ≤ M e^{ωt}, t ≥ 0, for some M ≥ 1

and ω ∈ R, with generator A, then the family {S(t) = e−ωt T (t); t ≥

0} is also a C0 -semigroup with the generator A − ωI. Thus, one

can assume in the following that ω = 0 (i.e., (0, ∞) ⊂ ρ(A) and

‖λ^n R(λ, A)^n‖ ≤ M for all λ > 0 and n = 1, 2, . . . ). The idea that

can be used to complete the proof is to deﬁne a new norm on X, say

‖·‖∗, equivalent to the original one, such that the corresponding operator
norm of R(λ, A) is less than or equal to 1/λ for all λ > 0. Then the

conclusion will follow from the Hille–Yosida theorem. First, deﬁne for

ν > 0 the following norm on X:

‖x‖ν = sup{‖ν^n R(ν, A)^n x‖; n = 0, 1, 2, . . .}.

Then

‖x‖ ≤ ‖x‖ν ≤ M‖x‖,   (9.8.39)

and the operator norm of R(ν, A) with respect to the new norm satisfies

‖R(ν, A)‖ν ≤ 1/ν  ∀ν > 0.   (9.8.40)

In addition,

‖R(λ, A)‖ν ≤ 1/λ  for all 0 < λ ≤ ν.   (9.8.41)

This follows easily from (9.8.40) and the so-called resolvent identity:

R(λ, A) − R(ν, A) = (ν − λ) R(λ, A) R(ν, A).

Now, deﬁne

‖x‖∗ = sup{‖x‖ν ; ν > 0},

and observe that (see (9.8.39) and (9.8.41))

‖x‖ ≤ ‖x‖∗ ≤ M‖x‖  and  ‖R(λ, A)‖∗ ≤ 1/λ  ∀λ > 0.

So, according to the Hille–Yosida Theorem, A generates a C0 -semigroup

{T (t); t ≥ 0} ⊂ L(X) satisfying

‖T(t)‖∗ ≤ 1  ∀t ≥ 0,

hence

‖T(t)‖ ≤ M  ∀t ≥ 0.

Remark 9.31. Under the assumptions of Theorem 9.30 (with ω = 0), we have

‖λ^n R(λ, A)^n x‖ν ≤ ‖x‖ν  ∀ 0 < λ ≤ ν, x ∈ X, n = 0, 1, 2, . . . ,

which implies

lim_{λ→∞} λR(λ, A)x = x  ∀x ∈ X.

Taking into account the above discussion on groups and their relation-

ship with semigroups, one can easily derive the following extension to

groups of the Feller–Miyadera–Phillips generation theorem.


Theorem 9.32. A linear operator A : D(A) ⊂ X → X is the infinitesimal generator of a C0-group {G(t); t ∈ R} satisfying ‖G(t)‖ ≤

M eω|t| , t ∈ R, with M ≥ 1, ω ∈ R, if and only if

(kk) for every λ ∈ R with |λ| > ω one has λ ∈ ρ(A) and ‖R(λ, A)^n‖ ≤ M/(|λ| − ω)^n  ∀n = 1, 2, . . .

Note that, for M = 1, the inequality ‖R(λ, A)^n‖ ≤ 1/(|λ| − ω)^n ∀n = 1, 2, . . . is equivalent to ‖R(λ, A)‖ ≤ 1/(|λ| − ω). If M = 1 and ω = 0, then ‖G(t)‖ = 1 for all t ∈ R, or equivalently ‖G(t)x‖ = ‖x‖ for all t ∈ R, x ∈ X (i.e., all G(t)'s are isometries).

Summarizing, we have the following result.

Theorem 9.33. A linear operator A : D(A) ⊂ X → X is the infinitesimal generator of a C0-group of isometries {G(t); t ∈ R} if and

only if

(kk)* for every λ ∈ R \ {0} one has λ ∈ ρ(A) and ‖R(λ, A)‖ ≤ 1/|λ|.

9.9 A Perturbation Result

It is intuitive that perturbing the generator A of a C0-semigroup with

any operator B ∈ L(X) yields a generator. Indeed, the following result

holds.

Theorem 9.34. Let A be the infinitesimal generator of a C0-semigroup {T(t); t ≥ 0} ⊂ L(X) satisfying ‖T(t)‖ ≤ M e^{ωt} for all

t ≥ 0, with M ≥ 1, ω ∈ R, and let B ∈ L(X). Then the operator

C = A + B with D(C) = D(A) is the generator of a C0 -semigroup

{S(t); t ≥ 0} ⊂ L(X) satisfying ‖S(t)‖ ≤ M e^{(ω+M‖B‖)t} for all t ≥ 0.

Proof. Replacing T(t) by e^{−ωt}T(t) if necessary (cf. the proof of Theorem 9.30), one can assume that ω = 0. Next, we also assume that

M = 1. Then (0, ∞) ⊂ ρ(A) and for all λ > 0 we can write

λI − C = [I − BR(λ, A)](λI − A).   (9.9.43)


For all λ > ‖B‖ we have ‖BR(λ, A)‖ ≤ ‖B‖ · ‖R(λ, A)‖ < 1, so

I − BR(λ, A) is invertible in L(X). Thus, taking into account (9.9.43),

we can see that (‖B‖, ∞) ⊂ ρ(C) and for all λ > ‖B‖

R(λ, C) = R(λ, A)[I − BR(λ, A)]^{-1} = R(λ, A) ∑_{n=0}^{∞} [BR(λ, A)]^n,

so that

‖R(λ, C)‖ ≤ 1/(λ − ‖B‖)  ∀λ > ‖B‖.

By the Feller–Miyadera–Phillips Theorem (with M = 1 and ω = ‖B‖), C therefore generates a C0-semigroup {S(t); t ≥ 0} satisfying ‖S(t)‖ ≤ e^{‖B‖t} for all t ≥ 0.

Now, let us consider the general case M ≥ 1 (and ω = 0). Deﬁne the

norm

‖x‖∗ = sup_{t≥0} ‖T(t)x‖.

Then ‖x‖ ≤ ‖x‖∗ ≤ M‖x‖ and ‖T(t)‖∗ ≤ 1 for all t ≥ 0, so, by the first part of the proof, C = A + B generates a C0-semigroup {S(t); t ≥ 0} satisfying

‖S(t)‖∗ ≤ e^{‖B‖∗ t},  t ≥ 0.

Therefore,

‖S(t)x‖ ≤ ‖S(t)x‖∗ ≤ e^{‖B‖∗ t} ‖x‖∗ ≤ M e^{M‖B‖t} ‖x‖  ∀t ≥ 0,

since ‖B‖∗ ≤ M‖B‖ and ‖x‖∗ ≤ M‖x‖.
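The Neumann-series formula for R(λ, C) appearing in the proof can be checked directly for matrices. In the sketch below (A, B, and λ are illustrative choices, with A symmetric negative definite so that the contraction case applies), the truncated series is compared with (λI − A − B)⁻¹:

```python
import numpy as np

# A: generator of a contraction matrix semigroup; B: a bounded perturbation
A = np.array([[-3.0, 1.0], [1.0, -3.0]])
B = np.array([[0.2, -0.1], [0.3, 0.1]])
I = np.eye(2)

lam = 5.0
assert lam > np.linalg.norm(B, 2)              # lam > ||B||, as required in the proof

R_A = np.linalg.inv(lam * I - A)               # R(lam, A)
assert np.linalg.norm(B @ R_A, 2) < 1          # so the Neumann series converges

# R(lam, C) = R(lam, A) * sum_{n>=0} [B R(lam, A)]^n, cf. (9.9.43)
S, P = np.zeros((2, 2)), np.eye(2)
for _ in range(60):
    S, P = S + P, (B @ R_A) @ P

R_C = R_A @ S
assert np.allclose(R_C, np.linalg.inv(lam * I - A - B))
```

Since ‖BR(λ, A)‖ is small here (about 0.05), 60 terms of the series are far more than needed for machine precision.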

9.10 Approximation of Semigroups

An example of approximation has already been encountered in the proof of Theorem 9.22. Specifically, we saw that if {T(t); t ≥ 0} ⊂ L(X) is a C0-semigroup of contractions with generator A, then T(t)x can be approximated (uniformly for t in compact intervals) by e^{tA_λ}x as λ → ∞, where A_λ denotes the Yosida approximation of A. In fact, this approximation result extends to any C0-semigroup.

In what follows, we present another approximation result, known as

the Trotter Theorem,4 which is relevant for applications. As in [39],

for M ≥ 1 and ω ∈ R denote by G(M, ω) the class of operators which

generate C0-semigroups {T(t); t ≥ 0} satisfying ‖T(t)‖ ≤ M e^{ωt}, ∀t ≥

0. The Trotter Theorem [48] says that the convergence of a sequence

An ∈ G(M, ω) to A ∈ G(M, ω) in some sense (see below) is equivalent

to the convergence of the corresponding semigroups.

Theorem 9.36. If A, An ∈ G(M, ω) and {T (t); t ≥ 0}, {Tn (t); t ≥ 0}

are the C0 -semigroups generated by A, An (n = 1, 2, . . . ), then the

following conditions are equivalent:

(a) for some λ > ω and for all x ∈ X, R(λ, An )x → R(λ, A)x as

n → ∞;

(b) for all x ∈ X and t ≥ 0, Tn (t)x → T (t)x as n → ∞, uniformly

for t in compact subintervals of [0, ∞).

Proof. We ﬁrst prove that (a) implies (b). For a given t > 0, every

s ∈ (0, t), and every x ∈ X, we have

$$\begin{aligned}
\frac{d}{ds}\,&[T_n(t-s)R(\lambda, A_n)T(s)R(\lambda, A)x] \\
&= -T_n(t-s)A_n R(\lambda, A_n)T(s)R(\lambda, A)x + T_n(t-s)R(\lambda, A_n)AT(s)R(\lambda, A)x \\
&= T_n(t-s)[-A_n R(\lambda, A_n)R(\lambda, A) + R(\lambda, A_n)AR(\lambda, A)]T(s)x \\
&= T_n(t-s)[R(\lambda, A) - R(\lambda, A_n)]T(s)x.
\end{aligned}$$

Note that all the above operations are allowed. Integrating the above

equality over [0, t] yields

$$R(\lambda, A_n)[T_n(t) - T(t)]R(\lambda, A)x = \int_0^t T_n(t-s)[R(\lambda, A) - R(\lambda, A_n)]T(s)x \, ds. \tag{9.10.44}$$

4. Hale F. Trotter, Canadian mathematician, born 1931.


Hence, for every t1 > 0 and all t ∈ [0, t1], one has

$$\begin{aligned}
\|R(\lambda, A_n)[T_n(t) - T(t)]R(\lambda, A)x\|
&\le \int_0^{t} \|T_n(t-s)\| \cdot \|[R(\lambda, A_n) - R(\lambda, A)]T(s)x\| \, ds \\
&\le \int_0^{t_1} \|T_n(t-s)\| \cdot \|[R(\lambda, A_n) - R(\lambda, A)]T(s)x\| \, ds. \tag{9.10.45}
\end{aligned}$$

By assumption (a), the integrand converges point-wise to zero in [0, t1] and it has in this interval the upper bound 2M³e^{ωt₁}‖x‖(λ − ω)^{−1}. Thus, according to the Lebesgue Dominated Convergence Theorem, one gets from (9.10.45)

$$\lim_{n\to\infty} R(\lambda, A_n)[T_n(t) - T(t)]R(\lambda, A)x = 0,$$

uniformly for t ∈ [0, t1]. Since x ∈ X is arbitrary and the range of R(λ, A) is D(A), we have

$$\lim_{n\to\infty} R(\lambda, A_n)[T_n(t) - T(t)]y = 0 \quad \forall y \in D(A), \tag{9.10.46}$$

uniformly for t in compact subintervals of [0, ∞).

Next, for every x ∈ X we have the decomposition

$$\begin{aligned}
[T_n(t) - T(t)]R(\lambda, A)x &= T_n(t)[R(\lambda, A) - R(\lambda, A_n)]x \\
&\quad + R(\lambda, A_n)[T_n(t) - T(t)]x \\
&\quad + [R(\lambda, A_n) - R(\lambda, A)]T(t)x. \tag{9.10.47}
\end{aligned}$$

The right-hand side of (9.10.47) has three terms, say Si = Si (t, n, x),

i = 1, 2, 3. Using our assumption (a) and the estimate ‖Tn(t)‖ ≤ M e^{ωt}, t ≥ 0, we can see that, for each x ∈ X, S1(t, n, x) converges to

zero as n → ∞, uniformly for t in every compact subinterval of [0, ∞).

A similar conclusion holds for S2 (t, n, x), x ∈ D(A) (see (9.10.46)).

Taking again assumption (a) into account, it follows that S3 (t, n, x),

x ∈ X, also converges to zero as n → ∞, uniformly for t in every

compact subinterval of [0, ∞) (here we use the fact that {T (t)x; 0 ≤

t ≤ t1 } is a compact set for each t1 > 0). Summarizing, we derive from

(9.10.47) that

$$\lim_{n\to\infty} [T_n(t) - T(t)]R(\lambda, A)x = 0 \quad \forall x \in D(A),$$

uniformly for t in compact subintervals of [0, ∞).


Hence,

$$\lim_{n\to\infty} [T_n(t) - T(t)]z = 0 \quad \forall z \in D(A^2),$$

uniformly for t in compact subintervals of [0, ∞). Since D(A²) is dense in X (see Remark 9.11), this conclusion extends to all x ∈ X, so (b) holds.

Conversely, assuming now that (b) is satisfied, we have for any λ > ω and x ∈ X

$$R(\lambda, A_n)x - R(\lambda, A)x = \int_0^\infty e^{-\lambda t}[T_n(t)x - T(t)x] \, dt,$$

hence

$$\|R(\lambda, A_n)x - R(\lambda, A)x\| \le \int_0^\infty e^{-\lambda t}\|T_n(t)x - T(t)x\| \, dt. \tag{9.10.48}$$

Using again Lebesgue's Dominated Convergence Theorem for the right-hand side of the above inequality, we conclude that indeed (b) implies (a).
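The equivalence in Theorem 9.36 is easy to observe numerically in finite dimensions, where every matrix is a generator. The sketch below (ours, assuming numpy and scipy) takes Aₙ = A + C/n → A and checks that the resolvent error and the semigroup error decay together.

```python
import numpy as np
from scipy.linalg import expm, inv

rng = np.random.default_rng(1)
d = 4
A = rng.standard_normal((d, d))
C = rng.standard_normal((d, d))
lam = np.linalg.norm(A, 2) + np.linalg.norm(C, 2) + 1.0  # lam exceeds the growth bound of every A_n
t = 1.0

def resolvent(lam, M):
    # R(lam, M) = (lam I - M)^{-1}
    return inv(lam * np.eye(d) - M)

res_err, sg_err = [], []
for n in [1, 10, 100]:
    An = A + C / n                      # A_n -> A as n -> infinity
    res_err.append(np.linalg.norm(resolvent(lam, An) - resolvent(lam, A), 2))
    sg_err.append(np.linalg.norm(expm(t * An) - expm(t * A), 2))

# resolvent convergence and semigroup convergence occur together (Theorem 9.36)
assert res_err[2] < 0.2 * res_err[0]
assert sg_err[2] < 0.2 * sg_err[0]
print(res_err, sg_err)
```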

Remark 9.37. It is obvious from the proof above that condition (a)

is equivalent to (a)′: for all x ∈ X and all λ > ω, R(λ, An)x →

R(λ, A)x as n → ∞. If one assumes that, for some λ > ω,

R(λ, An )x converges as n → ∞ to some Rλ x for all x ∈ X, and if in

addition the range of Rλ is assumed to be dense in X, then Rλ is the

resolvent R(λ, A) of an operator A ∈ G(M, ω). For the proof of this

implication, see [24] and [39, p. 86]. This implication can be used to

replace Theorem 9.36 by an improved version, in which the existence

of A ∈ G(M, ω) is no longer assumed. The reformulation of the Trotter

Theorem in view of the above information is left to the reader.

Remark 9.38. It is worth pointing out that the Trotter Theorem or

suitable versions of it can be used successfully in the numerical analysis

of various initial-boundary value problems.

We continue this section with a result known as the Chernoﬀ product

formula.5

Theorem 9.39. Let A ∈ G(M, ω) for some M ≥ 1 and ω ∈ R and let F : [0, ∞) → L(X) be a function satisfying F(0) = I and ‖F(t)^k‖ ≤ M e^{kωt} for all t ≥ 0, k ∈ N. Assume that

$$\lim_{s\to 0^+} \frac{F(s)x - x}{s} = Ax \quad \forall x \in D(A). \tag{9.10.49}$$

5. Paul R. Chernoff, American mathematician, born 1942.


Then,

$$T(t)x = \lim_{n\to\infty} F(t/n)^n x, \tag{9.10.50}$$

for all x ∈ X, uniformly for t in compact subintervals of [0, ∞), where {T(t); t ≥ 0} is the C0-semigroup generated by A.

Lemma 9.40. Let Q ∈ L(X) be such that ‖Q^j‖ ≤ M for some M ≥ 1 and all j ∈ N. Then we have

$$\|e^{n(Q-I)}x - Q^n x\| \le M\sqrt{n}\, \|Qx - x\|, \quad \forall n \in \mathbb{N},\ x \in X.$$

Proof. We have

$$e^{n(Q-I)} - Q^n = e^{-n}\big(e^{nQ} - e^n Q^n\big) = e^{-n} \sum_{k=0}^{\infty} \frac{n^k}{k!}\big(Q^k - Q^n\big). \tag{9.10.51}$$

For k > n,

$$Q^k - Q^n = \sum_{j=n}^{k-1} Q^j (Q - I),$$

and similarly for k < n, so that

$$\|Q^k x - Q^n x\| \le M|n - k| \cdot \|Qx - x\|. \tag{9.10.52}$$

Now, using (9.10.51), (9.10.52), and the Bunyakovsky–Cauchy–Schwarz inequality, we derive

$$\begin{aligned}
\|e^{n(Q-I)}x - Q^n x\| &\le e^{-n} \sum_{k=0}^{\infty} \frac{n^k}{k!}\, M|n-k| \cdot \|Qx - x\| \\
&\le M e^{-n} \|Qx - x\| \Big(\sum_{k=0}^{\infty} \frac{n^k}{k!}\Big)^{1/2} \Big(\sum_{k=0}^{\infty} \frac{n^k}{k!}(n-k)^2\Big)^{1/2} \\
&= M e^{-n} \|Qx - x\| \,\big(e^n\big)^{1/2} \big(n e^n\big)^{1/2} \\
&= M\sqrt{n}\, \|Qx - x\|.
\end{aligned}$$


Proof of Theorem 9.39. Assume first that ω = 0. For each s > 0 define A_s ∈ L(X) by

$$A_s x = s^{-1}[F(s) - I]x, \quad x \in X.$$

Obviously, A_s ∈ L(X) for all s > 0 and (cf. (9.10.49))

$$\lim_{s\to 0^+} A_s x = Ax \quad \forall x \in D(A). \tag{9.10.53}$$

Moreover,

$$\|e^{tA_s}\| \le e^{-t/s} \sum_{k=0}^{\infty} \frac{t^k}{s^k k!}\, \|F(s)^k\| \le M, \quad \forall t \ge 0, \tag{9.10.54}$$

Fix λ > 0. For every x ∈ D(A) and y = (λI − A)x we have

$$R(\lambda, A_s)y = R(\lambda, A_s)\big[(\lambda I - A_s)x + (A_s x - Ax)\big] = x + R(\lambda, A_s)\big(A_s x - Ax\big).$$

Letting s → 0⁺ and using (9.10.53) and (9.10.54), we obtain R(λ, A_s)y → R(λ, A)y for all y ∈ X. Hence, it follows by Theorem 9.36 (which also works with the continuous parameter s instead of n) that

$$\|T(t)x - e^{tA_s}x\| \to 0, \quad \text{as } s \to 0^+, \ \forall x \in X, \tag{9.10.56}$$

uniformly for t in compact subintervals of [0, ∞).

On the other hand, applying Lemma 9.40 with Q = F(t/n) (so that e^{n(Q−I)} = e^{tA_{t/n}}), we have

$$\begin{aligned}
\|e^{tA_{t/n}}x - F(t/n)^n x\| &\le M\sqrt{n}\, \|F(t/n)x - x\| \\
&= \frac{Mt}{\sqrt{n}}\, \|A_{t/n}x\| \to 0, \quad \text{as } n \to \infty, \tag{9.10.57}
\end{aligned}$$

for every x ∈ D(A) (by (9.10.53), ‖A_{t/n}x‖ stays bounded), uniformly for t in compact subintervals of [0, ∞).

Combining (9.10.56) and (9.10.57), we derive (9.10.50) for all x ∈

D(A). Since D(A) is dense in X, (9.10.50) extends to the whole of X.


The case ω ≠ 0 can be reduced to the previous one. Indeed, the function F̃, defined by F̃(t) = e^{−ωt}F(t), satisfies F̃(0) = I and ‖F̃(t)^k‖ ≤ M for all t ≥ 0 and k ∈ N. Moreover, (9.10.49) is satisfied with F̃ instead of F, and A − ωI instead of A. So the conclusion of Theorem 9.39 follows easily.

Corollary 9.41. Let A ∈ G(M, ω), M ≥ 1, ω ∈ R. Then

$$T(t)x = \lim_{n\to\infty} \Big(I - \frac{t}{n}A\Big)^{-n} x, \quad \forall x \in X, \tag{9.10.58}$$

uniformly for t in compact subintervals of [0, ∞), where {T(t); t ≥ 0} ⊂ L(X) is the C0-semigroup generated by A.

Proof. We can assume that ω > 0. Define F : [0, ∞) → L(X) by

$$F(t) = \begin{cases} I, & t = 0, \\ (1/t)\, R\big(1/t, A\big), & t \in (0, \delta), \\ 0, & t \ge \delta, \end{cases}$$

where δ ∈ (0, 1/ω) is chosen small enough that 1/(1 − ωt) ≤ e^{(ω+1)t} for t ∈ (0, δ). Then

$$\|F(t)^k\| \le M \big/\, t^k (t^{-1} - \omega)^k = M/(1 - \omega t)^k \le M e^{k(\omega+1)t}, \quad \forall t \in (0, \delta),\ k \in \mathbb{N}.$$

We also have

$$\lim_{t\to 0^+} t^{-1}[F(t)x - x] = \lim_{t\to 0^+} (1/t)\,R\big(1/t, A\big)Ax = Ax \quad \forall x \in D(A).$$

Thus, all the assumptions of Theorem 9.39 are fulfilled; since F(t/n) = (I − (t/n)A)^{−1} for n/t > 1/δ, (9.10.58) holds.
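Formula (9.10.58) is the semigroup analogue of iterated backward Euler steps, which the following finite-dimensional sketch (ours, assuming numpy and scipy) illustrates.

```python
import numpy as np
from scipy.linalg import expm, inv

rng = np.random.default_rng(3)
d = 4
A = rng.standard_normal((d, d))
A = A / np.linalg.norm(A, 2)            # ||A|| = 1, so I - (t/n)A below is invertible
x = rng.standard_normal(d)
t = 1.5
I = np.eye(d)

exact = expm(t * A) @ x
errs = []
for n in [8, 64, 512]:
    # n backward-Euler steps of size t/n, cf. (9.10.58)
    approx = np.linalg.matrix_power(inv(I - (t / n) * A), n) @ x
    errs.append(np.linalg.norm(approx - exact))

assert errs[0] > errs[1] > errs[2]      # errors decay like O(1/n)
print(errs)
```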

Another consequence of Theorem 9.39 is the following Trotter product formula corresponding to perturbed semigroups:

Corollary 9.42. Let A ∈ G(M, ω), M ≥ 1, ω ∈ R, and B ∈ L(X). If

{T (t); t ≥ 0} is the C0 -semigroup generated by A, S(t) = etB , t ≥ 0,

and {U (t); t ≥ 0} is the C0 -semigroup generated by A + B, then

$$U(t)x = \lim_{n\to\infty} \big(T(t/n)S(t/n)\big)^n x \quad \forall x \in X, \tag{9.10.59}$$


Proof. Using the previous renorming procedure (see the proof of Theorem 9.30), we can assume M = 1. So, defining F(t) = T(t)S(t), t ≥ 0, we have

$$\|F(t)^k\| \le e^{k\omega t} e^{k\|B\| t} = e^{k(\omega + \|B\|)t}, \quad \forall t \ge 0,\ k \in \mathbb{N},$$

and for all x ∈ D(A + B) = D(A)

$$\lim_{t\to 0^+} t^{-1}[F(t)x - x] = \lim_{t\to 0^+} T(t)\,\frac{S(t)x - x}{t} + \lim_{t\to 0^+} \frac{T(t)x - x}{t} = Bx + Ax.$$

Therefore, Theorem 9.39 is again applicable and (9.10.59) follows.
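Formula (9.10.59) is the basis of the Lie splitting schemes of numerical analysis. A finite-dimensional sketch (ours, assuming numpy and scipy) with two noncommuting matrices:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
d = 4
A = rng.standard_normal((d, d)); A = A / np.linalg.norm(A, 2)
B = rng.standard_normal((d, d)); B = B / np.linalg.norm(B, 2)

t = 1.0
exact = expm(t * (A + B))               # the semigroup generated by A + B
errs = []
for n in [4, 16, 64]:
    step = expm((t / n) * A) @ expm((t / n) * B)     # T(t/n) S(t/n)
    errs.append(np.linalg.norm(np.linalg.matrix_power(step, n) - exact, 2))

assert errs[0] > errs[1] > errs[2]      # first-order (O(1/n)) splitting error
print(errs)
```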

A Trotter-type product formula also holds, under suitable conditions, for two general C0-semigroups (see, e.g., [13, p. 154]).

9.11 The Inhomogeneous Cauchy Problem

Consider the Cauchy (initial value) problem

$$u'(t) = Au(t) + f(t), \ t \in [0, r]; \quad u(0) = x, \tag{CP}$$

where A is the generator of a C0-semigroup {T(t); t ≥ 0} ⊂ L(X), and f is a given function from [0, r] to X, r ∈ (0, ∞). The case f ≡ 0 was discussed before.

Deﬁnition 9.44. A function u : [0, r] → X is a classical solution of

problem (CP ) if u is continuous on [0, r] and continuously diﬀeren-

tiable on (0, r], u(t) ∈ D(A) for all t ∈ (0, r], u(0) = x, and u satisﬁes

equation (CP )1 for all t ∈ (0, r].

Remark 9.45. If f ∈ C([0, r]; X) and u is a classical solution of problem

(CP ), then for 0 < s < t ≤ r we have

$$\begin{aligned}
\frac{d}{ds}[T(t-s)u(s)] &= -T(t-s)Au(s) + T(t-s)u'(s) \\
&= -T(t-s)Au(s) + T(t-s)Au(s) + T(t-s)f(s) \\
&= T(t-s)f(s).
\end{aligned}$$


Integrating this equality over [0, t] yields

$$u(t) = T(t)x + \int_0^t T(t-s)f(s)\, ds, \quad t \in [0, r], \tag{9.11.60}$$

so the classical solution of (CP), if any, is unique (see Theorem 9.12).

Note also that the integral term in the right-hand side of Eq. (9.11.60) makes sense for f ∈ L¹(0, r; X), since (see Theorem 9.7) ‖T(t)‖ is bounded on [0, r]. This suggests the following concept of generalized solution for the Cauchy problem (CP).

Definition 9.46. Let x ∈ X, f ∈ L¹(0, r; X), and let A be the generator of a C0-semigroup {T(t); t ≥ 0} ⊂ L(X). The function u ∈ C([0, r]; X) given by

$$u(t) = T(t)x + \int_0^t T(t-s)f(s)\, ds \quad \forall t \in [0, r] \tag{9.11.61}$$

is called the mild solution of problem (CP).

Obviously, if A is the generator of a C0 -semigroup {T (t); t ≥ 0}, then

for each (x, f ) ∈ X × L1 (0, r; X) problem (CP ) has a unique mild

solution (since the C0 -semigroup generated by A is unique). Formula

(9.11.61) above is often called the variation of constants formula. Un-

der certain conditions on x and f it gives a classical solution of problem

(CP ). The following theorem is one such example.

Theorem 9.47. Let A : D(A) ⊂ X → X be the inﬁnitesimal gen-

erator of a C0 -semigroup, say {T (t); t ≥ 0}, and let x ∈ D(A) and

f ∈ C 1 ([0, r]; X). Then problem (CP ) has a unique classical solution

(given by (9.11.61)).

Proof. Uniqueness is already known (see the remark above). To prove

existence it suffices to show that the function

$$v(t) = \int_0^t T(t-s)f(s)\, ds$$

is a classical solution of (CP) with x = 0. By the change of variables s → t − s,

$$v(t) = \int_0^t T(s)f(t-s)\, ds,$$


so, since f ∈ C¹([0, r]; X), v is differentiable and

$$v'(t) = T(t)f(0) + \int_0^t T(s)f'(t-s)\, ds = T(t)f(0) + \int_0^t T(t-s)f'(s)\, ds \quad \forall t \in (0, r].$$

On the other hand, for each t ∈ (0, r) and h > 0 small enough, we have

$$\begin{aligned}
h^{-1}[T(h) - I]v(t) &= h^{-1}\int_0^t T(t+h-s)f(s)\, ds - h^{-1}v(t) \\
&= h^{-1}[v(t+h) - v(t)] - h^{-1}\int_t^{t+h} T(t+h-s)f(s)\, ds.
\end{aligned}$$

Letting h → 0⁺, we infer that v(t) ∈ D(A) and

$$Av(t) = v'(t) - f(t), \quad \forall t \in (0, r).$$

In fact, f can be extended to the right of t = r as a continuously differentiable function, so v(r) ∈ D(A) and there exists v′(r) = Av(r) + f(r). Even more, there exists v′(0) = f(0), so the function u(t) = T(t)x + v(t) is continuously differentiable on [0, r] and satisfies equation (CP)1 for all t ∈ [0, r].
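The variation of constants formula (9.11.61) can be verified directly against a standard ODE integrator when A is a matrix. The sketch below (ours, assuming numpy and scipy) evaluates the integral with a trapezoidal rule.

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

rng = np.random.default_rng(5)
d = 3
A = rng.standard_normal((d, d))
x0 = rng.standard_normal(d)
f = lambda t: np.array([np.sin(t), np.cos(2 * t), 1.0])

t_end = 1.0
s = np.linspace(0.0, t_end, 2001)
vals = np.array([expm((t_end - si) * A) @ f(si) for si in s])

# trapezoidal rule for the integral term of (9.11.61)
h = s[1] - s[0]
integral = h * (vals[0] / 2 + vals[1:-1].sum(axis=0) + vals[-1] / 2)
mild = expm(t_end * A) @ x0 + integral

# reference: integrate u' = Au + f directly
direct = solve_ivp(lambda t, u: A @ u + f(t), (0.0, t_end), x0,
                   rtol=1e-10, atol=1e-12).y[:, -1]

assert np.linalg.norm(mild - direct) < 1e-4
print(mild, direct)
```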

Remark 9.48. From the proof above we see that (under the conditions of Theorem 9.47)

$$u'(t) = T(t)Ax + T(t)f(0) + \int_0^t T(t-s)f'(s)\, ds \quad \forall t \in [0, r]. \tag{9.11.62}$$

Remark 9.49. Let A be the generator of a C0-semigroup {T(t); t ≥ 0} and let (x, f) ∈ X × L¹(0, r; X). If u is the corresponding mild solution of problem (CP), then it is the uniform limit of a sequence of C¹-solutions (hence classical solutions) of (CP). Indeed, let (x_n, f_n) be a sequence in D(A) × C¹([0, r]; X) which approximates (x, f) in X × L¹(0, r; X). For each (x_n, f_n) there exists a unique C¹-solution u_n of problem (CP) with x := x_n and f := f_n, given by the variation of constants formula:

$$u_n(t) = T(t)x_n + \int_0^t T(t-s)f_n(s)\, ds.$$


Then we have

$$\begin{aligned}
\|u_n(t) - u(t)\| &\le \|T(t)(x_n - x)\| + \int_0^t \|T(t-s)\| \cdot \|f_n(s) - f(s)\|\, ds \\
&\le M e^{\omega t}\|x_n - x\| + \int_0^t M e^{\omega(t-s)}\|f_n(s) - f(s)\|\, ds \\
&\le M e^{\omega r}\Big(\|x_n - x\| + \int_0^r \|f_n(s) - f(s)\|\, ds\Big).
\end{aligned}$$

Therefore, u_n → u in C([0, r]; X).

Remark 9.50. The semigroup approach can be used to solve Cauchy

problems for semilinear evolution equations. Speciﬁcally, let us con-

sider the following problem,

u (t) = Au(t) + f (t, u(t)), t ∈ [0, r]; u(0) = x ∈ X, (N CP )

where A is the inﬁnitesimal generator of a C0 -semigroup {T (t); t ≥

0} ⊂ L(X) and f : [0, r] × X → X is continuous and satisﬁes the

Lipschitz condition

$$\|f(t, x_1) - f(t, x_2)\| \le L\|x_1 - x_2\|, \quad (t, x_1), (t, x_2) \in [0, r] \times X.$$

Here L is a positive constant. One can consider the following “mild”

form for (NCP):

$$u(t) = T(t)x + \int_0^t T(t-s)f(s, u(s))\, ds, \quad t \in [0, r]. \tag{9.11.63}$$

If u is a classical solution of problem (N CP ), then it satisﬁes (9.11.63).

One can prove the existence of a solution u ∈ Y := C([0, r]; X) of

(9.11.63) by using the Banach Contraction Principle. For this purpose,

let us consider the Bielecki norm⁶ on Y:

$$\|g\|_B = \sup_{0 \le t \le r} e^{-\beta t}\|g(t)\|, \quad g \in Y,$$

where β > ω is a constant to be chosen later. This norm is equivalent

to the usual sup-norm of Y , so Y is a Banach space with respect to

· B . Deﬁne on Y an operator Q by

$$(Qu)(t) = T(t)x + \int_0^t T(t-s)f(s, u(s))\, ds, \quad t \in [0, r],\ u \in Y.$$

6. Adam Bielecki, Polish mathematician, 1910–2003.


For all u₁, u₂ ∈ Y and t ∈ [0, r] we have

$$\begin{aligned}
\|(Qu_1)(t) - (Qu_2)(t)\| &\le LM \int_0^t e^{\omega(t-s)}\|u_1(s) - u_2(s)\|\, ds \\
&= LM e^{\omega t} \int_0^t e^{(\beta-\omega)s}\, e^{-\beta s}\|u_1(s) - u_2(s)\|\, ds \\
&\le LM e^{\omega t}\, \|u_1 - u_2\|_B \int_0^t e^{(\beta-\omega)s}\, ds \\
&= \frac{LM}{\beta-\omega}\, \|u_1 - u_2\|_B\, \big(e^{\beta t} - e^{\omega t}\big) \\
&\le \frac{LM}{\beta-\omega}\, \|u_1 - u_2\|_B\, e^{\beta t}.
\end{aligned}$$

Thus,

$$e^{-\beta t}\|(Qu_1)(t) - (Qu_2)(t)\| \le \frac{LM}{\beta-\omega}\, \|u_1 - u_2\|_B, \quad t \in [0, r],\ u_1, u_2 \in Y,$$

which implies

$$\|Qu_1 - Qu_2\|_B \le \frac{LM}{\beta-\omega}\, \|u_1 - u_2\|_B, \quad u_1, u_2 \in Y.$$

Choosing β > ω + LM makes Q a contraction on (Y, ‖·‖_B), so the Banach Contraction Principle ensures the existence of a unique fixed point u of Q. This u is the unique solution in Y of Eq. (9.11.63), which can be called a mild solution of the given semilinear Cauchy problem. In general, a mild solution is not a classical one. However, under appropriate conditions on x and f it is.
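The fixed point of Q can be computed by simple Picard iteration, which is exactly what the contraction argument above suggests. A finite-dimensional sketch (ours, assuming numpy and scipy; the nonlinearity tanh is a convenient globally Lipschitz choice):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(6)
d = 3
A = rng.standard_normal((d, d)); A = A / np.linalg.norm(A, 2)
x0 = rng.standard_normal(d)
f = lambda t, u: np.tanh(u)            # globally Lipschitz, L = 1

r, N = 1.0, 200
ts = np.linspace(0.0, r, N + 1)
h = ts[1] - ts[0]
Ts = [expm(ti * A) for ti in ts]       # T(t_j) = e^{t_j A} on the grid

def Q(u):
    """Discretized (Qu)(t_j) = T(t_j)x0 + int_0^{t_j} T(t_j - s) f(s, u(s)) ds."""
    out = np.empty_like(u)
    out[0] = Ts[0] @ x0
    for j in range(1, N + 1):
        acc = np.zeros(d)
        for k in range(j + 1):
            w = 0.5 if k in (0, j) else 1.0        # trapezoid weights
            acc += w * (Ts[j - k] @ f(ts[k], u[k]))
        out[j] = Ts[j] @ x0 + h * acc
    return out

u = np.zeros((N + 1, d))
for _ in range(25):                     # Picard iterates of the contraction Q
    u = Q(u)

resid = np.max(np.linalg.norm(Q(u) - u, axis=1))
assert resid < 1e-8
print("fixed-point residual:", resid)
```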

9.12 Applications

In this section we illustrate the above theory with some applications.

Consider the heat (diffusion) equation

$$u_t = u_{xx} + f(t, x), \quad t \in [0, r],\ x \in (0, 1), \tag{9.12.64}$$

with the boundary conditions

$$u(t, 0) = u(t, 1) = 0, \quad t \in [0, r], \tag{9.12.65}$$

and the initial condition

$$u(0, x) = u_0(x), \quad x \in (0, 1), \tag{9.12.66}$$

where u = u(t, x) is the unknown function representing the temperature (or density in the case of a general diffusion process). We have denoted $u_t := \frac{\partial u}{\partial t}$ and $u_{xx} := \frac{\partial^2 u}{\partial x^2}$. In order to solve problem (9.12.64)–(9.12.66), we choose X = L²(0, 1) as the basic space equipped with the usual scalar product

$$\langle p, q\rangle = \int_0^1 p(x)q(x)\, dx,$$

and define the operator A : D(A) ⊂ X → X by

$$D(A) = H^2(0, 1) \cap H^1_0(0, 1), \quad Av = v'' = \frac{d^2 v}{dx^2}.$$

So, regarding u = u(t, x) as an X-valued function of t ∈ [0, r], problem (9.12.64)–(9.12.66) can be expressed as the Cauchy problem in X

$$\frac{d}{dt}u(t, \cdot) = Au(t, \cdot) + f(t, \cdot), \ t \in [0, r]; \quad u(0, \cdot) = u_0. \tag{9.12.67}$$

Note that the boundary conditions (9.12.65) are incorporated into the

deﬁnition of D(A). It turns out that A is the generator of a C0 -

semigroup of contractions, say {T (t) : X → X; t ≥ 0}, so there is

a unique mild solution u of problem (9.12.64)–(9.12.66) given by the

variation of constants formula (see (9.11.61))

$$u(t, \cdot) = T(t)u_0(\cdot) + \int_0^t T(t-s)f(s, \cdot)\, ds, \quad t \in [0, r]. \tag{9.12.68}$$

To show that A generates a C0-semigroup of contractions one could use the Hille–Yosida Theorem. A better option is to use the Lumer–Phillips Theorem. In fact, as X is a Hilbert (hence reflexive) space, it suffices to prove that A is an m-dissipative operator (cf. Theorem 9.27). This means that we do not need to check the density condition on D(A) (that actually follows by the density of C0∞(0, 1) in X and the obvious inclusion relation C0∞(0, 1) ⊂ D(A)).


The dissipativity of A follows easily by integration by parts: ⟨Av, v⟩ = −∫₀¹ v′(x)² dx ≤ 0 for all v ∈ D(A). It remains to check that for all λ > 0 we have R(λI − A) = X. In other words, for any λ > 0, g ∈ X, there exists a solution v ∈ H²(0, 1) of the following boundary value problem

$$\lambda v - v'' = g, \quad v(0) = 0 = v(1).$$

But this follows easily by imposing the boundary conditions on the general solution of the above differential equation.

One could also use Theorem 9.29 and the fact that A is a self-adjoint

operator.

According to Theorem 9.47 (see also its proof), if u0 ∈ D(A) =

H01 (0, 1)∩H 2 (0, 1) and f ∈ C 1 ([0, r]; X) then u ∈ C 1 ([0, r]; X). More-

over, since u satisﬁes the heat equation it follows that u ∈ C([0, r];

H 2 (0, 1)). Note that the condition u0 ∈ D(A) incorporates the com-

patibility of u0 with the boundary conditions: u0 (0) = u0 (1) = 0. It

is also worth pointing out that higher regularity of u can be obtained

under additional conditions on u0 and f .
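A discretized version of this setting is easy to experiment with. The sketch below (ours, assuming numpy and scipy) replaces A by the finite-difference Dirichlet Laplacian, a symmetric negative definite matrix, so e^{tA} is a contraction semigroup on the grid; u₀ = sin(πx) is a discrete eigenvector, which makes the solution explicit.

```python
import numpy as np
from scipy.linalg import expm

# Dirichlet Laplacian on (0,1) by central finite differences.
m = 60
h = 1.0 / (m + 1)
x = np.linspace(h, 1 - h, m)
A = (np.diag(-2 * np.ones(m)) + np.diag(np.ones(m - 1), 1)
     + np.diag(np.ones(m - 1), -1)) / h**2

u0 = np.sin(np.pi * x)                  # compatible datum: u0(0) = u0(1) = 0

for t in [0.01, 0.1, 0.5]:
    T = expm(t * A)
    assert np.linalg.norm(T, 2) <= 1 + 1e-10          # contraction semigroup
    u = T @ u0
    exact = np.exp(-np.pi**2 * t) * u0                # PDE eigenfunction solution
    assert np.linalg.norm(u - exact) <= 1e-2 * np.linalg.norm(exact) + 1e-8
print("heat semigroup checks passed")
```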

Analogous results hold in higher dimensions. Specifically, let Ω ⊂ Rⁿ, n ≥ 2, be a bounded domain with a sufficiently smooth boundary ∂Ω. Consider the n-dimensional heat equation

$$u_t = \Delta u + f(t, x), \quad t \in [0, r],\ x \in \Omega,$$

with the conditions

$$u = 0 \ \text{ on } \partial\Omega, \qquad u(0, x) = u_0(x), \ x \in \Omega.$$

Choose X = L²(Ω) and let A = Δ with D(A) = H0¹(Ω) ∩ H²(Ω). So the above

initial-boundary value problem can be viewed as a Cauchy problem in

X. The fact that A is a dissipative operator follows from Green’s for-

mula, and its m-dissipativity can be derived by using the Lax–Milgram

Theorem. The reader is encouraged to continue the discussion and de-

rive existence, uniqueness, and regularity of the solution to the above

problem. The reader could also consider the case of the homogeneous

Neumann or Robin boundary condition and investigate it along the

same lines.


Consider in a first stage the one-dimensional wave problem

$$u_{tt} = u_{xx} + f(t, x), \quad t \ge 0,\ x \in (0, 1), \tag{9.12.69}$$

$$u(t, 0) = u(t, 1) = 0, \quad t \ge 0, \tag{9.12.70}$$

$$u(0, x) = u_0(x), \quad u_t(0, x) = v_0(x), \quad x \in (0, 1), \tag{9.12.71}$$

describing the small vibrations u(t, x) of an elastic string fixed at both its ends (x = 0 and x = 1), where f(t, x) represents an external force.

Denoting v = u_t, problem (9.12.69)–(9.12.71) can be equivalently written as

$$\begin{cases} \frac{\partial}{\partial t}[u, v] = [v,\ u_{xx} + f], & t \ge 0,\ x \in (0, 1), \\ u(t, 0) = u(t, 1) = 0, & t \ge 0, \\ [u, v](0, x) = [u_0(x), v_0(x)], & x \in (0, 1). \end{cases}$$

Let X = H0¹(0, 1) × L²(0, 1) (the so-called phase space), which is a real Hilbert space with the scalar product

$$\big\langle [p_1, q_1], [p_2, q_2] \big\rangle = \int_0^1 p_1' p_2'\, dx + \int_0^1 q_1 q_2\, dx,$$

and define A : D(A) ⊂ X → X by

$$D(A) = [H^1_0(0, 1) \cap H^2(0, 1)] \times H^1_0(0, 1), \quad A[p, q] = [q, p''].$$

So the original problem can be rewritten as the Cauchy problem in X

$$\frac{d}{dt}[u(t, \cdot), v(t, \cdot)] = A[u(t, \cdot), v(t, \cdot)] + [0, f(t, \cdot)], \ t \ge 0; \quad [u, v](0, \cdot) = [u_0, v_0]. \tag{CP}$$

In order to apply the existence results for (CP), we are going to show in what follows that A is the generator of a C0-group of isometries. For this purpose, we can use Corollary 9.34.


First, noting that C0∞ (0, 1) is dense in H01 (0, 1) as well as in L2 (0, 1),

and C0∞ (0, 1) × C0∞ (0, 1) ⊂ D(A), we infer that the closure of D(A) in

X equals X. It is also easily seen that A is a closed operator. So we

need only to show that condition (kk)∗ of Corollary 9.34 is fulﬁlled.

Let λ ∈ (−∞, 0) ∪ (0, ∞) and let [g, h] be an arbitrary pair in X. We claim that there exists a unique [p, q] ∈ D(A) such that

$$(\lambda I - A)[p, q] = [g, h], \tag{9.12.72}$$

or, equivalently (eliminating q = λp − g), there exists a unique p ∈ H0¹(0, 1) ∩ H²(0, 1) satisfying the equation

$$\lambda^2 p = p'' + h + \lambda g.$$

We know from the preceding discussion on the heat equation that the last assertion is true. We also have q = λp − g ∈ H0¹(0, 1), which concludes the proof of our claim. Hence λI − A is invertible.

Now, multiplying Eq. (9.12.72) by [p, q] and taking into account the definition of A, we get

$$\lambda \|[p, q]\|_X^2 \underbrace{- \int_0^1 p' q'\, dx + \int_0^1 p' q'\, dx}_{=0} = \big\langle [g, h], [p, q] \big\rangle,$$

which implies

$$|\lambda| \cdot \|[p, q]\|_X^2 \le \|[g, h]\|_X \cdot \|[p, q]\|_X.$$

Therefore,

$$\|[p, q]\|_X \le \frac{1}{|\lambda|}\, \|[g, h]\|_X, \quad \lambda \in (-\infty, 0) \cup (0, \infty),$$

and so

$$\|(\lambda I - A)^{-1}\| \le \frac{1}{|\lambda|} \quad \forall \lambda \in (-\infty, 0) \cup (0, \infty).$$

Thus, according to Corollary 9.34, A generates a group of isometries,

say {G(t); t ∈ R} ⊂ L(X). Therefore, for all [u0 , v0 ] ∈ X and f ∈

L1loc ([0, ∞); X) there exists a unique mild solution [u, v] of (CP) given

by the variation of constants formula

$$[u(t, \cdot), v(t, \cdot)] = G(t)[u_0, v_0] + \int_0^t G(t-s)[0, f(s, \cdot)]\, ds, \quad t \ge 0,$$


hence u ∈ C([0, ∞); H01 (0, 1)). This u can be viewed as a general-

ized solution of problem (9.12.69)–(9.12.71). If [u0 , v0 ] ∈ D(A) =

[H01 (0, 1) ∩ H 2 (0, 1)] × H01 (0, 1) and f ∈ C 1 ([0, ∞); L2 (0, 1)), then

[u, v] ∈ C 1 ([0, ∞); X) (cf. Theorem 9.47). It follows that u ∈ C 2 ([0, ∞);

L2 (0, 1))∩C 1 ([0, ∞); H01 (0, 1))∩C([0, ∞); H 2 (0, 1)) and u is a classical

solution of problem (9.12.69)–(9.12.71).
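The isometry property of the group has a clean discrete analogue: for the finite-difference version of the phase-space operator, the discrete energy (the squared phase-space norm) is conserved exactly by e^{tA}. A sketch (ours, assuming numpy and scipy):

```python
import numpy as np
from scipy.linalg import expm

# Discrete phase-space wave operator A = [[0, I], [D, 0]], D the Dirichlet
# finite-difference Laplacian; the energy p^T(-D)p + q^T q is conserved.
m = 16
h = 1.0 / (m + 1)
D = (np.diag(-2 * np.ones(m)) + np.diag(np.ones(m - 1), 1)
     + np.diag(np.ones(m - 1), -1)) / h**2
A = np.block([[np.zeros((m, m)), np.eye(m)], [D, np.zeros((m, m))]])

rng = np.random.default_rng(7)
z0 = rng.standard_normal(2 * m)

def energy(z):
    p, q = z[:m], z[m:]
    return p @ (-D) @ p + q @ q

E0 = energy(z0)
for t in [-1.0, 0.3, 2.0]:             # a group: negative times allowed
    z = expm(t * A) @ z0
    assert abs(energy(z) - E0) <= 1e-6 * E0
print("phase-space energy conserved:", E0)
```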

Consider next the n-dimensional wave problem

$$\begin{cases} u_{tt} - \Delta u = f(t, x), & t \ge 0,\ x \in \Omega, \\ u(t, x) = 0, & t \ge 0,\ x \in \partial\Omega, \\ u(0, x) = u_0(x), \quad u_t(0, x) = v_0(x), & x \in \Omega, \end{cases}$$

where Ω ⊂ Rⁿ is a bounded domain with a sufficiently smooth

boundary ∂Ω, and Δ is the Laplacian with respect to x. In this case,

using the substitution v = ut again, the above initial-boundary value

problem can similarly be expressed as a Cauchy problem in the phase

space X = H01 (Ω) × L2 (Ω), associated with the operator A : D(A) ⊂

X → X deﬁned by

$$D(A) = \big[H^1_0(\Omega) \cap H^2(\Omega)\big] \times H^1_0(\Omega), \quad A[p, q] = [q, \Delta p].$$

As in the one-dimensional case, one shows that A generates a C0-group of isometries on X. In particular, to show that Eq. (9.12.72)

has a solution in D(A) we need to use Green’s formula (instead of

integration by parts) and Lax–Milgram. The rest follows similarly.

The case of the homogeneous Neumann or Robin boundary condition

can be addressed in a similar manner.

Let a be a given vector in Rⁿ, n ≥ 1. Consider the equation

$$u_t + a \cdot \nabla u = f(t, x), \quad t \ge 0,\ x \in \mathbb{R}^n, \tag{9.12.73}$$

with the initial condition

$$u(0, x) = u_0(x), \quad x \in \mathbb{R}^n, \tag{9.12.74}$$

where $a \cdot \nabla u = \sum_{i=1}^n a_i \frac{\partial u}{\partial x_i}$, the dot denoting the usual scalar product in Rⁿ. Equation (9.12.73) is known as the transport equation. The case a = 0 is trivial, so in what follows we assume a ≠ 0 (i.e., a = (a₁, . . . , aₙ) contains nonzero components).


Let us choose X = Lp (Rn ), p ∈ (1, ∞), equipped with the usual norm.

If f ≡ 0 and u0 is a smooth function, then the solution of problem

(9.12.73) and (9.12.74) is given by

$$u(t, x) = u_0(x - ta), \quad t \ge 0,\ x \in \mathbb{R}^n.$$

This suggests defining the family of translation operators {T(t) : X → X; t ≥ 0}, (T(t)v)(x) = v(x − ta). Obviously, each T(t) is linear and preserves the norm of X. Moreover,

$$\lim_{t\to 0^+} \|T(t)v - v\|_X^p = \lim_{t\to 0^+} \int_{\mathbb{R}^n} |v(x - ta) - v(x)|^p\, dx = 0, \quad \forall v \in X,$$

so {T(t); t ≥ 0} is a C0-semigroup (the semigroup property is obvious).

In order to determine its inﬁnitesimal generator A : D(A) ⊂ X → X,

consider Eq. (9.12.73) with f ≡ 0 and deduce Av = −a · ∇v for all

v ∈ D(A). This follows from the fact that the right derivative of

t → T (t)v at t = 0 is equal to Av. Indeed, if v ∈ C0∞ (Rn ) (which is

dense in X), then v ∈ D(A) and

$$\lim_{h\to 0^+} \Big\| \frac{T(h)v - v}{h} + a \cdot \nabla v \Big\|_X^p = \lim_{h\to 0^+} \int_{\mathbb{R}^n} \big|h^{-1}[v(x - ha) - v(x)] + a \cdot \nabla v(x)\big|^p\, dx = 0,$$

(one can pass to the limit, by dominated convergence, as h → 0⁺ under the above integral). Since the range of A must

be a subset of X, the maximal domain of A is

$$D(A) = \Big\{v \in X;\ \frac{\partial v}{\partial x_i} \in X \ \text{for all } i \in \{1, \dots, n\} \text{ for which } a_i \ne 0\Big\},$$

where ∂v/∂xᵢ denotes the partial derivative of v with respect to xᵢ in the

sense of distributions. Since C0∞ (Rn ) is dense in X and C0∞ (Rn ) ⊂

D(A) it follows that D(A) is dense in X. Obviously, A is a closed

operator. We can use Theorem 9.29 to prove that A is a generator (the

generator of {T (t) : X → X; t ≥ 0}). Indeed, for all u ∈ D(A) \ {0}


and $u^* = J(u) = \|u\|_X^{2-p}\, |u|^{p-2}u$ (here J denotes the duality mapping of X), we have

$$\begin{aligned}
u^*(Au) &= \|u\|_X^{2-p} \int_{\mathbb{R}^n} Au \cdot |u|^{p-2}u \, dx \\
&= -\|u\|_X^{2-p} \sum_{i=1}^n a_i \int_{\mathbb{R}^n} \frac{\partial u}{\partial x_i}\, |u|^{p-2}u \, dx \\
&= -\frac{1}{p}\, \|u\|_X^{2-p} \sum_{i=1}^n a_i \int_{\mathbb{R}^n} \frac{\partial}{\partial x_i} |u|^p \, dx \\
&= 0,
\end{aligned}$$

so A is dissipative. To derive

the last equality, we have used the fact that the function $g(x_i) = \int_{\mathbb{R}^{n-1}} |u|^p \, dx_1 \dots dx_{i-1}\, dx_{i+1} \dots dx_n$ belongs to $W^{1,1}(\mathbb{R})$, so g(xᵢ) → 0 as |xᵢ| → ∞ (prove it, or see [6, Corollary

8.9, p. 214]). Let X ∗ = Lq (R) be the dual of X (i.e., p1 + 1q = 1). The

adjoint A∗ of A is deﬁned by

$$D(A^*) = \Big\{w \in X^*;\ \frac{\partial w}{\partial x_i} \in X^* \ \forall i \in \{1, \dots, n\} \text{ for which } a_i \ne 0\Big\}, \quad A^* w = a \cdot \nabla w.$$

By a computation similar to the one above, we infer that the operator A* is also dissipative. Thus, according to Theorem 9.29, A is m-dissipative, hence it is indeed the generator of {T(t) : X → X; t ≥ 0}. In fact, this semigroup extends to a C0-group of isometries,

$$\big(T(t)v\big)(x) = v(x - ta), \quad x \in \mathbb{R}^n,\ t \in \mathbb{R}.$$

Therefore, for all u₀ ∈ X = L^p(Rⁿ) and f ∈ L¹(0, ∞; X), problem (9.12.73) and (9.12.74) has a unique mild solution u,

$$u(t, x) = \big(T(t)u_0\big)(x) + \int_0^t \big(T(t-s)f(s, \cdot)\big)(x)\, ds = u_0(x - ta) + \int_0^t f(s, x - (t-s)a)\, ds, \quad \forall t \ge 0.$$

Under additional assumptions on u₀ and f (cf. Theorem 9.47), u is a classical solution, with the additional property a · ∇u ∈ C([0, ∞); X).
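The translation group structure is easy to observe on a grid. The sketch below (ours, assuming numpy; the periodic grid and integer-cell shifts via np.roll are our simplifying assumptions) checks the group property and the L^p isometry.

```python
import numpy as np

# Translation group on a 1-D periodic grid: (T(t)v)(x) = v(x - t a),
# realized exactly by np.roll when t*a is a whole number of grid cells.
rng = np.random.default_rng(8)
v = rng.standard_normal(200)
a, dx = 1.0, 0.01
p = 3.0

def T(t, w):
    k = int(round(t * a / dx))        # times chosen so t*a/dx is an integer
    return np.roll(w, k)

lp = lambda w: (np.sum(np.abs(w) ** p) * dx) ** (1.0 / p)

s, t = 0.13, 0.42
assert np.allclose(T(t, T(s, v)), T(t + s, v))    # group property
assert abs(lp(T(t, v)) - lp(v)) < 1e-12           # isometry on L^p
print("translation group checks passed")
```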


In particular, for n = 1 the family {T(t); t ∈ R} is a group of translations defined on X = L^p(R).

Remark 9.52. Since the above operator A generates a C0-group of isometries, it follows by Corollary 9.34 that R \ {0} ⊂ ρ(A). Therefore, for all λ ∈ R \ {0} and g ∈ X = L^p(Rⁿ) there exists a unique solution u ∈ D(A) of the equation

$$\lambda u + a \cdot \nabla u = g.$$

For an electrical long line we have the following PDE system, called the telegraph system (see, e.g., [36, p. 320]),

$$\begin{cases} L i_t + v_x + R i = e(t, x), \\ C v_t + i_x + G v = 0, \end{cases} \quad t \ge 0,\ x \in (0, 1),$$

with the boundary conditions (Ohm's law at both ends of the line)

$$v(t, 0) + R_0\, i(t, 0) = 0, \quad v(t, 1) = R_1\, i(t, 1), \quad t \ge 0,$$

and the initial conditions i(0, x) = i₀(x), v(0, x) = v₀(x), x ∈ (0, 1). Here i = i(t, x) represents the current through the line and v = v(t, x) represents the voltage across the line; R ≥ 0, R₀ > 0, R₁ > 0, L > 0, C > 0, G ≥ 0 are constants representing resistances, inductance, capacitance, and conductance, respectively; e = e(t, x) is the voltage per unit length impressed along the line in series with it.

We regard the unknown pair [i, v] as a function of t ≥ 0 with values in X = L²(0, 1) × L²(0, 1). Consider in X the scalar product

$$\big\langle [f_1, g_1], [f_2, g_2] \big\rangle = L \int_0^1 f_1 f_2\, dx + C \int_0^1 g_1 g_2\, dx,$$

with respect to which X is a real Hilbert space. Define A : D(A) ⊂ X → X by

$$D(A) = \{[f, g] \in H^1(0, 1)^2;\ g(0) + R_0 f(0) = 0,\ R_1 f(1) = g(1)\},$$

$$A[f, g] = \Big[-\frac{1}{L}\big(g' + Rf\big),\ -\frac{1}{C}\big(f' + Gg\big)\Big].$$


Operator A is densely deﬁned, since C0∞ (0, 1) × C0∞ (0, 1) ⊂ D(A) and

is dense in X. It is also easily seen that A is a closed operator: it suﬃces

to note that the derivative is a closed operator in L2 (0, 1) and that con-

vergence in H 1 (0, 1) implies convergence in C[0, 1] (cf. Arzelà–Ascoli).

It turns out that A is an m-dissipative operator (thus conﬁrming the

fact that A is densely deﬁned and closed, cf. Theorems 9.10 and 9.27).

Indeed, for all [f, g] ∈ D(A) we have

$$\begin{aligned}
\big\langle A[f, g], [f, g] \big\rangle &= \Big\langle \Big[-\tfrac{1}{L}(g' + Rf), -\tfrac{1}{C}(f' + Gg)\Big], [f, g] \Big\rangle \\
&= -\int_0^1 f(g' + Rf)\, dx - \int_0^1 g(f' + Gg)\, dx \\
&= -\int_0^1 (fg)'\, dx - R\int_0^1 f^2\, dx - G\int_0^1 g^2\, dx \\
&\le -\int_0^1 (fg)'\, dx \\
&= f(0)g(0) - f(1)g(1) \\
&= -R_0 f(0)^2 - R_1 f(1)^2 \\
&\le 0,
\end{aligned}$$

that is to say, A is dissipative (with respect to the scalar product ⟨·, ·⟩).

Let us now prove that R(λI − A) = X for all λ > 0, i.e., for all λ > 0 and [h, k] ∈ X there exists a solution [f, g] ∈ D(A) of the equation λ[f, g] − A[f, g] = [h, k], which reduces to the problem

$$\begin{cases} f' + (C\lambda + G)g = Ck, \\ g' + (L\lambda + R)f = Lh, \\ g(0) + R_0 f(0) = 0, \quad R_1 f(1) = g(1). \end{cases}$$

One can write down the general solution of this first-order linear differential system (see the solution of Exercise 9.13) and then impose upon it the above boundary conditions to deduce that there exists a unique [f, g] ∈ D(A)

satisfying the problem. The details are left to the reader. Thus, A

is m-dissipative, so it generates a C0 -contraction semigroup on X,

say {T (t) : X → X; t ≥ 0} (cf. Theorem 9.27). Therefore, for all

[i0 , v0 ] ∈ X and e ∈ L1loc ([0, ∞); L2 (0, 1)) there exists a unique mild


solution [i, v] of the Cauchy problem

$$\frac{d}{dt}[i(t, \cdot), v(t, \cdot)] = A[i(t, \cdot), v(t, \cdot)] + \Big[\frac{1}{L}\,e(t, \cdot), 0\Big], \ t \ge 0; \quad [i, v](0, \cdot) = [i_0, v_0],$$

associated with the problem formulated above. This mild solution can be written explicitly in terms of T(t), i₀, v₀, and e, by using the usual variation of constants formula.

If [i0 , v0 ] ∈ D(A) and e ∈ C 1 ([0, ∞); L2 (0, 1)), then [i, v] is a classi-

cal solution, [i, v] ∈ C 1 ([0, ∞); X) ∩ C([0, ∞); H 1 (0, 1)2 ). It is worth

pointing out that the condition [i0 , v0 ] ∈ D(A) implies compatibility

of the initial data with the boundary conditions and, as a by-product

of this compatibility plus smoothness of function e, we obtain a clas-

sical solution [i, v] with the above properties. In particular i, v are

continuous on [0, ∞) × [0, 1] and satisfy the boundary conditions for

all t ≥ 0.

Remark 9.53. All the above applications can be extended to the semi-

linear case, as pointed out in Remark 9.50.

In this chapter we have presented the basic theory of semigroups of linear operators, including its implications to linear evolution equations and some applications. Some subjects in

the ﬁeld have not been addressed, e.g., semigroups of compact opera-

tors, diﬀerentiable semigroups, analytic semigroups, dual semigroups,

etc. For more information about linear operator semigroups and their

applications, the reader is referred to [7], [12], [19], [21], [39], [49], [51].

For more details on the regularity of solutions to linear evolution equa-

tions, including signiﬁcant examples from the theory of linear partial

diﬀerential equations, see [6], [19], [39], [49].

9.13 Exercises

1. Compute T(t) = e^{tA}, t ∈ R, where

$$\text{(i) } A = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}; \quad \text{(ii) } A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}; \quad \text{(iii) } A = \begin{pmatrix} -1 & -1 \\ 2 & -4 \end{pmatrix}.$$

2. Let A be a real n × n matrix. Show that the following assertions hold true:

(a) {e^{tA}; t ≥ 0} is bounded in norm ⇐⇒ all eigenvalues λ of A satisfy Re λ ≤ 0 and whenever Re λ = 0, then λ is a simple eigenvalue;

(b) lim_{t→∞} e^{tA} = 0 ⇐⇒ all eigenvalues λ of A satisfy Re λ < 0.

3. Let A ∈ L(X). Solve in a Banach space X the Cauchy problem

$$u'(t) = Au(t), \ t \in \mathbb{R}; \quad u(0) = u_0.$$

4. Prove that for any C0-semigroup {T(t) : X → X; t ≥ 0} the X-valued function (t, x) → T(t)x is continuous on [0, ∞) × X.

5. Let X be the space of all functions f : R → R which are continuous and bounded, equipped with the sup-norm. For some λ > 0 and δ > 0 define G(t) : X → X by

$$(G(t)f)(x) = e^{-\lambda t} \sum_{k=0}^{\infty} \frac{(\lambda t)^k}{k!}\, f(x - k\delta), \quad t \in \mathbb{R},\ f \in X,\ x \in \mathbb{R}.$$

(a) Show that {G(t) : X → X; t ∈ R} is a C0-group and determine its infinitesimal generator;

(b) Show that

$$\|G(t)\| = \begin{cases} 1 & \text{if } t \ge 0, \\ e^{-2\lambda t} & \text{if } t < 0. \end{cases}$$

6. Let X be the space of all functions f : R → R that are continuous on R and p-periodic with some period p > 0, equipped with the sup-norm

$$\|f\| = \sup_{0 \le s \le p} |f(s)| \quad \forall f \in X.$$

Define

$$(T(t)f)(s) = f(t + s), \quad t, s \in \mathbb{R},\ f \in X.$$

Show that {T(t) : X → X; t ∈ R} is a C0-group of isometries, i.e., ‖T(t)‖ = 1, t ∈ R. Find its infinitesimal generator.


7. Let M = (m_{ij}) be a k × k real matrix and let X = L^p(Rᵏ), where p ∈ [1, ∞). For t ∈ R define G(t) : X → X by

$$(G(t)f)(x) = f(e^{-tM}x), \quad f \in X, \ \text{a.a. } x \in \mathbb{R}^k.$$

(a) Show that {G(t) : X → X; t ∈ R} is a C0-group and determine its infinitesimal generator;

(b) Show that if $\sum_{i=1}^k m_{ii} = 0$, then ‖G(t)‖ = 1 for all t ∈ R.

8. Let X be the space of all functions f : [0, ∞) → R that are bounded and uniformly continuous on [0, ∞), equipped with the usual sup-norm. Define

$$(T(t)f)(s) = \begin{cases} f(s - t) & \text{for } s - t \ge 0, \\ f(0) & \text{for } s - t < 0. \end{cases}$$

Show that {T(t) : X → X; t ≥ 0} is a C0-semigroup and determine its infinitesimal generator.

9. Consider the Banach space X = l^p, p ∈ [1, ∞), of all sequences (xₙ)ₙ∈ℕ in R satisfying $\sum_{n=1}^\infty |x_n|^p < \infty$, equipped with the usual norm

$$\|(x_n)\|_p = \Big(\sum_{n=1}^{\infty} |x_n|^p\Big)^{1/p} \quad \forall (x_n)_{n\in\mathbb{N}} \in X.$$

For a given sequence (cₙ)ₙ∈ℕ of nonnegative reals, define T(t) : X → X by T(t)(xₙ) = (e^{−cₙt}xₙ), t ≥ 0.

(a) Show that {T(t) : X → X; t ≥ 0} is a C0-semigroup of contractions;

(b) Determine its infinitesimal generator;

(c) Prove that {T(t) : X → X; t ≥ 0} is uniformly continuous if and only if (cₙ) is bounded.

10. Let H = L²(0, 1) be equipped with the usual scalar product and the corresponding induced norm. Define A : D(A) ⊂ H → H by

$$D(A) = \{v \in H^1(0, 1);\ v(0) = 0\}, \quad Av = -v'.$$

Show that A generates a C0-semigroup of contractions {T(t) : H → H; t ≥ 0}. Find the explicit form of this semigroup and

show that, for u0 ∈ H, u(t, x) = (T (t)u0 )(x) satisﬁes the trans-

port equation ut + ux = 0 in Ω = (0, ∞) × (0, 1) in the sense of

distributions.

11. Consider the initial-boundary value problem

$$\begin{cases} u_t - u_{xx} + au = f(t, x), & t > 0,\ x \in (0, 1), \\ u(t, 0) = 0, \quad u_x(t, 1) + \alpha u(t, 1) = 0, & t > 0, \\ u(0, x) = u_0(x), & x \in (0, 1), \end{cases}$$

where a ∈ R, α > 0, u₀ ∈ L²(0, 1), f ∈ L¹_{loc}([0, ∞); L²(0, 1)). Solve this problem using the semigroup approach. Solve the more general problem obtained by replacing the term au in the above equation by h(u), where h : R → R is a Lipschitz function.

12. Consider the initial-boundary value problem

$$\begin{cases} u_{tt} - u_{xx} = f(t, x), & t > 0,\ x \in (0, 1), \\ u(t, 0) = 0, \quad u_x(t, 1) = 0, & t > 0, \\ u(0, x) = u_0(x), \quad u_t(0, x) = v_0(x), & x \in (0, 1), \end{cases}$$

where u₀ ∈ H¹(0, 1), u₀(0) = 0, v₀ ∈ L²(0, 1), f ∈ L¹_{loc}([0, ∞); L²(0, 1)). Solve this problem using the semigroup approach.

13. Consider the telegraph differential system

$$L i_t + v_x + R i = e(t, x), \qquad C v_t + i_x + G v = 0, \quad t \ge 0,\ x \in (0, 1),$$

with the following boundary conditions

$$v(t, 0) + R_0\, i(t, 0) = 0, \quad -i(t, 1) + C_1 v_t(t, 1) + D_1 v(t, 1) = e_1(t), \quad t > 0,$$

and initial conditions

$$i(0, x) = i_0(x), \quad v(0, x) = v_0(x), \quad x \in (0, 1),$$

where C > 0, C₁ > 0, L > 0, D₁ ≥ 0, G ≥ 0, R ≥ 0, R₀ ≥ 0, and e, e₁ are given functions.

(a) Solve the above problem by using the semigroup approach;

(b) What can you say about existence in the case when D1 , G, R

are Lipschitz functions from R into itself?

Chapter 10
Solving Linear Evolution Equations by the Fourier Method

In the previous chapter we used the semigroup approach to solve inhomogeneous linear evolution equations. For the same purpose, we use here the Fourier method. More precisely, under appropriate conditions on the linear operators governing such equations, we find the solutions in the form of Fourier series expansions. This approach is based in an essential way on the results discussed in Chap. 8.

10.1 First Order Linear Evolution Equations

Consider the Cauchy problem

$$u'(t) + Qu(t) = f(t), \quad t \in (0, T), \tag{E}$$

$$u(0) = u_0, \tag{IC}$$

where Q satisfies the set of conditions (a) originally presented in Chap. 8:

G. Moroşanu, Functional Analysis for the Applied Sciences, Universitext, https://doi.org/10.1007/978-3-030-27153-4_10


(a) Q : D(Q) ⊂ H → H is a densely defined, symmetric, strongly positive operator, where (H, (·, ·), ‖·‖) is a real, infinite dimensional, separable Hilbert space;

(b) the energy space H_E associated with Q is compactly embedded in H, so that Theorem 8.16 applies to Q; the notation of that theorem will be also used in what follows.

Note that −Q generates a C0-semigroup of contractions (see Theorem 9.29), and so for any u0 ∈ H and f ∈ L¹(0, T; H) there exists a unique mild solution u = u(t) of problem (E), (IC) given by the variation of constants formula. If u0 ∈ D(Q) and f ∈ C¹([0, T]; H), then u is a classical solution (cf. Theorem 9.47). The

Fourier method we are going to discuss next oﬀers more possibilities

to investigate the regularity of solutions and provides good approxi-

mations of solutions in terms of eigenfunctions of the operator Q.

Let us start with a speciﬁc result.

Theorem 10.1. Assume that (a) and (b) above are fulfilled. Then for all u0 ∈ H and f ∈ L²(0, T; H) there exists a unique function u ∈ C([0, T]; H) ∩ C((0, T]; H_E) ∩ L²(0, T; H_E) with √t·u′ ∈ L²(0, T; H) which satisfies (IC) and Eq. (E) for a.a. t ∈ (0, T). This function u (called a strong solution of problem (E), (IC)) is expressed as the Fourier series expansion

$$u(t) = \sum_{n=1}^{\infty} u_n(t) e_n, \tag{10.1.1}$$

where {eₙ}ₙ₌₁^∞ is the orthonormal basis in H provided by Theorem 8.16, and uₙ(t) = (u(t), eₙ), n = 1, 2, . . . If u0 ∈ H_E and f ∈ L²(0, T; H), then u ∈ H¹(0, T; H) ∩ C([0, T]; H_E), u(t) ∈ D(Q) for a.a. t ∈ (0, T), and Qu ∈ L²(0, T; H).

Proof. As noted before, we already know that problem (E), (IC) has a unique mild solution u given by the variation of constants formula. A strong solution is clearly a mild one, so the uniqueness part of the theorem is obvious. Indeed, if y denotes the difference of two strong solutions of problem (E), (IC), then y(0) = 0 and

$$y'(t) + Qy(t) = 0 \quad \text{for a.a. } t \in (0, T)\,.$$

Multiplying this equation by y(t) and taking into account the positivity of Q we obtain

$$\frac{1}{2}\frac{d}{dt}\|y(t)\|^2 = \big(y'(t), y(t)\big) \le 0 \quad \text{for a.a. } t \in (0, T)\,,$$

which shows that the function t ↦ ‖y(t)‖ is nonincreasing on [0, T]. Since y(0) = 0 it follows that y is the null function, i.e., the two strong solutions coincide.

We could show that, under our assumptions, the mild solution u is in fact a strong solution by a limiting procedure applied to a sequence of strong solutions $u^k \in C^1([0, T]; H)$ (given by Theorem 9.47) corresponding to sequences $u_0^k \in D(Q)$ and $f^k \in C^1([0, T]; H)$ which satisfy $\|u_0^k - u_0\| \to 0$, $\|f^k - f\|_{L^2(0,T;H)} \to 0$. However, we shall provide here the existence proof using the Fourier method. Specifically, we seek a solution in the form (10.1.1) where the un's are unknown real valued functions. For u0 we have the Fourier expansion

$$u_0 = \sum_{n=1}^{\infty} u_{0n} e_n \quad\text{with}\quad u_{0n} = (u_0, e_n)\,, \qquad \|u_0\|^2 = \sum_{n=1}^{\infty} u_{0n}^2\,.$$

Similarly, for a.a. t ∈ (0, T),

$$f(t) = \sum_{n=1}^{\infty} f_n(t) e_n \quad\text{with}\quad f_n(t) = (f(t), e_n)\,, \qquad \|f(t)\|^2 = \sum_{n=1}^{\infty} f_n(t)^2\,.$$

Denoting $s_k(t) = \sum_{n=1}^{k} f_n(t) e_n$, we can see that

$$\|s_k(t)\|^2 = \sum_{n=1}^{k} f_n(t)^2 \le \|f(t)\|^2 \quad \forall k \in \mathbb{N},\ \text{a.a. } t \in (0, T)\,,$$

so, by the Dominated Convergence Theorem, $s_k$ converges to f in L2(0, T; H). Now we impose conditions on u (given by (10.1.1)) to formally satisfy Eq. (E),

$$\sum_{n=1}^{\infty} u_n'(t)\, e_n + \sum_{n=1}^{\infty} \lambda_n u_n(t)\, e_n = \sum_{n=1}^{\infty} f_n(t)\, e_n\,,$$

and (IC),

$$\sum_{n=1}^{\infty} u_n(0)\, e_n = \sum_{n=1}^{\infty} u_{0n}\, e_n\,.$$

It follows that

$$u_n'(t) + \lambda_n u_n(t) = f_n(t) \quad \text{for all } n \in \mathbb{N} \text{ and a.a. } t \in (0, T)\,, \qquad (10.1.2)$$

$$u_n(0) = u_{0n}\,, \quad n \in \mathbb{N}\,, \qquad (10.1.3)$$

hence

$$u_n(t) = e^{-\lambda_n t} u_{0n} + \int_0^t e^{-\lambda_n (t-s)} f_n(s)\, ds \quad \forall t \in [0, T],\ n \in \mathbb{N}\,.$$
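This variation of constants formula is easy to check numerically for a single mode. The following Python sketch (the name `mode_solution` and the trapezoidal quadrature are our own choices, not part of the text) evaluates it and can be compared with the closed form obtained for $f_n \equiv 1$, namely $u_n(t) = e^{-\lambda_n t} u_{0n} + (1 - e^{-\lambda_n t})/\lambda_n$:

```python
import math

def mode_solution(lam, u0n, f, t, steps=2000):
    """Variation of constants formula for one Fourier mode:
    u_n(t) = e^{-lam t} u0n + int_0^t e^{-lam (t-s)} f(s) ds,
    with the integral approximated by the trapezoidal rule."""
    h = t / steps
    integral = 0.5 * (math.exp(-lam * t) * f(0.0) + f(t))
    for k in range(1, steps):
        s = k * h
        integral += math.exp(-lam * (t - s)) * f(s)
    integral *= h
    return math.exp(-lam * t) * u0n + integral

# For f ≡ 1 the closed form is e^{-lam t} u0n + (1 - e^{-lam t}) / lam:
approx = mode_solution(2.0, 1.0, lambda s: 1.0, 0.5)
exact = math.exp(-1.0) + (1.0 - math.exp(-1.0)) / 2.0
```

The quadrature error is O(h²), so with 2000 steps `approx` agrees with `exact` to well below 1e-6.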

By Hölder's inequality,

$$u_n(t)^2 \le 2\Big(u_{0n}^2 + T \int_0^T f_n(s)^2\, ds\Big) \quad \forall t \in [0, T],\ n \in \mathbb{N}\,. \qquad (10.1.4)$$

Since $u_{0n}^2$ and $\int_0^T f_n(s)^2\, ds$ are terms of convergent series, it follows from (10.1.4), by the Weierstrass M-test, that the series $\sum_{n=1}^{\infty} u_n(t)^2$ is uniformly convergent in [0, T]; consequently so is the series (10.1.1), and its sum u is in C([0, T]; H).

Next, we multiply Eq. (10.1.2) by $t\,u_n'(t)$ and then integrate the resulting equation over [0, T] to obtain, ∀n ∈ N,

$$\int_0^T t\, u_n'(t)^2\, dt + \frac{\lambda_n}{2}\, T\, u_n(T)^2 = \frac{\lambda_n}{2} \int_0^T u_n(t)^2\, dt + \int_0^T t\, f_n(t)\, u_n'(t)\, dt$$
$$\le \frac{\lambda_n}{2} \int_0^T u_n(t)^2\, dt + \frac{1}{2} \int_0^T t\, u_n'(t)^2\, dt + \frac{1}{2} \int_0^T t\, f_n(t)^2\, dt\,. \qquad (10.1.5)$$

On the other hand, multiplying (10.1.2) by $u_n(t)$ and then integrating over [0, T] we obtain

$$\frac{1}{2} u_n(T)^2 - \frac{1}{2} u_{0n}^2 + \lambda_n \int_0^T u_n(t)^2\, dt = \int_0^T f_n(t)\, u_n(t)\, dt \le \frac{1}{2} \int_0^T f_n(t)^2\, dt + \frac{1}{2} \int_0^T u_n(t)^2\, dt\,,$$

for all n ∈ N, so

$$\sum_{n=1}^{\infty} \lambda_n \int_0^T u_n(t)^2\, dt < \infty\,, \qquad (10.1.6)$$

hence

$$\sum_{n=1}^{\infty} \big(u(t), \lambda_n^{-1/2} e_n\big)_E^2 = \sum_{n=1}^{\infty} \big(u(t), \lambda_n^{-1/2} Q e_n\big)^2 = \sum_{n=1}^{\infty} \lambda_n u_n(t)^2$$

is convergent for a.a. t ∈ (0, T), and $t \mapsto \|u(t)\|_E^2 = \sum_{n=1}^{\infty} \lambda_n u_n(t)^2$ is summable on (0, T), i.e., u ∈ L2(0, T; HE).
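In a finite-dimensional model (H = R³ with Q = diag(λ1, λ2, λ3), so that (p, q)_E = (Qp, q)), the orthonormality of $\{\lambda_n^{-1/2} e_n\}$ in HE and the identity $(u, \lambda_n^{-1/2} e_n)_E = \sqrt{\lambda_n}\, u_n$ used above can be verified directly. The following Python sketch (all names are ours) does so:

```python
import math

def e_inner(lams, p, q):
    """Energetic inner product (p, q)_E = (Qp, q) for Q = diag(lams)."""
    return sum(l * a * b for l, a, b in zip(lams, p, q))

lams = [1.0, 4.0, 9.0]
# Rescaled basis vectors lam_n^{-1/2} e_n of R^3:
basis = [[1.0 / math.sqrt(lams[i]) if j == i else 0.0 for j in range(3)]
         for i in range(3)]

u = [0.3, -1.2, 0.7]                          # coordinates u_n = (u, e_n)
coeff = [e_inner(lams, u, b) for b in basis]  # = sqrt(lam_n) * u_n
```

Summing the squares of `coeff` then reproduces the energetic norm $\|u\|_E^2 = \sum_n \lambda_n u_n^2$.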

From (10.1.5) and (10.1.6) we infer that

$$\sum_{n=1}^{\infty} \int_0^T t\, u_n'(t)^2\, dt < \infty\,,$$

so $\sqrt{t}\,u' \in L^2(0, T; H)$. We also have the inequality (similar to (10.1.5))

$$\frac{\lambda_n}{2}\, t\, u_n(t)^2 \le -\frac{1}{2} \int_0^t s\, u_n'(s)^2\, ds + \frac{\lambda_n}{2} \int_0^T u_n(s)^2\, ds + \frac{1}{2} \int_0^T s\, f_n(s)^2\, ds\,,$$

which combined with (10.1.6) implies (by the Weierstrass M-test) that $\sum_{n=1}^{\infty} \lambda_n\, t\, u_n(t)^2$ is uniformly convergent in [0, T], so $\sqrt{t}\,u \in C([0, T]; H_E)$. This shows that u ∈ C((0, T]; HE).

Now, passing to the limit in L2(0, T; H) as k → ∞ in the equation

$$\sum_{n=1}^{k} f_n(t)\, e_n = \sum_{n=1}^{k} u_n'(t)\, e_n + \sum_{n=1}^{k} \lambda_n u_n(t)\, e_n = \sum_{n=1}^{k} u_n'(t)\, e_n + Q\Big(\sum_{n=1}^{k} u_n(t)\, e_n\Big)\,,$$

we conclude that u satisfies Eq. (E) for a.a. t ∈ (0, T). This uses the fact that Q is a closed operator. It is also obvious that u(0) = u0.

Now, let us assume that u0 ∈ HE and f ∈ L2(0, T; H). Multiplying Eq. (10.1.2) by $u_n'(t)$ we obtain

$$\frac{\lambda_n}{2} \frac{d}{dt}\, u_n(t)^2 + u_n'(t)^2 = f_n(t)\, u_n'(t) \quad \text{for a.a. } t \in (0, T),\ \forall n \in \mathbb{N}\,. \qquad (10.1.7)$$

Integrating over [0, T] yields

$$\int_0^T u_n'(t)^2\, dt + \frac{\lambda_n}{2}\big(u_n(T)^2 - u_{0n}^2\big) = \int_0^T f_n(t)\, u_n'(t)\, dt \le \frac{1}{2} \int_0^T f_n(t)^2\, dt + \frac{1}{2} \int_0^T u_n'(t)^2\, dt\,, \qquad (10.1.8)$$

for all n ∈ N. Since u0 ∈ HE (i.e., $\sum_{n=1}^{\infty} \lambda_n u_{0n}^2 < \infty$), the last inequality implies

$$\sum_{n=1}^{\infty} \int_0^T u_n'(t)^2\, dt < \infty\,,$$

hence $\sum_{n=1}^{\infty} u_n'(t)\, e_n$ is convergent in L2(0, T; H) and, obviously, its sum is u' ∈ L2(0, T; H).

Integration over [0, t] of (10.1.7) leads to an inequality similar to (10.1.8) which implies that $\sum_{n=1}^{\infty} \lambda_n u_n(t)^2$ is uniformly convergent in [0, T], and so u ∈ C([0, T]; HE). As u', f ∈ L2(0, T; H), we derive from Eq. (E) that Qu ∈ L2(0, T; H).

Remark 10.2. For further regularity results see, e.g., [22, Chapter 7].

We continue with a result on the existence of a periodic solution of Eq. (E).

Theorem 10.3. Assume that (a) and (b) are fulfilled and f ∈ L2(0, T; H). Then there exists a unique function u ∈ H1(0, T; H) ∩ C([0, T]; HE) satisfying Eq. (E) for a.a. t ∈ (0, T) and u(0) = u(T), and u is given by Eq. (10.1.1), where

$$u_n(t) = d_n e^{-\lambda_n t} + \int_0^t e^{-\lambda_n (t-s)} f_n(s)\, ds\,,$$

with

$$d_n = \big(1 - e^{-\lambda_n T}\big)^{-1} \int_0^T e^{-\lambda_n (T-s)} f_n(s)\, ds\,, \quad n = 1, 2, \dots$$

Proof. By Theorem 10.1, for every u0 ∈ H there exists a unique strong solution u = u(t, u0) of problem (E), (IC), which belongs to C([0, T]; H) ∩ C((0, T]; HE) ∩ L2(0, T; HE) with $\sqrt{t}\,u'(t, u_0) \in L^2(0, T; H)$. For two vectors u0, v0 ∈ H we have

$$\frac{d}{dt}\big[u(t, u_0) - u(t, v_0)\big] + Q\big[u(t, u_0) - u(t, v_0)\big] = 0 \quad \text{for a.a. } t \in (0, T)\,.$$

Multiplying this equation by u(t, u0) − u(t, v0) and using the strong positivity of Q (with some constant c > 0), we get

$$\frac{1}{2} \frac{d}{dt}\, \|u(t, u_0) - u(t, v_0)\|^2 + c\, \|u(t, u_0) - u(t, v_0)\|^2 \le 0 \quad \text{for a.a. } t \in (0, T)\,,$$

or, equivalently,

$$\frac{d}{dt}\Big[e^{2ct}\, \|u(t, u_0) - u(t, v_0)\|^2\Big] \le 0 \quad \text{for a.a. } t \in (0, T)\,,$$

which shows that the function $t \mapsto e^{ct}\|u(t, u_0) - u(t, v_0)\|$ is nonincreasing, and hence

$$\|u(t, u_0) - u(t, v_0)\| \le e^{-ct}\, \|u_0 - v_0\| \quad \forall t \in [0, T]\,. \qquad (10.1.9)$$

Now let us consider the so-called Poincaré operator P : H → H defined by

$$P u_0 = u(T, u_0) \quad \forall u_0 \in H\,.$$

From (10.1.9) we see that P is a contraction:

$$\|P u_0 - P v_0\| \le e^{-cT}\, \|u_0 - v_0\| \quad \forall u_0, v_0 \in H\,.$$

By the Banach Contraction Principle (see Chap. 2) it follows that P has a unique fixed point $u_0^* \in H$, i.e., $P u_0^* = u_0^*$. In other words, $u(T, u_0^*) = u_0^*$, which is to say, $u(t, u_0^*)$ is the unique periodic solution of Eq. (E). Since $u_0^* = u(T, u_0^*)$ we deduce from the first part of Theorem 10.1 that $u_0^* \in H_E$. Therefore, by the second part of Theorem 10.1, it follows that $u(\cdot, u_0^*) \in H^1(0, T; H) \cap C([0, T]; H_E)$. Clearly $u(t, u_0^*)$ is the sum of a Fourier series of the form (10.1.1) which is convergent in C([0, T]; HE) since $u_0^* \in H_E$. From the periodicity condition $u_0^* = u(T, u_0^*)$ we infer

$$u_n(0) = u_n(T) \quad \forall n \in \mathbb{N}\,, \qquad (10.1.10)$$

where the un's are solutions of (10.1.2), i.e.,

$$u_n(t) = d_n e^{-\lambda_n t} + \int_0^t e^{-\lambda_n (t-s)} f_n(s)\, ds \quad \forall t \in [0, T],\ n \in \mathbb{N}\,,$$

where, by (10.1.10),

$$d_n = \big(1 - e^{-\lambda_n T}\big)^{-1} \int_0^T e^{-\lambda_n (T-s)} f_n(s)\, ds\,, \quad n \in \mathbb{N}\,.$$
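The formulas for $u_n$ and $d_n$ can be checked numerically for a single mode: computing $d_n$ by quadrature and then evaluating $u_n$ at t = T must reproduce $u_n(0) = d_n$. A Python sketch (the names and the trapezoidal quadrature are ours):

```python
import math

def duhamel(lam, t, f, steps=4000):
    """Trapezoidal approximation of int_0^t e^{-lam (t-s)} f(s) ds."""
    h = t / steps
    total = 0.5 * (math.exp(-lam * t) * f(0.0) + f(t))
    for k in range(1, steps):
        s = k * h
        total += math.exp(-lam * (t - s)) * f(s)
    return total * h

def periodic_mode(lam, f, T, t, steps=4000):
    """u_n(t) = d_n e^{-lam t} + int_0^t e^{-lam (t-s)} f(s) ds with
    d_n = (1 - e^{-lam T})^{-1} int_0^T e^{-lam (T-s)} f(s) ds."""
    d = duhamel(lam, T, f, steps) / (1.0 - math.exp(-lam * T))
    return math.exp(-lam * t) * d + duhamel(lam, t, f, steps)
```

For example, with f(s) = cos s and T = 2π, evaluating `periodic_mode` at t = 0 and t = T gives the same value, as periodicity requires.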

10.2 Second Order Linear Evolution Equations

In this section we keep the notation and assumptions used in the previous section. Consider the Cauchy problem

$$u''(t) + Qu(t) = f(t)\,, \quad 0 < t < T\,, \qquad (e)$$

$$u(0) = u_0\,, \quad u'(0) = u_1\,. \qquad (ic)$$

Theorem 10.4. Assume that conditions (a) and (b) are fulfilled. Then for all u0 ∈ D(Q) (i.e., Qu0 ∈ H), u1 ∈ HE and f ∈ L2(0, T; HE) there exists a unique function u ∈ C1([0, T]; HE) ∩ H2(0, T; H) which satisfies (ic) and (e) for a.a. t ∈ (0, T), and Qu ∈ C([0, T]; H). If, in addition, f ∈ C([0, T]; H), then u ∈ C2([0, T]; H). Alternatively, if u0 ∈ D(Q), u1 ∈ HE and f ∈ H1(0, T; H), then u ∈ C1([0, T]; HE) ∩ C2([0, T]; H) (hence Qu ∈ C([0, T]; H)). In both cases the solution u is given by a Fourier series expansion of the form (10.1.1).

Proof. Let us first prove uniqueness. Let y ∈ H1(0, T; H) be the difference of two solutions of problem (e), (ic). Then y(0) = 0, y'(0) = 0, and

$$y''(t) + Qy(t) = 0 \quad \text{for a.a. } t \in (0, T)\,.$$

We multiply this equation by y'(t) to obtain

$$\big(y''(t), y'(t)\big) + \big(Qy(t), y'(t)\big) = 0\,, \quad \text{i.e.,} \quad \frac{d}{dt}\Big[\|y'(t)\|^2 + \big(Qy(t), y(t)\big)\Big] = 0\,,$$

for a.a. t ∈ (0, T). This shows that y is the null function (since y(0) = 0, y'(0) = 0 and Q is strongly positive), so the solution is indeed unique (if it exists).

In order to prove existence, we seek a solution u to problem (e), (ic) in the form (10.1.1). Requiring this series to formally satisfy (e) and (ic) we find

$$u_n''(t) + \lambda_n u_n(t) = f_n(t)\,, \quad \text{a.a. } t \in (0, T)\,, \qquad (10.2.11)$$

$$u_n(0) = u_{0n}\,, \quad u_n'(0) = u_{1n}\,, \quad n \in \mathbb{N}\,, \qquad (10.2.12)$$

where fn(t), u0n and u1n are the Fourier coefficients of f(t), u0 and u1, respectively. For each n ∈ N problem (10.2.11) and (10.2.12) has the solution

$$u_n(t) = u_{0n} \cos(\sqrt{\lambda_n}\, t) + \frac{u_{1n}}{\sqrt{\lambda_n}} \sin(\sqrt{\lambda_n}\, t) + \frac{1}{\sqrt{\lambda_n}} \int_0^t \sin\big(\sqrt{\lambda_n}\,(t-s)\big)\, f_n(s)\, ds\,, \qquad (10.2.13)$$

so that

$$u_n'(t) = -\sqrt{\lambda_n}\, u_{0n} \sin(\sqrt{\lambda_n}\, t) + u_{1n} \cos(\sqrt{\lambda_n}\, t) + \int_0^t \cos\big(\sqrt{\lambda_n}\,(t-s)\big)\, f_n(s)\, ds\,, \qquad (10.2.14)$$

where the last integral equals $\int_0^t \cos(\sqrt{\lambda_n}\, s)\, f_n(t-s)\, ds$, and

$$u_n''(t) = -\lambda_n u_{0n} \cos(\sqrt{\lambda_n}\, t) - \sqrt{\lambda_n}\, u_{1n} \sin(\sqrt{\lambda_n}\, t) + f_n(t) - \sqrt{\lambda_n} \int_0^t \sin\big(\sqrt{\lambda_n}\,(t-s)\big)\, f_n(s)\, ds\,, \qquad (10.2.15)$$

or, equivalently (when f ∈ H1(0, T; H), by differentiating the second form of the integral in (10.2.14)),

$$u_n''(t) = -\lambda_n u_{0n} \cos(\sqrt{\lambda_n}\, t) - \sqrt{\lambda_n}\, u_{1n} \sin(\sqrt{\lambda_n}\, t) + f_n(0) \cos(\sqrt{\lambda_n}\, t) + \int_0^t \cos(\sqrt{\lambda_n}\, s)\, f_n'(t-s)\, ds\,, \qquad (10.2.16)$$

where the last integral equals $\int_0^t \cos\big(\sqrt{\lambda_n}\,(t-s)\big)\, f_n'(s)\, ds$.
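Formula (10.2.13) can again be checked numerically for a single mode; e.g., for $u_{0n} = u_{1n} = 0$ and $f_n \equiv c$ the exact solution of (10.2.11), (10.2.12) is $u_n(t) = (c/\lambda_n)(1 - \cos(\sqrt{\lambda_n}\, t))$. A Python sketch (our names; the integral is computed by the trapezoidal rule):

```python
import math

def wave_mode(lam, u0n, u1n, f, t, steps=2000):
    """Solution (10.2.13) of u_n'' + lam u_n = f, u_n(0) = u0n, u_n'(0) = u1n:
    u0n cos(r t) + (u1n/r) sin(r t) + (1/r) int_0^t sin(r (t-s)) f(s) ds,
    with r = sqrt(lam)."""
    r = math.sqrt(lam)
    h = t / steps
    integral = 0.5 * math.sin(r * t) * f(0.0)  # endpoint s = t gives sin(0) = 0
    for k in range(1, steps):
        s = k * h
        integral += math.sin(r * (t - s)) * f(s)
    integral *= h
    return u0n * math.cos(r * t) + (u1n / r) * math.sin(r * t) + integral / r
```

With lam = 4, f ≡ 3 and zero initial data this reproduces (3/4)(1 − cos 2t) up to the quadrature error; with f ≡ 0 it gives the free vibration $u_{0n}\cos(\sqrt{\lambda_n}\,t) + (u_{1n}/\sqrt{\lambda_n})\sin(\sqrt{\lambda_n}\,t)$ exactly.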

From (10.2.13)–(10.2.16) we derive the estimates (with C1, C2, C3, C4 denoting suitable positive constants)

$$u_n(t)^2 \le C_1 \Big(u_{0n}^2 + \frac{1}{\lambda_n}\, u_{1n}^2 + \frac{1}{\lambda_n} \int_0^T f_n(s)^2\, ds\Big)\,, \qquad (10.2.17)$$

$$u_n'(t)^2 \le C_2 \Big(\lambda_n u_{0n}^2 + u_{1n}^2 + \int_0^T f_n(s)^2\, ds\Big)\,, \qquad (10.2.18)$$

$$u_n''(t)^2 \le C_3 \Big(\lambda_n^2 u_{0n}^2 + \lambda_n u_{1n}^2 + f_n(t)^2 + \lambda_n \int_0^T f_n(s)^2\, ds\Big)\,, \qquad (10.2.19)$$

and

$$u_n''(t)^2 \le C_4 \Big(\lambda_n^2 u_{0n}^2 + \lambda_n u_{1n}^2 + \int_0^T \big(f_n(s)^2 + f_n'(s)^2\big)\, ds\Big)\,. \qquad (10.2.20)$$

Since Qu0 ∈ H, u1 ∈ HE and f ∈ L2(0, T; HE), we have

$$\sum_{n=1}^{\infty} \lambda_n^2 u_{0n}^2 < \infty\,, \qquad \sum_{n=1}^{\infty} \lambda_n u_{1n}^2 < \infty\,, \qquad \sum_{n=1}^{\infty} \lambda_n \int_0^T f_n(t)^2\, dt < \infty\,. \qquad (10.2.21)$$

By these estimates, (10.2.21), and the Weierstrass M-test, the series (10.1.1) is convergent in different spaces and its sum u satisfies u ∈ C1([0, T]; HE) ∩ H2(0, T; H) and Qu ∈ C([0, T]; H).

If u0 ∈ D(Q), u1 ∈ HE and f ∈ H1(0, T; H) then, according to (10.2.17), (10.2.18), (10.2.20), and (10.2.21), u ∈ C1([0, T]; HE) ∩ C2([0, T]; H) (hence Qu ∈ C([0, T]; H)).

Finally, it is easily seen (as in the proof of Theorem 10.1) that in both cases u, expressed as the sum of the series (10.1.1), satisfies (e), (ic).

Similar results can be obtained under different conditions on u0, u1 and f.

On the other hand, using the semigroup approach, one can derive the existence of a solution to problem (e), (ic) which comes from the mild solution for the Cauchy problem associated with a first order differential equation in the product space X = V × H (here V = HE), equipped with the scalar product

$$\big([v_1, h_1], [v_2, h_2]\big)_X = (v_1, v_2)_E + (h_1, h_2) \quad \forall [v_1, h_1], [v_2, h_2] \in X\,.$$

Recall that A is defined by D(A) = D(Q) × HE, A[v, h] = [h, −Qv]. In fact, for all [v, h] ∈ D(A), we have

$$\big(A[v, h], [v, h]\big)_X = \big([h, -Qv], [v, h]\big)_X = (h, v)_E - (Qv, h) = (h, Qv) - (Qv, h) = 0\,.$$

Thus, according to Remark 9.26, A is a dissipative operator. We also have A* = −A, so A* is also dissipative. By Theorem 9.29 it follows that A is m-dissipative, so (according to the Lumer–Phillips Theorem) it generates a C0-semigroup of contractions, say {S(t) : X → X; t ≥ 0}.

Problem (e), (ic) can be expressed as the following Cauchy problem in X:

$$\frac{d}{dt}[u(t), w(t)] = A[u(t), w(t)] + [0, f(t)]\,, \quad 0 < t < T\,; \qquad [u, w](0) = [u_0, u_1]\,. \qquad (10.2.22)$$

According to Sect. 9.11, for [u0, u1] ∈ X and f ∈ L1(0, T; H) this problem has a unique mild solution [u, w] ∈ C([0, T]; X),

$$[u(t), w(t)] = S(t)[u_0, u_1] + \int_0^t S(t-s)[0, f(s)]\, ds\,, \quad t \in [0, T]\,. \qquad (10.2.23)$$

The first component u = u(t) can be called a mild solution of problem (e), (ic). In fact, w(t) = u'(t). In order to show this, we approximate [u0, u1] ∈ X by $[u_0^k, u_1^k] \in D(Q) \times H_E$, and f ∈ L1(0, T; H) by $f^k \in H^1(0, T; H)$. Denote by $[u^k, w^k] = [u^k, (u^k)']$ the solution of problem (10.2.22) with $[u_0, u_1] := [u_0^k, u_1^k]$ and $f := f^k$, which is a strong solution belonging to C1([0, T]; HE) ∩ C2([0, T]; H) (cf. Theorem 10.4). Obviously,

$$[u^k(t), (u^k)'(t)] = S(t)[u_0^k, u_1^k] + \int_0^t S(t-s)[0, f^k(s)]\, ds\,, \quad t \in [0, T]\,. \qquad (10.2.24)$$

As {S(t) : X → X; t ≥ 0} is a semigroup of contractions, we have for all t ∈ [0, T]

$$\big\|[u^k(t) - u^m(t),\, (u^k)'(t) - (u^m)'(t)]\big\|_X \le \big\|[u_0^k - u_0^m,\, u_1^k - u_1^m]\big\|_X + \int_0^T \|f^k(s) - f^m(s)\|\, ds\,,$$

so $([u^k, (u^k)'])$ is a Cauchy sequence in C([0, T]; X); hence $u^k$ converges in C([0, T]; HE) to u, and $(u^k)'$ converges in C([0, T]; H) to w = u' ∈ C([0, T]; H). Passing to the limit in (10.2.24) we reobtain (10.2.23) with w = u'. So the mild solution u belongs to C([0, T]; HE) ∩ C1([0, T]; H). Since u is a limit of strong solutions $u^k$ that admit Fourier series expansions (as stated in Theorem 10.4), we can easily show that u is the sum of the Fourier series (10.1.1), where un(t) = (u(t), en) for n = 1, 2, . . .
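The skew-adjointness computation (A[v, h], [v, h])_X = 0 means that ‖S(t)[u0, u1]‖_X is conserved when f = 0. For the diagonal model Q = diag(λ_n), each mode evolves as $v_n(t) = v_{0n}\cos(\sqrt{\lambda_n}\,t) + (h_{0n}/\sqrt{\lambda_n})\sin(\sqrt{\lambda_n}\,t)$ with $h_n = v_n'$, and the X-norm $\sum_n (\lambda_n v_n^2 + h_n^2)$ stays constant in time. A Python sketch (our construction, not part of the text) verifying this:

```python
import math

def xnorm_sq(lams, v0, h0, t):
    """X-norm squared of the solution [v(t), h(t)] of v' = h, h' = -Q v
    for Q = diag(lams): sum of lam_n * v_n(t)^2 + h_n(t)^2."""
    total = 0.0
    for lam, a, b in zip(lams, v0, h0):
        r = math.sqrt(lam)
        v = a * math.cos(r * t) + (b / r) * math.sin(r * t)
        h = -a * r * math.sin(r * t) + b * math.cos(r * t)
        total += lam * v * v + h * h
    return total

lams = [1.0, 4.0, 9.0]
v0 = [1.0, 0.5, -0.2]
h0 = [0.0, 1.0, 2.0]
```

Evaluating `xnorm_sq` at several times returns the same value as at t = 0, reflecting the conservation of the energy norm.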

10.3 Examples

Let ∅ ≠ Ω ⊂ RN, N ≥ 2, be a bounded domain with smooth boundary ∂Ω. Consider the following problem (associated with the heat equation)

$$\begin{cases} u_t - \Delta u = f(t, x)\,, & t \ge 0,\ x \in \Omega\,, \\ u(t, x) = 0\,, & t \ge 0,\ x \in \partial\Omega\,, \\ u(0, x) = u_0(x)\,, & x \in \Omega\,. \end{cases} \qquad (10.3.25)$$

This problem can be solved by the Fourier method using the results presented in Chap. 8 and in Sect. 10.1 above. Thus, the Fourier method provides an approach for solving the above initial-boundary value problem which is complementary to the semigroup approach. Specifically, consider H = L2(Ω) equipped with the usual scalar product and Hilbertian norm, Q = −Δ with D(Q) = H01(Ω) ∩ H2(Ω), and HE = H01(Ω) (the corresponding energetic space) with

$$(p, q)_E = \int_\Omega \nabla p \cdot \nabla q\, dx\,, \qquad \|p\|_E^2 = (p, p)_E\,.$$

By Theorem 8.16, there exist a nondecreasing sequence of positive eigenvalues $\{\lambda_n\}_{n=1}^{\infty}$ converging to ∞ and an orthonormal basis $\{e_n\}_{n=1}^{\infty}$ in H = L2(Ω) such that

$$-\Delta e_n = \lambda_n e_n \quad \text{in } \Omega\,, \quad \forall n \ge 1\,.$$

Thus, Theorem 10.1 is applicable to problem (10.3.25) which is of the form (E), (IC) with the above choices. In particular, under suitable conditions, the solution of (10.3.25) is given by

$$u(t, x) = \sum_{n=1}^{\infty} u_n(t)\, e_n(x)\,, \qquad (10.3.26)$$

where the un's solve

$$u_n'(t) + \lambda_n u_n(t) = f_n(t)\,, \quad t \ge 0\,, \qquad u_n(0) = u_{0n}\,, \quad n = 1, 2, \dots$$

with

$$f_n(t) = \int_\Omega f(t, \xi)\, e_n(\xi)\, d\xi\,, \qquad u_{0n} = \int_\Omega u_0(\xi)\, e_n(\xi)\, d\xi\,, \quad n = 1, 2, \dots$$
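In the one-dimensional analogue (cf. the exercises below), say Ω = (0, π), one has λ_n = n² and $e_n(x) = \sqrt{2/\pi}\,\sin(nx)$, and the truncated series (10.3.26) is easy to evaluate. The following Python sketch (our names; coefficients computed by the trapezoidal rule) treats the case f = 0; for u0(x) = sin x it reproduces the exact solution $u(t, x) = e^{-t}\sin x$:

```python
import math

def heat_series(u0, t, x, modes=20, quad=2000):
    """Truncated Fourier solution of u_t = u_xx on (0, pi) with Dirichlet
    boundary conditions and f = 0:
    u(t, x) ~ sum_{n=1}^{modes} u0n e^{-n^2 t} e_n(x),
    where e_n(x) = sqrt(2/pi) sin(n x) and u0n = int_0^pi u0 e_n dx
    (trapezoidal rule; the integrand vanishes at both endpoints)."""
    h = math.pi / quad
    scale = math.sqrt(2.0 / math.pi)
    total = 0.0
    for n in range(1, modes + 1):
        c = sum(u0(k * h) * math.sin(n * k * h) for k in range(1, quad))
        u0n = c * h * scale
        total += u0n * math.exp(-n * n * t) * scale * math.sin(n * x)
    return total
```

The inhomogeneous case would additionally require solving each coefficient ODE $u_n' + n^2 u_n = f_n$ by the variation of constants formula of Sect. 10.1.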

A similar discussion can be carried out for the following problem (associated with the wave equation):

$$\begin{cases} u_{tt} - \Delta u = f(t, x)\,, & t \ge 0,\ x \in \Omega\,, \\ u(t, x) = 0\,, & t \ge 0,\ x \in \partial\Omega\,, \\ u(0, x) = u_0(x)\,, \ \ u_t(0, x) = u_1(x)\,, & x \in \Omega\,. \end{cases} \qquad (10.3.27)$$

The cases of the boundary conditions of Neumann or Robin type can also be analyzed along the same lines.

10.4 Exercises

1. Consider the following initial-boundary value problem:

$$\begin{cases} u_t - u_{xx} = f(t, x)\,, & t \in (0, T),\ x \in (0, 1)\,, \\ u(t, 0) = 0\,, \ \ u_x(t, 1) = 0\,, & t \in [0, T]\,, \\ u(0, x) = u_0(x)\,, & x \in (0, 1)\,. \end{cases}$$

Take H = L2(0, 1) endowed with the usual scalar product (·, ·) and the induced norm ‖·‖ (hence H is a real Hilbert space which is infinite dimensional and separable). Define Q : D(Q) ⊂ H → H by

D(Q) = {v ∈ H2(0, 1); v(0) = 0, v′(1) = 0}, Qv = −v′′ ∀v ∈ D(Q).

The above problem can be expressed as a Cauchy problem in H:

$$u'(t) + Qu(t) = f(t)\,, \quad 0 < t < T\,, \qquad u(0) = u_0\,, \qquad (CP)$$

where u(t) := u(t, ·) ∈ H.

(i) Show that Q satisfies the conditions of Theorem 10.1 (i.e., Q is densely defined, self-adjoint, and strongly positive);

(ii) Find all the eigenpairs of Q and construct a corresponding orthonormal basis $\{e_n\}_{n=1}^{\infty}$ of H;

(iii) Show that the energetic space HE is compactly embedded in H, and determine an orthonormal basis of HE;

(iv) Find the explicit Fourier series solution $u(t, x) = \sum_{n=1}^{\infty} u_n(t)\, e_n(x)$ for given u0 and f.

2. Consider a thin homogeneous rod, identified with the interval [0, l], l > 0. The temperature at time t = 0 of the rod is constant: u = u0 for x ∈ [0, l]. The temperatures at the ends of the rod are kept constant in time: u(t, 0) = u1, u(t, l) = u2, t ∈ [0, T], where T > 0 is a given time instant. Find the temperature distribution u = u(t, x) on the rod, if there is no external heat source distributed along the rod.

3. Consider the following initial-boundary value problem:

$$\begin{cases} u_t - u_{xx} = f(t, x)\,, & t \in (0, T),\ x \in (0, 1)\,, \\ -u_x(t, 0) + \alpha u(t, 0) = 0\,, \ \ u_x(t, 1) = 0\,, & t \in [0, T]\,, \\ u(0, x) = u_0(x)\,, & x \in (0, 1)\,, \end{cases}$$

where α is a positive constant. Take H = L2(0, 1) and define Q : D(Q) ⊂ H → H by

D(Q) = {v ∈ H2(0, 1); −v′(0) + αv(0) = 0, v′(1) = 0}, Qv = −v′′ ∀v ∈ D(Q).

The above problem can be expressed as a Cauchy problem in H:

$$u'(t) + Qu(t) = f(t)\,, \quad 0 < t < T\,, \qquad u(0) = u_0\,. \qquad (CP)$$

10.4 Exercises 311

Show that Q satisﬁes the conditions (a) and (b) of Theorem 10.1

(thus ensuring existence, uniqueness, and regularity of solutions

to the given problem).

4. Answer the same questions when the boundary conditions of the preceding problem are replaced by the following (Neumann) boundary conditions: ux(t, 0) = 0, ux(t, 1) = 0, t ∈ [0, T].

5. Let (H, (·, ·), ‖·‖) be a real Hilbert space and let A : D(A) ⊂ H → H be a linear and positive operator, i.e., (Ap, p) ≥ 0 ∀p ∈ D(A). Assume that Q = A + αI satisfies both conditions (a) and (b) of Theorem 10.1, where α is a positive constant and I is the identity operator on H.

(a) Solve the Cauchy problem

$$u'(t) + Au(t) = f(t)\,, \quad 0 < t < T\,, \qquad u(0) = u_0\,. \qquad (CP)$$

(b) Show that, given T and f, if α is small enough, then there exists u0 ∈ H such that u(T) is close to u0, i.e., ‖u(T) − u0‖ is small, where u is the solution of (CP) corresponding to u0 and f.

6. Let Ω ⊂ R2 be a rectangle. Consider the initial-boundary value problem

$$\begin{cases} u_t - \Delta u = f(t, x)\,, & (t, x) \in (0, T) \times \Omega\,, \\ u(t, x) = 0\,, & (t, x) \in [0, T] \times \partial\Omega\,, \\ u(0, x) = u_0(x)\,, & x \in \Omega\,. \end{cases}$$

Find the Fourier series solution u(t, x) of the above problem for u0 ∈ H = L2(Ω) and f ∈ L2((0, T) × Ω), and determine an explicit expansion for u0(x) = c and f(t, x) = t x1 x2, where c is a real constant.

7. Answer the same questions for the preceding problem with Neumann boundary conditions (instead of the preceding Dirichlet boundary conditions). Consider also combinations of Dirichlet and Neumann conditions on different sides of the rectangle Ω.

8. Solve the following problem:

$$\begin{cases} u_t - u_{xx} = \alpha\, \delta(x-1) + \beta\, \delta(x-2)\,, & (t, x) \in (0, \infty) \times (0, 3)\,, \\ u(t, 0) = 0\,, \ \ u(t, 3) = 0\,, & t \ge 0\,, \\ u(0, x) = 0\,, & x \in [0, 3]\,, \end{cases}$$

where α, β are real constants, and δ(x−1), δ(x−2) are the usual Dirac distributions in D′(0, 3), also denoted δ1, δ2.

9. Consider an elastic string of length l > 0, held fixed at the ends x = 0 and x = l. Find the displacement u = u(t, x) in the string, which is set in motion from its straight equilibrium position with the initial velocity v0 defined by

$$v_0(x) = \begin{cases} Ax\,, & 0 \le x \le l/2\,, \\ A(l-x)\,, & l/2 \le x \le l\,, \end{cases}$$

where A is a given constant.

10. Consider an elastic string of length l > 0, held ﬁxed at the end

x = 0, while the end x = l is free. Find the displacement

u = u(t, x) in the string, if it is s