
Elements of Hilbert Spaces and Operator Theory

With contributions from Satish Shirali


Harkrishan Lal Vasudeva

Indian Institute of Science Education

and Research

Mohali, Punjab

India

DOI 10.1007/978-981-10-3020-8

Library of Congress Control Number: 2016957499

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part

of the material is concerned, speciﬁcally the rights of translation, reprinting, reuse of illustrations,

recitation, broadcasting, reproduction on microﬁlms or in any other physical way, and transmission

or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar

methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this

publication does not imply, even in the absence of a speciﬁc statement, that such names are exempt from

the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this

book are believed to be true and accurate at the date of publication. Neither the publisher nor the

authors or the editors give a warranty, express or implied, with respect to the material contained herein or

for any errors or omissions that may have been made.

The registered company is Springer Nature Singapore Pte Ltd.

The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

To

Siddhant, Ashira and Shrayus

Preface

Algebraic and topological structures compatibly placed on the same underlying set

lead to the notions of topological semigroups, groups and vector spaces, among

others. It is then natural to consider concepts such as continuous homomorphisms

and continuous linear transformations between these objects. By an ‘operator’,

we mean a continuous linear transformation of a normed linear space into itself.

Functional analysis was developed around the turn of the last century by the

pioneering work of Banach, Hilbert, von Neumann, Riesz and others. Within a few

years, after an amazing burst of activity, it was well developed as a major branch of

mathematics. It is a unifying framework for many diverse areas such as Fourier

series, differential and integral equations, analytic function theory and analytic

number theory. The subject continues to grow and attracts the attention of some

of the ﬁnest mathematicians of the era.

A generalisation of the methods of vector algebra and calculus manifests itself in

the mathematical concept of a Hilbert space, named after the celebrated mathe-

matician Hilbert. It extends these methods from two-dimensional and

three-dimensional Euclidean spaces to spaces with any ﬁnite or inﬁnite dimension.

These are inner product spaces, which allow the measurement of angles and

lengths; once completed, they possess enough limits in the space so that the

techniques of analysis can be used. Their diverse applications attract the attention of

physicists, chemists and engineers alike in good measure.

Chapter 1 establishes notations used in the text and collects results from vector

spaces, metric spaces, Lebesgue integration and real analysis. No attempt has been

made to prove the results included under the above topics. It is assumed that the

reader is familiar with them. Appropriate references have, however, been provided.

Chapter 2 includes in some detail the study of inner product spaces and their

completions. The space L2(X, M, μ), where X, M and μ denote, respectively, a
nonempty set, a σ-algebra of subsets of it and an extended nonnegative real-valued
measure, has been studied. The theorem of central importance due to
Riesz and Fischer, namely that L2(X, M, μ) is a complete metric space, has been

proved. So has the result that the space A(X) of holomorphic functions

deﬁned on a bounded domain X is complete. To make the book useful to


many applied topics: Legendre, Hermite, Laguerre polynomials, Rademacher

functions, Fourier series and Plancherel’s theorem. Such applications of the abstract

theory are also of signiﬁcance for the pure mathematician who wants to know the

origin of the subject. This chapter also contains the study of linear functionals on

Hilbert spaces; more speciﬁcally, the Riesz Representation Theorem, the result that the dual of a
Hilbert space is itself a Hilbert space and the fact that these spaces constitute

important examples of reflexive normed linear spaces. Applications of Hilbert space

theory to different branches of mathematics, such as approximation theory (Müntz’

Theorem), measure theory (Radon–Nikodým Theorem), Bergman kernel and

conformal mapping (analytic function theory), are included in Chap. 2.

A major portion of this book is devoted to the study of operators in Hilbert

spaces. It is carried out in Chaps. 3 and 4. The set of operators in a Hilbert space H,

equipped with the uniform norm, is denoted by B(H). Some well-known classes of

operators have been deﬁned. Under compact operators, Fredholm theory has been

discussed. The Mean Ergodic Theorem has been proved as an application at the end

of Chap. 3. The spectrum of an operator is the key to the understanding of the operator.

Properties of the spectrum of different classes of operators, such as normal oper-

ators, self-adjoint operators, unitaries, isometries and compact operators, have been

discussed under appropriate headings. Here, the properties of the spectrum speciﬁc

to the class of operators under consideration are studied. A large number of

examples of operators together with their spectrum and its splitting into point

spectrum, continuous spectrum, residual spectrum, approximate point spectrum and

compression spectrum have been painstakingly worked out. It is expected that the

treatment will aid the understanding of the reader. The treatment of polar decom-

position of an operator is different from the ones available in other books. Numerical

range and numerical radius of an operator have been deﬁned. The spectral radius

and the numerical radius of an operator have been compared. Professor Ajit Iqbal

Singh deserves special thanks for the help she rendered while this part was being

written. Spectral theorems, which reveal almost everything about the operators,

have been accorded special treatment in the text. After proving the spectral theorem

for compact normal operators, spectral theorems for self-adjoint operators and

normal operators have been proved. Here, we have been guided by the fundamental

principle of pedagogy that repetition helps in imbibing rather subtle techniques

needed for proving the spectral theorems. A bird’s eye view of invariant subspaces

with special attention to the Volterra operator is included. We close the chapter with

a brief introduction to unbounded operators.

Chapter 5 contains important theorems followed by applications from Banach

spaces.

The ﬁnal chapter contains hints and solutions to the 166 problems listed under

various sections. These are over and above the numerous detailed examples scat-

tered all over the text.

Contents

1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.3 Lebesgue Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

1.4 Zorn’s Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

1.5 Absolute Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2 Inner Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

2.1 Deﬁnition and Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

2.2 Norm of a Vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

2.3 Inner Product Spaces as Metric Spaces . . . . . . . . . . . . . . . . . . . . . 34

2.4 The Space L2 (X, M, µ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

2.5 A Subspace of L2(X, M, µ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

2.6 The Hilbert Space A(X) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

2.7 Direct Sum of Hilbert Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

2.8 Orthogonal Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

2.9 Complete Orthonormal Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

2.10 Orthogonal Decomposition and Riesz Representation . . . . . . . . . . 102

2.11 Approximation in Hilbert Spaces . . . . . . . . . . . . . . . . . . . . . . . . . 123

2.12 Weak Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

2.13 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

3 Linear Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

3.1 Basic Deﬁnitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

3.2 Bounded and Continuous Linear Operators . . . . . . . . . . . . . . . . . 156

3.3 The Algebra of Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167

3.4 Sesquilinear Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175

3.5 The Adjoint Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182

3.6 Some Special Classes of Operators . . . . . . . . . . . . . . . . . . . . . . . . 192

3.7 Normal, Unitary and Isometric Operators . . . . . . . . . . . . . . . . . . . 205



3.9 Polar Decomposition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222

3.10 An Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229

4 Spectral Theory and Special Classes of Operators . . . . . . . . . . . . . . . 233

4.1 Spectral Notions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233

4.2 Resolvent Equation and Spectral Radius . . . . . . . . . . . . . . . . . . . . 238

4.3 Spectral Mapping Theorem for Polynomials . . . . . . . . . . . . . . . . . 242

4.4 Spectrum of Various Classes of Operators . . . . . . . . . . . . . . . . . . 248

4.5 Compact Linear Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263

4.6 Hilbert–Schmidt Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279

4.7 The Trace Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285

4.8 Spectral Decomposition for Compact Normal Operators . . . . . . . . 294

4.9 Spectral Measure and Integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305

4.10 Spectral Theorem for Self-adjoint Operators . . . . . . . . . . . . . . . . . 317

4.11 Spectral Mapping Theorem for Bounded Normal Operators . . . . 331

4.12 Spectral Theorem for Bounded Normal Operators . . . . . . . . . . . . 337

4.13 Invariant Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343

4.14 Unbounded Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351

5 Banach Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373

5.1 Deﬁnition and Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373

5.2 Finite-Dimensional Spaces and Riesz Lemma. . . . . . . . . . . . . . . . 384

5.3 Linear Functionals and Hahn–Banach Theorem . . . . . . . . . . . . . . 393

5.4 Baire Category Theorem and Uniform Boundedness Principle . . . . . . 401

5.5 Open Mapping and Closed Graph Theorems . . . . . . . . . . . . . . . 409

6 Hints and Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417

6.1 Problem Set 2.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417

6.2 Problem Set 2.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 418

6.3 Problem Set 2.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422

6.4 Problem Set 2.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427

6.5 Problem Set 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428

6.6 Problem Set 2.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428

6.7 Problem Set 2.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429

6.8 Problem Set 2.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437

6.9 Problem Set 2.10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442

6.10 Problem Set 2.11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448

6.11 Problem Set 2.12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450

6.12 Problem Set 3.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454

6.13 Problem Set 3.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462

6.14 Problem Set 3.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465

6.15 Problem Set 3.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466

6.16 Problem Set 3.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466

6.17 Problem Set 3.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470


6.19 Problem Set 3.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 478

6.20 Problem Set 4.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479

6.21 Problem Set 4.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485

6.22 Problem Set 4.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487

6.23 Problem Set 4.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487

6.24 Problem Set 4.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504

6.25 Problem Set 4.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507

6.26 Problem Set 4.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511

6.27 Problem Set 4.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517

About the Author

Harkrishan Lal Vasudeva taught at the Indian Institute of Science Education and Research, Mohali, India from 2010 to 2016.

Earlier, he taught at Panjab University, Chandigarh, India, and held visiting

positions at the University of Shefﬁeld, the UK, and the University of Graz, Austria,

for research projects. He has numerous research articles to his credit in various

international journals and has co-authored several books, two of which have been

published by Springer.


Chapter 1

Preliminaries

The important underlying structure in every Hilbert space is a vector space (linear

space). The present section contains preparatory material on these spaces. The

reader who is already familiar with their basic theory can pass directly to Sect. 1.2,

for there is nothing in the present section which is particularly oriented to the study

of Hilbert spaces.

Deﬁnition 1.1.1 Let X be a nonempty set of elements x, y, z, … and F be a ﬁeld of
scalars λ, μ, ν, …. To each pair of elements x and y of X, there corresponds a third
element x + y in X, the sum of x and y, and to each λ ∈ F and x ∈ X corresponds
the element λ·x, or simply λx, in X, called the scalar product of λ and x, such that the
operations of addition and multiplication satisfy the following rules:

(A1) x + y = y + x,
(A2) x + (y + z) = (x + y) + z,
(A3) there is a unique element 0 in X, called zero element, such that x + 0 = x for
all x ∈ X,
(A4) for each x ∈ X, there is a unique element (−x) in X such that x + (−x) = 0,
(M1) λ(x + y) = λx + λy,
(M2) (λμ)x = λ(μx) and
(M3) 1x = x,
where 1 ∈ F is the identity in F, for all λ, μ ∈ F and x, y, z ∈ X.

Then, (X, +, ·) satisfying properties (A1)–(A4) and (M1)–(M3) is called a vector

space over F. The elements of X are called vectors or points, and those of F are

called scalars.

If F is the ﬁeld of complex numbers C [resp. real numbers R], then (X, +, ·) is

called a complex [resp. real] vector space or a complex [resp. real] linear space.

In what follows, F will denote the ﬁeld C of complex numbers or the ﬁeld R of

real numbers.

H.L. Vasudeva, Elements of Hilbert Spaces and Operator Theory,

DOI 10.1007/978-981-10-3020-8_1


Remarks 1.1.2

(i) It is more satisfying to apply the term vector space over F to the ordered

triple (X, +, ), but if this sort of thing is done systematically in all mathe-

matics, the terminology will become extremely cumbersome. In order to

avoid this difﬁculty, we shall apply the term vector space over F to X, where

it is understood that X is equipped with the operations ‘+’ and ‘·’, the latter

being scalar multiplication of elements of X by those of F.

(ii) We shall mainly restrict our attention to the ‘complex’ vector spaces. The

strong motivational factor for this choice is that the complex numbers con-

stitute an algebraically closed ﬁeld; that is, a polynomial of degree n has

precisely n roots (counting multiplicity) in the ﬁeld of complex numbers,

whereas the ﬁeld of real numbers does not have this property. This property

of complete factorisation of polynomials into linear factors is an appropriate

setting for a satisfactory treatment of the theory of operators in a Hilbert

space. It is also useful in dealing with the spaces of functions.

(iii) The additive identity element of the ﬁeld will be denoted by 0 and so shall

be the identity element of vector addition. It is unlikely that any confusion

will result from this practice.

(iv) The following immediate consequences of the axioms of a vector space are

easy to prove:

(a) The vector equation x + y = z, where y and z are given vectors in X, has

one and only one solution;

(b) If x + z = z, then x = 0;

(c) λ0 = 0 for every scalar λ;
(d) 0x = 0 for every x ∈ X;
(e) If λx = 0, then either λ = 0 or x = 0.

For given vectors x and y in X, the vector x + (−y) is called the difference

of x and y and is denoted by x − y.

(f) (−λ)x = λ(−x) = −(λx);
(g) λ(x − y) = λx − λy;
(h) (λ − μ)x = λx − μx.

(v) It is easy to check that Y ⊆ X is a vector space over F if, and only if,
x, y ∈ Y, λ, μ ∈ F imply λx + μy ∈ Y.
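Each of the immediate consequences listed in (iv) follows by a short manipulation of the axioms. For instance, writing λ for a scalar, consequence (c) is a one-line computation from (M1) together with the additive axioms:

```latex
\lambda 0 \;=\; \lambda(0 + 0) \;=\; \lambda 0 + \lambda 0,
\qquad\text{so adding } -(\lambda 0) \text{ to both sides yields } \lambda 0 = 0 .
```

The remaining items in the list are proved in the same spirit.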

Examples abound. We shall give at this point a few elementary ones: the real

ﬁeld or the complex ﬁeld with usual operations is a real or complex vector space

(scalar multiplication coinciding with the usual binary operation of multiplication).

The complex ﬁeld may also be considered as a real vector space. The set of all
n-tuples x = (x1, …, xn), xi ∈ F, i = 1, 2, …, n, is a vector space Rn or Cn, where
F = R or C. The set of all real or complex functions deﬁned on some ﬁxed set is a

vector space, the operations being the usual ones. The vector space consisting of the

zero vector only is called the trivial vector space.


The Cartesian product X × Y of vector spaces X and Y over the same ﬁeld can

be made into a vector space over that ﬁeld in an obvious way.

Deﬁnition 1.1.3 A sequence of vectors x1, x2, …, xn is said to be linearly inde-

pendent if the relation

λ1x1 + λ2x2 + ⋯ + λnxn = 0        (1.1)

holds only in the trivial case when λ1 = λ2 = ⋯ = λn = 0; otherwise, the sequence

x1, x2, …, xn is said to be linearly dependent.

The left member of (1.1) is said to be a linear combination of the ﬁnite sequence

x1, x2, …, xn. Thus, linear independence of the vectors x1, x2, …, xn means that

every nontrivial linear combination of these vectors is different from zero. If one of

the vectors is equal to zero, then these vectors are evidently linearly dependent. In

fact, if for some i, xi = 0, then we obtain the nontrivial relation on taking λi = 1 and
λj = 0, 1 ≤ j ≤ n, j ≠ i. A repetition of a vector in a sequence renders it linearly

dependent. An arbitrary nonempty collection of vectors is said to be linearly

independent if every ﬁnite sequence of distinct terms belonging to the collection is

linearly independent.
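For vectors with rational coordinates, linear independence can be checked mechanically by row-reducing the matrix whose rows are the given vectors and comparing the rank with the number of vectors. The following is a minimal illustrative sketch (the function name is mine, not the text's), using exact `Fraction` arithmetic to avoid rounding:

```python
from fractions import Fraction

def linearly_independent(vectors):
    """True iff the finite sequence of rational vectors (equal-length
    tuples) is linearly independent: Gaussian elimination on the matrix
    whose rows are the vectors, then compare rank with row count."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    if not rows:
        return True
    rank, ncols = 0, len(rows[0])
    for col in range(ncols):
        # look for a pivot in this column at or below the current rank
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank == len(rows)

# e1, e2 in R^2 are independent; adjoining e1 + e2 creates a dependence
assert linearly_independent([(1, 0), (0, 1)])
assert not linearly_independent([(1, 0), (0, 1), (1, 1)])
# as remarked above, a sequence containing the zero vector is dependent
assert not linearly_independent([(1, 2), (0, 0)])
```

The last assertion matches the observation in the text that any sequence containing the zero vector is linearly dependent.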

Deﬁnition 1.1.4 A basis or a Hamel basis for a vector space X is a collection B of

linearly independent vectors with the property that any vector x ∈ X can be

expressed as a linear combination of some subset of B.

Remarks 1.1.5

(i) Observe that a linear combination of vectors in a collection is always a ﬁnite

sum even though the collection may contain an inﬁnite number of vectors. In

fact, inﬁnite sums do not make any sense until the notion of ‘limit’ of a

sequence of vectors has been deﬁned in X.

(ii) The space X is said to be ﬁnite-dimensional, more precisely, n-dimensional

if B contains precisely n linearly independent vectors. In this case, any

(n + 1) elements of X are linearly dependent. If B contains arbitrarily many

linearly independent vectors, then X is said to be inﬁnite-dimensional. The

trivial vector space has dimension zero.

(iii) Permuting the vectors in a sequence does not alter its linear independence.

(iv) If x and y are linearly dependent and both are nonzero, then each is a nonzero

scalar multiple of the other.

Deﬁnition 1.1.6 A nonempty subset Y of a vector space X which is itself a vector
space with respect to the same operations of vector addition and scalar multiplication
as in X is called a vector subspace (or a linear subspace). In other words, if
x, y ∈ Y, λ, μ ∈ F imply λx + μy ∈ Y, then Y is a vector subspace (or a linear
subspace) of X.

One of the common methods of constructing a linear subspace Y is to consider
the set of all ﬁnite linear combinations

λ1x1 + λ2x2 + ⋯ + λnxn

of elements of a nonempty subset M of X. This set Y is the smallest subspace of X that contains M. It is called
the linear span of M or the linear subspace [or manifold] spanned by M, and we
write Y = [M].

Deﬁnition 1.1.7 Given two vector spaces X and Y (over the same ﬁeld), we can

form a new vector space V as follows: deﬁne vector operations on the Cartesian

product of X and Y, the set of all ordered pairs ⟨x, y⟩, where x ∈ X and y ∈ Y. We

deﬁne

⟨x1, y1⟩ + ⟨x2, y2⟩ = ⟨x1 + x2, y1 + y2⟩ and λ⟨x, y⟩ = ⟨λx, λy⟩.

The vector space V so formed is called the external direct sum of X and Y; we

denote it by X ⊕ Y. The vector ⟨x, 0⟩ in V, if identiﬁed with the vector x ∈ X,
permits one to think of X as a subspace of V. Similarly, Y can be viewed as a
subspace of V. The mapping ⟨x, y⟩ → ⟨x, 0⟩ [resp. ⟨0, y⟩] is called the projection of
X ⊕ Y onto X [resp. Y].

Let Y1, Y2, …, Yn be subspaces of X. By Y1 + Y2 + ⋯ + Yn, we shall mean all
sums x1 + x2 + ⋯ + xn, where xj ∈ Yj, j = 1, 2, …, n. The spaces Y1, Y2, …, Yn are
said to be linearly independent if for any i = 1, 2, …, n,

Yi ∩ (Y1 + Y2 + ⋯ + Yi−1 + Yi+1 + ⋯ + Yn) = {0}.

If X = Y1 + Y2 + ⋯ + Yn and the Yi are linearly independent, then {Yi}, i = 1, 2, …, n,
are said to form a direct sum decomposition of X, and we write

X = Y1 ⊕ Y2 ⊕ ⋯ ⊕ Yn.

In this case,
each element x ∈ X can be uniquely written in the form y1 + y2 + ⋯ + yn, where
yj ∈ Yj, j = 1, 2, …, n.

Let Y be a subspace of a vector space X (Y ⊆ X). Let x + Y = {x + y : y ∈ Y} for
all x ∈ X, and let X/Y = {x + Y : x ∈ X}. The sets x + Y are called the cosets of Y in
X. We observe that 0 + Y = Y. Obviously, x1 + Y = x2 + Y if, and only if,
x1 − x2 ∈ Y, and consequently, for each pair x1, x2, either x1 + Y = x2 + Y or
(x1 + Y) ∩ (x2 + Y) = ∅. If x1 + Y = x′1 + Y and x2 + Y = x′2 + Y, then
(x1 + x2) + Y = (x′1 + x′2) + Y and αx1 + Y = αx′1 + Y. The vector space X/Y with
the operations

(x1 + Y) + (x2 + Y) = (x1 + x2) + Y and α(x1 + Y) = αx1 + Y

for all x1, x2 ∈ X and α ∈ C (or R) is called the quotient space of X modulo Y.
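The criterion x1 + Y = x2 + Y if, and only if, x1 − x2 ∈ Y can be made concrete in R² with Y a line through the origin. A minimal sketch (the helper name and choice of subspace are mine, for illustration only):

```python
# Cosets of a one-dimensional subspace Y of R^2, where Y is the line
# spanned by `direction`.  Two cosets coincide precisely when the
# difference of their representatives lies in Y.
def same_coset(x1, x2, direction):
    dx, dy = x1[0] - x2[0], x1[1] - x2[1]
    a, b = direction
    # (dx, dy) is a scalar multiple of (a, b) iff the 2x2 determinant vanishes
    return dx * b - dy * a == 0

Y = (1, 1)  # Y = {(t, t) : t in R}, a subspace of R^2
assert same_coset((2, 3), (0, 1), Y)      # difference (2, 2) lies in Y
assert not same_coset((2, 3), (1, 1), Y)  # difference (1, 2) does not
```

The disjointness of distinct cosets noted above follows from the same criterion: if the difference is not in Y, the two cosets share no point.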


Deﬁnition 1.1.8 Two vector spaces H and K are said to be isomorphic if there
exists a bijective linear map between H and K, i.e. if there exists a bijective linear
mapping A : H → K such that A(λx + μy) = λAx + μAy for all x, y ∈ H and λ, μ ∈ F.

1.2 Metric Spaces

A vector space is a purely algebraic object, and if the processes of analysis are to be

meaningful in it, a measure of distance between any two of its vectors (or elements)

must be deﬁned. Many of the familiar analytical concepts such as convergence in

R3 with the usual distance can be fruitfully generalised to inner product spaces (to

be studied in Chap. 2).

Intuitively, one expects a distance to be a nonnegative real number, symmetric

and to satisfy the triangle inequality. These considerations motivate the following

deﬁnitions.

Deﬁnition 1.2.1 A nonempty set X, whose elements we call ‘points’, is said to be a
metric space if with any two points x, y of X there is associated a real number d(x, y),
called the distance from x to y, such that

(i) d(x, y) ≥ 0 and d(x, y) = 0 if, and only if, x = y,
(ii) d(x, y) = d(y, x) and
(iii) d(x, z) ≤ d(x, y) + d(y, z), for any x, y, z ∈ X [triangle inequality].

The function d : X × X → R+, where R+ denotes the nonnegative reals, with

It should be emphasised that a metric space is not the set of its points; it is, in

fact, the pair (X, d) consisting of the set of its points together with the metric d.

(R, d) [resp. (C, d)], where d(x, y) = |x − y|, x, y ∈ R [resp. C], are examples of

metric spaces.
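For the metric d(x, y) = |x − y| on R, the three axioms of Definition 1.2.1 can be spot-checked numerically on a sample of points. This is only an illustrative sketch, not a proof; the small tolerance in (iii) accounts for floating-point rounding:

```python
import itertools
import random

# The metric d(x, y) = |x - y| on R from the example above.
def d(x, y):
    return abs(x - y)

# Spot check of axioms (i)-(iii) on random triples of points.
random.seed(0)
pts = [random.uniform(-10.0, 10.0) for _ in range(15)]
for x, y, z in itertools.product(pts, repeat=3):
    assert d(x, y) >= 0 and (d(x, y) == 0) == (x == y)   # (i)
    assert d(x, y) == d(y, x)                            # (ii)
    assert d(x, z) <= d(x, y) + d(y, z) + 1e-12          # (iii)
```

Of course the axioms hold exactly for |x − y|; the check merely makes the definition concrete.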

Any nonempty subset of a metric space is itself a metric space if we restrict the

metric to it and is called a subspace.

Certain standard notions from the topology of real numbers have natural gen-

eralisations to metric spaces.

Deﬁnition 1.2.2 By a sequence {xn}n≥1 in a metric space X is meant a mapping of
N, the set of natural numbers, into X. A sequence {xn}n≥1 in a metric space is said
to converge to the point x ∈ X if limn d(xn, x) = 0, and we write limn xn = x. This
means: given any number ε > 0, there is an integer n0 such that d(xn, x) < ε
whenever n ≥ n0. It is easy to see that if limn xn = x and limn xn = y, then x = y. In
fact,

0 ≤ d(x, y) ≤ d(x, xn) + d(xn, y) → 0 as n → ∞.

A sequence {xn}n≥1 in X is said to be Cauchy if for every ε > 0, there is an
integer n0 such that d(xn, xm) < ε whenever n, m ≥ n0, and we write d(xn, xm) → 0
as n, m → ∞.

Note that every convergent sequence is Cauchy. In fact, if limn xn = x, then

d(xn, xm) ≤ d(xn, x) + d(x, xm) → 0 as n, m → ∞.

The converse is however not true; that is, not every Cauchy sequence is convergent.

In fact, the sequence xn = 1/n, n = 1, 2, …, in the open interval (0, 1) with the metric
d(x, y) = |x − y|, x, y ∈ (0, 1), is Cauchy, but the only possible limit, namely 0, does
not belong to the interval.
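The 1/n example can be checked numerically. The sketch below is illustrative rather than a proof: for ε = 10⁻³ the choice n0 = 2001 works, since |xn − xm| ≤ 1/n0 < ε whenever n, m ≥ n0, while the only candidate limit 0 lies outside (0, 1):

```python
# The sequence x_n = 1/n inside the metric space ((0, 1), |x - y|).
def x(n):
    return 1.0 / n

# Cauchy: for eps = 1e-3 take n0 = 2001; |x_n - x_m| <= 1/n0 < eps
# for all n, m >= n0 (checked here on a sample of indices).
eps, n0 = 1e-3, 2001
sample = [n0, 2 * n0, 10 * n0, 1000 * n0]
assert all(abs(x(n) - x(m)) < eps for n in sample for m in sample)

# Not convergent in (0, 1): the terms approach 0, but 0 is excluded.
assert x(10**6) < 1e-5          # x_n gets arbitrarily close to 0 ...
limit = 0.0
assert not (0.0 < limit < 1.0)  # ... yet 0 is not a point of (0, 1)
```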

In the metric space (R, d) [resp. (C, d)], where d(x, y) = |x − y|, x, y ∈ R

[resp. C], a sequence {xn}n 1 is convergent if, and only if, it is Cauchy (this is the

well-known Cauchy criterion of convergence).

An important class of metric spaces, in which the analogue of the Cauchy
criterion holds, is that of ‘complete’ metric spaces. More precisely, we have the
following deﬁnition.

Deﬁnition 1.2.3 A metric space (X, d) is said to be complete in case every Cauchy

sequence in the space converges. Otherwise, (X, d) is said to be incomplete.

Proposition 1.2.4 Let (X, d) be a metric space. Then

(a) |d(x, y) − d(z, y)| ≤ d(x, z) for all x, y, z ∈ X;
(b) If limn d(xn, x) = 0 and limn d(yn, y) = 0, then limn d(xn, yn) = d(x, y);
(c) If {xn}n≥1 and {yn}n≥1 are Cauchy sequences in (X, d), then d(xn, yn) is a
convergent sequence of real numbers.

Proof

(a) By the triangle inequality, d(x, y) ≤ d(x, z) + d(z, y), that is,

d(x, y) − d(z, y) ≤ d(x, z);

interchanging the roles of x and z yields (a).

(b) Using the triangle inequality for real numbers and (a), we have

|d(xn, yn) − d(x, y)| ≤ d(x, xn) + d(y, yn).        (1.4)

Since limn d(xn, x) = 0 = limn d(yn, y), it follows that limn d(xn, yn) = d(x, y).

(c) Again,

|d(xn, yn) − d(xm, ym)| ≤ d(xn, xm) + d(yn, ym),        (1.5)

and the right-hand side tends to 0 as n, m → ∞ because {xn}n≥1
and {yn}n≥1 are Cauchy sequences. Thus, the sequence {d(xn, yn)}n≥1 is
Cauchy, and since the real numbers are complete, it converges. ∎
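Proposition 1.2.4(a) can be spot-checked for a concrete metric. A hedged sketch follows, using the Euclidean metric on R² (the choice of metric is mine, for illustration; the small tolerance covers floating-point rounding):

```python
import math
import random

# Euclidean metric on R^2, a concrete metric space for the check.
def d(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Spot check of 1.2.4(a): |d(x, y) - d(z, y)| <= d(x, z) on random triples.
random.seed(1)
pts = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(20)]
for x in pts:
    for y in pts:
        for z in pts:
            assert abs(d(x, y) - d(z, y)) <= d(x, z) + 1e-9
```

Part (a) says the distance function is 1-Lipschitz in each argument, which is exactly what the assertion exercises.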

Deﬁnition 1.2.5 Let x0 be a ﬁxed point of the metric space X and r > 0 be a ﬁxed
real number. Then, the set of all points x in X such that d(x, x0) < r is called the
open ball with centre x0 and radius r. We denote it by S(x0, r). Thus

S(x0, r) = {x ∈ X : d(x, x0) < r}.        (1.6)

We speak of a closed ball if the inequality in (1.6) is replaced by d(x, x0) ≤ r, and
we denote the set by S̄(x0, r). Thus

S̄(x0, r) = {x ∈ X : d(x, x0) ≤ r}.

A set O in a metric space is said to be open if it contains an open ball about each

of its points. In other words, for every x 2 O, there exists an r > 0 such that all

y with d(y, x) < r belong to O. A set F in a metric space X is closed if its com-

plement X\F (or Fc) is open in X. An open ball is an open set in X, and a closed ball

is a closed set in X.

An open ball is easily seen to be an open set. Indeed, if y ∈ S(x0, r), then the
open ball about y with radius r − d(y, x0) is a subset of S(x0, r) because any z in the
latter ball satisﬁes d(z, y) < r − d(y, x0) and therefore also satisﬁes
d(z, x0) ≤ d(z, y) + d(y, x0) < (r − d(y, x0)) + d(y, x0) = r.

It is immediate from the deﬁnition of an open set that it is a union of open balls

with centres in the set. Conversely, any union of open balls is an open set, because

any union of open sets is clearly an open set and open balls are open sets.

Let (X, d) be a metric space and ∅ ≠ A ⊆ X. Then, d can be restricted to A in

the obvious sense, and it is trivial to check that the restriction dA provides a metric

on A. It is called the metric induced on A by d, or simply induced metric for short.


An open ball in A with reference to the induced metric is easily seen to be the

intersection with A of an open ball in X [caution: the converse is false]. Together

with the fact that intersection is distributive over union and the observation in the

preceding paragraph, this implies that every open subset of A is the intersection with

A of an open set in X. On the other hand, the intersection with A of an open ball in

X having its centre in A is an open ball in A. This implies that the intersection with

A of an open set in X is an open set in A. In summary, a subset of A is open with

reference to the induced metric if, and only if, it is the intersection with A of an open

set in X.

If (A, dA) is complete, then A is a closed subset of X.

Deﬁnition 1.2.6 A neighbourhood of a point x0 2 X is any open ball in (X, d) with

centre x0.

We say that x0 is an interior point of a set A if A contains a neighbourhood of x0.

The interior of a set A, denoted by Aº, consists of all interior points of A and can be

easily seen to be the largest open set contained in A.

Definition 1.2.7 A point x0 ∈ X is called a limit point of a set A ⊆ X if every open ball with centre x0 contains a point of A different from x0.

It may be easily seen that x0 is a limit point of A if, and only if, every open ball

with centre x0 contains an inﬁnite sequence of distinct points of A which converges

to x0.

The closure of a subset A ⊆ X, denoted by Ā, is the union of A and the set of all its limit points. Ā is the smallest closed set containing A, and A is closed if, and only if, A = Ā.

The closure of the open unit ball in ℂ, {z : |z| < 1}, is the closed unit ball {z : |z| ≤ 1}.

Deﬁnition 1.2.8 A mapping f from a metric space (X, d) to a metric space (X′, d′) is

said to be continuous at x0 2 X if, for every e > 0, there exists a d > 0 such that

d′(f(x), f(x0)) < e whenever d(x, x0) < d. The function f is continuous on X if it is

continuous at each point of X.

The mapping f is said to be uniformly continuous on X if, for every e > 0, there

exists a d > 0 such that d′(f(x), f(y)) < e whenever d(x, y) < d.

The function f is continuous on X if, and only if, f^{−1}(V) = {x ∈ X : f(x) ∈ V}, called the 'inverse image of V', is open [resp. closed] whenever V is open [resp. closed] in X′.

Definition 1.2.9 Let f be a real-valued function defined on a metric space (X, d). The function is said to be lower semi-continuous at x0 ∈ X if for each ε > 0, there exists a δ > 0 such that

f(x) > f(x0) − ε whenever d(x, x0) < δ.

Upper semi-continuity is defined by replacing the inequality displayed above by f(x) < f(x0) + ε.

1.2 Metric Spaces 9

A real-valued function is continuous at a point if, and only if, it is both upper semi-continuous and lower semi-continuous there.

Deﬁnition 1.2.10 A metric space X is said to be separable if in the space X there

exists a sequence

{x1, x2, …, xn, …}     (1.8)

such that for every x ∈ X and every ε > 0 there is an element x_{n0} of (1.8) with d(x, x_{n0}) < ε.

A subset A ⊆ X, where X is a metric space, is said to be dense if Ā = X. In view of this terminology, the definition of separability may be rephrased as follows: X is said to be separable if X contains a countable dense set.

Definition 1.2.11 A subset K ⊆ X, where X is a metric space, is said to be bounded if there exists an M ≥ 0 such that d(x, y) ≤ M whenever x and y are points in K.

The following is an immediate consequence of the deﬁnition.

Proposition 1.2.12 Let x0 be a fixed point of the metric space X and K ⊆ X. Then K is bounded if, and only if, the numbers d(x, x0) are bounded as x varies over K.

Proof Suppose d(x, x0) ≤ M for all x ∈ K; if x, y ∈ K, then

d(x, y) ≤ d(x, x0) + d(x0, y) ≤ 2M.

Thus K is bounded.

Conversely, suppose that K is bounded, say d(x, y) ≤ M for all x, y ∈ K. Fix any point y0 ∈ K. Then

d(x, x0) ≤ d(x, y0) + d(y0, x0) ≤ M + d(y0, x0)

for all x ∈ K. ∎

We review briefly the basic facts about the completion of a metric space. For

details, the reader may refer to 1–5 of [30].

Deﬁnition 1.2.13 Let (X, d) be an arbitrary metric space. A complete metric space

(X*, d*) is said to be a completion of the metric space (X, d) if

(i) X is a subspace of X* and

(ii) Every point of X* is the limit of some sequence in X (i.e. X is dense in X*).

For example, the space of real numbers is a completion of the space of rational

numbers. It will follow upon using Theorem 1.2.15 that the real numbers form the

only completion of the space of rational numbers.

Definition 1.2.14 Let (X, d) and (X′, d′) be two metric spaces. A mapping T from X to X′ is an isometry if

d′(Tx, Ty) = d(x, y)


for all x, y ∈ X. The mapping T is also called an isometric imbedding of X into X′. If, however, the mapping is onto, the spaces X and X′ themselves, between which there exists an isometric mapping, are said to be isometric.

It may be noted that an isometry is always one-to-one.

Theorem 1.2.15 Every metric space has a completion and any two completions

are isometric to each other. Moreover, there is a unique isometry between them that

reduces to the identity when restricted to the given metric space [30].

Let (X, d) be a metric space and Y ⊆ X. A collection G of open sets in X is called an open cover of Y if for each y ∈ Y, there is a G ∈ G such that y ∈ G. A finite subcollection of G which is itself a cover is called a finite subcover of Y.

Deﬁnition 1.2.16 A metric space (X, d) is said to be compact if every open cover

contains a ﬁnite subcover. A subset K of X is said to be a compact subset if the

metric space formed by K with the restriction of d to it is compact. A subset of X is

said to be precompact (or relatively compact) if its closure in X is compact.

A compact subset is always closed and therefore also precompact.

A closed subset of a compact metric space is compact. Also, a ﬁnite union of

compact subsets is compact.

A subset of Rn or Cn is compact if, and only if, it is closed as well as bounded.

The sequence criterion for compactness is: (X, d) is compact if, and only if, every

sequence in X has a convergent subsequence.

Every compact metric space is bounded but not conversely.

A continuous image of a compact metric space is compact.

Deﬁnition 1.2.16A Given a positive e, an e-net for a subset K of a metric space is a

subset Y of the metric space such that, for every x 2 K, there exists y 2 Y such that

d(x, y) < e. A subset K is said to be totally bounded if for every positive e, there

exists a ﬁnite e-net for K.

A subset of a metric space is totally bounded if, and only if, it is precompact.

A subset of a complete metric space is compact if, and only if, it is closed and

totally bounded. A closed subset of a complete metric space is compact if, and only

if, it is totally bounded.
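As a concrete aside (not from the text), a finite ε-net for the interval [0, 1] can be produced explicitly: an evenly spaced grid works, since every point of [0, 1] lies within half a grid spacing of some grid point. A minimal Python sketch:

```python
def finite_eps_net(a, b, eps):
    """A finite eps-net for [a, b]: grid points spaced less than 2*eps apart."""
    n = round((b - a) / eps) + 1          # number of subintervals; spacing < eps
    return [a + i * (b - a) / n for i in range(n + 1)]

eps = 0.1
net = finite_eps_net(0.0, 1.0, eps)
# every x in [0, 1] is within eps of some net point (checked on a fine sample)
for k in range(1001):
    x = k / 1000
    assert min(abs(x - y) for y in net) < eps
print(len(net), "points suffice")
```

Since [0, 1] admits a finite ε-net for every ε > 0, it is totally bounded, consistent with its compactness.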

If A is a nonempty subset of a metric space X with metric d and x ∈ X, then the nonnegative number d(x, A) = inf{d(x, a) : a ∈ A} is called the distance from x to A. Clearly, d(x, A) = 0 if, and only if, x ∈ Ā. The function φ defined on X by φ(x) = d(x, A) is continuous. In particular, for any a > 0, the set {x ∈ X : d(x, A) ≥ a} is closed. Moreover, φ vanishes at all points of Ā and nowhere else.

Theorem 1.2.17 Given disjoint closed subsets A and B of a compact metric space

X, there always exists a continuous function f: X ! [0, 1] such that f(a) = 0 for

every a 2 A and f(b) = 1 for every b 2 B [29, Theorem 3.4.4 on p. 116].

Proposition 1.2.18 Given a closed subset A of a metric space, its complement is

the union of a sequence of closed subsets.


Proof For each positive integer n, let K_n = {x ∈ X : d(x, A) ≥ 1/n}. By the preceding paragraph, each K_n is closed; it is also disjoint from A. But the union of all sets K_n is {x ∈ X : d(x, A) > 0}, which is precisely the complement of A = Ā. ∎

Let (X_n, d_n), n = 1, 2, …, be metric spaces with d_n(X_n) = sup{d_n(x, y) : x, y ∈ X_n} ≤ 1 for each n. For x, y ∈ ∏_{n=1}^∞ X_n, define

d(x, y) = Σ_{n=1}^∞ 2^{−n} d_n(x_n, y_n),

where x = {x_n}_{n≥1} and y = {y_n}_{n≥1}. Observe that the series on the right converges because 2^{−n}d_n(x_n, y_n) ≤ 2^{−n}.

Then, d turns out to be a metric on X = ∏_{n=1}^∞ X_n, and (X, d) is called a product metric space. Also, convergence in this metric turns out to be the same as coordinatewise convergence.
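As an illustrative aside (not from the text), the product metric is easy to compute for finitely many coordinates; the sketch below assumes each coordinate metric is bounded by 1, as above:

```python
def product_metric(x, y, metrics):
    # d(x, y) = sum over n of 2^-n * d_n(x_n, y_n), each d_n bounded by 1
    total = 0.0
    for n, (xn, yn, dn) in enumerate(zip(x, y, metrics), start=1):
        total += 2.0 ** (-n) * dn(xn, yn)
    return total

# each coordinate space: [0, 1] with the usual metric (already bounded by 1)
d1 = lambda s, t: abs(s - t)
metrics = [d1] * 10
x = [0.0] * 10
y = [1.0] * 10
# distance between the "all zeros" and "all ones" points: sum of 2^-n, n = 1..10
assert abs(product_metric(x, y, metrics) - (1 - 2.0 ** -10)) < 1e-12
# coordinatewise closeness forces closeness in d, as the text asserts
z = [0.001] * 10
assert product_metric(x, z, metrics) < 0.001
print("ok")
```

The geometric weights 2^{−n} are what make the series converge and what tie convergence in d to coordinatewise convergence.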

Theorem 1.2.19 (Tychonoff) (X, d) is compact if, and only if, each (Xn, dn) is

compact [30].

Another important theorem in metric spaces is the theorem of Ascoli, also

known as the Arzelà–Ascoli Theorem.

As usual, C[0, 1] denotes the metric space of continuous functions defined on [0, 1] with metric

d(f, g) = sup{|f(x) − g(x)| : x ∈ [0, 1]}.

A subset K of C[0, 1] is said to be equicontinuous if, for every ε > 0, there exists a δ > 0 such that, for every f ∈ K, |f(x) − f(y)| < ε whenever |x − y| < δ.

Theorem 1.2.21 (Ascoli) Let K be a nonempty subset of C[0, 1]. Then the fol-

lowing are equivalent:

(a) The closure of K is compact;
(b) K is uniformly bounded (i.e. there exists M > 0 such that |f(x)| ≤ M for every x ∈ [0, 1] and every f ∈ K) and equicontinuous [30].

In this section, we shall review the theory of measure and integrable functions as

developed by H. Lebesgue in 1902. His integral, though more complicated to develop and define than Riemann's, is easier to use as a tool and has better properties. For example, problems which involve integration together with a


limiting process are often awkward with the Riemann integral but are easily handled when Lebesgue integration is used.

Measure theory is based on the idea of generalising the length of an interval in

R, the area of a rectangle in R2 , etc. to the measure of a subset. The more the

‘measurable’ subsets, the more the functions that can be integrated. A well-behaved

measure, i.e. a measure with acceptable properties, is possible on a wide class of

subsets. We begin with the following deﬁnitions.

Definition 1.3.1 Let X be a set and M be a collection of subsets of X with the following properties:

(i) X ∈ M;
(ii) if A ⊆ X and A ∈ M, then X\A ∈ M; and
(iii) if A_n ⊆ X and A_n ∈ M, n = 1, 2, …, then ∪_{n=1}^∞ A_n ∈ M.

Such a collection M is called a σ-algebra of subsets of X.

In case X is a metric space, there is a smallest σ-algebra containing all open subsets of it, and each member of this smallest σ-algebra is called a Borel set and the smallest σ-algebra containing all open subsets is called the Borel field.

Definition 1.3.2 Let μ be an extended real-valued function defined on M such that

(i) μ(A) ≥ 0 for every A ∈ M and
(ii) A_n ∈ M, n = 1, 2, … and A_n ∩ A_m = ∅, n ≠ m implies

μ(∪_{n=1}^∞ A_n) = Σ_{n=1}^∞ μ(A_n).

Then μ is called a positive measure on M.

A measure space is a triple (X, M, μ), where X is a nonempty set, M a σ-algebra of subsets of X and μ a positive measure on X. A subset A in the measure space (X, M, μ) is said to have σ-finite measure if A is a countable union of sets A_i, i = 1, 2, …, with μ(A_i) < ∞, and we say that μ is σ-finite on A. The measure μ is said to be σ-finite if it is σ-finite on X.

There exists a unique positive measure on the r-algebra of Borel subsets of Rn ,

which agrees with the volume when restricted to products of intervals. It is called

the Borel measure in Rn . There exists a r-algebra of subsets of Rn larger than the

Borel r-algebra, the elements of which are called Lebesgue measurable subsets,

and on which is deﬁned a positive measure, which agrees with the Borel measure

when restricted to the Borel r-algebra. It is called the Lebesgue measure in Rn and

has the following additional property of being complete: let E be a Lebesgue

measurable set of measure 0 and F be any subset of E. Then, F is also Lebesgue

measurable and hence has measure 0.

The Lebesgue measure on ℝⁿ is σ-finite. Let E be a measurable subset of ℝ and μ be the Lebesgue measure on ℝ. For every ε > 0, there exist an open set O and a closed set F in ℝ

1.3 Lebesgue Integration 13

such that F ⊆ E ⊆ O, μ(O\E) < ε and μ(E\F) < ε. This property is called the regularity of the Lebesgue measure μ.

Definition 1.3.3 Let M be a σ-algebra of subsets of X. An extended real-valued function f defined on X is said to be measurable if f^{−1}(O) = {x ∈ X : f(x) ∈ O}, where O is an open subset of ℝ, is measurable and if the subsets f^{−1}(∞) and f^{−1}(−∞) are measurable. A complex-valued function g + ih is measurable if, and only if, g and h are both measurable.

It can be shown that if f = g + ih is measurable, then f−1(O) = {x 2 X : f(x) 2 O},

where O is an open subset of C, is measurable [26].

If f and g are extended real-valued measurable functions, then so are f + g (provided it is defined), fg, af (a ∈ ℝ), |f|, max{f, g} and min{f, g}. If {f_n}_{n≥1} is a sequence of extended real-valued measurable functions, then sup_n f_n, inf_n f_n, limsup_n f_n (= lim_n sup_{k≥n} f_k), liminf_n f_n, and the pointwise limit lim_n f_n, when it exists, are measurable.

For A ⊆ X, let χ_A denote the characteristic function of A, that is,

χ_A(x) = 1 if x ∈ A and χ_A(x) = 0 if x ∉ A.

A simple function is a real-valued function on X whose range is finite. If a_1, a_2, …, a_m are the distinct values of such a function f, then

f = Σ_{j=1}^m a_j χ_{A_j}, where A_j = {x ∈ X : f(x) = a_j}, j = 1, 2, …, m.

Also, f is measurable if, and only if, A_1, …, A_m are measurable subsets of X.

Let f : X → [0, ∞] and for n = 1, 2, … consider the simple functions

s_n(x) = (j − 1)/2^n if (j − 1)/2^n ≤ f(x) < j/2^n, j = 1, 2, …, n2^n, and s_n(x) = n if f(x) ≥ n.

Then 0 ≤ s_1 ≤ s_2 ≤ ⋯ ≤ f and {s_n} converges to f pointwise on X; if f is bounded, the sequence {s_n} converges to f uniformly on X. If f : X → [−∞, ∞], then by considering f = f⁺ − f⁻, where f⁺ = max{f, 0} and f⁻ = −min{f, 0}, we see that there exists a sequence of simple functions converging to f at every point of X. Note that if f is measurable, then each of these simple functions is measurable.
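As an illustrative aside (not from the text), the dyadic approximants s_n can be computed directly, and their monotone pointwise convergence to f checked numerically:

```python
def s_n(f, x, n):
    # s_n(x) = (j-1)/2^n when (j-1)/2^n <= f(x) < j/2^n, j = 1, ..., n*2^n;
    # s_n(x) = n         when f(x) >= n
    v = f(x)
    if v >= n:
        return float(n)
    j = int(v * 2 ** n)              # floor(v * 2^n), i.e. the j-1 of the definition
    return j / 2.0 ** n

f = lambda x: x * x                  # a nonnegative function on [0, 2]
xs = [i / 100 for i in range(201)]
for x in xs:
    vals = [s_n(f, x, n) for n in range(1, 12)]
    # monotone increasing in n, and each approximant lies below f
    assert all(a <= b for a, b in zip(vals, vals[1:]))
    assert all(v <= f(x) for v in vals)
    # the error at stage n is at most 2^-n once f(x) < n
    assert f(x) - vals[-1] < 2.0 ** -11 + 1e-15
print("ok")
```

The error bound 2^{−n} is what makes the convergence uniform when f is bounded.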

Deﬁnition 1.3.4 Let (X, M, l) be a measure space and suppose f is measurable.


(i) For f a simple function, say f = Σ_{j=1}^m a_j χ_{A_j}, the integral of f over X is defined as

∫_X f dμ = Σ_{j=1}^m a_j μ(A_j);

(ii) For f extended real-valued and nonnegative, the integral of f over X is defined as

∫_X f dμ = sup{∫_X g dμ : g is simple and 0 ≤ g(x) ≤ f(x) for x ∈ X};

(iii) For f extended real-valued, write f = f⁺ − f⁻, where f⁺ = max{f, 0} and f⁻ = −min{f, 0}. The integral of f over X is defined as

∫_X f dμ = ∫_X f⁺ dμ − ∫_X f⁻ dμ,

provided at least one of the integrals on the right is finite.

(iv) The function f is said to be integrable (or μ-integrable) if ∫_X f⁺ dμ and ∫_X f⁻ dμ are both finite.

(v) Suppose f is complex-valued and the integrals of ℜf and ℑf are defined as in (iii) and are finite. Then set

∫_X f dμ = ∫_X ℜf dμ + i ∫_X ℑf dμ.

(vi) For a measurable set A, let χ_A be the characteristic function of A. If the integral of fχ_A can be defined as above, set

∫_A f dμ = ∫_X f χ_A dμ.

(vii) If the integral can be deﬁned in this manner, we say that the integral exists.

When l is Lebesgue measure on a bounded closed interval, the l-integral

deﬁned above is the Lebesgue integral and coincides with the Riemann integral for

all Riemann integrable functions.

It is sometimes convenient to denote ∫_A f dμ by ∫_A f(x) dμ(x).


Let (X, M, μ) be a measure space which is nontrivial in the sense that 0 < μ(A) < ∞ for some A ∈ M. For p > 0, we shall denote by L̃^p(X, M, μ) the set of measurable complex-valued functions f defined on X such that ∫_X |f|^p dμ < ∞.

With pointwise addition and scalar multiplication, L̃^1(X, M, μ) is a vector space of functions.

In order to introduce the set L̃^∞(X, M, μ), we shall need the following: for a nonnegative measurable function g, let A be the set of all real numbers a such that

μ({x ∈ X : g(x) > a}) = 0.

If A = ∅, put b = ∞; if A ≠ ∅, put b = inf A. Since

{x ∈ X : g(x) > b} = ∪_{n=1}^∞ {x ∈ X : g(x) > b + 1/n}

and since the union of a countable collection of sets of measure zero is a set of measure zero, it follows that b ∈ A. We call b the essential supremum of g and write b = ess sup g. The function g is said to be essentially bounded if ess sup g is finite. The collection of measurable functions f for which ess sup |f| < ∞ will be denoted by L̃^∞(X, M, μ).
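As a small aside (not from the text), the essential supremum is transparent for a function taking finitely many values on sets of prescribed measure: it is the largest value carried by a set of positive measure, which may be strictly smaller than the ordinary supremum.

```python
def ess_sup(values, masses):
    """Essential supremum of a function taking values[i] on a set of measure masses[i]."""
    # a belongs to A iff the total mass where the function exceeds a is zero,
    # so ess sup is the largest value carried by a set of positive measure
    return max(v for v, m in zip(values, masses) if m > 0)

# g takes the value 100 only on a set of measure zero
values = [1.0, 3.0, 100.0]
masses = [0.5, 0.5, 0.0]
assert ess_sup(values, masses) == 3.0     # sup g = 100, but ess sup g = 3
print("ok")
```

Altering a function on a set of measure zero leaves its essential supremum unchanged, which is why L̃^∞ is the natural "bounded" class in measure theory.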

The next three theorems involve interchange of integration with the limiting

process for a sequence of functions.

Theorem 1.3.6 (Monotone Convergence Theorem) Let (X, M, μ) be a measure space and assume that {f_n}_{n≥1} is a monotone increasing sequence of nonnegative extended real-valued measurable functions. Then

∫_X lim_n f_n dμ = lim_n ∫_X f_n dμ.

The reader will note that each of the above integrals is deﬁned (though not

necessarily ﬁnite).

The following immediate consequence of the Monotone Convergence Theorem

will be needed in Sect. 2.8.

Corollary 1.3.7 Let (X, M, μ) be a measure space and {f_n}_{n≥1} be a sequence of nonnegative integrable functions, each defined on X, such that

Σ_{k=1}^∞ ∫_X f_k dμ < ∞.

Then Σ_{k=1}^∞ f_k is integrable and


∫_X (Σ_{k=1}^∞ f_k) dμ = Σ_{k=1}^∞ ∫_X f_k dμ < ∞.

Proof Since {Σ_{k=1}^n f_k}_{n≥1} is an increasing sequence of functions that converges to Σ_{k=1}^∞ f_k, by the Monotone Convergence Theorem, we conclude that

∫_X (Σ_{k=1}^∞ f_k) dμ = Σ_{k=1}^∞ ∫_X f_k dμ < ∞,

and hence Σ_{k=1}^∞ f_k is integrable. ∎

Theorem 1.3.8 (Fatou's Lemma) Let (X, M, μ) be a measure space and assume that {f_n}_{n≥1} is a sequence of nonnegative extended real-valued measurable functions. Then

∫_X lim inf_n f_n dμ ≤ lim inf_n ∫_X f_n dμ.

A complex-valued measurable function is integrable if, and only if, its real and

imaginary parts are. Obviously, there can be no Monotone Convergence

Theorem or Fatou’s Lemma for complex-valued functions. Nonetheless, the fol-

lowing result, which is initially proved for real-valued functions by using Fatou’s

Lemma, can be extended to complex-valued functions without any difﬁculty [26].

Theorem 1.3.9 (Lebesgue Dominated Convergence Theorem) Let (X, M, μ) be a measure space and assume that {f_n}_{n≥1} is a sequence of complex measurable functions. Suppose that lim_n f_n = f. If there is an integrable function g such that |f_n| ≤ g (n ≥ 1), then f is integrable and

lim_n ∫_X f_n dμ = ∫_X f dμ.

We ﬁnally recall the role played by the sets of measure zero. Let P be a property

which a function is eligible to have at a point (e.g. continuity, positivity and the

like). If f has the property P at all points outside some set of measure zero, then f is

said to have the property P almost everywhere (abbreviated as a.e.).

For example, if f is a nonnegative measurable function defined on X and ∫_X f dμ = 0, then f = 0 a.e.

For the definition of the spaces L^p, see Sect. 2.4.

Corollary 1.3.10 Let {f_n} be a sequence of complex-valued measurable functions on X such that Σ_{n=1}^∞ |f_n| ∈ L^1 (or equivalently, Σ_{n=1}^∞ ∫_X |f_n| dμ < ∞). Then Σ_{n=1}^∞ f_n ∈ L^1 and

∫_X (Σ_{n=1}^∞ f_n) dμ = Σ_{n=1}^∞ ∫_X f_n dμ.


Proposition 1.3.11 If the function f ∈ L^1[0, 1] and ∫_0^1 f(x)x^n dx = 0 for n = 0, 1, 2, …, then f(x) = 0 a.e. on [0, 1].

Hint: Obviously, ∫_0^1 f(x)g(x) dx = 0 for every polynomial g and hence for every continuous function g by Weierstrass' Approximation Theorem. Since the characteristic function of [0, t], where t ∈ [0, 1], can be approximated pointwise by a sequence of continuous functions, one gets

∫_0^t f(x) dx = 0 for every t ∈ [0, 1],

using the Dominated Convergence Theorem. It follows from Corollary 1.3.10 that the integral of f over any open subset U of [0, 1] vanishes. The result now follows on using the regularity of Lebesgue measure.

Proposition 1.3.12 If the real- or complex-valued function f(t) is integrable and ∫_{−∞}^∞ f(t)e^{itx} dt = 0 for all real x, then f(t) = 0 a.e. [see Corollary (21.47) of [12]].

Remark 1.3.13 If f is extended real-valued and if ∫_X f dμ is finite, then f is finite almost everywhere.

Let (X, S, μ) and (Y, T, ν) be σ-finite measure spaces. A measurable rectangle is any set of the form A × B, where A ∈ S and B ∈ T. S × T denotes the σ-algebra generated by the collection of measurable rectangles.

With each function f on X × Y and with each x ∈ X, we associate a function f_x defined on Y as f_x(y) = f(x, y). Similarly, if y ∈ Y, f^y is the function on X such that f^y(x) = f(x, y). Let f be an (S × T)-measurable function. Then, for each x ∈ X [resp. y ∈ Y], the function f_x [resp. f^y] is T-measurable [resp. S-measurable].

With each subset Q of X × Y and with each x ∈ X, we associate a subset Q_x of Y defined as Q_x = {y ∈ Y : (x, y) ∈ Q}. Similarly, if y ∈ Y, Q^y is the subset of X such that Q^y = {x ∈ X : (x, y) ∈ Q}. Let Q be (S × T)-measurable. Then, for each x ∈ X [resp. y ∈ Y], the set Q_x [resp. Q^y] is T-measurable [resp. S-measurable].

For Q ∈ S × T, put φ(x) = ν(Q_x) and ψ(y) = μ(Q^y). Then, for each x ∈ X [resp. y ∈ Y], the function φ is S-measurable [resp. ψ is T-measurable] and

∫_X φ dμ = ∫_Y ψ dν, i.e., ∫_X ν(Q_x) dμ(x) = ∫_Y μ(Q^y) dν(y).

The product measure μ × ν on S × T is defined by

(μ × ν)(Q) = ∫_X ν(Q_x) dμ(x) = ∫_Y μ(Q^y) dν(y) for Q ∈ S × T.


Theorem 1.3.14 (Fubini) If f ∈ L^1(μ × ν), then f_x for almost all x ∈ X [resp. f^y for almost all y ∈ Y] is in L^1(ν) [resp. L^1(μ)] and

∫_X (∫_Y f_x dν) dμ = ∫_{X×Y} f d(μ × ν) = ∫_Y (∫_X f^y dμ) dν.

A partial order on a nonempty set S is a relation ≤ between elements of S satisfying (i) x ≤ x, (ii) x ≤ y, y ≤ x implies x = y, and (iii) x ≤ y, y ≤ z implies x ≤ z. If every pair of elements of a subset S′ ⊆ S are comparable, that is, x ∈ S′ and y ∈ S′ implies either x ≤ y or y ≤ x, then S′ is called a totally ordered subset of S (or a chain). An upper bound of a set A ⊆ S is any y ∈ S such that x ≤ y for all x ∈ A. A maximal element of S is a y ∈ S such that y ≤ x implies y = x.

Zorn's Lemma If S is a partially ordered set in which every totally ordered subset has an upper bound, then S has a maximal element.

Remark This lemma is logically equivalent to the axiom of choice, that is, one can be derived from the other and vice versa. This axiom says that if {X_α}_{α∈Λ}, Λ any indexing set, is any family of nonempty sets, then there exists a set that contains exactly one element from each X_α.

For a discussion of the equivalence alluded to above and related material, the

reader may consult J.L. Kelley [15].

A function f defined on [a, b] is said to be absolutely continuous on [a, b] if, given ε > 0, there is a δ > 0 such that

Σ_{i=1}^n |f(d_i) − f(c_i)| < ε

for every finite pairwise disjoint family {(c_i, d_i)} of intervals with Σ_{i=1}^n (d_i − c_i) < δ.

1.5 Absolute Continuity 19

(i) An absolutely continuous function is continuous.
(ii) The indefinite integral ∫_{[a,x]} f dμ, f ∈ L^1[a, b], is absolutely continuous.
(iii) If f is absolutely continuous, then f has a derivative almost everywhere.
(iv) Let f be an absolutely continuous function on [a, b], and suppose that f′(x) = 0 a.e. Then, f is a constant.
(v) A function f on [a, b] has the form

f(x) = f(a) + ∫_a^x φ(s) ds

for some φ ∈ L^1[a, b] if, and only if, f is absolutely continuous on [a, b]. In this case, f′(x) = φ(x) a.e. on [a, b].

Chapter 2

Inner Product Spaces

In the study of vector algebra in Rn , the notion of angle between two nonzero

vectors is introduced by considering the inner (or dot) product. In fact, if x =

(x1, x2, …, xn) and y = (y1, y2, …, yn) are any two vectors in the n-dimensional

Euclidean space Rn ; then their inner product is deﬁned by

(x, y) = Σ_{i=1}^n x_i y_i,

and the length ‖x‖ of x satisfies

(x, x) = ‖x‖².

The formula

cos θ = (x, y)/(‖x‖‖y‖), x ≠ 0 ≠ y,

determines the angle θ between x and y. The vectors x and y are orthogonal if (x, y) = 0. This concept of orthogonality proves useful and lends itself to the generalisation to spaces of higher dimensions.

We introduce below the abstract notion of an inner product and show how a

vector space equipped with an inner product reflects properties analogous to those

enjoyed by the n-dimensional Euclidean space Rn :

Recall that we denote by F either the ﬁeld C of complex numbers or the ﬁeld R

of real numbers.

Definition 2.1.1 Let H be a vector space over F. An inner product on H is a function (·,·) from H × H into F such that for all x, y, z ∈ H and k ∈ F:

H.L. Vasudeva, Elements of Hilbert Spaces and Operator Theory,

DOI 10.1007/978-981-10-3020-8_2


(i) (x, y) = (y, x)*, where * denotes complex conjugation;
(ii) (x + z, y) = (x, y) + (z, y), (kx, y) = k(x, y);
(iii) (x, x) ≥ 0 and (x, x) = 0 if, and only if, x = 0.

An inner product space is a vector space with an inner product on it. Axiom (ii) for an inner product space can be expressed as follows: the inner product is linear in the first variable. In axiom (i), (y, x)* denotes the complex conjugate of (y, x). Inner product spaces are also called pre-Hilbert spaces.

It is left to the reader to verify that when F = ℝ and H = ℝⁿ, the usual inner product (x, y) = Σ_{i=1}^n x_i y_i (described above) satisfies the foregoing definition.

The following proposition contains some immediate consequences of Definition 2.1.1.

Proposition 2.1.2 For any x, y, z in an inner product space H and any k 2 F, the

following hold:

(a) (x, y + z) = (x, y) + (x, z);

(b) (x, ky) = k*(x, y), * denoting complex conjugation;

(c) (0, y) = (x, 0) = 0;

(d) (x − y, z) = (x, z) − (y, z);

(e) (x, y − z) = (x, y) − (x, z);

(f) if (x, y) = (x, z) for all x, then y = z.

Proof (c) (0, y) = (0 + 0, y) = (0, y) + (0, y), on using Definition 2.1.1(ii), and hence (0, y) = 0. Similarly, (x, 0) = (x, 0 + 0) = (x, 0) + (x, 0), so (x, 0) = 0.

(f) Suppose (x, y) = (x, z) for all x. Then (x, y − z) = 0 for all x; in particular, (y − z, y − z) = 0, and hence y = z.

The proofs of (b), (d) and (e) are no different and are left to the reader. ∎

Examples 2.1.3

(i) Let H = ℂⁿ = {x = (x_1, x_2, …, x_n) : x_i ∈ ℂ, 1 ≤ i ≤ n} be the complex vector space of n-tuples. For x = (x_1, x_2, …, x_n) and y = (y_1, y_2, …, y_n), define

(x, y) = Σ_{i=1}^n x_i y_i*.     (2.1)

It is routine to check that the formula (2.1) does deﬁne an inner product on Cn in

the sense of Deﬁnition 2.1.1. This space is called n-dimensional unitary space and

2.1 Deﬁnition and Examples 23

the vectors e_1 = (1, 0, …, 0), e_2 = (0, 1, 0, …, 0), …, e_n = (0, 0, …, 1) constitute a basis for ℂⁿ.

(ii) Let ℓ₀ be the vector space of all sequences x = {x_n}_{n≥1} of complex numbers, all of whose terms, from some index onwards, are zero (the index, of course, may vary with the sequence). If x = {x_n}_{n≥1} and y = {y_n}_{n≥1}, define

(x, y) = Σ_{n=1}^∞ x_n y_n*.     (2.2)

Since the sum on the right side of (2.2) is essentially finite, convergence is not an issue here. The axioms of Definition 2.1.1 are easily verified.

(iii) Let ℓ² denote the set of all complex sequences x = {x_n}_{n≥1} which are square summable, that is,

Σ_{n=1}^∞ |x_n|² < ∞.

Addition of x = {x_n}_{n≥1} and y = {y_n}_{n≥1} and multiplication of x = {x_n}_{n≥1} by a scalar k ∈ ℂ are defined by

x + y = {x_n + y_n}_{n≥1} and kx = {kx_n}_{n≥1}.

Since

|a + b|² ≤ 2|a|² + 2|b|²

for complex numbers a and b, we have

Σ_{n=1}^m |x_n + y_n|² ≤ 2 Σ_{n=1}^m |x_n|² + 2 Σ_{n=1}^m |y_n|² ≤ 2 Σ_{n=1}^∞ |x_n|² + 2 Σ_{n=1}^∞ |y_n|²,

and hence

Σ_{n=1}^∞ |x_n + y_n|² ≤ 2 Σ_{n=1}^∞ |x_n|² + 2 Σ_{n=1}^∞ |y_n|².     (2.3)

Thus, if x = {x_n}_{n≥1} and y = {y_n}_{n≥1} are in ℓ², it follows from (2.3) that x + y ∈ ℓ². Also, if x ∈ ℓ² and k ∈ ℂ, then Σ_{n=1}^∞ |kx_n|² = |k|² Σ_{n=1}^∞ |x_n|² shows that kx ∈ ℓ². Consequently, ℓ² is a vector space over ℂ.

For x = {x_n}_{n≥1} and y = {y_n}_{n≥1} in ℓ², the series Σ_{n=1}^∞ x_n y_n* converges absolutely. In fact,


|x_n y_n*| ≤ (1/2)(|x_n|² + |y_n|²)

implies

Σ_{n=1}^m |x_n y_n*| ≤ (1/2) Σ_{n=1}^m |x_n|² + (1/2) Σ_{n=1}^m |y_n|² ≤ (1/2) Σ_{n=1}^∞ |x_n|² + (1/2) Σ_{n=1}^∞ |y_n|²,

and hence

Σ_{n=1}^∞ |x_n y_n*| ≤ (1/2) Σ_{n=1}^∞ |x_n|² + (1/2) Σ_{n=1}^∞ |y_n|².

Now define

(x, y) = Σ_{n=1}^∞ x_n y_n*, x, y ∈ ℓ².     (2.4)

It is now easy to check that the axioms for an inner product are satisfied. Thus, ℓ² with the inner product defined in (2.4) is an inner product space.
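As an illustrative aside (not from the text), the ℓ² inner product (2.4) can be evaluated directly for finitely supported sequences (the space ℓ₀ of Example (ii), which sits inside ℓ²), and the axioms of Definition 2.1.1 checked on concrete vectors:

```python
def ip(x, y):
    # (x, y) = sum over n of x_n * conj(y_n); finite for finitely supported sequences
    n = max(len(x), len(y))
    x = x + [0] * (n - len(x))
    y = y + [0] * (n - len(y))
    return sum(a * b.conjugate() for a, b in zip(x, y))

x = [1 + 1j, 0, 2j]
y = [3, 1 - 1j]
z = [0.5j, 2, 1, 4]
# conjugate symmetry (axiom (i)) and additivity in the first variable (axiom (ii))
assert ip(x, y) == ip(y, x).conjugate()
assert ip([a + b for a, b in zip(x, y + [0])], z) == ip(x, z) + ip(y, z)
# (x, x) is real and positive for x != 0 (axiom (iii))
assert ip(x, x).imag == 0 and ip(x, x).real > 0
print("ok")
```

For genuinely infinite sequences the absolute convergence established above guarantees that truncations of the sum converge to (x, y).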

(iv) Let C[a, b], −∞ < a < b < ∞, be the vector space of all continuous

complex-valued functions deﬁned on [a, b]. Deﬁne

(f, g) = ∫_a^b f(t) g(t)* dt, f, g ∈ C[a, b].     (2.5)

Observe that (f, f) = ∫_a^b |f(t)|² dt = 0 implies f(t) = 0 for each t ∈ [a, b], in view of the continuity of f. The other axioms in Definition 2.1.1 are consequences of the properties of integrals.

(v) Let Cn[a, b] be the vector space of all n times continuously differentiable

complex-valued functions defined on [a, b]. For f, g ∈ Cⁿ[a, b], define

(f, g) = Σ_{i=0}^n ∫_a^b f^{(i)}(t) g^{(i)}(t)* dt,     (2.6)

where f^{(0)} = f and f^{(i)} denotes the ith derivative, and similarly for g. Observe that 0 = (f, f) = Σ_{i=0}^n ∫_a^b |f^{(i)}(t)|² dt implies Σ_{i=0}^n |f^{(i)}(t)|² = 0, t ∈ [a, b], in view of the continuity of Σ_{i=0}^n |f^{(i)}(t)|², t ∈ [a, b]. This implies f(t) = 0 for each t ∈ [a, b]. The other axioms in Definition 2.1.1 are consequences of the properties of integrals.


(vi) Let RL² denote the space of rational functions (i.e. ratios of two polynomials with complex coefficients) which are analytic on the unit circle

∂D = {z ∈ ℂ : |z| = 1},

with the usual pointwise addition and scalar multiplication. The inner product is defined by

(f, g) = (1/2πi) ∫_{∂D} f(z) g(z)* (dz/z),     (2.7)

where the integral is being taken in the anticlockwise direction around ∂D.

RH² is the subspace of RL² consisting of those rational functions which are analytic on the closed unit disc D̄, where

D = {z ∈ ℂ : |z| < 1}.

Thus, a rational function belongs to RL² if it has no pole of absolute value 1, and it belongs to RH² if it has no pole of absolute value less than or equal to 1.

Clearly, (2.7) satisﬁes the axioms in (i), (ii) and part of (iii). We need to check

that (f, f) > 0 when f ≠ 0. Indeed,

(f, f) = (1/2π) ∫_{−π}^{π} |f(e^{iθ})|² dθ,     (2.8)

using the parametrisation z = e^{iθ}, −π < θ ≤ π. Since f(e^{iθ}) is continuous on [−π, π], the right-hand side of (2.8) is positive unless f = 0.

(vii) A trigonometric polynomial is a finite sum of the form

f(x) = a_0 + Σ_{n=1}^k a_n e^{inx}, x ∈ [−π, π],

where a_0, a_1, …, a_k are complex numbers. Every trigonometric polynomial is of period 2π. The space TP of trigonometric polynomials is a vector space over ℂ with respect to pointwise addition and scalar multiplication. If we define the inner product by

scalar multiplication. If we deﬁne the inner product by

(f, g) = (1/2π) ∫_{−π}^{π} f(t) g(t)* dt,     (2.9)

then TP becomes an inner product space.
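As a numerical aside (not from the text), the exponentials e^{int} are orthonormal for the inner product (2.9); this can be checked with a simple Riemann-sum approximation of the integral:

```python
import math, cmath

def ip(f, g, N=20000):
    # (f, g) = (1/2*pi) * integral over [-pi, pi] of f(t) * conj(g(t)) dt,
    # approximated by a midpoint Riemann sum with N panels
    h = 2 * math.pi / N
    s = 0
    for k in range(N):
        t = -math.pi + (k + 0.5) * h
        s += f(t) * g(t).conjugate()
    return s * h / (2 * math.pi)

e = lambda n: (lambda t: cmath.exp(1j * n * t))
# (e^{i3t}, e^{i3t}) = 1 and (e^{i3t}, e^{i5t}) = 0
assert abs(ip(e(3), e(3)) - 1) < 1e-6
assert abs(ip(e(3), e(5))) < 1e-6
print("ok")
```

This orthonormality is the starting point for Fourier series in the Hilbert-space setting developed later.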


2.1.P1. For which values of a ∈ ℂ does the sequence {n^{−a}}_{n≥1} belong to ℓ²?

2.1.P2. [Notations as in Example 2.1.3(vi)] Calculate the inner product of the functions

f(z) = 1/(z − a) and g(z) = 1/(z − b), where |a| < 1, |b| < 1.

2.1.P3. [Notations as in Example 2.1.3(vi)] Let k_a(z) = (1 − a*z)^{−1}, where |a| ≠ 1. Show that for f ∈ RH²,

(f, k_a) = f(a) if |a| < 1 and (f, k_a) = 0 if |a| > 1.

Deduce that, for |a| < 1,

|f(a)| ≤ (1 − |a|²)^{−1/2} {(1/2π) ∫_{−π}^{π} |f(e^{iθ})|² dθ}^{1/2}.

Definition 2.2.1 Let X be a linear space over F. A norm ‖·‖ is a function from X into the nonnegative reals ℝ⁺ satisfying

(i) ‖x‖ = 0 if, and only if, x = 0,
(ii) ‖kx‖ = |k|‖x‖ for each k ∈ F and x ∈ X,
(iii) ‖x + y‖ ≤ ‖x‖ + ‖y‖ for all x, y ∈ X. [triangle inequality]

We emphasise that, by definition, ‖x‖ ≥ 0 for all x ∈ X.

If X is a linear space and ‖·‖ is a norm defined on X, then d(x, y) = ‖x − y‖ indeed gives rise to a metric as a consequence of the foregoing Definition 2.2.1. The details are as follows.

That the distance d(x, y) from a vector x to a vector y in X is nonnegative and vanishes if, and only if, x = y follows from (i). The fact that d(x, y) = d(y, x) follows from ‖x − y‖ = ‖(−1)(y − x)‖ = ‖y − x‖, on using (ii). Finally,

d(x, z) = ‖x − z‖ = ‖(x − y) + (y − z)‖ ≤ ‖x − y‖ + ‖y − z‖ = d(x, y) + d(y, z)

2.2 Norm of a Vector 27

for all x, y and z. The reader will observe that (iii) has been used in proving the

preceding inequality.

A linear space X equipped with a norm ‖·‖ is called a normed linear space. If the metric space (X, d), where d(x, y) = ‖x − y‖, x, y ∈ X, is complete, then the normed linear space is said to be complete and is called a Banach space. These spaces are named after the great Polish mathematician Stefan Banach. ℝⁿ, the real space of n-tuples x = (x_1, x_2, …, x_n), with each of the norms

‖x‖₁ = Σ_{i=1}^n |x_i|, ‖x‖₂ = (Σ_{i=1}^n |x_i|²)^{1/2}, ‖x‖_∞ = sup_i |x_i|,

is a Banach space. That ‖x‖₁, ‖x‖₂ and ‖x‖_∞ are norms can be verified, see [30]. So is the space ℂⁿ of complex n-tuples. The space (ℂⁿ, ‖·‖₂) is complete [see Example 2.3.4(i)]. That ℂⁿ with ‖·‖₁ and ℂⁿ with ‖·‖_∞ are complete follows from the inequalities ‖·‖_∞ ≤ ‖·‖₂ ≤ ‖·‖₁ ≤ n‖·‖_∞, see [30].
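As an illustrative aside (not from the text), the chain of inequalities ‖·‖_∞ ≤ ‖·‖₂ ≤ ‖·‖₁ ≤ n‖·‖_∞ can be checked numerically on random complex n-tuples:

```python
import random

def norm1(x):   return sum(abs(t) for t in x)
def norm2(x):   return sum(abs(t) ** 2 for t in x) ** 0.5
def norminf(x): return max(abs(t) for t in x)

random.seed(1)
n = 7
for _ in range(500):
    x = [complex(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(n)]
    # the three norms are comparable, with constants depending only on n
    assert norminf(x) <= norm2(x) + 1e-9
    assert norm2(x) <= norm1(x) + 1e-9
    assert norm1(x) <= n * norminf(x) + 1e-9
print("ok")
```

Because the norms are comparable, the three metrics they induce on ℂⁿ have the same convergent sequences, which is why completeness transfers from one norm to the others.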

Hilbert spaces are Banach spaces whose norms are derived from an inner pro-

duct as detailed below.

Definition 2.2.2 In an inner product space H, the norm (or length) of a vector x ∈ H, denoted by ‖x‖, is the nonnegative real number defined by

‖x‖ = (x, x)^{1/2}.

We shall see below that this satisﬁes the conditions for being a norm as laid out

in Deﬁnition 2.2.1.

The norm of an element x = (a_1, a_2, …, a_n) in the unitary space ℂⁿ is

‖x‖ = (Σ_{i=1}^n |a_i|²)^{1/2},

and that of an element x = {a_i}_{i≥1} in ℓ² is

‖x‖ = (Σ_{i=1}^∞ |a_i|²)^{1/2}.


The norm of f ∈ C[a, b] [resp. Cⁿ[a, b]] is

‖f‖ = (∫_a^b |f(t)|² dt)^{1/2} [resp. (Σ_{i=0}^n ∫_a^b |f^{(i)}(t)|² dt)^{1/2}].

The norm of f ∈ RL² is

‖f‖ = ((1/2πi) ∫_{∂D} |f(z)|² (dz/z))^{1/2} = ((1/2π) ∫_{−π}^{π} |f(e^{iθ})|² dθ)^{1/2}.

Proposition 2.2.3 In an inner product space H, ‖·‖ has the following properties: for x, y ∈ H and k ∈ F,

(a) ‖x‖ ≥ 0 and ‖x‖ = 0 if, and only if, x = 0;
(b) ‖kx‖ = |k|‖x‖;
(c) (Parallelogram Law) ‖x + y‖² + ‖x − y‖² = 2‖x‖² + 2‖y‖²;
(d) (Polarisation Identity) 4(x, y) = ‖x + y‖² − ‖x − y‖² + i‖x + iy‖² − i‖x − iy‖².

Proof (a) is immediate from Definition 2.1.1(iii), while (b) follows from ‖kx‖² = (kx, kx) = kk*(x, x) = |k|²‖x‖². For x, y ∈ H, we have

‖x + y‖² = ‖x‖² + ‖y‖² + (x, y) + (y, x).     (2.10)

Replacing y by −y in (2.10) and adding the two identities yields (c):

‖x + y‖² + ‖x − y‖² = 2‖x‖² + 2‖y‖².     (2.11)

In the identity (2.10), replace y by −y, iy and −iy:

‖x − y‖² = ‖x‖² + ‖y‖² − (x, y) − (y, x),     (2.12)
‖x + iy‖² = ‖x‖² + ‖y‖² − i(x, y) + i(y, x),     (2.13)
‖x − iy‖² = ‖x‖² + ‖y‖² + i(x, y) − i(y, x).     (2.14)


Multiply both sides of (2.12) by −1, (2.13) by i and (2.14) by −i and add to (2.10) to obtain the following:

‖x + y‖² − ‖x − y‖² + i‖x + iy‖² − i‖x − iy‖² = 4(x, y),

which is (d). ∎

Remark In Proposition 2.2.3, the assertions (a)–(c) are valid in real as well as

complex inner product spaces, but (d) holds only in a complex inner product space.

Theorem 2.2.4 (Cauchy–Schwarz Inequality) Let H be an inner product space and let ‖x‖ denote the norm of x ∈ H. Then

|(x, y)| ≤ ‖x‖‖y‖     (2.15)

for x, y ∈ H, and equality holds if, and only if, x and y are linearly dependent.

Proof Choose a real number θ such that e^{iθ}(x, y) = |(x, y)|. Let k = ae^{−iθ}, where a ∈ ℝ. Then

(x − ky, x − ky) = ‖x‖² + |k|²‖y‖² − k*(x, y) − k(y, x) = ‖x‖² + a²‖y‖² − 2a|(x, y)|.     (2.16)

The expression on the left side of (2.16) is real and nonnegative. Hence,

‖x‖² + a²‖y‖² − 2a|(x, y)| ≥ 0     (2.17)

for every real a. If ‖y‖ = 0, then we must have |(x, y)| = 0, for otherwise (2.17) will be false for large positive values of a. If ‖y‖ > 0, take a = |(x, y)|/‖y‖² in (2.17) and obtain

|(x, y)| ≤ ‖x‖‖y‖.

k 2 F: Then

¼ kkxkk xk ¼ k ykkxk;


On the other hand, suppose that |(x, y)| = ||x|| ||y||. If ||y|| = 0, then y = 0 and x and y are linearly dependent. If ||y|| ≠ 0, then

(x − ((x, y)/||y||²) y, x − ((x, y)/||y||²) y) = ||x||² + |(x, y)|²/||y||² − 2ℜ(x, ((x, y)/||y||²) y)
= ||x||² + |(x, y)|²/||y||² − 2|(x, y)|²/||y||²
= ||x||² − |(x, y)|²/||y||² = 0,

since |(x, y)| = ||x|| ||y||. Hence, x − ((x, y)/||y||²) y = 0; that is, x and y are linearly dependent. □

Remark The above proof of the Cauchy–Schwarz Inequality is valid in the real case

as well.

Applying Theorem 2.2.4 in speciﬁc spaces such as Cn , ‘2 and C[a, b], the

following corollary results.

Corollary 2.2.5
(a) If a_1, a_2, …, a_n and b_1, b_2, …, b_n are complex numbers, then

|Σ_{i=1}^{n} a_i b̄_i| ≤ (Σ_{i=1}^{n} |a_i|²)^{1/2} (Σ_{i=1}^{n} |b_i|²)^{1/2}.

(b) If {a_i}_{i≥1} and {b_i}_{i≥1} are square summable sequences of complex numbers, then

|Σ_{i=1}^{∞} a_i b̄_i| ≤ (Σ_{i=1}^{∞} |a_i|²)^{1/2} (Σ_{i=1}^{∞} |b_i|²)^{1/2}.

(c) If f and g are in C[a, b], then

|∫_a^b f(t)ḡ(t) dt| ≤ (∫_a^b |f(t)|² dt)^{1/2} (∫_a^b |g(t)|² dt)^{1/2}.

In each case, equality holds if, and only if, the vectors involved are linearly dependent.
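The finite-dimensional case (a) is easy to test numerically. The Python sketch below (our own illustration) checks the inequality for random complex vectors and checks that equality holds, up to rounding, when the vectors are linearly dependent.

```python
import random

def inner(x, y):
    # inner product on C^n, linear in the first argument
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return inner(x, x).real ** 0.5

random.seed(1)
x = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(5)]
y = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(5)]

# Cauchy–Schwarz: |(x, y)| <= ||x|| ||y||
assert abs(inner(x, y)) <= norm(x) * norm(y)

# equality when z = λx (linear dependence)
lam = 2 - 3j
z = [lam * a for a in x]
assert abs(abs(inner(x, z)) - norm(x) * norm(z)) < 1e-9
```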

Theorem 2.2.6 (Triangle Inequality) If H is an inner product space, then

||x + y|| ≤ ||x|| + ||y||   (2.18)

for all x, y ∈ H.
Proof For x, y ∈ H,

||x + y||² = (x + y, x + y)
= ||x||² + ||y||² + (x, y) + (y, x)
= ||x||² + ||y||² + 2ℜ(x, y)
≤ ||x||² + ||y||² + 2||x|| ||y||,

using the Cauchy–Schwarz Inequality in the last step; that is,

||x + y||² ≤ (||x|| + ||y||)²,

which implies

||x + y|| ≤ ||x|| + ||y||. □

Applying Theorem 2.2.6 to speciﬁc inner product spaces, such as ‘2, RL2

and C[a, b], the following inequalities are obtained:

Corollary 2.2.7
(a) If {x_i}_{i≥1} and {y_i}_{i≥1} are in ℓ², then

(Σ_{i=1}^{∞} |x_i + y_i|²)^{1/2} ≤ (Σ_{i=1}^{∞} |x_i|²)^{1/2} + (Σ_{i=1}^{∞} |y_i|²)^{1/2}.

(b) If f and g are in RL², then

((1/2π) ∫_{−π}^{π} |f(e^{iθ}) + g(e^{iθ})|² dθ)^{1/2} ≤ ((1/2π) ∫_{−π}^{π} |f(e^{iθ})|² dθ)^{1/2} + ((1/2π) ∫_{−π}^{π} |g(e^{iθ})|² dθ)^{1/2}.

(c) If f and g are in C[a, b], then

(∫_a^b |f(t) + g(t)|² dt)^{1/2} ≤ (∫_a^b |f(t)|² dt)^{1/2} + (∫_a^b |g(t)|² dt)^{1/2}.

The norm also satisfies

| ||x|| − ||y|| | ≤ ||x − y||   (2.19)

for all x, y ∈ H.
Proof For x, y ∈ H,

||x|| = ||x − y + y|| ≤ ||x − y|| + ||y||,

so that

||x|| − ||y|| ≤ ||x − y||.   (2.20)

Interchanging the roles of x and y,

||y|| − ||x|| ≤ ||y − x|| = ||x − y||.   (2.21)

Together, (2.20) and (2.21) yield (2.19). □

Problem Set 2.2

2.2.P1. Show that for x, y and z ∈ X, an inner product space, the following Apollonius Identity holds:

||x − z||² + ||y − z||² = ½ ||x − y||² + 2 ||z − ½(x + y)||².

2.2.P2. Show that the formula (A, B) = trace(B*A) defines an inner product on the space ℂ^{n×n} of n × n complex matrices, where n ∈ ℕ and B* denotes the conjugate transpose of B. Determine ||I_n||, where I_n is the identity matrix.
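For this problem, note that (I_n, I_n) = trace(I_n* I_n) = trace(I_n) = n, so ||I_n|| = √n. The plain-Python sketch below (all helper names are ours) confirms this for n = 4.

```python
def conj_transpose(m):
    return [[m[j][i].conjugate() for j in range(len(m))] for i in range(len(m[0]))]

def matmul(a, b):
    n, p, q = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(p)) for j in range(q)] for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def inner(a, b):
    # (A, B) = trace(B* A) on the space of n x n complex matrices
    return trace(matmul(conj_transpose(b), a))

n = 4
identity = [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
norm_of_identity = inner(identity, identity).real ** 0.5
assert abs(norm_of_identity - n ** 0.5) < 1e-12  # ||I_n|| = sqrt(n)
```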

2.2.P3. Show that (i) x = {1/n}_{n≥1} ∈ ℓ² and determine ||x||; (ii) x = {2^{−n/2}}_{n≥1} ∈ ℓ² and determine ||x||.


2.2.P4. Show that, for f ∈ C[0, π],

|∫_0^π f(t) sin t dt| ≤ √(π/2) (∫_0^π |f(t)|² dt)^{1/2}.

2.2.P5. Suppose µ(X) = 1 and f, g are positive measurable functions on X such that fg ≥ 1. Prove that ∫_X f dµ · ∫_X g dµ ≥ 1.

2.2.P6. Let X1 = (C[0, 1], ||·||_∞), where ||x||_∞ = sup_{0≤t≤1} |x(t)|, and let X2 = (C[0, 1], ||·||₂), where

||x||₂ = (x, x)^{1/2},  (x, y) = ∫_0^1 x(t)ȳ(t) dt.

Show that the identity mapping id: X1 ! X2 is continuous, but its inverse

id: X2 ! X1 is not.

2.2.P7. Let X be an inner product space. If ||x|| = ||y|| = ½||x + y||, then show that x = y. The result fails to hold if X = ℝⁿ or ℂⁿ with norm ||·||₁ when n > 1.

2.2.P8. Let X be a vector space over ℂ. Let (·,·) be a complex-valued function of two variables (x, y): X × X → ℂ which has the following properties:

(a) (ax1 + bx2, y) = a(x1, y) + b(x2, y);
(b) (x, y) is the complex conjugate of (y, x);
(c) (x, x) ≥ 0, and (x, x) may be zero for nonzero x.

Prove that the Cauchy–Schwarz Inequality still holds but without the rider

about when equality holds.

2.2.P9. [Notations as in Example 2.1.3(vi)] Let g(z) = 1/((z − a)(z − b)), where a and b are distinct points in D. Using the Residue Theorem, show that

||g|| = (1 − |ab|²)^{1/2} / [(1 − |a|²)^{1/2} (1 − |b|²)^{1/2} |1 − ab̄|].

F = {f ∈ RH² : f(a) = 0}


2.3 Inner Product Spaces as Metric Spaces

We have seen in Proposition 2.2.3 and Theorem 2.2.6 that the norm induced by an inner product in H satisfies the following conditions of Definition 2.2.1:

(i) ||x|| ≥ 0 and ||x|| = 0 if, and only if, x = 0,
(ii) ||λx|| = |λ| ||x|| for all λ ∈ ℂ and x ∈ H,
(iii) ||x + y|| ≤ ||x|| + ||y|| for all x, y ∈ H.

Remark 2.3.1 The space H, together with the metric d(x, y) = ||x − y|| derived from the norm induced by the inner product, is a metric space. As in any metric space, d(x, y) is called the distance from x to y or between x and y.

We shall henceforth feel free to use, for inner product spaces, all metric concepts

such as open and closed sets, convergence, continuity, uniform continuity, Cauchy

sequence, completeness, dense sets and separability. Below we translate the general

metric space concepts deﬁned in Sect. 1.2 into inner product space terms.

It follows from (2.19) that the map x ! ||x|| deﬁned in H is continuous. In fact, it

is uniformly continuous in view of (2.19).

Remarks 2.3.2

(i) A sequence {xn}n≥1 in a normed space or in an inner product space is Cauchy if for every ε > 0, there exists an integer n0 such that

||xn − xm|| < ε whenever n, m ≥ n0.

(ii) Every convergent sequence is Cauchy; the converse is, however, not true. In fact, let {xn}n≥1, where xn = (1, 1/2, …, 1/n, 0, …), n = 1, 2, …, be a sequence in the inner product space ℓ0 (see Example 2.1.3(ii)). Then the sequence {xn}n≥1 is Cauchy because

||x_{m+p} − x_m|| = (Σ_{k=m+1}^{m+p} 1/k²)^{1/2},

which tends to zero as m → ∞, uniformly in p. However, the sequence does not converge to an element of the space. Assume the contrary, that is, suppose xn → x, where x = (λ1, λ2, …, λN, 0, 0, …). If n ≥ N, then

||xn − x||² = Σ_{k=1}^{n} |1/k − λ_k|² + Σ_{k=n+1}^{∞} |λ_k|²
= Σ_{k=1}^{n} |1/k − λ_k|².

On letting n → ∞, we obtain Σ_{k=1}^{∞} |1/k − λ_k|² = 0, which implies λ_k = 1/k for each k, contradicting the fact that x has finitely many nonzero terms.

(iii) The open [respectively, closed] ball with centre x0 and radius ε is the set {x ∈ H: ||x − x0|| < ε} [respectively, {x ∈ H: ||x − x0|| ≤ ε}]. In view of Proposition 1.2.10, a subset K ⊆ H is bounded if, and only if, there exists an M > 0 such that K ⊆ {x: ||x|| ≤ M}.

Definition 2.3.3 A normed [respectively, inner product] space H is said to be complete if every Cauchy sequence in H converges. That is, if {xn}n≥1 is a sequence in H satisfying ||xn − xm|| → 0 as n, m → ∞, there exists an x ∈ H such that ||xn − x|| → 0 as n → ∞. An inner product space which is complete is called a Hilbert space. Every Hilbert space is a Banach space. The norm in a Hilbert space is derived from the inner product.

Examples 2.3.4
(i) The inner product space H = ℂⁿ with the metric given by

d(x, y) = ||x − y|| = (Σ_{i=1}^{n} |x_i − y_i|²)^{1/2},   (2.22)

where x = (x1, x2, …, xn) and y = (y1, y2, …, yn) are in ℂⁿ, is a Hilbert space with metric as above or with inner product as in (i) of Examples 2.1.3.
We need to check that ℂⁿ, with the metric defined in (2.22), is complete [see (i) of Examples 2.1.3]. Let {x^(m)}_{m≥1}, x^(m) = (x_1^(m), x_2^(m), …, x_n^(m)), denote a Cauchy sequence in ℂⁿ, i.e. d(x^(m), x^(m′)) → 0 as m, m′ → ∞. Then for a given ε > 0 there exists an integer n0(ε) such that

(Σ_{k=1}^{n} |x_k^(m) − x_k^(m′)|²)^{1/2} < ε for all m, m′ ≥ n0(ε).   (2.23)

Hence |x_k^(m) − x_k^(m′)| < ε for all m, m′ ≥ n0(ε) and all k = 1, 2, …, n. Upon fixing k and using the Cauchy Principle of Convergence, it follows that {x_k^(m)}_{m≥1} converges to a limit x_k. Let x = (x1, x2, …, xn) and m ≥ n0(ε). It follows from (2.23) that

Σ_{k=1}^{n} |x_k^(m) − x_k^(m′)|² < ε²   (2.24)

for m′ ≥ n0(ε); on letting m′ → ∞ in (2.24), we obtain

Σ_{k=1}^{n} |x_k^(m) − x_k|² ≤ ε²,

so that d(x^(m), x) ≤ ε for all m ≥ n0(ε). Hence x^(m) → x ∈ ℂⁿ, proving completeness.

(ii) The inner product space H = ℓ² (see Example 2.1.3(iii)) is a Hilbert space.
We shall show that ℓ² with the metric

d(x, y) = ||x − y|| = (Σ_{k=1}^{∞} |x_k − y_k|²)^{1/2}   (2.25)

is complete. Let {x^(m)}_{m≥1}, x^(m) = (x_1^(m), x_2^(m), …), denote a Cauchy sequence in ℓ². Then for a given ε > 0 there exists an integer n0(ε) such that

(Σ_{k=1}^{∞} |x_k^(m) − x_k^(m′)|²)^{1/2} < ε for all m, m′ ≥ n0(ε).   (2.26)

This implies |x_k^(m) − x_k^(m′)| < ε for all m, m′ ≥ n0(ε), i.e. for each k, the sequence {x_k^(m)}_{m≥1} is a Cauchy sequence of complex numbers. So by the Cauchy Principle of Convergence, lim_m x_k^(m) = x_k, say. Let x be the sequence (x1, x2, …). It will be shown that x ∈ ℓ² and lim_m x^(m) = x. From (2.26), we have, for every N,

Σ_{k=1}^{N} |x_k^(m) − x_k^(m′)|² < ε²   (2.27)

for all m, m′ ≥ n0(ε); on letting m′ → ∞, we obtain

Σ_{k=1}^{N} |x_k^(m) − x_k|² ≤ ε².

Now {Σ_{k=1}^{N} |x_k^(m) − x_k|²}_{N≥1} is a monotonically increasing sequence of nonnegative real numbers and is bounded above and, therefore, has a finite limit Σ_{k=1}^{∞} |x_k^(m) − x_k|², which is less than or equal to ε². Hence

(Σ_{k=1}^{∞} |x_k^(m) − x_k|²)^{1/2} ≤ ε for all m ≥ n0(ε).   (2.28)

Observe that

(Σ_{k=1}^{∞} |x_k|²)^{1/2} ≤ (Σ_{k=1}^{∞} |x_k^(m) − x_k|²)^{1/2} + (Σ_{k=1}^{∞} |x_k^(m)|²)^{1/2},

so that x ∈ ℓ²; moreover, lim_m x^(m) = x by (2.28).
It follows from Remark 2.3.2(ii) that ℓ0 is an inner product space that is not complete.
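The failure of completeness in ℓ0 can be made concrete numerically. The sketch below (our own illustration; the helper names are ours) computes ||x_{m+p} − x_m|| for the sequence x_n = (1, 1/2, …, 1/n, 0, …) of Remark 2.3.2(ii) and confirms that it matches the tail sum (Σ_{k=m+1}^{m+p} 1/k²)^{1/2}, which is small for large m.

```python
# x_n = (1, 1/2, ..., 1/n, 0, 0, ...) lies in l0
def x(n):
    return [1.0 / k for k in range(1, n + 1)]

def dist(a, b):
    # l2 distance, padding the shorter vector with zeros
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5

m, p = 1000, 500
tail = sum((1.0 / k) ** 2 for k in range(m + 1, m + p + 1)) ** 0.5
assert abs(dist(x(m + p), x(m)) - tail) < 1e-12
assert dist(x(m + p), x(m)) < 0.05   # Cauchy behaviour for large m

# but the only candidate limit (1, 1/2, 1/3, ...) has infinitely many
# nonzero terms, hence lies in l2 but not in l0: l0 is not complete
```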

Remarks 2.3.5

(i) The inner product space ℓ0 of sequences all of whose terms, from some index onwards, are zero is dense in ℓ². In fact, let x = (a1, a2, …) be an element in ℓ² (not in ℓ0) and ε > 0 be given. Choose N such that

Σ_{k=N+1}^{∞} |a_k|² < ε².

Then the sequence y = (a1, a2, …, aN, 0, …) is in the desired inner product space and is such that

||x − y|| = (Σ_{k=N+1}^{∞} |a_k|²)^{1/2} < ε.

This shows that each x ∈ ℓ² (not in ℓ0) is a limit point of the space ℓ0 of sequences all of whose terms, from some index onwards, are zero.

(ii) It may be discerned from (i) above that ‘0 is not complete.

(iii) For j = 1, 2, …, let e_j = (0, …, 0, 1, 0, 0, …), where 1 occurs only in the jth place, and let E denote the set of all finite linear combinations of the e_j with coefficients whose real and imaginary parts are rational. We show that E is dense in ℓ². Let (x1, x2, …) ∈ ℓ² and ε > 0. As Σ_{j=1}^{∞} |x_j|² is finite, there is some N such that

Σ_{j=N+1}^{∞} |x_j|² < ε²/2.

Since the rational numbers are dense in ℝ, there are λ1, …, λN in ℂ with ℜλ_j, ℑλ_j rational and

|x_j − λ_j|² < ε²/2N, j = 1, 2, …, N.

Consider y = λ1 e1 + ⋯ + λN eN in E. Then

||x − y||² = Σ_{j=1}^{N} |x_j − λ_j|² + Σ_{j=N+1}^{∞} |x_j|² < ε²/2 + ε²/2 = ε².

Since E is countable, it follows that ℓ² is a separable metric space.

Definition 2.3.6 Two Hilbert spaces H and K are said to be isometrically isomorphic if there exists a linear isometry between H and K, i.e. if there exists a bijective linear mapping A: H → K such that

(Ax, Ay) = (x, y) for all x, y ∈ H.

Theorem 2.3.7 For every inner product space X, there is a Hilbert space H such

that X is a dense linear subspace of H, and for x, y 2 X, the inner product (x, y) in X

and in H is the same. The space H is unique up to a linear isometry; that is, if X is a

dense linear subspace of a Hilbert space K, then there is a unique linear isometry

A:H ! K such that the restriction of A to X is the identity map.

Proof Consider X as a metric space with the metric induced by the inner product on X, i.e. with d(x, y) = ||x − y|| = (x − y, x − y)^{1/2}. Let H be its completion. Let x, y ∈ H and let {xn}n≥1 and {yn}n≥1 be sequences in X such that xn → x and yn → y. Then for scalars λ and µ, the sequence {λxn + µyn}n≥1 is a Cauchy sequence in X. Now, if H is to be given a Hilbert space structure such that the inner product on H induces the metric on H, then λx + µy must be the limit of the Cauchy sequence {λxn + µyn}n≥1. It may be checked that the addition, scalar multiplication and the limits of the Cauchy sequences are well defined. It is now easy to check that with this definition of addition and scalar multiplication, H becomes a vector space. Now define (x, y) = limn(xn, yn); note that it is well defined. In fact, it is an inner product on H whose restriction to X agrees with the given inner product in X. With this inner product, H is a Hilbert space. The uniqueness can be easily verified. □

Problem Set 2.3

2.3.P1. Show that the space (C[0, 1], ||·||_∞), where ||x||_∞ = sup_{0≤t≤1} |x(t)|, is not an inner product space, hence not a Hilbert space.

Deﬁnition A strictly convex norm on a normed linear space is a norm such that,

for all x, y 2 X, ||x|| = ||y|| = 1, y 6¼ x ) ||x + y|| < 2.

2:3:P2: (a) Show that the norm on a Hilbert space is strictly convex.

(b) Show that the norm ||||∞ on C[0, 1] is not strictly convex.

(c) Show that the norm ||||1 on C[0, 1] is not strictly convex.

2.3.P3. Let H be the collection of all absolutely continuous functions x: [0, 1] → F such that x(0) = 0 and x′ ∈ L²[0, 1]. If (x, y) = ∫_0^1 x′(t)ȳ′(t) dt for x, y ∈ H, show that H is a Hilbert space.

2:3:P4. Let H be a Hilbert space over R: Show that there is a Hilbert space K over

C and a map U:H ! K such that (i) U is linear (ii) (Ux1, Ux2) = (x1, x2) for

all x1, x2 2 H, (iii) for any z 2 K, there are unique x1, x2 2 H such that

z = Ux1 + iUx2.

2:3:P5: (a) Suppose x and y are vectors in a normed space such that ||x|| = ||y||. If

there exists t 2 (0,1) such that ||tx + (1 − t)y|| < ||x||, then show that this

strict inequality holds for all t 2 (0, 1).

(b) Let x and y belong to a real or complex strictly convex normed space.

If ||x + y|| = ||x|| + ||y|| and x 6¼ 0 6¼ y, show that there exists a > 0 such

that y = ax.

2.3.P6. The set of all vectors x = {ηn}n≥1 with |ηn| ≤ 1/n, n = 1, 2, …, in real ℓ² is called the Hilbert cube. Show that this set is compact in ℓ².

2.3.P7. Let a = {an}n≥1 be a sequence of positive real numbers. Define ℓ²_a = {x = (x1, x2, …): x_i ∈ ℂ and Σ_{i=1}^{∞} a_i |x_i|² < ∞}. Define an inner product on ℓ²_a by (x, y) = Σ_{i=1}^{∞} a_i x_i ȳ_i. Show that ℓ²_a is a Hilbert space.

2.3.P8. For a real number s, we define on ℤ a measure µs by setting

µs({n}) = (1 + n²)^{s/2}, n ∈ ℤ.

2.3.P9. (a) Find a sequence a of positive real numbers such that (1, 1/2², 1/3³, …) ∉ ℓ²_a [see Problem 2.3.P7].
(b) Find a sequence a of positive real numbers such that all x = {xn}n≥1 with |xn| = nⁿ are in ℓ²_a.

2:3:P10. Let M be a closed subspace of a Hilbert space H, and let y 2 H, y 62 M. If

M′ is the subspace spanned by M and y, then M′ is closed. In particular, a

ﬁnite-dimensional subspace must be closed.


2.4 The Space L²(X; M, µ)

The class of functions L̃^p(X; M, µ) for all p, 0 < p ≤ ∞ [see Definition 1.3.5], is important in analysis. Here we are concerned only with the cases p = 1 and p = 2. We shall see that there is a Hilbert space associated with L̃²(X; M, µ). Henceforth, the symbols M and µ will be omitted.

Proposition 2.4.1 L̃²(X) is a vector space.

Proof Suppose that f ∈ L̃²(X; M, µ) and a ∈ ℂ. Then f is measurable and so af is measurable; |f|² is integrable and so |af|² = |a|²|f|² is integrable. Thus, af ∈ L̃²(X, M, µ).
Suppose that f, g ∈ L̃²(X). Then f and g are measurable and so f + g is measurable. For all complex numbers a and b, |a + b|² ≤ 2(|a|² + |b|²), as follows on applying the Cauchy–Schwarz Inequality (Theorem 2.2.4) to the inner product ((a, b), (1, 1)) in the inner product space ℂ². So, for all x ∈ X,

0 ≤ |f(x) + g(x)|² ≤ 2(|f(x)|² + |g(x)|²).

The function on the right is integrable and therefore so is the measurable function |f + g|². This proves that f + g ∈ L̃²(X). □

Definition 2.4.2 If f and g are two complex-valued measurable functions defined on X, let us write 'f ∼ g' if {x ∈ X: f(x) ≠ g(x)} is a null set (a measurable set of measure zero). One says that the functions f and g are equivalent or that f − g is a null function.

Let N denote the set of all null functions, that is,

N = {f : f ∼ 0}.

The relation ∼ is an equivalence relation in the set of complex-valued measurable functions defined on X. As such, it partitions the set into disjoint equivalence classes, where a typical class, denoted by [f], is given by

[f] = {g : g measurable on X and g ∼ f}.


Note that N is a vector space of functions and N ⊆ L̃^p(X) for all p > 0.
If f is a function in L̃^p(X), then [f] = f + N is the coset containing f, as defined towards the end of Sect. 1.1.

Definition 2.4.3 The space L²(X, M, µ) is the set of all equivalence classes of functions in L̃²(X).
Thus, if f ∈ L̃²(X), then [f] is the corresponding member of L²(X, M, µ). One says that f is a representative of the equivalence class [f]. The set just defined is what we intend to make into the promised Hilbert space associated with L̃²(X, M, µ).

Proposition 2.4.4 L2(X) is a vector space.

Proof N is a subspace of L̃²(X), and L²(X) is actually the quotient space L̃²(X, M, µ)/N. □

We next define an inner product on the space L̃²(X).

Proposition 2.4.5 If f, g ∈ L̃²(X), then fg ∈ L̃¹(X).
Proof Suppose f, g ∈ L̃²(X). Then f, g are measurable and so the product fg is measurable. The functions |f|² and |g|² are integrable, and it follows from the inequality

|f(x)g(x)| ≤ ½ (|f(x)|² + |g(x)|²), x ∈ X,

that fg is integrable. □

Now let us define (f, g) for f, g ∈ L̃²(X) by

(f, g) = ∫_X f(x)ḡ(x) dµ(x), or ∫_X (f ḡ) dµ.

Note that if g ∈ L̃²(X, M, µ), then so does ḡ, and so by Proposition 2.4.5, (f, g) is well defined. The reader can check that (·,·) has all the properties of an inner product except one. If (f, f) = 0, then

0 = (f, f) = ∫_X |f(x)|² dµ(x) = ∫_X |f|² dµ,

and therefore f ∼ 0; that is, f is a null function, but one cannot conclude that f = 0.
However, if f ∼ f′ and g ∼ g′, then (f, g) = (f′, g′). In fact,


|∫_X (f ḡ) − ∫_X (f′ ḡ′)| = |∫_X (f − f′)ḡ + ∫_X f′(ḡ − ḡ′)|
≤ |∫_X (f − f′)ḡ| + |∫_X f′(ḡ − ḡ′)|
≤ (f − f′, f − f′)^{1/2} (g, g)^{1/2} + (f′, f′)^{1/2} (g − g′, g − g′)^{1/2} = 0,

using the Cauchy–Schwarz Inequality. Consequently, in the notation introduced above, the integral

∫_X (f ḡ) dµ

depends only on the equivalence classes [f] and [g] and not on the choice of representatives.

We can now define (·,·) on L²(X, M, µ).
Definition 2.4.6 For [f], [g] ∈ L²(X), define

([f], [g]) = ∫_X (f ḡ) dµ,

where f and g are any representatives of the respective equivalence classes.
In view of the remarks preceding Definition 2.4.6, (·,·) is unambiguously defined on L²(X, M, µ).

Proposition 2.4.7 The space L2(X) with inner product as in Deﬁnition 2.4.6 is an

inner product space.

Proof For [f] ∈ L²(X), if ([f], [f]) = 0, then ∫_X |f|² dµ = 0 and hence [f] = 0, the zero of L²(X). The verification of the other axioms of an inner product is straightforward. □

Remark 2.4.8 We shall adopt the usual practice and abandon the notation [f]. The symbol f will be used to denote both a function in L̃²(X) and the corresponding equivalence class of functions in L²(X). Working mathematicians tend to ignore the distinction between a function and its equivalence class. The correspondence between statements about L̃²(X) and L²(X) is straightforward and gives rise to no confusion. In the subsequent discussions, it will always be clear whether calculations are in terms of functions or equivalence classes of functions.


The norm induced by this inner product is

||f|| = (f, f)^{1/2} = (∫_X |f(x)|² dµ(x))^{1/2} = (∫_X |f|² dµ)^{1/2}.

The theorem below, which asserts that L²(X) is complete, is called the Riesz–Fischer Theorem.

Theorem 2.4.9 (Riesz–Fischer Theorem) (L²(X), (·,·)) is a Hilbert space.
Proof Let {fn}n≥1 be a Cauchy sequence in L²(X); that is, for ε > 0 there exists an integer n0 such that

||fn − fm|| < ε whenever n, m ≥ n0.

There is a subsequence {f_{n_i}}_{i≥1} such that

||f_{n_{i+1}} − f_{n_i}|| = (∫_X |f_{n_{i+1}} − f_{n_i}|² dµ)^{1/2} < 1/2^i, i = 1, 2, …

Indeed, if n_k has been selected, choose n_{k+1} > n_k such that n, m ≥ n_{k+1} implies ||fn − fm|| < 1/2^{k+1}.
Let

g_k = Σ_{i=1}^{k} |f_{n_{i+1}} − f_{n_i}|  and  g = Σ_{i=1}^{∞} |f_{n_{i+1}} − f_{n_i}|.

By the triangle inequality,

||g_k|| = ||Σ_{i=1}^{k} |f_{n_{i+1}} − f_{n_i}| || ≤ Σ_{i=1}^{k} ||f_{n_{i+1}} − f_{n_i}|| < Σ_{i=1}^{k} 1/2^i < 1 for k = 1, 2, …

By Fatou's Lemma,

∫_X g² dµ = ∫_X (lim inf_k g_k²) dµ ≤ lim inf_k ∫_X g_k² dµ ≤ 1,

so that g(x) < ∞ a.e. Indeed, if E = {x: g(x) = ∞} had positive measure, then ∫_X g(x)² dµ(x) = ∞. Therefore, the series

f_{n_1}(x) + Σ_{i=1}^{∞} (f_{n_{i+1}}(x) − f_{n_i}(x))   (2.29)

converges absolutely for almost all x. Denote the sum of (2.29) by f(x) for those x at which (2.29) converges; put f(x) = 0 on the remaining set of measure zero. Since

f_{n_1} + Σ_{i=1}^{k−1} (f_{n_{i+1}} − f_{n_i}) = f_{n_k},

we see that f_{n_i}(x) → f(x) a.e. as i → ∞.
Having found a function f which is the pointwise a.e. limit of {f_{n_i}}_{i≥1}, we have to show that f is the L²-limit of {fn}n≥1, i.e. lim_n ||fn − f|| = 0, where ||·|| denotes the L²-norm. Recall that ||fn − fm|| < ε whenever n, m ≥ n0. By Fatou's Lemma, for m ≥ n0,

∫_X |f − fm|² dµ ≤ lim inf_i ∫_X |f_{n_i} − fm|² dµ ≤ ε².   (2.30)

In particular, f − fm ∈ L²(X), and hence so is f = (f − fm) + fm. Finally, (2.30) shows that

||f − fm|| → 0 as m → ∞. □


Remark 2.4.10 In the course of the proof, we have shown that if {fn}n≥1 is a Cauchy sequence in L²(X) with limit f, then {fn}n≥1 has a subsequence which converges pointwise almost everywhere to f.

The simple functions play an important role in L2(X).

Theorem 2.4.11 Let S be the collection of all measurable simple functions on X

vanishing outside subsets of ﬁnite measure. Then S is dense in L2(X).

Proof Clearly, S ⊆ L²(X). Let f ∈ L²(X) and assume first that f ≥ 0. There exists a sequence {sn}n≥1 of measurable simple functions such that

0 ≤ s1(x) ≤ s2(x) ≤ ⋯ ≤ f(x) and sn(x) → f(x)

(see 1.3.3). Since 0 ≤ sn(x) ≤ f(x), we have sn ∈ L²(X) and hence sn ∈ S. Since |f − sn|² ≤ f², the Dominated Convergence Theorem 1.3.9 shows that ||f − sn|| → 0 as n → ∞. Thus, f is in the L²-closure of S. The general case when f is complex follows from the above considerations. □

Theorem 2.4.12 Let X = ℝ, M be the σ-algebra of measurable subsets of ℝ and µ be the usual Lebesgue measure on ℝ. Then the set of all continuous functions vanishing outside subsets of finite measure is dense in L²(X).

Proof Let E ∈ M be such that µ(E) < ∞. Then every bounded measurable (in particular continuous) function is square integrable on E. Consider now a nonempty closed subset F of E and its characteristic function χF. For n = 1, 2, …, let

f_n(x) = 1/(1 + n dist(x, F)), x ∈ E,

where dist(x, F) = inf{|x − y|: y ∈ F}. Note that each f_n is continuous on E. Also, f_n(x) = 1 for all x ∈ F and f_n(x) → 0 as n → ∞ for all x ∉ F. Hence, (χF − f_n)(x) → 0 for all x ∈ E. Since µ(E) < ∞ and |f_n(x)| ≤ 1 for all x ∈ E, the Dominated Convergence Theorem shows that

∫_E |χF − f_n|² dµ → 0 as n → ∞.

In case F is empty, χF is 0 everywhere and we may carry out the above argument with each f_n chosen to be 0 everywhere.
Now let E0 be any measurable subset of ℝ with µ(E0) < ∞. Let ε > 0 be given. Then there exists a closed set F ⊆ E0 such that µ(E0\F) < ε; this follows on using the regularity of Lebesgue measure. Since

d(χ_{E0}, f_n) = ||χ_{E0} − f_n|| ≤ d(χ_{E0}, χF) + d(χF, f_n),

and since

d(χ_{E0}, χF) = (∫ |χ_{E0} − χF|² dµ)^{1/2} = µ(E0\F)^{1/2} < ε^{1/2},

it follows that lim sup_n d(χ_{E0}, f_n) ≤ ε^{1/2}, where ε > 0 is arbitrary.

We have thus proved that the characteristic functions of measurable subsets of ﬁnite

measure can be approximated in L2-norm by continuous functions which vanish

outside sets of ﬁnite measure. The proof is now completed by using Theorem 2.4.11

and the triangle inequality. h

Problem Set 2.4
2.4.P1. For which real a does the function f_a(t) = t^a exp(−t), t > 0, belong to L²(0, ∞)? What is ||f_a|| when defined?

2.4.P2. (a) Show that the subspace M = {x = {xn}n≥1 ∈ ℓ²: Σ_{n=1}^{∞} (1/n) xn = 0} is closed in ℓ².
(b) Show that the subspace M = {x(t) ∈ L²[1, ∞): ∫_1^∞ (1/t) x(t) dt = 0} is closed in L²[1, ∞).

2.4.P3. L^p[0, 1], 1 ≤ p < ∞, p ≠ 2, is not a Hilbert space.

2.5 A Subspace of L²(X, M, µ)

The following subspace of L²(X, M, µ) will play an important role in the discussion of applications of Hilbert space tools to problems in analysis. Here X = [a, b], M is the σ-algebra of Lebesgue measurable subsets of [a, b] and µ is the Lebesgue measure.

Example 2.5.1 Let [a, b] be a closed subinterval of R. Let C[a, b] denote the space

of complex-valued continuous functions deﬁned on [a, b] with inner product given

by

(f, g) = ∫_a^b f(t)ḡ(t) dt, f, g ∈ C[a, b].   (2.31)

2.5 A Subspace of L2(X, M, µ) 47

Then C[a, b] is an inner product space (see Example 2.1.3(iv)) which is dense in L²[a, b]. Indeed, extend f ∈ L²[a, b] to ℝ by setting f = 0 outside [a, b]. The extended function is defined on ℝ and is in L²(ℝ). Given ε > 0, there exists a continuous function g vanishing outside a set of finite measure such that ||f − g|| < ε [see Theorem 2.4.12]. Consider the restriction of g to [a, b], to be denoted by h. Then the given f is such that ||f − h|| < ε. Moreover, C[a, b] ≠ L²[a, b], as the following argument shows.

If two functions differ at a point and are both continuous there, then they differ

on a neighbourhood of that point. Consequently, they cannot be equivalent. It

follows that the function

8

<1 if x 2 ½0; 12Þ

f ðxÞ ¼ 1 if x 2 ð12 ; 1

:

0 ifx ¼ 12

is not equivalent to any continuous function. This proves the assertion that

C[a, b] 6¼ L2[a, b].

Thus, C[a, b] is an inner product space which is not a Hilbert space.

Remarks 2.5.2

(i) The reader will note that if g1 and g2 are continuous functions on [a, b] and

g1 * g2 then g1 = g2. So, if f 2 L2[a, b] has a representative which is

continuous, then that representative is unique. Thus, C[a, b] → L²[a, b] is an

injection. Moreover, uniform convergence of continuous functions implies

convergence in L2-norm. The aforementioned implication stems from the

ﬁniteness of the Lebesgue measure of [a, b] and it fails if the bounded interval

is replaced by an unbounded interval.

(ii) That the inner product space C[a, b] with the inner product deﬁned by (2.31)

above is not complete can also be seen by exhibiting a Cauchy sequence in the

space which converges to an element not lying in the space.

Let a and b be −1 and 1, respectively, and consider the sequence

f_n(t) = 0 for −1 ≤ t ≤ 0, f_n(t) = nt for 0 < t ≤ 1/n, and f_n(t) = 1 for 1/n < t ≤ 1.


For m > n,

∫_{−1}^{1} |f_n(t) − f_m(t)|² dt = ∫_0^{1/m} (mt − nt)² dt + ∫_{1/m}^{1/n} (1 − nt)² dt
= (m − n)²/(3m³) + (m − n)³/(3nm³)
= (m − n)²/(3m²n).

The right-hand side of the above equality tends to zero as n, m → ∞. Thus, the sequence {fn}n≥1 is Cauchy. We next show that fn → f in the L²-norm, where f(t) = 0 for −1 ≤ t ≤ 0 and f(t) = 1 for 0 < t ≤ 1. In fact,

∫_{−1}^{1} |f_n(t) − f(t)|² dt = ∫_0^{1/n} (1 − nt)² dt = 1/(3n) → 0 as n → ∞.

The limit function is not equivalent to any continuous function for reasons similar

to those in Example 2.5.1.
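The closed-form value of the integral above can be confirmed numerically. The sketch below (our own check; the midpoint rule and step count are arbitrary choices) compares a Riemann sum for ∫_{−1}^{1} |f_n − f_m|² dt with (m − n)²/(3m²n).

```python
# f_n as in Remarks 2.5.2(ii): 0 on [-1, 0], nt on (0, 1/n], 1 on (1/n, 1]
def f(n, t):
    if t <= 0:
        return 0.0
    return n * t if t <= 1.0 / n else 1.0

def l2_dist_sq(n, m, steps=200000):
    # midpoint-rule approximation of the squared L2 distance on [-1, 1]
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        t = -1.0 + (i + 0.5) * h
        total += (f(n, t) - f(m, t)) ** 2 * h
    return total

n, m = 3, 7
exact = (m - n) ** 2 / (3.0 * m**2 * n)   # = 16/441
assert abs(l2_dist_sq(n, m) - exact) < 1e-4
```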

Problem Set 2.5
2.5.P1. Prove that the system {1, t³, t⁶, …} has a dense linear span in the space L²[0, 1] as well as in L²[−1, 1].

2.6 The Hilbert Space A(X)

Let X be a bounded domain in the complex plane whose boundary consists of a finite number of smooth simple closed curves. Consider the class of all holomorphic functions f in X for which

∫∫_X |f|² dm < ∞,   (2.32)

m being two-dimensional Lebesgue (area) measure. In order to interpret the integral as a limit of the Riemann integral, we shall need the following:

Proposition 2.6.1 Let X be a nonempty open subset of ℂ. Then there exists a sequence {Kn}n≥1 of closed and bounded subsets of X such that X is their union. Moreover, the sets Kn can be chosen to satisfy the following conditions:

(a) Kn ⊆ K°_{n+1}, n = 1, 2, …;
(b) every compact (closed and bounded) subset K of X is contained in Kn for some n.

Proof For n = 1, 2, …, let

Kn = {z : |z| ≤ n and dist(z, ℂ\X) ≥ 1/n}.

Observe that Kn is both bounded and closed, and hence compact or empty. Moreover, Kn ⊆ X. Obviously,

Kn ⊆ {z : |z| < n + 1 and dist(z, ℂ\X) > 1/(n + 1)} ⊆ K°_{n+1}.

We next show that X = ⋃_{n=1}^{∞} Kn. For, if z ∈ X, then there exist m1 and m2 such that |z| ≤ m1 and dist(z, ℂ\X) ≥ 1/m2. Thus, z ∈ Kn for some n. On the other hand, Kn ⊆ X. This proves that X = ⋃_{n=1}^{∞} Kn.
In view of (a), it follows that X = ⋃_{n=1}^{∞} K°n. If K is a compact subset of X, then the K°n form an open cover of K. By Definition 1.2.16, the compact set K is covered by finitely many K°n. Consequently, K is contained in Kn for some large n. This completes the proof. □

On letting

u_n(z) = |f(z)|² for z ∈ Kn and u_n(z) = 0 for z ∈ X\Kn,

we obtain a nondecreasing sequence {u_n}n≥1 converging pointwise to |f|² on X. The Dominated Convergence Theorem now implies

∫∫_X u_n dm → ∫∫_X |f|² dm as n → ∞,

that is,

∫∫_{Kn} |f|² dm → ∫∫_X |f|² dm.

Next, consider the disc X = {z : |z| < R} and a holomorphic function f defined in X. The function f can be expanded in a Taylor series

f(z) = Σ_{n=0}^{∞} a_n zⁿ,

where a_n = f^{(n)}(0)/n!, n = 0, 1, 2, …. Writing z = re^{iθ} in polar coordinates,

∫∫_X |f|² dm = ∫∫_X |Σ_{n=0}^{∞} a_n zⁿ|² dm
= ∫∫_X (Σ_{n=0}^{∞} a_n rⁿ e^{inθ}) (Σ_{k=0}^{∞} ā_k r^k e^{−ikθ}) r dr dθ
= ∫_0^R ∫_0^{2π} (Σ_{n=0}^{∞} Σ_{k=0}^{∞} a_n ā_k r^{n+k+1} e^{i(n−k)θ}) dθ dr
= Σ_{n=0}^{∞} Σ_{k=0}^{∞} a_n ā_k ∫_0^{2π} e^{i(n−k)θ} dθ ∫_0^R r^{n+k+1} dr,

the interchange of integration and summation being justified because the series converges uniformly on |z| ≤ r for each r < R. Since ∫_0^{2π} e^{i(n−k)θ} dθ vanishes for n ≠ k and equals 2π for n = k,

∫∫_X |f|² dm = 2π Σ_{n=0}^{∞} |a_n|² ∫_0^R r^{2n+1} dr = π Σ_{n=0}^{∞} |a_n|² R^{2n+2}/(n + 1).   (2.33)
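Formula (2.33) can be tested numerically on a concrete function. The sketch below (our own check, using f(z) = 1 + z and R = 1, so a0 = a1 = 1) compares a polar-coordinate Riemann sum for ∫∫_X |f|² dm with π(|a0|²/1 + |a1|²/2) = 3π/2.

```python
import math

def integral_abs_f_squared(radial=400, angular=400):
    # midpoint Riemann sum of |1 + z|^2 r dr dθ over the unit disc
    total = 0.0
    dr, dth = 1.0 / radial, 2 * math.pi / angular
    for i in range(radial):
        r = (i + 0.5) * dr
        for j in range(angular):
            th = (j + 0.5) * dth
            z = complex(r * math.cos(th), r * math.sin(th))
            total += abs(1 + z) ** 2 * r * dr * dth
    return total

exact = math.pi * (1.0 / 1 + 1.0 / 2)   # pi * sum |a_n|^2 R^{2n+2} / (n+1)
assert abs(integral_abs_f_squared() - exact) < 1e-3
```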

Definition 2.6.2 Let X be a bounded domain in the complex plane. The class of all holomorphic functions in X for which the integral ∫∫_X |f|² dm is finite is denoted by A(X) and is known as the Bergman space. Briefly,

A(X) = {f : f is holomorphic in X and ∫∫_X |f|² dm < ∞}.

The integral is to be understood in the sense of lim_n ∫∫_{Kn} |f(z)|² dm(z), where {Kn} is a nondecreasing sequence of compact subsets of X whose union is X.


The following inequality will be useful in proving that A(X) is a Hilbert space.
Proposition 2.6.3 Suppose f ∈ A(X) and d_z = dist(z, ∂X). Then

|f(z)|² ≤ (∫∫_X |f|² dm) / (π d_z²).   (2.34)

Proof Let D be the disc with centre z and radius d_z. Clearly,

∫∫_X |f|² dm ≥ ∫∫_D |f|² dm.

Applying (2.33) to the Taylor expansion of f about the centre z of D, with a_0 = f(z), we get

∫∫_D |f|² dm ≥ π|a_0|² d_z² = π|f(z)|² d_z².

So,

|f(z)|² ≤ (∫∫_X |f|² dm) / (π d_z²). □

The inequality |a + b|² ≤ 2(|a|² + |b|²) implies that

|αf(z) + βg(z)|² ≤ 2(|α|²|f(z)|² + |β|²|g(z)|²),   (2.35)

and the identity

f ḡ = ½|f + g|² + (i/2)|f + ig|² − ((1 + i)/2)|f|² − ((1 + i)/2)|g|²   (2.36)

holds.

Definition 2.6.4 For f, g ∈ A(X), we write

(f, g) = ∫∫_X f ḡ dm.   (2.37)

That the right side of (2.37) is ﬁnite follows from (2.35) and (2.36) above. It is

easily veriﬁed that (2.37) deﬁnes an inner product on A(X).

Theorem 2.6.5 With (f, g) deﬁned as in (2.37), A(X) is a Hilbert space.


Proof We show that A(X) is a complete inner product space. In view of (2.35), A(X) is closed under addition and scalar multiplication.
As usual, A(X) with the norm defined by

||f|| = (f, f)^{1/2} = (∫∫_X |f|² dm)^{1/2}

becomes a normed space [see Definition 2.2.2 and Theorem 2.2.6]. It remains to show that A(X) is complete in this norm. Suppose {fn}n≥1 is a Cauchy sequence in this norm, that is,

||fn − fp||² = ∫∫_X |fn − fp|² dm < ε for n, p ≥ N = N(ε).

Let K be a compact subset of X and let d = dist(K, ∂X) > 0. Proposition 2.6.3 then shows that

|fn(z) − fp(z)|² < ε/(πd²)  (z ∈ K),

so on each compact subset K ⊆ X, the sequence {fn}n≥1 converges uniformly, and the limit function f is holomorphic in X:

fn(z) → f(z) as n → ∞, z ∈ K ⊆ X.

Since

∫∫_K |fn − fp|² dm ≤ ∫∫_X |fn − fp|² dm, n, p ≥ N,

letting p → ∞ yields

∫∫_K |fn − f|² dm ≤ ε, n ≥ N,

and, K being an arbitrary compact subset of X,

∫∫_X |fn − f|² dm ≤ ε, n ≥ N.

The last inequality implies f ∈ A(X) and ||fn − f|| → 0 as n → ∞. This completes the proof. □


Problem Set 2.6
2.6.P1. Let X be an arbitrary domain in ℂ whose boundary consists of a finite number of smooth simple closed curves and let A(X) be the collection of all holomorphic functions f: X → ℂ for which

∫∫_X |f(z)|² dx dy [same as ∫∫_X |f|² dm of Sect. 2.6] < ∞

holds.

(a) Show that every f ∈ A(X), where X = {z ∈ ℂ: 0 < |z| < 1}, has a removable singularity at z = 0;
(b) Show that if a ∈ X, then {f ∈ A(X): f(a) = 0} is closed in A(X).

2.7 Direct Sum of Hilbert Spaces

The definition of the external direct sum of vector spaces [Definition 1.1.7] is extended to an arbitrary family of vector spaces (each vector space being over the field ℝ of real numbers or the field ℂ of complex numbers), beginning with a finite such family. This procedure lays bare the intricacies involved and aids understanding.
The direct sum

H = H1 ⊕ H2 ⊕ ⋯ ⊕ Hn

of the vector spaces H1, H2, …, Hn is the set H = H1 × H2 × ⋯ × Hn, in which addition and scalar multiplication are defined by the formulae

(x1, x2, …, xn) + (y1, y2, …, yn) = (x1 + y1, x2 + y2, …, xn + yn),
λ(x1, x2, …, xn) = (λx1, λx2, …, λxn),

where (x1, x2, …, xn), (y1, y2, …, yn) are in H1 × H2 × ⋯ × Hn and λ ∈ ℝ or ℂ.
It is quite clear that H contains a subspace Yi isomorphic to Hi, namely

Yi = {(0, …, 0, xi, 0, …, 0) : xi ∈ Hi}.

One often speaks of Hi as a subspace of H, and when such reference is made, it is the isomorphic space Yi that is to be understood. The map of H onto Yi given by

(x1, x2, …, xn) → (0, 0, …, xi, 0, …)

is then the natural projection associated with this decomposition.

If H1, H2, …, Hn are Hilbert spaces, then H is the uniquely determined Hilbert space with inner product

((x1, x2, …, xn), (y1, y2, …, yn)) = Σ_{i=1}^{n} (xi, yi)_i,   (2.38)

where (·,·)_i is the inner product in Hi. The norm in a direct sum of Hilbert spaces is then given by

||(x1, x2, …, xn)|| = ((x1, x2, …, xn), (x1, x2, …, xn))^{1/2}.   (2.39)

Definition 2.7.1 For each i = 1, 2, …, n, let Hi be a Hilbert space with inner product (·,·)_i. The direct sum of the Hilbert spaces H1, H2, …, Hn is the vector space H = H1 ⊕ H2 ⊕ ⋯ ⊕ Hn in which the inner product and the norm are defined by (2.38) and (2.39).
Henceforth, the subscripts i in the notation for the inner products and the norms will be omitted, because the context will make it clear which one is intended.
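A small sketch (our own illustration, with H1 = ℂ² and H2 = ℂ³; all names are ours) shows how (2.38) and (2.39) combine the componentwise inner products and norms.

```python
def inner(x, y):
    # standard inner product on C^n, linear in the first argument
    return sum(a * b.conjugate() for a, b in zip(x, y))

def ds_inner(x, y):
    # ((x1, x2), (y1, y2)) = (x1, y1)_1 + (x2, y2)_2, as in (2.38)
    return sum(inner(xi, yi) for xi, yi in zip(x, y))

def ds_norm(x):
    # the norm of (2.39)
    return ds_inner(x, x).real ** 0.5

x = ([1 + 1j, 2j], [0j, 3 + 0j, 1j])
y = ([1 + 0j, 1 + 0j], [1j, 1 + 0j, 0j])

assert abs(ds_inner(x, y) - (inner(x[0], y[0]) + inner(x[1], y[1]))) < 1e-12
# the squared norms of the components add: ||(x1, x2)||^2 = ||x1||^2 + ||x2||^2
assert abs(ds_norm(x) ** 2 - (inner(x[0], x[0]).real + inner(x[1], x[1]).real)) < 1e-12
```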

Proposition 2.7.2 With notations as above, H = H1 ⊕ H2 ⊕ ⊕ Hn is a Hilbert

space.

Proof It must be shown that (2.38) deﬁnes an inner product on H and H is complete

with respect to the norm deﬁned by (2.39). We shall check the second assertion

only.

Let {(x1^(m), x2^(m), …, xn^(m))}_{m≥1} be a Cauchy sequence in H, that is, Σ_{i=1}^n ||xi^(m) − xi^(ℓ)||² → 0 as m, ℓ → ∞. For each k, ||xk^(m) − xk^(ℓ)||² ≤ Σ_{i=1}^n ||xi^(m) − xi^(ℓ)||² shows that {xk^(m)}_{m≥1} is Cauchy in Hk. Since Hk is a Hilbert space, there exists xk in Hk such that xk^(m) → xk as m → ∞. Clearly, the vector (x1, x2, …, xn) is in H. It will be shown that (x1^(m), x2^(m), …, xn^(m)) → (x1, x2, …, xn) in H. Let ε > 0 be given. For each k, there exists an integer mk such that

||xk^(m) − xk|| < ε/√n for m ≥ mk.

Consequently,

Σ_{k=1}^n ||xk^(m) − xk||² < ε² for m ≥ max{m1, m2, …, mn}. □
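As a small illustration of (2.38) and (2.39), the following sketch (not from the text; the component spaces, data and function names are hypothetical) computes the direct-sum inner product and norm for tuples of Euclidean vectors:

```python
import math

# Minimal numeric sketch of (2.38) and (2.39): elements of H1 ⊕ ... ⊕ Hn are
# tuples of vectors, and the inner product is the sum of the componentwise
# inner products.  The Euclidean dot product stands in for each (.,.)_i.

def dot(u, v):
    """Euclidean inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def direct_sum_inner(x, y):
    """Inner product (2.38) for tuples x, y of vectors."""
    return sum(dot(xi, yi) for xi, yi in zip(x, y))

def direct_sum_norm(x):
    """Norm (2.39): square root of the inner product of x with itself."""
    return math.sqrt(direct_sum_inner(x, x))

# Example with H1 = R^2 and H2 = R^3 (hypothetical data).
x = ([1.0, 2.0], [0.0, 1.0, 2.0])
y = ([3.0, -1.0], [1.0, 1.0, 0.0])
print(direct_sum_inner(x, y))   # (1*3 + 2*(-1)) + (0 + 1 + 0) = 2.0
print(direct_sum_norm(x))       # sqrt(1 + 4 + 0 + 1 + 4) = sqrt(10)
```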

2.7 Direct Sum of Hilbert Spaces 55

We next define H1 ⊕ H2 ⊕ ⋯, also written as ⊕_{i=1}^∞ Hi, for a sequence of Hilbert spaces H1, H2, …. Let

H = { {xn}_{n≥1} : xn ∈ Hn, n = 1, 2, … and Σ_{n=1}^∞ ||xn||² < ∞ }.

For x = {xn}_{n≥1} and y = {yn}_{n≥1} in H, define

(x, y) = Σ_{n=1}^∞ (xn, yn).     (2.40)

The sum on the right is seen to be finite by using the Cauchy–Schwarz Inequality for each Hi and then for ℓ². It can then be verified that (,) is an inner product on H and the norm relative to the inner product is ||x|| = (Σ_{n=1}^∞ ||xn||²)^{1/2}.

With this inner product, H can be shown to be a Hilbert space.

Proposition 2.7.3 With notations as above, H = H1 ⊕ H2 ⊕ ⋯ = ⊕_{i=1}^∞ Hi is a Hilbert space.

Proof It must be shown that (2.40) defines an inner product on H and that H is complete with respect to the norm defined by

||x|| = (Σ_{n=1}^∞ ||xn||²)^{1/2}.

For x = {xn}_{n≥1} and y = {yn}_{n≥1} in H,

Σ_{n=1}^∞ |(xn, yn)| ≤ Σ_{n=1}^∞ ||xn|| ||yn|| ≤ (Σ_{n=1}^∞ ||xn||²)^{1/2} (Σ_{n=1}^∞ ||yn||²)^{1/2},

using the Cauchy–Schwarz Inequality twice. Hence, the series on the right of (2.40) converges absolutely. Consequently, (,) is well defined. It is a routine exercise to show that (,) is an inner product on H. It remains to show that H is a complete space. Suppose {x^(m)}_{m≥1} = {(x1^(m), x2^(m), …)}_{m≥1} is a Cauchy sequence in H, that is, ||x^(m) − x^(n)|| → 0 as m, n → ∞. For each k, ||xk^(m) − xk^(n)||² ≤ Σ_{j=1}^∞ ||xj^(m) − xj^(n)||² shows that the sequence {xk^(1), xk^(2), …} of kth components is Cauchy. Since Hk is a Hilbert space, xk^(n) → xk as n → ∞ for suitable xk in Hk. It will be shown that Σ_{k=1}^∞ ||xk||² < ∞ and x^(n) → x, where x = {xk}_{k≥1}.

Given ε > 0, let p be an integer such that ||x^(m) − x^(n)|| < ε whenever m, n ≥ p. For any positive integer r, one has

Σ_{k=1}^r ||xk^(m) − xk^(n)||² ≤ ||x^(m) − x^(n)||² ≤ ε²,

provided m, n ≥ p. Letting m → ∞,

Σ_{k=1}^r ||xk − xk^(n)||² ≤ ε²,

and hence, r being arbitrary,

Σ_{k=1}^∞ ||xk − xk^(n)||² ≤ ε²     (2.41)

provided n ≥ p. In particular,

Σ_{k=1}^∞ ||xk − xk^(p)||² ≤ ε²,

so that {xk − xk^(p)}_{k≥1} belongs to H. Consequently, the sequence

{xk}_{k≥1} = {(xk − xk^(p)) + xk^(p)}_{k≥1}

belongs to H, and (2.41) says precisely that x^(n) → x. □

Deﬁnition 2.7.4 If H1, H2, … are Hilbert spaces, the space H in Proposition 2.7.3 is

called the direct sum of H1, H2, …

For our next definition, a summation over an arbitrary (possibly uncountable) indexing set is to be understood in the following sense:

Suppose S = {x_a : a ∈ K}, where K is an indexing set, is a collection of elements from a normed linear space X. {x_a : a ∈ K} is said to be summable to x ∈ X, written

Σ_{a∈K} x_a = x or Σ_a x_a = x,

if for all ε > 0, there exists some finite set of indices J0 ⊆ K, such that for any finite set of indices J ⊇ J0,

||Σ_{a∈J} x_a − x|| < ε.


Definition 2.7.5 For each a in the index set K, let H_a be a Hilbert space. The direct sum ⊕_a H_a of the Hilbert spaces H_a is defined to be the family of all functions {x_a} on K such that for each a, x_a ∈ H_a and Σ_{a∈K} ||x_a||² < ∞.

For x, y ∈ H, (x, y) = Σ_a (x_a, y_a) defines an inner product on H, and H is a Hilbert space with respect to the norm ||x|| = (Σ_{a∈K} ||x_a||²)^{1/2}. The proof is not included. Permuting the index set K results in an isomorphic Hilbert space.

Let X = Cⁿ[a, b], the linear space of all scalar-valued n times continuously differentiable functions on [a, b]. For x, y in X, define

(x, y) = Σ_{j=0}^n ∫_a^b x^(j)(t) ȳ^(j)(t) dt.

The norm relative to this inner product is given by

||x||² = (x, x) = Σ_{j=0}^n ∫_a^b |x^(j)(t)|² dt, x ∈ X.

Let

H = {x ∈ C[a, b] : x^(n−1) is defined and absolutely continuous, x^(n) ∈ L²[a, b]}.

For x, y ∈ H, let

(x, y) = Σ_{j=0}^n ∫_a^b x^(j)(t) ȳ^(j)(t) dt,

so that

||x||² = (x, x) = Σ_{j=0}^n ∫_a^b |x^(j)(t)|² dt, x ∈ H.

Theorem 2.7.6 With notations as in the paragraph above, H is a Hilbert space and X is dense in H.

Proof Consider the direct sum of (n + 1) copies of L²[a, b], i.e. ⊕_{n+1} L²[a, b]. Let T : H → ⊕_{n+1} L²[a, b] be defined by

Tx = (x, x^(1), x^(2), …, x^(n)).

Observe that T is both linear and injective; moreover, it preserves inner products. We shall next show that T(H) is a closed subspace of ⊕_{n+1} L²[a, b] and is consequently a Hilbert space. This will imply that H is a Hilbert space.

Let {xk}_{k≥1} be a Cauchy sequence in H and let

y = (y0, y1, …, yn) = lim_{k→∞} Txk = lim_{k→∞} (xk, xk^(1), xk^(2), …, xk^(n)),

so that {xk^(j)}_{k≥1} converges in L²[a, b] to yj, j = 0, 1, 2, …, n, where xk^(0) means xk.

Now, for j = 1, 2, …, n and each t ∈ [a, b],

xk^(j−1)(t) = xk^(j−1)(a) + ∫_a^t xk^(j)(s) ds.     (2.42)

Observe that

|∫_a^t yj(s) ds − ∫_a^t xk^(j)(s) ds| ≤ (b − a)^{1/2} ||yj − xk^(j)||     (2.43)

for j = 1, 2, …, n and all t ∈ [a, b]. Therefore, the sequence of continuous functions {∫_a^t xk^(j)(s) ds}_{k≥1} is uniformly convergent to the continuous function ∫_a^t yj(s) ds. It is also convergent as a sequence in L²[a, b]. But {xk^(j−1)}_{k≥1} is convergent in L²[a, b] to y_{j−1} and so by (2.42), the sequence {xk^(j−1)(a)}_{k≥1} of constant functions is convergent in L²[a, b]. Therefore, the sequence {xk^(j−1)(a)}_{k≥1} is convergent in ℂ, and the function on the right of (2.42) is uniformly convergent to a continuous function. Thus, it follows that

y_{j−1}(t) = y_{j−1}(a) + ∫_a^t yj(s) ds.

Consequently, y = Tx, where x = y0.

We next show that X is dense in H. Let x ∈ H. Then x^(n) ∈ L²[a, b], so that we can find a sequence {zm}_{m≥1} in C[a, b] such that ||zm − x^(n)||₂ → 0 as m → ∞. Define um^(1), um^(2), …, um^(n) recursively by the formula

um^(j)(t) = ∫_a^t um^(j−1)(s) ds + x^(n−j)(a), j = 1, 2, …, n,

where um^(0) = zm. Observe that um^(n) ∈ X. We claim that

||um^(j) − x^(n−j)||₂ → 0 as m → ∞.

Indeed,

|um^(j+1)(t) − x^(n−(j+1))(t)| ≤ ||um^(j) − x^(n−j)|| (b − a)^{1/2},

using (2.42) and (2.43) above. Hence, {um^(j+1)(t)}_{m≥1} converges uniformly to x^(n−(j+1)) for t ∈ [a, b], so that ||um^(j+1) − x^(n−(j+1))||₂ → 0 as m → ∞. This completes the argument for j = 1, 2, …, n. Consequently, um^(n) → x in H. The proof is now complete. □
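The inner product on Cⁿ[a, b] above can be explored numerically. The following sketch (illustrative only, with n = 1 and a midpoint quadrature standing in for the integrals) evaluates ||x||² for x(t) = sin t on [0, 2π], where the exact value is ∫ sin² + ∫ cos² = 2π:

```python
import math

# Illustrative sketch of the inner product on C^n[a, b], with n = 1 and a
# midpoint rule in place of the integrals; derivatives are supplied explicitly.

def midpoint_integral(f, a, b, steps=20000):
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

def sobolev_inner(x_derivs, y_derivs, a, b):
    """(x, y) = Σ_j ∫_a^b x^(j) y^(j) dt, with x_derivs = [x, x', ..., x^(n)]."""
    return sum(midpoint_integral(lambda t, u=u, v=v: u(t) * v(t), a, b)
               for u, v in zip(x_derivs, y_derivs))

x = [math.sin, math.cos]                        # x and its first derivative
norm_sq = sobolev_inner(x, x, 0.0, 2 * math.pi)
print(norm_sq)                                  # ≈ 2π ≈ 6.2832
```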

In the familiar Euclidean space, we assign a length to each vector and to each pair

of vectors an angle between them. The ﬁrst notion has been made abstract in the

deﬁnition of a norm. An appropriate notion of angle and the associated notion of

orthogonality are introduced below. The introduction of the concept of orthogo-

nality depends on the deﬁnition of inner product in a pre-Hilbert space.

Recall from Deﬁnition 2.1.1 that a real vector space H equipped with an inner

product is called a real pre-Hilbert space. The angle θ between two nonzero vectors in a real pre-Hilbert space may be defined in a manner consistent with the properties of an inner product by means of the relation (x, y) = ||x|| ||y|| cos θ. Observe that the Cauchy–Schwarz Inequality then says that |cos θ| ≤ 1. This definition is not

satisfactory in a complex pre-Hilbert space, for (x, y) is in general a complex

number. Nevertheless, if the condition (x, y) = 0 is taken as the deﬁnition of

orthogonality (perpendicularity), then the concept is just as useful here as in the real

case.

Deﬁnition 2.8.1 Let H be a pre-Hilbert space. Two vectors x and y in H are said to

be orthogonal if (x, y) = 0; we write x ⊥ y.

Since (x, y) = 0 implies (y, x) = 0, we have x ⊥ y if, and only if, y ⊥ x. It is also

clear that x ⊥ 0 for every x. Also, the relation (x, x) = ||x||2 shows that 0 is the only

vector orthogonal to itself.

Deﬁnition 2.8.2 A set M of nonzero vectors in a pre-Hilbert space H is said to be

an orthogonal set if x ⊥ y whenever x and y are distinct vectors of M. A set M of vectors in a pre-Hilbert space H is said to be orthonormal if it is an orthogonal set and ||x|| = 1 for every x in M. An orthonormal set is called a complete (or maximal) orthonormal system provided it is not a proper subset of some other orthonormal set.

Remarks 2.8.3

(i) If x is orthogonal to y1, y2, …, yn, then x is orthogonal to every linear combination of the yk. In fact, if x ⊥ yk for all k and y = Σ_{k=1}^n λk yk, then

(x, y) = (x, Σ_{k=1}^n λk yk) = Σ_{k=1}^n λ̄k (x, yk) = 0.

(ii) If x ⊥ y, then ||x + y||² = ||x||² + ||y||². Indeed,

||x + y||² = ||x||² + (x, y) + (y, x) + ||y||² = ||x||² + ||y||², using (x, y) = 0 = (y, x).

(iii) An orthogonal subset M of H not containing the zero vector is linearly independent. Indeed, if Σ_{k=1}^n λk yk = 0, where y1, y2, …, yn are orthogonal, then on taking the inner product of the sum on the left-hand side with ym, we find that λm ||ym||² = 0, and hence λm = 0.
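The Pythagorean relation of Remark 2.8.3(ii) admits a quick numeric check; the vectors below are hypothetical examples chosen so that (x, y) = 0:

```python
# Quick numeric check of the Pythagorean relation in R^3 (hypothetical data).

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x = [1.0, 2.0, 2.0]
y = [2.0, 1.0, -2.0]               # dot(x, y) = 2 + 2 - 4 = 0, so x ⊥ y
s = [a + b for a, b in zip(x, y)]  # x + y

lhs = dot(s, s)                    # ||x + y||^2
rhs = dot(x, x) + dot(y, y)        # ||x||^2 + ||y||^2
print(lhs, rhs)                    # 18.0 18.0
```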

Examples 2.8.4

(i) The sequence {xj}_{j≥1}, where xj = (0, 0, …, 0, λj, 0, …) and the nonzero scalar λj occurs at the jth place, in the space ℓ0 of finitely nonzero sequences, is an orthogonal sequence (a sequence whose range is an orthogonal set). The sequence {ej}_{j≥1}, where ej = (0, 0, …, 0, 1, 0, …) and 1 occurs at the jth place, is an orthonormal sequence in the space ℓ0 of finitely nonzero sequences.

(ii) Let H = C[−π, π] and let fn(x) = sin nx, n = 1, 2, … and gn(x) = cos nx, n = 1, 2, …. Since

∫_{−π}^{π} sin mx sin nx dx = 0 = ∫_{−π}^{π} cos mx cos nx dx, n ≠ m,

it follows that {fn}_{n≥1} and {gn}_{n≥1} are orthogonal sequences in C[−π, π].

In the same space, the vectors

un(x) = (1/√π) sin nx, n = 1, 2, …,

2.8 Orthogonal Complements 61

v0(x) = 1/√(2π), vn(x) = (1/√π) cos nx, n = 1, 2, …

together form the orthonormal sequence

v0, v1, u1, v2, u2, ….

Recall that (f, uk), k = 1, 2, … and (f, vk), k = 1, 2, … are called Fourier coefficients of the function f ∈ C[−π, π].

(iii) The sequence un(z) = √(n/π) z^{n−1}, n = 1, 2, … is orthonormal in A(D), where D = {z ∈ ℂ : |z| < 1}. In fact,

(un, um) = ∫∫_D un ūm dx dy
= (√(nm)/π) ∫_0^1 ∫_0^{2π} r^{n+m−1} e^{i(n−m)θ} dr dθ
= (√(nm)/(π(n + m))) ∫_0^{2π} e^{i(n−m)θ} dθ
= 0 if n ≠ m and 1 if n = m.

The notions of Fourier coefficient and Fourier series extend to an arbitrary infinite-dimensional pre-Hilbert space.

Definition 2.8.5 If {xn}_{n≥1} is an orthonormal sequence in a pre-Hilbert space H, then for any x ∈ H, (x, xn) is called the Fourier coefficient of x with respect to {xn}_{n≥1}. The Fourier series of x with respect to {xn}_{n≥1} is the series Σ_{n=1}^∞ (x, xn) xn.

In the Hilbert space ℓ², let en = (0, 0, …, 0, 1, 0, …), where 1 occurs at the nth place. Then {en}_{n≥1} is an orthonormal sequence in ℓ². If x = {λj}_{j≥1} ∈ ℓ², then (x, ej) = λj and x = Σ_{j=1}^∞ (x, ej) ej is its Fourier series with respect to the orthonormal sequence {en}_{n≥1}. Observe that Σ_{j=1}^∞ |(x, ej)|² = Σ_{j=1}^∞ |λj|² < ∞. That this result holds for any orthonormal sequence is a consequence of the following:

Theorem 2.8.6 (Bessel's Inequality) Let x1, x2, …, xn be orthonormal vectors in a pre-Hilbert space H. For every x ∈ H,

||x − Σ_{k=1}^n (x, xk) xk||² = ||x||² − Σ_{k=1}^n |(x, xk)|²;

hence

Σ_{k=1}^n |(x, xk)|² ≤ ||x||².

Proof For any scalars λ1, λ2, …, λn,

||Σ_{k=1}^n λk xk||² = (Σ_{k=1}^n λk xk, Σ_{k=1}^n λk xk) = Σ_{k=1}^n |λk|².

So,

||x − Σ_{k=1}^n λk xk||² = (x − Σ_{k=1}^n λk xk, x − Σ_{k=1}^n λk xk)
= ||x||² − Σ_{k=1}^n λk (xk, x) − Σ_{k=1}^n λ̄k (x, xk) + Σ_{k=1}^n |λk|²
= ||x||² − Σ_{k=1}^n |(xk, x)|² + Σ_{k=1}^n |(x, xk) − λk|².

On setting λk = (x, xk), k = 1, 2, …, n, this yields

||x − Σ_{k=1}^n (x, xk) xk||² = ||x||² − Σ_{k=1}^n |(xk, x)|².

Since the left-hand side is nonnegative,

Σ_{k=1}^n |(x, xk)|² ≤ ||x||². □

Since Bessel’s Inequality holds for each n orthonormal vectors, it yields the

following corollary:

Corollary 2.8.7 If x1, x2, … is any orthonormal sequence of vectors, then for any x in the pre-Hilbert space H,

Σ_{n=1}^∞ |(x, xn)|² ≤ ||x||².
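The identity in Theorem 2.8.6 can be verified numerically. In the sketch below (hypothetical data in ℝ³), the defect ||x||² − Σ|(x, xk)|² coincides with ||x − Σ(x, xk)xk||², and is in particular nonnegative, which is Bessel's Inequality:

```python
# Numeric illustration of Theorem 2.8.6 with hypothetical data in R^3.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

e1 = [1.0, 0.0, 0.0]
e2 = [0.0, 1.0, 0.0]
x = [3.0, -4.0, 12.0]

coeffs = [dot(x, e) for e in (e1, e2)]                 # Fourier coefficients (x, x_k)
proj = [coeffs[0] * a + coeffs[1] * b for a, b in zip(e1, e2)]
resid = [xi - pi for xi, pi in zip(x, proj)]           # x - Σ(x, x_k)x_k

defect = dot(x, x) - sum(c * c for c in coeffs)        # ||x||^2 - Σ|(x, x_k)|^2
print(defect, dot(resid, resid))                       # 144.0 144.0
```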

Remarks 2.8.8

(i) As a special case of Corollary 2.8.7, we obtain the following inequality in the pre-Hilbert space C[−π, π]: For f ∈ C[−π, π],

Σ_{n=1}^∞ |(f, un)|² + Σ_{n=0}^∞ |(f, vn)|² ≤ ∫_{−π}^{π} |f(x)|² dx,

where un(x) = (1/√π) sin nx, n = 1, 2, …, v0(x) = 1/√(2π) and vn(x) = (1/√π) cos nx, n = 1, 2, … [see Example 2.8.4(ii)].

(ii) Let M denote the linear manifold spanned by the orthonormal vectors x1, x2, …, xn. Then the proof of Theorem 2.8.6 shows that the distance ||x − Σ_{k=1}^n λk xk|| is minimised if we set λk = (x, xk), k = 1, 2, …, n; i.e.

||x − Σ_{k=1}^n (x, xk) xk|| ≤ ||x − Σ_{k=1}^n λk xk||,

where λ1, λ2, …, λn are arbitrary scalars. Thus, y = Σ_{k=1}^n (x, xk) xk is the vector in M which provides the 'best approximation' to the vector x in the pre-Hilbert space H. Also note that if n > m, then in the best approximation by the linear span of x1, x2, …, xn, the first m coefficients are precisely the same as required for the best approximation in the linear span of x1, x2, …, xm.

(iii) If we set z = x − y, where y = Σ_{k=1}^n (x, xk) xk provides the best approximation amongst the vectors in M, then (z, xk) = (x, xk) − (y, xk) = 0 for k = 1, 2, …, n. Hence (z, y) = 0. Thus, x = y + z, where y is a linear combination of x1, x2, …, xn providing the best approximation to x and z ⊥ xk, k = 1, 2, …, n, is a decomposition of x. The decomposition is unique. Indeed, the vector in M providing the best approximation to x ∈ H is unique. If x = y1 + z1 is another decomposition of x, where y1 provides the best approximation amongst the vectors in M and z1 ⊥ xk, k = 1, 2, …, n, then y + z = y1 + z1 implies y − y1 = z1 − z, which in turn says y = y1 and z = z1, since y, y1 are in M and z, z1 are orthogonal to M.

It follows from Remark 2.8.3(iii) that every orthonormal sequence in H is linearly independent. Conversely, given any countable linearly independent sequence in H, we can construct an orthonormal sequence, keeping the span of the elements at each step of the construction intact [see Theorem 2.8.9 below].

Theorem 2.8.9 (Gram–Schmidt orthonormalisation) Let x1, x2, … be a linearly independent sequence in an inner product space H. Define y1 = x1, u1 = x1/||x1||, and for n = 2, 3, …,

yn = xn − Σ_{k=1}^{n−1} (xn, uk) uk and un = yn/||yn||.

Then {un}_{n≥1} is an orthonormal sequence in H such that, for each n,

span{u1, u2, …, un} = span{x1, x2, …, xn}.

Proof Clearly, ||u1|| = 1 and span{u1} = span{x1}. For n ≥ 1, assume that we have defined y1, y2, …, yn and u1, u2, …, un as stated above and proved that {u1, u2, …, un} is an orthonormal sequence satisfying span{u1, u2, …, un} = span{x1, x2, …, xn}. Define

y_{n+1} = x_{n+1} − Σ_{k=1}^n (x_{n+1}, uk) uk.

Since the set {x1, x2, …, x_{n+1}} is linearly independent, x_{n+1} does not belong to span{x1, x2, …, xn} = span{u1, u2, …, un}. Hence, y_{n+1} ≠ 0 and we may let u_{n+1} = y_{n+1}/||y_{n+1}||. Then ||u_{n+1}|| = 1 and for j ≤ n,

(y_{n+1}, uj) = (x_{n+1}, uj) − Σ_{k=1}^n (x_{n+1}, uk)(uk, uj) = (x_{n+1}, uj) − (x_{n+1}, uj) = 0,

so that

(u_{n+1}, uj) = (y_{n+1}, uj)/||y_{n+1}|| = 0 for j = 1, 2, …, n.

Finally,

span{u1, u2, …, u_{n+1}} = span{x1, x2, …, xn, u_{n+1}} = span{x1, x2, …, x_{n+1}}. □
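The recursion of Theorem 2.8.9 translates directly into code. The following sketch implements it for vectors in ℝ^d (an illustration; the theorem holds in any inner product space):

```python
import math

# Direct implementation of the Gram-Schmidt recursion of Theorem 2.8.9:
#   y_n = x_n - Σ_{k<n} (x_n, u_k) u_k,   u_n = y_n / ||y_n||.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(xs):
    us = []
    for x in xs:
        y = list(x)
        for u in us:
            c = dot(x, u)                       # coefficient (x_n, u_k)
            y = [yi - c * ui for yi, ui in zip(y, u)]
        norm = math.sqrt(dot(y, y))             # nonzero when the x's are independent
        us.append([yi / norm for yi in y])
    return us

us = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
print(us[0])   # [1/√2, 1/√2, 0]
```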

Remarks 2.8.10

(i) The Gram–Schmidt orthonormalisation process as described above yields an orthonormal sequence which is unique, in the sense we now explain. Let e1, …, en and f1, …, fn be orthogonal sets of nonzero vectors in an inner product space. Suppose they have the same linear span and so do e1, …, e_{n−1} and f1, …, f_{n−1}. Then the vectors en and fn are scalar multiples of each other, as the following argument shows.

It is sufficient to argue that fn is a scalar multiple of en. Since fn lies in the linear span of e1, …, en, there exist scalars λ1, …, λn such that fn = Σ_{1≤k≤n} λk ek. However, the vectors e1, …, e_{n−1} lie in the linear span of f1, …, f_{n−1}. Therefore, the sum of the first n − 1 terms in the preceding sum can be written as Σ_{1≤k≤n−1} λk ek = Σ_{1≤k≤n−1} ck fk for some scalars c1, …, c_{n−1}. Thus,

fn = Σ_{1≤k≤n−1} ck fk + λn en.     (2.44)

Taking the inner product with fj, where 1 ≤ j ≤ n − 1, and using the orthogonality of f1, …, fn, we get

0 = (fn, fj) = (Σ_{1≤k≤n−1} ck fk + λn en, fj) = cj (fj, fj) + λn (en, fj).

But fj lies in the linear span of e1, …, e_{n−1} and en is orthogonal to this linear span. Therefore, (en, fj) = 0 and the above equality becomes cj (fj, fj) = 0. As each fj is nonzero, it now follows that each cj = 0 (1 ≤ j ≤ n − 1). Using this in (2.44), we get fn = λn en.

If the vectors en and fn have the same norm, then it further follows that the scalar λn has absolute value 1.

(ii) If e1, e2, … and f1, f2, … are orthogonal sequences of nonzero vectors in an inner product space and

span{e1, e2, …, en} = span{f1, f2, …, fn} for every n,

then en and fn are scalar multiples of each other. If the vectors en and fn have the same norm, then it further follows that the scalar factor has absolute value 1.

(iii) Let Q0, Q1, … be the sequence of polynomials obtained from the sequence of polynomials 1, t, t², … (on the domain [−1, 1]) by orthonormalisation, and let P0, P1, … be the sequence of Legendre polynomials defined in 2.8.13(ii) below. The first k functions in either sequence span the space of polynomials of degree at most k − 1. It follows from what has been proved above that each Qn is a scalar multiple of Pn and vice versa. The value of the scalar can be obtained by comparing (a) the leading coefficients or (b) the constant terms or (c) the integrals over [−1, 1].

(iv) The Gram–Schmidt procedure when applied to a finite sequence {x1, x2, …, xn} of independent vectors leads to orthonormal vectors {u1, u2, …, un} such that

span{u1, u2, …, uk} = span{x1, x2, …, xk}, 1 ≤ k ≤ n.


Corollary 2.8.11 If H is a pre-Hilbert space of dimension n, then it has a basis of

orthonormal vectors.

Theorem 2.8.12 Every ﬁnite-dimensional pre-Hilbert space is complete and is,

therefore, a Hilbert space.

Proof By Corollary 2.8.11, there is a basis u1, u2, …, un of orthonormal vectors. If x = Σ_{k=1}^n λk uk, then ||x||² = Σ_{k=1}^n |λk|², using Remark 2.8.3(ii). The completeness follows as in Example 2.3.4(i). □

The following examples illustrate the orthogonalisation procedure.

Examples 2.8.13

(i) Let H = ℓ². For n = 1, 2, …, let xn = (1, 1, …, 1, 0, 0, …), where 1 occurs only in the first n places. The vector y1 = x1 = (1, 0, 0, …). The vector y2 = (x2 − (x2, x1)x1)/||x2 − (x2, x1)x1|| = (0, 1, 0, …). By induction, it can be shown that the Gram–Schmidt orthonormalisation process yields

yn = (0, 0, …, 0, 1, 0, …), n = 1, 2, …,

where 1 occurs only in the nth place. The sequence of vectors {yn}_{n≥1} is an orthonormal sequence in ℓ².

The set of finite linear combinations of the sequence {yn}_{n≥1} is dense in ℓ². Let x = {λi}_{i≥1} ∈ ℓ². Given ε > 0, there exists n0 such that Σ_{n0+1≤i<∞} |λi|² < ε. Then the vector

y = λ1 y1 + λ2 y2 + ⋯ + λ_{n0} y_{n0}

is such that

||x − y||² = Σ_{n0+1≤i<∞} |λi|² < ε.

(ii) Legendre polynomials. Consider the sequence of vectors xk(t) = t^k, k = 0, 1, 2, …, in L²[−1, 1]. Since any nontrivial finite linear combination Σ_{i=1}^n a_{ki} t^{ki} is a polynomial of degree m = max_i ki, it has at most m zeros. This shows that the vectors {t^k}_{k≥0} are linearly independent. We next calculate the first three orthonormal vectors by the Gram–Schmidt procedure.

Let y0(t) = x0(t) = 1, so that ||y0||² = ∫_{−1}^1 ds = 2 and u0 = y0(t)/||y0|| = 1/√2. Next,

y1(t) = x1(t) − (x1, u0) u0(t) = t − (∫_{−1}^1 (s/√2) ds)(1/√2) = t,

so that ||y1||² = ∫_{−1}^1 s² ds = 2/3 and u1(t) = y1(t)/||y1|| = √(3/2) t.

Further,

y2(t) = x2(t) − (x2, u0) u0(t) − (x2, u1) u1(t) = t² − (∫_{−1}^1 (s²/√2) ds)(1/√2) − (3/2)(∫_{−1}^1 s³ ds) t = t² − 1/3,

so that ||y2||² = ∫_{−1}^1 (s² − 1/3)² ds = 8/45 and

u2(t) = y2(t)/||y2|| = √10 (3t² − 1)/4.

We shall next prove that the general form of these orthonormal polynomials is √(n + 1/2) Pn(t), where

Pn(t) = (1/(2ⁿ n!)) (dⁿ/dtⁿ)(t² − 1)ⁿ, n = 0, 1, 2, ….     (2.45)

The reader can check by using (2.45) that P0(t) = 1, P1(t) = t and P2(t) = (1/2)(3t² − 1), and consequently the first three normalised polynomials are 1/√2, √(3/2) t and (√10/4)(3t² − 1). That the general form of these polynomials is {√(n + 1/2) Pn(t)} will be verified below. We begin by showing that

∫_{−1}^1 Pn(t) Pm(t) dt = 0 if n ≠ m and 2/(2n + 1) if n = m.

For m ≠ n,

2^{n+m} n! m! ∫_{−1}^1 Pn(t) Pm(t) dt = ∫_{−1}^1 (dⁿ/dtⁿ)(t² − 1)ⁿ (d^m/dt^m)(t² − 1)^m dt
= [(d^{n−1}/dt^{n−1})(t² − 1)ⁿ (d^m/dt^m)(t² − 1)^m]_{−1}^1 − ∫_{−1}^1 (d^{m+1}/dt^{m+1})(t² − 1)^m (d^{n−1}/dt^{n−1})(t² − 1)ⁿ dt
= −∫_{−1}^1 (d^{m+1}/dt^{m+1})(t² − 1)^m (d^{n−1}/dt^{n−1})(t² − 1)ⁿ dt,

since (d^{n−k}/dt^{n−k})((t² − 1)ⁿ), k = 1, 2, …, n, is zero at t = ±1. Hence, if n > m and we continue the process of integration by parts, we obtain

± ∫_{−1}^1 (d^{n−m−1}/dt^{n−m−1})((t² − 1)ⁿ) (d^{2m+1}/dt^{2m+1})((t² − 1)^m) dt,

which is equal to zero (the second factor in the integrand is identically zero).

For m = n, we have

∫_{−1}^1 Pn(t)² dt = ((−1)ⁿ/(2^{2n}(n!)²)) ∫_{−1}^1 (t² − 1)ⁿ (d^{2n}/dt^{2n})((t² − 1)ⁿ) dt
= ((−1)ⁿ/(2^{2n}(n!)²)) (2n)! ∫_{−1}^1 (t² − 1)ⁿ dt,     (2.46)

since

(d^{2n}/dt^{2n})((t² − 1)ⁿ) = (2n)!.

On substituting t = cos θ and using the formula

∫_0^{π/2} sin^{2n+1} θ dθ = 2ⁿ n!/(1·3⋯(2n + 1)),

it follows that

∫_{−1}^1 Pn(t)² dt = 2/(2n + 1).

Thus {√((2n + 1)/2) Pn(t)}_{n≥0} is an orthonormal sequence in L²[−1, 1].

It now follows readily that the functions {√((2n + 1)/2) Pn(t)}_{n≥0} are obtained from the sequence 1, t, t², … by orthonormalisation, since each Pn is a polynomial of degree n. The essential uniqueness pointed out in Remark 2.8.10(ii), and the fact that the leading coefficients in Pn(t) and in the nth polynomial obtained via orthonormalisation are both positive, lead to the result.
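The polynomials Pn of (2.45) satisfy Bonnet's recursion (n + 1)P_{n+1}(t) = (2n + 1)t Pn(t) − n P_{n−1}(t), a standard consequence of the Rodrigues formula not derived in the text. The sketch below generates Pn this way and checks the orthogonality relation with a midpoint quadrature:

```python
import math

# Generate P_n via Bonnet's recursion and check orthogonality numerically.

def legendre(n, t):
    p_prev, p = 1.0, t                # P_0 and P_1
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * t * p - k * p_prev) / (k + 1)
    return p

def inner(m, n, steps=20000):
    """Midpoint approximation of the integral of P_m P_n over [-1, 1]."""
    h = 2.0 / steps
    return h * sum(legendre(m, -1 + (k + 0.5) * h) * legendre(n, -1 + (k + 0.5) * h)
                   for k in range(steps))

print(inner(2, 3))   # ≈ 0 (orthogonality)
print(inner(2, 2))   # ≈ 2/5, in accordance with ∫ P_n^2 = 2/(2n + 1)
```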

(iii) Hermite functions. Consider the sequence of functions {fn}_{n≥0} on ℝ, where fn(t) = tⁿ exp(−t²/2). Since

∫_0^1 t^{2n} exp(−t²) dt < ∫_0^1 t^{2n} dt < ∞ (exp(−t²) < 1 for 0 < t < 1)     (2.47)

and

∫_1^∞ t^{2n} exp(−t²) dt < (n + 1)! ∫_1^∞ (1/t²) dt (exp(t²) > t^{2n+2}/(n + 1)!, t > 1),     (2.48)

the integral

∫_{−∞}^∞ |tⁿ exp(−t²/2)|² dt = ∫_{−∞}^∞ t^{2n} exp(−t²) dt = 2 ∫_0^∞ t^{2n} exp(−t²) dt

is finite. Thus, each fn ∈ L²(ℝ). Moreover, the fn are linearly independent, because any nontrivial finite linear combination of the functions fn is a polynomial multiplied by exp(−t²/2), which is zero for no t ∈ ℝ, and any nonzero polynomial has at most finitely many zeros.

We next orthonormalise the functions {fn}_{n≥0}, where fn(t) = tⁿ exp(−t²/2), and obtain the first three orthonormal vectors.

To begin with, (f0, f0) = ∫_{−∞}^∞ exp(−t²) dt = √π, using a well-known formula from advanced calculus. Thus

u0(t) = exp(−t²/2)/π^{1/4}.

Next, since (f1, u0) = 0,

u1(t) = (f1 − (f1, u0) u0)/||f1 − (f1, u0) u0|| = √2 t exp(−t²/2)/π^{1/4} = 2t exp(−t²/2)/(2π^{1/2})^{1/2}.

Similarly,

u2(t) = (f2 − (f2, u0) u0 − (f2, u1) u1)/||f2 − (f2, u0) u0 − (f2, u1) u1|| = (4t² − 2) exp(−t²/2)/(π^{1/2} · 2² · 2!)^{1/2}.

We shall next prove that the general form of these orthonormal functions is

vn(t) = Hn(t) exp(−t²/2)/(2ⁿ n! π^{1/2})^{1/2},

where

Hn(t) = (−1)ⁿ exp(t²) exp^{(n)}(−t²),     (2.49)

and the superscript '(n)' indicates the nth derivative of the function t → exp(−t²). The functions {Hn}_{n≥0} are easily seen to be polynomials and are called Hermite polynomials. The degree of Hn is n, as shown via (2.50) below. The functions vn are called Hermite functions.

For n = 0, 1 and 2, it can be verified that

Hn′(t) = 2n H_{n−1}(t).     (2.50)

In order to establish (2.50) for every n, we first show that

exp^{(n+1)}(−t²) = −2t exp^{(n)}(−t²) − 2n exp^{(n−1)}(−t²).     (2.51)

For n = 1, this is a direct computation. Assuming it for n = k − 1, we have

exp^{(k+1)}(−t²) = (d/dt) exp^{(k)}(−t²)
= (d/dt)(−2t exp^{(k−1)}(−t²) − 2(k − 1) exp^{(k−2)}(−t²))
= −2t exp^{(k)}(−t²) − 2 exp^{(k−1)}(−t²) − 2(k − 1) exp^{(k−1)}(−t²)
= −2t exp^{(k)}(−t²) − 2k exp^{(k−1)}(−t²).

This proves (2.51) for all n = 1, 2, …. Now, differentiating (2.49) and using (2.51), we obtain

Hn′(t) = (−1)ⁿ {2t exp(t²) exp^{(n)}(−t²) + exp(t²)(−2t exp^{(n)}(−t²) − 2n exp^{(n−1)}(−t²))}
= (−1)^{n−1} 2n exp(t²) exp^{(n−1)}(−t²)
= 2n H_{n−1}(t),

which establishes (2.50) in general.

The orthogonality of the Hermite functions may be obtained from

∫_{−∞}^∞ Hm(t) Hn(t) e^{−t²} dt = (−1)ⁿ ∫_{−∞}^∞ Hm(t) exp^{(n)}(−t²) dt.

For n > m, repeated integration by parts, using (2.50) and the fact that exp(−t²) and all its derivatives vanish for t = ±∞, gives

∫_{−∞}^∞ Hm(t) Hn(t) e^{−t²} dt = (−1)^{n−1} 2m ∫_{−∞}^∞ H_{m−1}(t) exp^{(n−1)}(−t²) dt
= ⋯ = (−1)^{n−m} 2^m m! ∫_{−∞}^∞ H0(t) exp^{(n−m)}(−t²) dt = 0.

For n = m, the same computation yields

∫_{−∞}^∞ Hn(t)² exp(−t²) dt = 2ⁿ n! ∫_{−∞}^∞ H0(t) exp(−t²) dt = 2ⁿ n! √π.

Hence the functions

vn(t) = Hn(t) exp(−t²/2)/(2ⁿ n! π^{1/2})^{1/2}, n = 0, 1, 2, …     (2.52)

form an orthonormal sequence in L²(ℝ). The reader can check using (2.52) that

vj(t) = uj(t), j = 0, 1, 2.

The vector Hn(t) exp(−t²/2) is a linear combination of f0, …, fn. Since the sets {Hk(t) exp(−t²/2) : k = 0, …, n} and {fk : k = 0, …, n} are linearly independent, it follows by Remark 2.8.10(ii) that vj = ±uj for all j. The ambiguity of sign can be removed by using the following observation: the leading coefficient of each Hn is positive in view of (2.50), and so is that of fn(t) exp(t²/2) = tⁿ.
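Combining (2.49) with (2.51) gives H_{n+1}(t) = 2t Hn(t) − 2n H_{n−1}(t), so the Hermite polynomials can be generated recursively. The following sketch does so and checks ∫ Hn² e^{−t²} dt = 2ⁿ n! √π numerically on a truncated interval (the truncation at |t| = 8 and the step count are arbitrary choices):

```python
import math

# Hermite polynomials via the recursion H_{n+1} = 2t H_n - 2n H_{n-1},
# with a midpoint-rule check of the weighted orthogonality relations.

def hermite(n, t):
    h_prev, h = 1.0, 2.0 * t          # H_0 and H_1
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * t * h - 2.0 * k * h_prev
    return h

def weighted_inner(m, n, a=-8.0, b=8.0, steps=80000):
    h = (b - a) / steps
    return h * sum(hermite(m, t) * hermite(n, t) * math.exp(-t * t)
                   for t in (a + (k + 0.5) * h for k in range(steps)))

print(weighted_inner(1, 2))   # ≈ 0 (orthogonality)
print(weighted_inner(2, 2))   # ≈ 2^2 · 2! · √π ≈ 14.18
```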

(iv) Laguerre functions. Consider the sequence of functions {fn}_{n≥0} on (0, ∞), where fn(t) = tⁿ exp(−t/2). Since ∫_0^∞ t^{2n} exp(−t) dt = Γ(2n + 1), where Γ(·) is the gamma function, and ∫_0^∞ exp(−t) dt = 1, it follows that each fn ∈ L²(0, ∞). Moreover, the fn are linearly independent, because any nontrivial finite linear combination of the functions fn is a polynomial multiplied by exp(−t/2), which is zero for no t ∈ (0, ∞), and any nonzero polynomial has at most finitely many zeros.

We next orthonormalise the functions {fn}_{n≥0}, where fn(t) = tⁿ exp(−t/2), and obtain the first three orthonormal vectors. First,

(f0, f0) = ∫_0^∞ exp(−t) dt = 1.

Thus, u0(t) = exp(−t/2).

The next two orthonormal vectors are given as

u1(t) = (f1 − (f1, u0) u0)/||f1 − (f1, u0) u0|| = (t − 1) exp(−t/2),

since (f1, u0) = ∫_0^∞ t exp(−t) dt = 1 and ||f1 − (f1, u0) u0|| = 1, and

u2(t) = (f2 − (f2, u1) u1 − (f2, u0) u0)/||f2 − (f2, u1) u1 − (f2, u0) u0|| = ((1/2)t² − 2t + 1) exp(−t/2),

since (f2, u1) = 4, (f2, u0) = 2 and ||f2 − (f2, u1) u1 − (f2, u0) u0|| = 2.

We shall next prove that the general form of these orthonormal functions is

vn(t) = (1/n!) exp(−t/2) Ln(t),     (2.53)

where

Ln(t) = (−1)ⁿ exp(t) (dⁿ/dtⁿ)(tⁿ exp(−t)), n = 0, 1, 2, ….

An application of the Leibniz rule shows that

Ln(t) = (−1)ⁿ Σ_{k=0}^n (−1)^{n−k} C(n, k) n(n − 1)⋯(n − k + 1) t^{n−k},     (2.54)

where C(n, k) denotes the binomial coefficient. The reader can check using (2.53) that v0(t) = exp(−t/2) = u0(t), v1(t) = (t − 1) exp(−t/2) = u1(t) and v2(t) = ((1/2)t² − 2t + 1) exp(−t/2) = u2(t), where u0, u1 and u2 are the orthonormalised vectors computed using the Gram–Schmidt orthonormalisation process.

We begin by showing that

∫_0^∞ exp(−t) Ln(t) Lm(t) dt = 0 for n > m.

For m < n,

∫_0^∞ exp(−t) t^m Ln(t) dt = (−1)ⁿ ∫_0^∞ t^m (dⁿ/dtⁿ)(tⁿ exp(−t)) dt = (−1)^{n+m} m! ∫_0^∞ (d^{n−m}/dt^{n−m})(tⁿ exp(−t)) dt = 0,

on integrating by parts m times; the boundary terms vanish because each derivative (d^j/dt^j)(tⁿ exp(−t)), j < n, vanishes at t = 0 and as t → ∞. Since Lm(t) is a polynomial of degree m, the asserted orthogonality follows. Further,

∫_0^∞ exp(−t) Ln(t)² dt = (−1)ⁿ ∫_0^∞ (dⁿ/dtⁿ)(tⁿ exp(−t)) Ln(t) dt
= (−1)ⁿ ∫_0^∞ (dⁿ/dtⁿ)(tⁿ exp(−t)) tⁿ dt
= n! ∫_0^∞ tⁿ exp(−t) dt = (n!)²,

where the second equality holds because, by the computation above, the terms of Ln of degree less than n contribute nothing and tⁿ is the leading term of Ln by (2.54); the last line results from integrating by parts n times.

The vector Ln(t) exp(−t/2) is a linear combination of f0, …, fn. Since the sets {Lk(t) exp(−t/2) : k = 0, …, n} and {fk : k = 0, …, n} are linearly independent, it follows by Remark 2.8.10(ii) that vj = ±uj for all j. The ambiguity of sign can be removed by using the following observation: the leading coefficient of each Ln is positive in view of (2.54), and so is that of fn(t) exp(t/2) = tⁿ.
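The closed form (2.54) can be coded directly; `math.comb` and `math.perm` supply the binomial coefficient and the falling factorial n(n − 1)⋯(n − k + 1). The numeric check of ∫₀^∞ e^{−t} Ln(t)² dt = (n!)² below truncates the integral at t = 80 (an arbitrary choice; the integrand is negligible beyond it):

```python
import math

# L_n from the closed form (2.54), with a midpoint-rule check of the
# weighted norm identity.

def laguerre(n, t):
    total = 0.0
    for k in range(n + 1):
        # (-1)^{n-k} C(n,k) n(n-1)...(n-k+1) t^{n-k}
        total += (-1) ** (n - k) * math.comb(n, k) * math.perm(n, k) * t ** (n - k)
    return (-1) ** n * total

def weighted_norm_sq(n, b=80.0, steps=80000):
    h = b / steps
    return h * sum(laguerre(n, t) ** 2 * math.exp(-t)
                   for t in ((k + 0.5) * h for k in range(steps)))

print(laguerre(1, 0.0))     # L_1(t) = t - 1, so -1.0
print(laguerre(2, 0.0))     # L_2(t) = t^2 - 4t + 2, so 2.0
print(weighted_norm_sq(2))  # ≈ (2!)^2 = 4
```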

(v) Rademacher functions. Consider the sequence {rn} of functions defined on the interval [0, 1] by

rn(t) = sign(sin 2ⁿπt), n = 0, 1, 2, ….

This sequence was introduced by Rademacher, and the rk are known as Rademacher functions. If the interval [0, 1] is divided into 2^k (k ≥ 1) equal parts, then rk(t) assumes on the interiors of those segments the values +1 and −1 alternately, while at the endpoints, rk(t) = 0.

The reader will note that ||rn|| = (∫_0^1 |rn(t)|² dt)^{1/2} = 1, i.e. rn ∈ L²[0, 1] and ||rn|| = 1, n = 0, 1, 2, …. To prove orthogonality, let n > m ≥ 0. Let I be the open segment which lies between some two consecutive points of subdivision of the interval [0, 1] corresponding to the function rm. Then rm has constant value +1 or −1 on I. Furthermore, I is composed of an even number, precisely 2^{n−m}, of intervals of equal length. On half of these intervals, rn(t) has the value +1, whereas on the other half, rn(t) has the value −1. Consequently,

∫_I rm(t) rn(t) dt = ± ∫_I rn(t) dt = 0.

Summing over all such segments I,

∫_0^1 rm(t) rn(t) dt = 0.

Since an orthonormal set of nonzero vectors is linearly independent [Remark 2.8.3(iii)], it follows that the Rademacher sequence {rn}_{n≥0} of orthonormal functions in L²[0, 1] is a linearly independent sequence. Moreover, the function f(t) = cos 2πt is such that

∫_0^1 rn(t) f(t) dt = Σ_{k=1}^{2ⁿ} (−1)^{k−1} ∫_{(k−1)/2ⁿ}^{k/2ⁿ} f(t) dt = 0,

since the kth term and the (2ⁿ − (k − 1))th term are equal in magnitude and opposite in sign because cos 2πt = cos 2π(1 − t), and consequently add up to 0.

Remark The sequence {rn}_{n≥0} converges for t = 0, 1 and k/2ⁿ, 1 ≤ k < 2ⁿ, n = 1, 2, …, the points of subdivision of the interval [0, 1], and converges for no t other than these points of subdivision; for if t ≠ 0, 1 or any of the points of subdivision, {rn(t)}_{n≥0} assumes the values +1 and −1 infinitely often. (For any t that is not of the form k/2ⁿ, there exists an integer j such that j/2ⁿ < t < (j + 1)/2ⁿ, that is, jπ < 2ⁿπt < (j + 1)π; so rn(t) is 1 if j is even and −1 if j is odd. As n increases, the parity of j keeps changing between even and odd [see Problem 2.8.P10].) Thus, the sequence converges only on the set {0, 1, k/2ⁿ : 1 ≤ k < 2ⁿ, n = 1, 2, …} of measure zero. However, the arithmetic averages of {rn}_{n≥0} converge to the zero function almost everywhere.

Lemma 2.8.14 For distinct nonnegative integers k1, k2, …, kn,

∫_0^1 r_{k1} r_{k2} ⋯ r_{kn} = 0.
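Taking rn(t) = sign(sin 2ⁿπt) as above, the orthogonality relations and Lemma 2.8.14 can be checked exactly on a dyadic grid: the midpoints (k + 1/2)/4096 refine the intervals of constancy of r1, r2 and r3, so the discrete averages below reproduce the corresponding integrals without quadrature error (a sketch; the grid size is an arbitrary power of two):

```python
import math

# Rademacher functions r_n(t) = sign(sin 2^n π t), sampled on a dyadic
# midpoint grid that refines their intervals of constancy exactly.

def rademacher(n, t):
    s = math.sin(2.0 ** n * math.pi * t)
    return 0.0 if s == 0 else math.copysign(1.0, s)

N = 4096
grid = [(k + 0.5) / N for k in range(N)]

def mean_product(*indices):
    return sum(math.prod(rademacher(n, t) for n in indices) for t in grid) / N

print(mean_product(1, 2))     # 0.0  (orthogonality)
print(mean_product(1, 2, 3))  # 0.0  (Lemma 2.8.14)
print(mean_product(2, 2))     # 1.0  (||r_2|| = 1)
```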


Theorem 2.8.15 Let {rn}_{n≥1} be the Rademacher functions. Then the sequence {(Σ_{k=1}^n rk)/n}_{n≥1} of arithmetic means converges to zero almost everywhere with respect to the Lebesgue measure on [0, 1].

Proof Set fn = [(Σ_{k=1}^n rk)/n]⁴, n = 1, 2, …. Observe that each fn belongs to L¹[0, 1]. Indeed, |fn| ≤ [(Σ_{k=1}^n |rk|)/n]⁴ = 1 and ∫_0^1 fn(t) dt ≤ 1. Next, on using rk² = 1, k = 1, 2, … (except at the finitely many points of subdivision), we have

n⁴ fn = [(Σ_{k=1}^n rk)²]²
= [Σ_{k=1}^n rk² + 2 Σ_{k<m} rk rm]²
= [n + 2 Σ_{k<m} rk rm]²
= n² + 4n Σ_{k<m} rk rm + 4 (Σ_{i<j} ri rj)(Σ_{k<m} rk rm)
= n² + 4n Σ_{k<m} rk rm + 4 Σ_{k<m} rk² rm² + 4 [2 Σ_{j=1}^n Σ_{k<m; k,m≠j} rj² rk rm + 2 Σ_{i<j, k<m; i,j,k,m distinct} ri rj rk rm]
= n² + 2n(n − 1) + 4n Σ_{k<m} rk rm + 8 Σ_{j=1}^n Σ_{k<m; k,m≠j} rk rm + 8 Σ_{i<j, k<m; i,j,k,m distinct} ri rj rk rm.     (2.55)

Dividing both sides of (2.55) by n⁴, integrating and using Lemma 2.8.14, we obtain

∫_0^1 fn dt = 1/n² + 2n(n − 1)/n⁴ < 3/n².

Consequently,

Σ_{n=1}^∞ ∫_0^1 fn dt < ∞.

By Corollary 1.3.7 and Remark 1.3.13, it follows that the sequence {fn}_{n≥1} converges to zero almost everywhere; that is, the sequence {(Σ_{k=1}^n rk)/n}_{n≥1} of arithmetic averages converges to zero almost everywhere with respect to Lebesgue measure. This completes the proof. □
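The computation in the proof can be verified combinatorially: on [0, 1], the sign vector (r1(t), …, rn(t)) takes each of the 2ⁿ patterns on a set of measure 2^{−n}, so ∫₀¹ (r1 + ⋯ + rn)⁴ dt equals the average of (Σ εk)⁴ over all sign patterns εk = ±1, which by (2.55) and Lemma 2.8.14 equals n² + 2n(n − 1). The sketch enumerates the patterns and compares:

```python
from itertools import product

# Average of (ε_1 + ... + ε_n)^4 over all 2^n sign patterns, compared with
# the value n^2 + 2n(n - 1) obtained in the proof of Theorem 2.8.15.

def fourth_moment(n):
    total = sum(sum(eps) ** 4 for eps in product((-1, 1), repeat=n))
    return total / 2 ** n

for n in (2, 3, 4, 5):
    print(n, fourth_moment(n), n * n + 2 * n * (n - 1))
```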

2.8.P2. Give an example to show that strict inequality can hold in Corollary 2.8.7, the corollary to Bessel's Inequality.

2.8.P3. Let {ek}_{k≥1} be any orthonormal sequence in an inner product space X. Show that for any x, y ∈ X,

Σ_{k=1}^∞ |(x, ek)(y, ek)| ≤ ||x|| ||y||.

2.8.P4. Let {ek}_{k≥1} be any orthonormal sequence in a Hilbert space H and let M = span{ek}. Show that for any x ∈ H, we have x ∈ M̄ if, and only if, x can be represented by

x = Σ_{k=1}^∞ (x, ek) ek.

2.8.P5. Let f be an absolutely continuous function on [−π, π] with f(−π) = f(π) and f′(x) ∈ L²[−π, π]. Let fn, n ∈ ℤ, be the Fourier coefficients of f(x) in the system {e^{inx}/√(2π)}_{n∈ℤ}. Prove that Σ_{n=−∞}^∞ |fn| < ∞.

2.8.P6. Show that the system {1, t², t⁴, …} is complete in the space L²[0, 1]. It is not complete in L²[−1, 1], since every odd function there is orthogonal to each t^{2k}.

2.8.P7. Find a nonzero vector in ℂ³ orthogonal to (1, 1, 1) and (1, ω, ω²), where ω = exp(2πi/3).


2.8.P8. Let a ∈ ℂ be such that |a| ≠ 1. Find the Fourier coefficients of f ∈ RL², where f(z) = (z − a)^{−1}, with respect to the orthonormal sequence {ej}_{j=−∞}^∞, ej(z) = z^j.

2.8.P9. If the series |a0|² + Σ_{k=1}^∞ (|ak|² + |bk|²) converges, show that there exists a function f ∈ L²[0, 2π] having the ak, bk as its Fourier coefficients, i.e. satisfying the equations

a0 = (1/π) ∫_0^{2π} f(t) dt, ak = (1/π) ∫_0^{2π} f(t) cos kt dt, bk = (1/π) ∫_0^{2π} f(t) sin kt dt, k = 1, 2, …,

and that f is essentially unique, i.e. if there are two such functions, they differ only on a set of measure zero.

2:8:P10. Show that for any $t \in [0, 1]$ that is not of the form $\frac{k}{2^n}$ (i.e. t is not a 'dyadic rational') for any integers k and n, the parity of the (obviously unique) integer j such that $\frac{j-1}{2^n} < t < \frac{j}{2^n}$ keeps changing between even and odd as n increases.
2:8:P11. Prove Lemma 2.8.14: for distinct nonnegative integers $k_1, k_2, \ldots, k_n$, $\int_0^1 r_{k_1} r_{k_2} \cdots r_{k_n}\,dt = 0$.
2:8:P12. Show that completeness of the orthonormal set of Hermite functions in $L^2(-\infty, \infty)$ is equivalent to that of the orthonormal set of Laguerre functions in $L^2(0, \infty)$.
2:8:P13. Let X be a complex inner product space of dimension n. Show that X is isometrically isomorphic to $\mathbb{C}^n$ and is hence complete.

Recall that a subset M of an inner product space is said to be orthogonal if x ⊥ y whenever x and y are distinct vectors of M [Definition 2.8.1]. The orthogonal set M is said to be orthonormal if, in addition, ‖x‖ = 1 for every vector x in M. An orthonormal set is said to be complete if it is a maximal orthonormal set [Definition 2.8.2]. We shall show that there are complete orthonormal sets in any nontrivial inner product space and discuss a few of the many important examples. One also speaks of complete orthogonal sets, which are defined analogously. The classical results of Riesz–Fischer and Parseval will be proved. These will lead to the identification of all infinite-dimensional Hilbert spaces.

We begin by showing that a nontrivial inner product space H (H ≠ {0}) contains a complete orthonormal set.

2.9 Complete Orthonormal Sets 79

Theorem 2.9.1 Let H be an inner product space over F and let H ≠ {0}. Then H contains a complete orthonormal set.

Proof Let S denote the collection of all orthonormal sets in H. Since, for any nonzero vector x, the set {x/‖x‖} is an orthonormal set, it follows that S ≠ ∅. The collection S is partially ordered by inclusion. We wish to show that every totally ordered subset of S has an upper bound in S. It will then follow by Zorn's Lemma that S has a maximal element, namely a complete orthonormal set.

Let $T = \{A_\alpha\}_{\alpha\in K}$, where K is an indexing set, be any totally ordered subset of S. Then the set $\bigcup_\alpha A_\alpha$ is an upper bound for T; indeed, $A_\beta \subseteq \bigcup_\alpha A_\alpha$ for each β. We next show that $\bigcup_\alpha A_\alpha$ is orthonormal. Let x and y be any two distinct elements of $\bigcup_\alpha A_\alpha$, so that $x \in A_\beta$ and $y \in A_\gamma$ for some β and γ in the indexing set K. Since T is totally ordered, either $A_\beta \subseteq A_\gamma$ or $A_\gamma \subseteq A_\beta$. Supposing $A_\beta \subseteq A_\gamma$, it follows that $x, y \in A_\gamma$. So x ⊥ y and ‖x‖ = ‖y‖ = 1. Thus, $\bigcup_\alpha A_\alpha$ is seen to be orthonormal.

By Zorn's Lemma [Sect. 1.3], S has a maximal element. This completes the proof. □

A slight modification of the proof of Theorem 2.9.1 yields the following corollary.

Corollary 2.9.2 Let H be an inner product space over F. If $E \subseteq H$ is an orthonormal set, then there exists a complete orthonormal set S such that $E \subseteq S$.

The next result contains an alternate description of complete orthonormal sets.

Theorem 2.9.3 Let H be an inner product space over F. Suppose that $S \subseteq H$ is an orthonormal set. Then the following are equivalent:

(a) S is a complete orthonormal set;
(b) If x ∈ H is such that x ⊥ S, then x = 0.

Proof Suppose S is a complete orthonormal set and x ⊥ S. If x ≠ 0, then S ∪ {x/‖x‖} is an orthonormal set that properly contains S, contradicting the fact that S is a complete orthonormal set.

On the other hand, suppose x ⊥ S implies x = 0. If S were not a complete orthonormal set, there would exist some orthonormal set $T \subseteq H$ such that T properly contains S. Hence, if x ∈ T\S then ‖x‖ = 1 and x ⊥ S. This contradicts the assumption that x ⊥ S implies x = 0.

Therefore, the orthonormal set S is complete. □

So far we have considered examples of countable orthonormal sets in pre-Hilbert

spaces. If a Hilbert space contains a countable complete orthonormal set, then it is

said to be separable. This deﬁnition of separability is equivalent to Deﬁnition

1.2.10 as the next theorem shows.

Let S be a countable dense set in a Hilbert space H ≠ {0}. By progressively reducing S, if necessary, it can be turned into a linearly independent set. The Gram–Schmidt orthonormalisation process applied to the linearly independent set renders it into an orthonormal set. This orthonormal set is in fact complete. More precisely, we have the following theorem.

Theorem 2.9.4 Let H ≠ {0} be a Hilbert space that contains a countable dense subset S. Then H contains a countable complete orthonormal set that is obtained from S by the Gram–Schmidt orthonormalisation process. Thus H is separable.

If H ≠ {0} contains a countable complete orthonormal set T, then H contains a countable dense set, namely the finite rational linear combinations of vectors in T.

Proof We assume, as we may, that 0 ∉ S. Enumerate the vectors in S as a sequence $\{x_n\}_{n\ge1}$ and let $y_1 = x_{n_1}$, where $n_1 = 1$. If all the $x_n$ for $n > n_1$ are scalar multiples of $x_{n_1}$, then the set $\{x_{n_1}\}$ is the linearly independent set obtained from S. Otherwise, let $y_2 = x_{n_2}$ be the first $x_n$ which is not a scalar multiple of $x_{n_1}$. Then for $n < n_2$, $x_n$ is a scalar multiple of $x_{n_1}$. If all the $x_n$ for $n > n_2$ are expressible as linear combinations of $x_{n_1}$ and $x_{n_2}$, then the set $\{x_{n_1}, x_{n_2}\}$ is the linearly independent set obtained from S. Otherwise, let $y_3 = x_{n_3}$ be the first $x_n$ which is independent of $x_{n_1}$ and $x_{n_2}$. Then for $n < n_3$, $x_n$ is a linear combination of $x_{n_1}$ and $x_{n_2}$. The proof continues inductively, and we thus obtain a finite or countably infinite linearly independent set $\{y_1, y_2, \ldots\} \subseteq S$. Let X be the smallest linear subspace of H containing $\{y_1, y_2, \ldots\}$. It is clear that $S \subseteq X$, since if $x_j \in S$ then $x_j$ is a linear combination of $y_1, y_2, \ldots, y_k$, where k is chosen so that $n_k \le j < n_{k+1}$. This says that X is dense in H. Orthonormalise $\{y_1, y_2, \ldots\}$ by the Gram–Schmidt procedure to obtain the orthonormal set $\{u_1, u_2, \ldots\}$. It remains to show that the orthonormal set $\{u_1, u_2, \ldots\}$ is complete.

Let x ∈ H be such that $(x, u_k) = 0$ for k = 1, 2, …. Then $(x, \sum_{k=1}^{n} a_k u_k) = 0$ for all finite linear combinations of the $u_n$ and so (x, y) = 0 for all y ∈ X. Let $\{z_n\}_{n\ge1}$ be a sequence in X such that $\|x - z_n\| \to 0$ as n → ∞. Then $\|x\|^2 = (x, x) - (x, z_n) = (x, x - z_n) \le \|x\|\,\|x - z_n\| \to 0$ as n → ∞. Hence x = 0, and $\{u_1, u_2, \ldots\}$ is complete by Theorem 2.9.3.

Clearly, the closure of the rational linear combinations of the vectors of $T = \{x_k\}$ contains all possible linear combinations of T, i.e. contains [T], and is hence the same as $\overline{[T]}$. Let x ∈ H. Now, $\sum_{k=1}^{n} (x, x_k)x_k \in [T]$. Using Bessel's Inequality [Theorem 2.8.6], it follows that $\sum_{k=1}^{\infty} (x, x_k)x_k$ converges to some y ∈ H. In fact, $y \in \overline{[T]}$. Suppose y ≠ x. Then x − y ≠ 0, while $(x - y, x_k) = (x, x_k) - (x, x_k) = 0$ for every k, so that x − y ⊥ T; since T is complete, this forces x − y = 0, a contradiction. Hence $x = y \in \overline{[T]}$, so that the rational linear combinations of the vectors of T form a countable dense subset of H. This completes the proof. □

However, there are Hilbert spaces which contain non-denumerable orthonormal

sets and are, therefore, nonseparable. We give below examples of such Hilbert

spaces.

Examples 2.9.5

(i) Consider the collection X of functions on R representable in the form


$$x(t) = \sum_{k=1}^{n} a_k e^{i\lambda_k t}$$

for arbitrary n, real numbers $\lambda_1, \lambda_2, \ldots, \lambda_n$ and complex coefficients $a_1, a_2, \ldots, a_n$. X is a vector space, and an inner product in X is defined by

$$(x, y) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} x(t)\overline{y(t)}\,dt.$$

If $y(t) = \sum_{k=1}^{m} b_k e^{i\mu_k t}$, then

$$(x, y) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \sum_{j=1}^{n}\sum_{k=1}^{m} a_j \overline{b_k}\, e^{i(\lambda_j - \mu_k)t}\,dt = \sum_{j=1}^{n}\sum_{k=1}^{m} a_j \overline{b_k} \qquad (2.56)$$

since

$$\lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} e^{i\lambda t}\,dt = \begin{cases} 1 & \text{if } \lambda = 0 \\ 0 & \text{if } \lambda \neq 0. \end{cases}$$

The reader will note that the summation in (2.56) is taken over all j and k for which $\lambda_j = \mu_k$. X together with the inner product defined in (2.56) is an inner product space. This is known as the space of trigonometric polynomials on ℝ. Its completion H is a Hilbert space. The set $\{u_r(t) = e^{irt}: r \in \mathbb{R}\}$ is an uncountable orthonormal set in the Hilbert space H, where $H = \overline{X}$, the closure of X.

(ii) Let X be a nonempty set. Consider the Hilbert space $L^2(X, \mathcal{A}, \mu)$, where $\mathcal{A}$ denotes the collection of all subsets of X and µ is the counting measure on X; that is, if $E \in \mathcal{A}$, µ(E) is equal to the number of points in E when E is finite and is infinite if E is infinite. The space $L^2(X, \mathcal{A}, \mu)$ is denoted by $\ell^2(X)$.

Consider the subset of $\ell^2(X)$ consisting of all characteristic functions of one-point sets in X, i.e. $\{\chi_{\{x\}}: x \in X\}$. Observe that $(\chi_{\{x\}}, \chi_{\{y\}}) = 0$ for x ≠ y and $\|\chi_{\{x\}}\| = 1$.

Suppose now x ≠ y and consider the distance between $\chi_{\{x\}}$ and $\chi_{\{y\}}$:

$$\big\|\chi_{\{x\}} - \chi_{\{y\}}\big\|_2^2 = \sum_{z\in X} \big|\chi_{\{x\}}(z) - \chi_{\{y\}}(z)\big|^2 = 2.$$

Thus

$$\big\|\chi_{\{x\}} - \chi_{\{y\}}\big\|_2 = \sqrt{2}.$$

The open balls $S(\chi_{\{x\}}, 1/\sqrt{2})$ with centres $\chi_{\{x\}}$ and radii $1/\sqrt{2}$ are nonoverlapping, since no ball $S(\chi_{\{x\}}, 1/\sqrt{2})$ contains a point of the set $\{\chi_{\{x\}}: x \in X\}$ other than its centre. Now suppose that X is an uncountably infinite set. We claim that the space $\ell^2(X)$ is nonseparable. Suppose not, and let $\{z_k\}$ be a countable dense set in $\ell^2(X)$. Each of the balls $S(\chi_{\{x\}}, 1/\sqrt{2})$ will contain a point $z_k$ of the countable dense set. Since the balls are nonoverlapping, the points contained in different balls cannot be identical. We thus have an injective map from X into the countable dense set, which is not possible as X is uncountable.

In view of the examples above, we consider orthonormal sets which are not necessarily countable. We begin with the following definition that formalises the remarks above Definition 2.7.5.

Definition 2.9.6 Suppose $\{x_\alpha: \alpha \in K\}$, where K is an indexing set, is a collection of elements from a normed linear space X. $\{x_\alpha: \alpha \in K\}$ is said to be summable to x ∈ X, written

$$\sum_{\alpha\in K} x_\alpha = x \quad\text{or}\quad \sum_\alpha x_\alpha = x,$$

if for all ε > 0, there exists some finite set of indices $J_0 \subseteq K$, such that for any finite set of indices $J \supseteq J_0$,

$$\Big\|x - \sum_{\alpha\in J} x_\alpha\Big\| < \varepsilon.$$

This notion of summability can be easily reconciled with the usual notion of

summability of a series when K consists of the natural numbers.
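Over K = ℕ, summability in this unordered sense is stronger than ordered convergence of the series: the finite sums over arbitrary finite index sets must stabilise. A small numerical sketch (an illustration, using the family $\{(-1)^{n+1}/n\}$ as the assumed example) contrasts the two notions:

```python
# The alternating series sum_n (-1)^(n+1)/n converges (to ln 2) in the
# ordered sense, but the *family* {(-1)^(n+1)/n : n >= 1} is not summable
# in the sense of Definition 2.9.6: finite sums over index sets containing
# only odd n grow without bound, so no finite J0 can pin all finite sums
# containing it near a single limit.
import math

ordered = sum((-1) ** (n + 1) / n for n in range(1, 200001))
assert abs(ordered - math.log(2)) < 1e-4       # ordered convergence

odd_sum = sum(1.0 / n for n in range(1, 20001, 2))
assert odd_sum > 5.0                           # finite sums over odd indices blow up
```

This is the familiar fact that for K = ℕ summability coincides with unconditional (rearrangement-invariant) convergence.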

Remarks 2.9.7

(i) Suppose $S = \{x_\alpha: \alpha \in K\}$, where K is an indexing set, is a collection of elements from the normed linear space ℝ. If $0 \le x_\alpha < \infty$ for each α ∈ K, then $\sum_\alpha x_\alpha$ is the supremum of the set of all finite sums $x_{\alpha_1} + x_{\alpha_2} + \cdots + x_{\alpha_n}$, where $\alpha_1, \alpha_2, \ldots, \alpha_n$ are distinct members of K. In this situation, the sum can be infinity, which is outside the space ℝ. However, if it is within the space, then the two notions of summation are identical. If $x_\alpha = \infty$ for some α ∈ K, then the sum $\sum_\alpha x_\alpha$ is equal to infinity.

(ii) It is easy to check that if $\sum_\alpha x_\alpha = x$ and $\sum_\alpha y_\alpha = y$, then

$$\sum_\alpha (x_\alpha + y_\alpha) = x + y \quad\text{and}\quad \sum_\alpha \lambda x_\alpha = \lambda x, \quad \lambda \in F.$$


(iii) Although K may be an uncountable indexing set, it is the sum over only a countable set of indices that matters, as the next result shows.

Proposition 2.9.8 Let X be a Banach space over F and suppose $\{x_j: j \in K\} \subseteq X$. The family $\{x_j: j \in K\}$ is summable if, and only if, for every ε > 0, there exists a finite set $J_0$ of indices such that $\|\sum_{j\in J} x_j\| < \varepsilon$ whenever J is a finite set of indices disjoint from $J_0$.

If $\{x_j\}$ is summable, then the set of those indices for which $x_j \ne 0$ is countable.

Proof If $\{x_j\}$ is a summable family with sum x, then for every ε > 0, there exists a finite set $J_0$ such that $\|x - \sum_{j\in J_1} x_j\| < \varepsilon/2$ whenever $J_1 \supseteq J_0$ and is finite. It follows that if $J \cap J_0 = \emptyset$, then

$$\Big\|\sum_{j\in J} x_j\Big\| = \Big\|\sum_{j\in J\cup J_0} x_j - \sum_{j\in J_0} x_j\Big\| \le \Big\|x - \sum_{j\in J\cup J_0} x_j\Big\| + \Big\|x - \sum_{j\in J_0} x_j\Big\| < \varepsilon.$$

The reader will note that we have not used the completeness of X in the above argument.

If, conversely, the condition is satisfied, then for every positive integer n, there exists a finite set $J_n$ such that $\|\sum_{j\in J} x_j\| < 1/n$ whenever J is a finite set of indices and $J \cap J_n = \emptyset$. By replacing $J_n$ by $J_1 \cup J_2 \cup \cdots \cup J_n$, n = 1, 2, …, we see that there is a sequence $\{J_n\}$ of finite sets of indices which is increasing. If n < m, then

$$\Big\|\sum_{j\in J_m} x_j - \sum_{j\in J_n} x_j\Big\| = \Big\|\sum_{j\in J_m\setminus J_n} x_j\Big\| < 1/n$$

since $(J_m\setminus J_n) \cap J_n = \emptyset$. By the completeness of X, it follows that there exists x such that $\|\sum_{j\in J_n} x_j - x\| \to 0$. For ε > 0, there exists $n_0 > 2/\varepsilon$ such that $\|\sum_{j\in J_{n_0}} x_j - x\| < \varepsilon/2$. If J is any finite set of indices containing $J_{n_0}$, then

$$\Big\|x - \sum_{j\in J} x_j\Big\| \le \Big\|x - \sum_{j\in J_{n_0}} x_j\Big\| + \Big\|\sum_{j\in J\setminus J_{n_0}} x_j\Big\| < \varepsilon/2 + 1/n_0 < \varepsilon.$$

Finally, we show that $x_j = 0$ for all but countably many j. If j is an index which does not belong to $J_1 \cup J_2 \cup \cdots$, then $\|x_j\| < 1/n$ for every n. The reader will note that we have not used the completeness of X in this argument. This completes the proof. □


If the sequence $\{x_n\}_{n\ge1}$ in a Hilbert space is orthogonal, then $\sum_{n=1}^{\infty} x_n$ converges if, and only if, $\sum_{n=1}^{\infty} \|x_n\|^2 < \infty$. More generally, the following theorem holds.

Theorem 2.9.9 Let H be a Hilbert space and let $\{x_j: j \in K\}$ be an orthogonal family in H, i.e. $x_j \perp x_k$ for j ≠ k. Then $\sum_{j\in K} x_j$ converges if, and only if, $\sum_{j\in K} \|x_j\|^2 < \infty$. Moreover, if $\sum_{j\in K} x_j = x$, then $\|x\|^2 = \sum_{j\in K} \|x_j\|^2$.

Proof If $\sum_{j\in K} x_j$ converges, then for every positive number ε there exists a finite set $J_0$ such that $\|\sum_{j\in J} x_j\| < \varepsilon$ whenever $J \cap J_0 = \emptyset$ [Proposition 2.9.8], and consequently, by orthogonality,

$$\sum_{j\in J} \|x_j\|^2 = \Big\|\sum_{j\in J} x_j\Big\|^2 < \varepsilon^2$$

whenever $J \cap J_0 = \emptyset$.

If, conversely, $\sum_{j\in K} \|x_j\|^2 < \infty$, then for every positive ε there exists a finite set $J_0$ such that $\sum_{j\in J} \|x_j\|^2 < \varepsilon^2$ (consequently $\|\sum_{j\in J} x_j\|^2 < \varepsilon^2$) whenever $J \cap J_0 = \emptyset$. Summability now follows from Proposition 2.9.8.

Observe that

$$\|x\|^2 = (x, x) = \Big(\sum_{j\in K} x_j, x\Big) = \sum_{j\in K}\Big(x_j, \sum_{k\in K} x_k\Big) = \sum_{j\in K}\sum_{k\in K} (x_j, x_k) = \sum_{j\in K} (x_j, x_j) = \sum_{j\in K} \|x_j\|^2.$$

□
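A finite-dimensional sketch of the norm identity just proved (an illustration only; the orthogonal family is manufactured here by scaling the columns of a unitary matrix obtained from a QR factorisation):

```python
# Pythagorean identity for an orthogonal family: ||sum x_j||^2 = sum ||x_j||^2.
import numpy as np

rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6)))
family = [(k + 1) * q[:, k] for k in range(6)]   # pairwise orthogonal, norms 1..6

lhs = np.linalg.norm(sum(family)) ** 2
rhs = sum(np.linalg.norm(x) ** 2 for x in family)
assert abs(lhs - rhs) < 1e-9                     # both equal 1+4+...+36 = 91
```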

The following general form of Bessel's Inequality holds.

Theorem 2.9.10 (Bessel's Inequality) Let $S = \{x_\alpha: \alpha \in K\}$ be an orthonormal set in an inner product space H and let x ∈ H. Then we have

$$\sum_{\alpha\in K} |(x, x_\alpha)|^2 \le \|x\|^2.$$

Proof The inequality in Theorem 2.8.6 implies that for each finite set $J \subseteq K$ of indices, we have

$$\sum_{\alpha\in J} |(x, x_\alpha)|^2 \le \|x\|^2.$$

Hence

$$\sum_{\alpha\in K} |(x, x_\alpha)|^2 = \sup\Big\{\sum_{\alpha\in J} |(x, x_\alpha)|^2 : J \subseteq K,\ J \text{ finite}\Big\} \le \|x\|^2.$$

□

Remark 2.9.11 The set $A = \{\alpha \in K: (x, x_\alpha) \ne 0\}$ is countable. By Bessel's Inequality, $\sum_{\alpha\in K} |(x, x_\alpha)|^2 \le \|x\|^2$. The countability now follows from Proposition 2.9.8.
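Bessel's Inequality can be seen numerically. The sketch below (an approximation, with the integrals replaced by a midpoint rule on a finite grid and the system truncated to finitely many terms) takes x(t) = t in $L^2[-\pi, \pi]$ with the orthonormal system $\{e^{int}/\sqrt{2\pi}\}$:

```python
# Bessel: sum of squared Fourier coefficients stays below ||x||^2.
import numpy as np

dt = 2 * np.pi / 20000
t = -np.pi + (np.arange(20000) + 0.5) * dt       # midpoint grid on [-pi, pi]
x = t                                            # the vector x(t) = t
norm_sq = np.sum(np.abs(x) ** 2) * dt            # ||x||^2 = 2*pi^3/3
coeffs = [np.sum(x * np.exp(-1j * n * t)) * dt / np.sqrt(2 * np.pi)
          for n in range(-10, 11)]               # (x, e^{int}/sqrt(2*pi))
bessel = sum(abs(c) ** 2 for c in coeffs)
assert bessel <= norm_sq
assert abs(norm_sq - 2 * np.pi ** 3 / 3) < 1e-3
```

The gap between the two sides shrinks as more terms are included, in accordance with Parseval's Identity below.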

Theorem 2.9.12 Let $\{x_\alpha: \alpha \in K\}$ be an orthonormal set in a Hilbert space H. For every x ∈ H, the vector $y = \sum_{\alpha\in K} (x, x_\alpha)x_\alpha$ exists in H and $x - y \perp x_\alpha$ for every α ∈ K.

Proof By Bessel's Inequality 2.9.10, there is a countable set of $x_\alpha$ for which $(x, x_\alpha) \ne 0$. Arrange them as a sequence $x_1, x_2, \ldots$. Let ε > 0 be given. Then

$$\Big\|\sum_{i=n}^{n+k} (x, x_i)x_i\Big\|^2 = \Big(\sum_{i=n}^{n+k} (x, x_i)x_i,\ \sum_{j=n}^{n+k} (x, x_j)x_j\Big) = \sum_{i=n}^{n+k}\sum_{j=n}^{n+k} (x, x_i)\overline{(x, x_j)}\,(x_i, x_j) = \sum_{i=n}^{n+k} |(x, x_i)|^2 < \varepsilon$$

for sufficiently large n and every k, again using Bessel's Inequality. It follows that the sequence of partial sums $\{\sum_{i=1}^{n} (x, x_i)x_i\}_{n\ge1}$ is Cauchy in H, and H being a Hilbert space, $y = \sum_{i=1}^{\infty} (x, x_i)x_i$ exists in H and equals $\sum_{\alpha\in K} (x, x_\alpha)x_\alpha$. Note that the foregoing argument is valid whether F = ℂ or ℝ, as in the latter case $\overline{(x, x_j)} = (x, x_j)$.

It remains to show that $x - y \perp x_\alpha$ for every α ∈ K. For each n, let $y_n = \sum_{k=1}^{n} (x, x_k)x_k$. We first prove that $(x - y_n, x_\alpha) = 0$ for those α for which $(x, x_\alpha) = 0$ and any n. Note that $x_\alpha$ cannot appear in the representation of $y_n$ for any n. Therefore, $(x_k, x_\alpha) = 0$ for all k and hence

$$(x - y_n, x_\alpha) = (x, x_\alpha) - \sum_{k=1}^{n} (x, x_k)(x_k, x_\alpha) = 0.$$

Next we prove $(x - y_n, x_\alpha) = 0$ for those α for which $(x, x_\alpha) \ne 0$ and sufficiently large n. Note that $x_\alpha$ must appear in the representation of $y_n$ for sufficiently large n. Therefore,

$$(x - y_n, x_\alpha) = (x, x_\alpha) - \sum_{k=1}^{n} (x, x_k)(x_k, x_\alpha) = (x, x_\alpha) - (x, x_\alpha) = 0.$$

Now, for sufficiently large n,

$$|(x - y, x_\alpha)| \le |(x - y_n, x_\alpha)| + |(y_n - y, x_\alpha)| \le 0 + \|y_n - y\|\,\|x_\alpha\| = \|y_n - y\| \quad (\text{using orthonormality of } \{x_\alpha: \alpha\in K\}),$$

and the right-hand side tends to 0 as n → ∞. Hence $(x - y, x_\alpha) = 0$ for every α ∈ K. This completes the proof. □

We next investigate the problem of writing an arbitrary element x in a Hilbert

space H as a limit of linear combinations of elements of an orthonormal set. We

begin with a deﬁnition.

Definition 2.9.13 Let H be a Hilbert space and $S = \{x_\alpha: \alpha \in K\}$ be an orthonormal set in H. We say that S is a basis (orthonormal basis) in H if for every x ∈ H, the following holds:

$$x = \sum_{\alpha\in K} (x, x_\alpha)x_\alpha.$$

Theorem 2.9.14 If H is a Hilbert space, then $S = \{x_\alpha: \alpha \in K\}$ consisting of orthonormal vectors in H is a basis if, and only if, S is a complete orthonormal system of vectors.

Proof Suppose S is a basis in H. If x ∈ H satisfies $(x, x_\alpha) = 0$, α ∈ K, then the definition of the basis gives

$$x = \sum_{\alpha\in K} (x, x_\alpha)x_\alpha = 0.$$

By Theorem 2.9.3, S is therefore a complete orthonormal set.

On the other hand, suppose that S is complete in H. Let β ∈ K. Then for any x ∈ H, the sum $\sum_{\alpha\in K} (x, x_\alpha)x_\alpha$ exists by Theorem 2.9.12 and

$$\Big(x - \sum_{\alpha\in K} (x, x_\alpha)x_\alpha,\ x_\beta\Big) = (x, x_\beta) - \sum_{\alpha\in K} (x, x_\alpha)(x_\alpha, x_\beta) = (x, x_\beta) - (x, x_\beta) = 0,$$

using the fact that $S = \{x_\alpha: \alpha \in K\}$ consists of orthonormal vectors. Thus, the vector $x - \sum_{\alpha\in K} (x, x_\alpha)x_\alpha$ is orthogonal to $x_\beta$ for every β ∈ K. The hypothesis, together with Theorem 2.9.3, now implies

$$x = \sum_{\alpha\in K} (x, x_\alpha)x_\alpha,$$

i.e. S is a basis in H. □

Examples 2.9.15

(i) The set $e_1 = (1, 0, 0, \ldots)$, $e_2 = (0, 1, 0, 0, \ldots)$, … is a complete orthonormal set (basis) in $\ell^2$. Indeed, if $x = (x_1, x_2, \ldots) \in \ell^2$ and $(x, e_j) = 0$, j = 1, 2, …, then $x_j = 0$, j = 1, 2, …, and so x = 0. Moreover, if $x = (x_1, x_2, \ldots) \in \ell^2$ then $x = \sum_{i=1}^{\infty} (x, e_i)e_i$, where the partial sums converge in the $\ell^2$-norm: $\|\sum_{i=1}^{n} (x, e_i)e_i - x\|^2 = \sum_{i=n+1}^{\infty} |x_i|^2$ is small for large n.

(ii) [Cf. Examples 2.9.5(ii)] Let X be a non-denumerable set. The set $\ell^2(X) = L^2(X, \mathcal{A}, \mu)$, where $\mathcal{A}$ denotes the collection of all subsets of X and µ is the counting measure on X, is a nonseparable Hilbert space. The set $\{\chi_{\{x\}}: x \in X\}$ of characteristic functions is an uncountable orthonormal set in $\ell^2(X)$. In fact, it is a complete orthonormal set. If $f \in \ell^2(X)$ and, for x ∈ X, $(f, \chi_{\{x\}}) = 0$, then $\sum_{y\in X} f(y)\chi_{\{x\}}(y) = 0$, which implies f(x) = 0. So f is the identically zero function [see Theorem 2.9.3].

(iii) The Rademacher system is not complete. The function $f(x) = \cos 2\pi x$ is orthogonal to all the Rademacher functions [see Example 2.8.13(v)].
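The orthogonality in (iii) can be verified by exact piecewise integration. The sketch below assumes the convention $r_n(t) = \operatorname{sgn}(\sin 2^n\pi t)$, under which $r_n$ is constant with sign $(-1)^j$ on each dyadic interval $(j/2^n, (j+1)/2^n)$:

```python
# (cos 2*pi*t, r_n) over [0, 1], summed exactly interval by interval.
from math import sin, pi

def inner_product_with_rademacher(n):
    m = 2 ** n
    total = 0.0
    for j in range(m):
        # integral of cos(2*pi*t) over (j/m, (j+1)/m), times the sign of r_n there
        piece = (sin(2 * pi * (j + 1) / m) - sin(2 * pi * j / m)) / (2 * pi)
        total += (-1) ** j * piece
    return total

for n in range(1, 10):
    assert abs(inner_product_with_rademacher(n)) < 1e-12
```

The vanishing reflects the symmetry t ↦ t + ½, which fixes each $r_n$ while negating cos 2πt.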

The following theorem provides various characterisations of complete orthonormal sets and helps decide which orthonormal sets are complete. Some of the characterisations have already been described.

Theorem 2.9.16 Let $S = \{x_\alpha: \alpha \in K\}$ be an orthonormal set in a Hilbert space H. Each of the following conditions implies the other five:

(a) S is a complete orthonormal set in H;
(b) x ⊥ S implies x = 0;
(c) x ∈ H implies $x = \sum_{\alpha\in K} (x, x_\alpha)x_\alpha$; that is, S is a basis in H;
(d) $\|x\|^2 = \sum_{\alpha\in K} |(x, x_\alpha)|^2$ for each x ∈ H (Parseval's Identity);
(e) for x, y ∈ H, $(x, y) = \sum_{\alpha\in K} (x, x_\alpha)\overline{(y, x_\alpha)}$;
(f) $\overline{[S]} = H$; that is, the smallest subspace of H containing S is dense in H.

The equality in (c) means that the right-hand side has only a countable number of nonzero terms, and every rearrangement of this series converges to x [Definition 2.9.6]. The equations in (d) and (e) are to be interpreted analogously.

Of course, (d) is a special case of (e).

Proof The equivalence of (a) and (b) has been proved [Theorem 2.9.3]. So also the equivalence of (a) and (c) [Theorem 2.9.14]. We shall prove that (b) ⇒ (f) ⇒ (d) ⇒ (e) ⇒ (b).

(b) implies (f). Let $M = \overline{[S]}$. Since [S] is a subspace, so is M. (For x, y ∈ M, there exist sequences $\{x_n\}_{n\ge1}$ and $\{y_n\}_{n\ge1}$ in [S] such that $x_n \to x$ and $y_n \to y$; then $x_n + y_n \to x + y$ and $\lambda x_n \to \lambda x$, so that x + y and λx belong to M.) Suppose M ≠ H, so that there exists a nonzero vector x in H which is not in M. The vector $y = \sum_{\alpha\in K} (x, x_\alpha)x_\alpha$ exists in H and $x - y \perp x_\alpha$ for every α ∈ K [Theorem 2.9.12]. Moreover, x ≠ y since y ∈ M and x ∉ M, and hence x − y ≠ 0. This contradicts (b).

(f) implies (d). Suppose (f) holds. For x ∈ H and ε > 0, there exists a finite set $\{x_{\alpha_1}, x_{\alpha_2}, \ldots, x_{\alpha_n}\}$ such that some linear combination of these vectors has distance less than ε from x. By Remark 2.8.8(ii), the vector $z = \sum_{i=1}^{n} (x, x_{\alpha_i})x_{\alpha_i}$ provides the best approximation to the vector x in the linear span of $\{x_{\alpha_1}, x_{\alpha_2}, \ldots, x_{\alpha_n}\}$; so ‖x − z‖ < ε, and hence ‖x‖ < ‖z‖ + ε, which implies

$$(\|x\| - \varepsilon)^2 < \|z\|^2 = \sum_{i=1}^{n} |(x, x_{\alpha_i})|^2 \le \sum_{\alpha\in K} |(x, x_\alpha)|^2.$$

Since ε > 0 is arbitrary, we obtain $\|x\|^2 \le \sum_{\alpha\in K} |(x, x_\alpha)|^2$. The result now follows using Bessel's Inequality 2.9.10.

(d) implies (e). Note that (d) can be written as

$$(x, x) = \Big(\sum_{\alpha\in K} (x, x_\alpha)x_\alpha,\ \sum_{\alpha\in K} (x, x_\alpha)x_\alpha\Big).$$

Replacing x here by x + λy and expanding both sides, the terms in (x, x) and (y, y) cancel by (d), leaving

$$\bar\lambda(x, y) + \lambda(y, x) = \bar\lambda\Big(\sum_{\alpha\in K} (x, x_\alpha)x_\alpha,\ \sum_{\alpha\in K} (y, x_\alpha)x_\alpha\Big) + \lambda\Big(\sum_{\alpha\in K} (y, x_\alpha)x_\alpha,\ \sum_{\alpha\in K} (x, x_\alpha)x_\alpha\Big). \qquad (2.57)$$

Taking λ = 1 and λ = i, (2.57) shows that the real and imaginary parts of (x, y) and $\big(\sum_{\alpha\in K} (x, x_\alpha)x_\alpha, \sum_{\alpha\in K} (y, x_\alpha)x_\alpha\big)$ are equal. Hence

$$(x, y) = \Big(\sum_{\alpha\in K} (x, x_\alpha)x_\alpha,\ \sum_{\alpha\in K} (y, x_\alpha)x_\alpha\Big) = \sum_{\alpha\in K} (x, x_\alpha)(x_\alpha, y).$$

(e) implies (b). Finally, if (b) fails to be true, there exists a vector z ≠ 0 such that $(z, x_\alpha) = 0$ for all α ∈ K. If x = y = z, then $\|z\|^2 = (x, y) \ne 0$ but $\sum_{\alpha\in K} (z, x_\alpha)(x_\alpha, z) = 0$. Hence (e) fails to hold. Thus, (e) implies (b) and the proof is complete. □
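Conditions (d) and (e) can be illustrated in a finite-dimensional H, where any orthonormal basis will do. The sketch below (an illustration only) takes H = ℂ⁵ with a random orthonormal basis produced by a QR factorisation, and the convention that the inner product is linear in the first argument:

```python
# Parseval: ||x||^2 = sum |(x, e_a)|^2 and (x, y) = sum (x, e_a) conj((y, e_a)).
import numpy as np

rng = np.random.default_rng(1)
basis, _ = np.linalg.qr(rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5)))
x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
y = rng.standard_normal(5) + 1j * rng.standard_normal(5)

cx = basis.conj().T @ x          # coefficients (x, e_a), inner product linear in x
cy = basis.conj().T @ y
assert abs(np.vdot(x, x).real - np.sum(np.abs(cx) ** 2)) < 1e-10      # (d)
assert abs(np.vdot(y, x) - np.sum(cx * cy.conj())) < 1e-10            # (e)
```

(Here `np.vdot(y, x)` computes $\sum_i \overline{y_i} x_i$, which is (x, y) in the convention of this book.)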


To deal with completeness of orthonormal sets in the next few examples, we will use their equivalent descriptions provided in Theorem 2.9.16.

Examples 2.9.17

(i) In the completion H of the inner product space of trigonometric polynomials [see Example 2.9.5(i)], the uncountable orthonormal set $\{u_r(t) = \exp(irt): r \in \mathbb{R}\}$ is complete since $\overline{[\{u_r\}]} = H$ [equivalence of (a) and (f) in Theorem 2.9.16].

(ii) Let $H = L^2[-1, 1]$ and for n = 0, 1, 2, …, let $P_n$ denote the Legendre polynomial of degree n. Note that $P_n$ is obtained by applying the Gram–Schmidt orthonormalisation process to the linearly independent vectors $\{1, t, t^2, \ldots, t^n\}$. Moreover,

$$\mathrm{span}\{1, t, t^2, \ldots, t^n\} = \mathrm{span}\{P_0, P_1, \ldots, P_n\}. \qquad (2.58)$$

This is true for each n. Let x ∈ H and ε > 0. By Example 2.5.1, there exists $y \in C[-1, 1]$ such that ‖x − y‖ < ε. By Weierstrass' Theorem, there is a polynomial Q(t) such that |y(t) − Q(t)| < ε for all $t \in [-1, 1]$. Then

$$\|y - Q\|_2^2 = \int_{-1}^{1} |y(t) - Q(t)|^2\,dt < 2\varepsilon^2.$$

Thus

$$\|x - Q\|_2 \le \|x - y\|_2 + \|y - Q\|_2 < (1 + \sqrt{2})\varepsilon.$$

In view of (2.58), the set $\{P_0, P_1, \ldots\}$ constitutes a complete orthonormal basis [Theorem 2.9.16(f)].
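The Gram–Schmidt construction of the $P_n$ can be carried out numerically. The sketch below (an illustration; inner products are evaluated by Gauss–Legendre quadrature, which is exact for the polynomial degrees involved) orthonormalises $1, t, t^2, t^3$ and compares the result with numpy's Legendre basis, which is normalised by $P_n(1) = 1$ rather than $\|P_n\| = 1$:

```python
# Gram-Schmidt on monomials over L^2[-1, 1] reproduces the Legendre polynomials.
import numpy as np
from numpy.polynomial import legendre

nodes, weights = legendre.leggauss(20)

def ip(f, g):                    # L^2[-1,1] inner product via quadrature
    return np.sum(weights * f(nodes) * g(nodes))

funcs = [np.polynomial.Polynomial([0] * k + [1]) for k in range(4)]   # 1, t, t^2, t^3
ortho = []
for f in funcs:
    for e in ortho:
        f = f - ip(f, e) * e
    ortho.append(f / np.sqrt(ip(f, f)))

for n, e in enumerate(ortho):
    Pn = legendre.Legendre.basis(n).convert(kind=np.polynomial.Polynomial)
    scaled = Pn * np.sqrt((2 * n + 1) / 2)       # since int_{-1}^1 Pn^2 = 2/(2n+1)
    assert np.allclose(e.coef, scaled.coef) or np.allclose(e.coef, -scaled.coef)
```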

(iii) Let $H = L^2([-\pi, \pi], \frac{dt}{2\pi})$ and for n = 0, ±1, ±2, …, let $u_n(t) = e^{int}$. Then

$$(u_n, u_m) = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{i(n-m)t}\,dt = \begin{cases} 1 & \text{if } n = m \\ 0 & \text{if } n \ne m, \end{cases}$$

so that $\{u_n\}$ is an orthonormal system. For x ∈ H and n = 0, ±1, ±2, …,

$$(x, u_n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} x(t)e^{-int}\,dt = \hat{x}(n),$$

the nth Fourier coefficient of x.

We shall show that if $\hat{x}(n) = 0$, n = 0, ±1, ±2, …, then x = 0 a.e. This will prove that the trigonometric system is complete in $L^2([-\pi, \pi], \frac{dt}{2\pi})$.

Set

$$y(t) = \frac{1}{2\pi}\int_{-\pi}^{t} x(s)\,ds.$$

Since $L^2([-\pi, \pi], \frac{dt}{2\pi}) \subseteq L^1([-\pi, \pi], \frac{dt}{2\pi})$, it is evident that y is a well-defined absolutely continuous function on [−π, π] [see 1–5]. In particular, $y \in L^2([-\pi, \pi], \frac{dt}{2\pi})$. Moreover, y(−π) = 0 and y(π) = 0, using the fact that $\hat{x}(0) = 0$ by hypothesis. Let a be any constant. On integrating by parts, we obtain

$$\int_{-\pi}^{\pi} [y(t) - a]\,e^{-int}\,dt = 0, \quad n = \pm1, \pm2, \ldots. \qquad (2.59)$$

Choosing $a = \frac{1}{2\pi}\int_{-\pi}^{\pi} y(t)\,dt$, the relation (2.59) holds for n = 0 as well. Since y(−π) = y(π), the function y − a extends to a continuous periodic function, so for ε > 0, there is a trigonometric polynomial

$$T(t) = \sum_{k=-n}^{n} c_k e^{ikt}$$

such that $|y(t) - a - T(t)| < \varepsilon$ for all $t \in [-\pi, \pi]$.

Now using (2.59) and the choice of a, we obtain


$$\int_{-\pi}^{\pi} |y(t) - a|^2\,dt = \int_{-\pi}^{\pi} (y(t) - a)\,\overline{\big(y(t) - a - T(t)\big)}\,dt \le \varepsilon \int_{-\pi}^{\pi} |y(t) - a|\,dt \le \varepsilon \Big[\int_{-\pi}^{\pi} |y(t) - a|^2\,dt\Big]^{1/2}\Big[\int_{-\pi}^{\pi} dt\Big]^{1/2},$$

which implies

$$\int_{-\pi}^{\pi} |y(t) - a|^2\,dt \le 2\pi\varepsilon^2.$$

Thus, y(t) is constant and x(t) = 0 almost everywhere. This completes the proof.

Remarks (a) In the above proof, we have used the fact that $x \in L^1([-\pi, \pi], \frac{dt}{2\pi})$ and have proved that if $\hat{x}(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} x(t)e^{-int}\,dt = 0$ for n = 0, ±1, ±2, …, then x = 0 a.e.

(b) We put some of the results proved for abstract Hilbert spaces in the present setting of $L^2([-\pi, \pi], \frac{dt}{2\pi})$. For $x \in L^2([-\pi, \pi], \frac{dt}{2\pi})$, associate the function $\hat{x}$ defined on ℤ, the set of integers. The Fourier series of x is

$$\sum_{n=-\infty}^{\infty} \hat{x}(n)e^{int} \qquad (2.60)$$

with partial sums

$$S_N = \sum_{n=-N}^{N} \hat{x}(n)e^{int}, \quad N = 0, 1, 2, \ldots;$$

Parseval's identity takes the form

$$\sum_{n=-\infty}^{\infty} \hat{x}(n)\overline{\hat{y}(n)} = \frac{1}{2\pi}\int_{-\pi}^{\pi} x(t)\overline{y(t)}\,dt, \quad x, y \in L^2\Big([-\pi, \pi], \frac{dt}{2\pi}\Big); \qquad (2.61)$$

and the completeness of the trigonometric system gives

$$\lim_N \|x - S_N\|_2 = 0. \qquad (2.62)$$

(c) (The Riemann–Lebesgue Lemma) If $x \in L^2([-\pi, \pi], \frac{dt}{2\pi})$, then

$$\int_{-\pi}^{\pi} x(t)e^{-int}\,dt \to 0 \quad \text{as } |n| \to \infty.$$

Indeed, Parseval's identity (2.61) gives $\sum_{n=-\infty}^{\infty} |\hat{x}(n)|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |x(t)|^2\,dt < \infty$.

(d) The relation (2.62) leads to the question whether the Fourier series of $x \in L^2([-\pi, \pi], \frac{dt}{2\pi})$ tends to x pointwise. This is not true even for a continuous function, as was demonstrated by du Bois-Reymond in 1876. However, Fejér proved in 1900 that the Fourier series of a continuous function is Cesàro summable and the sum is the function itself. For a function $x \in L^2([-\pi, \pi], \frac{dt}{2\pi})$, Lusin's conjecture that $\{S_N\}_{N\ge0}$, where $S_N = \sum_{n=-N}^{N} \hat{x}(n)e^{int}$, converges to x pointwise a.e. was proved by Carleson in 1966.
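Parseval's identity (2.61) carries concrete arithmetic content. For the function x(t) = t one computes $\hat{x}(0) = 0$ and $\hat{x}(n) = i(-1)^n/n$ for n ≠ 0, so (2.61) reduces to the classical identity $\sum_{n\ge1} 1/n^2 = \pi^2/6$; the sketch below (an illustration only) checks this numerically:

```python
# Parseval for x(t) = t on [-pi, pi]: sum over n != 0 of 1/n^2 equals
# (1/2pi) * integral of t^2 = pi^2/3, i.e. sum_{n>=1} 1/n^2 = pi^2/6.
import math

lhs = sum(1.0 / n ** 2 for n in range(1, 100001))
assert abs(lhs - math.pi ** 2 / 6) < 1e-4
```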

(iv) A complete orthonormal system for the space $H = L^2(0, \infty)$ is given by the Laguerre functions

$$v_n(t) = \frac{1}{n!}\exp\Big(-\frac{t}{2}\Big)L_n(t), \qquad (2.63)$$

where

$$L_n(t) = (-1)^n \exp(t)\frac{d^n}{dt^n}\big(t^n \exp(-t)\big), \quad n = 0, 1, 2, \ldots.$$

In order to show that the system (2.63) is complete in H, it will be enough to show that if f ∈ H and $\int_0^{\infty} f(t)\exp(-\frac{t}{2})L_n(t)\,dt = 0$, n = 0, 1, 2, …, then f = 0 a.e. Let

$$g(t) = f(t)\exp\Big(-\frac{t}{2}\Big), \quad 0 < t < \infty.$$

Then $g \in L^1(0, \infty)$. Indeed, by the Cauchy–Schwarz Inequality,

$$\int_0^{\infty} |g(t)|\,dt = \int_0^{\infty} |f(t)|\exp\Big(-\frac{t}{2}\Big)\,dt \le \Big[\int_0^{\infty} |f(t)|^2\,dt\Big]^{1/2}\Big[\int_0^{\infty} \exp(-t)\,dt\Big]^{1/2} = \|f\|_2.$$

Moreover, $L_n$ is a polynomial of degree n; therefore, each $t^n$ is a linear combination of $L_0, \ldots, L_n$. Thus, we need only to show that

$$\int_0^{\infty} g(t)t^n\,dt = 0, \quad n = 0, 1, 2, \ldots \quad \text{implies} \quad g(t) = 0 \text{ a.e.} \qquad (2.64)$$

Now consider

$$\Phi(z) = \int_0^{\infty} \exp(-tz)g(t)\,dt = \int_0^{\infty} \exp(-tx)\exp(-ity)g(t)\,dt, \quad \Re z > 0, \qquad (2.65)$$

where z = x + iy.

Since $g \in L^1(0, \infty)$ and $|\exp(-ity)| = 1$, the integral in (2.65) exists as a Lebesgue integral. Moreover, Φ(z) is continuous in ℜz > 0. Indeed, if $z_n \to z$ in ℜz > 0, then $g(t)\exp(-tz_n) \to g(t)\exp(-tz)$. Both the sequence of functions and the limit function are integrable and are dominated by the integrable function |g(t)|; an application of the Lebesgue Dominated Convergence Theorem 1.3.9 proves the assertion. If Δ denotes the boundary of any closed triangle in ℜz > 0, then

$$\oint_{\Delta} \Phi(z)\,dz = \oint_{\Delta}\Big(\int_0^{\infty} g(t)\exp(-tz)\,dt\Big)dz = \int_0^{\infty} g(t)\Big(\oint_{\Delta} \exp(-tz)\,dz\Big)dt \quad [\text{Fubini's Theorem}] = \int_0^{\infty} g(t)\cdot 0\,dt \quad [\text{Cauchy's Theorem}] = 0.$$

By Morera's Theorem, Φ is analytic in ℜz > 0.


On using integration by parts and induction on n, we see that

$$\int_0^{\infty} \exp(-t)\,t^{2n}\,dt = (2n)! \le (2^n n!)^2. \qquad (2.66)$$

Consider the series

$$\sum_{n=0}^{\infty} (-1)^n s^n \frac{1}{n!}\int_0^{\infty} g(t)t^n\,dt, \quad s > 0, \qquad (2.67)$$

and show that the series converges for $0 \le s < \frac{1}{2}$ to the function Φ(s).

First,

$$\sum_{n=0}^{\infty} s^n \frac{1}{n!}\Big|\int_0^{\infty} g(t)t^n\,dt\Big| \le \sum_{n=0}^{\infty} s^n \frac{1}{n!}\int_0^{\infty} |f(t)|\exp\Big(-\frac{t}{2}\Big)t^n\,dt \le \sum_{n=0}^{\infty} s^n \frac{1}{n!}\|f\|_2\big[(2n)!\big]^{1/2} \le \|f\|_2 \sum_{n=0}^{\infty} (2s)^n.$$

Note that $\exp(-st)g(t) = \sum_{n=0}^{\infty} (-1)^n s^n \frac{1}{n!}g(t)t^n$, and

$$
\begin{aligned}
\sum_{n=0}^{\infty}\Big|(-1)^n s^n \frac{1}{n!}\Big|\int_0^{\infty} |g(t)|t^n\,dt &= \sum_{n=0}^{\infty} s^n \frac{1}{n!}\int_0^{\infty} |f(t)|\exp\Big(-\frac{t}{2}\Big)t^n\,dt \\
&\le \sum_{n=0}^{\infty} s^n \frac{1}{n!}\Big[\int_0^{\infty} |f(t)|^2\,dt\Big]^{1/2}\Big[\int_0^{\infty} t^{2n}\exp(-t)\,dt\Big]^{1/2} \quad (\text{Cauchy–Schwarz}) \\
&\le \sum_{n=0}^{\infty} s^n \frac{1}{n!}\|f\|_2\, 2^n n! \quad (\text{using } (2.66)) \\
&= \|f\|_2 \sum_{n=0}^{\infty} (2s)^n < \infty \quad \text{for } 0 \le s < \frac{1}{2}.
\end{aligned}
$$


Hence, for $0 \le s < \frac{1}{2}$,

$$\sum_{n=0}^{\infty} (-1)^n s^n \frac{1}{n!}g(t)t^n \in L^1(0, \infty)$$

and, integrating term by term,

$$\Phi(s) = \int_0^{\infty} \exp(-st)g(t)\,dt = \sum_{n=0}^{\infty} (-1)^n s^n \frac{1}{n!}\int_0^{\infty} g(t)t^n\,dt = 0,$$

using (2.64). Thus Φ vanishes on $[0, \frac{1}{2})$ and, being analytic in ℜz > 0, vanishes identically there. The substitution u = −ln t shows that

$$\Phi(s) = \int_0^{1} t^{s-1}g(-\ln t)\,dt, \quad s \ge 0.$$

Using Proposition 1.3.11, it now follows that g(−ln t) = 0 a.e. on (0, 1), which implies [see Problem 2.9.P8] that g(t) = 0 a.e. on (0, ∞). This completes the proof.
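The orthonormality of the Laguerre functions (2.63) can be checked numerically. The sketch below assumes the identification (up to a sign $(-1)^n$, which does not affect orthonormality) of $\frac{1}{n!}L_n$ with numpy's standard Laguerre polynomials, which are orthonormal with respect to the weight $e^{-t}$; Gauss–Laguerre quadrature absorbs exactly that weight:

```python
# integral over (0, inf) of v_n v_m dt = sum_k w_k Lhat_n(x_k) Lhat_m(x_k),
# where Lhat_n is numpy's Laguerre polynomial and (x_k, w_k) are the
# Gauss-Laguerre nodes and weights for the weight e^{-t}.
import numpy as np
from numpy.polynomial import laguerre

nodes, weights = laguerre.laggauss(40)
L = [laguerre.Laguerre.basis(n) for n in range(5)]

gram = np.array([[np.sum(weights * L[n](nodes) * L[m](nodes))
                  for m in range(5)] for n in range(5)])
assert np.allclose(gram, np.eye(5), atol=1e-8)
```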

(v) A complete orthonormal system for the space $H = L^2(-\infty, \infty)$ is given by the Hermite functions

$$v_n(t) = \frac{H_n(t)\exp\big(-\frac{t^2}{2}\big)}{(2^n n!\,\pi^{1/2})^{1/2}}, \quad n = 0, 1, 2, \ldots, \qquad (2.68)$$

where $H_n$ denotes the Hermite polynomial of degree n [see Example 2.8.13(iii)]. In order to show that the system (2.68) is complete in H, it will be enough to show that if $f \in L^2(-\infty, \infty)$, then

$$\int_{-\infty}^{\infty} f(t)\exp\Big(-\frac{t^2}{2}\Big)H_n(t)\,dt = 0, \quad \text{or equivalently,} \quad \int_{-\infty}^{\infty} f(t)\exp\Big(-\frac{t^2}{2}\Big)t^n\,dt = 0,$$

for n = 0, 1, 2, …, implies f = 0 a.e. on (−∞, ∞). The equivalence follows exactly as in (iv) above, since $H_n$ is a polynomial of degree n.


Consider

$$F(x) = \int_{-\infty}^{\infty} f(t)e^{-itx}\exp\Big(-\frac{t^2}{2}\Big)\,dt, \quad -\infty < x < \infty.$$

This integral exists, since $f \in L^2(-\infty, \infty)$, $\exp(-\frac{t^2}{2}) \in L^2(-\infty, \infty)$ and $|e^{-itx}| = 1$. In fact,

$$\Big|\int_{-\infty}^{\infty} f(t)e^{-itx}\exp\Big(-\frac{t^2}{2}\Big)\,dt\Big| \le \int_{-\infty}^{\infty} |f(t)|\exp\Big(-\frac{t^2}{2}\Big)\,dt \le \Big[\int_{-\infty}^{\infty} |f(t)|^2\,dt\Big]^{1/2}\Big[\int_{-\infty}^{\infty} \exp(-t^2)\,dt\Big]^{1/2} < \infty.$$

We write

$$f(t)e^{-itx} = \sum_{n=0}^{\infty} (-i)^n \frac{x^n}{n!}f(t)t^n.$$

Moreover,

$$
\begin{aligned}
\int_{-\infty}^{\infty}\sum_{n=0}^{\infty} \frac{|x|^n}{n!}|f(t)||t|^n \exp\Big(-\frac{t^2}{2}\Big)\,dt &= \int_{-\infty}^{\infty} |f(t)|\exp(|xt|)\exp\Big(-\frac{t^2}{2}\Big)\,dt \\
&= \int_{-\infty}^{\infty} |f(t)|\exp\Big(-\frac{t^2}{4}\Big)\cdot\exp(|xt|)\exp\Big(-\frac{t^2}{4}\Big)\,dt \\
&\le \Big[\int_{-\infty}^{\infty} |f(t)|^2\exp\Big(-\frac{t^2}{2}\Big)\,dt\Big]^{1/2}\Big[\int_{-\infty}^{\infty} \exp(2|xt|)\exp\Big(-\frac{t^2}{2}\Big)\,dt\Big]^{1/2} < \infty,
\end{aligned}
$$

since

$$\int_{-\infty}^{\infty} |f(t)|^2\exp\Big(-\frac{t^2}{2}\Big)\,dt \le \int_{-\infty}^{\infty} |f(t)|^2\,dt = \|f\|_2^2,$$

and

$$
\begin{aligned}
\int_{-\infty}^{\infty} \exp(2|xt|)\exp\Big(-\frac{t^2}{2}\Big)\,dt &= 2\int_0^{\infty} \exp(2|x|t)\exp\Big(-\frac{t^2}{2}\Big)\,dt \\
&= 2\int_0^{\infty} \exp\Big(-\frac{t^2}{2} + 2|x|t - 2x^2\Big)\exp(2x^2)\,dt \\
&= 2\exp(2x^2)\int_0^{\infty} \exp\Big(-\frac{1}{2}(t - 2|x|)^2\Big)\,dt < \infty.
\end{aligned}
$$

Integrating term by term, we obtain

$$F(x) = \int_{-\infty}^{\infty} f(t)e^{-itx}\exp\Big(-\frac{t^2}{2}\Big)\,dt = \sum_{n=0}^{\infty} (-i)^n \frac{x^n}{n!}\int_{-\infty}^{\infty} f(t)\exp\Big(-\frac{t^2}{2}\Big)t^n\,dt = 0,$$

i.e.

$$\int_{-\infty}^{\infty} f(t)e^{-itx}\exp\Big(-\frac{t^2}{2}\Big)\,dt = 0$$

for all real x. It follows on using Proposition 1.3.12 that f = 0 a.e., which was to be proved.
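As with the Laguerre case, the orthonormality of the Hermite functions (2.68) can be checked numerically. The sketch below assumes $H_n$ is the physicists' Hermite polynomial, as used by numpy, so that $\int_{-\infty}^{\infty} e^{-t^2}H_n H_m\,dt = 2^n n!\sqrt{\pi}\,\delta_{nm}$; Gauss–Hermite quadrature absorbs the weight $e^{-t^2}$:

```python
# integral of v_n v_m dt = sum_k w_k H_n(x_k) H_m(x_k) / sqrt(2^{n+m} n! m! pi).
import numpy as np
from math import factorial, pi, sqrt
from numpy.polynomial import hermite

nodes, weights = hermite.hermgauss(40)
H = [hermite.Hermite.basis(n) for n in range(5)]

def entry(n, m):
    norm = sqrt(2.0 ** (n + m) * factorial(n) * factorial(m) * pi)
    return np.sum(weights * H[n](nodes) * H[m](nodes)) / norm

gram = np.array([[entry(n, m) for m in range(5)] for n in range(5)])
assert np.allclose(gram, np.eye(5), atol=1e-8)
```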

(vi) The set $u_n(z) = \sqrt{\frac{n}{\pi}}\,z^{n-1}$, n = 1, 2, …, is an orthonormal set in H = A(D), where $D = \{z \in \mathbb{C}: |z| < 1\}$ [see 2.8.4(iii)]. We shall show that the Parseval formula

$$\sum_{n=1}^{\infty} |(f, u_n)|^2 = \iint_{|z|<1} |f(z)|^2\,dx\,dy, \quad f \in A(D),$$

holds, which implies that $\{u_n\}_{n\ge1}$ is a complete orthonormal set in A(D) [Theorem 2.9.16].

For f ∈ A(D), the Fourier coefficients are given by

$$c_n = (f, u_n) = \sqrt{\frac{n}{\pi}}\iint_D f(z)\,\overline{z}^{\,n-1}\,dx\,dy = \lim_{r\to1}\sqrt{\frac{n}{\pi}}\iint_{|z|<r} f(z)\,\overline{z}^{\,n-1}\,dx\,dy.$$

Converting the area integral into a contour integral,

$$c_n = \lim_{r\to1}\frac{1}{2i}\sqrt{\frac{n}{\pi}}\int_{|z|=r} f(z)\frac{\overline{z}^{\,n}}{n}\,dz.$$

On |z| = r we have $\overline{z} = r^2/z$, and so

$$c_n = \lim_{r\to1}\frac{r^{2n}}{\sqrt{n\pi}}\cdot\frac{1}{2i}\int_{|z|=r} f(z)\frac{dz}{z^n}. \qquad (2.69)$$

Now if

$$f(z) = a_0 + a_1 z + \cdots, \quad |z| < 1,$$

then

$$a_{n-1} = \frac{1}{2\pi i}\int_{|z|=r} \frac{f(z)}{z^n}\,dz. \qquad (2.70)$$

Combining (2.69) and (2.70),

$$c_n = \lim_{r\to1}\frac{1}{\sqrt{n\pi}}\,r^{2n}\,\pi a_{n-1} = \sqrt{\frac{\pi}{n}}\,a_{n-1}, \quad n = 1, 2, \ldots. \qquad (2.71)$$

Also,

$$\iint_{|z|<1} |f(z)|^2\,dx\,dy = \pi\sum_{n=1}^{\infty}\frac{|a_{n-1}|^2}{n}. \qquad (2.72)$$

From (2.71) and (2.72), it follows that Parseval's formula holds. The argument is therefore complete.
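The orthonormality of the $u_n$ is easy to check in polar coordinates, where the angular factor separates. The sketch below (an illustration, using an arbitrary product midpoint rule) evaluates $\iint_{|z|<1} u_n\overline{u_m}\,dx\,dy$ on a grid:

```python
# In polar coordinates dxdy = r dr dtheta; the angular integral kills n != m,
# and for n = m one gets (n/pi) * 2*pi * 1/(2n) = 1.
import numpy as np

r = np.linspace(0, 1, 2001)[:-1] + 0.5 / 2000       # radial midpoints
theta = np.linspace(0, 2 * np.pi, 400, endpoint=False)
dr, dth = 1.0 / 2000, 2 * np.pi / 400
z = r[:, None] * np.exp(1j * theta[None, :])

def u(n):
    return np.sqrt(n / np.pi) * z ** (n - 1)

gram = np.array([[np.sum(u(n) * np.conj(u(m)) * r[:, None]) * dr * dth
                  for m in range(1, 5)] for n in range(1, 5)])
assert np.allclose(gram, np.eye(4), atol=1e-3)
```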


Theorem 2.9.18 Any two complete orthonormal sets in a given Hilbert space H ≠ {0} have the same cardinal number.

Proof Suppose first that H is a Hilbert space of finite dimension n and A is any complete orthonormal set in H. It consists of linearly independent vectors and therefore can have at most n vectors in it. We shall argue that it contains precisely n vectors: by Theorem 2.9.14, A is a basis. Since it is finite, it is a Hamel basis and must therefore contain precisely n vectors.

Suppose now that H is infinite dimensional, and let $A = \{x_\alpha: \alpha \in K\}$ and $B = \{y_\beta: \beta \in \Gamma\}$ be complete orthonormal sets in H. For any $x_\alpha \in A$, the set

$$B_{x_\alpha} = \{y_\beta \in B: (x_\alpha, y_\beta) \ne 0\}$$

must be countable [see Remark 2.9.11]. Clearly, $\bigcup_\alpha B_{x_\alpha} \subseteq B$. We next show that $B \subseteq \bigcup_\alpha B_{x_\alpha}$. Let $y_\beta \in B$. Suppose $y_\beta \in B_{x_\alpha}$ for no α. Then $(x_\alpha, y_\beta) = 0$ for all α ∈ K. In other words, $y_\beta \perp A$. Since A is complete, it follows that $y_\beta = 0$, which is impossible since $\|y_\beta\| = 1$. Hence, $y_\beta \in B_{x_\alpha}$ for some $x_\alpha$. Thus

$$B = \bigcup_\alpha B_{x_\alpha}.$$

It follows that |B|, the cardinality of B, satisfies $|B| \le \aleph_0|A| = |A|$. Interchanging the roles of A and B, we also have $|A| \le |B|$. This completes the proof. □

Definition 2.9.19 Let H be a Hilbert space. If H ≠ {0}, we define the orthogonal dimension of H to be the unique cardinal number of a complete orthonormal set in H. If H = {0}, we say that H has orthogonal dimension 0.

If H is finite dimensional, then the orthogonal dimension of H is the cardinal of a Hamel basis.

Theorem 2.9.20 (Riesz–Fischer) Let $\{x_\alpha\}_{\alpha\in A}$ be a complete orthonormal system in a Hilbert space H and $\ell^2(A) = L^2(A, \mathcal{A}, \mu)$, where $\mathcal{A}$ denotes the collection of all subsets of A and µ is counting measure on A. Then H is isometrically isomorphic to $\ell^2(A)$.

Proof For x ∈ H, let T(x) be that function on A such that

$$[T(x)](\alpha) = (x, x_\alpha), \quad \alpha \in A.$$

Then T maps H into $\ell^2(A)$, for $\sum_{\alpha\in A} |(x, x_\alpha)|^2 < \infty$ by Bessel's Inequality. Also, for x, y ∈ H, we have

$$[T(x+y)](\alpha) = (x + y, x_\alpha) = (x, x_\alpha) + (y, x_\alpha) = [T(x)](\alpha) + [T(y)](\alpha), \quad \alpha \in A,$$

i.e. T(x + y) = T(x) + T(y). It is equally easy to show that T(ax) = aT(x) for scalar a. Thus, T is linear. Using Theorem 2.9.16(e), we have

$$(T(x), T(y)) = \sum_{\alpha\in A} [T(x)](\alpha)\overline{[T(y)](\alpha)} = \sum_{\alpha\in A} (x, x_\alpha)\overline{(y, x_\alpha)} = (x, y),$$

so that T preserves inner products and, in particular, norms.

It remains to show that $T: H \to \ell^2(A)$ is onto.

Let $f \in \ell^2(A)$. Then $\sum_{\alpha\in A} |f(\alpha)|^2 < \infty$. Let $\alpha_1, \alpha_2, \ldots$ be those α's for which f(α) ≠ 0. The condition $\sum_{\alpha\in A} |f(\alpha)|^2 < \infty$ becomes $\sum_{i=1}^{\infty} |f(\alpha_i)|^2 < \infty$. It follows from Theorem 2.9.9 that $x = \sum_{i=1}^{\infty} f(\alpha_i)x_{\alpha_i}$ is in H. For a fixed p and any m ≥ p, we have

$$\big|(x, x_{\alpha_p}) - f(\alpha_p)\big| = \Big|\Big(x - \sum_{i=1}^{m} f(\alpha_i)x_{\alpha_i},\ x_{\alpha_p}\Big)\Big| \le \Big\|x - \sum_{i=1}^{m} f(\alpha_i)x_{\alpha_i}\Big\|\,\|x_{\alpha_p}\| \to 0$$

as m → ∞, so that $(x, x_{\alpha_p}) = f(\alpha_p)$ for each p. The equality $(x, x_\alpha) = f(\alpha)$ also holds for those α's for which f(α) = 0. Thus T(x) = f. This completes the proof. □

Remarks 2.9.21

(i) The following form of the above theorem was originally proved by Riesz and Fischer in 1907: Let {a_n}_{n∈Z} be in ℓ²(Z), that is, Σ_{n=−∞}^∞ |a_n|² < ∞. Then there exists a function f in L²([−π, π], dt/2π) such that f̂(n) = a_n, n ∈ Z, where

f̂(n) = (1/2π) ∫_{−π}^{π} f(t) e^{−int} dt

is the nth Fourier coefficient of f with respect to the orthonormal basis {e^{int} : n ∈ Z}.

(ii) A Hilbert space is completely determined up to an isometric isomorphism by its orthogonal dimension, i.e. by the cardinality of a complete orthonormal basis. The space L²([−π, π], dt/2π) is isometrically isomorphic to ℓ²(Z) and hence also to ℓ²(N).
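The finite-dimensional analogue of the map T in Theorem 2.9.20 can be checked numerically. The sketch below (an illustration, not from the book; the particular basis of R³ is an arbitrary choice) sends a vector to its coefficient sequence with respect to an orthonormal basis and verifies that inner products, hence norms, are preserved, as in Parseval's identity.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# An orthonormal basis of R^3 (a rotation of the standard basis).
s = 1 / math.sqrt(2)
basis = [(s, s, 0.0), (-s, s, 0.0), (0.0, 0.0, 1.0)]

def T(x):
    """Coefficient sequence of x: the analogue of [T(x)](a) = (x, x_a)."""
    return [dot(x, e) for e in basis]

x, y = (1.0, 2.0, 3.0), (-1.0, 0.5, 2.0)
# T preserves the inner product, hence norms.
assert abs(dot(T(x), T(y)) - dot(x, y)) < 1e-12
assert abs(math.sqrt(dot(T(x), T(x))) - math.sqrt(dot(x, x))) < 1e-12
```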

Problem Set 2.9

2.9.P1. Let {e_n}_{n≥1} and {ẽ_n}_{n≥1} be orthonormal sequences in a Hilbert space H and let M₁ = span(e_n) and M₂ = span(ẽ_n). Show that M₁ = M₂ if, and only if,

2.9 Complete Orthonormal Sets 101

e_n = Σ_{m=1}^∞ a_{nm} ẽ_m,  ẽ_n = Σ_{m=1}^∞ ā_{mn} e_m,  where a_{nm} = (e_n, ẽ_m).

2.9.P2. Let H be a Hilbert space. Then show that the following hold:
(a) If H is separable, every orthonormal set in H is countable.
(b) If H contains an orthonormal sequence which is complete in H, then H is separable.

2.9.P3. Let A ⊆ [−π, π] be measurable. Prove that

lim_{n→∞} ∫_A cos nt dt = lim_{n→∞} ∫_A sin nt dt = 0.

2.9.P5. Let e_j(z) = z^j, j ∈ Z. Show that {e_j}_{j=−∞}^∞ is an orthonormal sequence in RL² (notation as in Example 2.1.3(vi)).

2.9.P6. Let a₁, a₂ ∈ D(0,1) = {z : |z| < 1} and a₁ ≠ a₂. Show that the vectors

e₁(z) = (1 − |a₁|²)^{1/2}/(1 − ā₁z)  and  e₂(z) = ((z − a₁)/(1 − ā₁z)) · (1 − |a₂|²)^{1/2}/(1 − ā₂z)

are orthonormal (notation as in Example 2.1.3(vi)).

2.9.P7. Let {e_n}_{n≥1} be an orthonormal basis in H. Show that for any orthonormal set {f_n}_{n≥1}, if

Σ_{n=1}^∞ ‖e_n − f_n‖² < ∞,

then {f_n}_{n≥1} is also an orthonormal basis in H.

2.9.P8. A real-valued function on an interval having a continuous nonvanishing derivative on the interior of its domain maps a set of (Lebesgue) measure zero into a set of measure zero. In case the domain is an open interval, in which case the range is also an open interval and an inverse exists, the inverse also has the same properties. [Examples of such a function on the domain (0, ∞) are exp(−x) and x².]


A result of particular interest about Hilbert space is the projection theorem, namely, if M is any closed subspace of a Hilbert space H, then H can be decomposed into the direct sum of M and its orthogonal complement (to be defined below). This important geometric property is one of the main reasons that Hilbert spaces are easier to handle than Banach spaces.

A characterisation of a bounded linear functional [see Definition 2.10.18 below] on a Hilbert space, known as the Riesz Representation Theorem, will be studied.

P.L. Chebyshev sought the approximation of arbitrary functions by linear combinations of given ones. He considered approximations in the spaces of continuous functions, Lp spaces, etc. These have a bearing on constrained optimisation. We deal with the approximation problem in a pre-Hilbert space X: given a set of n linearly independent vectors {v₁, v₂, …, v_n} and an x ∈ X, to find a method of computing the minimum value of

‖x − Σ_{j=1}^n c_j v_j‖,

where c₁, c₂, …, c_n range over all scalars, and to find the corresponding values of c₁, c₂, …, c_n. The reader will learn that this is precisely the problem of finding the distance of x from the linear span of {v₁, v₂, …, v_n}.
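This minimisation can be carried out concretely: the optimal coefficients solve the "normal equations" built from the Gram matrix of the v_j. The sketch below (an illustration under stated assumptions: R³ with the usual inner product, and v₁, v₂, x chosen arbitrarily) solves the 2×2 case and checks that the residual is orthogonal to the span, which is the characterisation of the closest point established later in Theorem 2.10.10.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

v1, v2 = (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)
x = (1.0, 2.0, 3.0)

# Normal equations G c = b with G_ij = (v_j, v_i), b_i = (x, v_i),
# solved here by Cramer's rule for the 2x2 case.
g11, g12, g22 = dot(v1, v1), dot(v1, v2), dot(v2, v2)
b1, b2 = dot(x, v1), dot(x, v2)
det = g11 * g22 - g12 * g12
c1 = (b1 * g22 - b2 * g12) / det
c2 = (g11 * b2 - g12 * b1) / det

residual = tuple(xi - c1 * a - c2 * b for xi, a, b in zip(x, v1, v2))
# The residual x - (c1 v1 + c2 v2) is orthogonal to span{v1, v2}.
assert abs(dot(residual, v1)) < 1e-12 and abs(dot(residual, v2)) < 1e-12
```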

Recall that a set M of nonzero vectors in a pre-Hilbert space is said to be orthogonal if x ⊥ y whenever x and y are distinct vectors of M.

Definition 2.10.1 Let X be a pre-Hilbert space and x ∈ X. We define

x⊥ = {y ∈ X : (x, y) = 0},

and if S is a subset of X,

S⊥ = {y ∈ X : (x, y) = 0 for all x ∈ S}.

One writes S⊥⊥ for the perp of S⊥; thus S⊥⊥ = (S⊥)⊥. The set S⊥ is called the orthogonal complement of S.

Remarks 2.10.2

(i) Observe that x⊥ is a subspace of X, since (x, y) = 0 and (x, z) = 0 imply (x, ay + bz) = 0, where a, b are scalars. Also, x⊥ is precisely the set of vectors where the continuous function y → (x, y) is zero. Hence, x⊥ is a closed subspace of X. Since

2.10 Orthogonal Decomposition and Riesz Representation 103

S⊥ = ∩_{x∈S} x⊥,

S⊥ is an intersection of closed subspaces and is therefore itself a closed subspace of X.

(ii) S⊥ = (S̄)⊥.

Let y ∈ S⊥. Then (x, y) = 0 for all x ∈ S. Let z ∈ S̄. Then there exists a sequence {z_n}_{n≥1} in S such that z_n → z. The continuity of the mapping x → (x, y) and the fact that (z_n, y) = 0 for n = 1, 2, … imply (z, y) = 0. Since z ∈ S̄ is arbitrary, we conclude that y ∈ (S̄)⊥.

On the other hand, if y ∈ (S̄)⊥, then (y, x) = 0 for all x ∈ S̄. Since S ⊆ S̄, it follows that (y, x) = 0 for all x ∈ S, that is, y ∈ S⊥.

Proposition 2.10.3 Let S and S₁ be subsets of an inner product space X. Then the following hold.

(a) S⊥ is a closed subspace of X and S ∩ S⊥ ⊆ {0};
(b) S ⊆ S⊥⊥;
(c) S ⊆ S₁ implies S₁⊥ ⊆ S⊥;
(d) S⊥ = S⊥⊥⊥.

Proof

(a) In Remark 2.10.2(i), we have noted that S⊥ is a closed subspace of X. If x ∈ S ∩ S⊥, then x ⊥ x, that is, (x, x) = 0, which implies x = 0.
(b) Let x ∈ S. For any y ∈ S⊥, one has (y, x) = 0, so that x ⊥ S⊥ and therefore x ∈ S⊥⊥.
(c) If x ∈ S₁⊥, then (x, y) = 0 for all y ∈ S₁. In particular, (x, y) = 0 for all y ∈ S, which implies x ∈ S⊥.
(d) Applying (c) to the relation S ⊆ S⊥⊥, we have (S⊥⊥)⊥ ⊆ S⊥. Also, S⊥ ⊆ (S⊥)⊥⊥ by (b) above. Since (S⊥⊥)⊥ = (S⊥)⊥⊥, as in each case one starts with S and perps three times, it follows that (S⊥⊥)⊥ = S⊥. □

Example Let S = {g ∈ L²[0, 1] : g(t) = 0 a.e. on [0, ½]}. Then

S⊥ = {g ∈ L²[0, 1] : g(t) = 0 a.e. on [½, 1]}

and

S⊥⊥ = {g ∈ L²[0, 1] : g(t) = 0 a.e. on [0, ½]} = S.

Hint: To compute S⊥, first show that ∫_x^1 g(t) dt = 0 for every x ∈ [½, 1] and then use regularity of Lebesgue measure.

If x is a point lying outside a plane in R³, then there is a unique y in the plane which is closer to x than any other point of the plane. This assertion, when translated into the language of Hilbert spaces, yields rich dividends via the Riesz Representation Theorem below. The accompanying figure illustrates the situation when the plane is a coordinate plane. However, this need not always be the case.

Definition 2.10.4 A subset K of a vector space is convex if, for all x, y ∈ K, and all λ such that 0 < λ < 1, the vector λx + (1 − λ)y belongs to K. The set of vectors {λx + (1 − λ)y : 0 < λ < 1} is the line segment joining x and y. The convex hull of a subset S of any vector space is the intersection of all convex subsets containing S and is denoted by co(S) or by coS.

It is sometimes neater to work with an equivalent formulation of convexity as follows: for all x, y ∈ K, and all a, b ≥ 0 such that a + b = 1, the vector ax + by belongs to K.

It is easy to see that the intersection of any family of convex sets is again convex; in particular, any convex hull is a convex set. By using the alternative formulation of convexity, the convex hull of any finite set of vectors {x₁, x₂, …, x_n} is easily seen to consist of precisely those vectors which can be written as Σ_{k=1}^n λ_k x_k, where 0 ≤ λ_k ≤ 1 for each k and Σ_{k=1}^n λ_k = 1. (Induction: we start with the vectors x₁, x₂, …, x_n and nonnegative λ₁, λ₂, …, λ_n satisfying Σ_{k=1}^n λ_k = 1. If Σ_{k=1}^{n−1} λ_k = 0, then λ_k is 0 for k = 1, …, n − 1 and λ_n = 1, which together imply Σ_{k=1}^n λ_k x_k is in the convex hull. Assume Σ_{k=1}^{n−1} λ_k = β > 0. Then Σ_{k=1}^{n−1} (λ_k/β) x_k is in the convex hull by the induction hypothesis. Consequently, Σ_{k=1}^n λ_k x_k = β Σ_{k=1}^{n−1} (λ_k/β) x_k + λ_n x_n is in the convex hull. Conversely, the vectors that can be written in this form obviously constitute a convex set that contains {x₁, x₂, …, x_n} and therefore contains the convex hull under reference.)

This description of the convex hull will now be used for arguing that it is compact when the vector space is normed.

When n = 1, there is nothing to prove. Assume as induction hypothesis that the convex hull of any n vectors is compact. Consider any set {x₁, x₂, …, x_n, x} of n + 1 vectors and a sequence {y_p} in its convex hull. Each y_p can be written as Σ_{k=1}^n λ_{k,p} x_k + λ_p x, where 0 ≤ λ_{k,p}, λ_p ≤ 1 for each k and Σ_{k=1}^n λ_{k,p} + λ_p = 1. If λ_p = 1 for infinitely many p, the corresponding subsequence of {y_p} is constantly x and converges to x. If there are only finitely many such p, then we can assume that 1 − λ_p > 0 for every p. It then follows that Σ_{k=1}^n λ_{k,p}/(1 − λ_p) = 1, each term in the sum being nonnegative. This means z_p = Σ_{k=1}^n λ_{k,p} x_k/(1 − λ_p) is in the convex hull of {x₁, x₂, …, x_n}. By the induction hypothesis, {z_p} has a subsequence {z_{p(q)}} converging to a limit z ∈ co{x₁, x₂, …, x_n}. Now the bounded sequence {λ_{p(q)}} in R has a convergent subsequence {λ_{p(q(r))}}, whose limit we shall denote by λ. Then {z_{p(q(r))}} converges to z and therefore Σ_{k=1}^n λ_{k,p(q(r))} x_k = (1 − λ_{p(q(r))}) z_{p(q(r))} forms a sequence converging to (1 − λ)z. As y_{p(q(r))} = Σ_{k=1}^n λ_{k,p(q(r))} x_k + λ_{p(q(r))} x, the subsequence {y_{p(q(r))}} converges to (1 − λ)z + λx, which belongs to co(co{x₁, x₂, …, x_n} ∪ {x}). The latter can easily be seen to be the same as co{x₁, x₂, …, x_n, x}. This completes the induction proof that the convex hull of any finite set of vectors is compact.

If K is convex and x is any vector, then the convex hull co(K ∪ {x}) is precisely K₁ = {ax + bk : k ∈ K, a, b ≥ 0, a + b = 1}. The convexity of K₁ follows from the three computations

(a) a(a₁x + b₁k₁) + b(a₂x + b₂k₂) = (aa₁ + ba₂)x + (ab₁k₁ + bb₂k₂) = (aa₁ + ba₂)x + c((ab₁/c)k₁ + (bb₂/c)k₂), where c = 1 − (aa₁ + ba₂) if nonzero;
(b) (ab₁/c) + (bb₂/c) = 1, because c = 1 − (aa₁ + ba₂) = ab₁ + bb₂;
(c) aa₁ + ba₂ ≤ a + b = 1 when 0 ≤ a₁ ≤ 1 and 0 ≤ a₂ ≤ 1.

Regarding (a), we note that c = 0 implies a₁ = a₂ = 1 and b₁ = b₂ = 0, in which case a(a₁x + b₁k₁) + b(a₂x + b₂k₂) = x. Once the convexity of K₁ is established, it is a trivial matter to see that it is the convex hull of K ∪ {x}.

Theorem 2.10.5 (Closest point property) Let K be a nonempty closed convex set in a Hilbert space H. For every x ∈ H, there is a unique point y ∈ K which is closer to x than any other point of K, i.e.

‖x − y‖ = inf_{z∈K} ‖x − z‖.

Proof Let d = inf_{z∈K} ‖x − z‖. Since K ≠ ∅, d < ∞; therefore for each n ∈ N, there exists y_n ∈ K such that

‖x − y_n‖² < d² + 1/n.   (2.73)

We show that {y_n}_{n≥1} is Cauchy by applying the Parallelogram Law [Proposition 2.2.3(c)] to the vectors x − y_n and x − y_m:

‖(x − y_n) − (x − y_m)‖² + ‖(x − y_n) + (x − y_m)‖² = 2(‖x − y_n‖² + ‖x − y_m‖²) < 4d² + 2(1/n + 1/m).

Since (x − y_n) + (x − y_m) = 2(x − (y_n + y_m)/2), this says

‖y_n − y_m‖² + 4‖x − (y_n + y_m)/2‖² < 4d² + 2(1/n + 1/m)

and hence

‖y_n − y_m‖² < 4d² + 2(1/n + 1/m) − 4‖x − (y_n + y_m)/2‖².

Since K is convex and y_n, y_m ∈ K, we have (y_n + y_m)/2 ∈ K, and hence

‖x − (y_n + y_m)/2‖² ≥ d².

Consequently,

‖y_n − y_m‖² < 4d² + 2(1/n + 1/m) − 4d² = 2(1/n + 1/m).

Thus {y_n}_{n≥1} is a Cauchy sequence and, H being complete, it converges to some y ∈ H. Since K is closed, y ∈ K and therefore

‖x − y‖ ≥ d.

On the other hand, letting n → ∞ in (2.73) yields

‖x − y‖ ≤ d

and so

‖x − y‖ = d.

It remains to show that y is unique. Suppose that z ∈ K (z ≠ y) is such that ‖x − z‖ = d. Then (y + z)/2 ∈ K, so that

‖x − (y + z)/2‖ ≥ d.

By the Parallelogram Law,

‖y − z‖² = 2‖x − y‖² + 2‖x − z‖² − 4‖x − (y + z)/2‖² ≤ 4d² − 4d² = 0.

Hence y = z. □
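The closest point guaranteed by Theorem 2.10.5 can be exhibited explicitly for simple convex sets. In the sketch below (a numerical illustration, not from the text; the set K and the point x are arbitrary choices), K is the nonnegative orthant of R³, the closest point to x is obtained by clipping negative coordinates, and a grid search confirms that no sampled point of K does better.

```python
import itertools
import math

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

x = (1.0, -2.0, 0.5)
# For K = {z in R^3 : z_k >= 0}, the closest point clips negatives to 0.
y = tuple(max(t, 0.0) for t in x)
d = dist(x, y)

# Compare against a grid of points of K: none is strictly closer.
grid = [i * 0.5 for i in range(7)]   # 0.0, 0.5, ..., 3.0
assert all(dist(x, z) >= d - 1e-12 for z in itertools.product(grid, repeat=3))
```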

Remarks 2.10.6

(i) If x ∈ H is such that x ∈ K, then the vector of K nearest to x is x itself.

(ii) If K is not closed, the conclusion of Theorem 2.10.5 may not hold. In fact, in this situation, whether K is convex or not, there always exists a point in H having no closest approximation in K. Any point in the closure of K that does not belong to K will serve the purpose.

For an example of a convex set that is not closed, consider in ℓ² the set K = {x = {λ_k}_{k≥1} : λ_k ≠ 0 for at most finitely many k and Σ_k λ_k = 1}. K is convex. However, K is not closed. In fact, the sequence y₁ = (1, 0, 0, …), y₂ = (½, ½, 0, 0, …), …, y_n = (1/n, 1/n, …, 1/n, 0, 0, …), … is in K. However, the limit of the sequence {y_n}_{n≥1} in ℓ² is the point y = (0, 0, …) of ℓ², which does not belong to K. According to the preceding paragraph, the point y = (0, 0, …) does not possess a closest point in K.

(iii) The conclusion of Theorem 2.10.5 fails to hold if H is not a Hilbert space. Let X = R², the real Banach space with ‖(x₁, x₂)‖ = max{|x₁|, |x₂|}. Consider the closed convex set K = {(x₁, x₂) : x₁ ≥ 1}. The minimal distance of the origin from K is attained at each of the points of the line segment {(x₁, x₂) : x₁ = 1 and |x₂| ≤ 1}. Even when the norm comes from an inner product and H is not complete, the existence part of the conclusion of Theorem 2.10.5 may fail to hold; an example of this will be given later in (ii) of Remarks 2.10.12. The uniqueness part however holds, because its proof does not use completeness.

Corollary 2.10.7 Every nonempty closed convex set K in a Hilbert space H contains a unique element of smallest norm.

Proof Take x = 0 in Theorem 2.10.5. □


Example 2.10.8 Let K = {y = (λ₁, λ₂, …, λ_n) ∈ Cⁿ : Σ_{k=1}^n λ_k = 1}. K is a closed convex subset of Cⁿ. The unique vector y₀ ∈ K of smallest norm is y₀ = (1/n, 1/n, …, 1/n); indeed, if y₀ had two unequal components, then interchanging them would lead to another vector in K with smallest norm, contradicting uniqueness; consequently, all components of y₀ are equal. The reader who is familiar with constrained optimisation can verify the claim made above independently of the corollary.
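A quick numerical check of Example 2.10.8 (an illustration, not a proof; real scalars and n = 5 are arbitrary choices): random vectors whose coordinates sum to 1 never have norm smaller than (1/n, …, 1/n).

```python
import math
import random

n = 5
y0 = [1.0 / n] * n
norm0 = math.sqrt(sum(t * t for t in y0))   # equals 1/sqrt(n)

random.seed(0)
for _ in range(1000):
    v = [random.uniform(-1, 1) for _ in range(n)]
    s = sum(v)
    v = [t - (s - 1.0) / n for t in v]      # shift so coordinates sum to 1
    assert abs(sum(v) - 1.0) < 1e-9
    assert math.sqrt(sum(t * t for t in v)) >= norm0 - 1e-12
```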

Corollary 2.10.9 Let M be a closed subspace of a Hilbert space H. If x is a vector in H, and if d = inf{‖x − z‖ : z ∈ M}, then there exists a unique y ∈ M such that d = ‖x − y‖.

Proof Every subspace of a vector space is convex, so Theorem 2.10.5 applies. □

Theorem 2.10.10 Let M be a closed subspace of a Hilbert space H and x ∈ H. If y denotes the unique element in M for which ‖x − y‖ = inf{‖x − z‖ : z ∈ M}, then x − y is orthogonal to M. Conversely, if y ∈ M is such that x − y is orthogonal to M, then ‖x − y‖ = inf{‖x − z‖ : z ∈ M}.

Proof Consider z ∈ M with ‖z‖ = 1. Then w = y + (x − y, z)z lies in M and we have

‖x − y‖² ≤ ‖x − w‖² = (x − w, x − w) = (x − y − (x − y, z)z, x − y − (x − y, z)z) = ‖x − y‖² − |(x − y, z)|².

It follows that (x − y, z) = 0. Since every nonzero vector of M is a scalar multiple of a vector in M of norm 1, it follows that x − y ⊥ M.

Conversely, suppose y ∈ M and x − y ⊥ M. If z ∈ M, then x − y is orthogonal to y − z, so that ‖x − z‖² = ‖x − y + y − z‖² = ‖x − y‖² + ‖y − z‖² ≥ ‖x − y‖². Thus, ‖x − y‖ = inf{‖x − z‖ : z ∈ M}. □

It may be noted that x − y is the vector of M⊥ closest to x.

Theorem 2.10.11 (Orthogonal Decomposition Theorem) If M is a closed subspace of a Hilbert space H, then H = M ⊕ M⊥ and M = M⊥⊥.

Proof Let x ∈ H. Since M is a closed subspace of H, there exists a unique vector y ∈ M such that ‖x − y‖ = inf{‖x − z‖ : z ∈ M}. So x − y ⊥ M by Theorem 2.10.10. Hence, x = y + (x − y), where y ∈ M and x − y ∈ M⊥. Since M ∩ M⊥ = {0}, it follows that H = M ⊕ M⊥.

We already know that M ⊆ M⊥⊥ [Proposition 2.10.3(b)]. On the other hand, let x ∈ M⊥⊥. If x = y + z, where y ∈ M and z ∈ M⊥, then x and y are in M⊥⊥. Hence, z = x − y ∈ M⊥⊥ (M⊥⊥ is a subspace of H). Since z ∈ M⊥, it follows that (z, z) = 0, that is, z = 0 or x = y. This shows that M⊥⊥ ⊆ M. The proof is complete. □
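The decomposition H = M ⊕ M⊥ is easy to exhibit in a low-dimensional case. The sketch below (an illustration, not from the text; M = span{(1,1,0)} in R³ and x are arbitrary choices) splits x as y + z with y ∈ M and z ⊥ M.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

m = (1.0, 1.0, 0.0)                      # spanning vector of M
x = (3.0, 1.0, 4.0)

c = dot(x, m) / dot(m, m)                # coefficient of the projection on M
y = tuple(c * t for t in m)              # component in M
z = tuple(a - b for a, b in zip(x, y))   # component in M⊥

assert all(abs(a - (b + w)) < 1e-12 for a, b, w in zip(x, y, z))  # x = y + z
assert abs(dot(z, m)) < 1e-12            # z is orthogonal to M
```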

Remarks 2.10.12

(i) If M is a closed subspace of a Hilbert space H and x ∈ H, then x can be uniquely expressed as

x = y + z, where y ∈ M and z ∈ M⊥.

(ii) The condition that H is a Hilbert space for a closed subspace to satisfy M = M⊥⊥ cannot be omitted. Let ℓ₀ ⊆ ℓ² be the inner product space consisting of sequences, each of which has only finitely many nonzero terms. Let M = {x = {λ_k}_{k≥1} : λ_k ≠ 0 for only finitely many k's and Σ_{k=1}^∞ (1/k)λ_k = 0}. Clearly, M is a subspace of ℓ₀. Moreover, M is closed, as is proved below.

Let {x⁽ⁿ⁾}_{n≥1} be a sequence in M such that x⁽ⁿ⁾ → x in ℓ₀. By the Cauchy–Schwarz Inequality, it follows that

|Σ_{k=1}^∞ (1/k)λ_k| = |Σ_{k=1}^∞ (1/k)(λ_k − λ_k⁽ⁿ⁾)| ≤ (Σ_{k=1}^∞ 1/k²)^{1/2} (Σ_{k=1}^∞ |λ_k − λ_k⁽ⁿ⁾|²)^{1/2} → 0,

where λ_k⁽ⁿ⁾ is the kth component of x⁽ⁿ⁾ and the first equality holds because Σ_{k=1}^∞ (1/k)λ_k⁽ⁿ⁾ = 0. Consequently, Σ_{k=1}^∞ (1/k)λ_k = 0. So, x ∈ M.

We next show that M⊥ = {0}. Assume 0 ≠ z ∈ ℓ₀ and z ⊥ M. Then there exists k such that z = (x₁, …, x_k, 0, …) and Σ_{i=1}^k |x_i|² ≠ 0. Let µ = −(k + 1) Σ_{j=1}^k (1/j)x_j. Then w = (x₁, …, x_k, µ, 0, …) ∈ M in view of the definition of µ. Hence, z ⊥ w, i.e. (z, w) = 0. But (z, w) = ‖z‖². It follows that z = 0, contradicting the assumption on z. Consequently, M⊥⊥ = ℓ₀ ≠ M.

We shall use the fact that M⊥ = {0} to show that the closed convex subset M of the (incomplete) inner product space ℓ₀ has the property that every x ∈ ℓ₀ that does not lie in M fails to have a nearest element in M. In particular, it will follow that the conclusion of Theorem 2.10.5 may fail to hold in the absence of completeness. Suppose x ∈ ℓ₀ does not lie in M but has a closest element y ∈ M. It follows that x − y ⊥ M exactly as in the first paragraph of the proof of Theorem 2.10.10, considering that completeness is not needed in that paragraph. Since we have shown that M⊥ = {0}, we infer that x − y = 0, which is a contradiction because y ∈ M, whereas x ∉ M.

(iii) If M is any linear subspace of H, then M̄ = M⊥⊥. Observe that M ⊆ M⊥⊥ [see Proposition 2.10.3(b)]. It follows that M̄ ⊆ M⊥⊥, since M⊥⊥ is a closed subspace of H. As M ⊆ M̄, it follows that (M̄)⊥ ⊆ M⊥ using Proposition 2.10.3(c). Another application of Proposition 2.10.3(c) yields M⊥⊥ ⊆ (M̄)⊥⊥ = M̄, since M̄ is a closed subspace of H [Theorem 2.10.11].

(iv) If H = M ⊕ N, M ⊆ N⊥, then M = N⊥ and is therefore closed, as we now show. Suppose x ∈ N⊥ and x ∉ M. The vector x has the representation x = y + z, where y ∈ M and z ∈ N. Now (z, z) = (x − y, z) = (x, z) − (y, z) = 0, since x ∈ N⊥ and y ∈ M ⊆ N⊥. Hence z = 0 and x = y ∈ M, contradicting that x ∉ M and y ∈ M.

(v) S⊥⊥ = (S⊥)⊥ is the smallest closed subspace of the Hilbert space H which contains S.

(vi) Let {M_k}_{k≥1} be a sequence of closed linear subspaces of a Hilbert space H. There exists a smallest closed linear subspace M such that M_k ⊆ M for all k, and it has the property that x ⊥ M if, and only if, x ⊥ M_k for all k. To see why, let S = {x ∈ H : x ∈ M_k for some k}. Clearly M_k ⊆ S for all k. Moreover, S is the smallest subset of H with this property. Set M = S⊥⊥. If N is a closed linear subspace such that M_k ⊆ N for all k, then S ⊆ N. Hence, M ⊆ N in view of (v). The assertion that x ⊥ M if, and only if, x ⊥ M_k for all k is proved by using the following observation:

M⊥ = S⊥⊥⊥ = S⊥.

Example 2.10.13 Consider F_o = {f ∈ L²[−1, 1] : f(t) = −f(−t)} and F_e = {f ∈ L²[−1, 1] : f(t) = f(−t)}.

The set F_o is an infinite-dimensional linear subspace of L²[−1, 1]. [f(t) = t^{2n−1}, n = 1, 2, …, are in F_o; they are countably many and linearly independent.] Also, F_e is an infinite-dimensional subspace of L²[−1, 1]. [F_e contains the functions f(t) = t^{2n}, n = 0, 1, 2, ….]

For f ∈ F_o and g ∈ F_e, the inner product

(f, g) = ∫_{−1}^{1} f(t)g(t) dt = 0,

since the integrand is odd. Thus F_o ⊥ F_e.

For any function f ∈ L²[−1, 1],

f_e(t) = (f(t) + f(−t))/2, f_o(t) = (f(t) − f(−t))/2 and f = f_e + f_o,

where f_e ∈ F_e and f_o ∈ F_o. Thus L²[−1, 1] = F_e ⊕ F_o and, since F_o ⊆ F_e⊥, it follows that F_o = F_e⊥ [see Remark 2.10.12(iv)].

The following proposition provides an alternate way of computing

d = inf{‖x − z‖ : z ∈ M},

the distance of x from a closed subspace M of a Hilbert space H.

Proposition 2.10.14 Let M be a closed subspace of a Hilbert space H and x ∈ H, x ∉ M. Then

d = inf{‖x − u‖ : u ∈ M} = max{|(x, z)| : z ∈ M⊥ and ‖z‖ = 1}.

Proof Let y ∈ M be the unique vector with ‖x − y‖ = d; since x ∉ M and M is closed, d > 0. Set w = (x − y)/d. Then w ∈ M⊥ [Theorem 2.10.10] and ‖w‖ = ‖x − y‖/d = 1. For this w,

(x, w) = (x, (x − y)/d) = (1/d)(x, x − y) = (1/d)(x − y, x − y) = (1/d)‖x − y‖² = d,

since y ⊥ (x − y). Moreover, for any z ∈ M⊥ with ‖z‖ = 1, |(x, z)| = |(x − y, z)| ≤ ‖x − y‖ = d, so the supremum is attained at w. This completes the proof. □

Let H be a Hilbert space and M a closed subspace of H. The Orthogonal Decomposition Theorem 2.10.11 says that H = M ⊕ M⊥. Thus for each x ∈ H, there are unique y ∈ M and z ∈ M⊥ such that x = y + z. Note that z = x − y is the unique vector in M⊥ closest to x. The vector y is called the projection of x on M, and it follows that z is the projection of x on M⊥. This sets up mappings from H onto M and from H onto M⊥, respectively.

Theorem 2.10.15 Let P_M : H → M be defined by P_M(x) = y, x ∈ H, where y denotes the projection of x on M. The mapping P_M has the following properties:

(a) P_M is linear, i.e. P_M(a₁x₁ + a₂x₂) = a₁P_M(x₁) + a₂P_M(x₂), where a₁ and a₂ are scalars;
(b) If x ∈ M, then P_M(x) = x. Thus, P_M is idempotent, i.e. P_M² = P_M;
(c) If x ∈ M⊥, then P_M(x) = 0;
(d) (P_M(x), x) = ‖P_M(x)‖² ≤ ‖x‖² for all x ∈ H.

Proof

(a) Let x_i = y_i + z_i, i = 1, 2, be the decomposition of x_i relative to M. Then a₁x₁ + a₂x₂ = (a₁y₁ + a₂y₂) + (a₁z₁ + a₂z₂), where a₁y₁ + a₂y₂ ∈ M and a₁z₁ + a₂z₂ ∈ M⊥; by the uniqueness of the decomposition, P_M(a₁x₁ + a₂x₂) = a₁y₁ + a₂y₂ = a₁P_M(x₁) + a₂P_M(x₂).

(b) If x ∈ M, then x = x + 0 is the unique decomposition of x, so P_M(x) = x. If x ∈ H, then P_M(x) ∈ M and, by what has just been proved, P_M(x) = P_M(P_M(x)) = P_M²(x). Thus P_M² = P_M.

(c) If x ∈ M⊥, then x = 0 + x is the unique decomposition of x. Thus P_M(x) = 0.

(d) Let x = y + z, where y ∈ M and z ∈ M⊥. Then y ⊥ z and therefore (P_M(x), x) = (y, y + z) = (y, y) = ‖P_M(x)‖². Also, ‖P_M(x)‖² = (y, y) ≤ ‖y‖² + ‖z‖² = ‖x‖². □
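The properties of Theorem 2.10.15 can be verified concretely. The sketch below (an illustration, not from the text; M = span of the first two standard basis vectors of R³ is an arbitrary choice) computes P_M via an orthonormal basis of M and checks idempotence, the norm bound, and (P_M(x), x) = ‖P_M(x)‖².

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Orthonormal spanning set of M.
E = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

def P(x):
    """P_M(x) = sum of (x, e)e over an orthonormal basis e of M."""
    out = [0.0, 0.0, 0.0]
    for e in E:
        c = dot(x, e)
        out = [o + c * t for o, t in zip(out, e)]
    return tuple(out)

x = (3.0, -1.0, 2.0)
assert P(P(x)) == P(x)                                      # P^2 = P
assert math.sqrt(dot(P(x), P(x))) <= math.sqrt(dot(x, x))   # ||Px|| <= ||x||
assert abs(dot(P(x), x) - dot(P(x), P(x))) < 1e-12          # (Px, x) = ||Px||^2
```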

Remarks 2.10.17

(i) The map P_M is often denoted by P when it is clear from the context on which subspace M the projection P_M is intended.

(ii) In Theorem 2.10.15(d), we have checked that ‖P_M(x)‖ ≤ ‖x‖ for all x ∈ H. Since P_M(x) = x for all x ∈ M, it follows that ‖P_M(x)‖ = ‖x‖ on M, and hence the bound is the best possible when M ≠ {0}.

Definition 2.10.18 A linear functional on a vector space X over a field F is a mapping f : X → F which satisfies f(λx + µy) = λf(x) + µf(y) for all x, y ∈ X and all scalars λ, µ in F.

Definition 2.10.19 Let X be a normed linear space over F. A linear mapping f : X → F is called a bounded linear functional on X if it maps bounded subsets of X into bounded subsets of F, or equivalently, if there exists a constant K such that

|f(x)| ≤ K‖x‖, x ∈ X.

The linear functional f is said to be a continuous linear functional if for a sequence {x_n}_{n≥1} in X, x_n → x implies f(x_n) → f(x).

Let X* denote the set of all bounded linear functionals on X. Define addition and scalar multiplication in X* as follows:

(f₁ + f₂)(x) = f₁(x) + f₂(x) and (af₁)(x) = af₁(x) for f₁, f₂ ∈ X* and a ∈ F.

Define a norm on X* by setting

‖f‖ = sup{|f(x)| : x ∈ X, ‖x‖ ≤ 1}.

It can be checked that ‖·‖ is a norm on X*. It is immediate from the definition of the norm in X* that


|f(x)| ≤ ‖f‖‖x‖.

It will be proved later that X* with the norm described above is complete.

Proposition 2.10.20 A linear functional f : X → C is bounded if, and only if, it is a continuous functional.

Proof Indeed, if f is bounded, then for x_n → x,

|f(x_n) − f(x)| = |f(x_n − x)| ≤ ‖f‖‖x_n − x‖ → 0,

so f is continuous. For the converse, by linearity it suffices to consider continuity at 0: suppose x_n → 0. Then f(x_n) → 0. If f is not bounded, then for every n ∈ N, there exists x_n with ‖x_n‖ = 1 and |f(x_n)| ≥ n. But in this case, |f(x_n/n)| ≥ 1, whereas ‖x_n/n‖ → 0, contradicting continuity at 0. □

Remark The study of continuous linear functionals will be taken up in more detail later in the book.

Proposition 2.10.21 If a linear functional f defined on X is continuous at x = 0, then it is continuous everywhere.

Proof Suppose f is continuous at x = 0. Let ε > 0 be given. There exists δ > 0 such that ‖x‖ < δ implies |f(x)| < ε. Therefore, for every x, y ∈ X, ‖x − y‖ < δ implies

|f(x) − f(y)| = |f(x − y)| < ε. □

Remarks 2.10.22

(i) The point x = 0 could be replaced by any other point of X.

(ii) A slight modification of the argument in the proof above shows that f is uniformly continuous. Indeed, for every pair of points x, y in X, ‖x − y‖ < δ implies |f(x) − f(y)| < ε.

(iii) It follows that a linear functional which fails to be continuous at even one point is everywhere discontinuous.

Theorem 2.10.23 Let X be a normed linear space. Then X* with the norm described above is complete.

Proof Let {f_n}_{n≥1} be a Cauchy sequence of elements of X*. This means that for any ε > 0, there exists n₀ ∈ N such that m, n ≥ n₀ implies

‖f_n − f_m‖ < ε.   (2.74)


For each x ∈ X, |f_n(x) − f_m(x)| ≤ ‖f_n − f_m‖‖x‖, so {f_n(x)}_{n≥1} is a Cauchy sequence of scalars and the limit

f(x) = lim_n f_n(x)

must exist. The function f defined in this way is clearly linear. We next show that f is bounded. For any j ∈ N and an appropriate n₁ ∈ N, we have

‖f_{n₁+j} − f_{n₁}‖ < 1,

which implies

‖f_{n₁+j}‖ < 1 + ‖f_{n₁}‖,

or

|f_{n₁+j}(x)| < (1 + ‖f_{n₁}‖)‖x‖.

On letting j → ∞, we obtain

|f(x)| ≤ (1 + ‖f_{n₁}‖)‖x‖,

so that f is bounded. It remains to show that f_n converges to f in the norm of X*. Using (2.74) again, for x ∈ X and n ≥ n₀, we have

|f_n(x) − f(x)| = lim_m |f_n(x) − f_m(x)| ≤ ε‖x‖.

Hence

‖f_n − f‖ ≤ ε for n ≥ n₀, i.e. f_n → f in X*. □

It is easy to write down the general linear functional on a finite-dimensional linear space. The description of continuous linear functionals on Banach spaces entails some effort. However, not much effort is required to describe the continuous linear functionals on Hilbert spaces. We begin with some examples of continuous linear functionals.

Examples 2.10.24

(i) Let H be a Hilbert space of finite dimension and let e₁, e₂, …, e_n be an orthonormal basis in H. If x = Σ_k a_k e_k, a₁, a₂, …, a_n ∈ C, is any vector in H and f is a linear functional on H, then f(x) = Σ_k a_k f(e_k). Moreover,


|f(x)| = |Σ_{k=1}^n a_k f(e_k)| ≤ Σ_{k=1}^n |a_k||f(e_k)| ≤ (Σ_{k=1}^n |a_k|²)^{1/2} (Σ_{k=1}^n |f(e_k)|²)^{1/2} [Cauchy–Schwarz Inequality] = M‖x‖,

where M = (Σ_{k=1}^n |f(e_k)|²)^{1/2} and ‖x‖ = (Σ_{k=1}^n |a_k|²)^{1/2}; i.e. f is a bounded [continuous] linear functional.

(ii) Consider the Hilbert space H = ℓ² of square summable sequences of scalars. For y = {y_n}_{n≥1} in ℓ², define

f_y(x) = (x, y) = Σ_{n=1}^∞ x_n ȳ_n.

Observe that

|Σ_{n=1}^∞ x_n ȳ_n| ≤ Σ_{n=1}^∞ |x_n||y_n| ≤ (Σ_{n=1}^∞ |x_n|²)^{1/2} (Σ_{n=1}^∞ |y_n|²)^{1/2} = ‖x‖₂‖y‖₂,

so that f_y is a bounded linear functional on ℓ² of norm at most ‖y‖₂. For y ∈ ℓ²,

f_y(y) = (y, y) = Σ_{n=1}^∞ |y_n|² = ‖y‖₂²,

so that in fact ‖f_y‖ = ‖y‖₂.

(iii) Consider the Hilbert space L²(X, M, µ) of complex-valued measurable functions f defined on X for which ∫_X |f|² dµ is finite. For g ∈ L²(X, M, µ), define

f_g(h) = ∫_X h ḡ dµ, h ∈ L²(X, M, µ).


Observe that

|∫_X h ḡ dµ| ≤ (∫_X |h|² dµ)^{1/2} (∫_X |g|² dµ)^{1/2} = ‖h‖₂‖g‖₂,

so that f_g is a bounded linear functional on L²(X, M, µ) of norm at most ‖g‖₂. For g ∈ L²(X, M, µ),

f_g(g) = ∫_X g ḡ dµ = ∫_X |g|² dµ = ‖g‖₂²,

so that ‖f_g‖ = ‖g‖₂.

(iv) Given a Hilbert space H and a vector y ∈ H, the function f_y(x) = (x, y), x ∈ H, is a bounded linear functional on H of norm ‖y‖. Indeed, f_y is clearly linear and |f_y(x)| = |(x, y)| ≤ ‖x‖‖y‖, so ‖f_y‖ ≤ ‖y‖. Furthermore, |f_y(y)| = ‖y‖² and so ‖f_y‖ = ‖y‖.

Note that (i) and (ii) are special cases of (iii) with µ as counting measure, and (iii) is a special case of (iv).

The existence of orthogonal decompositions implies that all bounded linear functionals on H can be obtained in this way.

Theorem 2.10.25 (Riesz Representation Theorem) Let H be a Hilbert space over C and let f ∈ H*, the space of all continuous linear functionals on H. Then there exists a unique vector y ∈ H such that f(x) = (x, y) for all x ∈ H.

Moreover, the mapping T : H → H* defined by T(y) = f_y, where f_y(x) = (x, y), is onto, conjugate linear and isometric.

If H is a Hilbert space over R, the mapping T is linear rather than conjugate linear.

Proof Let f ∈ H*. If f = 0, choose y = 0. Then f(x) = (x, y), x ∈ H. Furthermore, y = 0 is the only such element of H, since 0 = f(y) = (y, y) = ‖y‖².

Suppose that f ≠ 0 and let W = {x ∈ H : f(x) = 0}, known as the kernel of f and denoted by ker(f). Clearly, W is a linear subspace of H. Moreover, W is closed, being the inverse image of the closed set {0} under the continuous linear functional f; so W is a closed subspace of H. Since f ≠ 0, we have W ≠ H. So, by the Orthogonal Decomposition Theorem 2.10.11, H = W ⊕ W⊥. Since W⊥ ≠ {0}, there exists y₀ ∈ W⊥, y₀ ≠ 0. Clearly, f(y₀) ≠ 0 as y₀ ∉ W. Let y = y₀ \overline{f(y₀)}/‖y₀‖². For an arbitrary x ∈ H, we can form the element x − [f(x)/f(y₀)]y₀ ∈ H. Observe that

f(x − [f(x)/f(y₀)]y₀) = f(x) − [f(x)/f(y₀)]f(y₀) = 0,

so that x − [f(x)/f(y₀)]y₀ ∈ W, and hence

(x − [f(x)/f(y₀)]y₀, y₀) = 0, i.e. (x, y₀) = [f(x)/f(y₀)]‖y₀‖².

Consequently,

f(x) = (x, y₀)f(y₀)/‖y₀‖² = (x, y₀ \overline{f(y₀)}/‖y₀‖²) = (x, y).

We next show that y is unique. Assuming the contrary, we have the equation

f(x) = (x, y′) = (x, y″)

for x ∈ H, where y′ ≠ y″. But this is impossible, since the substitution x = y′ − y″ yields the contradiction ‖y′ − y″‖ = 0. The fact that ‖f‖ = ‖y‖ was proved in Example 2.10.24(iv).

The mapping T : H → H* defined by T(y) = f_y, where f_y(x) = (x, y), is conjugate linear: T(ay + bz) = f_{ay+bz}, where f_{ay+bz}(x) = (x, ay + bz) = ā(x, y) + b̄(x, z) = āf_y(x) + b̄f_z(x). Thus, T(ay + bz) = āf_y + b̄f_z = āT(y) + b̄T(z).

The real case is left to the reader. This completes the proof. □
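In finite dimensions the representing vector can be read off directly: its coordinates in an orthonormal basis are the values of the functional on that basis. The sketch below (an illustration with real scalars; the functional f is a hypothetical example, not from the book) constructs y for a functional on R³ and checks f(x) = (x, y).

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def f(x):
    """A hypothetical linear functional on R^3 (illustration only)."""
    return 2.0 * x[0] - x[1] + 0.5 * x[2]

# The representing vector y has coordinates f(e_k) in the standard
# orthonormal basis (real case, so no conjugation is needed).
basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
y = tuple(f(e) for e in basis)

for x in [(1.0, 2.0, 3.0), (-4.0, 0.0, 1.5)]:
    assert abs(f(x) - dot(x, y)) < 1e-12
```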

Remarks 2.10.26

(i) The functionals defined on ℓ² and L²(X, F, µ) in Examples 2.10.24(ii) and (iii) are the only continuous linear functionals on these spaces. To prove the statement without the use of the theorem would need quite an effort. The linear functionals defined in Example 2.10.24(i) are the only ones possible on that space.

(ii) The Riesz Representation Theorem has been proved for a Hilbert space. The hypothesis that the space is complete is essential for the theorem to hold. Consider the pre-Hilbert space ℓ₀ of finitely nonzero sequences. Define

f(x) = Σ_{n=1}^∞ x(n)/n, x ∈ ℓ₀ and x = {x(n)}_{n≥1}.

Then

|f(x)| = |Σ_{n=1}^∞ x(n)/n| ≤ (Σ_{n=1}^∞ 1/n²)^{1/2} (Σ_{n=1}^∞ |x(n)|²)^{1/2} = M‖x‖,

where M = (Σ_{n=1}^∞ 1/n²)^{1/2}, using the Cauchy–Schwarz Inequality. Thus, f is a bounded linear functional on ℓ₀. However, there exists no y ∈ ℓ₀ for which f(x) = (x, y). Indeed, for x = e_n = (0, 0, …, 0, 1, 0, …), where 1 occurs in the nth place, f(x) = 1/n, while (x, y) = \overline{y(n)}, so that \overline{y(n)} = 1/n for every n. Consequently, y ∉ ℓ₀.


Remark 2.10.27 In fact, every incomplete inner product space has a continuous linear functional that cannot be represented by an element of the space. Indeed, the linear functional defined by a vector in the completion but not in the incomplete space is the desired linear functional.

Let Y be a subspace of a normed linear space X and f be a bounded linear functional defined on Y. Then f can be extended to the whole of X, so that both the functional and its extension have the same norm [Theorem 5.3.2]. Apart from the fact that the procedure of extension is involved, the extension is not unique. However, the existence of an extension of a continuous linear functional defined on a subspace of a Hilbert space H to H is a direct consequence of the Riesz Representation Theorem 2.10.25. Moreover, the extension is unique.

Theorem 2.10.28 Let H be a Hilbert space, Y a subspace of H and f a continuous linear functional defined on Y. Then there exists a unique F ∈ H* such that F|_Y = f and ‖f‖_Y = ‖F‖_H, where

‖f‖_Y = sup{|f(x)| : x ∈ Y, ‖x‖ = 1} and ‖F‖_H = sup{|F(x)| : x ∈ H, ‖x‖ = 1}.

Proof The functional f is uniformly continuous on Y [Remark 2.10.22(ii)]. Hence, f can be extended to Ȳ, the closure of Y, with the preservation of norm. This is shown as follows:

Let x ∈ Ȳ. There exists a sequence {x_n} in Y such that x_n → x and, in view of the linearity and continuity of f, lim_n f(x_n) exists; moreover, it is independent of the sequence chosen. We define f(x) to be lim_n f(x_n). Then |f(x_n)| ≤ ‖f‖_Y‖x_n‖ and this implies |f(x)| ≤ ‖f‖_Y‖x‖. Consequently, the norm of the extended f is at most the norm of the given f. The reverse inequality is trivial.

We may therefore assume without loss of generality that Y is a closed subspace of H. The Riesz Representation Theorem 2.10.25 asserts the existence of a unique element y ∈ Y such that

f(x) = (x, y) for all x ∈ Y

and

‖f‖_Y = ‖y‖.


Define F(x) = (x, y) for all x ∈ H. Then F|_Y = f and

‖F‖_H = ‖y‖ = ‖f‖_Y.

We shall next show that any other extension of the linear functional f to the whole space increases the norm. Indeed, if F′ is any other extension of f to the whole space, then by the Riesz Representation Theorem there exists z ∈ H such that

F′(x) = (x, z) for all x ∈ H

and

‖F′‖ = ‖z‖.

For x ∈ Y,

(x, y) = f(x) = F′(x) = (x, z),

so that y − z ⊥ Y. Because y ∈ Y,

‖z‖² = ‖y‖² + ‖z − y‖² ≥ ‖y‖², i.e. ‖F′‖ ≥ ‖f‖_Y,

with equality only if z = y. □

We note in passing that if Y is not the whole space H, then there exist extensions of arbitrarily large norm.

Let X be a normed linear space over C. The space X* of all bounded linear functionals on X is a Banach space. One can then consider continuous linear functionals on X*, that is, the space (X*)* = X**. By the preceding remark, X** is again a Banach space. The element x ∈ X defines a continuous linear functional on X*; that is, x determines an element s(x) of X** defined by

s(x)(f) = f(x), f ∈ X*.   (2.75)

The inequality |s(x)(f)| = |f(x)| ≤ ‖f‖‖x‖ shows that s(x) ∈ X** and ‖s(x)‖ ≤ ‖x‖. One learns in a course on Banach spaces that the mapping s : X → X** defined by (2.75) above is an isometric isomorphism onto its range. The space is said to be reflexive if the mapping s defined by (2.75) above is surjective. Not all Banach spaces are reflexive. However, all finite-dimensional normed linear spaces are. One of the distinguishing features of a Hilbert space is that it is reflexive. We begin by showing that H*, the dual of a Hilbert space H, is itself a Hilbert space.

Theorem 2.10.29 If H is a Hilbert space, then H* is a Hilbert space. Moreover, there exists a conjugate linear map T : H → H* which is one-to-one, onto, norm preserving and satisfies

(f₁, f₂)_{H*} = (T⁻¹f₂, T⁻¹f₁).

Proof Consider the mapping T : H → H* defined by

T(y) = f_y, where f_y(x) = (x, y), x ∈ H.   (2.76)

Note that T defined on H and given by (2.76) is conjugate linear, one-to-one, norm preserving and onto [Theorem 2.10.25]. Therefore, T⁻¹ exists.

Define an inner product on H* as follows: Given f₁, f₂ ∈ H*, let

(f₁, f₂)_{H*} = (T⁻¹f₂, T⁻¹f₁).

It is easy to check that this defines an inner product on H*. It is related to the norm by (f₁, f₁)_{H*} = ‖f₁‖²_{H*}, because

(f₁, f₁)_{H*} = (T⁻¹f₁, T⁻¹f₁) = ‖T⁻¹f₁‖² = ‖f₁‖²_{H*},

T⁻¹ being norm preserving. Since H* is complete [Theorem 2.10.23], it is a Hilbert space. □

Theorem 2.10.30 If H is a Hilbert space and H** = (H*)*, then the mapping s : H → H** (x → s(x)), where the defining equation for s(x) is

s(x)(f) = f(x), f ∈ H*,   (2.77)

is an isometric isomorphism of H onto H**; in particular, H is reflexive.

Proof Let T : H → H* and S : H* → H** be the conjugate linear maps assured by Theorem 2.10.29 (used twice). Since both are conjugate linear, one-to-one, norm preserving and onto, we know that the composition ST : H → H** is linear, one-to-one, norm preserving and onto, which is to say that it is an isometric isomorphism between H and H**. Thus, we need only to prove that ST = s. Recall from the construction in Theorem 2.10.29 that

S(g)(f) = (f, g)_{H*}, f, g ∈ H*,   (2.78)

and

(f, g)_{H*} = (T⁻¹g, T⁻¹f)_H, f, g ∈ H*.   (2.79)

For x ∈ H and f ∈ H*,

ST(x)(f) = S(Tx)(f)
= (f, Tx)_{H*}   by (2.78)
= (T⁻¹(Tx), T⁻¹f)_H   by (2.79)
= (x, T⁻¹f)_H = T(T⁻¹f)(x)   by (2.76)
= f(x),

so that ST(x) = s(x) for every x ∈ H. □

The following result is an analogue of a familiar result from metric spaces.

Theorem 2.10.31 In order that the linear span of a system M of vectors be dense in H, it is necessary and sufficient that the only continuous linear functional f ∈ H* which vanishes for all x ∈ M is the one that is identically zero.

Proof Necessity: suppose the linear span of M is dense in H, i.e. the closure of [M] is H, and f ∈ H* vanishes for all x ∈ M. By linearity, f vanishes on [M] and hence, by continuity, it vanishes on the closure of [M], which is the same as H.

Sufficiency: suppose the linear span of M is not dense in H, i.e. the closure of [M] is not all of H. Then there exists y ≠ 0 such that y ⊥ [M]. The linear functional f defined by f(x) = (x, y) for all x ∈ H then vanishes for all x ∈ M but is not identically zero, because f(y) = (y, y) ≠ 0. □

Problem Set 2.10

2.10.P1. Let f ∈ RH² have the series expansion f = Σ_{j=0}^∞ a_j z^j. Define C_n(f) = a_n. Show that C_n is a continuous linear functional on RH².

2.10.P2. Let e_0(t) = 1 and e_1(t) = √3(2t − 1), t ∈ [0, 1], be vectors in the Hilbert space L²[0, 1]. Show that e_0 ⊥ e_1 and ||e_0|| = ||e_1|| = 1. Compute the vector y in the linear span of {e_0, e_1} closest to t² and also compute

min_{a,b} ∫_0^1 |t² − a − bt|² dt.

2.10.P3. Let X = ℝ². Find M⊥ if

(a) M = {x}, where x = (ξ_1, ξ_2) ≠ 0;
(b) M is a linearly independent set {x_1, x_2} ⊆ X.

2.10.P4. For any subset M ≠ ∅ of a Hilbert space H, span(M) is dense in H if, and only if, M⊥ = {0}.

2.10.P5. (a) Prove that for any two subspaces M_1 and M_2 of a Hilbert space H, we have (M_1 + M_2)⊥ = M_1⊥ ∩ M_2⊥.

(b) Prove that for any two closed subspaces M_1 and M_2 of a Hilbert space H, we have (M_1 ∩ M_2)⊥ = closure(M_1⊥ + M_2⊥).

2.10.P6. (a) Let K_1 and K_2 be nonempty, closed and convex subsets of a Hilbert space H such that K_1 ⊆ K_2. Prove that, for all x ∈ H,

||y_1 − y_2||² ≤ 2[d(x, K_1)² − d(x, K_2)²],

where y_1 and y_2 denote the projections of x onto K_1 and K_2, respectively.

(b) Let {K_n}_{n≥1} be an increasing sequence of nonempty closed convex subsets in H and let K be the closure of ⋃_n K_n. Prove that K is closed and convex. Also show that lim_n y_n = y for all x ∈ H, where y_n is the projection of x onto K_n, n = 1, 2, …, and y is the projection of x onto K.

2.10.P7. Let M be a closed subspace of a Hilbert space H and x_0 ∈ H. Prove that min{||x_0 − x|| : x ∈ M} = max{|(x_0, y)| : y ∈ M⊥ and ||y|| = 1}.

2.10.P8. (a) Let a be a nonzero element of a Hilbert space H. Prove that, for all x ∈ H,

d(x, {a}⊥) = |(x, a)| / ||a||.

(b) Determine the distance of an arbitrary g ∈ L²[0, 1] from the subspace

F = { f ∈ L²[0, 1] : ∫_0^1 f(x) dx = 0 }.

2.10.P9. In the linear space C[0, 1], consider the functional F(x) = ∫_0^1 x(t)f(t) dt, where f is a continuous function defined on [0, 1]. Show that ||F|| = ∫_0^1 |f(t)| dt.

2.10.P10. Let H be a Hilbert space and f be a nonzero continuous linear functional on H, i.e. f ∈ H*\{0}. Show that dim((ker(f))⊥) = 1.

2.10.P11. Prove that if f is a linear functional on a Hilbert space H and ker(f) is closed, then f is bounded.

2.10.P12. Show that the subspace M = {x = {x_n}_{n≥1} ∈ ℓ² : Σ_{n=1}^∞ x_n/√n = 0} is not a closed subspace of ℓ².

2.10.P13. Prove that the system sin nx, n = 1, 2, …, is complete in L²[0, π].

2.10.P14. Let K be a nonempty closed convex set in a Hilbert space H. Show that K contains a unique vector k of smallest norm and that ℜ(k, k − x) ≤ 0 for all x ∈ K. Moreover, if k ∈ K satisfies ℜ(k, k − x) ≤ 0 for all x ∈ K, then k is the vector of smallest norm in K.

2.10.P15. Let y be a nonzero vector in a Hilbert space H and let

M = {x ∈ H : (x, y) = 0}.

What is M⊥?

2.11 Approximation in Hilbert Spaces

Let v_1, v_2, …, v_n be linearly independent vectors in a Hilbert space H. Suppose that x ∈ H. In linear approximation, it is required to find a method of computing the minimum value of the quantity

||x − Σ_{j=1}^n λ_j v_j||,

where λ_1, λ_2, …, λ_n range over all scalars, and also to determine those values of λ_1, λ_2, …, λ_n for which the minimum is attained.

Let M be the closed linear span generated by the linearly independent vectors v_1, v_2, …, v_n and let x ∈ H. By Theorem 2.10.10, there exists a unique minimising vector y ∈ M and x − y ⊥ M. Since y ∈ M, we have (x − y, y) = 0.

Denote (v_j, v_i) = a_ij and b_i = (x, v_i). If y = Σ_{j=1}^n c_j v_j is the minimising vector, then

(x − y, v_i) = 0 for i = 1, 2, …, n,

that is,

b_i = Σ_{j=1}^n a_ij c_j,   i = 1, 2, …, n.

Since the vectors {v_i} are linearly independent, the matrix [a_ij] is nonsingular. Consequently, the n linear equations in the n unknowns c_1, c_2, …, c_n have a unique solution.

Let d = inf{||x − Σ_{j=1}^n λ_j v_j|| : λ_1, λ_2, …, λ_n are scalars}. Then

d² = ||x − y||² = (x − y, x − y) = (x, x − y) = (x, x − Σ_{j=1}^n c_j v_j) = ||x||² − Σ_{j=1}^n c̄_j b_j.   (2.80)

If we replace v_1, v_2, …, v_n by an orthonormal set u_1, u_2, …, u_n, then a_ij = 1 if i = j and 0 if i ≠ j. Hence, c_j = b_j, j = 1, 2, …, n, and it follows from (2.80) that

d² = ||x||² − Σ_{j=1}^n |b_j|² = ||x||² − Σ_{j=1}^n |(x, u_j)|².
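The procedure just described is easy to carry out numerically. The sketch below is our own illustration (the vectors in ℝ⁴ are arbitrary): it solves the system b_i = Σ_j a_ij c_j for the coefficients of the minimising vector and compares d² = ||x||² − Σ_j c_j b_j with the squared distance computed directly.

```python
import numpy as np

rng = np.random.default_rng(1)
# three linearly independent vectors v_1, v_2, v_3 in R^4 and a target x
V = rng.standard_normal((3, 4))          # rows are v_1, v_2, v_3
x = rng.standard_normal(4)

A = V @ V.T                               # a_ij = (v_j, v_i) (real, symmetric)
b = V @ x                                 # b_i = (x, v_i)
c = np.linalg.solve(A, b)                 # unique solution of b_i = sum_j a_ij c_j

y = c @ V                                 # minimising vector y = sum_j c_j v_j
d_sq = x @ x - c @ b                      # d^2 = ||x||^2 - sum_j c_j b_j (real scalars)
print(d_sq, np.sum((x - y) ** 2))
```

The residual x − y is also orthogonal to every v_i, confirming that y is the orthogonal projection of x onto the span of v_1, v_2, v_3.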

Theorem 2.11.1 Let {u_1, u_2, …, u_n} be an orthonormal set in H and let x ∈ H. Then

||x − Σ_{k=1}^n (x, u_k)u_k|| ≤ ||x − Σ_{k=1}^n λ_k u_k||

for arbitrary scalars λ_1, λ_2, …, λ_n. Equality holds if, and only if, λ_k = (x, u_k), k = 1, 2, …, n. Moreover, Σ_{k=1}^n (x, u_k)u_k is the orthogonal projection of x onto the subspace M generated by {u_1, u_2, …, u_n}, and if d is the distance of x from M, then d² = ||x||² − Σ_{k=1}^n |(x, u_k)|².

Remark 2.11.2 Suppose the orthonormal set {u_1, u_2, …, u_n} is enlarged to {u_1, u_2, …, u_{n+1}} by adjoining one further vector, M now denotes the subspace generated by u_1, u_2, …, u_{n+1}, and it is desired to obtain the distance of x ∈ H from M. Then

d² = ||x − y||² = ||x||² − Σ_{k=1}^{n+1} |(x, u_k)|²,   (2.81)

where

y = Σ_{k=1}^{n+1} (x, u_k)u_k.   (2.82)

2.11 Approximation in Hilbert Spaces 125

The reader will notice that the first n components in the sums on the right of (2.81) and (2.82) remain unaltered when the dimension of the space is increased from n to n + 1. This exhibits the importance of orthonormalising the linearly independent vectors.

Example 2.11.3 Consider the real inner product space C[−1, 1], the inner product (x, y), x, y ∈ C[−1, 1], being defined by

(x, y) = ∫_{−1}^1 x(t)y(t) dt.

Consider the three linearly independent vectors 1, t, t² (the Wronskian of the vectors 1, t, t² is 2 ≠ 0) in C[−1, 1]. The Gram–Schmidt orthonormalisation process yields

u_0(t) = 1/√2,   u_1(t) = √(3/2) t,   u_2(t) = (1/2)√(5/2) (3t² − 1).

Let M_2 [respectively, M_3] be the linear space generated by {u_0, u_1} [respectively, {u_0, u_1, u_2}]. Consider x(t) = e^t in C[−1, 1]. We shall compute the distance of x from M_2 and M_3.

(x, u_0) = (1/√2) ∫_{−1}^1 e^t dt = (e − e^{−1})/√2,

(x, u_1) = √(3/2) ∫_{−1}^1 t e^t dt = √6 e^{−1},

(x, u_2) = (1/2)√(5/2) ∫_{−1}^1 (3t² − 1) e^t dt = √(5/2) (e − 7e^{−1}).

Let y_2 and y_3 be the projections of x(t) = e^t on the subspaces M_2 and M_3, respectively. Then

y_2(t) = ((e − e^{−1})/√2)(1/√2) + (√6 e^{−1}) √(3/2) t
       = (1/2)(e − e^{−1}) + 3e^{−1} t

and

y_3(t) = (1/2)(e − e^{−1}) + 3e^{−1} t + √(5/2)(e − 7e^{−1}) · (1/2)√(5/2)(3t² − 1)
       = (1/2)(e − e^{−1}) + 3e^{−1} t + (5/4)(e − 7e^{−1})(3t² − 1).

By Theorem 2.11.1,

d_2² = ||x||² − (x, u_0)² − (x, u_1)²
     = (1/2)(e² − e^{−2}) − ((e − e^{−1})/√2)² − (√6 e^{−1})²
     = 1 − 7e^{−2}

and

d_3² = d_2² − (x, u_2)²
     = d_2² − (5/2)(e − 7e^{−1})²
     = 1 − 7e^{−2} − (5/2)(e² + 49e^{−2} − 14)
     = 36 − (5/2)e² − (259/2)e^{−2}.
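The closed-form answers in the example can be checked by numerical integration. The sketch below is our own illustration, using a simple midpoint rule on [−1, 1]; it evaluates d_2² and d_3² for x(t) = e^t and compares them with 1 − 7e^{−2} and 36 − (5/2)e² − (259/2)e^{−2}.

```python
import numpy as np

# midpoint rule on [-1, 1]
N = 200_000
t = -1 + (np.arange(N) + 0.5) * (2 / N)
w = 2 / N

def ip(f, g):
    # inner product (f, g) = integral of f(t) g(t) over [-1, 1]
    return np.sum(f * g) * w

x = np.exp(t)
u0 = np.full_like(t, 1 / np.sqrt(2))
u1 = np.sqrt(3 / 2) * t
u2 = 0.5 * np.sqrt(5 / 2) * (3 * t**2 - 1)

d2_sq = ip(x, x) - ip(x, u0) ** 2 - ip(x, u1) ** 2
d3_sq = d2_sq - ip(x, u2) ** 2

e = np.e
print(d2_sq, 1 - 7 / e**2)
print(d3_sq, 36 - 2.5 * e**2 - 129.5 / e**2)
```

Both pairs of values agree to high accuracy; note how small d_3² already is, reflecting how well a quadratic approximates e^t on [−1, 1].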

Problem Set 2.11

2.11.P1. Find min_{a,b,c} ∫_{−1}^1 |t³ − a − bt − ct²|² dt and max ∫_{−1}^1 t³ g(t) dt, where g is subject to the restrictions

∫_{−1}^1 g(t) dt = ∫_{−1}^1 t g(t) dt = ∫_{−1}^1 t² g(t) dt = 0,   ∫_{−1}^1 |g(t)|² dt = 1.

2.11.P2. Find the point nearest to (1, −1, 1) in the linear span of (1, x, x²) and (1, x², x) in ℂ³, where x = exp(2πi/3).

2.12 Weak Convergence 127

Let {x_n}_{n≥1} be a sequence of vectors in a Hilbert space H. Recall that {x_n}_{n≥1} converges to x in H if ||x_n − x|| = (x_n − x, x_n − x)^{1/2} → 0 as n → ∞, and we write x_n → x. From now on it will be called strong convergence to distinguish it from weak convergence, to be introduced shortly. The relationship between the two types of convergence will be discussed. The concepts of strong convergence and weak convergence are identical in finite-dimensional spaces. A characterisation of weak convergence in special spaces will also find a mention below.

Definition 2.12.1 A sequence of vectors {x_n}_{n≥1} converges weakly to a vector x, and we write x_n ⇀ x, if

(x_n, y) → (x, y) as n → ∞ for all y ∈ H.

The concepts of a weakly Cauchy sequence and weak completeness are defined analogously.

Remarks 2.12.2

(i) A sequence cannot converge weakly to two different limits: assume that x_n ⇀ x_0 and x_n ⇀ y_0. Then

(x_n, y) → (x_0, y) and (x_n, y) → (y_0, y)

for all y ∈ H. Consequently, (x_0, y) = (y_0, y), or (x_0 − y_0, y) = 0, for all y ∈ H. If we choose y = x_0 − y_0, we obtain (x_0 − y_0, x_0 − y_0) = 0, which implies x_0 = y_0.

(ii) If x_n ⇀ x_0, then every subsequence {x_{n_k}}_{k≥1} converges weakly to x_0.

(iii) Strong convergence of {x_n}_{n≥1} to x_0 implies x_n ⇀ x_0. Indeed, for y ∈ H, we have

|(x_n − x_0, y)| ≤ ||x_n − x_0|| ||y||,

and the right-hand side tends to 0 as n → ∞.

(iv) The converse of (iii) is, however, not true. Indeed, let {e_n}_{n≥1} be an infinite orthonormal sequence of vectors in H. Since for any y ∈ H,

Σ_{n=1}^∞ |(y, e_n)|² ≤ ||y||²   (by Bessel's Inequality),

it follows that (y, e_n) → 0 = (y, 0) as n → ∞. Thus the sequence {e_n}_{n≥1} converges weakly to the vector zero, but this sequence cannot converge strongly, since

||e_i − e_j||² = 2   (i ≠ j),

so that it is not even a Cauchy sequence in the norm.

However, the following theorem holds:

Theorem 2.12.3 If H is a finite-dimensional Hilbert space, strong convergence is equivalent to weak convergence.

Proof Since we have already shown that, in any Hilbert space, strong convergence implies weak convergence [Remark 2.12.2(iii)], it is enough to show in this situation that weak convergence implies strong convergence. To this end, let e_1, …, e_k be an orthonormal basis for H and let x_n ⇀ x, where

x_n = a_1^(n) e_1 + ⋯ + a_k^(n) e_k

for each n, and

x = a_1 e_1 + ⋯ + a_k e_k.

Since x_n ⇀ x, it follows that

(x_n, e_j) → (x, e_j),   i.e.,   a_j^(n) → a_j

for j = 1, …, k. For any prescribed ε > 0, there must be an integer n_0 such that for all n > n_0 and for every j = 1, …, k,

|a_j^(n) − a_j| < ε/k;

hence

||x_n − x||² = ||Σ_{j=1}^k (a_j^(n) − a_j) e_j||² = Σ_{j=1}^k |a_j^(n) − a_j|² < k(ε/k)² ≤ ε²,

so that ||x_n − x|| < ε. □
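The contrast with Remark 2.12.2(iv) is easy to visualise numerically. The sketch below is our own finite truncation of ℓ² (the dimension N and the vector y are arbitrary choices): against a fixed square-summable y, the orthonormal vectors e_n have inner products (y, e_n) = y_n → 0, while ||e_i − e_j||² = 2 keeps the sequence from being strongly Cauchy; only in finite dimensions, as in Theorem 2.12.3, do the two notions coincide.

```python
import numpy as np

N = 10_000                          # finite truncation of l^2 (illustration only)
y = 1.0 / np.arange(1, N + 1)       # a fixed square-summable vector

def e(n):
    # n-th standard basis vector
    v = np.zeros(N)
    v[n - 1] = 1.0
    return v

# (y, e_n) = y_n -> 0: the orthonormal sequence tends weakly to 0 against y
ips = [y @ e(n) for n in (1, 10, 100, 1000)]
# ||e_i - e_j||^2 = 2 for i != j: no strongly convergent subsequence
gap = np.sum((e(3) - e(7)) ** 2)
print(ips, gap)
```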

The next result pinpoints the relationship between weak and strong convergence.

Theorem 2.12.4 Let {x_n}_{n≥1} be a sequence in a Hilbert space H. Then x_n → x if, and only if, x_n ⇀ x and lim sup_{n→∞} ||x_n|| ≤ ||x||.

Proof Let x_n → x. Then x_n ⇀ x [Remark 2.12.2(iii)]. Also, lim sup_{n→∞} ||x_n|| = lim_{n→∞} ||x_n|| = ||x||, since 0 ≤ | ||x_n|| − ||x|| | ≤ ||x_n − x||.

Conversely, let x_n ⇀ x and lim sup_{n→∞} ||x_n|| ≤ ||x||. For each n, 0 ≤ ||x_n − x||² = (x_n − x, x_n − x) = ||x_n||² + ||x||² − 2ℜ(x_n, x). Since (x_n, x) → (x, x) = ||x||² as n → ∞, we have

lim sup_{n→∞} ||x_n − x||² ≤ lim sup_{n→∞} ||x_n||² + ||x||² − 2||x||² ≤ 0,

so that x_n → x. □

The Riesz Representation Theorem enables us to prove the following analogue of the classical Bolzano–Weierstrass Theorem.

Theorem 2.12.5 Any bounded sequence in H has a weakly convergent subsequence, and the weak limit has the same bound.

Proof Let {x_n}_{n≥1} be a sequence in H and M > 0 be such that ||x_n|| ≤ M for all n. We need to find a weakly convergent subsequence of {x_n}_{n≥1}.

By the Cauchy–Schwarz Inequality,

|(x_n, x_1)| ≤ ||x_n|| ||x_1|| ≤ M ||x_1||

for all n. The classical Bolzano–Weierstrass Theorem shows that the bounded sequence {(x_n, x_1)}_{n≥1} has a convergent subsequence {(x_{n(1)}, x_1)}_{n(1)≥1}, say. Applying the preceding argument to the sequence {(x_{n(1)}, x_2)}_{n(1)≥1}, we extract a convergent subsequence {(x_{n(2)}, x_2)}_{n(2)≥1}. Continuing inductively, we obtain for each k a convergent subsequence {(x_{n(k)}, x_k)}_{n(k)≥1} of {(x_{n(k−1)}, x_k)}_{n(k−1)≥1}.

Consider now the diagonal sequence {z_p}_{p≥1}, where z_p denotes the pth term of the sequence {x_{n(p)}}_{n(p)≥1}. We show that for each x ∈ H, the sequence {(z_p, x)}_{p≥1} of scalars converges.

If x = x_m for some m, then {(z_p, x_m)}_{p≥m} is a subsequence of the convergent sequence {(x_{n(m)}, x_m)}_{n(m)≥1} and is, therefore, convergent. Hence, if x ∈ span{x_1, x_2, …}, then {(z_p, x)}_{p≥1} converges in the field of scalars.

Let x be in the closure of span{x_1, x_2, …}. Consider a sequence {y_r}_{r≥1} in span{x_1, x_2, …} such that y_r → x as r → ∞. Then for all n, m and r, we have

|(z_n, x) − (z_m, x)| = |(z_n − z_m, x)|
≤ |(z_n − z_m, x − y_r)| + |(z_n − z_m, y_r)|
≤ ||z_n − z_m|| ||x − y_r|| + |(z_n − z_m, y_r)|
≤ 2M ||x − y_r|| + |(z_n − z_m, y_r)|.

Since ||x − y_r|| → 0 as r → ∞ and |(z_n − z_m, y_r)| → 0 as n, m → ∞ for each r, we see that {(z_n, x)}_{n≥1} is a Cauchy sequence of scalars and is, therefore, convergent.

Next, let x ⊥ span{x_1, x_2, …}. Then (z_n, x) = 0 for all n, since z_n is in span{x_1, x_2, …}. Thus (z_n, x) → 0 as n → ∞.

By the Orthogonal Decomposition Theorem 2.10.11,

H = closure(span{x_1, x_2, …}) ⊕ (closure(span{x_1, x_2, …}))⊥,

and it follows that U(x) = lim_{n→∞} (x, z_n) exists for every x ∈ H. The functional U is linear, and |U(x)| ≤ M ||x||, since ||z_n|| ≤ M for all n. By the Riesz Representation Theorem 2.10.25, there exists a unique y ∈ H such that U(x) = (x, y) for all x ∈ H, i.e.

lim_{n→∞} z_n = y (weakly),

and ||y|| = ||U|| ≤ M. □

Every convergent sequence in a normed linear space X is bounded. This is easily seen as follows: let {x_n}_{n≥1} be a sequence in X and suppose that lim_{n→∞} x_n = x. For a given ε > 0, there exists an integer n_0 such that n ≥ n_0 implies ||x_n − x|| < ε. But since ||x_n|| − ||x|| ≤ ||x_n − x||, this implies ||x_n|| < ε + ||x||, n ≥ n_0. With M = max{||x_1||, …, ||x_{n_0 − 1}||}, it now follows that

||x_n|| < ε + ||x|| + M

for every n. Thus, the terms of a convergent sequence in a normed linear space, a fortiori in a Hilbert space, are bounded. The corresponding statement is true for a weakly convergent sequence as well.

Theorem 2.12.6 If H is a Hilbert space and x_n ⇀ x, then there exists a positive constant M such that

||x_n|| ≤ M,   n = 1, 2, ….

The proof will be given after some preparation.

Definition 2.12.7 A real functional p(x) on H is said to be convex if for all x, y ∈ H and α ∈ ℂ, the following hold:

(i) p(x + y) ≤ p(x) + p(y);
(ii) p(αx) = |α| p(x).

Observe that (i) p(0) = 0, (ii) p(x − y) ≥ |p(x) − p(y)| and p(x) ≥ 0, where x, y ∈ H. Indeed, p(0) = p(0·x) = 0·p(x) = 0. Also, p(x − y) + p(y) ≥ p(x) and hence p(x − y) ≥ p(x) − p(y). Since p(x − y) = |−1| p(y − x) ≥ p(y) − p(x), it follows that p(x − y) ≥ |p(x) − p(y)|. On setting y = −x, we obtain p(2x) = 2p(x) ≥ |p(x) − p(−x)| = 0, so that p(x) ≥ 0.

That a lower semi-continuous convex functional on a Hilbert space satisfies a bound of the form p(x) ≤ M||x|| is the content of the Lemma below. In conjunction with the observation above, it will further follow that it is uniformly continuous.

Lemma 2.12.8 Suppose p(x) is a convex functional on a Hilbert space H and assume that p(x) is lower semi-continuous. Then there exists M > 0 such that

p(x) ≤ M ||x||,   x ∈ H.

Proof We first show that the functional p(x) is bounded on the ball S(0, 1). We assume the contrary. Then p(x) is unbounded on every ball, because every ball is obtained by dilation and/or translation of the ball S(0, 1). We choose a point x_1 ∈ S(0, 1) such that p(x_1) > 1. The lower semi-continuity of the functional p(x) implies that there exists a ball S(x_1, q_1) with radius q_1 < 1/2 on which p(x) > 1. By reducing the radius q_1, we may assume that the closed ball cl S(x_1, q_1) ⊆ S(0, 1). Since p(x) is unbounded on every ball, in a similar manner we obtain a point x_2 ∈ S(x_1, q_1) and also a closed ball cl S(x_2, q_2) ⊆ S(x_1, q_1) with radius q_2 < (1/2)q_1, on which p(x) > 2. Continuing the process, we obtain an infinite sequence of nested closed balls cl S(x_n, q_n) with q_n → 0, on which p(x) > n.

Observe that the sequence {x_n}_{n≥1} of the centres of the balls S(x_n, q_n), n = 1, 2, …, is Cauchy, and since H is complete, lim_{n→∞} x_n exists and equals x, say. Then x lies in the intersection of the closed balls, and hence p(x) > n for each n, which is a contradiction. Thus there exists M_1 > 0 such that p ≤ M_1 on S(0, 1).

Let x ∈ H, x ≠ 0. Then x/(2||x||) is an element of H of norm 1/2 and is, therefore, in S(0, 1). Now,

p(x/(2||x||)) ≤ M_1,

and hence, by homogeneity, p(x) = 2||x|| p(x/(2||x||)) ≤ 2M_1||x||. The lemma follows with M = 2M_1. □

Corollary 2.12.9 Let p_k(x), k = 1, 2, …, be a sequence of convex continuous functionals on H. If this sequence is bounded at each point x ∈ H, then the functional

p(x) = sup_k p_k(x)

is a continuous convex functional, and there exists M > 0 such that p(x) ≤ M||x||, x ∈ H.

Proof Evidently, p(x) is a convex functional. On the other hand, for each x_0 ∈ H and each ε > 0, there exists N such that

p_N(x_0) > p(x_0) − (1/2)ε,

i.e.

p(x_0) − p_N(x_0) < (1/2)ε.

Since p_N is continuous, there is a δ > 0 such that ||x − x_0|| < δ implies

|p_N(x) − p_N(x_0)| < (1/2)ε.

For such x,

p(x) − p(x_0) ≥ sup_k p_k(x) − p_N(x_0) − (1/2)ε
≥ p_N(x) − p_N(x_0) − (1/2)ε > −ε.

This implies that the functional p(x) is lower semi-continuous. By Lemma 2.12.8, it follows that p(x) satisfies p(x) ≤ M||x|| for some M > 0. Continuity now follows from the observation preceding the Lemma. □

Every weakly convergent sequence of vectors in a Hilbert space is bounded. This is an immediate consequence of the following theorem.

Theorem 2.12.10 Let {U_k}_{k≥1} be a sequence of continuous linear functionals defined on the Hilbert space H. Suppose that the numerical sequence {U_k(x)}_{k≥1} is bounded for each x ∈ H. Then the sequence {||U_k||}_{k≥1} of norms of the functionals is bounded.

Proof For x ∈ H, define p_k(x) = |U_k(x)|, k = 1, 2, …. Each p_k is a continuous convex functional, and {p_k(x)}_{k≥1} is bounded at each point x ∈ H. By Corollary 2.12.9, the functional p(x) = sup_k p_k(x) satisfies

sup_{||x||≤1} p(x) ≤ M.

Consequently,

sup_k ||U_k|| = sup_k sup_{||x||≤1} |U_k(x)| = sup_{||x||≤1} sup_k p_k(x) = sup_{||x||≤1} p(x) ≤ M. □

Proof of Theorem 2.12.6 Let {x_n}_{n≥1} be a weakly convergent sequence. Each vector x_n determines a functional U_n(x) = (x, x_n). Since the sequence {x_n}_{n≥1} is weakly convergent, the numerical sequence {U_n(x)}_{n≥1} converges for each x ∈ H and hence is bounded. Using Theorem 2.12.10, it follows that

||U_n|| ≤ M,   n = 1, 2, ….

Since ||U_n|| = ||x_n|| by the Riesz Representation Theorem, the result follows. □

Definition 2.12.11 A sequence {x_n}_{n≥1} in a Hilbert space H is said to be weakly Cauchy if, for each y ∈ H,

lim_{m,n} (x_m − x_n, y) = 0.

The space H is said to be weakly complete if every weakly Cauchy sequence converges to a weak limit in H.

Corollary 2.12.12 Let H be a Hilbert space. Then H is weakly complete.

Proof Let {x_n}_{n≥1} be a Cauchy sequence in the sense of weak convergence, that is, for each y ∈ H,

lim_{m,n} (x_m − x_n, y) = 0.

It follows that the sequence {(x_n, y)}_{n≥1} of scalars converges for each y in H. By Theorem 2.12.10, the sequence {x_n}_{n≥1} is bounded:

||x_n|| ≤ M,   n = 1, 2, ….

Consequently,

U(x) = lim_{n→∞} (x, x_n)

defines a linear functional U(x) with norm less than or equal to M. By the Riesz Representation Theorem 2.10.25, U(x) = (x, z), where z is a unique element of the Hilbert space H. This element is the weak limit of the sequence {x_n}_{n≥1}. □

We give below two applications of Corollary 2.12.9.

Theorem 2.12.13 (F. Riesz) If a functional U is defined everywhere on L²[a, b] by the formula

U(x) = ∫_a^b x(t)y(t) dt,   x ∈ L²[a, b],

where y is a fixed measurable function defined on [a, b], then U is a bounded linear functional on L²[a, b], so that y ∈ L²[a, b].

Proof Clearly, U is a linear functional on L²[a, b]. Set

E_n = {t ∈ [a, b] : |y(t)| ≤ n}

and

p_n(x) = ∫_{E_n} |x(t)y(t)| dt,   x ∈ L²[a, b].

For x, z ∈ L²[a, b] and α ∈ ℂ,

p_n(x + z) = ∫_{E_n} |[x(t) + z(t)]y(t)| dt ≤ ∫_{E_n} |x(t)y(t)| dt + ∫_{E_n} |z(t)y(t)| dt = p_n(x) + p_n(z)

and

p_n(αx) = ∫_{E_n} |αx(t)y(t)| dt = |α| ∫_{E_n} |x(t)y(t)| dt = |α| p_n(x).

Moreover,

p_n(x) ≤ n ∫_{E_n} |x(t)| dt ≤ n (∫_{E_n} |x(t)|² dt)^{1/2} (∫_{E_n} dt)^{1/2} ≤ n μ(E_n)^{1/2} ||x||_2,

where μ denotes Lebesgue measure.

Thus, for n = 1, 2, …, p_n is a continuous convex functional on L²[a, b]. The equality

p(x) = lim_n p_n(x) = lim_n ∫_{E_n} |x(t)y(t)| dt = ∫_a^b |x(t)y(t)| dt,

using the Monotone Convergence Theorem 1.3.6, shows that p(x) is finite for any x in L²[a, b]. By Corollary 2.12.9, the functional p(x) is bounded; i.e. there exists M > 0 such that p(x) ≤ M||x||_2. Thus

|U(x)| ≤ ∫_a^b |x(t)y(t)| dt = p(x) ≤ M||x||_2,

i.e. U is a bounded linear functional on L²[a, b]; so y ∈ L²[a, b] and ||y||_2 = ||U||, using the definition of U and the Riesz Representation Theorem. □

Theorem 2.12.14 (Landau) If U is a functional defined everywhere on ℓ² by means of the formula

U(x) = Σ_{k=1}^∞ a_k x_k,   x = {x_k}_{k≥1} ∈ ℓ²,

where {a_k}_{k≥1} is some fixed sequence, then Σ_{k=1}^∞ |a_k|² < ∞.

Proof Define p_n(x) = Σ_{k=1}^n |a_k x_k|, x = {x_k}_{k≥1} ∈ ℓ². Check that p_n, n = 1, 2, …, is a continuous convex functional. Then the equality

p(x) = lim_n p_n(x) = lim_n Σ_{k=1}^n |a_k x_k| = Σ_{k=1}^∞ |a_k x_k|

implies that p(x) is finite for any x ∈ ℓ². So, by Corollary 2.12.9, the functional p(x) is continuous; i.e. there exists M > 0 such that p(x) ≤ M||x||, x ∈ ℓ². Consequently,

|U(x)| ≤ Σ_{k=1}^∞ |a_k x_k| = p(x) ≤ M||x||.

So, U is a bounded linear functional on ℓ². The form of U and the Riesz Representation Theorem imply that Σ_{k=1}^∞ |a_k|² < ∞. □

Remark 2.12.15 Landau's Theorem may also be stated as follows: if Σ_{k=1}^∞ a_k x_k converges for every {x_k}_{k≥1} in ℓ², then Σ_{k=1}^∞ |a_k|² < ∞.
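The contrapositive can be illustrated numerically; the choice of sequences below is our own, not from the text. For a_k = 1/√k, which is not square-summable, the element x_k = 1/(√k log(k + 1)) does lie in ℓ², yet the partial sums of Σ a_k x_k grow without bound, so a functional of Landau's form cannot be defined everywhere on ℓ² by such a sequence {a_k}.

```python
import numpy as np

k = np.arange(1, 1_000_001, dtype=float)
a = 1 / np.sqrt(k)                      # sum |a_k|^2 = sum 1/k diverges
x = 1 / (np.sqrt(k) * np.log(k + 1))    # x is square-summable

sum_x_sq = np.sum(x**2)                 # partial sums of ||x||^2 stay bounded
s = np.cumsum(a * x)                    # partial sums of sum 1/(k log(k+1)): unbounded
print(sum_x_sq, s[99], s[-1])
```

The partial sums s grow like log log N, so the divergence is slow but genuine.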

Problem Set 2.12

2.12.P1. Show that for a sequence {x_n}_{n≥1} in an inner product space X and x ∈ X, the conditions

||x_n|| → ||x||  and  (x_n, x) → (x, x)

imply x_n → x in X.

2.12.P2. Let {x_n}_{n≥1} be a sequence in a Hilbert space H converging weakly to x ∈ H. Prove that there exists a subsequence {x_{n_k}}_{k≥1} such that the sequence {y_k}_{k≥1} defined by

y_k = (1/k)(x_{n_1} + x_{n_2} + ⋯ + x_{n_k})

converges strongly to x.

2.12.P3. (a) (Mazur's Theorem) Let {x_n}_{n≥1} be a weakly convergent sequence in a Hilbert space H and let x be its weak limit. Prove that x lies in the closed convex hull of the range {x_n : n ≥ 1} of the sequence.

(b) Let C be a convex subset of a Hilbert space H. Prove that C is closed if, and only if, it contains the weak limit of every sequence of points in it.

2.12.P4. (a) Let H be a separable Hilbert space and let {e_n}_{n≥1} be an orthonormal basis for H. Let B = {x ∈ H : ||x|| ≤ 1}. For x, y ∈ H, let

d(x, y) = Σ_{n=1}^∞ 2^{−n} |(x − y, e_n)|.   (2.85)

Show that d defines a metric on B.

(b) Show that on B the topology generated by d is the same as the one given by the weak topology, i.e. d(x_k, x) → 0 if, and only if, x_k ⇀ x.

(c) Show that the metric space (B, d) is compact.

2.13 Applications

Müntz’s Theorem

Weierstrass's Theorem for C[0, 1] says, in effect, that all linear combinations of the functions

1, x, x², …, xⁿ, …   (2.86)

are dense in C[0, 1]. Instead of working with all positive powers of x, let us permit gaps to occur, and consider the infinite set of functions

1, x^{n_1}, x^{n_2}, …, x^{n_k}, …,   (2.87)

where the n_k are positive integers satisfying n_1 < n_2 < ⋯ < n_k < ⋯. The result we shall prove is called Müntz's Theorem and asserts that the linear combinations of the functions (2.87) are dense in C[0, 1], and hence in L²[0, 1], if, and only if, the series Σ_{k=1}^∞ 1/n_k diverges. The following will be needed in the course of the proof.

Definition Let x_1, x_2, …, x_n be any vectors in an inner product space X. Then the n × n matrix G(x_1, x_2, …, x_n) whose (i, j)th entry is (x_i, x_j), where (·,·) is the inner product in X, is called the Gram matrix of the given finite sequence of vectors. Its determinant is called their Gram determinant.

Proposition The Gram matrix G(x_1, x_2, …, x_n) is nonsingular if, and only if, the vectors x_1, x_2, …, x_n are linearly independent.

Proof Observe that for the given G and any n-tuple of scalars x = (ξ_1, ξ_2, …, ξ_n), we have

xG = [Σ_{i=1}^n ξ_i(x_i, x_1)   Σ_{i=1}^n ξ_i(x_i, x_2)   …   Σ_{i=1}^n ξ_i(x_i, x_n)].

So,

xGx* = Σ_{i,j=1}^n (x_i, x_j) ξ_i ξ̄_j = (Σ_{i=1}^n ξ_i x_i, Σ_{i=1}^n ξ_i x_i) = ||Σ_{i=1}^n ξ_i x_i||².

If the vectors are linearly dependent, some nontrivial choice of (ξ_1, …, ξ_n) makes Σ_i ξ_i x_i = 0, so that xGx* = 0 with x ≠ 0 and G is singular. Conversely, if G is singular, there is x ≠ 0 with xG = 0, whence ||Σ_i ξ_i x_i||² = xGx* = 0 and the vectors are linearly dependent. □

Corollary A necessary and sufficient condition for the vectors x_1, x_2, …, x_n to be linearly dependent is that

det G(x_1, x_2, …, x_n) = 0.

Let M be the closed subspace generated by x_1, x_2, …, x_n. Then H can be written as M ⊕ M⊥. If y ∈ H, then y = z + w, where z ∈ M and w ∈ M⊥, so that y − z ∈ M⊥ [see Remark 2.10.12(i)]. The minimum distance d from y to the subspace M is d = ||y − z||, where

z = Σ_{i=1}^n a_i x_i

is the unique vector in M attaining the minimal distance d.

Since y − z ⊥ x_j, j = 1, 2, …, n, we obtain a system of equations

(y − Σ_{i=1}^n a_i x_i, x_j) = 0,   j = 1, 2, …, n,

i.e.

a_1(x_1, x_1) + a_2(x_2, x_1) + ⋯ + a_n(x_n, x_1) = (y, x_1)
a_1(x_1, x_2) + a_2(x_2, x_2) + ⋯ + a_n(x_n, x_2) = (y, x_2)
⋮
a_1(x_1, x_n) + a_2(x_2, x_n) + ⋯ + a_n(x_n, x_n) = (y, x_n).   (2.88)

The matrix of its coefficients is precisely the transpose of the Gram matrix G(x_1, x_2, …, x_n). Since the vectors x_1, x_2, …, x_n are linearly independent, the matrix is nonsingular by the Proposition above, and the system has one and only one solution. Moreover, by Cramer's Rule, the unique solution is given by

a_i = det G^(i) / det G(x_1, x_2, …, x_n),   i = 1, 2, …, n,

where G^(i) is obtained from G by replacing its ith column by the column of constants (y, x_i).

Now,

d² = ||y − z||² = (y − z, y − z) = (y, y − z) = ||y||² − (y, Σ_{i=1}^n a_i x_i),

so that

(Σ_{i=1}^n a_i x_i, y) = ||y||² − d².   (2.89)

We combine Eq. (2.89) with the system of Eq. (2.88) and write them in the form

a_1(x_1, x_1) + a_2(x_2, x_1) + ⋯ + a_n(x_n, x_1) − (y, x_1) = 0
a_1(x_1, x_2) + a_2(x_2, x_2) + ⋯ + a_n(x_n, x_2) − (y, x_2) = 0
⋮
a_1(x_1, x_n) + a_2(x_2, x_n) + ⋯ + a_n(x_n, x_n) − (y, x_n) = 0
a_1(x_1, y) + a_2(x_2, y) + ⋯ + a_n(x_n, y) + d² − (y, y) = 0.   (2.90)

Regarding the terms in the last column as coefficients of an additional unknown a_{n+1} (= 1), (2.90) becomes a system of n + 1 homogeneous linear equations in the n + 1 variables a_1, a_2, …, a_n, a_{n+1} (= 1). Since the system (2.90) possesses this nontrivial solution, the determinant of the system must vanish, i.e.

det [ (x_1, x_1)  (x_2, x_1)  ⋯  (x_n, x_1)  −(y, x_1)
      (x_1, x_2)  (x_2, x_2)  ⋯  (x_n, x_2)  −(y, x_2)
      ⋮
      (x_1, x_n)  (x_2, x_n)  ⋯  (x_n, x_n)  −(y, x_n)
      (x_1, y)    (x_2, y)    ⋯  (x_n, y)    d² − (y, y) ] = 0.

This gives

d² = det G(x_1, x_2, …, x_n, y) / det G(x_1, x_2, …, x_n).
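The determinant formula is easy to test numerically. The sketch below is our own illustration (random vectors in ℝ⁵): d² computed as the ratio det G(x_1, …, x_n, y)/det G(x_1, …, x_n) is compared with the squared distance obtained from a least-squares solution.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((3, 5))          # rows x_1, x_2, x_3, linearly independent
y = rng.standard_normal(5)

def gram(rows):
    # Gram matrix with (i, j) entry (x_i, x_j)
    M = np.asarray(rows)
    return M @ M.T

d_sq_det = np.linalg.det(gram(np.vstack([X, y]))) / np.linalg.det(gram(X))

# distance via least squares: minimise ||y - sum_i c_i x_i||
c, *_ = np.linalg.lstsq(X.T, y, rcond=None)
d_sq_ls = np.sum((y - c @ X) ** 2)
print(d_sq_det, d_sq_ls)
```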

Apart from the above observations, the following lemmas will be needed in the proof of Müntz's Theorem.

Lemma Let λ_1, λ_2, …, λ_n be positive real numbers and A be the matrix whose (i, j)th entry is a_ij = 1/(λ_i + λ_j). Then

det A = 2^{−n} ∏_{j=1}^n (1/λ_j) ∏_{1≤j<k≤n} ((λ_j − λ_k)/(λ_j + λ_k))².

Proof If A is a 1 × 1 matrix, then det A = 1/(2λ_1) = 2^{−1} ∏_{j=1}^1 (1/λ_j). Thus, the assertion is true for n = 1. Assume that the assertion is true for m; i.e. if A is an m × m matrix, then

det A = 2^{−m} ∏_{j=1}^m (1/λ_j) ∏_{1≤j<k≤m} ((λ_j − λ_k)/(λ_j + λ_k))².

For an (m + 1) × (m + 1) matrix A, the determinant, when written in full, takes the form

| 1/(λ_1 + λ_1)      1/(λ_1 + λ_2)      ⋯  1/(λ_1 + λ_m)      1/(λ_1 + λ_{m+1})     |
| 1/(λ_2 + λ_1)      1/(λ_2 + λ_2)      ⋯  1/(λ_2 + λ_m)      1/(λ_2 + λ_{m+1})     |
| ⋮                  ⋮                      ⋮                  ⋮                    |
| 1/(λ_{m+1} + λ_1)  1/(λ_{m+1} + λ_2)  ⋯  1/(λ_{m+1} + λ_m)  1/(λ_{m+1} + λ_{m+1}) |.

By subtracting the last row from each of the others, removing common factors, subtracting the last column from each of the others and again removing the common factors, we obtain

det A = (1/(2λ_{m+1})) · (∏_{i=1}^m (λ_{m+1} − λ_i)²) / (∏_{i=1}^m (λ_{m+1} + λ_i)²) ·
| 1/(λ_1 + λ_1)  ⋯  1/(λ_1 + λ_m)  1 |
| ⋮                  ⋮             ⋮ |
| 1/(λ_m + λ_1)  ⋯  1/(λ_m + λ_m)  1 |
| 0              ⋯  0              1 |,

and the last determinant reduces, on expanding along the final row, to the m × m case. The induction hypothesis now yields

det A = 2^{−(m+1)} ∏_{j=1}^{m+1} (1/λ_j) ∏_{1≤j<k≤m+1} ((λ_j − λ_k)/(λ_j + λ_k))². □
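The Lemma is easy to check numerically; the sketch below is our own illustration, with arbitrarily chosen λ_j, comparing det A with the product formula.

```python
import numpy as np

lam = np.array([0.5, 1.3, 2.0, 3.7])
n = len(lam)
A = 1.0 / (lam[:, None] + lam[None, :])   # a_ij = 1/(lambda_i + lambda_j)

prod = 2.0 ** (-n) * np.prod(1 / lam)
for j in range(n):
    for k in range(j + 1, n):
        prod *= ((lam[j] - lam[k]) / (lam[j] + lam[k])) ** 2

print(np.linalg.det(A), prod)
```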

Lemma Let 1, t^{n_1}, t^{n_2}, …, where 1 ≤ n_1 < n_2 < ⋯, be a set of functions defined on [0, 1]. The sequence {t^{n_k}}_{k≥1}, together with 1, is total (finite linear combinations are dense) in C[0, 1] if, and only if, {t^{n_k}}_{k≥1} is complete in L²[0, 1].

Proof Let x ∈ C[0, 1]. The inequality

[∫_0^1 |x(t) − Σ_{i=1}^k a_i t^{n_i}|² dt]^{1/2} ≤ max_{0≤t≤1} |x(t) − Σ_{i=1}^k a_i t^{n_i}|   (2.91)

shows that a system total in C[0, 1] is complete in L²[0, 1], since C[0, 1] is dense in L²[0, 1].

Conversely, suppose that the sequence {t^{n_k}}_{k≥1} is complete in L²[0, 1]. In order to show that the finite linear combinations constitute a dense subset of C[0, 1], it is enough to show that an inequality in the direction opposite to (2.91) holds for the functions t^m, m = 1, 2, …. Now, with a_i = m b_i/n_i,

|t^m − Σ_{i=1}^k a_i t^{n_i}| = |m ∫_0^t (s^{m−1} − Σ_{i=1}^k b_i s^{n_i − 1}) ds|
≤ m ∫_0^1 |s^{m−1} − Σ_{i=1}^k b_i s^{n_i − 1}| ds   (2.92)
≤ m (∫_0^1 |s^{m−1} − Σ_{i=1}^k b_i s^{n_i − 1}|² ds)^{1/2},

using the Cauchy–Schwarz Inequality. The above inequality proves the assertion. □

Remarks

(i) The function 1 must be added in the case of C[0, 1] but is redundant in L²[0, 1]. Indeed, if the function 1 is missing from {t^{n_k}}_{k≥1}, then the polynomial Σ_{i=1}^k a_i t^{n_i} is itself zero at t = 0 and cannot, therefore, uniformly approximate a continuous function x(t) for which x(0) ≠ 0.

(ii) Since (x^p, x^q) = ∫_0^1 t^{p+q} dt = 1/(p + q + 1), the Gram matrix of t^{n_1}, t^{n_2}, …, t^{n_k} has (i, j)th entry 1/(n_i + n_j + 1), and the preceding Lemma (applied with λ_i = n_i + 1/2) yields

det G(t^{n_1}, t^{n_2}, …, t^{n_k}) = ∏_{i>j} (n_i − n_j)² / ∏_{i,j} (n_i + n_j + 1),

and analogously,

det G(t^m, t^{n_1}, t^{n_2}, …, t^{n_k}) = [∏_{i>j} (n_i − n_j)² ∏_{i=1}^k (m − n_i)²] / [∏_{i,j} (n_i + n_j + 1) ∏_{i=1}^k (m + n_i + 1)²] · 1/(2m + 1).

Hence

det G(t^m, t^{n_1}, …, t^{n_k}) / det G(t^{n_1}, …, t^{n_k}) = (1/(2m + 1)) ∏_{i=1}^k ((m − n_i)/(m + n_i + 1))².

(iii) The series Σ_{i=1}^∞ ln(1 + a_i) and the series Σ_{i=1}^∞ a_i (with a_i > 0, a_i → 0) converge or diverge simultaneously. This is because

lim_{x→0} ln(1 + x)/x = 1.

Proof of Müntz's Theorem In view of the Lemma relating C[0, 1] and L²[0, 1], it suffices to determine when the system {t^{n_k}}_{k≥1} is complete in L²[0, 1], and for this it suffices to examine the distance d_k of each power t^m from the span of t^{n_1}, …, t^{n_k}. If m = n_i for some i, then t^m is itself a member of the system, and hence the minimal distance is zero. Thus, completeness holds if, and only if, for each m ≥ 1 with m ≠ n_i, i = 1, 2, …, the minimal distance satisfies

d_k² = det G(t^m, t^{n_1}, …, t^{n_k}) / det G(t^{n_1}, …, t^{n_k}) → 0 as k → ∞.   (2.93)

Now,

det G(t^m, t^{n_1}, …, t^{n_k}) / det G(t^{n_1}, …, t^{n_k}) = (1/(2m + 1)) ∏_{i=1}^k ((m − n_i)/(m + n_i + 1))².   (2.94)

Taking logarithms, (2.93) holds if, and only if,

lim_{k→∞} Σ_{i=1}^k [ln(1 − m/n_i) − ln(1 + (m + 1)/n_i)] = −∞.   (2.95)

If the series Σ_{i=1}^∞ 1/n_i diverges, then by (iii) of the Remarks above,

Σ_{i=1}^∞ ln(1 − m/n_i) = −∞,   Σ_{i=1}^∞ ln(1 + (m + 1)/n_i) = +∞,   (2.96)

and therefore (2.95) is satisfied and hence so is (2.93). If, however, the series Σ_{i=1}^∞ 1/n_i converges, then the series in (2.96) also converge, so that (2.95) is not satisfied and hence (2.93) does not hold. □
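Formula (2.94) makes the dichotomy in Müntz's Theorem visible numerically. The sketch below is our own illustration: it evaluates d_k² for m = 1 with the divergent choice n_i = 2i and the convergent choice n_i = i² + 1. In the first case d_k² tends to zero, in the second it stabilises at a positive value.

```python
import numpy as np

def d_sq(m, exps):
    # d_k^2 = (1/(2m+1)) * prod_i ((m - n_i)/(m + n_i + 1))^2, formula (2.94)
    exps = np.asarray(exps, dtype=float)
    return np.prod(((m - exps) / (m + exps + 1)) ** 2) / (2 * m + 1)

k = 2000
div = d_sq(1, [2 * i for i in range(1, k + 1)])        # sum 1/n_i diverges
conv = d_sq(1, [i * i + 1 for i in range(1, k + 1)])   # sum 1/n_i converges
print(div, conv)
```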

Radon–Nikodým Theorem

Let (X, ℛ) be a measurable space, and let ν, μ be finite nonnegative measures on (X, ℛ). The measure ν is said to be absolutely continuous with respect to μ, in symbols ν ≪ μ, if ν(E) = 0 for every E ∈ ℛ for which μ(E) = 0.

For h ∈ L¹(X, ℛ, μ), the integral

ν(E) = ∫_E h dμ,   E ∈ ℛ,

defines a finite measure which is absolutely continuous with respect to μ. The point of the Radon–Nikodým Theorem is the converse: every ν ≪ μ is obtained in this way. von Neumann showed how to derive this from the Riesz Representation Theorem for linear functionals on a Hilbert space.

(Radon–Nikodým Theorem) Let ν and μ be finite nonnegative measures on (X, ℛ). If ν ≪ μ, then there exists a unique nonnegative measurable function h such that

ν(E) = ∫_E h dμ,   E ∈ ℛ.

Proof For any E ∈ ℛ, put φ(E) = ν(E) + μ(E). Since ν and μ are finite nonnegative measures, so is φ. Moreover,

∫_X x dφ = ∫_X x dν + ∫_X x dμ   (2.97)

holds for x = χ_E, E ∈ ℛ. Hence, (2.97) holds for simple functions and consequently for any nonnegative measurable function x.

Let H be the real Hilbert space L²(X, ℛ, φ) with the norm ||x||² = ∫_X |x|² dφ. For x ∈ H, the Cauchy–Schwarz Inequality gives

∫_X |x| dν ≤ ∫_X |x| dφ ≤ (∫_X |x|² dφ)^{1/2} (φ(X))^{1/2} < ∞,

so that the functional

L : x → ∫_X x dν

is seen to be defined and finite on H. It is clear that L(ax + by) = aL(x) + bL(y) for all scalars a, b and x, y ∈ L²(X, ℛ, φ) = H. Thus, L is a bounded linear functional on H and so, by Theorem 2.10.25, there is a function y ∈ H such that

∫_X x dν = ∫_X xy dφ = ∫_X xy dν + ∫_X xy dμ,

where we have used (2.97) in the last equality. It is easy to discern that y is nonnegative a.e. with respect to φ, and hence with respect to μ and ν as well. The preceding identity may be written as

∫_X x(1 − y) dν = ∫_X xy dμ.   (2.98)

Taking x = χ_E in (2.98), where E = {s ∈ X : y(s) ≥ 1}, we obtain

0 ≤ μ(E) = ∫_X χ_E dμ ≤ ∫_X χ_E y dμ = ∫_X χ_E (1 − y) dν ≤ 0,

so that μ(E) = 0 and hence, since ν ≪ μ, also ν(E) = 0. Let z = y χ_{E^c}. Then z(s) ∈ [0, 1) for every s, and z = y a.e. with respect to both ν and μ. The equality (2.98) then becomes

∫_X x(1 − z) dν = ∫_X xz dμ.   (2.99)

Since both x and z are bounded and φ is a finite measure, the function (1 + z + z² + ⋯ + z^{n−1})x is in L²(X, ℛ, φ) for every positive integer n, and hence by (2.99)

∫_X (1 + z + z² + ⋯ + z^{n−1}) x(1 − z) dν = ∫_X (1 + z + z² + ⋯ + z^{n−1}) xz dμ

holds. In view of the fact that z(s) ≠ 1 for any s, the above equality can be written as

∫_X (1 − zⁿ) x dν = ∫_X (z(1 − zⁿ)/(1 − z)) x dμ.

Since 0 ≤ z(s) < 1 for all s ∈ X, for bounded nonnegative measurable x the sequences (1 − zⁿ)x and (z(1 − zⁿ)/(1 − z))x increase to x and (z/(1 − z))x, respectively, as n → ∞. By the Monotone Convergence Theorem 1.3.6, we obtain

∫_X x dν = ∫_X (z/(1 − z)) x dμ.

Setting h = z/(1 − z) ≥ 0, we have

∫_X x dν = ∫_X hx dμ,

and in particular, taking x = χ_E, E ∈ ℛ,

ν(E) = ∫_E h dμ.

Since ν is finite, h ∈ L¹(X, ℛ, μ). If h_1 is another nonnegative measurable function with ν(E) = ∫_E h_1 dμ for all E ∈ ℛ, then ∫_E (h − h_1) dμ = 0 for every E ∈ ℛ, whence h = h_1 a.e. [μ]. □

Remarks

(i) The construction of h shows that h ≥ 0.
(ii) The Radon–Nikodým Theorem is valid if ν and μ are σ-finite measures. For details, the reader may consult [26].
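On a finite set the von Neumann argument can be carried out verbatim, since every functional on the finite-dimensional L²(φ) is given by a weight vector. The sketch below is our own illustration, with made-up measures on a three-point space: it solves for y, forms h = z/(1 − z), and checks ν(E) = ∫_E h dμ on every subset E.

```python
import numpy as np

nu = np.array([0.2, 0.0, 1.1])     # finite measure on a 3-point space
mu = np.array([0.4, 0.7, 0.5])     # nu << mu: mu vanishes nowhere
phi = nu + mu                      # the auxiliary measure of the proof

# L(x) = integral of x d(nu) is represented on L^2(phi) by y: integral of x*y d(phi)
y = nu / phi                       # pointwise solution of nu = y * phi
z = y                              # here y < 1 everywhere, so E = {y >= 1} is empty
h = z / (1 - z)                    # the Radon-Nikodym derivative d(nu)/d(mu)

# check nu(E) = integral over E of h d(mu) on every nonempty subset E
ok = all(
    abs(nu[list(E)].sum() - (h * mu)[list(E)].sum()) < 1e-12
    for E in [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]
)
print(h, ok)
```

As expected, h coincides with the pointwise quotient ν/μ on atoms of positive μ-measure.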

Let X be a bounded domain in the z = x + iy plane, whose boundary consists of a

ﬁnite number of smooth simple closed curves. The class of all holomorphic func-

RR

tions in X for which the integral X jf ðzÞj2 dxdy\1 is denoted by A(X). The

integral is understood as the limit of Riemann integrals

ZZ

limn jf ðzÞj2 dxdy;

Kn

2.13 Applications 147

is X. It has been proved [see 2.6.2, 2.6.3, 2.6.4, 2.6.5] that A(X) is a Hilbert space.

Consider the linear functional L(f) = f(ζ), where ζ ∈ X is fixed and f ∈ A(X). Observe that

$$|L(f)| = |f(\zeta)| \le \frac{\|f\|}{\sqrt{\pi}\,d_\zeta},$$

where d_ζ = dist(ζ, ∂X) and ∂X denotes the boundary of X [see Proposition 2.6.3]. It follows on using Theorem 2.10.25 that there exists a uniquely determined φ_ζ ∈ A(X) such that

$$L(f) = f(\zeta) = (f, \varphi_\zeta), \quad f \in A(X).$$

The traditional notation is φ_ζ(z) = K(z, ζ), and K is called the Bergman kernel of X. For each ζ ∈ X, the function has the reproducing property

$$f(\zeta) = (f, K(\cdot, \zeta)) = \int\!\!\int_X f(z)\,\overline{K(z, \zeta)}\,dxdy, \quad f \in A(X). \qquad (2.101)$$

(a) If one substitutes f = K(·, ζ) in (2.101), one finds that

$$\|K(\cdot, \zeta)\|^2 = \int\!\!\int_X K(z, \zeta)\,\overline{K(z, \zeta)}\,dxdy = K(\zeta, \zeta), \quad \zeta \in X.$$

(b) For z_1, z_2 ∈ X, the relation K(z_1, z_2) = \overline{K(z_2, z_1)} holds. To see this, we let f = K(·, z_2) and ζ = z_1 in (2.101) and we obtain

$$K(z_1, z_2) = \int\!\!\int_X K(z, z_2)\,\overline{K(z, z_1)}\,dxdy = \overline{\int\!\!\int_X K(z, z_1)\,\overline{K(z, z_2)}\,dxdy} = \overline{K(z_2, z_1)}.$$

The relation between the kernel function and a certain minimum problem in A(X) is also important. Suppose ζ ∈ X is fixed, and write M = {f ∈ A(X) : f(ζ) = 1}. There is a unique element f_0 of smallest norm in M, and the function f_0 is connected with the Bergman kernel function as follows:

$$f_0(z) = \frac{K(z, \zeta)}{K(\zeta, \zeta)} \quad\text{and}\quad K(z, \zeta) = \frac{f_0(z)}{\|f_0\|^2}.$$


Proof Since A(X) is a Hilbert space and M ⊆ A(X) is a closed convex subset, the first assertion follows on using Corollary 2.10.7.

For each f ∈ A(X), we have f(ζ) = (f, K(·, ζ)). For f ∈ M, on using the Cauchy–Schwarz inequality, we have

$$1 = (f, K(\cdot, \zeta)) \le \|f\|\,\|K(\cdot, \zeta)\| = \|f\|\sqrt{K(\zeta, \zeta)}. \qquad (2.102)$$

Equality holds in (2.102) precisely when f is a scalar multiple of K(·, ζ); the minimising function is therefore of the form f_0 = CK(·, ζ). Since 1 = f_0(ζ) = CK(ζ, ζ) (therefore C = 1/K(ζ, ζ)), it follows that

$$f_0(z) = \frac{K(z, \zeta)}{K(\zeta, \zeta)}.$$

This implies

$$\|f_0\|^2 = \frac{K(\zeta, \zeta)}{K(\zeta, \zeta)^2} = \frac{1}{K(\zeta, \zeta)}. \qquad (2.103)$$

Also,

$$K(z, \zeta) = \frac{f_0(z)}{\|f_0\|^2},$$

using (2.102) and (2.103). ∎

Recall that the Riemann Mapping Theorem asserts: if X is a simply connected domain having more than one boundary point, then there exists a holomorphic function in X which maps X bijectively onto D = {z : |z| < 1}. If ζ is fixed, then the mapping function f(z) = f(z, ζ) for which f(ζ) = 0 and f′(ζ) > 0 is unique.

The mapping function f and the Bergman kernel K of X are related as follows:

$$f'(z) = \sqrt{\frac{\pi}{K(\zeta, \zeta)}}\,K(z, \zeta) \quad\text{and}\quad K(z, \zeta) = \frac{1}{\pi}\,f'(z)\,\overline{f'(\zeta)}, \quad z \in X.$$

Proof Let X_r denote the subdomain of X which is mapped by f onto the disc {ω : |ω| < r}, where r < 1 and ω = f(z). Denote the boundary of X_r by γ_r. If g ∈ A(X), then g(z)/f(z) has a simple pole at z = ζ, and the residue at this pole is

$$\lim_{z\to\zeta}\frac{(z - \zeta)g(z)}{f(z)} = \frac{g(\zeta)}{f'(\zeta)}.$$


By the Residue Theorem,

$$\frac{g(\zeta)}{f'(\zeta)} = \frac{1}{2\pi i}\oint_{\gamma_r}\frac{g(z)}{f(z)}\,dz = \frac{1}{2\pi i\,r^2}\oint_{\gamma_r}\overline{f(z)}\,g(z)\,dz,$$

since |f(z)| = r on γ_r. An application of Green's theorem converts this into

$$\frac{g(\zeta)}{f'(\zeta)} = \frac{1}{\pi r^2}\int\!\!\int_{X_r}\overline{f'(z)}\,g(z)\,dxdy.$$

Letting r → 1, we get

$$g(\zeta) = \int\!\!\int_X \frac{f'(\zeta)\,\overline{f'(z)}}{\pi}\,g(z)\,dxdy.$$

Thus

$$K(z, \zeta) = \frac{f'(z)\,\overline{f'(\zeta)}}{\pi} \qquad (2.104)$$

has the reproducing property for A(X) and is therefore the Bergman kernel. For z = ζ, it follows that

$$K(\zeta, \zeta) = \frac{f'(\zeta)^2}{\pi},$$

and hence

$$f'(z) = \sqrt{\frac{\pi}{K(\zeta, \zeta)}}\,K(z, \zeta). \qquad \square$$

Remarks

(i) In only a few cases is it possible to obtain a representation for the kernel function in closed form. It is easy to find a series representation with respect to some complete orthonormal system {φ_j}, because by (2.101), the Fourier coefficients of K(·, ζ) are

$$(K(\cdot, \zeta), \varphi_j) = \overline{\varphi_j(\zeta)}, \quad j = 1, 2, \ldots,$$

and the Bergman kernel has the series representation [see Theorem 2.9.15(iii)]


$$K(z, \zeta) = \sum_{j=1}^{\infty}\overline{\varphi_j(\zeta)}\,\varphi_j(z), \quad z, \zeta \in X.$$

(ii) Consider the special case where X = D = {z : |z| < 1}. According to (vi) of Examples 2.9.16, the set φ_n(z) = √(n/π)\,z^{n−1}, n = 1, 2, …, is an orthonormal system in A(D). Thus,

$$K(z, \zeta) = \sum_{n=1}^{\infty}\frac{n}{\pi}\,\bar\zeta^{\,n-1}z^{n-1} = \frac{1}{\pi(1 - z\bar\zeta)^2}, \quad z, \zeta \in D.$$

The reproducing property becomes

$$f(\zeta) = \frac{1}{\pi}\int\!\!\int_D \frac{f(z)}{(1 - \zeta\bar z)^2}\,dxdy.$$
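The reproducing property for the disc can be checked numerically with a midpoint Riemann sum. The sketch below is only an illustration (the test function, the point ζ, the grid step and the tolerance are all arbitrary choices); it approximates (1/π)∬_D f(z)/(1 − ζz̄)² dxdy and compares it with f(ζ):

```python
# Numerical check of the reproducing property of the Bergman kernel of
# the unit disc, K(z, zeta) = 1/(pi*(1 - z*conj(zeta))**2):
#     f(zeta) ~ (1/pi) * sum over the disc of f(z)/(1 - zeta*conj(z))**2 * h*h
from math import pi

def reproduce(f, zeta, h=0.004):
    total = 0.0 + 0.0j
    n = int(round(2.0 / h))
    for i in range(n):
        for j in range(n):
            z = complex(-1 + (i + 0.5) * h, -1 + (j + 0.5) * h)
            if abs(z) < 1.0:                 # integrate over the disc only
                total += f(z) / (1 - zeta * z.conjugate()) ** 2 * h * h
    return total / pi

zeta = 0.3 + 0.2j
approx = reproduce(lambda z: z ** 2, zeta)
assert abs(approx - zeta ** 2) < 0.05        # recovers f(zeta) = zeta**2
```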

Let C be a nonempty convex, closed and bounded subset of a Hilbert space H and let T be a map from C into C such that

$$\|Tx - Ty\| \le \|x - y\| \quad \text{for all } x, y \in C,$$

i.e., T is nonexpansive. We show that T has a fixed point.

Solution: Fix a ∈ C. For each n ∈ ℕ, let

$$T_n(x) = \frac{1}{n}a + \frac{n-1}{n}T(x), \quad x \in C.$$

Since C is convex, T_n maps C into C; moreover, T_n is a strict contraction and therefore has a unique fixed point x_n ∈ C. Indeed, for x, y ∈ C, ||T_n(x) − T_n(y)|| = \frac{n-1}{n}||Tx − Ty|| ≤ \frac{n-1}{n}||x − y||.
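The averaging trick behind T_n can be seen on a concrete nonexpansive map. In the hedged sketch below, T(x) = 1 − x on C = [0, 1] and a = 0 are illustrative choices (not from the text); each T_n, being a strict contraction, is iterated to its unique fixed point x_n, and the x_n approach the fixed point 1/2 of T:

```python
# T(x) = 1 - x is nonexpansive on [0, 1] with fixed point 1/2.
# T_n(x) = a/n + (1 - 1/n)*T(x) contracts with factor (n-1)/n.

def fixed_point(Tn, x0=0.0, iters=20000):
    x = x0
    for _ in range(iters):        # Banach fixed-point iteration
        x = Tn(x)
    return x

T = lambda x: 1.0 - x
a = 0.0
for n in (2, 10, 100):
    Tn = lambda x, n=n: a / n + (1.0 - 1.0 / n) * T(x)
    xn = fixed_point(Tn)
    # exact fixed point of T_n here is (n-1)/(2n-1), which tends to 1/2
    assert abs(xn - (n - 1) / (2 * n - 1)) < 1e-6

assert abs(fixed_point(lambda x: 0.99 * T(x)) - 0.5) < 0.01
```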

Since C is a bounded subset of H and {x_n}_{n≥1} lies in C, there exists a subsequence {x_{n_j}}_{j≥1} converging weakly to some x ∈ H [Theorem 3.1.5]. By the Banach–Saks Theorem [Problem 2.12.P2], {x_{n_j}}_{j≥1} has a subsequence such that a sequence of certain convex combinations of its terms converges strongly to x. Consequently, x ∈ C, as C is convex and (strongly) closed. We shall prove that x is a fixed point of T.


For any y ∈ H,

$$\bigl\|x_{n_j} - y\bigr\|^2 = \bigl\|x_{n_j} - x\bigr\|^2 + \|x - y\|^2 + 2\,\mathrm{Re}\bigl(x_{n_j} - x,\ x - y\bigr), \qquad (2.105)$$

where 2 Re(x_{n_j} − x, x − y) → 0 as j → ∞, since x_{n_j} − x → 0 weakly in H.

Observe that, since x_{n_j} is the fixed point of T_{n_j},

$$T(x_{n_j}) - x_{n_j} = T(x_{n_j}) - T_{n_j}(x_{n_j}) = T(x_{n_j}) - \frac{1}{n_j}a - \frac{n_j - 1}{n_j}T(x_{n_j}) = \frac{1}{n_j}\bigl(T(x_{n_j}) - a\bigr) \to 0 \ \text{ as } j \to \infty, \qquad (2.106)$$

the difference T(x_{n_j}) − a being bounded because C is bounded.

Taking y = T(x) in (2.105), we obtain

$$\lim_{j\to\infty}\Bigl\{\bigl\|x_{n_j} - T(x)\bigr\|^2 - \bigl\|x_{n_j} - x\bigr\|^2\Bigr\} = \|x - T(x)\|^2. \qquad (2.107)$$

Since T is nonexpansive,

$$\bigl\|T(x_{n_j}) - T(x)\bigr\| \le \bigl\|x_{n_j} - x\bigr\|.$$

Hence

$$\bigl\|x_{n_j} - T(x)\bigr\| \le \bigl\|x_{n_j} - T(x_{n_j})\bigr\| + \bigl\|T(x_{n_j}) - T(x)\bigr\| \le \bigl\|x_{n_j} - T(x_{n_j})\bigr\| + \bigl\|x_{n_j} - x\bigr\|.$$

In view of (2.106), this gives

$$\limsup_{j\to\infty}\Bigl\{\bigl\|x_{n_j} - T(x)\bigr\| - \bigl\|x_{n_j} - x\bigr\|\Bigr\} \le 0,$$

and therefore, the sequences involved being bounded,

$$\limsup_{j\to\infty}\Bigl\{\bigl\|x_{n_j} - T(x)\bigr\|^2 - \bigl\|x_{n_j} - x\bigr\|^2\Bigr\} \le 0.$$

Together with (2.107), this yields ||x − T(x)|| = 0, that is, T(x) = x.

Chapter 3

Linear Operators

Let X and Y be finite-dimensional vector spaces over the same field F. Recall that a mapping T:X→Y is called linear if T(a_1x_1 + a_2x_2) = a_1T(x_1) + a_2T(x_2) for all x_1, x_2 ∈ X and a_1, a_2 ∈ F. T is also called a linear operator or linear transformation.

If dim(X) = n and dim(Y) = m, we choose a basis {e_1, e_2, …, e_n} for X and a basis {f_1, f_2, …, f_m} for Y. An m × n matrix A of elements of F corresponds to a linear transformation T:X→Y in the following way: for each integer k, 1 ≤ k ≤ n, there are unique elements s_{1,k}, s_{2,k}, …, s_{m,k} of F such that

$$Te_k = \sum_{j=1}^{m}s_{j,k}f_j. \qquad (3.1)$$

Each point x ∈ X has a unique representation in the form x = Σ_{k=1}^{n} ξ_k e_k, where ξ_1, ξ_2, …, ξ_n are in F. Hence,

$$Tx = \sum_{k=1}^{n}\xi_k\,Te_k = \sum_{k=1}^{n}\xi_k\Bigl(\sum_{j=1}^{m}s_{j,k}f_j\Bigr) = \sum_{j=1}^{m}\Bigl(\sum_{k=1}^{n}s_{j,k}\xi_k\Bigr)f_j. \qquad (3.2)$$

If η_1, η_2, …, η_m are the coordinates of Tx with respect to the basis {f_1, f_2, …, f_m}, then η_j = Σ_{k=1}^{n} s_{j,k}ξ_k. In this sense, the matrix A = [s_{j,k}] corresponds to the linear transformation T. It is also said that the matrix A represents the linear transformation T with respect to the aforementioned bases of X and Y.

H.L. Vasudeva, Elements of Hilbert Spaces and Operator Theory,

DOI 10.1007/978-981-10-3020-8_3


Conversely, an m × n matrix [s_{j,k}] of elements of F determines a linear mapping T:X→Y in the following manner. Consider an x ∈ X. It has a unique representation in the form x = Σ_{k=1}^{n} ξ_k e_k, where ξ_1, ξ_2, …, ξ_n are in F. Set

$$\eta_j = \sum_{k=1}^{n}s_{j,k}\xi_k, \quad j = 1, 2, \ldots, m, \qquad (3.3)$$

and

$$Tx = \sum_{j=1}^{m}\eta_j f_j = \sum_{j=1}^{m}\Bigl(\sum_{k=1}^{n}s_{j,k}\xi_k\Bigr)f_j. \qquad (3.4)$$

T determines a unique m × n matrix representing T with respect to a given basis for X and a given basis for Y, where the vectors of each basis are arranged in a fixed order, and conversely.

Questions about the system (3.3) can be formulated as questions about T. For example, for which η_1, η_2, …, η_m does the system (3.3) have a solution ξ_1, ξ_2, …, ξ_n? This amounts to asking for a description of the range of T.
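The passage from T to the system (3.3) is just matrix–vector multiplication in coordinates. The following small sketch illustrates it (the matrix S is an arbitrary example, not from the text):

```python
# The correspondence (3.1)-(3.4) in coordinates: applying T to x amounts
# to the matrix-vector product eta_j = sum_k s[j][k] * xi[k].

def apply_matrix(S, xi):
    """Coordinates of Tx, given the m x n matrix S = [s_{j,k}] of T."""
    return [sum(S[j][k] * xi[k] for k in range(len(xi)))
            for j in range(len(S))]

# example T: F^3 -> F^2 with Te_1 = f_1, Te_2 = f_1 + f_2, Te_3 = 2f_2;
# column k of S lists the coordinates of Te_k, exactly as in (3.1)
S = [[1, 1, 0],
     [0, 1, 2]]
assert apply_matrix(S, [1, 0, 0]) == [1, 0]
assert apply_matrix(S, [2, 3, 1]) == [5, 5]
```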

The most complete and satisfying results about (3.3) are obtained when m = n. Indeed, if m = n, the system (3.3) has a unique solution if, and only if, the matrix [s_{j,k}] is nonsingular, equivalently, the linear operator T determined by the matrix [s_{j,k}] is one-to-one (or onto). In particular, if X = Y and e_j = f_j, j = 1, …, n, the operator T maps X to itself. If p is a polynomial, then p(T) makes sense. The study of p(T) can provide insight about T. For example, λ is an eigenvalue of T if, and only if, it is a root of the characteristic polynomial det(λI − T).

Recall that if H is a Hilbert space and M is a closed subspace of H, then the mappings P_M from H onto M and P_{M^⊥} from H onto M^⊥ are linear [see Theorem 2.10.15]. We give below a formal definition of a linear operator.

Definition 3.1.1 Let X and Y be linear spaces (vector spaces) over the same scalar field F, say. A mapping T defined over a linear subspace D of X, written D(T), and taking values in Y is said to be a linear operator if

T(ax + by) = aTx + bTy for all x, y ∈ D(T) and all a, b ∈ F.

The definition implies, in particular, that

T(0) = 0, T(−x) = −T(x).

We denote

ran(T) = {y ∈ Y : y = Tx for some x ∈ D(T)}

and

ker(T) = {x ∈ D(T) : Tx = 0}.

We call D(T) the domain, ran(T) the range and ker(T) the kernel, respectively, of the operator T. A linear operator is also called a linear transformation with domain D(T) ⊆ X into Y. If the range ran(T) is contained in the scalar field F, then T is called a linear functional [see Definition 2.10.18] on D(T). If a linear operator gives a one-to-one map (x_1 ≠ x_2 ⟹ Tx_1 ≠ Tx_2 or, equivalently, Tx_1 = Tx_2 ⟹ x_1 = x_2) of D(T) onto ran(T), then the inverse map T⁻¹ gives a linear operator from ran(T) onto D(T):

T⁻¹Tx = x for x ∈ D(T) and TT⁻¹y = y for y ∈ ran(T).

The following proposition is an easy consequence of the linearity of T.

Proposition 3.1.2 A linear operator T admits an inverse T⁻¹ if, and only if, Tx = 0 implies x = 0.

Proof Suppose Tx = 0 implies x = 0. Let Tx_1 = Tx_2. Since T is linear,

T(x_1 − x_2) = Tx_1 − Tx_2 = 0,

so that x_1 = x_2 by hypothesis.

Conversely, if T⁻¹ exists, then Tx_1 = Tx_2 implies x_1 = x_2. Let Tx = 0. Since T is linear, T0 = 0 = Tx, so that x = 0 by hypothesis. ∎

Example 3.1.3 Let X be the vector space of all real-valued functions which are defined over ℝ and have derivatives of all orders everywhere on ℝ. Define T:X→X by y(t) = Tx(t) = x′(t). Then ran(T) = X. Indeed, for y ∈ X, we have y = Tx, where x(t) = ∫_0^t y(s)ds. Since Tx = 0 for every constant function x, T⁻¹ does not exist.

Definition 3.1.4 Let T_1 and T_2 be linear operators with domains D(T_1) and D(T_2) both contained in a linear space X and ranges ran(T_1) and ran(T_2) both contained in a linear space Y. Then T_1 = T_2 if, and only if, D(T_1) = D(T_2) and T_1x = T_2x for all x ∈ D(T_1) = D(T_2). If D(T_1) ⊆ D(T_2) and T_1x = T_2x for all x ∈ D(T_1), then T_2 is called an extension of T_1 and T_1 a restriction of T_2. We shall write T_1 ⊆ T_2.

We shall abbreviate “D(T)” to simply “D” when there is only one operator under consideration.

The following is a special case of bijective mappings between sets.

Proposition 3.1.5 Let T:X→Y and S:Y→Z be bijective linear operators, where X, Y, Z are linear spaces over the same scalar field F. Then the inverse (ST)⁻¹:Z→X of the product (composition) of S and T exists and satisfies

(ST)⁻¹ = T⁻¹S⁻¹.

Remark 3.1.6 The identity map, a composition of linear maps and the inverse of a linear map (when it exists) are all linear.

3.2 Bounded and Continuous Linear Operators

Every linear functional is a linear transformation between the linear space and the one-dimensional scalar field underlying the linear space. The study of continuous linear functionals on inner product spaces, and more specifically on Hilbert spaces, has yielded many valuable results [Sect. 2.10]. It seems natural to attempt generalising these considerations to linear transformations (operators) from a Hilbert space into itself. The interplay between algebraic notions and metric structure proves interesting and useful in applications.

Definition 3.2.1 Let X and Y be normed linear spaces and T:D→Y a linear operator, where D ⊆ X. T is said to be continuous at x_0 ∈ D if lim_{x→x_0} Tx = Tx_0. T is continuous in D if it is continuous at each point of D.

A linear operator is bounded if

$$\sup_{x\in D,\ \|x\|\le 1}\|Tx\| < \infty.$$

The left member of the above inequality is called the norm of the operator T in D, provided it is finite, and is denoted by the symbol ||T|| or sometimes by ||T||_D. If M ≥ ||T||_D, then M is called a bound of T. The infimum of all bounds M is the norm ||T||_D.

Remarks 3.2.2

(i) If x ∈ D and x ≠ 0, then by the definition of the norm of T,

$$\Bigl\|T\Bigl(\frac{x}{\|x\|}\Bigr)\Bigr\| \le \|T\|_D,$$

so that ||Tx|| ≤ ||T||_D\,||x||. It is easily seen that this inequality holds also when x = 0 (the two sides are both zero in this event), and therefore

$$\|Tx\| \le \|T\|\,\|x\| \quad \text{for all } x \in D. \qquad (3.5)$$

(ii) It follows from the relation (3.5) and the linearity of T that T is uniformly continuous. Indeed, by (3.5),

$$\|Tx - Ty\| = \|T(x - y)\| \le \|T\|\,\|x - y\|, \quad x, y \in D.$$


(iv) Now assume that D ≠ {0}. Then it follows from (3.5) and (3.6) and the equality ||T(ax)|| = |a|\,||Tx|| that ||T|| can be defined as

$$\|T\| = \sup_{x\in D,\ \|x\|=1}\|Tx\| \qquad (3.7)$$

or equivalently by

$$\|T\| = \sup_{x\in D,\ x\neq 0}\frac{\|Tx\|}{\|x\|}. \qquad (3.8)$$

To summarise,

$$\|T\| = \sup_{x\in D,\ \|x\|=1}\|Tx\| = \sup_{x\in D,\ \|x\|\le 1}\|Tx\| = \sup_{x\in D,\ x\neq 0}\frac{\|Tx\|}{\|x\|}, \qquad (3.9)$$

and relation (3.9) defines the norm of a bounded linear operator from D ⊆ X into Y.
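For a concrete feel for (3.9), the hedged sketch below computes ||T|| for an arbitrary 2 × 2 real matrix two ways: exactly, as the largest singular value (the square root of the larger eigenvalue of AᵀA, found by the quadratic formula), and by sampling ||Tx|| over unit vectors:

```python
# Operator norm of a 2x2 real matrix: sup{||Ax|| : ||x|| = 1} equals the
# largest singular value of A.
from math import sqrt, cos, sin, pi, hypot

A = [[3.0, 1.0],
     [0.0, 2.0]]                  # arbitrary example matrix

# B = A^T A, then the larger eigenvalue of B via the quadratic formula
B = [[A[0][0]**2 + A[1][0]**2,          A[0][0]*A[0][1] + A[1][0]*A[1][1]],
     [A[0][0]*A[0][1] + A[1][0]*A[1][1], A[0][1]**2 + A[1][1]**2]]
tr = B[0][0] + B[1][1]
det = B[0][0]*B[1][1] - B[0][1]*B[1][0]
norm_T = sqrt((tr + sqrt(tr*tr - 4*det)) / 2)

# compare with a direct supremum over unit vectors x = (cos t, sin t)
sampled = max(hypot(A[0][0]*cos(t) + A[0][1]*sin(t),
                    A[1][0]*cos(t) + A[1][1]*sin(t))
              for t in (2*pi*k/100000 for k in range(100000)))
assert abs(sampled - norm_T) < 1e-5
```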

Proposition 3.2.3 Let X and Y be normed linear spaces over the same field of scalars and D ⊆ X be the domain of a linear operator T from D into Y. Then the following conditions are equivalent:

(a) T is continuous at a given x_0 ∈ D;
(b) T is bounded; and
(c) T is continuous everywhere in D and the continuity is uniform.

Proof (a) implies (b). Suppose T is continuous at x_0 ∈ D. Then for given ε > 0, there is a δ > 0 such that ||Tx − Tx_0|| < ε for all x ∈ D satisfying ||x − x_0|| < δ. We now take y ≠ 0 in D and set

$$x = x_0 + \frac{\delta}{2\|y\|}\,y.$$


Then

$$x - x_0 = \frac{\delta}{2\|y\|}\,y.$$

Hence ||x − x_0|| = δ/2 < δ, so that we have ||Tx − Tx_0|| < ε. Since T is linear, we obtain

$$\|Tx - Tx_0\| = \|T(x - x_0)\| = \Bigl\|T\Bigl(\frac{\delta}{2\|y\|}\,y\Bigr)\Bigr\| = \frac{\delta}{2\|y\|}\|Ty\| < \varepsilon,$$

so that ||Ty|| < (2ε/δ)||y|| = M||y||, where M = 2ε/δ. Thus, T is bounded.

(b) implies (c). Suppose T is bounded and M > 0 is a bound. Then for x, y ∈ D, we have ||Tx − Ty|| = ||T(x − y)|| ≤ M||x − y||. Let ε > 0 and δ = ε/M. Then ||x − y|| < δ implies ||T(x − y)|| < Mδ = ε. Since x, y ∈ D are arbitrary, T is uniformly continuous on D and hence continuous everywhere on D.

(c) implies (a). Trivial. ∎

Remark The terms continuous linear operator and bounded linear operator will be

used interchangeably.

Many properties of linear functionals generalise easily to linear operators. The

analogue of the dual space is the space of all continuous linear operators from a

normed linear space X into a normed linear space Y (which may or may not be the

same as X) and is denoted by B(X,Y). Note that in this context D = X. We abbreviate

B(X, X) as B(X).

First of all, B(X, Y) becomes a vector space if we define the sum T_1 + T_2 of two operators T_1, T_2 in B(X, Y) in a natural way,

(T_1 + T_2)x = T_1x + T_2x,

and the scalar multiple aT of T ∈ B(X, Y) by

(aT)x = a(Tx).

Since

sup{||(T_1 + T_2)x|| : x ∈ X and ||x|| = 1} ≤ sup{||T_1x|| : x ∈ X and ||x|| = 1} + sup{||T_2x|| : x ∈ X and ||x|| = 1} = ||T_1|| + ||T_2||,

it follows that ||T_1 + T_2|| ≤ ||T_1|| + ||T_2||. Similarly, ||aT|| = |a|\,||T||, and clearly

||T|| = 0 implies T = O.

These imply that B(X, Y) is a normed vector space (linear space) over the scalar field F.

Theorem 3.2.4 If Y is a Banach space, then B(X, Y) is a Banach space.

Proof Let {T_n}_{n≥1} be a Cauchy sequence in B(X, Y). Then for any x ∈ X, ||T_nx − T_mx|| ≤ ||T_n − T_m||\,||x||, so that {T_nx}_{n≥1} is a Cauchy sequence in Y; since Y is complete, it converges, say T_nx → y. Clearly, the limit y depends on x. This defines a map T:X→Y, where y = Tx = lim_n T_nx. The map T is a linear operator, since

T(a_1x_1 + a_2x_2) = lim_n T_n(a_1x_1 + a_2x_2) = a_1 lim_n T_n(x_1) + a_2 lim_n T_n(x_2) = a_1Tx_1 + a_2Tx_2.

We prove that T is bounded and ||T_n − T|| → 0 as n → ∞. The sequence {T_n}_{n≥1}, being Cauchy, is bounded, i.e., there exists an M > 0 such that ||T_n|| ≤ M, n = 1, 2, …. For any x ∈ X, ||T_nx|| ≤ ||T_n||\,||x|| ≤ M||x||. Consequently, ||Tx|| = lim_n||T_nx|| ≤ M||x||. This proves that T is bounded. It remains to show that ||T_n − T|| → 0 as n → ∞.

Let ε > 0. There exists n_0 such that m, n ≥ n_0 implies ||T_n − T_m|| < ε. Then, for every x with ||x|| ≤ 1 and m, n ≥ n_0, ||T_nx − T_mx|| < ε; letting m → ∞ gives ||T_nx − Tx|| ≤ ε. This implies that ||T_n − T|| ≤ ε for n ≥ n_0, so that ||T_n − T|| → 0 as n → ∞. ∎


Theorem 2.10.23.

Example 3.2.5

(i) (Identity operator) Let H be a Hilbert space. The identity operator I:H→H defined by Ix = x, x ∈ H, is linear and bounded with ||I|| = 1 when H ≠ {0}.

(ii) (Zero operator) The zero operator O on H, defined by Ox = 0, x ∈ H, is linear and ||O|| = 0.

(iii) If H is a Hilbert space of finite dimension and T is a linear mapping of H into H, then T is continuous. For, let e_1, e_2, …, e_n be an orthonormal basis for H. If x = Σ_{k=1}^{n} ξ_k e_k is any vector in H, then

$$\|Tx\| = \Bigl\|\sum_{k=1}^{n}\xi_k(Te_k)\Bigr\| \le \sum_{k=1}^{n}|\xi_k|\,\|Te_k\| \le \Bigl(\sum_{k=1}^{n}|\xi_k|^2\Bigr)^{1/2}\Bigl(\sum_{k=1}^{n}\|Te_k\|^2\Bigr)^{1/2} = M\|x\|,$$

where

$$M = \Bigl(\sum_{k=1}^{n}\|Te_k\|^2\Bigr)^{1/2}$$

is independent of x.
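The bound in (iii) is easy to test numerically. In the sketch below (the 3 × 3 matrix, whose columns are the coordinates of the Te_k, is an arbitrary example), the inequality ||Tx|| ≤ M||x|| is checked on random vectors:

```python
# M = (sum_k ||T e_k||^2)^(1/2) bounds the operator: ||Tx|| <= M ||x||.
from math import sqrt
import random

def matvec(A, x):
    return [sum(A[j][k] * x[k] for k in range(len(x))) for j in range(len(A))]

def norm(v):
    return sqrt(sum(c * c for c in v))

A = [[1.0, -2.0, 0.5],
     [0.0,  3.0, 1.0],
     [2.0,  0.0, -1.0]]
# columns of A are the coordinates of T e_k, so M is the square root of
# the sum of the squares of all entries
M = sqrt(sum(A[j][k] ** 2 for j in range(3) for k in range(3)))

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(3)]
    assert norm(matvec(A, x)) <= M * norm(x) + 1e-12
```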

(iv) Let T be a linear operator defined on a Hilbert space H ≠ {0} by the formula

Tx = ax, x ∈ H,

where a is a fixed scalar. Then ||Tx|| = |a|\,||x||. Consequently,

$$\|T\| = \sup_{x\in H,\ \|x\|=1}\|Tx\| = \sup_{x\in H,\ \|x\|=1}|a|\,\|x\| = |a|.$$

(v) Let M be a closed subspace of the Hilbert space H. Every x ∈ H can be written as x = y + z, where y ∈ M and z ∈ M^⊥, and this representation is unique [see Remark 2.10.12]. Define T:H→H by the formula

Tx = y, x ∈ H.

We know that T is linear and ||Tx||² = ||y||² ≤ ||y||² + ||z||² = ||y + z||² = ||x||² [see Theorem 2.10.15]. Thus, T is a bounded linear operator on H and ||T|| ≤ 1. Indeed, ||T|| = 1 when M ≠ {0}; for x ∈ M, Tx = x, and hence ||Tx|| = ||x||. Recall that this operator is called the projection on M and is denoted by P_M [see Definition 2.10.16].

(vi) (Multiplication operator) Let (X, ℳ, µ) be a σ-finite measure space and H = L²(X, ℳ, µ) be the Hilbert space of square integrable functions defined on X. For an essentially bounded measurable function y on X, define (Tx)(t) = y(t)x(t), x ∈ H and t ∈ X. Clearly, T is a bounded linear operator on H. Indeed,

$$\|Tx\|_2^2 = \int_X |y(t)|^2|x(t)|^2\,d\mu(t) \le \operatorname*{ess\,sup}_{t\in X}|y(t)|^2\int_X |x(t)|^2\,d\mu(t) = \|y\|_\infty^2\|x\|_2^2, \quad x \in H.$$

Thus ||T|| ≤ ||y||_∞. Indeed, ||T|| = ||y||_∞, as the following argument shows: if µ(X) = 0, then H = {0} and ||T|| = 0 = ||y||_∞. Suppose µ(X) > 0. If ε > 0, the σ-finiteness of the measure space implies that there is a measurable set F ⊆ X, 0 < µ(F) < ∞, such that |y(t)| ≥ ||y||_∞ − ε on F. If f = (µ(F))^{−1/2}χ_F, then f ∈ L²(X, ℳ, µ) and ||f||_2 = 1. So,

$$\|Tf\|^2 = \int_X |y(t)|^2(\mu(F))^{-1}\chi_F(t)\,d\mu(t) = (\mu(F))^{-1}\int_F |y(t)|^2\,d\mu(t) \ge (\mu(F))^{-1}\bigl(\|y\|_\infty - \varepsilon\bigr)^2\mu(F) = \bigl(\|y\|_\infty - \varepsilon\bigr)^2,$$

which implies ||T|| ≥ ||y||_∞ − ε, as ||f||_2 = 1. Since ε > 0 is arbitrary, we get ||T|| ≥ ||y||_∞.

The operator T is called a multiplication operator.
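A discrete analogue illustrates both the bound ||T|| ≤ ||y||_∞ and its attainment; in this hedged sketch, X is a finite set with counting measure and the multiplier list y is an arbitrary choice:

```python
# Discrete multiplication operator: (Tx)(t) = y(t) x(t) on a finite set,
# where ||T|| = max |y(t)|, attained at a coordinate maximising |y|.
from math import sqrt

y = [0.5, -2.0, 1.5, 0.25]                 # arbitrary multiplier
norm2 = lambda v: sqrt(sum(c * c for c in v))
T = lambda x: [yi * xi for yi, xi in zip(y, x)]

# ||Tx|| <= ||y||_inf * ||x|| for every x ...
x = [1.0, 2.0, -1.0, 3.0]
assert norm2(T(x)) <= max(abs(c) for c in y) * norm2(x) + 1e-12

# ... with equality for the indicator of the maximising coordinate
e = [0.0, 1.0, 0.0, 0.0]                   # |y| peaks at index 1
assert abs(norm2(T(e)) - 2.0) < 1e-12
```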

(vii) Let H be a separable Hilbert space and {e_i}_{i≥1} be an orthonormal basis in H. Define T:H→H as follows:

Te_i = e_{i+1}, i = 1, 2, ….

If x ∈ H, then x = Σ_{k=1}^{∞} λ_k e_k, λ_k ∈ F, k = 1, 2, …, where Σ_{k=1}^{∞}|λ_k|² < ∞. In particular, Σ_{k=1}^{∞} λ_k e_{k+1} is an element of H. Define Tx = Σ_{k=1}^{∞} λ_k Te_k = Σ_{k=1}^{∞} λ_k e_{k+1}. Clearly T is linear. Moreover,

$$\|Tx\|^2 = \Bigl\|\sum_{k=1}^{\infty}\lambda_k e_{k+1}\Bigr\|^2 = \sum_{k=1}^{\infty}|\lambda_k|^2\|e_{k+1}\|^2 = \sum_{k=1}^{\infty}|\lambda_k|^2,$$

while

$$\|x\|^2 = \Bigl\|\sum_{k=1}^{\infty}\lambda_k e_k\Bigr\|^2 = \sum_{k=1}^{\infty}|\lambda_k|^2\|e_k\|^2 = \sum_{k=1}^{\infty}|\lambda_k|^2.$$

Thus ||Tx|| = ||x||, so that T is bounded with ||T|| = 1.

The operator described above is called the simple unilateral shift.
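On finitely supported sequences the shift just prepends a zero. The sketch below (an illustration, not from the text) checks that it preserves the norm while e_1 is missed by the range:

```python
# The simple unilateral shift on finitely supported sequences: it is an
# isometry (||Tx|| = ||x||) but not surjective, since nothing maps onto e_1.
from math import sqrt

shift = lambda x: [0.0] + list(x)
norm2 = lambda v: sqrt(sum(abs(c) ** 2 for c in v))

x = [3.0, -1.0, 2.0]
assert abs(norm2(shift(x)) - norm2(x)) < 1e-12   # isometry
assert shift(x)[0] == 0.0                        # first coordinate always 0
```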

(viii) Let (X, ℳ, µ) be a σ-finite measure space and k:X × X→ℂ be an ℳ × ℳ-measurable function for which there are constants c_1 and c_2 such that

$$\int_X |k(s,t)|\,d\mu(t) \le c_1 \quad \text{a.e. } [\mu], \qquad \int_X |k(s,t)|\,d\mu(s) \le c_2 \quad \text{a.e. } [\mu].$$

For x ∈ L²(µ), define

$$(Kx)(s) = \int_X k(s,t)x(t)\,d\mu(t).$$

Then K is a bounded linear operator on L²(µ) with ||K|| ≤ (c_1c_2)^{1/2}. Indeed, by the Cauchy–Schwarz inequality,

$$|(Kx)(s)| \le \int_X |k(s,t)|\,|x(t)|\,d\mu(t) = \int_X |k(s,t)|^{1/2}\cdot|k(s,t)|^{1/2}|x(t)|\,d\mu(t) \le \Bigl[\int_X |k(s,t)|\,d\mu(t)\Bigr]^{1/2}\Bigl[\int_X |k(s,t)|\,|x(t)|^2\,d\mu(t)\Bigr]^{1/2} \le c_1^{1/2}\Bigl[\int_X |k(s,t)|\,|x(t)|^2\,d\mu(t)\Bigr]^{1/2} \quad \text{a.e. } [\mu].$$

Hence (the integrand being nonnegative, the order of integration may be interchanged),

$$\int_X |(Kx)(s)|^2\,d\mu(s) \le c_1\int_X\int_X |k(s,t)|\,|x(t)|^2\,d\mu(t)\,d\mu(s) = c_1\int_X |x(t)|^2\int_X |k(s,t)|\,d\mu(s)\,d\mu(t) \le c_1c_2\|x\|_2^2.$$

The above argument shows that the formula used to define Kx is such that Kx is finite a.e. [µ], Kx ∈ L²(µ) and ||Kx||_2 ≤ (c_1c_2)^{1/2}||x||_2.

The operator K described above is called an integral operator, and the function k is called its kernel.

(ix) A particular instance of the integral operator described above is known as the Volterra operator. Let k:[0,1] × [0,1]→F be the characteristic function of the set {(s,t) ∈ [0,1] × [0,1] : t < s}. The corresponding operator V:L²[0,1]→L²[0,1] is defined by

$$(Vx)(s) = \int_0^s x(t)\,dt, \quad x \in L^2[0,1].$$

Then, by the Cauchy–Schwarz inequality,

$$|(Vx)(s)|^2 \le \Bigl(\int_0^s |x(t)|\,dt\Bigr)^2 \le \Bigl(\int_0^s dt\Bigr)\Bigl(\int_0^s |x(t)|^2\,dt\Bigr) = s\Bigl(\int_0^s |x(t)|^2\,dt\Bigr).$$

Consequently,

$$\int_0^1 |(Vx)(s)|^2\,ds \le \int_0^1 s\int_0^s |x(t)|^2\,dt\,ds \le \int_0^1 s\int_0^1 |x(t)|^2\,dt\,ds = \int_0^1 s\,ds\int_0^1 |x(t)|^2\,dt = \frac{1}{2}\|x\|_2^2.$$

So,

$$\|Vx\|_2^2 \le \frac{1}{2}\|x\|_2^2.$$
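The bound ||Vx||₂² ≤ ½||x||₂² can be checked with midpoint Riemann sums; in the sketch below the test functions and grid size are arbitrary choices (an illustration, not from the text):

```python
# Numerical check of the Volterra bound ||Vx||_2 <= (1/sqrt(2)) ||x||_2.
from math import sqrt, sin, pi

def l2norm(f, n=2000):
    return sqrt(sum(f((i + 0.5) / n) ** 2 for i in range(n)) / n)

def V(x, n=2000):
    """(Vx)(s) = integral of x from 0 to s, via a running midpoint sum."""
    vals, acc = [], 0.0
    for i in range(n):
        acc += x((i + 0.5) / n) / n
        vals.append(acc)
    return lambda s: vals[min(int(s * n), n - 1)]

for x in (lambda t: 1.0, lambda t: sin(pi * t), lambda t: t - 0.5):
    assert l2norm(V(x)) <= l2norm(x) / sqrt(2) + 1e-3
```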

(x) Let H be the Hilbert space L²[0,1] of square integrable functions defined on [0,1] and D = C¹[0,1] be the linear subspace of continuously differentiable functions. Define T:D→H by the rule Tx = x′. Then T is a linear operator which is not bounded. Indeed, with x_n(t) = sin nπt, we have Tx_n(t) = nπ cos nπt and

$$\int_0^1 |Tx_n(t)|^2\,dt = (n\pi)^2\int_0^1 \cos^2 n\pi t\,dt = (n\pi)^2\int_0^1 \frac{\cos 2n\pi t + 1}{2}\,dt = \frac{(n\pi)^2}{2}.$$

Also,

$$\|x_n\|^2 = \int_0^1 \sin^2 n\pi t\,dt = \int_0^1 \frac{1 - \cos 2n\pi t}{2}\,dt = \frac{1}{2}.$$

Thus,

$$\|T\| = \sup_{x\in D,\ \|x\|\neq 0}\frac{\|Tx\|}{\|x\|} \ge \sup_n \frac{\|Tx_n\|}{\|x_n\|} = \sup_n\,(n\pi) = \infty.$$
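The computation above is easy to reproduce numerically. In this illustrative sketch, the norms of x_n and Tx_n are approximated by midpoint Riemann sums and the ratios come out close to nπ, growing without bound:

```python
# The differentiation operator is unbounded: for x_n(t) = sin(n*pi*t),
# the ratio ||Tx_n|| / ||x_n|| equals n*pi.
from math import sqrt, sin, cos, pi

def l2norm(f, m=20000):
    return sqrt(sum(f((i + 0.5) / m) ** 2 for i in range(m)) / m)

ratios = []
for n in (1, 5, 25):
    xn = lambda t, n=n: sin(n * pi * t)
    dxn = lambda t, n=n: n * pi * cos(n * pi * t)
    ratios.append(l2norm(dxn) / l2norm(xn))

for n, r in zip((1, 5, 25), ratios):
    assert abs(r - n * pi) < 0.05 * n * pi   # ratio ~ n*pi, so sup is infinite
```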

Problem Set 3.2

3:2:P1. Let [s_{i,j}]_{i,j≥1} be an infinite matrix [that is, a double sequence {s_{i,j}}_{i,j≥1} normally presented as an array] with K² = Σ_{i,j=1}^{∞} |s_{i,j}|² < ∞. The operator T is defined on ℓ² by

T({x_i}_{i≥1}) = {y_i}_{i≥1}, where y_i = Σ_{j=1}^{∞} s_{i,j}x_j, i = 1, 2, ….

Show that T is a bounded linear operator on ℓ² with ||T|| ≤ K.

3:2:P2. Let H be a separable Hilbert space and {e_i}_{i≥1} be an orthonormal basis. Let T:H→H be a bounded linear operator. Show that T is determined by the matrix [(Te_j, e_i)]_{i,j≥1}.

3:2:P3. Let [s_{i,j}]_{i,j≥1} be an infinite matrix such that

$$a_1 = \sup_j \sum_{i=1}^{\infty}|s_{i,j}| < \infty \quad\text{and}\quad a_\infty = \sup_i \sum_{j=1}^{\infty}|s_{i,j}| < \infty.$$

Show that there exists a bounded operator T on ℓ² with (Te_j, e_i) = s_{i,j}.

3:2:P4. Show that if [s_{i,j}]_{i,j≥1} is an infinite matrix and {p_i}_{i≥1} a sequence of positive numbers such that

$$\sum_{i=1}^{\infty}|s_{i,j}|\,p_i \le a_1 p_j, \quad j = 1, 2, \ldots, \qquad \sum_{j=1}^{\infty}|s_{i,j}|\,p_j \le a_\infty p_i, \quad i = 1, 2, \ldots,$$

then there exists an operator T on ℓ² with (Te_j, e_i) = s_{i,j} and ||T||² ≤ a_1a_∞.


3:2:P5. Show that the matrix [1/(i + j − 1)]_{i,j≥1} defines a bounded linear operator T on ℓ² with ||T|| ≤ π. (The matrix is known as the Hilbert matrix.)

3:2:P6. Let {e_n}_{n≥1} be the usual basis for ℓ² and {a_n}_{n≥1} be a sequence of scalars. Show that there is a bounded linear operator T on ℓ² such that Te_n = a_ne_n for all n if, and only if, {a_n}_{n≥1} is bounded. This type of operator is called a diagonal operator.

3:2:P7. (Laplace transform) Let x(t) be a complex-valued function on ℝ⁺ = {t ∈ ℝ : t ≥ 0}. Its Laplace transform Lx is the function on ℝ⁺ defined by

$$y(s) = (Lx)(s) = \int_0^{\infty}x(t)e^{-st}\,dt.$$

Show that the Laplace transform is a bounded linear map of L²(ℝ⁺) into itself and ||L|| = √π.

3:2:P8. Find an operator T on ℝ² for which (Tx, x) = 0 for all x and ||T|| = 1.

3:2:P9. If M is a total subset of a Hilbert space H and S, T ∈ B(H) are such that Sx = Tx for all x ∈ M, then S = T.

3:2:P10. Let H = L²[0,1] and

$$k(s,t) = \begin{cases}0 & \text{if } 0 \le s < t \le 1,\\[4pt] \dfrac{1}{\sqrt{s - t}} & \text{if } 0 \le t < s \le 1.\end{cases}$$

For x ∈ H, define

$$(Kx)(s) = \int_0^1 k(s,t)x(t)\,dt.$$

3:2:P11. Let {a_i}_{i≥1} be a sequence of complex numbers. Define an operator D_a on ℓ² by

D_a({x_i}_{i≥1}) = {a_ix_i}_{i≥1}.

Prove that D_a is bounded if, and only if, {a_i}_{i≥1} is bounded, and in this case ||D_a|| = sup_i|a_i|.

3:2:P12. Let H_1 and H_2 be Hilbert spaces. Define H_1 ⊕ H_2 [see Sect. 2.7] to be the Hilbert space consisting of all pairs ⟨u_1, u_2⟩, u_i ∈ H_i, i = 1, 2, with the linear operations defined componentwise, so that in particular

λ⟨u_1, u_2⟩ = ⟨λu_1, λu_2⟩.

For A_1 ∈ B(H_1) and A_2 ∈ B(H_2), define A on H = H_1 ⊕ H_2 by the matrix

$$A = \begin{bmatrix}A_1 & 0\\ 0 & A_2\end{bmatrix},$$

i.e., A⟨u_1, u_2⟩ = ⟨A_1u_1, A_2u_2⟩. Prove that A ∈ B(H) and that ||A|| = max{||A_1||, ||A_2||}.

3:2:P13. Let ℓ²(ℤ) be the Hilbert space of all sequences {ξ_j}_{j∈ℤ} with Σ_{j=−∞}^{∞} |ξ_j|² < ∞ and the usual inner product. Define an operator S:ℓ²(ℤ)→ℓ²(ℤ) by the formula

S({ξ_j}_{j∈ℤ}) = {ξ_{j−1}}_{j∈ℤ}.

Show that ||Sx|| = ||x|| for any x ∈ ℓ²(ℤ). Give a formula and a matrix representation for the operator Sⁿ for n ∈ ℤ.

3.3 The Algebra of Operators

For a normed linear space X and a Banach space Y, the space B(X, Y) of bounded linear operators from X to Y is a Banach space [Theorem 3.2.4] in the norm defined by

$$\|T\| = \sup_{\|x\|=1}\|Tx\| = \sup_{\|x\|\le 1}\|Tx\| = \sup_{x\neq 0}\frac{\|Tx\|}{\|x\|}.$$

In case X = Y = H, a Hilbert space, the space B(X, Y) is then denoted by B(H). It turns out that B(H) is a “Banach algebra”.

Definition 3.3.1 An algebra A over a field F is a vector space over F such that to each ordered pair of elements x, y ∈ A a unique product xy ∈ A is defined, with the properties

(xy)z = x(yz), x(y + z) = xy + xz, (x + y)z = xz + yz, a(xy) = (ax)y = x(ay)

for all x, y, z ∈ A and all scalars a ∈ F. An algebra A is said to be commutative if the multiplication is commutative, that is, for all x, y ∈ A,

xy = yx.

A is called an algebra with identity if it contains an element e such that for all x ∈ A, we have

xe = ex = x.

It may be noted that F and B(H) are algebras with identity.

Definition 3.3.2 A normed algebra is a normed space which is an algebra such that for all x, y ∈ A,

||xy|| ≤ ||x||\,||y||,

and, if A has an identity e,

||e|| = 1.

A Banach algebra is a normed algebra which is complete, considered as a normed space.

The space C[a, b] of continuous functions defined on [a, b] is a commutative Banach algebra in which the product is defined by

(xy)(t) = x(t)y(t)

and the norm by ||x|| = sup{|x(t)| : a ≤ t ≤ b}.

Theorem 3.3.3 (B(H), ||·||), where ||T|| = sup{||Tx|| : ||x|| ≤ 1}, T ∈ B(H), is a Banach algebra with identity, provided that H ≠ {0}.

Proof Since

$$\|(ST)x\| = \|S(Tx)\| \le \|S\|\,\|Tx\| \le \|S\|\,\|T\|\,\|x\|, \quad x \in H,$$

it follows that

||ST|| ≤ ||S||\,||T||.

That B(H) is a Banach space has been checked in Theorem 3.2.4. The operator I is the identity and satisfies ||I|| = 1 when H ≠ {0}. ∎


Remarks 3.3.4

(i) If the dimension of H is 2 or greater, the algebra B(H) is not commutative. For example,

$$\begin{bmatrix}1 & 0\\ 1 & 0\end{bmatrix}\begin{bmatrix}1 & 1\\ 0 & 0\end{bmatrix} = \begin{bmatrix}1 & 1\\ 1 & 1\end{bmatrix} \quad\text{and}\quad \begin{bmatrix}1 & 1\\ 0 & 0\end{bmatrix}\begin{bmatrix}1 & 0\\ 1 & 0\end{bmatrix} = \begin{bmatrix}2 & 0\\ 0 & 0\end{bmatrix}.$$

(ii) As in every algebra, Tⁿ will denote the product of n factors all equal to T, n = 1, 2, …; T⁰ is defined to be I, the identity operator. More generally, if p(λ) = Σ_{j=0}^{n} a_jλ^j is any polynomial, we shall use the symbol p(T), T ∈ B(H), for the operator Σ_{j=0}^{n} a_jT^j.

(iii) Let H be a Hilbert space different from {0}. We have seen that B(H) is a Banach algebra with identity I and norm ||T|| = sup{||Tx|| : ||x|| ≤ 1}.

From now on, the Hilbert space will always be assumed to contain nonzero vectors.

Definition 3.3.5 The sequence {T_n}_{n≥1} in B(H) converges to T ∈ B(H) in the uniform operator norm if lim_n||T_n − T|| = 0.

There are two other modes of convergence: strong operator convergence and weak operator convergence.

Definition 3.3.6 The sequence {T_n}_{n≥1} in B(H) converges strongly to T ∈ B(H) if, for each x ∈ H, lim_n||T_nx − Tx|| = 0. The sequence {T_n}_{n≥1} in B(H) converges weakly to T ∈ B(H) if, for all x, y ∈ H, lim_n|(T_nx, y) − (Tx, y)| = 0.

Clearly, uniform operator convergence implies strong operator convergence, and strong operator convergence implies weak operator convergence. The reverse implications, namely that strong operator convergence implies uniform operator convergence and that weak operator convergence implies strong operator convergence, are not true in general [see Problem 3.8.P1].

These are some of the important modes of convergence in B(H). They will suffice for any developments we contemplate.

The inverses of certain operators will be of concern in later sections. If T ∈ B(H), where H is of course a Hilbert space, and I is the identity operator, we shall be concerned with the operator (T − λI)⁻¹, λ ∈ ℂ. When H = ℂⁿ and T is a linear operator on H, the set of λ's for which (T − λI)⁻¹ does not exist is precisely the set of eigenvalues of T. When H is infinite-dimensional, the set of λ's for which (T − λI)⁻¹ does not exist will turn out to be a nonempty compact subset of the complex plane. Assuming that (T − λI)⁻¹ exists, in which case it is obviously linear, it will be of interest to know whether it is bounded. The treatment of the above question leads us into what is known as ‘spectral theory’ or ‘spectral analysis’.

Definition 3.3.7 Let T ∈ B(H). T is said to be invertible in B(H) if it has a set-theoretic inverse T⁻¹ and T⁻¹ ∈ B(H).

It is known that when the set-theoretic inverse T⁻¹ of an operator T ∈ B(H) exists, it is in B(H) [Theorem 5.5.2].

The following fundamental proposition will be used to show that the collection of invertible elements in B(H) is an open set and that inversion is continuous in the uniform operator norm.

Proposition 3.3.8 If T ∈ B(H) and ||I − T|| < 1, then T is invertible and

$$T^{-1} = \sum_{k=0}^{\infty}(I - T)^k,$$

the series converging in the norm of B(H); moreover,

$$\|T^{-1}\| \le \frac{1}{1 - \|I - T\|}.$$

Proof Write η = ||I − T|| < 1. For n > m,

$$\Bigl\|\sum_{k=0}^{n}(I - T)^k - \sum_{k=0}^{m}(I - T)^k\Bigr\| = \Bigl\|\sum_{k=m+1}^{n}(I - T)^k\Bigr\| \le \sum_{k=m+1}^{n}\|I - T\|^k = \sum_{k=m+1}^{n}\eta^k < \frac{\eta^{m+1}}{1 - \eta}.$$

The sequence of partial sums {Σ_{k=0}^{n}(I − T)^k}_{n≥0} is therefore Cauchy in B(H). If S = Σ_{k=0}^{∞}(I − T)^k, then

$$TS = [I - (I - T)]\Bigl(\sum_{k=0}^{\infty}(I - T)^k\Bigr) = \lim_n\,[I - (I - T)]\sum_{k=0}^{n}(I - T)^k = \lim_n\bigl[I - (I - T)^{n+1}\bigr] = I,$$

since ||(I − T)^{n+1}|| ≤ η^{n+1} → 0. A similar computation shows that ST = I, so that T is invertible with T⁻¹ = S. Moreover,

$$\|S\| = \lim_n\Bigl\|\sum_{k=0}^{n}(I - T)^k\Bigr\| \le \lim_n\sum_{k=0}^{n}\|I - T\|^k = \frac{1}{1 - \|I - T\|}. \qquad \square$$
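The Neumann series argument works verbatim for matrices. The sketch below (with an arbitrary 2 × 2 matrix T satisfying ||I − T|| < 1) sums the series Σ(I − T)^k and checks that the partial sum is an approximate inverse:

```python
# Neumann series: if ||I - T|| < 1 then T^{-1} = sum_k (I - T)^k.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

I = [[1.0, 0.0], [0.0, 1.0]]
T = [[1.0, 0.2], [0.1, 0.9]]      # example matrix with ||I - T|| < 1

D = [[I[i][j] - T[i][j] for j in range(2)] for i in range(2)]   # D = I - T
S = [row[:] for row in I]         # running partial sum of D^k
P = [row[:] for row in I]         # running power D^k
for _ in range(200):
    P = matmul(P, D)
    S = [[S[i][j] + P[i][j] for j in range(2)] for i in range(2)]

TS = matmul(T, S)                 # should be close to the identity
assert all(abs(TS[i][j] - I[i][j]) < 1e-9 for i in range(2) for j in range(2))
```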


Proposition 3.3.9 Let G denote the set of invertible elements of B(H). If T ∈ G and S ∈ B(H) satisfies ||S − T|| < 1/||T⁻¹||, then S is invertible. In particular, the set G is open in B(H). Moreover, the map T→T⁻¹ defined on G is continuous.

Proof Let T ∈ G. Consider {S ∈ B(H) : ||S − T|| < 1/||T⁻¹||}. Then 1 > ||T⁻¹||\,||S − T|| ≥ ||T⁻¹S − I||. The preceding Proposition 3.3.8 implies that T⁻¹S ∈ G, and hence S = T(T⁻¹S) is in G (the product of invertible elements is invertible). Thus, the ball of radius 1/||T⁻¹|| about each of its elements T, namely {S ∈ B(H) : ||S − T|| < 1/||T⁻¹||}, is contained in G. Consequently, G is an open subset of B(H).

It remains to show that the map T→T⁻¹ is continuous on G. If T ∈ G, then the inequality ||T − S|| < 1/(2||T⁻¹||) implies that ||I − T⁻¹S|| < 1/2, and hence

$$\|S^{-1}\| = \|(T^{-1}S)^{-1}T^{-1}\| \le \frac{\|T^{-1}\|}{1 - \|I - T^{-1}S\|} \le 2\|T^{-1}\|,$$

so that

$$\|S^{-1} - T^{-1}\| = \|T^{-1}(T - S)S^{-1}\| \le \|T^{-1}\|\,\|T - S\|\,\|S^{-1}\| \le 2\|T^{-1}\|^2\|T - S\|,$$

which shows that S⁻¹ → T⁻¹ as S → T. ∎

Remark 3.3.10 The reader is undoubtedly familiar with the equivalence of the following assertions when H is finite-dimensional:

(i) T is invertible;
(ii) T is injective;
(iii) T is surjective;
(iv) there exists S ∈ B(H) such that TS = I; and
(v) there exists S ∈ B(H) such that ST = I.

The above assertions are not equivalent in infinite-dimensional spaces. Let H = ℓ² and T denote the “right shift”:

T({x_i}_{i≥1}) = (0, x_1, x_2, …).

The operator T is injective but not surjective, and thus not invertible. The operator S defined by

S({x_i}_{i≥1}) = (x_2, x_3, x_4, …)

is surjective but not injective and thus also not invertible. Moreover, ST({x_i}_{i≥1}) = S(0, x_1, x_2, …) = (x_1, x_2, x_3, …), which means ST = I. The reader may note that TS ≠ I.

Furthermore, no operator in a ball of radius 1 around T is invertible. Indeed, if ||T − A|| < 1, then

||I − SA|| = ||S(T − A)|| ≤ ||S||\,||T − A|| < 1,

so that SA is invertible by Proposition 3.3.8. If A were invertible, so would be S; but this is not the case.
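The relations ST = I and TS ≠ I can be seen exactly on finitely supported sequences, modelled here as Python lists (an illustrative sketch, not from the text):

```python
# Right shift T and left shift S on finitely supported sequences:
# ST = I, while TS kills the first coordinate.
T = lambda x: [0] + list(x)      # right shift: (x1, x2, ...) -> (0, x1, ...)
S = lambda x: list(x)[1:]        # left shift: drop the first entry

x = [1, 2, 3]
assert S(T(x)) == x              # ST = I
assert T(S(x)) == [0, 2, 3]      # TS x != x whenever x1 != 0
```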

We next derive useful criteria for the invertibility of an operator.

Definition 3.3.11 An operator T ∈ B(H) is said to be bounded below if there exists an a > 0 such that ||Tx|| ≥ a||x|| for all x ∈ H.

An operator which is bounded below is clearly injective.

Theorem 3.3.12 An operator T ∈ B(H) is invertible if, and only if, it is bounded below and has dense range.

Proof If T is invertible, then the range of T is H and is therefore dense. Moreover,

$$\|Tx\| \ge \frac{1}{\|T^{-1}\|}\|T^{-1}Tx\| = \frac{1}{\|T^{-1}\|}\|x\|, \quad x \in H,$$

so that T is bounded below.

Conversely, if T is bounded below, there exists an a > 0 such that ||Tx|| ≥ a||x|| for all x ∈ H. Hence, if {Tx_n}_{n≥1} is a Cauchy sequence in H, then the inequality

$$\|x_n - x_m\| \le \frac{1}{a}\|Tx_n - Tx_m\|$$

shows that {x_n}_{n≥1} is a Cauchy sequence as well; let x = lim_n x_n. Then Tx = lim_n Tx_n, and hence ran(T) is closed. Since ran(T) is dense in H, it follows that ran(T) = H. As T is bounded below, this implies T⁻¹ is well defined. Moreover, if y = Tx, then

$$\|T^{-1}y\| = \|x\| \le \frac{1}{a}\|Tx\| = \frac{1}{a}\|y\|. \qquad \square$$

We proceed to study vector-valued functions, which will be needed in Sect. 4.3 below.

Definition 3.3.13 Let f be a function defined in a domain Ω of the complex plane whose values are in a complex Banach space X.

(a) f(ζ) is strongly holomorphic in Ω if

$$\lim_{h\to 0}\frac{f(\zeta + h) - f(\zeta)}{h}$$

exists in the norm of X for every ζ ∈ Ω.

(b) f(ζ) is weakly holomorphic in Ω if for every bounded linear functional F on X, F(f(ζ)) is holomorphic in Ω in the classical sense.

The words holomorphic and analytic will be used interchangeably, as is the usual practice.

Every strongly holomorphic function is weakly holomorphic. N. Dunford has proved the following surprising result.

Theorem 3.3.14 Let f:Ω→X be a weakly holomorphic function from Ω to X. Then f is strongly holomorphic.

Proof For a bounded linear functional F on X, F(f(ζ)) is holomorphic in Ω; so we can represent it by the Cauchy integral formula

$$F(f(\zeta)) = \frac{1}{2\pi i}\oint_\gamma \frac{F(f(z))}{z - \zeta}\,dz,$$

where γ is a simple closed rectifiable curve around ζ in Ω. Hence, for small |h| and |k|,

$$\frac{F(f(\zeta + h)) - F(f(\zeta))}{h} - \frac{F(f(\zeta + k)) - F(f(\zeta))}{k} = \frac{1}{2\pi i h}\oint_\gamma F(f(z))\Bigl[\frac{1}{z - \zeta - h} - \frac{1}{z - \zeta}\Bigr]dz - \frac{1}{2\pi i k}\oint_\gamma F(f(z))\Bigl[\frac{1}{z - \zeta - k} - \frac{1}{z - \zeta}\Bigr]dz = \frac{h - k}{2\pi i}\oint_\gamma \frac{F(f(z))}{(z - \zeta - h)(z - \zeta - k)(z - \zeta)}\,dz.$$

So,

$$\frac{1}{h - k}\biggl[\frac{F(f(\zeta + h)) - F(f(\zeta))}{h} - \frac{F(f(\zeta + k)) - F(f(\zeta))}{k}\biggr] = \frac{1}{2\pi i}\oint_\gamma \frac{F(f(z))}{(z - \zeta - h)(z - \zeta - k)(z - \zeta)}\,dz. \qquad (3.10)$$

Since F(f(z)) is continuous on the compact curve γ, it is bounded there. For small enough |h| and |k|, it now follows that the right-hand side of (3.10) is bounded. Hence, by the uniform boundedness principle [Theorem 5.4.6], there exists a constant C > 0 such that

$$\Bigl\|\frac{f(\zeta + h) - f(\zeta)}{h} - \frac{f(\zeta + k) - f(\zeta)}{k}\Bigr\| \le C\,|h - k|.$$

Thus the difference quotients of f at ζ form a Cauchy family; since X is complete, it follows that the difference quotient of f tends to a limit as h tends to 0. Thus, f(ζ) is strongly analytic. ∎

A holomorphic function f:Ω→X has a Taylor series representation at every z ∈ Ω, i.e., for every z ∈ Ω, there is an r = r(z) such that D(z, r) = {ζ ∈ ℂ : |ζ − z| < r} ⊆ Ω and

$$f(\zeta) = \sum_{n=0}^{\infty}a_n(\zeta - z)^n \qquad (3.11)$$

for some a_0, a_1, … in X and all ζ ∈ D(z, r), the series (3.11) being absolutely convergent (Σ_{n=0}^{∞} ||a_n||\,|ζ − z|ⁿ < ∞).

The other standard results concerning holomorphic functions remain valid in this more general setting. These results can be proved by the same method that is used for complex functions.

Also, the radius of convergence of (3.11) is liminf_n ||a_n||^{−1/n}, just as in the classical case. Correspondingly, the Laurent series

$$g(\zeta) = \sum_{n=0}^{\infty}b_n\zeta^{-n} \qquad (3.12)$$

converges for |ζ| > s and diverges for |ζ| < s, where s = limsup_n ||b_n||^{1/n}. Indeed, if |ζ| > s, then choosing ε > 0 such that (1 + ε)s/|ζ| < 1, we have ||b_n||^{1/n} < (1 + ε)s for every sufficiently large n. Hence ||b_nζ^{−n}|| < ((1 + ε)s/|ζ|)ⁿ if n is sufficiently large, implying that (3.12) is absolutely convergent. Conversely, if |ζ| < s, then there is an infinite sequence n_1 < n_2 < ⋯ such that ||b_{n_k}|| > |ζ|^{n_k}. But then ||b_{n_k}ζ^{−n_k}|| > 1, and so (3.12) does not converge.

Problem Set 3.3

3.3.P1. Let H be a Hilbert space and let T₁, T₂, T₃ ∈ B(H). On H⁽³⁾ = H ⊕ H ⊕ H, define T by the matrix
$$T = \begin{bmatrix} O & T_3 & T_1 \\ O & O & T_2 \\ O & O & O \end{bmatrix}.$$
Prove that T ∈ B(H⁽³⁾). For a ∈ ℂ, show that (I − aT) is invertible and find its inverse.
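A quick numerical sanity check of the inverse asked for in 3.3.P1, with scalar stand-ins for T₁, T₂, T₃ (the values below are hypothetical): since T is nilpotent with T³ = O, the Neumann series for (I − aT)⁻¹ terminates after three terms.

```python
# T^3 = 0 for the upper-triangular block matrix, so
# (I - aT)^{-1} = I + aT + a^2 T^2 exactly.

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    n = len(A)
    return [[A[i][j] + B[i][j] for j in range(n)] for i in range(n)]

def scal(c, A):
    return [[c * x for x in row] for row in A]

I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
T1, T2, T3 = 2, 5, 7                 # hypothetical scalar stand-ins
T = [[0, T3, T1], [0, 0, T2], [0, 0, 0]]
a = 3

T_sq = mat_mul(T, T)
inv = mat_add(I, mat_add(scal(a, T), scal(a * a, T_sq)))
product = mat_mul(mat_add(I, scal(-a, T)), inv)   # should be I
```

The same computation goes through verbatim with operator entries, since only the algebra of T is used.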

3.3 The Algebra of Operators 175

3.3.P2. Prove that the following systems of equations have unique solutions in ℓ² for any {η_k}_{k≥1} ∈ ℓ². Find the solutions for η_k = δ_{1k}, μ_k = 1/2^{k−1}.
(a) ξ_k − μ_kξ_{k+1} = η_k, k = 1, 2, …
(b) ξ_k − μ_kξ_{k−1} = η_k, k = 2, 3, … and ξ₁ = 1.

3.3.P3. Show that T ∈ B(H) is surjective if, and only if, T* is bounded below.

3.3.P4. Show that if T ⪰ O, then (I + T)⁻¹ exists.
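The key observation behind 3.3.P4 is that ((I + T)x, x) = ||x||² + (Tx, x) ≥ ||x||², so I + T is bounded below by 1. A minimal numerical illustration on ℝ² (the matrix below is a hypothetical example):

```python
# T = M^t M is positive semidefinite, and ((I + T)x, x) >= ||x||^2
# for every x, which is the lower bound used in 3.3.P4.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def apply(A, x):
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

M = [[1, 2], [0, 3]]
T = [[sum(M[k][i] * M[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]

samples = [[1, 0], [0, 1], [1, -1], [2, 3], [-5, 4]]
lower_bounded = all(
    dot(apply(T, x), x) >= 0 and
    dot(x, x) + dot(apply(T, x), x) >= dot(x, x)
    for x in samples)
```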

3.4 Sesquilinear Forms

In this section, the notion of a sesquilinear form will be introduced. On the pattern of linear functionals, the notion of bounded sesquilinear functionals is studied. A characterisation of such functionals is provided.

Definition 3.4.1 Let X be a vector space over ℂ. A sesquilinear form on X is a mapping B from X × X into the complex plane ℂ with the following properties:
(i) B(x₁ + x₂, y) = B(x₁, y) + B(x₂, y);
(ii) B(ax, y) = aB(x, y);
(iii) B(x, y₁ + y₂) = B(x, y₁) + B(x, y₂);
(iv) $B(x, by) = \overline{b}B(x, y)$
for all x, x₁, x₂, y, y₁, y₂ in X and all scalars a, b in ℂ.
Thus, B is linear in the first argument and conjugate linear in the second argument. If X is a real vector space, then (iv) is simply B(x, by) = bB(x, y), and B is then called a bilinear form.

Definition 3.4.2 A Hermitian form on a complex vector space X is a mapping B from X × X into the complex plane ℂ satisfying properties (i), (ii), (iii) and the additional property

(v) $B(x, y) = \overline{B(y, x)}$.

It is then obvious that B must also have the property (iv) above and thus be

sesquilinear. However, a sesquilinear form need not be Hermitian, for example, B(x,

y) = i(x, y), where (x, y) on the right denotes an inner product in X. In this

connection, see (ii) of Remark 3.4.4 below.

A sesquilinear form B on X is said to be nondegenerate if it has the following

property:

(vi) If x 2 X is such that for all y 2 X, B(x, y) = 0, then x = 0; if y 2 X is such that

for all x 2 X, B(x, y) = 0, then y = 0.

Example 3.4.3
(i) The inner product in any pre-Hilbert space is a nondegenerate Hermitian form. In particular, the usual inner product $(x, y) = \sum_{i=1}^n x_i\overline{y_i}$ is a nondegenerate Hermitian form on ℂⁿ. But if we delete one or more terms in the preceding sum, it will define a degenerate Hermitian form on ℂⁿ.

(ii) The form
$$B(x, y) = x_1\overline{y_1} - x_2\overline{y_2}$$
is a Hermitian form, nondegenerate on ℂ² (but degenerate on ℂⁿ, n > 2).

Remarks 3.4.4

(i) The property (iv) above is responsible for the name "sesquilinear"; the Latin prefix "sesqui" means one and a half times.

(ii) A sesquilinear form is Hermitian if, and only if, B(x, x) is a real number for all x.
It follows in view of the property (v) with y = x that $B(x, x) = \overline{B(x, x)}$, that is, B(x, x) is real. On the other hand, suppose B(x, x) is real for every x. Sesquilinearity gives
$$B(x + y, x + y) - B(x, x) - B(y, y) = B(x, y) + B(y, x). \qquad (3.13)$$
Since the left-hand side of the above equality (3.13) is real for all x and y in X, it implies
$$B(x, y) + B(y, x) = \overline{B(x, y)} + \overline{B(y, x)}. \qquad (3.14)$$
Apply (3.13) with iy in place of y. The left side must again be real and so must be the right-hand side, which is now, in view of the sesquilinearity,
$$B(x, iy) + B(iy, x) = i\,[B(y, x) - B(x, y)].$$
This forces Re B(x, y) = Re B(y, x), while (3.14) gives Im B(x, y) = −Im B(y, x). Hence, $B(x, y) = \overline{B(y, x)}$.

We shall essentially be interested in positive definite forms. These are sesquilinear forms which satisfy the following condition:
(vii) B(x, x) > 0 for every nonzero x ∈ X.
In particular, positive definite sesquilinear forms are Hermitian. They are obviously nondegenerate.
Sesquilinear forms which satisfy the weaker condition B(x, x) ≥ 0 for all x ∈ X are called nonnegative.

We now present a result for sesquilinear forms generalising the Cauchy–

Schwarz inequality for inner products.

Theorem 3.4.5 Let B be a nonnegative sesquilinear form on the complex vector space X. Then,
$$|B(x, y)|^2 \le B(x, x)B(y, y), \qquad x, y \in X.$$
Proof If B(x, y) = 0, the inequality is, of course, true. Suppose B(x, y) ≠ 0. Then for arbitrary complex numbers a, b, we have
$$0 \le B(ax + by, ax + by) = a\overline{a}B(x, x) + a\overline{b}B(x, y) + \overline{a}bB(y, x) + b\overline{b}B(y, y)
= a\overline{a}B(x, x) + a\overline{b}B(x, y) + \overline{a\overline{b}B(x, y)} + b\overline{b}B(y, y),$$
since B is nonnegative (and hence Hermitian). Now let a = t be real and set b = −B(x, y)/|B(x, y)|. Then,
$$0 \le t^2B(x, x) - 2t|B(x, y)| + B(y, y) \quad\text{for all real } t.$$
Hence, the discriminant of this real quadratic in t cannot be positive:
$$4|B(x, y)|^2 - 4B(x, x)B(y, y) \le 0,$$
which is the required inequality. □
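A spot check of the generalised Cauchy–Schwarz inequality on ℂ²: B(x, y) = (Ax, y) with A = M*M (conjugate transpose of M times M) is a nonnegative sesquilinear form. The matrix M and the vectors are hypothetical examples.

```python
# Verify |B(x, y)|^2 <= B(x, x) B(y, y) for a nonnegative form on C^2.

def conj_T(M):
    return [[M[j][i].conjugate() for j in range(2)] for i in range(2)]

def mat_vec(A, x):
    return [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]

def inner(u, v):          # (u, v) = sum u_i conj(v_i)
    return sum(ui * vi.conjugate() for ui, vi in zip(u, v))

M = [[1 + 2j, 3j], [0j, 2 - 1j]]
Mc = conj_T(M)
A = [[sum(Mc[i][k] * M[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]   # A = M*M, so (Ax, x) = ||Mx||^2 >= 0

def B(x, y):
    return inner(mat_vec(A, x), y)

x = [1 + 1j, 2j]
y = [3j, -1 + 0j]
lhs = abs(B(x, y)) ** 2
rhs = (B(x, x) * B(y, y)).real   # B(x, x), B(y, y) are real and nonnegative
```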


Definition 3.4.6 Let H be a Hilbert space. The sesquilinear form B is said to be bounded if there exists some positive constant M such that
$$|B(x, y)| \le M\|x\|\|y\| \quad\text{for all } x, y \in H.$$
The norm of a bounded sesquilinear form B is defined by
$$\|B\| = \sup_{\|x\| = \|y\| = 1}|B(x, y)| = \sup_{\substack{x, y \in H \\ x \ne 0 \ne y}}\frac{|B(x, y)|}{\|x\|\|y\|}.$$

Example 3.4.7
(i) If H is a Hilbert space, the sesquilinear form B: H × H → ℂ defined by B(x, y) = (x, y) is bounded by the Cauchy–Schwarz inequality. Moreover, ||B|| = 1. Indeed, |B(x, y)| = |(x, y)| ≤ ||x||||y||, and so, ||B|| ≤ 1. For y = x, |B(x, y)| = |(x, x)| = ||x||² = 1 if ||x|| = 1.

(ii) If H is a Hilbert space and T: H → H is a bounded linear operator, then B(x, y) = (Tx, y) is a bounded sesquilinear form with ||B|| = ||T||. Indeed, for x, y ∈ H with ||x|| = ||y|| = 1,
$$|B(x, y)| = |(Tx, y)| \le \|Tx\|\|y\| \le \|T\|,$$
hence ||B|| ≤ ||T||. On the other hand,
$$\|B\| \ge \sup_{\substack{x \ne 0 \\ Tx \ne 0}}\frac{|(Tx, Tx)|}{\|x\|\|Tx\|} = \sup_{\substack{x \ne 0 \\ Tx \ne 0}}\frac{\|Tx\|^2}{\|x\|\|Tx\|} = \sup_{x \ne 0}\frac{\|Tx\|}{\|x\|} = \|T\|.$$
A bounded sesquilinear form is jointly continuous in its two variables:
$$|B(x, y) - B(x_0, y_0)| \le \|B\|\bigl(\|x - x_0\|\|y - y_0\| + \|x - x_0\|\|y_0\| + \|x_0\|\|y - y_0\|\bigr).$$


The following theorem provides a representation of sesquilinear forms on Hilbert space.

Theorem 3.4.8 Let H be a Hilbert space and B(·,·): H × H → ℂ be a bounded sesquilinear form. Then, B has a representation
$$B(x, y) = (Sx, y), \qquad x, y \in H,$$
where S: H → H is a uniquely determined bounded linear operator with norm
$$\|S\| = \|B\|.$$

Proof For fixed x, the expression $\overline{B(x, y)}$ defines a bounded linear functional in y whose domain is H. Then, the Theorem 2.10.25 of F. Riesz yields an element z ∈ H, uniquely determined by x, such that
$$\overline{B(x, y)} = (y, z), \quad y \in H.$$
Hence,
$$B(x, y) = (z, y), \quad y \in H.$$
Define S: H → H by Sx = z, x ∈ H; then B(x, y) = (Sx, y). Since
$$(S(ax_1 + bx_2), y) = B(ax_1 + bx_2, y) = aB(x_1, y) + bB(x_2, y) = a(Sx_1, y) + b(Sx_2, y) = (aSx_1 + bSx_2, y),$$
we have
$$(S(ax_1 + bx_2) - aSx_1 - bSx_2, y) = 0.$$
Since y is arbitrary, S(ax₁ + bx₂) = aSx₁ + bSx₂; thus S is linear on H. Furthermore, since |(Sx, y)| ≤ ||Sx||||y||, we have
$$\|B\| = \sup_{\substack{x, y \in H \\ x \ne 0 \ne y}}\frac{|B(x, y)|}{\|x\|\|y\|} = \sup_{\substack{x, y \in H \\ x \ne 0 \ne y}}\frac{|(Sx, y)|}{\|x\|\|y\|} \le \sup_{x \ne 0}\frac{\|Sx\|}{\|x\|} = \|S\|,$$
so S is bounded. On the other hand, taking y = Sx,
$$\|B\| = \sup_{\substack{x, y \in H \\ x \ne 0 \ne y}}\frac{|(Sx, y)|}{\|x\|\|y\|} \ge \sup_{\substack{x \ne 0 \ne Sx}}\frac{|(Sx, Sx)|}{\|x\|\|Sx\|} = \sup_{x \ne 0}\frac{\|Sx\|}{\|x\|} = \|S\|.$$
It remains to check that S is unique. Suppose there is a linear operator T: H → H such that B(x, y) = (Tx, y) for x, y ∈ H. Then we have
$$((S - T)x, y) = 0, \qquad x, y \in H,$$
which implies (S − T)x = 0 for every x ∈ H. Consequently, S = T. □
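In a finite-dimensional space the operator S of Theorem 3.4.8 can be read off from the form: its matrix entries are (Se_j, e_i) = B(e_j, e_i). A minimal sketch on ℂ² (the matrix A defining B below is a hypothetical example):

```python
# Recover the representing operator S of a bounded sesquilinear form
# B(x, y) = (Ax, y) on C^2 from its values on basis vectors.

def inner(u, v):
    return sum(ui * vi.conjugate() for ui, vi in zip(u, v))

def mat_vec(M, x):
    return [sum(M[i][j] * x[j] for j in range(2)) for i in range(2)]

A = [[2 + 1j, 0 - 3j], [1 + 0j, 4 + 4j]]   # hypothetical

def B(x, y):
    return inner(mat_vec(A, x), y)

e = [[1 + 0j, 0j], [0j, 1 + 0j]]
S = [[B(e[j], e[i]) for j in range(2)] for i in range(2)]  # s_{i,j} = B(e_j, e_i)

# Then B(x, y) = (Sx, y); check on one pair of vectors.
x, y = [1 + 2j, -1j], [0.5 + 0j, 3 - 1j]
diff = B(x, y) - inner(mat_vec(S, x), y)
```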

The following simple Theorem is often useful:

Theorem 3.4.9 If a complex scalar function B: H × H → ℂ, where H denotes a Hilbert space, satisfies the following conditions:
(i) B(x₁ + x₂, y) = B(x₁, y) + B(x₂, y);
(ii) B(ax, y) = aB(x, y);
(iii) B(x, y₁ + y₂) = B(x, y₁) + B(x, y₂);
(iv) $B(x, by) = \overline{b}B(x, y)$;
(v) |B(x, x)| ≤ M||x||²,
where M is a constant, x, x₁, x₂, y, y₁, y₂ are arbitrary elements of H and a, b are scalars, then B is a bounded sesquilinear functional with ||B|| ≤ 2M.

Proof From (i)–(iv), it follows that
$$B(x, y) + B(y, x) = \frac{1}{2}[B(x + y, x + y) - B(x - y, x - y)].$$
This implies
$$|B(x, y) + B(y, x)| \le \frac{1}{2}M\left[\|x + y\|^2 + \|x - y\|^2\right] = M\left[\|x\|^2 + \|y\|^2\right]. \qquad (3.15)$$
Let x, z ∈ H be unit vectors and let λ be a complex number of absolute value 1, to be specified later. Then, (3.15) applied to λx and z yields
$$|\lambda B(x, z) + \overline{\lambda}B(z, x)| \le 2M. \qquad (3.16)$$
Writing $B(x, z) = |B(x, z)|e^{ic}$ and $B(z, x) = |B(z, x)|e^{id}$, (3.16) becomes
$$\left|\,|B(x, z)|\lambda e^{ic} + |B(z, x)|\overline{\lambda}e^{id}\right| \le 2M.$$
Choosing $\lambda = e^{i(d - c)/2}$ makes both terms inside the absolute value have the same argument, which yields
$$|B(x, z)| + |B(z, x)| \le 2M.$$
In particular, |B(x, z)| ≤ 2M for all unit vectors x and z, that is, ||B|| ≤ 2M. □

Corollary 3.4.10 If the bounded sesquilinear functional B satisfies the condition
$$|B(x, y)| = |B(y, x)|, \qquad x, y \in H,$$
then
$$\|B\| = \sup_{\substack{x \in H \\ x \ne 0}}\frac{|B(x, x)|}{\|x\|^2}.$$
Proof Set $M = \sup_{x \ne 0}|B(x, x)|/\|x\|^2$; then |B(x, x)| ≤ M||x||², which is condition (v) of Theorem 3.4.9. Following the proof of that theorem and using |B(x, z)| = |B(z, x)|, the inequality |B(x, z)| + |B(z, x)| ≤ 2M obtained there gives |B(x, z)| ≤ M for all unit vectors x, z. It follows that
$$\|B\| \le \sup_{\substack{x \in H \\ x \ne 0}}\frac{|B(x, x)|}{\|x\|^2};$$
on the other hand,
$$\sup_{\substack{x \in H \\ x \ne 0}}\frac{|B(x, x)|}{\|x\|^2} \le \sup_{\substack{x, y \in H \\ x \ne 0 \ne y}}\frac{|B(x, y)|}{\|x\|\|y\|} = \|B\|. \qquad\square$$

The following corollary plays an important role in the exposition of spectral theory given in subsequent pages.

Corollary 3.4.11 If H is a Hilbert space, the norm of a Hermitian bounded sesquilinear form B: H × H → ℂ is given by the formula
$$\|B\| = \sup_{\substack{x \in H \\ x \ne 0}}\frac{|B(x, x)|}{\|x\|^2}.$$
Proof Since B is Hermitian, $B(x, y) = \overline{B(y, x)}$, so that |B(x, y)| = |B(y, x)|, and Corollary 3.4.10 applies. □

Problem Set 3.4
3.4.P1. Let B(·,·) be a bounded sesquilinear form on a Hilbert space H. Show that
(a) (Parallelogram law) For all x, y ∈ H,
$$B(x + y, x + y) + B(x - y, x - y) = 2[B(x, x) + B(y, y)];$$
(b) (Polarisation identity) For all x, y ∈ H,
$$4B(x, y) = B(x + y, x + y) - B(x - y, x - y) + iB(x + iy, x + iy) - iB(x - iy, x - iy).$$

3.4.P2. A function f defined on a Hilbert space H is called a quadratic form if there exists a sesquilinear form B on H such that f(x) = B(x, x). Show that a pointwise limit of quadratic forms is a quadratic form.

3.5 The Adjoint Operator

The study of sesquilinear forms on a Hilbert space H yields rich dividends. The algebra B(H) of bounded linear operators on H admits a canonical bijection T → T* possessing pleasant algebraic properties. Moreover, many properties of T can be studied through the operator T*. It also helps us to study three important classes of operators, namely self-adjoint, unitary and normal operators. These classes have been studied extensively, because they play an important role in various applications.

Definition 3.5.1 Let T be a bounded linear operator on a Hilbert space H. Then, the Hilbert space adjoint T* of T is the operator
$$T^*: H \to H$$
satisfying
$$(Tx, y) = (x, T^*y) \quad\text{for all } x, y \in H.$$
We first show that this definition makes sense and also prove that the adjoint operator has the same norm.

Theorem 3.5.2 The Hilbert space adjoint T* of T in Definition 3.5.1 exists, is unique and is a bounded linear operator with norm
$$\|T^*\| = \|T\|.$$
Proof The formula B(y, x) = (y, Tx) defines a sesquilinear form on H × H, because the inner product is a sesquilinear form and T is a bounded linear operator. Indeed, for y₁, y₂, x₁, x₂ in H and a, b scalars,
$$B(ay_1 + by_2, x) = (ay_1 + by_2, Tx) = (ay_1, Tx) + (by_2, Tx) = a(y_1, Tx) + b(y_2, Tx) = aB(y_1, x) + bB(y_2, x)$$
and
$$B(y, ax_1 + bx_2) = (y, aTx_1 + bTx_2) = \overline{a}(y, Tx_1) + \overline{b}(y, Tx_2) = \overline{a}B(y, x_1) + \overline{b}B(y, x_2).$$
Moreover, B is bounded:
$$|B(y, x)| = |(y, Tx)| \le \|y\|\|Tx\| \le \|T\|\|y\|\|x\|,$$
and
$$\|B\| = \sup_{\substack{x \ne 0 \\ y \ne 0}}\frac{|(y, Tx)|}{\|y\|\|x\|} \ge \sup_{\substack{x \ne 0 \\ Tx \ne 0}}\frac{|(Tx, Tx)|}{\|Tx\|\|x\|} = \|T\|, \qquad (3.19)$$
so that ||B|| = ||T||. From the representation Theorem 3.4.8 for bounded sesquilinear forms, we have
$$B(y, x) = (T^*y, x),$$
where T*: H → H is a uniquely defined bounded linear operator with norm ||T*|| = ||B|| = ||T||. Since (y, Tx) = B(y, x) = (T*y, x), taking complex conjugates gives (Tx, y) = (x, T*y) for all x, y ∈ H, as required. □

Remarks 3.5.3
(i) T = O if, and only if, (Tx, y) = 0 for all x, y ∈ H. T = O means Tx = 0 for all x ∈ H, and this implies (Tx, y) = (0, y) = 0. On the other hand, (Tx, y) = 0 for all x, y ∈ H implies Tx = 0 for all x ∈ H, which, by definition, says T = O.
(ii) Let H be a complex Hilbert space. Then (Tx, x) = 0 for all x ∈ H if, and only if, T = O. For x = ay + z ∈ H,
$$0 = (T(ay + z), ay + z) = |a|^2(Ty, y) + a(Ty, z) + \overline{a}(Tz, y) + (Tz, z) = a(Ty, z) + \overline{a}(Tz, y), \qquad (3.23)$$
using (Ty, y) = 0 = (Tz, z). Taking a = 1 and then a = i in (3.23) gives (Ty, z) + (Tz, y) = 0 and (Ty, z) − (Tz, y) = 0; so (Ty, z) = 0 for all y, z ∈ H, and T = O by (i) above.

The following general properties of Hilbert space adjoint operators are fre-

quently used in studying these operators.

Theorem 3.5.4 If S, T ∈ B(H) and a is a scalar, then
(a) $(aS + T)^* = \overline{a}S^* + T^*$;
(b) $(ST)^* = T^*S^*$;
(c) $(S^*)^* = S$;
(d) if S is invertible in B(H) and S⁻¹ is its inverse, then S* is invertible and $(S^*)^{-1} = (S^{-1})^*$;
(e) $\|S^*S\| = \|SS^*\| = \|S\|^2$;
(f) S*S = O if, and only if, S = O.

Proof
(a) By definition of the adjoint, for all x, y ∈ H,
$$((aS + T)x, y) = (aSx, y) + (Tx, y) = (x, \overline{a}S^*y) + (x, T^*y) = (x, (\overline{a}S^* + T^*)y).$$
Since y is arbitrary, $(aS + T)^* = \overline{a}S^* + T^*$.
(b) For x, y ∈ H,
$$((ST)x, y) = (Tx, S^*y) = (x, T^*S^*y),$$
so that (ST)* = T*S*.
(c) For x, y ∈ H,
$$(S^*x, y) = \overline{(y, S^*x)} = \overline{(Sy, x)} = (x, Sy),$$
so that (S*)* = S.
(d) Suppose S is an invertible element in B(H). Then, S⁻¹S = SS⁻¹ = I. Using (b) above, we have (S⁻¹S)* = S*(S⁻¹)* = I*. Since I* = I, we get S*(S⁻¹)* = I. Similarly, (S⁻¹)*S* = I. Hence, (S*)⁻¹ = (S⁻¹)*.
(e) By the Cauchy–Schwarz inequality,
$$\|Sx\|^2 = (Sx, Sx) = (S^*Sx, x) \le \|S^*Sx\|\|x\| \le \|S^*S\|\|x\|^2,$$
so that
$$\|S\|^2 \le \|S^*S\|.$$
Hence, since ||S*S|| ≤ ||S*||||S|| = ||S||² by Theorem 3.5.2, we get ||S*S|| = ||S||². Applying this with S* in place of S yields ||SS*|| = ||S*||² = ||S||².
(f) This is an immediate consequence of (e) above. □
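The algebraic identities of Theorem 3.5.4 can be verified entrywise for matrices, where the adjoint is the conjugate transpose. The 2×2 matrices and the scalar below are hypothetical examples; all arithmetic is exact over the Gaussian integers.

```python
# Check (ST)* = T*S*, (aS + T)* = conj(a) S* + T*, and S** = S.

def adj(M):
    return [[M[j][i].conjugate() for j in range(2)] for i in range(2)]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def scal(c, A):
    return [[c * x for x in row] for row in A]

S = [[1 + 1j, 2j], [3 + 0j, -1j]]
T = [[0j, 1 - 1j], [2 + 2j, 5 + 0j]]
a = 2 - 3j

lhs1, rhs1 = adj(mul(S, T)), mul(adj(T), adj(S))
lhs2, rhs2 = adj(add(scal(a, S), T)), add(scal(a.conjugate(), adj(S)), adj(T))
```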

Remarks 3.5.5

(i) The map T → T* has properties very similar to the complex conjugation z → z̄ on ℂ. A new feature is the relation (b) of Theorem 3.5.4, which results from the noncommutativity of operator multiplication.

(ii) Since ||T*|| = ||T||, we have ||T* − S*|| = ||(T − S)*|| = ||T − S||, and it follows that the map T → T* from B(H) to B(H) is continuous in the norm.

(iii) If H is a Hilbert space, we know that B(H) is a Banach algebra [Theorem 3.3.3]. Moreover, in view of Theorem 3.5.4, the mapping T → T* of B(H) into itself is such that
(a) T** = T;
(b) (S + T)* = S* + T*;
(c) $(aT)^* = \overline{a}T^*$;
(d) (ST)* = T*S*;
(e) $\|T^*T\| = \|T\|^2$.

It is immediate from (a) that the mapping T → T* is both one-to-one and onto. It is useful to have the following general definition.

Definition 3.5.6 Let A be an algebra over ℂ. A mapping a → a* of A into itself is called an involution if, for all a, b ∈ A and all α ∈ ℂ,
(i) a** = a;
(ii) (a + b)* = a* + b*;
(iii) $(\alpha a)^* = \overline{\alpha}a^*$;
(iv) (ab)* = b*a*.

An algebra with an involution is called a *algebra. A normed algebra with an

involution is called a normed *algebra. A Banach algebra A with an involution

satisfying jja*ajj ¼ jjajj2 is called a C*-algebra.

Observe that in a C*-algebra,
$$\|a\|^2 = \|a^*a\| \le \|a^*\|\|a\|,$$
which implies ||a|| ≤ ||a*|| provided a ≠ 0. Replacing a by a* and using (i) of the above definition, we obtain ||a*|| ≤ ||a||. Thus, ||a|| = ||a*|| for a ∈ A, since the equality is trivially true when a = 0.

In view of the observations [(iii) of Remark 3.5.5], it follows that B(H) is a C*-

algebra. Obviously, every *subalgebra of B(H), that is, a subalgebra containing

adjoints, which is closed in the norm is also a C*-algebra. Every C*-algebra has the

same “mathematical structure” as a subalgebra of B(H) for a suitable Hilbert space

H; this is known as the Gelfand–Naimark Theorem. The study of such algebras

constitutes an important area of research in functional analysis and is beyond the

scope of the present text.

There is an interesting relationship between the range of an operator T 2

B(H) and the kernel of its adjoint T*. This relationship proves useful in deciding the

invertibility of operators.

Theorem 3.5.7 Let T ∈ B(H) and let M and N be closed linear subspaces of a Hilbert space H. Then, T(M) ⊆ N if, and only if, T*(N^⊥) ⊆ M^⊥.


Proof Suppose T(M) ⊆ N and let y ∈ T*(N^⊥). There exists x ∈ N^⊥ such that y = T*x. For z ∈ M, (y, z) = (T*x, z) = (x, Tz) = 0, since x ∈ N^⊥ and Tz ∈ N; thus, y ⊥ M.
If T*(N^⊥) ⊆ M^⊥, then by the argument in the above paragraph, T**(M^⊥⊥) ⊆ N^⊥⊥. Since T** = T and M and N are closed subspaces of H, it follows that T(M) ⊆ N. □

Theorem 3.5.8 If T ∈ B(H), then ker(T) = ker(T*T) = [ran(T*)]^⊥ and $[\ker(T)]^\perp = \overline{\mathrm{ran}(T^*)}$.
Proof Clearly, ker(T) ⊆ ker(T*T). The reverse inclusion follows from the computation ||Tx||² = (Tx, Tx) = (T*Tx, x).
Now, x ∈ ker(T) ⇔ Tx = 0 ⇔ (Tx, y) = 0 for all y ∈ H ⇔ (x, T*y) = 0 for all y ∈ H ⇔ x ∈ [ran(T*)]^⊥. Thus, ker(T) = [ran(T*)]^⊥.
It follows by (iii) of Remark 2.10.12 that $[\ker(T)]^\perp = [\mathrm{ran}(T^*)]^{\perp\perp} = \overline{\mathrm{ran}(T^*)}$. □

The following Theorem provides a criterion for the invertibility of T ∈ B(H).
Theorem 3.5.9 If T ∈ B(H) is such that T and T* are both bounded below, then T is invertible.
Proof If T* is bounded below, then ker(T*) = {0}. In view of Theorem 3.5.8, [ran(T)]^⊥ = ker(T*) = {0}, which implies $\overline{\mathrm{ran}(T)} = [\mathrm{ran}(T)]^{\perp\perp} = \{0\}^\perp = H$. Thus, ran(T) is dense in H and the result now follows on using Theorem 3.3.12. □

In the following examples, we compute the adjoints of some well-known

operators.

Example 3.5.10

(i) Let H = ℂⁿ, the Hilbert space of finite dimension n, and let {e₁, e₂, …, eₙ} be the standard orthonormal basis for H. Define T: ℂⁿ → ℂⁿ by setting
$$(Tx)_i = \sum_{j=1}^n a_{i,j}x_j$$
[see Example 3.2.5]. Since the inner product in ℂⁿ is $(x, y) = \sum_{i=1}^n x_i\overline{y_i}$,
$$(Tx, y) = \sum_{i=1}^n(Tx)_i\overline{y_i} = \sum_{i=1}^n\left(\sum_{j=1}^n a_{i,j}x_j\right)\overline{y_i} = \sum_{j=1}^n x_j\,\overline{\left(\sum_{i=1}^n\overline{a_{i,j}}y_i\right)} = (x, T^*y),$$
where $(T^*y)_j = \sum_{i=1}^n\overline{a_{i,j}}y_i$. The adjoint of T is, therefore, represented by the conjugate transpose of the matrix $[a_{i,j}]$ representing T.

(ii) Let H be a separable Hilbert space and let {eₙ}ₙ≥₁ constitute an orthonormal basis for H. By Problem 3.2.P2, each T ∈ B(H) is defined by a matrix $[a_{i,j}]_{i,j\ge 1}$, where $a_{i,j} = (Te_j, e_i)$, i, j = 1, 2, …. Since T* ∈ B(H) and
$$(T^*e_j, e_i) = (e_j, Te_i) = \overline{(Te_i, e_j)} = \overline{a_{j,i}} \quad\text{for } i, j = 1, 2, \ldots,$$
the matrix representing T* with respect to the same basis is the conjugate transpose of the matrix $[a_{i,j}]_{i,j\ge 1}$ representing T.

(iii) The adjoint of the operator T ∈ B(H) defined by Tx = ax, x ∈ H and a ∈ ℂ, is the operator T* defined by $T^*x = \overline{a}x$, x ∈ H. Indeed, for x, y ∈ H, $(x, T^*y) = (Tx, y) = (ax, y) = (x, \overline{a}y)$. Thus, $(x, (T^* - \overline{a}I)y) = 0$ for all x, y ∈ H. Consequently, $T^* = \overline{a}I$.

(iv) Let M be a closed subspace of a Hilbert space H and P_M the orthogonal projection on M; recall that ||P_M|| = 1 when M ≠ {0} [(ii) of Remark 2.10.17]. The adjoint P_M* of P_M is P_M itself. Indeed, for x₁, x₂ ∈ H with xᵢ = yᵢ + zᵢ, where yᵢ ∈ M and zᵢ ∈ M^⊥, i = 1, 2, we have
$$(P_Mx_1, x_2) = (y_1, y_2 + z_2) = (y_1, y_2) = (y_1 + z_1, y_2) = (x_1, P_Mx_2),$$
so that P_M* = P_M.

(v) Let $H = L^2(X, \mathcal{M}, \mu)$, where $(X, \mathcal{M}, \mu)$ is a σ-finite measure space, and let $y \in L^\infty(X, \mathcal{M}, \mu)$ be an essentially bounded measurable function. The multiplication operator T ∈ B(H), Tx = yx [see (vi) of Example 3.2.5], has adjoint T* which is also a multiplication operator. The defining relation for T* is (x, T*z) = (Tx, z), x, z ∈ H. Consequently,
$$\int_X x(t)\overline{T^*z(t)}\,d\mu(t) = \int_X y(t)x(t)\overline{z(t)}\,d\mu(t), \qquad x, z \in H,$$
which implies
$$\int_X x(t)\left[\overline{T^*z(t)} - y(t)\overline{z(t)}\right]d\mu(t) = 0.$$
Since the above relation holds for all x, z ∈ H, it follows that $T^*z(t) = \overline{y(t)}z(t)$ in H. Thus, the adjoint T* of the multiplication operator T is multiplication by the complex conjugate of y. In particular, if y is real-valued, then T* = T.


(vi) Let H = ℓ², let {e_k}_{k≥1} be the standard orthonormal basis and let T ∈ B(H) be the simple unilateral shift [see (vii) of Example 3.2.5]. The defining relation for T* is (x, T*y) = (Tx, y), x, y ∈ H. Now, writing $x = \sum_{k=1}^\infty \lambda_ke_k$ and $y = \sum_{k=1}^\infty \mu_ke_k$,
$$(x, T^*y) = (Tx, y) = \left(\sum_{k=1}^\infty \lambda_ke_{k+1}, \sum_{k=1}^\infty \mu_ke_k\right) = \sum_{k=1}^\infty \lambda_k\overline{\mu_{k+1}} = \left(\sum_{k=1}^\infty \lambda_ke_k, \sum_{k=1}^\infty \mu_{k+1}e_k\right) = \left(\sum_{k=1}^\infty \lambda_ke_k, \sum_{k=2}^\infty \mu_ke_{k-1}\right).$$
Hence,
$$T^*y = \sum_{k=2}^\infty \mu_ke_{k-1}, \quad\text{where } y = \sum_{k=1}^\infty \mu_ke_k.$$
Thus, the adjoint of the simple unilateral shift is the backward shift
$$T^*\left(\sum_{k=1}^\infty \mu_ke_k\right) = \sum_{k=2}^\infty \mu_ke_{k-1}.$$

(vii) If K is the integral operator with kernel k as in (viii) of Example 3.2.5, then K* is the integral operator with kernel $k^*(s, t) = \overline{k(t, s)}$. The defining relation for K* is (x, K*y) = (Kx, y) for x, y ∈ L²(μ). Now,
$$(x, K^*y) = (Kx, y) = \int_X\left(\int_X k(s, t)x(t)\,d\mu(t)\right)\overline{y(s)}\,d\mu(s) = \int_X x(t)\,\overline{\left(\int_X\overline{k(s, t)}\,y(s)\,d\mu(s)\right)}\,d\mu(t),$$
the interchange in the order of integration being justified by Fubini's Theorem [Theorem 1.3.14]. As this holds for all x and y in L²(μ), we must have
$$K^*y(t) = \int_X\overline{k(s, t)}\,y(s)\,d\mu(s)$$
for almost all t; interchanging the roles of the variables s and t,
$$K^*y(s) = \int_X\overline{k(t, s)}\,y(t)\,d\mu(t)$$
for almost all s. Thus, K* is the integral operator with kernel k*, where
$$k^*(s, t) = \overline{k(t, s)}.$$

Remarks 3.5.11 The Laplace transform L: L²(ℝ⁺) → L²(ℝ⁺) with kernel k(s, t) = e^{−st} [Problem 3.2.P7], defined by
$$Lx(s) = \int_0^\infty x(t)e^{-st}\,dt,$$
is self-adjoint, since its kernel satisfies $k(s, t) = \overline{k(t, s)}$. Since ||S*S|| = ||S||² for S ∈ B(H), it follows that ||L∘L|| = ||L||². The mapping L∘L is easily computed. For x ∈ L²(ℝ⁺),
$$(L\circ L)x(r) = \int_0^\infty Lx(s)e^{-rs}\,ds = \int_0^\infty\left(\int_0^\infty x(t)e^{-st}\,dt\right)e^{-rs}\,ds = \int_0^\infty x(t)\int_0^\infty e^{-(r+t)s}\,ds\,dt, \text{ using Fubini's Theorem,} = \int_0^\infty\frac{x(t)}{r+t}\,dt.$$
Thus, the operator
$$Hx(r) = \int_0^\infty\frac{x(t)}{r+t}\,dt$$
is bounded as a map from L²(ℝ⁺) to itself; its norm is π (this is Hilbert's inequality), and since ||L∘L|| = ||L||², the norm of L equals √π.

Problem Set 3.5

3.5.P1. Let {μ_k}_{k≥1} be a bounded sequence of scalars and let M = sup{|μ_k| : k ≥ 1}. Show that there exists one and only one operator T on a Hilbert space H such that
(a) Te_k = μ_ke_k for all k, where {e_k}_{k≥1} is an orthonormal basis in H;
(b) $T\left(\sum_{k=1}^\infty \lambda_ke_k\right) = \sum_{k=1}^\infty \lambda_k\mu_ke_k$;
(c) ||T|| = M;
(d) $T^*e_k = \overline{\mu_k}e_k$ for all k; and
(e) $T^*\left(\sum_{k=1}^\infty \lambda_ke_k\right) = \sum_{k=1}^\infty \lambda_k\overline{\mu_k}e_k$.
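A finite-dimensional model of the diagonal operator in 3.5.P1 (the sequence μ below is a hypothetical example): Te_k = μ_ke_k, its adjoint multiplies by the conjugates, and the norm is the largest |μ_k|.

```python
# Diagonal operator on C^4: adjoint and norm bound.

def inner(u, v):
    return sum(ui * vi.conjugate() for ui, vi in zip(u, v))

mu = [2j, -1 + 0j, 0.5 + 0.5j, 3 + 0j]

def T(x):
    return [m * xi for m, xi in zip(mu, x)]

def T_star(x):
    return [m.conjugate() * xi for m, xi in zip(mu, x)]

x = [1 + 1j, 2 + 0j, -1j, 0.5 + 0j]
y = [0j, 1 + 1j, 2 - 1j, -1 + 0j]

adjoint_gap = abs(inner(T(x), y) - inner(x, T_star(y)))
M = max(abs(m) for m in mu)                      # here M = 3, attained at e_4
norm_attained = T([0j, 0j, 0j, 1 + 0j]) == [0j, 0j, 0j, 3 + 0j]
```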

3.6 Some Special Classes of Operators

The adjoint operation in B(H) in a way extends the conjugation operation in the complex numbers. Unlike conjugation in the complex numbers, however, the adjoint operation in B(H) does not preserve the product: (ST)* = T*S*.
Those operators T for which T*T = TT* have "decent" properties. Such operators and their suitable subsets will be studied in this section.


Definition 3.6.1 Let H be a Hilbert space and T ∈ B(H). Then:
(a) T is Hermitian or self-adjoint if T* = T;
(b) T is unitary if T is bijective and T* = T⁻¹; and
(c) T is normal if T*T = TT*.

Remarks 3.6.2

(i) In the analogy between the adjoint and the conjugate, Hermitian operators become analogues of real numbers, and unitaries are the analogues of complex numbers of absolute value 1. Normal operators are the true analogues of complex numbers: note that
$$T = \frac{T + T^*}{2} + i\,\frac{T - T^*}{2i},$$
where $\frac{T + T^*}{2}$ and $\frac{T - T^*}{2i}$ are self-adjoint and $T^* = \frac{T + T^*}{2} - i\,\frac{T - T^*}{2i}$. The operators $\frac{T + T^*}{2}$ and $\frac{T - T^*}{2i}$ are called the real and imaginary parts of T.
(ii) If T is self-adjoint or unitary, then T is normal. However, a normal operator need not be self-adjoint or unitary. First note that I, the identity operator in B(H), is self-adjoint. The operator T = 2iI is such that T* = −2iI; so, TT* = 4I = T*T, but T* ≠ T and $T^{-1} = -\frac{i}{2}I \ne T^*$.

From Examples 3.2.5 and 3.5.10, we can readily produce some

inﬁnite-dimensional operators satisfying conditions (a), (b) and (c) of

Deﬁnition 3.6.1.

(iii) If T ∈ B(H), where H is a separable Hilbert space, and T is defined by the matrix $M = [a_{i,j}]_{i,j=1}^\infty$ with respect to an orthonormal basis {eₙ}ₙ≥₁ ($a_{i,j} = (Te_j, e_i)$, i, j = 1, 2, …), then T* is defined by $\overline{M}^t = [\overline{a_{j,i}}]_{i,j=1}^\infty$ with respect to the same basis. Thus, T is self-adjoint if, and only if, $a_{i,j} = \overline{a_{j,i}}$, i, j = 1, 2, …, that is, $M = \overline{M}^t$. Since $\overline{M}^tM = [\sum_n\overline{a_{n,i}}a_{n,j}]_{i,j=1}^\infty$ and $M\overline{M}^t = [\sum_n a_{i,n}\overline{a_{j,n}}]_{i,j=1}^\infty$ with respect to the basis {eₙ}ₙ≥₁, it follows that T is unitary if, and only if, $\overline{M}^tM = I = M\overline{M}^t$, that is,
$$\sum_n\overline{a_{n,i}}a_{n,j} = \delta_{i,j} = \sum_n a_{i,n}\overline{a_{j,n}}$$
for all i, j = 1, 2, …, where δ_{i,j} is 1 if i = j and zero otherwise. This says that the columns of M form an orthonormal set in ℓ² and so do its rows. Next, T is normal if, and only if, $\overline{M}^tM = M\overline{M}^t$. This is certainly the case if M is a diagonal matrix.

(iv) If T denotes the operator of multiplication by y 2 L∞(l) (notations as in (vi) of

Example 3.2.5 and (v) of Example 3.5.10), then T is normal; T is Hermitian if,

and only if, y is real-valued; T is unitary if, and only if, |y| = 1 a.e.

194 3 Linear Operators

(v) By (viii) of Example 3.2.5 and (vii) of Example 3.5.10, the integral operator K with kernel k is self-adjoint if, and only if, $k(s, t) = \overline{k(t, s)}$ a.e. [μ × μ].

(vi) [(vii) of Example 3.2.5 and (vi) of Example 3.5.10] If T 2 B(‘2) is the simple

shift, then T*Te1 = T*e2 = e1 and TT*e1 = T0 = 0; so, T*T 6¼ TT*, that is, T is

not a normal operator.

The following is an important and rather simple criterion for self-adjointness in

the complex case.

Theorem 3.6.3 Let T ∈ B(H). Then:
(a) If T is self-adjoint, (Tx, x) is real for all x ∈ H.
(b) If H is a complex Hilbert space and (Tx, x) is real for all x ∈ H, the operator T is self-adjoint.
Proof (a) If T = T*, then $(Tx, x) = (x, T^*x) = (x, Tx) = \overline{(Tx, x)}$, so (Tx, x) is real. (b) The sesquilinear form B(x, y) = (Tx, y) is such that B(x, x) is real for all x; by (ii) of Remark 3.4.4, B is Hermitian, i.e. $(Tx, y) = \overline{(Ty, x)} = (x, Ty)$ for all x, y ∈ H, and the conclusion follows. □

Remarks 3.6.4

(i) Part (b) of the preceding proposition is false if it is only assumed that H is a real Hilbert space. For example, if $T = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ on ℝ², then (Tx, x) = x₁x₂ − x₂x₁ = 0 for all x ∈ ℝ². However, $T^* = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \ne \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} = T$.

(ii) If S and T are bounded self-adjoint operators on a Hilbert space H, then so is

aS + bT, where a and b are real numbers. Thus, the collection of all

self-adjoint operators is a real vector space, which we shall denote by S(H).

(iii) If T 2 B(H), then T*T and T + T* are self-adjoint.

(iv) If S,T 2 B(H) are self-adjoint, then ST is self-adjoint if, and only if, ST = TS.

Indeed, (ST)* = T*S* = TS; so, ST = (ST)* if, and only if, ST = TS.

Sequences of self-adjoint operators occur in various problems. For them, the

following holds:

Theorem 3.6.5 Let {Tₙ}ₙ≥₁ be a sequence of bounded self-adjoint linear operators on a Hilbert space H. Suppose {Tₙ}ₙ≥₁ converges, say limₙTₙ = T in the uniform norm, i.e. limₙ||Tₙ − T|| = 0. Then, the limit operator T is a bounded self-adjoint operator on H.

Proof Clearly, T is a bounded linear operator. It is enough to show that T* = T. It follows from Theorems 3.5.2 and 3.5.4 that
$$\|T - T^*\| \le \|T - T_n\| + \|T_n - T_n^*\| + \|T_n^* - T^*\| = \|T - T_n\| + \|(T_n - T)^*\| = 2\|T_n - T\| \to 0$$
as n → ∞, so that T* = T. □

The following result is important for the discussion of “spectral theory”.


Theorem 3.6.6 Let T ∈ B(H) be self-adjoint. Then
$$\|T\| = \sup\{|(Tx, x)| : x \in H, \|x\| \le 1\}.$$
Proof Define B(x, y) = (Tx, y), x, y ∈ H; B is a bounded sesquilinear form with ||B|| = ||T|| [(ii) of Example 3.4.7]. Since $B(y, x) = (Ty, x) = (y, Tx) = \overline{(Tx, y)} = \overline{B(x, y)}$, B is Hermitian. Hence, by Corollary 3.4.11,
$$\|B\| = \sup\left\{|B(x, x)|/\|x\|^2 : x \in H, x \ne 0\right\} = \sup\{|B(x, x)| : x \in H, \|x\| \le 1\}. \qquad\square$$

Corollary 3.6.7 If T 2 B(H) is such that T = T* and (Tx, x) = 0 for all x 2 H, then T

= O.

Remarks 3.6.8 The above Corollary may fail without the hypothesis T = T*; see (i) of Remark 3.6.4. However, if the Hilbert space under consideration is complex, then the hypothesis T = T* can be deleted. In fact, the following holds.

Proposition 3.6.9 If H is a complex Hilbert space and T 2 B(H) is such that (Tx,

x) = 0 for all x 2 H, then T = O.

Proof For x, y ∈ H, the following equality is easily verified:
$$(Tx, y) = \frac{1}{4}\{(T(x + y), x + y) - (T(x - y), x - y) + i(T(x + iy), x + iy) - i(T(x - iy), x - iy)\}.$$
Each term on the right-hand side vanishes by hypothesis, so (Tx, y) = 0 for all x, y ∈ H. Putting y = Tx, we obtain ||Tx||² = 0 for every x ∈ H, that is, T = O. □

The notion of positive deﬁnite matrix is familiar from linear algebra; it has a

natural generalisation to inﬁnite dimensions.

Definition 3.6.10 Let T ∈ B(H) be such that T* = T. If for each x ∈ H, (Tx, x) ≥ 0, we say that T is positive semidefinite. If (Tx, x) > 0 for all nonzero x ∈ H, we say that T is positive definite. Alternatively, these are known as positive and strictly positive operators, respectively.


Remarks 3.6.11
(i) If T is any operator on a complex Hilbert space, then the condition (Tx, x) ≥ 0 for all x ∈ H implies T is self-adjoint. However, in a real Hilbert space, this is not true. Indeed, the operator T on ℝ² defined by the matrix $\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$ is not self-adjoint but (Tx, x) = x₁² + x₂² ≥ 0 for all x = (x₁, x₂) ∈ ℝ² [see also (i) of Remark 3.6.4].
(ii) We write T ⪰ O to mean T is positive. The collection of all positive operators is a positive cone: if S ⪰ O and T ⪰ O, then for all nonnegative real numbers a and b, we have aS + bT ⪰ O. This defines a partial order on the collection S(H) of self-adjoint operators: S ⪰ T if, and only if, S − T ⪰ O. Also, if S₁ ⪰ T₁ and S₂ ⪰ T₂, then S₁ + S₂ ⪰ T₁ + T₂.
(iii) If T ∈ B(H) is any operator, then T*T and TT* are positive. Indeed, (T*Tx, x) = (Tx, Tx) = ||Tx||² ≥ 0 for all x ∈ H. The argument that TT* is positive is similar.
(iv) If $A = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, then it can be checked that A ⪰ B. Indeed, $A - B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \succeq O$. However, the relation A² ⪰ B² is false. In fact, $A^2 = \begin{bmatrix} 5 & 3 \\ 3 & 2 \end{bmatrix}$ and $B^2 = \begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix}$, so that $A^2 - B^2 = \begin{bmatrix} 3 & 1 \\ 1 & 0 \end{bmatrix}$, and this does not represent a positive operator, as can be easily verified by considering the vector (1, −2).
(v) The multiplication operator T: L²[0,1] → L²[0,1] defined by (Tx)(t) = tx(t) is positive:
$$(Tx, x) = \int_0^1 t|x(t)|^2\,dt \ge 0.$$
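The counterexample in (iv) above can be checked by direct computation: A − B is positive semidefinite, yet the quadratic form of A² − B² is negative at (1, −2).

```python
# (iv) of Remark 3.6.11: A >= B does not imply A^2 >= B^2.

def quad(M, x):   # (Mx, x) for a real 2x2 matrix
    return sum(sum(M[i][j] * x[j] for j in range(2)) * x[i] for i in range(2))

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 1], [1, 1]]
B = [[1, 1], [1, 1]]
AmB = [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]
A2mB2 = [[mul(A, A)[i][j] - mul(B, B)[i][j] for j in range(2)]
         for i in range(2)]
```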

It was pointed out in (ii) of Remark 3.6.11 that the sum of positive operators is

positive. Let us turn to products. From (iv) of Remark 3.6.4, we know that a product

of bounded self-adjoint operators is self-adjoint if, and only if, the operators

commute. We shall see below that the product of two positive operators is positive

if, and only if, the operators commute.

Theorem 3.6.12 If S, T ∈ B(H), where H is a complex Hilbert space, are such that S ⪰ O and T ⪰ O, then their product ST is positive if, and only if, ST = TS.

Proof The “only if” part is trivial in view of (iv) of Remark 3.6.4.


To prove the "if" part, we suppose ST = TS and show that (STx, x) ≥ 0 for all x ∈ H. If S = O, the inequality holds. Let S ≠ O. Set S₁ = S/||S||, S₂ = S₁ − S₁², …, S_{n+1} = Sₙ − Sₙ², …. Note that each Sᵢ is self-adjoint. We shall show that, for each i = 1, 2, …, O ⪯ Sᵢ ⪯ I.
For i = 1 and x ∈ H, (S₁x, x) = ((S/||S||)x, x) = (Sx, x)/||S|| ≤ ||Sx||||x||/||S|| ≤ ||x||² = (x, x); so, ((I − S₁)x, x) ≥ 0. Thus, the result is true for i = 1.
Assume that O ⪯ S_k ⪯ I. Then, (S_k²(I − S_k)x, x) = ((I − S_k)S_kx, S_kx) ≥ 0, that is, S_k²(I − S_k) ⪰ O. Similarly, it can be shown that S_k(I − S_k)² ⪰ O. Consequently, S_{k+1} = S_k²(I − S_k) + S_k(I − S_k)² ⪰ O and I − S_{k+1} = (I − S_k) + S_k² ⪰ O by the induction hypothesis and the fact that S_k² ⪰ O whenever S_k ⪰ O. This completes the induction, so O ⪯ Sᵢ ⪯ I for every i.

Next, observe that
$$S_1 = S_1^2 + S_2 = S_1^2 + S_2^2 + S_3 = \cdots = S_1^2 + S_2^2 + \cdots + S_n^2 + S_{n+1}. \qquad (3.28)$$
Since S_{n+1} ⪰ O, this implies
$$\sum_{i=1}^n S_i^2 \preceq S_1.$$
By the definition of ⪯ and the fact that Sᵢ = Sᵢ*, this means that
$$\sum_{i=1}^n\|S_ix\|^2 = \sum_{i=1}^n(S_ix, S_ix) = \sum_{i=1}^n(S_i^2x, x) \le (S_1x, x).$$
Since n is arbitrary, the infinite series $\sum_{i=1}^\infty\|S_ix\|^2$ converges, which implies ||Sₙx|| → 0 and hence Sₙx → 0. By (3.28),
$$\sum_{i=1}^n S_i^2x = (S_1 - S_{n+1})x \to S_1x \quad\text{as } n \to \infty. \qquad (3.29)$$

Observe that all the Sᵢ commute with T, since they are sums and products of S₁ = ||S||⁻¹S, and S and T commute. Finally,
$$(STx, x) = \|S\|(TS_1x, x) = \|S\|\left(T\lim_n\sum_{i=1}^nS_i^2x, x\right) = \|S\|\lim_n\sum_{i=1}^n(TS_i^2x, x) = \|S\|\lim_n\sum_{i=1}^n(TS_ix, S_ix) \ge 0,$$
since T ⪰ O. This proves that ST is positive. □
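Theorem 3.6.12 can be illustrated with matrices that commute by construction (a hypothetical example): S below is symmetric positive, T = S² commutes with S, and the product ST = S³ is again positive; for a symmetric 2×2 matrix, positivity is equivalent to nonnegative trace and determinant.

```python
# Product of commuting positive 2x2 matrices is positive.

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

S = [[2, 1], [1, 2]]         # eigenvalues 1 and 3, hence positive
T = mul(S, S)                # T = S^2, so ST = TS automatically
ST = mul(S, T)

commute = mul(S, T) == mul(T, S)
trace = ST[0][0] + ST[1][1]
det = ST[0][0] * ST[1][1] - ST[0][1] * ST[1][0]
# Symmetric 2x2 matrix is positive iff trace >= 0 and det >= 0.
positive = ST[0][1] == ST[1][0] and trace >= 0 and det >= 0
```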


In (ii) of Remark 3.6.11, it was pointed out that the collection of positive

operators on a Hilbert space H is a positive cone in S(H). The positive cone induces

a partial order in S(H). This leads to the following deﬁnition.

Definition 3.6.13 Let {Tₙ}ₙ≥₁ be a sequence of bounded linear self-adjoint operators defined in a Hilbert space H, i.e. Tₙ ∈ B(H), n = 1, 2, …. The sequence {Tₙ}ₙ≥₁ is said to be increasing [resp. decreasing] if T₁ ⪯ T₂ ⪯ ⋯ [resp. T₁ ⪰ T₂ ⪰ ⋯].

An increasing [resp. decreasing] sequence {Tn}n 1 in B(H) has the following

remarkable property. It follows from Theorem 3.4.8 proved above.

Theorem 3.6.14 Let {Tₙ}ₙ≥₁ be an increasing sequence of bounded linear self-adjoint operators on a Hilbert space H that is bounded from above, that is,
$$T_1 \preceq T_2 \preceq \cdots \preceq T_n \preceq \cdots \preceq aI,$$
where a is a real number. Then there is a bounded self-adjoint operator T such that limₙ||(Tₙ − T)x|| = 0 for every x ∈ H.
Proof For each x ∈ H, the sequence {(Tₙx, x)}ₙ≥₁ of real numbers is increasing and bounded from above by a||x||². So, limₙ(Tₙx, x) exists and equals f(x), say. Being a limit of quadratic forms [see Problem 3.4.P2], this is again a quadratic form, that is, there exists a sesquilinear form B(x, y) on H such that f(x) = B(x, x). Clearly, B is bounded. By Theorem 3.4.8, there exists a self-adjoint operator T such that f(x) = (Tx, x). It remains to show that limₙ||(Tₙ − T)x|| = 0 for each x ∈ H.
Without loss of generality, we may assume that T₁ ⪰ O, by replacing each Tᵢ by Tᵢ − T₁ and a by 2a. Then for n > m, we have O ⪯ Tₙ − Tₘ ⪯ aI, so that ||Tₙ − Tₘ|| ≤ a. Using the generalised Cauchy–Schwarz inequality [Theorem 3.4.5 applied to the nonnegative form (Ax, y), where A is a positive operator], we get, for each x and y = (Tₙ − Tₘ)x,
$$\|(T_n - T_m)x\|^4 = [((T_n - T_m)x, y)]^2 \le ((T_n - T_m)x, x)\,((T_n - T_m)y, y) = ((T_n - T_m)x, x)\,\bigl((T_n - T_m)^2x, (T_n - T_m)x\bigr) \le ((T_n - T_m)x, x)\,\|T_n - T_m\|\,\|(T_n - T_m)x\|^2 \le ((T_n - T_m)x, x)\,a^3\|x\|^2.$$
Since {(Tₙx, x)}ₙ≥₁ converges, it is a Cauchy sequence, and so the left-hand side of the above inequality tends to zero as n, m → ∞, i.e. limₙ,ₘ||(Tₙ − Tₘ)x|| = 0. Since H is complete, Bx := limₙTₙx exists for each x ∈ H. Obviously, Bx depends linearly on x. Moreover, 0 ≤ (Tₙx, x) ≤ a(x, x), and so it follows that 0 ≤ (Bx, x) ≤ a||x||², which implies that B is a bounded (self-adjoint) linear operator. Since (Bx, x) = limₙ(Tₙx, x) = (Tx, x) for every x, Corollary 3.6.7 gives B = T, i.e. limₙ||(Tₙ − T)x|| = 0. □

Recall that if T ∈ B(H), then T*T ⪰ O, since (T*Tx, x) = ||Tx||² ≥ 0 [(iii) of Remark 3.6.11]. Just as $|z| = \sqrt{\overline{z}z}$, we would like to define $|T| = \sqrt{T^*T}$. This requires the notion of square roots of positive operators. We begin with a Lemma.

Lemma 3.6.15 The power series for the function $\sqrt{1 - z}$ about z = 0 converges absolutely for all complex numbers z in the closed unit disc {z ∈ ℂ : |z| ≤ 1}.
Proof Since the function $f(z) = \sqrt{1 - z}$ is holomorphic in the open unit disc {z ∈ ℂ : |z| < 1}, it can be expanded in a Taylor series about z = 0:
$$f(z) = \sum_{n=0}^\infty a_nz^n, \quad\text{where } a_n = \frac{f^{(n)}(0)}{n!}.$$
Note that the series converges absolutely in the open unit disc and the derivatives at the origin are all negative:
$$f'(z) = -\frac{1}{2}(1 - z)^{-\frac{1}{2}}, \quad f''(z) = -\frac{1}{2^2}(1 - z)^{-2+\frac{1}{2}}, \quad\ldots,\quad f^{(n)}(z) = -\frac{1\cdot 3\cdots(2n - 3)}{2^n}(1 - z)^{-n+\frac{1}{2}}, \quad\ldots.$$
Thus a₀ = 1 while aₙ < 0 for n ≥ 1, so that
$$\sum_{k=0}^n|a_k| = 2 - \sum_{k=0}^n a_k = 2 - \lim_{x\to 1^-}\sum_{k=0}^n a_kx^k \le 2 - \lim_{x\to 1^-}\sqrt{1 - x} = 2,$$
where $\lim_{x\to 1^-}$ means that the limit is being taken as x → 1 from the left. The sequence of partial sums $\{\sum_{k=0}^n|a_k|\}_{n\ge 1}$ on the left is increasing and is bounded above by 2; consequently, $\sum_{k=0}^\infty a_kz^k$ converges absolutely for |z| = 1. This proves the Lemma. □

Now consider the Cauchy product of the above power series with itself, which is
$$\sum_{k=0}^\infty b_kz^k, \quad\text{where } b_k = \sum_{j=0}^k a_ja_{k-j} \text{ for each } k.$$
It converges absolutely and its sum is the product $(\sqrt{1 - z})^2 = 1 - z$. See Theorem 15 on p. 51 of [28]. This means that, if
$$P_n(z) = \sum_{k=0}^n b_kz^k \quad\text{and}\quad Q_n(z) = \sum_{k=0}^n a_kz^k,$$
then one can see that the polynomial Qₙ(z)² − Pₙ(z) has coefficients that are sums of products of only those aⱼ with j ≥ 1. As noted in the course of the proof of the above Lemma, these aⱼ are all negative, and hence, the coefficients of Qₙ(z)² − Pₙ(z) are all positive. It follows for any bounded linear operator T that ||Qₙ(T)² − Pₙ(T)|| ≤ |Qₙ(||T||)² − Pₙ(||T||)|. In particular, whenever ||T|| ≤ 1, we have ||Qₙ(T)² − Pₙ(T)|| → 0 as n → ∞. That is to say, the Cauchy product $\sum_{k=0}^\infty b_kT^k$ of $\sum_{k=0}^\infty a_kT^k$ with itself converges to $(\sum_{k=0}^\infty a_kT^k)^2$, provided that ||T|| ≤ 1.
On the other hand, since the Cauchy product $\sum_{k=0}^\infty b_kz^k$ of $\sum_{k=0}^\infty a_kz^k$ with itself converges to 1 − z (as noted at the beginning of the preceding paragraph), the uniqueness of the power series of any holomorphic function implies that b₀ = 1 = −b₁ and b_k = 0 for k > 1. Therefore, $(\sum_{k=0}^\infty a_kT^k)^2 = \sum_{k=0}^\infty b_kT^k = I - T$.
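The two facts used above can be verified exactly with rational arithmetic: the coefficients aₙ of √(1 − z) satisfy a recurrence obtained from the binomial series, their absolute values sum to at most 2, and the Cauchy product of the series with itself has b₀ = 1, b₁ = −1 and b_k = 0 for k > 1.

```python
# Exact computation of the coefficients of sqrt(1 - z) and their
# Cauchy product, via a_{n+1} = a_n (n - 1/2)/(n + 1).

from fractions import Fraction

N = 12
a = [Fraction(1)]
for n in range(N):
    a.append(a[-1] * Fraction(2 * n - 1, 2) / (n + 1))

abs_sum = sum(abs(x) for x in a)
b = [sum(a[j] * a[k - j] for j in range(k + 1)) for k in range(N + 1)]
```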

Theorem 3.6.16 Let T ∈ B(H) and T ⪰ O. Then, there is a unique S ∈ B(H) with S ⪰ O and S² = T. Furthermore, S commutes with every bounded operator which commutes with T.

Proof If T = O, then take S = O. We may next assume, without loss of generality, that ||T|| ≤ 1. Indeed, for any positive T and x ∈ H,
$$(Tx, x) \le \|T\|\|x\|^2,$$
which implies
$$\left(\frac{T}{\|T\|}x, x\right) \le (x, x),$$
and therefore, T/||T|| ⪯ I. Assuming we have already proved the Theorem for this case, we could then assert the existence of a positive operator S such that S² = T/||T||. From this, it follows that $\|T\|^{\frac{1}{2}}S$ is a positive square root of T.
Since I − T is self-adjoint, it follows from (ii) of Example 3.4.7 and Corollary 3.4.11 that
$$\|I - T\| = \sup_{\|x\|\ne 0}\frac{|((I - T)x, x)|}{\|x\|^2} = \sup_{\|x\| = 1}|((I - T)x, x)| \le 1,$$
because O ⪯ T ⪯ I gives 0 ≤ ((I − T)x, x) ≤ ||x||².

Hence the series

$$I + a_1(I - T) + a_2(I - T)^2 + \cdots \tag{3.30}$$

converges in norm to an operator $S$. From what has been noted just before the statement of this Theorem, it also follows that $S^2 = I - (I - T) = T$. Furthermore, since $O \le I - T \le I$, we have

$$0 \le ((I - T)^n x, x) \le 1$$

for every unit vector $x$, and hence

$$(Sx, x) = 1 + \sum_{n=1}^{\infty} a_n ((I - T)^n x, x) \ge 1 + \sum_{n=1}^{\infty} a_n, \quad\text{using } a_n < 0 \text{ for all } n \ge 1,$$

$$= 0,$$

since the value of the sum of the series $1 + \sum_{n=1}^{\infty} a_n z^n$ at $z = 1$, which is $1 + \sum_{n=1}^{\infty} a_n$, is zero. Thus, $S \ge O$.

From here onwards, we do not need the restriction that $\|T\| \le 1$. We next check that $S$ commutes with every operator that commutes with $T$. Let $V \in B(H)$ be such that $VT = TV$. Then, $V(I - T)^n = (I - T)^n V$ and consequently, $VS = SV$. It remains to show that $S$ is unique.

Suppose there is $S'$, with $S' \ge O$ and $(S')^2 = T$. Then since

$$S'T = (S')^3 = TS',$$

$S'$ commutes with $T$, and hence $S'$ commutes with $S$. Consequently,

$$(S - S')S(S - S') + (S - S')S'(S - S') = \left(S^2 - (S')^2\right)(S - S') = O. \tag{3.31}$$

Since both terms on the left of (3.31) are positive, they must both be zero; so their difference $(S - S')^3 = O$. Since $S - S'$ is self-adjoint, it follows that

$$\|S - S'\|^2 = \|(S - S')(S - S')\| = \|(S - S')^2\|,$$

and hence $\|S - S'\|^4 = \|(S - S')^2\|^2 = \|(S - S')^4\| = \|(S - S')^3(S - S')\| = 0$, so that $S = S'$. ∎
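In finite dimensions, the series construction behind this proof can be checked numerically. The following sketch (a NumPy illustration, not part of the text; the positive matrix T is chosen arbitrarily with norm at most 1) generates the coefficients $a_n$ of $\sqrt{1-z}$ by a recurrence and sums the series (3.30) to recover the positive square root.

```python
import numpy as np

# An arbitrarily chosen positive matrix with norm <= 1, as the proof requires.
T = np.array([[0.5, 0.2], [0.2, 0.3]])

# Coefficients of sqrt(1 - z) = 1 + sum a_n z^n satisfy a_n = a_{n-1} (n - 3/2) / n.
S = np.eye(2)
a = 1.0
P = np.eye(2)                  # running power (I - T)^n
for n in range(1, 200):
    a *= (n - 1.5) / n
    P = P @ (np.eye(2) - T)
    S += a * P                 # partial sum of the series (3.30)

assert np.allclose(S @ S, T)   # S is a square root of T
assert np.allclose(S, S.T) and np.all(np.linalg.eigvalsh(S) >= 0)  # and positive
```

The same recurrence works for any positive operator once it is rescaled so that its norm does not exceed 1, exactly as in the first paragraph of the proof.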


Example 3.6.17

(i) In $L^2[0, 1]$, the multiplication operator

$$(Sx)(t) = \sqrt{t}\,x(t), \quad 0 < t < 1,\ x \in L^2[0, 1],$$

is the positive square root of the operator of multiplication by $t$.

(ii) For $a > 0$, the matrix

$$T = \begin{pmatrix} a & 1 \\ 1 & a^{-1} \end{pmatrix}$$

is positive. Indeed,

$$(Tx, x) = \left(\begin{pmatrix} ax_1 + x_2 \\ x_1 + a^{-1}x_2 \end{pmatrix}, \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}\right) = a|x_1|^2 + \bar{x}_1 x_2 + x_1\bar{x}_2 + a^{-1}|x_2|^2 = \left|\sqrt{a}\,x_1 + \sqrt{a^{-1}}\,x_2\right|^2 \ge 0$$

for all vectors $(x_1, x_2) \in \mathbb{C}^2$.

In what follows, we shall determine the square root of the matrix $T$. The characteristic values are the roots of the equation $\det(\lambda I - T) = 0$. These roots are $0$ and $(a + a^{-1})$, and the corresponding eigenvectors are $\begin{pmatrix} -a^{-1} \\ 1 \end{pmatrix}$ and $\begin{pmatrix} a \\ 1 \end{pmatrix}$, respectively. If $V$ is the matrix $\begin{pmatrix} -a^{-1} & a \\ 1 & 1 \end{pmatrix}$, then $TV = \begin{pmatrix} 0 & a^2 + 1 \\ 0 & a + a^{-1} \end{pmatrix}$. Consequently, $V^{-1}TV = \begin{pmatrix} 0 & 0 \\ 0 & a + a^{-1} \end{pmatrix}$, where $V^{-1} = \frac{1}{a + a^{-1}}\begin{pmatrix} -1 & a \\ 1 & a^{-1} \end{pmatrix}$. Hence,

$$T^{\frac12} = V\begin{pmatrix} 0 & 0 \\ 0 & (a + a^{-1})^{\frac12} \end{pmatrix}V^{-1} = \begin{pmatrix} (a + a^{-1})^{-\frac12}a & (a + a^{-1})^{-\frac12} \\ (a + a^{-1})^{-\frac12} & (a + a^{-1})^{-\frac12}a^{-1} \end{pmatrix} = (a + a^{-1})^{-\frac12}\,T.$$
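The closed form just obtained, $T^{1/2} = (a + a^{-1})^{-1/2}T$, is easy to confirm numerically; the sketch below (NumPy, with the arbitrary choice a = 2) is not part of the text.

```python
import numpy as np

a = 2.0
T = np.array([[a, 1.0], [1.0, 1.0 / a]])

S = T / np.sqrt(a + 1.0 / a)    # claimed square root (a + 1/a)^(-1/2) T
assert np.allclose(S @ S, T)
assert np.all(np.linalg.eigvalsh(S) >= -1e-12)   # S is positive
```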

(iii) Using (ii) above, we may guess that the square root of the matrix

$$\begin{pmatrix} T & I \\ I & T^{-1} \end{pmatrix} \in B(H \oplus H),$$

where $T$ is a positive invertible operator, is

$$\begin{pmatrix} (T + T^{-1})^{-\frac12}T & (T + T^{-1})^{-\frac12} \\ (T + T^{-1})^{-\frac12} & (T + T^{-1})^{-\frac12}T^{-1} \end{pmatrix}.$$

Here $T + T^{-1}$ is positive and is also bounded below in view of the fact that

$$\left\|(T + T^{-1})x\right\|^2 = \|Tx\|^2 + \left\|T^{-1}x\right\|^2 + 2\|x\|^2 \ge 2\|x\|^2,$$

so that $(T + T^{-1})^{-\frac12}$ is meaningful. The guess is confirmed by the computation

$$\begin{pmatrix} (T + T^{-1})^{-\frac12}T & (T + T^{-1})^{-\frac12} \\ (T + T^{-1})^{-\frac12} & (T + T^{-1})^{-\frac12}T^{-1} \end{pmatrix}^2 = (T + T^{-1})^{-1}\begin{pmatrix} T^2 + I & T + T^{-1} \\ T + T^{-1} & T^{-2} + I \end{pmatrix} = (T + T^{-1})^{-1}\begin{pmatrix} (T + T^{-1})T & T + T^{-1} \\ T + T^{-1} & (T + T^{-1})T^{-1} \end{pmatrix} = \begin{pmatrix} T & I \\ I & T^{-1} \end{pmatrix}.$$

Theorem 3.6.18 If $T \in B(H)$ is self-adjoint, then $\|T^n\| = \|T\|^n$ for every $n \in \mathbb{N}$.

Proof When $T = O$, there is nothing to prove. So we may take $\|T\|^m > 0$ for all $m \in \mathbb{N}$. The case $n = 1$ is trivial. For $n = 2$, the desired equality follows from

$$\|T^2\| = \|T^*T\| = \|T\|^2.$$

This says that, when $k = 1$, the equality $\|T^{2^k}\| = \|T\|^{2^k}$ holds. Assume this for some $k \in \mathbb{N}$. Then,

$$\|T^{2^{k+1}}\| = \left\|\left(T^{2^k}\right)^2\right\| = \left\|\left(T^{2^k}\right)^*\left(T^{2^k}\right)\right\| = \left\|T^{2^k}\right\|^2 = \left(\|T\|^{2^k}\right)^2 = \|T\|^{2^{k+1}}.$$

Therefore, by induction,

$$\|T^{2^k}\| = \|T\|^{2^k} \quad\text{for all } k \in \mathbb{N}.$$

Now consider an arbitrary $n \in \mathbb{N}$. Choose $k \in \mathbb{N}$ such that $n < 2^k$, and put $m = 2^k - n$. Then, $0 \le \|T^m\| \le \|T\|^m \ne 0$ and $0 \le \|T^n\| \le \|T\|^n$. If it were to be the case that $\|T^n\| < \|T\|^n$, then it would follow that

$$\|T\|^{2^k} = \left\|T^{2^k}\right\| = \|T^{n+m}\| \le \|T^n\|\,\|T^m\| < \|T\|^n\,\|T\|^m = \|T\|^{n+m} = \|T\|^{2^k},$$

contradicting what was proved earlier by induction. Thus, $\|T^n\| = \|T\|^n$ for every $n \in \mathbb{N}$. ∎
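A quick numerical illustration (NumPy; the matrices are arbitrary choices, not from the text): for a self-adjoint matrix the equality ‖Tⁿ‖ = ‖T‖ⁿ holds, while for a non-normal matrix it can fail badly.

```python
import numpy as np

norm = lambda A: np.linalg.norm(A, 2)           # operator (spectral) norm

T = np.array([[2.0, 1.0], [1.0, 3.0]])          # self-adjoint
for n in range(1, 6):
    assert np.isclose(norm(np.linalg.matrix_power(T, n)), norm(T) ** n)

N = np.array([[0.0, 1.0], [0.0, 0.0]])          # not normal (nilpotent)
assert norm(N @ N) == 0.0 and norm(N) == 1.0    # ||N^2|| = 0 < 1 = ||N||^2
```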

Theorem 3.6.19 If $T \in B(H)$ is positive, then the sesquilinear form defined by $(Tx, y)$ is nonnegative and satisfies

$$|(Tx, y)|^2 \le (Tx, x)(Ty, y).$$

Proof The inequality now follows from Theorem 3.4.5. ∎

As an application of the above Theorem, we show for a positive operator $T$ and any positive integer $k$ that

$$(T^2x, x) \le (Tx, x)^{\frac12 + \frac14 + \frac18 + \cdots + \frac{1}{2^k}}\left(T^{2^k + 1}x, x\right)^{\frac{1}{2^k}}.$$

Taking $y = Tx$ in the inequality of Theorem 3.6.19, we get

$$(T^2x, x)^2 \le (Tx, x)(T^3x, x),$$

and hence,

$$(T^2x, x) \le (Tx, x)^{\frac12}(T^3x, x)^{\frac12},$$

which is the case $k = 1$. To argue by induction, assume the inequality true for some $k$. Taking $y = T^{2^k}x$ in the inequality of Theorem 3.6.19, we get

$$\left(T^{2^k + 1}x, x\right)^2 \le (Tx, x)\left(T^{2^k + 1}x, T^{2^k}x\right) = (Tx, x)\left(T^{2^{k+1} + 1}x, x\right).$$

Taking the root of order $2^{k+1}$ on both sides and combining with the induction hypothesis, we find that

$$(T^2x, x) \le (Tx, x)^{\frac12 + \frac14 + \frac18 + \cdots + \frac{1}{2^k} + \frac{1}{2^{k+1}}}\left(T^{2^{k+1} + 1}x, x\right)^{\frac{1}{2^{k+1}}}.$$

Problem Set 3.6

3.6.P1. If H = ℂⁿ, then the set of invertible matrices is dense in the space of all matrices.
3.6.P2. Let T ∈ B(H), where H = ℂⁿ, and let {eₖ : k = 1, 2, …, n} be an orthonormal basis for H. Then, T has the matrix representation $[a_{ij}]$ and T* has the representation $[\overline{a_{ji}}]$ with respect to the given orthonormal basis. Show that if the basis is not orthonormal, then this relation between the matrix representations need not hold.

3.6.P3. Let T : X→X be a bounded linear operator on a complex inner product space X. If (Tx, x) = 0 for all x ∈ X, show that T = O. Show that this does not hold in the case of a real inner product space.

3.6.P4. Let the operator T : ℂ²→ℂ² be defined by Tx = ⟨ξ₁ + iξ₂, ξ₁ − iξ₂⟩, where x = ⟨ξ₁, ξ₂⟩. Find T*. Show that we have T*T = TT* = 2I. Find $T_1 = \frac12(T + T^*)$ and $T_2 = \frac{1}{2i}(T - T^*)$.
3.6.P5. Let $\sum_{n=0}^{\infty} a_n z^n$ be a power series with radius of convergence R, 0 < R ≤ ∞. If A ∈ B(H) and ‖A‖ < R, show that there is an operator T ∈ B(H) such that for any x, y ∈ H, $(Tx, y) = \sum_{n=0}^{\infty} a_n (A^n x, y)$. Moreover, T is unique. If BA = AB, then show that BT = TB. [When the sum of the series $\sum_{n=0}^{\infty} a_n z^n$ is denoted by f(z), the operator T is denoted by f(A).]

3.6.P6. Let H be a Hilbert space and A ∈ B(H). Define the operator B on H ⊕ H by

$$B = \begin{pmatrix} O & iA \\ iA^* & O \end{pmatrix}.$$

3.6.P7. If T ∈ B(H), show that T + T* ≥ O if, and only if, (T + I) is invertible in B(H) and ‖(T − I)(T + I)⁻¹‖ ≤ 1.

3.7 Normal, Unitary and Isometric Operators

The true analogues of complex numbers are the normal operators. The following Theorem gives a characterisation of these operators.

Theorem 3.7.1 If $T \in B(H)$, the following are equivalent:

(a) T is normal;
(b) ‖Tx‖ = ‖T*x‖ for all x ∈ H.

If H is a complex Hilbert space, then these statements are also equivalent to:

(c) The real and imaginary parts of T commute, i.e.

$$T_1T_2 = T_2T_1, \quad\text{where } T_1 = \frac{T + T^*}{2} \text{ and } T_2 = \frac{T - T^*}{2i}.$$

Proof If $x \in H$, then

$$\|Tx\|^2 - \|T^*x\|^2 = (Tx, Tx) - (T^*x, T^*x) = (T^*Tx, x) - (TT^*x, x) = ((T^*T - TT^*)x, x).$$

Since $T^*T - TT^*$ is Hermitian, it follows on using Corollary 3.6.7 that (a) and (b) are equivalent.

We next show that (a) and (c) are equivalent:

$$T^*T = (T_1 - iT_2)(T_1 + iT_2) = T_1^2 + i(T_1T_2 - T_2T_1) + T_2^2,$$
$$TT^* = (T_1 + iT_2)(T_1 - iT_2) = T_1^2 + i(T_2T_1 - T_1T_2) + T_2^2.$$

Hence $T^*T = TT^*$ if, and only if, $T_1T_2 = T_2T_1$. ∎
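The equivalences can be observed on a concrete normal matrix; the following sketch (NumPy, with an arbitrarily chosen example, not from the text) checks (a), (b) and (c).

```python
import numpy as np

T = np.array([[1.0, 1j], [1j, 1.0]])
Ts = T.conj().T                                  # the adjoint T*

assert np.allclose(T @ Ts, Ts @ T)               # (a) T is normal

x = np.array([2.0, -1.0 + 3j])
assert np.isclose(np.linalg.norm(T @ x), np.linalg.norm(Ts @ x))  # (b)

T1 = (T + Ts) / 2                                # real part of T
T2 = (T - Ts) / 2j                               # imaginary part of T
assert np.allclose(T1 @ T2, T2 @ T1)             # (c) real and imaginary parts commute
```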

For any operator $T$, we have $\|T^k\| \le \|T\|^k$, where $k$ is a positive integer. A strengthening of the preceding inequality holds for normal operators:

Theorem 3.7.2 Let $T \in B(H)$ satisfy $T^*T = TT^*$. Then,

$$\|T^k\| = \|T\|^k \quad\text{for } k = 2^n,\ n = 1, 2, \ldots.$$

Proof For $n = 1$,

$$\|T^2\|^2 = \|T^2(T^2)^*\| = \|T^2(T^*)^2\| = \|(TT^*)(TT^*)\| = \|TT^*\|^2 = \|T\|^4, \quad[\text{Theorem 3.5.4(e)}]$$

which implies

$$\|T^2\| = \|T\|^2.$$

Assume that $\|T^{2^m}\| = \|T\|^{2^m}$ for some $m$. Then,

$$\begin{aligned}
\left\|T^{2^{m+1}}\right\|^2 &= \left\|T^{2^{m+1}}\left(T^{2^{m+1}}\right)^*\right\| &&[\text{Theorem 3.5.4(e)}]\\
&= \left\|T^{2^m}T^{2^m}\left(T^{2^m}\right)^*\left(T^{2^m}\right)^*\right\| \\
&= \left\|T^{2^m}\left(T^{2^m}\right)^*T^{2^m}\left(T^{2^m}\right)^*\right\| &&[T^*T = TT^*]\\
&= \left\|\left(T^{2^m}\left(T^{2^m}\right)^*\right)^2\right\| \\
&= \left\|T^{2^m}\left(T^{2^m}\right)^*\right\|^2 &&[\text{using the case } n = 1]\\
&= \left\|T^{2^m}\right\|^4 &&[\text{Theorem 3.5.4(e)}]\\
&= \|T\|^{2^{m+2}}. &&[\text{induction hypothesis}]
\end{aligned}$$

Consequently,

$$\left\|T^{2^{m+1}}\right\| = \|T\|^{2^{m+1}}. \qquad\blacksquare$$

If $T \in B(H)$ is self-adjoint, it was proved in Theorem 3.6.6 that

$$\|T\| = \sup\{|(Tx, x)| : \|x\| = 1\}.$$

The norm of any bounded linear normal operator can be computed using the foregoing formula. We begin with the following:

Definition 3.7.3 For any $T \in B(H)$,

$$q(T) = \sup\{|(Tx, x)| : \|x\| = 1\}.$$

Proposition 3.7.4 For any $T \in B(H)$,

$$\|Tx\|^2 + |(T^2x, x)| \le 2q(T)\|Tx\|\,\|x\| \tag{3.32}$$

for every $x \in H$.

Proof Let $\lambda \ne 0$ and $\theta$ be real numbers. Then for $x \in H$,

$$\|Tx\|^2 + e^{2i\theta}(T^2x, x) = \frac12\left(\lambda e^{2i\theta}T^2x + \lambda^{-1}e^{i\theta}Tx,\ \lambda e^{i\theta}Tx + \lambda^{-1}x\right) - \frac12\left(\lambda e^{2i\theta}T^2x - \lambda^{-1}e^{i\theta}Tx,\ \lambda e^{i\theta}Tx - \lambda^{-1}x\right),$$

as may be verified by expanding the right-hand side. Hence,

$$\begin{aligned}
\left|\|Tx\|^2 + e^{2i\theta}(T^2x, x)\right| &\le \frac12\left|\left(\lambda e^{2i\theta}T^2x + \lambda^{-1}e^{i\theta}Tx,\ \lambda e^{i\theta}Tx + \lambda^{-1}x\right)\right| + \frac12\left|\left(\lambda e^{2i\theta}T^2x - \lambda^{-1}e^{i\theta}Tx,\ \lambda e^{i\theta}Tx - \lambda^{-1}x\right)\right| \\
&= \frac12\left|\left(e^{i\theta}T\left(\lambda e^{i\theta}Tx + \lambda^{-1}x\right),\ \lambda e^{i\theta}Tx + \lambda^{-1}x\right)\right| + \frac12\left|\left(e^{i\theta}T\left(\lambda e^{i\theta}Tx - \lambda^{-1}x\right),\ \lambda e^{i\theta}Tx - \lambda^{-1}x\right)\right| \\
&\le \frac12\,q(T)\left(\left\|\lambda e^{i\theta}Tx + \lambda^{-1}x\right\|^2 + \left\|\lambda e^{i\theta}Tx - \lambda^{-1}x\right\|^2\right).
\end{aligned} \tag{3.33}$$

Choosing $\theta$ so that $e^{2i\theta}(T^2x, x) = |(T^2x, x)|$, and applying the parallelogram law on the right, we deduce from (3.33) that

$$\|Tx\|^2 + |(T^2x, x)| \le q(T)\left(\lambda^2\|Tx\|^2 + \lambda^{-2}\|x\|^2\right).$$

If $Tx \ne 0$, the choice $\lambda^2 = \|x\|/\|Tx\|$ yields

$$\|Tx\|^2 + |(T^2x, x)| \le 2q(T)\|Tx\|\,\|x\|,$$

and the inequality holds trivially when $Tx = 0$. ∎

The following proposition will also be needed.

Proposition 3.7.5 If $T \in B(H)$, then $\|T\| \le 2q(T)$ and $q(T^2) \le q(T)^2$.

Proof From Proposition 3.7.4, we have

$$\|Tx\|^2 \le 2q(T)\|Tx\|\,\|x\|.$$

This implies

$$\|Tx\| \le 2q(T)\|x\|,$$

and so,

$$\|T\| \le 2q(T).$$

Again by Proposition 3.7.4, for $\|x\| = 1$,

$$\|Tx\|^2 - 2q(T)\|Tx\| + |(T^2x, x)| \le 0,$$

that is,

$$(\|Tx\| - q(T))^2 + |(T^2x, x)| \le q(T)^2,$$

which implies

$$|(T^2x, x)| \le q(T)^2.$$

Hence, $q(T^2) \le q(T)^2$. ∎

Corollary 3.7.6 If $T \in B(H)$, then $q(T^{2^n}) \le q(T)^{2^n}$ for every $n \in \mathbb{N}$.

Proof By induction, using Proposition 3.7.5. ∎

Theorem 3.7.7 If $T \in B(H)$ is a normal operator, then $\|T\| = q(T)$.

Proof From the definition of $q$ and the definition of the norm, it follows that

$$q(T) = \sup\{|(Tx, x)| : \|x\| = 1\} \le \sup\{\|Tx\|\,\|x\| : \|x\| = 1\} = \sup\{\|Tx\| : \|x\| = 1\} = \|T\|. \tag{3.34}$$

Since $T$ is normal, Theorem 3.7.2 gives

$$\|T^p\| = \|T\|^p \quad\text{for } p = 2^n,\ n = 1, 2, \ldots.$$

So,

$$\|T\| = \|T^p\|^{\frac1p} \le (2q(T^p))^{\frac1p} \quad [\text{Proposition 3.7.5}] \le \left(2q(T)^p\right)^{\frac1p} \quad [\text{Corollary 3.7.6}] = 2^{\frac1p}\,q(T).$$

Letting $p \to \infty$, we get

$$\|T\| \le q(T). \tag{3.35}$$

Combining (3.34) and (3.35), we get the desired expression for the norm of the operator $T$. ∎
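The quantity q(T) is the numerical radius of T. A simple numerical sketch (NumPy; the diagonal matrix is an arbitrary normal example, and the sampling routine is only a lower bound, not from the text) illustrates Theorem 3.7.7 and inequality (3.34).

```python
import numpy as np

def numerical_radius_lower_bound(T, trials=5000, seed=0):
    # Monte Carlo lower bound for q(T) = sup{|(Tx, x)| : ||x|| = 1}.
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(trials):
        x = rng.standard_normal(T.shape[0]) + 1j * rng.standard_normal(T.shape[0])
        x /= np.linalg.norm(x)
        best = max(best, abs(np.vdot(x, T @ x)))
    return best

T = np.diag([1.0, -2.0, 1j])                     # normal, so q(T) = ||T|| = 2
e2 = np.array([0.0, 1.0, 0.0])
assert np.isclose(abs(np.vdot(e2, T @ e2)), np.linalg.norm(T, 2))  # sup attained

q_lb = numerical_radius_lower_bound(T)
assert q_lb <= np.linalg.norm(T, 2) + 1e-12      # q(T) <= ||T||, as in (3.34)
```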

Corollary 3.7.8 Let $T \in B(H)$ be self-adjoint. Then, $\|T\| = q(T) = \sup\{|(Tx, x)| : \|x\| = 1\}$.

In three-dimensional Euclidean space ℂ3, the simplest operator after that of

projection is rotation of the space, which changes neither the length of the vectors

nor orthogonality between pairs of them. We consider below the analogue of this

operation in Hilbert space.

Deﬁnition 3.7.9 Let H be a Hilbert space and U be a bounded linear operator with

domain H and range H. U is called unitary if

$$(Ux, Uy) = (x, y)$$

for all $x, y \in H$.

If y = x, then the deﬁning relation for a linear unitary operator U takes the form

||Ux|| = ||x|| for all x 2 H, in particular, U is bounded and ||U|| = 1.

Theorem 3.7.10 Let U be a unitary operator on a Hilbert space H. Then, U−1

exists and is unitary. Moreover, U−1 = U*.

Proof In order to show that U−1 exists, it is enough to show that U is injective,

which follows from the fact that ||Ux|| = ||x|| for all x 2 H.

We next show that U−1 is unitary. Choose arbitrary x, y 2 H and let x = U−1x′,

y = U−1y′. Then, Ux = x′ and Uy = y′. So, (x′, y′) = (Ux, Uy) = (x, y) = (U−1x′, U−1y′),

that is, U−1 is unitary.

It remains to show that U−1 = U*. For x, y 2 H, let U−1y = z, so that y = Uz.

Then, (Ux, y) = (Ux, Uz) = (x, z) = (x, U−1y). Also, (Ux, y) = (x, U*y).

Consequently, (x, U*y) = (x, U−1y) and this implies U*y = U−1y for all y 2 H. This

proves the assertion. h

Corollary 3.7.11 Let U be a bounded linear operator deﬁned on H. Then, U is

unitary if, and only if, UU* = U*U = I.

Proof Indeed, for x, y ∈ H and U a unitary operator,

$$(x, U^*Uy) = (Ux, Uy) = (x, y),$$

so that $U^*U = I$; and since $U^{-1} = U^*$ by Theorem 3.7.10, also $UU^* = I$.

On the other hand, if UU* = U*U = I, then U is invertible (hence has range H) and

$$(Ux, Uy) = (U^*Ux, y) = (x, y)$$

for all x, y ∈ H, so that U is unitary. ∎

The following simple characterisation of unitary operators is often useful.

Theorem 3.7.12 Let H be a Hilbert space and let U 2 B(H). Then, U is unitary if,

and only if,

(a) ||Ux|| = ||x|| for all x 2 H

and

(b) the range of U is dense in H.

Proof Suppose U is unitary. It has been observed that ||Ux|| = ||x|| for all x 2 H, that

is, (a) holds. Condition (b) is satisﬁed by virtue of the deﬁnition of a unitary

operator.

Suppose that (a) and (b) hold. Then for $x, y \in H$ and $\alpha \in \mathbb{C}$,

$$\|U(x + \alpha y)\|^2 = (Ux, Ux) + |\alpha|^2(Uy, Uy) + \alpha(Uy, Ux) + \bar\alpha(Ux, Uy)$$

and

$$\|x + \alpha y\|^2 = (x, x) + |\alpha|^2(y, y) + \alpha(y, x) + \bar\alpha(x, y).$$

Since $\|U(x + \alpha y)\| = \|x + \alpha y\|$, $\|Ux\| = \|x\|$ and $\|Uy\| = \|y\|$, comparing the two expansions for $\alpha = 1$ and $\alpha = i$ yields

$$(Ux, Uy) = (x, y) \tag{3.39}$$

for all x, y ∈ H.

By (a), U is bounded below. Therefore, by (b) and Theorem 3.3.12, U is invertible. Together with (3.39) and Definition 3.7.9, this entails that U is unitary. ∎

Example 3.7.13

(i) Let ℓ²(ℤ) denote the Hilbert space consisting of the complex functions x on ℤ such that $\sum_{n=-\infty}^{\infty}|x(n)|^2 < \infty$. Define U on ℓ²(ℤ) by U(x)(n) = x(n − 1) for x ∈ ℓ²(ℤ). The operator U is called the bilateral shift. It is clearly linear, and the calculation

$$\|Ux\|^2 = \sum_{n=-\infty}^{\infty}|(Ux)(n)|^2 = \sum_{n=-\infty}^{\infty}|x(n-1)|^2 = \|x\|^2$$

shows that it preserves norms. The defining relation for U* is (x, U*y) = (Ux, y), x, y ∈ H. Now,

$$(x, U^*y) = (Ux, y) = \sum_{n=-\infty}^{\infty}(Ux)(n)\overline{y(n)} = \sum_{n=-\infty}^{\infty}x(n-1)\overline{y(n)} = \sum_{n=-\infty}^{\infty}x(n)\overline{y(n+1)}.$$

Therefore, (U*y)(n) = y(n + 1). An easy computation shows that UU* = U*U = I. Thus, U is a unitary operator.

(ii) Let H = L²[0, 2π]. Define U : H→H by the formula $(Ux)(t) = e^{it}x(t)$ for x ∈ L²[0, 2π]. Observe that U is onto. Indeed, if y ∈ L²[0, 2π], then $e^{-it}y(t) = z(t) \in L^2[0, 2\pi]$ and Uz = y. Moreover,

$$\|Ux\|^2 = \int_0^{2\pi}\left|e^{it}x(t)\right|^2 dt = \int_0^{2\pi}|x(t)|^2\,dt = \|x\|^2.$$

Thus, U is a unitary operator.
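A finite-dimensional analogue of the bilateral shift is the cyclic shift on ℂⁿ, which is likewise unitary with adjoint the backward shift; the sketch below (NumPy, not from the text) makes the check.

```python
import numpy as np

n = 5
U = np.roll(np.eye(n), 1, axis=0)                # (Ux)(k) = x(k - 1 mod n)

x = np.arange(n, dtype=float)
assert np.allclose(U @ x, np.roll(x, 1))         # shifts the coordinates forward
assert np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x))   # norm preserving
assert np.allclose(U.T @ U, np.eye(n)) and np.allclose(U @ U.T, np.eye(n))
assert np.allclose(U.T @ x, np.roll(x, -1))      # the adjoint shifts backward
```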

More general than a unitary operator deﬁned on H is an isometric operator.

Deﬁnition 3.7.14 Let H be a complex Hilbert space and T 2 B(H). The operator

T is said to be isometric if ||Tx|| = ||x|| for all x in H.

Remarks 3.7.15 (i) An isometry is a distance-preserving transformation: $\|Tx - Ty\| = \|T(x - y)\| = \|x - y\|$. In particular, T is injective.

(ii) Observe that a unitary operator on H is isometric. However, not every isometric operator is unitary. The simple unilateral shift T discussed in (vii) of Example 3.2.5 is an isometry but is not unitary because it is obviously not a bijection. In fact, T is not even normal, because its adjoint is given [see (vi) of Example 3.5.10] by $T^*(\{x_i\}_{i\ge1}) = (x_2, x_3, \ldots)$, and hence,

$$T^*T\left(\{x_i\}_{i\ge1}\right) = T^*(0, x_1, x_2, \ldots) = (x_1, x_2, \ldots) = \{x_i\}_{i\ge1},$$
$$TT^*\left(\{x_i\}_{i\ge1}\right) = T(x_2, x_3, \ldots) = (0, x_2, x_3, \ldots).$$

Proposition 3.7.16 Let H be a complex Hilbert space and T 2 B(H). Then, the

following are equivalent.

(a) T is an isometry;
(b) T*T = I; and
(c) (Tx, Ty) = (x, y) for all x, y ∈ H.

Proof (a) implies (b). Since ‖Tx‖ = ‖x‖ for all x ∈ H, we have

$$((T^*T - I)x, x) = (Tx, Tx) - (x, x) = 0 \quad\text{for all } x \in H,$$

and since $T^*T - I$ is self-adjoint, it follows that $T^*T = I$.

(b) implies (c). (Tx, Ty) = (T*Tx, y) = (x, y).
(c) implies (a). This follows on taking y = x in (Tx, Ty) = (x, y). ∎

Theorem 3.7.17 The range ran(T) of an isometric operator T defined on a complex Hilbert space H is a closed linear subspace of H.

Proof Clearly, ran(T) = T(H) is a linear subspace of H. Suppose $y \in [\operatorname{ran}(T)]$, the closure of the range. We need to show that y ∈ ran(T). Choose a sequence $\{y_n\}_{n\ge1}$ in ran(T) such that $y_n \to y$ as $n \to \infty$. Note that $y_n = Tx_n$ for some $x_n$ in H, n = 1, 2, …. Since $\|x_m - x_n\| = \|T(x_m - x_n)\| = \|y_m - y_n\| \to 0$ as $m, n \to \infty$, it follows that $\{x_n\}_{n\ge1}$ is a Cauchy sequence. Since H is complete, there exists x in H such that $x_n \to x$. By continuity of T, we have $Tx_n \to Tx$, i.e. $Tx = \lim_n Tx_n = \lim_n y_n = y$. Hence, y = Tx, so y ∈ ran(T). ∎

The following Theorem is an alternative characterisation of unitary operators

[see Corollary 3.7.11].


Theorem 3.7.18 Let H be a complex Hilbert space and T 2 B(H). Then, the

following are equivalent:

(a) T*T = TT* = I,

(b) T is a surjective isometry and

(c) T is a normal isometry.

Proof (a) implies (b). From (a), it follows that TT* = I. This ensures that T is

surjective. It also follows that T*T = I, and hence, by Proposition 3.7.16, T is an

isometry.

(b) implies (c). Since T is an isometry, (Tx,Ty) = (x, y) by Proposition 3.7.16.

Being surjective, T must be unitary by Deﬁnition 3.7.9. Hence, T*T = TT* = I by

Corollary 3.7.11, so that T is normal.

(c) implies (a). Since T is an isometry, T*T = I by Proposition 3.7.16. Since T is

normal, T*T = TT* = I. This completes the proof. h

Definition 3.7.19 Let S and T be bounded linear operators on a Hilbert space H. The operator S is said to be unitarily equivalent to T if there exists a unitary operator U on H such that

$$S = UTU^{-1} = UTU^*.$$

If T is self-adjoint, then so is any S unitarily equivalent to T. The reason is as follows: S* = (UTU*)* = (U*)*T*U* = UTU* = S, using the hypothesis that T = T*. A similar argument shows that if T is normal, then so is S.

Problem Set 3.7

3:7:P1. Show that the range of a bounded linear operator need not be closed.

3.7.P2. [See Problem 3.7.P1] Let T : H→H be a bounded linear operator on a Hilbert space H. Suppose there exists M > 0 such that ‖Tx‖ ≥ M‖x‖ for any x ∈ H. Prove that the range of T is a closed subspace of H.

3.7.P3. Let H = ℂ² and T be the operator defined on H by the matrix $\begin{pmatrix} 0 & n \\ 0 & 0 \end{pmatrix}$. Find ‖T‖ and r(T). Show that T is not a normal operator.

3.7.P4. Let S = I + T*T : H→H, where T ∈ B(H). Show that

(a) S⁻¹ : ran(S)→H exists,
(b) ran(S) is closed,
(c) N(S) = kernel of S = {0} and
(d) ‖S⁻¹‖ ≤ 1.

3.7.P5. If $f(z) = \sum_{n=0}^{\infty} z^n/n!$ and A ∈ B(H) is such that A = A*, show that f(iA) is unitary.

3.7 Normal, Unitary and Isometric Operators 215

3.7.P6. Recall from (v) of Example 2.1.3 that RH² denotes the space of rational functions which are analytic on the closed unit disc D = {z ∈ ℂ : |z| ≤ 1}, with the usual addition and scalar multiplication and with inner product

$$(f, g) = \frac{1}{2\pi i}\int_{\partial D} f(z)\overline{g(z)}\,\frac{dz}{z}.$$

For |a| < 1, show that the operator U defined by

$$Uf(z) = \frac{(1 - |a|^2)^{1/2}}{1 - \bar a z}\,f\!\left(\frac{z - a}{1 - \bar a z}\right) \quad\text{for all } z \in D$$

is unitary on RH².

3.7.P7. Let T ∈ B(H) be a normal operator. Assume that Tᵐ = O for some positive integer m. Show that T = O.
3.7.P8. Let T ∈ B(H) be normal. Show that T is injective if, and only if, T has dense range.
3.7.P9. (a) Give an example of an operator S ∈ B(H) such that ker(S) = {0} but ran(S) is not dense in H.
(b) Give an example of an operator T ∈ B(H) such that T is surjective but ker(T) ≠ {0}.
3.7.P10. Let H be a Hilbert space. Show that the set of all normal operators in B(H) is closed in B(H) in the operator norm.
3.7.P11. If T is a normal operator on the complex Hilbert space H and S ∈ B(H) is such that TS = ST, then T*S = ST*.

Let $T \in B(H)$ be a self-adjoint operator on a complex Hilbert space H ≠ {0}. Then, $\sigma(T) \subseteq \mathbb{R}$. So, $\pm i \in \rho(T)$, the resolvent set of T. The operators T ± iI are invertible elements in B(H). Consider the operator

$$U = (T - iI)(T + iI)^{-1} = (T + iI)^{-1}(T - iI),$$

whose inverse is

$$U^{-1} = (T + iI)(T - iI)^{-1} = (T - iI)^{-1}(T + iI).$$

3.7.P12. (a) Show that U is unitary and U = I − 2i(T + iI)⁻¹.
(b) Also show that 1 ∈ ρ(U) and
(c) T = i(I + U)(I − U)⁻¹ = i(I − U)⁻¹(I + U).
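The operator U above is the Cayley transform of T. In finite dimensions all three parts of the problem can be verified directly; the following sketch (NumPy, with an arbitrarily chosen self-adjoint matrix) is offered as a numerical check, not a proof.

```python
import numpy as np

T = np.array([[2.0, 1.0], [1.0, -1.0]])          # self-adjoint
I = np.eye(2)

U = (T - 1j * I) @ np.linalg.inv(T + 1j * I)     # Cayley transform
assert np.allclose(U @ U.conj().T, I) and np.allclose(U.conj().T @ U, I)  # (a): unitary
assert np.allclose(U, I - 2j * np.linalg.inv(T + 1j * I))                 # (a): second form

# (b)/(c): I - U is invertible (1 is in the resolvent set of U), and T is recovered.
assert np.allclose(1j * (I + U) @ np.linalg.inv(I - U), T)
```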


position theorem [Theorem 2.10.11] says that H = M ⊕ M⊥, where M⊥ denotes the

orthogonal complement of M. Thus, for each x 2 H, there exists a unique y 2 M and

z 2 M⊥ such that x = y + z.

The concept of orthogonal projection operator PM, or briefly, projection, was

deﬁned in Deﬁnition 2.10.16. It was proved in Theorem 2.10.15 that the mapping

PM:H!H has range M; its kernel is M⊥ and PM restricted to M is the identity

operator on M. Also proved therein are the following:

(i) PM is linear, bounded with norm 1;

(ii) PM is self-adjoint; and

(iii) $P_M$ is idempotent: $P_M^2 = P_M$.

Associated with any closed subspace M of H, the orthogonal projection operator

PM, or briefly, P, has the properties (i), (ii) and (iii), and also satisﬁes ran(PM) = M,

ker(PM) = M⊥ [Theorem 2.10.15].

We now reverse the above trend and show that if $P \in B(H)$ is such that P* = P and P² = P, then there exists a unique closed subspace M of H such that P is the associated orthogonal projection operator $P_M$.

Set

$$M = \{x \in H : Px = x\}.$$

We next show that ran(P) = M and ker(P) = M⊥. Indeed, if x ∈ H, then Px = P²x = P(Px). Thus, Px ∈ M for each x ∈ H, i.e. PH ⊆ M. On the other hand, if x ∈ M, then x = Px ∈ PH. Hence, PH = M. Also, if Px = 0, then for z ∈ H, (x, P*z) = (Px, z) = (0, z) = 0, that is, x ∈ (P*H)⊥ = (PH)⊥ = M⊥. On the other hand, if x ∈ M⊥, then (Px, z) = (x, P*z) = (x, Pz) = 0 for each z ∈ H. Therefore, Px = 0 for x ∈ M⊥. Finally, for x ∈ H, we have x = y + z, where y ∈ M and z ∈ M⊥, and hence, Px = Py + Pz = y. Thus, P is the operator of orthogonal projection on M.

Combining the discussions in the paragraph above, we have the following

Theorem.

Theorem 3.8.2 Let $P \in B(H)$. Then, P is an orthogonal projection (onto a closed subspace of H) if, and only if, P² = P and P* = P.

Remarks 3.8.3

(i) The argument used to establish the above theorem shows that to each closed linear subspace M in H there corresponds a unique orthogonal projection P and, conversely, to each orthogonal projection P there corresponds the closed linear subspace M = {x ∈ H : Px = x} = ran(P). This enables us to express geometric properties of subspaces in terms of algebraic properties of the projections corresponding to them [see Theorems 3.8.4 and 3.8.5 below].

(ii) Every orthogonal projection is a positive operator: Indeed,

$$(Px, x) = (P^2x, x) = (Px, Px) = \|Px\|^2 \ge 0.$$

(iii) Consider the operator P on ℂ² corresponding to the matrix $P = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$. Observe that P² = P. Its range is {(x, 0) : x ∈ ℂ} and its kernel is {(x, −x) : x ∈ ℂ}. However, P* has matrix $\begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix} \ne \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$. So it is not an orthogonal projection.
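Both halves of (iii) are easy to check numerically; the sketch below (NumPy, not from the text) contrasts the non-self-adjoint idempotent above with a genuine orthogonal projection onto a line.

```python
import numpy as np

P = np.array([[1.0, 1.0], [0.0, 0.0]])
assert np.allclose(P @ P, P)                     # idempotent
assert not np.allclose(P, P.T)                   # but not self-adjoint

# Orthogonal projection onto span{u}: Q = u u^T / (u, u).
u = np.array([1.0, 2.0])
Q = np.outer(u, u) / (u @ u)
assert np.allclose(Q @ Q, Q) and np.allclose(Q, Q.T)
assert np.allclose(Q @ u, u)                     # Q restricted to its range is the identity
```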

(iv) Let $(X, \mathfrak{M}, \mu)$ be a σ-finite measure space. For $y \in L^\infty(\mu)$, consider the operator T on $L^2(\mu)$ of multiplication by y:

$$Tx = yx, \quad x \in L^2(\mu).$$

[See (vi) of Example 3.2.5.] The operator T is bounded with $\|T\| = \|y\|_\infty$, and it is self-adjoint if, and only if, y is real-valued a.e. Observe that T² = T if, and only if, y² = y a.e., or y is equal a.e. to a characteristic function. Thus, if the operator of multiplication by a real-valued y is a projection, then it is an orthogonal projection.

We propose to show below in detail how the closed subspaces of a Hilbert space

and the corresponding orthogonal projections are related to each other.

Theorem 3.8.4 Let M and N be closed subspaces of a Hilbert space H, and P and

Q denote the projections on M and N, respectively. Then,

(a) I − P is the projection on M⊥;

(b) M⊥N if, and only if, PQ = O.

Proof (a) We have $(I - P)^2 = I - 2P + P^2 = I - P$ and $(I - P)^* = I - P$. Thus, I − P is a projection operator. We next show that {x ∈ H : (I − P)x = x} = M⊥. If (I − P)x = x, then Px = 0, which implies x ∈ M⊥. On the other hand, if x ∈ M⊥, then Px = 0, and hence, (I − P)x = x.

(b) Suppose that PQ = O. Then for x ∈ M and y ∈ N,

$$(x, y) = (Px, Qy) = (x, PQy) = (x, 0) = 0,$$

so that M⊥N. Conversely, if M⊥N, then for every x ∈ H, Qx ∈ N ⊆ M⊥ = ker(P), so that PQx = 0 for x ∈ H. Hence, PQ = O. ∎


Under the condition (b) of the above theorem, we speak of projections P and

Q themselves as being orthogonal.

Theorem 3.8.5 Let M and N be closed subspaces of a Hilbert space H. If P and Q

denote projections on M and N, respectively, then the following are equivalent:

(a) M ⊆ N;
(b) P ≤ Q;
(c) PQ = P; and
(d) QP = P.

Proof (a) implies (c). If M ⊆ N, then Px ∈ N for each x ∈ H. Therefore, Q(Px) = Px, x ∈ H; so QP = P. Also, (QP)* = P*, that is, P*Q* = P*, which implies PQ = P.

(c) implies (b). Suppose PQ = P. Then for x ∈ H,

$$(Px, x) = (P^2x, x) = (Px, Px) = \|Px\|^2 = \|PQx\|^2 \le \|Qx\|^2 = (Qx, Qx) = (Qx, x).$$

Hence, P ≤ Q.

(b) implies (a). Suppose that P ≤ Q and let x ∈ M. Then,

$$\|x\|^2 = \|Px\|^2 = (Px, x) \le (Qx, x) = (Q^2x, x) = (Qx, Qx) = \|Qx\|^2 \le \|x\|^2,$$

so that $\|x\|^2 = \|Qx\|^2$. Now,

$$x = Qx + (I - Q)x,$$

where Qx ⊥ (I − Q)x, and so,

$$\|(I - Q)x\|^2 = \|x\|^2 - \|Qx\|^2 = 0.$$

Consequently,

$$x = Qx,$$

i.e. x ∈ N.

(c) implies (d). Let PQ = P. Then, P = P* = (PQ)* = Q*P* = QP. Now let QP =

P. Then, P = P* = (QP)* = P*Q* = PQ. h


The next few results give necessary and sufﬁcient conditions for addition,

subtraction and multiplication of projection operators to result in a projection

operator.

Theorem 3.8.6 Let $\{P_i\}_{i\ge1}$ be a denumerable or finite family of projections and $\sum_i P_i = P$ in the sense of strong convergence. Then, a necessary and sufficient condition that P be a projection is that $P_jP_k = O$ whenever $j \ne k$. If this condition is satisfied and if, for each j, the range of $P_j$ is $M_j$, then the range of P is $M = \sum_i M_i = \{x \in H : x = \sum_i x_i,\ x_i \in M_i,\ i = 1, 2, \ldots\} = [\cup_k M_k]$.

Proof If the family $\{P_i\}_{i\ge1}$ satisfies the condition, then

$$P^2 = \Big(\sum_j P_j\Big)\Big(\sum_k P_k\Big) = \sum_{j,k} P_jP_k = \sum_k P_k^2 = \sum_k P_k = P$$

and

$$(Px, y) = \sum_k (P_kx, y) = \sum_k (x, P_ky) = (x, Py)$$

for every pair x, y in H. In other words, the orthogonality of the family $\{P_i\}$ implies that P is idempotent and Hermitian, and hence, P is a projection.

If, conversely, P is a projection and if $x \in \operatorname{ran}(P_k)$ for some value of k, then

$$\|x\|^2 \ge (Px, x) = \sum_j (P_jx, x) = \sum_j \|P_jx\|^2 \ge \|P_kx\|^2 = \|x\|^2.$$

It follows that every term in the chain of inequalities is equal to every other term. From the equality

$$\sum_j \|P_jx\|^2 = \|P_kx\|^2,$$

we conclude that $P_jx = 0$ for $j \ne k$. Thus, the family $\{P_i\}_{i\ge1}$ satisfies the condition $P_jP_k = O$ whenever $j \ne k$.

We next show that $\operatorname{ran}(P) = \sum_i M_i$, where $M_i = \operatorname{ran}(P_i)$. For any $Px \in \operatorname{ran}(P)$, we have $Px = \sum_i P_ix \in \sum_i M_i$, because $P_ix \in M_i$. Thus, $\operatorname{ran}(P) \subseteq \sum_i M_i$. On the other hand, every $z \in \sum_i M_i$ is of the form $\sum_i x_i$, $x_i \in M_i$, so that $Pz = \sum_i P_ix_i = \sum_i x_i = z$, which implies $z \in \operatorname{ran}(P)$. Thus, $\sum_i M_i \subseteq \operatorname{ran}(P)$.

Finally, we show that $\operatorname{ran}(P) = [\cup_k M_k]$. From the equality of ‖x‖ and ‖Px‖ for $x \in M_k$, we conclude that $x \in \operatorname{ran}(P)$, and hence $M_k \subseteq \operatorname{ran}(P)$ for all k; it therefore follows that $[\cup_k M_k] \subseteq \operatorname{ran}(P)$. On the other hand, $P_kx \in M_k$ for every vector x and every value of k; it follows that $Px = \sum_k P_kx \in \sum_k M_k \subseteq [\cup_k M_k]$ for all x, i.e. $\operatorname{ran}(P) \subseteq [\cup_k M_k]$. ∎

The useful fact about the product of projections is contained in the following.

Theorem 3.8.7 The product of two projection operators P and Q is a projection

operator if, and only if, PQ = QP. In this case, PQ is the projection on M ∩ N, where M [resp. N] is the subspace of H on which P [resp. Q] is the projection.

Proof Suppose first that PQ = QP, and put R = PQ. Then $R^* = (PQ)^* = Q^*P^* = QP = PQ = R$ and

$$R^2 = (PQ)(PQ) = PPQQ = P^2Q^2 = PQ = R,$$

so that R is a projection. Conversely, if PQ is a projection, then $PQ = (PQ)^* = Q^*P^* = QP$.

Finally, we show that the range of PQ is M ∩ N. For x ∈ H, let

$$y = PQx = QPx.$$

Then $y = P(Qx) \in M$ and $y = Q(Px) \in N$. Hence, y ∈ M ∩ N, i.e. the range of PQ satisfies ran(PQ) ⊆ M ∩ N. If x ∈ M ∩ N, then PQx = x. Thus, ran(PQ) = M ∩ N. ∎
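For two commuting orthogonal projections, the product can be watched projecting onto the intersection of the ranges; the sketch below (NumPy, with coordinate subspaces of ℂ³ chosen for simplicity, not from the text) illustrates Theorem 3.8.7.

```python
import numpy as np

def proj(*cols):
    # Orthogonal projection onto the span of the given (independent) columns.
    A = np.column_stack(cols)
    return A @ np.linalg.inv(A.T @ A) @ A.T

e1, e2, e3 = np.eye(3)
P = proj(e1, e2)                                 # projection on M = span{e1, e2}
Q = proj(e2, e3)                                 # projection on N = span{e2, e3}

assert np.allclose(P @ Q, Q @ P)                 # P and Q commute
R = P @ Q
assert np.allclose(R @ R, R) and np.allclose(R, R.T)   # so PQ is a projection
assert np.allclose(R, proj(e2))                  # with range M ∩ N = span{e2}
```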

We ﬁnally treat the difference of projections.

Theorem 3.8.8 The difference of two projections, P₁ − P₂, is a projection if, and only if, M₂ ⊆ M₁, where M₁ [resp. M₂] is the subspace of H on which P₁ [resp. P₂] is the projection. In this case, $\operatorname{ran}(P_1 - P_2) = M_1 \cap M_2^{\perp}$.

Proof Suppose that P₁ − P₂ is a projection. Then for every x ∈ H,

$$((P_1 - P_2)x, x) = ((P_1 - P_2)^2x, x) = ((P_1 - P_2)x, (P_1 - P_2)x) = \|(P_1 - P_2)x\|^2 \ge 0,$$

so that P₂ ≤ P₁, which proves that M₂ ⊆ M₁ [see Theorem 3.8.5]. On the other hand, suppose that M₂ ⊆ M₁. Then,

$$P_1P_2 = P_2P_1 = P_2. \tag{3.40}$$

Now,

$$(P_1 - P_2)^2 = P_1 - P_1P_2 - P_2P_1 + P_2 = P_1 - P_2$$

and

$$(P_1 - P_2)^* = P_1^* - P_2^* = P_1 - P_2,$$

so that P₁ − P₂ is a projection.

It remains to identify ran(P₁ − P₂) with $M_1 \cap M_2^{\perp}$. Since P₁P₂ = P₂P₁ by (3.40) above, it follows that

$$P_1(I - P_2) = P_1 - P_1P_2 = P_1 - P_2P_1 = (I - P_2)P_1.$$

Hence, by Theorem 3.8.7, P₁(I − P₂) is an orthogonal projection with range given by

$$\operatorname{ran}(P_1(I - P_2)) = M_1 \cap M_2^{\perp},$$

and P₁(I − P₂) = P₁ − P₁P₂ = P₁ − P₂. ∎

Let H be a finite-dimensional Hilbert space and $T \in B(H)$ be such that T*T = TT*. The subspace M formed by the eigenvectors belonging to a certain eigenvalue λ is invariant under T, i.e. T(M) ⊆ M. In fact, T(M⊥) ⊆ M⊥ as well. Since T and T* commute, it follows that $(T - \lambda I)$ and $(T^* - \bar\lambda I)$ commute. Therefore, they have the same kernel. This implies that $Ty = \lambda y$ if, and only if, $T^*y = \bar\lambda y$. Let x ∈ M⊥ and y ∈ M. Then, $(Tx, y) = (x, T^*y) = (x, \bar\lambda y) = \lambda(x, y) = 0$. Consequently, T(M⊥) ⊆ M⊥. M is called a reducing subspace of T.

Although no analogous structure theory exists for operators on

inﬁnite-dimensional spaces, the notions of “invariant subspaces” and “reducing

subspaces” do make sense.

Definition 3.8.9 A subspace M of a Hilbert space H is said to be invariant under a bounded linear operator T ∈ B(H) if T(M) ⊆ M. The subspace M ⊆ H is said to reduce T if T(M) ⊆ M and T(M⊥) ⊆ M⊥, i.e. if both M and M⊥ are invariant under T. Then, M and M⊥ are called reducing subspaces of T.

It can be easily checked that M reduces T if, and only if, M is invariant under

both T and T*.

The investigation of T is facilitated by considering T|M and T|M⊥ separately.

Note that the subspace {0} and H are invariant under any T 2 B(H). Also,

ker(T) is always invariant under T; for Tx = 0 implies T(Tx) = 0.

Theorem 3.8.10 Let P be the orthogonal projection onto the subspace M of H.

Then, M is invariant under an operator T 2 B(H) if, and only if, TP = PTP;

M reduces T if, and only if, TP = PT.

Proof For each x 2 H, Px 2 M. Suppose M is invariant under T. Then, T(Px) 2 M,

and hence, PTPx = TPx; so PTP = TP. Conversely, if PTP = TP, then for every x 2

M, we have Tx = TPx = PTPx, and this is a vector in M. This proves that M is

invariant under T.

It remains to show that M reduces T if, and only if, TP = PT.

M reduces T if, and only if, TP = PTP and T(I − P) = (I − P)T(I − P) if, and only

if, TP = PTP = PT. This completes the proof. h
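The criteria TP = PTP and TP = PT are easy to test on matrices; the sketch below (NumPy, with arbitrarily chosen examples, not from the text) shows a subspace that is invariant but not reducing, and one that reduces.

```python
import numpy as np

P = np.diag([1.0, 1.0, 0.0])                     # projection onto M = span{e1, e2}

T = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [0.0, 0.0, 6.0]])                  # upper triangular, so M is invariant
assert np.allclose(T @ P, P @ T @ P)             # invariance criterion TP = PTP
assert not np.allclose(T @ P, P @ T)             # but M does not reduce this T

D = np.diag([1.0, 2.0, 3.0])
assert np.allclose(D @ P, P @ D)                 # M reduces the diagonal operator D
```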

Problem Set 3.8

3.8.P1. (a) Define Tₙx = (1/n)x for all x ∈ ℓ². Show that limₙ‖Tₙ‖ = 0.
(b) Let {eₙ : n ≥ 1} be an orthonormal basis of ℓ² and let Pₙ be the orthogonal projection on the linear span of {e₁, e₂, …, eₙ}, so that I − Pₙ is the orthogonal projection on the complement of this space. Show that Pₙ→I in strong operator convergence, but not in the operator norm convergence.

(c) Let T : ℓ²→ℓ² be defined as follows: T((x₁, x₂, …)) = (0, x₁, x₂, …). Show that Tⁿ→O weakly but not strongly. For x, y ∈ ℓ²,

$$(T^n x, y) = \sum_{k=1}^{\infty} x_k\,\overline{y_{n+k}}.$$

In the next two problems, a projection P in a linear space X means a linear map P : X→X such that P² = P. (Note that we do not require a projection to be a bounded linear operator or to be self-adjoint.)

3:8:P2. Let P be a projection in X. Then,

(a) I − P is a projection in X;

(b) ran(P) = {x 2 X : Px = x};

(c) ran(P) = ker(I − P);

(d) X = ran(P) ⊕ ran(I − P); and

(e) if P is bounded, then ran(P) and ran(I − P) are closed.

3:8:P3. Show that a projection P in a Hilbert space is an orthogonal projection iff

ran(P)⊥ker(P).

3.8.P4. Consider the Volterra operator V on L²[0,1] given by

$$Vx(s) = \int_0^s x(t)\,dt, \quad x \in L^2[0, 1].$$

Show that V + V* is the orthogonal projection onto the one-dimensional subspace spanned by the constant vector 1.

3.9 Polar Decomposition

We shall now obtain the "polar decomposition" of an operator, analogous to the representation of a complex number z as $|z|e^{i\theta}$ for some real θ. Does an analogue exist for operators? In order to answer this question, we need to define the analogues of |z| and $e^{i\theta}$ amongst operators suitably.

Definition 3.9.1 For $T \in B(H)$,

$$|T| = \sqrt{T^*T}.$$

Remarks 3.9.2

(i) The reader should note that $T^*T \ge O$ and therefore $\sqrt{T^*T}$ is uniquely defined and is positive.
(ii) It is true that $|\lambda T| = |\lambda|\,|T|$, whenever $\lambda \in \mathbb{C}$ and $T \in B(H)$.
(iii) If the square of an operator S is invertible, i.e. $S^2U = US^2 = I$ for some U, then we have S(SU) = I = (US)S, so that S is invertible. Also, if T is invertible, then so is T*T, and consequently, |T| is invertible (take S = |T|, so that S² = T*T).

The analogue in B(H) of the complex numbers of absolute value 1 is rather complicated. At first one might expect that unitary operators would suffice. A little reflection shows that this is not the case.

Example 3.9.3 Let T be the simple unilateral shift on ℓ². Then, as seen in Remark 3.7.15(ii), T*T = I, so that $|T| = \sqrt{T^*T} = I$, but T is not unitary. So, if we write T = U|T| or |T|U, we must have U = T, which is not unitary.

Definition 3.9.4 An operator T ∈ B(H) is called a partial isometry if T is an isometry when restricted to the closed subspace [ker(T)]⊥, i.e. ‖Tx‖ = ‖x‖ for every x ∈ [ker(T)]⊥.

Observe that ‖T‖ ≤ 1. Every isometry is a partial isometry. Every orthogonal projection is a partial isometry.

The subspace [ker(T)]⊥ is called the initial space of T and ran(T) is called its final space. It is obvious that the initial space is always closed; we shall now show that the final space too is always closed, i.e. [ran(T)] = ran(T) when T is a partial isometry: let x ∈ [ran(T)]. Then, there exists a sequence $\{x_n\}_{n\ge1}$ in H such that $Tx_n \to x$. For each n, there exist $y_n \in \ker(T)$ and $z_n \in [\ker(T)]^{\perp}$ such that $x_n = y_n + z_n$. Then, we have $Tz_n = Tx_n$ and also

$$\|Tx_n - Tx_m\| = \|T(z_n - z_m)\| \quad\text{because } y_n - y_m \in \ker(T)$$
$$= \|z_n - z_m\| \quad\text{because } z_n - z_m \in [\ker(T)]^{\perp}.$$

But $\{Tx_n\}_{n\ge1}$ is a Cauchy sequence (since it converges to x), and by the above equality, $\{z_n\}_{n\ge1}$ is also a Cauchy sequence. Let $z_n \to z$. By continuity of T, we have $Tz = \lim_n Tz_n = \lim_n Tx_n = x$, which shows that x ∈ ran(T).


Proposition 3.9.5 Let U 2 B(H). Then, the following statements are equivalent:

(a) U is a partial isometry;

(b) U* is a partial isometry;

(c) U*U is a projection; and

(d) UU* is a projection.

Moreover, U*U is a projection on [ker(U)]⊥ and UU* is a projection on

½ranðUÞ = ran(U).

Proof (a) implies (c): to begin with, observe that for any T ∈ B(H), we have ker(T) = ker(T*T) by Theorem 3.5.8.

For x ∈ H,

$$((I - U^*U)x, x) = \|x\|^2 - \|Ux\|^2 \ge 0,$$

since ‖U‖ ≤ 1. Thus, I − U*U is a positive operator. Now if x⊥ker(U), then ‖Ux‖ = ‖x‖, which implies that ((I − U*U)x, x) = 0. Since $\|(I - U^*U)^{\frac12}x\|^2 = ((I - U^*U)x, x) = 0$, it follows that (I − U*U)x = 0, i.e. U*Ux = x for x ∈ [ker(U)]⊥. Moreover, U*U maps ker(U) into {0}. Consequently, (U*U)² = U*U. Since U*U is self-adjoint, it follows by Theorem 3.8.2 that it is a projection onto the orthogonal complement of its own kernel. However, its kernel is the same as that of U. (Note that the orthogonal complement is by definition the initial space of U.)

(c) implies (a): if U*U is a projection and x⊥ker(U*U), then U*Ux = x. Therefore,

$$\|Ux\|^2 = (U^*Ux, x) = (x, x) = \|x\|^2,$$

and hence, U preserves the norm on [ker(U*U)]⊥. But as noted at the beginning, ker(U*U) = ker(U). Therefore, U is a partial isometry.

(b) implies (d) and (d) implies (b) follow by reversing the roles of U and U*.

(c) implies (d): first observe that UU* is self-adjoint. We shall show that

$$(UU^*)^2 = UU^*.$$

It is enough to show that UU*U = U. To this end, we note that this holds on ker(U). Since it has already been proved that (c) implies (a), we know that U is a partial isometry. Therefore, for x in ker(U)⊥, we have ‖Ux‖ = ‖x‖, which implies U*Ux = x (see the proof of (a) implies (c)); thus, we have UU*U = U also on ker(U)⊥ and hence on all of H. ∎

Observe that Proposition 3.9.5 has the following consequence: if U is a partial isometry, then ‖Ux‖ = ‖x‖ if, and only if, $x \in \operatorname{ran}(U^*U)$. Indeed, $\|Ux\|^2 = (Ux, Ux) = (U^*Ux, x) = (U^*UU^*Ux, x) = \|U^*Ux\|^2$, and it is true of any orthogonal projection P that ‖Px‖ = ‖x‖ is equivalent to x ∈ ran(P).


Theorem 3.5.8 will be used frequently without explicit mention.

Theorem 3.9.6 (Polar Decomposition) Let T ∈ B(H). Then, there is a partial isometry U such that T = U|T| and ker(U) = ker(T). Moreover, ran(U) is the closure of ran(T). Amongst all bounded linear operators V such that T = V|T|, U is uniquely determined by the condition ker(V) ⊇ ker(T).

Proof Define U : ran(|T|) → ran(T) by U(|T|x) = Tx. Since

|| |T|x ||² = (|T|x, |T|x) = (|T|²x, x) = (T*Tx, x) = (Tx, Tx) = ||Tx||², (3.41)

if |T|x = |T|y then Tx = Ty. The equality (3.41) also shows that U preserves norms and hence extends to a norm preserving linear mapping of the closure of ran(|T|) onto the closure of ran(T) such that ker(U) = {0}. Extend U to all of H by defining it to be zero on [the closure of ran(|T|)]⊥ = ker(|T|), so that it now has kernel equal to ker(|T|) but the same range as before, which is the closure of ran(T). Observe that T = U|T| on H. Furthermore, in view of (3.41), |T|x = 0 if, and only if, Tx = 0, so that ker(|T|) = ker(T). Thus, ker(U) = ker(T) and, as already noted, ran(U) is the closure of ran(T).

We next consider uniqueness.

If V is any linear operator with V|T| = T and ker(V) ⊇ ker(T), we note that Vy = Uy for every y ∈ ran(|T|), so that U = V on the closure of ran(|T|). Since both operators are zero on [the closure of ran(|T|)]⊥ = ker(|T|) = ker(T) ⊆ ker(V), it follows that V = U. □

The preceding decomposition theorem is due to von Neumann.

The factorisation T = U|T|, where U is the unique partial isometry such that T =

U|T| and ker(U) = ker(T) is called the polar decomposition of T and U is called the

partial isometry in the polar decomposition of T.

The uniqueness argument in the last paragraph of the above proof begins by

assuming that V satisﬁes ker(V) ker(T) as well as V|T| = T, but not that it is a

partial isometry. Nevertheless, even a partial isometry V satisfying only T = V|T|

need not be unique. This is illustrated by (ii) of the Remarks below.
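For matrices, the polar decomposition can be computed from a singular value decomposition. The sketch below is a finite-dimensional illustration (the matrix and the rank tolerance are arbitrary choices): it forms |T| = V*ΣV from the SVD T = WΣV, keeps only the nonzero singular directions to obtain a partial isometry U with ker(U) = ker(T), and checks T = U|T|.

```python
import numpy as np

# Polar decomposition T = U|T| from the SVD T = W diag(s) Vh:
# |T| = Vh* diag(s) Vh, and U keeps only the nonzero singular
# directions, so that ker(U) = ker(T).
T = np.array([[0.0, 1.0],
              [0.0, 2.0]])
W, s, Vh = np.linalg.svd(T)
r = int(np.sum(s > 1e-12))               # numerical rank
absT = Vh.conj().T @ np.diag(s) @ Vh     # |T| = (T*T)^{1/2}
U = W[:, :r] @ Vh[:r, :]                 # partial isometry factor

assert np.allclose(U @ absT, T)                  # T = U|T|
assert np.allclose(absT @ absT, T.conj().T @ T)  # |T|² = T*T
```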

Remarks 3.9.7 (i) If T ∈ B(H) is invertible, the partial isometry in its polar decomposition is unitary, as we now show.

Since T is invertible, ker(T) = {0} and ran(T) = H. Consequently, if T = U|T| is the polar decomposition of T, then ker(U) = ker(T) = {0} and ran(U), being the closure of ran(T), is H. Hence, U is unitary.

(ii) If y(t) is a complex measurable function on [0, 1], there are complex measurable functions a on [0, 1] such that |a(t)| = 1 when y(t) ≠ 0 and y(t) = a(t)|y(t)| everywhere. Consider the operator T of multiplication by y on L²[0, 1]; then |T| is the operator of multiplication by |y(t)|, and T = V|T|, where V is the operator of multiplication by a, in accordance with the factorisation

y(t) = a(t)|y(t)|.

If y vanishes on a set Y of positive measure, then several such a are possible, amongst which several have the property that |a| is the characteristic function of some set. In case a is chosen (nonuniquely) so that |a| is the characteristic function of some set E, then V can be shown to be a partial isometry by arguing as follows. The kernel of V is {x ∈ L²[0, 1] : x(t) = 0 a.e. on E} and its orthogonal complement is {x ∈ L²[0, 1] : x(t) = 0 a.e. on Eᶜ}; we have to show for any x in this orthogonal complement that ||Vx|| = ||x||, i.e. ∫₀¹ |a(t)x(t)|² dt = ∫₀¹ |x(t)|² dt. Since |a| is the characteristic function of E, the former integral equals ∫_E |x(t)|² dt; since x vanishes a.e. on Eᶜ, the latter integral also equals ∫_E |x(t)|² dt. Thus, the two integrals are equal.

What it takes for V to have the same kernel as T is that

ker(V) = {x ∈ L²[0, 1] : x(t) = 0 a.e. on Yᶜ},

or equivalently, the symmetric difference (E\Yᶜ) ∪ (Yᶜ\E) has measure zero. This amounts to saying that the characteristic function |a| of E must be equal a.e. to that of Yᶜ. In other words, a must be equal a.e. to 0 on Y and y(t)/|y(t)| on Yᶜ. With this choice of a, the polar decomposition of T is V|T|.

It has been shown by Ichinose and Iwashita in [14] that a partial isometry V such that T = V|T| is unique if, and only if, either ker(T) or ker(T*) is {0}. They have proved this for operators from one Hilbert space to another, but we shall confine ourselves to operators from a Hilbert space into itself. Our considerations carry over verbatim to the broader case. We begin with a preliminary remark.

Remark 3.9.8 The zero operator is a partial isometry. It is easy to see that, given a partial isometry V ∈ B(H) and any x ∈ H, the equality ||Vx|| = ||x|| is equivalent to x ∈ (ker(V))⊥. Also, given any partial isometry V and any complex number λ of absolute value 1, the operator λV is a partial isometry with the same kernel as V. Distinct λ give rise to distinct partial isometries λV unless V = O.

Proposition 3.9.9 If T ∈ B(H) and V is a partial isometry such that T = V|T|, then

(a) V*T = |T| = T*V;

(b) V*V|T| = |T| and V|T|V* = |T*|.

Proof (a) Since T = V|T|, we have V*T = V*V|T|. Therefore, in order to show that V*T = |T|, it is sufficient to arrive at ran(|T|) ⊆ ran(V*V). We arrive at this by showing that y = |T|x implies ||Vy|| = ||y|| and using the observation just after Proposition 3.9.5:

||Vy||² = (V|T|x, V|T|x) = (Tx, Tx) = (T*Tx, x) = (|T|²x, x) = (|T|x, |T|x) = ||y||².

The observation then gives y ∈ ran(V*V), so that V*Vy = y for every y ∈ ran(|T|), whence V*T = V*V|T| = |T|. Taking adjoints yields T*V = |T| as well.

(b) It follows from (a) that V*V|T| = V*T = |T|. As for V|T|V*, we note that it is positive and that its square is V|T|V*V|T|V* = V|T|(V*V|T|)V* = V|T|²V* = (V|T|)(|T|V*) = TT*. It is immediate from here that V|T|V* = |T*|. □
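The identities in Proposition 3.9.9 can be checked numerically for the polar factors obtained from an SVD. The sketch below uses an arbitrary sample matrix; the construction of V and |T| follows the finite-dimensional recipe sketched earlier and is not from the text.

```python
import numpy as np

# Checking V*T = |T|, V*V|T| = |T| and V|T|V* = |T*| for the polar
# factors of a sample matrix.
T = np.array([[1.0, 1.0],
              [0.0, 0.0]])
W, s, Vh = np.linalg.svd(T)
r = int(np.sum(s > 1e-12))
V = W[:, :r] @ Vh[:r, :]                  # partial isometry with T = V|T|
absT = Vh.conj().T @ np.diag(s) @ Vh      # |T|  = (T*T)^{1/2}
absTstar = W @ np.diag(s) @ W.conj().T    # |T*| = (TT*)^{1/2}

assert np.allclose(V @ absT, T)
assert np.allclose(V.conj().T @ T, absT)              # V*T = |T|
assert np.allclose(V.conj().T @ V @ absT, absT)       # part (b), first
assert np.allclose(V @ absT @ V.conj().T, absTstar)   # part (b), second
```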

Theorem 3.9.10 If T ∈ B(H) and either ker(T) or ker(T*) is {0}, then there is a unique partial isometry V such that T = V|T|.

Proof Existence has been established in Theorem 3.9.6. Uniqueness when ker(T) = {0} is a trivial consequence of the last part of that Theorem.

To prove uniqueness when ker(T*) = {0}, consider any partial isometries U and V such that T = U|T| and T = V|T|. By Proposition 3.9.9(a), we have T*U = |T| = T*V. When ker(T*) = {0}, this equality leads to U = V immediately. □

Theorem 3.9.11 If T ∈ B(H) and there is a unique partial isometry V such that T = V|T|, then either ker(T) or ker(T*) is {0}.

Proof We prove the contrapositive: if ker(T) ≠ {0} ≠ ker(T*), then there exist several partial isometries V satisfying T = V|T|. Theorem 3.9.6 ensures that at least one such partial isometry U always exists and we show how to get others from it when ker(T) ≠ {0} ≠ ker(T*). Recall that Theorem 3.9.6 provides not only that

T = U|T|

but also that ker(U) = ker(T) and ran(U) is the closure of ran(T).

Since ker(T) and ker(T*) must each have a one-dimensional subspace, there exists an isometry from the former one-dimensional subspace to the latter. Extend it to be an element of B(H) by defining it to be 0 on the orthogonal complement of the one-dimensional subspace and call it V. Then, V is a partial isometry, distinct from O; moreover,

(ker(T))⊥ ⊆ ker(V)

and

ran(V) ⊆ ker(T*).

There are infinitely many possibilities for V because λV has the same properties when |λ| = 1. Since it is a partial isometry, V*V is the projection on (ker(V))⊥. Since ker(T*) = [the closure of ran(T)]⊥ = (ran(U))⊥ = ker(U*), the second of the above inclusions is equivalent to ran(V) ⊆ ker(U*), which is to say,

U*V = O.

Besides, in the light of the fact that ran(|T|) ⊆ ker(|T|)⊥ = (ker(T))⊥, the first of the above inclusions leads to ran(|T|) ⊆ ker(V), which can be rephrased as

V|T| = O.

We claim that W = U + V is a partial isometry, distinct from U, such that W|T| = T. The latter is an easy consequence of the equality V|T| = O:

W|T| = (U + V)|T| = U|T| + V|T| = U|T| + O = U|T| = T.

To see that W is a partial isometry, it suffices to show that W*W is a projection [Proposition 3.9.5]. Keeping in mind that U*V = O, so that V*U = O as well, we have

W*W = (U + V)*(U + V) = U*U + U*V + V*U + V*V = U*U + V*V.

U*U and V*V are projections on mutually orthogonal subspaces. Therefore, their sum W*W is a projection. This establishes that W is a partial isometry. □

We note in passing that every partial isometry W such that W|T| = T must necessarily be of the form U + V, where V is a partial isometry which, as in the foregoing proof, satisfies (ker(T))⊥ ⊆ ker(V) and ran(V) ⊆ ker(T*). For details, the reader is referred to [14].

Theorem 3.9.12 If T ∈ B(H) and n ∈ ℕ, then || |T|ⁿ || = ||T||ⁿ.

Proof Equality (3.41) in the proof of Theorem 3.9.6 justifies the case n = 1. For other values of n, the desired equality follows upon applying Theorem 3.6.18 to the self-adjoint operator |T| and using the case when n = 1. □

Proposition 3.9.13 If T ∈ B(H), then

ker(T) = ker(T*T) = ker(|T|) and the closure of ran(|T|) equals the closure of ran(T*).

Proof The first equality is a restatement of the first equality of Theorem 3.5.8. Applying it to |T| in place of T, we get ker(|T|) = ker(|T|*|T|) = ker(|T|²) = ker(T*T). The last equality follows upon taking orthogonal complements and invoking the third equality in Theorem 3.5.8. □


Problem Set 3.9

3:9:P1. Let T : ℓ² → ℓ² be defined by (x₁, x₂, x₃, x₄, …) → (0, 0, x₃, x₄, …). Without using general properties of projections, show that T is bounded and positive. Find the square root of T.

3:9:P2. Let T ∈ B(H) be self-adjoint and positive, where H denotes a complex Hilbert space. Show that

(a) ||T^{1/2}|| = ||T||^{1/2},

(b) |(Tx, y)| ≤ (Tx, x)^{1/2}(Ty, y)^{1/2} and

(c) ||Tx|| ≤ ||T||^{1/2}(Tx, x)^{1/2}, so that (Tx, x) = 0 if, and only if, Tx = 0.

3:9:P3. (a) If T ∈ B(H) is a partial isometry and x ∈ ran(T), show that T*x is the unique element y of [ker(T)]⊥ such that x = Ty. Moreover, ||T*x|| = ||y|| = ||x||.

(b) Show that if T ∈ B(H) is a partial isometry, then so is T*.

3.10 An Application

Ergodic theory has its roots in the study of chaotic motion of small particles, such as pollen, suspended in a liquid. The chaotic motion was originally observed by the botanist R. Brown in 1827 and subsequently came to be called Brownian motion. The first result in connection with Brownian motion that led to major developments in mathematics was proved by Poincaré in 1890.

Let (X, Σ, μ) be a measure space and T be a measurable transformation of X into itself (F ∈ Σ implies T⁻¹(F) ∈ Σ). The transformation is said to be measure preserving if μ(T⁻¹(E)) = μ(E) for every E ∈ Σ. A point x ∈ E is called recurrent with respect to E and T if Tⁿx ∈ E for at least one positive integer n. Poincaré proved that almost every point of E is recurrent provided that μ(X) < ∞. In fact, if E ∈ Σ and μ(X) < ∞, then for almost every x ∈ E, there are infinitely many n such that Tⁿx ∈ E, that is, almost every point of any measurable subset E returns to E infinitely many times. The question arises if such a point has a mean time of sojourn in E; more precisely, if

lim_n (1/n) Σ_{k=0}^{n−1} χ_E(Tᵏx)

exists, where T⁰ denotes the identity transformation. More generally, we may ask for which class of measurable functions f(x) the limit

lim_n (1/n) Σ_{k=0}^{n−1} f(Tᵏx)

exists.

If we begin with a function f in L¹(X, Σ, μ), the associated function Uf given by (Uf)(x) = f(Tx) belongs to L¹(X, Σ, μ) and has the same norm as f. This is easy to see for characteristic functions, hence for simple functions and consequently for other functions, using the Monotone Convergence Theorem. Applying this to |f|², we conclude that U is also an isometry on L²(X, Σ, μ). Note that the general term f(Tᵏx) in the summation in the preceding paragraph can now be written as (Uᵏf)(x).

The question raised above will now be answered in the general context of a Hilbert space for an operator U satisfying ||U|| ≤ 1, not necessarily preserving the norm; see Riesz and Nagy [cf. 23, p. 454].

(Mean Ergodic Theorem) Let H be a Hilbert space and U be a bounded linear operator in H with ||U|| ≤ 1. If P is the orthogonal projection on the closed linear subspace M = {x ∈ H : Ux = x}, then

lim_n (1/n) Σ_{k=0}^{n−1} Uᵏx = Px

for all x ∈ H.

Proof First, we shall prove that Ux = x if, and only if, U*x = x, where U* denotes the adjoint of U. Observe that ||U*|| = ||U|| ≤ 1. Now Ux = x implies

0 ≤ ||U*x − x||² = ||U*x||² − (x, Ux) − (Ux, x) + ||x||²
= ||U*x||² − (x, x) − (x, x) + ||x||²
= ||U*x||² − ||x||² ≤ 0,

so that U*x = x. The reverse implication follows on applying the same argument with U* in place of U.

For any x ∈ M, the sums (1/n) Σ_{k=0}^{n−1} Uᵏx are all equal to x and so, converge to x = Px. Next, consider an element x = y − Uy, y ∈ H. For such an x, Σ_{k=0}^{n−1} Uᵏx = y − Uⁿy and so, ||(1/n) Σ_{k=0}^{n−1} Uᵏx|| ≤ 2n⁻¹||y|| → 0 as n → ∞. The collection

{x ∈ H : x = y − Uy, y ∈ H} (3.42)

is clearly linear but not necessarily closed. Let z be any element in the closure K of the collection (3.42). Then, there is a sequence x_p = y_p − Uy_p such that x_p → z as p → ∞. Let A_n = (1/n) Σ_{k=0}^{n−1} Uᵏ. Then, ||A_n|| ≤ 1 for all n and

||A_n z|| ≤ ||A_n(z − x_p)|| + ||A_n x_p|| ≤ ||z − x_p|| + ||A_n x_p||.

So, given ε > 0, there exists an integer p₀ such that ||z − x_{p₀}|| < ε/2. Also,

||A_n x_{p₀}|| = ||A_n y_{p₀} − A_n U y_{p₀}||
= ||(1/n)(Σ_{k=0}^{n−1} Uᵏy_{p₀} − Σ_{k=0}^{n−1} U^{k+1}y_{p₀})||
= (1/n)||y_{p₀} − Uⁿy_{p₀}||
≤ 2n⁻¹||y_{p₀}|| < ε/2,

provided n is large enough. Consequently, ||A_n z|| < ε for all large n, i.e. A_n z → 0 for every z ∈ K.

We next show that K⊥ = M:

v ∈ K⊥ ⇔ (v, y − Uy) = 0 for all y ⇔ (v, y) − (U*v, y) = 0 for all y ⇔ (v − U*v, y) = 0 for all y ⇔ v = U*v ⇔ v ∈ M.

Finally, x ∈ H can be written as x₁ + x₂ with x₁ ∈ M and x₂ ∈ M⊥ (= K), so that (1/n) Σ_{k=0}^{n−1} Uᵏx converges to x₁ + 0 = x₁ = Px. This completes the proof. □
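The conclusion of the theorem is easy to observe numerically. The sketch below is an illustrative finite-dimensional choice (a cyclic-shift unitary on ℂ³, not an example from the text): the Cesàro averages of Uᵏx converge to the projection of x onto the fixed space M = {x : Ux = x}, which here consists of the constant vectors.

```python
import numpy as np

# Cesàro averages (1/n) Σ_{k<n} U^k x for the cyclic-shift unitary on C^3.
# The fixed space M is spanned by (1, 1, 1), so Px replaces each
# coordinate of x by the mean of the coordinates.
U = np.array([[0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])          # a unitary with ||U|| = 1
x = np.array([1.0, 2.0, 6.0])

n = 300
avg = sum(np.linalg.matrix_power(U, k) @ x for k in range(n)) / n
Px = np.full(3, x.mean())                # orthogonal projection of x onto M

assert np.allclose(avg, Px)
```

Since U³ = I here, the averages are exactly periodic and the convergence is immediate; for a generic contraction the convergence is slower, but the limit is still Px.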

Chapter 4

Spectral Theory and Special Classes

of Operators

[see Definition 3.5.6]. The invertibility of an operator T ∈ B(H) and its ramifications were discussed in 3.3.7–3.3.12. In what follows, we shall study the invertibility of the operators λI − T, where T ∈ B(H), I is the identity operator and λ ∈ ℂ. The study of the distribution of the values of λ for which λI − T does not have an inverse is called 'spectral theory' for the operator.

The study of the complement of the set {λ ∈ ℂ : λI − T is invertible in B(H)}, called the 'spectrum' of the operator T, is an important part of operator theory. In finite dimensions, it is the set of eigenvalues of T. In infinite dimensions, the operator λI − T may fail to be invertible in different ways. So, finding the spectrum is not an easy problem. It is definitely more complicated than in the finite-dimensional case.

Definition 4.1.1 If T ∈ B(H), we define the spectrum of T to be the set

σ(T) = {λ ∈ ℂ : λI − T is not invertible in B(H)},

its complement ρ(T) = ℂ\σ(T) is called the resolvent set of T, and the spectral radius of T is defined by

r(T) = sup{|λ| : λ ∈ σ(T)}.

Examples 4.1.2

(i) For the identity operator I ∈ B(H): σ(I) = {1}, ρ(I) = ℂ\{1} and r(I) = 1.

H.L. Vasudeva, Elements of Hilbert Spaces and Operator Theory,

DOI 10.1007/978-981-10-3020-8_4

234 4 Spectral Theory and Special Classes of Operators

(ii) Let H be finite-dimensional and T ∈ B(H). Then λ ∈ σ(T) if, and only if, det(λI − T) = 0. Thus, in the finite-dimensional case, σ(T) is just the set of eigenvalues of T (since det(λI − T) is an nth-degree polynomial whose roots are the eigenvalues of T).

(iii) Let f : [a, b] → ℂ be continuous, where a < b are in ℝ. The multiplication operator T_f on L²[a, b], given by

(T_f x)(t) = f(t)x(t), a ≤ t ≤ b,

has spectrum

σ(T_f) = ran(f) = {λ ∈ ℂ : there exists t ∈ [a, b] for which f(t) = λ} = {f(t) : t ∈ [a, b]}.

If λ ∉ ran(f), then λI − T_f has a bounded inverse, namely the multiplication operator T_{(λ−f)⁻¹}, and so, λ ∉ σ(T_f). On the other hand, if λ = f(t₀) for some t₀ ∈ [a, b], then λ ∈ σ(T_f). Otherwise, λI − T_f has a bounded inverse S. Pick an interval J_n about t₀ in [a, b], of length δ_n > 0, such that |f(t) − λ| < 1/n for t ∈ J_n, and define

g_n(t) = δ_n^{−1/2} for t ∈ J_n and 0 otherwise.

Then, (λI − T_f)g_n → 0 as n → ∞ because ∫|(λI − T_f)g_n|² dt ≤ (1/n²)δ_n⁻¹δ_n = 1/n², but S(λI − T_f)g_n = g_n, which has norm 1 for all n, contradicting the continuity of S.

Depending on the complications to the invertibility of the operator λI − T, we classify σ(T), the spectrum of T.

Recall that λI − T fails to be invertible if either ran(λI − T) ≠ H or ker(λI − T) ≠ {0} [Problem 3.3.P3].

Definition 4.1.3

(a) The point spectrum (eigenspectrum, eigenvalues) of T ∈ B(H) is defined to be the set

σ_p(T) = {λ ∈ ℂ : ker(λI − T) ≠ {0}};

in other words, there is a nonzero vector x in H such that (λI − T)x = 0, i.e. λI − T is not injective.

(b) The continuous spectrum σ_c(T) is the set

σ_c(T) = {λ ∈ ℂ : λI − T is injective and ran(λI − T) is dense in H but (λI − T)⁻¹ is not bounded}.

(c) The residual spectrum σ_r(T) is the set

σ_r(T) = {λ ∈ ℂ : λI − T is injective and ran(λI − T) is not dense in H and (λI − T)⁻¹ exists as a bounded or unbounded operator}.

4.1 Spectral Notions 235

Remarks 4.1.4

(i) The conditions in (a), (b) and (c) are mutually exclusive and exhaustive by Theorem 3.3.12. Thus, we have the following disjoint splitting of ℂ:

ℂ = ρ(T) ∪ σ_p(T) ∪ σ_c(T) ∪ σ_r(T).

(ii) If H is finite-dimensional and T ∈ B(H), then the two conditions ker(λI − T) = {0} and ran(λI − T) = H are equivalent. Hence, σ(T) = σ_p(T) for every operator T on a finite-dimensional Hilbert space H. Consequently, in this case, σ_c(T) = ∅ = σ_r(T).
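The finite-dimensional situation in (ii) is easy to check numerically. The sketch below (the matrix and the sample point λ are arbitrary choices) confirms that the spectrum consists exactly of the eigenvalues and that the resolvent exists at a point off the spectrum.

```python
import numpy as np

# In finite dimensions, σ(T) = σ_p(T): λI − T fails to be invertible
# exactly when det(λI − T) = 0, i.e. at the eigenvalues.
T = np.array([[2.0, 1.0],
              [0.0, 3.0]])
eigs = np.linalg.eigvals(T)
assert np.allclose(np.sort(eigs.real), [2.0, 3.0])

lam = 5.0                                   # a point of the resolvent set
R = np.linalg.inv(lam * np.eye(2) - T)      # the resolvent R(λ, T)
assert np.allclose((lam * np.eye(2) - T) @ R, np.eye(2))
```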

(iii) The multiplication operator T_t : L²[a, b] → L²[a, b] defined by T_t(x(t)) = tx(t), a ≤ t ≤ b, is such that σ_p(T_t) = ∅. Indeed, the condition (λI − T_t)x = 0 implies (λ − t)x(t) = 0 a.e. and so, x(t) = 0 a.e. It has been proved in Example (iii) of 4.1.2 that σ(T_t) = [a, b]. The domain of (λI − T_t)⁻¹ is the set of all y's in L²[a, b] for which there exists an x in L²[a, b] satisfying (λI − T_t)x = y, i.e. y(t)/(λ − t) is in L²[a, b]. We shall argue that the set {y ∈ L²[a, b] : y(t)/(λ − t) ∈ L²[a, b]} is dense in L²[a, b]. Given f ∈ L²[a, b] and an arbitrary δ > 0, there exists an ε > 0 such that the function f_ε, where f_ε is 0 on I = (λ − ε, λ + ε) ∩ [a, b] and is f on its complement, satisfies the inequality

∫_a^b |f − f_ε|² = ∫_I |f(t)|² dt < δ.

Moreover, the function f_ε(t)/(λ − t) is in L²[a, b] since its L²-norm is less than or equal to 1/ε times the L²-norm of f.

But the set {y ∈ L²[a, b] : y(t)/(λ − t) ∈ L²[a, b]} does not coincide with L²[a, b], as it does not contain the constant function 1. Thus, each λ ∈ σ(T_t) is in σ_c(T_t). It follows from (i) above that σ_r(T_t) = ∅.

Theorem 3.3.12¹ leads to yet another useful division of the spectrum into two parts, not necessarily disjoint. It is an immediate consequence of that Theorem that

¹Note that the same theorem had made it possible earlier to divide the complement of the point spectrum into two disjoint parts.


λ belongs to σ(T) if, and only if, either ran(λI − T) is not dense in H or λI − T is not bounded below: there is no ε > 0 such that ||(λI − T)x|| ≥ ε||x|| for every x ∈ H. In the former case, λ is said to belong to the compression spectrum σ_com(T) of T, and in the latter case, λ is said to belong to the approximate point spectrum σ_ap(T) of T. In other words,

σ_ap(T) = {λ ∈ ℂ : there exists a sequence {x_n}_{n≥1} in H with ||x_n|| = 1 for every n and ||(λI − T)x_n|| → 0 as n → ∞}.

A point of σ_ap(T) is called an approximate eigenvalue. Clearly,

σ_r(T) = σ_com(T)\σ_p(T),

which is to say the residual spectrum is the set of those points in the compression spectrum that are not eigenvalues. Also,

σ_p(T) ⊆ σ_ap(T)

and

σ_c(T) = σ_ap(T)\(σ_com(T) ∪ σ_p(T)) = σ_ap(T)\(σ_p(T) ∪ σ_r(T)).

Problem Set 4.1

4:1:P1. For T ∈ B(H), show that (i) σ_com(T) ⊆ {λ̄ : λ ∈ σ_p(T*)} and (ii) σ_p(T) ⊆ {λ̄ : λ ∈ σ_com(T*)}.

4:1:P2. Let H = ℓ² and {e_k}_{k≥1} be the standard orthonormal basis in ℓ². Any x ∈ ℓ² has the representation x = Σ_{n=1}^∞ (x, e_n)e_n = Σ_{n=1}^∞ a_n e_n, where a_n = (x, e_n), n = 1, 2, …. Define T : ℓ² → ℓ² by taking

Tx = Σ_{n=1}^∞ (a_n/(n + 1))e_{n+1};

in other words, Te₁ = (1/2)e₂, Te₂ = (1/3)e₃, …. Show that T is a bounded linear operator, 0 ∈ σ_r(T) and any λ ≠ 0 belongs to ρ(T).


4:1:P3. Let H = ℓ² and {e_k}_{k≥1} be the standard orthonormal basis in ℓ². Any x ∈ ℓ² has the representation x = Σ_{n=1}^∞ (x, e_n)e_n = Σ_{n=1}^∞ a_n e_n, where a_n = (x, e_n), n = 1, 2, …. Consider a sequence of scalars {λ_n}_{n≥1} such that λ_n → 1 and no λ_n equals 1. Define T : ℓ² → ℓ² by Tx = Σ_{n=1}^∞ a_n λ_n e_n. Show that

(a) T is a bounded linear operator;

(b) {λ_n : n = 1, 2, …} ⊆ σ_p(T);

(c) 1 ∈ σ_c(T);

(d) λ ≠ λ_n for any n and λ ≠ 1 implies λ ∈ ρ(T);

(e) σ_r(T) = ∅.

4:1:P4. Show that if A, B ∈ B(H), λ ∈ ρ(AB) and λ ≠ 0, then λ ∈ ρ(BA) and

(λI − BA)⁻¹ = λ⁻¹I + λ⁻¹B(λI − AB)⁻¹A.

Deduce that σ(AB) and σ(BA) have the same elements with one possible exception: the point zero. Show that the point zero can actually be exceptional.

4:1:P5. Let μ = {μ_k}_{k≥1} be a bounded sequence of complex numbers, M = sup_{k≥1}|μ_k|. Define T : ℓ² → ℓ² by

Tx = Σ_{k=1}^∞ μ_k(x, e_k)e_k.

Show that ||T|| = sup_{k≥1}|μ_k| = M. Show also that the eigenvalues of T are μ₁, μ₂, … and σ(T) is the closure of {μ_k : k ≥ 1}. What is T*?

4:1:P6. Let T ∈ B(H) be self-adjoint and x be a fixed unit vector in H. Suppose ||Tx|| = ||T||. Show that x is an eigenvector of T² corresponding to the eigenvalue ||T||² (= ||T²||). Also, prove that Tx = ||T||x or Ty = −||T||y, where y = ||T||x − Tx ≠ 0.

4:1:P7. Let T ∈ B(H), where H is a complex Hilbert space. Show that the following statements are equivalent:

(a) There exists λ ∈ σ_ap(T) such that |λ| = ||T||;

(b) ||T|| = sup_{||x||=1} |(Tx, x)|.

4:1:P8. Let S and T denote a pair of self-adjoint operators in B(H). Then,

sup_{μ∈σ(T)} inf_{λ∈σ(S)} |λ − μ| ≤ ||S − T|| and sup_{λ∈σ(S)} inf_{μ∈σ(T)} |λ − μ| ≤ ||S − T||.


Let H be a finite-dimensional Hilbert space and T ∈ B(H). The set of λ's for which det(λI − T) = 0 comprises the spectrum of T. The fundamental theorem of algebra guarantees that σ(T) ≠ ∅. For every bounded linear operator defined on a Hilbert space (finite- or infinite-dimensional), the spectrum σ(T) is a nonempty, closed and bounded subset of the complex plane.

Theorem 4.2.1 (The resolvent equation) For λ, μ ∈ ρ(T),

R(λ, T) − R(μ, T) = (μ − λ)R(λ, T)R(μ, T).

Proof We have

R(λ, T) − R(μ, T) = (λI − T)⁻¹ − (μI − T)⁻¹
= (λI − T)⁻¹[(μI − T) − (λI − T)](μI − T)⁻¹
= (μ − λ)(λI − T)⁻¹(μI − T)⁻¹ = (μ − λ)R(λ, T)R(μ, T). □

The above relation has the consequence that

R(λ, T)R(μ, T) = (R(λ, T) − R(μ, T))/(μ − λ) = (R(μ, T) − R(λ, T))/(λ − μ) = R(μ, T)R(λ, T).

Thus, the family {R(λ, T) : λ ∈ ρ(T)} is a commuting family, i.e. any two members of the family commute with each other.
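Both the resolvent identity and the commutativity it implies can be verified numerically for a small matrix; the matrix and the two sample points λ, μ below are arbitrary choices lying outside its spectrum.

```python
import numpy as np

# Numerical check of the resolvent identity
# R(λ,T) − R(μ,T) = (μ − λ) R(λ,T) R(μ,T).
T = np.array([[0.0, 1.0],
              [-2.0, 3.0]])     # eigenvalues 1 and 2
I = np.eye(2)

def R(lam):
    return np.linalg.inv(lam * I - T)

lam, mu = 5.0, -1.5             # both in the resolvent set
lhs = R(lam) - R(mu)
rhs = (mu - lam) * R(lam) @ R(mu)

assert np.allclose(lhs, rhs)
assert np.allclose(R(lam) @ R(mu), R(mu) @ R(lam))  # the family commutes
```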

Theorem 4.2.2 Let T ∈ B(H). The resolvent set ρ(T) of T is open, and the map λ → R(λ, T) = (λI − T)⁻¹ from ρ(T) ⊆ ℂ to B(H) is strongly holomorphic in the sense of Definition 3.3.13 (understood with X = B(H)), vanishing at ∞. For each x, y ∈ H, the map λ → (R(λ, T)x, y) = ((λI − T)⁻¹x, y) ∈ ℂ is holomorphic on ρ(T), vanishing at ∞.

Proof Let λ ∈ ρ(T). By definition, λI − T is invertible and thus belongs to the set G of all invertible elements of B(H). By the first part of Proposition 3.3.9, G is open. Therefore, some δ > 0 has the property that any S ∈ B(H) which satisfies the inequality ||S − (λI − T)|| < δ belongs to G. If |λ − μ| < δ, then S = μI − T clearly satisfies the inequality and therefore belongs to G, so that μ ∈ ρ(T). This shows that ρ(T) is open.

Since the map λ → (λI − T) from ρ(T) to G is continuous, it follows by the second part of Proposition 3.3.9 that the map λ → (λI − T)⁻¹ from ρ(T) ⊆ ℂ to B(H) is also continuous. The resolvent identity of Theorem 4.2.1 now shows that the map is strongly holomorphic with derivative −R(λ, T)².

4.2 Resolvent Equation and Spectral Radius 239

As |λ| → ∞, (I − λ⁻¹T)⁻¹ → I [by the second part of Proposition 3.3.9]. Consequently,

R(λ, T) = (λI − T)⁻¹ = λ⁻¹(I − λ⁻¹T)⁻¹ → O.

Being strongly holomorphic, the map is also weakly holomorphic. Now, for x, y ∈ H, the map from B(H) to ℂ given by S → (Sx, y) is a continuous linear functional on B(H). Hence, the map λ → (R(λ, T)x, y) = ((λI − T)⁻¹x, y) ∈ ℂ is holomorphic, vanishing at ∞. □

Corollary 4.2.3 For T ∈ B(H), σ(T) = ℂ\ρ(T) is a closed subset of ℂ.

Recall that the spectral radius of an operator T ∈ B(H) is defined to be r(T) = sup{|λ| : λ ∈ σ(T)}.

Theorem 4.2.4 Let T ∈ B(H), where H ≠ {0}. If |λ| > ||T||, then λ ∈ ρ(T) and

R(λ, T) = (λI − T)⁻¹ = Σ_{n=0}^∞ λ^{−n−1}Tⁿ,

where convergence takes place in the uniform operator norm. Also, the spectrum σ(T) of T is a nonempty compact subset which lies in {λ ∈ ℂ : |λ| ≤ ||T||}. In particular, there exists λ ∈ σ(T) such that |λ| = r(T).

Proof By Corollary 4.2.3, σ(T) is a closed subset of ℂ. If |λ| > ||T||, then ||I − (I − λ⁻¹T)|| = ||λ⁻¹T|| < 1, and by Proposition 3.3.8, I − λ⁻¹T is invertible with (I − λ⁻¹T)⁻¹ = Σ_{n=0}^∞ (λ⁻¹T)ⁿ, convergence being in the uniform operator norm. This implies that λI − T = λ(I − λ⁻¹T) is invertible and (λI − T)⁻¹ = Σ_{n=0}^∞ λ^{−n−1}Tⁿ, convergence being in the uniform operator norm.

In particular, |λ| > ||T|| implies λ ∉ σ(T). In other words, σ(T) ⊆ {λ ∈ ℂ : |λ| ≤ ||T||}, showing that σ(T) is bounded. Being closed, it is also compact.

We show that the assumption σ(T) = ∅ leads to a contradiction. σ(T) = ∅ implies ρ(T) = ℂ. Now, for every x, y in H, (R(λ, T)x, y) is an entire function, which vanishes at ∞, and is therefore bounded. By Liouville's Theorem, (R(λ, T)x, y) is constant and the value of this constant is zero. Since (R(λ, T)x, y) = 0 for every x, y in H implies R(λ, T) = O, it follows that

O = R(λ, T)(λI − T) = I.

This is a contradiction. □
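The Neumann series for the resolvent converges rapidly when |λ| is well above ||T||, and a truncation can be compared with the exact inverse. The sketch below is an illustrative finite-dimensional check (the matrix, the point λ and the truncation length are arbitrary choices).

```python
import numpy as np

# Truncated Neumann series for R(λ,T) = Σ λ^{−n−1} T^n, valid when
# |λ| > ||T||; here ||T|| = 1 and λ = 3, so 60 terms suffice for
# double-precision accuracy.
T = np.array([[0.0, 1.0],
              [0.5, 0.0]])
lam = 3.0
R_exact = np.linalg.inv(lam * np.eye(2) - T)
R_series = sum(np.linalg.matrix_power(T, n) / lam ** (n + 1)
               for n in range(60))

assert np.allclose(R_exact, R_series)
```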


Theorem 4.2.5 (Gelfand's formula) For any T ∈ B(H), the limit

lim_{n→∞} ||Tⁿ||^{1/n}

exists and equals the spectral radius r(T).

The following lemma will be needed in the proof of Gelfand’s formula.

1 1

Lemma 4.2.6 For T 2 BðHÞ; limn!1 jjT n jjn exists and equals inf n jjT n jjn .

1

Moreover, 0 inf n jjT n jj kTk. n

1 1

Proof Set a ¼ inf n jjT n jjn . Then, for e [ 0, there exists m such that jjT m jjm \a þ e.

Now, any n 2 N can be written as n = pm + q, 0 q < m. So,

1 1 p q p q

kT n kn ¼ kT pm þ q kn kT m kn kT kn \ða þ eÞmn kT kn :

1

lim supn jjT n jjn a þ e:

As e [ 0 is arbitrary, we have

1

lim supn jjT n jjn a:

1 1

Also, a jjT n jjn for every n and this implies a lim inf n jjT n jjn . Consequently,

1 1 1 1

limn!1 jjT n jjn exists and equals inf n jjT n jjn . Finally, jjT n jjn ðkTkn Þn ¼ kTk

1

implies a ¼ inf n jjT n jjn kTk. h

Proof of Gelfand's Formula Let λ ∈ ℂ be such that |λ| > a = inf_n ||Tⁿ||^{1/n}. Then, there exists a positive integer m such that |λ| > ||T^m||^{1/m}, i.e. ||T^m|| < |λ^m|, so that λ^m ∈ ρ(T^m). Since

T^m − λ^m I = (T − λI)(T^{m−1} + λT^{m−2} + ⋯ + λ^{m−1}I)
= (T^{m−1} + λT^{m−2} + ⋯ + λ^{m−1}I)(T − λI),

it follows that

(T − λI)⁻¹ = (T^m − λ^m I)⁻¹(T^{m−1} + λT^{m−2} + ⋯ + λ^{m−1}I),

so that λ ∈ ρ(T). Consequently, r(T) ≤ a. It remains to show that lim_{n→∞} ||Tⁿ||^{1/n} ≤ r(T). To this end, we proceed as follows:

Let |λ| > r(T). Then, λ ∈ ρ(T). The resolvent R(λ, T) exists and is strongly holomorphic on ρ(T) by Theorem 4.2.2. It therefore has a Laurent series expansion in powers of λ⁻¹, converging in the operator norm.


By Theorem 4.2.4,

R(λ, T) = Σ_{n=0}^∞ λ^{−n−1}Tⁿ when |λ| > ||T||.

Since ||T|| ≥ r(T), by uniqueness of Laurent series, it follows that

R(λ, T) = Σ_{n=0}^∞ λ^{−n−1}Tⁿ if |λ| > r(T).

Hence,

lim_n ||λ^{−n−1}Tⁿ|| = 0 if |λ| > r(T),

so that, given ε > 0, we have for all sufficiently large n

||Tⁿ|| ≤ (ε + |λ|)^{n+1},

which implies

||Tⁿ||^{1/n} ≤ (ε + |λ|)^{1+1/n} for large n and |λ| > r(T),

and hence, since ε > 0 is arbitrary,

lim_{n→∞} ||Tⁿ||^{1/n} ≤ |λ| for |λ| > r(T).

Consequently,

lim_{n→∞} ||Tⁿ||^{1/n} ≤ r(T).

Combining this with the inequality r(T) ≤ a = lim_{n→∞} ||Tⁿ||^{1/n} established above, we conclude that

r(T) = lim_{n→∞} ||Tⁿ||^{1/n}. □

Remarks 4.2.7

(i) If T ∈ B(H) is such that T*T = TT*, then ||T²|| = ||T||² and, since powers of T are again normal, ||T^{2ⁿ}|| = ||T||^{2ⁿ} for n = 1, 2, …. It follows that ||T^p||^{1/p} = ||T|| for p = 2ⁿ, n = 1, 2, …, which implies that the limit of the subsequence {||T^p||^{1/p}}_{p=2ⁿ} of the convergent sequence {||Tⁿ||^{1/n}}_{n≥1} equals ||T||; so lim_{n→∞} ||Tⁿ||^{1/n} = ||T||. Hence, if T is normal, r(T) = ||T||. Therefore, by Theorem 4.2.4, there exists λ ∈ σ(T) such that |λ| = ||T||. In particular, if the spectrum contains only real numbers [e.g. self-adjoint operators; see Theorem 4.4.2], then |λ| = λ or |λ| = −λ, and therefore either ||T|| ∈ σ(T) or −||T|| ∈ σ(T).

(ii) For T ∈ B(H), σ(T) = {0} if and only if lim_{n→∞} ||Tⁿ||^{1/n} = 0. Indeed, if lim_{n→∞} ||Tⁿ||^{1/n} = 0, then r(T) = 0, which implies σ(T) = {0}. On the other hand, if σ(T) = {0}, then r(T) = sup{|λ| : λ ∈ σ(T)} = 0, i.e. lim_{n→∞} ||Tⁿ||^{1/n} = 0.

(iii) An operator T ∈ B(H) is called nilpotent if there exists an n ∈ ℕ such that Tⁿ = O and is called quasinilpotent if σ(T) = {0}.

Any normal quasinilpotent operator is the zero operator. Indeed, if T is normal, then lim_{n→∞} ||Tⁿ||^{1/n} = ||T||. Since T is quasinilpotent, σ(T) = {0}. It then follows from (i) and (ii) above that ||T|| = 0, which implies T = O.
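Gelfand's formula is most interesting for non-normal operators, where the spectral radius can be much smaller than the norm. The sketch below (an arbitrarily chosen 2×2 matrix with spectrum {0.5} but large norm) watches ||Tⁿ||^{1/n} decrease toward r(T).

```python
import numpy as np

# Gelfand's formula r(T) = lim ||T^n||^{1/n} for a non-normal matrix
# with r(T) = 0.5 but ||T|| > 10.
T = np.array([[0.5, 10.0],
              [0.0, 0.5]])
norms = [np.linalg.norm(np.linalg.matrix_power(T, n), 2) ** (1.0 / n)
         for n in (50, 100, 200)]

assert abs(norms[-1] - 0.5) < 0.1           # approaching r(T) = 0.5
assert norms[0] > norms[1] > norms[2] > 0.5 # decreasing toward r(T)
```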

Problem Set 4.2

4:2:P1. (a) Show that the analogue of Theorem 4.2.4 [σ(T) ≠ ∅] fails for real Hilbert spaces.

(b) Give an example to show that it is possible to have r(T) = 0 but T ≠ O.

4:2:P2. Let A, B ∈ B(H) be bounded linear operators on a complex Hilbert space H such that AB = BA. Show that

r(AB) ≤ r(A)r(B).

4:2:P3. Let A, B ∈ B(H) be bounded linear operators on a complex Hilbert space H such that AB = BA. Show that r(A + B) ≤ r(A) + r(B). Give an example to show that commutativity cannot be dropped.

Let T ∈ B(H). To every polynomial p(z) = Σ_{j=0}^n c_j z^j, we can associate the operator p(T) ∈ B(H) defined by Σ_{j=0}^n c_j T^j. With f(z) = z̄ and f(z) = z⁻¹, we can associate the operators f(T) = T* and f(T) = T⁻¹ (the latter when T is invertible). The purpose of this section is to investigate the relationship between σ(T) and the spectrum of the operators defined above. In fact, we have the following theorem.

4.3 Spectral Mapping Theorem for Polynomials 243

Spectral Mapping Theorem 4.3.1 Let H be a Hilbert space and T ∈ B(H). Then,

(a) σ(T*) = {λ̄ : λ ∈ σ(T)};

(b) if T is invertible, then σ(T⁻¹) = {λ⁻¹ : λ ∈ σ(T)};

(c) if p(z) = Σ_{j=0}^n c_j z^j is a polynomial with complex coefficients and if p(T) is defined by Σ_{j=0}^n c_j T^j, then σ(p(T)) = {p(λ) : λ ∈ σ(T)} = p(σ(T)).

Proof

(a) Suppose λ ∉ σ(T). Then, (λI − T)⁻¹ exists, so that (λ̄I − T*)⁻¹ = [(λI − T)*]⁻¹ = [(λI − T)⁻¹]* exists [see Theorem 3.5.4(d)]. Thus, λ̄ ∉ σ(T*). We have thus proved σ(T*) ⊆ {λ̄ : λ ∈ σ(T)}. Applying this argument to T*, we get σ(T) ⊆ {λ̄ : λ ∈ σ(T*)}. Taking conjugates, we get {λ̄ : λ ∈ σ(T)} ⊆ σ(T*), so that σ(T*) = {λ̄ : λ ∈ σ(T)}.

(b) If T is invertible, then 0 ∉ σ(T), so that {λ⁻¹ : λ ∈ σ(T)} is well defined. If λ ∉ σ(T) and λ ≠ 0, then the equation

λ⁻¹I − T⁻¹ = λ⁻¹T⁻¹(T − λI) = −λ⁻¹T⁻¹(λI − T)

shows that λ⁻¹I − T⁻¹ is invertible, i.e. λ⁻¹ ∉ σ(T⁻¹). In other words, σ(T⁻¹) ⊆ {λ⁻¹ : λ ∈ σ(T)}. To prove the reverse inclusion, we apply the result just obtained to T⁻¹, getting σ(T) = σ((T⁻¹)⁻¹) ⊆ {λ⁻¹ : λ ∈ σ(T⁻¹)}, which yields {λ⁻¹ : λ ∈ σ(T)} ⊆ σ(T⁻¹). Thus,

σ(T⁻¹) = {λ⁻¹ : λ ∈ σ(T)}.

(c) When p is a constant polynomial or has degree 1, this is obvious. Let λ ∈ σ(T) and p be a polynomial of degree n > 1. Then, p(z) − p(λ) is a polynomial of degree n with λ as a root and we can factor p(z) − p(λ) as (z − λ)q(z), where q is a polynomial of degree n − 1. Then,

B = p(T) − p(λ)I = (T − λI)q(T) = q(T)(T − λI).

If B were invertible, then the equation BB⁻¹ = B⁻¹B = I could be written as

(T − λI)q(T)B⁻¹ = B⁻¹q(T)(T − λI) = I,

which would make T − λI invertible, contrary to λ ∈ σ(T). So B is not invertible, i.e. p(λ) ∈ σ(p(T)). So, p(σ(T)) ⊆ σ(p(T)).

Let λ ∈ σ(p(T)). Factorise the polynomial p(z) − λ into linear factors and write

p(z) − λ = c(z − λ₁)(z − λ₂)⋯(z − λₙ), so that p(T) − λI = c(T − λ₁I)(T − λ₂I)⋯(T − λₙI).

Since p(T) − λI is not invertible, one of the factors T − λ_jI is not invertible. Thus, λ_j ∈ σ(T), and also, p(λ_j) − λ = 0. This shows that λ = p(λ_j) for some λ_j ∈ σ(T). Hence, σ(p(T)) ⊆ p(σ(T)). This completes the proof. □
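Part (c) can be observed directly in finite dimensions, where the spectrum is computable as a set of eigenvalues. The sketch below (matrix and polynomial arbitrarily chosen) checks that the eigenvalues of p(T) are exactly the values p(λ) over the eigenvalues λ of T.

```python
import numpy as np

# Spectral mapping check for p(z) = z² + 1: σ(p(T)) = p(σ(T)).
T = np.array([[1.0, 2.0],
              [0.0, 3.0]])
pT = T @ T + np.eye(2)

eig_T = np.linalg.eigvals(T)
eig_pT = np.linalg.eigvals(pT)

assert np.allclose(np.sort(eig_pT.real), np.sort((eig_T**2 + 1).real))
```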

Example 4.3.2 [(ix) of Examples 3.2.5]. The Volterra integral operator V : L²[0, 1] → L²[0, 1] defined by

(Vx)(s) = ∫₀ˢ x(t) dt, x ∈ L²[0, 1],

is a bounded linear operator of norm not exceeding 1/√2. We shall show that r(V) = 0 and 0 is not an eigenvalue of V. Now,

(V²x)(s) = V((Vx))(s) = ∫₀ˢ (Vx)(t) dt = ∫₀ˢ (∫₀ᵗ x(u) du) dt = ∫₀ˢ x(u)(∫ᵤˢ dt) du = ∫₀ˢ (s − u)x(u) du.

By induction,

(Vⁿx)(s) = (1/(n−1)!) ∫₀ˢ (s − u)^{n−1} x(u) du,

so

||Vⁿx||₂² = ∫₀¹ |(Vⁿx)(s)|² ds = (1/(n−1)!)² ∫₀¹ |∫₀ˢ (s − u)^{n−1}x(u) du|² ds
≤ (1/(n−1)!)² ∫₀¹ (∫₀ˢ (s − u)^{n−1}|x(u)| du)² ds
≤ (1/(n−1)!)² ∫₀¹ (∫₀ˢ |x(u)|² du)(∫₀ˢ (s − u)^{2n−2} du) ds
≤ (1/(n−1)!)² ||x||₂².

Thus,

||Vⁿx|| ≤ (1/(n−1)!) ||x||.

Consequently, ||Vⁿ|| ≤ 1/(n−1)!, which implies

r(V) = lim_{n→∞} ||Vⁿ||^{1/n} ≤ lim_{n→∞} (1/(n−1)!)^{1/n} = 0.

Also, 0 is not an eigenvalue of V, for if Vx = 0, then ∫₀ˢ x(u) du = 0 for every s ∈ [0, 1] and this implies x = 0 a.e. Since 0 ∈ σ(V), V is not invertible. Since the range {∫₀ˢ x(u) du : x ∈ L²[0, 1]} of V is dense in L²[0, 1] (see below), it follows that 0 ∈ σ_c(V).

The range of V consists of continuous functions on [0, 1] vanishing at 0 and differentiable a.e. We shall show that they are dense in L²[0, 1]. We need consider only real functions. By the Stone–Weierstrass Theorem [13, Theorem 7.34 of Chap. II], they are uniformly dense in the algebra of all real continuous functions vanishing at 0. It is sufficient therefore to argue that this algebra is L²-dense in the algebra of all continuous real functions.

Let f be any real continuous function on [0, 1], f(0) ≠ 0, and let ε > 0 be given. There exists a positive δ₁ < 1 such that on the interval [0, δ₁], we have |f(x)| ≤ 2|f(0)|. Choose a positive δ < δ₁ such that it also satisfies the inequality

δ < ε²/(16|f(0)|²).

Since 0 < δ < δ₁, the inequality |f(x)| ≤ 2|f(0)| holds on [0, δ] as well. Now, consider the continuous function g defined to agree with f on [δ, 1] and have a straight-line graph from the origin to the point (δ, f(δ)) on the graph of f. Then, g(0) = 0 and g satisfies |g(x)| ≤ 2|f(0)| on [0, δ], and hence,

|f(x) − g(x)| ≤ 4|f(0)| on [0, δ].

Moreover, f − g vanishes on [δ, 1]. It follows that ∫_{[0,1]} |f − g|² ≤ 16|f(0)|²δ, which is less than ε² by choice of δ. Thus, ||f − g|| < ε in L²[0, 1].
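The quasinilpotent behaviour of the Volterra operator is visible in a crude discretisation. The sketch below (grid size and test exponents arbitrarily chosen) replaces V by the strictly lower triangular averaging matrix, which is literally nilpotent, and watches ||Vⁿ||^{1/n} decay, mirroring r(V) = 0.

```python
import numpy as np

# A discretisation of the Volterra operator on an N-point grid:
# (Vx)_i ≈ ∫_0^{s_i} x(t) dt. Strictly lower triangular ⇒ V^N = 0.
N = 60
V = np.tril(np.ones((N, N)), k=-1) / N

gelfand = [np.linalg.norm(np.linalg.matrix_power(V, n), 2) ** (1.0 / n)
           for n in (5, 10, 20)]

assert gelfand[2] < gelfand[0]      # decaying toward r(V) = 0
assert gelfand[2] < 0.2
assert np.allclose(np.linalg.matrix_power(V, N), 0)   # nilpotent
```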

Proposition 4.3.3 Let T ∈ B(H). Then, (a) σ_p(T*) = σ̄_com(T), (b) σ(T*) = σ_ap(T*) ∪ σ̄_ap(T), (c) σ̄_com(T*) ⊆ σ_p(T) ⊆ σ_ap(T) and (d) σ_r(T) = σ̄_p(T*)\σ_p(T), where the bar signifies complex conjugation [not closure].

Proof If λ ∈ σ_p(T*), then λI − T* has a nonzero kernel, and therefore, ran(λ̄I − T) has a nonzero orthogonal complement, i.e. λ̄ ∈ σ_com(T); both these implications are reversible. This proves (a).

The operator λI − T* is not invertible if and only if one of λI − T* and λ̄I − T is not bounded from below [Theorem 3.5.9]. In other words, λ ∈ σ(T*) if and only if either λ ∈ σ_ap(T*) or λ̄ ∈ σ_ap(T). This means

σ(T*) = σ_ap(T*) ∪ σ̄_ap(T),

which proves (b).

If λ ∈ σ_com(T*), then by definition, λI − T* does not have dense range, and therefore, λ̄I − T has a nontrivial kernel [Theorem 3.5.8], i.e. λ̄ ∈ σ_p(T). But σ_p(T) ⊆ σ_ap(T). This proves (c).

By Remark 4.1.4, σ_r(T) = σ_com(T)\σ_p(T) = σ̄_p(T*)\σ_p(T) by part (a). This proves (d). □

Proposition 4.3.4 Let $T \in \mathcal{B}(H)$. Then, $\sigma_{ap}(T)$ is a closed subset of $\mathbb{C}$.

Proof Let $\lambda \notin \sigma_{ap}(T)$. Then, $\lambda I - T$ is bounded below. So there exists some $\varepsilon > 0$ such that $\|(\lambda I - T)x\| \ge \varepsilon\|x\|$ for all $x \in H$. Also, for all $\mu$, $\|(\lambda I - T)x\| \le \|(\mu I - T)x\| + \|(\lambda - \mu)x\|$ for all $x \in H$. It follows that $(\varepsilon - |\lambda - \mu|)\|x\| \le \|(\mu I - T)x\|$ for all $\mu$ and all $x \in H$. For $|\lambda - \mu|$ sufficiently small, the preceding inequality implies $\mu I - T$ is bounded below. Hence, the complement of $\sigma_{ap}(T)$ is open. □

Our next result shows that $\sigma_{ap}(T)$ is not empty.

Theorem 4.3.5 If $T \in \mathcal{B}(H)$, then $\partial\sigma(T) \subseteq \sigma_{ap}(T)$.

Proof Let $\lambda \in \partial\sigma(T)$, and let $\{\lambda_n\}_{n\ge1}$ be a sequence in the resolvent set $\rho(T)$ such that $\lambda_n \to \lambda$. We claim that $\|(\lambda_nI - T)^{-1}\| \to \infty$ as $n \to \infty$. Suppose this is false. By passing to a subsequence if necessary, there is a constant $M$ such that $\|(\lambda_nI - T)^{-1}\| \le M$ for all $n$. Choose $n$ sufficiently large so that $|\lambda_n - \lambda| < M^{-1} \le \|(\lambda_nI - T)^{-1}\|^{-1}$. It follows on using Proposition 3.3.9 that $\lambda I - T$ is invertible, a contradiction.

Let $\|x_n\| = 1$ satisfy $a_n = \|(\lambda_nI - T)^{-1}x_n\| > \|(\lambda_nI - T)^{-1}\| - \frac{1}{n}$. Then, $a_n \to \infty$ as $n \to \infty$. Put $y_n = a_n^{-1}(\lambda_nI - T)^{-1}x_n$; then, $\|y_n\| = 1$. Now, since $(\lambda_nI - T)y_n = a_n^{-1}x_n$, we have

$$(\lambda I - T)y_n = a_n^{-1}x_n + (\lambda - \lambda_n)y_n.$$

Thus,

$$\|(\lambda I - T)y_n\| \le a_n^{-1} + |\lambda - \lambda_n| \to 0 \quad\text{as } n \to \infty,$$

so that $\lambda \in \sigma_{ap}(T)$. □

We now work out in detail an example which illustrates the various kinds of spectra.

Example 4.3.6 Let $T$ be the simple unilateral shift on $\ell^2$ defined by

$$T(x_1, x_2, x_3, \ldots) = (0, x_1, x_2, \ldots).$$

As seen in (vi) of Examples 3.5.10, the adjoint $T^*$ of $T$, called the left shift operator, acts on $\ell^2$ by

$$T^*(x_1, x_2, x_3, \ldots) = (x_2, x_3, x_4, \ldots).$$

It has been observed [Example (vii) of 3.2.5] that $\|Tx\| = \|x\|$ for $x \in \ell^2$, and hence, $\|T\| = 1$. Since $\|T^*\| = \|T\|$ [Theorem 3.5.2], it follows that $\|T^*\| = 1$. Consequently, $\sigma(T) \subseteq \{\lambda \in \mathbb{C} : |\lambda| \le 1\}$ and $\sigma(T^*) \subseteq \{\lambda \in \mathbb{C} : |\lambda| \le 1\}$.

In what follows, $\sigma(T)$, $\sigma_p(T)$, $\sigma_c(T)$, $\sigma_r(T)$, $\sigma_{ap}(T)$, $\sigma_{com}(T)$ and their analogues for $T^*$ will be characterised.

(i) Suppose $|\lambda| < 1$. The vector $x_\lambda = (1, \lambda, \lambda^2, \ldots)$ is in $\ell^2$ and satisfies $(\lambda I - T^*)x_\lambda = 0$. Thus, all such $\lambda$ are in the point spectrum of $T^*$; that is, $\{\lambda \in \mathbb{C} : |\lambda| < 1\} \subseteq \sigma_p(T^*)$. Since the spectrum is closed and contained in the closed unit disc, it follows that $\sigma(T^*) = \{\lambda \in \mathbb{C} : |\lambda| \le 1\}$. In view of Theorem 4.3.1(a), we have $\sigma(T) = \{\lambda \in \mathbb{C} : |\lambda| \le 1\}$. This characterises $\sigma(T)$ and $\sigma(T^*)$.

(ii) From Theorem 4.3.5, $\partial\sigma(T^*) \subseteq \sigma_{ap}(T^*)$. Since $\sigma_p(T^*) \subseteq \sigma_{ap}(T^*)$ by definition, we have $\sigma(T^*) = \{\lambda \in \mathbb{C} : |\lambda| \le 1\} = \{\lambda \in \mathbb{C} : |\lambda| < 1\} \cup \{\lambda \in \mathbb{C} : |\lambda| = 1\} = \{\lambda \in \mathbb{C} : |\lambda| < 1\} \cup \partial\sigma(T^*) \subseteq \sigma_p(T^*) \cup \sigma_{ap}(T^*) = \sigma_{ap}(T^*) \subseteq \sigma(T^*)$, where we have used (i) for the first inclusion. Thus, we have shown that $\sigma(T^*) = \sigma_{ap}(T^*)$.

(iii) It may be remarked that no $\lambda$ satisfying $|\lambda| = 1$ is in $\sigma_p(T^*)$. Indeed, if $x = \{x_i\}_{i\ge1}$, $x \ne 0$, is such that $T^*x = \lambda x$, then $(x_2, x_3, \ldots) = (\lambda x_1, \lambda x_2, \ldots)$, which implies $x_{n+1} = \lambda x_n$ for $n \ge 1$. So, $x_{n+1} = \lambda^n x_1$, $n \ge 1$. Hence, $x = x_1(1, \lambda, \lambda^2, \ldots)$. Since $|\lambda| = 1$, the vector $x_1(1, \lambda, \lambda^2, \ldots) \in \ell^2$ if and only if $x_1 = 0$, which implies $x = 0$, a contradiction.
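The eigenvector relation in (i) is easy to verify numerically on a long truncation. The sketch below (an illustration, not part of the text, with an arbitrary choice of $\lambda$) checks that the left shift sends $(1, \lambda, \lambda^2, \ldots)$ to $\lambda\,(1, \lambda, \lambda^2, \ldots)$.

```python
import numpy as np

# For |λ| < 1 the vector x_λ = (1, λ, λ², …) satisfies T*x_λ = λ x_λ,
# where T* is the left shift.  On a truncation, the only error comes
# from the discarded geometric tail.
lam = 0.6 + 0.3j              # arbitrary choice with |λ| < 1
n = 60
x = lam ** np.arange(n)       # (1, λ, λ², …, λ^(n-1))

left_shifted = x[1:]          # T* drops the first coordinate
residual = np.linalg.norm(left_shifted - lam * x[:-1])
assert residual < 1e-12
```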

(i) $\sigma_p(T) = \emptyset$. Indeed, if $\{\xi_n\}_{n\ge1} \in \ell^2$ and $(\lambda I - T)(\{\xi_n\}) = 0$, $\lambda \ne 0$, then $0 = \lambda\xi_1$, $\xi_1 = \lambda\xi_2$, $\xi_2 = \lambda\xi_3, \ldots$, implying that $\xi_1 = 0$, $\xi_2 = 0, \ldots$; and $\lambda = 0$ is not an eigenvalue either, since $T$ is injective.

(ii) $\sigma_{ap}(T) = \{\lambda \in \mathbb{C} : |\lambda| = 1\}$. If $|\lambda| < 1$ and $x \in \ell^2$, then $\|(T - \lambda I)x\| \ge \big|\|Tx\| - |\lambda|\|x\|\big| \ge (1 - |\lambda|)\|x\|$, which implies $\lambda \notin \sigma_{ap}(T)$. Consequently, $\sigma_{ap}(T) \subseteq \{\lambda \in \mathbb{C} : |\lambda| = 1\}$. It follows in view of Theorem 4.3.5 that $\sigma_{ap}(T) = \{\lambda \in \mathbb{C} : |\lambda| = 1\}$.

(iii) By Proposition 4.3.3, $\sigma_p(T^*) = \overline{\sigma_{com}(T)}$. It follows that $\sigma_{com}(T) = \{\lambda \in \mathbb{C} : |\lambda| < 1\}$.

(iv) $\sigma_c(T) = \sigma(T)\setminus(\sigma_{com}(T) \cup \sigma_p(T)) = \{\lambda \in \mathbb{C} : |\lambda| \le 1\}\setminus\{\lambda \in \mathbb{C} : |\lambda| < 1\} = \{\lambda \in \mathbb{C} : |\lambda| = 1\}$, since $\sigma_p(T) = \emptyset$.

(v) $\sigma_r(T) = \sigma(T)\setminus(\sigma_c(T) \cup \sigma_p(T)) = \{\lambda \in \mathbb{C} : |\lambda| < 1\}$.

We have thus proved the following:

$\sigma(T^*) = \sigma_{ap}(T^*)$, since $\sigma_{com}(T^*) = \overline{\sigma_p(T)} = \emptyset$ by Proposition 4.3.3 and (i) of the paragraph above.

Also,

$$\sigma_p(T^*) = \{\lambda \in \mathbb{C} : |\lambda| < 1\}, \qquad \sigma_c(T^*) = \{\lambda \in \mathbb{C} : |\lambda| = 1\}$$

and

$$\sigma_r(T^*) = \emptyset.$$

Furthermore,

$$\sigma(T) = \sigma_{ap}(T) \cup \sigma_{com}(T) = \{\lambda \in \mathbb{C} : |\lambda| = 1\} \cup \{\lambda \in \mathbb{C} : |\lambda| < 1\}$$

and

$$\sigma(T) = \sigma_p(T) \cup \sigma_c(T) \cup \sigma_r(T) = \emptyset \cup \{\lambda \in \mathbb{C} : |\lambda| = 1\} \cup \{\lambda \in \mathbb{C} : |\lambda| < 1\}.$$

4.4 Spectrum of Various Classes of Operators

Let $H$ be a complex Hilbert space and $\mathcal{B}(H)$ denote the algebra of bounded linear

operators on H. Normal operators and their suitable subsets such as self-adjoint

operators and unitary operators have been studied in Sect. 3.7 and so have been the isometric operators. The spectral properties of a member of the class are somewhat simpler to describe than those of a general member of $\mathcal{B}(H)$. We begin with normal operators.

Theorem 4.4.1 Every point in the spectrum of a normal operator is an approxi-

mate eigenvalue.

Proof If $T \in \mathcal{B}(H)$ is a normal operator and $\lambda \in \mathbb{C}$, then so is $\lambda I - T$. So, for each $x \in H$,

$$\|(\lambda I - T)x\| = \|(\lambda I - T)^*x\| = \|(\bar\lambda I - T^*)x\|.$$

It follows that $\ker(\lambda I - T) \ne \{0\}$ if and only if $\ker(\bar\lambda I - T^*) \ne \{0\}$, i.e. $\sigma_p(T^*) = \overline{\sigma_p(T)}$. Now, by Proposition 4.3.3, $\sigma_p(T^*) = \overline{\sigma_{com}(T)}$. Since we have shown that $\sigma_p(T^*) = \overline{\sigma_p(T)}$, the above equality leads to $\sigma_{com}(T) = \sigma_p(T) \subseteq \sigma_{ap}(T)$, and hence $\sigma(T) = \sigma_{ap}(T) \cup \sigma_{com}(T) = \sigma_{ap}(T)$. □

The following theorem, which is a consequence of the one above, is important in

its own right.

Theorem 4.4.2 [cf. Problem 3.8.P2] The spectrum of every self-adjoint operator $T \in \mathcal{B}(H)$ is a subset of $\mathbb{R}$. In particular, the eigenvalues of $T$, if any, are real. Furthermore, if $T$ is a positive operator, then the spectrum of $T$ is nonnegative, and eigenvalues, if any, are also nonnegative.

Proof Let $\lambda = \mu + i\nu$, where $\mu$ and $\nu$ are real and $\nu \ne 0$, be a complex number. If $T$ is a self-adjoint operator and $x \in H$, then

$$\|(\lambda I - T)x\|^2 = ((\bar\lambda I - T)(\lambda I - T)x, x) = |\lambda|^2(x,x) - 2\mu(Tx,x) + \|Tx\|^2 = \|(\mu I - T)x\|^2 + \nu^2\|x\|^2 \ge \nu^2\|x\|^2.$$

So $\lambda$ cannot be an approximate eigenvalue of $T$, and hence cannot be in the spectrum $\sigma(T)$ of $T$ by Theorem 4.4.1. Consequently, $\sigma(T) \subseteq \mathbb{R}$.

Now, let $T$ be a positive operator and let $\lambda = k$ be a negative real number. Then, for $x \in H$,

$$\|(kI - T)x\|^2 = ((kI - T)^2x, x) = k^2(x,x) - 2k(Tx,x) + \|Tx\|^2 \ge k^2\|x\|^2 \quad\text{since } k < 0 \text{ and } (Tx,x) \ge 0.$$

So $k$ cannot be an approximate eigenvalue of $T$, and hence cannot be in the spectrum $\sigma(T)$ of $T$ by Theorem 4.4.1. □

The second assertion of the above theorem is trivial to prove directly. For a self-adjoint operator, $(Tx,x)$ is real. If $\lambda \in \sigma_p(T)$, then there exists a nonzero $x$ such that $(Tx,x) = (\lambda x, x) = \lambda(x,x)$, so that $\lambda = (Tx,x)/\|x\|^2$ is real.

In light of Theorem 4.4.2 and Remark 4.2.7(i), a self-adjoint operator $T$ must have the property that either $\|T\| \in \sigma(T)$ or $-\|T\| \in \sigma(T)$.
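As a numerical illustration (not part of the text), one can check both conclusions for a randomly generated self-adjoint matrix:

```python
import numpy as np

# For a self-adjoint matrix the spectrum is real, and either ||T|| or
# -||T|| is an eigenvalue (Theorem 4.4.2 and the remark above).
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
T = (A + A.conj().T) / 2                 # self-adjoint

eig = np.linalg.eigvals(T)
assert np.max(np.abs(eig.imag)) < 1e-10  # spectrum lies in the reals

norm_T = np.linalg.norm(T, 2)
closest = np.min(np.abs(np.abs(eig) - norm_T))
assert closest < 1e-10                   # ||T|| = max |λ| is attained
```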

Theorem 4.4.3 Let $\mathcal{B}(H)$ denote the algebra of bounded linear operators on a complex Hilbert space $H$. Suppose $T \in \mathcal{B}(H)$ satisfies the equality $TT^* = T^*T$, i.e. $T$ is normal. Then,

(a) $\sigma_p(T) = \overline{\sigma_p(T^*)}$;
(b) eigenvectors corresponding to distinct eigenvalues, if any, are orthogonal;
(c) $\sigma_r(T) = \emptyset$.

Proof

(a) Since $T$ is normal, for $\lambda \in \mathbb{C}$, $\|(\lambda I - T)x\| = \|(\lambda I - T)^*x\|$ for each $x \in H$. It follows that $\ker(\lambda I - T) \ne \{0\}$ if and only if $\ker(\bar\lambda I - T^*) \ne \{0\}$, that is, $\sigma_p(T) = \overline{\sigma_p(T^*)}$.

(b) Let $\lambda$, $\mu$ be distinct eigenvalues of $T$ and $x, y \in H$ corresponding eigenvectors. Then, $Tx = \lambda x$ and $Ty = \mu y$. It follows from (a) that $T^*y = \bar\mu y$. Hence, $\lambda(x,y) = (Tx,y) = (x, T^*y) = (x, \bar\mu y) = \mu(x,y)$, and since $\lambda \ne \mu$, we conclude $(x,y) = 0$.

(c) For any $T \in \mathcal{B}(H)$, $\sigma_r(T) = \overline{\sigma_p(T^*)}\setminus\sigma_p(T)$ by Proposition 4.3.3(d). It therefore follows upon using (a) above that $\sigma_r(T) = \emptyset$ when $T$ is normal. □

The spectrum of a self-adjoint operator can be characterised in more detail. Recall that the spectrum $\sigma(T)$ of an operator $T \in \mathcal{B}(H)$ is a nonempty compact subset of $\mathbb{C}$. In the present case, we have the following.

Theorem 4.4.4 The spectrum $\sigma(T)$ of a bounded self-adjoint linear operator $T$ on a complex Hilbert space $H$ lies in the closed interval $[m, M]$ on the real axis, where $m = \inf_{\|x\|=1}(Tx,x)$ and $M = \sup_{\|x\|=1}(Tx,x)$.


Proof The fact that $T = T^*$ implies $(Tx,x)$ is real for each $x \in H$. Indeed, for $x \in H$, we have $(Tx,x) = (x,Tx) = \overline{(Tx,x)}$.

The spectrum $\sigma(T)$ lies on the real axis [Theorem 4.4.2]. We show that any real number $M + \varepsilon$ with $\varepsilon > 0$ belongs to the resolvent set $\rho(T)$. For every $x \in H$, $x \ne 0$, and $v = \|x\|^{-1}x$, we have $x = \|x\|v$ and

$$(Tx,x) = \|x\|^2(Tv,v) \le \|x\|^2\sup_{\|v\|=1}(Tv,v) = M\|x\|^2.$$

Considering $((\lambda I - T)x, x)$, where $\lambda = M + \varepsilon$, $\varepsilon > 0$, we obtain

$$((\lambda I - T)x, x) = \lambda\|x\|^2 - (Tx,x) \ge (M + \varepsilon)\|x\|^2 - M\|x\|^2 = \varepsilon\|x\|^2.$$

This implies, by the Cauchy–Schwarz inequality, $\varepsilon\|x\|^2 \le \|(\lambda I - T)x\|\,\|x\|$, so that $\lambda I - T$ is bounded below and $\lambda \notin \sigma_{ap}(T) = \sigma(T)$ [Theorem 4.4.1]. Thus, $\lambda \in \rho(T)$.

The argument when $\lambda < m$ is similar and is therefore not included. □

Let $T \in \mathcal{B}(H)$, where $H$ is a Hilbert space over the field $\mathbb{C}$ of complex numbers, and $T = T^*$. In the theorem above, we defined

$$m = \inf_{\|x\|=1}(Tx,x)$$

and

$$M = \sup_{\|x\|=1}(Tx,x).$$

The numbers $m$ and $M$ are related to the norm $\|T\|$ of $T$. The following theorem has already been proved using Example 3.4.7(ii) and Corollary 3.4.11 [see Theorem 3.6.6]. An independent proof is desirable.

Theorem 4.4.5 For $T \in \mathcal{B}(H)$, $T = T^*$, we have

$$\|T\| = \sup_{\|x\|=1}|(Tx,x)|.$$

Proof Set $a = \sup_{\|x\|=1}|(Tx,x)|$. By the Cauchy–Schwarz inequality, $|(Tx,x)| \le \|T\|$ whenever $\|x\| = 1$, so that $a \le \|T\|$. It remains to prove that $\|T\| \le a$. If $Tx = 0$ for all $x \in H$ with $\|x\| = 1$, then $\|T\| = \sup_{\|x\|=1}\|Tx\| = 0$. In this case, the proof is complete. Let $x \in H$ be such that $\|x\| = 1$ and $Tx \ne 0$. Set $v = \|Tx\|^{\frac12}x$ and $w = \|Tx\|^{-\frac12}Tx$. Then, $\|v\|^2 = \|w\|^2 = \|Tx\|$. If $y_1 = v + w$ and $y_2 = v - w$, then, since $T = T^*$,

$$(Ty_1,y_1) - (Ty_2,y_2) = 2[(Tv,w) + (Tw,v)] = 2[(Tx,Tx) + (T^2x,x)] = 4\|Tx\|^2.$$

For $y \ne 0$ and $z = \|y\|^{-1}y$,

$$|(Ty,y)| = \|y\|^2|(Tz,z)| \le \|y\|^2\sup_{\|z\|=1}|(Tz,z)| = a\|y\|^2. \tag{4.1}$$

Hence,

$$(Ty_1,y_1) - (Ty_2,y_2) \le |(Ty_1,y_1)| + |(Ty_2,y_2)| \le a\{\|y_1\|^2 + \|y_2\|^2\} = 2a\{\|v\|^2 + \|w\|^2\} = 4a\|Tx\|. \tag{4.2}$$

Combining the last two computations,

$$4\|Tx\|^2 \le 4a\|Tx\|,$$

which implies

$$\|Tx\| \le a.$$

Taking the supremum over unit vectors $x$ yields $\|T\| \le a$. □
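The equality just proved can be confirmed for matrices; the sketch below (an illustration, not the text's argument) uses an eigendecomposition to exhibit a unit vector attaining the supremum.

```python
import numpy as np

# Theorem 4.4.5 for a self-adjoint matrix: ||T|| = sup_{||x||=1} |(Tx, x)|.
# The supremum is attained at a unit eigenvector belonging to the
# eigenvalue of largest modulus.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
T = (A + A.T) / 2

norm_T = np.linalg.norm(T, 2)
w, Q = np.linalg.eigh(T)
v = Q[:, np.argmax(np.abs(w))]        # unit eigenvector, extreme eigenvalue
assert np.isclose(abs(v @ T @ v), norm_T)

# No unit vector exceeds the bound:
for _ in range(1000):
    x = rng.standard_normal(5)
    x /= np.linalg.norm(x)
    assert abs(x @ T @ x) <= norm_T + 1e-12
```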

The bounds for $\sigma(T)$ in Theorem 4.4.4 cannot be tightened.

Theorem 4.4.6 If $T \in \mathcal{B}(H)$ is self-adjoint, then $m$ and $M$, where $m$ and $M$ are as in Theorem 4.4.4, are in the spectrum $\sigma(T)$ of $T$.

Proof We show that $M \in \sigma_{ap}(T) = \sigma(T)$. The proof that $m \in \sigma(T)$ is similar and is, therefore, not included.

By the Spectral Mapping Theorem 4.3.1, $M \in \sigma(T)$ if and only if $M + k \in \sigma(T + kI)$, where $k$ is a real constant. Without loss of generality, we may therefore assume $0 \le m \le M$. By Theorem 4.4.5, $M = \sup_{\|x\|=1}(Tx,x) = \|T\|$. Choose a sequence $\{x_n\}_{n\ge1}$ with $\|x_n\| = 1$ such that $(Tx_n,x_n) \ge M - \delta_n$, where $\delta_n \to 0$. Since $\|Tx_n\| \le \|T\| = M$,

$$\|(MI - T)x_n\|^2 = \|Tx_n\|^2 - 2M(Tx_n,x_n) + M^2\|x_n\|^2 \le M^2 - 2M(M - \delta_n) + M^2 = 2M\delta_n \to 0.$$

Hence, $M \in \sigma_{ap}(T)$. □

Remark If $T \in \mathcal{B}(H)$ is a nonzero self-adjoint operator and $m + M \ge 0$, then $M > 0$, since $m \le M$ and the bounds $m$ and $M$ cannot both be $0$ by Corollary 3.6.7. Therefore, $|m| \le |M|$. Hence, by Theorems 4.4.5 and 4.4.6, $\|T\| = |M| = M \in \sigma(T)$. On the other hand, if $m + M < 0$, then $m < 0$, and hence, $\|T\| = |m| = -m$, so that $-\|T\| = m \in \sigma(T)$.
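Theorems 4.4.4 and 4.4.6 together say that, for a self-adjoint matrix, the quadratic form on the unit sphere ranges within $[m, M]$, with $m$ and $M$ the extreme eigenvalues. A quick numerical check (an illustration, not from the text):

```python
import numpy as np

# Every value (Tx, x) with ||x|| = 1 lies in [m, M], where m and M are
# the extreme eigenvalues of the self-adjoint matrix T, and
# ||T|| = max(|m|, |M|).
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
T = (A + A.T) / 2

w = np.linalg.eigvalsh(T)            # sorted eigenvalues; σ(T) here
m, M = w[0], w[-1]
assert np.isclose(np.linalg.norm(T, 2), max(abs(m), abs(M)))

for _ in range(1000):
    x = rng.standard_normal(4)
    x /= np.linalg.norm(x)
    q = x @ T @ x
    assert m - 1e-12 <= q <= M + 1e-12
```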

We now consider a subset of scalars which is closely related to the spectrum $\sigma(T)$ of a bounded linear operator $T$ defined on a complex Hilbert space $H$.

Definition 4.4.7 The numerical range of a bounded linear operator $T$ defined on a complex Hilbert space $H$ is the set

$$W(T) = \{(Tx,x) : x \in H, \|x\| = 1\}.$$

The reader will note that $\|x\| = 1$, not $\|x\| \le 1$. The numerical range of $T$ is the range of the restriction to the unit sphere $\{x \in H : \|x\| = 1\}$ of the quadratic form $(Tx,x)$ associated with $T$.

The following properties of the numerical range are easy to discern:

(a) W(aI + bT) = a + bW(T), where a and b are complex numbers;

(b) W(T) is real if T is self-adjoint;

(c) W(U*TU) = W(T) if U is unitary.

Since $|(Tx,x)| \le \|T\|\|x\|^2$ for every $x \in H$, we see that $|\lambda| \le \|T\|$ for all $\lambda \in W(T)$. In particular, $W(T)$ is a bounded subset of $\mathbb{C}$. It, however, may not be closed. For example, let $H = \ell^2$ and let $T \in \mathcal{B}(H)$ be defined by $Tx = \sum_{n=1}^{\infty}\frac{a_n}{n}e_n$, where $x = \sum_{n=1}^{\infty}a_ne_n$. Then, $(Te_n,e_n) = \frac{1}{n} \in W(T)$ for each $n$, but $(Te_n,e_n) \to 0 \notin W(T)$. However, the numerical range $W(T)$ of $T \in \mathcal{B}(H)$ is a convex subset of $\mathbb{C}$, as we shall later prove.

Examples 4.4.8

(i) Let $T = \begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix}$ and $x = \begin{pmatrix}u\\ v\end{pmatrix} \in \mathbb{C}^2$, where $|u|^2 + |v|^2 = 1$. Now,

$$(Tx,x) = \left(\begin{pmatrix}u\\ 0\end{pmatrix}, \begin{pmatrix}u\\ v\end{pmatrix}\right) = |u|^2.$$

So,

$$W(T) = \{|u|^2 : 0 \le |u| \le 1\} = [0,1].$$

(ii) Let $T = \begin{pmatrix}0 & 0\\ 1 & 0\end{pmatrix}$ and $x = \begin{pmatrix}u\\ v\end{pmatrix} \in \mathbb{C}^2$, where $|u|^2 + |v|^2 = 1$. Now,

$$(Tx,x) = \left(\begin{pmatrix}0\\ u\end{pmatrix}, \begin{pmatrix}u\\ v\end{pmatrix}\right) = u\bar v.$$

Then, $|u\bar v| \le \frac12(|u|^2 + |v|^2) = \frac12$, and equality holds if and only if $|u| = |v| = \frac{1}{\sqrt2}$. In other words, the numerical range of the operator under consideration lies within the closed disc centred at $0$ and having radius $\frac12$. We proceed to show that the numerical range is in fact the entire disc.

Consider any complex number $X + iY$ lying in this disc; then, $X^2 + Y^2 \le \frac14$. Our claim is that there exist complex numbers $u$ and $v$ such that $u\bar v = X + iY$ and $|u|^2 + |v|^2 = 1$. Observe that as $r$ ranges over $[0,1]$, the product $r^2(1 - r^2)$ ranges over $[0, \frac14]$, taking the maximum value $\frac14$ when $r = \frac{1}{\sqrt2}$. Using this observation about the product $r^2(1 - r^2)$, we obtain a number $r \in [0,1]$ such that $r^2(1 - r^2) = X^2 + Y^2$. Taking $s$ to be $(1 - r^2)^{\frac12}$, we can write $r^2 + s^2 = 1$ and $X^2 + Y^2 = r^2s^2$. From the latter of these equalities, we have $X + iY = rse^{i\psi}$ for some $\psi$. Now, choose $\theta$ and $\phi$ in any manner so long as $\psi = \theta - \phi$ and set $u = re^{i\theta}$ and $v = se^{i\phi}$. Then, $|u|^2 + |v|^2 = r^2 + s^2 = 1$ and $u\bar v = rse^{i(\theta-\phi)} = rse^{i\psi} = X + iY$. This proves our claim. (Note that the numerical range has turned out to be convex.)
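The computations in (ii) can be replayed numerically; the sketch below (not from the text) samples random unit vectors and also exhibits the extreme value $\frac12$.

```python
import numpy as np

# For T = [[0,0],[1,0]], the values (Tx, x) = u·conj(v) over unit
# vectors (u, v) fill the closed disc of radius 1/2.
T = np.array([[0.0, 0.0], [1.0, 0.0]])
rng = np.random.default_rng(3)

vals = []
for _ in range(5000):
    x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    x /= np.linalg.norm(x)
    vals.append(np.vdot(x, T @ x))    # (Tx, x); vdot conjugates x
vals = np.array(vals)

assert np.max(np.abs(vals)) <= 0.5 + 1e-12
# The extreme value 1/2 is attained at u = v = 1/sqrt(2):
x = np.array([1.0, 1.0]) / np.sqrt(2)
assert np.isclose(np.vdot(x, T @ x), 0.5)
```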

(iii) Let $T = \begin{pmatrix}0 & 0\\ 1 & 1\end{pmatrix}$. We shall demonstrate that the numerical range of the operator $T$ in $\mathbb{C}^2$ is the set of all complex numbers $X + iY$ such that

$$\frac{(X - \frac12)^2}{(\frac{1}{\sqrt2})^2} + \frac{Y^2}{(\frac12)^2} \le 1.$$

The author is indebted to Professor Ajit Iqbal Singh for the elegant argument given below.

Lemma A If $A$ and $B$ are any two real numbers, not both zero, then the quadratic equation

$$(A^2 + B^2)t^2 - (2A + 1)t + 2 = 0$$

has a real root if and only if

$$\frac{(A - \frac12)^2}{(\frac{1}{\sqrt2})^2} + \frac{B^2}{(\frac12)^2} \le 1.$$

Proof The discriminant of the quadratic is

$$(2A + 1)^2 - 8(A^2 + B^2) = -2\left(\frac{(A - \frac12)^2}{(\frac{1}{\sqrt2})^2} + \frac{B^2}{(\frac12)^2} - 1\right).$$

Therefore, the quadratic equation, which has real coefficients, has a real root if and only if

$$\frac{(A - \frac12)^2}{(\frac{1}{\sqrt2})^2} + \frac{B^2}{(\frac12)^2} \le 1. \qquad\square$$

Lemma B A complex number $X + iY$ is of the form $(d + 1)/(|d|^2 + 1)$, where $d$ is a complex number, if and only if its real and imaginary parts $X$ and $Y$ satisfy

$$\frac{(X - \frac12)^2}{(\frac{1}{\sqrt2})^2} + \frac{Y^2}{(\frac12)^2} \le 1.$$

Proof Only if part: Suppose $X + iY = (d + 1)/(|d|^2 + 1)$ for some complex number $d$. Then, $d = (X + iY)(|d|^2 + 1) - 1$ and $\bar d = (X - iY)(|d|^2 + 1) - 1$, so that

$$|d|^2 = (X^2 + Y^2)(|d|^2 + 1)^2 - 2X(|d|^2 + 1) + 1.$$

Put $t = |d|^2 + 1$. Then, $t$ is real, $|d|^2 = t - 1$, and the above equation becomes a quadratic in $t$ with real coefficients, namely

$$(X^2 + Y^2)t^2 - (2X + 1)t + 2 = 0. \tag{$*$}$$

Since $(*)$ has the real root $t$, Lemma A shows that the inequality in question holds.


If part: Assume that $X + iY$ is any complex number such that its real and imaginary parts $X$ and $Y$ satisfy the inequality in question. If $X^2 + Y^2 = 0$, then $X + iY = 0$, and choosing $d = -1$ leads to $X + iY = (d + 1)/(|d|^2 + 1)$. So, suppose $X^2 + Y^2 \ne 0$. Using $X$ and $Y$, set up the quadratic equation $(*)$, which obviously has real coefficients. By Lemma A, it must have a real solution. In what follows, the symbol $t$ will denote any one real solution. Obviously, $t \ne 0$. Consider the complex number $d$ defined in terms of the nonzero number $t$ and the given complex number $X + iY$ as

$$X + iY = (d + 1)/t, \tag{$**$}$$

so that $d = (X + iY)t - 1$ and

$$|d|^2 = (Xt - 1)^2 + Y^2t^2 = (X^2 + Y^2)t^2 - 2Xt + 1.$$

Hence,

$$|d|^2 + 1 = (X^2 + Y^2)t^2 - 2Xt + 2 = t \quad\text{in view of } (*).$$

Substituting $t = |d|^2 + 1$ in $(**)$, the representation $X + iY = (d + 1)/(|d|^2 + 1)$ springs forth. □

With the above Lemma B in hand, we can now prove that the numerical range of the operator $T$ in $\mathbb{C}^2$ given by the matrix $\begin{pmatrix}0 & 0\\ 1 & 1\end{pmatrix}$ is the set of all complex numbers $X + iY$ such that

$$\frac{(X - \frac12)^2}{(\frac{1}{\sqrt2})^2} + \frac{Y^2}{(\frac12)^2} \le 1.$$

To see why this is so, let $x = \begin{pmatrix}p\\ q\end{pmatrix} \in \mathbb{C}^2$, where $|p|^2 + |q|^2 = 1$. Now,

$$(Tx,x) = \left(\begin{pmatrix}0\\ p + q\end{pmatrix}, \begin{pmatrix}p\\ q\end{pmatrix}\right) = (p + q)\bar q.$$

Suppose $q \ne 0$ and put $d = p/q$, so that $p = dq$. Then, $(Tx,x) = (d + 1)|q|^2$. Also, $(|d|^2 + 1)|q|^2 = 1$. So, $|q|^2 = 1/(|d|^2 + 1)$, and hence, $(Tx,x) = (d + 1)/(|d|^2 + 1)$, which is independent of $q$. Now, $d = -1$ implies $(Tx,x) = 0$, and when $q = 0$, we have $(Tx,x) = 0$ as well.

Therefore, the numerical range can be characterised as consisting of all values of $(d + 1)/(|d|^2 + 1)$ as $d$ ranges over all complex numbers (keeping in mind that the values of $(Tx,x)$ when $d$ is not available, i.e. when $q = 0$, are generated by $d = -1$). This characterisation reduces the matter to Lemma B.
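The ellipse description can be tested by sampling; the following sketch (an illustration, not the text's proof) checks the inequality for randomly chosen unit vectors.

```python
import numpy as np

# Sampled values (Tx, x) for T = [[0,0],[1,1]] satisfy
# (X - 1/2)^2/(1/sqrt(2))^2 + Y^2/(1/2)^2 <= 1.
T = np.array([[0.0, 0.0], [1.0, 1.0]])
rng = np.random.default_rng(4)

def in_ellipse(z, tol=1e-9):
    X, Y = z.real, z.imag
    # (1/sqrt(2))^2 = 0.5 and (1/2)^2 = 0.25
    return (X - 0.5) ** 2 / 0.5 + Y ** 2 / 0.25 <= 1 + tol

for _ in range(5000):
    x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    x /= np.linalg.norm(x)
    assert in_ellipse(np.vdot(x, T @ x))
```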

(iv) Let $T$ be the left unilateral shift defined on $\ell^2$ by $Tx = \sum_{n=1}^{\infty}x_{n+1}e_n$, where $x = \sum_{n=1}^{\infty}x_ne_n$. Then, $(Tx,x) = \sum_{n=1}^{\infty}x_{n+1}\bar x_n$. Taking $m$ to be the smallest index for which $x_m \ne 0$ (such an $m$ must exist when $\|x\| = 1$), we get

$$|(Tx,x)| \le \frac12\left[|x_m|^2 + 2|x_{m+1}|^2 + 2|x_{m+2}|^2 + \cdots\right] = \frac12\left[2 - |x_m|^2\right] < 1.$$

It follows that $W(T)$ is contained in the open unit disc with centre $0$. Conversely, let $z = re^{i\theta}$ with $0 \le r < 1$. Consider the vector

$$x = \sum_{n=1}^{\infty}r^{n-1}\sqrt{1 - r^2}\,e^{i(n-1)\theta}e_n.$$

Then, $\|x\|^2 = (1 - r^2)\sum_{n=1}^{\infty}r^{2(n-1)} = 1$ and

$$(Tx,x) = \sum_{n=1}^{\infty}x_{n+1}\bar x_n = (1 - r^2)e^{i\theta}\sum_{n=1}^{\infty}r^{2n-1} = re^{i\theta} = z.$$

Thus, $W(T)$ is precisely the open unit disc.
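The explicit vector used in (iv) can be checked on a truncation of $\ell^2$; the sketch below (not from the text; the values of $r$ and $\theta$ are arbitrary) recovers $(Tx,x) = re^{i\theta}$ up to a negligible tail.

```python
import numpy as np

# The unit vector with entries x_n = r^(n-1) sqrt(1-r^2) e^{i(n-1)θ}
# gives (Tx, x) = r e^{iθ} for the left shift T.
r, theta = 0.7, 1.1
n = np.arange(200)
x = r ** n * np.sqrt(1 - r ** 2) * np.exp(1j * n * theta)

Tx = np.append(x[1:], 0.0)        # left shift: (Tx)_n = x_{n+1}
val = np.vdot(x, Tx)              # (Tx, x)
assert abs(np.linalg.norm(x) - 1.0) < 1e-10
assert abs(val - r * np.exp(1j * theta)) < 1e-10
```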

The numerical range $W(T)$ of a bounded linear operator $T$ belonging to $\mathcal{B}(H)$, where $H$ is a complex Hilbert space, has decent properties, some of which are easy to prove.

Theorem 4.4.9 Let $H$ be a Hilbert space over $\mathbb{C}$ and $T \in \mathcal{B}(H)$. Then, $W(T) = \{(Tx,x) : \|x\| = 1\}$ has the following properties:

(a) $\lambda \in W(T)$ if and only if $\bar\lambda \in W(T^*)$;
(b) [Hausdorff–Toeplitz] $W(T)$ is a convex subset of $\mathbb{C}$;
(c) $\sigma(T) \subseteq \overline{W(T)}$;
(d) if $T$ is normal, then the convex hull of the spectrum $\sigma(T)$ of $T$, $\mathrm{co}(\sigma(T)) = \overline{W(T)}$.

Proof

(a) Let $x \in H$ with $\|x\| = 1$. Then, $\overline{(Tx,x)} = (x,Tx) = (T^*x,x)$. Thus, $(Tx,x) \in W(T)$ if and only if $\overline{(Tx,x)} \in W(T^*)$.

(b) Let $\xi = (Tx,x)$ and $\eta = (Ty,y)$ for unit vectors $x$ and $y$ in $H$. We want to prove that every point of the segment joining $\xi$ and $\eta$ is in $W(T)$. If $\xi = \eta$, the problem is trivial. Suppose $\xi \ne \eta$. Choose complex numbers $a$ and $b$ such that $a\xi + b = 1$ and $a\eta + b = 0$. Indeed, $a = 1/(\xi - \eta)$ and $b = -\eta/(\xi - \eta)$ are the desired complex numbers.

Consequently, the set $\{0, 1\}$ is contained in $W(aT + bI)$. It will suffice to show that the interval $(0,1)$ is included in $W(aT + bI)$. If any $t \in (0,1)$ can be shown to be of the form $a(Tz,z) + b$, where $\|z\| = 1$, then, since $a(t\xi + (1-t)\eta) + b = t(a\xi + b) + (1-t)(a\eta + b) = t$, it follows that $(Tz,z) = t\xi + (1-t)\eta$, a point of the segment joining $\xi$ and $\eta$, lies in $W(T)$.

So, there is no loss of generality in assuming that $\xi = 1$ and $\eta = 0$, i.e. $(Tx,x) = 1$ and $(Ty,y) = 0$, and showing that $[0,1] \subseteq W(T)$. It follows that $x$ and $y$ are linearly independent, because otherwise, $x$ would be a scalar multiple of $y$, and hence, $(Tx,x)$ would also be zero. Write

$$T = T_1 + iT_2,$$

where $T_1 = \frac12(T + T^*)$ and $T_2 = \frac{1}{2i}(T - T^*)$ are self-adjoint. Then,

$$1 = (Tx,x) = (T_1x,x) + i(T_2x,x) \Rightarrow (T_1x,x) = 1,\ (T_2x,x) = 0,$$

$$0 = (Ty,y) = (T_1y,y) + i(T_2y,y) \Rightarrow (T_1y,y) = 0,\ (T_2y,y) = 0.$$

Replacing $x$ by $\kappa x$, where $|\kappa| = 1$, leaves these relations unaltered, and $(T_2\,\kappa x, y) = \kappa(T_2x,y)$. Furthermore, we may assume that $(T_2x,y)$ is purely imaginary. Indeed, $\kappa = i\bar\mu/|\mu|$, where $\mu = (T_2x,y)$, has the desired property (if $\mu = 0$, nothing need be done).

Set $z(t) = tx + (1-t)y$, $0 \le t \le 1$. Since $x$ and $y$ are linearly independent, $z(t) = 0$ for no $t$. Since

$$(T_2z(t), z(t)) = t^2(T_2x,x) + t(1-t)\big((T_2x,y) + \overline{(T_2x,y)}\big) + (1-t)^2(T_2y,y)$$

for all $t$, it follows from the relations $(T_2x,x) = 0 = (T_2y,y)$ and $\Re(T_2x,y) = 0$ that $(T_2z(t), z(t)) = 0$. Hence, $(Tz(t), z(t))$ is real for all $t$. So, the function

$$t \mapsto (Tz(t), z(t))/\|z(t)\|^2$$

is a continuous real-valued function on $[0,1]$, whose values at $t = 0$ and $t = 1$ are,

respectively, 0 and 1. Hence, the range of the function contains every t 2 [0, 1].

(c) First, let $\lambda \in \sigma_p(T)$ and let $x$ be a corresponding eigenvector with $\|x\| = 1$. Since $(Tx,x) = (\lambda x,x) = \lambda(x,x) = \lambda\|x\|^2 = \lambda$, we see that $\lambda \in W(T)$. Next, let $\lambda \in \sigma(T)$. Note that $\sigma(T) = \sigma_{ap}(T) \cup \sigma_{com}(T)$ [Remarks 4.1.4] $= \sigma_{ap}(T) \cup \overline{\sigma_p(T^*)}$ [Proposition 4.3.3(a)]. So, $\lambda \in \sigma_{ap}(T)$ or $\bar\lambda \in \sigma_p(T^*)$. If $\lambda \in \sigma_{ap}(T)$, then there is a sequence $\{x_n\}_{n\ge1}$ in $H$ such that $\|x_n\| = 1$ and $Tx_n - \lambda x_n \to 0$ as $n \to \infty$. Since

$$|(Tx_n,x_n) - \lambda| = |((T - \lambda I)x_n, x_n)| \le \|(T - \lambda I)x_n\|\,\|x_n\| \to 0 \quad\text{as } n \to \infty,$$

it follows that $(Tx_n,x_n) \to \lambda$, so that $\lambda \in \overline{W(T)}$. Also, if $\bar\lambda \in \sigma_p(T^*)$, then we have seen above that $\bar\lambda \in W(T^*)$, and hence, $\lambda \in W(T)$ by (a) above. This completes the proof.

(d) For a proof of this, we refer the reader to [3]. h

When $T$ is self-adjoint, (d) can be seen directly: $(Tx,x) = (x,Tx) = \overline{(Tx,x)}$. Consequently, $W(T) \subseteq \mathbb{R}$. If $m = \inf_{\|x\|=1}(Tx,x)$ and $M = \sup_{\|x\|=1}(Tx,x)$, then $W(T) \subseteq [m,M]$. Since $W(T)$ is convex [(b) above], so is $\overline{W(T)}$; and since $m$ and $M$ are limits of values $(Tx,x)$, both belong to $\overline{W(T)}$. Therefore, $\overline{W(T)} = [m,M]$. Now, $[m,M] = \mathrm{co}\,\sigma(T)$ in view of Theorems 4.4.4 and 4.4.6. Hence, $\mathrm{co}\,\sigma(T) = \overline{W(T)}$, i.e. for a self-adjoint $T$, (d) holds.

The numerical range, like the spectrum, associates a set of complex numbers with each operator $T \in \mathcal{B}(H)$; it is a set-valued function. The smallest disc centred at the origin that contains the numerical range has radius given by

$$w(T) = \sup\{|\lambda| : \lambda \in W(T)\},$$

called the numerical radius of $T$. In Theorem 3.7.7, it was proved that for a normal operator, the norm is the same as its numerical radius.

Observe that $w(T)$ is a vector space norm on $\mathcal{B}(H)$. That is, $0 \le w(T)$ for every $T \in \mathcal{B}(H)$ and $0 < w(T)$ whenever $T$ is not zero; $w(aT) = |a|w(T)$ and $w(T + S) \le w(T) + w(S)$ for every $a \in \mathbb{C}$ and every $S$ and $T$ in $\mathcal{B}(H)$. The numerical radius will now be shown to be equivalent to the operator norm of $\mathcal{B}(H)$ and to dominate the spectral radius.

Proposition 4.4.11 For any $T \in \mathcal{B}(H)$, we have $0 \le r(T) \le w(T) \le \|T\| \le 2w(T)$.

Proof Since $\sigma(T) \subseteq \overline{W(T)}$ by Theorem 4.4.9(c), we have

$$\sup\{|\lambda| : \lambda \in \sigma(T)\} \le \sup\{|\lambda| : \lambda \in \overline{W(T)}\} = w(T).$$

So,

$$r(T) \le w(T).$$

Moreover, $|(Tu,u)| \le \|T\|\|u\|^2 = \|T\|$ whenever $\|u\| = 1$, so that $w(T) \le \|T\|$. Note that

$$|(Tz,z)| \le \sup\{|(Tu,u)| : \|u\| = 1\}\,\|z\|^2 = w(T)\|z\|^2 \quad\text{for every } z \in H.$$

By the polarization identity,

$$4|(Tx,y)| = |(T(x+y), x+y) - (T(x-y), x-y) + i(T(x+iy), x+iy) - i(T(x-iy), x-iy)|$$

$$\le w(T)\big(\|x+y\|^2 + \|x-y\|^2 + \|x+iy\|^2 + \|x-iy\|^2\big) = 4w(T)\big(\|x\|^2 + \|y\|^2\big) \le 8w(T) \quad\text{whenever } \|x\| = 1 = \|y\|.$$

Therefore, $\|T\| = \sup\{|(Tx,y)| : \|x\| = \|y\| = 1\} \le 2w(T)$. □
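The chain of inequalities can be verified numerically. The sketch below (an illustration, not the text's argument) computes the numerical radius through the identity $w(T) = \max_\theta \lambda_{\max}\big((e^{i\theta}T + e^{-i\theta}T^*)/2\big)$, a standard reformulation that is not stated in the text.

```python
import numpy as np

# Check of r(T) <= w(T) <= ||T|| <= 2 w(T) for a random matrix.
rng = np.random.default_rng(5)
T = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

r = np.max(np.abs(np.linalg.eigvals(T)))       # spectral radius
norm = np.linalg.norm(T, 2)                    # operator norm
w = max(
    np.linalg.eigvalsh((np.exp(1j * t) * T + np.exp(-1j * t) * T.conj().T) / 2)[-1]
    for t in np.linspace(0, 2 * np.pi, 721)
)

tol = 0.02 * norm      # slack for the θ-discretisation of w(T)
assert r <= w + tol
assert w <= norm + 1e-9
assert norm <= 2 * w + tol
```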

Remark It is known that if $T^2$ is the zero operator, then $\|T\| = 2w(T)$. See [28] and references therein.

If $a$ is any positive number, then the vector space norm $w_a(T) = aw(T)$ is an algebra norm (i.e. satisfies $w_a(ST) \le w_a(S)w_a(T)$) if and only if $a \ge 4$. See [11].

The inequality $\frac14\|T^*T + TT^*\| \le w(T)^2 \le \frac12\|T^*T + TT^*\|$ has been proved in Kittaneh [18].

If $T \in \mathcal{B}(H)$, then $|(Tx,y)|^2 \le (|T|x,x)(|T^*|y,y)$ and $2w(T) \le \|T\| + \|T^2\|^{\frac12}$. The second of these is due to Kittaneh [17].

If $S, T \in \mathcal{B}(H)$ are positive, then $\|S^{\frac12}T^{\frac12}\| \le \|ST\|^{\frac12}$ and

$$2\|S + T\| \le \|S\| + \|T\| + \left((\|S\| - \|T\|)^2 + 4\|S^{\frac12}T^{\frac12}\|^2\right)^{\frac12}.$$

We now turn to the properties of the spectrum of unitary and isometric operators. Recall that $U \in \mathcal{B}(H)$, the algebra of all bounded linear operators on a complex Hilbert space $H$, is unitary if and only if $UU^* = U^*U = I$. Moreover, $\|U\| = 1 = \|U^*\|$. Therefore, $\sigma(U) \subseteq \{\lambda \in \mathbb{C} : |\lambda| \le 1\}$ and so is $\sigma(U^*)$. Note that $0 \notin \sigma(U)$ since $U$, by definition, is invertible. If $0 < |\lambda| < 1$, then $\lambda I - U = \lambda(U^* - \frac{1}{\lambda}I)U$. Since $\frac{1}{\lambda}$ is not in the closed unit disc, the operator $\lambda(U^* - \frac{1}{\lambda}I)$ is invertible, and hence, so is $\lambda I - U$. Thus, $\sigma(U) \subseteq \{\lambda \in \mathbb{C} : |\lambda| = 1\}$. The unitary operator $U$ is normal, so $\sigma_r(U) = \emptyset$ [Theorem 4.4.3].

Examples 4.4.12

(i) (Bilateral shift; (i) of Examples 3.7.13). The operator $U : \ell^2(\mathbb{Z}) \to \ell^2(\mathbb{Z})$ is defined by the rule

$$(Ux)(k) = x(k-1), \quad k \in \mathbb{Z}.$$

Then $U$ is unitary, so $\|U\| = \|U^*\| = 1$, and so $\sigma(U)$ and $\sigma(U^*)$ are contained in the closed unit disc. From the paragraph above, it follows that $\sigma(U) \subseteq \{\lambda \in \mathbb{C} : |\lambda| = 1\}$. From the normality of $U$, $\sigma(U) = \sigma_{ap}(U)$ [Theorem 4.4.1]. We next show that each $\lambda$ with $|\lambda| = 1$ is an approximate eigenvalue of $U$.

For fixed $\theta$ in $[0, 2\pi]$ and $n \in \mathbb{N}$, let $x_n$ be the vector in $\ell^2(\mathbb{Z})$ defined by

$$x_n(k) = \begin{cases}(2n+1)^{-\frac12}e^{-ik\theta}, & |k| \le n\\ 0, & \text{otherwise.}\end{cases}$$

Note that $\|x_n\|^2 = (2n+1)^{-1}\sum_{k=-n}^{n}1 = 1$. Also, $(Ux_n)(k) = x_n(k-1) = (2n+1)^{-\frac12}e^{-i(k-1)\theta} = e^{i\theta}x_n(k)$ for $-n+1 \le k \le n$, so that $(U - e^{i\theta}I)x_n$ has only two nonzero entries, in positions $-n$ and $n+1$, each of modulus $(2n+1)^{-\frac12}$. Therefore, $\|(U - e^{i\theta}I)x_n\|^2 = \frac{2}{2n+1}$, so that $\lim_n(U - e^{i\theta}I)x_n = 0$. Thus, each $e^{i\theta}$, $\theta \in [0, 2\pi]$, is an approximate eigenvalue of $U$.
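The computation $\|(U - e^{i\theta}I)x_n\|^2 = 2/(2n+1)$ is easy to reproduce on a finite window of $\ell^2(\mathbb{Z})$; the following is a sketch, not from the text.

```python
import numpy as np

# x_n(k) = (2n+1)^(-1/2) e^{-ikθ} for |k| <= n, realised on a finite
# window with zero padding; (Ux)(k) = x(k-1) is a circular shift here,
# harmless because the wrapped entries are zero.
theta = 0.9
for n in (5, 50):
    k = np.arange(-n - 2, n + 3)
    x = np.where(np.abs(k) <= n, np.exp(-1j * k * theta), 0) / np.sqrt(2 * n + 1)
    Ux = np.roll(x, 1)                 # (Ux)(k) = x(k-1)
    resid_sq = np.linalg.norm(Ux - np.exp(1j * theta) * x) ** 2
    assert abs(resid_sq - 2 / (2 * n + 1)) < 1e-12
```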

It may be argued that $\sigma_p(U) = \emptyset$, as is done below. Let $\lambda$ be an eigenvalue, so that $|\lambda| = 1$, and let $x \in \ell^2(\mathbb{Z})$ be a corresponding nonzero eigenvector. Since $(Ux)(n) = x(n-1)$ and $(\lambda x)(n) = \lambda x(n)$, we have $x(n-1) = \lambda x(n)$, and hence, $x(-n) = \lambda^n x(0)$ for any $n \ge 0$. This implies

$$\|x\|^2 \ge \sum_{n=0}^{\infty}|x(-n)|^2 = |x(0)|^2\sum_{n=0}^{\infty}|\lambda|^{2n},$$

and since $|\lambda| = 1$, the series diverges unless $x(0) = 0$; this leads to $x(0) = 0$. Therefore, $x(n) = 0$ for all nonpositive $n$; by similar considerations, we can show that $x(n) = 0$ for positive $n$ as well. Hence, the contradiction that $x = 0$.

(ii) (Multiplication Operator) Let $H = L^2[0, 2\pi]$. The multiplication operator $U : H \to H$ is defined by the formula $(Ux)(t) = e^{it}x(t)$, $x \in H$. Note that $U$ is unitary [(ii) of Examples 3.7.13]. So, $\sigma(U) \subseteq \{\lambda \in \mathbb{C} : |\lambda| = 1\}$. It follows from (iii) of Examples 4.1.2 that $\sigma(U) = \{e^{it} : t \in [0, 2\pi]\}$. From the fact that $U$ is normal, each point of the spectrum is an approximate eigenvalue [Theorem 4.4.1].

The functions $e^{int}$, $n \in \mathbb{Z}$, form an orthonormal basis of $L^2[0, 2\pi]$ (after normalisation). If we identify $L^2[0, 2\pi]$ with $\ell^2(\mathbb{Z})$ in terms of this basis, then the multiplication operator $U$ gets identified with the bilateral shift. Thus, (ii) is really the 'same' example as (i). Therefore, the operators have the same spectrum of each kind. In particular, the point spectrum of the multiplication operator is empty, a fact which can of course be deduced directly from the definition of the operator as well.

Recall that an operator $V \in \mathcal{B}(H)$, the algebra of operators on a complex Hilbert space $H$, is an isometry if $\|Vx\| = \|x\|$ for each $x \in H$. The norm $\|V\|$ of $V$ is $1$. So, $\sigma(V) \subseteq \{\lambda \in \mathbb{C} : |\lambda| \le 1\}$. There exist isometries whose spectrum coincides with the unit disc. In fact, if $V$ is the simple unilateral shift and $V^*$ denotes the adjoint of $V$ [Example 4.3.6], then $\sigma(V^*) = \{\lambda \in \mathbb{C} : |\lambda| \le 1\}$ and $\sigma(V) = \{\bar\lambda : \lambda \in \sigma(V^*)\} = \{\lambda \in \mathbb{C} : |\lambda| \le 1\}$.

The next result shows that the eigenvalues of an isometry, if any, lie on the unit circle and the eigenspaces corresponding to distinct eigenvalues are orthogonal.

Theorem 4.4.13 Let $V \in \mathcal{B}(H)$ be an isometry. Then,

(a) Every $\lambda \in \sigma_p(V)$ lies on the unit circle.
(b) If $M_\lambda$ and $M_\mu$ are eigenspaces of $V$ corresponding to $\lambda$ and $\mu$, respectively, then $\lambda \ne \mu$ implies $M_\lambda \perp M_\mu$.

Proof

(a) Let $\lambda \in \sigma_p(V)$. Then, there exists an $x \in H$, $x \ne 0$, such that $Vx = \lambda x$. Now, $\|x\|^2 = \|Vx\|^2 = (Vx, Vx) = |\lambda|^2\|x\|^2$, so that $|\lambda|^2 = 1$, and hence, $|\lambda| = 1$.

(b) Let $x \in M_\lambda$ and $y \in M_\mu$. Since an isometry preserves inner products, $(x,y) = (Vx, Vy) = (\lambda x, \mu y) = \lambda\bar\mu(x,y)$. As $|\lambda| = |\mu| = 1$ and $\lambda \ne \mu$, we have $\lambda\bar\mu \ne 1$, and it follows that $(x, y) = 0$. This completes the proof. □

Problem Set 4.4

=k 6¼ 0. Show that

4.5 Compact Linear Operators

Recall the linear differential operator of Example (x) of 3.2.5. [An unbounded linear operator is defined on a dense linear subspace of the space under consideration.] The theory developed for bounded linear operators is not applicable to differential operators. In order to overcome this difficulty in part, the results of bounded linear operators are applied to the inverse operators of differential operators after restricting the latter to a subspace on which they are injective. The inverse of the linear differential operator cited above is the familiar Volterra operator in (ix) of 3.2.5: $Vx(s) = \int_0^s x(t)\,dt$. These inverse operators are not only bounded, but in addition possess a special property called 'compactness'. Compact operators are also called completely continuous operators. Most of the statements about these operators are generalisations of the statements about linear operators in finite-dimensional spaces.

The use of linear operator methods to prove some of Fredholm's results on linear integral equations of the form

$$(T - \lambda I)x(s) = y(s), \quad\text{where } Tx(s) = \int_a^b k(s,t)x(t)\,dt,$$

$\lambda$ being a parameter, $y$ and $k$ given functions and $x$ the unknown function, was pioneered by F. Riesz in 1916. The concept of linear spaces had not been formulated by then and Riesz worked with integral equations. His techniques generalise directly and can be applied to compact (or completely continuous) operators.


Definition 4.5.1 Let $X$ and $Y$ be normed linear spaces. A linear operator $T : X \to Y$ is called a compact operator (or completely continuous operator) if it maps the unit ball $B = \{x \in X : \|x\| \le 1\}$ of $X$ onto a precompact (i.e. having compact closure) subset of $Y$.

Since $T$ is linear, this means that for every bounded subset $M$ of $X$, the closure $\overline{T(M)}$ is a compact subset of $Y$.

The sequence criterion for compactness in a metric space tells us that $T$ is compact if and only if for every bounded sequence $\{x_n\}_{n\ge1}$ in $X$, the sequence $\{Tx_n\}_{n\ge1}$ in $Y$ has a convergent subsequence.

The following lemma shows that a compact linear operator is continuous,

whereas the converse is generally not true [see Remark 4.5.3(i)].

Lemma 4.5.2 Let $X$ and $Y$ be normed linear spaces. Then, every compact linear operator $T : X \to Y$ is bounded and hence continuous.

Proof The unit sphere $S = \{x \in X : \|x\| = 1\}$ is bounded. Since $T$ is a compact operator, $\overline{T(S)}$ is compact. It is therefore bounded, that is,

$$\sup_{\|x\|=1}\|Tx\| < \infty. \qquad\square$$

Remarks 4.5.3

(i) We show that the identity operator on an infinite-dimensional normed linear space is not compact.

Let $X$ be an infinite-dimensional normed linear space and $\{x_1, x_2, \ldots\}$ denote linearly independent vectors in $X$, and let $M_n$ denote the linear span of $x_1, \ldots, x_n$. We claim that there exist $y_n$, $n = 1, 2, \ldots$, satisfying the properties

$$\|y_n\| = 1 \quad\text{for all } n \tag{4.3}$$

and

$$\|y_{n+1} - x\| \ge \frac12 \quad\text{for all } x \in M_n. \tag{4.4}$$

To begin with, set

$$y_1 = \frac{x_1}{\|x_1\|} \in M_1.$$

Since $M_1$ is a proper closed subspace of $M_2$, by Riesz's Lemma 5.2.11, there exists a vector $y_2 \in M_2$ with $\|y_2\| = 1$ such that

$$\|y_2 - x\| \ge \frac12 \quad\text{for all } x \in M_1.$$

Continuing in this manner, we obtain $y_1, y_2, \ldots$ satisfying the conditions specified in (4.3) and (4.4). Now, consider the sequence $\{y_n\}_{n\ge1}$. It is clear that it is bounded ($\|y_n\| = 1$, $n = 1, 2, \ldots$). Its image under the identity operator is the sequence itself. In view of (4.4), the sequence under consideration satisfies

$$\|y_n - y_m\| \ge \frac12 \quad\text{for all } n \ne m,$$

and therefore has no convergent subsequence. Hence, the identity operator is not compact.

The above remark essentially says that in an inﬁnite-dimensional normed space,

the unit ball is never compact.

(ii) In the case of some normed linear spaces, it is possible to prove the above result without appealing to the Riesz lemma, as the following example shows.

Let $X = \ell^p$, and $e_k = (0, 0, \ldots, 0, 1, 0, \ldots)$, where $1$ occurs at the $k$th place, $k = 1, 2, \ldots$. Then, $\{e_n\}_{n\ge1}$ is a bounded sequence in $\ell^p$ and $\|e_n\|_p = 1$, $n = 1, 2, \ldots$. However, $\{Ie_n\}_{n\ge1}$ has no convergent subsequence. Indeed,

$$\|Ie_n - Ie_m\|_p = \|e_n - e_m\|_p = 2^{1/p} \quad\text{for } n \ne m.$$

In a Hilbert space, one may instead take any orthonormal sequence $\{e_k\}_{k\ge1}$; then, $\|e_k\| = 1$ for every $k$ and $\|e_n - e_m\| = \sqrt2$ whenever $n \ne m$.
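The distance computation behind (ii) is elementary and can be confirmed directly (illustration only):

```python
import numpy as np

# In ℓ^p the unit vectors stay at mutual distance 2^(1/p), so {I e_n}
# can have no Cauchy, hence no convergent, subsequence.
for p in (1.0, 2.0, 4.0):
    N = 10
    E = np.eye(N)                      # rows are e_1, …, e_N
    for i in range(N):
        for j in range(i + 1, N):
            d = np.sum(np.abs(E[i] - E[j]) ** p) ** (1.0 / p)
            assert abs(d - 2 ** (1.0 / p)) < 1e-12
```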

(iii) If either $X$ or $Y$ is finite-dimensional, then every $T \in \mathcal{B}(X,Y)$ is compact. Suppose $\dim(Y) < \infty$. Let $\{x_n\}_{n\ge1}$ be a bounded sequence in $X$. Then, the inequality $\|Tx_n\| \le \|T\|\|x_n\|$ shows that $\{Tx_n\}_{n\ge1}$ is bounded. Since $\dim(Y) < \infty$, it follows that $\{Tx_n\}_{n\ge1}$ has a convergent subsequence. Now, suppose $\dim(X) < \infty$. Note that $\dim(TX) \le \dim(X)$. The result therefore follows from what has just been proved.

Definition 4.5.4 Let $T \in \mathcal{B}(X,Y)$. The rank of $T$ is defined to be the dimension of the range $\mathrm{ran}(T)$ of $T$. If the range is finite-dimensional, we say that $T$ has finite rank.

The rank is a purely algebraic concept.

In (iii) of the remark above, we have noted that finite rank operators in $\mathcal{B}(X,Y)$ are compact. We write $\mathcal{B}_0(X,Y)$ for the collection of all compact operators from $X$ to $Y$ and $\mathcal{B}_{00}(X,Y)$ for the collection of all finite rank operators from $X$ to $Y$. We abbreviate $\mathcal{B}_0(X,X)$ as $\mathcal{B}_0(X)$.

The reader will note that $\mathcal{B}_{00}(X,Y) \subseteq \mathcal{B}_0(X,Y) \subseteq \mathcal{B}(X,Y)$.

Examples 4.5.5

(i) Let $X$ be a normed linear space, $z$ a vector in $X$ and $f$ a bounded linear functional on $X$. We define $T : X \to X$ by

$$Tx = f(x)z, \quad x \in X.$$

Then $\mathrm{ran}(T)$ is at most one-dimensional, so $T$ is of finite rank and hence compact. Moreover, $\|Tx\| = |f(x)|\|z\| \le \|f\|\|x\|\|z\|$, which implies

$$\|T\| \le \|f\|\|z\|.$$

(ii) Let $X = C[0,1]$, the space of continuous functions on $[0,1]$ with $\|x\| = \sup\{|x(t)| : 0 \le t \le 1\}$. Let $k(s,t)$ be a continuous kernel on $[0,1] \times [0,1]$. Define the integral operator $T$ by

$$(Tx)(s) = \int_0^1 k(s,t)x(t)\,dt, \quad x \in C[0,1].$$

Let $\{x_n\}_{n\ge1}$ be a sequence in $X$ with $\|x_n\| \le 1$ for all $n$. We shall show that $\{Tx_n\}_{n\ge1}$ has a convergent subsequence. For this, we shall use Ascoli's Theorem [see Theorem 1.2.21]. Since $\|Tx_n\| \le \|T\|$, the sequence $\{Tx_n\}_{n\ge1}$ is bounded. We shall show that it is equicontinuous. Since $k$ is uniformly continuous, for each $\varepsilon > 0$, there exists a $\delta > 0$ such that $|s_1 - s_2| < \delta$ implies $|k(s_1,t) - k(s_2,t)| < \varepsilon$ for all $t \in [0,1]$. Thus, for $|s_1 - s_2| < \delta$, we have

$$|Tx_n(s_1) - Tx_n(s_2)| \le \int_0^1|k(s_1,t) - k(s_2,t)||x_n(t)|\,dt \le \varepsilon\int_0^1|x_n(t)|\,dt \le \varepsilon.$$

By Ascoli's Theorem, $\{Tx_n\}_{n\ge1}$ has a convergent subsequence.


In the case of the kernel $k(s,t)$ equal to $1$ when $t < s$ and $0$ otherwise, which is patently discontinuous, the above argument does not apply. But we still have an operator in $C[0,1]$, called the Volterra operator, just like its counterpart in $L^2[0,1]$. Since $|(Tx)(s)| = |\int_0^s k(s,t)x(t)\,dt| \le s\|x\| \le \|x\|$, not only is $T$ bounded with norm at most $1$, but it also satisfies $|Tx(s_1) - Tx(s_2)| \le |s_1 - s_2|\,\|x\|$, which has the consequence that $T$ maps a bounded set in $C[0,1]$ into an equicontinuous set. By Ascoli's Theorem, $T$ is compact.

(iii) Let k be a complex function belonging to $L^2([0, 1] \times [0, 1])$. We define the transformation T on $L^2[0, 1]$ by
\[ (Tx)(s) = \int_0^1 k(s, t)\,x(t)\,dt, \qquad x \in L^2[0, 1]. \]
The computation
\[ \int_0^1 |(Tx)(s)|^2\,ds = \int_0^1 \Big|\int_0^1 k(s, t)\,x(t)\,dt\Big|^2\,ds \le \int_0^1 \Big\{\int_0^1 |k(s, t)|^2\,dt\Big\}\Big\{\int_0^1 |x(t)|^2\,dt\Big\}\,ds = \|x\|^2 \int_0^1\!\!\int_0^1 |k(s, t)|^2\,dt\,ds, \]
using the Cauchy–Schwarz inequality, shows that $Tx \in L^2[0, 1]$ for every $x \in L^2[0, 1]$, with
\[ \|T\| \le \Big\{\int_0^1\!\!\int_0^1 |k(s, t)|^2\,ds\,dt\Big\}^{1/2} = \|k\|_{L^2([0,1]\times[0,1])}. \]
Let $U : L^2([0, 1] \times [0, 1]) \to B(L^2[0, 1])$ be defined by
\[ U(k) = T. \]
We have shown above that $T = U(k)$ is a bounded linear operator with norm at most $\|k\|_{L^2([0,1]\times[0,1])}$.
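A discrete analogue of the bound $\|T\| \le \|k\|_{L^2}$ is the inequality between the spectral norm of a matrix and its Frobenius (Hilbert–Schmidt) norm, obtained by the same Cauchy–Schwarz computation. A quick numerical check (an illustration; a random matrix stands in for the discretised kernel):

```python
import numpy as np

rng = np.random.default_rng(1)
K = rng.standard_normal((40, 40))   # discretised kernel k(s, t)

spec = np.linalg.norm(K, 2)         # operator norm of the induced map
frob = np.linalg.norm(K, 'fro')     # discrete analogue of ||k||_{L^2}
```

The spectral norm never exceeds the Frobenius norm, mirroring $\|T\| \le \|k\|_{L^2([0,1]\times[0,1])}$.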

268 4 Spectral Theory and Special Classes of Operators

Let $\{f_i\}_{i\ge1}$ be an orthonormal basis of $L^2[0, 1]$; then the functions $f_i(s)f_j(t)$, $i, j \ge 1$, form an orthonormal basis in $L^2([0, 1] \times [0, 1])$, so $k(s, t) = \sum_{i,j=1}^\infty a_{i,j} f_i(s) f_j(t)$, where the series converges in the norm of $L^2([0, 1] \times [0, 1])$. Let $k_n(s, t) = \sum_{i,j=1}^n a_{i,j} f_i(s) f_j(t)$ and define
\[ (K_n x)(s) = \int_0^1 k_n(s, t)\,x(t)\,dt, \qquad x \in L^2[0, 1]. \]
Each $K_n$ is of finite rank, and $\|K_n - T\| \le \|k_n - k\|_{L^2([0,1]\times[0,1])} \to 0$ as $n \to \infty$; consequently, T is compact.

In the case of the Volterra operator, k(s, t) equals 1 if $0 \le t \le s$ and equals 0 if $s < t \le 1$. Therefore, it is a compact operator in $L^2[0, 1]$. Since its range includes all polynomials with constant term 0, it is not of finite rank. In fact, its range is dense, as has been shown in Example 4.3.2. Thus, we have $B_{00}(X, Y) \subsetneq B_0(X, Y) \subsetneq B(X, Y)$ when $X = Y = L^2[0, 1]$. Moreover, its spectrum consists of only 0, which is not an eigenvalue [see Example 4.3.2].
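The approximation of T by the finite rank operators $K_n$ has a concrete matrix analogue: truncating the singular value decomposition after n terms gives a rank-n matrix whose distance to the original, in operator norm, is the next singular value, and so decreases to 0. A sketch (a random matrix stands in for T; not from the text):

```python
import numpy as np

rng = np.random.default_rng(2)
T = rng.standard_normal((30, 30))
U, s, Vt = np.linalg.svd(T)         # singular values s are sorted decreasingly

errs = []
for n in range(30):
    # best rank-n approximation: keep the n largest singular triples
    Tn = (U[:, :n] * s[:n]) @ Vt[:n, :]
    errs.append(np.linalg.norm(T - Tn, 2))   # equals s[n]
```

The error sequence is exactly the tail of singular values, so it is nonincreasing, just as $\|K_n - T\| \to 0$ above.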

Recall that the uniform limit of a sequence of continuous functions is continuous. The following similar result for completely continuous operators holds.

Theorem 4.5.6 Let $\{T_n\}_{n\ge1}$ be a sequence of completely continuous operators mapping a normed linear space X into a Banach space Y such that
\[ \lim_n \|T_n - T\| = 0. \]
Then T is completely continuous.

Proof Let $B_0 = \{x \in X : \|x\| \le 1\}$ be the unit ball in X. Since $T_n$ is compact, the set $T_n(B_0)$ in Y is precompact (i.e. $\overline{T_n(B_0)}$ is compact). Given $\varepsilon > 0$, there exists $n \in \mathbb{N}$ such that $\|T_n - T\| < \varepsilon/3$. By compactness, we can cover $T_n(B_0)$ with a finite number m of balls $B(T_n x_j, \varepsilon/3)$, where $x_1, x_2, \ldots, x_m$ are in $B_0$. Suppose $x \in B_0$ and let $j \le m$ be such that $\|T_n x - T_n x_j\| < \varepsilon/3$. By the triangle inequality,
\[ \|Tx - Tx_j\| \le \|Tx - T_n x\| + \|T_n x - T_n x_j\| + \|T_n x_j - Tx_j\| \le 2\|T_n - T\| + \|T_n x - T_n x_j\| < \varepsilon. \]
Therefore,
\[ T(B_0) \subseteq \bigcup_{j=1}^m B(Tx_j, \varepsilon), \]
so that $T(B_0)$ is totally bounded. Since Y is complete, $\overline{T(B_0)}$ is compact, and hence T is compact. ∎


Corollary 4.5.7 If $T \in B(X, Y)$ and there exists a sequence $T_n \in B_{00}(X, Y)$ such that $\|T_n - T\| \to 0$ as $n \to \infty$, then $T \in B_0(X, Y)$.

Proof Every finite rank operator is compact, so this is immediate from Theorem 4.5.6. ∎

Remark

(i) The above corollary provides a frequently used sufficient condition for an operator to be compact; namely, it is sufficient that it be the norm limit of a sequence of finite rank operators. The necessity of this condition has been shown to be false by P. Enflo [8].

(ii) If X and Y are Hilbert spaces, the following statement also holds [see Problem 4.5.P6]: if $T \in B(X, Y)$ is compact and ran(T) = Y, then there exists a sequence $T_n \in B_{00}(X, Y)$ such that $\|T_n - T\| \to 0$ as $n \to \infty$.

Lemma 4.5.8 The set $B_0(X, Y)$ of all compact linear operators is a closed linear subspace of $B(X, Y)$.

Proof Let S and T be in $B_0(X, Y)$ and $\alpha, \beta \in \mathbb{C}$. If $\{x_n\}_{n\ge1}$ is a bounded sequence in X, then $\{Tx_n\}_{n\ge1}$ has a convergent subsequence $\{Tx_{n_k}\}_{k\ge1}$, say, and $\{Sx_{n_k}\}_{k\ge1}$ in turn has a convergent subsequence $\{Sx_{n_{k(j)}}\}_{j\ge1}$, say. The sequence $\{Tx_{n_{k(j)}}\}_{j\ge1}$ converges because it is a subsequence of a convergent sequence. It is now clear that the sequence $\{\alpha Tx_{n_{k(j)}} + \beta Sx_{n_{k(j)}}\}_{j\ge1}$ converges, so that $\alpha T + \beta S \in B_0(X, Y)$.

Using Theorem 4.5.6, we conclude that $B_0(X, Y)$ is a closed linear subspace of $B(X, Y)$. ∎

Theorem 4.5.9 If S and T are linear operators mapping a normed linear space X into itself, where S is completely continuous and T is bounded, then ST and TS are completely continuous operators.

Proof Suppose B is a bounded set in X and consider
\[ ST(B) = S(T(B)). \]
Since T is bounded, T(B) is a bounded subset of X. S being completely continuous implies S(T(B)) is precompact. Therefore, ST is a completely continuous operator.

Consider now TS(B) = T(S(B)). By the complete continuity of S, S(B) is precompact, i.e. $\overline{S(B)}$ is compact. Since the continuous image of a compact set is a compact set, and T is continuous, it follows that $T(\overline{S(B)})$ is compact. Note that $T(S(B)) \subseteq T(\overline{S(B)})$. This implies T(S(B)) is precompact. The complete continuity of TS has now been proved. ∎


Remark Combining this result with Lemma 4.5.8 and Theorem 4.5.9, we see that $B_0(X)$, the class of all completely continuous operators on X, is what is known as a 'closed two-sided ideal' in the algebra $B(X)$ of all bounded linear operators on X. In particular, the square of a compact operator is compact. The converse, however, is not true, as the following example shows.

In $\ell^2$, let $\{e_k\}$ be the standard basis, i.e. $e_k = (0, 0, \ldots, 0, 1, 0, 0, \ldots)$, where the only nonzero entry is 1 and occurs in the kth position. Define the operator T in $\ell^2$ by setting $Te_k = (1 - (-1)^k)e_{2k}$. Certainly, T maps into $\ell^2$, because convergence of $\sum_k |x_k|^2$ implies that of $\sum_k (1 - (-1)^k)^2 |x_k|^2$. Clearly, $T^2 = O$. But T is not compact, because it maps the bounded sequence $e_1, e_3, e_5, \ldots$ into the sequence $2e_2, 2e_6, 2e_{10}, \ldots$, which can have no convergent subsequence in view of the fact that $\|2e_m - 2e_n\| = 2\sqrt{2}$ when $n \ne m$.

The simple unilateral shift is not compact because it maps the bounded sequence

e1, e2, e3, … into the sequence e2, e3, e4, ….

It is easy to see that $B_{00}(X)$, the class of all finite rank operators on X, is a two-sided ideal in $B(X)$. Suppose $T \in B_{00}(X)$ and $S \in B(X)$. Since the range of the product TS is contained in that of T, it is surely finite-dimensional. To see why ST also has finite-dimensional range, we first note that T(X) is finite-dimensional, and therefore, any linear image of it is also finite-dimensional. It follows that the linear image S(T(X)) = ST(X) is finite-dimensional.

We shall use this to show that if X is a Hilbert space H, then the adjoint T* of a finite rank operator $T \in B_{00}(H)$ and its absolute value |T| are also of finite rank. Let P be the orthogonal projection on the range of T. Then PT = T and P* = P. Therefore, T* = T*P* = T*P, which must have finite rank, because P does. Also, T*T must have finite rank, which means ran(T*T) is finite-dimensional and hence closed. It follows by the last part of Theorem 3.9.13 that |T| has finite rank.

It turns out that the closure of $B_{00}(H)$ is precisely $B_0(H)$; so, there is no question of $B_{00}(H)$ being closed unless it equals $B_0(H)$, which we know it does not when $H = L^2[0, 1]$.

We recall the following definition for the benefit of the reader.

Let H be a Hilbert space, $\{x_n\}_{n\ge1}$ a sequence of elements of H and $x \in H$. If, for all elements $y \in H$, the sequence $(x_n, y)$ of scalars converges to $(x, y)$ as $n \to \infty$, then $\{x_n\}_{n\ge1}$ is said to converge weakly to x, and we write
\[ x_n \xrightarrow{w} x. \]

The following equivalent criterion for a compact operator in a Hilbert space holds.

Theorem 4.5.10 An operator $T \in B(H)$ is compact if and only if it maps every weakly convergent sequence into a strongly convergent sequence.

Proof Suppose that $T \in B(H)$ is a compact operator and let $\{x_n\}_{n\ge1}$ be a sequence in H such that $x_n \xrightarrow{w} x$. If possible, suppose that $\{Tx_n\}_{n\ge1}$ does not converge strongly to Tx. Then there exist $\varepsilon > 0$ and an increasing sequence $n_1, n_2, \ldots$ such that $\|Tx_{n_k} - Tx\| \ge \varepsilon$, $k = 1, 2, \ldots$. As the sequence $\{x_n\}_{n\ge1}$ converges weakly, it follows [Theorem 2.12.6] that $\|x_n\| \le M$, $n = 1, 2, \ldots$, for some suitable M > 0, and hence, by compactness of T, the sequence $\{Tx_{n_k}\}_{k\ge1}$ has a subsequence $\{Tx_{n_{k_j}}\}_{j\ge1}$ such that $Tx_{n_{k_j}} \to y$ as $j \to \infty$ strongly. Since strong convergence implies weak convergence, $Tx_{n_{k_j}} \xrightarrow{w} y$ as $j \to \infty$. Also, $x_{n_{k_j}} \xrightarrow{w} x$ as $j \to \infty$, so $Tx_{n_{k_j}} \xrightarrow{w} Tx$ as $j \to \infty$. Thus y = Tx and $Tx_{n_{k_j}} \to Tx$ as $j \to \infty$ strongly. This contradicts
\[ \|Tx_{n_{k_j}} - Tx\| \ge \varepsilon, \qquad j = 1, 2, \ldots. \]
Conversely, suppose that $\{x_n\}_{n\ge1}$ is a bounded sequence in H. Then it contains a weakly convergent subsequence $\{x_{n_k}\}_{k\ge1}$ [Theorem 2.12.5]. By hypothesis, $\{Tx_{n_k}\}_{k\ge1}$ converges in H; consequently, T is compact. ∎

As an illustration of the use of the above theorem, we show that if $\{e_n\}_{n\ge1}$ is an orthonormal sequence, not necessarily complete, in a Hilbert space H and T a compact operator, then $\|Te_n\| \to 0$. First consider an arbitrary subsequence, which we shall continue to call $\{e_n\}_{n\ge1}$ for ease of notation. For any $x \in H$, the sum $\sum_{n=1}^\infty |(x, e_n)|^2$ must converge by Bessel's inequality [Theorem 2.8.6]. So, $(x, e_n) \to 0$ as $n \to \infty$, i.e. the sequence $\{e_n\}_{n\ge1}$ converges to 0 weakly. Using the fact that T is continuous, we find that $Te_n \to 0$ weakly. As T is also compact, it follows by Theorem 4.5.10 that $\{Te_n\}_{n\ge1}$ converges strongly to some $y \in H$. Since strong convergence implies weak convergence, we know that $\{Te_n\}_{n\ge1}$ converges weakly to y, and hence, y = 0. Thus, $\{Te_n\}_{n\ge1}$ converges strongly to 0.
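A concrete instance of $\|Te_n\| \to 0$: for the diagonal compact operator $Te_n = e_n/n$ on $\ell^2$, a finite section already exhibits the decay (a numerical sketch; the truncation size N is our choice):

```python
import numpy as np

N = 1000
# finite section of the compact diagonal operator T e_n = e_n / n
T = np.diag(1.0 / np.arange(1, N + 1))

# ||T e_n|| is the norm of the nth column, namely 1 / (n + 1) in 0-based indexing
norms = [np.linalg.norm(T[:, n]) for n in range(N)]
```

The column norms decrease to 0, as Theorem 4.5.10 predicts for any compact T applied to an orthonormal sequence.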

The next result can be rephrased as saying that the class of compact operators in

a Hilbert space is closed under taking adjoints.

Theorem 4.5.11 The adjoint of a compact operator is compact.

Proof Let $\{x_n\}_{n\ge1}$ be a sequence in H such that $\|x_n\| \le M$, $n = 1, 2, \ldots$, where M > 0. If $y_n = T^*x_n$, $n = 1, 2, \ldots$, then $\{y_n\}_{n\ge1}$ is also a bounded sequence in H. Since T is compact, the sequence $\{Ty_n\}_{n\ge1}$ has a convergent subsequence $\{Ty_{n_j}\}_{j\ge1}$, say. For all i, j,
\[ \|y_{n_i} - y_{n_j}\|^2 = \|T^*x_{n_i} - T^*x_{n_j}\|^2 = (T^*(x_{n_i} - x_{n_j}),\, T^*(x_{n_i} - x_{n_j})) = (TT^*(x_{n_i} - x_{n_j}),\, x_{n_i} - x_{n_j}) \le \|TT^*(x_{n_i} - x_{n_j})\|\,\|x_{n_i} - x_{n_j}\| \le 2M\,\|Ty_{n_i} - Ty_{n_j}\|. \]
This implies that the sequence $\{y_{n_j}\}_{j\ge1}$ is a Cauchy sequence in H, and since H is complete, it converges in H. Consequently, T* is a compact operator. ∎

We shall show that when T is compact, the operator I − T has the following feature of operators in a finite-dimensional space: it is onto if and only if it is one-to-one.

Theorem 4.5.12 Let T be a compact operator on a Hilbert space H. Then ran(I − T) = H if and only if ker(I − T) = {0}.

Proof Suppose ran(I − T) = H but ker(I − T) ≠ {0}. Then, there exists a nonzero vector $x_1 \in \ker(I - T)$. Since ran(I − T) = H, we can obtain a sequence $\{x_n\}_{n\ge1}$ of nonzero vectors in H such that
\[ (I - T)x_{n+1} = x_n, \qquad n = 1, 2, \ldots. \]
Then $(I - T)^n x_{n+1} = x_1 \ne 0$ while $(I - T)^{n+1} x_{n+1} = (I - T)x_1 = 0$; put another way,
\[ x_{n+1} \in \ker(I - T)^{n+1} \quad\text{but}\quad x_{n+1} \notin \ker(I - T)^n. \]
Combined with the obvious inclusions $\ker(I - T)^n \subseteq \ker(I - T)^{n+1}$ for every n, this yields the strict inclusions
\[ \ker(I - T) \subsetneq \ker(I - T)^2 \subsetneq \ker(I - T)^3 \subsetneq \cdots. \]
Each of these kernels is closed, and therefore, each $\ker(I - T)^n$ is a proper closed subspace of the Hilbert space $\ker(I - T)^{n+1}$. Hence, there exists a sequence $\{y_n\}_{n\ge1}$ of unit vectors such that
\[ y_n \in \ker(I - T)^{n+1} \quad\text{and}\quad y_n \perp \ker(I - T)^n. \]
Then, surely,
\[ \|y_p - z\| \ge 1 \quad\text{for every } z \in \ker(I - T)^p. \]
Now suppose p > q. The vector $y_q + (I - T)y_p - (I - T)y_q$ lies in $\ker(I - T)^p$, because
\[ (I - T)^p y_q = (I - T)^{p-q-1}(I - T)^{q+1} y_q = 0, \qquad (I - T)^p (I - T)y_p = (I - T)^{p+1} y_p = 0, \]
and likewise $(I - T)^p (I - T)y_q = 0$. Therefore, $\|y_p - (y_q + (I - T)y_p - (I - T)y_q)\| \ge 1$, i.e. $\|Ty_p - Ty_q\| \ge 1$ when p > q. But this means that although the set $\{y_n : n \ge 1\}$ is bounded, the set $\{Ty_n : n \ge 1\}$ cannot contain a Cauchy sequence. This contradicts the compactness of T and thereby shows that ran(I − T) = H ⇒ ker(I − T) = {0}.

For the converse, suppose that ker(I − T) = {0}. By Theorem 3.5.8, the

orthogonal complement of ker(I − T) is the closure of the range of I − T*.

Therefore, ran(I − T*) is dense. However, by Theorem 4.5.11, T* is also compact,

and hence, by Problem 4.5.P14, ran(I − T*) is closed. Thus, ran(I − T*) = H. By the

compactness of T*, what has already been proved above implies that ker(I − T*) =

{0}. Invoking Theorem 3.5.8 once again in exactly the same manner as above, we

find that ran(I − T) = H. ∎

In the presence of the additional hypothesis that T is self-adjoint, the above result is a trivial consequence of Theorem 3.5.8 and Problem 4.5.P14. However, much more can be said in that situation [see Problem 4.5.P15].

An alternative between two specified statements is an assertion to the effect that precisely one of the two statements holds, i.e. one of them holds but not both. A little reflection shows that this is the same as saying that one holds if and only if the other does not. For the sceptical reader, we show the corresponding simple computation in Boolean algebra, wherein $\wedge$ denotes conjunction, $\vee$ denotes disjunction and $'$ denotes negation. Recall that $P \Rightarrow Q$ is the same as $P' \vee Q$. The computation is as follows:
\[ (P \vee Q) \wedge (P \wedge Q)' = (P \vee Q) \wedge (P' \vee Q') = (P' \vee Q') \wedge (Q'' \vee P) = (P \Rightarrow Q') \wedge (Q' \Rightarrow P). \]
Thus, an alternative between two statements is the same as an equivalence between one of the statements and the negation of the other. Conventionally, the equivalence asserted by Theorem 4.5.12 is expressed as an alternative and named after the discoverer, who originally put it forth in 1903 in the context of integral equations:

Theorem 4.5.13 (Fredholm Alternative) For a compact operator T in a Hilbert

space H, precisely one of the following holds:

(a) For every y 2 H, there exists x 2 H such that x − Tx = y;

(b) There exists a nonzero x 2 H such that x − Tx = 0.
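In finite dimensions every operator is compact and the alternative reduces to elementary linear algebra: either $I - T$ is invertible, so case (a) holds with a unique solution for every y, or it has a nontrivial kernel, which is case (b). A sketch under that finite-dimensional reading (the matrix and right-hand side are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 0.1 * rng.standard_normal((6, 6))   # for this T, 1 is not an eigenvalue,
A = np.eye(6) - T                       # so I - T is invertible: case (a)

y = rng.standard_normal(6)
x = np.linalg.solve(A, y)               # the unique x with x - Tx = y
residual = np.linalg.norm(x - T @ x - y)
```

Were 1 an eigenvalue of T instead, `solve` would fail and nonzero solutions of x − Tx = 0 would exist, which is case (b).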


Theorem 4.5.14 For a compact operator T in a Hilbert space H, the dimensions of ker(I − T) and ker(I − T*) are the same.

Proof Both the dimensions in question are finite because of the compactness of T and hence of T* [Prop. 4.8.3]. Let $\{x_1, \ldots, x_n\}$ and $\{y_1, \ldots, y_m\}$ be orthonormal bases of ker(I − T) and ker(I − T*), respectively. It is sufficient to prove that assuming m > n leads to a contradiction. With this in view, assume that m > n.

Set up the operator S defined as
\[ Sx = Tx + \sum_{j=1}^n (x, x_j)\,y_j, \qquad x \in H. \]

Being the sum of T and a finite rank operator, S is compact. Since $y_1, \ldots, y_n$ are orthonormal and lie in ker(I − T*), we obtain for any $x \in H$
\[ ((I - S)x, y_k) = ((I - T)x, y_k) - \sum_{j=1}^n (x, x_j)(y_j, y_k) = (x, (I - T^*)y_k) - (x, x_k) = -(x, x_k), \qquad 1 \le k \le n. \]
Now, let $x \in \ker(I - S)$. Then, the above n equalities lead to the n orthogonality relations
\[ (x, x_k) = 0, \qquad 1 \le k \le n, \]
that is,
\[ x \in \ker(I - T)^\perp. \]
On the other hand, since $(I - S)x = 0$, it follows from the definition of S that $(I - T)x = \sum_{j=1}^n (x, x_j)\,y_j$; the same n orthogonality relations imply this time around that
\[ x \in \ker(I - T). \]
But $\ker(I - T)^\perp \cap \ker(I - T) = \{0\}$, and it follows that x = 0. This validates our contention that ker(I − S) = {0}.

As S is compact, Theorem 4.5.12 now tells us that ran(I − S) = H. In particular, $y_{n+1} = (I - S)z$ for some $z \in H$. Recalling the definition of S, we obtain
\[ (I - T)z = \sum_{j=1}^n (z, x_j)\,y_j + y_{n+1}. \]


Considering that $y_1, \ldots, y_{n+1}$ are orthonormal and $y_{n+1} \in \ker(I - T^*)$, we now arrive at the contradiction that
\[ 0 = (z, (I - T^*)y_{n+1}) = ((I - T)z, y_{n+1}) = \Big(\sum_{j=1}^n (z, x_j)\,y_j + y_{n+1},\; y_{n+1}\Big) = (y_{n+1}, y_{n+1}) = 1. \qquad ∎ \]

Since integral operators are compact, the Fredholm alternative and

Theorem 4.5.14 have direct implications regarding solutions of integral equations;

in fact, they are generalisations of Fredholm’s results on the latter. For an explicit

formulation in terms of integral equations, the reader is referred to Limaye [21,

p. 339] or Riesz and Nagy [24, p. 164].

Theorems 4.5.12, 4.5.13 and 4.5.14 can be further generalised even to Banach

spaces, but the matter will not be taken up in this book.
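In finite dimensions, Theorem 4.5.14 reduces to the fact that a matrix and its conjugate transpose have the same rank, hence kernels of equal dimension. A toy check with a deliberately degenerate $I - T$ (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(4)
u = rng.standard_normal(5) + 1j * rng.standard_normal(5)
v = rng.standard_normal(5) + 1j * rng.standard_normal(5)

T = np.eye(5) - np.outer(u, v.conj())   # chosen so that I - T = u v* has rank 1
A = np.eye(5) - T                       # = u v*

dim_ker = 5 - np.linalg.matrix_rank(A)             # dim ker(I - T)
dim_ker_adj = 5 - np.linalg.matrix_rank(A.conj().T)  # dim ker(I - T*)
```

Both kernels are 4-dimensional here, since rank(uv*) = rank((uv*)*) = 1.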

Problem Set 4.5

4.5.P1. Show that the operator $K : L^2[a, b] \to L^2[a, b]$ defined by
\[ (Ku)(t) = \int_a^t u(s)\,ds, \qquad u \in L^2[a, b], \]
is compact.

4.5.P2. Show that the operator $K : L^2[a, b] \to L^2[a, b]$ defined by
\[ (Kf)(t) = \sum_{j=1}^n u_j(t) \int_a^b w_j(s)\,f(s)\,ds, \]
i.e. the integral operator with the degenerate kernel $k = \sum_{j=1}^n u_j \otimes w_j$, where $u_j, w_j$ are in $L^2[a, b]$, is of finite rank.

4.5.P3. (a) Let the operator $K : L^2[0, 1] \to L^2[0, 1]$ be given by
\[ (Kx)(t) = \int_0^1 k(t, s)\,x(s)\,d\mu(s). \]
Prove that K is self-adjoint, is compact and has denumerably many negative eigenvalues with 0 as the only accumulation point.

(b) Suppose in part (a), the function k is changed to be k(t, s) = min{t, s}.

Prove that K is self-adjoint, is compact and has positive eigenvalues.

(The reader can check that they are denumerably many and have 0 as

the only accumulation point.)
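For part (b), the kernel min{t, s} has eigenfunctions $\sin((n - \tfrac12)\pi t)$ and positive eigenvalues $1/((n - \tfrac12)^2\pi^2)$. A Nyström discretisation recovers the largest of these, $4/\pi^2$, numerically (a sketch; the midpoint grid and its size are our choices):

```python
import numpy as np

N = 400
h = 1.0 / N
t = (np.arange(N) + 0.5) * h                 # midpoint grid on [0, 1]

# Nystrom matrix for the integral operator with kernel k(t, s) = min{t, s}
K = np.minimum.outer(t, t) * h

eigs = np.linalg.eigvalsh(K)                 # K is symmetric: real spectrum, ascending
largest = eigs[-1]
expected = 4.0 / np.pi**2                    # eigenvalue 1/((1 - 1/2)^2 pi^2)
```

The computed eigenvalues are all positive and the largest agrees with $4/\pi^2 \approx 0.4053$ up to discretisation error.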

4.5.P4. Take V as in Problem 3.8.P4, and find the eigenvalues of the operator V*V on $L^2[0, 1]$. Prove that $\|V\| = 2/\pi$.

4.5.P5. Let V be the Volterra operator on $L^2[0, 1]$ [see Example (ix) of 3.2.5]. Prove by induction that
\[ (V^n x)(t) = \int_0^t \frac{(t - s)^{n-1}}{(n - 1)!}\,x(s)\,ds. \]
Use this to solve the integral equation
\[ y(t) = \sin t + \int_0^t y(s)\,ds. \tag{4.5} \]
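Equation (4.5) can also be checked numerically: differentiating it gives $y' = \cos t + y$, $y(0) = 0$, whose solution is $y(t) = (e^t + \sin t - \cos t)/2$, and a discretised Volterra operator reproduces this (a sketch; the midpoint discretisation is our choice):

```python
import numpy as np

N = 2000
h = 1.0 / N
t = (np.arange(N) + 0.5) * h                  # midpoint grid on [0, 1]

# Midpoint-rule discretisation of (Vy)(t) = integral_0^t y(s) ds:
# full cells below the diagonal plus half of the current cell
V = h * np.tril(np.ones((N, N)), -1) + (h / 2) * np.eye(N)

# y = sin + Vy, i.e. (I - V) y = sin
y = np.linalg.solve(np.eye(N) - V, np.sin(t))

exact = (np.exp(t) + np.sin(t) - np.cos(t)) / 2
err = np.max(np.abs(y - exact))
```

The discrete solution matches the closed form that the Neumann series $\sum_n V^n \sin$ yields.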

4.5.P6. (a) Let H and K be Hilbert spaces and $T \in B_0(H, K)$. Show that ran(T) is separable.

(b) Let $\{e_k\}_{k\ge1}$ be an orthonormal basis for $\overline{\mathrm{ran}(T)}$. If $P_n : K \to K$ is the orthogonal projection onto the closed linear subspace generated by $\{e_k\}_{1 \le k \le n}$, then show that $P_n T \to PT$ (unif), where P is the orthogonal projection on $\overline{\mathrm{ran}(T)}$.

4.5.P7. Let H be a separable Hilbert space with basis $\{e_n\}_{n\ge1}$. Let $\{a_n\}_{n\ge1}$ be a sequence of complex numbers with $M = \sup_n |a_n| < \infty$. Define an operator T on $\ell^2$ by
\[ Te_n = a_n e_n, \qquad n = 1, 2, \ldots. \]
Show that T is bounded, and that T is compact if and only if $\lim_n a_n = 0$.

4.5.P8. Let $\{a_j\}_{j\ge1}$ be a sequence of complex numbers with $\sum_{j=1}^\infty |a_j| < \infty$. Define an operator T on $\ell^2$ by
\[ Tx = \Big(\sum_{i=1}^\infty a_i x_i,\ \sum_{i=1}^\infty a_{i+1} x_i,\ \ldots,\ \sum_{i=1}^\infty a_{i+n-1} x_i,\ \ldots\Big), \qquad x = (x_1, x_2, \ldots) \in \ell^2. \]
Show that T is compact.


4.5.P9. Let $[s_{ij}]_{i,j\ge1}$ be an infinite matrix with $\sum_{i,j=1}^\infty |s_{ij}|^2 < \infty$, and let the operator T be defined on $\ell^2$ by
\[ T(\{x_i\}_{i\ge1}) = \{y_i\}_{i\ge1}, \quad\text{where}\quad y_i = \sum_{j=1}^\infty s_{ij} x_j,\ \ i = 1, 2, \ldots. \]
Show that T is compact.

4.5.P10. Define $T : \ell^2 \to \ell^2$ by $Tx = T(x_1, x_2, \ldots) = (\sum_{j=1}^\infty s_{1j} x_j, \sum_{j=1}^\infty s_{2j} x_j, \ldots)$, where $s_{ik} = 0$ for $|i - k| > 1$. Then T is compact if and only if $\lim_{i,k} s_{i,k} = 0$. Observe that the matrix defining T has the form
\[ \begin{bmatrix} s_{11} & s_{12} & 0 & 0 & \cdots \\ s_{21} & s_{22} & s_{23} & 0 & \cdots \\ 0 & s_{32} & s_{33} & s_{34} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}. \]
In the notation
\[ \begin{bmatrix} a_1 & b_1 & 0 & 0 & \cdots \\ c_1 & a_2 & b_2 & 0 & \cdots \\ 0 & c_2 & a_3 & b_3 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}, \]
the condition is equivalent to $\lim_k a_k = 0 = \lim_k b_k = \lim_k c_k$.

4.5.P11. Prove that the mapping T defined on $\ell^2$ by
\[ Tx = \Big(\xi_1, \tfrac{1}{2}\xi_2, \tfrac{1}{3}\xi_3, \ldots\Big), \qquad x = (\xi_1, \xi_2, \xi_3, \ldots) \in \ell^2, \]
is compact.


4.5.P12. Let T be a compact operator on a Hilbert space H and let $\lambda \ne 0$ be a scalar which is not an eigenvalue of T. Show that $\lambda \notin \sigma(T)$.

4.5.P13. Construct an example of a compact operator which has no proper value.

4.5.P14. Let $T \in B(H)$, H a complex Hilbert space, be compact and $\lambda \ne 0$ a complex number. Then, ran(T − λI) is closed.

4.5.P15. (Fredholm Alternative) Let $T \in B(H)$, H a complex Hilbert space, be compact and self-adjoint. If λ is an eigenvalue of T, we denote by $N_\lambda(T)$ the eigenspace of T associated with λ and by $P_\lambda$ the orthogonal projection of H onto $N_\lambda(T)$. Then, one of the following holds:

(a) If $\lambda \ne 0$ is not an eigenvalue of T, then the equation
\[ Tx - \lambda x = y \tag{4.6} \]
has a unique solution for every $y \in H$, namely
\[ x = (T - \lambda I)^{-1} y = \sum_{\mu \in \sigma_p(T)} (\mu - \lambda)^{-1} P_\mu y. \]

(b) If λ is an eigenvalue of T, then equation (4.6) has solutions for $y \in N_\lambda(T)^\perp$ and no solution otherwise. In the first case, the solutions are given by
\[ x = z + \sum_{\substack{\mu \in \sigma_p(T) \\ \mu \ne \lambda}} (\mu - \lambda)^{-1} P_\mu y, \]
with $z \in N_\lambda(T)$.

4.5.P16. Let $H = L^2[0, 1]$. For $x \in H$, let
\[ Tx(s) = \int_0^1 k(s, t)\,x(t)\,dt, \]
where
\[ k(s, t) = \begin{cases} (1 - s)t & 0 \le t \le s \le 1 \\ s(1 - t) & 0 \le s \le t \le 1. \end{cases} \]
Let $x \in H$ and $0 \ne \lambda \in \mathbb{C}$ be such that $Tx = \lambda x$. Then, for all $s \in [0, 1]$,
\[ \lambda x(s) = Tx(s) = \int_0^s (1 - s)\,t\,x(t)\,dt + \int_s^1 s(1 - t)\,x(t)\,dt. \tag{4.7} \]


Show that $Tx(s) = \sum_{n=1}^\infty \frac{2}{n^2\pi^2} \big[\int_0^1 x(t)\sin n\pi t\,dt\big] \sin n\pi s$. Use the Fredholm alternative to determine the solution of the operator equation $Tx - \lambda x = y$, $y \in H$.
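The expansion above exhibits the eigenvalues $1/(n^2\pi^2)$ of T, with eigenfunctions $\sin n\pi t$. A discretisation of the kernel of 4.5.P16 confirms the largest one numerically (a sketch; the midpoint grid and its size are our choices):

```python
import numpy as np

N = 400
h = 1.0 / N
s = (np.arange(N) + 0.5) * h                 # midpoint grid on [0, 1]
S, T_ = np.meshgrid(s, s, indexing='ij')

# kernel of 4.5.P16: (1 - s) t for t <= s, s (1 - t) for s <= t
K = np.where(T_ <= S, (1 - S) * T_, S * (1 - T_)) * h

eigs = np.linalg.eigvalsh(K)                 # symmetric kernel: real spectrum
largest = eigs[-1]
expected = 1.0 / np.pi**2                    # eigenvalue for n = 1
```

The largest computed eigenvalue agrees with $1/\pi^2 \approx 0.1013$ up to discretisation error.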

4.5.P17. Let $\{a_j\}_{j\ge1}$ be a sequence of complex numbers such that $\sum_{j=1}^\infty |a_j| < \infty$. Define an operator on $\ell^2$ by the matrix
\[ A = \begin{bmatrix} a_1 & a_2 & a_3 & \cdots \\ a_2 & a_3 & a_4 & \cdots \\ a_3 & a_4 & a_5 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}. \]
Show that A is compact.

4.6 Hilbert–Schmidt Operators

Problem 3.2.P1 and Example (viii) of 3.2.5 provide sufficient conditions on infinite matrices and kernels to induce bounded linear operators on a Hilbert space. In fact, Example (viii) of 3.2.5 is a continuous analogue of Problem 3.2.P3. These are typical illustrations of a class of operators, the Hilbert–Schmidt operators. We shall show that if T is a Hilbert–Schmidt operator in a Hilbert space H, then so is its adjoint T*. These operators constitute a two-sided ideal in $B(H)$, the algebra of bounded linear operators in H. Every Hilbert–Schmidt operator is a compact operator. The converse is, however, not true. The class of Hilbert–Schmidt operators is defined as follows.

Definition 4.6.1 Let $T \in B(H)$ be an operator on a Hilbert space H, and let $\{x_\gamma\}_{\gamma\in\Gamma}$ be an orthonormal basis for H. If $\sum_{\gamma\in\Gamma} \|Tx_\gamma\|^2 < \infty$, then T is called a Hilbert–Schmidt operator.

The set of all Hilbert–Schmidt operators on H will be denoted by HS.

In this definition of the class HS, a particular orthonormal basis was used. The following lemma shows that the class HS depends only upon the Hilbert space and not upon the basis.

Lemma 4.6.2 Let $T \in B(H)$ be an operator on a Hilbert space H. Let $\{x_\gamma\}_{\gamma\in\Gamma}$ and $\{y_\gamma\}_{\gamma\in\Gamma}$ be orthonormal bases for H. Then,
\[ \sum_{\gamma\in\Gamma} \|Tx_\gamma\|^2 = \sum_{\beta\in\Gamma} \|T^*y_\beta\|^2 = \sum_{\alpha\in\Gamma}\sum_{\beta\in\Gamma} |(Tx_\alpha, y_\beta)|^2. \]


Whenever any one of them is summable, so are the others, and their sum is the same, independent of $\{x_\gamma\}_{\gamma\in\Gamma}$ and $\{y_\gamma\}_{\gamma\in\Gamma}$.

Proof By using Parseval's equality [Theorem 2.9.16], $\|Tx_\alpha\|^2 = \sum_{\beta\in\Gamma} |(Tx_\alpha, y_\beta)|^2$. Thus,
\[ \sum_{\alpha\in\Gamma} \|Tx_\alpha\|^2 = \sum_{\alpha\in\Gamma}\sum_{\beta\in\Gamma} |(Tx_\alpha, y_\beta)|^2 = \sum_{\beta\in\Gamma}\sum_{\alpha\in\Gamma} |(Tx_\alpha, y_\beta)|^2 = \sum_{\beta\in\Gamma}\sum_{\alpha\in\Gamma} |(x_\alpha, T^*y_\beta)|^2 = \sum_{\beta\in\Gamma} \|T^*y_\beta\|^2, \]
the interchange in the order of summation being permissible because the terms are nonnegative.
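In matrix terms, Lemma 4.6.2 says that $\sum_\gamma \|Tx_\gamma\|^2$ is the squared Frobenius norm of T, the same for every orthonormal basis and for T*. A numerical sketch (replacing the standard basis by the columns of a random orthogonal matrix):

```python
import numpy as np

rng = np.random.default_rng(5)
T = rng.standard_normal((8, 8))

# An orthonormal basis: the columns of a random orthogonal matrix Q
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))

hs_standard = np.sum(np.linalg.norm(T, axis=0) ** 2)      # sum of ||T e_k||^2
hs_rotated = np.sum(np.linalg.norm(T @ Q, axis=0) ** 2)   # sum of ||T q_k||^2
hs_adjoint = np.sum(np.linalg.norm(T.T, axis=0) ** 2)     # sum of ||T* e_k||^2
```

All three sums coincide, illustrating the basis independence asserted by the lemma.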
