
Harkrishan Lal Vasudeva

Elements of Hilbert Spaces


and Operator Theory
With contributions from Satish Shirali

Harkrishan Lal Vasudeva
Indian Institute of Science Education
and Research
Mohali, Punjab
India

ISBN 978-981-10-3019-2 ISBN 978-981-10-3020-8 (eBook)


DOI 10.1007/978-981-10-3020-8
Library of Congress Control Number: 2016957499

© Springer Nature Singapore Pte Ltd. 2017


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature


The registered company is Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
To
Siddhant, Ashira and Shrayus
Preface

Algebraic and topological structures compatibly placed on the same underlying set
lead to the notions of topological semigroups, groups and vector spaces, among
others. It is then natural to consider concepts such as continuous homomorphisms
and continuous linear transformations between above-said objects. By an ‘operator’,
we mean a continuous linear transformation of a normed linear space into itself.
Functional analysis was developed around the turn of the last century by the
pioneering work of Banach, Hilbert, von Neumann, Riesz and others. Within a few
years, after an amazing burst of activity, it was well developed as a major branch of
mathematics. It is a unifying framework for many diverse areas such as Fourier
series, differential and integral equations, analytic function theory and analytic
number theory. The subject continues to grow and attracts the attention of some
of the finest mathematicians of the era.
A generalisation of the methods of vector algebra and calculus manifests itself in
the mathematical concept of a Hilbert space, named after the celebrated mathe-
matician Hilbert. It extends these methods from two-dimensional and
three-dimensional Euclidean spaces to spaces with any finite or infinite dimension.
These are inner product spaces, which allow the measurement of angles and
lengths; once completed, they possess enough limits in the space so that the
techniques of analysis can be used. Their diverse applications attract the attention of
physicists, chemists and engineers alike in good measure.
Chapter 1 establishes notations used in the text and collects results from vector
spaces, metric spaces, Lebesgue integration and real analysis. No attempt has been
made to prove the results included under the above topics. It is assumed that the
reader is familiar with them. Appropriate references have, however, been provided.
Chapter 2 includes in some detail the study of inner product spaces and their
completions. The space L2(X, M, μ), where X, M and μ denote, respectively, a
nonempty set, a σ-algebra of subsets of it and an extended nonnegative real-valued
measure, has been studied. The theorem of central importance in the analysis, due to
Riesz and Fischer, namely that L2(X, M, μ) is a complete metric space, has been
proved. So has the result that the space A(Ω) of holomorphic functions
defined on a bounded domain Ω is complete. To make the book useful to


probabilists, statisticians, physicists, chemists and engineers, we have included


many applied topics: Legendre, Hermite, Laguerre polynomials, Rademacher
functions, Fourier series and Plancherel’s theorem. Such applications of the abstract
theory are also of significance for the pure mathematician who wants to know the
origin of the subject. This chapter also contains the study of linear functionals on
Hilbert spaces; more specifically, the Riesz Representation Theorem, the fact that the dual of a
Hilbert space is itself a Hilbert space, and the fact that these spaces constitute
important examples of reflexive normed linear spaces. Applications of Hilbert space
theory to different branches of mathematics, such as approximation theory (Müntz’
Theorem), measure theory (Radon–Nikodým Theorem), Bergman kernel and
conformal mapping (analytic function theory), are included in Chap. 2.
A major portion of this book is devoted to the study of operators in Hilbert
spaces. It is carried out in Chaps. 3 and 4. The set of operators in a Hilbert space H,
equipped with the uniform norm, is denoted by B(H). Some well-known classes of
operators have been defined. Under compact operators, Fredholm theory has been
discussed. The Mean Ergodic Theorem has been proved as an application at the end
of Chap. 3. The spectrum of an operator is the key to understanding it.
Properties of the spectrum of different classes of operators, such as normal oper-
ators, self-adjoint operators, unitaries, isometries and compact operators, have been
discussed under appropriate headings. Here, the properties of the spectrum specific
to the class of operators under consideration are studied. A large number of
examples of operators, together with their spectra and the splitting into point
spectrum, continuous spectrum, residual spectrum, approximate point spectrum and
compression spectrum, have been painstakingly worked out. It is expected that the
treatment will aid the understanding of the reader. The treatment of polar decom-
position of an operator is different from the ones available in books. Numerical
range and numerical radius of an operator have been defined. The spectral radius
and the numerical radius of an operator have been compared. Professor Ajit Iqbal
Singh deserves special thanks for the help she rendered while this part was being
written. Spectral theorems, which reveal almost everything about the operators,
have been accorded special treatment in the text. After proving the spectral theorem
for compact normal operators, spectral theorems for self-adjoint operators and
normal operators have been proved. Here, we have been guided by the fundamental
principle of pedagogy that repetition helps in imbibing rather subtle techniques
needed for proving the spectral theorems. A bird’s eye view of invariant subspaces
with special attention to the Volterra operator is included. We close the chapter with
a brief introduction to unbounded operators.
Chapter 5 contains important theorems followed by applications from Banach
spaces.
The final chapter contains hints and solutions to the 166 problems listed under
various sections. These are over and above the numerous detailed examples scat-
tered all over the text.

Chandigarh, India Harkrishan Lal Vasudeva


Contents

1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Lebesgue Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4 Zorn’s Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.5 Absolute Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2 Inner Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.1 Definition and Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2 Norm of a Vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3 Inner Product Spaces as Metric Spaces . . . . . . . . . . . . . . . . . . . . . 34
2.4 The Space L2 (X, M, µ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.5 A Subspace of L2(X, M, µ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.6 The Hilbert Space A(Ω) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.7 Direct Sum of Hilbert Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.8 Orthogonal Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.9 Complete Orthonormal Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
2.10 Orthogonal Decomposition and Riesz Representation . . . . . . . . . . 102
2.11 Approximation in Hilbert Spaces . . . . . . . . . . . . . . . . . . . . . . . . . 123
2.12 Weak Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
2.13 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
3 Linear Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
3.1 Basic Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
3.2 Bounded and Continuous Linear Operators . . . . . . . . . . . . . . . . . 156
3.3 The Algebra of Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
3.4 Sesquilinear Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
3.5 The Adjoint Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
3.6 Some Special Classes of Operators . . . . . . . . . . . . . . . . . . . . . . . . 192
3.7 Normal, Unitary and Isometric Operators . . . . . . . . . . . . . . . . . . . 205


3.8 Orthogonal Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216


3.9 Polar Decomposition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
3.10 An Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
4 Spectral Theory and Special Classes of Operators . . . . . . . . . . . . . . . 233
4.1 Spectral Notions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
4.2 Resolvent Equation and Spectral Radius . . . . . . . . . . . . . . . . . . . . 238
4.3 Spectral Mapping Theorem for Polynomials . . . . . . . . . . . . . . . . . 242
4.4 Spectrum of Various Classes of Operators . . . . . . . . . . . . . . . . . . 248
4.5 Compact Linear Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
4.6 Hilbert–Schmidt Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
4.7 The Trace Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
4.8 Spectral Decomposition for Compact Normal Operators . . . . . . . . 294
4.9 Spectral Measure and Integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
4.10 Spectral Theorem for Self-adjoint Operators . . . . . . . . . . . . . . . . . 317
4.11 Spectral Mapping Theorem for Bounded Normal Operators . . . . 331
4.12 Spectral Theorem for Bounded Normal Operators . . . . . . . . . . . . 337
4.13 Invariant Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
4.14 Unbounded Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
5 Banach Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
5.1 Definition and Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
5.2 Finite-Dimensional Spaces and Riesz Lemma. . . . . . . . . . . . . . . . 384
5.3 Linear Functionals and Hahn–Banach Theorem . . . . . . . . . . . . . . 393
5.4 Baire Category Theorem and Uniform Boundedness Principle . . . . 401
5.5 Open Mapping and Closed Graph Theorems . . . . . . . . . . . . . . . . 409
6 Hints and Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
6.1 Problem Set 2.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
6.2 Problem Set 2.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 418
6.3 Problem Set 2.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
6.4 Problem Set 2.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
6.5 Problem Set 2.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428
6.6 Problem Set 2.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428
6.7 Problem Set 2.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
6.8 Problem Set 2.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437
6.9 Problem Set 2.10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442
6.10 Problem Set 2.11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448
6.11 Problem Set 2.12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
6.12 Problem Set 3.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454
6.13 Problem Set 3.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
6.14 Problem Set 3.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
6.15 Problem Set 3.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466
6.16 Problem Set 3.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466
6.17 Problem Set 3.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470

6.18 Problem Set 3.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475


6.19 Problem Set 3.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 478
6.20 Problem Set 4.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479
6.21 Problem Set 4.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485
6.22 Problem Set 4.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
6.23 Problem Set 4.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
6.24 Problem Set 4.6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504
6.25 Problem Set 4.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507
6.26 Problem Set 4.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511
6.27 Problem Set 4.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517
About the Author

Harkrishan Lal Vasudeva was a visiting professor of mathematics at Indian


Institute of Science Education and Research, Mohali, India from 2010 to 2016.
Earlier, he taught at Panjab University, Chandigarh, India, and held visiting
positions at the University of Sheffield, UK, and the University of Graz, Austria,
for research projects. He has numerous research articles to his credit in various
international journals and has co-authored several books, two of which have been
published by Springer.

Chapter 1
Preliminaries

1.1 Vector Spaces

The important underlying structure in every Hilbert space is a vector space (linear
space). The present section contains preparatory material on these spaces. The
reader who is already familiar with their basic theory can pass directly to Sect. 1.2,
for there is nothing in the present section which is particularly oriented to the study
of Hilbert spaces.
Definition 1.1.1 Let X be a nonempty set of elements x, y, z, … and F be a field of
scalars λ, μ, ν, …. To each pair of elements x and y of X, there corresponds a third
element x + y in X, the sum of x and y, and to each λ ∈ F and x ∈ X corresponds
the element λ·x, or simply λx, in X, called the scalar product of λ and x, such that the
operations of addition and multiplication satisfy the following rules:
(A1) x + y = y + x,
(A2) x + (y + z) = (x + y) + z,
(A3) there is a unique element 0 in X, called the zero element, such that x + 0 = x for
all x ∈ X,
(A4) for each x ∈ X, there is a unique element (−x) in X such that x + (−x) = 0,
(M1) λ(x + y) = λx + λy,
(M2) (λμ)x = λ(μx) and
(M3) 1x = x,
where 1 ∈ F is the identity in F, for all λ, μ ∈ F and x, y, z ∈ X.
Then, (X, +, ·) satisfying properties (A1)–(A4) and (M1)–(M3) is called a vector
space over F. The elements of X are called vectors or points, and those of F are
called scalars.
If F is the field of complex numbers C [resp. real numbers R], then (X, +, ·) is
called a complex [resp. real] vector space or a complex [resp. real] linear space.
In what follows, F will denote the field C of complex numbers or the field R of
real numbers.


Remarks 1.1.2
(i) It is more satisfying to apply the term vector space over F to the ordered
triple (X, +, ·), but if this sort of thing is done systematically in all mathe-
matics, the terminology will become extremely cumbersome. In order to
avoid this difficulty, we shall apply the term vector space over F to X, where
it is understood that X is equipped with the operations ‘+’ and ‘·’, the latter
being scalar multiplication of elements of X by those of F.
(ii) We shall mainly restrict our attention to the ‘complex’ vector spaces. The
strong motivational factor for this choice is that the complex numbers con-
stitute an algebraically closed field; that is, a polynomial of degree n has
precisely n roots (counting multiplicity) in the field of complex numbers,
whereas the field of real numbers does not have this property. This property
of complete factorisation of polynomials into linear factors is an appropriate
setting for a satisfactory treatment of the theory of operators in a Hilbert
space. It is also useful in dealing with the spaces of functions.
(iii) The additive identity element of the field will be denoted by 0 and so shall
be the identity element of vector addition. It is unlikely that any confusion
will result from this practice.
(iv) The following immediate consequences of the axioms of a vector space are
easy to prove:
(a) The vector equation x + y = z, where y and z are given vectors in X, has
one and only one solution;
(b) If x + z = z, then x = 0;
(c) λ0 = 0 for every scalar λ;
(d) 0x = 0 for every x ∈ X;
(e) If λx = 0, then either λ = 0 or x = 0.
For given vectors x and y in X, the vector x + (−y) is called the difference
of x and y and is denoted by x − y.
(f) (−λ)x = λ(−x) = −(λx);
(g) λ(x − y) = λx − λy;
(h) (λ − μ)x = λx − μx.
(v) It is easy to check that Y ⊆ X is a vector space over F if, and only if,
x, y ∈ Y, λ, μ ∈ F imply λx + μy ∈ Y.
Examples abound. We shall give at this point a few elementary ones: the real
field or the complex field with usual operations is a real or complex vector space
(scalar multiplication coinciding with the usual binary operation of multiplication).
The complex field may also be considered as a real vector space. The set of all
n-tuples x = (x1, …, xn), xi ∈ F, i = 1, 2, …, n, is a vector space Rn or Cn, where
F = R or C. The set of all real or complex functions defined on some fixed set is a
vector space, the operations being the usual ones. The vector space consisting of the
zero vector only is called the trivial vector space.

The Cartesian product X × Y of vector spaces X and Y over the same field can
be made into a vector space over that field in an obvious way.
Definition 1.1.3 A sequence of vectors x1, x2, …, xn is said to be linearly inde-
pendent if the relation

λ1x1 + λ2x2 + ⋯ + λnxn = 0        (1.1)

holds only in the trivial case when λ1 = λ2 = ⋯ = λn = 0; otherwise, the sequence
x1, x2, …, xn is said to be linearly dependent.
The left member of (1.1) is said to be a linear combination of the finite sequence
x1, x2, …, xn. Thus, linear independence of the vectors x1, x2, …, xn means that
every nontrivial linear combination of these vectors is different from zero. If one of
the vectors is equal to zero, then these vectors are evidently linearly dependent. In
fact, if for some i, xi = 0, then we obtain a nontrivial relation on taking λi = 1 and
λj = 0, 1 ≤ j ≤ n, j ≠ i. A repetition of a vector in a sequence renders it linearly
dependent. An arbitrary nonempty collection of vectors is said to be linearly
independent if every finite sequence of distinct terms belonging to the collection is
linearly independent.
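As an illustration not drawn from the text, linear independence of finitely many vectors in Rn or Cn can be tested numerically: the vectors are independent exactly when the matrix having them as columns has rank equal to the number of vectors. A minimal numpy-based sketch (the function name is ours):

```python
import numpy as np

def linearly_independent(vectors):
    """Return True if the given vectors in R^n (or C^n) are linearly
    independent: the matrix with the vectors as columns must have rank
    equal to the number of vectors."""
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == len(vectors)

# e1, e2 are independent; adjoining e1 + e2 creates a dependence relation.
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(linearly_independent([e1, e2]))           # True
print(linearly_independent([e1, e2, e1 + e2]))  # False
```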
Definition 1.1.4 A basis or a Hamel basis for a vector space X is a collection B of
linearly independent vectors with the property that any vector x ∈ X can be
expressed as a linear combination of some subset of B.
Remarks 1.1.5
(i) Observe that a linear combination of vectors in a collection is always a finite
sum even though the collection may contain an infinite number of vectors. In
fact, infinite sums do not make any sense until the notion of ‘limit’ of a
sequence of vectors has been defined in X.
(ii) The space X is said to be finite-dimensional, more precisely, n-dimensional
if B contains precisely n linearly independent vectors. In this case, any
(n + 1) elements of X are linearly dependent. If B contains arbitrarily many
linearly independent vectors, then X is said to be infinite-dimensional. The
trivial vector space has dimension zero.
(iii) Permuting the vectors in a sequence does not alter its linear independence.
(iv) If x and y are linearly dependent and both are nonzero, then each is a nonzero
scalar multiple of the other.

Definition 1.1.6 A nonempty subset Y of a vector space X that is also a vector


space with respect to the same operations of vector addition and scalar multipli-
cation as in X is called a vector subspace (or a linear subspace). In other words, if
x, y ∈ Y, λ, μ ∈ F imply λx + μy ∈ Y, then Y is a vector subspace (or a linear
subspace) of X.
One of the common methods of constructing a linear subspace Y is to consider
the set of all finite linear combinations

λ1x1 + λ2x2 + ⋯ + λnxn

of elements x1, x2, …, xn of M, where M is a nonempty finite or infinite set of


elements of X. This set Y is the smallest subspace of X that contains M. It is called
the linear span of M or the linear subspace [or manifold] spanned by M, and we
write Y = [M].
Definition 1.1.7 Given two vector spaces X and Y (over the same field), we can
form a new vector space V as follows: define vector operations on the Cartesian
product of X and Y, the set of all ordered pairs ⟨x, y⟩, where x ∈ X and y ∈ Y. We
define

λ1⟨x1, y1⟩ + λ2⟨x2, y2⟩ = ⟨λ1x1 + λ2x2, λ1y1 + λ2y2⟩.

The vector space V so formed is called the external direct sum of X and Y; we
denote it by X ⊕ Y. The vector ⟨x, 0⟩ in V, if identified with the vector x ∈ X,
permits one to think of X as a subspace of V. Similarly, Y can be viewed as a
subspace of V. The mapping ⟨x, y⟩ → ⟨x, 0⟩ [resp. ⟨0, y⟩] is called the projection of
X ⊕ Y onto X [resp. Y].
Let Y1, Y2, …, Yn be subspaces of X. By Y1 + Y2 + ⋯ + Yn, we shall mean all
sums x1 + x2 + ⋯ + xn, where xj ∈ Yj, j = 1, 2, …, n. The spaces Y1, Y2, …, Yn are
said to be linearly independent if for any i = 1, 2, …, n,

Yi ∩ (Y1 + Y2 + ⋯ + Yi−1 + Yi+1 + ⋯ + Yn) = {0}.

If Y1, Y2, …, Yn are linearly independent and X = Y1 + Y2 + ⋯ + Yn, the spaces
{Yi : i = 1, 2, …, n} are said to form a direct sum decomposition of X, and we write

X = Y1 ⊕ Y2 ⊕ ⋯ ⊕ Yn.

In case {Yi : i = 1, 2, …, n} constitute a direct sum decomposition of X,
each element x ∈ X can be uniquely written in the form y1 + y2 + ⋯ + yn, where
yj ∈ Yj, j = 1, 2, …, n.
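The uniqueness of the summands can be seen concretely in R2. The sketch below (our illustration, not the author's; the choice of subspaces is an assumption for the example) takes Y1 = span{(1, 0)} and Y2 = span{(1, 1)} and recovers the unique decomposition of a vector by solving a linear system:

```python
import numpy as np

# Direct sum decomposition R^2 = Y1 (+) Y2 with Y1 = span{(1, 0)} and
# Y2 = span{(1, 1)}.  Writing x = y1 + y2 with y1 in Y1, y2 in Y2 means
# solving a*(1, 0) + b*(1, 1) = x for the coefficients a, b.
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])      # columns span Y1 and Y2 respectively

x = np.array([3.0, 2.0])
a, b = np.linalg.solve(B, x)    # unique, since the columns are independent
y1, y2 = a * B[:, 0], b * B[:, 1]

assert np.allclose(y1 + y2, x)
print(y1, y2)                   # [1. 0.] [2. 2.]
```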
Let Y be a subspace of a vector space X (Y ⊆ X). Let x + Y = {x + y : y ∈ Y} for
all x ∈ X, and let X/Y = {x + Y : x ∈ X}. The sets x + Y are called the cosets of Y in
X. We observe that 0 + Y = Y. Obviously, x1 + Y = x2 + Y if, and only if,
x1 − x2 ∈ Y, and consequently, for each pair x1, x2, either x1 + Y = x2 + Y or
(x1 + Y) ∩ (x2 + Y) = ∅. If x1 + Y = x1′ + Y and x2 + Y = x2′ + Y, then
(x1 + x2) + Y = (x1′ + x2′) + Y and αx1 + Y = αx1′ + Y. The vector space X/Y with
addition and scalar multiplication defined as

(x1 + Y) + (x2 + Y) = (x1 + x2) + Y and α(x + Y) = αx + Y

for all x1, x2 ∈ X and α ∈ C (or R) is called the quotient space of X modulo Y.
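A small numerical sketch (ours, not from the text; the subspace Y is an assumption chosen for illustration) makes the coset relation tangible: with Y = span{(1, 0)} ⊆ R2, two vectors represent the same coset exactly when their difference lies in Y, i.e. when their second coordinates agree.

```python
import numpy as np

# Quotient of R^2 by the subspace Y = span{(1, 0)}.

def in_Y(v):
    """Membership test for Y = {(t, 0) : t real}."""
    return abs(v[1]) < 1e-12

def same_coset(x1, x2):
    """x1 + Y = x2 + Y holds iff x1 - x2 lies in Y."""
    return in_Y(np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float))

# (3, 5) and (-1, 5) differ by (4, 0), which is in Y: same coset.
print(same_coset((3, 5), (-1, 5)))  # True
print(same_coset((3, 5), (3, 4)))   # False

# Coset addition is well defined: adding (0, 2) + Y to either
# representative of (3, 5) + Y lands in the same coset.
print(same_coset(np.add((3, 5), (0, 2)), np.add((-1, 5), (4, 2))))  # True
```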

Definition 1.1.8 Two vector spaces H and K are said to be isomorphic if there
exists a bijective linear map between H and K, i.e. if there exists a bijective linear
mapping A : H → K such that

A(α1x1 + α2x2) = α1A(x1) + α2A(x2),

for all x1, x2 ∈ H and scalars α1 and α2.

1.2 Metric Spaces

A vector space is a purely algebraic object, and if the processes of analysis are to be
meaningful in it, a measure of distance between any two of its vectors (or elements)
must be defined. Many of the familiar analytical concepts such as convergence in
R3 with the usual distance can be fruitfully generalised to inner product spaces (to
be studied in Chap. 2).
Intuitively, one expects a distance to be a nonnegative real number, symmetric
and to satisfy the triangle inequality. These considerations motivate the following
definitions.
Definition 1.2.1 A nonempty set X, whose elements we call ‘points’, is said to be a
metric space if with any two points x, y of X there is associated a real number d(x, y),
called the distance from x to y, such that
(i) d(x, y) ≥ 0, and d(x, y) = 0 if, and only if, x = y,
(ii) d(x, y) = d(y, x) and
(iii) d(x, z) ≤ d(x, y) + d(y, z) for any x, y, z ∈ X [triangle inequality].
The function d : X × X → R+, where R+ denotes the nonnegative reals, with
these properties is called a distance function or a metric on X.
It should be emphasised that a metric space is not the set of its points; it is, in
fact, the pair (X, d) consisting of the set of its points together with the metric d.
(R, d) [resp. (C, d)], where d(x, y) = |x − y|, x, y ∈ R [resp. C], are examples of
metric spaces.
Any nonempty subset of a metric space is itself a metric space if we restrict the
metric to it and is called a subspace.
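The defining properties of a metric can be spot-checked numerically. The sketch below (an illustration of ours, not part of the text) samples points of R with the usual metric d(x, y) = |x − y| and verifies axioms (i)–(iii) on every triple drawn from the sample; the small tolerance guards against floating-point rounding in the triangle inequality.

```python
import itertools
import random

def d(x, y):
    """The usual metric on the reals."""
    return abs(x - y)

random.seed(0)
points = [random.uniform(-10, 10) for _ in range(20)]

for x, y, z in itertools.product(points, repeat=3):
    assert d(x, y) >= 0                              # nonnegativity
    assert d(x, y) == d(y, x)                        # symmetry
    assert d(x, z) <= d(x, y) + d(y, z) + 1e-12      # triangle inequality
print("all three metric axioms hold on the sample")
```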
Certain standard notions from the topology of real numbers have natural gen-
eralisations to metric spaces.
Definition 1.2.2 By a sequence {xn}n≥1 in a metric space X is meant a mapping of
N, the set of natural numbers, into X. A sequence {xn}n≥1 in a metric space is said
to converge to the point x ∈ X if limn d(xn, x) = 0, and we write limn xn = x. This
means: given any number ε > 0, there is an integer n0 such that d(xn, x) < ε
whenever n ≥ n0. It is easy to see that if limn xn = x and limn xn = y, then x = y. In
fact,

0 ≤ d(x, y) ≤ d(x, xn) + d(xn, y).

The element x is called the limit of the sequence {xn}n≥1.
A sequence {xn}n≥1 in X is said to be Cauchy if for every ε > 0, there is an
integer n0 such that d(xn, xm) < ε whenever n, m ≥ n0, and we write d(xn, xm) → 0
as n, m → ∞.
Note that every convergent sequence is Cauchy. In fact, if limn xn = x,

d(xn, xm) ≤ d(xn, x) + d(x, xm) → 0 as n, m → ∞.

The converse is, however, not true; that is, not every Cauchy sequence is convergent.
In fact, the sequence xn = 1/n, n = 1, 2, …, in the open interval (0, 1) with the metric
d(x, y) = |x − y|, x, y ∈ (0, 1), is Cauchy, but the only possible limit, namely 0, does
not belong to the interval.
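The counterexample above can be illustrated numerically (our sketch, not the author's): the tail differences |1/n − 1/m| become arbitrarily small, yet the candidate limit 0 is not a point of (0, 1).

```python
# The sequence x_n = 1/n in the space (0, 1) with d(x, y) = |x - y|.

def x(n):
    return 1.0 / n

# For n, m >= n0 one has |x_n - x_m| <= 1/n0, so the tail can be forced
# below any eps > 0: the sequence is Cauchy.
eps = 1e-3
n0 = int(1 / eps) + 1
assert all(abs(x(n) - x(m)) < eps
           for n in range(n0, n0 + 50)
           for m in range(n0, n0 + 50))

# But x_n approaches 0, and 0 does not belong to the open interval (0, 1),
# so the sequence has no limit inside the space.
candidate_limit = 0.0
assert not (0 < candidate_limit < 1)
print("Cauchy in (0, 1), but the only possible limit 0 escapes the space")
```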
In the metric space (R, d) [resp. (C, d)], where d(x, y) = |x − y|, x, y ∈ R
[resp. C], a sequence {xn}n≥1 is convergent if, and only if, it is Cauchy (this is the
well-known Cauchy criterion of convergence).
An important class of metric spaces, in which the analogue of the Cauchy cri-
terion holds, is that of the ‘complete’ metric spaces. More precisely, we have the
following definition.
Definition 1.2.3 A metric space (X, d) is said to be complete in case every Cauchy
sequence in the space converges. Otherwise, (X, d) is said to be incomplete.
Proposition 1.2.4 Let (X, d) be a metric space. Then
(a) |d(x, y) − d(z, y)| ≤ d(x, z) for all x, y, z ∈ X;
(b) If limn d(xn, x) = 0 and limn d(yn, y) = 0, then limn d(xn, yn) = d(x, y);
(c) If {xn}n≥1 and {yn}n≥1 are Cauchy sequences in (X, d), then {d(xn, yn)}n≥1 is a
convergent sequence of real numbers.

Proof
(a) By the triangle inequality,

d(x, y) ≤ d(x, z) + d(z, y).

Transposing d(z, y), we get

d(x, y) − d(z, y) ≤ d(x, z).        (1.2)

Interchanging the roles of x and z, we get

d(z, y) − d(x, y) ≤ d(z, x),

that is,

−(d(x, y) − d(z, y)) ≤ d(x, z).        (1.3)

Combining (1.2) and (1.3), the desired inequality follows.


(b) Using the triangle inequality for real numbers and (a), we have

|d(x, y) − d(xn, yn)| ≤ |d(x, y) − d(xn, y)| + |d(xn, y) − d(xn, yn)|
                    ≤ d(x, xn) + d(y, yn).        (1.4)

Since limnd(xn, x) = 0 = limnd(yn, y), it follows that limnd(xn, yn) = d(x, y).
(c) Again,

|d(xn, yn) − d(xm, ym)| ≤ |d(xn, yn) − d(xm, yn)| + |d(xm, yn) − d(xm, ym)|
                      ≤ d(xn, xm) + d(yn, ym)        (1.5)

and the right-hand side of (1.5) tends to zero as m, n → ∞ because {xn}n≥1
and {yn}n≥1 are Cauchy sequences. Thus, the sequence {d(xn, yn)}n≥1 is
Cauchy, and since the real numbers are complete, it converges. ∎
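Part (a), often called the reverse triangle inequality, lends itself to a quick numerical spot check. The sketch below (ours, not part of the text) verifies it for the usual metric on R over all triples from a small sample, with a tolerance for floating-point rounding.

```python
import itertools

def d(x, y):
    """The usual metric on the reals."""
    return abs(x - y)

pts = [-2.5, -1.0, 0.0, 0.5, 3.0, 7.25]

# Proposition 1.2.4(a): |d(x, y) - d(z, y)| <= d(x, z) for all x, y, z.
for x, y, z in itertools.product(pts, repeat=3):
    assert abs(d(x, y) - d(z, y)) <= d(x, z) + 1e-12
print("part (a) verified on the sample")
```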

Definition 1.2.5 Let x0 be a fixed point of the metric space X and r > 0 be a fixed
real number. Then, the set of all points x in X such that d(x, x0) < r is called the
open ball with centre x0 and radius r. We denote it by S(x0, r). Thus

S(x0, r) = {x ∈ X : d(x, x0) < r}.        (1.6)

We speak of a closed ball if the inequality in (1.6) is replaced by d(x, x0) ≤ r, and
we denote the set by S̄(x0, r). Thus

S̄(x0, r) = {x ∈ X : d(x, x0) ≤ r}.        (1.7)


A set O in a metric space is said to be open if it contains an open ball about each
of its points. In other words, for every x ∈ O, there exists an r > 0 such that all
y with d(y, x) < r belong to O. A set F in a metric space X is closed if its com-
plement X\F (or Fc) is open in X. An open ball is an open set in X, and a closed ball
is a closed set in X.
An open ball is easily seen to be an open set. Indeed, if y ∈ S(x0, r), then the
open ball about y with radius r − d(y, x0) is a subset of S(x0, r), because any z in the
latter ball satisfies d(z, y) < r − d(y, x0) and therefore also satisfies d(z, x0) ≤
d(z, y) + d(y, x0) < (r − d(y, x0)) + d(y, x0) = r.
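The argument just given can be traced numerically on the real line (an illustration of ours; the particular ball and points are assumptions chosen for the example): for a point y of S(x0, r), the ball about y of radius r − d(y, x0) stays inside S(x0, r).

```python
# Open balls of the real line with the usual metric d(x, y) = |x - y|.

def d(x, y):
    return abs(x - y)

x0, r = 0.0, 1.0            # the ball S(x0, r) = (-1, 1)
y = 0.6                     # a point of S(x0, r)
s = r - d(y, x0)            # radius of the smaller ball about y
assert s > 0

# Any z with d(z, y) < s satisfies d(z, x0) <= d(z, y) + d(y, x0) < r,
# so the smaller ball about y lies inside S(x0, r).
for z in [y - 0.39, y, y + 0.39]:
    assert d(z, y) < s and d(z, x0) < r
print("the ball about y of radius r - d(y, x0) lies inside S(x0, r)")
```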
It is immediate from the definition of an open set that it is a union of open balls
with centres in the set. Conversely, any union of open balls is an open set, because
any union of open sets is clearly an open set and open balls are open sets.
Let (X, d) be a metric space and ∅ ≠ A ⊆ X. Then, d can be restricted to A in
the obvious sense, and it is trivial to check that the restriction dA provides a metric
on A. It is called the metric induced on A by d, or simply induced metric for short.

An open ball in A with reference to the induced metric is easily seen to be the
intersection with A of an open ball in X [caution: the converse is false]. Together
with the fact that intersection is distributive over union and the observation in the
preceding paragraph, this implies that every open subset of A is the intersection with
A of an open set in X. On the other hand, the intersection with A of an open ball in
X having its centre in A is an open ball in A. This implies that the intersection with
A of an open set in X is an open set in A. In summary, a subset of A is open with
reference to the induced metric if, and only if, it is the intersection with A of an open
set in X.
If (A, dA) is complete, then A is a closed subset of X.
Definition 1.2.6 A neighbourhood of a point x0 2 X is any open ball in (X, d) with
centre x0.
We say that x0 is an interior point of a set A if A contains a neighbourhood of x0.
The interior of a set A, denoted by Aº, consists of all interior points of A and can be
easily seen to be the largest open set contained in A.
Definition 1.2.7 A point x0 ∈ X is called a limit point of the set A ⊆ X if every open
ball with centre x0 contains a point of A different from x0.
It may be easily seen that x0 is a limit point of A if, and only if, every open ball
with centre x0 contains an infinite sequence of distinct points of A which converges
to x0.
The closure of a subset A ⊆ X, denoted by Ā, is the union of A and the set of all
its limit points. Ā is the smallest closed set containing A, and A is closed if, and only
if, A = Ā.
The closure of the open unit ball in ℂ, {z : |z| < 1}, is the closed unit ball
{z : |z| ≤ 1}.
Definition 1.2.8 A mapping f from a metric space (X, d) to a metric space (X′, d′) is
said to be continuous at x0 ∈ X if, for every ε > 0, there exists a δ > 0 such that
d′(f(x), f(x0)) < ε whenever d(x, x0) < δ. The function f is continuous on X if it is
continuous at each point of X.
The mapping f is said to be uniformly continuous on X if, for every ε > 0, there
exists a δ > 0 such that d′(f(x), f(y)) < ε whenever d(x, y) < δ.
The function f is continuous on X if, and only if, the inverse image
f^{-1}(V) = {x ∈ X : f(x) ∈ V} is open [resp. closed] whenever V is open [resp.
closed] in X′.
Definition 1.2.9 Let f be a real-valued function defined on a metric space (X, d).
The function is said to be lower semi-continuous at x0 ∈ X if for each ε > 0, there
exists a δ > 0 such that

f(x) > f(x0) − ε

for all x satisfying the inequality d(x, x0) < δ.
Upper semi-continuity is defined by replacing the inequality displayed above
by f(x) < f(x0) + ε.
The function f is continuous at x0 ∈ X if, and only if, it is both upper
semi-continuous and lower semi-continuous there.
Definition 1.2.10 A metric space X is said to be separable if in the space X there
exists a sequence

{x_1, x_2, …, x_n, …}    (1.8)

such that for every x ∈ X and every ε > 0 there is an element x_{n0} of (1.8) with
d(x, x_{n0}) < ε.
A subset A ⊆ X, where X is a metric space, is said to be dense if Ā = X. In view
of this terminology, the definition of separability may be rephrased as follows: X is
separable if X contains a countable dense set.
Definition 1.2.11 A subset K ⊆ X, where X is a metric space, is said to be
bounded if there exists an M ≥ 0 such that d(x, y) ≤ M whenever x and y are
points in K.
The following is an immediate consequence of the definition.
Proposition 1.2.12 Let x0 be a fixed point of the metric space X and K ⊆ X. Then
K is bounded if, and only if, the numbers d(x, x0) are bounded as x varies over K.
Proof Suppose d(x, x0) ≤ M for all x ∈ K; if x, y ∈ K, then

d(x, y) ≤ d(x, x0) + d(x0, y) ≤ 2M.

Thus K is bounded.
Conversely, suppose that K is bounded, say d(x, y) ≤ M for all x, y ∈ K. Fix any
point y0 ∈ K. Then

d(x, x0) ≤ d(x, y0) + d(y0, x0) ≤ M + d(y0, x0)

for all x ∈ K. □
We review briefly the basic facts about the completion of a metric space. For
details, the reader may refer to 1–5 of [30].
Definition 1.2.13 Let (X, d) be an arbitrary metric space. A complete metric space
(X*, d*) is said to be a completion of the metric space (X, d) if
(i) X is a subspace of X* and
(ii) Every point of X* is the limit of some sequence in X (i.e. X is dense in X*).
For example, the space of real numbers is a completion of the space of rational
numbers. It will follow from Theorem 1.2.15 that the real numbers form the
only completion of the space of rational numbers, up to isometry.
Definition 1.2.14 Let (X, d) and (X′, d′) be two metric spaces. A mapping T from
X to X′ is an isometry if

d′(T(x), T(y)) = d(x, y)

for all x, y ∈ X. The mapping T is also called an isometric imbedding of X into X′.
If the mapping is, in addition, onto, then the spaces X and X′ are said to be
isometric.
It may be noted that an isometry is always one-to-one.
Theorem 1.2.15 Every metric space has a completion and any two completions
are isometric to each other. Moreover, there is a unique isometry between them that
reduces to the identity when restricted to the given metric space [30].
Let (X, d) be a metric space and Y ⊆ X. A collection G of open sets in X is called
an open cover of Y if for each y ∈ Y, there is a G ∈ G such that y ∈ G. A finite
subcollection of G which is itself a cover is called a finite subcover of Y.
Definition 1.2.16 A metric space (X, d) is said to be compact if every open cover
contains a finite subcover. A subset K of X is said to be a compact subset if the
metric space formed by K with the restriction of d to it is compact. A subset of X is
said to be precompact (or relatively compact) if its closure in X is compact.
A compact subset is always closed; since it then equals its own closure, it is also precompact.
A closed subset of a compact metric space is compact. Also, a finite union of
compact subsets is compact.
A subset of Rn or Cn is compact if, and only if, it is closed as well as bounded.
The sequence criterion for compactness is: (X, d) is compact if, and only if, every
sequence in X has a convergent subsequence.
Every compact metric space is bounded but not conversely.
A continuous image of a compact metric space is compact.
Definition 1.2.16A Given a positive ε, an ε-net for a subset K of a metric space is a
subset Y of the metric space such that, for every x ∈ K, there exists y ∈ Y such that
d(x, y) < ε. A subset K is said to be totally bounded if for every positive ε, there
exists a finite ε-net for K.
A precompact subset of a metric space is totally bounded; in a complete metric
space the converse holds as well. Thus, a subset of a complete metric space is
compact if, and only if, it is closed and totally bounded; equivalently, a closed
subset of a complete metric space is compact if, and only if, it is totally bounded.
If A is a nonempty subset of a metric space X with metric d and x ∈ X, then the
nonnegative number d(x, A) = inf{d(x, a) : a ∈ A} is called the distance from x to
A. Clearly, d(x, A) = 0 if, and only if, x ∈ Ā. The function φ defined on X by
φ(x) = d(x, A) is continuous. In particular, for any a > 0, the set {x ∈ X :
d(x, A) ≥ a} is closed. Moreover, φ vanishes at all points of Ā and nowhere else.
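The continuity of φ(x) = d(x, A) is in fact quantitative: φ is 1-Lipschitz, i.e. |d(x, A) − d(y, A)| ≤ d(x, y). The sketch below is our own numerical check of this, for a hypothetical finite A ⊂ ℝ² (where the infimum is a minimum).

```python
import math
import random

A = [(0.0, 0.0), (2.0, 1.0), (-1.0, 3.0)]   # a finite nonempty subset of R^2

def d(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def dist_to_A(x):
    # d(x, A) = inf{d(x, a) : a in A}; for finite A the inf is a min
    return min(d(x, a) for a in A)

random.seed(2)
lipschitz_ok = True
for _ in range(1000):
    x = (random.uniform(-5, 5), random.uniform(-5, 5))
    y = (random.uniform(-5, 5), random.uniform(-5, 5))
    # |d(x, A) - d(y, A)| <= d(x, y): phi is 1-Lipschitz, a quantitative
    # form of the continuity asserted in the text
    lipschitz_ok = lipschitz_ok and abs(dist_to_A(x) - dist_to_A(y)) <= d(x, y) + 1e-12
```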
Theorem 1.2.17 Given disjoint closed subsets A and B of a compact metric space
X, there always exists a continuous function f : X → [0, 1] such that f(a) = 0 for
every a ∈ A and f(b) = 1 for every b ∈ B [29, Theorem 3.4.4 on p. 116].
Proposition 1.2.18 Given a closed subset A of a metric space, its complement is
the union of a sequence of closed subsets.
Proof For each natural number n, take K_n to be {x ∈ X : d(x, A) ≥ 1/n}. Then, K_n is
closed; it is also disjoint from A. But the union of all the sets K_n is
{x ∈ X : d(x, A) > 0}, which is precisely the complement of A, since A = Ā. □
Let (X_n, d_n), n = 1, 2, …, be metric spaces with diam(X_n) = sup{d_n(x, y) : x,
y ∈ X_n} ≤ 1 for each n. For x, y ∈ ∏_{n=1}^∞ X_n, define

d(x, y) = Σ_{n=1}^∞ 2^{-n} d_n(x_n, y_n),

where x = {x_n}_{n≥1} and y = {y_n}_{n≥1}. Observe that the series on the right
converges because 2^{-n} d_n(x_n, y_n) ≤ 2^{-n}.
Then, d turns out to be a metric on X = ∏_{n=1}^∞ X_n and (X, d) is called a product
metric space. Also, convergence in this metric turns out to be the same as
coordinatewise convergence.
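The weighted series above is easy to evaluate by hand in examples. The following sketch is ours (the factor metrics and sample sequences are hypothetical): each factor carries the truncated metric d_n(x, y) = min(|x − y|, 1), so the diameters are at most 1 as required, and sequences that agree in every later coordinate make the series a finite sum.

```python
def d_n(x, y):
    # truncated metric on each factor: diameter at most 1
    return min(abs(x - y), 1.0)

def product_metric(x, y):
    # x and y are finite tuples standing in for sequences that agree in
    # every later coordinate, so the series reduces to a finite sum
    return sum(2.0 ** (-(n + 1)) * d_n(xn, yn)
               for n, (xn, yn) in enumerate(zip(x, y)))

x = (0.0, 5.0, 1.0)
y = (0.3, 2.0, 1.0)
dxy = product_metric(x, y)     # 2^-1 * 0.3 + 2^-2 * 1 + 2^-3 * 0
```

Note that d(x, y) ≤ Σ 2^{-n} = 1 always, reflecting the diameter bound on the factors.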
Theorem 1.2.19 (Tychonoff) (X, d) is compact if, and only if, each (Xn, dn) is
compact [30].
Another important theorem in metric spaces is the theorem of Ascoli, also
known as the Arzelà–Ascoli Theorem.
As usual, C[0, 1] denotes the metric space of continuous functions defined on
[0, 1] with metric

d(f, g) = sup{|f(x) − g(x)| : x ∈ [0, 1]}.
Definition 1.2.20 A nonempty subset K of C[0, 1] is said to be equicontinuous if,
for every ε > 0, there exists a δ > 0 such that, for every f ∈ K,

|x − y| < δ implies |f(x) − f(y)| < ε.

Theorem 1.2.21 (Ascoli) Let K be a nonempty subset of C[0, 1]. Then the
following are equivalent:
(a) The closure of K is compact;
(b) K is uniformly bounded (i.e. there exists M > 0 such that |f(x)| ≤ M for every
x ∈ [0, 1] and every f ∈ K) and equicontinuous [30].
1.3 Lebesgue Integration

In this section, we shall review the theory of measure and integrable functions as
developed by H. Lebesgue in 1902. Although his integral is more complicated to
develop and define than Riemann's, as a tool it is easier to use and has better
properties. For example, problems which involve integration together with a
limiting process are often awkward with the Riemann integral but are easily
handled when Lebesgue integration is used.
Measure theory is based on the idea of generalising the length of an interval in
ℝ, the area of a rectangle in ℝ², etc. to the measure of a subset. The more sets that
are ‘measurable’, the more functions can be integrated. A well-behaved measure,
i.e. a measure with acceptable properties, is possible on a wide class of subsets.
We begin with the following definitions.
Definition 1.3.1 Let X be a set and M be a collection of subsets of X with the
following properties:
(i) X ∈ M,
(ii) If A ⊆ X and A ∈ M, then X\A ∈ M and
(iii) If A_n ⊆ X and A_n ∈ M, n = 1, 2, …, then ∪_{n=1}^∞ A_n ∈ M.
Such a system of sets is called a σ-algebra.
In case X is a metric space, there is a smallest σ-algebra containing all open
subsets of it; each member of this smallest σ-algebra is called a Borel set, and
the smallest σ-algebra itself is called the Borel field.
Definition 1.3.2 Let μ be an extended real-valued function defined on M such that
(i) μ(A) ≥ 0 for every A ∈ M and
(ii) A_n ∈ M, n = 1, 2, … and A_n ∩ A_m = ∅, n ≠ m implies

μ(∪_{n=1}^∞ A_n) = Σ_{n=1}^∞ μ(A_n).

Then μ is called a positive measure on M or on X.
A measure space is a triple (X, M, μ), where X is a nonempty set, M a
σ-algebra of subsets of X and μ a positive measure on X. A subset A in the measure
space (X, M, μ) is said to have σ-finite measure if A is a countable union of sets A_i,
i = 1, 2, …, with μ(A_i) < ∞, and we say that μ is σ-finite on A. The measure μ is
said to be σ-finite if it is σ-finite on X.
There exists a unique positive measure on the σ-algebra of Borel subsets of ℝⁿ
which agrees with the volume when restricted to products of intervals. It is called
the Borel measure in ℝⁿ. There exists a σ-algebra of subsets of ℝⁿ larger than the
Borel σ-algebra, the elements of which are called Lebesgue measurable subsets,
and on which is defined a positive measure which agrees with the Borel measure
when restricted to the Borel σ-algebra. It is called the Lebesgue measure in ℝⁿ and
has the following additional property of being complete: let E be a Lebesgue
measurable set of measure 0 and F be any subset of E. Then, F is also Lebesgue
measurable and hence has measure 0.
The Lebesgue measure on ℝⁿ is σ-finite. Let E be a measurable subset of ℝ and
μ be the Lebesgue measure on ℝ. Given ε > 0, there exist an open set O and a
closed set F in ℝ such that F ⊆ E ⊆ O, μ(O\E) < ε and μ(E\F) < ε. This property
is called the regularity of the Lebesgue measure μ.
Definition 1.3.3 Let M be a σ-algebra of subsets of X. An extended real-valued
function f defined on X is said to be measurable if f^{-1}(O) = {x ∈ X : f(x) ∈ O},
where O is an open subset of ℝ, is measurable and if the subsets f^{-1}(∞) and
f^{-1}(−∞) are measurable. A complex-valued function g + ih is measurable if, and
only if, g and h are both measurable.
It can be shown that if f = g + ih is measurable, then f^{-1}(O) = {x ∈ X : f(x) ∈ O},
where O is an open subset of ℂ, is measurable [26].
If f and g are extended real-valued measurable functions, then so are
f + g (provided it is defined), fg, af (a ∈ ℝ), |f|, max{f, g} and min{f, g}, where

(max{f, g})(x) = max{f(x), g(x)} for each x ∈ X.

If {f_n}_{n≥1} is a sequence of extended real-valued measurable functions defined
on X, then sup_n f_n, inf_n f_n, limsup_n f_n (= lim_n sup_{k≥n} f_k), liminf_n f_n and the
pointwise limit lim_n f_n, when it exists, are measurable.
For A ⊆ X, let χ_A denote the characteristic function of A, that is,

χ_A(x) = 1 if x ∈ A and χ_A(x) = 0 if x ∉ A.

It is measurable if, and only if, A is a measurable subset of X, i.e. A ∈ M. A simple
function is a real-valued function on X whose range is finite. If a_1, a_2, …, a_m are
the distinct values of such a function f, then

f = Σ_{j=1}^m a_j χ_{A_j}, where A_j = {x ∈ X : f(x) = a_j}, j = 1, 2, …, m.

Also, f is measurable if, and only if, A_1, …, A_m are measurable subsets of X.
Let f : X → [0, ∞] and for n = 1, 2, … consider the simple functions

s_n(x) = (j − 1)/2^n if (j − 1)/2^n ≤ f(x) < j/2^n, j = 1, 2, …, n2^n,
s_n(x) = n if f(x) ≥ n.

Then, 0 ≤ s_1(x) ≤ s_2(x) ≤ ⋯ ≤ f(x) and s_n(x) → f(x) for each x ∈ X. If f is
bounded, the sequence {s_n} converges to f uniformly on X. If f : X → [−∞, ∞],
then by considering f = f⁺ − f⁻, where f⁺ = max{f, 0} and f⁻ = −min{f, 0}, we see
that there exists a sequence of simple functions converging to f at every point of
X. Note that if f is measurable, then each of these simple functions is measurable.
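The construction of the s_n can be carried out directly. The sketch below is our own illustration (the choice f(x) = x² on [0, 2] is arbitrary): it implements s_n as defined above and checks monotonicity in n and the 2^{-n} closeness below the cap.

```python
def s_n(f, x, n):
    # The simple function from the text: s_n(x) = (j - 1)/2^n on the set
    # {(j - 1)/2^n <= f(x) < j/2^n}, and s_n(x) = n where f(x) >= n.
    v = f(x)
    if v >= n:
        return n
    return int(v * 2 ** n) / 2 ** n      # = (j - 1)/2^n for the right j

f = lambda x: x * x                      # a nonnegative function on [0, 2]
xs = [k / 100 for k in range(201)]
monotone = all(s_n(f, x, 3) <= s_n(f, x, 4) for x in xs)
# below the cap (here f <= 4 < 8), the error is at most 2^{-n}
close = all(0 <= f(x) - s_n(f, x, 8) <= 2 ** (-8) for x in xs)
```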
Definition 1.3.4 Let (X, M, μ) be a measure space and suppose f is measurable.
(i) For f a simple function, say f = Σ_{j=1}^m a_j χ_{A_j}, the integral of f over X is
defined as

∫_X f dμ = Σ_{j=1}^m a_j μ(A_j),

the convention 0 · ∞ = 0 being used.
(ii) For f extended real-valued and nonnegative, the integral of f over X is
defined as

∫_X f dμ = sup{∫_X g dμ : g is simple and 0 ≤ g(x) ≤ f(x) for x ∈ X}.

(iii) Suppose that f is extended real-valued and f = f⁺ − f⁻, where f⁺ = max{f, 0}
and f⁻ = −min{f, 0}. The integral of f over X is defined as

∫_X f dμ = ∫_X f⁺ dμ − ∫_X f⁻ dμ,

provided at least one of the integrals on the right is finite.
(iv) The function f is said to be integrable (or μ-integrable) if ∫_X f⁺ dμ and
∫_X f⁻ dμ are both finite.
(v) Suppose f is complex-valued and the integrals of ℜf and ℑf are defined as in
(iii) and are finite. Then set

∫_X f dμ = ∫_X ℜf dμ + i ∫_X ℑf dμ.

In this case, f is said to be integrable (or μ-integrable).
(vi) For a measurable set A, let χ_A be the characteristic function of A. If the
integral of fχ_A can be defined as above, set

∫_A f dμ = ∫_X fχ_A dμ.

(vii) If the integral can be defined in this manner, we say that the integral exists.
When μ is Lebesgue measure on a bounded closed interval, the μ-integral
defined above is the Lebesgue integral and coincides with the Riemann integral for
all Riemann integrable functions.
It is sometimes convenient to denote ∫_A f dμ by ∫_A f(x) dμ(x).
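On a finite measure space, Definition 1.3.4(i) can be carried out literally. The following is a toy example of ours (the point masses and the function f are hypothetical): the integral computed from the distinct values of f agrees with the pointwise weighted sum, and a value carried on a set of measure zero contributes nothing.

```python
from fractions import Fraction

# X = {0, 1, 2, 3}, M = all subsets, mu given by point masses
# (the point 3 carries measure zero)
mass = {0: Fraction(1), 1: Fraction(2), 2: Fraction(1, 2), 3: Fraction(0)}

def mu(A):
    return sum(mass[x] for x in A)

def integral_simple(f):
    # Definition 1.3.4(i): sum over the distinct values a_j of f of
    # a_j * mu(A_j), where A_j = {x : f(x) = a_j}
    total = Fraction(0)
    for a in set(f.values()):
        A_j = {x for x in f if f[x] == a}
        total += a * mu(A_j)
    return total

f = {0: Fraction(3), 1: Fraction(3), 2: Fraction(4), 3: Fraction(7)}
val = integral_simple(f)     # 3*mu({0,1}) + 4*mu({2}) + 7*mu({3})
```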
Definition 1.3.5 Let (X, M, μ) be a σ-finite measure space which is nontrivial in
the sense that 0 < μ(A) < ∞ for some A ∈ M. For p > 0, we shall denote by
L̃^p(X, M, μ) the set of measurable complex-valued functions f defined on X such
that ∫_X |f|^p dμ < ∞.
With pointwise addition and scalar multiplication, L̃¹(X, M, μ) is a vector space
of functions.
In order to introduce the set L̃^∞(X, M, μ), we shall need the following: for a
nonnegative measurable function g, let A be the set of all real numbers a such that

μ({x ∈ X : g(x) > a}) = 0.

If A = ∅, put b = ∞. If A ≠ ∅, put b = inf A. Since

{x ∈ X : g(x) > b} = ∪_{n=1}^∞ {x ∈ X : g(x) > b + 1/n}

and since the union of a countable collection of sets of measure zero is a set of
measure zero, it follows that b ∈ A. We call b the essential supremum of g and write
b = ess sup g. The function g is said to be essentially bounded if ess sup g is finite.
The collection of measurable functions f for which ess sup |f| < ∞ will be denoted
by L̃^∞(X, M, μ).
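A toy computation (ours; the measure space and the function g are hypothetical) shows how the essential supremum ignores values taken on a set of measure zero, whereas the ordinary supremum does not.

```python
from fractions import Fraction

# X = {0, 1, 2} with the point 2 of measure zero; g is large only there
mass = {0: Fraction(1), 1: Fraction(1), 2: Fraction(0)}
g = {0: 1.0, 1: 2.0, 2: 100.0}

def ess_sup(g):
    # b = inf{a : mu({x : g(x) > a}) = 0}; on a finite space it suffices
    # to scan the values of g as candidate bounds
    for b in sorted(set(g.values())):
        if sum(mass[x] for x in g if g[x] > b) == 0:
            return b
    return max(g.values())

b = ess_sup(g)      # 2.0, since {g > 2} = {2} has measure zero
```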
The next three theorems involve interchange of integration with the limiting
process for a sequence of functions.
Theorem 1.3.6 (Monotone Convergence Theorem) Let (X, M, μ) be a measure
space and assume that {f_n}_{n≥1} is a monotone increasing sequence of nonnegative
extended real-valued measurable functions. Then

∫_X lim_n f_n dμ = lim_n ∫_X f_n dμ.

The reader will note that each of the above integrals is defined (though not
necessarily finite).
The following immediate consequence of the Monotone Convergence Theorem
will be needed in Sect. 2.8.
Corollary 1.3.7 Let (X, M, μ) be a measure space and {f_n}_{n≥1} be a sequence of
nonnegative integrable functions, each defined on X, such that

Σ_{k=1}^∞ ∫_X f_k dμ < ∞.

Then Σ_{k=1}^∞ f_k is integrable and

∫_X (Σ_{k=1}^∞ f_k) dμ = Σ_{k=1}^∞ ∫_X f_k dμ < ∞.
Proof Since {Σ_{k=1}^n f_k}_{n≥1} is an increasing sequence of functions that converges
to Σ_{k=1}^∞ f_k, by the Monotone Convergence Theorem, we conclude that

∫_X (Σ_{k=1}^∞ f_k) dμ = Σ_{k=1}^∞ ∫_X f_k dμ < ∞,

and hence Σ_{k=1}^∞ f_k is integrable. □
Theorem 1.3.8 (Fatou's Lemma) Let (X, M, μ) be a measure space and assume
that {f_n}_{n≥1} is a sequence of nonnegative extended real-valued measurable
functions. Then

∫_X liminf_n f_n dμ ≤ liminf_n ∫_X f_n dμ.
A complex-valued measurable function is integrable if, and only if, its real and
imaginary parts are. Obviously, there can be no Monotone Convergence Theorem
or Fatou's Lemma for complex-valued functions. Nonetheless, the following
result, which is initially proved for real-valued functions by using Fatou's
Lemma, can be extended to complex-valued functions without any difficulty [26].
Theorem 1.3.9 (Lebesgue Dominated Convergence Theorem) Let (X, M, μ) be a
measure space and assume that {f_n}_{n≥1} is a sequence of complex measurable
functions. Suppose that lim_n f_n = f. If there is an integrable function g such that
|f_n| ≤ g (n ≥ 1), then f is integrable and

lim_n ∫_X f_n dμ = ∫_X f dμ.
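A numerical illustration of ours (with a midpoint rule standing in for the integral over [0, 1]): the sequence f_n(x) = x^n is dominated by the integrable function g ≡ 1 and converges to 0 a.e. on [0, 1], so the integrals 1/(n + 1) must tend to 0, the integral of the limit function.

```python
def integral(f, a=0.0, b=1.0, m=20000):
    # composite midpoint rule as a stand-in for integration over [a, b]
    h = (b - a) / m
    return h * sum(f(a + (k + 0.5) * h) for k in range(m))

ns = (1, 2, 5, 10, 50)
# f_n(x) = x^n: |f_n| <= 1, f_n -> 0 a.e., and integral(f_n) = 1/(n + 1)
vals = [integral(lambda x, n=n: x ** n) for n in ns]
```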
We finally recall the role played by the sets of measure zero. Let P be a property
which a function is eligible to have at a point (e.g. continuity, positivity and the
like). If f has the property P at all points outside some set of measure zero, then f is
said to have the property P almost everywhere (abbreviated as a.e.).
For example, if f is a nonnegative measurable function defined on X and
∫_X f dμ = 0, then f = 0 a.e.
For the definition of the spaces L^p, see Sect. 2.4.
Corollary 1.3.10 Let {f_n} be a sequence of complex-valued measurable functions
on X such that Σ_{n=1}^∞ |f_n| ∈ L¹ (or equivalently, Σ_{n=1}^∞ ∫_X |f_n| dμ < ∞). Then
Σ_{n=1}^∞ f_n ∈ L¹ and ∫_X Σ_{n=1}^∞ f_n dμ = Σ_{n=1}^∞ ∫_X f_n dμ.
Proposition 1.3.11 If the function f ∈ L¹[0, 1] and ∫_0^1 f(x)x^n dx = 0 for
n = 0, 1, 2, …, then f(x) = 0 a.e. on [0, 1].
Hint: Obviously, ∫_0^1 f(x)g(x) dx = 0 for every polynomial g and hence for every
continuous function g by Weierstrass' Approximation Theorem. Since the
characteristic function of [0, t], where t ∈ [0, 1], can be approximated pointwise by a
sequence of continuous functions, one gets

∫_0^t f(x) dx = 0 for every t ∈ [0, 1],

using the Dominated Convergence Theorem. It follows from Corollary 1.3.10 that
the integral of f over any open subset U of [0, 1] vanishes. The result now follows
on using the regularity of Lebesgue measure.
Proposition 1.3.12 If the real- or complex-valued function f(t) is integrable and
∫_{−∞}^∞ f(t)e^{itx} dt = 0 for all real x, then f(t) = 0 a.e. [see Corollary (21.47)
of [12]].
Remark 1.3.13 If f is extended real-valued and if ∫_X f dμ is finite, then f is finite
almost everywhere.
Let (X, S, μ) and (Y, T, ν) be σ-finite measure spaces. A measurable rectangle is
any set of the form A × B, where A ∈ S and B ∈ T. S × T denotes the σ-algebra
generated by the collection of measurable rectangles.
With each function f on X × Y and with each x ∈ X, we associate a function f_x
defined on Y as f_x(y) = f(x, y). Similarly, if y ∈ Y, f^y is the function on X such that
f^y(x) = f(x, y). Let f be an (S × T)-measurable function. Then, for each x ∈
X [resp. y ∈ Y], the function f_x [resp. f^y] is T-measurable [resp. S-measurable].
With each subset Q of X × Y and with each x ∈ X, we associate a subset Q_x of
Y defined as Q_x = {y ∈ Y : (x, y) ∈ Q}. Similarly, if y ∈ Y, Q^y is the subset of X
such that Q^y = {x ∈ X : (x, y) ∈ Q}. Let Q be (S × T)-measurable. Then, for each
x ∈ X [resp. y ∈ Y], the set Q_x [resp. Q^y] is T-measurable [resp. S-measurable].
Let Q ∈ S × T. If φ(x) = ν(Q_x) [resp. ψ(y) = μ(Q^y)], then the function φ is
S-measurable [resp. ψ is T-measurable] and

∫_X φ dμ = ∫_Y ψ dν, i.e., ∫_X ν(Q_x) dμ(x) = ∫_Y μ(Q^y) dν(y).

The product measure μ × ν is given by

(μ × ν)(Q) = ∫_X ν(Q_x) dμ(x) = ∫_Y μ(Q^y) dν(y) for Q ∈ S × T.
Theorem 1.3.14 (Fubini) If f ∈ L¹(μ × ν), then f_x for almost all x ∈ X [resp. f^y for
almost all y ∈ Y] is in L¹(ν) [resp. L¹(μ)] and

∫_X (∫_Y f_x dν) dμ = ∫_{X×Y} f d(μ × ν) = ∫_Y (∫_X f^y dμ) dν.

The above equality holds for nonnegative measurable functions as well.

1.4 Zorn’s Lemma

A partially ordered set is a set S with a relation x ≤ y between ordered pairs of
elements of S satisfying (i) x ≤ x, (ii) x ≤ y, y ≤ x implies x = y, and (iii) x ≤ y,
y ≤ z implies x ≤ z. If every pair of elements of a subset S′ ⊆ S are comparable,
that is, x ∈ S′ and y ∈ S′ implies either x ≤ y or y ≤ x, then S′ is called a totally
ordered subset of S (or a chain). An upper bound of a set A ⊆ S is any y ∈ S such
that x ≤ y for all x ∈ A. A maximal element of S is a y ∈ S such that y ≤ x implies
y = x.
Zorn's Lemma If S is a partially ordered set in which every totally ordered
subset has an upper bound, then S has a maximal element.
Remark This lemma is logically equivalent to the axiom of choice, that is, one can
be derived from the other and vice versa. This axiom says that if {X_α}_{α∈Λ}, Λ any
indexing set, is any family of nonempty sets, then there exists a set that contains
exactly one element from each X_α.
For a discussion of the equivalence alluded to above and related material, the
reader may consult J.L. Kelley [15].

1.5 Absolute Continuity

A real-valued function defined on [a, b] is said to be absolutely continuous on
[a, b] if, given ε > 0, there is a δ > 0 such that

Σ_{i=1}^n |f(d_i) − f(c_i)| < ε

for every finite pairwise disjoint family {(c_i, d_i)} of intervals with
Σ_{i=1}^n (d_i − c_i) < δ.
The following results are well known [25]:
(i) An absolutely continuous function is continuous.
(ii) The indefinite integral ∫_{[a,x]} f dμ, f ∈ L¹[a, b], is absolutely continuous.
(iii) If f is absolutely continuous, then f has a derivative almost everywhere.
(iv) Let f be an absolutely continuous function on [a, b], and suppose that f′(x) = 0
a.e. Then, f is a constant.
(v) A function f on [a, b] has the form

f(x) = f(a) + ∫_a^x φ(s) ds

for some φ ∈ L¹[a, b] if, and only if, f is absolutely continuous on [a, b].
In this case, f′(x) = φ(x) a.e. on [a, b].
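Property (v) can be illustrated numerically. The sketch below is ours (the choice φ = cos on [0, π] is arbitrary): cumulative midpoint sums build the indefinite integral f(x) = f(0) + ∫_0^x φ(s) ds, which should agree with the exact antiderivative 1 + sin(x).

```python
import math

# phi integrable on [0, pi]; f(x) = f(0) + integral_0^x phi(s) ds should
# recover an absolutely continuous f with f' = phi a.e.
phi, f0, b, m = math.cos, 1.0, math.pi, 200000
h = b / m

f_vals, acc = [f0], 0.0
for k in range(m):
    acc += h * phi((k + 0.5) * h)     # cumulative midpoint sums build f
    f_vals.append(f0 + acc)

# compare with the exact antiderivative 1 + sin(x) on the grid
err = max(abs(f_vals[k] - (1.0 + math.sin(k * h))) for k in range(m + 1))
```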
Chapter 2
Inner Product Spaces

2.1 Definition and Examples

In the study of vector algebra in ℝⁿ, the notion of angle between two nonzero
vectors is introduced by considering the inner (or dot) product. In fact, if x =
(x_1, x_2, …, x_n) and y = (y_1, y_2, …, y_n) are any two vectors in the n-dimensional
Euclidean space ℝⁿ, then their inner product is defined by

(x, y) = Σ_{i=1}^n x_i y_i,

and this inner product is related to the norm by

(x, x) = ||x||².

The familiar equation

(x, y) = ||x|| ||y|| cos θ

determines the angle θ between x and y. The vectors x and y are orthogonal if
(x, y) = 0. This concept of orthogonality proves useful and lends itself to
generalisation to spaces of higher dimensions.
We introduce below the abstract notion of an inner product and show how a
vector space equipped with an inner product reflects properties analogous to those
enjoyed by the n-dimensional Euclidean space ℝⁿ.
Recall that we denote by F either the field ℂ of complex numbers or the field ℝ
of real numbers.
Definition 2.1.1 Let H be a vector space over F. An inner product on H is a
function (·,·) from H × H into F such that for all x, y, z ∈ H and λ ∈ F,
(i) (x, y) = \overline{(y, x)};
(ii) (x + z, y) = (x, y) + (z, y); (λx, y) = λ(x, y);
(iii) (x, x) ≥ 0 and (x, x) = 0 if, and only if, x = 0.
An inner product space is a vector space with an inner product on it. Axiom
(ii) for an inner product space can be expressed as follows: the inner product is
linear in the first variable. In axiom (i), \overline{(y, x)} denotes the complex conjugate of
(y, x). Inner product spaces are also called pre-Hilbert spaces.
It is left to the reader to verify that when F = ℝ and H = ℝⁿ, the usual inner
product (x, y) = Σ_{i=1}^n x_i y_i (described above) satisfies the foregoing definition.
The following proposition contains some immediate consequences of
Definition 2.1.1.
Proposition 2.1.2 For any x, y, z in an inner product space H and any λ ∈ F, the
following hold:
(a) (x, y + z) = (x, y) + (x, z);
(b) (x, λy) = \overline{λ}(x, y);
(c) (0, y) = (x, 0) = 0;
(d) (x − y, z) = (x, z) − (y, z);
(e) (x, y − z) = (x, y) − (x, z);
(f) if (x, y) = (x, z) for all x, then y = z.
Proof (a) Using Definition 2.1.1(i) and (ii), we have

(x, y + z) = \overline{(y + z, x)} = \overline{(y, x) + (z, x)} = \overline{(y, x)} + \overline{(z, x)} = (x, y) + (x, z).

(c) (0, y) = (0 + 0, y) = (0, y) + (0, y), on using Definition 2.1.1(ii), and hence
(0, y) = 0.
(f) Suppose (x, y) = (x, z) for all x. Then

(x, y − z) = (x, y) − (x, z) = 0

for all x; in particular, (y − z, y − z) = 0 and hence y − z = 0 by Definition 2.1.1(iii).
The proofs of (b), (d) and (e) are no different and are left to the reader. □
Examples 2.1.3
(i) Let H = ℂⁿ = {x = (x_1, x_2, …, x_n) : x_i ∈ ℂ, 1 ≤ i ≤ n} be the complex vector
space of n-tuples. For x = (x_1, x_2, …, x_n) and y = (y_1, y_2, …, y_n), define

(x, y) = Σ_{i=1}^n x_i \overline{y_i}.    (2.1)

It is routine to check that the formula (2.1) does define an inner product on ℂⁿ in
the sense of Definition 2.1.1. This space is called n-dimensional unitary space and
is denoted by ℂⁿ. Indeed, the vectors (1, 0, …, 0), (0, 1, 0, …, 0), …, (0, 0, …, 0, 1)
constitute a basis for ℂⁿ.
(ii) Let ℓ_0 be the vector space of all sequences x = {x_n}_{n≥1} of complex numbers,
all of whose terms, from some index onwards, are zero (the index, of course,
may vary with the sequence). If x = {x_n}_{n≥1} and y = {y_n}_{n≥1}, define

(x, y) = Σ_{n=1}^∞ x_n \overline{y_n}.    (2.2)

Since the sum on the right side of (2.2) is essentially finite, convergence is
not an issue here. The axioms of Definition 2.1.1 are easily verified.
(iii) Let ℓ² denote the set of all complex sequences x = {x_n}_{n≥1} which are square
summable, that is,

Σ_{n=1}^∞ |x_n|² < ∞.

The addition of vectors x = {x_n}_{n≥1} and y = {y_n}_{n≥1} and the scalar
multiplication of x = {x_n}_{n≥1} by a scalar λ ∈ ℂ are defined by

x + y = {x_n + y_n}_{n≥1} and λx = {λx_n}_{n≥1}.

Since

|a + b|² ≤ 2|a|² + 2|b|²

for a, b ∈ ℂ, it follows that

Σ_{n=1}^m |x_n + y_n|² ≤ 2Σ_{n=1}^m |x_n|² + 2Σ_{n=1}^m |y_n|² ≤ 2Σ_{n=1}^∞ |x_n|² + 2Σ_{n=1}^∞ |y_n|²,

and hence

Σ_{n=1}^∞ |x_n + y_n|² ≤ 2Σ_{n=1}^∞ |x_n|² + 2Σ_{n=1}^∞ |y_n|².    (2.3)

Thus, if x = {x_n}_{n≥1} and y = {y_n}_{n≥1} are in ℓ², it follows from (2.3) that
x + y ∈ ℓ². Also, if x ∈ ℓ² and λ ∈ ℂ, then Σ_{n=1}^∞ |λx_n|² = |λ|² Σ_{n=1}^∞ |x_n|²
shows that λx ∈ ℓ². Consequently, ℓ² is a vector space over ℂ.
For x = {x_n}_{n≥1} and y = {y_n}_{n≥1} in ℓ², the series Σ_{n=1}^∞ x_n \overline{y_n} converges
absolutely. In fact,

|x_n \overline{y_n}| ≤ (1/2)(|x_n|² + |y_n|²)

implies

Σ_{n=1}^m |x_n \overline{y_n}| ≤ (1/2)Σ_{n=1}^m |x_n|² + (1/2)Σ_{n=1}^m |y_n|² ≤ (1/2)Σ_{n=1}^∞ |x_n|² + (1/2)Σ_{n=1}^∞ |y_n|²,

and hence

Σ_{n=1}^∞ |x_n \overline{y_n}| ≤ (1/2)Σ_{n=1}^∞ |x_n|² + (1/2)Σ_{n=1}^∞ |y_n|².

Now define

(x, y) = Σ_{n=1}^∞ x_n \overline{y_n}, x, y ∈ ℓ².    (2.4)

It is now easy to check that the axioms for an inner product are satisfied. Thus, ℓ²
with the inner product defined in (2.4) is an inner product space.
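The estimates above can be mirrored numerically on truncated sequences. The sketch below is our own illustration (the particular sequences x_n = 1/n, y_n = 2^{-n} and the tolerances are ours): it checks the bound (2.3) and evaluates the partial sums of (2.4), whose limits here are the classical values π²/6 and log 2.

```python
import math

# Truncations of the square-summable sequences x_n = 1/n and y_n = 2^-n;
# with N = 2000 terms the neglected tails are below the tolerances used.
N = 2000
x = [1.0 / n for n in range(1, N + 1)]
y = [0.5 ** n for n in range(1, N + 1)]

def inner(u, v):
    # (2.4); the entries here are real, so conjugation is invisible
    return sum(a * b for a, b in zip(u, v))

lhs = sum(abs(a + b) ** 2 for a, b in zip(x, y))
rhs = 2 * inner(x, x) + 2 * inner(y, y)    # the bound (2.3)
sum_sq_x = inner(x, x)                     # partial sum of pi^2/6
dot_xy = inner(x, y)                       # partial sum of log 2
```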
(iv) Let C[a, b], −∞ < a < b < ∞, be the vector space of all continuous
complex-valued functions defined on [a, b]. Define

(f, g) = ∫_a^b f(t)\overline{g(t)} dt, f, g ∈ C[a, b].    (2.5)

Observe that (f, f) = ∫_a^b |f(t)|² dt = 0 implies f(t) = 0 for each t ∈ [a, b], in
view of the continuity of f. The other axioms in Definition 2.1.1 are
consequences of the properties of integrals.
(v) Let Cⁿ[a, b] be the vector space of all n times continuously differentiable
complex-valued functions defined on [a, b]. For f, g ∈ Cⁿ[a, b], define

(f, g) = Σ_{i=0}^n ∫_a^b f^{(i)}(t)\overline{g^{(i)}(t)} dt,    (2.6)

where f^{(i)}(t) denotes the ith derivative of f, 1 ≤ i ≤ n, and f^{(0)}(t) = f(t),
t ∈ [a, b]. Observe that (f, f) = Σ_{i=0}^n ∫_a^b |f^{(i)}(t)|² dt = 0 implies
Σ_{i=0}^n |f^{(i)}(t)|² = 0, t ∈ [a, b], in view of the continuity of Σ_{i=0}^n |f^{(i)}(t)|²,
t ∈ [a, b]. This implies f(t) = 0 for each t ∈ [a, b]. The other axioms in
Definition 2.1.1 are consequences of the properties of integrals.
(vi) Let RL² denote the space of rational functions (i.e. ratios of two polynomials
with complex coefficients) which are analytic on the unit circle

∂D = {z ∈ ℂ : |z| = 1},

with the usual pointwise addition and scalar multiplication. The inner product
is defined by

(f, g) = (1/2πi) ∫_{∂D} f(z)\overline{g(z)} dz/z,    (2.7)

where the integral is taken in the anticlockwise direction around ∂D.
RH² is the subspace of RL² consisting of those rational functions which are
analytic on the closed unit disc D̄, where

D = {z ∈ ℂ : |z| < 1},

with inner product given by (2.7).
Thus, a rational function belongs to RL² if it has no pole of absolute value 1, and
it belongs to RH² if it has no pole of absolute value less than or equal to 1.
Clearly, (2.7) satisfies the axioms in (i), (ii) and part of (iii). We need to check
that (f, f) > 0 when f ≠ 0. Indeed,

(f, f) = (1/2π) ∫_{−π}^π |f(e^{iθ})|² dθ,    (2.8)

using the parametrisation z = e^{iθ}, −π < θ ≤ π. Since f(e^{iθ}) is continuous on
[−π, π], the right-hand side of (2.8) is positive unless f = 0.
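Formula (2.8) lends itself to a numerical check. The sketch below is ours (the sample function f(z) = 1/(z − 2) and grid size are our choices, not from the text): for this f, the integrand is 1/(5 − 4 cos θ), and a standard table (or residue) computation gives (f, f) = 1/3, which the quadrature reproduces.

```python
import cmath
import math

def inner(f, g, m=4096):
    # (2.8): (f, g) = (1/2pi) integral_{-pi}^{pi} f(e^{it}) conj(g(e^{it})) dt,
    # approximated on a uniform grid; the trapezoidal rule is extremely
    # accurate for smooth periodic integrands
    h = 2.0 * math.pi / m
    total = 0j
    for k in range(m):
        z = cmath.exp(1j * (-math.pi + k * h))
        total += f(z) * g(z).conjugate()
    return total * h / (2.0 * math.pi)

f = lambda z: 1.0 / (z - 2.0)   # the only pole is at 2, so f lies in RL^2
val = inner(f, f)
# exact value: (1/2pi) integral dt/(5 - 4 cos t) = 1/3
```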
(vii) A trigonometric polynomial is a finite sum of the form

f(x) = a_0 + Σ_{n=1}^k a_n e^{ik_n x}, x ∈ [−π, π],

where k ∈ ℕ, a_0, a_1, …, a_k ∈ ℂ and k_1, k_2, …, k_k ∈ ℕ. It is clear that every
trigonometric polynomial is of period 2π. The space TP of trigonometric
polynomials is a vector space over ℂ with respect to pointwise addition and
scalar multiplication. If we define the inner product by

(f, g) = (1/2π) ∫_{−π}^π f(t)\overline{g(t)} dt,    (2.9)

then TP becomes an inner product space.


Problem Set 2.1

2.1.P1. For which values of a ∈ ℂ does the sequence {n^{−a}}_{n≥1} belong to ℓ²?
2.1.P2. [Notations as in Example 2.1.3(vi)] Calculate the inner product of the
functions

f(z) = 1/(z − a) and g(z) = 1/(z − b), where |a| < 1, |b| < 1.

2.1.P3. [Notations as in Example 2.1.3(vi)] Let k_a ∈ RL² be defined by
k_a(z) = (1 − \overline{a}z)^{−1}, where |a| ≠ 1. Show that for f ∈ RH²,

(f, k_a) = f(a) if |a| < 1 and (f, k_a) = 0 if |a| > 1.

2.1.P4. [Notations as in Example 2.1.3(vi)] Let f ∈ RH² and a ∈ D. Show that

|f(a)| ≤ (1/√(1 − |a|²)) {(1/2π) ∫_{−π}^π |f(e^{iθ})|² dθ}^{1/2}.

2.2 Norm of a Vector

Let X be a vector (linear) space over F.
Definition 2.2.1 A norm ||·|| is a function from X into the nonnegative reals ℝ⁺
satisfying
(i) ||x|| = 0 if, and only if, x = 0,
(ii) ||λx|| = |λ| ||x|| for each λ ∈ F and x ∈ X,
(iii) ||x + y|| ≤ ||x|| + ||y|| for all x, y ∈ X. [triangle inequality]
We emphasise that, by definition, ||x|| ≥ 0 for all x ∈ X.
If X is a linear space and ||·|| is a norm defined on X, then d(x, y) = ||x − y|| indeed
gives rise to a metric as a consequence of the foregoing Definition 2.2.1. The details
are as follows.
That the distance d(x, y) from a vector x to a vector y in X is strictly positive (that
is, d(x, y) ≥ 0 and equality holds if, and only if, x = y) follows from (i). The fact
that d(x, y) = d(y, x) follows from
kx yk ¼ k ðy xÞk ¼ j 1jky x k ¼ ky xk;

in view of (ii). Also

dðx; zÞ ¼ kx zk ¼ kx yþy z k  kx y k þ ky zk ¼ dðx; yÞ þ dðy; zÞ

for all x, y and z. The reader will observe that (iii) has been used in proving the
preceding inequality.
A linear space X equipped with a norm ‖·‖ is called a normed linear space. If
the metric space (X, d), where d(x, y) = ‖x − y‖, x, y ∈ X, is complete, then the
normed linear space is said to be complete and is called a Banach space. These
spaces are named after the great Polish mathematician Stefan Banach. The real
space ℝⁿ of n-tuples x = (x₁, x₂, …, xₙ) is a Banach space with each of the norms

    ‖x‖₁ = ∑_{i=1}^{n} |xᵢ|,  ‖x‖₂ = (∑_{i=1}^{n} |xᵢ|²)^{1/2},  ‖x‖_∞ = supᵢ |xᵢ|.

That ‖·‖₁, ‖·‖₂ and ‖·‖_∞ are norms can be verified; see [30]. So is the space ℂⁿ of
complex n-tuples. The space (ℂⁿ, ‖·‖₂) is complete [see Example 2.3.4(i)]. That ℂⁿ
with ‖·‖₁ and ℂⁿ with ‖·‖_∞ are complete follows from the inequalities
‖·‖_∞ ≤ ‖·‖₂ ≤ ‖·‖₁ ≤ n‖·‖_∞; see [30].
Definition 2.2.2 In an inner product space H, the norm (or length) of a vector
x 2 H, denoted by ||x||, is the nonnegative real number as defined by
pffiffiffiffiffiffiffiffiffiffi
k xk ¼ ðx; xÞ;

and is called the norm induced by the inner product on H.


We shall see below that this satisfies the conditions for being a norm as laid out
in Definition 2.2.1.
The norm of an element x = (a1, a2, …, an) in the unitary space Cn is
!12
n
X 2
k xk ¼ jai j ;
i¼1

and that of an x = {ai}i  1 in ‘2 is


!12
1
X 2
k xk ¼ jai j :
i¼1

The norm of an element f 2 C[a, b] [respectively, f 2 Cn[a, b]] is


28 2 Inner Product Spaces

0 112 2 0 112 3
Zb b
n Z 
X
6 
kfk ¼ @ jf ðtÞj2 dtA 4resp:@ f ðiÞ ðtÞ2 dtA 7 5:
i¼0
a a

The norm of an element f in RL2 or in RH2 is


0 112 0 112
Z Zp
1 dz 1  
kfk ¼ @ jf ðzÞj2 A ¼ @ f ðeih Þ2 dhA :
2pi z 2p
@D p

Proposition 2.2.3 In an inner product space H, ‖·‖ has the following properties:
for x, y ∈ H and λ ∈ F,
(a) ‖x‖ ≥ 0, and ‖x‖ = 0 if, and only if, x = 0;
(b) ‖λx‖ = |λ|‖x‖;
(c) (Parallelogram Law)

    ‖x + y‖² + ‖x − y‖² = 2‖x‖² + 2‖y‖²;

(d) (Polarisation Identity in case F = ℂ)

    4(x, y) = ‖x + y‖² − ‖x − y‖² + i‖x + iy‖² − i‖x − iy‖².

Proof (a) is immediate from Definition 2.1.1(c), while (b) follows from

    ‖λx‖² = (λx, λx) = |λ|²(x, x) = |λ|²‖x‖².

For x, y ∈ H, we have

    ‖x + y‖² = (x + y, x + y) = ‖x‖² + ‖y‖² + (x, y) + (y, x).    (2.10)

In the identity (2.10), replace y by −y to obtain

    ‖x − y‖² = (x − y, x − y) = ‖x‖² + ‖y‖² − (x, y) − (y, x).    (2.11)

Adding (2.10) and (2.11), we get

    ‖x + y‖² + ‖x − y‖² = 2‖x‖² + 2‖y‖².

This proves the Parallelogram Law.

In the identity (2.10), replace y by −y, iy and −iy:

    ‖x − y‖² = ‖x‖² + ‖y‖² − (x, y) − (y, x).    (2.12)

    ‖x + iy‖² = ‖x‖² + ‖y‖² − i(x, y) + i(y, x).    (2.13)

    ‖x − iy‖² = ‖x‖² + ‖y‖² + i(x, y) − i(y, x).    (2.14)

Multiply both sides of (2.12) by −1, (2.13) by i and (2.14) by −i, and add the results to (2.10)
to obtain the following:

    ‖x + y‖² − ‖x − y‖² + i‖x + iy‖² − i‖x − iy‖² = 4(x, y).

This completes the proof of the Polarisation Identity. □
Remark In Proposition 2.2.3, the assertions (a)–(c) are valid in real as well as
complex inner product spaces, but (d) holds only in a complex inner product space.
Theorem 2.2.4 (Cauchy–Schwarz Inequality) Let H be an inner product space and
let ‖x‖ denote the norm of x ∈ H. Then

    |(x, y)| ≤ ‖x‖‖y‖    (2.15)

for x, y ∈ H, and equality holds if, and only if, x and y are linearly dependent.
Proof Choose a real number θ such that e^{iθ}(x, y) = |(x, y)|, and let λ = a e^{−iθ}, where
a ∈ ℝ. Then

    (x − λy, x − λy) = ‖x‖² + |λ|²‖y‖² − λ̄(x, y) − λ(y, x).    (2.16)

The expression on the left side of (2.16) is real and nonnegative. Hence,

    ‖x‖² + a²‖y‖² − 2a|(x, y)| ≥ 0    (2.17)

for every real a. If ‖y‖ = 0, then we must have |(x, y)| = 0, for otherwise (2.17)
would be false for large positive values of a. If ‖y‖ > 0, take a = |(x, y)|/‖y‖² in
(2.17) and obtain

    |(x, y)|² ≤ ‖x‖²‖y‖².

If x and y are linearly dependent, then we may write y = λx or x = λy for some
λ ∈ F. In the former case,

    |(x, y)| = |(x, λx)| = |λ|‖x‖² = ‖λx‖‖x‖ = ‖y‖‖x‖,

that is, equality holds in (2.15); the case x = λy is similar.

On the other hand, suppose that |(x, y)| = ‖x‖‖y‖. If ‖y‖ = 0, then y = 0 and
x and y are linearly dependent. If ‖y‖ ≠ 0, then

    (x − ((x, y)/‖y‖²)y, x − ((x, y)/‖y‖²)y) = ‖x‖² + |(x, y)|²/‖y‖² − 2Re(x, ((x, y)/‖y‖²)y)
        = ‖x‖² + |(x, y)|²/‖y‖² − 2|(x, y)|²/‖y‖²
        = ‖x‖² − |(x, y)|²/‖y‖²
        = 0.

Hence, x − ((x, y)/‖y‖²)y = 0; that is, x and y are linearly dependent. □
Remark The above proof of the Cauchy–Schwarz Inequality is valid in the real case
as well.
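For ℂⁿ, both inequality (2.15) and its equality case can be observed numerically. The following sketch (NumPy assumed; not part of the text) uses random vectors, which are linearly independent with probability one:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=5) + 1j * rng.normal(size=5)
y = rng.normal(size=5) + 1j * rng.normal(size=5)

def ip(u, v):                  # (u, v) = sum_k u_k * conj(v_k)
    return np.vdot(v, u)

def norm(u):
    return np.sqrt(ip(u, u).real)

# |(x, y)| <= ||x|| ||y||, with strict inequality for independent x, y ...
slack = norm(x) * norm(y) - abs(ip(x, y))

# ... and equality when y is a scalar multiple of x (linear dependence).
dep_slack = norm(x) * norm(2j * x) - abs(ip(x, 2j * x))
```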
Applying Theorem 2.2.4 in specific spaces such as ℂⁿ, ℓ² and C[a, b], the
following corollary results.
Corollary 2.2.5
(a) If a₁, a₂, …, aₙ and b₁, b₂, …, bₙ are complex numbers, then

    |∑_{i=1}^{n} aᵢb̄ᵢ| ≤ (∑_{i=1}^{n} |aᵢ|²)^{1/2} (∑_{i=1}^{n} |bᵢ|²)^{1/2}.

(b) If {aᵢ}_{i≥1} and {bᵢ}_{i≥1} are square summable sequences of complex numbers, then

    |∑_{i=1}^{∞} aᵢb̄ᵢ| ≤ (∑_{i=1}^{∞} |aᵢ|²)^{1/2} (∑_{i=1}^{∞} |bᵢ|²)^{1/2}.

(c) If f, g ∈ C[a, b], then

    |∫_a^b f(t)ḡ(t) dt| ≤ (∫_a^b |f(t)|² dt)^{1/2} (∫_a^b |g(t)|² dt)^{1/2}.

In each case, equality holds if, and only if, the vectors involved are linearly
dependent.
Theorem 2.2.6 (Triangle Inequality) In an inner product space H,

    ‖x + y‖ ≤ ‖x‖ + ‖y‖    (2.18)

for all x, y ∈ H.
Proof For x, y ∈ H,

    ‖x + y‖² = (x + y, x + y)
             = ‖x‖² + ‖y‖² + (x, y) + (y, x)
             = ‖x‖² + ‖y‖² + 2Re(x, y)
             ≤ ‖x‖² + ‖y‖² + 2‖x‖‖y‖,

using the Cauchy–Schwarz Inequality (2.15). Thus,

    ‖x + y‖² ≤ (‖x‖ + ‖y‖)²,

which implies

    ‖x + y‖ ≤ ‖x‖ + ‖y‖. □

Applying Theorem 2.2.6 to specific inner product spaces, such as ℓ², RL²
and C[a, b], the following inequalities are obtained:
Corollary 2.2.7
(a) If {xᵢ}_{i≥1} and {yᵢ}_{i≥1} are in ℓ², then

    (∑_{i=1}^{∞} |xᵢ + yᵢ|²)^{1/2} ≤ (∑_{i=1}^{∞} |xᵢ|²)^{1/2} + (∑_{i=1}^{∞} |yᵢ|²)^{1/2}.

(b) If f, g are in RL², then

    ((1/2π) ∫_{−π}^{π} |f(e^{iθ}) + g(e^{iθ})|² dθ)^{1/2}
        ≤ ((1/2π) ∫_{−π}^{π} |f(e^{iθ})|² dθ)^{1/2} + ((1/2π) ∫_{−π}^{π} |g(e^{iθ})|² dθ)^{1/2}.

(c) If f, g are in C[a, b], then

    (∫_a^b |f(t) + g(t)|² dt)^{1/2} ≤ (∫_a^b |f(t)|² dt)^{1/2} + (∫_a^b |g(t)|² dt)^{1/2}.

Corollary 2.2.8 In an inner product space H,

    |‖x‖ − ‖y‖| ≤ ‖x − y‖    (2.19)

for all x, y ∈ H.
Proof For x, y ∈ H,

    ‖x‖ = ‖x − y + y‖ ≤ ‖x − y‖ + ‖y‖

by Theorem 2.2.6. This implies

    ‖x‖ − ‖y‖ ≤ ‖x − y‖.    (2.20)

On interchanging the roles of x and y, we get

    ‖y‖ − ‖x‖ ≤ ‖y − x‖ = ‖x − y‖.    (2.21)

The inequality (2.19) follows upon combining (2.20) and (2.21). □
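Inequality (2.19) says the norm is 1-Lipschitz, which is the source of its (uniform) continuity noted later. A brief randomised check in ℂ³ (a NumPy sketch, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(2)
worst = -np.inf
for _ in range(1000):
    x = rng.normal(size=3) + 1j * rng.normal(size=3)
    y = rng.normal(size=3) + 1j * rng.normal(size=3)
    # By (2.19), | ||x|| - ||y|| | - ||x - y|| should never exceed 0.
    gap = abs(np.linalg.norm(x) - np.linalg.norm(y)) - np.linalg.norm(x - y)
    worst = max(worst, gap)
```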
Problem Set 2.2

2.2.P1. Show that for x, y and z ∈ X, an inner product space, the following
Apollonius Identity holds:

    ‖x − z‖² + ‖y − z‖² = ½‖x − y‖² + 2‖z − ½(x + y)‖².

2.2.P2. Show that the formula (A, B) = trace(B*A) defines an inner product on the
space ℂ^{n×n} of n × n complex matrices, where n ∈ ℕ and B* denotes the
conjugate transpose of B. Determine ‖Iₙ‖, where Iₙ is the identity matrix.
2.2.P3. Show that (i) x = {1/n}_{n≥1} ∈ ℓ² and determine ‖x‖; (ii) x = {2^{−n/2}}_{n≥1} ∈ ℓ²
and determine ‖x‖.
2.2.P4. Prove that, for any function f ∈ C[0, π],

    |∫_0^π f(t) sin t dt| ≤ √(π/2) [∫_0^π |f(t)|² dt]^{1/2},

and describe the nonzero functions for which equality holds.
2.2.P5. Suppose µ(X) = 1 and f, g are positive measurable functions on X such that
fg ≥ 1. Prove that ∫_X f dµ · ∫_X g dµ ≥ 1.
2.2.P6. Let X₁ = (C[0, 1], ‖·‖_∞), where ‖x‖_∞ = sup_{0≤t≤1}|x(t)|, and let X₂ = (C[0, 1], ‖·‖₂), where

    ‖x‖₂ = (x, x)^{1/2},  (x, y) = ∫_0^1 x(t)ȳ(t) dt.

Show that the identity mapping id: X₁ → X₂ is continuous, but its inverse
id: X₂ → X₁ is not.
2.2.P7. Let X be an inner product space. If ‖x‖ = ‖y‖ = ½‖x + y‖, then show that
x = y. The result fails to hold if X = ℝⁿ or ℂⁿ with norm ‖·‖₁ when n > 1.
2.2.P8. Let X be a vector space over ℂ. Let (·,·) be a complex-valued function of
two variables (x, y): X × X → ℂ which has the following properties:
(a) (ax₁ + bx₂, y) = a(x₁, y) + b(x₂, y),
(b) (x, y) = \overline{(y, x)},
(c) (x, x) ≥ 0, and (x, x) may be zero for nonzero x.
Prove that the Cauchy–Schwarz Inequality still holds, but without the rider
about when equality holds.
2.2.P9. [Notations as in Example 2.1.3(vi)] Let g(z) = 1/((z − a)(z − b)), where a and b are
distinct points in D. Using the Residue Theorem, show that

    ‖g‖ = (1 − |ab|²)^{1/2} / [(1 − |a|²)^{1/2} (1 − |b|²)^{1/2} |1 − āb|].

2.2.P10. [Notations as in Example 2.1.3(vi)] Prove that for a ∈ D,

    F = {f ∈ RH²: f(a) = 0}

is a closed linear subspace of RH².
2.3 Inner Product Spaces as Metric Spaces

We have seen in Proposition 2.2.3 and Theorem 2.2.6 that the norm induced by an
inner product in H satisfies the following conditions of Definition 2.2.1:
(i) ‖x‖ ≥ 0, and ‖x‖ = 0 if, and only if, x = 0,
(ii) ‖λx‖ = |λ|‖x‖ for all λ ∈ ℂ and x ∈ H,
(iii) ‖x + y‖ ≤ ‖x‖ + ‖y‖ for all x, y ∈ H.

Remark 2.3.1 An inner product space H, equipped with the metric d(x, y) = ‖x − y‖
arising from the norm induced by the inner product, is a metric space. As in any metric space,
d(x, y) is called the distance from x to y or between x and y.
We shall henceforth feel free to use, for inner product spaces, all metric concepts
such as open and closed sets, convergence, continuity, uniform continuity, Cauchy
sequence, completeness, dense sets and separability. Below we translate the general
metric space concepts defined in Sect. 1.2 into inner product space terms.
It follows from (2.19) that the map x → ‖x‖ defined on H is continuous; in fact, it
is uniformly continuous.
Remarks 2.3.2
(i) A sequence {xₙ}_{n≥1} in a normed space or in an inner product space is
Cauchy if for every ε > 0, there exists an integer n₀ such that

    ‖xₙ − xₘ‖ < ε whenever n, m ≥ n₀.

(ii) Every convergent sequence is Cauchy; the converse is, however, not true. In
fact, let {xₙ}_{n≥1}, where xₙ = (1, 1/2, …, 1/n, 0, …), n = 1, 2, …, be a sequence in
the inner product space ℓ⁰ (see Example 2.1.3(ii)). Then the sequence
{xₙ}_{n≥1} is Cauchy because

    ‖x_{m+p} − xₘ‖ = (∑_{k=m+1}^{m+p} 1/k²)^{1/2}

can be made arbitrarily small by choosing m sufficiently large. However, the
sequence does not converge to an element of the space. Assume the contrary,
that is, suppose xₙ → x, where x = (λ₁, λ₂, …, λ_N, 0, 0, …). If n ≥ N, then

    ‖xₙ − x‖² = ∑_{k=1}^{n} |1/k − λₖ|² + ∑_{k=n+1}^{∞} |λₖ|²
              = ∑_{k=1}^{n} |1/k − λₖ|².

On letting n → ∞, we obtain ∑_{k=1}^{∞} |1/k − λₖ|² = 0, which implies λₖ = 1/k for
each k, contradicting the fact that x has finitely many nonzero terms.
(iii) The open [respectively, closed] ball with centre x₀ and radius ε is the set
{x ∈ H: ‖x − x₀‖ < ε} [respectively, {x ∈ H: ‖x − x₀‖ ≤ ε}]. In view of
Proposition 1.2.10, a subset K ⊆ H is bounded if, and only if, there exists an
M > 0 such that K ⊆ {x: ‖x‖ ≤ M}.
Definition 2.3.3 An inner product space H is said to be complete if every Cauchy
sequence in H converges. That is, if {xₙ}_{n≥1} is a sequence in H satisfying
‖xₙ − xₘ‖ → 0 as n, m → ∞, there exists an x ∈ H such that ‖xₙ − x‖ → 0 as
n → ∞. An inner product space which is complete is called a Hilbert space.
Every Hilbert space is a Banach space. The norm in a Hilbert space is derived
from the inner product.
Examples 2.3.4
(i) The inner product space H = ℂⁿ with the metric given by

    d(x, y) = ‖x − y‖ = (∑_{i=1}^{n} |xᵢ − yᵢ|²)^{1/2},    (2.22)

where x = (x₁, x₂, …, xₙ) and y = (y₁, y₂, …, yₙ) are in ℂⁿ, is a Hilbert space
with metric as above or with inner product as in (i) of Examples 2.1.3.
We need to check that ℂⁿ, with the metric defined in (2.22), is complete [see
(i) of Examples 2.1.3]. Let {x^(m)}_{m≥1} = {(x₁^(m), x₂^(m), …, xₙ^(m))} denote a Cauchy
sequence in ℂⁿ, i.e. d(x^(m), x^(m′)) → 0 as m, m′ → ∞. Then for a given ε > 0 there
exists an integer n₀(ε) such that

    (∑_{k=1}^{n} |xₖ^(m) − xₖ^(m′)|²)^{1/2} < ε for all m, m′ ≥ n₀(ε).    (2.23)

Hence |xₖ^(m) − xₖ^(m′)| < ε for all m, m′ ≥ n₀(ε) and all k = 1, 2, …, n. Upon fixing
k and using the Cauchy Principle of Convergence, it follows that {xₖ^(m)}_{m≥1}
converges to a limit xₖ. Let x = (x₁, x₂, …, xₙ) and m ≥ n₀(ε). It follows from (2.23)
that

    ∑_{k=1}^{n} |xₖ^(m) − xₖ^(m′)|² < ε²    (2.24)

for all m′ ≥ n₀(ε). Letting m′ → ∞ in (2.24), we have

    ∑_{k=1}^{n} |xₖ^(m) − xₖ|² ≤ ε²

for all m ≥ n₀(ε). That is, d(x^(m), x) → 0 in ℂⁿ.
(ii) The inner product space H = ℓ² (see Example 2.1.3(iii)) is a Hilbert space.
We shall show that ℓ² with the metric

    d(x, y) = ‖x − y‖ = (∑_{k=1}^{∞} |xₖ − yₖ|²)^{1/2}    (2.25)

is complete. Let {x^(m)}_{m≥1} = {(x₁^(m), x₂^(m), …)} denote a Cauchy sequence in ℓ².
Then for a given ε > 0 there exists an integer n₀(ε) such that

    (∑_{k=1}^{∞} |xₖ^(m) − xₖ^(m′)|²)^{1/2} < ε for all m, m′ ≥ n₀(ε).    (2.26)

This implies |xₖ^(m) − xₖ^(m′)| < ε for all m, m′ ≥ n₀(ε), i.e. for each k, the sequence
{xₖ^(m)}_{m≥1} is a Cauchy sequence of complex numbers. So by the Cauchy Principle
of Convergence, limₘ xₖ^(m) = xₖ, say. Let x be the sequence (x₁, x₂, …). It will be
shown that x ∈ ℓ² and limₘ x^(m) = x. From (2.26), we have

    ∑_{k=1}^{N} |xₖ^(m) − xₖ^(m′)|² < ε²    (2.27)

for any positive integer N, provided m, m′ ≥ n₀(ε). Letting m′ → ∞ in (2.27), we
obtain

    ∑_{k=1}^{N} |xₖ^(m) − xₖ|² ≤ ε²

for any positive integer N and all m ≥ n₀(ε). The sequence
{∑_{k=1}^{N} |xₖ^(m) − xₖ|²}_{N≥1} is a monotonically increasing sequence of nonnegative
real numbers and is bounded above and, therefore, has a finite limit
∑_{k=1}^{∞} |xₖ^(m) − xₖ|², which is less than or equal to ε². Hence

    (∑_{k=1}^{∞} |xₖ^(m) − xₖ|²)^{1/2} ≤ ε for all m ≥ n₀(ε).    (2.28)

Observe that

    (∑_{k=1}^{∞} |xₖ|²)^{1/2} ≤ (∑_{k=1}^{∞} |xₖ^(m) − xₖ|²)^{1/2} + (∑_{k=1}^{∞} |xₖ^(m)|²)^{1/2},

using Corollary 2.2.7(a), and consequently x ∈ ℓ². Moreover, limₘ x^(m) = x in ℓ² by
(2.28).
It follows from Remark 2.3.2(ii) that ℓ⁰ is an inner product space that is not
complete.
Remarks 2.3.5
(i) The inner product space ℓ⁰ of sequences all of whose terms, from some index
onwards, are zero is dense in ℓ². In fact, let x = (a₁, a₂, …) be an element in
ℓ² (not in ℓ⁰) and let ε > 0 be given. Choose N such that

    ∑_{k=N+1}^{∞} |aₖ|² < ε².

Then the sequence y = (a₁, a₂, …, a_N, 0, …) is in the desired inner product
space and is such that

    ‖x − y‖ = (∑_{k=N+1}^{∞} |aₖ|²)^{1/2} < ε.

This shows that each x ∈ ℓ² (not in ℓ⁰) is a limit point of the space ℓ⁰ of
sequences all of whose terms, from some index onwards, are zero.
(ii) It may be discerned from (i) above that ℓ⁰ is not complete.
(iii) For j = 1, 2, …, let eⱼ = (0, …, 0, 1, 0, 0, …), where 1 occurs only in the jth
place, and let

    E = {λ₁e₁ + ⋯ + λₙeₙ: n = 1, 2, …; Re λⱼ, Im λⱼ are rational}.

Since the rational numbers constitute a countable set, E is countable. We
show that E is dense in ℓ². Let (x₁, x₂, …) ∈ ℓ² and ε > 0. As ∑_{j=1}^{∞} |xⱼ|² is
finite, there is some N such that

    ∑_{j=N+1}^{∞} |xⱼ|² < ε²/2.

Since the rational numbers are dense in ℝ, there are λ₁, …, λ_N in ℂ with
Re λⱼ, Im λⱼ rational and

    |xⱼ − λⱼ|² < ε²/2N, j = 1, 2, …, N.

Consider y = λ₁e₁ + ⋯ + λ_N e_N in E. Then

    ‖x − y‖² = ∑_{j=1}^{N} |xⱼ − λⱼ|² + ∑_{j=N+1}^{∞} |xⱼ|² < ε²/2 + ε²/2 = ε².

Hence, y ∈ S(x, ε). Thus, E is dense in ℓ². Consequently, ℓ² is a separable
metric space.
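The density argument in Remark 2.3.5(i) can be made concrete for x = (1, 1/2, 1/3, …) ∈ ℓ²: the squared distance from x to its N-term truncation is the tail ∑_{k>N} 1/k², which behaves like 1/N. A small sketch (plain Python, not part of the text; the tail is approximated by summing to a large cutoff):

```python
# ||x - y_N||^2 = sum_{k > N} 1/k^2, where x = (1, 1/2, 1/3, ...) and y_N is
# its truncation after N terms (an element of the dense subspace l0).
def tail(N, cutoff=10**6):
    return sum(1.0 / k**2 for k in range(N + 1, cutoff + 1))

errs = [tail(N) for N in (10, 100, 1000)]   # decreasing towards 0, roughly 1/N
```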

Definition 2.3.6 Two Hilbert spaces H and K are said to be isometrically iso-
morphic if there exists a linear isometry between H and K, i.e. if there exists a
bijective linear mapping A: H → K such that

    A(a₁x₁ + a₂x₂) = a₁A(x₁) + a₂A(x₂),

    (Ax₁, Ax₂) = (x₁, x₂)

for all x₁, x₂ ∈ H and scalars a₁ and a₂.

Theorem 2.3.7 For every inner product space X, there is a Hilbert space H such
that X is a dense linear subspace of H, and for x, y ∈ X, the inner product (x, y) in X
and in H is the same. The space H is unique up to a linear isometry; that is, if X is a
dense linear subspace of a Hilbert space K, then there is a unique linear isometry
A: H → K such that the restriction of A to X is the identity map.
Proof Consider X as a metric space with the metric induced by the inner product on
X, i.e. with d(x, y) = ‖x − y‖ = (x − y, x − y)^{1/2}. Let H be its completion. Let
x, y ∈ H and let {xₙ}_{n≥1} and {yₙ}_{n≥1} be sequences in X such that xₙ → x and
yₙ → y. Then for scalars λ and µ, the sequence {λxₙ + µyₙ}_{n≥1} is a Cauchy
sequence in X. Now, if H is to be given a Hilbert space structure such that the inner
product on H induces the metric on H, then

    limₙ(λxₙ + µyₙ) = λ limₙ xₙ + µ limₙ yₙ = λx + µy.

Thus, λx + µy must be defined to be the limit of the Cauchy sequence {λxₙ +
µyₙ}_{n≥1}. It may be checked that the addition, scalar multiplication and the limits of
the Cauchy sequences are well defined. It is now easy to check that with this
definition of addition and scalar multiplication, H becomes a vector space. Now
define (x, y) = limₙ(xₙ, yₙ); note that it is well defined. In fact, it is an inner product
on H whose restriction to X agrees with the given inner product in X. With this inner
product, H is a Hilbert space. The uniqueness can be easily verified. □
Problem Set 2.3

2.3.P1. Show that the space (C[0, 1], ‖·‖_∞), where ‖x‖_∞ = sup_{0≤t≤1}|x(t)|, is not an
inner product space, hence not a Hilbert space.

Definition A strictly convex norm on a normed linear space X is a norm such that,
for all x, y ∈ X, ‖x‖ = ‖y‖ = 1, y ≠ x ⇒ ‖x + y‖ < 2.
2.3.P2. (a) Show that the norm on a Hilbert space is strictly convex.
(b) Show that the norm ‖·‖_∞ on C[0, 1] is not strictly convex.
(c) Show that the norm ‖·‖₁ on C[0, 1] is not strictly convex.
2.3.P3. Let H be the collection of all absolutely continuous functions x: [0, 1] → F
such that x(0) = 0 and x′ ∈ L²[0, 1]. If (x, y) = ∫_0^1 x′(t)ȳ′(t) dt for x, y ∈ H,
show that H is a Hilbert space.
2.3.P4. Let H be a Hilbert space over ℝ. Show that there is a Hilbert space K over
ℂ and a map U: H → K such that (i) U is linear, (ii) (Ux₁, Ux₂) = (x₁, x₂) for
all x₁, x₂ ∈ H, and (iii) for any z ∈ K, there are unique x₁, x₂ ∈ H such that
z = Ux₁ + iUx₂.
2.3.P5. (a) Suppose x and y are vectors in a normed space such that ‖x‖ = ‖y‖. If
there exists t ∈ (0, 1) such that ‖tx + (1 − t)y‖ < ‖x‖, then show that this
strict inequality holds for all t ∈ (0, 1).
(b) Let x and y belong to a real or complex strictly convex normed space.
If ‖x + y‖ = ‖x‖ + ‖y‖ and x ≠ 0 ≠ y, show that there exists a > 0 such
that y = ax.
2.3.P6. The set of all vectors x = {ηₙ}_{n≥1} with |ηₙ| ≤ 1/n, n = 1, 2, …, in real ℓ² is
called the Hilbert cube. Show that this set is compact in ℓ².
2.3.P7. Let a = {aₙ}_{n≥1} be a sequence of positive real numbers. Define
ℓ²ₐ = {x = (x₁, x₂, …): xᵢ ∈ ℂ and ∑_{i=1}^{∞} aᵢ|xᵢ|² < ∞}. Define an inner
product on ℓ²ₐ by (x, y) = ∑_{i=1}^{∞} aᵢxᵢȳᵢ. Show that ℓ²ₐ is a Hilbert space.
2.3.P8. For a real number s, we define on ℤ a measure µₛ by setting

    µₛ({n}) = (1 + n²)^{s/2}, n ∈ ℤ.

Put Hˢ = L²(µₛ). Prove that for r < s, we have Hˢ ⊆ Hʳ.
2.3.P9. (a) Find a sequence a of positive real numbers such that (1, 1/2², 1/3³, …)
∉ ℓ²ₐ [see Problem 2.3.P7].
(b) Find a sequence a of positive real numbers such that all x = {xₙ}_{n≥1}
with |xₙ| = nⁿ are in ℓ²ₐ.
2.3.P10. Let M be a closed subspace of a Hilbert space H, and let y ∈ H, y ∉ M. If
M′ is the subspace spanned by M and y, then M′ is closed. In particular, a
finite-dimensional subspace must be closed.
2.4 The Space L²(X; M, µ)

A class of spaces associated with L̃^p(X; M, µ) for all p, 0 < p ≤ ∞ [see Definition
1.3.5] is important in analysis. Here we are concerned only with the cases p = 1 and
p = 2. We shall see that there is a Hilbert space associated with L̃²(X; M, µ).
Henceforth, the symbols M and µ will be omitted.
Proposition 2.4.1 L̃²(X) is a vector space.
Proof Suppose that f ∈ L̃²(X; M, µ) and a ∈ ℂ. Then f is measurable and so af is
measurable; |f|² is integrable and so |af|² = |a|²|f|² is integrable. Thus, af ∈ L̃²(X, M, µ).
Suppose that f, g ∈ L̃²(X). Then f and g are measurable and so f + g is mea-
surable. For all complex numbers a and b,

    |a + b|² ≤ 2|a|² + 2|b|².

The above inequality may be seen to result on applying the Cauchy–Schwarz
Inequality (Theorem 2.2.4) to the inner product ((a, b), (1, 1)) in the inner product
space ℂ². So, for all x ∈ X,

    0 ≤ |f(x) + g(x)|² ≤ 2(|f(x)|² + |g(x)|²).

The function on the right is integrable and therefore so is the measurable function
|f + g|². This proves that f + g ∈ L̃²(X). □
Definition 2.4.2 If f and g are two complex-valued measurable functions defined on
X, let us write 'f ~ g' if {x ∈ X: f(x) ≠ g(x)} is a null set (a measurable set of
measure zero). One says that the functions f and g are equivalent, or that f − g is a
null function.
Let 𝒩 denote the set of all null functions, that is,

    𝒩 = {f: f ~ 0}.

The relation '~' is an equivalence relation on the set of all complex-valued
measurable functions defined on X. As such, it partitions the set into disjoint
equivalence classes, where a typical class, denoted by [f], is given by

    [f] = {g: g measurable on X and g ~ f},

and g ∈ [f] would be called a representative of this class.

Note that 𝒩 is a vector space of functions and 𝒩 ⊆ L̃^p(X) for all p > 0.
If f is a function in L̃^p(X), then [f] = f + 𝒩 is the coset containing f, as defined
towards the end of Sect. 1.1.
Definition 2.4.3 The space L²(X, M, µ) is the set of all equivalence classes of
functions in L̃²(X).
Thus, if f ∈ L̃²(X), then [f] is the corresponding member of L²(X, M, µ). One says
that f is a representative of the equivalence class [f]. The set just defined is what we
intend to make into the promised Hilbert space associated with L̃²(X, M, µ).
Proposition 2.4.4 L²(X) is a vector space.
Proof 𝒩 is a subspace of L̃²(X), and L²(X) is actually the quotient space
L̃²(X, M, µ)/𝒩. □
We next define an inner product on the space L̃²(X).
Proposition 2.4.5 If f, g ∈ L̃²(X), then fg ∈ L̃¹(X).
Proof Suppose f, g ∈ L̃²(X). Then f, g are measurable and so the product fg is
measurable. The functions |f|² and |g|² are integrable, and it follows from the
inequality

    |f(x)g(x)| ≤ ½(|f(x)|² + |g(x)|²), x ∈ X,

that fg is also integrable. □

Now let us define (f, g) for f, g ∈ L̃²(X) by

    (f, g) = ∫_X f(x)ḡ(x) dµ(x), written briefly as ∫_X (f ḡ) dµ.

Note that if g ∈ L̃²(X), then ḡ (defined by ḡ(x) = \overline{g(x)}) is also a member of L̃²(X, M, µ),
and so by Proposition 2.4.5, (f, g) is well defined. The reader can check that (·,·) has all
the properties of an inner product except one. If (f, f) = 0, then

    0 = (f, f) = ∫_X |f(x)|² dµ(x) = ∫_X |f|² dµ

and therefore f ~ 0; that is, f is a null function, but one cannot conclude that f = 0.
However, if f ~ f′ and g ~ g′, then (f, g) = (f′, g′). In fact,
    |∫_X (f ḡ) − ∫_X f′ḡ′| = |∫_X (f − f′)ḡ + ∫_X f′(ḡ − ḡ′)|
        ≤ ∫_X |(f − f′)ḡ| + ∫_X |f′(ḡ − ḡ′)|
        ≤ (f − f′, f − f′)^{1/2}(g, g)^{1/2} + (f′, f′)^{1/2}(g − g′, g − g′)^{1/2} = 0,

using the Cauchy–Schwarz Inequality 2.2.4. In view of the inequality proved
above, the integral

    ∫_X (f ḡ) dµ

depends only on the equivalence classes [f] and [g] of the functions.

We can now define (·,·) on L²(X, M, µ).
Definition 2.4.6 For [f], [g] ∈ L²(X), define

    ([f], [g]) = ∫_X (f ḡ) dµ,

where f ∈ [f] and g ∈ [g].

In view of the remarks preceding Definition 2.4.6, (·,·) is unambiguously defined
on L²(X, M, µ).
Proposition 2.4.7 The space L²(X) with inner product as in Definition 2.4.6 is an
inner product space.
Proof For [f] ∈ L²(X), if ([f], [f]) = 0, then ∫_X |f|² dµ = 0 and hence [f] = 0, the
zero of L²(X). The verification of the other axioms of an inner product is
straightforward. □
Remark 2.4.8 We shall adopt the usual practice and abandon the notation [f]. The
symbol f will be used to denote both a function in L̃²(X) and the corresponding
equivalence class of functions in L²(X). Working mathematicians tend to ignore the
distinction between a function and its equivalence class. The correspondence
between statements about L̃²(X) and L²(X) is straightforward and gives rise to no
confusion. In the subsequent discussions, it will always be clear whether calcula-
tions are in terms of functions or equivalence classes of functions.
Note that for f ∈ L²(X),

    ‖f‖ = (f, f)^{1/2} = (∫_X |f(x)|² dµ(x))^{1/2} = (∫_X |f|² dµ)^{1/2}.

Now we can prove a result which is of central importance in analysis. It is often
called the Riesz–Fischer Theorem.
Theorem 2.4.9 (Riesz–Fischer Theorem) (L²(X), (·,·)) is a Hilbert space.
Proof Let {fₙ}_{n≥1} be a Cauchy sequence in L²(X); that is, for ε > 0 there exists an
integer n₀ such that

    ‖fₙ − fₘ‖ < ε whenever n, m ≥ n₀.

Then there exists a subsequence {f_{nᵢ}}_{i≥1}, n₁ < n₂ < ⋯, such that

    ‖f_{n_{i+1}} − f_{nᵢ}‖ = (∫_X |f_{n_{i+1}} − f_{nᵢ}|² dµ)^{1/2} < 1/2^i, i = 1, 2, …

Indeed, if nₖ has been selected, choose n_{k+1} > nₖ such that n, m ≥ n_{k+1} implies
‖fₙ − fₘ‖ < 1/2^{k+1}.
Let

    gₖ = ∑_{i=1}^{k} |f_{n_{i+1}} − f_{nᵢ}|

and

    g = ∑_{i=1}^{∞} |f_{n_{i+1}} − f_{nᵢ}|.

Then by the triangle inequality (Theorem 2.2.6),

    ‖gₖ‖ = ‖∑_{i=1}^{k} |f_{n_{i+1}} − f_{nᵢ}|‖ ≤ ∑_{i=1}^{k} ‖f_{n_{i+1}} − f_{nᵢ}‖ < ∑_{i=1}^{k} 1/2^i < 1 for k = 1, 2, …

Hence, an application of Fatou's Lemma (Theorem 1.3.8) to {gₖ²}_{k≥1} gives

    ∫_X g² dµ = ∫_X (lim infₖ gₖ²) dµ ≤ lim infₖ ∫_X gₖ² dµ ≤ 1,

that is, ‖g‖ ≤ 1. In particular, g(x) < ∞ a.e. Indeed, if E = {x: |g(x)| = ∞} had
positive measure, then ∫_X g(x)² dµ(x) = ∞. Therefore, the series

    f_{n₁}(x) + ∑_{i=1}^{∞} (f_{n_{i+1}}(x) − f_{nᵢ}(x))    (2.29)

converges absolutely for almost all x. Denote the sum of (2.29) by f(x) for those x at
which (2.29) converges, and put f(x) = 0 on the remaining set of measure zero. Since

    f_{n₁} + ∑_{i=1}^{k−1} (f_{n_{i+1}} − f_{nᵢ}) = f_{nₖ},

we see that

    f(x) = limᵢ f_{nᵢ}(x) a.e.

Having determined a function f which is the pointwise limit almost everywhere of
{f_{nᵢ}}_{i≥1}, we have to show that f is the L²-limit of {fₙ}_{n≥1}, i.e. limₙ‖fₙ − f‖ = 0,
where ‖·‖ denotes the L²-norm. Recall that

    ‖fₙ − fₘ‖ < ε whenever n, m ≥ n₀.

For m ≥ n₀, another application of Fatou's Lemma shows that

    ∫_X |f − fₘ|² dµ ≤ lim infᵢ ∫_X |f_{nᵢ} − fₘ|² dµ ≤ ε².    (2.30)

We conclude from (2.30) that f − fₘ ∈ L²(X) and hence f ∈ L²(X), since
f = (f − fₘ) + fₘ. Finally,

    ‖f − fₘ‖ → 0 as m → ∞.

This completes the proof that L²(X, M, µ) is a Hilbert space. □

Remark 2.4.10 In the course of the proof, we have shown that if {fₙ}_{n≥1} is a
Cauchy sequence in L²(X) with limit f, then {fₙ}_{n≥1} has a subsequence which
converges pointwise almost everywhere to f.
The simple functions play an important role in L²(X).
Theorem 2.4.11 Let S be the collection of all measurable simple functions on X
vanishing outside subsets of finite measure. Then S is dense in L²(X).
Proof Clearly, S ⊆ L²(X). Let f ∈ L²(X) and assume that f ≥ 0. There exists a
sequence {sₙ}_{n≥1} of measurable simple functions such that

    0 ≤ s₁(x) ≤ s₂(x) ≤ ⋯ ≤ f(x) and sₙ(x) → f(x)

(see 1.3.3). Since 0 ≤ sₙ(x) ≤ f(x), we have sₙ ∈ L²(X) and hence sₙ ∈ S. Since
|f − sₙ|² ≤ f², the Dominated Convergence Theorem 1.3.9 shows that ‖f − sₙ‖ → 0
as n → ∞. Thus, f is in the L²-closure of S. The general case when f is complex
follows from the above considerations. □
Theorem 2.4.12 Let X = ℝ, M be the σ-algebra of measurable subsets of ℝ and µ
be the usual Lebesgue measure on ℝ. Then the set of all continuous functions
vanishing outside subsets of finite measure is dense in L²(X).
Proof Let E ∈ M be such that µ(E) < ∞. Then every bounded measurable (in
particular continuous) function is square integrable on E. Consider now a nonempty
closed subset F of E and its characteristic function χ_F. For n = 1, 2, …, let

    fₙ(x) = 1/(1 + n·dist(x, F)), x ∈ E,

where dist(x, F) = inf{|x − y|: y ∈ F}. Note that each fₙ is continuous on E. Also,
fₙ(x) = 1 for all x ∈ F and fₙ(x) → 0 as n → ∞ for all x ∉ F. Hence, (χ_F − fₙ)(x) → 0
for all x ∈ E. Since µ(E) < ∞ and |fₙ(x)| ≤ 1 for all x ∈ E, the Dominated
Convergence Theorem shows that

    ∫_E |χ_F − fₙ|² dµ → 0 as n → ∞.

In case F is empty, χ_F is 0 everywhere and we may carry out the above argument
with each fₙ chosen to be 0 everywhere.
Now let E₀ be any measurable subset of ℝ with µ(E₀) < ∞. Let ε > 0 be given.
Then there exists a closed set F ⊆ E₀ such that µ(E₀\F) < ε; this follows on using the
regularity of Lebesgue measure. Since

    d(χ_{E₀}, fₙ) = ‖χ_{E₀} − fₙ‖ ≤ d(χ_{E₀}, χ_F) + d(χ_F, fₙ),

and since

    d(χ_{E₀}, χ_F) = (∫ |χ_{E₀} − χ_F|² dµ)^{1/2} = µ(E₀\F)^{1/2} < ε^{1/2},

it follows that d(χ_{E₀}, fₙ) < 2ε^{1/2} for all sufficiently large n. Since ε > 0 was arbitrary,
we have thus proved that the characteristic functions of measurable subsets of finite
measure can be approximated in L²-norm by continuous functions which vanish
outside sets of finite measure. The proof is now completed by using Theorem 2.4.11
and the triangle inequality. □
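The approximating functions fₙ(x) = 1/(1 + n·dist(x, F)) from the proof can be visualised numerically. In the sketch below (NumPy assumed; the concrete choices E = [−1, 2] and F = [0, 1] are illustrative, not from the text), the L²-error ‖χ_F − fₙ‖ decreases with n:

```python
import numpy as np

# E = [-1, 2], F = [0, 1]; here dist(x, F) = max(0, -x, x - 1).
x = np.linspace(-1.0, 2.0, 30001)
dx = x[1] - x[0]
dist_F = np.maximum(0.0, np.maximum(-x, x - 1.0))
chi_F = ((x >= 0.0) & (x <= 1.0)).astype(float)

def l2_err(n):
    f_n = 1.0 / (1.0 + n * dist_F)     # continuous, equals 1 exactly on F
    return np.sqrt(np.sum((chi_F - f_n) ** 2) * dx)

# For this E and F one can compute ||chi_F - f_n|| = sqrt(2/(1+n)) exactly,
# so the errors below decrease towards 0 like 1/sqrt(n).
errs = [l2_err(n) for n in (1, 10, 100, 1000)]
```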

Problem Set 2.4

2.4.P1. For which real a does the function fₐ(t) = tᵃ exp(−t), t > 0, belong to
L²(0, ∞)? What is ‖fₐ‖ when defined?
2.4.P2. (a) Show that the subspace M = {x = {xₙ}_{n≥1} ∈ ℓ²: ∑_{n=1}^{∞} (1/n)xₙ = 0} is
closed in ℓ².
(b) Show that the subspace M = {x(t) ∈ L²[1, ∞): ∫_1^∞ (1/t)x(t) dt = 0} is closed
in L²[1, ∞).
2.4.P3. Show that L^p[0, 1], 1 ≤ p < ∞, p ≠ 2, is not a Hilbert space.

2.5 A Subspace of L²(X, M, µ)

The following subspace of L²(X, M, µ) will play an important role in the discussion
of applications of Hilbert space tools to problems in analysis. Here X = [a, b], M is
the σ-algebra of Lebesgue measurable subsets of [a, b] and µ is the Lebesgue
measure.
Example 2.5.1 Let [a, b] be a closed subinterval of ℝ. Let C[a, b] denote the space
of complex-valued continuous functions defined on [a, b] with inner product given
by

    (f, g) = ∫_a^b f(t)ḡ(t) dt, f, g ∈ C[a, b].    (2.31)

Then C[a, b] is an inner product space (see Example 2.1.3(iv)) which is dense in
L²[a, b]. Indeed, extend f ∈ L²[a, b] to ℝ by setting f = 0 outside [a, b]. The extended
function is defined on ℝ and is in L²(ℝ). There exists a continuous function
g vanishing outside a set of finite measure such that ‖f − g‖ < ε [see
Theorem 2.4.12]. Consider the restriction of g to [a, b], to be denoted by h. Then
the given f is such that ‖f − h‖ < ε. Moreover, C[a, b] ≠ L²[a, b], as the following
argument shows.
If two functions differ at a point and are both continuous there, then they differ
on a neighbourhood of that point. Consequently, they cannot be equivalent. It
follows that the function

    f(x) = 1 for x ∈ [0, ½),  f(x) = −1 for x ∈ (½, 1],  f(x) = 0 for x = ½

is not equivalent to any continuous function. This proves the assertion that
C[a, b] ≠ L²[a, b].
Thus, C[a, b] is an inner product space which is not a Hilbert space.
Remarks 2.5.2
(i) The reader will note that if g₁ and g₂ are continuous functions on [a, b] and
g₁ ~ g₂, then g₁ = g₂. So, if f ∈ L²[a, b] has a representative which is
continuous, then that representative is unique. Thus, C[a, b] → L²[a, b] is an
injection. Moreover, uniform convergence of continuous functions implies
convergence in L²-norm. The aforementioned implication stems from the
finiteness of the Lebesgue measure of [a, b], and it fails if the bounded interval
is replaced by an unbounded interval.
(ii) That the inner product space C[a, b] with the inner product defined by (2.31)
above is not complete can also be seen by exhibiting a Cauchy sequence in the
space which converges to an element not lying in the space.
Let a and b be −1 and 1, respectively, and consider the sequence

    fₙ(t) = 0 for −1 ≤ t ≤ 0,  fₙ(t) = nt for 0 < t ≤ 1/n,  fₙ(t) = 1 for 1/n < t ≤ 1.
48 2 Inner Product Spaces

Observe that (m > n)

_
1 1
1/ m 1/ n

Z1 Z1=m Z1=n
2 2
jfn ðtÞ fm ðtÞj dt ¼ ðmt ntÞ dt þ ð1 ntÞ2 dt
1 0 1=m

2 1 1 1
¼ ðm nÞ þ
3m3 n m
 
1 1 n2 1 1
n 2 þ
n m2 3 n3 m 3
ðm nÞ2
¼ :
3m2 n

The right-hand side of the above equality tends to zero as n, m ! ∞. Thus, the
sequence {fn}n  1 is Cauchy. We next show that fn ! f in the L2-norm, where f(t) = 0
for −1  t  0 and f(t) = 1 for 0 < t  1. In fact,

$$\int_{-1}^{1} |f_n(t) - f(t)|^2\,dt = \int_0^{1/n} (1 - nt)^2\,dt = \frac{1}{n} - \frac{1}{n} + \frac{1}{3n} = \frac{1}{3n} \to 0 \ \text{ as } n \to \infty.$$

The limit function is not equivalent to any continuous function for reasons similar
to those in Example 2.5.1.
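The closed form ‖f_n − f_m‖² = (m − n)²/3m²n obtained above can be checked numerically. The following sketch (plain Python, ours rather than the book's) approximates the squared L²-distance of the ramp functions by the midpoint rule and compares it with the closed form.

```python
# Numerical check (illustrative, not from the text) of the identity
# ||f_n - f_m||^2 = (m - n)^2 / (3 m^2 n) for the ramp functions f_n above.

def f(n, t):
    """The ramp function f_n on [-1, 1]."""
    if t <= 0:
        return 0.0
    if t <= 1.0 / n:
        return n * t
    return 1.0

def l2_dist_sq(n, m, steps=200000):
    """Midpoint-rule approximation of the squared L2 distance on [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        t = -1.0 + (i + 0.5) * h
        total += (f(n, t) - f(m, t)) ** 2 * h
    return total

n, m = 3, 7
exact = (m - n) ** 2 / (3 * m ** 2 * n)
print(abs(l2_dist_sq(n, m) - exact) < 1e-4)  # True: agrees to quadrature accuracy
```

The agreement holds for any pair m > n; only the quadrature step size limits the accuracy.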

Problem Set 2.5


2.5.P1. Prove that the system {1, t3, t6, …} has a dense linear span in the space
L2[0, 1] as well as in L2[−1, 1].

2.6 The Hilbert Space A(X)

Suppose X ⊂ C is an arbitrary bounded domain whose boundary consists of smooth
simple closed curves.
Consider the class of all holomorphic functions f in X for which the integral
2.6 The Hilbert Space A(X) 49

$$\iint_X |f|^2\,dm < \infty, \qquad (2.32)$$

where dm is the two-dimensional Lebesgue measure, exists. In order to interpret the
integral as a limit of Riemann integrals, we shall need the following:

Proposition 2.6.1 Let X be a nonempty open subset of C. Then there exists a
sequence {K_n}_{n≥1} of closed and bounded subsets of X such that X is their union.
Moreover, the sets K_n can be chosen to satisfy the following conditions:
(a) K_n ⊆ K°_{n+1}, n = 1, 2, …;
(b) every compact (closed and bounded) subset K of X is contained in K_n
for some n.

Proof For each positive integer n, let

$$K_n = \Bigl\{z : |z| \le n \ \text{ and } \ \operatorname{dist}(z, \mathbb{C}\setminus X) \ge \frac{1}{n}\Bigr\}.$$

Observe that K_n is both bounded and closed and hence compact or empty.
Moreover, K_n ⊆ X. Obviously,

$$K_n \subseteq \Bigl\{z : |z| < n+1 \ \text{ and } \ \operatorname{dist}(z, \mathbb{C}\setminus X) > \frac{1}{n+1}\Bigr\} \subseteq K^{\circ}_{n+1}.$$

This proves (a).
We next show that X = ⋃_{n=1}^∞ K_n. For, if z ∈ X, then there exist m_1 and m_2 such
that |z| ≤ m_1 and dist(z, C\X) ≥ 1/m_2. Thus, z ∈ K_n for some n. On the other hand,
each K_n ⊆ X. This proves that X = ⋃_{n=1}^∞ K_n.
In view of (a), it follows that X = ⋃_{n=1}^∞ K°_n. If K is a compact subset of X, then
the K°_n form an open cover of K. By Definition 1.2.16, the compact set K is covered
by finitely many K°_n. Consequently, K is contained in K_n for some large n. This
completes the proof. ∎
On letting

$$u_n(z) = \begin{cases} |f(z)|^2 & z \in K_n \\ 0 & z \in X \setminus K_n, \end{cases}$$

we see that u_n is monotonically increasing and lim_n u_n(z) = |f(z)|², z ∈ X. The
Dominated Convergence Theorem now implies

$$\iint_X u_n\,dm \to \iint_X |f|^2\,dm \quad \text{as } n \to \infty,$$

that is,
$$\iint_{K_n} |f|^2\,dm \to \iint_X |f|^2\,dm.$$

We now compute, in terms of Taylor coefficients, the integral (2.32) for
X = {z : |z| < R} and a holomorphic function f defined in X.
The function f can be expanded in a Taylor series

$$f(z) = \sum_{n=0}^{\infty} a_n z^n,$$

where a_n = f^{(n)}(0)/n!, n = 0, 1, 2, …

$$\iint_X |f|^2\,dm = \iint_X \Bigl|\sum_{n=0}^{\infty} a_n z^n\Bigr|^2 dm = \iint \Bigl(\sum_{n=0}^{\infty} a_n r^n e^{in\theta}\Bigr)\Bigl(\sum_{k=0}^{\infty} \bar a_k r^k e^{-ik\theta}\Bigr)\,r\,dr\,d\theta$$
$$= \int_0^R\!\!\int_0^{2\pi} \sum_{n=0}^{\infty}\sum_{k=0}^{\infty} a_n \bar a_k\, r^{n+k+1} e^{i(n-k)\theta}\,dr\,d\theta = \sum_{n=0}^{\infty}\sum_{k=0}^{\infty} a_n \bar a_k \int_0^{2\pi} e^{i(n-k)\theta}\,d\theta \int_0^R r^{n+k+1}\,dr,$$

where term-by-term integration is valid because the power series representation
converges uniformly on |z| ≤ r for r < R. Then

$$\iint_X |f|^2\,dm = 2\pi \sum_{n=0}^{\infty} |a_n|^2 \int_0^R r^{2n+1}\,dr = \pi \sum_{n=0}^{\infty} |a_n|^2 \frac{R^{2n+2}}{n+1}. \qquad (2.33)$$
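Formula (2.33) lends itself to a direct numerical check. The sketch below (plain Python, our illustration rather than the book's) evaluates the double integral in polar coordinates for the concrete choice f(z) = 1 + z on the unit disc, where (2.33) predicts the value π(|a_0|² + |a_1|²/2) = 3π/2.

```python
# Illustrative numerical check (not from the text) of formula (2.33) for
# f(z) = 1 + z on the disc |z| < R with R = 1: the exact value is
# pi * (|a_0|^2 * R^2 / 1 + |a_1|^2 * R^4 / 2) = 3*pi/2.
import math

def integral_abs_f_sq(nr=400, ntheta=400):
    """Midpoint rule for the double integral of |1 + z|^2 in polar coordinates."""
    total = 0.0
    dr, dth = 1.0 / nr, 2 * math.pi / ntheta
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(ntheta):
            th = (j + 0.5) * dth
            # |1 + r e^{i th}|^2 = 1 + 2 r cos(th) + r^2
            total += (1 + 2 * r * math.cos(th) + r * r) * r * dr * dth
    return total

exact = math.pi * 1.5
print(abs(integral_abs_f_sq() - exact) < 1e-3)  # True
```

The same comparison can be run for any polynomial f, since its Taylor coefficients are finite in number.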

Definition 2.6.2 Let X be a bounded domain in the complex plane. The class of all
holomorphic functions in X for which the integral ∬_X |f|² dm is finite is denoted by
A(X) and is known as the Bergman space. Briefly,

$$A(X) = \Bigl\{f : f \text{ is holomorphic in } X \text{ and } \iint_X |f|^2\,dm < \infty\Bigr\}.$$

The integral is to be understood in the sense of lim_n ∬_{K_n} |f(z)|² dm(z), where {K_n}
is a nondecreasing sequence of compact subsets of X whose union is X.

The following inequality will be useful in proving that A(X) is a Hilbert space.
Proposition 2.6.3 Suppose f ∈ A(X) and d_z = dist(z, ∂X). Then

$$|f(z)|^2 \le \Bigl(\iint_X |f|^2\,dm\Bigr)\Bigl/\,\pi d_z^2. \qquad (2.34)$$

Proof Let D be the disc with centre z and radius d_z. Clearly,

$$\iint_X |f|^2\,dm \ge \iint_D |f|^2\,dm.$$

It follows from the relation (2.33) above that

$$\iint_X |f|^2\,dm \ge \pi |a_0|^2 d_z^2 = \pi |f(z)|^2 d_z^2.$$

So,

$$|f(z)|^2 \le \Bigl(\iint_X |f|^2\,dm\Bigr)\Bigl/\,\pi d_z^2.$$

This completes the proof. ∎
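The mean-value bound (2.34) can also be observed numerically. The sketch below (plain Python, ours rather than the book's) takes f(z) = e^z on the unit disc D, where z = 0 has d_z = 1, so (2.34) asserts |f(0)|² = 1 ≤ (1/π)∬_D |e^z|² dm.

```python
# Numerical illustration (not from the text) of the mean-value bound (2.34)
# for f(z) = e^z at z = 0 on the unit disc D, where d_z = 1:
# |f(0)|^2 = 1 should not exceed (1/pi) * integral over D of |e^z|^2 dm.
import math

def integral_disc(nr=300, ntheta=300):
    """Midpoint rule for the integral of |e^z|^2 = e^{2 r cos(theta)} over D."""
    total = 0.0
    dr, dth = 1.0 / nr, 2 * math.pi / ntheta
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(ntheta):
            th = (j + 0.5) * dth
            total += math.exp(2 * r * math.cos(th)) * r * dr * dth
    return total

bound = integral_disc() / math.pi   # approximately 1.59 by (2.33)
print(1.0 <= bound)  # True
```

By (2.33) with a_n = 1/n!, the right-hand side equals Σ 1/((n!)²(n+1)) ≈ 1.59, comfortably above 1.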


The inequality |a + b|² ≤ 2(|a|² + |b|²) implies that

$$|\alpha f(z) + \beta g(z)|^2 \le 2\bigl(|\alpha|^2 |f(z)|^2 + |\beta|^2 |g(z)|^2\bigr) \qquad (2.35)$$

for any two functions f, g ∈ A(X); further, the identity

$$f\bar g = \frac{1}{2}\,|f+g|^2 + \frac{i}{2}\,|f+ig|^2 - \frac{1+i}{2}\,|f|^2 - \frac{1+i}{2}\,|g|^2 \qquad (2.36)$$

holds.
Definition 2.6.4 For f, g 2 A(X), we write
$$(f, g) = \iint_X f \bar g\,dm. \qquad (2.37)$$

That the right side of (2.37) is finite follows from (2.35) and (2.36) above. It is
easily verified that (2.37) defines an inner product on A(X).
Theorem 2.6.5 With (f, g) defined as in (2.37), A(X) is a Hilbert space.

Proof We show that A(X) is a complete inner product space. In view of (2.35),
A(X) is closed under addition and scalar multiplication.
As usual, A(X) with norm defined by
$$\|f\| = (f, f)^{1/2} = \Bigl(\iint_X |f|^2\,dm\Bigr)^{1/2},$$

becomes a normed space [see Definition 2.2.2 and Theorem 2.2.6]. It remains to
show that A(X) is complete in this norm. Suppose {f_n}_{n≥1} is a Cauchy sequence in
this norm, that is,
$$\|f_n - f_p\|^2 = \iint_X |f_n - f_p|^2\,dm < \varepsilon$$

for n, p ≥ N. For each compact set K ⊂ X, it follows, on using Proposition 2.6.3,
that

$$|f_n(z) - f_p(z)|^2 < \varepsilon/\pi d^2 \quad (z \in K),$$

where d = inf{d(z, x) : z ∈ K and x ∈ ∂X}. This means by Weierstrass' Theorem that
on each compact subset K ⊂ X, the sequence {f_n}_{n≥1} converges uniformly to a
holomorphic function f:

$$f_n(z) \to f(z) \ \text{ as } n \to \infty, \quad z \in K \subset X.$$

Since
$$\iint_K |f_n - f_p|^2\,dm \le \iint_X |f_n - f_p|^2\,dm, \quad n, p \ge N,$$

it follows, on letting p → ∞, that

$$\iint_K |f_n - f|^2\,dm \le \varepsilon, \quad n \ge N,$$

for each compact set K; hence,

$$\iint_X |f_n - f|^2\,dm \le \varepsilon, \quad n \ge N.$$

The last inequality implies f ∈ A(X) and ||f_n − f|| → 0 as n → ∞. This completes the
proof. ∎

Problem Set 2.6


2.6.P1. Let X be an arbitrary domain in C whose boundary consists of a finite
number of smooth simple closed curves and let A(X) be the collection of all
holomorphic functions f : X → C for which

$$\iint_X |f(z)|^2\,dx\,dy \ \Bigl[\text{same as } \iint_X |f|^2\,dm \text{ of Sect. 2.6}\Bigr] < \infty$$

holds.
(a) Show that every f ∈ A(X), where X = {z ∈ C: 0 < |z| < 1}, has a removable
singularity at z = 0;
(b) Show that if a ∈ X, then {f ∈ A(X): f(a) = 0} is closed in A(X).

2.7 Direct Sum of Hilbert Spaces

The definition of the external direct sum of vector spaces [Definition 1.1.7] is
extended to any arbitrary family of vector spaces (each vector space is over the field
R of real numbers or the field C of complex numbers), beginning with any finite such
family. This procedure lays bare the intricacies involved and aids understanding.
The direct sum

$$H = H_1 \oplus H_2 \oplus \cdots \oplus H_n$$

of the vector spaces H_1, H_2, …, H_n is the set H = H_1 × H_2 × ⋯ × H_n, in which the
addition and scalar multiplication are defined by the formulas

$$(x_1, x_2, \ldots, x_n) + (y_1, y_2, \ldots, y_n) = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n),$$
$$\lambda(x_1, x_2, \ldots, x_n) = (\lambda x_1, \lambda x_2, \ldots, \lambda x_n),$$

where (x_1, x_2, …, x_n), (y_1, y_2, …, y_n) are in H_1 × H_2 × ⋯ × H_n and λ ∈ R or C.
It is quite clear that H contains a subspace Yi of H, where

$$Y_i = \{(x_1, x_2, \ldots, x_n) : x_j = 0 \text{ for } j \ne i\},$$

which is isomorphic to Hi. It is sometimes convenient to refer to the space Hi itself


as a subspace of H, and when such reference is made, it is the isomorphic space Yi
that is to be understood. The map of H into Hi given by

$$(x_1, x_2, \ldots, x_n) \mapsto (0, \ldots, 0, x_i, 0, \ldots, 0)$$

is a projection and is sometimes called the projection of H onto H_i.


If H1, H2, …, Hn are Hilbert spaces, then H is the uniquely determined Hilbert
space with inner product
$$\bigl((x_1, x_2, \ldots, x_n), (y_1, y_2, \ldots, y_n)\bigr) = \sum_{i=1}^{n} (x_i, y_i)_i, \qquad (2.38)$$

where (·,·)_i is the inner product in H_i. Then the norm in a direct sum of Hilbert
spaces is given by

$$\|(x_1, x_2, \ldots, x_n)\| = \bigl|\bigl((x_1, x_2, \ldots, x_n), (x_1, x_2, \ldots, x_n)\bigr)\bigr|^{1/2}. \qquad (2.39)$$

To summarise, we state the following definition:


Definition 2.7.1 For each i = 1, 2, …, n, let Hi be a Hilbert space with inner
product (,)i. The direct sum of Hilbert spaces H1, H2, …, Hn is the vector space
H = H1 ⊕ H2 ⊕    ⊕ Hn in which the inner product and the norm are defined by
(2.38) and (2.39).
Henceforth, the subscripts i in the notation for the inner products and the norms
will be omitted because the context will make it clear which one is intended.
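Definition 2.7.1 can be made concrete with a small computation. The sketch below (plain Python, our illustration; the component spaces and weights are chosen for the example, not taken from the text) forms the direct-sum inner product (2.38) of two copies of R² carrying different inner products, and checks that the norm (2.39) adds the squared component norms.

```python
# A small illustration (not from the text) of the direct-sum inner product
# (2.38): H = H1 (+) H2, with H1 = R^2 under the standard inner product and
# H2 = R^2 under a weighted one (weights chosen for the example).
import math

def ip1(x, y):                      # inner product on H1
    return sum(a * b for a, b in zip(x, y))

def ip2(x, y):                      # a weighted inner product on H2
    w = (1.0, 3.0)
    return sum(wi * a * b for wi, a, b in zip(w, x, y))

def ip_sum(u, v):                   # (2.38): sum of the component inner products
    return ip1(u[0], v[0]) + ip2(u[1], v[1])

def norm_sum(u):                    # (2.39)
    return math.sqrt(ip_sum(u, u))

u = ((1.0, 2.0), (0.0, 1.0))
# ||u||^2 = ||u_1||_1^2 + ||u_2||_2^2 = (1 + 4) + 3 = 8
print(abs(norm_sum(u) - math.sqrt(8.0)) < 1e-12)  # True
```

The same pattern extends verbatim to any finite number of component spaces.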
Proposition 2.7.2 With notations as above, H = H1 ⊕ H2 ⊕    ⊕ Hn is a Hilbert
space.
Proof It must be shown that (2.38) defines an inner product on H and that H is complete
with respect to the norm defined by (2.39). We shall check the second assertion
only.
Let {(x_1^{(m)}, x_2^{(m)}, …, x_n^{(m)})}_{m≥1} be a Cauchy sequence in H, that is,
Σ_{i=1}^n ||x_i^{(m)} − x_i^{(ℓ)}||² → 0 as m, ℓ → ∞. For each k, ||x_k^{(m)} − x_k^{(ℓ)}||² ≤ Σ_{i=1}^n ||x_i^{(m)} − x_i^{(ℓ)}||²
shows that {x_k^{(m)}}_{m≥1} is Cauchy in H_k. Since H_k is a Hilbert space, there exists x_k in
H_k such that x_k^{(m)} → x_k as m → ∞. Clearly, the vector (x_1, x_2, …, x_n) is in H. It will
be shown that (x_1^{(m)}, x_2^{(m)}, …, x_n^{(m)}) → (x_1, x_2, …, x_n) in H. Let ε > 0 be given. For
each k, there exists an integer m_k such that

$$\|x_k^{(m)} - x_k\| < \varepsilon/\sqrt{n} \quad \text{for } m \ge m_k.$$

Consequently,

$$\sum_{k=1}^{n} \|x_k^{(m)} - x_k\|^2 < \varepsilon^2 \quad \text{for } m \ge \max\{m_1, m_2, \ldots, m_n\}.$$

This completes the proof of the assertion made. ∎



We next define H_1 ⊕ H_2 ⊕ ⋯, also written as ⊕_{i=1}^∞ H_i, for a sequence of Hilbert
spaces H_1, H_2, …. Let

$$H = \Bigl\{\{x_n\}_{n \ge 1} : x_n \in H_n,\ n = 1, 2, \ldots \ \text{ and } \ \sum_{n=1}^{\infty} \|x_n\|^2 < \infty\Bigr\}.$$

For x = {x_n}_{n≥1} and y = {y_n}_{n≥1} in H, define

$$(x, y) = \sum_{n=1}^{\infty} (x_n, y_n). \qquad (2.40)$$

The sum on the right is seen to be finite by using the Cauchy–Schwarz Inequality
for each H_i and then for ℓ². It can then be verified that (·,·) is an inner product on
H and the norm relative to the inner product is ||x|| = (Σ_{n=1}^∞ ||x_n||²)^{1/2}.
With this inner product, H can be shown to be a Hilbert space.
Proposition 2.7.3 With notations as above, H = H_1 ⊕ H_2 ⊕ ⋯ = ⊕_{i=1}^∞ H_i is a
Hilbert space.
Proof It must be shown that (2.40) defines an inner product on H and that H is complete
with respect to the norm defined by

$$\|x\| = \Bigl(\sum_{n=1}^{\infty} \|x_n\|^2\Bigr)^{1/2}.$$

For x = {x_n}_{n≥1} and y = {y_n}_{n≥1} in H,

$$\sum_{n=1}^{\infty} |(x_n, y_n)| \le \sum_{n=1}^{\infty} \|x_n\|\,\|y_n\| \le \Bigl(\sum_{n=1}^{\infty} \|x_n\|^2\Bigr)^{1/2}\Bigl(\sum_{n=1}^{\infty} \|y_n\|^2\Bigr)^{1/2},$$

using the Cauchy–Schwarz Inequality twice. Hence, the series on the right of (2.40)
converges absolutely. Consequently, (·,·) is well defined. It is a routine exercise to
show that (·,·) is an inner product on H. It remains to show that H is a complete
space. Suppose {x^{(m)}}_{m≥1} = {(x_1^{(m)}, x_2^{(m)}, …)}_{m≥1} is a Cauchy sequence in H, that
is, ||x^{(m)} − x^{(n)}|| → 0 as m, n → ∞. For each k, ||x_k^{(m)} − x_k^{(n)}||² ≤ Σ_{j=1}^∞ ||x_j^{(m)} − x_j^{(n)}||²
= ||x^{(m)} − x^{(n)}||², which shows that the sequence {x_k^{(1)}, x_k^{(2)}, …} of kth components is
Cauchy. Since H_k is a Hilbert space, x_k^{(n)} → x_k as n → ∞ for suitable x_k in H_k. It will
be shown that Σ_{k=1}^∞ ||x_k||² < ∞ and x^{(n)} → x, where x = {x_k}_{k≥1}.
Given ε > 0, let p be an integer such that ||x^{(m)} − x^{(n)}|| < ε whenever m, n ≥
p. For any positive integer r, one has

$$\sum_{k=1}^{r} \|x_k^{(m)} - x_k^{(n)}\|^2 \le \|x^{(m)} - x^{(n)}\|^2 \le \varepsilon^2,$$

provided m, n ≥ p. Letting m → ∞,

$$\sum_{k=1}^{r} \|x_k - x_k^{(n)}\|^2 \le \varepsilon^2$$

provided n ≥ p. Since r is arbitrary,

$$\sum_{k=1}^{\infty} \|x_k - x_k^{(n)}\|^2 \le \varepsilon^2 \qquad (2.41)$$

provided n ≥ p. In particular,

$$\sum_{k=1}^{\infty} \|x_k - x_k^{(p)}\|^2 \le \varepsilon^2;$$

hence, the sequence {x_k − x_k^{(p)}}_{k≥1} belongs to H. Consequently, the sequence

$$\{x_k\}_{k \ge 1} = \bigl\{(x_k - x_k^{(p)}) + x_k^{(p)}\bigr\}_{k \ge 1}$$

belongs to H. It follows from (2.41) that ||x − x^{(n)}|| ≤ ε whenever n ≥ p. Thus
x^{(n)} → x. ∎
Definition 2.7.4 If H1, H2, … are Hilbert spaces, the space H in Proposition 2.7.3 is
called the direct sum of H1, H2, …
For our next definition, a summation over an arbitrary (possibly uncountable)
indexing set is to be understood in the following sense:
Suppose S = {x_α : α ∈ K}, where K is an indexing set, is a collection of elements
from a normed linear space X. {x_α : α ∈ K} is said to be summable to x ∈ X,
written

$$\sum_{\alpha \in K} x_\alpha = x \quad \text{or} \quad \sum_{\alpha} x_\alpha = x,$$

if for all ε > 0, there exists some finite set of indices J_0 ⊆ K, such that for any finite
set of indices J ⊇ J_0,

$$\Bigl\|x - \sum_{\alpha \in J} x_\alpha\Bigr\| < \varepsilon.$$

Definition 2.7.5 For each α in the index set K, let H_α be a Hilbert space. The direct
sum ⊕_α H_α of the Hilbert spaces H_α is defined to be the family H of all functions {x_α} on
K such that for each α, x_α ∈ H_α and Σ_{α∈K} ||x_α||² < ∞.
If x, y ∈ H, then (x, y) = Σ_α (x_α, y_α) defines an inner product on H, and H is a Hilbert space
with respect to the norm ||x|| = (Σ_{α∈K} ||x_α||²)^{1/2}. The proof is not included.
Permuting the index set K results in an isomorphic Hilbert space.
Let X = Cⁿ[a, b], the linear space of all scalar-valued n times continuously
differentiable functions on [a, b]. For x, y in X, define

$$(x, y) = \sum_{j=0}^{n} \int_a^b x^{(j)}(t)\,\overline{y^{(j)}(t)}\,dt.$$

Clearly, (·,·) is an inner product on X and

$$\|x\|^2 = (x, x) = \sum_{j=0}^{n} \int_a^b \bigl|x^{(j)}(t)\bigr|^2\,dt, \quad x \in X.$$

Let
H = {x ∈ C[a, b] : x^{(n−1)} is defined and absolutely continuous, x^{(n)} ∈ L²[a, b]}.
For x, y ∈ H, let

$$(x, y) = \sum_{j=0}^{n} \int_a^b x^{(j)}(t)\,\overline{y^{(j)}(t)}\,dt.$$

It can be seen that (·,·) is an inner product on H and

$$\|x\|^2 = (x, x) = \sum_{j=0}^{n} \int_a^b \bigl|x^{(j)}(t)\bigr|^2\,dt, \quad x \in H.$$

Theorem 2.7.6 With notations as in the paragraph above, H is a Hilbert space and
X is dense in H.
Proof Consider the direct sum of (n + 1) copies of L²[a, b], i.e.

$$\oplus_{(n+1)\ \text{copies}}\ L^2[a, b].$$

Then ⊕_{(n+1) copies} L²[a, b] is seen to be a Hilbert space by Proposition 2.7.2. Let
T : H → ⊕_{n+1} L²[a, b] be defined by

 
$$Tx = \bigl(x, x^{(1)}, x^{(2)}, \ldots, x^{(n)}\bigr).$$

Observe that T is both linear and injective; moreover, it preserves inner products.
We shall next show that T(H) is a closed subspace of ⊕_{n+1}L²[a, b] and is consequently
a Hilbert space. This will imply that H is a Hilbert space.
Let {x_k}_{k≥1} be a Cauchy sequence in H and let

$$y = (y_0, y_1, \ldots, y_n) = \lim_{k\to\infty} Tx_k = \lim_{k\to\infty}\bigl(x_k, x_k^{(1)}, x_k^{(2)}, \ldots, x_k^{(n)}\bigr),$$

that is, {x_k^{(j)}}_{k≥1} converges in L²[a, b] to y_j, j = 0, 1, 2, …, n, where x_k^{(0)} means x_k.
Now, for j = 1, 2, …, n and each t ∈ [a, b],

$$x_k^{(j-1)}(t) = x_k^{(j-1)}(a) + \int_a^t x_k^{(j)}(s)\,ds. \qquad (2.42)$$

Observe that

$$\Bigl|\int_a^t y_j(s)\,ds - \int_a^t x_k^{(j)}(s)\,ds\Bigr| \le (b-a)^{1/2}\,\bigl\|y_j - x_k^{(j)}\bigr\| \qquad (2.43)$$

for j = 1, 2, …, n and all t ∈ [a, b]. Therefore, the sequence of continuous
functions {∫_a^t x_k^{(j)}(s) ds}_{k≥1} is uniformly convergent to the continuous function ∫_a^t y_j(s) ds. It
is also convergent as a sequence in L²[a, b]. But {x_k^{(j−1)}}_{k≥1} is convergent in
L²[a, b] to y_{j−1}, and so by (2.42), the sequence {x_k^{(j−1)}(a)}_{k≥1} of constant functions
is convergent in L²[a, b]. Therefore, the sequence {x_k^{(j−1)}(a)}_{k≥1} is convergent in C,
and the function on the right of (2.42) is uniformly convergent to a continuous
function. Thus, it follows that

$$y_{j-1}(t) = y_{j-1}(a) + \int_a^t y_j(s)\,ds,$$

and hence y_{j−1} is an absolutely continuous function. We have shown that y_0 ∈ H.


Consequently, y = Tx, where x = y0.
We next show that X is dense in H.
Let x ∈ H. Then x^{(n)} ∈ L²[a, b], so that we can find a sequence {z_m}_{m≥1} in
C[a, b] such that ||z_m − x^{(n)}||_2 → 0 as m → ∞. Define recursively u_m^{(1)}, u_m^{(2)}, …, u_m^{(n)}
by the formula

$$u_m^{(j)}(t) = \int_a^t u_m^{(j-1)}(s)\,ds + x^{(n-j)}(a), \quad j = 1, 2, \ldots, n,$$

where u_m^{(0)} = z_m. Observe that u_m^{(n)} ∈ X. We claim that

$$\bigl\|u_m^{(j)} - x^{(n-j)}\bigr\|_2 \to 0 \ \text{ as } m \to \infty.$$

The result is true for j = 0. Assume that it is true for some j ≥ 0. Then

$$\bigl|u_m^{(j+1)}(t) - x^{(n-(j+1))}(t)\bigr| \le \bigl\|u_m^{(j)} - x^{(n-j)}\bigr\|_2\,(b-a)^{1/2},$$

using (2.42) and (2.43) above.
Hence, {u_m^{(j+1)}(t)}_{m≥1} converges uniformly to x^{(n−(j+1))} for t ∈ [a, b], so that
||u_m^{(j+1)} − x^{(n−(j+1))}||_2 → 0 as m → ∞. This completes the argument for j = 1, 2, …,
n. Consequently, u_m^{(n)} → x in H. The proof is now complete. ∎

2.8 Orthogonal Complements

In the familiar Euclidean space, we assign a length to each vector and to each pair
of vectors an angle between them. The first notion has been made abstract in the
definition of a norm. An appropriate notion of angle and the associated notion of
orthogonality are introduced below. The introduction of the concept of orthogonality
depends on the definition of inner product in a pre-Hilbert space.
Recall from Definition 2.1.1 that a real vector space H equipped with an inner
product is called a real pre-Hilbert space. The angle h between two nonzero vectors
in a real pre-Hilbert space may be defined in a manner consistent with the properties
of an inner product by means of the relation (x, y) = ||x|| ||y||cos h. Observe that the
Cauchy–Schwarz Inequality then says that |cos h|  1. This definition is not
satisfactory in a complex pre-Hilbert space, for (x, y) is in general a complex
number. Nevertheless, if the condition (x, y) = 0 is taken as the definition of
orthogonality (perpendicularity), then the concept is just as useful here as in the real
case.
Definition 2.8.1 Let H be a pre-Hilbert space. Two vectors x and y in H are said to
be orthogonal if (x, y) = 0; we write x ⊥ y.
Since (x, y) = 0 implies (y, x) = 0, we have x ⊥ y if, and only if, y ⊥ x. It is also
clear that x ⊥ 0 for every x. Also, the relation (x, x) = ||x||2 shows that 0 is the only
vector orthogonal to itself.
Definition 2.8.2 A set M of nonzero vectors in a pre-Hilbert space H is said to be
an orthogonal set if x ⊥ y whenever x and y are distinct vectors of M. A set M of
vectors in a pre-Hilbert space H is said to be orthonormal if
60 2 Inner Product Spaces

(i) M is orthogonal and (ii) ||x|| = 1 for every x ∈ M.

An orthonormal set M of vectors in a pre-Hilbert space is called a complete (or
maximal) orthonormal system provided it is not a proper subset of some other
orthonormal set.
Remarks 2.8.3
(i) If x is orthogonal to y_1, y_2, …, y_n, then x is orthogonal to every linear
combination of the y_k. In fact, if x ⊥ y_k for all k and y = Σ_{k=1}^n λ_k y_k, then

$$(x, y) = \Bigl(x, \sum_{k=1}^{n} \lambda_k y_k\Bigr) = \sum_{k=1}^{n} \bar\lambda_k (x, y_k) = 0.$$

(ii) If x ⊥ y, then ||x + y||² = ||x||² + ||y||², since ||x + y||² = (x + y, x + y) =
||x||² + (x, y) + (y, x) + ||y||² = ||x||² + ||y||², using (x, y) = 0 = (y, x).
(iii) An orthogonal subset M of H not containing the zero vector is linearly
independent. Indeed, if Σ_{k=1}^n λ_k y_k = 0, where y_1, y_2, …, y_n are orthogonal,
then on taking the inner product of the sum on the left-hand side with y_m, we
find that λ_m = 0.

Examples 2.8.4
(i) The sequence {x_j}_{j≥1}, where x_j = (0, 0, …, 0, λ_j, 0, …) and the nonzero scalar λ_j
occurs at the jth place, in the space ℓ_0 of finitely nonzero sequences, is an
orthogonal sequence (a sequence whose range is an orthogonal set).
The sequence {e_j}_{j≥1}, where e_j = (0, 0, …, 0, 1, 0, …) and 1 occurs at the jth
place, is an orthonormal sequence in the space ℓ_0 of finitely nonzero
sequences.
(ii) Let H = C[−π, π] and let f_n(x) = sin nx, n = 1, 2, … and g_n(x) = cos nx,
n = 1, 2, …. Since

Zp Zp
sin mx sin nx dx ¼ 0 ¼ cos mx cos nx dx; n 6¼ m;
p p

it follows that {f_n}_{n≥1} and {g_n}_{n≥1} are orthogonal sequences in C[−π, π].
In the space, the vectors

$$u_n(x) = \frac{1}{\sqrt{\pi}}\,\sin nx, \quad n = 1, 2, \ldots$$

form an orthonormal sequence and so do the vectors



$$v_0(x) = \frac{1}{\sqrt{2\pi}}, \quad v_n(x) = \frac{1}{\sqrt{\pi}}\,\cos nx, \quad n = 1, 2, \ldots.$$

Also, note that the vectors

$$v_0, v_1, u_1, v_2, u_2, \ldots$$

form an orthonormal sequence in C[−π, π].


Recall that (f, u_k), k = 1, 2, … and (f, v_k), k = 0, 1, 2, … are called Fourier
coefficients of the function f ∈ C[−π, π].
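The orthonormality of the trigonometric system above is easy to confirm numerically. The following sketch (plain Python, ours rather than the book's) approximates the C[−π, π] inner product by the midpoint rule and checks a few of the relations.

```python
# A quick numerical sanity check (illustrative, not part of the text) that
# u_n(x) = sin(nx)/sqrt(pi) and v_n(x) = cos(nx)/sqrt(pi) are orthonormal
# in C[-pi, pi] with the L^2 inner product, via the midpoint rule.
import math

def inner(f, g, steps=20000):
    h = 2 * math.pi / steps
    return sum(f(-math.pi + (i + 0.5) * h) * g(-math.pi + (i + 0.5) * h)
               for i in range(steps)) * h

u = lambda n: (lambda x: math.sin(n * x) / math.sqrt(math.pi))
v = lambda n: (lambda x: math.cos(n * x) / math.sqrt(math.pi))

print(abs(inner(u(2), u(3))) < 1e-6)        # distinct sines: orthogonal
print(abs(inner(u(2), u(2)) - 1) < 1e-6)    # normalised
print(abs(inner(u(1), v(4))) < 1e-6)        # sine vs cosine: orthogonal
```

The midpoint rule is essentially exact here, since the integrands are smooth and periodic over a full period.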
(iii) The sequence u_n(z) = √(n/π)\,z^{n−1}, n = 1, 2, … is orthonormal in A(D), where
D = {z ∈ C: |z| < 1}. In fact,

$$(u_n, u_m) = \iint_D u_n \bar u_m\,dx\,dy = \frac{\sqrt{nm}}{\pi}\int_0^1\!\!\int_0^{2\pi} r^{n+m-1} e^{i(n-m)\theta}\,dr\,d\theta = \frac{\sqrt{nm}}{\pi(n+m)}\int_0^{2\pi} e^{i(n-m)\theta}\,d\theta = \begin{cases} 0 & \text{if } n \ne m \\ 1 & \text{if } n = m. \end{cases}$$

The following definition generalises the notion of Fourier coefficients to any
arbitrary infinite-dimensional pre-Hilbert space.
Definition 2.8.5 If {x_n}_{n≥1} is an orthonormal sequence in a pre-Hilbert space H,
then for any x ∈ H, (x, x_n) is called the Fourier coefficient of x with respect to
{x_n}_{n≥1}. The Fourier series of x with respect to {x_n}_{n≥1} is the series
Σ_{n=1}^∞ (x, x_n)x_n.
In the Hilbert space ℓ², let

$$e_1 = (1, 0, 0, \ldots), \quad e_2 = (0, 1, 0, \ldots), \quad e_3 = (0, 0, 1, \ldots), \ldots.$$

Then {e_n}_{n≥1} is an orthonormal sequence in ℓ². If x = {λ_j}_{j≥1} ∈ ℓ², then (x, e_j) = λ_j
and x = Σ_{j=1}^∞ (x, e_j)e_j is its Fourier series with respect to the orthonormal sequence
{e_n}_{n≥1}. Observe that Σ_{j=1}^∞ |(x, e_j)|² = Σ_{j=1}^∞ |λ_j|² < ∞. That this result holds for
any orthonormal sequence is a consequence of the following:

Theorem 2.8.6 (Bessel's Inequality) Let x_1, x_2, …, x_n be orthonormal vectors in a
pre-Hilbert space H. For every x ∈ H,

$$\Bigl\|x - \sum_{k=1}^{n} (x, x_k)x_k\Bigr\|^2 = \|x\|^2 - \sum_{k=1}^{n} |(x, x_k)|^2;$$

hence

$$\sum_{k=1}^{n} |(x, x_k)|^2 \le \|x\|^2.$$

Proof For λ_1, λ_2, …, λ_n ∈ C,

$$\Bigl\|\sum_{k=1}^{n} \lambda_k x_k\Bigr\|^2 = \Bigl(\sum_{k=1}^{n} \lambda_k x_k, \sum_{k=1}^{n} \lambda_k x_k\Bigr) = \sum_{k=1}^{n} |\lambda_k|^2.$$

So,

$$\Bigl\|x - \sum_{k=1}^{n} \lambda_k x_k\Bigr\|^2 = \Bigl(x - \sum_{k=1}^{n} \lambda_k x_k,\ x - \sum_{k=1}^{n} \lambda_k x_k\Bigr)$$
$$= \|x\|^2 - \sum_{k=1}^{n} \lambda_k (x_k, x) - \sum_{k=1}^{n} \bar\lambda_k (x, x_k) + \sum_{k=1}^{n} |\lambda_k|^2 = \|x\|^2 - \sum_{k=1}^{n} |(x_k, x)|^2 + \sum_{k=1}^{n} |(x, x_k) - \lambda_k|^2.$$

In particular, if λ_k = (x, x_k), then

$$\Bigl\|x - \sum_{k=1}^{n} (x, x_k)x_k\Bigr\|^2 = \|x\|^2 - \sum_{k=1}^{n} |(x_k, x)|^2.$$

Since the left-hand side of the above equality is nonnegative, we get

$$\sum_{k=1}^{n} |(x, x_k)|^2 \le \|x\|^2. \qquad \blacksquare$$

Since Bessel's Inequality holds for any n orthonormal vectors, it yields the
following corollary:
Corollary 2.8.7 If x1, x2, … is any orthonormal sequence of vectors, then for any x
in the pre-Hilbert space H,

$$\sum_{n=1}^{\infty} |(x, x_n)|^2 \le \|x\|^2.$$

In particular, (x, x_n) → 0 as n → ∞.
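Bessel's Inequality can be watched at work numerically. The sketch below (plain Python, our illustration; the test function f(x) = x is our choice, not the book's) accumulates the squared Fourier coefficients of f against the sine system u_n of Example 2.8.4(ii) and confirms that the partial sums never exceed ‖f‖², while creeping up towards it.

```python
# Numerical illustration (not from the text) of Bessel's Inequality in
# C[-pi, pi] for f(x) = x: the partial sums of |(f, u_n)|^2 stay below
# ||f||^2 = 2*pi^3/3, and in fact approach it.
import math

def inner(f, g, steps=20000):
    h = 2 * math.pi / steps
    return sum(f(-math.pi + (i + 0.5) * h) * g(-math.pi + (i + 0.5) * h)
               for i in range(steps)) * h

f = lambda x: x
norm_sq = inner(f, f)                       # ~ 2*pi^3 / 3
bessel_sum = 0.0
for n in range(1, 50):
    u_n = lambda x, n=n: math.sin(n * x) / math.sqrt(math.pi)
    bessel_sum += inner(f, u_n) ** 2
    assert bessel_sum <= norm_sq + 1e-9     # Bessel's Inequality at each stage

print(norm_sq - bessel_sum < 0.3)  # True: the partial sums approach ||f||^2
```

Here (f, u_n) = 2√π(−1)^{n+1}/n, so the gap that remains after 49 terms is 4π times the tail of Σ 1/n².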


Remarks 2.8.8
(i) As a special case of Corollary 2.8.7, we obtain the following inequality in the
pre-Hilbert space C[−π, π]: For f ∈ C[−π, π],

$$\sum_{n=1}^{\infty} |(f, u_n)|^2 + \sum_{n=0}^{\infty} |(f, v_n)|^2 \le \int_{-\pi}^{\pi} |f(x)|^2\,dx,$$

where u_n(x) = (1/√π) sin nx, v_0(x) = 1/√(2π), v_n(x) = (1/√π) cos nx, n = 1, 2, … [see
Example 2.8.4(ii)].
(ii) Let M denote the linear manifold spanned by the orthonormal vectors x_1, x_2, …,
x_n. Then the proof of Theorem 2.8.6 shows that the distance
||x − Σ_{k=1}^n λ_k x_k|| is minimised if we set λ_k = (x, x_k), k = 1, 2, …, n; i.e.

$$\Bigl\|x - \sum_{k=1}^{n} (x, x_k)x_k\Bigr\| \le \Bigl\|x - \sum_{k=1}^{n} \lambda_k x_k\Bigr\|,$$

where λ_1, λ_2, …, λ_n are arbitrary scalars.
Thus, y = Σ_{k=1}^n (x, x_k)x_k is the vector in M which provides the 'best approximation'
to the vector x in the pre-Hilbert space H. Also note that if
n > m, then in the best approximation by the linear span of x_1, x_2, …, x_n, the
first m coefficients are precisely the same as required for the best approximation
in the linear span of x_1, x_2, …, x_m.
(iii) If we set z = x − y, where y = Σ_{k=1}^n (x, x_k)x_k provides the best approximation
amongst the vectors in M, then (z, x_k) = (x, x_k) − (y, x_k) = 0 for k = 1, 2, …,
n. Hence (z, y) = 0. Thus, x = y + z, where y is a linear combination of x_1, x_2,
…, x_n providing the best approximation to x and z ⊥ x_k, k = 1, 2, …, n, is a
decomposition of x. The decomposition is unique. Indeed, the vector in
M providing the best approximation to x ∈ H is unique. If x = y_1 + z_1 is
another decomposition of x, where y_1 provides the best approximation
amongst the vectors in M and z_1 ⊥ x_k, k = 1, 2, …, n, then y + z = y_1 + z_1
implies y − y_1 = z_1 − z, which in turn says y = y_1 and z = z_1, since y, y_1 are in
M and z, z_1 are orthogonal to M.
It follows from Remark 2.8.3(iii) that every orthonormal sequence in H is linearly
independent. Conversely, given any countable linearly independent sequence
in H, we can construct an orthonormal sequence, keeping the span of the elements
at each step of the construction intact [see Theorem 2.8.9 below].
Theorem 2.8.9 (Gram–Schmidt orthonormalisation) Let x_1, x_2, … be a linearly
independent sequence in an inner product space H. Define y_1 = x_1, u_1 = x_1/||x_1||, and for
n = 2, 3, …,

$$y_n = x_n - (x_n, u_1)u_1 - (x_n, u_2)u_2 - \cdots - (x_n, u_{n-1})u_{n-1}$$

and

$$u_n = \frac{y_n}{\|y_n\|}.$$

Then {u_1, u_2, …} is an orthonormal sequence in H and for n = 1, 2, …,

$$\operatorname{span}\{u_1, u_2, \ldots, u_n\} = \operatorname{span}\{x_1, x_2, \ldots, x_n\}.$$

Proof As {x_1} is a linearly independent set, y_1 = x_1 ≠ 0 and u_1 = x_1/||x_1|| is such that
||u_1|| = 1 and span{u_1} = span{x_1}.
For n ≥ 1, assume that we have defined y_1, y_2, …, y_n and u_1, u_2, …, u_n as stated
above and proved that {u_1, u_2, …, u_n} is an orthonormal sequence satisfying
span{u_1, u_2, …, u_n} = span{x_1, x_2, …, x_n}. Define

$$y_{n+1} = x_{n+1} - (x_{n+1}, u_1)u_1 - (x_{n+1}, u_2)u_2 - \cdots - (x_{n+1}, u_n)u_n.$$

Since the set {x_1, x_2, …, x_{n+1}} is linearly independent, x_{n+1} does not belong to
span{x_1, x_2, …, x_n} = span{u_1, u_2, …, u_n}. Hence, y_{n+1} ≠ 0; let u_{n+1} = y_{n+1}/||y_{n+1}||.
Then ||u_{n+1}|| = 1 and for j ≤ n,

$$(y_{n+1}, u_j) = (x_{n+1}, u_j) - \sum_{k=1}^{n} (x_{n+1}, u_k)(u_k, u_j) = (x_{n+1}, u_j) - (x_{n+1}, u_j) = 0,$$

since (u_k, u_j) = 0 for all k ≠ j, k = 1, 2, …, n. Thus

$$(u_{n+1}, u_j) = \frac{(y_{n+1}, u_j)}{\|y_{n+1}\|} = 0 \quad \text{for } j = 1, 2, \ldots, n.$$

Hence, {u_1, u_2, …, u_{n+1}} is an orthonormal sequence. Moreover,

$$\operatorname{span}\{u_1, u_2, \ldots, u_{n+1}\} = \operatorname{span}\{x_1, x_2, \ldots, x_n, u_{n+1}\} = \operatorname{span}\{x_1, x_2, \ldots, x_{n+1}\}.$$

The argument is now complete in view of mathematical induction. ∎
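The procedure of Theorem 2.8.9 is directly algorithmic. The following sketch (plain Python on Rⁿ with the dot product, our illustration rather than the book's) carries out the same steps and verifies orthonormality of the output.

```python
# A sketch (not from the text) of the Gram-Schmidt orthonormalisation of
# Theorem 2.8.9, on R^n with the standard dot product.
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Return orthonormal u_1, u_2, ... with the same successive spans."""
    us = []
    for x in vectors:
        # y_n = x_n - sum_k (x_n, u_k) u_k
        y = list(x)
        for u in us:
            c = dot(x, u)
            y = [yi - c * ui for yi, ui in zip(y, u)]
        norm = math.sqrt(dot(y, y))
        us.append([yi / norm for yi in y])   # u_n = y_n / ||y_n||
    return us

us = gram_schmidt([(1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)])
print(all(abs(dot(us[i], us[j]) - (1.0 if i == j else 0.0)) < 1e-12
          for i in range(3) for j in range(3)))  # True
```

The division by ‖y_n‖ is safe precisely because the input vectors are linearly independent, as in the theorem.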


Remarks 2.8.10
(i) The Gram–Schmidt orthonormalisation process as described above yields an
orthonormal sequence which is unique.

Let e_1, …, e_n and f_1, …, f_n be n-term (n > 1) orthogonal sequences of nonzero
vectors in an inner product space. Suppose they have the same linear span and so do
e_1, …, e_{n−1} and f_1, …, f_{n−1}. Then the vectors e_n and f_n are scalar multiples of each
other, as the following argument shows.
It is sufficient to argue that f_n is a scalar multiple of e_n.
Since f_n lies in the linear span of e_1, …, e_n, there exist scalars λ_1, …, λ_n such that
f_n = Σ_{1≤k≤n} λ_k e_k. However, the vectors e_1, …, e_{n−1} lie in the linear span of f_1, …,
f_{n−1}. Therefore, the sum of the first n − 1 terms in the preceding sum can be written
as Σ_{1≤k≤n−1} λ_k e_k = Σ_{1≤k≤n−1} c_k f_k for some scalars c_1, …, c_{n−1}. Thus,

$$f_n = \sum_{1 \le k \le n-1} c_k f_k + \lambda_n e_n. \qquad (2.44)$$

By the orthogonality of f_1, …, f_n, for 1 ≤ j ≤ n − 1, we have

$$0 = (f_n, f_j) = \Bigl(\sum_{1 \le k \le n-1} c_k f_k + \lambda_n e_n,\ f_j\Bigr) = c_j (f_j, f_j) + \lambda_n (e_n, f_j).$$

But f_j lies in the linear span of e_1, …, e_{n−1} and e_n is orthogonal to this linear span.
Therefore, (e_n, f_j) = 0 and the above equality becomes c_j(f_j, f_j) = 0. As each f_j is
nonzero, it now follows that each c_j = 0 (1 ≤ j ≤ n − 1). Using this in (2.44), we
get f_n = λ_n e_n.
If the vectors e_n and f_n have the same norm, then it further follows that the scalar
λ_n has absolute value 1.
(ii) If e_1, e_2, … and f_1, f_2, … are orthogonal sequences of nonzero vectors in
an inner product space and

$$\operatorname{span}\{e_1, \ldots, e_n\} = \operatorname{span}\{f_1, \ldots, f_n\} \quad \text{for } n = 1, 2, \ldots,$$

then en and fn are scalar multiples of each other. If the vectors en and fn have the
same norm, then it further follows that the scalar factor has absolute value 1.
(iii) Let Q0, Q1, … be the sequence of polynomials obtained from the sequence of
polynomials 1, t, t2, … (on the domain [−1, 1]) by orthonormalisation, and
let P0, P1, … be the sequence of Legendre polynomials defined in 2.8.13
below. The first k functions in either sequence span the space of polynomials
of degree at most k − 1. It follows from what has been proved above that
each Qn is a scalar multiple of Pn and vice versa. The value of the scalar can
be obtained by comparing (a) the leading coefficients or (b) the constant
terms or (c) the integrals over [−1, 1].
(iv) The Gram–Schmidt procedure when applied to a finite sequence {x1, x2, …,
xn} of independent vectors leads to orthonormal vectors {u1, u2, …, un} such
that

$$\operatorname{span}\{u_1, u_2, \ldots, u_k\} = \operatorname{span}\{x_1, x_2, \ldots, x_k\} \quad \text{for } k = 1, 2, \ldots, n.$$

As an immediate consequence, we record the following:


Corollary 2.8.11 If H is a pre-Hilbert space of dimension n, then it has a basis of
orthonormal vectors.
Theorem 2.8.12 Every finite-dimensional pre-Hilbert space is complete and is,
therefore, a Hilbert space.
Proof By Corollary 2.8.11, there is a basis u_1, u_2, …, u_n of orthonormal vectors. If
x = Σ_{k=1}^n λ_k u_k, then ||x||² = Σ_{k=1}^n |λ_k|², using Remark 2.8.3(ii). The completeness
follows as in Example 2.3.4(i). ∎
The following examples illustrate the orthogonalisation procedure.
Examples 2.8.13
(i) Let H = ℓ². For n = 1, 2, …, let x_n = (1, 1, …, 1, 0, 0, …), where 1 occurs only
in the first n places. The Gram–Schmidt orthonormalisation process yields

$$y_n = (0, 0, \ldots, 0, 1, 0, \ldots), \quad n = 1, 2, \ldots,$$

where 1 occurs only in the nth place.
The vector y_1 = x_1 = (1, 0, 0, …). The vector y_2 = (x_2 − (x_2, x_1)x_1)/||x_2 − (x_2, x_1)x_1|| = (0, 1, 0, …).
By induction, it can be shown that

$$y_n = (0, 0, \ldots, 0, 1, 0, \ldots), \quad n = 1, 2, \ldots,$$

where 1 occurs only in the nth place. The sequence of vectors {y_n}_{n≥1} is an
orthonormal sequence in ℓ².
The set of finite linear combinations of the sequence {y_n}_{n≥1} is dense in ℓ². Let
x = {λ_i}_{i≥1} ∈ ℓ². Given ε > 0, there exists n_0 such that
Σ_{n_0+1 ≤ i < ∞} |λ_i|² < ε. Then the vector

$$y = \lambda_1 y_1 + \lambda_2 y_2 + \cdots + \lambda_{n_0} y_{n_0}$$

is such that

$$\|x - y\|_2^2 = \sum_{n_0 + 1 \le i < \infty} |\lambda_i|^2 < \varepsilon.$$

(ii) Legendre polynomials. Consider the sequence {1, t, t², …} of vectors in
L²[−1, 1]. Since any nontrivial finite linear combination Σ_{i=1}^n a_{k_i} t^{k_i} is a
polynomial of degree m = max_i k_i, it has at most m zeros. This shows that the
vectors {t^k}_{k≥0} are linearly independent. We next calculate the first three
orthonormal vectors by the Gram–Schmidt procedure.
Let y_0(t) = x_0(t) = 1, so that ||y_0||² = ∫_{−1}^1 ds = 2 and u_0 = y_0(t)/||y_0|| = 1/√2. Next,

$$y_1(t) = x_1(t) - (x_1, u_0)u_0(t) = t - \Bigl(\int_{-1}^{1} \frac{s}{\sqrt2}\,ds\Bigr)\frac{1}{\sqrt2} = t,$$

so that ||y_1||² = ∫_{−1}^1 s² ds = 2/3 and u_1(t) = y_1(t)/||y_1|| = √(3/2)\,t.
Further,

$$y_2(t) = x_2(t) - (x_2, u_0)u_0(t) - (x_2, u_1)u_1(t) = t^2 - \Bigl(\int_{-1}^{1} \frac{s^2}{\sqrt2}\,ds\Bigr)\frac{1}{\sqrt2} - \Bigl(\int_{-1}^{1} \sqrt{\tfrac32}\,s^3\,ds\Bigr)\sqrt{\tfrac32}\,t = t^2 - \frac{1}{3},$$

so that ||y_2||² = ∫_{−1}^1 (s² − 1/3)² ds = 8/45 and

$$u_2(t) = \frac{y_2(t)}{\|y_2\|} = \sqrt{10}\,(3t^2 - 1)/4.$$

We shall next prove that the general form of these orthonormal polynomials is
√(n + ½)\,P_n(t), where

$$P_n(t) = \frac{1}{2^n n!}\,\frac{d^n}{dt^n}\bigl[(t^2 - 1)^n\bigr], \quad n = 0, 1, 2, \ldots. \qquad (2.45)$$

The reader can check by using (2.45) that P_0(t) = 1, P_1(t) = t and
P_2(t) = ½(3t² − 1), and consequently the first three normalised polynomials
are 1/√2, √(3/2)\,t and √10\,(3t² − 1)/4. That the general form of these polynomials is
{√(n + ½)\,P_n(t)} will be verified below. We begin by showing that

$$\int_{-1}^{1} P_n(t)P_m(t)\,dt = \begin{cases} 0 & \text{if } n \ne m \\ \frac{2}{2n+1} & \text{if } n = m. \end{cases}$$

For m ≠ n,

$$2^{n+m} n!\,m! \int_{-1}^{1} P_n(t)P_m(t)\,dt = \int_{-1}^{1} \frac{d^n}{dt^n}\bigl[(t^2-1)^n\bigr]\,\frac{d^m}{dt^m}\bigl[(t^2-1)^m\bigr]\,dt$$
$$= \Bigl[\frac{d^{n-1}}{dt^{n-1}}\bigl[(t^2-1)^n\bigr]\,\frac{d^m}{dt^m}\bigl[(t^2-1)^m\bigr]\Bigr]_{-1}^{1} - \int_{-1}^{1} \frac{d^{m+1}}{dt^{m+1}}\bigl[(t^2-1)^m\bigr]\,\frac{d^{n-1}}{dt^{n-1}}\bigl[(t^2-1)^n\bigr]\,dt$$
$$= -\int_{-1}^{1} \frac{d^{m+1}}{dt^{m+1}}\bigl[(t^2-1)^m\bigr]\,\frac{d^{n-1}}{dt^{n-1}}\bigl[(t^2-1)^n\bigr]\,dt,$$

since d^{n−k}/dt^{n−k}[(t² − 1)ⁿ], k = 1, 2, …, n, is zero at t = ±1. Hence, if n > m and we
continue the process of integration by parts, we obtain

$$\pm\int_{-1}^{1} \frac{d^{n-m-1}}{dt^{n-m-1}}\bigl[(t^2-1)^n\bigr]\,\frac{d^{2m+1}}{dt^{2m+1}}\bigl[(t^2-1)^m\bigr]\,dt,$$

which is equal to zero (the second factor in the integrand is identically zero).
For m = n, we have

$$\int_{-1}^{1} P_n(t)^2\,dt = \frac{(-1)^n}{2^{2n}(n!)^2}\int_{-1}^{1} (t^2-1)^n\,\frac{d^{2n}}{dt^{2n}}\bigl[(t^2-1)^n\bigr]\,dt = \frac{(-1)^n}{2^{2n}(n!)^2}\,(2n)!\int_{-1}^{1} (t^2-1)^n\,dt \qquad (2.46)$$

since

$$\frac{d^{2n}}{dt^{2n}}\bigl[(t^2-1)^n\bigr] = (2n)!.$$

Setting t = cos θ in (2.46) and using Wallis' formula,

$$\int_0^{\pi/2} \sin^{2n+1}\theta\,d\theta = \frac{2^n\,n!}{1 \cdot 3 \cdots (2n+1)},$$

[which can be derived by integrating by parts repeatedly], it follows that



$$\int_{-1}^{1} P_n(t)^2\,dt = \frac{2}{2n+1}.$$

Thus {√((2n+1)/2)\,P_n(t)}_{n≥0} is an orthonormal sequence in L²[−1, 1].
It now follows readily that the functions {√((2n+1)/2)\,P_n(t)}_{n≥0} are obtained from
{1, t, t², …} by the Gram–Schmidt orthonormalisation procedure, since P_n(t) is a
polynomial of degree n. The essential uniqueness pointed out in Remark 2.8.10(ii),
and the fact that the leading coefficients in P_n(t) and in the nth polynomial obtained
via orthonormalisation are both positive, lead to the result.
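The orthogonality relation just established can be verified exactly, not merely numerically, because everything is polynomial. The sketch below (plain Python with exact rational arithmetic, our illustration rather than the book's) builds P_n from the Rodrigues formula (2.45) and evaluates the integrals over [−1, 1] term by term.

```python
# Illustrative exact check (not from the text) that the Legendre polynomials
# of (2.45), P_n(t) = (1 / (2^n n!)) d^n/dt^n (t^2 - 1)^n, satisfy
# the orthogonality relation with ||P_n||^2 = 2/(2n+1).
from math import comb, factorial
from fractions import Fraction

def legendre_coeffs(n):
    """Coefficients [c_0, c_1, ...] of P_n via the Rodrigues formula."""
    # (t^2 - 1)^n = sum_j C(n, j) (-1)^(n-j) t^(2j)
    c = [Fraction(0)] * (2 * n + 1)
    for j in range(n + 1):
        c[2 * j] = Fraction(comb(n, j) * (-1) ** (n - j))
    for _ in range(n):               # differentiate n times
        c = [k * c[k] for k in range(1, len(c))]
    return [ck / (2 ** n * factorial(n)) for ck in c]

def integrate_product(p, q):
    """Exact value of the integral over [-1, 1] of p(t) q(t)."""
    total = Fraction(0)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if (i + j) % 2 == 0:     # odd powers integrate to zero
                total += pi * qj * Fraction(2, i + j + 1)
    return total

P2, P3 = legendre_coeffs(2), legendre_coeffs(3)
print(integrate_product(P2, P3) == 0)                # True: orthogonal
print(integrate_product(P3, P3) == Fraction(2, 7))   # True: 2/(2n+1) for n = 3
```

Because the arithmetic uses `fractions.Fraction`, the identities come out exactly, with no quadrature error.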
(iii) Hermite functions. Consider the sequence of functions {f_n}_{n≥0} on R, where
f_n(t) = tⁿ exp(−t²/2). Since

$$\int_0^1 t^{2n}\exp(-t^2)\,dt < \int_0^1 t^{2n}\,dt \quad \bigl(\exp(-t^2) < 1 \text{ for } 0 < t < 1\bigr) \qquad (2.47)$$

and

$$\int_1^{\infty} t^{2n}\exp(-t^2)\,dt < (n+1)!\int_1^{\infty} \frac{1}{t^2}\,dt \quad \Bigl(e^{t^2} > \frac{t^{2n+2}}{(n+1)!},\ t > 1\Bigr), \qquad (2.48)$$

it follows on using (2.47) and (2.48) that

$$\int_{-\infty}^{\infty} \Bigl|t^n \exp\Bigl(-\frac{t^2}{2}\Bigr)\Bigr|^2\,dt = \int_{-\infty}^{\infty} t^{2n}\exp(-t^2)\,dt = 2\int_0^{\infty} t^{2n}\exp(-t^2)\,dt$$

is finite. Thus, each f_n ∈ L²(R). Moreover, the f_n are linearly independent, because
any nontrivial finite linear combination of the functions f_n is a polynomial multiplied by
exp(−t²/2), which is zero for no t ∈ R, and any nonzero polynomial has at most
finitely many zeros.
We next orthonormalise the functions $\{f_n\}_{n\ge 0}$, where $f_n(t) = t^n\exp(-\tfrac{t^2}{2})$, and obtain the first three orthonormal vectors.

To begin with, $(f_0, f_0) = \int_{-\infty}^{\infty}\exp(-t^2)\,dt = \sqrt{\pi}$, using a well-known formula from advanced calculus. Thus

$$u_0(t) = \frac{\exp(-\tfrac{t^2}{2})}{\pi^{1/4}}.$$

The next orthonormal vector is

$$u_1(t) = \frac{t\exp(-\tfrac{t^2}{2}) - \left(t\exp(-\tfrac{t^2}{2}),\,u_0\right)u_0}{\left\|t\exp(-\tfrac{t^2}{2}) - \left(t\exp(-\tfrac{t^2}{2}),\,u_0\right)u_0\right\|} = \frac{\sqrt{2}\,t\exp(-\tfrac{t^2}{2})}{\pi^{1/4}} = \frac{2t\exp(-\tfrac{t^2}{2})}{(2\pi^{1/2})^{1/2}}.$$

Similarly,

$$u_2(t) = \frac{t^2\exp(-\tfrac{t^2}{2}) - \left(t^2\exp(-\tfrac{t^2}{2}),\,u_0\right)u_0 - \left(t^2\exp(-\tfrac{t^2}{2}),\,u_1\right)u_1}{\left\|t^2\exp(-\tfrac{t^2}{2}) - \left(t^2\exp(-\tfrac{t^2}{2}),\,u_0\right)u_0 - \left(t^2\exp(-\tfrac{t^2}{2}),\,u_1\right)u_1\right\|} = \frac{(4t^2-2)\exp(-\tfrac{t^2}{2})}{(\pi^{1/2}\,2^2\,2!)^{1/2}}.$$

We shall next prove that the general form of these orthonormal functions is

$$v_n(t) = \frac{H_n(t)\exp(-\tfrac{t^2}{2})}{(2^n\,n!\,\pi^{1/2})^{1/2}},$$

where

$$H_n(t) = (-1)^n\exp(t^2)\,\exp^{(n)}(-t^2), \qquad (2.49)$$

and the superscript '(n)' indicates the $n$th derivative of the function $t \mapsto \exp(-t^2)$. The functions $\{H_n\}_{n\ge 0}$ are easily seen to be polynomials and are called Hermite polynomials. The degree of $H_n$ is $n$, as shown in (2.50) below. The functions $v_n$ are called Hermite functions.

For $n = 0, 1$ and $2$, it can be verified that

$$H_0(t) = 1, \quad H_1(t) = 2t \quad \text{and} \quad H_2(t) = 4t^2 - 2.$$

We shall establish below that

$$H_n'(t) = 2nH_{n-1}(t), \quad n = 1, 2, \ldots. \qquad (2.50)$$


In order to do so, we first prove by induction that

$$\exp^{(n+1)}(-t^2) = -2t\exp^{(n)}(-t^2) - 2n\exp^{(n-1)}(-t^2). \qquad (2.51)$$

This is true for $n = 0$. Assume that it is true for $n = k - 1$. Then

$$\exp^{(k+1)}(-t^2) = \frac{d}{dt}\exp^{(k)}(-t^2) = \frac{d}{dt}\left(-2t\exp^{(k-1)}(-t^2) - 2(k-1)\exp^{(k-2)}(-t^2)\right)$$
$$= -2t\exp^{(k)}(-t^2) - 2\exp^{(k-1)}(-t^2) - 2(k-1)\exp^{(k-1)}(-t^2)$$
$$= -2t\exp^{(k)}(-t^2) - 2k\exp^{(k-1)}(-t^2).$$

This proves (2.51) for all $n = 1, 2, \ldots$. Now, differentiating (2.49) and using (2.51), we obtain

$$H_n'(t) = (-1)^n\left\{2t\exp(t^2)\exp^{(n)}(-t^2) + \exp(t^2)\left[-2t\exp^{(n)}(-t^2) - 2n\exp^{(n-1)}(-t^2)\right]\right\}$$
$$= (-1)^{n-1}\,2n\exp(t^2)\exp^{(n-1)}(-t^2) = 2nH_{n-1}(t).$$

This establishes (2.50).


The orthogonality of the Hermite functions may be obtained from

$$\int_{-\infty}^{\infty} H_m(t)H_n(t)e^{-t^2}\,dt = (-1)^n\int_{-\infty}^{\infty} H_m(t)\exp^{(n)}(-t^2)\,dt.$$

For $n > m$, repeated integration by parts, using (2.50) and the fact that $\exp(-t^2)$ and all its derivatives vanish as $t \to \pm\infty$, yields

$$\int_{-\infty}^{\infty} H_m(t)H_n(t)e^{-t^2}\,dt = (-1)^{n-1}\,2m\int_{-\infty}^{\infty} H_{m-1}(t)\exp^{(n-1)}(-t^2)\,dt$$
$$= \cdots = (-1)^{n-m}\,2^m\,m!\int_{-\infty}^{\infty} H_0(t)\exp^{(n-m)}(-t^2)\,dt = 0.$$

For $n = m$,

$$\int_{-\infty}^{\infty} H_n(t)^2\exp(-t^2)\,dt = 2^n\,n!\int_{-\infty}^{\infty} H_0(t)\exp(-t^2)\,dt = 2^n\,n!\,\sqrt{\pi}.$$

Thus, the functions

$$v_n(t) = \frac{H_n(t)\exp(-\tfrac{t^2}{2})}{(2^n\,n!\,\pi^{1/2})^{1/2}}, \quad n = 0, 1, 2, \ldots \qquad (2.52)$$

form an orthonormal sequence.


The reader can check using (2.52) that

$$v_j(t) = u_j(t), \quad j = 0, 1, 2.$$

The vector $H_n(t)\exp(-\tfrac{t^2}{2})$ is a linear combination of $f_0, \ldots, f_n$. Since the sets $\{H_k(t)\exp(-\tfrac{t^2}{2}): k = 0, \ldots, n\}$ and $\{f_k: k = 0, \ldots, n\}$ are linearly independent, it follows by Remark 2.8.10(ii) that $v_j = \pm u_j$ for all $j$. The ambiguity of sign can be removed by using the following observation: the leading coefficient of each $H_n$ is positive in view of (2.50), and so is that of the polynomial factor $t^n$ in $f_n(t) = t^n\exp(-\tfrac{t^2}{2})$.
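The orthonormality of the Hermite functions (2.52) can be tested numerically. The following sketch (assuming NumPy; `vn_inner` is an illustrative helper) uses Gauss–Hermite quadrature, whose nodes and weights approximate integrals of the form $\int e^{-x^2}f(x)\,dx$ — exactly the form of $(v_n, v_m)$; NumPy's `hermval` evaluates the physicists' Hermite polynomials, which coincide with the $H_n$ defined in (2.49):

```python
import math
import numpy as np
from numpy.polynomial import hermite as H

def vn_inner(n, m, deg=40):
    # hermgauss integrates exp(-x^2) * (polynomial of degree < 2*deg) exactly.
    x, w = H.hermgauss(deg)
    hn = H.hermval(x, [0] * n + [1])   # H_n(x), physicists' convention
    hm = H.hermval(x, [0] * m + [1])   # H_m(x)
    # Normalising constants (2^n n! sqrt(pi))^(1/2) from (2.52).
    norm = math.sqrt(2.0**n * math.factorial(n) * math.sqrt(math.pi)) \
         * math.sqrt(2.0**m * math.factorial(m) * math.sqrt(math.pi))
    return np.sum(w * hn * hm) / norm

for n in range(5):
    for m in range(5):
        assert abs(vn_inner(n, m) - (1.0 if n == m else 0.0)) < 1e-10
print("Hermite functions are orthonormal")
```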
(iv) Laguerre functions. Consider the sequence of functions $\{f_n\}_{n\ge 0}$ on $(0,\infty)$, where $f_n(t) = t^n\exp(-\tfrac{t}{2})$. Since $\int_0^\infty t^{2n}\exp(-t)\,dt = \Gamma(2n+1)$, where $\Gamma(\cdot)$ is the gamma function, and $\int_0^\infty \exp(-t)\,dt = 1$, it follows that each $f_n \in L^2(0,\infty)$. Moreover, the $f_n$ are linearly independent because any nontrivial finite linear combination of the $f_n$ is a polynomial multiplied by $\exp(-\tfrac{t}{2})$, which is zero for no $t \in (0,\infty)$, and any nonzero polynomial has at most finitely many zeros.

We next orthonormalise the functions $\{f_n\}_{n\ge 0}$, where $f_n(t) = t^n\exp(-\tfrac{t}{2})$, and obtain the first three orthonormal vectors. To begin with,

$$(f_0, f_0) = \int_0^\infty \exp(-t)\,dt = 1.$$

Thus, $u_0(t) = \exp(-\tfrac{t}{2})$.
The next two orthonormal vectors are given as

$$u_1(t) = \frac{t\exp(-\tfrac{t}{2}) - \left(t\exp(-\tfrac{t}{2}),\,u_0\right)u_0}{\left\|t\exp(-\tfrac{t}{2}) - \left(t\exp(-\tfrac{t}{2}),\,u_0\right)u_0\right\|} = (t-1)\exp\left(-\frac{t}{2}\right),$$

since $\left(t\exp(-\tfrac{t}{2}),\,u_0\right) = 1$ and $\left\|t\exp(-\tfrac{t}{2}) - \left(t\exp(-\tfrac{t}{2}),\,u_0\right)u_0\right\| = 1$. Similarly,

$$u_2(t) = \frac{t^2\exp(-\tfrac{t}{2}) - \left(t^2\exp(-\tfrac{t}{2}),\,u_1\right)u_1 - \left(t^2\exp(-\tfrac{t}{2}),\,u_0\right)u_0}{\left\|t^2\exp(-\tfrac{t}{2}) - \left(t^2\exp(-\tfrac{t}{2}),\,u_1\right)u_1 - \left(t^2\exp(-\tfrac{t}{2}),\,u_0\right)u_0\right\|} = \left(\frac{1}{2}t^2 - 2t + 1\right)\exp\left(-\frac{t}{2}\right),$$

since $\left(t^2\exp(-\tfrac{t}{2}),\,u_1\right) = 4$, $\left(t^2\exp(-\tfrac{t}{2}),\,u_0\right) = 2$ and $\left\|t^2\exp(-\tfrac{t}{2}) - 4u_1 - 2u_0\right\| = 2$.

We shall next prove that the general form of these orthonormal functions is

$$v_n(t) = \frac{1}{n!}\exp\left(-\frac{t}{2}\right)L_n(t), \qquad (2.53)$$

where

$$L_n(t) = (-1)^n\exp(t)\,\frac{d^n}{dt^n}\left(t^n\exp(-t)\right), \quad n = 0, 1, 2, \ldots.$$

Using Leibniz's formula for higher derivatives of a product, we have

$$L_n(t) = (-1)^n\sum_{k=0}^{n}\binom{n}{k}(-1)^{n-k}\,n(n-1)\cdots(n-k+1)\,t^{n-k}. \qquad (2.54)$$

The reader can check using (2.53) that $v_0(t) = \exp(-\tfrac{t}{2}) = u_0(t)$, $v_1(t) = (t-1)\exp(-\tfrac{t}{2}) = u_1(t)$ and $v_2(t) = (\tfrac{1}{2}t^2 - 2t + 1)\exp(-\tfrac{t}{2}) = u_2(t)$, where $u_0$, $u_1$ and $u_2$ are the orthonormalised vectors computed using the Gram–Schmidt orthonormalisation process.
We begin by showing that

$$\int_0^\infty \exp(-t)L_n(t)L_m(t)\,dt = 0 \quad \text{for } n > m.$$

For $m < n$,

$$\int_0^\infty \exp(-t)\,t^m L_n(t)\,dt = (-1)^n\int_0^\infty t^m\,\frac{d^n}{dt^n}\left(t^n\exp(-t)\right)dt = (-1)^{n+m}\,m!\int_0^\infty \frac{d^{n-m}}{dt^{n-m}}\left(t^n\exp(-t)\right)dt = 0,$$

by repeated integration by parts. Also,

$$\int_0^\infty \exp(-t)L_n^2(t)\,dt = (-1)^n\int_0^\infty \frac{d^n}{dt^n}\left(t^n\exp(-t)\right)L_n(t)\,dt$$
$$= (-1)^n\int_0^\infty \frac{d^n}{dt^n}\left(t^n\exp(-t)\right)\left[(-1)^n\sum_{k=0}^{n}\binom{n}{k}(-1)^{n-k}\,n(n-1)\cdots(n-k+1)\,t^{n-k}\right]dt$$
$$= (-1)^n\int_0^\infty t^n\,\frac{d^n}{dt^n}\left(t^n\exp(-t)\right)dt \quad \text{(the terms with } k \ge 1 \text{ contribute zero, by the computation above)}$$
$$= (-1)^{2n}\,n!\int_0^\infty t^n\exp(-t)\,dt = (n!)^2,$$

which shows that $\{v_n\}_{n\ge 0}$ is orthonormal.


The vector $L_n(t)\exp(-\tfrac{t}{2})$ is a linear combination of $f_0, \ldots, f_n$. Since the sets $\{L_k(t)\exp(-\tfrac{t}{2}): k = 0, \ldots, n\}$ and $\{f_k: k = 0, \ldots, n\}$ are linearly independent, it follows by Remark 2.8.10(ii) that $v_j = \pm u_j$ for all $j$. The ambiguity of sign can be removed by using the following observation: the leading coefficient of each $L_n$ is positive in view of (2.54), and so is that of the polynomial factor $t^n$ in $f_n(t) = t^n\exp(-\tfrac{t}{2})$.
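A numerical check of the Laguerre case is equally easy. One point to hedge: NumPy's standard Laguerre polynomials $\bar L_n$ differ from the $L_n$ above by $\bar L_n = (-1)^n L_n/n!$ (so $v_n = (-1)^n \bar L_n e^{-t/2}$), and `laggauss` supplies nodes and weights for $\int_0^\infty e^{-x}f(x)\,dx$, which is exactly the form of $(v_n, v_m)$. The helper name `vn_inner` is illustrative:

```python
import numpy as np
from numpy.polynomial import laguerre as Lg

def vn_inner(n, m, deg=40):
    # laggauss integrates exp(-x) * (polynomial of degree < 2*deg) exactly
    # over (0, infinity).
    x, w = Lg.laggauss(deg)
    ln = Lg.lagval(x, [0] * n + [1])   # standard Laguerre Lbar_n(x)
    lm = Lg.lagval(x, [0] * m + [1])   # standard Laguerre Lbar_m(x)
    # v_n = (-1)^n Lbar_n e^{-t/2} in the book's normalisation (2.53).
    return (-1) ** (n + m) * np.sum(w * ln * lm)

for n in range(5):
    for m in range(5):
        assert abs(vn_inner(n, m) - (1.0 if n == m else 0.0)) < 1e-10
print("Laguerre functions are orthonormal")
```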
(v) Rademacher functions. Consider the sequence {rn} of functions defined on
the interval [0, 1] by

$$r_0(t) = 1; \quad r_k(t) = \operatorname{sgn}(\sin 2^k\pi t), \quad k = 1, 2, \ldots,\ t \in [0, 1].$$

This sequence was introduced by Rademacher and the rk are known as Rademacher
functions. If the interval [0, 1] is divided into 2k (k  1) equal parts, then
rk(t) assumes on the interiors of those segments the values +1 and −1 alternately
while at the endpoints, rk(t) = 0.
The reader will note that $\|r_n\| = \left(\int_0^1 |r_n(t)|^2\,dt\right)^{1/2} = 1$, i.e. $r_n \in L^2[0, 1]$ and $\|r_n\| = 1$, $n = 0, 1, 2, \ldots$. To prove orthogonality, let $n > m \ge 0$. Let $I$ be the open
segment which lies between some two consecutive points of subdivision of the
interval [0, 1] corresponding to the function rm. Then, rm has constant value +1 or
−1 on $I$. Furthermore, $I$ is composed of an even number, precisely $2^{n-m}$, of intervals

of equal length. On half of these intervals, rn(t) has value +1, whereas on the other
half, rn(t) has value −1. Consequently,
$$\int_I r_m(t)r_n(t)\,dt = \pm\int_I r_n(t)\,dt = 0.$$

Summing up over all such segments I, we have

$$\int_0^1 r_m(t)r_n(t)\,dt = 0.$$

Since the vectors of an orthonormal system in a pre-Hilbert space cannot be linearly


dependent, it follows that the Rademacher sequence {rn}n  0 of orthonormal
functions in L2[0, 1] is a linearly independent sequence. Moreover, the function
$f(t) = \cos 2\pi t$ is such that

$$\int_0^1 r_n(t)f(t)\,dt = \sum_{k=1}^{2^n}(-1)^{k-1}\int_{(k-1)/2^n}^{k/2^n} f(t)\,dt = 0,$$

since the $k$th term and the $(2^n - (k-1))$th term are equal in magnitude and opposite in sign because $\cos 2\pi t = \cos 2\pi(1-t)$, and consequently add up to 0.
Remark The sequence $\{r_n\}_{n\ge 0}$ converges for $t = 0, 1, \tfrac{k}{2^n}$, $1 \le k < 2^n$, $n = 1, 2, \ldots$, the points of subdivision of the interval [0, 1], and converges for no $t$ other than these points of subdivision; for if $t \ne 0, 1$ or any of the points of subdivision, $\{r_n(t)\}_{n\ge 0}$ assumes the values +1 and −1 infinitely often. (For any $t$ that is not of the form $\tfrac{k}{2^n}$, there exists an integer $j$ such that $\tfrac{j}{2^n} < t < \tfrac{j+1}{2^n}$, that is, $j\pi < 2^n\pi t < (j+1)\pi$; so $r_n(t)$ is 1 if $j$ is even and −1 if $j$ is odd. As $n$ increases, the parity of $j$ keeps changing between even and odd [see Problem 2.8.P10].) Thus, the sequence converges only on the set $\{0, 1, \tfrac{k}{2^n}: 1 \le k < 2^n, n = 1, 2, \ldots\}$ of measure zero. However, the arithmetic averages of $\{r_n\}_{n\ge 0}$ converge to the zero function almost everywhere.
Lemma 2.8.14 For distinct nonnegative integers $k_1, k_2, \ldots, k_n$,

$$\int_0^1 r_{k_1}r_{k_2}\cdots r_{k_n} = 0.$$

Proof This is left as Problem 2.8.P11. h
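The Rademacher functions are simple enough to implement directly, and the orthogonality relations (including instances of Lemma 2.8.14) can be verified exactly: each $r_k$ with $k \le K$ is constant on the dyadic intervals of length $2^{-K}$, so averaging the product over the midpoints of those intervals reproduces the integral exactly. This is a sketch, not part of the book's argument; the helper names are illustrative:

```python
import math

def rademacher(k, t):
    # r_0 = 1; r_k(t) = sgn(sin(2^k * pi * t)) for k >= 1.
    if k == 0:
        return 1.0
    s = math.sin((2 ** k) * math.pi * t)
    return 0.0 if s == 0 else math.copysign(1.0, s)

def integral_of_product(ks):
    # Every r_k with k <= K = max(ks) is constant on each dyadic interval
    # of length 2^-K, so the average of midpoint values is the exact
    # integral of the product over [0, 1].
    K = max(ks)
    N = 2 ** K
    total = 0.0
    for i in range(N):
        t = (i + 0.5) / N
        p = 1.0
        for k in ks:
            p *= rademacher(k, t)
        total += p
    return total / N

assert integral_of_product([1, 2]) == 0.0       # orthogonality
assert integral_of_product([3, 3]) == 1.0       # unit norm
assert integral_of_product([1, 2, 3]) == 0.0    # an instance of Lemma 2.8.14
```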



Theorem 2.8.15 Let $\{r_n\}_{n\ge 1}$ be the Rademacher functions. Then the sequence $\left\{\left(\sum_{k=1}^n r_k\right)/n\right\}_{n\ge 1}$ of arithmetic means converges to zero almost everywhere with respect to the Lebesgue measure on [0, 1].

Proof Set $f_n = \left[\left(\sum_{k=1}^n r_k\right)/n\right]^4$, $n = 1, 2, \ldots$. Observe that each $f_n$ belongs to $L^1[0, 1]$. Indeed, $|f_n| \le \left[\left(\sum_{k=1}^n |r_k|\right)/n\right]^4 = 1$ and $\int_0^1 f_n(t)\,dt \le 1$. Next, on using $r_k^2 = 1$, $k = 1, 2, \ldots$ (except at the finitely many points of subdivision), we have
$$n^4 f_n = \left[\left(\sum_{k=1}^n r_k\right)^2\right]^2 = \left(\sum_{k=1}^n r_k^2 + 2\sum_{\substack{k,m=1\\k<m}}^{n} r_k r_m\right)^2 = \left(n + 2\sum_{\substack{k,m=1\\k<m}}^{n} r_k r_m\right)^2$$
$$= n^2 + 4n\sum_{\substack{k,m=1\\k<m}}^{n} r_k r_m + 4\left(\sum_{\substack{i,j=1\\i<j}}^{n} r_i r_j\right)\left(\sum_{\substack{k,m=1\\k<m}}^{n} r_k r_m\right)$$
$$= n^2 + 4n\sum_{k<m} r_k r_m + 4\sum_{k<m} r_k^2 r_m^2 + 4\left(\sum_{j=1}^{n}\sum_{\substack{k<m\\k,m\ne j}} 2\,r_j^2 r_k r_m + \sum_{\substack{i<j,\ k<m\\i,j,k,m\ \text{distinct}}} 2\,r_i r_j r_k r_m\right)$$
$$= n^2 + 4n\sum_{k<m} r_k r_m + 4\big((n-1) + (n-2) + \cdots + 2 + 1\big) + 8\sum_{j=1}^{n}\sum_{\substack{k<m\\k,m\ne j}} r_k r_m + 8\sum_{\substack{i<j,\ k<m\\i,j,k,m\ \text{distinct}}} r_i r_j r_k r_m$$
$$= n^2 + 2n(n-1) + 4n\sum_{k<m} r_k r_m + 8\sum_{j=1}^{n}\sum_{\substack{k<m\\k,m\ne j}} r_k r_m + 8\sum_{\substack{i<j,\ k<m\\i,j,k,m\ \text{distinct}}} r_i r_j r_k r_m. \qquad (2.55)$$
Dividing both sides of (2.55) by $n^4$, integrating and using Lemma 2.8.14, we obtain

$$\int_0^1 f_n\,dt = \frac{1}{n^2} + \frac{2n(n-1)}{n^4} < \frac{3}{n^2}.$$

Consequently,

$$\sum_{n=1}^{\infty}\int_0^1 f_n\,dt < \infty.$$

By Corollary 1.3.7 and Remark 1.3.13, it follows that the sequence $\{f_n\}_{n\ge 1}$ converges to zero almost everywhere; that is, the sequence $\left\{\left(\sum_{k=1}^n r_k\right)/n\right\}_{n\ge 1}$ of arithmetic averages converges to zero almost everywhere with respect to Lebesgue measure. This completes the proof. □
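The conclusion of Theorem 2.8.15 can be illustrated by simulation. Under Lebesgue measure on [0, 1], the Rademacher functions behave like an i.i.d. sequence of fair ±1 coin flips, so evaluating the arithmetic mean at a "random $t$" can be modelled by independent random signs (a sketch under that assumption, not the book's construction; the helper name is illustrative):

```python
import random

random.seed(0)

def rademacher_mean(n):
    # Models (r_1(t) + ... + r_n(t)) / n at a uniformly random t by
    # n independent fair +/-1 signs, which share the joint distribution
    # of the Rademacher functions.
    return sum(random.choice((-1, 1)) for _ in range(n)) / n

# The averages shrink roughly like n^(-1/2), consistent with almost
# everywhere convergence of the arithmetic means to zero.
for _ in range(20):
    assert abs(rademacher_mean(4000)) < 0.1
print("arithmetic means of simulated Rademacher values are near zero")
```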

Problem Set 2.8

2.8.P1. Using Bessel's Inequality, obtain the Cauchy–Schwarz Inequality.

2.8.P2. Give an example to show that strict inequality can hold in Corollary 2.8.7 to Bessel's Inequality.
2.8.P3. Let $\{e_k\}_{k\ge 1}$ be any orthonormal sequence in an inner product space $X$. Show that for any $x, y \in X$,

$$\sum_{k=1}^{\infty}|(x, e_k)(y, e_k)| \le \|x\|\,\|y\|.$$

2.8.P4. Let $\{e_k\}_{k\ge 1}$ be any orthonormal sequence in a Hilbert space $H$ and let $M = \text{span}\{e_k\}$. Show that for any $x \in H$, we have $x \in \overline{M}$ if, and only if, $x$ can be represented by

$$x = \sum_{k=1}^{\infty}(x, e_k)e_k.$$

2.8.P5. Let $f(x)$ be a differentiable $2\pi$-periodic function on $[-\pi, \pi]$ with derivative $f'(x) \in L^2[-\pi, \pi]$. Let $f_n$, $n \in \mathbb{Z}$, be the Fourier coefficients of $f(x)$ in the system $\{e^{inx}/\sqrt{2\pi}\}_{n\in\mathbb{Z}}$. Prove that $\sum_{n=-\infty}^{\infty}|f_n| < \infty$.
2.8.P6. Show that the system $\{1, t^2, t^4, \ldots\}$ is complete in the space $L^2[0, 1]$. It is not complete in $L^2[-1, 1]$.


2.8.P7. Find a nonzero vector in $\mathbb{C}^3$ orthogonal to $(1, 1, 1)$ and $(1, x, x^2)$, where $x = \exp(2\pi i/3)$.
2.8.P8. Let $a \in \mathbb{C}$ be such that $|a| \ne 1$. Find the Fourier coefficients of $f \in RL^2$, where $f(z) = (z-a)^{-1}$, with respect to the orthonormal sequence $\{e_j\}_{j=-\infty}^{\infty}$, $e_j(z) = z^j$.
2.8.P9. If the series $|a_0|^2 + \sum_{k=1}^{\infty}(|a_k|^2 + |b_k|^2)$ converges, show that there exists a function $f \in L^2[0, 2\pi]$ having the $a_k$, $b_k$ as its Fourier coefficients, i.e. the equations

$$a_0 = \frac{1}{\pi}\int_0^{2\pi} f(t)\,dt, \quad a_k = \frac{1}{\pi}\int_0^{2\pi} f(t)\cos kt\,dt, \quad b_k = \frac{1}{\pi}\int_0^{2\pi} f(t)\sin kt\,dt, \quad k = 1, 2, \ldots$$

are valid. This function is uniquely defined up to a set of measure zero; i.e. if there are two such functions, they differ only on a set of measure zero.
2.8.P10. Show that for any $t \in [0, 1]$ that is not of the form $\tfrac{k}{2^n}$ (i.e. $t$ is not a 'dyadic rational') for any integers $k$ and $n$, the parity of the (obviously unique) integer $j$ such that $\tfrac{j-1}{2^n} < t < \tfrac{j}{2^n}$ keeps changing between even and odd as $n$ increases.
2.8.P11. Prove Lemma 2.8.14: for distinct nonnegative integers $k_1, k_2, \ldots, k_n$, $\int_0^1 r_{k_1}r_{k_2}\cdots r_{k_n} = 0$.
2.8.P12. Show that completeness of the orthonormal set of Hermite functions in $L^2(-\infty, \infty)$ is equivalent to that of the orthonormal set of Laguerre functions in $L^2(0, \infty)$.

2.8.P13. Let $X$ be a complex inner product space of dimension $n$. Show that $X$ is isometrically isomorphic to $\mathbb{C}^n$ and is hence complete.

2.9 Complete Orthonormal Sets

Recall that a set M of vectors in a pre-Hilbert space is said to be orthogonal if


x ⊥ y whenever x and y are distinct vectors of M [Definition 2.8.1]. The orthogonal
set M is said to be orthonormal if, in addition, ||x|| = 1 for every vector x in M.
An orthonormal set is said to be complete if it is a maximal orthonormal set
[Definition 2.8.2]. We shall show that there are complete orthonormal sets in any
nontrivial inner product space and discuss a few of the many important examples.
One also speaks of complete orthogonal sets, which are defined analogously. The
classical result of Riesz–Fischer and Parseval will be proved. These will lead to the
identification of all infinite-dimensional Hilbert spaces.
We begin by showing that a nontrivial inner product space H (H 6¼ {0}) contains
a complete orthonormal set.

Theorem 2.9.1 Let H be an inner product space over F and let H 6¼ {0}. Then H
contains a complete orthonormal set.
Proof Let S denote the collection of all orthonormal sets in H. Since, for any nonzero vector x, the set {x/‖x‖} is an orthonormal set, it follows that S ≠ ∅. The
collection S is partially ordered by inclusion. We wish to show that every totally
ordered subset of S has an upper bound in S. It will then follow by Zorn’s Lemma
that S has a maximal element, namely a complete orthonormal set.
Let T = {A_α}_{α∈K}, where K is an indexing set, be any totally ordered subset of S. Then the set ∪_α A_α is an upper bound for T; indeed, A_β ⊆ ∪_α A_α for each β. We next show that ∪_α A_α is orthonormal. Let x and y be any two distinct elements of ∪_α A_α, so that x ∈ A_β and y ∈ A_γ for some β and γ in the indexing set K. Since T is totally ordered, either A_β ⊆ A_γ or A_γ ⊆ A_β. Supposing A_β ⊆ A_γ, it follows that x, y ∈ A_γ. So x ⊥ y and ‖x‖ = ‖y‖ = 1. Thus, ∪_α A_α is seen to be orthonormal.
By Zorn’s Lemma [Sect. 1.3], S has a maximal element. This completes the
proof. h
A slight modification of the proof of Theorem 2.9.1 yields the following
corollary.
Corollary 2.9.2 Let H be an inner product space over F. If E ⊆ H is an orthonormal set, then there exists a complete orthonormal set S such that E ⊆ S.
The next result contains an alternate description of complete orthonormal sets.
Theorem 2.9.3 Let H be an inner product space over F. Suppose that S ⊆ H is an orthonormal set. Then the following are equivalent:
(a) S is a complete orthonormal set;
(b) If x 2 H is such that x ⊥ S, then x = 0.

Proof Suppose S is a complete orthonormal set. If x ∈ H is such that x ⊥ S and x ≠ 0, then S ∪ {x/‖x‖} is an orthonormal set that properly contains S, contradicting the fact that S is a complete orthonormal set.
On the other hand, suppose x ⊥ S implies x = 0. If S were not a complete orthonormal set, there would exist some orthonormal set T ⊆ H that properly contains S. Hence, if x ∈ T\S then ‖x‖ = 1 and x ⊥ S. This contradicts the assumption that x ⊥ S implies x = 0.
Therefore, the orthonormal set S is complete. h
So far we have considered examples of countable orthonormal sets in pre-Hilbert
spaces. If a Hilbert space contains a countable complete orthonormal set, then it is
said to be separable. This definition of separability is equivalent to Definition
1.2.10 as the next theorem shows.
Let S be a countable dense set in a Hilbert space H 6¼ {0}. By progressively
reducing S, if necessary, it can be turned into a linearly independent set. The Gram–
Schmidt orthonormalisation process applied to the linearly independent set renders

it into an orthonormal set. This orthonormal set is in fact complete. More precisely,
we have the following theorem.
Theorem 2.9.4 Let H 6¼ {0} be a Hilbert space that contains a countable dense
subset S. Then H contains a countable complete orthonormal set that is obtained
from S by the Gram–Schmidt orthonormalisation process. Thus H is separable.
Let H 6¼ {0} contain a countable, complete orthonormal set T, then H contains a
countable dense set, namely the finite rational linear combinations of vectors in T.
Proof We assume, as we may, that 0 ∉ S. Enumerate the vectors in S as a sequence $\{x_n\}_{n\ge 1}$ and let $y_1 = x_{n_1}$, where $n_1 = 1$. If all the $x_n$ for $n > n_1$ are scalar multiples of $x_{n_1}$, then the set $\{x_{n_1}\}$ is the linearly independent set obtained from S. Otherwise, let $y_2 = x_{n_2}$ be the first $x_n$ which is not a scalar multiple of $x_{n_1}$. Then for $n < n_2$, $x_n$ is a scalar multiple of $x_{n_1}$. If all the $x_n$ for $n > n_2$ are expressible as linear combinations of $x_{n_1}$ and $x_{n_2}$, then the set $\{x_{n_1}, x_{n_2}\}$ is the linearly independent set obtained from S. Otherwise, let $y_3 = x_{n_3}$ be the first $x_n$ which is independent of $x_{n_1}$ and $x_{n_2}$. Then for $n < n_3$, $x_n$ is a linear combination of $x_{n_1}$ and $x_{n_2}$. The proof continues inductively, and
we thus obtain a finite or countably infinite linearly independent set $\{y_1, y_2, \ldots\} \subseteq S$. Let X be the smallest linear subspace of H containing $\{y_1, y_2, \ldots\}$. It is clear that $S \subseteq X$, since if $x_j \in S$ then $x_j$ is a linear combination of $y_1, y_2, \ldots, y_k$, where k is chosen so that $n_k \le j < n_{k+1}$. This says that X is dense in H. Orthonormalise $\{y_1, y_2,$
…} by the Gram–Schmidt procedure to obtain the orthonormal set {u1, u2, …}. It
remains to show that the orthonormal set $\{u_1, u_2, \ldots\}$ is complete.

Let $x \in H$ be such that $(x, u_k) = 0$ for $k = 1, 2, \ldots$. Then $(x, \sum_{k=1}^{n} a_k u_k) = 0$ for all finite linear combinations of the $u_n$, and so $(x, y) = 0$ for all $y \in X$. Let $\{z_n\}_{n\ge 1}$ be a sequence in X such that $\|x - z_n\| \to 0$ as $n \to \infty$. Then $\|x\|^2 = (x, x) - (x, z_n) = (x, x - z_n) \le \|x\|\,\|x - z_n\| \to 0$ as $n \to \infty$, so $x = 0$.
Clearly, the closure of the rational linear combinations of the vectors of $T = \{x_k\}$ contains all possible linear combinations of T, i.e. contains [T], and is hence the same as $\overline{[T]}$. Let $x \in H$. Now, $\sum_{k=1}^{n}(x, x_k)x_k \in [T]$. Using Bessel's Inequality [Theorem 2.8.6], it follows that $\sum_{k=1}^{\infty}(x, x_k)x_k$ converges to some $y \in H$. In fact, $y \in \overline{[T]}$. Suppose $y \ne x$. Then

$$(x - y, x_k) = (x, x_k) - (y, x_k) = (x, x_k) - (x, x_k) = 0.$$

Using the completeness of T, it follows that $x - y = 0$. Thus $x \in \overline{[T]}$. This completes the proof. □
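The 'progressive reduction' and orthonormalisation steps in the proof above can be sketched concretely. The function below (an illustrative helper, assuming NumPy) performs Gram–Schmidt on a finite list of vectors in $\mathbb{C}^n$, silently dropping any vector that is numerically dependent on its predecessors:

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalise a list of vectors, dropping any vector that is
    (numerically) dependent on its predecessors -- the reduction step
    used in the proof of Theorem 2.9.4."""
    basis = []
    for v in vectors:
        w = np.array(v, dtype=complex)
        for u in basis:
            w = w - np.vdot(u, w) * u   # subtract the component along u
        norm = np.linalg.norm(w)
        if norm > tol:                  # keep only independent vectors
            basis.append(w / norm)
    return basis

vecs = [(1, 1, 0), (2, 2, 0), (1, 0, 1)]   # the second vector is dependent
ons = gram_schmidt(vecs)
assert len(ons) == 2
for i, u in enumerate(ons):
    for j, v in enumerate(ons):
        assert abs(np.vdot(u, v) - (1.0 if i == j else 0.0)) < 1e-10
```

Note that `np.vdot` conjugates its first argument, which matches the sesquilinear inner product used throughout the chapter.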
However, there are Hilbert spaces which contain non-denumerable orthonormal
sets and are, therefore, nonseparable. We give below examples of such Hilbert
spaces.
Examples 2.9.5
(i) Consider the collection X of functions on R representable in the form

$$x(t) = \sum_{k=1}^{n} a_k e^{i\lambda_k t}$$

for arbitrary n, real numbers $\lambda_1, \lambda_2, \ldots, \lambda_n$ and complex coefficients $a_1, a_2, \ldots, a_n$. X is a vector space, and an inner product in X is defined by

$$(x, y) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} x(t)\overline{y(t)}\,dt.$$

If $y(t) = \sum_{k=1}^{m} b_k e^{i\mu_k t}$, then

$$(x, y) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}\sum_{j=1}^{n}\sum_{k=1}^{m} a_j\overline{b_k}\,e^{i(\lambda_j - \mu_k)t}\,dt = \sum_{j}\sum_{k} a_j\overline{b_k} \qquad (2.56)$$

since

$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} e^{i\lambda t}\,dt = \begin{cases} 1 & \text{if } \lambda = 0 \\ 0 & \text{if } \lambda \neq 0. \end{cases}$$

The reader will note that the summation in (2.56) is taken over all j and k for which $\lambda_j = \mu_k$. X together with the inner product defined in (2.56) is an inner product space. This is known as the space of trigonometric polynomials on R. Its completion H is a Hilbert space. The set $\{u_r(t) = e^{irt}: r \in \mathbb{R}\}$ is an uncountable orthonormal set in the Hilbert space H, where $H = \overline{X}$, the closure of X.
(ii) Let X be a nonempty set. Consider the Hilbert space $L^2(X, \mathcal{A}, \mu)$, where $\mathcal{A}$ denotes the collection of all subsets of X and μ is the counting measure on X; that is, if $E \in \mathcal{A}$, μ(E) is equal to the number of points in E when E is finite and is infinite if E is infinite. The space $L^2(X, \mathcal{A}, \mu)$ is denoted by $\ell^2(X)$.

Consider the subset of $\ell^2(X)$ consisting of all characteristic functions of one-point sets in X, i.e. $\{\chi_{\{x\}}: x \in X\}$. Observe that $(\chi_{\{x\}}, \chi_{\{y\}}) = 0$ for $x \ne y$ and $\|\chi_{\{x\}}\| = 1$. Suppose now $x \ne y$ and consider the distance between $\chi_{\{x\}}$ and $\chi_{\{y\}}$:

$$\left\|\chi_{\{x\}} - \chi_{\{y\}}\right\|_2^2 = \sum_{z\in X}\left|\chi_{\{x\}}(z) - \chi_{\{y\}}(z)\right|^2 = 2.$$

Thus

$$\left\|\chi_{\{x\}} - \chi_{\{y\}}\right\|_2 = \sqrt{2}.$$
The open balls $S(\chi_{\{x\}}, 1/\sqrt{2})$ with centres $\chi_{\{x\}}$ and radii $1/\sqrt{2}$ are nonoverlapping, since no ball $S(\chi_{\{x\}}, 1/\sqrt{2})$ contains a point of the set $\{\chi_{\{x\}}: x \in X\}$ other than its centre. Now suppose that X is an uncountably infinite set. We claim that the space $\ell^2(X)$ is nonseparable. Suppose not, and let $\{z_k\}$ be a countable dense set in $\ell^2(X)$. Each of the balls $S(\chi_{\{x\}}, 1/\sqrt{2})$ will contain a point $z_k$ of the countable dense set. Since the balls are nonoverlapping, the points contained in different balls cannot be identical. We thus have an injective map from X into the countable dense set, which is not possible as X is uncountable.
In view of the examples above, we consider orthonormal sets which are not
necessarily countable. We begin with the following definition that formalises the
remarks above Definition 2.7.5.
Definition 2.9.6 Suppose $\{x_\alpha: \alpha \in K\}$, where K is an indexing set, is a collection of elements from a normed linear space X. $\{x_\alpha: \alpha \in K\}$ is said to be summable to $x \in X$, written

$$\sum_{\alpha\in K} x_\alpha = x \quad \text{or} \quad \sum_{\alpha} x_\alpha = x,$$

if for all $\varepsilon > 0$, there exists some finite set of indices $J_0 \subseteq K$, such that for any finite set of indices $J \supseteq J_0$,

$$\left\|x - \sum_{\alpha\in J} x_\alpha\right\| < \varepsilon.$$

This notion of summability can be easily reconciled with the usual notion of
summability of a series when K consists of the natural numbers.
Remarks 2.9.7
(i) Suppose $S = \{x_\alpha: \alpha \in K\}$, where K is an indexing set, is a collection of elements from the normed linear space R. If $0 \le x_\alpha < \infty$ for each $\alpha \in K$, then $\sum_\alpha x_\alpha$ is the supremum of the set of all finite sums $x_{\alpha_1} + x_{\alpha_2} + \cdots + x_{\alpha_n}$, where $\alpha_1, \alpha_2, \ldots, \alpha_n$ are distinct members of K. In this situation, the sum can be infinity, which is outside the space R. However, if it is within the space, then the two notions of summation are identical. If $x_\alpha = \infty$ for some $\alpha \in K$, then the sum $\sum_\alpha x_\alpha$ is equal to infinity.

(ii) It is easy to check that if $\sum_\alpha x_\alpha = x$ and $\sum_\alpha y_\alpha = y$, then

$$\sum_\alpha (x_\alpha + y_\alpha) = x + y \quad \text{and} \quad \sum_\alpha \lambda x_\alpha = \lambda x, \quad \lambda \in F.$$
The following proposition says that though we are summing over an arbitrary indexing set, it is the sum over only a countable set of indices that matters.

Proposition 2.9.8 Let X be a Banach space over F and suppose $\{x_j: j \in K\} \subseteq X$. The family $\{x_j: j \in K\}$ is summable if, and only if, for every $\varepsilon > 0$, there exists a finite set $J_0$ of indices such that $\left\|\sum_{j\in J} x_j\right\| < \varepsilon$ whenever J is a finite set of indices disjoint from $J_0$.

If $\{x_j\}$ is summable, then the set of those indices for which $x_j \ne 0$ is countable.

Proof If $\{x_j\}$ is a summable family with sum x, then for every $\varepsilon > 0$, there exists a finite set $J_0$ such that $\left\|x - \sum_{j\in J_1} x_j\right\| < \varepsilon/2$ whenever $J_1 \supseteq J_0$ and is finite. It follows that if $J \cap J_0 = \emptyset$, then

$$\left\|\sum_{j\in J} x_j\right\| = \left\|\sum_{j\in J\cup J_0} x_j - \sum_{j\in J_0} x_j\right\| \le \left\|x - \sum_{j\in J\cup J_0} x_j\right\| + \left\|x - \sum_{j\in J_0} x_j\right\| < \varepsilon.$$

The reader will note that we have not used the completeness of X in the above argument.

If, conversely, the condition is satisfied, then for every positive integer n, there exists a finite set $J_n$ such that $\left\|\sum_{j\in J} x_j\right\| < 1/n$ whenever J is a finite set of indices and $J \cap J_n = \emptyset$. By replacing $J_n$ by $J_1 \cup J_2 \cup \cdots \cup J_n$, $n = 1, 2, \ldots$, we see that there is a sequence $\{J_n\}$ of finite sets of indices which is increasing. If $n < m$, then

$$\left\|\sum_{j\in J_m} x_j - \sum_{j\in J_n} x_j\right\| = \left\|\sum_{j\in J_m\setminus J_n} x_j\right\| < 1/n$$

since $(J_m\setminus J_n) \cap J_n = \emptyset$. By the completeness of X, it follows that there exists x such that $\left\|\sum_{j\in J_n} x_j - x\right\| \to 0$. For $\varepsilon > 0$, there exists $n_0 > 2/\varepsilon$ such that $\left\|\sum_{j\in J_{n_0}} x_j - x\right\| < \varepsilon/2$. If J is any finite set of indices containing $J_{n_0}$, then

$$\left\|x - \sum_{j\in J} x_j\right\| \le \left\|x - \sum_{j\in J_{n_0}} x_j\right\| + \left\|\sum_{j\in J\setminus J_{n_0}} x_j\right\| < \varepsilon/2 + 1/n_0 < \varepsilon.$$

Consequently, the family $\{x_j\}$ is summable with sum x.

Finally, we show that $x_j = 0$ for all but countably many j. If j is an index which does not belong to $J_1 \cup J_2 \cup \cdots$, then $\|x_j\| < 1/n$ for every n. The reader will note that we have not used the completeness of X in this argument. This completes the proof. □
If the sequence $\{x_n\}_{n\ge 1}$ in a Hilbert space is orthogonal, then $\sum_{n=1}^{\infty} x_n$ converges if, and only if, $\sum_{n=1}^{\infty}\|x_n\|^2 < \infty$. More generally, the following theorem holds.

Theorem 2.9.9 Let H be a Hilbert space and let $\{x_j: j \in K\}$ be an orthogonal family in H, i.e. $x_j \perp x_k$ for $j \ne k$. Then $\sum_{j\in K} x_j$ converges if, and only if, $\sum_{j\in K}\|x_j\|^2 < \infty$. Moreover, if $\sum_{j\in K} x_j = x$, then $\|x\|^2 = \sum_{j\in K}\|x_j\|^2$.

Proof If $\{x_j\}$ is summable, then for every positive number $\varepsilon$ there exists a finite set $J_0$ such that $\left\|\sum_{j\in J} x_j\right\| < \varepsilon$ whenever $J \cap J_0 = \emptyset$, and consequently

$$\left\|\sum_{j\in J} x_j\right\|^2 = \sum_{j\in J}\left\|x_j\right\|^2 < \varepsilon^2$$

whenever $J \cap J_0 = \emptyset$.

If, conversely, $\sum_{j\in K}\|x_j\|^2 < \infty$, then for every positive $\varepsilon$ there exists a finite set $J_0$ such that $\sum_{j\in J}\|x_j\|^2 < \varepsilon^2$ (consequently $\left\|\sum_{j\in J} x_j\right\|^2 < \varepsilon^2$) whenever $J \cap J_0 = \emptyset$. Summability now follows from the previous Proposition 2.9.8.

Observe that

$$\|x\|^2 = (x, x) = \left(\sum_{j\in K} x_j, x\right) = \sum_{j\in K}\left(x_j, \sum_{k\in K} x_k\right) = \sum_{j\in K}\sum_{k\in K}(x_j, x_k) = \sum_{j\in K}(x_j, x_j) = \sum_{j\in K}\left\|x_j\right\|^2. \qquad \Box$$
The following general form of Bessel’s Inequality holds.
Theorem 2.9.10 (Bessel’s Inequality) Let S = {xa: a 2 K} be an orthonormal set
in an inner product space H and let x 2 H. Then we have
X
jðx; xa Þj2  k xk2 :
a2K

Proof The inequality in Theorem 2.8.6 implies that for each finite set J  K of
indices, we have
X
jðx; xa Þj2  k xk2 :
a2J

It now follows using Remark 2.9.7 that


$$\sum_{\alpha\in K}|(x, x_\alpha)|^2 = \sup\left\{\sum_{\alpha\in J}|(x, x_\alpha)|^2 : J \subseteq K,\ J \text{ finite}\right\} \le \|x\|^2. \qquad \Box$$

Remark 2.9.11 The set $A = \{\alpha \in K: (x, x_\alpha) \ne 0\}$ is countable. By Bessel's Inequality, $\sum_{\alpha\in K}|(x, x_\alpha)|^2 \le \|x\|^2$. The countability now follows from Proposition 2.9.8.
Theorem 2.9.12 Let $\{x_\alpha: \alpha \in K\}$ be an orthonormal set in a Hilbert space H. For every $x \in H$, the vector $y = \sum_{\alpha\in K}(x, x_\alpha)x_\alpha$ exists in H and $x - y \perp x_\alpha$ for every $\alpha \in K$.

Proof By Bessel's Inequality 2.9.10, there is a countable set of $x_\alpha$ for which $(x, x_\alpha) \ne 0$. Arrange them as a sequence $x_1, x_2, \ldots$. Let $\varepsilon > 0$ be given. Then

$$\left\|\sum_{i=n}^{n+k}(x, x_i)x_i\right\|^2 = \left(\sum_{i=n}^{n+k}(x, x_i)x_i,\ \sum_{j=n}^{n+k}(x, x_j)x_j\right) = \sum_{i=n}^{n+k}\sum_{j=n}^{n+k}(x, x_i)\overline{(x, x_j)}(x_i, x_j) = \sum_{i=n}^{n+k}|(x, x_i)|^2 < \varepsilon$$

for large n and any positive integer k, again using Bessel's Inequality. It follows that the sequence of partial sums $\left\{\sum_{i=1}^{n}(x, x_i)x_i\right\}_{n\ge 1}$ is Cauchy in H, and H being a Hilbert space, $y = \sum_{i=1}^{\infty}(x, x_i)x_i$ exists in H and equals $\sum_{\alpha\in K}(x, x_\alpha)x_\alpha$. Note that the foregoing argument is valid whether F = C or R, as in the latter case $\overline{(x, x_j)} = (x, x_j)$.

It remains to show that $x - y \perp x_\alpha$ for every $\alpha \in K$. For each n, let $y_n = \sum_{k=1}^{n}(x, x_k)x_k$. We first prove that $(x - y_n, x_\alpha) = 0$ for those $\alpha$ for which $(x, x_\alpha) = 0$ and any n. Note that $x_\alpha$ cannot appear in the representation of $y_n$ for any n. Therefore, $(x_k, x_\alpha) = 0$ for all k and hence

$$(x - y_n, x_\alpha) = (x, x_\alpha) - \sum_{k=1}^{n}(x, x_k)(x_k, x_\alpha) = 0.$$

Next we prove $(x - y_n, x_\alpha) = 0$ for those $\alpha$ for which $(x, x_\alpha) \ne 0$ and sufficiently large n. Note that $x_\alpha$ must appear in the representation of $y_n$ for sufficiently large n. Therefore,
$$(x - y_n, x_\alpha) = (x, x_\alpha) - \sum_{k=1}^{n}(x, x_k)(x_k, x_\alpha) = (x, x_\alpha) - (x, x_\alpha) = 0$$

for sufficiently large n. Now, for sufficiently large n,

$$|(x - y, x_\alpha)| \le |(x - y_n, x_\alpha)| + |(y_n - y, x_\alpha)| \le 0 + \|y_n - y\|\,\|x_\alpha\| = \|y_n - y\| \quad (\text{using orthonormality of } \{x_\alpha: \alpha \in K\}).$$

Since $\|y_n - y\| \to 0$ as $n \to \infty$, it follows that $x - y \perp x_\alpha$ for every $\alpha \in K$. □


We next investigate the problem of writing an arbitrary element x in a Hilbert space H as a limit of linear combinations of elements of an orthonormal set. We begin with a definition.

Definition 2.9.13 Let H be a Hilbert space and $S = \{x_\alpha: \alpha \in K\}$ be an orthonormal set in H. We say that S is a basis (orthonormal basis) in H if for every $x \in H$, the following holds:

$$x = \sum_{\alpha\in K}(x, x_\alpha)x_\alpha.$$

The following theorem provides a characterisation of a basis in a Hilbert space.


Theorem 2.9.14 If H is a Hilbert space, then a set $S = \{x_\alpha: \alpha \in K\}$ consisting of orthonormal vectors in H is a basis if, and only if, S is a complete orthonormal system of vectors.

Proof Suppose S is a basis in H. If $x \in H$ satisfies $(x, x_\alpha) = 0$, $\alpha \in K$, then the definition of the basis gives

$$x = \sum_{\alpha\in K}(x, x_\alpha)x_\alpha = 0.$$

Thus, S is complete by Theorem 2.9.3.

On the other hand, suppose that S is complete in H. Let $\beta \in K$. Then for any $x \in H$, the sum $\sum_{\alpha\in K}(x, x_\alpha)x_\alpha$ exists by Theorem 2.9.12 and

$$\left(x - \sum_{\alpha\in K}(x, x_\alpha)x_\alpha,\ x_\beta\right) = (x, x_\beta) - \sum_{\alpha\in K}(x, x_\alpha)(x_\alpha, x_\beta) = (x, x_\beta) - (x, x_\beta) = 0,$$

using the fact that $S = \{x_\alpha: \alpha \in K\}$ consists of orthonormal vectors. Thus, the vector $x - \sum_{\alpha\in K}(x, x_\alpha)x_\alpha$ is orthogonal to $x_\beta$ for every $\beta \in K$. The hypothesis, together with Theorem 2.9.3, now implies

$$x = \sum_{\alpha\in K}(x, x_\alpha)x_\alpha,$$

i.e. S is a basis in H. □
Examples 2.9.15
(i) The set $e_1 = (1, 0, 0, \ldots)$, $e_2 = (0, 1, 0, 0, \ldots)$, … is a complete orthonormal set (basis) in $\ell^2$. Indeed, if $x = (x_1, x_2, \ldots) \in \ell^2$ and $(x, e_j) = 0$, $j = 1, 2, \ldots$, then $x_j = 0$, $j = 1, 2, \ldots$, and so $x = 0$. Moreover, if $x = (x_1, x_2, \ldots) \in \ell^2$, then $x = \sum_{i=1}^{\infty}(x, e_i)e_i$, where the partial sums converge in the $\ell^2$-norm: $\left\|\sum_{i=1}^{n}(x, e_i)e_i - x\right\|^2 = \sum_{i=n+1}^{\infty}|x_i|^2$ is small for large n.

(ii) [Cf. Examples 2.9.5(ii)] Let X be a non-denumerable set. The set $\ell^2(X) = L^2(X, \mathcal{A}, \mu)$, where $\mathcal{A}$ denotes the collection of all subsets of X and μ is the counting measure on X, is a nonseparable Hilbert space. The set $\{\chi_{\{x\}}: x \in X\}$ of characteristic functions is an uncountable orthonormal set in $\ell^2(X)$. In fact, it is a complete orthonormal set: if $f \in \ell^2(X)$ and $(f, \chi_{\{x\}}) = 0$ for $x \in X$, then $\sum_{y\in X} f(y)\chi_{\{x\}}(y) = 0$, which implies $f(x) = 0$. So f is the identically zero function [see Theorem 2.9.3].

(iii) The Rademacher system is not complete. The function $f(x) = \cos 2\pi x$ is orthogonal to all the Rademacher functions [see Example 2.8.13(v)].
The following theorem provides various characterisations of complete
orthonormal sets and helps decide which orthonormal sets are complete. Some of
the characterisations have already been described.
Theorem 2.9.16 Let $S = \{x_\alpha: \alpha \in K\}$ be an orthonormal set in a Hilbert space H. Each of the following conditions implies the other five:

(a) S is a complete orthonormal set in H;
(b) $x \perp S$ implies $x = 0$;
(c) $x \in H$ implies $x = \sum_{\alpha\in K}(x, x_\alpha)x_\alpha$; that is, S is a basis in H;
(d) $\|x\|^2 = \sum_{\alpha\in K}|(x, x_\alpha)|^2$ for each $x \in H$ (Parseval's Identity);
(e) for $x, y \in H$, $(x, y) = \sum_{\alpha\in K}(x, x_\alpha)\overline{(y, x_\alpha)}$;
(f) $\overline{[S]} = H$; that is, the smallest subspace of H containing S is dense in H.

The equality in (c) means that the right-hand side has only a countable number of nonzero terms, and every rearrangement of this series converges to x [Definition 2.9.6]. The equations in (d) and (e) are to be interpreted analogously. Of course, (d) is a special case of (e).
Proof The equivalence of (a) and (b) has been proved [Theorem 2.9.3]. So also the equivalence of (a) and (c) [Theorem 2.9.14]. We shall prove that (b) ⇒ (f) ⇒ (d) ⇒ (e) ⇒ (b).

(b) implies (f). Let $M = \overline{[S]}$. Since [S] is a subspace, so is M. (For $x, y \in M$, there exist sequences $\{x_n\}_{n\ge 1}$ and $\{y_n\}_{n\ge 1}$ in [S] such that $x_n \to x$ and $y_n \to y$; then $x_n + y_n \to x + y$ and $\lambda x_n \to \lambda x$, $\lambda \in F$.) Suppose [S] is not dense in H. Then $M \ne H$, so that there exists a nonzero vector x in H which is not in M. The vector $y = \sum_{\alpha\in K}(x, x_\alpha)x_\alpha$ exists in H and $x - y \perp x_\alpha$ for every $\alpha \in K$ [Theorem 2.9.12]. Moreover, $x \ne y$ since $y \in M$ and $x \notin M$, and hence $x - y \ne 0$. This contradicts (b).

(f) implies (d). Suppose (f) holds. For $x \in H$ and $\varepsilon > 0$, there exists a finite set $\{x_{\alpha_1}, x_{\alpha_2}, \ldots, x_{\alpha_n}\}$ such that some linear combination of these vectors has distance less than $\varepsilon$ from x. By Remark 2.8.8(ii), the vector $z = \sum_{i=1}^{n}(x, x_{\alpha_i})x_{\alpha_i}$ provides the best approximation to the vector x in the linear span of $\{x_{\alpha_1}, x_{\alpha_2}, \ldots, x_{\alpha_n}\}$; so $\|x - z\| < \varepsilon$, and hence $\|x\| < \|z\| + \varepsilon$, which implies

$$(\|x\| - \varepsilon)^2 < \|z\|^2 = \sum_{i=1}^{n}|(x, x_{\alpha_i})|^2 \le \sum_{\alpha\in K}|(x, x_\alpha)|^2.$$

Since $\varepsilon > 0$ is arbitrary, we obtain $\|x\|^2 \le \sum_{\alpha\in K}|(x, x_\alpha)|^2$. The result now follows using Bessel's Inequality 2.9.10.
(d) implies (e). Note that (d) can be written as

$$(x, x) = \left(\sum_{\alpha \in \Lambda}(x, x_\alpha)x_\alpha,\ \sum_{\alpha \in \Lambda}(x, x_\alpha)x_\alpha\right).$$

Fix $x, y \in H$. If (d) holds, then

$$(x + \lambda y, x + \lambda y) = \left(\sum_{\alpha \in \Lambda}(x + \lambda y, x_\alpha)x_\alpha,\ \sum_{\alpha \in \Lambda}(x + \lambda y, x_\alpha)x_\alpha\right)$$

for any scalar $\lambda$. Hence,

$$\bar{\lambda}(x, y) + \lambda(y, x) = \bar{\lambda}\left(\sum_{\alpha \in \Lambda}(x, x_\alpha)x_\alpha,\ \sum_{\alpha \in \Lambda}(y, x_\alpha)x_\alpha\right) + \lambda\left(\sum_{\alpha \in \Lambda}(y, x_\alpha)x_\alpha,\ \sum_{\alpha \in \Lambda}(x, x_\alpha)x_\alpha\right). \tag{2.57}$$

Taking $\lambda = 1$ and $\lambda = i$, (2.57) shows that the real and imaginary parts of $(x, y)$ and $\left(\sum_{\alpha \in \Lambda}(x, x_\alpha)x_\alpha, \sum_{\alpha \in \Lambda}(y, x_\alpha)x_\alpha\right)$ are equal. Hence

$$(x, y) = \left(\sum_{\alpha \in \Lambda}(x, x_\alpha)x_\alpha,\ \sum_{\alpha \in \Lambda}(y, x_\alpha)x_\alpha\right) = \sum_{\alpha \in \Lambda}(x, x_\alpha)(x_\alpha, y).$$

(e) implies (b). Finally, if (b) fails to be true, there exists a vector $z \ne 0$ such that $(z, x_\alpha) = 0$ for all $\alpha \in \Lambda$. If $x = y = z$, then $\|z\|^2 = (x, y) \ne 0$ but $\sum_{\alpha \in \Lambda}(z, x_\alpha)(x_\alpha, z) = 0$. Hence (e) fails to hold. Thus, (e) implies (b) and the proof is complete. □
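The equivalent conditions can be made concrete in finite dimensions. The sketch below (illustrative only; the basis, test vectors and tolerances are arbitrary choices, with real scalars so the conjugates in (e) drop out) checks (c), (d) and (e) for an orthonormal basis of $\mathbb{R}^3$:

```python
import math

# A sample orthonormal basis of R^3 (real scalars), used to check
# conditions (c), (d) and (e) of Theorem 2.9.16 numerically.
e1 = (1/math.sqrt(2), 1/math.sqrt(2), 0.0)
e2 = (1/math.sqrt(2), -1/math.sqrt(2), 0.0)
e3 = (0.0, 0.0, 1.0)
basis = [e1, e2, e3]

def inner(u, v):
    return sum(a*b for a, b in zip(u, v))

x = (3.0, -1.0, 2.0)
y = (0.5, 4.0, -2.5)

# (c): x equals the sum over the basis of (x, x_a) x_a
expansion = [sum(inner(x, e)*e[i] for e in basis) for i in range(3)]
assert all(abs(expansion[i] - x[i]) < 1e-12 for i in range(3))

# (d): Parseval's identity  ||x||^2 = sum |(x, x_a)|^2
assert abs(inner(x, x) - sum(inner(x, e)**2 for e in basis)) < 1e-12

# (e): (x, y) = sum (x, x_a)(y, x_a)  (real case, so no conjugates)
assert abs(inner(x, y) - sum(inner(x, e)*inner(y, e) for e in basis)) < 1e-12
```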
To deal with completeness of orthonormal sets in the next few examples, we will use the equivalent descriptions provided in Theorem 2.9.16.

Examples 2.9.17
(i) In the completion $H$ of the inner product space of trigonometric polynomials [see Example 2.9.5(i)], the uncountable orthonormal set $\{u_r(t) = \exp(irt) : r \in \mathbb{R}\}$ is complete, since $\overline{[\{u_r\}]} = H$ [equivalence of (a) and (f) in Theorem 2.9.16].
(ii) Let $H = L^2[-1, 1]$ and for $n = 0, 1, 2, \ldots$, let $P_n$ denote the Legendre polynomial of degree $n$. Note that $P_n$ is obtained by applying the Gram–Schmidt orthonormalisation process to the linearly independent vectors $\{1, t, t^2, \ldots, t^n\}$. Moreover,

$$\mathrm{span}\{1, t, t^2, \ldots, t^n\} = \mathrm{span}\{P_0, P_1, \ldots, P_n\}. \tag{2.58}$$

This is true for each $n$. Let $x \in H$ and $\varepsilon > 0$. By Example 2.5.1, there exists $y \in C[-1, 1]$ such that $\|x - y\| < \varepsilon$. By Weierstrass' Theorem, there is a polynomial $Q(t)$ such that $|y(t) - Q(t)| < \varepsilon$ for all $t \in [-1, 1]$. Then

$$\|y - Q\|_2^2 = \int_{-1}^{1}|y(t) - Q(t)|^2\,dt < 2\varepsilon^2.$$

Thus

$$\|x - Q\|_2 \le \|x - y\|_2 + \|y - Q\|_2 < (1 + \sqrt{2})\varepsilon.$$

This shows that the set of all polynomials on $[-1, 1]$ is dense in $H$. In view of (2.58), the set $\{P_0, P_1, \ldots\}$ constitutes a complete orthonormal set [Theorem 2.9.16(f)].
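As an illustrative sketch (not part of the text's argument; the integration grid and tolerances are arbitrary choices), Gram–Schmidt applied to $\{1, t, t^2\}$ in $L^2[-1, 1]$ reproduces, up to sign, the first three normalised Legendre polynomials $\sqrt{(2n+1)/2}\,P_n$:

```python
import math

# Gram-Schmidt on {1, t, t^2} in L^2[-1, 1], checked against the known
# normalised Legendre polynomials (midpoint-rule integration).
N = 20000

def integrate(f):
    h = 2.0/N
    return sum(f(-1.0 + (k + 0.5)*h) for k in range(N)) * h

def ip(f, g):
    return integrate(lambda t: f(t)*g(t))

monomials = [lambda t: 1.0, lambda t: t, lambda t: t*t]
ortho = []
for v in monomials:
    w = v
    for u in ortho:  # subtract the components along earlier vectors
        c = ip(w, u)
        w = (lambda w, u, c: lambda t: w(t) - c*u(t))(w, u, c)
    nrm = math.sqrt(ip(w, w))
    ortho.append((lambda w, nrm: lambda t: w(t)/nrm)(w, nrm))

# Normalised Legendre polynomials sqrt((2n+1)/2) P_n(t), n = 0, 1, 2
legendre = [lambda t: math.sqrt(0.5),
            lambda t: math.sqrt(1.5)*t,
            lambda t: math.sqrt(2.5)*0.5*(3*t*t - 1.0)]
for q, p in zip(ortho, legendre):
    # |(q, p)| = 1 exactly when q = ±p and both are unit vectors
    assert abs(abs(ip(q, p)) - 1.0) < 1e-6
```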
(iii) Let $H = L^2([-\pi, \pi], \frac{dt}{2\pi})$ and for $n = 0, \pm 1, \pm 2, \ldots$,

$$u_n(t) = e^{int}, \quad t \in [-\pi, \pi].$$

Then $\{u_n : n = 0, \pm 1, \pm 2, \ldots\}$ is an orthonormal set in $H$. Indeed,

$$(u_n, u_m) = \frac{1}{2\pi}\int_{-\pi}^{\pi}e^{i(n-m)t}\,dt = \begin{cases}1 & \text{if } n = m\\ 0 & \text{if } n \ne m.\end{cases}$$

The orthonormal set $\{u_n : n = 0, \pm 1, \pm 2, \ldots\}$ is usually called the trigonometric system. For $x \in H$ and $n = 0, \pm 1, \pm 2, \ldots$,

$$(x, u_n) = \frac{1}{2\pi}\int_{-\pi}^{\pi}x(t)e^{-int}\,dt = \hat{x}(n),$$

where $\hat{x}(n)$ is the $n$th Fourier coefficient of $x$.
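These orthonormality relations can be spot-checked numerically (illustrative only; the grid size, index range and tolerance are arbitrary choices):

```python
import cmath
import math

# Midpoint-rule check of (u_n, u_m) = (1/(2*pi)) * ∫ e^{i(n-m)t} dt
# over [-pi, pi] for a few indices of the trigonometric system.
N = 4096

def ip(n, m):
    h = 2*math.pi/N
    s = sum(cmath.exp(1j*(n - m)*(-math.pi + (k + 0.5)*h)) for k in range(N))
    return s*h/(2*math.pi)

for n in range(-3, 4):
    for m in range(-3, 4):
        expected = 1.0 if n == m else 0.0
        assert abs(ip(n, m) - expected) < 1e-9
```

For an equispaced grid over a full period the sum for $n \ne m$ is a geometric sum of roots of unity, so the check is exact up to rounding.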
We shall show that if $\hat{x}(n) = 0$ for $n = 0, \pm 1, \pm 2, \ldots$, then $x = 0$ a.e. This will prove that the trigonometric system is complete in $L^2([-\pi, \pi], \frac{dt}{2\pi})$. Set

$$y(t) = \frac{1}{2\pi}\int_{-\pi}^{t}x(s)\,ds.$$

Since $L^2([-\pi, \pi], \frac{dt}{2\pi}) \subseteq L^1([-\pi, \pi], \frac{dt}{2\pi})$, it is evident that $y$ is a well-defined absolutely continuous function on $[-\pi, \pi]$ [see 1–5]. In particular, $y \in L^2([-\pi, \pi], \frac{dt}{2\pi})$. Moreover, $y(-\pi) = 0$ and $y(\pi) = 0$, using the fact that $\hat{x}(0) = 0$ by hypothesis. Let $a$ be any constant. On integrating by parts, we obtain

$$\int_{-\pi}^{\pi}[y(t) - a]e^{-int}\,dt = 0, \quad n = \pm 1, \pm 2, \ldots. \tag{2.59}$$

Choose $a$ so that (2.59) holds for $n = 0$ as well. Since $y(t) - a$ is a continuous periodic function, for $\varepsilon > 0$, there is a trigonometric polynomial

$$T(t) = \sum_{k=-n}^{n}c_ke^{ikt}$$

such that

$$\sup\{|y(t) - a - T(t)| : t \in [-\pi, \pi]\} < \varepsilon,$$

using Weierstrass' Theorem. Now, using (2.59) and the choice of $a$, we obtain

$$\begin{aligned}\int_{-\pi}^{\pi}|y(t) - a|^2\,dt &= \int_{-\pi}^{\pi}(y(t) - a)\overline{\left(y(t) - a - T(t)\right)}\,dt\\ &\le \varepsilon\int_{-\pi}^{\pi}|y(t) - a|\,dt\\ &\le \varepsilon\left[\int_{-\pi}^{\pi}|y(t) - a|^2\,dt\right]^{1/2}\left[\int_{-\pi}^{\pi}dt\right]^{1/2},\end{aligned}$$

which implies

$$\int_{-\pi}^{\pi}|y(t) - a|^2\,dt \le 2\pi\varepsilon^2.$$

Since $\varepsilon > 0$ is arbitrary, $y(t)$ is constant, and hence $x(t) = 0$ almost everywhere. This completes the proof.
Remarks (a) In the above proof, we have used the fact that $x \in L^1([-\pi, \pi], \frac{dt}{2\pi})$ and have proved that if $\hat{x}(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi}x(t)e^{-int}\,dt = 0$ for $n = 0, \pm 1, \pm 2, \ldots$, then $x = 0$ a.e.

(b) We put some of the results proved for abstract Hilbert spaces in the present setting of $L^2([-\pi, \pi], \frac{dt}{2\pi})$. For $x \in L^2([-\pi, \pi], \frac{dt}{2\pi})$, associate the function $\hat{x}$ defined on $\mathbb{Z}$, the set of integers. The Fourier series of $x$ is

$$\sum_{n=-\infty}^{\infty}\hat{x}(n)e^{int} \tag{2.60}$$

and its partial sums are

$$S_N = \sum_{n=-N}^{N}\hat{x}(n)e^{int}, \quad N = 0, 1, 2, \ldots.$$

The Parseval identity asserts

$$\sum_{n=-\infty}^{\infty}\hat{x}(n)\overline{\hat{y}(n)} = \frac{1}{2\pi}\int_{-\pi}^{\pi}x(t)\overline{y(t)}\,dt, \quad x, y \in L^2\left([-\pi, \pi], \tfrac{dt}{2\pi}\right). \tag{2.61}$$

The Fourier series (2.60) converges to $x$ in the $L^2$-norm:

$$\lim_N\|x - S_N\|_2 = 0. \tag{2.62}$$
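As a concrete instance of (2.61), not taken from the text: for $x(t) = t$ on $[-\pi, \pi]$, a direct integration by parts gives $\hat{x}(n) = i(-1)^n/n$ for $n \ne 0$ and $\hat{x}(0) = 0$, so Parseval reduces to $\sum_{n \ne 0} 1/n^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi}t^2\,dt = \pi^2/3$, which can be checked numerically:

```python
import math

# Parseval's identity (2.61) specialised to x(t) = t on [-pi, pi]:
# |x^(n)|^2 = 1/n^2 for n != 0 and x^(0) = 0, so the symmetric sum
# sum_{n != 0} 1/n^2 should approach (1/(2*pi)) * ∫ t^2 dt = pi^2/3.
rhs = math.pi**2 / 3
partial = sum(2.0/n**2 for n in range(1, 200001))  # covers n and -n at once
assert abs(partial - rhs) < 1e-4                   # tail is about 1/100000
```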
(c) (The Riemann–Lebesgue Lemma) If $x \in L^2([-\pi, \pi], \frac{dt}{2\pi})$, then

$$\int_{-\pi}^{\pi}x(t)e^{-int}\,dt \to 0 \quad \text{as } |n| \to \infty.$$

Indeed, Parseval's identity (2.61) gives $\sum_{n=-\infty}^{\infty}|\hat{x}(n)|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi}|x(t)|^2\,dt < \infty$.

(d) The relation (2.62) leads to the question whether the Fourier series of $x \in L^2([-\pi, \pi], \frac{dt}{2\pi})$ tends to $x$ pointwise. This is not true even for a continuous function, as was demonstrated by du Bois-Reymond in 1876. However, Fejér proved in 1900 that the Fourier series of a continuous function is Cesàro summable and the sum is the function itself. For a function $x \in L^2([-\pi, \pi], \frac{dt}{2\pi})$, Lusin's conjecture that $\{S_N\}_{N \ge 0}$, where $S_N = \sum_{n=-N}^{N}\hat{x}(n)e^{int}$, converges to $x$ pointwise a.e. was proved by Carleson in 1966.
(iv) A complete orthonormal system for the space $H = L^2(0, \infty)$ is given by the Laguerre functions

$$v_n(t) = \frac{1}{n!}\exp\left(-\frac{t}{2}\right)L_n(t), \tag{2.63}$$

where

$$L_n(t) = (-1)^n\exp(t)\frac{d^n}{dt^n}\left(t^n\exp(-t)\right), \quad n = 0, 1, 2, \ldots.$$

In fact, $\{v_n(t)\}_{n \ge 0}$ constitute an orthonormal set in $H$ [Example 2.8.13(iv)]. In order to show that the system (2.63) is complete in $H$, it will be enough to show that if $f \in H$ and $\int_0^\infty f(t)\exp(-\frac{t}{2})L_n(t)\,dt = 0$, $n = 0, 1, 2, \ldots$, then $f = 0$ a.e. Let

$$g(t) = f(t)\exp\left(-\frac{t}{2}\right), \quad 0 < t < \infty.$$

Since $f \in L^2(0, \infty)$ and $\exp(-\frac{t}{2}) \in L^2(0, \infty)$, it follows that $g \in L^1(0, \infty)$. Indeed, by the Cauchy–Schwarz Inequality,

$$\int_0^\infty|g(t)|\,dt = \int_0^\infty\left|f(t)\exp\left(-\frac{t}{2}\right)\right|dt \le \left[\int_0^\infty|f(t)|^2\,dt\right]^{1/2}\left[\int_0^\infty\exp(-t)\,dt\right]^{1/2} = \|f\|_2.$$

Each $L_n$ is a polynomial of degree $n$ [see (2.64) of Examples 2.8.13(iv)]. Therefore, each $t^n$ is a linear combination of $L_0, \ldots, L_n$. Thus, we need only show that

$$\int_0^\infty g(t)t^n\,dt = 0, \quad n = 0, 1, 2, \ldots \quad \text{implies} \quad g(t) = 0 \text{ a.e.} \tag{2.64}$$
Now consider

$$\Phi(z) = \int_0^\infty\exp(-tz)g(t)\,dt = \int_0^\infty\exp(-tx)\exp(-ity)g(t)\,dt, \quad \Re z > 0, \tag{2.65}$$

where $z = x + iy$. We shall show that $\Phi(z)$ is holomorphic in $\Re z > 0$.

Since $g \in L^1(0, \infty)$ and $|\exp(-ity)| = 1$, the integral in (2.65) exists as a Lebesgue integral. Moreover, $\Phi(z)$ is continuous in $\Re z > 0$. Indeed, if $z_n \to z$ in $\Re z > 0$, then $g(t)\exp(-tz_n) \to g(t)\exp(-tz)$. Both the sequence of functions and the limit function are integrable and are dominated by the integrable function $|g(t)|$; an application of the Lebesgue Dominated Convergence Theorem 1.3.9 proves the assertion. If $\Delta$ denotes the boundary of any closed triangle in $\Re z > 0$, then

$$\begin{aligned}\oint_\Delta\Phi(z)\,dz &= \oint_\Delta\left(\int_0^\infty g(t)\exp(-tz)\,dt\right)dz\\ &= \int_0^\infty g(t)\left(\oint_\Delta\exp(-tz)\,dz\right)dt \quad [\text{Fubini's Theorem}]\\ &= \int_0^\infty g(t)\cdot 0\,dt \quad [\text{Cauchy's Theorem}] = 0.\end{aligned}$$
It now follows by using Morera's Theorem that $\Phi(z)$ is holomorphic in $\Re z > 0$. On using integration by parts and induction on $n$, we see that

$$\int_0^\infty\exp(-t)t^{2n}\,dt = (2n)! \le (2^n\,n!)^2. \tag{2.66}$$

We next consider the series

$$\sum_{n=0}^{\infty}\frac{1}{n!}s^n\int_0^\infty g(t)t^n\,dt, \quad s > 0, \tag{2.67}$$

and show that the series converges for $0 \le s < \frac{1}{2}$ to the function $\Phi(s)$. First,

$$\left|\sum_{n=0}^{\infty}\frac{1}{n!}s^n\int_0^\infty g(t)t^n\,dt\right| \le \sum_{n=0}^{\infty}\frac{1}{n!}s^n\int_0^\infty|f(t)|\exp\left(-\frac{t}{2}\right)t^n\,dt \le \sum_{n=0}^{\infty}\frac{1}{n!}s^n\|f\|_2\left[(2n)!\right]^{1/2},$$

using the Cauchy–Schwarz Inequality. From (2.66), this is

$$\le \|f\|_2\sum_{n=0}^{\infty}(2s)^n,$$

and the series on the right converges for $0 \le s < \frac{1}{2}$.
Note that $\exp(-st)g(t) = \sum_{n=0}^{\infty}(-1)^ns^n\frac{1}{n!}g(t)t^n$, and

$$\begin{aligned}\sum_{n=0}^{\infty}\left|(-1)^ns^n\frac{1}{n!}\int_0^\infty g(t)t^n\,dt\right| &\le \sum_{n=0}^{\infty}s^n\frac{1}{n!}\int_0^\infty|g(t)|t^n\,dt\\ &= \sum_{n=0}^{\infty}s^n\frac{1}{n!}\int_0^\infty|f(t)|\exp\left(-\frac{t}{2}\right)t^n\,dt\\ &\le \sum_{n=0}^{\infty}s^n\frac{1}{n!}\left[\int_0^\infty|f(t)|^2\,dt\right]^{1/2}\left[\int_0^\infty t^{2n}\exp(-t)\,dt\right]^{1/2} \quad (\text{Cauchy–Schwarz})\\ &\le \sum_{n=0}^{\infty}s^n\frac{1}{n!}\|f\|_2\cdot 2^n\cdot n! \quad (\text{using }(2.66))\\ &= \|f\|_2\sum_{n=0}^{\infty}(2s)^n < \infty \quad \text{for } 0 \le s < \frac{1}{2}.\end{aligned}$$
Then, on using Corollary 1.3.10,

$$\sum_{n=0}^{\infty}(-1)^ns^n\frac{1}{n!}g(t)t^n \in L^1(0, \infty)$$

and

$$\Phi(s) = \int_0^\infty\exp(-st)g(t)\,dt = \sum_{n=0}^{\infty}(-1)^ns^n\frac{1}{n!}\int_0^\infty g(t)t^n\,dt.$$

On using the hypothesis, we obtain $\Phi(s) = 0$ for $0 \le s < \frac{1}{2}$; since $\Phi$ is holomorphic in $\Re z > 0$ and vanishes on the interval $(0, \frac{1}{2})$, it vanishes identically there, and hence $\Phi(s) = 0$ for all $s \ge 0$. Now,

$$\Phi(s) = \int_0^1 t^{s-1}g(-\ln t)\,dt, \quad s \ge 0.$$

It may be observed that $t^{-1}g(-\ln t)$ is in $L^1[0, 1]$, using the substitution $u = -\ln t$. Using Proposition 1.3.11, it now follows that $g(-\ln t) = 0$ a.e. on $(0, 1)$, which implies [see Problem 2.9.P8] that $g(t) = 0$ a.e. on $(0, \infty)$. This completes the proof.
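The orthonormality asserted for (2.63) can be spot-checked numerically. Up to sign, $v_n(t) = e^{-t/2}\ell_n(t)$ with the classical Laguerre polynomials $\ell_0 = 1$, $\ell_1 = 1 - t$, $\ell_2 = 1 - 2t + t^2/2$ (a standard identity, used here without proof); the truncation point, grid and tolerance are arbitrary choices:

```python
import math

# Midpoint-rule check that the first three Laguerre functions are
# orthonormal in L^2(0, ∞); up to sign they equal e^{-t/2} l_n(t)
# for the classical Laguerre polynomials l_0, l_1, l_2 below.
polys = [lambda t: 1.0,
         lambda t: 1.0 - t,
         lambda t: 1.0 - 2.0*t + 0.5*t*t]

def ip(p, q, T=60.0, N=120000):
    # ∫_0^T e^{-t} p(t) q(t) dt; the tail beyond T is negligible
    h = T/N
    return sum(math.exp(-(k + 0.5)*h)*p((k + 0.5)*h)*q((k + 0.5)*h)
               for k in range(N))*h

for i, p in enumerate(polys):
    for j, q in enumerate(polys):
        expected = 1.0 if i == j else 0.0
        assert abs(ip(p, q) - expected) < 1e-4
```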
(v) A complete orthonormal system for the space $H = L^2(-\infty, \infty)$ is given by the Hermite functions

$$v_n(t) = \frac{H_n(t)\exp\left(-\frac{t^2}{2}\right)}{\left(2^n\cdot n!\,\pi^{1/2}\right)^{1/2}}, \quad n = 0, 1, 2, \ldots, \tag{2.68}$$

where

$$H_n(t) = (-1)^n\exp(t^2)\frac{d^n}{dt^n}\exp(-t^2).$$

In fact, $\{v_n(t)\}_{n \ge 0}$ constitute an orthonormal set in $L^2(-\infty, \infty)$ [Example 2.8.13(iii)]. In order to show that the system (2.68) is complete in $H$, it will be enough to show that if $f \in L^2(-\infty, \infty)$ and $\int_{-\infty}^{\infty}f(t)\exp(-\frac{t^2}{2})H_n(t)\,dt = 0$, or equivalently, $\int_{-\infty}^{\infty}f(t)\exp(-\frac{t^2}{2})t^n\,dt = 0$, for $n = 0, 1, 2, \ldots$, then $f = 0$ a.e. on $(-\infty, \infty)$. The equivalence follows exactly as in (iv) above, since $H_n$ is a polynomial of degree $n$.
We consider the function

$$F(x) = \int_{-\infty}^{\infty}f(t)e^{-itx}\exp\left(-\frac{t^2}{2}\right)dt, \quad -\infty < x < \infty.$$

This integral exists, since $f \in L^2(-\infty, \infty)$, $\exp(-\frac{t^2}{2}) \in L^2(-\infty, \infty)$ and $|e^{-itx}| = 1$. In fact,

$$\int_{-\infty}^{\infty}\left|f(t)e^{-itx}\exp\left(-\frac{t^2}{2}\right)\right|dt = \int_{-\infty}^{\infty}\left|f(t)\exp\left(-\frac{t^2}{2}\right)\right|dt \le \left[\int_{-\infty}^{\infty}|f(t)|^2\,dt\right]^{1/2}\left[\int_{-\infty}^{\infty}\exp(-t^2)\,dt\right]^{1/2} < \infty,$$

using the Cauchy–Schwarz Inequality. We write

$$f(t)e^{-itx} = \sum_{n=0}^{\infty}(-i)^n\frac{x^n}{n!}f(t)t^n$$

and observe that

$$\begin{aligned}\int_{-\infty}^{\infty}\sum_{n=0}^{\infty}\frac{|x|^n}{n!}|f(t)||t|^n\exp\left(-\frac{t^2}{2}\right)dt &= \int_{-\infty}^{\infty}|f(t)|\exp(|xt|)\exp\left(-\frac{t^2}{2}\right)dt\\ &= \int_{-\infty}^{\infty}|f(t)|\exp\left(-\frac{t^2}{4}\right)\exp(|xt|)\exp\left(-\frac{t^2}{4}\right)dt\\ &\le \left[\int_{-\infty}^{\infty}|f(t)|^2\exp\left(-\frac{t^2}{2}\right)dt\right]^{1/2}\left[\int_{-\infty}^{\infty}\exp(2|xt|)\exp\left(-\frac{t^2}{2}\right)dt\right]^{1/2} < \infty,\end{aligned}$$

since

$$\int_{-\infty}^{\infty}|f(t)|^2\exp\left(-\frac{t^2}{2}\right)dt \le \int_{-\infty}^{\infty}|f(t)|^2\,dt = \|f\|_2^2,$$

and

$$\begin{aligned}\int_{-\infty}^{\infty}\exp(2|xt|)\exp\left(-\frac{t^2}{2}\right)dt &= 2\int_0^\infty\exp(2|x|t)\exp\left(-\frac{t^2}{2}\right)dt\\ &= 2\int_0^\infty\exp\left(-\frac{t^2}{2} + 2|x|t - 2x^2\right)\exp(2x^2)\,dt\\ &= 2\exp(2x^2)\int_0^\infty\exp\left(-\frac{1}{2}(t - 2|x|)^2\right)dt < \infty.\end{aligned}$$
Using Corollary 1.3.10, it follows that

$$F(x) = \int_{-\infty}^{\infty}f(t)e^{-itx}\exp\left(-\frac{t^2}{2}\right)dt = \sum_{n=0}^{\infty}(-i)^n\frac{x^n}{n!}\int_{-\infty}^{\infty}f(t)\exp\left(-\frac{t^2}{2}\right)t^n\,dt = 0,$$

i.e.

$$\int_{-\infty}^{\infty}f(t)e^{-itx}\exp\left(-\frac{t^2}{2}\right)dt = 0$$

for all real $x$. It follows on using Proposition 1.3.12 that $f = 0$ a.e., which was to be proved.
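The orthonormality of the first few Hermite functions in (2.68) can likewise be spot-checked numerically, using the standard polynomials $H_0 = 1$, $H_1 = 2t$, $H_2 = 4t^2 - 2$ (the truncation point, grid and tolerance below are arbitrary choices):

```python
import math

# Midpoint-rule check that the first three Hermite functions (2.68)
# are orthonormal in L^2(-∞, ∞).
H = [lambda t: 1.0, lambda t: 2.0*t, lambda t: 4.0*t*t - 2.0]

def v(n, t):
    norm = math.sqrt(2.0**n * math.factorial(n) * math.sqrt(math.pi))
    return H[n](t) * math.exp(-t*t/2.0) / norm

def ip(n, m, T=10.0, N=100000):
    # integrate over [-T, T]; the Gaussian tail beyond T is negligible
    h = 2*T/N
    return sum(v(n, -T + (k + 0.5)*h)*v(m, -T + (k + 0.5)*h)
               for k in range(N))*h

for n in range(3):
    for m in range(3):
        expected = 1.0 if n == m else 0.0
        assert abs(ip(n, m) - expected) < 1e-6
```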
(vi) The set $u_n(z) = \sqrt{\frac{n}{\pi}}\,z^{n-1}$, $n = 1, 2, \ldots$, is an orthonormal set in $H = A(D)$, where $D = \{z \in \mathbb{C} : |z| < 1\}$ [see 2.8.4(iii)]. We shall show that the Parseval formula

$$\sum_{n=1}^{\infty}|(f, u_n)|^2 = \iint\limits_{|z|<1}|f(z)|^2\,dx\,dy, \quad f \in A(D),$$

holds, thereby establishing the completeness of $\{u_n(z)\}_{n \ge 1}$ in $A(D)$ [Theorem 2.9.16].

For $f \in A(D)$, the Fourier coefficients are given by
$$c_n = (f, u_n) = \sqrt{\frac{n}{\pi}}\iint\limits_{D}f(z)\bar{z}^{\,n-1}\,dx\,dy = \lim_{r\to1}\sqrt{\frac{n}{\pi}}\iint\limits_{|z|<r}f(z)\bar{z}^{\,n-1}\,dx\,dy.$$

On applying the complex Green's formula [29, p. 124], we obtain

$$c_n = \lim_{r\to1}\frac{1}{2i}\sqrt{\frac{n}{\pi}}\int\limits_{|z|=r}f(z)\frac{\bar{z}^{\,n}}{n}\,dz.$$

Since $|z|^2 = r^2$, we have $\bar{z} = r^2z^{-1}$ and so

$$c_n = \lim_{r\to1}\frac{1}{\sqrt{n\pi}}\frac{r^{2n}}{2i}\int\limits_{|z|=r}f(z)\frac{dz}{z^n}. \tag{2.69}$$

Now if

$$f(z) = a_0 + a_1z + \cdots, \quad |z| < 1,$$

is the power series expansion of $f$, then

$$a_{n-1} = \frac{1}{2\pi i}\int\limits_{|z|=r}\frac{f(z)}{z^n}\,dz. \tag{2.70}$$

From (2.69) and (2.70), we obtain

$$c_n = \lim_{r\to1}\frac{1}{\sqrt{n\pi}}r^{2n}\pi a_{n-1} = \sqrt{\frac{\pi}{n}}\,a_{n-1}, \quad n = 1, 2, \ldots. \tag{2.71}$$

Also,

$$\iint\limits_{|z|<1}|f(z)|^2\,dx\,dy = \pi\sum_{n=1}^{\infty}\frac{|a_{n-1}|^2}{n} \tag{2.72}$$

[see (2.70) above Definition 2.6.2]. From (2.71) and (2.72), it follows that Parseval's formula holds. The argument is therefore complete.
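Formula (2.72) can be spot-checked for a concrete function not taken from the text: with $f(z) = 1 + z$ (so $a_0 = a_1 = 1$), the right-hand side is $\pi(1 + \frac{1}{2}) = \frac{3\pi}{2}$. A simple polar-grid integration over the disc (grid sizes arbitrary):

```python
import math

# Checking (2.72) for f(z) = 1 + z on the unit disc: since a_0 = a_1 = 1,
# ∫∫ |f|^2 dx dy should equal pi*(|a_0|^2/1 + |a_1|^2/2) = 3*pi/2.
Nr, Nt = 1000, 1000
s = 0.0
for i in range(Nr):
    r = (i + 0.5)/Nr
    for j in range(Nt):
        th = 2*math.pi*(j + 0.5)/Nt
        x, y = r*math.cos(th), r*math.sin(th)
        s += ((1.0 + x)**2 + y**2) * r   # |1 + z|^2 times the Jacobian r
s *= (1.0/Nr) * (2*math.pi/Nt)
assert abs(s - 1.5*math.pi) < 1e-3
```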
Theorem 2.9.18 Any two complete orthonormal sets in a given Hilbert space $H \ne \{0\}$ have the same cardinal number.

Proof Suppose first that $H$ is a Hilbert space of finite dimension $n$ and $A$ is any complete orthonormal set in $H$. It consists of linearly independent vectors and therefore can have at most $n$ vectors in it. We shall argue that it contains precisely $n$ vectors: by Theorem 2.9.14, $A$ is a basis. Since it is finite, it is a Hamel basis and must therefore contain precisely $n$ vectors.

Now let $A = \{x_\alpha : \alpha \in \Lambda\}$ and $B = \{y_\beta : \beta \in \Gamma\}$ be complete orthonormal sets in $H$. For any $x_\alpha \in A$, the set

$$B_{x_\alpha} = \{y_\beta \in B : (x_\alpha, y_\beta) \ne 0\}$$

must be countable [see Remark 2.9.11]. Clearly, $\bigcup_\alpha B_{x_\alpha} \subseteq B$. We next show that $B \subseteq \bigcup_\alpha B_{x_\alpha}$. Let $y_\beta \in B$ and suppose $y_\beta \in B_{x_\alpha}$ for no $\alpha$. Then $(x_\alpha, y_\beta) = 0$ for all $\alpha \in \Lambda$; in other words, $y_\beta \perp A$. Since $A$ is complete, it follows that $y_\beta = 0$, which is impossible since $\|y_\beta\| = 1$. Hence, $y_\beta \in B_{x_\alpha}$ for some $x_\alpha$. Thus

$$B = \bigcup_\alpha B_{x_\alpha}.$$

It follows that $|B|$, the cardinality of $B$, satisfies $|B| \le \aleph_0|A| = |A|$. Interchanging the roles of $A$ and $B$, we also have $|A| \le |B|$. This completes the proof. □

Definition 2.9.19 Let $H$ be a Hilbert space. If $H \ne \{0\}$, we define the orthogonal dimension of $H$ to be the unique cardinal number of a complete orthonormal set in $H$. If $H = \{0\}$, we say that $H$ has orthogonal dimension 0.

If $H$ is finite-dimensional, then the orthogonal dimension of $H$ is the cardinal number of a Hamel basis.
Theorem 2.9.20 (Riesz–Fischer) Let $\{x_\alpha\}_{\alpha\in A}$ be a complete orthonormal system in a Hilbert space $H$ and $\ell^2(A) = L^2(A, \mathcal{A}, \mu)$, where $\mathcal{A}$ denotes the collection of all subsets of $A$ and $\mu$ is counting measure on $A$. Then $H$ is isometrically isomorphic to $\ell^2(A)$.

Proof For $x \in H$, let $T(x)$ be that function on $A$ such that

$$[T(x)](\alpha) = (x, x_\alpha), \quad \alpha \in A.$$

Then $T$ maps $H$ into $\ell^2(A)$, for $\sum_{\alpha\in A}|(x, x_\alpha)|^2 < \infty$ by Bessel's Inequality. Also, for $x, y \in H$, we have

$$[T(x + y)](\alpha) = (x + y, x_\alpha) = (x, x_\alpha) + (y, x_\alpha) = [T(x)](\alpha) + [T(y)](\alpha), \quad \alpha \in A,$$

i.e. $T(x + y) = T(x) + T(y)$. It is equally easy to show that $T(\lambda x) = \lambda T(x)$ for scalar $\lambda$. Thus, $T$ is linear. Using Theorem 2.9.16(e), we have

$$(T(x), T(y)) = \sum_{\alpha\in A}[T(x)](\alpha)\overline{[T(y)](\alpha)} = \sum_{\alpha\in A}(x, x_\alpha)\overline{(y, x_\alpha)} = (x, y),$$

and so $T$ preserves inner products. It remains to show that the mapping $T : H \to \ell^2(A)$ is onto.

Let $f \in \ell^2(A)$. Then $\sum_{\alpha\in A}|f(\alpha)|^2 < \infty$. Let $\alpha_1, \alpha_2, \ldots$ be those $\alpha$'s for which $f(\alpha) \ne 0$. The condition $\sum_{\alpha\in A}|f(\alpha)|^2 < \infty$ becomes $\sum_{i=1}^{\infty}|f(\alpha_i)|^2 < \infty$. It follows from Theorem 2.9.9 that $x = \sum_{i=1}^{\infty}f(\alpha_i)x_{\alpha_i}$ is in $H$. We show that $(x, x_{\alpha_j}) = f(\alpha_j)$ for every $j$. For a fixed $p$ and any $m \ge p$, we have

$$\left|(x, x_{\alpha_p}) - f(\alpha_p)\right| = \left|(x, x_{\alpha_p}) - \sum_{i=1}^{m}f(\alpha_i)(x_{\alpha_i}, x_{\alpha_p})\right| \le \left\|x - \sum_{i=1}^{m}f(\alpha_i)x_{\alpha_i}\right\|\left\|x_{\alpha_p}\right\| \to 0$$

as $m \to \infty$. Therefore, $f(\alpha_j) = (x, x_{\alpha_j}) = (T(x))(\alpha_j)$ for all $j = 1, 2, \ldots$. The equality also holds for those $\alpha$'s for which $f(\alpha) = 0$. This completes the proof. □
Remarks 2.9.21
(i) The following form of the above theorem was originally proved by Riesz and Fischer in 1907:

Let $\{a_n\}_{n\in\mathbb{Z}}$ be in $\ell^2(\mathbb{Z})$, that is, $\sum_{n=-\infty}^{\infty}|a_n|^2 < \infty$. Then there exists a function $f$ in $L^2([-\pi, \pi], \frac{dt}{2\pi})$ such that $\hat{f}(n) = a_n$, $n \in \mathbb{Z}$, where $\hat{f}(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(t)e^{-int}\,dt$ is the $n$th Fourier coefficient of $f$ with respect to the orthonormal basis $\{e^{int} : n \in \mathbb{Z}\}$.

(ii) A Hilbert space is completely determined up to an isometric isomorphism by its orthogonal dimension, i.e. by the cardinality of a complete orthonormal set. The space $L^2([-\pi, \pi], \frac{dt}{2\pi})$ is isometrically isomorphic to $\ell^2(\mathbb{Z})$ and hence also to $\ell^2(\mathbb{N})$.
Problem Set 2.9

2.9.P1. Let $\{e_n\}_{n\ge1}$ and $\{\tilde{e}_n\}_{n\ge1}$ be orthonormal sequences in a Hilbert space $H$ and let $M_1 = \mathrm{span}(e_n)$ and $M_2 = \mathrm{span}(\tilde{e}_n)$. Show that $\overline{M_1} = \overline{M_2}$ if, and only if,

$$e_n = \sum_{m=1}^{\infty}a_{nm}\tilde{e}_m, \quad \tilde{e}_n = \sum_{m=1}^{\infty}\overline{a_{mn}}\,e_m, \quad a_{nm} = (e_n, \tilde{e}_m).$$

2.9.P2. Let $H$ be a Hilbert space. Then show that the following hold:
(a) If $H$ is separable, every orthonormal set in $H$ is countable.
(b) If $H$ contains an orthonormal sequence which is complete in $H$, then $H$ is separable.

2.9.P3. Let $A \subseteq [-\pi, \pi]$ and $A$ be measurable. Prove that

$$\lim_{n\to\infty}\int_A\cos nt\,dt = \lim_{n\to\infty}\int_A\sin nt\,dt = 0.$$

2.9.P4. Let $n_1 < n_2 < n_3 < \cdots$ be positive integers and

$$E = \{x \in [-\pi, \pi] : \lim_k\sin n_kx \text{ exists}\}.$$

Prove that $m(E) = 0$, where $m(E)$ denotes the Lebesgue measure of $E$.

2.9.P5. Let $e_j(z) = z^j$, $j \in \mathbb{Z}$. Show that $\{e_j\}_{j=-\infty}^{\infty}$ is an orthonormal sequence in $RL^2$ (notation as in Example 2.1.3(vi)).

2.9.P6. Let $a_1, a_2 \in D(0, 1) = \{z : |z| < 1\}$ and $a_1 \ne a_2$. Show that the vectors

$$e_1(z) = \frac{\left(1 - |a_1|^2\right)^{1/2}}{1 - \bar{a}_1z} \quad\text{and}\quad e_2(z) = \frac{z - a_1}{1 - \bar{a}_1z}\cdot\frac{\left(1 - |a_2|^2\right)^{1/2}}{1 - \bar{a}_2z}$$

constitute an orthonormal system in $RH^2$ (notation as in Example 2.1.3(vi)).

2.9.P7. Let $\{e_n\}_{n\ge1}$ be an orthonormal basis in $H$. Show that for any orthonormal set $\{f_n\}_{n\ge1}$, if

$$\sum_{n=1}^{\infty}\|e_n - f_n\|^2 < \infty,$$

then $\{f_n\}_{n\ge1}$ is an orthonormal basis.

2.9.P8. A real-valued function on an interval having a continuous nonvanishing derivative on the interior of its domain maps a set of (Lebesgue) measure zero into a set of measure zero. In case the domain is an open interval, in which case the range is also an open interval and an inverse exists, the inverse also has the same properties. [Examples of such a function on the domain $(0, \infty)$ are $\exp(-x)$ and $x^2$.]
2.10 Orthogonal Decomposition and Riesz Representation

A result of particular interest about Hilbert space is the projection theorem, namely, if $M$ is any closed subspace of a Hilbert space $H$, then $H$ can be decomposed into the direct sum of $M$ and its orthogonal complement (to be defined below). This important geometric property is one of the main reasons that Hilbert spaces are easier to handle than Banach spaces.

A characterisation of a bounded linear functional [see Definition 2.10.18 below] on a Hilbert space, known as the Riesz Representation Theorem, will be studied.

P.L. Chebyshev sought the approximation of arbitrary functions by linear combinations of given ones. He considered approximations in the spaces of continuous functions, $L^p$ spaces, etc. These have a bearing on constrained optimisation. We deal with the approximation problem in a pre-Hilbert space $X$: given a set of $n$ linearly independent vectors $\{v_1, v_2, \ldots, v_n\}$ and an $x \in X$, to find a method of computing the minimum value of

$$\left\|x - \sum_{j=1}^{n}c_jv_j\right\|,$$

where $c_1, c_2, \ldots, c_n$ range over all scalars, and to find the corresponding values of $c_1, c_2, \ldots, c_n$. The reader will learn that this is precisely the problem of finding the distance of $x$ from the linear span of $\{v_1, v_2, \ldots, v_n\}$.

Recall that a set $M$ of nonzero vectors in a pre-Hilbert space is said to be orthogonal if $x \perp y$ whenever $x$ and $y$ are distinct vectors of $M$.
Definition 2.10.1 Let $X$ be a pre-Hilbert space and $x \in X$. We define

$$x^\perp = \{y \in X : (x, y) = 0\},$$

and if $S$ is a subset of $X$,

$$S^\perp = \{y \in X : (x, y) = 0 \text{ for all } x \in S\}.$$

The symbol $x^\perp$ [respectively, $S^\perp$] is read as $x$ perp [respectively, $S$ perp]. One writes $S^{\perp\perp}$ for the perp of $S^\perp$; thus $S^{\perp\perp} = (S^\perp)^\perp$. The set $S^\perp$ is called the orthogonal complement of $S$.

Remarks 2.10.2
(i) Observe that $x^\perp$ is a subspace of $X$, since $(x, y) = 0$ and $(x, z) = 0$ imply $(x, \alpha y + \beta z) = 0$, where $\alpha, \beta$ are scalars. Also, $x^\perp$ is precisely the set of vectors where the continuous function $y \to (x, y)$ is zero. Hence, $x^\perp$ is a closed subspace of $X$. Since

$$S^\perp = \bigcap_{x\in S}x^\perp,$$

it follows that $S^\perp$, being the intersection of closed subspaces of $X$, is itself a closed subspace of $X$.

(ii) $S^\perp = (\overline{S})^\perp$.
Let $y \in S^\perp$. Then $(x, y) = 0$ for all $x \in S$. Let $z \in \overline{S}$. Then there exists a sequence $\{z_n\}_{n\ge1}$ in $S$ such that $z_n \to z$. The continuity of the mapping $x \to (x, y)$ and the fact that $(z_n, y) = 0$ for $n = 1, 2, \ldots$ imply $(z, y) = 0$. Since $z \in \overline{S}$ is arbitrary, we conclude that $y \in (\overline{S})^\perp$.
On the other hand, if $y \in (\overline{S})^\perp$, then $(y, x) = 0$ for all $x \in \overline{S}$. Since $S \subseteq \overline{S}$, it follows that $(y, x) = 0$ for all $x \in S$, that is, $y \in S^\perp$.
Proposition 2.10.3 Let $S$ and $S_1$ be subsets of an inner product space $X$. Then the following hold.

(a) $S^\perp$ is a closed subspace of $X$ and $S \cap S^\perp \subseteq \{0\}$;
(b) $S \subseteq S^{\perp\perp}$;
(c) $S \subseteq S_1$ implies $S_1^\perp \subseteq S^\perp$;
(d) $S^\perp = S^{\perp\perp\perp}$.

Proof
(a) In Remark 2.10.2(i), we have noted that $S^\perp$ is a closed subspace of $X$. If $x \in S \cap S^\perp$, then $x \perp x$, that is, $(x, x) = 0$, which implies $x = 0$.
(b) Let $x \in S$. For any $y \in S^\perp$, one has $(y, x) = 0$, so that $x \perp S^\perp$ and therefore $x \in S^{\perp\perp}$.
(c) If $x \in S_1^\perp$, then $(x, y) = 0$ for all $y \in S_1$. In particular, $(x, y) = 0$ for all $y \in S$, which implies $x \in S^\perp$.
(d) Applying (c) to the relation $S \subseteq S^{\perp\perp}$, we have $(S^{\perp\perp})^\perp \subseteq S^\perp$. Also, $S^\perp \subseteq (S^\perp)^{\perp\perp}$ by (b) above. Since $(S^{\perp\perp})^\perp = (S^\perp)^{\perp\perp}$, as in each case one starts with $S$ and perps three times, it follows that $(S^{\perp\perp})^\perp = S^\perp$. □

Example Let $S = \{f \in L^2[0, 1] : f(t) = 0 \text{ a.e. for } 0 \le t \le \frac{1}{2}\}$. Then

$$S^\perp = \left\{g \in L^2[0, 1] : g(t) = 0 \text{ a.e. on } \left[\tfrac{1}{2}, 1\right]\right\}$$

and

$$S^{\perp\perp} = \left\{g \in L^2[0, 1] : g(t) = 0 \text{ a.e. on } \left[0, \tfrac{1}{2}\right]\right\} = S.$$

Hint: To compute $S^\perp$, first show that $\int_x^1 g(t)\,dt = 0$ for every $x \in [\frac{1}{2}, 1]$ and then use regularity of Lebesgue measure.
If $x$ is a point lying outside a plane in $\mathbb{R}^3$, then there is a unique $y$ in the plane which is closer to $x$ than any other point of the plane. This assertion, when translated into the language of Hilbert spaces, yields rich dividends via the Riesz Representation Theorem below. The accompanying figure illustrates the situation when the plane is a coordinate plane. However, this need not always be the case.

Definition 2.10.4 A subset $K$ of a vector space is convex if, for all $x, y \in K$ and all $\lambda$ such that $0 < \lambda < 1$, the vector $\lambda x + (1 - \lambda)y$ belongs to $K$. The set of vectors $\{\lambda x + (1 - \lambda)y : 0 < \lambda < 1\}$ is the line segment joining $x$ and $y$. The convex hull of a subset $S$ of any vector space is the intersection of all convex subsets containing $S$ and is denoted by $\mathrm{co}(S)$ or by $\mathrm{co}\,S$.

It is sometimes neater to work with an equivalent formulation of convexity as follows: for all $x, y \in K$ and all $\alpha, \beta \ge 0$ such that $\alpha + \beta = 1$, the vector $\alpha x + \beta y$ belongs to $K$.

It is easy to see that the intersection of any family of convex sets is again convex; in particular, any convex hull is a convex set. By using the alternative formulation of convexity, the convex hull of any finite set of vectors $\{x_1, x_2, \ldots, x_n\}$ is easily seen to consist of precisely those vectors which can be written as $\sum_{k=1}^{n}\lambda_kx_k$, where $0 \le \lambda_k \le 1$ for each $k$ and $\sum_{k=1}^{n}\lambda_k = 1$. (Induction: we start with the vectors $x_1, x_2, \ldots, x_n$ and nonnegative $\lambda_1, \lambda_2, \ldots, \lambda_n$ satisfying $\sum_{k=1}^{n}\lambda_k = 1$. If $\sum_{k=1}^{n-1}\lambda_k = 0$, then $\lambda_k$ is 0 for $k = 1, \ldots, n-1$ and $\lambda_n = 1$, which together imply $\sum_{k=1}^{n}\lambda_kx_k$ is in the convex hull. Assume $\sum_{k=1}^{n-1}\lambda_k = \beta > 0$. Then $\sum_{k=1}^{n-1}(\lambda_k/\beta)x_k$ is in the convex hull by the induction hypothesis. Consequently, $\sum_{k=1}^{n}\lambda_kx_k = \beta\sum_{k=1}^{n-1}(\lambda_k/\beta)x_k + \lambda_nx_n$ is in the convex hull. Conversely, the vectors that can be written in this form obviously constitute a convex set that contains $\{x_1, x_2, \ldots, x_n\}$ and therefore contains the convex hull under reference.)
This description of the convex hull will now be used for arguing that it is compact when the vector space is normed.

When $n = 1$, there is nothing to prove. Assume as induction hypothesis that the convex hull of any $n$ vectors is compact. Consider any set $\{x_1, x_2, \ldots, x_n, x\}$ of $n + 1$ vectors. Let $\{y_p\}_{p\ge1}$ be a sequence in $\mathrm{co}\{x_1, x_2, \ldots, x_n, x\}$. Each $y_p$ can be written as $\sum_{k=1}^{n}\lambda_{k,p}x_k + \lambda_px$, where $0 \le \lambda_{k,p}, \lambda_p \le 1$ for each $k$ and $\sum_{k=1}^{n}\lambda_{k,p} + \lambda_p = 1$. If $\lambda_p = 1$ for infinitely many $p$, then $\{y_p\}$ has a constant subsequence equal to $x$, which converges in the convex hull. If there are only finitely many such $p$, then we may assume that $1 - \lambda_p > 0$ for every $p$. It then follows that $\sum_{k=1}^{n}\lambda_{k,p}/(1 - \lambda_p) = 1$, each term in the sum being nonnegative. This means $z_p = \sum_{k=1}^{n}\lambda_{k,p}x_k/(1 - \lambda_p)$ is in the convex hull of $\{x_1, x_2, \ldots, x_n\}$. By the induction hypothesis, $\{z_p\}$ has a subsequence $\{z_{p(q)}\}$ converging to a limit $z \in \mathrm{co}\{x_1, x_2, \ldots, x_n\}$. Now the bounded sequence $\{\lambda_{p(q)}\}$ in $\mathbb{R}$ has a convergent subsequence $\{\lambda_{p(q(r))}\}$, whose limit we shall denote by $\lambda$. Then $\{z_{p(q(r))}\}$ converges to $z$ and therefore $\sum_{k=1}^{n}\lambda_{k,p(q(r))}x_k = (1 - \lambda_{p(q(r))})z_{p(q(r))}$ forms a sequence converging to $(1 - \lambda)z$. As $y_{p(q(r))} = \sum_{k=1}^{n}\lambda_{k,p(q(r))}x_k + \lambda_{p(q(r))}x$, the subsequence $\{y_{p(q(r))}\}$ converges to $(1 - \lambda)z + \lambda x$, which belongs to $\mathrm{co}(\mathrm{co}\{x_1, x_2, \ldots, x_n\} \cup \{x\})$. The latter can easily be seen to be the same as $\mathrm{co}\{x_1, x_2, \ldots, x_n, x\}$. This completes the induction proof that the convex hull of any finite set of vectors is compact.
If $K$ is convex and $x$ is any vector, then the convex hull $\mathrm{co}(K \cup \{x\})$ is precisely $K_1 = \{\alpha x + \beta k : k \in K,\ \alpha, \beta \ge 0,\ \alpha + \beta = 1\}$. The convexity of $K_1$ follows from the three computations

(a) $\alpha(\alpha_1x + \beta_1k_1) + \beta(\alpha_2x + \beta_2k_2) = (\alpha\alpha_1 + \beta\alpha_2)x + (\alpha\beta_1k_1 + \beta\beta_2k_2) = (\alpha\alpha_1 + \beta\alpha_2)x + c\left((\alpha\beta_1/c)k_1 + (\beta\beta_2/c)k_2\right)$, where $c = 1 - (\alpha\alpha_1 + \beta\alpha_2)$, if nonzero;

(b) $(\alpha\beta_1/c) + (\beta\beta_2/c) = 1$ because

$$\alpha\beta_1 + \beta\beta_2 + (\alpha\alpha_1 + \beta\alpha_2) = \alpha(\alpha_1 + \beta_1) + \beta(\alpha_2 + \beta_2) = 1,$$

so that $\alpha\beta_1 + \beta\beta_2 = 1 - (\alpha\alpha_1 + \beta\alpha_2) = c$;

(c) $\alpha\alpha_1 + \beta\alpha_2 \le \alpha + \beta = 1$ when $0 \le \alpha_1 \le 1$ and $0 \le \alpha_2 \le 1$.

Regarding (a), we note that $c = 0$ implies $\alpha_1 = \alpha_2 = 1$ and $\beta_1 = \beta_2 = 0$, in which case $\alpha(\alpha_1x + \beta_1k_1) + \beta(\alpha_2x + \beta_2k_2) = x$. Once the convexity of $K_1$ is established, it is a trivial matter to see that it is the convex hull of $K \cup \{x\}$.
Theorem 2.10.5 (Closest point property) Let $K$ be a nonempty closed convex set in a Hilbert space $H$. For every $x \in H$, there is a unique point $y \in K$ which is closer to $x$ than any other point of $K$, i.e.

$$\|x - y\| = \inf_{z\in K}\|x - z\|.$$

Proof Let $d = \inf_{z\in K}\|x - z\|$. Since $K \ne \emptyset$, $d < \infty$; therefore for each $n \in \mathbb{N}$, there exists $y_n \in K$ such that

$$\|x - y_n\|^2 < d^2 + \frac{1}{n}. \tag{2.73}$$

We shall prove that $\{y_n\}_{n\ge1}$ is a Cauchy sequence in $K$. Consider the vectors $x - y_n$ and $x - y_m$. By the Parallelogram Law [Proposition 2.2.3(c)],

$$\|(x - y_n) - (x - y_m)\|^2 + \|(x - y_n) + (x - y_m)\|^2 = 2\left(\|x - y_n\|^2 + \|x - y_m\|^2\right) \le 4d^2 + 2\left(\frac{1}{n} + \frac{1}{m}\right).$$

Rearranging the left side, we obtain

$$\|y_n - y_m\|^2 + 4\left\|x - \frac{y_n + y_m}{2}\right\|^2 \le 4d^2 + 2\left(\frac{1}{n} + \frac{1}{m}\right)$$

and hence

$$\|y_n - y_m\|^2 \le 4d^2 + 2\left(\frac{1}{n} + \frac{1}{m}\right) - 4\left\|x - \frac{y_n + y_m}{2}\right\|^2.$$

Since $K$ is convex and $y_n, y_m \in K$, we have $\frac{y_n + y_m}{2} \in K$, and hence

$$\left\|x - \frac{y_n + y_m}{2}\right\|^2 \ge d^2.$$

Consequently,

$$\|y_n - y_m\|^2 \le 4d^2 + 2\left(\frac{1}{n} + \frac{1}{m}\right) - 4d^2 = 2\left(\frac{1}{n} + \frac{1}{m}\right).$$

Thus, $\{y_n\}_{n\ge1}$ is a Cauchy sequence and so converges to a limit $y \in H$. Since $K$ is closed, $y \in K$ and therefore

$$\|x - y\| \ge d.$$

On letting $n \to \infty$ in (2.73), we obtain

$$\|x - y\| \le d,$$

and so

$$\|x - y\| = d.$$

We have proved that there is a closest point to $x$ in $K$. It remains to show that it is unique. Suppose that $z \in K$ ($z \ne y$) is such that $\|x - z\| = d$. Then $\frac{y + z}{2} \in K$, so that

$$\left\|x - \frac{y + z}{2}\right\| \ge d.$$

On applying the Parallelogram Law [Proposition 2.2.3(c)] to $y - x$ and $z - x$, we get

$$\|y - z\|^2 = 2\|y - x\|^2 + 2\|z - x\|^2 - \|y + z - 2x\|^2 = 4d^2 - 4\left\|x - \frac{y + z}{2}\right\|^2 \le 0.$$

Hence $y = z$. □
Remarks 2.10.6
(i) If $x \in H$ is such that $x \in K$, then the vector nearest to $x$ is $x$ itself.
(ii) If $K$ is not closed, the conclusion of Theorem 2.10.5 may not hold. In fact, in this situation, whether $K$ is convex or not, there always exists a point in $H$ having no closest approximation in $K$: any point in the closure that does not belong to $K$ will serve the purpose.

Let $H = \ell^2$ and $K = \{x = \{\lambda_k\}_{k\ge1} : \lambda_k \ne 0$ for only finitely many $k$'s and $\sum_{k=1}^{\infty}\lambda_k = 1\}$. Then $K$ is convex. However, $K$ is not closed. In fact, the sequence $y_1 = (1, 0, 0, \ldots)$, $y_2 = (\frac{1}{2}, \frac{1}{2}, 0, 0, \ldots)$, …, $y_n = (\frac{1}{n}, \frac{1}{n}, \ldots, \frac{1}{n}, 0, 0, \ldots)$, … is in $K$. However, the limit of the sequence $\{y_n\}_{n\ge1}$ in $\ell^2$ is the point $y = (0, 0, \ldots)$ of $\ell^2$, which does not belong to $K$. According to the preceding paragraph, the point $y = (0, 0, \ldots)$ does not possess a closest point in $K$.

(iii) The conclusion of Theorem 2.10.5 fails to hold if $H$ is not a Hilbert space. Let $X = \mathbb{R}^2$, the real Banach space with $\|(x_1, x_2)\| = \max\{|x_1|, |x_2|\}$. Consider the closed convex set $K = \{(x_1, x_2) : x_1 \ge 1\}$. The minimal distance of the origin from $K$ is attained at each of the points of the line segment $\{(x_1, x_2) : x_1 = 1$ and $|x_2| \le 1\}$. Even when the norm comes from an inner product and $H$ is not complete, the existence part of the conclusion of Theorem 2.10.5 may fail to hold; an example of this will be given later in (ii) of Remarks 2.10.12. The uniqueness part however holds, because its proof does not use completeness.

Corollary 2.10.7 Every nonempty closed convex set $K$ in a Hilbert space $H$ contains a unique element of smallest norm.

Proof Take $x = 0$ in Theorem 2.10.5. □
Example 2.10.8 Let $K = \{y = (\lambda_1, \lambda_2, \ldots, \lambda_n) \in \mathbb{C}^n : \sum_{k=1}^{n}\lambda_k = 1\}$. Then $K$ is a closed convex subset of $\mathbb{C}^n$. The unique vector $y_0 \in K$ of smallest norm is $y_0 = (\frac{1}{n}, \frac{1}{n}, \ldots, \frac{1}{n})$; indeed, if $y_0$ had two unequal components, then interchanging them would lead to another vector in $K$ of smallest norm, contradicting uniqueness; consequently, all components of $y_0$ are equal. The reader who is familiar with constrained optimisation can verify the claim made above independently of the corollary.
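A quick numerical sanity check of this example (illustrative only; real vectors for simplicity, with arbitrary dimension, seed and sample count):

```python
import math
import random

# Example 2.10.8 in R^5: the smallest-norm element of
# K = {y : sum(y) = 1} is y0 = (1/n, ..., 1/n), of norm 1/sqrt(n);
# no randomly generated element of K should beat that norm.
n = 5
y0 = [1.0/n]*n
norm0 = math.sqrt(sum(v*v for v in y0))
assert abs(norm0 - 1.0/math.sqrt(n)) < 1e-12

random.seed(0)
for _ in range(1000):
    z = [random.uniform(-1.0, 1.0) for _ in range(n)]
    shift = (1.0 - sum(z))/n        # push the random point onto sum = 1
    yk = [v + shift for v in z]
    assert abs(sum(yk) - 1.0) < 1e-9
    assert math.sqrt(sum(v*v for v in yk)) >= norm0 - 1e-12
```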
Corollary 2.10.9 Let $M$ be a closed subspace of a Hilbert space $H$. If $x$ is a vector in $H$, and if $d = \inf\{\|x - z\| : z \in M\}$, then there exists a unique $y \in M$ such that $d = \|x - y\|$.

Proof Every subspace of a vector space is convex. □

Theorem 2.10.10 Let $M$ be a closed subspace of a Hilbert space $H$ and $x \in H$. If $y$ denotes the unique element in $M$ for which $\|x - y\| = \inf\{\|x - z\| : z \in M\}$, then $x - y$ is orthogonal to $M$. Conversely, if $y \in M$ is such that $x - y$ is orthogonal to $M$, then $\|x - y\| = \inf\{\|x - z\| : z \in M\}$.

Proof Consider $z \in M$ with $\|z\| = 1$. Then $w = y + (x - y, z)z$ lies in $M$ and we have

$$\|x - y\|^2 \le \|x - w\|^2 = (x - w, x - w) = (x - y - (x - y, z)z,\ x - y - (x - y, z)z) = \|x - y\|^2 - |(x - y, z)|^2.$$

This shows that $(x - y, z) = 0$, i.e. $x - y \perp z$. Since every vector in $M$ is a scalar multiple of a vector in $M$ of norm 1, it follows that $x - y \perp M$.

If $z \in M$, then $x - y$ is orthogonal to $y - z$, so that $\|x - z\|^2 = \|x - y + y - z\|^2 = \|x - y\|^2 + \|y - z\|^2 \ge \|x - y\|^2$. Thus, $\|x - y\| = \inf\{\|x - z\| : z \in M\}$. □

It may be noted that $x - y$ is closest to $x$ in $M^\perp$.

Theorem 2.10.11 (Orthogonal Decomposition Theorem) If $M$ is a closed subspace of a Hilbert space $H$, then $H = M \oplus M^\perp$ and $M = M^{\perp\perp}$.

Proof Let $x \in H$. Since $M$ is a closed subspace of $H$, there exists a unique vector $y \in M$ such that $\|x - y\| = \inf\{\|x - z\| : z \in M\}$. So $x - y \perp M$ by Theorem 2.10.10. Hence, $x = y + (x - y)$, where $y \in M$ and $x - y \in M^\perp$. Since $M \cap M^\perp = \{0\}$, it follows that $H = M \oplus M^\perp$.

We already know that $M \subseteq M^{\perp\perp}$ [Proposition 2.10.3(b)]. On the other hand, let $x \in M^{\perp\perp}$. If $x = y + z$, where $y \in M$ and $z \in M^\perp$, then $x$ and $y$ are in $M^{\perp\perp}$. Hence, $z = x - y \in M^{\perp\perp}$ ($M^{\perp\perp}$ is a subspace of $H$). Since $z \in M^\perp$, it follows that $(z, z) = 0$, that is, $z = 0$ or $x = y$. This shows that $M^{\perp\perp} \subseteq M$. The proof is complete. □
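The decomposition can be sketched concretely in $\mathbb{R}^3$ for a one-dimensional closed subspace (the vectors below are arbitrary sample data, not from the text):

```python
import math

# Orthogonal Decomposition Theorem in R^3 for M = span{u}:
# x = y + z with y in M, z in M^perp, and y is the closest point of M to x.
u = (1.0, 2.0, 2.0)

def inner(a, b):
    return sum(p*q for p, q in zip(a, b))

x = (3.0, -1.0, 4.0)
c = inner(x, u)/inner(u, u)              # coefficient of x along u
y = tuple(c*p for p in u)                # component in M
z = tuple(a - b for a, b in zip(x, y))   # component in M^perp

assert abs(inner(z, u)) < 1e-12          # z is orthogonal to M
# y beats every other point c'*u of M as an approximation to x
for cp in (c - 0.5, c - 0.1, c + 0.1, c + 0.5):
    w = tuple(cp*p for p in u)
    assert inner(z, z) < sum((a - b)**2 for a, b in zip(x, w))
```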
Remarks 2.10.12
(i) If M is a closed subspace of a Hilbert space H and x 2 H, then x can be
uniquely expressed as
2.10 Orthogonal Decomposition and Riesz Representation 109

x ¼ y þ z;

where y 2 M and z 2 M⊥.


(ii) The condition that H is a Hilbert space for a closed subspace to satisfy M =
M⊥⊥ cannot be omitted. Let ‘0  ‘2 be the inner product space consisting of
sequences, each of which has only finitely many nonzero terms. Let M =
P 1
{x = {kk}k  1: kk 6¼ 0 for only finitely many k’s and 1
k¼1 k kk ¼ 0g: Clearly,
M is a subspace of ‘0. Moreover, M is closed, as is proved below.
Let {x(n)}n  1 be a sequence in M such that x(n) ! x in ‘0. By the Cauchy–
Schwarz Inequality, it follows that
 2   ! !
X 1
1  X 1
1 2 X1
1 X1  2
 ðnÞ   ðnÞ 
 kk  ¼  k kk   kk kk  ;
 k¼1 k   k¼1 k k  k¼1
k 2
k¼1

P1 1
where k(n)k is the kth component of xn. Consequently, k¼1 k kk ¼ 0. So, x 2 M.

We next show that M = {0}. Assume 0 6¼ z 2 ‘0 and z ⊥ M. Then there exists
P P
k such that z = (x1, …, xk, 0, …) and ki¼1 jxi j2 6¼ 0. Let l ¼ ðk þ 1Þ kj¼1 1j xj .
Then w = (x1, …, xk, µ, 0, …) 2 M in view of the definition of µ. Hence, z ⊥ w, i.e.
(z, w) = 0. But (z, w) = ||z||2. It follows that z = 0, contradicting the assumption on z.
Consequently, M⊥⊥ = ‘0 6¼ M.
We shall use the fact that M⊥ = {0} to show that the closed convex subset M of
the (incomplete) inner product space ‘0 has the property that every x 2 ‘0 that does
not lie in M fails to have a nearest element in M. In particular, it will follow that the
conclusion of Theorem 2.10.5 may fail to hold in the absence of completeness.
Suppose x 2 ‘0 does not lie in M but has a closest element y 2 M. It follows that
x − y ⊥ M exactly as in the first paragraph of the proof of Theorem 2.10.10,
considering that completeness is not needed in that paragraph. Since we have
shown that M⊥ = {0}, we infer that x − y = 0, which is a contradiction because
y 2 M, whereas x 62 M.
(iii) If M is any linear subspace of H, then M̄ = M⊥⊥. Observe that M ⊆ M⊥⊥ [see Proposition 2.10.3(b)]. It follows that M̄ ⊆ M⊥⊥, since M⊥⊥ is a closed subspace of H. As M ⊆ M̄, it follows that (M̄)⊥ ⊆ M⊥, using Proposition 2.10.3(c). Another application of Proposition 2.10.3(c) yields M⊥⊥ ⊆ (M̄)⊥⊥ = M̄, since M̄ is a closed subspace of H [Theorem 2.10.11].
(iv) If H = M ⊕ N with M ⊆ N⊥, then M = N⊥ and is therefore closed, as we now show. Suppose x ∈ N⊥ and x ∉ M. The vector x has the representation x = y + z, where y ∈ M and z ∈ N. Now
110 2 Inner Product Spaces

0 = (x, z) = (y, z) + (z, z) = (z, z),

since x and y both lie in N⊥ while z ∈ N; this implies z = 0. Consequently, x = y, which is a contradiction because x ∉ M and y ∈ M.
(v) S⊥⊥ = (S⊥)⊥ is the smallest closed subspace of the Hilbert space H which contains S.
(vi) Let {Mk}k≥1 be a sequence of closed linear subspaces of a Hilbert space H. There exists a smallest closed linear subspace M such that Mk ⊆ M for all k, and it has the property that x ⊥ M if, and only if, x ⊥ Mk for all k. To see why, let S = {x ∈ H : x ∈ Mk for some k}. Clearly Mk ⊆ S for all k. Moreover, S is the smallest subset of H with this property. Set M = S⊥⊥. If N is a closed linear subspace such that Mk ⊆ N for all k, then S ⊆ N. Hence M ⊆ N, in view of (v). The assertion that x ⊥ M if, and only if, x ⊥ Mk for all k is proved by using the following observation:

M⊥ = S⊥⊥⊥ = S⊥.

Example 2.10.13 Consider F_o = {f ∈ L2[−1, 1] : f(t) = −f(−t)} and F_e = {f ∈ L2[−1, 1] : f(t) = f(−t)}.
The set F_o is an infinite-dimensional linear subspace of L2[−1, 1]. [The functions f(t) = t^(2n−1), n = 1, 2, …, are in F_o; they are countably many and linearly independent.] Also, F_e is an infinite-dimensional subspace of L2[−1, 1]. [F_e contains the functions f(t) = t^(2n), n = 0, 1, 2, ….]
For f ∈ F_o and g ∈ F_e, the inner product

(f, g) = ∫_{−1}^{1} f(t)ḡ(t)dt = 0,

since the function f(t)ḡ(t) is odd. Hence F_o ⊥ F_e, i.e. F_o ⊆ F_e⊥.
For any function f ∈ L2[−1, 1],

f_e(t) = (f(t) + f(−t))/2,  f_o(t) = (f(t) − f(−t))/2  and  f = f_e + f_o.

Moreover, this representation is unique, so that L2[−1, 1] = F_o ⊕ F_e. Since F_o ⊆ F_e⊥, it follows that F_o = F_e⊥ [see Remark 2.10.12(iv)].
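The even–odd decomposition above can be checked numerically. The following is a minimal sketch (assuming NumPy is available; the test function f is hypothetical), using a symmetric grid on [−1, 1] so that reversing the array evaluates f(−t):

```python
import numpy as np

# Check of Example 2.10.13: f = f_e + f_o with f_e even, f_o odd, and
# the inner product (f_o, f_e) in L2[-1, 1] vanishing (trapezoidal quadrature).
t = np.linspace(-1.0, 1.0, 100001)
dt = t[1] - t[0]

f = np.exp(t) * np.sin(3 * t)      # an arbitrary hypothetical test function
fe = (f + f[::-1]) / 2             # f_e(t) = (f(t) + f(-t))/2; grid is symmetric
fo = (f - f[::-1]) / 2             # f_o(t) = (f(t) - f(-t))/2

assert np.allclose(fe + fo, f)     # the decomposition reproduces f

# inner product of the even and odd parts (trapezoidal rule) is ~0
ip = dt * ((fe * fo).sum() - 0.5 * (fe[0] * fo[0] + fe[-1] * fo[-1]))
assert abs(ip) < 1e-8
```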
The following proposition provides an alternative way of computing

d = inf{‖x − z‖ : z ∈ M},

where x ∈ H and M is a closed subspace of H.



Proposition 2.10.14 If M is a closed subspace of a Hilbert space H and x ∈ H, then

d = inf{‖x − u‖ : u ∈ M} = max{|(x, z)| : z ∈ M⊥ and ‖z‖ = 1}.

Proof Suppose x ∈ H. Then x has a unique representation of the form

x = y + z,  y ∈ M and z ∈ M⊥.

Let w ∈ M⊥ with ‖w‖ = 1. Then

|(x, w)| = |(y + z, w)| = |(z, w)| ≤ ‖z‖ = ‖x − y‖ = d  [Theorem 2.10.10].

Hence, sup{|(x, w)| : w ∈ M⊥, ‖w‖ = 1} ≤ d. On the other hand, the vector w = (x − y)/d is in M⊥ [Theorem 2.10.10] and ‖w‖ = ‖x − y‖/d = 1. For this w,

(x, w) = (x, (x − y)/d) = (1/d)(x, x − y) = (1/d)(x − y, x − y) = (1/d)‖x − y‖² = d,

using the fact that y ∈ M and x − y ∈ M⊥.
This completes the proof. □
Let H be a Hilbert space and M a closed subspace of H. The Orthogonal Decomposition Theorem 2.10.11 says that H = M ⊕ M⊥. Thus for each x ∈ H, there are unique y ∈ M and z ∈ M⊥ such that x = y + z. The vector y is the projection of x on M, and z = x − y is the projection of x on M⊥, the vector in M⊥ closest to x. This sets up mappings from H onto M and from H onto M⊥, respectively.
Theorem 2.10.15 Let PM : H → M be the mapping defined by PM(x) = y, x ∈ H, where y denotes the projection of x on M. The mapping PM has the following properties:
(a) PM is linear, i.e. PM(a1x1 + a2x2) = a1PM(x1) + a2PM(x2), where a1 and a2 are scalars;
(b) if x ∈ M, then PM(x) = x; thus PM is idempotent, i.e. PM² = PM;
(c) if x ∈ M⊥, then PM(x) = 0;
(d) (PM(x), x) = ‖PM(x)‖² ≤ ‖x‖² for all x ∈ H.

Proof
(a) Let xi = yi + zi, i = 1, 2, be the decomposition of xi relative to M. Then a1x1 + a2x2 = (a1y1 + a2y2) + (a1z1 + a2z2) is the corresponding decomposition, so that

PM(a1x1 + a2x2) = a1y1 + a2y2 = a1PM(x1) + a2PM(x2).

(b) Since x = x + 0 is the unique decomposition of x ∈ M, it follows that PM(x) = x. If x ∈ H, then PM(x) ∈ M and, by what has just been proved, PM(x) = PM(PM(x)) = PM²(x). Thus PM² = PM.
(c) If x ∈ M⊥, then x = 0 + x is the unique decomposition of x. Thus PM(x) = 0.

(d) (PM(x), x) = (y, y + z), where x = y + z is the unique decomposition of x. Now y ⊥ z and therefore (y, y + z) = (y, y) = (PM(x), PM(x)) = ‖PM(x)‖². Also, ‖PM(x)‖² = (y, y) ≤ ‖y‖² + ‖z‖² = ‖x‖². □

Definition 2.10.16 The map PM is called the orthogonal projection on M.


Remarks 2.10.17
(i) The map PM is often denoted by P when it is clear from the context on which subspace M the projection is intended.
(ii) In Theorem 2.10.15(d), we checked that

‖PM(x)‖ ≤ ‖x‖ for all x ∈ H.

Since PM(x) = x for all x ∈ M, it follows that ‖PM(x)‖ = ‖x‖ for such x, and hence

sup_{‖x‖=1} ‖PM(x)‖ = 1 provided M ≠ {0}.
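The properties of PM listed in Theorem 2.10.15 can be verified numerically in a finite-dimensional model. The following sketch (hypothetical data; it assumes NumPy is available) builds an orthonormal basis of a subspace M of R⁵ and checks idempotence, the inequality of part (d), and the orthogonality of x − PM(x) to M:

```python
import numpy as np

rng = np.random.default_rng(0)

# M = span of two random vectors in R^5; Q holds an orthonormal basis of M.
A = rng.standard_normal((5, 2))
Q, _ = np.linalg.qr(A)

def proj(x):
    # P_M(x) = sum_k (x, u_k) u_k over the orthonormal basis u_k (columns of Q)
    return Q @ (Q.T @ x)

x = rng.standard_normal(5)
y = proj(x)

assert np.allclose(proj(y), y)                 # P_M^2 = P_M (idempotent)
assert np.isclose(np.dot(y, x), np.dot(y, y))  # (P_M x, x) = ||P_M x||^2
assert np.dot(y, y) <= np.dot(x, x) + 1e-12    # ||P_M x||^2 <= ||x||^2
assert np.allclose(Q.T @ (x - y), 0)           # x - P_M(x) is orthogonal to M
```

Note that for any vector already in M (e.g. a basis column of Q), `proj` returns it unchanged, illustrating sup‖x‖=1 ‖PM(x)‖ = 1.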

We now turn to the study of ‘linear functionals’ on Hilbert spaces.


Definition 2.10.18 A linear functional on a vector space X over a field F is a mapping f : X → F which satisfies f(λx + µy) = λf(x) + µf(y) for all x, y ∈ X and all scalars λ, µ in F.

Definition 2.10.19 Let X be a normed linear space over F. A linear mapping f : X → F is called a bounded linear functional on X if it maps bounded subsets of X into bounded subsets of F, or equivalently, if there exists a constant K such that

|f(x)| ≤ K‖x‖,  x ∈ X.

The equivalence is a consequence of the linearity of the functional.
The linear functional f is said to be a continuous linear functional if, for every sequence {xn}n≥1 in X, xn → x implies f(xn) → f(x).
Let X* denote the set of all bounded linear functionals on X. Define addition and scalar multiplication in X* as follows:

(f1 + f2)(x) = f1(x) + f2(x)  and  (af1)(x) = af1(x)  for f1, f2 ∈ X* and a ∈ F.

It can be checked that f1 + f2 and af1 are in X*.
Define a norm on X* by setting

‖f‖ = sup_{x≠0} |f(x)|/‖x‖ = sup_{‖x‖=1} |f(x)|.

It can be checked that ‖·‖ is a norm on X*. It is immediate from the definition of the norm in X* that

|f(x)| ≤ ‖f‖·‖x‖.

It will be proved later that X* with the norm described above is complete.
Proposition 2.10.20 A linear functional f : X → C is bounded if, and only if, it is continuous.

Proof Indeed, if f is bounded, then

|f(xn) − f(x)| = |f(xn − x)| ≤ ‖f‖‖xn − x‖ → 0

as xn → x, and this proves the result in one direction.
For the converse, observe that continuity implies in particular that xn → 0 forces f(xn) → 0. If f were not bounded, then for every n ∈ N there would exist xn with ‖xn‖ = 1 and |f(xn)| ≥ n. But then |f(xn/n)| ≥ 1, whereas ‖xn/n‖ → 0, a contradiction. □
Remark The study of continuous linear functionals will be taken up in more detail
later in the book.
Proposition 2.10.21 If a linear functional f defined on X is continuous at x = 0, then it is continuous everywhere.

Proof Suppose f is continuous at x = 0. Let ε > 0 be given. There exists δ > 0 such that ‖x‖ < δ implies |f(x)| < ε. Therefore, for every x, y ∈ X, ‖x − y‖ < δ implies |f(x) − f(y)| = |f(x − y)| < ε. □
Remarks 2.10.22
(i) The point x = 0 could be replaced by any other point of X.
(ii) A slight modification of the argument in the proof above shows that f is uniformly continuous. Indeed, for every pair of points x, y in X, ‖x − y‖ < δ implies

|f(x) − f(y)| = |f(x − y)| < ε.

Thus, a linear functional is either uniformly continuous or everywhere discontinuous.

Theorem 2.10.23 If X is a normed linear space, then X* is a Banach space.

Proof Let {fn}n≥1 be a Cauchy sequence of elements of X*. This means that for any ε > 0, there exists n0 ∈ N such that m, n ≥ n0 implies

‖fn − fm‖ < ε,

that is, for any x ∈ X,

|fn(x) − fm(x)| ≤ ‖fn − fm‖·‖x‖ < ε‖x‖.  (2.74)

In particular, for any x ∈ X, {fn(x)}n≥1 is a Cauchy sequence of scalars. So, the limit

limn fn(x) = f(x), say,

must exist. The function f defined in this way is clearly linear. We next show that f is bounded. For any j ∈ N and an appropriate n1 ∈ N, we have

‖f_{n1+j} − f_{n1}‖ < 1,

which implies

‖f_{n1+j}‖ < 1 + ‖f_{n1}‖,

or

|f_{n1+j}(x)| < (1 + ‖f_{n1}‖)‖x‖.

On letting j → ∞, we obtain

|f(x)| ≤ (1 + ‖f_{n1}‖)‖x‖.

Thus, f is a bounded linear functional on X. We next prove that {fn}n≥1 converges to f in the norm of X*. Using (2.74) again, for x ∈ X and n ≥ n0, we have

limm |fn(x) − fm(x)| = |fn(x) − limm fm(x)| ≤ ε‖x‖.

Hence

|fn(x) − f(x)| ≤ ε‖x‖,  x ∈ X,

which implies ‖fn − f‖ ≤ ε for n ≥ n0. □


It is easy to write down the general linear functional on a finite-dimensional linear space. The description of continuous linear functionals on Banach spaces entails some effort. However, not much effort is required to describe the continuous linear functionals on Hilbert spaces. We begin with some examples of continuous linear functionals.
Examples 2.10.24
(i) Let H be a Hilbert space of finite dimension and let e1, e2, …, en be an orthonormal basis in H. If x = ∑ akek, a1, a2, …, an ∈ C, is any vector in H, then f(x) = ∑ ak f(ek) clearly defines a linear functional on H. Moreover,

|f(x)| = |∑_{k=1}^{n} ak f(ek)| ≤ ∑_{k=1}^{n} |ak||f(ek)|
       ≤ (∑_{k=1}^{n} |ak|²)^{1/2} (∑_{k=1}^{n} |f(ek)|²)^{1/2}  [Cauchy–Schwarz Inequality]
       = M‖x‖,

where M = (∑_{k=1}^{n} |f(ek)|²)^{1/2} and ‖x‖ = (∑_{k=1}^{n} |ak|²)^{1/2}; i.e. f is a bounded [continuous] linear functional.
(ii) Consider the Hilbert space H = ℓ2 of square summable sequences of scalars. For y = {yn}n≥1 in ℓ2, define

fy(x) = (x, y) = ∑_{n=1}^{∞} xn ȳn.

Observe that

|∑_{n=1}^{∞} xn ȳn| ≤ ∑_{n=1}^{∞} |xn||yn| ≤ (∑_{n=1}^{∞} |xn|²)^{1/2} (∑_{n=1}^{∞} |yn|²)^{1/2} = ‖x‖₂‖y‖₂,

using the Cauchy–Schwarz Inequality. Thus, fy is a bounded linear functional on ℓ2 of norm at most ‖y‖₂. For y ∈ ℓ2,

fy(y) = (y, y) = ∑_{n=1}^{∞} |yn|² = ‖y‖₂²,

showing that ‖fy‖ = ‖y‖₂.


(iii) Consider the Hilbert space L2(X, M, µ) of complex-valued measurable functions f defined on X for which ∫_X |f|² dµ is finite. For g ∈ L2(X, M, µ), define

fg(h) = ∫_X h ḡ dµ,  h ∈ L2(X, M, µ).

Observe that

|∫_X h ḡ dµ| ≤ (∫_X |h|² dµ)^{1/2} (∫_X |g|² dµ)^{1/2} = ‖h‖₂‖g‖₂,

using the Cauchy–Schwarz Inequality. Thus, fg is a bounded linear functional on L2(X, M, µ) of norm at most ‖g‖₂. For g ∈ L2(X, M, µ),

fg(g) = ∫_X g ḡ dµ = ∫_X |g|² dµ = ‖g‖₂²,

showing that ‖fg‖ = ‖g‖₂.


(iv) Given a Hilbert space H and a vector y ∈ H, the function fy(x) = (x, y), x ∈ H, is a bounded linear functional on H of norm ‖y‖. Indeed, fy is clearly linear, and |fy(x)| = |(x, y)| ≤ ‖x‖‖y‖, so that ‖fy‖ ≤ ‖y‖. Furthermore, |fy(y)| = ‖y‖² and so ‖fy‖ = ‖y‖.
Note that (i) and (ii) are special cases of (iii) with µ the counting measure, and (iii) is a special case of (iv).
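The functional of Example (iv) and the norm identity ‖fy‖ = ‖y‖ can be illustrated in a finite-dimensional stand-in for H. A minimal sketch (assuming NumPy; the vector y is hypothetical), where `np.vdot` conjugates its first argument so that `np.vdot(y, x)` computes (x, y) = ∑ xₙȳₙ:

```python
import numpy as np

rng = np.random.default_rng(1)

# Finite-dimensional stand-in for H: C^6 with the usual inner product.
y = rng.standard_normal(6) + 1j * rng.standard_normal(6)

def f_y(x):
    # f_y(x) = (x, y) = sum_n x_n * conj(y_n)
    return np.vdot(y, x)     # np.vdot conjugates its FIRST argument

# |f_y(x)| <= ||x|| ||y|| (Cauchy-Schwarz), with equality attained at x = y.
for _ in range(100):
    x = rng.standard_normal(6) + 1j * rng.standard_normal(6)
    assert abs(f_y(x)) <= np.linalg.norm(x) * np.linalg.norm(y) + 1e-10

assert np.isclose(abs(f_y(y)), np.linalg.norm(y) ** 2)   # so ||f_y|| = ||y||
```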
The existence of orthogonal decompositions implies that all bounded linear func-
tionals on H can be obtained in this way.
Theorem 2.10.25 (Riesz Representation Theorem) Let H be a Hilbert space over C and let f ∈ H*, the space of all continuous linear functionals on H. Then there exists a unique vector y ∈ H such that f(x) = (x, y) for all x ∈ H.
Moreover, the mapping T : H → H* defined by T(y) = fy, where fy(x) = (x, y), is onto, conjugate linear and isometric.
If H is a Hilbert space over R, the mapping T is linear rather than conjugate linear.

Proof Let f ∈ H*. If f = 0, choose y = 0. Then f(x) = (x, y), x ∈ H. Furthermore, y = 0 is the only such element of H, since 0 = f(y) = (y, y) = ‖y‖².
Suppose that f ≠ 0 and let W = {x ∈ H : f(x) = 0}, known as the kernel of f and denoted by ker(f). Clearly, W is a linear subspace of H. Moreover, W is closed: it is the inverse image of the closed set {0} under the continuous linear functional f. Since f ≠ 0, we have W ≠ H. So, by the Orthogonal Decomposition Theorem 2.10.11, H = W ⊕ W⊥ with W⊥ ≠ {0}. There exists y0 ∈ W⊥, y0 ≠ 0, and f(y0) ≠ 0, since y0 ∉ W. Let y = λy0, where λ is the complex conjugate of f(y0)/‖y0‖². For an arbitrary x ∈ H, we can form the element x − [f(x)/f(y0)]y0 ∈ H. Observe that

f(x − [f(x)/f(y0)]y0) = f(x) − [f(x)/f(y0)]f(y0) = 0.

So, x − [f(x)/f(y0)]y0 ∈ W. Consequently,

(x − [f(x)/f(y0)]y0, y0) = 0,  i.e.  (x, y0) = [f(x)/f(y0)]‖y0‖²,

which implies f(x) = (x, y).
We next show that y is unique. Assuming the contrary, we have

(x, y′) = (x, y″) for all x ∈ H, where y′ ≠ y″.

But this is impossible, since the substitution x = y′ − y″ yields the contradiction ‖y′ − y″‖ = 0. The fact that ‖f‖ = ‖y‖ was proved in Example 2.10.24(iv).
The mapping T : H → H* defined by T(y) = fy, where fy(x) = (x, y), is conjugate linear: T(ay + bz) = f_{ay+bz}, where

f_{ay+bz}(x) = (x, ay + bz) = ā(x, y) + b̄(x, z) = āfy(x) + b̄fz(x).

Thus, T(ay + bz) = āfy + b̄fz.
The real case is left to the reader. This completes the proof. □
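In a finite-dimensional model the representer y can be written down explicitly from the values of f on an orthonormal basis: y = ∑ₖ conj(f(eₖ))eₖ. A sketch of this (assuming NumPy; the functional f below is a hypothetical example, given by an arbitrary coefficient vector):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# A hypothetical bounded linear functional on C^n: f(x) = sum_k c_k x_k.
c = rng.standard_normal(n) + 1j * rng.standard_normal(n)

def f(x):
    return c @ x

# Riesz representer: y_k = conj(f(e_k)), so that f(x) = (x, y) = sum_k x_k conj(y_k).
e = np.eye(n)
y = np.conj(np.array([f(e[k]) for k in range(n)]))

for _ in range(20):
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    # np.vdot conjugates its first argument, so np.vdot(y, x) computes (x, y)
    assert np.isclose(f(x), np.vdot(y, x))
```

The conjugation in y = conj(c) is exactly the conjugate linearity of the map y ↦ fy asserted in the theorem.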
Remarks 2.10.26
(i) The functionals defined on ℓ2 and L2(X, M, µ) in Examples 2.10.24(ii) and (iii) are the only continuous linear functionals on these spaces. To prove this statement without the use of the theorem would take considerable effort. The linear functionals defined in Example 2.10.24(i) are the only ones possible on that space.
(ii) The Riesz Representation Theorem has been proved for a Hilbert space. The hypothesis that the space is complete is essential for the theorem to hold. Consider the pre-Hilbert space ℓ0 of finitely nonzero sequences. Define

f(x) = ∑_{n=1}^{∞} x(n)/n,  x ∈ ℓ0 and x = {x(n)}n≥1.

Clearly, f is linear and

|f(x)| = |∑_{n=1}^{∞} x(n)/n| ≤ (∑_{n=1}^{∞} 1/n²)^{1/2} (∑_{n=1}^{∞} |x(n)|²)^{1/2} = M‖x‖,

where M = (∑_{n=1}^{∞} 1/n²)^{1/2}, using the Cauchy–Schwarz Inequality. Thus, f is a bounded linear functional on ℓ0. However, there exists no y ∈ ℓ0 for which f(x) = (x, y). Indeed, for x = en = (0, 0, …, 0, 1, 0, …), where 1 occurs in the nth place, f(x) = 1/n, while (x, y) = ȳ(n), so that y(n) = 1/n for every n. Consequently, y ∉ ℓ0.

Remark 2.10.27 In fact, every incomplete inner product space has a continuous
linear functional that cannot be represented by an element of the space. Indeed, the
linear functional defined by a vector in the completion but not in the incomplete
space is the desired linear functional.
Let Y be a subspace of a normed linear space X and f be a bounded linear
functional defined on Y. Then f can be extended to the whole of X, so that both the
functional and its extension have the same norm [Theorem 5.3.2]. Apart from the
fact that the procedure of extension is involved, the extension is not unique.
However, the existence of an extension of a continuous linear functional defined on
a subspace of a Hilbert space H to H is a direct consequence of the Riesz
Representation Theorem 2.10.25. Moreover, the extension is unique.
Theorem 2.10.28 Let H be a Hilbert space, Y a subspace of H and f a continuous linear functional defined on Y. Then there exists a unique F ∈ H* such that F|Y = f and ‖f‖Y = ‖F‖H, where

‖f‖Y = sup{|f(x)| : x ∈ Y, ‖x‖ = 1}  and  ‖F‖H = sup{|F(x)| : x ∈ H, ‖x‖ = 1}.

Proof Since f is linear and continuous, it follows that it is uniformly continuous on Y [Remark 2.10.22(ii)]. Hence, f can be extended to the closure Ȳ of Y with preservation of norm. This is shown as follows:
Let x ∈ Ȳ. There exists a sequence {xn} in Y such that xn → x and, in view of the linearity and continuity of f, limn f(xn) exists; moreover, it is independent of the sequence chosen. We define f(x) to be limn f(xn). Then |f(xn)| ≤ ‖f‖Y‖xn‖ and this implies |f(x)| ≤ ‖f‖Y‖x‖. Consequently, the norm of the extended f is at most the norm of the given f. The reverse inequality is trivial.
We may therefore assume without loss of generality that Y is a closed subspace of H. The Riesz Representation Theorem 2.10.25 asserts the existence of a unique element y ∈ Y such that

f(x) = (x, y) for all x ∈ Y,

and

‖f‖Y = ‖y‖.

We now extend f to the whole of H by defining

F(x) = (x, y) for all x ∈ H;

i.e. F(x) = 0 if x ∈ Y⊥ and F|Y = f. It is clear from the definition of F that

‖F‖H = ‖y‖ = ‖f‖Y.

We shall next show that any other extension of the linear functional f to the whole space increases the norm. Indeed, if F′ is any other extension of f to the whole space, then

F′(x) = (x, z)

for some z ∈ H, and

‖F′‖ = ‖z‖.

For x ∈ Y,

(x, y) = (x, z),

so that y − z ⊥ Y. Because y ∈ Y,

‖z‖² = ‖y‖² + ‖y − z‖²,

which implies that

‖F′‖ ≥ ‖f‖Y,

with strict inequality if y ≠ z. □


We note in passing that if Y is not the whole space H, then there exist extensions of arbitrarily large norm.
Let X be a normed linear space over C. The space X* of all bounded linear functionals on X is a Banach space. One can then consider continuous linear functionals on X*, that is, the space (X*)* = X**. By the preceding remark, X** is again a Banach space. The element x ∈ X defines a continuous linear functional on X*; that is, x determines an element s(x) of X** defined by

s(x)(x*) = x*(x),  x* ∈ X*.  (2.75)

It is apparent that s(x) is linear. Also, the inequality

|s(x)(x*)| = |x*(x)| ≤ ‖x*‖‖x‖

shows that s(x) ∈ X** and ‖s(x)‖ ≤ ‖x‖. One learns in a course on Banach spaces that the mapping s : X → X** defined by (2.75) above is an isometric isomorphism. The space X is said to be reflexive if the mapping s defined by (2.75) above is surjective. Not all Banach spaces are reflexive. However, all finite-dimensional normed linear spaces are. One of the distinguishing features of a Hilbert space is that it is reflexive. We begin by showing that H*, the dual of a Hilbert space H, is itself a Hilbert space.
Theorem 2.10.29 If H is a Hilbert space, then H* is a Hilbert space. Moreover, there exists a conjugate linear map T : H → H* which is one-to-one, onto, norm preserving and satisfies

(f1, f2)H* = (T⁻¹f2, T⁻¹f1).

Proof By Theorem 2.10.23, H* is a complete normed linear space. Consider the mapping T : H → H* defined by

T(x)(y) = (y, x),  x, y ∈ H.  (2.76)

Note that T defined on H and given by (2.76) is conjugate linear, one-to-one, norm preserving and onto [Theorem 2.10.25]. Therefore, T⁻¹ exists.
Define an inner product on H* as follows: given f1, f2 ∈ H*, let

(f1, f2)H* = (T⁻¹f2, T⁻¹f1).

It is easy to check that this defines an inner product on H*. It is related to the norm by (f1, f1)H* = ‖f1‖²H*, because

(f1, f1)H* = (T⁻¹f1, T⁻¹f1) = ‖T⁻¹f1‖² = ‖f1‖²H*.

Thus H* is a Hilbert space. □


Theorem 2.10.30 If H is a Hilbert space and H** = (H*)*, then the mapping s : H → H** (x ↦ s(x)), where the defining equation for s(x) is

s(x)(f) = f(x),  f ∈ H*,

is an isometric isomorphism between H and H**. Thus H is reflexive.

Proof Let T : H → H* and S : H* → H** be the conjugate linear maps assured by Theorem 2.10.29 (used twice). Since both are conjugate linear, one-to-one, norm preserving and onto, the composition ST : H → H** is linear, one-to-one, norm preserving and onto, which is to say that it is an isometric isomorphism between H and H**. Thus, we need only prove that

ST(x)(f) = f(x),  f ∈ H*,

and set s = ST.
By Theorem 2.10.29, we also have

T(x)(y) = (y, x)H,  x, y ∈ H,  (2.77)
S(g)(f) = (f, g)H*,  f, g ∈ H*,  (2.78)
(f, g)H* = (T⁻¹g, T⁻¹f)H,  f, g ∈ H*.  (2.79)

We compute ST from here as follows:

ST(x)(f) = S(Tx)(f)
         = (f, Tx)H*                      by (2.78)
         = (T⁻¹(Tx), T⁻¹f)H              by (2.79)
         = (x, T⁻¹f)H = T(T⁻¹f)(x)       by (2.77)
         = f(x).  □
The following result is an analogue of a familiar result from metric spaces.

Theorem 2.10.31 In order that the linear span of a system M of vectors be dense in H, it is necessary and sufficient that every continuous linear functional f ∈ H* which vanishes for all x ∈ M be identically zero.

Proof Necessity: suppose the linear span of M is dense in H, i.e. the closure of [M] is H, and f ∈ H* vanishes for all x ∈ M. By linearity, f vanishes on [M] and hence, by continuity, it vanishes on the closure of [M], which is the same as H.
Sufficiency: suppose the linear span of M is not dense in H, i.e. the closure of [M] is not H. Then there exists y ≠ 0 such that y ⊥ [M]. The linear functional f defined by f(x) = (x, y) for all x ∈ H vanishes for all x ∈ M but is not identically zero, because f(y) = (y, y) ≠ 0. □
Problem Set 2.10

2.10.P1. Let f ∈ RH² have the series expansion f = ∑_{j=0}^{∞} aj z^j. Define Cn(f) = an. Show that Cn is a continuous linear functional on RH².
2.10.P2. Let e0(t) = 1 and e1(t) = √3(2t − 1), t ∈ [0, 1], be vectors in the Hilbert space L2[0, 1]. Show that e0 ⊥ e1 and ‖e0‖ = ‖e1‖ = 1. Compute the vector y in the linear span of {e0, e1} closest to t², and also compute min_{a,b} ∫_{0}^{1} |t² − a − bt|² dt.
2.10.P3. Let X = R². Find M⊥ if
(a) M = {x}, where x = (ξ1, ξ2) ≠ 0;
(b) M is a linearly independent set {x1, x2} ⊆ X.
2.10.P4. For any subset M ≠ ∅ of a Hilbert space H, span(M) is dense in H if, and only if, M⊥ = {0}.

2.10.P5. (a) Prove that for any two subspaces M1 and M2 of a Hilbert space H, we have (M1 + M2)⊥ = M1⊥ ∩ M2⊥.
(b) Prove that for any two closed subspaces M1 and M2 of a Hilbert space H, we have

(M1 ∩ M2)⊥ = the closure of M1⊥ + M2⊥.

2.10.P6. (a) Let K1 and K2 be nonempty, closed and convex subsets of a Hilbert space H such that K1 ⊆ K2. Prove that, for all x ∈ H,

‖y1 − y2‖² ≤ 2[d(x, K1)² − d(x, K2)²],

where y1 and y2 are the points closest to x in K1 and K2, respectively.
(b) Let {Kn}n≥1 be an increasing sequence of nonempty closed convex subsets of H and let K be the closure of ∪n Kn. Prove that K is closed and convex. Also show that limn yn = y for all x ∈ H, where yn is the projection of x onto Kn, n = 1, 2, …, and y is the projection of x onto K.
2.10.P7. Let M be a closed subspace of a Hilbert space H and x0 ∈ H. Prove that min{‖x0 − x‖ : x ∈ M} = max{|(x0, y)| : y ∈ M⊥ and ‖y‖ = 1}.
2.10.P8. (a) Let a be a nonzero element of a Hilbert space H. Prove that, for all x ∈ H,

d(x, {a}⊥) = |(x, a)|/‖a‖.

(b) Let H = L2[0, 1] and let

F = {f ∈ L2[0, 1] : ∫_{0}^{1} f(x)dx = 0}.

Determine F⊥. For f(x) = exp(x), determine d(f, F).
2.10.P9. In the linear space C[0, 1], consider the functional F(x) = ∫_{0}^{1} x(t)f(t)dt, where f is a continuous function defined on [0, 1]. Show that ‖F‖ = ∫_{0}^{1} |f(t)|dt.
2.10.P10. Let H be a Hilbert space and f a nonzero continuous linear functional on H, i.e. f ∈ H*\{0}. Show that dim((ker(f))⊥) = 1.
2.10.P11. Prove that if f is a linear functional on a Hilbert space H and ker(f) is closed, then f is bounded.

2.10.P12. Show that the subspace M = {x = {xn}n≥1 ∈ ℓ2 : ∑_{n=1}^{∞} xn/√n = 0} is not a closed subspace of ℓ2.
2.10.P13. Prove that the system sin nx, n = 1, 2, …, is complete in L2[0, π].
2.10.P14. Let K be a nonempty closed convex set in a Hilbert space H. Show that K contains a unique vector k of smallest norm and that ℜ(k, k − x) ≤ 0 for all x ∈ K. Moreover, if k ∈ K satisfies ℜ(k, k − x) ≤ 0 for all x ∈ K, then k is the vector of smallest norm in K.
2.10.P15. Let y be a nonzero vector in a Hilbert space H and let

M = {x ∈ H : (x, y) = 0}.

What is M⊥?

2.11 Approximation in Hilbert Spaces

Let H be a Hilbert space and v1, v2, …, vn be linearly independent vectors in H. Suppose that x ∈ H. In linear approximation, it is required to find a method of computing the minimum value of the quantity

‖x − ∑_{j=1}^{n} λjvj‖,

where λ1, λ2, …, λn range over all scalars, and also to determine those values of λ1, λ2, …, λn for which the minimum is attained.
Let M be the closed linear space generated by the linearly independent vectors v1, v2, …, vn and let x ∈ H. By Theorem 2.10.10, there exists a unique minimising vector y ∈ M and x − y ⊥ M. Since y ∈ M, we have (x − y, y) = 0.
Denote (vj, vi) = aij and bi = (x, vi). If y = ∑_{j=1}^{n} cjvj is the minimising vector, then

(x − y, vi) = 0 for i = 1, 2, …, n,

which, written in full, reads

bi = ∑_{j=1}^{n} aij cj,  i = 1, 2, …, n.

Since the vectors {vi} are linearly independent, the matrix [aij] is nonsingular.
Consequently, the n linear equations in n unknowns c1, c2, …, cn have a unique
solution. n o
Pn
Let d ¼ inf x j¼1 kj v j : k 1 ; k 2 ; . . .; kn are scalars . Then
!
n
X n
X
2
2
d ¼ kx yk ¼ ð x y; x yÞ ¼ x; x cj vj ¼ k x k2 c j bj ð2:80Þ
j¼1 j¼1

If we replace v1, v2, …, vn by an orthonormal set u1, u2, …, un, then aij = 1 if i =
j and 0 if i 6¼ j. Hence, cj = bj, j = 1, 2, …, n and it follows from (2.80) that
n  
X n 
X 
d2 ¼ k x k2  bj  2 ¼ k x k2  x; uj 2 :
j¼1 j¼1

We have thus proved the following theorem.

Theorem 2.11.1 Let {u1, u2, …, un} be an orthonormal set in H and let x ∈ H. Then

‖x − ∑_{k=1}^{n} (x, uk)uk‖ ≤ ‖x − ∑_{k=1}^{n} λkuk‖

for all scalars λ1, λ2, …, λn. Equality holds if, and only if, λk = (x, uk), k = 1, 2, …, n. Moreover, ∑_{k=1}^{n} (x, uk)uk is the orthogonal projection of x onto the subspace M generated by {u1, u2, …, un}, and if d is the distance of x from M, then

d² = ‖x‖² − ∑_{k=1}^{n} |(x, uk)|².
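The normal equations bᵢ = ∑ⱼ aᵢⱼcⱼ derived above can be solved directly with the Gram matrix. A minimal numerical sketch (hypothetical data, assuming NumPy; the real case, so conjugates drop out):

```python
import numpy as np

rng = np.random.default_rng(2)

# Best approximation of x from span{v1, v2, v3} in R^6 (hypothetical data).
V = rng.standard_normal((6, 3))   # columns are the linearly independent v_j
x = rng.standard_normal(6)

A = V.T @ V                       # Gram matrix a_ij = (v_j, v_i)
b = V.T @ x                       # b_i = (x, v_i)
c = np.linalg.solve(A, b)         # unique solution since [a_ij] is nonsingular
y = V @ c                         # minimising vector

assert np.allclose(V.T @ (x - y), 0)                # x - y is orthogonal to M
d2 = np.dot(x - y, x - y)
assert np.isclose(d2, np.dot(x, x) - np.dot(c, b))  # d^2 = ||x||^2 - sum_j c_j b_j, as in (2.80)
```

If the columns of V are first orthonormalised (e.g. by QR), the Gram matrix becomes the identity and cⱼ = bⱼ, which is the content of the orthonormal case above.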

Remark 2.11.2 If the subspace M is generated by (n + 1) orthonormal vectors and it is desired to obtain the distance of x ∈ H from M, then

d² = ‖x − y‖² = ‖x‖² − ∑_{k=1}^{n+1} |(x, uk)|²,  (2.81)

where y is the orthogonal projection of x on M and is given by

y = ∑_{k=1}^{n+1} (x, uk)uk.  (2.82)

The reader will notice that the first n components in the sums on the right of (2.81)
and (2.82) remain unaltered when the dimension of the space is increased from n to
n + 1. This exhibits the importance of orthonormalising the linearly independent
vectors.
Example 2.11.3 Consider the real inner product space C[−1, 1], the inner product (x, y), x, y ∈ C[−1, 1], being defined by

(x, y) = ∫_{−1}^{1} x(t)y(t)dt.

Consider the three linearly independent vectors 1, t, t² (the Wronskian of the vectors 1, t, t² is 2 ≠ 0) in C[−1, 1]. The Gram–Schmidt orthonormalisation process yields

u0(t) = 1/√2,  u1(t) = √(3/2)·t,  u2(t) = √(5/2)·(3t² − 1)/2.

Let M2 [respectively, M3] be the linear space generated by {u0, u1} [respectively
{u0, u1, u2}]. Consider x(t) = et in C[−1, 1]. We shall compute the distance of
x from M2 and M3.

Z1 1
1 e e
ðx; u0 Þ ¼ pffiffiffi et dt ¼ pffiffiffi ;
2 2
1

rffiffiffi Z1
3 pffiffiffi
ðx; u1 Þ ¼ tet dt ¼ 6 e 1
2
1

rffiffiffi Z1 rffiffiffi
51  2  t 5 1

ðx; u2 Þ ¼ 3t 1 e dt ¼ e 7e :
22 2
1

Let y2 and y3 be the projections of x(t) = e^t on the subspaces M2 and M3, respectively. Then

y2 = (x, u0)u0 + (x, u1)u1
   = [(e − e⁻¹)/√2]·(1/√2) + (√6 e⁻¹)·√(3/2)·t
   = (e − e⁻¹)/2 + 3e⁻¹t

and

y3 ¼ ðx; u0 Þu0 þ ðx; u1 Þu1 þ ðx; u2 Þu2


rffiffiffi rffiffiffi
1 1 1 5 1
 51 2 
¼ ðe e Þ þ 3e t þ e 7e 3t 1
2 2 22
1 5  
¼ ðe e 1 Þ þ 3e 1 t þ e 7e 1 3t2 1 :
2 4

If d2 [respectively, d3] denotes the distance of x from M2 [respectively, M3], then by


Theorem 2.11.1,

d22 ¼ k xk2 jðx; u0 Þj2 jðx; u1 Þj2


2
1  e e 1
¼ e2 e 2 pffiffiffi 6e 2
2 2
¼ 1 7e 2

and

d23 ¼ kxk2 ¼ jðx; u0 Þj2 jðx; u1 Þj2 jðx; u2 Þj2


5 2
¼ d22 e 7e 1
2
5 2 
¼ 1 7e 2 e þ 49e 2 14
2
5 2 259 2
¼ 36 e e :
2 2
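The closed-form distances in Example 2.11.3 can be checked by numerical quadrature. The following sketch assumes NumPy and uses a simple trapezoidal rule on a fine grid:

```python
import numpy as np

t = np.linspace(-1.0, 1.0, 200001)
dt = t[1] - t[0]

def ip(f, g):
    # trapezoidal approximation to the inner product (f, g) on [-1, 1]
    h = f * g
    return dt * (h.sum() - 0.5 * (h[0] + h[-1]))

x = np.exp(t)
u0 = np.full_like(t, 1 / np.sqrt(2))
u1 = np.sqrt(3 / 2) * t
u2 = np.sqrt(5 / 2) * (3 * t**2 - 1) / 2

d2_sq = ip(x, x) - ip(x, u0)**2 - ip(x, u1)**2
d3_sq = d2_sq - ip(x, u2)**2

assert np.isclose(d2_sq, 1 - 7 * np.exp(-2), atol=1e-6)                       # d2^2
assert np.isclose(d3_sq, 36 - 2.5 * np.e**2 - 129.5 * np.exp(-2), atol=1e-6)  # d3^2
```

The shrinking values d2² ≈ 0.0527 and d3² ≈ 0.0015 show how quickly the Legendre-type expansion of e^t converges on [−1, 1].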

Problem Set 2.11

2.11.P1. Find min_{a,b,c} ∫_{−1}^{1} |t³ − a − bt − ct²|² dt and max ∫_{−1}^{1} t³g(t)dt, where g is subject to the restrictions

∫_{−1}^{1} g(t)dt = ∫_{−1}^{1} tg(t)dt = ∫_{−1}^{1} t²g(t)dt = 0,  ∫_{−1}^{1} |g(t)|² dt = 1.

2.11.P2. Find the point nearest to (1, −1, 1) in the linear span of (1, ω, ω²) and (1, ω², ω) in C³, where ω = exp(2πi/3).

2.12 Weak Convergence

Let {xn}n≥1 be a sequence in a Hilbert space H. Recall that {xn}n≥1 converges to x in H if ‖xn − x‖ = (xn − x, xn − x)^{1/2} → 0 as n → ∞, and we write xn → x. From now on this will be called strong convergence, to distinguish it from weak convergence, to be introduced shortly. The relationship between the two types of convergence will be discussed. The concepts of strong convergence and weak convergence are identical in finite-dimensional spaces. A characterisation of weak convergence in special spaces will also find a mention below.

Definition 2.12.1 A sequence of vectors {xn}n≥1 converges weakly to a vector x, and we write xn ⇀ x, if

limn→∞ (xn, y) = (x, y)

for all y ∈ H.

The concepts of a weakly Cauchy sequence and weak completeness are defined analogously.
Remarks 2.12.2
(i) A sequence cannot converge weakly to two different limits: assume that xn ⇀ x0 and xn ⇀ y0. Then

(xn, y) → (x0, y) and (xn, y) → (y0, y)

for all y ∈ H. Consequently, (x0, y) = (y0, y), or (x0 − y0, y) = 0, for all y ∈ H. If we choose y = x0 − y0, we obtain (x0 − y0, x0 − y0) = 0, which implies x0 = y0.
(ii) If xn ⇀ x0, then every subsequence {xn_k}k≥1 converges weakly to x0.
(iii) Strong convergence of {xn}n≥1 to x0 implies xn ⇀ x0. Indeed, for y ∈ H, we have

|(xn − x0, y)| ≤ ‖xn − x0‖‖y‖,

by the Cauchy–Schwarz Inequality.
(iv) The converse of (iii) is, however, not true. Indeed, let {en}n≥1 be an infinite orthonormal sequence of vectors in H. Since for any y ∈ H,

∑_{n=1}^{∞} |(y, en)|² ≤ ‖y‖²  (by Bessel's Inequality),

therefore limn→∞ (en, y) = 0. Thus, the sequence {en}n≥1 converges weakly to the zero vector, but this sequence cannot converge strongly, since

‖ei − ej‖² = 2  (i ≠ j),

so that ‖ei − ej‖ ↛ 0 as i, j → ∞.
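Remark (iv) can be illustrated in a truncated model of ℓ²: along the orthonormal sequence {eₙ}, the inner products (eₙ, y) die out while the mutual distances stay fixed at √2. A sketch (assuming NumPy; y below is a hypothetical square-summable sequence):

```python
import numpy as np

N = 1000
y = 1.0 / np.arange(1, N + 1)     # first N terms of a square-summable sequence

def e(n):
    # the nth standard basis vector (truncated to N coordinates)
    v = np.zeros(N)
    v[n] = 1.0
    return v

# (e_n, y) = y_n -> 0: weak convergence of e_n to 0 against this y
inner = np.array([np.dot(e(n), y) for n in range(N)])
assert inner[-1] < 1e-2 and inner[-1] < inner[0]

# but no strong convergence: ||e_i - e_j|| = sqrt(2) for every i != j
assert np.isclose(np.linalg.norm(e(3) - e(7)), np.sqrt(2))
```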


However, the following theorem holds:
Theorem 2.12.3 If H is a finite-dimensional Hilbert space, strong convergence is equivalent to weak convergence.

Proof Since we have already shown that, in any Hilbert space, strong convergence implies weak convergence [Remark 2.12.2(iii)], it is enough to show in this situation that weak convergence implies strong convergence. To this end, let e1, …, ek be an orthonormal basis for H and let

xn ⇀ x,

where

xn = a1(n)e1 + ⋯ + ak(n)ek  for n = 1, 2, …,

and

x = a1e1 + ⋯ + akek.

Since xn ⇀ x, it follows that

(xn, ej) → (x, ej),  i.e.  aj(n) → aj

for j = 1, …, k. For any prescribed ε > 0, there must be an integer n0 such that for all n > n0 and every j = 1, …, k,

|aj(n) − aj| < √(ε/k);

hence

‖xn − x‖² = ‖∑_{j=1}^{k} (aj(n) − aj)ej‖² = ∑_{j=1}^{k} |aj(n) − aj|² < ε.

Thus, xn → x strongly. This completes the proof. □


The next result pinpoints the relationship between weak and strong convergence.

Theorem 2.12.4 Let {xn}n≥1 be a sequence in a Hilbert space H. Then xn → x if, and only if, xn ⇀ x and lim supn→∞ ‖xn‖ ≤ ‖x‖.

Proof Let xn → x. Then xn ⇀ x [Remark 2.12.2(iii)]. Also, lim supn→∞ ‖xn‖ = limn→∞ ‖xn‖ = ‖x‖, since 0 ≤ | ‖xn‖ − ‖x‖ | ≤ ‖xn − x‖.
Conversely, let xn ⇀ x and lim supn→∞ ‖xn‖ ≤ ‖x‖. For each n, 0 ≤ ‖xn − x‖² = (xn − x, xn − x) = ‖xn‖² + ‖x‖² − 2ℜ(xn, x). Since

lim supn→∞ ‖xn‖ ≤ ‖x‖ and ℜ(xn, x) → ℜ(x, x) = ‖x‖²,

we have

0 ≤ lim supn→∞ ‖xn − x‖² ≤ ‖x‖² + ‖x‖² − 2‖x‖² = 0,

so that limn→∞ ‖xn − x‖ exists and equals zero. □


The Riesz Representation Theorem enables us to prove the following analogue of the classical Bolzano–Weierstrass Theorem.

Theorem 2.12.5 Any bounded sequence in H has a weakly convergent subsequence, and the weak limit obeys the same bound.

Proof Let {xn}n≥1 be a sequence in H and M > 0 be such that ‖xn‖ ≤ M for all n. We need to find a weakly convergent subsequence of {xn}n≥1.
By the Cauchy–Schwarz Inequality,

|(xn, x1)| ≤ ‖xn‖‖x1‖ ≤ M²

for all n. The classical Bolzano–Weierstrass Theorem shows that the bounded sequence {(xn, x1)}n≥1 has a convergent subsequence {(xn(1), x1)}n(1)≥1, say. Applying the preceding argument to the sequence {(xn(1), x2)}n(1)≥1, we extract a convergent subsequence {(xn(2), x2)}n(2)≥1.
Continuing inductively, we obtain for each k a convergent subsequence {(xn(k), xk)}n(k)≥1 of {(xn(k−1), xk)}n(k−1)≥1.
Consider now the diagonal sequence {zp}p≥1, where zp denotes the pth term of the sequence {xn(p)}n(p)≥1. We show that for each x ∈ H, the sequence {(zp, x)}p≥1 of scalars converges.
If x = xm for some m, then for p > m the terms (zp, x) form a subsequence of the convergent sequence {(xn(m), xm)}n(m)≥1, so {(zp, x)}p≥1 is convergent. Hence, if x ∈ span{x1, x2, …}, then {(zp, x)}p≥1 converges in the field of scalars.
Let x lie in the closure of span{x1, x2, …}. Consider a sequence {yr}r≥1 in span{x1, x2, …} such that yr → x as r → ∞. Then for all n, m and r, we have

|(zn, x) − (zm, x)| = |(zn − zm, x)|
 ≤ |(zn − zm, x − yr)| + |(zn − zm, yr)|
 ≤ ‖zn − zm‖‖x − yr‖ + |(zn − zm, yr)|
 ≤ 2M‖x − yr‖ + |(zn − zm, yr)|.

Since ‖x − yr‖ → 0 as r → ∞ and |(zn − zm, yr)| → 0 as n, m → ∞ for each r, we see that {(zn, x)}n≥1 is a Cauchy sequence of scalars and is, therefore, convergent.
Next, let x be orthogonal to the closure of span{x1, x2, …}. Then (zn, x) = 0 for all n, since zn is in span{x1, x2, …}. Thus (zn, x) → 0 as n → ∞.
By the Orthogonal Decomposition Theorem 2.10.11, H is the direct sum of the closure of span{x1, x2, …} and its orthogonal complement. Hence, {(zn, x)}n≥1 converges for each x ∈ H. Define

f(x) = limn→∞ (x, zn),  x ∈ H.  (2.83)

Clearly, f is linear and

|f(x)| = limn→∞ |(x, zn)| ≤ M‖x‖

for all x ∈ H. Thus, f is a continuous linear functional on H satisfying ‖f‖ ≤ M. By the Riesz Representation Theorem 2.10.25, there exists a unique y ∈ H such that

f(x) = (x, y),  x ∈ H,  (2.84)

and ‖y‖ = ‖f‖ ≤ M. On comparing (2.83) and (2.84), we obtain

limn→∞ zn = y (weakly).

This completes the proof. □


Every convergent sequence in a normed linear space $X$ is bounded. This is easily seen as follows: let $\{x_n\}_{n\ge1}$ be a sequence in $X$ and suppose that $\lim_{n\to\infty}x_n = x$. For a given $\varepsilon > 0$, there exists an integer $n_0$ such that $n \ge n_0$ implies $\|x_n - x\| < \varepsilon$. But since $\|x_n\| - \|x\| \le \|x_n - x\|$, this implies $\|x_n\| < \varepsilon + \|x\|$ for $n \ge n_0$. It now follows that, for every $n$,
$$\|x_n\| < \varepsilon + \|x\| + M,$$
where $M = \max\{\|x_k\| : 1 \le k \le n_0\}$.


Thus, the terms of a convergent sequence in a normed linear space, and a fortiori in a Hilbert space, are bounded. The foregoing statement remains true for a weakly convergent sequence.

Theorem 2.12.6 If $H$ is a Hilbert space and $x_n \xrightarrow{w} x$, then there exists a positive constant $M$ such that
$$\|x_n\| \le M \quad\text{for all } n.$$

We discuss preliminary results needed for the proof of Theorem 2.12.6.


Definition 2.12.7 A real functional $p(x)$ on $H$ is said to be convex if for all $x, y \in H$ and $a \in \mathbb{C}$, the following hold:
$$p(x + y) \le p(x) + p(y) \quad\text{and}\quad p(ax) = |a|\,p(x).$$
Observe that (i) $p(0) = 0$, (ii) $p(x - y) \ge |p(x) - p(y)|$ and (iii) $p(x) \ge 0$, where $x, y \in H$. Indeed, $p(0) = p(0 \cdot x) = 0 \cdot p(x) = 0$. Also, $p(x - y) + p(y) \ge p(x)$ and hence $p(x - y) \ge p(x) - p(y)$. Since $p(x - y) = |-1|\,p(y - x) \ge p(y) - p(x)$, it follows that $p(x - y) \ge |p(x) - p(y)|$. On setting $y = -x$, we obtain $p(2x) = 2p(x) \ge |p(x) - p(-x)| = 0$.
That a lower semi-continuous convex functional on a Hilbert space admits a bound of the form $M\|x\|$ is the content of the Lemma below. In conjunction with the observation above, it will further follow that it is uniformly continuous.

Lemma 2.12.8 Suppose $p(x)$ is a convex functional on a Hilbert space $H$ and assume that $p(x)$ is lower semi-continuous. Then there exists $M > 0$ such that
$$p(x) \le M\|x\| \quad\text{for all } x \in H.$$

Proof We first show that the functional $p(x)$ is bounded on the ball $S(0,1)$. We assume the contrary. Then $p(x)$ is unbounded on every ball, because every ball is obtained by dilation and/or translation of the ball $S(0,1)$. We choose a point $x_1 \in S(0,1)$ such that $p(x_1) > 1$. The lower semi-continuity of the functional $p(x)$ implies that there exists a ball $S(x_1, \rho_1) \subseteq S(0,1)$ with radius $\rho_1 < \tfrac12$ on which $p(x) > 1$. By reducing the radius $\rho_1$, we may assume that $\overline{S}(x_1, \rho_1) \subseteq S(0,1)$. Since $p(x)$ is unbounded on every ball, in a similar manner we obtain a point $x_2 \in S(x_1, \rho_1)$ and also a closed ball $\overline{S}(x_2, \rho_2) \subseteq S(x_1, \rho_1)$ with radius $\rho_2 < \tfrac12\rho_1$, on which $p(x) > 2$. Continuing the process, we obtain an infinite sequence of balls
$$S(0,1) \supseteq S(x_1, \rho_1) \supseteq S(x_2, \rho_2) \supseteq \cdots,$$
for which $\rho_k < \tfrac12\rho_{k-1}$ ($k = 1, 2, \ldots$ and $\rho_0 = 1$) and also $p(x) > n$ if $x \in S(x_n, \rho_n)$. Observe that the sequence $\{x_n\}_{n\ge1}$ of the centres of the balls $S(x_n, \rho_n)$, $n = 1, 2, \ldots$, is Cauchy, and since $H$ is complete, $\lim_{n\to\infty}x_n$ exists and equals $x$, say. Then $x$ lies in the intersection of the closed balls, and hence $p(x) > n$ for each $n$, which is a contradiction.

Let $x \in H$, $x \ne 0$, be arbitrary. Then $x/2\|x\|$ is an element of $H$ of norm $\tfrac12$ and is, therefore, in $S(0,1)$. Now,
$$p(x/2\|x\|) \le M_1,$$
where $M_1$ is an upper bound of $p$ on $S(0,1)$, i.e. $p(x) \le 2M_1\|x\|$. Take $M = 2M_1$. ∎


Corollary 2.12.9 Let $p_k(x)$, $k = 1, 2, \ldots$, be a sequence of convex continuous functionals on $H$. If this sequence is bounded at each point $x \in H$, then the functional
$$p(x) = \sup_k p_k(x)$$
is also convex and bounded, and hence continuous.

Proof Evidently, $p(x)$ is a convex functional. On the other hand, for each $x_0 \in H$ and each $\varepsilon > 0$, there exists $N$ such that
$$p_N(x_0) > p(x_0) - \tfrac12\varepsilon,$$
i.e.
$$p(x_0) - p_N(x_0) < \tfrac12\varepsilon.$$
By continuity of the functional $p_N(x)$, there exists $\delta > 0$ such that
$$|p_N(x) - p_N(x_0)| < \tfrac12\varepsilon$$
for $\|x - x_0\| < \delta$. But if $\|x - x_0\| < \delta$, then
$$p(x) - p(x_0) > \sup_k p_k(x) - p_N(x_0) - \tfrac12\varepsilon \ge p_N(x) - p_N(x_0) - \tfrac12\varepsilon > -\varepsilon.$$
This implies that the functional $p(x)$ is lower semi-continuous. By Lemma 2.12.8, it follows that $p(x)$ is bounded. Continuity now follows from the observation preceding the Lemma. ∎
Every weakly convergent sequence of vectors in a Hilbert space is bounded.
This is an immediate consequence of the following theorem.
Theorem 2.12.10 Let $\{U_k\}_{k\ge1}$ be a sequence of continuous linear functionals defined on the Hilbert space $H$. Suppose that the numerical sequence $\{U_k(x)\}_{k\ge1}$ is bounded for each $x \in H$. Then the sequence $\{\|U_k\|\}_{k\ge1}$ of norms of the functionals is bounded.

Proof For $x \in H$, define
$$p_k(x) = |U_k(x)|, \quad k = 1, 2, \ldots.$$
Then $\{p_k\}_{k\ge1}$ is a sequence of convex continuous functionals. By Corollary 2.12.9, the functional
$$p(x) = \sup_k p_k(x)$$
is convex and bounded; i.e. there exists $M > 0$ such that
$$\sup_{\|x\|\le1} p(x) \le M.$$
Consequently,
$$\|U_k\| = \sup_{\|x\|\le1}|U_k(x)| = \sup_{\|x\|\le1} p_k(x) \le \sup_{\|x\|\le1}\sup_k p_k(x) = \sup_{\|x\|\le1} p(x) \le M.$$
This completes the proof. ∎


Proof of Theorem 2.12.6 Let $\{x_n\}_{n\ge1}$ be a weakly convergent sequence. Each vector $x_n$ determines a functional $U_n(x) = (x, x_n)$. Since the sequence $\{x_n\}_{n\ge1}$ is weakly convergent, the numerical sequence $\{U_n(x)\}_{n\ge1}$ converges for each $x \in H$ and hence is bounded. Using Theorem 2.12.10, it follows that
$$\|U_n\| \le M, \quad n = 1, 2, \ldots.$$
As $\|U_n\| = \|x_n\|$, $n = 1, 2, \ldots$ [Example 2.10.24(iv)], the result follows. ∎
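The contrapositive of Theorem 2.12.10 is easy to see concretely. A hedged numerical sketch (not from the book, with an example sequence chosen for illustration):

```python
# Hedged illustration (not from the book) of Theorem 2.12.10 in
# contrapositive form: the functionals U_k(x) = k * x_k on l^2 have
# unbounded norms ||U_k|| = k, so they cannot be pointwise bounded.
# Indeed, for x with x_k = k**(-0.9) (which lies in l^2, since
# sum k**(-1.8) < infinity) the values U_k(x) = k**0.1 grow without bound.

norm_x_sq = sum(k ** -1.8 for k in range(1, 200001))   # partial sum, converges
values = [k * (k ** -0.9) for k in (10, 10**3, 10**6, 10**9)]

print(norm_x_sq)  # finite: x is (up to truncation) an element of l^2
print(values)     # k**0.1: grows without bound
```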



Definition 2.12.11 A sequence of vectors $\{x_n\}_{n\ge1}$ in an inner product space $H$ is said to be weakly Cauchy if, for each $y \in H$,
$$\lim_{m,n}(x_m - x_n, y) = 0.$$
An inner product space is said to be weakly complete if every weakly Cauchy sequence in it converges to a weak limit in the space.

Corollary 2.12.12 Let $H$ be a Hilbert space. Then $H$ is weakly complete.

Proof Let $\{x_n\}_{n\ge1}$ be a Cauchy sequence in the sense of weak convergence, that is, for each $y \in H$,
$$\lim_{m,n}(x_m - x_n, y) = 0.$$
It follows that the sequence $\{(x_n, y)\}_{n\ge1}$ of scalars converges for each $y$ in $H$. By Theorem 2.12.10, the sequence $\{x_n\}_{n\ge1}$ is bounded:
$$\|x_n\| \le M, \quad n = 1, 2, \ldots.$$
Therefore, the limit
$$\lim_{n\to\infty}(x, x_n)$$
defines a linear functional $U(x)$ with norm less than or equal to $M$. By the Riesz Representation Theorem 2.10.25, $U(x) = (x, z)$, where $z$ is a unique element of the Hilbert space $H$. This element is the weak limit of the sequence $\{x_n\}_{n\ge1}$. ∎
We give below two applications of Corollary 2.12.9.

Theorem 2.12.13 (F. Riesz) If a functional $U$ is defined everywhere on $L^2[a, b]$ by the formula
$$U(x) = \int_a^b x(t)y(t)\,dt, \quad x \in L^2[a, b],$$
where $y$ is a fixed measurable function defined on $[a, b]$, then $U$ is a bounded linear functional on $L^2[a, b]$, so that $y \in L^2[a, b]$.

Proof Clearly, $U$ is a linear functional on $L^2[a, b]$. Set
$$E_n = \{t : t \in [a, b] \cap [-n, n] \text{ and } |y(t)| \le n\}$$
and
$$p_n(x) = \int_{E_n} |x(t)y(t)|\,dt, \quad x \in L^2[a, b].$$

Then $\{p_n\}_{n\ge1}$ is a sequence of convex functionals: indeed, for $x, z \in L^2[a, b]$ and $a \in \mathbb{C}$,
$$p_n(x + z) = \int_{E_n} |[x(t) + z(t)]y(t)|\,dt \le \int_{E_n} |x(t)y(t)|\,dt + \int_{E_n} |z(t)y(t)|\,dt = p_n(x) + p_n(z)$$
and
$$p_n(ax) = \int_{E_n} |ax(t)y(t)|\,dt = |a|\int_{E_n} |x(t)y(t)|\,dt = |a|\,p_n(x).$$
Moreover,
$$p_n(x) \le n\int_{E_n} |x(t)|\,dt \le n\left(\int_{E_n} |x(t)|^2\,dt\right)^{1/2}\left(\int_{E_n} dt\right)^{1/2} = n\,\mu(E_n)^{1/2}\,\|x\|_2,$$
using the Cauchy–Schwarz Inequality, where $\mu$ denotes the usual Lebesgue measure.

Thus, for $n = 1, 2, \ldots$, $p_n$ is a continuous convex functional on $L^2[a, b]$. The equality
$$p(x) = \lim_n p_n(x) = \lim_n \int_{E_n} |x(t)y(t)|\,dt = \int_a^b |x(t)y(t)|\,dt,$$
which holds by the Monotone Convergence Theorem 1.3.6, shows that $p(x)$ is finite for any $x$ in $L^2[a, b]$. By Corollary 2.12.9, the functional $p(x)$ is bounded; i.e. there exists $M > 0$ such that
$$p(x) \le M\|x\|, \quad x \in L^2[a, b].$$
Thus
$$|U(x)| \le p(x) \le M\|x\|, \quad x \in L^2[a, b],$$
i.e. $U$ is a bounded linear functional on $L^2[a, b]$; so $y \in L^2[a, b]$ and $\|y\|_2 = \|U\|$, using the definition of $U$ and the Riesz Representation Theorem. ∎
Theorem 2.12.14 (Landau) If $U$ is a functional defined everywhere on $\ell^2$ by means of the formula
$$U(x) = \sum_{k=1}^\infty a_k x_k, \quad x = \{x_k\}_{k\ge1} \in \ell^2,$$
where $\{a_k\}_{k\ge1}$ is some fixed sequence, then $\sum_{k=1}^\infty |a_k|^2 < \infty$.

Proof Define $p_n(x) = \sum_{k=1}^n |a_k x_k|$, $x = \{x_k\}_{k\ge1} \in \ell^2$. Check that $p_n$, $n = 1, 2, \ldots$, is a continuous convex functional. Then the equality
$$p(x) = \lim_n p_n(x) = \lim_n \sum_{k=1}^n |a_k x_k| = \sum_{k=1}^\infty |a_k x_k|$$
implies that $p(x)$ is finite for any $x \in \ell^2$. So, by Corollary 2.12.9, the functional $p(x)$ is continuous; i.e. there exists $M > 0$ such that $p(x) \le M\|x\|$, $x \in \ell^2$. Consequently,
$$|U(x)| \le \sum_{k=1}^\infty |a_k x_k| = p(x) \le M\|x\|.$$
So, $U$ is a bounded linear functional on $\ell^2$. The form of $U$ and the Riesz Representation Theorem imply that $\sum_{k=1}^\infty |a_k|^2 < \infty$. ∎

Remark 2.12.15 Landau's Theorem may also be stated as follows: if $\sum_{k=1}^\infty a_k x_k$ converges for every $\{x_k\}_{k\ge1}$ in $\ell^2$, then $\sum_{k=1}^\infty |a_k|^2 < \infty$.
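Landau's Theorem can be probed numerically in contrapositive form. A hedged sketch (not from the book; the particular sequences are chosen for illustration):

```python
import math

# Hedged numerical illustration (not from the book) of Remark 2.12.15:
# a_k = 1/sqrt(k) is NOT square-summable, and indeed there is an x in l^2
# for which sum a_k x_k diverges: take x_k = 1/(sqrt(k) * log k).  Then
# sum x_k^2 = sum 1/(k log^2 k) < infinity, while
# sum a_k x_k = sum 1/(k log k) = infinity (it grows like log log).

def partial(f, lo, hi):
    return sum(f(k) for k in range(lo, hi))

x_sq = lambda k: 1.0 / (k * math.log(k) ** 2)   # terms of sum x_k^2
axk  = lambda k: 1.0 / (k * math.log(k))        # terms of sum a_k x_k

tail_x = partial(x_sq, 10**5, 10**6)    # tail of the convergent series: small
tail_ax = partial(axk, 10**5, 10**6)    # same tail of the divergent series: large

print(tail_x, tail_ax)
```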

Problem Set 2.12


2.12.P1. Show that for a sequence $\{x_n\}_{n\ge1}$ in an inner product space $X$ and $x \in X$, the conditions
(i) $\|x_n\| \to \|x\|$ and (ii) $(x_n, x) \to (x, x)$
imply $x_n \to x$ in $X$.

2.12.P2. (Banach–Saks) Let $\{x_n\}_{n\ge1}$ be a sequence in a Hilbert space $H$ converging weakly to $x \in H$. Prove that there exists a subsequence $\{x_{n_k}\}_{k\ge1}$ such that the sequence $\{y_k\}_{k\ge1}$ defined by
$$y_k = \frac{1}{k}(x_{n_1} + x_{n_2} + \cdots + x_{n_k})$$
converges strongly to $x$.
2.12.P3. (a) (Mazur's Theorem) Let $\{x_n\}_{n\ge1}$ be a weakly convergent sequence in a Hilbert space $H$ and let $x$ be its weak limit. Prove that $x$ lies in the closed convex hull of the range $\{x_n : n \ge 1\}$ of the sequence.
(b) Let $C$ be a convex subset of a Hilbert space $H$. Prove that $C$ is closed if, and only if, it contains the weak limit of every sequence of points in it.
2.12.P4. (a) Let $H$ be a separable Hilbert space and let $\{e_n\}_{n\ge1}$ be an orthonormal basis for $H$. Let $B = \{x \in H : \|x\| \le 1\}$. For $x, y \in H$, let
$$d(x, y) = \sum_{n=1}^\infty 2^{-n}|(x - y, e_n)|. \tag{2.85}$$
Show that $d$ is a metric on $B$.
(b) Show that on $B$ the topology generated by $d$ is the same as the one given by the weak topology, i.e. $d(x_k, x) \to 0$ if, and only if, $x_k \xrightarrow{w} x$.
(c) Show that the metric space $(B, d)$ is compact.

2.13 Applications

Müntz’s Theorem
Weierstrass's Theorem for C[0, 1] says, in effect, that all linear combinations of the functions
$$1, x, x^2, \ldots, x^n, \ldots \tag{2.86}$$
are dense in C[0, 1]. Instead of working with all positive powers of x, let us permit gaps to occur, and consider the infinite set of functions
$$1, x^{n_1}, x^{n_2}, \ldots, x^{n_k}, \ldots, \tag{2.87}$$
where the $n_k$ are positive integers satisfying $n_1 < n_2 < \cdots < n_k < \cdots$. The result we shall prove is called Müntz's Theorem and asserts that the linear combinations of the functions (2.87) are dense in C[0, 1], and hence in $L^2[0, 1]$, if, and only if, the series $\sum_{k=1}^\infty \frac{1}{n_k}$ diverges. The following will be needed in Sect. 2.13.

Definition Let $x_1, x_2, \ldots, x_n$ be any vectors in an inner product space $X$. Then the $n \times n$ matrix $G(x_1, x_2, \ldots, x_n)$ whose $(i, j)$th entry is $(x_i, x_j)$, where $(\cdot,\cdot)$ is the inner product in $X$, is called the Gram matrix of the given finite sequence of vectors. Its determinant is called their Gram determinant.

Proposition The Gram matrix $G(x_1, x_2, \ldots, x_n)$ is nonsingular if, and only if, the vectors $x_1, x_2, \ldots, x_n$ are linearly independent.

Proof Observe that for the given $G$, and any $n$-tuple of scalars $x = (\xi_1, \xi_2, \ldots, \xi_n)$, we have
$$xG = [\xi_1, \xi_2, \ldots, \xi_n]\begin{bmatrix}(x_1, x_1) & (x_1, x_2) & \cdots & (x_1, x_n)\\ \vdots & \vdots & \ddots & \vdots\\ (x_n, x_1) & (x_n, x_2) & \cdots & (x_n, x_n)\end{bmatrix} = \left[\sum_{i=1}^n \xi_i(x_i, x_1)\;\; \sum_{i=1}^n \xi_i(x_i, x_2)\;\; \cdots\;\; \sum_{i=1}^n \xi_i(x_i, x_n)\right].$$
So,
$$xGx^* = \sum_{i,j=1}^n (x_i, x_j)\xi_i\bar{\xi}_j = \left(\sum_{i=1}^n \xi_i x_i,\; \sum_{i=1}^n \xi_i x_i\right) = \left\|\sum_{i=1}^n \xi_i x_i\right\|^2.$$
From the above equality, the result follows. ∎


Corollary A necessary and sufficient condition for the vectors $x_1, x_2, \ldots, x_n$ to be linearly dependent is that
$$\det G(x_1, x_2, \ldots, x_n) = 0.$$
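The Corollary is easy to check in a concrete inner product space. A hedged sketch (not from the book) in $\mathbb{R}^3$ with the usual dot product:

```python
# Hedged illustration (not from the book): testing linear (in)dependence
# in R^3 via the Gram determinant of the Proposition/Corollary above.

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def gram_det(vectors):
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    return det([[dot(u, v) for v in vectors] for u in vectors])

independent = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]
dependent = [(1, 2, 3), (2, 4, 6), (0, 1, 0)]   # second vector = 2 * first

print(gram_det(independent))  # 1.0: nonzero, vectors independent
print(gram_det(dependent))    # 0.0: vectors dependent
```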
Let M be the closed subspace generated by $x_1, x_2, \ldots, x_n$. Then H can be written as $M \oplus M^\perp$. If $y \in H$, then $y = z + w$, where $z \in M$ and $w \in M^\perp$, so that $y - z \in M^\perp$ [see Remark 2.10.12(i)]. The minimum distance d from y to the subspace M is $d = \|y - z\|$, where
$$z = \sum_{i=1}^n a_i x_i$$
[Theorem 2.10.10]. We wish to calculate the coefficients $a_i$, $i = 1, 2, \ldots, n$, and the minimal distance d.

Since $y - z \perp x_j$, $j = 1, 2, \ldots, n$, we obtain a system of equations
$$\left(y - \sum_{i=1}^n a_i x_i,\; x_j\right) = 0, \quad j = 1, 2, \ldots, n,$$
which, when written in full, has the form
$$\left.\begin{aligned}
a_1(x_1, x_1) + a_2(x_2, x_1) + \cdots + a_n(x_n, x_1) &= (y, x_1)\\
a_1(x_1, x_2) + a_2(x_2, x_2) + \cdots + a_n(x_n, x_2) &= (y, x_2)\\
&\;\;\vdots\\
a_1(x_1, x_n) + a_2(x_2, x_n) + \cdots + a_n(x_n, x_n) &= (y, x_n)
\end{aligned}\right\} \tag{2.88}$$
and represents a system of equations in the unknowns $a_i$, $i = 1, 2, \ldots, n$. The matrix of its coefficients is precisely the transpose of the Gram matrix $G(x_1, x_2, \ldots, x_n)$. Since the vectors $x_1, x_2, \ldots, x_n$ are linearly independent, the matrix is nonsingular by the Proposition above, and the system has one and only one solution. Moreover, by Cramer's Rule, the unique solution is given by
$$a_i = \det G^{(i)}/\det G, \quad i = 1, 2, \ldots, n,$$
where $G^{(i)}$ is obtained from G by replacing its ith column by the column of constants $(y, x_i)$.

Now,
$$d^2 = \|y - z\|^2 = (y - z, y - z) = (y, y - z) = \|y\|^2 - \left(y, \sum_{i=1}^n a_i x_i\right),$$
so that
$$\left(\sum_{i=1}^n a_i x_i,\; y\right) = \|y\|^2 - d^2. \tag{2.89}$$
We combine Eq. (2.89) with the system of Eq. (2.88) and write them in the form
$$\left.\begin{aligned}
a_1(x_1, x_1) + a_2(x_2, x_1) + \cdots + a_n(x_n, x_1) - (y, x_1) &= 0\\
a_1(x_1, x_2) + a_2(x_2, x_2) + \cdots + a_n(x_n, x_2) - (y, x_2) &= 0\\
&\;\;\vdots\\
a_1(x_1, x_n) + a_2(x_2, x_n) + \cdots + a_n(x_n, x_n) - (y, x_n) &= 0\\
a_1(x_1, y) + a_2(x_2, y) + \cdots + a_n(x_n, y) + d^2 - (y, y) &= 0
\end{aligned}\right\} \tag{2.90}$$
If we introduce a dummy value $a_{n+1} = -1$ as the coefficient of the elements of the last column, then (2.90) becomes a system of $n + 1$ homogeneous linear equations in the $n + 1$ variables $a_1, a_2, \ldots, a_n, a_{n+1}\,(= -1)$. The system (2.90) thus possesses a nontrivial solution, so the determinant of the system vanishes, i.e.
$$\det\begin{bmatrix}
(x_1, x_1) & (x_2, x_1) & \cdots & (x_n, x_1) & (y, x_1)\\
(x_1, x_2) & (x_2, x_2) & \cdots & (x_n, x_2) & (y, x_2)\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
(x_1, x_n) & (x_2, x_n) & \cdots & (x_n, x_n) & (y, x_n)\\
(x_1, y) & (x_2, y) & \cdots & (x_n, y) & (y, y) - d^2
\end{bmatrix} = 0.$$
This gives
$$d^2 = \frac{\det G(x_1, x_2, \ldots, x_n, y)}{\det G(x_1, x_2, \ldots, x_n)}.$$

Apart from the above observations, the following Lemmas will be needed in the proof of Müntz's Theorem.

Lemma Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be positive real numbers and let A be the matrix whose $(i, j)$th entry is $a_{ij} = \frac{1}{\lambda_i + \lambda_j}$. Then
$$\det A = 2^{-n}\prod_{j=1}^n \frac{1}{\lambda_j}\prod_{1\le j<k\le n}\left(\frac{\lambda_j - \lambda_k}{\lambda_j + \lambda_k}\right)^2.$$

Proof If A is a $1 \times 1$ matrix, then $\det A = \frac{1}{2\lambda_1} = 2^{-1}\prod_{j=1}^1 \frac{1}{\lambda_j}$. Thus, the assertion is true for $n = 1$. Assume that the assertion is true for $m$; i.e. if A is an $m \times m$ matrix, then
$$\det A = 2^{-m}\prod_{j=1}^m \frac{1}{\lambda_j}\prod_{1\le j<k\le m}\left(\frac{\lambda_j - \lambda_k}{\lambda_j + \lambda_k}\right)^2.$$
Consider the $(m + 1) \times (m + 1)$ matrix whose $(i, j)$th entry is $a_{ij} = \frac{1}{\lambda_i + \lambda_j}$. Its determinant, when written in full, takes the form
$$\begin{vmatrix}
\frac{1}{\lambda_1 + \lambda_1} & \frac{1}{\lambda_1 + \lambda_2} & \cdots & \frac{1}{\lambda_1 + \lambda_m} & \frac{1}{\lambda_1 + \lambda_{m+1}}\\
\frac{1}{\lambda_2 + \lambda_1} & \frac{1}{\lambda_2 + \lambda_2} & \cdots & \frac{1}{\lambda_2 + \lambda_m} & \frac{1}{\lambda_2 + \lambda_{m+1}}\\
\vdots & \vdots & & \vdots & \vdots\\
\frac{1}{\lambda_{m+1} + \lambda_1} & \frac{1}{\lambda_{m+1} + \lambda_2} & \cdots & \frac{1}{\lambda_{m+1} + \lambda_m} & \frac{1}{\lambda_{m+1} + \lambda_{m+1}}
\end{vmatrix}.$$
By subtracting the last row from each of the others, removing common factors, subtracting the last column from each of the others and again removing the common factors, we obtain
$$\frac{1}{2\lambda_{m+1}}\cdot\frac{\prod_{i=1}^m(\lambda_{m+1} - \lambda_i)^2}{\prod_{i=1}^m(\lambda_{m+1} + \lambda_i)^2}
\begin{vmatrix}
\frac{1}{\lambda_1 + \lambda_1} & \frac{1}{\lambda_1 + \lambda_2} & \cdots & \frac{1}{\lambda_1 + \lambda_m} & 1\\
\frac{1}{\lambda_2 + \lambda_1} & \frac{1}{\lambda_2 + \lambda_2} & \cdots & \frac{1}{\lambda_2 + \lambda_m} & 1\\
\vdots & \vdots & & \vdots & \vdots\\
\frac{1}{\lambda_m + \lambda_1} & \frac{1}{\lambda_m + \lambda_2} & \cdots & \frac{1}{\lambda_m + \lambda_m} & 1\\
0 & 0 & \cdots & 0 & 1
\end{vmatrix}.$$
On expanding the determinant by the last row and applying the induction hypothesis to the remaining $m \times m$ determinant, we have
$$\det A = 2^{-(m+1)}\prod_{j=1}^{m+1} \frac{1}{\lambda_j}\prod_{1\le j<k\le m+1}\left(\frac{\lambda_j - \lambda_k}{\lambda_j + \lambda_k}\right)^2.$$
By induction, the result follows. ∎
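The closed form in the Lemma can be checked exactly with rational arithmetic. A hedged sketch (not from the book), using $\lambda = (1, 2, 4)$:

```python
from fractions import Fraction

# Hedged numerical check (not from the book) of the Lemma: for the matrix
# A with entries 1/(l_i + l_j), compare a direct determinant with the
# closed form 2^{-n} * prod 1/l_j * prod_{j<k} ((l_j-l_k)/(l_j+l_k))^2.

def det(m):
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

lam = [Fraction(1), Fraction(2), Fraction(4)]
n = len(lam)

A = [[1 / (lam[i] + lam[j]) for j in range(n)] for i in range(n)]

closed = Fraction(1, 2 ** n)
for j in range(n):
    closed *= 1 / lam[j]
for j in range(n):
    for k in range(j + 1, n):
        closed *= ((lam[j] - lam[k]) / (lam[j] + lam[k])) ** 2

print(det(A) == closed)  # True: the two expressions agree exactly
```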


Lemma Let $1, t^{n_1}, t^{n_2}, \ldots$, $1 \le n_1 < n_2 < \cdots$, be a set of functions defined on [0, 1]. The sequence $\{t^{n_k}\}_{k\ge1}$ is total (finite linear combinations are dense) in C[0, 1] if, and only if, it is complete in $L^2[0, 1]$.

Proof Let $x \in C[0, 1]$. The inequality
$$\left[\int_0^1 \left|x(t) - \sum_{i=1}^k a_i t^{n_i}\right|^2 dt\right]^{1/2} \le \max_{0\le t\le1}\left|x(t) - \sum_{i=1}^k a_i t^{n_i}\right| \tag{2.91}$$
shows that the sequence is complete in $L^2[0, 1]$ if it is total in C[0, 1].

Conversely, suppose that the sequence $\{t^{n_k}\}_{k\ge1}$ is complete in $L^2[0, 1]$. In order to show that the finite linear combinations constitute a dense subset of C[0, 1], it is enough to show that the inequality (2.91) in the reverse direction holds for the functions $t^m$, $m = 1, 2, \ldots$. Now
$$\begin{aligned}
\left|t^m - \sum_{i=1}^k a_i t^{n_i}\right| &= \left|m\int_0^t\left(s^{m-1} - \sum_{i=1}^k b_i s^{n_i-1}\right)ds\right|\\
&\le m\int_0^1\left|s^{m-1} - \sum_{i=1}^k b_i s^{n_i-1}\right|ds \tag{2.92}\\
&\le m\left(\int_0^1\left|s^{m-1} - \sum_{i=1}^k b_i s^{n_i-1}\right|^2 ds\right)^{1/2},
\end{aligned}$$
using the Cauchy–Schwarz Inequality, where the coefficients are related by $a_i = m b_i/n_i$. The above inequality proves the assertion. ∎
Remarks
(i) The function 1 must be added in the case of C[0, 1] but is redundant in $L^2[0, 1]$. Indeed, if the function 1 is missing from $\{t^{n_k}\}_{k\ge1}$, then the polynomial $\sum_{i=1}^k a_i t^{n_i}$ is itself zero at $t = 0$ and cannot, therefore, approximate a continuous function x(t) for which $x(0) \ne 0$.
(ii) Since $(x^p, x^q) = \int_0^1 t^{p+q}\,dt = \frac{1}{p+q+1}$, it follows that
$$\det G(t^{n_1}, t^{n_2}, \ldots, t^{n_k}) = \begin{vmatrix}
\frac{1}{n_1 + n_1 + 1} & \frac{1}{n_1 + n_2 + 1} & \cdots & \frac{1}{n_1 + n_k + 1}\\
\frac{1}{n_2 + n_1 + 1} & \frac{1}{n_2 + n_2 + 1} & \cdots & \frac{1}{n_2 + n_k + 1}\\
\vdots & \vdots & \ddots & \vdots\\
\frac{1}{n_k + n_1 + 1} & \frac{1}{n_k + n_2 + 1} & \cdots & \frac{1}{n_k + n_k + 1}
\end{vmatrix} = \frac{\prod_{i>j}(n_i - n_j)^2}{\prod_{i,j}(n_i + n_j + 1)}$$
and analogously,
$$\det G(t^m, t^{n_1}, t^{n_2}, \ldots, t^{n_k}) = \frac{\prod_{i>j}(n_i - n_j)^2}{\prod_{i,j}(n_i + n_j + 1)}\cdot\frac{\prod_{i=1}^k(m - n_i)^2}{\prod_{i=1}^k(m + n_i + 1)^2}\cdot\frac{1}{2m + 1}.$$
From this, it follows that
$$\frac{\det G(t^m, t^{n_1}, t^{n_2}, \ldots, t^{n_k})}{\det G(t^{n_1}, t^{n_2}, \ldots, t^{n_k})} = \frac{1}{2m + 1}\prod_{i=1}^k\left(\frac{m - n_i}{m + n_i + 1}\right)^2.$$
(iii) For $a_i > 0$ with $a_i \to 0$, the series $\sum_{i=1}^\infty \ln(1 + a_i)$ and the series $\sum_{i=1}^\infty a_i$ converge or diverge simultaneously. This is because
$$\lim_{x\to0}\frac{\ln(1 + x)}{x} = \lim_{x\to0}\frac{1}{1 + x} = 1,$$
and so, for any $\varepsilon > 0$ and all sufficiently large $i$,
$$(1 - \varepsilon)a_i < \ln(1 + a_i) < (1 + \varepsilon)a_i.$$

(Müntz's Theorem) A necessary and sufficient condition for the set
$$t^{n_1}, t^{n_2}, \ldots, \quad 1 \le n_1 < n_2 < \cdots,$$
to be complete in $L^2[0, 1]$ is that
$$\sum_{i=1}^\infty \frac{1}{n_i} = \infty.$$

Proof If one of the exponents $n_i$ coincides with $m$, then for $k \ge i$,
$$\det G(t^m, t^{n_1}, t^{n_2}, \ldots, t^{n_k}) = 0$$
and hence, the minimal distance is zero. Thus, completeness holds if, and only if, for each $m \ge 1$ with $m \ne n_i$, $i = 1, 2, \ldots$, the minimal distance satisfies
$$d_k^2 = \frac{\det G(t^m, t^{n_1}, t^{n_2}, \ldots, t^{n_k})}{\det G(t^{n_1}, t^{n_2}, \ldots, t^{n_k})} \to 0 \quad\text{as } k \to \infty. \tag{2.93}$$
Now,
$$\frac{\det G(t^m, t^{n_1}, t^{n_2}, \ldots, t^{n_k})}{\det G(t^{n_1}, t^{n_2}, \ldots, t^{n_k})} = \frac{1}{2m + 1}\prod_{i=1}^k\left(\frac{m - n_i}{m + n_i + 1}\right)^2. \tag{2.94}$$
In view of (2.94), the condition (2.93) becomes
$$\lim_{k\to\infty}\sum_{i=1}^k\left[\ln\left|1 - \frac{m}{n_i}\right| - \ln\left(1 + \frac{m + 1}{n_i}\right)\right] = -\infty. \tag{2.95}$$
If the series $\sum_{i=1}^\infty \frac{1}{n_i}$ diverges, then by (iii) of the Remarks above,
$$\sum_{i=1}^\infty \ln\left|1 - \frac{m}{n_i}\right| = -\infty, \qquad \sum_{i=1}^\infty \ln\left(1 + \frac{m + 1}{n_i}\right) = +\infty, \tag{2.96}$$
and therefore (2.95) is satisfied, and hence so is (2.93). If, however, the series $\sum_{i=1}^\infty \frac{1}{n_i}$ converges, then the series in (2.96) also converge, so that (2.95) is not satisfied and hence (2.93) does not hold. ∎
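The dichotomy in Müntz's Theorem is visible numerically through formula (2.94). A hedged sketch (not from the book; the two exponent sequences are chosen for illustration):

```python
# Hedged numerical illustration (not from the book) of condition (2.93):
# the squared distance from t^m to span{t^{n_1},...,t^{n_k}} is
# (1/(2m+1)) * prod ((m - n_i)/(m + n_i + 1))^2.  For n_i = 2i the sum of
# reciprocals diverges and the product tends to 0; for n_i = 2^i the sum
# of reciprocals converges and the product stays bounded away from 0.

def dist_sq(m, exponents):
    p = 1.0 / (2 * m + 1)
    for n in exponents:
        p *= ((m - n) / (m + n + 1)) ** 2
    return p

m = 1
dense = dist_sq(m, [2 * i for i in range(1, 2001)])   # sum 1/n_i diverges
sparse = dist_sq(m, [2 ** i for i in range(1, 40)])   # sum 1/n_i converges

print(dense, sparse)
```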
Radon–Nikodým Theorem

Definition Let $(X, \Sigma)$, where $X$ is a nonempty set and $\Sigma$ is a $\sigma$-algebra of subsets of $X$, be a measurable space, and let $\nu$, $\mu$ be finite nonnegative measures on $(X, \Sigma)$. The measure $\nu$ is said to be absolutely continuous with respect to $\mu$, in symbols $\nu \ll \mu$, if $\nu(E) = 0$ for every $E \in \Sigma$ for which $\mu(E) = 0$.

For $h \in L^1(X, \Sigma, \mu)$, the integral
$$\nu(E) = \int_E h\,d\mu, \quad E \in \Sigma,$$
defines a measure on $\Sigma$ which is clearly absolutely continuous with respect to $\mu$. The point of the Radon–Nikodým Theorem is the converse: every $\nu \ll \mu$ is obtained in this way. von Neumann showed how to derive this from the Riesz Representation Theorem for linear functionals on a Hilbert space.

(Radon–Nikodým Theorem) Let $\nu$ and $\mu$ be finite nonnegative measures on $(X, \Sigma)$. If $\nu \ll \mu$, then there exists a unique nonnegative measurable function $h$ such that
$$\nu(E) = \int_E h\,d\mu, \quad E \in \Sigma.$$
In particular, $h \in L^1(X, \Sigma, \mu)$.


Proof For any $E \in \Sigma$, put $\varphi(E) = \nu(E) + \mu(E)$. Since $\nu$ and $\mu$ are finite nonnegative measures, so is $\varphi$. Moreover,
$$\int_X x\,d\varphi = \int_X x\,d\nu + \int_X x\,d\mu \tag{2.97}$$
holds for $x = \chi_E$, $E \in \Sigma$. Hence, (2.97) holds for simple functions and consequently for any nonnegative measurable function $x$.

Let $H$ be the real Hilbert space $L^2(X, \Sigma, \varphi)$ with the norm $\|x\|^2 = \int_X |x|^2\,d\varphi$. For $x \in H$, the Cauchy–Schwarz Inequality gives
$$\left|\int_X x\,d\nu\right| \le \int_X |x|\,d\nu \le \int_X |x|\,d\varphi \le \left(\int_X |x|^2\,d\varphi\right)^{1/2}(\varphi(X))^{1/2} < \infty,$$
since $\varphi(X) < \infty$. Thus, the mapping
$$L : x \to \int_X x\,d\nu$$
is seen to be defined and finite on $H$. It is clear that $L(ax + by) = aL(x) + bL(y)$ for all scalars $a, b$ and $x, y \in L^2(X, \Sigma, \varphi) = H$. Thus, $L$ is a bounded linear functional on $H$ and so, by Theorem 2.10.25, there is a function $y \in H$ such that
$$\int_X x\,d\nu = \int_X xy\,d\varphi = \int_X xy\,d\nu + \int_X xy\,d\mu,$$
where we have used (2.97) in the last equality. It is easy to discern that $y$ is nonnegative a.e. with respect to $\varphi$ and hence with respect to $\mu$ and $\nu$ as well. This may be written as
$$\int_X x(1 - y)\,d\nu = \int_X xy\,d\mu. \tag{2.98}$$

Let $E = \{s \in X : y(s) \ge 1\}$. Since $\chi_E \in L^2(X, \Sigma, \varphi)$, we apply (2.98) to $x = \chi_E$ to obtain
$$0 \le \mu(E) = \int_X \chi_E\,d\mu \le \int_X \chi_E\,y\,d\mu = \int_X \chi_E(1 - y)\,d\nu \le 0.$$
Thus, we have $\mu(E) = 0$ and, since $\nu \ll \mu$, $\nu(E) = 0$.

Let $z = y\chi_{E^c}$. Then $z(s) \in [0, 1)$ and $z = y$ a.e. with respect to both $\nu$ and $\mu$. The equality (2.98) then becomes
$$\int_X x(1 - z)\,d\nu = \int_X xz\,d\mu. \tag{2.99}$$
Consider any bounded, nonnegative, measurable function $x$. Let $z$ be as above. Since both $x$ and $z$ are bounded and $\varphi$ is a finite measure, the function $(1 + z + z^2 + \cdots + z^{n-1})x$ is in $L^2(X, \Sigma, \varphi)$ for every positive integer $n$, and hence by (2.99)
$$\int_X (1 + z + z^2 + \cdots + z^{n-1})x(1 - z)\,d\nu = \int_X (1 + z + z^2 + \cdots + z^{n-1})xz\,d\mu$$
holds. In view of the fact that $z(s) \ne 1$ for any $s$, the above equality can be written as
$$\int_X (1 - z^n)x\,d\nu = \int_X \frac{z(1 - z^n)}{1 - z}\,x\,d\mu.$$
Since $0 \le z(s) < 1$ for all $s \in X$, the sequences $(1 - z^n)x$ and $\frac{z(1 - z^n)}{1 - z}x$ increase to $x$ and $\frac{z}{1 - z}x$, respectively, as $n \to \infty$. By the Monotone Convergence Theorem 1.3.6, we obtain
$$\int_X x\,d\nu = \int_X \frac{z}{1 - z}\,x\,d\mu.$$
Now define $h = \frac{z}{1 - z}$; then we have
$$\int_X x\,d\nu = \int_X hx\,d\mu.$$
In particular, for $E \in \Sigma$ and $x = \chi_E$, we obtain
$$\nu(E) = \int_E h\,d\mu.$$
The uniqueness is obvious. ∎


Remarks
(i) The construction of $h$ shows that $h \ge 0$.
(ii) The Radon–Nikodým Theorem remains valid if $\nu$ and $\mu$ are $\sigma$-finite measures. For details, the reader may consult [26].
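On a finite measurable space the theorem reduces to elementary arithmetic, which makes the statement easy to test. A hedged sketch (not from the book; the measures below are invented for illustration):

```python
from fractions import Fraction

# Hedged finite illustration (not from the book): on a finite set every
# measure is determined by point masses, nu << mu means nu puts no mass
# where mu puts none, and the density h = d(nu)/d(mu) is the pointwise
# ratio nu({s})/mu({s}) (set to 0 where mu({s}) = 0).

X = ["a", "b", "c", "d"]
mu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4), "d": Fraction(0)}
nu = {"a": Fraction(1, 3), "b": Fraction(1, 3), "c": Fraction(1, 3), "d": Fraction(0)}

h = {s: (nu[s] / mu[s] if mu[s] != 0 else Fraction(0)) for s in X}

def nu_of(E):
    return sum(nu[s] for s in E)

def integral_h(E):
    """Integral over E of h with respect to mu."""
    return sum(h[s] * mu[s] for s in E)

# nu(E) equals the integral of h over E, for every subset E
for E in (["a"], ["b", "c"], X, []):
    print(nu_of(E) == integral_h(E))  # True each time
```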

Bergman Kernel and Conformal Mappings


Let $\Omega$ be a bounded domain in the $z = x + iy$ plane, whose boundary consists of a finite number of smooth simple closed curves. The class of all functions holomorphic in $\Omega$ for which the integral $\iint_\Omega |f(z)|^2\,dx\,dy < \infty$ is denoted by $A(\Omega)$. The integral is understood as the limit of Riemann integrals
$$\lim_n \iint_{K_n} |f(z)|^2\,dx\,dy,$$
where $\{K_n\}_{n\ge1}$ is a nondecreasing sequence of compact subsets of $\Omega$ whose union is $\Omega$. It has been proved [see 2.6.2, 2.6.3, 2.6.4, 2.6.5] that $A(\Omega)$ is a Hilbert space.

Consider the linear functional $L(f) = f(\zeta)$, where $\zeta \in \Omega$ is fixed and $f \in A(\Omega)$. Observe that $|L(f)| = |f(\zeta)| \le \frac{\|f\|}{\sqrt{\pi}\,d_\zeta}$, where $d_\zeta = \operatorname{dist}(\zeta, \partial\Omega)$ and $\partial\Omega$ denotes the boundary of $\Omega$ [see Proposition 2.6.3]. It follows on using Theorem 2.10.25 that there exists a uniquely determined $u_\zeta \in A(\Omega)$ such that
$$f(\zeta) = (f, u_\zeta), \quad f \in A(\Omega). \tag{2.100}$$

The traditional notation is $u_\zeta(z) = K(z, \zeta)$, and $K$ is called the Bergman kernel of $\Omega$. For each $\zeta \in \Omega$, the function has the reproducing property
$$f(\zeta) = (f, K(\cdot, \zeta)) = \iint_\Omega f(z)\overline{K(z, \zeta)}\,dx\,dy, \quad f \in A(\Omega). \tag{2.101}$$
The following two assertions are immediate from (2.101).
(a) If one substitutes $f = K(\cdot, \zeta)$ in (2.101), one finds that
$$\|K(\cdot, \zeta)\|^2 = \iint_\Omega K(z, \zeta)\overline{K(z, \zeta)}\,dx\,dy = K(\zeta, \zeta), \quad \zeta \in \Omega.$$
(b) For $z_1, z_2 \in \Omega$, the relation $K(z_1, z_2) = \overline{K(z_2, z_1)}$ holds. To see this, we let $f = K(\cdot, z_2)$ and $\zeta = z_1$ in (2.101) and obtain
$$K(z_1, z_2) = \iint_\Omega K(z, z_2)\overline{K(z, z_1)}\,dx\,dy = \overline{\iint_\Omega K(z, z_1)\overline{K(z, z_2)}\,dx\,dy} = \overline{K(z_2, z_1)}.$$

The relation between the kernel function and a certain minimum problem in $A(\Omega)$ is also important. Suppose $\zeta \in \Omega$ is fixed, and write
$$M = \{f \in A(\Omega) : f(\zeta) = 1\}.$$
There is exactly one solution $f_0 \in M$ such that $\min_{f\in M}\|f\| = \|f_0\|$. Moreover, the function $f_0$ is connected with the Bergman kernel function as follows:
$$f_0(z) = \frac{K(z, \zeta)}{K(\zeta, \zeta)} \quad\text{and}\quad K(z, \zeta) = \frac{f_0(z)}{\|f_0\|^2}.$$

Proof Since $A(\Omega)$ is a Hilbert space and $M \subseteq A(\Omega)$ is a closed convex subset, the first assertion follows on using Corollary 2.10.7.

For each $f \in A(\Omega)$, we have $f(\zeta) = (f, K(\cdot, \zeta))$. For $f \in M$, on using the Cauchy–Schwarz Inequality, we have
$$1 = (f, K(\cdot, \zeta)) \le \|f\|\,\|K(\cdot, \zeta)\| = \|f\|\sqrt{K(\zeta, \zeta)}. \tag{2.102}$$
Equality in the above inequality occurs provided
$$f = f_0 = CK(\cdot, \zeta), \quad\text{where } C \text{ is a constant.} \tag{2.103}$$
Since $1 = f_0(\zeta) = CK(\zeta, \zeta)$ (therefore $C = \frac{1}{K(\zeta, \zeta)}$), it follows that
$$f_0(z) = \frac{K(z, \zeta)}{K(\zeta, \zeta)}.$$
This implies
$$K(z, \zeta) = f_0(z)K(\zeta, \zeta).$$
Also,
$$K(z, \zeta) = \frac{f_0(z)}{\|f_0\|^2},$$
since $\|f_0\|^2 = \frac{1}{K(\zeta, \zeta)}$, using (2.102) and (2.103). ∎
Recall that the Riemann Mapping Theorem asserts: if $\Omega$ is a simply connected domain having more than one boundary point, then there exists a holomorphic function in $\Omega$ which maps $\Omega$ bijectively onto $D = \{z : |z| < 1\}$. If $\zeta$ is fixed, then the mapping function $f(z) = f(z, \zeta)$ for which $f(\zeta) = 0$ and $f'(\zeta) > 0$ is unique.

The mapping function $f$ and the Bergman kernel $K$ of $\Omega$ are related as follows:
$$f'(z) = \sqrt{\frac{\pi}{K(\zeta, \zeta)}}\,K(z, \zeta) \quad\text{and}\quad K(z, \zeta) = \frac{1}{\pi}f'(z)\overline{f'(\zeta)}, \quad z \in \Omega.$$

Proof Let $\Omega_r$ denote the subdomain of $\Omega$ which is mapped by $f$ onto the disc $\{\omega : |\omega| < r\}$, where $r < 1$ and $\omega = f(z)$. Denote the boundary of $\Omega_r$ by $\gamma_r$. If $g \in A(\Omega)$, then $\frac{g(z)}{f(z)}$ has a simple pole at $z = \zeta$, and the residue at this pole is
$$\lim_{z\to\zeta}\frac{(z - \zeta)g(z)}{f(z)} = \frac{g(\zeta)}{f'(\zeta)}.$$
By the Residue Theorem,
$$\frac{g(\zeta)}{f'(\zeta)} = \frac{1}{2\pi i}\int_{\gamma_r}\frac{g(z)}{f(z)}\,dz = \frac{1}{2\pi i r^2}\int_{\gamma_r}\overline{f(z)}\,g(z)\,dz,$$
since $f(z)\overline{f(z)} = |f(z)|^2 = r^2$ for $z \in \gamma_r$. Using Green's formula, we obtain
$$\frac{g(\zeta)}{f'(\zeta)} = \frac{1}{\pi r^2}\iint_{\Omega_r}\overline{f'(z)}\,g(z)\,dx\,dy.$$
Letting $r \to 1$, we get
$$g(\zeta) = \iint_\Omega \frac{f'(\zeta)\overline{f'(z)}}{\pi}\,g(z)\,dx\,dy.$$
In other words, the function
$$K(z, \zeta) = \frac{f'(z)\overline{f'(\zeta)}}{\pi} \tag{2.104}$$
has the reproducing property for $A(\Omega)$ and is therefore the Bergman kernel. For $z = \zeta$, it follows that
$$K(\zeta, \zeta) = \frac{f'(\zeta)^2}{\pi},$$
which implies, on using (2.104),
$$f'(z) = \sqrt{\frac{\pi}{K(\zeta, \zeta)}}\,K(z, \zeta).$$
This completes the proof. ∎


Remarks
(i) In only a few cases is it possible to obtain a representation for the kernel function in closed form. It is easy to find a series representation with respect to some complete orthonormal system $\{\varphi_j\}$, because by (2.101), the Fourier coefficients are
$$u_j = (K(\cdot, \zeta), \varphi_j) = \overline{\varphi_j(\zeta)}, \quad j = 1, 2, \ldots,$$
and the Bergman kernel has the series representation [see Theorem 2.9.15(iii)]
$$K(z, \zeta) = \sum_{j=1}^\infty \overline{\varphi_j(\zeta)}\,\varphi_j(z), \quad z, \zeta \in \Omega.$$
(ii) Consider the special case where $\Omega = D = \{z : |z| < 1\}$. According to (vi) of Examples 2.9.16, the set $\varphi_n(z) = \sqrt{\frac{n}{\pi}}\,z^{n-1}$, $n = 1, 2, \ldots$, is an orthonormal system in $A(D)$. Thus,
$$K(z, \zeta) = \sum_{n=1}^\infty \frac{n}{\pi}\,\bar{\zeta}^{n-1}z^{n-1} = \frac{1}{\pi(1 - z\bar{\zeta})^2}, \quad z, \zeta \in D,$$
is the kernel function of $D$. The series converges uniformly in $|\zeta| \le r$, $r < 1$. The reproducing property becomes
$$f(\zeta) = \frac{1}{\pi}\iint_D \frac{f(z)}{(1 - \bar{z}\zeta)^2}\,dx\,dy.$$
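The identity between the orthonormal series and the closed form of the disc kernel is easy to confirm numerically. A hedged sketch (not from the book; the sample points are arbitrary):

```python
import math

# Hedged numerical check (not from the book) of the disc kernel:
# partial sums of sum_n n/pi * conj(zeta)^(n-1) * z^(n-1) should approach
# 1/(pi * (1 - z*conj(zeta))^2) for |z|, |zeta| < 1.

def kernel_series(z, zeta, terms):
    return sum(n / math.pi * (zeta.conjugate() ** (n - 1)) * (z ** (n - 1))
               for n in range(1, terms + 1))

def kernel_closed(z, zeta):
    return 1.0 / (math.pi * (1 - z * zeta.conjugate()) ** 2)

z, zeta = 0.3 + 0.2j, -0.1 + 0.4j
approx = kernel_series(z, zeta, 60)
exact = kernel_closed(z, zeta)
print(abs(approx - exact))  # tiny: the series sums to the closed form
```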

Special Case of Browder Fixed Point Theorem


Let $C$ be a nonempty convex, closed and bounded subset of a Hilbert space $H$ and let $T$ be a map from $C$ into $C$ such that
$$\|Tx - Ty\| \le \|x - y\| \quad\text{for all } x, y \in C.$$
Then $T$ has at least one fixed point.

Solution: For each $n \in \mathbb{N}$, let
$$T_n(x) = \frac{1}{n}a + \frac{n - 1}{n}T(x),$$
where $a \in C$ is fixed. Then $T_n$ is a contraction and therefore has a fixed point $x_n \in C$. Indeed, for $x, y \in C$,
$$\|T_n(x) - T_n(y)\| = \frac{n - 1}{n}\|Tx - Ty\| \le \frac{n - 1}{n}\|x - y\|.$$
Since $C$ is a bounded subset of $H$ and $\{x_n\}_{n\ge1}$ is in $C$, it follows that there exists a subsequence $\{x_{n_j}\}_{j\ge1}$ such that $x_{n_j} \xrightarrow{w} x$, say [Theorem 3.1.5]. By the Banach–Saks Theorem [Problem 2.12.P2], $\{x_{n_j}\}_{j\ge1}$ has a subsequence such that a sequence of certain convex combinations of its terms converges strongly to $x$. Consequently, $x \in C$, as $C$ is convex and (strongly) closed. We shall prove that $x$ is a fixed point of $T$.

For any point $y$ in $H$, we note that
$$\|x_{n_j} - y\|^2 = \|x_{n_j} - x\|^2 + \|x - y\|^2 + 2\Re(x_{n_j} - x, x - y), \tag{2.105}$$
where $2\Re(x_{n_j} - x, x - y) \to 0$ as $j \to \infty$, since $x_{n_j} - x \to 0$ weakly in $H$.

Observe that
$$T(x_{n_j}) - x_{n_j} = T(x_{n_j}) - T_{n_j}(x_{n_j}) = T(x_{n_j}) - \frac{1}{n_j}a - \frac{n_j - 1}{n_j}T(x_{n_j}) = \frac{1}{n_j}\left(T(x_{n_j}) - a\right) \to 0 \quad\text{as } j \to \infty. \tag{2.106}$$
Setting $y = T(x)$ in (2.105), we have
$$\lim_{j\to\infty}\left\{\|x_{n_j} - T(x)\|^2 - \|x_{n_j} - x\|^2\right\} = \|x - T(x)\|^2. \tag{2.107}$$
On the other hand, using the hypothesis,
$$\|T(x_{n_j}) - T(x)\| \le \|x_{n_j} - x\|.$$
Hence
$$\|x_{n_j} - T(x)\| \le \|x_{n_j} - T(x_{n_j})\| + \|T(x_{n_j}) - T(x)\| \le \|x_{n_j} - T(x_{n_j})\| + \|x_{n_j} - x\|.$$
Thus, on using (2.106), we obtain
$$\limsup_{j\to\infty}\left(\|x_{n_j} - T(x)\| - \|x_{n_j} - x\|\right) \le 0$$
and therefore
$$\limsup_{j\to\infty}\left\{\|x_{n_j} - T(x)\|^2 - \|x_{n_j} - x\|^2\right\} \le 0,$$
which implies, on using (2.107),
$$\|x - T(x)\| = 0.$$
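The approximation scheme in the argument, replacing the nonexpansive map $T$ by the contractions $T_n$, can be watched at work in one dimension. A hedged sketch (not from the book; $T$ and $C$ are invented for illustration):

```python
# Hedged 1-D illustration (not from the book) of the scheme in the proof:
# on C = [0, 1] the map T(x) = 1 - x is nonexpansive with fixed point 1/2,
# and each T_n(x) = a/n + (n-1)/n * T(x) is a strict contraction whose
# fixed point x_n can be found by simple iteration; x_n -> 1/2 as n grows.

def fixed_point_of_Tn(n, a=0.0):
    T = lambda t: 1.0 - t
    x = 0.0
    for _ in range(40 * n):           # enough iterations for contraction (n-1)/n
        x = a / n + (n - 1) / n * T(x)
    return x

approximants = [fixed_point_of_Tn(n) for n in (2, 10, 100, 1000)]
print(approximants)  # tends to the fixed point 0.5 of T
```

Here $x_n = (n-1)/(2n-1)$ exactly, so the approximants converge to $1/2$, the fixed point of $T$, exactly as the weak-compactness argument above predicts in general.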
Chapter 3
Linear Operators

3.1 Basic Definitions

Let $X$ and $Y$ be finite-dimensional vector spaces over the same field $F$. Recall that a mapping $T : X \to Y$ is called linear if $T(a_1x_1 + a_2x_2) = a_1T(x_1) + a_2T(x_2)$ for all $x_1, x_2 \in X$ and $a_1, a_2 \in F$. $T$ is also called a linear operator or linear transformation.

If $\dim(X) = n$ and $\dim(Y) = m$, we choose a basis $\{e_1, e_2, \ldots, e_n\}$ for $X$ and a basis $\{f_1, f_2, \ldots, f_m\}$ for $Y$. An $m \times n$ matrix $A$ of elements of $F$ corresponds to a linear transformation $T : X \to Y$ in the following way: for each integer $k$, $1 \le k \le n$, there are unique elements $s_{1,k}, s_{2,k}, \ldots, s_{m,k}$ of $F$ such that
$$Te_k = \sum_{j=1}^m s_{j,k}f_j. \tag{3.1}$$
Each point $x \in X$ has a unique representation in the form $x = \sum_{k=1}^n \xi_k e_k$, where $\xi_1, \xi_2, \ldots, \xi_n$ are in $F$. Hence,
$$Tx = \sum_{k=1}^n \xi_k Te_k = \sum_{k=1}^n \xi_k\left(\sum_{j=1}^m s_{j,k}f_j\right) = \sum_{j=1}^m\left(\sum_{k=1}^n s_{j,k}\xi_k\right)f_j. \tag{3.2}$$
If $\eta_1, \eta_2, \ldots, \eta_m$ are the components of the vector $Tx$ with respect to the basis $\{f_1, f_2, \ldots, f_m\}$, then $\eta_j = \sum_{k=1}^n s_{j,k}\xi_k$. In this sense, the matrix $A = [s_{j,k}]$ corresponds to the linear transformation $T$. It is also said that the matrix $A$ represents the linear transformation $T$ with respect to the aforementioned bases of $X$ and $Y$.
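The correspondence (3.1)–(3.2) is exactly what one implements when representing a linear map by a matrix. A hedged sketch (not from the book; the particular map is invented for illustration):

```python
# Hedged illustration (not from the book): building the matrix [s_jk] of a
# linear map T : R^2 -> R^3 from its action on a basis (formula (3.1)),
# then checking that matrix-vector multiplication reproduces Tx (formula (3.2)).

def T(x):
    """A sample linear map T(x1, x2) = (x1 + x2, 2*x1, 3*x2)."""
    x1, x2 = x
    return (x1 + x2, 2 * x1, 3 * x2)

basis = [(1, 0), (0, 1)]                      # standard basis e_k of R^2
columns = [T(e) for e in basis]               # column k holds the s_{j,k}
A = [[columns[k][j] for k in range(2)] for j in range(3)]

def apply_matrix(A, xi):
    return tuple(sum(A[j][k] * xi[k] for k in range(len(xi)))
                 for j in range(len(A)))

x = (5, -2)
print(T(x), apply_matrix(A, x))  # both give (3, 10, -6)
```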

© Springer Nature Singapore Pte Ltd. 2017 153


H.L. Vasudeva, Elements of Hilbert Spaces and Operator Theory,
DOI 10.1007/978-981-10-3020-8_3

Conversely, let $A = [s_{j,k}]$ be an $m \times n$ matrix of elements of $F$. We can define a mapping $T : X \to Y$ in the following manner. Consider an $x \in X$. It has a unique representation in the form $x = \sum_{k=1}^n \xi_k e_k$, where $\xi_1, \xi_2, \ldots, \xi_n$ are in $F$. Set
$$\eta_j = \sum_{k=1}^n s_{j,k}\xi_k, \quad j = 1, 2, \ldots, m, \tag{3.3}$$
and
$$Tx = \sum_{j=1}^m \eta_j f_j = \sum_{j=1}^m\left(\sum_{k=1}^n s_{j,k}\xi_k\right)f_j. \tag{3.4}$$
$T$ is obviously linear. Our considerations show that a linear operator $T$ determines a unique $m \times n$ matrix representing $T$ with respect to a given basis for $X$ and a given basis for $Y$, where the vectors of each basis are arranged in a fixed order, and conversely.

Questions about the system (3.3) can be formulated as questions about $T$. For example, for which $\eta_1, \eta_2, \ldots, \eta_m$ does the system (3.3) have a solution $\xi_1, \xi_2, \ldots, \xi_n$? This amounts to asking for a description of the range of $T$.

The most complete and satisfying results about (3.3) are obtained when $m = n$. Indeed, if $m = n$, the system (3.3) has a unique solution if, and only if, the matrix $[s_{j,k}]$ is nonsingular, equivalently, the linear operator $T$ determined by the matrix $[s_{j,k}]$ is one-to-one (or onto). In particular, if $X = Y$ and $e_j = f_j$, $j = 1, \ldots, n$, the operator $T$ maps $X$ to itself. If $p$ is a polynomial, then $p(T)$ makes sense. The study of $p(T)$ can provide insight about $T$. For example, $\lambda$ is an eigenvalue of $T$ if, and only if, it is a root of the characteristic polynomial $\det(\lambda I - T)$.
Recall that if H is a Hilbert space and M is a closed subspace of H, then the mappings P_M from H onto M and P_{M⊥} from H onto M⊥ are linear [see Theorem 2.10.15].
We give below a formal definition of a linear operator.
Definition 3.1.1 Let X and Y be linear spaces (vector spaces) over the same scalar
field F, say. A mapping T defined over a linear subspace D of X, written D(T), and
taking values in Y is said to be a linear operator if

T(α₁x₁ + α₂x₂) = α₁T(x₁) + α₂T(x₂) for scalars α₁, α₂ and x₁, x₂ in D.

The definition implies, in particular, that

T(0) = 0, T(−x) = −T(x).

We denote

ran(T) = {y ∈ Y : y = Tx for some x in D(T)}



and

ker(T) = {x ∈ D(T) : Tx = 0}.

We call D(T) the domain, ran(T) the range and ker(T) the kernel, respectively, of the operator T. A linear operator is also called a linear transformation with domain D(T) ⊆ X into Y. If the range ran(T) is contained in the scalar field F, then T is called a linear functional [see Definition 2.10.18] on D(T). If a linear operator gives a one-to-one map (x₁ ≠ x₂ ⇒ Tx₁ ≠ Tx₂, or equivalently, Tx₁ = Tx₂ ⇒ x₁ = x₂) of D(T) onto ran(T), then the inverse map T⁻¹ is a linear operator from ran(T) onto D(T):

T⁻¹Tx = x for x ∈ D(T) and
TT⁻¹y = y for y ∈ ran(T).

T⁻¹ is called the inverse operator or, in short, the inverse of T.


The following proposition is an easy consequence of the linearity of T.

Proposition 3.1.2 A linear operator T admits an inverse T⁻¹ if, and only if, Tx = 0 implies x = 0.

Proof Suppose Tx = 0 implies x = 0. Let Tx₁ = Tx₂. Since T is linear,

T(x₁ − x₂) = Tx₁ − Tx₂ = 0,

so that x₁ = x₂ by hypothesis.
Conversely, if T⁻¹ exists, then Tx₁ = Tx₂ implies x₁ = x₂. Let Tx = 0. Since T is linear, T0 = 0 = Tx, so that x = 0 by hypothesis. ∎
Example 3.1.3 Let X be the vector space of all real-valued functions which are defined on ℝ and have derivatives of all orders everywhere on ℝ. Define T:X→X by y(t) = Tx(t) = x′(t). Then, ran(T) = X. Indeed, for y ∈ X, we have y = Tx, where x(t) = ∫₀ᵗ y(s) ds. Since Tx = 0 for every constant function, T⁻¹ does not exist.
Definition 3.1.4 Let T₁ and T₂ be linear operators with domains D(T₁) and D(T₂) both contained in a linear space X and ranges ran(T₁) and ran(T₂) both contained in a linear space Y. Then, T₁ = T₂ if, and only if, D(T₁) = D(T₂) and T₁x = T₂x for all x ∈ D(T₁) = D(T₂). If D(T₁) ⊆ D(T₂) and T₁x = T₂x for all x ∈ D(T₁), T₂ is called an extension of T₁ and T₁ a restriction of T₂. We shall write T₁ ⊆ T₂.
We shall abbreviate “D(T)” to simply “D” when there is only one operator under
consideration.
The following is a special case of bijective mappings between sets.
Proposition 3.1.5 Let T:X→Y and S:Y→Z be bijective linear operators, where X, Y, Z are linear spaces over the same scalar field F. Then, the inverse (ST)⁻¹:Z→X of the product (composition) of S and T exists and satisfies

(ST)⁻¹ = T⁻¹S⁻¹.

Remark 3.1.6 The identity map, a composition of linear maps and the inverse of a
linear map (when it exists) are all linear.

3.2 Bounded and Continuous Linear Operators

Every linear functional is a linear transformation between the linear space and the
one-dimensional scalar field underlying the linear space. The study of continuous
linear functionals on inner product spaces and more specifically on Hilbert spaces
has yielded many valuable results [Sect. 2.10]. It seems natural to attempt gener-
alising the considerations to linear transformations (operators) from Hilbert space
into itself. The interplay between algebraic notions and metric structure proves
interesting and useful in applications.
Definition 3.2.1 Let X and Y be normed linear spaces and T:D→Y a linear operator, where D ⊆ X. T is said to be continuous at x₀ ∈ D if lim_{x→x₀} Tx = Tx₀. T is continuous in D if it is continuous at each point of D.
A linear operator is bounded if

sup{||Tx|| : x ∈ D, ||x|| ≤ 1} < ∞.

The supremum on the left is called the norm of the operator T in D, provided it is finite, and is denoted by the symbol ||T|| or sometimes by ||T||_D. If M ≥ ||T||_D, then M is called a bound of T. The infimum of all bounds M is the norm ||T||_D.
Remarks 3.2.2

(i) If x ∈ D and x ≠ 0, then by the definition of the norm of T,

||T(x/||x||)|| ≤ ||T||_D.

Hence, for any x ∈ D, x ≠ 0, we have ||Tx|| ≤ ||T||_D ||x||. However, it is easily seen that this inequality holds also when x = 0 (the two sides are both zero in this event), and therefore,

||Tx|| ≤ ||T||_D ||x|| for all x ∈ D.  (3.5)

(ii) It follows from the relation (3.5) and linearity of T that T is uniformly continuous. Indeed, by (3.5),

||Tx − Ty|| = ||T(x − y)|| ≤ ||T||_D ||x − y|| for x, y ∈ D.

(iii) From (3.5), it also follows that, if x ∈ D and ||x|| ≤ 1, then

||Tx|| ≤ ||T||  (3.6)

and the above inequality is strict if ||x|| < 1 and ||T|| ≠ 0.


(iv) Now assume that D 6¼ {0}. Then, it follows from (3.5) and (3.6) and the
equality ||T(ax)|| = |a|||Tx|| that ||T|| can be defined as

jjT jj ¼ sup jjTxjj ð3:7Þ


x2D
jjxjj¼1

or equivalently by

kTxk
kT k ¼ sup : ð3:8Þ
x2D k xk
jjxjj6¼0

Thus, if T is a bounded linear operator on D  X and D 6¼ {0}, then

kTxk
kT k ¼ sup kTxk ¼ sup jjTxjj ¼ sup : ð3:9Þ
x2D x2D x2D k xk
kxk¼1 k xk  1 k xk6¼0
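For an operator given by a matrix on a finite-dimensional space, the supremum in (3.9) is the largest singular value. A small Python check (the random matrix and sampled directions are illustrative choices) confirms that every unit vector gives a lower bound for ||T|| and that the supremum is nearly attained:

```python
import numpy as np

# Numerical check of (3.9) for a matrix operator on R^3 (illustrative data).
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

op_norm = np.linalg.norm(A, 2)            # largest singular value = ||A||

# Sample many unit vectors: each gives ||Ax|| <= ||A||, and the supremum
# over the unit sphere is nearly attained by the best sample.
xs = rng.standard_normal((3, 10000))
xs /= np.linalg.norm(xs, axis=0)
values = np.linalg.norm(A @ xs, axis=0)

assert values.max() <= op_norm + 1e-9     # (3.5): ||Ax|| <= ||A|| ||x||
assert values.max() > 0.8 * op_norm       # the sup in (3.9) is approached
```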

The following proposition gives equivalent conditions for the continuity of a linear operator from D ⊆ X into Y.
Proposition 3.2.3 Let X and Y be normed linear spaces over the same field of scalars and let D ⊆ X be the domain of the linear operator T from D into Y. Then, the following conditions are equivalent:
(a) T is continuous at a given x0 2 D;
(b) T is bounded; and
(c) T is continuous everywhere and the continuity is uniform.

Proof If D = {0}, there is nothing to prove.
(a) implies (b). Suppose T is continuous at x₀ ∈ D. Then for given ε > 0, there is a δ > 0 such that ||Tx − Tx₀|| < ε for all x ∈ D satisfying ||x − x₀|| < δ. We now take y ≠ 0 in D and set

x = x₀ + (δ/(2||y||)) y.
Then,

x − x₀ = (δ/(2||y||)) y.

Hence, ||x − x₀|| = δ/2 < δ, so that we have ||Tx − Tx₀|| < ε. Since T is linear, we obtain

||Tx − Tx₀|| = ||T(x − x₀)|| = ||T((δ/(2||y||)) y)|| = (δ/(2||y||)) ||Ty||

and this implies

(δ/(2||y||)) ||Ty|| < ε.

Therefore, ||Ty|| < (2ε/δ) ||y|| = M||y||, where M = 2ε/δ. Thus, T is bounded.
(b) implies (c). Suppose T is bounded and M > 0 a bound. Then for x, y ∈ D, we have ||Tx − Ty|| = ||T(x − y)|| ≤ M||x − y||. Let ε > 0 and δ = ε/M. Then, ||x − y|| < δ implies ||T(x − y)|| < Mδ = ε. Since x, y ∈ D are arbitrary, T is uniformly continuous on D and hence continuous everywhere on D.
(c) implies (a). Trivial. ∎
Remark The terms continuous linear operator and bounded linear operator will be
used interchangeably.
Many properties of linear functionals generalise easily to linear operators. The
analogue of the dual space is the space of all continuous linear operators from a
normed linear space X into a normed linear space Y (which may or may not be the
same as X) and is denoted by B(X,Y). Note that in this context D = X. We abbreviate
B(X, X) as B(X).
First of all, B(X, Y) becomes a vector space if we define the sum T₁ + T₂ of two operators T₁, T₂ in B(X, Y) in a natural way,

(T₁ + T₂)x = T₁x + T₂x,

and the product αT of T ∈ B(X, Y) and a scalar α by

(αT)x = α(Tx).

Since

||(T₁ + T₂)x|| ≤ ||T₁x|| + ||T₂x|| ≤ sup{||T₁x|| : x ∈ X and ||x|| = 1} + sup{||T₂x|| : x ∈ X and ||x|| = 1} = ||T₁|| + ||T₂||,

it follows that

||T₁ + T₂|| ≤ ||T₁|| + ||T₂|| for T₁, T₂ ∈ B(X, Y).

Similarly, it can be proved that

||αT|| = |α| ||T|| for α ∈ F and T ∈ B(X, Y).

It is immediate from Definition 3.2.1 that

||T|| = 0 implies T = O.

These imply that B(X, Y) is a normed vector space (linear space) over the scalar field F.
Theorem 3.2.4 If Y is a Banach space, then B(X, Y) is a Banach space.

Proof Let {Tₙ}ₙ≥₁ be a Cauchy sequence in B(X, Y). Then for any x ∈ X,

||Tₙx − Tₘx|| = ||(Tₙ − Tₘ)x|| ≤ ||Tₙ − Tₘ|| ||x||,

so that {Tₙx}ₙ≥₁ is a Cauchy sequence in Y. Since Y is complete, the sequence converges, say Tₙx→y. Clearly, the limit y depends on x. This defines a map T:X→Y, where y = Tx = limₙTₙx. The map T is a linear operator, since

limₙ Tₙ(α₁x₁ + α₂x₂) = limₙ (α₁Tₙ(x₁) + α₂Tₙ(x₂)) = α₁ limₙ Tₙ(x₁) + α₂ limₙ Tₙ(x₂)

for scalars α₁, α₂ and x₁, x₂ in X.
We prove that T is bounded and ||Tₙ − T||→0 as n→∞. The sequence {Tₙ}ₙ≥₁, being Cauchy, is bounded, i.e., there exists an M > 0 such that ||Tₙ|| ≤ M, n = 1, 2, …. For any x ∈ X, ||Tₙx|| ≤ ||Tₙ|| ||x|| ≤ M||x||. Consequently,

||Tx|| = ||limₙ Tₙx|| = limₙ ||Tₙx|| ≤ M||x||.

This proves that T is bounded. It remains to show that ||Tₙ − T||→0 as n→∞. Let ε > 0. There exists n₀ such that m, n ≥ n₀ implies ||Tₙ − Tₘ|| < ε. Then,

||Tₙx − Tₘx|| = ||(Tₙ − Tₘ)x|| ≤ ||Tₙ − Tₘ|| ||x|| ≤ ε||x||

for m, n ≥ n₀ and x ∈ X. Letting m→∞, we get

||Tₙx − Tx|| ≤ ε||x|| for n ≥ n₀ and x ∈ X.

This implies that ||Tₙ − T|| ≤ ε for n ≥ n₀, so that ||Tₙ − T||→0 as n→∞. ∎

Remark If Y is one-dimensional and X is a normed linear space, we obtain


Theorem 2.10.23.
Example 3.2.5

(i) (Identity operator) Let H be a Hilbert space. The identity operator I:H→H defined by Ix = x, x ∈ H, is linear and bounded with ||I|| = 1 when H ≠ {0}.
(ii) (Zero operator) The zero operator on H defined by Ox = 0, x ∈ H, is linear and ||O|| = 0.
(iii) If H is a Hilbert space of finite dimension and T is a linear mapping of H into H, then T is continuous. For, let e₁, e₂, …, eₙ be an orthonormal basis for H. If x = Σ_{k=1}^{n} ξₖeₖ is any vector in H, then

||Tx|| = ||Σ_{k=1}^{n} ξₖ(Teₖ)|| ≤ Σ_{k=1}^{n} |ξₖ| ||Teₖ|| ≤ (Σ_{k=1}^{n} |ξₖ|²)^{1/2} (Σ_{k=1}^{n} ||Teₖ||²)^{1/2},

using the Cauchy–Schwarz inequality. Thus, ||Tx|| ≤ M||x||, where

M = (Σ_{k=1}^{n} ||Teₖ||²)^{1/2}

is independent of x.
(iv) Let T be a linear operator defined on a Hilbert space H ≠ {0} by the formula

Tx = αx, x ∈ H,

where α ∈ F is fixed. Then,

||Tx|| = ||αx|| = |α| ||x||.

Consequently,

||T|| = sup{||Tx|| : x ∈ H, ||x|| = 1} = sup{|α| ||x|| : x ∈ H, ||x|| = 1} = |α|.

Thus, T is a bounded linear operator on H of norm |α|.

(v) Let M be a closed subspace of a Hilbert space H and x ∈ H. Then, x = y + z, where y ∈ M and z ∈ M⊥, and this representation is unique [see Remark 2.10.12]. Define T:H→H by the formula

Tx = y, x ∈ H.

We know that T is linear and ||Tx||² = ||y||² ≤ ||y||² + ||z||² = ||y + z||² = ||x||² [see Theorem 2.10.15]. Thus, T is a bounded linear operator on H and ||T|| ≤ 1. Indeed, ||T|| = 1 when M ≠ {0}; for x ∈ M, Tx = x, and hence, ||Tx|| = ||x||. Recall that this operator is called the projection on M and is denoted by P_M [see Definition 2.10.16].
(vi) (Multiplication operator) Let (X, 𝔐, μ) be a σ-finite measure space and H = L²(X, 𝔐, μ) be the Hilbert space of square integrable functions defined on X. For an essentially bounded measurable function y on X, define Tx(t) = y(t)x(t), x ∈ H and t ∈ X. Clearly, T is a bounded linear operator on H. Indeed,

||Tx||₂² = ∫_X |y(t)|²|x(t)|² dμ(t) ≤ ess sup_{t∈X}|y(t)|² ∫_X |x(t)|² dμ(t) = ||y||∞² ||x||₂², x ∈ H.

Thus, ||T|| ≤ ||y||∞. Indeed, ||T|| = ||y||∞, as the following argument shows: if μ(X) = 0, then H = {0} and ||T|| = 0 = ||y||∞. Suppose μ(X) > 0. If ε > 0, the σ-finiteness of the measure space implies that there is a measurable set F ⊆ X, 0 < μ(F) < ∞, such that |y(t)| ≥ ||y||∞ − ε on F. If f = (μ(F))^{−1/2} χ_F, then f ∈ L²(X, 𝔐, μ) and ||f||₂ = 1. So,

||Tf||² = ∫_X |y(t)|² (μ(F))⁻¹ χ_F(t) dμ(t) = (μ(F))⁻¹ ∫_F |y(t)|² dμ(t) ≥ (μ(F))⁻¹ (||y||∞ − ε)² μ(F) = (||y||∞ − ε)²,

which implies ||T|| ≥ ||y||∞ − ε, as ||f||₂ = 1. Since ε > 0 is arbitrary, we get ||T|| ≥ ||y||∞.
The operator T is called a multiplication operator.
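A finite-dimensional analogue of the multiplication operator is the diagonal operator (Tx)ᵢ = yᵢxᵢ on ℂⁿ, whose norm is maxᵢ|yᵢ|, the discrete counterpart of ||y||∞. The following sketch (with an illustrative choice of y) checks this numerically:

```python
import numpy as np

# Discrete analogue of the multiplication operator: (Tx)_i = y_i * x_i.
y = np.array([0.5, -2.0, 1.5, 0.25])
T = np.diag(y)

op_norm = np.linalg.norm(T, 2)
assert np.isclose(op_norm, np.max(np.abs(y)))   # ||T|| = max |y_i| = 2.0

# The bound ||Tx|| <= max|y_i| * ||x|| holds for every x, with equality
# for the basis vector supported where |y| attains its maximum.
e = np.zeros(4); e[1] = 1.0
assert np.isclose(np.linalg.norm(T @ e), np.max(np.abs(y)))
```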
(vii) Let H be a separable Hilbert space and {eᵢ}ᵢ≥₁ be an orthonormal basis in H. Define T:H→H as follows:

Teᵢ = eᵢ₊₁, i = 1, 2, ….

If x ∈ H, then x = Σ_{k=1}^{∞} λₖeₖ, λₖ ∈ F, k = 1, 2, …, where Σ_{k=1}^{∞} |λₖ|² < ∞. In particular, Σ_{k=1}^{∞} λₖeₖ₊₁ is an element of H. Define Tx = Σ_{k=1}^{∞} λₖTeₖ = Σ_{k=1}^{∞} λₖeₖ₊₁. Clearly, T is linear. Moreover,
||Tx||² = ||Σ_{k=1}^{∞} λₖeₖ₊₁||² = Σ_{k=1}^{∞} |λₖ|² ||eₖ₊₁||² = Σ_{k=1}^{∞} |λₖ|².

On the other hand,

||x||² = ||Σ_{k=1}^{∞} λₖeₖ||² = Σ_{k=1}^{∞} |λₖ|² ||eₖ||² = Σ_{k=1}^{∞} |λₖ|².

Thus, ||Tx|| = ||x||, x ∈ H, i.e., T is a bounded linear operator on H of norm 1. The operator described above is called the simple unilateral shift.
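On finitely supported sequences, the unilateral shift can be modelled directly (a toy sketch, not the book's construction): prepending a zero changes no coordinate's modulus, so the ℓ² norm is preserved.

```python
import numpy as np

def shift(x):
    # Simple unilateral shift on finitely supported sequences:
    # (x1, x2, ...) -> (0, x1, x2, ...). No coordinate is lost, so the
    # l^2 norm is preserved, as in Example 3.2.5 (vii).
    return np.concatenate(([0.0], x))

x = np.array([3.0, -4.0, 12.0])
Tx = shift(x)

assert np.isclose(np.linalg.norm(Tx), np.linalg.norm(x))   # ||Tx|| = ||x||
assert np.linalg.norm(x) == 13.0       # sqrt(9 + 16 + 144) = 13
```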
(viii) Let (X, 𝔐, μ) be a σ-finite measure space and k:X × X→ℂ be an 𝔐 × 𝔐-measurable function for which there are constants c₁ and c₂ such that

∫_X |k(s,t)| dμ(t) ≤ c₁ a.e. [μ],
∫_X |k(s,t)| dμ(s) ≤ c₂ a.e. [μ].

For x ∈ L²(μ), set

(Kx)(s) = ∫_X k(s,t)x(t) dμ(t).

We shall show that K is a bounded linear operator on L²(μ) and ||K|| ≤ (c₁c₂)^{1/2}. Indeed,

|(Kx)(s)| ≤ ∫_X |k(s,t)||x(t)| dμ(t)
= ∫_X |k(s,t)|^{1/2} (|k(s,t)|^{1/2}|x(t)|) dμ(t)
≤ [∫_X |k(s,t)| dμ(t)]^{1/2} [∫_X |k(s,t)||x(t)|² dμ(t)]^{1/2}
≤ c₁^{1/2} [∫_X |k(s,t)||x(t)|² dμ(t)]^{1/2} a.e. [μ].
Hence, by Fubini's Theorem (the function under the integral sign is nonnegative),

∫_X |(Kx)(s)|² dμ(s) ≤ c₁ ∫_X ∫_X |k(s,t)||x(t)|² dμ(t) dμ(s) = c₁ ∫_X |x(t)|² (∫_X |k(s,t)| dμ(s)) dμ(t) ≤ c₁c₂ ||x||₂².

The above argument shows that the formula used to define Kx is such that Kx is finite a.e. [μ], Kx ∈ L²(μ) and ||Kx||₂ ≤ (c₁c₂)^{1/2} ||x||₂.
The operator K described above is called an integral operator, and the function k is called its kernel.
(ix) A particular instance of the integral operator described above is known as the Volterra operator. Let k:[0,1] × [0,1]→F be the characteristic function of the set {(s,t) ∈ [0,1] × [0,1] : t < s}. The corresponding operator V:L²[0,1]→L²[0,1] is defined by

(Vx)(s) = ∫₀ˢ x(t) dt, x ∈ L²[0,1].

Then, by the Cauchy–Schwarz inequality,

|(Vx)(s)|² ≤ (∫₀ˢ |x(t)| dt)² ≤ (∫₀ˢ dt)(∫₀ˢ |x(t)|² dt) = s ∫₀ˢ |x(t)|² dt.

Consequently,

∫₀¹ |(Vx)(s)|² ds ≤ ∫₀¹ s ∫₀ˢ |x(t)|² dt ds ≤ ∫₀¹ s ∫₀¹ |x(t)|² dt ds = (∫₀¹ s ds)(∫₀¹ |x(t)|² dt) = ½ ||x||₂².

So,

||Vx||₂² ≤ ½ ||x||₂².

Thus, V is a bounded linear operator of norm not exceeding 1/√2.
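A discretisation of V (an illustrative numerical sketch; the grid size is an arbitrary choice) shows its norm comfortably within the bound 1/√2 derived above; the exact value of ||V|| is in fact 2/π ≈ 0.6366.

```python
import numpy as np

# Left-endpoint discretisation of (Vx)(s) = integral_0^s x(t) dt on a grid
# of n points: a lower-triangular matrix of quadrature weights (k = -1
# keeps the region t < s, matching the kernel above).
n = 400
h = 1.0 / n
V = h * np.tril(np.ones((n, n)), k=-1)

op_norm = np.linalg.norm(V, 2)
assert op_norm <= 1.0 / np.sqrt(2.0)      # within the bound derived above
assert abs(op_norm - 2.0 / np.pi) < 0.01  # close to the exact norm 2/pi
```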


(x) Let H be the Hilbert space L²[0,1] of square integrable functions defined on [0,1] and D = C¹[0,1] be the linear subspace of continuously differentiable functions. Define

T:D → L²[0,1], D ⊆ L²[0,1],

by the rule

(Tx)(t) = x′(t), t ∈ [0,1].

Clearly, T is linear. However, T is not bounded. In fact, for the sequence xₙ(t) = sin nπt, we have (Txₙ)(t) = nπ cos nπt and

∫₀¹ |Txₙ(t)|² dt = (nπ)² ∫₀¹ cos² nπt dt = (nπ)² ∫₀¹ (cos 2nπt + 1)/2 dt = ½(nπ)².

Consequently, ||Txₙ||₂ = nπ/√2. Also,

||xₙ||² = ∫₀¹ sin² nπt dt = ∫₀¹ (1 − cos 2nπt)/2 dt = ½.

Thus,

||T|| = sup{||Tx||/||x|| : x ∈ D, ||x|| ≠ 0} ≥ supₙ ||Txₙ||/||xₙ|| = supₙ (nπ) = ∞,

and hence, T is not bounded.
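The blow-up of ||Txₙ||/||xₙ|| = nπ can be observed numerically (a grid-based approximation of the L² norms; the grid size is an arbitrary choice):

```python
import numpy as np

# Grid approximation of L^2[0,1] norms for x_n(t) = sin(n*pi*t) and its
# derivative; the ratio ||x_n'|| / ||x_n|| should be close to n*pi.
t = np.linspace(0.0, 1.0, 20001)

def l2_norm(f):
    # sqrt of the mean of f^2 approximates (integral_0^1 f^2 dt)^{1/2}
    return np.sqrt(np.mean(f * f))

for n in (1, 5, 25):
    xn = np.sin(n * np.pi * t)
    dxn = n * np.pi * np.cos(n * np.pi * t)      # x_n'(t)
    ratio = l2_norm(dxn) / l2_norm(xn)
    assert abs(ratio - n * np.pi) < 0.05 * n * np.pi   # ratio ~ n*pi
```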


Problem Set 3.2

3.2.P1. Let [s_{i,j}]_{i,j≥1} be an infinite matrix [that is, a double sequence {s_{i,j}}_{i,j≥1} normally presented as an array] with K² = Σ_{i,j=1}^{∞} |s_{i,j}|² < ∞. The operator T is defined on ℓ² by

T({xᵢ}ᵢ≥₁) = {yᵢ}ᵢ≥₁,

where

yᵢ = Σ_{j=1}^{∞} s_{i,j} xⱼ, i = 1, 2, ….

Show that T is a bounded linear operator on ℓ².


3.2.P2. Let H be a separable Hilbert space and {eᵢ}ᵢ≥₁ be an orthonormal basis. Let T:H→H be a bounded linear operator. Show that T is determined by the matrix [(Teⱼ, eᵢ)]_{i,j≥1}.
3.2.P3. Let [s_{i,j}]_{i,j≥1} be an infinite matrix such that

a₁ = sup_j Σ_{i=1}^{∞} |s_{i,j}| < ∞ and a∞ = sup_i Σ_{j=1}^{∞} |s_{i,j}| < ∞.

Show that there is an operator T on H such that

(Teⱼ, eᵢ) = s_{i,j} and ||T||² ≤ a₁a∞.
3.2.P4. If s_{i,j} ≥ 0 (i, j = 1, 2, …), if pᵢ > 0 (i = 1, 2, …) and if a₁ and a∞ are positive numbers such that

Σ_{i=1}^{∞} s_{i,j} pᵢ ≤ a₁ pⱼ, j = 1, 2, …,
Σ_{j=1}^{∞} s_{i,j} pⱼ ≤ a∞ pᵢ, i = 1, 2, …,

then there exists an operator T on ℓ² with (Teⱼ, eᵢ) = s_{i,j} and ||T||² ≤ a₁a∞.
3.2.P5. Show that the matrix [1/(i + j − 1)]_{i,j≥1} defines a bounded linear operator T on ℓ² with ||T|| ≤ π. (The matrix is known as the Hilbert matrix.)
3.2.P6. Let {eₙ}ₙ≥₁ be the usual basis for ℓ² and {aₙ}ₙ≥₁ be a sequence of scalars. Show that there is a bounded linear operator T on ℓ² such that Teₙ = aₙeₙ for all n if, and only if, {aₙ}ₙ≥₁ is bounded. This type of operator is called a diagonal operator.
3.2.P7. (Laplace transform) Let x(t) be a complex-valued function on ℝ⁺ = {t ∈ ℝ : t ≥ 0}. Its Laplace transform Lx is the function on ℝ⁺ defined by

y(s) = (Lx)(s) = ∫₀^∞ x(t)e^{−st} dt.

Show that the Laplace transform is a bounded linear map of L²(ℝ⁺) into itself and ||L|| = √π.
3.2.P8. Find an operator T on ℝ² for which (Tx, x) = 0 for all x and ||T|| = 1.
3.2.P9. If M is a total subset of a Hilbert space H and S, T ∈ B(H) are such that Sx = Tx for all x ∈ M, then S = T.
3.2.P10. Let H = L²[0,1] and

k(s,t) = 0 if 0 ≤ s < t ≤ 1, and k(s,t) = 1/√(s − t) if 0 ≤ t < s ≤ 1.

(That k(s,t) is undefined when s = t is of no consequence.) Define

(Kx)(s) = ∫₀¹ k(s,t)x(t) dt.

Show that K is a bounded linear operator of norm at most 2.


3.2.P11. Let {aᵢ}ᵢ≥₁ be a sequence of complex numbers. Define an operator D_a on ℓ² by

D_a x = {aᵢxᵢ}ᵢ≥₁ for x = {xᵢ}ᵢ≥₁ ∈ ℓ².

Prove that D_a is bounded if, and only if, {aᵢ}ᵢ≥₁ is bounded and in this case ||D_a|| = supᵢ|aᵢ|.
3.2.P12. Let H₁ and H₂ be Hilbert spaces. Define H = H₁ ⊕ H₂ [see Sect. 2.7] to be the Hilbert space consisting of all pairs ⟨u₁, u₂⟩, uᵢ ∈ Hᵢ, i = 1, 2, with

⟨u₁, u₂⟩ + ⟨v₁, v₂⟩ = ⟨u₁ + v₁, u₂ + v₂⟩,
λ⟨u₁, u₂⟩ = ⟨λu₁, λu₂⟩,

the inner product being defined by

(⟨u₁, u₂⟩, ⟨v₁, v₂⟩) = (u₁, v₁)_{H₁} + (u₂, v₂)_{H₂}.

Given A₁ ∈ B(H₁) and A₂ ∈ B(H₂), define A on H by the matrix

$$A = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix},$$

i.e., A⟨u₁, u₂⟩ = ⟨A₁u₁, A₂u₂⟩. Prove that A ∈ B(H) and that

||A|| = max{||A₁||, ||A₂||}.

3.2.P13. Let ℓ²(ℤ) be the Hilbert space of all sequences {ξⱼ}_{j∈ℤ} with Σ_{j=−∞}^{∞} |ξⱼ|² < ∞ and the usual inner product. Define an operator S:ℓ²(ℤ)→ℓ²(ℤ) by the formula

S({ξⱼ}_{j∈ℤ}) = {ξ_{j−1}}_{j∈ℤ}.

Show that ||Sx|| = ||x|| for any x ∈ ℓ²(ℤ). Give a formula and a matrix representation for the operator Sⁿ for n ∈ ℤ.

3.3 The Algebra of Operators

For a normed linear space X and a Banach space Y, the space B(X, Y) of bounded linear operators from X to Y is a Banach space [Theorem 3.2.4] in the norm defined by

||T|| = sup{||Tx|| : ||x|| = 1} = sup{||Tx|| : ||x|| ≤ 1} = sup{||Tx||/||x|| : ||x|| ≠ 0}.

In what follows, we shall assume that X = Y = H, a Hilbert space. The Banach space B(X, Y) is then denoted by B(H). It turns out that B(H) is a “Banach algebra”.
Definition 3.3.1 An algebra A over a field F is a vector space over F such that to each ordered pair of elements x, y ∈ A a unique product xy ∈ A is assigned, with the properties

(xy)z = x(yz),
x(y + z) = xy + xz,
(x + y)z = xz + yz,
α(xy) = (αx)y = x(αy)

for all x, y, z ∈ A and α ∈ F.
Depending on whether F is ℝ or ℂ, A is called a real or complex algebra. A is said to be commutative if the multiplication is commutative, that is, for all x, y ∈ A,

xy = yx.

A is called an algebra with identity if it contains an element e ≠ 0 such that for all x ∈ A, we have

xe = ex = x.

The element e is called an identity. If A has an identity, it is unique. It may be noted that F and B(H) are algebras with identity.
Definition 3.3.2 A normed algebra is a normed space A which is an algebra such that for all x, y ∈ A,

||xy|| ≤ ||x|| ||y||,

and if A has an identity e,

||e|| = 1.

A Banach algebra is a normed algebra which is complete considered as a normed space.
The space C[a, b] of continuous functions defined on [a, b] is a commutative Banach algebra in which the product is defined by

(xy)(t) = x(t)y(t)

and

||x|| = sup_{t∈[a,b]} |x(t)|.

This commutative algebra has an identity, namely the constant function 1.
Theorem 3.3.3 (B(H), ||·||), where ||T|| = sup{||Tx|| : ||x|| ≤ 1}, T ∈ B(H), is a Banach algebra with identity, provided that H ≠ {0}.

Proof Since

||(ST)(x)|| = ||S(Tx)|| ≤ ||S|| ||Tx|| ≤ ||S|| ||T|| ||x||, S, T ∈ B(H),

it follows that

||ST|| ≤ ||S|| ||T||.

That B(H) is a Banach space has been checked in Theorem 3.2.4. The operator I is the identity and satisfies ||I|| = 1 when H ≠ {0}. ∎
Remarks 3.3.4

(i) If the dimension of H is 2 or greater, the algebra B(H) is not commutative. For example,

$$\begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}.$$

(ii) As in every algebra, Tⁿ will denote the product of n factors all equal to T, n = 1, 2, …; T⁰ is defined to be I, the identity operator. More generally, if p(λ) = Σ_{j=0}^{n} aⱼλʲ is any polynomial, we shall use the symbol p(T), T ∈ B(H), for the operator Σ_{j=0}^{n} aⱼTʲ.
(iii) Let H be a Hilbert space different from {0}. We have seen that B(H) is a Banach algebra with identity I and norm ||T|| = sup{||Tx|| : x ∈ H, ||x|| ≤ 1}. From now on, the Hilbert space will always be assumed to contain nonzero vectors.

Definition 3.3.5 A sequence {Tₙ}ₙ≥₁ in B(H) converges to T ∈ B(H) in the uniform operator norm if limₙ ||Tₙ − T|| = 0.
There are two other modes of convergence: strong operator convergence and weak operator convergence.

Definition 3.3.6 The sequence {Tₙ}ₙ≥₁ in B(H) converges strongly to T ∈ B(H) if, for each x ∈ H, limₙ ||Tₙx − Tx|| = 0. The sequence {Tₙ}ₙ≥₁ in B(H) converges weakly to T ∈ B(H) if, for all x, y ∈ H, limₙ |(Tₙx, y) − (Tx, y)| = 0.

Clearly, uniform operator convergence implies strong operator convergence, and strong operator convergence implies weak operator convergence. The reverse implications are not true in general [see Problem 3.8.P1]. These are some of the important modes of convergence in B(H). They will suffice for any developments we contemplate.
The inverses of certain operators will be of concern in later sections. If T ∈ B(H), where H is of course a Hilbert space, and I is the identity operator, we shall be concerned with the operator (T − λI)⁻¹, λ ∈ ℂ. When H = ℂⁿ and T is a linear operator on H, the set of λ's for which (T − λI)⁻¹ does not exist is precisely the set of eigenvalues of T. When H is infinite-dimensional, the set of λ's for which (T − λI)⁻¹ does not exist will turn out to be a nonempty compact subset of the complex plane. Assuming that (T − λI)⁻¹ exists, in which case it is obviously linear, it will be of interest to know whether it is bounded.
The treatment of the above question leads us into what is known as ‘spectral theory’ or ‘spectral analysis’.
Definition 3.3.7 Let T ∈ B(H). T is said to be invertible in B(H) if it has a set-theoretic inverse T⁻¹ and T⁻¹ ∈ B(H).
It is known that when the set-theoretic inverse T⁻¹ of an operator T ∈ B(H) exists, it is in B(H) [Theorem 5.5.2].
The following fundamental proposition will be used to show that the collection of invertible elements in B(H) is an open set and that inversion is continuous in the uniform operator norm.
Proposition 3.3.8 If T ∈ B(H) and ||I − T|| < 1, then T is invertible and

T⁻¹ = Σ_{k=0}^{∞} (I − T)ᵏ,

where convergence takes place in the uniform operator norm. Moreover,

||T⁻¹|| ≤ 1/(1 − ||I − T||).

Proof Set η = ||I − T|| < 1. Then for n > m, we have

||Σ_{k=0}^{n} (I − T)ᵏ − Σ_{k=0}^{m} (I − T)ᵏ|| = ||Σ_{k=m+1}^{n} (I − T)ᵏ|| ≤ Σ_{k=m+1}^{n} ||(I − T)ᵏ|| ≤ Σ_{k=m+1}^{n} ηᵏ < η^{m+1}/(1 − η).

The sequence of partial sums {Σ_{k=0}^{n} (I − T)ᵏ}ₙ≥₀ is Cauchy. If S = Σ_{k=0}^{∞} (I − T)ᵏ, then

TS = [I − (I − T)] (Σ_{k=0}^{∞} (I − T)ᵏ) = limₙ [I − (I − T)] Σ_{k=0}^{n} (I − T)ᵏ = limₙ [I − (I − T)ⁿ⁺¹] = I,

since limₙ ||(I − T)ⁿ⁺¹|| = 0. Similarly, ST = I, so that T is invertible with T⁻¹ = S. Moreover,

||S|| = limₙ ||Σ_{k=0}^{n} (I − T)ᵏ|| ≤ limₙ Σ_{k=0}^{n} ||I − T||ᵏ = 1/(1 − ||I − T||). ∎
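Proposition 3.3.8 can be sketched numerically: for a matrix T = I − N with ||N|| < 1 (a random illustrative choice), the partial sums of the Neumann series Σ Nᵏ converge to T⁻¹ and obey the stated norm bound.

```python
import numpy as np

# Neumann series sketch: if ||I - T|| < 1, then T^{-1} = sum_k (I - T)^k.
rng = np.random.default_rng(1)
N = 0.1 * rng.standard_normal((4, 4))      # a small perturbation
T = np.eye(4) - N                           # so that I - T = N
assert np.linalg.norm(N, 2) < 1.0

# Partial sums of the series sum_k N^k.
S = np.zeros((4, 4))
term = np.eye(4)
for _ in range(60):
    S += term
    term = term @ N

assert np.allclose(S, np.linalg.inv(T))     # the series converges to T^{-1}

# The norm bound ||T^{-1}|| <= 1 / (1 - ||I - T||) from Proposition 3.3.8:
assert np.linalg.norm(S, 2) <= 1.0 / (1.0 - np.linalg.norm(N, 2)) + 1e-9
```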

Let G denote the set of invertible elements in B(H).

Proposition 3.3.9 If T ∈ G and S ∈ B(H) satisfies ||S − T|| < 1/||T⁻¹||, then S is invertible. In particular, the set G is open in B(H). Moreover, the map T→T⁻¹ defined on G is continuous.

Proof Let T ∈ G. Consider {S ∈ B(H) : ||S − T|| < 1/||T⁻¹||}. Then, 1 > ||T⁻¹|| ||S − T|| ≥ ||T⁻¹S − I||. The preceding Proposition 3.3.8 implies that T⁻¹S ∈ G, and hence, S = T(T⁻¹S) is in G (the product of invertible elements is invertible). Thus, the ball of radius 1/||T⁻¹|| about each of its elements T, namely {S ∈ B(H) : ||S − T|| < 1/||T⁻¹||}, is contained in G. Consequently, G is an open subset of B(H).
It remains to show that the map T→T⁻¹ is continuous on G. If T ∈ G, then the inequality ||T − S|| < 1/(2||T⁻¹||) implies that ||I − T⁻¹S|| < ½ and hence

||S⁻¹|| = ||(S⁻¹T)T⁻¹|| ≤ ||(T⁻¹S)⁻¹|| ||T⁻¹|| ≤ (1/(1 − ||I − T⁻¹S||)) ||T⁻¹|| ≤ 2||T⁻¹||,

by Proposition 3.3.8. Thus, the inequality

||T⁻¹ − S⁻¹|| = ||T⁻¹(T − S)S⁻¹|| ≤ 2||T⁻¹||² ||T − S||

shows that the map T→T⁻¹ is continuous on G. ∎


Remark 3.3.10 The reader is undoubtedly familiar with the equivalence of the following assertions when H is finite-dimensional:

(i) T is invertible;
(ii) T is injective;
(iii) T is surjective;
(iv) there exists S ∈ B(H) such that TS = I; and
(v) there exists S ∈ B(H) such that ST = I.

The above assertions are not equivalent in infinite-dimensional spaces. Let H = ℓ² and T denote the “right shift”:

T({xᵢ}ᵢ≥₁) = (0, x₁, x₂, …).

Then, T is injective but not surjective, and thus not invertible. The operator S defined by

S({xᵢ}ᵢ≥₁) = (x₂, x₃, x₄, …)

is surjective but not injective and thus also not invertible. Moreover, ST({xᵢ}ᵢ≥₁) = S(0, x₁, x₂, …) = (x₁, x₂, x₃, …), which means ST = I. The reader may note that TS ≠ I.
Furthermore, no operator in a ball of radius 1 around T is invertible. Indeed, if ||T − A|| < 1, then

||I − SA|| = ||S(T − A)|| ≤ ||S|| ||T − A|| < 1,

since ||S|| ≤ 1. This implies SA is invertible by Proposition 3.3.8. If A were invertible, so would S be; but this is not the case.
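The right and left shifts of this remark can be modelled on finitely supported sequences (a toy sketch using Python lists, with trailing zeros left implicit), making ST = I and TS ≠ I concrete:

```python
# Right shift T and left shift S on finitely supported sequences,
# represented as Python lists (trailing zeros implicit).

def T(x):           # right shift: (x1, x2, ...) -> (0, x1, x2, ...)
    return [0] + x

def S(x):           # left shift: (x1, x2, ...) -> (x2, x3, ...)
    return x[1:]

x = [1, 2, 3]
assert S(T(x)) == x          # ST = I: the left shift undoes the right shift
assert T(S(x)) != x          # TS != I: the first coordinate is destroyed
assert T(S(x)) == [0, 2, 3]
```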
We next derive useful criteria for the invertibility of an operator.

Definition 3.3.11 An operator T ∈ B(H) is said to be bounded below if there exists an a > 0 such that ||Tx|| ≥ a||x|| for all x ∈ H.

An operator which is bounded below is clearly injective.
Theorem 3.3.12 An operator T ∈ B(H) is invertible if, and only if, it is bounded below and has dense range.

Proof If T is invertible, then the range of T is H and is therefore dense. Moreover,

||Tx|| ≥ (1/||T⁻¹||) ||T⁻¹Tx|| = (1/||T⁻¹||) ||x||, x ∈ H,

and therefore, T is bounded below.
Conversely, if T is bounded below, there exists an a > 0 such that ||Tx|| ≥ a||x|| for all x ∈ H. Hence, if {Txₙ}ₙ≥₁ is a Cauchy sequence in H, then the inequality

||xₙ − xₘ|| ≤ (1/a) ||Txₙ − Txₘ||

implies {xₙ}ₙ≥₁ is a Cauchy sequence in H. Let x = limₙxₙ. Then, x ∈ H and Tx = limₙTxₙ; hence, ran(T) is closed. Since ran(T) is dense in H, it follows that ran(T) = H. As T is bounded below, it is injective, and hence T⁻¹ is well defined. Moreover, if y = Tx, then

||T⁻¹y|| = ||x|| ≤ (1/a) ||Tx|| = (1/a) ||y||. ∎
We proceed to study vector-valued functions, which will be needed in Sect. 4.3 below.

Definition 3.3.13 Let f be a function defined in a domain Ω of the complex plane whose values lie in a complex Banach space X.
(a) f(ζ) is strongly holomorphic in Ω if the limit

lim_{h→0} (f(ζ + h) − f(ζ))/h

exists in the norm (of X) at every point ζ of Ω.
(b) f(ζ) is weakly holomorphic in Ω if for every bounded linear functional F on X, F(f(ζ)) is holomorphic in Ω in the classical sense.

The words holomorphic and analytic will be used interchangeably, as is the usual practice. Every strongly holomorphic function is weakly holomorphic. N. Dunford has proved the following surprising result.

Theorem 3.3.14 Let f:Ω→X be a weakly holomorphic function from Ω to X. Then, f is strongly holomorphic.
Proof For a bounded linear functional F on X, F(f(ζ)) is holomorphic in Ω; so, we can represent it by the Cauchy integral formula

F(f(ζ)) = (1/2πi) ∫_γ F(f(z))/(z − ζ) dz,

where γ is a simple closed rectifiable curve around ζ in Ω. Hence, for small |h| and |k|,

(F(f(ζ + h)) − F(f(ζ)))/h − (F(f(ζ + k)) − F(f(ζ)))/k
= (1/2πih) ∫_γ F(f(z)) [1/(z − ζ − h) − 1/(z − ζ)] dz − (1/2πik) ∫_γ F(f(z)) [1/(z − ζ − k) − 1/(z − ζ)] dz
= ((h − k)/2πi) ∫_γ F(f(z)) / ((z − ζ − h)(z − ζ − k)(z − ζ)) dz.

So,

(1/(h − k)) [ (F(f(ζ + h)) − F(f(ζ)))/h − (F(f(ζ + k)) − F(f(ζ)))/k ] = (1/2πi) ∫_γ F(f(z)) / ((z − ζ − h)(z − ζ − k)(z − ζ)) dz.  (3.10)

Since γ is compact and F(f(·)) is a continuous function, |F(f(z))| is bounded on γ. For small enough |h| and |k|, it now follows that the right-hand side of (3.10) is bounded. Hence, by the uniform boundedness principle [Theorem 5.4.6], there exists a constant C > 0 such that

|| (f(ζ + h) − f(ζ))/h − (f(ζ + k) − f(ζ))/k || ≤ C|h − k|.

(Any element of X may be thought of as a linear functional on X*.) Since X is complete, it follows that the difference quotient of f tends to a limit as h tends to 0. Thus, f(ζ) is strongly analytic. ∎
A holomorphic function f:Ω→X has a Taylor series representation at every z ∈ Ω, i.e., for every z ∈ Ω, there is an r = r(z) such that D(z,r) = {ζ ∈ ℂ : |ζ − z| < r} ⊆ Ω and

f(ζ) = Σ_{n=0}^{∞} aₙ(ζ − z)ⁿ  (3.11)

for some a₀, a₁, … in X and all ζ ∈ D(z,r), the series (3.11) being absolutely convergent (Σ_{n=0}^{∞} ||aₙ|| |ζ − z|ⁿ < ∞).
The other standard results concerning holomorphic functions remain valid in this more general setting. These results can be proved by the same method that is used for complex-valued functions.
Also, the radius of convergence of (3.11) is lim infₙ ||aₙ||^{−1/n}, just as in the classical case. Correspondingly, the Laurent series

g(ζ) = Σ_{n=0}^{∞} bₙζ⁻ⁿ  (3.12)

has radius of convergence s = lim supₙ ||bₙ||^{1/n}, in the sense that (3.12) converges when |ζ| > s and diverges when |ζ| < s. Indeed, if |ζ| > s, then choosing ε > 0 such that (1 + ε)s/|ζ| < 1, we have ||bₙ||^{1/n} < (1 + ε)s for every sufficiently large n. Hence, ||bₙζ⁻ⁿ|| < ((1 + ε)s/|ζ|)ⁿ if n is sufficiently large, implying that (3.12) is absolutely convergent. Conversely, if |ζ| < s, then there is an infinite sequence n₁ < n₂ < … such that ||b_{nₖ}|| > |ζ|^{nₖ}. But then ||b_{nₖ}ζ^{−nₖ}|| > 1 and so, (3.12) does not converge.
Problem Set 3.3

3.3.P1. Let H be a Hilbert space and let T₁, T₂, T₃ ∈ B(H). On H⁽³⁾ = H ⊕ H ⊕ H, define T by the matrix

$$T = \begin{pmatrix} 0 & T_3 & T_1 \\ 0 & 0 & T_2 \\ 0 & 0 & 0 \end{pmatrix}.$$

Prove that T ∈ B(H⁽³⁾). For a ∈ ℂ, show that (I − aT) is invertible and find its inverse.
3.3.P2. Let μ = {μₖ}ₖ≥₁ be a sequence of complex numbers with supₖ|μₖ| < 1. Prove that the following systems of equations have unique solutions in ℓ² for any {ηₖ}ₖ≥₁ ∈ ℓ². Find the solutions for ηₖ = δ₁ₖ, μₖ = 1/(2k − 1).

(a) ξₖ − μₖξₖ₊₁ = ηₖ, k = 1, 2, …
(b) ξₖ − μₖξₖ₋₁ = ηₖ, k = 2, 3, … and ξ₁ = η₁.

3.3.P3. Show that T ∈ B(H) is surjective if, and only if, T* is bounded below.
3.3.P4. Show that if T ≥ O, then (I + T)⁻¹ exists.

3.4 Sesquilinear Forms

In this section, a new kind of functional, the sesquilinear functional (or sesquilinear form), will be introduced. On the pattern of linear functionals, the notion of bounded sesquilinear functionals is studied, and a characterisation of such functionals is provided.

Definition 3.4.1 Let X be a vector space over ℂ. A sesquilinear form on X is a mapping B from X × X into the complex plane ℂ with the following properties:

(i) B(x₁ + x₂, y) = B(x₁, y) + B(x₂, y);
(ii) B(x, y₁ + y₂) = B(x, y₁) + B(x, y₂);
(iii) B(αx, y) = αB(x, y); and
(iv) B(x, βy) = β̄B(x, y)

for all x, x₁, x₂, y, y₁, y₂ in X and all scalars α, β in ℂ.
Thus, B is linear in the first argument and conjugate linear in the second argument. If X is a real vector space, then (iv) is simply

B(x, βy) = βB(x, y),

and B is called bilinear, since it is linear in each of the two arguments.

Definition 3.4.2 A Hermitian form B on a complex vector space X is a mapping
from X × X into the complex plane ℂ satisfying properties (i), (ii), (iii) and the
additional property

(v) B(x, y) = B̄(y, x), where the bar denotes complex conjugation.

It is then obvious that B must also have the property (iv) above and thus be
sesquilinear. However, a sesquilinear form need not be Hermitian, for example, B(x, y) = i(x, y), where (x, y) on the right denotes an inner product in X. In this
connection, see (ii) of Remarks 3.4.4 below.
A sesquilinear form B on X is said to be nondegenerate if it has the following
property:

(vi) If x ∈ X is such that B(x, y) = 0 for all y ∈ X, then x = 0; if y ∈ X is such that B(x, y) = 0 for all x ∈ X, then y = 0.

Example 3.4.3
(i) The inner product in any pre-Hilbert space is a nondegenerate Hermitian form.
In particular, the usual inner product (x, y) = x1ȳ1 + ⋯ + xnȳn is a nondegenerate
Hermitian form on ℂⁿ. But if we delete one or more terms in the preceding
sum, it will define a degenerate Hermitian form on ℂⁿ.
(ii) The form

B(x, y) = x1ȳ1 − x2ȳ2

is a nondegenerate form on ℂ² (it would be degenerate when viewed as a form
on ℂⁿ, n > 2).

Remarks 3.4.4
(i) The property (iv) above is responsible for the name “sesquilinear”; the Latin
prefix “sesqui” means one and a half.
(ii) A sesquilinear form is Hermitian if, and only if, B(x, x) is a real number for all x.

It follows in view of the property (v) with y = x that B(x, x) = B̄(x, x), that is,
B(x, x) is real. On the other hand, we have

B(x + y, x + y) − B(x, x) − B(y, y) = B(x, y) + B(y, x).   (3.13)

Since the left-hand side of the above equality (3.13) is real for all x and y in X, it
implies

ℑB(x, y) = −ℑB(y, x).   (3.14)

Apply (3.13) with iy in place of y. The left-hand side must again be real and so must be
the right-hand side, which is now, in view of the sesquilinearity,

i[−B(x, y) + B(y, x)].

Consequently, ℜ[−B(x, y) + B(y, x)] = 0, which implies ℜB(y, x) = ℜB(x, y).
Hence, in view of (3.14), B(x, y) = B̄(y, x).

We shall essentially be interested in positive definite forms. These are sesquilinear forms which satisfy the following condition:

for all x ∈ X, x ≠ 0: B(x, x) > 0.

In particular, positive definite sesquilinear forms are Hermitian. They are obviously
nondegenerate.
Sesquilinear forms which satisfy the weaker condition, namely

for all x ∈ X, x ≠ 0: B(x, x) ≥ 0

are called nonnegative sesquilinear forms.


We now present a result for sesquilinear forms generalising the Cauchy–
Schwarz inequality for inner products.
Theorem 3.4.5 Let B be a nonnegative sesquilinear form on the complex vector
space X. Then,

|B(x, y)|² ≤ B(x, x)B(y, y) for all x, y ∈ X.

Proof If B(x, y) = 0, the inequality is, of course, true. Suppose B(x, y) 6¼ 0. Then for
arbitrary complex numbers a, b, we have

0  Bðax þ by; ax þ byÞ


¼ aaBðx; xÞ þ abBðx; yÞ þ abBðy; xÞ þ bbBðy; yÞ
¼ aaBðx; xÞ þ abBðx; yÞ þ abBðx; yÞ þ bbBðy; yÞ

since B is nonnegative. Now let a = t be real and set b = B(x, y)/|B(x, y)|. Then,

bBðy; xÞ ¼ jBðx; yÞj and bb ¼ 1:

Hence,

0  t2 Bðx; xÞ þ 2tjBðx; yÞj þ Bðy; yÞ

for an arbitrary real number t. Thus, the discriminant


178 3 Linear Operators

4jBðx; yÞj2 4Bðx; xÞBðy; yÞ  0;

which completes the proof. h
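The inequality just proved can be sanity-checked numerically. The sketch below is not from the text: the weights 2, 3 and the sample vectors are arbitrary illustrative choices defining a nonnegative diagonal form on ℂ².

```python
# Numeric check of Theorem 3.4.5 (generalised Cauchy-Schwarz) for the
# nonnegative sesquilinear form B(x, y) = 2*x1*conj(y1) + 3*x2*conj(y2) on C^2.

def B(x, y):
    # both weights are >= 0, so B(x, x) >= 0 for every x
    return 2 * x[0] * y[0].conjugate() + 3 * x[1] * y[1].conjugate()

x = (1 + 2j, 3 - 1j)
y = (-2 + 1j, 0.5j)

lhs = abs(B(x, y)) ** 2               # |B(x, y)|^2
rhs = (B(x, x) * B(y, y)).real        # B(x, x) and B(y, y) are real and >= 0

assert lhs <= rhs + 1e-12
```

Running a check like this on many random vectors never produces a violation, as the theorem guarantees.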


Definition 3.4.6 Let H be a Hilbert space. The sesquilinear form B is said to be
bounded if there exists some positive constant M such that

|B(x, y)| ≤ M ||x|| ||y|| for all x, y ∈ H.

The norm of B is defined by

||B|| = sup{|B(x, y)| : ||x|| = ||y|| = 1} = sup{|B(x, y)|/(||x|| ||y||) : x, y ∈ H, x ≠ 0 ≠ y}.

Example 3.4.7
(i) If H is a Hilbert space, the sesquilinear form B : H × H → ℂ defined by B(x, y) = (x, y) is bounded by the Cauchy–Schwarz inequality. Moreover, ||B|| = 1.
Indeed, |B(x, y)| = |(x, y)| ≤ ||x|| ||y||, and so, ||B|| ≤ 1. For y = x, |B(x, x)| = |(x, x)| = ||x||² = 1 if ||x|| = 1.
(ii) If H is a Hilbert space and T : H → H is a bounded linear operator, then B(x, y) = (Tx, y) is a bounded sesquilinear form with ||B|| = ||T||. Indeed, for x, y ∈ H with ||x|| = ||y|| = 1,

|B(x, y)| = |(Tx, y)| ≤ ||Tx|| ||y|| ≤ ||T||;

hence,

||B|| ≤ ||T||.

On the other hand, for y = Tx ≠ 0,

||B|| ≥ |B(x, Tx)|/(||x|| ||Tx||) = ||Tx||²/(||x|| ||Tx||) = ||Tx||/||x||,

which implies

||B|| ≥ ||T||.

(iii) A bounded sesquilinear form B : H × H → ℂ is jointly continuous in both
variables:

|B(x, y) − B(x0, y0)| = |B(x − x0, y − y0) + B(x − x0, y0) + B(x0, y − y0)|
  ≤ ||B||(||x − x0|| ||y − y0|| + ||x − x0|| ||y0|| + ||x0|| ||y − y0||).
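The equality ||B|| = ||T|| in (ii) above can be illustrated on a small example. The sketch below is not from the text; the matrix T = [[0, 2], [0, 0]] on ℂ² (operator norm 2) and the sample vectors are arbitrary choices.

```python
# Sketch of Example 3.4.7(ii): for B(x, y) = (Tx, y), |B(x, y)| <= ||T|| ||x|| ||y||,
# and the bound is attained, so ||B|| = ||T||.  Here T = [[0, 2], [0, 0]], ||T|| = 2.

def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply_T(x):
    return (2 * x[1], 0)          # T = [[0, 2], [0, 0]]

def norm(x):
    return abs(inner(x, x)) ** 0.5

def B(x, y):
    return inner(apply_T(x), y)

# |B(x, y)| <= ||T|| ||x|| ||y|| on a few sample vectors
for x, y in [((1, 1j), (2, -1)), ((0.5j, 3), (1, 1)), ((1, 0), (0, 1))]:
    assert abs(B(x, y)) <= 2 * norm(x) * norm(y) + 1e-12

# the bound is attained: with x = (0, 1), the choice y = Tx/||Tx|| gives |B| = ||T||
x = (0, 1)
Tx = apply_T(x)
y = tuple(c / norm(Tx) for c in Tx)
assert abs(abs(B(x, y)) - 2) < 1e-12
```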

It is interesting that the Riesz Representation Theorem 2.10.25 yields a general
representation of sesquilinear forms on a Hilbert space.
Theorem 3.4.8 Let H be a Hilbert space and B(·,·) : H × H → ℂ be a bounded
sesquilinear form. Then B has the representation

B(x, y) = (Sx, y),

where S : H → H is a bounded linear operator. S is uniquely determined by B and has
norm

||S|| = ||B||.

Proof For fixed x, the expression B̄(x, y) (the complex conjugate of B(x, y)) defines a linear functional in y whose
domain is H. Theorem 2.10.25 of F. Riesz then yields an element z ∈ H such
that

B̄(x, y) = (y, z).

Hence,

B(x, y) = (z, y).

Here, z is unique but, of course, depends on x ∈ H. Define the mapping S : H → H by
Sx = z, x ∈ H. Then,

B(x, y) = (Sx, y).

Since

B(a1x1 + a2x2, y) = a1B(x1, y) + a2B(x2, y),

we have

(S(a1x1 + a2x2) − a1Sx1 − a2Sx2, y) = 0, y ∈ H.

Since y is arbitrary,

S(a1x1 + a2x2) = a1Sx1 + a2Sx2,

so that S is a linear operator. The domain of the operator S is the whole of
H. Furthermore, since |(Sx, y)| ≤ ||Sx|| ||y||, we have

||B|| = sup{|B(x, y)|/(||x|| ||y||) : x ≠ 0, y ≠ 0} = sup{|(Sx, y)|/(||x|| ||y||) : x ≠ 0, y ≠ 0} ≤ sup{||Sx||/||x|| : x ≠ 0} = ||S||.

On the other hand,

||B|| = sup{|(Sx, y)|/(||x|| ||y||) : x ≠ 0, y ≠ 0} ≥ sup{|(Sx, Sx)|/(||x|| ||Sx||) : Sx ≠ 0} = sup{||Sx||/||x|| : Sx ≠ 0} = ||S||.

It remains to check that S is unique. Suppose there is a linear operator T : H → H such
that for all x, y ∈ H, we have

B(x, y) = (Sx, y) = (Tx, y).

It then follows that

((S − T)x, y) = 0, x, y ∈ H.

Setting y = (S − T)x, we obtain ||(S − T)x|| = 0, that is, Sx = Tx for each
x ∈ H. Consequently, S = T. □
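In finite dimensions the operator S of Theorem 3.4.8 can be written down explicitly: its matrix entries are sij = B(ej, ei). The sketch below is not from the text; the form B on ℂ² is an arbitrary illustrative choice.

```python
# Finite-dimensional illustration of Theorem 3.4.8: a bounded sesquilinear
# form on C^2 equals (Sx, y), where the matrix of S has entries s_ij = B(e_j, e_i).

def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def B(x, y):
    # an arbitrary bounded sesquilinear form on C^2 (illustrative choice)
    return (x[0] * y[0].conjugate()
            + 2j * x[0] * y[1].conjugate()
            + 3 * x[1] * y[0].conjugate())

e = [(1, 0), (0, 1)]
S = [[B(e[j], e[i]) for j in range(2)] for i in range(2)]  # s_ij = B(e_j, e_i)

def apply_S(x):
    return tuple(sum(S[i][j] * x[j] for j in range(2)) for i in range(2))

# B(x, y) and (Sx, y) agree on a sample pair of vectors
x, y = (1 + 1j, -2), (0.5, 3j)
assert abs(B(x, y) - inner(apply_S(x), y)) < 1e-12
```

The recipe sij = B(ej, ei) comes from substituting basis vectors into B(x, y) = (Sx, y).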
The following simple theorem is often useful.
Theorem 3.4.9 Suppose a complex scalar function B : H × H → ℂ, where H denotes a
Hilbert space, satisfies the following conditions:

(i) B(x1 + x2, y) = B(x1, y) + B(x2, y);

(ii) B(x, y1 + y2) = B(x, y1) + B(x, y2);

(iii) B(ax, y) = aB(x, y);

(iv) B(x, by) = b̄B(x, y);

(v) |B(x, x)| ≤ M||x||²; and

(vi) |B(x, y)| = |B(y, x)|,

where M is a constant and x, x1, x2, y, y1, y2 are arbitrary elements of H and a, b are
scalars. Then B is a bounded sesquilinear functional with ||B|| ≤ M.
Proof From (i)–(iv), it follows that

B(x, y) + B(y, x) = ½[B(x + y, x + y) − B(x − y, x − y)].

This implies

|B(x, y) + B(y, x)| ≤ ½M[||x + y||² + ||x − y||²] = M[||x||² + ||y||²].   (3.15)

Let ||x|| ≤ 1, ||z|| ≤ 1 and y = λz, where λ is a complex number of absolute
value 1, to be specified later. Then, (3.15) yields

|λ̄B(x, z) + λB(z, x)| ≤ 2M.   (3.16)

Assume that B(x, z) ≠ 0 and let

B(x, z) = |B(x, z)|e^{iγ}, B(z, x) = |B(z, x)|e^{iδ}.

Then by (3.16) and (vi),

|B(x, z)| |λ̄e^{iγ} + λe^{iδ}| ≤ 2M.

Letting λ = e^{i(γ−δ)/2}, we find that

λ̄e^{iγ} + λe^{iδ} = e^{i(γ+δ)/2} + e^{i(γ+δ)/2} = 2e^{i(γ+δ)/2},

which yields

|B(x, z)| ≤ M, ||x|| ≤ 1, ||z|| ≤ 1.

As the relation obviously holds for B(x, z) = 0, the result follows. □


Corollary 3.4.10 If the bounded sesquilinear functional B satisfies the condition

|B(x, y)| = |B(y, x)|, x, y ∈ H,

then

||B|| = sup{|B(x, x)|/||x||² : x ∈ H, x ≠ 0}.

Proof The supremum in question is obviously a possible value of M that satisfies
(v) of Theorem 3.4.9. It follows that

||B|| ≤ sup{|B(x, x)|/||x||² : x ∈ H, x ≠ 0};

but on the other hand,

sup{|B(x, x)|/||x||² : x ≠ 0} ≤ sup{|B(x, y)|/(||x|| ||y||) : x ≠ 0, y ≠ 0} = ||B||. □

The following result, which is a special case of Corollary 3.4.10, plays an
important role in the exposition of spectral theory given in subsequent pages.
Corollary 3.4.11 If H is a Hilbert space, the norm of a Hermitian bounded
sesquilinear form B : H × H → ℂ is given by the formula

||B|| = sup{|B(x, x)|/||x||² : x ∈ H, x ≠ 0}.

Proof Indeed, a Hermitian bounded sesquilinear form B satisfies the condition
|B(x, y)| = |B(y, x)|. □

Problem Set 3.4

3.4.P1. Let B(·,·) be a bounded sesquilinear form on a Hilbert space H. Show that
(a) (Parallelogram law) For all x, y ∈ H,

B(x + y, x + y) + B(x − y, x − y) = 2B(x, x) + 2B(y, y);

(b) (Polarisation identity) For all x, y ∈ H,

4B(x, y) = B(x + y, x + y) − B(x − y, x − y) + iB(x + iy, x + iy) − iB(x − iy, x − iy);

(c) B = 0 if, and only if, B(x, x) = 0 for all x ∈ H.

3.4.P2. A function f defined on a Hilbert space H is called a quadratic form if there
exists a sesquilinear form B on H × H such that f(x) = B(x, x). Show that a pointwise
limit of quadratic forms is a quadratic form.
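The polarisation identity of 3.4.P1(b) is a purely algebraic consequence of sesquilinearity, so it can be checked mechanically. The sketch below is not from the text; the matrix T and the sample vectors are arbitrary illustrative choices.

```python
# Numeric check of the polarisation identity (Problem 3.4.P1(b)) for the
# sesquilinear form B(x, y) = (Tx, y) on C^2, with an arbitrary 2x2 matrix T.

def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

T = [[1, 2j], [-1, 0.5]]

def B(x, y):
    Tx = tuple(sum(T[i][j] * x[j] for j in range(2)) for i in range(2))
    return inner(Tx, y)

def comb(x, y, c):
    # returns x + c*y componentwise
    return tuple(a + c * b for a, b in zip(x, y))

x, y = (1 + 1j, 2), (0.5j, -1)
rhs = (B(comb(x, y, 1), comb(x, y, 1)) - B(comb(x, y, -1), comb(x, y, -1))
       + 1j * B(comb(x, y, 1j), comb(x, y, 1j))
       - 1j * B(comb(x, y, -1j), comb(x, y, -1j)))

assert abs(4 * B(x, y) - rhs) < 1e-12
```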

3.5 The Adjoint Operator

The study of sesquilinear forms on a Hilbert space H yields rich dividends. The algebra
B(H) of bounded linear operators on H admits a canonical bijection T → T* possessing pleasant algebraic properties. Moreover, many properties of T can be
studied through the operator T*. It also helps us to study three important classes of

operators, namely self-adjoint, unitary and normal operators. These classes have
been studied extensively, because they play an important role in various
applications.
Definition 3.5.1 Let T be a bounded linear operator on a Hilbert space H. Then the
Hilbert space adjoint T* of T is the operator

T* : H → H

such that for all x, y ∈ H,

(Tx, y) = (x, T*y).

We first show that this definition makes sense and also prove that the adjoint
operator has the same norm.
Theorem 3.5.2 The Hilbert space adjoint T* of T in Definition 3.5.1 exists, is
unique and is a bounded linear operator with norm

||T*|| = ||T||.

Proof The formula

B(y, x) = (y, Tx), x, y ∈ H,   (3.17)

defines a bounded sesquilinear form on H × H, because the inner product is a
sesquilinear form and T is a bounded linear operator. Indeed, for y1, y2, x1, x2 in
H and scalars a, b,

B(ay1 + by2, x) = (ay1 + by2, Tx) = a(y1, Tx) + b(y2, Tx) = aB(y1, x) + bB(y2, x)

and

B(y, ax1 + bx2) = (y, T(ax1 + bx2)) = (y, aTx1 + bTx2)
  = ā(y, Tx1) + b̄(y, Tx2) = āB(y, x1) + b̄B(y, x2).

Moreover, B is bounded:

|B(y, x)| = |(y, Tx)| ≤ ||y|| ||Tx|| ≤ ||T|| ||y|| ||x||.   (3.18)

This implies ||B|| ≤ ||T||. Also,

||B|| = sup{|(y, Tx)|/(||y|| ||x||) : x ≠ 0, y ≠ 0} ≥ sup{|(Tx, Tx)|/(||Tx|| ||x||) : Tx ≠ 0} = ||T||.   (3.19)

From (3.18) and (3.19), we conclude that

||B|| = ||T||.   (3.20)

From the representation Theorem 3.4.8 for bounded sesquilinear forms, we have

B(y, x) = (T*y, x),   (3.21)

where we have replaced S of Theorem 3.4.8 by T*.
The operator T* : H → H is a uniquely defined bounded linear operator with norm

||T*|| = ||B|| = ||T||.

The last equality is the assertion of (3.20). Note that

(y, Tx) = (T*y, x)   (3.22)

follows on comparing (3.17) and (3.21). On taking conjugates in (3.22), we obtain

(Tx, y) = (x, T*y).

This completes the proof. □


Remarks 3.5.3
(i) T = O if, and only if, (Tx, y) = 0 for all x, y ∈ H. Indeed, T = O means Tx = 0 for all x ∈
H, and this implies (Tx, y) = (0, y) = 0. On the other hand, (Tx, y) = 0 for all x, y
∈ H implies Tx = 0 for all x ∈ H, which, by definition, says T = O.
(ii) (Tx, x) = 0 for all x ∈ H if, and only if, T = O (recall that the underlying Hilbert space is complex; the scalar i is used below). For x = ay + z ∈ H,

0 = (T(ay + z), ay + z)
  = |a|²(Ty, y) + a(Ty, z) + ā(Tz, y) + (Tz, z)   (3.23)
  = a(Ty, z) + ā(Tz, y),

since (Ty, y) = 0 and (Tz, z) = 0. Setting a = 1 and a = i in (3.23) gives

(Ty, z) + (Tz, y) = 0   (3.24)

and

(Ty, z) − (Tz, y) = 0.   (3.25)

From (3.24) and (3.25), we get (Ty, z) = 0, which implies T = O on using
(i) above.
The following general properties of Hilbert space adjoint operators are fre-
quently used in studying these operators.
Theorem 3.5.4 If S, T ∈ B(H) and a is a scalar, then
(a) (aS + T)* = āS* + T*;
(b) (ST)* = T*S*;
(c) (S*)* = S;
(d) if S is invertible in B(H) and S⁻¹ is its inverse, then S* is invertible and
(S*)⁻¹ = (S⁻¹)*;
(e) ||S*S|| = ||SS*|| = ||S||²;
(f) S*S = O if, and only if, S = O.

Proof
(a) By definition of the adjoint, for all x, y ∈ H,

(x, (aS + T)*y) = ((aS + T)x, y)
  = (aSx, y) + (Tx, y)
  = (x, āS*y) + (x, T*y)
  = (x, (āS* + T*)y).

Hence, (aS + T)*y = (āS* + T*)y for all y ∈ H, which implies
(aS + T)* = āS* + T*.
(b) For x, y ∈ H,

(x, (ST)*y) = (ST(x), y)
  = (Tx, S*y)
  = (x, T*S*(y)).

Hence, (ST)*y = T*S*(y) for all y ∈ H, which implies (b).
(c) For x, y ∈ H,

(x, (S*)*y) = (S*x, y) = (x, Sy).

Hence, (S*)*y = Sy for all y ∈ H, which implies (c).
(d) If I denotes the identity operator in B(H), then I* = I. Indeed, for x, y ∈ H,

(x, I*y) = (Ix, y) = (x, y) = (x, Iy).

Hence, I*y = Iy for all y ∈ H, which implies I* = I.
Suppose S is an invertible element in B(H). Then S⁻¹S = SS⁻¹ = I. Using (b) above,
we have (S⁻¹S)* = S*(S⁻¹)* = I*. Since I* = I, we get S*(S⁻¹)* = I. Similarly,
(S⁻¹)*S* = I. Hence, (S*)⁻¹ = (S⁻¹)*.
(e) By the Cauchy–Schwarz inequality,

||Sx||² = (Sx, Sx) = (S*Sx, x) ≤ ||S*Sx|| ||x|| ≤ ||S*S|| ||x||².

Taking the supremum over all x of norm 1, we obtain

||S||² ≤ ||S*S||.

Applying Theorem 3.5.2, we obtain

||S||² ≤ ||S*S|| ≤ ||S*|| ||S|| = ||S||².

Hence,

||S*S|| = ||S||².   (3.26)

Replacing S by S* and using Theorem 3.5.2, we obtain

||SS*|| = ||S||².   (3.27)

Thus, on using (3.26) and (3.27), the result follows.
(f) This is an immediate consequence of (e) above. □
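In ℂⁿ the adjoint is the conjugate transpose (see (i) of Example 3.5.10 below), so the algebraic rules of Theorem 3.5.4 can be spot-checked on small matrices. The sketch below is not from the text; the matrices S, T and the scalar a are arbitrary illustrative choices.

```python
# Sanity checks of Theorem 3.5.4 (a)-(c) on 2x2 complex matrices, where the
# adjoint is the conjugate transpose.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adj(A):
    # adjoint = conjugate transpose
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

S = [[1, 2j], [0, -1]]
T = [[0.5, 1], [1j, 3]]

# (b): (ST)* = T* S*  (note the reversal of the factors)
assert adj(matmul(S, T)) == matmul(adj(T), adj(S))

# (c): (S*)* = S
assert adj(adj(S)) == S

# (a) with T = O: (aS)* = conj(a) S*
a = 2 - 1j
aS = [[a * S[i][j] for j in range(2)] for i in range(2)]
expected = [[a.conjugate() * adj(S)[i][j] for j in range(2)] for i in range(2)]
assert adj(aS) == expected
```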

Remarks 3.5.5
(i) The map T → T* has properties very similar to those of the complex conjugation z → z̄
on ℂ. A new feature is the relation (b) of Theorem 3.5.4, which results from
the noncommutativity of operator multiplication.
(ii) Since ||T*|| = ||T||, we have ||T* − S*|| = ||(T − S)*|| = ||T − S||, and it follows
that the map T → T* from B(H) to B(H) is continuous in the norm.
(iii) If H is a Hilbert space, we know that B(H) is a Banach algebra
[Theorem 3.3.3]. Moreover, in view of Theorem 3.5.4, the mapping T → T*
of B(H) into itself is such that

(a) T** = T;

(b) (S + T)* = S* + T*;

(c) (aT)* = āT*;

(d) (ST)* = T*S*;

(e) ||T*T|| = ||T||².

It is immediate from (a) that the mapping T → T* is both one-to-one and onto. It
is useful to have the following general definition.
Definition 3.5.6 Let A be an algebra over ℂ. A mapping a → a* of A into itself is
called an involution if, for all a, b ∈ A and all α ∈ ℂ,
(i) a** = a;
(ii) (a + b)* = a* + b*;
(iii) (αa)* = ᾱa*;
(iv) (ab)* = b*a*.
An algebra with an involution is called a *-algebra. A normed algebra with an
involution is called a normed *-algebra. A Banach algebra A with an involution
satisfying ||a*a|| = ||a||² is called a C*-algebra.
Observe that in a C*-algebra,

||a||² = ||a*a|| ≤ ||a*|| ||a||,

which implies ||a|| ≤ ||a*|| provided a ≠ 0. Replacing a by a* and using (i) of the
above definition, we obtain ||a*|| ≤ ||a||. Thus, ||a|| = ||a*|| for all a ∈ A, since the
equality is trivially true when a = 0.
In view of the observations in (iii) of Remarks 3.5.5, it follows that B(H) is a C*-algebra. Obviously, every *-subalgebra of B(H), that is, a subalgebra containing
adjoints, which is closed in the norm is also a C*-algebra. Every C*-algebra has the
same “mathematical structure” as a subalgebra of B(H) for a suitable Hilbert space
H; this is known as the Gelfand–Naimark Theorem. The study of such algebras
constitutes an important area of research in functional analysis and is beyond the
scope of the present text.
There is an interesting relationship between the range of an operator T ∈
B(H) and the kernel of its adjoint T*. This relationship proves useful in deciding the
invertibility of operators.
Theorem 3.5.7 Let M and N be closed linear subspaces of a Hilbert space
H. Then T(M) ⊆ N if, and only if, T*(N⊥) ⊆ M⊥.

Proof Suppose T(M) ⊆ N and let y ∈ T*(N⊥). There exists x ∈ N⊥ such that y =
T*x. For z ∈ M, (y, z) = (T*x, z) = (x, Tz) = 0, since x ∈ N⊥ and Tz ∈ N; thus,
y ⊥ M.
If T*(N⊥) ⊆ M⊥, then by the argument in the above paragraph,
T**(M⊥⊥) ⊆ N⊥⊥. Since T** = T and M and N are closed subspaces of H, it
follows that T(M) ⊆ N. □
Theorem 3.5.8 If T ∈ B(H), then ker(T) = ker(T*T) = [ran(T*)]⊥, and [ker(T)]⊥ is
the closure of ran(T*).
Proof Clearly, ker(T) ⊆ ker(T*T). The reverse inclusion follows from the computation ||Tx||² = (Tx, Tx) = (T*Tx, x).
Now, x ∈ ker(T) ⇔ Tx = 0 ⇔ (Tx, y) = 0 for all y ∈ H ⇔ (x, T*y) = 0 for all y ∈
H ⇔ x ∈ [ran(T*)]⊥. Thus, ker(T) = [ran(T*)]⊥.
It follows by (iii) of Remark 2.10.12 that [ker(T)]⊥ = [ran(T*)]⊥⊥, which is the closure of ran(T*).
□
The following theorem provides a criterion for the invertibility of T ∈ B(H).
Theorem 3.5.9 If T ∈ B(H) is such that T and T* are both bounded below, then T
is invertible.
Proof If T* is bounded below, then ker(T*) = {0}. In view of Theorem 3.5.8 (applied to T*),
[ran(T)]⊥ = ker(T*) = {0}, which implies [ran(T)]⊥⊥ = {0}⊥ = H. Since [ran(T)]⊥⊥ is the closure of ran(T), ran(T) is dense
in H and the result now follows on using Theorem 3.3.12. □
In the following examples, we compute the adjoints of some well-known
operators.
Example 3.5.10
(i) Let H = ℂⁿ, the Hilbert space of finite dimension n, and let {e1, e2, …, en} be the
standard orthonormal basis for H. Define T : ℂⁿ → ℂⁿ by setting

(Tx)i = Σj ai,jxj.

Clearly, T is linear and hence bounded [(iii) of Example 3.2.5].
Since the inner product in ℂⁿ is (x, y) = Σi xiȳi,

(Tx, y) = Σi (Tx)iȳi
  = Σi (Σj ai,jxj) ȳi
  = Σj xj Σi ai,jȳi
  = (x, T*y),

where (T*y)j = Σi āi,jyi. The adjoint of T is, therefore, represented by the
usual conjugate transpose of the matrix representing T.


(ii) Let H be a separable Hilbert space and let {en}n≥1 constitute an orthonormal
basis for H. By Problem 3.2.P2, each T ∈ B(H) is defined by a matrix
[ai,j]i,j≥1, where ai,j = (Tej, ei), i, j = 1, 2, …. Since T* ∈ B(H) and

(T*ej, ei) = (ej, Tei), which is the complex conjugate of (Tei, ej), for i, j = 1, 2, …,

it follows that the matrix representing T* is the conjugate of the transpose of
the matrix [ai,j]i,j≥1 representing T.
(iii) The adjoint of the operator T ∈ B(H) defined by Tx = ax, x ∈ H and a ∈ ℂ, is
the operator T* defined by T*x = āx, x ∈ H. Indeed, for x, y ∈ H, (x, T*y) =
(Tx, y) = (ax, y) = (x, āy). Thus, (x, (T* − āI)y) = 0. Consequently,
T* = āI.
(iv) Let M be a closed subspace of a Hilbert space H and PM the orthogonal
projection on M; recall that ||PM|| = 1 [(ii) of Remark 2.10.17].
The adjoint PM* of PM is PM itself. Indeed, for x1, x2 ∈ H with xi = yi + zi,
where yi ∈ M and zi ∈ M⊥, i = 1, 2, we have

(x1, PM*x2) = (PMx1, x2) = (y1, y2 + z2) = (y1, y2) = (y1 + z1, y2)
  = (x1, PMx2),

i.e., (x1, (PM* − PM)x2) = 0, which implies PM* = PM.
(v) Let H = L²(X, 𝔐, μ), where (X, 𝔐, μ) is a σ-finite measure space, and let y ∈
L^∞(X, 𝔐, μ) be an essentially bounded measurable function.
The multiplication operator T ∈ B(H) [see (vi) of Example 3.2.5] has adjoint
T* which is also a multiplication operator. The defining relation for T* is (x,
T*z) = (Tx, z), x, z ∈ H. Writing w = T*z, this reads

∫_X x(t)w̄(t) dμ(t) = ∫_X y(t)x(t)z̄(t) dμ(t), x, z ∈ H,

which implies

∫_X x(t)[w̄(t) − y(t)z̄(t)] dμ(t) = 0.

Since the above relation holds for all x ∈ H, it follows that T*z(t) =
ȳ(t)z(t) in H. Thus, the adjoint T* of the multiplication operator T is multiplication by the complex conjugate of y. In particular, if y is real-valued,
then T* = T.

(vi) Let H be a separable Hilbert space, let {ei}i≥1 be an orthonormal basis in H and let
T ∈ B(H) be the simple unilateral shift [see (vii) of Example 3.2.5]. The
defining relation for T* is (x, T*y) = (Tx, y), x, y ∈ H. Now, with x = Σk λkek
and y = Σk μkek,

(x, T*y) = (Tx, y) = (Σk λkek+1, Σk μkek)
  = Σk λkμ̄k+1
  = (Σk λkek, Σk μk+1ek).

As the above equality holds for all x, y ∈ H, it follows that

T*y = Σk≥2 μkek−1, where y = Σk≥1 μkek.

In particular, T*e1 = 0 and T*ei = ei−1, i = 2, 3, …. Thus, the adjoint of the
simple unilateral shift is

T*(Σk≥1 μkek) = Σk≥2 μkek−1.

(vii) If K is the integral operator with kernel k as in (viii) of Example 3.2.5, then
K* is the integral operator with kernel k*(s, t) = k̄(t, s). The defining relation
for K* is (x, K*y) = (Kx, y) for x, y ∈ L²(μ). Now,

(x, K*y) = (Kx, y) = ∫_X (∫_X k(s, t)x(t) dμ(t)) ȳ(s) dμ(s)
  = ∫_X ∫_X x(t)k(s, t)ȳ(s) dμ(s) dμ(t).

The reversal of the order of integration is justified by Fubini’s
Theorem [Theorem 1.3.14]. As this holds for all x and y in L²(μ), we must
have

K*y(t) = ∫_X k̄(s, t)y(s) dμ(s)

for almost all t, or, interchanging the roles of s and t,

K*y(s) = ∫_X k̄(t, s)y(t) dμ(t)

for almost all s. Thus, K* is the integral operator with kernel k*, where
k*(s, t) = k̄(t, s).
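The computation in (vi) can be mimicked on finitely supported sequences: the shift prepends a zero, its adjoint drops the first coordinate, and the defining relation (Tx, y) = (x, T*y) holds. The sketch below is not from the text; the sample sequences are arbitrary illustrative choices (padded so both inner products pair the same number of coordinates).

```python
# Sketch of (vi) of Example 3.5.10 on finitely supported sequences:
# the unilateral shift T(x1, x2, ...) = (0, x1, x2, ...) has adjoint
# T*(y1, y2, ...) = (y2, y3, ...), and (Tx, y) = (x, T*y).

def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def shift(x):
    return (0,) + x          # T: prepend a zero

def shift_adj(y):
    return y[1:]             # T*: drop the first coordinate

x = (1 + 1j, 2, -1j, 0.5)
y = (3, -1, 2j, 0, 1)

# pad x with a trailing zero so both sides pair the same number of coordinates
assert inner(shift(x), y) == inner(x + (0,), shift_adj(y))
```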

Remarks 3.5.11 The Laplace transform L : L²(ℝ₊) → L²(ℝ₊) with kernel k(s, t) = e^{−st}
[Problem 3.2.P7], defined by

Lx(s) = ∫₀^∞ x(t)e^{−st} dt,

is such that L* = L. Indeed, k*(s, t) = k̄(t, s) = e^{−st}.
Since ||S*S|| = ||S||² for S ∈ B(H), it follows that ||L∘L|| = ||L||². The mapping L∘L is
easily computed. For x ∈ L²(ℝ₊),

(LL)x(r) = ∫₀^∞ Lx(s)e^{−rs} ds = ∫₀^∞ (∫₀^∞ x(t)e^{−st} dt) e^{−rs} ds
  = ∫₀^∞ x(t) ∫₀^∞ e^{−(r+t)s} ds dt, using Fubini’s Theorem,
  = ∫₀^∞ x(t)/(r + t) dt.

We have thus proved the following result.


The integral operator

Hx(r) = ∫₀^∞ x(t)/(r + t) dt,

called the Hilbert–Hankel operator, is bounded as a map from L²(ℝ₊) to itself, and its norm equals ||L||² = π.
Problem Set 3.5

3.5.P1. Let {μn}n≥1 be a bounded sequence of complex numbers and M = sup{|μk| :
k ≥ 1}. Show that there exists one and only one operator T on a Hilbert
space H such that

(a) Tek = μkek for all k, where {ek}k≥1 is an orthonormal basis in H;

(b) T(Σk λkek) = Σk λkμkek;

(c) ||T|| = M;

(d) T*ek = μ̄kek for all k;

(e) T*(Σk λkek) = Σk λkμ̄kek; and

(f) T*T = TT*.

3.6 Some Special Classes of Operators

The adjoint operation in B(H) in a way extends the conjugation operation in the
complex numbers. Unlike conjugation in the complex numbers, the adjoint operation in
B(H) does not preserve the product.
Those operators T for which T*T = TT* have “decent” properties. Such operators
and suitable subsets of them will be studied in this section.

Definition 3.6.1 If T ∈ B(H), then

(a) T is Hermitian or self-adjoint if T* = T;
(b) T is unitary if T is bijective and T* = T⁻¹; and
(c) T is normal if T*T = TT*.

Remarks 3.6.2
(i) In the analogy between the adjoint and the conjugate, Hermitian operators
are the analogues of real numbers, and unitaries are the analogues of complex
numbers of absolute value 1. Normal operators are the true analogues of
complex numbers: note that

T = (T + T*)/2 + i·(T − T*)/(2i),

where (T + T*)/2 and (T − T*)/(2i) are self-adjoint and T* = (T + T*)/2 − i·(T − T*)/(2i). The operators (T + T*)/2
and (T − T*)/(2i) are called the real and imaginary parts of T.
(ii) If T is self-adjoint or unitary, then T is normal. However, a normal operator
need not be self-adjoint or unitary. First note that I, the identity operator in
B(H), is self-adjoint. The operator T = 2iI is such that T* = −2iI; so, TT* =
4I = T*T, but T* ≠ T and T⁻¹ = −½iI ≠ T*.
From Examples 3.2.5 and 3.5.10, we can readily produce some
infinite-dimensional operators satisfying conditions (a), (b) and (c) of
Definition 3.6.1.
(iii) If T ∈ B(H), where H is a separable Hilbert space, and T is defined by the matrix
M = [ai,j]i,j≥1 with respect to an orthonormal basis {en}n≥1 (ai,j = (Tej, ei), i, j =
1, 2, …), then T* is defined by M̄ᵗ = [āj,i]i,j≥1 with respect to the same basis.
Thus, T is self-adjoint if, and only if, ai,j = āj,i, i, j = 1, 2, …, that is, M = M̄ᵗ.
Since M̄ᵗM = [Σn ān,ian,j]i,j≥1 and MM̄ᵗ = [Σn ai,nāj,n]i,j≥1 with respect to the
basis {en}n≥1, it follows that T is unitary if, and only if, M̄ᵗM = I = MM̄ᵗ,
that is,

Σn ān,ian,j = δi,j = Σn ai,nāj,n

for all i, j = 1, 2, …, where δi,j is 1 if i = j and zero otherwise. This says that
the columns of M form an orthonormal set in ℓ² and so do its rows. Next, T is
normal if, and only if, M̄ᵗM = MM̄ᵗ. This is certainly the case if M is a
diagonal matrix.
(iv) If T denotes the operator of multiplication by y ∈ L^∞(μ) (notations as in (vi) of
Example 3.2.5 and (v) of Example 3.5.10), then T is normal; T is Hermitian if,
and only if, y is real-valued; T is unitary if, and only if, |y| = 1 a.e.
(v) By (viii) of Example 3.2.5 and (vii) of Example 3.5.10, the integral operator
K with kernel k is self-adjoint if, and only if, k(s, t) = k̄(t, s) a.e. [μ × μ].
(vi) [(vii) of Example 3.2.5 and (vi) of Example 3.5.10] If T ∈ B(ℓ²) is the simple
unilateral shift, then T*Te1 = T*e2 = e1 and TT*e1 = T0 = 0; so, T*T ≠ TT*, that is, T is
not a normal operator.
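The classification in Definition 3.6.1 and the real/imaginary-part decomposition of (i) above can be checked on small matrices. The sketch below is not from the text; the matrices N and T are arbitrary illustrative choices, with the adjoint computed as the conjugate transpose.

```python
# Checks of Remarks 3.6.2(i): a normal but non-self-adjoint matrix, and the
# decomposition T = Re + i*Im with Re = (T+T*)/2 and Im = (T-T*)/(2i) self-adjoint.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adj(A):
    # adjoint = conjugate transpose
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

N = [[1, 1j], [1j, 1]]
assert matmul(adj(N), N) == matmul(N, adj(N))   # N is normal
assert adj(N) != N                              # but not self-adjoint

T = [[2, 3 - 1j], [0, 1j]]
Re = [[(T[i][j] + adj(T)[i][j]) / 2 for j in range(2)] for i in range(2)]
Im = [[(T[i][j] - adj(T)[i][j]) / 2j for j in range(2)] for i in range(2)]

assert adj(Re) == Re and adj(Im) == Im          # both parts are self-adjoint
recon = [[Re[i][j] + 1j * Im[i][j] for j in range(2)] for i in range(2)]
assert recon == T                               # T = Re + i*Im
```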
The following is an important and rather simple criterion for self-adjointness in
the complex case.
Theorem 3.6.3 Let T ∈ B(H). Then:
(a) If T is self-adjoint, (Tx, x) is real for all x ∈ H.
(b) If H is a complex Hilbert space and (Tx, x) is real for all x ∈ H, the operator T
is self-adjoint.

Proof For x, y ∈ H and T ∈ B(H), B(x, y) = (Tx, y) is a sesquilinear form. The
conclusion now follows from (ii) of Remarks 3.4.4. □
Remarks 3.6.4
(i) Part (b) of the preceding theorem is false if it is only assumed that H is a
real Hilbert space. For example, if T is the operator on ℝ² with matrix
[0 1; −1 0], then (Tx, x) = 0 for
all x ∈ ℝ². However, T* has matrix [0 −1; 1 0], so T* ≠ T.
(ii) If S and T are bounded self-adjoint operators on a Hilbert space H, then so is
aS + bT, where a and b are real numbers. Thus, the collection of all
self-adjoint operators is a real vector space, which we shall denote by S(H).
(iii) If T ∈ B(H), then T*T and T + T* are self-adjoint.
(iv) If S, T ∈ B(H) are self-adjoint, then ST is self-adjoint if, and only if, ST = TS.
Indeed, (ST)* = T*S* = TS; so, ST = (ST)* if, and only if, ST = TS.
Sequences of self-adjoint operators occur in various problems. For them, the
following holds.
Theorem 3.6.5 Let {Tn}n≥1 be a sequence of bounded self-adjoint linear operators on a Hilbert space H. Suppose {Tn}n≥1 converges in the uniform (operator
norm) topology, say limn Tn = T, i.e. limn ||Tn − T|| = 0. Then the limit operator T is a bounded self-adjoint
operator on H.
Proof Clearly, T is a bounded linear operator. It is enough to show that T* = T. It
follows from Theorems 3.5.2 and 3.5.4 that

||Tn* − T*|| = ||(Tn − T)*|| = ||Tn − T||.

Therefore, T* = limn Tn* = limn Tn = T. □


The following result is important for the discussion of “spectral theory”.
Theorem 3.6.6 If T ∈ B(H) is self-adjoint, then

||T|| = sup{|(Tx, x)| : ||x|| ≤ 1} = sup{|(Tx, x)| : ||x|| = 1}.

(The latter formulation will be needed in 3.7.4.)

Proof Define B(x, y) = (Tx, y), x, y ∈ H; B is a bounded sesquilinear form with
||B|| = ||T|| [(ii) of Example 3.4.7]. Since B(y, x) = (Ty, x) = (y, Tx), which is the complex
conjugate of (Tx, y) = B(x, y), the form B is Hermitian. Hence, by Corollary 3.4.11,

||B|| = sup{|B(x, x)|/||x||² : x ∈ H, x ≠ 0}
  = sup{|B(x, x)| : x ∈ H, ||x|| ≤ 1}. □

Corollary 3.6.7 If T ∈ B(H) is such that T = T* and (Tx, x) = 0 for all x ∈ H, then T
= O.
Remark 3.6.8 The above corollary is not true without the hypothesis T = T*; see (i) of Remarks
3.6.4. However, if the Hilbert space under consideration is complex, then the
hypothesis T = T* can be deleted. In fact, the following holds.
Proposition 3.6.9 If H is a complex Hilbert space and T ∈ B(H) is such that (Tx,
x) = 0 for all x ∈ H, then T = O.
Proof For x, y ∈ H, the following equality is easily verified:

(Tx, y) = ¼{(T(x + y), x + y) − (T(x − y), x − y)
  + i(T(x + iy), x + iy) − i(T(x − iy), x − iy)}.

Since (Tx, x) = 0 for all x ∈ H, it follows that (Tx, y) = 0 for x, y ∈ H. Setting y =
Tx, we obtain

||Tx|| = 0 for all x ∈ H, that is,

Tx = 0 for all x ∈ H. Consequently, T = O. □


The notion of positive definite matrix is familiar from linear algebra; it has a
natural generalisation to infinite dimensions.
Definition 3.6.10 Let T 2 B(H) be such that T* = T. If for each x 2 H, (Tx, x)  0,
we say that T is positive semidefinite. If (Tx, x) > 0 for all nonzero x 2 H, we say
that T is positive definite. Alternatively, these are known as positive and strictly
positive operators.

Remarks 3.6.11
(i) If T is any operator on a complex Hilbert space, then the condition (Tx, x) ≥
0 for all x ∈ H implies that T is self-adjoint. However, in a real Hilbert space, this
is not true. Indeed, the operator T on ℝ² defined by the matrix [1 1; −1 1] is
not self-adjoint, but (Tx, x) = x1² + x2² ≥ 0 for all x = (x1, x2) ∈ ℝ² [see also
(i) of Remarks 3.6.4].
(ii) We write T ≥ O to mean that T is positive. The collection of all positive
operators is a positive cone: if S ≥ O and T ≥ O, then for all nonnegative real
numbers a and b, we have aS + bT ≥ O. This defines a partial order on the
collection S(H) of self-adjoint operators: S ≥ T if, and only if, S − T ≥
O. Also, if S1 ≥ T1 and S2 ≥ T2, then S1 + S2 ≥ T1 + T2.
(iii) If T ∈ B(H) is any operator, then T*T and TT* are positive. Indeed, (T*Tx, x) =
(Tx, Tx) = ||Tx||² ≥ 0 for all x ∈ H. The argument that TT* is positive is
similar.
(iv) If A = [2 1; 1 1] and B = [1 1; 1 1], then it can be checked that A ≥
B. Indeed, A − B = [1 0; 0 0] ≥ O. However, the relation A² ≥ B² is false. In
fact, A² = [5 3; 3 2] and B² = [2 2; 2 2], so A² − B² = [3 1; 1 0], and this does
not represent a positive operator, as can be easily verified by considering the
vector (1, −2).
(v) The multiplication operator T : L²[0, 1] → L²[0, 1] defined by

Tx(t) = tx(t), 0 < t < 1,

is a positive operator, since

(Tx, x) = ∫₀¹ t|x(t)|² dt ≥ 0

for any x ∈ L²[0, 1].
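The counterexample in (iv) above is easy to confirm numerically: A − B is positive semidefinite, yet A² − B² takes a negative value on the vector (1, −2) mentioned in the text. The sketch below uses exactly those matrices.

```python
# Numeric confirmation of Remarks 3.6.11(iv): with A = [2 1; 1 1] and
# B = [1 1; 1 1], A - B is positive but A^2 - B^2 is not.

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def quad(M, x):
    # computes (Mx, x) for a real symmetric 2x2 matrix M and real vector x
    Mx = [sum(M[i][j] * x[j] for j in range(2)) for i in range(2)]
    return sum(Mx[i] * x[i] for i in range(2))

A = [[2, 1], [1, 1]]
B = [[1, 1], [1, 1]]

D = [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]  # A - B = [1 0; 0 0]
for x in [(1, 0), (0, 1), (1, -2), (3, 2)]:
    assert quad(D, x) >= 0          # A - B is positive semidefinite

A2, B2 = matmul(A, A), matmul(B, B)
D2 = [[A2[i][j] - B2[i][j] for j in range(2)] for i in range(2)]  # [3 1; 1 0]
assert quad(D2, (1, -2)) < 0        # A^2 - B^2 is NOT positive
```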


It was pointed out in (ii) of Remarks 3.6.11 that the sum of positive operators is
positive. Let us turn to products. From (iv) of Remarks 3.6.4, we know that a product
of bounded self-adjoint operators is self-adjoint if, and only if, the operators
commute. We shall see below that the product of two positive operators is positive
if, and only if, the operators commute.
Theorem 3.6.12 If S, T ∈ B(H), where H is a complex Hilbert space, are such that
S ≥ O and T ≥ O, then their product ST is positive if, and only if, ST = TS.
Proof The “only if” part is trivial in view of (iv) of Remarks 3.6.4.
To prove the “if” part, we suppose ST = TS and show that (STx, x) ≥ 0 for all x ∈
H. If S = O, the inequality holds. Let S ≠ O. Set S1 = S/||S||, S2 = S1 − S1², …, Sn+1 =
Sn − Sn², …. Note that each Si is self-adjoint. We shall show that, for each i = 1, 2, …,
O ≤ Si ≤ I.
For i = 1 and x ∈ H, (S1x, x) = ((S/||S||)x, x) = (Sx, x)/||S|| ≤ ||Sx|| ||x||/||S|| ≤ ||x||² =
(x, x); so, ((I − S1)x, x) ≥ 0, and S1 ≥ O since S ≥ O. Thus, the result is true for i = 1.
Assume that O ≤ Sk ≤ I. Then, (Sk²(I − Sk)x, x) = ((I − Sk)Skx, Skx) ≥ 0, that is,
Sk²(I − Sk) ≥ O. Similarly, it can be shown that Sk(I − Sk)² ≥ O. Consequently,
Sk+1 = Sk²(I − Sk) + Sk(I − Sk)² ≥ O and I − Sk+1 = (I − Sk) + Sk² ≥ O by the
induction hypothesis and the fact that Sk² ≥ O whenever Sk is self-adjoint. This completes
the argument that O ≤ Sk ≤ I for every k.
We now consider the general case. Observe that S1 = S1² + S2 = S1² + S2² + S3 = ⋯
= S1² + S2² + ⋯ + Sn² + Sn+1.
Since Sn+1 ≥ O, this implies

S1² + S2² + ⋯ + Sn² = S1 − Sn+1 ≤ S1.   (3.28)

By the definition of ≤ and the fact that Si = Si*, this means that

Σi≤n ||Six||² = Σi≤n (Six, Six) = Σi≤n (Si²x, x) ≤ (S1x, x).

Since n is arbitrary, the infinite series Σi≥1 ||Six||² converges, which implies
||Six|| → 0 and hence Six → 0. By (3.28),

(Σi≤n Si²)x = (S1 − Sn+1)x → S1x as n → ∞.   (3.29)

Observe that all the Si commute with T, since they are sums and products of
S1 = ||S||⁻¹S, and S and T commute. Finally,

(STx, x) = ||S||(S1Tx, x)
  = ||S||(TS1x, x)
  = ||S||(T limn Σi≤n Si²x, x)
  = ||S|| limn Σi≤n (TSi²x, x)
  = ||S|| limn Σi≤n (TSix, Six)
  ≥ 0,

using S = ||S||S1, (3.29) and T ≥ O. Thus,

(STx, x) ≥ 0 for all x ∈ H. □

In (ii) of Remarks 3.6.11, it was pointed out that the collection of positive
operators on a Hilbert space H is a positive cone in S(H). The positive cone induces
a partial order in S(H). This leads to the following definition.
Definition 3.6.13 Let {Tn}n≥1 be a sequence of bounded linear self-adjoint
operators on a Hilbert space H, i.e. Tn ∈ B(H), n = 1, 2, …. The sequence
{Tn}n≥1 is said to be increasing [resp. decreasing] if T1 ≤ T2 ≤ ⋯ [resp. T1 ≥
T2 ≥ ⋯].
An increasing [resp. decreasing] sequence {Tn}n≥1 in B(H) has the following
remarkable property, which follows from Theorem 3.4.8 proved above.
Theorem 3.6.14 Let {Tn}n≥1 be an increasing sequence of bounded self-adjoint linear
operators on a Hilbert space H that is bounded from above, that is,

T1 ≤ T2 ≤ ⋯ ≤ Tn ≤ ⋯ ≤ aI,

where a is a real number. Then {Tn}n≥1 is strongly convergent.


Proof For each x ∈ H, the sequence {(T_nx, x)}_{n≥1} of real numbers is bounded from above by a||x||². So lim_n(T_nx, x) exists and equals f(x), say. Being a limit of quadratic forms [see Problem 3.4.P2], this is again a quadratic form, that is, there exists a sesquilinear form B(x, y) on H such that f(x) = B(x, x). Clearly, B is bounded. By Theorem 3.4.8, there exists a self-adjoint operator T such that f(x) = (Tx, x). It remains to show that lim_n ||(T_n − T)x|| = 0 for each x ∈ H.
Without loss of generality, we may assume that T_1 ≥ O by replacing each T_i by T_i − T_1 and a by 2a. Then for n > m, we have O ≤ T_n − T_m ≤ aI. This shows that

||T_n − T_m|| = sup_{||x||=1} ((T_n − T_m)x, x) ≤ a.

Using the generalised Cauchy–Schwarz inequality [Theorem 3.4.5 with B(x, y) = (Ax, y), where A is a positive operator], we get for each x and y = (T_n − T_m)x,

||T_nx − T_mx||⁴ = [((T_n − T_m)x, (T_n − T_m)x)]²
                = [((T_n − T_m)x, y)]²
                ≤ ((T_n − T_m)x, x)((T_n − T_m)y, y)
                = ((T_n − T_m)x, x)((T_n − T_m)²x, (T_n − T_m)x)
                ≤ ((T_n − T_m)x, x)||T_n − T_m|| ||(T_n − T_m)x||²
                ≤ ((T_n − T_m)x, x) a³||x||².

Since lim_n(T_nx, x) = (Tx, x), it follows that lim_{n,m}((T_n − T_m)x, x) = 0. So the left-hand side of the above inequality tends to zero as n, m → ∞, i.e.
3.6 Some Special Classes of Operators 199

lim_{n,m} ||T_nx − T_mx|| = 0.

Hence, {T_nx}_{n≥1} is a Cauchy sequence and lim_nT_nx = Bx, say, exists. Obviously, Bx depends linearly on x. Moreover, 0 ≤ (T_nx, x) ≤ a(x, x), and so it follows that 0 ≤ (Bx, x) ≤ a||x||², which implies that B is a bounded linear operator. Finally, (Bx, x) = lim_n(T_nx, x) = (Tx, x) for every x ∈ H, so B = T and lim_n ||(T_n − T)x|| = 0. ∎
Recall that if T ∈ B(H), then T*T ≥ O, since (T*Tx, x) = ||Tx||² ≥ 0 [(iii) of Remark 3.6.11]. Just as |z| = √(z̄z), we would like to define |T| = √(T*T). This requires the notion of square roots of positive operators. We begin with a Lemma.

Lemma 3.6.15 The power series for the function √(1 − z) about z = 0 converges absolutely for all complex numbers in the closed unit disc {z ∈ ℂ : |z| ≤ 1}.
Proof Since the function f(z) = √(1 − z) is holomorphic in the open unit disc {z ∈ ℂ : |z| < 1}, it can be expanded in a Taylor series about z = 0:

f(z) = Σ_{n=0}^∞ a_nz^n, where a_n = f^{(n)}(0)/n!.

Note that the series converges absolutely in the open unit disc and the derivatives at the origin are all negative:

f'(z) = −(1/2)(1 − z)^{−1/2},  f''(z) = −(1/2²)(1 − z)^{−3/2},  …,
f^{(n)}(z) = −((1·3·⋯·(2n − 3))/2^n)(1 − z)^{−n+1/2},  ….

So the a_n are all negative for n ≥ 1. Thus,

Σ_{k=0}^n |a_k| = 2 − Σ_{k=0}^n a_k
              = 2 − lim_{x→1⁻} Σ_{k=0}^n a_kx^k
              ≤ 2 − lim_{x→1⁻} √(1 − x)
              = 2,

where lim_{x→1⁻} means that the limit is being taken as x → 1 from the left. The sequence of partial sums {Σ_{k=0}^n |a_k|}_{n≥1} on the left is increasing and is bounded above by 2. It follows that Σ_{k=0}^∞ |a_k| ≤ 2, which implies that the series Σ_{k=0}^∞ a_kz^k converges absolutely for |z| ≤ 1. This proves the Lemma. ∎
Now consider the Cauchy product of the above power series with itself, which is

Σ_{k=0}^∞ b_kz^k, where b_k = Σ_{j=0}^k a_ja_{k−j} for each k.

It converges absolutely and its sum is the product (√(1 − z))² = 1 − z. See Theorem 15 on p. 51 of [28]. This means that, if

P_n(z) = Σ_{k=0}^n b_kz^k and Q_n(z) = Σ_{k=0}^n a_kz^k,

then |Q_n(z)² − P_n(z)| → 0 as n → ∞ for |z| ≤ 1. By a computation (best avoided on paper), one can see that the polynomial Q_n(z)² − P_n(z) has coefficients that are sums of products of only those a_j with j ≥ 1. As noted in the course of the proof of the above Lemma, these a_j are all negative, and hence the coefficients of Q_n(z)² − P_n(z) are all positive. It follows for any bounded linear operator T that ||Q_n(T)² − P_n(T)|| ≤ |Q_n(||T||)² − P_n(||T||)|. In particular, whenever ||T|| ≤ 1, we have ||Q_n(T)² − P_n(T)|| → 0 as n → ∞. That is to say, the Cauchy product Σ_{k=0}^∞ b_kT^k of Σ_{k=0}^∞ a_kT^k with itself converges in norm to (Σ_{k=0}^∞ a_kT^k)², provided that ||T|| ≤ 1.
On the other hand, since the Cauchy product Σ_{k=0}^∞ b_kz^k of Σ_{k=0}^∞ a_kz^k with itself converges to 1 − z (as noted at the beginning of the preceding paragraph), the uniqueness of the power series of any holomorphic function implies that b_0 = 1 = −b_1 and b_k = 0 for k > 1. Therefore, (Σ_{k=0}^∞ a_kT^k)² = Σ_{k=0}^∞ b_kT^k = I − T.
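As an illustrative numerical check, not part of the text, the coefficients a_k of √(1 − z) satisfy the recursion a_k = a_{k−1}(k − 3/2)/k by the binomial series, and the partial sums of Σ a_k(I − T)^k can be formed for a positive matrix T with spectrum in (0, 1); all names below are ad hoc.

```python
import numpy as np

# Coefficients a_k of the Taylor series of sqrt(1 - z) about z = 0:
# a_0 = 1 and a_k = a_{k-1} * (k - 3/2) / k, so a_k < 0 for every k >= 1,
# as in the proof of Lemma 3.6.15.
def sqrt_series_coeffs(n):
    a = [1.0]
    for k in range(1, n):
        a.append(a[-1] * (k - 1.5) / k)
    return a

a = sqrt_series_coeffs(300)

# A positive matrix T with spectrum inside (0, 1), so that O <= T <= I.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
T = (Q * [0.2, 0.5, 0.8, 0.95]) @ Q.T

# S = sum_k a_k (I - T)^k should be the positive square root of T.
I, S, P = np.eye(4), np.zeros((4, 4)), np.eye(4)
for c in a:
    S = S + c * P
    P = P @ (I - T)

err = np.linalg.norm(S @ S - T)                      # should be tiny
min_eig = np.linalg.eigvalsh((S + S.T) / 2).min()    # S should be positive
```

The truncation error is controlled exactly as in the text: the tail of Σ|a_k| multiplied by powers of ||I − T|| < 1.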
Theorem 3.6.16 Let T ∈ B(H) and T ≥ O. Then there is a unique S ∈ B(H) with S ≥ O and S² = T. Furthermore, S commutes with every bounded operator which commutes with T.

Proof If T = O, then take S = O. We may next assume, without loss of generality, that ||T|| ≤ 1. Indeed, for any positive T and x ∈ H,

(Tx, x) ≤ ||Tx|| ||x|| ≤ ||T|| ||x||² = ||T||(x, x),

which implies

((T/||T||)x, x) ≤ (x, x), x ∈ H,

and therefore T/||T|| ≤ I. Assuming we have already proved the Theorem for this case, we could then assert the existence of a positive operator S such that S² = T/||T||. From this, it follows that ||T||^{1/2}S is a positive square root of T.
Since I − T is self-adjoint, it follows from (ii) of Example 3.4.7 and Corollary 3.4.11 that

||I − T|| = sup_{||x||≠0} |((I − T)x, x)|/||x||² = sup_{||x||=1} |((I − T)x, x)| ≤ 1.

The above Lemma now implies that the series

I + a_1(I − T) + a_2(I − T)² + ⋯    (3.30)

converges in norm to an operator S. From what has been noted just before the statement of this Theorem, it also follows that S² = I − (I − T) = T. Furthermore, since O ≤ I − T ≤ I, we have

0 ≤ ((I − T)^n x, x) ≤ 1

for all x ∈ H with ||x|| = 1. Thus,

(Sx, x) = 1 + Σ_{n=1}^∞ a_n((I − T)^n x, x)
        ≥ 1 + Σ_{n=1}^∞ a_n, using a_n < 0 for all n ≥ 1,
        = 0,

since the value of the sum of the series 1 + Σ_{n=1}^∞ a_nz^n at z = 1, which is 1 + Σ_{n=1}^∞ a_n, is zero. Thus, S ≥ O.
From here onwards, we do not need the restriction that ||T|| ≤ 1. We next check that S commutes with every operator that commutes with T. Let V ∈ B(H) be such that VT = TV. Then V(I − T)^n = (I − T)^nV and consequently VS = SV. It remains to show that S is unique.
Suppose there is S′ with S′ ≥ O and (S′)² = T. Then, since

S′T = (S′)³ = TS′,

S′ commutes with T and thus with S. Therefore,

(S − S′)S(S − S′) + (S − S′)S′(S − S′) = (S² − S′²)(S − S′) = O.    (3.31)

Since both terms on the left of (3.31) are positive and their sum is O, they must both be zero; so their difference (S − S′)³ = O. Since S − S′ is self-adjoint, it follows that

||S − S′||² = ||(S − S′)(S − S′)|| = ||(S − S′)²||

and ||S − S′||⁴ = ||(S − S′)²||² = ||(S − S′)⁴||. Since (S − S′)⁴ = (S − S′)³(S − S′) = O, we conclude that S − S′ = O. ∎



Example 3.6.17
(i) In L²[0, 1], the multiplication operator

(Tx)(t) = tx(t), 0 < t < 1, x ∈ L²[0, 1],

has the square root S, where

(Sx)(t) = √t x(t), 0 < t < 1, x ∈ L²[0, 1].

(ii) For a > 0, the 2 × 2 matrix

T = [ a   1   ]
    [ 1   a⁻¹ ]

is positive. Indeed,

(Tx, x) = ((ax_1 + x_2, x_1 + a⁻¹x_2), (x_1, x_2))
        = a|x_1|² + x_1x̄_2 + x̄_1x_2 + a⁻¹|x_2|²
        = |√a x_1 + √(a⁻¹) x_2|² ≥ 0 for all vectors (x_1, x_2) ∈ ℂ².


In what follows, we shall determine the square root of the matrix T. The characteristic values are the roots of the equation det(λI − T) = 0. These roots are 0 and a + a⁻¹, and the corresponding eigenvectors are

[ a⁻¹ ]       [ a ]
[ −1  ]  and  [ 1 ],

respectively. If V is the matrix

V = [ a⁻¹   a ]
    [ −1    1 ],

then

TV = [ 0   a² + 1  ]
     [ 0   a + a⁻¹ ].

Consequently,

V⁻¹TV = [ 0   0       ],   where V⁻¹ = (a + a⁻¹)⁻¹ [ 1   −a  ]
        [ 0   a + a⁻¹ ]                             [ 1   a⁻¹ ].

Hence,

T^{1/2} = V [ 0   0               ] V⁻¹ = (a + a⁻¹)^{−1/2} [ a   1   ]
            [ 0   (a + a⁻¹)^{1/2} ]                         [ 1   a⁻¹ ].
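The closed-form square root just computed is easy to verify numerically; this is only an illustration, with an arbitrary value of a.

```python
import numpy as np

a = 2.7
T = np.array([[a, 1.0], [1.0, 1.0 / a]])
S = (a + 1.0 / a) ** -0.5 * T      # the claimed square root (a + a^{-1})^{-1/2} T

# T is rank one with eigenvalues 0 and a + a^{-1}, hence T^2 = (a + a^{-1}) T,
# which is exactly why this multiple of T squares back to T.
eigs = np.linalg.eigvalsh(T)       # ascending: [0, a + 1/a]
```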
(iii) Using (ii) above, we may guess that the square root of the operator matrix

[ T   I   ]
[ I   T⁻¹ ] ∈ B(H ⊕ H),

where T is a positive invertible operator in B(H), is

[ (T + T⁻¹)^{−1/2}T    (T + T⁻¹)^{−1/2}    ]
[ (T + T⁻¹)^{−1/2}     (T + T⁻¹)^{−1/2}T⁻¹ ].

Note that T + T⁻¹ is invertible by Theorem 3.5.9, because it is self-adjoint and is also bounded below in view of the fact that

||(T + T⁻¹)x||² = ||Tx||² + ||T⁻¹x||² + 2||x||² ≥ 2||x||².

Now, it follows on using matrix multiplication (all the entries commute with one another, being functions of T) that the square of the proposed matrix equals

(T + T⁻¹)⁻¹ [ T² + I     T + T⁻¹ ]
            [ T + T⁻¹    T⁻² + I ],

which, since T² + I = (T + T⁻¹)T and T⁻² + I = (T + T⁻¹)T⁻¹, reduces to

[ T   I   ]
[ I   T⁻¹ ].
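A matrix analogue of (iii) can be tested numerically, with a random symmetric positive definite matrix in place of T; `pos_power` is an ad hoc helper computing fractional powers through the spectral decomposition.

```python
import numpy as np

def pos_power(A, p):
    # Fractional power of a symmetric positive definite matrix A.
    w, V = np.linalg.eigh(A)
    return (V * w ** p) @ V.T

rng = np.random.default_rng(1)
B = rng.standard_normal((3, 3))
T = B @ B.T + 3.0 * np.eye(3)           # positive and invertible
Tinv = np.linalg.inv(T)

R = pos_power(T + Tinv, -0.5)           # (T + T^{-1})^{-1/2}; commutes with T
root = np.block([[R @ T, R], [R, R @ Tinv]])
target = np.block([[T, np.eye(3)], [np.eye(3), Tinv]])
```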

Theorem 3.6.18 If T ∈ B(H) is self-adjoint and n ∈ ℕ, then ||T^n|| = ||T||^n.

Proof When T = O, there is nothing to prove. So we may take ||T||^m > 0 for all m ∈ ℕ. The case n = 1 is trivial. For n = 2, the desired equality follows from

||T²|| = ||T*T|| = ||T||².

This says that, when k = 1, the equality ||T^{2^k}|| = ||T||^{2^k} holds. Assume this for some k ∈ ℕ. Then,

||T^{2^{k+1}}|| = ||(T^{2^k})²|| = ||(T^{2^k})*(T^{2^k})|| = ||T^{2^k}||² = (||T||^{2^k})² = ||T||^{2^{k+1}}.

It follows by induction that

||T^{2^k}|| = ||T||^{2^k} for all k ∈ ℕ.

Now consider an arbitrary n ∈ ℕ. Choose k ∈ ℕ such that n < 2^k, and put m = 2^k − n. Then 0 ≤ ||T^m|| ≤ ||T||^m ≠ 0 and 0 ≤ ||T^n|| ≤ ||T||^n. If it were to be the case that ||T^n|| < ||T||^n, then it would follow that

||T||^{2^k} = ||T^{2^k}|| = ||T^{n+m}|| ≤ ||T^n|| ||T^m|| < ||T||^n ||T||^m = ||T||^{n+m} = ||T||^{2^k},

contradicting what was proved above by induction. Thus, ||T^n|| = ||T||^n for all n ∈ ℕ. ∎
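Theorem 3.6.18 can be spot-checked numerically; self-adjointness matters, as the nilpotent matrix below shows (illustration only).

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((5, 5))
T = B + B.T                                   # self-adjoint
op_norm = lambda A: np.linalg.norm(A, 2)      # operator (spectral) norm

lhs = op_norm(np.linalg.matrix_power(T, 3))   # ||T^3||
rhs = op_norm(T) ** 3                         # ||T||^3

# For a non-self-adjoint operator the equality can fail: N != O but N^2 = O.
N = np.array([[0.0, 1.0], [0.0, 0.0]])
```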
Theorem 3.6.19 If T ∈ B(H) is positive, then the sesquilinear form defined by (Tx, y) is nonnegative and satisfies

|(Tx, y)|² ≤ (Tx, x)(Ty, y) for all x, y ∈ H.

Proof It is trivial that (Tx, y) defines a nonnegative sesquilinear form. The inequality now follows from Theorem 3.4.5. ∎

As an application of the above Theorem, we show for a positive operator T and any positive integer k that

(T²x, x) ≤ (Tx, x)^{1/2 + 1/4 + ⋯ + 1/2^k} (T^{2^k+1}x, x)^{1/2^k}.

Taking y = Tx in the inequality of Theorem 3.6.19, we get

(T²x, x)² ≤ (Tx, x)(T³x, x)

and hence,

(T²x, x) ≤ (Tx, x)^{1/2}(T³x, x)^{1/2}.

This means the inequality in question is true with k = 1. In order to prove it by induction, assume it true for some k. Taking y = T^{2^k}x in the inequality of Theorem 3.6.19, we get

(T^{2^k+1}x, x)² ≤ (Tx, x)(T^{2^k+1}x, T^{2^k}x) = (Tx, x)(T^{2^{k+1}+1}x, x).

Taking the root of order 2^{k+1} on both sides and combining with the induction hypothesis, we find that

(T²x, x) ≤ (Tx, x)^{1/2 + 1/4 + ⋯ + 1/2^k + 1/2^{k+1}} (T^{2^{k+1}+1}x, x)^{1/2^{k+1}}.

This completes the proof by induction. ∎
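A numerical spot-check of this inequality for a random positive matrix and k = 1, …, 4 (illustration only; the quadratic form is evaluated at one fixed random vector):

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4))
T = B @ B.T                                  # a positive operator on R^4
x = rng.standard_normal(4)
form = lambda A: x @ (A @ x)                 # the quadratic form (Ax, x)

checks = []
for k in range(1, 5):
    e = sum(2.0 ** -j for j in range(1, k + 1))      # 1/2 + 1/4 + ... + 1/2^k
    lhs = form(np.linalg.matrix_power(T, 2))
    rhs = form(T) ** e * form(np.linalg.matrix_power(T, 2 ** k + 1)) ** (2.0 ** -k)
    checks.append(lhs <= rhs * (1.0 + 1e-9))
```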



Problem Set 3.6

3.6.P1. If H = ℂⁿ, show that the set of invertible matrices is dense in the space of all matrices.
3.6.P2. Let T ∈ B(H), where H = ℂⁿ, and let {e_k : k = 1, 2, …, n} be an orthonormal basis for H. Then T has the matrix representation [a_ij] and T* has the representation [ā_ji] with respect to the given orthonormal basis. Show that if the basis is not orthonormal, then this relation between the matrix representations need not hold.
3.6.P3. Let T : X → X be a bounded linear operator on a complex inner product space X. If (Tx, x) = 0 for all x ∈ X, show that T = O. Show that this does not hold in the case of a real inner product space.
3.6.P4. Let the operator T : ℂ² → ℂ² be defined by Tx = 〈ξ_1 + iξ_2, ξ_1 − iξ_2〉, where x = 〈ξ_1, ξ_2〉. Find T*. Show that T*T = TT* = 2I. Find T_1 = (1/2)(T + T*) and T_2 = (1/2i)(T − T*).
3.6.P5. Let Σ_{n=0}^∞ a_nz^n be a power series with radius of convergence R, 0 < R ≤ ∞. If A ∈ B(H) and ||A|| < R, show that there is an operator T ∈ B(H) such that for any x, y ∈ H, (Tx, y) = Σ_{n=0}^∞ a_n(A^n x, y). Moreover, T is unique. If BA = AB, then show that BT = TB. [When the sum of the series Σ_{n=0}^∞ a_nz^n is denoted by f(z), the operator T is denoted by f(A).]
3.6.P6. Let H be a Hilbert space and A ∈ B(H). Define the operator B on H ⊕ H by

B = [ O      iA ]
    [ −iA*   O  ].

Prove that B is self-adjoint and ||B|| = ||A||.
3.6.P7. If T ∈ B(H), show that T + T* ≥ O if, and only if, (T + I) is invertible in B(H) and ||(T − I)(T + I)⁻¹|| ≤ 1.

3.7 Normal, Unitary and Isometric Operators

The true analogues of complex numbers are the normal operators. The following
Theorem gives a characterisation of these operators.
Theorem 3.7.1 If T ∈ B(H), the following are equivalent:
(a) T is normal;
(b) ||Tx|| = ||T*x|| for all x ∈ H.
If H is a complex Hilbert space, then these statements are also equivalent to:
(c) the real and imaginary parts of T commute, i.e.

T_1T_2 = T_2T_1, where T_1 = (T + T*)/2 and T_2 = (T − T*)/2i.

Proof If x ∈ H, then

||Tx||² − ||T*x||² = (Tx, Tx) − (T*x, T*x) = (T*Tx, x) − (TT*x, x) = ((T*T − TT*)x, x).

Since T*T − TT* is Hermitian, it follows on using Corollary 3.6.7 that (a) and (b) are equivalent.
We next show that (a) and (c) are equivalent:

T*T = (T_1 − iT_2)(T_1 + iT_2) = T_1² + i(T_1T_2 − T_2T_1) + T_2²,
TT* = (T_1 + iT_2)(T_1 − iT_2) = T_1² + i(T_2T_1 − T_1T_2) + T_2².

Hence, T*T = TT* if, and only if, T_1T_2 = T_2T_1. ∎


For any operator T, we have ||T^k|| ≤ ||T||^k, where k is a positive integer. A strengthening of the preceding inequality holds for normal operators:

Theorem 3.7.2 Let T ∈ B(H) satisfy T*T = TT*. Then,

||T^k|| = ||T||^k for k = 2^n, n = 1, 2, ….

Proof For n = 1,

||T²||² = ||T²(T²)*||     [Theorem 3.5.4(e)]
       = ||T²(T*)²||
       = ||(TT*)²||       [T*T = TT*]
       = ||TT*||²         [Theorem 3.5.4(e)]
       = ||T||⁴,          [Theorem 3.5.4(e)]

which implies

||T²|| = ||T||².

Suppose the result is true for n = m. Then,

||T^{2^{m+1}}||² = ||T^{2^{m+1}}(T^{2^{m+1}})*||          [Theorem 3.5.4(e)]
               = ||T^{2^m}T^{2^m}(T^{2^m})*(T^{2^m})*||
               = ||T^{2^m}(T^{2^m})*T^{2^m}(T^{2^m})*||   [T*T = TT*]
               = ||(T^{2^m}(T^{2^m})*)²||
               = ||T^{2^m}(T^{2^m})*||²                   [using the case n = 1]
               = (||T^{2^m}||²)²                          [Theorem 3.5.4(e)]
               = ||T||^{2^{m+2}}.                         [induction hypothesis]

Consequently,

||T^{2^{m+1}}|| = ||T||^{2^{m+1}}.

By induction, the proof is complete. ∎
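Theorem 3.7.2 can be illustrated with a normal but non-self-adjoint matrix, built by conjugating a complex diagonal matrix by a unitary one (names are ad hoc):

```python
import numpy as np

rng = np.random.default_rng(4)
lam = rng.standard_normal(4) + 1j * rng.standard_normal(4)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
T = (Q * lam) @ Q.conj().T                   # normal: T = Q diag(lam) Q*
op_norm = lambda A: np.linalg.norm(A, 2)

is_normal = np.allclose(T @ T.conj().T, T.conj().T @ T)
powers_ok = all(
    np.isclose(op_norm(np.linalg.matrix_power(T, 2 ** n)), op_norm(T) ** 2 ** n)
    for n in (1, 2, 3)
)
```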


If T ∈ B(H) is self-adjoint, it was proved in Theorem 3.6.6 that

||T|| = sup{|(Tx, x)| : ||x|| = 1}.

The norm of any bounded linear normal operator can be computed using the foregoing formula. We begin with the following:

Definition 3.7.3 For any T ∈ B(H),

q(T) = sup{|(Tx, x)| : ||x|| = 1}.

Proposition 3.7.4 Let T ∈ B(H), where H is complex. Then,

||Tx||² + |(T²x, x)| ≤ 2q(T)||Tx|| ||x||    (3.32)

for every x ∈ H.

Proof Let λ and θ be real numbers. Then for x ∈ H,

||Tx||² + e^{2iθ}(T²x, x) = (1/2)(λe^{2iθ}T²x + λ⁻¹e^{iθ}Tx, λe^{iθ}Tx + λ⁻¹x)
                          − (1/2)(λe^{2iθ}T²x − λ⁻¹e^{iθ}Tx, λe^{iθ}Tx − λ⁻¹x).

Since |(Ty, y)| ≤ q(T)||y||² for every y ∈ H, we have

|||Tx||² + e^{2iθ}(T²x, x)| ≤ (1/2)|(λe^{2iθ}T²x + λ⁻¹e^{iθ}Tx, λe^{iθ}Tx + λ⁻¹x)|
                              + (1/2)|(λe^{2iθ}T²x − λ⁻¹e^{iθ}Tx, λe^{iθ}Tx − λ⁻¹x)|
                            = (1/2)|(e^{iθ}T(λe^{iθ}Tx + λ⁻¹x), λe^{iθ}Tx + λ⁻¹x)|
                              + (1/2)|(e^{iθ}T(λe^{iθ}Tx − λ⁻¹x), λe^{iθ}Tx − λ⁻¹x)|
                            ≤ (1/2)q(T)(||λe^{iθ}Tx + λ⁻¹x||² + ||λe^{iθ}Tx − λ⁻¹x||²).    (3.33)

By the parallelogram law, the right-hand side of (3.33) equals q(T)(λ²||Tx||² + λ⁻²||x||²). If Tx ≠ 0, choosing λ ≠ 0 such that λ²||Tx|| = ||x|| and θ such that e^{2iθ}(T²x, x) = |(T²x, x)|, we deduce from (3.33) that

||Tx||² + |(T²x, x)| ≤ q(T)(λ²||Tx||² + λ⁻²||x||²)
                    = q(T)(||Tx|| ||x|| + ||Tx|| ||x||)
                    = 2q(T)||Tx|| ||x||.

The inequality (3.32) is obviously true in case Tx = 0. ∎


The following proposition will also be needed.

Proposition 3.7.5 If T ∈ B(H), then ||T|| ≤ 2q(T) and q(T²) ≤ q(T)².

Proof From Proposition 3.7.4, we have

||Tx||² ≤ 2q(T)||Tx|| ||x|| for all x ∈ H.

This implies

||Tx|| ≤ 2q(T)||x||,

and so,

||T|| ≤ 2q(T).

Let x ∈ H be such that ||x|| = 1. Then, Proposition 3.7.4 gives

||Tx||² + |(T²x, x)| ≤ 2q(T)||Tx||,

that is,

||Tx||² − 2q(T)||Tx|| + |(T²x, x)| ≤ 0.

Therefore,

(||Tx|| − q(T))² + |(T²x, x)| ≤ q(T)²,

which implies

|(T²x, x)| ≤ q(T)².

Hence,

q(T²) = sup{|(T²x, x)| : ||x|| = 1} ≤ q(T)². ∎

Corollary 3.7.6 q(T^p) ≤ q(T)^p for p = 2^n, n = 1, 2, ….

Proof By induction. ∎
Theorem 3.7.7 If T ∈ B(H) is a normal operator, then

||T|| = sup{|(Tx, x)| : ||x|| = 1}.

Proof From the definition of q and the definition of the norm, it follows that

q(T) = sup{|(Tx, x)| : ||x|| = 1}
     ≤ sup{||Tx|| ||x|| : ||x|| = 1}    (3.34)
     = sup{||Tx|| : ||x|| = 1} = ||T||.

Since T is normal, we have

||T^p|| = ||T||^p for p = 2^n, n = 1, 2, ….

So,

||T|| = ||T^p||^{1/p}
      ≤ (2q(T^p))^{1/p}    [Proposition 3.7.5]
      ≤ (2q(T)^p)^{1/p}    [Corollary 3.7.6]
      = 2^{1/p}q(T).

On letting p → ∞, we obtain

||T|| ≤ q(T).    (3.35)

Combining (3.34) and (3.35), we get the desired expression for the norm of the operator T. ∎
Corollary 3.7.8 Let T ∈ B(H) be self-adjoint. Then,

||T|| = sup{|(Tx, x)| : ||x|| = 1}.

Proof Every self-adjoint operator T ∈ B(H) is normal. ∎
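A numerical illustration of Theorem 3.7.7: random unit vectors never exceed ||T|| in the quantity |(Tx, x)|, and a top eigenvector of a normal T attains it (sampling is only indicative).

```python
import numpy as np

rng = np.random.default_rng(5)
lam = rng.standard_normal(3) + 1j * rng.standard_normal(3)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
T = (Q * lam) @ Q.conj().T                   # normal with eigenvalues lam
norm_T = np.linalg.norm(T, 2)

samples = []
for _ in range(2000):
    x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    x /= np.linalg.norm(x)
    samples.append(abs(np.vdot(x, T @ x)))   # |(Tx, x)| on the unit sphere

v = Q[:, np.argmax(np.abs(lam))]             # unit eigenvector of a largest eigenvalue
attained = abs(np.vdot(v, T @ v))            # equals max |lam| = ||T||
```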


In three-dimensional Euclidean space, the simplest operator after a projection is a rotation of the space, which changes neither the lengths of vectors nor the orthogonality between pairs of them. We consider below the analogue of this operation in Hilbert space.

Definition 3.7.9 Let H be a Hilbert space and U be a bounded linear operator with domain H and range H. U is called unitary if

(Ux, Uy) = (x, y)

for all x, y ∈ H.
If y = x, then the defining relation for a unitary operator U takes the form ||Ux|| = ||x|| for all x ∈ H; in particular, U is bounded and ||U|| = 1.
Theorem 3.7.10 Let U be a unitary operator on a Hilbert space H. Then U⁻¹ exists and is unitary. Moreover, U⁻¹ = U*.

Proof In order to show that U⁻¹ exists, it is enough to show that U is injective (its range is H by definition), which follows from the fact that ||Ux|| = ||x|| for all x ∈ H.
We next show that U⁻¹ is unitary. Choose arbitrary x′, y′ ∈ H and let x = U⁻¹x′, y = U⁻¹y′. Then Ux = x′ and Uy = y′. So, (x′, y′) = (Ux, Uy) = (x, y) = (U⁻¹x′, U⁻¹y′), that is, U⁻¹ is unitary.
It remains to show that U⁻¹ = U*. For x, y ∈ H, let U⁻¹y = z, so that y = Uz. Then, (Ux, y) = (Ux, Uz) = (x, z) = (x, U⁻¹y). Also, (Ux, y) = (x, U*y). Consequently, (x, U*y) = (x, U⁻¹y) and this implies U*y = U⁻¹y for all y ∈ H. This proves the assertion. ∎
Corollary 3.7.11 Let U be a bounded linear operator defined on H. Then, U is unitary if, and only if, UU* = U*U = I.

Proof Indeed, for x, y ∈ H and U a unitary operator,

(x, y) = (x, U⁻¹Uy) = (x, U*Uy),

which implies U*U = I. Similarly, UU* = I.
On the other hand, if UU* = U*U = I, then U is invertible (hence has range H) and

(x, y) = (U*Ux, y) = (Ux, Uy). ∎


The following simple characterisation of unitary operators is often useful.

Theorem 3.7.12 Let H be a Hilbert space and let U ∈ B(H). Then, U is unitary if, and only if,
(a) ||Ux|| = ||x|| for all x ∈ H
and
(b) the range of U is dense in H.

Proof Suppose U is unitary. It has been observed that ||Ux|| = ||x|| for all x ∈ H, that is, (a) holds. Condition (b) is satisfied by virtue of the definition of a unitary operator.
Suppose that (a) and (b) hold. Then for x, y ∈ H and a ∈ ℂ,

(x + ay, x + ay) = (U(x + ay), U(x + ay)).

Since U is linear, the above equality leads to

(x, x) + |a|²(y, y) + a(y, x) + ā(x, y) = (Ux, Ux) + |a|²(Uy, Uy) + a(Uy, Ux) + ā(Ux, Uy),

and this implies

a(y, x) + ā(x, y) = a(Uy, Ux) + ā(Ux, Uy),    (3.36)

using (a). On taking a = 1 and a = i in (3.36), we obtain

(y, x) + (x, y) = (Uy, Ux) + (Ux, Uy)    (3.37)

and

(y, x) − (x, y) = (Uy, Ux) − (Ux, Uy).    (3.38)

On subtracting (3.38) from (3.37), we get

(Ux, Uy) = (x, y)    (3.39)

for all x, y ∈ H.
By (a), U is bounded below. Therefore, by (b) and Theorem 3.3.12, U is invertible. Together with (3.39) and Definition 3.7.9, this entails that U is unitary. ∎

Example 3.7.13
(i) Let ℓ²(ℤ) denote the Hilbert space consisting of the complex functions x on ℤ such that Σ_{n=−∞}^∞ |x(n)|² < ∞. Define U on ℓ²(ℤ) by (Ux)(n) = x(n − 1) for x ∈ ℓ²(ℤ). The operator U is called the bilateral shift. It is clearly linear and the following calculation

||Ux||² = Σ_{n=−∞}^∞ |(Ux)(n)|² = Σ_{n=−∞}^∞ |x(n − 1)|² = ||x||²

for x ∈ ℓ²(ℤ) shows that it is bounded with norm 1.
The defining relation for U* is (x, U*y) = (Ux, y), x, y ∈ H:

(x, U*y) = (Ux, y) = Σ_{n=−∞}^∞ (Ux)(n)ȳ(n) = Σ_{n=−∞}^∞ x(n − 1)ȳ(n) = Σ_{n=−∞}^∞ x(n)ȳ(n + 1).

Therefore, (U*y)(n) = y(n + 1). An easy computation shows that UU* = U*U = I. Thus, U is a unitary operator.
(ii) Let H = L²[0, 2π]. Define U : H → H by the formula (Ux)(t) = e^{it}x(t) for x ∈ L²[0, 2π]. Observe that U is onto. Indeed, if y ∈ L²[0, 2π], then z(t) = e^{−it}y(t) belongs to L²[0, 2π] and

(Uz)(t) = e^{it}z(t) = e^{it}(e^{−it}y(t)) = y(t).

Moreover,

||Ux||² = ∫₀^{2π} |e^{it}x(t)|² dt = ∫₀^{2π} |x(t)|² dt = ||x||².

Thus,

||Ux|| = ||x|| for x ∈ L²[0, 2π].

Consequently, U is a unitary operator on L²[0, 2π].
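A finite-dimensional analogue of the bilateral shift in (i) is the cyclic shift on ℂⁿ; being a permutation matrix, it is unitary (illustration only).

```python
import numpy as np

n = 6
U = np.roll(np.eye(n), 1, axis=0)        # (Ux)(k) = x(k - 1 mod n)
x = np.arange(n, dtype=float)

is_unitary = np.allclose(U @ U.T, np.eye(n)) and np.allclose(U.T @ U, np.eye(n))
preserves_norm = np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x))
adjoint_undoes = np.allclose(U.T @ (U @ x), x)   # U* shifts back, as in (i)
```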


More general than a unitary operator defined on H is an isometric operator.

Definition 3.7.14 Let H be a complex Hilbert space and T ∈ B(H). The operator T is said to be isometric if ||Tx|| = ||x|| for all x in H.

Remarks 3.7.15 (i) An isometry is a distance-preserving transformation:

||Tx − Ty|| = ||T(x − y)|| = ||x − y|| for all x, y ∈ H.

In particular, T is injective.
(ii) Observe that a unitary operator on H is isometric. However, not every isometric operator is unitary. The simple unilateral shift T discussed in (vii) of Example 3.2.5 is an isometry but is not unitary because it is obviously not a bijection. In fact, T is not even normal, because its adjoint is given [see (vi) of Example 3.5.10] by T*({x_i}_{i≥1}) = (x_2, x_3, …), and hence,

T*T({x_i}_{i≥1}) = T*(0, x_1, x_2, …) = (x_1, x_2, …) = {x_i}_{i≥1},

so that T*T = I, while

TT*({x_i}_{i≥1}) = T(x_2, x_3, …) = (0, x_2, x_3, …).
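The computation in (ii) can be mimicked on finitely supported sequences, representing them as Python lists (the helper names are ad hoc):

```python
def shift(x):          # T(x1, x2, ...) = (0, x1, x2, ...)
    return [0] + list(x)

def shift_adj(x):      # T*(x1, x2, ...) = (x2, x3, ...)
    return list(x)[1:]

x = [3, 1, 4, 1, 5]
tstar_t = shift_adj(shift(x))     # recovers x, so T*T = I
t_tstar = shift(shift_adj(x))     # loses the first coordinate, so TT* != I
```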

The following provides a characterisation of an isometry.


Proposition 3.7.16 Let H be a complex Hilbert space and T ∈ B(H). Then, the following are equivalent:
(a) T is an isometry;
(b) T*T = I; and
(c) (Tx, Ty) = (x, y) for all x, y ∈ H.

Proof (a) implies (b). Since ||Tx|| = ||x|| for all x ∈ H, we have

(T*Tx, x) = (Tx, Tx) = ||Tx||² = ||x||² = (x, x).

This implies T*T = I, in view of Problem 3.6.P4.
(b) implies (c). (Tx, Ty) = (T*Tx, y) = (x, y).
(c) implies (a). This follows on taking y = x in (Tx, Ty) = (x, y). ∎

Theorem 3.7.17 The range ran(T) of an isometric operator T defined on a complex Hilbert space H is a closed linear subspace of H.

Proof Clearly, ran(T) = T(H) is a linear subspace of H. Suppose y belongs to the closure of ran(T). We need to show that y ∈ ran(T). Choose a sequence {y_n}_{n≥1} in ran(T) such that y_n → y as n → ∞. Note that y_n = Tx_n for some x_n in H, n = 1, 2, …. Since ||x_m − x_n|| = ||T(x_m − x_n)|| = ||y_m − y_n|| → 0 as m, n → ∞, it follows that {x_n}_{n≥1} is a Cauchy sequence. Since H is complete, there exists x in H such that x_n → x. By continuity of T, we have Tx_n → Tx, i.e. Tx = lim_nTx_n = lim_ny_n = y. Hence, y = Tx, so y ∈ ran(T). ∎

Theorem 3.7.18 Let H be a complex Hilbert space and T ∈ B(H). Then, the following are equivalent:
(a) T*T = TT* = I;
(b) T is a surjective isometry; and
(c) T is a normal isometry.

Proof (a) implies (b). From (a), it follows that TT* = I. This ensures that T is surjective. It also follows that T*T = I, and hence, by Proposition 3.7.16, T is an isometry.
(b) implies (c). Since T is an isometry, (Tx, Ty) = (x, y) by Proposition 3.7.16. Being surjective, T must be unitary by Definition 3.7.9. Hence, T*T = TT* = I by Corollary 3.7.11, so that T is normal.
(c) implies (a). Since T is an isometry, T*T = I by Proposition 3.7.16. Since T is normal, T*T = TT* = I. This completes the proof. ∎

Definition 3.7.19 Let S and T be bounded linear operators on a Hilbert space H. The operator S is said to be unitarily equivalent to T if there exists a unitary operator U on H such that

S = UTU⁻¹ = UTU*.

Remark 3.7.20 If T is self-adjoint or normal, then so is any operator S that is unitarily equivalent to T. The reason is as follows: S* = (UTU*)* = (U*)*T*U* = UT*U* = UTU* = S, using the hypothesis that T = T*. A similar argument shows that if T is normal, then so is S.
Problem Set 3.7

3.7.P1. Show that the range of a bounded linear operator need not be closed.
3.7.P2. [See Problem 3.7.P1] Let T : H → H be a bounded linear operator on a Hilbert space H. Suppose there exists M > 0 such that ||Tx|| ≥ M||x|| for every x ∈ H. Prove that the range of T is a closed subspace of H.
3.7.P3. Let H = ℂ² and T be the operator defined on H by the matrix

[ 0   n ]
[ 0   0 ].

Find ||T|| and r(T). Show that T is not a normal operator.
3.7.P4. Let S = I + T*T : H → H, where T ∈ B(H). Show that
(a) S⁻¹ : ran(S) → H exists;
(b) ran(S) is closed;
(c) N(S) = kernel of S = {0}; and
(d) ||S⁻¹|| ≤ 1.
3.7.P5. If f(z) = Σ_{n=0}^∞ z^n/n! and A ∈ B(H) is such that A = A*, show that f(iA) is unitary.
3.7.P6. Recall from (v) of Example 2.1.3 that RH² denotes the space of rational functions which are analytic on the closed unit disc D = {z ∈ ℂ : |z| ≤ 1}, with the usual addition and scalar multiplication and with inner product

(f, g) = (1/2πi) ∫_{∂D} f(z)ḡ(z) dz/z.

Define an operator U on the inner product space RH² by

(Uf)(z) = ((1 − |a|²)^{1/2}/(1 − āz)) f((z − a)/(1 − āz)) for all z ∈ D,

where a ∈ D is fixed. Show that U is an isometry: ||Uf|| = ||f|| for all f ∈ RH².
3.7.P7. Let T ∈ B(H) be a normal operator. Assume that T^m = O for some positive integer m. Show that T = O.
3.7.P8. Let T ∈ B(H) be normal. Show that T is injective if, and only if, T has dense range.
3.7.P9. (a) Give an example of an operator S ∈ B(H) such that ker(S) = {0} but ran(S) is not dense in H.
(b) Give an example of an operator T ∈ B(H) such that T is surjective but ker(T) ≠ {0}.
3.7.P10. Let H be a Hilbert space. Show that the set of all normal operators in B(H) is closed in B(H) in the operator norm.
3.7.P11. If T is a normal operator on the complex Hilbert space H and S ∈ B(H) is such that TS = ST, then T*S = ST*.
Let T ∈ B(H) be a self-adjoint operator on a complex Hilbert space H ≠ {0}. Then σ(T) ⊂ ℝ. So, ±i ∈ ρ(T), the resolvent set of T. The operators T ± iI are invertible elements in B(H). Consider the operator

U = (T − iI)(T + iI)⁻¹ = (T + iI)⁻¹(T − iI)

and the inverse operator

U⁻¹ = (T + iI)(T − iI)⁻¹ = (T − iI)⁻¹(T + iI).

The transformation U is called the Cayley transform of T.
3.7.P12. (a) Show that U is unitary and U = I − 2i(T + iI)⁻¹.
(b) Also show that 1 ∈ ρ(U) and
(c) T = i(I + U)(I − U)⁻¹ = i(I − U)⁻¹(I + U).
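The assertions of 3.7.P12 can be verified numerically for a random self-adjoint matrix (illustration only):

```python
import numpy as np

rng = np.random.default_rng(6)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
T = B + B.conj().T                                   # self-adjoint
I = np.eye(4)

U = (T - 1j * I) @ np.linalg.inv(T + 1j * I)         # Cayley transform
is_unitary = np.allclose(U @ U.conj().T, I)
alt_form = np.allclose(U, I - 2j * np.linalg.inv(T + 1j * I))
# 1 is in the resolvent set of U, so I - U is invertible and T is recovered:
recovered = np.allclose(T, 1j * (I + U) @ np.linalg.inv(I - U))
```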

3.8 Orthogonal Projections

Let H be a Hilbert space and M a closed subspace of H. The orthogonal decomposition theorem [Theorem 2.10.11] says that H = M ⊕ M⊥, where M⊥ denotes the orthogonal complement of M. Thus, for each x ∈ H, there exist a unique y ∈ M and z ∈ M⊥ such that x = y + z.
The concept of orthogonal projection operator P_M, or briefly, projection, was defined in Definition 2.10.16. It was proved in Theorem 2.10.15 that the mapping P_M : H → H has range M, its kernel is M⊥ and P_M restricted to M is the identity operator on M. Also proved therein are the following:
(i) P_M is linear and bounded with norm 1;
(ii) P_M is self-adjoint; and
(iii) P_M is idempotent: P_M² = P_M.

Definition 3.8.1 Let P ∈ B(H). P is called an orthogonal projection if P* = P and P² = P.
Associated with any closed subspace M of H, the orthogonal projection operator P_M, or briefly, P, has the properties (i), (ii) and (iii), and also satisfies ran(P_M) = M, ker(P_M) = M⊥ [Theorem 2.10.15].
We now prove the converse: if P ∈ B(H) is such that P* = P and P² = P, then there exists a unique closed subspace M of H such that P is the associated orthogonal projection operator P_M.
Set
Set

M = {x ∈ H : Px = x}.

Clearly, M = ker(I − P) and is therefore a closed subspace.
We next show that ran(P) = M and ker(P) = M⊥. Indeed, if x ∈ H, then Px = P²x = P(Px). Thus, Px ∈ M for each x ∈ H, i.e. PH ⊂ M. On the other hand, if x ∈ M, then x = Px ∈ PH. Hence, PH = M. Also, if Px = 0, then for z ∈ H, (x, P*z) = (Px, z) = (0, z) = 0, that is, x ∈ (P*H)⊥ = (PH)⊥ = M⊥. On the other hand, if x ∈ M⊥, then (Px, z) = (x, P*z) = (x, Pz) = 0 for each z ∈ H. Therefore, Px = 0 for x ∈ M⊥. Finally, for x ∈ H, we have x = y + z, where y ∈ M and z ∈ M⊥, and hence Px = Py + Pz = y. Thus, P is the operator of orthogonal projection on M.
Combining the discussions in the paragraphs above, we have the following Theorem.

Theorem 3.8.2 Let P ∈ B(H). Then, P is a projection if, and only if,

{x ∈ H : Px = x} = ker(I − P) = ran(P) = ker(P)⊥.

Remarks 3.8.3
(i) The argument used to establish the above theorem shows that to each closed linear subspace M in H there corresponds a unique orthogonal projection P such that ran(P) = M; to each orthogonal projection P there corresponds a closed linear subspace M = {x ∈ H : Px = x} = ran(P). This enables us to express geometric properties of subspaces in terms of algebraic properties of the projections corresponding to them [see Theorems 3.8.4 and 3.8.5 below].
(ii) Every orthogonal projection is a positive operator: indeed,

(Px, x) = (P²x, x) = (Px, Px) = ||Px||² ≥ 0.

(iii) Consider the operator P on ℂ² corresponding to the matrix

P = [ 1   1 ]
    [ 0   0 ].

Observe that P² = P. Its range is {(x, 0) : x ∈ ℂ} and its kernel is {(x, −x) : x ∈ ℂ}. However, P* has matrix

[ 1   0 ]     [ 1   1 ]
[ 1   0 ]  ≠  [ 0   0 ].

So it is not an orthogonal projection.
(iv) Let (X, 𝔐, μ) be a σ-finite measure space. For y ∈ L^∞(μ), consider the operator T on L²(μ) of multiplication by y:

Tx(t) = y(t)x(t), x ∈ L²(μ), t ∈ X.

[See (vi) of Example 3.2.5.] The operator T is bounded with ||T|| = ||y||_∞, and it is self-adjoint if, and only if, y is real-valued a.e. Observe that T² = T if, and only if, y² = y a.e., i.e. y is equal a.e. to a characteristic function. Thus, if the operator of multiplication by a real-valued y is a projection, then it is an orthogonal projection.
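The matrix in (iii) can be checked directly, alongside the genuine orthogonal projection with the same range (illustration only):

```python
import numpy as np

P = np.array([[1.0, 1.0], [0.0, 0.0]])
idempotent = np.allclose(P @ P, P)            # P^2 = P
self_adjoint = np.allclose(P.T, P)            # False: P* has matrix [[1,0],[1,0]]

# The orthogonal projection onto the same range span{(1, 0)}:
Q = np.array([[1.0, 0.0], [0.0, 0.0]])
ortho_proj = np.allclose(Q @ Q, Q) and np.allclose(Q.T, Q)
```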
We propose to show below in detail how the closed subspaces of a Hilbert space and the corresponding orthogonal projections are related to each other.

Theorem 3.8.4 Let M and N be closed subspaces of a Hilbert space H, and let P and Q denote the projections on M and N, respectively. Then,
(a) I − P is the projection on M⊥;
(b) M ⊥ N if, and only if, PQ = O.

Proof (a) Note that (I − P)* = I* − P* = I − P and (I − P)² = I − 2P + P² = I − P. Thus, I − P is a projection operator. We next show that {x ∈ H : (I − P)x = x} = M⊥. If (I − P)x = x, then Px = 0, which implies x ∈ M⊥. On the other hand, if x ∈ M⊥, then Px = 0, and hence (I − P)x = x.
(b) Suppose that PQ = O. Then for x ∈ M and y ∈ N,

(x, y) = (Px, Qy) = (x, PQy) = 0.

Therefore, M ⊥ N. Conversely, if M ⊥ N, then for any x ∈ H, Qx ∈ N ⊂ M⊥; so PQx = 0 for x ∈ H. Hence, PQ = O. ∎

Under the condition (b) of the above theorem, we speak of the projections P and Q themselves as being orthogonal.

Theorem 3.8.5 Let M and N be closed subspaces of a Hilbert space H. If P and Q denote the projections on M and N, respectively, then the following are equivalent:
(a) M ⊂ N;
(b) P ≤ Q;
(c) PQ = P; and
(d) QP = P.

Proof (a) implies (c). If M ⊂ N, then Px ∈ N for each x ∈ H. Therefore, Q(Px) = Px, x ∈ H; so QP = P. Also, (QP)* = P*, that is, P*Q* = P*, which implies PQ = P.
(c) implies (b). Suppose PQ = P. Then for x ∈ H,

(Px, x) = (P²x, x) = (Px, Px) = ||Px||² = ||PQx||² ≤ ||Qx||² = (Qx, Qx) = (Qx, x).

Hence, P ≤ Q.
(b) implies (a). Suppose that P ≤ Q and let x ∈ M. Then,

||x||² = ||Px||² = (Px, Px) = (P²x, x) = (Px, x) ≤ (Qx, x)
      = (Q²x, x) = (Qx, Qx) = ||Qx||² ≤ ||x||².

Hence, ||Qx|| = ||x||. Now,

x = Qx + (I − Q)x,

and so,

||x||² = ||Qx||² + ||(I − Q)x||²,

and this implies

||(I − Q)x|| = 0,

since

||x||² = ||Qx||².

Consequently,

x = Qx,

i.e. x ∈ N.
(c) implies (d). Let PQ = P. Then, P = P* = (PQ)* = Q*P* = QP. Now let QP = P. Then, P = P* = (QP)* = P*Q* = PQ. ∎
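Theorem 3.8.5 can be illustrated with projections onto nested column spans; `proj` is an ad hoc helper.

```python
import numpy as np

def proj(cols):
    # Orthogonal projection onto the column span of `cols`.
    Q, _ = np.linalg.qr(cols)
    return Q @ Q.T

rng = np.random.default_rng(7)
A = rng.standard_normal((5, 3))
P = proj(A[:, :1])        # projection on M = span of the first column
Q = proj(A)               # projection on N = span of all three columns, M in N

pq_eq_p = np.allclose(P @ Q, P)
qp_eq_p = np.allclose(Q @ P, P)
p_le_q = all(x @ (P @ x) <= x @ (Q @ x) + 1e-10
             for x in rng.standard_normal((100, 5)))
```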

The next few results give necessary and sufficient conditions for addition, subtraction and multiplication of projection operators to result in a projection operator.

Theorem 3.8.6 Let {P_i}_{i≥1} be a denumerable or finite family of projections and Σ_iP_i = P in the sense of strong convergence. Then, a necessary and sufficient condition that P be a projection is that P_jP_k = O whenever j ≠ k. If this condition is satisfied and if, for each j, the range of P_j is M_j, then the range of P is M = Σ_iM_i = {x ∈ H : x = Σ_ix_i, x_i ∈ M_i, i = 1, 2, …} = [∪_kM_k].

Proof If the family {P_i}_{i≥1} satisfies the condition, then

P² = (Σ_iP_i)(Σ_jP_j) = Σ_{i,j}P_iP_j = Σ_iP_i = P

and

(Px, y) = (Σ_iP_ix, y) = Σ_i(P_ix, y) = Σ_i(x, P_iy) = (x, Σ_iP_iy) = (x, Py)

for every pair x, y in H. In other words, the orthogonality of the family {P_i} implies that P is idempotent and Hermitian, and hence P is a projection.
If, conversely, P is a projection and if x ∈ ran(P_k) for some value of k, then

||x||² ≥ ||Px||² = (Px, x) = Σ_i(P_ix, x) = Σ_i||P_ix||² ≥ ||P_kx||² = ||x||².

It follows that every term in the chain of inequalities is equal to every other term. From the equality

Σ_i||P_ix||² = ||P_kx||²,

we conclude that P_ix = 0 whenever i ≠ k, and hence P_i(ran(P_k)) = {0} whenever i ≠ k. Thus, the family {P_i}_{i≥1} satisfies the condition P_jP_k = O whenever j ≠ k.
We next show that ran(P) = Σ_iM_i, where M_i = ran(P_i). For any Px ∈ ran(P), we have Px = Σ_iP_ix ∈ Σ_iM_i, because P_ix ∈ M_i. Thus, ran(P) ⊂ Σ_iM_i. On the other hand, every z ∈ Σ_iM_i is of the form Σ_ix_i, x_i ∈ M_i, so that Pz = Σ_iP_ix_i = Σ_ix_i = z, which implies z ∈ ran(P). Thus, Σ_iM_i ⊂ ran(P).
Finally, we show that ran(P) = [∪_kM_k]. From the equality of ||x|| and ||Px|| for x ∈ ran(P_k), we conclude that x ∈ ran(P) and hence M_k ⊂ ran(P) for all k; it therefore follows that [∪_kM_k] ⊂ ran(P). On the other hand, P_kx ∈ M_k for every vector x and every value of k; it follows that Px = Σ_kP_kx ∈ Σ_kM_k ⊂ [∪_kM_k] for all x, i.e. ran(P) ⊂ [∪_kM_k]. ∎
The useful fact about the product of projections is contained in the following.
Theorem 3.8.7 The product of two projection operators P and Q is a projection
operator if, and only if, PQ = QP. In this case, PQ is the projection on M ∩ N,
where M [resp. N] is the subspace of H on which P [resp. Q] is the projection.

Proof Suppose that PQ is a projection. Then,

PQ = (PQ)* = Q*P* = QP.

On the other hand, suppose that PQ = QP = R, say. Then,

R² = (PQ)(PQ) = PPQQ = P²Q² = PQ = R

and for all pairs x, y in H,

(Rx, y) = (PQx, y) = (Qx, Py) = (x, QPy) = (x, Ry).

Thus, R is both self-adjoint and idempotent.
Finally, we show that the range of PQ is M ∩ N.
For x ∈ H, let

y = PQx = QPx.

By the first representation, y ∈ M and by the second representation, y ∈ N.
Hence, y ∈ M ∩ N, i.e. the range of PQ satisfies ran(PQ) ⊆ M ∩ N. If x ∈ M ∩ N, then
PQx = x. Thus, ran(PQ) = M ∩ N. ∎
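In finite dimensions, Theorem 3.8.7 can be illustrated directly. The following is a hypothetical NumPy sketch (the specific matrices are mine, not the book's): two commuting coordinate projections whose product is the projection on the intersection of their ranges.

```python
import numpy as np

P = np.diag([1.0, 1.0, 0.0, 0.0])  # projection on M = span{e1, e2}
Q = np.diag([0.0, 1.0, 1.0, 0.0])  # projection on N = span{e2, e3}

assert np.allclose(P @ Q, Q @ P)           # P and Q commute
R = P @ Q
assert np.allclose(R @ R, R)               # R is idempotent
assert np.allclose(R, R.conj().T)          # and self-adjoint
# R is the projection on M ∩ N = span{e2}.
assert np.allclose(R, np.diag([0.0, 1.0, 0.0, 0.0]))
```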
We finally treat the difference of projections.
Theorem 3.8.8 The difference of two projections, P1 − P2, is a projection if, and
only if, M2 ⊆ M1, where M1 [resp. M2] is the subspace of H on which P1 [resp. P2] is
the projection. In this case, ran(P1 − P2) = M1 ∩ M2⊥.

Proof Suppose P1 − P2 is an orthogonal projection. Then for x ∈ H,

((P1 − P2)x, x) = ((P1 − P2)²x, x) = ((P1 − P2)x, (P1 − P2)x) = ||(P1 − P2)x||² ≥ 0,

which proves that M2 ⊆ M1 [see Theorem 3.8.5]. On the other hand, suppose
that M2 ⊆ M1. Then,

P1P2 = P2 = P2P1 [Theorem 3.8.5]. (3.40)

Now,

(P1 − P2)² = P1² − P1P2 − P2P1 + P2² = P1 − P2

and

(P1 − P2)* = P1* − P2* = P1 − P2.

Finally, we show that ran(P1 − P2) = M1 ∩ M2⊥. Since P1P2 = P2P1 by (3.40)
above, it follows that

P1(I − P2) = P1 − P1P2 = P1 − P2P1 = (I − P2)P1.

Hence, by Theorem 3.8.7, P1(I − P2) is an orthogonal projection with range given
by

ran(P1) ∩ ran(I − P2) = ran(P1) ∩ ran(P2)⊥.

The proof is completed by observing that P1(I − P2) = (I − P2)P1 = P1 − P2. ∎


Let H be a finite-dimensional Hilbert space and T ∈ B(H) be such that T*T =
TT*. The subspace M formed by the eigenvectors belonging to a certain eigenvalue
λ is invariant under T, i.e. T(M) ⊆ M. In fact, T(M⊥) ⊆ M⊥ as well. Since T and T*
commute, it follows that (T − λI) and (T* − λ̄I) commute; T − λI is then normal and
therefore has the same kernel as its adjoint T* − λ̄I. This implies that Ty = λy if, and
only if, T*y = λ̄y. Let x ∈ M⊥ and y ∈ M. Then, (Tx, y) = (x, T*y) = (x, λ̄y) = λ(x, y) = 0.
Consequently, T(M⊥) ⊆ M⊥. M is called a reducing subspace of T.
Although no analogous structure theory exists for operators on
infinite-dimensional spaces, the notions of “invariant subspaces” and “reducing
subspaces” do make sense.
Definition 3.8.9 A subspace M of a Hilbert space H is said to be invariant under a
bounded linear operator T ∈ B(H) if T(M) ⊆ M. The subspace M ⊆ H is said to
reduce T if T(M) ⊆ M and T(M⊥) ⊆ M⊥, i.e. if both M and M⊥ are invariant under T.
Then, M and M⊥ are called reducing subspaces of T.
It can be easily checked that M reduces T if, and only if, M is invariant under
both T and T*.
The investigation of T is facilitated by considering T|M and T|M⊥ separately.
Note that the subspace {0} and H are invariant under any T 2 B(H). Also,
ker(T) is always invariant under T; for Tx = 0 implies T(Tx) = 0.
Theorem 3.8.10 Let P be the orthogonal projection onto the subspace M of H.
Then, M is invariant under an operator T 2 B(H) if, and only if, TP = PTP;
M reduces T if, and only if, TP = PT.
Proof For each x 2 H, Px 2 M. Suppose M is invariant under T. Then, T(Px) 2 M,
and hence, PTPx = TPx; so PTP = TP. Conversely, if PTP = TP, then for every x 2
M, we have Tx = TPx = PTPx, and this is a vector in M. This proves that M is
invariant under T.
It remains to show that M reduces T if, and only if, TP = PT.
M reduces T if, and only if, TP = PTP and T(I − P) = (I − P)T(I − P) if, and only
if, TP = PTP = PT. This completes the proof. h
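Theorem 3.8.10 suggests a concrete matrix test for invariance and reduction. The sketch below is a hypothetical example (assuming NumPy; the block matrices are illustrative choices of mine): a block upper triangular operator leaves M = span{e1, e2} invariant without reducing it, while a block diagonal operator reduces M.

```python
import numpy as np

P = np.diag([1.0, 1.0, 0.0, 0.0])            # projection on M = span{e1, e2}
A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])
C = np.array([[9., 1.], [2., 3.]])
Z = np.zeros((2, 2))

T_inv = np.block([[A, B], [Z, C]])           # block upper triangular
assert np.allclose(T_inv @ P, P @ T_inv @ P)      # TP = PTP: M is invariant
assert not np.allclose(T_inv @ P, P @ T_inv)      # but TP != PT: M does not reduce T

T_red = np.block([[A, Z], [Z, C]])           # block diagonal
assert np.allclose(T_red @ P, P @ T_red)          # TP = PT: M reduces T
```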
Problem Set 3.8

3.8.P1. Let X = Y = ℓ².
(a) Define Tn x = (1/n)x for all x ∈ ℓ². Show that limn ||Tn|| = 0.

(b) Let e1, e2, … be an orthonormal basis. Let Pn be the orthogonal
projection on the linear span of {e1, e2, …, en}, so that I − Pn is the
orthogonal projection on the complement of this space. Show that
Pn → I in strong operator convergence, but not in operator norm
convergence.
(c) Let T : ℓ² → ℓ² be defined as follows: T((x1, x2, …)) = (0, x1, x2, …).
Show that Tⁿ → O weakly but not strongly. For x, y ∈ ℓ²,

(Tⁿx, y) = ((0, …, 0, x1, x2, …), (y1, y2, …, yn, yn+1, …)) = Σ_{k=1}^∞ xk ȳn+k.

Definition. A linear operator P in any linear space X is said to be a projection if


P2 = P. (Note that we do not require a projection to be a bounded linear operator or
to be self-adjoint.)
3.8.P2. Let P be a projection in X. Then,
(a) I − P is a projection in X;
(b) ran(P) = {x 2 X : Px = x};
(c) ran(P) = ker(I − P);
(d) X = ran(P) ⊕ ran(I − P); and
(e) if P is bounded, then ran(P) and ran(I − P) are closed.
3.8.P3. Show that a projection P in a Hilbert space is an orthogonal projection iff
ran(P)⊥ker(P).
3.8.P4. Consider the Volterra operator V on L²[0,1] given by

Vx(s) = ∫₀ˢ x(t)dt, x ∈ L²[0,1].

Find V* and show that V + V* is the projection on the space spanned by the
vector 1.

3.9 Polar Decomposition

This section deals with an application of positivity, defined in Definition 3.6.10, to
obtain the "polar decomposition" of an operator, analogous to the representation of
a complex number z as |z|e^{iθ} for some real θ. Does an analogue exist for operators?
In order to answer this question, we need to define the analogues of |z| and e^{iθ}
amongst operators suitably.

Definition 3.9.1 For T ∈ B(H), we define

|T| = (T*T)^{1/2}.

Remarks 3.9.2
(i) The reader should note that T*T ≥ O and therefore (T*T)^{1/2} is uniquely
defined and is positive.
(ii) It is true that |kT| = |k||T| whenever k ∈ ℂ and T ∈ B(H).
(iii) If the square of an operator S is invertible, i.e. S²U = US² = I for some U,
then we have S(SU) = I = (US)S. Also,

SU = (US²)(SU) = (US)(S²U) = US,

which is therefore an inverse of S. Now, if T is any invertible operator, then so is
T*T and consequently, |T| is invertible.
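In finite dimensions, Definition 3.9.1 can be realised by diagonalising T*T. The helper below is a sketch under the assumption that NumPy is available (`abs_op` is a name of my choosing, not a library routine); it computes |T| and checks |T|² = T*T and positivity.

```python
import numpy as np

def abs_op(T):
    # |T| = (T*T)^(1/2): diagonalise the positive operator T*T and take
    # the nonnegative square root of its eigenvalues.
    w, V = np.linalg.eigh(T.conj().T @ T)   # T*T is Hermitian and positive
    w = np.clip(w, 0.0, None)               # clip tiny negative rounding errors
    return V @ np.diag(np.sqrt(w)) @ V.conj().T

T = np.array([[0., -2.], [1., 0.]])
absT = abs_op(T)

assert np.allclose(absT @ absT, T.conj().T @ T)   # |T|^2 = T*T
assert np.all(np.linalg.eigvalsh(absT) >= -1e-12) # |T| is positive
```

Here T*T = diag(1, 4), so |T| comes out as diag(1, 2).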
The analogue in B(H) of the complex numbers of absolute value 1 is rather
complicated. At first, one might expect that unitary operators would suffice. A little
reflection shows that this is not the case.
Example 3.9.3 Let T be the simple unilateral shift on ℓ². Then, as seen in Remark
3.7.15(ii), T*T = I, so that |T| = (T*T)^{1/2} = I; but T is not unitary. So, if we write
T = U|T| or |T|U, we must have U = T, which is not unitary.
Definition 3.9.4 An operator T ∈ B(H) is called a partial isometry if T is an
isometry when restricted to the closed subspace [ker(T)]⊥, i.e. ||Tx|| = ||x|| for every
x ∈ [ker(T)]⊥.
Observe that ||T|| ≤ 1. Every isometry is a partial isometry. Every orthogonal
projection is a partial isometry.
The subspace [ker(T)]⊥ is called the initial space of T and ran(T) is called its
final space. It is obvious that the initial space is always closed; we shall now show
that the final space too is always closed, i.e. [ran(T)] = ran(T) when T is a partial
isometry: let x ∈ [ran(T)]. Then, there exists a sequence {xn}n≥1 in H such that
Txn → x. For each n, there exist yn ∈ ker(T) and zn ∈ [ker(T)]⊥ such that xn = yn + zn.
Then, we have Tzn = Txn and also

||Txn − Txm|| = ||T(yn − ym) + T(zn − zm)||
= ||T(zn − zm)||   because yn − ym ∈ ker(T)
= ||zn − zm||      because zn − zm ∈ [ker(T)]⊥.

But {Txn}n≥1 is a Cauchy sequence (since it converges to x), and by the above
equality, {zn}n≥1 is also a Cauchy sequence. Let zn → z. By continuity of T, we have
Tz = limn Tzn = limn Txn = x, which shows that x ∈ ran(T).

The following proposition is in order.


Proposition 3.9.5 Let U ∈ B(H). Then, the following statements are equivalent:
(a) U is a partial isometry;
(b) U* is a partial isometry;
(c) U*U is a projection; and
(d) UU* is a projection.
Moreover, U*U is the projection on [ker(U)]⊥ and UU* is the projection on
[ran(U)] = ran(U).
Proof (a) implies (c): to begin with, observe that for any T ∈ B(H), we have
ker(T) = ker(T*T) by Theorem 3.5.8.
For x ∈ H,

((I − U*U)x, x) = (x, x) − (U*Ux, x) = ||x||² − ||Ux||² ≥ 0,

since ||U|| ≤ 1. Thus, I − U*U is a positive operator. Now if x⊥ker(U), then ||Ux|| =
||x||, which implies that ((I − U*U)x, x) = 0. Since ||(I − U*U)^{1/2}x||² =
((I − U*U)x, x) = 0, we have (I − U*U)x = 0, or U*Ux = x. On the other hand, U*U obviously
maps ker(U) into {0}. Consequently, (U*U)² = U*U. Since U*U is self-adjoint, it
follows by Theorem 3.8.2 that it is a projection onto the orthogonal complement of
its own kernel. However, its kernel is the same as that of U. (Note that this
orthogonal complement is by definition the initial space of U.)
(c) implies (a): if U*U is a projection and x⊥ker(U*U), then U*Ux = x.
Therefore,

||Ux||² = (Ux, Ux) = (U*Ux, x) = (x, x) = ||x||²

and hence, U preserves the norm on [ker(U*U)]⊥. But as noted at the beginning,
ker(U*U) = ker(U). Therefore, U is a partial isometry.
(b) implies (d) and (d) implies (b) follow by reversing the roles of U and U*.
(c) implies (d): first observe that UU* is self-adjoint. We shall show that

(UU*)² = (UU*U)U* = UU*.

It is enough to show that UU*U = U. To this end, we note that this holds on
ker(U). Since it has already been proved that (c) implies (a), we know that U is a
partial isometry. Therefore, for x in ker(U)⊥, we have ||Ux|| = ||x||, which implies
U*Ux = x (see the proof of (a) implies (c)); thus, we have UU*U = U also on
ker(U)⊥ and hence on all of H. ∎
Observe that Proposition 3.9.5 has the following consequence: if U is a partial
isometry, then ||Ux|| = ||x|| if, and only if, x ∈ ran(U*U). Indeed, ||Ux||² = (Ux, Ux) =
(U*Ux, x) = (U*UU*Ux, x) = ||U*Ux||², and it is true of any orthogonal projection
P that ||Px|| = ||x|| is equivalent to x ∈ ran(P).
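A minimal numerical illustration of Proposition 3.9.5 (a hypothetical example, assuming NumPy; the matrix U is my own choice): U sends e1 to e2 and kills e2 and e3, so it is an isometry on [ker(U)]⊥ = span{e1}, and both U*U and UU* come out as orthogonal projections, on the initial and final spaces respectively.

```python
import numpy as np

U = np.array([[0., 0., 0.],
              [1., 0., 0.],
              [0., 0., 0.]])   # Ue1 = e2, Ue2 = Ue3 = 0

UtU = U.T @ U      # the projection on [ker(U)]⊥ = span{e1}
UUt = U @ U.T      # the projection on ran(U) = span{e2}

assert np.allclose(UtU, np.diag([1., 0., 0.]))
assert np.allclose(UUt, np.diag([0., 1., 0.]))
assert np.allclose(UtU @ UtU, UtU) and np.allclose(UUt @ UUt, UUt)
```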

We next prove the analogue of the decomposition z = |z|e^{iθ} for some θ.
Theorem 3.5.8 will be used frequently without explicit mention.
Theorem 3.9.6 (Polar Decomposition) Let T ∈ B(H). Then, there is a partial
isometry U such that T = U|T| and ker(U) = ker(T). Moreover, ran(U) = [ran(T)].
Amongst all bounded linear operators V such that T = V|T|, U is uniquely
determined by the condition ker(V) ⊇ ker(T).
Proof Define U : ran(|T|) → ran(T) by U(|T|x) = Tx. Since

||Tx||² = (Tx, Tx) = (x, T*Tx) = (x, |T|²x) = || |T|x ||²,   (3.41)

it follows that U is well defined. Indeed, if we apply (3.41) to x − y, we deduce that
if |T|x = |T|y then Tx = Ty. The equality (3.41) also shows that U preserves norms
and hence extends to a norm preserving linear mapping of [ran(|T|)] onto [ran(T)]
such that ker(U) = {0}. Extend U to all of H by defining it to be zero on [ran(|T|)]⊥
= ker(|T|), so that it now has kernel equal to ker(|T|) but the same range as before,
which is [ran(T)]. Observe that T = U|T| on H. Furthermore, in view of (3.41),
|T|x = 0 if, and only if, Tx = 0, so that ker(|T|) = ker(T). Thus, ker(U) = ker(T) and, as
already noted, ran(U) = [ran(T)].
We next consider uniqueness.
If V is any linear operator with V|T| = T and ker(V) ⊇ ker(T), we note that Vy =
Uy for every y ∈ ran(|T|), so that U = V on [ran(|T|)]. Since both operators are zero
on [ran(|T|)]⊥ = ker(|T|) = ker(T) ⊆ ker(V), it follows that V = U. ∎
The preceding decomposition theorem is due to von Neumann.
The factorisation T = U|T|, where U is the unique partial isometry such that T =
U|T| and ker(U) = ker(T), is called the polar decomposition of T, and U is called the
partial isometry in the polar decomposition of T.
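For matrices, the polar decomposition of Theorem 3.9.6 can be extracted from a singular value decomposition T = WΣV*: take |T| = VΣV* and build U from the singular vectors belonging to the nonzero singular values, so that ker(U) = ker(T). The sketch below is a hypothetical implementation assuming NumPy (`polar` is an illustrative name of mine, not a library routine).

```python
import numpy as np

def polar(T, tol=1e-12):
    W, s, Vh = np.linalg.svd(T)
    r = int(np.sum(s > tol))                  # rank of T
    U = W[:, :r] @ Vh[:r, :]                  # partial isometry with ker(U) = ker(T)
    absT = Vh.conj().T @ np.diag(s) @ Vh      # |T| = (T*T)^(1/2)
    return U, absT

T = np.array([[0., -3.], [0., 4.]])           # singular: ker(T) = span{e1}
U, absT = polar(T)

assert np.allclose(U @ absT, T)               # T = U|T|
assert np.allclose(U @ U.T @ U, U)            # UU*U = U, so U is a partial isometry
```

Here T*T = diag(0, 25), so |T| is diag(0, 5) and U maps e2 to (−3/5, 4/5) while killing e1.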
The uniqueness argument in the last paragraph of the above proof begins by
assuming that V satisfies ker(V)  ker(T) as well as V|T| = T, but not that it is a
partial isometry. Nevertheless, even a partial isometry V satisfying only T = V|T|
need not be unique. This is illustrated by (ii) of the Remarks below.
Remarks 3.9.7 (i) If T ∈ B(H) is invertible, the partial isometry in its polar
decomposition is unitary, as we now show.
Since T is invertible, ker(T) = {0} and ran(T) = H. Consequently, if T = U|T| is
the polar decomposition of T, then ker(U) = ker(T) = {0} and ran(U) = [ran(T)] =
H. Hence, U is unitary.
(ii) If y(t) is a complex measurable function on [0,1], there are complex measurable
functions a on [0,1] such that |a(t)| = 1 when y(t) ≠ 0 and y(t) = a(t)|y(t)|
everywhere. Then, the operator T of multiplication on L²[0,1] defined by

Tx(t) = y(t)x(t), x ∈ L²[0,1],

satisfies T = V|T|, where V is the operator of multiplication by a. Loosely speaking,

T = a|y(t)|.
If y vanishes on a set Y of positive measure, then several such a are possible,
amongst which several have the property that |a| is the characteristic function of
some set. In case a is chosen (nonuniquely) so that |a| is the characteristic function
of some set E, then V can be shown to be a partial isometry by arguing as follows.
The kernel of V is {x ∈ L²[0, 1] : x(t) = 0 a.e. on E} and its orthogonal complement is
{x ∈ L²[0, 1] : x(t) = 0 a.e. on Eᶜ}; we have to show for any x in this orthogonal
complement that ||Vx|| = ||x||, i.e. ∫₀¹ |a(t)x(t)|²dt = ∫₀¹ |x(t)|²dt. Since |a| is the
characteristic function of E, the former integral equals ∫_E |x(t)|²dt; since x vanishes
a.e. on Eᶜ, the latter integral also equals ∫_E |x(t)|²dt. Thus, the two integrals are
equal and V is therefore a partial isometry.


What it takes for V to have the same kernel as T is that

{x ∈ L²[0, 1] : x(t) = 0 a.e. on E} = {x ∈ L²[0, 1] : y(t)x(t) = 0 a.e. on [0, 1]}
= {x ∈ L²[0, 1] : x(t) = 0 a.e. on Yᶜ},

or equivalently, that the symmetric difference (E\Yᶜ) ∪ (Yᶜ\E) has measure zero. This
amounts to saying that the characteristic function |a| of E must be equal a.e. to that
of Yᶜ. In other words, a must be equal a.e. to 0 on Y and to y(t)/|y(t)| on Yᶜ. With this
choice of a, the polar decomposition of T is V|T|.
It has been shown by Ichinose and Iwashita in [14] that a partial isometry V such
that T = V|T| is unique if, and only if, either ker(T) or ker(T*) is {0}. They have
proved this for operators from one Hilbert space to another, but we shall confine
ourselves to operators from a Hilbert space into itself. Our considerations carry over
verbatim to the broader case. We begin with a preliminary remark.
Remark 3.9.8 The zero operator is a partial isometry. It is easy to see that, given a
partial isometry V ∈ B(H) and any x ∈ H, the equality ||Vx|| = ||x|| is equivalent to
x ∈ (ker(V))⊥. Also, given any partial isometry V and any complex number λ of
absolute value 1, the operator λV is a partial isometry with the same kernel as
V. Distinct λ give rise to distinct partial isometries λV unless V = O.
Proposition 3.9.9 If T 2 B(H) and V is a partial isometry such that T = V|T|, then
(a) V*T = |T| = T*V;
(b) V*V|T| = |T| and V|T|V* = |T*|.

Proof (a) Since T = V|T|, we have V*T = V*V|T|. Therefore, in order to show that
V*T = |T|, it is sufficient to show that ran(|T|) ⊆ ran(V*V). We arrive at this by
showing that y = |T|x implies ||Vy|| = ||y|| and using the observation just after
Proposition 3.9.5:

 
||Vy||² = (V|T|x, V|T|x) = (Tx, Tx) = (T*Tx, x) = (|T|²x, x) = (|T|x, |T|x) = ||y||².

As |T| is self-adjoint, it follows that |T| = T*V as well.


(b) It follows from (a) that V*V|T| = V*T = |T|. As for V|T|V*, we note that it is
positive and that its square is V|T|V*V|T|V* = V|T|(V*V|T|)V* = V|T|²V* =
(V|T|)(|T|V*) = TT*. It is immediate from here that V|T|V* = |T*|. ∎
Theorem 3.9.10 If T 2 B(H) and either ker(T) or ker(T*) is {0}, then there is a
unique partial isometry V such that T = V|T|.
Proof Existence has been established in Theorem 3.9.6. Uniqueness when ker(T) =
{0} is a trivial consequence of the last part of that Theorem.
To prove uniqueness when ker(T*) = {0}, consider any partial isometries U and
V such that T = U|T| and T = V|T|. By Proposition 3.9.9(a), we have T*U = |T| =
T*V. When ker(T*) = {0}, this equality leads to U = V immediately. h
Theorem 3.9.11 If T 2 B(H) and there is a unique partial isometry V such that T =
V|T|, then either ker(T) or ker(T*) is {0}.
Proof We prove the contrapositive: if ker(T) ≠ {0} ≠ ker(T*), then there exist
several partial isometries V satisfying T = V|T|. Theorem 3.9.6 ensures that at least
one such partial isometry U always exists and we show how to get others from it
when ker(T) ≠ {0} ≠ ker(T*). Recall that Theorem 3.9.6 provides not only that

T = U|T|

but also that

[ran(T)] = ran(U) and ker(T) = ker(U).

Since ker(T) and ker(T*) each contain a one-dimensional subspace, there
exists an isometry from the former one-dimensional subspace onto the latter. Extend it
to an element of B(H) by defining it to be 0 on the orthogonal complement of the
one-dimensional subspace and call it V. Then, V is a partial isometry, distinct from
O; moreover,

(ker(T))⊥ ⊆ ker(V)

and

ran(V) ⊆ ker(T*).

There are infinitely many possibilities for V because λV has the same properties
when |λ| = 1. Since it is a partial isometry, V*V is the projection on (ker(V))⊥. Since

ker(T*) = [ran(T)]⊥ = (ran(U))⊥ = ker(U*), the second of the above inclusions is
equivalent to ran(V) ⊆ ker(U*), which is to say,

U*V = O.

Besides, in the light of the fact that ran(|T|) ⊆ (ker(|T|))⊥ = (ker(T))⊥, the first of
the above inclusions leads to ran(|T|) ⊆ ker(V), which can be rephrased as

V|T| = O.

Set W = U + V. It is enough to show that W is a partial isometry and that W|T| =
T. The latter is an easy consequence of the equality V|T| = O:

W|T| = (U + V)|T| = U|T| + V|T| = U|T| + O = U|T| = T.

We can show that W is a partial isometry by merely arguing that W*W is a
projection [Proposition 3.9.5]. Keeping in mind that U*V = O, so that V*U = O as
well, we have

W*W = (U* + V*)(U + V) = U*U + U*V + V*U + V*V = U*U + V*V.

But U*U is the projection on (ker(U))⊥ = (ker(T))⊥ ⊆ ker(V). This means
U*U and V*V are projections on mutually orthogonal subspaces. Therefore, their
sum W*W is a projection. This establishes that W is a partial isometry. ∎
We note in passing that every partial isometry W such that W|T| = T must
necessarily be of the form U + V, where V is a partial isometry which, as in the
foregoing proof, satisfies (ker(T))⊥ ⊆ ker(V) and ran(V) ⊆ ker(T*). For details, the
reader is referred to [14].
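The non-uniqueness established in Theorem 3.9.11 is already visible for 2 × 2 matrices. In the hypothetical sketch below (assuming NumPy; the matrices are mine), T = diag(1, 0) has ker(T) = ker(T*) = span{e2} ≠ {0}, and both the canonical U of Theorem 3.9.6 and W = U + V factor T through |T|.

```python
import numpy as np

T = np.diag([1., 0.])
absT = np.diag([1., 0.])            # |T| = (T*T)^(1/2) equals T here

U = np.diag([1., 0.])               # canonical choice: ker(U) = ker(T)
V = np.diag([0., 1.])               # isometry from ker(T) onto ker(T*)
W = U + V                           # = I, another partial isometry

for X in (U, W):
    assert np.allclose(X @ absT, T)      # both factorisations give T = X|T|
    assert np.allclose(X @ X.T @ X, X)   # both X are partial isometries
assert not np.allclose(U, W)             # and they are distinct
```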
Theorem 3.9.12 If T ∈ B(H) and n ∈ ℕ, then || |T|ⁿ || = ||T||ⁿ.
Proof Equality (3.41) in the proof of Theorem 3.9.6 justifies the case n = 1. For other
values of n, the desired equality follows upon applying Theorem 3.6.18 to the
self-adjoint operator |T| and using the case n = 1. ∎
Proposition 3.9.13 If T ∈ B(H), then

ker(T*T) = ker(T) = ker(|T|) and [ran(T*T)] = [ran(|T|)].

Proof The first equality is a restatement of the first equality of Theorem 3.5.8.
Applying it to |T| in place of T, we get ker(|T|) = ker(|T|*|T|) = ker(|T|²) = ker(T*T).
The last equality follows upon taking orthogonal complements and invoking the
third equality in Theorem 3.5.8. ∎

Problem Set 3.9

3.9.P1. Let T : ℓ² → ℓ² be defined by (ξ1, ξ2, …) → (0, 0, ξ3, ξ4, …). Without using
general properties of projections, show that T is bounded and positive.
Find the square root of T.
3.9.P2. Let T ∈ B(H) be self-adjoint and positive, where H denotes a complex
Hilbert space. Show that
(a) ||T^{1/2}|| = ||T||^{1/2},
(b) |(Tx, y)| ≤ (Tx, x)^{1/2}(Ty, y)^{1/2} and
(c) ||Tx|| ≤ ||T||^{1/2}(Tx, x)^{1/2}, so that (Tx, x) = 0 if, and only if, Tx = 0.

3.9.P3. (a) If T ∈ B(H) is a partial isometry and x ∈ ran(T), show that T*x is the
unique element y of [ker(T)]⊥ such that x = Ty. Moreover, ||T*x|| = ||y|| = ||x||.
(b) Show that if T ∈ B(H) is a partial isometry, then so is T*.

3.10 An Application

Mean Ergodic Theorem


Ergodic theory has its roots in the study of the chaotic motion of small particles,
such as pollen, suspended in a liquid. This chaotic motion was originally observed
by the botanist R. Brown in 1827 and subsequently came to be called Brownian
motion. The first result in connection with Brownian motion that led to major
developments in mathematics was proved by Poincaré in 1890.
Let (X, Σ, μ) be a measure space and T be a measurable transformation of X into
itself (F ∈ Σ implies T⁻¹(F) ∈ Σ). The transformation is said to be measure
preserving if μ(T⁻¹(E)) = μ(E) for every E ∈ Σ. A point x ∈ E is called recurrent
with respect to E and T if Tⁿx ∈ E for at least one positive integer n. Poincaré
proved that almost every point of E is recurrent provided that μ(X) < ∞. In fact, if
E ∈ Σ and μ(X) < ∞, then for almost every x ∈ E, there are infinitely many n such
that Tⁿx ∈ E, that is, almost every point of any measurable subset E returns to
E infinitely many times. The question arises whether such a point has a mean time of
sojourn in E; more precisely, whether

lim_n (1/n) Σ_{k=0}^{n−1} χ_E(Tᵏx)

exists, where T⁰ denotes the identity transformation. More generally, we may ask for
which class of measurable functions f(x)

lim_n (1/n) Σ_{k=0}^{n−1} f(Tᵏx)

exists in some sense.


If we begin with a function f in L¹(X, Σ, μ), the associated function Uf given by
(Uf)(x) = f(Tx) belongs to L¹(X, Σ, μ) and has the same norm as f. This is easy to see for
characteristic functions, hence for simple functions and consequently for other
functions, using the Monotone Convergence Theorem. Applying this to |f|², we
conclude that U is also an isometry on L²(X, Σ, μ). Note that the general term
f(Tᵏx) in the summation in the preceding paragraph can now be written as (Uᵏf)(x).
The question raised above will now be answered in the general context of a
Hilbert space for an operator U satisfying ||U|| ≤ 1, not necessarily preserving the
norm [Riesz and Nagy, cf. 23, p. 454].
(Mean Ergodic Theorem) Let H be a Hilbert space and U be a bounded linear
operator in H with ||U|| ≤ 1. If P is the orthogonal projection on the closed linear
subspace M = {x ∈ H : Ux = x}, then

lim_n (1/n) Σ_{k=0}^{n−1} Uᵏx = Px

for all x ∈ H.
Proof First, we shall prove that Ux = x if, and only if, U*x = x, where U* denotes
the adjoint of U. Observe that ||U*|| = ||U|| ≤ 1. Now Ux = x implies

0 ≤ ||U*x − x||² = ||U*x||² − (U*x, x) − (x, U*x) + ||x||²
= ||U*x||² − (x, Ux) − (Ux, x) + ||x||²
= ||U*x||² − (x, x) − (x, x) + ||x||²
= ||U*x||² − ||x||² ≤ 0;

so, U*x = x. Similarly, U*x = x implies Ux = x.


For any x ∈ M, the sums (1/n) Σ_{k=0}^{n−1} Uᵏx are all equal to x and so converge to x =
Px. Next, consider an element x = y − Uy, y ∈ H. For such an x, Σ_{k=0}^{n−1} Uᵏx = y −
Uⁿy and so, ||(1/n) Σ_{k=0}^{n−1} Uᵏx|| ≤ (2/n)||y|| → 0 as n → ∞. The collection

{x ∈ H : x = y − Uy, y ∈ H}   (3.42)

is clearly linear but not necessarily closed. Let z be any element in the closure K of
the collection (3.42). Then, there is a sequence xp = yp − Uyp such that xp → z as
p → ∞. Let An = (1/n) Σ_{k=0}^{n−1} Uᵏ. Then, ||An|| ≤ 1 for all n and

        
||An z|| ≤ ||An(z − xp)|| + ||An xp|| ≤ ||z − xp|| + ||An xp||.

So, given ε > 0, there exists an integer p0 such that ||z − xp0|| < ε/2. Also,

||An xp0|| = ||An yp0 − An Uyp0||
= ||(1/n) Σ_{k=0}^{n−1} Uᵏ yp0 − (1/n) Σ_{k=0}^{n−1} Uᵏ⁺¹ yp0||
= (1/n) ||yp0 − Uⁿ yp0||
≤ (2/n) ||yp0|| < ε/2,

provided n is sufficiently large. Therefore, limn An z = 0 for z ∈ K.


We next show that K⊥ = M:
v ∈ K⊥ ⇔ (v, y − Uy) = 0 for all y ⇔ (v, y) − (U*v, y) = 0 for all y ⇔ (v − U*v, y) = 0
for all y ⇔ v = U*v ⇔ v ∈ M.
Finally, x ∈ H can be written as x1 + x2 with x1 ∈ M and x2 ∈ M⊥ (= K), so that
(1/n) Σ_{k=0}^{n−1} Uᵏx converges to x1 + 0 = x1 = Px. This completes the proof. ∎
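The mean ergodic theorem is easy to watch numerically. In the sketch below (a hypothetical example, assuming NumPy), U cyclically permutes the coordinates of ℂ³; it is unitary, its fixed space M is spanned by (1, 1, 1), and P replaces every coordinate by the mean of the coordinates, so the averages (1/n) Σ Uᵏx tend to Px.

```python
import numpy as np

U = np.array([[0., 0., 1.],
              [1., 0., 0.],
              [0., 1., 0.]])          # cyclic coordinate permutation (unitary)
P = np.full((3, 3), 1.0 / 3.0)        # orthogonal projection on M = span{(1,1,1)}

x = np.array([1.0, 2.0, 6.0])
n = 3000                              # a multiple of the period 3
avg = sum(np.linalg.matrix_power(U, k) @ x for k in range(n)) / n

assert np.allclose(avg, P @ x)        # (1/n) Σ U^k x ≈ Px = (3, 3, 3)
```

Averaging over whole cycles makes the agreement exact here; for a generic contraction the convergence is only in the limit.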
Chapter 4
Spectral Theory and Special Classes
of Operators

4.1 Spectral Notions

As noted earlier, if H is a complex Hilbert space, B(H) is a C*-algebra with identity
[see Definition 3.5.6]. The invertibility of an operator T ∈ B(H) and its ramifications
were discussed in 3.3.7–3.3.12. In what follows, we shall study the invertibility
of the operators λI − T, where T ∈ B(H), I is the identity operator and
λ ∈ ℂ. The study of the distribution of the values of λ for which λI − T does not
have an inverse is called 'spectral theory' for the operator.
The study of the complement of the set {λ ∈ ℂ : λI − T is invertible in B(H)},
called the 'spectrum' of the operator T, is an important part of operator theory. In
finite dimensions, it is the set of eigenvalues of T. In infinite dimensions, the
operator λI − T may fail to be invertible in different ways. So, finding the spectrum
is not an easy problem. It is definitely more complicated than in the
finite-dimensional case.
Definition 4.1.1 If T ∈ B(H), we define the spectrum of T to be the set

σ(T) = {λ ∈ ℂ : λI − T is not invertible in B(H)}

and the resolvent set of T to be the set

ρ(T) = ℂ\σ(T) = {λ ∈ ℂ : λI − T is invertible in B(H)}.

R(λ0, T) denotes (λ0I − T)⁻¹ and is called the resolvent of T at λ0. Further, the
spectral radius of T is defined by

r(T) = sup{|λ| : λ ∈ σ(T)}.

Examples 4.1.2
(i) For the identity operator I ∈ B(H), σ(I) = {1}, ρ(I) = ℂ\{1} and r(I) = 1.

© Springer Nature Singapore Pte Ltd. 2017 233


H.L. Vasudeva, Elements of Hilbert Spaces and Operator Theory,
DOI 10.1007/978-981-10-3020-8_4

(ii) For an n × n matrix T, λI − T is not invertible if and only if
det(λI − T) = 0. Thus, in the finite-dimensional case, σ(T) is just the set of
eigenvalues of T (since det(λI − T) is an nth-degree polynomial whose roots
are the eigenvalues of T).
(iii) Let f : [a, b] → ℂ be continuous, where a < b are in ℝ. The multiplication
operator

(Tf x)(t) = f(t)x(t), a ≤ t ≤ b,

is a bounded operator on L²[a, b]. We argue in the next paragraph that

σ(Tf) = ran(f) = {λ ∈ ℂ : there exists t ∈ [a, b] for which f(t) = λ} = {f(t) : t ∈ [a, b]}.
If λ ∉ ran(f), then (λI − Tf) has a bounded inverse, namely the multiplication
operator T_{(λ−f)⁻¹}, and so, λ ∉ σ(Tf). On the other hand, if λ = f(t0) for some
t0 ∈ [a, b], then λ ∈ σ(Tf). Otherwise, (λI − Tf) would have a bounded inverse S. Pick
an interval Jn about t0 in [a, b], of length δn > 0, such that |f(t) − λ| < 1/n for t ∈ Jn,
and define

gn(t) = δn^{−1/2} for t ∈ Jn and gn(t) = 0 otherwise.

Then, (λI − Tf)gn → 0 as n → ∞ because ∫|(λI − Tf)gn|²dt ≤ (1/n²)δn⁻¹δn = 1/n²,
but S(λI − Tf)gn = gn, which has norm 1 for all n, contradicting the continuity of S.
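Example (ii) above is the computational case: for a matrix, σ(T) is the set of roots of the characteristic polynomial, which numerical libraries return as eigenvalues. A hypothetical sketch (assuming NumPy) for a 90° rotation matrix, together with the spectral radius r(T) of Definition 4.1.1:

```python
import numpy as np

T = np.array([[0., -1.],
              [1.,  0.]])            # rotation by 90 degrees, eigenvalues ±i

spectrum = np.linalg.eigvals(T)      # σ(T) in the finite-dimensional sense
assert np.allclose(sorted(spectrum.imag), [-1.0, 1.0])
assert np.allclose(spectrum.real, 0.0)

r = max(abs(spectrum))               # spectral radius r(T) = sup{|λ| : λ ∈ σ(T)}
assert np.isclose(r, 1.0)
```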
Depending on the way in which the invertibility of the operator λI − T fails, we
classify σ(T), the spectrum of T.
Recall that λI − T fails to be invertible if either ran(λI − T) ≠ H or
ker(λI − T) ≠ {0} [Problem 3.3.P3].
Definition 4.1.3
(a) The point spectrum (eigenspectrum, set of eigenvalues) of T ∈ B(H) is defined to
be the set

σp(T) = {λ ∈ ℂ : ker(λI − T) ≠ {0}};

in other words, there is a nonzero vector x in H such that (λI − T)x = 0, i.e.
λI − T is not injective.
(b) The continuous spectrum σc(T) is the set

σc(T) = {λ ∈ ℂ : λI − T is injective and ran(λI − T) is dense in H but
(λI − T)⁻¹ is not bounded}.

(c) The residual spectrum σr(T) is the set

σr(T) = {λ ∈ ℂ : λI − T is injective and ran(λI − T) is not dense in H and
(λI − T)⁻¹ exists as a bounded or unbounded operator}.

Remarks 4.1.4
(i) The conditions in (a), (b) and (c) are mutually exclusive and exhaustive by
Theorem 3.3.12. Thus, we have the following disjoint splitting of ℂ:

ℂ = ρ(T) ∪ σp(T) ∪ σc(T) ∪ σr(T)

and the union

σp(T) ∪ σc(T) ∪ σr(T)

comprises the spectrum of T.
(ii) If H is finite-dimensional and T ∈ B(H), then the two conditions ker(λI −
T) = {0} and ran(λI − T) = H are equivalent. Hence, σ(T) = σp(T) for
every operator T on a finite-dimensional Hilbert space H. Consequently, in
this case, σc(T) = ∅ = σr(T).
(iii) The multiplication operator Tt : L²[a, b] → L²[a, b] defined by Tt(x(t)) = tx(t),
a ≤ t ≤ b, is such that σp(Tt) = ∅. Indeed, the condition (λI − Tt)x = 0
implies (λ − t)x(t) = 0 a.e. and so, x(t) = 0 a.e. It has been proved in
Example (iii) of 4.1.2 that σ(Tt) = [a, b]. The domain of (λI − Tt)⁻¹ is the
set of all y's in L²[a, b] for which there exists an x in L²[a, b] satisfying
(λI − Tt)x = y, i.e. y(t)/(λ − t) is in L²[a, b]. We shall argue that the set
{y ∈ L²[a, b] : y(t)/(λ − t) ∈ L²[a, b]} is dense in L²[a, b]. Given f ∈ L²[a, b] and an
arbitrary δ > 0, there exists an ε > 0 such that the function fε, which is 0 on I =
(λ − ε, λ + ε) ∩ [a, b] and equals f on its complement, satisfies the inequality

∫ₐᵇ |f − fε|² = ∫_I |f(t)|²dt < δ.

Moreover, the function fε(t)/(λ − t) is in L²[a, b] since its L²-norm is less than or
equal to 1/ε times the L²-norm of f.

But the set {y ∈ L²[a, b] : y(t)/(λ − t) ∈ L²[a, b]} does not coincide with L²[a, b], as it
does not contain the constant function 1. Thus, each λ ∈ σ(Tt) is in σc(Tt). It
follows from (i) above that σr(Tt) = ∅.
Theorem 3.3.12¹ leads to yet another useful division of the spectrum into two
parts, not necessarily disjoint. It is an immediate consequence of that theorem that

¹Note that the same theorem had made it possible earlier to divide the complement of the point
spectrum into two disjoint parts.

λ ∈ σ(T) if and only if either ran(λI − T) is not dense in H or (λI − T) is not
bounded below: there is no ε > 0 such that ||(λI − T)x|| ≥ ε||x|| for every x ∈ H. In
the former case, λ is said to belong to the compression spectrum σcom(T) of T, and
in the latter case, λ is said to belong to the approximate point spectrum σap(T) of T.
In other words,

σcom(T) = {λ ∈ ℂ : ran(λI − T) is not dense in H},

σap(T) = {λ ∈ ℂ : there is a sequence {xn}n≥1 such that ||xn|| = 1 for every n
and ||(λI − T)xn|| → 0 as n → ∞}.

Sometimes, {xn}n≥1 is called an approximate eigenvector corresponding to the
approximate eigenvalue λ. Clearly,

σp(T) ⊆ σap(T) and σ(T) = σap(T) ∪ σcom(T).

The reader will note that

σr(T) = σcom(T)\σp(T),

which is to say the residual spectrum is the set of those points in the compression
spectrum that are not eigenvalues. Also,

σcom(T) ∪ σp(T) = σp(T) ∪ σr(T)

and

σc(T) = σ(T)\(σcom(T) ∪ σp(T))
= σap(T)\(σcom(T) ∪ σp(T))
= σap(T)\(σp(T) ∪ σr(T)).

Problem Set 4.1

4.1.P1. For T ∈ B(H), show that (i) λ ∈ σcom(T) implies λ̄ ∈ σp(T*) and (ii) λ ∈ σp(T) implies λ̄ ∈ σcom(T*).
4.1.P2. Let H = ℓ² and {ek}k≥1 be the standard orthonormal basis in ℓ². Any
x ∈ ℓ² has the representation x = Σ_{n=1}^∞ (x, en)en = Σ_{n=1}^∞ an en, where
an = (x, en), n = 1, 2, …. Define T : ℓ² → ℓ² by taking
Tx = Σ_{n=1}^∞ (an/(n+1)) en+1; in other words, Te1 = (1/2)e2, Te2 = (1/3)e3, …. Show that
T is a bounded linear operator, 0 ∈ σr(T) and any λ ≠ 0 belongs to ρ(T).

4.1.P3. Let H = ℓ² and {ek}k≥1 be the standard orthonormal basis in ℓ². Any
x ∈ ℓ² has the representation x = Σ_{n=1}^∞ (x, en)en = Σ_{n=1}^∞ an en, where
an = (x, en), n = 1, 2, …. Consider a sequence of scalars {λn}n≥1 such
that λn → 1 and no λn equals 1. Define T : ℓ² → ℓ² by Tx = Σ_{n=1}^∞ an λn en.
Show that
(a) T is a bounded linear operator;
(b) {λn : n = 1, 2, …} ⊆ σp(T);
(c) 1 ∈ σc(T);
(d) λ ≠ λn for any n and λ ≠ 1 implies λ ∈ ρ(T);
(e) σr(T) = ∅.
4:1:P4. Show that if A; B 2 BðHÞ; k 2 qðABÞ and k 6¼ 0, then k 2 qðBAÞ and
1
ðkI BAÞ ¼ k 1 I þ k 1 BðkI BAÞ 1 A:

Deduce that rðABÞ and rðBAÞ have the same elements with one possible
exception: the point zero. Show that the point zero is exceptional.
4:1:P5. Let l ¼ flk gk  1 be a bounded sequence of complex numbers,
M ¼ supk  1 jlk j. Define T : ‘2 ! ‘2 by

T ðx1 ; x2 ; . . .Þ ¼ ðl1 x1 ; l2 x2 ; . . .Þ:

Show that jjTjj ¼ supk  1 jlk j ¼ M. Show also that the eigenvalues of T are
l1 ; l2 ; . . . and rðTÞ ¼ flk : k  1g: What is T*?
4:1:P6. Let T 2 BðHÞ be self-adjoint and x be a fixed unit vector in H. Suppose
jjTxjj ¼ jjTjj. Show that x is an eigenvector of T2 corresponding to the
eigenvalue kTk2 ð¼kT 2 kÞ. Also, prove that Tx ¼ jjTjjx or Ty ¼ jjTjjy,
where y ¼ jjTjjx Tx 6¼ 0.
4:1:P7. Let T 2 BðHÞ, where H is a complex Hilbert space. Show that the fol-
lowing statements are equivalent:
(a) There exists k 2 rap ðTÞ such that jkj ¼ kTk;
(b) jjTjj ¼ supjjxjj¼1 jðTx; xÞj:
4:1:P8. Let S and T denote a pair of self-adjoint operators in BðHÞ: Then,

max min jm lj  jjS T jj:


m2rðTÞ l2rðSÞ

(The reader will note that by interchanging S and T, we also obtain

max min jm lj  jjS T jj:Þ


l2rðSÞ m2rðTÞ

4.2 Resolvent Equation and Spectral Radius

Let H be a finite-dimensional Hilbert space and T ∈ B(H). The set of λ's for which
det(λI − T) = 0 comprises the spectrum of T. The fundamental theorem of algebra
guarantees that σ(T) ≠ ∅. For every bounded linear operator defined on a Hilbert
space (finite- or infinite-dimensional), the spectrum σ(T) is a nonempty, closed and
bounded subset of the complex plane.

Theorem 4.2.1 (The resolvent equation) For λ, μ ∈ ρ(T),

R(λ, T) − R(μ, T) = (μ − λ)R(λ, T)R(μ, T).

Proof We have

R(λ, T) − R(μ, T) = (λI − T)⁻¹ − (μI − T)⁻¹
                  = (λI − T)⁻¹[(μI − T) − (λI − T)](μI − T)⁻¹
                  = (μ − λ)R(λ, T)R(μ, T). □

The above relation has the consequence that

R(λ, T)R(μ, T) = (R(λ, T) − R(μ, T))/(μ − λ)
              = (R(μ, T) − R(λ, T))/(λ − μ)
              = R(μ, T)R(λ, T).

Thus, the family {R(λ, T) : λ ∈ ρ(T)} is a commuting family, i.e. any two mem-
bers of the family commute with each other.
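The resolvent equation and the commutation it implies can be checked numerically on a finite-dimensional example (a sketch of ours, not from the text, assuming NumPy; any λ, μ at which λI − T and μI − T are invertible will do):

```python
import numpy as np

# Numerical check of the resolvent equation
#   R(lam) - R(mu) = (mu - lam) R(lam) R(mu)
# and of the commutation R(lam) R(mu) = R(mu) R(lam).
rng = np.random.default_rng(0)
T = rng.standard_normal((4, 4))
I = np.eye(4)

def R(lam):
    # resolvent R(lam, T) = (lam*I - T)^(-1)
    return np.linalg.inv(lam * I - T)

lam, mu = 5.0, -3.0
lhs = R(lam) - R(mu)
rhs = (mu - lam) * R(lam) @ R(mu)
print(np.allclose(lhs, rhs), np.allclose(R(lam) @ R(mu), R(mu) @ R(lam)))
```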
Theorem 4.2.2 Let T ∈ B(H). The resolvent set ρ(T) of T is open, and the map
λ → R(λ, T) = (λI − T)⁻¹ from ρ(T) ⊆ ℂ to B(H) is strongly holomorphic in the
sense of Definition 3.3.13 (understood with X = B(H)), vanishing at ∞. For each
x, y ∈ H, the map λ → (R(λ, T)x, y) = ((λI − T)⁻¹x, y) ∈ ℂ is holomorphic on
ρ(T), vanishing at ∞.

Proof Let λ ∈ ρ(T). By definition, λI − T is invertible and thus belongs to the set
G of all invertible elements of B(H). By the first part of Proposition 3.3.9, G is open.
Therefore, some δ > 0 has the property that any S ∈ B(H) which satisfies the
inequality ‖S − (λI − T)‖ < δ belongs to G. If |λ − μ| < δ, then S = μI − T
clearly satisfies the inequality and therefore belongs to G, so that μ ∈ ρ(T). This
shows that ρ(T) is open.

Since the map λ → (λI − T) from ρ(T) to G is continuous, it follows by the
second part of Proposition 3.3.9 that the map λ → (λI − T)⁻¹ from ρ(T) ⊆ ℂ to
B(H) is also continuous. The resolvent identity of Theorem 4.2.1 now shows that
the map is strongly holomorphic with derivative −R(λ, T)².

If |λ| → ∞, then I − λ⁻¹T → I in the uniform operator norm, which implies
(I − λ⁻¹T)⁻¹ → I [by the second part of Proposition 3.3.9]. Consequently,

R(λ, T) = (λI − T)⁻¹ = λ⁻¹(I − λ⁻¹T)⁻¹ → O.

Being strongly holomorphic, the map is also weakly holomorphic. Now, for
x, y ∈ H, the map from B(H) to ℂ given by S → (Sx, y) is a linear functional on
B(H). Hence, the map λ → (R(λ, T)x, y) = ((λI − T)⁻¹x, y) ∈ ℂ is holomorphic,
vanishing at ∞. □

Corollary 4.2.3 For T ∈ B(H), σ(T) = ℂ\ρ(T) is a closed subset of ℂ.

Recall that the spectral radius of an operator T ∈ B(H) is defined to be

r(T) = sup{|λ| : λ ∈ σ(T)}.

Theorem 4.2.4 Let T ∈ B(H), where H ≠ {0}. If |λ| > ‖T‖, then λ ∈ ρ(T) and

R(λ, T) = (λI − T)⁻¹ = Σ_{n=0}^∞ λ^{−n−1} Tⁿ,

where convergence takes place in the uniform operator norm. Also, the spectrum
σ(T) of T is a nonempty compact subset which lies in {λ ∈ ℂ : |λ| ≤ ‖T‖}. In
particular, there exists λ ∈ σ(T) such that |λ| = r(T).

Proof By Corollary 4.2.3, σ(T) is a closed subset of ℂ. If |λ| > ‖T‖, then
‖I − (I − λ⁻¹T)‖ = ‖λ⁻¹T‖ < 1, and by Proposition 3.3.8, I − λ⁻¹T is invertible
with (I − λ⁻¹T)⁻¹ = Σ_{n=0}^∞ (λ⁻¹T)ⁿ, convergence being in the uniform operator
norm. This implies that λI − T = λ(I − λ⁻¹T) is invertible and (λI − T)⁻¹ =
Σ_{n=0}^∞ λ^{−n−1}Tⁿ, convergence being in the uniform operator norm.

In particular, |λ| > ‖T‖ implies λ ∉ σ(T). In other words,
σ(T) ⊆ {λ ∈ ℂ : |λ| ≤ ‖T‖}, showing that σ(T) is bounded. Being closed, it is also
compact.

We show that the assumption σ(T) = ∅ leads to a contradiction. σ(T) = ∅
implies ρ(T) = ℂ. Now, for every x, y in H, (R(λ, T)x, y) is an entire function,
which vanishes at ∞, and is therefore bounded. By Liouville's Theorem,
(R(λ, T)x, y) is constant and the value of this constant is zero. Since (R(λ, T)x, y) =
0 for every x, y in H implies R(λ, T) = O, it follows that

O = R(λ, T)(λI − T) = I.

This is a contradiction. □
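The Neumann series of Theorem 4.2.4 can be illustrated with a small matrix (a sketch of ours, not from the text, assuming NumPy; for the nilpotent sample T below the series actually terminates after two terms):

```python
import numpy as np

# Partial sums of the Neumann series sum_{n>=0} lam^(-n-1) T^n converge
# to the resolvent (lam*I - T)^(-1) whenever |lam| > ||T||.
T = np.array([[0.0, 1.0], [0.0, 0.0]])   # ||T|| = 1
lam = 2.0                                 # |lam| > ||T||
I = np.eye(2)

S = sum(lam ** (-n - 1) * np.linalg.matrix_power(T, n) for n in range(50))
print(np.allclose(S, np.linalg.inv(lam * I - T)))
```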

Theorem 4.2.5 (Gelfand's formula) For any T ∈ B(H), the limit

lim_{n→∞} ‖Tⁿ‖^{1/n}

exists and equals r(T).

The following lemma will be needed in the proof of Gelfand's formula.

Lemma 4.2.6 For T ∈ B(H), lim_{n→∞} ‖Tⁿ‖^{1/n} exists and equals inf_n ‖Tⁿ‖^{1/n}.
Moreover, 0 ≤ inf_n ‖Tⁿ‖^{1/n} ≤ ‖T‖.

Proof Set a = inf_n ‖Tⁿ‖^{1/n}. Then, for ε > 0, there exists m such that ‖T^m‖^{1/m} < a + ε.
Now, any n ∈ ℕ can be written as n = pm + q, 0 ≤ q < m. So,

‖Tⁿ‖^{1/n} = ‖T^{pm+q}‖^{1/n} ≤ ‖T^m‖^{p/n} ‖T‖^{q/n} < (a + ε)^{mp/n} ‖T‖^{q/n}.

Since mp/n → 1 and q/n → 0 as n → ∞, it follows that

lim sup_n ‖Tⁿ‖^{1/n} ≤ a + ε.

As ε > 0 is arbitrary, we have

lim sup_n ‖Tⁿ‖^{1/n} ≤ a.

Also, a ≤ ‖Tⁿ‖^{1/n} for every n and this implies a ≤ lim inf_n ‖Tⁿ‖^{1/n}. Consequently,
lim_{n→∞} ‖Tⁿ‖^{1/n} exists and equals inf_n ‖Tⁿ‖^{1/n}. Finally, ‖Tⁿ‖^{1/n} ≤ (‖T‖ⁿ)^{1/n} = ‖T‖
implies a = inf_n ‖Tⁿ‖^{1/n} ≤ ‖T‖. □

Proof of Gelfand's Formula Let λ ∈ ℂ be such that |λ| > a = inf_n ‖Tⁿ‖^{1/n}. Then,
there exists a positive integer m such that |λ| > ‖T^m‖^{1/m}, i.e. ‖T^m‖ < |λ^m|, so that
λ^m ∈ ρ(T^m). Since

T^m − λ^m I = (T − λI)(T^{m−1} + λT^{m−2} + ⋯ + λ^{m−1}I)
            = (T^{m−1} + λT^{m−2} + ⋯ + λ^{m−1}I)(T − λI),

it follows that

(T − λI)⁻¹ = (T^{m−1} + λT^{m−2} + ⋯ + λ^{m−1}I)(T^m − λ^m I)⁻¹,

and so λ ∈ ρ(T). Consequently, r(T) ≤ a. It remains to show that a ≤ r(T). To this
end, we proceed as follows:
Let |λ| > r(T). Then, λ ∈ ρ(T). The resolvent R(λ, T) exists and is strongly
holomorphic on ρ(T) by Theorem 4.2.2. It therefore has a Laurent series around
λ = 0, converging in the operator norm for |λ| > r(T).

If |λ| > ‖T‖, then by Theorem 4.2.4,

R(λ, T) = Σ_{n=0}^∞ λ^{−n−1}Tⁿ,

which converges in the operator norm.

Since |λ| > ‖T‖ ≥ r(T), by uniqueness of the Laurent series, it follows that

R(λ, T) = Σ_{n=0}^∞ λ^{−n−1}Tⁿ if |λ| > r(T).

Hence,

lim_n ‖λ^{−n−1}Tⁿ‖ = 0 if |λ| > r(T),

and so, for any ε > 0, we must have

‖Tⁿ‖ ≤ ε|λ|^{n+1} ≤ (ε + |λ|)^{n+1} for large n and |λ| > r(T),

which implies

‖Tⁿ‖^{1/n} ≤ (ε + |λ|)^{1+1/n} for large n and |λ| > r(T),

and hence, as ε > 0 is arbitrary,

lim_{n→∞} ‖Tⁿ‖^{1/n} ≤ |λ| for |λ| > r(T).

Consequently,

lim_{n→∞} ‖Tⁿ‖^{1/n} ≤ r(T).

Using the Lemma proved above, we obtain Gelfand's formula:

r(T) = lim_{n→∞} ‖Tⁿ‖^{1/n}. □
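Gelfand's formula is easy to watch converge on a small example (a sketch of ours, not from the text, assuming NumPy). For the matrix below, the spectral radius is 1 although the operator norm is 4, so the sequence ‖Tⁿ‖^{1/n} starts at 4 and drops to 1:

```python
import numpy as np

# T has eigenvalues +1 and -1 (so r(T) = 1) but operator norm 4.
T = np.array([[0.0, 4.0], [0.25, 0.0]])
r = max(abs(np.linalg.eigvals(T)))    # spectral radius

seq = [np.linalg.norm(np.linalg.matrix_power(T, n), 2) ** (1.0 / n)
       for n in (1, 2, 8, 32)]        # Gelfand sequence ||T^n||^(1/n)
print(r, seq)
```

Here T² = I, so the even-index terms of the sequence equal 1 = r(T) exactly, while the odd-index terms 4^{1/n} also tend to 1.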
Remarks 4.2.7

(i) If T ∈ B(H) is such that T*T = TT*, then

r(T) = ‖T‖.

For a normal operator T, ‖T^p‖ = ‖T‖^p for p = 2ⁿ, n = 1, 2, … [Theorem 3.7.2].
It follows that ‖T^p‖^{1/p} = ‖T‖ for p = 2ⁿ, n = 1, 2, …, which implies that
the limit of the subsequence {‖T^p‖^{1/p}}_{p=2ⁿ} of the convergent sequence
{‖Tⁿ‖^{1/n}}_{n≥1} equals ‖T‖; so lim_{n→∞} ‖Tⁿ‖^{1/n} = ‖T‖. Hence, if T is normal,
r(T) = ‖T‖. Therefore, by Theorem 4.2.4, there exists λ ∈ σ(T) such that
|λ| = ‖T‖. In particular, if the spectrum contains only real numbers [e.g.
self-adjoint operators; see Theorem 4.4.2], then λ = ±|λ|, and therefore
either ‖T‖ ∈ σ(T) or −‖T‖ ∈ σ(T).

(ii) For T ∈ B(H), σ(T) = {0} if and only if lim_{n→∞} ‖Tⁿ‖^{1/n} = 0. Indeed, if
lim_{n→∞} ‖Tⁿ‖^{1/n} = 0, then r(T) = 0, which implies σ(T) = {0}. On the other
hand, if σ(T) = {0}, then r(T) = sup{|λ| : λ ∈ σ(T)} = 0, i.e.
lim_{n→∞} ‖Tⁿ‖^{1/n} = 0.

(iii) An operator T ∈ B(H) is called nilpotent if there exists an n ∈ ℕ such that
Tⁿ = O and is called quasinilpotent if σ(T) = {0}.

Any normal quasinilpotent operator is the zero operator. Indeed, if T is normal,
then lim_{n→∞} ‖Tⁿ‖^{1/n} = ‖T‖. Since T is quasinilpotent, σ(T) = {0}. It then follows
from (i) and (ii) above that ‖T‖ = 0, which implies T = O.
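Remark (i) can be checked numerically for a self-adjoint (hence normal) matrix (a sketch of ours, not from the text, assuming NumPy):

```python
import numpy as np

# For a self-adjoint matrix, the spectral radius equals the operator norm,
# illustrating Remark 4.2.7(i).
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
H = A + A.T                            # self-adjoint
r = max(abs(np.linalg.eigvalsh(H)))    # spectral radius r(H)
print(r, np.linalg.norm(H, 2))         # the two numbers agree
```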
Problem Set 4.2

4.2.P1. (a) Show that the analogue of Theorem 4.2.4 [σ(T) ≠ ∅] fails for real
spaces. (b) Give an example to show that it is possible to have r(T) = 0
but T ≠ O.
4.2.P2. Let A, B ∈ B(H) be bounded linear operators on a complex Hilbert space
H such that AB = BA. Show that

r(AB) ≤ r(A)r(B).

Give an example to show that commutativity cannot be dropped.
4.2.P3. Let A, B ∈ B(H) be bounded linear operators on a complex Hilbert space
H such that AB = BA. Show that r(A + B) ≤ r(A) + r(B). Give an example
to show that commutativity cannot be dropped.

4.3 Spectral Mapping Theorem for Polynomials

Let T ∈ B(H). To every polynomial p(z) = Σ_{j=0}ⁿ c_j z^j, we can associate the oper-
ator p(T) ∈ B(H) defined by Σ_{j=0}ⁿ c_j T^j. With f(z) = z̄ and f(z) = z⁻¹, we can
associate the operators f(T) = T* and f(T) = T⁻¹. The purpose of this section is to
investigate the relationship between σ(T) and the spectrum of the operators defined
above. In fact, we have the following theorem.

Spectral Mapping Theorem 4.3.1 Let H be a Hilbert space and T ∈ B(H). Then,
(a) σ(T*) = {λ̄ : λ ∈ σ(T)};
(b) if T is invertible, then σ(T⁻¹) = {λ⁻¹ : λ ∈ σ(T)};
(c) if p(z) = Σ_{j=0}ⁿ c_j z^j is a polynomial with complex coefficients and if p(T) is
defined by Σ_{j=0}ⁿ c_j T^j, then σ(p(T)) = {p(λ) : λ ∈ σ(T)} = p(σ(T)).

Proof
(a) Suppose λ ∉ σ(T). Then, (λI − T)⁻¹ exists, so that (λ̄I − T*)⁻¹ =
[(λI − T)*]⁻¹ = [(λI − T)⁻¹]* exists [see Theorem 3.5.4(d)]. Thus,
λ̄ ∉ σ(T*). We have thus proved σ(T*) ⊆ {λ̄ : λ ∈ σ(T)}. Applying this
argument to T*, we get σ(T) ⊆ {λ̄ : λ ∈ σ(T*)}. Taking conjugates, we get
{λ̄ : λ ∈ σ(T)} ⊆ {λ : λ ∈ σ(T*)} = σ(T*), so that σ(T*) = {λ̄ : λ ∈ σ(T)}.
(b) If T is invertible, then 0 ∉ σ(T), so that {λ⁻¹ : λ ∈ σ(T)} is well defined. If
λ ∉ σ(T) and λ ≠ 0, then the equation

λ⁻¹I − T⁻¹ = λ⁻¹T⁻¹(T − λI) = −λ⁻¹T⁻¹(λI − T)

shows λ⁻¹ ∉ σ(T⁻¹), for if λ⁻¹ ∈ σ(T⁻¹), then either λ ∈ σ(T) or λ = 0. In
other words, σ(T⁻¹) ⊆ {λ⁻¹ : λ ∈ σ(T)}. To prove the reverse inclusion, we
apply the result to T⁻¹. Thus,

σ(T⁻¹) = {λ⁻¹ : λ ∈ σ(T)}.

(c) When p is the zero polynomial or has degree 1, this is obvious. Let λ ∈ σ(T)
and p be a polynomial of degree n > 1. Then, p(z) − p(λ) is a polynomial of
degree n with λ as a root and we can factor p(z) − p(λ) as (z − λ)q(z), where
q is a polynomial of degree n − 1. Then,

p(T) − p(λ)I = (T − λI)q(T) = B, say.

If B were invertible, then the equation BB⁻¹ = B⁻¹B = I could be written as

(T − λI)q(T)B⁻¹ = B⁻¹q(T)(T − λI) = I.

This would mean T − λI is invertible, which is not possible if λ ∈ σ(T). Thus,
B is not invertible, i.e. p(λ) ∈ σ(p(T)). So, p(σ(T)) ⊆ σ(p(T)).
Let λ ∈ σ(p(T)). Factorise the polynomial p(z) − λ into linear factors and write

p(T) − λI = c(T − λ₁I)⋯(T − λ_nI).

Since p(T) − λI is not invertible, one of the factors T − λ_jI is not invertible. Thus,
λ_j ∈ σ(T), and also, p(λ_j) − λ = 0. This shows that λ = p(λ_j) for some λ_j ∈ σ(T).
Hence, σ(p(T)) ⊆ p(σ(T)). This completes the proof. □
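Part (c) can be verified numerically for a sample matrix and polynomial (a sketch of ours, not from the text, assuming NumPy; the two spectra are compared as point sets to avoid any dependence on eigenvalue ordering):

```python
import numpy as np

# Check sigma(p(T)) = p(sigma(T)) for p(z) = z^3 - 2z + 3.
rng = np.random.default_rng(2)
T = rng.standard_normal((4, 4))
I = np.eye(4)

eT = np.linalg.eigvals(T)
eig_pT = np.linalg.eigvals(np.linalg.matrix_power(T, 3) - 2.0 * T + 3.0 * I)
p_eigT = eT ** 3 - 2.0 * eT + 3.0

# Hausdorff-style distance between the two finite point sets
d1 = max(min(abs(a - b) for b in p_eigT) for a in eig_pT)
d2 = max(min(abs(a - b) for b in eig_pT) for a in p_eigT)
print(d1, d2)   # both should be tiny
```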
Example 4.3.2 [(ix) of Examples 3.2.5] The Volterra integral operator

V : L²[0, 1] → L²[0, 1]

defined by

Vx(s) = ∫₀^s x(t) dt, x ∈ L²[0, 1],

is a bounded linear operator of norm not exceeding 1/√2. We shall show that r(V) = 0
and 0 is not an eigenvalue of V. Now,

V²x(s) = V(Vx)(s) = ∫₀^s (Vx)(t) dt = ∫₀^s (∫₀^t x(u) du) dt
       = ∫₀^s x(u) (∫_u^s dt) du = ∫₀^s (s − u)x(u) du.

Proceeding as above, one can show that

Vⁿx(s) = (1/(n−1)!) ∫₀^s (s − u)^{n−1} x(u) du,

so

‖Vⁿx‖₂² = ∫₀¹ |Vⁿx(s)|² ds = (1/(n−1)!)² ∫₀¹ |∫₀^s (s − u)^{n−1} x(u) du|² ds
        ≤ (1/(n−1)!)² ∫₀¹ (∫₀^s (s − u)^{n−1} |x(u)| du)² ds
        ≤ (1/(n−1)!)² ∫₀¹ (∫₀^s |x(u)|² du)(∫₀^s (s − u)^{2n−2} du) ds,
using the Cauchy–Schwarz inequality,

        ≤ (1/(n−1)!)² ‖x‖₂².

Thus,

‖Vⁿx‖ ≤ (1/(n−1)!) ‖x‖.

Consequently, ‖Vⁿ‖ ≤ 1/(n−1)!, which implies

r(V) = lim_{n→∞} ‖Vⁿ‖^{1/n} ≤ lim_{n→∞} (1/(n−1)!)^{1/n} = 0.

The spectrum of V is thus the single point 0. Moreover, 0 is not an eigenvalue of V;
for if Vx = 0, then ∫₀^s x(u) du = 0 for every s ∈ [0, 1] and this implies x = 0 a.e. Since
0 ∈ σ(V), V is not invertible. Since the range {∫₀^s x(u) du : x ∈ L²[0, 1]} of V is
dense in L²[0, 1] (see below), it follows that 0 ∈ σ_c(V).

The range of V consists of continuous functions on [0, 1] vanishing at 0 and
differentiable a.e. We shall show that they are dense in L²[0, 1]. We need consider
only real functions. By the Stone–Weierstrass Theorem [13, Theorem 7.34 of
Chap. II], they are uniformly dense in the algebra of all real continuous functions
vanishing at 0. It is sufficient, therefore, to argue that this algebra is L²-dense in the
algebra of all continuous real functions.

Let f be any real continuous function on [0, 1], f(0) ≠ 0, and let ε > 0 be given.
There exists a positive δ₁ < 1 such that on the interval [0, δ₁], we have
|f(x)| ≤ 2|f(0)|. Choose a positive δ < δ₁ such that it also satisfies the inequality

δ < ε²/(16|f(0)|²).

Since 0 < δ < δ₁, the inequality |f(x)| ≤ 2|f(0)| holds on [0, δ] as well. Now,
consider the continuous function g defined to agree with f on [δ, 1] and have a
straight-line graph from the origin to the point (δ, f(δ)) on the graph of f. Then,
g(0) = 0 and g satisfies |g(x)| ≤ 2|f(0)| on [0, δ], and hence,

|f − g| ≤ 4|f(0)| on [0, δ].

Moreover,

|f − g| is zero on [δ, 1].

It follows that ∫_{[0,1]} |f − g|² ≤ 16|f(0)|² δ, which is less than ε² by choice of δ. Thus,
‖f − g‖ < ε in L²[0, 1].
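The quasinilpotency of V can be glimpsed numerically (a sketch of ours, not from the text, assuming NumPy) by discretising V with a left-endpoint quadrature rule and watching the Gelfand sequence ‖Vⁿ‖^{1/n} decay towards 0:

```python
import numpy as np

# Left-endpoint discretisation of the Volterra operator on [0, 1]:
# (Vx)(s_i) ~ h * sum_{j < i} x(t_j), a strictly lower-triangular matrix.
# The Gelfand sequence ||V^n||^(1/n) tends to 0, matching r(V) = 0.
N = 400
h = 1.0 / N
V = h * np.tril(np.ones((N, N)), k=-1)

seq = [np.linalg.norm(np.linalg.matrix_power(V, n), 2) ** (1.0 / n)
       for n in (1, 5, 20, 60)]
print(seq)   # strictly decreasing towards 0
```

The discretised matrix is strictly lower triangular, hence nilpotent, which mirrors the fact that the continuous operator is quasinilpotent without being nilpotent.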
Proposition 4.3.3 Let T ∈ B(H). Then, (a) σ_p(T*) = {λ̄ : λ ∈ σ_com(T)},
(b) σ(T*) = σ_ap(T*) ∪ {λ̄ : λ ∈ σ_ap(T)}, (c) {λ̄ : λ ∈ σ_com(T*)} ⊆ σ_p(T) ⊆ σ_ap(T)
and (d) σ_r(T) = {λ̄ : λ ∈ σ_p(T*)}\σ_p(T), where λ̄ denotes complex conjugation
[not closure].

Proof If λ ∈ σ_p(T*), then λI − T* has a nonzero kernel, and therefore,
ran(λ̄I − T) has a nonzero orthogonal complement, i.e. λ̄ ∈ σ_com(T); both these
implications are reversible. This proves (a).

The operator λI − T* is not invertible if and only if one of λI − T* and λ̄I − T
is not bounded from below [Theorem 3.5.9]. In other words, λ ∈ σ(T*) if and only
if either λ ∈ σ_ap(T*) or λ̄ ∈ σ_ap(T). This means

σ(T*) = σ_ap(T*) ∪ {λ̄ : λ ∈ σ_ap(T)}.

This proves (b).

If λ ∈ σ_com(T*), then by definition, λI − T* does not have dense range, and
therefore, λ̄I − T has a nontrivial kernel [Theorem 3.5.8], i.e. λ̄ ∈ σ_p(T). But
σ_p(T) ⊆ σ_ap(T). This proves (c).

By Remark 4.1.4, σ_r(T) = σ_com(T)\σ_p(T) = {λ̄ : λ ∈ σ_p(T*)}\σ_p(T) by part (a).
This proves (d). □

Proposition 4.3.4 Let T ∈ B(H). Then, σ_ap(T) is a closed subset of ℂ.

Proof Let λ ∉ σ_ap(T). Then, λI − T is bounded below. So there exists some ε > 0
such that ‖(λI − T)x‖ ≥ ε‖x‖. Also, for all μ, ‖(λI − T)x‖ ≤ ‖(μI − T)x‖ +
‖(λ − μ)x‖ for all x ∈ H. It follows that (ε − |λ − μ|)‖x‖ ≤ ‖(μI − T)x‖ for all μ
and all x ∈ H. For |λ − μ| sufficiently small, the preceding inequality implies μI −
T is bounded below. Hence, the complement of σ_ap(T) is open. □
Our next result shows that σ_ap(T) is not empty.

Theorem 4.3.5 If T ∈ B(H), then ∂σ(T) ⊆ σ_ap(T).

Proof Let λ ∈ ∂σ(T), and let {λ_n}_{n≥1} be a sequence in the resolvent set ρ(T)
such that λ_n → λ. We claim that ‖(λ_nI − T)⁻¹‖ → ∞ as n → ∞. Suppose this is
false. By passing to a subsequence if necessary, there is a constant M such that
‖(λ_nI − T)⁻¹‖ ≤ M for all n. Choose n sufficiently large so that |λ_n − λ| <
M⁻¹ ≤ ‖(λ_nI − T)⁻¹‖⁻¹. It follows on using Proposition 3.3.9 that λI − T is
invertible, a contradiction.

Let ‖x_n‖ = 1 satisfy a_n = ‖(λ_nI − T)⁻¹x_n‖ > ‖(λ_nI − T)⁻¹‖ − 1/n. Then, a_n →
∞ as n → ∞. Put y_n = a_n⁻¹(λ_nI − T)⁻¹x_n; then, ‖y_n‖ = 1. Now,

(λI − T)y_n = (λ_nI − T)y_n + (λ − λ_n)y_n = a_n⁻¹x_n + (λ − λ_n)y_n.

Thus,

‖(λI − T)y_n‖ ≤ a_n⁻¹ + |λ − λ_n|,

so that ‖(λI − T)y_n‖ → 0 as n → ∞; so λ ∈ σ_ap(T). □


We now work out in detail an example which illustrates the various kinds of
spectra.

Example 4.3.6 Let T be the simple unilateral shift on ℓ² defined by

T(λ₁, λ₂, …) = (0, λ₁, λ₂, …), {λ_i}_{i≥1} ∈ ℓ².

As seen in (vi) of Examples 3.5.10, the adjoint T* of T, called the left shift operator,
acts on ℓ² by

T*(λ₁, λ₂, …) = (λ₂, λ₃, …), {λ_i}_{i≥1} ∈ ℓ².

It has been observed [Example (vii) of 3.2.5] that ‖Tx‖ = ‖x‖ for x ∈ ℓ², and hence,
‖T‖ = 1. Since ‖T*‖ = ‖T‖ [Theorem 3.5.2], it follows that ‖T*‖ = 1.
Consequently, σ(T) ⊆ {λ ∈ ℂ : |λ| ≤ 1} and σ(T*) ⊆ {λ ∈ ℂ : |λ| ≤ 1}.

In what follows, σ(T), σ_p(T), σ_c(T), σ_r(T), σ_ap(T), σ_com(T) and their ana-
logues for T* will be characterised.

(i) Suppose |λ| < 1. The vector x_λ = (1, λ, λ², …) is in ℓ² and satisfies
(λI − T*)x_λ = 0. Thus, all such λ are in the point spectrum of T*. Thus,

{λ ∈ ℂ : |λ| < 1} ⊆ σ_p(T*).

Since the spectrum of an operator is a bounded closed subset of ℂ, it follows
that σ(T*) = {λ ∈ ℂ : |λ| ≤ 1}. In view of Theorem 4.3.1(a), we have
σ(T) = {λ ∈ ℂ : |λ| ≤ 1}. This characterises σ(T) and σ(T*).
(ii) From Theorem 4.3.5, ∂σ(T*) ⊆ σ_ap(T*). Since σ_p(T*) ⊆ σ_ap(T*) by defi-
nition, we have σ(T*) = {λ ∈ ℂ : |λ| ≤ 1} = {λ ∈ ℂ : |λ| < 1} ∪ {λ ∈ ℂ : |λ|
= 1} = {λ ∈ ℂ : |λ| < 1} ∪ ∂σ(T*) ⊆ σ_p(T*) ∪ σ_ap(T*) = σ_ap(T*) ⊆ σ(T*),
where we have used (i) for the first inclusion. Thus, we have shown that
σ(T*) = σ_ap(T*).
(iii) It may be remarked that no λ satisfying |λ| = 1 is in σ_p(T*). Indeed, if x =
{x_i}_{i≥1}, x ≠ 0, is such that T*x = λx, then (x₂, x₃, …) = (λx₁, λx₂, …), which
implies x_{n+1} = λx_n for n ≥ 1. So, x_{n+1} = λⁿx₁, n ≥ 1. Hence, x = x₁(1, λ, λ², …).
Since |λ| = 1, the vector x₁(1, λ, λ², …) ∈ ℓ² if and only if x₁ = 0, which
implies x = 0, a contradiction.

We next consider the spectrum of T.

(i) σ_p(T) = ∅. Indeed, if {ξ_n}_{n≥1} ∈ ℓ² and (λI − T)({ξ_n}) = 0 with λ ≠ 0, then
0 = λξ₁, ξ₁ = λξ₂, ξ₂ = λξ₃, …, implying that ξ₁ = 0, ξ₂ = 0, …; and if λ = 0,
then Tx = 0 implies x = 0 because T is an isometry.
(ii) σ_ap(T) = {λ ∈ ℂ : |λ| = 1}. If |λ| < 1 and x ∈ ℓ², then ‖(T − λI)x‖ ≥
|‖Tx‖ − |λ|‖x‖| = (1 − |λ|)‖x‖, which implies λ ∉ σ_ap(T). Consequently,
σ_ap(T) ⊆ {λ ∈ ℂ : |λ| = 1}. It follows in view of Theorem 4.3.5 that σ_ap(T) =
{λ ∈ ℂ : |λ| = 1}.
(iii) By Proposition 4.3.3, σ_p(T*) = {λ̄ : λ ∈ σ_com(T)}. It follows that σ_com(T) =
{λ ∈ ℂ : |λ| < 1}.
(iv) σ_c(T) = σ(T)\(σ_com(T) ∪ σ_p(T)) = {λ ∈ ℂ : |λ| ≤ 1}\{λ ∈ ℂ : |λ| < 1}
= {λ ∈ ℂ : |λ| = 1}, since σ_p(T) = ∅.
(v) σ_r(T) = σ(T)\(σ_c(T) ∪ σ_p(T)) = {λ ∈ ℂ : |λ| < 1}.

We have thus proved the following:
σ(T*) = σ_ap(T*), and σ_com(T*) = {λ̄ : λ ∈ σ_p(T)} = ∅ by Proposition 4.3.3 and
(i) of the paragraph above.
Also,

σ(T*) = σ_p(T*) ∪ σ_c(T*) ∪ σ_r(T*), where σ_p(T*) = {λ ∈ ℂ : |λ| < 1},

σ_c(T*) = σ(T*)\(σ_com(T*) ∪ σ_p(T*)) = σ(T*)\σ_p(T*) = {λ ∈ ℂ : |λ| = 1}

and

σ_r(T*) = ∅.

We summarise below the decomposition of the spectrum of T:

σ(T) = σ_ap(T) ∪ σ_com(T) = {λ ∈ ℂ : |λ| = 1} ∪ {λ ∈ ℂ : |λ| < 1}

and

σ(T) = σ_p(T) ∪ σ_c(T) ∪ σ_r(T) = ∅ ∪ {λ ∈ ℂ : |λ| = 1} ∪ {λ ∈ ℂ : |λ| < 1}.
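The eigenvector formula x_λ = (1, λ, λ², …) for T* in (i) can be checked on a finite truncation of the shift (a sketch of ours, not from the text, assuming NumPy; the truncation error is of size |λ|^N and so is negligible here):

```python
import numpy as np

# Truncated unilateral shift on C^N: T e_k = e_{k+1}. For |lam| < 1, the
# vector (1, lam, lam^2, ...) is (up to truncation error) an eigenvector
# of the adjoint T*, the left shift, as in Example 4.3.6(i).
N = 500
T = np.eye(N, k=-1)          # T[i, i-1] = 1, i.e. T e_k = e_{k+1}
lam = 0.5 + 0.3j             # |lam| < 1

x = lam ** np.arange(N)      # (1, lam, lam^2, ...), truncated
x = x / np.linalg.norm(x)
res = np.linalg.norm(T.conj().T @ x - lam * x)
print(res)                   # essentially zero
```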

4.4 Spectrum of Various Classes of Operators

Let H be a complex Hilbert space and B(H) denote the algebra of bounded linear
operators on H. Normal operators and their suitable subsets, such as self-adjoint
operators and unitary operators, have been studied in Sect. 3.7, and so have the
isometric operators. The spectral properties of a member of these classes are somewhat
simpler to describe than those of a general member of B(H). We begin with normal
operators.

Theorem 4.4.1 Every point in the spectrum of a normal operator is an approxi-
mate eigenvalue.

Proof If T ∈ B(H) is a normal operator and λ ∈ ℂ, then so is λI − T. So, for each
x ∈ H,

‖(λI − T)x‖ = ‖(λI − T)*x‖ [Theorem 3.7.1] = ‖(λ̄I − T*)x‖.

Thus, λ is an eigenvalue of T if and only if λ̄ is an eigenvalue of T*. This means
σ_p(T*) = {λ̄ : λ ∈ σ_p(T)}. Now, by Proposition 4.3.3, σ_p(T*) = {λ̄ : λ ∈ σ_com(T)},
and hence,

σ(T) = σ_ap(T) ∪ σ_com(T) = σ_ap(T) ∪ {λ̄ : λ ∈ σ_p(T*)}.

Since we have shown that σ_p(T*) = {λ̄ : λ ∈ σ_p(T)}, the above equality leads to

σ(T) = σ_ap(T) ∪ σ_p(T),

from which we get σ(T) = σ_ap(T) since σ_p(T) ⊆ σ_ap(T) by definition. □


The following theorem, which is a consequence of the one above, is important in
its own right.

Theorem 4.4.2 [cf. Problem 3.8.P2] The spectrum of every self-adjoint operator
T ∈ B(H) is a subset of ℝ. In particular, the eigenvalues of T, if any, are real.
Furthermore, if T is a positive operator, then the spectrum of T is nonnegative, and
eigenvalues, if any, are also nonnegative.

Proof Let λ = μ + iν, where μ and ν are real and ν ≠ 0. If T is a self-adjoint
operator and x ∈ H, then

‖(λI − T)x‖² = ((λI − T)x, (λI − T)x)
             = ((λ̄I − T)(λI − T)x, x)
             = |λ|²(x, x) − 2μ(Tx, x) + ‖Tx‖²
             = ‖(μI − T)x‖² + ν²‖x‖²
             ≥ ν²‖x‖².

So, λI − T is bounded below. This means that λ is not an approximate eigenvalue
and hence cannot be in the spectrum σ(T) of T by Theorem 4.4.1. Consequently,
σ(T) ⊆ ℝ.

Assume that T is positive and λ < 0. Then,

‖(λI − T)x‖² = ((λI − T)x, (λI − T)x)
             = ((λI − T)²x, x)
             = λ²(x, x) − 2λ(Tx, x) + ‖Tx‖²
             ≥ λ²‖x‖², since λ < 0.

So, λI − T is bounded below. This means that λ is not an approximate eigenvalue
and hence cannot be in the spectrum σ(T) of T by Theorem 4.4.1. □

The second assertion of the above theorem is trivial to prove directly. For a
self-adjoint operator, (Tx, x) is real. If λ ∈ σ_p(T), then there exists a nonzero x such
that (Tx, x) = (λx, x) = λ(x, x), and so λ is real.
In light of Theorem 4.4.2 and Remark 4.2.7(i), a self-adjoint operator T must
have the property that either ‖T‖ ∈ σ(T) or −‖T‖ ∈ σ(T).

Theorem 4.4.3 Let B(H) denote the algebra of bounded linear operators on a
complex Hilbert space H. Suppose T ∈ B(H) satisfies the equality TT* = T*T, i.e.
T is normal. Then,
(a) σ_p(T*) = {λ̄ : λ ∈ σ_p(T)};
(b) eigenvectors corresponding to distinct eigenvalues, if any, are orthogonal;
(c) σ_r(T) = ∅.

Proof
(a) Since T is normal, for λ ∈ ℂ, ‖(λI − T)x‖ = ‖(λI − T)*x‖ for each x ∈ H. It
follows that ker(λI − T) ≠ {0} if and only if ker(λ̄I − T*) ≠ {0}, that is,
σ_p(T*) = {λ̄ : λ ∈ σ_p(T)}.
(b) Let λ, μ be distinct eigenvalues of T and x, y ∈ H corresponding eigenvectors.
Then, Tx = λx and Ty = μy. It follows from (a) that

λ(x, y) = (λx, y) = (Tx, y) = (x, T*y) = (x, μ̄y) = μ(x, y).

Noting that λ ≠ μ, we deduce that (x, y) = 0, which says x is orthogonal to y.
(c) For any T ∈ B(H), σ_r(T) = {λ̄ : λ ∈ σ_p(T*)}\σ_p(T) by Proposition 4.3.3(d). It
therefore follows upon using (a) above that σ_r(T) = ∅ when T is normal. □

The spectrum of a self-adjoint operator can be characterised in more detail.
Recall that the spectrum σ(T) of an operator T ∈ B(H) is a nonempty compact
subset of ℂ. In the present case, we have the following.

Theorem 4.4.4 The spectrum σ(T) of a bounded self-adjoint linear operator T on
a complex Hilbert space H lies in the closed interval [m, M] on the real axis, where
m = inf_{‖x‖=1}(Tx, x) and M = sup_{‖x‖=1}(Tx, x).

Proof The fact that T = T* implies (Tx, x) is real for each x ∈ H. Indeed, for x ∈ H,
we have (Tx, x) = (x, Tx), which is the complex conjugate of (Tx, x).

The spectrum σ(T) lies on the real axis [Theorem 4.4.2]. We show that any real
number M + ε with ε > 0 belongs to the resolvent set ρ(T). For every x ∈ H, x ≠ 0,
and v = ‖x‖⁻¹x, we have x = ‖x‖v and

(Tx, x) = ‖x‖²(Tv, v) ≤ ‖x‖² sup_{‖v‖=1}(Tv, v) = ‖x‖²M.

Hence, −(Tx, x) ≥ −‖x‖²M. On applying the Cauchy–Schwarz inequality to ((λI −
T)x, x), where λ = M + ε, ε > 0, we obtain

‖(λI − T)x‖‖x‖ ≥ ((λI − T)x, x) = −(Tx, x) + λ(x, x)
             ≥ (−M + M + ε)‖x‖²
             = ε‖x‖².

This implies

‖(λI − T)x‖ ≥ ε‖x‖, x ∈ H.

Consequently, λ ∉ σ_ap(T) = σ(T), and hence, λ ∈ ρ(T).

The argument when λ < m is similar and is therefore not included. □

Let T ∈ B(H), where H is a Hilbert space over the field ℂ of complex numbers,
and T = T*. In the theorem above, we defined

m = inf_{‖x‖=1}(Tx, x)

and

M = sup_{‖x‖=1}(Tx, x).

The numbers m and M are related to the norm ‖T‖ of T. The following theorem has
already been proved using Example 3.4.7(ii) and Corollary 3.4.11 [see Theorem
3.6.6]. An independent proof is desirable.

Theorem 4.4.5 For T ∈ B(H), T = T*, we have

‖T‖ = max{|m|, |M|} = sup{|(Tx, x)| : ‖x‖ = 1}.

Proof Denote the supremum by a. By the Cauchy–Schwarz inequality,

sup_{‖x‖=1}|(Tx, x)| ≤ sup_{‖x‖=1}‖Tx‖‖x‖ = ‖T‖,

so that a ≤ ‖T‖. It remains to prove that ‖T‖ ≤ a. If Tx = 0 for all x ∈ H with ‖x‖ = 1,
then ‖T‖ = sup_{‖x‖=1}‖Tx‖ = 0. In this case, the proof is complete. Let x ∈ H be such that

‖x‖ = 1 and Tx ≠ 0. Set v = ‖Tx‖^{1/2}x and w = ‖Tx‖^{−1/2}Tx. Then, ‖v‖² = ‖w‖² = ‖Tx‖.
If y₁ = v + w and y₂ = v − w, then

(Ty₁, y₁) − (Ty₂, y₂) = 2{(Tv, w) + (Tw, v)}
                     = 2{(Tx, Tx) + (T²x, x)}
                     = 4‖Tx‖².

Now, for every y ≠ 0, and z = ‖y‖⁻¹y, we have ‖y‖z = y and

|(Ty, y)| = ‖y‖²|(Tz, z)| ≤ ‖y‖² sup_{‖z‖=1}|(Tz, z)| = a‖y‖².    (4.1)

By the triangle inequality in ℂ,

|(Ty₁, y₁) − (Ty₂, y₂)| ≤ |(Ty₁, y₁)| + |(Ty₂, y₂)|
                       ≤ a{‖y₁‖² + ‖y₂‖²}
                       = 2a{‖v‖² + ‖w‖²}    (4.2)
                       = 4a‖Tx‖.

From (4.1) and (4.2), we obtain

4‖Tx‖² ≤ 4a‖Tx‖,

which implies

‖Tx‖ ≤ a.

This completes the proof. □


The bounds for σ(T) in Theorem 4.4.4 cannot be tightened.

Theorem 4.4.6 If T ∈ B(H) is self-adjoint, then m and M, where m and M are as
in Theorem 4.4.4, are in the spectrum σ(T) of T.

Proof We show that M ∈ σ_ap(T) = σ(T). The proof that m ∈ σ(T) is similar and is,
therefore, not included.

By the Spectral Mapping Theorem 4.3.1, M ∈ σ(T) if and only if M + k ∈ σ(T +
kI), where k is a real constant. Without loss of generality, we may assume 0 ≤
m ≤ M. By Theorem 4.4.5,

M = sup_{‖x‖=1}(Tx, x) = ‖T‖.

By the definition of supremum, there is a sequence {x_n}_{n≥1} of vectors in H such
that

‖x_n‖ = 1, (Tx_n, x_n) > M − δ_n, δ_n ≥ 0 and δ_n → 0.

Then, ‖Tx_n‖ ≤ ‖T‖‖x_n‖ = ‖T‖ = M, and since T is self-adjoint,

‖Tx_n − Mx_n‖² = (Tx_n − Mx_n, Tx_n − Mx_n)
              = ‖Tx_n‖² − 2M(Tx_n, x_n) + M²‖x_n‖²
              ≤ M² − 2M(M − δ_n) + M²
              = 2Mδ_n → 0.

It follows by definition that M ∈ σ_ap(T) = σ(T). This completes the proof. □

Remark If T ∈ B(H) is a nonzero self-adjoint operator and m + M ≥ 0, then M > 0,
since m ≤ M and the bounds m and M cannot both be 0 by Corollary 3.6.7.
Therefore, |m| ≤ |M|. Hence, by Theorems 4.4.5 and 4.4.6, ‖T‖ = |M| =
M ∈ σ(T). On the other hand, if m + M < 0, then m < 0, and hence, ‖T‖ = |m| = −m,
so that −‖T‖ = m ∈ σ(T).
We now consider a subset of scalars which is closely related to the spectrum
σ(T) of a bounded linear operator T defined on a complex Hilbert space H.

Definition 4.4.7 The numerical range of a bounded linear operator T defined on a
complex Hilbert space H is the set

W(T) = {(Tx, x) : ‖x‖ = 1}.

The reader will note that ‖x‖ = 1, not ‖x‖ ≤ 1. The numerical range of T is the
range of the restriction to the unit sphere {x ∈ H : ‖x‖ = 1} of the quadratic form
(Tx, x) associated with T.

The following properties of the numerical range are easy to discern:
(a) W(aI + bT) = a + bW(T), where a and b are complex numbers;
(b) W(T) is real if T is self-adjoint;
(c) W(U*TU) = W(T) if U is unitary.

Since |(Tx, x)| ≤ ‖T‖‖x‖² for every x ∈ H, we see that |λ| ≤ ‖T‖ for all λ ∈ W(T).
In particular, W(T) is a bounded subset of ℂ. It, however, may not be closed. For
example, let H = ℓ² and let T ∈ B(H) be defined by Tx = Σ_{n=1}^∞ (a_n/n)e_n, where
x = Σ_{n=1}^∞ a_ne_n. Then, (Te_n, e_n) = 1/n ∈ W(T) for each n, but (Te_n, e_n) → 0 ∉ W(T).
However, the numerical range W(T) of T ∈ B(H) is a convex subset of ℂ, as we
shall later prove.

Examples 4.4.8

(i) Let T = [1 0; 0 0] and x = (u, v) ∈ ℂ², where |u|² + |v|² = 1. Now,

(Tx, x) = ((u, 0), (u, v)) = |u|².
254 4 Spectral Theory and Special Classes of Operators

So,

WðTÞ ¼ ½0; 1Š:



0 0 u
(ii) Let T ¼ and x ¼ 2 C2 , where juj2 þ jvj2 ¼ 1. Now,
1 0 v



 


0 0 u u 0 u
ðTx; xÞ ¼ ; ¼ ; ¼ uv:
1 0 v v u v

juvj  12 ðjuj2 þ jvj2 Þ ¼ 12 and equality holds if and only if juj ¼ jvj ¼ p1ffiffi2. In
other words, the numerical range of the operator under consideration lies
within the closed disc centred at 0 and having radius 12. We proceed to show
that the numerical range is in fact the entire disc.
Consider any complex number X + iY lying this disc; then, X 2 þ Y 2  14. Our
claim is that there exist complex numbers u and v such that uv = X + iY and
|u|2 + |v|2 = 1. Observe that as r ranges over [0, 1], the product r2(1 − r2)
ranges over [0; 14], taking the maximum value 14 when r ¼ p1ffiffi2. Using this
observation about the product r2(1 − r2), we obtain a number r 2 [0, 1] such
1
that r2(1 − r2) = X2 + Y2. Taking s to be ð1 r 2 Þ2 ; we can write r2 + s2 = 1
and X2 + Y2 = r2s2. From the latter of these equalities, we have X + iY = rseiw
for some w. Now, choose h and / in any manner so long as w = h − / and set
u = reih and v = sei/. Then, |u|2 + |v|2 = r2 + s2 = 1 and
uv ¼ rseiðh /Þ ¼ rseiw ¼ X þ iY. This proves our claim. (Note that the
numerical range
has turned out to be convex.)
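A random sampling of the quadratic form confirms the disc just described (a sketch of ours, not from the text, assuming NumPy):

```python
import numpy as np

# Sample (Tx, x) over random unit vectors in C^2 for T = [[0,0],[1,0]];
# every sampled value lies in the closed disc of radius 1/2 about 0.
rng = np.random.default_rng(3)
T = np.array([[0.0, 0.0], [1.0, 0.0]])

vals = []
for _ in range(2000):
    x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    x /= np.linalg.norm(x)
    vals.append(np.vdot(x, T @ x))  # (Tx, x), linear in the first argument
m = max(abs(v) for v in vals)
print(m)                             # at most 0.5, and close to it
```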
(iii) Let T = [0 0; 1 1]. We shall demonstrate that the numerical range of the
operator T in ℂ² is the set of all complex numbers X + iY such that

(X − ½)²/(1/√2)² + Y²/(½)² ≤ 1.

The author is indebted to Professor Ajit Iqbal Singh for the elegant argument
given below.

Lemma A If A and B are any two real numbers, not both zero, then the quadratic
equation

(A² + B²)t² − (2A + 1)t + 2 = 0

has a real root if and only if

(A − ½)²/(1/√2)² + B²/(½)² ≤ 1.

Proof The discriminant of the quadratic equation is

(2A + 1)² − 8(A² + B²),

which can be put into the form

2[1 − (A − ½)²/(1/√2)² − B²/(½)²].

Therefore, the quadratic equation, which has real coefficients, has a real root if and
only if

(A − ½)²/(1/√2)² + B²/(½)² ≤ 1. □

Lemma B A complex number X + iY is of the form (d + 1)/(|d|² + 1), where d is a
complex number, if and only if its real and imaginary parts X and Y satisfy

(X − ½)²/(1/√2)² + Y²/(½)² ≤ 1.

Proof Only if part: Assume X + iY = (d + 1)/(|d|² + 1), where d is a complex
number. Then, d = (X + iY)(|d|² + 1) − 1 and d̄ = (X − iY)(|d|² + 1) − 1, so that

|d|² = (X² + Y²)(|d|² + 1)² − 2X(|d|² + 1) + 1.

Put t = |d|² + 1. Then, t is real and the above equation is a quadratic in t with real
coefficients, namely

(X² + Y²)t² − (2X + 1)t + 2 = 0.    (*)

The required inequality now follows by Lemma A.



If part: Assume that X + iY is any complex number such that its real and
imaginary parts X and Y satisfy the inequality in question. If X² + Y² = 0, then X +
iY = 0, and choosing d = −1 leads to X + iY = (d + 1)/(|d|² + 1). So, suppose
X² + Y² ≠ 0. Using X and Y, set up the quadratic equation (*), which obviously has
real coefficients. By Lemma A, it must have a real solution. In what follows, the
symbol 't' will denote any one real solution. Obviously, t ≠ 0. Consider the
complex number d defined in terms of the nonzero number t and the given complex
number X + iY as

d = (X + iY)t − 1, or equivalently, d = (Xt − 1) + iYt.

This complex number d has the property that

X + iY = (d + 1)/t    (**)

and

|d|² = |(Xt − 1) + iYt|² = (Xt − 1)² + Y²t² = X²t² − 2Xt + 1 + Y²t²
     = (X² + Y²)t² − 2Xt + 1.

Hence,

|d|² + 1 = (X² + Y²)t² − 2Xt + 2 = t, in view of (*).

When this is combined with (**), the required equality X + iY = (d + 1)/(|d|² + 1)
springs forth. □
With the above Lemma B in hand, we can now prove that the numerical range of the operator $T$ in $\mathbb{C}^2$ given by the matrix $\begin{pmatrix}0&0\\1&1\end{pmatrix}$ is the set of all complex numbers $X+iY$ such that

$$\frac{\left(X-\frac{1}{2}\right)^2}{\left(\frac{1}{\sqrt{2}}\right)^2}+\frac{Y^2}{\left(\frac{1}{2}\right)^2}\le 1.$$

p
To see why this is so, let x ¼ 2 C2 , where jpj2 þ jqj2 ¼ 1. Now,
q



 


0 0 p p 0 p
ðTx; xÞ ¼ ; ¼ ; ¼ ðp þ qÞq:
1 1 q q pþq q

If $q=0$, then this is $0$. Now, suppose $q\ne 0$. Then, $p=dq$, where $d\in\mathbb{C}$, and $(Tx,x)=(d+1)|q|^2$. Also, $(|d|^2+1)|q|^2=1$. So, $|q|^2=1/(|d|^2+1)$, and hence, $(Tx,x)=(d+1)/(|d|^2+1)$, which is independent of $q$. Now, $d=-1$ implies $(Tx,x)=0$.
Therefore, the numerical range can be characterised as consisting of all values of $(d+1)/(|d|^2+1)$ as $d$ ranges over all complex numbers (keeping in mind that the value of $(Tx,x)$ when $d$ is not available, i.e. when $q=0$, is generated by $d=-1$). This characterisation reduces the matter to Lemma B.
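This characterisation lends itself to a quick numerical spot check. The sketch below is an illustration only, not part of the text's argument; it assumes numpy is available, samples random unit vectors in $\mathbb{C}^2$, and verifies that each value $(Tx,x)$ satisfies the ellipse inequality of Lemma B.

```python
import numpy as np

rng = np.random.default_rng(0)
T = np.array([[0, 0], [1, 1]], dtype=complex)

# Sample random unit vectors x in C^2 and evaluate (Tx, x).
for _ in range(1000):
    x = rng.normal(size=2) + 1j * rng.normal(size=2)
    x /= np.linalg.norm(x)
    w = np.vdot(x, T @ x)              # np.vdot conjugates its first argument
    X, Y = w.real, w.imag
    # Ellipse of Lemma B: (X - 1/2)^2/(1/sqrt(2))^2 + Y^2/(1/2)^2 <= 1,
    # i.e. (X - 1/2)^2/0.5 + Y^2/0.25 <= 1.
    assert (X - 0.5) ** 2 / 0.5 + Y ** 2 / 0.25 <= 1 + 1e-9
```

Every sampled point lands in the ellipse, as Lemma B predicts.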
(iv) Let $T$ be the left unilateral shift defined on $\ell^2$ by $Tx=\sum_{n=1}^{\infty}x_{n+1}e_n$, where $x=\sum_{n=1}^{\infty}x_ne_n$. Then, $(Tx,x)=\sum_{n=1}^{\infty}x_{n+1}\bar{x}_n$. Taking $m$ to be the smallest index for which $x_m\ne 0$ (such an $m$ must exist when $\|x\|=1$), we get

$$|(Tx,x)|\le\frac{1}{2}\left[|x_m|^2+2|x_{m+1}|^2+2|x_{m+2}|^2+\cdots\right]=\frac{1}{2}\left[2-|x_m|^2\right]<1.$$

It follows that $W(T)$ is contained in the open unit disc with centre $0$.
Conversely, let $z=re^{i\theta}$ with $0\le r<1$. Consider the vector

$$x=\sum_{n=1}^{\infty}r^{n-1}\sqrt{1-r^2}\,e^{i(n-1)\theta}e_n.$$

Observe that $\|x\|=1$ and $(Tx,x)=re^{i\theta}$.
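The vector exhibited above can be checked numerically by truncating it to finitely many coordinates; the tail it discards is geometrically small. The following sketch assumes numpy, and the helper name is ours, not the text's.

```python
import numpy as np

def shift_numerical_range_point(r, theta, N=2000):
    """(Tx, x) for the left shift (Tx)_n = x_{n+1}, at the vector
    x_n = r^n * sqrt(1 - r^2) * e^{i n theta} (0-based n), truncated to N terms.
    The truncation error is O(r^{2N})."""
    n = np.arange(N)
    x = (r ** n) * np.sqrt(1 - r ** 2) * np.exp(1j * n * theta)
    Tx = np.roll(x, -1)
    Tx[-1] = 0.0                       # (Tx)_n = x_{n+1}, with the tail cut off
    return np.vdot(x, Tx)              # sum_n x_{n+1} * conj(x_n) = (Tx, x)

w = shift_numerical_range_point(0.7, 1.2)
assert abs(w - 0.7 * np.exp(1.2j)) < 1e-10   # (Tx, x) = r e^{i theta}
assert abs(w) < 1                            # inside the open unit disc
```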


The numerical range W(T) of a bounded linear operator T belonging to BðHÞ,
where H is a complex Hilbert space, has decent properties, some of which are easy
to prove.
Theorem 4.4.9 Let $H$ be a Hilbert space over $\mathbb{C}$ and $T\in B(H)$. Then, $W(T)=\{(Tx,x):\|x\|=1\}$ has the following properties:
(a) $\lambda\in W(T)$ if and only if $\bar{\lambda}\in W(T^*)$;
(b) [Hausdorff–Toeplitz] $W(T)$ is a convex subset of $\mathbb{C}$;
(c) $\sigma(T)\subseteq\overline{W(T)}$;
(d) if $T$ is normal, then the convex hull of the spectrum $\sigma(T)$ of $T$ satisfies $\mathrm{co}(\sigma(T))=\overline{W(T)}$.

Proof
(a) Let $x\in H$ with $\|x\|=1$. Then, $\overline{(Tx,x)}=(x,Tx)=(T^*x,x)$. Thus, $(Tx,x)\in W(T)$ if and only if $\overline{(Tx,x)}\in W(T^*)$.
(b) Let $\xi=(Tx,x)$ and $\eta=(Ty,y)$ for unit vectors $x$ and $y$ in $H$. We want to prove that every point of the segment joining $\xi$ and $\eta$ is in $W(T)$. If $\xi=\eta$, the problem is trivial. Suppose $\xi\ne\eta$. Choose complex numbers $a$ and $b$ such that $a\xi+b=1$ and $a\eta+b=0$. Indeed, $a=1/(\xi-\eta)$ and $b=-\eta/(\xi-\eta)$ are the desired complex numbers.

For any complex numbers $u$ and $v$, it is easy to verify that

$$W(uT+vI)=\{uw+v:w\in W(T)\}=uW(T)+v.$$

Consequently, the set $\{0,1\}$ is contained in $W(aT+bI)$. It will suffice to show that the interval $(0,1)$ is included in $W(aT+bI)$. If any $t\in(0,1)$ can be shown to be of the form $a(Tz,z)+b$, where $\|z\|=1$, then

$$a(Tz,z)+b=t=t(a\xi+b)+(1-t)(a\eta+b)=a(t\xi+(1-t)\eta)+b,$$

which implies $t\xi+(1-t)\eta=(Tz,z)\in W(T)$.


So, there is no loss of generality in assuming that $\xi=1$ and $\eta=0$, i.e. $(Tx,x)=1$ and $(Ty,y)=0$, and showing that $[0,1]\subseteq W(T)$. It follows that $x$ and $y$ are linearly independent, because otherwise, $x$ would be a scalar multiple of $y$, and hence, $(Tx,x)$ would also be zero. Write

$$T=T_1+iT_2,\quad\text{where } T_1=\frac{T+T^*}{2}\ \text{and}\ T_2=\frac{T-T^*}{2i}\ \text{are Hermitian.}$$

Now,

$$1=(Tx,x)=(T_1x,x)+i(T_2x,x)\Rightarrow(T_1x,x)=1,\ (T_2x,x)=0;$$
$$0=(Ty,y)=(T_1y,y)+i(T_2y,y)\Rightarrow(T_1y,y)=0,\ (T_2y,y)=0.$$

If $x$ is replaced by $\lambda x$, $\lambda\in\mathbb{C}$, where $|\lambda|=1$, the value of $(Tx,x)$ remains unaltered and $(T_2\lambda x,y)=\lambda(T_2x,y)$. Furthermore, we may assume that $(T_2x,y)$ is purely imaginary. Indeed, $\lambda=i\bar{\mu}/|\mu|$, where $\mu=(T_2x,y)$, has the desired property.
Set $z(t)=tx+(1-t)y$, $0\le t\le 1$. Since $x$ and $y$ are linearly independent, $z(t)=0$ for no $t$.
Since

$$(T_2z(t),z(t))=t^2(T_2x,x)+t(1-t)\left((T_2x,y)+\overline{(T_2x,y)}\right)+(1-t)^2(T_2y,y)$$

for all $t$, it follows from the relations $(T_2x,x)=0=(T_2y,y)$ and $\Re(T_2x,y)=0$ that $(T_2z(t),z(t))=0$. Hence, $(Tz(t),z(t))$ is real for all $t$. So, the function

$$t\mapsto(Tz(t),z(t))/\|z(t)\|^2$$

is real-valued and continuous on $[0,1]$, and its values at $0$ and $1$ are, respectively, $0$ and $1$. Hence, the range of the function contains every $t\in[0,1]$.

(c) Let $\lambda\in\sigma_p(T)$. Then, $Tx=\lambda x$ for some $x\in H$ with $\|x\|=1$. Since $(Tx,x)=(\lambda x,x)=\lambda(x,x)=\lambda\|x\|^2=\lambda$, we see that $\lambda\in W(T)$. Next, let $\lambda\in\sigma(T)$. Note that $\sigma(T)=\sigma_{ap}(T)\cup\sigma_{com}(T)$ [Remarks 4.1.4] $=\sigma_{ap}(T)\cup\{\bar{\lambda}:\lambda\in\sigma_p(T^*)\}$ [Proposition 4.3.3(a)]. So, $\lambda\in\sigma_{ap}(T)$ or $\bar{\lambda}\in\sigma_p(T^*)$. If $\lambda\in\sigma_{ap}(T)$, then there is a sequence $\{x_n\}_{n\ge1}$ in $H$ such that $\|x_n\|=1$ and $Tx_n-\lambda x_n\to 0$ as $n\to\infty$. Since

$$|(Tx_n,x_n)-\lambda|=|((T-\lambda I)x_n,x_n)|\le\|(T-\lambda I)x_n\|\,\|x_n\|\to 0\quad\text{as } n\to\infty,$$

we see that $\lambda=\lim_n(Tx_n,x_n)$, and hence, $\lambda\in\overline{W(T)}$.
Also, if $\bar{\lambda}\in\sigma_p(T^*)$, then we have seen above that $\bar{\lambda}\in W(T^*)$, and hence, $\lambda\in W(T)$ by (a) above. This completes the proof.
(d) For a proof of this, we refer the reader to [3]. $\square$

Remark 4.4.10 If $T\in B(H)$ is self-adjoint, then $(Tx,x)$ is real. Indeed, $(Tx,x)=(x,Tx)=\overline{(Tx,x)}$. Consequently, $W(T)\subseteq\mathbb{R}$. If $m=\inf_{\|x\|=1}(Tx,x)$ and $M=\sup_{\|x\|=1}(Tx,x)$, then $W(T)\subseteq[m,M]$. Since $W(T)$ is convex [(b) above], it is an interval with infimum $m$ and supremum $M$, so $\overline{W(T)}=[m,M]$. Now, $[m,M]=\mathrm{co}\,\sigma(T)$ in view of Theorem 4.4.4 and Theorem 4.4.6. Hence, $\mathrm{co}\,\sigma(T)=\overline{W(T)}$, i.e. for a self-adjoint $T$, (d) holds.
The numerical range, like the spectrum, associates a set of complex numbers with each operator $T\in B(H)$; it is a set-valued function. The smallest disc centred at the origin that contains the numerical range has radius given by

$$w(T)=\sup\{|\lambda|:\lambda\in W(T)\}=\sup\{|(Tx,x)|:\|x\|=1\},$$

called the numerical radius of $T$. [Cf. Definition 3.7.3.]


In Theorem 3.7.7, it was proved that for a normal operator, the norm is the same
as its numerical radius.
Observe that $w(T)$ is a vector space norm on $B(H)$. That is, $0\le w(T)$ for every $T\in B(H)$ and $0<w(T)$ whenever $T$ is not zero; $w(aT)=|a|w(T)$ and $w(T+S)\le w(T)+w(S)$ for every $a\in\mathbb{C}$ and every $S$ and $T$ in $B(H)$. The numerical radius will now be shown to be equivalent to the operator norm of $B(H)$ and to dominate the spectral radius.

Proposition 4.4.11 For any $T\in B(H)$, we have $0\le r(T)\le w(T)\le\|T\|\le 2w(T)$.
Proof Since $\sigma(T)\subseteq\overline{W(T)}$ by Theorem 4.4.9(c), we have

$$r(T)=\sup\{|\lambda|:\lambda\in\sigma(T)\}\le\sup\{|\lambda|:\lambda\in\overline{W(T)}\}=\sup\{|\lambda|:\lambda\in W(T)\}.$$

So,

$$r(T)\le w(T).$$

Moreover,

$$w(T)=\sup\{|(Tx,x)|:\|x\|=1\}\le\sup\{\|Tx\|:\|x\|=1\}=\|T\|.$$

Note that, for every nonzero $z\in H$,

$$|(Tz,z)|=|(Tz/\|z\|,z/\|z\|)|\cdot\|z\|^2\le\sup\{|(Tu,u)|:\|u\|=1\}\cdot\|z\|^2=w(T)\|z\|^2.$$

By the polarisation identity and the parallelogram law,

$$\begin{aligned}
4|(Tx,y)|&=|(T(x+y),x+y)-(T(x-y),x-y)+i(T(x+iy),x+iy)-i(T(x-iy),x-iy)|\\
&\le w(T)\left(\|x+y\|^2+\|x-y\|^2+\|x+iy\|^2+\|x-iy\|^2\right)\\
&=4w(T)\left(\|x\|^2+\|y\|^2\right)\le 8w(T)\quad\text{whenever }\|x\|=1=\|y\|.
\end{aligned}$$

Therefore,

$$\|T\|=\sup\{|(Tx,y)|:\|x\|=1=\|y\|\}\le 2w(T). \qquad\square$$
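The chain of inequalities in Proposition 4.4.11 can be illustrated numerically. The sketch below (numpy assumed; grid size and matrix are our choices) uses the standard characterisation $w(A)=\max_\theta\lambda_{\max}\big((e^{i\theta}A+e^{-i\theta}A^*)/2\big)$ to approximate the numerical radius on a grid of angles; a fine grid suffices because the maximand is smooth in $\theta$.

```python
import numpy as np

def numerical_radius(A, grid=2000):
    """Approximate w(A) = max over theta of lambda_max((e^{i theta} A + e^{-i theta} A*)/2)."""
    w = 0.0
    for theta in np.linspace(0.0, 2 * np.pi, grid, endpoint=False):
        H = (np.exp(1j * theta) * A + np.exp(-1j * theta) * A.conj().T) / 2
        w = max(w, float(np.linalg.eigvalsh(H)[-1]))   # largest eigenvalue of Hermitian H
    return w

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))

r = max(abs(np.linalg.eigvals(A)))   # spectral radius r(A)
w = numerical_radius(A)              # numerical radius w(A)
norm = np.linalg.norm(A, 2)          # operator norm ||A||

# 0 <= r(A) <= w(A) <= ||A|| <= 2 w(A), as in Proposition 4.4.11.
assert r <= w + 1e-3
assert w <= norm + 1e-9
assert norm <= 2 * w + 1e-3
```

For the nilpotent matrix $\begin{pmatrix}0&1\\0&0\end{pmatrix}$ one finds $w=1/2$ and $\|T\|=1$, so the constant $2$ is sharp, in line with the remark on $T^2=O$ below.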
Remark It is known that if $T^2$ is the zero operator, then $\|T\|=2w(T)$. See [28] and references therein.
If $a$ is any positive number, then the vector space norm $w_a(T)=aw(T)$ is an algebra norm (i.e. satisfies $w_a(ST)\le w_a(S)w_a(T)$) if and only if $a\ge 4$. See [11].
The inequality $\|T^*T+TT^*\|\le 4w(T)^2\le 2\|T^*T+TT^*\|$ has been proved in Kittaneh [18].
If $T\in B(H)$, then $|(Tx,y)|^2\le(|T|x,x)(|T^*|y,y)$ and $2w(T)\le\|T\|+\|T^2\|^{\frac12}$. The second of these is due to Kittaneh [17].
If $S,T\in B(H)$ are positive, then $\|S^{\frac12}T^{\frac12}\|\le\|ST\|^{\frac12}$ and

$$2\|S+T\|\le\|S\|+\|T\|+\left(\left(\|S\|-\|T\|\right)^2+4\|S^{\frac12}T^{\frac12}\|^2\right)^{\frac12}.$$

The second of these is due to Kittaneh [16].


We now turn to the properties of the spectrum of unitary and isometric operators.

Recall that $U\in B(H)$, the algebra of all bounded linear operators on a complex Hilbert space $H$, is unitary if and only if $UU^*=U^*U=I$. Moreover, $\|U\|=1=\|U^*\|$. Therefore, $\sigma(U)\subseteq\{\lambda\in\mathbb{C}:|\lambda|\le 1\}$, and so is $\sigma(U^*)$. Note that $0\notin\sigma(U)$, since $U$, by definition, is invertible. If $0<|\lambda|<1$, then $\lambda I-U=\lambda(U^*-\lambda^{-1}I)U$. Since $\lambda^{-1}$ is not in the closed unit disc, the operator $\lambda(U^*-\lambda^{-1}I)$ is invertible, and hence, so is $\lambda I-U$. Thus, $\sigma(U)\subseteq\{\lambda\in\mathbb{C}:|\lambda|=1\}$. The unitary operator $U$ is normal, so $\sigma_r(U)=\emptyset$ [Theorem 4.4.3].
Examples 4.4.12
(i) (Bilateral shift; (i) of Examples 3.7.13). The operator $U:\ell^2(\mathbb{Z})\to\ell^2(\mathbb{Z})$ is defined by the rule

$$U(x)(n)=x(n-1),\quad x\in\ell^2(\mathbb{Z}),$$

and its adjoint $U^*:\ell^2(\mathbb{Z})\to\ell^2(\mathbb{Z})$ is defined by

$$U^*(x)(n)=x(n+1),\quad x\in\ell^2(\mathbb{Z}).$$

$UU^*=U^*U=I$. In other words, $U^{-1}$ exists and $U^{-1}=U^*$. Also, $\|U\|=\|U^*\|=1$, and so $\sigma(U)$ and $\sigma(U^*)$ are contained in the closed unit disc. From the paragraph above, it follows that $\sigma(U)\subseteq\{\lambda\in\mathbb{C}:|\lambda|=1\}$. From the normality of $U$, $\sigma(U)=\sigma_{ap}(U)$ [Theorem 4.4.1]. We next show that each $\lambda$ with $|\lambda|=1$ is an approximate eigenvalue of $U$.
For fixed $\theta$ in $[0,2\pi]$ and $n\in\mathbb{N}$, let $x_n$ be the vector in $\ell^2(\mathbb{Z})$ defined by

$$x_n(k)=\begin{cases}(2n+1)^{-\frac12}e^{-ik\theta},&|k|\le n,\\0,&\text{otherwise.}\end{cases}$$
Note that $\|x_n\|^2=(2n+1)^{-1}\sum_{k=-n}^{n}1=1$. A direct computation shows that $(U-e^{i\theta}I)x_n(k)=x_n(k-1)-e^{i\theta}x_n(k)$ vanishes except at the two positions $k=-n$ and $k=n+1$, where the entries have modulus $(2n+1)^{-\frac12}$. Therefore, $\|(U-e^{i\theta}I)x_n\|^2=\frac{2}{2n+1}$, so that $\lim_n(U-e^{i\theta}I)x_n=0$.
Thus, each $e^{i\theta}$, $\theta\in[0,2\pi]$, is an approximate eigenvalue of $U$.
It may be argued that $\sigma_p(U)=\emptyset$, as is done below. Let $\lambda$ be an eigenvalue, so that $|\lambda|=1$, and let $x\in\ell^2(\mathbb{Z})$ be a corresponding nonzero eigenvector. Since $(Ux)(n)=x(n-1)$ and $(\lambda x)(n)=\lambda(x(n))$, we have $x(n-1)=\lambda(x(n))$, and hence, $x(-n)=\lambda^{n}x(0)$ for any $n\ge 0$. This implies

$$\|x\|^2\ge\sum_{n=0}^{\infty}|x(-n)|^2=|x(0)|^2\sum_{n=0}^{\infty}|\lambda|^{2n},$$

and since $|\lambda|=1$, the series diverges unless $x(0)=0$; this leads to $x(0)=0$. Therefore, $x(n)=0$ for all nonpositive $n$; by similar considerations, we can show that $x(n)=0$ for positive $n$ as well. Hence, the contradiction that $x=0$.
(ii) (Multiplication Operator) Let $H=L^2[0,2\pi]$. The multiplication operator $U:H\to H$ is defined by the formula $(Ux)(t)=e^{it}x(t)$, $x\in H$. Note that $U$ is unitary [(ii) of Examples 3.7.13]. So, $\sigma(U)\subseteq\{\lambda\in\mathbb{C}:|\lambda|=1\}$. It follows from (iii) of Examples 4.1.2 that $\sigma(U)=\{e^{it}:t\in[0,2\pi]\}$. From the fact that $U$ is normal, each point of the spectrum is an approximate eigenvalue [Theorem 4.4.1].
The functions $e^{int}$, $n\in\mathbb{Z}$, form an orthogonal basis of $L^2[0,2\pi]$ (orthonormal after division by $\sqrt{2\pi}$). If we identify $L^2[0,2\pi]$ with $\ell^2(\mathbb{Z})$ in terms of this basis, then the multiplication operator $U$ gets identified with the bilateral shift. Thus, (ii) is really the ‘same’ example as (i). Therefore, the operators have the same spectrum of each kind. In particular, the point spectrum of the multiplication operator is empty, a fact which can of course be deduced directly from the definition of the operator as well.
Recall that an operator $V\in B(H)$, the algebra of operators on a complex Hilbert space $H$, is an isometry if $\|Vx\|=\|x\|$ for each $x\in H$. The norm $\|V\|$ of $V$ is $1$. So, $\sigma(V)\subseteq\{\lambda\in\mathbb{C}:|\lambda|\le 1\}$. There exist isometries whose spectrum coincides with the closed unit disc. In fact, if $V$ is the simple unilateral shift and $V^*$ denotes the adjoint of $V$ [Example 4.3.6], then $\sigma(V^*)=\{\lambda\in\mathbb{C}:|\lambda|\le 1\}$ and $\sigma(V)=\{\bar{\lambda}:\lambda\in\sigma(V^*)\}=\{\lambda\in\mathbb{C}:|\lambda|\le 1\}$.
The next result shows that the eigenvalues of an isometry, if any, lie on the unit
circle and the eigenspaces corresponding to distinct eigenvalues are orthogonal.
Theorem 4.4.13 Let $V\in B(H)$ be an isometry. Then,
(a) every $\lambda\in\sigma_p(V)$ lies on the unit circle;
(b) if $M_\lambda$ and $M_\mu$ are eigenspaces of $V$ corresponding to $\lambda$ and $\mu$, respectively, then $\lambda\ne\mu$ implies $M_\lambda\perp M_\mu$.

Proof
(a) Let $\lambda\in\sigma_p(V)$. Then, there exists an $x\in H$, $x\ne 0$, such that $Vx=\lambda x$. Now, $\|x\|^2=\|Vx\|^2=(Vx,Vx)=|\lambda|^2\|x\|^2$, so that $|\lambda|^2=1$, and hence, $|\lambda|=1$.

(b) Let $x\in M_\lambda$ and $y\in M_\mu$. Then, by Proposition 3.7.16,

$$(x,y)=(Vx,Vy)=(\lambda x,\mu y)=\lambda\bar{\mu}(x,y).$$

So, $(1-\lambda\bar{\mu})(x,y)=0$. Since $1-\lambda\bar{\mu}\ne 0$ (if $\lambda\bar{\mu}=1$, then $\mu=\lambda|\mu|^2=\lambda$), it follows that $(x,y)=0$. This completes the proof. $\square$
Problem Set 4.4

4.4.P1. Let $A\in B(H)$ be self-adjoint and $\lambda\in\mathbb{C}$ be a complex number such that $\Im\lambda\ne 0$. Show that

$$\|(A-\lambda I)x\|\ge|\Im\lambda|\,\|x\|\quad\text{for all } x\in H.$$

Hence or otherwise, show that the spectrum of $A$ is real.
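The inequality in 4.4.P1 is easy to test numerically for a randomly generated self-adjoint matrix. This is a sanity check, not a solution to the problem; numpy and the specific $\lambda$ below are our choices.

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
A = (B + B.conj().T) / 2               # self-adjoint: A = A*
lam = 0.3 + 0.7j                       # a complex number with Im(lam) != 0

# ||(A - lam I)x|| ||x|| >= |((A - lam I)x, x)| >= |Im((A - lam I)x, x)| = |Im lam| ||x||^2,
# because (Ax, x) is real for self-adjoint A.
for _ in range(200):
    x = rng.normal(size=6) + 1j * rng.normal(size=6)
    lhs = np.linalg.norm((A - lam * np.eye(6)) @ x)
    assert lhs >= abs(lam.imag) * np.linalg.norm(x) - 1e-9
```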

4.5 Compact Linear Operators

A typical example of an unbounded operator is the differential operator studied in Example (x) of 3.2.5. [An unbounded linear operator is defined on a dense linear subspace of the space under consideration.] The theory developed for bounded linear operators is not applicable to differential operators. In order to overcome this difficulty in part, the results on bounded linear operators are applied to the inverse operators of differential operators after restricting the latter to a subspace on which they are injective. The inverse of the linear differential operator cited above is the familiar Volterra operator in (ix) of 3.2.5:

$$Vx(s)=\int_0^s x(t)\,dt.$$

These inverse operators are not only bounded, but in addition possess a special property called ‘compactness’. Compact operators are also called completely continuous operators. Most of the statements about these operators are generalisations of the statements about linear operators in finite-dimensional spaces.
The use of linear operator methods to prove some of Fredholm’s results on linear integral equations of the form

$$(T-\lambda I)x(s)=y(s),\quad\text{where } Tx(s)=\int_a^b k(s,t)x(t)\,dt,$$

$\lambda$ being a parameter, $y$ and $k$ given functions and $x$ the unknown function, was pioneered by F. Riesz in 1916. The concept of linear spaces had not been formulated by then, and Riesz worked with integral equations. His techniques generalise directly and can be applied to compact (or completely continuous) operators.

Compact linear operators are defined as follows.


Definition 4.5.1 Let $X$ and $Y$ be normed linear spaces. A linear operator $T:X\to Y$ is called a compact operator (or completely continuous operator) if it maps the unit ball $B=\{x\in X:\|x\|\le 1\}$ of $X$ onto a precompact (i.e. having compact closure) subset of $Y$.
Since $T$ is linear, this means that for every bounded subset $M$ of $X$, the closure $\overline{T(M)}$ is a compact subset of $Y$.
The sequence criterion for compactness in a metric space tells us that $T$ is compact if and only if for every bounded sequence $\{x_n\}_{n\ge1}$ in $X$, the sequence $\{Tx_n\}_{n\ge1}$ in $Y$ has a convergent subsequence.
The following lemma shows that a compact linear operator is continuous,
whereas the converse is generally not true [see Remark 4.5.3(i)].
Lemma 4.5.2 Let X and Y be normed linear spaces. Then, every compact linear
operator T:X ! Y is bounded and hence continuous.
Proof The unit sphere $S=\{x\in X:\|x\|=1\}$ is bounded. Since $T$ is a compact operator, $\overline{T(S)}$ is compact. It is therefore bounded, that is,

$$\sup_{\|x\|=1}\|Tx\|<\infty.$$

Thus, $T$ is a bounded linear operator and is therefore continuous. $\square$


Remarks 4.5.3
(i) We show that the identity operator on an infinite-dimensional normed linear
space is not compact.
Let $X$ be an infinite-dimensional normed linear space and $\{x_1,x_2,\ldots\}$ denote linearly independent vectors in $X$. We claim that there exist $y_n$, $n=1,2,\ldots$, satisfying the properties

$$\|y_n\|=1\ \text{for all } n,\qquad y_n\in M_n=[\{x_1,x_2,\ldots,x_n\}], \tag{4.3}$$

the linear span of $\{x_1,x_2,\ldots,x_n\}$, and

$$\|y_{n+1}-x\|\ge\frac12\ \text{for all } x\in M_n. \tag{4.4}$$

To prove this claim, set

$$y_1=\frac{x_1}{\|x_1\|}\in M_1.$$

Note that $M_1$, being a finite-dimensional subspace of $X$, is closed. By Riesz Lemma 5.2.11, there exists a vector $y_2\in M_2$ with $\|y_2\|=1$ such that

$$\|y_2-x\|\ge\frac12\ \text{for all } x\in M_1.$$

Continuing in this manner, we obtain $y_1,y_2,\ldots$ satisfying the conditions specified in (4.3) and (4.4). Now, consider the sequence $\{y_n\}_{n\ge1}$. It is clear that it is bounded ($\|y_n\|=1$, $n=1,2,\ldots$). Its image under the identity operator is the sequence itself. In view of (4.4), the sequence under consideration satisfies

$$\|y_n-y_m\|\ge\frac12\ \text{for all } n\ne m$$

and therefore cannot have a convergent subsequence.


The above remark essentially says that in an infinite-dimensional normed space,
the unit ball is never compact.
(ii) In case of some normed linear spaces, it is possible to prove the above result
without appealing to the Riesz lemma, as the following example shows.
Let $X=\ell^p$, and $e_k=(0,0,\ldots,0,1,0,\ldots)$, where $1$ occurs at the $k$th place, $k=1,2,\ldots$. Then, $\{e_n\}_{n\ge1}$ is a bounded sequence in $\ell^p$ and $\|e_n\|=1$, $n=1,2,\ldots$. However, $\{Ie_n\}_{n\ge1}$ has no convergent subsequence. Indeed,

$$\|Ie_n-Ie_m\|_p=\|e_n-e_m\|_p=2^{1/p}\ \text{for } n\ne m.$$

A similar argument works in any infinite-dimensional Hilbert space. Let $\{e_k\}_{k\ge1}$ be any orthonormal sequence; then, $\|e_k\|=1$ for every $k$ and $\|e_n-e_m\|^2=2$ whenever $n\ne m$.
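The distance computations above are easily confirmed in finite dimensions (a sketch; numpy assumed):

```python
import numpy as np

def lp_norm(v, p):
    """The l^p norm of a finitely supported vector."""
    return float(np.sum(np.abs(v) ** p) ** (1.0 / p))

N = 8
e = np.eye(N)
for p in (1.0, 2.0, 3.5):
    for n in range(N):
        for m in range(N):
            if n != m:
                # ||e_n - e_m||_p = 2^{1/p}: the basis vectors stay a fixed
                # distance apart, so {e_n} has no convergent subsequence.
                assert abs(lp_norm(e[n] - e[m], p) - 2 ** (1.0 / p)) < 1e-12
```

For $p=2$ this is the Hilbert-space computation $\|e_n-e_m\|=\sqrt{2}$.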
(iii) If either $X$ or $Y$ is finite-dimensional, then every $T\in B(X,Y)$ is compact. Suppose $\dim(Y)<\infty$. Let $\{x_n\}_{n\ge1}$ be a bounded sequence in $X$. Then, the inequality $\|Tx_n\|\le\|T\|\,\|x_n\|$ shows that $\{Tx_n\}_{n\ge1}$ is bounded. Since $\dim(Y)<\infty$, it follows that $\{Tx_n\}_{n\ge1}$ has a convergent subsequence. Now, suppose $\dim(X)<\infty$. Note that $\dim(TX)\le\dim(X)$. The result therefore follows from what has just been proved.

Definition 4.5.4 Let T 2 BðX; YÞ. The rank of T is defined to be the dimension of
the range ran(T) of T. If the range is finite-dimensional, we say that T has finite
rank.
The rank is a purely algebraic concept.
In (iii) of the remark above, we have noted that finite rank operators in B(X,Y)
are compact. We write B0 (X,Y) for the collection of all compact operators from X to

Y and B00 (X,Y) for the collection of all finite rank operators from X to Y. We
abbreviate B0 ðX; XÞ as B0 (X).
The reader will note that $B_{00}(X,Y)\subseteq B_0(X,Y)\subseteq B(X,Y)$.
Examples 4.5.5
(i) Let X be a normed linear space, z a vector in X and f a bounded linear
functional on X. We define T:X ! X by

Tx ¼ f ðxÞz; x 2 X:

$T$ is linear since $f$ is a linear functional. Moreover, $T$ is bounded. Indeed,

$$\|Tx\|=\|f(x)z\|\le\|f\|\,\|x\|\,\|z\|,$$

which implies

$$\|T\|\le\|f\|\,\|z\|.$$

Since T is of rank 1, it follows that T is a completely continuous operator.


(ii) Let $X=C[0,1]$, the space of continuous functions on $[0,1]$ with $\|x\|=\sup\{|x(t)|:0\le t\le 1\}$. Let $k(s,t)$ be a continuous kernel on $[0,1]\times[0,1]$. Define the integral operator $T$ by

$$(Tx)(s)=\int_0^1 k(s,t)x(t)\,dt,\quad x\in C[0,1].$$

Then, $T$ will be shown to be a compact operator.
Let $\{x_n\}_{n\ge1}$ be a sequence in $X$ with $\|x_n\|\le 1$ for all $n$. We shall show that $\{Tx_n\}_{n\ge1}$ has a convergent subsequence. For this, we shall use Ascoli’s Theorem [see Theorem 1.2.21]. Since $\|Tx_n\|\le\|T\|$, the sequence $\{Tx_n\}_{n\ge1}$
is bounded. We shall show that it is equicontinuous. Since $k$ is uniformly continuous, for each $\varepsilon>0$, there exists a $\delta>0$ such that $|s_1-s_2|<\delta$ implies $|k(s_1,t)-k(s_2,t)|<\varepsilon$ for all $t\in[0,1]$. Thus, for $|s_1-s_2|<\delta$, we have

$$|Tx_n(s_1)-Tx_n(s_2)|\le\int_0^1|k(s_1,t)-k(s_2,t)|\,|x_n(t)|\,dt\le\varepsilon\int_0^1|x_n(t)|\,dt\le\varepsilon.$$

Thus, the sequence $\{Tx_n\}_{n\ge1}$ is equicontinuous. So, by Ascoli’s Theorem, it has a convergent subsequence.

If we take $k$ to be the characteristic function of the set $\{(s,t)\in[0,1]\times[0,1]:t<s\}$, which is patently discontinuous, the above argument does not apply. But we still have an operator in $C[0,1]$, called the Volterra operator, just like its counterpart in $L^2[0,1]$. Since

$$|(Tx)(s)|=\left|\int_0^s x(t)\,dt\right|\le s\|x\|\le\|x\|,$$

not only is $T$ bounded with norm at most $1$, but it also satisfies $|Tx(s_1)-Tx(s_2)|\le|s_1-s_2|\,\|x\|$, which has the consequence that $T$ maps a bounded set in $C[0,1]$ into an equicontinuous set. By Ascoli’s Theorem, $T$ is compact.
(iii) Let $k$ be a complex function belonging to $L^2([0,1]\times[0,1])$. We define the transformation $T$ on $L^2[0,1]$ by

$$(Tx)(s)=\int_0^1 k(s,t)x(t)\,dt,\quad x\in L^2[0,1].$$

The computation

$$\begin{aligned}
\int_0^1|(Tx)(s)|^2\,ds&=\int_0^1\left|\int_0^1 k(s,t)x(t)\,dt\right|^2 ds\\
&\le\int_0^1\left\{\int_0^1|k(s,t)|^2\,dt\right\}\left\{\int_0^1|x(t)|^2\,dt\right\}ds\\
&=\|x\|^2\int_0^1\!\!\int_0^1|k(s,t)|^2\,dt\,ds,
\end{aligned}$$

using the Cauchy–Schwarz inequality, shows that $T$ is a bounded linear operator on $L^2[0,1]$ with

$$\|T\|\le\left\{\int_0^1\!\!\int_0^1|k(s,t)|^2\,ds\,dt\right\}^{\frac12}=\|k\|_{L^2([0,1]\times[0,1])}.$$

We shall show that $T$ is a compact operator.
Let $U:L^2([0,1]\times[0,1])\to B(L^2[0,1])$ be defined by

$$U(k)=T.$$

We have shown above that $U$ is a linear map satisfying $\|U(k)\|\le\|k\|_{L^2([0,1]\times[0,1])}$.

Let $\{f_i(s)\}$ be an orthonormal basis in $L^2[0,1]$. Then, $\{f_i(s)f_j(t)\}$ is an orthonormal basis in $L^2([0,1]\times[0,1])$; so, $k(s,t)=\sum_{i,j=1}^{\infty}a_{i,j}f_i(s)f_j(t)$, where the series converges in the norm of $L^2([0,1]\times[0,1])$. Let $k_n(s,t)=\sum_{i,j=1}^{n}a_{i,j}f_i(s)f_j(t)$. Then, $\|k-k_n\|_{L^2([0,1]\times[0,1])}\to 0$ as $n\to\infty$. Considering the operator

$$(K_nx)(s)=\int_0^1 k_n(s,t)x(t)\,dt,\quad x\in L^2[0,1],$$

we note that each $K_n=U(k_n)$ has finite rank, since its range is contained in the span of $f_1,\ldots,f_n$, and that $\|U(k)-U(k_n)\|_{op}\to 0$ as $n\to\infty$; so $T$ is a norm limit of finite rank operators and is therefore compact (see Theorem 4.5.6 below). This completes the argument.
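The approximation of an $L^2$ kernel operator by finite-rank operators can be imitated in a discretised setting: sample the kernel on a grid, so the integral operator becomes a matrix, and truncate its singular value decomposition to obtain finite-rank approximants whose operator-norm error decreases, mirroring the $k_n$ above. The grid size, kernel, and ranks below are our choices for illustration (numpy assumed).

```python
import numpy as np

# Discretise k(s, t) = min(s, t) on an N-point grid; the integral operator
# becomes the matrix K / N acting on sampled functions (quadrature weight 1/N).
N = 200
s = (np.arange(N) + 0.5) / N
K = np.minimum.outer(s, s)
T = K / N

# Finite-rank approximants via truncated SVD, mirroring the k_n of the text.
U, sing, Vt = np.linalg.svd(T)
errors = []
for rank in (1, 5, 20):
    Tn = (U[:, :rank] * sing[:rank]) @ Vt[:rank]
    errors.append(np.linalg.norm(T - Tn, 2))

# The operator-norm error of the rank-r truncation is the (r+1)st singular
# value, and it decreases toward 0 as the rank grows.
assert errors[0] > errors[1] > errors[2]
assert errors[2] < 1e-3
```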


In the case of the Volterra operator, $k(s,t)$ equals $1$ if $0\le t\le s$ and equals $0$ if $s<t\le 1$. Therefore, it is a compact operator in $L^2[0,1]$. Since its range includes all polynomials with constant term $0$, it is not of finite rank. In fact, its range is dense, as has been shown in Example 4.3.2. Thus, we have $B_{00}(X,Y)\subsetneq B_0(X,Y)\subsetneq B(X,Y)$ when $X=Y=L^2[0,1]$. Moreover, its spectrum consists of only $0$, which is not an eigenvalue [see Example 4.3.2].
Recall that the uniform limit of a sequence of continuous functions is continu-
ous. The following similar result for completely continuous operators holds.
Theorem 4.5.6 Let $\{T_n\}_{n\ge1}$ be a sequence of completely continuous operators mapping a normed linear space $X$ into a Banach space $Y$ such that

$$\lim_n\|T_n-T\|=0.$$

Then, $T$ is a completely continuous operator.


Proof Let $B_0=\{x\in X:\|x\|\le 1\}$ be the unit ball in $X$. Since each $T_n$ is compact, the set $T_n(B_0)$ in $Y$ is precompact (i.e. $\overline{T_n(B_0)}$ is compact). Given $\varepsilon>0$, there exists $n\in\mathbb{N}$ such that $\|T_n-T\|<\varepsilon/3$. By compactness, we can cover $T_n(B_0)$ with a finite number $m$ of balls $B(T_nx_j,\varepsilon/3)$, where $x_1,x_2,\ldots,x_m$ are in $B_0$. Suppose $x\in B_0$ and let $j\le m$ be such that $\|T_nx-T_nx_j\|<\varepsilon/3$. By the triangle inequality,

$$\|Tx-Tx_j\|\le\|Tx-T_nx\|+\|T_nx-T_nx_j\|+\|T_nx_j-Tx_j\|\le 2\|T_n-T\|+\|T_nx-T_nx_j\|<\varepsilon.$$

Therefore,

$$T(B_0)\subseteq\bigcup_{j=1}^{m}B(Tx_j,\varepsilon),$$

and $T(B_0)$ is precompact. $\square$



Corollary 4.5.7 If $T\in B(X,Y)$ and there exists a sequence $T_n\in B_{00}(X,Y)$ such that $\|T_n-T\|\to 0$ as $n\to\infty$, then $T\in B_0(X,Y)$.

Proof Since $B_{00}(X,Y)\subseteq B_0(X,Y)$, this is immediate from Theorem 4.5.6. $\square$
Remark
(i) The above corollary provides a frequently used sufficient condition for an operator to be compact; namely, it is sufficient that it be the norm limit of a sequence of finite rank operators. The necessity of this condition has been shown to be false by P. Enflo [8].
(ii) If $X$ and $Y$ are Hilbert spaces, the following statement also holds [see Problem 4.5.P6]: if $T\in B(X,Y)$ is compact and $\mathrm{ran}(T)=Y$, then there exists a sequence $T_n\in B_{00}(X,Y)$ such that $\|T_n-T\|\to 0$ as $n\to\infty$.

Lemma 4.5.8 The set B0 (X, Y) of all compact linear operators is a closed linear
subspace of BðX; YÞ.
Proof Let $S$ and $T$ be in $B_0(X,Y)$ and $a,b\in\mathbb{C}$. If $\{x_n\}_{n\ge1}$ is a bounded sequence in $X$, then $\{Tx_n\}_{n\ge1}$ has a convergent subsequence $\{Tx_{n_k}\}_{k\ge1}$, say, and $\{Sx_{n_k}\}_{k\ge1}$ in turn has a convergent subsequence $\{Sx_{n_{k(j)}}\}_{j\ge1}$, say. The sequence $\{Tx_{n_{k(j)}}\}_{j\ge1}$ converges because it is a subsequence of a convergent sequence. It is now clear that the sequence $\{aTx_{n_{k(j)}}+bSx_{n_{k(j)}}\}_{j\ge1}$ converges, so $aT+bS$ is compact.
Using Theorem 4.5.6, we conclude that $B_0(X,Y)$ is a closed linear subspace of $B(X,Y)$. $\square$
Theorem 4.5.9 If S and T are linear operators mapping a normed linear space X
into itself, where S is completely continuous and T is bounded, then ST and TS are
completely continuous operators.
Proof Suppose $B$ is a bounded set in $X$ and consider

$$ST(B)=S(T(B)).$$

Since $B$ is bounded and $T$ is a bounded linear operator, it follows that $T(B)$ is a bounded subset of $X$. $S$ being completely continuous implies $S(T(B))$ is precompact. Therefore, $ST$ is a completely continuous operator.
Consider now $TS(B)=T(S(B))$. By the complete continuity of $S$, $S(B)$ is precompact, i.e. $\overline{S(B)}$ is compact. Since the continuous image of a compact set is a compact set, and $T$ is continuous, it follows that $T(\overline{S(B)})$ is compact. Note that $\overline{T(S(B))}\subseteq T(\overline{S(B)})$. This implies $T(S(B))$ is precompact. The complete continuity of $TS$ has now been proved. $\square$

Remark Combining Lemma 4.5.8 with Theorem 4.5.9, we see that $B_0(X)$, the class of all completely continuous operators on $X$, is what is known as a ‘closed two-sided ideal’ in the algebra $B(X)$ of all bounded linear operators on $X$. In particular, the square of a compact operator is compact. The converse, however, is not true, as the following example shows.
In $\ell^2$, let $\{e_k\}$ be the standard basis, i.e. $e_k=(0,0,\ldots,0,1,0,0,\ldots)$, where the only nonzero entry is $1$ and occurs in the $k$th position. Define the operator $T$ in $\ell^2$ by setting $Te_k=(1-(-1)^k)e_{2k}$. Certainly, $T$ maps into $\ell^2$ because convergence of $\sum|x_k|^2$ implies that of $\sum(1-(-1)^k)^2|x_k|^2$. Clearly, $T^2=O$. But $T$ is not compact, because it maps the bounded sequence $e_1,e_3,e_5,\ldots$ into the sequence $2e_2,2e_6,2e_{10},\ldots$, which can have no convergent subsequence in view of the fact that $\|2e_m-2e_n\|=2\sqrt{2}$ when $n\ne m$.
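The operator $Te_k=(1-(-1)^k)e_{2k}$ can be modelled on finitely supported sequences to confirm both claims: $T^2=O$, yet $T$ sends the orthonormal vectors $e_1,e_3,e_5,\ldots$ to vectors $2e_2,2e_6,2e_{10},\ldots$ that stay $2\sqrt{2}$ apart. The dict representation below is our choice for the sketch.

```python
def apply_T(x):
    """T e_k = (1 - (-1)^k) e_{2k} on finitely supported sequences,
    represented as dicts {index: coefficient}, indices starting at 1."""
    out = {}
    for k, c in x.items():
        w = (1 - (-1) ** k) * c        # factor 2 for odd k, 0 for even k
        if w:
            out[2 * k] = out.get(2 * k, 0) + w
    return out

# T^2 = O: T e_k is supported on the even index 2k, which the factor kills.
for k in range(1, 50):
    assert apply_T(apply_T({k: 1.0})) == {}

# But T is not compact: it maps the orthonormal e_1, e_3, e_5, ... to
# 2e_2, 2e_6, 2e_10, ..., any two of which are at distance 2*sqrt(2).
assert apply_T({1: 1.0}) == {2: 2.0}
assert apply_T({3: 1.0}) == {6: 2.0}
```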
The simple unilateral shift is not compact because it maps the bounded sequence
e1, e2, e3, … into the sequence e2, e3, e4, ….
It is easy to see that $B_{00}(X)$, the class of all finite rank operators on $X$, is a two-sided ideal in $B(X)$. Suppose $T\in B_{00}(X)$ and $S\in B(X)$. Since the range of the product $TS$ is contained in that of $T$, it is surely finite-dimensional. To see why $ST$ also has finite-dimensional range, we first note that $T(X)$ is finite-dimensional, and therefore, any linear image of it is also finite-dimensional. It follows that the linear image $S(T(X))=ST(X)$ is finite-dimensional.
We shall use this to show that if $X$ is a Hilbert space $H$, then the adjoint $T^*$ of a finite rank operator $T\in B_{00}(H)$ and its absolute value $|T|$ are also of finite rank. Let $P$ be the orthogonal projection on the range of $T$. Then, $PT=T$ and $P^*=P$. Therefore, $T^*=T^*P^*=T^*P$, which must have finite rank, because $P$ does. Also, $T^*T$ must have finite rank, which means $\mathrm{ran}(T^*T)$ is finite-dimensional and hence closed. It follows by the last part of Theorem 3.9.13 that $|T|$ has finite rank.
It turns out that the closure of $B_{00}(H)$ is precisely $B_0(H)$; so, there is no question of $B_{00}(H)$ being closed unless it equals $B_0(H)$, which we know it does not when $H=L^2[0,1]$.
We recall the following definition for the benefit of the reader.
Let $H$ be a Hilbert space, $\{x_n\}_{n\ge1}$ a sequence of elements of $H$ and $x\in H$. If, for all elements $y\in H$, the sequence $(x_n,y)$ of scalars converges to $(x,y)$ as $n\to\infty$, then $\{x_n\}_{n\ge1}$ is said to converge weakly to $x$, and we write

$$x_n\xrightarrow{w}x.$$

Also, x is called the weak limit of the sequence.


The following equivalent criterion of a compact operator in a Hilbert space
holds.

Theorem 4.5.10 A bounded linear operator in a Hilbert space is compact if and


only if it maps every weakly convergent sequence into a strongly convergent
sequence.
Proof Suppose that $T\in B(H)$ is a compact operator and let $\{x_n\}_{n\ge1}$ be a sequence in $H$ such that $x_n\xrightarrow{w}x$. If possible, suppose that $\{Tx_n\}_{n\ge1}$ does not converge strongly to $Tx$. Then, there exist $\varepsilon>0$ and an increasing sequence $n_1,n_2,\ldots$ such that $\|Tx_{n_k}-Tx\|\ge\varepsilon$, $k=1,2,\ldots$. As the sequence $\{x_n\}_{n\ge1}$ converges weakly, it follows [Theorem 2.12.6] that $\|x_n\|\le M$, $n=1,2,\ldots$, for some suitable $M>0$, and hence, by compactness of $T$, the sequence $\{Tx_{n_k}\}_{k\ge1}$ has a subsequence $\{Tx_{n_{k_j}}\}_{j\ge1}$ such that $Tx_{n_{k_j}}\to y$ as $j\to\infty$ strongly. Since strong convergence implies weak convergence, $Tx_{n_{k_j}}\xrightarrow{w}y$ as $j\to\infty$. Also, $x_{n_{k_j}}\xrightarrow{w}x$ as $j\to\infty$, so $Tx_{n_{k_j}}\xrightarrow{w}Tx$ as $j\to\infty$. Thus, $y=Tx$ and $Tx_{n_{k_j}}\to Tx$ as $j\to\infty$ strongly. Moreover,

$$\|Tx_{n_{k_j}}-Tx\|\ge\varepsilon,\quad j=1,2,\ldots.$$

This contradiction completes the argument.
Conversely, suppose that $\{x_n\}_{n\ge1}$ is a bounded sequence in $H$. Then, it contains a weakly convergent subsequence $\{x_{n_k}\}_{k\ge1}$ [Theorem 2.12.5]. By hypothesis, $\{Tx_{n_k}\}_{k\ge1}$ converges in $H$; consequently, $T$ is compact. $\square$
As an illustration of the use of the above theorem, we show that if $\{e_n\}_{n\ge1}$ is an orthonormal sequence, not necessarily complete, in a Hilbert space $H$ and $T$ a compact operator, then $\|Te_n\|\to 0$. First consider an arbitrary subsequence, which we shall continue to call $\{e_n\}_{n\ge1}$ for ease of notation. For any $x\in H$, the sum $\sum_{n=1}^{\infty}|(x,e_n)|^2$ must converge by Bessel’s inequality [Theorem 2.8.6]. So, $(x,e_n)\to 0$ as $n\to\infty$, i.e. the sequence $\{e_n\}_{n\ge1}$ converges to $0$ weakly. Using the fact that $T$ is continuous, we find that $Te_n\to 0$ weakly. As $T$ is also compact, it follows by Theorem 4.5.10 that $\{Te_n\}_{n\ge1}$ converges strongly to some $y\in H$. Since strong convergence implies weak convergence, we know that $\{Te_n\}_{n\ge1}$ converges weakly to $y$, and hence, $y=0$. Thus, $\{Te_n\}_{n\ge1}$ converges strongly to $0$.
The next result can be rephrased as saying that the class of compact operators in
a Hilbert space is closed under taking adjoints.
Theorem 4.5.11 The adjoint of a compact operator is compact.
Proof Let $\{x_n\}_{n\ge1}$ be a sequence in $H$ such that $\|x_n\|\le M$, $n=1,2,\ldots$, where $M>0$. If $y_n=T^*x_n$, $n=1,2,\ldots$, then $\{y_n\}_{n\ge1}$ is also a bounded sequence in $H$. Since $T$ is compact, the sequence $\{Ty_n\}_{n\ge1}$ has a convergent subsequence $\{Ty_{n_j}\}_{j\ge1}$, say. For all $i$, $j$,

$$\begin{aligned}
\|y_{n_i}-y_{n_j}\|^2&=\|T^*x_{n_i}-T^*x_{n_j}\|^2\\
&=\left(T^*x_{n_i}-T^*x_{n_j},T^*x_{n_i}-T^*x_{n_j}\right)\\
&=\left(TT^*\left(x_{n_i}-x_{n_j}\right),x_{n_i}-x_{n_j}\right)\\
&\le\|TT^*\left(x_{n_i}-x_{n_j}\right)\|\,\|x_{n_i}-x_{n_j}\|\\
&\le 2M\|Ty_{n_i}-Ty_{n_j}\|.
\end{aligned}$$

This implies that the sequence $\{y_{n_j}\}_{j\ge1}$ is a Cauchy sequence in $H$, and since $H$ is complete, it converges in $H$. Consequently, $T^*$ is a compact operator. $\square$
We shall show that when T is compact, the operator I − T has the following
feature of operators in a finite-dimensional space: it is onto if and only if it is
one-to-one:
Theorem 4.5.12 Let T be a compact operator. Then,

ranðI TÞ ¼ H , kerðI TÞ ¼ f0g:

Proof Suppose ran(I − T) = H but ker(I − T) 6¼ {0}. Then, there exists a nonzero
vector x1 2 ker(I − T). Since ran(I − T) = H, we can obtain a sequence fxn gn  1 of
nonzero vectors in H such that

ðI TÞxn þ 1 ¼ xn for every n:

Then, ðI TÞn xn þ 1 ¼ x1 6¼ 0 but ðI TÞn þ 1 xn þ 1 ¼ ðI TÞx1 ¼ 0. To put it


another way,

xn þ 1 62 kerðI TÞn but xn þ 1 2 kerðI TÞn þ 1 :

Combined with the obvious inclusions kerðI TÞn kerðI TÞn þ 1 for every n, this
yields the strict inclusions

kerðI TÞn  kerðI TÞn þ 1 for every n:

Each of these kernels is closed, and therefore, each kerðI TÞn is a proper closed
subspace of the Hilbert space kerðI TÞn þ 1 : Hence, there exists a sequence
fyn gn  1 of unit vectors such that

yn 2 kerðI TÞn and yn þ 1 ? kerðI TÞn for every n:

Then, surely,

jjyn þ 1 xjj  1 for any x 2 kerðI TÞn :

For indices p > q, we have


4.5 Compact Linear Operators 273

yq þ ðI TÞyp ðI TÞyq 2 kerðI TÞp 1 ;

because

ðI TÞp 1 ðyq þ ðI TÞyp ðI TÞyq Þ ¼ ðI TÞp 1 yq þ ðI TÞp yp ðI TÞp yq


¼ ðI TÞp 1 yq þ 0 þ 0
1 q
¼ ðI TÞp ðI TÞq yq ¼ 0:

Therefore, jjyp ðyq þ ðI TÞyp ðI TÞyq Þjj  1, i.e. jjTyp Tyq jj  1 when
p > q. But this means that although the set {yn : n  1} is bounded, the set
fTyn : n  1g cannot contain a Cauchy sequence. This contradicts the compactness
of T and thereby shows that ran(I − T) = H ) ker(I − T) = {0}.
For the converse, suppose that ker(I − T) = {0}. By Theorem 3.5.8, the
orthogonal complement of ker(I − T) is the closure of the range of I − T*.
Therefore, ran(I − T*) is dense. However, by Theorem 4.5.11, T* is also compact,
and hence, by Problem 4.5.P14, ran(I − T*) is closed. Thus, ran(I − T*) = H. By the
compactness of T*, what has already been proved above implies that ker(I − T*) =
{0}. Invoking Theorem 3.5.8 once again in exactly the same manner as above, we
find that ran(I − T) = H. ∎
In the presence of the additional hypothesis that T is self-adjoint, the above result
is a trivial consequence of Theorem 3.5.8 and Problem 4.5.P14. However, much
more can be said in that situation [see Problem 4.5.P15].
An alternative between two specified statements is an assertion to the effect that
precisely one of the two statements holds, i.e. one of them holds but not both.
A little reflection shows that this is the same as saying that one holds if and only if
the other does not. For the sceptical reader, we show the corresponding simple
computation in Boolean algebra, wherein $\wedge$ denotes conjunction, $\vee$ denotes dis-
junction and $'$ denotes negation. Recall that $P \Rightarrow Q$ is the same as $P' \vee Q$. The
computation is as follows:

$$(P \vee Q) \wedge (P \wedge Q)' = (P \vee Q) \wedge (P' \vee Q') = (P' \Rightarrow Q) \wedge (Q \Rightarrow P') = (P' \Leftrightarrow Q).$$

Thus, the equivalence of any two statements can be restated as an alternative
between one of the statements and the negation of the other. Conventionally, the
equivalence asserted by Theorem 4.5.12 is expressed as an alternative and named
after its discoverer, who originally put it forth in 1903 in the context of integral
equations:
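The Boolean identity above can also be verified exhaustively; here is a quick sketch in Python (the function names are ours, chosen only for readability):

```python
from itertools import product

def equivalent(p: bool, q: bool) -> bool:
    # "P if and only if Q"
    return p == q

def alternative(p: bool, q: bool) -> bool:
    # "precisely one of P, Q holds": (P or Q) and not (P and Q)
    return (p or q) and not (p and q)

# The identity: the exclusive alternative of P and Q coincides with the
# equivalence of (not P) and Q, on all four truth assignments.
checks = [alternative(p, q) == equivalent(not p, q)
          for p, q in product([False, True], repeat=2)]
print(all(checks))  # True
```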
Theorem 4.5.13 (Fredholm Alternative) For a compact operator $T$ in a Hilbert
space $H$, precisely one of the following holds:
(a) For every $y \in H$, there exists $x \in H$ such that $x - Tx = y$;
(b) There exists a nonzero $x \in H$ such that $x - Tx = 0$.
274 4 Spectral Theory and Special Classes of Operators

Proof Immediate from Theorem 4.5.12. ∎


Theorem 4.5.14 For a compact operator $T$ in a Hilbert space $H$, the dimensions of
$\ker(I-T)$ and $\ker(I-T^*)$ are the same.

Proof Both the dimensions in question are finite because of the compactness of
$T$ and hence of $T^*$ [Prop. 4.8.3]. Let $\{x_1, \ldots, x_n\}$ and $\{y_1, \ldots, y_m\}$ be orthonormal
bases of $\ker(I-T)$ and $\ker(I-T^*)$, respectively. It is sufficient to prove that
assuming $m > n$ leads to a contradiction. With this in view, assume that $m > n$.

Set up the operator $S$ defined as

$$Sx = Tx + \sum_{j=1}^{n} (x, x_j)\, y_j, \quad x \in H.$$

Obviously, $S$ is compact. We contend that $\ker(I-S) = \{0\}$. Considering that $y_1, \ldots,$
$y_n$ are orthonormal and lie in $\ker(I-T^*)$, we obtain for any $x \in H$

$$((I-S)x, y_k) = ((I-T)x, y_k) - (x, x_k) = (x, (I-T^*)y_k) - (x, x_k) = -(x, x_k), \quad 1 \le k \le n.$$

Now, let $x \in \ker(I-S)$. Then, the above $n$ equalities lead to the $n$ orthogonality
relations

$$(x, x_k) = 0, \quad 1 \le k \le n.$$

These imply on the one hand that

$$x \in \ker(I-T)^{\perp},$$

because $\{x_1, \ldots, x_n\}$ is an orthonormal basis of $\ker(I-T)$. On the other hand, since
it follows from the definition of $S$ that $(I-T)x = \sum_{j=1}^{n} (x, x_j)\, y_j$, the same
$n$ orthogonality relations imply this time around that

$$x \in \ker(I-T).$$

But $\ker(I-T)^{\perp} \cap \ker(I-T) = \{0\}$, and it follows that $x = 0$. This validates our
contention that $\ker(I-S) = \{0\}$.

As $S$ is compact, Theorem 4.5.12 now tells us that $\operatorname{ran}(I-S) = H$. In particular,
$y_{n+1} = (I-S)z$ for some $z \in H$. Recalling the definition of $S$, we obtain

$$(I-T)z = \sum_{j=1}^{n} (z, x_j)\, y_j + y_{n+1}.$$

Considering that $y_1, \ldots, y_{n+1}$ are orthonormal and $y_{n+1} \in \ker(I-T^*)$, we now arrive
at the contradiction that

$$0 = (z, (I-T^*)y_{n+1}) = ((I-T)z, y_{n+1}) = \Bigl(\sum_{j=1}^{n} (z, x_j)\, y_j + y_{n+1},\; y_{n+1}\Bigr) = (y_{n+1}, y_{n+1}) = 1. \qquad ∎$$
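In finite dimensions every operator is compact, and Theorem 4.5.14 reduces to the familiar fact that a matrix and its adjoint have equal rank, hence equal nullity. A numerical sketch (the particular non-normal matrix is our own illustrative choice):

```python
import numpy as np

# A non-normal 3 x 3 operator T with 1 as an eigenvalue.
T = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.5]])
I = np.eye(3)

def nullity(A):
    # dim ker A = (number of columns) - rank A
    return A.shape[1] - np.linalg.matrix_rank(A)

dim_ker = nullity(I - T)                 # dim ker(I - T)
dim_ker_adj = nullity(I - T.conj().T)    # dim ker(I - T*)
print(dim_ker, dim_ker_adj)  # 1 1
```

Note that the two kernels themselves are different subspaces here; only their dimensions agree, exactly as the theorem asserts.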
Since integral operators are compact, the Fredholm alternative and
Theorem 4.5.14 have direct implications regarding solutions of integral equations;
in fact, they are generalisations of Fredholm’s results on the latter. For an explicit
formulation in terms of integral equations, the reader is referred to Limaye [21,
p. 339] or Riesz and Nagy [24, p. 164].
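As a concrete illustration of case (a) of the alternative for an integral operator, here is a numerical sketch with the kernel $\min\{t,s\}$ (discretisation, grid size and tolerances are our own choices, not from the text); its eigenvalues lie in $(0, 4/\pi^2]$, so $1$ is not an eigenvalue and $x - Kx = y$ has a unique solution for every $y$:

```python
import numpy as np

# Midpoint-rule discretisation of (Kx)(t) = \int_0^1 min(t,s) x(s) ds on [0,1].
n = 300
h = 1.0 / n
t = (np.arange(n) + 0.5) * h            # midpoint nodes
K = np.minimum.outer(t, t) * h          # matrix approximating the operator

y = np.cos(3 * t)                        # an arbitrary right-hand side
x = np.linalg.solve(np.eye(n) - K, y)    # the unique solution of (I - K)x = y
residual = np.max(np.abs(x - K @ x - y))
lam_max = np.linalg.eigvalsh(K).max()
print(residual, lam_max)                 # residual ~ 0, lam_max ~ 0.405 < 1
```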
Theorems 4.5.12, 4.5.13 and 4.5.14 can be further generalised even to Banach
spaces, but the matter will not be taken up in this book.
Problem Set 4.5

4:5:P1. Show that the operator $K : L^2[a,b] \to L^2[a,b]$ defined by

$$(Ku)(t) = \int_a^t u(s)\,ds, \quad u \in L^2[a,b],$$

does not have finite rank.


4:5:P2. Show that the operator $K : L^2[a,b] \to L^2[a,b]$ defined by

$$(Kf)(t) = \sum_{j=1}^{n} u_j(t) \int_a^b w_j(s)\, f(s)\,ds,$$

where the kernel is $k = \sum_{j=1}^{n} u_j \otimes w_j$ and $u_j, w_j$ are in $L^2[a,b]$, is of finite rank.
4:5:P3: (a) Let the operator $K : L^2[0,1] \to L^2[0,1]$ be given by

$$(Kx)(t) = \int_0^1 k(t,s)\, x(s)\,d\mu(s),$$

where $k(t,s) = \max\{t,s\}$, $0 \le t, s \le 1$. Prove that $K$ is self-adjoint, is
compact and has denumerably many negative eigenvalues with 0 as
the only accumulation point.
(b) Suppose in part (a), the function $k$ is changed to be $k(t,s) = \min\{t,s\}$.
Prove that $K$ is self-adjoint, is compact and has positive eigenvalues.
(The reader can check that there are denumerably many and that they have 0 as
the only accumulation point.)
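A numerical sketch of part (b) (the discretisation and tolerance are our own choices; the closed-form eigenvalues $4/((2n-1)^2\pi^2)$, with eigenfunctions $\sin((n-\tfrac12)\pi t)$, come from the associated boundary-value problem $\lambda\varphi'' = -\varphi$, $\varphi(0)=0$, $\varphi'(1)=0$):

```python
import numpy as np

# Nystrom (midpoint-rule) discretisation of the kernel k(t,s) = min{t,s}.
n = 400
h = 1.0 / n
t = (np.arange(n) + 0.5) * h            # midpoint nodes
K = np.minimum.outer(t, t) * h          # symmetric matrix approximating K
lam = np.linalg.eigvalsh(K)[::-1]       # eigenvalues, largest first

# All approximate eigenvalues are positive, and the largest is near 4/pi^2.
print(lam[0], 4 / np.pi**2)
print(lam.min() > 0)  # True
```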
4:5:P4. Take $V$ as in Problem 3.8.P4, and find the eigenvalues of the operator
$V^*V$ on $L^2[0,1]$. Prove that $\|V\| = \frac{2}{\pi}$.
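The value $\|V\| = 2/\pi$ can be cross-checked numerically. The following midpoint discretisation, with a half-weight on the diagonal cell, is our own illustrative choice, not part of the problem:

```python
import numpy as np

# Discretise (Vx)(t) = \int_0^t x(s) ds at midpoint nodes t_i = (i + 1/2)h:
# full cells below t_i get weight h, the half-cell ending at t_i gets h/2.
n = 500
h = 1.0 / n
V = np.tril(np.full((n, n), h), k=-1) + np.eye(n) * (h / 2)

sigma = np.linalg.svd(V, compute_uv=False)
print(sigma[0], 2 / np.pi)        # largest singular value ~ 2/pi ~ 0.6366

# The eigenvalues of V*V are the squared singular values, so the largest
# eigenvalue of V*V should be near (2/pi)^2.
print(sigma[0]**2, (2 / np.pi)**2)
```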
4:5:P5. Let $V$ be the Volterra operator on $L^2[0,1]$ [see Example (ix) of 3.2.5].
Prove by induction that

$$(V^n x)(t) = \int_0^t \frac{(t-s)^{n-1}}{(n-1)!}\, x(s)\,ds.$$

Hence, solve the integral equation

$$y(t) = \sin t + \int_0^t y(s)\,ds. \qquad (4.5)$$
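A numerical cross-check for (4.5), offered as an illustration rather than the requested Neumann-series derivation: differentiating (4.5) gives $y' = y + \cos t$ with $y(0) = 0$, and one can check that $y(t) = (e^t + \sin t - \cos t)/2$ satisfies the equation.

```python
import numpy as np

# Discretise (4.5) as (I - V)y = sin t, with the midpoint Volterra scheme.
n = 500
h = 1.0 / n
t = (np.arange(n) + 0.5) * h
V = np.tril(np.full((n, n), h), k=-1) + np.eye(n) * (h / 2)

y = np.linalg.solve(np.eye(n) - V, np.sin(t))
y_exact = (np.exp(t) + np.sin(t) - np.cos(t)) / 2
err = np.max(np.abs(y - y_exact))
print(err)  # small discretisation error
```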

4:5:P6: (a) Let $H$ and $K$ be Hilbert spaces and $T \in B_0(H,K)$. Show that $\operatorname{ran}(T)$ is
separable.
(b) Let $\{e_k\}_{k\ge 1}$ be an orthonormal basis for $\overline{\operatorname{ran}(T)}$. If $P_n : K \to K$ is the
orthogonal projection onto the closed linear subspace generated by
$\{e_k\}_{1\le k\le n}$, then show that $P_n T \to PT$ (unif), where $P$ is the orthog-
onal projection on $\overline{\operatorname{ran}(T)}$.
4:5:P7. Let $H$ be a separable Hilbert space with basis $\{e_n\}_{n\ge 1}$. Let $\{a_n\}_{n\ge 1}$ be a
sequence of complex numbers with $M = \sup_n |a_n| < \infty$. Define an operator
$T$ on $\ell^2$ by

$$Tx = (a_1 x_1, a_2 x_2, \ldots), \quad x = (x_1, x_2, \ldots) \in \ell^2.$$

Prove that $T$ is compact if and only if $\lim_n a_n = 0$.
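The key estimate behind the "if" direction is that the finite-rank truncation $T_N x = (a_1 x_1, \ldots, a_N x_N, 0, \ldots)$ satisfies $\|T - T_N\| = \sup_{n > N} |a_n|$, which tends to $0$ exactly when $a_n \to 0$. A sketch with $a_n = 1/n$ (the sequence and truncation indices are our own illustrative choices):

```python
# sup_{n > N} |a_n| for a_n = 1/n, approximated over a long finite range;
# for this decreasing sequence the supremum is attained at n = N + 1.
def truncation_error(N, terms=10**5):
    return max(1.0 / n for n in range(N + 1, N + terms))

errors = [truncation_error(N) for N in (1, 10, 100, 1000)]
print(errors)  # [1/2, 1/11, 1/101, 1/1001] -> 0
```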


4:5:P8. Let $\{a_j\}_{j\ge 1}$ be a sequence of complex numbers with $\sum_{j=1}^{\infty} |a_j| < \infty$. Define
an operator $T$ on $\ell^2$ by

$$Tx = \Bigl(\sum_{i=1}^{\infty} a_i x_i,\; \sum_{i=1}^{\infty} a_{i+1} x_i,\; \ldots,\; \sum_{i=1}^{\infty} a_{i+n-1} x_i,\; \ldots\Bigr), \quad x = (x_1, x_2, \ldots) \in \ell^2.$$

Show that $T$ is compact.


4:5:P9. Let $[s_{ij}]_{i,j\ge 1}$ be an infinite matrix with $\sum_{i,j=1}^{\infty} |s_{ij}|^2 < \infty$, and let the operator $T$ be
defined on $\ell^2$ by

$$T(\{x_i\}_{i\ge 1}) = \{y_i\}_{i\ge 1}, \quad\text{where}\quad y_i = \sum_{j=1}^{\infty} s_{ij} x_j, \quad i = 1, 2, \ldots.$$

Show that $T$ is a compact linear operator on $\ell^2$.
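The usual proof approximates $T$ by the finite-rank operator $T_N$ that keeps only the leading $N \times N$ block, using the bound $\|T - T_N\|^2 \le \sum_{i > N \text{ or } j > N} |s_{ij}|^2$ (the operator norm is dominated by this tail of the double sum). A sketch of the tail going to $0$ for the particular choice $s_{ij} = 1/(ij)$, truncated at a finite range $M$ of our own choosing:

```python
import math

def tail_bound(N, M=800):
    # sqrt of the tail sum of |s_ij|^2 over indices outside the N x N block,
    # for s_ij = 1/(i*j), computed over 1 <= i, j < M.
    total = sum(1.0 / (i * j) ** 2
                for i in range(1, M) for j in range(1, M)
                if i > N or j > N)
    return math.sqrt(total)

bounds = [tail_bound(N) for N in (1, 5, 25, 125)]
print(bounds)  # strictly decreasing towards 0
```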


4:5:P10. Define $T : \ell^2 \to \ell^2$ by $Tx = T(x_1, x_2, \ldots) = \bigl(\sum_{j=1}^{\infty} s_{1j} x_j,\; \sum_{j=1}^{\infty} s_{2j} x_j,\; \ldots\bigr)$,
where $s_{ik} = 0$ for $|i-k| > 1$. Then, $T$ is compact if and only if $\lim_{i,k} s_{i,k} = 0$.
Observe that the matrix defining $T$ has the form

$$\begin{bmatrix} s_{11} & s_{12} & 0 & 0 & \cdots \\ s_{21} & s_{22} & s_{23} & 0 & \cdots \\ 0 & s_{32} & s_{33} & s_{34} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}.$$

This matrix may be expressed as

$$\begin{bmatrix} a_1 & b_1 & 0 & 0 & \cdots \\ c_1 & a_2 & b_2 & 0 & \cdots \\ 0 & c_2 & a_3 & b_3 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}.$$

Such a matrix is called a Jacobi matrix. The condition $\lim_{i,k} s_{i,k} = 0$ is
equivalent to $\lim_k a_k = \lim_k b_k = \lim_k c_k = 0$.
4:5:P11. Prove that the mapping $T$ defined on $\ell^2$ by

$$Tx = \bigl(\xi_1, \tfrac{1}{2}\xi_2, \tfrac{1}{3}\xi_3, \ldots\bigr), \quad x = (\xi_1, \xi_2, \xi_3, \ldots) \in \ell^2,$$

has range contained in $\ell^2$ and is compact.



4:5:P12. Let $T$ be a compact operator on $X$, i.e. $T \in B_0(X)$, and suppose $\lambda \ne 0$ is
not an eigenvalue of $T$. Show that $\lambda \notin \sigma(T)$.
4:5:P13. Construct an example of a compact operator which has no proper value.
4:5:P14. Let $T \in B(H)$, $H$ a complex Hilbert space, be compact and $\lambda \ne 0$ a
complex number. Then, $\operatorname{ran}(T - \lambda I)$ is closed.
4:5:P15. (Fredholm Alternative) Let $T \in B(H)$, $H$ a complex Hilbert space, be
compact and self-adjoint. If $\lambda$ is an eigenvalue of $T$, we denote by $N_\lambda(T)$
the eigenspace of $T$ associated with $\lambda$ and by $P_\lambda$ the orthogonal projection
of $H$ onto $N_\lambda(T)$. Then, one of the following holds:
(a) If $\lambda$ is not an eigenvalue of $T$, then the equation

$$Tx - \lambda x = y \qquad (4.6)$$

with $y \in H$ has a unique solution. The unique solution $x$ is given by

$$x = (T - \lambda I)^{-1} y = \sum_{\mu \in \sigma_p(T)} (\mu - \lambda)^{-1} P_\mu y.$$

(b) If $\lambda$ is an eigenvalue of $T$, then Eq. (4.6) has infinitely many
solutions for $y \in N_\lambda(T)^{\perp}$ and no solution otherwise. In the first
case, the solutions are given by

$$x = z + \sum_{\substack{\mu \in \sigma_p(T) \\ \mu \ne \lambda}} (\mu - \lambda)^{-1} P_\mu y,$$

with $z \in N_\lambda(T)$.
4:5:P16. Let $H = L^2[0,1]$. For $x \in H$, let

$$(Tx)(s) = \int_0^1 k(s,t)\, x(t)\,dt,$$

and

$$k(s,t) = \begin{cases} (1-s)t & 0 \le t \le s \le 1 \\ s(1-t) & 0 \le s \le t \le 1. \end{cases}$$

Let $x \in H$ and $0 \ne \lambda \in \mathbb{C}$ be such that $Tx = \lambda x$. Then, for all $s \in [0,1]$,

$$\lambda x(s) = (Tx)(s) = \int_0^s (1-s)t\, x(t)\,dt + \int_s^1 s(1-t)\, x(t)\,dt. \qquad (4.7)$$

Show that $(Tx)(s) = \sum_{n=1}^{\infty} \frac{2}{n^2\pi^2} \Bigl[\int_0^1 x(t)\sin n\pi t\,dt\Bigr] \sin n\pi s$. Use the Fredholm
alternative to determine the solution of the operator equation
$Tx - \lambda x = y$, $y \in H$.
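The sine expansion in the problem suggests that $T$ has eigenvalues $1/(n^2\pi^2)$ with eigenfunctions $\sin n\pi s$ (note $\int_0^1 \sin^2 n\pi t\,dt = \tfrac12$). A numerical sketch of the largest eigenvalue (grid size and tolerance are our own choices):

```python
import numpy as np

# Midpoint-rule discretisation of the kernel in 4.5.P16.
n = 400
h = 1.0 / n
s = (np.arange(n) + 0.5) * h
S, T = np.meshgrid(s, s, indexing="ij")          # S: row node, T: column node
K = np.where(T <= S, (1 - S) * T, S * (1 - T)) * h  # symmetric matrix

lam = np.linalg.eigvalsh(K)[::-1]                # eigenvalues, largest first
print(lam[0], 1 / np.pi**2)                      # largest eigenvalue ~ 1/pi^2
```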
4:5:P17. Let $\{a_j\}_{j\ge 1}$ be a sequence of complex numbers such that $\sum_{j=1}^{\infty} |a_j| < \infty$.
Define an operator on $\ell^2$ by the matrix

$$A = \begin{bmatrix} a_1 & a_2 & a_3 & \cdots \\ a_2 & a_3 & a_4 & \cdots \\ a_3 & a_4 & a_5 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}.$$

Prove that $A$ is compact.

4.6 Hilbert–Schmidt Operators

Problem 3.2.P1 and Example (viii) of 3.2.5 provide sufficient conditions on infinite
matrices and kernels to induce bounded linear operators on a Hilbert space. In fact,
Example (viii) of 3.2.5 is a continuous analogue of Problem 3.2.P3. These are
typical illustrations of a class of operators—the Hilbert–Schmidt operators. We
shall show that if $T$ is a Hilbert–Schmidt operator in a Hilbert space $H$, then so is its
adjoint $T^*$. These operators constitute a two-sided ideal in $B(H)$, the algebra of
bounded linear operators in H. Every Hilbert–Schmidt operator is a compact
operator. The converse is, however, not true. The class of Hilbert–Schmidt operators
is defined as follows.
Definition 4.6.1 Let $T \in B(H)$ be an operator on a Hilbert space $H$, and let $\{x_\gamma\}_{\gamma\in\Gamma}$
be an orthonormal basis for $H$. If $\sum_{\gamma\in\Gamma} \|Tx_\gamma\|^2 < \infty$, then $T$ is called a Hilbert–
Schmidt operator.
The set of all Hilbert–Schmidt operators on $H$ will be denoted by HS.
In this definition of the class HS, a particular orthonormal basis was used. The
following lemma shows that the class HS depends only upon the Hilbert space and
not upon the basis.
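In finite dimensions this basis independence is just the unitary invariance of the Frobenius norm: $\sum_\gamma \|Tx_\gamma\|^2 = \|TQ\|_F^2 = \|T\|_F^2$ whenever the columns of $Q$ form an orthonormal basis. A quick numerical sketch (dimension and random seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((5, 5))
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # a random orthogonal matrix

# Sum of ||T x_gamma||^2 over the standard basis, over a rotated basis,
# and the corresponding sum for the adjoint T* over the rotated basis.
sum_std_basis = sum(np.linalg.norm(T[:, j]) ** 2 for j in range(5))
sum_rotated = sum(np.linalg.norm(T @ Q[:, j]) ** 2 for j in range(5))
sum_adjoint = sum(np.linalg.norm(T.T @ Q[:, j]) ** 2 for j in range(5))

print(sum_std_basis, sum_rotated, sum_adjoint)  # all three agree
```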
Lemma 4.6.2 Let $T \in B(H)$ be an operator on a Hilbert space $H$. Let $\{x_\gamma\}_{\gamma\in\Gamma}$ and
$\{y_\gamma\}_{\gamma\in\Gamma}$ be orthonormal bases for $H$. Then,

$$\sum_{\alpha\in\Gamma} \|Tx_\alpha\|^2 = \sum_{\beta\in\Gamma} \|T^* y_\beta\|^2 = \sum_{\alpha\in\Gamma}\sum_{\beta\in\Gamma} |(Tx_\alpha, y_\beta)|^2.$$

Whenever any one of them is summable, so are the others and their sum is the
same, independent of $\{x_\gamma\}_{\gamma\in\Gamma}$ and $\{y_\gamma\}_{\gamma\in\Gamma}$.

Proof By using Parseval's equality [Theorem 2.9.16], $\|Tx_\alpha\|^2 = \sum_{\beta\in\Gamma} |(Tx_\alpha, y_\beta)|^2$.
Thus,

$$\sum_{\alpha\in\Gamma} \|Tx_\alpha\|^2 = \sum_{\alpha\in\Gamma}\sum_{\beta\in\Gamma} |(Tx_\alpha, y_\beta)|^2 = \sum_{\beta\in\Gamma}\sum_{\alpha\in\Gamma} |(Tx_\alpha, y_\beta)|^2 = \sum_{\beta\in\Gamma}\sum_{\alpha\in\Gamma} |(x_\alpha, T^* y_\beta)|^2 = \sum_{\beta\in\Gamma} \|T^* y_\beta\|^2,$$