
REVIEWS OF

Algebra and Analysis


for Engineers and Scientists
"This book is a useful compendium of the mathematics of (mostly) finite-di-
mensional linear vector spaces (plus two final chapters on infinite-dimensional
spaces), which do find increasing application in many branches of engineering
and science.... The treatment is thorough; the book will certainly serve as a valu-
able reference." American Scientist
"The authors present topics in algebra and analysis for students in engineering
and science.... Each chapter is organized to include a brief overview, detailed
topical discussions and references for further study. Notes about the references
guide the student to collateral reading. Theorems, definitions, and corollaries are
illustrated with examples. The student is encouraged to prove some theorems
and corollaries as models for proving others in exercises. In most chapters, the
authors discuss constructs used to illustrate examples of applications. Discus-
sions are tied together by frequent, well written notes. The tables and index are
good. The type faces are nicely chosen. The text should prepare a student well in
mathematical matters." Science Books and Films
"This is an intermediate level text, with exercises, whose avowed purpose is to
provide the science and engineering graduate student with an appropriate mod-
ern mathematical (analysis and algebra) background in a succinct, but nontrivial,
manner. After some fundamentals, algebraic structures are introduced followed
by linear spaces, matrices, metric spaces, normed and inner product spaces and
linear operators.... While one can quarrel with the choice of specific topics and
the omission of others, the book is quite thorough and can serve as a text, for
self-study or as a reference." Mathematical Reviews
"The authors designed a typical work from graduate mathematical lectures: for-
mal definitions, theorems, corollaries, proofs, examples, and exercises. It is to
be noted that problems to challenge students' comprehension are interspersed
throughout each chapter rather than at the end." CHOICE
Printed in the USA
Anthony N. Michel
Charles J. Herget
Algebra and Analysis
for Engineers and Scientists
Birkhäuser
Boston Basel Berlin
Anthony N. Michel
Department of Electrical Engineering
University of Notre Dame
Notre Dame, IN 46556
U.S.A.
Cover design by Dutton and Sherman, Hamden, CT.
Charles J. Herget
Herget Associates
P.O. Box 1425
Alameda, CA 94501
U.S.A.
Mathematics Subject Classification (2000): 03Exx, 03E20, 08-XX, 08-01, 15-XX, 15-01, 15A03,
15A04, 15A06, 15A09, 15A15, 15A18, 15A21, 15A57, 15A60, 15A63, 20-XX, 20-01, 26-XX,
26-01, 26Axx, 26A03, 26A15, 26Bxx, 34-XX, 34-01, 34Axx, 34A12, 34A30, 34H05, 45B05, 46-XX,
46-01, 46Axx, 46A22, 46A50, 46A55, 46Bxx, 46B20, 46B25, 46Cxx, 46C05, 46Exx, 46N10, 46N20,
47-XX, 47-01, 47Axx, 47A05, 47A07, 47A10, 47A25, 47A30, 47A67, 47B15, 47H10, 47N20, 47N70,
54-XX, 54-01, 54A20, 54Cxx, 54C05, 54C30, 54Dxx, 54D05, 54D30, 54D35, 54D45, 54E35, 54E45,
54E50, 93E10
Library of Congress Control Number: 2007931687
ISBN-13: 978-0-8176-4706-3
Printed on acid-free paper.
© 2007 Birkhäuser Boston
e-ISBN-13: 978-0-8176-4707-0
Originally published as Mathematical Foundations in Engineering and Science by Prentice-Hall,
Englewood Cliffs, NJ, 1981. A subsequent paperback edition under the title Applied Algebra and
Functional Analysis was published by Dover, New York, 1993. For the Birkhäuser Boston printing,
the authors have revised the original preface.
All rights reserved. This work may not be translated or copied in whole or in part without the writ-
ten permission of the publisher (Birkhäuser Boston, c/o Springer Science+Business Media LLC, 233
Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or
scholarly analysis. Use in connection with any form of information storage and retrieval, electronic
adaptation, computer software, or by similar or dissimilar methodology now known or hereafter de-
veloped is forbidden.
The use in this publication of trade names, trademarks, service marks and similar terms, even if they
are not identified as such, is not to be taken as an expression of opinion as to whether or not they are
subject to proprietary rights.
9 8 7 6 5 4 3 2 1
www.birkhauser.com (IBT)
CONTENTS
PREFACE ix
CHAPTER 1: FUNDAMENTAL CONCEPTS 1
1.1 Sets 1
1.2 Functions 12
1.3 Relations and Equivalence Relations 25
1.4 Operations on Sets 26
1.5 Mathematical Systems Considered in This Book 30
1.6 References and Notes 31
References 32
CHAPTER 2: ALGEBRAIC STRUCTURES 33
2.1 Some Basic Structures of Algebra 34
A. Semigroups and Groups 36
B. Rings and Fields 46
C. Modules, Vector Spaces, and Algebras 53
D. Overview 61
2.2 Homomorphisms 62
2.3 Application to Polynomials 69
2.4 References and Notes 74
References 74
CHAPTER 3: VECTOR SPACES AND LINEAR
TRANSFORMATIONS 75
3.1 Linear Spaces 75
3.2 Linear Subspaces and Direct Sums 81
3.3 Linear Independence, Bases, and Dimension 85
3.4 Linear Transformations 95
3.5 Linear Functionals 109
3.6 Bilinear Functionals 113
3.7 Projections 119
3.8 Notes and References 123
References 123
CHAPTER 4: FINITE-DIMENSIONAL VECTOR
SPACES AND MATRICES 124
4.1 Coordinate Representation of Vectors 124
4.2 Matrices 129
A. Representation of Linear Transformations
by Matrices 129
B. Rank of a Matrix 134
C. Properties of Matrices 136
4.3 Equivalence and Similarity 148
4.4 Determinants of Matrices 155
4.5 Eigenvalues and Eigenvectors 163
4.6 Some Canonical Forms of Matrices 169
4.7 Minimal Polynomials, Nilpotent Operators
and the Jordan Canonical Form 178
A. Minimal Polynomials 178
B. Nilpotent Operators 185
C. The Jordan Canonical Form 190
4.8 Bilinear Functionals and Congruence 194
4.9 Euclidean Vector Spaces 202
A. Euclidean Spaces: Definition and Properties 202
B. Orthogonal Bases 209
4.10 Linear Transformations on Euclidean Vector Spaces 216
A. Orthogonal Transformations 216
B. Adjoint Transformations 218
C. Self-Adjoint Transformations 221
D. Some Examples 227
E. Further Properties of Orthogonal
Transformations 231
4.11 Applications to Ordinary Differential Equations 238
A. Initial-Value Problem: Definition 238
B. Initial-Value Problem: Linear Systems 24
4.12 Notes and References 261
References 262
CHAPTER 5: METRIC SPACES 263
5.1 Definition of Metric Spaces 264
5.2 Some Inequalities 268
5.3 Examples of Important Metric Spaces 271
5.4 Open and Closed Sets 275
5.5 Complete Metric Spaces 286
5.6 Compactness 298
5.7 Continuous Functions 307
5.8 Some Important Results in Applications 314
5.9 Equivalent and Homeomorphic Metric Spaces.
Topological Spaces 317
5.10 Applications 323
A. Applications of the Contraction Mapping
Principle 323
B. Further Applications to Ordinary Differential
Equations 329
5.11 References and Notes 341
References 341
CHAPTER 6: NORMED SPACES AND INNER PRODUCT
SPACES 343
6.1 Normed Linear Spaces 343
6.2 Linear Subspaces 348
6.3 Infinite Series 350
6.4 Convex Sets 351
6.5 Linear Functionals 355
6.6 Finite- Dimensional Spaces 360
6.7 Geometric Aspects of Linear Functionals 363
6.8 Extension of Linear Functionals 367
6.9 Dual Space and Second Dual Space 370
6.10 Weak Convergence 372
6.11 Inner Product Spaces 375
6.12 Orthogonal Complements 381
6.13 Fourier Series 387
6.14 The Riesz Representation Theorem 393
6.15 Some Applications 394
A. Approximation of Elements in Hilbert Space
(Normal Equations) 395
B. Random Variables 397
C. Estimation of Random Variables 398
6.16 Notes and References 404
References 404
CHAPTER 7: LINEAR OPERATORS 406
7.1 Bounded Linear Transformations 407
7.2 Inverses 415
7.3 Conjugate and Adjoint Operators 419
7.4 Hermitian Operators 427
7.5 Other Linear Operators: Normal Operators, Projections,
Unitary Operators, and Isometric Operators 431
7.6 The Spectrum of an Operator 439
7.7 Completely Continuous Operators 447
7.8 The Spectral Theorem for Completely Continuous
Normal Operators 454
7.9 Differentiation of Operators 458
7.10 Some Applications 465
A. Applications to Integral Equations 465
B. An Example from Optimal Control 468
C. Minimization of Functionals: Method of Steepest
Descent 471
7.11 References and Notes 473
References 473
Index 475
PREFACE
This book evolved from a one-year sequence of courses offered by the authors
at Iowa State University. The audience for this book typically included theoreti-
cally oriented first- or second-year graduate students in various engineering or
science disciplines. Subsequently, while serving as Chair of the Department of
Electrical Engineering, and later, as Dean of the College of Engineering at the
University of Notre Dame, the first author continued using this book in courses
aimed primarily at graduate students in control systems. Since administrative
demands precluded the possibility of regularly scheduled classes, the Socratic
method was used in guiding students in self-study. This method of course
delivery turned out to be very effective and satisfying to student and teacher alike.
Feedback from colleagues and students suggests that this book has been used in
a similar manner elsewhere.
The original objectives in writing this book were to provide the reader with ap-
propriate mathematical background for graduate study in engineering or science;
to provide the reader with appropriate prerequisites for more advanced subjects
in mathematics; to allow the student in engineering or science to become famil-
iar with a great deal of pertinent mathematics in a rapid and efficient manner
without sacrificing rigor; to give the reader a unified overview of applicable
mathematics, thus enabling him or her to choose additional courses in math-
ematics more intelligently; and to make it possible for the student to understand
at an early stage of his or her graduate studies the mathematics used in the
current literature (e.g., journal articles, monographs, and the like).
Whereas the objectives enumerated above for writing this book were certain-
ly pertinent over twenty years ago, they are even more compelling today. The
reasons for this are twofold. First, today's graduate students in engineering or
science are expected to be more knowledgeable and sophisticated in
mathematics than students in the past. Second, today's graduate students in engineering
or science are expected to be familiar with a great deal of ancillary material
(primarily in the computer science area), acquired in courses that did not even
exist a couple of decades ago. In view of these added demands on the students
time, to become familiar with a great deal of mathematics in an efficient manner,
without sacrificing rigor, seems essential.
Since the original publication of this book, progress in technology, and con-
sequently, in applications of mathematics in engineering and science, has been
phenomenal. However, it must be emphasized that the type of mathematics
itself that is being utilized in these applications did not experience corresponding
substantial changes. This is particularly the case for algebra and analysis at the
intermediate level, as addressed in the present book. Accordingly, the material
of the present book is as current today as it was at the time when this book first
appeared. (Plus ça change, plus c'est la même chose. Alphonse Karr, 1849.)
This book may be viewed as consisting essentially of three parts: set theory
(Chapter 1), algebra (Chapters 2-4), and analysis (Chapters 5-7). Chapter 1 is
a prerequisite for all subsequent chapters. Chapter 2 emphasizes abstract alge-
bra (semigroups, groups, rings, etc.) and may essentially be skipped by those
who are not interested in this topic. Chapter 3, which addresses linear spaces
and linear transformations, is a prerequisite for Chapters 4, 6, and 7. Chapter
4, which treats finite-dimensional vector spaces and linear transformations
on such spaces (matrices) is required for Chapters 6 and 7. In Chapter 5, metric
spaces are treated. This chapter is a prerequisite for the subsequent chapters. Fi-
nally, Chapters 6 and 7 consider Banach and Hilbert spaces and linear operators
on such spaces, respectively.
The choice of applications in a book of this kind is subjective and will al-
ways be susceptible to criticisms. We have attempted to include applications of
algebra and analysis that have broad appeal. These applications, which may be
omitted without loss of continuity, are presented at the ends of Chapters 2, 4, 5,
6, and 7 and include topics dealing with ordinary differential equations, integral
equations, applications of the contraction mapping principle, minimization of
functionals, an example from optimal control, and estimation of random vari-
ables.
All exercises are an integral part of the text and are given when they arise,
rather than at the end of a chapter. Their intent is to further the reader's
understanding of the subject matter on hand.
The prerequisites for this book include the usual background in undergraduate
mathematics offered to students in engineering or in the sciences at universities
in the United States. Thus, in addition to graduate students, this book is suit-
able for advanced senior undergraduate students as well, and for self-study by
practitioners.
Concerning the labeling of items in the book, some comments are in order. Sec-
tions are assigned numerals that reflect the chapter and the section numbers. For
example, Section 2.3 signifies the third section in the second chapter. Extensive
sections are usually divided into subsections identified by upper-case com-
mon letters A, B, C, etc. Equations, definitions, theorems, corollaries, lemmas,
examples, exercises, figures, and special remarks are assigned monotonically
increasing numerals which identify the chapter, section, and item number. For
example, Theorem 4.4.7 denotes the seventh identified item in the fourth section
of Chapter 4. This theorem is followed by Eq. (4.4.8), the eighth identified item
in the same section. Within a given chapter, figures are identified by upper-case
letters A, B, C, etc., while outside of the chapter, the same figure is identified
by the above numbering scheme. Finally, the end of a proof or of an example is
signified by the symbol ■.
Suggested Course Outlines
Because of the flexibility described above, this book can be used either in a one-
semester course, or a two-semester course. In either case, mastery of the material
presented will give the student an appreciation of the power and the beauty of
the axiomatic method; will increase the student's ability to construct proofs;
will enable the student to distinguish between purely algebraic and topological
structures and combinations of such structures in mathematical systems; and of
course, it will broaden the student's background in algebra and analysis.
A one-semester course
Chapters 1, 3, 4, 5, and Sections 6.1 and 6.11 in Chapter 6 can serve as the basis
for a one-semester course, emphasizing basic aspects of Linear Algebra and
Analysis in a metric space setting.
The coverage of Chapter 1 should concentrate primarily on functions (Sec-
tion 1.2) and relations and equivalence relations (Section 1.3), while the material
concerning sets (Section 1.1) and operations on sets (Section 1.4) may be cov-
ered as reading assignments. On the other hand, Section 1.5 (on mathematical
systems) merits formal coverage, since it gives the student a good overview of
the book's aims and contents.
The material in this book has been organized so that Chapter 2, which ad-
dresses the important algebraic structures encountered in Abstract Algebra, may
be omitted without any loss of continuity. In a one-semester course emphasizing
Linear Algebra, this chapter may be omitted in its entirety.
In Chapter 3, which addresses general vector spaces and linear transforma-
tions, the material concerning linear spaces (Section 3.1), linear subspaces and
direct sums (Section 3.2), linear independence and bases (Section 3.3), and lin-
ear transformations (Section 3.4) should be covered in its entirety, while selected
topics on linear functionals (Section 3.5), bilinear functionals (Section 3.6), and
projections (Section 3.7) should be deferred until they are required in Chapter 4.
Chapter 4 addresses finite-dimensional vector spaces and linear transforma-
tions (matrices) defined on such spaces. The material on determinants (Section
4.4) and some of the material concerning linear transformations on Euclidean
vector spaces (Subsections 4.10D and 4.10E), as well as applications to ordinary
differential equations (Section 4.11) may be omitted without any loss of conti-
nuity. The emphasis in this chapter should be on coordinate representations of
vectors (Section 4.1), the representation of linear transformations by matrices
and the properties of matrices (Section 4.2), equivalence and similarity of ma-
trices (Section 4.3), eigenvalues and eigenvectors (Section 4.5), some canonical
forms of matrices (Section 4.6), minimal polynomials, nilpotent operators and
the Jordan canonical form (Section 4.7), bilinear functionals and congruence
(Section 4.8), Euclidean vector spaces (Section 4.9), and linear transformations
on Euclidean vector spaces (Subsections 4.10A, 4.10B, and 4.10C).
Chapter 5 addresses metric spaces, which constitute some of the most impor-
tant topological spaces. In a one-semester course, the emphasis in this chapter
should be on the definition of metric space and the presentation of important
classes of metric spaces (Sections 5.1 and 5.3), open and closed sets (Sec-
tion 5.4), complete metric spaces (Section 5.5), compactness (Section 5.6), and
continuous functions (Section 5.7). The development of many classes of metric
spaces requires important inequalities, including the Holder and the Minkowski
inequalities for finite and infinite sums and for integrals. These are presented
in Section 5.2 and need to be included in the course. Sections 5.8 and 5.10 ad-
dress specific applications and may be omitted without any loss of continuity.
However, time permitting, the material in Section 5.9, concerning equivalent and
homeomorphic metric spaces and topological spaces, should be considered for
inclusion in the course, since it provides the student a glimpse into other areas
of mathematics.
To demonstrate mathematical systems endowed with both algebraic and to-
pological structures, the one-semester course should include the material of
Sections 6.1 and 6.2 in Chapter 6, concerning normed linear spaces (resp., Ban-
ach spaces) and inner product spaces (resp., Hilbert spaces), respectively.
A two-semester course
In addition to the material outlined above for a one-semester course, a two-se-
mester course should include most of the material in Chapters 2, 6, and 7.
Chapter 2 addresses algebraic structures. The coverage of semigroups and
groups, rings and fields, and modules, vector spaces and algebras (Section 2.1)
should be in sufficient detail to give the student an appreciation of the various
algebraic structures summarized in Figure B on page 61. Important mappings
defined on these algebraic structures (homomorphisms) should also be empha-
sized (Section 2.2) in a two-semester course, as should the brief treatment of
polynomials in Section 2.3.
The first ten sections of Chapter 6 address normed linear spaces (resp., Ban-
ach spaces) while the next four sections address inner product spaces (resp.,
Hilbert spaces). The last section of this chapter, which includes applications (to
random variables and estimates of random variables), may be omitted without
any loss of continuity. The material concerning normed linear spaces (Sec-
tion 6.1), linear subspaces (Section 6.2), infinite series (Section 6.3), convex sets
(Section 6.4), linear functionals (Section 6.5), finite-dimensional spaces (Sec-
tion 6.6), inner product spaces (Section 6.11), orthogonal complements (Section
6.12), and Fourier series (Section 6.13) should be covered in its entirety. Cov-
erage of the material on geometric aspects of linear functionals (Section 6.7),
extensions of linear functionals (Section 6.8), dual space and second dual space
(Section 6.9), weak convergence (Section 6.10), and the Riesz representation
theorem (Section 6.14) should be selective and tailored to the availability of time
and the students' areas of interest. (For example, students interested in
optimization and estimation problems may want a detailed coverage of the Hahn-Banach
theorem included in Section 6.8.)
Chapter 7 addresses (bounded) linear operators defined on Banach and Hilbert
spaces. The first nine sections of this chapter should be covered in their entirety
in a two-semester course. The material of this chapter includes bounded lin-
ear transformations (Section 7.1), inverses (Section 7.2), conjugate and adjoint
operators (Section 7.3), Hermitian operators (Section 7.4), normal, projection,
unitary and isometric operators (Section 7.5), the spectrum of an operator (Sec-
tion 7.6), completely continuous operators (Section 7.7), the spectral theorem
for completely continuous normal operators (Section 7.8), and differentiation of
(not necessarily linear and bounded) operators (Section 7.9). The last section,
which includes applications to integral equations, an example from optimal con-
trol, and minimization of functionals by the method of steepest descent, may be
omitted without loss of continuity.
Both one-semester and two-semester courses offered by the present authors,
based on this book, usually included a project conducted by each course par-
ticipant to demonstrate the applicability of the course material. Each project
involved a formal presentation to the entire class at the end of the semester.
The courses described above were also offered using the Socratic method, fol-
lowing the outlines given above. These courses typically involved half a dozen
participants. While most of the material was self-taught by the students
themselves, the classroom meetings served as a forum for guidance, clarifications, and
challenges by the teacher, usually resulting in lively discussions of the subject on
hand not only among teacher and students, but also among students themselves.
For the current printing of this book, we have created a supplementary web-
site of additional resources for students and instructors: http://Michel.Herget.
net. Available at this website are additional current references concerning the
subject matter of the book and a list of several areas of applications (including
references). Since the latter reflects mostly the authors' interests, it is by defini-
tion rather subjective. Among several additional items, the website also includes
some reviews of the present book. In this regard, the authors would like to invite
readers to submit reviews of their own for inclusion into the website.
The present publication of Algebra and Analysis for Engineers and Scientists
was made possible primarily because of Tom Grasso, Birkhäuser's
Computational Sciences and Engineering Editor, whom we would like to thank for his
considerations and professionalism.
Anthony N. Michel
Charles J . Herget
Summer 2007
1
FUNDAMENTAL CONCEPTS
In this chapter we present fundamental concepts required throughout the
remainder of this book. We begin by considering sets in Section 1.1. In
Section 1.2 we discuss functions; in Section 1.3 we introduce relations and
equivalence relations; and in Section 1.4 we concern ourselves with operations
on sets. In Section 1.5 we give a brief indication of the types of mathematical
systems which we will consider in this book. The chapter concludes with a
brief discussion of references.
1.1. SETS
Virtually every area of modern mathematics is developed by starting from
an undefined object called a set. There are several reasons for doing this.
One of these is to develop a mathematical discipline in a completely axiomatic
and totally abstract manner. Another reason is to present a unified approach
to what may seem to be highly diverse topics in mathematics. Our reason is
the latter, for our interest is not in abstract mathematics for its own sake.
However, by using abstraction, many of the underlying principles of modern
mathematics are more clearly understood.
Thus, we begin by assuming that a set is a well defined collection of
elements or objects. We denote sets by common capital letters A, B, C, etc.,
and elements or objects of sets by lower case letters a, b, c, etc. For example,
we write

A = {a, b, c}

to indicate that A is the collection of elements a, b, c. If an element x belongs
to a set A, we write

x ∈ A.

In this case we say that "x belongs to A," or "x is contained in A," or "x is
a member of A," etc. If x is any element and if A is a set, then we assume that
one knows whether x belongs to A or whether x does not belong to A. If x
does not belong to A we write

x ∉ A.
To illustrate some of the concepts, we assume that the reader is familiar
with the set of real numbers. Thus, if we say

R is the set of all real numbers,

then this is a well defined collection of objects. We point out that it is possible
to characterize the set of real numbers in a purely abstract manner based on
an axiomatic approach. We shall not do so here.

To illustrate a non-well defined collection of objects, consider the statement
"the set of all tall people in Ames, Iowa." This is clearly not precise
enough to be considered here.
We will agree that any set A may not contain any given element more
than once unless we explicitly say so. Moreover, we assume that the concept
of "order" will play no role when representing elements of a set, unless we
say so. Thus, the sets A = {a, b, c} and B = {c, b, a} are to be viewed as being
exactly the same set.
We usually do not describe a set by listing every element between the
curly brackets as we did for set A above. A convenient method of
characterizing sets is as follows. Suppose that for each element x of a set A there is a
statement P(x) which is either true or false. We may then define a set B which
consists of all elements x ∈ A such that P(x) is true, and we may write

B = {x ∈ A: P(x) is true}.

For example, let A denote the set of all people who live in Ames, Iowa, and
let B denote the set of all males who live in Ames. We can write, then,

B = {x ∈ A: x is a male}.

When it is clear which set x belongs to, we sometimes write {x: P(x) is true}
(instead of, say, {x ∈ A: P(x) is true}).
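For readers who also program, the set-builder form B = {x ∈ A: P(x) is true} corresponds directly to a set comprehension in a language such as Python. The universe A and the predicate P below are illustrative choices of ours, not part of the text:

```python
# A sketch of B = {x in A : P(x) is true} as a Python set comprehension.
# The universe A and predicate P are arbitrary illustrative choices.
A = set(range(-5, 6))            # A = {-5, -4, ..., 5}

def P(x):
    return x % 2 == 0            # P(x): "x is an even integer"

B = {x for x in A if P(x)}       # B = {x in A : P(x) is true}
assert B == {-4, -2, 0, 2, 4}
```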
It is also necessary to consider a set which has no members. Since a set is
determined by its elements, there is only one such set which is called the
empty set, or the vacuous set, or the null set, or the void set and which is
denoted by ∅. Any set, A, consisting of one or more elements is said to be
non-empty or non-void. If A is non-void we write A ≠ ∅.

If A and B are sets and if every element of B also belongs to A, then we
say that B is a subset of A or A includes B, and we write B ⊂ A or A ⊃ B.
Furthermore, if B ⊂ A and if there is an x ∈ A such that x ∉ B, then we
say that B is a proper subset of A. Some texts make a distinction between
proper subset and any subset by using the notation ⊂ and ⊆, respectively.
We shall not use the symbol ⊆ in this book. We note that if A is any set,
then ∅ ⊂ A. Also, ∅ ⊂ ∅. If B is not a subset of A, we write B ⊄ A or
A ⊅ B.
1.1.1. Example. Let R denote the set of all real numbers, let Z denote
the set of all integers, let J denote the set of all positive integers, and let Q
denote the set of all rational numbers. We could alternately describe the set
Z as

Z = {x ∈ R: x is an integer}.

Thus, for every x ∈ R, the statement "x is an integer" is either true or false.
We frequently also specify sets such as J in the following obvious manner,

J = {x ∈ Z: x = 1, 2, ...}.

We can specify the set Q as

Q = {x ∈ R: x = p/q, p, q ∈ Z, q ≠ 0}.

It is clear that ∅ ⊂ J ⊂ Z ⊂ Q ⊂ R, and that each of these subsets is a
proper subset. We note that 0 ∉ J.
We now wish to state what is meant by equality of sets.

1.1.2. Definition. Two sets, A and B, are said to be equal if A ⊂ B and
B ⊂ A. In this case we write A = B. If two sets, A and B, are not equal,
we write A ≠ B. If x and y denote the same element of a set, we say that they
are equal and we write x = y. If x and y denote distinct elements of a set,
we write x ≠ y.

We emphasize that all definitions are "if and only if" statements. Thus, in
the above definition we should actually have said: A and B are equal if and
only if A ⊂ B and B ⊂ A. Since this is always understood, hereafter all
definitions will imply the "only if" portion. Thus, we simply say: two sets A
and B are said to be equal if A ⊂ B and B ⊂ A.

In Definition 1.1.2 we introduced two concepts of equality, one of equality
of sets and one of equality of elements. We shall encounter many forms of
equality throughout this book.
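Equality of sets as mutual containment can be mirrored computationally. In Python the operator `<=` tests set containment (allowing equality, as ⊂ does in this book), so the following small sketch checks both containments separately before comparing:

```python
# Two sets are equal exactly when each contains the other.
A = {1, 2, 3}
B = {3, 2, 1}                # same elements, listed in a different order
assert A <= B and B <= A     # A ⊂ B and B ⊂ A
assert A == B                # hence A = B
```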
Now let X be a set and let A ⊂ X. The complement of subset A with
respect to X is the set of elements of X which do not belong to A. We denote
the complement of A with respect to X by ∁_X A. When it is clear that the
complement is with respect to X, we simply say the complement of A (instead of
the complement of A with respect to X), and simply write Ã. Thus, we have

Ã = {x ∈ X: x ∉ A}. (1.1.3)

In every discussion involving sets, we will always have a given fixed set
in mind from which we take elements and subsets. We will call this set the
universal set, and we will usually denote this set by X.

Throughout the remainder of the present section, X always denotes an
arbitrary non-void fixed set.
We now establish some properties of sets.

1.1.4. Theorem. Let A, B, and C be subsets of X. Then
(i) if A ⊂ B and B ⊂ C, then A ⊂ C;
(ii) X̃ = ∅;
(iii) ∅̃ = X;
(iv) (Ã)̃ = A;
(v) A ⊂ B if and only if B̃ ⊂ Ã; and
(vi) A = B if and only if Ã = B̃.

Proof. To prove (i), first assume that A is non-void and let x ∈ A. Since
A ⊂ B, x ∈ B, and since B ⊂ C, x ∈ C. Since x is arbitrary, every element
of A is also an element of C and so A ⊂ C. Finally, if A = ∅, then A ⊂ C
follows trivially.
The proofs of parts (ii) and (iii) follow immediately from (1.1.3).

To prove (iv), we must show that A ⊂ (Ã)̃ and (Ã)̃ ⊂ A. If A = ∅,
then clearly A ⊂ (Ã)̃. Now suppose that A is non-void. We note from
(1.1.3) that

(Ã)̃ = {x ∈ X: x ∉ Ã}. (1.1.5)

If x ∈ A, it follows from (1.1.3) that x ∉ Ã, and hence we have from
(1.1.5) that x ∈ (Ã)̃. This proves that A ⊂ (Ã)̃.

If (Ã)̃ = ∅, then A = ∅; otherwise we would have a contradiction by
what we have already shown, i.e., A ⊂ (Ã)̃. So let us assume that (Ã)̃
≠ ∅. If x ∈ (Ã)̃ it follows from (1.1.5) that x ∉ Ã, and thus we have
x ∈ A in view of (1.1.3). Hence, (Ã)̃ ⊂ A.

We leave the proofs of parts (v) and (vi) as an exercise.

1.1.6. Exercise. Prove parts (v) and (vi) of Theorem 1.1.4.
The proofs given in parts (i) and (iv) of Theorem 1.1.4 are intentionally
quite detailed in order to demonstrate the exact procedure required to prove
containment and equality of sets. Frequently, the manipulations required to
prove some seemingly obvious statements are quite long. It is suggested that
the reader carry out all the details in the manipulations of the above exercise
and the exercises that follow.
Next, let A and B be subsets of X. We define the union of sets A and B,
denoted by A ∪ B, as the set of all elements that are in A or B; i.e.,

A ∪ B = {x ∈ X: x ∈ A or x ∈ B}.

When we say x ∈ A or x ∈ B, we mean x is in either A or in B or in both
A and B. This inclusive use of "or" is standard in mathematics and logic.

If A and B are subsets of X, we define their intersection to be the set of all
elements which belong to both A and B and denote the intersection by A ∩ B.
Specifically,

A ∩ B = {x ∈ X: x ∈ A and x ∈ B}.

If the intersection of two sets A and B is empty, i.e., if A ∩ B = ∅, we
say that A and B are disjoint.

For example, let X = {1, 2, 3, 4, 5}, let A = {1, 2}, let B = {3, 4, 5}, let
C = {2, 3}, and let D = {4, 5}. Then Ã = B, B̃ = A, D ⊂ B, A ∪ B = X,
A ∩ B = ∅, A ∪ C = {1, 2, 3}, B ∩ D = D, A ∩ C = {2}, etc.
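The numerical example just given can be verified directly with Python's built-in set operations (`|` for union, `&` for intersection, and `-` for the complement within X), a small sketch rather than part of the text:

```python
# Verifying the worked example: X = {1,2,3,4,5}, A = {1,2}, B = {3,4,5},
# C = {2,3}, D = {4,5}.
X = {1, 2, 3, 4, 5}
A, B, C, D = {1, 2}, {3, 4, 5}, {2, 3}, {4, 5}

assert X - A == B and X - B == A      # A~ = B and B~ = A
assert D <= B                         # D ⊂ B
assert A | B == X and A & B == set()  # A ∪ B = X, A ∩ B = ∅ (disjoint)
assert A | C == {1, 2, 3}
assert B & D == D and A & C == {2}
```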
In the next result we summarize some of the important properties of
union and intersection of sets.

1.1.7. Theorem. Let A, B, and C be subsets of X. Then
(i) A ∩ B = B ∩ A;
(ii) A ∪ B = B ∪ A;
(iii) A ∩ ∅ = ∅;
(iv) A ∪ ∅ = A;
(v) A ∩ X = A;
(vi) A ∪ X = X;
(vii) A ∩ A = A;
(viii) A ∪ A = A;
(ix) A ∪ Ã = X;
(x) A ∩ Ã = ∅;
(xi) A ∩ B ⊂ A;
(xii) A ∩ B = A if and only if A ⊂ B;
(xiii) A ⊂ A ∪ B;
(xiv) A = A ∪ B if and only if B ⊂ A;
(xv) (A ∩ B) ∩ C = A ∩ (B ∩ C);
(xvi) (A ∪ B) ∪ C = A ∪ (B ∪ C);
(xvii) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C);
(xviii) (A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C);
(xix) (A ∪ B)̃ = Ã ∩ B̃; and
(xx) (A ∩ B)̃ = Ã ∪ B̃.
Proof. We only prove part (xviii) of this theorem, again as an illustration of the manipulations involved. We will first show that (A ∩ B) ∪ C ⊂ (A ∪ C) ∩ (B ∪ C), and then we show that (A ∩ B) ∪ C ⊃ (A ∪ C) ∩ (B ∪ C).
Clearly, if (A ∩ B) ∪ C = ∅, the assertion is true. So let us assume that (A ∩ B) ∪ C ≠ ∅, and let x be any element of (A ∩ B) ∪ C. Then x ∈ A ∩ B or x ∈ C. Suppose x ∈ A ∩ B. Then x belongs to both A and B, and hence x ∈ A ∪ C and x ∈ B ∪ C. From this it follows that x ∈ (A ∪ C) ∩ (B ∪ C). On the other hand, let x ∈ C. Then x ∈ A ∪ C and x ∈ B ∪ C, and hence x ∈ (A ∪ C) ∩ (B ∪ C). Thus, if x ∈ (A ∩ B) ∪ C, then x ∈ (A ∪ C) ∩ (B ∪ C), and we have

(A ∩ B) ∪ C ⊂ (A ∪ C) ∩ (B ∪ C).   (1.1.8)

To show that (A ∩ B) ∪ C ⊃ (A ∪ C) ∩ (B ∪ C), we need to prove the assertion only when (A ∪ C) ∩ (B ∪ C) ≠ ∅. So let x be any element of (A ∪ C) ∩ (B ∪ C). Then x ∈ A ∪ C and x ∈ B ∪ C. Since x ∈ A ∪ C, then x ∈ A or x ∈ C. Furthermore, x ∈ B ∪ C implies that x ∈ B or x ∈ C. We know that either x ∈ C or x ∉ C. If x ∈ C, then x ∈ (A ∩ B) ∪ C. If x ∉ C, then it follows from the above comments that x ∈ A and also x ∈ B. Then x ∈ A ∩ B, and hence x ∈ (A ∩ B) ∪ C. Thus, if x ∉ C, then x ∈ (A ∩ B) ∪ C. Since this exhausts all the possibilities, we conclude that

(A ∪ C) ∩ (B ∪ C) ⊂ (A ∩ B) ∪ C.   (1.1.9)

From (1.1.8) and (1.1.9) it follows that (A ∪ C) ∩ (B ∪ C) = (A ∩ B) ∪ C. ∎
1.1.10. Exercise. Prove parts (i) through (xvii) and parts (xix) and (xx) of Theorem 1.1.7.
In view of part (xvi) of Theorem 1.1.7, there is no ambiguity in writing A ∪ B ∪ C. Extending this concept, let n be any positive integer and let A₁, A₂, ..., Aₙ denote subsets of X. The set A₁ ∪ A₂ ∪ ... ∪ Aₙ is defined to be the set of all x ∈ X which belong to at least one of the subsets Aᵢ, and we write

⋃_{i=1}^{n} Aᵢ = A₁ ∪ A₂ ∪ ... ∪ Aₙ = {x ∈ X : x ∈ Aᵢ for some i = 1, ..., n}.

Similarly, by part (xv) of Theorem 1.1.7, there is no ambiguity in writing A ∩ B ∩ C. We define

⋂_{i=1}^{n} Aᵢ = A₁ ∩ A₂ ∩ ... ∩ Aₙ = {x ∈ X : x ∈ Aᵢ for all i = 1, ..., n}.

That is, ⋂_{i=1}^{n} Aᵢ consists of those members of X which belong to all the subsets A₁, A₂, ..., Aₙ.
We will consider the union and the intersection of an infinite number of subsets Aᵢ at a later point in the present section.
The following is a generalization of parts (xix) and (xx) of Theorem 1.1.7.

1.1.11. Theorem. Let A₁, ..., Aₙ be subsets of X. Then

(i) (⋃_{i=1}^{n} Aᵢ)~ = ⋂_{i=1}^{n} Aᵢ~;   (1.1.12)

(ii) (⋂_{i=1}^{n} Aᵢ)~ = ⋃_{i=1}^{n} Aᵢ~.   (1.1.13)

1.1.14. Exercise. Prove Theorem 1.1.11.

The results expressed in Eqs. (1.1.12) and (1.1.13) are usually referred to as De Morgan's laws. We will see later in this section that these laws hold under more general conditions.
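For finite sets, De Morgan's laws (1.1.12) and (1.1.13) can be verified mechanically. The sketch below fixes an arbitrary universe X (our choice, purely for illustration) and checks both identities for a family of three subsets.

```python
# Complement relative to a fixed universe X, written A~ in the text.
X = set(range(10))

def complement(A):
    return X - A

# An arbitrary finite family A1, A2, A3 of subsets of X.
family = [{1, 2, 3}, {2, 3, 4, 8}, {0, 3, 5}]

# (1.1.12): (A1 ∪ ... ∪ An)~ = A1~ ∩ ... ∩ An~
union_then_complement = complement(set().union(*family))
intersect_complements = set.intersection(*[complement(A) for A in family])

# (1.1.13): (A1 ∩ ... ∩ An)~ = A1~ ∪ ... ∪ An~
inter_then_complement = complement(set.intersection(*family))
union_complements = set().union(*[complement(A) for A in family])

de_morgan_1 = union_then_complement == intersect_complements
de_morgan_2 = inter_then_complement == union_complements
```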
Next, let A and B be two subsets of X. We define the difference of B and A, denoted B − A, as the set of elements in B which are not in A; i.e.,

B − A = {x ∈ X : x ∈ B and x ∉ A}.

We note here that A is not required to be a subset of B. It is clear that B − A = B ∩ A~.
Now let A and B again be subsets of the set X. The symmetric difference of A and B is denoted by A Δ B and is defined as

A Δ B = (A − B) ∪ (B − A).

The following properties follow immediately.

1.1.15. Theorem. Let A, B, and C denote subsets of X. Then
(i) A Δ B = B Δ A;
(ii) A Δ B = (A ∪ B) − (A ∩ B);
(iii) A Δ A = ∅;
(iv) A Δ ∅ = A;
(v) A Δ (B Δ C) = (A Δ B) Δ C;
(vi) A ∩ (B Δ C) = (A ∩ B) Δ (A ∩ C); and
(vii) A Δ B ⊂ (A Δ C) ∪ (C Δ B).

1.1.16. Exercise. Prove Theorem 1.1.15.
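The parts of Theorem 1.1.15 are easy to spot-check on concrete sets before proving them; the sample sets below are arbitrary choices. (Python's `^` operator on sets computes exactly A Δ B.)

```python
A = {0, 1, 2, 3}
B = {2, 3, 4}
C = {3, 4, 5, 6}

def sym_diff(P, Q):
    # A Δ B = (A − B) ∪ (B − A); Python's P ^ Q computes the same set.
    return (P - Q) | (Q - P)

part_i   = sym_diff(A, B) == sym_diff(B, A)
part_ii  = sym_diff(A, B) == (A | B) - (A & B)
part_iii = sym_diff(A, A) == set()
part_iv  = sym_diff(A, set()) == A
part_v   = sym_diff(A, sym_diff(B, C)) == sym_diff(sym_diff(A, B), C)
part_vi  = (A & sym_diff(B, C)) == sym_diff(A & B, A & C)
part_vii = sym_diff(A, B) <= (sym_diff(A, C) | sym_diff(C, B))  # <= is ⊂ here
```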
In passing, we point out that the use of Venn diagrams is highly useful in visualizing properties of sets; however, under no circumstances should such diagrams take the place of a proof. In Figure A we illustrate the concepts of union, intersection, difference, and symmetric difference of two sets, and the complement of a set, by making use of Venn diagrams. Here, the shaded regions represent the indicated sets.

1.1.17. Figure A. Venn diagrams.
1.1.18. Definition. A non-void set A is said to be finite if A contains n distinct elements, where n is some positive integer; such a set A is said to be of order n. The null set is defined to be finite with order zero. A set consisting of exactly one element, say A = {a}, is called a singleton or the singleton of a. If a set A is not finite, then we say that A is infinite.

In Section 1.2 we will further categorize infinite sets as being countable or uncountable.
Next, we need to consider sets whose elements are sets themselves. For example, if A, B, and C are subsets of X, then the collection 𝒜 = {A, B, C} is a set whose elements are A, B, and C. We usually call a set whose elements are subsets of X a family of subsets of X or a collection of subsets of X.
We will usually employ a hierarchical system of notation where lower case letters, e.g., a, b, c, are elements of X; upper case letters, e.g., A, B, C, are subsets of X; and script letters, e.g., 𝒜, ℬ, 𝒞, are families of subsets of X. We could, of course, continue this process and consider a set whose elements are families of subsets, e.g., {𝒜, ℬ, 𝒞}.
In connection with the above comments, we point out that the empty set, ∅, is a subset of X. It is possible to form a non-empty set whose only element is the empty set, i.e., {∅}. In this case, {∅} is a singleton. We see that ∅ ∈ {∅} and ∅ ⊂ {∅}.
In principle, we could also consider sets made up of both elements of X and subsets of X. For example, if x ∈ X and A ⊂ X, then {x, A} is a valid set. However, we shall not make use of sets of this nature in this book.
There is a special family of subsets of X to which we give a special name.

1.1.19. Definition. Let A be any subset of X. We define the power class of A, or the power set of A, to be the family of all subsets of A. We denote the power class of A by 𝒫(A). Specifically,

𝒫(A) = {B : B ⊂ A}.

1.1.20. Example. The power class of the empty set is 𝒫(∅) = {∅}, i.e., the singleton of ∅. The power class of a singleton is 𝒫({a}) = {∅, {a}}. For the set A = {a, b}, 𝒫(A) = {∅, {a}, {b}, {a, b}}. In general, if A is a finite set with n elements, then 𝒫(A) contains 2ⁿ elements.
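A small sketch of the power class for finite sets: the code below (an illustration using Python's itertools, not part of the text) builds 𝒫(A) and confirms the 2ⁿ count.

```python
from itertools import chain, combinations

def power_class(A):
    """Return 𝒫(A) as a set of frozensets (subsets must be hashable)."""
    elems = list(A)
    subsets = chain.from_iterable(
        combinations(elems, k) for k in range(len(elems) + 1))
    return {frozenset(s) for s in subsets}

P_empty = power_class(set())        # {∅}, the singleton of the empty set
P_ab = power_class({'a', 'b'})      # {∅, {a}, {b}, {a, b}}
size_check = len(power_class({1, 2, 3, 4})) == 2 ** 4   # 2^n elements
```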
Before proceeding further, it should be pointed out that a free and uncritical use of set theory can lead to contradictions, and that set theory has had a careful development with various devices used to exclude the contradictions. Roughly speaking, contradictions arise when one uses sets which are "too big," such as trying to speak of a set which contains everything. In all of our subsequent discussions we will keep away from these contradictions by always having some set or space X fixed for a given discussion and by considering only sets whose elements are elements of X, or sets (collections) whose elements are subsets of X, or sets (families) whose elements are collections of subsets of X, etc.
Let us next consider ordered sets. Above, we defined set in such a manner that the ordering of the elements is immaterial, and furthermore that each element is distinct. Thus, if a and b are elements of X, then {a, b} = {b, a}; i.e., there is no preference given to a or b. Furthermore, we have {a, a, b} = {a, b}. In this case we sometimes speak of an unordered pair {a, b}.
Frequently, we will need to consider the ordered pair (a, b) (a and b need not belong to the same set), where we distinguish between the first element a and the second element b. In this case (a, b) = (u, v) if and only if u = a and v = b. Thus, (a, b) ≠ (b, a) if a ≠ b. Also, we will consider ordered triplets (a, b, c), ordered quadruplets (a, b, c, d), etc., where we need to distinguish between the first element, second element, third element, fourth element, etc. Ordered pairs, ordered triplets, ordered quadruplets, etc., are examples of ordered sets.
We point out here that our characterization of ordered sets is not axiomatic, since we are assuming that the reader knows what is meant by the first element, second element, third element, etc. (However, it is possible to define ordered sets in a totally abstract fashion without assuming this simple fact. We shall forego these subtle distinctions and accept the preceding as a definition.)
Now let X and Y be two non-void sets. We define the Cartesian or direct product of X and Y, denoted by X × Y, as the set of all ordered pairs whose first element belongs to X and whose second element belongs to Y. Thus,

X × Y = {(x, y) : x ∈ X, y ∈ Y}.   (1.1.21)

Next, let X₁, ..., Xₙ denote n arbitrary non-void sets. We similarly define the (n-fold) Cartesian product of X₁, ..., Xₙ, denoted by X₁ × X₂ × ... × Xₙ, as

X₁ × X₂ × ... × Xₙ = {(x₁, x₂, ..., xₙ) : x₁ ∈ X₁, x₂ ∈ X₂, ..., xₙ ∈ Xₙ}.   (1.1.22)

We call xᵢ the ith element of the ordered set (x₁, ..., xₙ) ∈ X₁ × X₂ × ... × Xₙ, i = 1, ..., n. Here again, two ordered sets (x₁, ..., xₙ) and (x₁′, ..., xₙ′) are said to be equal if and only if xᵢ = xᵢ′, i = 1, ..., n.
In the following example, the symbol ≜ means equal by definition.
1.1.23. Example. Let R be the set of all real numbers. We denote the Cartesian product R × R by R² ≜ R × R. Thus, if x, y ∈ R, the ordered pair (x, y) ∈ R × R. We may interpret (x, y) geometrically as being the coordinates of a point in the plane, x being the first coordinate and y the second coordinate.

1.1.24. Example. Let A = {0, 1}, and let B = {a, b, c}. Then

A × B = {(0, a), (0, b), (0, c), (1, a), (1, b), (1, c)}

and

B × A = {(a, 0), (a, 1), (b, 0), (b, 1), (c, 0), (c, 1)}.

From this example it follows that, in general, if A and B are distinct sets, then A × B ≠ B × A.
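For finite sets, `itertools.product` enumerates the Cartesian product in Python. The sketch below reproduces Example 1.1.24, with strings standing in for the abstract elements a, b, c, and confirms that A × B ≠ B × A.

```python
from itertools import product

A = {0, 1}
B = {'a', 'b', 'c'}

AxB = set(product(A, B))   # {(0,'a'), (0,'b'), (0,'c'), (1,'a'), (1,'b'), (1,'c')}
BxA = set(product(B, A))   # {('a',0), ('a',1), ..., ('c',1)}

commutes = (AxB == BxA)         # False: the ordered pairs are different
sizes = (len(AxB), len(BxA))    # both 6 = |A| * |B|
```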
Next, we consider some generalizations to an ordered set. To this end, let I denote any non-void set which we call an index set. Now for each α ∈ I, suppose there is a unique A_α ⊂ X. We call {A_α : α ∈ I} an indexed family of sets. This notation requires some clarification. Strictly speaking, the set notation {A_α : α ∈ I} would normally indicate that none of the sets A_α, α ∈ I, may be repeated. However, in the case of an indexed family we agree to permit the possibility that the sets A_α, α ∈ I, need not be distinct.
We define an indexed set in a similar manner. Let I be an index set, and for each α ∈ I let there be a unique element x_α ∈ X. Then the set {x_α : α ∈ I} is called an indexed set. Here again, we agree to permit the possibility that the elements x_α, α ∈ I, need not be distinct. Clearly, if I is a finite non-void set, then an indexed set is simply an ordered set.
In the next definition, and throughout the remainder of this section, J denotes the set of positive integers.

1.1.25. Definition. A sequence is an indexed set whose index set is J. A sequence of sets is an indexed family of sets whose index set is J.

We usually abbreviate the sequence {xₙ ∈ X : n ∈ J} by {xₙ} when no possibility for confusion exists. (Even though the same notation is used for the sequence {xₙ} and the singleton of xₙ, the meaning as to which is meant will always be clear from context.) Some authors write {xₙ}_{n=1}^{∞} to indicate that the index set of the sequence is J. Also, some authors allow the index set of a sequence to be finite.
We are now in a position to consider the following additional generalizations.

1.1.26. Definition. Let {A_α : α ∈ I} be an indexed family of sets, and let K be any subset of I. If K is non-void, we define

⋃_{α∈K} A_α = {x ∈ X : x ∈ A_α for some α ∈ K}

and

⋂_{α∈K} A_α = {x ∈ X : x ∈ A_α for all α ∈ K}.

If K = ∅, we define ⋃_{α∈∅} A_α = ∅ and ⋂_{α∈∅} A_α = X.

The union and intersection of families of sets which are not necessarily indexed are defined in a similar fashion. Thus, if 𝒜 is any non-void family of subsets of X, then we define

⋃_{A∈𝒜} A = {x ∈ X : x ∈ A for some A ∈ 𝒜}

and

⋂_{A∈𝒜} A = {x ∈ X : x ∈ A for all A ∈ 𝒜}.

When, in Definition 1.1.26, K is of the form K = {k, k + 1, k + 2, ...}, where k is an integer, we sometimes write ⋃_{n=k}^{∞} Aₙ and ⋂_{n=k}^{∞} Aₙ.
1.1.27. Example. Let X = R, the set of real numbers, and let I = {α ∈ R : 0 < α ≤ 1}. Let A_α = {x ∈ R : 0 ≤ x ≤ α} for all α ∈ I. Then ⋃_{α∈I} A_α = A₁ = {x ∈ R : 0 ≤ x ≤ 1}, and ⋂_{α∈I} A_α = {0}, i.e., the singleton containing only the element 0.

1.1.28. Example. Let X = R, the set of real numbers, and let I = J. Let Aₙ = {x : −n ≤ x ≤ n}. Then ⋃_{n=1}^{∞} Aₙ = R and ⋂_{n=1}^{∞} Aₙ = {x : −1 ≤ x ≤ 1} = A₁.
Let Bₙ = {x ∈ R : 0 ≤ x < 1 + 1/n}. Then ⋃_{n=1}^{∞} Bₙ = {x : 0 ≤ x < 2} and ⋂_{n=1}^{∞} Bₙ = {x : 0 ≤ x ≤ 1}.
The reader is now in a position to prove the following results.

1.1.29. Theorem. Let {A_α : α ∈ I} be an indexed family of sets. Let B be any subset of X, and let K be any subset of I. Then
(i) B ∪ (⋃_{α∈K} A_α) = ⋃_{α∈K} (B ∪ A_α);
(ii) B ∩ (⋂_{α∈K} A_α) = ⋂_{α∈K} (B ∩ A_α);
(iii) B ∪ (⋂_{α∈K} A_α) = ⋂_{α∈K} (B ∪ A_α);
(iv) B ∩ (⋃_{α∈K} A_α) = ⋃_{α∈K} (B ∩ A_α);
(v) (⋃_{α∈K} A_α)~ = ⋂_{α∈K} A_α~; and
(vi) (⋂_{α∈K} A_α)~ = ⋃_{α∈K} A_α~.

1.1.30. Exercise. Prove Theorem 1.1.29.
Parts (v) and (vi) of Theorem 1.1.29 are called De Morgan's laws.
We conclude the present section with the following:

1.1.31. Definition. Let 𝒜 be any family of subsets of X. 𝒜 is said to be a family of disjoint sets if for all A, B ∈ 𝒜 such that A ≠ B, we have A ∩ B = ∅. A sequence of sets {Eₙ} is said to be a sequence of disjoint sets if for every m, n ∈ J such that m ≠ n, Eₘ ∩ Eₙ = ∅.
1.2. FUNCTIONS

We first give the definition of a function in a set-theoretic manner. Then we discuss the meaning of function in more intuitive terms.

1.2.1. Definition. Let X and Y be non-void sets. A function f from X into Y is a subset f of X × Y such that for every x ∈ X there is one and only one y ∈ Y (i.e., there is a unique y ∈ Y) such that (x, y) ∈ f. The set X is called the domain of f (or the domain of definition of f), and we say that f is defined on X. The set {y ∈ Y : (x, y) ∈ f for some x ∈ X} is called the range of f and is denoted by ℜ(f). For each (x, y) ∈ f, we call y the value of f at x and denote it by f(x). We sometimes write f : X → Y to denote the function f from X into Y.

The terms mapping, map, operator, transformation, and function are used interchangeably. When using the term mapping, we usually say "a mapping of X into Y." Although the distinction between the wordings "of X" and "from X" is immaterial, as we shall see, the wording "into Y" becomes important as opposed to the wording "onto Y," which we will encounter later.
Sometimes it is convenient not to insist that the domain of definition of f be all of X; i.e., a function is sometimes defined on a subset of X rather than on all of X. In any case, the domain of definition of f is denoted by 𝔇(f) ⊂ X. Unless specified otherwise, we shall always assume that 𝔇(f) = X.
Intuitively, a function f is a "rule" whereby for each x ∈ X a unique y ∈ Y is assigned to x. When viewed in this manner, the term mapping is quite descriptive. However, defining a function as a "rule" involves usage of yet another undefined term.
Concerning functions, some additional comments are in order.
1. So-called "multi-valued functions" are not allowed by the above definition. They will be treated later under the topic of relations (Section 1.3).
2. The set X (or Y) may be the Cartesian product of sets, e.g., X = X₁ × X₂ × ... × Xₙ. In this case we think of f as being a function of n variables. We write f(x₁, ..., xₙ) to denote the value of f at (x₁, ..., xₙ) ∈ X₁ × ... × Xₙ.
3. It is important that the distinction between a function and the value of a function be clearly understood. The value of a function, f(x), is an element of Y. The function f is a much larger entity, and it is to be thought of as a single object. Note that f ∈ 𝒫(X × Y) (the power set of X × Y), but not every element of 𝒫(X × Y) is a function. The set of all functions from X into Y is a subset of 𝒫(X × Y) and is sometimes denoted by Yˣ.
1.2.2. Example. Let A and B be the sets defined in Example 1.1.24. Let f be the subset of A × B given by f = {(0, a), (1, b)}. Then f is a function from A into B. We see that f(0) = a and f(1) = b. The range of f is the set {a, b}, which is a proper subset of B.

Although we have defined a function as being a set, we usually characterize a function according to a rule as shown, for example, in the following.

1.2.3. Example. Let R denote the real numbers, and let f be a function from R into R whose value at each x ∈ R is given by f(x) = sin x. The function f is the sine function. Expressed explicitly as a set, we see that f = {(x, y) : y = sin x}. Note that the subset {(x, y) : x = sin y} ⊂ R × R is not a function.
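The "one and only one" condition of Definition 1.2.1 can be tested mechanically for finite relations. The helper below is our illustration (the names are not the text's); it checks a subset of A × B against the condition, using the sets of Example 1.2.2.

```python
def is_function(pairs, domain):
    """Check Definition 1.2.1: every x in the domain appears as the
    first element of exactly one pair, and no pair starts outside it."""
    return (all(sum(1 for (a, _) in pairs if a == x) == 1 for x in domain)
            and all(a in domain for (a, _) in pairs))

A = {0, 1}
B = {'a', 'b', 'c'}

f = {(0, 'a'), (1, 'b')}        # the function of Example 1.2.2
not_f = {(0, 'a'), (0, 'b')}    # assigns two values to 0 and none to 1

f_ok = is_function(f, A)            # True
not_f_ok = is_function(not_f, A)    # False
range_f = {y for (_, y) in f}       # {'a', 'b'}, a proper subset of B
```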
The preceding example also illustrates the notion of the graph of a function. Let X and Y denote the set of real numbers, let X × Y denote their Cartesian product, and let f be a function from X into Y. The collection of ordered pairs (x, f(x)) in X × Y is called the graph of the function f. Thus, a subset G of X × Y is the graph of a function defined on X if and only if for each x ∈ X there is a unique ordered pair in G whose first element is x. In fact, the graph of a function and the function itself are one and the same thing.
Since functions are defined as sets, equality of functions is to be interpreted in the same sense as equality of sets. With this in mind, the reader will have no difficulty in proving the following.

1.2.4. Theorem. Two mappings f and g of X into Y are equal if and only if f(x) = g(x) for every x ∈ X.

1.2.5. Exercise. Prove Theorem 1.2.4.
We now wish to further characterize and classify functions. If f is a function from X into Y, we denote the range of f by ℜ(f). In general, ℜ(f) ⊂ Y may or may not be a proper subset of Y. Thus, we have the following definition.

1.2.6. Definition. Let f be a function from X into Y. If ℜ(f) = Y, then f is said to be surjective or a surjection, and we say that f maps X onto Y. If f is a function such that for every x₁, x₂ ∈ X, f(x₁) = f(x₂) implies that x₁ = x₂, then f is said to be injective or a one-to-one mapping, or an injection. If f is both injective and surjective, we say that f is bijective or one-to-one and onto, or a bijection.

Let us go over this again. Every function f : X → Y is a mapping of X into Y. If the range of f happens to be all of Y, then we say f maps X onto Y. For each x ∈ X, there is always a unique y ∈ Y such that y = f(x). However, there may be distinct elements x₁ and x₂ in X such that f(x₁) = f(x₂). If there is a unique x ∈ X such that f(x) = y for each y ∈ ℜ(f), then we say that f is a one-to-one mapping. If f maps X onto Y and is one-to-one, we say that f is one-to-one and onto. In Figure B an attempt is made to illustrate these concepts pictorially. In this figure the dots denote elements of sets and the arrows indicate the rules of the various functions.
The reader should commit to memory the following associations: surjective: onto; injective: one-to-one; bijective: one-to-one and onto.
Frequently, the term one-to-one is abbreviated as (1-1).
1.2.7. Figure B. Illustration of different types of mappings (f₁ is into, f₂ is onto, f₃ is (1-1), f₄ is bijective).
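For functions on finite sets, these classifications can be decided by brute force. The sketch below defines such tests; the particular maps are arbitrary illustrations of the definitions, not the maps drawn in Figure B.

```python
def is_injective(f, X):
    # Distinct inputs give distinct outputs exactly when no two collide.
    return len({f(x) for x in X}) == len(X)

def is_surjective(f, X, Y):
    # The range {f(x) : x in X} must be all of Y.
    return {f(x) for x in X} == Y

def is_bijective(f, X, Y):
    return is_injective(f, X) and is_surjective(f, X, Y)

X = {1, 2, 3}
Y = {1, 2, 3, 4}

f1 = lambda x: x        # injective, but into Y (not onto)
f2 = lambda x: x % 2    # neither injective nor onto Y

inj = is_injective(f1, X)                # True
surj = is_surjective(f1, X, Y)           # False: range {1,2,3} ≠ Y
bij = is_bijective(f1, X, {1, 2, 3})     # True: onto its own range
ninj = is_injective(f2, X)               # False: f2(1) = f2(3)
```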
We now prove the following important but obvious result.

1.2.8. Theorem. Let f be a function from X into Y, and let Z = ℜ(f), the range of f. Let g denote the set {(y, x) ∈ Z × X : (x, y) ∈ f}. Then, clearly, g is a subset of Z × X, and f is injective if and only if g is a function from Z into X.

Proof. Let f be injective, and let y ∈ Z. Since y ∈ ℜ(f), there is an x ∈ X such that (x, y) ∈ f, and hence (y, x) ∈ g. Now suppose there is another x₁ ∈ X such that (y, x₁) ∈ g. Then (x₁, y) ∈ f. Since f is injective and y = f(x) = f(x₁), this implies that x = x₁, and so x is unique. This means that g is a function from Z into X.
Conversely, suppose g is a function from Z into X. Let x₁, x₂ ∈ X be such that f(x₁) = f(x₂). This implies that (x₁, f(x₁)) and (x₂, f(x₂)) ∈ f, and so (f(x₁), x₁) and (f(x₂), x₂) ∈ g. Since f(x₁) = f(x₂) and g is a function, we must have x₁ = x₂. Therefore, f is injective. ∎
The above result motivates the following definition.

1.2.9. Definition. Let f be an injective mapping of X into Y. Then we say that f has an inverse, and we call the mapping g defined in Theorem 1.2.8 the inverse of f. Hereafter, we will denote the inverse of f by f⁻¹.

Clearly, if f has an inverse, then f⁻¹ is a mapping from ℜ(f) onto X.

1.2.10. Theorem. Let f be an injective mapping of X into Y. Then
(i) f is a one-to-one mapping of X onto ℜ(f);
(ii) f⁻¹ is a one-to-one mapping of ℜ(f) onto X;
(iii) for every x ∈ X, f⁻¹(f(x)) = x; and
(iv) for every y ∈ ℜ(f), f(f⁻¹(y)) = y.

1.2.11. Exercise. Prove Theorem 1.2.10.
Note that in the above definition, the domain of f⁻¹ is ℜ(f), which need not be all of Y.
Some texts insist that in order for a function f to have an inverse, it must be bijective. Thus, when reading the literature it is important to note which definition of f⁻¹ the author has in mind. (Note that an injective function f : X → Y is a bijective function from X onto ℜ(f).)

1.2.12. Example. Let X = Y = R, the set of real numbers. Let f : X → Y be given by f(x) = 3x for every x ∈ R. Then f is a (1-1) mapping of X onto Y, and f⁻¹(y) = y/3 for all y.

1.2.13. Example. Let X = Y = J, the set of positive integers. Let f : X → Y be given by f(n) = n + 3 for all n ∈ X. Then f is a (1-1) mapping of X into Y. However, the range of f is ℜ(f) = {y ∈ Y : y = 4, 5, ...} ≠ Y. Therefore, f has an inverse, f⁻¹, which is defined only on ℜ(f) and not on all of Y. In this case we have f⁻¹(y) = y − 3 for all y ∈ ℜ(f).

1.2.14. Example. Let X = Y = R, the set of all real numbers. Let f : X → Y be given by f(x) = x/(1 + |x|) for all x ∈ R. Then f is an injective mapping and ℜ(f) = {y ∈ Y : −1 < y < 1}. Also, f⁻¹ is a mapping from ℜ(f) into R given by f⁻¹(y) = y/(1 − |y|) for all y ∈ ℜ(f).
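The inverse formula claimed in Example 1.2.14 can be spot-checked in exact rational arithmetic; the sample points below are arbitrary choices, and of course a finite check only illustrates the identities f⁻¹(f(x)) = x of Theorem 1.2.10.

```python
from fractions import Fraction

def f(x):
    # The injective map of Example 1.2.14: f(x) = x / (1 + |x|).
    return x / (1 + abs(x))

def f_inv(y):
    # Its inverse on the range {y : -1 < y < 1}: f^{-1}(y) = y / (1 - |y|).
    return y / (1 - abs(y))

# Exact Fractions avoid floating-point rounding in the round trip.
samples = [Fraction(-7, 2), Fraction(0), Fraction(3, 4), Fraction(10)]
round_trip = all(f_inv(f(x)) == x for x in samples)   # f^{-1}(f(x)) = x
in_range = all(-1 < f(x) < 1 for x in samples)        # f(x) lies in (-1, 1)
```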
Next, let X, Y, and Z be non-void sets. Suppose that f : X → Y and g : Y → Z. For each x ∈ X, we have f(x) ∈ Y and g(f(x)) ∈ Z. Since f and g are mappings from X into Y and from Y into Z, respectively, it follows that for each x ∈ X there is one and only one element g(f(x)) ∈ Z. Hence, the set

{(x, z) ∈ X × Z : z = g(f(x)), x ∈ X}   (1.2.15)

is a function from X into Z. We call this function the composite function of g and f and denote it by g ∘ f. The value of g ∘ f at x is given by

(g ∘ f)(x) = g ∘ f(x) ≜ g(f(x)).

In Figure C, a pictorial interpretation of a composite function is given.

1.2.16. Figure C. Illustration of a composite function.

1.2.17. Theorem. If f is a mapping of a set X onto a set Y and g is a mapping of the set Y onto a set Z, then g ∘ f is a mapping of X onto Z.

Proof. In order to show that g ∘ f is an onto mapping, we must show that for any z ∈ Z there exists an x ∈ X such that g(f(x)) = z. If z ∈ Z, then since g is a mapping of Y onto Z, there is an element y ∈ Y such that g(y) = z. Furthermore, since f is a mapping of X onto Y, there is an x ∈ X such that f(x) = y. Since g ∘ f(x) = g(f(x)) = g(y) = z, it readily follows that g ∘ f is a mapping of X onto Z, which proves the theorem. ∎
We also have:

1.2.18. Theorem. If f is a (1-1) mapping of a set X onto a set Y, and if g is a (1-1) mapping of the set Y onto a set Z, then g ∘ f is a (1-1) mapping of X onto Z.

1.2.19. Exercise. Prove Theorem 1.2.18.

Next we prove:

1.2.20. Theorem. If f is a (1-1) mapping of a set X onto a set Y, and if g is a (1-1) mapping of Y onto a set Z, then (g ∘ f)⁻¹ = (f⁻¹) ∘ (g⁻¹).

Proof. Let z ∈ Z. Then there exists an x ∈ X such that g ∘ f(x) = z, and hence (g ∘ f)⁻¹(z) = x. Also, since g ∘ f(x) = g(f(x)) = z, it follows that g⁻¹(z) = f(x), from which we have f⁻¹(g⁻¹(z)) = x. But f⁻¹(g⁻¹(z)) = f⁻¹ ∘ g⁻¹(z), and since this is equal to x, we have f⁻¹ ∘ g⁻¹(z) = (g ∘ f)⁻¹(z). Since z is arbitrary, the theorem is proved. ∎

Note carefully that in Theorem 1.2.20, f is a mapping of X onto Y. If it had simply been an injective mapping, the composite function (f⁻¹) ∘ (g⁻¹) might not be defined. That is, the range of g⁻¹ is Y; however, the domain of f⁻¹ is ℜ(f). Clearly, the domain of f⁻¹ must include the range of g⁻¹ in order that the composition (f⁻¹) ∘ (g⁻¹) be defined.
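Theorem 1.2.20 is easy to confirm for finite bijections represented as dictionaries; the particular letter assignments below are illustrative, not the book's.

```python
# Hypothetical finite bijections f : A -> B and g : B -> C.
f = {'r': 'x', 's': 'w', 't': 'v', 'u': 'u'}
g = {'u': 'w', 'v': 'x', 'w': 'y', 'x': 'z'}

def compose(g2, f2):
    # (g2 o f2)(x) = g2(f2(x)), defined on the domain of f2.
    return {x: g2[f2[x]] for x in f2}

def inverse(h):
    # Defined when h is injective: swap each pair (x, y) to (y, x).
    return {y: x for x, y in h.items()}

gof = compose(g, f)
lhs = inverse(gof)                      # (g o f)^{-1}
rhs = compose(inverse(f), inverse(g))   # f^{-1} o g^{-1}
theorem_holds = (lhs == rhs)            # True, as Theorem 1.2.20 asserts
```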
1.2.21. Example. Let A = {r, s, t, u}, B = {u, v, w, x}, and C = {w, x, y, z}. Let the function f : A → B be defined as

f = {(r, x), (s, w), (t, v), (u, u)}.

We find it convenient to represent this function in the following way:

f = ( r s t u
      x w v u ).

That is, the top row identifies the domain of f, and the bottom row contains the element in the range of f directly below the appropriate element in the domain. Clearly, this representation can be used for any function defined on a finite set. In a similar fashion, let the function g : B → C be defined as

g = ( u v w x
      w x y z ).

Clearly, both f and g are bijective. Also, g ∘ f is the (1-1) mapping of A onto C given by

g ∘ f = ( r s t u
          z y x w ).

Furthermore,

g⁻¹ = ( w x y z
        u v w x ),   f⁻¹ = ( u v w x
                             u t s r ),   (g ∘ f)⁻¹ = ( w x y z
                                                        u t s r ).

Now

f⁻¹ ∘ g⁻¹ = ( w x y z
              u t s r ),

i.e., f⁻¹ ∘ g⁻¹ = (g ∘ f)⁻¹.
The reader can readily prove the next result.

1.2.22. Theorem. Let W, X, Y, and Z be non-void sets. If f is a mapping of set W into set X, if g is a mapping of X into set Y, and if h is a mapping of Y into set Z (sets W, X, Y, Z are not necessarily distinct), then h ∘ (g ∘ f) = (h ∘ g) ∘ f.

1.2.23. Exercise. Prove Theorem 1.2.22.

1.2.24. Example. Let A = {m, n, p, q}, B = {m, r, s}, C = {r, t, u, v}, D = {w, x, y, z}, and define f : A → B, g : B → C, and h : C → D as

f = ( m n p q
      m r s r ),   g = ( m r s
                         r t u ),   h = ( r t u v
                                          w x y z ).

Then

g ∘ f = ( m n p q
          r t u t )   and   h ∘ g = ( m r s
                                      w x y ).

Thus,

h ∘ (g ∘ f) = ( m n p q
                w x y x )   and   (h ∘ g) ∘ f = ( m n p q
                                                  w x y x ),

i.e., h ∘ (g ∘ f) = (h ∘ g) ∘ f.
There is a special mapping which is so important that we give it a special name. We have:

1.2.25. Definition. Let X be a non-void set. Let e : X → X be defined by e(x) = x for all x ∈ X. We call e the identity function on X.

It is clear that the identity function is bijective.

1.2.26. Theorem. Let X and Y be non-void sets, and let f : X → Y. Let e_X, e_Y, and e₁ be the identity functions on X, Y, and ℜ(f), respectively. Then
(i) if f is injective, then f⁻¹ ∘ f = e_X and f ∘ f⁻¹ = e₁; and
(ii) f is bijective if and only if there is a g : Y → X such that g ∘ f = e_X and f ∘ g = e_Y.

Proof. Part (i) follows immediately from parts (iii) and (iv) of Theorem 1.2.10.
The proof of part (ii) is left as an exercise. ∎

1.2.27. Exercise. Prove part (ii) of Theorem 1.2.26.
Another special class of important functions are permutations.

1.2.28. Definition. A permutation on a set X is a (1-1) mapping of X onto X.

It is clear that the identity mapping on X is a permutation on X. For this reason it is sometimes called the identity permutation on X. It is also clear that the inverse of a permutation is also a permutation.

1.2.29. Exercise. Let X = {a, b, c}, and define f : X → X and g : X → X as

f = ( a b c
      c b a ),   g = ( a b c
                       b c a ).

Show that f, g, f⁻¹, and g⁻¹ are permutations on X.

1.2.30. Exercise. Let Z denote the set of integers, and let f : Z → Z be defined by f(n) = n + 3 for all n ∈ Z. Show that f and f⁻¹ are permutations on Z and that f⁻¹ ∘ f = f ∘ f⁻¹.
The reader can readily prove the following results.

1.2.31. Theorem. If f is a (1-1) mapping of a set A onto a set B and if g is a (1-1) mapping of the set B onto the set A, then g ∘ f is a permutation on A.

1.2.32. Corollary. If f and g are both permutations on a set A, then g ∘ f is a permutation on A.

1.2.33. Exercise. Prove Theorem 1.2.31 and Corollary 1.2.32.

1.2.34. Exercise. Show that if a set A consists of n elements, then there are exactly n! (n factorial) distinct permutations on A.
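The count asserted in Exercise 1.2.34 can be checked numerically for small n; `itertools.permutations` enumerates the distinct orderings of a finite set, each of which determines one permutation.

```python
from itertools import permutations
from math import factorial

A = ['a', 'b', 'c', 'd']   # a set of n = 4 elements

# Each permutation on A corresponds to exactly one ordering of its elements.
perms = list(permutations(A))
count_matches = (len(perms) == factorial(len(A)))   # 4! = 24 permutations
```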
Now let f be a mapping of a set X into a set Y. If X₁ is a subset of X, then for each element x ∈ X₁ there is a unique element f(x) ∈ Y. Thus, f may be used to define a mapping f₁ of X₁ into Y defined by

f₁(x) = f(x)   (1.2.35)

for all x ∈ X₁. This motivates the following definition.

1.2.36. Definition. The mapping f₁ of the subset X₁ ⊂ X into Y of Eq. (1.2.35) is called the mapping of X₁ into Y induced by the mapping f : X → Y. In this case f₁ is called the restriction of f to the set X₁.

We also have:

1.2.37. Definition. If f₁ is a mapping of X₁ into Y and if X₁ ⊂ X, then any mapping f of X into Y is said to be an extension of f₁ if

f(x) = f₁(x)   (1.2.38)

for every x ∈ X₁.

Thus, if f̄ is an extension of f₁, then f₁ is the mapping of a set X₁ ⊂ X into Y which is induced by the mapping f̄ of X into Y.

1.2.39. Example. Let X₁ = {u, v, x}, X = {u, v, x, y, z}, and Y = {n, p, q, r, s, t}. Clearly X₁ ⊂ X. Define f₁ : X₁ → Y as

f₁ = ( u v x
       n p q ).

Also, define f̄, f̂ : X → Y as

f̄ = ( u v x y z
      n p q r s ),   f̂ = ( u v x y z
                           n p q n t ).

Then f̄ and f̂ are two different extensions of f₁. Moreover, f₁ is the mapping of X₁ into Y induced by either f̄ or f̂. In general, two distinct mappings may induce the same mapping on a subset.
Let us next consider the image and the inverse image of sets under mappings. Specifically, we have:

1.2.40. Definition. Let f be a function from a set X into a set Y. Let A ⊂ X, and let B ⊂ Y. We define the image of A under f, denoted by f(A), to be the set

f(A) = {y ∈ Y : y = f(x), x ∈ A}.

We define the inverse image of B under f, denoted by f⁻¹(B), to be the set

f⁻¹(B) = {x ∈ X : f(x) ∈ B}.

Note that f⁻¹(B) is always defined for any f : X → Y. That is, there is no implication here that f has an inverse. The notation is somewhat unfortunate in this respect. Note also that the range of f is f(X).
In the next result, some of the important properties of images and inverse images of functions are summarized.

1.2.41. Theorem. Let f be a function from X into Y, let A, A₁, and A₂ be subsets of X, and let B, B₁, and B₂ be subsets of Y. Then
(i) if A₁ ⊂ A, then f(A₁) ⊂ f(A);
(ii) f(A₁ ∪ A₂) = f(A₁) ∪ f(A₂);
(iii) f(A₁ ∩ A₂) ⊂ f(A₁) ∩ f(A₂);
(iv) f⁻¹(B₁ ∪ B₂) = f⁻¹(B₁) ∪ f⁻¹(B₂);
(v) f⁻¹(B₁ ∩ B₂) = f⁻¹(B₁) ∩ f⁻¹(B₂);
(vi) f⁻¹(B~) = [f⁻¹(B)]~;
(vii) f⁻¹(f(A)) ⊃ A; and
(viii) f(f⁻¹(B)) ⊂ B.
Proof. We prove parts (i) and (ii) to demonstrate the method of proof. The remaining parts are left as an exercise.
To prove part (i), let y ∈ f(A₁). Then there is an x ∈ A₁ such that y = f(x). But A₁ ⊂ A, and so x ∈ A. Hence, f(x) = y ∈ f(A). This proves that f(A₁) ⊂ f(A).
To prove part (ii), let y ∈ f(A₁ ∪ A₂). Then there is an x ∈ A₁ ∪ A₂ such that y = f(x). If x ∈ A₁, then f(x) = y ∈ f(A₁). If x ∈ A₂, then f(x) = y ∈ f(A₂). Since x is in A₁ or in A₂, f(x) must be in f(A₁) or f(A₂). Therefore, f(A₁ ∪ A₂) ⊂ f(A₁) ∪ f(A₂). To prove that f(A₁) ∪ f(A₂) ⊂ f(A₁ ∪ A₂), we note that A₁ ⊂ A₁ ∪ A₂. So by part (i), f(A₁) ⊂ f(A₁ ∪ A₂). Similarly, f(A₂) ⊂ f(A₁ ∪ A₂). From this it follows that f(A₁) ∪ f(A₂) ⊂ f(A₁ ∪ A₂). We conclude that f(A₁ ∪ A₂) = f(A₁) ∪ f(A₂). ∎

1.2.42. Exercise. Prove parts (iii) through (viii) of Theorem 1.2.41.
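The image and inverse image of Definition 1.2.40 are directly computable for finite sets, and a non-injective map shows why parts (iii), (vii), and (viii) of Theorem 1.2.41 are inclusions rather than equalities. The map x ↦ x² below is our illustrative choice.

```python
def image(f, A):
    # f(A) = {y : y = f(x), x in A}
    return {f(x) for x in A}

def preimage(f, X, B):
    # f^{-1}(B) = {x in X : f(x) in B}; f need not have an inverse.
    return {x for x in X if f(x) in B}

X = {-2, -1, 0, 1, 2}
f = lambda x: x * x       # not injective on X: f(-1) = f(1)

A1, A2 = {-2, -1}, {1, 2}
# (iii): here A1 ∩ A2 = ∅ yet f(A1) ∩ f(A2) = {1, 4}, a proper inclusion.
strict_iii = image(f, A1 & A2) < (image(f, A1) & image(f, A2))
# (vii): f^{-1}(f({1})) = {-1, 1} ⊇ {1}, again proper.
prop_vii = preimage(f, X, image(f, {1})) >= {1}
# (viii): f(f^{-1}({4, 9})) = {4} ⊆ {4, 9}, since 9 is not in the range.
prop_viii = image(f, preimage(f, X, {4, 9})) <= {4, 9}
```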
We note that, in general, equality is not attained in parts (iii), (vii), and (viii) of Theorem 1.2.41. However, by considering special types of mappings we can obtain the following results for these cases.

1.2.43. Theorem. Let f be a function from X into Y, let A, A₁, and A₂ be subsets of X, and let B be a subset of Y. Then
(i) f(A₁ ∩ A₂) = f(A₁) ∩ f(A₂) for all pairs of subsets A₁, A₂ of X if and only if f is injective;
(ii) f⁻¹(f(A)) = A for all A ⊂ X if and only if f is injective; and
(iii) f(f⁻¹(B)) = B for all B ⊂ Y if and only if f is surjective.

Proof. We will prove only part (i) and leave the proofs of parts (ii) and (iii) as an exercise.
To prove sufficiency, let f be injective and let A₁ and A₂ be subsets of X. In view of part (iii) of Theorem 1.2.41, we need only show that f(A₁) ∩ f(A₂) ⊂ f(A₁ ∩ A₂). In doing so, let y ∈ f(A₁) ∩ f(A₂). Then y ∈ f(A₁) and y ∈ f(A₂). This means there is an x₁ ∈ A₁ and an x₂ ∈ A₂ such that y = f(x₁) = f(x₂). Since f is injective, x₁ = x₂. Hence, x₁ ∈ A₁ ∩ A₂. This implies that y ∈ f(A₁ ∩ A₂); i.e., f(A₁) ∩ f(A₂) ⊂ f(A₁ ∩ A₂).
To prove necessity, assume that f(A₁ ∩ A₂) = f(A₁) ∩ f(A₂) for all subsets A₁ and A₂ of X. For purposes of contradiction, suppose there are x₁, x₂ ∈ X such that x₁ ≠ x₂ and f(x₁) = f(x₂) = y. Let A₁ = {x₁} and A₂ = {x₂}; i.e., A₁ and A₂ are the singletons of x₁ and x₂, respectively. Then A₁ ∩ A₂ = ∅, and so f(A₁ ∩ A₂) = ∅. However, f(A₁) = {y} and f(A₂) = {y}, and thus f(A₁) ∩ f(A₂) = {y} ≠ ∅. This contradicts the fact that f(A₁) ∩ f(A₂) = f(A₁ ∩ A₂) for all subsets A₁ and A₂ of X. Thus, f is injective. ∎

1.2.44. Exercise. Prove parts (ii) and (iii) of Theorem 1.2.43.
Some oI the preceding results can be etended to Iamilies oI sets. or
eample, we have:
1.2.5. Theorem. et I be a Iunction Irom into , let A .. : I E I be
an indeed Iamily oI sets in , and let B .. : I E be an indeed Iamily
oI sets in . Then
(i) I( A.. ) I(A..);
(l.EI E
(ii) I( n A.. ) c nI(A.. );
EI EI
1.2. unct;ons
(iii) II( B,,) II(B,,);
"EI: "EI:
(iv) II(n B,,) n II(B,,); and
"E / "EI:
(v) iI Be ,I(B) II(Br.
Proof. We prove parts (i) and (iii) and leave the proofs of the remaining parts as an exercise.
To prove part (i), let y ∈ f(∪α∈I Aα). This means that there is an x ∈ ∪α∈I Aα such that y = f(x). Thus, for some α ∈ I, x ∈ Aα. This implies that f(x) ∈ f(Aα), and so y ∈ f(Aα). Hence, y ∈ ∪α∈I f(Aα). This shows that f(∪α∈I Aα) ⊂ ∪α∈I f(Aα).
To prove the converse, let y ∈ ∪α∈I f(Aα). Then y ∈ f(Aα) for some α ∈ I. This means there is an x ∈ Aα such that f(x) = y. Now x ∈ ∪α∈I Aα, and so f(x) ∈ f(∪α∈I Aα). Therefore, ∪α∈I f(Aα) ⊂ f(∪α∈I Aα). This completes the proof of part (i).
To prove part (iii), let x ∈ f⁻¹(∪α∈K Bα). This means that f(x) ∈ ∪α∈K Bα. Hence, f(x) ∈ Bα for some α ∈ K. Thus, x ∈ f⁻¹(Bα), and so x ∈ ∪α∈K f⁻¹(Bα). Therefore, f⁻¹(∪α∈K Bα) ⊂ ∪α∈K f⁻¹(Bα).
Conversely, let x ∈ ∪α∈K f⁻¹(Bα). Then x ∈ f⁻¹(Bα) for some α ∈ K. Thus, f(x) ∈ Bα. Hence, f(x) ∈ ∪α∈K Bα, and so x ∈ f⁻¹(∪α∈K Bα). This means that ∪α∈K f⁻¹(Bα) ⊂ f⁻¹(∪α∈K Bα), which completes the proof of part (iii).
1.2.46. Exercise. Prove parts (ii), (iv), and (v) of Theorem 1.2.45.
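Parts (i) and (ii) of the theorem can be sanity-checked on a small indexed family (hypothetical data; the helper `image` is our own):

```python
def image(f, A):
    return {f(x) for x in A}

f = lambda x: x % 3              # a map from the integers into {0, 1, 2}
family = [{0, 1}, {1, 3}]        # a small indexed family of sets

union = set().union(*family)
inter = set.intersection(*family)

# Part (i): the image of the union equals the union of the images.
assert image(f, union) == set().union(*(image(f, A) for A in family))

# Part (ii): the image of the intersection is contained in (here,
# properly contained in) the intersection of the images.
assert image(f, inter) <= set.intersection(*(image(f, A) for A in family))
```

Note that 0 and 3 collide under f, which is why the inclusion in part (ii) is proper for this family.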
Having introduced the concept of mapping, we are in a position to consider an important classification of infinite sets. We first consider the following definition.

1.2.47. Definition. Let A and B be any two sets. The set A is said to be equivalent to set B if there exists a bijective mapping of A onto B.

Clearly, if A is equivalent to B, then B is equivalent to A.

1.2.48. Definition. Let J be the set of positive integers, and let A be any set. Then A is said to be countably infinite if A is equivalent to J. A set is said to be countable or denumerable if it is either finite or countably infinite. If a set is not countable, it is said to be uncountable.

We have:
Chapter 1 / Fundamental Concepts
1.2.49. Theorem. Let J be the set of positive integers, and let J₁ ⊂ J. If J₁ is infinite, then J₁ is equivalent to J.

Proof. We shall construct a bijective mapping, f, from J onto J₁. Let {Jₙ : n ∈ J} be the family of sets given by Jₙ = {1, 2, ..., n} for n = 1, 2, .... Clearly, each Jₙ is finite and of order n. Therefore, Jₙ ∩ J₁ is finite. Since J₁ is infinite, J₁ − Jₙ ≠ ∅ for all n. Let us now define f : J → J₁ as follows. Let f(1) be the smallest integer in J₁. We now proceed inductively. Assume f(n) ∈ J₁ has been defined, and let f(n + 1) be the smallest integer in J₁ which is greater than f(n). Now f(n + 1) > f(n), and so f(n₁) ≠ f(n₂) for any n₁ ≠ n₂. This implies that f is injective.
Next, we want to show that f is surjective. We do so by contradiction. Suppose that f(J) ≠ J₁. Since f(J) ⊂ J₁, this implies that J₁ − f(J) ≠ ∅. Let k be the smallest integer in J₁ − f(J). Then k ≠ f(1) because f(1) ∈ f(J), and so k > f(1). This implies that J₁ ∩ Jₖ₋₁ ≠ ∅. Since J₁ ∩ Jₖ₋₁ is nonvoid and finite, we may find the largest integer in this set, say r. It follows that r + 1 ≤ k. Now r is the largest integer in J₁ which is less than k. But r < k implies that r ∈ f(J). This means there is an s ∈ J such that r = f(s). By definition of f, f(s + 1) = k. Hence, k ∈ f(J) and we have arrived at a contradiction. Thus, f is surjective. This completes the proof.
We now have the following corollary.

1.2.50. Corollary. Let A ⊂ B. If B is a countable set, then A is countable.

Proof. If A is finite, then there is nothing to prove. So let us assume that A is infinite. This means that B is countably infinite, and so there exists a bijective mapping f : B → J. Let g be the restriction of f to A. Then for all x₁, x₂ ∈ A such that x₁ ≠ x₂, g(x₁) = f(x₁) ≠ f(x₂) = g(x₂). Thus, g is an injective mapping of A into J. By part (i) of Theorem 1.2.10, g is a bijective mapping of A onto g(A). This means A is equivalent to g(A), and thus g(A) is an infinite set. Since g(A) ⊂ J, g(A) is equivalent to J. Hence, there is a bijective mapping of g(A) onto J, which we call h. By Theorem 1.2.18, the composite mapping h∘g is a bijective mapping of A onto J. This means that J is equivalent to A. Therefore, A is countable.
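The bijection constructed in the proof of Theorem 1.2.49 simply lists the elements of an infinite subset of the positive integers in increasing order. A finite-horizon sketch (our own function names; the membership predicate stands in for an arbitrary infinite subset J₁):

```python
def enumerate_subset(member, count):
    """List the first `count` elements of J1 = {n >= 1 : member(n)} in
    increasing order -- the map f of the proof, evaluated at 1..count."""
    values, n = [], 1
    while len(values) < count:
        if member(n):      # f(k+1) = smallest member greater than f(k)
            values.append(n)
        n += 1
    return values

# J1 = the perfect squares, an infinite subset of the positive integers.
print(enumerate_subset(lambda n: int(n ** 0.5) ** 2 == n, 5))  # [1, 4, 9, 16, 25]
```

Because the list is strictly increasing and never skips an element of J₁, the resulting map is the bijection from J onto J₁ used in the proof.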
We conclude the present section by considering the cardinality of sets. Specifically, if a set is finite, we say the cardinal number of the set is equal to the number of elements of the set. If two sets are countably infinite, then we say they have the same cardinal number, which we can define to be the cardinal number of the positive integers. More generally, two arbitrary sets are said to have the same cardinal number if we can establish a bijective mapping between the two sets (i.e., the sets are equivalent).
1.3. RELATIONS AND EQUIVALENCE RELATIONS

Throughout the present section, X denotes a nonvoid set.
We begin by introducing the notion of relation, which is a generalization of the concept of function.

1.3.1. Definition. Let X and Y be nonvoid sets. Any subset ρ of X × Y is called a relation from X to Y. Any subset of X × X is called a relation in X.

1.3.2. Example. Let A = {u, v, x, y} and B = {a, b, c, d}. Let ρ = {(u, a), (v, b), (u, c), (x, a)}. Then ρ is a relation from A into B. It is clearly not a function from A into B (why?).

1.3.3. Example. Let X = R, the set of real numbers. The set {(x, y) ∈ R × R : x ≤ y} is a relation in R. Also, the set {(x, y) ∈ R × R : x = sin y} is a relation in R. This shows that so-called multivalued functions are actually relations rather than mappings.
As in the case of mappings, it makes sense to speak of the domain and the range of a relation. We have:

1.3.4. Definition. Let ρ be a relation from X to Y. The subset of X,
{x ∈ X : (x, y) ∈ ρ, y ∈ Y},
is called the domain of ρ. The subset of Y,
{y ∈ Y : (x, y) ∈ ρ, x ∈ X},
is called the range of ρ.

Now let ρ be a relation from X to Y. Then, clearly, the set ρ⁻¹ ⊂ Y × X, defined by
ρ⁻¹ = {(y, x) ∈ Y × X : (x, y) ∈ ρ ⊂ X × Y},
is a relation from Y to X. The relation ρ⁻¹ is called the inverse relation of ρ. Note that whereas the inverse of a function does not always exist, the inverse of a relation does always exist.
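Concretely, a finite relation is just a set of ordered pairs, and its domain, range, and inverse fall out directly; using the relation of Example 1.3.2 (illustrative sketch):

```python
# The relation of Example 1.3.2, as a set of ordered pairs.
rho = {("u", "a"), ("v", "b"), ("u", "c"), ("x", "a")}

domain = {x for (x, y) in rho}        # {'u', 'v', 'x'}
range_ = {y for (x, y) in rho}        # {'a', 'b', 'c'}
rho_inv = {(y, x) for (x, y) in rho}  # the inverse relation, from B to A

print(domain, range_, rho_inv)
```

Note that ρ relates u to two elements, so ρ is not a function; its inverse ρ⁻¹ nevertheless exists, exactly as remarked above.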
Next, we consider equivalence relations. Let ρ denote a relation in X; i.e., ρ ⊂ X × X. Then for any x, y ∈ X, either (x, y) ∈ ρ or (x, y) ∉ ρ, but not both. If (x, y) ∈ ρ, then we write x ρ y, and if (x, y) ∉ ρ, we write x ρ̸ y.

1.3.5. Definition. Let ρ be a relation in X.
(i) If x ρ x for all x ∈ X, then ρ is said to be reflexive;
(ii) if x ρ y implies y ρ x for all x, y ∈ X, then ρ is said to be symmetric; and
(iii) if for all x, y, z ∈ X, x ρ y and y ρ z implies x ρ z, then ρ is said to be transitive.
1.3.6. Example. Let R denote the set of real numbers. The relation in R given by {(x, y) : x < y} is transitive but not reflexive and not symmetric. The relation in R given by {(x, y) : x ≠ y} is symmetric but not reflexive and not transitive.

1.3.7. Example. Let ρ be the relation in P(X) defined by ρ = {(A, B) : A ⊂ B}. That is, A ρ B if and only if A ⊂ B. Then ρ is reflexive and transitive but not symmetric.

In the following, we use the symbol ∼ to denote a relation in X. If (x, y) ∈ ∼, then we write, as before, x ∼ y.

1.3.8. Definition. Let ∼ be a relation in X. Then ∼ is said to be an equivalence relation in X if ∼ is reflexive, symmetric, and transitive. If ∼ is an equivalence relation and if x ∼ y, we say that x is equivalent to y.

In particular, the equivalence relation in X characterized by the statement "x ∼ y if and only if x = y" is called the equals relation in X or the identity relation in X.

1.3.9. Example. Let X be a finite set, and let A, B, C ∈ P(X). Let ∼ on P(X) be defined by saying that A ∼ B if and only if A and B have the same number of elements. Clearly A ∼ A. Also, if A ∼ B then B ∼ A. Furthermore, if A ∼ B and B ∼ C, then A ∼ C. Hence, ∼ is reflexive, symmetric, and transitive. Therefore, ∼ is an equivalence relation in P(X).

1.3.10. Example. Let R² = R × R, the real plane. Let X be the family of all triangles in R². Then each of the following statements can be used to define an equivalence relation in X: "is similar to," "is congruent to," "has the same area as," and "has the same perimeter as."
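Over a finite set, the three defining properties of an equivalence relation can be checked mechanically. The sketch below (our own checker; illustrative code) confirms the "same number of elements" relation of Example 1.3.9 on the subsets of a 3-element set, and shows that set inclusion (Example 1.3.7) fails the test:

```python
from itertools import chain, combinations

def is_equivalence(X, related):
    refl = all(related(x, x) for x in X)
    symm = all(related(y, x) for x in X for y in X if related(x, y))
    trans = all(related(x, z)
                for x in X for y in X for z in X
                if related(x, y) and related(y, z))
    return refl and symm and trans

# P(X) for X = {1, 2, 3}: all subsets, as frozensets.
base = {1, 2, 3}
power = [frozenset(c) for c in chain.from_iterable(
    combinations(base, r) for r in range(len(base) + 1))]

# A ~ B iff A and B have the same number of elements (Example 1.3.9).
print(is_equivalence(power, lambda A, B: len(A) == len(B)))  # True
# Inclusion is reflexive and transitive but not symmetric (Example 1.3.7).
print(is_equivalence(power, lambda A, B: A <= B))            # False
```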
1.4. OPERATIONS ON SETS

In the present section we introduce the concept of an operation on a set, and we consider some of the properties of operations. Throughout this section, X denotes a nonvoid set.

1.4.1. Definition. A binary operation on X is a mapping of X × X into X. A ternary operation on X is a mapping of X × X × X into X.

We could proceed in an obvious manner and define an n-ary operation on X. Since our primary concern in this book will be with binary operations, we will henceforth simply say "an operation on X" when we actually mean a binary operation on X.
If α : X × X → X is an operation, then we usually use the notation
α(x, y) ≜ x α y.

1.4.2. Example. Let R denote the real numbers. Let f : R × R → R be given by f(x, y) = x + y for all x, y ∈ R, where x + y denotes the customary sum of x plus y (i.e., + denotes the usual operation of addition of real numbers). Then f is clearly an operation on R, in the sense of Definition 1.4.1. We could just as well have defined "+" as being the operation on R, i.e., + : R × R → R, where +(x, y) ≜ x + y. Similarly, the ordinary rules of subtraction and multiplication on R, "−" and "·", respectively, are also operations on R. Notice that division, ÷, is not an operation on R, because x ÷ y is not defined for all y ∈ R (i.e., x ÷ y is not defined for y = 0). However, if we let R* = R − {0}, then "÷" is an operation on R*.

1.4.3. Exercise. Show that if A is a set consisting of n distinct elements, then there exist exactly n^(n²) distinct operations on A.
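The count in this exercise reflects that an operation on A is an arbitrary function from A × A into A; for small n it can be confirmed by enumeration (illustrative sketch, standard library only):

```python
from itertools import product

def all_operations(A):
    """Yield every binary operation on A as a dict keyed by pairs."""
    A = list(A)
    pairs = list(product(A, A))                   # the n*n domain A x A
    for values in product(A, repeat=len(pairs)):  # one of n**(n*n) choices
        yield dict(zip(pairs, values))

for n in (1, 2, 3):
    count = sum(1 for _ in all_operations(range(n)))
    assert count == n ** (n * n)
    print(n, count)  # 1 1, then 2 16, then 3 19683
```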
1.4.4. Example. Let A = {a, b}. An example of an operation on A is the mapping α : A × A → A defined by
α(a, a) ≜ a α a = a,  α(a, b) ≜ a α b = b,
α(b, a) ≜ b α a = b,  α(b, b) ≜ b α b = a.
It is convenient to utilize the following operation table to define α:

α | a b
a | a b
b | b a
(1.4.5)

If, in general, α is an operation on an arbitrary finite set A, or sometimes even on a countably infinite set A, then we can construct an operation table as follows:

α |   y
x | x α y

If A = {a, b}, as at the beginning of this example, then in addition to α given in (1.4.5), we can define, for example, the operations β, γ, and δ on A as

β | a b     γ | a b     δ | a b
a | a a     a | a b     a | a a
b | b a     b | a b     b | b b
We now consider operations with important special properties.

1.4.6. Definition. An operation α on X is said to be commutative if x α y = y α x for all x, y ∈ X.

1.4.7. Definition. An operation α on X is said to be associative if (x α y) α z = x α (y α z) for all x, y, z ∈ X.

In the case of the real numbers R, the operations of addition and multiplication are both associative and commutative. The operation of subtraction is neither associative nor commutative.

1.4.8. Definition. If α and β are operations on X (not necessarily distinct), then
(i) α is said to be left distributive over β if
x α (y β z) = (x α y) β (x α z)
for every x, y, z ∈ X;
(ii) α is said to be right distributive over β if
(x β y) α z = (x α z) β (y α z)
for every x, y, z ∈ X; and
(iii) α is said to be distributive over β if α is both left and right distributive over β.

In Example 1.4.4, α is the only commutative operation. The operation β of Example 1.4.4 is not associative. The operations α, γ, and δ of this example are associative. In this example, γ is distributive over δ and δ is distributive over γ.
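The claims just made about Example 1.4.4 can be verified exhaustively. In the sketch below (our own checker functions, illustrative code), γ and δ are modeled as the second- and first-argument operations, matching their tables:

```python
def associative(op, X):
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x in X for y in X for z in X)

def left_distributive(a, b, X):   # is a left distributive over b?
    return all(a(x, b(y, z)) == b(a(x, y), a(x, z))
               for x in X for y in X for z in X)

X = ("a", "b")
gamma = lambda x, y: y   # x gamma y = y (second argument)
delta = lambda x, y: x   # x delta y = x (first argument)

assert associative(gamma, X) and associative(delta, X)
assert left_distributive(gamma, delta, X)   # gamma distributes over delta
assert left_distributive(delta, gamma, X)   # and delta over gamma
print("checks passed")
```

Because the ground set has only two elements, the triple loops above amount to checking all eight triples, i.e., a complete proof by cases.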
In the case of the real numbers R, multiplication, "·", is distributive over addition, "+". The converse is not true.

1.4.9. Definition. If α is an operation on X, and if X₁ is a subset of X, then X₁ is said to be closed relative to α if for every x, y ∈ X₁, x α y ∈ X₁.

Clearly, every set X is closed with respect to an operation on it.
The set of all integers Z, which is a subset of the real numbers R, is closed with respect to the operations of addition and multiplication defined on R. The even integers are also closed with respect to both of these operations, whereas the odd integers are not a closed set relative to addition.
1.4.10. Definition. If a subset X₁ of X is closed relative to an operation α on X, then the operation α₁ on X₁ defined by
α₁(x, y) = x α₁ y ≜ x α y
for all x, y ∈ X₁ is called the operation on X₁ induced by α.

If X₁ = X, then α₁ = α. If X₁ ⊂ X but X₁ ≠ X, then α₁ ≠ α, since α₁ and α are operations on different sets, namely X₁ and X, respectively. In general, an induced operation α₁ differs from its predecessor α; however, it does inherit the essential properties which α possesses, as shown in the following result.

1.4.11. Theorem. Let α be an operation on X, let X₁ ⊂ X, where X₁ is closed relative to α, and let α₁ be the operation on X₁ induced by α. Then
(i) if α is commutative, then α₁ is commutative;
(ii) if α is associative, then α₁ is associative; and
(iii) if β is an operation on X and X₁ is closed relative to β, and if α is left (right) distributive over β, then α₁ is left (right) distributive over β₁, where β₁ is the operation on X₁ induced by β.

1.4.12. Exercise. Prove Theorem 1.4.11.

The operation α₁ on a subset X₁ induced by an operation α on X will frequently be denoted simply by α, and we will refer to α as an operation on X₁. In such cases one must keep in mind that we are actually referring to the induced operation α₁ and not to α.

1.4.13. Definition. Let X₁ be a subset of X. An operation α on X is called an extension of an operation α₁ on X₁ if X₁ is closed relative to α and if α₁ is equal to the operation on X₁ induced by α.

A given operation on a subset X₁ of a set X may, in general, have many different extensions.
1.4.14. Example. Let X₁ = {a, b, c}, and let X = {a, b, c, d, e}. Define α₁ on X₁ and ᾱ and α̂ on X as

α₁ | a b c     ᾱ | a b c d e     α̂ | a b c d e
a  | a c b     a | a c b e d     a | a c b d e
b  | c b a     b | c b a d e     b | c b a e d
c  | b a c     c | b a c e d     c | b a c d e
               d | c d a b e     d | d c b a e
               e | d c a b e     e | d a c b e

Clearly, α₁ is an operation on X₁, and ᾱ and α̂ are operations on X. Moreover, both ᾱ and α̂ (ᾱ ≠ α̂) are extensions of α₁. Also, α₁ may be viewed as being induced by ᾱ and by α̂.
1.5. MATHEMATICAL SYSTEMS CONSIDERED IN THIS BOOK

We will concern ourselves with several different types of mathematical systems in the subsequent chapters. Although it is possible to give an abstract definition of the term mathematical system, we will not do so. Instead, we will briefly indicate which types of mathematical systems we shall consider in this book.
1. In Chapter 2 we will begin by considering mathematical systems which are made up of an underlying set X and an operation α defined on X. We will identify such systems by writing {X; α}. We will be able to characterize a system {X; α} according to certain properties which X and α possess. Two important cases of such systems that we will consider are semigroups and groups.
In Chapter 2 we will also consider mathematical systems consisting of a basic set X and two operations, say α and β, defined on X, where a special relation exists between α and β. We will identify such systems by writing {X; α, β}. Included among the mathematical systems of this kind which we will consider are rings and fields.
In Chapter 2 we will also consider composite mathematical systems. Such systems are endowed with two underlying sets, say X and F, and possess a much more complex (algebraic) structure than semigroups, groups, rings, and fields. Composite systems which we will consider include modules, vector spaces over a field (which are also called linear spaces), and algebras.
In Chapter 2 we will also study various types of important mappings (e.g., homomorphisms and isomorphisms) defined on semigroups, groups, rings, etc.
Mathematical systems of the type considered in Chapter 2 are sometimes called algebraic systems.
2. In Chapters 3 and 4 we will study in some detail vector spaces and special types of mappings on vector spaces, called linear transformations. An important class of linear transformations can be represented by matrices, which we will consider in Chapter 4. In this chapter we will also study in some detail important vector spaces, called Euclidean spaces.
3. Most of Chapter 5 is devoted to mathematical systems consisting of a basic set X and a function ρ : X × X → R (R denotes the real numbers), where ρ possesses certain properties (namely, the properties of distance
between points or elements in X). The function ρ is called a metric (or a distance function), and the pair {X; ρ} is called a metric space.
In Chapter 5 we will also consider mathematical systems consisting of a basic set X and a family of subsets of X (called open sets) denoted by 𝔗. The pair {X; 𝔗} is called a topological space. It turns out that all metric spaces are in a certain sense topological spaces.
We will also study functions and their properties on metric (topological) spaces in Chapter 5.
4. In Chapters 6 and 7 we will consider normed linear spaces, inner product spaces, and an important class of functions (linear operators) defined on such spaces.
A normed linear space is a mathematical system consisting of a vector space X and a real-valued function, denoted by ‖·‖, which takes elements of X into R and which possesses the properties which characterize the "length" of a vector. We will denote normed spaces by {X; ‖·‖}.
An inner product space consists of a vector space X (over the field of real numbers R or over the field of complex numbers C) and a function (·, ·), which takes elements from X × X into R (or into C) and possesses certain properties which allow us to introduce, among other items, the concept of orthogonality. We will identify such mathematical systems by writing {X; (·, ·)}.
It turns out that in a certain sense all inner product spaces are normed linear spaces, that all normed linear spaces are metric spaces, and, as indicated before, that all metric spaces are topological spaces. Since normed linear spaces and inner product spaces are also vector spaces, it should be clear that, in the case of such spaces, properties of algebraic systems (called algebraic structure) and properties of topological systems (called topological structure) are combined.
A class of normed linear spaces which are very important are Banach spaces, and among the more important inner product spaces are Hilbert spaces. Such spaces will be considered in some detail in Chapter 6. Also, in Chapter 7, linear transformations defined on Banach and Hilbert spaces will be considered.
5. Applications are considered at the ends of Chapters 4, 5, and 7.
1.6. REFERENCES AND NOTES

A classic reference on set theory is the book by Hausdorff [1.5]. The many excellent references on the present topics include the elegant text by Hanneken [1.4], the standard reference by Halmos [1.3], as well as the books by Gleason [1.1] and by Goldstein and Rosenbaum [1.2].

REFERENCES

[1.1] A. M. GLEASON, Fundamentals of Abstract Analysis. Reading, Mass.: Addison-Wesley Publishing Co., Inc., 1966.
[1.2] M. E. GOLDSTEIN and B. M. ROSENBAUM, "Introduction to Abstract Analysis," National Aeronautics and Space Administration, Report No. SP-203, Washington, D.C., 1969.
[1.3] P. R. HALMOS, Naive Set Theory. Princeton, N.J.: D. Van Nostrand Company, Inc., 1960.
[1.4] C. B. HANNEKEN, Introduction to Abstract Algebra. Belmont, Calif.: Dickenson Publishing Co., Inc., 1968.
[1.5] F. HAUSDORFF, Mengenlehre. New York: Dover Publications, Inc., 1944.
2

ALGEBRAIC STRUCTURES

The subject matter of the previous chapter is concerned with set-theoretic structure. We emphasized essential elements of set theory and introduced related concepts such as mappings, operations, and relations.
In the present chapter we concern ourselves with algebraic structure. The material of this chapter falls usually under the heading of abstract algebra or modern algebra. In the next two chapters we will continue our investigation of algebraic structure. The topics of those chapters go usually under the heading of linear algebra.
This chapter is divided into three parts. The first section is concerned with some basic algebraic structures, including semigroups, groups, rings, fields, modules, vector spaces, and algebras. In the second section we study properties of special important mappings on the above structures, including homomorphisms, isomorphisms, endomorphisms, and automorphisms of semigroups, groups, and rings. Because of their importance in many areas of mathematics, as well as in applications, polynomials are considered in the third section. Some appropriate references for further reading are suggested at the end of the chapter.
The subject matter of the present chapter is widely used in pure as well as in applied mathematics, and it has found applications in diverse areas, such as modern physics, automata theory, systems engineering, information theory, graph theory, and the like.
Our presentation of modern algebra is by necessity very brief. However, mastery of the topics covered in the present chapter will provide the reader with the foundation required to make contact with the literature in applications, and it will enable the interested reader to pursue this subject further at a more advanced level.
2.1. SOME BASIC STRUCTURES OF ALGEBRA

We begin by developing some of the more important properties of mathematical systems, {X; α}, where α is an operation on a nonvoid set X.

2.1.1. Definition. Let α be an operation on X. If for all x, y, z ∈ X, x α y = x α z implies that y = z, then we say that {X; α} possesses the left cancellation property. If x α z = y α z implies that x = y, then {X; α} is said to possess the right cancellation property. If {X; α} possesses both the left and right cancellation properties, then we say that the cancellation laws hold in {X; α}.
In the following exercise, some specific cases are given.

2.1.2. Exercise. Let X = {x, y} and let α, β, γ, and δ be defined as

α | x y     β | x y     γ | x y     δ | x y
x | x y     x | x x     x | x y     x | x x
y | y x     y | x x     y | x y     y | y y

Show that (i) {X; β} possesses neither the right nor the left cancellation property; (ii) {X; γ} possesses the left cancellation property but not the right cancellation property; (iii) {X; δ} possesses the right cancellation property but not the left cancellation property; and (iv) {X; α} possesses both the left and the right cancellation property.
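Cancellation over a finite set can be tested by brute force. The sketch below uses our own helper names (illustrative code); the four two-element operations follow the pattern of the exercise, with α a group-type table, β constant, and γ, δ the two projections:

```python
def left_cancellation(op, X):
    return all(y == z
               for x in X for y in X for z in X
               if op(x, y) == op(x, z))

def right_cancellation(op, X):
    return all(x == y
               for x in X for y in X for z in X
               if op(x, z) == op(y, z))

X = ("x", "y")
alpha = lambda u, v: "x" if u == v else "y"  # group-like: both laws hold
beta  = lambda u, v: "x"                     # constant: neither law holds
gamma = lambda u, v: v                       # left cancellation only
delta = lambda u, v: u                       # right cancellation only

assert left_cancellation(alpha, X) and right_cancellation(alpha, X)
assert not left_cancellation(beta, X) and not right_cancellation(beta, X)
assert left_cancellation(gamma, X) and not right_cancellation(gamma, X)
assert right_cancellation(delta, X) and not left_cancellation(delta, X)
print("checks passed")
```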
In an arbitrary mathematical system {X; α} there are sometimes special elements in X which possess important properties relative to the operation α. We have:

2.1.3. Definition. Let α be an operation on a set X, and let X contain an element eᵣ such that
x α eᵣ = x
for all x ∈ X. We call eᵣ a right identity element of X relative to α, or simply a right identity of the system {X; α}. If X contains an element eₗ which satisfies the condition
eₗ α x = x
for all x ∈ X, then eₗ is called a left identity element of X relative to α, or simply a left identity of the system {X; α}.

We note that a system {X; α} may contain more than one right identity element (e.g., system {X; δ} of Exercise 2.1.2) or left identity element (e.g., system {X; γ} of Exercise 2.1.2).

2.1.4. Definition. An element e of a set X is called an identity element of X relative to an operation α on X if
e α x = x α e = x
for every x ∈ X.
2.1.5. Exercise. Let X = {0, 1} and define the operations "+" and "·" by

+ | 0 1     · | 0 1
0 | 0 1     0 | 0 0
1 | 1 0     1 | 0 1

Does either {X; +} or {X; ·} have an identity element?
Identity elements have the following properties.

2.1.6. Theorem. Let α be an operation on X.
(i) If {X; α} has an identity element e, then e is unique.
(ii) If {X; α} has a right identity eᵣ and a left identity eₗ, then eᵣ = eₗ.
(iii) If α is a commutative operation and if {X; α} has a right identity element eᵣ, then eᵣ is also a left identity.

Proof. To prove the first part, let e and e′ be identity elements of {X; α}. Then e α e′ = e and e α e′ = e′. Hence, e = e′.
To prove the second part, note that since eᵣ is a right identity, eₗ α eᵣ = eₗ. Also, since eₗ is a left identity, eₗ α eᵣ = eᵣ. Thus, eₗ = eᵣ.
To prove the last part, note that for all x ∈ X we have x α eᵣ = eᵣ α x = x.

In summary, if {X; α} has an identity element, then that element is unique. Furthermore, if {X; α} has both a right identity and a left identity element, then these elements are equal, and in fact they are equal to the unique identity element. Also, if {X; α} has a right (or left) identity element and α is a commutative operation, then {X; α} has an identity element.
2.1.7. Definition. Let α be an operation on X and let e be an identity of X relative to α. If x ∈ X, then x′ ∈ X is called a right inverse of x relative to α provided that
x α x′ = e.
An element x″ ∈ X is called a left inverse of x relative to α if
x″ α x = e.

The following exercise shows that some elements may not possess any right or left inverses. Some other elements may possess several inverses of one kind and none of the other, and other elements may possess a number of inverses of both kinds.

2.1.8. Exercise. Let X = {x, y, u, v} and define α as

α | x y u v
x | y x x y
y | y y y x
u | x y u v
v | x y v u

(i) Show that {X; α} contains an identity element.
(ii) Which elements possess neither left inverses nor right inverses?
(iii) Which element has a left and a right inverse?
A. Semigroups and Groups

Of crucial importance are mathematical systems called semigroups. Such mathematical systems serve as the natural setting for many important results in algebra and are used in several diverse areas of applications (e.g., qualitative analysis of dynamical systems, automata theory, etc.).

2.1.9. Definition. Let α be an operation on X. We call {X; α} a semigroup if α is an associative operation on X.

Now let x, y, z ∈ X, and let α be an associative operation on X. Then x α (y α z) = (x α y) α z ∈ X. Henceforth, we will often simply write x α y α z ≜ x α (y α z) = (x α y) α z. As a result of this convention we see that for x, y, u, v ∈ X,
x α y α u α v = (x α y α u) α v = x α (y α u) α v = x α y α (u α v)
= (x α y) α (u α v) = ((x α y) α u) α v.   (2.1.10)

As a generalization of the above we have the so-called generalized associative law, which asserts that if x₁, ..., xₙ are elements of a semigroup {X; α}, then any two products, each involving these elements in a particular order, are equal. This allows us to simply write x₁ α x₂ α ... α xₙ.
In view of Theorem 2.1.6, part (i), if a semigroup has an identity element, then such an element is unique. We give a special name to such a semigroup.

2.1.11. Definition. A semigroup {X; α} is called a monoid if X contains an identity element relative to α. Henceforth, the unique identity element of a monoid {X; α} will be denoted by e.

Subsequently, we frequently single out elements of monoids which possess inverses.

2.1.12. Definition. Let {X; α} be a monoid. If x ∈ X possesses a right inverse x′ ∈ X, then x is called a right invertible element in X. If x ∈ X possesses a left inverse x″ ∈ X, then x is called a left invertible element in X. If x ∈ X is both right invertible and left invertible in X, then we say that x is an invertible element or a unit of X.

Clearly, if e ∈ X, then e is an invertible element.

2.1.13. Theorem. Let {X; α} be a monoid, and let x ∈ X. If there exists a left inverse of x, say x″, and a right inverse of x, say x′, then x″ = x′ and x′ is unique.

Proof. Since α is associative, we have x″ α (x α x′) = x″ α e = x″ and x″ α (x α x′) = (x″ α x) α x′ = e α x′ = x′. Thus, x″ = x′. Now suppose there is another left inverse of x, say x*. Then x* = x′ and therefore x* = x″.
Theorem 2.1.13 does, in general, not hold for arbitrary mathematical systems {X; α} with identity, as is evident from the following:

2.1.14. Exercise. Let X = {u, v, x, y} and define α as

α | u v x y
u | v v u u
v | u u v x
x | u v x y
y | x v y x

Use this operation table to demonstrate that Theorem 2.1.13 does not, in general, hold if the monoid {X; α} is replaced by a system {X; α} with identity.

By Theorem 2.1.13, any invertible element of a monoid possesses a unique right inverse and a unique left inverse, and moreover these inverses are equal. This gives rise to the following.
2.1.15. Definition. Let {X; α} be a monoid. If x ∈ X has a left inverse and a right inverse, x″ and x′, respectively, then this unique element x′ = x″ is called the inverse of x and is denoted by x⁻¹.

Concerning inverses we have:

2.1.16. Theorem. Let {X; α} be a monoid.
(i) If x ∈ X has an inverse, x⁻¹, then x⁻¹ has an inverse (x⁻¹)⁻¹ = x.
(ii) If x, y ∈ X have inverses x⁻¹, y⁻¹, respectively, then x α y has an inverse, and moreover (x α y)⁻¹ = y⁻¹ α x⁻¹.
(iii) The identity element e ∈ X has an inverse e⁻¹, and e⁻¹ = e.

Proof. To prove the first part, note that x α x⁻¹ = e and x⁻¹ α x = e. Thus, x is both a left and a right inverse of x⁻¹, and (x⁻¹)⁻¹ = x.
To prove the second part, note that
(x α y) α (y⁻¹ α x⁻¹) = x α (y α y⁻¹) α x⁻¹ = e
and
(y⁻¹ α x⁻¹) α (x α y) = y⁻¹ α (x⁻¹ α x) α y = e.
The third part of the theorem follows trivially from e α e = e.
In the remainder of the present chapter we will often use the symbols "+" and "·" to denote operations in place of α, β, etc. We will call these "addition" and "multiplication." However, we strongly emphasize here that "+" and "·" will, in general, not denote addition and multiplication of real numbers but, instead, arbitrary operations. In cases where there exists an identity element relative to "+", we will denote this element by "0" and call it "zero." If there exists an identity element relative to "·", we will denote this element either by "1" or by e. Our usual notation for representing an identity relative to an arbitrary operation α will still be e. If in a system {X; +} an element x ∈ X possesses an inverse, we will denote this element by −x and we will call it "minus x". For example, if {X; +} is a monoid, then we denote the inverse of an invertible element x ∈ X by −x, and in this case we have x + (−x) = (−x) + x = 0, and also, −(−x) = x. Furthermore, if x, y ∈ X are invertible elements, then the "sum" x + y is also invertible, and −(x + y) = (−y) + (−x). Note, however, that unless "+" is commutative, −(x + y) ≠ (−x) + (−y). Finally, if x, y ∈ X and if y is an invertible element, then x + (−y) ∈ X. In this case we often will simply write x + (−y) ≜ x − y.
2.1.17. Example. Let X = {0, 1, 2, 3}, and let the systems {X; +} and {X; ·} be defined by means of the operation tables

+ | 0 1 2 3     · | 0 1 2 3
0 | 0 1 2 3     0 | 0 0 0 0
1 | 1 2 3 0     1 | 0 1 2 3
2 | 2 3 0 1     2 | 0 2 0 2
3 | 3 0 1 2     3 | 0 3 2 1

The reader should readily show that the systems {X; +} and {X; ·} are monoids. In this case the operation "+" is called "addition mod 4" and "·" is called "multiplication mod 4."
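Since these tables are just arithmetic mod 4, the monoid claims can be verified mechanically (illustrative sketch, our own variable names):

```python
X = range(4)
add = lambda x, y: (x + y) % 4    # "addition mod 4"
mul = lambda x, y: (x * y) % 4    # "multiplication mod 4"

for op, e in ((add, 0), (mul, 1)):
    # associativity ...
    assert all(op(op(x, y), z) == op(x, op(y, z))
               for x in X for y in X for z in X)
    # ... plus an identity element make each system a monoid
    assert all(op(e, x) == x and op(x, e) == x for x in X)

# {X; +} is in fact a group, while 2 has no multiplicative inverse:
assert all(any(add(x, y) == 0 for y in X) for x in X)
assert not any(mul(2, y) == 1 for y in X)
print("checks passed")
```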
The most important special type of semigroup that we will encounter in this chapter is the group.

2.1.18. Definition. A group is a monoid in which every element is invertible; i.e., a group is a semigroup, {X; α}, with identity in which every element is invertible.

The set R of real numbers with the operation of addition is an example of a group. The set of real numbers with the operation of multiplication does not form a group, since the number zero does not have an inverse relative to multiplication. However, the latter system is a monoid. If we let R* = R − {0}, then {R*; ·} is a group.
Groups possess several important properties. Some of these are summarized in the next result.

2.1.19. Theorem. Let {X; α} be a group, and let e denote the identity element of X relative to α. Let x and y be arbitrary elements in X. Then
(i) if x α x = x, then x = e;
(ii) if z ∈ X and x α y = x α z, then y = z;
(iii) if z ∈ X and y α x = z α x, then y = z;
(iv) there exists a unique w ∈ X such that
w α x = y;   (2.1.20)
and
(v) there exists a unique z ∈ X such that
x α z = y.   (2.1.21)

Proof. To prove the first part, let x α x = x. Then x⁻¹ α (x α x) = x⁻¹ α x, and so (x⁻¹ α x) α x = e; i.e., e α x = e. This implies that x = e.
To prove the second part, let x α y = x α z. Then x⁻¹ α (x α y) = x⁻¹ α (x α z), and so (x⁻¹ α x) α y = (x⁻¹ α x) α z. This implies that y = z.
The proof of part (iii) is similar to that of part (ii).
To prove part (iv), let w = y α x⁻¹. Then w α x = (y α x⁻¹) α x = y α (x⁻¹ α x) = y. To show that w is unique, suppose there is a v ∈ X such that v α x = y. Then w α x = v α x. By part (iii), w = v.
The proof of the last part of the theorem is similar to the proof of part (iv).

In part (iv) of Theorem 2.1.19 the element w is called the left solution of Eq. (2.1.20), and in part (v) of this theorem the element z is called the right solution of Eq. (2.1.21).
We can classify groups in a variety of ways. Some of these classifications are as follows. Let {X; α} be a group. If the set X possesses a finite number of elements, then we speak of a finite group. If the operation α is commutative, then we have a commutative group, also called an abelian group. If α is not commutative, then we speak of a non-commutative group or a non-abelian group. Also, by the order of a group we understand the order of the set X.
Now let {X; α} be a semigroup and let X₁ be a nonvoid subset of X which is closed relative to α. Then by Theorem 1.4.11, the operation α₁ on X₁ induced by the associative operation α is also associative, and thus the mathematical system {X₁; α₁} is also a semigroup. The system {X₁; α₁} is called a subsystem of {X; α}. This gives rise to the following concept.

2.1.22. Definition. Let {X; α} be a semigroup, let X₁ be a nonvoid subset of X which is closed relative to α, and let α₁ be the operation on X₁ induced by α. The semigroup {X₁; α₁} is called a subsemigroup of {X; α}.

In order to simplify our notation, we will henceforth use the notation {X₁; α} to denote the subsemigroup {X₁; α₁} (i.e., we will suppress the subscript of α).
The following result allows us to generate subsemigroups in a variety of ways.

2.1.23. Theorem. Let {X; α} be a semigroup and let Xᵢ ⊂ X for all i ∈ I, where I denotes some index set. Let Y = ∩ᵢ∈I Xᵢ. If {Xᵢ; α} is a subsemigroup of {X; α} for every i ∈ I, and if Y is not empty, then {Y; α} is a subsemigroup of {X; α}.

Proof. Let x, y ∈ Y. Then x, y ∈ Xᵢ for all i ∈ I, and so x α y ∈ Xᵢ for every i, and hence x α y ∈ Y. This implies that {Y; α} is a subsemigroup.
Now let Wbe any non void subset oI , where ; is a semigroup, and
let
: We c and ; is a subsemigroup oI ; n.
2.1. Some Basic Structures oIAlgebra
1
Then cy is nonempty, since E cy. Also, let
G n .
El/
Then We G, and by Theorem 2.1.23 G; Il is a subsemigroup oI ; Il .
This subsemigroup is called the subsemigroup generated by W.
2.1.24. Theorem. Let {X; α} be a monoid with e its identity element, and let {X₁; α} be a subsemigroup of {X; α}. If e ∈ X₁, then e is an identity element of {X₁; α}, and {X₁; α} is a monoid.

2.1.25. Exercise. Prove Theorem 2.1.24.
Next we define subgroup.

2.1.26. Definition. Let {X; α} be a semigroup, and let {X₁; α} be a subsemigroup of {X; α}. If {X₁; α} is a group, then {X₁; α} is called a subgroup of {X; α}. We denote this subgroup by {X₁; α}, and we say the set X₁ determines a subgroup of {X; α}.
We consider a specific example in the following:

2.1.27. Exercise. Let Z₆ = {0, 1, 2, 3, 4, 5} and define the operation α on Z₆ by means of the following operation table:

α | 0 1 2 3 4 5
--+------------
0 | 0 1 2 3 4 5
1 | 1 0 4 5 2 3
2 | 2 5 0 4 3 1
3 | 3 4 5 0 1 2
4 | 4 3 1 2 5 0
5 | 5 2 3 1 0 4

(a) Show that {Z₆; α} is a group.
(b) Let X₁ = {0, 1}. Show that {X₁; α} is a subgroup of {Z₆; α}.
(c) Are there any other subgroups of {Z₆; α}?
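Checks of this kind are easy to mechanize. The following sketch is not part of the text: the table is the one of Exercise 2.1.27 (as reconstructed from the scan, so treat it as an assumption), and the brute-force subgroup search is our own illustration.

```python
# Sketch (not from the text): verify the group axioms for the operation
# table of Exercise 2.1.27 and enumerate its subgroups by brute force.
from itertools import combinations

TABLE = [
    [0, 1, 2, 3, 4, 5],
    [1, 0, 4, 5, 2, 3],
    [2, 5, 0, 4, 3, 1],
    [3, 4, 5, 0, 1, 2],
    [4, 3, 1, 2, 5, 0],
    [5, 2, 3, 1, 0, 4],
]

def op(a, b):
    return TABLE[a][b]

def is_group(elems, op):
    # closure under the operation
    if any(op(a, b) not in elems for a in elems for b in elems):
        return False
    # associativity
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a in elems for b in elems for c in elems):
        return False
    # a two-sided identity element
    ident = next((e for e in elems
                  if all(op(e, a) == a and op(a, e) == a for a in elems)),
                 None)
    if ident is None:
        return False
    # an inverse for every element
    return all(any(op(a, b) == ident and op(b, a) == ident for b in elems)
               for a in elems)

Z6 = list(range(6))
subgroups = [set(s) for r in range(1, 7) for s in combinations(Z6, r)
             if is_group(set(s), op)]
```

Running the search confirms part (b) and answers part (c): besides {0, 1} and the trivial subgroups, the table admits further subgroups of order 2 and one of order 3.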
We have seen in Theorem 2.1.24 that if e ∈ X₁ ⊂ X, then it is also an identity of the subsemigroup {X₁; α}. We can state something further.

2.1.28. Theorem. Let {X; α} be a group with identity element e, and let {X₁; α} be a subgroup of {X; α}. Then e₁ is the identity element of {X₁; α} if and only if e₁ = e.

2.1.29. Exercise. Prove Theorem 2.1.28.
Chapter 2 / Algebraic Structures
It should be noted that a semigroup {X; α} which has no identity element may contain a subgroup {X₁; α}, since it is possible for a subsystem to possess an identity element while the original system may not possess an identity. If {X; α} is a semigroup with an identity element and if {X₁; α} is a subgroup, then the identity element of X may or may not be the identity element of X₁. However, if {X; α} is a group, then the subgroup must satisfy the conditions given in the following:
2.1.30. Theorem. Let {X; α} be a group, and let X₁ be a nonempty subset of X. Then {X₁; α} is a subgroup if and only if
(i) e ∈ X₁;
(ii) for every x ∈ X₁, x⁻¹ ∈ X₁; and
(iii) for every x, y ∈ X₁, xαy ∈ X₁.

Proof. Assume that {X₁; α} is a subgroup. Then (i) follows from Theorem 2.1.28, and (ii) and (iii) follow from the definition of a group.

Conversely, assume that hypotheses (i), (ii), and (iii) hold. Condition (iii) implies that X₁ is closed relative to α, and therefore {X₁; α} is a subsemigroup. Condition (i) along with Theorem 2.1.24 imply that {X₁; α} is a monoid, and condition (ii) implies that {X₁; α} is a group.
Analogous to Theorem 2.1.23 we have:

2.1.31. Theorem. Let {X; α} be a group, and let Xᵢ ⊂ X for all i ∈ I, where I is some index set. Let Y = ∩_{i∈I} Xᵢ. If {Xᵢ; α} is a subgroup of {X; α} for every i ∈ I, then {Y; α} is a subgroup of {X; α}.

Proof. Since e ∈ Xᵢ for every i ∈ I, it follows that e ∈ Y. Therefore, Y is nonempty. Now let y ∈ Y. Then y ∈ Xᵢ for all i ∈ I, and thus y⁻¹ ∈ Xᵢ for all i, so that y⁻¹ ∈ Y. Also, for every x, y ∈ Y we have x, y ∈ Xᵢ for every i ∈ I, and thus xαy ∈ Xᵢ for every i, and hence xαy ∈ Y. Therefore, we conclude from Theorem 2.1.30 that {Y; α} is a subgroup of {X; α}.
A direct consequence of the above result is the following:

2.1.32. Corollary. Let {X; α} be a group, and let {X₁; α} and {X₂; α} be subgroups of {X; α}. Let X₃ = X₁ ∩ X₂. Then {X₃; α} is a subgroup of {X₁; α} and of {X₂; α}.

2.1.33. Exercise. Prove Corollary 2.1.32.
We can define a generated subgroup in a similar manner as was done in the case of semigroups. To this end let W be any subset of X, where {X; α} is a group, and let

𝒴 = {Y : W ⊂ Y ⊂ X and {Y; α} is a subgroup of {X; α}}.

The set 𝒴 is clearly nonempty because X ∈ 𝒴. Now let

G = ∩_{Y∈𝒴} Y.

Then W ⊂ G, and by Theorem 2.1.31, {G; α} is a subgroup of {X; α}. This subgroup is called the subgroup generated by W.

2.1.34. Exercise. Let W be defined as above. Show that if {W; α} is a subgroup of {X; α}, then it is the subgroup generated by W.
Let us now consider the following:

2.1.35. Example. Let Z denote the set of integers, and let "+" denote the usual operation of addition of integers. Let W = {1}. If Y is any subset of Z such that {Y; +} is a subgroup of {Z; +} and W ⊂ Y, then Y = Z. To prove this statement, let n be any positive integer. Since Y is closed with respect to "+", we must have 1 + 1 = 2 ∈ Y. Similarly, we must have 1 + 1 + ... + 1 = n ∈ Y. Also, n⁻¹ = −n, and therefore all the negative integers are in Y. Also, n + (−n) = 0 ∈ Y, i.e., Y = Z. Thus, G = ∩_{Y∈𝒴} Y = Z, and so the group {Z; +} is the subgroup generated by {1}.
The above is an example of a special class of generated subgroups, the so-called cyclic groups, which we will define after our next result.

2.1.36. Theorem. Let Z denote the set of all integers, and let {X; α} be a group. Let x ∈ X and define xᵏ = x α x α ... α x (k times), for k a positive integer. Let x⁻ᵏ = (x⁻¹)ᵏ, and let x⁰ = e. Let Y = {xᵏ : k ∈ Z}. Then {Y; α} is the subgroup of {X; α} generated by {x}.

Proof. We first show that {Y; α} is a subgroup of {X; α}. Clearly, e = x⁰ and e ∈ Y, and for every y = xᵏ ∈ Y we have y⁻¹ = x⁻ᵏ ∈ Y. Also, for every xʲ, xᵏ ∈ Y we have xʲ α xᵏ = xʲ⁺ᵏ ∈ Y. Thus, by Theorem 2.1.30, {Y; α} is a subgroup of {X; α}. Next, we must show that {Y; α} is the subgroup generated by {x}. To do so, it suffices to show that Y ⊂ Y′ for every Y′ such that x ∈ Y′ and such that {Y′; α} is a subgroup of {X; α}. But this is certainly true, since y ∈ Y implies y = xᵏ for some k ∈ Z. Since x ∈ Y′, it follows that xᵏ ∈ Y′, and therefore y ∈ Y′.
The preceding result motivates the following:
2.1.37. Definition. Let {X; α} be a group. If there exists an element x ∈ X such that the subgroup generated by {x} is equal to {X; α}, then {X; α} is called the cyclic group generated by x.

By Theorem 2.1.36, we see that a cyclic group has elements of such a form that X = {..., x⁻³, x⁻², x⁻¹, e, x, x², x³, ...}. Now suppose there is some positive integer n such that xⁿ = e. Then we see that xⁿ⁺¹ = x. Similarly, xⁿ⁺² = x², and x⁻¹ = xⁿ⁻¹. Thus, X = {e, x, ..., xⁿ⁻¹}, and X is a finite set of order n. If there is no n such that xⁿ = e, then X is an infinite set.
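The finite case can be illustrated computationally. The sketch below is our own illustration (not from the text): the group is the additive group of integers mod 12, so the "k-th power" of x is x added to itself k times, and collecting the powers of x until they repeat yields the cyclic subgroup generated by {x}.

```python
# Sketch (not from the text): the cyclic subgroup generated by {x} in a
# finite group, illustrated with addition mod 12.

def add_mod_12(a, b):
    return (a + b) % 12

def cyclic_subgroup(x, op, identity):
    """Collect identity, x, x^2, ... until the powers repeat.

    In a finite group this set already contains the inverse of each
    power, so it is exactly the subgroup generated by {x}.
    """
    elems = {identity}
    power = x
    while power not in elems:
        elems.add(power)
        power = op(power, x)
    return elems

# the order of the cyclic subgroup generated by each element
orders = {x: len(cyclic_subgroup(x, add_mod_12, 0)) for x in range(12)}
```

Here 4 generates a proper subgroup of order 3, while 5 generates the whole group, so the group of integers mod 12 is cyclic.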
We consider next another important class of groups, the so-called permutation groups. To this end let X be a nonempty set and let M(X) denote the set of all mappings of X into itself. Now, if α, β ∈ M(X), then it follows from (1.2.15) that the composite mapping β ∘ α belongs also to M(X), and we can define an operation on M(X) (i.e., a mapping from M(X) × M(X) into M(X)) by associating with each ordered pair (β, α) the element β ∘ α. We denote this operation by "·" and write

β · α = β ∘ α,   α, β ∈ M(X).   (2.1.38)

We call this operation "multiplication," we refer to β · α as the product of β and α, and we note that (β ∘ α)(x) = (β · α)(x) for all x ∈ X. We also note that "·" is associative, for if α, β, γ ∈ M(X), then

(α · β) · γ = (α ∘ β) ∘ γ = α ∘ (β ∘ γ) = α · (β · γ).

Thus, the system {M(X); ·} is a semigroup, which we call the semigroup of transformations on X.

Next, let us recall that a permutation on X is a one-to-one mapping of X onto X. Clearly, any permutation on X belongs to M(X). In particular, the identity permutation e: X → X, defined by

e(x) = x for all x ∈ X,

belongs to M(X). We thus can readily prove the following:

2.1.39. Theorem. {M(X); ·} is a monoid whose identity element is the identity permutation of M(X).

Proof. Let α ∈ M(X). Then (e · α)(x) = e(α(x)) = α(x) for every x ∈ X, and so e · α = α. Similarly, (α · e)(x) = α(e(x)) = α(x) for all x ∈ X, and so α · e = α.
Next, we prove:

2.1.40. Theorem. Let {M(X); ·} be the semigroup of transformations on the set X. An element α ∈ M(X) has an inverse in M(X) if and only if α is a permutation on X. Moreover, the inverse of a unit α is the inverse mapping α⁻¹ determined by the permutation α.

Proof. Suppose that α ∈ M(X) is a permutation on X. Then it follows from Theorem 1.2.10, part (ii), that α⁻¹ is a permutation on X and hence α⁻¹ ∈ M(X). Since α ∘ α⁻¹ = α⁻¹ ∘ α = e, it follows that α · α⁻¹ = α⁻¹ · α = e, and thus α has an inverse.

Next, suppose that α has an inverse in M(X), and let α′ denote that inverse relative to "·". Then α′ ∈ M(X) and α · α′ = α′ · α = e. To show that α is a permutation on X we must show that α is a one-to-one mapping of X onto X. To prove that α is onto, we must show that for any x ∈ X there exists a y ∈ X such that α(y) = x. Since α′ ∈ M(X), it follows that α′(x) ∈ X for every x ∈ X and α ∘ α′(x) = e(x) = x. Letting y = α′(x), it follows that α is onto. To show that α is one-to-one, we assume that α(x) = α(y). Then α′(α(x)) = α′(α(y)), and since α′ ∘ α = e, we have

x = e(x) = α′ ∘ α(x) = α′ ∘ α(y) = e(y) = y.

Therefore, α is one-to-one. Hence, if α ∈ M(X) has an inverse, α⁻¹, it is a permutation on X.
Henceforth, we employ the following notation: the set of all permutations on a given set X is denoted by P(X). As pointed out in Chapter 1, if a set X has n elements, then there are n! distinct permutations on X.

The reader is now in a position to prove the following result.

2.1.41. Theorem. {P(X); ·} is a subgroup of {M(X); ·}.

2.1.42. Exercise. Prove Theorem 2.1.41.

The preceding result gives rise to a very important class of groups.

2.1.43. Definition. Any subgroup of the group {P(X); ·} is called a permutation group or a transformation group on X, and {P(X); ·} is called the permutation group or the transformation group on X.

Occasionally, we speak of a permutation group on X, say {Y; ·}, without making reference to the set X. In such cases it is assumed that {Y; ·} is a subgroup of the permutation group P(X) for some set X.
2.1.44. Example. Let X = {x, y, z}. Then P(X) consists of 3! = 6 permutations. Writing each permutation in two-row form (each element of the first row is mapped to the element below it), they are

α₁ = (x y z; x y z),  α₂ = (x y z; x z y),  α₃ = (x y z; y x z),
α₄ = (x y z; y z x),  α₅ = (x y z; z x y),  α₆ = (x y z; z y x).

We can readily verify that α₁ = e. If X₁ = {e, α₂}, then {X₁; ·} is a subgroup of P(X) and hence a permutation group on X. Let X₂ = {e, α₄, α₅}. Then {X₂; ·} is also a permutation group on X. Note that {X₁; ·} is of order 2 and {X₂; ·} is of order 3.
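The computations in Example 2.1.44 can be mirrored in code. The sketch below is our own illustration, not part of the text; the particular assignment of labels a2, a4, a5 to permutations is our reading of the example and should be treated as an assumption.

```python
# Sketch (not from the text): the permutations of X = {x, y, z} as Python
# dictionaries, with "." realized as composition (right factor applied first).
from itertools import permutations

X = ('x', 'y', 'z')
P = [dict(zip(X, image)) for image in permutations(X)]  # all 3! = 6 of them

def compose(beta, alpha):
    # (beta . alpha)(t) = beta(alpha(t))
    return {t: beta[alpha[t]] for t in X}

identity = {t: t for t in X}
a2 = {'x': 'x', 'y': 'z', 'z': 'y'}   # interchanges y and z
a4 = {'x': 'y', 'y': 'z', 'z': 'x'}   # a 3-cycle
a5 = {'x': 'z', 'y': 'x', 'z': 'y'}   # the inverse 3-cycle
```

Composing any two elements of {identity, a4, a5} again lands in that set, which is the closure condition (iii) of Theorem 2.1.30 for the order-3 subgroup.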
B. Rings and Fields
Thus far we have concerned ourselves with mathematical systems consisting of a set and an operation on the set. Presently we consider mathematical systems consisting of a basic set X with two operations α and β defined on the set, denoted by {X; α, β}. Associated with such systems there are two mathematical systems (called subsystems) {X; α} and {X; β}. By insisting that the systems {X; α} and {X; β} possess certain properties and that one of the operations be distributive over the other, we introduce the important mathematical systems known as rings. We then concern ourselves with special types of important rings called integral domains, division rings, and fields.
2.1.45. Definition. Let X be a nonempty set, and let α and β be operations on X. The set X together with the operations α and β on X, denoted by {X; α, β}, is called a ring if
(i) {X; α} is an abelian group;
(ii) {X; β} is a semigroup; and
(iii) β is distributive over α.

We refer to {X; α} as the group component of the ring, to {X; β} as the semigroup component of the ring, to α as the group operation of the ring, and to β as the semigroup operation of the ring. For convenience we often denote a ring {X; α, β} by X and simply refer to "ring X". For obvious reasons, we often use the symbols "+" and "·" ("addition" and "multiplication") in place of α and β, respectively. Thus, if X is a ring we may write {X; +, ·} and assume that {X; +} is the group component of X and {X; ·} is the semigroup component of X. We call {X; +} the additive group of ring X, {X; ·} the multiplicative semigroup of ring X, x + y the sum of x and y, and x · y the product of x and y.

We use 0 ("zero") to denote the identity element of {X; +}. If {X; ·} has an identity element, we denote that identity by e.

The inverse of an element x relative to "+" is denoted by −x. If x has an inverse relative to "·", we denote it by x⁻¹. Furthermore, we denote x + (−y) by x − y (the "difference of x and y") and (−x) + y by −x + y. Note that the elements 0, e, −x, and x⁻¹ are unique.
Subsequently, we adopt the convention that when operations "+" and "·" appear mixed without parentheses to clarify the order of operation, the operation should be taken with respect to "·" first and then with respect to "+". For example,

x + y · z = x + (y · z)

and not (x + y) · z. The latter would have to be written with parentheses. Thus, we have

x · (y + z) = (x · y) + (x · z) = x · y + x · z.
In general, the semigroup {X; ·} does not contain an identity. However, if it does, we have:

2.1.46. Definition. Let {X; +, ·} be a ring. If the semigroup {X; ·} has an identity element, we say that X is a ring with identity.

There should be no ambiguity concerning the above statement. The group {X; +} always has an identity, so if we say "ring with identity," we must refer to {X; ·}.
We note that it is always true that the operation "+" is commutative for a given ring. If in addition the operation "·" is also commutative, we have:

2.1.47. Definition. Let {X; +, ·} be a ring. If the operation "·" is commutative on the set X, then the ring X is called a commutative ring.

For rings we also have:

2.1.48. Definition. Let {X; +, ·} be a ring with identity. An element x ∈ X is called a unit of X if x has an inverse as an element of the semigroup {X; ·}. We denote this inverse of x by x⁻¹.
The reader can readily verify that the following examples are rings.

2.1.49. Exercise. Letting "+" and "·" denote the usual operations of addition and multiplication, show that {X; +, ·} is a commutative ring with identity if
(i) X is the set of integers;
(ii) X is the set of rational numbers; and
(iii) X is the set of real numbers.

2.1.50. Exercise. Let X = {0, 1} and define "+" and "·" by the following operation tables:

+ | 0 1      · | 0 1
--+----      --+----
0 | 0 1      0 | 0 0
1 | 1 0      1 | 0 1

Show that {X; +, ·} is a commutative ring with identity.
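For a two-element set, exhaustive verification is feasible. The following sketch is ours, not from the text; the tables of Exercise 2.1.50 coincide with addition and multiplication mod 2, which is how they are encoded here.

```python
# Sketch (not from the text): check conditions (i)-(iii) of Definition
# 2.1.45 for the two-element ring of Exercise 2.1.50.

X = (0, 1)

def add(a, b):
    # reproduces the "+" table above (addition mod 2)
    return (a + b) % 2

def mul(a, b):
    # reproduces the "." table above (multiplication mod 2)
    return a * b

# (i) {X; +} is an abelian group
abelian = (all(add(a, b) == add(b, a) for a in X for b in X)
           and all(add(add(a, b), c) == add(a, add(b, c))
                   for a in X for b in X for c in X)
           and all(add(0, a) == a for a in X)
           and all(any(add(a, b) == 0 for b in X) for a in X))

# (ii) {X; .} is a semigroup
associative = all(mul(mul(a, b), c) == mul(a, mul(b, c))
                  for a in X for b in X for c in X)

# (iii) "." is distributive over "+"
distributive = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
                   and mul(add(a, b), c) == add(mul(a, c), mul(b, c))
                   for a in X for b in X for c in X)
```

Since "·" is also commutative and 1 acts as a multiplicative identity, the system is a commutative ring with identity, as the exercise asserts.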
2.1.51. Exercise. Let {X; α} be an abelian group with identity element e. Define the operation β on X as x β y = e for every x, y ∈ X. Show that {X; α, β} is a ring.

For rings we have:
2.1.52. Theorem. If {X; +, ·} is a ring, then for every x, y ∈ X we have
(i) x + 0 = 0 + x = x;
(ii) −(x + y) = (−x) + (−y) = −x − y;
(iii) if x + y = 0, then y = −x;
(iv) −(−x) = x;
(v) x · 0 = 0 · x = 0;
(vi) (−x) · y = −(x · y) = x · (−y); and
(vii) (−x) · (−y) = x · y.

Proof. Parts (i)-(iv) follow from the fact that {X; +} is an abelian group and from our notation convention.

To prove part (v) we note that since 0 + 0 = 0, we have, for every x ∈ X, x · 0 = x · (0 + 0) = x · 0 + x · 0, and thus 0 = x · 0. Also, 0 · x = (0 + 0) · x = 0 · x + 0 · x, so that 0 = 0 · x. Hence, x · 0 = 0 · x = 0 for every x ∈ X.

To prove part (vi), note that 0 · y = 0 for every y ∈ X, and since x + (−x) = 0 we have 0 = 0 · y = (x + (−x)) · y = x · y + (−x) · y. This implies that (−x) · y = −(x · y), since −(x · y) is the additive inverse of x · y. Similarly, 0 = x · 0 = x · (y + (−y)) = x · y + x · (−y). This implies that x · (−y) = −(x · y). Thus, (−x) · y = −(x · y) = x · (−y).

Finally, to prove part (vii), we note that since −(−x) = x for every x ∈ X and since part (vi) holds for any x ∈ X, we obtain, replacing x by −x,

(−x) · (−y) = −((−x) · y) = −(−(x · y)) = x · y.
Now let {X; +, ·} denote a ring for which the two operations are equal, i.e., "+" = "·". Then x + y = x · y for all x, y ∈ X. In particular, if y = 0, then x = x + 0 = x · 0 = 0 for all x ∈ X, and we conclude that 0 is the only element of the set X. This gives rise to:

2.1.53. Definition. A ring {X; +, ·} is called a trivial ring if X = {0}.
We next introduce:

2.1.54. Definition. Let {X; +, ·} be a ring. If there exist nonzero elements x, y ∈ X (not necessarily distinct) such that x · y = 0, then x and y are both called divisors of zero.

We have:

2.1.55. Theorem. Let {X; +, ·} be a ring, and let X′ = X − {0}. Then X has no divisors of zero if and only if {X′; ·} is a subsemigroup of {X; ·}.

Proof. Assume that X has no divisors of zero. Then x, y ∈ X′ implies x · y ≠ 0, so x · y ∈ X′ and X′ is a subsemigroup.

Conversely, if x, y ∈ X′ implies x · y ∈ X′, then x · y ≠ 0 whenever x ≠ 0 and y ≠ 0.
We now consider special types of rings called integral domains.

2.1.56. Definition. A ring {X; +, ·} is called an integral domain if it has no divisors of zero.

Our next result enables us to characterize integral domains in another equivalent fashion.

2.1.57. Theorem. A ring X is an integral domain if and only if for every x ≠ 0, the following three statements are equivalent for every y, z ∈ X:
(i) y = z;
(ii) x · y = x · z; and
(iii) y · x = z · x.

Proof. Assume that X is an integral domain. Clearly, (i) implies (ii) and (iii). To show that (ii) implies (i), let x · y = x · z. Then x · (y − z) = 0. Since x ≠ 0 and X has no zero divisors, y − z = 0, or y = z. Thus, (ii) implies (i). Similarly, it follows that (iii) implies (i). This proves that (i), (ii), and (iii) are equivalent.

Conversely, assume that x ≠ 0 and that (i), (ii), and (iii) are equivalent. Let x · y = 0. Then x · y = x · 0, and it follows that y must be zero, since (ii) implies (i). Thus, x · y ≠ 0 for y ≠ 0, and X has no zero divisors.
We now introduce divisors of elements.

2.1.58. Definition. Let {X; +, ·} be a commutative integral domain with identity, and let x, y ∈ X. We say y is a divisor of x if there exists an element z ∈ X such that x = y · z. If y is a divisor of x, we write y | x.

If y | x, it is customary to say that y divides x.

2.1.59. Theorem. Let {X; +, ·} be a commutative integral domain with identity, and let x ∈ X. Then x is a unit of X if and only if x | e.

Proof. Let x | e. Then there is a z ∈ X such that e = z · x. Thus, z is an inverse of x, i.e., z = x⁻¹.

Conversely, let x be a unit of X. Then there exists x⁻¹ ∈ X such that e = x⁻¹ · x, and thus x | e.
We notice that if in an integral domain x · y = 0, then either x = 0 or y = 0. Now a divisor of zero cannot have an inverse. To show this, we let x and y be divisors of zero, i.e., x · y = 0. Suppose that y has an inverse. Then x = x · y · y⁻¹ = 0 · y⁻¹, or x = 0, which contradicts the fact that x and y are zero divisors. However, the fact that an element is not a zero divisor does not imply it has an inverse. If all of the elements except zero have an inverse, we have yet another special type of ring.

2.1.60. Definition. Let {X; +, ·} be a nontrivial ring, and let X′ = X − {0}. The ring X is called a division ring if {X′; ·} is a subgroup of {X; ·}.
In the case of division rings we have:

2.1.61. Theorem. Let {X; +, ·} be a division ring. Then X is a ring with identity.

Proof. Let X′ = X − {0}. Then {X′; ·} has an identity element e. Let x ∈ X. If x ∈ X′, then e · x = x · e = x. If x ∉ X′, then x = 0 and x · e = 0 · e = e · 0 = 0 = x. Therefore, e is an identity element of X.
Of utmost importance is the following special type of ring.

2.1.62. Definition. Let {X; +, ·} be a division ring. Then X is called a field if the operation "·" is commutative.

Because of the prominence of fields in mathematics as well as in applications, and because we will have occasion to make repeated use of fields, it may be worthwhile to restate the above definition by listing all the properties of fields.

2.1.63. Definition. Let F be a set containing more than one element, and let there be two operations "+" and "·" defined on F. Then {F; +, ·} is a field provided that:
(i) x + (y + z) = (x + y) + z and x · (y · z) = (x · y) · z for all x, y, z ∈ F (i.e., "+" and "·" are associative operations);
(ii) x + y = y + x and x · y = y · x for all x, y ∈ F (i.e., "+" and "·" are commutative operations);
(iii) there exists an element 0 ∈ F such that 0 + x = x for all x ∈ F;
(iv) for every x ∈ F there exists an element −x ∈ F such that x + (−x) = 0;
(v) x · (y + z) = x · y + x · z for all x, y, z ∈ F (i.e., "·" is distributive over "+");
(vi) there exists an element e ≠ 0 such that e · x = x for all x ∈ F; and
(vii) for any x ≠ 0, there exists an x⁻¹ ∈ F such that x · (x⁻¹) = e.
2.1.64. Example. Perhaps the most widely known field is the set of real numbers with the usual rules for addition and multiplication.

2.1.65. Exercise. Let Z denote the set of all integers and "+" and "·" denote the usual operations of addition and multiplication on Z. Show that {Z; +, ·} is an integral domain, but not a division ring, and hence not a field.

The above example and exercise yield:

2.1.66. Definition. Let R denote the set of all real numbers, let Z denote the set of all integers, and let "+" and "·" denote the usual operations of addition and multiplication, respectively. We call {R; +, ·} the field of real numbers and {Z; +, ·} the ring of integers.
Another very important field is considered in the following:

2.1.67. Exercise. Let C = R × R, where R is given in Definition 2.1.66. For any x, y ∈ C, let x = (a, b) and y = (c, d), where a, b, c, d ∈ R. We define x = y if and only if a = c and b = d. Also, we define the operations "+" and "·" on C by

x + y = (a + c, b + d)

and

x · y = (ac − bd, ad + bc).

Show that {C; +, ·} is a field.

In view of the last exercise we have:

2.1.68. Definition. The field {C; +, ·} defined in Exercise 2.1.67 is called the field of complex numbers.
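As a quick illustration (our sketch, not part of the text), the operations of Exercise 2.1.67 can be written out on pairs of reals; the inverse formula used for axiom (vii) of Definition 2.1.63, x⁻¹ = (a/(a² + b²), −b/(a² + b²)), is the standard one and is our addition here.

```python
# Sketch (not from the text): the field operations of Exercise 2.1.67 on
# ordered pairs of real numbers.

def c_add(x, y):
    (a, b), (c, d) = x, y
    return (a + c, b + d)

def c_mul(x, y):
    (a, b), (c, d) = x, y
    return (a*c - b*d, a*d + b*c)

def c_inv(x):
    # for x = (a, b) != (0, 0): x . c_inv(x) = e = (1, 0)
    a, b = x
    n = a*a + b*b
    return (a/n, -b/n)
```

The pair (0, 1) plays the role of the imaginary unit: its square is (−1, 0).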
2.1.69. Exercise. Let Q denote the set of rational numbers, let P denote the set of irrational numbers, and let "+" and "·" denote the usual operations of addition and multiplication on P and Q.
(a) Discuss the system {Q; +, ·}.
(b) Discuss the system {P; +, ·}.
2.1.70. Exercise. (This exercise shows that the family of 2 × 2 matrices forms a ring but not a field.) Let {R; +, ·} denote the field of real numbers. Define M to be the set characterized as follows. If u, v ∈ M, then u and v are of the form

u = |a b|,   v = |m n|,
    |c d|        |p q|

where a, b, c, d and m, n, p, q ∈ R. Define the operations "+" and "·" on M by

u + v = |a b| + |m n| = |a+m  b+n|
        |c d|   |p q|   |c+p  d+q|

and

u · v = |a b| · |m n| = |a·m + b·p   a·n + b·q|
        |c d|   |p q|   |c·m + d·p   c·n + d·q|.

(Note that in the preceding, the operations + and · defined on M are entirely different from the operations + and · for the field R.)

(a) Show that {M; ·} is a monoid.
(b) Show that {M; +} is an abelian group.
(c) Show that {M; +, ·} is a ring.
(d) Show that {M; +, ·} has divisors of zero.
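Part (d) can be seen concretely. In the sketch below (ours, not from the text; matrices are flattened to 4-tuples (a, b, c, d) purely for brevity), two nonzero matrices multiply to the zero matrix.

```python
# Sketch (not from the text): 2x2 matrices as 4-tuples (a, b, c, d),
# exhibiting divisors of zero for part (d) of Exercise 2.1.70.

ZERO = (0, 0, 0, 0)
IDENT = (1, 0, 0, 1)    # the multiplicative identity matrix

def m_add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def m_mul(u, v):
    a, b, c, d = u
    m, n, p, q = v
    return (a*m + b*p, a*n + b*q, c*m + d*p, c*n + d*q)

u = (1, 0, 0, 0)        # nonzero
v = (0, 0, 0, 1)        # nonzero
product = m_mul(u, v)   # equals ZERO, so u and v are divisors of zero
```

Since a divisor of zero cannot have an inverse, {M; +, ·} cannot be a division ring, let alone a field; the same example also shows "·" is not commutative on M.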
Next, we introduce the concept of subring.

2.1.71. Definition. Let X be a ring, and let X₁ be a nonvoid subset of X which is closed relative to both operations "+" and "·" of the ring X. The set X₁, together with the (induced) operations "+" and "·", i.e., {X₁; +, ·}, is called a subring of the ring X provided that {X₁; +, ·} is itself a ring.

In connection with the above definition we say that subset X₁ determines the subring {X₁; +, ·}. We have:

2.1.72. Theorem. If X is a ring, then a nonvoid subset X₁ of X determines a subring of the ring X if and only if
(i) X₁ is closed with respect to both operations "+" and "·"; and
(ii) −x ∈ X₁ whenever x ∈ X₁.

2.1.73. Exercise. Prove Theorem 2.1.72.
Using the concept of subring, we now introduce subdomains.

2.1.74. Definition. Let X be a ring, and let X₁ be a subring of X. If X₁ is an integral domain, then it is called a subdomain of X.

We also define subfield in a natural way.

2.1.75. Definition. Let X be a ring, and let X₁ be a subring of X. If X₁ is a field, then it is called a subfield of X.

Before, we characterized a trivial ring as a ring for which the set X consists only of the 0 element. In the case of subrings we have:

2.1.76. Definition. Let {X; +, ·} be a ring, and let {X₁; +, ·} be a subring. Then subring X₁ is called a trivial subring if either
(i) X₁ = {0}, or
(ii) X₁ = X.
For subdomains we have:

2.1.77. Theorem. Let X be an integral domain, and let X₁ be a nontrivial subring of X. Then X₁ is a subdomain of X.

Proof. Let x, y ∈ X₁, and let x · y = 0. Since x, y ∈ X, x and y cannot be zero divisors. Thus, X₁ has no zero divisors.

For subfields we have:

2.1.78. Theorem. Let X be a field, and let X₁ be a subring of X. Then X₁ is a subfield of X if and only if for every x ∈ X₁, x ≠ 0, we have x⁻¹ ∈ X₁.

2.1.79. Exercise. Prove Theorem 2.1.78.
For the intersection of arbitrary subrings we have the following:

2.1.80. Theorem. Let X be a ring, and let Xᵢ be a subring of X for each i ∈ I, where I is some index set. Let Y = ∩_{i∈I} Xᵢ. Then {Y; +, ·} is a subring of {X; +, ·}.

Proof. Since 0 ∈ Xᵢ for all i ∈ I, it follows that 0 ∈ Y and Y is nonempty. Let x, y ∈ Y. Then x, y ∈ Xᵢ for all i ∈ I. Hence, x + y ∈ Xᵢ and x · y ∈ Xᵢ for all i ∈ I, so that Y is closed with respect to "+" and "·". Also, −x ∈ Xᵢ for every i ∈ I. Thus, by Theorem 2.1.72, Y is a subring of X.

Now let {X; +, ·} be a ring and let W be any subset of X. Also, let

𝒴 = {Y : W ⊂ Y ⊂ X and Y is a subring of X}.

Then 𝒴 is nonempty because X ∈ 𝒴. Now let R = ∩_{Y∈𝒴} Y. Then W ⊂ R and, by Theorem 2.1.80, {R; +, ·} is a subring of {X; +, ·}. This subring is called the subring generated by W.
C. Modules, Vector Spaces, and Algebras
Thus far we have considered mathematical systems consisting of a set X of elements and of mappings from X × X into X called operations on X. Since a mapping may be regarded as a set and since an operation is a mapping (see Chapter 1), the various components of the mathematical systems considered up to this point may be thought of as being derived from one set X.

Next, we concern ourselves with mathematical systems which are not restricted to possessing one single fundamental set. We have seen that a single set X admits a number of basic derived sets. Clearly, the number of sets that may be derived from two sets, say X and Y, will increase considerably. For example, there are sets which may be generated by utilizing operations on X and Y, and then there are sets which may be derived from mappings of X into Y or Y into X.

Mathematical systems which possess several fundamental sets and operations on at least one of these sets may, at least in part, be analyzed by making use of the development given thus far in the present section. Indeed, one may view many such complex systems as a composite of simpler mathematical systems and refer to such systems simply as composite mathematical systems. Important examples of such systems include vector spaces, algebras, and modules.
2.1.81. Definition. Let {R; +, ·} be a ring with identity e, and let {X; +} be an abelian group. Let μ: R × X → X be any function satisfying the following four conditions for all α, β ∈ R and for all x, y ∈ X:
(i) μ(α + β, x) = μ(α, x) + μ(β, x);
(ii) μ(α, x + y) = μ(α, x) + μ(α, y);
(iii) μ(α, μ(β, x)) = μ(α · β, x); and
(iv) μ(e, x) = x.
Then the composite system {R, X, μ} is called a module.

Since the function μ is defined on R × X, the module defined above is sometimes called a left R-module. A right R-module is defined in an analogous manner. We will consider only left R-modules and simply refer to them as modules, or R-modules.

The mapping μ: R × X → X is usually abbreviated by writing μ(α, x) = αx, i.e., in the same manner as "multiplication of α times x." Using this notation, conditions (i) to (iv) above become
(i) (α + β)x = αx + βx;
(ii) α(x + y) = αx + αy;
(iii) α(βx) = (α · β)x; and
(iv) ex = x,
respectively. We usually refer to the module {R, X, μ} by simply referring to X and calling it an R-module or a module over R.

To simplify notation, we used in the preceding the same operation symbol, +, for ring R as well as for group X. However, this should cause no confusion, since it will always be clear from context which operation is used. We will follow similar practices on numerous other occasions in this book.
2.1.82. Example. Let {Z; +, ·} denote the ring of integers, and let {X; +} be any abelian group. Define μ: Z × X → X by μ(n, x) = x + x + ... + x, where the summation includes x n times. We abbreviate this as μ(n, x) = nx and think of it as "n times x." The identity element in Z is 1, and we see that conditions (i) to (iv) in Definition 2.1.81 are satisfied. Thus, any abelian group may be viewed as a module over the ring of integers.

2.1.83. Example. Let {X; +, ·} be a ring with identity, and let R be a subring of X with e ∈ R. By defining μ: R × X → X as μ(α, x) = α · x, it is clear that X is an R-module. In particular, if R = X, we see that any ring with identity can be made into a module over itself.
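Example 2.1.82 can be spot-checked mechanically. The sketch below is our own illustration (not from the text), using the abelian group of integers mod 4; μ(n, x) is realized as an n-fold sum, with −x used when n is negative.

```python
# Sketch (not from the text): Example 2.1.82 realized for the abelian
# group of integers mod 4, with mu(n, x) the n-fold sum of x.

def add4(a, b):
    return (a + b) % 4

def neg4(a):
    return (-a) % 4

def mu(n, x):
    """mu(n, x) = x + x + ... + x (n times); for n < 0, add -x instead."""
    result = 0
    step = x if n >= 0 else neg4(x)
    for _ in range(abs(n)):
        result = add4(result, step)
    return result

# conditions (i)-(iv) of Definition 2.1.81, checked over a small range
ok = all(
    mu(m + n, x) == add4(mu(m, x), mu(n, x))            # (i)
    and mu(m, add4(x, y)) == add4(mu(m, x), mu(m, y))   # (ii)
    and mu(m, mu(n, x)) == mu(m * n, x)                 # (iii)
    for m in range(-3, 4) for n in range(-3, 4)
    for x in range(4) for y in range(4)
) and all(mu(1, x) == x for x in range(4))              # (iv)
```

The range-limited check is only a spot check, of course; the general argument is the one given in the example.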
For modules we have:

2.1.84. Theorem. Let X be an R-module. Then for all α ∈ R and x ∈ X we have
(i) α0 = 0;
(ii) α(−x) = −(αx);
(iii) 0x = 0; and
(iv) (−α)x = −(αx).

Proof. To prove the first part, we note that for 0 ∈ X we have 0 + 0 = 0. Thus, α(0 + 0) = α0 + α0 = α0, and so α0 = 0.

To prove the second part, note that for any x ∈ X we have x + (−x) = 0, and thus αx + α(−x) = α(x + (−x)) = α0 = 0. Therefore, α(−x) = −(αx).

To prove the third part, observe that for 0 ∈ R we have 0 + 0 = 0. Hence, (0 + 0)x = 0x + 0x = 0x, and therefore 0x = 0.

To prove the last part, note that since α + (−α) = 0, it follows that αx + (−α)x = (α + (−α))x = 0x = 0. Therefore, (−α)x = −(αx).
We next introduce the important concept of vector space.

2.1.85. Definition. Let {F; +, ·} be a field, and let {X; +} be an abelian group. If X is an F-module, then X is called a vector space over F.

The notion of vector space, also called linear space, is among the most important concepts encountered in mathematics. We will devote the next two chapters and a large portion of the remainder of this book to vector spaces and to mappings on such spaces.
2.1.86. Theorem. Let {R; +, ·} be a ring, and let Rⁿ = R × R × ... × R, i.e., Rⁿ denotes the n-fold Cartesian product of R. We denote the element x ∈ Rⁿ by x = (ξ₁, ..., ξₙ) and define the operation "+" on Rⁿ by

x + y = (ξ₁ + η₁, ..., ξₙ + ηₙ)

for all x, y ∈ Rⁿ, where y = (η₁, ..., ηₙ). Also, we define μ: R × Rⁿ → Rⁿ by

αx = (α · ξ₁, ..., α · ξₙ)

for all α ∈ R and x ∈ Rⁿ. Then Rⁿ is an R-module.

2.1.87. Exercise. Prove Theorem 2.1.86.

We also have:

2.1.88. Theorem. Let {F; +, ·} be a field, and let Fⁿ = F × ... × F be the n-fold Cartesian product of F. Denote the element x ∈ Fⁿ by x = (ξ₁, ..., ξₙ) and define the operation "+" on Fⁿ by

x + y = (ξ₁ + η₁, ..., ξₙ + ηₙ)

for all x, y ∈ Fⁿ. Also, define μ: F × Fⁿ → Fⁿ by

αx = (α · ξ₁, ..., α · ξₙ)

for all α ∈ F and x ∈ Fⁿ. Then Fⁿ is a vector space over F.

2.1.89. Exercise. Prove Theorem 2.1.88.

In view of Theorem 2.1.88 we have:

2.1.90. Definition. Let {F; +, ·} be a field. The vector space Fⁿ over F is called the vector space of n-tuples over F.
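As a concrete rendering (our sketch, not from the text) of the componentwise operations in Theorem 2.1.88, with F taken to be the field of real numbers:

```python
# Sketch (not from the text): the operations of the vector space of
# n-tuples over the real numbers, defined componentwise.

def vec_add(x, y):
    # x + y = (x1 + y1, ..., xn + yn)
    return tuple(xi + yi for xi, yi in zip(x, y))

def scalar_mul(alpha, x):
    # mu(alpha, x) = (alpha * x1, ..., alpha * xn)
    return tuple(alpha * xi for xi in x)

x = (1.0, 2.0, 3.0)
y = (4.0, 5.0, 6.0)
```

Conditions (i)-(iv) of Definition 2.1.81 reduce here to componentwise arithmetic in the field, e.g. α(x + y) = αx + αy holds because multiplication distributes over addition in each coordinate.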
Another very important concept encountered in mathematics is that of an algebra. We have:

2.1.91. Definition. Let X be a vector space over a field F. Let a binary operation called "multiplication" and denoted by "·" be defined on X, satisfying the following axioms:
(i) x · (y + z) = x · y + x · z;
(ii) (x + y) · z = x · z + y · z; and
(iii) (αx) · (βy) = (α · β)(x · y)
for all x, y, z ∈ X and for all α, β ∈ F. Then X is called an algebra over F. If, in addition to the above axioms, the binary operation of multiplication is associative, then X is called an associative algebra. If the operation is commutative, then X is called a commutative algebra. If X has an identity element, then X is called an algebra with identity.

Note that in hypothesis (iii) the symbol "·" is used to denote two different operations. Thus, in the case of x · y the operation used is defined on X, while in the case of α · β the operation used is defined on F.

The reader is cautioned that in some texts the term algebra means what we defined to be an associative algebra.
2.1.92. Exercise. Let {M; +, ·} denote the ring of 2 × 2 matrices defined in Exercise 2.1.70, and let {R; +, ·} be the field of real numbers. For u ∈ M given by

u = |a b|,
    |c d|

where a, b, c, d ∈ R, define αu for α ∈ R by

αu = |α·a  α·b|
     |α·c  α·d|.

Show that M is an associative algebra over R.
In some areas of application, so-called Lie algebras are of importance. We have:

2.1.93. Definition. A non-associative algebra X is called a Lie algebra if x · x = 0 for every x ∈ X and if

x · (y · z) + y · (z · x) + z · (x · y) = 0   (2.1.94)

for every x, y, z ∈ X. Equation (2.1.94) is called the Jacobi identity.

Let us now consider some specific cases of Lie algebras. Our first exercise shows that any associative algebra can be made into a Lie algebra.

2.1.95. Exercise. Let X be an associative algebra over F, and define the operation "·" on X by

x · y = xy − yx

for all x, y ∈ X (where juxtaposition, xy, denotes the operation on the associative algebra X over F). Show that X with "·" defined on it is a Lie algebra.

2.1.96. Example. In Exercise 2.1.70 we showed that the set of 2 × 2 matrices forms a ring but not a field, and in Exercise 2.1.92 we showed that this set forms an algebra over R, the field of real numbers. This set can be made into a Lie algebra by Exercise 2.1.95.
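Exercise 2.1.95 and Example 2.1.96 can be combined in a short numerical check. The sketch below is ours, not from the text (matrices are again flattened to 4-tuples (a, b, c, d)): the commutator bracket on 2 × 2 real matrices satisfies x · x = 0 and the Jacobi identity (2.1.94).

```python
# Sketch (not from the text): the bracket x.y = xy - yx of Exercise 2.1.95
# on 2x2 matrices, checking x.x = 0 and the Jacobi identity (2.1.94).

def mat_mul(u, v):
    a, b, c, d = u
    m, n, p, q = v
    return (a*m + b*p, a*n + b*q, c*m + d*p, c*n + d*q)

def mat_sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

def bracket(x, y):
    # the commutator: x.y = xy - yx
    return mat_sub(mat_mul(x, y), mat_mul(y, x))

def jacobi(x, y, z):
    # x.(y.z) + y.(z.x) + z.(x.y), which should be the zero matrix
    terms = (bracket(x, bracket(y, z)),
             bracket(y, bracket(z, x)),
             bracket(z, bracket(x, y)))
    return tuple(sum(t) for t in zip(*terms))

x, y, z = (1, 2, 3, 4), (0, 1, 0, 0), (5, 0, 0, -5)
```

That the Jacobi identity holds here is no accident: it is an algebraic consequence of associativity of matrix multiplication, which is the content of Exercise 2.1.95.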
2.1.97. Exercise. Let X denote the usual "three-dimensional space," and let i, j, k denote the elements of X depicted in Figure A.

[2.1.98. Figure A. Unit vectors i, j, k in three-dimensional space: i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1).]

Define the operation "×" on X by the table

× | i   j   k
--+-----------
i | 0   k  −j
j | −k  0   i
k | j  −i   0

i.e., "×" denotes the usual "cross product," also called "outer product," encountered in vector analysis. Show that X is a Lie algebra.
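A numerical check of this exercise (our sketch, not part of the text), with the cross product written out in coordinates:

```python
# Sketch (not from the text): the cross product on three-dimensional
# space, checking x x x = 0 and the Jacobi identity of Exercise 2.1.97.

def cross(u, v):
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2*v3 - u3*v2, u3*v1 - u1*v3, u1*v2 - u2*v1)

def plus(u, v):
    return tuple(a + b for a, b in zip(u, v))

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

def jacobi(x, y, z):
    # x x (y x z) + y x (z x x) + z x (x x y), which should vanish
    return plus(plus(cross(x, cross(y, z)),
                     cross(y, cross(z, x))),
                cross(z, cross(x, y)))
```

The coordinate formula reproduces the operation table of the exercise (i × j = k and its cyclic relatives), and the Jacobi identity holds identically.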
Let us next consider submodules.

2.1.99. Definition. Let {R; +, ·} be a ring with identity, and let {X; +} be an abelian group, where X is an R-module. Let {Y; +} be a subgroup of {X; +}. If Y is an R-module, then Y is called an R-submodule of X.

We can characterize submodules by the following:

2.1.100. Theorem. Let X be an R-module, and let Y be a nonempty subset of X. Then Y is an R-submodule if and only if
(i) {Y; +} is a subgroup of {X; +}; and
(ii) for all α ∈ R and x ∈ Y, we have αx ∈ Y.

Proof. We give the sufficiency part of the proof and leave the necessity part as an exercise.

Let α, β ∈ R and let x ∈ Y. Then αx, βx, (α + β)x ∈ Y by hypothesis (ii). Since Y is a group, it follows that αx + βx ∈ Y, and since x ∈ X we have (α + β)x = αx + βx. Now let α ∈ R and let x, y ∈ Y. Then α(x + y) ∈ Y and, also, α(x + y), αx, αy ∈ X. Thus, α(x + y) = αx + αy, since Y is a subgroup of X. Now let α, β ∈ R, and let x ∈ Y. Then βx ∈ Y, and hence α(βx) ∈ Y. We have (α · β)x ∈ Y, and so α(βx) = (α · β)x. Also, since e ∈ R, we have ex ∈ Y for all x ∈ Y and, furthermore, since Y ⊂ X, we have ex = x. This proves that Y is an R-module and hence an R-submodule of X.
2.1.101. Exercise. Prove the necessity part of the preceding theorem.

We next introduce the notion of vector subspace, also called linear subspace.

2.1.102. Definition. Let F be a field, and let X be a vector space over F. Let Y be a subset of X. If Y is an F-submodule of X, then Y is called a vector subspace.

Let us consider some specific cases.
2.1.103. Example. Let R be a ring, let X be an R-module, and let xᵢ ∈ X for i = 1, ..., n. Then the subset of X given by {x ∈ X : x = α₁x₁ + ... + αₙxₙ, αᵢ ∈ R} is an R-submodule of X.

2.1.104. Example. Let F be a field, and let Fⁿ be the vector space of n-tuples over F. Let x₁ = (1, 0, ..., 0) and x₂ = (0, 1, 0, ..., 0). Then x₁, x₂ ∈ Fⁿ. Let Y = {x ∈ Fⁿ : x = α₁x₁ + α₂x₂, α₁, α₂ ∈ F}. Then Y is a vector subspace. We see that if x ∈ Y, then x is of the form x = (α₁, α₂, 0, ..., 0).
We net prove:
2.1.105. Theorem. Let X be an R-module, and let {Yᵢ} denote a family of R-submodules of X; i.e., Yᵢ is a submodule of X for every i ∈ I, where I is some index set. Let Y = ∩_{i∈I} Yᵢ. Then Y is an R-submodule of X.

Proof. Since Yᵢ is a subgroup of {X; +} for all i ∈ I, it follows that Y is a subgroup of X by Theorem 2.1.31. Now let α ∈ R and let y ∈ Y. Then y ∈ Yᵢ for all i ∈ I. Hence, αy ∈ Yᵢ for all i ∈ I, and so αy ∈ Y. Therefore, by Theorem 2.1.100, Y is an R-submodule of X. ■
The above result gives rise to:
2.1.106. Definition. Let X be an R-module, and let W be a subset of X. Let 𝒴 be the family of subsets of X given by

𝒴 = {Y : W ⊂ Y ⊂ X, and Y is an R-submodule of X}.

Let G = ∩_{Y∈𝒴} Y. Then G is called the R-submodule of X generated by W.
Let us next prove:
2.1.107. Theorem. Let X be an R-module, and let x₁, …, xₙ ∈ X. Let Y(x₁, …, xₙ) denote the subset of X given by

Y(x₁, …, xₙ) = {x ∈ X : x = η₁x₁ + … + ηₙxₙ, η₁, …, ηₙ ∈ R}.

Then Y(x₁, …, xₙ) is an R-submodule of X.

Proof. For brevity let Y = Y(x₁, …, xₙ). To show that Y is a subgroup of X we first note that 0 ∈ Y. Next, for x = η₁x₁ + … + ηₙxₙ ∈ Y, let y = (−η₁)x₁ + … + (−ηₙ)xₙ. Then y ∈ Y and x + y = 0, and hence y = −x. Next, let y = β₁x₁ + … + βₙxₙ ∈ Y. Then z = x + y = (η₁ + β₁)x₁ + … + (ηₙ + βₙ)xₙ ∈ Y. Therefore, by Theorem 2.1.30, Y is a subgroup of X.

Finally, note that for any α ∈ R,

αx = α(η₁x₁ + … + ηₙxₙ) = (αη₁)x₁ + … + (αηₙ)xₙ ∈ Y.

Thus, by Theorem 2.1.100, Y is an R-submodule of X. ■
We see that Y(x₁, …, xₙ) belongs to the family 𝒴 of Definition 2.1.106 if we let W = {x₁, …, xₙ}, in which case ∩_{Y∈𝒴} Y = Y(x₁, …, xₙ). This leads to:
2.1.108. Definition. Let X be an R-module, let x₁, …, xₙ ∈ X, and let

Y(x₁, …, xₙ) = {x ∈ X : x = η₁x₁ + … + ηₙxₙ, η₁, …, ηₙ ∈ R}.

Then Y(x₁, …, xₙ) is called the R-submodule of X generated by x₁, …, xₙ.
Also of interest to us is:
2.1.109. Definition. Let Y be an R-module. If there exist elements x₁, …, xₙ ∈ Y such that for every y ∈ Y there exist η₁, …, ηₙ ∈ R such that y = η₁x₁ + … + ηₙxₙ, then Y is said to be finitely generated, and x₁, …, xₙ are called the generators of Y.
It can happen that the indexed set {η₁, …, ηₙ} in the above definition is not unique. That is to say, for y ∈ Y we may have y = η₁x₁ + … + ηₙxₙ = β₁x₁ + … + βₙxₙ, where ηᵢ ≠ βᵢ for some i. However, if it turns out that the above representation of y in terms of x₁, …, xₙ is unique, then we have:
2.1.110. Definition. Let Y be an R-module which is finitely generated. Let x₁, …, xₙ be generators of Y. If for every y ∈ Y the relation

y = η₁x₁ + … + ηₙxₙ = β₁x₁ + … + βₙxₙ

implies that ηᵢ = βᵢ for all i = 1, …, n, then the set {x₁, …, xₙ} is called a basis for Y.
D. Overview

We conclude this section with the flow chart of Figure B, which attempts to put into perspective most of the algebraic systems considered thus far.

2.1.111. Figure B. Some basic structures of algebra. [Chart relating, among others, modules, commutative rings, integral domains, and associative and commutative algebras.]
2.2. HOMOMORPHISMS
Thus far we have concerned ourselves with various aspects of different mathematical systems (e.g., semigroups, groups, rings, etc.). In the present section we study special types of mappings defined on such algebraic structures. We begin by first considering mappings on semigroups.

2.2.1. Definition. Let {X; α} and {Y; β} be two semigroups (not necessarily distinct). A mapping ρ of set X into set Y is called a homomorphism of the semigroup {X; α} into the semigroup {Y; β} if

ρ(x α y) = ρ(x) β ρ(y)   (2.2.2)

for every x, y ∈ X. The image of X under ρ, denoted by ρ(X), is called the homomorphic image of X. If x ∈ X, then ρ(x) is called the homomorphic image of x.
In Figure C, the significance of Eq. (2.2.2) is depicted pictorially. From this figure and from Eq. (2.2.2) it is evident why homomorphisms are said to "preserve the operations α and β."

2.2.3. Figure C. Homomorphism of semigroup {X; α} into semigroup {Y; β}. [Diagram showing x, y, and x α y in X mapped by ρ to ρ(x), ρ(y), and ρ(x) β ρ(y) in Y.]
In the above definition we have used arbitrary semigroups {X; α} and {Y; β}. As mentioned in Section 2.1, it is often convenient to use the symbol "+" for operations. When using the notation {X; +} and {Y; +} to denote two different semigroups, it should of course be understood that the operation "+" associated with set X will, in general, be different from the operation "+" associated with set Y. Since it will usually be clear from context which particular operation is being used, the same symbol will be employed for both semigroups (however, on rare occasions we may wish to distinguish between different operations on different sets).

Using the notation {X; +} and {Y; +} in Definition 2.2.1, Eq. (2.2.2) now assumes the form

ρ(x + y) = ρ(x) + ρ(y)   (2.2.4)

for every x, y ∈ X. This relation looks very much like the "linearity property" which will be the central topic of a large portion of the remainder of this book, and with which the reader is no doubt familiar. However, we emphasize here that the definition of "linear" will be reserved for a later occasion, and that the term homomorphism is not to be taken as being synonymous with linear. Nevertheless, we will see that many of the subsequent results for homomorphisms will recur with appropriate counterparts throughout this book.
2.2.5. Example. Let R denote the set of real numbers, and let "+" and "·" denote the usual operations of addition and multiplication on R. Then {R; +} and {R; ·} are semigroups. Let

f(x) = eˣ

for all x ∈ R. Then f is a homomorphism from {R; +} to {R; ·}. ■
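The homomorphism property of Example 2.2.5 is just the law of exponents, f(x + y) = f(x) · f(y), and can be spot-checked numerically. A brief sketch (an added illustration, not part of the text):

```python
import math

def f(x):
    # f(x) = e^x maps the semigroup {R; +} into the semigroup {R; ·}
    return math.exp(x)

# homomorphism property: f(x + y) = f(x) · f(y), up to floating-point rounding
for x, y in [(0.0, 1.0), (2.5, -1.3), (-4.0, 4.0)]:
    assert math.isclose(f(x + y), f(x) * f(y), rel_tol=1e-12)
```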
2.2.6. Exercise. Let {X; +} and {X; ·} denote the semigroups defined in Example 2.1.17. Let f : X → X be defined as follows: f(0) = 1, f(1) = 3, f(2) = 1, and f(3) = 3. Show that f is a homomorphism from {X; +} into {X; ·}.
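Exercise 2.2.6 can be checked by brute force once the operation tables are in hand. Example 2.1.17 is not reproduced in this section, so the sketch below assumes, as a hypothesis, that the semigroups in question are X = {0, 1, 2, 3} under addition and multiplication modulo 4:

```python
# Assumption: Example 2.1.17 takes X = {0, 1, 2, 3} with x + y and x · y
# computed modulo 4; this is a hypothesis here, not given in this section.
f = {0: 1, 1: 3, 2: 1, 3: 3}

for x in range(4):
    for y in range(4):
        # homomorphism property: f(x + y) = f(x) · f(y)
        assert f[(x + y) % 4] == (f[x] * f[y]) % 4
```

Under this assumption the sixteen cases all pass, which is exactly the finite verification the exercise calls for.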
In order to simplify our notation even further, we will often use the symbol "·" in the remainder of the present chapter to denote operations for semigroups (or groups), say {X; ·}, {Y; ·}, and we will often refer to these simply as semigroup (or group) X and Y, respectively. In this case, if ρ denotes a homomorphism of X into Y, we write

ρ(x · y) = ρ(x) · ρ(y)

for all x, y ∈ X.
In Chapter 1 we classified mappings as being into, onto, one-to-one and into, and one-to-one and onto. Now if ρ is a homomorphism of a semigroup X into a semigroup Y, we can also classify homomorphisms as being into, onto, one-to-one and into, and one-to-one and onto. This classification gives rise to the following concepts.

2.2.7. Definition. Let ρ be a homomorphism of a semigroup X into a semigroup Y.
(i) If ρ is a mapping of X onto Y, we say that X and Y are homomorphic semigroups, and we refer to Y as being homomorphic to X.
(ii) If ρ is a one-to-one mapping of X into Y, then ρ is called an isomorphism of X into Y.
(iii) If ρ is a mapping which is onto and one-to-one, we say that semigroup X is isomorphic to semigroup Y.
(iv) If X = Y (i.e., ρ is a homomorphism of semigroup X into itself), then ρ is called an endomorphism.
(v) If X = Y and if ρ is an isomorphism (i.e., ρ is an isomorphism of semigroup X into itself), then ρ is called an automorphism of X.
We note that since all groups are semigroups, the concepts introduced in the above definition necessarily apply also to groups.

In connection with isomorphic semigroups (or groups) a very important observation is in order. We first note that if a semigroup (or group) X is isomorphic to a semigroup Y, then there exists a mapping ρ from X into Y which is one-to-one and onto. Thus, the inverse of ρ, ρ⁻¹, exists, and we can associate with each element of X one and only one element of Y, and vice versa. Secondly, we note that ρ is a homomorphism; i.e., ρ preserves the properties of the respective operations associated with semigroup (or group) X and semigroup (or group) Y or, to put it another way, under ρ the (algebraic) properties of semigroups (or groups) X and Y are preserved. Hence, it should be clear that isomorphic semigroups (or groups) are essentially indistinguishable, the homomorphism (which is one-to-one and onto in this case) amounting to a mere relabeling of elements of one set by elements of a second set. We will encounter this type of phenomenon on several other occasions in this book.

We are now ready to prove several results.
2.2.8. Theorem. Let ρ be a homomorphism from a semigroup X into a semigroup Y. Then
(i) ρ(X) is a subsemigroup of Y;
(ii) if X has an identity element e, then ρ(e) is an identity element of ρ(X);
(iii) if X has an identity element e, and if x ∈ X has an inverse x⁻¹, then ρ(x) has an inverse in ρ(X) and, in fact, [ρ(x)]⁻¹ = ρ(x⁻¹);
(iv) if X₁ is a subsemigroup of X, then ρ(X₁) is a subsemigroup of ρ(X); and
(v) if Y₁ is a subsemigroup of ρ(X), then

X₁ = {x ∈ X : ρ(x) ∈ Y₁}

is a subsemigroup of X.
Proof. To prove the first part we must show that the subset ρ(X) of Y is closed relative to the operation "·" on Y. Now if x′, y′ ∈ ρ(X), then there exists at least one x ∈ X and at least one y ∈ X such that ρ(x) = x′ and ρ(y) = y′. Since ρ is a homomorphism, we have

x′ · y′ = ρ(x) · ρ(y) = ρ(x · y),

and since x · y ∈ X it follows that x′ · y′ ∈ ρ(X) because ρ(x · y) ∈ ρ(X). Thus, ρ(X) is closed and, hence, is a subsemigroup of Y.

To prove the second part, note that since e ∈ X we have ρ(e) ∈ ρ(X), and since for any x′ ∈ ρ(X) there exists x ∈ X such that ρ(x) = x′, we have

ρ(e) · x′ = ρ(e) · ρ(x) = ρ(e · x) = ρ(x) = x′.

Since this is true for every x′ ∈ ρ(X), it follows that ρ(e) is a left identity element of ρ(X). Similarly, we can show that x′ · ρ(e) = x′ for every x′ ∈ ρ(X). Thus, ρ(e) is an identity element of the subsemigroup ρ(X) of Y.

To prove the third part of the theorem, note that since ρ is a homomorphism, we have

ρ(x) · ρ(x⁻¹) = ρ(x · x⁻¹) = ρ(e),

and

ρ(x⁻¹) · ρ(x) = ρ(x⁻¹ · x) = ρ(e);

i.e., ρ(e) is an identity element of ρ(X). Also, since ρ(x⁻¹) ∈ ρ(X), ρ(x) has an inverse in ρ(X), and [ρ(x)]⁻¹ = ρ(x⁻¹).

The proofs of parts (iv) and (v) of this theorem are left as an exercise. ■
2.2.9. Exercise. Complete the proof of Theorem 2.2.8.
We emphasize that although ρ(e) in the above theorem is an identity element of the subsemigroup ρ(X) of Y, it is not necessarily true that ρ(e) has to be an identity element of Y.
2.2.10. Definition. Let ρ be a homomorphism of a semigroup X into a semigroup Y. If ρ(X) has an identity element, say e′, then the subset of X, K_ρ, defined by

K_ρ = {x ∈ X : ρ(x) = e′}

is called the kernel of the homomorphism ρ.

It turns out that K_ρ is a semigroup; i.e., we have:
2.2.11. Theorem. K_ρ is a subsemigroup of X.

2.2.12. Exercise. Prove Theorem 2.2.11.
Now let X and Y be groups (instead of semigroups, as above), and let ρ be a homomorphism of X into Y. We have:

2.2.13. Theorem. Let ρ be a homomorphism from a group X into a group Y. Then
(i) ρ(X) is a subgroup of Y; and
(ii) if e is the identity element of X, then ρ(e) is the identity element of ρ(X).

Proof. To prove the first part, let e denote the identity element of X. By part (i) of Theorem 2.2.8, ρ(X) is a subsemigroup of Y; by part (ii) of Theorem 2.2.8, ρ(e) is an identity element of ρ(X); and by part (iii) of the same theorem, it follows that every element of ρ(X) has an inverse. Thus, ρ(X) is a subgroup of Y.

The second part of this theorem follows from Theorem 2.1.28 and from part (ii) of Theorem 2.2.8. ■
The following result is known as Cayley's theorem.

2.2.14. Theorem. Let {X; ·} be a group, and let {P(X); ∘} denote the permutation group on X. Then X is isomorphic to a subgroup of P(X).

Proof. For each a ∈ X, define the mapping f_a : X → X by f_a(x) = a · x for each x ∈ X. If x, y ∈ X and f_a(x) = f_a(y), then a · x = a · y, and so x = y. Hence, f_a is an injective mapping. Now let y ∈ X. Then a⁻¹ · y ∈ X, and so f_a(a⁻¹ · y) = y. This implies that f_a is surjective. Hence, f_a is a one-to-one mapping of X onto X, which implies that f_a is a permutation on X; i.e., f_a ∈ P(X). Now define the function φ : X → P(X) by φ(a) = f_a for each a ∈ X. Next, let u, v ∈ X. For each x ∈ X, f_{u·v}(x) = (u · v) · x = u · (v · x) = f_u(v · x) = f_u(f_v(x)) = (f_u ∘ f_v)(x). Thus, f_{u·v} = f_u ∘ f_v for all u, v ∈ X. Since φ(u · v) = f_{u·v} and φ(u) ∘ φ(v) = f_u ∘ f_v, it follows that φ(u · v) = φ(u) ∘ φ(v), and so φ is a homomorphism. Suppose u, v ∈ X are such that φ(u) = φ(v). Then f_u = f_v, which implies that f_u(x) = f_v(x) for all x ∈ X. In particular, f_u(e) = f_v(e). Hence, u · e = v · e, so that u = v. This implies that φ is injective. It follows that φ is a one-to-one mapping of X onto φ(X). By Theorem 2.2.13, part (i), φ(X) is a subgroup of P(X). This completes the proof. ■
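Cayley's theorem can be illustrated concretely. The sketch below (an added illustration, with the group taken to be {0, 1, 2, 3} under addition modulo 4) builds each permutation f_a as a tuple of images and checks that φ(a) = f_a is an injective homomorphism into the permutations of X:

```python
from itertools import product

# the group X = {0, 1, 2, 3} under addition modulo 4 (a concrete choice made here)
X = range(4)
op = lambda a, b: (a + b) % 4

# phi(a) = f_a, where f_a(x) = a · x; each permutation is stored as a tuple of images
phi = {a: tuple(op(a, x) for x in X) for a in X}

# each f_a is a permutation of X (injective and surjective on a finite set)
assert all(sorted(phi[a]) == list(X) for a in X)

# phi is a homomorphism: f_{u·v} = f_u ∘ f_v
for u, v in product(X, X):
    composed = tuple(phi[u][phi[v][x]] for x in X)
    assert phi[op(u, v)] == composed

# phi is injective, so X is isomorphic to the subgroup {f_a} of P(X)
assert len(set(phi.values())) == len(list(X))
```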
We also have:

2.2.15. Theorem. Let ρ be a homomorphism of a semigroup X into a semigroup Y, and let ρ be an isomorphism of X with ρ(X). Then
(i) ρ⁻¹ is an isomorphism of ρ(X) with X; and
(ii) if ρ(X) contains an identity element e′, then ρ⁻¹(e′) = e is an identity element of X, and K_ρ = {e} and K_{ρ⁻¹} = {e′} (K_ρ denotes the kernel of the homomorphism ρ).

Proof. To prove the first part of the theorem, let x′, y′ ∈ ρ(X). Then there exist unique x, y ∈ X such that ρ(x) = x′ and ρ(y) = y′, and ρ⁻¹(x′) = x and ρ⁻¹(y′) = y. Since

ρ(x · y) = ρ(x) · ρ(y) = x′ · y′,

we have

ρ⁻¹(x′ · y′) = x · y = ρ⁻¹(x′) · ρ⁻¹(y′).

Since this is true for all x′, y′ ∈ ρ(X), it follows that ρ⁻¹ is an isomorphism of ρ(X) with X.

To prove the second part of the theorem, we first note that ρ(X) is a subsemigroup of Y by Theorem 2.2.8. It follows from Theorem 2.2.13 that e = ρ⁻¹(e′) is an identity element of X. Now let ρ(k) = e′. Since ρ(e) = e′, it follows that k = e and that K_ρ = {e}. We can similarly show that K_{ρ⁻¹} = {e′}. ■

From the above result we can now conclude that if a semigroup X is isomorphic to a semigroup Y, then the semigroup Y is isomorphic to the semigroup X.
For endomorphisms and automorphisms we have:

2.2.16. Theorem. Let η and ψ be homomorphisms of a semigroup X into itself.
(i) If η and ψ are endomorphisms of X, then the composite mapping ψ ∘ η is likewise an endomorphism of X.
(ii) If η and ψ are automorphisms of X, then ψ ∘ η is an automorphism of X.
(iii) If η is an automorphism of X, then η⁻¹ is also an automorphism of X.

Proof. To prove the first part, note that η and ψ are both mappings of X into X, and thus ψ ∘ η is a mapping of X into X. Also, by definition, (ψ ∘ η)(x) = ψ(η(x)) for every x ∈ X. Now since η(x · y) = η(x) · η(y) and ψ(x · y) = ψ(x) · ψ(y) for every x, y ∈ X, we have

(ψ ∘ η)(x · y) = ψ(η(x · y)) = ψ(η(x) · η(y)) = ψ(η(x)) · ψ(η(y)) = (ψ ∘ η)(x) · (ψ ∘ η)(y).

This implies that the mapping ψ ∘ η is an endomorphism of X.

The proof of the second and third parts of this theorem is left as an exercise. ■

2.2.17. Exercise. Complete the proof of the above theorem.
Let us next consider homomorphisms of rings. To this end let, henceforth, X and Y be arbitrary rings, and without loss of generality let the operations of these two rings be denoted by "+" and "·".

2.2.18. Definition. Let X and Y be two rings. A mapping ρ of set X into set Y is called a homomorphism of the ring X into the ring Y if
(i) ρ(x + y) = ρ(x) + ρ(y); and
(ii) ρ(x · y) = ρ(x) · ρ(y)
for every x, y ∈ X. The image of X under ρ, denoted by ρ(X), is called the homomorphic image of X.

If a homomorphism ρ is a one-to-one mapping of a ring X into a ring Y, then ρ is called an isomorphism of X into Y. If the isomorphism ρ is an onto mapping of X into Y, then ρ is called an isomorphism of X with Y. Furthermore, if ρ is a homomorphism of X into X, then ρ is called an endomorphism of the ring X. Finally, an isomorphism of X with itself is called an automorphism of ring X.
The properties associated with homomorphisms of groups and semigroups can, of course, be utilized when discussing homomorphisms of rings.

2.2.19. Theorem. Let ρ be a homomorphism of a ring X into a ring Y.
(i) The homomorphic image ρ(X) is a subring of Y.
(ii) If X₁ is a subring of X, then ρ(X₁) is a subring of ρ(X).
(iii) Let Y₁ be a subring of ρ(X). Then the subset X₁ ⊂ X defined by

X₁ = {x ∈ X : ρ(x) ∈ Y₁}

is a subring of X.
(iv) Let Z be a ring, and let ψ be a homomorphism of Y into Z. Then the composite mapping ψ ∘ ρ is a homomorphism of X into Z.

Proof. To prove the first part of the theorem we note that the homomorphic image ρ(X) is clearly the homomorphic image of the group {X; +} and of the semigroup {X; ·}. Since this homomorphic image is a subgroup of {Y; +} and a subsemigroup of {Y; ·}, it follows from Theorem 2.1.72 that ρ(X) is a subring of Y.

The proofs of the remaining parts of this theorem are left as an exercise. ■

2.2.20. Exercise. Prove parts (ii), (iii), and (iv) of Theorem 2.2.19.
Analogous to Definition 2.2.10, we make the following definition.

2.2.21. Definition. If ρ is a homomorphism of a ring X into a ring Y, then the subset K_ρ of X defined by

K_ρ = {x ∈ X : ρ(x) = 0}

is called the kernel of the homomorphism ρ of the ring X into Y.
We close the present section by introducing one more concept.

2.2.22. Definition. Let {R; +, ·} be a ring with identity, and let X and Y be two R-modules. A mapping f : X → Y is called an R-homomorphism if, for all u, v ∈ X and α ∈ R, the relations
(i) f(u + v) = f(u) + f(v); and
(ii) f(αu) = αf(u)
hold.

In the next chapter we will consider in great detail a special class of vector spaces and homomorphisms, and for this reason we will not pursue this subject any further at this time.
2.3. APPLICATION TO POLYNOMIALS

Polynomials play an important role in many branches of mathematics as well as in science and engineering. In the present section we briefly consider applications of some of the concepts of the preceding sections to polynomials.
First, we wish to give an abstract definition for a polynomial function. Basically, we want this function to take the form

f(t) = a₀ + a₁t + … + aₙtⁿ.

However, we are not looking for a way of defining the value of f(t) for each t, but instead we seek a definition of f in terms of the indexed set {a₀, …, aₙ}. To this end we let the aᵢ belong to some field.

More formally, let F be a field and define a set P as follows. If a ∈ P, then a denotes an infinite sequence of elements from F in which all except a finite number are zero. Thus, if a ∈ P, then

a = {a₀, a₁, …, aₙ, 0, 0, …}.

That is to say, there exists some integer n ≥ 0 such that aᵢ = 0 for all i > n. Now let b be another element of P, where

b = {b₀, b₁, …, b_m, 0, 0, …}.

We say that a = b if and only if aᵢ = bᵢ for all i. We now define the operation "+" on P by

a + b = {a₀ + b₀, a₁ + b₁, …}.

Thus, if n ≥ m, then aᵢ + bᵢ = 0 for all i > n, and P is clearly closed with respect to "+". Next, we define the operation "·" on P by

a · b = c = {c₀, c₁, …},

where

c_k = a₀b_k + a₁b_{k−1} + … + a_kb₀

for all k. In this case c_k = 0 for all k > m + n, and P is also closed with respect to the operation "·". Now let us define

0 = {0, 0, …}.

Then 0 ∈ P, and {P; +} is clearly an abelian group with identity 0. Next, define

e = {1, 0, 0, …}.

Then e ∈ P, and {P; ·} is obviously a monoid with e as its identity element. We can now easily prove the following:

2.3.1. Theorem. The mathematical system {P; +, ·} is a commutative ring with identity. It is called the ring of polynomials over the field F.
2.3.2. Exercise. Prove Theorem 2.3.1.
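The ring operations on P can be mirrored directly in code: a finite list of coefficients {a₀, a₁, …} stands for a sequence that is zero from some index onward, "+" is componentwise, and "·" is the convolution defining c_k. This sketch is an added illustration of the definitions, not part of the text:

```python
def poly_add(a, b):
    # componentwise addition of coefficient sequences; the shorter list
    # is padded with zeros (the omitted tail of a finitely nonzero sequence)
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def poly_mul(a, b):
    # c_k = a_0*b_k + a_1*b_{k-1} + ... + a_k*b_0 (the "·" of the text)
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

# (1 + t) · (1 − t) = 1 − t²
assert poly_mul([1, 1], [1, -1]) == [1, 0, -1]

# {P; +} has identity 0, and {P; ·} has identity e = {1, 0, 0, ...}
assert poly_add([2, 3], [0]) == [2, 3]
assert poly_mul([2, 3], [1]) == [2, 3]
```

Commutativity of "·" reduces to commutativity of multiplication in F, which is why the construction yields a commutative ring.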
Let us next complete the connection between our abstract characterization of polynomials and the function f(t) we originally introduced. To this end we let

t⁰ = {1, 0, 0, …},
t = {0, 1, 0, 0, …},
t² = {0, 0, 1, 0, …},
t³ = {0, 0, 0, 1, 0, …},
etc. At this point we still cannot give meaning to aᵢtⁱ, because aᵢ ∈ F and tⁱ ∈ P. However, if we make the obvious identification aᵢ ≙ {aᵢ, 0, 0, …} ∈ P, and if we denote this element simply by aᵢ ∈ P, then we have

f(t) = a₀ · t⁰ + a₁ · t + … + aₙ · tⁿ.

Thus, we can represent f(t) uniquely by the sequence {a₀, a₁, …, aₙ, 0, …}. By convention, we henceforth omit the symbol "·" and write, e.g.,

f(t) = a₀ + a₁t + … + aₙtⁿ.

We assign the t appearing in the argument of f(t) a special name.

2.3.3. Definition. Let {P; +, ·} be the polynomial ring over a field F. The element t ∈ P, t = {0, 1, 0, …}, is called the indeterminate of P.

To simplify notation, we denote by F[t] the ring of polynomials over a field F, and we identify elements of F[t] (i.e., polynomials) by making use of the argument t, e.g., f(t) ∈ F[t].
2.3.4. Definition. Let f(t) ∈ F[t], and let f(t) = {f₀, f₁, …, fₙ, …} ≠ 0, where fᵢ ∈ F for all i. The polynomial f(t) is said to be of order n or of degree n if fₙ ≠ 0 and if fᵢ = 0 for all i > n. In this case we write deg f(t) = n, and we call fₙ the leading coefficient of f. If fₙ = 1 and fᵢ = 0 for all i > n, then f(t) is said to be monic.

If every coefficient of a polynomial f is zero, then f ≙ 0 is called the zero polynomial. The order of the zero polynomial is not defined.
2.3.5. Theorem. Let f(t) be a polynomial of order n, and let g(t) be a polynomial of order m. Then f(t)g(t) is a polynomial of order m + n.

Proof. Let f(t) = f₀ + f₁t + … + fₙtⁿ, let g(t) = g₀ + g₁t + … + g_mt^m, and let h(t) = f(t)g(t). Then

h_k = f₀g_k + f₁g_{k−1} + … + f_kg₀.

Since fᵢ = 0 for i > n and g_j = 0 for j > m, the largest possible value of k such that h_k is non-zero occurs for k = m + n; i.e.,

h_{m+n} = fₙg_m.

Since F is a field, fₙ and g_m cannot be zero divisors, and thus h_{m+n} = fₙg_m ≠ 0. Therefore, h_{m+n} ≠ 0, and h_k = 0 for all k > m + n. ■
The reader can readily prove the next result.

2.3.6. Theorem. The ring F[t] of polynomials over a field F is an integral domain.

2.3.7. Exercise. Prove Theorem 2.3.6.
Our next result shows that, in general, we cannot go any further than integral domain for F[t].

2.3.8. Theorem. Let f(t) ∈ F[t]. Then f(t) has an inverse relative to "·" if and only if f(t) is of order zero.

Proof. Let f(t) ∈ F[t] be of order n, and assume that f(t) has an inverse relative to "·", denoted by f⁻¹(t), which is of order m. Then

f(t)f⁻¹(t) = e,

where e = {1, 0, 0, …} is of order zero. By Theorem 2.3.5 the order of f(t)f⁻¹(t) is m + n. Thus, m + n = 0, and since m ≥ 0 and n ≥ 0, we must have m = n = 0.

Conversely, let f(t) = f₀ = {f₀, 0, 0, …}, where f₀ ≠ 0. Then f⁻¹(t) = f₀⁻¹ = {f₀⁻¹, 0, 0, …}. ■
In the case of polynomials of order zero we omit the notation t, and we say f(t) is a scalar. Thus, if c(t) is a polynomial of order zero, we have c(t) = c, where c ≠ 0. We see immediately that cf(t) = cf₀ + cf₁t + … + cfₙtⁿ for all f(t) ∈ F[t].

The following result, which we will require in Chapter 4, is sometimes called the division algorithm.
2.3.9. Theorem. Let f(t), g(t) ∈ F[t], and assume that g(t) ≠ 0. Then there exist unique elements q(t) and r(t) in F[t] such that

f(t) = q(t)g(t) + r(t),   (2.3.10)

where either r(t) = 0 or deg r(t) < deg g(t).

Proof. If f(t) = 0 or if deg f(t) < deg g(t), then Eq. (2.3.10) is satisfied with q(t) = 0 and r(t) = f(t). If deg g(t) = 0, i.e., g(t) = c, then f(t) = [c⁻¹f(t)]c, and Eq. (2.3.10) holds with q(t) = c⁻¹f(t) and r(t) = 0.

Assume now that deg f(t) ≥ deg g(t) ≥ 1. The proof is by induction on the degree of the polynomial f(t). Thus, let us assume that Eq. (2.3.10) holds for deg f(t) ≤ n. We first prove our assertion for n = 1 and then for n + 1.

Assume that deg f(t) = 1, i.e., f(t) = a₀ + a₁t, where a₁ ≠ 0. We need only consider the case g(t) = b₀ + b₁t, where b₁ ≠ 0. We readily see that Eq. (2.3.10) is satisfied with q(t) = a₁b₁⁻¹ and r(t) = a₀ − a₁b₁⁻¹b₀.

Now assume that Eq. (2.3.10) holds for deg f(t) = k, where k = 1, …, n. We want to show that this implies the validity of Eq. (2.3.10) for deg f(t) = n + 1. Let

f(t) = a₀ + a₁t + … + a_{n+1}t^{n+1},

where a_{n+1} ≠ 0. Let deg g(t) = m. We may assume that 0 < m ≤ n + 1. Let g(t) = b₀ + b₁t + … + b_mt^m, where b_m ≠ 0. It is now readily verified that

f(t) = [b_m⁻¹a_{n+1}t^{n+1−m}]g(t) + {f(t) − [b_m⁻¹a_{n+1}t^{n+1−m}]g(t)}.   (2.3.11)

Now let h(t) = f(t) − b_m⁻¹a_{n+1}t^{n+1−m}g(t). It can readily be verified that the coefficient of t^{n+1} in h(t) is 0. Hence, either h(t) = 0 or deg h(t) ≤ n. By our induction hypothesis, this implies there exist polynomials s(t) and r(t) such that h(t) = s(t)g(t) + r(t), where r(t) = 0 or deg r(t) < deg g(t). Substituting the expression for h(t) into Eq. (2.3.11), we have

f(t) = [b_m⁻¹a_{n+1}t^{n+1−m} + s(t)]g(t) + r(t).

Thus, Eq. (2.3.10) is satisfied, and the proof of the existence of r(t) and q(t) is complete.

The proof of the uniqueness of q(t) and r(t) is left as an exercise. ■
2.3.12. Exercise. Prove that q(t) and r(t) in Theorem 2.3.9 are unique.
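The construction used in the proof of Theorem 2.3.9 (repeatedly subtracting a multiple b_m⁻¹aₖt^{k−m}g(t) chosen to cancel the leading term) is ordinary polynomial long division. The sketch below implements it over the field of rationals; coefficient lists are written lowest degree first, and the function name and representation are choices made here, not the text's:

```python
from fractions import Fraction

def poly_divmod(f, g):
    # long division of coefficient lists over a field: returns (q, r)
    # with f = q*g + r and either r = 0 or deg r < deg g
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    r = f[:]
    while len(r) >= len(g) and any(r):
        # strip trailing zeros so the leading coefficient is genuine
        while r and r[-1] == 0:
            r.pop()
        if len(r) < len(g):
            break
        k = len(r) - len(g)
        coef = r[-1] / g[-1]          # b_m^{-1} * (leading coefficient of r)
        q[k] = coef
        for i, gi in enumerate(g):    # subtract coef * t^k * g(t)
            r[i + k] -= coef * gi
    return q, r

# divide f(t) = t³ − 1 by g(t) = t − 1: quotient t² + t + 1, remainder 0
q, r = poly_divmod([-1, 0, 0, 1], [-1, 1])
assert q == [1, 1, 1]
assert all(c == 0 for c in r)
```

Each pass through the loop lowers the degree of the running remainder, mirroring the induction step of the proof.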
The preceding result motivates the following definition.

2.3.13. Definition. Let f(t) and g(t) be any non-zero polynomials. Let q(t) and r(t) be the unique polynomials such that f(t) = q(t)g(t) + r(t), where either r(t) = 0 or deg r(t) < deg g(t). We call q(t) the quotient and r(t) the remainder in the division of f(t) by g(t). If r(t) = 0, we say that g(t) divides f(t) or is a factor of f(t).
Next, we prove:
2.3.14. Theorem. Let F[t] denote the ring of polynomials over a field F. Let f(t) and g(t) be non-zero polynomials in F[t]. Then there exists a unique monic polynomial, d(t), such that (i) d(t) divides f(t) and g(t), and (ii) if d′(t) is any polynomial which divides f(t) and g(t), then d′(t) divides d(t).

Proof. Let

L[t] = {x(t) ∈ F[t] : x(t) = m(t)f(t) + n(t)g(t), where m(t), n(t) ∈ F[t]}.

We note that f(t), g(t) ∈ L[t]. Furthermore, if a(t), b(t) ∈ L[t], then a(t) + b(t) ∈ L[t] and a(t)b(t) ∈ L[t]. Also, if c is a scalar, then ca(t) ∈ L[t] for all a(t) ∈ L[t]. Now let d(t) be a polynomial of lowest degree in L[t]. Since all scalar multiples of d(t) belong to L[t], we may assume that d(t) is monic. We now show that for any h(t) ∈ L[t], there is a q(t) ∈ F[t] such that h(t) = d(t)q(t). To prove this, we know from Theorem 2.3.9 that there exist unique elements q(t) and r(t) in F[t] such that h(t) = q(t)d(t) + r(t), where either r(t) = 0 or deg r(t) < deg d(t). Since d(t) ∈ L[t] and q(t) ∈ F[t], it follows that q(t)d(t) ∈ L[t]. Also, since h(t) ∈ L[t], it follows that r(t) = h(t) − q(t)d(t) ∈ L[t]. Since d(t) is a polynomial of smallest degree in L[t], it follows that r(t) = 0. Hence, d(t) divides every polynomial in L[t].

To show that d(t) is unique, suppose d₁(t) is another monic polynomial in L[t] which divides every polynomial in L[t]. Then d(t) = a(t)d₁(t) and d₁(t) = b(t)d(t) for some a(t), b(t) ∈ F[t]. It can readily be verified that this is true only when a(t) = b(t) = 1. Now, since f(t), g(t) ∈ L[t], part (i) of the theorem has been proven.

To prove part (ii), let a(t), b(t) ∈ F[t] be such that f(t) = a(t)d′(t) and g(t) = b(t)d′(t). Since d(t) ∈ L[t], there exist polynomials m(t), n(t) such that d(t) = m(t)f(t) + n(t)g(t). Hence,

d(t) = m(t)a(t)d′(t) + n(t)b(t)d′(t) = [m(t)a(t) + n(t)b(t)]d′(t).

This implies that d′(t) divides d(t) and completes the proof of the theorem. ■
The polynomial d(t) in the preceding theorem is called the greatest common divisor of f(t) and g(t). If d(t) = 1, then f(t) and g(t) are said to be relatively prime.

2.3.15. Exercise. Show that if d(t) is the greatest common divisor of f(t) and g(t), then there exist polynomials m(t) and n(t) such that

d(t) = m(t)f(t) + n(t)g(t).

If f(t) and g(t) are relatively prime, then

1 = m(t)f(t) + n(t)g(t).
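The greatest common divisor of Theorem 2.3.14 can be computed by applying the division algorithm of Theorem 2.3.9 repeatedly (the Euclidean algorithm). A sketch over the rationals, with coefficient lists written lowest degree first; this concrete implementation is an added illustration, not the text's:

```python
from fractions import Fraction

def pdivmod(f, g):
    # division with remainder over a field (coefficient lists, lowest degree first)
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    while f and f[-1] == 0:
        f.pop()
    while len(f) >= len(g):
        k = len(f) - len(g)
        c = f[-1] / g[-1]
        q[k] = c
        for i, gi in enumerate(g):
            f[i + k] -= c * gi
        while f and f[-1] == 0:
            f.pop()
    return q, f          # f is now the remainder, stripped of trailing zeros

def poly_gcd(f, g):
    # Euclidean algorithm; the result is normalized to be monic,
    # matching the uniqueness claim of Theorem 2.3.14
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    while any(g):
        _, r = pdivmod(f, g)
        f, g = g, r
    lead = f[-1]
    return [c / lead for c in f]

# gcd of (t−1)(t+1) = t² − 1 and (t−1)(t+2) = t² + t − 2 is t − 1
assert poly_gcd([-1, 0, 1], [-2, 1, 1]) == [-1, 1]
```

Keeping track of the quotients produced along the way yields the cofactors m(t) and n(t) of Exercise 2.3.15 (the extended Euclidean algorithm).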
Now let f(t) ∈ F[t] be of positive degree. If f(t) = g(t)h(t) implies that either g(t) is a scalar or h(t) is a scalar, then f(t) is said to be irreducible.

We close the present section with a statement of the fundamental theorem of algebra.
2.3.16. Theorem. Let f(t) ∈ F[t] be a non-zero polynomial. Let R denote the field of real numbers, and let C denote the field of complex numbers.
(i) If F = C, then f(t) can be written uniquely, except for order, as a product

f(t) = c(t − c₁)(t − c₂) ⋯ (t − cₙ),

where c, c₁, …, cₙ ∈ C.
(ii) If F = R, then f(t) can be written uniquely, except for order, as a product

f(t) = cf₁(t)f₂(t) ⋯ f_m(t),

where c ∈ R and the f₁(t), …, f_m(t) are monic irreducible polynomials of degree one or two.
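Part (i) of Theorem 2.3.16 can be illustrated for a quadratic: t² + 1 has no real roots (it is irreducible over R), yet over C the quadratic formula produces the factorization (t − i)(t + i). A brief sketch using Python's complex arithmetic (an added illustration, not part of the text):

```python
import cmath

# f(t) = t² + 1: irreducible over R, but over C it splits into linear factors
a, b, c = 1, 0, 1
disc = cmath.sqrt(b * b - 4 * a * c)       # complex square root of the discriminant
r1 = (-b + disc) / (2 * a)
r2 = (-b - disc) / (2 * a)

# both roots satisfy f(r) = 0, so f(t) = (t - r1)(t - r2) over C
assert abs(r1 * r1 + 1) < 1e-12 and abs(r2 * r2 + 1) < 1e-12
assert {r1, r2} == {1j, -1j}
```

Over R the same polynomial must be kept whole as a monic irreducible factor of degree two, exactly as part (ii) allows.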
2.4. REFERENCES AND NOTES

There are many excellent texts on abstract algebra. For an introductory exposition of this subject refer, e.g., to Birkhoff and MacLane [2.1], Hanneken [2.2], Hu [2.3], Jacobson [2.4], and McCoy [2.6]. The books by Birkhoff and MacLane and Jacobson are standard references. The texts by Hu and McCoy are very readable. The excellent presentation by Hanneken is concise, somewhat abstract, yet very readable. Polynomials over a field are treated extensively in these references. For a brief summary of the properties of polynomials over a field, refer also to Lipschutz [2.5].
REFERENCES

[2.1] G. BIRKHOFF and S. MACLANE, A Survey of Modern Algebra. New York: The Macmillan Company, 1965.
[2.2] C. B. HANNEKEN, Introduction to Abstract Algebra. Belmont, Calif.: Dickenson Publishing Co., Inc., 1968.
[2.3] S. T. HU, Elements of Modern Algebra. San Francisco, Calif.: Holden-Day, Inc., 1965.
[2.4] N. JACOBSON, Lectures in Abstract Algebra. New York: D. Van Nostrand Company, Inc., 1951.
[2.5] S. LIPSCHUTZ, Linear Algebra. New York: McGraw-Hill Book Company, 1968.
[2.6] N. H. McCOY, Fundamentals of Abstract Algebra. Boston: Allyn & Bacon, Inc., 1972.
3

VECTOR SPACES AND
LINEAR TRANSFORMATIONS
In Chapter 1 we considered the set-theoretic structure of mathematical systems, and in Chapter 2 we developed to various degrees of complexity the algebraic structure of mathematical systems. One of the mathematical systems introduced in Chapter 2 was the linear or vector space, a concept of great importance in mathematics and applications.

In the present chapter we further examine properties of linear spaces. Then we consider special types of mappings defined on linear spaces, called linear transformations, and establish several important properties of linear transformations.

In the next chapter we will concern ourselves with finite-dimensional vector spaces, and we will consider matrices, which are used to represent linear transformations on finite-dimensional vector spaces.
3.1. LINEAR SPACES

We begin by restating the definition of linear space.

3.1.1. Definition. Let X be a non-empty set, let F be a field, let "+" denote a mapping of X × X into X, and let "·" denote a mapping of F × X into X. Let the members x ∈ X be called vectors, let the elements α ∈ F be called scalars, let the operation "+" defined on X be called vector addition, and let the mapping "·" be called scalar multiplication or multiplication of vectors by scalars. Then for each x, y ∈ X there is a unique element, x + y ∈ X, called the sum of x and y, and for each x ∈ X and α ∈ F there is a unique element, αx ≙ α · x ∈ X, called the multiple of x by α. We say that the non-empty set X and the field F, along with the two mappings of vector addition and scalar multiplication, constitute a vector space or a linear space if the following axioms are satisfied:
(i) x + y = y + x for every x, y ∈ X;
(ii) x + (y + z) = (x + y) + z for every x, y, z ∈ X;
(iii) there is a unique vector in X, called the zero vector or the null vector or the origin, which is denoted by 0 and which has the property that 0 + x = x for all x ∈ X;
(iv) α(x + y) = αx + αy for all α ∈ F and for all x, y ∈ X;
(v) (α + β)x = αx + βx for all α, β ∈ F and for all x ∈ X;
(vi) (αβ)x = α(βx) for all α, β ∈ F and for all x ∈ X;
(vii) 0x = 0 for all x ∈ X; and
(viii) 1x = x for all x ∈ X.
The reader may find it instructive to review the axioms of a field, which are summarized in Definition 2.1.63. In (v) the "+" on the left-hand side denotes the operation of addition on F; the "+" on the right-hand side denotes vector addition. Also, in (vi) αβ = α · β, where "·" denotes the operation of multiplication on F. In (vii) the symbol 0 on the left-hand side is a scalar; the same symbol on the right-hand side denotes a vector. The 1 on the left-hand side of (viii) is the identity element of F relative to "·".

To indicate the relationship between the set of vectors X and the underlying field F, we sometimes refer to a vector space X over field F. However, usually we speak of a vector space X without making explicit reference to the field F and to the operations of vector addition and scalar multiplication. If F is the field of real numbers we call our vector space a real vector space. Similarly, if F is the field of complex numbers, we speak of a complex vector space. Throughout this chapter we will usually use lower case Latin letters (e.g., x, y, z) to denote vectors (i.e., elements of X) and lower case Greek letters (e.g., α, β, γ) to denote scalars (i.e., elements of F).

If we agree to denote the element (−1)x ∈ X simply by −x, i.e., (−1)x = −x, then we have x + (−1)x = 1x + (−1)x = (1 − 1)x = 0x = 0. Thus, if X is a vector space, then for every x ∈ X there is a unique vector, denoted −x, such that x + (−x) = 0. There are several other elementary properties of vector spaces which are a direct consequence of the above axioms. Some of these are summarized below. The reader will have no difficulties in verifying these.
3.1.2. Theorem. Let X be a vector space. If x, y, z are elements in X and if α, β are any members of F, then the following hold:
(i) if αx = αy and α ≠ 0, then x = y;
(ii) if αx = βx and x ≠ 0, then α = β;
(iii) if x + y = x + z, then y = z;
(iv) α0 = 0;
(v) α(x − y) = αx − αy;
(vi) (α − β)x = αx − βx; and
(vii) x + y = 0 implies that y = −x.

3.1.3. Exercise. Prove Theorem 3.1.2.
We now consider several important examples of vector spaces.

3.1.4. Example. Let X be the set of all "arrows" in the "plane" emanating from a reference point which we call the origin or the zero vector or the null vector, and which we denote by 0. Let F denote the set of real numbers, and let vector addition and scalar multiplication be defined in the usual way, as shown in Figure A.

3.1.5. Figure A. [Diagram depicting vectors x and y, the sum x + y, and scalar multiples αx for α > 1 and α < 0.]

The reader can readily verify that, for the space described above, all the axioms of a linear space are satisfied, and hence X is a vector space. ■

The purpose of the above example is to provide an intuitive idea of a linear space. We will utilize this space occasionally for purposes of motivation in our development. We must point out, however, that the terms "plane" and "arrows" were not formally defined, and thus the space X was not really properly defined. In the examples which follow, we give a more precise formulation of vector spaces.
Chapter 3 / Vector Spaces and Linear Transformations
3.1.6. Example. Let X = R denote the set of real numbers, and let F also denote the set of real numbers. We define vector addition to be the usual addition of real numbers and multiplication of vectors x ∈ R by scalars α ∈ F to be multiplication of real numbers. It is a simple matter to show that this space is a linear space.
3.1.7. Example. Let Xⁿ denote the set of all ordered n-tuples of elements from field F. Thus, if x ∈ Xⁿ, then x = (ξ₁, ξ₂, ..., ξₙ), where ξᵢ ∈ F, i = 1, ..., n. With x, y ∈ Xⁿ and α ∈ F, let vector addition and scalar multiplication be defined as

x + y = (ξ₁, ξ₂, ..., ξₙ) + (η₁, η₂, ..., ηₙ) ≜ (ξ₁ + η₁, ξ₂ + η₂, ..., ξₙ + ηₙ)   (3.1.8)

and

αx = α(ξ₁, ξ₂, ..., ξₙ) ≜ (αξ₁, αξ₂, ..., αξₙ).   (3.1.9)

It should be noted that the symbol "+" on the right-hand side of Eq. (3.1.8) denotes addition on the field F, and the symbol "+" on the left-hand side of Eq. (3.1.8) designates vector addition. (See Theorem 2.1.88.) In the present case the null vector is defined as 0 = (0, 0, ..., 0), and the vector −x is defined by −x = −(ξ₁, ξ₂, ..., ξₙ) = (−ξ₁, −ξ₂, ..., −ξₙ). Utilizing the properties of the field F, all axioms of Definition 3.1.1 are readily verified, and Xⁿ is thus a vector space. We call this space the space Fⁿ of n-tuples of elements of F.
3.1.10. Example. In Example 3.1.7 let F = R, the field of real numbers. Then X = Rⁿ denotes the set of all n-tuples of real numbers. We call the vector space Rⁿ the n-dimensional real coordinate space. Similarly, in Example 3.1.7 let F = C, the field of complex numbers. Then X = Cⁿ designates the set of all n-tuples of complex numbers. The linear space Cⁿ is called the n-dimensional complex coordinate space.

In the previous example we used the term dimension. At a later point in the present chapter the concept of dimension will be defined precisely and some of its properties will be examined in detail.
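The coordinate-space operations of Eqs. (3.1.8) and (3.1.9) can be sketched directly in code. The following minimal illustration (the helper names vec_add and scal_mul are ours, not from the text) spot-checks a few vector-space axioms in R³:

```python
# Vector addition and scalar multiplication in R^n, as in Eqs. (3.1.8)
# and (3.1.9). The names vec_add and scal_mul are illustrative only.

def vec_add(x, y):
    """Componentwise sum of two n-tuples, Eq. (3.1.8)."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def scal_mul(a, x):
    """Multiplication of an n-tuple by a scalar, Eq. (3.1.9)."""
    return tuple(a * xi for xi in x)

x = (1.0, 2.0, 3.0)
y = (4.0, 5.0, 6.0)
zero = (0.0, 0.0, 0.0)

# Spot-check a few of the vector-space axioms.
assert vec_add(x, y) == vec_add(y, x)                 # commutativity
assert vec_add(x, zero) == x                          # null vector
assert vec_add(x, scal_mul(-1.0, x)) == zero          # additive inverse -x
assert scal_mul(2.0, vec_add(x, y)) == vec_add(scal_mul(2.0, x), scal_mul(2.0, y))
```

The last assertion checks distributivity of scalar multiplication over vector addition; the remaining axioms can be checked in the same way.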
3.1.11. Example. Let X denote the set of all infinite sequences of real numbers of the form

x = (ξ₁, ξ₂, ..., ξₖ, ...),   (3.1.12)

let F denote the field of real numbers, let vector addition be defined similarly as in Eq. (3.1.8), and let scalar multiplication be defined similarly as in Eq. (3.1.9). It is again an easy matter to show that this space is a vector space. We point out that this space, which we denote by R^∞, is simply the collection of all infinite sequences; i.e., there is no requirement that any type of convergence of the sequence be implied.
3.1.13. Example. Let X = C^∞ denote the set of all infinite sequences of complex numbers of the form (3.1.12), let F represent the field of complex numbers, let vector addition be defined similarly as in Eq. (3.1.8), and let scalar multiplication be defined similarly as in Eq. (3.1.9). Then C^∞ is a vector space.
3.1.14. Example. Let X denote the set of all sequences of real numbers having only a finite number of non-zero terms. Thus, if x ∈ X, then

x = (ξ₁, ξ₂, ..., ξₗ, 0, 0, ...)   (3.1.15)

for some positive integer l. If we define vector addition similarly as in Eq. (3.1.8), if we define scalar multiplication similarly as in Eq. (3.1.9), and if we let F be the field of real numbers, then we can readily show that X is a real vector space. We call this space the space of finitely non-zero sequences.

If X denotes the set of all sequences of complex numbers of the form (3.1.15), and if vector addition and scalar multiplication are defined similarly as in equations (3.1.8) and (3.1.9), respectively, then X is again a vector space (a complex vector space).
3.1.16. Example. Let X be the set of infinite sequences of real numbers of the form (3.1.12), with the property that lim_{k→∞} ξₖ = 0. If F is the field of real numbers, if vector addition is defined similarly as in Eq. (3.1.8), and if scalar multiplication is defined similarly as in Eq. (3.1.9), then X is a vector space. This is so because the sum of two sequences which converge to zero also converges to zero, and because the scalar multiple of a sequence converging to zero also converges to zero.
3.1.17. Example. Let X be the set of infinite sequences of real numbers of the form (3.1.12) which are bounded. If vector addition and scalar multiplication are again defined similarly as in (3.1.8) and (3.1.9), respectively, and if F denotes the field of real numbers, then X is a vector space. This space is called the space of bounded real sequences.

There also exists, of course, a complex counterpart to this space, the space of bounded complex sequences.
3.1.18. Example. Let X denote the set of infinite sequences of real numbers of the form (3.1.12), with the property that Σ_{k=1}^∞ |ξₖ| < ∞. Let F be the field of real numbers, let vector addition be defined similarly as in (3.1.8), and let scalar multiplication be defined similarly as in Eq. (3.1.9). Then X is a vector space.
3.1.19. Example. Let X be the set of all real-valued continuous functions defined on the interval [a, b]. Thus, if x ∈ X, then x: [a, b] → R is a real, continuous function defined for all a ≤ t ≤ b. We note that x = y if and only if x(t) = y(t) for all t ∈ [a, b], and that the null vector is the function which is zero for all t ∈ [a, b]. Let F denote the field of real numbers, let α ∈ F, and let vector addition and scalar multiplication be defined pointwise by

(x + y)(t) = x(t) + y(t) for all t ∈ [a, b]   (3.1.20)

and

(αx)(t) = αx(t) for all t ∈ [a, b].   (3.1.21)

Then clearly x + y ∈ X whenever x, y ∈ X, αx ∈ X whenever α ∈ F and x ∈ X, and all the axioms of a vector space are satisfied. We call this vector space the space of real-valued continuous functions on [a, b], and we denote it by C[a, b].
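The pointwise operations (3.1.20) and (3.1.21) can likewise be sketched with functions modeled as Python callables; the helper names f_add and f_scale are illustrative, not from the text:

```python
# Pointwise vector addition and scalar multiplication in C[a, b],
# as in Eqs. (3.1.20) and (3.1.21). Functions are modeled as callables;
# the helper names f_add and f_scale are illustrative only.

def f_add(x, y):
    """(x + y)(t) = x(t) + y(t)."""
    return lambda t: x(t) + y(t)

def f_scale(a, x):
    """(a x)(t) = a * x(t)."""
    return lambda t: a * x(t)

x = lambda t: t * t        # continuous on any [a, b]
y = lambda t: 3.0 * t
zero = lambda t: 0.0       # the null vector of C[a, b]

s = f_add(x, y)
assert s(2.0) == x(2.0) + y(2.0)             # (x + y)(2) = x(2) + y(2)
assert f_add(x, zero)(1.5) == x(1.5)         # adding the null vector
assert f_scale(2.0, x)(3.0) == 2.0 * x(3.0)  # scalar multiple, pointwise
```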
3.1.22. Example. Let X be the set of all real-valued functions x defined on the interval [a, b] such that

∫ₐᵇ |x(t)| dt < ∞,

where integration is taken in the Riemann sense. Let F denote the field of real numbers, and let vector addition and scalar multiplication be defined as in equations (3.1.20) and (3.1.21), respectively. We can readily verify that X is a vector space.
3.1.23. Example. Let X denote the set of all real-valued polynomials defined on the interval [a, b], let F be the field of real numbers, and let vector addition and scalar multiplication be defined as in equations (3.1.20) and (3.1.21), respectively. We note that the null vector is the function which is zero for all t ∈ [a, b], and also, if x(t) is a polynomial, then so is −x(t). Furthermore, we observe that the sum of two polynomials is again a polynomial, and that a scalar multiple of a polynomial is also a polynomial. We can now readily verify that X is a linear space.
3.1.24. Example. Let X denote the set of real numbers between −a < 0 and a > 0; i.e., if x ∈ X, then x ∈ (−a, a). Let F be the field of real numbers. Let vector addition and scalar multiplication be as defined in Example 3.1.6. Now, if α ∈ F is such that α > 1, then for x ∈ X sufficiently close to a we have αx > a, and thus αx ∉ X. From this it follows that X is not a vector space.
Vector spaces such as those encountered in Examples 3.1.19, 3.1.22, and 3.1.23 are called function spaces. In Chapter 6 we will consider some additional linear spaces.
3.1.25. Exercise. Verify the assertions made in Examples 3.1.6, 3.1.7, 3.1.10, 3.1.11, 3.1.13, 3.1.14, 3.1.16, 3.1.17, 3.1.18, 3.1.19, 3.1.22, and 3.1.23.
3.2. LINEAR SUBSPACES AND DIRECT SUMS
We first introduce the notion of a linear subspace. (See also Definition 2.1.102.)

3.2.1. Definition. A non-empty subset Y of a vector space X is called a linear manifold or a linear subspace in X if (i) x + y is in Y whenever x and y are in Y, and (ii) αx is in Y whenever α ∈ F and x ∈ Y.
It is an easy matter to verify that a linear manifold Y satisfies all the axioms of a vector space and may as such be regarded as a linear space itself.
3.2.2. Example. The set consisting of the null vector 0 is a linear subspace; i.e., the set {0} is a linear subspace. Also, the vector space X is a linear subspace of itself. If a linear subspace Y is not all of X, then we say that Y is a proper subspace of X.
3.2.3. Example. The set of all real-valued polynomials defined on the interval [a, b] (see Example 3.1.23) is a linear subspace of the vector space consisting of all real-valued continuous functions defined on the interval [a, b] (see Example 3.1.19).
Concerning linear subspaces we now state and prove the following result.

3.2.4. Theorem. Let Y and Z be linear subspaces of a vector space X. The intersection of Y and Z, Y ∩ Z, is also a linear subspace of X.

Proof. Since Y and Z are linear subspaces, it follows that 0 ∈ Y and 0 ∈ Z, and thus 0 ∈ Y ∩ Z. Hence, Y ∩ Z is non-empty. Now let α, β ∈ F, and let x, y ∈ Y ∩ Z, so that x, y ∈ Y and x, y ∈ Z. Then αx + βy ∈ Y and also αx + βy ∈ Z, because Y and Z are both linear subspaces. Hence, αx + βy ∈ Y ∩ Z, and Y ∩ Z is a linear subspace of X.
We can extend the above theorem to a more general result.

3.2.5. Theorem. Let X be a vector space, and let Yᵢ be a linear subspace of X for every i ∈ I, where I denotes some index set. Then ∩_{i∈I} Yᵢ is a linear subspace of X.

3.2.6. Exercise. Prove Theorem 3.2.5.
Now consider in the vector space X of Example 3.1.4 the subsets Y and Z consisting of two lines intersecting at the origin 0, as shown in Figure B. Clearly, Y and Z are linear subspaces of the vector space X. On the other hand, the union of Y and Z, Y ∪ Z, obviously does not contain arbitrary sums αy + βz, where α, β ∈ F and y ∈ Y and z ∈ Z. From this it follows that if Y and Z are linear subspaces then, in general, the union Y ∪ Z is not a linear subspace of X.

3.2.7. Figure B
3.2.8. Definition. Let X be a linear space, and let Y and Z be arbitrary subsets of X. The sum of sets Y and Z, denoted by Y + Z, is the set of all vectors x in X which are of the form x = y + z, where y ∈ Y and z ∈ Z.
The above concept is depicted pictorially in Figure C by utilizing the vector space X of Example 3.1.4. With the aid of our next result we can generate various linear subspaces.

3.2.9. Figure C. Sum of Two Subsets.
3.2.10. Theorem. Let Y and Z be linear subspaces of a vector space X. Then their sum, Y + Z, is also a linear subspace of X.

3.2.11. Exercise. Prove Theorem 3.2.10.

Now let Y and Z be linear subspaces of a vector space X. If Y ∩ Z = {0}, we say that the spaces Y and Z are disjoint. We emphasize that this terminology is not consistent with that used in connection with sets. We now have:
3.2.12. Theorem. Let Y and Z be linear subspaces of a vector space X. Then for every x ∈ Y + Z there exist unique elements y ∈ Y and z ∈ Z such that x = y + z if and only if Y ∩ Z = {0}.

Proof. Let x ∈ Y + Z be such that x = y₁ + z₁ = y₂ + z₂, where y₁, y₂ ∈ Y and where z₁, z₂ ∈ Z. Then clearly y₁ − y₂ = z₂ − z₁. Now y₁ − y₂ ∈ Y and z₂ − z₁ ∈ Z, and since by assumption Y ∩ Z = {0}, it follows that y₁ − y₂ = 0 and z₂ − z₁ = 0, i.e., y₁ = y₂ and z₁ = z₂. Thus, every x ∈ Y + Z has a unique representation x = y + z, where y ∈ Y and z ∈ Z, provided that Y ∩ Z = {0}.

Conversely, let us assume that for each x = y + z ∈ Y + Z the y ∈ Y and the z ∈ Z are uniquely determined. Let us further assume that the linear subspaces Y and Z are not disjoint. Then there exists a non-zero vector v ∈ Y ∩ Z. In this case we can write x = y + z = y + z + αv − αv = (y + αv) + (z − αv) for all α ∈ F. But this implies that y and z are not unique, which is a contradiction to our hypothesis. Hence, the spaces Y and Z must be disjoint.
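Theorem 3.2.12 can be illustrated in R² (Example 3.1.10): if Y and Z are the disjoint lines spanned by two non-collinear vectors u and v, every x ∈ R² = Y + Z has exactly one decomposition x = y + z. A sketch, with u, v, and the helper name decompose chosen for illustration:

```python
# Decomposing x in R^2 into its unique components along two disjoint
# one-dimensional subspaces Y = span{u} and Z = span{v} (Theorem 3.2.12).
# The directions u, v and the helper name are illustrative choices.

def decompose(x, u, v):
    """Solve alpha*u + beta*v = x by Cramer's rule (u, v non-collinear)."""
    det = u[0] * v[1] - u[1] * v[0]
    assert det != 0, "u and v must span disjoint lines"
    alpha = (x[0] * v[1] - x[1] * v[0]) / det
    beta = (u[0] * x[1] - u[1] * x[0]) / det
    y = (alpha * u[0], alpha * u[1])   # the component in Y
    z = (beta * v[0], beta * v[1])     # the component in Z
    return y, z

u, v = (1.0, 0.0), (1.0, 1.0)          # two non-collinear directions
x = (3.0, 2.0)
y, z = decompose(x, u, v)
assert (y[0] + z[0], y[1] + z[1]) == x  # x = y + z
```

Since the 2 × 2 determinant is non-zero exactly when the lines are disjoint, the solution (and hence the decomposition) is unique, mirroring the theorem.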
Theorem 3.2.10 is readily extended to any number of linear subspaces of X. Specifically, if Y₁, ..., Yᵣ are linear subspaces of X, then Y₁ + ... + Yᵣ is also a linear subspace of X. This enables us to introduce the following:

3.2.13. Definition. Let Y₁, ..., Yᵣ be linear subspaces of the vector space X. The sum Y₁ + ... + Yᵣ is said to be a direct sum if for each x ∈ Y₁ + ... + Yᵣ there is a unique set of yᵢ ∈ Yᵢ, i = 1, ..., r, such that x = y₁ + ... + yᵣ. We denote the direct sum of Y₁, ..., Yᵣ by Y₁ ⊕ ... ⊕ Yᵣ.
There is a connection between the Cartesian product of two vector spaces and their direct sum. Let Y and Z be two arbitrary linear spaces over the same field F, and let V = Y × Z. Thus, if v ∈ V, then v is the ordered pair

v = (y, z),

where y ∈ Y and z ∈ Z. Now let us define vector addition as

(y₁, z₁) + (y₂, z₂) = (y₁ + y₂, z₁ + z₂)   (3.2.14)

and scalar multiplication as

α(y, z) = (αy, αz),   (3.2.15)

where (y₁, z₁), (y₂, z₂) ∈ Y × Z and where α ∈ F. Noting that for each vector (y, z) ∈ V there is a vector −(y, z) = (−y, −z) ∈ V, and observing that (0, 0) + (y, z) = (y, z) for all elements in V, it is an easy matter to show that the space V = Y × Z is a linear space. We note that Y is not a linear subspace of V because, in fact, it is not even a subset of V. However, if we let

Y' = {(y, 0): y ∈ Y}

and

Z' = {(0, z): z ∈ Z},

then Y' and Z' are linear subspaces of V and V = Y' ⊕ Z'. By abuse of notation, we frequently express this simply as V = Y ⊕ Z.
Once more, making use of Example 3.1.4, let Y and Z denote two lines intersecting at the origin 0, as shown in Figure D. The direct sum of the linear subspaces Y and Z is in this case the "entire plane."

3.2.16. Figure D
In order that a subset be a linear subspace of a vector space, it is necessary that this subset contain the null vector. Thus, in Figure D, the lines Y and Z passing through the origin 0 are linear subspaces of the plane (see Example 3.1.4). In many applications this requirement is too restrictive and a generalization is called for. We have:
3.2.17. Definition. Let Y be a linear subspace of a vector space X, and let x₀ be a fixed vector in X. We call the translation

x₀ + Y = {z ∈ X : z = x₀ + y, y ∈ Y}

a linear variety or a flat or an affine linear subspace of X.
In Figure E an example of a linear variety is given for the vector space of Example 3.1.4.

3.2.18. Figure E
3.3. LINEAR INDEPENDENCE, BASES, AND DIMENSION
Throughout the remainder of this chapter and in the following chapter we use the following notation: {α₁, ..., αₙ}, αᵢ ∈ F, denotes an indexed set of scalars, and {x₁, ..., xₙ}, xᵢ ∈ X, denotes an indexed set of vectors.

Before introducing the notions of linear dependence and independence of a set of vectors in a linear space X, we first consider the following.
3.3.1. Definition. Let Y be a set in a linear space X (Y may be a finite set or an infinite set). We say that a vector x ∈ X is a finite linear combination of vectors in Y if there is a finite set of elements {y₁, y₂, ..., yₙ} in Y and a finite set of scalars {α₁, α₂, ..., αₙ} in F such that

x = α₁y₁ + α₂y₂ + ... + αₙyₙ.   (3.3.2)
In E. (3.3.2) vector addition has been etended in an obvious way Irom
the case oI two vectors to the case oI n vectors. In later chapters we will
consider linear combinations which are not necessarily Iinite. The represen
86 Chapter 3 I Vector Spaces and inear TransIormations
tation oI in E . (3.3.2) is, oI course, not necessarily uniue. Thus, in the
case oI E ample 3.1.10, iI RZ and iI (1, 1), then can be represented
as
or as
PIZI p 2( ,0) 3(0, ),
etc. This situation is depicted in igure .
3.3.3. Figure F. (The vector x = (1, 1) represented as 1·(1, 0) + 1·(0, 1), as 2·(1/2, 0) + 3·(0, 1/3), etc.)
3.3.4. Theorem. Let Y be a non-empty subset of a linear space X. Let V(Y) be the set of all finite linear combinations of the vectors from Y; i.e., x ∈ V(Y) if and only if there is some set of scalars {α₁, ..., αₘ} and some finite subset {y₁, ..., yₘ} of Y such that

x = α₁y₁ + ... + αₘyₘ,

where m may be any positive integer. Then V(Y) is a linear subspace of X.

3.3.5. Exercise. Prove Theorem 3.3.4.
Our previous result motivates the following concepts.

3.3.6. Definition. We say the linear space V(Y) in Theorem 3.3.4 is the linear subspace generated by the set Y.

3.3.7. Definition. Let Z be a linear subspace of a vector space X. If there exists a set of vectors Y ⊂ X such that the linear space V(Y) generated by Y is Z, then we say Y spans Z.
If, in particular, the space X of Example 3.1.4 is considered and if V and W are linear subspaces of X as depicted in Figure G, then the set Y = {e₁} spans W, the set Z = {e₂} spans V, and the set M = {e₁, e₂} spans the vector space X. The set N = {e₁, e₂, e₃} also spans the vector space X.
3.3.8. Figure G. V and W are Lines Intersecting at Origin 0.
3.3.9. Exercise. Show that V(Y) is the smallest linear subspace of a vector space X containing the subset Y of X. Specifically, show that if Z is a linear subspace of X and if Z contains Y, then Z also contains V(Y).

And now the important notion of linear dependence.
3.3.10. Definition. Let {x₁, x₂, ..., xₘ} be a finite non-empty set in a linear space X. If there exist scalars α₁, ..., αₘ ∈ F, not all zero, such that

α₁x₁ + ... + αₘxₘ = 0,   (3.3.11)

then the set {x₁, x₂, ..., xₘ} is said to be linearly dependent. If a set is not linearly dependent, then it is said to be linearly independent. In this case the relation (3.3.11) implies that α₁ = α₂ = ... = αₘ = 0. An infinite set of vectors Y in X is said to be linearly independent if every finite subset of Y is linearly independent.

Note that the null vector cannot be contained in a set which is linearly independent. Also, if a set of vectors contains a linearly dependent subset, then the whole set is linearly dependent.
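For vectors given by coordinates in Rⁿ, Definition 3.3.10 can be tested mechanically: {x₁, ..., xₘ} is linearly independent if and only if the m × n array of coordinates has rank m. The following sketch uses a standard Gaussian-elimination rank test over the rationals (our choice of method, not a construction from the text):

```python
# Testing linear dependence of vectors in R^n (Definition 3.3.10):
# {x_1, ..., x_m} is independent iff the rank of the m x n array of
# coordinates equals m. Exact arithmetic via Fraction avoids round-off.
from fractions import Fraction

def rank(rows):
    """Row rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(a) for a in r] for r in rows]
    r = 0
    for c in range(len(rows[0]) if rows else 0):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def is_independent(vectors):
    return rank(vectors) == len(vectors)

assert is_independent([(1, 0, 0), (0, 1, 0), (0, 0, 1)])  # standard basis of R^3
assert not is_independent([(1, 2), (2, 4)])               # (2, 4) = 2*(1, 2)
assert not is_independent([(0, 0), (1, 5)])               # contains the null vector
```

The last two assertions illustrate the two remarks above: a scalar multiple creates dependence, and any set containing the null vector is dependent.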
If X denotes the space of Example 3.1.4, the set of vectors {x, y} in Figure H is linearly independent, while the set of vectors {u, v} is linearly dependent.

3.3.12. Figure H. Linearly Independent and Linearly Dependent Vectors.
3.3.13. Exercise. Let X = C[a, b], the set of all real-valued continuous functions on [a, b], where b > a. As we saw in Example 3.1.19, this set forms a vector space. Let n be a fixed positive integer, and let us define xᵢ ∈ X for i = 0, 1, 2, ..., n, as follows. For all t ∈ [a, b], let

x₀(t) = 1

and

xᵢ(t) = tⁱ, i = 1, ..., n.

Let Y = {x₀, x₁, ..., xₙ}. Then V(Y) is the set of all polynomials on [a, b] of degree less than or equal to n.

(a) Show that Y is a linearly independent set in X.
(b) Let Yᵢ = {xᵢ}, i = 0, 1, ..., n; i.e., each Yᵢ is a singleton subset of Y. Show that

V(Y) = V(Y₀) ⊕ V(Y₁) ⊕ ... ⊕ V(Yₙ).

(c) Let z₀(t) = 1 for all t ∈ [a, b], and let

zₖ(t) = 1 + t + ... + tᵏ

for all t ∈ [a, b] and k = 1, ..., n. Show that Z = {z₀, z₁, ..., zₙ} is a linearly independent set in V(Y).
3.3.14. Theorem. Let {x₁, x₂, ..., xₘ} be a linearly independent set in a vector space X. If α₁x₁ + ... + αₘxₘ = β₁x₁ + ... + βₘxₘ, then αᵢ = βᵢ for all i = 1, 2, ..., m.

Proof. If α₁x₁ + ... + αₘxₘ = β₁x₁ + ... + βₘxₘ, then (α₁ − β₁)x₁ + ... + (αₘ − βₘ)xₘ = 0. Since the set {x₁, ..., xₘ} is linearly independent, we have (αᵢ − βᵢ) = 0 for all i = 1, ..., m. Therefore, αᵢ = βᵢ for all i.
The next result provides us with an alternate way of defining linear dependence.
3.3.15. Theorem. A set of vectors {x₁, x₂, ..., xₘ} in a linear space X is linearly dependent if and only if for some index i, 1 ≤ i ≤ m, we can find scalars α₁, ..., αᵢ₋₁, αᵢ₊₁, ..., αₘ such that

xᵢ = α₁x₁ + ... + αᵢ₋₁xᵢ₋₁ + αᵢ₊₁xᵢ₊₁ + ... + αₘxₘ.   (3.3.16)

Proof. Assume that Eq. (3.3.16) is satisfied. Then

α₁x₁ + ... + αᵢ₋₁xᵢ₋₁ + (−1)xᵢ + αᵢ₊₁xᵢ₊₁ + ... + αₘxₘ = 0.

Thus, αᵢ = −1 ≠ 0 is a non-trivial choice of coefficient for which Eq. (3.3.11) holds, and therefore the set {x₁, x₂, ..., xₘ} is linearly dependent.

Conversely, assume that the set {x₁, x₂, ..., xₘ} is linearly dependent. Then there exist coefficients α₁, ..., αₘ, not all zero, such that

α₁x₁ + α₂x₂ + ... + αₘxₘ = 0.   (3.3.17)

Suppose that index i is chosen such that αᵢ ≠ 0. Rearranging Eq. (3.3.17) to

−αᵢxᵢ = α₁x₁ + ... + αᵢ₋₁xᵢ₋₁ + αᵢ₊₁xᵢ₊₁ + ... + αₘxₘ   (3.3.18)

and multiplying both sides of Eq. (3.3.18) by −1/αᵢ, we obtain

xᵢ = β₁x₁ + ... + βᵢ₋₁xᵢ₋₁ + βᵢ₊₁xᵢ₊₁ + ... + βₘxₘ,

where βₖ = −αₖ/αᵢ, k = 1, ..., i − 1, i + 1, ..., m. This concludes our proof.
The proof of the next result is left as an exercise.

3.3.19. Theorem. A finite non-empty set Y in a linear space X is linearly independent if and only if for each y ∈ V(Y), y ≠ 0, there is a unique finite subset of Y, say {x₁, x₂, ..., xₘ}, and a unique set of scalars {α₁, α₂, ..., αₘ}, such that

y = α₁x₁ + α₂x₂ + ... + αₘxₘ.

3.3.20. Exercise. Prove Theorem 3.3.19.
3.3.21. Exercise. Let Y be a finite set in a linear space X. Show that Y is linearly independent if and only if there is no proper subset Z of Y such that V(Z) = V(Y).
A concept which is of utmost importance in the study of vector spaces is that of a basis of a linear space.

3.3.22. Definition. A set Y in a linear space X is called a Hamel basis, or simply a basis, for X if

(i) Y is linearly independent; and
(ii) the span of Y is the linear space X itself; i.e., V(Y) = X.
As an immediate consequence of this definition we have:

3.3.23. Theorem. Let X be a linear space, and let Y be a linearly independent set in X. Then Y is a basis for V(Y).

3.3.24. Exercise. Prove Theorem 3.3.23.
In order to introduce the notion of dimension of a vector space we show that if a linear space X is generated by a finite number of linearly independent elements, then this number of elements must be unique. We first prove the following result.
3.3.25. Theorem. Let {x₁, x₂, ..., xₙ} be a basis for a linear space X. Then for each vector x ∈ X there exist unique scalars α₁, ..., αₙ such that

x = α₁x₁ + ... + αₙxₙ.

Proof. Since x₁, ..., xₙ span X, every vector x ∈ X can be expressed as a linear combination of them; i.e.,

x = α₁x₁ + α₂x₂ + ... + αₙxₙ

for some choice of scalars α₁, ..., αₙ. We now must show that these scalars are unique. To this end, suppose that

x = α₁x₁ + α₂x₂ + ... + αₙxₙ

and

x = β₁x₁ + β₂x₂ + ... + βₙxₙ.

Then

0 = x − x = (α₁x₁ + α₂x₂ + ... + αₙxₙ) − (β₁x₁ + β₂x₂ + ... + βₙxₙ) = (α₁ − β₁)x₁ + (α₂ − β₂)x₂ + ... + (αₙ − βₙ)xₙ.

Since the vectors x₁, x₂, ..., xₙ form a basis for X, it follows that they are linearly independent, and therefore we must have (αᵢ − βᵢ) = 0 for i = 1, ..., n. From this it follows that α₁ = β₁, α₂ = β₂, ..., αₙ = βₙ.
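Theorem 3.3.25 can be illustrated numerically: the coefficients of x with respect to a basis {x₁, x₂} of R² are the unique solution of a 2 × 2 linear system, solved below by Cramer's rule. The basis and the helper name coordinates are illustrative choices, not from the text:

```python
# Coordinates of x with respect to a basis {x1, x2} of R^2 (Theorem 3.3.25):
# the scalars a1, a2 with x = a1*x1 + a2*x2 are the unique solution of a
# 2 x 2 linear system, solved here by Cramer's rule.

def coordinates(x, x1, x2):
    det = x1[0] * x2[1] - x1[1] * x2[0]   # nonzero, since {x1, x2} is a basis
    a1 = (x[0] * x2[1] - x[1] * x2[0]) / det
    a2 = (x1[0] * x[1] - x1[1] * x[0]) / det
    return a1, a2

x1, x2 = (1.0, 1.0), (1.0, -1.0)   # an illustrative basis of R^2
x = (5.0, 1.0)
a1, a2 = coordinates(x, x1, x2)
# Reconstruct x from its coordinates: x = a1*x1 + a2*x2.
assert (a1 * x1[0] + a2 * x2[0], a1 * x1[1] + a2 * x2[1]) == x
assert (a1, a2) == (3.0, 2.0)
```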
We also have:
3.3.26. Theorem. Let {x₁, x₂, ..., xₙ} be a basis for a vector space X, and let {y₁, ..., yₘ} be any linearly independent set of vectors in X. Then m ≤ n.

Proof. We need to consider only the case m ≥ n and prove that then we actually have m = n. Consider the set of vectors {y₁, x₁, ..., xₙ}. Since the vectors x₁, ..., xₙ span X, y₁ can be expressed as a linear combination of them. Thus, the set {y₁, x₁, ..., xₙ} is not linearly independent. Therefore, there exist scalars β₁, α₁, ..., αₙ, not all zero, such that

β₁y₁ + α₁x₁ + ... + αₙxₙ = 0.   (3.3.27)

If all the αᵢ are zero, then β₁ ≠ 0 and β₁y₁ = 0. Thus, we can write

β₁y₁ + 0·y₂ + ... + 0·yₘ = 0.

But this contradicts the hypothesis of the theorem and cannot happen, because the y₁, ..., yₘ are linearly independent. Therefore, at least one of the αᵢ ≠ 0. Renumbering the xᵢ, if necessary, we can assume that αₙ ≠ 0. Solving for xₙ we now obtain

xₙ = (−β₁/αₙ)y₁ + (−α₁/αₙ)x₁ + ... + (−αₙ₋₁/αₙ)xₙ₋₁.   (3.3.28)

Now we show that the set {y₁, x₁, ..., xₙ₋₁} is also a basis for X. Since {x₁, ..., xₙ} is a basis for X, for every x ∈ X there exist ξ₁, ..., ξₙ ∈ F such that

x = ξ₁x₁ + ... + ξₙxₙ.

Substituting (3.3.28) into the above expression we note that

x = ηy₁ + η₁x₁ + ... + ηₙ₋₁xₙ₋₁,   (3.3.29)

where η and the ηᵢ are defined in an obvious way. In any case, every x ∈ X can be expressed as a linear combination of the set of vectors {y₁, x₁, ..., xₙ₋₁}, and thus this set must span X. To show that this set is also linearly independent, let us assume that there are scalars λ, λ₁, ..., λₙ₋₁ such that

λy₁ + λ₁x₁ + ... + λₙ₋₁xₙ₋₁ = 0,

and assume that λ ≠ 0. Then

y₁ = (−λ₁/λ)x₁ + ... + (−λₙ₋₁/λ)xₙ₋₁.

In view of Eq. (3.3.27) we have, since β₁ ≠ 0, the relation

y₁ = (−α₁/β₁)x₁ + ... + (−αₙ₋₁/β₁)xₙ₋₁ + (−αₙ/β₁)xₙ.   (3.3.30)

Now the term (−αₙ/β₁)xₙ in Eq. (3.3.30) is not zero, because we solved for xₙ in Eq. (3.3.28); yet the coefficient multiplying xₙ in the preceding expression for y₁ is zero. Since {x₁, ..., xₙ} is a basis, we have arrived at a contradiction, in view of Theorem 3.3.25. Therefore, we must have λ = 0. Thus, we have

λ₁x₁ + ... + λₙ₋₁xₙ₋₁ = 0,

and since {x₁, ..., xₙ₋₁} is a linearly independent set it follows that λ₁ = 0, ..., λₙ₋₁ = 0. Therefore, the set {y₁, x₁, ..., xₙ₋₁} is indeed a basis for X.

By a similar argument as the preceding one we can show that the set {y₂, y₁, x₁, ..., xₙ₋₂} is a basis for X, that the set {y₃, y₂, y₁, x₁, ..., xₙ₋₃} is a basis for X, etc. Now if m > n, then we would not have utilized yₙ₊₁ in our process. Since {yₙ, ..., y₁} is a basis by the preceding argument, there exist coefficients α₁, ..., αₙ such that

yₙ₊₁ = α₁y₁ + ... + αₙyₙ.

But by Theorem 3.3.15 this means the yᵢ, i = 1, ..., n + 1, are linearly dependent, a contradiction to the hypothesis of our theorem. From this it now follows that if m ≥ n, then we must have m = n. This concludes the proof of the theorem.
As a direct consequence of Theorem 3.3.26 we have:
3.3.31. Theorem. If a linear space X has a basis containing a finite number of vectors n, then any other basis for X consists of exactly n elements.

Proof. Let {x₁, ..., xₙ} be a basis for X, and let {y₁, ..., yₘ} also be a basis for X. Then in view of Theorem 3.3.26 we have m ≤ n. Interchanging the roles of the xᵢ and yᵢ, we also have n ≤ m. Hence, m = n.
Our preceding result enables us to make the following definition.

3.3.32. Definition. If a linear space X has a basis consisting of a finite number of vectors, say {x₁, ..., xₙ}, then X is said to be a finite-dimensional vector space, and the dimension of X is n, abbreviated dim X = n. In this case we speak of an n-dimensional vector space. If X is not a finite-dimensional vector space, it is said to be an infinite-dimensional vector space.

We will agree that the linear space consisting of the null vector is finite dimensional, and we will say that the dimension of this space is zero.
Our next result provides us with an alternate characterization of the (finite) dimension of a linear space.
3.3.33. Theorem. Let X be a vector space which contains n linearly independent vectors. If every set of n + 1 vectors in X is linearly dependent, then X is finite dimensional and dim X = n.

Proof. Let {x₁, ..., xₙ} be a linearly independent set in X, and let x ∈ X. Then there exists a set of scalars {α₁, ..., αₙ₊₁}, not all zero, such that

α₁x₁ + ... + αₙxₙ + αₙ₊₁x = 0.

Now αₙ₊₁ ≠ 0, for otherwise we would contradict the fact that x₁, ..., xₙ are linearly independent. Hence,

x = (−α₁/αₙ₊₁)x₁ + ... + (−αₙ/αₙ₊₁)xₙ,

and x ∈ V({x₁, ..., xₙ}); i.e., {x₁, ..., xₙ} is a basis for X. Therefore, X is n-dimensional.
From our preceding result follows:

3.3.34. Corollary. Let X be a vector space. If for given n every set of n + 1 vectors in X is linearly dependent, then X is finite dimensional and dim X ≤ n.

3.3.35. Exercise. Prove Corollary 3.3.34.
We are now in a position to speak of the coordinates of a vector. We have:

3.3.36. Definition. Let X be a finite-dimensional vector space, and let {x₁, ..., xₙ} be a basis for X. Let x ∈ X be represented by

x = ξ₁x₁ + ... + ξₙxₙ.

The unique scalars ξ₁, ξ₂, ..., ξₙ are called the coordinates of x with respect to the basis {x₁, x₂, ..., xₙ}.
It is possible to prove results similar to Theorems 3.3.26 and 3.3.31 for infinite-dimensional linear spaces. Since we will not make further use of these results in this book, their proofs will be omitted. In the following theorems, X is an arbitrary vector space (i.e., finite dimensional or infinite dimensional).
3.3.37. Theorem. If Y is a linearly independent set in a linear space X, then there exists a Hamel basis Z for X such that Y ⊂ Z.

3.3.38. Theorem. If Y and Z are Hamel bases for a linear space X, then Y and Z have the same cardinal number.
The notion of Hamel basis is not the only concept of basis with which we will deal. Such other concepts (to be specified later) reduce to Hamel basis on finite-dimensional vector spaces but differ significantly on infinite-dimensional spaces. We will find that on infinite-dimensional spaces the concept of Hamel basis is not very useful. However, in the case of finite-dimensional spaces the concept of Hamel basis is most crucial.

In view of the results presented thus far, the reader can readily prove the following facts.
3.3.39. Theorem. Let X be a finite-dimensional linear space with dim X = n.

(i) No linearly independent set in X contains more than n vectors.
(ii) A linearly independent set in X is a basis if and only if it contains exactly n vectors.
(iii) Every spanning or generating set for X contains a basis for X.
(iv) Every set of vectors which spans X contains at least n vectors.
(v) Every linearly independent set of vectors in X is contained in a basis for X.
(vi) If Y is a linear subspace of X, then Y is finite dimensional and dim Y ≤ n.
(vii) If Y is a linear subspace of X and if dim Y = dim X, then Y = X.

3.3.40. Exercise. Prove Theorem 3.3.39.
From Theorem 3.3.39 follows directly our next result.

3.3.41. Theorem. Let X be a finite-dimensional linear space of dimension n, and let Y be a collection of vectors in X. Then any two of the three conditions listed below imply the third condition:

(i) the vectors in Y are linearly independent;
(ii) the vectors in Y span X; and
(iii) the number of vectors in Y is n.
3.3.42. Exercise. Prove Theorem 3.3.41.

Another way of restating Theorem 3.3.41 is as follows:

(a) the dimension of a finite-dimensional linear space X is equal to the smallest number of vectors that can be used to span X; and
(b) the dimension of a finite-dimensional linear space X is the largest number of vectors that can be linearly independent in X.
For the direct sum of two linear subspaces we have the following result.

3.3.43. Theorem. Let X be a finite-dimensional vector space. If there exist linear subspaces Y and Z of X such that X = Y ⊕ Z, then dim (X) = dim (Y) + dim (Z).

Proof. Since X is finite dimensional, it follows from part (vi) of Theorem 3.3.39 that Y and Z are finite-dimensional linear spaces. Thus, there exists a basis, say {y₁, ..., yₘ}, for Y and a basis, say {z₁, ..., zₙ}, for Z. Let W = {y₁, ..., yₘ, z₁, ..., zₙ}. We must show that W is a linearly independent set in X and that V(W) = X. Now suppose that

α₁y₁ + ... + αₘyₘ + β₁z₁ + ... + βₙzₙ = 0.

Since the representation for 0 ∈ X must be unique in terms of its components in Y and Z, we must have

α₁y₁ + ... + αₘyₘ = 0

and

β₁z₁ + ... + βₙzₙ = 0.

But this implies that α₁ = ... = αₘ = β₁ = β₂ = ... = βₙ = 0. Thus, W is a linearly independent set in X. Since X is the direct sum of Y and Z, it is clear that W generates X. Thus, dim X = m + n. This completes the proof of the theorem.
We conclude the present section with the following results.
3.3.44. Theorem. Let X be an n-dimensional vector space, and let {y₁, ..., yₘ} be a linearly independent set of vectors in X, where m < n. Then it is possible to form a basis for X consisting of n vectors x₁, ..., xₙ, where xᵢ = yᵢ for i = 1, ..., m.

Proof. Let {e₁, ..., eₙ} be a basis for X. Let S₁ be the set of vectors {y₁, ..., yₘ, e₁, ..., eₙ}, where {y₁, ..., yₘ} is a linearly independent set of vectors in X and where m < n. We note that S₁ spans X and is linearly dependent, since it contains more than n vectors. Now let

α₁y₁ + ... + αₘyₘ + β₁e₁ + ... + βₙeₙ = 0.

Then there must be some βⱼ ≠ 0, for otherwise the linear independence of y₁, ..., yₘ would be contradicted. But this means that eⱼ is a linear combination of the set of vectors S₂ = {y₁, ..., yₘ, e₁, ..., eⱼ₋₁, eⱼ₊₁, ..., eₙ}; i.e., S₂ is the set S₁ with eⱼ eliminated. Clearly, S₂ still spans X. Now either S₂ contains n vectors or else it is a linearly dependent set. If it contains n vectors, then by Theorem 3.3.41 these vectors must be linearly independent, in which case S₂ is a basis for X. We then let xₘ₊₁, ..., xₙ be the remaining eᵢ's, and the theorem is proved. On the other hand, if S₂ contains more than n vectors, then we continue the above procedure to eliminate vectors from the remaining eᵢ's until exactly n − m of them are left. Letting e_{j1}, ..., e_{j(n−m)} be the remaining vectors and letting xₘ₊₁ = e_{j1}, ..., xₙ = e_{j(n−m)}, we have completed the proof of the theorem.
3.3.45. Corollary. Let X be an n-dimensional vector space, and let Y be an m-dimensional subspace of X. Then there exists a subspace Z of X of dimension (n − m) such that X = Y ⊕ Z.

3.3.46. Exercise. Prove Corollary 3.3.45.

Referring to Figure 3.3.8, it is easy to see that the subspace Z in Corollary 3.3.45 need not be unique.
3.4. LINEAR TRANSFORMATIONS
Among the most important notions which we will encounter are special types of mappings on vector spaces, called linear transformations.

3.4.1. Definition. A mapping T of a linear space X into a linear space Y, where X and Y are vector spaces over the same field F, is called a linear transformation or linear operator provided that

(i) T(x + y) = T(x) + T(y) for all x, y ∈ X; and
(ii) T(αx) = αT(x) for all x ∈ X and for all α ∈ F.

A transformation which is not linear is called a non-linear transformation.
We will find it convenient to write T ∈ L(X, Y) to indicate that T is a linear transformation from a linear space X into a linear space Y (i.e., L(X, Y) denotes the set of all linear transformations from linear space X into linear space Y).

It follows immediately from the above definition that T is a linear transformation from a linear space X into a linear space Y if and only if T(α₁x₁ + ... + αₙxₙ) = α₁T(x₁) + ... + αₙT(xₙ) for all xᵢ ∈ X and for all αᵢ ∈ F, i = 1, ..., n. In engineering and science this is called the principle of superposition and is among the most important concepts in those disciplines.
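The superposition principle can be checked concretely for a simple linear transformation on R², here given by a 2 × 2 matrix of our choosing (an illustrative T, not one from the text):

```python
# The superposition principle for a linear transformation T: R^2 -> R^2
# given by a 2 x 2 matrix (an illustrative choice of T):
# T(a1*x1 + a2*x2) = a1*T(x1) + a2*T(x2).

A = [[2.0, 1.0],
     [0.0, 3.0]]

def T(x):
    """Matrix-vector product: a linear transformation on R^2."""
    return tuple(sum(A[i][j] * x[j] for j in range(2)) for i in range(2))

def comb(a1, x1, a2, x2):
    """The linear combination a1*x1 + a2*x2 in R^2."""
    return tuple(a1 * u + a2 * v for u, v in zip(x1, x2))

x1, x2 = (1.0, 2.0), (-3.0, 0.5)
a1, a2 = 4.0, -2.5
assert T(comb(a1, x1, a2, x2)) == comb(a1, T(x1), a2, T(x2))
```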
3.4.2. Example. Let X = Y denote the space of real-valued continuous functions on the interval [a, b] as described in Example 3.1.19. Let T: X → Y be defined by

[Tx](t) = ∫ₐᵗ x(s) ds, a ≤ t ≤ b,

where integration is in the Riemann sense. By the properties of integrals it follows readily that T is a linear transformation.
3.4.3. Example. Let C^n(a, b) denote the set of functions x(t) with n continuous derivatives on the interval (a, b), and let vector addition and scalar multiplication be defined by equations (3.1.20) and (3.1.21), respectively. It is readily verified that C^n(a, b) is a linear space. Now let T: C^n(a, b) → C^(n−1)(a, b) be defined by

Tx(t) = dx(t)/dt.

From the properties of derivatives it follows that T is a linear transformation from C^n(a, b) to C^(n−1)(a, b).
3.4.4. Example. Let X denote the space of all complex-valued functions x(t) defined on the half-open interval [0, ∞) such that x(t) is Riemann integrable and such that

|x(t)| ≤ k e^(at), t ≥ 0,

where k is some positive constant and a is any real number. Defining vector addition and scalar multiplication as in Eqs. (3.1.20) and (3.1.21), respectively, it is easily shown that X is a linear space. Now let Y denote the linear space of complex functions of a complex variable s (s = σ + iω; σ, ω ∈ R). The reader can readily verify that the mapping T: X → Y defined by

Tx(s) = ∫_0^∞ e^(−st) x(t) dt  (3.4.5)

is a linear transformation (called the Laplace transform of x(t)).
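A numerical sketch of Eq. (3.4.5): approximating the Laplace transform by a Riemann sum and checking its linearity at a sample point s. The step size, truncation point, and test functions are illustrative choices, not from the text.

```python
import math

def laplace(x, s, upper=40.0, n=100000):
    # left Riemann sum approximating the integral from 0 to infinity
    # of e^{-st} x(t) dt, truncated at t = upper
    h = upper / n
    return sum(math.exp(-s * (k * h)) * x(k * h) * h for k in range(n))

x1 = lambda t: math.exp(-t)   # known transform: 1/(s + 1)
x2 = lambda t: 1.0            # known transform: 1/s
s = 2.0
lhs = laplace(lambda t: 3 * x1(t) + 5 * x2(t), s)
rhs = 3 * laplace(x1, s) + 5 * laplace(x2, s)
assert abs(lhs - rhs) < 1e-9                      # linearity of the transform
assert abs(laplace(x1, s) - 1 / (s + 1)) < 1e-3   # agrees with the known transform
```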
3.4.6. Example. Let X be the space of real-valued continuous functions on [a, b] as described in Example 3.1.19. Let k(s, t) be a real-valued function defined for a ≤ s ≤ b, a ≤ t ≤ b, such that for each x ∈ X the Riemann integral

∫_a^b k(s, t) x(t) dt  (3.4.7)

exists and defines a continuous function of s on [a, b]. Let T1: X → X be defined by

(T1 x)(s) = y(s) = ∫_a^b k(s, t) x(t) dt.  (3.4.8)

It is readily shown that T1 ∈ L(X, X). The equation (3.4.8) is called the Fredholm integral equation of the first type.

3.4.9. Example. If in place of (3.4.8) we define T2: X → X by

(T2 x)(s) = y(s) = x(s) + ∫_a^b k(s, t) x(t) dt,  (3.4.10)

then it is again readily shown that T2 ∈ L(X, X). Equation (3.4.10) is known as the Fredholm integral equation of the second type.
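A discretization sketch (an assumption, not from the text): sampling x on a grid turns the Fredholm operator of Eq. (3.4.10) into a finite map whose linearity is easy to check numerically. The kernel and test functions are arbitrary examples.

```python
def fredholm2(x, k, a, b, n=200):
    # (T2 x)(s_i) = x(s_i) + sum_j k(s_i, t_j) x(t_j) h   (Riemann sum on a grid)
    h = (b - a) / n
    grid = [a + i * h for i in range(n + 1)]
    return [x(s) + sum(k(s, t) * x(t) * h for t in grid[:-1]) for s in grid]

k = lambda s, t: s * t          # a sample separable kernel
x1 = lambda t: t
x2 = lambda t: 1.0
y12 = fredholm2(lambda t: 2 * x1(t) + 3 * x2(t), k, 0.0, 1.0)
y1 = fredholm2(x1, k, 0.0, 1.0)
y2 = fredholm2(x2, k, 0.0, 1.0)
# T2(2 x1 + 3 x2) = 2 T2(x1) + 3 T2(x2), pointwise on the grid
assert all(abs(v - (2 * u + 3 * w)) < 1e-9 for v, u, w in zip(y12, y1, y2))
```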
3.4.11. Example. In Examples 3.4.6 and 3.4.9, assume that k(s, t) = 0 when t > s. In place of (3.4.7) we now have

∫_a^s k(s, t) x(t) dt.  (3.4.12)

Equations (3.4.8) and (3.4.10) now become

(T3 x)(s) = y(s) = ∫_a^s k(s, t) x(t) dt  (3.4.13)

and

(T4 x)(s) = y(s) = x(s) + ∫_a^s k(s, t) x(t) dt,  (3.4.14)

respectively. Equations (3.4.13) and (3.4.14) are called Volterra integral equations (of the first type and the second type, respectively). Again, the mappings T3 and T4 are linear transformations from X into X.
3.4.15. Example. Let X = C, the set of complex numbers. If x ∈ C, let x̄ denote the complex conjugate of x. Define T: X → X as

T(x) = x̄.

Then, clearly, T(x + y) = x̄ + ȳ = T(x) + T(y). Now if F = C, the field of complex numbers, and if α ∈ C, then

T(αx) = ᾱx̄ = ᾱT(x) ≠ αT(x).

Therefore, T is not a linear transformation.

Example 3.4.15 demonstrates the important fact that condition (i) of Definition 3.4.1 does not imply condition (ii) of this definition.

Henceforth, when dealing with linear transformations T: X → Y, we will write Tx in place of T(x).
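A quick numerical companion to Example 3.4.15: complex conjugation is additive, but homogeneity fails over the complex field, so condition (i) of Definition 3.4.1 indeed does not imply condition (ii).

```python
def T(x):
    # complex conjugation on C
    return x.conjugate()

x, y = 1 + 2j, 3 - 1j
assert T(x + y) == T(x) + T(y)                    # condition (i) holds
alpha = 1j
assert T(alpha * x) == alpha.conjugate() * T(x)   # the conjugated scalar comes out
assert T(alpha * x) != alpha * T(x)               # condition (ii) fails: T is not linear
```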
3.4.16. Definition. Let T ∈ L(X, Y). We call the set

N(T) = {x ∈ X : Tx = 0}  (3.4.17)

the null space of T. The set

R(T) = {y ∈ Y : y = Tx, x ∈ X}  (3.4.18)

is called the range space of T.

Since T0 = 0, it follows that N(T) and R(T) are never empty. The next two important assertions are readily proved.

3.4.19. Theorem. Let T ∈ L(X, Y). Then

(i) the null space N(T) is a linear subspace of X; and
(ii) the range space R(T) is a linear subspace of Y.

3.4.20. Exercise. Prove Theorem 3.4.19.
For the dimension of the range space R(T) we have:

3.4.21. Theorem. Let T ∈ L(X, Y). If X is finite dimensional with dimension n, then R(T) is finite dimensional and dim R(T) ≤ n.

Proof. We assume that R(T) ≠ {0} and X ≠ {0}, for if R(T) = {0} or X = {0}, then dim R(T) = 0, and the theorem is proved. Thus, assume that n > 0 and let y1, ..., y_{n+1} ∈ R(T). Then there exist x1, ..., x_{n+1} ∈ X such that Txi = yi for i = 1, ..., n + 1. Since X is of dimension n, there exist scalars α1, ..., α_{n+1} ∈ F such that not all αi = 0 and

α1 x1 + ... + α_{n+1} x_{n+1} = 0.

This implies that

T(α1 x1 + ... + α_{n+1} x_{n+1}) = 0,

or

α1 y1 + ... + α_{n+1} y_{n+1} = 0.

Therefore, by Corollary 3.3.34, R(T) is finite dimensional and dim R(T) ≤ n.

3.4.22. Example. Let T: R² → R^∞, where R² and R^∞ are defined in Examples 3.1.10 and 3.1.11, respectively. For x ∈ R² we write x = (ξ1, ξ2). Define T by

T(ξ1, ξ2) = (0, ξ1, 0, ξ2, 0, 0, ...).

The mapping T is clearly a linear transformation. The vectors (0, 1, 0, ...) and (0, 0, 0, 1, 0, 0, ...) span R(T) and dim R(T) = 2 = dim R².
We also have:

3.4.23. Theorem. Let T ∈ L(X, Y), and let X be finite dimensional. Let {y1, ..., yn} be a basis for R(T), and let xi ∈ X be such that Txi = yi for i = 1, ..., n. Then x1, ..., xn are linearly independent in X.

3.4.24. Exercise. Prove Theorem 3.4.23.
Our next result, which as we will see is of utmost importance, is sometimes called the fundamental theorem of linear equations.

3.4.25. Theorem. Let T ∈ L(X, Y). If X is finite dimensional, then

dim N(T) + dim R(T) = dim X.  (3.4.26)

Proof. Let dim X = n, let dim N(T) = s, and let r = n − s. We must show that dim R(T) = r.

First, let us assume that 0 < s < n, and let {e1, e2, ..., en} be a basis for X chosen in such a way that the last s vectors, e_{r+1}, e_{r+2}, ..., en, form a basis for the linear subspace N(T) (see Theorem 3.3.44). Then the vectors Te1, Te2, ..., Ter, Te_{r+1}, ..., Ten generate the linear subspace R(T). But e_{r+1}, e_{r+2}, ..., en are vectors in N(T), and thus Te_{r+1} = 0, ..., Ten = 0. From this it now follows that the vectors Te1, Te2, ..., Ter must generate R(T). Now let f1 = Te1, f2 = Te2, ..., fr = Ter. We must show that the vectors f1, f2, ..., fr are linearly independent and as such form a basis for R(T).

Next, we observe that γ1 f1 + γ2 f2 + ... + γr fr ∈ R(T). If the γ1, γ2, ..., γr are chosen in such a fashion that γ1 f1 + γ2 f2 + ... + γr fr = 0, then

γ1 f1 + γ2 f2 + ... + γr fr = γ1 Te1 + γ2 Te2 + ... + γr Ter = T(γ1 e1 + γ2 e2 + ... + γr er) = 0,

and from this it follows that γ1 e1 + γ2 e2 + ... + γr er ∈ N(T). Now, by assumption, the set {e_{r+1}, ..., en} is a basis for N(T). Thus there must exist scalars γ_{r+1}, γ_{r+2}, ..., γn such that

γ1 e1 + γ2 e2 + ... + γr er = γ_{r+1} e_{r+1} + ... + γn en.

This can be rewritten as

γ1 e1 + ... + γr er − γ_{r+1} e_{r+1} − ... − γn en = 0.

But {e1, e2, ..., en} is a basis for X. From this it follows that γ1 = γ2 = ... = γr = γ_{r+1} = ... = γn = 0. Hence, f1, f2, ..., fr are linearly independent and therefore dim R(T) = r. If s = 0, the preceding proof remains valid if we let {e1, ..., en} be any basis for X and ignore the remarks about the vectors e_{r+1}, ..., en. If s = n, then N(T) = X. Hence, R(T) = {0} and so dim R(T) = 0. This concludes the proof of the theorem.
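Equation (3.4.26) can be verified numerically for a map T: R⁴ → R³ represented by a matrix acting on coordinates. The matrix below is a hypothetical example, not from the text; the rank is computed by Gaussian elimination and the nullity obtained as dim X minus the rank.

```python
def rank(rows, tol=1e-12):
    # row-reduce a copy of the matrix and count the pivot rows
    a = [r[:] for r in rows]
    r = 0
    for c in range(len(a[0])):
        piv = next((i for i in range(r, len(a)) if abs(a[i][c]) > tol), None)
        if piv is None:
            continue
        a[r], a[piv] = a[piv], a[r]
        for i in range(len(a)):
            if i != r and abs(a[i][c]) > tol:
                f = a[i][c] / a[r][c]
                a[i] = [u - f * v for u, v in zip(a[i], a[r])]
        r += 1
    return r

A = [[1, 2, 0, 1],
     [0, 1, 1, 0],
     [1, 3, 1, 1]]   # third row = first + second, so the rank is 2
n = 4                # dim X
rho = rank(A)        # dim R(T)
nu = n - rho         # dim N(T), by Theorem 3.4.25
assert rho == 2 and nu == 2 and rho + nu == n
```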
Our preceding result gives rise to the next definition.

3.4.27. Definition. The rank ρ(T) of a linear transformation T of a finite-dimensional vector space X into a vector space Y is the dimension of the range space R(T). The nullity ν(T) of the linear transformation T is the dimension of the null space N(T).

The reader is now in a position to prove the next result.
3.4.28. Theorem. Let T ∈ L(X, Y). Let X be finite dimensional, and let s = dim N(T). Let {x1, ..., xs} be a basis for N(T). Then

(i) a vector x ∈ X satisfies the equation

Tx = 0

if and only if x = α1 x1 + ... + αs xs for some set of scalars {α1, ..., αs}. Furthermore, for each x ∈ X such that Tx = 0 is satisfied, the set of scalars {α1, ..., αs} is unique;

(ii) if y0 is a fixed vector in Y, then Tx = y0 holds for at least one x ∈ X (called a solution of the equation Tx = y0) if and only if y0 ∈ R(T); and

(iii) if y0 is any fixed vector in Y and if x0 is some vector in X such that Tx0 = y0 (i.e., x0 is a solution of the equation Tx = y0), then a vector x ∈ X satisfies Tx = y0 if and only if x = x0 + β1 x1 + ... + βs xs for some set of scalars {β1, β2, ..., βs}. Furthermore, for each x ∈ X such that Tx = y0, the set of scalars {β1, β2, ..., βs} is unique.

3.4.29. Exercise. Prove Theorem 3.4.28.
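Part (iii) of Theorem 3.4.28 says that the solution set of Tx = y0 is a particular solution shifted by the null space. A small sketch for the hypothetical map T(x1, x2, x3) = (x1 + x2, x2 + x3) from R³ to R²:

```python
def T(x):
    return (x[0] + x[1], x[1] + x[2])

x_null = (1.0, -1.0, 1.0)          # basis vector for N(T)
assert T(x_null) == (0.0, 0.0)

y0 = (3.0, 5.0)
x0 = (3.0, 0.0, 5.0)               # one particular solution
assert T(x0) == y0

# every x0 + beta * x_null is again a solution of Tx = y0
for beta in (-2.0, 0.0, 7.5):
    x = tuple(a + beta * b for a, b in zip(x0, x_null))
    assert T(x) == y0
```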
Since a linear transformation T of a linear space X into a linear space Y is a mapping, we can distinguish, as in Chapter 1, between linear transformations that are surjective (i.e., onto), injective (i.e., one-to-one), and bijective (i.e., onto and one-to-one). We will often be particularly interested in knowing when a linear transformation T has an inverse, which we denote by T⁻¹. In this connection, the following terms are used interchangeably: T⁻¹ exists, T has an inverse, T is invertible, and T is non-singular. Also, a linear transformation which is not non-singular is said to be singular. We recall, if T has an inverse, then

T⁻¹(Tx) = x for all x ∈ X  (3.4.30)

and

T(T⁻¹y) = y for all y ∈ R(T).  (3.4.31)
The following theorem is a fundamental result concerning inverses of linear transformations.

3.4.32. Theorem. Let T ∈ L(X, Y).

(i) The inverse of T exists if and only if Tx = 0 implies x = 0.
(ii) If T⁻¹ exists, then T⁻¹ is a linear transformation from R(T) onto X.

Proof. To prove part (i), assume first that Tx = 0 implies x = 0. Let x1, x2 ∈ X with Tx1 = Tx2. Then T(x1 − x2) = 0 and therefore x1 − x2 = 0. Thus, x1 = x2 and T has an inverse.

Conversely, assume that T has an inverse. Let Tx = 0. Since T0 = 0, we have T0 = Tx. Since T has an inverse, x = 0.

To prove part (ii), assume that T⁻¹ exists. To establish the linearity of T⁻¹, let y1 = Tx1 and y2 = Tx2, where y1, y2 ∈ R(T) and x1, x2 ∈ X are such that x1 = T⁻¹y1 and x2 = T⁻¹y2. Then

T⁻¹(y1 + y2) = T⁻¹(Tx1 + Tx2) = T⁻¹T(x1 + x2) = x1 + x2 = T⁻¹(y1) + T⁻¹(y2).

Also, for α ∈ F we have

T⁻¹(αy1) = T⁻¹(αTx1) = T⁻¹(T(αx1)) = αx1 = αT⁻¹(y1).

Thus, T⁻¹ is linear. It is also a mapping onto X, since every x ∈ X is the image of some y ∈ R(T). For, if x ∈ X, then there is a y ∈ R(T) such that Tx = y. Hence, T⁻¹y = x and x ∈ R(T⁻¹).

3.4.33. Example. Consider the linear transformation T: R² → R^∞ of Example 3.4.22. Since Tx = 0 implies x = 0, T has an inverse. We see that T is not a mapping of R² onto R^∞; however, T is clearly a one-to-one mapping of R² onto R(T).
For finite-dimensional vector spaces we have:

3.4.34. Theorem. Let T ∈ L(X, Y). If X is finite dimensional, T has an inverse if and only if R(T) has the same dimension as X; i.e., ρ(T) = dim X.

Proof. By Theorem 3.4.25 we have

dim N(T) + dim R(T) = dim X.

Since T has an inverse if and only if N(T) = {0}, it follows that ρ(T) = dim X if and only if T has an inverse.
For finite-dimensional linear spaces we also have:

3.4.35. Theorem. Let X and Y be finite-dimensional vector spaces of the same dimension, say dim X = dim Y = n. Let T ∈ L(X, Y). Then R(T) = Y if and only if T has an inverse.

Proof. Assume that T has an inverse. By Theorem 3.4.34 we know that dim R(T) = n. Thus, dim R(T) = dim Y, and it follows from Theorem 3.3.39, part (vii), that R(T) = Y.

Conversely, assume that R(T) = Y. Let {y1, ..., yn} be a basis for R(T). Let xi ∈ X be such that Txi = yi for i = 1, ..., n. Then, by Theorem 3.4.23, the vectors x1, ..., xn are linearly independent. Since the dimension of X is n, it follows that the vectors x1, ..., xn span X. Now let Tx = 0 for some x ∈ X. We can represent x as x = α1 x1 + ... + αn xn. Hence, 0 = Tx = α1 y1 + ... + αn yn. Since the vectors y1, ..., yn are linearly independent, we must have α1 = ... = αn = 0, and thus x = 0. This implies that T has an inverse.
At this point we find it instructive to summarize the preceding results which characterize injective, surjective, and bijective linear transformations. In so doing, it is useful to keep Figure 3.4.36 in mind.

3.4.36. Figure. Linear transformation T from vector space X into vector space Y.
3.4.37. Summary (Injective Linear Transformations). Let X and Y be vector spaces over the same field F, and let T ∈ L(X, Y). The following are equivalent:

(i) T is injective;
(ii) T has an inverse;
(iii) Tx = 0 implies x = 0;
(iv) for each y ∈ R(T), there is a unique x ∈ X such that Tx = y;
(v) if Tx1 = Tx2, then x1 = x2; and
(vi) if x1 ≠ x2, then Tx1 ≠ Tx2.

If X is finite dimensional, then the following are equivalent:

(i) T is injective; and
(ii) ρ(T) = dim X.

3.4.38. Summary (Surjective Linear Transformations). Let X and Y be vector spaces over the same field F, and let T ∈ L(X, Y). The following are equivalent:

(i) T is surjective; and
(ii) for each y ∈ Y, there is an x ∈ X such that Tx = y.

If X and Y are finite dimensional, then the following are equivalent:

(i) T is surjective; and
(ii) dim Y = ρ(T).

3.4.39. Summary (Bijective Linear Transformations). Let X and Y be vector spaces over the same field F, and let T ∈ L(X, Y). The following are equivalent:

(i) T is bijective; and
(ii) for every y ∈ Y there is a unique x ∈ X such that Tx = y.

If X and Y are finite dimensional, then the following are equivalent:

(i) T is bijective; and
(ii) dim X = dim Y = ρ(T).

3.4.40. Summary (Injective, Surjective, and Bijective Linear Transformations). Let X and Y be finite-dimensional vector spaces over the same field F, let T ∈ L(X, Y), and let dim X = dim Y. (Note: this is true if, e.g., X = Y.) The following are equivalent:

(i) T is injective;
(ii) T is surjective;
(iii) T is bijective; and
(iv) T has an inverse.

3.4.41. Exercise. Verify the assertions made in summaries (3.4.37)–(3.4.40).
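The equivalences in Summary 3.4.40 genuinely require finite dimensionality. A sketch with the right-shift operator on finitely supported sequences (represented here as Python dicts from index to value, an illustrative encoding): the shift is injective but not surjective.

```python
def shift(x):
    # right shift: (x0, x1, x2, ...) -> (0, x0, x1, x2, ...)
    return {i + 1: v for i, v in x.items()}

x, y = {0: 1.0, 2: 3.0}, {0: 1.0, 1: -2.0}
# injective: the shifted sequence determines the original uniquely
assert shift(x) == {1: 1.0, 3: 3.0}
assert shift(x) != shift(y)
# not surjective: any sequence with a nonzero 0th entry is missed
e0 = {0: 1.0}
assert all(0 not in shift(z) for z in (x, y, e0))
```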
Let us next examine some of the properties of the set L(X, Y), the set of all linear transformations from a vector space X into a vector space Y. As before, we assume that X and Y are linear spaces over the same field F. Let S, T ∈ L(X, Y), and define the sum of S and T by

(S + T)x ≜ Sx + Tx  (3.4.42)

for all x ∈ X. Also, with α ∈ F and T ∈ L(X, Y), define multiplication of T by a scalar α as

(αT)x ≜ α(Tx)  (3.4.43)

for all x ∈ X. It is an easy matter to show that (S + T) ∈ L(X, Y) and also that αT ∈ L(X, Y). Let us further note that there exists a zero element in L(X, Y), called the zero transformation and denoted by 0, which is defined by

0x = 0  (3.4.44)

for all x ∈ X. Moreover, to each T ∈ L(X, Y) there corresponds a unique linear transformation −T ∈ L(X, Y) defined by

(−T)x = −Tx  (3.4.45)

for all x ∈ X. In this case it follows trivially that T + (−T) = 0.

3.4.46. Exercise. Let X be a finite-dimensional space, and let T ∈ L(X, Y). Let {e1, ..., en} be a basis for X. Then Tei = 0 for i = 1, ..., n if and only if T = 0 (i.e., T is the zero transformation).

With the above definitions it is now easy to establish the following result.

3.4.47. Theorem. Let X and Y be two linear spaces over the same field of scalars F, and let L(X, Y) denote the set of all linear transformations from X into Y. Then L(X, Y) is itself a linear space over F, called the space of linear transformations (here, vector addition is defined by Eq. (3.4.42) and multiplication of vectors by scalars is defined by Eq. (3.4.43)).

3.4.48. Exercise. Prove Theorem 3.4.47.
Next, let us recall the definition of an algebra, considered in Chapter 2.

3.4.49. Definition. A set X is called an algebra if it is a linear space and if in addition to each x, y ∈ X there corresponds an element in X, denoted by xy and called the product of x times y, satisfying the following axioms:

(i) x(y + z) = xy + xz for all x, y, z ∈ X;
(ii) (x + y)z = xz + yz for all x, y, z ∈ X; and
(iii) (αx)(βy) = (αβ)(xy) for all x, y ∈ X and for all α, β ∈ F.

If in addition to the above,

(iv) x(yz) = (xy)z for all x, y, z ∈ X,

then X is called an associative algebra.

If there exists an element i ∈ X such that ix = xi = x for every x ∈ X, then i is called the identity of the algebra X. It can be readily shown that if i exists, then it is unique. Furthermore, if xy = yx for all x, y ∈ X, then X is said to be a commutative algebra. Finally, if Y is a subset of X (X is an algebra) and (a) if x + y ∈ Y whenever x, y ∈ Y, and (b) if αx ∈ Y whenever α ∈ F and x ∈ Y, and (c) if xy ∈ Y whenever x, y ∈ Y, then Y is called a subalgebra of X.
Now let us return to the subject on hand. Let X, Y, and Z be linear spaces over F, and consider the vector spaces L(X, Y) and L(Y, Z). If S ∈ L(Y, Z) and if T ∈ L(X, Y), then we define the product ST as the mapping of X into Z characterized by

(ST)x = S(Tx)  (3.4.50)

for all x ∈ X. The reader can readily verify that ST ∈ L(X, Z).

Next, let X = Y = Z. If S, T, U ∈ L(X, X) and if α, β ∈ F, then it is easily shown that

S(TU) = (ST)U,  (3.4.51)

S(T + U) = ST + SU,  (3.4.52)

(S + T)U = SU + TU,  (3.4.53)

and

(αS)(βT) = (αβ)ST.  (3.4.54)

For example, to verify (3.4.52), we observe that

[S(T + U)]x = S[(T + U)x] = S(Tx + Ux) = S(Tx) + S(Ux) = (ST)x + (SU)x = (ST + SU)x

for all x ∈ X, and hence Eq. (3.4.52) follows.

We emphasize at this point that, in general, commutativity of linear transformations does not hold; i.e., in general,

ST ≠ TS.  (3.4.55)

There is a special mapping from a linear space X into X, called the identity transformation, defined by

Ix = x  (3.4.56)

for all x ∈ X. We note that I is linear, i.e., I ∈ L(X, X), that I ≠ 0 if and only if X ≠ {0}, that I is unique, and that

TI = IT = T  (3.4.57)

for all T ∈ L(X, X). Also, we can readily verify that the transformation αI, α ∈ F, defined by

(αI)x = α(Ix) = αx  (3.4.58)

is also a linear transformation.

The above discussion gives rise to the following result.

3.4.59. Theorem. The set of linear transformations of a linear space X into X, denoted by L(X, X), is an associative algebra with identity I. This algebra is, in general, not commutative.
We further have:

3.4.60. Theorem. Let T ∈ L(X, X). If T is bijective, then T⁻¹ ∈ L(X, X) and

T⁻¹T = TT⁻¹ = I,  (3.4.61)

where I denotes the identity transformation defined in Eq. (3.4.56).

3.4.62. Exercise. Prove Theorem 3.4.60.

For invertible linear transformations defined on finite-dimensional linear spaces we have the following result.

3.4.63. Theorem. Let X be a finite-dimensional vector space, and let T ∈ L(X, X). Then the following are equivalent:

(i) T is invertible;
(ii) rank T = dim X;
(iii) T is one-to-one;
(iv) T is onto; and
(v) Tx = 0 implies x = 0.

3.4.64. Exercise. Prove Theorem 3.4.63.

Bijective linear transformations are further characterized by our next result.

3.4.65. Theorem. Let X be a linear space, and let S, T ∈ L(X, X). Let I ∈ L(X, X) denote the identity transformation.

(i) If ST = TS = I, then S is bijective and S⁻¹ = T.
(ii) If S and T are bijective, then ST is bijective, and (ST)⁻¹ = T⁻¹S⁻¹.
(iii) If S is bijective, then (S⁻¹)⁻¹ = S.
(iv) If S is bijective, then αS is bijective and (αS)⁻¹ = α⁻¹S⁻¹ for all α ∈ F and α ≠ 0.

3.4.66. Exercise. Prove Theorem 3.4.65.
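Parts of Theorem 3.4.65 and Eq. (3.4.55) can be checked with 2×2 matrices. The specific matrices below are arbitrary invertible examples, not from the text.

```python
def mul(A, B):
    # 2x2 matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inv(A):
    # inverse of a 2x2 matrix via the adjugate formula
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

S = [[1.0, 2.0], [0.0, 1.0]]
T = [[2.0, 0.0], [1.0, 1.0]]
# Theorem 3.4.65(ii): (ST)^{-1} = T^{-1} S^{-1}
lhs = inv(mul(S, T))
rhs = mul(inv(T), inv(S))
assert all(abs(lhs[i][j] - rhs[i][j]) < 1e-12 for i in range(2) for j in range(2))
# Eq. (3.4.55): in general ST != TS
assert mul(S, T) != mul(T, S)
```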
With the aid of the above concepts and results we can now construct certain classes of functions of linear transformations. Since relation (3.4.51) allows us to write the product of three or more linear transformations without the use of parentheses, we can define T^n, where T ∈ L(X, X) and n is a positive integer, as

T^n ≜ T T ... T (n times).  (3.4.67)

Similarly, if T⁻¹ is the inverse of T, then we can define T^(−m), where m is a positive integer, as

T^(−m) ≜ (T⁻¹)^m = T⁻¹ T⁻¹ ... T⁻¹ (m times).  (3.4.68)

From these definitions it follows that

T^m T^n = T^(m+n) = T^n T^m.  (3.4.69)

In a similar fashion we have

(T^m)^n = T^(mn) = (T^n)^m  (3.4.70)

and

T^(−m) = (T⁻¹)^m = (T^m)⁻¹,  (3.4.71)

where m and n are positive integers. Consistent with this notation we also have

T^1 = T  (3.4.72)

and

T^0 = I.  (3.4.73)

We are now in a position to consider polynomials of linear transformations. Thus, if f(λ) is a polynomial, i.e.,

f(λ) = α0 + α1 λ + ... + αn λ^n,  (3.4.74)

where α0, ..., αn ∈ F, then by f(T) we mean

f(T) = α0 I + α1 T + ... + αn T^n.  (3.4.75)
The reader is cautioned that the above concept can, in general, not be extended to functions of two or more linear transformations, because linear transformations in general do not commute.
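Equation (3.4.75) can be sketched for a 2×2 matrix using Horner's rule, f(T) = α0 I + T(α1 I + T(... )); the matrix and coefficients are illustrative choices, not from the text.

```python
def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def poly(coeffs, T):
    # coeffs = [a0, a1, ..., an]; returns a0*I + a1*T + ... + an*T^n
    I = [[1.0, 0.0], [0.0, 1.0]]
    result = [[c * coeffs[-1] for c in row] for row in I]
    for a in reversed(coeffs[:-1]):
        result = add([[a * e for e in row] for row in I], mul(T, result))
    return result

T = [[0.0, 1.0], [0.0, 0.0]]        # nilpotent: T^2 = 0
f_T = poly([1.0, 2.0, 5.0], T)      # f(T) = I + 2T + 5T^2 = I + 2T
assert f_T == [[1.0, 2.0], [0.0, 1.0]]
```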
Next, we consider the important concept of isomorphic linear spaces. In Chapter 2 we encountered the notion of isomorphisms of groups and rings. We saw that such mappings, if they exist, preserve the algebraic properties of groups and rings. Thus, in many cases two algebraic systems (such as groups or rings) may differ only in the nature of the elements of the underlying set and may thus be considered as being the same in all other respects. We now extend this concept to linear spaces.
3.4.76. Definition. Let X and Y be vector spaces over the same field F. If there exists T ∈ L(X, Y) such that T is a one-to-one mapping of X into Y, then T is said to be an isomorphism of X into Y. If, in addition, T maps X onto Y, then X and Y are said to be isomorphic.

Note that if X and Y are isomorphic, then clearly Y and X are isomorphic.

Our next result shows that all n-dimensional linear spaces over the same field F are isomorphic.

3.4.77. Theorem. Every n-dimensional vector space X over a field F is isomorphic to F^n.

Proof. Let {e1, ..., en} be a basis for X. Then every x ∈ X has the unique representation

x = ξ1 e1 + ... + ξn en,

where {ξ1, ξ2, ..., ξn} is a unique set of scalars (belonging to F). Now let us define a linear transformation T from X into F^n by

Tx = (ξ1, ξ2, ..., ξn).

It is an easy matter to verify that T is a linear transformation of X onto F^n, and that it is one-to-one (the reader is invited to do so). Thus, X is isomorphic to F^n.

It is not difficult to establish the next result.

3.4.78. Theorem. Two finite-dimensional vector spaces X and Y over the same field F are isomorphic if and only if dim X = dim Y.

3.4.79. Exercise. Prove Theorem 3.4.78.

Theorem 3.4.77 points out the importance of the spaces R^n and C^n. Namely, every n-dimensional vector space over the field of real numbers is isomorphic to R^n, and every n-dimensional vector space over the field of complex numbers is isomorphic to C^n (see Example 3.1.10).
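The coordinate map of Theorem 3.4.77 can be sketched concretely for the space of real polynomials of degree less than 3 with basis {1, t, t²}: the map T(ξ0 + ξ1 t + ξ2 t²) = (ξ0, ξ1, ξ2) is a linear bijection onto R³. Polynomials are stored as coefficient lists, an illustrative representation.

```python
def T(p):
    # coordinates of the polynomial w.r.t. the basis {1, t, t^2}
    return tuple(p)

def T_inv(xi):
    # rebuild the polynomial from its coordinates
    return list(xi)

p, q = [1.0, 0.0, 2.0], [0.0, 3.0, -1.0]   # 1 + 2t^2 and 3t - t^2
# T is linear ...
padd = [a + b for a, b in zip(p, q)]
assert T(padd) == tuple(a + b for a, b in zip(T(p), T(q)))
# ... one-to-one and onto: T_inv recovers every polynomial from its coordinates
assert T_inv(T(p)) == p
assert T(T_inv((5.0, -2.0, 0.5))) == (5.0, -2.0, 0.5)
```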
3.5. LINEAR FUNCTIONALS

There is a special type of linear transformation which is so important that we give it a special name: linear functional.

We showed in Example 3.1.7 that if F is a field, then F^n is a vector space over F. If, in particular, n = 1, then we may view F as being a vector space over itself. This enables us to consider linear transformations of a vector space X over F into F.

3.5.1. Definition. Let X be a vector space over a field F. A mapping f of X into F is called a functional on X. If f is a linear transformation of X into F, then we call f a linear functional on X.

We cite some specific examples of linear functionals.

3.5.2. Example. Consider the space C[a, b]. Then the mapping

f1(x) = ∫_a^b x(s) ds, x ∈ C[a, b],  (3.5.3)

is a linear functional on C[a, b]. Also, the function defined by

f2(x) = x(s0), x ∈ C[a, b], s0 ∈ [a, b],  (3.5.4)

is also a linear functional on C[a, b]. Furthermore, the mapping

f3(x) = ∫_a^b x(s) x0(s) ds,  (3.5.5)

where x0 is a fixed element of C[a, b] and where x is any element in C[a, b], is also a linear functional on C[a, b].

3.5.6. Example. Let X = F^n, and denote x ∈ X by x = (ξ1, ..., ξn). The mapping f4 defined by

f4(x) = ξ1  (3.5.7)

is a linear functional on X. A more general form of f4 is as follows. Let a = (α1, ..., αn) ∈ X be fixed and let x = (ξ1, ..., ξn) be an arbitrary element of X. It is readily shown that the function

f5(x) = Σ_{i=1}^{n} αi ξi  (3.5.8)

is a linear functional on X.

3.5.9. Exercise. Show that the mappings (3.5.3), (3.5.4), (3.5.5), (3.5.7), and (3.5.8) are linear functionals.
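The functional f5 of Eq. (3.5.8) is easy to implement and check for linearity; the coefficient and test vectors below are arbitrary examples.

```python
def f5(a, x):
    # f5(x) = sum_i alpha_i * xi_i, a linear functional on R^n
    return sum(ai * xi for ai, xi in zip(a, x))

a = (2.0, -1.0, 3.0)
x, y = (1.0, 4.0, 0.0), (0.0, 2.0, -2.0)
assert f5(a, x) == -2.0
# linearity: f5(3x + 2y) = 3 f5(x) + 2 f5(y)
lhs = f5(a, [3 * u + 2 * v for u, v in zip(x, y)])
rhs = 3 * f5(a, x) + 2 * f5(a, y)
assert abs(lhs - rhs) < 1e-12
```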
Now let X be a linear space and let X^f denote the set of all linear functionals on X. If f ∈ X^f is evaluated at a point x ∈ X, we write f(x). Frequently we will also find the notation

f(x) ≜ ⟨x, f⟩  (3.5.10)

useful. In addition to Eq. (3.5.10), the notation x' for f is sometimes used. In this case Eq. (3.5.10) becomes

f(x) = ⟨x, f⟩ = ⟨x, x'⟩,  (3.5.11)

where x' is used in place of f. Now let f1, f2 belong to X^f, and let x ∈ X. Let us define f1 + f2 and αf by

(f1 + f2)(x) = ⟨x, f1 + f2⟩ ≜ ⟨x, f1⟩ + ⟨x, f2⟩ = f1(x) + f2(x)  (3.5.12)

and

(αf)(x) = ⟨x, αf⟩ ≜ α⟨x, f⟩ = αf(x),  (3.5.13)

respectively. We denote the functional f such that f(x) = ⟨x, f⟩ = 0 for all x ∈ X by 0. If f is a linear functional, then we note that

f(x1 + x2) = ⟨x1 + x2, f⟩ = ⟨x1, f⟩ + ⟨x2, f⟩ = f(x1) + f(x2),  (3.5.14)

and also,

f(αx) = ⟨αx, f⟩ = α⟨x, f⟩ = αf(x).  (3.5.15)

It is now a simple matter to prove the following:

3.5.16. Theorem. The space X^f with vector addition and multiplication of vectors by scalars defined by equations (3.5.12) and (3.5.13), respectively, is a vector space over F.

3.5.17. Exercise. Prove Theorem 3.5.16.

3.5.18. Definition. The linear space X^f is called the algebraic conjugate of X.
Let us now examine some of the properties of X^f for the case of finite-dimensional linear spaces. We have:

3.5.19. Theorem. Let X be a finite-dimensional vector space, and let {e1, ..., en} be a basis for X. If {α1, ..., αn} is an arbitrary set of scalars, then there is a unique linear functional x' ∈ X^f such that ⟨ei, x'⟩ = αi for i = 1, ..., n.

Proof. For every x ∈ X, we have

x = ξ1 e1 + ξ2 e2 + ... + ξn en.

Now let x' ∈ X^f be given by

⟨x, x'⟩ = Σ_{i=1}^{n} αi ξi.

If x = ei for some i, we have ξi = 1 and ξj = 0 if j ≠ i. Thus, ⟨ei, x'⟩ = αi for i = 1, ..., n. To show that x' is unique, suppose there is an x'' ∈ X^f such that ⟨ei, x''⟩ = αi for i = 1, ..., n. It then follows that ⟨ei, x'⟩ − ⟨ei, x''⟩ = 0 for i = 1, ..., n, and so ⟨ei, x' − x''⟩ = 0 for i = 1, ..., n. This implies x' − x'' = 0; i.e., x' = x''.

In our next result and on several other occasions throughout this book, we make use of the Kronecker delta.

3.5.20. Definition. Let

δij = 1 if i = j, and δij = 0 if i ≠ j,  (3.5.21)

for i, j = 1, ..., n. Then δij is called the Kronecker delta.

We now have:
3.5.22. Theorem. Let X be a finite-dimensional vector space. If {e1, e2, ..., en} is a basis for X, then there is a unique basis {e'1, e'2, ..., e'n} in X^f with the property that ⟨ei, e'j⟩ = δij. From this it follows that if X is n-dimensional, then so is X^f.

Proof. From Theorem 3.5.19 it follows that for each j = 1, ..., n, a unique e'j ∈ X^f can be found such that ⟨ei, e'j⟩ = δij. Thus, we only have to show that the set {e'1, e'2, ..., e'n} is a linearly independent set which spans X^f.

To show that {e'1, e'2, ..., e'n} is linearly independent, let

β1 e'1 + β2 e'2 + ... + βn e'n = 0.

Then

0 = ⟨ei, Σ_{j=1}^{n} βj e'j⟩ = Σ_{j=1}^{n} βj ⟨ei, e'j⟩ = Σ_{j=1}^{n} βj δij = βi,

and therefore we have β1 = β2 = ... = βn = 0. This proves that {e'1, e'2, ..., e'n} is a linearly independent set.

To show that the set {e'1, e'2, ..., e'n} spans X^f, let x' ∈ X^f and define αi = ⟨ei, x'⟩. Let x = Σ_{i=1}^{n} ξi ei. We then have

⟨x, x'⟩ = ⟨ξ1 e1 + ... + ξn en, x'⟩ = ξ1⟨e1, x'⟩ + ... + ξn⟨en, x'⟩ = ξ1 α1 + ... + ξn αn.

Also,

⟨x, e'j⟩ = ⟨Σ_{i=1}^{n} ξi ei, e'j⟩ = ξj.

Combining the above relations we now have

⟨x, x'⟩ = α1⟨x, e'1⟩ + ... + αn⟨x, e'n⟩ = ⟨x, α1 e'1 + ... + αn e'n⟩.

From this it now follows that for any x' ∈ X^f we have

x' = α1 e'1 + ... + αn e'n,

which proves our theorem.

The previous result motivates the following definition.

3.5.23. Definition. The basis {e'1, e'2, ..., e'n} of X^f in Theorem 3.5.22 is called the dual basis of {e1, e2, ..., en}.
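A computational sketch of Theorem 3.5.22 on R²: for the basis e1 = (1, 1), e2 = (1, 2) (an arbitrary example), the dual basis functionals are given by the rows of the inverse of the matrix whose columns are e1 and e2, since that choice yields ⟨ei, e'j⟩ = δij.

```python
e1, e2 = (1.0, 1.0), (1.0, 2.0)
a, b, c, d = e1[0], e2[0], e1[1], e2[1]   # matrix [[a, b], [c, d]] with columns e1, e2
det = a * d - b * c
rows = [(d / det, -b / det), (-c / det, a / det)]  # rows of the inverse matrix

def dual(j, x):
    # e'_j evaluated at x: the j-th coordinate of x w.r.t. the basis {e1, e2}
    return rows[j][0] * x[0] + rows[j][1] * x[1]

# <e_i, e'_j> = delta_ij
for i, ei in enumerate((e1, e2)):
    for j in range(2):
        assert dual(j, ei) == (1.0 if i == j else 0.0)
```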
We are now in a position to consider the algebraic transpose of a linear transformation. Let S be a linear transformation of a linear space X into a linear space Y, and let X^f and Y^f denote the algebraic conjugates of X and Y, respectively (the spaces X and Y need not be finite dimensional). For each y' ∈ Y^f let us establish a correspondence with an element x' ∈ X^f according to the rule

x'(x) = ⟨x, x'⟩ = ⟨Sx, y'⟩ = y'(Sx),  (3.5.24)

where x ∈ X. Let us denote the mapping defined in this way by S^T: x' = S^T y', and let us rewrite Eq. (3.5.24) as

⟨x, S^T y'⟩ = ⟨Sx, y'⟩, x ∈ X, y' ∈ Y^f,  (3.5.25)

to define S^T. It should be noted that if S is a mapping of X into Y, then S^T is a mapping of Y^f into X^f, as depicted in Figure 3.5.26. We now state the following formal definition.

3.5.26. Figure. Transpose of a linear transformation (S: X → Y; S^T: Y^f → X^f).

3.5.27. Definition. Let S be a linear transformation of a linear space X into a linear space Y over the same field F, and let X^f and Y^f denote the algebraic conjugates of X and Y, respectively. A transformation S^T from Y^f into X^f such that

⟨x, S^T y'⟩ = ⟨Sx, y'⟩

for all x ∈ X and all y' ∈ Y^f is called the (algebraic) transpose of S.
We now show that S^T is a linear transformation.

3.5.28. Theorem. Let S ∈ L(X, Y), and let S^T be the transpose of S. Then S^T is a linear transformation from Y^f into X^f.

Proof. Let y'1, y'2 ∈ Y^f. Then for all x ∈ X,

⟨x, S^T(y'1 + y'2)⟩ = ⟨Sx, y'1 + y'2⟩ = ⟨Sx, y'1⟩ + ⟨Sx, y'2⟩ = ⟨x, S^T y'1⟩ + ⟨x, S^T y'2⟩.

Thus, S^T(y'1 + y'2) = S^T(y'1) + S^T(y'2). Also,

⟨x, S^T(αy'1)⟩ = ⟨Sx, αy'1⟩ = α⟨Sx, y'1⟩ = α⟨x, S^T y'1⟩.

Hence, S^T(αy'1) = αS^T(y'1). Therefore, S^T ∈ L(Y^f, X^f).

The reader should now have no difficulties in proving the following results.

3.5.29. Theorem. Let R, S ∈ L(X, Y), and let T ∈ L(Y, Z). Let R^T, S^T, and T^T be the transpose transformations of R, S, and T, respectively. Then,

(i) (R + S)^T = R^T + S^T; and
(ii) (TS)^T = S^T T^T.

3.5.30. Theorem. Let I denote the identity element of L(X, X). Then I^T is the identity element of L(X^f, X^f).

3.5.31. Theorem. Let 0 be the null transformation in L(X, Y). Then 0^T is the null transformation in L(Y^f, X^f).

3.5.32. Exercise. Prove Theorems 3.5.29–3.5.31.

We will consider an important class of transpose linear transformations in Chapter 4 (transpose of a matrix).
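On R^n every linear functional has the form of Eq. (3.5.8), so a functional can be stored as a coefficient vector and S^T becomes the ordinary matrix transpose. A sketch checking the defining identity ⟨x, S^T y'⟩ = ⟨Sx, y'⟩ for an arbitrary 2×3 example matrix:

```python
S = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 3.0]]                       # S: R^3 -> R^2

def apply(M, x):
    # matrix-vector product
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def pair(x, f):
    # <x, f> = f(x), with f stored as a coefficient vector
    return sum(u * v for u, v in zip(x, f))

ST = [[S[i][j] for i in range(2)] for j in range(3)]  # the transpose: maps (R^2)^f to (R^3)^f

x = [1.0, -1.0, 2.0]
yp = [4.0, 5.0]                             # a functional y' on R^2
assert pair(x, apply(ST, yp)) == pair(apply(S, x), yp)
```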
3.6. BILINEAR FUNCTIONALS

In the present section we introduce the notion of bilinear functional and examine some of the properties of this concept. Throughout the present section we concern ourselves only with real vector spaces or complex vector spaces. Thus, if X is a linear space over a field F, it will be assumed that F is either the field of real numbers, R, or the field of complex numbers, C.

3.6.1. Definition. Let X be a vector space over C. A mapping g from X into C is said to be a conjugate functional if

g(αx + βy) = ᾱg(x) + β̄g(y)  (3.6.2)

for all x, y ∈ X and for all α, β ∈ C, where ᾱ denotes the complex conjugate of α and β̄ denotes the complex conjugate of β.

If in Definition 3.6.1 the complex vector space X is replaced by a real linear space, then the concept of conjugate functional reduces to that of linear functional, for in this case Eq. (3.6.2) assumes the form

g(αx + βy) = αg(x) + βg(y)  (3.6.3)

for all x, y ∈ X and for all α, β ∈ R.

3.6.4. Definition. Let X be a vector space over C. A mapping g of X × X into C is called a bilinear functional or a bilinear form if

(i) for each fixed y, g(x, y) is a linear functional in x; and
(ii) for each fixed x, g(x, y) is a conjugate functional in y.

Thus, if g is a bilinear functional, then

(a) g(αx + βy, z) = αg(x, z) + βg(y, z); and
(b) g(x, αy + βz) = ᾱg(x, y) + β̄g(x, z)

for all x, y, z ∈ X and for all α, β ∈ C.

For the case of real linear spaces the definition of bilinear functional is modified in an obvious way by deleting in Definition 3.6.4 the symbol for complex conjugates.

We leave it as an exercise to verify that the examples cited below are bilinear functionals.
3.6.5. Example. Let x, y ∈ C², where C² denotes the linear space of ordered pairs of complex numbers (if x, y ∈ C², then x = (ξ1, ξ2) and y = (η1, η2)). The function

g(x, y) = ξ1 η̄1 + ξ2 η̄2

is a bilinear functional.

3.6.6. Example. Let x, y ∈ R², where R² denotes the linear space of ordered pairs of real numbers (if x, y ∈ R², then x = (ξ1, ξ2) and y = (η1, η2)). Let θ denote the angle between x, y ∈ R². The dot product of two vectors, defined by

g(x, y) = ξ1 η1 + ξ2 η2 = (ξ1² + ξ2²)^(1/2) (η1² + η2²)^(1/2) cos θ,

is a bilinear functional.

3.6.7. Example. Let X be an arbitrary linear space over C, and let φ(x) and ψ(y) denote two linear functionals on X. The transformation

g(x, y) = φ(x)ψ̄(y)

is a bilinear functional.

3.6.8. Example. Let X be any linear space over C, and let g be a bilinear functional. The transformation h defined by

h(x, y) = ḡ(y, x)

is a bilinear functional.

3.6.9. Exercise. Verify that the transformations given in Examples 3.6.5 through 3.6.8 are bilinear functionals.

We note that for any bilinear functional g, we have g(0, y) = g(0·0, y) = 0·g(0, y) = 0 for all y ∈ X. Also, g(x, 0) = 0 for all x ∈ X.
Frequently, we find it convenient to impose certain restrictions on bilinear functionals.

3.6.10. Definition. Let X be a complex linear space. A bilinear functional g is said to be symmetric if g(x, y) = ḡ(y, x) for all x, y ∈ X. If g(x, x) ≥ 0 for all x ∈ X, then g is said to be positive. If g(x, x) > 0 for all x ≠ 0, then g is said to be strictly positive.

3.6.11. Definition. Let X be a complex vector space, and let g be a bilinear functional. We call the function ĝ: X → C defined by

ĝ(x) = g(x, x)

for all x ∈ X the quadratic form induced by g (we frequently omit the phrase "induced by g").

For example, if g(x, y) = ξ1 η̄1 + ξ2 η̄2 as in Example 3.6.5, then ĝ(x) = ξ1 ξ̄1 + ξ2 ξ̄2 = |ξ1|² + |ξ2|². This is a quadratic form as studied in analytic geometry.

For real linear spaces, Definitions 3.6.10 and 3.6.11 are again modified in an obvious way by ignoring complex conjugates.
3.6.12. Theorem. If ĝ is the quadratic form induced by a bilinear functional g, then

g(x, y) + g(y, x) = ½ ĝ(x + y) − ½ ĝ(x − y).

Proof. By direct expansion we have

½ ĝ(x + y) = ½ g(x + y, x + y) = ½ [g(x, x) + g(x, y) + g(y, x) + g(y, y)],

and also,

½ ĝ(x − y) = ½ [g(x, x) − g(x, y) − g(y, x) + g(y, y)].

Thus,

g(x, y) + g(y, x) = ½ ĝ(x + y) − ½ ĝ(x − y).
Our next result is commonly referred to as polarization.

3.6.13. Theorem. If ĝ is the quadratic form induced by a bilinear form g on a complex vector space X, then

g(x, y) = ¼ ĝ(x + y) − ¼ ĝ(x − y) + (i/4) ĝ(x + iy) − (i/4) ĝ(x − iy)  (3.6.14)

for every x, y ∈ X (here i = √−1).

Proof. From the proof of the last theorem we have

¼ ĝ(x + y) = ¼ [g(x, x) + g(x, y) + g(y, x) + g(y, y)]

and

¼ ĝ(x − y) = ¼ [g(x, x) − g(x, y) − g(y, x) + g(y, y)].

Also,

(i/4) ĝ(x + iy) = (i/4) [g(x, x) − i g(x, y) + i g(y, x) + g(y, y)]

and

(i/4) ĝ(x − iy) = (i/4) [g(x, x) + i g(x, y) − i g(y, x) + g(y, y)].

After combining the above four expressions, Eq. (3.6.14) results.

The reader can prove the next result readily.

3.6.15. Theorem. Let X be a complex vector space. If two bilinear functionals g and h are such that ĝ = ĥ, then g = h.

3.6.16. Exercise. Prove Theorem 3.6.15.
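The polarization formula (3.6.14) can be checked numerically for the bilinear functional of Example 3.6.5, g(x, y) = ξ1 η̄1 + ξ2 η̄2 on C²; the sample vectors are arbitrary.

```python
def g(x, y):
    # linear in x, conjugate-linear in y
    return x[0] * y[0].conjugate() + x[1] * y[1].conjugate()

def ghat(x):
    # the induced quadratic form, Definition 3.6.11
    return g(x, x)

def comb(x, y, c):
    # x + c*y, componentwise
    return tuple(u + c * v for u, v in zip(x, y))

x, y = (1 + 2j, -1j), (3 - 1j, 2 + 2j)
rhs = (ghat(comb(x, y, 1)) - ghat(comb(x, y, -1))) / 4 \
    + 1j * (ghat(comb(x, y, 1j)) - ghat(comb(x, y, -1j))) / 4
assert abs(g(x, y) - rhs) < 1e-12   # Eq. (3.6.14)
```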
For symmetric bilinear functionals we have:

3.6.17. Theorem. A bilinear functional g on a complex vector space X is symmetric if and only if $\hat{g}$ is real (i.e., $\hat{g}(x)$ is real for all $x \in X$).

Proof. Suppose that g is symmetric; i.e., suppose that
$$g(x, y) = \overline{g(y, x)}$$
for all $x, y \in X$. Setting $x = y$, we obtain
$$\hat{g}(x) = g(x, x) = \overline{g(x, x)} = \overline{\hat{g}(x)}$$
for all $x \in X$. But this implies that $\hat{g}$ is real.

Conversely, if $\hat{g}(x)$ is real for all $x \in X$, then for $h(x, y) = \overline{g(y, x)}$ we have $\hat{h}(x) = \overline{g(x, x)} = g(x, x) = \hat{g}(x)$. Since $\hat{h} = \hat{g}$, it now follows from Theorem 3.6.15 that $h = g$, and thus
$$g(x, y) = \overline{g(y, x)}.$$

Note that Theorems 3.6.13, 3.6.15, and 3.6.17 hold only for complex vector spaces. Theorem 3.6.15 implies that a bilinear form is uniquely determined by its induced quadratic form, and Theorem 3.6.13 gives an explicit connection between g and $\hat{g}$. In the case of real spaces, these conclusions do not follow.
3.6.18. Example. Let $X = R^2$ with $x = (\xi_1, \xi_2) \in R^2$ and $y = (\eta_1, \eta_2) \in R^2$. Define the bilinear functionals g and h by
$$g(x, y) = \xi_1\eta_1 + 2\xi_2\eta_1 + 4\xi_1\eta_2 + 2\xi_2\eta_2$$
and
$$h(x, y) = \xi_1\eta_1 + 3\xi_2\eta_1 + 3\xi_1\eta_2 + 2\xi_2\eta_2.$$
Then $\hat{g}(x) = \hat{h}(x)$, but $g \neq h$. Note that h is symmetric whereas g is not.
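A quick sketch confirming the claims of Example 3.6.18, using the cross-term coefficients as reconstructed above:

```python
# Example 3.6.18: g and h induce the same quadratic form although
# g != h; h is symmetric, g is not.

def g(x, y):
    return x[0]*y[0] + 2*x[1]*y[0] + 4*x[0]*y[1] + 2*x[1]*y[1]

def h(x, y):
    return x[0]*y[0] + 3*x[1]*y[0] + 3*x[0]*y[1] + 2*x[1]*y[1]

for x in [(1, 0), (0, 1), (2, -3), (0.5, 7)]:
    assert g(x, x) == h(x, x)                    # ghat == hhat
assert g((1, 0), (0, 1)) != h((1, 0), (0, 1))    # but g != h
assert h((1, 0), (0, 1)) == h((0, 1), (1, 0))    # h is symmetric
assert g((1, 0), (0, 1)) != g((0, 1), (1, 0))    # g is not
```

This is exactly the failure of Theorem 3.6.15 on a real space: the cross terms of g and h differ, yet both collapse to $\xi_1^2 + 6\xi_1\xi_2 + 2\xi_2^2$ on the diagonal.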
Using bilinear functionals, we now introduce the very important concept of inner product.

3.6.19. Definition. A strictly positive, symmetric bilinear functional g on a complex linear space X is called an inner product.

For the case of real linear spaces, the definition of inner product is identical to the above definition.

Since in a given discussion the particular bilinear functional g is always specified, we will write $(x, y)$ in place of $g(x, y)$ to denote an inner product. Utilizing this notation, the inner product can alternatively be defined as a rule which assigns a scalar $(x, y)$ to every $x, y \in X$ (X is a complex vector space), having the following properties:

(i) $(x, x) > 0$ for all $x \neq 0$ and $(x, x) = 0$ if $x = 0$;
(ii) $(x, y) = \overline{(y, x)}$ for all $x, y \in X$;
(iii) $(\alpha x + \beta y, z) = \alpha(x, z) + \beta(y, z)$ for all $x, y, z \in X$ and for all $\alpha, \beta \in C$; and
(iv) $(x, \alpha y + \beta z) = \bar{\alpha}(x, y) + \bar{\beta}(x, z)$ for all $x, y, z \in X$ and for all $\alpha, \beta \in C$.

In the case of real linear spaces, the preceding characterization of inner product is identical, except, of course, that we omit conjugates in (i)-(iv).

We are now in a position to introduce the concept of inner product space.

3.6.20. Definition. A complex (real) linear space X on which a complex (real) inner product, $(\cdot, \cdot)$, is defined is called a complex (real) inner product space. In general, we denote this space by $\{X; (\cdot, \cdot)\}$. If the particular inner product is understood, we simply write X to denote such a space (and we usually speak of an inner product space rather than a complex or real inner product space).
It should be noted that if two different inner products are defined on the same linear space X, say $(\cdot, \cdot)_1$ and $(\cdot, \cdot)_2$, then we have two different inner product spaces, namely, $\{X; (\cdot, \cdot)_1\}$ and $\{X; (\cdot, \cdot)_2\}$.

Now let $\{X; (\cdot, \cdot)\}$ be an inner product space, let Y be a linear subspace of X, and let $(\cdot, \cdot)'$ denote the inner product on Y induced by the inner product on X; i.e.,
$$(x, y) = (x, y)' \qquad (3.6.21)$$
for all $x, y \in Y$. Then $\{Y; (\cdot, \cdot)'\}$ is an inner product space in its own right, and we say that Y is an inner product subspace of X.

Using the concept of inner product, we are in a position to introduce the notion of orthogonality. We have:

3.6.22. Definition. Let X be an inner product space. The vectors $x, y \in X$ are said to be orthogonal if $(x, y) = 0$. In this case we write $x \perp y$. If a vector $x \in X$ is orthogonal to every vector of a set $Y \subset X$, then x is said to be orthogonal to set Y, and we write $x \perp Y$. If every vector of set $Y \subset X$ is orthogonal to every vector of set $Z \subset X$, then set Y is said to be orthogonal to set Z, and we write $Y \perp Z$.

Clearly, if x is orthogonal to y, then y is orthogonal to x. Note that if $x \neq 0$, then it is not possible that $x \perp x$, because $(x, x) > 0$ for all $x \neq 0$. Also note that $0 \perp x$ for all $x \in X$.
Before closing the present section, let us consider a few specific examples.

3.6.23. Example. Let $X = R^n$. For $x = (\xi_1, \ldots, \xi_n) \in R^n$ and $y = (\eta_1, \ldots, \eta_n) \in R^n$, we can readily verify that
$$(x, y) = \sum_{i=1}^{n} \xi_i \eta_i$$
is an inner product, and $\{R^n; (\cdot, \cdot)\}$ is a real inner product space.

3.6.24. Example. Let $X = C^n$. For $x = (\xi_1, \ldots, \xi_n) \in C^n$ and $y = (\eta_1, \ldots, \eta_n) \in C^n$, let
$$(x, y) = \sum_{i=1}^{n} \xi_i \bar{\eta}_i.$$
Then $(x, y)$ is an inner product and $\{C^n; (\cdot, \cdot)\}$ is a complex inner product space.

3.6.25. Example. Let X denote the space of continuous complex-valued functions on the interval [0, 1]. The reader can readily show that for $f, g \in X$,
$$(f, g) = \int_0^1 f(t)\overline{g(t)}\, dt$$
is an inner product. Now consider the family of functions $f_n$ defined by
$$f_n(t) = e^{i 2\pi n t}, \qquad t \in [0, 1],$$
$n = 0, \pm 1, \pm 2, \ldots$. Clearly, $f_n \in X$ for all n. It is easily shown that $(f_m, f_n) = 0$ if $m \neq n$. Thus, $f_m \perp f_n$ if $m \neq n$.
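The orthogonality claim in Example 3.6.25 can be checked numerically. The sketch below approximates the integral with a uniform N-point rectangle rule, which happens to integrate these periodic exponentials essentially exactly whenever $|m - n| < N$; the parameter choices are illustrative.

```python
# Check (f_m, f_n) = 0 for m != n, where f_n(t) = exp(i*2*pi*n*t)
# on [0, 1], using a uniform N-point rectangle rule.
import cmath

def inner(m, n, N=64):
    # (f_m, f_n) = integral over [0,1] of f_m(t) * conj(f_n(t)) dt
    total = 0j
    for j in range(N):
        t = j / N
        total += cmath.exp(2j * cmath.pi * m * t) * cmath.exp(-2j * cmath.pi * n * t)
    return total / N

assert abs(inner(3, 5)) < 1e-12        # orthogonal when m != n
assert abs(inner(-2, 4)) < 1e-12
assert abs(inner(7, 7) - 1) < 1e-12    # (f_n, f_n) = 1
```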
3.7. PROJECTIONS

In the present section we consider another special class of linear transformations, called projections. Such transformations, which utilize direct sums (introduced in Section 3.2) as their natural setting, will find wide applications in later parts of this book.

3.7.1. Definition. Let X be the direct sum of linear spaces $X_1$ and $X_2$; i.e., let $X = X_1 \oplus X_2$. Let $x = x_1 + x_2$ be the unique representation of $x \in X$, where $x_1 \in X_1$ and $x_2 \in X_2$. We say that the projection on $X_1$ along $X_2$ is the transformation P defined by
$$P(x) = x_1.$$
Referring to Figure 3.7.2, we note that elements x in the plane can uniquely be represented as $x = x_1 + x_2$, where $x_1 \in X_1$ and $x_2 \in X_2$ ($X_1$ and $X_2$ are one-dimensional linear spaces represented by the indicated lines intersecting at the origin 0). In this case, a projection P can be defined as that transformation which maps every point x in the plane onto the subspace $X_1$ along the subspace $X_2$.

[3.7.2. Figure. Projection on $X_1$ along $X_2$.]
3.7.3. Theorem. Let X be the direct sum of two linear subspaces $X_1$ and $X_2$, and let P be the projection on $X_1$ along $X_2$. Then

(i) $P \in L(X, X)$;
(ii) $R(P) = X_1$; and
(iii) $N(P) = X_2$.

Proof. To prove the first part, note that if $x = x_1 + x_2$ and $y = y_1 + y_2$, where $x_1, y_1 \in X_1$ and $x_2, y_2 \in X_2$, then clearly
$$P(\alpha x + \beta y) = P[(\alpha x_1 + \beta y_1) + (\alpha x_2 + \beta y_2)] = \alpha x_1 + \beta y_1 = \alpha P(x) + \beta P(y),$$
and therefore P is a linear transformation.

To prove the second part of the theorem, we note that from the definition of P it follows that $R(P) \subset X_1$. Now assume that $x_1 \in X_1$. Then $Px_1 = x_1$, and thus $x_1 \in R(P)$. This implies that $X_1 \subset R(P)$ and proves that $R(P) = X_1$.

To prove the last part of the theorem, let $x_2 \in X_2$. Then $Px_2 = 0$, so that $X_2 \subset N(P)$. On the other hand, if $x \in N(P)$, then $Px = 0$. Since $x = x_1 + x_2$, where $x_1 \in X_1$ and $x_2 \in X_2$, it follows that $x_1 = 0$ and $x \in X_2$. Thus, $X_2 \supset N(P)$. Therefore, $X_2 = N(P)$.
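As an illustration (not taken from the text), take $X = R^2$ with $X_1$ the line spanned by $(1, 0)$ and $X_2$ the line spanned by $(1, 1)$. Writing $x = (a, b)$ as $x = (a - b)(1, 0) + b(1, 1)$ gives $P(x) = (a - b, 0)$, and parts (i)-(iii) of Theorem 3.7.3 can be checked directly:

```python
# Projection on X1 = span{(1,0)} along X2 = span{(1,1)} in R^2.
# For x = (a, b): x = (a - b)*(1, 0) + b*(1, 1), so P(x) = (a - b, 0).

def P(x):
    a, b = x
    return (a - b, 0.0)

x = (5.0, 2.0)
assert P(x) == (3.0, 0.0)            # range of P lies in X1
assert P(P(x)) == P(x)               # P is idempotent
assert P((4.0, 4.0)) == (0.0, 0.0)   # X2 is the null space of P

# linearity spot-check
u, v = (1.0, -2.0), (0.5, 3.0)
s = (u[0] + v[0], u[1] + v[1])
assert P(s) == (P(u)[0] + P(v)[0], P(u)[1] + P(v)[1])
```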
Our next result enables us to characterize projections in an alternative way.

3.7.4. Theorem. Let $P \in L(X, X)$. Then P is a projection on R(P) along N(P) if and only if $PP = P^2 = P$.

Proof. Assume that P is the projection on the linear subspace $X_1$ of X along the linear subspace $X_2$, where $X = X_1 \oplus X_2$. By the preceding theorem, $X_1 = R(P)$ and $X_2 = N(P)$. For $x \in X$, we have $x = x_1 + x_2$, where $x_1 \in X_1$ and $x_2 \in X_2$. Then
$$P^2 x = P(Px) = Px_1 = x_1 = Px,$$
and thus $P^2 = P$.

Conversely, let us assume that $P^2 = P$. Let $X_2 = N(P)$ and let $X_1 = R(P)$. Clearly, N(P) and R(P) are linear subspaces of X. We must show that $X = R(P) \oplus N(P) = X_1 \oplus X_2$. In particular, we must show that $R(P) \cap N(P) = \{0\}$ and that R(P) and N(P) span X.

Now if $y \in R(P)$ there exists an $x \in X$ such that $Px = y$. Thus, $P^2 x = Py = Px = y$. If $y \in N(P)$, then $Py = 0$. Thus, if y is in both R(P) and N(P), then we must have $y = 0$; i.e., $R(P) \cap N(P) = \{0\}$.

Next, let x be an arbitrary element in X. Then we have
$$x = Px + (I - P)x.$$
Letting $Px = x_1$ and $(I - P)x = x_2$, we have $Px_1 = P^2 x = Px = x_1$ and also $Px_2 = P(I - P)x = Px - P^2 x = Px - Px = 0$; i.e., $x_1 \in X_1$ and $x_2 \in X_2$. From this it follows that $X = X_1 \oplus X_2$ and that the projection on $X_1$ along $X_2$ is P.
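The characterization $P^2 = P$ is easy to test for a matrix acting on $R^2$. The matrix below is an illustrative choice; it is idempotent but not symmetric, so it is a projection, though not an orthogonal one in the sense introduced at the end of this section.

```python
# An idempotent 2x2 matrix: P*P = P, hence a projection by the
# theorem above. It projects onto span{(1,0)} along span{(1,-1)}.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

P = [[1, 1],
     [0, 0]]
assert matmul(P, P) == P                       # P^2 = P

# (I - P) is the complementary projection
I = [[1, 0], [0, 1]]
Q = [[I[i][j] - P[i][j] for j in range(2)] for i in range(2)]
assert matmul(Q, Q) == Q
```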
The preceding result gives rise to the following:

3.7.5. Definition. Let $P \in L(X, X)$. Then P is said to be idempotent if $P^2 = P$.

Now let P be the projection on a linear subspace $X_1$ along a linear subspace $X_2$. Then the projection on $X_2$ along $X_1$ is characterized in the following way.

3.7.6. Theorem. A linear transformation P is a projection on a linear subspace if and only if $(I - P)$ is a projection. If P is the projection on $X_1$ along $X_2$, then $(I - P)$ is the projection on $X_2$ along $X_1$.

3.7.7. Exercise. Prove Theorem 3.7.6.

In view of the preceding results there is no ambiguity in simply saying a transformation P is a projection (rather than P is a projection on $X_1$ along $X_2$).

We emphasize here that if P is a projection, then
$$X = R(P) \oplus N(P). \qquad (3.7.8)$$
This is not necessarily the case for arbitrary linear transformations $T \in L(X, X)$, for, in general, R(T) and N(T) need not be disjoint. For example, if there exists a vector $x \in X$ such that $Tx \neq 0$ and such that $T^2 x = 0$, then $Tx \in R(T)$ and $Tx \in N(T)$.
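A concrete instance of the last remark (an illustrative choice, not from the text) is the nilpotent shift on $R^2$:

```python
# A transformation whose range and null space overlap: T below
# satisfies T*x != 0 but T*(T*x) = 0, so the vector T*x lies in
# both R(T) and N(T), and (3.7.8) fails for T.

def apply(T, x):
    return (T[0][0]*x[0] + T[0][1]*x[1],
            T[1][0]*x[0] + T[1][1]*x[1])

T = [[0, 1],
     [0, 0]]
x = (0, 1)
Tx = apply(T, x)
assert Tx != (0, 0)             # Tx is a nonzero element of R(T)
assert apply(T, Tx) == (0, 0)   # and Tx is also in N(T)
```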
Let us now consider:

3.7.9. Definition. Let $T \in L(X, X)$. A linear subspace Y of a vector space X is said to be invariant under the linear transformation T if $y \in Y$ implies that $Ty \in Y$.

Note that this definition does not imply that every element in Y can be written in the form Ty, with $y \in Y$. It is not even assumed that $Ty \in Y$ implies $y \in Y$.

For invariant subspaces under a transformation $T \in L(X, X)$ we can readily prove the following result.

3.7.10. Theorem. Let $T \in L(X, X)$. Then
(i) X is an invariant subspace under T;
(ii) $\{0\}$ is an invariant subspace under T;
(iii) R(T) is an invariant subspace under T; and
(iv) N(T) is an invariant subspace under T.

3.7.11. Exercise. Prove Theorem 3.7.10.

Next we consider:

3.7.12. Definition. Let X be a linear space which is the direct sum of two linear subspaces Y and Z; i.e., $X = Y \oplus Z$. If Y and Z are both invariant under a linear transformation T, then T is said to be reduced by Y and Z.
We are now in a position to prove the following result.

3.7.13. Theorem. Let Y and Z be two linear subspaces of a vector space X such that $X = Y \oplus Z$. Let $T \in L(X, X)$. Then T is reduced by Y and Z if and only if $PT = TP$, where P is the projection on Y along Z.

Proof. Assume that $PT = TP$. If $y \in Y$, then $Ty = TPy = PTy$, so that $Ty \in Y$ and Y is invariant under T. Now let $y \in Z$. Then $Py = 0$ and $PTy = TPy = T0 = 0$. Thus, $Ty \in Z$ and Z is also invariant under T. Hence, T is reduced by Y and Z.

Conversely, let us assume that T is reduced by Y and Z. If $x \in X$, then $x = y + z$, where $y \in Y$ and $z \in Z$. Then $Px = y$ and $TPx = Ty \in Y$. Hence, $PTPx = Ty = TPx$; i.e.,
$$PTPx = TPx \qquad (3.7.14)$$
for all $x \in X$. On the other hand, since Y and Z are invariant under T, we have $Tx = Ty + Tz$, with $Ty \in Y$ and $Tz \in Z$. Hence, $PTx = Ty = PTPx$; i.e.,
$$PTPx = PTx \qquad (3.7.15)$$
for all $x \in X$. Equations (3.7.14) and (3.7.15) imply that $PT = TP$.
We close the present section by considering the following special type of projection.

3.7.16. Definition. A projection P on an inner product space X is said to be an orthogonal projection if the range of P and the null space of P are orthogonal; i.e., if $R(P) \perp N(P)$.

We will consider examples and additional properties of projections in much greater detail in Chapters 4 and 7.
3.8. NOTES AND REFERENCES

The material of the present chapter as well as that of the next chapter is usually referred to as linear algebra. Thus, these two chapters should be viewed as one package. For this reason, applications (dealing with ordinary differential equations) are presented at the end of the next chapter.

There are many textbooks and reference works dealing with vector spaces and linear transformations. Some of these which we have found to be very useful are cited in the references for this chapter. The reader should consult these for further study.

REFERENCES

3.1 P. R. HALMOS, Finite-Dimensional Vector Spaces. Princeton, N.J.: D. Van Nostrand Company, Inc., 1958.
3.2 K. HOFFMAN and R. KUNZE, Linear Algebra. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1971.
3.3 A. W. NAYLOR and G. R. SELL, Linear Operator Theory in Engineering and Science. New York: Holt, Rinehart and Winston, 1971.
3.4 A. E. TAYLOR, Introduction to Functional Analysis. New York: John Wiley & Sons, Inc., 1966.
4

FINITE-DIMENSIONAL VECTOR SPACES AND MATRICES

In the present chapter we examine some of the properties of finite-dimensional linear spaces. We will show how elements of such spaces are represented by coordinate vectors and how linear transformations on such spaces are represented by means of matrices. We then will study some of the important properties of matrices. Also, we will investigate in some detail a special type of vector space, called the Euclidean space. This space is one of the most important spaces encountered in applied mathematics.

Throughout this chapter, $\alpha_1, \ldots, \alpha_n, \ldots \in F$ and $x_1, \ldots, x_n, \ldots \in X$ denote an indexed set of scalars and an indexed set of vectors, respectively.
4.1. COORDINATE REPRESENTATION OF VECTORS

Let X be a finite-dimensional linear space over a field F, and let $\{x_1, \ldots, x_n\}$ be a basis for X. Now if $x \in X$, then according to Theorem 3.3.25 and Definition 3.3.36, there exist unique scalars $\xi_1, \ldots, \xi_n$, called coordinates of x with respect to this basis, such that
$$x = \xi_1 x_1 + \cdots + \xi_n x_n. \qquad (4.1.1)$$

This enables us to represent x unambiguously in terms of its coordinates as
$$\bar{x} = \begin{bmatrix} \xi_1 \\ \vdots \\ \xi_n \end{bmatrix} \qquad (4.1.2)$$
or as
$$\bar{x}^T = (\xi_1, \ldots, \xi_n). \qquad (4.1.3)$$

We call $\bar{x}$ (or $\bar{x}^T$) the coordinate representation of the underlying object (vector) x with respect to the basis $\{x_1, \ldots, x_n\}$. We call $\bar{x}$ a column vector and $\bar{x}^T$ a row vector. Also, we say that $\bar{x}^T$ is the transpose vector, or simply the transpose, of the vector $\bar{x}$. Furthermore, we define $(\bar{x}^T)^T$ to be $\bar{x}$.

It is important to note that in the coordinate representation (4.1.2) or (4.1.3) of the vector x of (4.1.1), an "ordering" of the basis $\{x_1, \ldots, x_n\}$ is employed (i.e., the coefficient $\xi_i$ of $x_i$ is the ith entry in Eqs. (4.1.2) and (4.1.3)). If the members of this basis were to be relabeled, thus specifying a different "ordering," then the corresponding coordinate representation of the vector x would have to be altered to reflect this change. However, this does not pose any difficulties, because in a given discussion we will always agree on a particular "ordering" of the basis vectors.
Now let $\alpha \in F$. Then
$$\alpha x = \alpha(\xi_1 x_1 + \cdots + \xi_n x_n) = (\alpha\xi_1)x_1 + \cdots + (\alpha\xi_n)x_n. \qquad (4.1.4)$$
In view of Eqs. (4.1.1)-(4.1.4) it now follows that the coordinate representation of $\alpha x$ with respect to the basis $\{x_1, \ldots, x_n\}$ is given by
$$\alpha\bar{x} = \begin{bmatrix} \alpha\xi_1 \\ \vdots \\ \alpha\xi_n \end{bmatrix} \qquad (4.1.5)$$
or
$$\alpha\bar{x}^T = \alpha(\xi_1, \ldots, \xi_n) = (\alpha\xi_1, \ldots, \alpha\xi_n). \qquad (4.1.6)$$

Next, let $y \in X$, where
$$y = \eta_1 x_1 + \cdots + \eta_n x_n. \qquad (4.1.7)$$
The coordinate representation of y with respect to the basis $\{x_1, \ldots, x_n\}$ is, of course,
$$\bar{y} = \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_n \end{bmatrix} \qquad (4.1.8)$$
or
$$\bar{y}^T = (\eta_1, \ldots, \eta_n). \qquad (4.1.9)$$
Now
$$x + y = (\xi_1 x_1 + \cdots + \xi_n x_n) + (\eta_1 x_1 + \cdots + \eta_n x_n) = (\xi_1 + \eta_1)x_1 + \cdots + (\xi_n + \eta_n)x_n. \qquad (4.1.10)$$
From Eq. (4.1.10) it now follows that the coordinate representation of the vector $x + y \in X$ with respect to the basis $\{x_1, \ldots, x_n\}$ is given by
$$\bar{x} + \bar{y} = \begin{bmatrix} \xi_1 + \eta_1 \\ \vdots \\ \xi_n + \eta_n \end{bmatrix} \qquad (4.1.11)$$
or
$$\bar{x}^T + \bar{y}^T = (\xi_1, \ldots, \xi_n) + (\eta_1, \ldots, \eta_n) = (\xi_1 + \eta_1, \ldots, \xi_n + \eta_n). \qquad (4.1.12)$$
Next, let $\{u_1, \ldots, u_n\}$ and $\{v_1, \ldots, v_n\}$ be two different bases for the linear space X. Then clearly there exist two different but unique sets of scalars (i.e., coordinates) $\{\alpha_1, \ldots, \alpha_n\}$ and $\{\beta_1, \ldots, \beta_n\}$ such that
$$x = \alpha_1 u_1 + \cdots + \alpha_n u_n = \beta_1 v_1 + \cdots + \beta_n v_n. \qquad (4.1.13)$$
This enables us to represent the same vector $x \in X$ with respect to two different bases in terms of two different but unique sets of coordinates, namely,
$$\bar{x} = \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{bmatrix} \quad \text{and} \quad \tilde{x} = \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_n \end{bmatrix}. \qquad (4.1.14)$$
The next two examples are intended to throw additional light on the above discussion.
4.1.15. Example. Let $X = R^n$ and let $x = (\xi_1, \ldots, \xi_n) \in R^n$. Let $u_1 = (1, 0, \ldots, 0)$, $u_2 = (0, 1, 0, \ldots, 0)$, ..., $u_n = (0, \ldots, 0, 1)$. It is readily shown that the set $\{u_1, \ldots, u_n\}$ is a basis for $R^n$. We call this basis the natural basis for $R^n$. Noting that
$$x = \xi_1 u_1 + \xi_2 u_2 + \cdots + \xi_n u_n, \qquad (4.1.16)$$
the unambiguous coordinate representation of $x \in R^n$ with respect to the natural basis of $R^n$ is
$$\bar{x} = \begin{bmatrix} \xi_1 \\ \vdots \\ \xi_n \end{bmatrix} \qquad (4.1.17)$$
or $\bar{x}^T = (\xi_1, \ldots, \xi_n)$. Moreover, the coordinate representations of the basis vectors $u_1, \ldots, u_n$ are
$$\bar{u}_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \bar{u}_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad \bar{u}_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}, \qquad (4.1.18)$$
respectively. We call the coordinates in Eq. (4.1.17) the natural coordinates of $x \in R^n$. (The natural basis for $C^n$ and the natural coordinates of $x \in C^n$ are similarly defined.)

Next, consider the set of vectors $\{v_1, \ldots, v_n\}$ given by $v_1 = (1, 0, \ldots, 0)$, $v_2 = (1, 1, 0, \ldots, 0)$, ..., $v_n = (1, \ldots, 1)$. We see that the vectors $v_1, \ldots, v_n$ form a basis for $R^n$. We can express the vector x given in Eq. (4.1.16) in terms of this basis by
$$x = \alpha_1 v_1 + \cdots + \alpha_n v_n, \qquad (4.1.19)$$
where $\alpha_n = \xi_n$ and $\alpha_i = \xi_i - \xi_{i+1}$ for $i = 1, 2, \ldots, n - 1$. Thus, the coordinate representation of x relative to $\{v_1, \ldots, v_n\}$ is given by
$$\tilde{x} = \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_{n-1} \\ \alpha_n \end{bmatrix} = \begin{bmatrix} \xi_1 - \xi_2 \\ \vdots \\ \xi_{n-1} - \xi_n \\ \xi_n \end{bmatrix}. \qquad (4.1.20)$$
Hence, we have represented the same vector $x \in R^n$ by two different coordinate vectors with respect to two different bases for $R^n$.
4.1.21. Example. Let $X = C[a, b]$, the set of all real-valued continuous functions on the interval [a, b]. Let $Y = \{y_0, y_1, \ldots, y_n\} \subset X$, where $y_0(t) = 1$ and $y_i(t) = t^i$ for all $t \in [a, b]$, $i = 1, \ldots, n$. As we saw in Exercise 3.3.13, Y is a linearly independent set in X and as such it is a basis for V(Y). Hence, for any $y \in V(Y)$ there exists a unique set of scalars $\{\eta_0, \eta_1, \ldots, \eta_n\}$ such that
$$y = \eta_0 y_0 + \cdots + \eta_n y_n. \qquad (4.1.22)$$
Since y is a polynomial in t we can write, more explicitly,
$$y(t) = \eta_0 + \eta_1 t + \cdots + \eta_n t^n, \qquad t \in [a, b]. \qquad (4.1.23)$$
In the present example there is also a coordinate representation; i.e., we can represent $y \in V(Y)$ by
$$\bar{y} = \begin{bmatrix} \eta_0 \\ \eta_1 \\ \vdots \\ \eta_n \end{bmatrix}. \qquad (4.1.24)$$
This representation is with respect to the basis $\{y_0, y_1, \ldots, y_n\}$ in V(Y).

We could, of course, also have used another basis for V(Y). For example, let us choose the basis $\{z_0, z_1, \ldots, z_n\}$ for V(Y) given in Exercise 3.3.13. Then we have
$$y = \gamma_0 z_0 + \gamma_1 z_1 + \cdots + \gamma_n z_n, \qquad (4.1.25)$$
where $\gamma_n = \eta_n$ and $\gamma_i = \eta_i - \eta_{i+1}$, $i = 0, 1, \ldots, n - 1$. Thus, $y \in V(Y)$ may also be represented with respect to the basis $\{z_0, z_1, \ldots, z_n\}$ by
$$\tilde{y} = \begin{bmatrix} \gamma_0 \\ \gamma_1 \\ \vdots \\ \gamma_{n-1} \\ \gamma_n \end{bmatrix} = \begin{bmatrix} \eta_0 - \eta_1 \\ \eta_1 - \eta_2 \\ \vdots \\ \eta_{n-1} - \eta_n \\ \eta_n \end{bmatrix}. \qquad (4.1.26)$$
Thus, two different coordinate vectors were used above in representing the same vector $y \in V(Y)$ with respect to two different bases for V(Y).
Summarizing, we observe:

1. Every vector x belonging to an n-dimensional linear space X over a field F can be represented in terms of a coordinate vector $\bar{x}$, or its transpose $\bar{x}^T$, with respect to a given basis $\{x_1, \ldots, x_n\} \subset X$. We note that $\bar{x}^T \in F^n$ (the space $F^n$ is defined in Example 3.1.7). By convention we will henceforth also write $\bar{x} \in F^n$. To indicate the coordinate representation of $x \in X$ by $\bar{x} \in F^n$, we write $x \sim \bar{x}$.

2. In representing x by $\bar{x}$, an "ordering" of the basis $\{x_1, \ldots, x_n\} \subset X$ is implied.

3. Usage of different bases for X results in different coordinate representations of $x \in X$.
4.2. MATRICES

In this section we will first concern ourselves with the representation of linear transformations on finite-dimensional vector spaces. Such representations of linear transformations are called matrices. We will then examine the properties of matrices in great detail. Throughout the present section, X will denote an n-dimensional vector space and Y an m-dimensional vector space over the same field F.

A. Representation of Linear Transformations by Matrices

We first prove the following result.
4.2.1. Theorem. Let $\{e_1, e_2, \ldots, e_n\}$ be a basis for a linear space X.

(i) Let A be a linear transformation from X into a vector space Y, and set $e_1' = Ae_1$, $e_2' = Ae_2$, ..., $e_n' = Ae_n$. If x is any vector in X and if $(\xi_1, \xi_2, \ldots, \xi_n)$ are the coordinates of x with respect to $\{e_1, e_2, \ldots, e_n\}$, then $Ax = \xi_1 e_1' + \xi_2 e_2' + \cdots + \xi_n e_n'$.

(ii) Let $\{e_1', e_2', \ldots, e_n'\}$ be any set of vectors in Y. Then there exists a unique linear transformation A from X into Y such that $Ae_1 = e_1'$, $Ae_2 = e_2'$, ..., $Ae_n = e_n'$.

Proof. To prove (i) we note that
$$Ax = A(\xi_1 e_1 + \xi_2 e_2 + \cdots + \xi_n e_n) = \xi_1 Ae_1 + \xi_2 Ae_2 + \cdots + \xi_n Ae_n = \xi_1 e_1' + \xi_2 e_2' + \cdots + \xi_n e_n'.$$

To prove (ii), we first observe that for each $x \in X$ we have unique scalars $\xi_1, \xi_2, \ldots, \xi_n$ such that
$$x = \xi_1 e_1 + \xi_2 e_2 + \cdots + \xi_n e_n.$$
Now define a mapping A from X into Y as
$$A(x) = \xi_1 e_1' + \xi_2 e_2' + \cdots + \xi_n e_n'.$$
Clearly, $A(e_i) = e_i'$ for $i = 1, \ldots, n$. We first must show that A is linear. Given $x = \xi_1 e_1 + \xi_2 e_2 + \cdots + \xi_n e_n$ and $y = \eta_1 e_1 + \eta_2 e_2 + \cdots + \eta_n e_n$, we have
$$A(x + y) = A[(\xi_1 + \eta_1)e_1 + \cdots + (\xi_n + \eta_n)e_n] = (\xi_1 + \eta_1)e_1' + \cdots + (\xi_n + \eta_n)e_n'.$$
On the other hand,
$$A(x) + A(y) = \xi_1 e_1' + \cdots + \xi_n e_n' + \eta_1 e_1' + \cdots + \eta_n e_n' = (\xi_1 + \eta_1)e_1' + \cdots + (\xi_n + \eta_n)e_n' = A(x + y).$$
In an identical way we establish that
$$\alpha A(x) = A(\alpha x)$$
for all $x \in X$ and all $\alpha \in F$. It thus follows that $A \in L(X, Y)$.

To show that A is unique, suppose there exists a $B \in L(X, Y)$ such that $Be_i = e_i'$ for $i = 1, \ldots, n$. It follows that $(A - B)e_i = 0$ for all $i = 1, \ldots, n$, and thus A = B.

We point out that part (i) of Theorem 4.2.1 implies that a linear transformation is completely determined by knowing how it transforms the basis vectors in its domain, and part (ii) of Theorem 4.2.1 states that this linear transformation is uniquely determined in this way. We will utilize these facts in the following.
Now let X be an n-dimensional vector space, and let $\{e_1, e_2, \ldots, e_n\}$ be a basis for X. Let Y be an m-dimensional vector space, and let $\{f_1, f_2, \ldots, f_m\}$ be a basis for Y. Let $A \in L(X, Y)$, and let $e_i' = Ae_i$ for $i = 1, \ldots, n$. Since $\{f_1, \ldots, f_m\}$ is a basis for Y, there are unique scalars $\{a_{ij}\}$, $i = 1, \ldots, m$, $j = 1, \ldots, n$, such that
$$\begin{aligned}
Ae_1 = e_1' &= a_{11}f_1 + a_{21}f_2 + \cdots + a_{m1}f_m \\
Ae_2 = e_2' &= a_{12}f_1 + a_{22}f_2 + \cdots + a_{m2}f_m \\
&\;\;\vdots \\
Ae_n = e_n' &= a_{1n}f_1 + a_{2n}f_2 + \cdots + a_{mn}f_m.
\end{aligned} \qquad (4.2.2)$$
Now let $x \in X$. Then x has the unique representation
$$x = \xi_1 e_1 + \xi_2 e_2 + \cdots + \xi_n e_n$$
with respect to the basis $\{e_1, \ldots, e_n\}$. In view of part (i) of Theorem 4.2.1 we have
$$Ax = \xi_1 e_1' + \cdots + \xi_n e_n'. \qquad (4.2.3)$$
Since $Ax \in Y$, Ax has a unique representation with respect to the basis $\{f_1, \ldots, f_m\}$, say,
$$Ax = \eta_1 f_1 + \eta_2 f_2 + \cdots + \eta_m f_m. \qquad (4.2.4)$$
Combining Equations (4.2.2) and (4.2.3), we have
$$Ax = \xi_1(a_{11}f_1 + \cdots + a_{m1}f_m) + \xi_2(a_{12}f_1 + \cdots + a_{m2}f_m) + \cdots + \xi_n(a_{1n}f_1 + \cdots + a_{mn}f_m).$$
Rearranging the last expression, we have
$$Ax = (a_{11}\xi_1 + \cdots + a_{1n}\xi_n)f_1 + (a_{21}\xi_1 + \cdots + a_{2n}\xi_n)f_2 + \cdots + (a_{m1}\xi_1 + \cdots + a_{mn}\xi_n)f_m.$$
However, in view of the uniqueness of the representation in Eq. (4.2.4), we have
$$\begin{aligned}
\eta_1 &= a_{11}\xi_1 + a_{12}\xi_2 + \cdots + a_{1n}\xi_n \\
\eta_2 &= a_{21}\xi_1 + a_{22}\xi_2 + \cdots + a_{2n}\xi_n \\
&\;\;\vdots \\
\eta_m &= a_{m1}\xi_1 + a_{m2}\xi_2 + \cdots + a_{mn}\xi_n.
\end{aligned} \qquad (4.2.5)$$
This set of equations enables us to represent the linear transformation A from linear space X into linear space Y by the unique scalars $\{a_{ij}\}$, $i = 1, \ldots, m$, $j = 1, \ldots, n$. For convenience we let
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}. \qquad (4.2.6)$$
We see that once the bases $\{e_1, e_2, \ldots, e_n\}$, $\{f_1, f_2, \ldots, f_m\}$ are fixed, we can represent the linear transformation A by the array of scalars in Eq. (4.2.6), which are uniquely determined by Eq. (4.2.2).

In view of part (ii) of Theorem 4.2.1, the converse to the preceding also holds. Specifically, with the bases for X and Y still fixed, the array given in Eq. (4.2.6) is uniquely associated with the linear transformation A of X into Y.

The above discussion justifies the following important definition.

4.2.7. Definition. The array given in Eq. (4.2.6) is called the matrix A of the linear transformation A from linear space X into linear space Y with respect to the basis $\{e_1, \ldots, e_n\}$ of X and the basis $\{f_1, \ldots, f_m\}$ of Y.

If, in Definition 4.2.7, X = Y, and if for both X and Y the same basis $\{e_1, \ldots, e_n\}$ is used, then we simply speak of the matrix A of the linear transformation A with respect to the basis $\{e_1, \ldots, e_n\}$.
In Eq. (4.2.6), the scalars $(a_{i1}, a_{i2}, \ldots, a_{in})$ form the ith row of A, and the scalars $(a_{1j}, a_{2j}, \ldots, a_{mj})$ form the jth column of A. The scalar $a_{ij}$ refers to that element of matrix A which can be found in the ith row and jth column of A. The array in Eq. (4.2.6) is said to be an (m × n) matrix. If m = n, we speak of a square matrix (i.e., an (n × n) matrix).

In accordance with our discussion of Section 4.1, an (n × 1) matrix is called a column vector, column matrix, or n-vector, and a (1 × n) matrix is called a row vector.

We say that two (m × n) matrices $A = [a_{ij}]$ and $B = [b_{ij}]$ are equal if and only if $a_{ij} = b_{ij}$ for all $i = 1, \ldots, m$ and for all $j = 1, \ldots, n$.

From the preceding discussion it should be clear that the same linear transformation A from linear space X into linear space Y may be represented by different matrices, depending on the particular choice of bases in X and Y. Since it is always clear from context which particular bases are being used, we usually don't refer to them explicitly, thus avoiding cumbersome notation.
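In concrete terms, Eq. (4.2.2) says that the jth column of the matrix lists the coordinates of $Ae_j$ in the basis of Y. The sketch below builds the matrix of one transformation $A \colon R^2 \to R^3$ this way (the map and the use of natural bases are illustrative choices) and then checks Eq. (4.2.5):

```python
# Building the matrix of a linear transformation A: R^2 -> R^3 from
# the images of the basis vectors: column j of the matrix lists the
# coordinates of A(e_j) in the basis of Y (natural bases assumed).

def A(x):
    xi1, xi2 = x
    return (xi1 + xi2, 2 * xi1, xi2)

e1, e2 = (1, 0), (0, 1)
cols = [A(e1), A(e2)]                      # images of basis vectors
M = [[cols[j][i] for j in range(2)] for i in range(3)]
assert M == [[1, 1], [2, 0], [0, 1]]

# Eq. (4.2.5): eta_i = sum_j a_ij * xi_j reproduces A(x)
x = (3, -2)
eta = tuple(sum(M[i][j] * x[j] for j in range(2)) for i in range(3))
assert eta == A(x)
```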
Now let $A^T$ denote the transpose of $A \in L(X, Y)$ (refer to Definition 3.5.27). Our next result provides the matrix representation of $A^T$.

4.2.8. Theorem. Let $A \in L(X, Y)$, and let A denote the matrix of A with respect to the bases $\{e_1, \ldots, e_n\}$ in X and $\{f_1, \ldots, f_m\}$ in Y. Let $X^f$ and $Y^f$ be the algebraic conjugates of X and Y, respectively. Let $A^T \in L(Y^f, X^f)$ be the transpose of A. Let $\{f_1^f, \ldots, f_m^f\}$ and $\{e_1^f, \ldots, e_n^f\}$ denote the dual bases of $\{f_1, \ldots, f_m\}$ and $\{e_1, \ldots, e_n\}$, respectively. If the matrix A is given by Eq. (4.2.6), then the matrix of $A^T$ with respect to $\{f_1^f, \ldots, f_m^f\}$ of $Y^f$ and $\{e_1^f, \ldots, e_n^f\}$ of $X^f$ is given by
$$A^T = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{bmatrix}. \qquad (4.2.9)$$

Proof. Let $B = [b_{ij}]$ denote the (n × m) matrix of the linear transformation $A^T$ with respect to the bases $\{f_1^f, \ldots, f_m^f\}$ and $\{e_1^f, \ldots, e_n^f\}$. We want to show that B is the matrix in Eq. (4.2.9). By Eq. (4.2.2) we have
$$Ae_i = \sum_{k=1}^{m} a_{ki} f_k$$
for $i = 1, \ldots, n$, and
$$A^T f_j^f = \sum_{k=1}^{n} b_{kj} e_k^f$$
for $j = 1, \ldots, m$. By Theorem 3.5.22, $\langle e_i, e_j^f \rangle = \delta_{ij}$ and $\langle f_i, f_j^f \rangle = \delta_{ij}$. Therefore,
$$\langle Ae_i, f_j^f \rangle = \Big\langle \sum_{k=1}^{m} a_{ki} f_k,\, f_j^f \Big\rangle = \sum_{k=1}^{m} a_{ki} \langle f_k, f_j^f \rangle = a_{ji}.$$
Also,
$$\langle Ae_i, f_j^f \rangle = \langle e_i, A^T f_j^f \rangle = \Big\langle e_i, \sum_{k=1}^{n} b_{kj} e_k^f \Big\rangle = \sum_{k=1}^{n} b_{kj} \langle e_i, e_k^f \rangle = b_{ij}.$$
Therefore, $b_{ij} = a_{ji}$, which proves the theorem.

The preceding result gives rise to the following concept.

4.2.10. Definition. The matrix $A^T$ in Eq. (4.2.9) is called the transpose of matrix A.
Our next result follows trivially from the discussion leading up to Definition 4.2.7.

4.2.11. Theorem. Let A be a linear transformation of an n-dimensional vector space X into an m-dimensional vector space Y, and let $y = Ax$. Let the coordinates of x with respect to the basis $\{e_1, e_2, \ldots, e_n\}$ be $(\xi_1, \xi_2, \ldots, \xi_n)$, and let the coordinates of y with respect to the basis $\{f_1, f_2, \ldots, f_m\}$ be $(\eta_1, \eta_2, \ldots, \eta_m)$. Let
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \qquad (4.2.12)$$
be the matrix of A with respect to the bases $\{e_1, e_2, \ldots, e_n\}$ and $\{f_1, f_2, \ldots, f_m\}$. Then
$$\begin{aligned}
\eta_1 &= a_{11}\xi_1 + a_{12}\xi_2 + \cdots + a_{1n}\xi_n \\
&\;\;\vdots \\
\eta_m &= a_{m1}\xi_1 + a_{m2}\xi_2 + \cdots + a_{mn}\xi_n
\end{aligned} \qquad (4.2.13)$$
or, equivalently,
$$\eta_i = \sum_{j=1}^{n} a_{ij}\xi_j, \qquad i = 1, \ldots, m. \qquad (4.2.14)$$

4.2.15. Exercise. Prove Theorem 4.2.11.
Using matrix and vector notation, let us agree to express the system of linear equations given by Eq. (4.2.13) equivalently as
$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} \xi_1 \\ \xi_2 \\ \vdots \\ \xi_n \end{bmatrix} = \begin{bmatrix} \eta_1 \\ \eta_2 \\ \vdots \\ \eta_m \end{bmatrix} \qquad (4.2.16)$$
or, more succinctly, as
$$A\bar{x} = \bar{y}, \qquad (4.2.17)$$
where $\bar{x}^T = (\xi_1, \xi_2, \ldots, \xi_n)$ and $\bar{y}^T = (\eta_1, \eta_2, \ldots, \eta_m)$.

In terms of $\bar{x}^T$, $\bar{y}^T$, and $A^T$, let us agree to express Eq. (4.2.13) equivalently as
$$(\xi_1, \xi_2, \ldots, \xi_n) \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{bmatrix} = (\eta_1, \eta_2, \ldots, \eta_m) \qquad (4.2.18)$$
or, in short, as
$$\bar{x}^T A^T = \bar{y}^T. \qquad (4.2.19)$$
We note that in Eq. (4.2.17), $\bar{x} \in F^n$, $\bar{y} \in F^m$, and A is an m × n matrix.

From our discussion thus far it should be clear that we can utilize matrices to study systems of linear equations which are of the form of Eq. (4.2.13). It should also be clear that an m × n matrix A is nothing more than a unique representation of a linear transformation A of an n-dimensional vector space X into an m-dimensional vector space Y over the same field F. As such, A possesses all the properties of such transformations. We could, in fact, utilize matrices in place of general linear transformations to establish many facts concerning linear transformations defined on finite-dimensional linear spaces. However, since a given matrix is dependent upon the selection of two particular sets of bases (not necessarily distinct), such practice will, in general, be avoided whenever possible.

We emphasize that a matrix and a linear transformation are not one and the same thing. In many texts no distinction in symbols is made between linear transformations and their matrix representation. We will not follow this custom.
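A small sketch showing that Eqs. (4.2.17) and (4.2.19) produce the same coordinates (the matrix and vector are illustrative choices):

```python
# Eq. (4.2.17) vs Eq. (4.2.19): y = A x and y^T = x^T A^T agree.

A = [[1, 2, 0],
     [3, -1, 4]]          # 2 x 3 matrix
x = [2, 1, -1]

# column form: y = A x
y = [sum(A[i][j] * x[j] for j in range(3)) for i in range(2)]

# row form: y^T = x^T A^T, with A^T the 3 x 2 transpose
AT = [[A[i][j] for i in range(2)] for j in range(3)]
yT = [sum(x[j] * AT[j][i] for j in range(3)) for i in range(2)]

assert y == [4, 1]
assert yT == y
```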
B. Rank of a Matrix

We begin by proving the following result.

4.2.20. Theorem. Let A be a linear transformation from X into Y. Then A has rank r if and only if it is possible to choose a basis $\{e_1, e_2, \ldots, e_n\}$ for X and a basis $\{f_1, f_2, \ldots, f_m\}$ for Y such that the matrix A of A with respect to these bases is of the form
$$A = \begin{bmatrix} 1 & 0 & \cdots & 0 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & & \ddots & & & & \vdots \\ 0 & 0 & \cdots & 1 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & & & & & & \vdots \\ 0 & 0 & \cdots & 0 & 0 & \cdots & 0 \end{bmatrix}, \qquad (4.2.21)$$
where the leading block is the (r × r) identity, the number of rows is $m = \dim Y$, and the number of columns is $n = \dim X$.

Proof. We choose a basis for X of the form $\{e_1, e_2, \ldots, e_r, e_{r+1}, \ldots, e_n\}$, where $\{e_{r+1}, \ldots, e_n\}$ is a basis for N(A). If $f_1 = Ae_1$, $f_2 = Ae_2$, ..., $f_r = Ae_r$, then $\{f_1, f_2, \ldots, f_r\}$ is a basis for R(A), as we saw in the proof of Theorem 3.4.25. Now choose vectors $f_{r+1}, \ldots, f_m$ in Y such that the set of vectors $\{f_1, f_2, \ldots, f_m\}$ forms a basis for Y (see Theorem 3.3.44). Then
$$\begin{aligned}
f_1 = Ae_1 &= (1)f_1 + (0)f_2 + \cdots + (0)f_r + (0)f_{r+1} + \cdots + (0)f_m \\
f_2 = Ae_2 &= (0)f_1 + (1)f_2 + \cdots + (0)f_r + (0)f_{r+1} + \cdots + (0)f_m \\
&\;\;\vdots \\
f_r = Ae_r &= (0)f_1 + (0)f_2 + \cdots + (1)f_r + (0)f_{r+1} + \cdots + (0)f_m \\
0 = Ae_{r+1} &= (0)f_1 + (0)f_2 + \cdots + (0)f_r + (0)f_{r+1} + \cdots + (0)f_m \\
&\;\;\vdots \\
0 = Ae_n &= (0)f_1 + (0)f_2 + \cdots + (0)f_r + (0)f_{r+1} + \cdots + (0)f_m.
\end{aligned} \qquad (4.2.22)$$
The necessity is proven by applying Definition 4.2.7 (and also Eq. (4.2.2)) to the set of equations (4.2.22); the desired result given by Eq. (4.2.21) follows.

Sufficiency follows from the fact that the basis for R(A) contains r linearly independent vectors.
A question of practical significance is the following: if A is the matrix of a linear transformation A from linear space X into linear space Y with respect to arbitrary bases $\{e_1, \ldots, e_n\}$ for X and $\{f_1, \ldots, f_m\}$ for Y, what is the rank of A in terms of matrix A? Let R(A) be the subspace of Y generated by $Ae_1, Ae_2, \ldots, Ae_n$. Then, in view of Eq. (4.2.2), the coordinate representation of $Ae_j$, $j = 1, \ldots, n$, in Y with respect to $\{f_1, \ldots, f_m\}$ is given by
$$Ae_1 \sim \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}, \quad \ldots, \quad Ae_n \sim \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}.$$
From this it follows that R(A) consists of vectors y whose coordinate representation is
$$\bar{y} = \eta_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} + \eta_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} + \cdots + \eta_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}, \qquad (4.2.23)$$
where $\eta_1, \ldots, \eta_n$ are scalars. Since every spanning or generating set of a linear space contains a basis, we are able to select from among the vectors $Ae_1, Ae_2, \ldots, Ae_n$ a basis for R(A). Suppose that the set $\{Ae_1, Ae_2, \ldots, Ae_k\}$ is this basis. Then the vectors $Ae_1, Ae_2, \ldots, Ae_k$ are linearly independent, and the vectors $Ae_{k+1}, \ldots, Ae_n$ are linear combinations of the vectors $Ae_1, Ae_2, \ldots, Ae_k$. From this there now follows:

4.2.24. Theorem. Let $A \in L(X, Y)$, and let A be the matrix of A with respect to the (arbitrary) basis $\{e_1, e_2, \ldots, e_n\}$ for X and with respect to the (arbitrary) basis $\{f_1, f_2, \ldots, f_m\}$ for Y. Let the coordinate representation of $y = Ax$ be $\bar{y} = A\bar{x}$. Then
(i) the rank of A is the number of vectors in the largest possible linearly independent set of columns of A; and
(ii) the rank of A is the number of vectors in the smallest possible set of columns of A which has the property that all columns not in it can be expressed as linear combinations of the columns in it.

In view of this result we make the following definition.

4.2.25. Definition. The rank of an (m × n) matrix A is the largest number of linearly independent columns of A.
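The rank of Definition 4.2.25 can be computed by Gaussian elimination, since row operations do not change the linear dependencies among columns. A minimal sketch (the matrix and the pivot tolerance are illustrative choices):

```python
# Rank as the number of linearly independent columns, computed by
# Gaussian elimination. The 3 x 4 test matrix has column 3 equal to
# column 1 + column 2 and column 4 a combination of columns 1 and 2.

def rank(M):
    M = [row[:] for row in M]                  # work on a copy
    rows, cols = len(M), len(M[0])
    r, piv = 0, 0
    while r < rows and piv < cols:
        # find a pivot in column piv at or below row r
        p = next((i for i in range(r, rows) if abs(M[i][piv]) > 1e-12), None)
        if p is None:
            piv += 1
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(rows):
            if i != r and abs(M[i][piv]) > 1e-12:
                c = M[i][piv] / M[r][piv]
                M[i] = [M[i][j] - c * M[r][j] for j in range(cols)]
        r, piv = r + 1, piv + 1
    return r

A = [[1, 0, 1, 2],
     [0, 1, 1, 4],
     [1, 1, 2, 6]]
assert rank(A) == 2
```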
C. Properties of Matrices

Now let X be an n-dimensional linear space, let Y be an m-dimensional linear space, let F be the field for X and Y, and let A and B be linear transformations of X into Y. Let $A = [a_{ij}]$ be the matrix of A, and let $B = [b_{ij}]$ be the matrix of B with respect to the bases $\{e_1, e_2, \ldots, e_n\}$ in X and $\{f_1, f_2, \ldots, f_m\}$ in Y. Using Eq. (3.4.42) as well as Definition 4.2.7, the reader can readily verify that the matrix of A + B, denoted by C = A + B, is given by
$$A + B = [a_{ij} + b_{ij}] = [c_{ij}] = C. \qquad (4.2.26)$$
Using Eq. (3.4.43) and Definition 4.2.7, the reader can also easily show that the matrix of $\alpha A$, denoted by D = αA, is given by
$$\alpha A = [\alpha a_{ij}] = [d_{ij}] = D. \qquad (4.2.27)$$
From Eq. (4.2.26) we note that, in order to be able to add two matrices A and B, they must have the same number of rows and columns. In this case we say that A and B are comparable matrices. Also, from Eq. (4.2.27) it is clear that if A is an (m × n) matrix, then so is αA.

Next, let Z be an r-dimensional vector space, let $A \in L(X, Y)$, and let $B \in L(Y, Z)$. Let A be the matrix of A with respect to the basis $\{e_1, e_2, \ldots, e_n\}$ in X and with respect to the basis $\{f_1, f_2, \ldots, f_m\}$ in Y. Let B be the matrix of B with respect to the basis $\{f_1, f_2, \ldots, f_m\}$ in Y and with respect to the basis $\{g_1, g_2, \ldots, g_r\}$ in Z. The product mapping BA as defined by Eq. (3.4.50) is a linear transformation of X into Z. We now ask: what is the matrix C of BA with respect to the bases $\{e_1, e_2, \ldots, e_n\}$ of X and $\{g_1, g_2, \ldots, g_r\}$ of Z?
By definition of matrices A and B (see Eq. (4.2.2)), we have
$$Ae_k = \sum_{j=1}^{m} a_{jk} f_j, \qquad k = 1, \ldots, n,$$
and
$$Bf_j = \sum_{i=1}^{r} b_{ij} g_i, \qquad j = 1, \ldots, m.$$
Now
$$BAe_k = B\Big(\sum_{j=1}^{m} a_{jk} f_j\Big) = \sum_{j=1}^{m} a_{jk} Bf_j = \sum_{j=1}^{m} \sum_{i=1}^{r} b_{ij} a_{jk} g_i$$
for $k = 1, \ldots, n$. Thus, the matrix C of BA with respect to the basis $\{e_1, \ldots, e_n\}$ in X and $\{g_1, \ldots, g_r\}$ in Z is $C = [c_{ij}]$, where
$$c_{ij} = \sum_{k=1}^{m} b_{ik} a_{kj} \qquad (4.2.28)$$
for $i = 1, \ldots, r$ and $j = 1, \ldots, n$. We write this as
$$C = BA. \qquad (4.2.29)$$
From the preceding discussion it is clear that two matrices A and B can be multiplied to form the product BA if and only if the number of columns of B is equal to the number of rows of A. In this case we say that the matrices B and A are conformal matrices.

In arriving at Equations (4.2.28) and (4.2.29) we established the result given below.

4.2.30. Theorem. Let A be the matrix of $A \in L(X, Y)$ with respect to the basis $\{e_1, e_2, \ldots, e_n\}$ in X and basis $\{f_1, f_2, \ldots, f_m\}$ in Y. Let B be the matrix of $B \in L(Y, Z)$ with respect to basis $\{f_1, f_2, \ldots, f_m\}$ in Y and basis $\{g_1, g_2, \ldots, g_r\}$ in Z. Then BA is the matrix of BA.
We now summarize the above discussion in the following definition.

4.2.31. Definition. Let A = [a_ij] and B = [b_ij] be two m × n matrices, let C = [c_ij] be an n × r matrix, and let α ∈ F. Then

(i) the sum of A and B is the m × n matrix

D = A + B,

where

d_ij = a_ij + b_ij

for all i = 1, ..., m and for all j = 1, ..., n;

(ii) the product of matrix A by scalar α is the m × n matrix

E = αA,

where

e_ij = α a_ij

for all i = 1, ..., m and for all j = 1, ..., n; and

(iii) the product of matrix A and matrix C is the m × r matrix

G = AC,

where

g_ij = Σ_{k=1}^{n} a_ik c_kj

for each i = 1, ..., m and for each j = 1, ..., r.
The properties of general linear transformations established in Section 3.4 hold, of course, in the case of their matrix representation. We summarize some of these in the remainder of the present section.

4.2.32. Theorem.
(i) Let A and B be (m × n) matrices, and let C be an (n × r) matrix. Then

(A + B)C = AC + BC.    (4.2.33)

(ii) Let A be an (m × n) matrix, and let B and C be (n × r) matrices. Then

A(B + C) = AB + AC.    (4.2.34)

(iii) Let A be an (m × n) matrix, let B be an (n × r) matrix, and let C be an (r × s) matrix. Then

A(BC) = (AB)C.    (4.2.35)

(iv) Let α, β ∈ F, and let A be an (m × n) matrix. Then

(α + β)A = αA + βA.    (4.2.36)

(v) Let α ∈ F, and let A and B be (m × n) matrices. Then

α(A + B) = αA + αB.    (4.2.37)

(vi) Let α, β ∈ F, let A be an (m × n) matrix, and let B be an (n × r) matrix. Then

(αA)(βB) = (αβ)(AB).    (4.2.38)

(vii) Let A and B be (m × n) matrices. Then

A + B = B + A.    (4.2.39)

(viii) Let A, B, and C be (m × n) matrices. Then

(A + B) + C = A + (B + C).    (4.2.40)

The proofs of the next two results are left as an exercise.
4.2.41. Theorem. Let 0 ∈ L(X, Y) be the zero transformation defined by Eq. (3.4.44). Then for any bases {e_1, ..., e_n} and {f_1, ..., f_m} for X and Y, respectively, the linear transformation 0 is represented by the (m × n) matrix

0 = [0  ...  0]
    [.       .]    (4.2.42)
    [0  ...  0]

The matrix 0 is called the null matrix.

4.2.43. Theorem. Let I ∈ L(X, X) be the identity transformation defined by Eq. (3.4.56). Let {e_1, ..., e_n} be an arbitrary basis for X. Then the matrix representation of the linear transformation I from X into X with respect to the basis {e_1, ..., e_n} is given by

I = [1  0  ...  0]
    [0  1  ...  0]    (4.2.44)
    [.          .]
    [0  0  ...  1]

I is called the n × n identity matrix.

4.2.45. Exercise. Prove Theorems 4.2.32, 4.2.41, and 4.2.43.
For any (m × n) matrix A we have

A + 0 = 0 + A = A,    (4.2.46)

and for any (n × n) matrix B we have

BI = IB = B,    (4.2.47)

where I is the (n × n) identity matrix.

If A = [a_ij] is a matrix of the linear transformation A, then correspondingly, −A is a matrix of the linear transformation −A, where

−A = (−1)A = (−1)[a_ij] = [−a_ij].    (4.2.48)

It follows immediately that A + (−A) = 0, where 0 denotes the null matrix. By convention we usually write A + (−A) = A − A.

Let A and B be (n × n) matrices. Then we have, in general,

AB ≠ BA,    (4.2.49)

as was the case in Eq. (3.4.55).
Next, let A ∈ L(X, X) and assume that A is nonsingular. Let A^{-1} denote the inverse of A. Then, by Theorem 3.4.60, AA^{-1} = A^{-1}A = I. Now if A is the (n × n) matrix of A with respect to the basis {e_1, ..., e_n} in X, then there is an (n × n) matrix B of A^{-1} with respect to the basis {e_1, ..., e_n} in X, such that

BA = AB = I.    (4.2.50)

We call B the inverse of A and we denote it by A^{-1}. In this connection we use the following terms interchangeably: A^{-1} exists, A has an inverse, A is invertible, or A is nonsingular. If A is not nonsingular, we say A is singular.
With the aid of Theorem 3.4.63 the reader can readily establish the following result for matrices.

4.2.51. Theorem. Let A be an (n × n) matrix. The following are equivalent:
(i) rank A = n;
(ii) Ax = 0 implies x = 0;
(iii) for every y_0 ∈ F^n, there is a unique x_0 ∈ F^n such that y_0 = Ax_0;
(iv) the columns of A are linearly independent; and
(v) A^{-1} exists.

4.2.52. Exercise. Prove Theorem 4.2.51.
We have shown that we can represent n linear equations by the matrix equation (4.2.17). Now let A be a nonsingular (n × n) matrix and consider the equation

y = Ax.    (4.2.53)

If we premultiply both sides of this equation by A^{-1} we obtain

x = A^{-1}y,    (4.2.54)

the solution to Eq. (4.2.53). Thus, knowledge of the inverse of A enables us to solve the system of linear equations (4.2.53).

In our next result, which is readily verified, some of the important properties of nonsingular matrices are given.
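For a concrete instance of Eq. (4.2.54), consider a 2 × 2 system. The inverse of a nonsingular 2 × 2 matrix can be written down directly, and x = A^{-1}y then recovers the solution. A sketch (function names ours):

```python
def inv2(A):
    # Inverse of a nonsingular 2 x 2 matrix [[a, b], [c, d]].
    (a, b), (c, d) = A
    det = a * d - b * c
    assert det != 0, "A is singular"
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(A, x):
    # y = Ax.
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]

A = [[2.0, 1.0], [1.0, 3.0]]   # nonsingular: det = 5
y = [5.0, 10.0]
x = mat_vec(inv2(A), y)        # solves y = Ax
```

Substituting back, mat_vec(A, x) reproduces y, confirming that x solves the system.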
4.2.55. Theorem.
(i) An (n × n) nonsingular matrix has one and only one inverse.
(ii) If A and B are nonsingular (n × n) matrices, then (AB)^{-1} = B^{-1}A^{-1}.
(iii) If A and B are (n × n) matrices and if AB is nonsingular, then so are A and B.

4.2.56. Exercise. Prove Theorem 4.2.55.
Our next theorem summarizes some of the important properties of the transpose of matrices. The proof of this theorem is a direct consequence of the definition of the transpose of a matrix (see Eq. (4.2.9)).

4.2.57. Theorem.
(i) For any matrix A, (A^T)^T = A.
(ii) Let A and B be conformal matrices. Then (AB)^T = B^T A^T.
(iii) Let A be a nonsingular matrix. Then (A^T)^{-1} = (A^{-1})^T.
(iv) Let A be an (n × n) matrix. Then A^T is nonsingular if and only if A is nonsingular.
(v) Let A and B be comparable matrices. Then (A + B)^T = A^T + B^T.
(vi) Let α ∈ F and let A be a matrix. Then (αA)^T = αA^T.

4.2.58. Exercise. Prove Theorem 4.2.57.
Now let A be an (n × n) matrix, and let m be a positive integer. Similarly as in Eq. (3.4.67) we define the (n × n) matrix A^m by

A^m = A · A ··· A   (m times),    (4.2.59)

and if A^{-1} exists, then similarly as in Eq. (3.4.68), we define the (n × n) matrix A^{-m} as

A^{-m} = (A^{-1})^m = A^{-1} · A^{-1} ··· A^{-1}   (m times).    (4.2.60)

As in the case of Eqs. (3.4.69) through (3.4.71), the usual laws of exponents follow from the above definitions. Specifically, if A is an (n × n) matrix and if r and s are positive integers, then

A^r A^s = A^{r+s} = A^s A^r,    (4.2.61)

(A^r)^s = A^{rs} = (A^s)^r,    (4.2.62)

and if A^{-1} exists, then

A^{-r} A^{-s} = A^{-(r+s)} = A^{-s} A^{-r}.    (4.2.63)

Consistent with the above notation we have

A^1 = A    (4.2.64)

and

A^0 = I.    (4.2.65)

We are now once more in a position to consider functions of linear transformations, where in the present case the linear transformations are represented by matrices. For example, if f(λ) is the polynomial in λ given in Eq. (3.4.74), and if A is any (n × n) matrix, then by f(A) we mean

f(A) = α_0 I + α_1 A + ··· + α_m A^m.    (4.2.66)
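Equation (4.2.66) can be evaluated mechanically: raise A to successive powers and accumulate α_k A^k, starting from α_0 I (recall A^0 = I). A sketch (names ours):

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(coeffs, A):
    # f(A) = a0*I + a1*A + ... + am*A^m for coeffs = [a0, a1, ..., am].
    n = len(A)
    result = [[0] * n for _ in range(n)]
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # A^0 = I
    for a in coeffs:
        for i in range(n):
            for j in range(n):
                result[i][j] += a * power[i][j]
        power = mat_mul(power, A)   # next power of A
    return result

A = [[1, 1], [0, 1]]
fA = poly_of_matrix([2, 0, 1], A)   # f(lambda) = 2 + lambda^2
```

Here A^2 = [[1, 2], [0, 1]], so f(A) = 2I + A^2 = [[3, 2], [0, 3]].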
4.2.67. Exercise. Let A ∈ L(X, X), and let A be the matrix of A with respect to the basis {e_1, ..., e_n} in X. Let f(λ) be given by Eq. (3.4.74). Show that f(A) is the matrix of f(A) with respect to the basis {e_1, ..., e_n}.

We noted earlier that in general linear transformations and matrices do not commute (see (3.4.55) and (4.2.49)). However, in the case of square matrices, the reader can verify the following result easily.

4.2.68. Theorem. Let A, B, C denote (n × n) matrices, let 0 denote the (n × n) null matrix, and let I denote the (n × n) identity matrix. Then,
(i) 0 commutes with any A;
(ii) A^p commutes with A^q, where p and q are positive integers;
(iii) αI commutes with any A, where α ∈ F; and
(iv) if A commutes with B and if A commutes with C, then A commutes with αB + βC, where α, β ∈ F.
4.2.69. Exercise. Prove Theorem 4.2.68.

Let us now consider some specific examples.

4.2.70. Example. Let F denote the field of real numbers, and let A and B be given matrices over F. The sum A + B is obtained by adding A and B entry by entry, and the scalar multiple αA (here with α = 3) is obtained by multiplying every entry of A by 3, in accordance with Eqs. (4.2.26) and (4.2.27).

4.2.71. Example. Let F denote the field of complex numbers, let i^2 = −1, and let C and D be given matrices with complex entries. The sum C + D and the scalar multiple αC (here with α = i) are again computed entry by entry.
4.2.72. Example. Let F denote the field of real numbers, and let G be a (3 × 2) matrix and H a (2 × 2) matrix over F. The product GH is then a (3 × 2) matrix. Notice that in this case the product HG is not defined, since the number of columns of H is not equal to the number of rows of G.

4.2.73. Example. Let F be the field of real numbers. For a pair of square matrices A and B over F, computing both products shows that AB ≠ BA; matrix multiplication is not commutative in general.

4.2.74. Example. For suitably chosen nonzero matrices M and N we have MN = 0; i.e., the product of two matrices can be the null matrix even though M ≠ 0 and N ≠ 0.
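The phenomenon of Example 4.2.74 (a zero product of nonzero factors) is easy to reproduce; the particular matrices below are our own choice for illustration, not necessarily those of the example:

```python
def mat_mul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

M = [[0, 1],
     [0, 0]]
N = [[1, 0],
     [0, 0]]
MN = mat_mul(M, N)   # the zero matrix, although M != 0 and N != 0
NM = mat_mul(N, M)   # nonzero, so M and N also fail to commute
```

The same pair also illustrates Eq. (4.2.49), since MN and NM differ.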
4.2.75. Example. If A is as defined in Example 4.2.70, then its transpose A^T is obtained by interchanging the rows and columns of A.

4.2.76. Example. Let P and Q be (3 × 3) matrices over the field of real numbers such that

PQ = QP = [1  0  0]
          [0  1  0]
          [0  0  1]

i.e., Q = P^{-1} or, equivalently, P = Q^{-1}.
4.2.77. Example. Consider the set of simultaneous linear equations

2ξ_1 + ξ_2 + 2ξ_3 + 3ξ_4 = 0,
6ξ_1 + 3ξ_2 − 2ξ_3 + ξ_4 = 0,    (4.2.78)
2ξ_1 + ξ_2 + 0·ξ_3 + ξ_4 = 0.

Equation (4.2.78) can be rewritten as

[2  1   2  3] [ξ_1]   [0]
[6  3  −2  1] [ξ_2] = [0]    (4.2.79)
[2  1   0  1] [ξ_3]   [0]
              [ξ_4]

Let

A = [2  1   2  3]
    [6  3  −2  1]    (4.2.80)
    [2  1   0  1]

Matrix A is the coordinate representation of a linear transformation A ∈ L(X, Y). In this case dim X = 4 and dim Y = 3. Observe now that the first column of A is a linear combination of the second column of A. Also, by adding the third column of A to the second column we obtain the fourth column of A. It follows that A has only two linearly independent columns. Hence, the rank of A is 2. Now since dim X = dim N(A) + dim R(A), the nullity of A is also 2.
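The rank asserted in Example 4.2.77 can be checked numerically by Gaussian elimination: the number of pivot rows in an echelon form equals the rank. A sketch (the elimination routine is ours; the matrix entries are chosen to satisfy the column relations stated in the example):

```python
def rank(M, tol=1e-12):
    # Row-reduce a copy of M and count the pivot rows (= rank).
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if abs(M[i][c]) > tol), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(rows):
            if i != r and abs(M[i][c]) > tol:
                f = M[i][c] / M[r][c]
                M[i] = [M[i][j] - f * M[r][j] for j in range(cols)]
        r += 1
    return r

# A 3 x 4 matrix whose first column is twice its second and whose
# fourth column is the sum of its second and third columns.
A = [[2, 1,  2, 3],
     [6, 3, -2, 1],
     [2, 1,  0, 1]]
```

Here rank(A) returns 2, and the nullity is dim X − rank = 4 − 2 = 2, as claimed.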
Next, we discuss briefly partitioned vectors and matrices. Such vectors and matrices arise in a natural way when linear transformations acting on the direct sum of linear spaces are considered.

Let X be an n-dimensional vector space, and let Y be an m-dimensional vector space. Suppose that X = U ⊕ W, where U is an r-dimensional linear subspace of X, and suppose that Y = R ⊕ Q, where R is a p-dimensional linear subspace of Y. Let A ∈ L(X, Y), let {e_1, ..., e_n} be a basis for X such that {e_1, ..., e_r} is a basis for U, and let {f_1, ..., f_m} be a basis for Y such that {f_1, ..., f_p} is a basis for R. Let A be the matrix of A with respect to these bases. Now if x ∈ F^n is the coordinate representation of x ∈ X with respect to the basis {e_1, ..., e_n}, we can partition x into two components,

x = [u]    (4.2.81)
    [v]

where u ∈ F^r and v ∈ F^{n−r}. Similarly, we can express y ∈ F^m as

y = [y^1]    (4.2.82)
    [y^2]

where y is the coordinate representation of y with respect to {f_1, ..., f_m} and where y^1 ∈ F^p and y^2 ∈ F^{m−p}. We say the vector x in Eq. (4.2.81) is partitioned into components u and v. Clearly, the vector u is determined by the coordinates of x corresponding to the basis vectors {e_1, ..., e_r} in U.

We can similarly divide the matrix A into the partition

A = [A_11  A_12]    (4.2.83)
    [A_21  A_22]

where A_11 is a (p × r) matrix, A_12 is a (p × (n − r)) matrix, A_21 is an ((m − p) × r) matrix, and A_22 is an ((m − p) × (n − r)) matrix. In this case, the equation

y = Ax

is equivalent to the pair of equations

y^1 = A_11 u + A_12 v,    (4.2.84)
y^2 = A_21 u + A_22 v.    (4.2.85)
A matrix in the form of Eq. (4.2.83) is called a partitioned matrix. The matrices A_11, A_12, A_21, and A_22 are called submatrices of A.

The generalization of partitioning the matrix A into more than four submatrices is accomplished in an obvious way, when the linear space X and/or the linear space Y are the direct sum of more than two linear subspaces.

Now let the linear spaces X and Y and the linear transformation A and the matrix A of A still be defined as in the preceding discussion. Let Z be a k-dimensional vector space (the spaces X, Y, and Z are vector spaces over the same field F). Let Z = M ⊕ N, where M is a j-dimensional linear subspace of Z. Let B ∈ L(Y, Z). In a manner analogous to our preceding discussion, we represent B by the partitioned matrix

B = [B_11  B_12]    (4.2.86)
    [B_21  B_22]

It is now a simple matter to show that the linear transformation BA ∈ L(X, Z) is represented by the partitioned matrix

BA = [B_11 A_11 + B_12 A_21   B_11 A_12 + B_12 A_22]    (4.2.87)
     [B_21 A_11 + B_22 A_21   B_21 A_12 + B_22 A_22]
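The partitioned product (4.2.87) asserts that blocks multiply like scalar entries. This can be checked directly on small matrices: extracting a block of BA gives the same result as combining the corresponding blocks of B and A. A sketch (helper names ours), verifying the top-left block for a 3 × 3 case partitioned after the first row and column:

```python
def mat_mul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

def block(M, rows, cols):
    # Extract the submatrix with the given row and column index lists.
    return [[M[i][j] for j in cols] for i in rows]

A = [[1, 2, 0],
     [0, 1, 1],
     [2, 0, 1]]
B = [[1, 0, 1],
     [0, 2, 0],
     [1, 1, 1]]

BA = mat_mul(B, A)
top, bot = [0], [1, 2]
B11, B12 = block(B, top, top), block(B, top, bot)
A11, A21 = block(A, top, top), block(A, bot, top)
lhs = block(BA, top, top)                          # top-left block of BA
rhs = mat_add(mat_mul(B11, A11), mat_mul(B12, A21))  # B11*A11 + B12*A21
```

The other three blocks check out in exactly the same way.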
We now prove:
4.2.88. Theorem. Let X be an n-dimensional vector space, and let P ∈ L(X, X). If P is a projection, then there exists a basis {e_1, ..., e_n} for X such that the matrix P of P with respect to this basis is of the form

P = [I_r  0]    (4.2.89)
    [0    0]

where I_r denotes the (r × r) identity matrix, the remaining submatrices are null matrices of appropriate dimension, and r = dim R(P).

Proof. Since P is a projection we have, from Eq. (3.7.8),

X = R(P) ⊕ N(P).

Now let r = dim R(P), and let {e_1, ..., e_n} be a basis for X such that {e_1, ..., e_r} is a basis for R(P). Let P be the matrix of P with respect to this basis, and the theorem follows.
We leave the next result as an exercise.

4.2.90. Theorem. Let X be a finite-dimensional vector space, and let A ∈ L(X, X). If W is a p-dimensional invariant subspace of X and if X = W ⊕ Z, then there exists a basis for X such that the matrix A of A with respect to this basis has the form

A = [A_11  A_12]
    [0     A_22]

where A_11 is a (p × p) matrix and the remaining submatrices are of appropriate dimension.

4.2.91. Exercise. Prove Theorem 4.2.90.
4.3. EQUIVALENCE AND SIMILARITY

From the previous section it is clear that a linear transformation A of a finite-dimensional vector space X into a finite-dimensional vector space Y can be represented by means of different matrices, depending on the particular choice of bases in X and Y. The choice of bases may in different cases result in matrices that are "easy" or "hard" to utilize. Many of the resulting "standard" forms of matrices, called canonical forms, arise because of practical considerations. Such canonical forms often exhibit inherent characteristics of the underlying transformation A. Before we can consider some of the more important canonical forms of matrices, we need to introduce several new concepts which are of great importance in their own right.

Throughout the present section, X and Y are finite-dimensional vector spaces over the same field F, dim X = n and dim Y = m. We begin our discussion with the following result.
4.3.1. Theorem. Let {e_1, ..., e_n} be a basis for a linear space X, and let {e'_1, ..., e'_n} be a set of vectors in X given by

e'_i = Σ_{j=1}^{n} p_ji e_j,   i = 1, ..., n,    (4.3.2)

where p_ij ∈ F for all i, j = 1, ..., n. The set {e'_1, ..., e'_n} forms a basis for X if and only if P = [p_ij] is nonsingular.

Proof. Let {e'_1, ..., e'_n} be linearly independent, and let p_j denote the jth column vector of P. Let

Σ_{i=1}^{n} α_i p_i = 0

for some scalars α_1, ..., α_n ∈ F. This implies that

Σ_{i=1}^{n} α_i p_ji = 0,   j = 1, ..., n.

It follows that

Σ_{i=1}^{n} α_i e'_i = Σ_{i=1}^{n} α_i Σ_{j=1}^{n} p_ji e_j.

Rearranging, we have

Σ_{i=1}^{n} α_i e'_i = Σ_{j=1}^{n} (Σ_{i=1}^{n} α_i p_ji) e_j = 0.

Since {e'_1, ..., e'_n} are linearly independent, it follows that α_1 = ··· = α_n = 0. Thus, the columns of P are linearly independent. Therefore, P is nonsingular.

Conversely, let P be nonsingular, i.e., let p_1, ..., p_n be a linearly independent set of vectors. Let Σ_{i=1}^{n} α_i e'_i = 0 for some scalars α_1, ..., α_n ∈ F. Then

Σ_{i=1}^{n} α_i e'_i = Σ_{j=1}^{n} (Σ_{i=1}^{n} α_i p_ji) e_j = 0.

Since {e_1, ..., e_n} is a linearly independent set, it follows that Σ_{i=1}^{n} α_i p_ji = 0 for j = 1, ..., n, and thus, Σ_{i=1}^{n} α_i p_i = 0. Since {p_1, ..., p_n} is a linearly independent set, it now follows that α_1 = ··· = α_n = 0, and therefore {e'_1, ..., e'_n} is a linearly independent set.
The preceding result gives rise to:

4.3.3. Definition. The matrix P of Theorem 4.3.1 is called the matrix of basis {e'_1, ..., e'_n} with respect to basis {e_1, ..., e_n}.

We note that since P is nonsingular, P^{-1} exists. Thus, we can readily prove the next result.

4.3.4. Theorem. Let {e_1, ..., e_n} and {e'_1, ..., e'_n} be two bases for X, and let P be the matrix of basis {e'_1, ..., e'_n} with respect to basis {e_1, ..., e_n}. Then P^{-1} is the matrix of basis {e_1, ..., e_n} with respect to the basis {e'_1, ..., e'_n}.

4.3.5. Exercise. Prove Theorem 4.3.4.

The next result is also easily verified.
4.3.6. Theorem. Let X be a linear space, and let the sets of vectors {e_1, ..., e_n}, {e'_1, ..., e'_n}, and {e''_1, ..., e''_n} be bases for X. If P is the matrix of basis {e'_1, ..., e'_n} with respect to basis {e_1, ..., e_n}, and if Q is the matrix of basis {e''_1, ..., e''_n} with respect to basis {e'_1, ..., e'_n}, then PQ is the matrix of basis {e''_1, ..., e''_n} with respect to basis {e_1, ..., e_n}.

4.3.7. Exercise. Prove Theorem 4.3.6.
We now prove:

4.3.8. Theorem. Let {e_1, ..., e_n} and {e'_1, ..., e'_n} be two bases for a linear space X, and let P be the matrix of basis {e'_1, ..., e'_n} with respect to basis {e_1, ..., e_n}. Let x ∈ X, let x denote the coordinate representation of x with respect to the basis {e_1, ..., e_n}, and let x' denote the coordinate representation of x with respect to the basis {e'_1, ..., e'_n}. Then x = Px'.

Proof. Let x^T = (ξ_1, ..., ξ_n) and let (x')^T = (ξ'_1, ..., ξ'_n). Then

x = Σ_{i=1}^{n} ξ_i e_i   and   x = Σ_{i=1}^{n} ξ'_i e'_i.

Thus,

Σ_{j=1}^{n} ξ_j e_j = Σ_{i=1}^{n} ξ'_i e'_i = Σ_{i=1}^{n} ξ'_i (Σ_{j=1}^{n} p_ji e_j) = Σ_{j=1}^{n} (Σ_{i=1}^{n} p_ji ξ'_i) e_j,

which implies that

ξ_j = Σ_{i=1}^{n} p_ji ξ'_i,   j = 1, ..., n.

Therefore, x = Px'.
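Theorem 4.3.8 in concrete form: if the columns of P express the new basis vectors in terms of the old ones, then old coordinates are recovered from new ones by x = Px'. A sketch in R^2 (the basis and vector are chosen for illustration):

```python
def mat_vec(P, x):
    return [sum(P[i][k] * x[k] for k in range(len(x))) for i in range(len(P))]

# New basis e1' = e1 + e2, e2' = e2, expressed in the old basis {e1, e2}:
# column i of P holds the old-basis coordinates of ei'.
P = [[1, 0],
     [1, 1]]

x_new = [2, 3]              # coordinates of x with respect to {e1', e2'}
x_old = mat_vec(P, x_new)   # coordinates of the same x with respect to {e1, e2}
```

Indeed x = 2e1' + 3e2' = 2(e1 + e2) + 3e2 = 2e1 + 5e2, matching x_old.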
4.3.9. Exercise. Let X = R^n, and let {u_1, ..., u_n} be the natural basis for R^n (see Example 4.1.15). Let {e_1, ..., e_n} be another basis for R^n, and let e_1, ..., e_n be the coordinate representations of e_1, ..., e_n, respectively, with respect to the natural basis. Show that the matrix of basis {e_1, ..., e_n} with respect to basis {u_1, ..., u_n} is given by P = [e_1, e_2, ..., e_n], i.e., the matrix whose columns are the column vectors e_1, ..., e_n.
4.3.10. Theorem. Let A ∈ L(X, Y), and let {e_1, ..., e_n} and {f_1, ..., f_m} be bases for X and Y, respectively. Let A be the matrix of A with respect to the bases {e_1, ..., e_n} in X and {f_1, ..., f_m} in Y. Let {e'_1, ..., e'_n} be another basis for X, and let the matrix of {e'_1, ..., e'_n} with respect to {e_1, ..., e_n} be P. Let {f'_1, ..., f'_m} be another basis for Y, and let Q be the matrix of {f_1, ..., f_m} with respect to {f'_1, ..., f'_m}. Let A' be the matrix of A with respect to the bases {e'_1, ..., e'_n} in X and {f'_1, ..., f'_m} in Y. Then

A' = QAP.

Proof. We have

Ae'_i = A(Σ_{k=1}^{n} p_ki e_k) = Σ_{k=1}^{n} p_ki Ae_k = Σ_{k=1}^{n} p_ki (Σ_{l=1}^{m} a_lk f_l)
     = Σ_{k=1}^{n} p_ki Σ_{l=1}^{m} a_lk (Σ_{j=1}^{m} q_jl f'_j) = Σ_{j=1}^{m} (Σ_{l=1}^{m} Σ_{k=1}^{n} q_jl a_lk p_ki) f'_j.

Now, by definition, Ae'_i = Σ_{j=1}^{m} a'_ji f'_j. Since a matrix of a linear transformation is uniquely determined once the bases are specified, we conclude that

a'_ji = Σ_{l=1}^{m} Σ_{k=1}^{n} q_jl a_lk p_ki

for j = 1, ..., m and i = 1, ..., n. Therefore, A' = QAP.

In Figure A, Theorem 4.3.10 is depicted schematically: A carries the coordinates x of a vector with respect to {e_1, ..., e_n} into y = Ax with respect to {f_1, ..., f_m}, while A' carries x' into y' = A'x' with respect to the primed bases, the coordinates being related by x = Px' and y' = Qy.

4.3.11. Figure A. Schematic diagram of Theorem 4.3.10.
The preceding result motivates the following definition.

4.3.12. Definition. An (m × n) matrix A' is said to be equivalent to an (m × n) matrix A if there exists an (m × m) nonsingular matrix Q and an (n × n) nonsingular matrix P such that

A' = QAP.    (4.3.13)

If A' is equivalent to A, we write A' ~ A.

Thus, an (m × n) matrix A' is equivalent to an (m × n) matrix A if and only if A and A' can be interpreted as both being matrices of the same linear transformation A of a linear space X into a linear space Y, but with respect to possibly different choices of bases.
Our next result shows that ~ is reflexive, symmetric, and transitive, and as such is an equivalence relation.

4.3.14. Theorem. Let A, B, and C be (m × n) matrices. Then
(i) A is always equivalent to A;
(ii) if A is equivalent to B, then B is equivalent to A; and
(iii) if A is equivalent to B and B is equivalent to C, then A is equivalent to C.

4.3.15. Exercise. Prove Theorem 4.3.14.

The reader can prove the next result readily.

4.3.16. Theorem. Let A and B be m × n matrices. Then
(i) every matrix A is equivalent to a matrix of the form

[I_r  0]    (4.3.17)
[0    0]

where I_r denotes the (r × r) identity matrix and r = rank A;
(ii) two (m × n) matrices A and B are equivalent if and only if they have the same rank; and
(iii) A and A^T have the same rank.

4.3.18. Exercise. Prove Theorem 4.3.16.

Our definition of rank of a matrix given in the last section (Definition 4.2.25) is sometimes called the column rank of a matrix. Sometimes, an analogous definition for row rank of a matrix is also considered. The above theorem shows that the row rank of a matrix is equal to its column rank.
Next, let us consider the special case when X = Y. We have:

4.3.19. Theorem. Let A ∈ L(X, X), let {e_1, ..., e_n} be a basis for X, and let A be the matrix of A with respect to {e_1, ..., e_n}. Let {e'_1, ..., e'_n} be another basis for X whose matrix with respect to {e_1, ..., e_n} is P. Let A' be the matrix of A with respect to {e'_1, ..., e'_n}. Then

A' = P^{-1}AP.    (4.3.20)

The meaning of the above theorem is depicted schematically in Figure B: the same transformation A is represented by A with respect to {e_1, ..., e_n} and by A' with respect to {e'_1, ..., e'_n}, the coordinates in the two bases being related by P. The proof of this theorem is just a special application of Theorem 4.3.10.

4.3.21. Figure B. Schematic diagram of Theorem 4.3.19.
Theorem 4.3.19 gives rise to the following concept.

4.3.22. Definition. An (n × n) matrix A' is said to be similar to an (n × n) matrix A if there exists an (n × n) nonsingular matrix P such that

A' = P^{-1}AP.    (4.3.23)

If A' is similar to A, we write A' ~ A. We call P a similarity transformation.

It is a simple matter to prove the following:

4.3.24. Theorem. Let A' be similar to A; i.e., A' = P^{-1}AP, where P is nonsingular. Then A is similar to A', and A = PA'P^{-1}.

In view of this result, there is no ambiguity in saying two matrices are similar.

To sum up, if two matrices A and A' represent the same linear transformation A ∈ L(X, X), possibly with respect to two different bases for X, then A and A' are similar matrices.
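To see Definition 4.3.22 at work, take any nonsingular P and form A' = P^{-1}AP; Theorem 4.3.24 then says that PA'P^{-1} recovers A. A sketch with 2 × 2 matrices (chosen for illustration):

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inv2(P):
    # Inverse of a nonsingular 2 x 2 matrix.
    (a, b), (c, d) = P
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [0, 3]]
P = [[1, 1], [0, 1]]                             # nonsingular, det = 1
A_prime = mat_mul(mat_mul(inv2(P), A), P)        # A' = P^{-1} A P
A_back = mat_mul(mat_mul(P, A_prime), inv2(P))   # P A' P^{-1} recovers A
```

In this particular instance A' even comes out diagonal, since the columns of P happen to satisfy Ap_i = λ_i p_i, foreshadowing Section 4.5.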
Our next result shows that ~ given in Definition 4.3.22 is an equivalence relation.

4.3.25. Theorem. Let A, B, and C be (n × n) matrices. Then
(i) A is similar to A;
(ii) if A is similar to B, then B is similar to A; and
(iii) if A is similar to B and if B is similar to C, then A is similar to C.

4.3.26. Exercise. Prove Theorem 4.3.25.
For similar matrices we also have the following result.

4.3.27. Theorem.
(i) If an (n × n) matrix A is similar to an (n × n) matrix B, then A^k is similar to B^k, where k is a positive integer.
(ii) Let

f(λ) = α_0 + α_1 λ + ··· + α_m λ^m,    (4.3.28)

where α_0, ..., α_m ∈ F. Then

f(P^{-1}AP) = P^{-1} f(A) P.    (4.3.29)

This implies that if B is similar to A, then f(B) is similar to f(A). In fact, the same matrix P is involved.
(iii) Let A' be similar to A, and let f(λ) denote the polynomial of Eq. (4.3.28). Then f(A') = 0 if and only if f(A) = 0.
(iv) Let A ∈ L(X, X), and let A be the matrix of A with respect to a basis {e_1, ..., e_n} in X. Let f(λ) denote the polynomial of Eq. (4.3.28). Then f(A) is the matrix of f(A) with respect to the basis {e_1, ..., e_n}.
(v) Let A ∈ L(X, X), and let f(λ) denote the polynomial of Eq. (4.3.28). Let A be any matrix of A. Then f(A) = 0 if and only if f(A) = 0.

4.3.30. Exercise. Prove Theorem 4.3.27.
We can use results such as the preceding ones to good advantage. For example, let A denote the matrix

A = [λ_1  0    ...  0  ]
    [0    λ_2  ...  0  ]    (4.3.31)
    [.              .  ]
    [0    0    ...  λ_n]

Then

A^k = [λ_1^k  0      ...  0    ]
      [0      λ_2^k  ...  0    ]
      [.                  .    ]
      [0      0      ...  λ_n^k]

Now let f(λ) be given by Eq. (4.3.28). Then

f(A) = α_0 I + α_1 A + ··· + α_m A^m = [f(λ_1)  0       ...  0     ]
                                       [0       f(λ_2)  ...  0     ]
                                       [.                    .     ]
                                       [0       0       ...  f(λ_n)]

We conclude the present section with the following definition.

4.3.32. Definition. We call a matrix of the form (4.3.31) a diagonal matrix. Specifically, a square (n × n) matrix A = [a_ij] is said to be a diagonal matrix if a_ij = 0 for all i ≠ j. In this case we write A = diag(a_11, a_22, ..., a_nn).
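The computation preceding Definition 4.3.32 says that applying a polynomial to a diagonal matrix amounts to applying it to each diagonal entry. A quick numerical check (names ours):

```python
def poly(coeffs, x):
    # f(x) = a0 + a1*x + ... + am*x^m.
    return sum(a * x ** k for k, a in enumerate(coeffs))

def poly_of_diag(coeffs, diag_entries):
    # f(diag(l1, ..., ln)) = diag(f(l1), ..., f(ln)).
    n = len(diag_entries)
    return [[poly(coeffs, diag_entries[i]) if i == j else 0 for j in range(n)]
            for i in range(n)]

coeffs = [1, 0, 2]                      # f(lambda) = 1 + 2*lambda^2
fA = poly_of_diag(coeffs, [1, -1, 3])   # diag(f(1), f(-1), f(3))
```

Here f(1) = 3, f(−1) = 3, and f(3) = 19, so fA = diag(3, 3, 19).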
4.4. DETERMINANTS OF MATRICES

At this point of our development we need to consider the important topic of determinants. After stating the definition of the determinant of a matrix, we explore some of the commonly used properties of determinants. We then characterize singular and nonsingular linear transformations on finite-dimensional vector spaces in terms of determinants. Finally, we give a method of determining the inverse of nonsingular matrices.

Let N = {1, 2, ..., n}. We recall (see Definition 1.2.28) that a permutation on N is a one-to-one mapping of N onto itself. For example, if σ denotes a permutation on N, then we can represent it as

σ = (1   2   ...  n  )
    (j_1 j_2 ...  j_n)

where j_i ∈ N for i = 1, ..., n and j_i ≠ j_k for i ≠ k. Henceforth, we represent σ given above, more compactly, as

σ = j_1 j_2 ... j_n.

Clearly, there are n! possible permutations on N. We let P(N) denote the set of all permutations on N, and we distinguish between odd and even permutations. Specifically, if there is an even number of pairs (i, k) such that i > k but i precedes k in σ, then we say that σ is even. Otherwise σ is said to be odd. Finally, we define the function sgn from P(N) into F by

sgn(σ) = +1 if σ is even,
         −1 if σ is odd,

for all σ ∈ P(N).
Before giving the definition of the determinant of a matrix, let us consider a specific example.

4.4.1. Example. As indicated in the accompanying table, there are six permutations on N = {1, 2, 3}. In this table the odd and even permutations are identified and the function sgn is given.

σ     (j_1, j_2)  (j_1, j_3)  (j_2, j_3)  σ odd or even  sgn σ
123   (1, 2)      (1, 3)      (2, 3)      even           +1
132   (1, 3)      (1, 2)      (3, 2)      odd            −1
213   (2, 1)      (2, 3)      (1, 3)      odd            −1
231   (2, 3)      (2, 1)      (3, 1)      even           +1
312   (3, 1)      (3, 2)      (1, 2)      even           +1
321   (3, 2)      (3, 1)      (2, 1)      odd            −1
Now let A denote the (n × n) matrix

A = [a_11  a_12  ...  a_1n]
    [.                 .  ]
    [a_n1  a_n2  ...  a_nn]

We form the product of n elements from A by taking one and only one element from each row and one and only one element from each column. We represent this product as

a_1j_1 a_2j_2 ··· a_nj_n,

where σ = (j_1 ... j_n) ∈ P(N). It is possible to find n! such products, one for each σ ∈ P(N). We now define the determinant of A, denoted by det(A), by the sum

det(A) = Σ_{σ∈P(N)} sgn(σ) a_1j_1 a_2j_2 ··· a_nj_n,    (4.4.2)

where σ = j_1 ... j_n. We also denote the determinant of A by writing

         |a_11  a_12  ...  a_1n|
det(A) = |.                 .  |    (4.4.3)
         |a_n1  a_n2  ...  a_nn|
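Definition (4.4.2) can be implemented verbatim: sum sgn(σ) · a_1j_1 ··· a_nj_n over all n! permutations, with the sign computed by counting inversions (the pairs that are out of order). A sketch (hopelessly inefficient for large n, but a direct transcription of the definition):

```python
from itertools import permutations

def sgn(p):
    # +1 if the number of inversions (pairs out of order) is even, -1 if odd.
    inversions = sum(1 for i in range(len(p)) for k in range(i + 1, len(p))
                     if p[i] > p[k])
    return 1 if inversions % 2 == 0 else -1

def det(A):
    # det(A) = sum over permutations of sgn(sigma) * a_{1,sigma(1)} * ... ,
    # as in Eq. (4.4.2), with 0-based indices.
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sgn(p)
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total

A = [[1, 2], [3, 4]]     # det = 1*4 - 2*3 = -2
```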
We now present some of the fundamental properties of determinants.

4.4.4. Theorem. Let A and B be (n × n) matrices.
(i) det(A^T) = det(A).
(ii) If all elements of a column (or row) of A are zero, then det(A) = 0.
(iii) If B is the matrix obtained by multiplying every element in a column (or row) of A by a constant α, while all other columns of B are the same as those in A, then det(B) = α det(A).
(iv) If B is the same as A, except that two columns (or rows) are interchanged, then det(B) = −det(A).
(v) If two columns (or rows) of A are identical, then det(A) = 0.
(vi) If the columns (or rows) of A are linearly dependent, then det(A) = 0.

Proof. To prove the first part, we note first that each product in the sum given in Eq. (4.4.2) has as a factor one and only one element from each column and each row of A. Thus, transposing matrix A will not affect the n! products appearing in the summation. We now must check to see that the sign of each term is the same.

For σ ∈ P(N), the term in det(A) corresponding to σ is sgn(σ) a_1j_1 a_2j_2 ··· a_nj_n. There is a product term in det(A^T) of the form a_j'_1 1 a_j'_2 2 ··· a_j'_n n such that a_1j_1 a_2j_2 ··· a_nj_n = a_j'_1 1 a_j'_2 2 ··· a_j'_n n. The right-hand side of this equation is just a rearrangement of the left-hand side. The number of j_i > j_{i+1} for i = 1, ..., n − 1 is the same as the number of j'_i > j'_{i+1} for i = 1, ..., n − 1. Thus, if σ' = j'_1 ... j'_n, then sgn(σ) = sgn(σ'), which means det(A^T) = det(A). Note that this result implies that any property below which is proved for columns holds equally as well for rows.

To prove the second part, we note from Eq. (4.4.2) that if for some i, a_ik = 0 for all k, then det(A) = 0. This proves that if every element in a row of A is zero, then det(A) = 0. By part (i) it follows that this result holds also for columns.
4.4.5. Exercise. Prove parts (iii)-(vi) of Theorem 4.4.4.

We now introduce some additional concepts for determinants.

4.4.6. Definition. Let A = [a_ij] be an n × n matrix. If the ith row and jth column of A are deleted, the remaining (n − 1) rows and (n − 1) columns can be used to form another matrix M_ij whose determinant is det(M_ij). We call det(M_ij) the minor of a_ij. If the diagonal elements of M_ij are diagonal elements of A, i.e., i = j, then we speak of a principal minor of A. The cofactor of a_ij is defined as (−1)^{i+j} det(M_ij).

For example, if A is a (3 × 3) matrix, then

         |a_11  a_12  a_13|
det(A) = |a_21  a_22  a_23|,
         |a_31  a_32  a_33|

the minor of element a_23 is

det(M_23) = |a_11  a_12|
            |a_31  a_32|

and the cofactor of a_23 is

c_23 = (−1)^{2+3} det(M_23) = −|a_11  a_12|
                               |a_31  a_32|
The next result provides us with a convenient method of evaluating determinants.

4.4.7. Theorem. Let A be an n × n matrix. Let c_ij denote the cofactor of a_ij, i, j = 1, ..., n. Then the determinant of A is equal to the sum of the products of the elements of any column (or row) of A, each by its own cofactor. Specifically,

det(A) = Σ_{i=1}^{n} a_ij c_ij    (4.4.8)

for j = 1, ..., n, and

det(A) = Σ_{j=1}^{n} a_ij c_ij    (4.4.9)

for i = 1, ..., n.

For example, if A is a (2 × 2) matrix, then we have

det(A) = |a_11  a_12| = a_11 a_22 − a_12 a_21.
         |a_21  a_22|

If A is a (3 × 3) matrix, then we have

         |a_11  a_12  a_13|
det(A) = |a_21  a_22  a_23| = a_11 c_11 + a_12 c_12 + a_13 c_13.
         |a_31  a_32  a_33|

In this case five other possibilities exist. For example, we also have

det(A) = a_11 c_11 + a_21 c_21 + a_31 c_31.
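Theorem 4.4.7 yields the familiar recursive evaluation: expand along the first row, each cofactor carrying the sign (−1)^{i+j} and a minor obtained by deleting one row and one column. A sketch (names ours):

```python
def minor(A, i, j):
    # Delete row i and column j of A.
    return [[A[r][c] for c in range(len(A)) if c != j]
            for r in range(len(A)) if r != i]

def det(A):
    # Cofactor (Laplace) expansion along the first row; (-1)**j is the
    # sign (-1)^{i+j} with i = 0 in 0-based indexing.
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(n))

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]        # det = -3
```

Expanding along any other row or column, with the appropriate signs, gives the same value.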
4.4.10. Exercise. Prove Theorem 4.4.7.

We also have:

4.4.11. Theorem. If the ith row of an (n × n) matrix A consists of elements of the form a_i1 + a'_i1, a_i2 + a'_i2, ..., a_in + a'_in, then

det(A) = det(A_1) + det(A_2),

where A_1 and A_2 are the matrices obtained from A by replacing its ith row by (a_i1, ..., a_in) and by (a'_i1, ..., a'_in), respectively.

4.4.12. Exercise. Prove Theorem 4.4.11.

Furthermore, we have:

4.4.13. Theorem. Let A and B be (n × n) matrices. If B is obtained from the matrix A by adding a constant α times any column (or row) to any other column (or row) of A, then det(B) = det(A).

4.4.14. Exercise. Prove Theorem 4.4.13.

In addition, we can prove:

4.4.15. Theorem. Let A be an (n × n) matrix, and let c_ij denote the cofactor of a_ij, i, j = 1, ..., n. Then the sum of products of the elements of any column (or row) by the corresponding cofactors of the elements of any other column (or row) is zero. That is,

Σ_{i=1}^{n} a_ij c_ik = 0   for j ≠ k,    (4.4.16a)

and

Σ_{j=1}^{n} a_ij c_kj = 0   for i ≠ k.    (4.4.16b)

4.4.17. Exercise. Prove Theorem 4.4.15.

We can combine Eqs. (4.4.8) and (4.4.16a) to obtain

Σ_{i=1}^{n} a_ij c_ik = det(A) δ_jk,    (4.4.18)

j, k = 1, ..., n, where δ_jk denotes the Kronecker delta. Similarly, we can combine Eqs. (4.4.9) and (4.4.16b) to obtain

Σ_{j=1}^{n} a_ij c_kj = det(A) δ_ik,    (4.4.19)

i, k = 1, ..., n.
We are now in a position to prove the following important result.

4.4.20. Theorem. Let A and B be (n × n) matrices. Then

det(AB) = det(A) det(B).    (4.4.21)

Proof. We have

det(AB) = det([Σ_{k=1}^{n} a_ik b_kj]).

By Theorem 4.4.11 and Theorem 4.4.4, part (iii), this determinant can be expanded into a sum of terms of the form b_{i_1 1} b_{i_2 2} ··· b_{i_n n} D, where D is the determinant of the matrix whose jth column is the i_j th column of A. Such a determinant D will vanish whenever two or more of the indices i_j, j = 1, ..., n, are identical. Thus, we need to sum only over σ ∈ P(N). We have

det(AB) = Σ_{σ∈P(N)} b_{i_1 1} b_{i_2 2} ··· b_{i_n n} D_σ,

where σ = i_1 i_2 ... i_n, P(N) is the set of all permutations of N = {1, ..., n}, and D_σ is the determinant whose columns are the i_1th, ..., i_nth columns of A, in that order. It is now straightforward to show that

D_σ = sgn(σ) det(A),

and hence it follows that

det(AB) = det(A) det(B).
Our next result is readily verified.

4.4.22. Theorem. Let I be the (n × n) identity matrix, and let 0 be the (n × n) zero matrix. Then det(I) = 1 and det(0) = 0.

4.4.23. Exercise. Prove Theorem 4.4.22.

The next theorem allows us to characterize nonsingular matrices in terms of their determinants.

4.4.24. Theorem. An (n × n) matrix A is nonsingular if and only if det(A) ≠ 0.

Proof. Suppose that A is nonsingular. Then A^{-1} exists and A^{-1}A = AA^{-1} = I. From this it follows that det(A^{-1}A) = 1 ≠ 0, and thus, in view of Eq. (4.4.21), det(A^{-1}) ≠ 0 and det(A) ≠ 0.

Next, assume that A is singular. By Theorem 4.3.16, there exist nonsingular matrices Q and P such that

A' = QAP = [I_r  0]
           [0    0]

with r < n. This shows that rank A < n and det(A') = 0. But

det(QAP) = det(Q) det(A) det(P) = 0,

and det(P) ≠ 0 and det(Q) ≠ 0. Therefore, if A is singular, then det(A) = 0.
Let us now turn to the problem of finding the inverse A^{-1} of a nonsingular matrix A. In doing so, we need to introduce the classical adjoint of A.

4.4.25. Definition. Let A be an (n × n) matrix, and let c_ij be the cofactor of a_ij for i, j = 1, ..., n. Let C be the matrix formed by the cofactors of A; i.e., C = [c_ij]. The matrix C^T is called the classical adjoint of A. We write adj(A) to denote the classical adjoint of A.

We now have:

4.4.26. Theorem. Let A be an (n × n) matrix. Then

A adj(A) = [adj(A)]A = [det(A)] I.

Proof. The proof follows by direct computation, using Eqs. (4.4.18) and (4.4.19).

As an immediate consequence of Theorem 4.4.26 we now have the following practical result.

4.4.27. Corollary. Let A be a nonsingular (n × n) matrix. Then

A^{-1} = (1/det(A)) adj(A).    (4.4.28)
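Corollary 4.4.27 gives a closed-form inverse: build the matrix of cofactors, transpose it to get adj(A), and divide by det(A). A sketch built on cofactor expansion (names ours):

```python
def minor(A, i, j):
    return [[A[r][c] for c in range(len(A)) if c != j]
            for r in range(len(A)) if r != i]

def det(A):
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def inverse(A):
    # A^{-1} = adj(A) / det(A), Eq. (4.4.28), where adj(A) is the transposed
    # matrix of cofactors c_ij = (-1)^{i+j} det(M_ij).
    n, d = len(A), det(A)
    assert d != 0, "A is singular"
    cof = [[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(n)]
           for i in range(n)]
    return [[cof[j][i] / d for j in range(n)] for i in range(n)]  # transpose of cof

A = [[2, 1], [1, 1]]     # det = 1
Ainv = inverse(A)        # [[1.0, -1.0], [-1.0, 2.0]]
```

For n beyond four or five this is far slower than elimination, but it mirrors the corollary exactly.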
4.4.29. Example. For a given (3 × 3) matrix A with det(A) = 1, computing the cofactor of each entry, forming C = [c_ij], and transposing yields adj(A) = C^T; by Eq. (4.4.28), A^{-1} = adj(A)/det(A) = adj(A).
The proofs of the next two theorems are left as an exercise.

4.4.30. Theorem. If A and B are similar matrices, then det(A) = det(B).

4.4.31. Theorem. Let A ∈ L(X, X). Let A be the matrix of A with respect to a basis {e_1, ..., e_n} in X, and let A' be the matrix of A with respect to another basis {e'_1, ..., e'_n} in X. Then det(A) = det(A').

4.4.32. Exercise. Prove Theorems 4.4.30 and 4.4.31.

In view of the preceding results, there is no ambiguity in the following definition.

4.4.33. Definition. The determinant of a linear transformation A of a finite-dimensional vector space X into X is the determinant of any matrix A representing it; i.e., det(A) ≜ det(A).

The last result of the present section is a consequence of Theorems 4.4.20 and 4.4.24.

4.4.34. Theorem. Let X be a finite-dimensional vector space, and let A, B ∈ L(X, X). Then A is nonsingular if and only if det(A) ≠ 0. Also, det(AB) = det(A) det(B).
4.5. EIGENVALUES AND EIGENVECTORS

In the present section we consider eigenvalues and eigenvectors of linear transformations defined on finite-dimensional vector spaces. Later, in Chapter 7, we will reconsider these concepts in a more general setting. Eigenvalues and eigenvectors play, of course, a crucial role in the study of linear transformations.

Throughout the present section, X denotes an n-dimensional vector space over a field F.

Let A ∈ L(X, X), and let us assume that there exist sets of vectors {e_1, ..., e_n} and {e'_1, ..., e'_n}, which are bases for X, such that

e'_1 = Ae_1 = λ_1 e_1,
 .                        (4.5.1)
e'_n = Ae_n = λ_n e_n,

where λ_i ∈ F, i = 1, ..., n. If this is the case, then the matrix A of A with respect to the given basis is

A = [λ_1  0    ...  0  ]
    [0    λ_2  ...  0  ]
    [.              .  ]
    [0    0    ...  λ_n]

This motivates the following result.
Chapter 4. Finite-Dimensional Vector Spaces and Matrices
4.5.2. Theorem. Let A ∈ L(X, X), and let λ ∈ F. Then the set of all x ∈ X such that

    Ax = λx    (4.5.3)

is a linear subspace of X. In fact, it is the null space of the linear transformation (A − λI), where I is the identity element of L(X, X).

Proof. Since the zero vector satisfies Eq. (4.5.3) for any λ ∈ F, the set is non-void. If the zero vector is the only such vector, then we are done, for {0} is a linear subspace of X (of dimension zero). In any case, Eq. (4.5.3) holds if and only if (A − λI)x = 0. Thus, x belongs to the null space of (A − λI), and it follows from Theorem 3.4.19 that the set of all x ∈ X satisfying Eq. (4.5.3) is a linear subspace of X.

Henceforth we let

    𝔑_λ = {x ∈ X : (A − λI)x = 0}.    (4.5.4)

The preceding result gives rise to several important concepts which we introduce in the following definition.

4.5.5. Definition. Let A ∈ L(X, X), λ ∈ F, and let 𝔑_λ be defined as in Theorem 4.5.2 and Eq. (4.5.4). A scalar λ such that 𝔑_λ contains more than just the zero vector is called an eigenvalue of A (i.e., if there is an x ≠ 0 such that Ax = λx, then λ is called an eigenvalue of A). When λ is an eigenvalue of A, then each x ≠ 0 in 𝔑_λ is called an eigenvector of A corresponding to the eigenvalue λ. The dimension of the linear subspace 𝔑_λ is called the multiplicity of the eigenvalue λ. If 𝔑_λ is of dimension one, then λ is called a simple eigenvalue. The set of all eigenvalues of A is called the spectrum of A.

Some authors call an eigenvalue a proper value or a characteristic value or a latent value or a secular value. Similarly, other names for eigenvector are proper vector or characteristic vector. The space 𝔑_λ is called the λth proper subspace of X.
For matrices we give the following corresponding definition.

4.5.6. Definition. Let A be an (n × n) matrix whose elements belong to the field F. If there exist λ ∈ F and a non-zero vector x ∈ Fⁿ such that

    Ax = λx,    (4.5.7)

then λ is called an eigenvalue of A and x is called an eigenvector of A corresponding to the eigenvalue λ.

Our next result provides the connection between Definitions 4.5.5 and 4.5.6.

4.5.8. Theorem. Let A ∈ L(X, X), and let A be the matrix of A with respect to the basis {e1, ..., en}. Then λ is an eigenvalue of A if and only if λ is an eigenvalue of A. Also, x ∈ X is an eigenvector of A corresponding to λ if and only if the coordinate representation of x with respect to the basis {e1, ..., en} is an eigenvector of A corresponding to λ.

4.5.9. Exercise. Prove Theorem 4.5.8.

Note that if x is an eigenvector of A (or of A), then any non-zero multiple of x is also an eigenvector of A (or of A).
In the next result, the proof of which is left as an exercise, we use determinants to characterize eigenvalues. We have:

4.5.10. Theorem. Let A ∈ L(X, X). Then λ ∈ F is an eigenvalue of A if and only if det(A − λI) = 0.

4.5.11. Exercise. Prove Theorem 4.5.10.

Let us next examine the equation

    det(A − λI) = 0    (4.5.12)

in terms of the parameter λ. We ask: can we determine which values of λ, if any, satisfy Eq. (4.5.12)? Let {e1, ..., en} be an arbitrary basis for X and let A be the matrix of A with respect to this basis. We then have

    det(A − λI) = det(A − λI).    (4.5.13)

The right-hand side of Eq. (4.5.13) may be rewritten as

    det(A − λI) = det [ (a11 − λ)   a12     ...   a1n
                        a21      (a22 − λ)  ...   a2n
                        ...
                        an1        an2     ... (ann − λ) ].    (4.5.14)

It is clear from Eq. (4.4.2) that expansion of the determinant (4.5.14) yields a polynomial in λ of degree n. In order for λ to be an eigenvalue of A it must (a) satisfy Eq. (4.5.12), and (b) belong to F. Requirement (b) warrants further comment: note that there is no guarantee that there exists λ ∈ F such that Eq. (4.5.12) is satisfied, or equivalently, we have no assurance that the nth-order polynomial equation

    det(A − λI) = 0

has any roots in F. There is, however, a special class of fields for which requirement (b) is automatically satisfied. We have:

4.5.15. Definition. A field F is said to be algebraically closed if for every polynomial p(λ) there is at least one λ ∈ F such that

    p(λ) = 0.    (4.5.16)
Any λ which satisfies Eq. (4.5.16) is said to be a root of the polynomial equation (4.5.16).

In particular, the field of complex numbers is algebraically closed, whereas the field of real numbers is not (e.g., consider the equation λ² + 1 = 0). There are other fields besides the field of complex numbers which are algebraically closed. However, since we will not develop these, we will restrict ourselves to the field of complex numbers, C, whenever the algebraic closure property of Definition 4.5.15 is required. When considering results that are valid for a vector space over an arbitrary field, we will (as before) make use of the symbol F or frequently (as before) make no reference to F at all.

We summarize the above discussion in the following theorem.

4.5.17. Theorem. Let A ∈ L(X, X). Then

(i) det(A − λI) is a polynomial of degree n in the parameter λ; i.e., there exist scalars α0, α1, ..., αn, depending only on A, such that

    det(A − λI) = α0 + α1 λ + ... + αn λⁿ    (4.5.18)

(note that α0 = det(A) and αn = (−1)ⁿ);

(ii) the eigenvalues of A are precisely the roots of the equation det(A − λI) = 0; i.e., they are the roots of

    α0 + α1 λ + α2 λ² + ... + αn λⁿ = 0; and    (4.5.19)

(iii) A has, at most, n distinct eigenvalues.

The above result motivates the following definition.

4.5.20. Definition. Let A ∈ L(X, X), and let A be a matrix of A. We call

    det(A − λI) = det(A − λI) = α0 + α1 λ + ... + αn λⁿ    (4.5.21)

the characteristic polynomial of A (or of A) and

    det(A − λI) = det(A − λI) = 0    (4.5.22)

the characteristic equation of A (or of A).
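Numerically, the coefficients of the characteristic polynomial and its roots are directly available. The following sketch uses an arbitrarily chosen 2 × 2 matrix (an assumption of this sketch, not a matrix from the text); note that numpy's `poly` returns the coefficients of det(λI − A), which differs from Eq. (4.5.21) by the factor (−1)ⁿ.

```python
import numpy as np

A = np.array([[1., 2.],
              [4., 3.]])        # illustrative matrix, not from the text

# np.poly returns the coefficients of det(lambda*I - A), highest power first;
# this equals (-1)^n det(A - lambda*I) of Eq. (4.5.21).
coeffs = np.poly(A)             # here: lambda^2 - 4*lambda - 5
eigs = np.roots(coeffs)         # roots of the characteristic equation

n = A.shape[0]
# The constant term of det(lambda*I - A) is (-1)^n det(A), matching
# the remark alpha_0 = det(A) in Theorem 4.5.17(i).
assert np.isclose(coeffs[-1], (-1) ** n * np.linalg.det(A))
# Theorem 4.5.17(ii): the roots are exactly the eigenvalues of A
assert np.allclose(sorted(eigs), sorted(np.linalg.eigvals(A)))
```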
From the fundamental properties of polynomials over the field of complex numbers there now follows:

4.5.23. Theorem. If X is an n-dimensional vector space over C and if A ∈ L(X, X), then it is possible to write the characteristic polynomial of A in the form

    det(A − λI) = (λ1 − λ)^m1 (λ2 − λ)^m2 ... (λp − λ)^mp,    (4.5.24)

where λi, i = 1, ..., p, are the distinct roots of Eq. (4.5.19) (i.e., λi ≠ λj for i ≠ j). In Eq. (4.5.24), mi is called the algebraic multiplicity of the root λi. The mi are positive integers, and Σ_{i=1}^{p} mi = n.

Note the distinction between the concept of algebraic multiplicity of λi given in Theorem 4.5.23 and the multiplicity of λi as given in Definition 4.5.5. In general, these need not be the same, as will be seen later.
We now state and prove one of the most important results of linear algebra, the Cayley–Hamilton theorem.

4.5.25. Theorem. Let A be an n × n matrix, and let p(λ) = det(A − λI) be the characteristic polynomial of A. Then

    p(A) = 0.

Proof. Let the characteristic polynomial for A be

    p(λ) = α0 + α1 λ + ... + αn λⁿ.

Now let B(λ) be the classical adjoint of (A − λI). Since the elements b_ij(λ) of B(λ) are cofactors of the matrix A − λI, they are polynomials in λ of degree not more than n − 1. Thus,

    b_ij(λ) = β_ij0 + β_ij1 λ + ... + β_ij,n−1 λ^{n−1}.

Letting B_k = [β_ijk] for k = 0, 1, ..., n − 1, we have

    B(λ) = B0 + λB1 + ... + λ^{n−1} B_{n−1}.

By Theorem 4.4.26,

    (A − λI)B(λ) = [det(A − λI)] I.

Thus,

    (A − λI)(B0 + λB1 + ... + λ^{n−1} B_{n−1}) = (α0 + α1 λ + ... + αn λⁿ) I.

Expanding the left-hand side of this equation and equating like powers of λ, we have

    −B_{n−1} = αn I,  AB_{n−1} − B_{n−2} = α_{n−1} I,  ...,  AB1 − B0 = α1 I,  AB0 = α0 I.

Premultiplying the above matrix equations by Aⁿ, A^{n−1}, ..., A, I, respectively, we have

    −Aⁿ B_{n−1} = αn Aⁿ,  Aⁿ B_{n−1} − A^{n−1} B_{n−2} = α_{n−1} A^{n−1},  ...,
    A² B1 − A B0 = α1 A,  A B0 = α0 I.

Adding these matrix equations, we obtain

    0 = α0 I + α1 A + ... + αn Aⁿ = p(A),

which was to be shown.
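The Cayley–Hamilton theorem is easy to verify numerically. The following sketch uses an arbitrarily chosen 3 × 3 matrix (an assumption of this sketch) and evaluates the characteristic polynomial at the matrix itself; since det(λI − A) and det(A − λI) differ only by a sign, either version annihilates A.

```python
import numpy as np

A = np.array([[1., 2., 0.],
              [0., 2., 1.],
              [1., 0., 3.]])    # illustrative matrix

c = np.poly(A)                  # monic coefficients of det(lambda*I - A)
# Evaluate the characteristic polynomial at the matrix A itself:
pA = c[0] * np.linalg.matrix_power(A, 3) \
   + c[1] * np.linalg.matrix_power(A, 2) \
   + c[2] * A \
   + c[3] * np.eye(3)
# Cayley-Hamilton: every square matrix satisfies its own characteristic equation
assert np.allclose(pA, 0, atol=1e-8)
```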
As an immediate consequence of the Cayley–Hamilton theorem, we have:

4.5.26. Theorem. Let A be an (n × n) matrix with characteristic polynomial given by Eq. (4.5.21). Then

(i) Aⁿ = (−1)^{n+1} [α0 I + α1 A + ... + α_{n−1} A^{n−1}]; and

(ii) if f(λ) is any polynomial in λ, then there exist β0, β1, ..., β_{n−1} ∈ F such that

    f(A) = β0 I + β1 A + ... + β_{n−1} A^{n−1}.

Proof. Part (i) follows from Theorem 4.5.25 and from the fact that αn = (−1)ⁿ.

To prove part (ii), let f(λ) be any polynomial in λ and let p(λ) denote the characteristic polynomial of A. Then there exist two polynomials g(λ) and r(λ) (see Theorem 2.3.9) such that

    f(λ) = p(λ)g(λ) + r(λ),    (4.5.27)

where deg r(λ) ≤ n − 1. Using the fact that p(A) = 0, we have f(A) = r(A), and the theorem follows.

The Cayley–Hamilton theorem holds also in the case of linear transformations. Specifically, we have the following result.

4.5.28. Theorem. Let A ∈ L(X, X), and let p(λ) denote the characteristic polynomial of A. Then p(A) = 0.

4.5.29. Exercise. Prove Theorem 4.5.28.
Let us now consider a specific example.

4.5.30. Example. Consider the matrix

    A = [1  0
         1  2].

Let us use Theorem 4.5.26 to evaluate A³⁷. Since n = 2, we assume that A³⁷ is of the form

    A³⁷ = β0 I + β1 A.

The characteristic polynomial of A is

    p(λ) = (1 − λ)(2 − λ),

and the eigenvalues of A are λ1 = 1 and λ2 = 2. In the present case f(λ) = λ³⁷, and r(λ) in Eq. (4.5.27) is

    r(λ) = β0 + β1 λ.

We must determine β0 and β1. Using the fact that p(λ1) = p(λ2) = 0, it follows that f(λ1) = r(λ1) and f(λ2) = r(λ2). Thus, we have

    β0 + β1 = 1³⁷ = 1,   β0 + 2β1 = 2³⁷.

Hence, β1 = 2³⁷ − 1 and β0 = 2 − 2³⁷. Therefore,

    A³⁷ = (2 − 2³⁷)I + (2³⁷ − 1)A,

or,

    A³⁷ = [1          0
           2³⁷ − 1   2³⁷].
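The shortcut of Example 4.5.30 can be confirmed against brute-force multiplication with exact integer arithmetic. This sketch assumes the matrix A = [[1, 0], [1, 2]] as reconstructed in the example above.

```python
def matmul2(X, Y):
    """2x2 integer matrix product."""
    return [[X[0][0]*Y[0][0] + X[0][1]*Y[1][0], X[0][0]*Y[0][1] + X[0][1]*Y[1][1]],
            [X[1][0]*Y[0][0] + X[1][1]*Y[1][0], X[1][0]*Y[0][1] + X[1][1]*Y[1][1]]]

A = [[1, 0], [1, 2]]
P = [[1, 0], [0, 1]]
for _ in range(37):              # brute-force A^37 for comparison
    P = matmul2(P, A)

# Theorem 4.5.26(ii): A^37 = beta0*I + beta1*A with r(1) = 1 and r(2) = 2**37
beta1 = 2**37 - 1
beta0 = 2 - 2**37
shortcut = [[beta0 + beta1 * A[0][0], beta1 * A[0][1]],
            [beta1 * A[1][0], beta0 + beta1 * A[1][1]]]
assert shortcut == P == [[1, 0], [2**37 - 1, 2**37]]
```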
Before closing the present section, let us introduce another important concept for matrices.

4.5.31. Definition. If A is an (n × n) matrix, then the trace of A, denoted by trace A or by tr A, is defined as

    trace A = tr A = a11 + a22 + ... + ann    (4.5.32)

(i.e., the trace of a square matrix is the sum of its diagonal elements).

It turns out that if F = C, the field of complex numbers, then there is a relationship between the trace, determinant, and eigenvalues of an (n × n) matrix A. We have:

4.5.33. Theorem. Let X be a vector space over C. Let A be a matrix of A ∈ L(X, X) and let det(A − λI) be given by Eq. (4.5.24). Then

(i) det(A) = Π_{i=1}^{p} λi^{mi};

(ii) trace(A) = Σ_{i=1}^{p} mi λi;

(iii) if B is any matrix similar to A, then trace(B) = trace(A); and

(iv) if f(λ) denotes the polynomial

    f(λ) = γ0 + γ1 λ + ... + γm λ^m,

then the roots of the characteristic polynomial of f(A) are f(λ1), ..., f(λp), and

    det f(A) = [f(λ1)]^{m1} [f(λ2)]^{m2} ... [f(λp)]^{mp}.

4.5.34. Exercise. Prove Theorem 4.5.33.
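Parts (i)–(iii) of Theorem 4.5.33 can be spot-checked numerically; the random matrix and the seed below are arbitrary choices of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))          # a random illustrative matrix
eigs = np.linalg.eigvals(A)

# (i): det(A) is the product of the eigenvalues (with multiplicity)
assert np.isclose(np.linalg.det(A), np.prod(eigs).real)
# (ii): trace(A) is the sum of the eigenvalues (with multiplicity)
assert np.isclose(np.trace(A), np.sum(eigs).real)
# (iii): the trace is invariant under similarity
P = rng.standard_normal((4, 4))          # almost surely non-singular
B = np.linalg.inv(P) @ A @ P
assert np.isclose(np.trace(B), np.trace(A))
```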
4.6. SOME CANONICAL FORMS OF MATRICES

In the present section we investigate under which conditions a linear transformation of a vector space into itself can be represented by special types of matrices, namely, by (a) a diagonal matrix, (b) a so-called triangular matrix, and (c) a so-called "block diagonal matrix." We will also investigate when a linear transformation cannot be represented by a diagonal matrix.

Throughout the present section, X denotes an n-dimensional vector space over a field F.
4.6.1. Theorem. Let λ1, ..., λp be distinct eigenvalues of a linear transformation A ∈ L(X, X). Let e1 ≠ 0, ..., ep ≠ 0 be eigenvectors of A corresponding to λ1, ..., λp, respectively. Then the set {e1, ..., ep} is linearly independent.

Proof. The proof is by contradiction. Assume that the set {e1, ..., ep} is linearly dependent, so that there exist scalars α1, ..., αp, not all zero, such that

    α1 e1 + ... + αp ep = 0.

We assume that these scalars have been chosen in such a fashion that as few of them as possible are non-zero. Relabeling, if necessary, we thus have

    α1 e1 + ... + αr er = 0,    (4.6.2)

where α1 ≠ 0, ..., αr ≠ 0 and where r ≤ p is the smallest number for which we can get such an expression.

Since λ1, ..., λr are eigenvalues and since e1, ..., er are eigenvectors, we have

    0 = A(0) = A(α1 e1 + ... + αr er) = α1 A e1 + ... + αr A er
      = (α1 λ1) e1 + ... + (αr λr) er.    (4.6.3)

Also,

    0 = λr · 0 = λr (α1 e1 + ... + αr er) = (α1 λr) e1 + ... + (αr λr) er.    (4.6.4)

Subtracting Eq. (4.6.4) from Eq. (4.6.3), we obtain

    0 = α1 (λ1 − λr) e1 + ... + α_{r−1} (λ_{r−1} − λr) e_{r−1}.

Since by assumption the λi's are distinct, we have found an expression involving only (r − 1) vectors satisfying Eq. (4.6.2). But r was chosen to be the smallest number for which Eq. (4.6.2) holds. We have thus arrived at a contradiction, and our theorem is proved.

We note that if, in the above theorem, A has n distinct eigenvalues, then the corresponding n eigenvectors span the linear space X (recall that dim X = n).
Our next result enables us to represent a linear transformation with n distinct eigenvalues in a very convenient form.

4.6.5. Theorem. Let A ∈ L(X, X). Assume that the characteristic polynomial of A has n distinct roots, so that

    det(A − λI) = (λ1 − λ)(λ2 − λ) ... (λn − λ),

where λ1, λ2, ..., λn are distinct eigenvalues. Then there exists a basis {e1′, e2′, ..., en′} of X such that ei′ is an eigenvector corresponding to λi for i = 1, 2, ..., n. The matrix A′ of A with respect to the basis {e1′, e2′, ..., en′} is

    A′ = [λ1          0
             λ2
                ⋱
          0          λn].    (4.6.6)

Proof. Let ei′ denote the eigenvector corresponding to the eigenvalue λi. In view of Theorem 4.6.1, the set {e1′, e2′, ..., en′} is linearly independent, because λ1, λ2, ..., λn are all different. Moreover, since there are n of the ei′, the set {e1′, e2′, ..., en′} forms a basis for the n-dimensional space X. Also, from the definition of eigenvalue and eigenvector, we have

    A e1′ = λ1 e1′,
    A e2′ = λ2 e2′,
    ...
    A en′ = λn en′.    (4.6.7)

From Eq. (4.6.7) we obtain the desired matrix given in Eq. (4.6.6).
The reader can readily prove the following useful result.

4.6.8. Theorem. Let A ∈ L(X, X), and let A be the matrix of A with respect to a basis {e1, e2, ..., en}. If the characteristic polynomial

    det(A − λI) = α0 + α1 λ + ... + αn λⁿ

has n distinct roots λ1, ..., λn, then A is similar to the matrix A′ of A with respect to a basis {e1′, ..., en′}, where

    A′ = [λ1          0
              ⋱
          0          λn].    (4.6.9)

In this case there exists a non-singular matrix P such that

    A′ = P⁻¹AP.    (4.6.10)

The matrix P is the matrix of basis {e1′, e2′, ..., en′} with respect to basis {e1, e2, ..., en}, and P⁻¹ is the matrix of basis {e1, ..., en} with respect to basis {e1′, ..., en′}. The matrix P can be constructed by letting its columns be eigenvectors of A corresponding to λ1, ..., λn, respectively. That is,

    P = [x1, x2, ..., xn],    (4.6.11)

where x1, ..., xn are eigenvectors of A corresponding to the eigenvalues λ1, ..., λn, respectively.

The similarity transformation P given in Eq. (4.6.11) is called a modal matrix. If the conditions of Theorem 4.6.8 are satisfied and if, in particular, Eq. (4.6.9) holds, then we say that matrix A has been diagonalized.

4.6.12. Exercise. Prove Theorem 4.6.8.
Let us now consider some specific examples.

4.6.13. Example. Let X be a two-dimensional vector space over the field of real numbers. Let A ∈ L(X, X), and let {e1, e2} be a basis for X. Suppose the matrix A of A with respect to this basis is given by

    A = [−2  4
          1  1].

The characteristic polynomial of A is

    p(λ) = det(A − λI) = det(A − λI) = λ² + λ − 6.

Now det(A − λI) = 0 if and only if λ² + λ − 6 = 0, or (λ − 2)(λ + 3) = 0. Thus, the eigenvalues of A are λ1 = 2 and λ2 = −3. To find an eigenvector corresponding to λ1, we solve the equation (A − λ1 I)x1 = 0, or

    [−4   4] [ξ1]   [0]
    [ 1  −1] [ξ2] = [0].

The last equation yields the equations

    −4ξ1 + 4ξ2 = 0,   ξ1 − ξ2 = 0.

These are satisfied whenever ξ1 = ξ2. Thus, any vector of the form

    x1 = [ξ1
          ξ1],  ξ1 ≠ 0,

is an eigenvector of A corresponding to the eigenvalue λ1. For convenience, let us choose ξ1 = 1. Then

    x1 = [1
          1]

is an eigenvector. In a similar fashion we obtain an eigenvector x2 corresponding to λ2, given by

    x2 = [ 4
          −1].

The diagonal matrix A′ given in Eq. (4.6.9) is, in the present case,

    A′ = [2   0
          0  −3].

We can arrive at A′, using Eq. (4.6.10). Specifically, let

    P = [x1, x2] = [1   4
                    1  −1].

Then

    P⁻¹ = [0.2   0.8
           0.2  −0.2],

and

    P⁻¹AP = [2   0
             0  −3].

By Eq. (4.3.2), the basis {e1′, e2′} with respect to which A′ represents A is given by

    e1′ = p11 e1 + p21 e2 = e1 + e2,   e2′ = p12 e1 + p22 e2 = 4e1 − e2.

In view of Theorem 4.3.8, if x̄ is the coordinate representation of x with respect to {e1, e2}, then x̄′ = P⁻¹x̄ is the coordinate representation of x with respect to {e1′, e2′}. The vectors e1′, e2′ are, of course, eigenvectors of A corresponding to λ1 and λ2, respectively.
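The diagonalization of Example 4.6.13 can be checked by direct computation; the entries below assume the matrices as reconstructed in that example.

```python
import numpy as np

# Matrix and modal matrix of Example 4.6.13 (as reconstructed above)
A = np.array([[-2., 4.],
              [ 1., 1.]])
P = np.array([[1.,  4.],
              [1., -1.]])       # columns are eigenvectors for 2 and -3

Pinv = np.linalg.inv(P)
assert np.allclose(Pinv, [[0.2, 0.8], [0.2, -0.2]])
# Eq. (4.6.10): the modal matrix diagonalizes A
assert np.allclose(Pinv @ A @ P, np.diag([2., -3.]))
```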
When the algebraic multiplicity of one or more of the eigenvalues of a linear transformation is greater than one, then the linear transformation is said to have repeated eigenvalues. Unfortunately, in this case it is not always possible to represent the linear transformation by a diagonal matrix. To put it another way, if a square matrix has repeated eigenvalues, then it is not always possible to diagonalize it. However, from the preceding results of the present section it should be clear that a linear transformation with repeated eigenvalues can be represented by a diagonal matrix if the number of linearly independent eigenvectors corresponding to any eigenvalue is the same as the algebraic multiplicity of the eigenvalue. The following examples throw additional light on these comments.
4.6.14. Example. Consider a (3 × 3) matrix A whose characteristic equation is

    det(A − λI) = (1 − λ)²(2 − λ) = 0,

so that the eigenvalues of A are λ1 = 1 and λ2 = 2. The algebraic multiplicity of λ1 is two. Corresponding to λ1 we can find two linearly independent eigenvectors, x1 and x2, and corresponding to λ2 we have an eigenvector x3. Letting P = [x1, x2, x3] denote a modal matrix, we obtain

    Λ = P⁻¹AP = [1  0  0
                 0  1  0
                 0  0  2].

In this example, dim 𝔑_{λ1} = 2, which happens to be the same as the algebraic multiplicity of λ1. For this reason we were able to diagonalize the matrix A.

The next example shows that the multiplicity of an eigenvalue need not be the same as its algebraic multiplicity. In this case we are not able to diagonalize the matrix.
4.6.15. Example. The characteristic equation of the matrix

    A = [2  1  2
         0  2  1
         0  0  1]

is

    det(A − λI) = (1 − λ)(2 − λ)² = 0,

and the eigenvalues of A are λ1 = 1 and λ2 = 2. The algebraic multiplicity of λ2 is two. An eigenvector corresponding to λ1 is x1ᵀ = (1, 1, −1). An eigenvector corresponding to λ2 must be of the form

    x2 = [ξ1
          0
          0],  ξ1 ≠ 0.

Setting x2ᵀ = (1, 0, 0), we see that dim 𝔑_{λ2} = 1, and thus we have not been able to determine a basis for R³ consisting of eigenvectors. Consequently, we have not been able to diagonalize A.
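The multiplicity (i.e., the dimension of the eigenspace 𝔑_λ) can be computed as the dimension of the null space of (A − λI). The sketch below assumes the matrix of Example 4.6.15 as reconstructed above.

```python
import numpy as np

A = np.array([[2., 1., 2.],
              [0., 2., 1.],
              [0., 0., 1.]])    # matrix of Example 4.6.15 (as reconstructed)

def geometric_multiplicity(A, lam):
    """dim of the null space of (A - lam*I), i.e. dim of the eigenspace."""
    n = A.shape[0]
    return n - np.linalg.matrix_rank(A - lam * np.eye(n))

# lambda = 2 has algebraic multiplicity 2 but only a one-dimensional
# eigenspace, so A cannot be diagonalized.
assert geometric_multiplicity(A, 2.0) == 1
assert geometric_multiplicity(A, 1.0) == 1
```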
When a matrix cannot be diagonalized we seek, for practical reasons, to represent a linear transformation by a matrix which is as nearly diagonal as possible. Our next result provides the basis for representing linear transformations by such matrices, which we call block diagonal matrices. In the next section we will consider the "simplest" type of block diagonal matrix, called the Jordan canonical form.

4.6.16. Theorem. Let X be an n-dimensional vector space, and let A ∈ L(X, X). Let Y and Z be linear subspaces of X such that X = Y ⊕ Z and such that A is reduced by Y and Z. Then there exists a basis for X such that the matrix A of A with respect to this basis has the form

    A = [A1  0
         0   A2],

where dim Y = r, A1 is an (r × r) matrix, and A2 is an (n − r) × (n − r) matrix.

4.6.17. Exercise. Prove Theorem 4.6.16.

We can generalize the preceding result. Suppose that X is the direct sum of linear subspaces X1, ..., Xp that are invariant under A ∈ L(X, X). We can define linear transformations Ai ∈ L(Xi, Xi), i = 1, ..., p, by Ai x = Ax for x ∈ Xi. That is to say, Ai is the restriction of A to Xi. We now can find for each Ai a matrix representation Ai, which will lead us to the following result.

4.6.18. Theorem. Let X be a finite-dimensional vector space, and let A ∈ L(X, X). If X is the direct sum of p linear subspaces, X1, ..., Xp, which are invariant under A, then there exists a basis for X such that the matrix representation for A is in the block diagonal form given by

    A = [A1              0
             A2
                 ⋱
         0              Ap].

Moreover, Ai is a matrix representation of Ai, the restriction of A to Xi, i = 1, ..., p.

4.6.19. Exercise. Prove Theorem 4.6.18.

From the preceding it is clear that, in order to carry out the block diagonalization of a matrix A, we need to find an appropriate set of invariant subspaces of X and, furthermore, to find a simple matrix representation on each of these subspaces.

4.6.20. Example. Let X be an n-dimensional vector space. If A ∈ L(X, X) has n distinct eigenvalues, λ1, ..., λn, and if we let

    Xj = {x : (A − λj I)x = 0},  j = 1, ..., n,

then Xj is an invariant linear subspace under A and

    X = X1 ⊕ ... ⊕ Xn.

For any x ∈ Xj we have Ax = λj x, and hence Aj x = λj x for x ∈ Xj. A basis for Xj is any non-zero xj ∈ Xj. Thus, with respect to this basis, Aj is represented by the matrix λj (in this case, simply a scalar). With respect to a basis of n linearly independent eigenvectors, {x1, ..., xn}, A is represented by Eq. (4.6.6).
In addition to the diagonal form and the block diagonal form, there are many other useful forms for matrices to represent linear transformations on finite-dimensional vector spaces. One of these canonical forms involves triangular matrices, which we consider in the last result of the present section. We say that an (n × n) matrix is a triangular matrix if it either has the form

    [a11  a12  a13  ...  a1n
     0    a22  a23  ...  a2n
     ...
     0    0    0    ...  ann]    (4.6.21)

or the form

    [a11  0    ...  0
     a21  a22  ...  0
     ...
     an1  an2  ...  ann].    (4.6.22)

In the case of Eq. (4.6.21) we speak of an upper triangular matrix, whereas in the case of Eq. (4.6.22) we say the matrix is in lower triangular form.

4.6.23. Theorem. Let X be an n-dimensional vector space over C, and let A ∈ L(X, X). Then there exists a basis for X such that A is represented by an upper triangular matrix.
Proof. We will show that if A is a matrix of A, then A is similar to an upper triangular matrix A′. Our proof is by induction on n.

If n = 1, then the assertion is clearly true. Now assume that for n = k, and C any k × k matrix, there exists a non-singular matrix Q such that C′ = Q⁻¹CQ is an upper triangular matrix. We now must show the validity of the assertion for n = k + 1. Let X be a (k + 1)-dimensional vector space over C. Let λ1 be an eigenvalue of A, and let f1 be a corresponding eigenvector. Let f2, ..., f_{k+1} be any set of vectors in X such that {f1, ..., f_{k+1}} is a basis for X. Let B be the matrix of A with respect to the basis {f1, ..., f_{k+1}}. Since Af1 = λ1 f1, B must be of the form

    B = [λ1    b12      ...  b1,k+1
         0     b22      ...  b2,k+1
         ...
         0     b_{k+1,2} ... b_{k+1,k+1}].

Now let C be the k × k matrix

    C = [b22       ...  b2,k+1
         ...
         b_{k+1,2} ...  b_{k+1,k+1}].

By our induction hypothesis, there exists a non-singular matrix Q such that C′ = Q⁻¹CQ, where C′ is an upper triangular matrix. Now let

    P = [1  0 ... 0
         0
         ...    Q
         0].

By direct computation we have

    P⁻¹ = [1  0 ... 0
           0
           ...    Q⁻¹
           0]

and

    P⁻¹BP = [λ1  * ... *
             0
             ...    C′
             0],

where the *'s denote elements which may be non-zero. Letting A′ = P⁻¹BP, it follows that A′ is upper triangular and is similar to B. Hence, any (k + 1) × (k + 1) matrix which represents A ∈ L(X, X) is similar to the upper triangular matrix A′, by Theorem 4.3.19. This completes the proof of the theorem.

Note that if A is in the triangular form of either Eq. (4.6.21) or (4.6.22), then

    det(A − λI) = (a11 − λ)(a22 − λ) ... (ann − λ).

In this case the diagonal elements of A are the eigenvalues of A.
4.7. MINIMAL POLYNOMIALS, NILPOTENT OPERATORS, AND THE JORDAN CANONICAL FORM

In the present section we develop the Jordan canonical form of a matrix. To do so, we need to introduce the concepts of minimal polynomial and nilpotent operator and to study some of the properties of such polynomials and operators. Unless otherwise specified, X denotes an n-dimensional vector space over a field F throughout the present section.

A. Minimal Polynomials

For purposes of motivation, consider the matrix A of Example 4.6.14. The characteristic polynomial of A is

    p(λ) = (1 − λ)²(2 − λ),    (4.7.1)

and we know from the Cayley–Hamilton theorem that

    p(A) = 0.

Now let us consider the polynomial

    m(λ) = (1 − λ)(2 − λ) = 2 − 3λ + λ².

Then

    m(A) = 2I − 3A + A² = 0.    (4.7.2)

Thus, matrix A satisfies Eq. (4.7.2), which is of lower degree than Eq. (4.7.1), the characteristic equation of A.
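A quick numerical illustration of this phenomenon: for any diagonalizable matrix with eigenvalues 1, 1, 2, the quadratic (1 − λ)(2 − λ) already annihilates the matrix, while no linear polynomial does. The matrix below is chosen by this sketch (the original example's entries are not reproduced here).

```python
import numpy as np

# A diagonalizable matrix with eigenvalues 1, 1, 2 (illustrative choice).
A = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [0., 1., 2.]])

I = np.eye(3)
p_of_A = (I - A) @ (I - A) @ (2*I - A)   # characteristic polynomial at A
m_of_A = (I - A) @ (2*I - A)             # lower-degree candidate at A
assert np.allclose(p_of_A, 0)            # Cayley-Hamilton
assert np.allclose(m_of_A, 0)            # the quadratic also annihilates A
# No degree-1 polynomial works: A is not a scalar multiple of I.
assert not np.allclose(A - 1*I, 0) and not np.allclose(A - 2*I, 0)
```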
Before stating our first result, we recall that an nth-order polynomial in λ is said to be monic if the coefficient of λⁿ is unity (see Definition 2.3.4).

4.7.3. Theorem. Let A be an (n × n) matrix. Then there exists a unique polynomial m(λ) such that

(i) m(A) = 0;
(ii) m(λ) is monic; and
(iii) if m′(λ) is any other polynomial such that m′(A) = 0, then the degree of m(λ) is less than or equal to the degree of m′(λ) (i.e., m(λ) is of the lowest degree such that m(A) = 0).

Proof. We know that a polynomial, p(λ), exists such that p(A) = 0, namely, the characteristic polynomial. Furthermore, the degree of p(λ) is n. Thus, there exists a polynomial, say f(λ), of degree m ≤ n such that f(A) = 0. Let us choose m to be the lowest degree for which f(A) = 0. Since f(λ) is of degree m, we may divide f(λ) by the coefficient of λ^m, thus obtaining a monic polynomial, m(λ), such that m(A) = 0. To show that m(λ) is unique, suppose there is another monic polynomial m′(λ) of degree m such that m′(A) = 0. Then m(λ) − m′(λ) is a polynomial of degree less than m. Furthermore, m(A) − m′(A) = 0, which contradicts our assumption that m(λ) is the polynomial of lowest degree such that m(A) = 0. This completes the proof.

The preceding result gives rise to the notion of minimal polynomial.

4.7.4. Definition. The polynomial m(λ) defined in Theorem 4.7.3 is called the minimal polynomial of A.

Other names for minimal polynomial are minimum polynomial and reduced characteristic function. In the following we will develop an explicit form for the minimal polynomial of A, which makes it possible to determine it systematically, rather than by trial and error.

In the remainder of this section we let A denote an (n × n) matrix, we let p(λ) denote the characteristic polynomial of A, and we let m(λ) denote the minimal polynomial of A.

4.7.5. Theorem. Let f(λ) be any polynomial such that f(A) = 0. Then m(λ) divides f(λ).
Proof. Let ν denote the degree of m(λ). Then there exist polynomials q(λ) and r(λ) such that (see Theorem 2.3.9)

    f(λ) = q(λ)m(λ) + r(λ),

where deg r(λ) < ν or r(λ) = 0. Since f(A) = 0, we have

    0 = q(A)m(A) + r(A),

and hence r(A) = 0. This means r(λ) = 0, for otherwise we would have a contradiction to the fact that m(λ) is the minimal polynomial of A. Hence, f(λ) = q(λ)m(λ), and m(λ) divides f(λ).

4.7.6. Corollary. The minimal polynomial of A, m(λ), divides the characteristic polynomial of A, p(λ).

4.7.7. Exercise. Prove Corollary 4.7.6.
We now prove:

4.7.8. Theorem. The polynomial p(λ) divides [m(λ)]ⁿ.

Proof. We want to show that [m(λ)]ⁿ = p(λ)q(λ) for some polynomial q(λ). Let m(λ) be of degree ν and be given by

    m(λ) = λ^ν + β1 λ^{ν−1} + ... + β_ν.

Let us now define the matrices B0, B1, ..., B_{ν−1} as

    B0 = I,  B1 = A + β1 I,  B2 = A² + β1 A + β2 I,  ...,
    B_{ν−1} = A^{ν−1} + β1 A^{ν−2} + ... + β_{ν−1} I.

Then

    AB_{ν−1} = A^ν + β1 A^{ν−1} + ... + β_{ν−1} A = m(A) − β_ν I = −β_ν I,

and

    B0 = I,  B1 − AB0 = β1 I,  B2 − AB1 = β2 I,  ...,  B_{ν−1} − AB_{ν−2} = β_{ν−1} I.

Now let

    B(λ) = λ^{ν−1} B0 + λ^{ν−2} B1 + ... + B_{ν−1}.

Then

    (λI − A)B(λ) = λ^ν B0 + λ^{ν−1}(B1 − AB0) + ... + λ(B_{ν−1} − AB_{ν−2}) − AB_{ν−1}
                 = λ^ν I + β1 λ^{ν−1} I + ... + β_{ν−1} λ I + β_ν I = m(λ)I.

Taking the determinant of both sides of this equation, we have

    det(λI − A) det B(λ) = [m(λ)]ⁿ.

But det B(λ) is a polynomial in λ, say q(λ), and det(λI − A) = (−1)ⁿ det(A − λI) = (−1)ⁿ p(λ). Thus, we have proved that

    [m(λ)]ⁿ = (−1)ⁿ p(λ)q(λ),

and so p(λ) divides [m(λ)]ⁿ.
The next result establishes the form of the minimal polynomial.

4.7.9. Theorem. Let p(λ) be given by Eq. (4.5.24); i.e.,

    p(λ) = (λ1 − λ)^{m1} (λ2 − λ)^{m2} ... (λp − λ)^{mp},

where m1, ..., mp are the algebraic multiplicities of the distinct eigenvalues λ1, ..., λp of A, respectively. Then

    m(λ) = (λ − λ1)^{ν1} (λ − λ2)^{ν2} ... (λ − λp)^{νp},    (4.7.10)

where 1 ≤ νi ≤ mi for i = 1, ..., p.

4.7.11. Exercise. Prove Theorem 4.7.9. (Hint: Assume that m(λ) = (λ − μ1)^{ν1} ... (λ − μq)^{νq}, and use Corollary 4.7.6 and Theorem 4.7.8.)

The only unknowns left to determine in the minimal polynomial of A are ν1, ..., νp in Eq. (4.7.10). These can be determined in several ways.
Our next result is an immediate consequence of Theorem 4.3.27.

4.7.12. Theorem. Let A′ be similar to A, and let m′(λ) be the minimal polynomial of A′. Then m′(λ) = m(λ).

This result justifies the following definition.

4.7.13. Definition. Let A ∈ L(X, X). The minimal polynomial of A is the minimal polynomial of any matrix A which represents A.

In order to develop the Jordan canonical form (for linear transformations with repeated eigenvalues), we need to establish several additional preliminary results which are important in their own right.
4.7.14. Theorem. Let A ∈ L(X, X), and let f(λ) be any polynomial in λ. Let 𝔑 = {x ∈ X : f(A)x = 0}. Then 𝔑 is an invariant linear subspace of X under A.

Proof. The proof that 𝔑 is a linear subspace of X is straightforward and is left as an exercise. To show that 𝔑 is invariant under A, let x ∈ 𝔑, so that f(A)x = 0. We want to show that Ax ∈ 𝔑. Since A commutes with f(A), we have

    f(A)(Ax) = A[f(A)x] = A·0 = 0,

which completes the proof.

Before proceeding further, we establish some additional notation. Let λ1, ..., λp be distinct eigenvalues of A ∈ L(X, X). For j = 1, ..., p and for any positive integer q, let

    X_j^q = {x : (A − λj I)^q x = 0}.    (4.7.15)

Note that this notation is consistent with that used in Example 4.6.20 if we define X_j = X_j^1. Note also that, in view of Theorem 4.7.14, X_j^q is an invariant linear subspace of X under A.
We will need the following result concerning the restriction of a linear transformation.

4.7.16. Theorem. Let A ∈ L(X, X). Let X1 and X2 be linear subspaces of X such that X = X1 ⊕ X2, and let A1 be the restriction of A to X1. Let f(λ) be any polynomial in λ. If A is reduced by X1 and X2, then, for all x1 ∈ X1,

    f(A1)x1 = f(A)x1.

4.7.17. Exercise. Prove Theorem 4.7.16.
Next we prove:

4.7.18. Theorem. Let X be a vector space over C, and let A ∈ L(X, X). Let m(λ) be the minimal polynomial of A as given in Eq. (4.7.10). Let g(λ) = (λ − λ1)^{ν1}, let h(λ) = (λ − λ2)^{ν2} ... (λ − λp)^{νp} if p ≥ 2, and let h(λ) = 1 if p = 1. Let X1 = {x ∈ X : g(A)x = 0}, and let A1 be the restriction of A to X1; i.e., A1 x = Ax for all x ∈ X1. Let 𝔚 = {x ∈ X : h(A)x = 0}. Then

(i) X = X1 ⊕ 𝔚; and
(ii) (λ − λ1)^{ν1} is the minimal polynomial for A1.

Proof. By Theorem 4.7.14, 𝔚 and X1 are invariant linear subspaces under A. Since g(λ) and h(λ) are relatively prime, there exist polynomials q(λ) and r(λ) such that (see Exercise 2.3.15)

    q(λ)g(λ) + r(λ)h(λ) = 1.

Hence, for the linear transformation A we have

    q(A)g(A) + r(A)h(A) = I.

Thus, for x ∈ X, we have

    x = q(A)g(A)x + r(A)h(A)x.    (4.7.19)

Now since

    h(A)[q(A)g(A)x] = q(A)[g(A)h(A)]x = q(A)m(A)x = q(A)·0 = 0,

it follows that q(A)g(A)x ∈ 𝔚. We can similarly show that r(A)h(A)x ∈ X1. Thus, for every x ∈ X we have x = x1 + x2, where x1 ∈ X1 and x2 ∈ 𝔚.

Let us now show that this representation of x is unique. Let x = x1 + x2 = x1′ + x2′, where x1, x1′ ∈ X1 and x2, x2′ ∈ 𝔚. Then

    r(A)h(A)x = r(A)h(A)x1 = r(A)h(A)x1′.

Applying Eq. (4.7.19) to x1 and x1′, we get

    x1 = r(A)h(A)x1 and x1′ = r(A)h(A)x1′.

From this we conclude that x1 = x1′. Similarly, we can show that x2 = x2′. Therefore, X = X1 ⊕ 𝔚.

To prove the second part of the theorem, let A1 be the restriction of A to X1 and let A2 be the restriction of A to 𝔚. Let m1(λ) and m2(λ) be the minimal polynomials for A1 and A2, respectively. Since g(A1) = 0 and h(A2) = 0, it follows that m1(λ) divides g(λ) and m2(λ) divides h(λ), by Theorem 4.7.5. Hence, we can write

    m1(λ) = (λ − λ1)^{k1}

and

    m2(λ) = (λ − λ2)^{k2} ... (λ − λp)^{kp},

where 0 ≤ ki ≤ νi for i = 1, ..., p. Now let f(λ) = m1(λ)m2(λ). Then f(A) = m1(A)m2(A). Let x ∈ X, with x = x1 + x2, where x1 ∈ X1 and x2 ∈ 𝔚. Then

    f(A)x = m1(A)m2(A)x1 + m1(A)m2(A)x2 = m2(A)m1(A)x1 + 0 = 0.

Therefore, f(A) = 0. But this implies that m(λ) divides f(λ), and so νi ≤ ki, i = 1, ..., p. We thus conclude that ki = νi for i = 1, ..., p, which completes the proof of the theorem.
We are now in a position to prove the following important result, called the primary decomposition theorem.

4.7.20. Theorem. Let X be an n-dimensional vector space over C, let λ1, ..., λp be the distinct eigenvalues of A ∈ L(X, X), let the characteristic polynomial of A be

    p(λ) = (λ1 − λ)^{m1} ... (λp − λ)^{mp},    (4.7.21)

and let the minimal polynomial of A be

    m(λ) = (λ − λ1)^{ν1} ... (λ − λp)^{νp}.    (4.7.22)

Let

    Xi = {x ∈ X : (A − λi I)^{νi} x = 0},  i = 1, ..., p.

Then

(i) Xi, i = 1, ..., p, are invariant linear subspaces of X under A;
(ii) X = X1 ⊕ ... ⊕ Xp;
(iii) (λ − λi)^{νi} is the minimal polynomial of Ai, where Ai is the restriction of A to Xi; and
(iv) dim Xi = mi, i = 1, ..., p.

Proof. The proofs of parts (i), (ii), and (iii) follow from the preceding theorem by a simple induction argument and are left as an exercise.

To prove the last part of the theorem, we first show that the only eigenvalue of Ai ∈ L(Xi, Xi) is λi, i = 1, ..., p. Let v ∈ Xi, v ≠ 0, and consider (Ai − λI)v = 0, so that Ai v = λv and hence (Ai − λi I)v = (λ − λi)v. From part (iii) it follows that

    0 = (Ai − λi I)^{νi} v = (Ai − λi I)^{νi−1}(Ai − λi I)v
      = (λ − λi)(Ai − λi I)^{νi−1} v = ... = (λ − λi)^{νi} v.

From this we conclude that λ = λi.

We can now find a matrix representation of A in the form given in Theorem 4.6.18. Furthermore, from this theorem it follows that

    p(λ) = det(A − λI) = Π_{i=1}^{p} det(Ai − λI).

Now since the only eigenvalue of Ai is λi, the determinant of Ai − λI must be of the form

    det(Ai − λI) = (λi − λ)^{ℓi},

where ℓi = dim Xi. Since p(λ) is given by Eq. (4.7.21), we must have

    (λ1 − λ)^{m1} ... (λp − λ)^{mp} = (λ1 − λ)^{ℓ1} ... (λp − λ)^{ℓp},

from which we conclude that mi = ℓi. Thus, dim Xi = mi, i = 1, ..., p. This concludes the proof of the theorem.

4.7.23. Exercise. Prove parts (i)–(iii) of Theorem 4.7.20.
The preceding result shows that we can always represent A ∈ L(X, X) by a matrix in block diagonal form, where the number of diagonal blocks (in the matrix A of Theorem 4.6.18) is equal to the number of distinct eigenvalues of A. We will next find a convenient representation for each of the diagonal submatrices Aᵢ. It may turn out that one or more of the submatrices Aᵢ will be diagonal. Our next result tells us specifically when A ∈ L(X, X) is representable by a diagonal matrix.

4.7.24. Theorem. Let X be an n-dimensional vector space over C, and let A ∈ L(X, X). Let λ₁, …, λ_p, p ≤ n, be the distinct eigenvalues of A. Then there exists a basis for X such that the matrix A of A with respect to this basis is diagonal if and only if the minimal polynomial for A is of the form

m(λ) = (λ − λ₁)(λ − λ₂) ⋯ (λ − λ_p).

4.7.25. Exercise. Prove Theorem 4.7.24.

4.7.26. Exercise. Apply the above theorem to the matrices in Examples 4.6.14 and 4.6.15.
B. Nilpotent Operators

Let us now proceed to find a representation for each of the Aᵢ ∈ L(Xᵢ, Xᵢ) in Theorem 4.7.20 so that the block diagonal matrix representation of A ∈ L(X, X) (see Theorem 4.6.18) is as simple as possible. To accomplish this, we first need to define and examine so-called nilpotent operators.

4.7.27. Definition. Let N ∈ L(X, X). Then N is said to be nilpotent if there exists an integer q > 0 such that N^q = 0. A nilpotent operator is said to be of index q if N^q = 0 but N^(q−1) ≠ 0.
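The nilpotency index in the definition above can be checked numerically. A small sketch follows; the matrix N here is a made-up example, not one taken from the text (any strictly upper triangular matrix is nilpotent):

```python
import numpy as np

# Hypothetical 3x3 example: strictly upper triangular, hence nilpotent.
N = np.array([[0., 1., 2.],
              [0., 0., 3.],
              [0., 0., 0.]])

def nilpotent_index(N):
    """Return the smallest q with N^q = 0, or None if N is not nilpotent.

    The index of a nilpotent operator never exceeds the dimension of the
    space, so checking powers up to n suffices.
    """
    n = N.shape[0]
    P = np.eye(n)
    for q in range(1, n + 1):
        P = P @ N
        if not P.any():          # N^q = 0
            return q
    return None

print(nilpotent_index(N))  # 3: N^3 = 0 but N^2 != 0
```

The early-exit bound of n powers anticipates the lemma proved later in this section, that the index of a nilpotent operator cannot exceed the dimension of the space.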
Recall now that Theorem 4.7.20 enables us to write X = X₁ ⊕ X₂ ⊕ ⋯ ⊕ X_p. Furthermore, the linear transformation (Aᵢ − λᵢI) is nilpotent on Xᵢ. If we let Nᵢ = Aᵢ − λᵢI, then Aᵢ = λᵢI + Nᵢ. Now λᵢI is clearly represented by a diagonal matrix. However, the transformation Nᵢ forces the matrix representation of Aᵢ to be, in general, non-diagonal. So our next task is to seek a simple representation of the nilpotent operator Nᵢ.

In the next few results, which are concerned with properties of nilpotent operators, we drop for convenience the subscript i.

4.7.28. Theorem. Let N ∈ L(V, V), where V is an m-dimensional vector space. If N is a nilpotent linear transformation of index q and if x ∈ V is such that N^(q−1)x ≠ 0, then the vectors x, Nx, …, N^(q−1)x in V are linearly independent.
Proof. We first note that if N^(q−1)x ≠ 0, then N^j x ≠ 0 for j = 0, 1, …, q − 1. Our proof is now by contradiction. Suppose that

α₀x + α₁Nx + ⋯ + α_(q−1)N^(q−1)x = 0.

Let j be the smallest integer such that αⱼ ≠ 0. Then we can write

αⱼN^j x = − Σ_(i=j+1)^(q−1) αᵢN^i x ≠ 0.

Thus,

N^j x = − Σ_(i=j+1)^(q−1) (αᵢ/αⱼ) N^i x = N^(j+1) y,

where y is defined in an obvious way. Now we can write

N^(q−1)x = N^(q−1−j)N^j x = N^(q−1−j)N^(j+1)y = N^q y = 0.

We thus have arrived at a contradiction, which proves our result.
Next, let us examine the matrix representation of nilpotent transformations.

4.7.29. Theorem. Let V be a q-dimensional vector space, and let N ∈ L(V, V) be nilpotent of index q. Let x₀ ∈ V be such that N^(q−1)x₀ ≠ 0. Then the matrix N of N with respect to the basis {N^(q−1)x₀, N^(q−2)x₀, …, Nx₀, x₀} in V is given by

N = [ 0 1 0 0 ⋯ 0 0 ]
    [ 0 0 1 0 ⋯ 0 0 ]
    [ ⋮             ⋮ ]    (4.7.30)
    [ 0 0 0 0 ⋯ 0 1 ]
    [ 0 0 0 0 ⋯ 0 0 ]

Proof. By the previous theorem we know that {N^(q−1)x₀, …, x₀} is a linearly independent set. By hypothesis, there are q vectors in this set, and thus {N^(q−1)x₀, …, x₀} forms a basis for V. Let eᵢ = N^(q−i)x₀ for i = 1, …, q. Then

Neᵢ = 0 if i = 1,  and  Neᵢ = e_(i−1) if i = 2, …, q.

Hence,

Ne₁ = 0·e₁ + 0·e₂ + ⋯ + 0·e_(q−1) + 0·e_q,
Ne₂ = 1·e₁ + 0·e₂ + ⋯ + 0·e_(q−1) + 0·e_q,
⋮
Ne_q = 0·e₁ + 0·e₂ + ⋯ + 1·e_(q−1) + 0·e_q.

From Eq. (4.2.2) and Definition 4.2.7, it follows that the representation of N is that given by Eq. (4.7.30). This completes the proof of the theorem.
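The change of basis in the theorem above can be carried out numerically. The sketch below uses a hypothetical index-3 nilpotent matrix (not one from the text); the string basis {N²x₀, Nx₀, x₀} is assembled into a matrix P, and the similarity transform P⁻¹NP produces the shift matrix of Eq. (4.7.30):

```python
import numpy as np

# Hypothetical nilpotent N of index 3 on a 3-dimensional space.
N = np.array([[0., 0., 0.],
              [1., 0., 0.],
              [4., 2., 0.]])
q = 3
x0 = np.array([1., 0., 0.])   # chosen so that N^2 x0 != 0

# Columns of P are the string basis: N^{q-1} x0, ..., N x0, x0.
basis = [np.linalg.matrix_power(N, q - 1 - i) @ x0 for i in range(q)]
P = np.column_stack(basis)

# Matrix of N with respect to the string basis: ones on the superdiagonal.
Nhat = np.linalg.inv(P) @ N @ P
print(np.round(Nhat))
```

Each column of P is the image of the previous one under N, which is exactly why the representing matrix collapses to the superdiagonal shift form.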
The above theorem establishes the matrix representation of a nilpotent linear transformation of index q on a q-dimensional vector space. We will next determine the representation of a nilpotent operator of index ν on a vector space of dimension m, where ν ≤ m. The following lemma shows that we can dismiss the case ν > m.

4.7.31. Lemma. Let N ∈ L(V, V) be nilpotent of index ν, where dim V = m. Then ν ≤ m.

Proof. Assume x ∈ V, N^ν x = 0, N^(ν−1)x ≠ 0, and ν > m. Then, by Theorem 4.7.28, the vectors x, Nx, …, N^(ν−1)x are linearly independent, which contradicts the fact that dim V = m.
To prove the next theorem, we require the following result.

4.7.32. Lemma. Let V be an m-dimensional vector space, let N ∈ L(V, V), let ν be any positive integer, and let

W₁ = {x : Nx = 0},   dim W₁ = l₁,
W₂ = {x : N²x = 0},  dim W₂ = l₂,
⋮
W_ν = {x : N^ν x = 0},  dim W_ν = l_ν.

Also, for any i such that 1 < i ≤ ν, let {e₁, …, e_m} be a basis for V such that {e₁, …, e_lⱼ} is a basis for Wⱼ, j = 1, …, i. Then

(i) W₁ ⊂ W₂ ⊂ ⋯ ⊂ W_ν; and
(ii) {e₁, …, e_(l_(i−2)), Ne_(l_(i−1)+1), …, Ne_(lᵢ)} is a linearly independent set of vectors in W_(i−1).

Proof. To prove the first part, let x ∈ Wᵢ for any i < ν. Then N^i x = 0. Hence, N^(i+1)x = 0, which implies x ∈ W_(i+1).

To prove the second part, let r = l_(i−2) and let t = lᵢ − l_(i−1). We note that if x ∈ Wᵢ, then N^(i−1)(Nx) = 0, and so Nx ∈ W_(i−1). This implies that Neⱼ ∈ W_(i−1) for j = l_(i−1) + 1, …, lᵢ. This means that the set of vectors e₁, …, e_r, Ne_(l_(i−1)+1), …, Ne_(lᵢ) is in W_(i−1). We show that this set is linearly independent by contradiction. Assume there are scalars α₁, …, α_r and β₁, …, β_t, not all zero, such that

α₁e₁ + ⋯ + α_r e_r + β₁Ne_(l_(i−1)+1) + ⋯ + β_t Ne_(l_(i−1)+t) = 0.

Since {e₁, …, e_r} is a linearly independent set, at least one of the βᵢ must be nonzero. Rearranging the last equation we have

β₁Ne_(l_(i−1)+1) + ⋯ + β_t Ne_(l_(i−1)+t) = −(α₁e₁ + ⋯ + α_r e_r).

Hence, since e₁, …, e_r ∈ W_(i−2),

N^(i−2)[β₁Ne_(l_(i−1)+1) + ⋯ + β_t Ne_(l_(i−1)+t)] = −N^(i−2)(α₁e₁ + ⋯ + α_r e_r) = 0.

Thus,

N^(i−1)(β₁e_(l_(i−1)+1) + ⋯ + β_t e_(l_(i−1)+t)) = 0,

and (β₁e_(l_(i−1)+1) + ⋯ + β_t e_(l_(i−1)+t)) ∈ W_(i−1). If β₁e_(l_(i−1)+1) + ⋯ + β_t e_(l_(i−1)+t) ≠ 0, it can be written as a linear combination of e₁, …, e_(l_(i−1)), which contradicts the fact that {e₁, …, e_lᵢ} is a linearly independent set. If β₁e_(l_(i−1)+1) + ⋯ + β_t e_(l_(i−1)+t) = 0, we contradict the fact that {e_(l_(i−1)+1), …, e_lᵢ} is a linearly independent set. Hence, we conclude that αᵢ = 0 for i = 1, …, r and βᵢ = 0 for i = 1, …, t. This completes the proof of the lemma.
We are now in a position to consider the general representation of a nilpotent operator on a finite-dimensional vector space.

4.7.33. Theorem. Let V be an m-dimensional vector space over C, and let N ∈ L(V, V) be nilpotent of index ν. Let W₁ = {x : Nx = 0}, …, W_ν = {x : N^ν x = 0}, and let lᵢ = dim Wᵢ, i = 1, …, ν. Then there exists a basis for V such that the matrix N of N is of block diagonal form,

N = [ N₁        0  ]
    [     ⋱        ]    (4.7.34)
    [ 0        N_r ]

where

Nᵢ = [ 0 1 0 ⋯ 0 0 ]
     [ 0 0 1 ⋯ 0 0 ]
     [ ⋮           ⋮ ]    (4.7.35)
     [ 0 0 0 ⋯ 0 1 ]
     [ 0 0 0 ⋯ 0 0 ]

for i = 1, …, r, where r = l₁, Nᵢ is a (kᵢ × kᵢ) matrix, 1 ≤ kᵢ ≤ ν, and kᵢ is determined in the following way: there are

l_ν − l_(ν−1)  (ν × ν) matrices,
2lᵢ − l_(i−1) − l_(i+1)  (i × i) matrices, i = 2, …, ν − 1, and
2l₁ − l₂  (1 × 1) matrices.

The basis for V consists of strings of vectors of the form N^(kᵢ−1)xᵢ, …, Nxᵢ, xᵢ, i = 1, …, r.

Proof. By Lemma 4.7.32, W₁ ⊂ W₂ ⊂ ⋯ ⊂ W_ν. Let {e₁, …, e_m} be a basis for V such that {e₁, …, e_lᵢ} is a basis for Wᵢ, i = 1, …, ν. We see that W_ν = V. Since N is nilpotent of index ν, W_(ν−1) ≠ W_ν and l_ν > l_(ν−1).

We now proceed to select a new basis for V which yields the desired result. We find it convenient to use double subscripting of vectors. Let f_(1,ν) = e_(l_(ν−1)+1), …, f_((l_ν−l_(ν−1)),ν) = e_(l_ν), and let f_(1,ν−1) = Nf_(1,ν), …, f_((l_ν−l_(ν−1)),ν−1) = Nf_((l_ν−l_(ν−1)),ν). By Lemma 4.7.32, it follows that {e₁, …, e_(l_(ν−2)), f_(1,ν−1), …, f_((l_ν−l_(ν−1)),ν−1)} is a linearly independent subset of W_(ν−1), which may or may not be a basis for W_(ν−1). If it is not, we adjoin additional elements from W_(ν−1), denoted by f_((l_ν−l_(ν−1)+1),ν−1), …, f_((l_(ν−1)−l_(ν−2)),ν−1), so as to form a basis for W_(ν−1). Now let f_(1,ν−2) = Nf_(1,ν−1), …, f_((l_(ν−1)−l_(ν−2)),ν−2) = Nf_((l_(ν−1)−l_(ν−2)),ν−1). By Lemma 4.7.32 it follows, as before, that {e₁, …, e_(l_(ν−3)), f_(1,ν−2), …, f_((l_(ν−1)−l_(ν−2)),ν−2)} is a linearly independent set in W_(ν−2). If this set is not a basis, we adjoin vectors from W_(ν−2) so that we do have a basis. We denote the vectors that we adjoin by f_((l_(ν−1)−l_(ν−2)+1),ν−2), …, f_((l_(ν−2)−l_(ν−3)),ν−2). We continue in this manner until we have formed a basis for V. We express this basis in the manner indicated in Figure C.

f_(1,ν),   …, f_((l_ν−l_(ν−1)),ν)
f_(1,ν−1), …, f_((l_ν−l_(ν−1)),ν−1), …, f_((l_(ν−1)−l_(ν−2)),ν−1)
⋮
f_(1,1),   …, …, f_(l₁,1)

4.7.36. Figure C. Basis for V.

The desired result follows now, for we have

Nf_(i,j) = f_(i,j−1),  j > 1,
Nf_(i,1) = 0.

Hence, if we let x₁ = f_(1,ν), we see that the first column in Figure C, reading from bottom to top, is N^(ν−1)x₁, …, Nx₁, x₁.

We see that each column of Figure C determines a string consisting of kᵢ entries, where kᵢ = ν for i = 1, …, (l_ν − l_(ν−1)). Note that (l_ν − l_(ν−1)) > 0, so there is at least one string. In general, the number of strings with j entries is (lⱼ − l_(j−1)) − (l_(j+1) − lⱼ) = 2lⱼ − l_(j−1) − l_(j+1) for j = 2, …, ν − 1. Also, there are l₁ − (l₂ − l₁) = 2l₁ − l₂ vectors, or strings with one entry. Finally, to show that the number of blocks, Nᵢ, in N is l₁, we see that there are a total of (l_ν − l_(ν−1)) + (2l_(ν−1) − l_ν − l_(ν−2)) + ⋯ + (2l₂ − l₃ − l₁) + (2l₁ − l₂) = l₁ columns in the table of Figure C.

This completes the proof of the theorem.

The reader should study Figure C to obtain an appreciation of the structure of the basis for the space V.
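The block count of the theorem above is easy to verify numerically: from the kernel dimensions lᵢ = dim Wᵢ = dim ker(Nⁱ), the number of (j × j) blocks is 2lⱼ − l_(j−1) − l_(j+1), with the conventions l₀ = 0 and lⱼ = l_ν for j ≥ ν. The sketch below builds a hypothetical N (not from the text) out of one 3-string, one 2-string, and one 1-string, and recovers those sizes from the lᵢ alone:

```python
import numpy as np

def shift(k):
    """k x k block with ones on the superdiagonal (the form of Eq. 4.7.35)."""
    return np.eye(k, k, 1)

# Hypothetical N on a 6-dimensional space: blocks of sizes 3, 2, 1.
N = np.zeros((6, 6))
N[0:3, 0:3] = shift(3)
N[3:5, 3:5] = shift(2)     # the remaining 1x1 block is zero

def kernel_dims(N):
    """Return [l_0, l_1, ..., l_nu] with l_i = dim ker(N^i)."""
    n = N.shape[0]
    dims, P = [0], np.eye(n)
    for _ in range(n):         # the index of N never exceeds n
        P = P @ N
        dims.append(n - int(np.linalg.matrix_rank(P)))
        if dims[-1] == n:      # N^i = 0 reached
            break
    return dims

l = kernel_dims(N)             # [0, 3, 5, 6]
nu = len(l) - 1
counts = {j: 2 * l[j] - l[j - 1] - (l[j + 1] if j < nu else l[nu])
          for j in range(1, nu + 1)}
print(counts)                  # {1: 1, 2: 1, 3: 1}
```

Note that Σⱼ j · counts[j] recovers the dimension of the space and Σⱼ counts[j] recovers l₁, the total number of blocks.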
C. The Jordan Canonical Form

We are finally now in a position to state and prove the result which establishes the Jordan canonical form of matrices.

4.7.37. Theorem. Let X be an n-dimensional vector space over C, and let A ∈ L(X, X). Let the characteristic polynomial of A be

p(λ) = (λ₁ − λ)^m₁ ⋯ (λ_p − λ)^m_p,

and let the minimal polynomial of A be

m(λ) = (λ − λ₁)^ν₁ ⋯ (λ − λ_p)^ν_p,

where λ₁, …, λ_p are the distinct eigenvalues of A. Let

Xᵢ = {x ∈ X : (A − λᵢI)^νᵢ x = 0}.

Then

(i) X₁, …, X_p are invariant subspaces of X under A;
(ii) X = X₁ ⊕ ⋯ ⊕ X_p;
(iii) dim Xᵢ = mᵢ, i = 1, …, p; and
(iv) there exists a basis for X such that the matrix A of A with respect to this basis is of the form

A = [ A₁ 0  ⋯ 0  ]
    [ 0  A₂ ⋯ 0  ]    (4.7.38)
    [ ⋮          ⋮ ]
    [ 0  0  ⋯ A_p ]

where Aᵢ is an (mᵢ × mᵢ) matrix of the form

Aᵢ = λᵢIᵢ + Nᵢ,    (4.7.39)

and where Nᵢ is the matrix of the nilpotent operator (Aᵢ − λᵢIᵢ) of index νᵢ on Xᵢ, given by Eq. (4.7.34) and Eq. (4.7.35).

Proof. Parts (i)–(iii) are restatements of the primary decomposition theorem (Theorem 4.7.20). From this theorem we also know that (λ − λᵢ)^νᵢ is the minimal polynomial of Aᵢ, the restriction of A to Xᵢ. Hence, if we let Nᵢ = Aᵢ − λᵢI, then Nᵢ is a nilpotent operator of index νᵢ on Xᵢ. We are thus able to represent Nᵢ as shown in Eq. (4.7.35). This completes the proof of the theorem.

A little extra work shows that the representation of A ∈ L(X, X) by a matrix A of the form given in Eqs. (4.7.38) and (4.7.39) is unique, except for the order in which the block diagonals A₁, …, A_p appear in A.

4.7.40. Definition. The matrix A of A ∈ L(X, X) given by Eqs. (4.7.38) and (4.7.39) is called the Jordan canonical form of A.
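For hand-sized matrices, the Jordan canonical form can be computed exactly with a computer algebra system. The sketch below uses SymPy's `Matrix.jordan_form`, which returns matrices P and J with A = P J P⁻¹; the 3 × 3 matrix here is a made-up example, not one worked in the text (it has the single eigenvalue 2 with one 2 × 2 block and one 1 × 1 block):

```python
from sympy import Matrix

# Hypothetical example: eigenvalue 2 of algebraic multiplicity 3,
# geometric multiplicity 2, so the Jordan blocks have sizes 2 and 1.
A = Matrix([[ 3, 1, 0],
            [-1, 1, 0],
            [ 0, 0, 2]])
P, J = A.jordan_form()   # A = P * J * P**(-1)
print(J)                 # diagonal entries all 2; exactly one superdiagonal 1
```

Because the geometric multiplicity of the eigenvalue 2 is 2, there are two blocks, hence exactly one off-diagonal 1 in J, regardless of the order in which SymPy lists the blocks.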
We conclude the present section with an example.

4.7.41. Example. Let X = R⁷, and let {u₁, …, u₇} be the natural basis for X (see Example 4.1.15). Let A ∈ L(X, X) be represented by the matrix

A = [ −1 0 1 1 1 3 0 ]
    [  0 1 0 0 0 0 0 ]
    [  2 1 2 1 1 6 0 ]
    [  2 0 1 2 1 3 0 ]
    [  0 0 0 0 1 0 0 ]
    [  0 0 0 0 0 1 0 ]
    [  1 1 0 1 2 1 1 ]

with respect to {u₁, …, u₇}. Let us find the matrix Â which represents A in the Jordan canonical form.

We first find that the characteristic polynomial of A is

p(λ) = (1 − λ)⁷.

This implies that λ₁ = 1 is the only distinct eigenvalue of A. Its algebraic multiplicity is m₁ = 7. In order to find the minimal polynomial of A, let

N = A − λ₁I,

where I is the identity operator in L(X, X). The representation for N with respect to the natural basis in X is

N = A − I = [ −2 0 1 1 1 3 0 ]
            [  0 0 0 0 0 0 0 ]
            [  2 1 1 1 1 6 0 ]
            [  2 0 1 1 1 3 0 ]
            [  0 0 0 0 0 0 0 ]
            [  0 0 0 0 0 0 0 ]
            [  1 1 0 1 2 1 0 ]
We assume the minimal polynomial is of the form m(λ) = (λ − 1)^ν₁ and proceed to find the smallest ν₁ such that (A − I)^ν₁ = N^ν₁ = 0. We first obtain

N² = [ 0 1 0 0 0 3 0 ]
     [ 0 0 0 0 0 0 0 ]
     [ 0 1 0 0 0 3 0 ]
     [ 0 1 0 0 0 3 0 ]
     [ 0 0 0 0 0 0 0 ]
     [ 0 0 0 0 0 0 0 ]
     [ 0 0 0 0 0 0 0 ]

Next, we get that

N³ = 0,

and so ν₁ = 3. Hence, N is a nilpotent operator of index 3. We will now apply Theorem 4.7.33 to obtain a representation for N in this space.
Using the notation of Theorem 4.7.33, we let W₁ = {x : Nx = 0}, W₂ = {x : N²x = 0}, and W₃ = {x : N³x = 0}. We see that N has three linearly independent rows. This means that the rank of N is 3, and so dim W₁ = l₁ = 4. Similarly, the rank of N² is 1, and so dim W₂ = l₂ = 6. Clearly, dim W₃ = l₃ = 7. We can conclude that N will have a representation N̂ of the form in Eq. (4.7.34) with r = 4. Each of the Nᵢ will be of the form in Eq. (4.7.35). There will be l₃ − l₂ = 1 (3 × 3) matrix. There will be 2l₂ − l₃ − l₁ = 1 (2 × 2) matrix, and 2l₁ − l₂ = 2 (1 × 1) matrices. Hence, there is a basis for X such that N may be represented by the matrix

N̂ = [ 0 1 0 | 0 0 | 0 | 0 ]
     [ 0 0 1 | 0 0 | 0 | 0 ]
     [ 0 0 0 | 0 0 | 0 | 0 ]
     [ 0 0 0 | 0 1 | 0 | 0 ]
     [ 0 0 0 | 0 0 | 0 | 0 ]
     [ 0 0 0 | 0 0 | 0 | 0 ]
     [ 0 0 0 | 0 0 | 0 | 0 ]
The corresponding basis will consist of strings of vectors of the form

N²x₁, Nx₁, x₁;  Nx₂, x₂;  x₃;  x₄.

We will represent the vectors x₁, x₂, x₃, and x₄ by x₁, x₂, x₃, and x₄, their coordinate representations, respectively, with respect to the natural basis {u₁, …, u₇} in X. We begin by choosing x₁ ∈ W₃ such that x₁ ∉ W₂; i.e., we find an x₁ such that N³x₁ = 0 but N²x₁ ≠ 0. The vector x₁ᵀ = (0, 1, 0, 0, 0, 0, 0) will do. We see that (Nx₁)ᵀ = (0, 0, 1, 0, 0, 0, 1) and (N²x₁)ᵀ = (1, 0, 1, 1, 0, 0, 0). Hence, Nx₁ ∈ W₂ but Nx₁ ∉ W₁, and N²x₁ ∈ W₁. We see there will be only one string of length three, and so we next choose x₂ ∈ W₂ such that x₂ ∉ W₁. Also, the pair Nx₁, x₂ must be linearly independent. The vector x₂ᵀ = (1, 0, 0, 0, 0, 0, 0) will do. Now (Nx₂)ᵀ = (2, 0, 2, 2, 0, 0, 1), and Nx₂ ∈ W₁. We complete the basis for X by selecting two more vectors, x₃, x₄ ∈ W₁, such that N²x₁, Nx₂, x₃, x₄ are linearly independent. The vectors x₃ᵀ = (0, 0, 1, 2, 1, 0, 0) and x₄ᵀ = (1, 3, 1, 0, 0, 1, 0) will suffice.

It follows that the matrix

P = [N²x₁, Nx₁, x₁, Nx₂, x₂, x₃, x₄]

is the matrix of the new basis with respect to the natural basis (see Exercise 4.3.9).
The reader can readily show that

N̂ = P⁻¹NP,

where

P = [ 1 0 0 2 1 0 1 ]
    [ 0 0 1 0 0 0 3 ]
    [ 1 1 0 2 0 1 1 ]
    [ 1 0 0 2 0 2 0 ]
    [ 0 0 0 0 0 1 0 ]
    [ 0 0 0 0 0 0 1 ]
    [ 0 1 0 1 0 0 0 ]

and P⁻¹ is the inverse of this matrix.
Finally, the Jordan canonical form for A is given by

Â = N̂ + Î.

(Recall that the matrix representation for I is the same for any basis in X.) Thus,

Â = [ 1 1 0 | 0 0 | 0 | 0 ]
    [ 0 1 1 | 0 0 | 0 | 0 ]
    [ 0 0 1 | 0 0 | 0 | 0 ]
    [ 0 0 0 | 1 1 | 0 | 0 ]
    [ 0 0 0 | 0 1 | 0 | 0 ]
    [ 0 0 0 | 0 0 | 1 | 0 ]
    [ 0 0 0 | 0 0 | 0 | 1 ]

Again, the reader can show that Â = P⁻¹AP. In general, it is more convenient as a check to show that PÂ = AP.
4.7.42. Exercise. Let X = R⁶, and let {u₁, …, u₆} denote the natural basis for X. Let A ∈ L(X, X) be represented by the matrix

A = [  5 1 1 1  0  0 ]
    [ −1 3 1 1  0  0 ]
    [  0 0 4 0  1  1 ]
    [  0 0 0 4 −1 −1 ]
    [  0 0 0 0  3  1 ]
    [  0 0 0 0  1  3 ]

Show that the Jordan canonical form of A is given by

Â = [ 4 1 0 | 0 0 | 0 ]
    [ 0 4 1 | 0 0 | 0 ]
    [ 0 0 4 | 0 0 | 0 ]
    [ 0 0 0 | 4 1 | 0 ]
    [ 0 0 0 | 0 4 | 0 ]
    [ 0 0 0 | 0 0 | 2 ]

and find a basis for X for which Â represents A.
4.8. BILINEAR FUNCTIONALS AND CONGRUENCE

In the present section we consider the representation and some of the properties of bilinear functionals on real finite-dimensional vector spaces. (We will consider bilinear functionals defined on complex vector spaces in Chapter 6.)

Throughout this section, X is assumed to be an n-dimensional vector space over the field of real numbers. We recall that if f is a bilinear functional on a real vector space X, then f: X × X → R and

f(αx₁ + βx₂, y) = αf(x₁, y) + βf(x₂, y)

and

f(x, αy₁ + βy₂) = αf(x, y₁) + βf(x, y₂)

for all α, β ∈ R and for all x, x₁, x₂, y, y₁, y₂ ∈ X. As a consequence of these properties we have, more generally,

f(Σⱼ αⱼxⱼ, Σₖ βₖyₖ) = Σⱼ Σₖ αⱼβₖ f(xⱼ, yₖ)

for all αⱼ, βₖ ∈ R and xⱼ, yₖ ∈ X, j = 1, …, r and k = 1, …, s.

4.8.1. Definition. Let {e₁, …, eₙ} be a basis for the vector space X, and let

fᵢⱼ = f(eᵢ, eⱼ), i, j = 1, …, n.

The matrix F = [fᵢⱼ] is called the matrix of the bilinear functional f with respect to {e₁, …, eₙ}.

Our first result provides us with the representation of bilinear functionals on finite-dimensional vector spaces.

4.8.2. Theorem. Let f be a bilinear functional on a vector space X, and let {e₁, …, eₙ} be a basis for X. Let F be the matrix of the bilinear functional f with respect to the basis {e₁, …, eₙ}. If x and y are arbitrary vectors in X and if x and y are their coordinate representations with respect to the basis {e₁, e₂, …, eₙ}, then

f(x, y) = xᵀFy = Σᵢ Σⱼ fᵢⱼ ξᵢηⱼ.    (4.8.3)

Proof. We have xᵀ = (ξ₁, …, ξₙ) and yᵀ = (η₁, …, ηₙ). Also, x = ξ₁e₁ + ⋯ + ξₙeₙ and y = η₁e₁ + ⋯ + ηₙeₙ. Therefore,

f(x, y) = Σᵢ Σⱼ ξᵢηⱼ f(eᵢ, eⱼ) = Σᵢ Σⱼ fᵢⱼ ξᵢηⱼ = xᵀFy,

which was to be shown.

Conversely, if we are given any (n × n) matrix F, we can use formula (4.8.3) to define the bilinear functional f whose matrix with respect to the given basis {e₁, …, eₙ} is, in turn, F again. In general, it therefore follows that on finite-dimensional vector spaces, bilinear functionals correspond in a one-to-one fashion to matrices. The particular one-to-one correspondence depends on the particular basis chosen.
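The correspondence between a bilinear functional and its matrix can be checked directly: form F entrywise from fᵢⱼ = f(eᵢ, eⱼ) and compare f(x, y) with xᵀFy. The sketch below uses a hypothetical bilinear functional on R² with the natural basis:

```python
import numpy as np

# A made-up bilinear functional on R^2 (not one from the text).
f = lambda x, y: x[0]*y[0] + 2*x[0]*y[1] - x[1]*y[0]

# Its matrix with respect to the natural basis: f_ij = f(e_i, e_j).
e = np.eye(2)
F = np.array([[f(e[i], e[j]) for j in range(2)] for i in range(2)])

# The representation theorem: f(x, y) = x^T F y for any coordinates x, y.
x, y = np.array([1., 2.]), np.array([3., -1.])
assert np.isclose(f(x, y), x @ F @ y)
print(F)   # [[ 1.  2.] [-1.  0.]]
```

Changing the basis changes F (as the congruence result later in this section makes precise), so the matrix always carries an implicit choice of basis with it.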
Now recall that if X is a real vector space, then f is said to be symmetric if f(x, y) = f(y, x) for all x, y ∈ X. We also have the following related concept.

4.8.4. Definition. A bilinear functional f on a vector space X is said to be skew symmetric if

f(x, y) = −f(y, x)    (4.8.5)

for all x, y ∈ X.

For symmetric and skew-symmetric bilinear functionals we have the following result.

4.8.6. Theorem. Let {e₁, …, eₙ} be a basis for X, and let F be the matrix for a bilinear functional f with respect to {e₁, …, eₙ}. Then

(i) f is symmetric if and only if F = Fᵀ;
(ii) f is skew symmetric if and only if F = −Fᵀ; and
(iii) for every bilinear functional f, there exists a unique symmetric bilinear functional f₁ and a unique skew-symmetric bilinear functional f₂ such that

f = f₁ + f₂.

We call f₁ the symmetric part of f and f₂ the skew-symmetric part of f.

4.8.7. Exercise. Prove Theorem 4.8.6.

The preceding result motivates the following definitions.

4.8.8. Definition. An (n × n) matrix F is said to be

(i) symmetric if F = Fᵀ; and
(ii) skew symmetric if F = −Fᵀ.

The next result is easily verified.

4.8.9. Theorem. Let f be a bilinear functional on X, and let f₁ and f₂ be the symmetric and skew-symmetric parts of f, respectively. Then

f₁(x, y) = ½[f(x, y) + f(y, x)]

and

f₂(x, y) = ½[f(x, y) − f(y, x)]

for all x, y ∈ X.

4.8.10. Exercise. Prove Theorem 4.8.9.
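In matrix terms, the decomposition of the preceding theorem reads F = F₁ + F₂ with F₁ = (F + Fᵀ)/2 symmetric and F₂ = (F − Fᵀ)/2 skew symmetric. A minimal numerical sketch (the matrix F is a made-up example):

```python
import numpy as np

F = np.array([[1., 4.],
              [0., 3.]])        # matrix of some bilinear functional f

F1 = (F + F.T) / 2              # symmetric part
F2 = (F - F.T) / 2              # skew-symmetric part

assert np.allclose(F1, F1.T) and np.allclose(F2, -F2.T)
assert np.allclose(F1 + F2, F)
print(F1)
print(F2)
```

Uniqueness is visible here as well: any symmetric/skew-symmetric pair summing to F must satisfy these two formulas, since adding F to its transpose kills the skew part and subtracting kills the symmetric part.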
Now let us recall that the quadratic form induced by f was defined as f̂(x) = f(x, x). On a real finite-dimensional vector space we now have, in view of Theorem 4.8.2, f̂(x) = xᵀFx. For quadratic forms we have the following result.

4.8.11. Theorem. Let f and g be bilinear functionals on X. The quadratic forms induced by f and g are equal if and only if f and g have the same symmetric part. In other words, f(x, x) = g(x, x) for all x ∈ X if and only if

f(x, y) + f(y, x) = g(x, y) + g(y, x)

for all x, y ∈ X.

Proof. We note that

f(x + y, x + y) = f(x, x) + f(x, y) + f(y, x) + f(y, y).

From this it follows that

f(x, y) + f(y, x) = f(x + y, x + y) − f(x, x) − f(y, y).

Now if g(x, x) = f(x, x) for all x ∈ X, then

f(x, y) + f(y, x) = g(x + y, x + y) − g(x, x) − g(y, y) = g(x, y) + g(y, x),

so that

f(x, y) + f(y, x) = g(x, y) + g(y, x).    (4.8.12)

Conversely, assume that Eq. (4.8.12) holds for all x, y ∈ X. Then, in particular, if we let x = y, we have f(x, x) = g(x, x) for all x ∈ X. This concludes our proof.

From Theorem 4.8.11 the following useful result follows: when treating quadratic functionals, it suffices to work with symmetric bilinear functionals. We leave the proof of the next result as an exercise.

4.8.13. Theorem. A bilinear functional f on a vector space X is skew symmetric if and only if f(x, x) = 0 for all x ∈ X.

4.8.14. Exercise. Prove Theorem 4.8.13.
The next result enables us to introduce the concept of congruence.

4.8.15. Theorem. Let f be a bilinear functional on a vector space X, let {e₁, …, eₙ} be a basis for X, and let F be the matrix of f with respect to this basis. Let {e′₁, …, e′ₙ} be another basis whose matrix with respect to {e₁, …, eₙ} is P. Then the matrix F′ of f with respect to the basis {e′₁, …, e′ₙ} is given by

F′ = PᵀFP.    (4.8.16)

Proof. Let F′ = [f′ᵢⱼ], where, by definition, f′ᵢⱼ = f(e′ᵢ, e′ⱼ). Then

f′ᵢⱼ = f(Σₖ pₖᵢeₖ, Σₗ pₗⱼeₗ) = Σₖ Σₗ pₖᵢpₗⱼ f(eₖ, eₗ) = Σₖ Σₗ pₖᵢ fₖₗ pₗⱼ.

Hence, F′ = PᵀFP.

We now have:

4.8.17. Definition. An (n × n) matrix F′ is said to be congruent to an (n × n) matrix F if there exists a non-singular matrix P such that

F′ = PᵀFP.    (4.8.18)

We express this congruence by writing F′ ≈ F.
Note that congruent matrices are also equivalent matrices. The next theorem shows that ≈ in Definition 4.8.17 is reflexive, symmetric, and transitive, and as such it is an equivalence relation.

4.8.19. Theorem. Let A, B, and C be (n × n) matrices. Then,

(i) A is congruent to A;
(ii) if A is congruent to B, then B is congruent to A; and
(iii) if A is congruent to B and B is congruent to C, then A is congruent to C.

Proof. Clearly A = IᵀAI, which proves the first part.

To prove the second part, let A = PᵀBP, where P is non-singular. Then

B = (Pᵀ)⁻¹AP⁻¹ = (P⁻¹)ᵀA(P⁻¹),

which proves the second part.

Let A = PᵀBP and B = QᵀCQ, where P and Q are non-singular matrices. Then

A = PᵀQᵀCQP = (QP)ᵀC(QP),

where QP is non-singular. This proves the third part.
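A congruence transform F′ = PᵀFP can be checked numerically; the sketch below (with made-up F and P, not from the text) verifies the two facts the surrounding discussion relies on: congruent matrices share the same rank, and the value of the functional is unchanged when the coordinates are transformed along with the matrix:

```python
import numpy as np

F = np.array([[2., 1.],
              [1., 0.]])       # matrix of f in the old basis
P = np.array([[1., 1.],
              [0., 1.]])       # non-singular change-of-basis matrix

Fp = P.T @ F @ P               # matrix of f in the new basis

# Congruent (hence equivalent) matrices have equal rank.
assert np.linalg.matrix_rank(Fp) == np.linalg.matrix_rank(F)

# Same functional value in either coordinate system:
xp = np.array([1., 2.])        # coordinates w.r.t. the new basis
x = P @ xp                     # the same vector in the old basis
assert np.isclose(xp @ Fp @ xp, x @ F @ x)
print(Fp)
```

Note that congruence uses Pᵀ, not P⁻¹; it preserves the bilinear functional, whereas similarity (P⁻¹FP) preserves the linear transformation.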
For practical reasons we are interested in determining the "nicest" (i.e., the simplest) matrix congruent to a given matrix, or what amounts to the same thing, the "nicest" (i.e., the most convenient) basis to use in expressing a given bilinear functional. If, in particular, we confine our interest to quadratic functionals, then it suffices, in view of Theorem 4.8.11, to consider symmetric bilinear functionals.

We come now to the main result of this section, called Sylvester's theorem.
.8.20. Theorem. et / be any symmetric bilinear Iunctional on a real
ndimensional vector space . Then there e ists a basis el ... ,e. oI
such that the matri oI/with respect to this basis is oI the Iorm
1
p
0
1
r
1
n ( .8.21)
1
0 0
o
The integers rand p in the above matri are uniuely determined by the
bilinear Iorm.
Proof. Since the proof of this theorem is somewhat long, it will be carried out in several steps.

Step 1. We first show that there exists a basis {v₁, …, vₙ} of X such that f(vᵢ, vⱼ) = 0 for i ≠ j. The proof of this step is by induction on the dimension of X. The statement is trivial if dim X = 1. Suppose that the assertion is true for dim X = n − 1. Let f be a bilinear functional on X, where dim X = n. Let v₁ ∈ X be such that f(v₁, v₁) ≠ 0. There must be such a v₁; otherwise, by Theorem 4.8.13, f would be skew symmetric, and we would conclude that f(x, y) = 0 for all x, y. Now let Y = {x ∈ X : f(v₁, x) = 0}.

We now show that Y is a linear subspace of X. Let x₁, x₂ ∈ Y, so that f(v₁, x₁) = f(v₁, x₂) = 0. Then f(v₁, x₁ + x₂) = f(v₁, x₁) + f(v₁, x₂) = 0 + 0 = 0. Similarly, f(v₁, αx₁) = 0 for all α ∈ R. Therefore, Y is a linear subspace of X. Furthermore, Y ≠ X because v₁ ∉ Y. Hence, dim Y ≤ n − 1. Now let dim Y = l ≤ n − 1. Since f is a bilinear functional on Y, it follows by the induction hypothesis that there is a basis for Y consisting of a set of vectors {v₂, …, v_(l+1)} such that f(vᵢ, vⱼ) = 0 for i ≠ j, 2 ≤ i, j ≤ l + 1. Also, f(v₁, vⱼ) = 0 for j = 2, …, l + 1, by definition of Y. Furthermore, f(vⱼ, v₁) = f(v₁, vⱼ). Hence, f(vⱼ, v₁) = f(v₁, vⱼ) = 0 for j = 2, …, l + 1. It follows that f(vᵢ, vⱼ) = 0 for i ≠ j and 1 ≤ i, j ≤ l + 1.

We now show that {v₁, …, v_(l+1)} is a basis for X. Let x ∈ X and let x̄ = x − α₁v₁, where α₁ = f(v₁, x)/f(v₁, v₁). Then f(v₁, x̄) = f(v₁, x) − α₁f(v₁, v₁) = f(v₁, x) − f(v₁, x) = 0. Thus, x̄ ∈ Y. Since {v₂, …, v_(l+1)} is a basis for Y, there exist α₂, …, α_(l+1) such that x̄ = α₂v₂ + ⋯ + α_(l+1)v_(l+1); i.e., x = α₁v₁ + ⋯ + α_(l+1)v_(l+1). Thus, {v₁, …, v_(l+1)} spans X.

To show that the set {v₁, …, v_(l+1)} is linearly independent, assume that α₁v₁ + ⋯ + α_(l+1)v_(l+1) = 0. Then 0 = f(v₁, 0) = f(v₁, α₁v₁ + ⋯ + α_(l+1)v_(l+1)) = α₁f(v₁, v₁), which implies that α₁ = 0. Hence, α₂v₂ + ⋯ + α_(l+1)v_(l+1) = 0. Since the set {v₂, …, v_(l+1)} forms a basis for Y, we must have α₂ = ⋯ = α_(l+1) = 0. Thus, {v₁, …, v_(l+1)} forms a basis for X, and we conclude that l + 1 = n. This completes the proof of Step 1.
Step 2. Let {v₁, …, vₙ} be a basis for X such that f(vᵢ, vⱼ) = 0 for i ≠ j, and let βᵢ = f(vᵢ, vᵢ) for i = 1, …, n. Let eᵢ = γᵢvᵢ for i = 1, …, n, where γᵢ = |βᵢ|^(−1/2) if βᵢ ≠ 0 and γᵢ = 1 if βᵢ = 0. Now suppose that βᵢ = f(vᵢ, vᵢ) ≠ 0. Then we have f(eᵢ, eᵢ) = f(γᵢvᵢ, γᵢvᵢ) = γᵢ²f(vᵢ, vᵢ) = βᵢ/|βᵢ| = ±1. Also, if βᵢ = f(vᵢ, vᵢ) = 0, then f(eᵢ, eᵢ) = γᵢ²f(vᵢ, vᵢ) = 0. Finally, we see that f(eᵢ, eⱼ) = f(γᵢvᵢ, γⱼvⱼ) = γᵢγⱼ f(vᵢ, vⱼ) = 0 if i ≠ j. Thus, we have established a basis for X such that fᵢⱼ = f(eᵢ, eⱼ) = 0 if i ≠ j and fᵢᵢ = f(eᵢ, eᵢ) = +1, −1, or 0.
Step 3. We now show that the integers p and r in matrix (4.8.21) are uniquely determined by f. Let {e₁, …, eₙ} and {e′₁, …, e′ₙ} be bases for X, and let F and F′ be matrices of f with respect to {e₁, …, eₙ} and {e′₁, …, e′ₙ}, respectively, where F is of the form (4.8.21) with integers p and r, and F′ is of the same form with integers p′ and r′.

To prove that p = p′, we show that e₁, …, e_p, e′_(p′+1), …, e′ₙ are linearly independent. From this it must follow that p + (n − p′) ≤ n, or p ≤ p′. By the same argument, p′ ≤ p, and so p = p′. Let

γ₁e₁ + ⋯ + γ_p e_p + γ′_(p′+1)e′_(p′+1) + ⋯ + γ′ₙe′ₙ = 0,

where γᵢ ∈ R, i = 1, …, p, and γ′ᵢ ∈ R, i = p′ + 1, …, n. Rewriting the above equation we have

x₀ = γ₁e₁ + ⋯ + γ_p e_p = −(γ′_(p′+1)e′_(p′+1) + ⋯ + γ′ₙe′ₙ).

Then

f(x₀, x₀) = f(γ₁e₁ + ⋯ + γ_p e_p, γ₁e₁ + ⋯ + γ_p e_p) = γ₁² + ⋯ + γ_p² ≥ 0,

by choice of e₁, …, e_p. On the other hand,

f(x₀, x₀) = f(−(γ′_(p′+1)e′_(p′+1) + ⋯ + γ′ₙe′ₙ), −(γ′_(p′+1)e′_(p′+1) + ⋯ + γ′ₙe′ₙ))
          = −(γ′_(p′+1))² − ⋯ − (γ′_(r′))² ≤ 0,

by choice of e′_(p′+1), …, e′ₙ. From this we conclude that γ₁ = ⋯ = γ_p = 0. Hence, γ′_(p′+1)e′_(p′+1) + ⋯ + γ′ₙe′ₙ = 0. But the set {e′_(p′+1), …, e′ₙ} is linearly independent, and thus γ′_(p′+1) = ⋯ = γ′ₙ = 0. Hence, the vectors e₁, …, e_p, e′_(p′+1), …, e′ₙ are linearly independent, and it follows that p = p′.

To prove that r is unique, let r be the number of non-zero diagonal elements of F and let r′ be the number of non-zero diagonal elements of F′. By Theorem 4.8.15, F and F′ are congruent and hence equivalent. Thus, it follows from Theorem 4.3.16 that F and F′ must have the same rank, and therefore r = r′.

This concludes the proof of the theorem.
Sylvester's theorem allows the following classification of symmetric bilinear functionals.

4.8.22. Definition. The integer r in Theorem 4.8.20 is called the rank of the symmetric bilinear functional f. The integer p is called the index of f. The integer n is called the order of f. The integer s = 2p − r (i.e., the number of +1's minus the number of −1's) is called the signature of f.

Since every real symmetric matrix is congruent to a unique matrix of the form (4.8.21), we define the index, order, and rank of a real symmetric matrix analogously as in Definition 4.8.22.

Now let us recall that a bilinear functional f on a vector space X is said to be positive if f(x, x) ≥ 0 for all x ∈ X. Also, a bilinear functional f is said to be strictly positive if f(x, x) > 0 for all x ≠ 0, x ∈ X (it should be noted that f(x, x) = 0 for x = 0). Our final result of the present section, which is a consequence of Theorem 4.8.20, enables us now to classify symmetric bilinear functionals.

4.8.23. Theorem. Let p, r, and n be defined as in Theorem 4.8.20. A symmetric bilinear functional f on a real n-dimensional vector space X is

(i) strictly positive if and only if p = r = n; and
(ii) positive if and only if p = r.

4.8.24. Exercise. Prove Theorem 4.8.23.
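For a real symmetric matrix F, the invariants of Theorem 4.8.20 can be read off an orthogonal diagonalization, since F = QΛQᵀ with Qᵀ = Q⁻¹ is in particular a congruence: p is the number of positive eigenvalues, r the number of non-zero eigenvalues, and s = 2p − r the signature. A sketch with a made-up F (not from the text):

```python
import numpy as np

# Hypothetical symmetric matrix; eigenvalues are -1, 1, 2.
F = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 2.]])

w = np.linalg.eigvalsh(F)        # eigenvalues of a symmetric matrix
tol = 1e-9                       # numerical zero threshold
p = int(np.sum(w > tol))         # index: count of +1's in the canonical form
r = int(np.sum(np.abs(w) > tol)) # rank: count of non-zero canonical entries
s = 2 * p - r                    # signature
print(p, r, s)                   # 2 3 1
```

Scaling each eigenvector by |λ|^(−1/2), exactly as in Step 2 of the proof, turns the eigenvalue diagonal into the ±1/0 diagonal of Eq. (4.8.21), which is why the eigenvalue signs determine p and r.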
4.9. EUCLIDEAN VECTOR SPACES

A. Euclidean Spaces: Definition and Properties

Among the various linear spaces which we will encounter, the so-called Euclidean spaces are so important that we devote the next two sections to them. These spaces will allow us to make many generalizations to facts established in plane geometry, and they will enable us to consider several important special types of linear transformations. In order to characterize these spaces properly, we must make use of two important notions, that of the norm of a vector and that of the inner product of two vectors (refer to Section 3.6). In the real plane, these concepts are related to the length of a vector and to the angle between two vectors, respectively. Before considering the matter on hand, some preliminary remarks are in order.

To begin with, we would like to point out that from a strictly logical point of view Euclidean spaces should actually be treated at a later point of our development. This is so because these spaces are specific examples of metric spaces (to be treated in the next chapter), of normed spaces (to be dealt with in Chapter 6), and of inner product spaces (also to be considered in Chapter 6). However, there are several good reasons for considering Euclidean spaces and their properties at this point. These include: Euclidean spaces are so important in applications that the reader should be exposed to them as early as possible; these spaces and their properties will provide the motivation for subsequent topics treated in this book; and the material covered in the present section and in the next section (dealing with linear transformations defined on Euclidean spaces) constitutes a natural continuation and conclusion of the topics considered thus far in the present chapter.

In order to provide proper motivation for the present section, it is useful to utilize certain facts from plane geometry to indicate the way. To this end, let us consider the space R² and let x = (ξ₁, ξ₂) and y = (η₁, η₂) be vectors in R². Let {u₁, u₂} be the natural basis for R². Then the natural coordinate representation of x and y is

x = [ ξ₁ ]  and  y = [ η₁ ]    (4.9.1)
    [ ξ₂ ]           [ η₂ ]

respectively (see Example 4.1.15). The representation of these vectors in the plane is shown in Figure D. In this figure, |x|, |y|, and |x − y| denote the lengths of vectors x, y, and (x − y), respectively, and θ represents the angle between x and y.

4.9.2. Figure D. Length of vectors and angle between vectors.

The length of vector x is equal to (ξ₁² + ξ₂²)^(1/2), and the length of vector (x − y) is equal to [(ξ₁ − η₁)² + (ξ₂ − η₂)²]^(1/2). By convention, we say in this case that "the distance from x to y" is equal to [(ξ₁ − η₁)² + (ξ₂ − η₂)²]^(1/2), that "the distance from the origin 0 (the null vector) to x" is equal to (ξ₁² + ξ₂²)^(1/2), and the like. Using the notation of the present chapter, we have

|x| = (xᵀx)^(1/2)    (4.9.3)

and

|x − y| = [(x − y)ᵀ(x − y)]^(1/2) = [(y − x)ᵀ(y − x)]^(1/2) = |y − x|.    (4.9.4)

The angle θ between vectors x and y can easily be characterized by its cosine, namely,

cos θ = (ξ₁η₁ + ξ₂η₂) / [(ξ₁² + ξ₂²)^(1/2)(η₁² + η₂²)^(1/2)].    (4.9.5)

Utilizing the notation of the present chapter, we have

cos θ = xᵀy / [(xᵀx)^(1/2)(yᵀy)^(1/2)].    (4.9.6)

It turns out that the real-valued function xᵀy, which we used in both Eqs. (4.9.3) and (4.9.6) to characterize the length of any vector x and the angle between any vectors x and y, is of fundamental importance. For this reason we denote it by a special symbol; i.e., we write

(x, y) ≜ xᵀy.    (4.9.7)
Now if we let x = y in Eq. (4.9.7), then in view of Eq. (4.9.3) we have

|x| = (x, x)^(1/2).    (4.9.8)

By inspection of Eq. (4.9.3) we note that

(x, x) > 0 for all x ≠ 0    (4.9.9)

and

(x, x) = 0 for x = 0.    (4.9.10)

Also, from Eq. (4.9.7) we have

(x, y) = (y, x)    (4.9.11)

for all x and y. Moreover, for any vectors x, y, and z and for any real scalars α and β we have, in view of Eq. (4.9.7), the relations

(x + y, z) = (x, z) + (y, z),    (4.9.12)
(x, y + z) = (x, y) + (x, z),    (4.9.13)
(αx, y) = α(x, y),    (4.9.14)

and

(x, αy) = α(x, y).    (4.9.15)

In connection with Eq. (4.9.6) we can make several additional observations. First, we note that if x = y, then cos θ = 1; if x = −y, then cos θ = −1; if xᵀ = (ξ₁, 0) and yᵀ = (0, η₂), then cos θ = 0; etc. It is easily verified, using Eq. (4.9.6), that cos θ assumes all values between −1 and +1; i.e., −1 ≤ cos θ ≤ 1.

The above formulation agrees, of course, with our notions of length of a vector, distance between two vectors, and angle between two vectors. From Eqs. (4.9.9)–(4.9.15) it is also apparent that relation (4.9.7) satisfies all the axioms of an inner product (see Section 3.6).
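The plane-geometry quantities discussed above are all computed from the single function (x, y) = xᵀy; a short numerical sketch with made-up vectors:

```python
import numpy as np

x = np.array([3., 4.])
y = np.array([4., 3.])

norm_x = np.sqrt(x @ x)                  # |x| = (x, x)^{1/2}
dist = np.sqrt((x - y) @ (x - y))        # distance from x to y
cos_theta = (x @ y) / (np.sqrt(x @ x) * np.sqrt(y @ y))

print(norm_x, dist, cos_theta)           # 5.0  ~1.414  0.96
```

As the remarks above require, the computed cosine always lies in [−1, 1] (this is the Cauchy–Schwarz inequality in disguise).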
Using the above discussion as motivation, let us now begin our treatment of Euclidean vector spaces.

First, we recall the definition of a real inner product: a bilinear functional f on a real vector space X is said to be an inner product on X if (i) f is symmetric and (ii) f is strictly positive. We also recall that a real vector space X on which an inner product is defined is called a real inner product space. We now have the following important concept.

4.9.16. Definition. A real finite-dimensional vector space on which an inner product is defined is called a Euclidean space. A finite-dimensional vector space over the field of complex numbers on which an inner product is defined is called a unitary space.

We point out that some authors do not restrict Euclidean spaces to be finite dimensional.

Although many of the results of unitary spaces are essentially identical to those of Euclidean spaces, we postpone our treatment of complex inner product spaces until Chapter 6, where we consider spaces that, in general, may be infinite dimensional.

Throughout the remainder of the present section, X will denote an n-dimensional Euclidean space, unless otherwise specified. Since we will always be concerned with a given bilinear functional on X, we will henceforth write (x, y) in place of f(x, y) to denote the inner product of x and y. Finally, for purposes of completeness, we give a summary of the axioms of a real inner product. We have

(i) (x, x) > 0 for all x ≠ 0 and (x, x) = 0 if x = 0;
(ii) (x, y) = (y, x) for all x, y ∈ X;
(iii) (αx + βy, z) = α(x, z) + β(y, z) for all x, y, z ∈ X and all α, β ∈ R; and
(iv) (x, αy + βz) = α(x, y) + β(x, z) for all x, y, z ∈ X and all α, β ∈ R.

We note that Eqs. (4.9.9)–(4.9.15) are clearly in agreement with these axioms.

4.9.17. Theorem. The inner product (x, y) = 0 for all x ∈ X if and only if y = 0.
Chapter 4 Finite Dimensional Vector Spaces and Matrices
Proof. If y = 0, then 0·y = 0 and (x, 0) = (x, 0·y) = 0·(x, y) = 0 for all x ∈ X.

On the other hand, let (x, y) = 0 for all x ∈ X. Then, in particular, it must be true that (x, y) = 0 if x = y. We thus have (y, y) = 0, which implies that y = 0.

The reader can prove the next results readily.

4.9.18. Corollary. Let A ∈ L(X, X). Then (x, Ay) = 0 for all x, y ∈ X if and only if A = 0.

4.9.19. Corollary. Let A, B ∈ L(X, X). If (x, Ay) = (x, By) for all x, y ∈ X, then A = B.

4.9.20. Corollary. Let A be a real (n × n) matrix. If x^T A y = 0 for all x, y ∈ R^n, then A = 0.

4.9.21. Exercise. Prove Corollaries 4.9.18-4.9.20.

Of crucial importance is the notion of norm. We have:

4.9.22. Definition. For each x ∈ X, let

|x| = (x, x)^{1/2}.

We call |x| the norm of x.
Let us consider a specific case.

4.9.23. Example. Let X = R^n and let x, y ∈ X, where x = (ξ_1, ..., ξ_n) and y = (η_1, ..., η_n). From Example 3.6.23 it follows that

(x, y) = Σ_{i=1}^n ξ_i η_i  (4.9.24)

is an inner product on X. The coordinate representation of x and y with respect to the natural basis in R^n is given by x = (ξ_1, ..., ξ_n)^T and y = (η_1, ..., η_n)^T, respectively (see Example 4.1.15). We thus have

(x, y) = x^T y  (4.9.25)

and

|x| = (Σ_{i=1}^n ξ_i^2)^{1/2} = (x^T x)^{1/2}.  (4.9.26)

The above example gives rise to:
4.9.27. Definition. The vector space R^n with the inner product defined in Eq. (4.9.24) is denoted by E^n. The norm of x given by Eq. (4.9.26) is called the Euclidean norm on R^n.

Relation (4.9.29) of the next result is called the Schwarz inequality.

4.9.28. Theorem. Let x and y be any elements of X. Then

|(x, y)| ≤ |x| |y|,  (4.9.29)

where in Eq. (4.9.29) |(x, y)| denotes the absolute value of a real scalar and |x| denotes the norm of x.
Proof. For any x and y in X and for any real scalar α we have

0 ≤ (x + αy, x + αy) = (x, x) + α(x, y) + α(y, x) + α^2(y, y).

Now assume first that y ≠ 0, and let

α = -(x, y)/(y, y).

Then

(x + αy, x + αy) = (x, x) + 2α(x, y) + α^2(y, y)
= (x, x) - 2(x, y)^2/(y, y) + (x, y)^2/(y, y)
= (x, x) - (x, y)^2/(y, y) ≥ 0,

or

(x, x)(y, y) ≥ (x, y)^2.

Taking the square root of both sides, we have the desired inequality

|(x, y)| ≤ |x| |y|.

To complete the proof, consider the case y = 0. Then (x, y) = 0, |y| = 0, and in this case the inequality follows trivially.
4.9.30. Exercise. For x, y ∈ X, show that

|(x, y)| = |x| |y|

if and only if x and y are linearly dependent.
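The Schwarz inequality and the equality condition of Exercise 4.9.30 are easy to check numerically. The following sketch is not part of the text; it uses the standard inner product of Example 4.9.23 on E^3, and the particular vectors are arbitrary illustrative choices:

```python
import math

def inner(x, y):
    """Standard inner product on E^n, Eq. (4.9.24)."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Euclidean norm, Eq. (4.9.26): |x| = (x, x)^(1/2)."""
    return math.sqrt(inner(x, x))

x = [1.0, 2.0, 3.0]
y = [4.0, -1.0, 2.0]

# Schwarz inequality (4.9.29): |(x, y)| <= |x| |y|
assert abs(inner(x, y)) <= norm(x) * norm(y)

# Equality holds when x and y are linearly dependent (Exercise 4.9.30):
z = [-2.5 * a for a in x]
assert abs(abs(inner(x, z)) - norm(x) * norm(z)) < 1e-9
```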
In the next result we establish the axioms of a norm.

4.9.31. Theorem. For all x and y in X and for all real scalars α, the following hold:

(i) |x| > 0 unless x = 0, in which case |x| = 0;
(ii) |αx| = |α| · |x|, where |α| denotes the absolute value of the scalar α; and
(iii) |x + y| ≤ |x| + |y|.
Proof. The proof of part (i) follows from the definition of an inner product.

To prove part (ii), we note that

|αx|^2 = (αx, αx) = α^2(x, x) = α^2 |x|^2.

Taking the square root of both sides we have the desired relation

|αx| = |α| · |x|.

To verify the last part of the theorem we note that

|x + y|^2 = (x + y, x + y) = (x, x) + 2(x, y) + (y, y) = |x|^2 + 2(x, y) + |y|^2.

Using the Schwarz inequality we obtain

|x + y|^2 ≤ |x|^2 + 2|x| |y| + |y|^2 = (|x| + |y|)^2.

Taking the square root of both sides we have

|x + y| ≤ |x| + |y|,

which is the desired result.

Part (iii) of Theorem 4.9.31 is called the triangle inequality. Part (ii) is called the homogeneous property of a norm. In Chapter 6 we will define functions on general vector spaces satisfying axioms (i), (ii), and (iii) of Theorem 4.9.31 without making use of inner products. In such cases we will speak of normed linear spaces (Euclidean spaces are examples of normed linear spaces).
Our next result is called the parallelogram law. Its meaning in the plane is evident from Figure E.

4.9.32. Figure E. Interpretation of the parallelogram law.

4.9.33. Theorem. For all x, y ∈ X the equality

|x + y|^2 + |x - y|^2 = 2|x|^2 + 2|y|^2

holds.

4.9.34. Exercise. Prove Theorem 4.9.33.
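As with the Schwarz inequality, the parallelogram law can be verified numerically. A small sketch (not from the text) under the standard inner product on E^3, with arbitrary sample vectors:

```python
import math

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

x = [1.0, 2.0, -1.0]
y = [3.0, 0.5, 4.0]
xpy = [a + b for a, b in zip(x, y)]   # x + y
xmy = [a - b for a, b in zip(x, y)]   # x - y

# |x + y|^2 + |x - y|^2 = 2|x|^2 + 2|y|^2 (Theorem 4.9.33)
lhs = norm(xpy) ** 2 + norm(xmy) ** 2
rhs = 2 * norm(x) ** 2 + 2 * norm(y) ** 2
assert abs(lhs - rhs) < 1e-9
```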
Generalizing the distance formula introduced at the beginning of the present section, we define the distance between two vectors x and y of X as

ρ(x, y) = |x - y|.  (4.9.35)

It is not difficult for the reader to prove the next result.

4.9.36. Theorem. For all x, y, z ∈ X, the following hold:

(i) ρ(x, y) = ρ(y, x);
(ii) ρ(x, y) ≥ 0 and ρ(x, y) = 0 if and only if x = y; and
(iii) ρ(x, y) ≤ ρ(x, z) + ρ(z, y).

A function ρ(x, y) having properties (i), (ii), and (iii) of Theorem 4.9.36 is called a metric. Without making use of inner products, we will in Chapter 5 define such functions on non-empty sets (not necessarily linear spaces), and we will in such cases speak of metric spaces (Euclidean spaces are examples of metric spaces).

4.9.37. Exercise. Prove Theorem 4.9.36.
B. Orthogonal Bases

Following our discussion at the beginning of the present section further, we now recall the important concept of orthogonality, using inner products. In accordance with Definition 3.6.22, two vectors x, y ∈ X are said to be orthogonal (to one another) if (x, y) = 0. We recall that this is written as x ⊥ y. From the discussion at the beginning of this section it is clear that in the plane x ≠ 0 is orthogonal to y ≠ 0 if and only if the angle θ between x and y is some odd multiple of 90°.

The reader has undoubtedly encountered a special case of our next result, known as the Pythagorean theorem.

4.9.38. Theorem. Let x, y ∈ X. If x ⊥ y, then

|x + y|^2 = |x|^2 + |y|^2.

Proof. Since by assumption x ⊥ y, we have (x, y) = 0. Thus,

|x + y|^2 = (x + y, x + y) = (x, x) + (x, y) + (y, x) + (y, y) = |x|^2 + |y|^2,

which is the desired result.

4.9.39. Definition. A vector x ∈ X is said to be a unit vector if |x| = 1.

Let us choose any vector y ≠ 0 and let z = y/|y|. Then the norm of z is

|z| = |y/|y|| = (1/|y|)|y| = 1;

i.e., z is a unit vector. This process is called normalizing the vector y.
Next, let {f_1, ..., f_n} be an arbitrary basis for X and let G = [g_ij] denote the matrix of the inner product with respect to this basis; i.e., g_ij = (f_i, f_j) for all i and j. More specifically, G denotes the matrix of the bilinear functional f that is used in determining the inner product on X with respect to the indicated basis (see Definition 4.8.1). Let x and y denote the coordinate representation of x and y, respectively, with respect to {f_1, ..., f_n}. Then we have, by Theorem 4.8.2,

(x, y) = x^T G y = Σ_{i=1}^n Σ_{j=1}^n g_ij ξ_i η_j.

Now by Theorems 4.8.20 and 4.8.23, since the inner product is symmetric and strictly positive, there exists a basis {e_1, ..., e_n} for X such that the matrix of the inner product with respect to this basis is the (n × n) identity matrix I; i.e.,

(e_i, e_j) = 0 if i ≠ j and (e_i, e_j) = 1 if i = j.

This motivates the following:

4.9.40. Definition. If {e_1, ..., e_n} is a basis for X such that (e_i, e_j) = 0 for all i ≠ j, i.e., if e_i ⊥ e_j for all i ≠ j, then {e_1, ..., e_n} is called an orthogonal basis. If in addition (e_i, e_i) = 1, i.e., if |e_i| = 1 for all i, then {e_1, ..., e_n} is said to be an orthonormal basis for X (thus, {e_1, ..., e_n} is orthonormal if and only if (e_i, e_j) = δ_ij).
Using the properties of inner products and the definitions of orthogonal and orthonormal bases, we are now in a position to establish several useful results.

4.9.41. Theorem. Let {e_1, ..., e_n} be an orthonormal basis for X. Let x and y be arbitrary vectors in X, and let the coordinate representation of x and y with respect to this basis be x^T = (ξ_1, ..., ξ_n) and y^T = (η_1, ..., η_n), respectively. Then

(x, y) = x^T y = Σ_{i=1}^n ξ_i η_i  (4.9.42)

and

|x| = (x^T x)^{1/2} = (ξ_1^2 + ... + ξ_n^2)^{1/2}.  (4.9.43)

Proof. From the above discussion we have

(x, y) = Σ_{i=1}^n Σ_{j=1}^n ξ_i η_j (e_i, e_j) = Σ_{i=1}^n ξ_i η_i.

In particular, we have

(x, x) = Σ_{i=1}^n ξ_i^2.

The reader should note that Eqs. (4.9.7) and (4.9.8) introduced at the beginning of this section are, of course, in agreement with Eqs. (4.9.42) and (4.9.43). (See also Example 4.9.23.)
Our next result enables us to determine the coordinates of a vector with respect to a given orthonormal basis.

4.9.44. Theorem. Let {e_1, ..., e_n} be an orthonormal basis for X and let x be an arbitrary vector. The coordinates of x with respect to e_1, ..., e_n are given by the formulas

ξ_1 = (x, e_1), ..., ξ_n = (x, e_n).

Proof. Since x = ξ_1 e_1 + ... + ξ_n e_n, we have

(x, e_1) = (ξ_1 e_1 + ... + ξ_n e_n, e_1) = ξ_1(e_1, e_1) + ... + ξ_n(e_n, e_1) = ξ_1.

Repeating this procedure for (x, e_i), i = 2, ..., n, yields the desired result.
Let us consider some specific cases.

4.9.45. Example. Let X = E^2 (see Definition 4.9.27). Let x, y ∈ E^2, where x = (ξ_1, ξ_2) and y = (η_1, η_2). Then

(x, y) = ξ_1 η_1 + ξ_2 η_2.

The natural basis for E^2 is given by u_1 = (1, 0) and u_2 = (0, 1). Since (u_i, u_j) = δ_ij, it follows that {u_1, u_2} is an orthonormal basis for E^2. Furthermore, we have ξ_1 = (x, u_1) and ξ_2 = (x, u_2).
4.9.46. Example. Let X = R^2, and let the inner product on R^2 be defined by

(x, y) = ξ_1 η_1 + 4 ξ_2 η_2.  (4.9.47)

(The reader may verify that this is indeed an inner product.) Let {u_1, u_2} denote the natural basis for R^2; i.e., u_1 = (1, 0) and u_2 = (0, 1). The matrix representation G of the bilinear functional which determines the above inner product with respect to the basis {u_1, u_2} is

G = [1 0; 0 4],

so that

(x, y) = x^T G y,

where x and y are the coordinate vectors of x and y with respect to {u_1, u_2}. We see that (u_1, u_2) = 1·0 + 4·0·1 = 0; i.e., u_1 and u_2 are orthogonal with respect to the inner product (4.9.47). Note however that |u_1| = 1 and |u_2| = 2; i.e., the vectors u_1 and u_2 are not orthonormal.

Now let e_1 = (1, 0) and e_2 = (0, 1/2). Then it is readily verified that {e_1, e_2} is an orthonormal basis for X. Furthermore, for x = ξ_1' e_1 + ξ_2' e_2 we have ξ_1' = (x, e_1) and ξ_2' = (x, e_2). If we let x' = (ξ_1', ξ_2')^T and y' = (η_1', η_2')^T denote the coordinate representation of x and y, respectively, with respect to {e_1, e_2}, then

(x, y) = (x')^T y'.

This illustrates the fact that the norm of a vector must be interpreted with respect to the inner product used in determining the norm.
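The computations of this example can be sketched in code. The weight 4 on the second coordinate below is our reading of Eq. (4.9.47), inferred from the statements |u_2| = 2 and e_2 = (0, 1/2):

```python
import math

def inner(x, y):
    """Inner product of Example 4.9.46 (assuming (x, y) = xi1*eta1 + 4*xi2*eta2)."""
    return x[0] * y[0] + 4.0 * x[1] * y[1]

def norm(x):
    return math.sqrt(inner(x, x))

u1, u2 = [1.0, 0.0], [0.0, 1.0]     # natural basis of R^2
assert inner(u1, u2) == 0.0         # orthogonal with respect to (4.9.47) ...
assert norm(u2) == 2.0              # ... but u2 is not a unit vector

e1, e2 = [1.0, 0.0], [0.0, 0.5]
assert norm(e1) == 1.0 and norm(e2) == 1.0   # {e1, e2} is orthonormal
```

The same vector thus has norm 2 or norm 1 depending on whether the standard or the weighted inner product is used, which is the point of the example.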
Our next result allows us to represent vectors in X in a convenient way.

4.9.48. Theorem. Let {e_1, ..., e_n} be an orthogonal basis for X. Then for all x ∈ X we have

x = [(x, e_1)/(e_1, e_1)] e_1 + ... + [(x, e_n)/(e_n, e_n)] e_n.

Proof. Normalizing e_1, ..., e_n, we obtain the orthonormal basis e_1', ..., e_n', where e_i' = e_i/|e_i|, i = 1, ..., n. By Theorem 4.9.44 we have

x = (x, e_1')e_1' + ... + (x, e_n')e_n'
= (x, e_1/|e_1|)(e_1/|e_1|) + ... + (x, e_n/|e_n|)(e_n/|e_n|)
= [(x, e_1)/|e_1|^2] e_1 + ... + [(x, e_n)/|e_n|^2] e_n
= [(x, e_1)/(e_1, e_1)] e_1 + ... + [(x, e_n)/(e_n, e_n)] e_n.
We are now in a position to characterize inner products by means of Parseval's identity, given in our next result.

4.9.49. Corollary. Let {e_1, ..., e_n} be an orthogonal basis for X. Then for any x, y ∈ X we have

(x, y) = Σ_{i=1}^n [(x, e_i)(y, e_i)]/(e_i, e_i).

4.9.50. Exercise. Verify Corollary 4.9.49.
Our next result establishes the linear independence of orthogonal vectors. We have:

4.9.51. Theorem. Suppose that x_1, ..., x_k are mutually orthogonal non-zero vectors in X; i.e., x_i ⊥ x_j for i ≠ j. Then x_1, ..., x_k are linearly independent.

Proof. Assume that for real scalars α_1, ..., α_k we have

α_1 x_1 + ... + α_k x_k = 0.

For arbitrary i = 1, ..., k, we have

0 = (0, x_i) = (α_1 x_1 + ... + α_k x_k, x_i) = α_1(x_1, x_i) + ... + α_k(x_k, x_i) = α_i(x_i, x_i);

i.e., α_i(x_i, x_i) = 0. This implies that α_i = 0 for arbitrary i, which proves the linear independence of x_1, ..., x_k.

Note that the converse to the above theorem is not true. We leave the proofs of the next two results as an exercise.

4.9.52. Corollary. A set of k non-zero mutually orthogonal vectors is a basis for X if and only if k = dim X = n.

4.9.53. Corollary. For X there exist not more than n mutually orthonormal vectors. (In this case we speak of a complete orthonormal set of vectors.)

4.9.54. Exercise. Prove Corollaries 4.9.52 and 4.9.53.
Our next result, which is called the Gram-Schmidt process, allows us to construct an orthonormal basis from an arbitrary basis.

4.9.55. Theorem. Let {f_1, ..., f_n} be an arbitrary basis for X. Set

g_1 = f_1,  e_1 = g_1/|g_1|,
g_2 = f_2 - (f_2, e_1)e_1,  e_2 = g_2/|g_2|,
...
g_n = f_n - Σ_{j=1}^{n-1} (f_n, e_j)e_j,  e_n = g_n/|g_n|.

Then {e_1, ..., e_n} is an orthonormal basis for X.

4.9.56. Exercise. Prove Theorem 4.9.55. To accomplish this, show that (e_i, e_j) = 0 for i ≠ j, that |e_i| = 1 for i = 1, ..., n, and that e_1, ..., e_n forms a basis for X.

The next result is a direct consequence of Theorem 4.9.55 and of the results in Section 3.3.

4.9.57. Corollary. If e_1, ..., e_k, k < n, are mutually orthogonal non-zero vectors in X, then we can find a set of vectors e_{k+1}, ..., e_n such that the set {e_1, ..., e_n} forms a basis for X.
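The Gram-Schmidt construction of Theorem 4.9.55 translates directly into code. A minimal NumPy sketch (the input basis F below is an arbitrary illustrative choice):

```python
import numpy as np

def gram_schmidt(F):
    """Gram-Schmidt process of Theorem 4.9.55: the rows of F are an
    arbitrary basis f_1, ..., f_n; the rows of the result are e_1, ..., e_n."""
    E = []
    for f in F:
        # g_k = f_k - sum_j (f_k, e_j) e_j
        g = f - sum(np.dot(f, e) * e for e in E)
        # e_k = g_k / |g_k|
        E.append(g / np.linalg.norm(g))
    return np.array(E)

F = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
E = gram_schmidt(F)

# The rows of E are orthonormal: E E^T = I
assert np.allclose(E @ E.T, np.eye(3))
```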
Our next result is known as the Bessel inequality.

4.9.58. Theorem. If {x_1, ..., x_k} is an arbitrary set of mutually orthonormal vectors in X, then

Σ_{i=1}^k (x, x_i)^2 ≤ |x|^2

for all x ∈ X. Moreover, the vector

y = x - Σ_{i=1}^k (x, x_i)x_i

is orthogonal to each x_i, i = 1, ..., k.

Proof. Let α_i = (x, x_i). We have

0 ≤ |x - Σ_{i=1}^k α_i x_i|^2 = (x - Σ_{i=1}^k α_i x_i, x - Σ_{j=1}^k α_j x_j)
= (x, x) - 2 Σ_{i=1}^k α_i(x, x_i) + Σ_{i=1}^k Σ_{j=1}^k α_i α_j(x_i, x_j).

Now since the vectors x_1, ..., x_k are mutually orthonormal, we have

0 ≤ |x|^2 - 2 Σ_{i=1}^k α_i^2 + Σ_{i=1}^k α_i^2 = |x|^2 - Σ_{i=1}^k α_i^2,

which proves the first part of the theorem.

To prove the second part, we note that

(y, x_i) = (x - Σ_{j=1}^k α_j x_j, x_i) = (x, x_i) - α_i(x_i, x_i) = α_i - α_i = 0.

In Theorem 4.9.58, let Y denote the linear subspace of X which is spanned by the set of vectors x_1, ..., x_k. Then clearly the vector y defined in this theorem is orthogonal to each vector of Y; i.e., y ⊥ Y (see Definition 3.6.22).
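A quick numerical illustration of the Bessel inequality and of the orthogonality of the residual vector y; the orthonormal pair and the vector x below are arbitrary illustrative choices under the standard inner product on E^3:

```python
def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

x1 = [1.0, 0.0, 0.0]
x2 = [0.0, 1.0, 0.0]          # a mutually orthonormal pair in E^3
x  = [2.0, -1.0, 3.0]

alphas = [inner(x, xi) for xi in (x1, x2)]

# Bessel inequality (Theorem 4.9.58): sum_i (x, x_i)^2 <= |x|^2
assert sum(a * a for a in alphas) <= inner(x, x)

# y = x - sum_i (x, x_i) x_i is orthogonal to each x_i
y = [c - alphas[0] * a - alphas[1] * b for c, a, b in zip(x, x1, x2)]
assert abs(inner(y, x1)) < 1e-12 and abs(inner(y, x2)) < 1e-12
```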
Let us next consider:

4.9.59. Theorem. Let Y be a linear subspace of X, and let

Y^⊥ = {y ∈ X : (x, y) = 0 for all x ∈ Y}.  (4.9.60)

(i) Let {f_1, ..., f_k} span Y. Then y ∈ Y^⊥ if and only if y ⊥ f_j for j = 1, ..., k.
(ii) Y^⊥ is a linear subspace of X.
(iii) n = dim X = dim Y + dim Y^⊥.
(iv) (Y^⊥)^⊥ = Y.
(v) X = Y ⊕ Y^⊥.
(vi) Let x, y ∈ X. If x = x_1 + x_2 and y = y_1 + y_2, where x_1, y_1 ∈ Y and x_2, y_2 ∈ Y^⊥, then

(x, y) = (x_1, y_1) + (x_2, y_2)

and

|x|^2 = |x_1|^2 + |x_2|^2.

Proof. To prove the first part, note that if y ∈ Y^⊥, then y ⊥ f_1, ..., y ⊥ f_k, since f_i ∈ Y for i = 1, ..., k. On the other hand, let y ⊥ f_i, i = 1, ..., k. Then for any x ∈ Y there exist scalars η_i, i = 1, ..., k, such that x = η_1 f_1 + ... + η_k f_k. Hence,

(x, y) = (Σ_{i=1}^k η_i f_i, y) = Σ_{i=1}^k η_i(f_i, y) = 0.

Thus, y ∈ Y^⊥.

The remaining parts of the theorem are left as an exercise.

4.9.61. Exercise. Prove parts (ii) through (vi) of Theorem 4.9.59.

4.9.62. Definition. Let Y be a linear subspace of X. The subspace Y^⊥ defined in Eq. (4.9.60) is called the orthogonal complement of Y.
Before closing the present section we state and prove the following important result.

4.9.63. Theorem. Let f be a linear functional on X. There exists a unique y ∈ X such that

f(x) = (x, y)  (4.9.64)

for all x ∈ X.

Proof. If f(x) = 0 for all x ∈ X, then y = 0 is the unique vector such that Eq. (4.9.64) is satisfied for all x ∈ X, by Theorem 4.9.17. So let us suppose that f(x) ≠ 0 for some x ∈ X, and let

Z = {x ∈ X : f(x) = 0}.

Then Z is a linear subspace of X. Let Z^⊥ be the orthogonal complement of Z. Then it follows from Theorem 4.9.59 that X = Z ⊕ Z^⊥. Furthermore, Z^⊥ contains a non-zero vector. Let y_0 ∈ Z^⊥ and, without loss of generality, let y_0 be chosen in such a fashion that |y_0| = 1. Now let y = f(y_0)y_0, and for any x ∈ X let x_0 = x - αy_0, where α = f(x)/f(y_0). Then f(x_0) = 0, and thus x_0 ∈ Z. We now have x = x_0 + αy_0, and

(x, y) = (x_0 + αy_0, f(y_0)y_0) = f(y_0)(x_0, y_0) + αf(y_0)(y_0, y_0) = αf(y_0) = f(x);

i.e., for all x ∈ X, f(x) = (x, y).

To show that y is unique, suppose that (x, y') = (x, y) for all x ∈ X. Then (x, y' - y) = 0 for all x ∈ X. But this implies that y' - y = 0, or y' = y. This completes the proof of the theorem.
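Theorem 4.9.63 can be illustrated numerically: in an orthonormal basis the representer y has coordinates f(e_i), by Theorem 4.9.44 applied to Eq. (4.9.64). The functional f below is a hypothetical example, not from the text:

```python
def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

# A hypothetical linear functional on E^3, chosen only for illustration.
def f(x):
    return 2.0 * x[0] - x[1] + 3.0 * x[2]

# Coordinates of the representer y: eta_i = f(e_i) in the natural
# (orthonormal) basis of E^3.
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [f(e) for e in basis]

x = [0.5, -1.0, 2.0]
assert abs(f(x) - inner(x, y)) < 1e-12   # f(x) = (x, y), Eq. (4.9.64)
```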
4.10. LINEAR TRANSFORMATIONS ON EUCLIDEAN VECTOR SPACES

A. Orthogonal Transformations

In the present section we concern ourselves with special types of linear transformations defined on Euclidean vector spaces. We will have occasion to reconsider similar types of transformations again in Chapter 7, in a much more general setting. Unless otherwise specified, X will denote an n-dimensional Euclidean vector space throughout the present section.

The first special type of linear transformation defined on Euclidean vector spaces which we consider is the so-called "orthogonal transformation." Let {e_1, ..., e_n} be an orthonormal basis for X, let e_i' = Σ_{k=1}^n p_ki e_k, i = 1, ..., n, and let P denote the matrix determined by the real scalars p_ki. The following question arises: when is the set {e_1', ..., e_n'} also an orthonormal basis for X? To determine the desired properties of P, we consider

(e_i', e_j') = (Σ_{k=1}^n p_ki e_k, Σ_{l=1}^n p_lj e_l) = Σ_{k=1}^n Σ_{l=1}^n p_ki p_lj (e_k, e_l).

In order that (e_i', e_j') = 0 for i ≠ j and (e_i', e_j') = 1 for i = j, we require that

(e_i', e_j') = Σ_{k=1}^n p_ki p_kj = δ_ij;

i.e., we require that

P^T P = I,

where, as usual, I denotes the n × n identity matrix. We summarize.

4.10.1. Theorem. Let {e_1, ..., e_n} be an orthonormal basis for X. Let e_i' = Σ_{k=1}^n p_ki e_k, i = 1, ..., n. Then {e_1', ..., e_n'} is an orthonormal basis for X if and only if P^T P = I.

This result gives rise to the following:

4.10.2. Definition. A matrix P such that P^T = P^{-1}, i.e., such that P^T P = P P^T = I, is called an orthogonal matrix.

4.10.3. Exercise. Show that if P is an orthogonal matrix, then either det P = 1 or det P = -1. Also, show that if P and Q are (n × n) orthogonal matrices, then so is PQ.
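A short numerical illustration of Definition 4.10.2 and Exercise 4.10.3, using NumPy; the plane rotation and reflection below are standard examples, not from the text:

```python
import numpy as np

theta = 0.3
# A rotation of the plane: a standard example of an orthogonal matrix.
P = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

assert np.allclose(P.T @ P, np.eye(2))     # P^T P = I (Definition 4.10.2)
assert np.isclose(np.linalg.det(P), 1.0)   # det P = +1

# A reflection across the first axis: orthogonal with det Q = -1.
Q = np.array([[1.0, 0.0], [0.0, -1.0]])
assert np.isclose(np.linalg.det(Q), -1.0)

# The product of orthogonal matrices is again orthogonal (Exercise 4.10.3).
assert np.allclose((P @ Q).T @ (P @ Q), np.eye(2))
```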
The nomenclature used in our next definition will become clear shortly.
4.10.4. Definition. A linear transformation A from X into X is called an orthogonal linear transformation if (Ax, Ay) = (x, y) for all x, y ∈ X.

Let us now establish some of the properties of orthogonal transformations.

4.10.5. Theorem. Let A ∈ L(X, X). Then A is orthogonal if and only if |Ax| = |x| for all x ∈ X.

Proof. If A is orthogonal, then (Ax, Ax) = (x, x) and |Ax| = |x|. Conversely, if |Ax| = |x| for all x ∈ X, then

|A(x + y)|^2 = (A(x + y), A(x + y)) = (Ax + Ay, Ax + Ay)
= |Ax|^2 + 2(Ax, Ay) + |Ay|^2
= |x|^2 + 2(Ax, Ay) + |y|^2.

Also,

|A(x + y)|^2 = |x + y|^2 = (x + y, x + y) = |x|^2 + 2(x, y) + |y|^2,

and therefore

(Ax, Ay) = (x, y)

for all x, y ∈ X.

We note that if A is an orthogonal linear transformation, then x ⊥ y for x, y ∈ X if and only if Ax ⊥ Ay. For (x, y) = 0 if and only if (Ax, Ay) = 0.

4.10.6. Corollary. Every orthogonal linear transformation of X into X is non-singular.

Proof. Let Ax = 0. Then |Ax| = |x| = 0. Thus, x = 0 and A is non-singular.
Our next result establishes the link between Definitions 4.10.2 and 4.10.4.

4.10.7. Theorem. Let {e_1, ..., e_n} be an orthonormal basis for X. Let A ∈ L(X, X), and let A be the matrix of A with respect to this basis. Then A is orthogonal if and only if A is orthogonal.

Proof. Let x and y be arbitrary vectors in X, and let x and y denote their coordinate representation, respectively, with respect to the basis {e_1, ..., e_n}. Then Ax and Ay denote the coordinate representation of Ax and Ay, respectively, with respect to this basis. Now,

(Ax, Ay) = (Ax)^T(Ay) = x^T A^T A y,

and

(x, y) = x^T y.

Now suppose that A is orthogonal. Then A^T A = I and (Ax, Ay) = x^T y = (x, y) for all x, y ∈ X. On the other hand, if A is orthogonal, then (Ax, Ay) = x^T A^T A y = x^T y = (x, y) for all x, y ∈ X. Thus, x^T(A^T A - I)y = 0. Since this holds for all x, y ∈ X, we conclude from Corollary 4.9.20 that A^T A - I = 0; i.e., A^T A = I.

The next two results are left as an exercise.

4.10.8. Corollary. Let A ∈ L(X, X). If A is orthogonal, then det A = ±1.

4.10.9. Corollary. Let A, B ∈ L(X, X). If A and B are orthogonal transformations, then AB is also an orthogonal linear transformation.

4.10.10. Exercise. Prove Corollaries 4.10.8 and 4.10.9.

For reasons that will become apparent later, we introduce the following convention.

4.10.11. Definition. Let A ∈ L(X, X) be an orthogonal linear transformation. If det A = +1, then A is called a rotation. If det A = -1, then A is called a reflection.
B. Adjoint Transformations

The next important class of linear transformations on Euclidean spaces which we consider are so-called adjoint linear transformations. Our next result enables us to introduce such transformations in a natural way.

4.10.12. Theorem. Let G ∈ L(X, X) and define g: X × X → R by g(x, y) = (x, Gy) for all x, y ∈ X. Then g is a bilinear functional on X. Moreover, if {e_1, ..., e_n} is an orthonormal basis for X, then the matrix of g with respect to this basis, denoted by G, is the matrix of G with respect to {e_1, ..., e_n}. Conversely, given an arbitrary bilinear functional g defined on X, there exists a unique linear transformation G ∈ L(X, X) such that (x, Gy) = g(x, y) for all x, y ∈ X.

Proof. Let G ∈ L(X, X), and let g(x, y) = (x, Gy). Then

g(x_1 + x_2, y) = (x_1 + x_2, Gy) = (x_1, Gy) + (x_2, Gy) = g(x_1, y) + g(x_2, y).

Also,

g(x, y_1 + y_2) = (x, G(y_1 + y_2)) = (x, Gy_1 + Gy_2) = (x, Gy_1) + (x, Gy_2) = g(x, y_1) + g(x, y_2).

Furthermore,

g(αx, y) = (αx, Gy) = α(x, Gy) = αg(x, y),

and

g(x, αy) = (x, G(αy)) = (x, αGy) = α(x, Gy) = αg(x, y),

where α is a real scalar. Therefore, g is a bilinear functional.

Next, let {e_1, ..., e_n} be an orthonormal basis for X. Then the matrix G' of g with respect to this basis is determined by the elements g_ij' = g(e_i, e_j). Now let G = [g_ij] be the matrix of G with respect to {e_1, ..., e_n}. Then Ge_j = Σ_{k=1}^n g_kj e_k for j = 1, ..., n. Hence, (e_i, Ge_j) = (e_i, Σ_{k=1}^n g_kj e_k) = g_ij. Since g_ij' = g(e_i, e_j) = (e_i, Ge_j) = g_ij, it follows that G' = G; i.e., G is the matrix of g.

To prove the last part of the theorem, choose any orthonormal basis {e_1, ..., e_n} for X. Given a bilinear functional g defined on X, let G = [g_ij] denote its matrix with respect to this basis, and let G be the linear transformation corresponding to G. Then (x, Gy) = g(x, y) by the identical argument given above. Finally, since the matrix of the bilinear functional and the matrix of the linear transformation were determined independently, this correspondence is unique.

It should be noted that the correspondence between bilinear functionals and linear transformations determined by the relation (x, Gy) = g(x, y) for all x, y ∈ X does not depend on the particular basis chosen for X; however, it does depend on the way the inner product is chosen for X at the outset.
Now let G ∈ L(X, X), set g(x, y) = (x, Gy), and let h(x, y) = g(y, x) = (y, Gx) = (Gx, y). By Theorem 4.10.12, there exists a unique linear transformation, denote it by G*, such that h(x, y) = (x, G*y) for all x, y ∈ X. We call the linear transformation G* ∈ L(X, X) the adjoint of G.

4.10.13. Theorem

(i) For each G ∈ L(X, X), there is a unique G* ∈ L(X, X) such that (x, G*y) = (Gx, y) for all x, y ∈ X.
(ii) Let {e_1, ..., e_n} be an orthonormal basis for X, and let G be the matrix of the linear transformation G ∈ L(X, X) with respect to this basis. Let G* be the matrix of G* with respect to {e_1, ..., e_n}. Then G* = G^T.

Proof. The proof of the first part follows from the discussion preceding the present theorem.

To prove the second part, let {e_1, ..., e_n} be an orthonormal basis for X, and let G* denote the matrix of G* with respect to this basis. Let x and y be the coordinate representation of x and y, respectively, with respect to this basis. Then

(x, G*y) = x^T G* y = (Gx, y) = (Gx)^T y = x^T G^T y.

Thus, for all x and y we have x^T(G* - G^T)y = 0. Hence, G* = G^T.

The above result allows the following equivalent definition of the adjoint linear transformation.

4.10.14. Definition. Let G ∈ L(X, X). The adjoint transformation G* is defined by the formula

(x, G*y) = (Gx, y)

for all x, y ∈ X.
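In coordinates, Theorem 4.10.13(ii) says that the adjoint acts through the transposed matrix. A quick NumPy check on a randomly generated matrix and pair of vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4))   # matrix of G in an orthonormal basis
x = rng.standard_normal(4)
y = rng.standard_normal(4)

# (x, G* y) = (G x, y) with G* = G^T (Theorem 4.10.13(ii))
assert np.isclose(np.dot(x, G.T @ y), np.dot(G @ x, y))
```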
Although there is obviously great similarity between the adjoint of a linear transformation and the transpose of a linear transformation, it should be noted that these two transformations constitute different concepts. The differences between these will become more apparent in our subsequent discussion of linear transformations defined on complex vector spaces in Chapter 7.

Our next result includes some of the elementary properties of adjoints of linear transformations. The reader should compare these with the properties of the transpose of linear transformations.

4.10.15. Theorem. Let A, B ∈ L(X, X), let A*, B* denote their respective adjoints, and let α be a real scalar. Then

(i) (A*)* = A;
(ii) (A + B)* = A* + B*;
(iii) (αA)* = αA*;
(iv) (AB)* = B*A*;
(v) I* = I, where I denotes the identity transformation;
(vi) 0* = 0, where 0 denotes the null transformation;
(vii) A is non-singular if and only if A* is non-singular; and
(viii) if A is non-singular, then (A*)^{-1} = (A^{-1})*.

4.10.16. Exercise. Prove Theorem 4.10.15.

Our next result enables us to characterize orthogonal transformations in terms of their adjoints.

4.10.17. Theorem. Let A ∈ L(X, X). Then A is orthogonal if and only if A* = A^{-1}.

Proof. We have (Ax, Ay) = (A*Ax, y). But A is orthogonal if and only if (Ax, Ay) = (x, y) for all x, y ∈ X. Therefore,

(A*Ax, y) = (x, y)

for all x and y. From this it follows that A*A = I, which implies that A* = A^{-1}.

The proof of the next theorem is left as an exercise.

4.10.18. Theorem. Let A ∈ L(X, X). Then A is orthogonal if and only if A^{-1} is orthogonal, and A^{-1} is orthogonal if and only if A* is orthogonal.

4.10.19. Exercise. Prove Theorem 4.10.18.
C. Self-Adjoint Transformations

Using adjoints, we now introduce two additional important types of linear transformations.

4.10.20. Definition. Let A ∈ L(X, X). Then A is said to be self-adjoint if A* = A, and it is said to be skew-adjoint if A* = -A.

Some of the properties of such transformations are as follows.

4.10.21. Theorem. Let A ∈ L(X, X). Let {e_1, ..., e_n} be an orthonormal basis for X, and let A be the matrix of A with respect to this basis. The following are equivalent:

(i) A is self-adjoint;
(ii) A is symmetric; and
(iii) (Ax, y) = (x, Ay) for all x, y ∈ X.

4.10.22. Theorem. Let A ∈ L(X, X), and let {e_1, ..., e_n} be an orthonormal basis for X. Let A be the matrix of A with respect to this basis. The following are equivalent:

(i) A is skew-adjoint;
(ii) A is skew-symmetric (see Definition 4.8.8); and
(iii) (Ax, y) = -(x, Ay) for all x, y ∈ X.

4.10.23. Exercise. Prove Theorems 4.10.21 and 4.10.22.

The following corollary follows from part (iii) of Theorem 4.10.22.

4.10.24. Corollary. Let A be as defined in Theorem 4.10.22. Then the following are equivalent:

(i) A is skew-symmetric;
(ii) (x, Ax) = 0 for all x ∈ X; and
(iii) Ax ⊥ x for all x ∈ X.

Our next result enables us to represent arbitrary linear transformations as the sum of self-adjoint and skew-adjoint transformations.

4.10.25. Corollary. Let A ∈ L(X, X). Then there exist unique A_1, A_2 ∈ L(X, X) such that A = A_1 + A_2, where A_1 is self-adjoint and A_2 is skew-adjoint.

4.10.26. Exercise. Prove Corollaries 4.10.24 and 4.10.25.

4.10.27. Exercise. Show that every real n × n matrix can be written in one and only one way as the sum of a symmetric and a skew-symmetric matrix.
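In matrix form, the decomposition of Corollary 4.10.25 is the familiar symmetric/skew-symmetric splitting of Exercise 4.10.27; a NumPy sketch on a randomly generated matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))

A1 = (A + A.T) / 2.0    # symmetric part (self-adjoint)
A2 = (A - A.T) / 2.0    # skew-symmetric part (skew-adjoint)

assert np.allclose(A1, A1.T)        # A1 is symmetric
assert np.allclose(A2, -A2.T)       # A2 is skew-symmetric
assert np.allclose(A, A1 + A2)      # A = A1 + A2

# Corollary 4.10.24(ii): (x, A2 x) = 0 for the skew-symmetric part
x = rng.standard_normal(3)
assert abs(x @ A2 @ x) < 1e-10
```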
Our next result is applicable to real as well as complex vector spaces.

4.10.28. Theorem. Let X be a complex vector space. Then the eigenvalues of a real symmetric matrix A are all real. (If all eigenvalues of A are positive (negative), then A is called positive (negative) definite.)

Proof. Let λ = r + is denote an eigenvalue of A, where r and s are real numbers and where i = √(-1). We must show that s = 0.

Since λ is an eigenvalue we know that the matrix (A - λI) is singular. So is the matrix

B = [A - (r + is)I][A - (r - is)I]
= A^2 - (r - is)A - (r + is)A + (r + is)(r - is)I
= A^2 - 2rA + (r^2 + s^2)I = (A - rI)^2 + s^2 I.

Since B is singular, there exists an x ≠ 0 such that Bx = 0. Also,

0 = x^T B x = x^T[(A - rI)^2 + s^2 I]x = x^T(A - rI)^2 x + s^2 x^T x.

Since A and I are symmetric,

(A - rI)^T = A^T - rI^T = A - rI.

Therefore,

x^T(A - rI)^2 x = y^T y,

where y = (A - rI)x. Now y^T y ≥ 0 and x^T x > 0, because by assumption x ≠ 0. Thus, we have

0 = y^T y + s^2(x^T x) ≥ s^2 x^T x.

The only way that this last relation can hold is if s = 0. Therefore, λ = r, and λ is real.
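Theorem 4.10.28 can be checked numerically by feeding a real symmetric matrix to a general (complex-capable) eigensolver; a NumPy sketch on a randomly generated symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2.0          # a real symmetric matrix

lam = np.linalg.eigvals(A)   # general eigensolver, allows complex results

# Theorem 4.10.28: every eigenvalue of a real symmetric matrix is real.
assert np.allclose(np.imag(lam), 0.0, atol=1e-10)
```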
Now let A be the matrix of the linear transformation A ∈ L(X, X) with respect to some basis. If A is symmetric, then all its eigenvalues are real. In this case A is self-adjoint and all its eigenvalues are also real; in fact, the eigenvalues of A and A are identical. Thus, there exist unique real scalars λ_1, ..., λ_p, p ≤ n, such that

det (A - λI) = det (A - λI) = (λ_1 - λ)^{m_1}(λ_2 - λ)^{m_2} ... (λ_p - λ)^{m_p}.  (4.10.29)

We summarize these observations in the following:

4.10.30. Corollary. Let A ∈ L(X, X). If A is self-adjoint, then all eigenvalues of A are real and there exist unique real numbers λ_1, ..., λ_p, p ≤ n, such that Eq. (4.10.29) holds.

As in Section 4.5, we say that in Corollary 4.10.30 the eigenvalues λ_i, i = 1, ..., p ≤ n, have algebraic multiplicities m_i, i = 1, ..., p, respectively.

Another direct consequence of Theorem 4.10.28 is the following result.

4.10.31. Corollary. Let A ∈ L(X, X). If A is self-adjoint, then A has at least one eigenvalue.

4.10.32. Exercise. Prove Corollary 4.10.31.
Let us now examine some of the properties of the eigenvalues and eigenvectors of self-adjoint linear transformations. First, we have:

4.10.33. Theorem. Let A ∈ L(X, X) be a self-adjoint transformation, and let λ_1, ..., λ_p, p ≤ n, denote the distinct eigenvalues of A. If x_i is an eigenvector for λ_i, and if x_j is an eigenvector for λ_j, then x_i ⊥ x_j for all i ≠ j.

Proof. Assume that λ_i ≠ λ_j, and consider Ax_i = λ_i x_i and Ax_j = λ_j x_j, where x_i ≠ 0 and x_j ≠ 0. We have

λ_i(x_i, x_j) = (λ_i x_i, x_j) = (Ax_i, x_j) = (x_i, Ax_j) = (x_i, λ_j x_j) = λ_j(x_i, x_j).

Thus,

(λ_i - λ_j)(x_i, x_j) = 0.

Since λ_i ≠ λ_j, we have (x_i, x_j) = 0, which means x_i ⊥ x_j.
Now let A ∈ L(X, X), and let λ_i be an eigenvalue of A. Recall that N_i denotes the null space of the linear transformation A - λ_i I, i.e.,

N_i = {x ∈ X : (A - λ_i I)x = 0}.  (4.10.34)

Recall also that N_i is a linear subspace of X. From Theorem 4.10.33 we now have immediately:

4.10.35. Corollary. Let A ∈ L(X, X) be a self-adjoint transformation, and let λ_i and λ_j be eigenvalues of A. If λ_i ≠ λ_j, then N_i ⊥ N_j.

4.10.36. Exercise. Prove Corollary 4.10.35.
Making use of Theorem 4.9.59, we now prove the following important result.

4.10.37. Theorem. Let A ∈ L(X, X) be a self-adjoint transformation, and let λ_1, ..., λ_p, p ≤ n, denote the distinct eigenvalues of A. Then

dim X = n = dim N_1 + dim N_2 + ... + dim N_p.

Proof. Let dim N_i = n_i, and let {e_1, ..., e_{n_1}} be an orthonormal basis for N_1. Next, let {e_{n_1+1}, ..., e_{n_1+n_2}} be an orthonormal basis for N_2. We continue in this manner, finally letting {e_{n_1+...+n_{p-1}+1}, ..., e_{n_1+...+n_p}} be an orthonormal basis for N_p. Let n_1 + ... + n_p = m. Since N_i ⊥ N_j, i ≠ j, it follows that the vectors e_1, ..., e_m, relabeled in an obvious way, are orthonormal in X. We can conclude, by Corollary 4.9.52, that these vectors are a basis for X if we can prove that m = n.

Let Y be the linear subspace of X generated by the orthonormal vectors e_1, ..., e_m. Then {e_1, ..., e_m} is an orthonormal basis for Y and dim Y = m. Since dim Y + dim Y^⊥ = dim X = n (see Theorem 4.9.59), we need only prove that dim Y^⊥ = 0. To this end let x be an arbitrary vector in Y^⊥. Then (x, e_1) = 0, ..., (x, e_m) = 0; i.e., x ⊥ e_1, ..., x ⊥ e_m, by Theorem 4.9.59. So, in particular, again by Theorem 4.9.59, we have x ⊥ N_i, i = 1, ..., p.

Now let y be in N_i. Then

(Ax, y) = (x, Ay) = (x, λ_i y) = λ_i(x, y) = 0,

since A is self-adjoint, since y is in N_i, and since x ⊥ N_i. Thus, Ax ⊥ N_i for i = 1, ..., p, and again by Theorem 4.9.59, Ax ⊥ e_i, i = 1, ..., m. Thus, by Theorem 4.9.59, Ax ∈ Y^⊥. Therefore, for each x ∈ Y^⊥ we also have Ax ∈ Y^⊥. Hence, A induces a linear transformation, say A', from Y^⊥ into Y^⊥, where A'x = Ax for all x ∈ Y^⊥. Now A' is a self-adjoint linear transformation from Y^⊥ into Y^⊥, because for all x and y in Y^⊥ we have

(A'x, y) = (Ax, y) = (x, Ay) = (x, A'y).

Assume now that dim Y^⊥ > 0. Then by Corollary 4.10.31, A' has an eigenvalue, say λ_0, and a corresponding eigenvector x_0 ≠ 0. Thus, x_0 ≠ 0 is in Y^⊥ and A'x_0 = Ax_0 = λ_0 x_0; i.e., x_0 is also an eigenvector of A, say with λ_0 = λ_i. So now it follows that x_0 ∈ N_i. But from above, x_0 ∈ Y^⊥, which means x_0 ⊥ N_i. This implies that x_0 ⊥ x_0, or (x_0, x_0) = 0, which in turn implies that x_0 = 0. But this contradicts our earlier assumption that x_0 ≠ 0. Hence, we have arrived at a contradiction, and it therefore follows that dim Y^⊥ = 0. This proves the theorem.
Our next result is a direct consequence of Theorem 4.10.37.

4.10.38. Corollary. Let A ∈ L(X, X). If A is self-adjoint, then

(i) there exists an orthonormal basis in X such that the matrix of A with respect to this basis is diagonal; and
(ii) for each eigenvalue λ_i of A we have dim N_i = multiplicity of λ_i.

Proof. As in the proof of Theorem 4.10.37 we choose an orthonormal basis {e_1, ..., e_m}, where m = n. We have Ae_1 = λ_1 e_1, ..., Ae_{n_1} = λ_1 e_{n_1}, Ae_{n_1+1} = λ_2 e_{n_1+1}, ..., Ae_n = λ_p e_n. Thus, the matrix A of A with respect to {e_1, ..., e_n} is the block diagonal matrix

A = diag(λ_1 I_{n_1}, λ_2 I_{n_2}, ..., λ_p I_{n_p}),

where I_{n_i} denotes the (n_i × n_i) identity matrix; i.e., λ_i appears n_i times on the diagonal.

To prove the second part, we note that the characteristic polynomial of A is

det (A - λI) = det (A - λI) = (λ_1 - λ)^{n_1}(λ_2 - λ)^{n_2} ... (λ_p - λ)^{n_p},

and, hence, n_i = dim N_i = multiplicity of λ_i, i = 1, ..., p.
Another consequence of Theorem 4.10.37 is the following:

4.10.39. Corollary. Let A be a real (n × n) symmetric matrix. Then there exists an orthogonal matrix P such that the matrix A' defined by

A' = P^{-1}AP = P^T AP

is diagonal.

4.10.40. Exercise. Prove Corollary 4.10.39.
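Corollary 4.10.39 is, in effect, what numerical eigensolvers for symmetric matrices compute; a NumPy sketch, where eigh returns the orthogonal matrix P directly:

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2.0          # a real symmetric matrix

lam, P = np.linalg.eigh(A)   # columns of P are orthonormal eigenvectors

assert np.allclose(P.T @ P, np.eye(4))          # P is orthogonal
assert np.allclose(P.T @ A @ P, np.diag(lam))   # P^T A P is diagonal
```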
For symmetric bilinear functionals defined on Euclidean vector spaces we have the following result.

4.10.41. Corollary. Let f(x, y) be a symmetric bilinear functional on X. Then there exists an orthonormal basis for X such that the matrix of f with respect to this basis is diagonal.

Proof. By Theorem 4.10.12 there exists an A ∈ L(X, X) such that f(x, y) = (Ax, y) for all x, y ∈ X. Since f is symmetric, f(y, x) = f(x, y); i.e., (Ay, x) = (Ax, y) = (x, Ay) for all x, y ∈ X, and thus, by Theorem 4.10.21, A is self-adjoint. Hence, by Corollary 4.10.38, there is an orthonormal basis for X such that the matrix of A is diagonal. By Theorem 4.10.12, this matrix is also the representation of f with respect to the same basis.
The proof of the next result is left as an exercise.

4.10.42. Corollary. Let f(x) be a quadratic form defined on X. Then there exists an orthonormal basis for X such that if xᵀ = (ξ₁, ..., ξ_n) is the coordinate representation of x with respect to this basis, then

f(x) = λ₁ξ₁² + ··· + λ_nξ_n²

for some real scalars λ₁, ..., λ_n.

4.10.43. Exercise. Prove Corollary 4.10.42.
Next, we state and prove the spectral theorem for self-adjoint linear transformations. First, we recall that a transformation P ∈ L(X, X) is a projection on a linear subspace of X if and only if P² = P (see Theorem 3.7.4). Also, for any projection P, X = R(P) ⊕ N(P), where R(P) is the range of P and N(P) is the null space of P (see Eq. (3.7.8)). Furthermore, recall that a projection P is called an orthogonal projection if R(P) ⊥ N(P) (see Definition 3.7.16).
4.10.44. Theorem. Let A ∈ L(X, X) be a self-adjoint transformation, let λ₁, ..., λₚ denote the distinct eigenvalues of A, and let Nᵢ be the null space of A − λᵢI (see Eq. (4.10.34)). For each i = 1, ..., p, let Pᵢ denote the projection on Nᵢ along Nᵢ⊥. Then

(i) Pᵢ is an orthogonal projection for each i = 1, ..., p;
(ii) PᵢPⱼ = 0 for i ≠ j, i, j = 1, ..., p;
(iii) P₁ + ··· + Pₚ = I, where I ∈ L(X, X) denotes the identity transformation; and
(iv) A = λ₁P₁ + ··· + λₚPₚ.

Proof. To prove the first part, note that X = Nᵢ ⊕ Nᵢ⊥, i = 1, ..., p, by Theorem 4.9.59. Thus, by Theorem 3.7.3, R(Pᵢ) = Nᵢ and N(Pᵢ) = Nᵢ⊥, and hence, Pᵢ is an orthogonal projection.

To prove the second part, let i ≠ j and let x ∈ X. Then Pⱼx ∈ Nⱼ. Since R(Pᵢ) = Nᵢ and since Nᵢ ⊥ Nⱼ, we must have Pⱼx ∈ N(Pᵢ); i.e., PᵢPⱼx = 0 for all x ∈ X.

To prove the third part, let P = P₁ + ··· + Pₚ. We must show that P = I. To do so, we first show that P is a projection. This follows immediately from the fact that for arbitrary x ∈ X, P²x = (P₁ + ··· + Pₚ)(P₁x + ··· + Pₚx) = P₁²x + ··· + Pₚ²x, because PᵢPⱼ = 0 for i ≠ j. Hence, P²x = (P₁ + ··· + Pₚ)x = Px, and thus P is a projection. Next, we show that dim R(P) = n. It is straightforward to show that

dim R(P) = dim N₁ + ··· + dim Nₚ.

But by Theorem 4.10.37, dim N₁ + ··· + dim Nₚ = n, and thus dim R(P) = n. Since X = R(P) ⊕ N(P), we conclude that R(P) = X. Finally, since P is a projection with range X, we conclude that Px = x for all x ∈ X, i.e., P = I.

To prove the last part of the theorem, let x ∈ X. From part (iii) we have

x = P₁x + P₂x + ··· + Pₚx.

Let xᵢ = Pᵢx for i = 1, ..., p. Then xᵢ ∈ Nᵢ and Axᵢ = λᵢxᵢ. Hence,

Ax = A(x₁ + ··· + xₚ) = Ax₁ + ··· + Axₚ = λ₁x₁ + ··· + λₚxₚ
   = λ₁P₁x + ··· + λₚPₚx = (λ₁P₁ + ··· + λₚPₚ)x,

which concludes the proof of the theorem.
Any set of linear transformations {P₁, ..., Pₚ} satisfying parts (i)-(iii) of Theorem 4.10.44 is said to be a resolution of the identity in the setting of a Euclidean space. We shall give a more general definition of this concept in Chapter 7.
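As a numerical illustration of the spectral theorem (a hedged sketch, not from the text; the 2 × 2 symmetric matrix and its eigenpairs are an arbitrary example), the projections Pᵢ = vᵢvᵢᵀ onto the normalized eigenvectors satisfy P₁ + P₂ = I, P₁P₂ = 0, and A = λ₁P₁ + λ₂P₂:

```python
import math

# Symmetric example A with eigenvalues 3 and 1 and orthonormal
# eigenvectors v1 = (1, 1)/sqrt(2), v2 = (1, -1)/sqrt(2).
A = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
eigs = [(3.0, (s, s)), (1.0, (s, -s))]

def outer(v):
    # orthogonal projection onto span{v} for a unit vector v: P = v v^T
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]

def madd(x, y):
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]

def mscale(c, x):
    return [[c * x[i][j] for j in range(2)] for i in range(2)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

P = [outer(v) for _, v in eigs]
resolution = madd(P[0], P[1])                                  # P1 + P2 = I
spectral = madd(mscale(eigs[0][0], P[0]), mscale(eigs[1][0], P[1]))  # sum of li Pi
cross = matmul(P[0], P[1])                                     # P1 P2 = 0
```

The three computed matrices reproduce parts (ii)-(iv) of the theorem to rounding error.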
D. Some Examples

At this point it is appropriate to consider some specific cases.
4.10.45. Example. Let X = E², let A ∈ L(X, X), and let {e₁, e₂} be an arbitrary basis for X. Suppose that

A = | a₁₁  a₁₂ |
    | a₂₁  a₂₂ |

is the matrix of A with respect to the basis {e₁, e₂}, and let xᵀ = (ξ₁, ξ₂) denote the coordinate representation of x with respect to this basis. Then Ax is the coordinate representation of Ax with respect to this basis, and we have

Ax = | a₁₁ξ₁ + a₁₂ξ₂ |  =  | η₁ |
     | a₂₁ξ₁ + a₂₂ξ₂ |     | η₂ |.

This transformation is depicted pictorially in Figure F.

Now assume that A is a self-adjoint linear transformation. Then there exists an orthonormal basis {e₁', e₂'} such that

Ae₁' = λ₁e₁',  Ae₂' = λ₂e₂',

4.10.46. Figure F. (Action of A on the coordinates of x.)

4.10.47. Figure G. (Stretching of coordinates along e₁' and e₂'.)
where λ₁ and λ₂ denote the eigenvalues of A. Suppose that the coordinates of x with respect to {e₁', e₂'} are ξ₁' and ξ₂', respectively. Then

Ax = A(ξ₁'e₁' + ξ₂'e₂') = ξ₁'Ae₁' + ξ₂'Ae₂' = λ₁ξ₁'e₁' + λ₂ξ₂'e₂';

i.e., the coordinate representation of Ax with respect to {e₁', e₂'} is (λ₁ξ₁', λ₂ξ₂')ᵀ. Thus, in order to determine Ax, we merely "stretch" or "compress" the coordinates ξ₁', ξ₂' along lines collinear with e₁' and e₂', respectively. This is illustrated in Figure G.
4.10.48. Example. Consider a transformation R from E² into E² which rotates vectors as shown in Figure H. By inspection we can characterize R, with respect to the indicated orthonormal basis {e₁, e₂}, as

Re₁ = cos θ e₁ + sin θ e₂,
Re₂ = −sin θ e₁ + cos θ e₂.

4.10.49. Figure H. (Rotation of the unit circle through the angle θ.)

The reader can readily verify that R is indeed a linear transformation. The matrix of R with respect to this basis is

R_θ = | cos θ  −sin θ |
      | sin θ   cos θ |.

By direct computation we can verify that

R_θᵀ = R_θ⁻¹ = |  cos θ  sin θ |
               | −sin θ  cos θ |,

and, moreover, that

det R_θ = cos²θ + sin²θ = 1.

Thus, R is indeed a rotation as defined in Definition 4.10.11. For the matrix R_θ we also note that R₀ = I, R_θ⁻¹ = R_{−θ}, and R_θR_φ = R_{θ+φ}.

4.10.50. Example. Consider now a transformation A from E³ into E³, as depicted in Figure 4.10.51. The vectors e₁, e₂, e₃ form an orthonormal basis for E³. The plane Z is spanned by e₁ and e₂, and the set Y lies along e₃. This transformation accomplishes a rotation about the vector e₃ in the plane Z.

4.10.51. Figure. (Rotation about e₃; the plane Z is spanned by e₁ and e₂, and the set Y, at 90° to Z, lies along e₃.)

By inspection of the figure it is clear that this transformation is characterized by the set of equations

Ae₁ = cos θ e₁ + sin θ e₂ + 0·e₃,
Ae₂ = −sin θ e₁ + cos θ e₂ + 0·e₃,
Ae₃ = 0·e₁ + 0·e₂ + 1·e₃.

The reader can readily verify that A is a linear transformation. The matrix of A with respect to the basis {e₁, e₂, e₃} is

A = | cos θ  −sin θ  0 |
    | sin θ   cos θ  0 |
    |   0       0    1 |.

For this transformation the following facts are immediately evident (assume sin θ ≠ 0): (a) e₃ is an eigenvector with eigenvalue 1; (b) plane Z is a linear subspace of E³; (c) Ax ∈ Z whenever x ∈ Z; (d) the set Y is a linear subspace of E³; (e) Ax ∈ Y whenever x ∈ Y; (f) Z ⊥ Y; and (g) dim Y = 1, dim Z = 2, and dim Y + dim Z = dim E³.
E. Further Properties of Orthogonal Transformations

The preceding example motivates several of our subsequent results. Let A ∈ L(X, X). We recall that a linear subspace Y of X is invariant under A if Ax ∈ Y whenever x ∈ Y. We now prove the following:
4.10.52. Theorem. Let A ∈ L(X, X) be an orthogonal transformation. Then

(i) the only possible real eigenvalues of A, if there are any, are +1 and −1;
(ii) if Y is a linear subspace of X which is invariant under A, then the restriction A' of A to Y is an orthogonal transformation from Y into Y; and
(iii) if Y is a linear subspace of X which is invariant under A, then Y⊥ is also a linear subspace of X which is invariant under A.

Proof. To prove the first part, assume that A has a real eigenvalue, say λ₀. (The definition of eigenvalue of A ∈ L(X, X) excludes the possibility of complex eigenvalues, since X is a vector space over the field R of real numbers.) Then Ax = λ₀x for some x ≠ 0 and

|Ax| = |λ₀x| = |λ₀| |x| ≠ 0.

But |Ax| = |x|, because A is by assumption an orthogonal linear transformation. Therefore, |λ₀| = 1, and we have λ₀ = +1 or −1.

To prove the second part, assume that Y is invariant under A. Then Ay ∈ Y whenever y ∈ Y, and thus the restriction A' of A to Y, defined by

A'y = Ay

for all y in Y, is clearly a linear transformation of Y into Y. Now, trivially, for all y in Y we have

|A'y| = |Ay| = |y|,

since A ∈ L(X, X) is an orthogonal transformation. Therefore, A' is an orthogonal transformation from Y into Y.

To prove the last part, let Y be an invariant subspace of X under A. Then x ∈ Y⊥ if and only if x ⊥ y for all y ∈ Y. Suppose then that x ∈ Y⊥ and consider Ax. Then for each y ∈ Y we have

(Ax, y) = (x, A*y) = (x, A⁻¹y),

because A is orthogonal. But A⁻¹y is also in Y, for the following reasons. The restriction A' of A to Y is orthogonal on Y by part (ii) and is therefore a nonsingular transformation from Y into Y. Hence, (A')⁻¹ exists and, moreover, (A')⁻¹ must be a transformation from Y into Y. Thus, (A')⁻¹y = A⁻¹y and A⁻¹y is in Y. We finally have

(Ax, y) = (x, A⁻¹y) = 0

for each y in Y. Thus, Ax ∈ Y⊥ whenever x ∈ Y⊥. This proves that Y⊥ is invariant under A.
We also have (the subspace labels Y₁ and Y₂ are supplied here, the originals having been lost):

4.10.53. Theorem. Let A ∈ L(X, X) be an orthogonal transformation, let Y₁ denote the set of all x ∈ X such that Ax = x, and let Y₂ denote the set of all x ∈ X such that Ax = −x. Then Y₁ and Y₂ are linear subspaces of X and Y₁ ⊥ Y₂.

Proof. Since Y₁ = N(A − I) and Y₂ = N(A + I), it follows that Y₁ and Y₂ are linear subspaces of X. Now let x ∈ Y₁ and let y ∈ Y₂. Then

(x, y) = (Ax, Ay) = (x, −y) = −(x, y),

which implies that (x, y) = 0. Therefore, x ⊥ y and Y₁ ⊥ Y₂.
Using the above theorem we now can prove the following result.

4.10.54. Corollary. Let A, Y₁, and Y₂ be defined as in Theorem 4.10.53, and let Z denote the set of all x ∈ X such that x ⊥ Y₁ and x ⊥ Y₂. Then Z is a linear subspace of X and

dim Y₁ + dim Y₂ + dim Z = dim X = n.

Furthermore, the restriction of A to Z has no (real) eigenvalues.

Proof. Let {e₁, ..., e_{n₁}} be an orthonormal basis for Y₁ and let {e_{n₁+1}, ..., e_{n₁+n₂}} be an orthonormal basis for Y₂, where dim Y₁ = n₁ and dim Y₂ = n₂. Then the set {e₁, ..., e_{n₁+n₂}} is orthonormal. Let Y denote the linear subspace generated by e₁, ..., e_{n₁+n₂}. Then dim Y = n₁ + n₂. By the definition of Z and by Theorem 4.9.59 we have Z = Y⊥, and thus Z is a linear subspace of X. Therefore,

n = dim X = dim Y + dim Y⊥ = n₁ + n₂ + dim Z = dim Y₁ + dim Y₂ + dim Z,

which was to be shown.
To prove the second assertion, let A₀ denote the restriction of A to Z. Suppose there exists a nonzero vector x ∈ Z such that A₀x = λ₀x. Since A₀ is orthogonal by part (ii) of Theorem 4.10.52, we have λ₀ = ±1 by part (i) of Theorem 4.10.52. Thus, x is either in Y₁ or in Y₂. But by assumption, x ∈ Z, and Z ⊥ Y₁ and Z ⊥ Y₂. Therefore, x = 0, a contradiction to our earlier assumption. Hence, the restriction A₀ of A to Z cannot have a real eigenvalue.
Our next result is concerned with orthogonal transformations on two-dimensional Euclidean spaces.

4.10.55. Theorem. Let A ∈ L(X, X) be an orthogonal transformation, where dim X = 2.

(i) If det A = +1 (i.e., A is a rotation), there exists some real θ such that for every orthonormal basis {e₁, e₂} the corresponding matrix of A is

R_θ = | cos θ  −sin θ |
      | sin θ   cos θ |.    (4.10.56)

(ii) If det A = −1 (i.e., A is a reflection), there exists some orthonormal basis {e₁, e₂} such that the matrix of A with respect to this basis is

Q = | 1   0 |
    | 0  −1 |.    (4.10.57)

Proof. To prove the first part assume that det A = +1 and choose an arbitrary orthonormal basis {e₁, e₂}. Let

A = | a₁₁  a₁₂ |
    | a₂₁  a₂₂ |

denote the matrix of A with respect to this basis. Then, since A is orthogonal, so is A, and we have

AᵀA = I    (4.10.58)

and

det A = +1.    (4.10.59)

Solving Eqs. (4.10.58) and (4.10.59) (we leave the details to the reader) yields a₁₁ = cos θ, a₁₂ = −sin θ, a₂₁ = sin θ, and a₂₂ = cos θ.
To prove the second part assume that A is orthogonal and that det A = −1. Consider the characteristic polynomial of A,

p(λ) = λ² + α₁λ + α₀.

Since det A = −1 we have α₀ = −1. Solving for λ₁ and λ₂ we have

λ₁, λ₂ = [−α₁ ± √(α₁² + 4)]/2,

which implies that both λ₁ and λ₂ are real and that λ₁ ≠ λ₂. From Theorem 4.10.52 these eigenvalues are +1 and −1. Therefore, there exists an orthonormal basis such that the matrix of A with respect to this basis is

A' = | λ₁   0 |  =  | 1   0 |
     |  0  λ₂ |     | 0  −1 |.
In the above proof we have e₁ ∈ Y₁ and e₂ ∈ Y₂, in view of Theorem 4.10.53. Also, from the preceding theorem it is clear that if A is orthogonal and (a) if det A = +1, then det(A − λI) = 1 − 2(cos θ)λ + λ², and (b) if det A = −1, then det(A − λI) = λ² − 1.
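The sign of det A therefore classifies a 2 × 2 orthogonal matrix completely. The sketch below (illustrative only, with arbitrarily chosen angles) recovers θ from a rotation and confirms that a reflection has characteristic polynomial λ² − 1, hence eigenvalues +1 and −1:

```python
import math

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def classify(m):
    # m is assumed to be a 2x2 orthogonal matrix
    if det2(m) > 0:
        # rotation: m = R_theta, so theta is recovered from the first column
        return ("rotation", math.atan2(m[1][0], m[0][0]))
    # reflection: characteristic polynomial is l^2 - 1
    return ("reflection", None)

c, s = math.cos(0.6), math.sin(0.6)
kind_r, theta = classify([[c, -s], [s, c]])       # a rotation through 0.6 rad
refl = [[c, s], [s, -c]]                          # a reflection (det = -1)
kind_q, _ = classify(refl)
tr, d = refl[0][0] + refl[1][1], det2(refl)       # char. poly l^2 - tr*l + d
```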
4.10.60. Theorem. Let A ∈ L(X, X) be an orthogonal transformation having no (real) eigenvalues. Then there exist linear subspaces X₁, ..., X_r of X such that

(i) dim Xᵢ = 2, i = 1, ..., r;
(ii) Xᵢ ⊥ Xⱼ for all i ≠ j;
(iii) dim X₁ + ··· + dim X_r = dim X = n; and
(iv) each subspace Xᵢ is invariant under A; in fact, the restriction of A to Xᵢ is a non-trivial rotation (i.e., for the matrix given by Eq. (4.10.56) we have θ ≠ kπ, k = 0, ±1, ±2, ...).

Proof. Since by assumption A does not have any (real) eigenvalues, we have

det(A − λI) = (α₁ + β₁λ + λ²) ··· (α_r + β_rλ + λ²),

where the αᵢ, βᵢ, i = 1, ..., r, are real (i.e., det(A − λI) does not have any linear factors (λᵢ − λ) with λᵢ real). Solving the first quadratic factor we have

λ₁ = [−β₁ + √(β₁² − 4α₁)]/2  and  λ₂ = [−β₁ − √(β₁² − 4α₁)]/2,

where λ₁ and λ₂ are complex. By Theorem 4.5.33, part (iv), if f(λ) is any polynomial function, then f(λ₁) will be an eigenvalue of f(A). In particular, if f(λ) = α₁ + β₁λ + λ², we know that one of the eigenvalues of the linear transformation α₁I + β₁A + A² will be α₁ + β₁λ₁ + λ₁² = 0, by choice. Thus, the linear transformation (α₁I + β₁A + A²) has 0 as an eigenvalue. Therefore, there exists a vector f₁ ≠ 0 in X such that

(α₁I + β₁A + A²)f₁ = 0.    (4.10.61)
Now let f₂ = Af₁. We assert that f₁ and f₂ are linearly independent. For if they were not, we would have f₂ = ηf₁ = Af₁, where η is a real scalar, and f₁ would be an eigenvector corresponding to a real eigenvalue η of A, which is impossible by hypothesis. Next, let X₁ be the linear subspace of X generated by f₁ and f₂. Then X₁ is two-dimensional. We now show that X₁ is invariant under A. Let x ∈ X₁. Then

x = ξ₁f₁ + ξ₂f₂

for some ξ₁ and ξ₂, and

Ax = ξ₁Af₁ + ξ₂Af₂ = ξ₁Af₁ + ξ₂A²f₁.

But from Eq. (4.10.61) it follows that

A²f₁ = −α₁f₁ − β₁Af₁.

Thus,

Ax = ξ₁Af₁ + ξ₂(−α₁f₁ − β₁Af₁) = −ξ₂α₁f₁ + (ξ₁ − ξ₂β₁)Af₁
   = −ξ₂α₁f₁ + (ξ₁ − ξ₂β₁)f₂,

which shows that Ax ∈ X₁ whenever x ∈ X₁. Thus, X₁ is invariant under A.

By Theorem 4.10.52, the restriction A' of A to X₁ is an orthogonal transformation from X₁ into X₁. This restriction cannot have any (real) eigenvalues, for then A would also have (real) eigenvalues. From Theorem 4.10.55, A' cannot be a reflection, for in that case A' would have eigenvalues equal to +1 and −1. Moreover, A' cannot be a trivial rotation, for then the eigenvalues of A' would be equal to +1 if θ = 0° and −1 if θ = 180°. But from Corollary 4.10.8 we know that if A is orthogonal, then det A = ±1. Therefore, it follows now from Theorem 4.10.55 that the restriction of A to X₁ is a non-trivial rotation.
Now let Z₁ = X₁⊥. Since X₁ is invariant under A, so is Z₁, by Theorem 4.10.52, part (iii), and dim Z₁ = dim X − 2. The restriction A₁ of A to Z₁ is an orthogonal transformation from Z₁ into Z₁, and it cannot have any (real) eigenvalues. Applying the argument already given for A and X now to A₁ and Z₁, we can conclude that there exists a two-dimensional linear subspace X₂ of Z₁ such that the restriction of A₁ to X₂ is a non-trivial rotation. Now since X₂ is contained in Z₁ and since by definition Z₁ = X₁⊥, we have X₁ ⊥ X₂.

Next, let Z₂ be the linear subspace which is orthogonal to both X₁ and X₂, and let A₂ be the restriction of A to Z₂. Repeating the argument given thus far, we can conclude that there exists a two-dimensional linear subspace X₃ of Z₂ such that the restriction of A₂ to X₃ is a non-trivial rotation and such that X₂ ⊥ X₃ and X₁ ⊥ X₃.

To conclude the proof of the theorem, we continue the above process until we have exhausted the original space X.

Combining Theorems 4.10.53 and 4.10.60, we obtain the following:
4.10.62. Corollary. Let A ∈ L(X, X) be an orthogonal linear transformation. Then there exist linear subspaces Y₁, Y₂, X₁, ..., X_r of X such that

(i) all of the above linear subspaces are orthogonal to one another;
(ii) n = dim X = dim Y₁ + dim Y₂ + dim X₁ + ··· + dim X_r;
(iii) x ∈ Y₁ if and only if Ax = x;
(iv) x ∈ Y₂ if and only if Ax = −x; and
(v) the restriction of A to each Xᵢ, i = 1, ..., r, is a non-trivial rotation.

Since in the above corollary the dimension of each Xᵢ, i = 1, ..., r, is two, we have the following additional result.

4.10.63. Corollary. If in Corollary 4.10.62 dim X is odd, then A has a real eigenvalue.
We leave the proof of the next result as an exercise.

4.10.64. Theorem. If A is an orthogonal transformation from X into X, then the characteristic polynomial of A is of the form

det(A − λI) = (1 − λ)^{n₁}(−1 − λ)^{n₂}(1 − 2 cos θ₁ λ + λ²) ··· (1 − 2 cos θ_r λ + λ²).

Moreover, there exists an orthonormal basis {e₁, ..., e_n} of X such that the matrix A' of A with respect to this basis is the block diagonal matrix

A' = diag(R_{θ₁}, ..., R_{θ_r}, 1, ..., 1, −1, ..., −1),

where each R_{θᵢ} is a (2 × 2) rotation block of the form (4.10.56), the entry +1 appears n₁ times, the entry −1 appears n₂ times, and all remaining entries are zero.

4.10.65. Exercise. Prove Theorem 4.10.64.
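To make the canonical form concrete, the sketch below (an illustration with arbitrarily chosen angles, not taken from the text) assembles such a block diagonal matrix and verifies that it is orthogonal:

```python
import math

def canonical_orthogonal(thetas, n_plus, n_minus):
    # block diagonal matrix diag(R_theta_1, ..., R_theta_r, 1, ..., 1, -1, ..., -1)
    n = 2 * len(thetas) + n_plus + n_minus
    m = [[0.0] * n for _ in range(n)]
    for i, th in enumerate(thetas):
        c, s = math.cos(th), math.sin(th)
        k = 2 * i
        m[k][k], m[k][k + 1] = c, -s
        m[k + 1][k], m[k + 1][k + 1] = s, c
    for j in range(n_plus):
        m[2 * len(thetas) + j][2 * len(thetas) + j] = 1.0
    for j in range(n_minus):
        k = 2 * len(thetas) + n_plus + j
        m[k][k] = -1.0
    return m

def gram(m):
    # A^T A; equals the identity exactly when A is orthogonal
    n = len(m)
    return [[sum(m[k][i] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

a = canonical_orthogonal([0.5, 2.0], 1, 1)   # two rotation blocks, one +1, one -1
g = gram(a)
```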
In our next result the canonical form of skew-adjoint linear transformations is established.

4.10.66. Theorem. Let A be a skew-adjoint linear transformation from X into X. Then there exists an orthonormal basis {e₁, ..., e_n} such that the matrix A' of A with respect to this basis consists of (2 × 2) diagonal blocks of the form

| 0   −νᵢ |
| νᵢ    0 |,  i = 1, ..., r,

with zeros elsewhere, where the νᵢ, i = 1, ..., r, are real and where some of the νᵢ may be zero.

4.10.67. Exercise. Prove Theorem 4.10.66.
Before closing the present section, we briefly introduce so-called "normal transformations." We will have quite a bit more to say about such transformations and their representation in Chapter 7.

4.10.68. Definition. A transformation A ∈ L(X, X) is said to be a normal linear transformation if A*A = AA*.

Some of the properties of such transformations are as follows.

4.10.69. Theorem. Let A ∈ L(X, X). Then

(i) if A is a self-adjoint transformation, then it is also a normal transformation;
(ii) if A is a skew-adjoint transformation, then it is also a normal transformation;
(iii) if A is an orthogonal transformation, then it is also a normal transformation; and
(iv) if A is a normal linear transformation, then there exists an orthonormal basis {e₁, ..., e_n} of X such that the matrix A' of A with respect to this basis consists of (2 × 2) diagonal blocks of the form

| βᵢ  −νᵢ |
| νᵢ   βᵢ |,  i = 1, ..., r,

along with real diagonal entries, with zeros elsewhere.

The proofs of parts (i)-(iii) follow from the definitions of normal, self-adjoint, skew-adjoint, and orthogonal linear transformations. To prove part (iv), let A = A₁ + A₂, where A₁ = ½(A + A*) and A₂ = ½(A − A*), and note that A₁ is self-adjoint and A₂ is skew-adjoint. This representation is unique by Corollary 4.10.25. Making use of Theorem 4.10.66 and Corollary 4.10.38, we obtain the desired result. We leave the details of the proof of this theorem as an exercise.

4.10.70. Exercise. Prove Theorem 4.10.69.
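Definition 4.10.68 and the splitting used in the proof of part (iv) can be checked numerically. The sketch below (illustrative only; the block entries β = 2, ν = 3 are arbitrary) verifies that a matrix of the block form above is normal and recovers its unique self-adjoint and skew-adjoint parts:

```python
def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(x):
    n = len(x)
    return [[x[j][i] for j in range(n)] for i in range(n)]

beta, nu = 2.0, 3.0
a = [[beta, -nu], [nu, beta]]

left = matmul(transpose(a), a)    # A* A  (A* = A^T in the real case)
right = matmul(a, transpose(a))   # A A*

# unique splitting A = A1 + A2 with A1 self-adjoint and A2 skew-adjoint
a1 = [[(a[i][j] + a[j][i]) / 2.0 for j in range(2)] for i in range(2)]
a2 = [[(a[i][j] - a[j][i]) / 2.0 for j in range(2)] for i in range(2)]
```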
4.11. APPLICATIONS TO ORDINARY DIFFERENTIAL EQUATIONS

In the present section we present applications of the material covered in the present chapter and the preceding chapter. Because of their importance in almost all branches of science and engineering, we consider some topics in ordinary differential equations. Specifically, we concern ourselves with initial-value problems described by ordinary differential equations. The present section is divided into two parts. In subsection A, we define the initial-value problem, while in subsection B we treat linear initial-value problems. At the end of the next chapter, we will continue our discussion of ordinary differential equations.
A. Initial-Value Problem: Definition

Let R denote the set of real numbers, and let D ⊂ R² be a domain (i.e., D is an open and connected subset of R²). We will call R² the (t, x) plane. Let f be a real-valued function which is defined and continuous on D, and let ẋ ≜ dx/dt (i.e., ẋ denotes the derivative of x with respect to t). We call

ẋ = f(t, x)    (4.11.1)

an ordinary differential equation of the first order. Let T = (t₁, t₂) ⊂ R be an open interval which we call a t interval (i.e., T = (t₁, t₂) = {t ∈ R: t₁ < t < t₂}). A real differentiable function φ (if it exists) defined on T such that the points (t, φ(t)) ∈ D for all t ∈ T and such that

φ̇(t) = f(t, φ(t))    (4.11.2)

for all t ∈ T is called a solution of the differential equation (4.11.1).

4.11.3. Definition. Let (τ, ξ) ∈ D. If φ is a solution of the differential equation (4.11.1) and if φ(τ) = ξ, then φ is called a solution of the initial-value problem

ẋ = f(t, x),
x(τ) = ξ.    (4.11.4)

In Figure 4.11.5 a typical solution of an initial-value problem is depicted.

4.11.5. Figure. Typical solution of an initial-value problem. (The t interval is T = (t₁, t₂); the slope of the line at (τ, φ(τ)) is m = f(τ, φ(τ)).)
We can represent the initial-value problem given in Eq. (4.11.4) equivalently by means of the integral equation

φ(t) = ξ + ∫_τ^t f(s, φ(s)) ds.    (4.11.6)

Here we say that two problems are equivalent if they have the same solution. To prove this equivalence, let φ be a solution of the initial-value problem (4.11.4). Then φ(τ) = ξ and

φ̇(t) = f(t, φ(t))

for all t ∈ T. Integrating from τ to t we have

∫_τ^t φ̇(s) ds = ∫_τ^t f(s, φ(s)) ds,

or

φ(t) − ξ = ∫_τ^t f(s, φ(s)) ds.

Thus, φ is a solution of the integral equation (4.11.6).

Conversely, let φ be a solution of the integral equation (4.11.6). Then φ(τ) = ξ, and differentiating both sides of Eq. (4.11.6) with respect to t we have

φ̇(t) = f(t, φ(t)),

and thus φ is a solution of the initial-value problem (4.11.4).
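The integral form (4.11.6) is the basis of the classical Picard successive-approximation scheme. The sketch below (an illustration, not part of the text; grid size and iteration count are arbitrary choices) applies it to ẋ = x, x(0) = 1, whose solution is e^t, approximating each integral by the trapezoidal rule:

```python
import math

def picard(f, tau, xi, t_end, n_grid=200, n_iter=30):
    # successive approximations phi_{k+1}(t) = xi + integral from tau to t of f(s, phi_k(s)) ds
    h = (t_end - tau) / n_grid
    ts = [tau + i * h for i in range(n_grid + 1)]
    phi = [xi] * (n_grid + 1)          # phi_0 is the constant function xi
    for _ in range(n_iter):
        new = [xi]
        acc = 0.0
        for i in range(1, n_grid + 1):
            # trapezoidal rule on [t_{i-1}, t_i]
            acc += 0.5 * h * (f(ts[i - 1], phi[i - 1]) + f(ts[i], phi[i]))
            new.append(xi + acc)
        phi = new
    return ts, phi

ts, phi = picard(lambda t, x: x, 0.0, 1.0, 1.0)
approx_e = phi[-1]     # should be close to e = exp(1)
```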
Next, we consider initial-value problems described by means of several first-order ordinary differential equations. Let D ⊂ R^{n+1} be a domain (i.e., D is an open and connected subset of R^{n+1}). We will call R^{n+1} the (t, x₁, ..., x_n) space. Let f₁, ..., f_n be n real-valued functions which are defined and continuous on D (i.e., fᵢ(t, x₁, ..., x_n), i = 1, ..., n, are defined for all points in D and are continuous with respect to all arguments t, x₁, ..., x_n). We call

ẋᵢ = fᵢ(t, x₁, ..., x_n),  i = 1, ..., n,    (4.11.7)

a system of n ordinary differential equations of the first order. A set of n real differentiable functions φ₁, ..., φ_n (if it exists) defined on a real t interval T = (t₁, t₂) ⊂ R such that the points (t, φ₁(t), ..., φ_n(t)) ∈ D for all t ∈ T and such that

φ̇ᵢ(t) = fᵢ(t, φ₁(t), ..., φ_n(t)),  i = 1, ..., n,    (4.11.8)

for all t ∈ T, is called a solution of the system of ordinary differential equations (4.11.7).

4.11.9. Definition. Let (τ, ξ₁, ..., ξ_n) ∈ D. If the set {φ₁, ..., φ_n} is a solution of the system of equations (4.11.7) and if (φ₁(τ), ..., φ_n(τ)) = (ξ₁, ..., ξ_n), then the set {φ₁, ..., φ_n} is called a solution of the initial-value problem

ẋᵢ = fᵢ(t, x₁, ..., x_n),  i = 1, ..., n,
xᵢ(τ) = ξᵢ,  i = 1, ..., n.    (4.11.10)
It is convenient to use vector notation to represent Eq. (4.11.10). Let

x = (x₁, ..., x_n)ᵀ,  f(t, x) = (f₁(t, x), ..., f_n(t, x))ᵀ,

and define ẋ = dx/dt componentwise; i.e., ẋ = (ẋ₁, ..., ẋ_n)ᵀ. We can express Eq. (4.11.10) equivalently as

ẋ = f(t, x),
x(τ) = ξ.    (4.11.11)

If in Eq. (4.11.11) f(t, x) does not depend on t (i.e., f(t, x) = f(x) for all (t, x) ∈ D), then we have

ẋ = f(x).    (4.11.12)

In this case we speak of an autonomous system of first-order ordinary differential equations.
Of special importance are systems of first-order ordinary differential equations described by

ẋ = A(t)x + v(t),    (4.11.13)

ẋ = A(t)x,    (4.11.14)

and

ẋ = Ax,    (4.11.15)

where x is a real n-vector, A(t) = [aᵢⱼ(t)] is a real (n × n) matrix with elements aᵢⱼ(t) that are defined and continuous on a t interval T, A = [aᵢⱼ] is an (n × n) matrix with real constant coefficients, and v(t) is a real n-vector with components vᵢ(t), i = 1, ..., n, which are defined and at least piecewise continuous on T. These equations are clearly a special case of Eq. (4.11.7). For example, if in Eq. (4.11.7) we let

fᵢ(t, x₁, ..., x_n) = fᵢ(t, x) = Σⱼ₌₁ⁿ aᵢⱼ(t)xⱼ,  i = 1, ..., n,

then Eq. (4.11.14) results. In the case of Eqs. (4.11.14) and (4.11.15), we speak of a linear homogeneous system of ordinary differential equations; in the case of Eq. (4.11.13) we have a linear non-homogeneous system of ordinary differential equations; and in the case of Eq. (4.11.15) we speak of a linear system of ordinary differential equations with constant coefficients.
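For the constant-coefficient case (4.11.15), the solution with x(0) = ξ is x(t) = e^{At}ξ (a fact established in standard ODE treatments, not yet proved here). The sketch below (illustrative only; the matrix and time are arbitrary choices) computes e^{At} by truncating its power series for A = [[0, 1], [−1, 0]], for which e^{At} is a rotation matrix:

```python
import math

def expm(a, t, terms=30):
    # truncated power series e^{At} = sum over k of (At)^k / k!
    n = len(a)
    at = [[a[i][j] * t for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]     # (At)^0 = I
    fact = 1.0
    for k in range(1, terms + 1):
        power = [[sum(power[i][m] * at[m][j] for m in range(n))
                  for j in range(n)] for i in range(n)]
        fact *= k
        for i in range(n):
            for j in range(n):
                result[i][j] += power[i][j] / fact
    return result

a = [[0.0, 1.0], [-1.0, 0.0]]
t = 0.8
e_at = expm(a, t)
# solution of x' = Ax with x(0) = (1, 0)^T is the first column of e^{At},
# which for this A is (cos t, -sin t)^T
x_t = [e_at[0][0], e_at[1][0]]
```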
Next, we consider initial-value problems described by means of nth-order ordinary differential equations. Let f be a real function which is defined and continuous in a domain D of the real (t, x₁, ..., x_n) space, and let x^(k) ≜ dᵏx/dtᵏ. We call

x^(n) = f(t, x, x^(1), ..., x^(n−1))    (4.11.16)

an nth-order ordinary differential equation. A real function φ (if it exists) which is defined on a t interval T = (t₁, t₂) ⊂ R and which has n derivatives on T is called a solution of Eq. (4.11.16) if (t, φ(t), ..., φ^(n−1)(t)) ∈ D for all t ∈ T and if

φ^(n)(t) = f(t, φ(t), ..., φ^(n−1)(t))    (4.11.17)

for all t ∈ T.

4.11.18. Definition. Let (τ, ξ₁, ..., ξ_n) ∈ D. If φ is a solution of Eq. (4.11.16) and if φ(τ) = ξ₁, ..., φ^(n−1)(τ) = ξ_n, then φ is called a solution of the initial-value problem

x^(n) = f(t, x, x^(1), ..., x^(n−1)),
x(τ) = ξ₁, ..., x^(n−1)(τ) = ξ_n.    (4.11.19)

Of particular interest are nth-order ordinary differential equations

a_n(t)x^(n) + a_{n−1}(t)x^(n−1) + ··· + a₁(t)x^(1) + a₀(t)x = v(t),    (4.11.20)

a_n(t)x^(n) + a_{n−1}(t)x^(n−1) + ··· + a₁(t)x^(1) + a₀(t)x = 0,    (4.11.21)

and

a_nx^(n) + a_{n−1}x^(n−1) + ··· + a₁x^(1) + a₀x = 0,    (4.11.22)

where a_n(t), ..., a₀(t) are real continuous functions defined on the interval T, where a_n(t) ≠ 0 for all t ∈ T, where a_n, ..., a₀ are real constants, where a_n ≠ 0, and where v(t) is a real function defined and piecewise continuous on T. We call Eq. (4.11.21) a linear homogeneous ordinary differential equation of order n, Eq. (4.11.20) a linear non-homogeneous ordinary differential equation of order n, and Eq. (4.11.22) a linear ordinary differential equation of order n with constant coefficients.
We now show that the theory of nth-order ordinary differential equations reduces to the theory of a system of n first-order ordinary differential equations. To this end, let in Eq. (4.11.19) x₁ = x and let

ẋ₁ = x₂ (= x^(1)),
ẋ₂ = x₃ (= x^(2)),
    ⋮
ẋ_{n−1} = x_n (= x^(n−1)),
ẋ_n = f(t, x₁, ..., x_n) (= x^(n)).    (4.11.23)

This system of equations is clearly defined for all (t, x₁, ..., x_n) ∈ D. Now assume that the vector φᵀ = (φ₁, ..., φ_n) is a solution of Eq. (4.11.23) on an interval T. Since φ₂ = φ̇₁, φ₃ = φ̇₂, ..., φ_n = φ₁^(n−1), and since

f(t, φ₁(t), ..., φ_n(t)) = f(t, φ₁(t), ..., φ₁^(n−1)(t)) = φ₁^(n)(t),

it follows that the first component φ₁ of the vector φ is a solution of Eq. (4.11.16) on the interval T. Conversely, assume that φ₁ is a solution of Eq. (4.11.16) on the interval T. Then the vector φᵀ = (φ₁, φ₁^(1), ..., φ₁^(n−1)) is clearly a solution of the system of equations (4.11.23). Note that if φ₁(τ) = ξ₁, ..., φ₁^(n−1)(τ) = ξ_n, then the vector φ satisfies φ(τ) = ξ, where ξᵀ = (ξ₁, ..., ξ_n). The converse is also true.
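The reduction (4.11.23) is routine to mechanize. The sketch below (illustrative only; the step count and integration scheme are arbitrary choices) converts x^(2) = −x, whose solution with x(0) = 1, x^(1)(0) = 0 is cos t, to a first-order system and integrates it with the explicit midpoint method:

```python
import math

def to_system(f):
    # companion form of x^(n) = f(t, x, x', ..., x^(n-1)):
    # y_i' = y_{i+1} for i < n, and y_n' = f(t, y)
    def g(t, y):
        return [y[i + 1] for i in range(len(y) - 1)] + [f(t, y)]
    return g

def integrate(g, tau, xi, t_end, steps=2000):
    # explicit midpoint (second-order Runge-Kutta) method
    h = (t_end - tau) / steps
    t, y = tau, list(xi)
    for _ in range(steps):
        k1 = g(t, y)
        mid = [y[i] + 0.5 * h * k1[i] for i in range(len(y))]
        k2 = g(t + 0.5 * h, mid)
        y = [y[i] + h * k2[i] for i in range(len(y))]
        t += h
    return y

# x'' = -x, x(0) = 1, x'(0) = 0; exact solution x(t) = cos t
g = to_system(lambda t, y: -y[0])
y_end = integrate(g, 0.0, [1.0, 0.0], 1.0)
```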
Thus far we have concerned ourselves with initial-value problems characterized by real ordinary differential equations. It is possible to consider initial-value problems involving complex ordinary differential equations. For example, let t be real and let zᵀ = (z₁, ..., z_n) be a complex vector (i.e., z_k is of the form u_k + iv_k, where u_k and v_k are real and i = √−1). Let D be a domain in the (t, z) space, and let f₁, ..., f_n be n continuous complex-valued functions defined on D. Let fᵀ = (f₁, ..., f_n), and let ż = dz/dt. We call

ż = f(t, z)    (4.11.24)

a system of n complex ordinary differential equations of the first order. A complex vector φᵀ = (φ₁, ..., φ_n) which is defined and differentiable on a real t interval T = (τ₁, τ₂) ⊂ R such that the points (t, φ₁(t), ..., φ_n(t)) ∈ D for all t ∈ T and such that

φ̇(t) = f(t, φ(t))

for all t ∈ T, is called a solution of the system of equations (4.11.24). If, in addition, (τ, ξ₁, ..., ξ_n) ∈ D and if (φ₁(τ), ..., φ_n(τ)) = (ξ₁, ..., ξ_n), then φ is said to be a solution of the initial-value problem

ż = f(t, z),
z(τ) = ξ.    (4.11.25)

Of particular interest in applications are initial-value problems characterized by complex linear ordinary differential equations having forms analogous to those given in equations (4.11.13)-(4.11.15). We can similarly consider initial-value problems described by complex nth-order ordinary differential equations.
Let us look now at some specific examples. The first example demonstrates that the solution to an initial-value problem may not be unique.

4.11.26. Example. Consider the initial-value problem

ẋ = x^{1/3},
x(0) = 0.

We can readily verify that this problem has infinitely many solutions passing through the origin of the (t, x) plane, given by

φ(t) = 0,  t₁ < t ≤ p,
φ(t) = [(2/3)(t − p)]^{3/2},  p < t < t₂,

where p is any real number such that p ≥ 0.
The next example shows that the t interval for which a solution to the initial-value problem exists may be restricted.

4.11.27. Example. Consider the initial-value problem

ẋ = x²,
x(t₁) = ξ,

where ξ is any real number. By direct computation we can verify that

φ(t) = ξ/[1 − ξ(t − t₁)]

is a solution of this problem. We note that if t = t₁ + 1/ξ, then the solution φ(t) is not defined. Thus, there is a restriction on the t interval for which a solution to the above problem exists. Namely, if ξ > 0, the above solution is valid over any interval (t₁, t₂) such that

t₂ − t₁ < 1/ξ.

In this case we say the solution fails to exist for t ≥ t₁ + 1/ξ. On the other hand, if ξ < 0, the solution given above is valid for any t > t₁, and we say the solution exists on any interval (t₁, t₂).

The preceding examples give rise to several important questions:

When does an initial-value problem possess a solution?
When is a solution unique?
What is the extent of the interval over which such a solution exists?
Is the solution continuously dependent on the initial condition ξ?

At the end of the next chapter we will state and prove results which give answers to these questions.
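The finite escape time in Example 4.11.27 is easy to observe numerically. The sketch below (illustrative only; t₁ = 0 and ξ = 2 are arbitrary choices) checks the closed-form solution of ẋ = x² against the equation by a finite-difference quotient and confirms the blow-up as t approaches t₁ + 1/ξ = 0.5:

```python
def phi(t, t1, xi):
    # closed-form solution of x' = x^2, x(t1) = xi
    return xi / (1.0 - xi * (t - t1))

t1, xi = 0.0, 2.0          # escape time is t1 + 1/xi = 0.5
h = 1e-6
t = 0.2
# centered-difference approximation of phi'(t); should match phi(t)^2
deriv = (phi(t + h, t1, xi) - phi(t - h, t1, xi)) / (2.0 * h)
residual = deriv - phi(t, t1, xi) ** 2
near_escape = phi(0.499999, t1, xi)   # grows without bound as t -> 0.5
```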
B. Initial-Value Problem: Linear Systems

In the remainder of the present section we concern ourselves exclusively with initial-value problems described by linear ordinary differential equations. Let again T = (t₁, t₂) be a real t interval, let xᵀ = (x₁, ..., x_n) denote an n-dimensional vector, let A = [aᵢⱼ] be a constant (n × n) matrix, let A(t) = [aᵢⱼ(t)] be an (n × n) matrix with elements aᵢⱼ(t) that are defined and continuous on the interval T, and let vᵀ(t) = (v₁(t), ..., v_n(t)) denote an n-vector with components vᵢ(t) that are defined and piecewise continuous on T. In the following we consider matrices and vectors with components which may be either real or complex-valued. In the former case the field for the x space is the field of real numbers, while in the latter case the field for the x space is the field of complex numbers. Also, let

D = {(t, x): t ∈ T, x ∈ Rⁿ (or Cⁿ)}.    (4.11.28)

At first we consider systems of ordinary differential equations given by

ẋ = A(t)x + v(t),    (4.11.29)

ẋ = A(t)x,    (4.11.30)

and

ẋ = Ax.    (4.11.31)

In the applications section of the next chapter we will show that, with the above assumptions, equations (4.11.29)-(4.11.31) possess unique solutions for every (τ, ξ) ∈ D which exist over the entire interval T = (t₁, t₂) and which depend continuously on the initial conditions. This is an extremely important result in applications, where we usually require that T = (−∞, ∞).
4.11.32. Theorem. The set S of all solutions of Eq. (4.11.30) on T forms an n-dimensional vector space.

Proof. Let φ₁ and φ₂ be solutions of Eq. (4.11.30), let F denote the field for the x space, and let α₁, α₂ ∈ F. Since

d/dt [α₁φ₁(t) + α₂φ₂(t)] = α₁φ̇₁(t) + α₂φ̇₂(t) = α₁A(t)φ₁(t) + α₂A(t)φ₂(t) = A(t)[α₁φ₁(t) + α₂φ₂(t)],

it follows that α₁φ₁ + α₂φ₂ ∈ S whenever φ₁, φ₂ ∈ S and whenever α₁, α₂ ∈ F. Furthermore, the trivial solution φ = 0 defined by φ(t) = 0 for all t ∈ T is clearly in S, and for every ψ ∈ S there exists a −ψ = (−1)ψ ∈ S such that ψ + (−ψ) = 0. It is now an easy matter to verify that all the axioms of a vector space are satisfied for S (we leave the details to the reader to verify).

Next, we must show that S is n-dimensional; i.e., we must find a set of solutions {φ₁, ..., φ_n} which is linearly independent and which spans S. Let {ξ₁, ..., ξ_n} be a set of linearly independent vectors in the n-dimensional x space. By the existence results which we will prove in the next chapter (and which we will accept here on faith), if τ ∈ T, there exist n solutions φ₁, ..., φ_n of Eq. (4.11.30) such that φᵢ(τ) = ξᵢ, i = 1, ..., n. We first show that these solutions are linearly independent. For purposes of contradiction, assume that these solutions are linearly dependent. Then there exist scalars α₁, ..., α_n ∈ F, not all zero, such that

α₁φ₁(t) + ··· + α_nφ_n(t) = 0

for all t ∈ T. Evaluating at t = τ, this implies that

α₁ξ₁ + ··· + α_nξ_n = 0.

But this last equation contradicts the assumption that the ξᵢ are linearly independent. Thus, the φᵢ, i = 1, ..., n, are linearly independent. Finally, to show that these solutions span S, let φ be any solution of Eq. (4.11.30) on T such that φ(τ) = ξ. Then there exist unique scalars α₁, ..., α_n ∈ F such that

ξ = α₁ξ₁ + ··· + α_nξ_n,

because the vectors ξᵢ, i = 1, ..., n, form a basis for the x space. It now follows that

ψ = α₁φ₁ + ··· + α_nφ_n

is a solution of Eq. (4.11.30) on T such that ψ(τ) = ξ. By the uniqueness results which we will prove in the next chapter (and which we accept here on faith), φ = ψ. Since φ was chosen arbitrarily, it follows that the solutions φᵢ, i = 1, ..., n, span S. This concludes the proof.
The above result motivates the following two definitions.

4.11.33. Definition. A set of $n$ linearly independent solutions of Eq. (4.11.30) on $T$ is called a fundamental set of solutions of (4.11.30). An $(n \times n)$ matrix $\Psi$ whose $n$ columns are linearly independent solutions of Eq. (4.11.30) on $T$ is called a fundamental matrix.

Thus, if $\psi_1, \ldots, \psi_n$ is a set of $n$ linearly independent solutions of Eq. (4.11.30) and if $\psi_i^T = (\psi_{1i}, \ldots, \psi_{ni})$, then

$$\Psi = \begin{bmatrix} \psi_{11} & \psi_{12} & \cdots & \psi_{1n} \\ \vdots & & & \vdots \\ \psi_{n1} & \psi_{n2} & \cdots & \psi_{nn} \end{bmatrix}$$

is a fundamental matrix.
4.11. Applications to Ordinary Differential Equations
In our next definition we employ the natural basis for the $x$-space, given by

$$u_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad u_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad u_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}.$$

4.11.34. Definition. A fundamental matrix $\Phi$ for Eq. (4.11.30) whose columns are determined by the $n$ linearly independent solutions $\phi_i$, $i = 1, \ldots, n$, with

$$\phi_i(\tau) = u_i, \quad i = 1, \ldots, n,$$

$\tau \in T$, is called the state transition matrix $\Phi$ of Eq. (4.11.30).
Let $\Psi = [\psi_{ij}]$ be an $(n \times n)$ matrix, and define differentiation of $\Psi$ with respect to $t \in T$ component-wise; i.e., $\dot{\Psi} = [\dot{\psi}_{ij}]$. We now have:

4.11.35. Theorem. Let $\Psi$ be a fundamental matrix of Eq. (4.11.30). Then $\Psi$ satisfies the matrix equation

$$\dot{\Psi}(t) = A(t)\Psi(t), \quad t \in T. \tag{4.11.36}$$

Proof. We have

$$\dot{\Psi} = [\dot{\psi}_1, \ldots, \dot{\psi}_n] = [A(t)\psi_1, \ldots, A(t)\psi_n] = A(t)[\psi_1, \ldots, \psi_n] = A(t)\Psi.$$
We also have:

4.11.37. Theorem. If $\Psi$ is a solution of the matrix equation (4.11.36) on $T$ and if $t, \tau \in T$, then

$$\det \Psi(t) = \det \Psi(\tau)\, e^{\int_\tau^t \operatorname{tr} A(s)\, ds}, \quad t \in T. \tag{4.11.38}$$

Proof. Recall that if $C = [c_{ij}]$ is an $(n \times n)$ matrix, then $\operatorname{tr} C = \sum_{i=1}^n c_{ii}$. Let $\Psi = [\psi_{ij}]$ and $A(t) = [a_{ij}(t)]$. Then $\dot{\psi}_{ij} = \sum_{k=1}^n a_{ik}(t)\psi_{kj}$. Now

$$\frac{d}{dt}(\det \Psi) = \begin{vmatrix} \dot{\psi}_{11} & \cdots & \dot{\psi}_{1n} \\ \psi_{21} & \cdots & \psi_{2n} \\ \vdots & & \vdots \\ \psi_{n1} & \cdots & \psi_{nn} \end{vmatrix} + \begin{vmatrix} \psi_{11} & \cdots & \psi_{1n} \\ \dot{\psi}_{21} & \cdots & \dot{\psi}_{2n} \\ \vdots & & \vdots \\ \psi_{n1} & \cdots & \psi_{nn} \end{vmatrix} + \cdots + \begin{vmatrix} \psi_{11} & \cdots & \psi_{1n} \\ \psi_{21} & \cdots & \psi_{2n} \\ \vdots & & \vdots \\ \dot{\psi}_{n1} & \cdots & \dot{\psi}_{nn} \end{vmatrix}. \tag{4.11.39}$$

Also, substituting $\dot{\psi}_{1j} = \sum_k a_{1k}(t)\psi_{kj}$ into the first determinant on the right,

$$\det \Psi_1(t) \triangleq \begin{vmatrix} \dot{\psi}_{11} & \cdots & \dot{\psi}_{1n} \\ \psi_{21} & \cdots & \psi_{2n} \\ \vdots & & \vdots \\ \psi_{n1} & \cdots & \psi_{nn} \end{vmatrix} = \begin{vmatrix} \sum_k a_{1k}\psi_{k1} & \cdots & \sum_k a_{1k}\psi_{kn} \\ \psi_{21} & \cdots & \psi_{2n} \\ \vdots & & \vdots \\ \psi_{n1} & \cdots & \psi_{nn} \end{vmatrix}.$$

The last determinant is unchanged if we subtract from the first row $a_{12}$ times the second row plus $a_{13}$ times the third row up to $a_{1n}$ times the $n$th row. This yields

$$\det \Psi_1(t) = a_{11}(t) \det \Psi(t).$$

Repeating the above procedure for the remaining determinants we get

$$\frac{d}{dt}(\det \Psi(t)) = a_{11}(t)\det \Psi(t) + a_{22}(t)\det \Psi(t) + \cdots + a_{nn}(t)\det \Psi(t) = \operatorname{tr} A(t)\, \det \Psi(t).$$

This now implies

$$\det \Psi(t) = \det \Psi(\tau)\, e^{\int_\tau^t \operatorname{tr} A(\eta)\, d\eta}$$

for all $t \in T$.
4.11.40. Exercise. Verify Eq. (4.11.39).

We now prove:
4.11.41. Theorem. A solution $\Psi$ of the matrix equation (4.11.36) is a fundamental matrix for Eq. (4.11.30) if and only if $\det \Psi(t) \neq 0$ for all $t \in T$.

Proof. Assume that $\Psi = [\psi_1, \psi_2, \ldots, \psi_n]$ is a fundamental matrix for Eq. (4.11.30), and let $\phi$ be a nontrivial solution of (4.11.30). By Theorem 4.11.32 there exist unique scalars $\alpha_1, \ldots, \alpha_n \in F$, not all zero, such that

$$\phi = \sum_{i=1}^{n} \alpha_i \psi_i = \Psi a, \tag{4.11.42}$$

where $a^T = (\alpha_1, \ldots, \alpha_n)$. Equation (4.11.42) constitutes a system of $n$ linear equations with unknowns $\alpha_1, \ldots, \alpha_n$ at any $\tau \in T$ and has a unique solution for any choice of $\phi(\tau)$. Hence, we have $\det \Psi(\tau) \neq 0$, and it now follows from Theorem 4.11.37 that $\det \Psi(t) \neq 0$ for any $t \in T$.

Conversely, let $\Psi$ be a solution of the matrix equation (4.11.36) and assume that $\det \Psi(t) \neq 0$ for all $t \in T$. Then the columns of $\Psi$ are linearly independent for all $t \in T$.
The reader can readily prove the next result.

4.11.43. Theorem. Let $\Psi$ be a fundamental matrix for Eq. (4.11.30), and let $C$ be an arbitrary $(n \times n)$ nonsingular constant matrix. Then $\Psi C$ is also a fundamental matrix for Eq. (4.11.30). Moreover, if $\Psi'$ is any other fundamental matrix for Eq. (4.11.30), then there exists a constant $(n \times n)$ nonsingular matrix $P$ such that $\Psi' = \Psi P$.

4.11.44. Exercise. Prove Theorem 4.11.43.
Now let $R(t) = [r_{ij}(t)]$ be an arbitrary matrix such that the scalar-valued functions $r_{ij}(t)$ are Riemann integrable on $T$. We define integration of $R(t)$ component-wise; i.e.,

$$\int R(t)\, dt = \left[ \int r_{ij}(t)\, dt \right].$$

Integration of vectors is defined similarly.

In the next result we establish some of the properties of the state transition matrix $\Phi$. Hereafter, in order to indicate the dependence of $\Phi$ on $\tau$ as well as $t$, we will write $\Phi(t, \tau)$. By $\dot{\Phi}(t, \tau)$ we mean $\partial \Phi(t, \tau)/\partial t$.
4.11.45. Theorem. Let $D$ be defined by Eq. (4.11.28), let $\tau \in T$, let $\phi(\tau) = \xi$, let $(\tau, \xi) \in D$, and let $\Phi(t, \tau)$ denote the state transition matrix for Eq. (4.11.30) for all $t \in T$. Then

(i) $\dot{\Phi}(t, \tau) = A(t)\Phi(t, \tau)$ with $\Phi(\tau, \tau) = I$, where $I$ denotes the $(n \times n)$ identity matrix;
(ii) the unique solution $\phi$ of Eq. (4.11.30) is given by

$$\phi(t) = \Phi(t, \tau)\xi \tag{4.11.46}$$

for all $t \in T$;
(iii) $\Phi(t, \tau)$ is nonsingular for all $t \in T$;
(iv) for any $t, \sigma, \tau \in T$ we have

$$\Phi(t, \tau) = \Phi(t, \sigma)\Phi(\sigma, \tau);$$

(v) $[\Phi(t, \tau)]^{-1} \triangleq \Phi^{-1}(t, \tau) = \Phi(\tau, t)$ for all $t \in T$; and
(vi) the unique solution $\phi$ of Eq. (4.11.29) is given by

$$\phi(t) = \Phi(t, \tau)\xi + \int_\tau^t \Phi(t, \eta)v(\eta)\, d\eta. \tag{4.11.47}$$
Proof. The first part of the theorem follows from the definition of the state transition matrix.

To prove the second part, assume that $\phi(t) = \Phi(t, \tau)\xi$. Differentiating with respect to $t$ we have

$$\dot{\phi}(t) = \dot{\Phi}(t, \tau)\xi = A(t)\Phi(t, \tau)\xi = A(t)\phi(t).$$

Furthermore, $\phi(\tau) = \Phi(\tau, \tau)\xi = \xi$. From the uniqueness results (to be presented in the next chapter) it follows that the specified $\phi$ is indeed the solution of Eq. (4.11.30).

The third part of the theorem is a consequence of Theorem 4.11.41.

To prove the fourth part of the theorem we note that $\phi(t) = \Phi(t, \tau)\xi$ is the unique solution of Eq. (4.11.30) satisfying $\phi(\tau) = \xi$, and also that $\phi(\sigma) = \Phi(\sigma, \tau)\xi$, $\sigma, \tau \in T$. Now consider the solution of Eq. (4.11.30) with initial condition given at $\sigma$ in place of $\tau$; i.e., $\phi(t) = \Phi(t, \sigma)\phi(\sigma)$. Then

$$\phi(t) = \Phi(t, \tau)\xi = \Phi(t, \sigma)\Phi(\sigma, \tau)\xi.$$

Since this equation holds for arbitrary $\xi$ in the $x$-space, we have

$$\Phi(t, \tau) = \Phi(t, \sigma)\Phi(\sigma, \tau).$$

To prove the fifth part of the theorem we note that $\Phi^{-1}(t, \tau)$ exists by part (iii). From part (iv) it now follows that

$$I = \Phi(t, \tau)\Phi(\tau, t),$$

where $I$ denotes the $(n \times n)$ identity matrix. Thus,

$$\Phi^{-1}(t, \tau) = \Phi(\tau, t)$$

for all $t \in T$.

In the next chapter we will show that under the present assumptions, Eq. (4.11.29) possesses a unique solution $\phi$ for every $(\tau, \xi) \in D$, where $\phi(\tau) = \xi$. Thus, to prove the last part of the theorem, we must show that the function (4.11.47) is this solution. Differentiating with respect to $t$ we have

$$\dot{\phi}(t) = \dot{\Phi}(t, \tau)\xi + \Phi(t, t)v(t) + \int_\tau^t \dot{\Phi}(t, \eta)v(\eta)\, d\eta = A(t)\Phi(t, \tau)\xi + v(t) + \int_\tau^t A(t)\Phi(t, \eta)v(\eta)\, d\eta$$
$$= A(t)\left[ \Phi(t, \tau)\xi + \int_\tau^t \Phi(t, \eta)v(\eta)\, d\eta \right] + v(t) = A(t)\phi(t) + v(t).$$

Also, $\phi(\tau) = \xi$. Therefore, $\phi$ is the unique solution of Eq. (4.11.29).

In engineering and physics, $x$ is interpreted as representing the "state" of a physical system described by appropriate ordinary differential equations. In Eq. (4.11.46), the matrix $\Phi(t, \tau)$ relates the "states" of the system at the points $t \in T$ and $\tau \in T$. Hence, the name "state transition matrix."
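The variation-of-constants formula (4.11.47) can be checked numerically. The sketch below is not from the text; the choices $A = a = -1$, $v(t) = \sin t$, and the step counts are illustrative. It treats the scalar case $n = 1$, where $\Phi(t, \tau) = e^{a(t-\tau)}$, and compares formula (4.11.47), with the integral evaluated by the trapezoidal rule, against direct numerical integration of $\dot{x} = ax + v(t)$:

```python
import math

# Illustrative constants (our own choices, not from the text).
a = -1.0
v = math.sin          # forcing term v(t)
tau, xi = 0.0, 2.0    # initial time and initial condition phi(tau) = xi

def phi_formula(t, steps=20000):
    # phi(t) = Phi(t,tau)*xi + integral_tau^t Phi(t,eta)*v(eta) d(eta),
    # with Phi(t,eta) = exp(a*(t-eta)) and a trapezoidal-rule integral.
    h = (t - tau) / steps
    integrand = lambda eta: math.exp(a * (t - eta)) * v(eta)
    s = 0.5 * (integrand(tau) + integrand(t)) + sum(
        integrand(tau + k * h) for k in range(1, steps))
    return math.exp(a * (t - tau)) * xi + h * s

def phi_stepped(t, steps=20000):
    # Direct integration of x' = a*x + v(t) by Heun's (improved Euler) method.
    h = (t - tau) / steps
    x, s = xi, tau
    for _ in range(steps):
        f1 = a * x + v(s)
        f2 = a * (x + h * f1) + v(s + h)
        x += 0.5 * h * (f1 + f2)
        s += h
    return x

print(abs(phi_formula(3.0) - phi_stepped(3.0)))  # the two values agree closely
```

For this particular choice the exact solution is $\phi(t) = (\xi + \tfrac{1}{2})e^{-t} + \tfrac{1}{2}(\sin t - \cos t)$, which both computations reproduce.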
Next, we wish to examine the properties of linear ordinary differential equations with constant coefficients given by Eq. (4.11.31). We require the following preliminary result.
4.11.48. Theorem. Let $A$ be a constant $(n \times n)$ matrix ($A$ may be real or complex). Let $S_N(t)$ denote the matrix

$$S_N(t) = I + \sum_{k=1}^{N} \frac{t^k}{k!} A^k.$$

Then each element of the matrix $S_N(t)$ converges absolutely and uniformly on any finite interval $(-t_1, t_1)$, $t_1 > 0$, as $N \to \infty$.

Proof. Let $a_{ij}^{(k)}$ denote the $(i, j)$th element of the matrix $A^k$, where $i, j = 1, \ldots, n$ and $k = 0, 1, 2, \ldots$. Then the $(i, j)$th element of $S_N(t)$ is equal to

$$\delta_{ij} + \sum_{k=1}^{N} a_{ij}^{(k)} \frac{t^k}{k!},$$

where $\delta_{ij}$ is the Kronecker delta. We now show that

$$\sum_{k=1}^{\infty} |a_{ij}^{(k)}| \frac{t_1^k}{k!} < \infty \quad \text{for all } i, j.$$

Let $m = \max_i \bigl( \sum_p |a_{ip}| \bigr)$. Then $m$ is a constant which depends on the elements of the matrix $A$. Since $A^k = A A^{k-1}$, we have

$$\max_{i,j} |a_{ij}^{(k)}| = \max_{i,j} \Bigl| \sum_p a_{ip} a_{pj}^{(k-1)} \Bigr| \leq \max_{i,j} \sum_p |a_{ip}|\, |a_{pj}^{(k-1)}| \leq \Bigl( \max_i \sum_p |a_{ip}| \Bigr) \Bigl( \max_{p,j} |a_{pj}^{(k-1)}| \Bigr).$$

Therefore, $\max_{i,j} |a_{ij}^{(k)}| \leq m \cdot \max_{i,j} |a_{ij}^{(k-1)}|$. When $k = 1$ we have $\max_{i,j} |a_{ij}^{(1)}| \leq m$, and by induction it follows that $\max_{i,j} |a_{ij}^{(k)}| \leq m^k$. Now let $M_k = (m t_1)^k / k!$. Then we have for any $t \in (-t_1, t_1)$, $t_1 > 0$, and for any $i, j$,

$$\Bigl| a_{ij}^{(k)} \frac{t^k}{k!} \Bigr| \leq M_k.$$

Since $\sum_{k=1}^{\infty} M_k = e^{m t_1} - 1 < \infty$, we now have that

$$\delta_{ij} + \sum_{k=1}^{\infty} a_{ij}^{(k)} \frac{t^k}{k!}$$

is an absolutely and uniformly convergent series for each $i, j$ over the interval $(-t_1, t_1)$ by the Weierstrass M-test.
We are now in a position to consider the following:

4.11.49. Definition. Let $A$ be a constant $(n \times n)$ matrix. We define $e^{At}$ to be the matrix

$$e^{At} = I + \sum_{k=1}^{\infty} \frac{t^k}{k!} A^k$$

for any $-\infty < t < \infty$.

We note immediately that $e^{A \cdot 0} = I$.
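Definition 4.11.49 suggests a direct way to approximate $e^{At}$, namely by the partial sums $S_N(t)$ of Theorem 4.11.48. The following sketch (helper names and test matrix are our own, and plain Python lists stand in for a matrix library) does exactly that; for the nilpotent matrix $A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ the series terminates, so $e^{At} = \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix}$ is obtained exactly.

```python
def mat_mul(A, B):
    # Product of two square matrices given as lists of rows.
    n = len(A)
    return [[sum(A[i][p] * B[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]

def expm_series(A, t, N=30):
    # Partial sum S_N(t) = I + sum_{k=1}^{N} (t^k/k!) A^k of Definition 4.11.49.
    n = len(A)
    S = [[float(i == j) for j in range(n)] for i in range(n)]  # identity I
    P = [row[:] for row in S]                                  # holds A^k
    fact = 1.0
    for k in range(1, N + 1):
        P = mat_mul(P, A)
        fact *= k
        for i in range(n):
            for j in range(n):
                S[i][j] += (t ** k / fact) * P[i][j]
    return S

# Nilpotent example: the series terminates after the linear term.
E = expm_series([[0.0, 1.0], [0.0, 0.0]], 2.5)
print(E)  # → [[1.0, 2.5], [0.0, 1.0]]
```

For a diagonal matrix the same routine reproduces $\operatorname{diag}(e^{\lambda_1 t}, \ldots, e^{\lambda_n t})$ to within the truncation error, in agreement with Exercise 4.11.54 below.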
We now prove:

4.11.50. Theorem. Let $T = (-\infty, \infty)$, let $\tau \in T$, and let $A$ be a constant $(n \times n)$ matrix. Then

(i) the state transition matrix for Eq. (4.11.31) is given by

$$\Phi(t, \tau) = e^{A(t - \tau)}$$

for all $t \in T$;
(ii) the matrix $e^{At}$ is nonsingular for all $t \in T$;
(iii) $e^{At_1} e^{At_2} = e^{A(t_1 + t_2)}$ for all $t_1, t_2 \in T$;
(iv) $A e^{At} = e^{At} A$ for all $t \in T$; and
(v) $(e^{At})^{-1} = e^{-At}$ for all $t \in T$.

Proof. To prove the first part we must show that $\Phi(t, \tau)$ satisfies the matrix equation

$$\dot{\Phi}(t, \tau) = A\Phi(t, \tau)$$

for all $t \in T$, with $\Phi(\tau, \tau) = I$. Now, by definition,

$$\Phi(t, \tau) = e^{A(t - \tau)} = I + \sum_{k=1}^{\infty} \frac{(t - \tau)^k}{k!} A^k.$$

In view of Theorem 4.11.48 we may differentiate the above series term by term. In doing so we obtain

$$\frac{d}{dt} e^{A(t - \tau)} = A + \sum_{k=2}^{\infty} \frac{(t - \tau)^{k-1}}{(k-1)!} A^k = A \left[ I + \sum_{k=1}^{\infty} \frac{(t - \tau)^k}{k!} A^k \right] = A e^{A(t - \tau)},$$

and thus we have

$$\dot{\Phi}(t, \tau) = A\Phi(t, \tau)$$

for all $t \in T$, with $\Phi(\tau, \tau) = e^{A \cdot 0} = I$. Therefore, $e^{A(t - \tau)}$ is the state transition matrix for Eq. (4.11.31).
The second part of the theorem is obvious.

To prove the third part of the theorem, we note that for any $t_1, t_2 \in T$, we have

$$\Phi(t_1, -t_2) = \Phi(t_1, 0)\Phi(0, -t_2).$$

Now $\Phi(t_1, -t_2) = e^{A(t_1 + t_2)}$, $\Phi(t_1, 0) = e^{At_1}$, and $\Phi(0, -t_2) = e^{At_2}$, which yields the desired result.

To prove the fourth part of the theorem we note that for all $t \in T$,

$$A \left( I + \sum_{k=1}^{\infty} \frac{t^k}{k!} A^k \right) = A + \sum_{k=1}^{\infty} \frac{t^k}{k!} A^{k+1} = \left( I + \sum_{k=1}^{\infty} \frac{t^k}{k!} A^k \right) A.$$

Finally, to prove the last part of the theorem, note that for all $t \in T$,

$$e^{At} e^{-At} = e^{A(t - t)} = e^{A \cdot 0} = I.$$

Therefore, $(e^{At})^{-1} = e^{-At}$.
The following natural question arises: can we find an expression similar to $e^{At}$ for the case when $A = A(t)$, $t \in T$? The answer is, in general, no. However, there is a special case when such a generalization is valid.

4.11.51. Theorem. If for Eq. (4.11.30) $A(t_1)A(t_2) = A(t_2)A(t_1)$ for all $t_1, t_2 \in T$, then the state transition matrix $\Phi(t, \tau)$ is given by

$$\Phi(t, \tau) = e^{\int_\tau^t A(\eta)\, d\eta} \triangleq e^{B(t, \tau)} = I + \sum_{k=1}^{\infty} \frac{1}{k!} B^k(t, \tau),$$

where $B(t, \tau) \triangleq \int_\tau^t A(\eta)\, d\eta$.

4.11.52. Exercise. Prove Theorem 4.11.51.

We note that a sufficient condition for $A(t_1)$ to commute with $A(t_2)$ for all $t_1, t_2 \in T$ is that $A(t)$ be a diagonal matrix.
4.11.53. Exercise. Find the state transition matrix for $\dot{x} = A(t)x$, where

$$A(t) = \begin{bmatrix} \;\cdots\; \end{bmatrix}.$$

The reader will find it instructive to verify the following additional results.
4.11.54. Exercise. Let $\Lambda$ denote the $(n \times n)$ diagonal matrix

$$\Lambda = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}.$$

Show that

$$e^{\Lambda t} = \begin{bmatrix} e^{\lambda_1 t} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_n t} \end{bmatrix}$$

for all $t \in T = (-\infty, \infty)$.
4.11.55. Exercise. Let $t \in T = (-\infty, \infty)$, let $\tau \in T$, and let $\xi \in R^n$ (or $C^n$). Let $A$ be the $(n \times n)$ matrix for Eq. (4.11.31), and let $\phi$ denote the unique solution of Eq. (4.11.31) with $\phi(\tau) = \xi$. Let $P$ be a similarity transformation for $A$, and let $B = P^{-1}AP$.

(a) Show that $e^{At} = P e^{Bt} P^{-1}$ for all $t \in T$.
(b) Show that the unique solution of Eq. (4.11.31) is given by $\phi = P\psi$, where $\psi$ is the unique solution of the initial-value problem

$$\dot{y} = By$$

with

$$\psi(\tau) = P^{-1}\phi(\tau) = P^{-1}\xi.$$
4.11.56. Exercise. Let $D$ be defined by Eq. (4.11.28). In Eq. (4.11.29), let $A(t) = A$ for all $t \in T$; i.e.,

$$\dot{x} = Ax + v(t). \tag{4.11.57}$$

Let $\tau \in T$, and let $\phi$ denote the unique solution of Eq. (4.11.57) with $\phi(\tau) = \xi$. Let $P$ be a similarity transformation for $A$, and let $B = P^{-1}AP$. Show that the unique solution of Eq. (4.11.57) is given by

$$\phi = P\psi,$$

where $\psi$ is the unique solution of the initial-value problem

$$\dot{y} = By + P^{-1}v(t)$$

with

$$\psi(\tau) = P^{-1}\xi, \quad (\tau, \xi) \in D, \quad t \in T.$$
4.11.58. Exercise. Let $J$ denote the Jordan canonical form of the $(n \times n)$ matrix $A$ of Eq. (4.11.31), and let $M$ denote the nonsingular $(n \times n)$ matrix which transforms $A$ into $J$; i.e., $J = M^{-1}AM$. Then

$$J = \begin{bmatrix} J_0 & & & 0 \\ & J_1 & & \\ & & \ddots & \\ 0 & & & J_p \end{bmatrix},$$

where

$$J_0 = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_k \end{bmatrix}$$

and where

$$J_m = \begin{bmatrix} \lambda_{k+m} & 1 & 0 & \cdots & 0 \\ 0 & \lambda_{k+m} & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_{k+m} & 1 \\ 0 & 0 & \cdots & 0 & \lambda_{k+m} \end{bmatrix},$$

$m = 1, \ldots, p$, and where $\lambda_1, \ldots, \lambda_k, \lambda_{k+1}, \ldots, \lambda_{k+p}$ denote the (not necessarily distinct) eigenvalues of $A$. Show that

$$e^{Jt} = \begin{bmatrix} e^{J_0 t} & & & 0 \\ & e^{J_1 t} & & \\ & & \ddots & \\ 0 & & & e^{J_p t} \end{bmatrix},$$

where

$$e^{J_0 t} = \begin{bmatrix} e^{\lambda_1 t} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_k t} \end{bmatrix}$$

and

$$e^{J_m t} = e^{\lambda_{k+m} t} \begin{bmatrix} 1 & t & \dfrac{t^2}{2!} & \cdots & \dfrac{t^{\nu_m - 1}}{(\nu_m - 1)!} \\ 0 & 1 & t & \cdots & \dfrac{t^{\nu_m - 2}}{(\nu_m - 2)!} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix},$$

where $J_m$ is a $(\nu_m \times \nu_m)$ matrix and $k + \nu_1 + \cdots + \nu_p = n$.
Next, we consider initial-value problems characterized by linear $n$th-order ordinary differential equations given by

$$a_n(t)y^{(n)} + a_{n-1}(t)y^{(n-1)} + \cdots + a_1(t)y^{(1)} + a_0(t)y = v(t), \tag{4.11.59}$$

$$a_n(t)y^{(n)} + a_{n-1}(t)y^{(n-1)} + \cdots + a_1(t)y^{(1)} + a_0(t)y = 0, \tag{4.11.60}$$

and

$$a_n y^{(n)} + a_{n-1} y^{(n-1)} + \cdots + a_1 y^{(1)} + a_0 y = 0. \tag{4.11.61}$$

In Eqs. (4.11.59) and (4.11.60), $v(t)$ and $a_i(t)$, $i = 0, \ldots, n$, are functions which are defined and continuous on a real $t$ interval $T$, and in Eq. (4.11.61), the $a_i$, $i = 0, \ldots, n$, are constant coefficients. We assume that $a_n \neq 0$, that $a_n(t) \neq 0$ for any $t \in T$, and that $v(t)$ is not identically zero. Furthermore, the coefficients $a_i$, $a_i(t)$, $i = 0, \ldots, n$, may be either real or complex.
In accordance with Eq. (4.11.23), we can reduce the study of Eq. (4.11.60) to the study of the system of $n$ first-order ordinary differential equations

$$\dot{x} = A(t)x, \tag{4.11.62}$$

where

$$A(t) = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -\dfrac{a_0(t)}{a_n(t)} & -\dfrac{a_1(t)}{a_n(t)} & -\dfrac{a_2(t)}{a_n(t)} & \cdots & -\dfrac{a_{n-1}(t)}{a_n(t)} \end{bmatrix}. \tag{4.11.63}$$

In this case the matrix $A(t)$ is said to be in companion form. Since $A(t)$ is continuous on $T$, there exists for all $t \in T$ a unique solution $\phi$ to the initial-value problem

$$\dot{x} = A(t)x, \quad x(\tau) = \xi = (\xi_1, \ldots, \xi_n)^T, \tag{4.11.64}$$

where $\tau \in T$ and $\xi \in R^n$ (or $C^n$) (this will be proved in the next chapter). Moreover, the first component of $\phi$, $\phi_1$, is the solution of Eq. (4.11.60) satisfying

$$\phi_1(\tau) = \xi_1, \quad \phi_1^{(1)}(\tau) = \xi_2, \quad \ldots, \quad \phi_1^{(n-1)}(\tau) = \xi_n.$$
Now let $\psi_1, \ldots, \psi_n$ be $n$ solutions of Eq. (4.11.60). Then we can readily verify that the matrix

$$\Psi(t) = \begin{bmatrix} \psi_1(t) & \psi_2(t) & \cdots & \psi_n(t) \\ \psi_1^{(1)}(t) & \psi_2^{(1)}(t) & \cdots & \psi_n^{(1)}(t) \\ \vdots & & & \vdots \\ \psi_1^{(n-1)}(t) & \psi_2^{(n-1)}(t) & \cdots & \psi_n^{(n-1)}(t) \end{bmatrix} \tag{4.11.65}$$

is a solution of the matrix equation

$$\dot{\Psi} = A(t)\Psi, \tag{4.11.66}$$

where $A(t)$ is defined by Eq. (4.11.63). We call the determinant of $\Psi$ the Wronskian of Eq. (4.11.60) with respect to the solutions $\psi_1, \ldots, \psi_n$, and we denote it by

$$\det \Psi = W(\psi_1, \ldots, \psi_n). \tag{4.11.67}$$

Note that for a fixed set of solutions $\psi_1, \ldots, \psi_n$ (and considering $\tau$ fixed), the Wronskian is a function of $t$. To indicate this, we write $W(\psi_1, \ldots, \psi_n)(t)$.

In view of Theorem 4.11.37 we have for all $t \in T$,

$$W(\psi_1, \ldots, \psi_n)(t) = \det \Psi(t) = \det \Psi(\tau)\, e^{\int_\tau^t \operatorname{tr} A(\eta)\, d\eta} = W(\psi_1, \ldots, \psi_n)(\tau)\, e^{-\int_\tau^t [a_{n-1}(\eta)/a_n(\eta)]\, d\eta}. \tag{4.11.68}$$
4.11.69. Example. Consider the second-order ordinary differential equation

$$t^2 \ddot{y} + t\dot{y} - y = 0, \quad 0 < t < \infty. \tag{4.11.70}$$

The functions $\psi_1(t) = t$ and $\psi_2(t) = 1/t$ are clearly solutions of Eq. (4.11.70). Consider now the matrix

$$\Psi(t) = \begin{bmatrix} t & 1/t \\ 1 & -1/t^2 \end{bmatrix}.$$

Then

$$W(\psi_1, \psi_2)(t) = \det \Psi(t) = -\frac{2}{t}, \quad t > 0.$$

Using the notation of Eq. (4.11.63), we have in the present case $a_1(t)/a_2(t) = 1/t$. From Eq. (4.11.68) we have, for any $\tau > 0$,

$$W(\psi_1, \psi_2)(t) = \det \Psi(t) = W(\psi_1, \psi_2)(\tau)\, e^{-\int_\tau^t (1/\eta)\, d\eta} = -\frac{2}{\tau} e^{-\ln(t/\tau)} = -\frac{2}{t}, \quad t > 0,$$

which checks.
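A quick numerical check of Example 4.11.69 (an illustrative sketch, not part of the text): the Wronskian computed directly from $\psi_1(t) = t$ and $\psi_2(t) = 1/t$ should agree with the value predicted by formula (4.11.68).

```python
import math

def wronskian(t):
    # Wronskian det Psi(t) = psi1*psi2' - psi2*psi1' for the two solutions.
    psi1, dpsi1 = t, 1.0
    psi2, dpsi2 = 1.0 / t, -1.0 / t ** 2
    return psi1 * dpsi2 - psi2 * dpsi1

tau = 1.0
for t in (0.5, 2.0, 7.0):
    direct = wronskian(t)
    # Formula (4.11.68): exp(-int_tau^t (1/eta) d eta) = exp(-ln(t/tau)) = tau/t.
    via_abel = wronskian(tau) * math.exp(-math.log(t / tau))
    print(t, direct, via_abel)  # the two columns agree: -2/t
```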
The reader will have no difficulty in proving the following:

4.11.71. Theorem. A set of $n$ solutions of Eq. (4.11.60), $\psi_1, \ldots, \psi_n$, is linearly independent on a $t$ interval $T$ if and only if $W(\psi_1, \ldots, \psi_n)(t) \neq 0$ for all $t \in T$. Moreover, every solution of Eq. (4.11.60) is a linear combination of any set of $n$ linearly independent solutions.

4.11.72. Exercise. Prove Theorem 4.11.71.

We call a set of $n$ solutions of Eq. (4.11.60), $\psi_1, \ldots, \psi_n$, which is linearly independent on $T$ a fundamental set for Eq. (4.11.60).

Let us next turn our attention to the non-homogeneous linear $n$th-order ordinary differential equation (4.11.59). Without loss of generality, let us assume that $a_n(t) \equiv 1$ for all $t \in T$; i.e., let us consider

$$y^{(n)} + a_{n-1}(t)y^{(n-1)} + \cdots + a_1(t)y^{(1)} + a_0(t)y = v(t). \tag{4.11.73}$$

The study of this equation reduces to the study of the system of $n$ first-order
ordinary differential equations

$$\dot{x} = A(t)x + b(t), \tag{4.11.74}$$

where

$$A(t) = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_0(t) & -a_1(t) & -a_2(t) & \cdots & -a_{n-1}(t) \end{bmatrix}, \quad b(t) = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ v(t) \end{bmatrix}. \tag{4.11.75}$$

In the next chapter we will show that for all $t \in T$ there exists a unique solution $\phi$ to the initial-value problem

$$\dot{x} = A(t)x + b(t), \quad x(\tau) = \xi = (\xi_1, \ldots, \xi_n)^T, \tag{4.11.76}$$

where $\tau \in T$ and $\xi \in R^n$ (or $C^n$). The first component of $\phi$, $\phi_1$, is the solution of Eq. (4.11.59), with $a_n(t) \equiv 1$ for all $t \in T$, satisfying

$$\phi_1(\tau) = \xi_1, \quad \phi_1^{(1)}(\tau) = \xi_2, \quad \ldots, \quad \phi_1^{(n-1)}(\tau) = \xi_n.$$
We now have:

4.11.77. Theorem. Let $\psi_1, \ldots, \psi_n$ be a fundamental set for the equation

$$y^{(n)} + a_{n-1}(t)y^{(n-1)} + \cdots + a_1(t)y^{(1)} + a_0(t)y = 0. \tag{4.11.78}$$

Then the solution $\phi$ of the equation

$$y^{(n)} + a_{n-1}(t)y^{(n-1)} + \cdots + a_1(t)y^{(1)} + a_0(t)y = v(t), \tag{4.11.79}$$

satisfying $\phi(\tau) = \xi = (\phi(\tau), \phi^{(1)}(\tau), \ldots, \phi^{(n-1)}(\tau))^T = (\xi_1, \ldots, \xi_n)^T$, $\tau \in T$, $\xi \in R^n$ (or $C^n$), is given by the expression

$$\phi(t) = \phi_h(t) + \sum_{i=1}^{n} \psi_i(t) \int_\tau^t \frac{W_i(\psi_1, \ldots, \psi_n)(s)}{W(\psi_1, \ldots, \psi_n)(s)}\, v(s)\, ds, \tag{4.11.80}$$

where $\phi_h$ is the solution of Eq. (4.11.78) with $\phi_h(\tau) = \xi$, and where $W_i(\psi_1, \ldots, \psi_n)(t)$ is obtained from $W(\psi_1, \ldots, \psi_n)(t)$ by replacing the $i$th column of $W(\psi_1, \ldots, \psi_n)(t)$ by $(0, 0, \ldots, 1)^T$.

4.11.81. Exercise. Prove Theorem 4.11.77.
Let us consider a specific case.

4.11.82. Example. Consider the second-order ordinary differential equation

$$t^2 \ddot{y} + t\dot{y} - y = b(t), \quad t > 0, \tag{4.11.83}$$

where $b(t)$ is a real continuous function for all $t > 0$. This equation is equivalent to

$$\ddot{y} + \frac{1}{t}\dot{y} - \frac{1}{t^2} y = v(t), \tag{4.11.84}$$

where $v(t) = b(t)/t^2$. From Example 4.11.69 we have $\psi_1(t) = t$, $\psi_2(t) = 1/t$, and $W(\psi_1, \psi_2)(t) = -2/t$. Also,

$$W_1(\psi_1, \psi_2)(t) = \begin{vmatrix} 0 & 1/t \\ 1 & -1/t^2 \end{vmatrix} = -\frac{1}{t}, \qquad W_2(\psi_1, \psi_2)(t) = \begin{vmatrix} t & 0 \\ 1 & 1 \end{vmatrix} = t.$$
Let us next focus our attention on linear $n$th-order ordinary differential equations with constant coefficients. Without loss of generality, let us assume that, in Eq. (4.11.61), $a_n = 1$. We have

$$y^{(n)} + a_{n-1} y^{(n-1)} + \cdots + a_1 y^{(1)} + a_0 y = 0. \tag{4.11.85}$$

We call the algebraic equation

$$p(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0 = 0 \tag{4.11.86}$$

the characteristic equation of the differential equation (4.11.85).

As was done before, we see that the study of Eq. (4.11.85) reduces to the study of the system of first-order ordinary differential equations given by

$$\dot{x} = Ax, \tag{4.11.87}$$

where

$$A = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & -a_2 & \cdots & -a_{n-1} \end{bmatrix}. \tag{4.11.88}$$
We now show that the eigenvalues of the matrix $A$ of Eq. (4.11.88) are precisely the roots of the characteristic equation (4.11.86). First we consider

$$\det(A - \lambda I) = \begin{vmatrix} -\lambda & 1 & 0 & \cdots & 0 \\ 0 & -\lambda & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & -a_2 & \cdots & -a_{n-1} - \lambda \end{vmatrix}.$$

Expanding with respect to the first column, we obtain

$$\det(A - \lambda I) = -\lambda \begin{vmatrix} -\lambda & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ -a_1 & -a_2 & \cdots & -a_{n-1} - \lambda \end{vmatrix} + (-1)^{n+1}(-a_0) \begin{vmatrix} 1 & 0 & \cdots & 0 \\ -\lambda & 1 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & -\lambda & 1 \end{vmatrix},$$

where the second determinant on the right is that of a triangular matrix with ones on its diagonal and is therefore equal to one. Using induction we arrive at the expression

$$\det(A - \lambda I) = (-1)^n \left[ \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0 \right]. \tag{4.11.89}$$

It follows from Eq. (4.11.89) that $\lambda$ is an eigenvalue of $A$ if and only if $\lambda$ is a root of the characteristic equation (4.11.86).
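This fact is easy to test numerically. In the sketch below (the sample polynomial and the helper names are our own choices), we build the companion matrix of $p(\lambda) = \lambda^3 - 2\lambda^2 - 5\lambda + 6 = (\lambda - 1)(\lambda + 2)(\lambda - 3)$ and verify that $\det(A - \lambda I) = (-1)^n p(\lambda)$ at a few sample values of $\lambda$:

```python
def companion(coeffs):                 # coeffs = [a0, a1, ..., a_{n-1}]
    # Companion matrix of Eq. (4.11.88): superdiagonal of ones,
    # last row (-a0, -a1, ..., -a_{n-1}).
    n = len(coeffs)
    A = [[1.0 if j == i + 1 else 0.0 for j in range(n)] for i in range(n)]
    A[n - 1] = [-c for c in coeffs]
    return A

def det(M):
    # Determinant by Gaussian elimination with partial pivoting.
    M = [row[:] for row in M]
    n, d = len(M), 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        if abs(M[p][i]) < 1e-300:
            return 0.0
        if p != i:
            M[i], M[p] = M[p], M[i]
            d = -d
        d *= M[i][i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            M[r] = [x - f * y for x, y in zip(M[r], M[i])]
    return d

a = [6.0, -5.0, -2.0]                  # p(l) = l^3 - 2 l^2 - 5 l + 6
A = companion(a)
for lam in (-3.0, 0.5, 2.0):
    B = [[A[i][j] - (lam if i == j else 0.0) for j in range(3)] for i in range(3)]
    p = lam ** 3 + a[2] * lam ** 2 + a[1] * lam + a[0]
    print(lam, det(B), (-1) ** 3 * p)  # the last two columns agree
```

At the roots $\lambda = 1, -2, 3$ of $p$, the determinant $\det(A - \lambda I)$ vanishes, confirming that the roots of the characteristic equation are exactly the eigenvalues of $A$.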
4.11.90. Exercise. Assume that the eigenvalues $\lambda_1, \ldots, \lambda_n$ of the matrix $A$ given in Eq. (4.11.88) are all real and distinct. Let $\Lambda$ denote the diagonal matrix

$$\Lambda = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix},$$

and let $V$ denote the Vandermonde matrix given by

$$V = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ \lambda_1 & \lambda_2 & \cdots & \lambda_n \\ \vdots & & & \vdots \\ \lambda_1^{n-1} & \lambda_2^{n-1} & \cdots & \lambda_n^{n-1} \end{bmatrix}. \tag{4.11.91}$$

(a) Show that $V$ is nonsingular.
(b) Show that $\Lambda = V^{-1}AV$.
Before closing the present section, let us consider so-called "adjoint systems." To this end let us consider once more Eq. (4.11.30); i.e.,

$$\dot{x} = A(t)x. \tag{4.11.92}$$

Let $A^*(t)$ denote the conjugate transpose of $A(t)$. (That is, if $A(t) = [a_{ij}(t)]$, then $A^*(t) = [\bar{a}_{ij}(t)]^T = [\bar{a}_{ji}(t)]$, where $\bar{a}_{ij}(t)$ denotes the complex conjugate of $a_{ij}(t)$.) We call the system of linear first-order ordinary differential equations

$$\dot{y} = -A^*(t)y \tag{4.11.93}$$

the adjoint system to (4.11.92).

4.11.94. Exercise. Let $\Phi$ be a fundamental matrix of Eq. (4.11.92). Show that $\Psi$ is a fundamental matrix for Eq. (4.11.93) if and only if

$$\Psi^* \Phi = C,$$

where $C$ is a constant nonsingular matrix, and where $\Psi^*$ denotes the conjugate transpose of $\Psi$.
It is also possible to consider adjoint equations for linear $n$th-order ordinary differential equations. Let us for example consider Eq. (4.11.85), the study of which can be reduced to that of Eq. (4.11.87), with $A$ specified by Eq. (4.11.88). Now consider the adjoint system to Eq. (4.11.87), given by

$$\dot{y} = -A^* y, \tag{4.11.95}$$

where

$$-A^* = \begin{bmatrix} 0 & 0 & \cdots & 0 & \bar{a}_0 \\ -1 & 0 & \cdots & 0 & \bar{a}_1 \\ 0 & -1 & \cdots & 0 & \bar{a}_2 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & -1 & \bar{a}_{n-1} \end{bmatrix}, \tag{4.11.96}$$

where $\bar{a}_i$ denotes the complex conjugate of $a_i$, $i = 0, \ldots, n - 1$. Equation (4.11.95) represents the system of equations

$$\dot{y}_1 = \bar{a}_0 y_n,$$
$$\dot{y}_2 = -y_1 + \bar{a}_1 y_n,$$
$$\vdots$$
$$\dot{y}_n = -y_{n-1} + \bar{a}_{n-1} y_n. \tag{4.11.97}$$

Differentiating the last expression in Eq. (4.11.97) $(n - 1)$ times, eliminating $y_1, \ldots, y_{n-1}$, and letting $y_n = y$, we obtain

$$(-1)^n y^{(n)} + (-1)^{n-1} (\bar{a}_{n-1} y)^{(n-1)} + \cdots + (-1)(\bar{a}_1 y)^{(1)} + \bar{a}_0 y = 0. \tag{4.11.98}$$

Equation (4.11.98) is called the adjoint of Eq. (4.11.85).
4.12. NOTES AND REFERENCES

There are many excellent texts on finite-dimensional vector spaces and matrices that can be used to supplement this chapter (see e.g., [4.1], [4.2], [4.4], and [4.6]–[4.10]). References [4.1], [4.2], [4.6], and [4.10] include applications. (In particular, consult the references in [4.10] for a list of diversified areas of applications.)

Excellent references on ordinary differential equations include [4.3], [4.5], and [4.11].

REFERENCES

[4.1] N. R. AMUNDSON, Mathematical Methods in Chemical Engineering: Matrices and Their Applications. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1966.
[4.2] R. E. BELLMAN, Introduction to Matrix Algebra. New York: McGraw-Hill Book Company, Inc., 1970.
[4.3] F. BRAUER and J. A. NOHEL, Qualitative Theory of Ordinary Differential Equations: An Introduction. New York: W. A. Benjamin, Inc., 1969.*
[4.4] E. T. BROWNE, Introduction to the Theory of Determinants and Matrices. Chapel Hill, N.C.: The University of North Carolina Press, 1958.
[4.5] E. A. CODDINGTON and N. LEVINSON, Theory of Ordinary Differential Equations. New York: McGraw-Hill Book Company, Inc., 1955.
[4.6] F. R. GANTMACHER, Theory of Matrices. Vols. I, II. New York: Chelsea Publishing Company, 1959.
[4.7] P. R. HALMOS, Finite-Dimensional Vector Spaces. Princeton, N.J.: D. Van Nostrand Company, Inc., 1958.
[4.8] K. HOFFMAN and R. KUNZE, Linear Algebra. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1961.
[4.9] S. LIPSCHUTZ, Linear Algebra. New York: McGraw-Hill Book Company, 1968.
[4.10] B. NOBLE, Applied Linear Algebra. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1969.
[4.11] L. S. PONTRYAGIN, Ordinary Differential Equations. Reading, Mass.: Addison-Wesley Publishing Co., Inc., 1962.

*Reprinted by Dover Publications, Inc., New York, 1989.
5

METRIC SPACES

Up to this point in our development we have concerned ourselves primarily with algebraic structure of mathematical systems. In the present chapter we focus our attention on topological structure. In doing so, we introduce the concepts of "distance" and "closeness." In the final two chapters we will consider mathematical systems endowed with algebraic as well as topological structure.

A generalization of the concept of "distance" is the notion of metric. Using the terminology from geometry, we will refer to elements of an arbitrary set $X$ as points, and we will characterize metric as a real-valued, non-negative function on $X \times X$ satisfying the properties of "distance" between two points of $X$. We will refer to a mathematical system consisting of a basic set $X$ and a metric defined on it as a metric space. We emphasize that in the present chapter the underlying space $X$ need not be a linear space.

In the first nine sections of the present chapter we establish several basic facts from the theory of metric spaces, while in the last section, which consists of two parts, we consider some applications of this material.
5.1. DEFINITION OF METRIC SPACE

We begin with the following definition of metric and metric space.

5.1.1. Definition. Let $X$ be an arbitrary nonempty set, and let $\rho$ be a real-valued function on $X \times X$, i.e., $\rho: X \times X \to R$, where $\rho$ has the following properties:

(i) $\rho(x, y) \geq 0$ for all $x, y \in X$, and $\rho(x, y) = 0$ if and only if $x = y$;
(ii) $\rho(x, y) = \rho(y, x)$ for all $x, y \in X$; and
(iii) $\rho(x, y) \leq \rho(x, z) + \rho(z, y)$ for all $x, y, z \in X$.

The function $\rho$ is called a metric on $X$, and the mathematical system consisting of $\rho$ and $X$, $\{X; \rho\}$, is called a metric space.

The set $X$ is often called the underlying set of the metric space, the elements of $X$ are often called points, and $\rho(x, y)$ is frequently called the distance from a point $x \in X$ to a point $y \in X$. In view of axiom (i) the distance between two different points is a unique positive number and is equal to zero if and only if the two points coincide. Axiom (ii) indicates that the distance between points $x$ and $y$ is equal to the distance between points $y$ and $x$. Axiom (iii) represents the well-known triangle inequality encountered, for example, in plane geometry. Clearly, if $\rho$ is a metric for $X$ and if $\alpha$ is any real positive number, then the function $\alpha\rho(x, y)$ is also a metric for $X$. We are thus in a position to define infinitely many metrics on $X$.

The above definition of metric was motivated by our notion of distance. Our next result enables us to define metric in an equivalent (and often convenient) way.
5.1.2. Theorem. Let $\rho: X \times X \to R$. Then $\rho$ is a metric if and only if

(i) $\rho(x, y) = 0$ if and only if $x = y$; and
(ii) $\rho(y, z) \leq \rho(x, y) + \rho(x, z)$ for all $x, y, z \in X$.

Proof. The necessity is obvious. To prove sufficiency, let $x, y, z \in X$ with $z = y$. Then $0 = \rho(y, y) \leq 2\rho(x, y)$. Hence, $\rho(x, y) \geq 0$ for all $x, y \in X$. Next, let $z = x$. Then $\rho(y, x) \leq \rho(x, y)$. Since $x$ and $y$ are arbitrary, we can reverse their roles and conclude $\rho(x, y) \leq \rho(y, x)$. Therefore, $\rho(x, y) = \rho(y, x)$ for all $x, y \in X$. Finally, the triangle inequality, axiom (iii), follows from (ii) upon using the symmetry just established. This proves that $\rho$ is a metric.
Different metrics defined on the same underlying set $X$ yield different metric spaces. In applications, the choice of a specific metric is often dictated by the particular problem on hand. If in a particular situation the metric $\rho$ is understood, then we simply write $X$ in place of $\{X; \rho\}$ to denote the particular metric space under consideration.

Let us now consider a few examples of metric spaces.
5.1.3. Example. Let $X$ be the set of real numbers $R$, and let the function $\rho$ on $R \times R$ be defined as

$$\rho(x, y) = |x - y| \tag{5.1.4}$$

for all $x, y \in R$, where $|x|$ denotes the absolute value of $x$. Now clearly $\rho(x, y) = |x - y| = 0$ if and only if $x = y$. Also, for all $x, y, z \in R$, we have $\rho(y, z) = |y - z| = |(y - x) + (x - z)| \leq |x - y| + |x - z| = \rho(x, y) + \rho(x, z)$. Therefore, by Theorem 5.1.2, $\rho$ is a metric and $\{R; \rho\}$ is a metric space. We call $\rho(x, y)$ defined by Eq. (5.1.4) the usual metric on $R$, and we call the metric space $\{R; \rho\}$ the real line.

5.1.5. Example. Let $X$ be the set of all complex numbers $C$. If $z \in C$, then $z = a + ib$, where $i = \sqrt{-1}$, and where $a, b$ are real numbers. Let $\bar{z} = a - ib$, and define $\rho$ as

$$\rho(z_1, z_2) = [(z_1 - z_2)(\bar{z}_1 - \bar{z}_2)]^{1/2}. \tag{5.1.6}$$

It can readily be shown that $\{C; \rho\}$ is a metric space. We call (5.1.6) the usual metric for $C$.
5.1.7. Example. Let $X$ be an arbitrary nonempty set, and define the function $\rho$ on $X \times X$ as

$$\rho(x, y) = \begin{cases} 0 & \text{if } x = y \\ 1 & \text{if } x \neq y. \end{cases} \tag{5.1.8}$$

Clearly $\rho(x, y) \geq 0$ for all $x, y \in X$, $\rho(x, x) = 0$ for all $x \in X$, and $\rho(x, y) \leq \rho(x, z) + \rho(z, y)$ for all $x, y, z \in X$. Therefore, (5.1.8) is a metric on $X$. The function defined in Eq. (5.1.8) is called the discrete metric and is important in analysis because it can be used to metrize any set $X$.
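The three axioms of Definition 5.1.1 can be verified exhaustively for the discrete metric on a small finite sample set; the following sketch (the sample set and names are illustrative, not from the text) does so by brute force:

```python
from itertools import product

def rho(x, y):
    # The discrete metric of Eq. (5.1.8).
    return 0.0 if x == y else 1.0

X = ["a", "b", "c", 4]          # a small sample set; any set would do
for x, y, z in product(X, repeat=3):
    assert rho(x, y) >= 0 and (rho(x, y) == 0) == (x == y)   # axiom (i)
    assert rho(x, y) == rho(y, x)                            # axiom (ii)
    assert rho(x, y) <= rho(x, z) + rho(z, y)                # axiom (iii)
print("the discrete metric satisfies axioms (i)-(iii) on the sample set")
```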
We distinguish between bounded and unbounded metric spaces.

5.1.9. Definition. Let $\{X; \rho\}$ be a metric space. If there exists a positive number $r$ such that $\rho(x, y) < r$ for all $x, y \in X$, we say $\{X; \rho\}$ is a bounded metric space. If $\{X; \rho\}$ is not bounded, we say $\{X; \rho\}$ is an unbounded metric space.

If $\{X; \rho\}$ is an unbounded metric space, then $\rho$ takes on arbitrarily large values. The metric spaces in Examples 5.1.3 and 5.1.5 are unbounded, whereas the metric space in Example 5.1.7 is clearly bounded.
5.1.10. Exercise. Let $\{X; \rho\}$ be an arbitrary metric space. Define the function $\rho_1: X \times X \to R$ by

$$\rho_1(x, y) = \frac{\rho(x, y)}{1 + \rho(x, y)}. \tag{5.1.11}$$

Show that $\rho_1(x, y)$ is a metric. Show that $\{X; \rho_1\}$ is a bounded metric space, even though $\{X; \rho\}$ may not be bounded. Thus, the function (5.1.11) can be used to generate a bounded metric space from any unbounded metric space. (Hint: Show that if $f: R \to R$ is given by $f(t) = t/(1 + t)$, then $f(t_1) \leq f(t_2)$ for all $t_1, t_2$ such that $0 \leq t_1 \leq t_2$.)
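Before attempting the proof, it may help to observe the claim numerically; in the sketch below (illustrative sample points, with $\rho$ the usual metric on $R$), $\rho_1$ never exceeds 1 and still satisfies the triangle inequality:

```python
def rho(x, y):
    # The usual (unbounded) metric on R.
    return abs(x - y)

def rho1(x, y):
    # The bounded metric of Eq. (5.1.11).
    d = rho(x, y)
    return d / (1.0 + d)

pts = [-100.0, -1.5, 0.0, 2.0, 1e6]
for x in pts:
    for y in pts:
        assert rho1(x, y) < 1.0                       # bounded by 1
        for z in pts:
            assert rho1(x, y) <= rho1(x, z) + rho1(z, y) + 1e-15
print("rho1 is bounded and satisfies the triangle inequality on the samples")
```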
Subsequently, we will call

$$R^* = R \cup \{-\infty\} \cup \{+\infty\}$$

the extended real numbers. In the following exercise, we define a useful metric on $R^*$. This metric is, of course, not the only metric possible.

5.1.12. Exercise. Let $X = R^*$, and define the function $f: R^* \to R$ as

$$f(x) = \begin{cases} \dfrac{x}{1 + |x|} & \text{if } x \in R \\ 1 & \text{if } x = +\infty \\ -1 & \text{if } x = -\infty. \end{cases}$$

Let $\rho^*: R^* \times R^* \to R$ be defined by $\rho^*(x, y) = |f(x) - f(y)|$ for all $x, y \in R^*$. Show that $\{R^*; \rho^*\}$ is a bounded metric space. The function $\rho^*$ is called the usual metric for $R^*$, and $\{R^*; \rho^*\}$ is called the extended real line.
We will have occasion to use the next result.

5.1.13. Theorem. Let $\{X; \rho\}$ be a metric space, and let $x$, $y$, and $z$ be any elements of $X$. Then

$$|\rho(x, z) - \rho(y, z)| \leq \rho(x, y) \tag{5.1.14}$$

for all $x, y, z \in X$.

Proof. From axiom (iii) of Definition 5.1.1 it follows that

$$\rho(x, z) \leq \rho(x, y) + \rho(y, z) \tag{5.1.15}$$

and

$$\rho(y, z) \leq \rho(y, x) + \rho(x, z). \tag{5.1.16}$$

From (5.1.15) we have

$$\rho(x, z) - \rho(y, z) \leq \rho(x, y), \tag{5.1.17}$$

and from (5.1.16) we have

$$\rho(y, z) - \rho(x, z) \leq \rho(y, x). \tag{5.1.18}$$

In view of axiom (ii) of Definition 5.1.1 we have $\rho(x, y) = \rho(y, x)$, and thus relations (5.1.17) and (5.1.18) imply

$$-\rho(x, y) \leq \rho(x, z) - \rho(y, z) \leq \rho(x, y).$$

This proves that $|\rho(x, z) - \rho(y, z)| \leq \rho(x, y)$ for all $x, y, z \in X$.
The notion of metric makes it possible to consider various geometric concepts. We have:

5.1.19. Definition. Let $\{X; \rho\}$ be a metric space, and let $Y$ be a nonvoid subset of $X$. If $\rho(x, y)$ is bounded for all $x, y \in Y$, we define the diameter of set $Y$, denoted $\delta(Y)$ or $\operatorname{diam}(Y)$, as

$$\delta(Y) = \sup \{\rho(x, y): x, y \in Y\}.$$

If $\rho(x, y)$ is unbounded, we write $\delta(Y) = \infty$ and we say that $Y$ has infinite diameter, or $Y$ is unbounded. If $Y$ is empty, we define $\delta(Y) = 0$.

5.1.20. Exercise. Show that if $Y \subset Z \subset X$, where $\{X; \rho\}$ is a metric space, then $\delta(Y) \leq \delta(Z)$. Also, show that if $Z$ is nonempty, then $\delta(Z) = 0$ if and only if $Z$ is a singleton.
We also have:

5.1.21. Definition. Let $\{X; \rho\}$ be a metric space, and let $Y$ and $Z$ be two nonvoid subsets of $X$. We define the distance between sets $Y$ and $Z$ as

$$d(Y, Z) = \inf \{\rho(y, z): y \in Y, z \in Z\}.$$

Let $p \in X$, and define

$$d(p, Z) = \inf \{\rho(p, z): z \in Z\}.$$

We call $d(p, Z)$ the distance between point $p$ and set $Z$.

Since $\rho(y, z) = \rho(z, y)$ for all $y \in Y$ and $z \in Z$, it follows that $d(Y, Z) = d(Z, Y)$. We note that, in general, $d(Y, Z) = 0$ does not imply that $Y$ and $Z$ have points in common. For example, let $X$ be the real line with the usual metric $\rho$. If $Y = \{x \in X: 0 < x < 1\}$ and $Z = \{x \in X: 1 < x < 2\}$, then clearly $d(Y, Z) = 0$, even though $Y \cap Z = \varnothing$. Similarly, $d(p, Z) = 0$ does not imply that $p \in Z$.
5.1.22. Theorem. Let $\{X; \rho\}$ be a metric space, and let $Y$ be any nonvoid subset of $X$. If $\rho'$ denotes the restriction of $\rho$ to $Y \times Y$, i.e., if

$$\rho'(x, y) = \rho(x, y) \quad \text{for all } x, y \in Y,$$

then $\{Y; \rho'\}$ is a metric space.

5.1.23. Exercise. Prove Theorem 5.1.22.

We call $\rho'$ the metric induced by $\rho$ on $Y$, and we say that $\{Y; \rho'\}$ is a metric subspace of $\{X; \rho\}$ or simply a subspace of $X$. Since usually there is no room for confusion, we drop the prime from $\rho'$ and simply denote the metric subspace by $\{Y; \rho\}$. We emphasize that any nonvoid subset of a metric space $X$ can be made into a metric subspace. This is not so in the case of linear subspaces. If $Y \neq X$, then we speak of a proper subspace.
5.2. SOME INEQUALITIES

In order to present some of the important metric spaces that arise in applications, we first need to establish some important inequalities. These are summarized and proved in the following:

5.2.1. Theorem. Let $R$ denote the set of real numbers, and let $C$ denote the set of complex numbers.

(i) Let $p, q \in R$ such that $1 < p < \infty$ and such that $\frac{1}{p} + \frac{1}{q} = 1$. Then for all $\alpha, \beta \in R$ such that $\alpha \geq 0$ and $\beta \geq 0$, we have

$$\alpha\beta \leq \frac{\alpha^p}{p} + \frac{\beta^q}{q}. \tag{5.2.2}$$

(ii) (Hölder's inequality) Let $p, q \in R$ be such that $1 < p < \infty$, and $\frac{1}{p} + \frac{1}{q} = 1$.

(a) Finite Sums. Let $n$ be any positive integer, and let $\xi_1, \ldots, \xi_n$ and $\eta_1, \ldots, \eta_n$ belong either to $R$ or to $C$. Then

$$\sum_{i=1}^{n} |\xi_i \eta_i| \leq \left( \sum_{i=1}^{n} |\xi_i|^p \right)^{1/p} \left( \sum_{i=1}^{n} |\eta_i|^q \right)^{1/q}. \tag{5.2.3}$$

(b) Infinite Sums. Let $\{\xi_i\}$ and $\{\eta_i\}$ be infinite sequences in either $R$ or $C$. If $\sum_{i=1}^{\infty} |\xi_i|^p < \infty$ and $\sum_{i=1}^{\infty} |\eta_i|^q < \infty$, then

$$\sum_{i=1}^{\infty} |\xi_i \eta_i| \leq \left( \sum_{i=1}^{\infty} |\xi_i|^p \right)^{1/p} \left( \sum_{i=1}^{\infty} |\eta_i|^q \right)^{1/q}. \tag{5.2.4}$$

(c) Integrals. Let $[a, b]$ be an interval on the real line, and let $f, g: [a, b] \to R$. If $\int_a^b |f(t)|^p\, dt < \infty$ and $\int_a^b |g(t)|^q\, dt < \infty$ (integration is in the Riemann sense), then

$$\int_a^b |f(t)g(t)|\, dt \leq \left[ \int_a^b |f(t)|^p\, dt \right]^{1/p} \left[ \int_a^b |g(t)|^q\, dt \right]^{1/q}. \tag{5.2.5}$$

(iii) (Minkowski's inequality) Let $p \in R$, where $1 \leq p < \infty$.

(a) Finite Sums. Let $n$ be any positive integer, and let $\xi_1, \ldots, \xi_n$ and $\eta_1, \ldots, \eta_n$ belong either to $R$ or to $C$. Then

$$\left( \sum_{i=1}^{n} |\xi_i + \eta_i|^p \right)^{1/p} \leq \left( \sum_{i=1}^{n} |\xi_i|^p \right)^{1/p} + \left( \sum_{i=1}^{n} |\eta_i|^p \right)^{1/p}. \tag{5.2.6}$$
(b) Infinite Sums. Let $\{\xi_i\}$ and $\{\eta_i\}$ be infinite sequences in either $R$ or $C$. If $\sum_{i=1}^{\infty} |\xi_i|^p < \infty$ and $\sum_{i=1}^{\infty} |\eta_i|^p < \infty$, then

$$\left( \sum_{i=1}^{\infty} |\xi_i + \eta_i|^p \right)^{1/p} \leq \left( \sum_{i=1}^{\infty} |\xi_i|^p \right)^{1/p} + \left( \sum_{i=1}^{\infty} |\eta_i|^p \right)^{1/p}. \tag{5.2.7}$$

(c) Integrals. Let $[a, b]$ be an interval on the real line, and let $f, g: [a, b] \to R$. If $\int_a^b |f(t)|^p\, dt < \infty$ and $\int_a^b |g(t)|^p\, dt < \infty$, then

$$\left[ \int_a^b |f(t) + g(t)|^p\, dt \right]^{1/p} \leq \left[ \int_a^b |f(t)|^p\, dt \right]^{1/p} + \left[ \int_a^b |g(t)|^p\, dt \right]^{1/p}. \tag{5.2.8}$$

Proof. To prove part (i), consider the graph of the curve $\eta = \xi^{p-1}$ in the $(\xi, \eta)$ plane, depicted in Figure A. Let $A_1 = \int_0^\alpha \xi^{p-1}\, d\xi$ and $A_2 = \int_0^\beta \eta^{q-1}\, d\eta$. We have $A_1 = \alpha^p/p$ and $A_2 = \beta^q/q$. From Figure A it is clear that $A_1 + A_2 \geq \alpha\beta$ for any choice of $\alpha, \beta \geq 0$, and hence relation (5.2.2) follows.

5.2.9. Figure A.

To prove part (iia) we first note that if $\bigl( \sum_i |\xi_i|^p \bigr)^{1/p} = 0$ or if $\bigl( \sum_i |\eta_i|^q \bigr)^{1/q} = 0$, then inequality (5.2.3) follows trivially. Therefore, we assume that $\bigl( \sum_i |\xi_i|^p \bigr)^{1/p} \neq 0$ and $\bigl( \sum_i |\eta_i|^q \bigr)^{1/q} \neq 0$. From (5.2.2) we now have

$$\frac{|\xi_j|}{\left( \sum_i |\xi_i|^p \right)^{1/p}} \cdot \frac{|\eta_j|}{\left( \sum_i |\eta_i|^q \right)^{1/q}} \leq \frac{1}{p} \cdot \frac{|\xi_j|^p}{\sum_i |\xi_i|^p} + \frac{1}{q} \cdot \frac{|\eta_j|^q}{\sum_i |\eta_i|^q}.$$

Hence, summing over $j$, the right-hand side sums to $\frac{1}{p} + \frac{1}{q} = 1$. It now follows that

$$\sum_j |\xi_j \eta_j| \leq \left( \sum_i |\xi_i|^p \right)^{1/p} \left( \sum_i |\eta_i|^q \right)^{1/q},$$

which was to be proved.
To prove part (iib), we note that Ior any positive integer n,
1 ,I,1 (I,I,)/( 1 1,1,) / ( I ,I,) / ( 1 1,1,) / .
IIwe let n . 00 in the above ine uality, then (5.2. ) Iollows.
The prooI oI part (iic) is established in a similar Iashion. We leave the
details oI the prooI to the reader.
To prove part (iiia), we first note that if p = 1, then inequality (5.2.6) follows trivially. It therefore suffices to consider the case 1 < p < ∞. We observe that for any ξ_i and η_i we have

    (|ξ_i| + |η_i|)^p = (|ξ_i| + |η_i|)^{p−1}|ξ_i| + (|ξ_i| + |η_i|)^{p−1}|η_i|.

Summing the above identity with respect to i from 1 to n, we now have

    Σ_{i=1}^n (|ξ_i| + |η_i|)^p = Σ_{i=1}^n (|ξ_i| + |η_i|)^{p−1}|ξ_i| + Σ_{i=1}^n (|ξ_i| + |η_i|)^{p−1}|η_i|.

Applying the Hölder inequality (5.2.3) to each of the sums on the right side of the above relation and noting that (p − 1)q = p, we now obtain

    Σ_{i=1}^n (|ξ_i| + |η_i|)^p ≤ [Σ_{i=1}^n (|ξ_i| + |η_i|)^p]^{1/q} (Σ_{i=1}^n |ξ_i|^p)^{1/p} + [Σ_{i=1}^n (|ξ_i| + |η_i|)^p]^{1/q} (Σ_{i=1}^n |η_i|^p)^{1/p}.

If we assume that Σ_{i=1}^n (|ξ_i| + |η_i|)^p ≠ 0 and divide both sides of the above inequality by this term raised to the power 1/q, we have

    [Σ_{i=1}^n (|ξ_i| + |η_i|)^p]^{1/p} ≤ (Σ_{i=1}^n |ξ_i|^p)^{1/p} + (Σ_{i=1}^n |η_i|^p)^{1/p}.

Since Σ_{i=1}^n |ξ_i + η_i|^p ≤ Σ_{i=1}^n (|ξ_i| + |η_i|)^p, the desired result follows. We note that in case Σ_{i=1}^n (|ξ_i| + |η_i|)^p = 0, inequality (5.2.6) follows trivially.

Applying the same reasoning as above, the reader can now prove the Minkowski inequality for infinite sums and for integrals.

If in (5.2.3), (5.2.4), or (5.2.5) we let p = q = 2, then we speak of the Schwarz inequality for finite sums, infinite sums, and integrals, respectively.
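The three inequalities lend themselves to a quick numerical check. The following sketch is our own illustration (plain Python; `p_norm` and the variable names are ours, not the text's), verifying Young's inequality (5.2.2), Hölder's inequality (5.2.3), and Minkowski's inequality (5.2.6) on sample data:

```python
import random

random.seed(0)

def p_norm(v, p):
    # (sum_i |v_i|^p)^(1/p)
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

p = 3.0
q = p / (p - 1.0)                     # conjugate exponent: 1/p + 1/q = 1

# Young's inequality (5.2.2): xi*eta <= xi^p/p + eta^q/q for xi, eta >= 0
xi, eta = 1.7, 0.4
young_ok = xi * eta <= xi ** p / p + eta ** q / q + 1e-12

x = [random.uniform(-1, 1) for _ in range(10)]
y = [random.uniform(-1, 1) for _ in range(10)]

# Hoelder's inequality (5.2.3): sum |xi_i * eta_i| <= ||x||_p * ||y||_q
holder_ok = sum(abs(a * b) for a, b in zip(x, y)) <= p_norm(x, p) * p_norm(y, q) + 1e-12

# Minkowski's inequality (5.2.6): ||x + y||_p <= ||x||_p + ||y||_p
s = [a + b for a, b in zip(x, y)]
minkowski_ok = p_norm(s, p) <= p_norm(x, p) + p_norm(y, p) + 1e-12
```

The small tolerances guard only against floating-point rounding; the inequalities themselves are exact.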
5.2.10. Exercise. Prove Hölder's inequality for integrals (5.2.5), Minkowski's inequality for infinite sums (5.2.7), and Minkowski's inequality for integrals (5.2.8).
5.3. EXAMPLES OF IMPORTANT METRIC SPACES

In the present section we consider specific examples of metric spaces which are very important in applications. It turns out that all of the spaces of this section are also vector spaces.

As in earlier chapters, we denote elements x, y ∈ R^n (elements x, y ∈ C^n) by x = (ξ_1, ..., ξ_n) and y = (η_1, ..., η_n), respectively, where ξ_i, η_i ∈ R for i = 1, ..., n (where ξ_i, η_i ∈ C for i = 1, ..., n). Similarly, elements x, y ∈ R^∞ (elements x, y ∈ C^∞) are denoted by x = (ξ_1, ξ_2, ...) and y = (η_1, η_2, ...), respectively, where ξ_i, η_i ∈ R for all i (where ξ_i, η_i ∈ C for all i).
5.3.1. Example. Let X = R^n (let X = C^n), let 1 ≤ p < ∞, and let

    ρ_p(x, y) = (Σ_{i=1}^n |ξ_i − η_i|^p)^{1/p}.    (5.3.2)

We now show that {R^n; ρ_p} ({C^n; ρ_p}) is a metric space.

Axioms (i) and (ii) of Definition 5.1.1 are readily verified. To show that axiom (iii) is satisfied, let a, b, d ∈ R^n (let a, b, d ∈ C^n), where a = (α_1, ..., α_n), b = (β_1, ..., β_n), and d = (δ_1, ..., δ_n). If x = a − b and y = b − d, then we have from inequality (5.2.6),

    ρ_p(a, d) = (Σ_{i=1}^n |α_i − δ_i|^p)^{1/p} = (Σ_{i=1}^n |(α_i − β_i) + (β_i − δ_i)|^p)^{1/p}
              ≤ (Σ_{i=1}^n |α_i − β_i|^p)^{1/p} + (Σ_{i=1}^n |β_i − δ_i|^p)^{1/p} = ρ_p(a, b) + ρ_p(b, d),

the triangle inequality. It thus follows that {R^n; ρ_p} ({C^n; ρ_p}) is a metric space; in fact, it is an unbounded metric space.

We frequently abbreviate {R^n; ρ_p} by R_p^n and {C^n; ρ_p} by C_p^n. For the case p = 2, we call ρ_2 the Euclidean metric or the usual metric on R^n.
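The role Minkowski's inequality plays in the triangle inequality for ρ_p can be exercised directly; the following sketch (ours) checks all four metric axioms on random points of R³ for several values of p:

```python
import itertools
import random

def rho_p(x, y, p):
    # The metric of Eq. (5.3.2) on R^n: (sum_i |xi_i - eta_i|^p)^(1/p)
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

random.seed(1)
pts = [tuple(random.uniform(-5, 5) for _ in range(3)) for _ in range(4)]

ok = True
for p in (1.0, 2.0, 4.0):
    for a, b, d in itertools.product(pts, repeat=3):
        # non-negativity, symmetry, triangle inequality
        ok &= rho_p(a, b, p) >= 0
        ok &= abs(rho_p(a, b, p) - rho_p(b, a, p)) < 1e-9
        ok &= rho_p(a, d, p) <= rho_p(a, b, p) + rho_p(b, d, p) + 1e-9
    for a in pts:
        # identity of indiscernibles (on identical points)
        ok &= rho_p(a, a, p) == 0
```

A finite sample of points cannot prove the axioms, of course; it merely illustrates them.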
5.3.3. Example. For x, y ∈ R^n (for x, y ∈ C^n), let

    ρ_∞(x, y) = max{|ξ_1 − η_1|, ..., |ξ_n − η_n|}.    (5.3.4)

It is readily shown that {R^n; ρ_∞} ({C^n; ρ_∞}) is a metric space.
5.3.5. Example. Let 1 ≤ p < ∞, let X = R^∞ (or C^∞), and define

    l_p = {x ∈ X: Σ_{i=1}^∞ |ξ_i|^p < ∞}.    (5.3.6)

For x, y ∈ l_p, let

    ρ_p(x, y) = (Σ_{i=1}^∞ |ξ_i − η_i|^p)^{1/p}.    (5.3.7)

We can readily verify that {l_p; ρ_p} is a metric space.

5.3.8. Example. Let X = R^∞ (or C^∞), and let

    l_∞ = {x ∈ X: sup_i |ξ_i| < ∞}.    (5.3.9)

For x, y ∈ l_∞, define

    ρ_∞(x, y) = sup_i |ξ_i − η_i|.    (5.3.10)

We can easily show that {l_∞; ρ_∞} is a metric space.

5.3.11. Exercise. Use the inequalities of Section 5.2 to show that the spaces of Examples 5.3.3, 5.3.5, and 5.3.8 are metric spaces.
5.3.12. Example. Let [a, b], a < b, be an interval on the real line, and let C[a, b] be the set of all real-valued continuous functions defined on [a, b]. Let 1 ≤ p < ∞, and for x, y ∈ C[a, b], define

    ρ_p(x, y) = (∫_a^b |x(t) − y(t)|^p dt)^{1/p}.    (5.3.13)

We now show that {C[a, b]; ρ_p} is a metric space.

Clearly, ρ_p(x, y) = ρ_p(y, x), and ρ_p(x, y) ≥ 0 for all x, y ∈ C[a, b]. If x(t) = y(t) for all t ∈ [a, b], then ρ_p(x, y) = 0. To prove the converse of this statement, suppose that x(t) ≠ y(t) for some t ∈ [a, b]. Since x, y ∈ C[a, b], x − y ∈ C[a, b], and there is some interval in [a, b], i.e., a subinterval of [a, b], such that |x(t) − y(t)| > 0 for all t in that subinterval. Hence,

    (∫_a^b |x(t) − y(t)|^p dt)^{1/p} > 0.

Therefore, ρ_p(x, y) = 0 if and only if x(t) = y(t) for all t ∈ [a, b].

To show that the triangle inequality holds, let u, v, w ∈ C[a, b], and let x = u − v and y = v − w. Then we have, from inequality (5.2.8),

    ρ_p(u, w) = (∫_a^b |u(t) − w(t)|^p dt)^{1/p} = (∫_a^b |[u(t) − v(t)] + [v(t) − w(t)]|^p dt)^{1/p}
              ≤ (∫_a^b |u(t) − v(t)|^p dt)^{1/p} + (∫_a^b |v(t) − w(t)|^p dt)^{1/p} = ρ_p(u, v) + ρ_p(v, w),

the triangle inequality. It now follows that {C[a, b]; ρ_p} is a metric space. It is easy to see that this space is an unbounded metric space.
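Since ρ_p on C[a, b] is defined through an integral, it can be approximated with a Riemann sum. The sketch below (our own; the midpoint-rule discretization and test functions are arbitrary choices) estimates ρ_2 for a few familiar functions on [0, π] and confirms the triangle inequality numerically:

```python
import math

def rho_p_C(x, y, p, a, b, n=10000):
    # Midpoint-rule approximation of (integral_a^b |x(t)-y(t)|^p dt)^(1/p),
    # the metric of Eq. (5.3.13) on C[a, b].
    h = (b - a) / n
    s = sum(abs(x(a + (k + 0.5) * h) - y(a + (k + 0.5) * h)) ** p
            for k in range(n))
    return (s * h) ** (1.0 / p)

u = math.sin
v = math.cos
w = lambda t: 0.0          # the zero function

# triangle inequality rho_2(u, w) <= rho_2(u, v) + rho_2(v, w) on [0, pi]
p = 2.0
lhs = rho_p_C(u, w, p, 0.0, math.pi)
rhs = rho_p_C(u, v, p, 0.0, math.pi) + rho_p_C(v, w, p, 0.0, math.pi)
triangle_ok = lhs <= rhs + 1e-9
```

Here `lhs` approximates (∫_0^π sin²t dt)^{1/2} = (π/2)^{1/2}, which gives an easy sanity check on the quadrature.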
5.3.14. Example. Let C[a, b] be defined as in the preceding example. For x, y ∈ C[a, b], let

    ρ(x, y) = sup_{a≤t≤b} |x(t) − y(t)|.    (5.3.15)

To show that {C[a, b]; ρ} is a metric space we first note that ρ(x, y) = ρ(y, x), that ρ(x, y) ≥ 0 for all x, y, and that ρ(x, y) = 0 if and only if x(t) = y(t) for all t ∈ [a, b]. To show that ρ satisfies the triangle inequality, let z ∈ C[a, b] and note that

    ρ(x, y) = sup_{a≤t≤b} |x(t) − y(t)| = sup_{a≤t≤b} |[x(t) − z(t)] + [z(t) − y(t)]|
5.3.16. Figure B. Illustration of various metrics: the usual metric ρ(x, y) = |x − y| on R; the Euclidean metric ρ_2(x, y) = [(ξ_1 − η_1)² + (ξ_2 − η_2)²]^{1/2} on R²; the metric ρ_∞(x, y) = max{|ξ_1 − η_1|, |ξ_2 − η_2|} on R²; and the metric ρ(x_1, x_2) = sup_{a≤t≤b} |x_1(t) − x_2(t)| on C[a, b].
    ≤ sup_{a≤t≤b} |x(t) − z(t)| + sup_{a≤t≤b} |z(t) − y(t)| = ρ(x, z) + ρ(z, y).

It thus follows that {C[a, b]; ρ} is a metric space.
In Figure B, several metrics considered in Section 5.1 and in the present section are depicted pictorially.
5.3.17. Exercise. Show that the metric defined in Eq. (5.3.4) is equivalent to

    ρ_∞(x, y) = lim_{p→∞} (Σ_{i=1}^n |ξ_i − η_i|^p)^{1/p}.

5.3.18. Exercise. Let R denote the set of real numbers, and define d(x, y) = (x − y)² for all x, y ∈ R. Show that the function d is not a metric. This illustrates the necessity for the exponent 1/p in Eq. (5.3.2).
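Both exercises can be explored numerically. The snippet below (ours; the sample vectors are arbitrary) shows (Σ|ξ_i − η_i|^p)^{1/p} approaching max_i |ξ_i − η_i| as p grows, and exhibits a concrete triangle-inequality failure for d(x, y) = (x − y)²:

```python
x = (3.0, -1.0, 4.0)
y = (1.0, 5.0, 2.0)
diffs = [abs(a - b) for a, b in zip(x, y)]   # coordinate distances (2, 6, 2)

def rho_p(p):
    return sum(d ** p for d in diffs) ** (1.0 / p)

rho_inf = max(diffs)      # the metric of Eq. (5.3.4); here 6.0

# rho_p decreases toward rho_inf as p grows (cf. Exercise 5.3.17)
gaps = [rho_p(p) - rho_inf for p in (1, 2, 10, 50, 200)]
decreasing = all(g1 >= g2 - 1e-12 for g1, g2 in zip(gaps, gaps[1:]))
close = gaps[-1] < 1e-6

# d(x, y) = (x - y)^2 on R is not a metric: the triangle inequality fails,
# since d(0, 2) = 4 while d(0, 1) + d(1, 2) = 2
d = lambda s, t: (s - t) ** 2
triangle_fails = d(0.0, 2.0) > d(0.0, 1.0) + d(1.0, 2.0)
```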
We conclude the present section by considering Cartesian products of metric spaces. Let {X; ρ_x} and {Y; ρ_y} be two metric spaces, and let Z = X × Y. Utilizing the metrics ρ_x and ρ_y, we can define metrics on Z in an infinite variety of ways. Some of the more interesting cases are given in the following:

5.3.19. Theorem. Let {X; ρ_x} and {Y; ρ_y} be metric spaces, and let Z = X × Y. Let z_1 = (x_1, y_1) and z_2 = (x_2, y_2) be two points of Z. Define the functions

    ρ_p(z_1, z_2) = {[ρ_x(x_1, x_2)]^p + [ρ_y(y_1, y_2)]^p}^{1/p},  1 ≤ p < ∞,

and

    ρ_∞(z_1, z_2) = max{ρ_x(x_1, x_2), ρ_y(y_1, y_2)}.

Then {Z; ρ_p} and {Z; ρ_∞} are metric spaces.

The spaces {Z; ρ_p} and {Z; ρ_∞} are examples of product (metric) spaces.
5.3.20. Exercise. Prove Theorem 5.3.19.

We can extend the above concept to the product of n metric spaces. We have:

5.3.21. Theorem. Let {X_1; ρ_1}, ..., {X_n; ρ_n} be n metric spaces, and let X = X_1 × ... × X_n = Π_{i=1}^n X_i. For x = (x_1, ..., x_n) ∈ X and y = (y_1, ..., y_n) ∈ X, define the functions

    ρ(x, y) = Σ_{i=1}^n ρ_i(x_i, y_i)

and

    ρ′(x, y) = (Σ_{i=1}^n [ρ_i(x_i, y_i)]²)^{1/2}.

Then {X; ρ} and {X; ρ′} are metric spaces.

5.3.22. Exercise. Prove Theorem 5.3.21.
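A small sketch (ours; the component spaces, R with the usual metric and R² with the max metric, are arbitrary choices) shows how a product metric of the kind in Theorem 5.3.19 is assembled from component metrics, and checks symmetry and the triangle inequality on a few points:

```python
# component metrics (assumed choices for illustration)
rho_x = lambda a, b: abs(a - b)                              # usual metric on R
rho_y = lambda a, b: max(abs(a[0] - b[0]), abs(a[1] - b[1])) # max metric on R^2

def rho_prod(z1, z2, p):
    # product metric rho_p(z1, z2) = (rho_x^p + rho_y^p)^(1/p) of Theorem 5.3.19
    return (rho_x(z1[0], z2[0]) ** p + rho_y(z1[1], z2[1]) ** p) ** (1.0 / p)

z1 = (0.0, (0.0, 0.0))
z2 = (1.0, (2.0, -1.0))
z3 = (0.5, (1.0, 3.0))

ok = True
for p in (1.0, 2.0):
    for a, b, c in [(z1, z2, z3), (z2, z3, z1), (z3, z1, z2)]:
        ok &= rho_prod(a, c, p) <= rho_prod(a, b, p) + rho_prod(b, c, p) + 1e-12
        ok &= abs(rho_prod(a, b, p) - rho_prod(b, a, p)) < 1e-12
```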
5.4. OPEN AND CLOSED SETS

Having introduced the notion of metric, we are now in a position to consider several important fundamental concepts which we will need throughout the remainder of this book.

In the present section {X; ρ} will denote an arbitrary metric space.
5.4.1. Definition. Let x_0 ∈ X and let r ∈ R, r > 0. An open sphere or open ball, denoted by S(x_0; r), is defined as the set

    S(x_0; r) = {x ∈ X: ρ(x, x_0) < r}.

We call the fixed point x_0 the center and the number r the radius of S(x_0; r). For simplicity, we often call an open sphere simply a sphere.

The radius of a sphere is always positive and finite. In place of the terms ball or sphere we also use the term spherical neighborhood of x_0.

In Figure C, spheres in several types of metric spaces considered in the previous sections are depicted. Note that in these figures the indicated spheres do not include boundaries.

5.4.3. Exercise. Describe the open sphere in R as a function of r if the metric is the discrete metric of Example 5.1.7.
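The shape of S(x_0; r) depends on the metric in use, which is what Figure C illustrates. The following snippet (ours; the metrics, point, and radius are illustrative choices) checks one point of R² against the unit ball at the origin under three metrics; the point lies outside the ρ_1 ball but inside the ρ_2 and ρ_∞ balls:

```python
def rho1(x, y):
    return abs(x[0] - y[0]) + abs(x[1] - y[1])

def rho2(x, y):
    return ((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2) ** 0.5

def rho_inf(x, y):
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))

center, r = (0.0, 0.0), 1.0
pt = (0.7, 0.7)

# The same point can lie inside S(center; r) for one metric and outside for another:
in_ball_1 = rho1(pt, center) < r        # 1.4 < 1 is False
in_ball_2 = rho2(pt, center) < r        # sqrt(0.98) < 1 is True
in_ball_inf = rho_inf(pt, center) < r   # 0.7 < 1 is True
```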
We can now categorize the points or elements of a metric space in several ways.

5.4.4. Definition. Let Y be a subset of X. A point x ∈ X is called a contact point or adherent point of set Y if every open sphere with center x contains at least one point of Y. The set of all adherent points of Y is called the closure of Y and is denoted by Ȳ.

We note that every point of Y is an adherent point of Y; however, there may be points not in Y which are also adherent points of Y.
5.4.5. Definition. Let Y be a subset of X, and let x ∈ X be an adherent point of Y. Then x is called an isolated point if there is a sphere with center x
5.4.2. Figure C. Spheres in various metric spaces: S(x_0; r) in R with ρ(x, y) = |x − y|; in R² with the Euclidean metric ρ_2(x, y) = [(ξ_1 − η_1)² + (ξ_2 − η_2)²]^{1/2}; in R² with ρ_1(x, y) = |ξ_1 − η_1| + |ξ_2 − η_2|; in R² with ρ_∞(x, y) = max{|ξ_1 − η_1|, |ξ_2 − η_2|}; and in C[a, b] with ρ(x, y) = sup_{a≤t≤b} |x(t) − y(t)|.
which contains no point of Y other than x itself. The point x is called a limit point or point of accumulation of set Y if every sphere with center at x contains an infinite number of points of Y. The set of all limit points of Y is called the derived set of Y and is denoted by Y′.
Our next result shows that adherent points are either limit points or isolated points.

5.4.6. Theorem. Let Y be a subset of X and let x ∈ X. If x is an adherent point of Y, then x is either a limit point or an isolated point.

Proof. We prove the theorem by assuming that x is an adherent point of Y but not an isolated point. We must then show that x is a limit point of Y. To do so, consider the family of spheres S(x; 1/n) for n = 1, 2, .... Let x_n ∈ S(x; 1/n) be such that x_n ∈ Y but x_n ≠ x for each n. Now suppose there are only a finite number of distinct such points x_n, say x_1, ..., x_k. If we let d = min_{1≤i≤k} ρ(x, x_i), then d > 0. But this contradicts the fact that there is an x_n ∈ S(x; 1/n) for every n = 1, 2, 3, .... Hence, there are infinitely many x_n, and thus x is a limit point of Y.
We can now categorize the adherent points of Y ⊂ X into the following three classes: (a) isolated points of Y, which always belong to Y; (b) points of accumulation which belong to Y; and (c) points of accumulation which do not belong to Y.
5.4.7. Example. Let X = R, let ρ be the usual metric, and let Y = {x ∈ R: 0 < x < 1} ∪ {2}, as depicted in Figure D. The element x = 2 is an isolated point of Y, the elements 0 and 1 are adherent points of Y which do not belong to Y, and each point of the set {x ∈ R: 0 < x < 1} is a limit point of Y belonging to Y.

5.4.8. Figure D. The set Y = {x ∈ R: 0 < x < 1} ∪ {2} of Example 5.4.7.
5.4.9. Example. Let {R; ρ} be the real line with the usual metric, and let Q be the set of rational numbers in R. For every x ∈ R, any open sphere S(x; r) contains a point in Q. Thus, every point in R is an adherent point of Q; i.e., R ⊂ Q̄. Since Q̄ ⊂ R, it follows that R = Q̄. Clearly, there are no isolated points in Q. Also, for any x ∈ R, every sphere S(x; r) contains an infinite number of points in Q. Therefore, every point in R is a limit point of Q; i.e., R ⊂ Q′. This implies that Q′ = R.
Let us now consider the following basic results.

5.4.10. Theorem. Let Y and Z be subsets of X, and let Ȳ and Z̄ denote the closures of Y and Z, respectively. Let (Ȳ)‾ denote the closure of Ȳ, and let Y′ be the derived set of Y. Then

(i) Y ⊂ Ȳ;
(ii) (Ȳ)‾ = Ȳ;
(iii) if Y ⊂ Z, then Ȳ ⊂ Z̄;
(iv) (Y ∪ Z)‾ = Ȳ ∪ Z̄;
(v) (Y ∩ Z)‾ ⊂ Ȳ ∩ Z̄; and
(vi) Ȳ = Y ∪ Y′.
Proof. To prove the first part, let x ∈ Y. Then x ∈ S(x; r) for every r > 0. Hence, x ∈ Ȳ. Therefore, Y ⊂ Ȳ.

To prove the second part, let x ∈ (Ȳ)‾, and let r > 0. Then there is an x_1 ∈ Ȳ such that x_1 ∈ S(x; r), and hence ρ(x, x_1) = r_1 < r. Let r_0 = r − r_1 > 0. We now wish to show that S(x_1; r_0) ⊂ S(x; r). In doing so, let y ∈ S(x_1; r_0). Then ρ(y, x_1) < r_0. By the triangle inequality we have ρ(x, y) ≤ ρ(x, x_1) + ρ(x_1, y) < r_1 + (r − r_1) = r, and hence y ∈ S(x; r). Since x_1 ∈ Ȳ, the sphere S(x_1; r_0) contains a point x_2 ∈ Y. Thus, x_2 ∈ S(x; r). Since S(x; r) is an arbitrary spherical neighborhood of x, we have x ∈ Ȳ. This proves that (Ȳ)‾ ⊂ Ȳ. Also, in view of part (i), we have Ȳ ⊂ (Ȳ)‾. Therefore, it follows that (Ȳ)‾ = Ȳ.
To prove the third part of the theorem, let r > 0 and let x ∈ Ȳ. Then there is a y ∈ Y such that y ∈ S(x; r). Since Y ⊂ Z, y ∈ Z, and thus x is an adherent point of Z.

To prove the fourth part, note that Y ⊂ Y ∪ Z and Z ⊂ Y ∪ Z. From part (iii) it now follows that Ȳ ⊂ (Y ∪ Z)‾ and Z̄ ⊂ (Y ∪ Z)‾. Thus, Ȳ ∪ Z̄ ⊂ (Y ∪ Z)‾. To show that (Y ∪ Z)‾ ⊂ Ȳ ∪ Z̄, let x ∈ (Y ∪ Z)‾ and suppose that x ∉ Ȳ ∪ Z̄. Then there exist spheres S(x; r_1) and S(x; r_2) such that S(x; r_1) ∩ Y = ∅ and S(x; r_2) ∩ Z = ∅. Let r = min{r_1, r_2}. Then S(x; r) ∩ (Y ∪ Z) = ∅. But this is impossible since x ∈ (Y ∪ Z)‾. Hence, x ∈ Ȳ ∪ Z̄, and thus (Y ∪ Z)‾ ⊂ Ȳ ∪ Z̄.

The proof of the remainder of the theorem is left as an exercise.

5.4.11. Exercise. Prove parts (v) and (vi) of Theorem 5.4.10.
We can further classify points and subsets of metric spaces.

5.4.12. Definition. Let Y be a subset of X, and let Y^c denote the complement of Y. A point x ∈ X is called an interior point of the set Y if there exists a sphere S(x; r) such that S(x; r) ⊂ Y. The set of all interior points of set Y is called the interior of Y and is denoted by Y°. A point x ∈ X is an exterior point of Y if it is an interior point of the complement of Y. The exterior of Y is the set of all exterior points of set Y. The set of all points x ∈ X such that x ∈ Ȳ ∩ (Y^c)‾ is called the frontier of set Y. The boundary of a set Y is the set of all points in the frontier of Y which belong to Y.
5.4.13. Example. Let {R; ρ} be the real line with the usual metric, and let Y = {y ∈ R: 0 < y ≤ 1} = (0, 1]. The interior of Y is the set (0, 1) = {y ∈ R: 0 < y < 1}. The exterior of Y is the set (−∞, 0) ∪ (1, ∞), since Y^c = {y ∈ R: y ≤ 0 or y > 1} = (−∞, 0] ∪ (1, ∞) and the interior of Y^c is (−∞, 0) ∪ (1, ∞). Thus, the frontier of Y is the set {0, 1}, and the boundary of Y is the singleton {1}.
We now introduce the following important concepts.

5.4.14. Definition. A subset Y of X is said to be an open subset of X if every point of Y is an interior point of Y; i.e., Y = Y°. A subset Z of X is said to be a closed subset of X if Z̄ = Z.

When there is no room for confusion, we usually call Y an open set and Z a closed set. On occasions when we want to be very explicit, we will say that Y is open relative to {X; ρ} or with respect to {X; ρ}.
In our next result we establish some of the important properties of open sets.

5.4.15. Theorem.

(i) X and ∅ are open sets.
(ii) If {Y_α}_{α∈A} is an arbitrary family of open subsets of X, then ∪_{α∈A} Y_α is an open set.
(iii) The intersection of a finite number of open sets of X is open.

Proof. To prove the first part, note that for every x ∈ X, any sphere S(x; r) ⊂ X. Hence, every point in X is an interior point. Thus, X is open. Also, observe that ∅ has no points, and therefore every point of ∅ is an interior point of ∅. Hence, ∅ is an open subset of X.

To prove the second part, let {Y_α}_{α∈A} be a family of open sets in X, and let Y = ∪_{α∈A} Y_α. If Y_α is empty for every α ∈ A, then Y = ∅ is an open subset of X. Now suppose that Y ≠ ∅, and let x ∈ Y. Then x ∈ Y_α for some α ∈ A. Since Y_α is an open set, there is a sphere S(x; r) such that S(x; r) ⊂ Y_α. Hence, S(x; r) ⊂ Y, and thus x is an interior point of Y. Therefore, Y is an open set.
To prove the third part, let Y_1 and Y_2 be open subsets of X. If Y_1 ∩ Y_2 = ∅, then Y_1 ∩ Y_2 is open. So let us assume that Y_1 ∩ Y_2 ≠ ∅, and let x ∈ Y_1 ∩ Y_2. Since x ∈ Y_1, there is an r_1 > 0 such that x ∈ S(x; r_1) ⊂ Y_1. Similarly, there is an r_2 > 0 such that x ∈ S(x; r_2) ⊂ Y_2. Let r = min{r_1, r_2}. Then x ∈ S(x; r), where S(x; r) ⊂ S(x; r_1) and S(x; r) ⊂ S(x; r_2). Thus, S(x; r) ⊂ Y_1 ∩ Y_2, and x is an interior point of Y_1 ∩ Y_2. Hence, Y_1 ∩ Y_2 is an open subset of X. By induction, we can show that the intersection of any finite number of open subsets of X is open.
We now make the following definition.

5.4.16. Definition. Let {X; ρ} be a metric space. The topology of X determined by ρ is defined to be the family of all open subsets of X.

In our next result we establish a connection between open and closed subsets of X.
5.4.17. Theorem.

(i) X and ∅ are closed sets.
(ii) If Y is an open subset of X, then Y^c is closed.
(iii) If Z is a closed subset of X, then Z^c is open.

Proof. The first part of this theorem follows immediately from the definitions of X, ∅, and closed set.

To prove the second part, let Y be any open subset of X. We may assume that Y ≠ ∅ and Y ≠ X. Let x be any adherent point of Y^c. Then x cannot belong to Y, for if it did, then there would exist a sphere S(x; r) ⊂ Y, which is impossible. Therefore, every adherent point of Y^c belongs to Y^c, and thus Y^c is closed if Y is open.

To prove the third part, let Z be any closed subset of X. Again, we may assume that Z ≠ ∅ and Z ≠ X. Let x ∈ Z^c. Then there exists a sphere S(x; r) which contains no point of Z. This is so because if every such sphere were to contain a point of Z, then x would be an adherent point of Z and consequently would belong to Z, since Z is closed. Thus, there is a sphere S(x; r) ⊂ Z^c; i.e., x is an interior point of Z^c. Since this holds for arbitrary x ∈ Z^c, Z^c is an open set.
In the next result we present additional important properties of open sets.

5.4.18. Theorem.

(i) Every open sphere in X is an open set.
(ii) If Y is an open subset of X, then there is a family of open spheres, {S_α}_{α∈A}, such that Y = ∪_{α∈A} S_α.
(iii) The interior of any subset Y of X is the largest open set contained in Y.

Proof. To prove the first part, let S(x; r) be any open sphere in X. Let x_1 ∈ S(x; r), and let ρ(x, x_1) = r_1 < r. If we let r_0 = r − r_1, then according to the proof of part (ii) of Theorem 5.4.10 we have S(x_1; r_0) ⊂ S(x; r). Hence, x_1 is an interior point of S(x; r). Since this is true for any x_1 ∈ S(x; r), it follows that S(x; r) is an open subset of X.

To prove the second part of the theorem, we first note that if Y = ∅, then Y is open and is the union of an empty family of spheres. So assume that Y ≠ ∅ and that Y is open. Then each point x ∈ Y is the center of a sphere S(x; r) ⊂ Y, and moreover Y is the union of the family of all such spheres.

The proof of the last part of the theorem is left as an exercise.

5.4.19. Exercise. Prove part (iii) of Theorem 5.4.18.
Let {Y; ρ} be a subspace of a metric space {X; ρ}, and suppose that V is a subset of Y. It can happen that V may be an open subset of Y and at the same time not be an open subset of X. Thus, when a set is described as open, it is important to know in what space it is open. We have:

5.4.20. Theorem. Let {Y; ρ} be a metric subspace of {X; ρ}.

(i) A subset V ⊂ Y is open relative to {Y; ρ} if and only if there is a subset U ⊂ X such that U is open relative to {X; ρ} and V = U ∩ Y.
(ii) A subset G ⊂ Y is closed relative to {Y; ρ} if and only if there is a subset F of X such that F is closed relative to {X; ρ} and G = F ∩ Y.

Proof. Let S_Y(x_0; r) = {x ∈ Y: ρ(x, x_0) < r} and S_X(x_0; r) = {x ∈ X: ρ(x, x_0) < r}. Then S_Y(x_0; r) = S_X(x_0; r) ∩ Y.

To prove the necessity of part (i), let V be an open set relative to {Y; ρ}, and let x ∈ V. Then there is a sphere S_Y(x; r_x) ⊂ V (r_x may depend on x). Now

    V = ∪_{x∈V} S_Y(x; r_x) = [∪_{x∈V} S_X(x; r_x)] ∩ Y.

By part (ii) of Theorem 5.4.15, U = ∪_{x∈V} S_X(x; r_x) is an open set in {X; ρ}.

To prove the sufficiency of part (i), let V = U ∩ Y, where U is an open subset of X. Let x ∈ V. Then x ∈ U, and hence there is a sphere S_X(x; r) ⊂ U. Thus, S_Y(x; r) = S_X(x; r) ∩ Y ⊂ U ∩ Y = V. This proves that x is an interior point of V and that V is an open subset of Y.

The proof of part (ii) of the theorem is left as an exercise.

5.4.21. Exercise. Prove part (ii) of Theorem 5.4.20.

The first part of the preceding theorem may be stated in another equivalent way. Let T and T_Y be the topologies of {X; ρ} and {Y; ρ}, respectively, determined by ρ. Then T_Y = {U ∩ Y: U ∈ T}.
Let us now consider some specific examples.

5.4.22. Example. Let X = R, and let ρ be the usual metric on R; i.e., ρ(x, y) = |x − y|. Any set {x: a < x < b} = (a, b), where a < b, is an open subset of X. We call (a, b) an open interval on R.
5.4.23. Example. We now show that the word "finite" is crucial in part (iii) of Theorem 5.4.15. Let {R; ρ} denote again the real line with the usual metric, and let a < b. If Y_n = {x ∈ R: a < x < b + 1/n}, then for each positive integer n, Y_n is an open subset of the real line. However, the set

    Y = ∩_{n=1}^∞ Y_n = {x ∈ R: a < x ≤ b} = (a, b]

is not an open subset of R. (This can readily be verified, since every sphere S(b; r) contains a point greater than b and hence is not contained in ∩_{n=1}^∞ Y_n.)

In the above example, let Y = (a, b]. We saw that Y is not an open subset of R; i.e., b is not an interior point of Y. However, if we were to consider {Y; ρ} as a metric space by itself, then Y is an open set.
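The same phenomenon can be probed numerically. In the snippet below (our own; the cutoff 10^10 merely stands in for "n sufficiently large"), the endpoint b survives every membership test, while a point just above b eventually fails one, reflecting that the infinite intersection is (a, b]:

```python
a, b = 0.0, 1.0

def in_Yn(x, n):
    # Y_n = { x in R : a < x < b + 1/n }, an open set for each n
    return a < x < b + 1.0 / n

def in_intersection(x, N=10 ** 10):
    # finite proxy for the infinite intersection of the Y_n
    return all(in_Yn(x, n) for n in (1, 2, 10, 100, N))

b_inside = in_intersection(b)            # b lies in every Y_n
just_above = in_intersection(b + 1e-9)   # fails once 1/n drops below 1e-9
```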
5.4.24. Example. Let {C[a, b]; ρ} denote the metric space of Example 5.3.14. Let M be an arbitrary finite positive number. Then the set of continuous functions satisfying the condition |x(t)| < M for all a ≤ t ≤ b is an open subset of the metric space {C[a, b]; ρ}.
Theorems 5.4.15 and 5.4.17 tell us that the sets X and ∅ are both open and closed in any metric space. In some metric spaces there may be proper subsets of X which are both open and closed, as illustrated in the following example.

5.4.25. Example. Let X be the set of real numbers given by X = (−2, −1) ∪ (1, 2), and let ρ(x, y) = |x − y| for x, y ∈ X. Then {X; ρ} is clearly a metric space. Let Y = (−2, −1) ⊂ X and Z = (1, 2) ⊂ X. Note that both Y and Z are open subsets of X. However, Y = Z^c and Z = Y^c, and thus Y and Z are also closed subsets of X. Therefore, Y and Z are proper subsets of the metric space {X; ρ} which are both open and closed. (Note that in the preceding we are not viewing Y as a subset of R. As such, Y would be open. Considering {X; ρ} as our metric space, Y is both open and closed.)
5.4.26. Exercise. Let {X; ρ} be a metric space with ρ the discrete metric defined in Example 5.1.7. Show that every subset of X is both open and closed.

In our next result we summarize several important properties of closed sets.
5.4.27. Theorem.

(i) Every subset of X consisting of a finite number of elements is closed.
(ii) Let x_0 ∈ X, let r > 0, and let K(x_0; r) = {x ∈ X: ρ(x, x_0) ≤ r}. Then K(x_0; r) is closed.
(iii) A subset Y ⊂ X is closed if and only if Y′ ⊂ Y.
(iv) A subset Y ⊂ X is closed if and only if Ȳ ⊂ Y.
(v) Let {Y_α}_{α∈A} be any family of closed sets in X. Then ∩_{α∈A} Y_α is closed.
(vi) The union of a finite number of closed sets in X is closed.
(vii) The closure of a subset Y of X is the intersection of all closed sets containing Y.

Proof. Only the proof of part (v) is given. Let {Y_α}_{α∈A} be any family of closed subsets of X. Then {Y_α^c}_{α∈A} is a family of open sets. Now (∩_{α∈A} Y_α)^c = ∪_{α∈A} Y_α^c is an open set, and hence ∩_{α∈A} Y_α is a closed subset of X.

5.4.28. Exercise. Prove parts (i) to (iv), (vi), and (vii) of Theorem 5.4.27.
We now consider several specific examples of closed sets.

5.4.29. Example. Let X = R, and let ρ be the usual metric, ρ(x, y) = |x − y|. Any set {x ∈ R: a ≤ x ≤ b}, where a < b, is a closed subset of R. We call it a closed interval on R and denote it by [a, b].

5.4.30. Example. We now show that the word "finite" is essential in part (vi) of Theorem 5.4.27. Let {R; ρ} denote the real line with the usual metric, and let a > 0. If Y_n = {x ∈ R: 1/n ≤ x ≤ a} for each positive integer n, then Y_n is a closed subset of the real line. However, the set

    ∪_{n=1}^∞ Y_n = {x ∈ R: 0 < x ≤ a} = (0, a]

is not a closed subset of the real line, as can readily be verified, since 0 is an adherent point of (0, a].
5.4.31. Exercise. The set K(x_0; r) defined in part (ii) of Theorem 5.4.27 is sometimes called a closed sphere. It need not coincide with S̄(x_0; r), i.e., the closure of the open sphere S(x_0; r).

(i) Show that S̄(x_0; r) ⊂ K(x_0; r).
(ii) Let {X; ρ} be the discrete metric space defined in Example 5.1.7. Describe the sets S(x; 1), S̄(x; 1), and K(x; 1) for any x ∈ X, and conclude that, in general, S̄(x; 1) ≠ K(x; 1) if X contains more than one point.
(iii) Let X = (−∞, 0) ∪ Y, where Y denotes the set of positive integers, and let ρ(x, y) = |x − y|. Describe S(1; 1), S̄(1; 1), and K(1; 1), and conclude that S̄(1; 1) ≠ K(1; 1).
We are now in a position to introduce certain additional concepts which are important in analysis and applications.

5.4.32. Definition. Let Y and Z be subsets of X. The set Y is said to be dense in Z (or dense with respect to Z) if Ȳ ⊃ Z. The set Y is said to be everywhere dense in {X; ρ} (or simply, everywhere dense in X) if Ȳ = X. If the exterior of Y is everywhere dense in X, then Y is said to be nowhere dense in X. A subset Y of X is said to be dense-in-itself if every point of Y is a limit point of Y. A subset Y of X which is both closed and dense-in-itself is called a perfect set.
5.4.33. Definition. A metric space {X; ρ} is said to be separable if there is a countable subset Y in X which is everywhere dense in X.

The following result enables us to characterize separable metric spaces in an equivalent way. We have:

5.4.34. Theorem. A metric space {X; ρ} is separable if and only if there is a countable set S = {x_1, x_2, ...} ⊂ X such that for every x ∈ X and for any given ε > 0 there is an x_n ∈ S such that ρ(x, x_n) < ε.

5.4.35. Exercise. Prove Theorem 5.4.34.

Let us now consider some specific cases.
5.4.36. Example. The real line with the usual metric is a separable space. As we saw in Example 5.4.9, if Q is the set of rational numbers, then Q̄ = R.

5.4.37. Example. Let {R^n; ρ_p} be the metric space defined in Example 5.3.1 (recall that 1 ≤ p < ∞). The set of vectors x = (ξ_1, ..., ξ_n) with rational coordinates (i.e., ξ_i is a rational real number, i = 1, ..., n) is a denumerable everywhere dense set in R^n and, therefore, {R^n; ρ_p} is a separable metric space.
5.4.38. Example. Let {l_p; ρ_p} be the metric space defined in Example 5.3.5 (recall that 1 ≤ p < ∞). We can show that this space is separable in the following manner. Let

    Y = {x ∈ l_p: x = (η_1, ..., η_n, 0, 0, ...) for some n, where η_i is a rational real number, i = 1, ..., n}.

Then Y is a countable subset of l_p. To show that it is everywhere dense, let ε > 0 and let x ∈ l_p, where x = (ξ_1, ξ_2, ...). Choose n sufficiently large so that

    Σ_{k=n+1}^∞ |ξ_k|^p < ε^p/2.

We can now find a y = (η_1, ..., η_n, 0, 0, ...) ∈ Y such that

    Σ_{k=1}^n |ξ_k − η_k|^p < ε^p/2.

Hence,

    ρ_p(x, y) = (Σ_{k=1}^n |ξ_k − η_k|^p + Σ_{k=n+1}^∞ |ξ_k|^p)^{1/p} < ε;

i.e., ρ_p(x, y) < ε. By Theorem 5.4.34, {l_p; ρ_p} is separable.
In order to establish the separability of the space of continuous functions, it is necessary to use the Weierstrass approximation theorem, which we state without proof.

5.4.39. Theorem. Let C[a, b] be the space of real continuous functions on the interval [a, b], and let P denote the family of all polynomials (defined on [a, b]). Let ε > 0, and let x ∈ C[a, b]. Then there is a p ∈ P such that

    sup_{a≤t≤b} |x(t) − p(t)| < ε.
5.4.40. Exercise. Using the Weierstrass approximation theorem, show that the metric spaces {C[a, b]; ρ_p}, defined in Example 5.3.12, and {C[a, b]; ρ}, defined in Example 5.3.14, are separable.

5.4.41. Exercise. Show that the metric space {X; ρ}, where ρ is the discrete metric defined in Example 5.1.7, is separable if and only if X is a countable set.
We conclude the present section by considering an example of a metric space which is not separable.

5.4.42. Example. Let {l_∞; ρ_∞} be the metric space defined in Example 5.3.8. Let Y ⊂ R^∞ denote the set

    Y = {y ∈ R^∞: y = (η_1, η_2, ...), where η_i = 0 or 1}.

Clearly then Y ⊂ l_∞. Now for every real number γ ∈ [0, 1] there is a y ∈ Y such that γ = Σ_{i=1}^∞ η_i/2^i, where y = (η_1, η_2, ...). Thus, Y is an uncountable set. Notice now that for every x, y ∈ Y with x ≠ y, ρ_∞(x, y) = 1; that is, ρ_∞ restricted to Y is the discrete metric. It follows from Exercise 5.4.41 that Y cannot be separable and, consequently, {l_∞; ρ_∞} is not separable.
5.5. COMPLETE METRIC SPACES

The set of real numbers R with the usual metric ρ defined on it has many remarkable properties, several of which are attributable to the so-called "completeness property" of this space. For this reason we speak of {R; ρ} as being a complete metric space. In the present section we consider general complete metric spaces.

Throughout this section {X; ρ} is our underlying metric space, and I denotes the set of positive integers. Before considering the completeness of metric spaces, we need to consider a few facts about sequences on metric spaces (cf. Definition 1.1.25).
5.5.1. Definition. A sequence {x_n} in a set Y ⊂ X is a function f: I → Y. Thus, if {x_n} is a sequence in Y, then f(n) = x_n for each n ∈ I.

5.5.2. Definition. Let {x_n} be a sequence of points in X, and let x be a point of X. The sequence {x_n} is said to converge to x if for every ε > 0 there is an integer N such that for all n ≥ N, ρ(x, x_n) < ε (i.e., x_n ∈ S(x; ε) for all n ≥ N). In general, N depends on ε; i.e., N = N(ε). We call x the limit of {x_n}, and we usually write

    lim_{n→∞} x_n = x,

or x_n → x as n → ∞. If there is no x ∈ X to which the sequence converges, then we say that {x_n} diverges.
Thus, x_n → x if and only if the sequence of real numbers {ρ(x_n, x)} converges to zero. In view of the above definition we note that for every ε > 0 there is a finite number N such that all terms of {x_n} except possibly the first (N − 1) terms must lie in the sphere with center x and radius ε. Hence, the convergence of a sequence depends on the infinite number of terms {x_N, x_{N+1}, ...}, and no amount of alteration of a finite number of terms of a divergent sequence can make it converge. Moreover, if a convergent sequence is changed by omitting or adding a finite number of terms, then the resulting sequence is still convergent to the same limit as the original sequence.

Note that in Definition 5.5.2 we called x the limit of the sequence {x_n}. We will show that if {x_n} has a limit in X, then that limit is unique.
5.5.3. Definition. Let {x_n} be a sequence of points in X, where f(n) = x_n for each n ∈ I. If the range of f is bounded, then {x_n} is said to be a bounded sequence.

The range of f in the above definition may consist of a finite number of points or of an infinite number of points. Specifically, if the range of f consists of one point, then we speak of a constant sequence. Clearly, all constant sequences are convergent.
5.5.4. Example. Let {R; ρ} denote the set of real numbers with the usual metric. If n ∈ I, then the sequence {n²} diverges and is unbounded, and the range of this sequence is an infinite set. The sequence {(−1)^n} diverges, is bounded, and its range is a finite set. The sequence {a + 1/n} converges to a, is bounded, and its range is an infinite set.
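Finite truncations of sequences of the three kinds just discussed (divergent and unbounded; divergent but bounded with finite range; convergent) can be inspected directly in a few lines (our own snippet):

```python
# truncations of three illustrative real sequences, indexed n = 1, 2, ..., 199
seq_a = [n ** 2 for n in range(1, 200)]        # diverges, unbounded, infinite range
seq_b = [(-1) ** n for n in range(1, 200)]     # diverges, bounded, finite range
a = 2.0
seq_c = [a + 1.0 / n for n in range(1, 200)]   # converges to a, infinite range

unbounded_growth = seq_a[-1] > 1000 * seq_a[0]
range_b = set(seq_b)                           # the finite range {-1, 1}
dist_to_limit = abs(seq_c[-1] - a)             # rho(x_199, a) = 1/199
```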
5.5.5. Definition. Let {x_n} be a sequence in X. Let n_1, n_2, ..., n_k, ... be a sequence of positive integers which is strictly increasing; i.e., n_j > n_k for all j > k. Then the sequence {x_{n_k}} is called a subsequence of {x_n}. If the subsequence {x_{n_k}} converges, then its limit is called a subsequential limit of {x_n}.

It turns out that many of the important properties of convergence on R can be extended to the setting of arbitrary metric spaces. In the next result several of these properties are summarized.
5.5.6. Theorem. Let {x_n} be a sequence in X. Then

(i) there is at most one point x ∈ X such that lim_{n→∞} x_n = x;
(ii) if {x_n} is convergent, then it is bounded;
(iii) {x_n} converges to a point x ∈ X if and only if every sphere about x contains all but a finite number of terms in {x_n};
(iv) {x_n} converges to a point x ∈ X if and only if every subsequence of {x_n} converges to x;
(v) if {x_n} converges to x ∈ X and if y ∈ X, then lim_{n→∞} ρ(x_n, y) = ρ(x, y);
(vi) if {x_n} converges to x ∈ X and if the sequence {y_n} of X converges to y ∈ X, then lim_{n→∞} ρ(x_n, y_n) = ρ(x, y); and
(vii) if {x_n} converges to x ∈ X, and if there is a y ∈ X and a γ > 0 such that ρ(x_n, y) ≥ γ for all n ∈ I, then ρ(x, y) ≥ γ.
Proof. To prove part (i), assume that x, y ∈ X and that lim_{n→∞} x_n = x and lim_{n→∞} x_n = y. Then for every ε > 0 there are positive integers N_x and N_y such that ρ(x_n, x) < ε/2 whenever n ≥ N_x and ρ(x_n, y) < ε/2 whenever n ≥ N_y. If we let N = max(N_x, N_y), then it follows that

    ρ(x, y) ≤ ρ(x, x_N) + ρ(x_N, y) < ε/2 + ε/2 = ε.

Now ε is any positive number. Since the only non-negative number which is less than every positive number is zero, it follows that ρ(x, y) = 0 and therefore x = y.
To prove part (iii), assume that lim_{n→∞} x_n = x, and let S(x; ε) be any sphere about x. Then there is a positive integer N such that the only terms of the sequence {x_n} which are possibly not in S(x; ε) are the terms x_1, x_2, ..., x_{N−1}. Conversely, assume that every sphere about x contains all but a finite number of terms from the sequence {x_n}. With ε > 0 specified, let M = max{n ∈ I: x_n ∉ S(x; ε)}. If we set N = M + 1, then x_n ∈ S(x; ε) for all n ≥ N, which was to be shown.

To prove part (v), we note from Theorem 5.1.13 that

    |ρ(y, x) − ρ(y, x_n)| ≤ ρ(x, x_n).

By hypothesis, lim_{n→∞} x_n = x. Therefore, lim_{n→∞} ρ(x, x_n) = 0, and so lim_{n→∞} |ρ(y, x) − ρ(y, x_n)| = 0; i.e., lim_{n→∞} ρ(y, x_n) = ρ(y, x).
Finally, to prove part (vii), suppose to the contrary that ρ(x, y) < γ. Then δ = γ − ρ(x, y) > 0. Now ρ(x_n, y) ≥ γ for all n ∈ I, and thus

    0 < δ = γ − ρ(x, y) ≤ ρ(x_n, y) − ρ(x, y) ≤ ρ(x, x_n)

for all n ∈ I. But this is impossible, since lim_{n→∞} x_n = x. Thus, ρ(x, y) ≥ γ.

We leave the proofs of the remaining parts as an exercise.

5.5.7. Exercise. Prove parts (ii), (iv), and (vi) of Theorem 5.5.6.
In Definition 5.4.5, we introduced the concept of limit point of a set Y ⊂ X. In Definition 5.5.2, we defined the limit of a sequence of points, {x_n}, in X. These two concepts are closely related; however, the reader should carefully note the distinction between the two. The limit point of a set is strictly a property of the set itself. On the other hand, a sequence is not a set. Furthermore, the elements of a sequence are ordered and not necessarily distinct, while the elements of a set are not ordered but are distinct. However, the range of a sequence is a subset of X. We now give a result relating these concepts.
5.5.8. Theorem. Let Y be a subset of X. Then
(i) x ∈ X is an adherent point of Y if and only if there is a sequence {y_n} in Y (i.e., y_n ∈ Y for all n) such that lim_{n→∞} y_n = x;
(ii) x ∈ X is a limit point of the set Y if and only if there is a sequence {y_n} of distinct points in Y such that lim_{n→∞} y_n = x; and
(iii) Y is closed if and only if for every convergent sequence {y_n} such that y_n ∈ Y for all n, lim_{n→∞} y_n ∈ Y.

5.5. Complete Metric Spaces
Proof. To prove part (i), assume that lim_{n→∞} y_n = x. Then every sphere about x contains at least one term of the sequence {y_n} and, since every term of {y_n} is a point of Y, it follows that x is an adherent point of Y. Conversely, assume that x is an adherent point of Y. Then every sphere about x contains at least one point of Y. Now let us choose for each positive integer n a point y_n ∈ Y such that y_n ∈ S(x; 1/n). Then it follows readily that the sequence {y_n} chosen in this fashion converges to x. Specifically, if ε > 0 is given, then we choose a positive integer N such that 1/N < ε. Then for every n ≥ N we have y_n ∈ S(x; 1/n) ⊂ S(x; ε). This concludes the proof of part (i).

To prove part (ii), assume that x is a limit point of the set Y. Then every sphere S(x; 1/n) contains an infinite number of points of Y, and so we can choose a y_n ∈ S(x; 1/n) such that y_n ≠ y_m for all m < n. The sequence {y_n} consists of distinct points and converges to x. Conversely, if {y_n} is a sequence of distinct points convergent to x and if S(x; ε) is any sphere with center at x, then by definition of convergence there is an N such that for all n ≥ N, y_n ∈ S(x; ε). That is, there are infinitely many points of Y in S(x; ε), and so x is a limit point of Y.

To prove part (iii), assume that Y is closed and let {y_n} be a convergent sequence with y_n ∈ Y for all n and lim_{n→∞} y_n = x. We want to show that x ∈ Y. By part (i), x must be an adherent point of Y. Since Y is closed, x ∈ Y. Next, we prove the converse. Let x be an adherent point of Y. Then by part (i), there is a sequence {y_n} in Y such that lim_{n→∞} y_n = x. By hypothesis, we must have x ∈ Y. Since Y contains all of its adherent points, it must be closed.
Statement (iii) of Theorem 5.5.8 is often used as an alternate way of defining a closed set.
The next theorem provides us with conditions under which a sequence is convergent in a product metric space.
5.5.9. Theorem. Let {X; ρ_x} and {Y; ρ_y} be two metric spaces, let Z = X × Y, let ρ be any of the metrics defined on Z in Theorem 5.3.19, and let {Z; ρ} denote the product metric space of {X; ρ_x} and {Y; ρ_y}. If z ∈ Z, then z = (x, y), where x ∈ X and y ∈ Y. Let {x_n} be a sequence in X, and let {y_n} be a sequence in Y. Then,
(i) the sequence {(x_n, y_n)} converges in Z if and only if {x_n} converges in X and {y_n} converges in Y; and
(ii) lim_{n→∞} (x_n, y_n) = (lim_{n→∞} x_n, lim_{n→∞} y_n) whenever this limit exists.
5.5.10. Exercise. Prove Theorem 5.5.9.

In many situations the limit to which a given sequence may converge is unknown. The following concept enables us to consider the convergence of a sequence without knowing the limit to which the sequence may converge.
5.5.11. Definition. A sequence {x_n} of points in a metric space {X; ρ} is said to be a Cauchy sequence or a fundamental sequence if for every ε > 0 there is an integer N such that ρ(x_n, x_m) < ε whenever m, n > N.

The next result follows directly from the triangle inequality.

5.5.12. Theorem. Every convergent sequence in a metric space {X; ρ} is a Cauchy sequence.

Proof. Assume that lim_{n→∞} x_n = x. Then for arbitrary ε > 0 we can find an integer N such that ρ(x_n, x) < ε/2 and ρ(x_m, x) < ε/2 whenever m, n > N. In view of the triangle inequality we now have

ρ(x_n, x_m) ≤ ρ(x_n, x) + ρ(x_m, x) < ε

whenever m, n > N. This proves the theorem.

We emphasize that in an arbitrary metric space {X; ρ} a Cauchy sequence is not necessarily convergent.
5.5.13. Theorem. Let {x_n} be a Cauchy sequence. Then {x_n} is a bounded sequence.

Proof. We need to show that there is a constant γ such that 0 < γ < ∞ and such that ρ(x_m, x_n) ≤ γ for all m, n ∈ I.
Letting ε = 1, we can find N such that ρ(x_m, x_n) < 1 whenever m, n ≥ N. Now let λ = max{ρ(x_1, x_2), ρ(x_1, x_3), …, ρ(x_1, x_N)}. Then, by the triangle inequality,

ρ(x_1, x_n) ≤ ρ(x_1, x_N) + ρ(x_N, x_n) < λ + 1

if n ≥ N. Thus, for all n ∈ I, ρ(x_1, x_n) ≤ λ + 1. Again, by the triangle inequality,

ρ(x_m, x_n) ≤ ρ(x_m, x_1) + ρ(x_1, x_n)

for all m, n ∈ I. Thus, ρ(x_m, x_n) ≤ 2(λ + 1) and {x_n} is a bounded sequence.

We also have:

5.5.14. Theorem. If a Cauchy sequence {x_n} contains a convergent subsequence {x_{n_k}}, then the sequence {x_n} is convergent.

5.5.15. Exercise. Prove Theorem 5.5.14.
We now give the definition of complete metric space.

5.5.16. Definition. If every Cauchy sequence in a metric space {X; ρ} converges to an element in X, then {X; ρ} is said to be a complete metric space.
Complete metric spaces are of utmost importance in analysis and applications. We will have occasion to make extensive use of the properties of such spaces in the remainder of this book.

5.5.17. Example. Let X = (0, 1), and let ρ(x, y) = |x − y| for all x, y ∈ X. Let x_n = 1/n for n ∈ I. Then the sequence {x_n} is Cauchy (i.e., it is a Cauchy sequence), since |x_n − x_m| < 1/N for all n, m > N. Since there is no x ∈ X to which {x_n} converges, the metric space {X; ρ} is not complete.
5.5.18. Example. Let X = Q, the set of rational numbers, and let ρ(x, y) = |x − y|. Let x_n = 1 + 1/1! + 1/2! + ⋯ + 1/n! for n ∈ I. The sequence {x_n} is Cauchy. Since there is no limit in Q to which {x_n} converges (in R the x_n converge to the irrational number e), the metric space {Q; ρ} is not complete.
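The preceding example can be checked numerically. The sketch below (ours, not part of the text) computes the partial sums x_n = 1 + 1/1! + ⋯ + 1/n! exactly as rational numbers and verifies the tail estimate |x_m − x_n| < 2/(n + 1)!, which makes {x_n} a Cauchy sequence in Q even though its limit e lies outside Q.

```python
from fractions import Fraction
import math

# Partial sums x_n = 1 + 1/1! + 1/2! + ... + 1/n!, computed exactly in Q.
def x(n):
    s, term = Fraction(0), Fraction(1)
    for k in range(n + 1):
        s += term          # term equals 1/k! at this point
        term /= k + 1
    return s

# Cauchy tail estimate: for m > n, 0 < x_m - x_n < 2/(n + 1)!
n, m = 10, 20
gap = x(m) - x(n)
assert 0 < gap < Fraction(2, math.factorial(n + 1))

# The terms cluster ever more tightly, but around e, which is irrational,
# so the sequence has no limit in Q.
print(float(gap), abs(float(x(20)) - math.e))
```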
5.5.19. Example. Let X = R − {0}, and let ρ(x, y) = |x − y| for all x, y ∈ X. Let x_n = 1/n, n ∈ I. The sequence {x_n} is Cauchy; however, it does not converge to a limit in X. Thus, {X; ρ} is not complete. Some further comments are in order here. If we view X as a subset of R in the metric space {R; ρ} (ρ denotes the usual metric on R), then the sequence {x_n} converges to zero; i.e., lim_{n→∞} x_n = 0. By Theorem 5.5.8, X cannot be a closed subset of R. However, X is a closed subset of the metric space {X; ρ}, since it is the whole space. There is no contradiction here to Theorem 5.5.8, for the sequence {x_n} does not converge to a limit in X. Specifically, Theorem 5.5.8 states that if a sequence does converge to a limit, then the limit must belong to the space. The requirement for completeness is that every Cauchy sequence must converge to an element in the space.
We now consider several specific examples of important complete metric spaces.

5.5.20. Example. Let ρ denote the usual metric on R, the set of real numbers. The completeness of {R; ρ} is one of the fundamental results of analysis.
5.5.21. Example. Let {X; ρ_x} and {Y; ρ_y} be arbitrary complete metric spaces. If Z = X × Y and if z ∈ Z, then z = (x, y), where x ∈ X and y ∈ Y (see Theorem 5.3.19). Define

ρ_2(z_1, z_2) = {[ρ_x(x_1, x_2)]² + [ρ_y(y_1, y_2)]²}^{1/2}.

It can readily be shown that the metric space {Z; ρ_2} is complete.
5.5.22. Exercise. Verify the completeness of {Z; ρ_2} in the above example.

5.5.23. Example. Let ρ be the usual metric defined on C, the set of complex numbers. Utilizing Example 5.5.21 along with the completeness of {R; ρ} (see Example 5.5.20), we can readily show that {C; ρ} is a complete metric space.

5.5.24. Exercise. Verify the completeness of {C; ρ}.
5.5.25. Exercise. Let Rⁿ (let Cⁿ) denote the set of all real (of all complex) ordered n-tuples x = (ξ_1, …, ξ_n). Let y = (η_1, …, η_n), let

ρ_p(x, y) = (Σ_{i=1}^n |ξ_i − η_i|^p)^{1/p}, 1 ≤ p < ∞,

and let

ρ_∞(x, y) = max{|ξ_1 − η_1|, …, |ξ_n − η_n|}, i.e., p = ∞.

Utilizing the completeness of the real line (of the complex plane), show that {Rⁿ; ρ_p} ({Cⁿ; ρ_p}) is a complete metric space for 1 ≤ p ≤ ∞. In particular, show that if {x_k} is a Cauchy sequence in Rⁿ (in Cⁿ), where x_k = (ξ_1k, …, ξ_nk), then {ξ_jk} is a Cauchy sequence in R (in C) for j = 1, …, n, and {x_k} converges to x, where x = (ξ_1, …, ξ_n) and ξ_j = lim_{k→∞} ξ_jk for j = 1, …, n.
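The metrics of this exercise are straightforward to transcribe into code; a minimal sketch (function names are ours) for points of Rⁿ:

```python
# The metrics rho_p and rho_inf of Exercise 5.5.25 on R^n
# (any p >= 1 is allowed).
def rho_p(x, y, p):
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def rho_inf(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

x, y = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
print(rho_p(x, y, 1))    # → 7.0 (sum of coordinate gaps)
print(rho_p(x, y, 2))    # → 5.0 (Euclidean distance)
print(rho_inf(x, y))     # → 4.0 (largest coordinate gap)
# As p grows, rho_p(x, y) decreases toward rho_inf(x, y).
```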
5.5.26. Example. Let {l_p; ρ_p} be the metric space defined in Example 5.3.5. We now show that this space is a complete metric space.

Let {x_k} be a Cauchy sequence in l_p, where x_k = (ξ_1k, ξ_2k, …). Let ε > 0. Then there is an N ∈ I such that

ρ_p(x_k, x_j) = (Σ_{m=1}^∞ |ξ_mk − ξ_mj|^p)^{1/p} < ε

for all k, j > N. This implies that |ξ_mk − ξ_mj| < ε for every m ∈ I and all k, j > N. Thus, {ξ_mk} is a Cauchy sequence in R for every m ∈ I, and hence {ξ_mk} is convergent to some limit, say lim_{k→∞} ξ_mk = ξ_m, for m ∈ I. Now let x = (ξ_1, ξ_2, …). We want to show that (i) x ∈ l_p and (ii) lim_{k→∞} x_k = x.

Since {x_k} is a Cauchy sequence, we know by Theorem 5.5.13 that there exists a γ > 0 such that

ρ_p(0, x_k) = (Σ_{m=1}^∞ |ξ_mk|^p)^{1/p} ≤ γ

for all k ∈ I. Now let n be any positive integer, let ρ be the metric on Rⁿ defined in Exercise 5.5.25, and let x_k^(n) = (ξ_1k, …, ξ_nk). Then ρ(x_k^(n), x_j^(n)) ≤ ρ_p(x_k, x_j), and thus {x_k^(n)} is a Cauchy sequence in Rⁿ. It also follows that ρ(0, x_k^(n)) ≤ γ for all k ∈ I. Now by Exercise 5.5.25, {x_k^(n)} converges to x^(n), where x^(n) = (ξ_1, …, ξ_n). It follows from Theorem 5.5.6, part (vii), that ρ(0, x^(n)) ≤ γ; i.e., (Σ_{m=1}^n |ξ_m|^p)^{1/p} ≤ γ. Since this must hold for all n ∈ I, it follows that x ∈ l_p. To show that lim_{k→∞} x_k = x, let ε > 0. Then there is an integer N such that ρ_p(x_k, x_j) < ε for all k, j > N. Again, let n be any positive integer. Then we have ρ(x_k^(n), x_j^(n)) < ε for all j, k > N. For fixed n, we conclude from Theorem 5.5.6, part (vii), that ρ(x^(n), x_k^(n)) ≤ ε for all k > N. Hence, (Σ_{m=1}^n |ξ_m − ξ_mk|^p)^{1/p} ≤ ε for all k > N, where N depends only on ε (and not on n). Since this must hold for all n ∈ I, we conclude that ρ_p(x, x_k) ≤ ε for all k > N. This implies that lim_{k→∞} x_k = x.
5.5.27. Exercise. Show that the discrete metric space of Example 5.1.7 is complete.
5.5.28. Example. Let {C[a, b]; ρ_∞} be the metric space defined in Example 5.3.14. Thus, C[a, b] is the set of all continuous functions on [a, b], and

ρ_∞(x, y) = sup_{a≤t≤b} |x(t) − y(t)|.

We now show that {C[a, b]; ρ_∞} is a complete metric space. If {x_n} is a Cauchy sequence in C[a, b], then for each ε > 0 there is an N such that |x_n(t) − x_m(t)| < ε whenever m, n ≥ N for all t ∈ [a, b]. Thus, for fixed t, the sequence {x_n(t)} converges to, say, x_0(t). Since t is arbitrary, the sequence of functions x_n(·) converges pointwise to a function x_0(·). Also, since N = N(ε) is independent of t, the sequence x_n(·) converges uniformly to x_0(·). Now from the calculus we know that if a sequence of continuous functions x_n(·) converges uniformly to a function x_0(·), then x_0(·) is continuous. Therefore, every Cauchy sequence in {C[a, b]; ρ_∞} converges to an element in this space in the sense of the metric ρ_∞. Therefore, the metric space {C[a, b]; ρ_∞} is complete.
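To see convergence in the sup metric concretely, ρ_∞ can be approximated by sampling a fine grid. The grid approximation and the particular sequence below are our illustration, not part of the text: x_n(t) = √(t² + 1/n²) is continuous and converges uniformly on [−1, 1] to the continuous function |t|, with sup distance exactly 1/n (attained at t = 0).

```python
import math

# sup-metric distance, approximated by sampling a fine grid of [a, b]
# (the true metric takes a supremum over all t; the grid is our shortcut).
def rho_sup(f, g, a=-1.0, b=1.0, steps=10001):
    ts = (a + (b - a) * i / (steps - 1) for i in range(steps))
    return max(abs(f(t) - g(t)) for t in ts)

limit = abs   # x_0(t) = |t|, a continuous function

for n in (1, 10, 100):
    x_n = lambda t, n=n: math.sqrt(t * t + 1.0 / (n * n))
    print(n, rho_sup(x_n, limit))   # sup distance is 1/n: uniform convergence
```

Uniform convergence is precisely what forces the limit to stay continuous, which is why {C[a, b]; ρ_∞} is complete.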
5.5.29. Example. Let {C[a, b]; ρ_p} be the metric space defined in Example 5.3.12, with p = 2; i.e.,

ρ_2(x, y) = [∫_a^b |x(t) − y(t)|² dt]^{1/2}.

We now show that this metric space is not complete. Without loss of generality let the closed interval be [−1, 1]. In particular, consider the sequence {x_n} of continuous functions defined by

x_n(t) = 0 for −1 ≤ t ≤ 0,
x_n(t) = nt for 0 ≤ t ≤ 1/n,
x_n(t) = 1 for 1/n ≤ t ≤ 1,
[Figure: graphs of x_n(t) versus t for n = 1, 2, 3.]
5.5.30. Figure. Sequence {x_n} for {C[a, b]; ρ_2}.
n = 1, 2, …. This sequence is depicted pictorially in Figure 5.5.30. Now let m > n and note that

[ρ_2(x_m, x_n)]² = ∫_0^{1/m} (m − n)² t² dt + ∫_{1/m}^{1/n} (1 − nt)² dt = (m − n)²/(3m³) + (1 − n/m)³/(3n) ≤ 1/(3m) + 1/(3n) ≤ 2/(3n),

which can be made arbitrarily small by taking n sufficiently large. Therefore, {x_n} is a Cauchy sequence.
For purposes of contradiction, let us now assume that {x_n} converges to a continuous function x, where convergence is taken with respect to the metric ρ_2. In other words, assume that

∫_{−1}^{1} |x_n(t) − x(t)|² dt → 0 as n → ∞.

This implies that the above integral with any limits between −1 and 1 also approaches zero as n → ∞. Since x_n(t) = 0 whenever t ∈ [−1, 0], we have

∫_{−1}^{0} |x_n(t) − x(t)|² dt = ∫_{−1}^{0} |x(t)|² dt,

independent of n. From this it follows that the continuous function x is such that

∫_{−1}^{0} |x(t)|² dt = 0,

and x(t) = 0 whenever t ∈ [−1, 0]. Now if 0 < a ≤ 1, then

∫_a^1 |x_n(t) − x(t)|² dt → 0 as n → ∞.

Choosing n > 1/a, we have

∫_a^1 |1 − x(t)|² dt → 0 as n → ∞.

Since this integral is independent of n, it vanishes. Also, since x is continuous it follows that x(t) = 1 for t ≥ a. Since a can be chosen arbitrarily close to zero, we end up with a function x such that

x(t) = 0 for t ∈ [−1, 0], and x(t) = 1 for t ∈ (0, 1].

No continuous function satisfies both conditions. Therefore, the Cauchy sequence {x_n} does not converge to a point in C[a, b], and the metric space is not complete.
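A quick numerical check (ours, with a midpoint-rule approximation of the integral) of the two claims in Example 5.5.29: the ρ_2-distances between the ramp functions shrink as m and n grow, while the pointwise limit is a discontinuous step.

```python
# Midpoint-rule approximation of the rho_2 metric of Example 5.5.29,
# applied to the ramp functions x_n on [-1, 1].
def x(n, t):
    if t <= 0.0:
        return 0.0
    return min(n * t, 1.0)       # nt on [0, 1/n], then 1

def rho2(f, g, a=-1.0, b=1.0, steps=20000):
    h = (b - a) / steps
    s = sum((f(a + (i + 0.5) * h) - g(a + (i + 0.5) * h)) ** 2
            for i in range(steps))
    return (s * h) ** 0.5

# The distances shrink as m and n grow: {x_n} is Cauchy in rho_2 ...
print(rho2(lambda t: x(10, t), lambda t: x(20, t)))
print(rho2(lambda t: x(100, t), lambda t: x(200, t)))
# ... yet pointwise x_n(t) -> 0 for t <= 0 and x_n(t) -> 1 for t > 0:
# a step function, which no continuous limit can match.
```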
The completeness property of certain metric spaces is an essential and important property which we will use and encounter frequently in the remainder of this book. The preceding example demonstrates that not all metric spaces are complete. However, the space {C[a, b]; ρ_2} is a subspace of a larger metric space which is complete. To discuss this complete metric space (i.e., the completion of {C[a, b]; ρ_2}), it is necessary to make use of the Lebesgue theory of measure and integration. For a thorough treatment of this theory, we refer the reader to the texts by Royden [5.9] and Taylor [5.10]. Although knowledge of this theory is not an essential requirement in the development of the subsequent results in this book, we will want to make reference to certain examples of important metric spaces which are defined in terms of the Lebesgue integral. For this reason, we provide the following heuristic comments for those readers who are unfamiliar with this subject.
The Lebesgue measure space on the real numbers, R, consists of the triple {R, 𝔐, μ}, where 𝔐 is a certain family of subsets of R, called the Lebesgue measurable sets in R, and μ is a mapping, μ: 𝔐 → R*, called Lebesgue measure, which may be viewed as a generalization of the concept of length in R. While it is not possible to characterize 𝔐 without providing additional details concerning the Lebesgue theory, it is quite simple to enumerate several important examples of elements in 𝔐. For instance, 𝔐 contains all intervals of the form {x ∈ R: a < x < b}, {x ∈ R: c ≤ x < d}, {x ∈ R: e < x ≤ f}, {x ∈ R: g ≤ x ≤ h}, as well as all countable unions and intersections of such intervals. It is emphasized that 𝔐 does not include all subsets of R. Now if A ∈ 𝔐 is an interval, then the measure of A, μ(A), is the length of A. For example, if A = [a, b], then μ(A) = b − a. Also, if B is a countable union of disjoint intervals, then μ(B) is the sum of the lengths of the disjoint intervals (this sum may be infinite). Of particular interest are subsets of R having measure zero. Essentially, this means it is possible to "cover" the set with a subset of R of arbitrarily small measure. Thus, every subset of R containing at most a countable number of points has Lebesgue measure equal to zero. For example, the set of rational numbers has Lebesgue measure zero. (There are also uncountable subsets of R having Lebesgue measure zero.)
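The covering idea behind "a countable set has measure zero" can be sketched directly: enumerate the rationals in [0, 1] and surround the k-th one with an interval of length ε/2^{k+1}; the total length of the cover stays below ε·Σ 2^{−(k+1)} = ε no matter how many rationals are enumerated. The concrete enumeration below is our choice; any enumeration works for the argument.

```python
from fractions import Fraction

# Enumerate the rationals p/q in [0, 1] with denominators up to max_den.
def enumerate_rationals(max_den):
    seen = []
    for q in range(1, max_den + 1):
        for p in range(0, q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.append(r)
    return seen

eps = Fraction(1, 100)
rats = enumerate_rationals(8)
# Interval of length eps/2**(k+1) around the k-th rational:
total = sum(eps / 2 ** (k + 1) for k in range(len(rats)))
assert total < eps          # total length of the cover stays below eps
print(len(rats), float(total))
```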
In connection with the above discussion, we say that a proposition P(x) is true almost everywhere (abbreviated a.e.) if the set S = {x ∈ R: P(x) is not true} has Lebesgue measure zero. For example, two functions f, g: R → R are said to be equal a.e. if the set S = {x ∈ R: f(x) ≠ g(x)} ∈ 𝔐 and if μ(S) = 0.
Let us now consider the integral of real-valued functions defined on the interval [a, b] ⊂ R. It can be shown that a bounded function f: [a, b] → R is Riemann integrable (where the Riemann integral is denoted, as usual, by ∫_a^b f(x) dx) if and only if f is continuous almost everywhere on [a, b]. The class of Riemann integrable functions with a metric defined in the same manner as in Example 5.5.29 (for continuous functions on [a, b]) is not a complete metric space. However, as pointed out before, it is possible to generalize the concept of integral and make it applicable to a class of functions significantly larger than the class of functions which are continuous a.e. In doing so, we must consider the class of measurable functions. Specifically, a function f: R → R is said to be a Lebesgue measurable function if f^{−1}(U) ∈ 𝔐 for every open set U ⊂ R. Now let f be a Lebesgue measurable function which is bounded on the interval [a, b], let M = sup{f(x): x ∈ [a, b]}, and let m = inf{f(x): x ∈ [a, b]}. In the Lebesgue approach to integration, the range of f is partitioned into intervals. (This is in contrast with the Riemann approach, where the domain of f is partitioned in developing the integral.) Specifically, let us divide the range of f into the n parts specified by m = y_0 < y_1 < ⋯ < y_n = M, let E_k = {x ∈ R: y_{k−1} < f(x) ≤ y_k} for k = 1, …, n, and let c_k be such that y_{k−1} ≤ c_k ≤ y_k for k = 1, …, n. The sum

Σ_{k=1}^n c_k μ(E_k)

approximates the area under the graph of f, and it can serve as the definition of the integral of f between a and b, after an appropriate limiting process has been performed. Provided that this limit exists, it is called the Lebesgue integral of f over [a, b], and it is denoted by ∫_{[a,b]} f dμ. It can be shown that any bounded function f which is Riemann integrable over [a, b] is Lebesgue integrable over [a, b], and furthermore ∫_{[a,b]} f dμ = ∫_a^b f(x) dx. On the other hand, there are functions which are Lebesgue integrable but not Riemann integrable over [a, b]. For example, consider the function f: [a, b] → R defined by f(x) = 0 if x is rational and f(x) = 1 if x is irrational. This function is so erratic that the Riemann integral does not exist in this case. However, since the interval [a, b] = A ∪ B, where A = {x: f(x) = 1} and B = {x: f(x) = 0}, it follows from the preceding characterization of the Lebesgue integral that

∫_{[a,b]} f dμ = 1·μ(A) + 0·μ(B) = b − a.
Let us now consider an important class of complete metric spaces, given in the next example.
5.5.31. Example. Let p ≥ 1 (p not necessarily an integer), let {R, 𝔐, μ} denote the Lebesgue measure space on the real numbers, and let [a, b] be a subset of R. Let ℒ_p[a, b] denote the family of functions f: R → R which are Lebesgue measurable and such that ∫_{[a,b]} |f|^p dμ exists and is finite. We define an equivalence relation ~ on ℒ_p[a, b] by saying that f ~ g if f(x) = g(x) except on a subset of [a, b] having Lebesgue measure zero. Now denote the family of equivalence classes into which ℒ_p[a, b] is divided by L_p[a, b]. Specifically, let us denote the equivalence class [f] = {g ∈ ℒ_p[a, b]: g ~ f} for f ∈ ℒ_p[a, b]. Then L_p[a, b] = {[f]: f ∈ ℒ_p[a, b]}. Now let [f], [g] ∈ L_p[a, b] and define ρ_p: L_p[a, b] × L_p[a, b] → R by

ρ_p([f], [g]) = [∫_{[a,b]} |f − g|^p dμ]^{1/p}.   (5.5.32)

It can be shown that the value of ρ_p([f], [g]) defined by Eq. (5.5.32) is the same for any f and g in the equivalence classes [f] and [g], respectively. Furthermore, ρ_p satisfies all the axioms of a metric, and as such {L_p[a, b]; ρ_p} is a metric space. One of the important results of the Lebesgue theory is that this space is complete.
It is important to note that the right-hand side of Eq. (5.5.32) cannot be used to define a metric on ℒ_p[a, b], since there are functions f ≠ g such that ∫_{[a,b]} |f − g|^p dμ = 0; however, in the literature the distinction between L_p[a, b] and ℒ_p[a, b] is usually suppressed. That is, we usually write f ∈ L_p[a, b] instead of [f] ∈ L_p[a, b], where f ∈ ℒ_p[a, b].
Finally, in the particular case when p = 2, the space {C[a, b]; ρ_2} of Example 5.5.29 is a subspace of the space {L_2[a, b]; ρ_2}.
Before closing the present section we consider some important general properties of complete metric spaces.
5.5.33. Theorem. Let {X; ρ} be a complete metric space, and let {Y; ρ} be a metric subspace of {X; ρ}. Then {Y; ρ} is complete if and only if Y is a closed subset of X.

Proof. Assume that {Y; ρ} is complete. To show that Y is a closed subset of X we must show that Y contains all of its adherent points. Let y be an adherent point of Y; i.e., let y ∈ Ȳ. Then each open sphere S(y; 1/n), n = 1, 2, …, contains at least one point y_n in Y. Since ρ(y_n, y) < 1/n it follows that the sequence {y_n} converges to y. Since {y_n} is a Cauchy sequence in the complete space {Y; ρ}, we have {y_n} converging to a point y′ ∈ Y. But the limit of a sequence of points in a metric space is unique by Theorem 5.5.6. Therefore, y = y′; i.e., y ∈ Y and Y is closed.
Conversely, assume that Y is a closed subset of X. To show that the space {Y; ρ} is complete, let {y_n} be an arbitrary Cauchy sequence in {Y; ρ}. Then {y_n} is a Cauchy sequence in the complete metric space {X; ρ} and as such it has a limit y ∈ X. However, in view of Theorem 5.5.8, part (iii), the closed subset Y of X contains all its adherent points. Therefore, y ∈ Y and {Y; ρ} is complete.
We emphasize that completeness and closure are not necessarily equivalent in arbitrary metric spaces. For example, a metric space is always closed (it is a closed subset of itself), yet it is not necessarily complete.
Before characterizing a complete metric space in an alternate way, we need to introduce the following concept.

5.5.34. Definition. A sequence {S_n} of subsets of a metric space {X; ρ} is called a nested sequence of sets if

S_1 ⊃ S_2 ⊃ S_3 ⊃ ⋯.

We leave the proof of the last result of the present section as an exercise.

5.5.35. Theorem. Let {X; ρ} be a metric space. Then,
(i) {X; ρ} is complete if and only if every sequence of closed nested spheres in {X; ρ} with radii tending to zero has non-void intersection; and
(ii) if {X; ρ} is complete, if {S_n} is a nested sequence of non-empty closed subsets of X, and if lim_{n→∞} diam (S_n) = 0, then the intersection ∩_{n=1}^∞ S_n is not empty; in fact, it consists of a single point.

5.5.36. Exercise. Prove Theorem 5.5.35.
5.6. COMPACTNESS
We recall the Bolzano–Weierstrass theorem from the calculus: Every bounded, infinite subset of the real line (i.e., the set of real numbers with the usual metric) has at least one point of accumulation. Thus, if Y is an arbitrary bounded infinite subset of R, then in view of this theorem we know that any sequence formed from elements of Y has a convergent subsequence. For example, let Y = [0, 2], and let {x_n} be the sequence of real numbers given by

x_n = (1/2)[1 + (−1)ⁿ] + 1/n, n = 1, 2, ….

Then the range of this sequence lies in Y and is thus bounded. Hence, the range has at least one accumulation point. It, in fact, has two.
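A short computation (ours) makes the two accumulation points visible: for x_n = ½[1 + (−1)ⁿ] + 1/n, the odd-indexed terms cluster at 0 and the even-indexed terms at 1.

```python
# The terms x_n = (1 + (-1)**n)/2 + 1/n all lie in Y = [0, 2].
def x(n):
    return (1 + (-1) ** n) / 2 + 1 / n

odd = [x(n) for n in range(1, 2000, 2)]    # cluster near 0
even = [x(n) for n in range(2, 2000, 2)]   # cluster near 1
print(max(abs(t - 0.0) for t in odd[-10:]))
print(max(abs(t - 1.0) for t in even[-10:]))
# Both maxima are tiny: the range of {x_n} has exactly two
# accumulation points, 0 and 1.
```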
A theorem from the calculus which is closely related to the Bolzano–Weierstrass theorem is the Heine–Borel theorem. We need the following terminology.
5.6.1. Definition. Let Y be a set in a metric space {X; ρ}, and let A be an index set. A collection of sets {Y_α: α ∈ A} in {X; ρ} is called a covering of Y if Y ⊂ ∪_{α∈A} Y_α. A subcollection {Y_β: β ∈ B} of the covering {Y_α: α ∈ A}, i.e., B ⊂ A, such that Y ⊂ ∪_{β∈B} Y_β, is called a subcovering of {Y_α: α ∈ A}. If all the members Y_α and Y_β are open sets, then we speak of an open covering and open subcovering. If A is a finite set, then we speak of a finite covering. In general, A may be an uncountable set.
We now recall the Heine–Borel theorem as it applies to subsets of the real line (i.e., of R): Let Y be a closed and bounded subset of R. If {Y_α: α ∈ A} is any family of open sets on the real line which covers Y, then it is possible to find a finite subcovering of sets from {Y_α: α ∈ A}.
Many important properties of the real line follow from the Bolzano–Weierstrass theorem and from the Heine–Borel theorem. In general, these properties cannot be carried over directly to arbitrary metric spaces. The concept of compactness, to be introduced in the present section, will enable us to isolate those metric spaces which possess the Heine–Borel and Bolzano–Weierstrass property.
Because of its close relationship to compactness, we first introduce the concept of total boundedness.
5.6.2. Definition. Let Y be any set in a metric space {X; ρ}, and let ε be an arbitrary positive number. A set S_ε in X is said to be an ε-net for Y if for any point y ∈ Y there exists at least one point s ∈ S_ε such that ρ(s, y) < ε. The ε-net, S_ε, is said to be finite if S_ε contains a finite number of points. A subset Y of X is said to be totally bounded if X contains a finite ε-net for Y for every ε > 0.

Some authors use the terminology ε-dense set for ε-net and precompact for totally bounded sets.
An obvious equivalent characterization of total boundedness is contained in the following result.

5.6.3. Theorem. A subset Y ⊂ X is totally bounded if and only if Y can be covered by a finite number of spheres of radius ε for any ε > 0.

5.6.4. Exercise. Prove Theorem 5.6.3.
In Figure G a pictorial demonstration of the preceding concepts is given.

[Figure: a set Y contained in a set X, with spheres of radius ε about the points of S_ε covering Y.]
5.6.5. Figure G. Total boundedness of a set Y. S_ε is the finite set consisting of the dots within the set Y.

If in this figure the size of ε were decreased, then correspondingly, the number of elements in S_ε would increase. If for arbitrarily small ε the number of elements in S_ε remains finite, then we have a totally bounded set Y.
Total boundedness is a stronger property than boundedness. We leave the proof of the next result as an exercise.

5.6.6. Theorem. Let {X; ρ} be a metric space, and let Y be a subset of X. Then,
(i) if Y is totally bounded, then it is bounded;
(ii) if Y is totally bounded, then its closure Ȳ is totally bounded; and
(iii) if the metric space {X; ρ} is totally bounded, then it is separable.

5.6.7. Exercise. Prove Theorem 5.6.6.
We note, for example, that all finite sets (including the empty set) are totally bounded. Whereas all totally bounded sets are also bounded, the converse does, in general, not hold. We demonstrate this by means of the following example.
5.6.8. Example. Let {l_2; ρ_2} be the metric space defined in Example 5.3.5. Consider the subset Y ⊂ l_2 defined by

Y = {y ∈ l_2: Σ_{i=1}^∞ |η_i|² ≤ 1}.

We show that Y is bounded but not totally bounded. For any x, y ∈ Y, we have by the Minkowski inequality (5.2.7),

ρ_2(x, y) = (Σ_{i=1}^∞ |ξ_i − η_i|²)^{1/2} ≤ (Σ_{i=1}^∞ |ξ_i|²)^{1/2} + (Σ_{i=1}^∞ |η_i|²)^{1/2} ≤ 2.

Thus, Y is bounded. To show that Y is not totally bounded, consider the set of points E = {e_1, e_2, …} ⊂ Y, where e_1 = (1, 0, 0, …), e_2 = (0, 1, 0, …), etc. Then ρ(e_i, e_j) = √2 for i ≠ j. Now suppose there is a finite ε-net for Y for, say, ε = 1/2. Let {s_1, …, s_n} be the net S_ε. Now if e_i is such that ρ(e_i, s_l) < 1/2 for some l, then ρ(e_k, s_l) ≥ ρ(e_k, e_i) − ρ(e_i, s_l) > √2 − 1/2 > 1/2 for k ≠ i. Hence, there can be at most one element of the set E in each sphere S(s_l; 1/2) for l = 1, …, n. Since there are infinitely many points in E and only a finite number of spheres S(s_l; 1/2), this contradicts the fact that S_ε is an ε-net. Hence, there is no finite ε-net for Y for ε = 1/2, and Y is not totally bounded.
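The key fact in the example, ρ(e_i, e_j) = √2 for i ≠ j, can be checked on finite truncations of the unit vectors. The truncation to 50 coordinates is our choice; it is harmless here, since each e_i has a single nonzero entry.

```python
import math

# Finite truncations of the unit vectors e_1, e_2, ... of Example 5.6.8.
def e(i, dim=50):
    v = [0.0] * dim
    v[i] = 1.0
    return v

def rho2(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

dim = 50
dists = [rho2(e(i), e(j)) for i in range(dim) for j in range(i + 1, dim)]
print(min(dists), max(dists))   # every pair sits exactly sqrt(2) apart

# No sphere of radius 1/2 can therefore contain two of the e_i,
# so no finite (1/2)-net for Y exists: Y is not totally bounded.
```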
Let us now consider an example of a totally bounded set.

5.6.9. Example. Let {Rⁿ; ρ} be the metric space defined in Example 5.3.1, and let Y be the subset of Rⁿ defined by Y = {y ∈ Rⁿ: Σ_{i=1}^n η_i² ≤ 1}. Clearly, Y is bounded. To show that Y is totally bounded, we construct an ε-net for Y for an arbitrary ε > 0. To this end, let N be a positive integer such that N > √n/ε, and let S_ε be the set of all n-tuples given by

S_ε = {s = (σ_1, …, σ_n): σ_i = m_i/N, some integer m_i, where −N ≤ m_i ≤ N, i = 1, …, n}.

Then clearly S_ε ⊂ Rⁿ and S_ε is finite. Now for any y = (η_1, …, η_n) ∈ Y, there is an s ∈ S_ε such that |σ_i − η_i| ≤ 1/N for i = 1, …, n. Thus,

ρ(y, s) ≤ [Σ_{i=1}^n (1/N)²]^{1/2} = √n/N < ε.

Therefore, S_ε is a finite ε-net. Since ε is arbitrary, Y is totally bounded.
In general, any bounded subset of {Rⁿ; ρ} is totally bounded.
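The grid construction of Example 5.6.9 can be carried out verbatim for small n. The code below (the concrete choices n = 2, ε = 1/2, and the random spot-check are ours) builds the finite set S_ε and verifies that every sampled point of the unit ball lies within ε of some grid point.

```python
import itertools
import math
import random

# The grid construction of Example 5.6.9, for n = 2 and eps = 1/2.
n, eps = 2, 0.5
N = int(math.sqrt(n) / eps) + 1                  # ensures N > sqrt(n)/eps
grid = [m / N for m in range(-N, N + 1)]         # coordinates m/N, |m| <= N
net = list(itertools.product(grid, repeat=n))    # the finite set S_eps

def dist(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def nearest_gap(y):
    # distance from y to the closest point of the net
    return min(dist(y, s) for s in net)

# Spot-check random points of the unit ball Y = {y : sum y_i^2 <= 1}.
random.seed(0)
for _ in range(100):
    y = [random.uniform(-1.0, 1.0) for _ in range(n)]
    if math.sqrt(sum(t * t for t in y)) <= 1.0:
        assert nearest_gap(y) < eps              # some net point lies within eps

print(len(net))   # the eps-net is finite: (2N + 1)**n points, 49 here
```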
5.6.10. Exercise. Let {l_2; ρ_2} be the metric space defined in Example 5.3.5, and let Y ⊂ l_2 be the subset defined by

Y = {y ∈ l_2: |η_1| ≤ 1, |η_2| ≤ 1/2, …, |η_n| ≤ (1/2)^{n−1}, …}.

Show that Y is totally bounded.
In studying compactness of metric spaces, we will find it convenient to introduce the following concept.

5.6.11. Definition. A metric space {X; ρ} is said to be sequentially compact if every sequence of elements in X contains a subsequence which converges to some element x ∈ X. A set Y in the metric space {X; ρ} is said to be sequentially compact if the subspace {Y; ρ} is sequentially compact; i.e., every sequence in Y contains a subsequence which converges to a point in Y.
5.6.12. Example. Let X = (0, 1], and let ρ be the usual metric on the real line R. Consider the sequence {x_n}, where x_n = 1/n, n = 1, 2, …. This sequence has no subsequence which converges to a point in X, and thus {X; ρ} is not sequentially compact.
We now define compactness.

5.6.13. Definition. A metric space {X; ρ} is said to be compact, or to possess the Heine–Borel property, if every open covering of {X; ρ} contains a finite open subcovering. A set Y in a metric space {X; ρ} is said to be compact if the subspace {Y; ρ} is compact.

Some authors use the term bicompact for Heine–Borel compactness and the term compact for what we call sequentially compact. As we shall see shortly, in the case of metric spaces, compactness and sequential compactness are equivalent, so no confusion should arise.
We will also show that compact metric spaces can equivalently be characterized by means of the Bolzano–Weierstrass property, given by the following.

5.6.14. Definition. A metric space {X; ρ} possesses the Bolzano–Weierstrass property if every infinite subset of X has at least one point of accumulation. A set Y in X possesses the Bolzano–Weierstrass property if the subspace {Y; ρ} possesses the Bolzano–Weierstrass property.
Before setting out on proving the assertions made above, i.e., the equivalence of compactness, sequential compactness, and the Bolzano–Weierstrass property in metric spaces, a few comments concerning some of these concepts may be of benefit.
Informally, we may view a sequentially compact metric space as having such an abundance of elements that no matter how we choose a sequence, there will always be a clustering of an infinite number of points around at least one point in the metric space. A similar interpretation can be made concerning metric spaces which possess the Bolzano–Weierstrass property.
Utilizing the concepts of sequential compactness and total boundedness, we first state and prove the following result.
5.6.15. Theorem. Let {X; ρ} be a metric space, and let Y be a subset of X. The following properties hold:
(i) if Y is sequentially compact, then Y is bounded;
(ii) if Y is sequentially compact, then Y is closed;
(iii) if {X; ρ} is sequentially compact, then {X; ρ} is totally bounded;
(iv) if {X; ρ} is sequentially compact, then {X; ρ} is complete; and
(v) if {X; ρ} is totally bounded and complete, then it is sequentially compact.

Proof. To prove (i), assume that Y is a sequentially compact subset of X and assume, for purposes of contradiction, that Y is unbounded. Then we can construct a sequence {y_n} with elements arbitrarily far apart. Specifically, let y_1 ∈ Y and choose y_2 ∈ Y such that ρ(y_1, y_2) > 1. Next, choose y_3 ∈ Y such that ρ(y_1, y_3) > 1 + ρ(y_1, y_2). Continuing this process, choose y_n ∈ Y such that ρ(y_1, y_n) > 1 + ρ(y_1, y_{n−1}). If m > n, then ρ(y_1, y_m) > 1 + ρ(y_1, y_n), and so ρ(y_m, y_n) ≥ |ρ(y_1, y_m) − ρ(y_1, y_n)| > 1. But this implies that {y_n} contains no convergent subsequence. However, we assumed that Y is sequentially compact; i.e., every sequence in Y contains a convergent subsequence. Therefore, we have arrived at a contradiction. Hence, Y must be bounded. In the above argument we assumed that Y is an infinite set. We note that if Y is a finite set then there is nothing to prove.
To prove part (ii), let Ȳ denote the closure of Y and assume that y ∈ Ȳ. Then there is a sequence of points {y_n} in Y which converges to y, and every subsequence of {y_n} converges to y, by Theorem 5.5.6, part (iv). But, by hypothesis, Y is sequentially compact. Thus, the sequence {y_n} in Y contains a subsequence which converges to some element in Y. Therefore, y ∈ Y and Y is closed.
We now prove part (iii). Let {X; ρ} be a sequentially compact metric space, and let x_1 ∈ X. With ε > 0 fixed we choose, if possible, x_2 ∈ X such that ρ(x_1, x_2) ≥ ε. Next, if possible choose x_3 ∈ X such that ρ(x_1, x_3) ≥ ε and ρ(x_2, x_3) ≥ ε. Continuing this process we have, for every n, ρ(x_n, x_1) ≥ ε, ρ(x_n, x_2) ≥ ε, …, ρ(x_n, x_{n−1}) ≥ ε. We now show that this process must ultimately terminate. Clearly, if {X; ρ} is a bounded metric space then we can pick ε sufficiently large to terminate the process after the first step; i.e., there is no point x_2 ∈ X such that ρ(x_1, x_2) ≥ ε. Now suppose that, in general, the process does not terminate. Then we have constructed a sequence {x_n} such that for any two members x_i, x_j of this sequence, we have ρ(x_i, x_j) ≥ ε. But, by hypothesis, {X; ρ} is sequentially compact, and thus {x_n} contains a subsequence which is convergent to an element in X. Hence, we have arrived at a contradiction and the process must terminate. Using this procedure we now have for arbitrary ε > 0 a finite set of points {x_1, x_2, …, x_l} such that the spheres S(x_n; ε), n = 1, …, l, cover X; i.e., for any ε > 0, X contains a finite ε-net. Therefore, the metric space {X; ρ} is totally bounded.
We now prove part (iv) of the theorem. Let {x_n} be a Cauchy sequence. Then for every ε > 0 there is an integer N such that ρ(x_m, x_n) < ε whenever m, n > N. Since {X; ρ} is sequentially compact, the sequence {x_n} contains a subsequence {x_{k_n}} convergent to a point x ∈ X, so that lim_{n→∞} ρ(x_{k_n}, x) = 0. The sequence {k_n} is an increasing sequence and k_m ≥ m. It now follows that ρ(x_n, x) ≤ ρ(x_n, x_{k_m}) + ρ(x_{k_m}, x) < ε + ρ(x_{k_m}, x) whenever m, n > N. Letting m → ∞, we have 0 ≤ ρ(x_n, x) ≤ ε whenever n > N. Hence, the Cauchy sequence {x_n} converges to x ∈ X. Therefore, X is complete.
Chapter 5 / Metric Spaces

In connection with parts (iv) and (v) we note that a totally bounded metric space is not necessarily sequentially compact. We leave the proof of part (v) as an exercise.
5.6.16. Exercise. Prove part (v) of Theorem 5.6.15.

Parts (iii), (iv), and (v) of the above theorem allow us to define a sequentially compact metric space equivalently as a metric space which is complete and totally bounded. We now show that a metric space is sequentially compact if and only if it satisfies the Bolzano–Weierstrass property.
5.6.17. Theorem. A metric space {X; ρ} is sequentially compact if and only if every infinite subset of X has at least one point of accumulation.

Proof. Assume that Y is an infinite subset of a sequentially compact metric space {X; ρ}. If {y_n} is any sequence of distinct points in Y, then {y_n} contains a convergent subsequence {y_{k_n}}, because {X; ρ} is sequentially compact. The limit of the subsequence is a point of accumulation of Y.
Conversely, assume that {X; ρ} is a metric space such that every infinite subset of X has a point of accumulation. Let {y_n} be any sequence of points in X. If a point occurs an infinite number of times in {y_n}, then this sequence contains a convergent subsequence, a constant subsequence, and we are finished. If this is not the case, then we can assume that all elements of {y_n} are distinct. Let Z denote the set of all points y_n, n = 1, 2, … . By hypothesis, the infinite set Z has at least one point of accumulation. If z ∈ X is such a point of accumulation, then we can choose a sequence of points of Z which converges to z (see Theorem 5.5.8, part (i)), and this sequence is a subsequence {y_{k_n}} of {y_n}. Therefore, {X; ρ} is sequentially compact. This concludes the proof.
Our next objective is to show that in metric spaces the concepts of compactness and sequential compactness are equivalent. In doing so we employ the following lemma, the proof of which is left as an exercise.

5.6.18. Lemma. Let {X; ρ} be a sequentially compact metric space. If {U_α : α ∈ A} is an infinite open covering of {X; ρ}, then there exists a number ε > 0 such that every sphere in X of radius ε is contained in at least one of the open sets U_α.

5.6.19. Exercise. Prove Lemma 5.6.18.

5.6.20. Theorem. A metric space {X; ρ} is compact if and only if it is sequentially compact.
Proof. From Theorem 5.6.17, a metric space is sequentially compact if and only if it has the Bolzano–Weierstrass property. Therefore, we first show that every infinite subset of a compact metric space has a point of accumulation.
Let {X; ρ} be a compact metric space, and let Y be an infinite subset of X. For purposes of contradiction, assume that Y has no point of accumulation. Then each x ∈ X is the center of a sphere which contains no point of Y, except possibly x itself. These spheres form an infinite open covering of X. But, by hypothesis, {X; ρ} is compact, and therefore we can choose from this infinite covering a finite number of spheres which also cover X. Now each sphere from this finite subcovering contains at most one point of Y, and therefore Y is finite. But this is contrary to our original assumption, and we have arrived at a contradiction. Therefore, Y has at least one point of accumulation, and {X; ρ} is sequentially compact.
Conversely, assume that {X; ρ} is a sequentially compact metric space, and let {U_α : α ∈ A} be an arbitrary infinite open covering of X. From Lemma 5.6.18 there exists an ε > 0 such that every sphere in X of radius ε is contained in at least one of the open sets U_α. Now, by hypothesis, {X; ρ} is sequentially compact and is therefore totally bounded by part (iii) of Theorem 5.6.15. Thus, with this ε fixed we can find a finite ε-net, {x₁, …, x_l}, such that X ⊂ ⋃_{i=1}^{l} S(x_i; ε). Now in view of Lemma 5.6.18, S(x_i; ε) ⊂ U_{α_i}, i = 1, …, l, where the sets U_{α_i} are from the family {U_α : α ∈ A}. Hence,
X ⊂ ⋃_{i=1}^{l} U_{α_i},
and X has a finite open subcovering chosen from the infinite open covering {U_α : α ∈ A}. Therefore, the metric space {X; ρ} is compact. This proves the theorem.
There is yet another way of characterizing a compact metric space. Before doing so, we give the following definition.

5.6.21. Definition. Let {F_α : α ∈ A} be an infinite family of closed sets. The family {F_α : α ∈ A} is said to have the finite intersection property if for every finite set B ⊂ A the set ⋂_{α∈B} F_α is not empty.

5.6.22. Theorem. A metric space {X; ρ} is compact if and only if every infinite family {F_α : α ∈ A} of closed sets in X with the finite intersection property has a nonvoid intersection; i.e., ⋂_{α∈A} F_α ≠ ∅.

5.6.23. Exercise. Prove Theorem 5.6.22.

We now summarize the above results as follows.
5.6.24. Theorem. In a metric space {X; ρ} the following are equivalent:
(i) {X; ρ} is compact;
(ii) {X; ρ} is sequentially compact;
(iii) {X; ρ} possesses the Bolzano–Weierstrass property;
(iv) {X; ρ} is complete and totally bounded; and
(v) every infinite family of closed sets in {X; ρ} with the finite intersection property has a nonvoid intersection.
Concerning product spaces we offer the following exercise.

5.6.25. Exercise. Let {X₁; ρ₁}, {X₂; ρ₂}, …, {X_n; ρ_n} be n compact metric spaces. Let X = X₁ × X₂ × … × X_n, and let
ρ(x, y) = ρ₁(x₁, y₁) + … + ρ_n(x_n, y_n),   (5.6.26)
where x_i, y_i ∈ X_i, i = 1, …, n, and where x, y ∈ X. Show that the product space {X; ρ} is also a compact metric space.
The next result constitutes an important characterization of compact sets in the spaces Rⁿ and Cⁿ.

5.6.27. Theorem. Let {Rⁿ; ρ} (let {Cⁿ; ρ}) be the metric space defined in Example 5.3.1. A set Y ⊂ Rⁿ (a set Y ⊂ Cⁿ) is compact if and only if it is closed and bounded.

5.6.28. Exercise. Prove Theorem 5.6.27.

Recall that every nonvoid compact set in the real line R contains its infimum and its supremum.
In general, it is not an easy task to apply the results of Theorem 5.6.24 to specific spaces in order to establish necessary and sufficient conditions for compactness. From the point of view of applications, criteria such as those established in Theorem 5.6.27 are much more desirable.
We now give a condition which tells us when a subset of a metric space is compact. We have:
5.6.29. Theorem. Let {X; ρ} be a compact metric space, and let Y ⊂ X. If Y is closed, then Y is compact.

Proof. Let {U_α : α ∈ A} be any open covering of Y; i.e., each U_α is open relative to {Y; ρ}. Then, by Theorem 5.4.20, for each U_α there is a V_α which is open relative to {X; ρ} such that U_α = Y ∩ V_α. Since Y is closed, X − Y is an open set in {X; ρ}. Also, since Y ⊂ ⋃_α V_α, the family {X − Y, V_α : α ∈ A} is an open covering of X. Since X is compact, it is possible to find a finite subcovering from this family; i.e., there is a finite set B ⊂ A such that
X ⊂ (X − Y) ∪ ⋃_{α∈B} V_α.
Since Y ⊂ ⋃_{α∈B} V_α, we have Y = ⋃_{α∈B} (Y ∩ V_α); i.e., {U_α : α ∈ B} covers Y. This implies that Y is compact.
We close the present section by introducing the concept of relative compactness.

5.6.30. Definition. Let {X; ρ} be a metric space and let Y ⊂ X. The subset Y is said to be relatively compact in X if Ȳ is a compact subset of X.

One of the essential features of a relatively compact set is that every sequence has a convergent subsequence, just as in the case of compact subsets; however, the limit of the subsequence need not be in the subset. Thus, we have the following result.

5.6.31. Theorem. Let {X; ρ} be a metric space and let Y ⊂ X. Then Y is relatively compact in X if and only if every sequence of elements in Y contains a subsequence which converges to some x ∈ X.

Proof. Let Y be relatively compact in X, and let {y_n} be any sequence in Y. Then {y_n} belongs to Ȳ also and hence has a convergent subsequence in Ȳ, since Ȳ is sequentially compact. Hence, {y_n} contains a subsequence which converges to an element x ∈ X.
Conversely, let {y_n} be a sequence in Ȳ. Then for each n = 1, 2, …, there is an x_n ∈ Y such that ρ(x_n, y_n) < 1/n. Since {x_n} is a sequence in Y, it contains a convergent subsequence, say {x_{n_k}}, which converges to some x ∈ X. Since {y_{n_k}} also converges to x, it follows from part (iii) of Theorem 5.5.8 that x ∈ Ȳ. Hence, Ȳ is sequentially compact, and so Y is relatively compact in X.
5.7. CONTINUOUS FUNCTIONS

Having introduced the concept of metric space, we are in a position to give a generalization of the concept of continuity of functions encountered in calculus.

5.7.1. Definition. Let {X; ρ_x} and {Y; ρ_y} be two metric spaces, and let f: X → Y be a mapping of X into Y. The mapping f is said to be continuous at the point x₀ ∈ X if for every ε > 0 there is a δ > 0 such that
ρ_y(f(x), f(x₀)) < ε
whenever ρ_x(x, x₀) < δ. The mapping f is said to be continuous on X or simply continuous if it is continuous at each point x ∈ X.
We note that in the above definition δ is dependent on the choice of x₀ and ε; i.e., δ = δ(ε, x₀). Now if for each ε > 0 there exists a δ = δ(ε) > 0 such that for any x₀ we have ρ_y(f(x), f(x₀)) < ε whenever ρ_x(x, x₀) < δ, then we say that the function f is uniformly continuous on X. Henceforth, if we simply say f is continuous, we mean f is continuous on X.
5.7.2. Example. Let {X; ρ_x} = Rⁿ, and let {Y; ρ_y} = Rᵐ (see Example 5.3.1). Let A denote the real m × n matrix
A =
⎡ a₁₁  a₁₂  …  a₁ₙ ⎤
⎢ a₂₁  a₂₂  …  a₂ₙ ⎥
⎢  ⋮    ⋮        ⋮  ⎥
⎣ a_{m1} a_{m2} … a_{mn} ⎦.
We denote x ∈ Rⁿ and y ∈ Rᵐ by x = (ξ₁, ξ₂, …, ξₙ) and y = (η₁, η₂, …, η_m). Let us define the function f: Rⁿ → Rᵐ by
y = f(x) = Ax
for each x ∈ Rⁿ. We now show that f is continuous on Rⁿ. If x, x₀ ∈ Rⁿ and y, y₀ ∈ Rᵐ are such that y = f(x) and y₀ = f(x₀), then η_i = Σ_{j=1}^{n} a_{ij} ξ_j and η_i⁰ = Σ_{j=1}^{n} a_{ij} ξ_j⁰, i = 1, …, m, and
[ρ_y(y, y₀)]² = Σ_{i=1}^{m} [Σ_{j=1}^{n} a_{ij}(ξ_j − ξ_j⁰)]².
Using the Schwarz inequality, it follows that
[ρ_y(y, y₀)]² ≤ [Σ_{i=1}^{m} Σ_{j=1}^{n} a_{ij}²][Σ_{j=1}^{n} (ξ_j − ξ_j⁰)²].
Now let M = [Σ_{i=1}^{m} Σ_{j=1}^{n} a_{ij}²]^{1/2} > 0 (if M = 0 then we are done). Given any ε > 0 and choosing δ = ε/M, it follows that ρ_y(y, y₀) < ε whenever ρ_x(x, x₀) < δ. Hence f is continuous on Rⁿ, and any mapping f: Rⁿ → Rᵐ which is represented by a real, constant m × n matrix A is continuous on Rⁿ.
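The Schwarz-inequality bound in Example 5.7.2 can be spot-checked numerically. The following sketch is not from the text; the matrix entries and test points are arbitrary random choices, used only to verify that ρ_y(f(x), f(x₀)) ≤ M ρ_x(x, x₀) with M = [Σ_i Σ_j a_{ij}²]^{1/2}:

```python
import math
import random

def rho(u, v):
    # Euclidean metric on R^n (Example 5.3.1)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def matvec(A, x):
    # f(x) = Ax, computed row by row
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def frobenius(A):
    # M = (sum_i sum_j a_ij^2)^(1/2), the constant from the Schwarz inequality
    return math.sqrt(sum(a ** 2 for row in A for a in row))

random.seed(0)
A = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
M = frobenius(A)
for _ in range(1000):
    x = [random.uniform(-10, 10) for _ in range(4)]
    x0 = [random.uniform(-10, 10) for _ in range(4)]
    # rho_y(f(x), f(x0)) <= M * rho_x(x, x0), with a small floating-point slack
    assert rho(matvec(A, x), matvec(A, x0)) <= M * rho(x, x0) + 1e-9
```

The bound holds for every pair of test points, as the argument in the example guarantees; M plays the role of a uniform Lipschitz constant, which is why δ = ε/M works independently of x₀.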
5.7.3. Example. Let {X; ρ_x} = {Y; ρ_y} = {C[a, b]; ρ₂}, the metric space defined in Example 5.3.12, and let us define a function f: X → Y in the following way. For x ∈ X, y = f(x) is given by
y(t) = ∫_a^b k(t, s) x(s) ds, t ∈ [a, b],
where k: R² → R is continuous in the usual sense, i.e., with respect to the metric spaces R² and R¹. We now show that f is continuous on X. Let x, x₀ ∈ X and y, y₀ ∈ Y be such that y = f(x) and y₀ = f(x₀). Then
[ρ_y(y, y₀)]² = ∫_a^b [∫_a^b k(t, s)[x(s) − x₀(s)] ds]² dt.
It follows from Hölder's inequality for integrals (5.2.5) that
ρ_y(y, y₀) ≤ M ρ_x(x, x₀),
where M = [∫_a^b ∫_a^b k²(t, s) ds dt]^{1/2}. Hence, for any ε > 0, ρ_y(y, y₀) < ε whenever ρ_x(x, x₀) < δ, where δ = ε/M.
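A discrete analogue of this estimate can be checked numerically. In the sketch below (not from the text), the integrals are replaced by midpoint Riemann sums; the kernel k(t, s) = cos(t + s) and the functions x, x₀ are arbitrary illustrative choices. For the discretized operator the Hölder (Cauchy–Schwarz) bound holds exactly:

```python
import math

def l2_metric(u, v, h):
    # discrete version of rho_2(x, y) = (integral of |x(t)-y(t)|^2 dt)^(1/2)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) * h)

a, b, n = 0.0, 1.0, 200
h = (b - a) / n
ts = [a + (i + 0.5) * h for i in range(n)]       # midpoint grid on [a, b]

k = lambda t, s: math.cos(t + s)                 # an assumed continuous kernel
x = [math.sin(3 * t) for t in ts]                # arbitrary test functions
x0 = [t * t for t in ts]

def f(u):
    # y(t) = integral over [a, b] of k(t, s) u(s) ds, via a midpoint sum
    return [sum(k(t, s) * us * h for s, us in zip(ts, u)) for t in ts]

# M = (double integral of k^2)^(1/2), discretized the same way
M = math.sqrt(sum(k(t, s) ** 2 * h * h for t in ts for s in ts))
lhs = l2_metric(f(x), f(x0), h)
rhs = M * l2_metric(x, x0, h)
assert lhs <= rhs + 1e-9
```

The discrete inequality is just Cauchy–Schwarz applied row by row to the quadrature matrix, which mirrors the proof in the example.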
5.7.4. Example. Consider the metric space {C[a, b]; ρ_∞} defined in Example 5.3.14. Let C¹[a, b] be the subset of C[a, b] of all functions having continuous first derivatives on [a, b], and let {X; ρ_x} be the metric subspace {C¹[a, b]; ρ_∞}. Let {Y; ρ_y} = {C[a, b]; ρ_∞}, and define the function f: X → Y as follows. For x ∈ X, y = f(x) is given by
y(t) = dx(t)/dt.
To show that f is not continuous, we show that for any δ > 0 there is a pair x, x₀ ∈ X such that ρ_x(x, x₀) < δ but ρ_y(f(x), f(x₀)) ≥ 1. Let x₀(t) = 0 for all t ∈ [a, b], and let x(t) = λ sin ωt, λ > 0, ω > 0. Then ρ_x(x, x₀) ≤ λ. Now if y₀ = f(x₀) and y = f(x), then y₀(t) = 0 for all t ∈ [a, b] and y(t) = λω cos ωt. Hence, ρ_y(y₀, y) = λω, provided that ω is sufficiently large, i.e., so that |cos ωt| = 1 for some t ∈ [a, b]. Now no matter what value of δ we choose, there is an x ∈ X such that ρ_x(x, x₀) < δ if we pick λ < δ. However, ρ_y(y, y₀) ≥ 1 if we let ω = 1/λ. Therefore, f is not continuous on X.
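The failure of continuity in Example 5.7.4 can be observed numerically: as λ shrinks with ω = 1/λ, the functions x approach x₀ = 0 in the ρ_∞ metric while their derivatives stay at unit distance from zero. A sketch (not from the text; the interval [0, 2π] and the sample grid are arbitrary choices):

```python
import math

def sup_metric(u, v):
    # rho_inf on sampled functions: the maximum pointwise difference
    return max(abs(a - b) for a, b in zip(u, v))

ts = [i / 1000 * 2 * math.pi for i in range(1001)]   # grid on [0, 2*pi]
x0 = [0.0] * len(ts)                                 # x0(t) = 0

for lam in (1e-1, 1e-3, 1e-5):
    omega = 1.0 / lam
    x = [lam * math.sin(omega * t) for t in ts]              # x(t) = lam*sin(omega*t)
    dx = [lam * omega * math.cos(omega * t) for t in ts]     # f(x)(t) = lam*omega*cos(omega*t)
    # rho_x(x, x0) <= lam shrinks to zero...
    assert sup_metric(x, x0) <= lam + 1e-12
    # ...but rho_y(f(x), f(x0)) stays near 1 (it equals 1 at t = 0)
    assert sup_metric(dx, x0) > 0.9
```

No matter how small λ is, the derivative sequence never approaches f(x₀), which is exactly the ε = 1 failure exhibited in the example.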
We can interpret the notion of continuity of functions in the following equivalent way.

5.7.5. Theorem. Let {X; ρ_x} and {Y; ρ_y} be metric spaces, and let f: X → Y. Then f is continuous at a point x₀ ∈ X if and only if for every ε > 0 there exists a δ > 0 such that
f(S(x₀; δ)) ⊂ S(f(x₀); ε).

5.7.6. Exercise. Prove Theorem 5.7.5.

Intuitively, Theorem 5.7.5 tells us that f is continuous at x₀ if f(x) is arbitrarily close to f(x₀) when x is sufficiently close to x₀. The concept of continuity is depicted in Figure 5.7.7 for the case where {X; ρ_x} = {Y; ρ_y} = R.

5.7.7. Figure. Illustration of continuity. [Figure not reproduced.]
As we did in Chapter 1, we distinguish between mappings on metric spaces which are injective, surjective, or bijective.
It turns out that the concepts of continuity and convergence of sequences are related. Our next result yields a connection between convergence and continuity.
5.7.8. Theorem. Let {X; ρ_x} and {Y; ρ_y} be two metric spaces. A function f: X → Y is continuous at a point x₀ ∈ X if and only if for every sequence {x_n} of points in X which converges to a point x₀, the corresponding sequence {f(x_n)} converges to the point f(x₀) in Y; i.e.,
lim_n f(x_n) = f(lim_n x_n) = f(x₀)
whenever lim_n x_n = x₀.
Proof. Assume that f is continuous at a point x₀ ∈ X, and let {x_n} be a sequence such that lim_n x_n = x₀. Then for every ε > 0 there is a δ > 0 such that ρ_y(f(x), f(x₀)) < ε whenever ρ_x(x, x₀) < δ. Also, there is an N such that ρ_x(x_n, x₀) < δ whenever n > N. Hence, ρ_y(f(x_n), f(x₀)) < ε whenever n > N. Thus, if f is continuous at x₀ and if lim_n x_n = x₀, then lim_n f(x_n) = f(x₀).
Conversely, assume that f(x_n) → f(x₀) whenever x_n → x₀. For purposes of contradiction, assume that f is not continuous at x₀. Then there exists an ε > 0 such that for each δ > 0 there is an x with the property that ρ_x(x, x₀) < δ and ρ_y(f(x), f(x₀)) ≥ ε. This implies that for each positive integer n there is an x_n such that ρ_x(x_n, x₀) < 1/n and ρ_y(f(x_n), f(x₀)) ≥ ε for all n; i.e., x_n → x₀ but {f(x_n)} does not converge to f(x₀). But we assumed that f(x_n) → f(x₀) whenever x_n → x₀. Hence, we have arrived at a contradiction, and f must be continuous at x₀. This concludes the proof of our theorem.
Continuous mappings on metric spaces possess the following important properties.

5.7.9. Theorem. Let {X; ρ_x} and {Y; ρ_y} be two metric spaces, and let f be a mapping of X into Y. Then
(i) f is continuous on X if and only if the inverse image of each open subset of {Y; ρ_y} is open in {X; ρ_x}; and
(ii) f is continuous on X if and only if the inverse image of each closed subset of {Y; ρ_y} is closed in {X; ρ_x}.

Proof. Let f be continuous on X, and let V ≠ ∅ be an open subset of {Y; ρ_y}. Let U = f⁻¹(V), and assume U is nonempty (if U = ∅ there is nothing to prove, since ∅ is open). Now let x ∈ U. Then there exists a unique y = f(x) ∈ V. Since V is open, there is a sphere S(y; ε) which is entirely contained in V. Since f is continuous at x, there is a sphere S(x; δ) such that its image f(S(x; δ)) is entirely contained in S(y; ε) and therefore in V. But from this it follows that S(x; δ) ⊂ U. Hence, every x ∈ U is the center of a sphere which is contained in U. Therefore, U is open.
Conversely, assume that the inverse image of each nonempty open subset of Y is open. For arbitrary x ∈ X we have y = f(x). Since S(y; ε) ⊂ Y is open, the set f⁻¹(S(y; ε)) is open for every ε > 0, and x ∈ f⁻¹(S(y; ε)). Hence, there is a sphere S(x; δ) such that S(x; δ) ⊂ f⁻¹(S(y; ε)). From this it follows that for every ε > 0 there is a δ > 0 such that f(S(x; δ)) ⊂ S(y; ε). Therefore, f is continuous at x. But x ∈ X was arbitrarily chosen. Hence, f is continuous on X. This concludes the proof of part (i).
To prove part (ii) we utilize part (i) and take complements of open sets.

The reader is cautioned that the image of an open subset of X under a continuous mapping f: X → Y is not necessarily an open subset of Y. For example, let f: R → R be defined by f(x) = x² for every x ∈ R. Clearly, f is continuous on R. Yet the image of the open interval (−1, 1) is the interval [0, 1). But the interval [0, 1) is not open.
We leave the proof of the next result as an exercise to the reader.

5.7.10. Theorem. Let {X; ρ_x}, {Y; ρ_y}, and {Z; ρ_z} be metric spaces, let f be a mapping of X into Y, and let g be a mapping of Y into Z. If f is continuous on X and g is continuous on Y, then the composite mapping h = g ∘ f of X into Z is continuous on X.

5.7.11. Exercise. Prove Theorem 5.7.10.

For continuous mappings on compact spaces we state and prove the following result.
5.7.12. Theorem. Let {X; ρ_x} and {Y; ρ_y} be two metric spaces, and let f: X → Y be continuous on X.
(i) If {X; ρ_x} is compact, then f(X) is a compact subset of {Y; ρ_y}.
(ii) If U is a compact subset of the metric space {X; ρ_x}, then f(U) is a compact subset of the metric space {Y; ρ_y}.
(iii) If {X; ρ_x} is compact and if U is a closed subset of X, then f(U) is a closed subset of {Y; ρ_y}.
(iv) If {X; ρ_x} is compact, then f is uniformly continuous on X.

Proof. To prove part (i), let {y_n} be a sequence in f(X). Then there are points {x_n} in X such that y_n = f(x_n). Since {X; ρ_x} is compact we can find a subsequence {x_{n_k}} of {x_n} which converges to a point x in X; i.e., x_{n_k} → x. In view of Theorem 5.7.8 we have, since f is continuous at x, f(x_{n_k}) → f(x) ∈ f(X). From this it follows that the sequence {y_n} has a convergent subsequence, and f(X) is compact.
To prove part (ii), let U be a compact subset of X. Then {U; ρ_x} is a compact metric space. In view of part (i) it now follows that f(U) is also a compact subset of the metric space {Y; ρ_y}.
To prove part (iii), we first observe that a closed subset U of a compact metric space {X; ρ_x} is itself compact, and {U; ρ_x} is itself a compact metric space. In view of part (ii), f(U) is a compact subset of the metric space {Y; ρ_y} and as such is bounded and closed.
To prove part (iv), let ε > 0. For every x ∈ X there is some positive number η(x) such that f(S(x; 2η(x))) ⊂ S(f(x); ε/2). Now the family {S(x; η(x)) : x ∈ X} is an open covering of X. Since X is compact, there is a finite set, say F ⊂ X, such that {S(x′; η(x′)) : x′ ∈ F} is a covering of X. Now let
δ = min {η(x′) : x′ ∈ F}.
Since F is a finite set, δ is some positive number. Now let x, y ∈ X be such that ρ_x(x, y) < δ. Choose x′ ∈ F such that x ∈ S(x′; η(x′)). Since δ ≤ η(x′), y ∈ S(x′; 2η(x′)). Since f(S(x′; 2η(x′))) ⊂ S(f(x′); ε/2), it follows that f(x) and f(y) are in S(f(x′); ε/2). Hence, ρ_y(f(x), f(y)) < ε. Since δ does not depend on x ∈ X, f is uniformly continuous on X. This completes the proof of the theorem.
Let us next consider some additional generalizations of concepts encountered in the calculus.

5.7.13. Definition. Let {X; ρ_x} and {Y; ρ_y} be metric spaces, and let {f_n} be a sequence of functions from X into Y. If {f_n(x)} converges at each x ∈ X, then we say that {f_n} is pointwise convergent. In this case we write lim_n f_n = f, where f is defined for every x ∈ X.

Equivalently, we say that the sequence {f_n} is pointwise convergent to a function f if for every ε > 0 and for every x ∈ X there is an integer N = N(ε, x) such that
ρ_y(f_n(x), f(x)) < ε
whenever n > N(ε, x). In general, N(ε, x) is not necessarily bounded. However, if N(ε, x) is bounded for all x ∈ X, then we say that the sequence {f_n} converges to f uniformly on X. In this case let M(ε) = sup_{x∈X} N(ε, x) < ∞. Equivalently, we say that the sequence {f_n} converges uniformly to f on X if for every ε > 0 there is an M(ε) such that
ρ_y(f_n(x), f(x)) < ε
whenever n > M(ε), for all x ∈ X.
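The difference between pointwise and uniform convergence can be illustrated with the classical sequence f_n(x) = xⁿ on [0, 1], which is not an example from the text. On [0, 1] the convergence is only pointwise, since N(ε, x) grows without bound as x → 1; on [0, 1/2] the same sequence converges uniformly:

```python
def f(n, x):
    return x ** n

limit = lambda x: 1.0 if x == 1.0 else 0.0
eps = 1e-3

def N(x):
    # smallest n with |f_n(x) - limit(x)| < eps, i.e., N(eps, x) at this point
    n = 1
    while abs(f(n, x) - limit(x)) >= eps:
        n += 1
    return n

# pointwise: N(eps, x) exists at each x, but grows without bound as x -> 1
Ns = [N(x) for x in (0.5, 0.9, 0.99, 0.999)]
assert Ns == sorted(Ns)

# hence convergence is not uniform on [0, 1): the sup error stays near 1
grid = [i / 1000 for i in range(1000)]
assert max(abs(f(50, x) - limit(x)) for x in grid) > 0.9

# on [0, 1/2] the convergence is uniform: M(eps) = N(eps, 1/2) works for all x
assert all(abs(f(N(0.5), x) - limit(x)) < eps for x in grid if x <= 0.5)
```

This also previews Theorem 5.7.14: each f_n is continuous, the pointwise limit on [0, 1] is discontinuous at x = 1, and the convergence there is indeed not uniform.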
In the next result a connection between uniform convergence of functions and continuity is established. (We used a special case of this result in the proof of Example 5.5.28.)

5.7.14. Theorem. Let {X; ρ_x} and {Y; ρ_y} be two metric spaces, and let {f_n} be a sequence of functions from X into Y such that f_n is continuous on X for each n. If the sequence {f_n} converges uniformly to f on X, then f is continuous on X.

Proof. Assume that the sequence {f_n} converges uniformly to f on X. Then for every ε > 0 there is an N such that ρ_y(f_n(x), f(x)) < ε whenever n > N, for all x ∈ X. If M > N is a fixed integer, then f_M is continuous on X. Letting x₀ ∈ X be fixed, we can find a δ > 0 such that ρ_y(f_M(x), f_M(x₀)) < ε whenever ρ_x(x, x₀) < δ. Therefore, we have
ρ_y(f(x), f(x₀)) ≤ ρ_y(f(x), f_M(x)) + ρ_y(f_M(x), f_M(x₀)) + ρ_y(f_M(x₀), f(x₀)) < 3ε
whenever ρ_x(x, x₀) < δ. From this it follows that f is continuous at x₀. Since x₀ was arbitrarily chosen, f is continuous at all x ∈ X. This proves the theorem.
The reader will recognize in the last result of the present section several generalizations from the calculus to real-valued functions defined on metric spaces.

5.7.15. Theorem. Let {X; ρ_x} be a metric space, and let {R; ρ} denote the real line R with the usual metric. Let f: X → R, and let U ⊂ X. If f is continuous on X and if U is a compact subset of {X; ρ_x}, then
(i) f is uniformly continuous on U;
(ii) f is bounded on U; and
(iii) if U ≠ ∅, f attains its infimum and supremum on U; i.e., there exist x₀, x₁ ∈ U such that f(x₀) = inf {f(x) : x ∈ U} and f(x₁) = sup {f(x) : x ∈ U}.

Proof. Part (i) follows from part (iv) of Theorem 5.7.12. Since U is a compact subset of X, it follows that f(U) is a compact subset of R. Thus, f(U) is bounded and closed. From this it follows that f is bounded on U. To prove part (iii), note that if U is a nonempty compact subset of {X; ρ_x}, then f(U) is a nonempty compact subset of R. This implies that f attains its infimum and supremum on U.
5.8. SOME IMPORTANT RESULTS IN APPLICATIONS

In this section we present two results which are used widely in applications. The first of these is called the fixed point principle, while the second is known as the Arzelà–Ascoli theorem. Both of these results are widely utilized, for example, in establishing existence and uniqueness of solutions of various types of equations (ordinary differential equations, integral equations, algebraic equations, functional differential equations, and the like).
We begin by considering a special class of continuous mappings on metric spaces, so-called contraction mappings.

5.8.1. Definition. Let {X; ρ} be a metric space and let f: X → X. The function f is said to be a contraction mapping if there exists a real number c such that 0 ≤ c < 1 and
ρ(f(x), f(y)) ≤ c ρ(x, y)   (5.8.2)
for all x, y ∈ X.

The reader can readily verify the following result.

5.8.3. Theorem. Every contraction mapping is uniformly continuous on X.
5.8.4. Exercise. Prove Theorem 5.8.3.

The following result is known as the fixed point principle or the principle of contraction mappings.

5.8.5. Theorem. Let {X; ρ} be a complete metric space, and let f be a contraction mapping of X into X. Then
(i) there exists a unique point x₀ ∈ X such that
f(x₀) = x₀;   (5.8.6)
and
(ii) for any x₁ ∈ X, the sequence {x_n} in X defined by
x_{n+1} = f(x_n), n = 1, 2, …   (5.8.7)
converges to the unique element x₀ given in (5.8.6).

The unique point x₀ satisfying Eq. (5.8.6) is called a fixed point of f. In this case we say that x₀ is obtained by the method of successive approximations.
Proof. We first show that if there is an x₀ ∈ X satisfying (5.8.6), then it must be unique. Suppose that x₀ and y₀ satisfy (5.8.6). Then by inequality (5.8.2) we have ρ(x₀, y₀) ≤ c ρ(x₀, y₀). Since 0 ≤ c < 1, it follows that ρ(x₀, y₀) = 0 and therefore x₀ = y₀.
Now let x₁ be any point in X. We want to show that the sequence {x_n} generated by Eq. (5.8.7) is a Cauchy sequence. For any n > 1 we have ρ(x_{n+1}, x_n) ≤ c ρ(x_n, x_{n−1}). By induction we see that ρ(x_{n+1}, x_n) ≤ c^{n−1} ρ(x₂, x₁) for n = 1, 2, … . Thus, for any m > n we have
ρ(x_m, x_n) ≤ Σ_{k=n}^{m−1} ρ(x_{k+1}, x_k) ≤ [c^{n−1} + c^n + … + c^{m−2}] ρ(x₂, x₁) ≤ (c^{n−1}/(1 − c)) ρ(x₂, x₁).
Since 0 ≤ c < 1, the right-hand side of the above inequality can be made arbitrarily small by choosing n sufficiently large. Thus, {x_n} is a Cauchy sequence.
Next, since {X; ρ} is complete, it follows that {x_n} converges; i.e., lim_n x_n exists. Let lim_n x_n = x. Now since f is continuous on X, we have
lim_n f(x_n) = f(lim_n x_n).
But f(lim_n x_n) = f(x) and lim_n f(x_n) = lim_n x_{n+1} = x. Thus, f(x) = x, and we have proven the existence of a fixed point of f. Since we have already proven uniqueness, the proof is complete.
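The method of successive approximations is directly computable, and the bound c^{n−1}/(1 − c) ρ(x₂, x₁) from the proof supplies a stopping criterion. A minimal sketch (not from the text; the particular map f(x) = cos(x)/2 + 1 is an arbitrary contraction on R with constant c = 1/2, since |f′(x)| = |sin x|/2 ≤ 1/2):

```python
import math

def fixed_point(f, x1, c, tol=1e-12, max_iter=1000):
    """Successive approximations x_{n+1} = f(x_n) for a contraction f.

    c is the contraction constant from (5.8.2); the a priori estimate
    rho(x_m, x_n) <= c**(n-1) / (1 - c) * rho(x_2, x_1) from the proof
    bounds the remaining distance to the fixed point, so we stop once
    that bound falls below tol.
    """
    x2 = f(x1)
    d12 = abs(x2 - x1)
    x = x2
    for n in range(2, max_iter + 1):
        x_next = f(x)
        if c ** (n - 1) / (1 - c) * d12 < tol:
            return x_next
        x = x_next
    return x

f = lambda x: 0.5 * math.cos(x) + 1.0
x0 = fixed_point(f, 0.0, 0.5)
assert abs(f(x0) - x0) < 1e-10        # x0 is (numerically) a fixed point

# uniqueness: iteration from a very different starting point reaches the same x0
assert abs(fixed_point(f, 100.0, 0.5) - x0) < 1e-9
```

The second assertion illustrates part (ii) of the theorem: the starting point x₁ is arbitrary, and every choice generates a sequence converging to the one fixed point.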
It may turn out that the composite function f^n ≜ f ∘ f ∘ … ∘ f (n times) is a contraction mapping, whereas f is not. The following result shows that such a mapping still has a unique fixed point.

5.8.8. Corollary. Let {X; ρ} be a complete metric space, and let f: X → X be continuous on X. If the composite function f^n = f ∘ f ∘ … ∘ f is a contraction mapping, then there is a unique point x₀ ∈ X such that
f(x₀) = x₀.   (5.8.9)
Moreover, the fixed point x₀ can be determined by the method of successive approximations (see Theorem 5.8.5).

5.8.10. Exercise. Prove Corollary 5.8.8.

We will consider several applications of the above results in the last section of this chapter.
Before we can consider the Arzelà–Ascoli theorem, we need to introduce the following concept.
5.8.11. Definition. Let C[a, b] denote the set of all continuous real-valued functions defined on the interval [a, b] of the real line R. A subset Y of C[a, b] is said to be equicontinuous on [a, b] if for every ε > 0 there exists a δ > 0 such that |x(t) − x(t₀)| < ε for all x ∈ Y and all t, t₀ ∈ [a, b] such that |t − t₀| < δ.

Note that in this definition δ depends only on ε, and not on x or on t and t₀.
We now state and prove the Arzelà–Ascoli theorem.

5.8.12. Theorem. Let {C[a, b]; ρ_∞} be the metric space defined in Example 5.3.14. Let Y be a bounded subset of C[a, b]. If Y is equicontinuous on [a, b], then Y is relatively compact in C[a, b].

Proof. For each positive integer k, let us divide the interval [a, b] into k equal parts by the set of points V_k = {t_{0k}, t_{1k}, …, t_{kk}} ⊂ [a, b]. That is, a = t_{0k} < t_{1k} < … < t_{kk} = b, where t_{ik} = a + (i/k)(b − a), i = 0, 1, …, k, and |t_{ik} − t_{i−1,k}| = (b − a)/k for all k = 1, 2, … . Since each V_k is a finite set, ⋃_{k=1}^{∞} V_k is a countable set. For convenience of notation, let us denote this set by {τ₁, τ₂, …}. The ordering of this set is immaterial. Next, since Y is bounded, there is a γ > 0 such that ρ_∞(x, y) ≤ γ for all x, y ∈ Y. Let y₀ be held fixed in Y, and let y ∈ Y be arbitrary. Let 0 ∈ C[a, b] be the function which is zero for all t ∈ [a, b]. Then ρ_∞(y, 0) ≤ ρ_∞(y, y₀) + ρ_∞(y₀, 0). Hence, ρ_∞(y, 0) ≤ M for all y ∈ Y, where M = γ + ρ_∞(y₀, 0). This implies that sup_{t∈[a,b]} |y(t)| ≤ M for all y ∈ Y. Now let {y_n} be an arbitrary sequence in Y. We want to show that {y_n} contains a convergent subsequence. Since |y_n(τ₁)| ≤ M for all n, the sequence of real numbers {y_n(τ₁)} contains a convergent subsequence which we shall call {y_{1n}(τ₁)}. Again, since |y_{1n}(τ₂)| ≤ M for all n, the sequence of real numbers {y_{1n}(τ₂)} contains a convergent subsequence which we shall call {y_{2n}(τ₂)}. We see that {y_{2n}(τ₁)} is a subsequence of {y_{1n}(τ₁)}, and hence it is convergent. Proceeding in a similar fashion, we obtain sequences {y_{1n}}, {y_{2n}}, … such that {y_{kn}} is a subsequence of {y_{jn}} for all k ≥ j. Furthermore, each sequence is such that lim_n y_{kn}(τ_i) exists for each i such that 1 ≤ i ≤ k. Now let {z_n} be the diagonal sequence {y_{nn}}. Then {z_n} is a subsequence of {y_n}, and lim_n z_n(τ_i) exists for i = 1, 2, … . We now wish to show that {z_n} is a Cauchy sequence in {C[a, b]; ρ_∞}. Let ε > 0 be given. Since Y is equicontinuous on [a, b], we can find a positive integer k such that |z_n(t) − z_n(t′)| < ε/3 for every n whenever |t − t′| ≤ (b − a)/k. Since each {z_n(τ_i)} is a convergent sequence of real numbers, there exists a positive integer N such that |z_n(τ_i) − z_m(τ_i)| < ε/3 whenever m > N and n > N, for all τ_i ∈ V_k. Now, if t ∈ [a, b], there is some τ_i ∈ V_k such that |t − τ_i| ≤ (b − a)/k. Hence, for all m > N and n > N, we have
|z_n(t) − z_m(t)| ≤ |z_n(t) − z_n(τ_i)| + |z_n(τ_i) − z_m(τ_i)| + |z_m(τ_i) − z_m(t)| < ε.
This implies that ρ_∞(z_m, z_n) ≤ ε for all m, n > N. Therefore, {z_n} is a Cauchy sequence in C[a, b]. Since {C[a, b]; ρ_∞} is a complete metric space (see Example 5.5.28), {z_n} converges to some point in C[a, b]. This implies that {y_n} has a subsequence which converges to a point in C[a, b], and so, by Theorem 5.6.31, Y is relatively compact in C[a, b]. This completes the proof of the theorem.
Our next result follows directly from Theorem 5.8.12. It is sometimes referred to as Ascoli's lemma.

5.8.13. Corollary. Let {φ_n} be a sequence of functions in {C[a, b]; ρ_∞}. If {φ_n} is equicontinuous on [a, b] and uniformly bounded on [a, b] (i.e., there exists an M > 0 such that sup_{a≤t≤b} |φ_n(t)| ≤ M for all n), then there exist a φ ∈ C[a, b] and a subsequence {φ_{n_k}} of {φ_n} such that {φ_{n_k}} converges to φ uniformly on [a, b].

5.8.14. Exercise. Prove Corollary 5.8.13.

We close the present section with the following converse to Theorem 5.8.12.

5.8.15. Theorem. Let Y be a subset of C[a, b] which is relatively compact in the metric space {C[a, b]; ρ_∞}. Then Y is a bounded set and is equicontinuous on [a, b].

5.8.16. Exercise. Prove Theorem 5.8.15.
5.9. EQUIVALENT AND HOMEOMORPHIC METRIC SPACES. TOPOLOGICAL SPACES

It is possible that seemingly different metric spaces may exhibit properties which are very similar with regard to such concepts as open sets, limits of sequences, and continuity of functions. For example, for each p, 1 ≤ p ≤ ∞, the spaces {Rⁿ; ρ_p} (see Examples 5.3.1, 5.3.3) are different metric spaces. However, it turns out that the family of all open sets is the same in all of these metric spaces for 1 ≤ p ≤ ∞ (e.g., the family of open sets in {Rⁿ; ρ₂} is the same as the family of open sets in {Rⁿ; ρ₁}, which is the same as the family of open sets in {Rⁿ; ρ_∞}, etc.). Furthermore, metric spaces which are not even defined on the same underlying set (e.g., the metric spaces {X; ρ_x} and {Y; ρ_y}, where X ≠ Y) may have many similar properties of the type mentioned above.
We begin with equivalence of metric spaces defined on the same underlying set.
5.9.1. Definition. Let {X; ρ₁} and {X; ρ₂} be two metric spaces defined on the same underlying set X. Let 𝔗₁ and 𝔗₂ be the topology of X determined by ρ₁ and ρ₂, respectively. Then the metrics ρ₁ and ρ₂ are said to be equivalent metrics if 𝔗₁ = 𝔗₂.
Throughout the present section we use the notation
f: {X; ρ₁} → {Y; ρ₂}
to indicate a mapping from X into Y, where the metric on X is ρ₁ and the metric on Y is ρ₂. This distinction becomes important in the case where X = Y; i.e., in the case f: {X; ρ₁} → {X; ρ₂}.
Let us denote by i the identity mapping from X onto X; i.e., i(x) = x for all x ∈ X. Clearly, i is a bijective mapping, and the inverse is simply i itself. However, since the domain and range of i may have different metrics associated with them, we shall write
i: {X; ρ₁} → {X; ρ₂}
and
i⁻¹: {X; ρ₂} → {X; ρ₁}.
With the foregoing statements in mind, we provide in the following theorem a number of equivalent statements to characterize equivalent metrics.
5.9.2. Theorem. Let {X; ρ₁}, {X; ρ₂}, and {Y; ρ₃} be metric spaces. Then the following statements are equivalent:
(i) ρ₁ and ρ₂ are equivalent metrics;
(ii) for any mapping f: X → Y, f: {X; ρ₁} → {Y; ρ₃} is continuous on X if and only if f: {X; ρ₂} → {Y; ρ₃} is continuous on X;
(iii) the mapping i: {X; ρ₁} → {X; ρ₂} is continuous on X, and the mapping i⁻¹: {X; ρ₂} → {X; ρ₁} is continuous on X; and
(iv) for any sequence {x_n} in X, {x_n} converges to a point x in {X; ρ₁} if and only if {x_n} converges to x in {X; ρ₂}.

Proof. To prove this theorem we show that statement (i) implies statement (ii); that statement (ii) implies statement (iii); that statement (iii) implies statement (iv); and that statement (iv) implies statement (i).
To show that (i) implies (ii), assume that ρ₁ and ρ₂ are equivalent metrics, and let f be any continuous mapping from {X; ρ₁} into {Y; ρ₃}. Let U be any open set in {Y; ρ₃}. Since f is continuous, f⁻¹(U) is an open set in {X; ρ₁}. Since ρ₁ and ρ₂ are equivalent metrics, f⁻¹(U) is also an open set in {X; ρ₂}. Hence, the mapping f: {X; ρ₂} → {Y; ρ₃} is continuous. The proof of the converse in statement (ii) is identical.
We now show that (ii) implies (iii). Clearly, the mapping i: {X; ρ₁} → {X; ρ₁} is continuous. Now assume the validity of statement (ii), and let {Y; ρ₃} = {X; ρ₂}. Then i: {X; ρ₁} → {X; ρ₂} is continuous. Again, it is clear that i⁻¹: {X; ρ₂} → {X; ρ₂} is continuous. Letting {Y; ρ₃} = {X; ρ₁} in statement (ii), it follows that i⁻¹: {X; ρ₂} → {X; ρ₁} is continuous.
Next, we show that (iii) implies (iv). Let i: {X; ρ₁} → {X; ρ₂} be continuous, and let the sequence {x_n} in the metric space {X; ρ₁} converge to x. By Theorem 5.7.8, lim_n i(x_n) = i(x); i.e., lim_n x_n = x in {X; ρ₂}. The converse is proven in the same manner.
Finally, we show that (iv) implies (i). Let U be an open set in {X; ρ₁}. Then X − U is closed in {X; ρ₁}. Now let {x_n} be a sequence in X − U which converges to x in {X; ρ₂}. By assumption, {x_n} converges to x in {X; ρ₁} also, and so x ∈ X − U by part (iii) of Theorem 5.5.8. Hence, X − U is closed in {X; ρ₂}, again by part (iii) of Theorem 5.5.8, and therefore U is open in {X; ρ₂}. Letting U be an open set in {X; ρ₂}, by the same reasoning we conclude that U is open in {X; ρ₁}. Thus, ρ₁ and ρ₂ are equivalent metrics.
This concludes the proof of the theorem.
The next result establishes sufficient conditions for two metrics to be equivalent. These conditions are not necessary, however.

5.9.3. Theorem. Let {X; ρ₁} and {X; ρ₂} be two metric spaces. If there exist two positive real numbers λ and Λ such that
λ ρ₂(x, y) ≤ ρ₁(x, y) ≤ Λ ρ₂(x, y)
for all x, y ∈ X, then ρ₁ and ρ₂ are equivalent metrics.

5.9.4. Exercise. Prove Theorem 5.9.3.
Let us now consider some specific examples of equivalent metric spaces.

5.9.5. Exercise. Let {X; ρ} be any metric space. For the example of Exercise 5.1.10 the reader showed that {X; ρ₁} is a metric space, where
ρ₁(x, y) = ρ(x, y) / [1 + ρ(x, y)]
for all x, y ∈ X. Show that ρ and ρ₁ are equivalent metrics.
5.9.6. Theorem. et R; PI R and R; p R be the metric spaces
deIined in Eample 5.3.1, and let R ; pool be the metric space deIined in
Eample 5.3.3. Then
(i) poo( , y) p ( , y) ..jn poo( , y) Ior all , y E R;
(ii) poo( , y) PI( , y) npoo( , y) Ior all , y E R; and
(iii) PI P , and poo are euivalent metrics.
320
5.9.7. Eercise. Prove Theorem 5.9.6.
Chapter 5 / Metric Spaces
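The inequalities of Theorem 5.9.6 can be spot-checked numerically. The sketch below is ours (the function names `rho1`, `rho2`, `rho_inf` are illustrative, not the text's notation); it samples random pairs in R^5 and verifies inequalities (i) and (ii):

```python
import math
import random

def rho1(x, y):  # sum of absolute coordinate differences
    return sum(abs(a - b) for a, b in zip(x, y))

def rho2(x, y):  # Euclidean metric
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def rho_inf(x, y):  # maximum coordinate difference
    return max(abs(a - b) for a, b in zip(x, y))

random.seed(0)
n = 5
for _ in range(1000):
    x = [random.uniform(-10, 10) for _ in range(n)]
    y = [random.uniform(-10, 10) for _ in range(n)]
    # (i) rho_inf <= rho2 <= sqrt(n) * rho_inf
    assert rho_inf(x, y) <= rho2(x, y) + 1e-9
    assert rho2(x, y) <= math.sqrt(n) * rho_inf(x, y) + 1e-9
    # (ii) rho_inf <= rho1 <= n * rho_inf
    assert rho_inf(x, y) <= rho1(x, y) + 1e-9
    assert rho1(x, y) <= n * rho_inf(x, y) + 1e-9
print("inequalities (i) and (ii) hold on all samples")
```

By Theorem 5.9.3, either chain of inequalities already implies the equivalence asserted in part (iii).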
It can be shown that for the metric spaces {R^n; ρ_p} and {R^n; ρ_q}, ρ_p and ρ_q are equivalent metrics for any p, q such that 1 ≤ p ≤ ∞, 1 ≤ q ≤ ∞.
In Example 5.1.12, we defined a metric ρ*, called the usual metric for R*. Up until now, it has not been apparent that there is any meaningful connection between ρ* and the usual metric for R. The following result shows that when ρ* is restricted to R, it is equivalent to the usual metric on R.
5.9.8. Theorem. Let {R; ρ} denote the real line with the usual metric, and let {R*; ρ*} denote the extended real line (see Exercise 5.1.12). Consider {R; ρ*}, which is a metric subspace of {R*; ρ*}. Then
(i) for the metric spaces {R; ρ} and {R; ρ*}, ρ and ρ* are equivalent metrics;
(ii) if U ⊂ R, then U is open in {R; ρ} if and only if U is open in {R*; ρ*}; and
(iii) if U is open in {R*; ρ*}, then U ∩ R, U − {+∞}, and U − {−∞} are open in {R*; ρ*}.
5.9.9. Exercise. Prove Theorem 5.9.8. (Hint: Use part (iii) of Theorem 5.9.2 to prove part (i) of this theorem.)
Our next example shows that i⁻¹ need not be continuous, even though i is continuous.
5.9.10. Example. Let X be any nonempty set, and let ρ1 be the discrete metric on X (see Example 5.1.7). In Exercise 5.4.26 the reader was asked to show that every subset of X is open in {X; ρ1}. Now let {X; ρ} be an arbitrary metric space with the same underlying set X. Clearly, i: {X; ρ1} → {X; ρ} is continuous. However, i⁻¹: {X; ρ} → {X; ρ1} is not continuous unless every subset of {X; ρ} is open. Since this is usually not true, i⁻¹ need not be continuous.
Next, we introduce the concepts of homeomorphism and homeomorphic metric spaces.
5.9.11. Definition. Two metric spaces {X; ρ_x} and {Y; ρ_y} are said to be homeomorphic if there exists a mapping φ: {X; ρ_x} → {Y; ρ_y} such that (i) φ is a bijective mapping of X onto Y, and (ii) U ⊂ X is open in {X; ρ_x} if and only if φ(U) is open in {Y; ρ_y}. The mapping φ is called a homeomorphism.
We immediately have the following generalization of Theorem 5.9.2.
5.9.12. Theorem. Let {X; ρ_x}, {Y; ρ_y}, and {Z; ρ_z} be metric spaces, and let φ be a bijective mapping of {X; ρ_x} onto {Y; ρ_y}. Then the following statements are equivalent:
(i) φ is a homeomorphism;
(ii) for any mapping f: X → Z, f: {X; ρ_x} → {Z; ρ_z} is continuous on X if and only if f ∘ φ⁻¹: {Y; ρ_y} → {Z; ρ_z} is continuous on Y;
(iii) φ: {X; ρ_x} → {Y; ρ_y} is continuous and φ⁻¹: {Y; ρ_y} → {X; ρ_x} is continuous; and
(iv) for any sequence {x_n} in X, {x_n} converges to a point x in {X; ρ_x} if and only if {φ(x_n)} converges to φ(x) in {Y; ρ_y}.
5.9.13. Exercise. Prove Theorem 5.9.12.
The connection between homeomorphic metric spaces defined on the same underlying set and equivalent metrics is provided by the next result.
5.9.14. Theorem. Let {X; ρ1} and {X; ρ2} be two metric spaces with the same underlying set X. Then ρ1 and ρ2 are equivalent if and only if the identity mapping i: {X; ρ1} → {X; ρ2} is a homeomorphism.
5.9.15. Exercise. Prove Theorem 5.9.14.
It is possible for {X; ρ1} and {X; ρ2} to be homeomorphic, even though ρ1 and ρ2 may not be equivalent.
There are important cases for which the metric relations between the elements of two distinct metric spaces are the same. In such cases only the nature of the elements of the metric spaces differs. Since this difference may be of no importance, such spaces may often be viewed as being essentially identical. Such metric spaces are said to be isometric. Specifically, we have:
5.9.16. Definition. Let {X; ρ_x} and {Y; ρ_y} be two metric spaces, and let φ: {X; ρ_x} → {Y; ρ_y} be a bijective mapping of X onto Y. The mapping φ is said to be an isometry if
ρ_x(x, y) = ρ_y(φ(x), φ(y))
for all x, y ∈ X. If such an isometry exists, then the metric spaces {X; ρ_x} and {Y; ρ_y} are said to be isometric.
5.9.17. Theorem. Let φ be an isometry. Then φ is a homeomorphism.
5.9.18. Exercise. Prove Theorem 5.9.17.
We close the present section by introducing the concept of a topological space. It turns out that metric spaces are special cases of such spaces.
In Theorem 5.4.15 we showed that, in the case of a metric space {X; ρ}, (i) the empty set ∅ and the entire space X are open; (ii) the union of an arbitrary collection of open sets is open; and (iii) the intersection of a finite collection of open sets is open. Examining the various proofs of the present chapter, we note that a great deal of the development of metric spaces is not a consequence of the metric but, rather, depends only on the properties of certain open and closed sets. Taking the notion of open set as basic (instead of the concept of distance, as in the case of metric spaces) and taking the aforementioned properties of open sets as postulates, we can form a mathematical structure which is much more general than the metric space.
5.9.19. Definition. Let X be a nonvoid set of points, and let 𝒯 be a family of subsets which we will call open. We call the pair {X; 𝒯} a topological space if the following hold:
(i) X ∈ 𝒯 and ∅ ∈ 𝒯;
(ii) if U1 ∈ 𝒯 and U2 ∈ 𝒯, then U1 ∩ U2 ∈ 𝒯; and
(iii) for any index set A, if α ∈ A and U_α ∈ 𝒯, then ∪_{α∈A} U_α ∈ 𝒯.
The family 𝒯 is called the topology for the set X. The complement of an open set U ∈ 𝒯 with respect to X is called a closed set.
The reader can readily verify the following results:
5.9.20. Theorem. Let {X; 𝒯} be a topological space. Then
(i) ∅ is closed;
(ii) X is closed;
(iii) the union of a finite number of closed sets is closed; and
(iv) the intersection of an arbitrary collection of closed sets is closed.
5.9.21. Exercise. Prove Theorem 5.9.20.
We close the present section by citing several specific examples of topological spaces.
5.9.22. Example. In view of Theorem 5.4.15, every metric space is a topological space.
5.9.23. Example. Let X = {x, y}, and let the open sets in X be the void set ∅, the set X itself, and the set {x}. If 𝒯 is defined in this way, then {X; 𝒯} is a topological space. In this case the closed sets are ∅, X, and {y}.
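For a finite set, the axioms of Definition 5.9.19 can be checked exhaustively. The following sketch (the helper name `is_topology` is ours) verifies that the family of Example 5.9.23 is a topology on X = {x, y}:

```python
from itertools import chain, combinations

def is_topology(X, T):
    """Check the axioms of Definition 5.9.19 for a finite family T of subsets of X."""
    T = [frozenset(U) for U in T]
    X = frozenset(X)
    # (i) X and the empty set belong to T
    if X not in T or frozenset() not in T:
        return False
    # (ii) pairwise intersections stay in T (this covers finite intersections)
    if any(U & V not in T for U in T for V in T):
        return False
    # (iii) arbitrary unions stay in T; for a finite family it suffices to
    # check the union of every nonempty subfamily
    for sub in chain.from_iterable(combinations(T, r) for r in range(1, len(T) + 1)):
        if frozenset().union(*sub) not in T:
            return False
    return True

X = {"x", "y"}
assert is_topology(X, [set(), {"x"}, X])                  # Example 5.9.23
assert not is_topology(X, [set(), {"x"}, {"y"}])          # missing X = {x} ∪ {y}
print("Example 5.9.23 defines a topology")
```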
5.9.24. Example. Although many fundamental concepts carry over from metric spaces to topological spaces, it turns out that the concept of topological space is often too general. Therefore, it is convenient to suppose that certain topological spaces satisfy some additional conditions which are also true in metric spaces. These conditions, called the separation axioms, are imposed on topological spaces {X; 𝒯} to form the following important special cases:
T1-spaces: A topological space {X; 𝒯} is called a T1-space if every set consisting of a single point is closed. Equivalently, a space is called a T1-space, provided that if x and y are distinct points there is an open set containing y but not x. Clearly, metric spaces satisfy the T1-axiom.
T2-spaces: A topological space {X; 𝒯} is called a T2-space if for all distinct points x, y ∈ X there are disjoint open sets U_x and U_y such that x ∈ U_x and y ∈ U_y. T2-spaces are also called Hausdorff spaces. All metric spaces are Hausdorff spaces. Also, all T2-spaces are T1-spaces. However, there are T1-spaces which do not satisfy the T2-separation axiom.
T3-spaces: A topological space {X; 𝒯} is called a T3-space if (i) it is a T1-space, and (ii) given a closed set Y and a point x not in Y, there are disjoint open sets U1 and U2 such that x ∈ U1 and Y ⊂ U2. T3-spaces are also called regular topological spaces. All metric spaces are T3-spaces. All T3-spaces are T2-spaces; however, not all T2-spaces are T3-spaces.
T4-spaces: A topological space {X; 𝒯} is called a T4-space if (i) it is a T1-space, and (ii) for each pair of disjoint closed sets Y1, Y2 in X there exists a pair of disjoint open sets U1, U2 such that Y1 ⊂ U1 and Y2 ⊂ U2. T4-spaces are also called normal topological spaces. Such spaces are clearly T3-spaces. However, there are T3-spaces which are not normal topological spaces. On the other hand, all metric spaces are T4-spaces.
5.10. APPLICATIONS
The present section consists of two parts (subsections A and B).
In the first part we make extensive use of the contraction mapping principle to establish existence and uniqueness results for various types of equations. This part consists essentially of some specific examples.
In the second part, we continue the discussion of Section 4.11, dealing with ordinary differential equations. Specifically, we will apply Ascoli's lemma, and we will answer the questions raised at the end of subsection 4.11A.
A. Applications of the Contraction Mapping Principle
In our first example we consider a scalar algebraic equation which may be linear or nonlinear.
5.10.1. Example. Consider the equation
x = f(x), (5.10.2)
where f: [a, b] → [a, b] and where [a, b] is a closed interval of R. Let K > 0, and assume that f satisfies the condition
|f(x1) − f(x2)| ≤ K|x1 − x2| (5.10.3)
for all x1, x2 ∈ [a, b]. In this case f is said to satisfy a Lipschitz condition, and K is called a Lipschitz constant.
Now consider the complete metric space {R; ρ}, where ρ denotes the usual metric on the real line. Then {[a, b]; ρ} is a complete metric subspace of {R; ρ} (see Theorem 5.5.33). If in (5.10.3) we assume that K < 1, then f is clearly a contraction mapping, and Theorem 5.8.5 applies. It follows that if K < 1, then Eq. (5.10.2) possesses a unique solution. Specifically, if x0 ∈ [a, b], then the sequence {x_n}, n = 1, 2, ..., determined by x_n = f(x_{n−1}) converges to the unique solution of Eq. (5.10.2).
Note that if |df(x)/dx| = |f′(x)| ≤ c < 1 on the interval [a, b] (in this case f′(a) denotes the right-hand derivative of f at a, and f′(b) denotes the left-hand derivative of f at b), then f is clearly a contraction.
In Figures A and B the applicability of the contraction mapping principle is demonstrated pictorially. As indicated, the sequence {x_n} determined by successive approximations converges to the fixed point x.
5.10.4. Figure A. Successive approximations (convergent case).
5.10.5. Figure B. Successive approximations (convergent case).
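The successive-approximation scheme of Example 5.10.1 is easy to carry out numerically. Below is a minimal sketch (the function name `fixed_point` is ours), using f(x) = cos x on [0, 1] as a concrete contraction: f maps [0, 1] into itself and |f′(x)| = |sin x| ≤ sin 1 < 1 there.

```python
import math

def fixed_point(f, x0, tol=1e-12, max_iter=1000):
    """Successive approximations x_n = f(x_{n-1}), stopping when steps are tiny."""
    x = x0
    for _ in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) <= tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter")

# x = cos x has a unique solution in [0, 1] by the contraction mapping principle
x_star = fixed_point(math.cos, 0.5)
assert abs(x_star - math.cos(x_star)) < 1e-10
print(f"fixed point of cos: {x_star:.6f}")   # about 0.739085
```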
In our next example we consider a system of linear equations.
5.10.6. Example. Consider the system of n linear equations
ξ_i = Σ_{j=1}^n a_{ij}ξ_j + β_i, i = 1, ..., n. (5.10.7)
Assume that x = (ξ_1, ..., ξ_n) ∈ R^n, b = (β_1, ..., β_n) ∈ R^n, and a_{ij} ∈ R. Here the constants a_{ij}, β_i are known and the ξ_i are unknown. In the following we use the contraction mapping principle to determine conditions for the existence and uniqueness of solutions of Eq. (5.10.7). In doing so we consider different metric spaces. In all cases we let
y = f(x)
denote the mapping determined by the system of linear equations
η_i = Σ_{j=1}^n a_{ij}ξ_j + β_i, i = 1, ..., n,
where y = (η_1, ..., η_n) ∈ R^n.
First we consider the complete space {R^n; ρ1}. Let y′ = f(x′), y″ = f(x″), x′ = (ξ′_1, ..., ξ′_n), and x″ = (ξ″_1, ..., ξ″_n). We have
ρ1(y′, y″) = ρ1(f(x′), f(x″)) = Σ_{i=1}^n |Σ_{j=1}^n a_{ij}ξ′_j + β_i − Σ_{j=1}^n a_{ij}ξ″_j − β_i|
= Σ_{i=1}^n |Σ_{j=1}^n a_{ij}(ξ′_j − ξ″_j)| ≤ Σ_{j=1}^n (Σ_{i=1}^n |a_{ij}|)|ξ′_j − ξ″_j| ≤ [max_j Σ_{i=1}^n |a_{ij}|] ρ1(x′, x″),
where in the preceding the Hölder inequality for finite sums was used (see Theorem 5.2.1). Clearly, f is a contraction if the inequality
Σ_{i=1}^n |a_{ij}| ≤ k < 1, j = 1, ..., n, (5.10.8)
holds. Thus, Eq. (5.10.7) possesses a unique solution if (5.10.8) holds for all j.
Next, we consider the complete space {R^n; ρ2}. We have
ρ2²(y′, y″) = ρ2²(f(x′), f(x″)) = Σ_{i=1}^n (Σ_{j=1}^n a_{ij}ξ′_j + β_i − Σ_{j=1}^n a_{ij}ξ″_j − β_i)²
= Σ_{i=1}^n (Σ_{j=1}^n a_{ij}(ξ′_j − ξ″_j))² ≤ (Σ_{i=1}^n Σ_{j=1}^n a_{ij}²) ρ2²(x′, x″),
where, in the preceding, the Schwarz inequality for finite sums was employed (see Theorem 5.2.1). It follows that f is a contraction, provided that the inequality
Σ_{i=1}^n Σ_{j=1}^n a_{ij}² ≤ k² < 1 (5.10.9)
holds. Therefore, Eq. (5.10.7) possesses a unique solution if (5.10.9) is satisfied.
Lastly, let us consider the complete metric space {R^n; ρ∞}. We have
ρ∞(y′, y″) = ρ∞(f(x′), f(x″)) = max_i |Σ_{j=1}^n a_{ij}(ξ′_j − ξ″_j)| ≤ [max_i Σ_{j=1}^n |a_{ij}|] ρ∞(x′, x″).
Thus, f is a contraction if
max_i Σ_{j=1}^n |a_{ij}| ≤ k < 1. (5.10.10)
Hence, if (5.10.10) holds, then Eq. (5.10.7) has a unique solution.
In summary, if any one of the conditions (5.10.8), (5.10.9), or (5.10.10) holds, then Eq. (5.10.7) possesses a unique solution, namely x. This solution can be determined by the successive approximation
ξ_i^(k) = Σ_{j=1}^n a_{ij}ξ_j^(k−1) + β_i, k = 1, 2, ..., (5.10.11)
for all i = 1, ..., n, with starting point x^(0) = (ξ_1^(0), ..., ξ_n^(0)).
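The iteration (5.10.11) can be sketched as follows; this version checks the row-sum condition of the ρ∞ case before iterating (the matrix, vector, and helper name are our illustrative choices):

```python
def solve_linear_iteratively(A, b, tol=1e-12, max_iter=10000):
    """Successive approximation x(k) = A x(k-1) + b for the system x = A x + b.
    Requires a contraction condition such as (5.10.10): max row sum of |a_ij| < 1."""
    n = len(b)
    row_sums = [sum(abs(a) for a in row) for row in A]
    assert max(row_sums) < 1, "condition (5.10.10) fails; convergence not guaranteed"
    x = [0.0] * n
    for _ in range(max_iter):
        x_new = [sum(A[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]
        if max(abs(u - v) for u, v in zip(x_new, x)) <= tol:  # rho_inf step size
            return x_new
        x = x_new
    raise RuntimeError("no convergence")

A = [[0.2, 0.1], [0.0, 0.3]]
b = [1.0, 2.0]
x = solve_linear_iteratively(A, b)
# check that x indeed solves x = A x + b
residual = max(abs(x[i] - (sum(A[i][j] * x[j] for j in range(2)) + b[i]))
               for i in range(2))
assert residual < 1e-9
print("solution:", [round(v, 6) for v in x])
```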
Next, let us consider an integral equation.
5.10.12. Example. Let ψ ∈ C[a, b], and let K(s, t) be a real-valued function which is continuous on the square [a, b] × [a, b]. Let λ ∈ R. We call
x(s) = ψ(s) + λ∫_a^b K(s, t)x(t)dt (5.10.13)
a Fredholm non-homogeneous linear integral equation of the second kind. In this equation x is the unknown, K(s, t) and ψ are specified, and λ is regarded as an arbitrary parameter.
We now show that for all |λ| sufficiently small, Eq. (5.10.13) has a unique solution which is continuous on [a, b]. To this end, consider the complete metric space {C[a, b]; ρ∞}, and let y = f(x) denote the mapping determined by
y(s) = ψ(s) + λ∫_a^b K(s, t)x(t)dt.
Clearly y ∈ C[a, b]. We thus have f: C[a, b] → C[a, b]. Now let M = sup_{a≤s≤b, a≤t≤b} |K(s, t)|. Then
ρ∞(f(x1), f(x2)) ≤ |λ|M(b − a)ρ∞(x1, x2).
Therefore, if we choose λ so that
|λ| < 1/[M(b − a)], (5.10.14)
then f is a contraction mapping. From Theorem 5.8.5 it now follows that Eq. (5.10.13) possesses a unique solution x ∈ C[a, b] if (5.10.14) holds. Starting at x0 ∈ C[a, b], successive approximations to this solution are given by
x_n(s) = ψ(s) + λ∫_a^b K(s, t)x_{n−1}(t)dt, n = 1, 2, 3, .... (5.10.15)
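A discretized version of the successive approximations (5.10.15), using trapezoid quadrature, might look as follows. The kernel K(s, t) = st, ψ(s) = s, and λ = 1/2 are our illustrative choices; for them (5.10.14) holds with M = 1 and b − a = 1, and a short computation shows the exact solution is x(s) = (6/5)s.

```python
def fredholm_iterate(psi, K, a, b, lam, m=200, sweeps=60):
    """Iterate x_n(s) = psi(s) + lam * ∫_a^b K(s,t) x_{n-1}(t) dt on a grid."""
    h = (b - a) / m
    s = [a + i * h for i in range(m + 1)]
    w = [h if 0 < i < m else h / 2 for i in range(m + 1)]  # trapezoid weights
    x = [psi(si) for si in s]                              # x_0 = psi
    for _ in range(sweeps):
        x = [psi(si) + lam * sum(wj * K(si, tj) * xj
                                 for wj, tj, xj in zip(w, s, x))
             for si in s]
    return s, x

s, x = fredholm_iterate(lambda s: s, lambda s, t: s * t, 0.0, 1.0, 0.5)
# exact solution of this instance is x(s) = 1.2 s
err = max(abs(xi - 1.2 * si) for si, xi in zip(s, x))
assert err < 1e-3
print("max deviation from 1.2*s:", err)
```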
Next, we consider yet another type of integral equation.
5.10.16. Example. Let ψ ∈ C[a, b], let K(s, t) be a real continuous function on the triangle a ≤ t ≤ s ≤ b, and let λ ∈ R. We call
x(s) = ψ(s) + λ∫_a^s K(s, t)x(t)dt, a ≤ s ≤ b, (5.10.17)
a linear Volterra integral equation. Here x is unknown, K(s, t) and ψ are specified, and λ is an arbitrary parameter.
We now show that, for all λ, Eq. (5.10.17) possesses a unique continuous solution. We consider again the complete metric space {C[a, b]; ρ∞}, and we let y = f(x) be the mapping determined by
y(s) = ψ(s) + λ∫_a^s K(s, t)x(t)dt.
Since the right-hand side of this expression is continuous, it follows that f: C[a, b] → C[a, b]. Moreover, since K is continuous, there is an M such that |K(s, t)| ≤ M. Let y1 = f(x1), and let y2 = f(x2). As in the preceding example, we have
ρ∞(f(x1), f(x2)) = ρ∞(y1, y2) ≤ |λ|M(b − a)ρ∞(x1, x2).
Now let f^[n] denote the composite mapping f ∘ f ∘ ⋯ ∘ f, and let f^[n](x) = y^[n]. A little bit of algebra yields
ρ∞(f^[n](x1), f^[n](x2)) = ρ∞(y1^[n], y2^[n]) ≤ [|λ|^n M^n (b − a)^n / n!] ρ∞(x1, x2). (5.10.18)
However, |λ|^n M^n (b − a)^n / n! → 0 as n → ∞. Thus, for an arbitrary value of λ, n can be chosen so large that
k ≜ |λ|^n M^n (b − a)^n / n! < 1.
Hence, we have
ρ∞(f^[n](x1), f^[n](x2)) ≤ kρ∞(x1, x2), 0 ≤ k < 1.
Therefore, the composite mapping f^[n] is a contraction mapping. It follows from Corollary 5.8.8 that Eq. (5.10.17) possesses a unique continuous solution for arbitrary λ. This solution can be determined by the method of successive approximations.
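The factorial decay in (5.10.18) is what makes the Volterra iteration converge for every λ, with no smallness restriction. A numerical sketch (the discretization and names are ours) for x(s) = 1 + λ∫_0^s x(t)dt, whose solution is e^{λs}:

```python
import math

def volterra_iterate(lam, b=1.0, m=200, sweeps=40):
    """Successive approximations for x(s) = 1 + lam * ∫_0^s x(t) dt,
    discretized with the cumulative trapezoid rule."""
    h = b / m
    x = [1.0] * (m + 1)                  # x_0(s) = psi(s) = 1
    for _ in range(sweeps):
        new = [1.0]
        integral = 0.0
        for i in range(1, m + 1):
            integral += h * (x[i - 1] + x[i]) / 2   # ∫_0^{s_i} x(t) dt
            new.append(1.0 + lam * integral)
        x = new
    return x

lam = 3.0
# the n-fold contraction factor |lam|^n (b-a)^n / n! is already negligible at n = 40
assert lam ** 40 / math.factorial(40) < 1e-20
x = volterra_iterate(lam)
err = abs(x[-1] - math.exp(lam))         # compare with e^{lam} at s = 1
assert err < 1e-2 * math.exp(lam)
print("relative error at s=1:", err / math.exp(lam))
```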
5.10.19. Exercise. Verify inequality (5.10.18).
Next we consider initial-value problems characterized by scalar ordinary differential equations.
5.10.20. Example. Consider the initial-value problem
ẋ = f(t, x), x(τ) = ξ (5.10.21)
discussed in Section 4.11. We would like to determine conditions for the existence and uniqueness of a solution φ(t) of (5.10.21) for τ ≤ t ≤ T.
Let k > 0, and assume that f satisfies the condition
|f(t, x1) − f(t, x2)| ≤ k|x1 − x2|
for all t ∈ [τ, T] and for all x1, x2 ∈ R. In this case we say that f satisfies a Lipschitz condition in x, and we call k a Lipschitz constant.
As was pointed out in Section 4.11, Eq. (5.10.21) is equivalent to the integral equation
φ(t) = ξ + ∫_τ^t f(s, φ(s))ds. (5.10.22)
Consider now the complete metric space {C[τ, T]; ρ∞}, and let F denote the mapping determined by
F(φ)(t) = ξ + ∫_τ^t f(s, φ(s))ds, τ ≤ t ≤ T.
Then clearly F: C[τ, T] → C[τ, T]. Now
ρ∞(F(φ1), F(φ2)) = sup_{τ≤t≤T} |∫_τ^t [f(s, φ1(s)) − f(s, φ2(s))]ds|
≤ sup_{τ≤t≤T} ∫_τ^t k|φ1(s) − φ2(s)|ds ≤ k(T − τ)ρ∞(φ1, φ2).
Thus, F is a contraction if k < 1/(T − τ).
Next, let F^[n] denote the composite mapping F ∘ F ∘ ⋯ ∘ F. Similarly, as in (5.10.18), the reader can verify that
ρ∞(F^[n](φ1), F^[n](φ2)) ≤ [k^n(T − τ)^n / n!] ρ∞(φ1, φ2). (5.10.23)
Since k^n(T − τ)^n / n! → 0 as n → ∞, it follows that for sufficiently large n, k^n(T − τ)^n / n! < 1. Therefore, F^[n] is a contraction. It now follows from Corollary 5.8.8 that Eq. (5.10.21) possesses a unique solution for t ∈ [τ, T]. Furthermore, this solution can be obtained by the method of successive approximations.
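For ẋ = x, x(0) = 1, the successive approximations can be computed exactly: the n-th Picard iterate is the n-th partial sum of the exponential series. A sketch (the polynomial representation as a coefficient list is ours):

```python
import math
from fractions import Fraction

# Picard iterates for x' = x, x(0) = 1: phi_n(t) = 1 + ∫_0^t phi_{n-1}(s) ds.
# A polynomial is stored as a coefficient list [c0, c1, ...] of exact fractions.
def integrate(poly):
    """Antiderivative with zero constant term: ∫_0^t of the polynomial."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]

def picard(n):
    phi = [Fraction(1)]                  # phi_0(t) = xi = 1
    for _ in range(n):
        integral = integrate(phi)
        phi = [Fraction(1) + integral[0]] + integral[1:]
    return phi

phi5 = picard(5)
# the n-th iterate equals sum_{k=0}^{n} t^k / k!
assert phi5 == [Fraction(1, math.factorial(k)) for k in range(6)]
print(phi5)
```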
5.10.24. Exercise. Generalize Example 5.10.20 to the initial-value problem
ẋ_i = f_i(t, x_1, ..., x_n), x_i(τ) = ξ_i, i = 1, ..., n,
which is discussed in Section 4.11.
B. Further Applications to Ordinary Differential Equations
At the end of Section 4.11A we raised the following questions: (i) When does an initial-value problem possess solutions? (ii) When are these solutions unique? (iii) What is the extent of the interval over which such solutions exist? (iv) Are these solutions continuously dependent on initial conditions?
In Example 5.10.20 we have already given a partial answer to the first two questions. In the remainder of the present section we refine the type of result given in Example 5.10.20, and we give an answer to the remaining items raised above.
As in the beginning of Section 4.11A, we call R² the (t, x) plane, we let D ⊂ R² denote an open connected set (i.e., D is a domain), we assume that f is a real-valued function which is defined and continuous on D, we call T = (t1, t2) ⊂ R a t interval, and we let φ denote a solution of the differential equation
ẋ = f(t, x). (5.10.25)
The reader should refer to Section 4.11A for the definition of solution φ.
We first concern ourselves with the initial-value problem
ẋ = f(t, x), x(τ) = ξ (5.10.26)
characterized in Definition 4.11.3. Our first result is concerned with existence of solutions of this problem. It is convenient to establish this result in two stages, using the notion of approximate solution of Eq. (5.10.25).
5.10.27. Definition. A function φ defined and continuous on a t interval T is called an ε-approximate solution of Eq. (5.10.25) on T if
(i) (t, φ(t)) ∈ D for all t ∈ T;
(ii) φ has a continuous derivative on T except possibly on a finite set S of points in T, where there are jump discontinuities allowed; and
(iii) |φ̇(t) − f(t, φ(t))| ≤ ε for all t ∈ T − S.
If S is not empty, φ is said to have piecewise continuous derivatives on T.
We now prove:
5.10.28. Theorem. In Eq. (5.10.25), let f be continuous on the rectangle
D0 = {(t, x): |t − τ| ≤ a, |x − ξ| ≤ b}.
Given any ε > 0, there exists an ε-approximate solution φ of Eq. (5.10.25) on an interval |t − τ| ≤ α such that φ(τ) = ξ.
5.10.29. Figure C. Construction of an ε-approximate solution.
Proof. Let M = max_{(t,x)∈D0} |f(t, x)|, and let α = min(a, b/M). Note that α = a if a ≤ b/M and α = b/M if a > b/M (refer to Figure C). We will show that an ε-approximate solution exists on the interval [τ, τ + α]. The proof is similar for the interval [τ − α, τ]. In our proof we will construct an ε-approximate solution starting at (τ, ξ), consisting of a finite number of straight line segments joined end to end (see Figure C).
Since f is continuous on the compact set D0, it is uniformly continuous on D0 (see Theorem 5.7.12). Hence, given ε > 0, there exists δ = δ(ε) > 0 such that |f(t, x) − f(t′, x′)| ≤ ε whenever (t, x), (t′, x′) ∈ D0, |t − t′| ≤ δ and |x − x′| ≤ δ. Now let τ = t_0 and τ + α = t_n. We divide the half-open interval (t_0, t_n] into n half-open subintervals (t_0, t_1], (t_1, t_2], ..., (t_{n−1}, t_n] in such a fashion that
max_i |t_i − t_{i−1}| ≤ min(δ, δ/M). (5.10.30)
Next, we construct a polygonal path consisting of n straight lines joined end to end, starting at the point (τ, ξ) = (t_0, x_0) and having slopes equal to m_i = f(t_{i−1}, x_{i−1}) over the intervals (t_{i−1}, t_i], i = 1, ..., n, respectively, where x_i = x_{i−1} + m_i(t_i − t_{i−1}). A typical polygonal path is shown in Figure C. Note that the graph of this path is confined to the triangular region shown in Figure C. Let us denote the polygonal path constructed in this way by φ_ε. Note that φ_ε is continuous on the interval [τ, τ + α], that φ_ε is a piecewise linear function, and that φ_ε is piecewise continuously differentiable. Indeed, we have φ_ε(t_0) = x_0 = ξ and
φ_ε(t) = φ_ε(t_{i−1}) + f(t_{i−1}, φ_ε(t_{i−1}))(t − t_{i−1}), t_{i−1} < t ≤ t_i, i = 1, ..., n. (5.10.31)
Also note that
|φ_ε(t) − φ_ε(t′)| ≤ M|t − t′| (5.10.32)
for all t, t′ ∈ [τ, τ + α]. We now show that φ_ε is an ε-approximate solution. Let t ∈ (t_{i−1}, t_i]. Then it follows from (5.10.30) and (5.10.32) that |φ_ε(t) − φ_ε(t_{i−1})| ≤ δ. Now since |f(t, x) − f(t′, x′)| ≤ ε whenever (t, x), (t′, x′) ∈ D0, |t − t′| ≤ δ, and |x − x′| ≤ δ, it follows from Eq. (5.10.31) that
|φ̇_ε(t) − f(t, φ_ε(t))| = |f(t_{i−1}, φ_ε(t_{i−1})) − f(t, φ_ε(t))| ≤ ε.
Therefore, the function φ_ε is an ε-approximate solution on the interval |t − τ| ≤ α.
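The polygonal construction in (5.10.31) is precisely Euler's method. A sketch (the function name and the test problem ẋ = x, x(0) = 1 are our choices); refining the partition shrinks the defect, as the proof requires:

```python
import math

def euler_polygon(f, tau, xi, alpha, n):
    """Polygonal path of Eq.-style (5.10.31): on each subinterval the slope is
    f evaluated at the left endpoint of the segment."""
    h = alpha / n
    ts, xs = [tau], [xi]
    for _ in range(n):
        xs.append(xs[-1] + h * f(ts[-1], xs[-1]))
        ts.append(ts[-1] + h)
    return ts, xs

# x' = x, x(0) = 1 on [0, 1]; exact solution exp(t)
ts, xs = euler_polygon(lambda t, x: x, 0.0, 1.0, 1.0, 1000)
assert abs(xs[-1] - math.e) < 5e-3          # fine partition: small error
_, coarse = euler_polygon(lambda t, x: x, 0.0, 1.0, 1.0, 10)
assert abs(xs[-1] - math.e) < abs(coarse[-1] - math.e)   # refinement helps
print("Euler polygon at t=1:", xs[-1])
```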
We are now in a position to establish conditions for the existence of solutions of the initial-value problem (5.10.26).
5.10.33. Theorem. In Eq. (5.10.25), let f be continuous on the rectangle D0 = {(t, x): |t − τ| ≤ a, |x − ξ| ≤ b}. Then the initial-value problem (5.10.26) has a solution on some t interval given by |t − τ| ≤ α.
Proof. Let ε_n > 0, ε_{n+1} < ε_n, and lim_{n→∞} ε_n = 0 (i.e., let {ε_n}, n = 1, 2, ..., be a monotone decreasing sequence of positive numbers tending to zero). By Theorem 5.10.28, there exists for every ε_n an ε_n-approximate solution of Eq. (5.10.25), call it φ_n, on some interval |t − τ| ≤ α such that φ_n(τ) = ξ. Now for each φ_n it is true, by construction of φ_n, that
|φ_n(t) − φ_n(t′)| ≤ M|t − t′|. (5.10.34)
This shows that {φ_n} is an equicontinuous set of functions (see Definition 5.8.11). Letting t′ = τ in (5.10.34), we have |φ_n(t) − ξ| ≤ M|t − τ| ≤ Mα, and thus |φ_n(t)| ≤ |ξ| + Mα for all n and for all t ∈ [τ − α, τ + α]. Thus, the sequence {φ_n} is uniformly bounded. In view of the Ascoli lemma (see Corollary 5.8.13) there exists a subsequence {φ_{n_k}}, k = 1, 2, ..., of the sequence {φ_n} which converges uniformly on the interval [τ − α, τ + α] to a limit function φ; i.e., φ_{n_k} → φ uniformly on [τ − α, τ + α].
This function φ is continuous (see Theorem 5.7.14) and, in addition, |φ(t) − φ(t′)| ≤ M|t − t′|.
To complete the proof, we must show that φ is a solution of Eq. (5.10.26) or, equivalently, that φ satisfies the integral equation
φ(t) = ξ + ∫_τ^t f(s, φ(s))ds. (5.10.35)
Let φ_n be an ε_n-approximate solution, let Δ_n(t) = φ̇_n(t) − f(t, φ_n(t)) at those points where φ_n is differentiable, and let Δ_n(t) = 0 at the points where φ_n is not differentiable. Then φ_n can be expressed in integral form as
φ_n(t) = ξ + ∫_τ^t [f(s, φ_n(s)) + Δ_n(s)]ds. (5.10.36)
Since φ_n is an ε_n-approximate solution, we have |Δ_n(t)| ≤ ε_n. Also, since f is uniformly continuous on D0 and since φ_{n_k} → φ uniformly on [τ − α, τ + α] as k → ∞, it follows that |f(t, φ_{n_k}(t)) − f(t, φ(t))| ≤ ε on the interval [τ − α, τ + α] whenever k is so large that |φ_{n_k}(t) − φ(t)| ≤ δ(ε) on [τ − α, τ + α]. Using Eq. (5.10.36) we now have
|∫_τ^t [f(s, φ_{n_k}(s)) − f(s, φ(s)) + Δ_{n_k}(s)]ds| ≤ |∫_τ^t |f(s, φ_{n_k}(s)) − f(s, φ(s))|ds| + |∫_τ^t |Δ_{n_k}(s)|ds| ≤ α(ε + ε_{n_k}).
Therefore, ∫_τ^t [f(s, φ_{n_k}(s)) + Δ_{n_k}(s)]ds → ∫_τ^t f(s, φ(s))ds. It now follows that
φ(t) = ξ + ∫_τ^t f(s, φ(s))ds,
which completes the proof.
Using Theorem 5.10.33, the reader can readily prove the next result.
5.10.37. Corollary. In Eq. (5.10.25), let f be continuous on a domain D of the (t, x) plane, and let (τ, ξ) ∈ D. Then the initial-value problem (5.10.26) has a solution φ on some t interval containing τ.
5.10.38. Exercise. Prove Corollary 5.10.37.
Theorem 5.10.33 (along with Corollary 5.10.37) is known in the literature as the Cauchy–Peano existence theorem. Note that in these results the solution φ is not guaranteed to be unique.
Next, we seek conditions under which uniqueness of solutions is assured. We require the following preliminary result, called the Gronwall inequality.
5.10.39. Theorem. Let r and k be real continuous functions on an interval [a, b]. Suppose r(t) ≥ 0 and k(t) ≥ 0 for all t ∈ [a, b], and let δ be a given nonnegative constant. If
r(t) ≤ δ + ∫_a^t k(s)r(s)ds
for all t ∈ [a, b], then
r(t) ≤ δ e^{∫_a^t k(s)ds} (5.10.40)
for all t ∈ [a, b].
Proof. Let R(t) = δ + ∫_a^t k(s)r(s)ds. Then r(t) ≤ R(t), R(a) = δ, Ṙ(t) = k(t)r(t) ≤ k(t)R(t), and
Ṙ(t) − k(t)R(t) ≤ 0 (5.10.42)
for all t ∈ [a, b]. Let K(t) = e^{−∫_a^t k(s)ds}. Then
K̇(t) = −k(t)e^{−∫_a^t k(s)ds} = −K(t)k(t).
Multiplying both sides of (5.10.42) by K(t) we have
K(t)Ṙ(t) − K(t)k(t)R(t) ≤ 0
or
K(t)Ṙ(t) + K̇(t)R(t) ≤ 0
or
d/dt [K(t)R(t)] ≤ 0.
Integrating this last expression from a to t we obtain
K(t)R(t) − K(a)R(a) ≤ 0
or
K(t)R(t) ≤ δ
or
R(t) ≤ δ e^{∫_a^t k(s)ds},
and therefore
r(t) ≤ R(t) ≤ δ e^{∫_a^t k(s)ds},
which is the desired inequality.
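The hypothesis and conclusion of Theorem 5.10.39 can be checked on a grid for a concrete instance. Our illustrative choices: k(t) = 2, δ = 1, and r(t) = e^{1.5t} on [0, 1], which satisfies r(t) ≤ 1 + ∫_0^t 2r(s)ds, so the bound r(t) ≤ e^{2t} must hold:

```python
import math

def cumtrapz(vals, h):
    """Cumulative trapezoid-rule approximation of ∫_0^{t_i} of sampled values."""
    out = [0.0]
    for i in range(1, len(vals)):
        out.append(out[-1] + h * (vals[i - 1] + vals[i]) / 2)
    return out

h, m = 0.001, 1000
t = [i * h for i in range(m + 1)]
r = [math.exp(1.5 * ti) for ti in t]
kr = cumtrapz([2 * ri for ri in r], h)
# hypothesis: r(t) <= delta + ∫_0^t k(s) r(s) ds with delta = 1, k = 2
assert all(ri <= 1 + inti + 1e-9 for ri, inti in zip(r, kr))
# conclusion (5.10.40): r(t) <= delta * exp(∫_0^t k(s) ds) = exp(2t)
assert all(ri <= math.exp(2 * ti) + 1e-12 for ri, ti in zip(r, t))
print("Gronwall hypothesis and bound verified on the grid")
```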
In our next result we will require that the function f in Eq. (5.10.25) satisfy a Lipschitz condition
|f(t, x′) − f(t, x″)| ≤ k|x′ − x″|
for all (t, x′), (t, x″) ∈ D.
5.10.43. Theorem. In Eq. (5.10.25), let f be continuous on a domain D of the (t, x) plane, and let f satisfy a Lipschitz condition with respect to x on D. Let (τ, ξ) ∈ D. Then the initial-value problem (5.10.26) has a unique solution on some t interval containing τ (i.e., if φ1 and φ2 are two solutions
of Eq. (5.10.25) on an interval (a, b), if τ ∈ (a, b), and if φ1(τ) = φ2(τ) = ξ, then φ1 = φ2).
Proof. By Corollary 5.10.37, at least one solution exists on some interval (a, b), τ ∈ (a, b). Now suppose there is more than one solution, say φ1 and φ2, to the initial-value problem (5.10.26). Then
φ_i(t) = ξ + ∫_τ^t f(s, φ_i(s))ds, i = 1, 2,
for all t ∈ (a, b), and
φ1(t) − φ2(t) = ∫_τ^t [f(s, φ1(s)) − f(s, φ2(s))]ds.
Let r(t) = |φ1(t) − φ2(t)|, and let k > 0 denote the Lipschitz constant for f. In the following we consider the case when t ≥ τ, and we leave the details of the proof for t < τ as an exercise. We have
r(t) ≤ ∫_τ^t |f(s, φ1(s)) − f(s, φ2(s))|ds ≤ ∫_τ^t k|φ1(s) − φ2(s)|ds = ∫_τ^t kr(s)ds;
i.e.,
r(t) ≤ ∫_τ^t kr(s)ds
for all t ∈ [τ, b). The conditions of Theorem 5.10.39 are clearly satisfied, and we have: if r(t) ≤ δ + ∫_τ^t kr(s)ds, then r(t) ≤ δ e^{k(t−τ)}. Since in the present case δ = 0, it follows that
r(t) ≤ 0 for all t ∈ [τ, b).
Therefore, |φ1(t) − φ2(t)| = 0 for all t ∈ [τ, b), and φ1(t) = φ2(t) for all t in this interval.
Now suppose that in Eq. (5.10.25) f is continuous on some domain D of the (t, x) plane and assume that f is bounded on D; i.e., suppose there exists a constant M > 0 such that
sup_{(t,x)∈D} |f(t, x)| ≤ M.
Also, assume that τ ∈ (a, b), that (τ, ξ) ∈ D, and that the initial-value problem (5.10.26) has a solution φ on a t interval (a, b) such that (t, φ(t)) ∈ D for all t ∈ (a, b). Then
lim_{t→a+} φ(t) = φ(a+) and lim_{t→b−} φ(t) = φ(b−)
exist. To prove this, let t ∈ (a, b). Then
φ(t) = ξ + ∫_τ^t f(s, φ(s))ds.
If a < t1 < t2 < b, then
|φ(t2) − φ(t1)| ≤ ∫_{t1}^{t2} |f(s, φ(s))|ds ≤ M|t2 − t1|.
Now let t1 → b− and t2 → b−. Then |t1 − t2| → 0, and therefore |φ(t1) − φ(t2)| → 0. This limiting process thus yields a convergent Cauchy sequence; i.e., φ(b−) exists. The existence of φ(a+) is similarly established.
Next, let us assume that the points (a, φ(a+)), (b, φ(b−)) are in the domain D. We now show that the solution φ can be continued to the right of t = b. An identical procedure can be used to show that the solution φ can be continued to the left of t = a.
We define a function
φ̂(t) = φ(t) for t ∈ (a, b), φ̂(b) = φ(b−).
Then
φ̂(t) = ξ + ∫_τ^t f(s, φ̂(s))ds
for all t ∈ (a, b]. Thus, the derivative of φ̂ exists on the interval (a, b], and the left-hand derivative of φ̂ at t = b is given by
φ̂′(b−) = f(b, φ̂(b)).
Next, we consider the initial-value problem
ẋ = f(t, x), x(b) = φ̂(b).
By Corollary 5.10.37, the differential equation ẋ = f(t, x) has a solution ψ which passes through the point (b, φ̂(b)) and which exists on some interval [b, b + β], β > 0. Now let
φ̃(t) = φ̂(t) for t ∈ (a, b], φ̃(t) = ψ(t) for t ∈ [b, b + β].
To show that φ̃ is a solution of the differential equation on the interval (a, b + β], with φ̃(τ) = ξ, we must show that φ̃ is continuous at t = b. Since
ψ(t) = φ̂(b) + ∫_b^t f(s, φ̃(s))ds
and since
φ̂(b) = ξ + ∫_τ^b f(s, φ̂(s))ds,
we have
φ̃(t) = ξ + ∫_τ^t f(s, φ̃(s))ds
for all t ∈ (a, b + β]. The continuity of φ̃ in the last equation implies the
continuity of f(s, φ̃(s)). Differentiating the last equation, we have
φ̃′(t) = f(t, φ̃(t))
for all t ∈ (a, b + β].
We call φ̃ a continuation of the solution φ to the interval (a, b + β]. If f satisfies a Lipschitz condition on D with respect to x, then φ̃ is unique, and we call φ̃ the continuation of φ to the interval (a, b + β].
We can repeat the above procedure of continuing solutions until the boundary of D is reached.
Now let the domain D be, in particular, a rectangle, as shown in Figure M. It is important to notice that, in general, we cannot continue solutions over the entire t interval T shown in this figure.
5.10.44. Figure M. Continuation of a solution to the boundary of domain D, where D = {(t, x): τ1 < t < τ2, ξ1 < x < ξ2} and T = (τ1, τ2).
We summarize the above discussion in the following:
5.10.45. Theorem. In Eq. (5.10.25), let f be continuous and bounded on a domain D of the (t, x) plane and let (τ, ξ) ∈ D. Then all solutions of the initial-value problem (5.10.26) can be continued to the boundary of D.
We can readily extend Theorems 5.10.28, 5.10.33, Corollary 5.10.37, and Theorems 5.10.43 and 5.10.45 to initial-value problems characterized by systems of n first-order ordinary differential equations, as given in Definition 4.11.9 and Eq. (4.11.11). In doing so we replace D ⊂ R² by D ⊂ R^{n+1}, x ∈ R by x ∈ R^n, f: D → R by f: D → R^n, the absolute value |x| by the quantity
|x| = Σ_{i=1}^n |ξ_i|, (5.10.46)
and the metric ρ(x, y) = |x − y| on R by the metric ρ(x, y) = Σ_{i=1}^n |ξ_i − η_i| on R^n. (The reader can readily verify that the function given in Eq. (5.10.46) satisfies the axioms of a norm (see Theorem 4.9.31).) The definition of ε-approximate solution for the differential equation ẋ = f(t, x) is identical to that given in Definition 5.10.27, save that scalars are replaced by vectors (e.g., the scalar function φ is replaced by the n-vector valued function φ).
Also, the modifications involved in defining a Lipschitz condition for f(t, x) on D ⊂ R^{n+1} are obvious.
5.10.47. Exercise. For the ordinary differential equation
ẋ = f(t, x) (5.10.48)
and for the initial-value problem
ẋ = f(t, x), x(τ) = ξ (5.10.49)
characterized in Eq. (4.11.7) and Definition 4.11.9, respectively, state and prove results for existence, uniqueness, and continuation of solutions analogous to Theorems 5.10.28, 5.10.33, Corollary 5.10.37, and Theorems 5.10.43 and 5.10.45.
In connection with Theorem 5.10.45 we noted that the solutions of initial-value problems described by nonlinear ordinary differential equations can, in general, not be extended to the entire t interval T depicted in Figure M. We now show that in the case of initial-value problems characterized by linear ordinary differential equations it is possible to extend solutions to the entire interval T. First, we need some preliminary results.
Let
D = {(t, x): a < t < b, |x| < ∞}, (5.10.50)
where the function |·| is defined in Eq. (5.10.46). Consider the set of linear equations
ẋ_i = Σ_{j=1}^n a_{ij}(t)x_j ≜ f_i(t, x), i = 1, ..., n, (5.10.51)
where the a_{ij}(t), i, j = 1, ..., n, are assumed to be real and continuous functions defined on the interval [a, b]. We first show that f(t, x) = (f_1(t, x), ..., f_n(t, x))^T satisfies a Lipschitz condition on D,
|f(t, x′) − f(t, x″)| ≤ k|x′ − x″|
for all (t, x′), (t, x″) ∈ D, where x′ = (ξ′_1, ..., ξ′_n)^T, x″ = (ξ″_1, ..., ξ″_n)^T, and k = max_j Σ_{i=1}^n max_{a≤t≤b} |a_{ij}(t)|. Indeed, we have
|f(t, x′) − f(t, x″)| = Σ_{i=1}^n |f_i(t, x′) − f_i(t, x″)| = Σ_{i=1}^n |Σ_{j=1}^n a_{ij}(t)(ξ′_j − ξ″_j)|
≤ Σ_{j=1}^n (Σ_{i=1}^n |a_{ij}(t)|)|ξ′_j − ξ″_j| ≤ k Σ_{j=1}^n |ξ′_j − ξ″_j| = k|x′ − x″|.
Next, we prove the following:
5.10.52. Lemma. In Eq. (5.10.48), let f(t, x) = (f_1(t, x), ..., f_n(t, x))^T be continuous on a domain D ⊂ R^{n+1}, and let f(t, x) satisfy a Lipschitz condition on D with respect to x, with Lipschitz constant k. If φ1 and φ2 are unique solutions of the initial-value problem (5.10.49), with φ1(τ) = ξ1, φ2(τ) = ξ2, and with (τ, ξ1), (τ, ξ2) ∈ D, then
|φ1(t) − φ2(t)| ≤ |ξ1 − ξ2| e^{k|t−τ|} (5.10.53)
for all (t, φ1(t)), (t, φ2(t)) ∈ D.
Proof. We assume that t ≥ τ, and we leave the details of the proof for t < τ as an exercise. We have
φ1(t) = ξ1 + ∫_τ^t f(s, φ1(s))ds,
φ2(t) = ξ2 + ∫_τ^t f(s, φ2(s))ds,
and
|φ1(t) − φ2(t)| ≤ |ξ1 − ξ2| + k ∫_τ^t |φ1(s) − φ2(s)|ds. (5.10.54)
Applying Theorem 5.10.39 to inequality (5.10.54), the desired inequality (5.10.53) results.
We are now in a position to prove the following important result for systems of linear ordinary differential equations.
5.10.55. Theorem. Let D ⊂ R^{n+1} be given by Eq. (5.10.50), and let the real functions a_{ij}(t), i, j = 1, ..., n, be continuous on the t interval [a, b]. Then there exists a unique solution to the initial-value problem
ẋ_i = Σ_{j=1}^n a_{ij}(t)x_j ≜ f_i(t, x), x_i(τ) = ξ_i, i = 1, ..., n, (5.10.56)
with (τ, ξ_1, ..., ξ_n) ∈ D. This solution can be extended to the entire interval [a, b].
Proof. Since the vector f(t, x) = (f_1(t, x), ..., f_n(t, x))^T is continuous on D, since f(t, x) satisfies a Lipschitz condition with respect to x on D, and since (τ, ξ) ∈ D (where ξ = (ξ_1, ..., ξ_n)^T), it follows from Theorem 5.10.43 (interpreted for systems of first-order ordinary differential equations) that the initial-value problem (5.10.56) has a unique solution φ1 through the point
(τ, ξ) over some interval [c, d] ⊂ [a, b]. We must show that φ1 can be continued to a unique solution φ over the entire interval [a, b].
Let φ̃ be any solution of Eq. (5.10.56) through (τ, ξ) which exists on some subinterval of [a, b]. Applying Lemma 5.10.52 to φ1 = φ̃ and φ2 = 0, we have
|φ̃(t)| ≤ |ξ| e^{k|t−τ|} (5.10.57)
for all t in the domain of definition of φ̃. For purposes of contradiction, suppose that φ1 does not have a continuation to [a, b] and assume that φ1 has a continuation φ̃ existing up to t′ < b which cannot be continued beyond t′. But inequality (5.10.57) implies that the path (t, φ̃(t)) remains inside a closed bounded subset of D. It follows from Theorem 5.10.45, interpreted for systems of first-order ordinary differential equations, that φ̃ may be continued beyond t′. We thus have arrived at a contradiction, which proves that a continuation φ of φ1 exists on the entire interval [a, b]. This continuation is unique because f(t, x) satisfies a Lipschitz condition with respect to x on D.
5.10.58. Exercise. In Theorem 5.10.55, let a_ij(t), i, j = 1, ..., n, be continuous on the open interval (−∞, ∞). Show that the initial-value problem (5.10.56) possesses unique solutions for every (τ, ξ) ∈ R^(n+1) which can be extended to the t interval (−∞, ∞).

5.10.59. Exercise. Let D ⊂ R^(n+1) be given by Eq. (5.10.50), and let the real functions a_ij(t), v_i(t), i, j = 1, ..., n, be continuous on the t interval [a, b]. Show that there exists a unique solution to the initial-value problem

    x_i' = Σ_(j=1)^n a_ij(t) x_j + v_i(t),  x_i(τ) = ξ_i,  i = 1, ..., n,        (5.10.60)

with (τ, ξ1, ..., ξn) ∈ D. Show that this solution can be extended to the entire interval [a, b].

It is possible to relax the conditions on v_i(t), i = 1, ..., n, in the above exercise considerably. For example, it can be shown that if v_i(t) is piecewise continuous on [a, b], then the assertions of Exercise 5.10.59 still hold.
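The existence part of these results is constructive: the solution is the limit of successive (Picard) approximations of the equivalent integral equation. A Python sketch (my own illustration, for the specific constant-coefficient system x1' = x2, x2' = −x1 with v ≡ 0, which is not an example from the text):

```python
import math

# Picard iterates for x1' = x2, x2' = -x1, x(0) = (1, 0) on [0, 1];
# the exact solution is x1 = cos t, x2 = -sin t.
N = 2000
h = 1.0 / N
ts = [i * h for i in range(N + 1)]

def f(t, x):
    return (x[1], -x[0])

def picard_step(phi):
    # Return the next iterate xi + integral_0^t f(s, phi(s)) ds,
    # with the integral approximated by the trapezoid rule on the grid.
    out = [(1.0, 0.0)]
    acc = [0.0, 0.0]
    for i in range(N):
        f0, f1 = f(ts[i], phi[i]), f(ts[i + 1], phi[i + 1])
        acc[0] += 0.5 * h * (f0[0] + f1[0])
        acc[1] += 0.5 * h * (f0[1] + f1[1])
        out.append((1.0 + acc[0], 0.0 + acc[1]))
    return out

phi = [(1.0, 0.0)] * (N + 1)   # initial guess: the constant function
for _ in range(25):
    phi = picard_step(phi)

err = max(abs(p[0] - math.cos(t)) + abs(p[1] + math.sin(t))
          for t, p in zip(ts, phi))
print("max error vs (cos t, -sin t):", err)
assert err < 1e-4
```

The remaining error comes from the trapezoid discretization; the Picard iteration itself converges at a factorial rate on a bounded interval.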
We now address ourselves to the last item of the present section. Consider the initial-value problem (5.10.49) which we characterized in Definition 4.11.9. Assume that f(t, x) satisfies a Lipschitz condition on a domain D ⊂ R^(n+1) and that (τ, ξ) ∈ D. Then the initial-value problem possesses a unique solution φ over some t interval containing τ. To indicate the dependence of φ on the initial point (τ, ξ), we write

    φ = φ(t; τ, ξ),

where φ(τ; τ, ξ) = ξ. We now ask: what are the effects of different initial conditions on the solution of Eq. (5.10.48)? Our next result provides the answer.
5.10.61. Theorem. In Eq. (5.10.49) let f(t, x) satisfy a Lipschitz condition with respect to x on D ⊂ R^(n+1). Let (τ, ξ) ∈ D. Then the unique solution φ(t; τ, ξ) of Eq. (5.10.49), existing on some bounded t interval containing τ, depends continuously on ξ on any such bounded interval. (This means if ξn → ξ, then φ(t; τ, ξn) → φ(t; τ, ξ).)

Proof. We have

    φ(t; τ, ξn) = ξn + ∫_τ^t f(s, φ(s; τ, ξn)) ds

and

    φ(t; τ, ξ) = ξ + ∫_τ^t f(s, φ(s; τ, ξ)) ds.

It follows that for t ≥ τ (the proof for t < τ is left as an exercise),

    |φ(t; τ, ξn) − φ(t; τ, ξ)| ≤ |ξn − ξ| + ∫_τ^t |f(s, φ(s; τ, ξn)) − f(s, φ(s; τ, ξ))| ds
        ≤ |ξn − ξ| + k ∫_τ^t |φ(s; τ, ξn) − φ(s; τ, ξ)| ds,

where k denotes a Lipschitz constant for f(t, x). Using Theorem 5.10.39, we obtain

    |φ(t; τ, ξn) − φ(t; τ, ξ)| ≤ |ξn − ξ| e^(∫_τ^t k ds) = |ξn − ξ| e^(k(t−τ)).

Thus, if ξn → ξ, then φ(t; τ, ξn) → φ(t; τ, ξ).

It follows from the proof of the above theorem that the convergence is uniform with respect to t on any interval [a, b] on which the solutions are defined.

5.10.62. Example. The initial-value problem

    x' = 2x,  x(τ) = ξ,        (5.10.63)

where −∞ < τ < ∞ and −∞ < ξ < ∞, has the unique solution

    φ(t; τ, ξ) = ξ e^(2(t−τ)),  −∞ < t < ∞,

which depends continuously on the initial value ξ.
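For this example the dependence on ξ can be checked directly: the deviation of two solutions is exactly |ξn − ξ| e^(2(t−τ)), which matches the e^(k(t−τ)) bound of the proof with k = 2. A Python sketch (my own illustration):

```python
import math

def phi(t, tau, xi):
    # Unique solution of x' = 2x, x(tau) = xi (Example 5.10.62).
    return xi * math.exp(2.0 * (t - tau))

tau, xi = 0.0, 1.0
for n in range(1, 6):
    xi_n = xi + 10.0 ** (-n)                     # xi_n -> xi as n grows
    dev = max(abs(phi(i / 10.0, tau, xi_n) - phi(i / 10.0, tau, xi))
              for i in range(11))                # t sampled in [0, 1]
    assert dev <= abs(xi_n - xi) * math.exp(2.0 * 1.0) + 1e-12
print("phi(t; tau, xi_n) -> phi(t; tau, xi) uniformly on [0, 1]")
```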
Thus far, in the present section, we have concerned ourselves with problems characterized by real ordinary differential equations. It is an easy matter to verify that all the existence, uniqueness, continuation, and dependence (on initial conditions) results proved in the present section are also valid for initial-value problems described by complex ordinary differential equations such as those given, e.g., in Eq. (4.11.25). In this case, the norm of a complex vector z = (z1, ..., zn)^T, z_k = u_k + i v_k, k = 1, ..., n, is given by

    |z| = Σ_(k=1)^n |z_k|,

where |z_k| = (u_k² + v_k²)^(1/2). The metric on C^n is in this case given by ρ(z1, z2) = |z1 − z2|.
5.11. REFERENCES AND NOTES

There are numerous excellent texts on metric spaces. Books which are especially readable include Copson [5.2], Gleason [5.3], Goldstein and Rosenbaum [5.4], Kantorovich and Akilov [5.5], Kolmogorov and Fomin [5.7], Naylor and Sell [5.8], and Royden [5.9]. Reference [5.8] includes some applications. The book by Kelley [5.6] is a standard reference on topology. An excellent reference on ordinary differential equations is the book by Coddington and Levinson [5.1].

REFERENCES

[5.1] E. A. CODDINGTON and N. LEVINSON, Theory of Ordinary Differential Equations. New York: McGraw-Hill Book Company, Inc., 1955.
[5.2] E. T. COPSON, Metric Spaces. Cambridge, England: Cambridge University Press, 1968.
[5.3] A. M. GLEASON, Fundamentals of Abstract Analysis. Reading, Mass.: Addison-Wesley Publishing Co., Inc., 1966.
[5.4] M. E. GOLDSTEIN and B. M. ROSENBAUM, "Introduction to Abstract Analysis," National Aeronautics and Space Administration, Report No. SP-203, Washington, D.C., 1969.
[5.5] L. V. KANTOROVICH and G. P. AKILOV, Functional Analysis in Normed Spaces. New York: The Macmillan Company, 1964.
[5.6] J. L. KELLEY, General Topology. Princeton, N.J.: D. Van Nostrand Company, Inc., 1955.
[5.7] A. N. KOLMOGOROV and S. V. FOMIN, Elements of the Theory of Functions and Functional Analysis. Vol. I. Albany, N.Y.: Graylock Press, 1957.
[5.8] A. W. NAYLOR and G. R. SELL, Linear Operator Theory in Engineering and Science. New York: Holt, Rinehart and Winston, 1971.
[5.9] H. L. ROYDEN, Real Analysis. New York: The Macmillan Company, 1965.
[5.10] A. E. TAYLOR, General Theory of Functions and Integration. New York: Blaisdell Publishing Company, 1965.
6

NORMED SPACES AND INNER PRODUCT SPACES

In Chapters 2-4 we concerned ourselves primarily with algebraic aspects of certain mathematical systems, while in Chapter 5 we addressed ourselves to topological properties of some mathematical systems. The stage is now set to combine topological and algebraic structures. In doing so, we arrive at linear topological spaces, namely normed linear spaces and inner product spaces, in general, and Banach spaces and Hilbert spaces, in particular. The properties of such spaces are the topic of the present chapter. In the next chapter we will study linear transformations defined on Banach and Hilbert spaces. The material of the present chapter and the next chapter constitutes part of a branch of mathematics called functional analysis.

Since normed linear spaces and inner product spaces are vector spaces as well as metric spaces, the results of Chapters 3 and 5 are applicable to the spaces considered in this chapter. Furthermore, since the Euclidean spaces considered in Chapter 4 are important examples of normed linear spaces and inner product spaces, the reader may find it useful to refer to Section 4.9 for proper motivation of the material to follow.

The present chapter consists of 16 sections. In the first 10 sections we consider some of the important general properties of normed linear spaces and Banach spaces. In sections 11 through 14 we examine some of the important general characteristics of inner product spaces and Hilbert spaces. (Inner product spaces are special types of normed linear spaces; Hilbert spaces are special cases of Banach spaces; Banach spaces are special kinds of normed linear spaces; and Hilbert spaces are special types of inner product spaces.) In section 15, we consider two applications. This chapter is concluded with a brief discussion of pertinent references in the last section.
6.1. NORMED LINEAR SPACES

Throughout this chapter, R denotes the field of real numbers, C denotes the field of complex numbers, F denotes either R or C, and X denotes a vector space over F.

6.1.1. Definition. Let || · || denote a mapping from X into R which satisfies the following properties for every x, y ∈ X and every α ∈ F:

(i) ||x|| ≥ 0;
(ii) ||x|| = 0 if and only if x = 0;
(iii) ||αx|| = |α| ||x||; and
(iv) ||x + y|| ≤ ||x|| + ||y||.

The function || · || is called a norm on X, the mathematical system consisting of || · || and X, {X; || · ||}, is called a normed linear space, and ||x|| is called the norm of x. If F = C we speak of a complex normed linear space, and if F = R we speak of a real normed linear space.

Different norms defined on the same linear space yield different normed linear spaces. If in a given discussion it is clear which particular norm is being used, we simply write X in place of {X; || · ||} to denote the normed linear space under consideration. Properties (iii) and (iv) in Definition 6.1.1 are called the homogeneity property and the triangle inequality of a norm, respectively.

Let {X; || · ||} be a normed linear space and let x_i ∈ X, i = 1, ..., n. Repeated use of the triangle inequality yields

    ||x_1 + ... + x_n|| ≤ ||x_1|| + ... + ||x_n||.

The following result shows that every normed linear space has a metric associated with it, induced by the norm || · ||. Therefore, every normed linear space is also a metric space.

6.1.2. Theorem. Let {X; || · ||} be a normed linear space, and let ρ be a real-valued function defined on X × X given by ρ(x, y) = ||x − y|| for all x, y ∈ X. Then ρ is a metric on X and {X; ρ} is a metric space.

6.1.3. Exercise. Prove Theorem 6.1.2.
This theorem tells us that all of the results in the previous chapter on metric spaces apply to normed linear spaces as well, provided we let ρ(x, y) = ||x − y||. We will adopt the convention that when using the terminology of metric spaces (e.g., completeness, compactness, convergence, continuity, etc.) in a normed linear space {X; || · ||}, we mean with respect to the metric space {X; ρ}, where ρ(x, y) = ||x − y||. Also, whenever we use metric space properties on F, i.e., on R or C, we mean with respect to the usual metric on R or C, respectively.

With the foregoing in mind, we now introduce the following important concept.

6.1.4. Definition. A complete normed linear space is called a Banach space.

Thus, {X; || · ||} is a Banach space if and only if {X; ρ} is a complete metric space, where ρ(x, y) = ||x − y||.
6.1.5. Example. Let X = R^n, the space of n-tuples of real numbers, or let X = C^n, the space of n-tuples of complex numbers. From Example 3.1.10 we see that X is a vector space. For x ∈ X given by x = (ξ1, ..., ξn), and for p ∈ R such that 1 ≤ p < ∞, define

    ||x||_p = (|ξ1|^p + ... + |ξn|^p)^(1/p).

We can readily verify that || · ||_p satisfies the axioms of a norm. Axioms (i), (ii), (iii) of Definition 6.1.1 follow trivially, while axiom (iv) is a direct consequence of Minkowski's inequality for finite sums (5.2.6). Letting ρ_p(x, y) = ||x − y||_p, then {X; ρ_p} is the metric space of Exercise 5.5.25. Since {X; ρ_p} is complete, it follows that {R^n; || · ||_p} and {C^n; || · ||_p} are Banach spaces.

We may also define a norm on X by letting

    ||x||_∞ = max_(1≤i≤n) |ξ_i|.

It can readily be verified that {R^n; || · ||_∞} and {C^n; || · ||_∞} are also Banach spaces (see Exercise 5.5.25).
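The norm axioms of Definition 6.1.1 for || · ||_p and || · ||_∞ can be spot-checked numerically. A Python sketch (my own illustration; random sampling is evidence, not a proof):

```python
import random

def p_norm(x, p):
    # ||x||_p of Example 6.1.5; p = float("inf") gives ||x||_inf = max |xi_i|.
    if p == float("inf"):
        return max(abs(c) for c in x)
    return sum(abs(c) ** p for c in x) ** (1.0 / p)

random.seed(0)
for p in (1, 1.5, 2, 3, float("inf")):
    for _ in range(200):
        x = [random.uniform(-1, 1) for _ in range(4)]
        y = [random.uniform(-1, 1) for _ in range(4)]
        a = random.uniform(-2, 2)
        ax = [a * c for c in x]
        s = [c + d for c, d in zip(x, y)]
        assert abs(p_norm(ax, p) - abs(a) * p_norm(x, p)) < 1e-9   # axiom (iii)
        assert p_norm(s, p) <= p_norm(x, p) + p_norm(y, p) + 1e-9  # axiom (iv)
print("axioms (iii) and (iv) hold on all samples")
```

Axiom (iv) is exactly Minkowski's inequality for finite sums, checked here for several values of p including p = ∞.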
6.1.6. Example. Let X = R^∞ (see Example 3.1.11) or X = C^∞ (see Example 3.1.13), let 1 ≤ p < ∞, and as in Example 5.3.5, let

    l_p = {x ∈ X : Σ_(i=1)^∞ |ξ_i|^p < ∞}.

Define

    ||x||_p = (Σ_(i=1)^∞ |ξ_i|^p)^(1/p).        (6.1.7)

It is readily verified that || · ||_p is a norm on the linear space l_p. Axioms (i), (ii), (iii) of Definition 6.1.1 follow trivially, while axiom (iv), the triangle inequality, follows from Minkowski's inequality for infinite sums (5.2.7). Invoking Example 5.5.26, it also follows that {l_p; || · ||_p} is a Banach space. Henceforth, when we simply refer to the Banach space l_p, we assume that the norm on this space is given by Eq. (6.1.7).

Letting p = ∞ and

    l_∞ = {x ∈ X : sup_i |ξ_i| < ∞}

(refer to Example 5.3.8), and defining

    ||x||_∞ = sup_i |ξ_i|,        (6.1.8)

it is readily verified that {l_∞; || · ||_∞} is also a Banach space. When we simply refer to the Banach space l_∞, we have in mind the norm given in Eq. (6.1.8).
6.1.9. Example
(a) Let C[a, b] denote the linear space of real continuous functions on the interval [a, b], as given in Example 3.1.19. For x ∈ C[a, b] define

    ||x||_p = (∫_a^b |x(t)|^p dt)^(1/p),  1 ≤ p < ∞.

It is easily shown that {C[a, b]; || · ||_p} is a normed linear space. Axioms (i)-(iii) of Definition 6.1.1 follow trivially, while axiom (iv) follows from the Minkowski inequality for integrals (5.2.8). Let ρ_p(x, y) = ||x − y||_p. Then {C[a, b]; ρ_p} is a metric space which is not complete (see Example 5.5.29, where we considered the special case p = 2). It follows that {C[a, b]; || · ||_p} is not a Banach space.

Next, define on the linear space C[a, b] the function || · ||_∞ by

    ||x||_∞ = sup_(t∈[a,b]) |x(t)|.

It is readily shown that {C[a, b]; || · ||_∞} is a normed linear space. Let ρ_∞(x, y) = ||x − y||_∞. In accordance with Example 5.5.28, {C[a, b]; ρ_∞} is a complete metric space, and thus {C[a, b]; || · ||_∞} is a Banach space.

The above discussion can be modified in an obvious way for the case where C[a, b] consists of complex-valued continuous functions defined on [a, b]. Here vector addition and multiplication of vectors by scalars are defined similarly as in Eqs. (3.1.20) and (3.1.21), respectively. Furthermore, it is easy to show that {C[a, b]; || · ||_p}, 1 ≤ p < ∞, and {C[a, b]; || · ||_∞} are normed linear spaces with norms defined similarly as above. Once more, the space {C[a, b]; || · ||_p}, 1 ≤ p < ∞, is not a Banach space, while the space {C[a, b]; || · ||_∞} is.

(b) The metric space {L_p[a, b]; ρ_p} was defined in Example 5.5.31. It can be shown that L_p[a, b] is a vector space over R. If we let

    ||f||_p = (∫_[a,b] |f|^p dμ)^(1/p),

p ≥ 1, for f ∈ L_p[a, b], where the integral is the Lebesgue integral, then {L_p[a, b]; || · ||_p} is a Banach space since {L_p[a, b]; ρ_p} is complete, where ρ_p(f, g) = ||f − g||_p.
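The incompleteness of {C[a, b]; || · ||_1} can be seen concretely: steepening continuous ramps on [−1, 1] form a Cauchy sequence in the || · ||_1 metric whose pointwise limit, sgn(t), is discontinuous; the same ramps are not Cauchy in the sup metric. A Python sketch (my own illustration; grid sums approximate the integrals):

```python
def x(n, t):
    # Continuous ramp on [-1, 1]: x_n(t) = max(-1, min(1, n*t)).
    return max(-1.0, min(1.0, n * t))

def grid(pts=20001):
    h = 2.0 / (pts - 1)
    return h, [-1.0 + i * h for i in range(pts)]

def l1_dist(n, m):
    h, ts = grid()
    return h * sum(abs(x(n, t) - x(m, t)) for t in ts)

def sup_dist(n, m):
    _, ts = grid()
    return max(abs(x(n, t) - x(m, t)) for t in ts)

# Cauchy in the || . ||_1 metric: distances shrink like 1/n ...
assert l1_dist(100, 200) < l1_dist(10, 20) < 0.1
# ... but not in the sup metric of {C[-1, 1]; || . ||_inf}:
assert sup_dist(10, 20) > 0.49 and sup_dist(100, 200) > 0.49
print("Cauchy in ||.||_1, but the pointwise limit sgn(t) is not continuous")
```

In fact ||x_n − x_2n||_1 = 1/(2n) exactly, while sup |x_n − x_2n| = 1/2 for every n.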
6.1.10. Example. Let {X; || · ||_x} and {Y; || · ||_y} be two normed linear spaces over F, and let Z = X × Y denote the Cartesian product of X and Y. Defining vector addition on Z by

    (x1, y1) + (x2, y2) = (x1 + x2, y1 + y2)

and multiplication of vectors by scalars as

    α(x, y) = (αx, αy),

we can readily show that Z is a linear space (see Eqs. (3.2.14), (3.2.15) and the related discussion). This space can be used to generate a normed linear space {Z; || · ||} by defining the norm || · || as

    ||(x, y)|| = ||x||_x + ||y||_y.

Furthermore, if {X; || · ||_x} and {Y; || · ||_y} are Banach spaces, then it is easily shown that {Z; || · ||} is also a Banach space.

6.1.11. Exercise. Verify the assertions made in Examples 6.1.5 through 6.1.10.
We note that in a normed linear space {X; || · ||} a sphere S(x0; r) with center x0 ∈ X and radius r > 0 is given by

    S(x0; r) = {x ∈ X : ||x − x0|| < r}.        (6.1.12)

Referring to Theorem 5.4.27 and Exercise 5.4.31, recall that in a metric space the closure of a sphere (denoted by S̄(x0; r)) need not coincide with the closed sphere (denoted by K(x0; r)). In a normed linear space we have the following result.

6.1.13. Theorem. Let X be a normed linear space, let x0 ∈ X, and let r > 0. Let S̄(x0; r) denote the closure of the open sphere S(x0; r) given by Eq. (6.1.12). Then S̄(x0; r) = K(x0; r), the closed sphere, where

    K(x0; r) = {x ∈ X : ||x − x0|| ≤ r}.        (6.1.14)

Proof. By Exercise 5.4.31 we know that S̄(x0; r) ⊂ K(x0; r). Thus, we need only show that K(x0; r) ⊂ S̄(x0; r). It is clearly sufficient to show that {x ∈ X : ||x − x0|| = r} ⊂ S̄(x0; r). To do so, let x be such that ||x − x0|| = r, and let 0 < λ < 1. Let y = λx + (1 − λ)x0. Then y − x0 = λ(x − x0). Thus, ||y − x0|| = |λ| ||x − x0|| < r, and so y ∈ S(x0; r). Also, y − x = (1 − λ)(x0 − x). Therefore, ||y − x|| = (1 − λ)r, which can be made arbitrarily small by choosing λ close to 1. This means that x ∈ S̄(x0; r), which completes the proof.
Thus, in a normed linear space we may call S̄(x0; r) the closed sphere given by Eq. (6.1.14).

When regarded as a function from X into R, a norm has the following important property.

6.1.15. Theorem. Let {X; || · ||} be a normed linear space. Then || · || is a continuous mapping of X into R.

Proof. We view || · || as a mapping from the metric space {X; ρ}, ρ(x, y) = ||x − y||, into the real numbers with the usual metric for R. Thus, for given ε > 0, we wish to show that there is a δ > 0 such that ||x − y|| < δ implies | ||x|| − ||y|| | < ε. Now let z = x − y. Then x = y + z, and so ||x|| ≤ ||y|| + ||z||. This implies that ||x|| − ||y|| ≤ ||z||. Similarly, y = x − z, and so ||y|| ≤ ||x|| + ||z||. Thus, −||z|| ≤ ||x|| − ||y||. It now follows that | ||x|| − ||y|| | ≤ ||z|| = ||x − y||. Letting δ = ε, the desired result follows.
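The inequality | ||x|| − ||y|| | ≤ ||x − y|| established in the proof is the reverse triangle inequality, and it gives uniform continuity of the norm. A quick random check in {R^3; || · ||_2} (my own illustration):

```python
import random

def norm(x):
    # Euclidean norm on R^3.
    return sum(c * c for c in x) ** 0.5

random.seed(1)
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(3)]
    y = [random.uniform(-5, 5) for _ in range(3)]
    d = [a - b for a, b in zip(x, y)]
    # | ||x|| - ||y|| | <= ||x - y||  (reverse triangle inequality)
    assert abs(norm(x) - norm(y)) <= norm(d) + 1e-12
print("| ||x|| - ||y|| | <= ||x - y|| on all samples")
```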
In this chapter we will not always require that a particular normed linear space be a Banach space. Nonetheless, many important results of analysis require the completeness property. This is also true in applications. For example, in the solution of various types of equations (such as nonlinear differential equations, integral equations, etc.), in optimization problems, in nonlinear feedback problems, and in approximation theory, as well as in many other areas of application, we frequently obtain our desired solution in the form of a sequence generated by means of some iterative scheme. In such a sequence, each succeeding member is closer to the desired solution than its predecessor. Now even though the precise solution to which a sequence of this type may converge is unknown, it is usually imperative that the sequence converge to an element in that space which happens to be the setting of the particular problem in question.
6.2. LINEAR SUBSPACES

We now turn our attention briefly to linear subspaces of a normed linear space. We first recall Definition 3.2.1. A non-empty subset Y of a vector space X is called a linear subspace in X if (i) x + y ∈ Y whenever x and y are in Y, and (ii) αx ∈ Y whenever α ∈ F and x ∈ Y. Next, consider a normed linear space {X; || · ||}, let Y be a linear subspace in X, and let || · ||_1 denote the restriction of || · || to Y; i.e.,

    ||x||_1 = ||x|| for all x ∈ Y.

Then it is easy to show that {Y; || · ||_1} is also a normed linear space. We call || · ||_1 the norm induced by || · || on Y, and we say that {Y; || · ||_1} is a normed linear subspace of {X; || · ||}, or simply a linear subspace of X. Since there is usually no room for confusion, we drop the subscript and simply denote this subspace by {Y; || · ||}. In fact, when it is clear which norm is being used, we usually refer to the normed linear spaces X and Y.

Our first result is an immediate consequence of Theorem 5.5.33.

6.2.1. Theorem. Let X be a Banach space, and let Y be a linear subspace of X. Then Y is a Banach space if and only if Y is closed.
In the following we give an example of a linear subspace of a Banach space which is not closed.

6.2.2. Example. Let X be the Banach space l_1 of Example 6.1.6, and let Y be the space of finitely non-zero sequences given in Example 3.1.14. It is easily shown that Y is a linear subspace of X. To show that Y is not closed, consider the sequence {y_n} in Y defined by

    y_1 = (1, 0, 0, ...),
    y_2 = (1, 1/2, 0, 0, ...),
    y_3 = (1, 1/2, 1/4, 0, 0, ...),
    ...
    y_n = (1, 1/2, ..., 1/2^(n−1), 0, 0, ...).

This sequence converges to the point y = (1, 1/2, ..., 1/2^(n−1), 1/2^n, ...) ∈ X. Since y ∉ Y, it follows from part (iii) of Theorem 5.5.8 that Y is not a closed subset of X.
Next, we prove:

6.2.3. Theorem. Let X be a Banach space, let Y be a linear subspace of X, and let Ȳ denote the closure of Y. Then Ȳ is a closed linear subspace of X.

Proof. Since Ȳ is closed, we only have to show that Ȳ is a linear subspace. Let x', y' ∈ Ȳ, and let ε > 0. Then there exist elements x, y ∈ Y such that ||x − x'|| < ε and ||y − y'|| < ε. Hence, for arbitrary α, β ∈ F, αx + βy ∈ Y. Now ||(αx' + βy') − (αx + βy)|| = ||α(x' − x) + β(y' − y)|| ≤ |α| ||x' − x|| + |β| ||y' − y|| < (|α| + |β|)ε. Since ε > 0 is arbitrary, this implies that αx' + βy' is an adherent point of Y; i.e., αx' + βy' ∈ Ȳ. This completes the proof of the theorem.
We conclude this section with the following useful result.

6.2.4. Theorem. Let X be a normed linear space, and let Y be a linear subspace of X. If Y is an open subset of X, then Y = X.

Proof. Let x ∈ X. We wish to show that x ∈ Y. Since 0 ∈ Y, we may assume that x ≠ 0. Since Y is open and 0 ∈ Y, there is some ε > 0 such that the sphere S(0; ε) ⊂ Y. Let z = εx/(2||x||). Then ||z|| < ε, and so z ∈ Y. Since Y is a linear subspace, it follows that x = (2||x||/ε)z ∈ Y.
6.3. INFINITE SERIES

Having defined a norm on a linear space, we are in a position to consider the concept of infinite series in a meaningful way. Throughout this section we refer to a normed linear space {X; || · ||} simply as X.

6.3.1. Definition. Let {x_n} be a sequence of elements in X. For each positive integer m, let

    y_m = x_1 + ... + x_m.

We call {y_m} the sequence of partial sums of {x_n}. If the sequence {y_m} converges to a limit y ∈ X, we say the infinite series

    x_1 + x_2 + ... + x_k + ... = Σ_(n=1)^∞ x_n

converges, and we write

    y = Σ_(n=1)^∞ x_n.

We say the infinite series Σ_(n=1)^∞ x_n diverges if the sequence {y_m} diverges.

The following result yields sufficient conditions for an infinite series to converge.
6.3.2. Theorem. Let X be a Banach space, and let {x_n} be a sequence in X. If Σ_(n=1)^∞ ||x_n|| < ∞, then

(i) the infinite series Σ_(n=1)^∞ x_n converges; and
(ii) ||Σ_(n=1)^∞ x_n|| ≤ Σ_(n=1)^∞ ||x_n||.

Proof. To prove the first part, let y_m = x_1 + ... + x_m. If n > m, then

    ||y_n − y_m|| = ||x_(m+1) + ... + x_n|| ≤ ||x_(m+1)|| + ... + ||x_n||.

Since Σ_(n=1)^∞ ||x_n|| is a convergent infinite series of real numbers, the sequence of partial sums s_m = ||x_1|| + ... + ||x_m|| is Cauchy. Hence, given ε > 0, there is a positive integer N such that n > m ≥ N implies |s_n − s_m| ≤ ε. But ||y_n − y_m|| ≤ |s_n − s_m|, and so {y_m} is a Cauchy sequence. Since X is complete, {y_m} is convergent and conclusion (i) follows.

To prove the second part, let y_m = x_1 + ... + x_m, and let y = lim_(m→∞) y_m = Σ_(n=1)^∞ x_n. Then for each positive integer m we have y = (y − y_m) + y_m and

    ||y|| ≤ ||y − y_m|| + ||y_m|| ≤ ||y − y_m|| + Σ_(n=1)^m ||x_n||.

Taking the limit as m → ∞, we have ||Σ_(n=1)^∞ x_n|| ≤ Σ_(n=1)^∞ ||x_n||.
6.4. CONVEX SETS

In the present section we consider the concepts of convexity and cones, which arise naturally in many applications. Throughout this section, X is a real normed linear space.

Let x and y be two elements of X. We call the set [x, y], defined by

    [x, y] = {z ∈ X : z = αx + (1 − α)y for all α ∈ R such that 0 ≤ α ≤ 1},

the line segment joining x and y. Convex sets are now characterized as follows.

6.4.1. Definition. Let Y be a subset of X. Then Y is said to be convex if Y contains the line segment [x, y] whenever x and y are two arbitrary points in Y. A convex set is called a convex body if it contains at least one interior point, i.e., if it completely contains some sphere.

In Figure A we depict a line segment [x, y], a convex set, and a non-convex set in R^2.

6.4.2. Figure A. A line segment [x, y], a convex set, and a non-convex set.

Note that an equivalent statement for Y to be convex is that if x, y ∈ Y, then αx + βy ∈ Y whenever α and β are positive constants such that α + β = 1.
We cite a few examples.

6.4.3. Example. The empty set is convex. Also, a set consisting of one point is convex. In R^3, a cube and a sphere are convex bodies, while a plane and a line segment are convex sets but not convex bodies. Any linear subspace of X is a convex set. Also, any linear variety of X (see Definition 3.2.17) is a convex set.

6.4.4. Example. Let Y and Z be convex sets in X, let α, β ∈ R, and let αY = {x ∈ X : x = αy, y ∈ Y}. Then the set αY + βZ is a convex set in X.

6.4.5. Exercise. Prove the assertions made in Examples 6.4.3 and 6.4.4.

6.4.6. Theorem. Let Y be a convex set in X, and let α, β ∈ R be positive scalars. Then (α + β)Y = αY + βY.

Proof. Regardless of convexity, if x ∈ (α + β)Y, then x = (α + β)y = αy + βy ∈ αY + βY, and thus (α + β)Y ⊂ αY + βY. Now let Y be convex, and let x = αy + βz, where y, z ∈ Y. Then

    x = (α + β)[α/(α + β) y + β/(α + β) z] ∈ (α + β)Y,

because

    α/(α + β) + β/(α + β) = 1.

Therefore, x ∈ (α + β)Y, and thus αY + βY ⊂ (α + β)Y. This completes the proof.

We leave the proof of the next result as an exercise.

6.4.7. Theorem. Let {Y_α} be an arbitrary collection of convex sets. Then the intersection ∩_α Y_α is also a convex set.

6.4.8. Exercise. Prove Theorem 6.4.7.
The preceding result gives rise to the following concept.

6.4.9. Definition. Let Y be any set in X. The convex hull of Y, also called the convex cover of Y, denoted by Y_c, is the intersection of all convex sets which contain Y.

We note that the convex hull of Y is the smallest convex set which contains Y. Examples of convex covers of sets in R^2 are depicted in Figure B.

6.4.10. Figure B. Convex hulls.

6.4.11. Theorem. Let Y be any set in X. The convex hull of Y is the set of points expressible as α_1 x_1 + α_2 x_2 + ... + α_n x_n, where x_1, ..., x_n ∈ Y, where α_i > 0, i = 1, ..., n, where Σ_(i=1)^n α_i = 1, and where n is not fixed.

Proof. If Z is the set of elements expressible as described above, then clearly Z is convex. Moreover, Y ⊂ Z, and hence Y_c ⊂ Z. To show that Z = Y_c, we show that Z is contained in every convex set which contains Y. We do so by induction on the number of elements of Y that appear in the representation of an element of Z. Let W be a convex set with W ⊃ Y. If α_1 z_1 ∈ Z for which n = 1, then α_1 = 1 and z_1 ∈ W. Now assume that an element of Z is in W if it is represented in terms of n − 1 elements of Y. Let z = α_1 z_1 + ... + α_n z_n be in Z, let β = α_1 + ... + α_(n−1), let β_i = α_i/β, i = 1, ..., n − 1, and let u = β_1 z_1 + ... + β_(n−1) z_(n−1). Then u ∈ W, by the induction hypothesis. But z_n ∈ W, α_n = 1 − β, and z = βu + (1 − β)z_n ∈ W, since W is convex. This completes the induction, and thus Z ⊂ W, from which it follows that Z ⊂ Y_c.
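Theorem 6.4.11 says in particular that the set of all such convex combinations is itself convex: combining two combinations with weights λ and 1 − λ again yields positive weights summing to 1. A small Python illustration (my own, with an arbitrarily chosen finite set Y ⊂ R^2):

```python
Y = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.3, 0.2)]

def combo(weights):
    # Convex combination alpha_1 x_1 + ... + alpha_n x_n of the points of Y,
    # with alpha_i > 0 and sum alpha_i = 1 (weights are normalized here).
    s = float(sum(weights))
    w = [a / s for a in weights]
    pt = (sum(a * p[0] for a, p in zip(w, Y)),
          sum(a * p[1] for a, p in zip(w, Y)))
    return pt, w

(p1, a), (p2, b) = combo([1, 2, 3, 4]), combo([4, 1, 1, 2])
lam = 0.37
# Weights lam*a_i + (1 - lam)*b_i are positive and sum to 1 ...
c = [lam * ai + (1 - lam) * bi for ai, bi in zip(a, b)]
assert abs(sum(c) - 1.0) < 1e-12 and min(c) > 0
# ... and they represent the point lam*p1 + (1 - lam)*p2 of the segment.
mid = (lam * p1[0] + (1 - lam) * p2[0], lam * p1[1] + (1 - lam) * p2[1])
rep = (sum(ci * p[0] for ci, p in zip(c, Y)),
       sum(ci * p[1] for ci, p in zip(c, Y)))
assert abs(mid[0] - rep[0]) < 1e-12 and abs(mid[1] - rep[1]) < 1e-12
print("the set of convex combinations of Y is convex")
```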
6.4.12. Theorem. Let Y be a convex set in X. Then the closure of Y, Ȳ, is also a convex set.

6.4.13. Exercise. Prove Theorem 6.4.12.

Since the intersection of any number of closed sets is always closed, it follows from Theorem 6.4.7 that the intersection of an arbitrary number of closed convex sets is also a closed convex set.

We now consider some interesting aspects of norms in terms of convex sets.

6.4.14. Theorem. Any sphere in X is a convex set.

Proof. We consider without loss of generality the unit sphere,

    S = {x ∈ X : ||x|| < 1}.

If x0, y0 ∈ S, then ||x0|| < 1 and ||y0|| < 1. Now if α ≥ 0 and β ≥ 0, where α + β = 1, then ||αx0 + βy0|| ≤ ||αx0|| + ||βy0|| = α||x0|| + β||y0|| < α + β = 1, and thus αx0 + βy0 ∈ S.

In view of Theorems 6.1.13, 6.4.12, and 6.4.14, it follows that a closed sphere S̄(x0; r) is also convex. The following example, cast in R^2, is rather instructive.

6.4.15. Example. On R^2 we define the norm || · ||_p of Example 6.1.5. A moment's reflection reveals that in case of || · ||_2 the unit sphere is a circle of radius 1; when the norm is || · ||_∞, the unit sphere is a square with vertices (1, 1), (−1, 1), (−1, −1), (1, −1); if the norm is || · ||_1, the unit sphere is the square with vertices (0, 1), (1, 0), (−1, 0), (0, −1). If for the unit sphere corresponding to || · ||_p we let p increase from 1 to ∞, then this sphere will deform in a continuous manner from the square corresponding to || · ||_1 to the square corresponding to || · ||_∞. This is depicted in Figure C. We note that in all cases the unit sphere results in a convex set.

For the case of the real-valued function

    ||x||_p = (|ξ1|^p + |ξ2|^p)^(1/p),  0 < p < 1,        (6.4.16)

the set determined by ||x||_p ≤ 1 results in a set which is not convex. In particular, if p = 2/3, the set determined by ||x||_(2/3) ≤ 1 yields the boundary and the interior of an asteroid, as shown in Figure C. The reason for the non-convexity of this set can be found in the fact that the function (6.4.16) does not represent a norm. In particular, it can be shown that (6.4.16) does not satisfy the triangle inequality.
6.4.17. Figure C. Unit spheres for Example 6.4.15 (shown for || · ||_1, || · ||_2, || · ||_∞, and the function (6.4.16) with p = 2/3).
6.4.18. Exercise. Verify the assertions made in Example 6.4.15.
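The failure of the triangle inequality for (6.4.16) is easy to exhibit with p = 2/3 and the unit vectors x = (1, 0), y = (0, 1), for which the function gives 2^(3/2) for x + y but only 1 + 1 = 2 for the two summands. A Python sketch (my own illustration; `q` is a hypothetical name for the function (6.4.16)):

```python
def q(x, p):
    # The function (6.4.16) with 0 < p < 1; it is NOT a norm.
    return (abs(x[0]) ** p + abs(x[1]) ** p) ** (1.0 / p)

p = 2.0 / 3.0
x, y = (1.0, 0.0), (0.0, 1.0)
s = (x[0] + y[0], x[1] + y[1])
# q(x + y) = 2^(3/2) > 2 = q(x) + q(y): the triangle inequality fails,
# which is why the set q(x) <= 1 (the asteroid) is not convex.
assert q(s, p) > q(x, p) + q(y, p)
print(q(s, p), ">", q(x, p) + q(y, p))
```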
We conclude this section by introducing the notion of a cone.

6.4.19. Definition. A set Y in X is called a cone with vertex at the origin if y ∈ Y implies that αy ∈ Y for all α ≥ 0. If Y is a cone with vertex at the origin, then the set x0 + Y, x0 ∈ X, is called a cone with vertex x0. A convex cone is a set which is both convex and a cone.

In Figure D examples of cones are shown.

6.4.20. Figure D. (a) Cone. (b) Convex cone.
6.5. LINEAR FUNCTIONALS

Throughout this section X is a normed linear space.

We recall that a mapping, f, from X into F is called a functional on X (see Definition 3.5.1). If f is also linear, i.e., f(αx + βy) = αf(x) + βf(y) for all α, β ∈ F and all x, y ∈ X, then f is called a linear functional (refer to Definition 3.5.1). Recall further that X^f, the set of all linear functionals on X, is a linear space over F (see Theorem 3.5.16). Let f ∈ X^f and x ∈ X. In accordance with Eq. (3.5.10), we use the notation

    f(x) = ⟨x, f⟩        (6.5.1)

to denote the value of f at x. Alternatively, we sometimes find it convenient to let x' ∈ X^f denote a linear functional defined on X and write (see Eq. (3.5.11))

    x'(x) = ⟨x, x'⟩.        (6.5.2)

Invoking Definition 5.7.1, we note that continuity of a functional f at a point x0 ∈ X means, in the present context, that for every ε > 0 there is a δ > 0 such that |f(x) − f(x0)| < ε whenever ||x − x0|| < δ. Our first
result shows that if a linear functional on X is continuous at one point of X, then it is continuous at all points of X.

6.5.3. Theorem. If a linear functional f on X is continuous at some point x0 ∈ X, then it is continuous for all x ∈ X.

Proof. If {y_n} is a sequence in X such that y_n → x0, then f(y_n) → f(x0) by Theorem 5.7.8. Now let {x_n} be a sequence in X converging to x ∈ X. Then the sequence {y_n} in X given by y_n = x_n − x + x0 converges to x0. By the linearity of f, we have

    f(x_n) − f(x) = f(y_n) − f(x0).

Since |f(y_n) − f(x0)| → 0 as y_n → x0, we have |f(x_n) − f(x)| → 0 as x_n → x, and therefore f is continuous at x ∈ X. Since x is arbitrary, the proof of the theorem is complete.

It is clear that if f is a linear functional and if f(x) ≠ 0 for some x ∈ X, then the range of f is all of F; i.e., R(f) = F.
For linear functionals we define boundedness as follows.

6.5.4. Definition. A linear functional f on X is said to be bounded if there exists a real constant M ≥ 0 such that

    |f(x)| ≤ M ||x||

for all x ∈ X. If f is not bounded, then it is said to be unbounded.

The following theorem shows that continuity and boundedness of linear functionals are equivalent.

6.5.5. Theorem. A linear functional f on a normed linear space X is bounded if and only if it is continuous.

Proof. Assume that f is bounded, and let M be such that |f(x)| ≤ M ||x|| for all x ∈ X. If x_n → 0, then |f(x_n)| ≤ M ||x_n|| → 0. Hence, f is continuous at x = 0. From Theorem 6.5.3 it follows that f is continuous for all x ∈ X.

Conversely, assume that f is continuous at x = 0 and hence at any x ∈ X. There is a δ > 0 such that |f(x)| ≤ 1 whenever ||x|| ≤ δ. Now for any x ≠ 0 we have ||δx/||x|| || = δ, and thus

    |f(x)| = (||x||/δ) |f(δx/||x||)| ≤ ||x||/δ.

If we let M = 1/δ, then |f(x)| ≤ M ||x||, and f is bounded.

We will see later, in Example 6.5.17, that there may exist linear functionals on a normed linear space which are unbounded. The class of linear functionals which are bounded has some interesting properties.
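Boundedness can be exhibited directly for a concrete functional. A Python sketch (my own example on {R^2; || · ||_2}, not from the text), using f(x) = 3ξ1 − 4ξ2 and the Schwarz-inequality constant M = 5:

```python
import random

def f(x):
    # Linear functional f(x) = 3*xi_1 - 4*xi_2 on {R^2; || . ||_2}.
    return 3.0 * x[0] - 4.0 * x[1]

def norm(x):
    return (x[0] ** 2 + x[1] ** 2) ** 0.5

M = 5.0   # = ||(3, -4)||, a bound by the Schwarz inequality
random.seed(3)
for _ in range(1000):
    x = (random.uniform(-10, 10), random.uniform(-10, 10))
    assert abs(f(x)) <= M * norm(x) + 1e-9
# Boundedness forces continuity at 0: x_n -> 0 implies f(x_n) -> 0.
assert all(abs(f((1.0 / n, -2.0 / n))) <= M * norm((1.0 / n, -2.0 / n))
           for n in range(1, 50))
print("|f(x)| <= 5 ||x||, so f is bounded, hence continuous")
```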
6.5.6. Theorem. Let X^f be the vector space of all linear functionals on X, and let X* denote the family of all bounded linear functionals on X. Define the function || · ||: X* → R by

    ||f|| = sup_(x≠0) |f(x)|/||x||  for f ∈ X*.        (6.5.7)

Then

(i) X* is a linear subspace of X^f;
(ii) the function || · || defined in Eq. (6.5.7) is a norm on X*; and
(iii) the normed space {X*; || · ||} is complete.

Proof. The proof of part (i) is straightforward and is left as an exercise.

To prove part (ii), note that if f ≠ 0, then ||f|| > 0, and if f = 0, then ||f|| = 0. Also, since

    sup_(x≠0) |αf(x)|/||x|| = |α| sup_(x≠0) |f(x)|/||x||,

it follows that ||αf|| = |α| ||f||. Finally,

    ||f1 + f2|| = sup_(x≠0) |f1(x) + f2(x)|/||x|| ≤ sup_(x≠0) [|f1(x)| + |f2(x)|]/||x||
        ≤ sup_(x≠0) |f1(x)|/||x|| + sup_(x≠0) |f2(x)|/||x|| = ||f1|| + ||f2||.

Hence, || · || satisfies the axioms of a norm.

To prove part (iii), let {f_n} ⊂ X* be a Cauchy sequence. Then ||f_n − f_m|| → 0 as m, n → ∞. If we evaluate this sequence at any x ∈ X, then {f_n(x)} is a Cauchy sequence of scalars, because |f_n(x) − f_m(x)| ≤ ||f_n − f_m|| ||x||. This implies that for each x ∈ X there is a scalar f(x) such that f_n(x) → f(x). We observe that f(αx + βy) = lim_(n→∞) f_n(αx + βy) = lim_(n→∞) [αf_n(x) + βf_n(y)] = α lim_(n→∞) f_n(x) + β lim_(n→∞) f_n(y) = αf(x) + βf(y); i.e., f(αx + βy) = αf(x) + βf(y), and thus f is a linear functional. Next we show that f is bounded. Since {f_n} is a Cauchy sequence, for ε > 0 there is an M such that |f_n(x) − f_m(x)| ≤ ε||x|| for all m, n ≥ M and for all x ∈ X. But f_n(x) → f(x), and hence |f(x) − f_m(x)| ≤ ε||x|| for all m ≥ M. It now follows that

    |f(x)| = |f(x) − f_m(x) + f_m(x)| ≤ |f(x) − f_m(x)| + |f_m(x)| ≤ ε||x|| + ||f_m|| ||x||,

and thus f is a bounded linear functional. Finally, to show that f_m → f, f ∈ X*, we note that |f(x) − f_m(x)| ≤ ε||x|| whenever m ≥ M, from which we have ||f − f_m|| ≤ ε whenever m ≥ M. This proves the theorem.
6.5.8. Exercise. Prove part (i) of Theorem 6.5.6.

It is especially interesting to note that X* is a Banach space whether X is or is not a Banach space. We are now in a position to make the following definition.

6.5.9. Definition. The set of all bounded linear functionals on a normed space X is called the normed conjugate space of X, or the normed dual of X, or simply the dual of X, and is denoted by X*. For f ∈ X* we call ||f|| defined by Eq. (6.5.7) the norm of f.

The next result states that the norm of a functional can be represented in various equivalent ways.

6.5.10. Theorem. Let f be a bounded linear functional on X, and let ||f|| be the norm of f. Then

(i) ||f|| = inf {M : |f(x)| ≤ M ||x|| for all x ∈ X};
(ii) ||f|| = sup_(||x||≤1) |f(x)|; and
(iii) ||f|| = sup_(||x||=1) |f(x)|.

6.5.11. Exercise. Prove Theorem 6.5.10.
Let us now consider the norms of some specific linear functionals.

6.5.12. Example. Consider the normed linear space $\{C[a, b]; \|\cdot\|_\infty\}$. The mapping

$$f(x) = \int_a^b x(s) \, ds, \quad x \in C[a, b],$$

is a linear functional on $C[a, b]$ (cf. Example 3.5.2). The norm of this functional equals $(b - a)$, because

$$|f(x)| = \left| \int_a^b x(s) \, ds \right| \leq (b - a) \max_{a \leq s \leq b} |x(s)|.$$
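The estimate in Example 6.5.12 is easy to probe numerically. The sketch below is only an illustration (the interval $[0, 2]$ and the sample functions are arbitrary choices): it approximates $f(x) = \int_a^b x(s)\,ds$ by Riemann sums, checks $|f(x)| \leq (b - a)\max|x(s)|$, and observes that the constant function $x \equiv 1$ attains the bound.

```python
import math

def riemann(x, a, b, n=100000):
    """Approximate f(x) = integral of x over [a, b] by a midpoint Riemann sum."""
    h = (b - a) / n
    return sum(x(a + (k + 0.5) * h) for k in range(n)) * h

def sup_norm(x, a, b, n=10000):
    """Approximate the sup norm of x on [a, b] by sampling."""
    h = (b - a) / n
    return max(abs(x(a + k * h)) for k in range(n + 1))

a, b = 0.0, 2.0
samples = [math.sin, math.cos, lambda s: s * s - 1.0]
# |f(x)| <= (b - a) * max|x(s)| for each sample function
bound_holds = all(abs(riemann(x, a, b)) <= (b - a) * sup_norm(x, a, b) + 1e-9
                  for x in samples)
# the constant function x = 1 attains the bound: |f(x)| / ||x|| = b - a
ratio = abs(riemann(lambda s: 1.0, a, b)) / (b - a)
```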
6.5.13. Example. Consider the space $\{C[a, b]; \|\cdot\|_\infty\}$, let $x_0$ be a fixed element of $C[a, b]$, and let $x$ be any element of $C[a, b]$. The mapping

$$f(x) = \int_a^b x(s) x_0(s) \, ds$$

is a linear functional on $C[a, b]$ (cf. Example 3.5.2). This functional is bounded, because

$$|f(x)| = \left| \int_a^b x(s) x_0(s) \, ds \right| \leq \left( \int_a^b |x_0(s)| \, ds \right) \|x\|_\infty.$$

Since $f$ is bounded and linear, it follows that it is continuous. We leave it to the reader to show that

$$\|f\| = \int_a^b |x_0(s)| \, ds.$$
6.5.14. Example. Let $a = (\eta_1, \ldots, \eta_n)$ be a fixed element of $R^n$, and let $x = (\xi_1, \ldots, \xi_n)$ denote an arbitrary element of $R^n$. Then if

$$f(x) = \sum_{i=1}^n \eta_i \xi_i,$$

it follows that $f$ is a linear functional on $R^n$ (cf. Example 3.5.6). Letting $\|x\| = (|\xi_1|^2 + \cdots + |\xi_n|^2)^{1/2}$, it follows from the Schwarz inequality (4.9.29) that

$$|f(x)| \leq \|a\| \, \|x\|. \tag{6.5.15}$$

Thus, $f$ is bounded and continuous. In order to determine the norm of $f$, we rewrite (6.5.15) as

$$\sup_{x \neq 0} \frac{|f(x)|}{\|x\|} \leq \|a\|,$$

from which it follows that $\|f\| \leq \|a\|$. Next, by setting $x = a$, we have $|f(a)| = \|a\|^2$. Thus,

$$\frac{|f(a)|}{\|a\|} = \|a\|.$$

Therefore, $\|f\| = \|a\|$.
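The argument of Example 6.5.14 can be mirrored numerically: the ratio $|f(x)|/\|x\|$ never exceeds $\|a\|$, and $x = a$ attains it. A minimal sketch, with an arbitrarily chosen $a \in R^3$:

```python
import math
import random

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

a = [3.0, -1.0, 2.0]   # fixed element of R^3 defining f(x) = a . x (arbitrary)
random.seed(0)

# Schwarz inequality: |f(x)| <= ||a|| ||x|| for randomly sampled x
schwarz_holds = all(
    abs(dot(a, x)) <= norm(a) * norm(x) + 1e-12
    for x in ([random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(1000)))

# setting x = a attains the bound, so ||f|| = ||a||
attained = abs(dot(a, a) / norm(a) - norm(a)) < 1e-12
```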
6.5.16. Example. Analogous to the above example, let $a = (\eta_1, \eta_2, \ldots)$ be a fixed element of the Banach space $l_q$ (see Example 6.1.6), and let $x = (\xi_1, \xi_2, \ldots)$ be an arbitrary element of $l_p$, where $\frac{1}{p} + \frac{1}{q} = 1$. It follows that if

$$f(x) = \sum_{i=1}^\infty \eta_i \xi_i,$$

then $f$ is a linear functional on $l_p$. We can show that $f$ is bounded by observing that

$$|f(x)| = \left| \sum_{i=1}^\infty \eta_i \xi_i \right| \leq \sum_{i=1}^\infty |\eta_i \xi_i| \leq \|a\| \, \|x\|,$$

which follows from Hölder's inequality for infinite sums (5.2.4). Thus, $f$ is bounded and, hence, continuous. In a manner similar to that of Example 6.5.14, we can show that $\|f\| = \|a\|$.
We conclude this section with an example of an unbounded linear functional.

6.5.17. Example. Consider the space $X$ of finitely nonzero sequences $x = (\xi_1, \xi_2, \ldots, \xi_n, 0, 0, \ldots)$ (cf. Example 3.1.14). Define $\|\cdot\|: X \to R$ as $\|x\| = \max_i |\xi_i|$. It is easy to show that $\{X; \|\cdot\|\}$ is a normed linear space. Furthermore, it is readily verified that the mapping

$$f(x) = \sum_{i=1}^\infty i \, \xi_i$$

is an unbounded linear functional on $X$.

6.5.18. Exercise. Verify the assertions made in Examples 6.5.12, 6.5.13, 6.5.14, 6.5.16, and 6.5.17.
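The unboundedness in Example 6.5.17 can be seen concretely. Taking $f(x) = \sum_i i\,\xi_i$ (an assumed form; any coefficient sequence growing without bound serves equally well), the vectors $x_n = (1, 1, \ldots, 1, 0, 0, \ldots)$ with $n$ leading ones satisfy $\|x_n\| = 1$ while $f(x_n) = n(n+1)/2$:

```python
def f(x):
    """f(x) = sum_i i * xi_i on a finitely nonzero sequence given as a list."""
    return sum((i + 1) * xi for i, xi in enumerate(x))

def max_norm(x):
    return max(abs(xi) for xi in x)

# x_n = (1, 1, ..., 1, 0, 0, ...) has norm 1 but f(x_n) = n(n + 1) / 2
ratios = [f([1.0] * n) / max_norm([1.0] * n) for n in (1, 10, 100, 1000)]
# the ratios grow without bound, so no M satisfies |f(x)| <= M ||x||
grows = all(r2 > r1 for r1, r2 in zip(ratios, ratios[1:]))
```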
6.6. FINITE-DIMENSIONAL SPACES

We now briefly turn our attention to finite-dimensional vector spaces. Throughout this section $X$ denotes a normed linear space.

We recall that if $\{x_1, \ldots, x_n\}$ is a basis for a linear space $X$, then for each $x \in X$ there is a unique set of scalars $\{\xi_1, \ldots, \xi_n\}$, called the coordinates of $x$ with respect to this basis (see Definition 3.3.36). We now prove the following result.
6.6.1. Theorem. Let $X$ be a finite-dimensional normed linear space, and let $\{x_1, \ldots, x_n\}$ be a basis for $X$. For each $x \in X$, let the coordinates of $x$ with respect to this basis be denoted by $(\xi_1, \ldots, \xi_n) \in F^n$. For $i = 1, \ldots, n$, define the linear functionals $f_i: X \to F$ by $f_i(x) = \xi_i$. Then each $f_i$ is a continuous linear functional.

Proof. The proof that $f_i$ is linear is straightforward. To show that $f_i$ is a bounded linear functional, we let

$$S = \{a = (\alpha_1, \ldots, \alpha_n) \in F^n : |\alpha_1| + |\alpha_2| + \cdots + |\alpha_n| = 1\}.$$

It is left as an exercise to show that $S$ is a compact set in the metric space $\{F^n; \rho_1\}$ (see Example 5.3.1). Now let us define the function $g: S \to R$ by

$$g(a) = \|\alpha_1 x_1 + \cdots + \alpha_n x_n\|.$$

The reader can readily verify that $g$ is a continuous function on $S$. Now let $m = \inf \{g(a) : a \in S\}$. It follows from Theorem 5.7.15 that there is an $a_0 \in S$ such that $g(a_0) = m$. Note that $m \neq 0$, since $\{x_1, \ldots, x_n\}$ is a basis for $X$ and $a_0 \neq 0$. Hence $m > 0$, and

$$\|\alpha_1 x_1 + \cdots + \alpha_n x_n\| \geq m$$

for every $a = (\alpha_1, \ldots, \alpha_n) \in S$. Since $|\alpha_1| + \cdots + |\alpha_n| = 1$ for $a \in S$, we see that

$$\|\alpha_1 x_1 + \cdots + \alpha_n x_n\| \geq m (|\alpha_1| + \cdots + |\alpha_n|) \tag{6.6.2}$$

for all $a \in S$.

Next, for arbitrary $x \in X$ with coordinates $(\xi_1, \ldots, \xi_n) \in F^n$, we let $\beta = |\xi_1| + \cdots + |\xi_n|$. First, we suppose that $\beta \neq 0$. Then

$$\|x\| = \|\xi_1 x_1 + \cdots + \xi_n x_n\| = \beta \left\| \frac{\xi_1}{\beta} x_1 + \cdots + \frac{\xi_n}{\beta} x_n \right\| \geq \beta m \left( \frac{|\xi_1|}{\beta} + \cdots + \frac{|\xi_n|}{\beta} \right) = m (|\xi_1| + \cdots + |\xi_n|),$$

where inequality (6.6.2) has been used. Therefore, if $\beta \neq 0$, we have

$$|\xi_1| + \cdots + |\xi_n| \leq \frac{1}{m} \|x\|. \tag{6.6.3}$$

Noting that inequality (6.6.3) is also true if $\beta = 0$, we conclude that this inequality is true for all $x \in X$. Since $|f_i(x)| = |\xi_i| \leq |\xi_1| + \cdots + |\xi_n|$, $i = 1, \ldots, n$, we see that $|f_i(x)| \leq (1/m) \|x\|$ for any $x \in X$. Hence, $f_i$ is a bounded linear functional and, consequently, it is continuous.

6.6.4. Exercise. Prove that the set $S$ and the function $g$ have the properties asserted in the proof of Theorem 6.6.1.
The preceding theorem allows us to prove the following important result.

6.6.5. Theorem. Let $X$ be a finite-dimensional normed linear space. Then $X$ is complete.

Proof. Let $\{x_1, \ldots, x_n\}$ be a basis for $X$, let $\{y_k\}$ be a Cauchy sequence in $X$, and for each $k$ let the coordinates of $y_k$ with respect to $\{x_1, \ldots, x_n\}$ be given by $(\eta_{1k}, \ldots, \eta_{nk})$. It follows from Theorem 6.6.1 that there is a constant $M$ such that

$$|\eta_{ji} - \eta_{jk}| \leq M \|y_i - y_k\|$$

for $j = 1, \ldots, n$ and all $i, k = 1, 2, \ldots$. Hence, each sequence $\{\eta_{jk}\}$ is a Cauchy sequence in $F$, i.e., in $R$ or $C$, and is therefore convergent. Let $\eta_{j0} = \lim_{k \to \infty} \eta_{jk}$ for $j = 1, \ldots, n$. If we let

$$y_0 = \eta_{10} x_1 + \cdots + \eta_{n0} x_n,$$

it follows that $\{y_k\}$ converges to $y_0$. This proves that $X$ is complete.

The next result follows from Theorems 6.6.5 and 6.2.1.

6.6.6. Theorem. Let $X$ be a normed linear space, and let $Y$ be a finite-dimensional linear subspace of $X$. Then (i) $Y$ is complete, and (ii) $Y$ is closed.

6.6.7. Exercise. Prove Theorem 6.6.6.
Our next result is an immediate consequence of Theorem 6.6.1.

6.6.8. Theorem. Let $X$ be a finite-dimensional normed linear space, and let $f$ be a linear functional on $X$. Then $f$ is continuous.

6.6.9. Exercise. Prove Theorem 6.6.8.

We recall from Definition 5.6.30 and Theorem 5.6.31 that a subset $Y$ of a metric space $X$ is relatively compact if every sequence of elements in $Y$ contains a subsequence which converges to an element in $X$. This property can be useful in characterizing finite-dimensional subspaces in an arbitrary normed linear space, as we shall see in the next theorem. Note also that in view of Definition 5.1.19 a subset $Y$ in a normed linear space is bounded if and only if there is a $\lambda > 0$ such that $\|x\| \leq \lambda$ for all $x \in Y$.

6.6.10. Theorem. Let $X$ be a normed linear space, and let $Y$ be a linear subspace of $X$. Then $Y$ is finite dimensional if and only if every bounded subset of $Y$ is relatively compact.
Proof. (Necessity) Assume that $Y$ is finite dimensional, and let $\{x_1, \ldots, x_n\}$ be a basis for $Y$. Then for any $y \in Y$ there is a unique set $\{\eta_1, \ldots, \eta_n\}$ such that $y = \eta_1 x_1 + \cdots + \eta_n x_n$. Let $A$ be a bounded subset of $Y$, and let $\{y_k\}$ be a sequence in $A$. Then we can write $y_k = \eta_{1k} x_1 + \cdots + \eta_{nk} x_n$ for $k = 1, 2, \ldots$. There exists a $\lambda > 0$ such that $\|y_k\| \leq \lambda$ for all $k$. Consider the sums $|\eta_{1k}| + \cdots + |\eta_{nk}|$. We wish to show that these sums are bounded. Suppose they are not. Then for each positive integer $m$ we can find a $k_m$ such that $s_m = |\eta_{1 k_m}| + \cdots + |\eta_{n k_m}| > m$. Now let $z_m = (1/s_m) y_{k_m}$. It follows that

$$\|z_m\| = \frac{1}{s_m} \|y_{k_m}\| \leq \frac{\lambda}{s_m} < \frac{\lambda}{m}.$$

Thus, $z_m \to 0$ as $m \to \infty$. On the other hand,

$$z_m = \zeta_{1m} x_1 + \cdots + \zeta_{nm} x_n,$$

where $\zeta_{im} = \eta_{i k_m}/s_m$ for $i = 1, \ldots, n$. Since $|\zeta_{1m}| + \cdots + |\zeta_{nm}| = 1$, the coordinates $\zeta_{1m}, \ldots, \zeta_{nm}$ form a bounded sequence in $F^n$ and as such contain a convergent subsequence. Let $\zeta_{10}, \ldots, \zeta_{n0}$ be the limit of such a convergent subsequence, whose indices we denote by $m_j$. If we let $z = \zeta_{10} x_1 + \cdots + \zeta_{n0} x_n$, then we have

$$\|z - z_{m_j}\| \leq |\zeta_{10} - \zeta_{1 m_j}| \, \|x_1\| + \cdots + |\zeta_{n0} - \zeta_{n m_j}| \, \|x_n\| \to 0 \text{ as } m_j \to \infty.$$

Thus, $z_{m_j} \to z$. Since $z_m \to 0$, it follows that $z = 0$. But this is impossible, because $\{x_1, \ldots, x_n\}$ is a linearly independent set and $|\zeta_{10}| + \cdots + |\zeta_{n0}| = 1$. We conclude that the sums $|\eta_{1k}| + \cdots + |\eta_{nk}|$ are bounded. Consequently, there is a subsequence of $\{(\eta_{1k}, \ldots, \eta_{nk})\}$ which is convergent in $F^n$. Let $(\eta_{10}, \ldots, \eta_{n0})$ be the limit of the convergent subsequence, and let $y_0 = \eta_{10} x_1 + \cdots + \eta_{n0} x_n$. Then the corresponding subsequence of $\{y_k\}$ converges to $y_0$. Thus, $\{y_k\}$ contains a convergent subsequence, and this proves that $A$ is relatively compact.

(Sufficiency) Assume that every bounded subset of $Y$ is relatively compact. Let $x_1 \in Y$ be such that $\|x_1\| = 1$, and let $Y_1 = V(\{x_1\})$ be the linear subspace generated by $x_1$ (see Definition 3.3.6). If $Y_1 = Y$, then we are done. If $Y_1 \neq Y$, let $y \in Y$ be such that $y \notin Y_1$, and let $d = \inf_{x \in Y_1} \|y - x\|$. Since $Y_1$ is closed by Theorem 6.6.6, we must have $d > 0$; otherwise $y \in Y_1$. For every $\epsilon > 0$ there is an $x_0 \in Y_1$ such that $d \leq \|y - x_0\| < d + \epsilon$. Now let $x_2 = (y - x_0)/\|y - x_0\|$. Then $x_2 \notin Y_1$, $\|x_2\| = 1$, and for every $x \in Y_1$,

$$\|x_2 - x\| = \left\| \frac{y - x_0}{\|y - x_0\|} - x \right\| = \frac{\|y - (x_0 + \|y - x_0\| x)\|}{\|y - x_0\|} \geq \frac{d}{d + \epsilon},$$

where $x_0 + \|y - x_0\| x \in Y_1$ for all $x \in Y_1$. Since $\epsilon$ is arbitrary, we can choose $\epsilon$ so that $\|x_2 - x\| > \frac{1}{2}$ for all $x \in Y_1$.

Now let $Y_2$ be the linear subspace generated by $\{x_1, x_2\}$. If $Y_2 = Y$, we are done. If not, we can proceed in the manner used above to select an $x_3 \notin Y_2$ such that $\|x_3\| = 1$, $\|x_3 - x_1\| > \frac{1}{2}$, and $\|x_3 - x_2\| > \frac{1}{2}$. If we continue this process, then either $Y = V(\{x_1, \ldots, x_n\})$ for some $n$, or else we obtain an infinite sequence $\{x_n\}$ such that $\|x_n\| = 1$ and $\|x_n - x_m\| > \frac{1}{2}$ for all $m \neq n$. The second alternative is impossible, since $\{x_n\}$ is a bounded sequence and as such must contain a convergent subsequence. This completes the proof.
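The sequence constructed in the sufficiency part of the proof behaves like the standard basis vectors of $l_2$: each has norm 1, yet distinct vectors are a fixed distance apart, so no subsequence can be Cauchy. A sketch with truncated $l_2$ vectors (the truncation length is an arbitrary choice):

```python
import math

def basis(n, length):
    """n-th standard basis vector of l_2, truncated to a finite list."""
    v = [0.0] * length
    v[n] = 1.0
    return v

def norm(u):
    return math.sqrt(sum(ui * ui for ui in u))

length = 50
vectors = [basis(n, length) for n in range(length)]
# every vector lies in the closed unit ball, a bounded set ...
all_unit = all(abs(norm(v) - 1.0) < 1e-12 for v in vectors)
# ... yet distinct vectors stay sqrt(2) apart, ruling out Cauchy subsequences
min_gap = min(norm([ui - vi for ui, vi in zip(vectors[i], vectors[j])])
              for i in range(length) for j in range(i + 1, length))
```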
6.7. GEOMETRIC ASPECTS OF LINEAR FUNCTIONALS

Throughout this section $X$ denotes a real normed linear space. Before giving geometric interpretations of linear functionals, we introduce the notions of maximal subspace and hyperplane.

6.7.1. Definition. A linear subspace $Y$ of a linear space $X$ is called maximal if it is not all of $X$ and if there exists no linear subspace $Z$ of $X$ such that $Y \neq Z$, $Z \neq X$, and $Y \subset Z$.

Recall that if $Y$ is a linear subspace of $X$ and if $z \in X$, then we call the set $z + Y$ a linear variety (see Definition 3.2.17). In this case we also say that $z + Y$ is a translation of $Y$.
6.7.2. Definition. A hyperplane $Y$ in a linear space $X$ is a maximal linear variety resulting from the translation of a maximal linear subspace.

If a hyperplane contains the origin, then it is simply a maximal linear subspace, and all hyperplanes $z + Y$ obtained by translating $Y$ are said to be parallel to $Y$.

The following theorem provides us with an important characterization of hyperplanes in terms of linear functionals.

6.7.3. Theorem. If $f \neq 0$ is a linear functional on $X$ and if $\alpha$ is any fixed scalar, then the set $Y = \{x : f(x) = \alpha\}$ is a hyperplane. It contains the origin if and only if $\alpha = 0$. Conversely, if $Y$ is a hyperplane in a linear space $X$, then there is a linear functional $f$ on $X$ and a fixed scalar $\alpha$ such that $Y = \{x : f(x) = \alpha\}$.
Proof. Consider the first part. Since $f \neq 0$, there is an $x_1$ such that $f(x_1) = \beta \neq 0$. If $x_0 = (\alpha/\beta) x_1$, then $f(x_0) = (\alpha/\beta) f(x_1) = \alpha$, and thus $x_0 \in Y$. Let $Y_0 = -x_0 + Y$. It is readily verified that $Y_0 = \{x : f(x) = 0\}$ and that $Y_0$ is a linear subspace, so that $Y$ is a linear variety. Since $Y_0 \neq X$, we can write every element of $X$ as the sum of an element of $Y_0$ and a multiple of any fixed $y \notin Y_0$. Indeed, if $x \in X$, if $y$ is any element not in $Y_0$, so that $f(y) \neq 0$, and if

$$z = x - \frac{f(x)}{f(y)} y,$$

then $f(z) = 0$, and thus $x = z + [f(x)/f(y)] y$ has the required form. Now assume that $Y_1$ is a linear subspace of $X$ for which $Y_0 \subset Y_1$ and $Y_1 \neq Y_0$. We can choose $y \in Y_1 - Y_0$, and the above argument shows that $X \subset Y_1$, so that $Y_1 = X$. This shows that $Y_0$ is maximal and that $Y$ is a hyperplane.

The assertion that $Y$ contains the origin if and only if $\alpha = 0$ follows readily.

Consider now the last part of the theorem. If $Y$ is a hyperplane in $X$, then $Y$ is the translation of a maximal linear subspace $Z$ in $X$; i.e., $Y = x_0 + Z$, with $x_0$ fixed. If $x_0 \notin Z$, and if $V(\{x_0\} \cup Z)$ denotes the linear subspace generated by the set $\{x_0\} \cup Z$, then $V(\{x_0\} \cup Z) = X$. If for $x = \alpha x_0 + z$, $z \in Z$, we define $f(x) = \alpha$, then $Y = \{x : f(x) = 1\}$. On the other hand, if $x_0 \in Z$, then $Y = Z$; we take $x_1 \notin Z$, so that $V(\{x_1\} \cup Z) = X$, and define, for $x = \alpha x_1 + z$, $f(x) = \alpha$. Then $Y = \{x : f(x) = 0\}$. This concludes the proof of the theorem.
In the proof of the above theorem we also established the following result:

6.7.4. Theorem. Let $f \neq 0$ be a linear functional on the linear space $X$, and let $Z = \{x : f(x) = 0\}$. If $x_0 \notin Z$, then every $x \in X$ can be expressed as

$$x = \frac{f(x)}{f(x_0)} x_0 + z, \quad z \in Z.$$

The next result shows that it is possible to establish a unique correspondence between hyperplanes and linear functionals. This result follows readily from Theorem 6.7.3.

6.7.5. Theorem. Let $Y$ be a hyperplane in a linear space $X$. If $Y$ does not contain the origin, there is a unique linear functional $f$ on $X$ such that $Y = \{x : f(x) = 1\}$.

6.7.6. Exercise. Prove Theorem 6.7.5.
6.7.7. Theorem. Let $Y$ be a maximal linear subspace in a Banach space $X$. Then either $\bar{Y} = Y$ or $\bar{Y} = X$; i.e., either $Y$ is closed or else $Y$ is dense in $X$.

Proof. Since $Y$ is a linear subspace, $\bar{Y}$ is a linear subspace of $X$ by Theorem 6.2.3. Now $Y \subset \bar{Y}$. Hence, if $Y \neq \bar{Y}$, we must have $\bar{Y} = X$, since $Y$ is a maximal linear subspace.

In the next result we will show that $Y$ is closed if and only if the functional $f$ associated with $Y$ is bounded (i.e., continuous). Thus, corresponding to any hyperplane in a normed linear space there is a functional that is bounded whenever the hyperplane is closed, and vice versa.

6.7.8. Theorem. Let $f$ be a nonzero linear functional on $X$, and let $Y = \{x : f(x) = \alpha\}$ be a hyperplane in $X$. Then $Y$ is closed for every $\alpha$ if and only if $f$ is bounded.

Proof. First, assume that $f$ is bounded; then it is continuous. If $\{x_n\}$ is a sequence in $Y$ which converges to $x \in X$, then $f(x_n) \to f(x) = \alpha$, so that $x \in Y$, and thus $Y$ is closed.
Conversely, let $Z = \{x : f(x) = 0\}$ be closed. In view of Theorem 6.7.4, there exists an $x_0 \notin Z$ such that every element of $X$ can be expressed in terms of $x_0$ and $Z$. Now let $\{x_n\}$ be a sequence in $X$ such that $x_n \to x \in X$. Then it is possible to express each $x_n$ and $x$ as $x_n = c_n x_0 + z_n$ and $x = c x_0 + z$, where $z_n, z \in Z$. Let $d = \inf_{z' \in Z} \|x_0 - z'\|$. Since $Z$ is closed, $d > 0$. Now

$$\|x - x_n\| = \|(c - c_n) x_0 + (z - z_n)\| \geq \inf_{z' \in Z} \|(c - c_n) x_0 + z'\| = |c - c_n| \, d.$$

Thus, $c_n \to c$. Moreover, since $f(x_n) = c_n f(x_0)$ and $f(x) = c f(x_0)$, it follows that $f(x_n) = c_n f(x_0) \to c f(x_0) = f(x)$, so that $f$ is continuous on $X$, and hence bounded.
We now introduce the concept of a half space.

6.7.9. Definition. Let $f$ be a nonzero linear functional on $X$, and let $\alpha \in R$. Let $Y$ be the hyperplane given by $Y = \{x : f(x) = \alpha\}$. Let $Y_1$, $Y_2$, $Y_3$, and $Y_4$ be subsets of $X$ defined by $Y_1 = \{x : f(x) < \alpha\}$, $Y_2 = \{x : f(x) \leq \alpha\}$, $Y_3 = \{x : f(x) > \alpha\}$, and $Y_4 = \{x : f(x) \geq \alpha\}$. Then each of the sets $Y_1$, $Y_2$, $Y_3$, and $Y_4$ is called a half space determined by $Y$. In addition, let $Z_1$ and $Z_2$ be subsets of $X$. We say that $Y$ separates $Z_1$ and $Z_2$ if either (i) $Z_1 \subset Y_2$ and $Z_2 \subset Y_4$, or (ii) $Z_1 \subset Y_4$ and $Z_2 \subset Y_2$.

6.7.10. Exercise. Show that each of the sets $Y_1$, $Y_2$, $Y_3$, $Y_4$ in the preceding definition is convex. Also, show that if in the above definition $f$ is continuous, then $Y_1$ and $Y_3$ are open sets in $X$, and $Y_2$ and $Y_4$ are closed sets in $X$.
In order to demonstrate some of the notions introduced, we conclude this section with the following example.

6.7.11. Example. Let $X = R^2$, and let $x = (\xi_1, \xi_2) \in X$. Let $y = (\eta_1, \eta_2)$ be any fixed vector in $X$, and define the linear functional $f$ on $X$ as

$$f(x) = \eta_1 \xi_1 + \eta_2 \xi_2.$$

The set

$$Y_0 = \{x \in R^2 : f(x) = \eta_1 \xi_1 + \eta_2 \xi_2 = 0\}$$

is a line through the origin of $R^2$ which is normal to the vector $y$. If $\alpha \neq 0$, the hyperplane $Y = \{x \in R^2 : f(x) = \alpha\}$ is a linear variety which is parallel to $Y_0$. The hyperplane $Y$ divides $R^2$ into two open half spaces $Z_1$ and $Z_2$, as depicted in Figure E. It should be noted that $x \in X$ can now be written as $x = z + \beta y$, $z \in Y$, where $x \in Z_1$ if $\beta > 0$ and $x \in Z_2$ if $\beta < 0$.

6.7.12. Figure E. Half spaces.
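Example 6.7.11 can be made computational: the sign of $f(x) - \alpha$ tells on which side of the hyperplane $\{x : f(x) = \alpha\}$ a point lies. The vector $y$ and the level $\alpha$ below are arbitrary illustrative choices.

```python
eta = (1.0, 2.0)   # fixed normal vector y (arbitrary choice)
alpha = 4.0        # level of the hyperplane Y = {x : f(x) = alpha} (arbitrary)

def f(x):
    return eta[0] * x[0] + eta[1] * x[1]

def side(x):
    """Classify a point relative to the hyperplane and its two half spaces."""
    v = f(x) - alpha
    if v > 0:
        return "Z1"   # open half space the normal vector points into
    if v < 0:
        return "Z2"   # open half space on the other side
    return "Y"        # on the hyperplane itself

# (0, 2) satisfies f(x) = 4 and lies on Y; the others fall on either side
labels = [side((0.0, 2.0)), side((3.0, 3.0)), side((0.0, 0.0))]
```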
6.8. EXTENSION OF LINEAR FUNCTIONALS

In this section we state and prove the Hahn-Banach theorem. This result is very important in analysis and has important implications in applications. We would like to point out that the present form of this theorem is not the most general version of the Hahn-Banach theorem.

Throughout this section $X$ will denote a real normed linear space.

6.8.1. Definition. Let $Y$ be a linear subspace of $X$, let $Z$ be a proper linear subspace of $Y$, let $f$ be a bounded linear functional defined on $Z$, and let $F$ be a bounded linear functional defined on $Y$. If $F(x) = f(x)$ whenever $x \in Z$, then $F$ is called an extension of $f$ from $Z$ to $Y$. If the spaces $X$, $Y$, $Z$ are normed and if $\|F\|_Y = \|f\|_Z$, then $F$ is called a norm-preserving extension of $f$.

We now prove the following version of the Hahn-Banach theorem.

6.8.2. Theorem. Every bounded linear functional $f$ defined on a linear subspace $Y$ of a real normed linear space $X$ can be extended to the entire space with preservation of norm. Specifically, one can find a bounded linear functional $F$ such that

(i) $F(x) = f(x)$ for every $x \in Y$; and
(ii) $\|F\|_X = \|f\|_Y$.
Proof. Although this theorem is true for $X$ not separable, we shall give the proof only for the case where $X$ is separable (see Definition 5.4.33 for separability). We assume that $Y$ is a proper linear subspace of $X$, for otherwise there is nothing to prove. Let $x_1 \in X$ but $x_1 \notin Y$, and let us define the subset

$$Y_1 = \{x \in X : x = \alpha x_1 + y, \; \alpha \in R, \; y \in Y\}.$$

It is straightforward to verify that $Y_1$ is a linear subspace of $X$, and furthermore that for each $x \in Y_1$ there is a unique $\alpha \in R$ and a unique $y \in Y$ such that $x = \alpha x_1 + y$. If an extension $F$ of $f$ from $Y$ to $Y_1$ exists, then it has the form

$$F(x) = \alpha F(x_1) + f(y),$$

and if we let $c = F(x_1)$, then $F(x) = f(y) + c\alpha$. From this it is clear that the extension is specified by prescribing the constant $c$. In order that the norm of the functional not be increased when it is continued from $Y$ to $Y_1$, we must find a $c$ such that the inequality

$$|f(y) + c\alpha| \leq \|f\| \, \|y + \alpha x_1\|$$

holds for all $y \in Y$ and all $\alpha \in R$. If $\alpha \neq 0$, then $y/\alpha \in Y$ and the above inequality can be written as

$$|f(y/\alpha) + c| \leq \|f\| \, \|y/\alpha + x_1\|,$$

or

$$|f(y) + c| \leq \|f\| \, \|y + x_1\|$$

for all $y \in Y$. This inequality can be rewritten as

$$-f(y) - \|f\| \, \|y + x_1\| \leq c \leq -f(y) + \|f\| \, \|y + x_1\| \tag{6.8.3}$$

for all $y \in Y$. We now must show that such a number $c$ does indeed always exist. To do this, it suffices to show that for any $y_1, y_2 \in Y$ we have

$$c_1 \triangleq \sup_{y_1 \in Y} \{-f(y_1) - \|f\| \, \|y_1 + x_1\|\} \leq \inf_{y_2 \in Y} \{-f(y_2) + \|f\| \, \|y_2 + x_1\|\} \triangleq c_2. \tag{6.8.4}$$

But this inequality follows directly from

$$f(y_2) - f(y_1) \leq \|f\| \, \|y_2 - y_1\| \leq \|f\| \, \|y_2 + x_1\| + \|f\| \, \|y_1 + x_1\|.$$

In view of (6.8.3) and (6.8.4) it follows that $c_1 \leq c \leq c_2$. If we now let

$$F(x) = f(y) + c\alpha, \quad x = \alpha x_1 + y \in Y_1,$$

we have $\|F\| = \|f\|$, and $F$ is an extension of $f$ from $Y$ to $Y_1$.
Next, since $X$ is separable, it contains a denumerable everywhere dense set $\{z_1, z_2, \ldots, z_n, \ldots\}$. From this set of vectors we select, one at a time, a linearly independent subset $\{x_1, x_2, \ldots, x_n, \ldots\}$ of vectors not belonging to $Y$. The set $\{x_1, x_2, \ldots, x_n, \ldots\}$ together with the linear subspace $Y$ generates a subspace $W$ dense in $X$.

Following the above procedure, we now extend the functional $f$ to a functional on the subspace $W$ by extending $f$ from $Y$ to $Y_1$, then to $Y_2$, etc., where

$$Y_1 = \{x = \alpha x_1 + y : y \in Y, \; \alpha \in R\}$$

and

$$Y_2 = \{x = \alpha x_2 + y : y \in Y_1, \; \alpha \in R\}, \text{ etc.}$$

Finally, we extend $F$ from the dense subspace $W$ to the space $X$. At the remaining points of $X$ the functional $F$ is defined by continuity. If $x \in X$, then there exists a sequence $\{w_n\}$ of vectors in $W$ converging to $x$. By continuity, if $x = \lim_{n \to \infty} w_n$, then $F(x) = \lim_{n \to \infty} F(w_n)$. The inequality $|F(x)| \leq \|f\| \, \|x\|$ follows from

$$|F(x)| = \lim_{n \to \infty} |F(w_n)| \leq \lim_{n \to \infty} \|f\| \, \|w_n\| = \|f\| \, \|x\|.$$

This completes the proof of the theorem.
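The numbers $c_1$ and $c_2$ from the one-step extension in the proof can be estimated numerically. In the sketch below (an illustration only, with $Y$, $f$, and $x_1$ chosen arbitrarily), $Y$ is the $\xi_1$-axis of $R^2$ with the Euclidean norm, $f(t, 0) = t$ so that $\|f\| = 1$, and $x_1 = (0, 1)$. Sampling $y = (t, 0)$ over a grid approximates the sup and inf, and the interval $[c_1, c_2]$ pinches onto $c = 0$, the norm-preserving choice $F(x_1) = 0$.

```python
import math

norm_f = 1.0   # ||f|| on Y, since f(t, 0) = t and ||(t, 0)|| = |t|

def g_lower(t):
    """-f(y) - ||f|| ||y + x1|| for y = (t, 0), x1 = (0, 1)."""
    return -t - norm_f * math.sqrt(t * t + 1.0)

def g_upper(t):
    """-f(y) + ||f|| ||y + x1|| for y = (t, 0), x1 = (0, 1)."""
    return -t + norm_f * math.sqrt(t * t + 1.0)

grid = [k * 0.1 for k in range(-1000, 1001)]
c1 = max(g_lower(t) for t in grid)   # approximates the sup in (6.8.4)
c2 = min(g_upper(t) for t in grid)   # approximates the inf in (6.8.4)
# the interval [c1, c2] is nonempty, as the proof guarantees; here it
# shrinks onto c = 0, giving the norm-preserving extension F(x, y) = x
interval_ok = c1 <= 0.0 <= c2
```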
The next result is a direct consequence of Theorem 6.8.2.

6.8.5. Corollary. Let $x_0 \in X$, $x_0 \neq 0$. Then there exists a bounded nonzero linear functional $f$ defined on all of $X$ such that $f(x_0) = \|x_0\|$ and $\|f\| = 1$.

Proof. Let $Y$ be the linear subspace of $X$ given by $Y = \{y \in X : y = \gamma x_0, \; \gamma \in R\}$. For $y \in Y$, define $f_0(y) = \gamma \|x_0\|$, where $y = \gamma x_0$. Then $\|y\| = |\gamma| \, \|x_0\|$, and so $|f_0(y)| = \|y\|$ for all $y \in Y$. This implies that $\|f_0\| = 1$. The proof now follows from Theorem 6.8.2.
The next result is also a consequence of the Hahn-Banach theorem.

6.8.6. Corollary. Let $x_0 \in X$, $x_0 \neq 0$, and let $\gamma > 0$. Then there exists a bounded nonzero linear functional $f$ defined on all of $X$ such that $\|f\| = \gamma$ and $f(x_0) = \|f\| \, \|x_0\|$.

The above corollary guarantees the existence of nontrivial bounded linear functionals.

6.8.7. Exercise. Prove Corollary 6.8.6.

In the next example a geometric interpretation of Corollary 6.8.5 is given.
6.8.8. Example. Let $x_0 \in X$, $x_0 \neq 0$, and let $f$ be a linear functional defined on $X$ such that $f(x_0) = \|x_0\|$ and $\|f\| = 1$. Let $S$ be the closed sphere given by $S = \{x \in X : \|x\| \leq \|x_0\|\}$. Now if $x \in S$, then $f(x) \leq |f(x)| \leq \|f\| \, \|x\| \leq \|x_0\|$, and so $x$ belongs to the half space $\{x \in X : f(x) \leq \|x_0\|\}$. Thus, the hyperplane $\{x \in X : f(x) = \|x_0\|\}$ is tangent to the closed sphere $S$ (as illustrated in Figure 6.8.9).

6.8.9. Figure. Illustration of Corollary 6.8.5: the hyperplane $\{x : f(x) = \|x_0\|\}$ is tangent to the sphere $S$.
In closing this section, we mention two of the more important consequences of the Hahn-Banach theorem with significant practical implications. One of these states that, given a convex set $Y$ in $X$ containing an interior point and given a fixed point not in the interior of $Y$, there is a hyperplane separating the fixed point and the convex set $Y$. The second of these asserts that if $Y_1$ and $Y_2$ are convex sets in $X$, if $Y_1$ has interior points, and if $Y_2$ contains no interior point of $Y_1$, then there is a closed hyperplane which separates $Y_1$ and $Y_2$.
6.9. DUAL SPACE AND SECOND DUAL SPACE

In this section we briefly reconsider the dual space $X^*$ (see Definition 6.5.9), and we introduce the dual space of $X^*$, called the second dual space. Throughout this section $X$ is a real normed linear space, and $X^f$ is the algebraic conjugate of $X$.

We begin by determining the dual spaces of some common normed linear spaces.
6.9.1. Example. Let $X = R^n$, let $x = (\xi_1, \ldots, \xi_n)$ denote an arbitrary element of $X$, let $a = (\eta_1, \ldots, \eta_n)$ be some fixed element in $R^n$, let $\|x\| = (\xi_1^2 + \cdots + \xi_n^2)^{1/2}$, and recall from Example 6.5.14 that the functional $f(x) = \eta_1 \xi_1 + \cdots + \eta_n \xi_n$ is a bounded linear functional on $X$ and $\|f\| = \|a\|$. If we define a set of basis vectors in $R^n$ as $e_1 = (1, 0, \ldots, 0), \ldots, e_n = (0, \ldots, 0, 1)$, then $x \in R^n$ may be expressed as $x = \sum_{i=1}^n \xi_i e_i$. If we let $\eta_i = f(e_i)$, where $f$ is any bounded linear functional on $R^n$, then

$$f(x) = \sum_{i=1}^n \xi_i f(e_i) = \sum_{i=1}^n \eta_i \xi_i.$$

Thus, the dual space $X^*$ of $R^n$ is itself the space $R^n$ in the sense that the elements of $X^*$ consist of all functionals of the form $f(x) = \sum_{i=1}^n \eta_i \xi_i$. Furthermore, the norm on $X^*$ is $\|f\| = \left(\sum_{i=1}^n \eta_i^2\right)^{1/2} = \|a\|$.
6.9.2. Exercise. Let $X = R^n$, where the norm of $x = (\xi_1, \ldots, \xi_n) \in X$ is given by $\|x\| = \max_{1 \leq i \leq n} |\xi_i|$ (see Example 6.1.5). Show that if $f \in X^*$, then there is an $a = (\eta_1, \ldots, \eta_n) \in R^n$ such that $f(x) = \eta_1 \xi_1 + \cdots + \eta_n \xi_n$, so that $X^* = R^n$, and show that the norm on $X^*$ is given by $\|f\| = \sum_{i=1}^n |\eta_i|$.

6.9.3. Exercise. Let $X = R^n$, and define the norm of $x = (\xi_1, \ldots, \xi_n) \in X$ by $\|x\| = (|\xi_1|^p + \cdots + |\xi_n|^p)^{1/p}$, where $1 < p < \infty$ (see Example 6.1.5). Show that if $f \in X^*$, then there is an $a = (\eta_1, \ldots, \eta_n) \in R^n$ such that $f(x) = \eta_1 \xi_1 + \cdots + \eta_n \xi_n$, i.e., $X^* = R^n$, and show that the norm on $X^*$ is given by $\|f\| = (|\eta_1|^q + \cdots + |\eta_n|^q)^{1/q}$, where $q$ is such that $\frac{1}{p} + \frac{1}{q} = 1$.

6.9.4. Exercise. Let $X$ be the space $l_p$, $1 \leq p < \infty$, defined in Example 6.1.6, and let $\frac{1}{p} + \frac{1}{q} = 1$. If $p = 1$, we take $q = \infty$. Show that the dual space of $l_p$ is $l_q$. Specifically, show that every bounded linear functional on $l_p$ is uniquely representable as

$$f(x) = \sum_{i=1}^\infty \eta_i \xi_i,$$

where $a = (\eta_1, \ldots, \eta_k, \ldots)$ is an element of $l_q$. Also, show that every element $a$ of $l_q$ defines an element of $(l_p)^*$ in the same way, and that

$$\|f\| = \begin{cases} \left( \sum_{i=1}^\infty |\eta_i|^q \right)^{1/q} & \text{if } 1 < p < \infty, \\ \sup_i |\eta_i| & \text{if } p = 1. \end{cases}$$
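The dual-norm formula of Exercise 6.9.4 can be probed on finitely nonzero sequences: Hölder's inequality gives $|f(x)| \leq \|a\|_q \|x\|_p$, and the choice $\xi_k = \operatorname{sgn}(\eta_k)|\eta_k|^{q-1}$ turns it into an equality, so the supremum is attained. A sketch with $p = 3$ and an arbitrarily chosen $a$:

```python
p = 3.0
q = p / (p - 1.0)            # conjugate exponent, 1/p + 1/q = 1
a = [2.0, -1.0, 0.5, 3.0]    # fixed finitely nonzero element (arbitrary)

def lp_norm(x, r):
    return sum(abs(xi) ** r for xi in x) ** (1.0 / r)

def f(x):
    return sum(ai * xi for ai, xi in zip(a, x))

# Hoelder's inequality |f(x)| <= ||a||_q ||x||_p on a few sample vectors
samples = [[1.0, 1.0, 1.0, 1.0], [0.5, -2.0, 4.0, 0.0], [1.0, 0.0, 0.0, -1.0]]
holder_holds = all(abs(f(x)) <= lp_norm(a, q) * lp_norm(x, p) + 1e-9
                   for x in samples)

# the extremal x_k = sgn(a_k) |a_k|**(q - 1) turns Hoelder into an equality,
# so the ratio |f(x)| / ||x||_p reaches ||a||_q
x_star = [(1.0 if ai >= 0.0 else -1.0) * abs(ai) ** (q - 1.0) for ai in a]
ratio = abs(f(x_star)) / lp_norm(x_star, p)
norm_matches = abs(ratio - lp_norm(a, q)) < 1e-9
```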
Since $X^*$ is a normed linear space (see Theorem 6.5.6), it is possible to form the dual space of $X^*$, which we will denote by $X^{**}$ and which will be referred to as the second dual space of $X$. As before, we will use the notation $x''$ for elements of $X^{**}$, and we will write

$$x''(x') = \langle x', x'' \rangle,$$

where $x' \in X^*$. If $X^f$ denotes the algebraic conjugate of $X$, then the reader can readily show that even though $X^* \subset X^f$ and $X^{**} \subset (X^*)^f$, in general $X^{**}$ is not a linear subspace of $X^f$.

Let us define a mapping of $X$ into $X^{**}$, $x \mapsto x''$, by the relation

$$\langle x', x'' \rangle = \langle x, x' \rangle, \quad x \in X, \; x' \in X^*, \tag{6.9.5}$$

or, equivalently, by

$$x''(x') = x'(x). \tag{6.9.6}$$

We call this mapping the canonical mapping of $X$ into $X^{**}$. The functional $x''$ defined on $X^*$ in this way is linear, because

$$x''(\alpha x_1' + \beta x_2') = \langle \alpha x_1' + \beta x_2', x'' \rangle = \alpha \langle x_1', x'' \rangle + \beta \langle x_2', x'' \rangle = \alpha x''(x_1') + \beta x''(x_2'),$$

and thus $x'' \in (X^*)^f$. Since

$$|x''(x')| = |x'(x)| = |\langle x, x' \rangle| \leq \|x\| \, \|x'\|,$$

it follows that $\|x''\| \leq \|x\|$, and thus $x'' \in X^{**}$. We can actually show that $\|x''\| = \|x\|$. This is obvious for $x = 0$. If $x \neq 0$, then in view of Corollary 6.8.6 there exists a nonzero $x' \in X^*$ such that $\langle x, x' \rangle = \|x'\| \, \|x\|$, and thus $\|x''\| \geq \|x\|$. From this it follows that the norm of every $x \in X$ can be defined in two ways: as the norm of an element in $X$ and as the norm of a linear functional on $X^*$, i.e., as the norm of an element in $X^{**}$. We summarize this discussion in the following result:
6.9.7. Theorem. $X$ is isometric to some linear subspace in $X^{**}$.

If we agree not to distinguish between isometric spaces, then Theorem 6.9.7 can simply be stated as $X \subset X^{**}$.

6.9.8. Definition. A normed linear space $X$ is said to be reflexive if the canonical mapping (6.9.6) of $X$ into $X^{**}$ is onto. If we again agree not to distinguish between isometric spaces, we write in this case $X^{**} = X$. If $X \neq X^{**}$, then $X$ is said to be irreflexive.

6.9.9. Example. The space $R^n$ with norm $\|\cdot\|_p$, $1 \leq p \leq \infty$, is reflexive.

6.9.10. Example. The spaces $l_p$, $1 < p < \infty$, are reflexive.

6.9.11. Example. The space $l_1$ is irreflexive.

6.9.12. Exercise. Prove the assertions made in Examples 6.9.9 through 6.9.11.
6.10. WEAK CONVERGENCE

Having introduced the normed dual space, we are now in a position to consider the notion of weak convergence, a concept which arises frequently in analysis and which plays an important role in certain applications. Throughout this section $X$ denotes a normed linear space and $X^*$ is the dual space of $X$.

6.10.1. Definition. A sequence $\{x_n\}$ of elements in $X$ is said to converge weakly to the element $x \in X$ if for every $x' \in X^*$, $\langle x_n, x' \rangle \to \langle x, x' \rangle$. In this case we write $x_n \to x$ weakly. If a sequence $\{x_n\}$ converges to $x \in X$, i.e., if $\|x_n - x\| \to 0$ as $n \to \infty$, then we call this convergence strong convergence or convergence in norm, to distinguish it from weak convergence.

6.10.2. Theorem. Let $\{x_n\}$ be a sequence in $X$ which converges in norm to $x \in X$. Then $\{x_n\}$ converges weakly to $x$.

Proof. Assume that $\|x_n - x\| \to 0$ as $n \to \infty$. Then for any $x' \in X^*$ we have

$$|\langle x_n, x' \rangle - \langle x, x' \rangle| \leq \|x'\| \, \|x_n - x\| \to 0 \text{ as } n \to \infty,$$

and thus $x_n \to x$ weakly.
Thus, strong convergence implies weak convergence. However, the converse is not true in general, as the following example shows.

6.10.3. Example. Consider in $l_2$ the sequence of vectors $x_1 = (1, 0, \ldots, 0, \ldots)$, $x_2 = (0, 1, 0, \ldots, 0, \ldots)$, $x_3 = (0, 0, 1, \ldots, 0, \ldots)$, $\ldots$. To show that $\{x_n\}$ converges weakly to $x = 0$, we note that every $x' \in (l_2)^*$ can be represented as the scalar product with some fixed vector $y = (\eta_1, \eta_2, \ldots, \eta_n, \ldots) \in l_2$; i.e., if $x = (\xi_1, \xi_2, \ldots, \xi_n, \ldots)$, then

$$\langle x, x' \rangle = \sum_{i=1}^\infty \xi_i \eta_i$$

(see Exercise 6.9.4). For the case of the sequence $\{x_n\}$ we now have

$$\langle x_n, x' \rangle = \eta_n,$$

and since $\eta_n \to 0$ as $n \to \infty$ for every $y \in l_2$, it follows that $\langle x_n, x' \rangle \to 0$ as $n \to \infty$ for every $x' \in (l_2)^*$. Thus, $\{x_n\}$ converges to $x = 0$ weakly. However, $\{x_n\}$ does not converge to $0$ strongly, because $\|x_n\| = 1$ for all $n$.
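The behavior in Example 6.10.3 is visible numerically: against a fixed $y \in l_2$, the pairings $\langle x_n, x' \rangle = \eta_n$ decay to zero, while $\|x_n\|$ stays equal to 1. A sketch with truncated sequences and an arbitrary square-summable $y$:

```python
import math

N = 200                                   # truncation length for the sketch
y = [1.0 / (k + 1.0) for k in range(N)]   # eta_k = 1/k is square-summable

def basis(n):
    v = [0.0] * N
    v[n] = 1.0
    return v

def pair(x, yy):
    """<x, x'> where the functional x' is represented by the l_2 element yy."""
    return sum(xi * yi for xi, yi in zip(x, yy))

pairings = [pair(basis(n), y) for n in range(N)]   # equals eta_n = 1/(n + 1)
norms = [math.sqrt(sum(xi * xi for xi in basis(n))) for n in range(N)]
weak_decay = all(p > 0.0 for p in pairings) and pairings[-1] < 0.01
no_strong = all(abs(nm - 1.0) < 1e-12 for nm in norms)
```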
We leave the proof of the next result as an exercise to the reader.

6.10.4. Theorem. If $X$ is finite dimensional, weak and strong convergence are equivalent.

6.10.5. Exercise. Prove Theorem 6.10.4.

Analogous to the concept of weak convergence of elements of a normed linear space $X$, we can introduce the notion of weak* convergence of elements of $X^*$.

6.10.6. Definition. A sequence of functionals $\{x_n'\}$ in $X^*$ converges weak-star (i.e., weak*) to the linear functional $x' \in X^*$ if for every $x \in X$ we have $\langle x, x_n' \rangle \to \langle x, x' \rangle$. We say that $x_n' \to x'$ weak*.

Since strong convergence in $X^*$ implies weak convergence in $X^*$, it follows that if a sequence of linear functionals $\{x_n'\}$ in $X^*$ converges to the linear functional $x' \in X^*$, then $x_n' \to x'$ weak*.
Let us consider an example.

6.10.7. Example. Let $[a, b]$ be an interval on the real line containing the origin, i.e., $a < 0 < b$, and let $\{C[a, b]; \|\cdot\|_\infty\}$ be the Banach space of real-valued continuous functions as defined in Example 6.1.9. Let $\{\varphi_n\}$ be a sequence of functions in $C[a, b]$ satisfying the following conditions for $n = 1, 2, \ldots$:

(i) $\varphi_n(t) \geq 0$ for all $t \in [a, b]$;
(ii) $\varphi_n(t) = 0$ if $|t| > 1/n$ and $t \in [a, b]$; and
(iii) $\int_a^b \varphi_n(t) \, dt = 1$.

For each $n = 1, 2, \ldots$, we can define a continuous linear functional $x_n'$ on $C[a, b]$ (see Example 6.5.13) by

$$\langle x, x_n' \rangle = \int_a^b x(t) \varphi_n(t) \, dt,$$

where $x \in C[a, b]$. Now let $x'$ be defined on $C[a, b]$ by

$$\langle x, x' \rangle = x(0)$$

for all $x \in C[a, b]$. It is clear that $x' \in X^*$. We now show that $x_n' \to x'$ weak*. By the mean value theorem from the calculus, there is a $t_n$ with $-1/n \leq t_n \leq 1/n$ such that

$$\int_{-1/n}^{1/n} \varphi_n(t) x(t) \, dt = x(t_n) \int_{-1/n}^{1/n} \varphi_n(t) \, dt = x(t_n)$$

for each $n = 1, 2, \ldots$ and each $x \in C[a, b]$. Thus, $\langle x, x_n' \rangle \to x(0)$ for every $x \in C[a, b]$; i.e., $x_n' \to x'$ weak*. We see that the sequence of functions $\{\varphi_n\}$ does not approach a limit in $C[a, b]$. In particular, there is no $\varphi_0 \in C[a, b]$ such that $x(0) = \int_a^b x(t) \varphi_0(t) \, dt$. Frequently, in applications, it is convenient to say that the sequence $\{\varphi_n\}$ converges to the so-called "$\delta$ function," which has this property. We see that the sequence $\{\varphi_n\}$ converges to the $\delta$ function in the sense of weak* convergence.
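The weak* limit in Example 6.10.7 can be checked with a concrete choice of $\{\varphi_n\}$: the triangular bumps $\varphi_n(t) = n^2(1/n - |t|)$ for $|t| \leq 1/n$ (and $0$ elsewhere) satisfy conditions (i)-(iii), and Riemann sums of $\int x(t)\varphi_n(t)\,dt$ approach $x(0)$. The bump shape and test function below are arbitrary illustrative choices.

```python
import math

def phi(n, t):
    """Triangular bump: nonnegative, zero for |t| > 1/n, integral equal to 1."""
    return max(0.0, n * n * (1.0 / n - abs(t)))

def pairing(x, n, a=-1.0, b=1.0, steps=100000):
    """Midpoint Riemann sum of the integral of x(t) * phi_n(t) over [a, b]."""
    h = (b - a) / steps
    return sum(x(a + (k + 0.5) * h) * phi(n, a + (k + 0.5) * h)
               for k in range(steps)) * h

x = lambda t: math.cos(t) + t   # continuous test function with x(0) = 1
errors = [abs(pairing(x, n) - x(0.0)) for n in (1, 4, 16, 64)]
# <x, x_n'> approaches x(0) as the bumps concentrate at the origin
converges = all(e2 < e1 for e1, e2 in zip(errors, errors[1:]))
```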
6.10.8. Theorem. Let $X$ be a separable normed linear space. Every bounded sequence of linear functionals in $X^*$ contains a weak* convergent subsequence.

Proof. Since $X$ is separable, we can choose a denumerable everywhere dense set $\{x_1, x_2, \ldots, x_n, \ldots\}$ in $X$. Now let $\{x_n'\}$ be a bounded sequence in $X^*$. Since this sequence is bounded in norm, the sequence $\{\langle x_1, x_n' \rangle\}$ is a bounded sequence in either $R$ or $C$. It now follows that we can select from $\{x_n'\}$ a subsequence $\{x_{n1}'\}$ such that the sequence $\{\langle x_1, x_{n1}' \rangle\}$ converges. Again, from the subsequence $\{x_{n1}'\}$ we can select another subsequence $\{x_{n2}'\}$ such that the sequence $\{\langle x_2, x_{n2}' \rangle\}$ converges. Continuing this procedure, we obtain the sequences

$$x_{11}', x_{21}', \ldots, x_{n1}', \ldots$$
$$x_{12}', x_{22}', \ldots, x_{n2}', \ldots$$
$$\cdots$$

By taking the diagonal of the above array, we obtain the subsequence $x_{11}', x_{22}', \ldots, x_{nn}', \ldots$ of linear functionals. For this subsequence, the sequence $\langle x_n, x_{11}' \rangle, \langle x_n, x_{22}' \rangle, \langle x_n, x_{33}' \rangle, \ldots$ converges for each $n$. But then $\langle x, x_{11}' \rangle, \langle x, x_{22}' \rangle, \langle x, x_{33}' \rangle, \ldots$ converges for all $x \in X$. This completes the proof of the theorem.
The concepts of weak convergence and weak* convergence give rise to various generalizations, some of which we briefly mention.

Let $X$ be a normed linear space, and let $X^*$ be its normed dual. We call a set $Y \subset X^*$ weak* compact if every infinite sequence from $Y$ contains a weak* convergent subsequence. We say that a functional $f$ defined on $X$, which in general may be nonlinear, is weakly continuous at a point $x_0 \in X$ if for every $\epsilon > 0$ there is a $\delta > 0$ and a finite collection $\{x_1', x_2', \ldots, x_n'\}$ in $X^*$ such that $|f(x) - f(x_0)| < \epsilon$ for all $x$ such that $|\langle x - x_0, x_i' \rangle| < \delta$ for $i = 1, 2, \ldots, n$. We can define weak* continuity of a functional on $X^*$ similarly, by interchanging the roles of $X$ and $X^*$.

It can be shown that if $X$ is a real normed linear space and $X^*$ is its normed dual, then any closed sphere in $X^*$ is weak* compact.

The reader can readily show that if $f$ is a weakly continuous functional, then $x_n \to x$ weakly implies that $f(x_n) \to f(x)$.
6.11. INNER PRODUCT SPACES

We recall (see Definition 3.6.19 and the discussion following this definition) that if $X$ is a complex linear space, a function defined on $X \times X$ into $C$, which we denote by $(x, y)$ for $x, y \in X$, is called an inner product if

(i) $(x, x) > 0$ for all $x \neq 0$ and $(x, x) = 0$ if $x = 0$;
(ii) $(x, y) = \overline{(y, x)}$ for all $x, y \in X$;
(iii) $(\alpha x + \beta y, z) = \alpha (x, z) + \beta (y, z)$ for all $x, y, z \in X$ and for all $\alpha, \beta \in C$; and
(iv) $(x, \alpha y + \beta z) = \bar{\alpha} (x, y) + \bar{\beta} (x, z)$ for all $x, y, z \in X$ and for all $\alpha, \beta \in C$.

In the case of real linear spaces, the preceding characterization of an inner product is identical, except that we omit complex conjugates in (ii) and (iv).

We call a complex (real) linear space $X$ on which an inner product $(\cdot\,, \cdot)$ is defined a complex (real) inner product space, which we denote by $\{X; (\cdot\,, \cdot)\}$ (see Definition 3.6.20). If the particular inner product being used in a given discussion is understood, we simply write $X$ to denote the inner product space. In accordance with our discussion following Definition 3.6.20, recall also that different inner products defined on the same linear space yield different inner product spaces. Finally, refer also to the discussion following Definition 3.6.20 for the characterization of an (inner product) subspace.

We have already extensively studied finite-dimensional real inner product spaces, i.e., Euclidean vector spaces, in Sections 4.9 and 4.10. Our subsequent presentation will be in a more general setting, where $X$ need not be finite dimensional and where $X$ may be a complex vector space. In fact, unless otherwise stated, $\{X; (\cdot\,, \cdot)\}$ will denote in this section an arbitrary complex inner product space. Since the proofs of several of the following theorems are nearly identical to corresponding ones in Sections 4.9 and 4.10, we will leave such proofs as exercises.
One of our first objectives will be to show that every inner product space $\{X; (\cdot\,, \cdot)\}$ has a norm associated with it which is induced by its inner product. We find it convenient to consider first the Schwarz inequality, given in the following theorem.

6.11.1. Theorem. For any $x \in X$, let us define the function $\|\cdot\|: X \to R$ by $\|x\| = (x, x)^{1/2}$. Then for all $x, y \in X$,

$$|(x, y)| \leq \|x\| \, \|y\|. \tag{6.11.2}$$

6.11.3. Exercise. Prove Theorem 6.11.1 (see Theorem 4.9.28).

Using the above result, we can now readily show that the function $\|\cdot\|$ defined by $\|x\| = (x, x)^{1/2}$ is a norm.

6.11.4. Theorem. Let $X$ be an inner product space. Then the function $\|\cdot\|: X \to R$ defined by

$$\|x\| = (x, x)^{1/2} \tag{6.11.5}$$

is a norm; i.e., for every $x, y \in X$ and for every $\alpha \in C$, we have

(i) $\|x\| \geq 0$;
(ii) $\|x\| = 0$ if and only if $x = 0$;
(iii) $\|\alpha x\| = |\alpha| \, \|x\|$; and
(iv) $\|x + y\| \leq \|x\| + \|y\|$.

6.11.6. Exercise. Prove Theorem 6.11.4 (see Theorem 4.9.31).
6.11. Inner Product Spaces
Theorem 6.11.4 allows us to view every inner product space as a normed linear space, provided that we use Eq. (6.11.5) to define the norm on X. Moreover, in view of Theorem 6.1.2, we may view every inner product space as a metric space, provided that we define the metric by ρ(x, y) = ||x − y||. Subsequently, we adopt the convention that when using the properties and terminology of a normed linear space in connection with an inner product space we mean the norm induced by the inner product, as given in Eq. (6.11.5).

We are now in a position to make the following important definition.

6.11.7. Definition. A complete inner product space is called a Hilbert space.

Thus, every Hilbert space is also a Banach space (and also a complete metric space). Some authors insist that Hilbert spaces be infinite dimensional. We shall not follow that practice. An arbitrary inner product space (not necessarily complete) is sometimes also called a pre-Hilbert space.
6.11.8. Example. Let X be a finite-dimensional (real or complex) inner product space. It follows from Theorem 6.6.5 that X is a Hilbert space.

6.11.9. Example. Let l_2 be the (complex) linear space defined in Example 6.1.6. Let x = (ξ_1, ξ_2, ...) ∈ l_2, y = (η_1, η_2, ...) ∈ l_2, and define (·,·): l_2 × l_2 → C as

(x, y) = Σ_{i=1}^∞ ξ_i η̄_i.

It can readily be shown that (·,·) is an inner product on l_2. Since l_2 is complete relative to the norm induced by this inner product (see Example 6.1.6), it follows that l_2 is a Hilbert space.
6.11.10. Example

(a) Let X = C[a, b] denote the linear space of complex-valued continuous functions defined on [a, b] (see Example 6.1.9). For x, y ∈ C[a, b] define

(x, y) = ∫_a^b x(t) ȳ(t) dt.

It is readily verified that this space is a pre-Hilbert space. In view of Example 6.1.9 this space is not complete relative to the norm ||x|| = (x, x)^{1/2}, and hence it is not a Hilbert space.

(b) We extend the space of real-valued functions, L_p[a, b], defined in Example 5.5.31 for the case p = 2, to complex-valued functions to be the set of all functions f: [a, b] → C such that f = u + iv for u, v ∈ L_2[a, b]. Denoting this space also by L_2[a, b], we define

(f, g) = ∫_{[a,b]} f ḡ dμ,

for f, g ∈ L_2[a, b], where integration is in the Lebesgue sense. The space {L_2[a, b]; (·,·)} is a Hilbert space.
In the next example we consider the Cartesian product of Hilbert spaces.

6.11.11. Example. Let {X_i; (·,·)_i}, i = 1, ..., n, denote a finite collection of Hilbert spaces over C, and let X = X_1 × ... × X_n. If x ∈ X, then x = (x_1, ..., x_n) with x_i ∈ X_i. Defining vector addition and multiplication of vectors by scalars in the usual manner (see Eqs. (3.2.14), (3.2.15), and the related discussion, and see Example 6.1.10), it follows that X is a linear space. If x, y ∈ X and if (x_i, y_i)_i denotes the inner product of x_i and y_i on X_i, then it is easy to show that

(x, y) = Σ_{i=1}^n (x_i, y_i)_i

defines an inner product on X. The norm induced on X by this inner product is

||x|| = (x, x)^{1/2} = (Σ_{i=1}^n ||x_i||_i²)^{1/2},

where ||x_i||_i = (x_i, x_i)_i^{1/2}. It is readily verified that X is complete, and thus X is a Hilbert space.

6.11.12. Exercise. Verify the assertions made in Example 6.11.11.
In Theorem 6.1.15 we saw that in a normed linear space {X; ||·||}, the norm ||·|| is a continuous mapping of X into R. Our next result establishes the continuity of an inner product. In the following, x_n → x implies convergence with respect to the norm induced by the inner product (·,·) on X.

6.11.13. Theorem. Let {x_n} be a sequence in X such that x_n → x, where x ∈ X, and let {y_n} be a sequence in X such that y_n → y. Then
(i) (x, y_n) → (x, y) for all x ∈ X;
(ii) (x_n, y) → (x, y) for all y ∈ X;
(iii) ||x_n|| → ||x||; and
(iv) if Σ_{n=1}^∞ y_n is convergent in X, then (Σ_{n=1}^∞ y_n, z) = Σ_{n=1}^∞ (y_n, z) for all z ∈ X.

6.11.14. Exercise. Prove Theorem 6.11.13.
Next, let us recall that two vectors x, y ∈ X are said to be orthogonal if (x, y) = 0 (see Definition 3.6.22). In this case we write x ⊥ y. If Y ⊂ X and x ∈ X is such that x ⊥ y for all y ∈ Y, then we write x ⊥ Y. Also, if Z ⊂ X and Y ⊂ X and if x ⊥ Z for all x ∈ Y, then we write Y ⊥ Z. Furthermore, observe that x ⊥ x implies that x = 0. Finally, the notion of inner product allows us to consider the concepts of alignment and collinearity of vectors.

6.11.15. Definition. Let X be an inner product space. The vectors x, y ∈ X are said to be collinear if |(x, y)| = ||x|| ||y|| and aligned if (x, y) = ||x|| ||y||.
Our next result is proved by straightforward computation.

6.11.16. Theorem. For all x, y ∈ X we have
(i) ||x + y||² + ||x − y||² = 2||x||² + 2||y||²; and
(ii) if x ⊥ y, then ||x + y||² = ||x||² + ||y||².

6.11.17. Exercise. Prove Theorem 6.11.16.

Parts (i) and (ii) of Theorem 6.11.16 are referred to as the parallelogram law and the Pythagorean theorem, respectively (refer to Theorems 4.9.33 and 4.9.38).
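Both identities can be observed numerically. The sketch below (our illustration, not the book's) verifies (i) for random complex vectors and (ii) for an orthogonal pair:

```python
import numpy as np

rng = np.random.default_rng(1)

def norm(x):
    # ||x|| = (x, x)^(1/2) with (x, y) = sum_i x_i conj(y_i)
    return np.sqrt(np.sum(x * np.conj(x)).real)

# (i) Parallelogram law: ||x + y||^2 + ||x - y||^2 = 2||x||^2 + 2||y||^2
x = rng.normal(size=3) + 1j * rng.normal(size=3)
y = rng.normal(size=3) + 1j * rng.normal(size=3)
lhs = norm(x + y) ** 2 + norm(x - y) ** 2
rhs = 2 * norm(x) ** 2 + 2 * norm(y) ** 2
assert abs(lhs - rhs) < 1e-10

# (ii) Pythagorean theorem for an orthogonal pair u, v with (u, v) = 0
u = np.array([1.0 + 1j, 0.0, 0.0])
v = np.array([0.0, 2.0 - 1j, 0.0])
assert abs(norm(u + v) ** 2 - (norm(u) ** 2 + norm(v) ** 2)) < 1e-12
```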
6.11.18. Definition. Let {x_α : α ∈ I} be an indexed set of elements in X, where I is an arbitrary index set (i.e., I is not necessarily the integers). Then {x_α : α ∈ I} is said to be an orthogonal set of vectors if x_α ⊥ x_β for all α, β ∈ I such that α ≠ β. A vector x ∈ X is called a unit vector if ||x|| = 1. An orthogonal set of vectors is called an orthonormal set if every element of the set is a unit vector. Finally, if {x_i} is a sequence of elements in X, we define an orthogonal sequence and an orthonormal sequence in an obvious manner.

Using an inductive process we can generalize part (ii) of Theorem 6.11.16 as follows.

6.11.19. Theorem. Let {x_1, ..., x_n} be a finite orthogonal set in X. Then

||Σ_{i=1}^n x_i||² = Σ_{i=1}^n ||x_i||².

We note that if x ≠ 0 and if y = x/||x||, then ||y|| = 1. Hence, it is possible to convert every orthogonal set of vectors into an orthonormal set.

Let us now consider a specific example.
6.11.20. Example. Let X denote the space of continuous complex-valued functions on the interval [0, 1]. In accordance with Example 6.11.10, we define an inner product on X by

(f, g) = ∫_0^1 f(t) ḡ(t) dt.   (6.11.21)

We now show that the set of vectors defined by

f_n(t) = e^{2πnti}, n = 0, ±1, ±2, ..., i = √−1,   (6.11.22)

is an orthonormal set in X. Substituting Eq. (6.11.22) into Eq. (6.11.21), we obtain

(f_n, f_m) = ∫_0^1 f_n(t) f̄_m(t) dt = ∫_0^1 e^{2π(n−m)ti} dt = [e^{2π(n−m)i} − 1] / [2π(n − m)i].

Since e^{2πki} = cos 2πk + i sin 2πk, we have

(f_n, f_m) = 0, m ≠ n;

i.e., if m ≠ n, then f_n ⊥ f_m. On the other hand,

(f_n, f_n) = ∫_0^1 e^{2π(n−n)ti} dt = 1;

i.e., if n = m, then (f_n, f_n) = 1 and ||f_n|| = 1.
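The orthonormality relations above can be confirmed by numerical integration. The sketch below (ours, not the book's) approximates (f_n, f_m) on [0, 1] with the trapezoidal rule; the helper names `f` and `ip` are our own:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 4001)

def f(n):
    # f_n(t) = exp(2*pi*n*t*i), Eq. (6.11.22)
    return np.exp(2j * np.pi * n * t)

def ip(g, h):
    # (g, h) = integral over [0, 1] of g(t) conj(h(t)) dt, Eq. (6.11.21),
    # approximated by the trapezoidal rule on the grid t
    y = g * np.conj(h)
    return np.sum((y[1:] + y[:-1]) / 2) * (t[1] - t[0])

for n in range(-2, 3):
    for m in range(-2, 3):
        expected = 1.0 if n == m else 0.0
        assert abs(ip(f(n), f(m)) - expected) < 1e-6
```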
The next result arises often in applications.

6.11.23. Theorem. If {x_1, ..., x_n} is a finite orthonormal set in X, then

(i) Σ_{i=1}^n |(x, x_i)|² ≤ ||x||² for all x ∈ X; and   (6.11.24)

(ii) (x − Σ_{i=1}^n (x, x_i) x_i) ⊥ x_j for any j = 1, ..., n.

6.11.25. Exercise. Prove Theorem 6.11.23 (see Theorem 4.9.58).

On passing to the limit as n → ∞ in (6.11.24), we obtain the following result.

6.11.26. Theorem. If {x_i} is any countable orthonormal set in X, then

Σ_{i=1}^∞ |(x, x_i)|² ≤ ||x||²   (6.11.27)

for every x ∈ X.

The relationship (6.11.27) is known as the Bessel inequality. The scalars α_i = (x, x_i) are called the Fourier coefficients of x with respect to the orthonormal set {x_i}.
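As an illustration (ours, not from the text), Bessel's inequality can be observed in C⁵: with fewer orthonormal vectors than the dimension, the sum of squared Fourier coefficients stays below ||x||², and part (ii) of Theorem 6.11.23 holds for the residual. Note that `np.vdot(a, b)` conjugates its first argument, so (x, x_i) = `np.vdot(x_i, x)` in our convention:

```python
import numpy as np

rng = np.random.default_rng(2)

# An orthonormal set {x_1, x_2, x_3} in C^5: three standard basis vectors
basis = [np.eye(5)[i].astype(complex) for i in range(3)]

x = rng.normal(size=5) + 1j * rng.normal(size=5)
coeffs = [np.vdot(xi, x) for xi in basis]      # Fourier coefficients (x, x_i)

# Bessel inequality (6.11.27): sum_i |(x, x_i)|^2 <= ||x||^2
bessel_sum = sum(abs(c) ** 2 for c in coeffs)
norm_sq = np.vdot(x, x).real
assert bessel_sum <= norm_sq + 1e-12

# Part (ii) of Theorem 6.11.23: the residual is orthogonal to each x_j
residual = x - sum(c * xi for c, xi in zip(coeffs, basis))
for xj in basis:
    assert abs(np.vdot(xj, residual)) < 1e-12
```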
The next result is a generalization of Theorem 4.9.17.

6.11.28. Theorem. In an inner product space X we have (x, y) = 0 for all x ∈ X if and only if y = 0.

6.11.29. Exercise. Prove Theorem 6.11.28.

From our discussion thus far it should be clear that not every normed linear space can be made into an inner product space. The following theorem gives us sufficient conditions for which a normed linear space is also an inner product space.
6.11.30. Theorem. Let X be a normed linear space. If for all x, y ∈ X,

||x + y||² + ||x − y||² = 2(||x||² + ||y||²),   (6.11.31)

then it is possible to define an inner product on X by

(x, y) = ¼[||x + y||² − ||x − y||² + i||x + iy||² − i||x − iy||²]   (6.11.32)

for all x, y ∈ X, where i = √−1.

6.11.33. Exercise. Prove Theorem 6.11.30.

6.11.34. Corollary. If X is a real normed linear space whose norm satisfies Eq. (6.11.31) for all x, y ∈ X, then it is possible to define an inner product on X by

(x, y) = ¼[||x + y||² − ||x − y||²]

for all x, y ∈ X.

6.11.35. Exercise. Prove Corollary 6.11.34.

In view of part (i) of Theorem 6.11.16 and in view of Theorem 6.11.30, condition (6.11.31) is both necessary and sufficient that a normed linear space be also an inner product space. Furthermore, it can also be shown that Eq. (6.11.32) uniquely defines the inner product on a normed linear space.
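Equation (6.11.32) can be checked directly for the standard inner product on C^n, whose induced norm satisfies the parallelogram law. The following sketch (ours, not part of the text) recovers (x, y) from norms alone:

```python
import numpy as np

rng = np.random.default_rng(3)

def inner(x, y):
    # (x, y) = sum_i x_i conj(y_i)
    return np.sum(x * np.conj(y))

def norm(x):
    return np.sqrt(inner(x, x).real)

def polarization(x, y):
    # Eq. (6.11.32): (x, y) = (1/4)[||x+y||^2 - ||x-y||^2
    #                              + i||x+iy||^2 - i||x-iy||^2]
    return 0.25 * (norm(x + y) ** 2 - norm(x - y) ** 2
                   + 1j * norm(x + 1j * y) ** 2 - 1j * norm(x - 1j * y) ** 2)

x = rng.normal(size=4) + 1j * rng.normal(size=4)
y = rng.normal(size=4) + 1j * rng.normal(size=4)
assert abs(polarization(x, y) - inner(x, y)) < 1e-10
```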
We conclude this section with the following exercise.

6.11.36. Exercise. Let l_p, 1 ≤ p ≤ ∞, be the normed linear space defined in Example 6.1.6. Show that l_p is an inner product space if and only if p = 2.
6.12. ORTHOGONAL COMPLEMENTS

In this section we establish some interesting structural properties of Hilbert spaces. Specifically, we will show that any vector x of a Hilbert space X can uniquely be represented as the sum of two vectors y and z, where y is in a subspace Y of X and z is orthogonal to Y. This is known as the projection theorem. In proving this theorem we employ the so-called "classical projection theorem," a result of great importance in its own right. This theorem extends the following familiar result to the case of (infinite-dimensional) Hilbert spaces: in the three-dimensional Euclidean space the shortest distance between a point and a plane is along a vector through the point and perpendicular to the plane. Both the classical projection theorem and the projection theorem are of great importance in applications.

Throughout this section, {X; (·,·)} is a complex inner product space.
6.12.1. Definition. Let Y be a non-void subset of X. The set of all vectors orthogonal to Y, denoted by Y⊥, is called the orthogonal complement of Y. The orthogonal complement of Y⊥ is denoted by (Y⊥)⊥ ≜ Y⊥⊥, the orthogonal complement of Y⊥⊥ is denoted by (Y⊥⊥)⊥ ≜ Y⊥⊥⊥, etc.

6.12.2. Example. Let X be the space E³ depicted in Figure G, and let Y be the x_1-axis. Then Y⊥ is the x_2x_3-plane, Y⊥⊥ is the x_1-axis, and Y⊥⊥⊥ is again the x_2x_3-plane, etc. Thus, in the present case, Y⊥⊥ = Y and Y⊥⊥⊥ = Y⊥, etc.

6.12.3. Figure G
We now state and prove several properties of the orthogonal complement. The proof of the first result is left as an exercise.

6.12.4. Theorem. In an inner product space X, {0}⊥ = X and X⊥ = {0}.

6.12.5. Exercise. Prove Theorem 6.12.4.

6.12.6. Theorem. Let Y be a non-void subset of X. Then Y⊥ is a closed linear subspace of X.

Proof. If x, y ∈ Y⊥, then (x, z) = 0 and (y, z) = 0 for all z ∈ Y. Hence, (αx + βy, z) = α(x, z) + β(y, z) = 0, and thus (αx + βy) ⊥ z for all z ∈ Y, or (αx + βy) ∈ Y⊥. Therefore, Y⊥ is a linear subspace of X.
To show that Y⊥ is closed, assume that x_0 is a point of accumulation of Y⊥. Then there is a sequence {x_n} from Y⊥ such that ||x_n − x_0|| → 0 as n → ∞. By Theorem 6.11.13 we have 0 = (x_n, z) → (x_0, z) as n → ∞ for all z ∈ Y. Therefore x_0 ∈ Y⊥ and Y⊥ is closed.
Before considering the next result we require the following concept.

6.12.7. Definition. Let Y be a non-void subset of X, and let V(Y) be the linear subspace generated by Y (see Definition 3.3.6). Let V̄(Y) denote the closure of V(Y). We call V̄(Y) the closed linear subspace generated by Y.

Note that in view of Theorem 6.2.3, V̄(Y) is indeed a linear subspace of X.
6.12.8. Theorem. Let Y and Z be non-void subsets of X. Then
(i) either Y ∩ Y⊥ = ∅ or Y ∩ Y⊥ = {0};
(ii) Y ⊂ Y⊥⊥;
(iii) if Y ⊂ Z, then Z⊥ ⊂ Y⊥;
(iv) Y⊥ = Y⊥⊥⊥; and
(v) Y⊥⊥ is the smallest closed linear subspace of X which contains Y; i.e., Y⊥⊥ = V̄(Y).

Proof. To prove part (i), assume that Y ∩ Y⊥ ≠ ∅, and let x ∈ Y ∩ Y⊥. Then x ∈ Y and x ∈ Y⊥ and so (x, x) = 0. This implies that x = 0.

The proof of part (ii) is left as an exercise.

To prove part (iii), let y ∈ Z⊥. Then y ⊥ z for all z ∈ Z. Since Z ⊃ Y, it follows that y ⊥ z for all z ∈ Y. Thus, y ∈ Y⊥ whenever y ∈ Z⊥, and Y⊥ ⊃ Z⊥.

To prove part (iv) we note that, by part (ii) of this theorem, Y⊥ ⊂ (Y⊥)⊥⊥. On the other hand, since Y ⊂ Y⊥⊥, by part (iii) of this theorem, Y⊥ ⊃ Y⊥⊥⊥. Thus, Y⊥ = Y⊥⊥⊥.

The proof of part (v) is also left as an exercise.

6.12.9. Exercise. Prove parts (ii) and (v) of Theorem 6.12.8.

In view of part (iv) of the above theorem, we can write Y⊥ = Y⊥⊥⊥ = Y⊥⊥⊥⊥⊥ = ..., and Y⊥⊥ = Y⊥⊥⊥⊥ = Y⊥⊥⊥⊥⊥⊥ = ....
Before giving the classical projection theorem, we state and prove the following preliminary result.

6.12.10. Theorem. Let Y be a linear subspace of X, and let x be an arbitrary vector in X. Let

δ = inf{||y − x|| : y ∈ Y}.

If there exists a y_0 ∈ Y such that ||y_0 − x|| = δ, then y_0 is unique, and moreover y_0 ∈ Y is the unique element in Y such that ||x − y_0|| = δ if and only if (x − y_0) ⊥ Y.

Proof. Let us first show that if ||x − y_0|| = δ, then (x − y_0) ⊥ Y. In doing so we assume to the contrary that there is a y ∈ Y not orthogonal to x − y_0. We also assume, without loss of generality, that y is a unit vector and that (x − y_0, y) = α ≠ 0. Defining a vector z ∈ Y as z = y_0 + αy, we have

||x − z||² = ||x − y_0 − αy||² = (x − y_0 − αy, x − y_0 − αy)
= (x − y_0, x − y_0) − (x − y_0, αy) − (αy, x − y_0) + (αy, αy)
= ||x − y_0||² − |α|² − |α|² + |α|² ||y||²
= ||x − y_0||² − |α|² < ||x − y_0||²;

i.e., ||x − z|| < ||x − y_0||. From this it follows that if x − y_0 is not orthogonal to every y ∈ Y, then ||x − y_0|| ≠ δ. This completes the first part of the proof.

Next, assume that (x − y_0) ⊥ Y. We must show that y_0 is a unique vector such that ||x − y|| > ||x − y_0|| for all y ≠ y_0. For any y ∈ Y we have, in view of part (ii) of Theorem 6.11.16,

||x − y||² = ||x − y_0 + y_0 − y||² = ||x − y_0||² + ||y_0 − y||².

From this it follows that ||x − y|| > ||x − y_0|| for all y ≠ y_0. This completes the proof of the theorem.
In Figure H the meaning of Theorem 6.12.10 is illustrated pictorially for a subset Y of E³.

6.12.11. Figure H

The preceding theorem does not ensure the existence of the vector y_0. However, if we require in Theorem 6.12.10 that Y be a closed linear subspace in a Hilbert space X, then the existence of the unique vector y_0 is guaranteed.
This important result, which we will prove below, is called the classical projection theorem.

6.12.12. Theorem. Let X be a Hilbert space, and let Y be a closed linear subspace of X. Let x be an arbitrary vector in X, and let

δ = inf{||y − x|| : y ∈ Y}.

Then there exists a unique vector y_0 ∈ Y such that ||y_0 − x|| = δ. Moreover, y_0 ∈ Y is the unique vector such that ||x − y_0|| = inf{||y − x|| : y ∈ Y} if and only if the vector (x − y_0) ⊥ Y.

Proof. In view of Theorem 6.12.10 we only have to establish the existence of a vector y_0 ∈ Y such that ||x − y_0|| = δ. Assume that x ∉ Y (if x ∈ Y, then y_0 = x and we are done). Since δ is the infimum of ||y − x|| for all y ∈ Y, there is a sequence {y_n} in Y such that ||y_n − x|| → δ as n → ∞. We now show that {y_n} is a Cauchy sequence. By part (i) of Theorem 6.11.16 we have

||(y_m − x) + (x − y_n)||² + ||(y_m − x) − (x − y_n)||² = 2||y_m − x||² + 2||x − y_n||².

This equation yields, after some straightforward manipulations, the relation

||y_m − y_n||² = 2||y_m − x||² + 2||y_n − x||² − 4||x − ½(y_m + y_n)||².

Since Y is a linear subspace, it follows that for each y_m, y_n ∈ Y we have (y_m + y_n)/2 ∈ Y. Thus, ||x − (y_m + y_n)/2|| ≥ δ and

||y_m − y_n||² ≤ 2||y_m − x||² + 2||y_n − x||² − 4δ².

Also, since ||y_m − x||² → δ² as m → ∞, it follows that ||y_m − y_n||² → 0 as m, n → ∞. Hence, {y_n} is a Cauchy sequence. Since Y is a closed linear subspace of a Hilbert space, it is itself a Hilbert space and as such {y_n} has a limit y_0 ∈ Y. Finally, by the continuity of the norm (see Theorem 6.1.15), it follows that lim_{n→∞} ||y_n − x|| = ||y_0 − x|| = δ. This proves the theorem.
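The content of the classical projection theorem can be made concrete in R³ (a sketch of ours, not the book's): the closest point y_0 in a two-dimensional subspace Y to a given x is found by least squares, and the error x − y_0 is orthogonal to Y:

```python
import numpy as np

# Y = span of the columns of A, a closed (finite-dimensional) subspace of R^3
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
x = np.array([1.0, 2.0, 0.0])

# y0 = A c, where c solves the least-squares problem min ||A c - x||
c, *_ = np.linalg.lstsq(A, x, rcond=None)
y0 = A @ c

# The error vector is orthogonal to Y, as the theorem asserts
err = x - y0
assert np.allclose(A.T @ err, 0.0)

# Any other y in Y gives a strictly larger distance
y_other = A @ (c + np.array([0.1, -0.2]))
assert np.linalg.norm(x - y_other) > np.linalg.norm(err)
```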
The next result is a consequence of the preceding theorem.

6.12.13. Theorem. If Y and Z are closed linear subspaces of a Hilbert space X, if Y ⊂ Z, and if Y ≠ Z, then there exists a non-zero vector v in Z such that v ⊥ Y.

Proof. Let x be any vector in Z which is not in Y (there is one such vector by hypothesis). If we define δ as above, i.e., δ = inf{||y − x|| : y ∈ Y}, then there exists by Theorem 6.12.12 a vector y_0 ∈ Y such that ||x − y_0|| = δ. Now let v = x − y_0. Then v ⊥ Y by Theorem 6.12.12.
386 Chapter 6 I Normed Spaces and Inner Product Spaces
rom part (ii) oI Theorem 6.12.8 we have, in general, y.u. :: . nder
certain conditions euality holds.
6.12.1 . Theorem. et be a linear subspace oI a ilbert space . Then
I y.u..
ProoI rom part (ii) oI Theorem 6.12.8 we have c y.u.. Since y.u. is
closed by Theorem 6.12.6, it Iollows that I c y.u.. or purposes oI con
tradiction, let us now assume that I :1 y.u.. Then Theorem 6.12.13 estab
lishes the e istence oI a vector E y.u. such that :1 0 and such that I.
Thus, . E Il.. Since c I, it Iollows that Z E yl.. ThereIore, we have
Z E yl. n y.u. and Z :1 0, which is a contradiction to part (i) oI Theorem
6.12.8. ence, we must have I y.u..
We note that iI, in particular, is a closed linear subspace oI , then
y.u..
In connection with the next result, recall the definition of the sum of two subsets of X (see Definition 3.2.8).

6.12.15. Theorem. If Y and Z are closed linear subspaces of a Hilbert space X, and if Y ⊥ Z, then Y + Z is a closed linear subspace of X.

Proof. In view of Theorem 3.2.10, Y + Z is a linear subspace of X. To show that Y + Z is closed, it suffices to show that if u is a point of accumulation for Y + Z, then u = y + z for some y ∈ Y and for some z ∈ Z. Let u be a point of accumulation of Y + Z. Then there is a sequence of vectors {u_n} in Y + Z with ||u_n − u|| → 0 as n → ∞. In this sequence we have for each n, u_n = y_n + z_n with y_n ∈ Y and z_n ∈ Z. By the Pythagorean theorem (see Theorem 6.11.16) we have

||u_n − u_m||² = ||y_n − y_m + z_n − z_m||² = ||y_n − y_m||² + ||z_n − z_m||².

But ||u_n − u_m|| → 0 as m, n → ∞, because {u_n}, having a limit, is a Cauchy sequence. Therefore, ||y_n − y_m||² → 0 and ||z_n − z_m||² → 0 as m, n → ∞. But this implies that the sequences {y_n}, {z_n} are also Cauchy sequences. Since Y and Z are closed, these sequences have limits y ∈ Y and z ∈ Z, respectively. Finally, we note that

||u_n − (y + z)|| = ||y_n + z_n − y − z|| ≤ ||y_n − y|| + ||z_n − z|| → 0

as n → ∞. Therefore, since {u_n} cannot approach two distinct limits, we have u = y + z. This completes the proof.
Before proceeding to the next result, we recall from Definition 3.2.13 that a linear space X is the direct sum of two linear subspaces Y and Z if for every x ∈ X there is a unique y ∈ Y and a unique z ∈ Z such that x = y + z. We write, in this case, X = Y ⊕ Z. The following result is known as the projection theorem.

6.12.16. Theorem. If Y is a closed linear subspace of a Hilbert space X, then X = Y ⊕ Y⊥.

Proof. Let Z = Y + Y⊥. By hypothesis, Y is a closed linear subspace and so is Y⊥ in view of Theorem 6.12.6. From the previous result it now follows that Z is also a closed linear subspace. Next, we show that Z = X. Since Y ⊂ Z and Y⊥ ⊂ Z it follows from part (iii) of Theorem 6.12.8 that Z⊥ ⊂ Y⊥ and also that Z⊥ ⊂ Y⊥⊥, so that Z⊥ ⊂ Y⊥ ∩ Y⊥⊥. But from part (i) of Theorem 6.12.8 we have Y⊥ ∩ Y⊥⊥ = {0}. Therefore, the zero vector is the only element in both Y⊥ and Y⊥⊥, and thus Z⊥ = {0}. Since Z is a closed linear subspace we have from Theorems 6.12.4 and 6.12.14,

Z = Z⊥⊥ = (Z⊥)⊥ = {0}⊥ = X.

We have thus shown that we can represent every x ∈ X as the sum x = y + z, where y ∈ Y and z ∈ Y⊥. To show that this representation is unique we consider x = y_1 + z_1 and x = y_2 + z_2, where y_1, y_2 ∈ Y and z_1, z_2 ∈ Y⊥. Then 0 = x − x = y_1 + z_1 − y_2 − z_2, or y_1 − y_2 = z_2 − z_1. Now clearly (y_1 − y_2) ∈ Y and (z_2 − z_1) ∈ Y⊥. Since y_1 − y_2 = z_2 − z_1, we also have (y_1 − y_2) ∈ Y⊥ and (z_2 − z_1) ∈ Y. From this it follows that y_1 − y_2 = z_2 − z_1 = 0; i.e., y_1 = y_2 and z_1 = z_2. Therefore, the representation is unique.

The above theorem allows us to write any vector x of a Hilbert space X as the sum of two vectors y and z; i.e., x = y + z, where y is in a closed linear subspace Y of X and z is in Y⊥. It is this theorem which gave rise to the expression orthogonal complement.

If X is a Hilbert space and if Y is a closed linear subspace of X and if x = y + z, where y ∈ Y and z ∈ Y⊥, then we define the mapping P: X → Y as

Px = y.

We call the function P the projection of X onto Y. Note that P(Px) = P²x = Py = y; i.e., P² = P. We will examine the properties of projections in greater detail in the next chapter. (Refer also to Definition 3.7.1 and Theorem 3.7.4.)
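For a finite-dimensional subspace Y spanned by the columns of a full-rank matrix A, the projection P has the familiar matrix form P = A(AᵀA)⁻¹Aᵀ; this formula is our illustration, not the book's. The sketch below checks that P² = P and that x − Px ⊥ Y:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [1.0, 0.0]])                 # columns span Y, a subspace of R^3
P = A @ np.linalg.inv(A.T @ A) @ A.T       # orthogonal projection onto Y

# P is idempotent: P(Px) = Px for every x, i.e., P @ P = P
assert np.allclose(P @ P, P)
# P is symmetric, as orthogonal projections onto subspaces of R^n are
assert np.allclose(P, P.T)

x = np.array([3.0, -1.0, 2.0])
y, z = P @ x, x - P @ x                    # x = y + z with y in Y, z in Y-perp
assert np.allclose(A.T @ z, 0.0)           # z is orthogonal to Y
```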
6.13. FOURIER SERIES

In the previous section we examined some of the structural properties of Hilbert spaces. Presently, we will concern ourselves with the representation of elements in Hilbert space. We will see that the vectors of a Hilbert space can under certain conditions be represented as a linear combination of a finite or infinite number of vectors from an orthonormal set. In this connection we will touch upon the concept of basis in Hilbert space. The property which makes all this possible is, of course, the inner product.

Much of the material in this section is concerned with an abstract approach to the topic of Fourier series. Since the reader is probably already familiar with certain facets of Fourier analysis, he or she is now in a position to recognize the power and the beauty of the abstract approach.

Throughout this section {X; (·,·)} is a complex inner product space, and convergence of an infinite series is to be understood in the sense of Definition 6.3.1.
We now consider the representation of a vector of a finite-dimensional linear subspace in an inner product space.

6.13.1. Theorem. Let X be an inner product space, let {y_1, ..., y_n} be a finite orthonormal set in X, and let Y be the linear subspace of X generated by {y_1, ..., y_n}. Then the vectors y_1, ..., y_n form a basis for Y and, moreover, in the representation of a vector y ∈ Y by the sum

y = α_1 y_1 + ... + α_n y_n,

the coefficients α_i are specified by

α_i = (y, y_i), i = 1, ..., n.

6.13.2. Exercise. Prove Theorem 6.13.1. (Refer to Theorems 4.9.44 and 4.9.51.)
We now generalize the preceding result.

6.13.3. Theorem. Let X be a Hilbert space and let {x_i} be a countably infinite orthonormal sequence in X. A series Σ_{i=1}^∞ α_i x_i is convergent to an element x ∈ X, i.e.,

x = Σ_{i=1}^∞ α_i x_i,

if and only if Σ_{i=1}^∞ |α_i|² < ∞. In this case we have the relation

α_i = (x, x_i), i = 1, 2, ....

Proof. Assume that Σ_{i=1}^∞ |α_i|² < ∞, and let s_n = Σ_{i=1}^n α_i x_i. If n > m, then

||s_n − s_m||² = ||Σ_{i=m+1}^n α_i x_i||² = Σ_{i=m+1}^n |α_i|² → 0

as n, m → ∞. Therefore, {s_n} is a Cauchy sequence and as such it has a limit, say x, in the Hilbert space X. Thus lim_{n→∞} s_n = x.

Conversely, if {s_n} converges then it is a Cauchy sequence and ||s_n − s_m||² = Σ_{i=m+1}^n |α_i|² → 0 as n, m → ∞. From this it follows that Σ_{i=m+1}^∞ |α_i|² → 0 as m → ∞ and Σ_{i=1}^∞ |α_i|² < ∞.

Now assume that Σ_{i=1}^∞ |α_i|² < ∞, and let x = lim_{n→∞} s_n. We must show that α_i = (x, x_i). From Theorem 6.13.1 we have α_i = (s_n, x_i), i = 1, ..., n. But s_n → x, and hence by the continuity of the inner product we have (s_n, x_i) → (x, x_i) as n → ∞. Therefore, α_i = (x, x_i), which completes the proof.
In the next result we use the concept of closed linear subspace generated by a set (see Definition 6.12.7).

6.13.4. Theorem. Let {x_i} be an orthonormal sequence in a Hilbert space X, and let Y be the closed linear subspace generated by {x_i}. Corresponding to each x ∈ X the series

Σ_{i=1}^∞ (x, x_i) x_i   (6.13.5)

converges to an element x′ ∈ Y. Moreover, (x − x′) ⊥ Y.

6.13.6. Exercise. Prove Theorem 6.13.4. (Hint: Utilize Theorems 6.11.26, 6.13.3, and the continuity of the inner product.)

A more general version of Theorem 6.13.4 can be established by replacing the orthonormal sequence {x_i} by an arbitrary orthonormal set Z.

In view of Theorem 6.13.4 any element x of a Hilbert space X can unambiguously be represented by a series of the form (6.13.5), provided that the closed linear subspace Y generated by the orthonormal sequence {x_i} is equal to the space X. The scalars (x, x_i) in (6.13.5) are called Fourier coefficients of x with respect to the {x_i}.

6.13.7. Definition. Let X be a Hilbert space. An orthonormal set Y in X is said to be complete if there exists no orthonormal set Z of X of which Y is a proper subset.
The next result enables us to characterize complete orthonormal sets.

6.13.8. Theorem. Let X be a Hilbert space, and let Y be an orthonormal set in X. Then the following statements are equivalent:
(i) Y is complete;
(ii) if (x, y) = 0 for all y ∈ Y, then x = 0; and
(iii) V̄(Y) = X.

6.13.9. Exercise. Prove Theorem 6.13.8 for the case where Y is an orthonormal sequence {x_i}.

As a specific example of a complete orthonormal set, we consider the set Y of elements e_1 = (1, 0, ..., 0, ...), e_2 = (0, 1, 0, ..., 0, ...), e_3 = (0, 0, 1, 0, ..., 0, ...), ... in the Hilbert space l_2 (see Example 6.11.9). It is readily verified that {e_i} is an orthonormal set in l_2. Now let x = (ξ_1, ξ_2, ..., ξ_k, ...) ∈ l_2, and corresponding to x let x_k = Σ_{i=1}^k ξ_i e_i. Then ||x − x_k||² = Σ_{i=k+1}^∞ |ξ_i|², and thus lim_{k→∞} ||x − x_k|| = 0. Hence, V̄(Y) = l_2 and Y is complete by the preceding theorem.

Many of the subsequent results involving countable orthonormal sets may be shown to hold for uncountable orthonormal sets as well (refer to Definition 1.2.48). The proofs of these generalized results usually require a postulate known as Zorn's lemma. (Consult the references cited at the end of this chapter for a discussion of this lemma.) Although the proofs of such generalized results are not particularly difficult, they do involve an added level of abstraction which we do not wish to pursue in this book. In connection with generalized results of this type, it is also necessary to use the notion of cardinal number of a set, introduced at the end of Section 1.2.
The next result is known as Parseval's formula (refer also to Corollary 4.9.49).

6.13.10. Theorem. Let X be a Hilbert space and let the sequence {x_i} be orthonormal in X. Then

||x||² = Σ_{i=1}^∞ |(x, x_i)|²   (6.13.11)

for every x ∈ X if and only if the sequence {x_i} is complete.

Proof. Assume to the contrary that the sequence {x_i} is not complete. Then there exists some x ≠ 0 such that (x, x_i) = 0 for all i. Thus, there exists an x ∈ X such that ||x||² ≠ Σ_{i=1}^∞ |(x, x_i)|². This proves the first part.

Now assume that the sequence {x_i} is complete. In view of Theorems 6.13.4 and 6.13.8 we have

x = Σ_{i=1}^∞ (x, x_i) x_i = Σ_{i=1}^∞ α_i x_i.

Since {x_i} is orthonormal we obtain

||x||² = (Σ_{i=1}^∞ α_i x_i, Σ_{j=1}^∞ α_j x_j) = Σ_{i=1}^∞ Σ_{j=1}^∞ α_i ᾱ_j (x_i, x_j) = Σ_{i=1}^∞ |α_i|².

This completes the proof.
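Parseval's formula is easy to confirm in C^n, where any orthonormal basis is complete. The sketch below (ours, not the book's) builds an orthonormal basis from the columns of a unitary matrix and compares Σ|(x, x_i)|² with ||x||²:

```python
import numpy as np

rng = np.random.default_rng(4)

# Columns of the unitary factor Q of a QR decomposition form a
# complete orthonormal set in C^4
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
Q, _ = np.linalg.qr(M)

x = rng.normal(size=4) + 1j * rng.normal(size=4)

# Fourier coefficients: (x, x_i) = sum_k x_k conj(Q[k, i])
coeffs = Q.conj().T @ x

# Parseval's formula (6.13.11): ||x||^2 = sum_i |(x, x_i)|^2
assert abs(np.sum(np.abs(coeffs) ** 2) - np.vdot(x, x).real) < 1e-10
```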
A more general version of Theorem 6.13.10 can be established by replacing the orthonormal sequence by an orthonormal set.

The next result, known as the Gram-Schmidt procedure, allows us to construct orthonormal sets in inner product spaces (compare with Theorem 4.9.55).
6.13.12. Theorem. Let X be an inner product space. Let {x_i} be a finite or a countably infinite sequence of linearly independent vectors. Then there exists an orthonormal sequence {y_i} having the same cardinal number as the sequence {x_i} and generating the same linear subspace as {x_i}.

Proof. Since x_1 ≠ 0, let us define y_1 as

y_1 = x_1 / ||x_1||.

It is clear that y_1 and x_1 generate the same linear subspace. Next, let

z_2 = x_2 − (x_2, y_1) y_1.

Since

(z_2, y_1) = (x_2 − (x_2, y_1) y_1, y_1) = (x_2, y_1) − (x_2, y_1)(y_1, y_1) = (x_2, y_1) − (x_2, y_1) = 0,

it follows that z_2 ⊥ y_1. We now let y_2 = z_2 / ||z_2||. Note that z_2 ≠ 0, because x_2 and y_1 are linearly independent. Also, y_1 and y_2 generate the same linear subspace as x_1 and x_2, because x_2 is a linear combination of y_1 and y_2.

Proceeding in the fashion described above, we define z_2, z_3, ... and y_2, y_3, ... recursively as

z_n = x_n − Σ_{i=1}^{n−1} (x_n, y_i) y_i

and

y_n = z_n / ||z_n||.

As before, we can readily verify that z_n ⊥ y_i for all i < n, that z_n ≠ 0, and that the y_i, i = 1, ..., n, generate the same linear subspace as the x_i, i = 1, ..., n. If the set {x_i} is finite, the process terminates. Otherwise it is continued indefinitely by induction.

The sequence {y_i} thus constructed can be put into a one-to-one correspondence with the sequence {x_i}. Therefore, these sequences have the same cardinal number.
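The recursion in the proof translates directly into code. The following sketch (ours, with hypothetical function names) orthonormalizes a list of linearly independent complex vectors exactly as in the proof:

```python
import numpy as np

def inner(u, v):
    # (u, v) = sum_k u_k conj(v_k)
    return np.sum(u * np.conj(v))

def gram_schmidt(xs):
    """Return the orthonormal sequence {y_i} of Theorem 6.13.12."""
    ys = []
    for x in xs:
        # z_n = x_n - sum over i < n of (x_n, y_i) y_i
        z = x - sum(inner(x, y) * y for y in ys)
        ys.append(z / np.sqrt(inner(z, z).real))   # y_n = z_n / ||z_n||
    return ys

xs = [np.array([1.0, 1.0, 0.0], dtype=complex),
      np.array([1.0, 0.0, 1.0], dtype=complex),
      np.array([0.0, 1.0, 1.0], dtype=complex)]
ys = gram_schmidt(xs)

# The result is orthonormal: (y_i, y_j) = 1 if i = j, else 0
for i in range(3):
    for j in range(3):
        expected = 1.0 if i == j else 0.0
        assert abs(inner(ys[i], ys[j]) - expected) < 1e-12
```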
The following result can be established by use of Zorn's lemma.

6.13.13. Theorem. Let X be an inner product space containing a non-zero element. Then X contains a complete orthonormal set. If Y is any orthonormal set in X, then there is a complete orthonormal set containing Y as a subset.

Indeed, it is also possible to prove the following result: if in an inner product space Y and Y_1 are two complete orthonormal sets, then Y and Y_1 have the same cardinal number, so that a one-to-one mapping of set Y onto set Y_1 can be established. This result, along with Theorem 6.13.13, allows us to conclude that with each Hilbert space X there is associated in a natural way a cardinal number. This, in turn, enables us to consider this cardinal number as the dimension of the Hilbert space X. For the case of finite-dimensional spaces this concept and the usual definition of dimension coincide. However, in general, these two notions are not to be viewed as one and the same concept.

Next, recall that in Chapter 5 we defined a metric space X to be separable if there is a countable subset everywhere dense in X (see Definition 5.4.33). Since normed linear spaces and inner product spaces are also metric spaces, we speak also of separable Banach spaces and separable Hilbert spaces. In the case of Hilbert spaces, we can characterize separability in the following equivalent way.

6.13.14. Theorem. A Hilbert space X is separable if and only if it contains a complete orthonormal sequence.

6.13.15. Exercise. Prove Theorem 6.13.14.
Since in a separable Hilbert space X with a complete orthonormal sequence {x_i} one can represent every x ∈ X as

x = Σ_{i=1}^∞ (x, x_i) x_i,

we refer to a complete orthonormal sequence {x_i} in a separable Hilbert space as a basis for X. Caution should be taken here not to confuse this concept with the definition of basis introduced in Chapter 3 (see Definitions 3.3.6 and 3.3.22). In that case we defined each x in a vector space to have a representation as a finite linear combination of vectors {x_i}. Indeed, the concept of Hamel basis (see Definition 3.3.22), which is a purely algebraic concept, is of very little value in spaces which are not finite dimensional. In such spaces, the orthonormal basis as defined above is much more useful.

We conclude this section with the following result.

6.13.16. Theorem. Let Y be an orthonormal set in a separable Hilbert space X. Then Y is either a finite set or a countably infinite set.

6.13.17. Exercise. Prove Theorem 6.13.16.
6.14. THE RIESZ REPRESENTATION THEOREM

In this section we state and prove an important result known as the Riesz representation theorem. A direct consequence of this theorem is that the dual space X* of a Hilbert space X is itself a Hilbert space. Throughout this section, {X; (·,·)} is a Hilbert space.

We begin by first noting that for a fixed y ∈ X,

f(x) = (x, y)   (6.14.1)

is a linear functional in x. By means of (6.14.1) distinct vectors y ∈ X are associated with distinct functionals. From the Schwarz inequality we have

|(x, y)| ≤ ||x|| ||y||.

Hence, ||f|| ≤ ||y|| and f is bounded (i.e., f ∈ X*). From this it follows that if X is a Hilbert space, then bounded linear functionals are determined by the elements of X itself. In the next theorem we show that every element y of X determines a unique bounded linear functional f (i.e., a unique element of X*) of the form (6.14.1) and that ||f|| = ||y||. From this we conclude that the dual space X* of the Hilbert space X is itself a Hilbert space. (Compare the following with Theorem 4.9.63.)
6.14.2. Theorem. (Riesz) Let f be a bounded linear functional on X. Then there is a unique y ∈ X such that f(x) = (x, y) for all x ∈ X. Moreover, ||f|| = ||y||, and every y determines a unique element of the dual space X* in this way.

Proof. For fixed y ∈ X, define the linear functional f on X by Eq. (6.14.1). From the Schwarz inequality we have |f(x)| = |(x, y)| ≤ ||y|| ||x||, so that f is a bounded linear functional and ||f|| ≤ ||y||. Letting x = y we have |f(y)| = |(y, y)| = ||y|| ||y||, from which it follows that ||f|| = ||y||.

Next, let f be a bounded linear functional defined on the Hilbert space X. Let Z be the set of all vectors z ∈ X such that f(z) = 0. By Theorem 3.4.19, Z is a linear subspace of X. Now let {z_n} be a sequence of vectors in Z, and let z_0 ∈ X be a point of accumulation of {z_n}. In view of the continuity of f we now have 0 = f(z_n) → f(z_0) as n → ∞. Thus, z_0 ∈ Z and Z is closed.

If Z = X, then for all x ∈ X we have f(x) = 0, and the equality f(x) = (x, y) = 0 for all x ∈ X holds if and only if y = 0.

Now consider the case Z ⊂ X, X ≠ Z. From above, Z is a closed linear subspace of X. We can therefore utilize Theorem 6.12.16 to represent X by the direct sum

X = Z ⊕ Z⊥.

Since Z ⊂ X and Z ≠ X, there exists in view of Theorem 6.12.13 a non-zero vector u ∈ X such that u ⊥ Z; i.e., u ∈ Z⊥. Also, since u ≠ 0 and since u ∈ Z⊥, it follows from part (i) of Theorem 6.12.8 that u ∉ Z, and hence f(u) ≠ 0. Since Z⊥ is a linear subspace of X, we may assume without loss of generality that f(u) = 1. We now show that u is a scalar multiple of our desired vector y in Eq. (6.14.1).

For any fixed x ∈ X we can write

f(x − f(x)u) = f(x) − f(x)f(u) = f(x) − f(x) = 0,

and thus (x − f(x)u) ∈ Z. From before, we have u ⊥ Z and hence (x − f(x)u, u) = 0, or (x, u) = f(x)||u||², or f(x) = (x, u/||u||²). Letting y = u/||u||² now yields the desired form

f(x) = (x, y).

To show that the vector y is unique we assume that f(x) = (x, y′) and f(x) = (x, y″) for all x ∈ X. Then (x, y′) − (x, y″) = 0, or (x, y′ − y″) = 0, or (y′ − y″, x) = 0 for all x ∈ X. It now follows from Theorem 6.11.28 that y′ = y″. This completes the proof of the theorem.
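In C^n the theorem is transparent: a linear functional f(x) = Σ c_i ξ_i is represented by the vector y with components y_i = c̄_i, since then f(x) = (x, y) = Σ ξ_i ȳ_i. A small sketch of ours:

```python
import numpy as np

rng = np.random.default_rng(5)

c = rng.normal(size=3) + 1j * rng.normal(size=3)

def f(x):
    # a bounded linear functional on C^3
    return np.sum(c * x)

# The representer guaranteed by the Riesz theorem: y_i = conj(c_i),
# so that f(x) = (x, y) = sum_i x_i conj(y_i)
y = np.conj(c)

for _ in range(10):
    x = rng.normal(size=3) + 1j * rng.normal(size=3)
    assert abs(f(x) - np.sum(x * np.conj(y))) < 1e-12

# ||f|| = ||y||: the bound |f(x)| <= ||y|| ||x|| is attained at x = y
assert abs(abs(f(y)) - np.linalg.norm(y) ** 2) < 1e-10
```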
6.14.3. Exercise. Show that every Hilbert space is reflexive (refer to Definition 6.9.8).
6.14.4. Exercise. Two normed linear spaces over the same field are said to be congruent if they are isomorphic (see Definition 3.4.76) and isometric (see Definition 5.9.16). Let X be a Hilbert space. Show that X is congruent to X*.
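The finite-dimensional case of this representation is easy to experiment with. The following sketch (our own illustration, not from the text) takes the functional f(x) = 2ξ₁ − ξ₂ + 3ξ₃ on R³ with the usual inner product, whose representer is y = (2, −1, 3)ᵀ, and estimates ||f|| = sup |f(x)|/||x|| by random sampling; the supremum equals ||y||, as the theorem asserts.

```python
import numpy as np

# Representer of the functional f(x) = 2*x1 - x2 + 3*x3 on R^3: f(x) = (x, y)
y = np.array([2.0, -1.0, 3.0])

# Estimate ||f|| = sup |f(x)| / ||x|| over many random directions
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 3))
ratios = np.abs(X @ y) / np.linalg.norm(X, axis=1)

print(ratios.max(), np.linalg.norm(y))  # ratios.max() approaches ||y|| from below
```

The sampled ratios never exceed ||y||, and the maximum over many directions approaches it, since the supremum is attained at x = y.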
6.15. SOME APPLICATIONS

We now consider two applications of some of the material of the present chapter. This section consists of three parts. In the first of these we consider the problem of approximating elements in a Hilbert space by elements in a finite-dimensional subspace. In the second part we briefly consider random variables, while in the third part we concern ourselves with the estimation of random variables.

A. Approximation of Elements in Hilbert Space (Normal Equations)
In many applications it is necessary to approximate functions by simpler ones. This problem can often be implemented by approximating elements from an appropriate Hilbert space by elements belonging to a suitable linear subspace. In other words, we need to consider the problem of approximating a vector x in a Hilbert space X by a vector x₀ in a linear subspace Y of X.
Let yᵢ ∈ X for i = 1, …, n, and let Y = V({y₁, …, yₙ}) denote the linear subspace of X generated by y₁, …, yₙ. Since Y is finite dimensional, it is closed. Now for any fixed x ∈ X we wish to find that element of Y which minimizes ||x − y|| for all y ∈ Y. If x₀ ∈ Y is that element, then we say that x₀ approximates x. We call (x − x₀) the error vector and ||x − x₀|| the error.
Since any vector in Y can be expressed as a linear combination α₁y₁ + … + αₙyₙ, our problem is reduced to finding the set of αᵢ, i = 1, …, n, for which the error ||x − α₁y₁ − … − αₙyₙ|| is minimized. But in view of the classical projection theorem (Theorem 6.12.12), the x₀ ∈ Y which minimizes the error is unique and, moreover, (x − x₀) ⊥ yᵢ, i = 1, …, n. From this we obtain the n simultaneous linear equations

    Gᵀ(y₁, …, yₙ) (α₁, …, αₙ)ᵀ = ((x, y₁), …, (x, yₙ))ᵀ,    (6.15.1)

where Gᵀ(y₁, …, yₙ) is the transpose of the matrix

    G(y₁, …, yₙ) =
        [ (y₁, y₁)  (y₁, y₂)  …  (y₁, yₙ) ]
        [ (y₂, y₁)  (y₂, y₂)  …  (y₂, yₙ) ]
        [     ⋮                      ⋮    ]
        [ (yₙ, y₁)  (yₙ, y₂)  …  (yₙ, yₙ) ].    (6.15.2)

The matrix (6.15.2) is called the Gram matrix of y₁, …, yₙ. The determinant of (6.15.2) is called the Gram determinant and is denoted by Δ(y₁, …, yₙ). The equations (6.15.1) are called the normal equations. It is clear that in a real Hilbert space G(y₁, …, yₙ) = Gᵀ(y₁, …, yₙ), and that in a complex Hilbert space G(y₁, …, yₙ) = G̅ᵀ(y₁, …, yₙ).
In order to approximate x ∈ X by x₀ ∈ Y we only need to solve Eq. (6.15.1) for the αᵢ, i = 1, …, n. The next result gives conditions under which Eq. (6.15.1) possesses a unique solution for the αᵢ.
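As a concrete sketch (our own illustration, not from the text), the normal equations (6.15.1) can be solved numerically. Take X to be the real Hilbert space L₂[0, 1] with (f, g) = ∫₀¹ f(t)g(t) dt, the subspace generated by y₁(t) = 1, y₂(t) = t, y₃(t) = t², and x(t) = t³; the inner products then have the closed forms (yᵢ, yⱼ) = 1/(i + j − 1) and (x, yᵢ) = 1/(i + 3).

```python
import numpy as np

# Gram matrix G with entries (y_i, y_j) = int_0^1 t^i t^j dt = 1/(i+j+1), 0-based i, j
n = 3
G = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
# Right-hand side (x, y_i) = int_0^1 t^3 t^i dt = 1/(i+4), i.e. 1/4, 1/5, 1/6
b = np.array([1.0 / (i + 4) for i in range(n)])

# Solve the normal equations G^T a = b (G is symmetric here, so G^T = G)
a = np.linalg.solve(G.T, b)

# Orthogonality check: the error vector (x - x0) must be orthogonal to each y_i,
# i.e. (x, y_i) - sum_j a_j (y_j, y_i) = 0 for every i
residual = b - G @ a
print(a, np.abs(residual).max())
```

The solution is the best quadratic approximation of t³ in the mean-square sense, a = (1/20, −3/5, 3/2).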
6.15.3. Theorem. A set of elements {y₁, …, yₙ} of a Hilbert space X is linearly independent if and only if the Gram determinant Δ(y₁, …, yₙ) ≠ 0.
Proof. We prove this result by proving the equivalent statement: Δ(y₁, …, yₙ) = 0 if and only if the vectors y₁, …, yₙ are linearly dependent.
Assume that {y₁, …, yₙ} is a set of linearly dependent vectors in X. Then there exists a set of scalars α₁, …, αₙ, not all zero, such that
    α₁y₁ + … + αₙyₙ = 0.    (6.15.4)
Taking the inner product of Eq. (6.15.4) with the vectors y₁, …, yₙ yields the n linear equations
    α₁(y₁, y₁) + … + αₙ(yₙ, y₁) = 0
        ⋮
    α₁(y₁, yₙ) + … + αₙ(yₙ, yₙ) = 0.    (6.15.5)
Taking the α₁, …, αₙ as unknowns, we see that for a nontrivial solution (α₁, …, αₙ) to exist we must have Δ(y₁, …, yₙ) = 0.
Conversely, assume that Δ(y₁, …, yₙ) = 0. Then a nontrivial solution (α₁, …, αₙ) exists for Eq. (6.15.5). After rewriting Eq. (6.15.5) as
    (α₁y₁ + … + αₙyₙ, yᵢ) = 0,  i = 1, …, n,
we obtain
    ||∑ᵢ αᵢyᵢ||² = 0,
which implies that ∑ᵢ αᵢyᵢ = 0. Therefore, the set {y₁, …, yₙ} is linearly dependent. This completes the proof.
The next result establishes an expression for the error ||x − x₀||. The proof of this result follows directly from the classical projection theorem.
6.15.6. Theorem. Let X be a Hilbert space, let x ∈ X, let {y₁, …, yₙ} be a set of linearly independent vectors in X, let Y be the linear subspace of X generated by y₁, …, yₙ, and let x₀ ∈ Y be such that
    ||x − x₀|| = min_{y ∈ Y} ||x − y|| = min ||x − α₁y₁ − … − αₙyₙ||.
Then
    ||x − x₀||² = Δ(y₁, …, yₙ, x)/Δ(y₁, …, yₙ),
where Δ(y₁, …, yₙ, x) is the Gram determinant of the vectors y₁, …, yₙ, x; i.e.,
    Δ(y₁, …, yₙ, x) = det
        [ (y₁, y₁)  …  (yₙ, y₁)  (x, y₁) ]
        [     ⋮            ⋮        ⋮    ]
        [ (y₁, yₙ)  …  (yₙ, yₙ)  (x, yₙ) ]
        [ (y₁, x)   …  (yₙ, x)   (x, x)  ].
6.15.7. Exercise. Prove Theorem 6.15.6.
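As a numerical check of Theorem 6.15.6 (our own illustration): in L₂[0, 1] with (f, g) = ∫₀¹ f(t)g(t) dt, take y₁ = 1, y₂ = t, y₃ = t², and x = t³. All the inner products involved are entries of a 4 × 4 Hilbert matrix, so both sides of the error formula can be computed directly.

```python
import numpy as np

# Gram matrix of {y1, y2, y3, x} = {1, t, t^2, t^3} on L2[0,1]: entries 1/(i+j+1)
G4 = np.array([[1.0 / (i + j + 1) for j in range(4)] for i in range(4)])
G3 = G4[:3, :3]                        # Gram matrix of {y1, y2, y3} alone
err_sq_gram = np.linalg.det(G4) / np.linalg.det(G3)   # Theorem 6.15.6

# Direct computation: ||x - x0||^2 = (x, x) - a^T b for the optimal coefficients a
b = G4[:3, 3]                          # (x, y_i) = 1/(i+4)
a = np.linalg.solve(G3, b)             # normal equations
err_sq_direct = G4[3, 3] - a @ b       # (x, x) = 1/7
print(err_sq_gram, err_sq_direct)
```

Both computations give ||x − x₀||² = 1/2800 for this choice of x and subspace.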
B. Random Variables

A rigorous development of the theory of probability is based on measure and integration theory. Since knowledge of this theory by the reader has not been assumed, a brief discussion of some essential concepts will now be given.
We begin by introducing some terminology. If Ω is a nonvoid set, a family of subsets, ℱ, of Ω is called a σ-algebra (or a σ-field) if (i) for all E₁, E₂ ∈ ℱ we have E₁ ∪ E₂ ∈ ℱ and E₁ − E₂ ∈ ℱ, (ii) for any countable sequence of sets {Eₙ} in ℱ we have ⋃ₙ₌₁^∞ Eₙ ∈ ℱ, and (iii) Ω ∈ ℱ. It readily follows that a σ-algebra is a family of subsets of Ω which is closed under all countable set operations.
A function P: ℱ → R, where ℱ is a σ-algebra, is called a probability measure if (i) 0 ≤ P(E) ≤ 1 for all E ∈ ℱ, (ii) P(∅) = 0 and P(Ω) = 1, and (iii) for any countable collection of sets {Eₙ} in ℱ such that Eᵢ ∩ Eⱼ = ∅ if i ≠ j, we have P(⋃ₙ₌₁^∞ Eₙ) = ∑ₙ₌₁^∞ P(Eₙ).
A probability space is a triple {Ω, ℱ, P}, where Ω is a nonvoid set, ℱ is a σ-algebra of subsets of Ω, and P is a probability measure on ℱ. We call elements ω ∈ Ω outcomes (usually thought of as occurring at random), and we call elements E ∈ ℱ events.
A function X: Ω → R is called a random variable if {ω: X(ω) ≤ x} ∈ ℱ for all x ∈ R. The set {ω: X(ω) ≤ x} is usually written in the shorter form {X ≤ x}. If X is a random variable, then the function F_X: R → R defined by F_X(x) = P{X ≤ x} for x ∈ R is called the distribution function of X. If Xᵢ, i = 1, …, n, are random variables, we define the random vector X as X = (X₁, …, Xₙ)ᵀ. Also, for (x₁, …, xₙ) ∈ Rⁿ, the event {X₁ ≤ x₁, …, Xₙ ≤ xₙ} is defined to be {ω: X₁(ω) ≤ x₁} ∩ {ω: X₂(ω) ≤ x₂} ∩ … ∩ {ω: Xₙ(ω) ≤ xₙ}. Furthermore, for a random vector X, the function F_X: Rⁿ → R, defined by F_X(x₁, …, xₙ) = P{X₁ ≤ x₁, …, Xₙ ≤ xₙ}, is called the distribution function of X.
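For a finite Ω these definitions can be checked mechanically. The sketch below (our own illustration) takes ℱ to be the power set of a three-point set, which is a σ-algebra, and verifies the probability-measure axioms for the uniform measure P(E) = |E|/|Ω|.

```python
from fractions import Fraction
from itertools import chain, combinations

Omega = frozenset({1, 2, 3})
# The power set of Omega is a sigma-algebra: it is closed under all set operations
F = [frozenset(s) for s in chain.from_iterable(
        combinations(sorted(Omega), r) for r in range(len(Omega) + 1))]

def P(E):
    # Uniform probability measure: P(E) = |E| / |Omega| (exact rational arithmetic)
    return Fraction(len(E), len(Omega))

ok_range = all(0 <= P(E) <= 1 for E in F)                # 0 <= P(E) <= 1
ok_ends = P(frozenset()) == 0 and P(Omega) == 1          # P(empty) = 0, P(Omega) = 1
ok_additive = all(P(E1 | E2) == P(E1) + P(E2)            # additivity on disjoint events
                  for E1 in F for E2 in F if not (E1 & E2))
print(ok_range, ok_ends, ok_additive)
```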
If X is a random variable and g is a function, g: R → R, such that the Stieltjes integral ∫₋∞^∞ g(x) dF_X(x) exists, then the expected value of g(X) is defined to be E[g(X)] = ∫₋∞^∞ g(x) dF_X(x). Similarly, if X is a random vector and if g is a function, g: Rⁿ → R, such that ∫_{Rⁿ} g(x) dF_X(x) exists, then the expected value of g(X) is defined to be E[g(X)] = ∫_{Rⁿ} g(x) dF_X(x). Some of the expected values of primary interest are E(X), the expected value of X; E(X²), the second moment of X; and E[{X − E(X)}²], the variance of X.
If we let ℒ₂ denote the family of random variables defined on a probability space {Ω, ℱ, P} such that E(X²) < ∞, then this space is a vector space over R with the usual definition of addition and multiplication by a scalar. We say two random variables, X₁ and X₂, are equal almost surely if P{ω: X₁(ω) ≠ X₂(ω)} = 0. If we let L₂ denote the family of equivalence classes of all random variables which are almost surely equal (as in Example 5.5.31), then L₂(Ω, ℱ, P) is a real Hilbert space where the inner product is defined by (X, Y) = E(XY) for X, Y ∈ L₂.
Throughout the remainder of this section, we let {Ω, ℱ, P} denote our underlying probability space, and we assume that all random variables belong to the Hilbert space L₂ with inner product (X, Y) = E(XY).
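The inner product (X, Y) = E(XY) can be approximated by sample averages. The sketch below (the random variables chosen are our own illustration) checks the Schwarz inequality |(X, Y)| ≤ ||X|| ||Y|| in this Hilbert space, where ||X|| = [E(X²)]^{1/2}.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
# Two square-integrable random variables on a common probability space
X = rng.normal(0.0, 1.0, n)
Y = 0.5 * X + rng.normal(0.0, 1.0, n)   # correlated with X; here E(XY) = 0.5

inner = np.mean(X * Y)                  # sample estimate of (X, Y) = E(XY)
norm_X = np.sqrt(np.mean(X * X))        # ||X|| = sqrt(E(X^2))
norm_Y = np.sqrt(np.mean(Y * Y))
print(inner, norm_X * norm_Y)           # Schwarz: |(X, Y)| <= ||X|| ||Y||
```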
C. Estimation of Random Variables

The special class of estimation problems which we consider may be formulated as follows: given a set of random variables {Y₁, …, Y_m}, find the best estimate of another random variable, X. The sense in which an estimate is "best" will be defined shortly. Here we view the set {Y₁, …, Y_m} to be observations and the random variable X as the unknown.
For any mapping f: Rᵐ → R such that f(Y₁, …, Y_m) ∈ L₂ for all observations {Y₁, …, Y_m}, we call X̂ = f(Y₁, …, Y_m) an estimate of X. If f is linear, we call X̂ a linear estimate.
Next, let f be linear; i.e., let f be a linear functional on Rᵐ. Then there is a vector aᵀ = (α₁, …, α_m) ∈ Rᵐ such that f(y) = aᵀy for all yᵀ = (y₁, …, y_m) ∈ Rᵐ. Now a linear estimate, X̂ = α₁Y₁ + … + α_mY_m, is called the best linear estimate of X, given {Y₁, …, Y_m}, if E[{X − α₁Y₁ − … − α_mY_m}²] is minimum with respect to a ∈ Rᵐ.
The classical projection theorem (see Theorem 6.12.12) tells us that the best linear estimate of X is the projection of X onto the linear vector space V({Y₁, …, Y_m}). Furthermore, Eq. (6.15.1) gives us the explicit form for the αᵢ, i = 1, …, m. We are now in a position to summarize the above discussion in the following theorem, which is usually called the orthogonality principle.
6.15.8. Theorem. Let X, Y₁, …, Y_m belong to L₂. Then X̂ = α₁Y₁ + … + α_mY_m is the best linear estimate of X if and only if α₁, …, α_m are such that E[{X − X̂}Yᵢ] = 0 for i = 1, …, m.
We also have the following result.
6.15.9. Corollary. Let X, Y₁, …, Y_m belong to L₂. Let G = [gᵢⱼ], where gᵢⱼ = E[YᵢYⱼ], i, j = 1, …, m, and let bᵀ = (β₁, …, β_m) ∈ Rᵐ, where βᵢ = E[XYᵢ] for i = 1, …, m. If G is nonsingular, then X̂ = α₁Y₁ + … + α_mY_m is the best linear estimate of X if and only if aᵀ = bᵀG⁻¹.
6.15.10. Exercise. Prove Theorem 6.15.8 and Corollary 6.15.9.
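Corollary 6.15.9 becomes directly computable when expectations are replaced by sample moments. In the sketch below (the variables and coefficients are our own choices, not from the text), aᵀ = bᵀG⁻¹ is formed from simulated data, and the orthogonality principle of Theorem 6.15.8 is then verified for the resulting estimate.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
# Observations Y1, Y2 and unknown X, all zero-mean, square-integrable
Y = rng.normal(size=(2, n))
X = 1.0 * Y[0] - 2.0 * Y[1] + rng.normal(size=n)   # X is partly explained by Y

G = (Y @ Y.T) / n                 # G_ij ~ E[Y_i Y_j]
b = (Y @ X) / n                   # b_i  ~ E[X Y_i]
a = b @ np.linalg.inv(G)          # a^T = b^T G^{-1}  (Corollary 6.15.9)

X_hat = a @ Y                     # best linear estimate alpha_1 Y_1 + alpha_2 Y_2
orth = (Y @ (X - X_hat)) / n      # E[{X - X_hat} Y_i], which should vanish
print(a, orth)
```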
Let us now consider a specific case.
6.15.11. Example. Let X, V₁, …, V_m be random variables in L₂ such that E[X²] = σ², E[XVᵢ] = 0 for i = 1, …, m, and let R = [ρᵢⱼ] be nonsingular, where ρᵢⱼ = E[VᵢVⱼ] for i, j = 1, …, m. Suppose that the measurements Y₁, …, Y_m of X are given by Yᵢ = X + Vᵢ for i = 1, …, m. Then we have E[YᵢYⱼ] = E[(X + Vᵢ)(X + Vⱼ)] = σ² + ρᵢⱼ for i, j = 1, …, m. Also, E[XYᵢ] = E[X(X + Vᵢ)] = σ² for i = 1, …, m. Thus, G = [gᵢⱼ], where gᵢⱼ = σ² + ρᵢⱼ for i, j = 1, …, m; bᵀ = (β₁, …, β_m), where βᵢ = σ² for i = 1, …, m; and aᵀ = bᵀG⁻¹.
6.15.12. Exercise. In the preceding example, show that if ρᵢⱼ = σᵥ²δᵢⱼ for i, j = 1, …, m, where δᵢⱼ is the Kronecker delta, then
    αᵢ = σ²/(mσ² + σᵥ²) for i = 1, …, m.
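The closed form claimed in Exercise 6.15.12 can be checked numerically (a sketch with arbitrary illustrative values of m, σ², and σᵥ²): with ρᵢⱼ = σᵥ²δᵢⱼ, the Gram matrix is G = σ²·11ᵀ + σᵥ²I, and solving Ga = b indeed returns αᵢ = σ²/(mσ² + σᵥ²) in every component.

```python
import numpy as np

m, sigma2, sigv2 = 5, 2.0, 0.5                     # m measurements, E[X^2], noise variance
G = sigma2 * np.ones((m, m)) + sigv2 * np.eye(m)   # g_ij = sigma^2 + sigma_v^2 delta_ij
b = sigma2 * np.ones(m)                            # beta_i = sigma^2
a = np.linalg.solve(G, b)                          # a = G^{-1} b  (G symmetric)
alpha_closed = sigma2 / (m * sigma2 + sigv2)       # formula of Exercise 6.15.12
print(a, alpha_closed)
```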
The next result provides us with a useful means for finding the best linear estimate of a random variable X, given a set of random variables Y₁, …, Y_k, if we already have the best linear estimate, given Y₁, …, Y_{k−1}.
6.15.13. Theorem. Let k ≥ 2, and let X, Y₁, …, Y_k be random variables in L₂. Let 𝒴ⱼ = V({Y₁, …, Yⱼ}), the linear vector space generated by the random variables Y₁, …, Yⱼ, for 1 ≤ j ≤ k. Let Ŷ_k(k−1) denote the best linear estimate of Y_k, given {Y₁, …, Y_{k−1}}, and let Ỹ_k(k−1) = Y_k − Ŷ_k(k−1). Then 𝒴_k = 𝒴_{k−1} ⊕ V({Ỹ_k(k−1)}).
Proof. By the classical projection theorem (see Theorem 6.12.12), Ỹ_k(k−1) ⊥ 𝒴_{k−1}. Now for arbitrary Z ∈ 𝒴_k, we must have Z = c₁Y₁ + … + c_{k−1}Y_{k−1} + c_kY_k for some (c₁, …, c_k). We can rewrite this as Z = Z₁ + Z₂, where Z₁ = c₁Y₁ + … + c_{k−1}Y_{k−1} + c_kŶ_k(k−1) and Z₂ = c_kỸ_k(k−1). Since Z₁ ∈ 𝒴_{k−1} and Z₂ ⊥ 𝒴_{k−1}, it follows from Theorem 6.12.12 that Z₁ and Z₂ are unique. Since Z₁ ∈ 𝒴_{k−1} and Z₂ ∈ V({Ỹ_k(k−1)}), the theorem is proved.
We can extend the problem of estimation of (scalar) random variables to random vectors. Let X₁, …, Xₙ be random variables in L₂, and let X = (X₁, …, Xₙ)ᵀ be a random vector. Let Y₁, …, Y_m be random variables in L₂. We call X̂ = (X̂₁, …, X̂ₙ)ᵀ the best linear estimate of X, given {Y₁, …, Y_m}, if X̂ᵢ is the best linear estimate of Xᵢ, given {Y₁, …, Y_m}, for i = 1, …, n. Clearly, the orthogonality principle must hold for each X̂ᵢ; i.e., we must have E[{Xᵢ − X̂ᵢ}Yⱼ] = 0 for i = 1, …, n and j = 1, …, m. In this case X̂ can be expressed as X̂ = AY, where A is an (n × m) matrix of real numbers and Y = (Y₁, …, Y_m)ᵀ. Corollary 6.15.9 assumes now the following matrix form.
6.15.14. Theorem. Let X₁, …, Xₙ, Y₁, …, Y_m be random variables in L₂. Let G = [gᵢⱼ], where gᵢⱼ = E[YᵢYⱼ] for i, j = 1, …, m, and let B = [βᵢⱼ], where βᵢⱼ = E[XᵢYⱼ] for i = 1, …, n. If G is nonsingular, then X̂ = AY is the best linear estimate of X, given Y, if and only if A = BG⁻¹.
6.15.15. Exercise. Prove Theorem 6.15.14.
We note that B and G in the above theorem can be written in an alternate way. That is, we can say that
    X̂ = E[XYᵀ]{E[YYᵀ]}⁻¹Y    (6.15.16)
is the best linear estimate of X. By the expected value of a matrix of random variables, we mean the expected value of each element of the matrix.
In the remainder of this section we apply the preceding development to dynamic systems.
Let K = {1, 2, …} denote the set of positive integers. We use the notation {X(k)} to denote a sequence of random vectors; i.e., X(k) is a random vector for each k ∈ K. Let {W(k)} be a sequence of random vectors, W(k) = (W₁(k), …, W_p(k))ᵀ, with the properties
    E[W(k)] = 0    (6.15.17)
and
    E[W(k)Wᵀ(j)] = Q(k)δⱼₖ    (6.15.18)
for all j, k ∈ K, where Q(k) is a symmetric positive definite (p × p) matrix for all k ∈ K. Next, let {V(k)} be a sequence of random vectors, V(k) = (V₁(k), …, V_m(k))ᵀ, with the properties
    E[V(k)] = 0    (6.15.19)
and
    E[V(k)Vᵀ(j)] = R(k)δⱼₖ    (6.15.20)
for all j, k ∈ K, where R(k) is a symmetric positive definite (m × m) matrix for all k ∈ K.
Now let X(1) be a random vector, X(1) = (X₁(1), …, Xₙ(1))ᵀ, with the properties
    E[X(1)] = 0    (6.15.21)
and
    E[X(1)Xᵀ(1)] = P(1),    (6.15.22)
where P(1) is an (n × n) symmetric positive definite matrix. We assume further that the relationships among the random vectors are such that
    E[W(k)Vᵀ(j)] = 0,    (6.15.23)
    E[X(1)Wᵀ(k)] = 0,    (6.15.24)
and
    E[X(1)Vᵀ(k)] = 0    (6.15.25)
for all k, j ∈ K.
Next, let A(k) be a real (n × n) matrix for each k ∈ K, let B(k) be a real (n × p) matrix for each k ∈ K, and let C(k) be a real (m × n) matrix for each k ∈ K. We let {X(k)} and {Y(k)} be the sequences of random vectors generated by the difference equations
    X(k + 1) = A(k)X(k) + B(k)W(k)    (6.15.26)
and
    Y(k) = C(k)X(k) + V(k)    (6.15.27)
for k = 1, 2, ….
We are now in a position to consider the following estimation problem: given the set of observations {Y(1), …, Y(k)}, find the best linear estimate of the random vector X(k). We could view the observed random variables as a single random vector, say 𝒴ᵀ = (Yᵀ(1), Yᵀ(2), …, Yᵀ(k)), and apply Theorem 6.15.14; however, it turns out that a rather elegant and significant algorithm exists for this problem, due to R. E. Kalman, which we consider next.
In the following, we adopt some additional convenient notation. For each k, j ∈ K, we let X̂(j|k) denote the best linear estimate of X(j), given {Y(1), …, Y(k)}. This notation is valid for j < k and j ≥ k; however, we shall limit our attention to the situation where j ≥ k. In the present context, a recursive algorithm means that X̂(k+1|k+1) is a function only of X̂(k|k) and Y(k+1). The following theorem, which is the last result of this section, provides the desired algorithm explicitly.
6.15.28. Theorem (Kalman). Given the foregoing assumptions for the dynamic system described by Eqs. (6.15.26) and (6.15.27), the best linear estimate of X(k), given {Y(1), …, Y(k)}, is provided by the following set of difference equations:
    X̂(k|k) = X̂(k|k−1) + K(k)[Y(k) − C(k)X̂(k|k−1)]    (6.15.29)
and
    X̂(k+1|k) = A(k)X̂(k|k),    (6.15.30)
where
    K(k) = P(k|k−1)Cᵀ(k)[C(k)P(k|k−1)Cᵀ(k) + R(k)]⁻¹,    (6.15.31)
    P(k|k) = [I − K(k)C(k)]P(k|k−1),    (6.15.32)
and
    P(k+1|k) = A(k)P(k|k)Aᵀ(k) + B(k)Q(k)Bᵀ(k)    (6.15.33)
for k = 1, 2, …, with initial conditions
    X̂(1|0) = 0
and
    P(1|0) = P(1).
Proof. Assume that X̂(k|k−1) is known for k ∈ K. We may interpret X̂(1|0) as the best linear estimate of X(1), given no observations. We wish to find X̂(k|k) and X̂(k+1|k). It follows from Theorem 6.15.13 (extended to the case of random vectors) that there is a matrix K(k) such that X̂(k|k) = X̂(k|k−1) + K(k)Ỹ(k|k−1), where Ỹ(k|k−1) = Y(k) − Ŷ(k|k−1), and Ŷ(k|k−1) is the best linear estimate of Y(k), given {Y(1), …, Y(k−1)}. It follows immediately from Eqs. (6.15.23) and (6.15.27) and the orthogonality principle that Ŷ(k|k−1) = C(k)X̂(k|k−1). Thus, we have shown that Eq. (6.15.29) must be true. In order to determine K(k), let X̃(k|k−1) = X(k) − X̂(k|k−1). Then it follows from Eqs. (6.15.26) and (6.15.29) that
    X̃(k|k) = X̃(k|k−1) − K(k)[C(k)X̃(k|k−1) + V(k)].
To satisfy the orthogonality principle, we must have E[X̃(k|k)Yᵀ(j)] = 0 for j = 1, …, k. We see that this is satisfied for any K(k) for j = 1, …, k − 1. In order to satisfy E[X̃(k|k)Yᵀ(k)] = 0, K(k) must satisfy
    0 = E[X̃(k|k−1)Yᵀ(k)] − K(k)C(k)E[X̃(k|k−1)Yᵀ(k)] − K(k)E[V(k)Yᵀ(k)].    (6.15.34)
Let us first consider the term
    E[X̃(k|k−1)Yᵀ(k)] = E[X̃(k|k−1)Xᵀ(k)Cᵀ(k) + X̃(k|k−1)Vᵀ(k)].    (6.15.35)
We observe that X(k), the solution to the difference equation (6.15.26) at (time) k, is a linear combination of X(1) and W(1), …, W(k−1). In view of Eqs. (6.15.23) and (6.15.25) it follows that E[X(j)Vᵀ(k)] = 0 for all k, j ∈ K. Hence, E[X̃(k|k−1)Vᵀ(k)] = 0, since X̃(k|k−1) is a linear combination of X(k) and Y(1), …, Y(k−1).
Next, we consider the term
    E[X̃(k|k−1)Xᵀ(k)] = E[X̃(k|k−1){X̃ᵀ(k|k−1) + X̂ᵀ(k|k−1)}]
        = E[X̃(k|k−1)X̃ᵀ(k|k−1)] + E[X̃(k|k−1)X̂ᵀ(k|k−1)]
        = P(k|k−1),    (6.15.36)
where
    P(k|k−1) ≜ E[X̃(k|k−1)X̃ᵀ(k|k−1)]
and E[X̃(k|k−1)X̂ᵀ(k|k−1)] = 0, since X̂(k|k−1) is a linear combination of Y(1), …, Y(k−1).
Now consider
    E[V(k)Yᵀ(k)] = E[V(k){Xᵀ(k)Cᵀ(k) + Vᵀ(k)}] = R(k).    (6.15.37)
Using Eqs. (6.15.35), (6.15.36), and (6.15.37), Eq. (6.15.34) becomes
    0 = P(k|k−1)Cᵀ(k) − K(k)[C(k)P(k|k−1)Cᵀ(k) + R(k)].    (6.15.38)
Solving for K(k), we obtain Eq. (6.15.31).
To obtain Eq. (6.15.32), let X̃(k|k) = X(k) − X̂(k|k) and P(k|k) = E[X̃(k|k)X̃ᵀ(k|k)]. In view of Eqs. (6.15.27) and (6.15.29) we have
    X̃(k|k) = X̃(k|k−1) − K(k)[C(k)X̃(k|k−1) + V(k)]
        = [I − K(k)C(k)]X̃(k|k−1) − K(k)V(k).
From this it follows that
    P(k|k) = [I − K(k)C(k)]P(k|k−1)
        − {[I − K(k)C(k)]P(k|k−1)Cᵀ(k) − K(k)R(k)}Kᵀ(k)
        = [I − K(k)C(k)]P(k|k−1)
        − {P(k|k−1)Cᵀ(k) − K(k)[C(k)P(k|k−1)Cᵀ(k) + R(k)]}Kᵀ(k).
Using Eq. (6.15.38), it follows that Eq. (6.15.32) must be true.
To show that X̂(k+1|k) is given by Eq. (6.15.30), we simply show that the orthogonality principle is satisfied. That is,
    E[{X(k+1) − A(k)X̂(k|k)}Yᵀ(j)]
        = E[{A(k)X(k) − A(k)X̂(k|k)}Yᵀ(j)] + E[B(k)W(k)Yᵀ(j)] = 0
for j = 1, …, k.
Finally, to verify Eq. (6.15.33), we have from Eqs. (6.15.26) and (6.15.30)
    X̃(k+1|k) = A(k)X̃(k|k) + B(k)W(k).
From this, Eq. (6.15.33) follows immediately. We note that X̂(1|0) = 0 and P(1|0) = P(1). This completes the proof.
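Equations (6.15.29)-(6.15.33) translate directly into a recursive program. The sketch below is our own illustration; the scalar system matrices A, B, C, Q, R, P(1) are arbitrary choices satisfying the stated assumptions. It simulates Eqs. (6.15.26)-(6.15.27) and runs the filter with the initial conditions X̂(1|0) = 0, P(1|0) = P(1).

```python
import numpy as np

rng = np.random.default_rng(2)
# Illustrative time-invariant scalar system (n = m = p = 1)
A, B, C = np.array([[0.9]]), np.array([[1.0]]), np.array([[1.0]])
Q, R, P1 = np.array([[0.1]]), np.array([[0.2]]), np.array([[1.0]])

x = rng.multivariate_normal([0.0], P1)               # X(1) with E[X(1)X^T(1)] = P(1)
x_pred, P_pred = np.zeros(1), P1.copy()              # x_hat(1|0) = 0, P(1|0) = P(1)
for k in range(50):
    y = C @ x + rng.multivariate_normal([0.0], R)            # Eq. (6.15.27)
    S = C @ P_pred @ C.T + R
    K = P_pred @ C.T @ np.linalg.inv(S)                      # Eq. (6.15.31)
    x_filt = x_pred + K @ (y - C @ x_pred)                   # Eq. (6.15.29)
    P_filt = (np.eye(1) - K @ C) @ P_pred                    # Eq. (6.15.32)
    x_pred = A @ x_filt                                      # Eq. (6.15.30)
    P_pred = A @ P_filt @ A.T + B @ Q @ B.T                  # Eq. (6.15.33)
    x = A @ x + B @ rng.multivariate_normal([0.0], Q)        # Eq. (6.15.26)
print(x_filt, P_filt)
```

For this time-invariant system the prediction variance P(k+1|k) converges to the positive root of the steady-state Riccati equation P = A²PR/(P + R) + Q, which is approximately 0.1758 here.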
6.16. NOTES AND REFERENCES

The material of the present chapter as well as that of the next chapter constitutes part of what usually goes under the heading of functional analysis. Thus, these two chapters should be viewed as a whole rather than as two separate parts.
There are numerous excellent sources dealing with Hilbert and Banach spaces. We cite a representative sample of these which the reader should consult for further study. References [6.6]-[6.8], [6.10], and [6.12] are at an introductory or intermediate level, whereas references [6.2]-[6.4] and [6.13] are at a more advanced level. The books by Dunford and Schwartz and by Hille and Phillips are standard and encyclopedic references on functional analysis; the text by Yosida constitutes a concise treatment of this subject, while the monograph by Halmos contains a compact exposition on Hilbert space. The book by Taylor is a standard reference on functional analysis at the intermediate level. The texts by Kantorovich and Akilov, by Kolmogorov and Fomin, and by Liusternik and Sobolev are very readable presentations of this subject. The book by Naylor and Sell, which presents a very nice introduction to functional analysis, includes some interesting examples. For references with applications of functional analysis to specific areas, including those in Section 6.15, see, e.g., Byron and Fuller [6.1], Kalman et al. [6.5], Luenberger [6.9], and Porter [6.11].

REFERENCES

[6.1] F. W. BYRON and R. W. FULLER, Mathematics of Classical and Quantum Physics. Vols. I, II. Reading, Mass.: Addison-Wesley Publishing Co., Inc., 1969 and 1970.*
[6.2] N. DUNFORD and J. SCHWARTZ, Linear Operators. Parts I and II. New York: Interscience Publishers, 1958 and 1964.
[6.3] P. R. HALMOS, Introduction to Hilbert Space. New York: Chelsea Publishing Company, 1957.
[6.4] E. HILLE and R. S. PHILLIPS, Functional Analysis and Semi-Groups. Providence, R.I.: American Mathematical Society, 1957.
[6.5] R. E. KALMAN, P. L. FALB, and M. A. ARBIB, Topics in Mathematical System Theory. New York: McGraw-Hill Book Company, 1969.
*Reprinted in one volume by Dover Publications, Inc., New York, 1992.
[6.6] L. V. KANTOROVICH and G. P. AKILOV, Functional Analysis in Normed Spaces. New York: The Macmillan Company, 1964.
[6.7] A. N. KOLMOGOROV and S. V. FOMIN, Elements of the Theory of Functions and Functional Analysis. Vols. I, II. Albany, N.Y.: Graylock Press, 1957 and 1961.
[6.8] L. A. LIUSTERNIK and V. J. SOBOLEV, Elements of Functional Analysis. New York: Frederick Ungar Publishing Company, 1961.
[6.9] D. G. LUENBERGER, Optimization by Vector Space Methods. New York: John Wiley & Sons, Inc., 1969.
[6.10] A. W. NAYLOR and G. R. SELL, Linear Operator Theory. New York: Holt, Rinehart and Winston, 1971.
[6.11] W. A. PORTER, Modern Foundations of Systems Engineering. New York: The Macmillan Company, 1966.
[6.12] A. E. TAYLOR, Introduction to Functional Analysis. New York: John Wiley & Sons, Inc., 1958.
[6.13] K. YOSIDA, Functional Analysis. Berlin: Springer-Verlag, 1965.
7

LINEAR OPERATORS

In the present chapter we concern ourselves with linear operators defined on Banach and Hilbert spaces, and we study some of the important properties of such operators. We also consider selected applications in this chapter.
This chapter consists of ten parts. Throughout, we consider primarily bounded linear operators, which we introduce in the first section. In the second section we look at inverses of linear transformations, in section three we introduce conjugate and adjoint operators, and in section four we study hermitian operators. In the fifth section we present additional special linear transformations, including normal operators, projections, unitary operators, and isometric operators. The spectrum of an operator is considered in the sixth, while completely continuous operators are introduced in the seventh section. In the eighth section we present one of the main results of the present chapter, the spectral theorem for completely continuous normal operators. Finally, in section nine we study differentiation of operators (which need not be linear) defined on Banach and Hilbert spaces.
Section ten, which consists of three subsections, is devoted to selected topics in applications. Items touched upon include applications to integral equations, an example from optimal control, and minimization of functionals (method of steepest descent). The chapter is concluded with a brief discussion of pertinent references in the eleventh section.
7.1. BOUNDED LINEAR TRANSFORMATIONS

Throughout this section X and Y denote vector spaces over the same field F, where F is either R (the real numbers) or C (the complex numbers).
We begin by pointing to several concepts considered previously. Recall from Chapter 1 that a transformation or operator T is a mapping of a subset 𝔇(T) of X into Y. Unless specified to the contrary, we will assume that 𝔇(T) = X. Since a transformation is a mapping we distinguish, as in Chapter 1, between operators which are onto or surjective, one-to-one or injective, and one-to-one and onto or bijective. If T is a transformation of X into Y we write T: X → Y. If x ∈ X we call y = T(x) the image of x in Y under T, and if V ⊂ X we define the image of set V in Y under T as the set
    T(V) = {y ∈ Y: y = T(v), v ∈ V} ⊂ Y.
On the other hand, if W ⊂ Y, then the inverse image of set W under T is the set
    T⁻¹(W) = {x ∈ X: y = T(x) ∈ W} ⊂ X.
We define the range of T, denoted ℜ(T), by
    ℜ(T) = {y ∈ Y: y = T(x), x ∈ X};
i.e., ℜ(T) = T(X). Recall that if a transformation T of X into Y is injective, then the inverse of T, denoted T⁻¹, exists (see Definition 1.2.9). Thus, if y = T(x) and if T is injective, then x = T⁻¹(y).
In Definition 3.4.1 we defined a linear operator (or a linear transformation) as a mapping T of X into Y having the property that
(i) T(x + y) = T(x) + T(y) for all x, y ∈ X; and
(ii) T(αx) = αT(x) for all α ∈ F and all x ∈ X.
As in Chapter 3, we denote the class of all linear transformations from X into Y by L(X, Y). Also, in the case of linear transformations we write Tx in place of T(x).
Of great importance are bounded linear operators, which turn out to be also continuous. We have the following definition.
7.1.1. Definition. Let X and Y be normed linear spaces. A linear operator T: X → Y is said to be bounded if there is a real number γ ≥ 0 such that
    ||Tx||_Y ≤ γ||x||_X
for all x ∈ X.
The notation ||x||_X indicates that the norm on X is used, while the notation ||Tx||_Y indicates that the norm on Y is employed. However, since the norms of the various spaces are usually understood, it is customary to drop the subscripts and simply write ||x|| and ||Tx||.
Our first result allows us to characterize a bounded linear operator in an equivalent way.
7.1.2. Theorem. Let T ∈ L(X, Y). Then T is bounded if and only if T maps the unit sphere into a bounded subset of Y.
7.1.3. Exercise. Prove Theorem 7.1.2.
In Chapter 5 we introduced continuous functions (see Definition 5.7.1). The definition of continuity of an operator in the setting of normed linear spaces can now be rephrased as follows.
7.1.4. Definition. An operator T: X → Y (not necessarily linear) is said to be continuous at a point x₀ ∈ X if for every ε > 0 there is a δ > 0 such that
    ||T(x) − T(x₀)|| < ε
whenever ||x − x₀|| < δ.
The reader can readily prove the next result.
7.1.5. Theorem. Let T ∈ L(X, Y). If T is continuous at a single point x₀ ∈ X, then it is continuous at all x ∈ X.
7.1.6. Exercise. Prove Theorem 7.1.5.
In this chapter we will mainly concern ourselves with bounded linear operators. Our next result shows that in the case of linear operators boundedness and continuity are equivalent.
7.1.7. Theorem. Let T ∈ L(X, Y). Then T is continuous if and only if it is bounded.
Proof. Assume that T is bounded, and let γ be such that ||Tx|| ≤ γ||x|| for all x ∈ X. Now consider a sequence {xₙ} in X such that xₙ → 0 as n → ∞. Then ||Txₙ|| ≤ γ||xₙ|| → 0 as n → ∞, and hence T is continuous at the point 0 ∈ X. From Theorem 7.1.5 it follows that T is continuous at all points x ∈ X.
Conversely, assume that T is continuous at 0, and hence at all x ∈ X. Since T0 = 0 we can find a δ > 0 such that ||Tx|| ≤ 1 whenever ||x|| ≤ δ. For any x ≠ 0 we have ||δx/||x|||| = δ, and hence
    ||Tx|| = ||T((||x||/δ)(δx/||x||))|| = (||x||/δ)||T(δx/||x||)|| ≤ (1/δ)||x||.
If we let γ = 1/δ, then ||Tx|| ≤ γ||x||, and T is bounded.
Now let S, T ∈ L(X, Y). In Eq. (3.4.42) we defined the sum of linear operators (S + T) by
    (S + T)x = Sx + Tx, x ∈ X,
and in Eq. (3.4.43) we defined multiplication of T by a scalar α ∈ F as
    (αT)x = α(Tx), x ∈ X, α ∈ F.
We also recall (see Eq. (3.4.44)) that the zero transformation, 0, of X into Y is defined by 0x = 0 for all x ∈ X and that the negative of a transformation T, denoted by −T, is defined by (−T)x = −Tx for all x ∈ X (see Eq. (3.4.45)). Furthermore, the identity transformation I ∈ L(X, X) is defined by Ix = x for all x ∈ X (see Eq. (3.4.56)). Referring to Theorem 3.4.47, we recall that L(X, Y) is a linear space over F.
Next, let X, Y, Z be vector spaces over F, and let S ∈ L(Y, Z) and T ∈ L(X, Y). The product of S and T, denoted by ST, was defined in Eq. (3.4.50) as the mapping of X into Z such that
    (ST)x = S(Tx), x ∈ X.
It can readily be shown that ST ∈ L(X, Z). Furthermore, if X = Y = Z, then L(X, X) is an associative algebra with identity I (see Theorem 3.4.59). Note however that the algebra L(X, X) is, in general, not commutative because, in general,
    ST ≠ TS.    (7.1.8)
In the following, we will use the notation B(X, Y) to denote the set of all bounded linear transformations from X into Y; i.e.,
    B(X, Y) ≜ {T ∈ L(X, Y): T is bounded}.
The reader should have no difficulty in proving the next theorem.
7.1.9. Theorem. The space B(X, Y) is a linear space over F.
7.1.10. Exercise. Prove Theorem 7.1.9.
Next, we wish to define a norm on B(X, Y).
7.1.11. Definition. Let T ∈ B(X, Y). The norm of T, denoted ||T||, is defined by
    ||T|| = inf{γ: ||Tx|| ≤ γ||x|| for all x ∈ X}.    (7.1.12)
Note that ||T|| is finite and that
    ||Tx|| ≤ ||T|| ||x||
for all x ∈ X. In proving that the function ||·||: B(X, Y) → R satisfies all the axioms of a norm (see Definition 6.1.1), we need the following result.
7.1.13. Theorem. Let T ∈ B(X, Y). Then ||T|| can equivalently be expressed in any one of the following forms:
(i) ||T|| = inf{γ: ||Tx|| ≤ γ||x|| for all x ∈ X};
(ii) ||T|| = sup{||Tx||/||x||: x ∈ X, x ≠ 0};
(iii) ||T|| = sup{||Tx||: x ∈ X, ||x|| ≤ 1}; and
(iv) ||T|| = sup{||Tx||: x ∈ X, ||x|| = 1}.
7.1.14. Exercise. Prove Theorem 7.1.13.
We now show that the function ||·|| defined in Eq. (7.1.12) satisfies all the axioms of a norm.
7.1.15. Theorem. The linear space B(X, Y) is a normed linear space (with norm defined by Eq. (7.1.12)); i.e.,
(i) for every T ∈ B(X, Y), ||T|| ≥ 0, and ||T|| = 0 if and only if T = 0;
(ii) ||S + T|| ≤ ||S|| + ||T|| for every S, T ∈ B(X, Y); and
(iii) ||αT|| = |α| ||T|| for every T ∈ B(X, Y) and for every α ∈ F.
Proof. The proof of part (i) is obvious. To verify (ii) we note that
    ||(S + T)x|| = ||Sx + Tx|| ≤ ||Sx|| + ||Tx|| ≤ (||S|| + ||T||)||x||.
If x = 0, then we are finished. If x ≠ 0, then
    ||S + T|| = sup_{x≠0} ||(S + T)x||/||x|| ≤ ||S|| + ||T||.
We leave the proof of part (iii), which is similar, as an exercise.
For the space B(X, Y) we have the following results.
7.1.16. Theorem. If S, T ∈ B(X, X), then ST ∈ B(X, X) and
    ||ST|| ≤ ||S|| ||T||.
Proof. For each x ∈ X we have
    ||(ST)x|| = ||S(Tx)|| ≤ ||S|| ||Tx|| ≤ ||S|| ||T|| ||x||,
which shows that ST ∈ B(X, X). If x ≠ 0, then
    ||ST|| = sup_{x≠0} ||(ST)x||/||x|| ≤ ||S|| ||T||,
completing the proof.
7.1.17. Theorem. Let I denote the identity operator on X. Then I ∈ B(X, X), and ||I|| = 1.
7.1.18. Exercise. Prove Theorem 7.1.17.
We now consider some specific cases.
7.1.19. Example. Let X = l₁, the Banach space of Example 6.1.6. For x = (ξ₁, ξ₂, …) ∈ X, let us define T: X → X by
    Tx = (0, ξ₂, ξ₃, …).
The reader can readily verify that T is a linear operator which is neither injective nor surjective. We see that
    ||Tx|| = ∑ᵢ₌₂^∞ |ξᵢ| ≤ ∑ᵢ₌₁^∞ |ξᵢ| = ||x||.
Thus, T is a bounded linear operator. To compute ||T|| we observe that ||Tx|| ≤ ||x||, which implies that ||T|| ≤ 1. Choosing, in particular, x = (0, 1, 0, …) ∈ X, we have ||Tx|| = ||x|| = 1 and
    1 = ||Tx|| ≤ ||T|| ||x|| = ||T||.
Thus, it must be that ||T|| = 1.
7.1.20. Example. Let X = C[a, b], and let ||·||_∞ be the norm on C[a, b] defined in Example 6.1.9. Let k: [a, b] × [a, b] → R be a real-valued function, continuous on the square a ≤ s ≤ b, a ≤ t ≤ b. Define the operator T: X → X by
    (Tx)(s) = ∫ₐᵇ k(s, t)x(t) dt
for x ∈ X. Then T ∈ L(X, X) (see Example 3.4.6). Then
    ||Tx|| = sup_{a≤s≤b} |∫ₐᵇ k(s, t)x(t) dt|
        ≤ [sup_{a≤s≤b} ∫ₐᵇ |k(s, t)| dt]·[sup_{a≤t≤b} |x(t)|] = γ₀||x||,
where γ₀ = sup_{a≤s≤b} ∫ₐᵇ |k(s, t)| dt. This shows that T ∈ B(X, X) and that ||T|| ≤ γ₀. It can, in fact, be shown that ||T|| = γ₀.
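This bound can be explored numerically. A sketch (the kernel and grid are our own choices): for k(s, t) = s − t on [0, 1] × [0, 1], γ₀ = max_s ∫₀¹ |s − t| dt = 1/2, attained at s = 0 (or s = 1), and the constant function x(t) = −1 = sgn k(0, t) with ||x||_∞ = 1 already achieves ||Tx||_∞ = γ₀.

```python
import numpy as np

# Grid and trapezoid quadrature weights on [0, 1]
t = np.linspace(0.0, 1.0, 2001)
h = t[1] - t[0]
w = np.full_like(t, h)
w[0] = w[-1] = h / 2

# Kernel k(s, t) = s - t sampled on the square (illustrative choice)
S, T = np.meshgrid(t, t, indexing="ij")
k = S - T

gamma0 = (np.abs(k) @ w).max()     # gamma0 = max_s int_0^1 |k(s, t)| dt  (= 1/2)
x = -np.ones_like(t)               # x(t) = sgn k(0, t), with ||x||_inf = 1
Tx = (k * x) @ w                   # (Tx)(s) = int_0^1 k(s, t) x(t) dt
print(gamma0, np.abs(Tx).max())    # ||Tx||_inf equals gamma0 here, so ||T|| = gamma0
```

For kernels where sgn k(s*, ·) is discontinuous, that sign function must instead be approximated by continuous x with ||x||_∞ ≤ 1, which is why equality in the example requires a separate argument.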
For norms of linear operators on finite-dimensional spaces, we have the following important result.
7.1.21. Theorem. Let T ∈ L(X, Y). If X is finite dimensional, then T is continuous.
Proof. Let {x₁, …, xₙ} be a basis for X. For each x ∈ X, there is a unique set of scalars {ξ₁, …, ξₙ} such that x = ξ₁x₁ + … + ξₙxₙ. If we define the linear functionals fᵢ: X → F by fᵢ(x) = ξᵢ, i = 1, …, n, then by Theorem 6.6.1 we know that each fᵢ is a continuous linear functional. Thus, there exists a set of real numbers γ₁, …, γₙ such that |fᵢ(x)| ≤ γᵢ||x|| for i = 1, …, n. Now
    Tx = ξ₁Tx₁ + … + ξₙTxₙ.
If we let β = maxᵢ ||Txᵢ|| and γ₀ = maxᵢ γᵢ, then it follows that ||Tx|| ≤ nβγ₀||x||. Thus, T is bounded and hence continuous.
Ne t, we concern ourselves with various norms oI linear transIormations
on the Iinite dimensional space R".
7.1.22. E ample. et R", and let I ... ,u" be the natural basis Ior
R" (see E ample . I.I5). or any A E (, ) there is an n n matri , say
A all (see DeIinition .2.7), which represents Awith respect to I ... ,
u" . Thus, iI A y, where (I ... ,,") E andy ( 71 ... , 7") E ,
we may represent this transIormation by y A (see E . ( .2.17 . In E am
ple 6.1.5 we deIined several norms on R", namely
II llp IllI ... 1e" I/P, 1 p 00
and
II 11 ma I e,l .
,
It turns out that diIIerent norms on R" give rise to diIIerent norms oItransIor
mation A. (In this case we speak oI the norm oI A induced by the norm
deIined on R".) In the present e ample we derive e pressions Ior the norm
oI A in terms oI the elements oI matri A when the norm on R" is given
by II III II li , and II 11
(i) etp 1; i.e., II li led ... 1e"1. Then IIAII ma tlalll.
1 1 1
To prove this, we see that
IIA l1 It1 atjelII tilatj ll
t lell t la ll i; lell ma t Iall I
l 1 , 1 l . I S;lS;" , I
ma t la,ll llll
S;lS;" I I
et jo be such that i;latj,l ma tla/ I" )0 Then IIAII"o To
1 1 I S;lS;" 1 1
show that e uality must hold, let
o
(I ... ,,") E R" be given by ll I,
and " 0 iI i *jo. Then
IIA oli t lau,l and Il oll 1.
I
rom this it Iollows that II AII )0 and so we conclude that II AII )0
7.1. Bounded inear TransIormations
13
(ii) Let p = 2; i.e., ‖x‖₂ = (|ξ₁|² + ... + |ξ_n|²)^{1/2}. Let Aᵀ denote the transpose of A (see Eq. (4.2.9)), and let λ₁, ..., λ_k be the distinct eigenvalues of the matrix AᵀA (see Definition 4.5.6). Let λ₀ = max_i λ_i. Then ‖A‖₂ = √λ₀.

To prove this we note first that by Theorem 4.10.28 the eigenvalues of AᵀA are all real. We show first that they are, in fact, nonnegative. Let x₁, ..., x_k be eigenvectors of AᵀA corresponding to the eigenvalues λ₁, ..., λ_k, respectively. Then for each i = 1, ..., k we have AᵀAx_i = λ_i x_i. Thus, x_iᵀAᵀAx_i = λ_i x_iᵀx_i. From this it follows that

λ_i = x_iᵀAᵀAx_i / (x_iᵀx_i) = ‖Ax_i‖² / ‖x_i‖² ≥ 0.

For arbitrary x ∈ X, since AᵀA is a real symmetric matrix, we can write x = Σ_{i=1}^n α_i x_i, where the x_i form an orthonormal set of eigenvectors of AᵀA with AᵀAx_i = λ_i x_i, i = 1, ..., n. Hence, AᵀAx = Σ_{i=1}^n λ_i α_i x_i. Since ‖Ax‖² = (Ax)ᵀ(Ax) = xᵀAᵀAx, we have

‖Ax‖² = xᵀAᵀAx = Σ_{i=1}^n λ_i α_i² ≤ λ₀ Σ_{i=1}^n α_i² = λ₀‖x‖²,

from which it follows that ‖A‖₂ ≤ √λ₀. If we let x be an eigenvector corresponding to λ₀, then we must have ‖Ax‖² = λ₀‖x‖², and so equality is achieved. Thus, ‖A‖₂ = √λ₀.
(iii) Let ‖x‖_∞ = max_i |ξ_i|. Then

‖A‖_∞ = max_{1≤i≤n} (Σ_{j=1}^n |a_ij|).

The proof of this part is left as an exercise.

7.1.23. Exercise. Prove part (iii) of Example 7.1.22.
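The three induced-norm formulas of this example are easy to check numerically. The following sketch (it assumes the numpy library, which is not part of the text, and an arbitrarily chosen matrix) evaluates each formula directly from the entries of A and compares the results with numpy's own induced matrix norms.

```python
import numpy as np

A = np.array([[1.0, -2.0, 3.0],
              [0.0,  4.0, -1.0],
              [2.0,  1.0,  0.5]])

# (i) norm induced by ||.||_1: the maximum absolute column sum.
norm1 = max(np.sum(np.abs(A), axis=0))

# (ii) norm induced by ||.||_2: square root of the largest eigenvalue of A^T A.
norm2 = np.sqrt(max(np.linalg.eigvalsh(A.T @ A)))

# (iii) norm induced by ||.||_inf: the maximum absolute row sum.
norm_inf = max(np.sum(np.abs(A), axis=1))

# Compare with numpy's induced matrix norms.
assert np.isclose(norm1, np.linalg.norm(A, 1))
assert np.isclose(norm2, np.linalg.norm(A, 2))
assert np.isclose(norm_inf, np.linalg.norm(A, np.inf))
```

For this particular A the column sums are 3, 7, and 4.5 and the row sums are 6, 5, and 3.5, so ‖A‖₁ = 7 and ‖A‖_∞ = 6.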
Next, we prove the following important result concerning the completeness of B(X, Y).

7.1.24. Theorem. If Y is complete, then the normed linear space B(X, Y) is also complete.
Proof. Let {T_n} be a Cauchy sequence in the normed linear space B(X, Y). Choose N such that for a given ε > 0, ‖T_m − T_n‖ < ε whenever m ≥ N and n ≥ N. Since the T_n are bounded we have for each x ∈ X,

‖T_m x − T_n x‖ ≤ ‖T_m − T_n‖ ‖x‖ < ε‖x‖

whenever m, n ≥ N. From this it follows that {T_n x} is a Cauchy sequence in Y. But Y is complete, by hypothesis. Therefore, {T_n x} has a limit in Y which depends on x ∈ X. Let us denote this limit by Tx; i.e., lim_{n→∞} T_n x = Tx. To show that T is linear we note that

T(x + y) = lim T_n(x + y) = lim T_n x + lim T_n y = Tx + Ty

and

T(αx) = lim T_n(αx) = α lim T_n x = αTx.
Thus, T is a linear operator of X into Y. We show next that T is bounded and hence continuous. Since every Cauchy sequence in a normed linear space is bounded, it follows that the sequence {T_n} is bounded, and thus ‖T_n‖ ≤ M for all n, where M is some constant. We have

‖Tx‖ = ‖lim T_n x‖ = lim ‖T_n x‖ ≤ sup_n (‖T_n‖ ‖x‖) ≤ M‖x‖.

This proves that T is bounded and therefore continuous, and T ∈ B(X, Y). Finally, we must show that T_n → T as n → ∞ in the norm of B(X, Y). From before, we have ‖T_m x − T_n x‖ < ε‖x‖ whenever m, n ≥ N. If we let n → ∞, then ‖T_m x − Tx‖ ≤ ε‖x‖ for every x ∈ X provided that m ≥ N. This implies that ‖T_m − T‖ ≤ ε whenever m ≥ N. But then T_m → T as m → ∞ with respect to the norm defined on B(X, Y). Therefore, B(X, Y) is complete and the theorem is proved.
In Definition 3.4.16 we defined the null space of T ∈ L(X, Y) as

N(T) = {x ∈ X : Tx = 0}.   (7.1.25)

We then showed that the range space R(T) is a linear subspace of Y and that N(T) is a linear subspace of X. For the case of bounded linear transformations we have the following result.

7.1.26. Theorem. Let T ∈ B(X, Y). Then N(T) is a closed linear subspace of X.

Proof. N(T) is a linear subspace of X by Theorem 3.4.19. That it is closed follows from part (ii) of Theorem 5.7.9, since N(T) = T⁻¹({0}) and since {0} is a closed subset of Y.
We conclude this section with the following useful result for continuous linear transformations.

7.1.27. Theorem. Let T ∈ L(X, Y). Then T is continuous if and only if

T(Σ_{i=1}^∞ x_i) = Σ_{i=1}^∞ Tx_i

for every convergent series Σ_{i=1}^∞ x_i in X.

The proof of this theorem follows readily from Theorem 5.7.8. We leave the details as an exercise.

7.1.28. Exercise. Prove Theorem 7.1.27.
7.2. INVERSES

Throughout this section X and Y denote normed linear spaces over the same field F, where F is either R (the real numbers) or C (the complex numbers).

We recall that a linear operator T: X → Y has an inverse, T⁻¹, if it is injective, and if this is so, then T⁻¹ is a linear operator from R(T) onto X (see Theorem 3.4.32). We have the following result concerning the continuity of T⁻¹.

7.2.1. Theorem. Let T ∈ L(X, Y). Then T⁻¹ exists and T⁻¹ ∈ B(R(T), X) if and only if there is an α > 0 such that ‖Tx‖ ≥ α‖x‖ for all x ∈ X. If this is so, then ‖T⁻¹‖ ≤ 1/α.
Proof. Assume that there is a constant α > 0 such that α‖x‖ ≤ ‖Tx‖ for all x ∈ X. Then Tx = 0 implies x = 0, and T⁻¹ exists by Theorem 3.4.32. For y ∈ R(T) there is an x ∈ X such that y = Tx and T⁻¹y = x. Thus,

α‖T⁻¹y‖ = α‖x‖ ≤ ‖Tx‖ = ‖y‖,

or

‖T⁻¹y‖ ≤ (1/α)‖y‖.

Hence, T⁻¹ is bounded and ‖T⁻¹‖ ≤ 1/α.

Conversely, assume that T⁻¹ exists and is bounded. Then for x ∈ X there is a y ∈ R(T) such that y = Tx, and also x = T⁻¹y. Since T⁻¹ is bounded we have

‖x‖ = ‖T⁻¹y‖ ≤ ‖T⁻¹‖ ‖y‖ = ‖T⁻¹‖ ‖Tx‖,

or

‖Tx‖ ≥ [1/‖T⁻¹‖] ‖x‖,

so that the asserted constant α > 0 exists.
The next result, called the Neumann expansion theorem, gives us important information concerning the existence of the inverse of a certain class of bounded linear transformations.

7.2.2. Theorem. Let X be a Banach space, let T ∈ B(X, X), let I ∈ B(X, X) denote the identity operator, and let ‖T‖ < 1. Then the range of (I − T) is X, the inverse of (I − T) exists and is bounded, and it satisfies the inequality

‖(I − T)⁻¹‖ ≤ 1/(1 − ‖T‖).   (7.2.3)

Furthermore, the series Σ_{n=0}^∞ Tⁿ in B(X, X) converges uniformly to (I − T)⁻¹ with respect to the norm of B(X, X); i.e.,

(I − T)⁻¹ = I + T + T² + ... + Tⁿ + ....   (7.2.4)
Proof. Since ‖T‖ < 1, it follows that the series Σ_{n=0}^∞ ‖T‖ⁿ converges. In view of Theorem 7.1.16 we have ‖Tⁿ‖ ≤ ‖T‖ⁿ, and hence the series Σ_{n=0}^∞ Tⁿ converges in the space B(X, X), because this space is complete in view of Theorem 7.1.24. If we set

S = Σ_{n=0}^∞ Tⁿ,

then

ST = TS = Σ_{n=0}^∞ Tⁿ⁺¹,

and

(I − T)S = S(I − T) = I.

It now follows from Theorem 3.4.65 that (I − T)⁻¹ exists and is equal to S. Furthermore, S ∈ B(X, X). The inequality (7.2.3) now follows readily and is left as an exercise.

7.2.5. Exercise. Prove inequality (7.2.3).
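The Neumann expansion can be illustrated in finite dimensions, where B(X, X) is the space of n × n matrices. The sketch below (numpy and the particular matrix T are assumptions, with T chosen so that ‖T‖ < 1) sums partial sums of the series (7.2.4) and checks the bound (7.2.3).

```python
import numpy as np

# A contraction: ||T|| < 1 in the induced 2-norm.
T = np.array([[0.2, 0.1],
              [0.0, 0.3]])
I = np.eye(2)
t = np.linalg.norm(T, 2)
assert t < 1.0

# Partial sums S_n = I + T + T^2 + ... + T^n of the Neumann series.
S, term = np.zeros((2, 2)), np.eye(2)
for _ in range(60):
    S += term
    term = term @ T

inv = np.linalg.inv(I - T)
assert np.allclose(S, inv)                        # Eq. (7.2.4)
assert np.linalg.norm(inv, 2) <= 1.0 / (1.0 - t)  # inequality (7.2.3)
```

Sixty terms are far more than needed here; the tail of the series is bounded by ‖T‖ⁿ/(1 − ‖T‖), which decays geometrically.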
The next result, which is of great significance, is known as the Banach inverse theorem.

7.2.6. Theorem. Let X and Y be Banach spaces, and let T ∈ B(X, Y). If T is bijective, then T⁻¹ is bounded.

Proof. The proof of this theorem is rather lengthy and requires two preliminary results which we state and prove separately.
7.2.7. Proposition. If A is any subset of X such that X = Ā (Ā denotes the closure of A), then any x ∈ X such that x ≠ 0 can be written in the form

x = x₁ + x₂ + ... + x_n + ...,

where x_n ∈ A and ‖x_n‖ ≤ 3‖x‖/2ⁿ, n = 1, 2, ....

Proof. The sequence {x_k} is constructed as follows. Let x₁ ∈ A be such that ‖x − x₁‖ ≤ (1/2)‖x‖. This can certainly be done since X = Ā. Now choose x₂ ∈ A such that ‖x − x₁ − x₂‖ ≤ (1/4)‖x‖. We continue in this manner and obtain

‖x − x₁ − ... − x_n‖ ≤ (1/2ⁿ)‖x‖.

We can always choose such an x_n ∈ A, because x − x₁ − ... − x_{n−1} ∈ X and X = Ā. By construction of {x_k}, ‖x − Σ_{k=1}^n x_k‖ → 0 as n → ∞. Hence,

x = Σ_{k=1}^∞ x_k.

We now compute ‖x_n‖. First, we see that

‖x₁‖ = ‖x₁ − x + x‖ ≤ ‖x − x₁‖ + ‖x‖ ≤ (1/2)‖x‖ + ‖x‖ = (3/2)‖x‖,

‖x₂‖ = ‖x₂ + x₁ − x + x − x₁‖ ≤ ‖x − x₁ − x₂‖ + ‖x − x₁‖ ≤ (1/4)‖x‖ + (1/2)‖x‖ = (3/4)‖x‖,

and, in general,

‖x_n‖ = ‖x_n + x_{n−1} + ... + x₁ − x + x − x₁ − ... − x_{n−1}‖
≤ ‖x − x₁ − ... − x_n‖ + ‖x − x₁ − ... − x_{n−1}‖
≤ (1/2ⁿ)‖x‖ + (1/2ⁿ⁻¹)‖x‖ = (3/2ⁿ)‖x‖,

which proves the proposition.
7.2.8. Proposition. If {A_n} is any countable collection of subsets of X such that X = ∪_{n=1}^∞ A_n, then there is a sphere S(x₀; ε) ⊂ X and a set A_n such that S(x₀; ε) ⊂ Ā_n.

Proof. The proof is by contradiction. Without loss of generality, assume that

A₁ ⊂ A₂ ⊂ A₃ ⊂ ....

For purposes of contradiction assume that for every x ∈ X and every n there is an ε_n > 0 such that S(x; ε_n) ∩ A_n = ∅. Now let x₁ ∈ X and ε₁ > 0 be such that S(x₁; ε₁) ∩ A₁ = ∅. Let x₂ ∈ X and ε₂ > 0 be such that S(x₂; ε₂) ⊂ S(x₁; ε₁) and S(x₂; ε₂) ∩ A₂ = ∅. We see that it is possible to construct a sequence of closed nested spheres {K_n} (see Definition 5.5.3) in such a fashion that the diameter of these spheres, diam(K_n), converges to zero. In view of part (ii) of Theorem 5.5.35, ∩_{k=1}^∞ K_k ≠ ∅. Let x ∈ ∩_{k=1}^∞ K_k. Then x ∉ A_n for all n. But this contradicts the fact that X = ∪_{n=1}^∞ A_n. This completes the proof of the proposition.
Proof of Theorem 7.2.6. Let

A_k = {y ∈ Y : ‖T⁻¹y‖ ≤ k‖y‖}, k = 1, 2, ....

Clearly, Y = ∪_{k=1}^∞ A_k. By Proposition 7.2.8 there is a sphere S(y₀; ε) ⊂ Y and a set A_n such that S(y₀; ε) ⊂ Ā_n. We may assume that y₀ ∈ A_n. Let ρ be such that 0 < ρ < ε, and let us define the sets B and B₀ by

B = {y ∈ S(y₀; ε) : ρ ≤ ‖y − y₀‖}

and

B₀ = {z ∈ Y : z = y − y₀, y ∈ B}.

We now show that there is an A_N such that B₀ ⊂ Ā_N. Let y ∈ B ∩ A_n. Then z = y − y₀ ∈ B₀. We then have

‖T⁻¹(y − y₀)‖ ≤ ‖T⁻¹y‖ + ‖T⁻¹y₀‖
≤ n[‖y‖ + ‖y₀‖] ≤ n[‖y − y₀‖ + 2‖y₀‖]
= n‖y − y₀‖[1 + 2‖y₀‖/‖y − y₀‖]
≤ n‖y − y₀‖[1 + 2‖y₀‖/ρ].

Now let N be a positive integer such that

N ≥ n[1 + 2‖y₀‖/ρ].

It then follows that y − y₀ ∈ A_N. It follows readily that B₀ ⊂ Ā_N.

Now let y be an arbitrary element in Y. It is always possible to choose a real number λ such that λy ∈ B₀. Thus, there is a sequence {y_i} such that y_i ∈ A_N for all i and lim y_i = λy. This means that the sequence {(1/λ)y_i} converges to y. We observe from the definition of A_N that if y_i ∈ A_N, then (1/λ)y_i ∈ A_N for any real number λ. Hence, we have shown that Y = Ā_N.

Finally, for arbitrary y ∈ Y we can write, by Proposition 7.2.7,

y = y₁ + y₂ + ... + y_k + ...,

where y_k ∈ A_N and ‖y_k‖ ≤ 3‖y‖/2^k. Let x_k = T⁻¹y_k, k = 1, 2, ..., and consider the infinite series Σ_{k=1}^∞ x_k. This series converges, since

‖x_k‖ = ‖T⁻¹y_k‖ ≤ N‖y_k‖ ≤ 3N‖y‖/2^k,

so that

Σ_k ‖x_k‖ ≤ Σ_k 3N‖y‖/2^k = 3N‖y‖.

Let x = Σ_{k=1}^∞ x_k. Since T is continuous and since Σ_k x_k converges, it follows that

Tx = T(Σ_{k=1}^∞ x_k) = Σ_{k=1}^∞ Tx_k = Σ_{k=1}^∞ y_k = y.

Hence, Tx = y. Therefore,

‖T⁻¹y‖ = ‖x‖ ≤ Σ_{k=1}^∞ ‖x_k‖ ≤ 3N‖y‖.

This implies that T⁻¹ is bounded, which was to be proved.
Utilizing the principle of contraction mappings (see Theorem 5.8.5), we now establish results related to inverses which are important in applications. In the setting of normed linear spaces we can restate the definition of a contraction mapping as being a function T: X → X (T is not necessarily linear) such that

‖T(x) − T(y)‖ ≤ α‖x − y‖

for all x, y ∈ X, with 0 ≤ α < 1. The principle of contraction mappings asserts that if X is complete and T is a contraction mapping, then the equation

T(x) = x

has one and only one solution x ∈ X.
We now state and prove the following result.

7.2.9. Theorem. Let X be a Banach space, let T ∈ B(X, X), let λ ∈ F, and let λ ≠ 0.

(i) If |λ| > ‖T‖, then Tx = λx has a unique solution, namely x = 0;
(ii) if |λ| > ‖T‖, then (T − λI)⁻¹ exists and is continuous on X;
(iii) if |λ| > ‖T‖, then for a given y ∈ X there is one and only one vector x ∈ X such that (T − λI)x = y, namely

x = −(1/λ)[y + (1/λ)Ty + (1/λ²)T²y + ...]; and

(iv) if ‖I − T‖ < 1, then T⁻¹ exists and is continuous on X.

Proof.
(i) For any x, y ∈ X, we have

‖(1/λ)Tx − (1/λ)Ty‖ = (1/|λ|)‖T(x − y)‖ ≤ (1/|λ|)‖T‖ ‖x − y‖.

Thus, if ‖T‖ < |λ|, then (1/λ)T is a contraction mapping. In view of the principle of contraction mappings there is a unique x ∈ X with (1/λ)Tx = x, or Tx = λx. The unique solution has to be x = 0, because T0 = 0.

(ii) Let U = (1/λ)T. Then ‖U‖ = (1/|λ|)‖T‖ < 1. It now follows from Theorem 7.2.2 that (I − U)⁻¹ exists and is continuous on X. Thus, (T − λI)⁻¹ = [−λ(I − U)]⁻¹ = −(1/λ)(I − U)⁻¹ exists and is continuous on X. This completes the proof of part (ii).

The proofs of the remaining parts are left as an exercise.

7.2.10. Exercise. Prove parts (iii) and (iv) of Theorem 7.2.9.
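The series solution of part (iii) can be checked numerically in R². The sketch below assumes numpy; the matrix T and the scalar λ are arbitrary choices satisfying |λ| > ‖T‖.

```python
import numpy as np

T = np.array([[0.5, 0.2],
              [0.1, 0.4]])
lam = 2.0                                # |lam| > ||T||
assert np.linalg.norm(T, 2) < abs(lam)

y = np.array([1.0, -1.0])

# x = -(1/lam) * [y + T y / lam + T^2 y / lam^2 + ...]
x, term = np.zeros(2), y.copy()
for _ in range(60):
    x += term
    term = T @ term / lam
x *= -1.0 / lam

# Verify that x solves (T - lam I) x = y.
assert np.allclose((T - lam * np.eye(2)) @ x, y)
```

The series converges geometrically with ratio ‖T‖/|λ|, so a few dozen terms give machine precision here.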
7.3. CONJUGATE AND ADJOINT OPERATORS

Associated with every bounded linear operator defined on a normed linear space is a transformation called its conjugate, and associated with every bounded linear operator defined on an inner product space is a transformation called its adjoint. These operators, which we consider in this section, are of utmost importance in analysis as well as in applications.
Throughout this section X and Y are normed linear spaces over F, where F is either R (the real numbers) or C (the complex numbers). In some cases we may further assume that X and Y are inner product spaces, and in other instances we may require that X and/or Y be complete.

Let X^f and Y^f denote the algebraic conjugate of X and Y, respectively (refer to Definition 3.5.18). Utilizing the notation of Section 3.5, we write x' ∈ X^f and y' ∈ Y^f to denote elements of these spaces. If T ∈ L(X, Y), we defined the transpose of T, T^T, to be a mapping from Y^f to X^f determined by the equation

⟨x, T^T y'⟩ = ⟨Tx, y'⟩ for all x ∈ X, y' ∈ Y^f

(see Definition 3.5.27), and we showed that T^T ∈ L(Y^f, X^f).

Now let us assume that T: X → Y is a bounded linear operator of X into Y. Let X* and Y* denote the normed conjugate spaces of X and Y, respectively (refer to Definition 6.5.9). If y' ∈ Y*, then y'(y) = ⟨y, y'⟩ is defined for every y ∈ Y and, in particular, it is defined for every y = Tx, x ∈ X. The quantity ⟨Tx, y'⟩ = y'(Tx) is a scalar for each x ∈ X. Writing x'(x) = ⟨Tx, y'⟩ = y'(Tx), we have defined a functional x' on X. Since y' is a linear transformation (it is a bounded linear functional) and since T is a linear transformation (it is a bounded linear operator), it follows readily that x' is a linear functional. Also, since T is bounded, we have

|x'(x)| = |y'(Tx)| = |⟨Tx, y'⟩| ≤ ‖y'‖ ‖Tx‖ ≤ ‖y'‖ ‖T‖ ‖x‖,

and therefore x' is a bounded linear functional and x' ∈ X*. We have thus assigned to each functional y' ∈ Y* a functional x' ∈ X*; i.e., we have established a linear operator which maps Y* into X*. This operator is called the conjugate operator of the operator T and is denoted by T'. We now have

x' = T'y'.

The definition of T': Y* → X* is usually expressed by the relation

⟨x, T'y'⟩ = ⟨Tx, y'⟩, x ∈ X, y' ∈ Y*.

Utilizing operator notation rather than bracket notation, the definition of the conjugate operator T' satisfies the equation

x'(x) = y'(Tx) = (T'y')(x), x ∈ X,

and we may therefore write

y'T = T'y',

where y'T denotes the functional on X consisting of the composition of the operators T and y', and T'y' is the functional obtained by operating on y' by T'.

The reader can readily show that T' is unique and linear. If Y* = Y^f, which is the case if Y is finite dimensional, then the conjugate T' and the transpose T^T are identical concepts. However, since, in general, Y* is a proper subspace of Y^f, T^T is an extension of T' or, conversely, T' is a restriction of T^T to the space Y*.

We summarize the above discussion in the following definition and in Figure A.

7.3.1. Figure A. (Diagram: T maps X into Y; the conjugate T' maps Y* into X*.)
7.3.2. Definition. Let T be a bounded linear operator on X into Y. The conjugate operator of T, T': Y* → X*, is defined by the formula

⟨x, T'y'⟩ = ⟨Tx, y'⟩, x ∈ X, y' ∈ Y*.

7.3.3. Exercise. Show that the conjugate operator T' is unique and linear.
Before exploring the properties of conjugate operators, we introduce another important operator which is closely related to the conjugate operator, the so-called "adjoint operator." In this case we focus our attention on Hilbert spaces.

Let X and Y denote Hilbert spaces, and let the symbol ( , ) denote the inner product on both X and Y. If T is a bounded linear transformation on X into Y, then in view of the above discussion there is a unique bounded linear operator from Y* into X*, called the conjugate of T. But in view of Theorem 6.14.2, the dual spaces X*, Y* may be identified with X and Y, respectively, because X and Y are Hilbert spaces. This gives rise to a new type of bounded linear operator from Y into X, called the adjoint of T, which we consider in place of T'.
Let y₀ ∈ Y be fixed, and let x'(x) = ⟨x, x'⟩ = (Tx, y₀), where T ∈ B(X, Y) and x' ∈ X*. By Theorem 6.14.2 there is a unique x₀ ∈ X such that x'(x) = (x, x₀). Writing x₀ = T*y₀, we define in this way a transformation T* of Y into X. We call this transformation the adjoint of T. Dropping the subscript zero, we characterize the adjoint of T by the formula

(Tx, y) = (x, T*y), x ∈ X, y ∈ Y.

We will now show that T*: Y → X is linear, unique, and bounded. To prove linearity, let x ∈ X, let y₁, y₂ ∈ Y, let α, β ∈ F, and note that

(x, T*(αy₁ + βy₂)) = (Tx, αy₁ + βy₂) = ᾱ(Tx, y₁) + β̄(Tx, y₂)
= ᾱ(x, T*y₁) + β̄(x, T*y₂) = (x, αT*y₁ + βT*y₂).

From this it follows that

T*(αy₁ + βy₂) = αT*y₁ + βT*y₂,

and therefore T* is linear.

To show that T* is unique we note that if (x, T*y) = (x, S*y), then (x, T*y) − (x, S*y) = 0 implies (x, (T* − S*)y) = 0 for all x ∈ X. From this it follows that (T* − S*)y ⊥ x for all x ∈ X, and thus (T* − S*)y = 0 for all y ∈ Y. Therefore, T* = S*.

To verify that T* is bounded we observe that

‖T*y‖² = |(T*y, T*y)| = |(T(T*y), y)| ≤ ‖T(T*y)‖ ‖y‖ ≤ ‖T‖ ‖T*y‖ ‖y‖,

and thus

‖T*y‖ ≤ ‖T‖ ‖y‖.

From this it follows that T* is bounded and furthermore ‖T*‖ ≤ ‖T‖.
We now give the following formal definition.

7.3.4. Definition. Let X and Y be Hilbert spaces, and let T be a bounded linear operator on X into Y. The adjoint operator T*: Y → X is defined by the formula

(Tx, y) = (x, T*y), x ∈ X, y ∈ Y.

Summarizing the above discussion we have the following result.

7.3.5. Theorem. The adjoint operator T* given in Definition 7.3.4 is linear, unique, and bounded.
The reader is cautioned that many authors use the terms conjugate operator and adjoint operator interchangeably. Also, the symbol T* is used by many authors to denote both adjoint and conjugate operators.

Some of the important properties of conjugate operators are summarized in the following result.
7.3.6. Theorem. Conjugate transformations have the following properties:

(i) ‖T'‖ = ‖T‖;
(ii) I' = I, where I is the identity operator on a normed linear space X;
(iii) 0' = 0, where 0 is the zero operator on a normed linear space X;
(iv) (S + T)' = S' + T', where S, T ∈ B(X, Y) and where X, Y are normed linear spaces;
(v) (αT)' = αT', where T ∈ B(X, Y), α ∈ F, and X, Y are normed linear spaces;
(vi) (ST)' = T'S', where T ∈ B(X, Y), S ∈ B(Y, Z), and X, Y, Z are normed linear spaces; and
(vii) if T⁻¹ exists and if T⁻¹ ∈ B(Y, X), then (T')⁻¹ exists, and moreover (T')⁻¹ = (T⁻¹)'.
Proof. To prove part (i) we note that

|⟨x, T'y'⟩| = |⟨Tx, y'⟩| ≤ ‖y'‖ ‖Tx‖ ≤ ‖y'‖ ‖T‖ ‖x‖.

From this it follows that ‖T'y'‖ ≤ ‖T‖ ‖y'‖, and therefore

‖T'‖ ≤ ‖T‖.

Next, let x₀ ∈ X, Tx₀ ≠ 0. In view of the Hahn–Banach theorem (see Corollary 6.8.5) there is a y' ∈ Y*, ‖y'‖ = 1, such that ⟨Tx₀, y'⟩ = ‖Tx₀‖. Therefore,

‖Tx₀‖ = |⟨x₀, T'y'⟩| ≤ ‖T'y'‖ ‖x₀‖ ≤ ‖T'‖ ‖x₀‖,

from which it follows that

‖T‖ ≤ ‖T'‖.

Therefore, ‖T'‖ = ‖T‖.

The proofs of properties (ii)–(vi) are straightforward. To prove (iv), for example, we note that

⟨x, (S + T)'y'⟩ = ⟨(S + T)x, y'⟩ = ⟨Sx + Tx, y'⟩ = ⟨Sx, y'⟩ + ⟨Tx, y'⟩
= ⟨x, S'y'⟩ + ⟨x, T'y'⟩ = ⟨x, S'y' + T'y'⟩ = ⟨x, (S' + T')y'⟩.

From this it follows that (S + T)' = S' + T'.

To prove part (vii) assume that T ∈ B(X, Y) has a bounded inverse T⁻¹: Y → X. To show that T': Y* → X* has an inverse we must show that it is injective. Let y₁', y₂' ∈ Y* be such that y₁' ≠ y₂'. Then

⟨x, T'y₁'⟩ − ⟨x, T'y₂'⟩ = ⟨Tx, y₁' − y₂'⟩ ≠ 0

for some x ∈ X. From this it follows that T'y₁' ≠ T'y₂', and T' is one-to-one. We can, in fact, show that T' is onto. We note that for any x' ∈ X* and any y ∈ Y we have x = T⁻¹y, and

⟨x, x'⟩ = ⟨T⁻¹y, x'⟩ = ⟨y, (T⁻¹)'x'⟩ = ⟨Tx, (T⁻¹)'x'⟩ = ⟨x, T'(T⁻¹)'x'⟩.

From this it follows that

x' = T'(T⁻¹)'x'.

This shows that x' ∈ R(T') and that (T')⁻¹ = (T⁻¹)'.
7.3.7. Exercise. Prove parts (ii), (iii), (v), and (vi) of Theorem 7.3.6.
In the next theorem some of the important properties of adjoint operators are summarized.

7.3.8. Theorem. Let X, Y, and Z be Hilbert spaces, and let I and 0 denote the identity and zero transformation on X, respectively. Then

(i) ‖T*‖ = ‖T‖, where T ∈ B(X, Y);
(ii) I* = I;
(iii) 0* = 0;
(iv) (S + T)* = S* + T*, where S, T ∈ B(X, Y);
(v) (αT)* = ᾱT*, where T ∈ B(X, Y) and α ∈ F;
(vi) (ST)* = T*S*, where T ∈ B(X, Y), S ∈ B(Y, Z);
(vii) if T⁻¹ ∈ B(Y, X) exists, then (T*)⁻¹ ∈ B(X, Y) exists, and moreover (T*)⁻¹ = (T⁻¹)*;
(viii) if for T ∈ B(X, Y) we define (T*)* = T**, then T** = T; and
(ix) ‖T*T‖ = ‖T‖², where T ∈ B(X, Y).
Proof. To prove part (i) we note that

‖T*y‖² = |(T*y, T*y)| = |(T(T*y), y)| ≤ ‖T(T*y)‖ ‖y‖ ≤ ‖T‖ ‖T*y‖ ‖y‖,

or

‖T*y‖ ≤ ‖T‖ ‖y‖.

From the last inequality it follows that ‖T*‖ ≤ ‖T‖. Reversing the roles of T and T* we obtain

‖Tx‖² = |(Tx, Tx)| = |(T*(Tx), x)| ≤ ‖T*(Tx)‖ ‖x‖ ≤ ‖T*‖ ‖Tx‖ ‖x‖,

or

‖Tx‖ ≤ ‖T*‖ ‖x‖.

From this it follows that ‖T‖ ≤ ‖T*‖, and therefore ‖T‖ = ‖T*‖.

The proofs of properties (ii)–(viii) are trivial. To prove part (ix), we first note that

‖T*T‖ ≤ ‖T*‖ ‖T‖ = ‖T‖ ‖T‖ = ‖T‖².

On the other hand,

‖Tx‖² = (Tx, Tx) = (T*Tx, x) ≤ ‖T*Tx‖ ‖x‖ ≤ ‖T*T‖ ‖x‖ ‖x‖.

Taking the square root on both sides of the above inequality we obtain

‖Tx‖ ≤ √(‖T*T‖) ‖x‖,

and thus ‖T‖ ≤ √(‖T*T‖), or ‖T‖² ≤ ‖T*T‖. Hence, ‖T*T‖ = ‖T‖².
7.3.9. Exercise. Prove parts (ii)–(viii) of Theorem 7.3.8.
From the above discussion it is obvious that adjoint operators are distinct from conjugate operators even though many of their properties appear to be identical, especially for the case of real spaces. We now cite a few examples to illustrate some of the concepts considered above.
7.3.10. Example. Let X = Cⁿ be the Hilbert space with inner product defined in Example 3.6.24, and let A ∈ B(X, X) be represented (with respect to the natural basis for X) by the n × n matrix A = [a_ij]. The transformation y = Ax can be written in the form

η_i = Σ_{j=1}^n a_ij ξ_j, i = 1, 2, ..., n,

where η_i is the ith component of the vector y ∈ X. Let A* denote the adjoint of A on the Hilbert space X, and let A* be represented by the n × n matrix [a*_ij]. Now if u = (u₁, ..., u_n) ∈ X, then

(Ax, u) = (y, u) = Σ_{i=1}^n η_i ū_i = Σ_{i=1}^n (Σ_{j=1}^n a_ij ξ_j) ū_i,

and

(x, A*u) = Σ_{i=1}^n ξ_i (Σ_{j=1}^n ā*_ij ū_j).

In order that (Ax, u) = (x, A*u) we must have a*_ij = ā_ji; i.e., the matrix of A* is the transpose of the conjugate of the matrix of A.
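The conclusion of this example, that the matrix of A* is the conjugate transpose of the matrix of A, can be checked numerically. The sketch below assumes numpy and randomly chosen complex data; the helper `inner` implements the inner product of Example 3.6.24, which is linear in its first argument.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
x = rng.normal(size=n) + 1j * rng.normal(size=n)
u = rng.normal(size=n) + 1j * rng.normal(size=n)

def inner(a, b):
    # Inner product linear in the first argument: (a, b) = sum a_i * conj(b_i).
    return np.sum(a * np.conj(b))

A_star = A.conj().T   # conjugate transpose: the matrix of the adjoint
assert np.isclose(inner(A @ x, u), inner(x, A_star @ u))
```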
7.3.11. Example. Let X = L₂[a, b], a < b (see Example 6.11.10), and define the Fredholm operator T by

y(t) = (Tx)(t) = ∫_a^b k(s, t)x(s) ds, t ∈ [a, b],

where it is assumed that the kernel function k(s, t) is well enough behaved so that

∫_a^b ∫_a^b |k(s, t)|² dt ds < ∞.

Now if u ∈ L₂[a, b], then

(Tx, u) = (y, u) = ∫_a^b y(t)ū(t) dt = ∫_a^b ū(t)[∫_a^b k(s, t)x(s) ds] dt
= ∫_a^b x(s)[∫_a^b k(s, t)ū(t) dt] ds.

From this it follows that the adjoint T* of T maps u into the function

z(t) = (T*u)(t) = ∫_a^b k̄(t, s)u(s) ds;

i.e., the adjoint of T is obtained by interchanging the roles of s and t in the kernel and by utilizing the complex conjugate of k.
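A crude discretization makes the identity (Tx, u) = (x, T*u) for the Fredholm operator visible numerically. The sketch below is an illustration only: numpy, the rectangle-rule quadrature, and the particular real (hence conjugation-free) non-symmetric kernel are all assumptions, not part of the text.

```python
import numpy as np

a, b, m = 0.0, 1.0, 400
t = np.linspace(a, b, m)
w = (b - a) / m                      # rectangle-rule quadrature weight

# Non-symmetric real kernel k(s, t) = e^{s - t}: K[i, j] = k(t_i, t_j).
K = np.exp(np.subtract.outer(t, t))
x = np.sin(2 * np.pi * t)
u = t ** 2

Tx  = w * (K.T @ x)   # (Tx)(t)  = integral of k(s, t) x(s) ds
Tsu = w * (K @ u)     # (T*u)(t) = integral of k(t, s) u(s) ds (roles swapped)

# (Tx, u) = (x, T*u) holds exactly at the discrete level.
lhs = w * np.sum(Tx * u)
rhs = w * np.sum(x * Tsu)
assert np.isclose(lhs, rhs)
```

At the discrete level the two sides are the same double sum Σ_{i,j} k(s_i, t_j) x_i u_j grouped in two different orders, which is exactly the interchange of integrals in the example.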
7.3.12. Exercise. Let X = l₂ (see Example 6.1.6) and define T: l₂ → l₂ by

T(ξ₁, ξ₂, ..., ξ_n, ...) = (0, ξ₁, ξ₂, ..., ξ_n, ...) = y

for all x = (ξ₁, ξ₂, ..., ξ_n, ...) ∈ l₂. Show that T*: l₂ → l₂ is the operator defined by

T*(η₁, η₂, ..., η_n, ...) = (η₂, η₃, ..., η_n, ...)

for all y = (η₁, η₂, ..., η_n, ...) ∈ l₂.
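Truncating to the first n coordinates turns the shift operator of this exercise into an n × n matrix whose adjoint is simply its transpose; the sketch below (numpy assumed) illustrates the asserted left-shift form of T* on this finite section.

```python
import numpy as np

n = 6
# Matrix of the right shift T on the first n coordinates of l2.
T = np.zeros((n, n))
for i in range(1, n):
    T[i, i - 1] = 1.0

x = np.arange(1.0, n + 1)        # (1, 2, ..., n)
y = np.arange(n, 0, -1.0)        # (n, ..., 2, 1)

# T shifts right, inserting a zero in the first coordinate.
assert np.allclose(T @ x, np.concatenate(([0.0], x[:-1])))

# The adjoint is represented by the transpose: the left shift.
assert np.allclose(T.T @ y, np.append(y[1:], 0.0))
assert np.isclose(np.dot(T @ x, y), np.dot(x, T.T @ y))
```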
Recalling the definition of orthogonal complement (refer to Definition 6.12.1), we have the following important results for bounded linear operators on Hilbert spaces. In the statement below, cl R(T) denotes the closure of the range of T, and N(T) the null space of T.

7.3.13. Theorem. Let T be a bounded linear operator on a Hilbert space X into a Hilbert space Y. Then,

(i) cl R(T) = N(T*)⊥;
(ii) R(T)⊥ = N(T*);
(iii) cl R(T*) = N(T)⊥;
(iv) R(T*)⊥ = N(T);
(v) N(T*) = N(TT*); and
(vi) cl R(T) = cl R(TT*).

Proof. We prove (i) and (v) and leave the proofs of (ii)–(iv) and (vi) as an exercise.

To prove (i), we first show that R(T)⊥ = N(T*). Let y ∈ R(T)⊥. Then (Tx, y) = 0 for all x ∈ X, and hence (x, T*y) = 0 for all x ∈ X. This can be true only if T*y = 0; i.e., y ∈ N(T*). On the other hand, if y ∈ N(T*), then (x, T*y) = 0 for all x ∈ X. Thus, (Tx, y) = 0 for every x ∈ X, which implies that y ∈ R(T)⊥. Now R(T) need not be closed. However, cl R(T) = R(T)⊥⊥ (see Section 6.12). Therefore, cl R(T) = R(T)⊥⊥ = N(T*)⊥.

To prove (v), let y ∈ N(T*). Then T*y = 0 and TT*y = 0. This implies that N(T*) ⊂ N(TT*). Next, let y ∈ N(TT*). Then TT*y = 0 and (y, TT*y) = 0. This implies that (T*y, T*y) = 0, so that T*y = 0. Therefore, y ∈ N(T*) and N(TT*) ⊂ N(T*), completing the proof of part (v).
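Part (ii), R(T)⊥ = N(T*), can be seen concretely for a rank-deficient real matrix, where T* is the transpose. The sketch below (numpy assumed; the matrix and vectors are arbitrary illustrative choices) exhibits vectors spanning N(T*) and checks that each annihilates every Tx.

```python
import numpy as np

# A rank-one T: R(T) is the line in R^3 spanned by (1, 2, 0).
T = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [0.0, 0.0]])

# Two vectors spanning N(T*); here T* = T^T since T is real.
z1 = np.array([-2.0, 1.0, 0.0])
z2 = np.array([0.0, 0.0, 1.0])
assert np.allclose(T.T @ z1, 0) and np.allclose(T.T @ z2, 0)

# Each lies in R(T)-perp: it is orthogonal to every vector Tx.
rng = np.random.default_rng(1)
for _ in range(5):
    x = rng.normal(size=2)
    assert np.isclose(np.dot(T @ x, z1), 0)
    assert np.isclose(np.dot(T @ x, z2), 0)
```

Since R(T) is one-dimensional here, N(T*) is the full two-dimensional orthogonal complement, in agreement with the theorem.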
7.3.14. Exercise. Prove parts (ii)–(iv) and (vi) of Theorem 7.3.13.

We conclude this section with the following results.

7.3.15. Theorem. Let T ∈ B(X, X), where X is a Hilbert space, and let M and N be subsets of X. Define T(M) as

T(M) = {y : y = Tx, x ∈ M}.

If T(M) ⊂ N, then T*(N⊥) ⊂ M⊥.

Proof. Let z ∈ N⊥. Then for x ∈ M we have (Tx, z) = 0 = (x, T*z). Therefore, T*z ⊥ x for all x ∈ M and T*z ∈ M⊥.

7.3.16. Theorem. Let T ∈ B(X, X), where X is a Hilbert space, and let M and N be closed linear subspaces of X. Then T(M) ⊂ N if and only if T*(N⊥) ⊂ M⊥.

Proof. If T(M) ⊂ N, then by Theorem 7.3.15, T*(N⊥) ⊂ M⊥. Conversely, if T*(N⊥) ⊂ M⊥, then by Theorem 7.3.15, T**(M⊥⊥) ⊂ N⊥⊥. But T** = T, and if M and N are closed linear subspaces, then M⊥⊥ = M and N⊥⊥ = N. Therefore, T(M) ⊂ N.
7.4. HERMITIAN OPERATORS

Throughout this section X denotes a complex Hilbert space. We shall be primarily concerned with operators T ∈ B(X, X). By T* we shall always mean the adjoint of T.

For our first result, recall the definition of a bilinear functional (see Section 3.6).

7.4.1. Theorem. Let T ∈ B(X, X) and define the function φ: X × X → C by φ(x, y) = (Tx, y) for all x, y ∈ X. Then φ is a bilinear functional.

7.4.2. Exercise. Prove Theorem 7.4.1.

Of central importance in this section is the following class of operators.

7.4.3. Definition. A bounded linear transformation T ∈ B(X, X) is said to be hermitian if T = T*.
Some authors call such transformations self-adjoint operators (see Definition 4.10.20).
The next two results allow us to characterize a hermitian operator in an equivalent manner. The first of these involves symmetric bilinear forms (see Definition 3.6.10).

7.4.4. Theorem. Let T ∈ B(X, X). Then T is hermitian if and only if the bilinear functional φ(x, y) = (Tx, y) is symmetric.

Proof. If T* = T, then φ(x, y) = (Tx, y) = (x, T*y) = (x, Ty) = \overline{(Ty, x)} = \overline{φ(y, x)}, and therefore φ is symmetric.

Conversely, assume that φ(x, y) = \overline{φ(y, x)}. Then \overline{φ(y, x)} = \overline{(Ty, x)} = (x, Ty) = φ(x, y) = (Tx, y) = (x, T*y); i.e., (x, Ty) = (x, T*y) for all x, y ∈ X. From this it follows that

(x, (T* − T)y) = 0,

and thus (T* − T)y ⊥ x for all x ∈ X. This implies that (T* − T)y = 0 for all y ∈ X, or T* = T.
7.4.5. Theorem. Let T ∈ B(X, X). Then T is hermitian if and only if (Tx, x) is real for every x ∈ X.

Proof. If T is hermitian, then (Tx, y) = (x, Ty). Setting y = x, we obtain (Tx, x) = (x, Tx) = \overline{(Tx, x)}, which implies that (Tx, x) is real.

Conversely, suppose (x, Tx) = \overline{(x, Tx)} for all x ∈ X. Then (x, Tx) = (Tx, x). Now consider (x, Ty) for arbitrary x, y ∈ X. It is easily verified that

(x, Ty) = (1/4)[(x + y, T(x + y)) − (x − y, T(x − y)) + i(x + iy, T(x + iy)) − i(x − iy, T(x − iy))],   (7.4.6)

where i = √−1. Also,

(Tx, y) = (1/4)[(T(x + y), x + y) − (T(x − y), x − y) + i(T(x + iy), x + iy) − i(T(x − iy), x − iy)].   (7.4.7)

Since (z, Tz) = (Tz, z) for every z ∈ X, the right-hand sides of Eqs. (7.4.6) and (7.4.7) are equal, and it follows that (x, Ty) = (Tx, y) for all x, y ∈ X, and hence T = T*.
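The "only if" direction of Theorem 7.4.5 is easy to observe numerically: for a hermitian matrix H, the quadratic form (Hx, x) has vanishing imaginary part. The sketch below assumes numpy and uses random complex data; H is made hermitian by construction.

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = B + B.conj().T              # hermitian by construction: H = H*
assert np.allclose(H, H.conj().T)

for _ in range(5):
    x = rng.normal(size=3) + 1j * rng.normal(size=3)
    # np.vdot conjugates its first argument, so this is (Hx, x).
    q = np.vdot(x, H @ x)
    assert abs(q.imag) < 1e-8   # the quadratic form is real
```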
The norm of a hermitian operator can be found as follows.

7.4.8. Theorem. Let T ∈ B(X, X) be a hermitian operator. Then the norm of T can be expressed in the following equivalent ways:

(i) ‖T‖ = sup{|(Tx, x)| : ‖x‖ ≤ 1}; and
(ii) ‖T‖ = sup{|(Tx, y)| : ‖x‖ ≤ 1, ‖y‖ ≤ 1}.
7.4.9. Exercise. Prove Theorem 7.4.8.
In the next theorem, some of the more important properties of hermitian operators are given.

7.4.10. Theorem. Let S, T ∈ B(X, X) be hermitian operators, and let α be a real scalar. Then

(i) (S + T) is a hermitian operator;
(ii) αT is a hermitian operator;
(iii) if T is bijective, then T⁻¹ is hermitian; and
(iv) ST is hermitian if and only if ST = TS.

7.4.11. Exercise. Prove Theorem 7.4.10.
Since in the case of hermitian operators (Tx, x) is real for all x ∈ X, the following definition concerning definiteness applies (recall Definition 3.6.10).

7.4.12. Definition. Let T ∈ B(X, X) be a hermitian operator. Then T is said to be positive if (Tx, x) ≥ 0 for all x ∈ X. In this case we write T ≥ 0. If (Tx, x) > 0 for all x ≠ 0, we say that T is strictly positive.

7.4.13. Definition. Let S, T ∈ B(X, X) be hermitian operators. If the hermitian operator T − S satisfies T − S ≥ 0, then we write T ≥ S.

7.4.14. Theorem. Let S, T, U ∈ B(X, X) be hermitian operators, and let α be a real scalar. Then,

(i) if S ≥ 0 and T ≥ 0, then (S + T) ≥ 0;
(ii) if α ≥ 0 and T ≥ 0, then αT ≥ 0;
(iii) if S ≤ T and T ≤ U, then S ≤ U; and
(iv) for any V ∈ B(X, X), if T ≥ 0, then V*TV ≥ 0. In particular, V*V ≥ 0.

Proof. The proofs of parts (i)–(iii) are obvious. For example, if S ≥ 0 and T ≥ 0, then (Sx, x) + (Tx, x) = ((S + T)x, x) ≥ 0 and (S + T) ≥ 0.

To prove part (iv) we note that (V*TVx, x) = (TVx, Vx) ≥ 0, since Vx = y is a vector in X and (Ty, y) ≥ 0 for all y ∈ X. If we consider, in particular, T = I = I*, then V*V ≥ 0.
The proof of the next result follows by direct verification of the formulas involved.

7.4.15. Theorem. Let A ∈ B(X, X), and let

U = (1/2)(A + A*) and V = (1/2i)(A − A*),

where i = √−1. Then

(i) U and V are hermitian operators; and
(ii) if A = C + iD, where C and D are hermitian, then C = U and D = V.

7.4.16. Exercise. Prove Theorem 7.4.15.
Let us now consider some specific cases.

7.4.17. Example. Let X = Cⁿ with inner product given in Example 3.6.24. Let A ∈ B(X, X), and let {e₁, ..., e_n} be any orthonormal basis for X. As we saw in Example 7.3.10, if A is represented by the matrix A, then A* is represented by A* = Āᵀ. In this case A is hermitian if and only if A = Āᵀ.
7.4.18. Example. Let X = L₂[a, b] (see Example 6.11.10), and define T ∈ B(X, X) by

(Tx)(t) = t x(t).

Then for any x, z ∈ X we have

(Tx, z) = ∫_a^b t x(t) z̄(t) dt = ∫_a^b x(t) \overline{t z(t)} dt = (x, Tz) = (T*x, z).

Thus, T = T* and T is hermitian.
7.4.19. Exercise. Let X = L₂[a, b], and define T: X → X by

(Tx)(t) = ∫_a^t x(s) ds.

Show that T* ≠ T and therefore T is not hermitian.
7.4.20. Exercise. Let X = L₂[a, b] and consider the Fredholm operator given in Example 7.3.11; i.e.,

y(t) = (Tx)(t) = ∫_a^b k(s, t)x(s) ds, t ∈ [a, b].

Show that T = T* if and only if k(t, s) = k̄(s, t).
We conclude this section with the following result, which we will subsequently require.

7.4.21. Theorem. Let X be a Hilbert space, let T ∈ B(X, X) be a hermitian operator, and let λ ∈ R. Then there exists a real number γ > 0 such that γ‖x‖ ≤ ‖(T − λI)x‖ for all x ∈ X if and only if (T − λI) is bijective and (T − λI)⁻¹ ∈ B(X, X), in which case ‖(T − λI)⁻¹‖ ≤ 1/γ.

Proof. Let T_λ = T − λI. It follows from Theorem 7.4.10 that T_λ is also hermitian.

To prove sufficiency, let T_λ⁻¹ ∈ B(X, X). It follows that for all y ∈ X, ‖T_λ⁻¹y‖ ≤ ‖T_λ⁻¹‖ ‖y‖. Letting y = T_λx and γ = [‖T_λ⁻¹‖]⁻¹, we have ‖T_λx‖ ≥ γ‖x‖ for all x ∈ X.

To prove necessity, let γ > 0 be such that γ‖x‖ ≤ ‖T_λx‖ for all x ∈ X. We see that T_λx = 0 implies x = 0; i.e., N(T_λ) = {0}, and so T_λ is injective. We next show that cl R(T_λ) = X. It follows from Theorem 6.12.16 that X = cl R(T_λ) ⊕ R(T_λ)⊥. From Theorem 7.3.13, we have R(T_λ)⊥ = N(T_λ*). Since T_λ is hermitian, N(T_λ*) = N(T_λ) = {0}. Hence, cl R(T_λ) = X. We next show that cl R(T_λ) = R(T_λ); i.e., the range of T_λ is closed. Let {y_n} be a sequence in R(T_λ) such that y_n → y. Then there is a sequence {x_n} in X such that T_λx_n = y_n. For any positive integers m and n,

γ‖x_m − x_n‖ ≤ ‖T_λx_m − T_λx_n‖ = ‖y_m − y_n‖.

Since {y_n} is Cauchy, {x_n} must also be Cauchy. Let x_n → x. Then y_n = T_λx_n → T_λx = y. Thus, y ∈ R(T_λ) and so R(T_λ) is closed. This proves that T_λ is bijective. Finally, γ‖T_λ⁻¹y‖ ≤ ‖y‖ for all y ∈ X implies T_λ⁻¹ ∈ B(X, X) and ‖T_λ⁻¹‖ ≤ 1/γ. This completes the proof of the theorem.
7.5. OTHER LINEAR OPERATORS:
NORMAL OPERATORS, PROJECTIONS,
UNITARY OPERATORS,
AND ISOMETRIC OPERATORS

In this section we consider additional important types of linear operators. Throughout this section X is a complex Hilbert space, T* denotes the adjoint of T ∈ B(X, X), and I ∈ B(X, X) denotes the identity operator.

7.5.1. Definition. An operator T ∈ B(X, X) is said to be a normal operator if T*T = TT*.

7.5.2. Definition. An operator T ∈ B(X, X) is said to be an isometric operator if T*T = I.

7.5.3. Definition. An operator T ∈ B(X, X) is said to be a unitary operator if T*T = TT* = I.

Our first result is for normal operators.
7.5.4. Theorem. Let T ∈ B(X, X). Let U, V ∈ B(X, X) be hermitian operators such that T = U + iV. Then T is normal if and only if UV = VU.

7.5.5. Exercise. Prove Theorem 7.5.4. Recall that U and V are unique by Theorem 7.4.15.

For the next result, recall that a linear subspace Y of X is invariant under a linear transformation T if T(Y) ⊂ Y (see Definition 3.7.9). Also, recall that a closed linear subspace Y of a Hilbert space X is itself a Hilbert space with inner product induced by the inner product on X (see Theorem 6.2.1).

7.5.6. Theorem. Let T ∈ B(X, X) be a normal operator, and let Y be a closed linear subspace of X which is invariant under T. Let T₁ be the restriction of T to Y. Then T₁ ∈ B(Y, Y) and T₁ is normal.

7.5.7. Exercise. Prove Theorem 7.5.6.

For isometric operators we have the following result.

7.5.8. Theorem. Let T ∈ B(X, X). Then the following are equivalent:

(i) T is isometric;
(ii) (Tx, Ty) = (x, y) for all x, y ∈ X; and
(iii) ‖Tx − Ty‖ = ‖x − y‖ for all x, y ∈ X.

Proof. If T is isometric, then (Tx, Ty) = (T*Tx, y) = (Ix, y) = (x, y) for all x, y ∈ X.

Next, assume that (Tx, Ty) = (x, y). Then

‖Tx − Ty‖² = ‖T(x − y)‖² = (T(x − y), T(x − y)) = (x − y, x − y) = ‖x − y‖²;

i.e., ‖Tx − Ty‖ = ‖x − y‖.

Finally, assume that ‖Tx − Ty‖ = ‖x − y‖. Then, taking y = 0, (T*Tx, x) = (Tx, Tx) = ‖Tx‖² = ‖x‖² = (x, x); i.e., (T*Tx, x) = (x, x) for all x ∈ X. But this implies that T*T = I; i.e., T is isometric.

From Theorem 7.5.8 there follows the next corollary.

7.5.9. Corollary. If T ∈ B(X, X) is an isometric operator, then ‖Tx‖ = ‖x‖ for all x ∈ X and ‖T‖ = 1.
For unitary operators we have the following result.

7.5.10. Theorem. Let T ∈ B(X, X). Then the following are equivalent:

(i) T is unitary;
(ii) T* is unitary;
(iii) T and T* are isometric;
(iv) T is isometric and T* is injective;
(v) T is isometric and surjective; and
(vi) T is bijective and T⁻¹ = T*.

7.5.11. Exercise. Prove Theorem 7.5.10.
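In Cⁿ a unitary operator is a unitary matrix, and the equivalences above can be sampled numerically. The sketch below (numpy assumed) builds a unitary matrix as the Q factor of a QR factorization of a random complex matrix and checks U*U = UU* = I together with the isometry property.

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
U, _ = np.linalg.qr(M)          # the unitary factor of M

I = np.eye(4)
assert np.allclose(U.conj().T @ U, I)   # U*U = I (isometric)
assert np.allclose(U @ U.conj().T, I)   # UU* = I (hence unitary)

# Isometry: ||Ux|| = ||x|| for sampled vectors.
for _ in range(3):
    x = rng.normal(size=4) + 1j * rng.normal(size=4)
    assert np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x))
```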
Before considering projections, let us briefly return to Section 3.7. Recall that if X (a linear space) is the direct sum of two linear subspaces X₁ and X₂, i.e., X = X₁ ⊕ X₂, then for each x ∈ X there exist unique x₁ ∈ X₁ and x₂ ∈ X₂ such that x = x₁ + x₂. We call the mapping P: X → X defined by Px = x₁ the projection on X₁ along X₂. Recall that P ∈ L(X, X), R(P) = X₁, and N(P) = X₂. Furthermore, recall that if P ∈ L(X, X) is such that P² = P, then P is said to be idempotent, and this condition is both necessary and sufficient for P to be a projection on R(P) along N(P) (see Section 3.7). Now if X is a Hilbert space and if Y is a closed linear subspace of X, then X = Y ⊕ Y⊥ (see Theorem 6.12.16). If for this particular case P is the projection on Y along Y⊥, then P is an orthogonal projection (see Definition 3.7.16). In this case we shall simply call P the orthogonal projection on Y.
7.5.12. Theorem. Let Y be a closed linear subspace of X such that Y ≠ {0} and Y ≠ X. Let P be the orthogonal projection onto Y. Then

(i) P ∈ B(X, X);
(ii) ‖P‖ = 1; and
(iii) P* = P.

Proof. We know that P ∈ L(X, X). To show that P is bounded let x = x₁ + x₂, where x₁ ∈ Y and x₂ ∈ Y⊥. Then ‖Px‖ = ‖x₁‖ ≤ ‖x‖. Hence, P is bounded and ‖P‖ ≤ 1. If x₂ = 0, then ‖Px‖ = ‖x‖, and so ‖P‖ = 1.

To prove (iii), let x, y ∈ X be given by x = x₁ + x₂ and y = y₁ + y₂, respectively, where x₁, y₁ ∈ Y and x₂, y₂ ∈ Y⊥. Then (x, Py) = (x₁ + x₂, y₁) = (x₁, y₁) and (Px, y) = (x₁, y₁ + y₂) = (x₁, y₁). Thus, (x, Py) = (Px, y) for all x, y ∈ X. This implies that P = P*.

From the above theorem it follows that an orthogonal projection is a hermitian operator.
7.5.13. Theorem. Let Y be a closed linear subspace of X, and let P be the orthogonal projection onto Y. If Y₁ = {x ∈ X : Px = x} and if Y₂ is the range of P, then Y = Y₁ = Y₂.
Proof. Since Px = x for each x ∈ Y, we have Y ⊂ Y₁; since Y₁ ⊂ Y₂ and since Y₂ ⊂ Y, it follows that Y = Y₁ = Y₂.
Chapter 7 | Linear Operators
7.5.14. Theorem. Let P ∈ L(X, X). If P is idempotent and hermitian, then
Y = {x ∈ X : Px = x}
is a closed linear subspace of X, and P is the orthogonal projection onto Y.
Proof. Since P is a linear operator we have
P(αx + βy) = αPx + βPy.
If x, y ∈ Y, then Px = x and Py = y, and it follows that
P(αx + βy) = αx + βy.
Therefore, (αx + βy) ∈ Y and Y is a linear subspace of X. We must show that Y is a closed linear subspace. First, however, we show that P is bounded and therefore continuous. Since
||Px||² = (Px, Px) = (P*Px, x) = (P²x, x) = (Px, x) ≤ ||Px|| ||x||,
we have ||Px|| ≤ ||x|| and ||P|| ≤ 1.
To show that Y is a closed linear subspace of X, let x₀ be a point of accumulation of the space Y. Then there is a sequence {xₙ} of vectors in Y such that lim ||xₙ − x₀|| = 0. Since xₙ ∈ Y, we can put xₙ = Pxₙ, and we have ||Pxₙ − x₀|| → 0 as n → ∞. Since P is bounded, it is continuous, and thus we also have ||Pxₙ − Px₀|| → 0 as n → ∞, and hence x₀ = Px₀ and x₀ ∈ Y.
Finally, we must show that P is an orthogonal projection. Let x ∈ Y, and let y ∈ Y⊥. Then (Py, x) = (y, Px) = (y, x) = 0, since x ⊥ y. Therefore, Py ⊥ x and Py ∈ Y⊥. But P(Py) = Py, since P² = P, and thus Py ∈ Y. Therefore, it follows that Py = 0, because Py ∈ Y and Py ∈ Y⊥. Now let z = x + y ∈ X, where x ∈ Y and y ∈ Y⊥. Then Pz = Px + Py = x. Hence, P is an orthogonal projection onto Y.
The next result is a direct consequence of Theorem 7.5.14.
7.5.15. Corollary. Let Y be a closed linear subspace of X, and let P be the orthogonal projection onto Y. Then P(Y⊥) = {0}.
7.5.16. Exercise. Prove Corollary 7.5.15.
The next result yields the representation of an orthogonal projection onto a finite-dimensional subspace of X.
7.5.17. Theorem. Let {x₁, ..., xₙ} be a finite orthonormal set in X, and let Y be the linear subspace of X generated by {x₁, ..., xₙ}. Then the orthogonal projection of X onto Y is given by

Px = Σᵢ₌₁ⁿ (x, xᵢ)xᵢ for all x ∈ X.

Proof. We first note that Y is a closed linear subspace of X by Theorem 6.6.6. We now show that P is a projection by proving that P² = P. For any j = 1, ..., n we have

Pxⱼ = Σᵢ₌₁ⁿ (xⱼ, xᵢ)xᵢ = xⱼ.

Hence, for any x ∈ X we have

P²x = Σᵢ₌₁ⁿ (x, xᵢ)Pxᵢ = Σᵢ₌₁ⁿ (x, xᵢ)xᵢ = Px.    (7.5.18)

Next, we show that ℜ(P) = Y. It is clear that ℜ(P) ⊂ Y. To show that Y ⊂ ℜ(P), let y ∈ Y. Then

y = α₁x₁ + ... + αₙxₙ

for some α₁, ..., αₙ. It follows from Eq. (7.5.18) that Py = y, and so y ∈ ℜ(P).
Finally, to show that P is an orthogonal projection, we must show that ℜ(P) ⊥ 𝔑(P). To do so, let x ∈ 𝔑(P) and let y ∈ ℜ(P). Then

(x, y) = (x, Py) = (x, Σᵢ₌₁ⁿ (y, xᵢ)xᵢ) = Σᵢ₌₁ⁿ (x, xᵢ)(xᵢ, y) = (Σᵢ₌₁ⁿ (x, xᵢ)xᵢ, y) = (Px, y) = (0, y) = 0.

This completes the proof.
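In a finite-dimensional space the formula of Theorem 7.5.17 can be checked directly. The following sketch is an illustration only (the choice of the space C⁴ and of the orthonormal set is arbitrary, not taken from the text): it builds the matrix of Px = Σᵢ (x, xᵢ)xᵢ from an orthonormal pair and verifies that P is idempotent, hermitian, and acts as the identity on Y.

```python
import numpy as np

# Orthonormalize two random vectors in C^4 via QR (the columns of Q are orthonormal).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
Q, _ = np.linalg.qr(A)          # columns x_1, x_2 of an orthonormal set

# Px = sum_i (x, x_i) x_i  corresponds to the matrix  P = sum_i x_i x_i^*.
P = sum(np.outer(Q[:, i], Q[:, i].conj()) for i in range(2))

assert np.allclose(P @ P, P)        # idempotent: P^2 = P  (Eq. (7.5.18))
assert np.allclose(P.conj().T, P)   # hermitian:  P* = P

# P acts as the identity on Y = span{x_1, x_2}:
y = 2.0 * Q[:, 0] - 1j * Q[:, 1]
assert np.allclose(P @ y, y)
```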
Referring to Definition 3.7.12, we recall that if Y and Z are linear subspaces of X (a linear space) such that X = Y ⊕ Z, and if T ∈ L(X, X) is such that both Y and Z are invariant under T, then T is said to be reduced by Y and Z. When X is a Hilbert space, we make the following definition.
7.5.19. Definition. Let Y be a closed linear subspace of X, and let T ∈ B(X, X). Then Y is said to reduce T if Y and Y⊥ are invariant under T.
Note that in view of Theorem 6.12.16, Definitions 3.7.12 and 7.5.19 are consistent.
The proof of the next theorem is straightforward.
7.5.20. Theorem. Let Y be a closed linear subspace of X, and let T ∈ B(X, X). Then
(i) Y is invariant under T if and only if Y⊥ is invariant under T*; and
(ii) Y reduces T if and only if Y is invariant under T and T*.
7.5.21. Exercise. Prove Theorem 7.5.20.
7.5.22. Theorem. Let Y be a closed linear subspace of X, let P be the orthogonal projection onto Y, let T ∈ B(X, X), and let I denote the identity operator on X. Then
(i) Y is invariant under T if and only if TP = PTP;
(ii) Y reduces T if and only if TP = PT; and
(iii) (I − P) is the orthogonal projection onto Y⊥.
Proof. To prove (i), assume that TP = PTP. Then for any x ∈ X we have T(Px) = P(TPx) ∈ Y, since P applied to any vector of X is in Y. Conversely, if Y is invariant under T, then for any vector x ∈ X we have T(Px) ∈ Y, because Px ∈ Y. Thus, P(TPx) = TPx for every x ∈ X.
To prove (ii), assume that PT = TP. Then PTP = P²T = PT = TP. Therefore, PTP = TP, and it follows from (i) that Y is invariant under T. To prove that Y reduces T we must show that Y is invariant under T*. Since P is hermitian we have T*P = (PT)* = (TP)* = P*T* = PT*; i.e., T*P = PT*. But above we showed that PTP = TP. Applying this to T* we obtain T*P = PT*P. In view of (i), Y is now invariant under T*. Therefore, the closed linear subspace Y reduces the linear operator T.
Conversely, assume that Y reduces T. By part (i), TP = PTP and T*P = PT*P. Thus, PT = (T*P)* = (PT*P)* = PTP = TP; i.e., TP = PT.
To prove (iii) we first show that (I − P) is hermitian. We note that (I − P)* = I* − P* = I − P. Next, we show that (I − P) is idempotent. We observe that (I − P)² = (I − 2P + P²) = (I − 2P + P) = (I − P). Finally, we note that x = (I − P)x if and only if Px = 0, which holds if and only if x ∈ Y⊥. Thus,
Y⊥ = {x ∈ X : (I − P)x = x}.
It follows from Theorem 7.5.14 that (I − P) is a projection onto Y⊥.
The next result follows immediately from part (iii) of the preceding theorem.
7.5.23. Theorem. Let Y be a closed linear subspace of X, and let P be the orthogonal projection on Y. If ||Px|| = ||x||, then Px = x, and consequently x ∈ Y.
7.5.24. Exercise. Prove Theorem 7.5.23.
We leave the proof of the following result as an exercise.
7.5.25. Theorem. Let Y and Z be closed linear subspaces of X, and let P and Q be the orthogonal projections on Y and Z, respectively. Let 0 denote the zero transformation in B(X, X). The following are equivalent:
(i) Y ⊥ Z;
(ii) PQ = 0;
(iii) QP = 0;
(iv) P(Z) = {0}; and
(v) Q(Y) = {0}.
7.5.26. Exercise. Prove Theorem 7.5.25.
For the product of two orthogonal projections we have the following result.
7.5.27. Theorem. Let Y₁ and Y₂ be closed linear subspaces of X, and let P₁ and P₂ be the orthogonal projections onto Y₁ and Y₂, respectively. The product transformation P₁P₂ is an orthogonal projection if and only if P₁ commutes with P₂. In this case the range of P₁P₂ is Y₁ ∩ Y₂.
Proof. Assume that P₁P₂ = P₂P₁. Then (P₁P₂)* = P₂*P₁* = P₂P₁ = P₁P₂; i.e., if P₁P₂ = P₂P₁, then (P₁P₂)* = P₁P₂. Also, (P₁P₂)² = P₁P₂P₁P₂ = P₁P₁P₂P₂ = P₁P₂; i.e., if P₁P₂ = P₂P₁, then P₁P₂ is idempotent. Therefore, P₁P₂ is an orthogonal projection.
Conversely, assume that P₁P₂ is an orthogonal projection. Then (P₁P₂)* = P₂*P₁* = P₂P₁, and also (P₁P₂)* = P₁P₂. Hence, P₁P₂ = P₂P₁.
Finally, we must show that the range of P₁P₂ is equal to Y₁ ∩ Y₂. Assume that x ∈ ℜ(P₁P₂). Then P₁P₂x = x, because P₁P₂ is an orthogonal projection. Also, P₁P₂x = P₁(P₂x) ∈ Y₁, because any vector operated on by P₁ is in Y₁. Similarly, P₂P₁x = P₂(P₁x) ∈ Y₂. Now, by hypothesis, P₁P₂ = P₂P₁, and therefore x = P₁P₂x = P₂P₁x ∈ Y₁ ∩ Y₂. Thus, whenever x ∈ ℜ(P₁P₂), then x ∈ Y₁ ∩ Y₂. This implies that ℜ(P₁P₂) ⊂ Y₁ ∩ Y₂. To show that ℜ(P₁P₂) ⊃ Y₁ ∩ Y₂, assume that x ∈ Y₁ ∩ Y₂. Then P₁P₂x = P₁(P₂x) = P₁x = x ∈ ℜ(P₁P₂). Thus, Y₁ ∩ Y₂ ⊂ ℜ(P₁P₂). Therefore, ℜ(P₁P₂) = Y₁ ∩ Y₂.
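Theorem 7.5.27 is easy to visualize with matrices. In the sketch below (an illustration only; the two coordinate subspaces of R⁴ are hypothetical choices), the projections commute, so their product is again an orthogonal projection, and its range is the intersection of the two ranges.

```python
import numpy as np

# Orthogonal projections onto coordinate subspaces of R^4; such projections commute.
P1 = np.diag([1.0, 1.0, 1.0, 0.0])   # projection onto Y1 = span{e1, e2, e3}
P2 = np.diag([0.0, 1.0, 1.0, 1.0])   # projection onto Y2 = span{e2, e3, e4}

assert np.allclose(P1 @ P2, P2 @ P1)       # P1 commutes with P2
P = P1 @ P2
assert np.allclose(P @ P, P)               # so P1 P2 is idempotent ...
assert np.allclose(P.T, P)                 # ... and hermitian: an orthogonal projection
# Its range is Y1 ∩ Y2 = span{e2, e3}:
assert np.allclose(P, np.diag([0.0, 1.0, 1.0, 0.0]))
```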
7.5.28. Theorem. Let Y and Z be closed linear subspaces of X, and let P and Q be the orthogonal projections onto Y and Z, respectively. The following are equivalent:
(i) P ≤ Q;
(ii) ||Px|| ≤ ||Qx|| for all x ∈ X;
(iii) Y ⊂ Z;
(iv) QP = P; and
(v) PQ = P.
Proof. Assume that P ≤ Q. Since P and Q are orthogonal projections, they are hermitian. For a hermitian operator, P ≥ 0 means (Px, x) ≥ 0 for all x ∈ X. If P ≤ Q, then (Px, x) ≤ (Qx, x) for all x ∈ X, or (P²x, x) ≤ (Q²x, x), or (Px, Px) ≤ (Qx, Qx), or ||Px||² ≤ ||Qx||², and hence ||Px|| ≤ ||Qx|| for all x ∈ X.
Next, assume that ||Px|| ≤ ||Qx|| for all x ∈ X. If x ∈ Y, then Px = x and
(x, x) = (Px, Px) = ||Px||² ≤ ||Qx||² ≤ ||Q||²||x||² ≤ ||x||² = (x, x),
and therefore ||Qx|| = ||x||. From Theorem 7.5.23 it now follows that Qx = x, and hence x ∈ Z. Thus, whenever x ∈ Y, then x ∈ Z, and Y ⊂ Z.
Now assume that Y ⊂ Z, and let y = Px, where x is any vector in X. Then y ∈ Y ⊂ Z, so that QPx = Qy = y = Px for all x ∈ X, and QP = P.
Suppose now that QP = P. Then (QP)* = P*; i.e., P*Q* = PQ = P* = P, and PQ = P.
Finally, assume that PQ = P. For any x ∈ X we have (Px, x) = ||Px||² = ||PQx||² ≤ ||P||²||Qx||² ≤ ||Qx||² = (Qx, Qx) = (Q²x, x) = (Qx, x); i.e., (Px, x) ≤ (Qx, x), from which we have P ≤ Q.
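A finite-dimensional illustration of Theorem 7.5.28 (the nested coordinate subspaces chosen here are arbitrary, not from the text): with Y = span{e₁} contained in Z = span{e₁, e₂}, the equivalent conditions can be checked directly.

```python
import numpy as np

P = np.diag([1.0, 0.0, 0.0])   # orthogonal projection onto Y = span{e1}
Q = np.diag([1.0, 1.0, 0.0])   # orthogonal projection onto Z = span{e1, e2}, Y ⊂ Z

# Conditions (iv) and (v): QP = P and PQ = P.
assert np.allclose(Q @ P, P) and np.allclose(P @ Q, P)

# Condition (ii): ||Px|| <= ||Qx|| for every x.
x = np.array([1.0, 2.0, 3.0])
assert np.linalg.norm(P @ x) <= np.linalg.norm(Q @ x)
```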
We leave the proof of the next result as an exercise.
7.5.29. Theorem. Let Y₁ and Y₂ be closed linear subspaces of X, and let P₁ and P₂ be the orthogonal projections onto Y₁ and Y₂, respectively. The difference transformation P = P₁ − P₂ is an orthogonal projection if and only if P₂ ≤ P₁. The range of P is Y₁ ∩ Y₂⊥.
7.5.30. Exercise. Prove Theorem 7.5.29.
We close this section by considering some specific cases.
7.5.31. Example. Let R denote the transformation from E² into E² given in Example 4.10.48. That transformation is represented by the matrix

R = [ cos θ   sin θ ]
    [ −sin θ  cos θ ]

with respect to an orthonormal basis {e₁, e₂}. By direct computation we obtain

R* = [ cos θ  −sin θ ]
     [ sin θ   cos θ ].

It readily follows that R*R = RR* = I. Therefore, R is a linear transformation which is isometric, unitary, and normal.
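The computation in Example 7.5.31 can be replayed numerically; the angle and test vector below are arbitrary illustrative choices.

```python
import numpy as np

theta = 0.7  # an arbitrary rotation angle
R = np.array([[np.cos(theta),  np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

# R* is the transpose here (real entries), and R*R = RR* = I:
assert np.allclose(R.T @ R, np.eye(2))
assert np.allclose(R @ R.T, np.eye(2))

# Isometry: rotations preserve the norm of every vector.
x = np.array([3.0, 4.0])
assert np.isclose(np.linalg.norm(R @ x), np.linalg.norm(x))
```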
7.5.32. Exercise. Let X = L₂[0, ∞), and define the truncation operator P_T by y = P_Tx, where

y(t) = x(t) for all 0 ≤ t ≤ T,
y(t) = 0 for all t > T.

Show that P_T is an orthogonal projection with range

ℜ(P_T) = {x ∈ X : x(t) = 0 for t > T},

and null space

𝔑(P_T) = {x ∈ X : x(t) = 0 for all t ≤ T}.
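A discrete analogue of the truncation operator (sampling a signal at n points and zeroing the tail, an illustrative stand-in for L₂[0, ∞); the sizes below are arbitrary) makes the projection properties of Exercise 7.5.32 concrete.

```python
import numpy as np

# Discrete truncation: keep the first m samples, zero the rest.
n, m = 8, 5
PT = np.diag([1.0] * m + [0.0] * (n - m))

assert np.allclose(PT @ PT, PT)   # idempotent
assert np.allclose(PT.T, PT)      # hermitian

x = np.arange(1.0, n + 1.0)
y = PT @ x
assert np.all(y[m:] == 0.0)                        # truncated tail: range condition
assert np.allclose(x - y, (np.eye(n) - PT) @ x)    # I - PT projects onto the null space
```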
Additional examples of different types of operators are considered in Section 7.10.
7.6. THE SPECTRUM OF AN OPERATOR

In Chapter 4 we introduced and discussed eigenvalues and eigenvectors of linear transformations defined on finite-dimensional vector spaces. In the present section we continue this discussion in the setting of infinite-dimensional spaces.
Unless otherwise stated, X will denote a complex Banach space and I will denote the identity operator on X. However, in our first definition, X may be an arbitrary vector space over a field F.
7.6.1. Definition. Let T ∈ L(X, X). A scalar λ ∈ F is called an eigenvalue of T if there exists an x ∈ X such that x ≠ 0 and such that Tx = λx. Any vector x ≠ 0 satisfying the equation Tx = λx is called an eigenvector of T corresponding to the eigenvalue λ.
7.6.2. Definition. Let X be a complex Banach space and let T: X → X. The set of all λ ∈ C such that
(i) ℜ(T − λI) is dense in X;
(ii) (T − λI)⁻¹ exists; and
(iii) (T − λI)⁻¹ is continuous (i.e., bounded)
is called the resolvent set of T and is denoted by ρ(T). The complement of ρ(T) is called the spectrum of T and is denoted by σ(T).
The preceding definitions require some comments. First, note that if λ is an eigenvalue of T, there is an x ≠ 0 such that (T − λI)x = 0. From Theorem 3.4.32 this is true if and only if (T − λI) does not have an inverse. Hence, if λ is an eigenvalue of T, then λ ∈ σ(T). Note, however, that there are other ways that a complex number λ may fail to be in ρ(T). These possibilities are enumerated in the following definition.
7.6.3. Definition. The set of all eigenvalues of T is called the point spectrum of T. The set of all λ such that (T − λI)⁻¹ exists but ℜ(T − λI) is not dense in X is called the residual spectrum of T. The set of all λ such that (T − λI)⁻¹ exists and such that ℜ(T − λI) is dense in X but (T − λI)⁻¹ is not continuous is called the continuous spectrum. We denote these sets by Pσ(T), Rσ(T), and Cσ(T), respectively.
Clearly, σ(T) = Pσ(T) ∪ Cσ(T) ∪ Rσ(T). Furthermore, when X is finite dimensional, then σ(T) = Pσ(T). We summarize the preceding definition in the following table.

                               (T − λI)⁻¹ exists and   (T − λI)⁻¹ exists but   (T − λI)⁻¹ does
                               is continuous           is not continuous       not exist
ℜ(T − λI) dense in X           λ ∈ ρ(T)                λ ∈ Cσ(T)               λ ∈ Pσ(T)
ℜ(T − λI) not dense in X       λ ∈ Rσ(T)               λ ∈ Rσ(T)               λ ∈ Pσ(T)

7.6.4. Table A. Characterization of the resolvent set and the spectrum of an operator
7.6.5. Example. Let X = l₂, the Hilbert space of Example 6.11.9, let x = (ξ₁, ξ₂, ...) ∈ X, and define T ∈ B(X, X) by

Tx = (ξ₁, ξ₂/2, ξ₃/3, ...).

For each λ ∈ C we want to determine (a) whether (T − λI)⁻¹ exists; (b) if so, whether (T − λI)⁻¹ is continuous; and (c) whether ℜ(T − λI) = X.
First we consider the point spectrum of T. If Tx = λx, then (1/k − λ)ξₖ = 0, k = 1, 2, .... This holds for nontrivial x if and only if λ = 1/k for some k. Hence,

Pσ(T) = {1/k : k = 1, 2, ...}.

Next, assume that λ ∉ Pσ(T), so that (T − λI)⁻¹ exists, and let us investigate the continuity of (T − λI)⁻¹. We see that if y = (η₁, η₂, ...) ∈ ℜ(T − λI), then (T − λI)⁻¹y is given by

(T − λI)⁻¹y = (η₁/(1 − λ), ..., kηₖ/(1 − kλ), ...).

Now if λ = 0, then ||(T − λI)⁻¹y||² = Σₖ k²|ηₖ|², and (T − λI)⁻¹ is not bounded and hence not continuous. On the other hand, if λ ≠ 0 (and λ ∉ Pσ(T)), then (T − λI)⁻¹ is continuous, since |kηₖ/(1 − kλ)| ≤ γ|ηₖ| for all k, where γ = supₖ |k/(1 − kλ)| < ∞. In this case ℜ(T − λI) can also be shown to be all of X, so that λ ∈ ρ(T). Thus, Cσ(T) = {0}, Rσ(T) = ∅, and

ρ(T) = C − [Pσ(T) ∪ Cσ(T)].
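Finite sections of the operator in Example 7.6.5 illustrate both conclusions numerically: the eigenvalues are 1/k, and although each finite section is invertible, the norms of the inverses grow without bound, which is the finite-dimensional trace of 0 belonging to the continuous spectrum. (The section sizes below are arbitrary.)

```python
import numpy as np

# n-by-n finite section of Tx = (xi_1, xi_2/2, xi_3/3, ...).
def section(n):
    return np.diag(1.0 / np.arange(1, n + 1))

T = section(6)
eigs = np.sort(np.linalg.eigvals(T))[::-1]
assert np.allclose(eigs, 1.0 / np.arange(1, 7))   # point spectrum values 1/k

# Each finite section is invertible, but ||T^{-1}|| = n grows without bound,
# mirroring the unboundedness of (T - 0*I)^{-1} on l2.
norms = [np.linalg.norm(np.linalg.inv(section(n)), 2) for n in (4, 8, 16)]
assert norms[0] < norms[1] < norms[2]
```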
7.6.6. Exercise. Let X = l₂, the Hilbert space of Example 6.11.9, let x = (ξ₁, ξ₂, ξ₃, ...), and define the right shift operator T_r: X → X and the left shift operator T_l: X → X by

T_r x = (0, ξ₁, ξ₂, ...)

and

T_l x = (ξ₂, ξ₃, ξ₄, ...),

respectively. Show that

ρ(T_r) = ρ(T_l) = {λ ∈ C : |λ| > 1},
Cσ(T_r) = Cσ(T_l) = {λ ∈ C : |λ| = 1},
Rσ(T_r) = Pσ(T_l) = {λ ∈ C : |λ| < 1},
Pσ(T_r) = Rσ(T_l) = ∅.
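For the left shift, the eigenvalue claim of Exercise 7.6.6 can be probed directly: for |λ| < 1 the geometric vector (1, λ, λ², ...) lies in l₂ and is an eigenvector, so a long truncation of it leaves only a tiny residual. (The particular λ and truncation length below are arbitrary.)

```python
import numpy as np

lam = 0.5 + 0.3j            # any lambda with |lam| < 1
n = 60
x = lam ** np.arange(n)     # truncation of the eigenvector (1, lam, lam^2, ...)

left_shift = lambda v: np.append(v[1:], 0.0)   # T_l x = (xi_2, xi_3, ...)
residual = np.linalg.norm(left_shift(x) - lam * x)

# Only the truncated tail contributes (a single entry of size |lam|^n):
assert abs(lam) < 1
assert residual < 1e-12
```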
We now examine some of the properties of the resolvent set and the spectrum.
7.6.7. Theorem. Let T ∈ B(X, X). If |λ| > ||T||, then λ ∈ ρ(T) or, equivalently, if λ ∈ σ(T), then |λ| ≤ ||T||.
7.6.8. Exercise. Prove Theorem 7.6.7 (use Theorem 7.2.2).
7.6.9. Theorem. Let T ∈ B(X, X). Then ρ(T) is open and σ(T) is closed.
Proof. Since σ(T) is the complement of ρ(T), it is closed if and only if ρ(T) is open. Let λ₀ ∈ ρ(T). Then (T − λ₀I) has a continuous inverse. For arbitrary λ we now have

||I − (T − λ₀I)⁻¹(T − λI)||
= ||(T − λ₀I)⁻¹(T − λ₀I) − (T − λ₀I)⁻¹(T − λI)||
= ||(T − λ₀I)⁻¹[(T − λ₀I) − (T − λI)]||
= ||(λ − λ₀)(T − λ₀I)⁻¹||
= |λ − λ₀| ||(T − λ₀I)⁻¹||.

Now for |λ − λ₀| sufficiently small, we have

||I − (T − λ₀I)⁻¹(T − λI)|| = |λ − λ₀| ||(T − λ₀I)⁻¹|| < 1.

Now in Theorem 7.2.2 we showed that if T ∈ B(X, X), then T has a continuous inverse if ||I − T|| < 1. In our case it now follows that (T − λ₀I)⁻¹(T − λI) has a continuous inverse, and therefore (T − λI) has a continuous inverse whenever |λ − λ₀| is sufficiently small. This implies that λ ∈ ρ(T) and ρ(T) is open. Hence, σ(T) is closed.
For normal, hermitian, and isometric operators we have the following result.
7.6.10. Theorem. Let X be a Hilbert space, let T ∈ B(X, X), let λ be an eigenvalue of T, and let Tx = λx, x ≠ 0. Then
(i) if T is hermitian, then λ is real;
(ii) if T is isometric, then |λ| = 1;
(iii) if T is normal, then λ̄ is an eigenvalue of T* and T*x = λ̄x; and
(iv) if T is normal, if μ is an eigenvalue of T such that λ ≠ μ, and if Ty = μy, then x ⊥ y.
Proof. Without loss of generality, assume that x is a unit vector.
To prove (i), note that λ = λ||x||² = λ(x, x) = (λx, x) = (Tx, x), which is real by Theorem 7.4.5. Therefore, λ is real.
To verify (ii), note that if T is isometric, then ||Tx|| = ||x|| = 1, by Corollary 7.5.9. Since Tx = λx it follows that ||λx|| = 1, or |λ| ||x|| = 1, and hence |λ| = 1.
To prove (iii), assume that T is normal; i.e., T*T = TT*. Then

(T − λI)(T − λI)* = (T − λI)(T* − λ̄I)
= TT* − λ̄T − λT* + λλ̄I
= T*T − λ̄T − λT* + λλ̄I
= (T* − λ̄I)(T − λI) = (T − λI)*(T − λI);

i.e., (T − λI)(T − λI)* = (T − λI)*(T − λI), and (T − λI) is normal. Also, we can readily verify that ||(T − λI)x|| = ||(T − λI)*x||. Since (T − λI)x = 0, it follows that (T − λI)*x = 0, or (T* − λ̄I)x = 0, or T*x = λ̄x. Therefore, λ̄ is an eigenvalue of T* with eigenvector x.
To prove the last part, assume that λ ≠ μ and that T is normal. Then

(λ − μ)(x, y) = λ(x, y) − μ(x, y) = (λx, y) − (x, μ̄y)
= (Tx, y) − (x, T*y) = (Tx, y) − (Tx, y) = 0;

i.e., (λ − μ)(x, y) = 0. Since λ ≠ μ we have x ⊥ y.
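Parts (iii) and (iv) of Theorem 7.6.10 can be checked on a small normal matrix built by conjugating a diagonal matrix by a unitary one (the dimension and eigenvalues below are arbitrary illustrative choices).

```python
import numpy as np

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
lams = np.array([1.0 + 2.0j, -0.5j, 3.0])
T = Q @ np.diag(lams) @ Q.conj().T          # normal by construction: T*T = TT*

assert np.allclose(T.conj().T @ T, T @ T.conj().T)

x, y = Q[:, 0], Q[:, 1]                     # eigenvectors for lam_1 != lam_2
assert np.allclose(T @ x, lams[0] * x)
# part (iii): T* x = conj(lam) x
assert np.allclose(T.conj().T @ x, np.conj(lams[0]) * x)
# part (iv): eigenvectors for distinct eigenvalues are orthogonal
assert abs(np.vdot(x, y)) < 1e-12
```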
The next two results indicate what happens to the spectrum of an operator T when it is subjected to various elementary transformations.
7.6.11. Theorem. Let T ∈ B(X, X), and let p(T) denote a polynomial in T. Then

σ(p(T)) = p(σ(T)) = {p(λ) : λ ∈ σ(T)}.

7.6.12. Exercise. Prove Theorem 7.6.11.
7.6.13. Theorem. Let T ∈ B(X, X) be a bijective mapping. Then

σ(T⁻¹) = [σ(T)]⁻¹ = {1/λ : λ ∈ σ(T)}.

Proof. Since T⁻¹ exists, 0 ∉ σ(T), and so the definition of [σ(T)]⁻¹ makes sense. Now for any λ ≠ 0, consider the identity

(T⁻¹ − λI) = −λ(T − (1/λ)I)T⁻¹.

It follows that if 1/λ ∉ σ(T), then (T⁻¹ − λI) has a continuous inverse; i.e., 1/λ ∉ σ(T) implies that λ ∉ σ(T⁻¹). In other words, σ(T⁻¹) ⊂ [σ(T)]⁻¹. To prove that [σ(T)]⁻¹ ⊂ σ(T⁻¹) we proceed similarly, interchanging the roles of T and T⁻¹.
We now introduce the concept of the approximate point spectrum of an operator.
7.6.14. Definition. Let T ∈ B(X, X). Then λ ∈ C is said to belong to the approximate point spectrum of T if for every ε > 0 there exists a nonzero vector x ∈ X such that ||Tx − λx|| ≤ ε||x||. We denote the approximate point spectrum by π(T). If λ ∈ π(T), then λ is called an approximate eigenvalue of T.
Clearly, Pσ(T) ⊂ π(T). Other properties of π(T) are as follows.
7.6.15. Theorem. Let X be a Hilbert space, and let T ∈ B(X, X). Then π(T) ⊂ σ(T).
Proof. Assume that λ ∉ σ(T). Then (T − λI) has a continuous inverse, and for any x ∈ X we have

||x|| = ||(T − λI)⁻¹(T − λI)x|| ≤ ||(T − λI)⁻¹|| ||(T − λI)x||.

Now let ε = 1/||(T − λI)⁻¹||. Then we have, from above, ||Tx − λx|| ≥ ε||x|| for every x ∈ X, and λ ∉ π(T). Therefore, π(T) ⊂ σ(T).
We leave the proof of the next result as an exercise.
7.6.16. Theorem. Let X be a Hilbert space, and let T ∈ B(X, X) be a normal operator. Then π(T) = σ(T).
7.6.17. Exercise. Prove Theorem 7.6.16.
We can use the approximate point spectrum to establish some of the properties of the spectrum of hermitian operators.
7.6.18. Theorem. Let X be a Hilbert space, and let T ∈ B(X, X) be hermitian. Then
(i) σ(T) is a subset of the real line;
(ii) ||T|| = sup{|λ| : λ ∈ σ(T)}; and
(iii) σ(T) is not empty and either ||T|| or −||T|| belongs to σ(T).
Proof. To prove (i), note that if T is hermitian it is normal, and σ(T) = π(T). Let λ ∈ π(T), and assume that λ ≠ λ̄; i.e., that λ is complex. Since (Tx, x) is real, ((T − λ̄I)x, x) is the complex conjugate of ((T − λI)x, x), and for any x ≠ 0 we have

0 < |λ − λ̄| ||x||² = |λ − λ̄| |(x, x)| = |((T − λ̄I)x, x) − ((T − λI)x, x)|
≤ 2|((T − λI)x, x)| ≤ 2||(T − λI)x|| ||x||;

i.e.,

0 < |λ − λ̄| ||x|| ≤ 2||(T − λI)x||

for all x ≠ 0. But this implies that λ ∉ π(T), contrary to the original assumption. Hence, it must follow that λ = λ̄, which implies that λ is real.
To prove (ii), first note that sup{|λ| : λ ∈ σ(T)} ≤ ||T|| for any T ∈ B(X, X) (see Theorem 7.6.7). To show that equality holds when T is hermitian, we first show that ||T||² ∈ π(T²) = σ(T²). For all real λ and all x ∈ X we can write

||T²x − λ²x||² = (T²x − λ²x, T²x − λ²x)
= (T²x, T²x) − (T²x, λ²x) − (λ²x, T²x) + λ⁴(x, x).

Since (T²x, x) = (Tx, T*x) = (Tx, Tx), we now have

(T²x − λ²x, T²x − λ²x) = (T²x, T²x) − 2λ²(Tx, Tx) + λ⁴(x, x),

or

||T²x − λ²x||² = ||T²x||² − 2λ²||Tx||² + λ⁴||x||².    (7.6.19)

Now let {xₙ} be a sequence of unit vectors such that ||Txₙ|| → ||T||. If λ = ||T||, then we have, from Eq. (7.6.19),

||T²xₙ − λ²xₙ||² = ||T²xₙ||² − 2λ²||Txₙ||² + λ⁴
≤ (||T|| ||Txₙ||)² − 2λ²||Txₙ||² + λ⁴
= λ²||Txₙ||² − 2λ²||Txₙ||² + λ⁴
= λ⁴ − λ²||Txₙ||² → 0 as n → ∞;

i.e., ||T²xₙ − λ²xₙ|| → 0 as n → ∞, and thus λ² ∈ π(T²) = σ(T²).
Using Theorems 7.6.11 and 7.6.15 and the fact that λ² = ||T||² ∈ σ(T²), it now follows that

||T|| = sup{|λ| : λ ∈ σ(T)}.

The proof of (iii) is left as an exercise.
7.6.20. Exercise. Prove part (iii) of Theorem 7.6.18.
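In finite dimensions, where every hermitian matrix defines a hermitian operator, all three parts of Theorem 7.6.18 can be verified at once; the random hermitian matrix on C⁵ below is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
T = (A + A.conj().T) / 2           # a hermitian operator on C^5

eigs = np.linalg.eigvalsh(T)       # hermitian => real eigenvalues (part (i))
norm = np.linalg.norm(T, 2)        # operator (spectral) norm

# part (ii): ||T|| = sup{|lam| : lam in sigma(T)}
assert np.isclose(norm, np.max(np.abs(eigs)))
# part (iii): either ||T|| or -||T|| belongs to sigma(T)
assert np.isclose(eigs.max(), norm) or np.isclose(eigs.min(), -norm)
```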
7.6.21. Theorem. Let X be a Hilbert space, and let T ∈ B(X, X). Then π(T) is closed.
7.6.22. Exercise. Prove Theorem 7.6.21.
In the following we let T ∈ B(X, X), λ ∈ C, and we let 𝔑_λ(T) be the null space of T − λI; i.e.,

𝔑_λ(T) = {x ∈ X : (T − λI)x = 0} = 𝔑(T − λI).    (7.6.23)

It follows from Theorem 7.1.26 that 𝔑_λ(T) is a closed linear subspace of X.
For the next result, recall Definition 3.7.9 for the meaning of an invariant subspace.
7.6.24. Theorem. Let X be a Hilbert space, let λ ∈ C, and let S, T ∈ B(X, X). If ST = TS, then 𝔑_λ(T) is invariant under S.
Proof. Let x ∈ 𝔑_λ(T). We want to show that Sx ∈ 𝔑_λ(T); i.e., T(Sx) = λ(Sx). Since x ∈ 𝔑_λ(T), we have Tx = λx. Thus, S(Tx) = λ(Sx). Since ST = TS, we have T(Sx) = λ(Sx).
7.6.25. Corollary. 𝔑_λ(T) is invariant under T.
Proof. Since T commutes with itself, the result follows from Theorem 7.6.24.
For the next result, recall Definition 7.5.19.
7.6.26. Theorem. Let X be a Hilbert space, let λ ∈ C, and let T ∈ B(X, X). If T is normal, then
(i) 𝔑_λ(T) = 𝔑_λ̄(T*);
(ii) 𝔑_λ(T) ⊥ 𝔑_μ(T) if λ ≠ μ; and
(iii) 𝔑_λ(T) reduces T.
Proof. The proofs of parts (i) and (ii) are left as an exercise.
To prove (iii), we see that 𝔑_λ(T) is invariant under T from Corollary 7.6.25. To prove that 𝔑_λ(T)⊥ is invariant under T, let y ∈ 𝔑_λ(T)⊥. We want to show that (x, Ty) = 0 for all x ∈ 𝔑_λ(T). If x ∈ 𝔑_λ(T), we have Tx = λx, and so, by part (i), T*x = λ̄x. Now (x, Ty) = (T*x, y) = (λ̄x, y) = λ̄(x, y) = 0. This implies that Ty ∈ 𝔑_λ(T)⊥, and so 𝔑_λ(T)⊥ is invariant under T. This completes the proof of part (iii).
7.6.27. Exercise. Prove parts (i) and (ii) of Theorem 7.6.26.
Before considering the last result of this section, we make the following definition.
7.6.28. Definition. A family of closed linear subspaces in a Hilbert space X is said to be total if the only vector y ∈ X orthogonal to each member of the family is y = 0.
7.6.29. Theorem. Let X be a Hilbert space, and let S, T ∈ B(X, X). If the family of closed linear subspaces of X given by {𝔑_λ(T) : λ ∈ C} is total, then TS = ST if and only if 𝔑_λ(T) is invariant under S for all λ ∈ C.
Proof. The necessity follows from Theorem 7.6.24. To prove sufficiency, assume that 𝔑_λ(T) is invariant under S for all λ ∈ C. Let Y denote the null space of TS − ST; i.e., Y = 𝔑(TS − ST). If x ∈ 𝔑_λ(T), then Sx ∈ 𝔑_λ(T) by hypothesis. Hence, TSx = T(Sx) = λ(Sx) = S(λx) = S(Tx) = STx for all x ∈ 𝔑_λ(T). Thus, (TS − ST)x = 0 for any x ∈ 𝔑_λ(T), and so 𝔑_λ(T) ⊂ Y. If there is a vector y ⊥ Y, then it follows that y ⊥ 𝔑_λ(T) for all λ ∈ C. By hypothesis, the family {𝔑_λ(T) : λ ∈ C} is total, and thus y = 0. It follows that Y⊥ = {0} and Y = Y⊥⊥ = {0}⊥ = X, because Y is a closed linear subspace of X. Therefore, 𝔑(TS − ST) = X; i.e., (TS − ST)x = 0 for all x ∈ X. Hence, TS = ST.
7.7. COMPLETELY CONTINUOUS OPERATORS

Throughout this section X is a normed linear space over the field of complex numbers C.
Recall that a set Y ⊂ X is bounded if there is a constant k such that for all x ∈ Y we have ||x|| ≤ k. Also, recall that a set Y is relatively compact if each sequence {xₙ} of elements chosen from Y contains a convergent subsequence (see Definition 5.6.30 and Theorem 5.6.31). When Y contains only a finite number of elements, then any sequence constructed from Y must include some elements infinitely many times, and thus it contains a convergent subsequence. From this it follows that any set containing a finite number of elements is relatively compact. Every relatively compact set is contained in a compact set and hence is bounded. For the finite-dimensional case it is also true that every bounded set is relatively compact (e.g., in Rⁿ the Bolzano-Weierstrass theorem guarantees this). However, in the infinite-dimensional case it does not follow that every bounded set is also relatively compact.
In analysis and in applications, linear operators which transform bounded sets into relatively compact sets are of great importance. Such operators are called completely continuous operators or compact operators. We give the following formal definition.
7.7.1. Definition. Let X and Y be normed linear spaces, and let T be a linear transformation with domain X and range in Y. Then T is said to be completely continuous or compact if for each bounded sequence {xₙ} in X, the sequence {Txₙ} contains a subsequence converging to some element y ∈ Y.
We have the following equivalent characterization of a completely continuous operator.
7.7.2. Theorem. Let X and Y be normed linear spaces, and let T ∈ B(X, Y). Then T is completely continuous if and only if the sequence {Txₙ} contains a subsequence convergent to some y ∈ Y for all sequences {xₙ} such that ||xₙ|| ≤ 1 for all n.
7.7.3. Exercise. Prove Theorem 7.7.2.
Clearly, if an operator T is completely continuous, then it is continuous. On the other hand, the fact that T may be continuous does not ensure that it is completely continuous.
We now cite some examples.
7.7.4. Example. Let T: X → X be the zero operator; i.e., Tx = 0 for all x ∈ X. Then T is clearly completely continuous.
7.7.5. Example. Let X = C[a, b], and let ||·||∞ be the norm on C[a, b] as defined in Example 6.1.9. Let k: [a, b] × [a, b] → R be a real-valued function continuous on the square a ≤ s ≤ b, a ≤ t ≤ b. Defining T: X → X by

Tx(s) = ∫ₐᵇ k(s, t)x(t) dt

for all x ∈ X, we saw in Example 7.1.20 that T is a bounded linear operator. We now show that T is completely continuous.
Let {xₙ} be a bounded sequence in X; i.e., there is a γ > 0 such that ||xₙ||∞ ≤ γ for all n. It readily follows that if yₙ = Txₙ, then ||yₙ|| ≤ γ₀||xₙ||, where γ₀ = sup_{a≤s≤b} ∫ₐᵇ |k(s, t)| dt (see Example 7.1.20). We now show that {yₙ} is an equicontinuous set of functions on [a, b] (see Definition 5.8.11). Let ε > 0. Then, because of the uniform continuity of k on [a, b] × [a, b], there is a δ > 0 such that |k(s₁, t) − k(s₂, t)| < ε/[γ(b − a)] if |s₁ − s₂| < δ, for every t ∈ [a, b]. Thus,

|yₙ(s₁) − yₙ(s₂)| ≤ ∫ₐᵇ |k(s₁, t) − k(s₂, t)| |xₙ(t)| dt < ε

for all n and all s₁, s₂ such that |s₁ − s₂| < δ. This implies that the set {yₙ} is equicontinuous, and so by the Arzelá-Ascoli theorem (Theorem 5.8.12) the set {yₙ} is relatively compact in C[a, b]; i.e., it has a convergent subsequence. This implies that T is completely continuous.
It can be shown that if X = L₂[a, b] and if T is the Fredholm operator defined in Example 7.3.11, then T is also a completely continuous operator.
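Discretizing the integral operator of Example 7.7.5 on a grid gives a heuristic picture of compactness: the singular values of the discretized kernel decay rapidly, so the operator is well approximated by operators of small rank. The kernel k(s, t) = e^(−|s−t|), the interval [0, 1], and the thresholds below are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Midpoint-rule discretization of (Tx)(s) = \int_0^1 k(s,t) x(t) dt,
# with k(s,t) = exp(-|s-t|); the matrix K approximates T on an n-point grid.
n = 100
h = 1.0 / n
t = (np.arange(n) + 0.5) * h
K = np.exp(-np.abs(t[:, None] - t[None, :])) * h

s = np.linalg.svd(K, compute_uv=False)
# Rapid singular-value decay is the finite-dimensional shadow of compactness:
assert s[0] / s[-1] > 100.0    # wide spread between largest and smallest
assert s[20] < 0.05 * s[0]     # most of the action lives in a few dimensions
```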
The next result provides us with an example of a continuous linear transformation which is not completely continuous.
7.7.6. Theorem. Let I ∈ B(X, X) denote the identity operator on X. Then I is completely continuous if and only if X is finite dimensional.
Proof. The proof is an immediate consequence of Theorem 6.6.10.
We now consider some of the general properties of completely continuous operators.
7.7.7. Theorem. Let X and Y be normed linear spaces, let S, T ∈ B(X, Y) be completely continuous operators, and let α, β ∈ C. Then the operator (αS + βT) is completely continuous.
Proof. Given a sequence {xₙ} with ||xₙ|| ≤ 1, there is a subsequence {xₙₖ} such that the sequence {Sxₙₖ} has a limit u; i.e., Sxₙₖ → u. From the sequence {xₙₖ} we pick another subsequence {xₘⱼ} such that Txₘⱼ → v. Then

(αS + βT)xₘⱼ = αSxₘⱼ + βTxₘⱼ → αu + βv

as mⱼ → ∞.
We leave the proofs of the next results as an exercise.
7.7.8. Theorem. Let T ∈ B(X, X) be completely continuous. Let Y be a closed linear subspace of X which is invariant under T. Let T₁ be the restriction of T to Y. Then T₁ ∈ B(Y, Y) and T₁ is completely continuous.
7.7.9. Exercise. Prove Theorem 7.7.8.
7.7.10. Theorem. Let T ∈ B(X, X) be a completely continuous operator, and let S ∈ B(X, X) be any bounded linear operator. Then ST and TS are completely continuous.
7.7.11. Exercise. Prove Theorem 7.7.10.
7.7.12. Corollary. Let X and Y be normed linear spaces, and let T ∈ B(X, Y) and S ∈ B(Y, X). If T is completely continuous, then ST is completely continuous.
7.7.13. Exercise. Prove Corollary 7.7.12.
7.7.14. Example. A consequence of the above corollary is that if T ∈ B(X, X) is completely continuous and X is infinite dimensional, then T cannot be a bijective mapping of X onto X. For, suppose T were bijective. Then we would have T⁻¹T = I. By the Banach inverse theorem (see Theorem 7.2.6), T⁻¹ would then be continuous, and by the preceding theorem the identity mapping would be completely continuous. However, according to Theorem 7.7.6, this is possible only when X is finite dimensional.
Pursuing this example further, let X = C[a, b] with ||·||∞ as defined in Example 6.1.9. Let T: X → X be defined by

Tx(t) = ∫ₐᵗ x(τ) dτ

for a ≤ t ≤ b and x ∈ X. It is easily shown that T is a completely continuous operator on X. It is, however, not bijective, since ℜ(T) is the family of all functions which are continuously differentiable in X, and thus ℜ(T) is clearly a proper subset of X. The operator T is injective, since Tx = 0 implies x = 0. The inverse T⁻¹ is given by T⁻¹y(t) = dy(t)/dt for y ∈ ℜ(T) and a ≤ t ≤ b. We saw in Example 5.7.4 that T⁻¹ is not continuous.
In our next result we require the following definition.
7.7.15. Definition. Let X and Y be normed linear spaces, and let T ∈ B(X, Y). The operator T is said to be finite dimensional if T(X) is finite dimensional; i.e., the range of T is finite dimensional.
7.7.16. Theorem. Let X and Y be normed linear spaces, and let T ∈ B(X, Y). If T is a finite-dimensional operator, then it is a completely continuous operator.
Proof. Let {xₙ} be a sequence in X such that ||xₙ|| ≤ 1 for all n. Then {Txₙ} is a bounded sequence in T(X). It follows from Theorem 6.6.10 that the set {Txₙ} is relatively compact, and as such this set has a convergent subsequence in T(X). It follows from Theorem 7.7.2 that T is completely continuous.
The proof of the next result utilizes what is called the diagonalization process.
7.7.17. Theorem. Let X and Y be Banach spaces, and let {Tₙ} be a sequence of completely continuous operators mapping X into Y. If the sequence {Tₙ} converges in norm to an operator T, then T is completely continuous.
Proof. Let {xₙ} be an arbitrary sequence in X with ||xₙ|| ≤ 1. We must show that the sequence {Txₙ} contains a convergent subsequence.
By assumption, T₁ is a completely continuous operator, and thus we can select a convergent subsequence from the sequence {T₁xₙ}. Let x₁₁, x₁₂, ..., x₁ₙ, ... denote the inverse images of the members of this convergent subsequence. Next, let us apply T₂ to each member of the above subsequence. Since T₂ is completely continuous, we can again select a convergent subsequence from the sequence {T₂x₁ₙ}. The inverse images of the terms of this sequence are

x₂₁, x₂₂, x₂₃, ..., x₂ₙ, ....

Continuing this process, we can generate the array

x₁₁, x₁₂, x₁₃, ..., x₁ₙ, ...
x₂₁, x₂₂, x₂₃, ..., x₂ₙ, ...
x₃₁, x₃₂, x₃₃, ..., x₃ₙ, ...
.................................

Using this array, let us now form the diagonal sequence x₁₁, x₂₂, x₃₃, .... Now each of the operators T₁, T₂, T₃, ..., Tₙ, ... transforms this sequence into a convergent sequence. To show that T is completely continuous we must show that T also transforms this sequence into a convergent sequence. Now

||Txₙₙ − Txₘₘ|| = ||Txₙₙ − Tₖxₙₙ + Tₖxₙₙ − Tₖxₘₘ + Tₖxₘₘ − Txₘₘ||
≤ ||Txₙₙ − Tₖxₙₙ|| + ||Tₖxₙₙ − Tₖxₘₘ|| + ||Tₖxₘₘ − Txₘₘ||
≤ ||T − Tₖ||(||xₙₙ|| + ||xₘₘ||) + ||Tₖxₙₙ − Tₖxₘₘ||;

i.e.,

||Txₙₙ − Txₘₘ|| ≤ ||T − Tₖ||(||xₙₙ|| + ||xₘₘ||) + ||Tₖxₙₙ − Tₖxₘₘ||.

Given ε > 0, we can choose k so that ||T − Tₖ|| < ε/4, and, since the sequence {Tₖxₙₙ} converges, we can choose m, n ≥ N such that ||Tₖxₙₙ − Tₖxₘₘ|| < ε/2. We now have

||Txₙₙ − Txₘₘ|| < ε

whenever m, n ≥ N, and {Txₙₙ} is a Cauchy sequence. Since Y is a complete space, it follows that this sequence converges in Y, and by Theorem 7.7.2 the desired result follows.
Theorem 7.7.7 implies that the family of completely continuous operators forms a linear subspace of B(X, Y). The preceding theorem states that if Y is complete, then this linear subspace is closed.
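Theorems 7.7.16 and 7.7.17 together yield the standard recipe for proving complete continuity: exhibit T as a norm limit of finite-dimensional operators. The sketch below does this for a truncated version of the diagonal operator T = diag(1, 1/2, 1/3, ...); the ambient dimension 200 and the truncation ranks are arbitrary illustrative choices.

```python
import numpy as np

# T = diag(1, 1/2, 1/3, ...) restricted to C^200, and its rank-n truncations T_n.
N = 200
d = 1.0 / np.arange(1, N + 1)
T = np.diag(d)

def T_n(n):
    dn = d.copy()
    dn[n:] = 0.0               # keep only the first n diagonal terms: a rank-n operator
    return np.diag(dn)

# ||T - T_n|| = 1/(n+1) -> 0, so T is a norm limit of finite-dimensional operators
# and hence (Theorem 7.7.17) completely continuous in the l2 setting.
gaps = [np.linalg.norm(T - T_n(n), 2) for n in (1, 10, 100)]
assert np.isclose(gaps[0], 1.0 / 2)
assert np.isclose(gaps[1], 1.0 / 11)
assert np.isclose(gaps[2], 1.0 / 101)
```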
7.7.18. Theorem. Let X be a Hilbert space, and let T ∈ B(X, X). Then
(i) T is completely continuous if and only if T*T is completely continuous; and
(ii) T is completely continuous if and only if T* is completely continuous.
Proof. We prove (i) and leave the proof of (ii) as an exercise. Assume that T is completely continuous. It then follows from Theorem 7.7.10 that T*T is completely continuous.
Conversely, assume that T*T is completely continuous, and let {xₙ} be a sequence in X such that ||xₙ|| ≤ 1. It follows that there is a subsequence {xₙₖ} such that T*Txₙₖ → z ∈ X as nₖ → ∞. Now

||Txₙₗ − Txₙₖ||² = ||T(xₙₗ − xₙₖ)||² = (T(xₙₗ − xₙₖ), T(xₙₗ − xₙₖ))
= (T*T(xₙₗ − xₙₖ), xₙₗ − xₙₖ) ≤ ||T*T(xₙₗ − xₙₖ)|| ||xₙₗ − xₙₖ||
≤ 2||T*Txₙₗ − T*Txₙₖ|| → 0

as nₗ, nₖ → ∞. Thus, {Txₙₖ} is a Cauchy sequence, and so it is convergent. It follows from Theorem 7.7.2 that T is completely continuous.
7.7.19. Exercise. Prove part (ii) of Theorem 7.7.18.
In the remainder of this section we turn our attention to the properties of eigenvalues of completely continuous operators.
7.7.20. Theorem. Let X be a Hilbert space, let T ∈ B(X, X), and let λ ∈ C. If T is completely continuous and if λ ≠ 0, then

𝔑_λ(T) = {x : Tx = λx}

is finite dimensional.
Proof. The proof is by contradiction. Assume that 𝔑_λ(T) is not finite dimensional. Then there is an orthonormal infinite sequence {x₁, x₂, ..., xₙ, ...} in 𝔑_λ(T), and

||Txₙ − Txₘ||² = ||λxₙ − λxₘ||² = |λ|² ||xₙ − xₘ||² = 2|λ|²;

i.e., ||Txₙ − Txₘ|| = √2 |λ| > 0 for all m ≠ n. Therefore, no subsequence of {Txₙ} can be a Cauchy sequence, and hence no subsequence of {Txₙ} can converge. This completes the proof.
In the next result π(T) denotes the approximate point spectrum of T.
7.7.21. Theorem. Let X be a Hilbert space, let T ∈ B(X, X), and let λ ∈ C. If T is completely continuous, if λ ≠ 0, and if λ ∈ π(T), then λ is an eigenvalue of T.
Proof. For λ ∈ π(T) and each positive integer n there is an xₙ ∈ X such that ||Txₙ − λxₙ|| ≤ (1/n)||xₙ||. We may assume that ||xₙ|| = 1. Since T is completely continuous, there is a subsequence of {xₙ}, say {xₙₖ}, such that {Txₙₖ} is convergent. Let lim Txₙₖ = y ∈ X. It now follows that ||y − λxₙₖ|| → 0 as nₖ → ∞; i.e., λxₙₖ → y. Now ||y|| ≠ 0, because ||y|| = lim ||λxₙₖ|| = |λ| lim ||xₙₖ|| = |λ| > 0. By the continuity of T, we now have

Ty = T(lim λxₙₖ) = lim T(λxₙₖ) = λ lim Txₙₖ = λy.

Hence, Ty = λy, y ≠ 0. Thus, λ is an eigenvalue of T, and y is a corresponding eigenvector.
The proof of the next result is an immediate consequence of Theorems 7.6.16 and 7.7.21.

7.7.22. Theorem. Let $X$ be a Hilbert space, and let $T \in B(X, X)$ be completely continuous and normal. If $\lambda \in \sigma(T)$ and $\lambda \ne 0$, then $\lambda$ is an eigenvalue of $T$.

7.7.23. Exercise. Prove Theorem 7.7.22.

The above theorem states that, with the possible exception of $\lambda = 0$, the spectrum of a completely continuous normal operator consists entirely of eigenvalues; i.e., if $\lambda \ne 0$, either $\lambda \in P\sigma(T)$ or $\lambda \in \rho(T)$.
7.7.24. Theorem. Let $X$ be a Hilbert space, and let $T \in B(X, X)$. If $T$ is completely continuous and hermitian, then $T$ has an eigenvalue $\lambda$ with $|\lambda| = \|T\|$.

Proof. The proof follows directly from part (iii) of Theorem 7.6.18 and Theorem 7.7.22.
7.7.25. Theorem. Let $X$ be a Hilbert space, and let $T \in B(X, X)$. If $T$ is normal and completely continuous, then $T$ has at least one eigenvalue.

Proof. If $T = 0$, then $\lambda = 0$ clearly satisfies the conclusion of the theorem. So let us assume that $T \ne 0$. Also, if $T = T^*$, the conclusion of the theorem follows from Theorem 7.7.24. So let us assume that $T \ne T^*$. Let $U = \frac{1}{2}(T + T^*)$ and $V = \frac{1}{2i}(T - T^*)$. It is readily verified that $U$ and $V$ are hermitian, and, since $T$ is normal, that $UV = VU$. From Theorems 7.7.7 and 7.7.18, $U$ and $V$ are completely continuous. By assumption, $V \ne 0$. By the preceding theorem, $V$ has an eigenvalue $\mu$ with $|\mu| = \|V\| \ne 0$. It follows from Theorem 7.1.26 that $\mathfrak{N}_\mu(V) = \mathfrak{N}(V - \mu I) = N$ is a closed linear subspace of $X$. Since $UV = VU$, the subspace $N$ is invariant under $U$. Now let $U_1$ be the restriction of $U$ to the linear subspace $N$. It follows that $U_1$ is completely continuous by Theorem 7.7.8. It is readily verified that $U_1$ is a hermitian operator on the inner product subspace $N$ (see Eq. (3.6.21)). Hence, $U_1$ is completely continuous and hermitian. This implies that there is a $\xi \in C$ and an $x \in N$ such that $x \ne 0$ and $U_1 x = \xi x$. This means $Ux = \xi x$. Now since $x \in N$, we must have $Vx = \mu x$. It follows that $\lambda = \xi + i\mu$ is an eigenvalue of $T$ with corresponding eigenvector $x$, since

$$Tx = Ux + iVx = \xi x + i\mu x = (\xi + i\mu)x = \lambda x.$$

This completes the proof.
We now state and prove the last result of this section.

7.7.26. Theorem. Let $X$ be a Hilbert space, and let $T \in B(X, X)$. If $T$ is normal and completely continuous, then $T$ has an eigenvalue $\lambda$ such that $|\lambda| = \|T\|$.

Proof. Let $S = T^*T$. Then $S$ is hermitian and completely continuous by Theorem 7.7.18. Also, $S \ge 0$ because $(Sx, x) = (T^*Tx, x) = (Tx, Tx) = \|Tx\|^2 \ge 0$. This last condition implies that $S$ has no negative eigenvalues. Specifically, if $\lambda$ is an eigenvalue of $S$, then there is an $x \ne 0$ in $X$ such that $Sx = \lambda x$. Now

$$0 \le (Sx, x) = (\lambda x, x) = \lambda(x, x) = \lambda\|x\|^2,$$

and since $\|x\| \ne 0$, we have $\lambda \ge 0$. By Theorem 7.7.24, $S$ has an eigenvalue $\mu$, where $\mu = \|S\| = \|T^*T\| = \|T\|^2$. Now let $N = \mathfrak{N}(S - \mu I) = \mathfrak{N}_\mu(S)$, and note that $N$ contains a nonzero vector. Since $T$ is normal, $TS = T(T^*T) = (TT^*)T = ST$. Similarly, we have $T^*S = ST^*$. Since $S$ commutes with both $T$ and $T^*$, the subspace $N$ is invariant under $T$ and under $T^*$. By Theorem 7.5.6 this means $T$ remains normal when its domain of definition is restricted to $N$. By Theorem 7.7.25, there is a $\lambda \in C$ and a vector $x \ne 0$ in $N$ such that $Tx = \lambda x$, and thus $T^*x = \bar{\lambda}x$. Now since $Sx = T^*Tx = T^*(\lambda x) = \lambda T^*x = \lambda\bar{\lambda}x = |\lambda|^2 x$ for this $x \ne 0$, and since $Sx = \mu x$ for all $x \in N$, it follows that

$$|\lambda|^2 = \mu = \|S\| = \|T^*T\| = \|T\|^2.$$

Therefore, $|\lambda| = \|T\|$ and $\lambda$ is an eigenvalue of $T$.
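In finite dimensions every operator is completely continuous, so Theorems 7.7.24 and 7.7.26 can be checked numerically: a hermitian matrix has an eigenvalue whose modulus equals its operator norm. A minimal sketch with NumPy (the matrix size and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
T = B + B.conj().T                    # hermitian, hence normal

eigvals = np.linalg.eigvalsh(T)       # real eigenvalues of a hermitian matrix
op_norm = np.linalg.norm(T, 2)        # spectral (operator) norm ||T||

# Theorem 7.7.24: some eigenvalue satisfies |lambda| = ||T||
print(max(abs(eigvals)) - op_norm)    # ~0 up to roundoff
```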
7.8. THE SPECTRAL THEOREM FOR COMPLETELY CONTINUOUS NORMAL OPERATORS

The main result of this section is referred to as the spectral theorem (for completely continuous operators). Some of the direct consequences of this theorem provide an insight into the geometric properties of normal operators. Results such as the spectral theorem play a central role in applications. In Section 7.10 we will apply this theorem to integral equations.

Throughout this section, $X$ is a complex Hilbert space.

We require some preliminary results.

7.8.1. Theorem. Let $T \in B(X, X)$ be completely continuous and normal. For each $\epsilon > 0$, let $A_\epsilon$ be the annulus in the complex plane defined by

$$A_\epsilon = \{\lambda \in C : \epsilon \le |\lambda| \le \|T\|\}.$$

Then the number of eigenvalues of $T$ contained in $A_\epsilon$ is finite.
Proof. To the contrary, let us assume that for some $\epsilon > 0$ the annulus $A_\epsilon$ contains an infinite number of eigenvalues. By the Bolzano–Weierstrass theorem, there is a point of accumulation $\lambda_0$ of the eigenvalues in the annulus $A_\epsilon$. Let $\{\lambda_n\}$ be a sequence of distinct eigenvalues such that $\lambda_n \to \lambda_0$ as $n \to \infty$, and let $Tx_n = \lambda_n x_n$ with $\|x_n\| = 1$. Since $T$ is a completely continuous operator, there is a subsequence $\{x_{n_k}\}$ of $\{x_n\}$ for which the sequence $\{Tx_{n_k}\}$ converges to an element $u \in X$; i.e., $Tx_{n_k} \to u$ as $n_k \to \infty$. Thus, since $Tx_{n_k} = \lambda_{n_k}x_{n_k}$, we have $\lambda_{n_k}x_{n_k} \to u$. But $1/\lambda_{n_k} \to 1/\lambda_0$, because $\lambda_0 \ne 0$. Therefore $x_{n_k} \to (1/\lambda_0)u$. But the $x_{n_k}$ are distinct eigenvectors corresponding to distinct eigenvalues. By part (iv) of Theorem 7.6.10, $\{x_{n_k}\}$ is an orthonormal sequence converging to $(1/\lambda_0)u$. But $\|x_{n_k} - x_{n_l}\|^2 = 2$ for $k \ne l$, and thus $\{x_{n_k}\}$ cannot be a Cauchy sequence. Yet, it is convergent by assumption; i.e., we have arrived at a contradiction. Therefore, our initial assumption is false and the theorem is proved.
Our next result is a direct consequence of the preceding theorem.

7.8.2. Theorem. Let $T \in B(X, X)$ be completely continuous and normal. Then the number of eigenvalues of $T$ is at most denumerable. If the set of eigenvalues is denumerable, then it has a point of accumulation at zero and only at zero (in the complex plane). The nonzero eigenvalues can be ordered so that

$$|\lambda_1| \ge |\lambda_2| \ge \cdots.$$

7.8.3. Exercise. Prove Theorem 7.8.2.
The next result is known as the spectral theorem. Here we let $\lambda_0 = 0$, and we let $\lambda_1, \lambda_2, \ldots$ be the nonzero eigenvalues of a completely continuous operator $T \in B(X, X)$. Note that $\lambda_0$ may or may not be an eigenvalue of $T$. If $\lambda_0$ is an eigenvalue, then $\mathfrak{N}(T)$ need not be finite dimensional. However, by Theorem 7.7.20, $\mathfrak{N}(T - \lambda_i I)$ is finite dimensional for $i = 1, 2, \ldots$.
7.8.4. Theorem. Let $T \in B(X, X)$ be completely continuous and normal, let $\lambda_0 = 0$, and let $\lambda_1, \lambda_2, \ldots$ be the nonzero distinct eigenvalues of $T$ (this collection may be finite). Let $\mathfrak{N}_i = \mathfrak{N}(T - \lambda_i I)$ for $i = 0, 1, 2, \ldots$. Then the family of closed linear subspaces $\{\mathfrak{N}_i\}_{i \ge 0}$ of $X$ is total.

Proof. The fact that each $\mathfrak{N}_i$ is a closed linear subspace of $X$ follows from Theorem 7.1.26. Now let $Y = \bigcup_n \mathfrak{N}_n$, and let $N = Y^\perp$. We wish to show that $N = \{0\}$. By Theorem 6.12.6, $N$ is a closed linear subspace of $X$. We will show first that $Y$ is invariant under $T^*$. Let $x \in Y$. Then $x \in \mathfrak{N}_n$ for some $n$ and $Tx = \lambda_n x$. Now

$$\lambda_n(T^*x) = T^*(\lambda_n x) = T^*(Tx) = T^*Tx = T(T^*x);$$

i.e., $T(T^*x) = \lambda_n(T^*x)$, and so $T^*x \in \mathfrak{N}_n$, which implies $T^*x \in Y$. Therefore, $Y$ is invariant under $T^*$. From Theorem 7.3.15 it follows that $Y^\perp$ is invariant under $T$. Hence, $N$ is a closed linear subspace invariant under $T$. It follows from Theorems 7.7.8 and 7.5.6 that if $T_1$ is the restriction of $T$ to $N$, then $T_1 \in B(N, N)$ and $T_1$ is completely continuous and normal. Now let us suppose that $N \ne \{0\}$. By Theorem 7.7.25 there is a nonzero $x \in N$ and a $\lambda \in C$ such that $T_1 x = \lambda x$. But if this is so, $\lambda$ is an eigenvalue of $T$ and it follows that $x \in \mathfrak{N}_n$ for some $n$. Hence, $x \in N \cap Y$, which is impossible unless $x = 0$. This completes the proof.
In proving an alternate form of the spectral theorem, we require the following result.

7.8.5. Theorem. Let $\{N_k\}$ be a sequence of orthogonal closed linear subspaces of $X$; i.e., $N_k \perp N_j$ for all $j \ne k$. Then the following statements are equivalent:

(i) $\{N_k\}$ is a total family;
(ii) $X$ is the smallest closed linear subspace which contains every $N_k$; and
(iii) for every $x \in X$ there is a unique sequence $\{x_k\}$ such that
(a) $x_k \in N_k$ for every $k$, and
(b) $x = \sum_{k=1}^\infty x_k$.
Proof. We first prove the equivalence of statements (i) and (ii). Let $Y = \bigcup_n N_n$. Then $Y \subset Y^{\perp\perp}$ by Theorem 6.12.8. Furthermore, $Y^{\perp\perp}$ is the smallest closed linear subspace which contains $Y$, by Theorem 6.12.8. Now suppose $\{N_k\}$ is a total family. Then $Y^\perp = \{0\}$. Hence, $Y^{\perp\perp} = X$, and so $X$ is the smallest closed linear subspace which contains every $N_k$.

On the other hand, suppose $X$ is the smallest closed linear subspace which contains every $N_k$. Then $Y^{\perp\perp} = X$ and $Y^{\perp\perp\perp} = X^\perp = \{0\}$. But $Y^{\perp\perp\perp} = Y^\perp$. Thus, $Y^\perp = \{0\}$, and so $\{N_k\}$ is a total family.
We now prove the equivalence of statements (i) and (iii). Let $\{N_k\}$ be a total family, and let $x \in X$. For every $k = 1, 2, \ldots$, there are a $y_k \in N_k^\perp$ and an $x_k \in N_k$ such that $x = x_k + y_k$. If $x_k = 0$, then $(x, x_k) = 0$. If $x_k \ne 0$, then

$$\big(x,\, x_k/\|x_k\|\big) = \big(x_k + y_k,\, x_k/\|x_k\|\big) = \|x_k\|.$$

Thus, it follows from Bessel's inequality that

$$\sum_{k=1}^\infty \|x_k\|^2 \le \|x\|^2 < \infty,$$

and so, the $x_k$ being mutually orthogonal, the series $\sum_{k=1}^\infty x_k$ converges in $X$. Next, let $x_0 = x - \sum_{k=1}^\infty x_k$. Then $x_0 \in X$. For fixed $j$, let $y \in N_j$. Then

$$(x_0, y) = (x, y) - \sum_{k=1}^\infty (x_k, y) = (x, y) - (x_j, y) = (x_j + y_j, y) - (x_j, y) = (y_j, y) = 0.$$

Thus, $x_0$ is orthogonal to every element of $N_j$ for every $j$. Since $\{N_k\}$ is a total family, we have $x_0 = 0$; i.e., $x = \sum_{k=1}^\infty x_k$. To prove uniqueness, suppose that $x = \sum_{k=1}^\infty x_k = \sum_{k=1}^\infty x_k'$, where $x_k, x_k' \in N_k$ for all $k$. Then $\sum_{k=1}^\infty (x_k - x_k') = 0$. Since $(x_k - x_k') \in N_k$, we have $(x_k - x_k') \perp (x_j - x_j')$ for $j \ne k$, and so

$$0 = \Big\|\sum_{k=1}^\infty (x_k - x_k')\Big\|^2 = \sum_{k=1}^\infty \|x_k - x_k'\|^2.$$

Thus, $\|x_k - x_k'\| = 0$ for all $k$, and $x_k$ is unique for each $k$.

To prove that (iii) implies (i), assume that $x \perp N_k$ for every $k$. By hypothesis, $x = \sum_{k=1}^\infty x_k$, where $x_k \in N_k$ for all $k$. Hence, for any $j$ we have

$$(x, x_j) = \sum_{k=1}^\infty (x_k, x_j) = \|x_j\|^2,$$

and $(x, x_j) = 0$ for all $j$. This means $x_j = 0$ for all $j$, hence $x = 0$, and so $\{N_k\}$ is a total family. This completes the proof.
In Definition 3.2.13 we introduced the direct sum of a finite number of linear subspaces. The preceding theorem permits us to extend this definition in a meaningful way to a countable number of linear subspaces.

7.8.6. Definition. Let $\{X_k\}$ be a sequence of mutually orthogonal closed linear subspaces of $X$, and let $V(\{X_k\})$ be the closed linear subspace generated by $\{X_k\}$. If every $x \in V(\{X_k\})$ is uniquely representable as $x = \sum_{k=1}^\infty x_k$, where $x_k \in X_k$ for every $k$, then we say $V(\{X_k\})$ is the direct sum of $\{X_k\}$. In this case we write

$$V(\{X_k\}) = X_1 \oplus X_2 \oplus \cdots = \bigoplus_{k=1}^\infty X_k.$$
case we write
We are now in a position to present another version oI the spectral theorem.
7.8.7. Theorem. et T E B(, ) be completely continuous and normal,
let lo 0, and let PI l2 ... , In ...) be the nonero distinct eigenvalues
oI T. et mol mo(T l) Ior i 0, I, 2, ... , and let Pi be the projection
on mol along mot. Then
(i) PI is an orthogonal projection Ior each i;
(ii) PIP) 0 Ior all i,j such that i j;
..
(iii) I; P

I; and
)0
..
(iv) T lP).
t 1
ProoI The prooI oI each part Iollows readily Irom results already obtained.
We simply indicate the principal results needed and leave the details as an
e ercise.
Part (i) Iollows Irom the deIinition oI orthogonal projection. Part (ii)
Iollows Irom part (ii) oI Theorem 7.6.26. Parts (iii) and (iv) Iollow Irom
Theorems 7.1.27 and 7.8.5.
7.8.8. Eercise. Prove Theorem 7.8.7.
In Chapter 4 we defined the resolution of the identity operator for Euclidean spaces. We conclude this section with a more general definition.

7.8.9. Definition. Let $\{P_n\}$ be a sequence of linear transformations on $X$ such that $P_n \in B(X, X)$ for each $n$. If conditions (i), (ii), and (iii) of Theorem 7.8.7 are satisfied, then $\{P_n\}$ is said to be a resolution of the identity.
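For a symmetric matrix (a completely continuous normal operator on a finite-dimensional space) the conclusions of Theorem 7.8.7 can be verified directly: the eigenprojections are orthogonal, mutually annihilating, resolve the identity, and reconstruct $T$. A sketch, assuming the eigenvalues of this random symmetric matrix are distinct (true with probability one):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
T = B + B.T                                  # symmetric => normal, real spectrum

lams, V = np.linalg.eigh(T)                  # orthonormal eigenvectors as columns
# one rank-one orthogonal projection P_i per (distinct) eigenvalue
projs = [np.outer(V[:, i], V[:, i]) for i in range(4)]

assert np.allclose(projs[0] @ projs[1], 0)                        # (ii) P_i P_j = 0
assert np.allclose(sum(projs), np.eye(4))                         # (iii) sum_i P_i = I
assert np.allclose(sum(l * P for l, P in zip(lams, projs)), T)    # (iv) T = sum_i lam_i P_i
print("resolution of the identity verified")
```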
7.9. DIFFERENTIATION OF OPERATORS

In this section we consider differentiation of operators on normed linear spaces. Such operators need not be linear. Throughout this section, $X$ and $Y$ are normed linear spaces over a field $F$, where $F$ may be either $R$, the real numbers, or $C$, the complex numbers. We will identify mappings which are, in general, not linear by $f: X \to Y$. As usual, $L(X, Y)$ will denote the class of all linear operators from $X$ into $Y$, while $B(X, Y)$ will denote the class of all bounded linear operators from $X$ into $Y$.

7.9.1. Definition. Let $x_0 \in X$ be a fixed element, and let $f: X \to Y$. If there exists a function $\delta f(x_0, \cdot): X \to Y$ such that

$$\lim_{t \to 0} \left\| \frac{f(x_0 + th) - f(x_0)}{t} - \delta f(x_0, h) \right\| = 0 \qquad (7.9.2)$$

(where $t \in F$) for all $h \in X$, then $f$ is said to be Gateaux differentiable at $x_0$, and $\delta f(x_0, h)$ is called the Gateaux differential of $f$ at $x_0$ with increment $h$.

The Gateaux differential of $f$ is sometimes also called the weak differential of $f$ or the G-differential of $f$. If $f$ is Gateaux differentiable at $x_0$, then $\delta f(x_0, h)$ need not be linear nor continuous as a function of $h \in X$. However, we shall primarily be concerned with functions $f: X \to Y$ which do have these properties. This gives rise to the following concept.
7.9.3. Definition. Let $x_0 \in X$ be a fixed element, and let $f: X \to Y$. If there exists a bounded linear operator $F(x_0) \in B(X, Y)$ such that

$$\lim_{\|h\| \to 0} \frac{1}{\|h\|}\,\|f(x_0 + h) - f(x_0) - F(x_0)h\| = 0$$

(where $h \in X$), then $f$ is said to be Fréchet differentiable at $x_0$, and $F(x_0)$ is called the Fréchet derivative of $f$ at $x_0$. We define

$$f'(x_0) = F(x_0).$$

If $f$ is Fréchet differentiable for each $x \in D$, where $D \subset X$, then $f$ is said to be Fréchet differentiable on $D$.

We now show that Fréchet differentiability implies Gateaux differentiability.
7.9.4. Theorem. Let $f: X \to Y$, and let $x_0 \in X$ be a fixed element. If $f$ is Fréchet differentiable at $x_0$, then $f$ is Gateaux differentiable, and furthermore the Gateaux differential is given by

$$\delta f(x_0, h) = f'(x_0)h \quad \text{for all } h \in X.$$

Proof. Let $F(x_0) = f'(x_0)$, let $\epsilon > 0$, and let $h \in X$, $h \ne 0$. Then there is a $\delta > 0$ such that

$$\|f(x_0 + th) - f(x_0) - F(x_0)th\| \le \epsilon\,|t|\,\|h\|$$

provided that $0 < \|th\| < \delta$. This implies that

$$\left\| \frac{f(x_0 + th) - f(x_0)}{t} - F(x_0)h \right\| \le \epsilon\,\|h\|$$

provided that $|t| < \delta/\|h\|$. Hence, $f$ is Gateaux differentiable at $x_0$ and $\delta f(x_0, h) = F(x_0)h$.

Because of the preceding theorem, if $f: X \to Y$ is Fréchet differentiable at $x_0 \in X$, the Gateaux differential $\delta f(x_0, h) = f'(x_0)h$ is also called the Fréchet differential of $f$ at $x_0$ with increment $h$.

Let us now consider some examples.
7.9.5. Example. Let $X$ be a Hilbert space, and let $f$ be a functional defined on $X$; i.e., $f: X \to F$. If $f$ has a Fréchet derivative at some $x_0 \in X$, then that derivative must be a bounded linear functional on $X$; i.e., $f'(x_0) \in X^*$. By the Riesz representation theorem, there is an element $y_0 \in X$ such that $f'(x_0)h = (h, y_0)$ for each $h \in X$. Although $f'(x_0) \in X^*$ and $y_0 \in X$, we know that $X^*$ and $X$ are congruent and thus isometric. It is customary to view the corresponding elements of isometric spaces as being one and the same element. With this in mind, we say $f'(x_0) = y_0$, and we call $f'(x_0)$ the gradient of $f$ at $x_0$.

We now consider a special case of the preceding example.
7.9.6. Example. Let $X = R^n$, and let $\|\cdot\|$ be any norm on $X$. By Theorem 6.6.5, $X$ is a Banach space. Now let $f$ be a functional defined on $X$; i.e., $f: X \to R$. Let $x = (\xi_1, \ldots, \xi_n) \in X$ and $h = (h_1, \ldots, h_n) \in X$. If $f$ has continuous partial derivatives with respect to $\xi_i$, $i = 1, \ldots, n$, then the Fréchet differential of $f$ is given by

$$\delta f(x, h) = \frac{\partial f(x)}{\partial \xi_1}h_1 + \cdots + \frac{\partial f(x)}{\partial \xi_n}h_n.$$

For fixed $x_0 \in X$, we define the bounded linear functional $F(x_0)$ on $X$ by

$$F(x_0)h = \sum_{i=1}^{n} \frac{\partial f(x)}{\partial \xi_i}\Big|_{x = x_0} h_i \quad \text{for } h \in X.$$

Then $F(x_0)$ is the Fréchet derivative of $f$ at $x_0$. As in the preceding example, we do not distinguish between $X^*$ and $X$, and we write the gradient of $f$ at $x$ as

$$f'(x) = \left(\frac{\partial f(x)}{\partial \xi_1}, \ldots, \frac{\partial f(x)}{\partial \xi_n}\right). \qquad (7.9.7)$$
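The gradient formula (7.9.7) can be checked against the defining limit (7.9.2): the difference quotient along any increment $h$ should approach $(h, f'(x_0))$. A small sketch (the particular functional below is an arbitrary illustration, not one from the text):

```python
import numpy as np

def f(x):                                  # a smooth functional on R^2
    return np.sin(x[0]) + x[0] * x[1] ** 2

def grad_f(x):                             # gradient, Eq. (7.9.7): vector of partials
    return np.array([np.cos(x[0]) + x[1] ** 2, 2.0 * x[0] * x[1]])

x0 = np.array([0.3, -1.2])
h = np.array([1.0, 0.5])
t = 1e-6
quotient = (f(x0 + t * h) - f(x0)) / t     # Gateaux quotient at small t
exact = grad_f(x0) @ h                     # Frechet differential f'(x0)h = (h, grad f)
print(abs(quotient - exact))               # small, of order t
```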
In the following, we consider another example of the gradient of a functional.

7.9.8. Example. Let $X$ be a real Hilbert space, let $Q: X \to X$ be a bounded linear operator, and let $f: X \to R$ be given by $f(x) = (x, Qx)$. Then $f$ has a Fréchet derivative which is given by $f'(x) = (Q + Q^*)x$. To verify this, we let $h$ be an arbitrary element in $X$ and we let $F(x) = (Q + Q^*)x$. Then

$$f(x + h) - f(x) - F(x)h = (x + h, Q(x + h)) - (x, Qx) - (h, Qx) - (h, Q^*x) = (h, Qh).$$

From this it follows that

$$\lim_{\|h\| \to 0} \frac{|f(x + h) - f(x) - F(x)h|}{\|h\|} = 0.$$
In the next example we consider a functional which frequently arises in optimization problems.

7.9.9. Example. Let $X$ and $Y$ be real Hilbert spaces, and let $L$ be a bounded linear operator from $X$ into $Y$; i.e., $L \in B(X, Y)$. Let $L^*$ be the adjoint of $L$. Let $v$ be a fixed element in $Y$, and let $f$ be a real-valued functional defined on $X$ by

$$f(x) = \|v - Lx\|^2 \quad \text{for all } x \in X.$$

Then $f$ has a Fréchet derivative which is given by

$$f'(x) = -2L^*v + 2L^*Lx.$$

To verify this, observe that

$$f(x) = (v - Lx, v - Lx) = (v, v) - 2(v, Lx) + (Lx, Lx) = (v, v) - 2(L^*v, x) + (x, L^*Lx).$$

The conclusion now follows from Examples 7.9.5 and 7.9.8.
In the next example we introduce the Jacobian matrix of a function $f: R^n \to R^m$.

7.9.10. Example. Let $X = R^n$, and let $Y = R^m$. Since $X$ and $Y$ are finite dimensional, we may assume arbitrary norms on each of these spaces, and they will both be Banach spaces. Let $f: X \to Y$. For $x = (\xi_1, \ldots, \xi_n) \in X$, let us write

$$f(x) = \begin{bmatrix} f_1(x) \\ \vdots \\ f_m(x) \end{bmatrix} = \begin{bmatrix} f_1(\xi_1, \ldots, \xi_n) \\ \vdots \\ f_m(\xi_1, \ldots, \xi_n) \end{bmatrix}.$$

For $x_0 \in X$, assume that the partial derivatives

$$\frac{\partial f_i(x_0)}{\partial \xi_j} = \frac{\partial f_i(x)}{\partial \xi_j}\Big|_{x = x_0}$$

exist and are continuous for $i = 1, \ldots, m$ and $j = 1, \ldots, n$. The Fréchet differential of $f$ at $x_0$ with increment $h = (h_1, \ldots, h_n) \in X$ is given by

$$\delta f(x_0, h) = \begin{bmatrix} \dfrac{\partial f_1(x_0)}{\partial \xi_1} & \cdots & \dfrac{\partial f_1(x_0)}{\partial \xi_n} \\ \vdots & & \vdots \\ \dfrac{\partial f_m(x_0)}{\partial \xi_1} & \cdots & \dfrac{\partial f_m(x_0)}{\partial \xi_n} \end{bmatrix} \begin{bmatrix} h_1 \\ \vdots \\ h_n \end{bmatrix}.$$

The Fréchet derivative of $f$ at $x_0$ is the $m \times n$ matrix

$$f'(x_0) = \left[\frac{\partial f_i(x_0)}{\partial \xi_j}\right],$$

which is also called the Jacobian matrix of $f$ at $x_0$. We sometimes write $J(x) = \partial f(x)/\partial x$.
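Column by column, the Jacobian matrix is again a limit of difference quotients, with the increment ranging over the natural basis vectors; this gives a quick numerical check. A sketch (the map $f$ below is an arbitrary illustration):

```python
import numpy as np

def f(x):                                      # f: R^3 -> R^2
    return np.array([x[0] * x[1], np.exp(x[2]) + x[0]])

def jacobian(x):                               # matrix of partials [df_i / dxi_j]
    return np.array([[x[1], x[0], 0.0],
                     [1.0,  0.0, np.exp(x[2])]])

x0 = np.array([1.0, 2.0, 0.5])
t = 1e-7
# column j of the Jacobian ~ (f(x0 + t e_j) - f(x0)) / t
J_fd = np.column_stack([(f(x0 + t * e) - f(x0)) / t for e in np.eye(3)])
print(np.max(np.abs(jacobian(x0) - J_fd)))     # small, of order t
```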
7.9.11. Example. Let $X = C[a, b]$, the family of real-valued continuous functions defined on $[a, b]$, and let $\{X; \|\cdot\|\}$ be the Banach space given in Example 6.1.9. Let $k(s, t)$ be a real-valued function defined and continuous on $[a, b] \times [a, b]$, and let $g(t, x)$ be a real-valued function which is defined, and whose partial derivative $\partial g(t, x)/\partial x$ is continuous, for $t \in [a, b]$ and $x \in R$. Let $f: X \to X$ be defined by

$$f(x)(s) = \int_a^b k(s, t)\, g(t, x(t))\, dt, \quad x \in X.$$

For fixed $x_0 \in X$, the Fréchet differential of $f$ at $x_0$ with increment $h \in X$ is given by

$$\delta f(x_0, h)(s) = \int_a^b k(s, t)\, \frac{\partial g(t, x_0(t))}{\partial x}\, h(t)\, dt.$$

7.9.12. Exercise. Verify the assertions made in Examples 7.9.5 to 7.9.11.

We now establish some of the properties of Fréchet differentials.
7.9.13. Theorem. Let $f, g: X \to Y$ be Fréchet differentiable at $x_0 \in X$. Then

(i) $f$ is continuous at $x_0 \in X$; and
(ii) for all $\alpha, \beta \in F$, $\alpha f + \beta g$ is Fréchet differentiable at $x_0$, and $(\alpha f + \beta g)'(x_0) = \alpha f'(x_0) + \beta g'(x_0)$.

Proof. To prove (i), let $f$ be Fréchet differentiable at $x_0$, and let $F(x_0)$ be the Fréchet derivative of $f$ at $x_0$. Then

$$f(x_0 + h) - f(x_0) = [f(x_0 + h) - f(x_0) - F(x_0)h] + F(x_0)h,$$

and

$$\|f(x_0 + h) - f(x_0)\| \le \|f(x_0 + h) - f(x_0) - F(x_0)h\| + \|F(x_0)h\|.$$

Since $F(x_0)$ is bounded, there is an $M > 0$ such that $\|F(x_0)h\| \le M\|h\|$. Furthermore, for given $\epsilon > 0$ there is a $\delta > 0$ such that $\|f(x_0 + h) - f(x_0) - F(x_0)h\| \le \epsilon\|h\|$ provided that $\|h\| < \delta$. Hence, $\|f(x_0 + h) - f(x_0)\| \le (M + \epsilon)\|h\|$ whenever $\|h\| < \delta$. This implies that $f$ is continuous at $x_0$.

The proof of part (ii) is straightforward and is left as an exercise.

7.9.14. Exercise. Prove part (ii) of Theorem 7.9.13.

We now show that the chain rule encountered in calculus applies to Fréchet derivatives as well.
7.9.15. Theorem. Let $X$, $Y$, and $Z$ be normed linear spaces. Let $g: X \to Y$, let $f: Y \to Z$, and let $\varphi: X \to Z$ be the composite function $\varphi = f \circ g$. Let $g$ be Fréchet differentiable on an open set $D \subset X$, and let $f$ be Fréchet differentiable on an open set $E \supset g(D)$. If $x \in D$ is such that $g(x) \in E$, then $\varphi$ is Fréchet differentiable at $x$ and

$$\varphi'(x) = f'(g(x))\,g'(x).$$

Proof. Let $y = g(x)$ and $d = g(x + h) - g(x)$, where $h \in X$ is such that $x + h \in D$. Then

$$\varphi(x + h) - \varphi(x) - f'(y)g'(x)h = [f(y + d) - f(y) - f'(y)d] + f'(y)[d - g'(x)h] = [f(y + d) - f(y) - f'(y)d] + f'(y)[g(x + h) - g(x) - g'(x)h].$$

Thus, given $\epsilon > 0$ there is a $\delta > 0$ such that $\|d\| < \delta$ and $\|h\| < \delta$ imply

$$\|\varphi(x + h) - \varphi(x) - f'(y)g'(x)h\| \le \epsilon\|d\| + \epsilon\,\|f'(y)\|\,\|h\|.$$

By the continuity of $g$ (see the proof of part (i) of Theorem 7.9.13), it follows that $\|d\| \le M\|h\|$ for some constant $M$. Hence, there is a constant $k$ such that

$$\|\varphi(x + h) - \varphi(x) - f'(y)g'(x)h\| \le k\epsilon\|h\|.$$

This implies that $\varphi'(x)$ exists and $\varphi'(x) = f'(g(x))g'(x)$.

We next consider the Fréchet derivative of bounded linear operators.
7.9.16. Theorem. Let $T$ be a linear operator from $X$ into $Y$. If $f(x) = Tx$ for all $x \in X$, then $f$ is Fréchet differentiable on $X$ if and only if $T$ is a bounded linear operator. In this case, $f'(x) = T$ for all $x \in X$.

Proof. Let $T$ be a bounded linear operator. Then $\|f(x + h) - f(x) - Th\| = \|T(x + h) - Tx - Th\| = 0$ for all $x, h \in X$. From this it follows that $f'(x) = T$.

Conversely, suppose $T$ is unbounded. Then $f$ is not continuous and so, by Theorem 7.9.13, $f$ cannot be Fréchet differentiable.

Let us consider a specific case.

7.9.17. Example. Let $X = R^n$ and $Y = R^m$, and let us assume that the natural basis for each of these spaces is being used. If $A \in L(X, Y)$, then $A$ is given in matrix representation by

$$A = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix}.$$

Hence, if $f(x) = Ax$, then $f'(x) = A$, and the matrix representation of $f'(x) = df(x)/dx$ is $A$.
The next result is useful in obtaining bounds on Fréchet differentiable functions.

7.9.18. Theorem. Let $f: X \to Y$, let $D$ be an open set in $X$, and let $f$ be Fréchet differentiable on $D$. Let $x_0 \in D$, and let $h \in X$ be such that $x_0 + th \in D$ for all $t$ with $0 \le t \le 1$. Let $N = \sup_{0 \le t \le 1} \|f'(x_0 + th)\|$. Then

$$\|f(x_0 + h) - f(x_0)\| \le N\|h\|.$$

Proof. Let $y = f(x_0 + h) - f(x_0)$, and let $\varphi$ be a bounded linear functional defined on $Y$ (i.e., $\varphi \in Y^*$) such that $\varphi(y) = \|\varphi\|\,\|y\|$ (see Corollary 6.8.6). Define $g: [0, 1] \to R$ by $g(t) = \varphi(f(x_0 + th))$ for $0 \le t \le 1$. By Theorems 7.9.15 and 7.9.16, $g'(t) = \varphi(f'(x_0 + th)h)$. By the mean value theorem of calculus, there is a $t_0$ such that $0 < t_0 < 1$ and $g(1) - g(0) = g'(t_0)$. Thus,

$$|\varphi(f(x_0 + h)) - \varphi(f(x_0))| \le \|\varphi\| \sup_{0 \le t \le 1} \|f'(x_0 + th)\|\,\|h\|.$$

Since

$$|\varphi(f(x_0 + h)) - \varphi(f(x_0))| = |\varphi(f(x_0 + h) - f(x_0))| = |\varphi(y)| = \|\varphi\|\,\|f(x_0 + h) - f(x_0)\|,$$

it follows that $\|f(x_0 + h) - f(x_0)\| \le \sup_{0 \le t \le 1} \|f'(x_0 + th)\|\,\|h\|$.
If a function $f: X \to Y$ is Fréchet differentiable on an open set $D \subset X$, and if $f'(x)$ is Fréchet differentiable at $x \in D$, then $f$ is said to be twice Fréchet differentiable at $x$, and we call the Fréchet derivative of $f'(x)$ the second derivative of $f$. We denote the second derivative of $f$ by $f''$. Note that $f''$ is a bounded linear operator defined on $X$ with range in the normed linear space $B(X, Y)$.

We leave the proof of the next result as an exercise.

7.9.19. Theorem. Let $f: X \to Y$ be twice Fréchet differentiable on an open set $D \subset X$. Let $x_0 \in D$, and let $h \in X$ be such that $x_0 + th \in D$ for all $t$ with $0 \le t \le 1$. Let $N = \sup_{0 \le t \le 1} \|f''(x_0 + th)\|$. Then

$$\|f(x_0 + h) - f(x_0) - f'(x_0)h\| \le \tfrac{1}{2}N\|h\|^2.$$

7.9.20. Exercise. Prove Theorem 7.9.19.
We conclude the present section by showing that the Gateaux and Fréchet differentials play a role in maximizing and minimizing functionals which is similar to that of the ordinary derivative of functions of real variables.

Let $F = R$, and let $f$ be a functional on $X$; i.e., $f: X \to R$. Clearly, for fixed $x_0, h \in X$, we may define a function $g: R \to R$ by the relation $g(t) = f(x_0 + th)$ for all $t \in R$. In this case, if $f$ is Gateaux differentiable at $x_0$, we see that $\delta f(x_0, h) = g'(t)|_{t=0}$, where $g'(t)$ is the usual derivative of $g(t)$. We will need this property in proving our next result, Theorem 7.9.22. First, however, we require the following important concept.
7.9.21. Definition. Let $f$ be a real-valued functional defined on a domain $\mathfrak{D} \subset X$; i.e., $f: \mathfrak{D} \to R$. Let $x_0 \in \mathfrak{D}$. Then $f$ is said to have a relative minimum (relative maximum) at $x_0$ if there exists an open sphere $S(x_0; r) \subset X$ such that for all $x \in S(x_0; r) \cap \mathfrak{D}$ the relation $f(x_0) \le f(x)$ (respectively, $f(x_0) \ge f(x)$) holds. If $f$ has either a relative minimum or a relative maximum at $x_0$, then $f$ is said to have a relative extremum at $x_0$.

For relative extrema, we have the following result.
7.9.22. Theorem. Let $f: X \to R$ be Gateaux differentiable at $x_0 \in X$. If $f$ has a relative extremum at $x_0$, then $\delta f(x_0, h) = 0$ for all $h \in X$.

Proof. As pointed out in the remark preceding Definition 7.9.21, the real-valued function $g(t) = f(x_0 + th)$ must have an extremum at $t = 0$. From the ordinary calculus we must have $g'(t)|_{t=0} = 0$. Hence, $\delta f(x_0, h) = 0$ for all $h \in X$.
We leave the proof of the next result as an exercise.

7.9.23. Corollary. Let $f: X \to R$ be Fréchet differentiable at $x_0 \in X$. If $f$ has a relative extremum at $x_0$, then $f'(x_0) = 0$.

7.9.24. Exercise. Prove Corollary 7.9.23.

We conclude this section with the following example.

7.9.25. Example. Consider the real-valued functional $f$ defined in Example 7.9.9; i.e., $f(x) = \|v - Lx\|^2$. For a given $v \in Y$, a necessary condition for $f$ to have a minimum at $x_0 \in X$ is that

$$L^*Lx_0 = L^*v.$$
7.10. SOME APPLICATIONS

In this section we consider selected applications of the material of the present chapter. The section consists of three parts. In the first part we consider integral equations, in the second part we give an example in optimal control, while in the third part we address the problem of minimizing functionals by the method of steepest descent.

A. Applications to Integral Equations

Throughout this part, $X$ is a complex Hilbert space while $T$ denotes a completely continuous normal operator defined on $X$.

We recall that if, e.g., $X = L_2[a, b]$ and $T$ is defined by (see Example 7.3.11 and the comment at the end of Example 7.7.5)

$$Tx(s) = \int_a^b k(s, t)\,x(t)\,dt, \qquad (7.10.1)$$

then $T$ is a completely continuous operator defined on $X$. Furthermore, if $k(s, t) = \bar{k}(t, s)$ for all $s, t \in [a, b]$, then $T$ is hermitian (see Exercise 7.4.20) and, hence, normal.

In the following, we shall focus our attention on equations of the form

$$Tx - \lambda x = y, \qquad (7.10.2)$$

where $\lambda \in C$ and $x, y \in X$. If, in particular, $T$ is defined by Eq. (7.10.1), then Eq. (7.10.2) includes a large class of integral equations. Indeed, it was the study of such equations which gave rise to much of the development of functional analysis.
We now prove the following existence and uniqueness result.

7.10.3. Theorem. If $\lambda \ne 0$ and if $\lambda$ is not an eigenvalue of $T$, then Eq. (7.10.2) has a unique solution, which is given by

$$x = -\frac{1}{\lambda}P_0 y + \sum_{n=1}^{\infty} \frac{1}{\lambda_n - \lambda}P_n y, \qquad (7.10.4)$$

where $\{\lambda_n\}$ are the nonzero distinct eigenvalues of $T$, $P_n$ is the projection of $X$ onto $\mathfrak{N}_n = \mathfrak{N}(T - \lambda_n I)$ along $\mathfrak{N}_n^\perp$ for $n = 1, 2, \ldots$, and $P_0$ is the projection of $X$ onto $\mathfrak{N}(T)$.

Proof. We first prove that the infinite series on the right-hand side of Eq. (7.10.4) is convergent. Since $\lambda \ne 0$, it cannot be an accumulation point of $\{\lambda_n\}$. Thus, we can find a $d > 0$ such that $|\lambda| \ge d$ and $|\lambda - \lambda_k| \ge d$ for $k = 1, 2, \ldots$. We note from Theorem 7.8.7 that $P_iP_j = 0$ for $i \ne j$. Now for $N < \infty$, we have by the Pythagorean theorem,

$$\Big\| -\frac{1}{\lambda}P_0 y + \sum_{k=1}^{N} \frac{1}{\lambda_k - \lambda}P_k y \Big\|^2 = \frac{1}{|\lambda|^2}\|P_0 y\|^2 + \sum_{k=1}^{N} \frac{1}{|\lambda_k - \lambda|^2}\|P_k y\|^2 \le \frac{1}{d^2}\Big(\|P_0 y\|^2 + \sum_{k=1}^{N} \|P_k y\|^2\Big) = \frac{1}{d^2}\Big\|P_0 y + \sum_{k=1}^{N} P_k y\Big\|^2 \le \frac{1}{d^2}\|y\|^2.$$

This implies that $\sum_{k=1}^{\infty} \frac{1}{|\lambda_k - \lambda|^2}\|P_k y\|^2$ is convergent, and so it follows from Theorem 6.13.3 that the series in Eq. (7.10.4) converges to an element in $X$.

Let $j$ be a positive integer. By Theorem 7.5.12, $P_j$ is continuous, and so, by Theorem 7.1.27, $P_j\big(\sum_n x_n\big) = \sum_n P_j x_n$. Now let $x$ be given by Eq. (7.10.4) for arbitrary $y \in X$. We want to show that $Tx - \lambda x = y$. From Eq. (7.10.4) we have

$$P_0 x = -\frac{1}{\lambda}P_0 y \quad \text{and} \quad P_j x = \frac{1}{\lambda_j - \lambda}P_j y \text{ for } j = 1, 2, \ldots.$$

Thus, $-\lambda P_0 x = P_0 y$ and $\lambda_j P_j x - \lambda P_j x = P_j y$. Now from the spectral theorem (Theorem 7.8.7), we have

$$Tx - \lambda x = \sum_{j=1}^{\infty} \lambda_j P_j x - \lambda\Big(P_0 x + \sum_{j=1}^{\infty} P_j x\Big) = -\lambda P_0 x + \sum_{j=1}^{\infty} (\lambda_j - \lambda)P_j x = P_0 y + \sum_{j=1}^{\infty} P_j y = y.$$

Finally, to show that $x$ given by Eq. (7.10.4) is unique, let $x$ and $x'$ be such that $Tx - \lambda x = Tx' - \lambda x' = y$. Then it follows that $T(x - x') - \lambda(x - x') = 0$; hence $T(x - x') = \lambda(x - x')$. Since $\lambda$ is by assumption not an eigenvalue of $T$, we must have $x - x' = 0$. This completes the proof.
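In the finite-dimensional hermitian case the series (7.10.4) is a finite sum over rank-one eigenprojections, and the formula can be verified directly. A sketch, assuming $\lambda$ is not an eigenvalue of the random symmetric matrix below (here 0 is also not an eigenvalue, so the $P_0$ term is absent):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((5, 5))
T = B + B.T                                   # symmetric => normal
lam = 0.37                                    # assumed not an eigenvalue of T
y = rng.standard_normal(5)

mus, V = np.linalg.eigh(T)
# Eq. (7.10.4): x = sum_n P_n y / (mu_n - lam), with rank-one P_n y = (v_n, y) v_n
x = sum((V[:, n] @ y) / (mus[n] - lam) * V[:, n] for n in range(5))

print(np.allclose(T @ x - lam * x, y))        # x solves Eq. (7.10.2)
```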
In the next result we consider the case where $\lambda$ is a nonzero eigenvalue of $T$.

7.10.5. Theorem. Let $\{\lambda_n\}$ denote the nonzero distinct eigenvalues of $T$, and let $\lambda = \lambda_j$ for some positive integer $j$. Then there is a (non-unique) $x \in X$ satisfying Eq. (7.10.2) if and only if $P_j y = 0$, where $P_j$ is the orthogonal projection of $X$ onto $\mathfrak{N}_j = \mathfrak{N}(T - \lambda_j I) \ne \{0\}$. If $P_j y = 0$, then a solution to Eq. (7.10.2) is given by

$$x = -\frac{1}{\lambda}P_0 y + \sum_{\substack{k=1 \\ k \ne j}}^{\infty} \frac{1}{\lambda_k - \lambda}P_k y + x_0, \qquad (7.10.6)$$

where $P_0$ is the orthogonal projection of $X$ onto $\mathfrak{N}(T)$ and $x_0$ is any element in $\mathfrak{N}_j$.

Proof. We first observe that $\mathfrak{N}_j$ reduces $T$ by part (iii) of Theorem 7.6.26. It therefore follows from part (ii) of Theorem 7.5.22 that $TP_j = P_jT$. Now suppose that $x$ is such that Eq. (7.10.2) is satisfied for some $y \in X$. Then it follows that

$$P_j y = P_j(Tx - \lambda x) = TP_j x - \lambda P_j x = \lambda_j P_j x - \lambda P_j x = 0.$$

In the preceding, we used the facts that $P_j x \in \mathfrak{N}_j$ for all $x \in X$, that $Tz = \lambda_j z$ for $z \in \mathfrak{N}_j$, and that $\lambda = \lambda_j$. Hence, $P_j y = 0$.

Conversely, suppose that $P_j y = 0$, and let $x$ be given by Eq. (7.10.6). The proof that $x$ satisfies Eq. (7.10.2) follows along the same lines as the proof of Theorem 7.10.3, and the details are left as an exercise. The non-uniqueness of the solution is apparent, since $(T - \lambda_j I)x_0 = 0$ for any $x_0 \in \mathfrak{N}_j$.

7.10.7. Exercise. Complete the proof of Theorem 7.10.5.
B. An Example from Optimal Control

In this example we consider systems which can appropriately be described by the system of first-order ordinary differential equations

$$\dot{x}(t) = Ax(t) + Bu(t), \qquad (7.10.8)$$

where $x(0) = x_0$ is given. Here $x(t) \in R^n$ and $u(t) \in R^m$ for every $t$ such that $0 \le t \le T$ for some $T > 0$, $A$ is an $n \times n$ matrix, and $B$ is an $n \times m$ matrix. As we saw previously, if each element of the vector $u(t)$ is a continuous function of $t$, then the unique solution to Eq. (7.10.8) at time $t$ is given by

$$x(t) = \Phi(t, 0)x(0) + \int_0^t \Phi(t, \tau)Bu(\tau)\,d\tau, \qquad (7.10.9)$$

where $\Phi(t, \tau)$ is the state transition matrix for the system of equations given in Eq. (7.10.8).

Let us now define the class of vector-valued functions $L_2^m[0, T]$ by

$$L_2^m[0, T] = \{u : u^T = (u_1, \ldots, u_m), \text{ where } u_i \in L_2[0, T],\ i = 1, \ldots, m\}.$$

If we define the inner product by

$$(u, v) = \int_0^T u^T(t)v(t)\,dt$$

for $u, v \in L_2^m[0, T]$, then it follows that $L_2^m[0, T]$ is a Hilbert space (see Example 6.11.11). Next, let us define the linear operator $L: L_2^m[0, T] \to L_2^n[0, T]$ by

$$Lu(t) = \int_0^t \Phi(t, \tau)Bu(\tau)\,d\tau \qquad (7.10.10)$$

for all $u \in L_2^m[0, T]$. Since the elements of $\Phi(t, \tau)$ are continuous functions on $[0, T] \times [0, T]$, it follows that $L$ is completely continuous.
Now recall from Exercise 5.10.59 that Eq. (7.10.9) is the unique solution to Eq. (7.10.8) when the elements of the vector $u(t)$ are continuous functions of $t$. It can be shown that the solution of Eq. (7.10.8) exists in an extended sense if we permit $u \in L_2^m[0, T]$. Allowing for this generalization, we can now consider the following optimal control problem. Let $\eta \in R$ be such that $\eta > 0$, and let $f$ be the real-valued functional defined on $L_2^m[0, T]$ given by

$$f(u) = \int_0^T x^T(t)x(t)\,dt + \eta\int_0^T u^T(t)u(t)\,dt, \qquad (7.10.11)$$

where $x(t)$ is given by Eq. (7.10.9) for $u \in L_2^m[0, T]$. The linear quadratic cost control problem is to find $u \in L_2^m[0, T]$ such that $f(u)$ in Eq. (7.10.11) is minimum, where $x(t)$ is the solution to the set of ordinary differential equations (7.10.8). This problem can be cast into a minimization problem in a Hilbert space as follows.

Let

$$v(t) = -\Phi(t, 0)x_0 \quad \text{for } 0 \le t \le T.$$

Then we can rewrite Eq. (7.10.9) as

$$x = Lu - v,$$

and Eq. (7.10.11) assumes the form

$$f(u) = \|Lu - v\|^2 + \eta\|u\|^2.$$

We can find the desired minimizing $u$ in the more general context of arbitrary real Hilbert spaces by means of the following result.
7.10.12. Theorem. Let $X$ and $Y$ be real Hilbert spaces, let $L: X \to Y$ be a completely continuous operator, and let $L^*$ denote the adjoint of $L$. Let $v$ be a given fixed element in $Y$, let $\eta \in R$, and define the functional $f: X \to R$ by

$$f(u) = \|Lu - v\|^2 + \eta\|u\|^2 \qquad (7.10.13)$$

for $u \in X$. (In Eq. (7.10.13) we use the norms induced by the inner products, and note that $\|u\|$ is the norm of $u \in X$, while $\|Lu - v\|$ is the norm of $(Lu - v) \in Y$.) If in Eq. (7.10.13) $\eta > 0$, then there exists a unique $u_0 \in X$ such that $f(u_0) \le f(u)$ for all $u \in X$. Furthermore, $u_0$ is the solution to the equation

$$L^*Lu_0 + \eta u_0 = L^*v. \qquad (7.10.14)$$

Proof. Let us first examine Eq. (7.10.14). Since $L$ is a completely continuous operator, by Corollary 7.7.12, so is $L^*L$. Furthermore, the eigenvalues of $L^*L$ cannot be negative, and so $-\eta$ cannot be an eigenvalue of $L^*L$. Making the association $T = L^*L$, $\lambda = -\eta$, and $y = L^*v$ in Eq. (7.10.2), it is clear that $T$ is normal, and it follows from Theorem 7.10.3 that Eq. (7.10.14) has a unique solution. In fact, this solution is given by Eq. (7.10.4), using the above definitions of symbols.

Next, let us assume that $u_0$ is the unique element in $X$ satisfying Eq. (7.10.14), and let $h \in X$ be arbitrary. It follows from Eq. (7.10.13) that

$$f(u_0 + h) = (Lu_0 + Lh - v,\, Lu_0 + Lh - v) + \eta(u_0 + h,\, u_0 + h) = (Lu_0 - v,\, Lu_0 - v) + 2(Lh,\, Lu_0 - v) + (Lh, Lh) + \eta(u_0, u_0) + 2\eta(u_0, h) + \eta(h, h) = \|Lu_0 - v\|^2 + \eta\|u_0\|^2 + 2(h,\, L^*Lu_0 + \eta u_0 - L^*v) + \|Lh\|^2 + \eta\|h\|^2 = f(u_0) + \|Lh\|^2 + \eta\|h\|^2.$$

Therefore, $f(u_0 + h)$ is minimum if and only if $h = 0$.
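For matrices, Eq. (7.10.14) is the regularized normal equation $(L^TL + \eta I)u_0 = L^Tv$, and the strict-minimum property established in the proof is easy to observe numerically. A sketch (the matrix sizes and $\eta$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
L = rng.standard_normal((6, 4))               # any matrix acts as a compact operator
v = rng.standard_normal(6)
eta = 0.5

# Eq. (7.10.14): (L*L + eta I) u0 = L* v
u0 = np.linalg.solve(L.T @ L + eta * np.eye(4), L.T @ v)

def f(u):                                     # Eq. (7.10.13)
    return np.linalg.norm(L @ u - v) ** 2 + eta * np.linalg.norm(u) ** 2

# f(u0 + h) = f(u0) + ||Lh||^2 + eta ||h||^2 > f(u0) for every h != 0
for _ in range(5):
    h = rng.standard_normal(4)
    assert f(u0 + h) > f(u0)
print("u0 is the unique minimizer")
```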
70 Chapter 7 I inear Operators
The solution to E. (7.10.1) can be obtained Irom E. (7.10.); however.
a more convenient method is available Ior the Iinding oI the solution when
is given by E. (7.10.10). This is summaried in the Iollowing result.
7.10.1S. Theorem. et 0, and let I(u) be deIined by E. (7.10.11),
where (t) is the solution to E. (7.10.8). II
u(t) ... BTp(t)(t)

Ior all t such that 0 t T, where P(t) is the solution to the matri diIIer
ential euation
P(t) ATp(t) P(t)A . . P(t)BBTp(t) I (7.10.16)

with P(T) O. then u minimies I(u).
Proof. We want to show that u satisfies Eq. (7.10.14), where L is given by Eq. (7.10.10). We note that if u satisfies Eq. (7.10.14), then u = −(1/λ)L*(Lu − v). We now find the expression for evaluating L*w for arbitrary w ∈ L₂[0, T]. We compute

$$(Lu, w) = \int_0^T \left[ \int_0^s \Phi(s, t) B u(t)\, dt \right]^T w(s)\, ds
= \int_0^T \int_0^s u^T(t) B^T \Phi^T(s, t) w(s)\, dt\, ds
= \int_0^T u^T(t) \left[ \int_t^T B^T \Phi^T(s, t) w(s)\, ds \right] dt.$$

In order for this last expression to equal (L*w, u), we must have

$$(L^* w)(t) = \int_t^T B^T \Phi^T(s, t) w(s)\, ds.$$

Thus, u must satisfy

$$u(t) = -\frac{1}{\lambda} B^T \int_t^T \Phi^T(s, t) x(s)\, ds$$

for all t such that 0 ≤ t ≤ T. Now assume there exists a matrix P such that

$$P(t) x(t) = \int_t^T \Phi^T(s, t) x(s)\, ds. \qquad (7.10.17)$$

We now find conditions for such a matrix P(t) to exist. First, we see that P(T) = 0. Next, differentiating both sides of Eq. (7.10.17) with respect to t, and noting that ∂Φᵀ(s, t)/∂t = −AᵀΦᵀ(s, t), we have

$$\dot{P}(t) x(t) + P(t) \dot{x}(t) = -x(t) - A^T \int_t^T \Phi^T(s, t) x(s)\, ds
= -x(t) - A^T P(t) x(t).$$

Therefore,

$$\dot{P}(t) x(t) + P(t)\left[A x(t) + B u(t)\right] = -x(t) - A^T P(t) x(t).$$

But

$$u(t) = -\frac{1}{\lambda} (L^* x)(t) = -\frac{1}{\lambda} B^T P(t) x(t),$$

so that

$$\dot{P}(t) x(t) + P(t) A x(t) - \frac{1}{\lambda} P(t) B B^T P(t) x(t) = -x(t) - A^T P(t) x(t).$$

Hence, P(t) must satisfy

$$\dot{P}(t) = -A^T P(t) - P(t) A + \frac{1}{\lambda} P(t) B B^T P(t) - I$$

with P(T) = 0. If P(t) satisfies this equation, it follows that u satisfies

$$\lambda u + L^*(L u - v) = 0,$$

where v(t) = −Φ(t, 0)x₀, and so, by Theorem 7.10.12, u minimizes f given by Eq. (7.10.11). This completes the proof of the theorem.

The differential equation for P(t) in Eq. (7.10.16) is called a matrix Riccati equation and can be shown to have a unique solution for all t ≤ T.
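For a scalar system (so that A, B, and P(t) are numbers a, b, and p(t)), the Riccati equation can be integrated backward from the terminal condition P(T) = 0 with a simple Euler scheme; for large T the value at t = 0 approaches the positive root of the corresponding algebraic Riccati equation. A minimal sketch, with a, b, λ, T, and the step size chosen for illustration (not values from the text):

```python
import math

# Sketch: backward-Euler integration of the scalar Riccati equation
#   pdot(t) = -2*a*p(t) + (b*b/lam)*p(t)**2 - 1,   p(T) = 0,
# the scalar case of Eq. (7.10.16). a, b, lam, T, dt are illustrative.
a, b, lam, T = 1.0, 1.0, 1.0, 10.0
dt = 1.0e-3

def pdot(p):
    return -2.0 * a * p + (b * b / lam) * p * p - 1.0

p = 0.0                      # terminal condition P(T) = 0
t = T
while t > 0.0:
    p = p - dt * pdot(p)     # step backward in time: p(t - dt) ~ p(t) - dt*pdot
    t -= dt

# For large T, p(0) approaches the steady-state (positive) root of
# (b*b/lam)*p^2 - 2*a*p - 1 = 0.
p_inf = lam * (a + math.sqrt(a * a + b * b / lam)) / (b * b)

# Feedback gain at t = 0:  u(0) = -(1/lam) * b * p * x(0).
gain = -(1.0 / lam) * b * p
```

Note the backward direction of integration: the terminal condition sits at t = T, so the natural stable direction for the numerical scheme is from T down to 0.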
C. Minimization of Functionals: Method of Steepest Descent

The problem of finding the minimum (or maximum) of functionals arises frequently in many diverse areas in applications. In this part we turn our attention to an iterative method of obtaining the minimum of a functional f defined on a real Hilbert space X.

Consider a functional f: X → R of the form

$$f(x) = (x, Mx) - 2(w, x) + \beta, \qquad (7.10.18)$$

where w is a fixed vector in X, where β ∈ R, and where M is a linear self-adjoint operator having the property

$$c_1 \|x\|^2 \le (x, Mx) \le c_2 \|x\|^2 \qquad (7.10.19)$$

for all x ∈ X and some constants c₂ ≥ c₁ > 0. The reader can readily verify that the functional given in Eq. (7.10.13) is a special case of f given in Eq. (7.10.18), where we make the association M = L*L + λI (provided λ > 0), w = L*v, and β = (v, v).

Under the above conditions, the equation

$$Mx = w \qquad (7.10.20)$$
has a unique solution, say x₀, and x₀ minimizes f(x). Iterative methods are based on beginning with an initial guess x₁ to the solution of Eq. (7.10.20) and then successively attempting to improve the estimate according to a recursive relationship of the form

$$x_{n+1} = x_n + \alpha_n r_n, \qquad (7.10.21)$$

where αₙ ∈ R and rₙ ∈ X. Different methods of selecting αₙ and rₙ give rise to various algorithms for minimizing f(x) given in Eq. (7.10.18) or, equivalently, for finding the solution to Eq. (7.10.20). In this part we shall in particular consider the method of steepest descent. In doing so we let

$$r_n = w - M x_n, \qquad n = 1, 2, \ldots. \qquad (7.10.22)$$

The term rₙ defined by Eq. (7.10.22) is called the residual of the approximation xₙ. If, in particular, xₙ satisfies Eq. (7.10.20), we see that the residual is zero. For f(x) given in Eq. (7.10.18), we see that

$$\nabla f(x_n) = -2 r_n,$$

where ∇f(xₙ) denotes the gradient of f at xₙ. That is, the residual rₙ is "pointing" in the direction of the negative of the gradient, or in the direction of steepest descent. Equation (7.10.21) indicates that the correction term αₙrₙ is to be a scalar multiple of the gradient, and thus the steepest descent method constitutes an example of one of the so-called "gradient methods." With rₙ given by Eq. (7.10.22), αₙ is chosen so that f(xₙ + αrₙ) is minimum. Substituting xₙ + αrₙ into Eq. (7.10.18), it is readily shown that

$$\alpha_n = \frac{(r_n, r_n)}{(r_n, M r_n)}$$

is the minimizing value. This method is illustrated pictorially in Figure B.

7.10.23. Figure B. Illustration of the method of steepest descent.
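The iteration (7.10.21)–(7.10.22) with the optimal step αₙ = (rₙ, rₙ)/(rₙ, Mrₙ) is easy to state in code. A minimal sketch in R² (plain Python; the symmetric positive definite matrix M, the vector w, the iteration count, and the stopping tolerance are all illustrative):

```python
# Steepest descent for f(x) = (x, Mx) - 2(w, x) + beta, i.e. for solving
# M x = w, with residual r_n = w - M x_n and optimal step size
# alpha_n = (r_n, r_n) / (r_n, M r_n).  M, w, and the tolerance are
# illustrative values.

M = [[3.0, 1.0],
     [1.0, 2.0]]
w = [1.0, 1.0]

def matvec(A, x):
    return [A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1]]

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

x = [0.0, 0.0]                                      # initial guess x_1
for _ in range(100):
    Mx = matvec(M, x)
    r = [w[0] - Mx[0], w[1] - Mx[1]]                # residual, Eq. (7.10.22)
    if dot(r, r) < 1e-20:                           # residual zero: done
        break
    alpha = dot(r, r) / dot(r, matvec(M, r))        # minimizing step size
    x = [x[0] + alpha * r[0], x[1] + alpha * r[1]]  # update, Eq. (7.10.21)
```

For this M and w the exact solution of Mx = w is x₀ = (0.2, 0.4), and the iterates converge to it linearly.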
In the following result we show that under appropriate conditions the sequence {xₙ} generated in the heuristic discussion above converges to the unique minimizing element x₀ satisfying Eq. (7.10.20).

7.10.24. Theorem. Let M ∈ B(X, X) be a self-adjoint operator such that for some pair of positive real numbers γ and η we have γ‖x‖² ≤ (x, Mx) ≤ η‖x‖² for all x ∈ X. Let x₁ ∈ X be arbitrary, let w ∈ X, and let rₙ = w − Mxₙ, where xₙ₊₁ = xₙ + αₙrₙ for n = 1, 2, ..., and αₙ = (rₙ, rₙ)/(rₙ, Mrₙ). Then the sequence {xₙ} converges to x₀, where x₀ is the unique solution to Eq. (7.10.20).

Proof. In view of the Schwarz inequality we have (x, Mx) ≤ ‖Mx‖ ‖x‖. This implies that γ‖x‖ ≤ ‖Mx‖ for all x ∈ X, and so M is a bijective mapping by Theorem 7.2.21, with M⁻¹ ∈ B(X, X) and ‖M⁻¹‖ ≤ 1/γ. By Theorem 7.4.10, M⁻¹ is also self-adjoint. Let x₀ be the unique solution to Eq. (7.10.20), and define F: X → R by

$$F(x) = (x - x_0,\; M(x - x_0)) \quad \text{for } x \in X.$$

We see that F is minimized uniquely by x₀, and furthermore F(x₀) = 0. We now show that lim F(xₙ) = 0. If for some n, F(xₙ) = 0, the process terminates and we are done. So assume in the following that F(xₙ) ≠ 0. Note also that since M is positive, we have F(x) ≥ 0 for all x ∈ X.

We begin with the fact that

$$F(x_{n+1}) = F(x_n) - 2\alpha_n (r_n, r_n) + \alpha_n^2 (r_n, M r_n),$$

where we have let zₙ = x₀ − xₙ. Noting that rₙ = Mzₙ, so that F(xₙ) = (zₙ, Mzₙ) = (M⁻¹rₙ, rₙ), we have

$$\frac{F(x_n) - F(x_{n+1})}{F(x_n)} = \frac{(r_n, r_n)^2}{(r_n, M r_n)(M^{-1} r_n, r_n)} \ge \frac{\gamma}{\eta}.$$

Hence, F(xₙ₊₁) ≤ (1 − γ/η)F(xₙ) ≤ (1 − γ/η)ⁿF(x₁). Thus, lim F(xₙ) = 0, and so xₙ → x₀, which was to be proven.
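The per-step contraction factor (1 − γ/η) in the proof can be checked numerically: for a symmetric positive definite matrix the constants γ and η may be taken as its smallest and largest eigenvalues, and then F(xₙ₊₁) ≤ (1 − γ/η)F(xₙ) at every step of the iteration. A minimal sketch (plain Python; the matrix M, vector w, and starting point are illustrative):

```python
import math

# Verify F(x_{n+1}) <= (1 - gamma/eta) * F(x_n) along the steepest-descent
# iteration, where F(x) = (x - x0, M(x - x0)) and gamma, eta are the extreme
# eigenvalues of the symmetric positive definite M.  Data are illustrative.

M = [[4.0, 1.0],
     [1.0, 3.0]]
w = [1.0, 2.0]

def matvec(A, x):
    return [A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1]]

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

# Eigenvalues of the 2x2 symmetric M from its characteristic polynomial.
tr = M[0][0] + M[1][1]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
disc = math.sqrt(tr * tr - 4.0 * det)
gamma, eta = (tr - disc) / 2.0, (tr + disc) / 2.0

# Exact solution x0 of M x = w (2x2 Cramer's rule).
x0 = [(w[0] * M[1][1] - M[0][1] * w[1]) / det,
      (M[0][0] * w[1] - w[0] * M[1][0]) / det]

def F(x):
    z = [x[0] - x0[0], x[1] - x0[1]]
    return dot(z, matvec(M, z))

ratios = []
x = [5.0, -3.0]                     # arbitrary starting point x_1
for _ in range(20):
    Mx = matvec(M, x)
    r = [w[0] - Mx[0], w[1] - Mx[1]]
    if dot(r, r) < 1e-24:
        break
    alpha = dot(r, r) / dot(r, matvec(M, r))
    x_next = [x[0] + alpha * r[0], x[1] + alpha * r[1]]
    ratios.append(F(x_next) / F(x))  # each ratio should be <= 1 - gamma/eta
    x = x_next

bound = 1.0 - gamma / eta
```

In practice the observed ratios are usually well below the worst-case bound; the bound is attained only for specially aligned residuals.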
7.11. REFERENCES AND NOTES

Many of the excellent sources dealing with linear operators on Banach and Hilbert spaces include Balakrishnan [7.2], Dunford and Schwartz [7.5], Kantorovich and Akilov [7.6], Kolmogorov and Fomin [7.7], Liusternik and Sobolev [7.8], Naylor and Sell [7.11], and Taylor [7.12]. The exposition by Naylor and Sell is especially well suited from the viewpoint of applications in science and engineering.
For applications of the type considered in Section 7.10, as well as additional applications, refer to Antosiewicz and Rheinboldt [7.1], Balakrishnan [7.2], Byron and Fuller [7.3], Curtain and Pritchard [7.4], Kantorovich and Akilov [7.6], Lovitt [7.9], and Luenberger [7.10]. Applications to integral equations (see Section 7.10A) are treated in [7.3] and [7.9]. Optimal control problems (see Section 7.10B) in a Banach and Hilbert space setting are presented in [7.2], [7.4], and [7.10]. Methods for minimization of functionals (see Section 7.10C) are developed in [7.1], [7.6], and [7.10].
REFERENCES

[7.1] H. A. ANTOSIEWICZ and W. C. RHEINBOLDT, "Numerical Analysis and Functional Analysis," Chapter 1 in Survey of Numerical Analysis, ed. by J. TODD. New York: McGraw-Hill Book Company, 1962.
[7.2] A. V. BALAKRISHNAN, Applied Functional Analysis. New York: Springer-Verlag, 1976.
[7.3] F. W. BYRON and R. W. FULLER, Mathematics of Classical and Quantum Physics. Vols. I, II. Reading, Mass.: Addison-Wesley Publishing Co., Inc., 1969 and 1970.*
[7.4] R. F. CURTAIN and A. J. PRITCHARD, Functional Analysis in Modern Applied Mathematics. London: Academic Press, Inc., 1977.
[7.5] N. DUNFORD and J. T. SCHWARTZ, Linear Operators, Parts I and II. New York: Interscience Publishers, 1958 and 1963.
[7.6] L. V. KANTOROVICH and G. P. AKILOV, Functional Analysis in Normed Spaces. New York: The Macmillan Company, 1964.
[7.7] A. N. KOLMOGOROV and S. V. FOMIN, Elements of the Theory of Functions and Functional Analysis. Vols. I, II. Albany, N.Y.: Graylock Press, 1957 and 1961.
[7.8] L. A. LIUSTERNIK and V. J. SOBOLEV, Elements of Functional Analysis. New York: Frederick Ungar Publishing Company, 1961.
[7.9] W. V. LOVITT, Linear Integral Equations. New York: Dover Publications, Inc., 1950.
[7.10] D. G. LUENBERGER, Optimization by Vector Space Methods. New York: John Wiley & Sons, Inc., 1969.
[7.11] A. W. NAYLOR and G. R. SELL, Linear Operator Theory. New York: Holt, Rinehart and Winston, 1971.
[7.12] A. E. TAYLOR, Introduction to Functional Analysis. New York: John Wiley & Sons, Inc., 1958.

*Reprinted in one volume by Dover Publications, Inc., New York, 1992.
INDEX
Abelian group, 40
abstract algebra, 33
additive group, 46
adherent point, 275
adjoint system of
ordinary differential
equations, 261
adjoint transformation, 219, 220,422
affine linear subspace, 85
algebra, 30, 56, 57, 104
algebraically closed
field, 165
algebraic conjugate, 110
algebraic multiplicity, 167,223
algebraic structure, 31
algebraic system, 30
algebra with identity, 57,105
aligned, 379
almost everywhere, 295
approximate eigenvalue, 444
approximate point
spectrum, 444
approximation, 395
Arzela-Ascoli theorem, 316
Ascoli's lemma, 317
associative algebra, 56, 105
associative operation, 28
automorphism, 64, 68
autonomous system of
differential equations, 241
Axioms of norm, 207
475
B
Banach inverse theorem, 416
Banach space, 31, 345
basis, 61,89
Bessel inequality, 213, 380
bicompact, 302
bijection 14
bijective, 14, 100
bilinear form, 114
bilinear functional, 114-115
binary operation, 26
block diagonal matrix, 175
Bolzano-Weierstrass
property, 302
Bolzano-Weierstrass
theorem, 298
boundary, 279
bounded linear
functional, 356
bounded linear operator, 407
bounded metric space, 265
bounded sequence, 286
B(X,Y), 409
C
C[a, b], 80
cancellation laws, 34
canonical mapping, 372
cardinal number, 24
cartesian product, 10
Cauchy-Peano
existence theorem, 332
Cauchy sequence, 290
Cayley-Hamilton theorem, 167
Cayley's theorem, 66
characteristic equation, 166,259
characteristic polynomial, 166
characteristic value, 164
characteristic vector, 164
0 > 79
classical adjoint of a
matrix, 162
closed interval, 283
closed relative to an
operation, 28
closed set, 279
closed sphere, 283
closure, 275
Cⁿ, 78
cofactor, 158
colinear, 379
collection of subsets, 8
column matrix, 132
column of a matrix, 132
column rank of a matrix, 152
column vector, 125
commutative algebra, 57,105
commutative group, 40
commutative operation, 28
commutative ring, 47
compact, 302
compact operator, 447
companion form, 256
comparable matrices, 137
complement of a subset, 4
completely continuous
operator, 447
complete metric space, 290
complete orthonormal set of vectors, 213, 389
completion, 295
complex vector space, 76
composite function, 16
composite mathematical
system, 30, 54
conformal matrices, 137
congruent matrices, 198
conjugate functional, 114
conjugate operator, 421
constant coefficients, 241
contact point, 275
continuation of a
solution, 336
continuous function, 307,408
continuous spectrum, 440
contraction mapping, 314
converge, 286,350
convex, 351-355
coordinate representation
of a vector, 125
coordinates of a vector
with respect to a basis, 92, 124
countable set, 23
countably infinite set, 23
covering, 299
cyclic group, 43,44
D
degree of a polynomial, 70
DeMorgan's laws, 7,12
dense-in-itself, 284
denumerable set, 23
derived set, 277-278
determinant of a
linear transformation, 163
determinant of a matrix, 157
diagonalization of a
matrix, 172
diagonalization process, 450
diagonal matrix, 155
diameter of a set, 267
difference of sets, 7
differentiation:
of matrices, 247
of vectors, 241
dimension, 78,92,392
direct product, 10
direct sum of linear subspaces, 83, 457
discrete metric, 265
disjoint sets, 5
disjoint vector spaces, 83
distance, 264
between a point
and a set, 267
between sets, 267
between vectors, 208
distribution function, 397
distributive, 28
diverge, 286, 350
division algorithm, 71
division (of
polynomials), 72
division ring, 46, 50
divisor, 49
divisors of zero, 48
domain of a function, 12
domain of a relation, 25
dot product, 114
dual, 358
dual basis, 112
E
ε-approximate solution, 329
ε-dense set, 299
ε-net, 299
eigenvalue, 164,439
eigenvector, 164,439
element, 2
element of ordered set, 10
empty set, 3
endomorphism, 64, 68
equal by definition, 10
equality of functions, 14
equality of matrices, 132
equality of sets, 3
equals relation, 26
equicontinuous, 316
equivalence relation, 26
equivalent matrices, 151
equivalent metrics, 318
equivalent sets, 23
error vector, 395
estimate, 398
Euclidean metric, 271
Euclidean norm, 207
Euclidean space, 30,124, 205
even permutation, 156
events, 397
everywhere dense, 284
expected value, 398
extended real line, 266
extended real numbers, 266
extension of a function, 20
extension of an
operation, 29
exterior, 279
extremum, 464
F
factor, 72
family of disjoint sets, 12
family of subsets, 8
field, 30, 46, 50
field of complex
numbers, 51
field of real numbers, 51
finite covering, 299
finite-dimensional
operator, 450
finite-dimensional
vector space, 92,124
finite group, 40
finite intersection
property, 305
finite linear
combination of vectors, 85
finite set, 8
fixed point, 315
flat, 85
Fⁿ, 78
Fourier coefficients, 380,389
Frechet derivative, 458
Fredholm equation, 97,326
Fredholm operator, 425
function, 12
functional, 109,355
functional analysis, 343
function space, 80
fundamental matrix, 246
fundamental sequence, 290
fundamental set, 246
fundamental theorem
of algebra, 74
fundamental theorem of
linear equations, 99
G
Gateaux differential, 458
generalized associative
law, 36
generated subspace, 383
generators of a set, 60
Gram matrix, 395
Gram-Schmidt process, 213,391
graph of a function, 14
greatest common divisor, 73
Gronwall inequality, 332
group, 30, 39
group component, 46
group operation, 46
H
Hahn-Banach theorem, 367-370
half space, 366
Hamel basis, 89
Hausdorff spaces, 323
Heine-Borel property, 302
Heine-Borel theorem, 299
hermitian operator, 427
Hilbert space, 31, 377
homeomorphism, 320
homogeneous property
of a norm, 208,344
homogeneous system, 241-242
homomorphic image, 62,68
homomorphic rings, 67
homomorphic semigroups, 63
homomorphism, 30, 62
hyperplane, 364
I
idempotent operator, 121
identity:
element, 35
function, 19
matrix, 139
permutation, 19,44
relation, 26
transformation, 105,409
image of a set under f, 21
indeterminate of a
polynomial ring, 70
index:
of a nilpotent
operator, 185
of a symmetric
bilinear functional, 202
set, 10
indexed family of sets, 10
indexed set, 11
induced:
mapping, 20
induced (cont.)
metric, 267
norm,349,412
operation, 29
inequalities, 268-271
infinite-dimensional
vector space, 92
infinite series, 350
infinite set, 8
initial value problem, 238-261,328-:
injection, 14
injective, 14,100
inner product, 117,205,375
inner product space, 31, 118, 205
inner product subspace, 118
integral domain, 46,49
integration:
of matrices 249
of vectors 249
interior, 278
intersection of sets, 5
invariant linear
subspace, 122
inverse:
image 21
of a function, 15, 100
of a matrix, 140
of an element, 38
relation, 25
invertible element, 37
invertible linear
transformation, 100
invertible matrix, 140
irreducible polynomial, 74
irreflexive, 372
isolated point, 275
isometric operator, 431
isometry,321
isomorphic, 108
isomorphic semigroups, 64
isomorphism, 30, 63, 68,108
J
Jacobian matrix, 461
Jacobi identity, 57
Jordan canonical form, 175,191
K
Kalman's theorem, 401-402
kernel of a homomorphism, 65
Kronecker delta, 111
L
Laplace transform, 96
latent value, 164
leading coefficient of
a polynomial, 70
Lebesgue integral, 296
Lebesgue measurable
function, 296
Lebesgue measurable
sets, 295
Lebesgue measure, 295
left cancellation
property, 34
left distributive, 28
left identity, 35
left inverse, 36
left invertible element, 37
left R-module, 54
left solution, 40
Lie algebra, 57
limit, 286
limit point, 277,288
line segment, 351
linear:
algebra, 33
functional, 109,355-360
manifold, 81
operator, 31,95
quadratic cost
control, 468
space, 30, 55, 76
subspace, 59,81,348
subspace generated
by a set, 86
transformation, 30, 95,100
variety, 85
linearly dependent, 87
linearly independent, 87
Lipschitz condition, 324, 328
Lipschitz constant, 324, 328
lower triangular matrix, 176
Lₚ, 297
L(X,Y), 104
M
map, 13
mapping, 13
mathematical system, 30
matrices, 30
matrix, 132
matrix of:
a bilinear functional, 195
a linear transformation, 131
one basis with respect
to a second basis, 149
maximal linear subspace, 363
metric, 31,209,264
metric space, 31,209, 263-342
metric subspace, 267
minimal polynomial, 179,181
minor of a matrix, 158
modal matrix, 172
modern algebra, 33
module, 30, 54
monic polynomial, 70
monoid, 37
multiplication of a
linear transformation
by a scalar, 104
multiplication of
vectors by scalars, 76,409
multiplicative semigroup, 46
multiplicity of an
eigenvalue, 164
multivalued function, 25
N
natural basis, 126
natural coordinates, 127
n-dimensional complex
coordinate space, 78
n-dimensional real
coordinate space, 78
n-dimensional vector
space, 92
negative definite
matrix, 222
nested sequence
of sets, 298
Neumann expansion
theorem, 415
nilpotent operator, 185
non-abelian group, 40
non-commutative group, 40
non-empty set, 3
non-homogeneous system, 241-242
non-linear
transformation, 95
non-singular linear
transformation, 100
non-singular matrix, 140
non-void set, 3
norm, 206, 344
normal:
equations, 395
linear
transformation, 237
operator, 431
topological space, 323
normalizing a vector, 209
normed conjugate space, 358
normed dual space, 358
normed linear space, 31, 208,344
norm of a bounded
linear transformation, 409
norm preserving, 367
nowhere dense, 284
null:
matrix, 139
set, 3
space, 98,224
vector, 76, 77
nullity of a linear
transformation, 100
n-vector, 132
O
object, 2
observations, 398
odd permutation, 156
one-to-one and onto
mapping, 14,100
one-to-one mapping, 14, 100
onto mapping, 14,100
open:
ball, 275
covering, 299
interval, 282
set, 279
sphere, 275
operation table, 27
operator, 13
optimal control problem, 468
ordered sets, 9
order of a group, 40
order of a polynomial, 70
order of a set, 8
ordinary differential
equations, 238-261
origin, 76, 77
orthogonal:
basis, 210
complement, 215,382
linear transformation, 217, 231-:
matrix, 216,226
projection, 123,433
set of vectors, 379
vectors, 118,209
orthogonality principle, 399
orthonormal set of
vectors, 379
outcomes, 397
P
parallel, 364
parallelogram law, 208, 379
Parseval's formula, 390
Parseval's identity, 212
partial sums, 350
partitioned matrix, 147
permutation group, 44,45
permutation on a set, 19
piecewise continuous
derivatives, 329
point of accumulation, 277
points, 264
point spectrum, 440
polarization, 116
polynomial, 69
positive definite matrix, 222
positive operator, 429
power class, 9
power set, 9
precompact, 299
predecessor of an
operation, 29
pre-Hilbert space, 377
primary decomposition
theorem, 183
principal minor of a
matrix, 158
principle of superposition, 96
probability space, 397
product metric spaces, 274
product of:
a matrix by a scalar, 138
linear transformations, 105,409
two elements, 46,104
two matrices, 138
projection, 119,226,387
projection theorem, 387,400
proper:
subset, 3
subspace, 81, 164
value, 164
vector, 164
Pythagorean theorem, 209, 379
Q
quadratic form, 115, 226
quotient, 72
R
radius, 275
random variable, 397
range of a function, 12
range of a relation, 25
range space, 98
rank of a linear
transformation, 100
rank of a matrix, 136
rank of a symmetric
bilinear functional, 202
real inner product space, 205
real line, 265
real vector space, 76
reduce, 435
reduced characteristic
function, 179
reduced linear
transformation, 122
reflection, 218
reflexive, 372
reflexive relation, 25
regular topological
space, 323
relation, 25
relatively compact, 307
relatively prime, 73
remainder, 72
repeated eigenvalues, 173
residual, 472
residual spectrum, 440
resolution of the
identity, 226,457
resolvent set, 439
restriction of a mapping, 20
R-homomorphism, 68
Riccati equation, 471
Riemann integrable, 296
Riesz representation
theorem, 393
right:
cancellation property, 34
distributive, 28
identity, 34
inverse, 35
invertible element, 37
R-module, 54
solution, 40
R, 78
ring, 30,46
ring of integers, 51
ring of polynomials, 70
ring with identity, 47
R-module, 54
Rⁿ, 78
rotation, 218, 230
row of a matrix, 131
row rank of a matrix, 152
row vector, 125,132
R*, 266
R-submodule, 58
R-submodule generated
by a set, 60
S
scalar, 75
scalar multiplication, 76
Schwarz inequality, 207,376
second dual space, 371
secular value, 164
self-adjoint linear
transformation, 221, 224-225
self-adjoint operators, 428
semigroup, 30, 36
semigroup component, 46
semigroup of
transformations, 44
semigroup operation, 46
separable, 284, 300
separates, 366
sequence, 11, 286
sequence of disjoint
sets, 12
sequence of sets, 11
sequentially compact, 301-305
set, 1
set of order zero, 8
shift operator, 441
σ-algebra, 397
σ-field, 397
signature of a symmetric
bilinear functional, 202
similarity transformation, 153
similar matrices, 153
simple eigenvalues, 164
singleton set, 8
singular linear
transformation, 101
singular matrix, 140
skew-adjoint linear
transformation, 221, 237
skew symmetric bilinear
functional, 196
skew symmetric matrix, 196
skew symmetric part of a
linear functional, 196
solution of a differential
equation, 239
solution of an initial
value problem, 239
space of:
bounded complex
sequences, 79
bounded real sequences, 79
finitely non-zero
sequences, 79
linear transformations, 104
real-valued continuous
functions, 80
span, 86
spectral theorem, 226,455,457
spectrum, 164,439
sphere, 275
spherical neighborhood, 275
square matrix, 132
state transition matrix, 247-255
steepest descent, 472
strictly positive, 429
strong convergence, 373
subalgebra, 105
subcovering, 299
subdomain, 52
subfield, 52
subgroup, 41
subgroup generated
by a set, 43
submatrix, 147
subring, 52
subring generated by
a set, 53
subsemigroup, 40
subsemigroup generated
by a set, 41
subsequence, 287
subset, 3
subsystem, 40,46
successive
approximations, 315, 324-328
sum of:
elements, 46
linear operators, 409
linear transformations, 104
matrices, 138
sets, 82
vectors, 76
surjective, 14, 100
Sylvester's theorem, 199
symmetric difference
of sets, 7
symmetric matrix, 196, 226
symmetric part of a
linear functional, 196
symmetric relation, 26
system of differential
equations, 240, 255-260
T
ternary operation, 26
Tj-spaces, 323
topological space, 31
topological structure, 31
topology, 280, 318,322-323
totally bounded, 299
T′, 421
trace of a matrix, 169
transformation, 13
transformation group, 45
transitive relation, 26
transpose of a linear
transformation, 113,420
transpose of a matrix, 133
transpose of a vector, 125
triangle inequality, 208, 264, 344
triangular matrix, 176
trivial ring, 48
trivial solution, 245
trivial subring, 53
truncation operator, 439
T*, 422
Tᵀ, 113
U
unbounded linear
functional, 356
unbounded metric space, 265
uncountable set, 23
uniform convergence, 313
uniformly continuous, 308
union of sets, 5
unit, 37
unitary operator, 431
unitary space, 205
unit of a ring, 47
unit vector, 209
unordered pair of
elements, 9
upper triangular matrix, 176
usual metric for R*, 266,320
usual metric on R, 265
usual metric on Rⁿ, 271
V
vacuous set, 3
Vandermonde matrix, 260
variance, 398
vector, 75
vector addition, 75
vector space, 30,55, 76
vector space of n-tuples
over F, 56
vector space over a field, 76
vector subspace, 59
Venn diagram, 8
void set, 3
Volterra equation, 327
Volterra integral
equation, 97
W
weak convergence, 373
weakly continuous, 375
weak* compact, 375
weak-star convergence, 373
Weierstrass approximation
theorem, 285
Wronskian, 256-259
X Y Z
Xᶠ, 357
X*, 357-358
zero:
polynomial, 70
transformation, 104,409
vector, 76, 77
Zorn's lemma, 390
