
Introduction to Analysis

in One Variable

Michael E. Taylor

AMS Open Math Notes: Works in Progress; Reference # OMN:201701.110664; 2017-01-10 15:54:33

Contents

Chapter I. Numbers
1. Peano arithmetic
2. The integers
3. Prime factorization and the fundamental theorem of arithmetic
4. The rational numbers
5. Sequences
6. The real numbers
7. Irrational numbers
8. Cardinal numbers
9. Metric properties of R
10. Complex numbers

Chapter II. Spaces


1. Euclidean spaces
2. Metric spaces
3. Compactness
A. The Baire category theorem

Chapter III. Functions


1. Continuous functions
2. Sequences and series of functions
3. Power series
4. Spaces of functions

Chapter IV. Calculus


1. The derivative
2. The integral
3. Power series
4. Curves and arc length
5. Exponential and trigonometric functions
6. Unbounded integrable functions
A. The fundamental theorem of algebra
B. $\pi^2$ is irrational
C. More on $(1 - x)^b$
D. Archimedes’ approximation of π
E. Computing π using arctangents
F. Power series of tan x
G. Abel’s power series theorem


Chapter V. Further topics in analysis


1. Convolutions and bump functions
2. The Weierstrass approximation theorem
3. The Stone-Weierstrass theorem
4. Fourier series
5. Newton’s method
A. Inner product spaces


Introduction

This is a text for students who have had a three-course calculus sequence, and who
are ready for a course that explores the logical structure of this area of mathematics,
which forms the backbone of analysis. This is intended for a one-semester course. An
accompanying text, Introduction to Analysis in Several Variables [T2], can be used in the
second semester of a one-year sequence.
The main goal of Chapter 1 is to develop the real number system. We start with a
treatment of the “natural numbers” N, obtaining its structure from a short list of axioms,
the primary one being the principle of induction. Then we construct the set Z of all integers,
which has a richer algebraic structure, and proceed to construct the set Q of rational
numbers, which are quotients of integers (with a nonzero denominator). After discussing
infinite sequences of rational numbers, including the notions of convergent sequences and
Cauchy sequences, we construct the set R of real numbers, as ideal limits of Cauchy
sequences of rational numbers. At the heart of this chapter is the proof that R is complete,
i.e., Cauchy sequences of real numbers always converge to a limit in R. This provides the
key to studying other metric properties of R, such as the compactness of (nonempty) closed,
bounded subsets. We end Chapter 1 with a section on the set C of complex numbers. Many
introductions to analysis shy away from the use of complex numbers. My feeling is that
this forecloses the study of way too many beautiful results that can be appreciated at
this level. This is not a course in complex analysis. That is for another course, and with
another text (such as [T3]). However, I hope that various topics covered in this text make
it clear that the use of complex numbers in analysis actually simplifies the treatment of a
number of key concepts, while extending their scope in very useful ways.
In fact, the structure of analysis is revealed more clearly by moving beyond R and C,
and we undertake this in Chapter 2. We start with a treatment of n-dimensional Euclidean
space, $R^n$. There is a notion of Euclidean distance between two points in $R^n$, leading to
notions of convergence and of Cauchy sequences. The spaces $R^n$ are all complete, and
again closed bounded sets are compact. Going through this sets one up to appreciate a
further generalization, the notion of a metric space, introduced in §2. This is followed by
§3, exploring the notion of compactness in a metric space setting.
Chapter 3 deals with functions. It starts in a general setting, of functions from one
metric space to another. We then treat infinite sequences of functions, and study the
notion of convergence, particularly of uniform convergence of a sequence of functions. We
move on to infinite series. In such a case, we take the target space to be $R^n$, so we can
add functions. Section 3 treats power series. Here, we study series of the form

(1) $\sum_{k=0}^{\infty} a_k (z - z_0)^k$,

with $a_k$ ∈ C and z running over a disk in C. For results obtained in this section, regarding
the radius of convergence R and the continuity of the sum on $D_R(z_0) = \{z \in C : |z - z_0| < R\}$,
there is no extra difficulty in allowing $a_k$ and z to be complex, rather than insisting
they be real, and the extra level of generality will pay big dividends in Chapter 4. A final
section in Chapter 3 is devoted to spaces of functions, illustrating the utility of studying
spaces beyond the case of $R^n$.
Chapter 4 gets to the heart of the matter, a rigorous development of differential and
integral calculus. We define the derivative in §1, and prove the Mean Value Theorem,
making essential use of compactness of a closed, bounded interval and its consequences,
established in earlier chapters. This result has many important consequences, such as the
Inverse Function Theorem, and especially the Fundamental Theorem of Calculus, established
in §2, after the Riemann integral is introduced. In §3, we return to power series,
this time of the form

(2) $\sum_{k=0}^{\infty} a_k (t - t_0)^k$.

We require t and $t_0$ to be in R, but still allow $a_k$ ∈ C. Results on radius of convergence
R and continuity of the sum f(t) on $(t_0 - R, t_0 + R)$ follow from material in Chapter 3.
The essential new result in §3 of Chapter 4 is that one can obtain the derivative f′(t) by
differentiating the power series for f(t) term by term. In §4 we consider curves in $R^n$,
and obtain a formula for arc length for a smooth curve. We show that a smooth curve
with nonvanishing velocity can be parametrized by arc length. When this is applied to
the unit circle in $R^2$ centered at the origin, one is looking at the standard definition of the
trigonometric functions,

(3) C(t) = (cos t, sin t).

We provide a demonstration that

(4) C′(t) = (−sin t, cos t)

that is much shorter than what is usually presented in calculus texts. In §5 we move on to
exponential functions. We derive the power series for the function $e^t$, introduced to solve
the differential equation dx/dt = x. We then observe that with no extra work we get an
analogous power series for $e^{at}$, with derivative $a e^{at}$, and that this works for complex a as
well as for real a. It is a short step to realize that $e^{it}$ is a unit speed curve tracing out the
unit circle in C ≈ $R^2$, so comparison with (3) gives Euler’s formula

(5) $e^{it} = \cos t + i \sin t$.

That the derivative of $e^{it}$ is $i e^{it}$ provides a second proof of (4). Thus we have a unified
treatment of the exponential and trigonometric functions, carried out further in §5, with
details developed in numerous exercises. Section 6 extends the scope of the Riemann
integral to a class of unbounded functions. Chapter 4 has several appendices, one proving
the fundamental theorem of algebra, one showing that π is irrational, one exploring in
more detail than in §3 the power series for $(1 - x)^b$, and one describing an approximation
to π pioneered by Archimedes.
Chapter 5 treats further topics in analysis. If time permits, the instructor might cover
one or more of these at the end of the course. The topics center around approximating
functions, via various infinite sequences or series. Topics include approximating continuous
functions by polynomials, Fourier series, and Newton’s method for approximating the
inverse of a given function.


Chapter I
Numbers

Introduction

One foundation for a course in analysis is a solid understanding of the real number
system. Texts vary on just how to achieve this. Some take an axiomatic approach. In such an
approach, the set of real numbers is hypothesized to have a number of properties, including
various algebraic properties satisfied by addition and multiplication, order axioms, and,
crucially, the completeness property, sometimes expressed as the supremum property.
This is not the approach we will take. Rather, we will start with a small list of
axioms for the natural numbers (i.e., the positive integers), and then build the rest of the
edifice logically, obtaining the basic properties of the real number system, particularly the
completeness property, as theorems.
Sections 1–3 deal with the integers, starting in §1 with the set N of natural numbers.
The development proceeds from axioms of G. Peano. The main one is the principle of
mathematical induction. We deduce basic results about integer arithmetic from these
axioms. A high point is the fundamental theorem of arithmetic, presented in §3.
Section 4 discusses the set Q of rational numbers, deriving the basic algebraic properties
of these numbers from the results of §§1–3. Section 5 provides a bridge between §4 and §6.
It deals with infinite sequences, including convergent sequences and “Cauchy sequences.”
This prepares the way for §6, the main section of this chapter. Here we construct the set
R of real numbers, as “ideal limits” of rational numbers. We extend basic algebraic results
from Q to R. Furthermore, we establish the result that R is “complete,” i.e., Cauchy
sequences always have limits in R. Section 7 provides examples of irrational numbers, such
as √2, √3, √5, ...
Section 8 deals with cardinal numbers, an extension of the natural numbers N, that can
be used to “count” elements of a set, not necessarily finite. For example, N is a “countably”
infinite set, and so is Q. We show that R is “uncountable,” and hence much larger than N
or Q.
Section 9 returns to the real number line R, and establishes further metric properties of
R and various subsets, with an emphasis on the notion of compactness. The completeness
property established in §6 plays a crucial role here.
Section 10 introduces the set C of complex numbers and establishes basic algebraic and
metric properties of C. While some introductory treatments of analysis avoid complex
numbers, we embrace them, and consider their use in basic analysis too precious to omit.
Sections 9 and 10 also have material on continuous functions, defined on a subset of R
or C, respectively. These results give a taste of further results to be developed in Chapter
3, which will be essential to material in Chapters 4 and 5.


1. Peano arithmetic

In Peano arithmetic, we assume we have a set N (the natural numbers). We assume
given 0 ∉ N, and form Ñ = N ∪ {0}. We assume there is a map

(1.1) s : Ñ −→ N,

which is bijective. That is to say, for each k ∈ N, there is a j ∈ Ñ such that s(j) = k, so
s is surjective; and furthermore, if s(j) = s(j′) then j = j′, so s is injective. The map s
plays the role of “addition by 1,” as we will see below. The only other axiom of Peano
arithmetic is that the principle of mathematical induction holds. In other words, if S ⊂ Ñ
is a set with the properties

(1.2) 0 ∈ S, k ∈ S ⇒ s(k) ∈ S,

then S = Ñ.
Actually, applying the induction principle to S = {0} ∪ s(Ñ), we see that it suffices to
assume that s in (1.1) is injective; the induction principle ensures that it is surjective.
We define addition x + y, for x, y ∈ Ñ, inductively on y, by

(1.3) x + 0 = x, x + s(y) = s(x + y).

Next, we define multiplication x · y, inductively on y, by

(1.4) x · 0 = 0, x · s(y) = x · y + x.

We also define

(1.5) 1 = s(0).
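
The inductive definitions (1.3) and (1.4) can be tried out on a machine. The following is a minimal sketch, not part of the formal development: it assumes that elements of N ∪ {0} are modeled by nonnegative Python integers and that the successor map s is modeled by adding one. The helper name pred (the inverse of s on N) is introduced here for illustration only.

```python
# Hypothetical model: elements of N ∪ {0} are Python ints >= 0, and the
# successor map s is "add one". Addition and multiplication use only s
# and recursion on the second argument, mirroring (1.3) and (1.4).

def s(x: int) -> int:
    """Successor map s."""
    return x + 1

def pred(y: int) -> int:
    """Inverse of s on N; defined only for y != 0, since 0 is not in the image of s."""
    if y == 0:
        raise ValueError("0 is not in the image of s")
    return y - 1

def add(x: int, y: int) -> int:
    """x + 0 = x,  x + s(y) = s(x + y), as in (1.3)."""
    if y == 0:
        return x
    return s(add(x, pred(y)))

def mul(x: int, y: int) -> int:
    """x * 0 = 0,  x * s(y) = x * y + x, as in (1.4)."""
    if y == 0:
        return 0
    return add(mul(x, pred(y)), x)

ONE = s(0)  # definition (1.5)
```

For instance, add(2, 3) unwinds to s(s(s(2))), and add(x, ONE) agrees with s(x). Spot checks of this kind illustrate the identities proved below but do not replace the induction arguments.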

We now establish the basic laws of arithmetic.


Proposition 1.1. x + 1 = s(x).
Proof. x + s(0) = s(x + 0).
Proposition 1.2. 0 + x = x.
Proof. Use induction on x. First, 0 + 0 = 0. Now, assuming 0 + x = x, we have

0 + s(x) = s(0 + x) = s(x).


Proposition 1.3. s(y + x) = s(y) + x.

Proof. Use induction on x. First, s(y + 0) = s(y) = s(y) + 0. Next, we have

s(y + s(x)) = s(s(y + x)),


s(y) + s(x) = s(s(y) + x).

If s(y + x) = s(y) + x, the two right sides are equal, so the two left sides are equal,
completing the induction.

Proposition 1.4. x + y = y + x.

Proof. Use induction on y. The case y = 0 follows from Proposition 1.2. Now, assuming
x + y = y + x, for all x ∈ Ñ, we must show s(y) has the same property. In fact,

x + s(y) = s(x + y) = s(y + x),

and by Proposition 1.3 the last quantity is equal to s(y) + x.

Proposition 1.5. (x + y) + z = x + (y + z).

Proof. Use induction on z. First, (x + y) + 0 = x + y = x + (y + 0). Now, assuming
(x + y) + z = x + (y + z), for all x, y ∈ Ñ, we must show s(z) has the same property. In
fact,
(x + y) + s(z) = s((x + y) + z),
x + (y + s(z)) = x + s(y + z) = s(x + (y + z)),

and we perceive the desired identity.

Remark. Propositions 1.4 and 1.5 state the commutative and associative laws for addition.

We now establish some laws for multiplication.

Proposition 1.6. x · 1 = x.

Proof. We have
x · s(0) = x · 0 + x = 0 + x = x,

the last identity by Proposition 1.2.

Proposition 1.7. 0 · y = 0.

Proof. Use induction on y. First, 0 · 0 = 0. Next, assuming 0 · y = 0, we have 0 · s(y) =
0 · y + 0 = 0 + 0 = 0.


Proposition 1.8. s(x) · y = x · y + y.


Proof. Use induction on y. First, s(x) · 0 = 0, while x · 0 + 0 = 0 + 0 = 0. Next, assuming
s(x) · y = x · y + y, for all x, we must show that s(y) has this property. In fact,
s(x) · s(y) = s(x) · y + s(x) = (x · y + y) + (x + 1),
x · s(y) + s(y) = (x · y + x) + (y + 1),
and the identity then follows via the commutative and associative laws of addition,
Propositions 1.4 and 1.5.
Proposition 1.9. x · y = y · x.
Proof. Use induction on y. First, x · 0 = 0 = 0 · x, the latter identity by Proposition 1.7.
Next, assuming x · y = y · x for all x ∈ Ñ, we must show that s(y) has the same property.
In fact,
x · s(y) = x · y + x = y · x + x,
s(y) · x = y · x + x,
the last identity by Proposition 1.8.
Proposition 1.10. (x + y) · z = x · z + y · z.
Proof. Use induction on z. First, the identity clearly holds for z = 0. Next, assuming it
holds for z (for all x, y ∈ Ñ), we must show it holds for s(z). In fact,
(x + y) · s(z) = (x + y) · z + (x + y) = (x · z + y · z) + (x + y),
x · s(z) + y · s(z) = (x · z + x) + (y · z + y),
and the desired identity follows from the commutative and associative laws of addition.
Proposition 1.11. (x · y) · z = x · (y · z).
Proof. Use induction on z. First, the identity clearly holds for z = 0. Next, assuming it
holds for z (for all x, y ∈ Ñ), we have
(x · y) · s(z) = (x · y) · z + x · y,
while
x · (y · s(z)) = x · (y · z + y) = x · (y · z) + x · y,
the last identity by Proposition 1.10 (and 1.9). These observations yield the desired
identity.

Remark. Propositions 1.9 and 1.11 state the commutative and associative laws for
multiplication. Proposition 1.10 is the distributive law. Combined with Proposition 1.9, it also
yields
z · (x + y) = z · x + z · y,
used above.

We next demonstrate the cancellation law of addition:


Proposition 1.12. Given x, y, z ∈ Ñ,
(1.6) x + y = z + y =⇒ x = z.

Proof. Use induction on y. If y = 0, (1.6) obviously holds. Assuming (1.6) holds for y, we
must show that
(1.7) x + s(y) = z + s(y)
implies x = z. In fact, (1.7) is equivalent to s(x + y) = s(z + y). Since the map s is assumed
to be one-to-one, this implies that x + y = z + y, so we are done.
We next define an order relation on Ñ. Given x, y ∈ Ñ, we say

(1.8) x < y ⇐⇒ y = x + u, for some u ∈ N.


Similarly there is a definition of x ≤ y. We have x ≤ y if and only if y ∈ $R_x$, where

(1.9) $R_x$ = {x + u : u ∈ Ñ}.
Other notation is
y > x ⇐⇒ x < y, y ≥ x ⇐⇒ x ≤ y.
Proposition 1.13. If x ≤ y and y ≤ x then x = y.
Proof. The hypotheses imply

(1.10) y = x + u, x = y + v, u, v ∈ Ñ.
Hence x = x + u + v, so, by Proposition 1.12, u + v = 0. Now, if v ≠ 0, then v = s(w), so
u + v = s(u + w) ∈ N. Thus v = 0, and u = 0.
Proposition 1.14. Given x, y ∈ Ñ, either
(1.11) x < y, or x = y, or y < x,
and no two can hold.
Proof. That no two of (1.11) can hold follows from Proposition 1.13. It remains to show
that one must hold. Take y ∈ Ñ. We will establish (1.11) by induction on x. Clearly (1.11)
holds for x = 0. We need to show that if (1.11) holds for a given x ∈ Ñ, then either
(1.12) s(x) < y, or s(x) = y, or y < s(x).
Consider the three possibilities in (1.11). If either y = x or y < x, then clearly y < s(x) =
x + 1. On the other hand, if x < y, we can use the implication
(1.12A) x < y =⇒ s(x) ≤ y
to complete the proof of (1.12). See Lemma 1.17 for a proof of (1.12A).
We can now establish the cancellation law for multiplication.


Proposition 1.15. Given x, y, z ∈ Ñ,
(1.13) x · y = x · z, x ≠ 0 =⇒ y = z.

Proof. If y ≠ z, then either y < z or z < y. Suppose y < z, i.e., z = y + u, u ∈ N. Then
the hypotheses of (1.13) imply
x · y = x · y + x · u, x ≠ 0,
hence, by Proposition 1.12,
(1.14) x · u = 0, x ≠ 0.
We thus need to show that (1.14) implies u = 0. In fact, if not, then we can write u = s(w),
and x = s(a), with w, a ∈ Ñ, and we have
(1.15) x · u = x · w + s(a) = s(x · w + a) ∈ N.
This contradicts (1.14), so we are done.

Remark. Note that (1.15) implies


(1.16) x, y ∈ N =⇒ x · y ∈ N.
We next establish the following variant of the principle of induction, called the
well-ordering property of Ñ.
Proposition 1.16. If T ⊂ Ñ is nonempty, then T contains a smallest element.
Proof. Suppose T contains no smallest element. Then 0 ∉ T. Let

(1.17) S = {x ∈ Ñ : x < y, ∀ y ∈ T}.
Then 0 ∈ S. We claim that
(1.18) x ∈ S =⇒ s(x) ∈ S.
Indeed, suppose x ∈ S, so x < y for all y ∈ T. If s(x) ∉ S, we have s(x) ≥ $y_0$ for some
$y_0$ ∈ T. On the other hand (see Lemma 1.17 below),
(1.19) x < $y_0$ =⇒ s(x) ≤ $y_0$.
Thus, by Proposition 1.13,
(1.20) s(x) = $y_0$.
It follows that $y_0$ must be the smallest element of T. Thus, if T has no smallest element,
(1.18) must hold. The induction principle then implies that S = Ñ, which implies T is
empty.
Here is the result behind (1.12A) and (1.19).


Lemma 1.17. Given x, y ∈ Ñ,

(1.21) x < y =⇒ s(x) ≤ y.

Proof. Indeed, x < y ⇒ y = x + u with u ∈ N, hence u = s(v), so

y = x + s(v) = s(x + v) = s(x) + v,

hence s(x) ≤ y.

Remark. Proposition 1.16 has a converse, namely, the assertion

(1.22) T ⊂ Ñ nonempty =⇒ T contains a smallest element

implies the principle of induction:

(1.23) (0 ∈ S ⊂ Ñ, k ∈ S ⇒ s(k) ∈ S) =⇒ S = Ñ.

To see this, suppose S satisfies the hypotheses of (1.23), and let T = Ñ \ S. If S ≠ Ñ, then
T is nonempty, so (1.22) implies T has a smallest element, say $x_1$. Since 0 ∈ S, $x_1$ ∈ N,
so $x_1 = s(x_0)$, and we must have

(1.24) $x_0$ ∈ S, s($x_0$) ∈ T = Ñ \ S,

contradicting the hypotheses of (1.23).

Exercises
Given n ∈ N, we define $\sum_{k=1}^{n} a_k$ inductively, as follows.

(1.25) $\sum_{k=1}^{1} a_k = a_1, \qquad \sum_{k=1}^{n+1} a_k = \Bigl(\sum_{k=1}^{n} a_k\Bigr) + a_{n+1}.$

Use the principle of induction to establish the following identities.

(1) $2 \sum_{k=1}^{n} k = n(n + 1).$

(2) $6 \sum_{k=1}^{n} k^2 = n(n + 1)(2n + 1).$


(3) $(a - 1) \sum_{k=1}^{n} a^k = a^{n+1} - a$, if a ≠ 1.
k=1

In (3), we define $a^n$ inductively by

(1.26) $a^1 = a, \qquad a^{n+1} = a^n \cdot a.$

We also set $a^0 = 1$ if a ∈ N, and $\sum_{k=0}^{n} a^k = a^0 + \sum_{k=1}^{n} a^k$. Verify that

(4) $(a - 1) \sum_{k=0}^{n} a^k = a^{n+1} - 1$, if a ≠ 1.

5. Given k ∈ N, show that

$2^k \geq 2k$,
with strict inequality for k > 2.

6. Show that, for x, x′, y, y′ ∈ Ñ,

x < x′, y ≤ y′ =⇒ x + y < x′ + y′, and

x · y < x′ · y′, if also y′ > 0.

7. Show that the following variant of the principle of induction holds:


(1 ∈ S ⊂ N, k ∈ S ⇒ s(k) ∈ S) =⇒ S = N.

Hint. Consider {0} ∪ S ⊂ Ñ.
More generally, with $R_x$ as in (1.9), show that, for x ∈ N,
(x ∈ S ⊂ $R_x$, k ∈ S ⇒ s(k) ∈ S) =⇒ S = $R_x$.

Hint. Use induction on x.

8. With $a^n$ defined inductively as in (1.26) for a ∈ Ñ, n ∈ N, show that if also m ∈ N,

$a^m a^n = a^{m+n}, \qquad (a^m)^n = a^{mn}.$

Hint. Use induction on n.
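
Before attempting the induction proofs in Exercises (1)-(4), it can help to confirm numerically that the formulas have been read correctly. The sketch below is only a sanity check under stated assumptions: the function names power, partial_sum, and identities_hold are hypothetical, and Python's built-in integers stand in for the numbers constructed in this section. A finite check of this kind is not a proof.

```python
def power(a: int, n: int) -> int:
    """a^n as in (1.26): a^1 = a, a^(n+1) = a^n * a, together with a^0 = 1."""
    result = 1
    for _ in range(n):
        result = result * a
    return result

def partial_sum(term, n: int, start: int = 1) -> int:
    """Sum of term(k) for k = start, ..., n, accumulated one term at a time as in (1.25)."""
    total = 0
    for k in range(start, n + 1):
        total = total + term(k)
    return total

def identities_hold(n: int, a: int) -> bool:
    """Check identities (1)-(4) for one choice of n >= 1 and a != 1."""
    id1 = 2 * partial_sum(lambda k: k, n) == n * (n + 1)
    id2 = 6 * partial_sum(lambda k: k * k, n) == n * (n + 1) * (2 * n + 1)
    id3 = (a - 1) * partial_sum(lambda k: power(a, k), n) == power(a, n + 1) - a
    id4 = (a - 1) * partial_sum(lambda k: power(a, k), n, start=0) == power(a, n + 1) - 1
    return id1 and id2 and id3 and id4
```

A call such as identities_hold(10, 3) tests all four identities at once for n = 10 and a = 3.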


2. The integers

An integer is thought of as having the form x − a, with x, a ∈ Ñ. To be more formal, we
will define an element of Z as an equivalence class of ordered pairs (x, a), x, a ∈ Ñ, where
we define

(2.1) (x, a) ∼ (y, b) ⇐⇒ x + b = y + a.

We claim (2.1) is an equivalence relation. In general, an equivalence relation on a set S is
a specification s ∼ t for certain s, t ∈ S, which satisfies the following three conditions.

(a) Reflexive. s ∼ s, ∀ s ∈ S.
(b) Symmetric. s ∼ t ⇐⇒ t ∼ s.
(c) Transitive. s ∼ t, t ∼ u =⇒ s ∼ u.

We will encounter various equivalence relations in this and subsequent sections. Generally,
(a) and (b) are quite easy to verify, and we will be content with verifying (c).
Proposition 2.1. The relation (2.1) is an equivalence relation.
Proof. We need to check that

(2.2) (x, a) ∼ (y, b), (y, b) ∼ (z, c) =⇒ (x, a) ∼ (z, c),

i.e., that, for x, y, z, a, b, c ∈ Ñ,

(2.3) x + b = y + a, y + c = z + b =⇒ x + c = z + a.

In fact, the hypotheses of (2.3), and the results of §1, imply

(x + c) + (y + b) = (z + a) + (y + b),

and the conclusion of (2.3) then follows from the cancellation property, Proposition 1.12.
Let us denote the equivalence class containing (x, a) by [(x, a)]. We then define addition
and multiplication in Z to satisfy

(2.4) [(x, a)] + [(y, b)] = [(x, a) + (y, b)], [(x, a)] · [(y, b)] = [(x, a) · (y, b)],
(x, a) + (y, b) = (x + y, a + b), (x, a) · (y, b) = (xy + ab, ay + xb).

To see that these operations are well defined, we need:


Proposition 2.2. If (x, a) ∼ (x′, a′) and (y, b) ∼ (y′, b′), then

(2.5) (x, a) + (y, b) ∼ (x′, a′) + (y′, b′),

and

(2.6) (x, a) · (y, b) ∼ (x′, a′) · (y′, b′).

Proof. The hypotheses say

(2.7) x + a′ = x′ + a, y + b′ = y′ + b.

The conclusions follow from results of §1. In more detail, adding the two identities in (2.7)
gives
x + a′ + y + b′ = x′ + a + y′ + b,
and rearranging, using the commutative and associative laws of addition, yields

(x + y) + (a′ + b′) = (x′ + y′) + (a + b),

implying (2.5). The task of proving (2.6) is simplified by going through the intermediate
step

(2.8) (x, a) · (y, b) ∼ (x′, a′) · (y, b).

If x′ > x, so x′ = x + u, u ∈ N, then also a′ = a + u, and our task is to prove

(xy + ab, ay + xb) ∼ (xy + uy + ab + ub, ay + uy + xb + ub),

which is readily done. Having (2.8), we apply similar reasoning to get

(x′, a′) · (y, b) ∼ (x′, a′) · (y′, b′),

and then (2.6) follows by transitivity.
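
The construction of Z from ordered pairs can be made concrete. The sketch below is illustrative only: it assumes a pair (x, a) of nonnegative Python integers stands for the class [(x, a)], and the names equivalent, add_pairs, and mul_pairs are hypothetical. It spot-checks the well-definedness statements (2.5) and (2.6) on one choice of representatives, which is of course weaker than the proof above.

```python
from typing import Tuple

Pair = Tuple[int, int]  # (x, a) with x, a >= 0, thought of as "x - a"

def equivalent(p: Pair, q: Pair) -> bool:
    """(x, a) ~ (y, b)  iff  x + b = y + a, as in (2.1)."""
    (x, a), (y, b) = p, q
    return x + b == y + a

def add_pairs(p: Pair, q: Pair) -> Pair:
    """(x, a) + (y, b) = (x + y, a + b), as in (2.4)."""
    (x, a), (y, b) = p, q
    return (x + y, a + b)

def mul_pairs(p: Pair, q: Pair) -> Pair:
    """(x, a) * (y, b) = (xy + ab, ay + xb), as in (2.4)."""
    (x, a), (y, b) = p, q
    return (x * y + a * b, a * y + x * b)

# Two representatives of the integer -2 and two representatives of 3.
p, p2 = (1, 3), (5, 7)
q, q2 = (4, 1), (9, 6)

assert equivalent(p, p2) and equivalent(q, q2)
# Equivalent inputs give equivalent outputs, as (2.5) and (2.6) assert.
assert equivalent(add_pairs(p, q), add_pairs(p2, q2))
assert equivalent(mul_pairs(p, q), mul_pairs(p2, q2))
```

Here mul_pairs(p, q) returns (7, 13) and mul_pairs(p2, q2) returns (87, 93); both represent −6, in agreement with (2.6).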


Similarly, it is routine to verify the basic commutative, associative, etc. laws
incorporated in the next proposition. To formulate the results, set

(2.9) m = [(x, a)], n = [(y, b)], k = [(z, c)] ∈ Z.

Also, define

(2.10) 0 = [(0, 0)], 1 = [(1, 0)],

and

(2.11) −m = [(a, x)].


Proposition 2.3. We have

(2.12)
m + n = n + m,
(m + n) + k = m + (n + k),
m + 0 = m,
m + (−m) = 0,
mn = nm,
m(nk) = (mn)k,
m · 1 = m,
m · 0 = 0,
m · (−1) = −m,
m · (n + k) = m · n + m · k.

To give an example of a demonstration of these results, the identity mn = nm is
equivalent to
(xy + ab, ay + xb) ∼ (yx + ba, bx + ya).
In fact, commutative laws for addition and multiplication in Ñ imply xy + ab = yx + ba
and ay + xb = bx + ya. Verification of the other identities in (2.12) is left to the reader.
We next establish the cancellation law for addition in Z.
Proposition 2.4. Given m, n, k ∈ Z,

(2.13) m + n = k + n =⇒ m = k.

Proof. We give two proofs. For one, we can add −n to both sides and use the results of
Proposition 2.3. Alternatively, we can write the hypotheses of (2.13) as

x+y+c+b=z+y+a+b

and use Proposition 1.12 to deduce that x + c = z + a.


Note that it is reasonable to set

(2.14) m − n = m + (−n).

This defines subtraction on Z.


There is a natural injection

(2.15) N ↪ Z, x ↦ [(x, 0)],

whose image we identify with N. Note that the map (2.15) preserves addition and
multiplication. There is also an injection x ↦ [(0, x)], whose image we identify with −N.


Proposition 2.5. We have a disjoint union:

(2.16) Z = N ∪ {0} ∪ (−N).

Proof. Suppose m ∈ Z; write m = [(x, a)]. By Proposition 1.14, either

a < x, or x = a, or x < a.

In these three cases,

x = a + u, u ∈ N, or x = a, or a = x + v, v ∈ N.

Then, either

(x, a) ∼ (u, 0), or (x, a) ∼ (0, 0), or (x, a) ∼ (0, v).
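
The three cases in this proof amount to choosing a canonical representative for each class. A minimal sketch follows, assuming pairs of nonnegative Python integers as representatives; the name normalize is hypothetical, and Python's subtraction is used as a shortcut for "the u with x = a + u".

```python
def normalize(pair):
    """Return the representative (u, 0), (0, 0), or (0, v) equivalent to (x, a)."""
    x, a = pair
    if a < x:
        return (x - a, 0)   # x = a + u with u in N: an element of N
    if x == a:
        return (0, 0)       # the integer 0
    return (0, a - x)       # a = x + v with v in N: an element of -N
```

Two pairs represent the same integer exactly when they have the same normal form, so the disjoint union (2.16) can be read off from which slot of the normal form is nonzero.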

We define an order on Z by:

(2.17) m < n ⇐⇒ n − m ∈ N.

We then have:
Corollary 2.6. Given m, n ∈ Z, then either

(2.18) m < n, or m = n, or n < m,

and no two can hold.


The map (2.15) is seen to preserve order relations.
Another consequence of (2.16) is the following.
Proposition 2.7. If m, n ∈ Z and m · n = 0, then either m = 0 or n = 0.
Proof. Suppose m ≠ 0 and n ≠ 0. We have four cases:

m > 0, n > 0 =⇒ mn > 0,
m < 0, n < 0 =⇒ mn = (−m)(−n) > 0,
m > 0, n < 0 =⇒ mn = −m(−n) < 0,
m < 0, n > 0 =⇒ mn = −(−m)n < 0,

the first by (1.16), and the rest with the help of Exercise 3 below. This finishes the proof.
Using Proposition 2.7, we have the following cancellation law for multiplication in Z.


Proposition 2.8. Given m, n, k ∈ Z,

(2.19) mk = nk, k ≠ 0 =⇒ m = n.

Proof. First, mk = nk ⇒ mk − nk = 0. Now

mk − nk = (m − n)k.

See Exercise 3 below. Hence

mk = nk =⇒ (m − n)k = 0.

Given k ≠ 0, Proposition 2.7 implies m − n = 0. Hence m = n.

Exercises

1. Verify Proposition 2.3.


2. We define $\sum_{k=1}^{n} a_k$ as in (1.25), this time with $a_k$ ∈ Z. We also define $a^k$ inductively
as in Exercise (3) of §1, with $a^0 = 1$ if a ≠ 0. Use the principle of induction to establish
the identity
$\sum_{k=1}^{n} (-1)^{k-1} k = \begin{cases} -m & \text{if } n = 2m, \\ m + 1 & \text{if } n = 2m + 1. \end{cases}$

3. Show that, if m, n, k ∈ Z,

−(nk) = (−n)k, and mk − nk = (m − n)k.

Hint. For the first part, use Proposition 2.3 to show that nk + (−n)k = 0. Alternatively,
compare (a, x) · (y, b) with (x, a) · (y, b).

4. Deduce the following from Proposition 1.16. Let S ⊂ Z be nonempty and assume there
exists m ∈ Z such that m < n for all n ∈ S. Then S has a smallest element.
Hint. Given such m, let S̃ = {(−m) + n : n ∈ S}. Show that S̃ ⊂ N and deduce that S̃
has a smallest element.

5. Show that Z has no smallest element.


3. Prime factorization and the fundamental theorem of arithmetic

Let x ∈ N. We say x is composite if one can write

(3.1) x = ab, a, b ∈ N,

with neither a nor b equal to 1. If x ≠ 1 is not composite, it is said to be prime. If (3.1)
holds, we say a|x (and that b|x), or that a is a divisor of x. Given x ∈ N, x > 1, set

(3.2) $D_x$ = {a ∈ N : a|x, a > 1}.

Thus x ∈ $D_x$, so $D_x$ is non-empty. By Proposition 1.16, $D_x$ contains a smallest element,
say $p_1$. Clearly $p_1$ is a prime. Set

(3.3) $x = p_1 x_1$, $x_1$ ∈ N, $x_1 < x$.

The same construction applies to $x_1$, which is > 1 unless x = $p_1$. Hence we have either
x = $p_1$ or

(3.4) $x_1 = p_2 x_2$, $p_2$ prime, $x_2 < x_1$.

Continue this process, passing from $x_j$ to $x_{j+1}$ as long as $x_j$ is not prime. The set S of
such $x_j$ ∈ N has a smallest element, say $x_{\mu-1} = p_\mu$, and we have

(3.5) $x = p_1 p_2 \cdots p_\mu$, $p_j$ prime.
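
The construction in (3.2)-(3.5) is effectively an algorithm: repeatedly split off the smallest divisor greater than 1. The following sketch follows that description, with Python integers standing in for elements of N; the function names smallest_divisor and prime_factors are hypothetical, and no attempt is made at efficiency.

```python
def smallest_divisor(x: int) -> int:
    """Smallest element of D_x = {a in N : a | x, a > 1}, for x > 1."""
    a = 2
    while x % a != 0:
        a += 1
    return a

def prime_factors(x: int) -> list:
    """Peel off p_1, p_2, ... as in (3.3)-(3.5); the primes come out in nondecreasing order."""
    if x <= 1:
        raise ValueError("x must be an integer greater than 1")
    factors = []
    while x > 1:
        p = smallest_divisor(x)  # necessarily prime, as observed after (3.2)
        factors.append(p)
        x //= p                  # pass from x_j to x_{j+1}
    return factors
```

For example, prime_factors(360) returns [2, 2, 2, 3, 3, 5]. The loop terminates because each quotient is strictly smaller than the previous one, which is the role played by the well-ordering property in the text.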

This is part of the Fundamental Theorem of Arithmetic:


Theorem 3.1. Given x ∈ N, x ≠ 1, there is a unique product expansion

(3.6) $x = p_1 \cdots p_\mu$,

where $p_1 \leq \cdots \leq p_\mu$ are primes.


Only uniqueness remains to be established. This follows from:
Proposition 3.2. Assume a, b ∈ N, and p ∈ N is prime. Then

(3.7) p|ab =⇒ p|a or p|b.

We will deduce this from:


Proposition 3.3. If p ∈ N is prime and a ∈ N is not a multiple of p, or more generally
if p, a ∈ N have no common divisors > 1, then there exist m, n ∈ Z such that

(3.8) ma + np = 1.

Proof of Proposition 3.2. Assume p is a prime which does not divide a. Pick m, n such
that (3.8) holds. Now, multiply (3.8) by b, to get

mab + npb = b.

Thus, if p|ab, i.e., ab = pk, we have

p(mk + nb) = b,

so p|b, as desired.
To prove Proposition 3.3, let us set

(3.9) Γ = {ma + np : m, n ∈ Z}.

Clearly Γ satisfies the following criterion:


Definition. A nonempty subset Γ ⊂ Z is a subgroup of Z provided

(3.10) a, b ∈ Γ =⇒ a + b, a − b ∈ Γ.

Proposition 3.4. If Γ ⊂ Z is a subgroup, then either Γ = {0}, or there exists x ∈ N such
that

(3.11) Γ = {mx : m ∈ Z}.

Proof. Note that n ∈ Γ ⇔ −n ∈ Γ, so, with Σ = Γ ∩ N, we have a disjoint union

Γ = Σ ∪ {0} ∪ (−Σ).

If Σ ≠ ∅, let x be its smallest element. Then we want to establish (3.11), so set $\Gamma_0$ = {mx :
m ∈ Z}. Clearly $\Gamma_0$ ⊂ Γ. Similarly, set $\Sigma_0$ = {mx : m ∈ N} = $\Gamma_0$ ∩ N. We want to show
that $\Sigma_0$ = Σ. If y ∈ Σ \ $\Sigma_0$, then we can pick $m_0$ ∈ N such that

$m_0 x < y < (m_0 + 1)x$,

and hence
$y - m_0 x$ ∈ Σ
is smaller than x. This contradiction proves Proposition 3.4.


Proof of Proposition 3.3. Taking Γ as in (3.9), pick x ∈ N such that (3.11) holds. Since
a ∈ Γ and p ∈ Γ, we have
a = $m_0 x$, p = $m_1 x$
for some $m_j$ ∈ Z. The assumption that a and p have no common divisor > 1 implies x = 1.
We conclude that 1 ∈ Γ, so (3.8) holds.
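
The proof above shows that m and n exist but does not compute them. One standard way to find such integers is the extended Euclidean algorithm, which is a different technique from the subgroup argument used in the text. The sketch below is offered under that caveat; the function name bezout is hypothetical.

```python
def bezout(a: int, p: int):
    """Return (g, m, n) with m*a + n*p == g, where g is the greatest common divisor of a and p."""
    # Invariants maintained by the loop:
    #   old_r == old_m * a + old_n * p   and   r == m * a + n * p.
    old_r, r = a, p
    old_m, m = 1, 0
    old_n, n = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_m, m = m, old_m - q * m
        old_n, n = n, old_n - q * n
    return old_r, old_m, old_n
```

For a = 10 and p = 7 this returns (1, -2, 3), and indeed (−2) · 10 + 3 · 7 = 1, an instance of (3.8). When a and p have no common divisor > 1, the first component is 1, matching the conclusion x = 1 in the proof.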

Exercises

1. Prove that there are infinitely many primes.


Hint. If {$p_1, \ldots, p_m$} is a complete list of primes, consider

$x = p_1 \cdots p_m + 1$.

What are its prime factors?

2. Referring to (3.10), show that a nonempty subset Γ ⊂ Z is a subgroup of Z provided

(3.12) a, b ∈ Γ =⇒ a − b ∈ Γ.

Hint. a ∈ Γ ⇒ 0 = a − a ∈ Γ ⇒ −a = 0 − a ∈ Γ, given (3.12).

3. Let n ∈ N be a 12-digit integer. Show that if n is not prime, then it must be divisible
by a prime p < $10^6$.

4. Determine whether the following number is prime:

(3.13) 201367.

Hint. This is for the student who can use a computer.

5. Find the smallest prime larger than the number in (3.13). Hint. Same as above.
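
For Exercises 4 and 5, one possible computer approach is trial division, which rests on the observation behind Exercise 3: a composite n has a divisor d with 1 < d and d · d ≤ n. The sketch below is one simple option among many; the names is_prime and next_prime are hypothetical, and the answers to the exercises are deliberately not stated.

```python
def is_prime(n: int) -> bool:
    """Trial division: test every candidate divisor d with d * d <= n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def next_prime(n: int) -> int:
    """Smallest prime strictly larger than n."""
    candidate = n + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate
```

Calling is_prime and next_prime on the number in (3.13) addresses the two exercises. The search in next_prime terminates because there are infinitely many primes (Exercise 1).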


4. The rational numbers

A rational number is thought of as having the form m/n, with m, n ∈ Z, n ≠ 0. Thus, we
will define an element of Q as an equivalence class of ordered pairs m/n, m ∈ Z, n ∈ Z\{0},
where we define

(4.1) m/n ∼ a/b ⇐⇒ mb = an.

Proposition 4.1. This is an equivalence relation.


Proof. We need to check that

(4.2) m/n ∼ a/b, a/b ∼ c/d =⇒ m/n ∼ c/d,

i.e., that, for m, a, c ∈ Z, n, b, d ∈ Z \ {0},

(4.3) mb = an, ad = cb =⇒ md = cn.

Now the hypotheses of (4.3) imply (mb)(ad) = (an)(cb), hence

(md)(ab) = (cn)(ab).

We are assuming b ≠ 0. If also a ≠ 0, then ab ≠ 0, and the conclusion of (4.3) follows
from the cancellation property, Proposition 2.8. On the other hand, if a = 0, then m/n ∼
a/b ⇒ mb = 0 ⇒ m = 0 (since b ≠ 0), and similarly a/b ∼ c/d ⇒ cb = 0 ⇒ c = 0, so the
desired implication also holds in that case.
We will (temporarily) denote the equivalence class containing m/n by [m/n]. We then
define addition and multiplication in Q to satisfy

(4.4) [m/n] + [a/b] = [(m/n) + (a/b)], [m/n] · [a/b] = [(m/n) · (a/b)],
(m/n) + (a/b) = (mb + na)/(nb), (m/n) · (a/b) = (ma)/(nb).

To see that these operations are well defined, we need:


Proposition 4.2. If m/n ∼ m′/n′ and a/b ∼ a′/b′, then

(4.4A) (m/n) + (a/b) ∼ (m′/n′) + (a′/b′),

and

(4.4B) (m/n) · (a/b) ∼ (m′/n′) · (a′/b′).

Proof. The hypotheses say

(4.4C) mn′ = m′n, ab′ = a′b.

The conclusions follow from the results of §2. In more detail, multiplying the two identities
in (4.4C) yields
man′b′ = m′a′nb,

which implies (4.4B). To prove (4.4A), it is convenient to establish the intermediate step

(4.4D) (m/n) + (a/b) ∼ (m′/n′) + (a/b).

This is equivalent to
(mb + na)/(nb) ∼ (m′b + n′a)/(n′b),

hence to
(mb + na)n′b = (m′b + n′a)nb,

or to
mn′bb + nn′ab = m′nbb + n′nab.

This in turn follows readily from (4.4C). Having (4.4D), we can use a similar argument to
establish that
(m′/n′) + (a/b) ∼ (m′/n′) + (a′/b′),

and then (4.4A) follows by transitivity of ∼.
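
As a concrete illustration of Proposition 4.2, the following sketch represents a fraction m/n as a Python pair (m, n) of integers with n ≠ 0 and checks, on one example, that the operations in (4.4) respect the relation (4.1). The names equivalent, add and mul are illustrative assumptions, and a finite check of this kind is of course no substitute for the proof.

```python
def equivalent(p, q):
    """(4.1): m/n ~ a/b iff m*b == a*n, for p = (m, n), q = (a, b) with n, b != 0."""
    (m, n), (a, b) = p, q
    return m * b == a * n

def add(p, q):
    """(4.4): (m/n) + (a/b) = (mb + na)/(nb)."""
    (m, n), (a, b) = p, q
    return (m * b + n * a, n * b)

def mul(p, q):
    """(4.4): (m/n) * (a/b) = (ma)/(nb)."""
    (m, n), (a, b) = p, q
    return (m * a, n * b)

# Two representatives of 1/2 and two representatives of -2/3.
x, x_alt = (1, 2), (-3, -6)
y, y_alt = (-2, 3), (4, -6)
assert equivalent(x, x_alt) and equivalent(y, y_alt)

# (4.4A) and (4.4B): the results agree up to the equivalence relation.
assert equivalent(add(x, y), add(x_alt, y_alt))
assert equivalent(mul(x, y), mul(x_alt, y_alt))
```

Note that add(x, y) and add(x_alt, y_alt) are different pairs, (−1, 6) and (−6, 36); only their equivalence classes coincide, which is exactly what the proposition asserts.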


From now on, we drop the brackets, simply denoting the equivalence class of m/n by
m/n, and writing (4.1) as m/n = a/b. We also may denote an element of Q by a single
letter, e.g., x = m/n. There is an injection

(4.5) Z ↪ Q, m ↦ m/1,

whose image we identify with Z. This map preserves addition and multiplication. We
define

(4.6) −(m/n) = (−m)/n,

and, if x = m/n ≠ 0 (i.e., m ≠ 0 as well as n ≠ 0), we define

(4.7) x^{-1} = n/m.

The results stated in the following proposition are routine consequences of the results of
§2.


Proposition 4.3. Given x, y, z ∈ Q, we have

x + y = y + x,
(x + y) + z = x + (y + z),
x + 0 = x,
x + (−x) = 0,
x · y = y · x,
(x · y) · z = x · (y · z),
x · 1 = x,
x · 0 = 0,
x · (−1) = −x,
x · (y + z) = x · y + x · z.

Furthermore,
x ≠ 0 =⇒ x · x^{-1} = 1.

For example, if x = m/n, y = a/b with m, n, a, b ∈ Z, n, b ≠ 0, the identity x · y = y · x
is equivalent to (ma)/(nb) ∼ (am)/(bn). In fact, the identities ma = am and nb = bn
follow from Proposition 2.3. We leave the rest of Proposition 4.3 to the reader.
We also have cancellation laws:
Proposition 4.4. Given x, y, z ∈ Q,

(4.8) x + y = z + y =⇒ x = z.

Also,

(4.9) xy = zy, y ≠ 0 =⇒ x = z.

Proof. To get (4.8), add −y to both sides of x + y = z + y and use the results of Proposition
4.3. To get (4.9), multiply both sides of x · y = z · y by y^{-1}.
It is natural to define

(4.10) x − y = x + (−y),

and, if y ≠ 0,

(4.11) x/y = x · y^{-1}.

We now define the order relation on Q. Set

(4.12) Q+ = {m/n : mn > 0},


where, in (4.12), we use the order relation on Z, discussed in §2. This is well defined.
In fact, if m/n = m′/n′, then mn′ = m′n, hence (mn)(m′n′) = (mn′)^2, and therefore
mn > 0 ⇔ m′n′ > 0. Results of §2 imply that

(4.13) Q = Q+ ∪ {0} ∪ (−Q+)

is a disjoint union, where −Q+ = {−x : x ∈ Q+}. Also, clearly

(4.14) x, y ∈ Q+ =⇒ x + y, xy, x/y ∈ Q+.

We define

(4.15) x < y ⇐⇒ y − x ∈ Q+ ,

and we have, for any x, y ∈ Q, either

(4.16) x < y, or x = y, or y < x,

and no two can hold. The map (4.5) is seen to preserve the order relations. In light of
(4.14), we see that

(4.17) given x, y > 0, x < y ⇔ x/y < 1 ⇔ 1/y < 1/x.

As usual, we say x ≤ y provided either x < y or x = y. Similarly there are natural
definitions of x > y and of x ≥ y.
The following result implies that Q has the Archimedean property.
Proposition 4.5. Given x ∈ Q, there exists k ∈ Z such that

(4.18) k − 1 < x ≤ k.

Proof. It suffices to prove (4.18) assuming x ∈ Q+ ; otherwise, work with −x (and make a
few minor adjustments). Say x = m/n, m, n ∈ N. Then

S = {ℓ ∈ N : ℓ ≥ x}

contains m, hence is nonempty. By Proposition 1.16, S has a smallest element; call it k.


Then k ≥ x. We cannot have k − 1 ≥ x, for then k − 1 would belong to S. Hence (4.18)
holds.
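
The search in this proof can be carried out literally for a given rational x. The sketch below uses Python's fractions.Fraction for exact rational arithmetic; the function name least_integer_at_least is an illustrative assumption, and the branch for x ≤ 0 spells out one way of making the "minor adjustments" mentioned in the proof.

```python
from fractions import Fraction

def least_integer_at_least(x: Fraction) -> int:
    """Return the integer k with k - 1 < x <= k, as in (4.18)."""
    if x > 0:
        # Smallest element of S = {l in N : l >= x}, found by counting upward.
        k = 1
        while k < x:
            k += 1
        return k
    # x <= 0: find the integer j >= 0 with j <= -x < j + 1, then take k = -j.
    j = 0
    while j + 1 <= -x:
        j += 1
    return -j

for x in (Fraction(7, 3), Fraction(2), Fraction(0), Fraction(-7, 3)):
    k = least_integer_at_least(x)
    assert k - 1 < x <= k
```

The loop terminates because the set S is nonempty (it contains the numerator m when x = m/n with m, n ∈ N), which is the point of the proof.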

Exercises

1. Verify Proposition 4.3.


2. Look at the exercise set for §1, and verify (3) and (4) for a ∈ Q, a ≠ 1, n ∈ N.

3. Here is another route to (4) of §1, i.e.,

(4.19) Σ_{k=0}^{n} a^k = (a^{n+1} − 1)/(a − 1), a ≠ 1.

Denote the left side of (4.19) by S_n(a). Multiply by a and show that

a S_n(a) = S_n(a) + a^{n+1} − 1.

4. Given a ∈ Q, n ∈ N, define a^n as in Exercise 3 of §1. If a ≠ 0, set a^0 = 1 and
a^{-n} = (a^{-1})^n, with a^{-1} defined as in (4.7). Show that, if a, b ∈ Q \ 0,

a^{j+k} = a^j a^k, a^{jk} = (a^j)^k, (ab)^j = a^j b^j, ∀ j, k ∈ Z.

5. Prove the following variant of Proposition 4.5.


Proposition 4.5A. Given ε ∈ Q, ε > 0, there exists n ∈ N such that
ε > 1/n.

6. Work through the proof of the following.

Assertion. If x = m/n ∈ Q, then x^2 ≠ 2.
Hint. We can arrange that m and n have no common factors. Then
(m/n)^2 = 2 ⇒ m^2 = 2n^2 ⇒ m even (m = 2k)
⇒ 4k^2 = 2n^2
⇒ n^2 = 2k^2
⇒ n even.

Contradiction? (See Proposition 7.2 for a more general result.)

7. Given xj , yj ∈ Q, show that

x1 < x2 , y1 ≤ y2 =⇒ x1 + y1 < x2 + y2 .

Show that
0 < x1 < x2 , 0 < y1 ≤ y2 =⇒ x1 y1 < x2 y2 .


5. Sequences

In this section, we discuss infinite sequences. For now, we deal with sequences of rational
numbers, but we will not explicitly state this restriction below. In fact, once the set of
real numbers is constructed in §6, the results of this section will be seen to hold also for
sequences of real numbers.
Definition. A sequence (a_j) is said to converge to a limit a provided that, for any n ∈ N,
there exists K(n) such that

(5.1) j ≥ K(n) =⇒ |a_j − a| < 1/n.

We write a_j → a, or a = lim a_j, or perhaps a = lim_{j→∞} a_j.
Here, we define the absolute value |x| of x by

(5.2) |x| = x if x ≥ 0,
      |x| = −x if x < 0.

The absolute value function has various simple properties, such as |xy| = |x| · |y|, which
follow readily from the definition. One basic property is the triangle inequality:

(5.3) |x + y| ≤ |x| + |y|.

In fact, if either x and y are both positive or they are both negative, one has equality
in (5.3). If x and y have opposite signs, then |x + y| ≤ max(|x|, |y|), which in turn is
dominated by the right side of (5.3).
Proposition 5.1. If aj → a and bj → b, then

(5.4) aj + bj → a + b,

and

(5.5) aj bj → ab.

If furthermore, b_j ≠ 0 for all j and b ≠ 0, then

(5.6) aj /bj → a/b.

Proof. To see (5.4), we have, by (5.3),

(5.7) |(aj + bj ) − (a + b)| ≤ |aj − a| + |bj − b|.


To get (5.5), we have

(5.8) |a_j b_j − ab| = |(a_j b_j − a b_j) + (a b_j − ab)|
                     ≤ |b_j| · |a_j − a| + |a| · |b − b_j|.

The hypotheses imply |b_j| ≤ B, for some B, and hence the criterion for convergence is
readily verified. To get (5.6), we have

(5.9) |a_j/b_j − a/b| ≤ (1/(|b| · |b_j|)) { |b| · |a − a_j| + |a| · |b − b_j| }.

The hypotheses imply 1/|b_j| ≤ M for some M, so we also verify the criterion for
convergence in this case.
We next define the concept of a Cauchy sequence.
Definition. A sequence (a_j) is a Cauchy sequence provided that, for any n ∈ N, there
exists K(n) such that

(5.10) j, k ≥ K(n) =⇒ |a_j − a_k| ≤ 1/n.

It is clear that any convergent sequence is Cauchy. On the other hand, we have:
Proposition 5.2. Each Cauchy sequence is bounded.
Proof. Take n = 1 in the definition above. Thus, if (a_j) is Cauchy, there is a K such that
j, k ≥ K ⇒ |a_j − a_k| ≤ 1. Hence, j ≥ K ⇒ |a_j| ≤ |a_K| + 1, so, for all j,

|a_j| ≤ M, M = max(|a_1|, . . . , |a_{K−1}|, |a_K| + 1).

Now, the arguments proving Proposition 5.1 also establish:


Proposition 5.3. If (aj ) and (bj ) are Cauchy sequences, so are (aj + bj ) and (aj bj ).
Furthermore, if, for all j, |bj | ≥ c for some c > 0, then (aj /bj ) is Cauchy.
The following proposition is a bit deeper than the first three.
Proposition 5.4. If (a_j) is bounded, i.e., |a_j| ≤ M for all j, then it has a Cauchy
subsequence.
Proof. We may as well assume M ∈ N. Now, either a_j ∈ [0, M] for infinitely many j or
a_j ∈ [−M, 0] for infinitely many j. Let I_1 be any one of these two intervals containing a_j
for infinitely many j, and pick j(1) such that a_{j(1)} ∈ I_1. Write I_1 as the union of two closed
intervals, of equal length, sharing only the midpoint of I_1. Let I_2 be any one of them with
the property that a_j ∈ I_2 for infinitely many j, and pick j(2) > j(1) such that a_{j(2)} ∈ I_2.
Continue, picking I_ν ⊂ I_{ν−1} ⊂ · · · ⊂ I_1, of length M/2^{ν−1}, containing a_j for infinitely
many j, and picking j(ν) > j(ν − 1) > · · · > j(1) such that a_{j(ν)} ∈ I_ν. Setting b_ν = a_{j(ν)},
we see that (b_ν) is a Cauchy subsequence of (a_j), since, for all k ∈ N,
|b_{ν+k} − b_ν| ≤ M/2^{ν−1}.


Proposition 5.5. Each bounded monotone sequence (aj ) is Cauchy.


Proof. To say (aj ) is monotone is to say that either (aj ) is increasing, i.e., aj ≤ aj+1 for
all j, or that (aj ) is decreasing, i.e., aj ≥ aj+1 for all j. For the sake of argument, assume
(aj ) is increasing.
By Proposition 5.4, there is a subsequence (b_ν) = (a_{j(ν)}) which is Cauchy. Thus, given
n ∈ N, there exists K(n) such that

(5.11) µ, ν ≥ K(n) =⇒ |a_{j(ν)} − a_{j(µ)}| < 1/n.

Now, if ν_0 ≥ K(n) and k ≥ j ≥ j(ν_0), pick ν_1 such that j(ν_1) ≥ k. Then

a_{j(ν_0)} ≤ a_j ≤ a_k ≤ a_{j(ν_1)},

so

(5.12) k ≥ j ≥ j(ν_0) =⇒ |a_j − a_k| < 1/n.

We give a few simple but basic examples of convergent sequences.


Proposition 5.6. If |a| < 1, then a^j → 0.
Proof. Set b = |a|; it suffices to show that b^j → 0, and we may assume b > 0, the case
a = 0 being trivial. Consider c = 1/b > 1, hence c = 1 + y, y > 0. We claim that
c^j = (1 + y)^j ≥ 1 + jy,
for all j ≥ 1. In fact, this clearly holds for j = 1, and if it holds for j = k, then

c^{k+1} ≥ (1 + y)(1 + ky) > 1 + (k + 1)y.

Hence, by induction, the estimate is established. Consequently,

b^j < 1/(jy),

so the appropriate analogue of (5.1) holds, with K(n) = Kn, for any integer K > 1/y.
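
The estimates in this proof can be checked numerically in exact rational arithmetic. The following is a minimal sketch for the illustrative value a = 9/10; the helper name rate_constant is an assumption introduced here, returning one admissible integer K > 1/y.

```python
from fractions import Fraction

def rate_constant(a: Fraction) -> int:
    """Return an integer K > 1/y, where 1/|a| = 1 + y (requires 0 < |a| < 1)."""
    y = 1 / abs(a) - 1
    return int(1 / y) + 1

a = Fraction(9, 10)              # illustrative choice with |a| < 1
b = abs(a)
y = 1 / b - 1                    # c = 1/b = 1 + y with y > 0
K = rate_constant(a)

for j in range(1, 200):
    assert (1 + y) ** j >= 1 + j * y      # estimate proved by induction
    assert b ** j < 1 / (j * y)           # its consequence b^j < 1/(jy)

for n in range(1, 20):
    # With K(n) = K n, the defining inequality (5.1) holds at j = K(n);
    # it then holds for all larger j because b^j is decreasing in j.
    assert b ** (K * n) < Fraction(1, n)
```

For a = 9/10 one has y = 1/9, so any integer K ≥ 10 works; the bound is far from sharp, but it is all the definition of convergence requires.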
Proposition 5.6 enables us to establish the following result on geometric series.
Proposition 5.7. If |x| < 1 and

a_j = 1 + x + · · · + x^j,

then
a_j → 1/(1 − x).

Proof. Note that x a_j = x + x^2 + · · · + x^{j+1}, so (1 − x)a_j = 1 − x^{j+1}, i.e.,

a_j = (1 − x^{j+1})/(1 − x).

The conclusion follows from Proposition 5.6.
Note in particular that

(5.13) 0 < x < 1 =⇒ 1 + x + · · · + x^j < 1/(1 − x).
It is an important mathematical fact that not every Cauchy sequence of rational numbers
has a rational number as limit. We give one example here. Consider the sequence

(5.14) a_j = Σ_{ℓ=0}^{j} 1/ℓ!.

Then (a_j) is increasing, and

a_{n+j} − a_n = Σ_{ℓ=n+1}^{n+j} 1/ℓ! ≤ (1/n!) ( 1/(n + 1) + 1/(n + 1)^2 + · · · + 1/(n + 1)^j ),

since (n + 1)(n + 2) · · · (n + j) ≥ (n + 1)^j. Using (5.13), we have

(5.15) a_{n+j} − a_n < (1/(n + 1)!) · 1/(1 − 1/(n + 1)) = (1/n!) · (1/n).

Hence (a_j) is Cauchy. Taking n = 2, we see that

(5.16) j > 2 =⇒ 2½ < a_j < 2¾.
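
The bounds (5.15) and (5.16) lend themselves to a direct check with exact rationals. The sketch below does this over a small, arbitrarily chosen range of indices; the function name a mirrors the notation a_j of (5.14).

```python
from fractions import Fraction
from math import factorial

def a(j: int) -> Fraction:
    """Partial sum (5.14): the sum of 1/l! for l = 0, ..., j, as an exact rational."""
    return sum(Fraction(1, factorial(l)) for l in range(j + 1))

# (5.15): a_{n+j} - a_n < 1/(n! * n), checked for small n and j.
for n in range(1, 8):
    for j in range(1, 8):
        assert a(n + j) - a(n) < Fraction(1, factorial(n) * n)

# (5.16): for j > 2 the partial sums lie strictly between 5/2 and 11/4.
for j in range(3, 15):
    assert Fraction(5, 2) < a(j) < Fraction(11, 4)
```

Working with Fraction rather than floating point keeps every comparison exact, which matters here because the gaps being tested shrink like 1/(n! · n).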

Proposition 5.8. The sequence (5.14) cannot converge to a rational number.


Proof. Assume a_j → m/n with m, n ∈ N. By (5.16), we must have n > 2. Now, write

(5.17) m/n = Σ_{ℓ=0}^{n} 1/ℓ! + r, r = lim_{j→∞} (a_{n+j} − a_n).

Multiplying both sides of (5.17) by n! gives

(5.18) m(n − 1)! = A + r · n!

where

(5.19) A = Σ_{ℓ=0}^{n} n!/ℓ! ∈ N.


Thus the identity (5.17) forces r · n! ∈ N, while (5.15) implies

(5.20) 0 < r · n! ≤ 1/n.

This contradiction proves the proposition.

Exercises

1. Show that
lim_{k→∞} k/2^k = 0,
and more generally for each n ∈ N,
lim_{k→∞} k^n/2^k = 0.
Hint. See Exercise 5.

2. Show that
lim_{k→∞} 2^k/k! = 0,
and more generally for each n ∈ N,

lim_{k→∞} 2^{nk}/k! = 0.

The following two exercises discuss continued fractions. We assume

(5.21) a_j ∈ Q, a_j ≥ 1, j = 1, 2, 3, . . . ,

and set

(5.22) f_1 = a_1, f_2 = a_1 + 1/a_2, f_3 = a_1 + 1/(a_2 + 1/a_3), . . . .

Having f_j, we obtain f_{j+1} by replacing a_j by a_j + 1/a_{j+1}. In other words, with

(5.23) f_j = ϕ_j(a_1, . . . , a_j),

given explicitly by (5.22) for j = 1, 2, 3, we have

(5.24) f_{j+1} = ϕ_{j+1}(a_1, . . . , a_{j+1}) = ϕ_j(a_1, . . . , a_{j−1}, a_j + 1/a_{j+1}).
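
To experiment with these continued fractions before attempting the exercises, one can evaluate ϕ_j exactly. The sketch below is one way to do so, working from the innermost term outward, which amounts to applying the recursion (5.24) repeatedly; the function name phi and the choice a_j = 1 for all j are illustrative assumptions.

```python
from fractions import Fraction

def phi(a):
    """Evaluate phi_j(a_1, ..., a_j) of (5.22)-(5.24) for a nonempty list a of rationals >= 1."""
    value = Fraction(a[-1])
    # Replacing the last two entries (a_{j-1}, a_j) by a_{j-1} + 1/a_j is the step (5.24).
    for term in reversed(a[:-1]):
        value = term + 1 / value
    return value

a = [Fraction(1)] * 10
f = [phi(a[:j]) for j in range(1, 11)]   # f[0] = f_1, ..., f[9] = f_10

# Numerical illustration of the interlacing pattern f_1 <= f_3 <= f_5 <= ... <= f_6 <= f_4 <= f_2.
assert f[0] <= f[2] <= f[4] <= f[5] <= f[3] <= f[1]
```

For this choice of (a_j) the values are 1, 2, 3/2, 5/3, 8/5, 13/8, and so on, ratios of consecutive Fibonacci numbers.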

3. Show that
f_1 ≤ f_j, ∀ j ≥ 2, and f_2 ≥ f_j, ∀ j ≥ 3.

Going further, show that

(5.25) f_1 ≤ f_3 ≤ f_5 ≤ · · · ≤ f_6 ≤ f_4 ≤ f_2.

4. If also ã_{j+1} ∈ Q, ã_{j+1} ≥ 1, show that

(5.26) ϕ_{j+1}(a_1, . . . , a_j, a_{j+1}) − ϕ_{j+1}(a_1, . . . , a_j, ã_{j+1})
       = ϕ_j(a_1, . . . , a_{j−1}, b_j) − ϕ_j(a_1, . . . , a_{j−1}, b̃_j),

with

(5.27) b_j = a_j + 1/a_{j+1}, b̃_j = a_j + 1/ã_{j+1},
       b_j − b̃_j = 1/a_{j+1} − 1/ã_{j+1} = (ã_{j+1} − a_{j+1})/(ã_{j+1} a_{j+1}).

Note that b_j, b̃_j > 1. Iterating this, show that

(5.28) f_{2j} − f_{2j+1} → 0, as j → ∞.

Deduce that (f_j) is a Cauchy sequence.

5. Suppose a sequence (a_j) has the property that there exist

r < 1, K ∈ N

such that
j ≥ K =⇒ |a_{j+1}/a_j| ≤ r.

Show that a_j → 0 as j → ∞. How does this result apply to Exercises 1 and 2?

6. If (a_j) satisfies the hypotheses of Exercise 5, show that there exists M < ∞ such that

Σ_{j=1}^{k} |a_j| ≤ M, ∀ k.

Remark. This yields the ratio test for infinite series.

7. Show that you get the same criterion for convergence if (5.1) is replaced by

j ≥ K(n) =⇒ |a_j − a| < 5/n.

Generalize, and note the relevance for the proof of Proposition 5.1. Apply the same
observation to the criterion (5.10) for (a_j) to be Cauchy.


6. The real numbers

We think of a real number as a quantity that can be specified by a process of
approximation arbitrarily closely by rational numbers. Thus, we define an element of R as an
equivalence class of Cauchy sequences of rational numbers, where we define

(6.1) (aj ) ∼ (bj ) ⇐⇒ aj − bj → 0.

Proposition 6.1. This is an equivalence relation.


Proof. This is a straightforward consequence of Proposition 5.1. In particular, to see that

(6.2) (aj ) ∼ (bj ), (bj ) ∼ (cj ) =⇒ (aj ) ∼ (cj ),

just use (5.4) of Proposition 5.1 to write

aj − bj → 0, bj − cj → 0 =⇒ aj − cj → 0.

We denote the equivalence class containing a Cauchy sequence (aj ) by [(aj )]. We then
define addition and multiplication on R to satisfy

[(aj )] + [(bj )] = [(aj + bj )],


(6.3)
[(aj )] · [(bj )] = [(aj bj )].

Proposition 5.3 states that the sequences (aj + bj ) and (aj bj ) are Cauchy if (aj ) and (bj )
are. To conclude that the operations in (6.3) are well defined, we need:
Proposition 6.2. If Cauchy sequences of rational numbers are given which satisfy (a_j) ∼
(a′_j) and (b_j) ∼ (b′_j), then

(6.4) (a_j + b_j) ∼ (a′_j + b′_j),

and

(6.5) (a_j b_j) ∼ (a′_j b′_j).

The proof is a straightforward variant of the proof of parts (5.4)-(5.5) in Proposition
5.1, with due account taken of Proposition 5.2. For example, a_j b_j − a′_j b′_j = a_j b_j − a_j b′_j +
a_j b′_j − a′_j b′_j, and there are uniform bounds |a_j| ≤ A, |b′_j| ≤ B, so

|a_j b_j − a′_j b′_j| ≤ |a_j| · |b_j − b′_j| + |a_j − a′_j| · |b′_j|
                      ≤ A|b_j − b′_j| + B|a_j − a′_j|.


There is a natural injection

(6.6) Q ↪ R, a ↦ [(a, a, a, . . . )],

whose image we identify with Q. This map preserves addition and multiplication.
If x = [(aj )], we define

(6.7) −x = [(−aj )].

For x ≠ 0, we define x^{-1} as follows. First, to say x ≠ 0 is to say there exists n ∈ N such
that |a_j| ≥ 1/n for infinitely many j. Since (a_j) is Cauchy, this implies that there exists
K such that |a_j| ≥ 1/(2n) for all j ≥ K. Now, if we set α_j = a_{K+j}, we have (α_j) ∼ (a_j); we
propose to set

(6.8) x^{-1} = [(α_j^{-1})].

We claim that this is well defined. First, by Proposition 5.3, (α_j^{-1}) is Cauchy. Furthermore,
if for such x we also have x = [(b_j)], and we pick K so large that also |b_j| ≥ 1/(2n) for all
j ≥ K, and set β_j = b_{K+j}, we claim that

(6.9) (α_j^{-1}) ∼ (β_j^{-1}).

Indeed, we have

(6.10) |α_j^{-1} − β_j^{-1}| = |β_j − α_j|/(|α_j| · |β_j|) ≤ 4n^2 |β_j − α_j|,

so (6.9) holds.
It is now a straightforward exercise to verify the basic algebraic properties of addition
and multiplication in R. We state the result.
Proposition 6.3. Given x, y, z ∈ R, all the algebraic properties stated in Proposition 4.3
hold.
For example, if x = [(aj )] and y = [(bj )], the identity xy = yx is equivalent to (aj bj ) ∼
(bj aj ). In fact, the identity aj bj = bj aj for aj , bj ∈ Q, follows from Proposition 4.3. The
rest of Proposition 6.3 is left to the reader.

As in (4.10)-(4.11), we define x − y = x + (−y) and, if y ≠ 0, x/y = x · y^{-1}.

We now define an order relation on R. Take x ∈ R, x = [(a_j)]. From the discussion
above of x^{-1}, we see that, if x ≠ 0, then one and only one of the following holds. Either,
for some n, K ∈ N,

(6.11) j ≥ K =⇒ a_j ≥ 1/(2n),

or, for some n, K ∈ N,

(6.12) j ≥ K =⇒ a_j ≤ −1/(2n).

If (a_j) ∼ (b_j) and (6.11) holds for a_j, it also holds for b_j (perhaps with different n and K),
and ditto for (6.12). If (6.11) holds, we say x ∈ R+ (and we say x > 0), and if (6.12) holds
we say x ∈ R− (and we say x < 0). Clearly x > 0 if and only if −x < 0. It is also clear
that the map Q ↪ R in (6.6) preserves the order relation.
Thus we have the disjoint union

(6.13) R = R+ ∪ {0} ∪ R− , R− = −R+ .

Also, clearly

(6.14) x, y ∈ R+ =⇒ x + y, xy ∈ R+ .

As in (4.15), we define

(6.15) x < y ⇐⇒ y − x ∈ R+ .

If x = [(a_j)] and y = [(b_j)], we see from (6.11)–(6.12) that

(6.15A) x < y ⇐⇒ for some n, K ∈ N,
        j ≥ K ⇒ b_j − a_j ≥ 1/n (i.e., a_j ≤ b_j − 1/n).
The relation (6.15) can also be written y > x. Similarly we define x ≤ y and y ≤ x, in the
obvious fashions.
The following results are straightforward.
Proposition 6.4. For elements of R, we have

(6.16) x_1 < y_1, x_2 < y_2 =⇒ x_1 + x_2 < y_1 + y_2,

(6.17) x < y ⇐⇒ −y < −x,

(6.18) 0 < x < y, a > 0 =⇒ 0 < ax < ay,

(6.19) 0 < x < y =⇒ 0 < y^{-1} < x^{-1}.

Proof. The results (6.16) and (6.18) follow from (6.14); consider, for example, a(y − x).
The result (6.17) follows from (6.13). To prove (6.19), first we see that x > 0 implies
x^{-1} > 0, as follows: if −x^{-1} > 0, the identity x · (−x^{-1}) = −1 contradicts (6.14). As for
the rest of (6.19), the hypotheses imply xy > 0, and multiplying both sides of x < y by
a = (xy)^{-1} gives the result, by (6.18).
As in (5.2), define |x| by

(6.20) |x| = x if x ≥ 0,
       |x| = −x if x < 0.

Note that
(6.20A) x = [(a_j)] =⇒ |x| = [(|a_j|)].
It is straightforward to verify
(6.21) |xy| = |x| · |y|, |x + y| ≤ |x| + |y|.
We now show that R has the Archimedean property.
Proposition 6.5. Given x ∈ R, there exists k ∈ Z such that
(6.22) k − 1 < x ≤ k.

Proof. It suffices to prove (6.22) assuming x ∈ R+. Otherwise, work with −x. Say x = [(a_j)]
where (a_j) is a Cauchy sequence of rational numbers. By Proposition 5.2, there exists
M ∈ Q such that |a_j| ≤ M for all j. By Proposition 4.5, we have M ≤ ℓ for some ℓ ∈ N.
Hence the set S = {ℓ ∈ N : ℓ ≥ x} is nonempty. As in the proof of Proposition 4.5, taking
k to be the smallest element of S gives (6.22).
Proposition 6.6. Given any real ε > 0, there exists n ∈ N such that ε > 1/n.
Proof. Using Proposition 6.5, pick n > 1/ε and apply (6.19). Alternatively, use the
reasoning given above (6.8).
We are now ready to consider sequences of elements of R.
Definition. A sequence (x_j) converges to x if and only if, for any n ∈ N, there exists
K(n) such that

(6.23) j ≥ K(n) =⇒ |x_j − x| < 1/n.

In this case, we write x_j → x, or x = lim x_j.
The sequence (x_j) is Cauchy if and only if, for any n ∈ N, there exists K(n) such that

(6.24) j, k ≥ K(n) =⇒ |x_j − x_k| < 1/n.
We note that it is typical to phrase the definition above in terms of picking any real
ε > 0 and demanding that, e.g., |xj − x| < ε, for large j. The equivalence of the two
definitions follows from Proposition 6.6.
As in Proposition 5.2, we have that every Cauchy sequence is bounded.
It is clear that, if each xj ∈ Q, then the notion that (xj ) is Cauchy given above coincides
with that in §5. If also x ∈ Q, the notion that xj → x also coincides with that given in §5.
Here is another natural but useful observation.


Proposition 6.6A. If each aj ∈ Q, and x ∈ R, then

(6.25) aj → x ⇐⇒ x = [(aj )].

Proof. First assume x = [(a_j)]. In particular, (a_j) is Cauchy. Now, given m, we have from
(6.15A) that

(6.26) |x − a_k| < 1/m ⇐⇒ ∃ K, n such that j ≥ K ⇒ |a_j − a_k| < 1/m − 1/n
                       ⇐= ∃ K such that j ≥ K ⇒ |a_j − a_k| < 1/(2m).

On the other hand, since (a_j) is Cauchy,

for each m ∈ N, ∃ K(m) such that j, k ≥ K(m) ⇒ |a_j − a_k| < 1/(2m).

Hence
k ≥ K(m) =⇒ |x − a_k| < 1/m.
This shows that x = [(aj )] ⇒ aj → x. For the converse, if aj → x, then (aj ) is Cauchy, so
we have [(aj )] = y ∈ R. The previous argument implies aj → y. But

|x − y| ≤ |x − aj | + |aj − y|, ∀ j,

so x = y. Thus aj → x ⇒ x = [(aj )].


Next, the proof of Proposition 5.1 extends to the present case, yielding:
Proposition 6.7. If xj → x and yj → y, then

(6.27) xj + yj → x + y,

and

(6.28) xj yj → xy.

If furthermore y_j ≠ 0 for all j and y ≠ 0, then

(6.28) xj /yj → x/y.

So far, statements made about R have emphasized similarities of its properties with
corresponding properties of Q. The crucial difference between these two sets of numbers is
given by the following result, known as the completeness property.


Theorem 6.8. If (xj ) is a Cauchy sequence of real numbers, then there exists x ∈ R such
that xj → x.
Proof. Take x_j = [(a_{jℓ} : ℓ ∈ N)] with a_{jℓ} ∈ Q. Using (6.25), take a_{j,ℓ(j)} = b_j ∈ Q such that

(6.29) |x_j − b_j| ≤ 2^{-j}.

Then (b_j) is Cauchy, since |b_j − b_k| ≤ |x_j − x_k| + 2^{-j} + 2^{-k}. Now, let

(6.30) x = [(b_j)].

It follows that

(6.31) |x_j − x| ≤ |x_j − b_j| + |x − b_j| ≤ 2^{-j} + |x − b_j|,

and hence x_j → x, since b_j → x by (6.25).
If we combine Theorem 6.8 with the argument behind Proposition 5.4, we obtain the
following important result, known as the Bolzano-Weierstrass Theorem.
Theorem 6.9. Each bounded sequence of real numbers has a convergent subsequence.
Proof. If |xj | ≤ M, the proof of Proposition 5.4 applies without change to show that (xj )
has a Cauchy subsequence. By Theorem 6.8, that Cauchy subsequence converges.
Similarly, adding Theorem 6.8 to the argument behind Proposition 5.5 yields:
Proposition 6.10. Each bounded monotone sequence (xj ) of real numbers converges.
A related property of R can be described in terms of the notion of the “supremum” of
a set.
Definition. If S ⊂ R, one says that x ∈ R is an upper bound for S provided x ≥ s for all
s ∈ S, and one says

(6.32) x = sup S

provided x is an upper bound for S and further x ≤ x′ whenever x′ is an upper bound for
S.
For some sets, such as S = Z, there is no x ∈ R satisfying (6.32). However, there is the
following result, known as the supremum property.
Proposition 6.11. If S is a nonempty subset of R that has an upper bound, then there
is a real x = sup S.
Proof. We use an argument similar to the one in the proof of Proposition 5.4. Let x_0 be
an upper bound for S, pick s_0 in S, and consider

I_0 = [s_0, x_0] = {y ∈ R : s_0 ≤ y ≤ x_0}.

If x_0 = s_0, then already x_0 = sup S. Otherwise, I_0 is an interval of nonzero length,
L = x_0 − s_0. In that case, divide I_0 into two equal intervals, having in common only the
midpoint; say I_0 = I_0^ℓ ∪ I_0^r, where I_0^r lies to the right of I_0^ℓ.
Let I_1 = I_0^r if S ∩ I_0^r ≠ ∅, and otherwise let I_1 = I_0^ℓ. Note that S ∩ I_1 ≠ ∅. Let x_1 be
the right endpoint of I_1, and pick s_1 ∈ S ∩ I_1. Note that x_1 is also an upper bound for S.
Continue, constructing
I_ν ⊂ I_{ν−1} ⊂ · · · ⊂ I_0,
where I_ν has length 2^{-ν}L, such that the right endpoint x_ν of I_ν satisfies

(6.33) x_ν ≥ s, ∀ s ∈ S,

and such that S ∩ I_ν ≠ ∅, so there exist s_ν ∈ S such that

(6.34) x_ν − s_ν ≤ 2^{-ν}L.

The sequence (x_ν) is bounded and monotone (decreasing) so, by Proposition 6.10, it
converges; x_ν → x. By (6.33), we have x ≥ s for all s ∈ S, and by (6.34) we have x − s_ν ≤ 2^{-ν}L.
Hence x satisfies (6.32).
We turn to infinite series Σ_{k=0}^{∞} a_k, with a_k ∈ R. We say this series converges if and
only if the sequence of partial sums

(6.35) S_n = Σ_{k=0}^{n} a_k

converges:

(6.36) Σ_{k=0}^{∞} a_k = A ⇐⇒ S_n → A as n → ∞.

The following is a useful condition guaranteeing convergence.

Proposition 6.12. The infinite series Σ_{k=0}^{∞} a_k converges provided

(6.37) Σ_{k=0}^{∞} |a_k| < ∞,

i.e., there exists B < ∞ such that Σ_{k=0}^{n} |a_k| ≤ B for all n.
Proof. The triangle inequality (the second part of (6.21)) gives, for ℓ ≥ 1,

(6.38) |S_{n+ℓ} − S_n| = | Σ_{k=n+1}^{n+ℓ} a_k | ≤ Σ_{k=n+1}^{n+ℓ} |a_k|,


and we claim this tends to 0 as n → ∞, uniformly in ℓ ≥ 1, provided (6.37) holds. In fact,
if the right side of (6.38) fails to go to 0 as n → ∞, there exist ε > 0 and infinitely many
n_ν → ∞ and ℓ_ν ∈ N such that

(6.39) Σ_{k=n_ν+1}^{n_ν+ℓ_ν} |a_k| ≥ ε.

We can pass to a subsequence and assume n_{ν+1} > n_ν + ℓ_ν. Then

(6.40) Σ_{k=n_1+1}^{n_ν+ℓ_ν} |a_k| ≥ νε,

for all ν, contradicting the bound by B that follows from (6.37). Thus (6.37) ⇒ (S_n) is
Cauchy. Convergence follows, by Theorem 6.8.
When (6.37) holds, we say the series Σ_{k=0}^{∞} a_k is absolutely convergent.
The following result on alternating series gives another sufficient condition for
convergence.
Proposition 6.13. Assume a_k > 0, a_k ↘ 0. Then

(6.41) Σ_{k=0}^{∞} (−1)^k a_k

is convergent.
Proof. Denote the partial sums by S_n, n ≥ 0. We see that, for m ∈ N,

(6.42) S_{2m+1} ≤ S_{2m+3} ≤ S_{2m+2} ≤ S_{2m}.

Iterating this, we have, as m → ∞,

(6.43) S_{2m} ↘ α, S_{2m+1} ↗ β, β ≤ α,

and

(6.44) S_{2m} − S_{2m+1} = a_{2m+1},

hence α = β, and convergence is established.

Here is an example:

Σ_{k=0}^{∞} (−1)^k/(k + 1) = 1 − 1/2 + 1/3 − 1/4 + · · · is convergent.


This series is not absolutely convergent (cf. Exercise 6 below). For an evaluation of this
series, see exercises in §5 of Chapter 4.
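
The interlacing (6.42) and the gap formula (6.44) can be observed directly on this example. The sketch below computes the partial sums exactly with fractions.Fraction; the function name S follows the notation S_n of the proof, and the range of m tested is an arbitrary choice.

```python
from fractions import Fraction

def S(n: int) -> Fraction:
    """Partial sum S_n of the series with terms (-1)^k / (k + 1), k = 0, ..., n."""
    return sum(Fraction((-1) ** k, k + 1) for k in range(n + 1))

for m in range(20):
    # (6.42): odd partial sums increase, even ones decrease, odd below even.
    assert S(2 * m + 1) <= S(2 * m + 3) <= S(2 * m + 2) <= S(2 * m)
    # (6.44): the gap is a_{2m+1} = 1/(2m + 2), which tends to 0.
    assert S(2 * m) - S(2 * m + 1) == Fraction(1, 2 * m + 2)
```

Each pair (S_{2m+1}, S_{2m}) therefore brackets the sum of the series in an interval of length 1/(2m + 2).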

Exercises

1. Verify Proposition 6.3.

2. If S ⊂ R, we say that x ∈ R is a lower bound for S provided x ≤ s for all s ∈ S, and
we say

(6.45) x = inf S,

provided x is a lower bound for S and further x ≥ x′ whenever x′ is a lower bound for S.
Mirroring Proposition 6.11, show that if S ⊂ R is a nonempty set that has a lower bound,
then there is a real x = inf S.

3. Given a real number ξ ∈ (0, 1), show it has an infinite decimal expansion, i.e., there
exist b_k ∈ {0, 1, . . . , 9} such that

(6.46) ξ = Σ_{k=1}^{∞} b_k · 10^{-k}.

Hint. Start by breaking [0, 1] into ten subintervals of equal length, and picking one to
which ξ belongs.

4. Show that if 0 < x < 1,

Σ_{k=0}^{∞} x^k = 1/(1 − x) < ∞.

Hint. As in (4.19), we have

Σ_{k=0}^{n} x^k = (1 − x^{n+1})/(1 − x), x ≠ 1.

5. Assume a_k > 0 and a_k ↘ 0. Show that

(6.47) Σ_{k=1}^{∞} a_k < ∞ ⇐⇒ Σ_{k=0}^{∞} b_k < ∞,

where

(6.48) b_k = 2^k a_{2^k}.


Hint. Use the following observations:

(1/2)b_2 + (1/2)b_3 + · · · ≤ (a_3 + a_4) + (a_5 + a_6 + a_7 + a_8) + · · · , and
(a_3 + a_4) + (a_5 + a_6 + a_7 + a_8) + · · · ≤ b_1 + b_2 + · · · .

6. Deduce from Exercise 5 that the harmonic series 1 + 1/2 + 1/3 + 1/4 + · · · diverges, i.e.,

(6.49) Σ_{k=1}^{∞} 1/k = ∞.

7. Deduce from Exercises 4–5 that

(6.50) p > 1 =⇒ Σ_{k=1}^{∞} 1/k^p < ∞.

For now, we take p ∈ N. We will see later that (6.50) is meaningful, and true, for p ∈
R, p > 1.

8. Given a, b ∈ R \ 0, k ∈ Z, define a^k as in Exercise 4 of §4. Show that

a^{j+k} = a^j a^k, a^{jk} = (a^j)^k, (ab)^j = a^j b^j, ∀ j, k ∈ Z.

9. Given k ∈ N, show that, for x_j ∈ R,

x_j → x =⇒ x_j^k → x^k.

Hint. Use Proposition 6.7.

10. Given x_j, x, y ∈ R, show that

x_j ≥ y ∀ j, x_j → x =⇒ x ≥ y.

11. Given the alternating series Σ (−1)^k a_k as in Proposition 6.13 (with a_k ↘ 0), with sum
S, show that, for each N,

Σ_{k=0}^{N} (−1)^k a_k = S + r_N, |r_N| ≤ |a_{N+1}|.

12. Generalize Exercises 5–6 of §5 as follows. Suppose a sequence (a_j) in R has the
property that there exist r < 1 and K ∈ N such that

j ≥ K =⇒ |a_{j+1}/a_j| ≤ r.


Show that there exists M < ∞ such that

Σ_{j=1}^{k} |a_j| ≤ M, ∀ k ∈ N.

Conclude that Σ_{k=1}^{∞} a_k is convergent.

13. Show that, for each x ∈ R,

Σ_{k=1}^{∞} (1/k!) x^k

is convergent.

The following exercises deal with the sequence (f_j) of continued fractions associated to a
sequence (a_j) as in (5.21), via (5.22)–(5.24), leading to Exercises 3–4 of §5.

14. Deduce from (5.25) that there exist f_o, f_e ∈ R such that

f_{2k+1} ↗ f_o, f_{2k} ↘ f_e, f_o ≤ f_e.

15. Deduce from (5.28) that f_o = f_e (= f, say), and hence

f_j −→ f, as j → ∞,

i.e., if (a_j) satisfies (5.21),

ϕ_j(a_1, . . . , a_j) −→ f, as j → ∞.

We denote the limit by ϕ(a_1, . . . , a_j, . . . ).

16. Show that ϕ(1, 1, . . . , 1, . . . ) = x solves x = 1 + 1/x, and hence

ϕ(1, 1, . . . , 1, . . . ) = (1 + √5)/2.

Note. The existence of such x implies that 5 has a square root, √5 ∈ R. See Proposition
7.1 for a more general result.


7. Irrational numbers

There are real numbers that are not rational. One, called e, is given by the limit of the
sequence (5.14); in standard notation,

(7.1) e = Σ_{ℓ=0}^{∞} 1/ℓ!.

This number appears naturally in the theory of the exponential function, which plays a
central role in calculus, as exposed in §5 of Chapter 4. Proposition 5.8 implies that e is
not rational. One can approximate e to high accuracy. In fact, as a consequence of (5.15),
one has

(7.2) e − Σ_{ℓ=0}^{n} 1/ℓ! ≤ (1/n!) · (1/n).

For example, one can verify that

(7.3) 120! > 6 · 10^{198},

and hence

(7.4) e − Σ_{ℓ=0}^{120} 1/ℓ! < 10^{-200}.

In a fraction of a second, a personal computer with the right program can perform a highly
accurate approximation to such a sum, yielding

2.7182818284 5904523536 0287471352 6624977572 4709369995


9574966967 6277240766 3035354759 4571382178 5251664274
2746639193 2003059921 8174135966 2904357290 0334295260
5956307381 3232862794 3490763233 8298807531 · · ·

accurate to 190 places after the decimal point.
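
One way such a computation might be carried out is sketched below, using exact rational arithmetic from Python's standard library rather than any particular program the author had in mind. It checks (7.3), forms the partial sum in (7.4), and extracts 190 decimal places by integer division; the variable names are illustrative.

```python
from fractions import Fraction
from math import factorial

N = 120
assert factorial(N) > 6 * 10**198                     # the estimate (7.3)

# Exact partial sum of 1/l! for l = 0, ..., 120.
partial = sum(Fraction(1, factorial(l)) for l in range(N + 1))

# By (7.2), 0 < e - partial <= 1/(N! * N), and this is below 10^(-200).
assert Fraction(1, factorial(N) * N) < Fraction(1, 10**200)

digits = 190
# floor(partial * 10^190), computed with integers only.
scaled = partial.numerator * 10**digits // partial.denominator
text = str(scaled)
print(text[0] + "." + text[1:])
```

Since the partial sum lies within 10^{-200} of e, truncating it to 190 places gives the digits of e itself unless digits 191 through 200 of e are all zero, an assumption-free check of which would need a few more terms or a second truncation.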


A number in R \ Q is said to be irrational. We present some other common examples
of irrational numbers, such as √2. To begin, one needs to show that √2 is a well defined
real number. The following general result includes this fact.


Proposition 7.1. Given a ∈ R+, k ∈ N, there is a unique b ∈ R+ such that b^k = a.

Proof. Consider

(7.5) S_{a,k} = {x ≥ 0 : x^k ≤ a}.

Then S_{a,k} is a nonempty bounded subset of R. Note that if y > 0 and y^k > a then y is
an upper bound for S_{a,k}. Hence 1 + a is an upper bound for S_{a,k}. Take b = sup S_{a,k}.
We claim that b^k = a. In fact, if b^k < a, it follows from Exercise 9 of §6 that there exists
b_1 > b such that b_1^k < a, hence b_1 ∈ S_{a,k}, so b < sup S_{a,k}. Similarly, if b^k > a, there exists
b_0 < b such that b_0^k > a, hence b_0 is an upper bound for S_{a,k}, so b > sup S_{a,k}.
We write

(7.6) b = a^{1/k}.
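
The supremum in this proof can be approximated by the same interval-halving idea used for Proposition 6.11. The sketch below is restricted to rational a (an assumption made so that all arithmetic is exact) and keeps a left endpoint in S_{a,k} and a right endpoint that is an upper bound for S_{a,k}; the function name kth_root_bracket is hypothetical.

```python
from fractions import Fraction

def kth_root_bracket(a: Fraction, k: int, steps: int):
    """Bisect [0, 1 + a], keeping lo^k <= a (lo in S_{a,k}) and hi^k > a (hi an upper bound)."""
    lo, hi = Fraction(0), 1 + a        # (1 + a)^k >= 1 + k*a > a, so 1 + a is an upper bound
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid ** k <= a:
            lo = mid                   # mid belongs to S_{a,k}
        else:
            hi = mid                   # mid is an upper bound for S_{a,k}
    return lo, hi

lo, hi = kth_root_bracket(Fraction(2), 2, 30)
assert lo ** 2 <= 2 < hi ** 2
assert hi - lo == Fraction(3, 2 ** 30)   # the initial length 1 + a = 3, halved 30 times
```

The number b = a^{1/k} lies in every interval [lo, hi] produced this way, so the two endpoints give rational approximations from below and above.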

Now for a list of some irrational numbers:


Proposition 7.2. Take a ∈ N, k ∈ N. If a^{1/k} is not an integer, then a^{1/k} is irrational.
Proof. Assume a^{1/k} = m/n, with m, n ∈ N. We can arrange that m and n have no common
prime factors. Now

(7.7) m^k = a n^k,

so

(7.8) n | m^k.

Thus, if n > 1 and p is a prime factor of n, then p | m^k. It follows from Proposition 3.2,
and induction on k, that p | m. This contradicts our arrangement that m and n have no
common prime factors, and concludes the proof.
Noting that 1^2 = 1, 2^2 = 4, 3^2 = 9, we have:
Corollary 7.3. The following numbers are irrational:

(7.9) √2, √3, √5, √6, √7, √8.

A similar argument establishes the following more general result.


Proposition 7.4. Consider the polynomial

(7.10) $p(z) = z^k + a_{k-1} z^{k-1} + \cdots + a_1 z + a_0$, $a_j \in \mathbb{Z}$.

Then

(7.11) $z \in \mathbb{Q},\ p(z) = 0 \implies z \in \mathbb{Z}$.


Proof. If $z \in \mathbb{Q}$ but $z \notin \mathbb{Z}$, we can write $z = m/n$ with $m, n \in \mathbb{Z}$, $n > 1$, and $m$ and $n$
containing no common prime factors. Now multiply the equation $p(m/n) = 0$ by $n^k$, to get

(7.12) $m^k + a_{k-1} m^{k-1} n + \cdots + a_1 m n^{k-1} + a_0 n^k = 0$, $a_j \in \mathbb{Z}$.

It follows that $n$ divides $m^k$, so, as in the proof of Proposition 7.2, $m$ and $n$ must have a
common prime factor. This contradiction proves Proposition 7.4.
Note that Proposition 7.2 deals with the special case

(7.13) $p(z) = z^k - a$, $a \in \mathbb{N}$.

Remark. The existence of solutions to p(z) = 0 for general p(z) as in (7.10) is harder
than Proposition 7.1, especially when k is even. For the case of odd k, see Exercise 1 of
§9. For the general result, see Chapter 4, Appendix A.
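Proposition 7.4 makes it easy to list all rational roots of a monic integer polynomial by machine: they must be integers, and (a standard fact not stated in the text) a nonzero integer root must divide $a_0$, since $z(z^{k-1} + \cdots + a_1) = -a_0$. The sketch below relies on that assumption; the name `integer_roots` and the coefficient ordering are illustrative.

```python
def integer_roots(coeffs):
    """All rational (hence integer) roots of the monic polynomial
    p(z) = z^k + a_{k-1} z^{k-1} + ... + a_0, given coeffs = [a_0, ..., a_{k-1}]."""
    k = len(coeffs)

    def p(z):
        return z**k + sum(a * z**j for j, a in enumerate(coeffs))

    a0 = coeffs[0]
    if a0 == 0:
        # p(z) = z * q(z) with q monic of degree k - 1, so 0 is a root.
        rest = integer_roots(coeffs[1:]) if k > 1 else []
        return sorted({0} | set(rest))
    divisors = [d for d in range(1, abs(a0) + 1) if a0 % d == 0]
    return sorted(z for d in divisors for z in (d, -d) if p(z) == 0)


print(integer_roots([-2, 0]))        # z^2 - 2 has no rational root
print(integer_roots([-6, 11, -6]))   # z^3 - 6z^2 + 11z - 6
```

An empty result for $z^2 - 2$ is a machine check of the case $a = 2$, $k = 2$ of Proposition 7.2.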

The real line is thick with both rational numbers and irrational numbers. By (6.25),
given any $x \in \mathbb{R}$, there exist $a_j \in \mathbb{Q}$ such that $a_j \to x$. Also, given any $x \in \mathbb{R}$, there
exist irrational $b_j$ such that $b_j \to x$. To see this, just take $a_j \in \mathbb{Q}$, $a_j \to x$, and set
$b_j = a_j + 2^{-j}\sqrt{2}$.
In a sense that can be made precise, there are more irrational numbers than rational
numbers. Namely, Q is countable, while R is uncountable. See §8 for a treatment of this.
Perhaps the most intriguing irrational number is π. See Chapter 4 for material on π,
including a proof that it is irrational.

Exercises

1. Let $\xi \in (0, 1)$ have a decimal expansion of the form (6.46), i.e.,

(7.14) $\xi = \sum_{k=1}^{\infty} b_k \cdot 10^{-k}$, $b_k \in \{0, 1, \dots, 9\}$.

Show that $\xi$ is rational if and only if (7.14) is eventually repeating, i.e., if and only if there
exist $N, m \in \mathbb{N}$ such that
$k \ge N \implies b_{k+m} = b_k$.

2. Show that

$\sum_{k=1}^{\infty} 10^{-k^2}$ is irrational.

3. Making use of Proposition 7.1, define $a^p$ for real $a > 0$, $p = m/n \in \mathbb{Q}$. Show that if
also $q \in \mathbb{Q}$,
$a^p a^q = a^{p+q}$.


Hint. You might start with $a^{m/n} = (a^{1/n})^m$, given $n \in \mathbb{N}$, $m \in \mathbb{Z}$. Then you need to show
that if $k \in \mathbb{N}$,
$(a^{1/nk})^{mk} = (a^{1/n})^m$.
You can use the results of Exercise 8 in §6.

4. Show that, if $a, b > 0$ and $p \in \mathbb{Q}$, then

$(ab)^p = a^p b^p$.

Hint. First show that $(ab)^{1/n} = a^{1/n} b^{1/n}$.

5. Using Exercises 3 and 4, extend (6.50) to $p \in \mathbb{Q}$, $p > 1$.

Hint. If $a_k = k^{-p}$, then $b_k = 2^k a_{2^k} = 2^k (2^k)^{-p} = 2^{-(p-1)k} = x^k$ with $x = 2^{-(p-1)}$.

6. Show that $\sqrt{2} + \sqrt{3}$ is irrational.
Hint. Square it.

7. Specialize the proof of Proposition 7.2 to a demonstration that 2 has no rational square
root, and contrast this argument with the proof of such a result suggested in Exercise 6 of
§4.


8. Cardinal numbers

We return to the natural numbers considered in §1 and make contact with the fact that
these numbers are used to count objects in collections. Namely, let S be some set. If S is
empty, we say 0 is the number of its elements. If S is not empty, pick an element out of S
and count “1.” If there remain other elements of S, pick another element and count “2.”
Continue. If you pick a final element of S and count “n,” then you say S has n elements.
At least, that is a standard informal description of counting. We wish to restate this a
little more formally, in the setting where we can apply the Peano axioms.
In order to do this, we consider the following subsets of N. Given n ∈ N, set
(8.1) $I_n = \{j \in \mathbb{N} : j \le n\}$.
While the following is quite obvious, it is worthwhile recording that it is a consequence of
the Peano axioms and the material developed in §1.
Lemma 8.1. We have
(8.2) $I_1 = \{1\}$, $I_{n+1} = I_n \cup \{n + 1\}$.

Proof. Left to the reader.


Now we propose the following
Definition 8.1. A nonempty set $S$ has $n$ elements if and only if there exists a bijective
map $\varphi : S \to I_n$.
A reasonable definition of counting should permit one to demonstrate that, if S has n
elements and it also has m elements, then m = n. The key to showing this from the Peano
postulates is the following.
Proposition 8.2. Assume $m, n \in \mathbb{N}$. If there exists an injective map $\varphi : I_m \to I_n$, then
$m \le n$.
Proof. Use induction on $n$. The case $n = 1$ is clear (by Lemma 8.1). Assume now that
$N \ge 2$ and that the result is true for $n < N$. Then let $\varphi : I_m \to I_N$ be injective. Two
cases arise: either there is an element $j \in I_m$ such that $\varphi(j) = N$, or not. (Also, there is
no loss of generality in assuming at this point that $m \ge 2$.)
If there is such a $j$, define $\psi : I_{m-1} \to I_{N-1}$ by
$\psi(\ell) = \varphi(\ell)$ for $\ell < j$,
$\psi(\ell) = \varphi(\ell + 1)$ for $j \le \ell < m$.
Then $\psi$ is injective, so $m - 1 \le N - 1$, and hence $m \le N$.
On the other hand, if there is no such $j$, then we already have an injective map
$\varphi : I_m \to I_{N-1}$. The induction hypothesis implies $m \le N - 1$, which in turn implies $m \le N$.


Corollary 8.3. If there exists a bijective map $\varphi : I_m \to I_n$, then $m = n$.
Proof. We see that $m \le n$ and $n \le m$, so Proposition 1.13 applies.
Corollary 8.4. If $S$ is a set, $m, n \in \mathbb{N}$, and there exist bijective maps $\varphi : S \to I_m$,
$\psi : S \to I_n$, then $m = n$.
Proof. Consider $\psi \circ \varphi^{-1}$.
Definition 8.2. If either S = ∅ or S has n elements for some n ∈ N, as in Definition
8.1, we say S is finite.
The next result implies that any subset of a finite set is finite.
Proposition 8.5. Assume $n \in \mathbb{N}$. If $S \subset I_n$ is nonempty, then there exists $m \le n$ and a
bijective map $\varphi : S \to I_m$.
Proof. Use induction on $n$. The case $n = 1$ is clear (by Lemma 8.1). Assume the result is
true for $n < N$. Then let $S \subset I_N$. Two cases arise: either $N \in S$ or $N \notin S$.
If $N \in S$, consider $S' = S \setminus \{N\}$, so $S = S' \cup \{N\}$ and $S' \subset I_{N-1}$. The inductive
hypothesis yields a bijective map $\psi : S' \to I_m$ (with $m \le N - 1$), and then we obtain
$\varphi : S' \cup \{N\} \to I_{m+1}$, equal to $\psi$ on $S'$ and sending the element $N$ to $m + 1$.
(If $S'$ is empty, simply send $N$ to $1$.)
If $N \notin S$, then $S \subset I_{N-1}$, and the inductive hypothesis directly yields the desired
bijective map.
Proposition 8.6. The set $\mathbb{N}$ is not finite.
Proof. If there were an $n \in \mathbb{N}$ and a bijective map $\varphi : I_n \to \mathbb{N}$, then, by restriction, there
would be a bijective map $\psi : S \to I_{n+1}$ for some subset $S$ of $I_n$, hence by the results above
a bijective map $\tilde{\psi} : I_m \to I_{n+1}$ for some $m \le n < n + 1$. This contradicts Corollary 8.3.
The next result says that, in a certain sense, N is a minimal set that is not finite.
Proposition 8.7. If $S$ is not finite, then there exists an injective map $\Phi : \mathbb{N} \to S$.
Proof. We aim to show that there exists a family of injective maps $\varphi_n : I_n \to S$, with the
property that

(8.3) $\varphi_n \big|_{I_m} = \varphi_m$, $\forall\, m \le n$.

We establish this by induction on $n$. For $n = 1$, just pick some element of $S$ and call
it $\varphi_1(1)$. Now assume this claim is true for all $n < N$. So we have $\varphi_{N-1} : I_{N-1} \to S$
injective, but not surjective (since we assume $S$ is not finite), and (8.3) holds for $n \le N - 1$.
Pick $x \in S$ not in the range of $\varphi_{N-1}$. Then define $\varphi_N : I_N \to S$ so that

(8.3A) $\varphi_N(j) = \varphi_{N-1}(j)$ for $j \le N - 1$, $\varphi_N(N) = x$.

Having the family $\varphi_n$, we define $\Phi : \mathbb{N} \to S$ by $\Phi(j) = \varphi_n(j)$ for any $n \ge j$.
Two sets S and T are said to have the same cardinality if there exists a bijective map
between them; we write Card(S) = Card(T ). If there exists an injective map ϕ : S → T ,
we write Card(S) ≤ Card(T ). The following result, known as the Schroeder-Bernstein
theorem, implies that Card(S) = Card(T ) whenever one has both Card(S) ≤ Card(T ) and
Card(T ) ≤ Card(S).


Theorem 8.8. Let S and T be sets. Suppose there exist injective maps ϕ : S → T and
ψ : T → S. Then there exists a bijective map Φ : S → T .
Proof. Let us say an element x ∈ T has a parent y ∈ S if ϕ(y) = x. Similarly there is a
notion of a parent of an element of S. Iterating this gives a sequence of “ancestors” of any
element of S or T . For any element of S or T , there are three possibilities:
a) The set of ancestors never terminates.
b) The set of ancestors terminates at an element of S.
c) The set of ancestors terminates at an element of T .
We denote by $S_a$, $T_a$ the elements of $S$, $T$, respectively, for which case a) holds. Similarly
we have $S_b$, $T_b$ and $S_c$, $T_c$. We have disjoint unions

$S = S_a \cup S_b \cup S_c$, $T = T_a \cup T_b \cup T_c$.

Now note that

$\varphi : S_a \to T_a$, $\varphi : S_b \to T_b$, $\psi : T_c \to S_c$

are all bijective. Thus we can set $\Phi$ equal to $\varphi$ on $S_a \cup S_b$ and equal to $\psi^{-1}$ on $S_c$, to get
a desired bijection.
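To see the ancestor construction at work, here is a small sketch for one concrete (and purely illustrative) choice: both sets are the positive integers, and both injections are doubling. Then every ancestor chain is obtained by repeated halving and terminates, so case a) never occurs. The function names are ours.

```python
def phi(s: int) -> int:
    """Injective map S -> T (illustrative choice: doubling)."""
    return 2 * s


def psi(t: int) -> int:
    """Injective map T -> S (illustrative choice: doubling)."""
    return 2 * t


def chain_ends_in_S(s: int) -> bool:
    """Trace the ancestors of s in S; True if the chain terminates at an element of S."""
    x, in_S = s, True
    while x % 2 == 0:        # x has a parent x/2 on the other side
        x //= 2
        in_S = not in_S
    return in_S


def Phi(s: int) -> int:
    """Bijection S -> T: phi on S_b, psi^{-1} on S_c (S_a is empty here)."""
    return phi(s) if chain_ends_in_S(s) else s // 2


print([Phi(s) for s in range(1, 9)])
```

Neither doubling map is onto, yet the combined map $\Phi$ hits every positive integer exactly once.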
The terminology above suggests regarding Card(S) as an object (some sort of number).
Indeed, if S is finite we set Card(S) = n if S has n elements (as in Definition 8.1). A set
that is not finite is said to be infinite. We can also have a notion of cardinality of infinite
sets. A standard notation for the cardinality of N is

(8.4) $\operatorname{Card}(\mathbb{N}) = \aleph_0$.

Here are some other sets with the same cardinality:


Proposition 8.9. We have

(8.5) $\operatorname{Card}(\mathbb{Z}) = \operatorname{Card}(\mathbb{N} \times \mathbb{N}) = \operatorname{Card}(\mathbb{Q}) = \aleph_0$.

Proof. We can define a bijection of N onto Z by ordering elements of Z as follows:

0, 1, −1, 2, −2, 3, −3, · · · .

We can define a bijection of N and N × N by ordering elements of N × N as follows:

(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1), · · · .

We leave it to the reader to produce a similar ordering of Q.
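The two orderings used in this proof can be written as Python generators. This is a sketch for illustration; the generator names are ours, and the diagonal traversal of $\mathbb{N} \times \mathbb{N}$ is read off from the listed pattern.

```python
from itertools import count, islice


def enumerate_Z():
    """Yield 0, 1, -1, 2, -2, 3, -3, ..."""
    yield 0
    for n in count(1):
        yield n
        yield -n


def enumerate_NxN():
    """Yield pairs (i, j) of positive integers along the diagonals i + j = 2, 3, 4, ..."""
    for total in count(2):
        for i in range(1, total):
            yield (i, total - i)


print(list(islice(enumerate_Z(), 7)))
print(list(islice(enumerate_NxN(), 6)))
```

Each integer, and each pair, appears exactly once, at a finite position in the list; that is what it means for the ordering to define a bijection with $\mathbb{N}$.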


An infinite set that can be mapped bijectively onto N is called countably infinite. A
set that is either finite or countably infinite is called countable. The following result is a
natural extension of Proposition 8.5.


Proposition 8.10. If X is a countable set and S ⊂ X, then S is countable.


Proof. If X is finite, then Proposition 8.5 applies. Otherwise, we can assume X = N, and
we are looking at S ⊂ N, so there is an injective map ϕ : S → N. If S is finite, there is
no problem. Otherwise, by Proposition 8.7, there is an injective map ψ : N → S, and then
Theorem 8.8 implies the existence of a bijection between S and N.
There are sets that are not countable; they are said to be uncountable. The following
is a key result of G. Cantor.
Proposition 8.11. The set R of real numbers is uncountable.
Proof. We may as well show that (0, 1) = {x ∈ R : 0 < x < 1} is uncountable. If it were
countable, there would be a bijective map ϕ : N → (0, 1). Expand the real number ϕ(j) in
its infinite decimal expansion:

(8.6) $\varphi(j) = \sum_{k=1}^{\infty} a_{jk} \cdot 10^{-k}$, $a_{jk} \in \{0, 1, \dots, 9\}$.

Now set

(8.7) $b_k = 2$ if $a_{kk} \ne 2$, $b_k = 3$ if $a_{kk} = 2$,

and consider

(8.8) $\xi = \sum_{k=1}^{\infty} b_k \cdot 10^{-k}$, $\xi \in (0, 1)$.

It is seen that ξ is not equal to ϕ(j) for any j ∈ N, contradicting the hypothesis that
ϕ : N → (0, 1) is onto.
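The rule (8.7) can be tried out on a finite table of digits. The sketch below is only a finite illustration of the diagonal construction (the actual argument concerns infinite expansions); the function name and the sample table are made up for the example.

```python
def diagonal_digits(rows):
    """Given rows[j][k] = a_{(j+1)(k+1)}, return the digits b_1, ..., b_n from rule (8.7)."""
    return [2 if rows[k][k] != 2 else 3 for k in range(len(rows))]


# Hypothetical first four digits of four listed numbers.
table = [
    [1, 4, 1, 5],
    [2, 2, 2, 2],
    [0, 0, 2, 0],
    [9, 9, 9, 9],
]
b = diagonal_digits(table)
print(b)
# The new digit string differs from row k in position k, for every k.
print(all(b[k] != table[k][k] for k in range(len(table))))
```

Using only the digits 2 and 3 avoids any ambiguity from expansions ending in repeated 0s or 9s.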
A common notation for the cardinality of R is

(8.9) Card(R) = c.

We leave it as an exercise to the reader to show that

(8.10) Card(R × R) = c.

Further development of the theory of cardinal numbers requires a formalization of the
notions of set theory. In these notes we have used set theoretical notions rather informally.
Our use of such notions has gotten somewhat heavier in this last section. In particular,
in the proof of Proposition 8.7, the innocent looking use of the phrase “pick x ∈ S . . . ”
actually assumes a weak version of the Axiom of Choice. For an introduction to the
axiomatic treatment of set theory we refer to [Dev].


Exercises

1. What is the cardinality of the set P of prime numbers?

2. Let S be a nonempty set and let T be the set of all subsets of S. Adapt the proof of
Proposition 8.11 to show that
Card(S) < Card(T ),
i.e., there is not a surjective map ϕ : S → T .
Hint. There is a natural bijection of $T$ and $\tilde{T}$, the set of functions $f : S \to \{0, 1\}$, via
$f \leftrightarrow \{x \in S : f(x) = 1\}$. Given $\tilde{\varphi} : S \to \tilde{T}$, describe a function $g : S \to \{0, 1\}$, not in the
range of $\tilde{\varphi}$, taking a cue from the proof of Proposition 8.11.

3. Finish the proof of Proposition 8.9.

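4. Use the map $f(x) = x/(1 + x)$ to prove that

$\operatorname{Card}(\mathbb{R}^+) = \operatorname{Card}((0, 1))$.

5. Find a one-to-one map of $\mathbb{R}$ onto $\mathbb{R}^+$ and conclude that $\operatorname{Card}(\mathbb{R}) = \operatorname{Card}((0, 1))$.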

6. Use an interlacing of infinite decimal expansions to prove that

Card((0, 1) × (0, 1)) = Card((0, 1)).

7. Prove (8.10).

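8. Let $m \in \mathbb{Z}$, $n \in \mathbb{N}$, and consider

$S_{m,n} = \{k \in \mathbb{Z} : m + 1 \le k \le m + n\}$.

Show that
$\operatorname{Card} S_{m,n} = n$.
Hint. Produce a bijective map $I_n \to S_{m,n}$.

9. Let $S$ and $T$ be sets. Assume

$\operatorname{Card} S = m$, $\operatorname{Card} T = n$, $S \cap T = \emptyset$,

with $m, n \in \mathbb{N}$. Show that

$\operatorname{Card}(S \cup T) = m + n$.
Hint. Produce bijective maps $S \to I_m$ and $T \to S_{m,n}$, leading to a bijection $S \cup T \to I_{m+n}$.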


9. Metric properties of R

We discuss a number of notions and results related to convergence in $\mathbb{R}$. Recall that a
sequence of points $(p_j)$ in $\mathbb{R}$ converges to a limit $p \in \mathbb{R}$ (we write $p_j \to p$) if and only if for
every $\varepsilon > 0$ there exists $N$ such that

(9.1) $j \ge N \implies |p_j - p| < \varepsilon$.

A set $S \subset \mathbb{R}$ is said to be closed if and only if

(9.2) $p_j \in S,\ p_j \to p \implies p \in S$.

The complement $\mathbb{R} \setminus S$ of a closed set $S$ is open. Alternatively, $\Omega \subset \mathbb{R}$ is open if and only
if, given $q \in \Omega$, there exists $\varepsilon > 0$ such that $B_\varepsilon(q) \subset \Omega$, where

(9.3) $B_\varepsilon(q) = \{p \in \mathbb{R} : |p - q| < \varepsilon\}$,

so $q$ cannot be a limit of a sequence of points in $\mathbb{R} \setminus \Omega$.
We define the closure $\overline{S}$ of a set $S \subset \mathbb{R}$ to consist of all points $p \in \mathbb{R}$ such that
$B_\varepsilon(p) \cap S \ne \emptyset$ for all $\varepsilon > 0$. Equivalently, $p \in \overline{S}$ if and only if there exists an infinite
sequence $(p_j)$ of points in $S$ such that $p_j \to p$.
An important property of $\mathbb{R}$ is completeness, which we recall is defined as follows. A
sequence $(p_j)$ of points in $\mathbb{R}$ is called a Cauchy sequence if and only if

(9.4) $|p_j - p_k| \to 0$, as $j, k \to \infty$.

It is easy to see that if $p_j \to p$ for some $p \in \mathbb{R}$, then (9.4) holds. The completeness property
is the converse, given in Theorem 6.8, which we recall here.
Theorem 9.1. If $(p_j)$ is a Cauchy sequence in $\mathbb{R}$, then it has a limit.
Completeness provides a path to the following key notion of compactness. A nonempty
set $K \subset \mathbb{R}$ is said to be compact if and only if the following property holds.

(9.5) Each infinite sequence $(p_j)$ in $K$ has a subsequence that converges to a point in $K$.

It is clear that if $K$ is compact, then it must be closed. It must also be bounded, i.e., there
exists $R < \infty$ such that $K \subset B_R(0)$. Indeed, if $K$ is not bounded, there exist $p_j \in K$ such
that $|p_{j+1}| \ge |p_j| + 1$. In such a case, $|p_j - p_k| \ge 1$ whenever $j \ne k$, so $(p_j)$ cannot have a
convergent subsequence. The following converse statement is a key result.
Theorem 9.2. If a nonempty $K \subset \mathbb{R}$ is closed and bounded, then it is compact.
Clearly every nonempty closed subset of a compact set is compact, so Theorem 9.2 is a
consequence of:


Proposition 9.3. Each closed bounded interval I = [a, b] ⊂ R is compact.


Proof. This is a direct consequence of Theorem 6.9, the Bolzano-Weierstrass theorem.
Let K ⊂ R be compact. Since K is bounded from above and from below, we have well
defined real numbers

(9.6) b = sup K, a = inf K,

the first by Proposition 6.11, and the second by a similar argument (cf. Exercise 2 of §6).
Since a and b are limits of elements of K, we have a, b ∈ K. We use the notation

(9.7) b = max K, a = min K.

We next discuss continuity. If S ⊂ R, a function

(9.8) f : S −→ R

is said to be continuous at p ∈ S provided

(9.9) $p_j \in S,\ p_j \to p \implies f(p_j) \to f(p)$.

If f is continuous at each p ∈ S, we say f is continuous on S.


The following two results give important connections between continuity and compactness.
Proposition 9.4. If $K \subset \mathbb{R}$ is compact and $f : K \to \mathbb{R}$ is continuous, then $f(K)$ is
compact.
Proof. If $(q_k)$ is an infinite sequence of points in $f(K)$, pick $p_k \in K$ such that $f(p_k) = q_k$.
Since $K$ is compact, we have a subsequence $p_{k_\nu} \to p$ in $K$, and then $q_{k_\nu} \to f(p)$ in $\mathbb{R}$.
This leads to the second connection.
Proposition 9.5. If $K \subset \mathbb{R}$ is compact and $f : K \to \mathbb{R}$ is continuous, then there exists
$p \in K$ such that

(9.10) $f(p) = \max_{x \in K} f(x)$,

and there exists $q \in K$ such that

(9.11) $f(q) = \min_{x \in K} f(x)$.

Proof. Since $f(K)$ is compact, we have well defined numbers

(9.12) $b = \max f(K)$, $a = \min f(K)$, $a, b \in f(K)$.

So take $p, q \in K$ such that $f(p) = b$ and $f(q) = a$.


The next result is called the intermediate value theorem.


Proposition 9.6. Take a, b, c ∈ R, a < b. Let f : [a, b] → R be continuous. Assume

(9.13) f (a) < c < f (b).

Then there exists x ∈ (a, b) such that f (x) = c.


Proof. Let

(9.14) S = {y ∈ [a, b] : f (y) ≤ c}.

Then $a \in S$, so $S$ is a nonempty, closed (hence compact) subset of $[a, b]$. Note that $b \notin S$.
Take

(9.15) $x = \max S$.

Then $a < x < b$ and $f(x) \le c$. If $f(x) < c$, then there exists $\varepsilon > 0$ such that $a < x - \varepsilon <
x + \varepsilon < b$ and $f(y) < c$ for $x - \varepsilon < y < x + \varepsilon$. Thus $x + \varepsilon/2 \in S$, contradicting (9.15).
Hence $f(x) = c$.
Returning to the issue of compactness, we establish some further properties of compact
sets $K \subset \mathbb{R}$, leading to the important result, Proposition 9.10 below.
Proposition 9.7. Let $K \subset \mathbb{R}$ be compact. Assume $X_1 \supset X_2 \supset X_3 \supset \cdots$ form a decreasing
sequence of closed subsets of $K$. If each $X_m \ne \emptyset$, then $\bigcap_m X_m \ne \emptyset$.
Proof. Pick $x_m \in X_m$. Since $K$ is compact, $(x_m)$ has a convergent subsequence, $x_{m_k} \to y$.
Since $\{x_{m_k} : k \ge \ell\} \subset X_{m_\ell}$, which is closed, we have $y \in \bigcap_m X_m$.
Corollary 9.8. Let $K \subset \mathbb{R}$ be compact. Assume $U_1 \subset U_2 \subset U_3 \subset \cdots$ form an increasing
sequence of open sets in $\mathbb{R}$. If $\bigcup_m U_m \supset K$, then $U_M \supset K$ for some $M$.
Proof. Consider $X_m = K \setminus U_m$.
Before getting to Proposition 9.10, we bring in the following. Let $\mathbb{Q}$ denote the set of
rational numbers. The set $\mathbb{Q} \subset \mathbb{R}$ has the following "denseness" property: given $p \in \mathbb{R}$
and $\varepsilon > 0$, there exists $q \in \mathbb{Q}$ such that $|p - q| < \varepsilon$. Let

(9.16) $\mathcal{R} = \{B_{r_j}(q_j) : q_j \in \mathbb{Q},\ r_j \in \mathbb{Q} \cap (0, \infty)\}$.

Note that $\mathbb{Q}$ is countable, i.e., it can be put in one-to-one correspondence with $\mathbb{N}$. Hence $\mathcal{R}$
is a countable collection of balls. The following lemma is left as an exercise for the reader.
Lemma 9.9. Let $\Omega \subset \mathbb{R}$ be a nonempty open set. Then

(9.17) $\Omega = \bigcup \{B : B \in \mathcal{R},\ B \subset \Omega\}$.

To state the next result, we say that a collection $\{U_\alpha : \alpha \in A\}$ covers $K$ if $K \subset \bigcup_{\alpha \in A} U_\alpha$.
If each $U_\alpha \subset \mathbb{R}$ is open, it is called an open cover of $K$. If $B \subset A$ and $K \subset \bigcup_{\beta \in B} U_\beta$, we
say $\{U_\beta : \beta \in B\}$ is a subcover. The following result is called the Heine-Borel theorem.


Proposition 9.10. If $K \subset \mathbb{R}$ is compact, then it has the following property.

(9.18) Every open cover $\{U_\alpha : \alpha \in A\}$ of $K$ has a finite subcover.

Proof. By Lemma 9.9, it suffices to prove the following.

(9.19) Every countable cover $\{B_j : j \in \mathbb{N}\}$ of $K$ by open intervals has a finite subcover.

For this, we set

(9.20) $U_m = B_1 \cup \cdots \cup B_m$

and apply Corollary 9.8.

Exercises

1. Consider a polynomial $p(x) = x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$. Assume each $a_j \in \mathbb{R}$
and $n$ is odd. Use the intermediate value theorem to show that $p(x) = 0$ for some $x \in \mathbb{R}$.

We describe the construction of a Cantor set. Take a closed, bounded interval $[a, b] = C_0$.
Let $C_1$ be obtained from $C_0$ by deleting the open middle third interval, of length $(b - a)/3$.
At the $j$th stage, $C_j$ is a disjoint union of $2^j$ closed intervals, each of length $3^{-j}(b - a)$. Then
$C_{j+1}$ is obtained from $C_j$ by deleting the open middle third of each of these $2^j$ intervals.
We have $C_0 \supset C_1 \supset \cdots \supset C_j \supset \cdots$, each a closed subset of $[a, b]$.
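The stages of this construction are easy to generate explicitly. The sketch below uses exact rational endpoints; the function names `cantor_stage` and `cantor_intervals` are illustrative and not taken from the text.

```python
from fractions import Fraction


def cantor_stage(intervals):
    """Delete the open middle third of each closed interval (lo, hi)."""
    out = []
    for lo, hi in intervals:
        third = (hi - lo) / 3
        out.append((lo, lo + third))
        out.append((hi - third, hi))
    return out


def cantor_intervals(a, b, j):
    """Return the closed intervals whose disjoint union is the j-th stage built from [a, b]."""
    intervals = [(Fraction(a), Fraction(b))]
    for _ in range(j):
        intervals = cantor_stage(intervals)
    return intervals


for lo, hi in cantor_intervals(0, 1, 2):
    print(lo, hi)
```

At stage $j$ the list has $2^j$ intervals, each of length $3^{-j}(b - a)$, in agreement with the description above.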

2. Show that

(9.21) $C = \bigcap_{j \ge 0} C_j$

is nonempty, and compact. This is the Cantor set.

3. Suppose $C$ is formed as above, with $[a, b] = [0, 1]$. Show that points in $C$ are precisely
those of the form

(9.22) $\xi = \sum_{j=1}^{\infty} b_j 3^{-j}$, $b_j \in \{0, 2\}$.

4. If p, q ∈ C (and p < q), show that the interval [p, q] must contain points not in C. One
says C is totally disconnected.

5. If $p \in C$, $\varepsilon > 0$, show that $(p - \varepsilon, p + \varepsilon)$ contains infinitely many points in $C$. Given that
$C$ is closed, one says $C$ is perfect.


6. Show that $\operatorname{Card}(C) = \operatorname{Card}(\mathbb{R})$.

Hint. With $\xi$ as in (9.22) show that

$\xi \mapsto \eta = \sum_{j=1}^{\infty} \Bigl( \frac{b_j}{2} \Bigr) 2^{-j}$

maps $C$ onto $[0, 1]$.

Remark. At this point, we mention the


Continuum Hypothesis. If S ⊂ R is uncountable, then Card S = Card R.
This hypothesis has been shown not to be amenable to proof or disproof, from the standard
axioms of set theory. See [C]. However, there is a large class of sets for which the conclusion
holds. For example, it holds whenever S ⊂ R is uncountable and compact. See Exercises
7–9 in §3 of Chapter 2 for further results along this line.

7. Show that Proposition 9.6 implies Proposition 7.1.

8. In the setting of Proposition 9.6 (the intermediate value theorem), in which $f : [a, b] \to \mathbb{R}$
is continuous and $f(a) < c < f(b)$, consider the following.

(a) Divide $I = [a, b]$ into two equal intervals $I_\ell$ and $I_r$, meeting at the midpoint $\alpha_0 =
(a + b)/2$. Select $I_1 = I_\ell$ if $f(\alpha_0) \ge c$, $I_1 = I_r$ if $f(\alpha_0) < c$. Say $I_1 = [x_1, y_1]$. Note that
$f(x_1) < c$, $f(y_1) \ge c$.

(b) Divide $I_1$ into two equal intervals $I_{1\ell}$ and $I_{1r}$, meeting at the midpoint $(x_1 + y_1)/2 = \alpha_1$.
Select $I_2 = I_{1\ell}$ if $f(\alpha_1) \ge c$, $I_2 = I_{1r}$ if $f(\alpha_1) < c$. Say $I_2 = [x_2, y_2]$. Note that
$f(x_2) < c$, $f(y_2) \ge c$.

(c) Continue. Having $I_k = [x_k, y_k]$, of length $2^{-k}(b - a)$, with $f(x_k) < c$, $f(y_k) \ge c$,
divide $I_k$ into two equal intervals $I_{k\ell}$ and $I_{kr}$, meeting at the midpoint $\alpha_k = (x_k + y_k)/2$.
Select $I_{k+1} = I_{k\ell}$ if $f(\alpha_k) \ge c$, $I_{k+1} = I_{kr}$ if $f(\alpha_k) < c$. Again, $I_{k+1} = [x_{k+1}, y_{k+1}]$ with
$f(x_{k+1}) < c$ and $f(y_{k+1}) \ge c$.

(d) Show that there exists $x \in (a, b)$ such that

$x_k \nearrow x$, $y_k \searrow x$, and $f(x) = c$.

This method of approximating a solution to $f(x) = c$ is called the bisection method.
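Steps (a) through (c) translate directly into a loop. The following is a minimal sketch in floating point arithmetic, assuming $f$ is continuous with $f(a) < c < f(b)$; the function name `bisect_solve` and the default number of steps are our choices.

```python
def bisect_solve(f, a, b, c, steps=50):
    """Return (x_k, y_k) after `steps` bisections, with f(x_k) < c <= f(y_k)."""
    x, y = a, b
    for _ in range(steps):
        alpha = (x + y) / 2          # midpoint of the current interval
        if f(alpha) >= c:
            y = alpha                # select the left half
        else:
            x = alpha                # select the right half
    return x, y


# Example: approximate the cube root of 2 by solving t^3 = 2 on [0, 2].
x, y = bisect_solve(lambda t: t**3, 0.0, 2.0, 2.0)
print(x, y)
```

Each step halves the length of the interval, so after $k$ steps both endpoints lie within $2^{-k}(b - a)$ of a solution.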


10. Complex numbers

A complex number is a number of the form

(10.1) $z = x + iy$, $x, y \in \mathbb{R}$,

where the new object $i$ has the property

(10.2) $i^2 = -1$.

We denote the set of complex numbers by $\mathbb{C}$. We have $\mathbb{R} \hookrightarrow \mathbb{C}$, identifying $x \in \mathbb{R}$ with
$x + i0 \in \mathbb{C}$.
We define addition and multiplication in $\mathbb{C}$ as follows. Suppose $w = a + ib$, $a, b \in \mathbb{R}$.
We set

(10.3) $z + w = (x + a) + i(y + b)$, $zw = (xa - yb) + i(xb + ya)$.

It is routine to verify various commutative, associative, and distributive laws, parallel to
those in Proposition 4.3. If $z \ne 0$, i.e., either $x \ne 0$ or $y \ne 0$, we can set

(10.4) $z^{-1} = \frac{1}{z} = \frac{x}{x^2 + y^2} - i \frac{y}{x^2 + y^2}$,

and verify that $z z^{-1} = 1$.
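The rules (10.3) and (10.4) can be checked mechanically by representing $x + iy$ as the pair $(x, y)$. This is a sketch with exact rational entries; the helper names `add`, `mul`, and `inv` are illustrative.

```python
from fractions import Fraction


def add(z, w):
    """(x + iy) + (a + ib), as in (10.3)."""
    (x, y), (a, b) = z, w
    return (x + a, y + b)


def mul(z, w):
    """(x + iy)(a + ib) = (xa - yb) + i(xb + ya), as in (10.3)."""
    (x, y), (a, b) = z, w
    return (x * a - y * b, x * b + y * a)


def inv(z):
    """Inverse of a nonzero z = x + iy, as in (10.4)."""
    x, y = z
    n = x * x + y * y                # nonzero whenever z != 0
    return (x / n, -y / n)


i = (Fraction(0), Fraction(1))
z = (Fraction(3), Fraction(4))
print(mul(i, i))                     # the pair representing -1
print(mul(z, inv(z)))                # the pair representing 1
```

Using `Fraction` entries keeps the check of $z z^{-1} = 1$ exact rather than approximate.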
For some more notation, for $z \in \mathbb{C}$ of the form (10.1), we set

(10.5) $\bar{z} = x - iy$, $\operatorname{Re} z = x$, $\operatorname{Im} z = y$.

We say $\bar{z}$ is the complex conjugate of $z$, $\operatorname{Re} z$ is the real part of $z$, and $\operatorname{Im} z$ is the imaginary
part of $z$.
We next discuss the concept of the magnitude (or absolute value) of an element $z \in \mathbb{C}$.
If $z$ has the form (10.1), we take a cue from the Pythagorean theorem, giving the Euclidean
distance from $z$ to $0$, and set

(10.6) $|z| = \sqrt{x^2 + y^2}$.

Note that

(10.7) $|z|^2 = z \bar{z}$.

With this notation, (10.4) takes the compact (and clear) form

(10.8) $z^{-1} = \frac{\bar{z}}{|z|^2}$.

We have

(10.9) $|zw| = |z| \cdot |w|$,

for $z, w \in \mathbb{C}$, as a consequence of the identity (readily verified from the definition (10.5))

(10.10) $\overline{zw} = \bar{z} \cdot \bar{w}$.

In fact, $|zw|^2 = (zw)(\overline{zw}) = z w \bar{z} \bar{w} = z \bar{z} w \bar{w} = |z|^2 |w|^2$. This extends the first part of
(6.21) from $\mathbb{R}$ to $\mathbb{C}$. The extension of the second part also holds, but it requires a little
more work. The following is the triangle inequality in $\mathbb{C}$.


Proposition 10.1. Given $z, w \in \mathbb{C}$,

(10.11) $|z + w| \le |z| + |w|$.

Proof. We compare the squares of each side of (10.11). First,

(10.12) $|z + w|^2 = (z + w)(\bar{z} + \bar{w}) = |z|^2 + |w|^2 + w\bar{z} + z\bar{w} = |z|^2 + |w|^2 + 2 \operatorname{Re} z\bar{w}$.

Now, for any $\zeta \in \mathbb{C}$, $\operatorname{Re} \zeta \le |\zeta|$, so $\operatorname{Re} z\bar{w} \le |z\bar{w}| = |z| \cdot |w|$, so (10.12) is

(10.13) $\le |z|^2 + |w|^2 + 2|z| \cdot |w| = (|z| + |w|)^2$,

and we have (10.11).
We now discuss matters related to convergence in $\mathbb{C}$. Parallel to the real case, we say
a sequence $(z_j)$ in $\mathbb{C}$ converges to a limit $z \in \mathbb{C}$ (and write $z_j \to z$) if and only if for each
$\varepsilon > 0$ there exists $N$ such that

(10.14) $j \ge N \implies |z_j - z| < \varepsilon$.

Equivalently,

(10.15) $z_j \to z \iff |z_j - z| \to 0$.

It is easily seen that

(10.16) $z_j \to z \iff \operatorname{Re} z_j \to \operatorname{Re} z$ and $\operatorname{Im} z_j \to \operatorname{Im} z$.

The set $\mathbb{C}$ also has the completeness property, given as follows. A sequence $(z_j)$ in $\mathbb{C}$ is
said to be a Cauchy sequence if and only if

(10.17) $|z_j - z_k| \to 0$, as $j, k \to \infty$.

It is easy to see (using the triangle inequality) that if $z_j \to z$ for some $z \in \mathbb{C}$, then (10.17)
holds. Here is the converse:
Proposition 10.2. If $(z_j)$ is a Cauchy sequence in $\mathbb{C}$, then it has a limit.
Proof. If $(z_j)$ is Cauchy in $\mathbb{C}$, then $(\operatorname{Re} z_j)$ and $(\operatorname{Im} z_j)$ are Cauchy in $\mathbb{R}$, so, by Theorem
6.8, they have limits. By (10.16), $(z_j)$ then converges in $\mathbb{C}$.
We turn to infinite series $\sum_{k=0}^{\infty} a_k$, with $a_k \in \mathbb{C}$. We say this converges if and only if
the sequence of partial sums

(10.18) $S_n = \sum_{k=0}^{n} a_k$

converges:

(10.19) $\sum_{k=0}^{\infty} a_k = A \iff S_n \to A$ as $n \to \infty$.

The following is a useful condition guaranteeing convergence. Compare Proposition 6.12.


Proposition 10.3. The infinite series $\sum_{k=0}^{\infty} a_k$ converges provided

(10.20) $\sum_{k=0}^{\infty} |a_k| < \infty$,

i.e., there exists $B < \infty$ such that $\sum_{k=0}^{n} |a_k| \le B$ for all $n$.
Proof. The triangle inequality gives, for $\ell \ge 1$,

(10.21) $|S_{n+\ell} - S_n| = \Bigl| \sum_{k=n+1}^{n+\ell} a_k \Bigr| \le \sum_{k=n+1}^{n+\ell} |a_k|$,

which tends to 0 as $n \to \infty$, uniformly in $\ell \ge 1$, provided (10.20) holds (cf. (6.39)–(6.40)).
Hence (10.20) $\Rightarrow$ $(S_n)$ is Cauchy. Convergence then follows, by Proposition 10.2.
As in the real case, if (10.20) holds, we say the infinite series $\sum_{k=0}^{\infty} a_k$ is absolutely
convergent.
An example to which Proposition 10.3 applies is the following power series, giving the
exponential function $e^z$:

(10.22) $e^z = \sum_{k=0}^{\infty} \frac{z^k}{k!}$, $z \in \mathbb{C}$.

Compare Exercise 13 of §6. The exponential function is explored in depth in §5 of Chapter 4.
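As a numerical illustration of (10.22), one can compare partial sums of the series with a library value. The sketch below uses Python's built-in complex type and `cmath.exp` purely as a reference point; the function name `exp_partial_sum` and the cutoff of 40 terms are illustrative choices.

```python
import cmath


def exp_partial_sum(z: complex, n: int) -> complex:
    """Return S_n = sum_{k=0}^{n} z^k / k! for the series (10.22)."""
    term, total = 1 + 0j, 0j
    for k in range(n + 1):
        total += term                # here term equals z**k / k!
        term *= z / (k + 1)          # advance to z**(k+1) / (k+1)!
    return total


z = 1 + 2j
print(exp_partial_sum(z, 40))
print(cmath.exp(z))                  # library value, for comparison
```

Updating each term from the previous one avoids computing large factorials and large powers separately.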
We turn to a discussion of polar coordinates on $\mathbb{C}$. Given a nonzero $z \in \mathbb{C}$, we can write

(10.23) $z = r\omega$, $r = |z|$, $\omega = \frac{z}{|z|}$.

Then ω has unit distance from 0. If the ray from 0 to ω makes an angle θ with the positive
real axis, we have

(10.24) Re ω = cos θ, Im ω = sin θ,

by definition of the trigonometric functions cos and sin. Hence

(10.25) z = r cis θ,

where

(10.26) cis θ = cos θ + i sin θ.


If also
(10.27) w = ρ cis ϕ, ρ = |w|,
then
(10.28) zw = rρ cis(θ + ϕ),
as a consequence of the identity
(10.29) cis(θ + ϕ) = (cis θ)(cis ϕ),
which in turn is equivalent to the pair of trigonometric identities

(10.30) $\cos(\theta + \varphi) = \cos\theta \cos\varphi - \sin\theta \sin\varphi$, $\sin(\theta + \varphi) = \cos\theta \sin\varphi + \sin\theta \cos\varphi$.

There is another way to write (10.25), using the classical Euler identity

(10.31) $e^{i\theta} = \cos\theta + i \sin\theta$.

Then (10.25) becomes

(10.32) $z = r e^{i\theta}$.

The identity (10.29) is equivalent to

(10.33) $e^{i(\theta + \varphi)} = e^{i\theta} e^{i\varphi}$.

We will present a self-contained derivation of (10.31) (and also of (10.30) and (10.33)) in
Chapter 4.
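Pending that derivation, the identities (10.28) and (10.29) can at least be checked numerically. The sketch below is a floating point sanity check, not a proof; it takes the library functions `math.cos`, `math.sin`, and `math.atan2` for granted, and choosing the angle via `atan2` is our assumption.

```python
import math


def cis(theta: float) -> complex:
    """cis(theta) = cos(theta) + i sin(theta), as in (10.26)."""
    return complex(math.cos(theta), math.sin(theta))


def polar(z: complex):
    """Return (r, theta) with z = r cis(theta), for nonzero z."""
    return abs(z), math.atan2(z.imag, z.real)


z, w = 1 + 2j, -3 + 0.5j
r, theta = polar(z)
rho, phi = polar(w)

# (10.28): multiply the moduli and add the angles.
product_via_polar = r * rho * cis(theta + phi)
print(product_via_polar, z * w)
```

The two printed values agree up to rounding error, as (10.28) predicts.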
We next define closed and open subsets of $\mathbb{C}$, and discuss the notion of compactness. A
set $S \subset \mathbb{C}$ is said to be closed if and only if

(10.34) $z_j \in S,\ z_j \to z \implies z \in S$.

The complement $\mathbb{C} \setminus S$ of a closed set $S$ is open. Alternatively, $\Omega \subset \mathbb{C}$ is open if and only
if, given $q \in \Omega$, there exists $\varepsilon > 0$ such that $B_\varepsilon(q) \subset \Omega$, where

(10.35) $B_\varepsilon(q) = \{z \in \mathbb{C} : |z - q| < \varepsilon\}$,

so $q$ cannot be a limit of a sequence of points in $\mathbb{C} \setminus \Omega$. We define the closure $\overline{S}$ of a set
$S \subset \mathbb{C}$ to consist of all points $p \in \mathbb{C}$ such that $B_\varepsilon(p) \cap S \ne \emptyset$ for all $\varepsilon > 0$. Equivalently,
$p \in \overline{S}$ if and only if there exists an infinite sequence $(p_j)$ of points in $S$ such that $p_j \to p$.
Parallel to (9.5), we say a nonempty set $K \subset \mathbb{C}$ is compact if and only if the following
property holds.

(10.36) Each infinite sequence $(p_j)$ in $K$ has a subsequence that converges to a point in $K$.

As in §9, if $K \subset \mathbb{C}$ is compact, it must be closed and bounded. Parallel to Theorem 9.2,
we have the converse.


Proposition 10.4. If a nonempty $K \subset \mathbb{C}$ is closed and bounded, then it is compact.

Proof. Let $(z_j)$ be a sequence in $K$. Then $(\operatorname{Re} z_j)$ and $(\operatorname{Im} z_j)$ are bounded, so Theorem
6.9 implies the existence of a subsequence such that $\operatorname{Re} z_{j_\nu}$ and $\operatorname{Im} z_{j_\nu}$ converge. Hence the
subsequence $(z_{j_\nu})$ converges in $\mathbb{C}$. Since $K$ is closed, the limit must belong to $K$.
If $S \subset \mathbb{C}$, a function

(10.37) $f : S \to \mathbb{C}$

is said to be continuous at $p \in S$ provided

(10.38) $p_j \in S,\ p_j \to p \implies f(p_j) \to f(p)$.

If $f$ is continuous at each $p \in S$, we say $f$ is continuous on $S$. The following result has the
same proof as Proposition 9.4.
Proposition 10.5. If $K \subset \mathbb{C}$ is compact and $f : K \to \mathbb{C}$ is continuous, then $f(K)$ is
compact.
Then the following variant of Proposition 9.5 is straightforward.
Proposition 10.6. If $K \subset \mathbb{C}$ is compact and $f : K \to \mathbb{C}$ is continuous, then there exists
$p \in K$ such that

(10.39) $|f(p)| = \max_{z \in K} |f(z)|$,

and there exists $q \in K$ such that

(10.40) $|f(q)| = \min_{z \in K} |f(z)|$.

There are also straightforward extensions to $K \subset \mathbb{C}$ of Propositions 9.7–9.10. We omit
the details. But see §1 of Chapter 2 for further extensions.

Exercises

We define π as the smallest positive number such that

cis π = −1.

See Chapter 4, §§4–5 for more on this matter.

1. Show that

$\omega = \operatorname{cis} \frac{2\pi}{n} \implies \omega^n = 1$.


For this, use (10.29). In conjunction with (10.25)–(10.28) and Proposition 7.1, use this to
prove the following:

Given $a \in \mathbb{C}$, $a \ne 0$, $n \in \mathbb{N}$, there exist $z_1, \dots, z_n \in \mathbb{C}$ such that $z_j^n = a$.

2. Compute

$\Bigl( \frac{1}{2} + \frac{\sqrt{3}}{2}\, i \Bigr)^3$,

and verify that

(10.41) $\cos \frac{\pi}{3} = \frac{1}{2}$, $\sin \frac{\pi}{3} = \frac{\sqrt{3}}{2}$.

3. Find $z_1, \dots, z_n$ such that

(10.42) $z_j^n = 1$,

explicitly in the form $a + ib$ (not simply as $\operatorname{cis}(2\pi j/n)$), in case

(10.43) $n = 3, 4, 6, 8$.

Hint. Use (10.41), and also the fact that the equation $u_j^2 = i$ has solutions

(10.44) $u_1 = \frac{1}{\sqrt{2}} + \frac{i}{\sqrt{2}}$, $u_2 = -u_1$.

4. Take the following path to finding the 5 solutions to

(10.45) $z_j^5 = 1$.

One solution is $z_1 = 1$. Since $z^5 - 1 = (z - 1)(z^4 + z^3 + z^2 + z + 1)$, we need to find 4
solutions to $z^4 + z^3 + z^2 + z + 1 = 0$. Write this as

(10.46) $z^2 + z + 1 + \frac{1}{z} + \frac{1}{z^2} = 0$,

which, for

(10.47) $w = z + \frac{1}{z}$,

becomes

(10.48) $w^2 + w - 1 = 0$.


Use the quadratic formula to find 2 solutions to (10.48). Then solve (10.47), i.e.,
$z^2 - wz + 1 = 0$, for $z$. Use these calculations to show that

$\cos \frac{2\pi}{5} = \frac{\sqrt{5} - 1}{4}$.

5. Take the following path to explicitly finding the real and imaginary parts of a solution
to
$z^2 = a + ib$.
Namely, with $x = \operatorname{Re} z$, $y = \operatorname{Im} z$, we have

$x^2 - y^2 = a$, $2xy = b$,

and also
$x^2 + y^2 = \rho = \sqrt{a^2 + b^2}$,
hence
$x = \sqrt{\frac{\rho + a}{2}}$, $y = \frac{b}{2x}$,
as long as $a + ib \ne -|a|$.
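The formulas in this exercise can be tried numerically before proving them. The sketch below is a floating point check under the stated restriction $a + ib \ne -|a|$ (so that $x \ne 0$); the function name `complex_sqrt` is ours.

```python
import math


def complex_sqrt(a: float, b: float):
    """Return (x, y) with (x + iy)^2 = a + ib, assuming a + ib is not a real number <= 0."""
    rho = math.hypot(a, b)           # rho = sqrt(a^2 + b^2)
    x = math.sqrt((rho + a) / 2)
    y = b / (2 * x)
    return x, y


x, y = complex_sqrt(3.0, 4.0)
print(x, y)
print(complex(x, y) ** 2)            # should reproduce 3 + 4i up to rounding
```

The other solution of $z^2 = a + ib$ is $-(x + iy)$.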

6. Taking a cue from Exercise 4 of §6, show that

(10.49) $\frac{1}{1 - z} = \sum_{k=0}^{\infty} z^k$, for $z \in \mathbb{C}$, $|z| < 1$.

7. Show that

$\frac{1}{1 - z^2} = \sum_{k=0}^{\infty} z^{2k}$, for $z \in \mathbb{C}$, $|z| < 1$.

8. Produce a power series expansion in $z$, valid for $|z| < 1$, for

$\frac{1}{1 + z^2}$.


Chapter II
Spaces

Introduction

In Chapter 1 we developed the real number line R, and established a number of metric
properties, such as completeness of R, and compactness of closed, bounded subsets. We
also produced the complex plane C, and studied analogous metric properties of C. Here
we examine other types of spaces, which are useful in analysis.
Section 1 treats n-dimensional Euclidean space, $\mathbb{R}^n$. This is equipped with a dot product
$x \cdot y \in \mathbb{R}$, which gives rise to a norm $|x| = \sqrt{x \cdot x}$. Parallel to (6.21) and (10.11) of Chapter
1, this norm satisfies the triangle inequality. In this setting, the proof goes through an
inequality known as Cauchy's inequality. Then the distance between $x$ and $y$ in $\mathbb{R}^n$ is
given by $d(x, y) = |x - y|$, and it satisfies a triangle inequality. With these structures, we
have the notion of convergent sequences and Cauchy sequences, and can show that $\mathbb{R}^n$ is
complete. There is a notion of compactness for subsets of $\mathbb{R}^n$, similar to that given in (9.5)
and in (10.36) of Chapter 1, for subsets of $\mathbb{R}$ and of $\mathbb{C}$, and it is shown that nonempty,
closed bounded subsets of $\mathbb{R}^n$ are compact.
Analysts have found it useful to abstract some of the structures mentioned above, and
apply them to a larger class of spaces, called metric spaces. A metric space is a set
X, equipped with a distance function d(x, y), satisfying certain conditions (see (2.1)),
including the triangle inequality. For such a space, one has natural notions of a convergent
sequence and of a Cauchy sequence. The space may or may not be complete. If not,
there is a construction of its completion, somewhat similar to the construction of R as the
completion of Q in §6 of Chapter 1. We discuss the definition and some basic properties
of metric spaces in §2. There is also a natural notion of compactness in the metric space
context, which we treat in §3.
Most metric spaces we will encounter are subsets of Euclidean space. One exception
introduced in this chapter is the class of infinite products; see (3.3). Another important
class of metric spaces beyond the Euclidean space setting consists of spaces of functions,
which will be treated in §4 of Chapter 3.


1. Euclidean spaces

The space R^n, n-dimensional Euclidean space, consists of n-tuples of real numbers:

(1.1) x = (x_1, . . . , x_n) ∈ R^n, x_j ∈ R, 1 ≤ j ≤ n.

The number x_j is called the jth component of x. Here we discuss some important algebraic
and metric structures on R^n. First, there is addition. If x is as in (1.1) and also y =
(y_1, . . . , y_n) ∈ R^n, we have

(1.2) x + y = (x_1 + y_1, . . . , x_n + y_n) ∈ R^n.

Addition is done componentwise. Also, given a ∈ R, we have

(1.3) ax = (ax_1, . . . , ax_n) ∈ R^n.

This is scalar multiplication.


We also have the dot product,

(1.4) x · y = \sum_{j=1}^{n} x_j y_j = x_1 y_1 + · · · + x_n y_n ∈ R,

given x, y ∈ R^n. The dot product has the properties

x · y = y · x,
(1.5) x · (ay + bz) = a(x · y) + b(x · z),
x · x > 0 unless x = 0.

Note that

(1.6) x · x = x_1^2 + · · · + x_n^2.

We set

(1.7) |x| = \sqrt{x · x},

which we call the norm of x. Note that (1.5) implies

(1.8) (ax) · (ax) = a^2 (x · x),

hence

(1.9) |ax| = |a| · |x|, for a ∈ R, x ∈ R^n.


Taking a cue from the Pythagorean theorem, we say that the distance from x to y in
R^n is

(1.10) d(x, y) = |x − y|.

For us, (1.7) and (1.10) are simply definitions. We do not need to depend on the Pythagorean
theorem. Significant properties will be derived below, without recourse to the Pythagorean
theorem.
A set X equipped with a distance function is called a metric space. We will consider
metric spaces in general in the next section. Here, we want to show that the Euclidean
distance, defined by (1.10), satisfies the “triangle inequality,”

(1.11) d(x, y) ≤ d(x, z) + d(z, y), ∀ x, y, z ∈ Rn .

This in turn is a consequence of the following, also called the triangle inequality.
Proposition 1.1. The norm (1.7) on Rn has the property

(1.12) |x + y| ≤ |x| + |y|, ∀ x, y ∈ Rn .

Proof. We compare the squares of the two sides of (1.12). First,

|x + y|^2 = (x + y) · (x + y)
(1.13) = x · x + y · x + x · y + y · y
= |x|^2 + 2x · y + |y|^2.

Next,

(1.14) (|x| + |y|)^2 = |x|^2 + 2|x| · |y| + |y|^2.

We see that (1.12) holds if and only if x · y ≤ |x| · |y|. Thus the proof of Proposition 1.1 is
finished off by the following result, known as Cauchy’s inequality.
Proposition 1.2. For all x, y ∈ R^n,

(1.15) |x · y| ≤ |x| · |y|.

Proof. We start with the chain

(1.16) 0 ≤ |x − y|^2 = (x − y) · (x − y) = |x|^2 + |y|^2 − 2x · y,

which implies

(1.17) 2x · y ≤ |x|^2 + |y|^2, ∀ x, y ∈ R^n.


If we replace x by tx and y by t^{-1} y, with t > 0, the left side of (1.17) is unchanged, so we
have

(1.18) 2x · y ≤ t^2 |x|^2 + t^{-2} |y|^2, ∀ t > 0.

Now we pick t so that the two terms on the right side of (1.18) are equal, namely

(1.19) t^2 = \frac{|y|}{|x|}, t^{-2} = \frac{|x|}{|y|}.

(At this point, note that (1.15) is obvious if x = 0 or y = 0, so we will assume that x ≠ 0
and y ≠ 0.) Plugging (1.19) into (1.18) gives

(1.20) x · y ≤ |x| · |y|, ∀ x, y ∈ R^n.

This is almost (1.15). To finish, we can replace x in (1.20) by −x = (−1)x, getting

(1.21) −(x · y) ≤ |x| · |y|,

and together (1.20) and (1.21) give (1.15).
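
The scaling step in the proof above can be checked numerically. The following Python sketch, with helper names (dot, norm, right_side_of_1_18, inequalities_hold) chosen only for illustration, evaluates the right side of (1.18) at the value of t from (1.19) and spot-checks (1.12) and (1.15) on random vectors. Random trials with a rounding tolerance are of course no substitute for the proof.

```python
import math
import random


def dot(x, y):
    """Dot product (1.4)."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Euclidean norm (1.7)."""
    return math.sqrt(dot(x, x))


def right_side_of_1_18(x, y, t):
    """The quantity t^2 |x|^2 + t^{-2} |y|^2 from (1.18)."""
    return t ** 2 * norm(x) ** 2 + t ** (-2) * norm(y) ** 2


def inequalities_hold(n, trials, seed=0, tol=1e-12):
    """Spot-check (1.15) and (1.12) on random vectors in R^n."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        y = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        s = [a + b for a, b in zip(x, y)]
        if abs(dot(x, y)) > norm(x) * norm(y) + tol:
            return False
        if norm(s) > norm(x) + norm(y) + tol:
            return False
    return True


x, y = [3.0, 4.0], [1.0, 0.0]
t = math.sqrt(norm(y) / norm(x))  # the choice (1.19)
print(right_side_of_1_18(x, y, t), 2.0 * norm(x) * norm(y))
print(inequalities_hold(n=5, trials=1000))
```

With t chosen as in (1.19), the two numbers on the first printed line agree up to rounding, which is the content of the step from (1.18) to (1.20).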
We now discuss a number of notions and results related to convergence in R^n. First, a
sequence of points (p_j) in R^n converges to a limit p ∈ R^n (we write p_j → p) if and only if

(1.22) |p_j − p| −→ 0,

where | · | is the Euclidean norm on R^n, defined by (1.7), and the meaning of (1.22) is
that for every ε > 0 there exists N such that

(1.23) j ≥ N =⇒ |p_j − p| < ε.

If we write p_j = (p_{1j}, . . . , p_{nj}) and p = (p_1, . . . , p_n), then (1.22) is equivalent to

(p_{1j} − p_1)^2 + · · · + (p_{nj} − p_n)^2 −→ 0, as j → ∞,

which holds if and only if

|p_{ℓj} − p_ℓ| −→ 0 as j → ∞, for each ℓ ∈ {1, . . . , n}.

That is to say, convergence p_j → p in R^n is equivalent to convergence of each component.
A set S ⊂ Rn is said to be closed if and only if
(1.24) pj ∈ S, pj → p =⇒ p ∈ S.
The complement Rn \ S of a closed set S is open. Alternatively, Ω ⊂ Rn is open if and
only if, given q ∈ Ω, there exists ε > 0 such that Bε (q) ⊂ Ω, where
(1.25) Bε (q) = {p ∈ Rn : |p − q| < ε},
so q cannot be a limit of a sequence of points in Rn \ Ω.
An important property of R^n is completeness, a property defined as follows. A sequence
(p_j) of points in R^n is called a Cauchy sequence if and only if

(1.26) |p_j − p_k| −→ 0, as j, k → ∞.

Again we see that (p_j) is Cauchy in R^n if and only if each component is Cauchy in R. It is
easy to see that if p_j → p for some p ∈ R^n, then (1.26) holds. The completeness property
is the converse.


Theorem 1.3. If (pj ) is a Cauchy sequence in Rn , then it has a limit, i.e., (1.22) holds
for some p ∈ Rn .
Proof. Since convergence pj → p in Rn is equivalent to convergence in R of each component,
the result is a consequence of the completeness of R. This was proved in Chapter 1.
Completeness provides a path to the following key notion of compactness. A nonempty
set K ⊂ R^n is said to be compact if and only if the following property holds.

(1.27) Each infinite sequence (p_j) in K has a subsequence that converges to a point in K.

It is clear that if K is compact, then it must be closed. It must also be bounded, i.e., there
exists R < ∞ such that K ⊂ B_R(0). Indeed, if K is not bounded, there exist p_j ∈ K such
that |p_{j+1}| ≥ |p_j| + 1. In such a case, |p_j − p_k| ≥ 1 whenever j ≠ k, so (p_j) cannot have a
convergent subsequence. The following converse statement is a key result.
Theorem 1.4. If a nonempty K ⊂ Rn is closed and bounded, then it is compact.
Proof. If K ⊂ Rn is closed and bounded, it is a closed subset of some box
(1.28) B = {(x1 , . . . , xn ) ∈ Rn : a ≤ xk ≤ b, ∀ k}.
Clearly every closed subset of a compact set is compact, so it suffices to show that B is
compact. Now, each closed bounded interval [a, b] in R is compact, as shown in §9 of
Chapter 1, and (by reasoning similar to the proof of Theorem 1.3) the compactness of B
follows readily from this.
We establish some further properties of compact sets K ⊂ Rn , leading to the important
result, Proposition 1.8 below. This generalizes results established for n = 1 in §9 of Chapter
1. A further generalization will be given in §3.
Proposition 1.5. Let K ⊂ R^n be compact. Assume X_1 ⊃ X_2 ⊃ X_3 ⊃ · · · form a
decreasing sequence of closed subsets of K. If each X_m ≠ ∅, then ∩_m X_m ≠ ∅.
Proof. Pick x_m ∈ X_m. If K is compact, (x_m) has a convergent subsequence, x_{m_k} → y.
Since {x_{m_k} : k ≥ ℓ} ⊂ X_{m_ℓ}, which is closed, we have y ∈ ∩_m X_m.
Corollary 1.6. Let K ⊂ Rn be compact. Assume U1 ⊂ U2 ⊂ U3 ⊂ · · · form an increasing
sequence of open sets in Rn . If ∪m Um ⊃ K, then UM ⊃ K for some M .
Proof. Consider Xm = K \ Um .
Before getting to Proposition 1.8, we bring in the following. Let Q denote the set of
rational numbers, and let Qn denote the set of points in Rn all of whose components are
rational. The set Qn ⊂ Rn has the following “denseness” property: given p ∈ Rn and
ε > 0, there exists q ∈ Qn such that |p − q| < ε. Let
(1.29) R = {Br (q) : q ∈ Qn , r ∈ Q ∩ (0, ∞)}.
Note that Q and Qn are countable, i.e., they can be put in one-to-one correspondence with
N. Hence R is a countable collection of balls. The following lemma is left as an exercise
for the reader.


Lemma 1.7. Let Ω ⊂ R^n be a nonempty open set. Then

(1.30) Ω = \bigcup {B : B ∈ R, B ⊂ Ω}.

To state the next result, we say that a collection {Uα : α ∈ A} covers K if K ⊂ ∪α∈A Uα .
If each Uα ⊂ Rn is open, it is called an open cover of K. If B ⊂ A and K ⊂ ∪β∈B Uβ , we
say {Uβ : β ∈ B} is a subcover.
Proposition 1.8. If K ⊂ Rn is compact, then it has the following property.

(1.31) Every open cover {Uα : α ∈ A} of K has a finite subcover.

Proof. By Lemma 1.7, it suffices to prove the following.

(1.32) Every countable cover {B_j : j ∈ N} of K by open balls has a finite subcover.

To see this, write R = {B_j : j ∈ N}. Given the cover {U_α}, pass to {B_j : j ∈ J}, where
j ∈ J if and only if B_j is contained in some U_α. By (1.30), {B_j : j ∈ J} covers K. If
(1.32) holds, we have a subcover {B_ℓ : ℓ ∈ L} for some finite L ⊂ J. Pick α_ℓ ∈ A such
that B_ℓ ⊂ U_{α_ℓ}. Then {U_{α_ℓ} : ℓ ∈ L} is the desired finite subcover advertised in (1.31).
Finally, to prove (1.32), we set

(1.33) U_m = B_1 ∪ · · · ∪ B_m

and apply Corollary 1.6.

Exercises

1. Identifying z = x + iy ∈ C with (x, y) ∈ R^2 and w = u + iv ∈ C with (u, v) ∈ R^2, show
that the dot product satisfies
z · w = Re z \overline{w}.
In light of this, compare the proof of Proposition 1.1 with that of Proposition 10.1 in
Chapter 1.

2. Show that the inequality (1.12) implies (1.11).

3. Prove Lemma 1.7.

4. Use Proposition 1.8 to prove the following extension of Proposition 1.5.


Proposition 1.9. Let K ⊂ R^n be compact. Assume {X_α : α ∈ A} is a collection of closed
subsets of K. Assume that for each finite set B ⊂ A, ∩_{α∈B} X_α ≠ ∅. Then

\bigcap_{α∈A} X_α ≠ ∅.

Hint. Consider Uα = Rn \ Xα .

5. Let K ⊂ R^n be compact. Show that there exist x_0, x_1 ∈ K such that

|x_0| ≤ |x|, ∀ x ∈ K,
|x_1| ≥ |x|, ∀ x ∈ K.

We say
|x_0| = \min_{x∈K} |x|, |x_1| = \max_{x∈K} |x|.


2. Metric spaces

A metric space is a set X, together with a distance function d : X × X → [0, ∞), having
the properties that

d(x, y) = 0 ⇐⇒ x = y,
(2.1) d(x, y) = d(y, x),
d(x, y) ≤ d(x, z) + d(y, z).

The third of these properties is called the triangle inequality. We sometimes denote this
metric space by (X, d). An example of a metric space is the set of rational numbers Q,
with d(x, y) = |x − y|. Another example is X = R^n, with

d(x, y) = \sqrt{(x_1 − y_1)^2 + · · · + (x_n − y_n)^2}.

This was treated in §1.
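
To make the three conditions in (2.1) concrete, here is a small Python sketch that tests them on a finite sample of points. The helper names (euclidean, squared_gap, satisfies_2_1_on) and the sample points are illustrative assumptions. Passing such a test on finitely many points does not prove that a function is a metric, but a failure does exhibit a counterexample, as with the squared difference on R below.

```python
import itertools
import math


def euclidean(x, y):
    """Euclidean distance on R^n, with points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def squared_gap(x, y):
    """(x - y)^2 on R: symmetric and zero only on the diagonal, but not a metric."""
    return (x[0] - y[0]) ** 2


def satisfies_2_1_on(points, d, tol=1e-12):
    """Check the three properties in (2.1) on a finite set of sample points."""
    for x, y in itertools.product(points, repeat=2):
        if (d(x, y) == 0) != (x == y):        # d(x, y) = 0 <=> x = y
            return False
        if abs(d(x, y) - d(y, x)) > tol:      # symmetry
            return False
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, y) > d(x, z) + d(y, z) + tol:  # triangle inequality
            return False
    return True


plane_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (-1.5, 0.5)]
line_points = [(0.0,), (1.0,), (2.0,)]
print(satisfies_2_1_on(plane_points, euclidean))
print(satisfies_2_1_on(line_points, squared_gap))
```

The second check fails because (0 − 2)^2 = 4 exceeds (0 − 1)^2 + (2 − 1)^2 = 2.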


If (xν ) is a sequence in X, indexed by ν = 1, 2, 3, . . . , i.e., by ν ∈ Z+ , one says

(2.2) xν → y ⇐⇒ d(xν , y) → 0, as ν → ∞.

One says (xν ) is a Cauchy sequence if and only if

(2.3) d(xν , xµ ) → 0 as µ, ν → ∞.

One says X is a complete metric space if every Cauchy sequence converges to a limit in
X. Some metric spaces are not complete; for example, Q is not complete. You can take a
sequence (x_ν) of rational numbers such that x_ν → \sqrt{2}, which is not rational. Then (x_ν) is
Cauchy in Q, but it has no limit in Q.
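
One concrete sequence of this kind can be produced with exact rational arithmetic. The sketch below uses the iteration x_{k+1} = (x_k + 2/x_k)/2 starting from x_0 = 1; this particular recursion (Newton's method applied to x^2 − 2) and the function name sqrt2_approximations are choices made here for illustration, since the text does not single out a specific sequence.

```python
from fractions import Fraction


def sqrt2_approximations(count):
    """Return the first `count` terms of x_{k+1} = (x_k + 2/x_k)/2, x_0 = 1, as exact rationals."""
    x = Fraction(1)
    terms = []
    for _ in range(count):
        terms.append(x)
        x = (x + 2 / x) / 2  # stays in Q: sums and quotients of rationals are rational
    return terms


terms = sqrt2_approximations(6)
for x in terms:
    # Each x is rational, x*x - 2 is never 0, yet it shrinks rapidly.
    print(x, float(x * x - 2))
```

Every term lies in Q and consecutive terms get closer together, while no term squares to 2, which is consistent with the sequence being Cauchy in Q without a limit in Q.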
If a metric space X is not complete, one can construct its completion \hat{X} as follows. Let
an element ξ of \hat{X} consist of an equivalence class of Cauchy sequences in X, where we say

(2.4) (x_ν) ∼ (x'_ν) ⇐⇒ d(x_ν, x'_ν) → 0.

We write the equivalence class containing (x_ν) as [x_ν]. If ξ = [x_ν] and η = [y_ν], we can set

(2.5) \hat{d}(ξ, η) = \lim_{ν→∞} d(x_ν, y_ν),

and verify that this is well defined, and makes \hat{X} a complete metric space. Details are
provided at the end of this section.
If the completion of Q is constructed by this process, you get R, the set of real numbers.
This construction was carried out in §6 of Chapter 1.


There are a number of useful concepts related to the notion of closeness. We define
some of them here. First, if p is a point in a metric space X and r ∈ (0, ∞), the set

(2.6) Br (p) = {x ∈ X : d(x, p) < r}

is called the open ball about p of radius r. Generally, a neighborhood of p ∈ X is a set
containing such a ball, for some r > 0.
A set S ⊂ X is said to be closed if and only if

(2.7) p_j ∈ S, p_j → p =⇒ p ∈ S.

The complement X \ S of a closed set is said to be open. Alternatively, U ⊂ X is open if
and only if

(2.8) q ∈ U =⇒ ∃ ε > 0 such that B_ε(q) ⊂ U,

so q cannot be a limit of a sequence of points in X \ U.


We state a couple of straightforward propositions, whose proofs are left to the reader.
Proposition 2.1. If Uα is a family of open sets in X, then ∪α Uα is open. If Kα is a
family of closed subsets of X, then ∩α Kα is closed.
Given S ⊂ X, we denote by \overline{S} (the closure of S) the smallest closed subset of X
containing S, i.e., the intersection of all the closed sets K_α ⊂ X containing S. The
following result is straightforward.
Proposition 2.2. Given S ⊂ X, p ∈ \overline{S} if and only if there exist x_j ∈ S such that x_j → p.
Given S ⊂ X, p ∈ X, we say p is an accumulation point of S if and only if, for each
ε > 0, there exists q ∈ S ∩ B_ε(p), q ≠ p. It follows that p is an accumulation point of S if
and only if each B_ε(p), ε > 0, contains infinitely many points of S. One straightforward
observation is that all points of \overline{S} \ S are accumulation points of S.
If S ⊂ Y ⊂ X, we say S is dense in Y provided \overline{S} ⊃ Y.
The interior of a set S ⊂ X is the largest open set contained in S, i.e., the union of all
the open sets contained in S. Note that the complement of the interior of S is equal to
the closure of X \ S.
We next define the notion of a connected space. A metric space X is said to be connected
provided that it cannot be written as the union of two disjoint nonempty open subsets.
The following is a basic example. Here, we treat I as a stand-alone metric space.
Proposition 2.3. Each interval I in R is connected.
Proof. Suppose A ⊂ I is nonempty, with nonempty complement B ⊂ I, and both sets are
open. (Hence both sets are closed.) Take a ∈ A, b ∈ B; we can assume a < b. (Otherwise,
switch A and B.) Let ξ = sup{x ∈ [a, b] : x ∈ A}. This exists, by Proposition 6.11 of
Chapter 1.


Now we obtain a contradiction, as follows. Since A is closed, ξ ∈ A. (Hence ξ < b.) But
then, since A is open, ξ > a, and furthermore there must be a neighborhood (ξ − ε, ξ + ε)
contained in A. This would imply ξ ≥ ξ + ε. Contradiction.
See the next chapter for more on connectedness, and its connection to the Intermediate
Value Theorem.

Construction of the completion of (X, d)

As indicated earlier in this section, if (X, d) is a metric space, we can construct its
completion (\hat{X}, \hat{d}). This construction can be compared to that done to pass from Q to R
in §6 of Chapter 1. Elements of \hat{X} consist of equivalence classes of Cauchy sequences in
X, with equivalence relation given by (2.4). To verify that (2.4) defines an equivalence
relation, we need to show that the relation specified there is reflexive, symmetric, and
transitive. The first two properties are completely straightforward. As for the third, we
need to show that

(2.9) (x_ν) ∼ (x'_ν), (x'_ν) ∼ (x''_ν) =⇒ (x_ν) ∼ (x''_ν).

In fact, the triangle inequality for d gives

(2.10) d(x_ν, x''_ν) ≤ d(x_ν, x'_ν) + d(x'_ν, x''_ν),

from which (2.9) readily follows. We write the equivalence class containing (x_ν) as [x_ν].
Given ξ = [x_ν] and η = [y_ν], we propose to define \hat{d}(ξ, η) by

(2.11) \hat{d}(ξ, η) = \lim_{ν→∞} d(x_ν, y_ν).

To obtain a well defined \hat{d} : \hat{X} × \hat{X} → [0, ∞), we need to verify that the limit on the
right side of (2.11) exists whenever (x_ν) and (y_ν) are Cauchy in X, and that the limit is
unchanged if (x_ν) and (y_ν) are replaced by (x'_ν) ∼ (x_ν) and (y'_ν) ∼ (y_ν). First, we show
that d_ν = d(x_ν, y_ν) is a Cauchy sequence in R. The triangle inequality for d gives

d_ν = d(x_ν, y_ν) ≤ d(x_ν, x_µ) + d(x_µ, y_µ) + d(y_µ, y_ν),

hence
d_ν − d_µ ≤ d(x_ν, x_µ) + d(y_µ, y_ν),
and the same upper estimate applies to d_µ − d_ν, hence to |d_ν − d_µ|. Thus the limit on the
right side of (2.11) exists. Next, with d'_ν = d(x'_ν, y'_ν), we have

d'_ν = d(x'_ν, y'_ν) ≤ d(x'_ν, x_ν) + d(x_ν, y_ν) + d(y_ν, y'_ν),

hence
d'_ν − d_ν ≤ d(x'_ν, x_ν) + d(y_ν, y'_ν),


and the same upper estimate applies to d_ν − d'_ν, hence to |d'_ν − d_ν|.
These observations establish that \hat{d} : \hat{X} × \hat{X} → [0, ∞) is well defined. We next need to
show that it makes \hat{X} a metric space. First,

(2.12) \hat{d}(ξ, η) = 0 ⇒ \lim_{ν→∞} d(x_ν, y_ν) = 0 ⇒ (x_ν) ∼ (y_ν) ⇒ ξ = η.

Next, the symmetry \hat{d}(ξ, η) = \hat{d}(η, ξ) follows from (2.11) and the symmetry of d. Finally,
if also ζ = [z_ν] ∈ \hat{X}, then

\hat{d}(ξ, ζ) = \lim_ν d(x_ν, z_ν)
(2.13) ≤ \lim_ν [d(x_ν, y_ν) + d(y_ν, z_ν)]
= \hat{d}(ξ, η) + \hat{d}(η, ζ),

so \hat{d} satisfies the triangle inequality.
To proceed, we have a natural map

(2.14) j : X −→ \hat{X}, j(x) = (x, x, x, . . . ).

It is clear that for each x, y ∈ X,

(2.15) \hat{d}(j(x), j(y)) = d(x, y).

From here on, we will simply identify a point x ∈ X with its image j(x) ∈ \hat{X}, using the
notation x ∈ \hat{X} (so X ⊂ \hat{X}). It is useful to observe that if (x_k) is a Cauchy sequence in
X, then

(2.16) ξ = [x_k] =⇒ \lim_{k→∞} \hat{d}(ξ, x_k) = 0.

In fact,

(2.17) \hat{d}(ξ, x_k) = \lim_{ν→∞} d(x_ν, x_k) → 0 as k → ∞.

From here we have the following.

Lemma 2.4. The set X is dense in \hat{X}.
Proof. Given ξ ∈ \hat{X}, say ξ = [x_ν], the fact that x_ν → ξ in (\hat{X}, \hat{d}) follows from (2.16).

We are now ready for the following analogue of Theorem 6.8 of Chapter 1.


Proposition 2.5. The metric space (\hat{X}, \hat{d}) is complete.
Proof. Assume (ξ_k) is Cauchy in (\hat{X}, \hat{d}). By Lemma 2.4, we can pick x_k ∈ X such that
\hat{d}(ξ_k, x_k) ≤ 2^{-k}. We claim (x_k) is Cauchy in X. In fact,

d(x_k, x_ℓ) = \hat{d}(x_k, x_ℓ)
(2.18) ≤ \hat{d}(x_k, ξ_k) + \hat{d}(ξ_k, ξ_ℓ) + \hat{d}(ξ_ℓ, x_ℓ)
≤ \hat{d}(ξ_k, ξ_ℓ) + 2^{-k} + 2^{-ℓ},

so

(2.19) d(x_k, x_ℓ) −→ 0 as k, ℓ → ∞.

Since (x_k) is Cauchy in X, it defines an element ξ = [x_k] ∈ \hat{X}. We claim ξ_k → ξ. In fact,

\hat{d}(ξ_k, ξ) ≤ \hat{d}(ξ_k, x_k) + \hat{d}(x_k, ξ)
(2.20) ≤ \hat{d}(x_k, ξ) + 2^{-k},

and the fact that \hat{d}(x_k, ξ) → 0 as k → ∞ follows from (2.17). This completes the proof of
Proposition 2.5.

Exercises

1. Prove Proposition 2.1.

2. Prove Proposition 2.2.

3. Suppose the metric space (X, d) is complete, and (\hat{X}, \hat{d}) is constructed as indicated
in (2.4)–(2.5), and described in detail in (2.9)–(2.17). Show that the natural inclusion
j : X → \hat{X} is both one-to-one and onto.

4. Show that if p ∈ R^n and R > 0, the ball B_R(p) = {x ∈ R^n : |x − p| < R} is connected.
Hint. Suppose B_R(p) = U ∪ V, a union of two disjoint open sets. Given q_1 ∈ U, q_2 ∈ V,
consider the line segment

ℓ = {tq_1 + (1 − t)q_2 : 0 ≤ t ≤ 1}.

5. Let X = R^n, but replace the distance d(x, y) = \sqrt{(x_1 − y_1)^2 + · · · + (x_n − y_n)^2} by

d_1(x, y) = |x_1 − y_1| + · · · + |x_n − y_n|.


Show that (X, d1 ) is a metric space. In particular, verify the triangle inequality. Show
that a sequence pj converges in (X, d1 ) if and only if it converges in (X, d).

6. Show that if U is an open subset of (X, d), then U is a union of open balls.

7. Let S ⊂ X be a dense subset. Let


B = {Br (p) : p ∈ S, r ∈ Q+ },
with Br (p) defined as in (2.6). Show that if U is an open subset of X, then U is a union
of balls in B. That is, if q ∈ U , there exists B ∈ B such that q ∈ B ⊂ U .

Given a nonempty metric space (X, d), we say it is perfect if it is complete and has no
isolated points. Exercises 8–10 deal with perfect metric spaces.

8. Show that if p ∈ X and ε > 0, then Bε (p) contains infinitely many points.

9. Pick distinct p_0, p_1 ∈ X, and take positive r_0 < (1/2)d(p_0, p_1). Show that

X_0 = \overline{B_{r_0}(p_0)} and X_1 = \overline{B_{r_0}(p_1)}

are disjoint perfect subsets of X (i.e., are each perfect metric spaces).

10. Similarly, take distinct p_{00}, p_{01} ∈ X_0 and distinct p_{10}, p_{11} ∈ X_1, and sufficiently small
r_1 > 0 such that

X_{jk} = \overline{B_{r_1}(p_{jk})} for k = 0, 1 are disjoint perfect subsets of X_j.

Continue in this fashion, producing X_{j_1···j_{k+1}} ⊂ X_{j_1···j_k}, closed balls of radius r_k ↘ 0,
centered at p_{j_1···j_{k+1}}. Show that you can define a function

ϕ : \prod_{ℓ=1}^{∞} {0, 1} → X, ϕ((j_1, j_2, j_3, . . . )) = \lim_{k→∞} p_{j_1 j_2···j_k}.

Show that ϕ is one-to-one, and deduce that
Card(X) ≥ Card(R).

A metric space X is said to be separable if it has a countable dense subset.

11. Let X be a separable metric space, with a dense subset S = {p_j : j ∈ N}. Produce a
function

ψ : X −→ \prod_{ℓ=1}^{∞} N

as follows. Given x ∈ X, choose a sequence (p_{j_ν}) of points in S such that p_{j_ν} → x. Set
ψ(x) = (j_1, j_2, j_3, . . . ).
Show that ψ is one-to-one, and deduce that
Card(X) ≤ Card(R).


3. Compactness

We return to the notion of compactness, defined in the Euclidean context in (1.27). We
say a (nonempty) metric space X is compact provided the following property holds:

(A) Each sequence (x_k) in X has a convergent subsequence.

We will establish various properties of compact metric spaces, and provide various
equivalent characterizations. For example, it is easily seen that (A) is equivalent to:

(B) Each infinite subset S ⊂ X has an accumulation point.

The following property is known as total boundedness:


Proposition 3.1. If X is a compact metric space, then

(C) Given ε > 0, ∃ finite set {x1 , . . . , xN } such that Bε (x1 ), . . . , Bε (xN ) covers X.

Proof. Take ε > 0 and pick x_1 ∈ X. If B_ε(x_1) = X, we are done. If not, pick x_2 ∈
X \ B_ε(x_1). If B_ε(x_1) ∪ B_ε(x_2) = X, we are done. If not, pick x_3 ∈ X \ [B_ε(x_1) ∪ B_ε(x_2)].
Continue, taking x_{k+1} ∈ X \ [B_ε(x_1) ∪ · · · ∪ B_ε(x_k)], if B_ε(x_1) ∪ · · · ∪ B_ε(x_k) ≠ X. Note
that, for 1 ≤ i, j ≤ k,
i ≠ j =⇒ d(x_i, x_j) ≥ ε.
If one never covers X this way, consider S = {x_j : j ∈ N}. This is an infinite set with no
accumulation point, so property (B) is contradicted.
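
The selection procedure in this proof can be imitated on a finite sample of points. In the Python sketch below, a grid in the unit square stands in for X (an assumption made only so that the loop can run), and the names greedy_epsilon_net and euclidean are illustrative. On a finite set the process terminates trivially; in the proof it is property (B) that forces termination.

```python
import math


def greedy_epsilon_net(points, eps, dist):
    """Pick centers one at a time, each outside the eps-balls about the earlier ones."""
    centers = []
    for p in points:
        # Keep p only if it lies outside every ball B_eps(c) chosen so far.
        if all(dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Finite stand-in for X: an 11 x 11 grid in the unit square.
grid = [(i / 10.0, j / 10.0) for i in range(11) for j in range(11)]
centers = greedy_epsilon_net(grid, 0.35, euclidean)
print(len(centers), "centers suffice for", len(grid), "points")
```

By construction the chosen centers are pairwise at distance at least ε, and every sample point lies in the open ε-ball about some center, mirroring the two facts used in the proof.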
Corollary 3.2. If X is a compact metric space, it has a countable dense subset.
Proof. Given ε = 2^{-n}, let S_n be a finite set of points x_j such that {B_ε(x_j)} covers X.
Then C = ∪_n S_n is a countable dense subset of X.
Here is another useful property of compact metric spaces, which will eventually be
generalized even further, in (E) below.
Proposition 3.3. Let X be a compact metric space. Assume K_1 ⊃ K_2 ⊃ K_3 ⊃ · · · form
a decreasing sequence of closed subsets of X. If each K_n ≠ ∅, then ∩_n K_n ≠ ∅.
Proof. Pick x_n ∈ K_n. If (A) holds, (x_n) has a convergent subsequence, x_{n_k} → y. Since
{x_{n_k} : k ≥ ℓ} ⊂ K_{n_ℓ}, which is closed, we have y ∈ ∩_n K_n.
Corollary 3.4. Let X be a compact metric space. Assume U1 ⊂ U2 ⊂ U3 ⊂ · · · form an
increasing sequence of open subsets of X. If ∪n Un = X, then UN = X for some N .
Proof. Consider Kn = X \ Un .
The following is an important extension of Corollary 3.4. Note how this generalizes
Proposition 1.8.


Proposition 3.5. If X is a compact metric space, then it has the property:

(D) Every open cover {Uα : α ∈ A} of X has a finite subcover.

Proof. Let C = {z_j : j ∈ N} ⊂ X be a countable dense subset of X, as in Corollary 3.2.
Given p ∈ U_α, there exist z_j ∈ C and a rational r_j > 0 such that p ∈ B_{r_j}(z_j) ⊂ U_α. Hence
each U_α is a union of balls B_{r_j}(z_j), with z_j ∈ C ∩ U_α, r_j rational. Thus it suffices to show
that

(D′) Every countable cover {B_j : j ∈ N} of X by open balls has a finite subcover.

Compare the argument used in Proposition 1.8. To prove (D′), we set

U_n = B_1 ∪ · · · ∪ B_n

and apply Corollary 3.4.
The following is a convenient alternative to property (D):

(E) If K_α ⊂ X are closed and \bigcap_α K_α = ∅, then some finite intersection is empty.

Considering Uα = X \ Kα , we see that

(D) ⇐⇒ (E).

The following result, known as the Heine-Borel theorem, completes Proposition 3.5.
Theorem 3.6. For a metric space X,

(A) ⇐⇒ (D).

Proof. By Proposition 3.5, (A) ⇒ (D). To prove the converse, it will suffice to show that
(E) ⇒ (B). So let S ⊂ X and assume S has no accumulation point. We claim:

Such S must be closed.

Indeed, if z ∈ \overline{S} and z ∉ S, then z would have to be an accumulation point. To proceed,
say S = {x_α : α ∈ A}, and set K_α = S \ {x_α}. Then each K_α has no accumulation point,
hence K_α ⊂ X is closed. Also ∩_α K_α = ∅. Hence there exists a finite set F ⊂ A such that
∩_{α∈F} K_α = ∅, if (E) holds. Hence S = ∪_{α∈F} {x_α} is finite, so indeed (E) ⇒ (B).

Remark. So far we have that for every metric space X,

(A) ⇐⇒ (B) ⇐⇒ (D) ⇐⇒ (E) =⇒ (C).

We claim that (C) implies the other conditions if X is complete. Of course, compactness
implies completeness, but (C) may hold for incomplete X, e.g., X = (0, 1) ⊂ R.


Proposition 3.7. If X is a complete metric space with property (C), then X is compact.
Proof. It suffices to show that (C) ⇒ (B) if X is a complete metric space. So let S ⊂ X
be an infinite set. Cover X by balls B_{1/2}(x_1), . . . , B_{1/2}(x_N). One of these balls contains
infinitely many points of S, and so does its closure, say X_1 = \overline{B_{1/2}(y_1)}. Now cover X by
finitely many balls of radius 1/4; their intersection with X_1 provides a cover of X_1. One
such set contains infinitely many points of S, and so does its closure X_2 = \overline{B_{1/4}(y_2)} ∩ X_1.
Continue in this fashion, obtaining
X_1 ⊃ X_2 ⊃ X_3 ⊃ · · · ⊃ X_k ⊃ X_{k+1} ⊃ · · · , X_j ⊂ \overline{B_{2^{-j}}(y_j)},
each containing infinitely many points of S. Pick z_j ∈ X_j. One sees that (z_j) forms
a Cauchy sequence. If X is complete, it has a limit, z_j → z, and z is seen to be an
accumulation point of S.

Remark. Note the similarity of this argument with the proof of the Bolzano-Weierstrass
theorem in Chapter 1.

If X_j, 1 ≤ j ≤ m, is a finite collection of metric spaces, with metrics d_j, we can define
a Cartesian product metric space

(3.1) X = \prod_{j=1}^{m} X_j, d(x, y) = d_1(x_1, y_1) + · · · + d_m(x_m, y_m).

Another choice of metric is δ(x, y) = \sqrt{d_1(x_1, y_1)^2 + · · · + d_m(x_m, y_m)^2}. The metrics d and
δ are equivalent, i.e., there exist constants C_0, C_1 ∈ (0, ∞) such that

(3.2) C_0 δ(x, y) ≤ d(x, y) ≤ C_1 δ(x, y), ∀ x, y ∈ X.

A key example is R^m, the Cartesian product of m copies of the real line R.
We describe some important classes of compact spaces.
Proposition 3.8. If X_j are compact metric spaces, 1 ≤ j ≤ m, so is X = \prod_{j=1}^{m} X_j.
Proof. If (xν ) is an infinite sequence of points in X, say xν = (x1ν , . . . , xmν ), pick a
convergent subsequence of (x1ν ) in X1 , and consider the corresponding subsequence of
(xν ), which we relabel (xν ). Using this, pick a convergent subsequence of (x2ν ) in X2 .
Continue. Having a subsequence such that xjν → yj in Xj for each j = 1, . . . , m, we then
have a convergent subsequence in X.
The following result is useful for analysis on Rn .
Proposition 3.9. If K is a closed bounded subset of Rn , then K is compact.
Proof. This has been proved in §1. There it was noted that the result follows from the
compactness of a closed bounded interval I = [a, b] in R, which in turn was proved in §9 of
Chapter 1. Here, we just note that compactness of [a, b] is also a corollary of Proposition
3.7.
We next give a slightly more sophisticated result on compactness. The following
extension of Proposition 3.8 is a special case of Tychonov’s Theorem.


Proposition 3.10. If {X_j : j ∈ Z^+} are compact metric spaces, so is X = \prod_{j=1}^{∞} X_j.
Here, we can make X a metric space by setting

(3.3) d(x, y) = \sum_{j=1}^{∞} 2^{-j} \frac{d_j(p_j(x), p_j(y))}{1 + d_j(p_j(x), p_j(y))},

where p_j : X → X_j is the projection onto the jth factor. It is easy to verify that, if x_ν ∈ X,
then x_ν → y in X, as ν → ∞, if and only if, for each j, p_j(x_ν) → p_j(y) in X_j.
Proof. Following the argument in Proposition 3.8, if (x_ν) is an infinite sequence of points
in X, we obtain a nested family of subsequences

(3.4) (x_ν) ⊃ (x^1_ν) ⊃ (x^2_ν) ⊃ · · · ⊃ (x^j_ν) ⊃ · · ·

such that p_ℓ(x^j_ν) converges in X_ℓ, for 1 ≤ ℓ ≤ j. The next step is a diagonal construction.
We set

(3.5) ξ_ν = x^ν_ν ∈ X.

Then, for each j, after throwing away a finite number N(j) of elements, one obtains from
(ξ_ν) a subsequence of the sequence (x^j_ν) in (3.4), so p_ℓ(ξ_ν) converges in X_ℓ for all ℓ. Hence
(ξ_ν) is a convergent subsequence of (x_ν).
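
The metric (3.3) can be explored numerically under some simplifying assumptions. In the Python sketch below, every factor X_j is taken to be [0, 1] with d_j(s, t) = |s − t|, a point of X is represented by a function j ↦ coordinate, and the series is truncated after a fixed number of terms; since the jth summand is at most 2^{-j}, the truncation error is at most 2^{-terms}. The names product_metric, zero and tail_of_ones are illustrative only.

```python
def product_metric(x, y, terms=50):
    """Partial sum of (3.3) with d_j(s, t) = |s - t|; x and y map j to the jth coordinate."""
    total = 0.0
    for j in range(1, terms + 1):
        dj = abs(x(j) - y(j))
        total += 2.0 ** (-j) * dj / (1.0 + dj)
    return total


def zero(j):
    """The point all of whose coordinates are 0."""
    return 0.0


def tail_of_ones(nu):
    """The point with jth coordinate 1 for j >= nu and 0 for j < nu."""
    return lambda j: 1.0 if j >= nu else 0.0


# Each fixed coordinate of tail_of_ones(nu) is eventually 0 as nu grows,
# although some coordinate always differs from 0 by exactly 1.
distances = [product_metric(tail_of_ones(nu), zero) for nu in range(1, 11)]
print(distances)
```

The printed distances decrease roughly like 2^{-ν}, in line with the statement that convergence in (X, d) amounts to convergence of each coordinate, not to uniform closeness of all coordinates at once.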

Exercises

1. Let ϕ : [0, ∞) → [0, ∞) have the following properties:

ϕ(0) = 0, ϕ(s) < ϕ(s + t) ≤ ϕ(s) + ϕ(t), for s ≥ 0, t > 0.

Prove that if d(x, y) is symmetric and satisfies the triangle inequality, so does

δ(x, y) = ϕ(d(x, y)).

2. Show that the function d(x, y) defined by (3.3) satisfies (2.1).
Hint. Consider ϕ(r) = r/(1 + r).

3. In the setting of (3.1), let

δ(x, y) = {d_1(x_1, y_1)^2 + · · · + d_m(x_m, y_m)^2}^{1/2}.

Show that
δ(x, y) ≤ d(x, y) ≤ \sqrt{m} δ(x, y).


4. Let X be a metric space, p ∈ X, and let K ⊂ X be compact. Show that there exist
x0 , x1 ∈ K such that
d(x0 , p) ≤ d(x, p), ∀ x ∈ K,
d(x1 , p) ≥ d(x, p), ∀ x ∈ K.
Show that there exist y0 , y1 ∈ K such that

d(q0 , q1 ) ≤ d(y0 , y1 ), ∀ q0 , q1 ∈ K.

We say diam K = d(y0 , y1 ).

5. Let X be a metric space that satisfies the total boundedness condition (C), and let \hat{X}
be its completion. Show that \hat{X} is compact.
Hint. Show that \hat{X} also satisfies condition (C).

6. Deduce from Exercises 10 and 11 of §2 that if X is a compact metric space with no
isolated points, then Card(X) = Card(R). Note how this generalizes the result on Cantor
sets in Exercise 6, §9, of Chapter 1.

In Exercises 7–9, X is an uncountable compact metric space (so, by Exercise 11 of §2,
Card X ≤ Card R).

7. Define K ⊂ X as follows:

x ∈ K ⇐⇒ Bε (x) is uncountable, ∀ ε > 0.

Show that
(a) K ≠ ∅.
Hint. Cover X with \overline{B_1(p_j)}, 1 ≤ j ≤ N_0. At least one is uncountable; call it X_0. Cover
X_0 with X_0 ∩ \overline{B_{1/2}(p_j)}, 1 ≤ j ≤ N_1, p_j ∈ X_0. At least one is uncountable; call it X_1.
Continue, obtaining uncountable compact sets X_0 ⊃ X_1 ⊃ · · · , with diam X_j ≤ 2^{1−j}.
Show that ∩_j X_j = {x} with x ∈ K.

8. In the setting of Exercise 7, show that
(b) K is closed (hence compact), and
(c) K has no isolated points.
Hint for (c). Given x ∈ K, show that, for each ε > 0, there exists δ ∈ (0, ε) such that
\overline{B_ε(x)} \ B_δ(x) is uncountable. Apply Exercise 7 to this compact metric space.

9. Deduce from Exercises 6–8 that Card K = Card R. Hence conclude that

Card X = Card R.


A. The Baire category theorem

If X is a metric space, a subset U ⊂ X is said to be dense if U = X, and a subset


S ⊂ X is said to be nowhere dense if S contains no nonempty open set. Consequently, S
is nowhere dense if and only if X \ S is dense. Also, a set U ⊂ X is dense in X if and only
if U intersects each nonempty open subset of X.
Our main goal here is to prove the following.
Theorem A.1. A complete metric space X cannot be written as a countable union of
nowhere dense subsets.
Proof. Let S_k ⊂ X be nowhere dense, k ∈ N. Set

(A.1) T_k = \bigcup_{j=1}^{k} \overline{S_j},

so Tk are closed, nowhere dense, and increasing. Consider

(A.2) Uk = X \ Tk ,

which are open, dense, and decreasing. Clearly

(A.3) \bigcup_k S_k = X ⟹ \bigcap_k U_k = ∅,

so to prove the theorem, we show that there exists p ∈ ∩k Uk .


To do this, pick p_1 ∈ U_1 and ε_1 > 0 such that \overline{B_{ε_1}(p_1)} ⊂ U_1. Since U_2 is dense in X,
we can then pick p_2 ∈ B_{ε_1}(p_1) ∩ U_2 and ε_2 ∈ (0, ε_1/2) such that

(A.4) \overline{B_{ε_2}(p_2)} ⊂ B_{ε_1}(p_1) ∩ U_2.

Continue, producing p_k ∈ B_{ε_{k−1}}(p_{k−1}) ∩ U_k and ε_k ↘ 0 such that

(A.5) \overline{B_{ε_k}(p_k)} ⊂ B_{ε_{k−1}}(p_{k−1}) ∩ U_k,

which is possible at each stage because U_k is dense in X, and hence intersects each
nonempty open set. Note that

(A.6) p_ℓ ∈ \overline{B_{ε_ℓ}(p_ℓ)} ⊂ \overline{B_{ε_k}(p_k)}, ∀ ℓ > k.

It follows that

(A.7) d(p_ℓ, p_k) ≤ ε_k, ∀ ℓ > k,


so (p_k) is Cauchy. Since X is complete, this sequence has a limit p ∈ X. Since each
\overline{B_{ε_k}(p_k)} is closed, (A.6) implies

(A.8) p ∈ \overline{B_{ε_k}(p_k)} ⊂ U_k, ∀ k.

This finishes the proof.


Theorem A.1 is called the Baire category theorem. The terminology arises as follows.
We say a subset Y ⊂ X is of first category provided Y is a countable union of nowhere
dense sets. If Y is not a set of first category, we say it is of second category. Theorem A.1
says that if X is a complete metric space, then X is of second category.


Chapter III
Functions

Introduction

The playing fields for analysis are spaces, and the players themselves are functions. In
this chapter we develop some frameworks for understanding the behavior of various classes
of functions. We spend about half the chapter studying functions f : X → Y from one
metric space (X) to another (Y), and about half specializing to the case Y = R^n.
Our emphasis is on continuous functions, and §1 presents a number of results on con-
tinuous functions f : X → Y , which by definition have the property
xν → x =⇒ f (xν ) → f (x).
We devote particular attention to the behavior of continuous functions on compact sets.
We bring in the notion of uniform continuity, a priori stronger than continuity, and show
that f continuous on X ⇒ f uniformly continuous on X, provided X is compact. We also
introduce the notion of connectedness, and extend the intermediate value theorem given
in §9 of Chapter 1 to the setting where X is a connected metric space, and f : X → R is
continuous.
In §2 we consider sequences and series of functions, starting with sequences (fj ) of
functions fj : X → Y . We study convergence and uniform convergence. We move to
infinite series

\sum_{j=1}^{∞} f_j(x),

in case Y = R^n, and discuss conditions on f_j yielding convergence, absolute convergence,
and uniform convergence. Section 3 introduces a special class of infinite series, power
series,

\sum_{k=0}^{∞} a_k z^k.
Here we take ak ∈ C and z ∈ C, and consider conditions yielding convergence on a disk
DR = {z ∈ C : |z| < R}. This section is a prelude to a deeper study of power series, as it
relates to calculus, in Chapter 4.
In §4 we study spaces of functions, including C(X, Y ), the set of continuous functions
f : X → Y . Under certain hypotheses (e.g., if either X or Y is compact) we can take
D(f, g) = sup_{x∈X} d_Y(f(x), g(x)),
as a distance function, making C(X, Y ) a metric space. We investigate conditions under
which this metric space can be shown to be complete. We also investigate conditions
under which certain subsets of C(X, Y ) can be shown to be compact. Unlike §§1–3, this
section will not have much impact on Chapters 4–5, but we include it to indicate further
interesting directions that analysis does take.


1. Continuous functions

Let X and Y be metric spaces, with distance functions d_X and d_Y, respectively. A
function f : X → Y is said to be continuous at a point x ∈ X if and only if

(1.1) x_ν → x in X ⟹ f(x_ν) → f(x) in Y,

or, equivalently, for each ε > 0, there exists δ > 0 such that

(1.1A) d_X(x, x′) < δ ⟹ d_Y(f(x), f(x′)) < ε,

that is to say,

(1.1B) B_δ(x) ⊂ f^{−1}(B_ε(f(x))),

where the balls Bε (y) and Bδ (x) are defined as in (2.6) of Chapter 2. Here we use the
notation
f^{−1}(S) = {x ∈ X : f(x) ∈ S},
given S ⊂ Y .
We say f is continuous on X if it is continuous at each point of X. Here is an equivalent
condition.
Proposition 1.1. Given f : X → Y , f is continuous on X if and only if

(1.1C) U open in Y ⟹ f^{−1}(U) open in X.

Proof. First, assume f is continuous. Let U ⊂ Y be open, and assume x ∈ f^{−1}(U), so
f(x) = y ∈ U. Given that U is open, pick ε > 0 such that B_ε(y) ⊂ U. Continuity of f at
x forces the image of B_δ(x) to lie in the ball B_ε(y) about y, if δ is small enough, hence to
lie in U. Thus B_δ(x) ⊂ f^{−1}(U) for δ small enough, so f^{−1}(U) must be open.
Conversely, assume (1.1C) holds. If x ∈ X, and f(x) = y, then for all ε > 0, f^{−1}(B_ε(y))
must be an open set containing x, so f^{−1}(B_ε(y)) contains B_δ(x) for some δ > 0. Hence f
is continuous at x.
We record the following important link between continuity and compactness. This
extends Proposition 9.4 of Chapter 1.
Proposition 1.2. If X and Y are metric spaces, f : X → Y continuous, and K ⊂ X
compact, then f (K) is a compact subset of Y.
Proof. If (yν ) is an infinite sequence of points in f (K), pick xν ∈ K such that f (xν ) = yν .
If K is compact, we have a subsequence xνj → p in X, and then yνj → f (p) in Y.
If f : X → R is continuous, we say f ∈ C(X). A useful corollary of Proposition 1.2 is:


Proposition 1.3. If X is a compact metric space and f ∈ C(X), then f assumes a
maximum and a minimum value on X.
Proof. We know from Proposition 1.2 that f(X) is a compact subset of R, hence bounded.
Proposition 6.1 of Chapter 1 implies f(X) ⊂ R has a sup and an inf, and, as noted in (9.7)
of Chapter 1, these numbers are in f(X). That is, we have

(1.2) b = max f(X), a = min f(X).

Hence a = f (x0 ) for some x0 ∈ X, and b = f (x1 ) for some x1 ∈ X.


For later use, we mention that if X is a nonempty set and f : X → R is bounded from
above, disregarding any notion of continuity, we set

(1.3) sup_{x∈X} f(x) = sup f(X),

and if f : X → R is bounded from below, we set

(1.4) inf_{x∈X} f(x) = inf f(X).

If f is not bounded from above, we set sup f = +∞, and if f is not bounded from below,
we set inf f = −∞.
Given a set X, f : X → R, and xn ∈ X, we set
(1.5) lim sup_{n→∞} f(x_n) = lim_{n→∞} ( sup_{k≥n} f(x_k) ),

and

(1.6) lim inf_{n→∞} f(x_n) = lim_{n→∞} ( inf_{k≥n} f(x_k) ).
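
As a rough numerical illustration of (1.5) and (1.6), the following Python sketch computes the tail suprema and tail infima of a finite stretch of a sequence. The sequence a_n = (−1)^n (1 + 1/n) and the helper names `tail_sups` and `tail_infs` are assumptions made for this example, not part of the text. A finite window can only suggest the limit, and the values near the end of the window are distorted by the truncation.

```python
def tail_sups(values):
    """Return [max(values[n:]) for each n], computed in one right-to-left pass."""
    out, running = [], float("-inf")
    for v in reversed(values):
        running = max(running, v)
        out.append(running)
    return out[::-1]


def tail_infs(values):
    """Return [min(values[n:]) for each n], computed in one right-to-left pass."""
    out, running = [], float("inf")
    for v in reversed(values):
        running = min(running, v)
        out.append(running)
    return out[::-1]


# Illustrative sequence (an assumption, not from the text): a_n = (-1)^n (1 + 1/n).
N = 2000
a = [(-1) ** n * (1 + 1 / n) for n in range(1, N + 1)]
sups, infs = tail_sups(a), tail_infs(a)

# Inspect the middle of the window, away from the truncation at N.
print(sups[N // 2], infs[N // 2])
```

The tail suprema decrease toward 1 and the tail infima increase toward −1, consistent with lim sup a_n = 1 and lim inf a_n = −1 for this sequence.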

We return to the notion of continuity. A function f ∈ C(X) is said to be uniformly


continuous provided that, for any ε > 0, there exists δ > 0 such that

(1.7) x, y ∈ X, d(x, y) ≤ δ =⇒ |f (x) − f (y)| ≤ ε.

More generally, if Y is a metric space with distance function dY , a function f : X → Y is


said to be uniformly continuous provided that, for any ε > 0, there exists δ > 0 such that

(1.8) x, y ∈ X, dX (x, y) ≤ δ =⇒ dY (f (x), f (y)) ≤ ε.

An equivalent condition is that f have a modulus of continuity, i.e., a monotonic function
ω : [0, 1) → [0, ∞) such that δ ↘ 0 ⇒ ω(δ) ↘ 0, and such that

(1.9) x, y ∈ X, d_X(x, y) ≤ δ ≤ 1 ⟹ d_Y(f(x), f(y)) ≤ ω(δ).

Not all continuous functions are uniformly continuous. For example, if X = (0, 1) ⊂ R,
then f(x) = sin(1/x) is continuous, but not uniformly continuous, on X. The following
result is useful, for example, in the development of the Riemann integral in Chapter 4.


Proposition 1.4. If X is a compact metric space and f : X → Y is continuous, then f
is uniformly continuous.
Proof. If not, there exist ε > 0 and x_ν, y_ν ∈ X such that d_X(x_ν, y_ν) ≤ 2^{−ν} but

(1.10) d_Y(f(x_ν), f(y_ν)) ≥ ε.

Taking a convergent subsequence xνj → p, we also have yνj → p. Now continuity of f at


p implies f (xνj ) → f (p) and f (yνj ) → f (p), contradicting (1.10).
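
The compactness hypothesis in Proposition 1.4 cannot simply be dropped: on the non-compact interval (0, 1), the function sin(1/x) mentioned before the proposition is continuous but not uniformly continuous. The Python sketch below is only a numerical illustration; the helper `witness_pair` and the particular points chosen are assumptions made for this example. It exhibits pairs of points that get arbitrarily close while the function values stay about 2 apart, so no single δ works for ε = 1.

```python
import math


def f(x):
    return math.sin(1.0 / x)


def witness_pair(n):
    """Two points of (0, 1) at which sin(1/x) equals +1 and -1."""
    x = 1.0 / (2 * math.pi * n + math.pi / 2)
    y = 1.0 / (2 * math.pi * n + 3 * math.pi / 2)
    return x, y


rows = []
for n in (1, 10, 100, 1000):
    x, y = witness_pair(n)
    rows.append((abs(x - y), abs(f(x) - f(y))))

for distance, jump in rows:
    print(f"|x - y| = {distance:.3e}   |f(x) - f(y)| = {jump:.6f}")
```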
If X and Y are metric spaces and f : X → Y is continuous, one-to-one, and onto, and
if its inverse g = f −1 : Y → X is continuous, we say f is a homeomorphism. Here is a
useful sufficient condition for producing homeomorphisms.
Proposition 1.5. Let X be a compact metric space. Assume f : X → Y is continuous,
one-to-one, and onto. Then its inverse g : Y → X is continuous.
Proof. If K ⊂ X is closed, then K is compact, so by Proposition 1.2, f(K) ⊂ Y is
compact, hence closed. Now if U ⊂ X is open, with complement K = X \ U, we see that
f(U) = Y \ f(K), so U open ⇒ f(U) open. Since g^{−1}(U) = f(U), this says

U ⊂ X open ⟹ g^{−1}(U) open.

Hence, by Proposition 1.1, g is continuous.


We next define the notion of a connected space. A metric space X is said to be connected
provided that it cannot be written as the union of two disjoint nonempty open subsets.
The following is a basic class of examples.
Proposition 1.6. Each interval I in R is connected.
Proof. This is Proposition 2.3 of Chapter 2.
We say X is path-connected if, given any p, q ∈ X, there is a continuous map γ : [0, 1] →
X such that γ(0) = p and γ(1) = q. The following is an easy consequence of Proposition
1.6.
Proposition 1.7. Every path-connected metric space X is connected.
Proof. If X = U ∪ V with U and V open, disjoint, and both nonempty, take p ∈ U, q ∈ V,
and let γ : [0, 1] → X be a continuous path from p to q. Then

[0, 1] = γ^{−1}(U) ∪ γ^{−1}(V)

would be a disjoint union of nonempty open sets, which by Proposition 1.6 cannot happen.
The next result is known as the Intermediate Value Theorem. Note that it generalizes
Proposition 9.6 of Chapter 1.


Proposition 1.8. Let X be a connected metric space and f : X → R continuous. Assume


p, q ∈ X, and f (p) = a < f (q) = b. Then, given any c ∈ (a, b), there exists z ∈ X such
that f (z) = c.
Proof. Under the hypotheses, A = {x ∈ X : f (x) < c} is open and contains p, while
B = {x ∈ X : f (x) > c} is open and contains q. If X is connected, then A ∪ B cannot be
all of X; so any point in its complement has the desired property.
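
The proof of Proposition 1.8 is not constructive. In the special case where X is an interval [p, q] ⊂ R with p < q, a point z can be approximated by bisection. The Python sketch below is one way to do this under those assumptions; the function name `bisect_level`, the tolerance, and the sample function x^3 are hypothetical choices for illustration.

```python
def bisect_level(f, p, q, c, tol=1e-12):
    """Approximate z in [p, q] with f(z) = c, assuming p < q, f continuous,
    and f(p) < c < f(q)."""
    lo, hi = p, q
    if not (f(lo) < c < f(hi)):
        raise ValueError("need f(p) < c < f(q)")
    # Invariant: f(lo) < c <= f(hi), so a solution lies in [lo, hi].
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < c:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Example: solve x**3 = 5 on [1, 2], i.e. approximate the cube root of 5.
z = bisect_level(lambda x: x ** 3, 1.0, 2.0, 5.0)
print(z)
```

Each step halves the interval while preserving f(lo) < c ≤ f(hi), so continuity of f forces f(z) = c at the common limit of the endpoints.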

Exercises

1. If X is a metric space, with distance function d, show that

|d(x, y) − d(x′, y′)| ≤ d(x, x′) + d(y, y′),

and hence
d : X × X −→ [0, ∞) is continuous.

2. Let p_n(x) = x^n. Take b > a > 0, and consider

p_n : [a, b] −→ [a^n, b^n].

Use the intermediate value theorem to show that p_n is onto.

3. In the setting of Exercise 2, show that p_n is one-to-one, so it has an inverse

q_n : [a^n, b^n] −→ [a, b].

Use Proposition 1.5 to show that q_n is continuous. The common notation is

q_n(x) = x^{1/n}, x > 0.

Note. This strengthens Proposition 7.1 of Chapter 1.

4. Let f, g : X → C be continuous, and let h(x) = f (x)g(x). Show that h : X → C is


continuous.

5. Define p_n : C → C by p_n(z) = z^n. Show that p_n is continuous for each n ∈ N.


Hint. Start at n = 1, and use Exercise 4 to produce an inductive proof.

6. Let X, Y, Z be metric spaces. Assume f : X → Y and g : Y → Z are continuous. Define


g ◦ f : X → Z by g ◦ f (x) = g(f (x)). Show that g ◦ f is continuous.


7. Let fj : X → Yj be continuous, for j = 1, 2. Define g : X → Y1 × Y2 by g(x) =


(f1 (x), f2 (x)). Show that g is continuous.

We present some exercises that deal with functions that are semicontinuous. Given a metric
space X and f : X → [−∞, ∞], we say f is lower semicontinuous provided

f^{−1}((c, ∞]) ⊂ X is open, ∀ c ∈ R.

We say f is upper semicontinuous provided

f^{−1}([−∞, c)) is open, ∀ c ∈ R.

8. Show that

f is lower semicontinuous ⇐⇒ f^{−1}([−∞, c]) is closed, ∀ c ∈ R,

and
f is upper semicontinuous ⇐⇒ f^{−1}([c, ∞]) is closed, ∀ c ∈ R.

9. Show that

f is lower semicontinuous ⇐⇒ xn → x implies lim inf f (xn ) ≥ f (x).

Show that

f is upper semicontinuous ⇐⇒ xn → x implies lim sup f (xn ) ≤ f (x).

10. Given S ⊂ X, show that

χS is lower semicontinuous ⇐⇒ S is open.


χS is upper semicontinuous ⇐⇒ S is closed.

Here, χ_S(x) = 1 if x ∈ S, 0 if x ∉ S.

11. If X is a compact metric space, show that

f : X → R is lower semicontinuous =⇒ min f is achieved.


2. Sequences and series of functions

Let X and Y be metric spaces, with distance functions dX and dY , respectively. Con-
sider a sequence of functions fj : X → Y , which we denote (fj ). To say (fj ) converges at
x to f : X → Y is simply to say that fj (x) → f (x) in Y . If such convergence holds for
each x ∈ X, we say (fj ) converges to f on X, pointwise.
A stronger type of convergence is uniform convergence. We say fj → f uniformly on X
provided
(2.1) sup_{x∈X} d_Y(f_j(x), f(x)) −→ 0, as j → ∞.

An equivalent characterization is that, for each ε > 0, there exists K ∈ N such that
(2.2) j ≥ K =⇒ dY (fj (x), f (x)) ≤ ε, ∀ x ∈ X.
A significant property of uniform convergence is that passing to the limit preserves conti-
nuity.
Proposition 2.1. If fj : X → Y is continuous for each j and fj → f uniformly, then
f : X → Y is continuous.
Proof. Fix p ∈ X and take ε > 0. Pick K ∈ N such that (2.2) holds. Then pick δ > 0 such
that
(2.3) x ∈ Bδ (p) =⇒ dY (fK (x), fK (p)) < ε,
which can be done since fK : X → Y is continuous. Together, (2.2) and (2.3) imply
x ∈ Bδ (p) ⇒ dY (f (x), f (p))
(2.4) ≤ dY (f (x), fK (x)) + dY (fK (x), fK (p)) + dY (fK (p), f (p))
≤ 3ε.
Thus f is continuous at p, for each p ∈ X.
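
Proposition 2.1 also gives a quick test for the failure of uniform convergence. A standard example (not taken from the text) is f_j(x) = x^j on [0, 1]: the pointwise limit is 0 on [0, 1) and 1 at x = 1, which is discontinuous, so the convergence cannot be uniform. The Python sketch below illustrates this numerically; the helper `sup_on_grid` and the grid size are assumptions, and a grid maximum only approximates a supremum.

```python
def f(j, x):
    return x ** j


def sup_on_grid(j, right_end, samples=10001):
    """Grid approximation of the sup of |x**j| over [0, right_end]."""
    return max(f(j, right_end * i / (samples - 1)) for i in range(samples))


# Away from x = 1 the sup distance to the limit 0 decays like 0.9**j.
decay = [sup_on_grid(j, 0.9) for j in (1, 10, 50)]

# Near x = 1 it does not: x_j = 2**(-1/j) lies in [0, 1) and f_j(x_j) = 1/2.
stuck = [f(j, 0.5 ** (1.0 / j)) for j in (1, 10, 100, 1000)]

print(decay)
print(stuck)
```

So f_j → 0 uniformly on [0, 0.9], while on [0, 1) the quantity in (2.1) stays at least 1/2 for every j.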
We next consider Cauchy sequences of functions fj : X → Y . To say (fj ) is Cauchy
at x ∈ X is simply to say (fj (x)) is a Cauchy sequence in Y . We say (fj ) is uniformly
Cauchy provided
(2.5) sup_{x∈X} d_Y(f_j(x), f_k(x)) −→ 0, as j, k → ∞.

An equivalent characterization is that, for each ε > 0, there exists K ∈ N such that
(2.6) j, k ≥ K =⇒ dY (fj (x), fk (x)) ≤ ε, ∀ x ∈ X.
If Y is complete, a Cauchy sequence (fj ) will have a limit f : X → Y . We have the
following.


Proposition 2.2. Assume Y is complete, and fj : X → Y is uniformly Cauchy. Then


(fj ) converges uniformly to a limit f : X → Y .
Proof. We have already seen that there exists f : X → Y such that fj (x) → f (x) for each
x ∈ X. To finish the proof, take ε > 0, and pick K ∈ N such that (2.6) holds. Then taking
k → ∞ yields

(2.7) j ≥ K =⇒ dY (fj (x), f (x)) ≤ ε, ∀ x ∈ X,

yielding the uniform convergence.


If, in addition, each fj : X → Y is continuous, we can put Propositions 2.1 and 2.2
together. We leave this to the reader.
It is useful to note the following phenomenon in case, in addition, X is compact.
Proposition 2.3. Assume X is compact, fj : X → Y continuous, and fj → f uniformly
on X. Then
(2.8) K = f(X) ∪ \bigcup_{j≥1} f_j(X) ⊂ Y is compact.

Proof. Let (yν ) ⊂ K be an infinite sequence. If there exists j ∈ N such that yν ∈ fj (X)
for infinitely many ν, convergence of a subsequence to an element of fj (X) follows from
the known compactness of fj (X). Ditto if yν ∈ f (X) for infinitely many ν. It remains to
consider the situation y_ν ∈ f_{j_ν}(X), j_ν → ∞ (after perhaps taking a subsequence). That
is, suppose y_ν = f_{j_ν}(x_ν), x_ν ∈ X, j_ν → ∞. Passing to a further subsequence, we can
assume xν → x in X, and then it follows from the uniform convergence that

(2.9) yν −→ y = f (x) ∈ K.

We move from sequences to series. For this, we need some algebraic structure on Y .
Thus, for the rest of this section, we assume

(2.10) f_j : X −→ R^n,

for some n ∈ N. We look at the infinite series

(2.11) \sum_{k=0}^{∞} f_k(x),

and seek conditions for convergence, which is the same as convergence of the sequence of
partial sums,

(2.12) S_j(x) = \sum_{k=0}^{j} f_k(x).


Parallel to Proposition 6.12 of Chapter 1, we have convergence at x ∈ X provided

(2.13) \sum_{k=0}^{∞} |f_k(x)| < ∞,

i.e., provided there exists B_x < ∞ such that

(2.14) \sum_{k=0}^{j} |f_k(x)| ≤ B_x, ∀ j ∈ N.

In such a case, we say the series (2.11) converges absolutely at x. We say (2.11) converges
uniformly on X if and only if (S_j) converges uniformly on X. The following sufficient
condition for uniform convergence is called the Weierstrass M test.
Proposition 2.4. Assume there exist M_k such that |f_k(x)| ≤ M_k, for all x ∈ X, and

(2.15) \sum_{k=0}^{∞} M_k < ∞.

Then the series (2.11) converges uniformly on X, to a limit S : X → R^n.


Proof. This proof is also similar to that of Proposition 6.12 of Chapter 1, but we review
it. We have

(2.16) |S_{m+ℓ}(x) − S_m(x)| ≤ | \sum_{k=m+1}^{m+ℓ} f_k(x) |
                             ≤ \sum_{k=m+1}^{m+ℓ} |f_k(x)|
                             ≤ \sum_{k=m+1}^{m+ℓ} M_k.

Now (2.15) implies σ_m = \sum_{k=0}^{m} M_k is uniformly bounded, so (by Proposition 6.10 of
Chapter 1), σ_m ↗ β for some β ∈ R^+. Hence

(2.17) |S_{m+ℓ}(x) − S_m(x)| ≤ σ_{m+ℓ} − σ_m ≤ β − σ_m → 0, as m → ∞,

independent of ℓ ∈ N and x ∈ X. Thus (S_j) is uniformly Cauchy on X, and uniform
convergence follows by Proposition 2.2.
Bringing in Proposition 2.1, we have the following.


Corollary 2.5. In the setting of Proposition 2.4, if also each f_k : X → R^n is continuous,
so is the limit S.
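
To see the estimate (2.16)–(2.17) at work, consider the illustrative choice f_k(x) = sin(kx)/k^2 with M_k = 1/k^2 for k ≥ 1 (and f_0 = 0); this series is an assumption made for the example, not one discussed in the text. The Python sketch below checks on a sample of points that the difference of two partial sums is bounded by the corresponding difference σ_{m+ℓ} − σ_m, independently of x. The names `S`, `sigma`, `worst`, and `bound` are hypothetical.

```python
import math


def f(k, x):
    return math.sin(k * x) / k ** 2  # defined for k >= 1


def M(k):
    return 1.0 / k ** 2  # |f(k, x)| <= M(k) for all real x


def S(m, x):
    """Partial sum f_1(x) + ... + f_m(x)."""
    return sum(f(k, x) for k in range(1, m + 1))


def sigma(m):
    """Partial sum M_1 + ... + M_m."""
    return sum(M(k) for k in range(1, m + 1))


m, ell = 50, 200
xs = [0.1 * i for i in range(-100, 101)]  # sample points in [-10, 10]
worst = max(abs(S(m + ell, x) - S(m, x)) for x in xs)
bound = sigma(m + ell) - sigma(m)
print(worst, bound)
```

Since the bound does not depend on x and tends to 0 as m grows, the partial sums are uniformly Cauchy, which is exactly the mechanism of the Weierstrass M test.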

Exercises

1. For j ∈ N, define f_j : R → R by

f_1(x) = x / (1 + x^2), f_j(x) = f_1(jx).

Show that f_j → 0 pointwise on R.
Show that, for each ε > 0, f_j → 0 uniformly on R \ (−ε, ε).
Show that (f_j) does not converge uniformly to 0 on R.

2. For j ∈ N, define g_j : R → R by

g_1(x) = x / \sqrt{1 + x^2}, g_j(x) = g_1(jx).
Show that there exists g : R → R such that gj → g pointwise. Show that g is not continuous
on all of R. Where is g discontinuous?

3. Let X be a compact metric space. Assume f_j, f : X → R are continuous and

f_j(x) ↗ f(x), ∀ x ∈ X.

Prove that f_j → f uniformly on X. (This result is called Dini’s theorem.)
Hint. For ε > 0, let K_j(ε) = {x ∈ X : f(x) − f_j(x) ≥ ε}. Note that K_j(ε) ⊃ K_{j+1}(ε) ⊃ · · · .
What about ∩_{j≥1} K_j(ε)?

4. Take g_j as in Exercise 2 and consider

\sum_{k=1}^{∞} (1/k^2) g_k(x).

Show that this series converges uniformly on R, to a continuous limit.

5. Take f_j as in Exercise 1 and consider

\sum_{k=1}^{∞} (1/k) f_k(x).

Where does this series converge? Where does it converge uniformly? Where is the sum
continuous?
Hint. For use in the latter questions, note that, for ℓ ∈ N, ℓ ≤ k ≤ 2ℓ, we have f_k(1/ℓ) ∈
[2/5, 1/2].


3. Power series

An important class of infinite series is the class of power series

(3.1) \sum_{k=0}^{∞} a_k z^k,

with a_k ∈ C. Note that if z_1 ≠ 0 and (3.1) converges for z = z_1, then there exists C < ∞
such that
(3.2) |a_k z_1^k| ≤ C, ∀ k.
Hence, if |z| ≤ r|z_1|, r < 1, we have

(3.3) \sum_{k=0}^{∞} |a_k z^k| ≤ C \sum_{k=0}^{∞} r^k = C / (1 − r) < ∞,

the last identity being the classical geometric series computation. (Compare (10.49) in
Chapter 1.) This yields the following.
Proposition 3.1. If (3.1) converges for some z_1 ≠ 0, then either this series is absolutely
convergent for all z ∈ C, or there is some R ∈ (0, ∞) such that the series is absolutely
convergent for |z| < R and divergent for |z| > R.
We call R the radius of convergence of (3.1). In case of convergence for all z, we say
the radius of convergence is infinite. If R > 0 and (3.1) converges for |z| < R, it defines a
function

X
(3.4) f (z) = ak z k , z ∈ DR ,
k=0

on the disk of radius R centered at the origin,


(3.5) DR = {z ∈ C : |z| < R}.
Proposition 3.2. If the series (3.4) converges in DR , then it converges uniformly on DS
for all S < R, and hence f is continuous on DR , i.e., given zn , z ∈ DR ,
(3.6) zn → z =⇒ f (zn ) → f (z).

Proof. For each z ∈ DR , there exists S < R such that z ∈ DS , so it suffices to show that
f is continuous on DS whenever 0 < S < R. Pick T such that S < T < R. We know that
there exists C < ∞ such that |a_k T^k| ≤ C for all k. Hence

(3.7) z ∈ D_S ⟹ |a_k z^k| ≤ C (S/T)^k.


Since

(3.8) \sum_{k=0}^{∞} (S/T)^k < ∞,

the Weierstrass M-test, Proposition 2.4, applies, to yield uniform convergence on D_S. Since

(3.9) ∀ k, a_k z^k is continuous,

continuity of f on D_S follows from Corollary 2.5.


More generally, a power series has the form

(3.10) f(z) = \sum_{n=0}^{∞} a_n (z − z_0)^n.

It follows from Proposition 3.1 that to such a series there is associated a radius of convergence
R ∈ [0, ∞], with the property that the series converges absolutely whenever
|z − z_0| < R (if R > 0), and diverges whenever |z − z_0| > R (if R < ∞). We identify R as
follows:

(3.11) 1/R = lim sup_{n→∞} |a_n|^{1/n}.

This is established in the following result, which complements Propositions 3.1–3.2.


Proposition 3.3. The series (3.10) converges whenever |z − z_0| < R and diverges whenever
|z − z_0| > R, where R is given by (3.11). If R > 0, the series converges uniformly
on {z : |z − z_0| ≤ R′}, for each R′ < R. Thus, when R > 0, the series (3.10) defines a
continuous function

(3.12) f : D_R(z_0) −→ C,

where

(3.13) D_R(z_0) = {z ∈ C : |z − z_0| < R}.

Proof. If R′ < R, then there exists N ∈ Z^+ such that

n ≥ N ⟹ |a_n|^{1/n} < 1/R′ ⟹ |a_n| (R′)^n < 1.

Thus

(3.14) |z − z_0| < R′ < R ⟹ |a_n (z − z_0)^n| ≤ |(z − z_0)/R′|^n,


for n ≥ N, so (3.10) is dominated by a convergent geometric series in D_{R′}(z_0).
For the converse, we argue as follows. Suppose R″ > R, so infinitely many |a_n|^{1/n} ≥
1/R″, hence infinitely many |a_n| (R″)^n ≥ 1. Then

|z − z_0| ≥ R″ > R ⟹ infinitely many |a_n (z − z_0)^n| ≥ |(z − z_0)/R″|^n ≥ 1,

forcing divergence for |z − z_0| > R.
The assertions about uniform convergence and continuity follow as in Proposition 3.2.
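
Formula (3.11) can be explored numerically, with the caveat that a lim sup cannot be computed from finitely many terms. The Python sketch below takes the largest value of |a_n|^{1/n} over a window of large indices as a heuristic stand-in; the function name `root_test_window`, the window, and the three sample coefficient sequences are assumptions made for illustration.

```python
import math


def root_test_window(log_abs_coeff, n_start, n_stop):
    """Largest value of |a_n|**(1/n) for n_start <= n < n_stop.

    Takes n -> log|a_n| rather than a_n, so that very large or very small
    coefficients do not overflow or underflow.
    """
    return max(math.exp(log_abs_coeff(n) / n) for n in range(n_start, n_stop))


window = (10_000, 10_100)
geometric = root_test_window(lambda n: n * math.log(2.0), *window)    # a_n = 2^n
linear = root_test_window(lambda n: math.log(n), *window)             # a_n = n
factorial = root_test_window(lambda n: -math.lgamma(n + 1), *window)  # a_n = 1/n!

print(1 / geometric)  # suggests R = 1/2
print(1 / linear)     # suggests R = 1
print(factorial)      # close to 0, suggesting R = infinity
```

For a_n = 2^n the quantity is constant in n, so the estimate is essentially exact; for a_n = n and a_n = 1/n! the window value is only approaching its limit.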

It is useful to note that we can multiply power series with radius of convergence R > 0.
In fact, there is the following more general result on products of absolutely convergent
series.
Proposition 3.4. Given absolutely convergent series

(3.15) A = \sum_{n=0}^{∞} α_n, B = \sum_{n=0}^{∞} β_n,

we have the absolutely convergent series

(3.16) AB = \sum_{n=0}^{∞} γ_n, γ_n = \sum_{j=0}^{n} α_j β_{n−j}.

Proof. Take A_k = \sum_{n=0}^{k} α_n, B_k = \sum_{n=0}^{k} β_n. Then

(3.17) A_k B_k = \sum_{n=0}^{k} γ_n + R_k

with

(3.18) R_k = \sum_{(m,n)∈σ(k)} α_m β_n, σ(k) = {(m, n) ∈ Z^+ × Z^+ : m, n ≤ k, m + n > k}.

Hence

(3.19) |R_k| ≤ \sum_{m≤k/2} \sum_{k/2≤n≤k} |α_m| |β_n| + \sum_{k/2≤m≤k} \sum_{n≤k} |α_m| |β_n|
             ≤ \widetilde{A} \sum_{n≥k/2} |β_n| + \widetilde{B} \sum_{m≥k/2} |α_m|,

where

(3.20) \widetilde{A} = \sum_{n=0}^{∞} |α_n| < ∞, \widetilde{B} = \sum_{n=0}^{∞} |β_n| < ∞.

It follows that R_k → 0 as k → ∞. Thus the left side of (3.17) converges to AB and the
right side to \sum_{n=0}^{∞} γ_n. The absolute convergence of (3.16) follows by applying the same
argument with α_n replaced by |α_n| and β_n replaced by |β_n|.


Corollary 3.5. Suppose the following power series converge for |z| < R:

(3.21) f(z) = \sum_{n=0}^{∞} a_n z^n, g(z) = \sum_{n=0}^{∞} b_n z^n.

Then, for |z| < R,

(3.22) f(z) g(z) = \sum_{n=0}^{∞} c_n z^n, c_n = \sum_{j=0}^{n} a_j b_{n−j}.
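
The coefficient formula in (3.22) is a discrete convolution and is easy to compute for truncated coefficient lists. The Python sketch below is an illustration under the assumption that both series are given by their first few coefficients; the function name `cauchy_product` is hypothetical. It squares the geometric series \sum z^n, whose product series should have coefficients c_n = n + 1.

```python
def cauchy_product(a, b):
    """Return c_n = sum_{j=0}^{n} a[j] * b[n - j] for n < min(len(a), len(b))."""
    n_terms = min(len(a), len(b))
    return [sum(a[j] * b[n - j] for j in range(n + 1)) for n in range(n_terms)]


# Coefficients of 1/(1 - z) are all 1; the square has coefficients n + 1.
c = cauchy_product([1] * 8, [1] * 8)
print(c)

# Numerical check at z = 1/2, where 1/(1 - z)^2 = 4 (60 terms of each series).
z = 0.5
ones = [1.0] * 60
value = sum(cn * z ** n for n, cn in enumerate(cauchy_product(ones, ones)))
print(value)
```

Only the first min(len(a), len(b)) product coefficients are returned, since later ones would need coefficients beyond the truncation.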

The following result, which is related to Proposition 3.4, has a similar proof.
Proposition 3.6. If a_{jk} ∈ C and \sum_{j,k} |a_{jk}| < ∞, then \sum_j a_{jk} is absolutely convergent
for each k, \sum_k a_{jk} is absolutely convergent for each j, and

(3.23) \sum_{j=0}^{∞} ( \sum_{k=0}^{∞} a_{jk} ) = \sum_{k=0}^{∞} ( \sum_{j=0}^{∞} a_{jk} ) = \sum_{j,k} a_{jk}.

Proof. Clearly the hypothesis implies \sum_j |a_{jk}| < ∞ for each k and \sum_k |a_{jk}| < ∞ for each
j. It also implies that there exists B < ∞ such that

S_N = \sum_{j=0}^{N} \sum_{k=0}^{N} |a_{jk}| ≤ B, ∀ N.

Now S_N is bounded and monotone, so there exists a limit, S_N ↗ A < ∞ as N ↗ ∞. It
follows that, for each ε > 0, there exists N ∈ N such that

\sum_{(j,k)∈C(N)} |a_{jk}| < ε, C(N) = {(j, k) ∈ \widetilde{N} × \widetilde{N} : j > N or k > N}.

Note that if M, K ≥ N, then

| \sum_{j=0}^{M} ( \sum_{k=0}^{K} a_{jk} ) − \sum_{j=0}^{N} \sum_{k=0}^{N} a_{jk} | ≤ \sum_{(j,k)∈C(N)} |a_{jk}|,

hence

| \sum_{j=0}^{M} ( \sum_{k=0}^{∞} a_{jk} ) − \sum_{j=0}^{N} \sum_{k=0}^{N} a_{jk} | ≤ \sum_{(j,k)∈C(N)} |a_{jk}|.

Therefore

| \sum_{j=0}^{∞} ( \sum_{k=0}^{∞} a_{jk} ) − \sum_{j=0}^{N} \sum_{k=0}^{N} a_{jk} | ≤ \sum_{(j,k)∈C(N)} |a_{jk}|.


We have a similar result with the roles of j and k reversed, and clearly the two finite sums
agree. It follows that

| \sum_{j=0}^{∞} ( \sum_{k=0}^{∞} a_{jk} ) − \sum_{k=0}^{∞} ( \sum_{j=0}^{∞} a_{jk} ) | < 2ε, ∀ ε > 0,

yielding (3.23).
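
The absolute summability hypothesis in Proposition 3.6 is needed. A classical counterexample (not from the text) takes a_{jj} = 1, a_{j,j+1} = −1, and a_{jk} = 0 otherwise, so that \sum_{j,k} |a_{jk}| = ∞. The Python sketch below computes both iterated sums exactly, using the fact that each row and each column has at most two nonzero entries; the helper names are hypothetical.

```python
def a(j, k):
    """Entry of the array with +1 on the diagonal and -1 just to its right."""
    if k == j:
        return 1
    if k == j + 1:
        return -1
    return 0


def row_sum(j):
    """Sum over k of a(j, k); only k = j and k = j + 1 contribute."""
    return sum(a(j, k) for k in range(j + 2))


def col_sum(k):
    """Sum over j of a(j, k); only j = k - 1 and j = k contribute."""
    return sum(a(j, k) for j in range(k + 1))


N = 100
rows_first = sum(row_sum(j) for j in range(N))  # sum_j (sum_k a_jk)
cols_first = sum(col_sum(k) for k in range(N))  # sum_k (sum_j a_jk)
print(rows_first, cols_first)
```

Every row sums to 0, while column 0 sums to 1 and all other columns sum to 0, so the two iterated sums are 0 and 1 and the conclusion (3.23) fails for this array.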
Using Proposition 3.6, we demonstrate the following. (Thanks to Shrawan Kumar for
this argument.)
Proposition 3.7. If (3.10) has a radius of convergence R > 0, and z_1 ∈ D_R(z_0), then
f(z) has a convergent power series about z_1:

(3.24) f(z) = \sum_{k=0}^{∞} b_k (z − z_1)^k, for |z − z_1| < R − |z_1 − z_0|.

Proof. There is no loss in generality in taking z_0 = 0, which we will do here, for notational
simplicity. Setting f_{z_1}(ζ) = f(z_1 + ζ), we have from (3.10)

(3.25) f_{z_1}(ζ) = \sum_{n=0}^{∞} a_n (ζ + z_1)^n
                  = \sum_{n=0}^{∞} \sum_{k=0}^{n} a_n \binom{n}{k} ζ^k z_1^{n−k},

the second identity by the binomial formula. Now,

(3.26) \sum_{n=0}^{∞} \sum_{k=0}^{n} |a_n| \binom{n}{k} |ζ|^k |z_1|^{n−k} = \sum_{n=0}^{∞} |a_n| (|ζ| + |z_1|)^n < ∞,

provided |ζ| + |z_1| < R, which is the hypothesis in (3.24) (with z_0 = 0). Hence Proposition
3.6 gives

(3.27) f_{z_1}(ζ) = \sum_{k=0}^{∞} ( \sum_{n=k}^{∞} a_n \binom{n}{k} z_1^{n−k} ) ζ^k.

Hence (3.24) holds, with

(3.28) b_k = \sum_{n=k}^{∞} a_n \binom{n}{k} z_1^{n−k}.


This proves Proposition 3.7. Note in particular that

(3.29) b_1 = \sum_{n=1}^{∞} n a_n z_1^{n−1}.

For more on power series, see §3 of Chapter 4.
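
Formula (3.28) can be checked on a case where the answer is known in closed form. For the geometric series f(z) = 1/(1 − z), with a_n = 1, z_0 = 0, and R = 1, expanding directly about z_1 gives b_k = 1/(1 − z_1)^{k+1}. The Python sketch below compares this with a truncation of (3.28); the function name `recentered_coeff`, the truncation index, and the choice z_1 = 0.25 are assumptions made for this example.

```python
from math import comb


def recentered_coeff(a, k, z1, n_max):
    """Truncation at n_max of b_k = sum_{n >= k} a(n) * C(n, k) * z1**(n - k)."""
    return sum(a(n) * comb(n, k) * z1 ** (n - k) for n in range(k, n_max + 1))


z1 = 0.25
b = [recentered_coeff(lambda n: 1.0, k, z1, 400) for k in range(5)]
exact = [1.0 / (1.0 - z1) ** (k + 1) for k in range(5)]

for bk, ek in zip(b, exact):
    print(f"{bk:.12f}  {ek:.12f}")
```

The case k = 1 is the formula (3.29), which here gives b_1 = 1/(1 − z_1)^2.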

Exercises

1. Let a_k ∈ C. Assume there exist K ∈ N, α < 1 such that

(3.30) k ≥ K ⟹ |a_{k+1} / a_k| ≤ α.

Show that \sum_{k=0}^{∞} a_k is absolutely convergent.
Note. This is the ratio test.

2. Determine the radius of convergence R for each of the following power series. If 0 <
R < ∞, try to determine when convergence holds at points on |z| = R.

(3.31) \sum_{n=0}^{∞} z^n, \sum_{n=1}^{∞} z^n / n, \sum_{n=1}^{∞} z^n / n^2,
       \sum_{n=1}^{∞} z^n / n!, \sum_{n=1}^{∞} z^n / 2^n, \sum_{n=1}^{∞} z^{2^n} / 2^n,
       \sum_{n=1}^{∞} n z^n, \sum_{n=1}^{∞} n^2 z^n, \sum_{n=1}^{∞} n! z^n.

3. Prove Proposition 3.6.

4. We have seen that

(3.32) 1 / (1 − z) = \sum_{k=0}^{∞} z^k, |z| < 1.

Find power series in z for

(3.33) 1 / (z − 2), 1 / (z + 3).

Where do they converge?


5. Use Corollary 3.5 to produce a power series in z for

(3.34) 1 / (z^2 + z − 6).

Where does the series converge?

6. As an alternative to the use of Corollary 3.5, write (3.34) as a linear combination of the
functions (3.33).

7. Find the power series in z for

1 / (1 + z^2).

Hint. Replace z by −z^2 in (3.32).

8. Given a > 0, find the power series in z for

1 / (a^2 + z^2).


4. Spaces of functions

If X and Y are metric spaces, the space C(X, Y ) of continuous maps f : X → Y has a
natural metric structure, under some additional hypotheses. We use
(4.1) D(f, g) = sup_{x∈X} d(f(x), g(x)).

This sup exists provided f (X) and g(X) are bounded subsets of Y, where to say B ⊂ Y is
bounded is to say d : B × B → [0, ∞) has bounded image. In particular, this supremum
exists if X is compact.
Proposition 4.1. If X is a compact metric space and Y is a complete metric space, then
C(X, Y ), with the metric (4.1), is complete.
Proof. That D(f, g) satisfies the conditions to define a metric on C(X, Y ) is straightfor-
ward. We check completeness. Suppose (fν ) is a Cauchy sequence in C(X, Y ), so, as
ν → ∞,
(4.2) sup_{k≥0} sup_{x∈X} d(f_{ν+k}(x), f_ν(x)) ≤ ε_ν → 0.

Then in particular (fν (x)) is a Cauchy sequence in Y for each x ∈ X, so it converges, say
to g(x) ∈ Y . It remains to show that g ∈ C(X, Y ) and that fν → g in the metric (4.1).
In fact, taking k → ∞ in the estimate above, we have
(4.3) sup_{x∈X} d(g(x), f_ν(x)) ≤ ε_ν → 0,

i.e., fν → g uniformly. It remains only to show that g is continuous. For this, let xj → x
in X and fix ε > 0. Pick N so that εN < ε. Since fN is continuous, there exists J such
that j ≥ J ⇒ d(fN (xj ), fN (x)) < ε. Hence
j ≥ J ⇒ d(g(x_j), g(x)) ≤ d(g(x_j), f_N(x_j)) + d(f_N(x_j), f_N(x)) + d(f_N(x), g(x)) < 3ε.

This completes the proof.


In case Y = R, we write C(X, R) = C(X). The distance function (4.1) can then be
written

(4.4) D(f, g) = ‖f − g‖_sup, ‖f‖_sup = sup_{x∈X} |f(x)|.

‖f‖_sup is a norm on C(X).
Generally, a norm on a vector space V is an assignment f ↦ ‖f‖ ∈ [0, ∞), satisfying

(4.5) ‖f‖ = 0 ⇔ f = 0, ‖af‖ = |a| ‖f‖, ‖f + g‖ ≤ ‖f‖ + ‖g‖,


given f, g ∈ V and a a scalar (in R or C). A vector space equipped with a norm is called a
normed vector space. It is then a metric space, with distance function D(f, g) = ‖f − g‖.
If the space is complete, one calls V a Banach space.
In particular, by Proposition 4.1, C(X) is a Banach space, when X is a compact metric
space.
The next result is a special case of Ascoli’s Theorem. To state it, we say a modulus of
continuity is a strictly monotonically increasing, continuous function ω : [0, ∞) → [0, ∞)
such that ω(0) = 0.
Proposition 4.2. Let X and Y be compact metric spaces, and fix a modulus of continuity
ω(δ). Then
(4.6) C_ω = { f ∈ C(X, Y) : d(f(x), f(x′)) ≤ ω(d(x, x′)) ∀ x, x′ ∈ X }
is a compact subset of C(X, Y).
Proof. Let (fν ) be a sequence in Cω . Let Σ be a countable dense subset of X, as in Corollary
3.2 of Chapter 2. For each x ∈ Σ, (fν (x)) is a sequence in Y, which hence has a convergent
subsequence. Using a diagonal construction similar to that in the proof of Proposition 3.10
of Chapter 2, we obtain a subsequence (ϕν ) of (fν ) with the property that ϕν (x) converges
in Y, for each x ∈ Σ, say
(4.7) ϕν (x) → ψ(x),
for all x ∈ Σ, where ψ : Σ → Y.
So far, we have not used (4.6). This hypothesis will now be used to show that ϕ_ν
converges uniformly on X. Pick ε > 0. Then pick δ > 0 such that ω(δ) < ε/3. Since X is
compact, we can cover X by finitely many balls B_δ(x_j), 1 ≤ j ≤ N, x_j ∈ Σ. Pick M so
large that ϕ_ν(x_j) is within ε/6 of its limit for all ν ≥ M (when 1 ≤ j ≤ N). Now, for any
x ∈ X, picking ℓ ∈ {1, . . . , N} such that d(x, x_ℓ) ≤ δ, we have, for k ≥ 0, ν ≥ M,

(4.8) d(ϕ_{ν+k}(x), ϕ_ν(x)) ≤ d(ϕ_{ν+k}(x), ϕ_{ν+k}(x_ℓ)) + d(ϕ_{ν+k}(x_ℓ), ϕ_ν(x_ℓ))
                              + d(ϕ_ν(x_ℓ), ϕ_ν(x))
                            ≤ ε/3 + ε/3 + ε/3.

Thus (ϕ_ν(x)) is Cauchy in Y for all x ∈ X, hence convergent. Call the limit ψ(x), so we
now have (4.7) for all x ∈ X. Letting k → ∞ in (4.8) we have uniform convergence of ϕ_ν
to ψ. Finally, passing to the limit ν → ∞ in
(4.9) d(ϕ_ν(x), ϕ_ν(x′)) ≤ ω(d(x, x′))
gives ψ ∈ C_ω.
We want to re-state Proposition 4.2, bringing in the notion of equicontinuity. Given
metric spaces X and Y , and a set of maps F ⊂ C(X, Y ), we say F is equicontinuous at a
point x0 ∈ X provided
∀ ε > 0, ∃ δ > 0 such that ∀ x ∈ X, f ∈ F,
(4.10)
dX (x, x0 ) < δ =⇒ dY (f (x), f (x0 )) < ε.


We say F is equicontinuous on X if it is equicontinuous at each point of X. We say F is
uniformly equicontinuous on X provided

(4.11) ∀ ε > 0, ∃ δ > 0 such that ∀ x, x′ ∈ X, f ∈ F,
       d_X(x, x′) < δ ⟹ d_Y(f(x), f(x′)) < ε.

Note that (4.11) is equivalent to the existence of a modulus of continuity ω such that
F ⊂ C_ω, given by (4.6). It is useful to record the following result.
Proposition 4.3. Let X and Y be metric spaces, F ⊂ C(X, Y). Assume X is compact.
Then

(4.12) F equicontinuous ⟹ F is uniformly equicontinuous.

Proof. The argument is a variant of the proof of Proposition 1.4. In more detail, suppose
there exist x_ν, x′_ν ∈ X, ε > 0, and f_ν ∈ F such that d(x_ν, x′_ν) ≤ 2^{−ν} but

(4.13) d(f_ν(x_ν), f_ν(x′_ν)) ≥ ε.

Taking a convergent subsequence x_{ν_j} → p ∈ X, we also have x′_{ν_j} → p. Now equicontinuity
of F at p implies that there exists N < ∞ such that

(4.14) d(g(x_{ν_j}), g(p)) < ε/2, ∀ j ≥ N, g ∈ F,

and likewise with x_{ν_j} replaced by x′_{ν_j}, contradicting (4.13).
Putting together Propositions 4.2 and 4.3 then gives the following.
Proposition 4.4. Let X and Y be compact metric spaces. If F ⊂ C(X, Y ) is equicontin-
uous on X, then it has compact closure in C(X, Y ).

Exercises

1. Let X and Y be compact metric spaces. Show that if F ⊂ C(X, Y ) is compact, then F
is equicontinuous. (This is a converse to Proposition 4.4.)

2. Let X be a compact metric space, and $r \in (0, 1]$. Define $\mathrm{Lip}^r(X, \mathbb{R}^n)$ to consist of
continuous functions $f : X \to \mathbb{R}^n$ such that, for some $L < \infty$ (depending on f),

$|f(x) - f(y)| \le L\, d_X(x, y)^r, \quad \forall\, x, y \in X.$

Define a norm

$\|f\|_r = \sup_{x \in X} |f(x)| + \sup_{x, y \in X,\ x \ne y} \frac{|f(x) - f(y)|}{d(x, y)^r}.$


Show that $\mathrm{Lip}^r(X, \mathbb{R}^n)$ is a complete metric space, with distance function $D_r(f, g) = \|f - g\|_r$.

3. In the setting of Exercise 2, show that if $0 < r < s \le 1$ and $f \in \mathrm{Lip}^s(X, \mathbb{R}^n)$, then

$\|f\|_r \le C\, \|f\|_{\sup}^{1-\theta}\, \|f\|_s^{\theta}, \quad \theta = \frac{r}{s} \in (0, 1).$

4. In the setting of Exercise 2, show that if $0 < r < s \le 1$, then

$\{ f \in \mathrm{Lip}^s(X, \mathbb{R}^n) : \|f\|_s \le 1 \}$

is compact in $\mathrm{Lip}^r(X, \mathbb{R}^n)$.

5. Let X be a compact metric space, and define C(X) as in (4.4). Take

P : C(X) × C(X) −→ C(X), P (f, g)(x) = f (x)g(x).

Show that P is continuous.


Chapter IV
Calculus

Introduction

Having foundational material on numbers, spaces, and functions, we proceed further
into the heart of analysis, with a rigorous development of calculus, for functions of one real
variable.
Section 1 introduces the derivative, establishes basic identities like the product rule and
the chain rule, and also obtains some important theoretical results, such as the Mean Value
Theorem and the Inverse Function Theorem. One application of the latter is the study of
$x^{1/n}$, for $x > 0$, which leads more generally to $x^r$, for $x > 0$ and $r \in \mathbb{Q}$.
Section 2 brings in the integral, more precisely the Riemann integral. A major result is
the Fundamental Theorem of Calculus, whose proof makes essential use of the Mean Value
Theorem. Another topic is the change of variable formula for integrals (treated in some
exercises).
In §3 we treat power series, continuing the development from §3 of Chapter 3. Here
we treat such topics as term by term differentiation of power series, and formulas for the
remainder when a power series is truncated. An application of such remainder formulas is
made to the study of convergence of the power series about x = 0 of $(1 - x)^b$.
Section 4 studies curves in Euclidean space $\mathbb{R}^n$, with particular attention to arc length.
We derive an integral formula for arc length. We show that a smooth curve can be
reparametrized by arc length, as an application of the Inverse Function Theorem. We
then take a look at the unit circle $S^1$ in $\mathbb{R}^2$. Using the parametrization of part of $S^1$ as
$(t, \sqrt{1 - t^2})$, we obtain a power series for arc lengths, as an application of material of §3
on power series of $(1 - x)^b$, with $b = -1/2$, and x replaced by $t^2$. We also bring in the
trigonometric functions, having the property that $(\cos t, \sin t)$ provides a parametrization
of $S^1$ by arc length.
Section 5 goes much further into the study of the trigonometric functions. Actually,
it begins with a treatment of the exponential function $e^t$, observes that such treatment
extends readily to $e^{at}$, given $a \in \mathbb{C}$, and then establishes that $e^{it}$ provides a unit speed
parametrization of $S^1$. This directly gives Euler’s formula

$e^{it} = \cos t + i \sin t,$

and provides for a unified treatment of the exponential and trigonometric functions. We
also bring in log as the inverse function to the exponential, and we use the formula
$x^r = e^{r \log x}$ to generalize results of §1 on $x^r$ from $r \in \mathbb{Q}$ to $r \in \mathbb{R}$, and further, to $r \in \mathbb{C}$.
In §6 we give a natural extension of the Riemann integral from the class of bounded (Rie-
mann integrable) functions to a class of unbounded “integrable” functions. The treatment
here is perhaps a desirable alternative to discussions one sees of “improper integrals.”


This chapter concludes with some appendices. Appendix A gives a proof of the Fun-
damental Theorem of Algebra, that every nonconstant polynomial has a complex root.
Appendix B presents a proof that π is irrational. Appendix C refines material on the
power series of $(1 - x)^b$, in case b > 0. This will prove useful in Chapter 5. Appendix D
discusses a method of calculating π that goes back to Archimedes. Appendix E discusses
calculations of π using arctangents. Appendix F treats the power series for tan x, whose
coefficients require a more elaborate derivation than those for sin x and cos x. Appendix
G discusses a theorem of Abel, giving the optimal condition under which a power series in
t with radius of convergence 1 can be shown to converge uniformly in t ∈ [0, 1], as well as
related issues regarding convergence of infinite series.


1. The derivative

Consider a function f, defined on an interval (a, b) ⊂ R, taking values in R or C. Given
x ∈ (a, b), we say f is differentiable at x, with derivative $f'(x)$, provided

(1.1)  $\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = f'(x).$

We also use the notation

(1.2)  $\frac{df}{dx}(x) = f'(x).$

A characterization equivalent to (1.1) is

(1.3)  $f(x + h) = f(x) + f'(x) h + r(x, h), \quad r(x, h) = o(h),$

where

(1.4)  $r(x, h) = o(h)$ means $\frac{r(x, h)}{h} \to 0$ as $h \to 0$.
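
As a numerical illustration of (1.3)–(1.4), the following Python sketch tabulates the difference quotient in (1.1) and the ratio r(x, h)/h for a sample function. The choice f(t) = t^3 at x = 2 (where f'(2) = 12) and the function names are assumptions made only for this illustration; floating point arithmetic merely approximates the limiting behavior.

```python
def difference_quotient(f, x, h):
    """Return (f(x + h) - f(x)) / h, the quotient appearing in (1.1)."""
    return (f(x + h) - f(x)) / h


def remainder(f, fprime_x, x, h):
    """Return r(x, h) = f(x + h) - f(x) - f'(x) h, as in (1.3)."""
    return f(x + h) - f(x) - fprime_x * h


def cube(t):
    # Sample function (an assumption for illustration): f(t) = t^3.
    return t ** 3


if __name__ == "__main__":
    x, fprime_x = 2.0, 12.0  # f'(2) = 3 * 2^2 = 12
    for h in (1e-1, 1e-2, 1e-3, 1e-4):
        q = difference_quotient(cube, x, h)
        ratio = remainder(cube, fprime_x, x, h) / h
        print(f"h = {h:g}: quotient = {q:.6f}, r(x, h)/h = {ratio:.6f}")
```

For this f one has r(x, h)/h = 6h + h^2 exactly, so the printed ratios should shrink roughly in proportion to h.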

Clearly if f is differentiable at x then it is continuous at x. We say f is differentiable on
(a, b) provided it is differentiable at each point of (a, b). If also g is defined on (a, b) and
differentiable at x, we have

(1.5)  $\frac{d}{dx}(f + g)(x) = f'(x) + g'(x).$

We also have the following product rule:

(1.6)  $\frac{d}{dx}(fg)(x) = f'(x) g(x) + f(x) g'(x).$

To prove (1.6), note that

$\frac{f(x + h) g(x + h) - f(x) g(x)}{h} = \frac{f(x + h) - f(x)}{h}\, g(x) + f(x + h)\, \frac{g(x + h) - g(x)}{h}.$

We can use the product rule to show inductively that

(1.7)  $\frac{d}{dx} x^n = n x^{n-1},$


for all n ∈ N. In fact, this is immediate from (1.1) if n = 1. Given that it holds for n = k,
we have

$\frac{d}{dx} x^{k+1} = \frac{d}{dx}(x\, x^k) = \frac{dx}{dx}\, x^k + x\, \frac{d}{dx} x^k = x^k + k x^k = (k + 1) x^k,$

completing the induction. We also have

$\frac{1}{h}\Big( \frac{1}{x + h} - \frac{1}{x} \Big) = -\frac{1}{x(x + h)} \to -\frac{1}{x^2}$, as $h \to 0$,

for $x \ne 0$, hence

(1.8)  $\frac{d}{dx} \frac{1}{x} = -\frac{1}{x^2}$, if $x \ne 0$.

From here, we can extend (1.7) from n ∈ N to all n ∈ Z (requiring $x \ne 0$ if n < 0).
A similar inductive argument yields

(1.9)  $\frac{d}{dx} f(x)^n = n f(x)^{n-1} f'(x),$

for n ∈ N, and more generally for n ∈ Z (requiring $f(x) \ne 0$ if n < 0).


Going further, we have the following chain rule. Suppose f : (a, b) → (α, β) is differentiable
at x and g : (α, β) → R (or C) is differentiable at y = f(x). Form $G = g \circ f$, i.e.,
G(x) = g(f(x)). We claim

(1.10)  $G = g \circ f \implies G'(x) = g'(f(x)) f'(x).$

To see this, write

(1.11)  $G(x + h) = g(f(x + h)) = g(f(x) + f'(x) h + r_f(x, h)) = g(f(x)) + g'(f(x)) (f'(x) h + r_f(x, h)) + r_g(f(x), f'(x) h + r_f(x, h)).$

Here,

$\frac{r_f(x, h)}{h} \to 0$ as $h \to 0$,

and also

$\frac{r_g(f(x), f'(x) h + r_f(x, h))}{h} \to 0$, as $h \to 0$,

so the analogue of (1.3) applies.
The derivative has the following important connection to maxima and minima.


Proposition 1.1. Let f : (a, b) → R. Suppose x ∈ (a, b) and

(1.12)  $f(x) \ge f(y), \quad \forall\, y \in (a, b).$

If f is differentiable at x, then $f'(x) = 0$. The same conclusion holds if $f(x) \le f(y)$ for
all y ∈ (a, b).
Proof. Given (1.12), we have

(1.13)  $\frac{f(x + h) - f(x)}{h} \le 0, \quad \forall\, h \in (0, b - x),$

and

(1.14)  $\frac{f(x + h) - f(x)}{h} \ge 0, \quad \forall\, h \in (a - x, 0).$

If f is differentiable at x, both (1.13) and (1.14) must converge to $f'(x)$ as $h \to 0$, so we
simultaneously have $f'(x) \le 0$ and $f'(x) \ge 0$.
We next establish a key result known as the Mean Value Theorem.
Theorem 1.2. Let f : [a, b] → R. Assume f is continuous on [a, b] and differentiable on
(a, b). Then there exists ξ ∈ (a, b) such that

(1.15)  $f'(\xi) = \frac{f(b) - f(a)}{b - a}.$

Proof. Let g(x) = f(x) − κ(x − a), where κ denotes the right side of (1.15). Then g(a) =
g(b). The result (1.15) is equivalent to the assertion that

(1.16)  $g'(\xi) = 0$

for some ξ ∈ (a, b). Now g is continuous on the compact set [a, b], so it assumes both a
maximum and a minimum on this set. If g has a maximum at a point ξ ∈ (a, b), then
(1.16) follows from Proposition 1.1. If not, the maximum must be g(a) = g(b), and then g
must assume a minimum at some point ξ ∈ (a, b). Again Proposition 1.1 implies (1.16).
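
The proof above is an existence argument and does not say how to locate ξ. As a rough numerical companion, the sketch below searches for a point where $f'(\xi)$ equals the slope κ by bisection. It assumes more than Theorem 1.2 does, namely that $f'$ is available as a continuous function and that $f' - \kappa$ changes sign between a and b; the function name `mean_value_point` and the sample f(x) = x^3 on [0, 2] are our own choices.

```python
import math


def mean_value_point(f, fprime, a, b, tol=1e-12):
    """Locate xi in (a, b) with f'(xi) = (f(b) - f(a)) / (b - a) by bisection.

    Assumption (stronger than Theorem 1.2): f' is continuous and
    f'(x) - kappa changes sign between a and b.
    """
    kappa = (f(b) - f(a)) / (b - a)  # right side of (1.15)

    def g_prime(x):
        # Derivative of g(x) = f(x) - kappa (x - a) from the proof.
        return fprime(x) - kappa

    lo, hi = a, b
    if g_prime(lo) * g_prime(hi) > 0:
        raise ValueError("bisection needs a sign change of f' - kappa")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g_prime(lo) * g_prime(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    xi = mean_value_point(lambda x: x ** 3, lambda x: 3 * x ** 2, 0.0, 2.0)
    # Here kappa = 4, so 3 xi^2 = 4 and xi = 2 / sqrt(3).
    print(xi, 2 / math.sqrt(3))
```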
We use the Mean Value Theorem to produce a criterion for constructing the inverse of
a function. Let

(1.17) f : [a, b] −→ R, f (a) = α, f (b) = β.

Assume f is continuous on [a, b], differentiable on (a, b), and

(1.18)  $0 < \gamma_0 \le f'(x) \le \gamma_1 < \infty, \quad \forall\, x \in (a, b).$


Then (1.15) implies

(1.19)  $\gamma_0 (b - a) \le \beta - \alpha \le \gamma_1 (b - a).$

We can also apply Theorem 1.2 to f, restricted to an interval $[x_1, x_2] \subset [a, b]$, to get

(1.20)  $\gamma_0 (x_2 - x_1) \le f(x_2) - f(x_1) \le \gamma_1 (x_2 - x_1)$, if $a \le x_1 < x_2 \le b$.

It follows that

(1.21)  $f : [a, b] \to [\alpha, \beta]$ is one-to-one.

The intermediate value theorem implies f : [a, b] → [α, β] is onto. Consequently f has an
inverse

(1.22)  $g : [\alpha, \beta] \to [a, b], \quad g(f(x)) = x, \quad f(g(y)) = y,$

and (1.20) implies

(1.23)  $\gamma_0 (g(y_2) - g(y_1)) \le y_2 - y_1 \le \gamma_1 (g(y_2) - g(y_1))$, if $\alpha \le y_1 < y_2 \le \beta$.

The following result is known as the Inverse Function Theorem.


Theorem 1.3. If f is continuous on [a, b] and differentiable on (a, b), and (1.17)–(1.18)
hold, then its inverse g : [α, β] → [a, b] is differentiable on (α, β), and

(1.24)  $g'(y) = \frac{1}{f'(x)}$, for $y = f(x) \in (\alpha, \beta)$.

The same conclusion holds if in place of (1.18) we have

(1.25)  $-\gamma_1 \le f'(x) \le -\gamma_0 < 0, \quad \forall\, x \in (a, b),$

except that then β < α.


Proof. Fix y ∈ (α, β), and let x = g(y), so y = f(x). From (1.22) we have, for h small
enough,

$x + h = g(f(x + h)) = g(f(x) + f'(x) h + r(x, h)),$

i.e.,

(1.26)  $g(y + f'(x) h + r(x, h)) = g(y) + h, \quad r(x, h) = o(h).$

Now (1.23) implies

(1.27)  $|g(y_1 + r(x, h)) - g(y_1)| \le \frac{1}{\gamma_0} |r(x, h)|,$


provided $y_1,\ y_1 + r(x, h) \in [\alpha, \beta]$, so, with $\tilde{h} = f'(x) h$, and $y_1 = y + \tilde{h}$, we have

(1.28)  $g(y + \tilde{h}) = g(y) + \frac{\tilde{h}}{f'(x)} + o(\tilde{h}),$

yielding (1.24) from the analogue of (1.3).

Remark. If one knew that g were differentiable, as well as f , then the identity (1.24) would
follow by differentiating g(f (x)) = x, applying the chain rule. However, an additional
argument, such as given above, is necessary to guarantee that g is differentiable.

Theorem 1.3 applies to the functions

(1.29)  $p_n(x) = x^n, \quad n \in \mathbb{N}.$

By (1.7), $p_n'(x) > 0$ for x > 0, so (1.18) holds when 0 < a < b < ∞. We can take $a \searrow 0$
and $b \nearrow \infty$ and see that

(1.30)  $p_n : (0, \infty) \to (0, \infty)$ is invertible,

with differentiable inverse $q_n : (0, \infty) \to (0, \infty)$. We use the notation

(1.31)  $x^{1/n} = q_n(x), \quad x > 0,$

so, given n ∈ N,

(1.32)  $x > 0 \implies x = x^{1/n} \cdots x^{1/n}$, (n factors).

Note. We recall that $x^{1/n}$ was constructed, for x > 0, in Chapter 1, §7, and its continuity
discussed in Chapter 3, §1.
Given m ∈ Z, we can set

(1.33)  $x^{m/n} = (x^{1/n})^m, \quad x > 0,$

and verify that $(x^{1/kn})^{km} = (x^{1/n})^m$. Thus we have $x^r$ defined for all r ∈ Q, when x > 0.
We have

(1.34)  $x^{r+s} = x^r x^s$, for $x > 0$, $r, s \in \mathbb{Q}$.

See Exercises 3–5 in §7 of Chapter 1. Applying (1.24) to $f(x) = x^n$, $g(y) = y^{1/n}$, we have

(1.35)  $\frac{d}{dy} y^{1/n} = \frac{1}{n x^{n-1}}, \quad y = x^n, \quad x > 0.$


Now $x^{n-1} = y/x = y^{1 - 1/n}$, so we get

(1.36)  $\frac{d}{dy} y^r = r y^{r-1}, \quad y > 0,$

when r = 1/n. Putting this together with (1.9) (with m in place of n), we get (1.36) for
all r = m/n ∈ Q.
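
To see (1.24) and (1.36) at work numerically, the following Python sketch inverts $p_3(x) = x^3$ on [1, 2] by bisection and compares a difference quotient of the inverse with the predicted value $r y^{r-1}$, r = 1/3. The helper names, the interval, and the use of a central difference quotient (rather than the one-sided quotient of (1.1)) are assumptions made for illustration only.

```python
def inverse_by_bisection(f, a, b, y, tol=1e-13):
    """Solve f(x) = y on [a, b] for a continuous, strictly increasing f."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def derivative_of_inverse(f, a, b, y, h=1e-5):
    """Central difference quotient for g = f^{-1} at y (an approximation)."""
    g_plus = inverse_by_bisection(f, a, b, y + h)
    g_minus = inverse_by_bisection(f, a, b, y - h)
    return (g_plus - g_minus) / (2 * h)


def p3(x):
    # The function p_n of (1.29) with n = 3.
    return x ** 3


if __name__ == "__main__":
    y = 2.0
    x = inverse_by_bisection(p3, 1.0, 2.0, y)       # x = y^(1/3)
    numeric = derivative_of_inverse(p3, 1.0, 2.0, y)
    from_1_24 = 1.0 / (3 * x ** 2)                  # 1 / f'(x), as in (1.24)
    from_1_36 = (1.0 / 3.0) * y ** (1.0 / 3.0 - 1)  # r y^(r-1), as in (1.36)
    print(numeric, from_1_24, from_1_36)
```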
The definition of $x^r$ for x > 0 and the identity (1.36) can be extended to all r ∈ R, with
some more work. We will find a neat way to do this in §5.
We recall another common notation, namely

(1.37)  $\sqrt{x} = x^{1/2}, \quad x > 0.$

Then (1.36) yields

(1.38)  $\frac{d}{dx} \sqrt{x} = \frac{1}{2\sqrt{x}}.$

In regard to this, note that, if we consider

(1.39)  $\frac{\sqrt{x + h} - \sqrt{x}}{h},$

we can multiply numerator and denominator by $\sqrt{x + h} + \sqrt{x}$, to get

(1.40)  $\frac{1}{\sqrt{x + h} + \sqrt{x}},$

whose convergence to the right side of (1.38) for x > 0 is equivalent to the statement that

(1.41)  $\lim_{h \to 0} \sqrt{x + h} = \sqrt{x},$

i.e., to the continuity of $x \mapsto \sqrt{x}$ on (0, ∞). Such continuity is a consequence of the fact
that, for 0 < a < b < ∞, n = 2,

(1.42)  $p_n : [a, b] \to [a^n, b^n]$

is continuous, one-to-one, and onto, so, by the compactness of [a, b], its inverse is continuous.
Thus we have an alternative derivation of (1.38).
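
The passage from (1.39) to (1.40) can also be checked numerically. In the Python sketch below (function names are ours, chosen for illustration), both expressions are evaluated for small h and compared with $1/(2\sqrt{x})$. In floating point the two forms agree only approximately; the form (1.40) avoids subtracting nearly equal numbers.

```python
import math


def naive_quotient(x, h):
    """The difference quotient (1.39)."""
    return (math.sqrt(x + h) - math.sqrt(x)) / h


def conjugate_form(x, h):
    """The equivalent expression (1.40)."""
    return 1.0 / (math.sqrt(x + h) + math.sqrt(x))


if __name__ == "__main__":
    x = 4.0
    limit = 1.0 / (2.0 * math.sqrt(x))  # right side of (1.38)
    for h in (1e-2, 1e-4, 1e-6, 1e-8):
        print(h, naive_quotient(x, h), conjugate_form(x, h), limit)
```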
If I ⊂ R is an interval and f : I → R (or C), we say $f \in C^1(I)$ if f is differentiable on I
and $f'$ is continuous on I. If $f'$ is in turn differentiable, we have the second derivative of
f:

(1.43)  $\frac{d^2 f}{dx^2}(x) = f''(x) = \frac{d}{dx} f'(x).$


If $f'$ is differentiable on I and $f''$ is continuous on I, we say $f \in C^2(I)$. Inductively, we can
define higher order derivatives of f, $f^{(k)}$, also denoted $d^k f / dx^k$. Here, $f^{(1)} = f'$, $f^{(2)} = f''$,
and if $f^{(k)}$ is differentiable,

(1.44)  $f^{(k+1)}(x) = \frac{d}{dx} f^{(k)}(x).$

If $f^{(k)}$ is continuous on I, we say $f \in C^k(I)$.


Sometimes we will run into functions of more than one variable, and will want to dif-
ferentiate with respect to each one of them. For example, if f (x, y) is defined for (x, y) in
an open set in R2 , we set

(1.45)  $\frac{\partial f}{\partial x}(x, y) = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}, \quad \frac{\partial f}{\partial y}(x, y) = \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}.$

We will not need any more than the definition here. A serious study of the derivative of
a function of several variables is given in the companion [T2] to this volume, Introduction
to Analysis in Several Variables.
We end this section with some results on the significance of the second derivative.
Proposition 1.4. Assume f is differentiable on (a, b), $x_0 \in (a, b)$, and $f'(x_0) = 0$. Assume
$f'$ is differentiable at $x_0$ and $f''(x_0) > 0$. Then there exists δ > 0 such that

(1.46)  $f(x_0) < f(x)$ for all $x \in (x_0 - \delta, x_0 + \delta) \setminus \{x_0\}$.

We say f has a local minimum at $x_0$.

Proof. Since

(1.47)  $f''(x_0) = \lim_{h \to 0} \frac{f'(x_0 + h) - f'(x_0)}{h},$

the assertion that $f''(x_0) > 0$ implies that there exists δ > 0 such that the difference quotient
on the right side of (1.47) is > 0 for all nonzero h ∈ [−δ, δ]. Since $f'(x_0) = 0$, it follows that

(1.48)  $-\delta \le h < 0 \implies f'(x_0 + h) < 0, \qquad 0 < h \le \delta \implies f'(x_0 + h) > 0.$

This plus the mean value theorem imply (1.46).

Remark. Similarly,

(1.49)  $f''(x_0) < 0 \implies f$ has a local maximum at $x_0$.


These two facts constitute the second derivative test for local maxima and local minima.

Let us now assume that f and $f'$ are differentiable on (a, b), so $f''$ is defined at each
point of (a, b). Let us further assume

(1.50)  $f''(x) > 0, \quad \forall\, x \in (a, b).$

The mean value theorem, applied to $f'$, yields

(1.51)  $a < x_0 < x_1 < b \implies f'(x_0) < f'(x_1).$

Here is another interesting property.


Proposition 1.5. If (1.50) holds and $a < x_0 < x_1 < b$, then

(1.52)  $f(s x_0 + (1 - s) x_1) < s f(x_0) + (1 - s) f(x_1), \quad \forall\, s \in (0, 1).$

Proof. For s ∈ [0, 1], set

(1.53)  $g(s) = s f(x_0) + (1 - s) f(x_1) - f(s x_0 + (1 - s) x_1).$

The result (1.52) is equivalent to

(1.54)  $g(s) > 0$ for $0 < s < 1$.

Note that

(1.55)  $g(0) = g(1) = 0.$

If (1.54) fails, g must assume a minimum at some point $s_0 \in (0, 1)$. At such a point,
$g'(s_0) = 0$. A computation gives $g'(s) = f(x_0) - f(x_1) - (x_0 - x_1) f'(s x_0 + (1 - s) x_1)$, and
hence

(1.56)  $g''(s) = -(x_0 - x_1)^2 f''(s x_0 + (1 - s) x_1).$

Thus (1.50) ⇒ $g''(s_0) < 0$. Then (1.49) ⇒ g has a local maximum at $s_0$. This contradiction
establishes (1.54), hence (1.52).

Remark. The result (1.52) implies that the graph of y = f(x) over $[x_0, x_1]$ lies below the
chord, i.e., the line segment from $(x_0, f(x_0))$ to $(x_1, f(x_1))$ in $\mathbb{R}^2$. We say f is convex.
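
A quick numerical sanity check of (1.52) is sketched below in Python. It evaluates the function g of (1.53) on a grid of values of s for the sample function $f(x) = e^x$ (which has $f'' > 0$) with $x_0 = 0$, $x_1 = 2$. The function name `chord_gap`, the sample f, and the grid are assumptions for illustration; a finite grid check is of course no substitute for the proof.

```python
import math


def chord_gap(f, x0, x1, s):
    """Return g(s) = s f(x0) + (1 - s) f(x1) - f(s x0 + (1 - s) x1), as in (1.53)."""
    return s * f(x0) + (1 - s) * f(x1) - f(s * x0 + (1 - s) * x1)


if __name__ == "__main__":
    x0, x1 = 0.0, 2.0
    for k in range(11):
        s = k / 10
        # Expect 0 at s = 0 and s = 1, and positive values in between.
        print(f"s = {s:.1f}: g(s) = {chord_gap(math.exp, x0, x1, s):.6f}")
```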

Exercises


Compute the derivative of each of the following functions. Specify where each of these
derivatives is defined.

(1) $\sqrt{1 + x^2}$,
(2) $(x^2 + x^3)^{-4}$,
(3) $\frac{1 + x^2}{(x^2 + x^3)^4}$.

4. Let f : [0, ∞) → R be a $C^2$ function satisfying

(1.57)  $f(x) > 0, \quad f'(x) > 0, \quad f''(x) < 0$, for $x > 0$.

Show that

(1.58)  $x, y > 0 \implies f(x + y) < f(x) + f(y).$

5. Apply Exercise 4 to

(1.59)  $f(x) = \frac{x}{1 + x}.$

Relate the conclusion to Exercises 1–2 in §3 of Chapter 2. Give a direct proof that (1.58)
holds for f in (1.59), without using calculus.

6. If $f : I \to \mathbb{R}^n$, we define $f'(x)$ just as in (1.1). If $f(x) = (f_1(x), \dots, f_n(x))$, then f is
differentiable at x if and only if each component $f_j$ is, and

$f'(x) = (f_1'(x), \dots, f_n'(x)).$

Parallel to (1.6), show that if $g : I \to \mathbb{R}^n$, then the dot product satisfies

$\frac{d}{dx}\, f(x) \cdot g(x) = f'(x) \cdot g(x) + f(x) \cdot g'(x).$

7. Establish the following variant of Proposition 1.5. Suppose (1.50) is weakened to

(1.60)  $f''(x) \ge 0, \quad \forall\, x \in (a, b).$

Show that, in place of (1.52), one has

(1.61)  $f(s x_0 + (1 - s) x_1) \le s f(x_0) + (1 - s) f(x_1), \quad \forall\, s \in (0, 1).$


Hint. Consider $f_\varepsilon(x) = f(x) + \varepsilon x^2$.

8. The following is called the generalized mean value theorem. Let f and g be continuous
on [a, b] and differentiable on (a, b). Then there exists ξ ∈ (a, b) such that

$[f(b) - f(a)]\, g'(\xi) = [g(b) - g(a)]\, f'(\xi).$

Show that this follows from the mean value theorem, applied to

h(x) = [f (b) − f (a)]g(x) − [g(b) − g(a)]f (x).

9. Take f : [a, b] → [α, β] and g : [α, β] → [a, b] as in the setting of the Inverse Function
Theorem, Theorem 1.3. Write (1.24) as

(1.62)  $g'(y) = \frac{1}{f'(g(y))}, \quad y \in (\alpha, \beta).$

Show that

$f \in C^1((a, b)) \implies g \in C^1((\alpha, \beta)),$

i.e., the right side of (1.62) is continuous on (α, β). Show inductively that, for k ∈ N,

$f \in C^k((a, b)) \implies g \in C^k((\alpha, \beta)).$

Example. Show that if $f \in C^2((a, b))$, then (having shown that $g \in C^1$) the right side of
(1.62) is $C^1$ and hence

$g''(y) = -\frac{1}{f'(g(y))^2}\, f''(g(y))\, g'(y).$

10. Let I ⊂ R be an open interval and f : I → R differentiable. (Do not assume $f'$ is
continuous.) Assume a, b ∈ I, a < b, and

$f'(a) < u < f'(b).$

Show that there exists ξ ∈ (a, b) such that $f'(\xi) = u$.

Hint. Reduce to the case u = 0, so $f'(a) < 0 < f'(b)$. Show that then $f|_{[a,b]}$ has a
minimum at a point ξ ∈ (a, b).


2. The integral

In this section, we introduce the Riemann version of the integral, and relate it to the
derivative. We will define the Riemann integral of a bounded function over an interval
I = [a, b] on the real line. For now, we assume f is real valued. To start, we partition
I into smaller intervals. A partition P of I is a finite collection of subintervals
$\{J_k : 0 \le k \le N\}$, disjoint except for their endpoints, whose union is I. We can order the $J_k$ so that
$J_k = [x_k, x_{k+1}]$, where

(2.1)  $x_0 < x_1 < \cdots < x_N < x_{N+1}, \quad x_0 = a, \quad x_{N+1} = b.$

We call the points $x_k$ the endpoints of P. We set

(2.2)  $\ell(J_k) = x_{k+1} - x_k, \quad \operatorname{maxsize}(P) = \max_{0 \le k \le N} \ell(J_k).$

We then set

(2.3)  $\overline{I}_P(f) = \sum_k \sup_{J_k} f(x)\, \ell(J_k), \qquad \underline{I}_P(f) = \sum_k \inf_{J_k} f(x)\, \ell(J_k).$

Here,

$\sup_{J_k} f(x) = \sup f(J_k), \quad \inf_{J_k} f(x) = \inf f(J_k),$

and we recall that if S ⊂ R is bounded, sup S and inf S were defined in §6 of Chapter 1;
cf. (6.32) and (6.45). We call $\overline{I}_P(f)$ and $\underline{I}_P(f)$ respectively the upper sum and lower sum
of f, associated to the partition P. Note that $\underline{I}_P(f) \le \overline{I}_P(f)$. These quantities should
approximate the Riemann integral of f, if the partition P is sufficiently “fine.”
To be more precise, if P and Q are two partitions of I, we say P refines Q, and write
$P \succ Q$, if P is formed by partitioning each interval in Q. Equivalently, $P \succ Q$ if and only
if all the endpoints of Q are also endpoints of P. It is easy to see that any two partitions
have a common refinement; just take the union of their endpoints, to form a new partition.
Note also that refining a partition lowers the upper sum of f and raises its lower sum:

(2.4)  $P \succ Q \implies \overline{I}_P(f) \le \overline{I}_Q(f)$, and $\underline{I}_P(f) \ge \underline{I}_Q(f)$.

Consequently, if $P_1$ and $P_2$ are any two partitions and Q is a common refinement, we have

(2.5)  $\underline{I}_{P_1}(f) \le \underline{I}_Q(f) \le \overline{I}_Q(f) \le \overline{I}_{P_2}(f).$
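
The definitions (2.3) and the monotonicity (2.4) can be tried out on a simple example. The Python sketch below computes upper and lower sums for $f(x) = x^2$ on [0, 1] over uniform partitions with 4 and 8 subintervals, the second refining the first. It assumes f is monotone on each subinterval, so that the sup and inf over $J_k$ are attained at the endpoints; this assumption, the sample function, and the helper names are ours and are not part of the text.

```python
def uniform_partition(a, b, n):
    """Endpoints of the partition of [a, b] into n equal subintervals."""
    return [a + (b - a) * k / n for k in range(n + 1)]


def upper_lower_sums(f, points):
    """Upper and lower sums (2.3), assuming f is monotone on each subinterval.

    Under that assumption the sup and inf over [left, right] are the
    larger and smaller of the two endpoint values.
    """
    upper = lower = 0.0
    for left, right in zip(points, points[1:]):
        f_left, f_right = f(left), f(right)
        length = right - left
        upper += max(f_left, f_right) * length
        lower += min(f_left, f_right) * length
    return upper, lower


def square(x):
    return x * x


if __name__ == "__main__":
    Q = uniform_partition(0.0, 1.0, 4)
    P = uniform_partition(0.0, 1.0, 8)  # P refines Q
    print("Q:", upper_lower_sums(square, Q))
    print("P:", upper_lower_sums(square, P))
```

One expects the lower sum to increase and the upper sum to decrease on passing from Q to P, in line with (2.4).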


Now, whenever f : I → R is bounded, the following quantities are well defined:

(2.6)  $\overline{I}(f) = \inf_{P \in \Pi(I)} \overline{I}_P(f), \qquad \underline{I}(f) = \sup_{P \in \Pi(I)} \underline{I}_P(f),$

where Π(I) is the set of all partitions of I. We call $\underline{I}(f)$ the lower integral of f and $\overline{I}(f)$ its
upper integral. Clearly, by (2.5), $\underline{I}(f) \le \overline{I}(f)$. We then say that f is Riemann integrable
provided $\overline{I}(f) = \underline{I}(f)$, and in such a case, we set

(2.7)  $\int_a^b f(x)\, dx = \int_I f(x)\, dx = \overline{I}(f) = \underline{I}(f).$

We will denote the set of Riemann integrable functions on I by $\mathcal{R}(I)$.
We derive some basic properties of the Riemann integral.
Proposition 2.1. If $f, g \in \mathcal{R}(I)$, then $f + g \in \mathcal{R}(I)$, and

(2.8)  $\int_I (f + g)\, dx = \int_I f\, dx + \int_I g\, dx.$

Proof. If $J_k$ is any subinterval of I, then

$\sup_{J_k} (f + g) \le \sup_{J_k} f + \sup_{J_k} g$, and $\inf_{J_k} (f + g) \ge \inf_{J_k} f + \inf_{J_k} g,$

so, for any partition P, we have $\overline{I}_P(f + g) \le \overline{I}_P(f) + \overline{I}_P(g)$. Also, using common refinements,
we can simultaneously approximate $\overline{I}(f)$ and $\overline{I}(g)$ by $\overline{I}_P(f)$ and $\overline{I}_P(g)$, and ditto
for $\overline{I}(f + g)$. Thus the characterization (2.6) implies $\overline{I}(f + g) \le \overline{I}(f) + \overline{I}(g)$. A parallel
argument implies $\underline{I}(f + g) \ge \underline{I}(f) + \underline{I}(g)$, and the proposition follows.
Next, there is a fair supply of Riemann integrable functions.
Proposition 2.2. If f is continuous on I, then f is Riemann integrable.
Proof. Any continuous function on a compact interval is bounded and uniformly continuous
(see Propositions 1.1 and 1.3 of Chapter 3). Let ω(δ) be a modulus of continuity for f, so
(2.9) |x − y| ≤ δ =⇒ |f (x) − f (y)| ≤ ω(δ), ω(δ) → 0 as δ → 0.
Then
(2.10)  $\operatorname{maxsize}(P) \le \delta \implies \overline{I}_P(f) - \underline{I}_P(f) \le \omega(\delta) \cdot \ell(I),$
which yields the proposition.
We denote the set of continuous functions on I by C(I). Thus Proposition 2.2 says
C(I) ⊂ R(I).
The proof of Proposition 2.2 provides a criterion on a partition guaranteeing that $\overline{I}_P(f)$
and $\underline{I}_P(f)$ are close to $\int_I f\, dx$ when f is continuous. We produce an extension, giving a
condition under which $\overline{I}_P(f)$ and $\overline{I}(f)$ are close, and $\underline{I}_P(f)$ and $\underline{I}(f)$ are close, given f
bounded on I. Given a partition $P_0$ of I, set

(2.11)  $\operatorname{minsize}(P_0) = \min\{\ell(J_k) : J_k \in P_0\}.$


Lemma 2.3. Let P and Q be two partitions of I. Assume

(2.12)  $\operatorname{maxsize}(P) \le \frac{1}{k} \operatorname{minsize}(Q).$

Let $|f| \le M$ on I. Then

(2.13)  $\overline{I}_P(f) \le \overline{I}_Q(f) + \frac{2M}{k} \ell(I), \qquad \underline{I}_P(f) \ge \underline{I}_Q(f) - \frac{2M}{k} \ell(I).$

Proof. Let $P_1$ denote the minimal common refinement of P and Q. Consider on the one
hand those intervals in P that are contained in intervals in Q and on the other hand those
intervals in P that are not contained in intervals in Q. Each interval of the first type is also
an interval in $P_1$. Each interval of the second type gets partitioned, to yield two intervals
in $P_1$. Denote by $P_1^b$ the collection of such divided intervals. By (2.12), the lengths of the
intervals in $P_1^b$ sum to $\le \ell(I)/k$. It follows that

$|\overline{I}_P(f) - \overline{I}_{P_1}(f)| \le \sum_{J \in P_1^b} 2M\, \ell(J) \le 2M\, \frac{\ell(I)}{k},$

and similarly $|\underline{I}_P(f) - \underline{I}_{P_1}(f)| \le 2M \ell(I)/k$. Therefore

$\overline{I}_P(f) \le \overline{I}_{P_1}(f) + \frac{2M}{k} \ell(I), \qquad \underline{I}_P(f) \ge \underline{I}_{P_1}(f) - \frac{2M}{k} \ell(I).$

Since also $\overline{I}_{P_1}(f) \le \overline{I}_Q(f)$ and $\underline{I}_{P_1}(f) \ge \underline{I}_Q(f)$, we obtain (2.13).


The following consequence is sometimes called Darboux’s Theorem.
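Theorem 2.4. Let $P_\nu$ be a sequence of partitions of I into ν intervals $J_{\nu k}$, $1 \le k \le \nu$,
such that

$\operatorname{maxsize}(P_\nu) \to 0.$

If f : I → R is bounded, then

(2.14)  $\overline{I}_{P_\nu}(f) \to \overline{I}(f)$ and $\underline{I}_{P_\nu}(f) \to \underline{I}(f)$.

Consequently,

(2.15)  $f \in \mathcal{R}(I) \iff I(f) = \lim_{\nu \to \infty} \sum_{k=1}^{\nu} f(\xi_{\nu k})\, \ell(J_{\nu k}),$

for arbitrary $\xi_{\nu k} \in J_{\nu k}$, in which case the limit is $\int_I f\, dx$.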


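Proof. As before, assume $|f| \le M$. Pick ε > 0. Let Q be a partition such that

$\overline{I}(f) \le \overline{I}_Q(f) \le \overline{I}(f) + \varepsilon,$
$\underline{I}(f) \ge \underline{I}_Q(f) \ge \underline{I}(f) - \varepsilon.$

Now pick N such that

$\nu \ge N \implies \operatorname{maxsize} P_\nu \le \varepsilon \operatorname{minsize} Q.$

Lemma 2.3 yields, for ν ≥ N,

$\overline{I}_{P_\nu}(f) \le \overline{I}_Q(f) + 2M \ell(I) \varepsilon,$
$\underline{I}_{P_\nu}(f) \ge \underline{I}_Q(f) - 2M \ell(I) \varepsilon.$

Hence, for ν ≥ N,

$\overline{I}(f) \le \overline{I}_{P_\nu}(f) \le \overline{I}(f) + [2M \ell(I) + 1] \varepsilon,$
$\underline{I}(f) \ge \underline{I}_{P_\nu}(f) \ge \underline{I}(f) - [2M \ell(I) + 1] \varepsilon.$

This proves (2.14).

Remark. The sums on the right side of (2.15) are called Riemann sums, approximating
$\int_I f\, dx$ (when f is Riemann integrable).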
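
To illustrate that the choice of the points $\xi_{\nu k}$ in (2.15) does not affect the limit for an integrable function, the following Python sketch forms Riemann sums over equal subintervals with several tagging rules. The sample function $f(x) = x^3$ on [0, 2], whose integral is 4 (a value justified by Theorem 2.7 later in this section), and all function names are assumptions made for illustration.

```python
import random


def riemann_sum(f, a, b, n, choose_tag):
    """Sum of f(xi_k) * l(J_k) over n equal subintervals J_k of [a, b]."""
    width = (b - a) / n
    total = 0.0
    for k in range(n):
        left = a + k * width
        right = left + width
        total += f(choose_tag(left, right)) * width
    return total


def left_tag(left, right):
    return left


def right_tag(left, right):
    return right


def midpoint_tag(left, right):
    return 0.5 * (left + right)


def make_random_tag(seed=0):
    """Return a tagging rule picking a pseudo-random point of each subinterval."""
    rng = random.Random(seed)

    def tag(left, right):
        return rng.uniform(left, right)

    return tag


def cube(t):
    return t ** 3


if __name__ == "__main__":
    rules = [("left", left_tag), ("right", right_tag),
             ("midpoint", midpoint_tag), ("random", make_random_tag())]
    for name, rule in rules:
        print(name, riemann_sum(cube, 0.0, 2.0, 2000, rule))
```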

Remark. A second proof of Proposition 2.1 can readily be deduced from Theorem 2.4.

One should be warned that, once such a specific choice of Pν and ξνk has been made,
the limit on the right side of (2.15) might exist for a bounded function f that is not
Riemann integrable. This and other phenomena are illustrated by the following example
of a function which is not Riemann integrable. For x ∈ I, set

(2.16)  $\vartheta(x) = 1$ if $x \in \mathbb{Q}$, $\quad \vartheta(x) = 0$ if $x \notin \mathbb{Q}$,

where Q is the set of rational numbers. Now every interval J ⊂ I of positive length contains
points in Q and points not in Q, so for any partition P of I we have $\overline{I}_P(\vartheta) = \ell(I)$ and
$\underline{I}_P(\vartheta) = 0$, hence

(2.17)  $\overline{I}(\vartheta) = \ell(I), \quad \underline{I}(\vartheta) = 0.$

Note that, if $P_\nu$ is a partition of I into ν equal subintervals, then we could pick each $\xi_{\nu k}$ to
be rational, in which case the limit on the right side of (2.15) would be $\ell(I)$, or we could
pick each $\xi_{\nu k}$ to be irrational, in which case this limit would be zero. Alternatively, we
could pick half of them to be rational and half to be irrational, and the limit would be
$\frac{1}{2} \ell(I)$.


Associated to the Riemann integral is a notion of size of a set S, called content. If S is
a subset of I, define the “characteristic function”

(2.18)  $\chi_S(x) = 1$ if $x \in S$, $\quad 0$ if $x \notin S$.

We define “upper content” $\operatorname{cont}^+$ and “lower content” $\operatorname{cont}^-$ by

(2.19)  $\operatorname{cont}^+(S) = \overline{I}(\chi_S), \quad \operatorname{cont}^-(S) = \underline{I}(\chi_S).$

We say S “has content,” or “is contented” if these quantities are equal, which happens if
and only if $\chi_S \in \mathcal{R}(I)$, in which case the common value of $\operatorname{cont}^+(S)$ and $\operatorname{cont}^-(S)$ is

(2.20)  $m(S) = \int_I \chi_S(x)\, dx.$

It is easy to see that


(2.21)  $\operatorname{cont}^+(S) = \inf \Big\{ \sum_{k=1}^{N} \ell(J_k) : S \subset J_1 \cup \cdots \cup J_N \Big\},$

where $J_k$ are intervals. Here, we require S to be in the union of a finite collection of
intervals.
There is a more sophisticated notion of the size of a subset of I, called Lebesgue measure.
The key to the construction of Lebesgue measure is to cover a set S by a countable (either
finite or infinite) set of intervals. The outer measure of S ⊂ I is defined by

(2.22)  $m^*(S) = \inf \Big\{ \sum_{k \ge 1} \ell(J_k) : S \subset \bigcup_{k \ge 1} J_k \Big\}.$

Here {Jk } is a finite or countably infinite collection of intervals. Clearly

(2.23) m∗ (S) ≤ cont+ (S).

Note that, if S = I ∩ Q, then $\chi_S = \vartheta$, defined by (2.16). In this case it is easy to see that
$\operatorname{cont}^+(S) = \ell(I)$, but $m^*(S) = 0$. In fact, (2.22) readily yields the following:

(2.24) S countable =⇒ m∗ (S) = 0.

We point out that we can require the intervals Jk in (2.22) to be open. Consequently, since
each open cover of a compact set has a finite subcover,

(2.25) S compact =⇒ m∗ (S) = cont+ (S).

See the appendix at the end of this section for a generalization of Proposition 2.2, giving
a sufficient condition for a bounded function to be Riemann integrable on I, in terms of
the upper content of its set of discontinuities, in Proposition 2.11, and then, in Proposition
2.12, a refinement, replacing upper content by outer measure.
It is useful to note that $\int_I f\, dx$ is additive in I, in the following sense.

Proposition 2.5. If a < b < c, f : [a, c] → R, $f_1 = f|_{[a,b]}$, $f_2 = f|_{[b,c]}$, then

(2.26)  $f \in \mathcal{R}([a, c]) \iff f_1 \in \mathcal{R}([a, b])$ and $f_2 \in \mathcal{R}([b, c]),$

and, if this holds,

(2.27)  $\int_a^c f\, dx = \int_a^b f_1\, dx + \int_b^c f_2\, dx.$

Proof. Since any partition of [a, c] has a refinement for which b is an endpoint, we may as
well consider a partition $P = P_1 \cup P_2$, where $P_1$ is a partition of [a, b] and $P_2$ is a partition
of [b, c]. Then

(2.28)  $\overline{I}_P(f) = \overline{I}_{P_1}(f_1) + \overline{I}_{P_2}(f_2), \qquad \underline{I}_P(f) = \underline{I}_{P_1}(f_1) + \underline{I}_{P_2}(f_2),$

so

(2.29)  $\overline{I}_P(f) - \underline{I}_P(f) = \{ \overline{I}_{P_1}(f_1) - \underline{I}_{P_1}(f_1) \} + \{ \overline{I}_{P_2}(f_2) - \underline{I}_{P_2}(f_2) \}.$

Since both terms in braces in (2.29) are ≥ 0, we have equivalence in (2.26). Then (2.27)
follows from (2.28) upon taking finer and finer partitions, and passing to the limit.
Let I = [a, b]. If $f \in \mathcal{R}(I)$, then $f \in \mathcal{R}([a, x])$ for all x ∈ [a, b], and we can consider the
function

(2.30)  $g(x) = \int_a^x f(t)\, dt.$

If $a \le x_0 \le x_1 \le b$, then

(2.31)  $g(x_1) - g(x_0) = \int_{x_0}^{x_1} f(t)\, dt,$

so, if $|f| \le M$,

(2.32)  $|g(x_1) - g(x_0)| \le M |x_1 - x_0|.$

In other words, if $f \in \mathcal{R}(I)$, then g is Lipschitz continuous on I.

Recall from §1 that a function g : (a, b) → R is said to be differentiable at x ∈ (a, b)
provided there exists the limit

(2.33)  $\lim_{h \to 0} \frac{1}{h} \big[ g(x + h) - g(x) \big] = g'(x).$

When such a limit exists, $g'(x)$, also denoted dg/dx, is called the derivative of g at x.
Clearly g is continuous wherever it is differentiable.
The next result is part of the Fundamental Theorem of Calculus.


Theorem 2.6. If f ∈ C([a, b]), then the function g, defined by (2.30), is differentiable at
each point x ∈ (a, b), and

(2.34)  $g'(x) = f(x).$

Proof. Parallel to (2.31), we have, for h > 0,

(2.35)  $\frac{1}{h} \big[ g(x + h) - g(x) \big] = \frac{1}{h} \int_x^{x+h} f(t)\, dt.$

If f is continuous at x, then, for any ε > 0, there exists δ > 0 such that $|f(t) - f(x)| \le \varepsilon$
whenever $|t - x| \le \delta$. Thus the right side of (2.35) is within ε of f(x) whenever h ∈ (0, δ].
Thus the desired limit exists as $h \searrow 0$. A similar argument treats $h \nearrow 0$.
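
The key step in this proof, that the average on the right side of (2.35) approaches f(x), can be observed numerically. The Python sketch below approximates that average for f = cos at x = 1 using a midpoint rule; the midpoint rule, the sample function, and the helper names are assumptions made for illustration, and the quadrature is only an approximation to the integral.

```python
import math


def midpoint_integral(f, a, b, n=1000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    width = (b - a) / n
    return sum(f(a + (k + 0.5) * width) for k in range(n)) * width


def average_over(f, x, h, n=1000):
    """Approximate (1/h) * integral of f over [x, x + h], as in (2.35)."""
    return midpoint_integral(f, x, x + h, n) / h


if __name__ == "__main__":
    x = 1.0
    for h in (1e-1, 1e-2, 1e-3, 1e-4):
        avg = average_over(math.cos, x, h)
        print(f"h = {h:g}: average = {avg:.8f}, f(x) = {math.cos(x):.8f}")
```

The same helper accepts negative h, which corresponds to the case $h \nearrow 0$ mentioned at the end of the proof.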
The next result is the rest of the Fundamental Theorem of Calculus.
Theorem 2.7. If G is differentiable and $G'(x)$ is continuous on [a, b], then

(2.36)  $\int_a^b G'(t)\, dt = G(b) - G(a).$

Proof. Consider the function

(2.37)  $g(x) = \int_a^x G'(t)\, dt.$

We have g ∈ C([a, b]), g(a) = 0, and, by Theorem 2.6,

(2.38)  $g'(x) = G'(x), \quad \forall\, x \in (a, b).$

Thus f(x) = g(x) − G(x) is continuous on [a, b], and

(2.39)  $f'(x) = 0, \quad \forall\, x \in (a, b).$

We claim that (2.39) implies f is constant on [a, b]. Granted this, since f(a) = g(a) − G(a) =
−G(a), we have f(x) = −G(a) for all x ∈ [a, b], so the integral (2.37) is equal to G(x) − G(a)
for all x ∈ [a, b]. Taking x = b yields (2.36).
The fact that (2.39) implies f is constant on [a, b] is a consequence of the Mean Value
Theorem. This was established in §1; see Theorem 1.2. We repeat the statement here.
Theorem 2.8. Let f : [a, β] → R be continuous, and assume f is differentiable on (a, β).
Then ∃ ξ ∈ (a, β) such that

(2.40)  $f'(\xi) = \frac{f(\beta) - f(a)}{\beta - a}.$

Now, to see that (2.39) implies f is constant on [a, b], if not, ∃ β ∈ (a, b] such that
$f(\beta) \ne f(a)$. Then just apply Theorem 2.8 to f on [a, β]. This completes the proof of
Theorem 2.7.

We now extend Theorems 2.6–2.7 to the setting of Riemann integrable functions.


Proposition 2.9. Let f ∈ R([a, b]), and define g by (2.28). If x ∈ [a, b] and f is contin-
uous at x, then g is differentiable at x, and g 0 (x) = f (x).
The proof is identical to that of Theorem 2.6.
Proposition 2.10. Assume G is differentiable on [a, b] and G′ ∈ R([a, b]). Then (2.36) holds.
Proof. We have

(2.41)  $G(b) - G(a) = \sum_{k=0}^{n-1}\Bigl[G\Bigl(a + (b-a)\frac{k+1}{n}\Bigr) - G\Bigl(a + (b-a)\frac{k}{n}\Bigr)\Bigr] = \frac{b-a}{n}\sum_{k=0}^{n-1} G'(\xi_{kn}),$

for some $\xi_{kn}$ satisfying

(2.42)  $a + (b-a)\frac{k}{n} < \xi_{kn} < a + (b-a)\frac{k+1}{n},$

as a consequence of the Mean Value Theorem. Given G′ ∈ R([a, b]), Darboux's theorem (Theorem 2.4) implies that as n → ∞ one gets $G(b) - G(a) = \int_a^b G'(t)\,dt$.
Note that the beautiful symmetry in Theorems 2.6–2.7 is not preserved in Propositions
2.9–2.10. The hypothesis of Proposition 2.10 requires G to be differentiable at each x ∈
[a, b], but the conclusion of Proposition 2.9 does not yield differentiability at all points.
For this reason, we regard Propositions 2.9–2.10 as less “fundamental” than Theorems
2.6–2.7. There are more satisfactory extensions of the fundamental theorem of calculus,
involving the Lebesgue integral, and a more subtle notion of the “derivative” of a non-
smooth function. For this, we can point the reader to Chapters 10-11 of the text [T1],
Measure Theory and Integration.
So far, we have dealt with integration of real valued functions. If f : I → C, we set
f = f1 + if2 with fj : I → R and say f ∈ R(I) if and only if f1 and f2 are in R(I). Then
Z Z Z
(2.43) f dx = f1 dx + i f2 dx.
I I I

There are straightforward extensions of Propositions 2.5–2.10 to complex valued functions.


Similar comments apply to functions f : I → Rn .

Complementary results on Riemann integrability

Here we provide a condition, more general than Proposition 2.2, which guarantees Riemann integrability.


Proposition 2.11. Let f : I → R be a bounded function, with I = [a, b]. Suppose that the
set S of points of discontinuity of f has the property

(2.44)  $\operatorname{cont}^+(S) = 0.$

Then f ∈ R(I).
Proof. Say |f(x)| ≤ M. Take ε > 0. As in (2.21), take intervals $J_1, \dots, J_N$ such that $S \subset J_1 \cup \cdots \cup J_N$ and $\sum_{k=1}^N \ell(J_k) < \varepsilon$. In fact, fatten each $J_k$ such that S is contained in the interior of this collection of intervals. Consider a partition $P_0$ of I, whose intervals include $J_1, \dots, J_N$, amongst others, which we label $I_1, \dots, I_K$. Now f is continuous on each interval $I_\nu$, so, subdividing each $I_\nu$ as necessary, hence refining $P_0$ to a partition $P_1$, we arrange that sup f − inf f < ε on each such subdivided interval. Denote these subdivided intervals $I_1', \dots, I_L'$. It readily follows that

(2.45)  $0 \le \overline{I}_{P_1}(f) - \underline{I}_{P_1}(f) < 2M\sum_{k=1}^N \ell(J_k) + \varepsilon\sum_{k=1}^L \ell(I_k') < 2\varepsilon M + \varepsilon\,\ell(I).$

Since ε can be taken arbitrarily small, this establishes that f ∈ R(I).


With a little more effort, we can establish the following result, which, in light of (2.23),
is a bit sharper than Proposition 2.11.
Proposition 2.12. In the setting of Proposition 2.11, if we replace (2.44) by

(2.46)  $m^*(S) = 0,$

we still conclude that f ∈ R(I).


Proof. As before, we assume |f(x)| ≤ M and pick ε > 0. This time, take a countable collection of open intervals $\{J_k\}$ such that $S \subset \bigcup_{k\ge 1} J_k$ and $\sum_{k\ge 1} \ell(J_k) < \varepsilon$. Now f is continuous at each p ∈ I \ S, so there exists an interval $K_p$, open (in I), containing p, such that $\sup_{K_p} f - \inf_{K_p} f < \varepsilon$. Now $\{J_k : k \in \mathbb{N}\} \cup \{K_p : p \in I \setminus S\}$ is an open cover of I, so it has a finite subcover, which we denote $\{J_1, \dots, J_N, K_1, \dots, K_M\}$. We have

(2.47)  $\sum_{k=1}^N \ell(J_k) < \varepsilon, \quad\text{and}\quad \sup_{K_j} f - \inf_{K_j} f < \varepsilon, \quad \forall\, j \in \{1, \dots, M\}.$

Let P be the partition of I obtained by taking the union of all the endpoints of $J_k$ and $K_j$ in (2.47). Let us write

$P = \{L_k : 0 \le k \le \mu\} = \Bigl(\bigcup_{k\in A} L_k\Bigr) \cup \Bigl(\bigcup_{k\in B} L_k\Bigr),$


where we say k ∈ A provided $L_k$ is contained in an interval of the form $K_j$ for some j ∈ {1, . . . , M}, as in (2.47), and k ∈ B otherwise. Consequently, if k ∈ B, then $L_k \subset J_\ell$ for some ℓ ∈ {1, . . . , N}, so

(2.48)  $\bigcup_{k\in B} L_k \subset \bigcup_{\ell=1}^N J_\ell.$

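We therefore have

(2.49)  $\sum_{k\in B} \ell(L_k) < \varepsilon, \quad\text{and}\quad \sup_{L_j} f - \inf_{L_j} f < \varepsilon, \quad \forall\, j \in A.$

It follows that

(2.50)  $0 \le \overline{I}_P(f) - \underline{I}_P(f) < 2M\sum_{k\in B} \ell(L_k) + \varepsilon\sum_{j\in A} \ell(L_j) < 2\varepsilon M + \varepsilon\,\ell(I).$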

Since ε can be taken arbitrarily small, this establishes that f ∈ R(I).

Remark. Proposition 2.12 is part of the sharp result that a bounded function f on
I = [a, b] is Riemann integrable if and only if its set S of points of discontinuity satisfies
(2.46). Standard books on measure theory, including [Fol] and [T1], establish this.

We give an example of a function to which Proposition 2.11 applies, and then an example
for which Proposition 2.11 fails to apply, but Proposition 2.12 applies.

Example 1. Let I = [0, 1]. Define f : I → R by

(2.51)  $f(0) = 0, \qquad f(x) = (-1)^j \ \text{ for } x \in (2^{-(j+1)}, 2^{-j}],\ j \ge 0.$

Then |f | ≤ 1 and the set of points of discontinuity of f is

(2.52)  $S = \{0\} \cup \{2^{-j} : j \ge 1\}.$

It is easy to see that $\operatorname{cont}^+ S = 0$. Hence f ∈ R(I).
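The integrability of the function in (2.51) can also be checked by hand on dyadic partitions. The sketch below computes the gap between the upper and lower sums on the uniform partition of [0, 1] into N = 2^m intervals; the function names are illustrative assumptions. On each partition interval the oscillation sup f − inf f is either 0 or 2, and it is 2 exactly when the interval meets two different pieces (2^{-(j+1)}, 2^{-j}].

```python
import math


def level(x):
    """Index j with x in (2^-(j+1), 2^-j], for 0 < x <= 1 (exact for floats)."""
    mantissa, exponent = math.frexp(x)  # x = mantissa * 2**exponent, 0.5 <= mantissa < 1
    return 1 - exponent if mantissa == 0.5 else -exponent


def oscillation_gap(m):
    """Upper sum minus lower sum of f from (2.51) on the partition of [0, 1] into 2^m intervals."""
    n = 2 ** m
    gap = 2.0 / n  # [0, 1/n] contains points where f = +1 and points where f = -1
    for k in range(1, n):
        left, right = k / n, (k + 1) / n
        if level(left) != level(right):  # interval straddles a jump of f
            gap += 2.0 / n
    return gap


for m in (4, 8, 12):
    print(m, oscillation_gap(m))
```

Only m + 1 of the 2^m intervals contribute, so the gap is 2(m + 1)/2^m, which tends to 0, in line with the conclusion f ∈ R(I).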

See Exercises 16-17 below for a more elaborate example to which Proposition 2.11 applies.

Example 2. Again I = [0, 1]. Define f : I → R by

(2.53)  $f(x) = 0 \ \text{ if } x \notin \mathbb{Q}, \qquad f(x) = \frac{1}{n} \ \text{ if } x = \frac{m}{n}, \text{ in lowest terms.}$


Then |f | ≤ 1 and the set of points of discontinuity of f is

(2.54) S = I ∩ Q.

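As we have seen below (2.23), $\operatorname{cont}^+ S = 1$, so Proposition 2.11 does not apply. Nevertheless, it is fairly easy to see directly that

(2.55)  $\overline{I}(f) = \underline{I}(f) = 0$, so f ∈ R(I).

In fact, f ≥ 0 gives $\underline{I}(f) \ge 0$, and, given ε > 0, f ≥ ε only on a finite set, hence

(2.56)  $\overline{I}(f) \le \varepsilon, \quad \forall\, \varepsilon > 0.$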

As indicated below (2.23), (2.46) does apply to this function, so Proposition 2.12 applies.
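The finiteness claim behind (2.56) can be made concrete. The following sketch, with hypothetical helper names and exact rational arithmetic, lists the points of [0, 1] where the function in (2.53) is at least ε, and then bounds the upper sum over a uniform partition: each exceptional point meets at most two partition intervals, where sup f ≤ 1, while sup f < ε on all other intervals. Pass ε as a `Fraction` so that 1/ε is computed exactly.

```python
import math
from fractions import Fraction


def large_value_points(eps):
    """Points x in [0, 1] with f(x) >= eps: fractions m/n (lowest terms) with 1/n >= eps."""
    max_denominator = math.floor(1 / Fraction(eps))
    # the set removes duplicates such as 1/2 = 2/4
    return {Fraction(m, n) for n in range(1, max_denominator + 1) for m in range(n + 1)}


def upper_sum_bound(eps, num_intervals):
    """Upper bound for the upper sum on the uniform partition into num_intervals pieces."""
    exceptional = len(large_value_points(eps))
    return Fraction(eps) + Fraction(2 * exceptional, num_intervals)


eps = Fraction(1, 10)
print(len(large_value_points(eps)), float(upper_sum_bound(eps, 10**4)))
```

Letting the number of intervals grow with ε fixed drives the bound down to ε, which is the content of (2.56).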
Example 2 is illustrative of the following general phenomenon, which is worth recording.
Corollary 2.13. If f : I → R is bounded and its set S of points of discontinuity is
countable, then f ∈ R(I).
Proof. By virtue of (2.24), Proposition 2.12 applies.
Here is another useful sufficient condition for Riemann integrability.
Proposition 2.14. If f : I → R is bounded and monotone, then f ∈ R(I).
Proof. It suffices to consider the case that f is monotone increasing. Let $P_N = \{J_k : 1 \le k \le N\}$ be the partition of I into N intervals of equal length. Note that $\sup_{J_k} f \le \inf_{J_{k+1}} f$. Hence

(2.57)  $\overline{I}_{P_N}(f) \le \sum_{k=1}^{N-1} \bigl(\inf_{J_{k+1}} f\bigr)\ell(J_k) + \bigl(\sup_{J_N} f\bigr)\ell(J_N) \le \underline{I}_{P_N}(f) + 2M\,\frac{\ell(I)}{N},$

if |f| ≤ M. Taking N → ∞, we deduce from Theorem 2.4 that $\overline{I}(f) \le \underline{I}(f)$, which proves f ∈ R(I).
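The estimate (2.57) is easy to observe numerically. In the sketch below (function names and the test function are illustrative assumptions), f is increasing, so on each interval of a uniform partition the supremum is the value at the right endpoint and the infimum is the value at the left endpoint; the difference of the two sums telescopes to (f(b) − f(a))(b − a)/N.

```python
import math


def darboux_sums_increasing(f, a, b, n):
    """Upper and lower sums of an increasing f on the uniform partition of [a, b] into n intervals."""
    width = (b - a) / n
    values = [f(a + (b - a) * k / n) for k in range(n + 1)]
    upper = width * sum(values[1:])   # right endpoint values are the suprema
    lower = width * sum(values[:-1])  # left endpoint values are the infima
    return upper, lower


def step_plus_square(x):
    # bounded and increasing on [0, 1], with jumps at 1/3, 2/3 and 1 (a hypothetical example)
    return math.floor(3 * x) + x * x


upper, lower = darboux_sums_increasing(step_plus_square, 0.0, 1.0, 1000)
print(upper - lower)  # close to (f(1) - f(0)) / 1000
```

Here |f| ≤ 4, so the bound 2Mℓ(I)/N in (2.57) equals 8/N, while the actual gap is 4/N.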

Remark. It can be shown that if f is monotone, then its set of points of discontinuity is
countable. Given this, Proposition 2.14 is also a consequence of Corollary 2.13.

By contrast, the function ϑ in (2.16) is discontinuous at each point of I.

We mention an alternative characterization of $\overline{I}(f)$ and $\underline{I}(f)$, which can be useful. Given
I = [a, b], we say g : I → R is piecewise constant on I (and write g ∈ PK(I)) provided there


exists a partition P = {Jk } of I such that g is constant on the interior of each interval Jk .
Clearly PK(I) ⊂ R(I). It is easy to see that, if f : I → R is bounded,
(2.58)  $\overline{I}(f) = \inf\Bigl\{\int_I f_1\,dx : f_1 \in PK(I),\ f_1 \ge f\Bigr\}, \qquad \underline{I}(f) = \sup\Bigl\{\int_I f_0\,dx : f_0 \in PK(I),\ f_0 \le f\Bigr\}.$

Hence, given f : I → R bounded,


(2.59)  $f \in R(I) \iff \text{for each } \varepsilon > 0,\ \exists\, f_0, f_1 \in PK(I) \text{ such that } f_0 \le f \le f_1 \text{ and } \int_I (f_1 - f_0)\,dx < \varepsilon.$

This can be used to prove

(2.60) f, g ∈ R(I) =⇒ f g ∈ R(I),

via the fact that

(2.61) fj , gj ∈ PK(I) =⇒ fj gj ∈ PK(I).

In fact, we have the following, which can be used to prove (2.60).


Proposition 2.15. Let f ∈ R(I), and assume |f | ≤ M . Let ϕ : [−M, M ] → R be
continuous. Then ϕ ◦ f ∈ R(I).
Proof. We proceed in steps.

Step 1. We can obtain ϕ as a uniform limit on [−M, M] of a sequence $\varphi_\nu$ of continuous, piecewise linear functions. Then $\varphi_\nu \circ f \to \varphi \circ f$ uniformly on I. A uniform limit g of functions $g_\nu \in R(I)$ is in R(I) (see Exercise 9). So it suffices to prove Proposition 2.15 when ϕ is continuous and piecewise linear.

Step 2. Given ϕ : [−M, M] → R continuous and piecewise linear, it is an exercise to write $\varphi = \varphi_1 - \varphi_2$, with $\varphi_j$ : [−M, M] → R monotone, continuous, and piecewise linear. Now $\varphi_1 \circ f,\ \varphi_2 \circ f \in R(I) \Rightarrow \varphi \circ f \in R(I)$.

Step 3. We now demonstrate Proposition 2.15 when ϕ : [−M, M] → R is monotone and Lipschitz. By Steps 1–2, this will suffice, since a continuous, piecewise linear function on [−M, M] is Lipschitz. So we assume

−M ≤ x1 < x2 ≤ M =⇒ ϕ(x1 ) ≤ ϕ(x2 ) and ϕ(x2 ) − ϕ(x1 ) ≤ L(x2 − x1 ),

for some L < ∞. Given ε > 0, pick f0 , f1 ∈ PK(I), as in (2.59). Then

ϕ ◦ f0 , ϕ ◦ f1 ∈ PK(I), ϕ ◦ f0 ≤ ϕ ◦ f ≤ ϕ ◦ f1 ,


and

$\int_I (\varphi \circ f_1 - \varphi \circ f_0)\,dx \le L \int_I (f_1 - f_0)\,dx \le L\varepsilon.$

This proves ϕ ◦ f ∈ R(I).

Exercises

1. Let c > 0 and let f : [ac, bc] → R be Riemann integrable. Working directly with the
definition of integral, show that
(2.62)  $\int_a^b f(cx)\,dx = \frac{1}{c}\int_{ac}^{bc} f(x)\,dx.$

More generally, show that


(2.63)  $\int_{a-d/c}^{b-d/c} f(cx + d)\,dx = \frac{1}{c}\int_{ac}^{bc} f(x)\,dx.$

2. Let f : I × S → R be continuous, where I = [a, b] and $S \subset \mathbb{R}^n$. Take $\varphi(y) = \int_I f(x, y)\,dx$. Show that ϕ is continuous on S.
Hint. If $f_j$ : I → R are continuous and $|f_1(x) - f_2(x)| \le \delta$ on I, then

(2.64)  $\Bigl|\int_I f_1\,dx - \int_I f_2\,dx\Bigr| \le \ell(I)\,\delta.$

3. With f as in Exercise 2, suppose $g_j$ : S → R are continuous and $a \le g_0(y) < g_1(y) \le b$. Take $\varphi(y) = \int_{g_0(y)}^{g_1(y)} f(x, y)\,dx$. Show that ϕ is continuous on S.
Hint. Make a change of variables, linear in x, to reduce this to Exercise 2.

4. Let ϕ : [a, b] → [A, B] be $C^1$ on a neighborhood J of [a, b], with ϕ′(x) > 0 for all x ∈ [a, b]. Assume ϕ(a) = A, ϕ(b) = B. Show that the identity

(2.65)  $\int_A^B f(y)\,dy = \int_a^b f\bigl(\varphi(t)\bigr)\varphi'(t)\,dt,$

for any f ∈ C([A, B]), follows from the chain rule and the Fundamental Theorem of
Calculus.
Hint. Replace b by x, B by ϕ(x), and differentiate.


4A. Show that (2.65) holds for each f ∈ PK([A, B]). Using (2.58)–(2.59), show that
f ∈ R([A, B]) ⇒ f ◦ ϕ ∈ R([a, b]) and (2.65) holds. (This result contains that of Exercise
1.)

5. Show that, if f and g are $C^1$ on a neighborhood of [a, b], then

(2.66)  $\int_a^b f(s)g'(s)\,ds = -\int_a^b f'(s)g(s)\,ds + \bigl[f(b)g(b) - f(a)g(a)\bigr].$

This transformation of integrals is called “integration by parts.”

6. Let f : (−a, a) → R be a $C^{j+1}$ function. Show that, for x ∈ (−a, a),

(2.67)  $f(x) = f(0) + f'(0)x + \frac{f''(0)}{2}x^2 + \cdots + \frac{f^{(j)}(0)}{j!}x^j + R_j(x),$

where

(2.68)  $R_j(x) = \int_0^x \frac{(x-s)^j}{j!} f^{(j+1)}(s)\,ds.$

This is Taylor's formula with remainder.


Hint. Use induction. If (2.67)–(2.68) holds for 0 ≤ j ≤ k, show that it holds for j = k + 1, by showing that

(2.69)  $\int_0^x \frac{(x-s)^k}{k!} f^{(k+1)}(s)\,ds = \frac{f^{(k+1)}(0)}{(k+1)!}x^{k+1} + \int_0^x \frac{(x-s)^{k+1}}{(k+1)!} f^{(k+2)}(s)\,ds.$

To establish this, use the integration by parts formula (2.66), with f(s) replaced by $f^{(k+1)}(s)$, and with appropriate g(s). See §3 for another approach. Note that another presentation of (2.68) is

(2.70)  $R_j(x) = \frac{x^{j+1}}{(j+1)!}\int_0^1 f^{(j+1)}\Bigl(\bigl(1 - t^{1/(j+1)}\bigr)x\Bigr)\,dt.$

7. Assume f : (−a, a) → R is a $C^j$ function. Show that, for x ∈ (−a, a), (2.67) holds, with

(2.71)  $R_j(x) = \frac{1}{(j-1)!}\int_0^x (x-s)^{j-1}\bigl[f^{(j)}(s) - f^{(j)}(0)\bigr]\,ds.$

Hint. Apply (2.68) with j replaced by j − 1. Add and subtract $f^{(j)}(0)$ to the factor $f^{(j)}(s)$ in the resulting integrand.


8. Given I = [a, b], show that

(2.72) f, g ∈ R(I) =⇒ f g ∈ R(I),

as advertised in (2.60).

9. Assume fk ∈ R(I) and fk → f uniformly on I. Prove that f ∈ R(I) and


(2.73)  $\int_I f_k\,dx \longrightarrow \int_I f\,dx.$

10. Given I = [a, b], Iε = [a + ε, b − ε], assume fk ∈ R(I), |fk | ≤ M on I for all k, and

(2.74) fk −→ f uniformly on Iε ,

for all ε ∈ (0, (b − a)/2). Prove that f ∈ R(I) and (2.73) holds.

11. Use the fundamental theorem of calculus and results of §1 to compute


(2.75)  $\int_a^b x^r\,dx, \quad r \in \mathbb{Q} \setminus \{-1\},$

where −∞ < a < b < ∞ if r ≥ 0 and 0 < a < b < ∞ if r < 0. See §5 for (2.75) with
r = −1.

12. Use the change of variable result of Exercise 4 to compute


(2.76)  $\int_0^1 x\sqrt{1 + x^2}\,dx.$

13. We say f ∈ R(R) provided $f|_{[k,k+1]} \in R([k, k+1])$ for each k ∈ Z, and

(2.77)  $\sum_{k=-\infty}^{\infty} \int_k^{k+1} |f(x)|\,dx < \infty.$

If f ∈ R(R), we set

(2.78)  $\int_{-\infty}^{\infty} f(x)\,dx = \lim_{k\to\infty} \int_{-k}^{k} f(x)\,dx.$

Formulate and demonstrate basic properties of the integral over R of elements of R(R).


14. This exercise discusses the integral test for absolute convergence of an infinite series,
which goes as follows. Let f be a positive, monotonically decreasing, continuous function
on [0, ∞), and suppose |ak | = f (k). Then
$\sum_{k=0}^{\infty} |a_k| < \infty \iff \int_0^{\infty} f(x)\,dx < \infty.$

Prove this.
Hint. Use

$\sum_{k=1}^{N} |a_k| \le \int_0^N f(x)\,dx \le \sum_{k=0}^{N-1} |a_k|.$

15. Use the integral test to show that, if p > 0,

$\sum_{k=1}^{\infty} \frac{1}{k^p} < \infty \iff p > 1.$

Note. Compare Exercise 7 in §6 of Chapter 1. (For now, $p \in \mathbb{Q}^+$. Results of §5 allow one to take $p \in \mathbb{R}^+$.)
Hint. Use Exercise 11 to evaluate $I_N(p) = \int_1^N x^{-p}\,dx$, for p ≠ 1, and let N → ∞. See if you can show $\int_1^{\infty} x^{-1}\,dx = \infty$ without knowing about log N.
Subhint. Show that $\int_1^2 x^{-1}\,dx = \int_N^{2N} x^{-1}\,dx$.

In Exercises 16–17, C ⊂ [a, b] is the Cantor set introduced in the exercises for §9 of Chapter 1. As in (9.21) of Chapter 1, $C = \bigcap_{j\ge 0} C_j$.

16. Show that $\operatorname{cont}^+ C_j = (2/3)^j (b - a)$, and conclude that

$\operatorname{cont}^+ C = 0.$

17. Define f : [a, b] → R as follows. We call an interval of length $3^{-j}(b - a)$, omitted in passing from $C_{j-1}$ to $C_j$, a “j-interval.” Set

$f(x) = 0$ if x ∈ C, and $f(x) = (-1)^j$ if x belongs to a j-interval.

Show that the set of discontinuities of f is C. Hence Proposition 2.11 implies f ∈ R([a, b]).

18. Let fk ∈ R([a, b]) and f : [a, b] → R satisfy the following conditions.
(a) |fk | ≤ M < ∞, ∀ k,
(b) fk (x) −→ f (x), ∀ x ∈ [a, b],
(c) Given ε > 0, there exists Sε ⊂ [a, b] such that
cont+ Sε < ε, and fk → f uniformly on [a, b] \ Sε .


Show that f ∈ R([a, b]) and


$\int_a^b f_k(x)\,dx \longrightarrow \int_a^b f(x)\,dx, \quad \text{as } k \to \infty.$

Remark. In the Lebesgue theory of integration, there is a stronger result, known as the
Lebesgue dominated convergence theorem. See Exercises 12–14 in §6 for more on this.

19. Recall that one ingredient in the proof of Theorem 2.7 was that if f : (a, b) → R, then

(2.79)  $f'(x) = 0$ for all x ∈ (a, b) =⇒ f is constant on (a, b).

Consider the following approach to proving (2.79), which avoids use of the Mean Value
Theorem.
(a) Assume $a < x_0 < y_0 < b$ and $f(x_0) \ne f(y_0)$. Say $f(y_0) = f(x_0) + A(y_0 - x_0)$, and we may as well assume A > 0.
(b) Divide $I_0 = [x_0, y_0]$ into two equal intervals, $I_0^\ell$ and $I_0^r$, meeting at the midpoint $\xi_0 = (x_0 + y_0)/2$. Show that either

$f(\xi_0) \ge f(x_0) + A(\xi_0 - x_0)$ or $f(y_0) \ge f(\xi_0) + A(y_0 - \xi_0)$.

Set $I_1 = I_0^\ell$ if the former holds; otherwise, set $I_1 = I_0^r$. Say $I_1 = [x_1, y_1]$.
(c) Inductively, having $I_k = [x_k, y_k]$, of length $2^{-k}(y_0 - x_0)$, divide it into two equal intervals, $I_k^\ell$ and $I_k^r$, meeting at the midpoint $\xi_k = (x_k + y_k)/2$. Show that either

$f(\xi_k) \ge f(x_k) + A(\xi_k - x_k)$ or $f(y_k) \ge f(\xi_k) + A(y_k - \xi_k)$.

Set $I_{k+1} = I_k^\ell$ if the former holds; otherwise set $I_{k+1} = I_k^r$.
(d) Show that

$x_k \nearrow x, \quad y_k \searrow x, \quad x \in [x_0, y_0],$

and that, if f is differentiable at x, then f′(x) ≥ A. Note that this contradicts the hypothesis that f′(x) = 0 for all x ∈ (a, b).


3. Power series

In §3 of Chapter 3 we introduced power series, of the form



(3.1)  $f(z) = \sum_{k=0}^{\infty} a_k (z - z_0)^k,$

with ak ∈ C, and established the following.


Proposition 3.1. If the series (3.1) converges for some $z_1 \ne z_0$, then either this series
is absolutely convergent for all z ∈ C or there is some R ∈ (0, ∞) such that the series is
absolutely convergent for |z − z0 | < R and divergent for |z − z0 | > R. The series converges
uniformly on

(3.2)  $D_S(z_0) = \{z \in \mathbb{C} : |z - z_0| < S\},$

for each S < R, and f is continuous on $D_R(z_0)$.


We now restrict attention to cases where z0 ∈ R and z = t ∈ R, and apply calculus to
the study of such power series. We emphasize that we still allow the coefficients ak to be
complex numbers.
Proposition 3.2. Assume ak ∈ C and

(3.3)  $f(t) = \sum_{k=0}^{\infty} a_k t^k$

converges for real t satisfying |t| < R. Then f is differentiable on the interval −R < t < R, and

(3.4)  $f'(t) = \sum_{k=1}^{\infty} k a_k t^{k-1},$

the latter series being absolutely convergent for |t| < R.


We first check absolute convergence of the series (3.4). Let S < T < R. Convergence of
(3.3) implies there exists C < ∞ such that

(3.5)  $|a_k| T^k \le C, \quad \forall\, k.$

Hence, if |t| ≤ S,

(3.6)  $|k a_k t^{k-1}| \le \frac{C}{S}\, k \Bigl(\frac{S}{T}\Bigr)^k,$


which readily yields absolute convergence. (See Exercise 1 below.) Hence



(3.7)  $g(t) = \sum_{k=1}^{\infty} k a_k t^{k-1}$

is continuous on (−R, R). To show that f′(t) = g(t), by the fundamental theorem of calculus, it is equivalent to show

(3.8)  $\int_0^t g(s)\,ds = f(t) - f(0).$

The following result implies this.


Proposition 3.3. Assume bk ∈ C and

(3.9)  $g(t) = \sum_{k=0}^{\infty} b_k t^k$

converges for real t, satisfying |t| < R. Then, for |t| < R,

(3.10)  $\int_0^t g(s)\,ds = \sum_{k=0}^{\infty} \frac{b_k}{k+1}\, t^{k+1},$

the series being absolutely convergent for |t| < R.


Proof. Since, for |t| < R,

(3.11)  $\Bigl|\frac{b_k}{k+1}\, t^{k+1}\Bigr| \le R\,|b_k t^k|,$

convergence of the series in (3.10) is clear. Next, write

(3.12)  $g(t) = S_N(t) + R_N(t), \qquad S_N(t) = \sum_{k=0}^{N} b_k t^k, \quad R_N(t) = \sum_{k=N+1}^{\infty} b_k t^k.$

As in the proof of Proposition 3.2 in Chapter 3, pick S < T < R. There exists C < ∞ such that $|b_k T^k| \le C$ for all k. Hence

(3.13)  $|t| \le S \Rightarrow |R_N(t)| \le C \sum_{k=N+1}^{\infty} \Bigl(\frac{S}{T}\Bigr)^k = C\varepsilon_N \to 0, \quad \text{as } N \to \infty,$

so

(3.14)  $\int_0^t g(s)\,ds = \sum_{k=0}^{N} \frac{b_k}{k+1}\, t^{k+1} + \int_0^t R_N(s)\,ds,$


and, for |t| ≤ S,


(3.15)  $\Bigl|\int_0^t R_N(s)\,ds\Bigr| \le \int_0^t |R_N(s)|\,ds \le CR\,\varepsilon_N.$

This gives (3.10).


Second proof of Proposition 3.2. As shown in Proposition 3.7 of Chapter 3, if $|t_1| < R$, then f(t) has a convergent power series about $t_1$:

(3.16)  $f(t) = \sum_{k=0}^{\infty} b_k (t - t_1)^k, \quad \text{for } |t - t_1| < R - |t_1|,$

with

(3.17)  $b_1 = \sum_{n=1}^{\infty} n a_n t_1^{n-1}.$

This clearly implies f is differentiable at $t_1$, and $f'(t_1)$ is given by (3.17).

Remark. The definition of (3.10) for t < 0 follows standard convention. More generally,
if a < b and g ∈ R([a, b]), then
$\int_b^a g(s)\,ds = -\int_a^b g(s)\,ds.$

More generally, if we have a power series about t0 ,



(3.18)  $f(t) = \sum_{k=0}^{\infty} a_k (t - t_0)^k, \quad \text{for } |t - t_0| < R,$

then f is differentiable for $|t - t_0| < R$ and

(3.19)  $f'(t) = \sum_{k=1}^{\infty} k a_k (t - t_0)^{k-1}.$

We can then differentiate this power series, and inductively obtain

(3.20)  $f^{(n)}(t) = \sum_{k=n}^{\infty} k(k-1)\cdots(k-n+1)\, a_k (t - t_0)^{k-n}.$


In particular,

(3.21)  $f^{(n)}(t_0) = n!\, a_n.$

We can turn (3.21) around and write

(3.22)  $a_n = \frac{f^{(n)}(t_0)}{n!}.$

This suggests the following method of taking a given function and deriving a power series representation. Namely, if we can, we compute $f^{(k)}(t_0)$ and propose that

(3.23)  $f(t) = \sum_{k=0}^{\infty} \frac{f^{(k)}(t_0)}{k!} (t - t_0)^k,$

at least on some interval about t0 . To take an example, consider

(3.24)  $f(t) = (1 - t)^{-r},$

with r ∈ Q (but −r ∉ N), and take $t_0 = 0$. (Results of §5 will allow us to extend this analysis to r ∈ R.) Using (1.36), we get

(3.25)  $f'(t) = r(1 - t)^{-(r+1)},$

for t < 1. Inductively, for k ∈ N,

(3.26)  $f^{(k)}(t) = \Bigl[\prod_{\ell=0}^{k-1} (r + \ell)\Bigr] (1 - t)^{-(r+k)}.$

Hence, for k ≥ 1,

(3.27)  $f^{(k)}(0) = \prod_{\ell=0}^{k-1} (r + \ell) = r(r+1)\cdots(r+k-1).$

Consequently, we propose that



(3.28)  $(1 - t)^{-r} = \sum_{k=0}^{\infty} \frac{a_k}{k!}\, t^k, \quad |t| < 1,$

with

(3.29)  $a_0 = 1, \quad a_k = \prod_{\ell=0}^{k-1} (r + \ell), \quad \text{for } k \ge 1.$


We can verify convergence of the right side of (3.28) by using the ratio test:
(3.30)  $\Bigl|\frac{a_{k+1} t^{k+1}/(k+1)!}{a_k t^k / k!}\Bigr| = \frac{|k + r|}{k + 1}\,|t|.$

This computation implies that the power series on the right side of (3.28) is absolutely convergent for |t| < 1, yielding a function

(3.31)  $g(t) = \sum_{k=0}^{\infty} \frac{a_k}{k!}\, t^k, \quad |t| < 1.$

It remains to establish that $g(t) = (1 - t)^{-r}$.


We take up this task, on a more general level. Establishing that the series

(3.32)  $\sum_{k=0}^{\infty} \frac{f^{(k)}(t_0)}{k!} (t - t_0)^k$

converges to f(t) is equivalent to examining the remainder $R_n(t, t_0)$ in the finite expansion

(3.33)  $f(t) = \sum_{k=0}^{n} \frac{f^{(k)}(t_0)}{k!} (t - t_0)^k + R_n(t, t_0).$

The series (3.32) converges to f (t) if and only if Rn (t, t0 ) → 0 as n → ∞. To see when this
happens, we need a compact formula for the remainder Rn , which we proceed to derive.
It seems to clarify matters if we switch notation a bit, and write

(3.34)  $f(x) = f(y) + f'(y)(x - y) + \cdots + \frac{f^{(n)}(y)}{n!}(x - y)^n + R_n(x, y).$

We now take the y-derivative of each side of (3.34). The y-derivative of the left side is 0, and when we apply ∂/∂y to the right side, we observe an enormous amount of cancellation. There results the identity

(3.35)  $\frac{\partial R_n}{\partial y}(x, y) = -\frac{1}{n!}\, f^{(n+1)}(y)(x - y)^n.$

Also,

(3.36)  $R_n(x, x) = 0.$

If we concentrate on $R_n(x, y)$ as a function of y and look at the difference quotient $[R_n(x, y) - R_n(x, x)]/(y - x)$, an immediate consequence of the mean value theorem is that, if f is real valued,

(3.37)  $R_n(x, y) = \frac{1}{n!}(x - y)(x - \xi_n)^n f^{(n+1)}(\xi_n),$


for some $\xi_n$ between x and y. This is known as Cauchy's formula for the remainder. If $f^{(n+1)}$ is continuous, we can apply the fundamental theorem of calculus to (3.35)–(3.36), and obtain the integral formula

(3.38)  $R_n(x, y) = \frac{1}{n!}\int_y^x (x - s)^n f^{(n+1)}(s)\,ds.$

This works regardless of whether f is real valued. Another derivation of (3.38) arose in
the exercise set for §2. The change of variable x − s = t(x − y) gives the integral formula
(3.39)  $R_n(x, y) = \frac{1}{n!}(x - y)^{n+1}\int_0^1 t^n f^{(n+1)}\bigl(ty + (1 - t)x\bigr)\,dt.$

If we think of this integral as 1/(n + 1) times a weighted mean of f (n+1) , we get the
Lagrange formula for the remainder,

(3.40)  $R_n(x, y) = \frac{1}{(n+1)!}(x - y)^{n+1} f^{(n+1)}(\zeta_n),$

for some ζn between x and y, provided f is real valued. The Lagrange formula is shorter
and neater than the Cauchy formula, but the Cauchy formula is actually more powerful.
The calculations in (3.43)–(3.54) below will illustrate this.
Note that, if I(x, y) denotes the interval with endpoints x and y (e.g., (x, y) if x < y),
then (3.38) implies

(3.41)  $|R_n(x, y)| \le \frac{|x - y|}{n!} \sup_{\xi \in I(x,y)} \bigl|(x - \xi)^n f^{(n+1)}(\xi)\bigr|,$

while (3.39) implies

(3.42)  $|R_n(x, y)| \le \frac{|x - y|^{n+1}}{(n+1)!} \sup_{\xi \in I(x,y)} |f^{(n+1)}(\xi)|.$

In case f is real valued, (3.41) also follows from the Cauchy formula (3.37) and (3.42)
follows from the Lagrange formula (3.40).
Let us apply these estimates with f as in (3.24), i.e.,

(3.43)  $f(x) = (1 - x)^{-r},$

and y = 0. By (3.26),

(3.44)  $f^{(n+1)}(\xi) = a_{n+1}(1 - \xi)^{-(r+n+1)}, \qquad a_{n+1} = \prod_{\ell=0}^{n} (r + \ell).$


Consequently,

(3.45)  $\frac{f^{(n+1)}(\xi)}{n!} = b_n (1 - \xi)^{-(r+n+1)}, \qquad b_n = \frac{a_{n+1}}{n!}.$

Note that

(3.46)  $\frac{b_{n+1}}{b_n} = \frac{n + 1 + r}{n + 1} \to 1, \quad \text{as } n \to \infty.$

Let us first investigate the estimate of Rn (x, 0) given by (3.42) (as in the Lagrange
formula), and see how it leads to a suboptimal conclusion. (The impatient reader might
skip (3.47)–(3.50) and go to (3.51).) By (3.45), if n is sufficiently large that r + n + 1 > 0,

(3.47)  $\sup_{\xi \in I(x,0)} \frac{|f^{(n+1)}(\xi)|}{(n+1)!} = \begin{cases} \dfrac{|b_n|}{n+1} & \text{if } -1 \le x \le 0, \\[1ex] \dfrac{|b_n|}{n+1}(1 - x)^{-(r+n+1)} & \text{if } 0 \le x < 1. \end{cases}$

Thus (3.42) implies

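(3.48)  $|R_n(x, 0)| \le \begin{cases} \dfrac{|b_n|}{n+1}\,|x|^{n+1} & \text{if } -1 \le x \le 0, \\[1ex] \dfrac{|b_n|}{n+1}\,\dfrac{1}{(1 - x)^r}\Bigl(\dfrac{x}{1 - x}\Bigr)^{n+1} & \text{if } 0 \le x < 1. \end{cases}$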

Note that, by (3.46),

$c_n = \frac{|b_n|}{n+1} \Longrightarrow \frac{c_{n+1}}{c_n} = \frac{|b_{n+1}|}{|b_n|}\,\frac{n+1}{n+2} \to 1 \quad \text{as } n \to \infty,$

so we conclude from the first part of (3.48) that

(3.49) Rn (x, 0) −→ 0 as n → ∞, if − 1 < x ≤ 0.

On the other hand, x/(1 − x) is < 1 for 0 ≤ x < 1/2, but not for 1/2 ≤ x < 1. Hence the factor $(x/(1 - x))^{n+1}$ decreases geometrically for 0 ≤ x < 1/2, but not for 1/2 ≤ x < 1. Thus the second part of (3.48) yields only

(3.50)  $R_n(x, 0) \to 0$ as n → ∞, if 0 ≤ x < 1/2.
This is what the remainder estimate (3.42) yields.
To get the stronger result

(3.51) Rn (x, 0) −→ 0 as n → ∞, for |x| < 1,


we use the remainder estimate (3.41) (as in the Cauchy formula). This gives

(3.52)  $|R_n(x, 0)| \le |b_n| \cdot |x| \sup_{\xi \in I(x,0)} \frac{|x - \xi|^n}{|1 - \xi|^{n+1+r}},$

with bn as in (3.45). Now


(3.53)  $0 \le \xi \le x < 1 \Longrightarrow \frac{x - \xi}{1 - \xi} \le x, \qquad -1 < x \le \xi \le 0 \Longrightarrow \Bigl|\frac{x - \xi}{1 - \xi}\Bigr| \le |x - \xi| \le |x|.$
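The first conclusion holds since it is equivalent to x − ξ ≤ x(1 − ξ) = x − xξ, hence to xξ ≤ ξ. The second conclusion in (3.53) holds since ξ ≤ 0 ⇒ 1 − ξ ≥ 1. Writing $|1 - \xi|^{-(n+1+r)} = |1 - \xi|^{-n}\,|1 - \xi|^{-(1+r)}$, we deduce from (3.52)–(3.53) that

(3.54)  $|x| < 1 \Longrightarrow |R_n(x, 0)| \le K_r(x)\,|b_n| \cdot |x|^{n+1}, \qquad K_r(x) = \sup_{\xi \in I(x,0)} |1 - \xi|^{-(1+r)},$

where $K_r(x) < \infty$ does not depend on n. Using (3.46) then gives the desired conclusion (3.51).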
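To see the difference between the two estimates in numbers, the following sketch evaluates, for 0 ≤ x < 1, the actual remainder R_n(x, 0) of (1 − x)^{-r}, the Lagrange-type bound in the second line of (3.48), and the Cauchy-type bound obtained from (3.52)–(3.53), namely |b_n| x^{n+1} (1 − x)^{-(1+r)}. The function names and the sample values r = 1/2, x = 3/4 are illustrative assumptions; the formula for the Cauchy-type bound assumes r > −1.

```python
def binomial_coeffs(r, count):
    """c_k = a_k / k! with a_k as in (3.29), via c_{k+1} = c_k (r + k) / (k + 1)."""
    coeffs = [1.0]
    for k in range(count - 1):
        coeffs.append(coeffs[-1] * (r + k) / (k + 1))
    return coeffs


def remainder_report(r, x, n):
    """Return (actual remainder, Lagrange-type bound, Cauchy-type bound) for 0 <= x < 1, r > -1."""
    c = binomial_coeffs(r, n + 2)
    partial_sum = sum(c[k] * x ** k for k in range(n + 1))
    actual = (1 - x) ** (-r) - partial_sum
    b_n = (n + 1) * c[n + 1]  # b_n = a_{n+1} / n!, as in (3.45)
    lagrange = abs(b_n) / (n + 1) * (1 - x) ** (-r) * (x / (1 - x)) ** (n + 1)
    cauchy = abs(b_n) * x ** (n + 1) * (1 - x) ** (-(1 + r))
    return actual, lagrange, cauchy


for n in (0, 10, 40):
    print(n, remainder_report(0.5, 0.75, n))
```

At x = 3/4 the ratio x/(1 − x) equals 3, so the Lagrange-type bound grows with n, while the Cauchy-type bound and the actual remainder both decay.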


We can now conclude that (3.28) holds, with ak given by (3.29). For another proof of
(3.28), see Exercise 14.
There are some important examples of power series representations for which one does
not need to use remainder estimates like (3.41) or (3.42). For example, as seen in Chapter
1, we have
(3.55)  $\sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x},$

if x ≠ 1. The right side tends to 1/(1 − x) as n → ∞, if |x| < 1, so we get

(3.56)  $\frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k, \quad |x| < 1,$

without further ado, which is the case r = 1 of (3.28)–(3.29). We can differentiate (3.56)
repeatedly to get

(3.57)  $(1 - x)^{-n} = \sum_{k=0}^{\infty} c_k(n)\, x^k, \quad |x| < 1,\ n \in \mathbb{N},$

and verify that (3.57) agrees with (3.28)–(3.29) with r = n. However, when r ∉ Z, such an analysis of $R_n(x, 0)$ as made above seems necessary.
Let us also note that we can apply Proposition 3.3 to (3.56), obtaining

(3.58)  $\sum_{k=0}^{\infty} \frac{x^{k+1}}{k+1} = \int_0^x \frac{dy}{1 - y}, \quad |x| < 1.$


Material covered in §5 will produce another formula for the right side of (3.58).
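The identity (3.58) can be checked numerically without knowing a closed form for either side. The sketch below compares a partial sum of the series with a midpoint Riemann sum for the integral; the function names, the number of terms, and the sample points are illustrative assumptions.

```python
def series_side(x, terms=200):
    """Partial sum of the left side of (3.58): x^(k+1)/(k+1) for 0 <= k < terms."""
    return sum(x ** (k + 1) / (k + 1) for k in range(terms))


def integral_side(x, n=20000):
    """Midpoint Riemann sum for the integral of 1/(1 - y) from 0 to x, with |x| < 1."""
    width = x / n  # negative when x < 0, matching the sign convention for the integral
    return width * sum(1.0 / (1.0 - (k + 0.5) * width) for k in range(n))


for x in (0.5, -0.5):
    print(x, series_side(x), integral_side(x))
```

For x close to ±1 the series converges slowly, so more terms would be needed there.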

Exercises

1. Show that (3.6) yields the absolute convergence asserted in the proof of Proposition
3.2. More generally, show that, for any n ∈ N, r ∈ (0, 1),

$\sum_{k=1}^{\infty} k^n r^k < \infty.$

Hint. Refer to the ratio test, discussed in §3 of Chapter 3.

2. A special case of (3.18)–(3.21) is that, given a polynomial $p(t) = a_n t^n + \cdots + a_1 t + a_0$, we have $p^{(k)}(0) = k!\, a_k$. Apply this to

$P_n(t) = (1 + t)^n.$

Compute $P_n^{(k)}(t)$ using (1.7) repeatedly, then compute $P_n^{(k)}(0)$, and use this to establish the binomial formula:

$(1 + t)^n = \sum_{k=0}^{n} \binom{n}{k} t^k, \qquad \binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$

3. Find the coefficients in the power series

$\frac{1}{\sqrt{1 - x^4}} = \sum_{k=0}^{\infty} b_k x^k.$

Show that this series converges to the left side for |x| < 1.
Hint. Take r = 1/2 in (3.28)–(3.29) and set t = x4 .

4. Expand

$\int_0^x \frac{dy}{\sqrt{1 - y^4}}$

in a power series in x. Show this holds for |x| < 1.

5. Expand

$\int_0^x \frac{dy}{\sqrt{1 + y^4}}$

as a power series in x. Show that this holds for |x| < 1.


6. Expand

$\int_0^1 \frac{dt}{1 + xt^4}$

as a power series in x. Show that this holds for |x| < 1.

7. Show that another formula for Rn (x, y) in (3.34) is


(3.59)  $R_n(x, y) = \frac{1}{(n-1)!}\int_y^x (x - s)^{n-1}\bigl[f^{(n)}(s) - f^{(n)}(y)\bigr]\,ds.$

Hint. Do (3.34)–(3.38) with n replaced by n − 1, and then write

$R_{n-1}(x, y) = \frac{f^{(n)}(y)}{n!}(x - y)^n + R_n(x, y).$

Remark. An advantage of (3.59) over (3.38) is that for (3.59), we need only $f \in C^n$, rather than $f \in C^{n+1}$.

8. Note that

$\sqrt{2} = 2\sqrt{1 - \frac{1}{2}}.$

Expand the right side in a power series, using (3.28)–(3.29). How many terms suffice to approximate $\sqrt{2}$ to 12 digits?

9. In the setting of Exercise 8, investigate series that converge faster, such as series obtained from

$\sqrt{2} = \frac{3}{2}\sqrt{1 - \frac{1}{9}} = \frac{10}{7}\sqrt{1 - \frac{1}{50}}.$

10. Apply variants of the methods of Exercises 8–9 to approximate $\sqrt{3}$, $\sqrt{5}$, $\sqrt{7}$, and $\sqrt{1001}$.

11. Given a rational approximation $x_n$ to $\sqrt{2}$, write

$\sqrt{2} = x_n\sqrt{1 + \delta_n}.$

Assume $|\delta_n| \le 1/2$. Then set

$x_{n+1} = x_n\Bigl(1 + \frac{1}{2}\delta_n\Bigr), \qquad 2 = x_{n+1}^2 (1 + \delta_{n+1}).$


Estimate $\delta_{n+1}$. Does the sequence $(x_n)$ approach $\sqrt{2}$ faster than a power series? Apply this method to the last approximation in Exercise 9.

12. Assume F ∈ C([a, b]), g ∈ R([a, b]), F real valued, and g ≥ 0 on [a, b]. Show that
$\int_a^b g(t)F(t)\,dt = \Bigl(\int_a^b g(t)\,dt\Bigr) F(\zeta),$

for some ζ ∈ (a, b). Show how this result justifies passing from (3.39) to (3.40).
Hint. If A = min F, B = max F, and $M = \int_a^b g(t)\,dt$, show that

$AM \le \int_a^b g(t)F(t)\,dt \le BM.$

13. Recall that the Cauchy formula (3.37) for the remainder Rn (x, y) was obtained by
applying the Mean Value Theorem to the difference quotient

$\frac{R_n(x, y) - R_n(x, x)}{y - x}.$

Now apply the generalized mean value theorem, described in Exercise 8 of §1, with

$f(y) = R_n(x, y), \qquad g(y) = (x - y)^{n+1},$

to obtain the Lagrange formula (3.40).

14. Here is an approach to the proof of (3.28) that avoids formulas for the remainder
Rn (x, 0). Set

$f_r(t) = (1 - t)^{-r}, \qquad g_r(t) = \sum_{k=0}^{\infty} \frac{a_k}{k!}\, t^k, \quad \text{for } |t| < 1,$

with $a_k$ given by (3.29). Show that, for |t| < 1,

$f_r'(t) = \frac{r}{1 - t}\, f_r(t), \quad \text{and} \quad (1 - t)\, g_r'(t) = r\, g_r(t).$

Then show that

$\frac{d}{dt}\bigl[(1 - t)^r g_r(t)\bigr] = 0,$

and deduce that $f_r(t) = g_r(t)$.


4. Curves and arc length

The term “curve” is commonly used to refer to a couple of different, but closely related,
objects. In one meaning, a curve is a continuous function from an interval I ⊂ R to
n-dimensional Euclidean space:
(4.1)  $\gamma : I \to \mathbb{R}^n, \qquad \gamma(t) = (\gamma_1(t), \dots, \gamma_n(t)).$
We say γ is differentiable provided each component $\gamma_j$ is, in which case
(4.2)  $\gamma'(t) = (\gamma_1'(t), \dots, \gamma_n'(t)).$
γ′(t) is the velocity of γ, at “time” t, and its speed is the magnitude of γ′(t):
(4.3)  $|\gamma'(t)| = \sqrt{\gamma_1'(t)^2 + \cdots + \gamma_n'(t)^2}.$

We say γ is smooth of class C k provided each component γj (t) has this property.
One also calls the image of I under the map γ a curve in Rn . If u : J → I is continuous,
one-to-one, and onto, the map
(4.4)  $\sigma : J \to \mathbb{R}^n, \qquad \sigma(t) = \gamma(u(t))$
has the same image as γ. We say σ is a reparametrization of γ. We usually require that u be $C^1$, with $C^1$ inverse. If γ is $C^k$ and u is also $C^k$, so is σ, and the chain rule gives
(4.5)  $\sigma'(t) = u'(t)\,\gamma'(u(t)).$
Let us assume I = [a, b] is a closed, bounded interval, and γ is C 1 . We want to define
the length of this curve. To get started, we take a partition P of [a, b], given by
(4.6)  $a = t_0 < t_1 < \cdots < t_N = b,$
and set
(4.7)  $\ell_P(\gamma) = \sum_{j=1}^{N} |\gamma(t_j) - \gamma(t_{j-1})|.$

We will massage the right side of (4.7) into something that looks like a Riemann sum for $\int_a^b |\gamma'(t)|\,dt$. We have

(4.8)  $\gamma(t_j) - \gamma(t_{j-1}) = \int_{t_{j-1}}^{t_j} \gamma'(t)\,dt = \int_{t_{j-1}}^{t_j} \bigl[\gamma'(t_j) + \gamma'(t) - \gamma'(t_j)\bigr]\,dt = (t_j - t_{j-1})\,\gamma'(t_j) + \int_{t_{j-1}}^{t_j} \bigl[\gamma'(t) - \gamma'(t_j)\bigr]\,dt.$


We get

(4.9)  $|\gamma(t_j) - \gamma(t_{j-1})| = (t_j - t_{j-1})\,|\gamma'(t_j)| + r_j,$

with

(4.10)  $|r_j| \le \int_{t_{j-1}}^{t_j} |\gamma'(t) - \gamma'(t_j)|\,dt.$

Now if γ′ is continuous on [a, b], so is |γ′|, and hence both are uniformly continuous on [a, b]. We have

(4.11)  $s, t \in [a, b],\ |s - t| \le h \Longrightarrow |\gamma'(t) - \gamma'(s)| \le \omega(h),$

where ω(h) → 0 as h → 0. Summing (4.9) over j, we get

(4.12)  $\ell_P(\gamma) = \sum_{j=1}^{N} |\gamma'(t_j)|\,(t_j - t_{j-1}) + R_P,$

with

(4.13)  $|R_P| \le (b - a)\,\omega(h)$, if each $t_j - t_{j-1} \le h$.

Since the sum on the right side of (4.12) is a Riemann sum, we can apply Theorem 2.4 to
get the following.
Proposition 4.1. Assume γ : [a, b] → Rn is a C 1 curve. Then
Z b
(4.14) `P (γ) −→ |γ 0 (t)| dt as maxsize P → 0.
a

We call this limit the length of the curve $\gamma$, and write

(4.15) $\ell(\gamma) = \int_a^b |\gamma'(t)|\,dt.$
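The convergence in Proposition 4.1 can be observed numerically. The following is a minimal sketch, not part of the original text: it uses uniform partitions and, as an illustrative assumption, the helix $\gamma(t) = (\cos t, \sin t, t)$ on $[0, 2\pi]$, for which $|\gamma'(t)| = \sqrt{2}$ and hence $\ell(\gamma) = 2\pi\sqrt{2}$. The names `polygonal_length` and `helix` are hypothetical.

```python
import math

def polygonal_length(gamma, a, b, n):
    """Polygonal length l_P(gamma) from (4.7) for the uniform partition of [a, b] into n pieces."""
    pts = [gamma(a + (b - a) * j / n) for j in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def helix(t):
    # Illustrative curve (an assumption of this sketch): speed is sqrt(2) everywhere.
    return (math.cos(t), math.sin(t), t)

exact = 2 * math.pi * math.sqrt(2)  # integral of |gamma'(t)| over [0, 2*pi]
for n in (4, 16, 64, 256):
    approx = polygonal_length(helix, 0.0, 2 * math.pi, n)
    print(n, approx, exact - approx)  # the gap shrinks as the partition is refined
```

Each chord is no longer than the arc it spans, so the polygonal lengths approach $\ell(\gamma)$ from below.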

Note that if $u : [\alpha, \beta] \to [a, b]$ is a $C^1$ map with $C^1$ inverse, and $\sigma = \gamma \circ u$, as in (4.4), we have from (4.5) that $|\sigma'(t)| = |u'(t)| \cdot |\gamma'(u(t))|$, and the change of variable formula (2.65) for the integral gives

(4.16) $\int_\alpha^\beta |\sigma'(t)|\,dt = \int_a^b |\gamma'(t)|\,dt,$

hence we have the geometrically natural result

(4.17) $\ell(\sigma) = \ell(\gamma).$


Given such a $C^1$ curve $\gamma$, it is natural to consider the length function

(4.18) $\ell_\gamma(t) = \int_a^t |\gamma'(s)|\,ds, \quad \ell_\gamma'(t) = |\gamma'(t)|.$

If we assume also that $\gamma'$ is nowhere vanishing on $[a, b]$, Theorem 1.3, the inverse function theorem, implies that $\ell_\gamma : [a, b] \to [0, \ell(\gamma)]$ has a $C^1$ inverse

(4.19) $u : [0, \ell(\gamma)] \longrightarrow [a, b],$

and then $\sigma = \gamma \circ u : [0, \ell(\gamma)] \to \mathbb{R}^n$ satisfies

(4.20) $$\begin{aligned}
\sigma'(t) &= u'(t)\gamma'(u(t)) \\
&= \frac{1}{\ell_\gamma'(s)}\,\gamma'(u(t)), \quad \text{for } t = \ell_\gamma(s),\ s = u(t),
\end{aligned}$$

and by (4.18), $\ell_\gamma'(s) = |\gamma'(s)| = |\gamma'(u(t))|$, so

(4.21) $|\sigma'(t)| \equiv 1.$

Then $\sigma$ is a reparametrization of $\gamma$, and $\sigma$ has unit speed. We say $\sigma$ is a reparametrization by arc length.
We now focus on that most classical example of a curve in the plane $\mathbb{R}^2$, the unit circle

(4.22) $S^1 = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}.$

We can parametrize $S^1$ away from $(x, y) = (\pm 1, 0)$ by

(4.23) $\gamma_+(t) = (t, \sqrt{1 - t^2}), \quad \gamma_-(t) = (t, -\sqrt{1 - t^2}),$

on the intersection of $S^1$ with $\{(x, y) : y > 0\}$ and $\{(x, y) : y < 0\}$, respectively. Here $\gamma_\pm : (-1, 1) \to \mathbb{R}^2$, and both maps are smooth. In fact, we can take $\gamma_\pm : [-1, 1] \to \mathbb{R}^2$, but these functions are not differentiable at $\pm 1$. We can also parametrize $S^1$ away from $(x, y) = (0, \pm 1)$, by

(4.24) $\gamma_\ell(t) = (-\sqrt{1 - t^2}, t), \quad \gamma_r(t) = (\sqrt{1 - t^2}, t),$

again with $t \in (-1, 1)$. Note that

(4.25) $\gamma_+'(t) = (1, -t(1 - t^2)^{-1/2}),$

so

(4.26) $|\gamma_+'(t)|^2 = 1 + \frac{t^2}{1 - t^2} = \frac{1}{1 - t^2}.$


Hence, if $\ell(t)$ is the length of the image $\gamma_+([0, t])$, we have

(4.27) $\ell(t) = \int_0^t \frac{1}{\sqrt{1 - s^2}}\,ds, \quad \text{for } 0 < t < 1.$

The same formula holds with $\gamma_+$ replaced by $\gamma_-$, $\gamma_\ell$, or $\gamma_r$.
We can evaluate the integral (4.27) as a power series in $t$, as follows. As seen in §3,

(4.28) $(1 - r)^{-1/2} = \sum_{k=0}^{\infty} \frac{a_k}{k!}\, r^k, \quad \text{for } |r| < 1,$

where

(4.29) $a_0 = 1, \quad a_1 = \frac{1}{2}, \quad a_k = \Bigl(\frac{1}{2}\Bigr)\Bigl(\frac{3}{2}\Bigr)\cdots\Bigl(k - \frac{1}{2}\Bigr).$

The power series converges uniformly on $[-\rho, \rho]$, for each $\rho \in (0, 1)$. It follows that

(4.30) $(1 - s^2)^{-1/2} = \sum_{k=0}^{\infty} \frac{a_k}{k!}\, s^{2k}, \quad |s| < 1,$

uniformly convergent on $[-a, a]$ for each $a \in (0, 1)$. Hence we can integrate (4.30) term by term to get

(4.31) $\ell(t) = \sum_{k=0}^{\infty} \frac{a_k}{k!}\, \frac{t^{2k+1}}{2k + 1}, \quad 0 \le t < 1.$

One can use (4.27)–(4.31) to get a rapidly convergent infinite series for the number π,
defined as

(4.31A) $\pi$ is half the length of $S^1$.

See Exercise 7 in §5.
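As a numerical illustration of how (4.31) leads to $\pi$, here is a minimal sketch. It assumes the identity $\ell(1/2) = \pi/6$, which is the content of Exercise 7 in §5 and is not proved at this point; the function name `arc_length_series` and the truncation at 40 terms are choices made for this sketch only.

```python
import math

def arc_length_series(t, terms=40):
    """Partial sum of (4.31): sum over k of (a_k / k!) * t**(2k+1) / (2k+1)."""
    total = 0.0
    coeff = 1.0  # a_0 / 0! = 1
    for k in range(terms):
        total += coeff * t ** (2 * k + 1) / (2 * k + 1)
        # From (4.29), a_{k+1} = (k + 1/2) a_k, so the ratio of successive
        # coefficients a_k / k! is (k + 1/2) / (k + 1).
        coeff *= (k + 0.5) / (k + 1)
    return total

# Assumes l(1/2) = pi/6 (Exercise 7 of section 5).
pi_estimate = 6 * arc_length_series(0.5)
print(pi_estimate, math.pi)
```

At $t = 1/2$ the terms decay roughly like $4^{-k}$, which is why a few dozen terms already reach double precision.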


Since $S^1$ is a smooth curve, it can be parametrized by arc length. We will let $C : \mathbb{R} \to S^1$ be such a parametrization, satisfying

(4.32) $C(0) = (1, 0), \quad C'(0) = (0, 1),$

so $C(t)$ traverses $S^1$ counter-clockwise, as $t$ increases. For $t$ moderately bigger than 0, the rays from $(0, 0)$ to $(1, 0)$ and from $(0, 0)$ to $C(t)$ make an angle that, measured in radians, is $t$. This leads to the standard trigonometric functions $\cos t$ and $\sin t$, defined by

(4.33) $C(t) = (\cos t, \sin t),$

when $C$ is such a unit-speed parametrization of $S^1$.


We can evaluate the derivative of $C(t)$ by the following device. Applying $d/dt$ to the identity

(4.34) $C(t) \cdot C(t) = 1$

and using the product formula gives

(4.35) $C'(t) \cdot C(t) = 0.$

Since both $|C(t)| \equiv 1$ and $|C'(t)| \equiv 1$, (4.35) allows only two possibilities. Either

(4.36) $C'(t) = (\sin t, -\cos t),$

or

(4.37) $C'(t) = (-\sin t, \cos t).$

Since $C'(0) = (0, 1)$, (4.36) fails at $t = 0$, and since $C'$ is continuous and the two candidates are opposite unit vectors, (4.36) is not a possibility for any $t$. This implies

(4.38) $\frac{d}{dt} \cos t = -\sin t, \quad \frac{d}{dt} \sin t = \cos t.$

We will derive further important results on $\cos t$ and $\sin t$ in §5.


One can think of $\cos t$ and $\sin t$ as special functions arising to analyze the length of arcs in the circle. Related special functions arise to analyze the length of portions of a parabola in $\mathbb{R}^2$, say the graph of

(4.39) $y = \frac{1}{2} x^2.$

This curve is parametrized by

(4.40) $\gamma(t) = \Bigl(t, \frac{1}{2} t^2\Bigr),$

so

(4.41) $\gamma'(t) = (1, t).$

In such a case, the length of $\gamma([0, t])$ is

(4.42) $\ell_\gamma(t) = \int_0^t \sqrt{1 + s^2}\,ds.$

Methods to evaluate the integral in (4.42) are provided in §5. See Exercise 10 of §5.


The study of lengths of other curves has stimulated much work in analysis. Another example is the ellipse

(4.43) $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,$

given $a, b \in (0, \infty)$. This curve is parametrized by

(4.44) $\gamma(t) = (a \cos t, b \sin t).$

In such a case, by (4.38), $\gamma'(t) = (-a \sin t, b \cos t)$, so

(4.45) $$\begin{aligned}
|\gamma'(t)|^2 &= a^2 \sin^2 t + b^2 \cos^2 t \\
&= b^2 + \gamma \sin^2 t, \quad \gamma = a^2 - b^2,
\end{aligned}$$

and hence the length of $\gamma([0, t])$ is

(4.46) $\ell_\gamma(t) = b \int_0^t \sqrt{1 + \sigma \sin^2 s}\,ds, \quad \sigma = \frac{\gamma}{b^2}.$

If $a \ne b$, this is called an elliptic integral, and it gives rise to a more subtle family of special functions, called elliptic functions. Material on this can be found in §33 of [T3], Introduction to Complex Analysis.
We end this section with a brief discussion of curves in polar coordinates. We define a map

(4.47) $\Pi : \mathbb{R}^2 \longrightarrow \mathbb{R}^2, \quad \Pi(r, \theta) = (r \cos\theta, r \sin\theta).$

We say $(r, \theta)$ are polar coordinates of $(x, y) \in \mathbb{R}^2$ if $\Pi(r, \theta) = (x, y)$. Now, $\Pi$ in (4.47) is not bijective, since

(4.48) $\Pi(r, \theta + 2\pi) = \Pi(r, \theta), \quad \Pi(r, \theta + \pi) = \Pi(-r, \theta),$

and $\Pi(0, \theta)$ is independent of $\theta$. So polar coordinates are not unique, but we will not belabor this point. The point we make is that an equation

(4.49) $r = \rho(\theta), \quad \rho : [a, b] \to \mathbb{R},$

yields a curve in $\mathbb{R}^2$, namely (with $\theta = t$)

(4.50) $\gamma(t) = (\rho(t) \cos t, \rho(t) \sin t), \quad a \le t \le b.$

The circle (4.33) corresponds to $\rho(\theta) \equiv 1$. Other cases include

(4.51) $\rho(\theta) = a \cos\theta, \quad -\frac{\pi}{2} \le \theta \le \frac{\pi}{2},$


yielding a circle of diameter $a$ centered at $(a/2, 0)$, and

(4.52) $\rho(\theta) = a \cos 3\theta,$

yielding a figure called a three-leaved rose.

To compute the arc length of (4.50), we note that, by (4.38),

(4.53) $$\begin{aligned}
& x(t) = \rho(t) \cos t, \quad y(t) = \rho(t) \sin t \\
& \Rightarrow\ x'(t) = \rho'(t) \cos t - \rho(t) \sin t, \quad y'(t) = \rho'(t) \sin t + \rho(t) \cos t,
\end{aligned}$$

hence

(4.54) $$\begin{aligned}
x'(t)^2 + y'(t)^2 &= \rho'(t)^2 \cos^2 t - 2\rho(t)\rho'(t) \cos t \sin t + \rho(t)^2 \sin^2 t \\
&\quad + \rho'(t)^2 \sin^2 t + 2\rho(t)\rho'(t) \sin t \cos t + \rho(t)^2 \cos^2 t \\
&= \rho'(t)^2 + \rho(t)^2.
\end{aligned}$$

Therefore

(4.55) $\ell(\gamma) = \int_a^b |\gamma'(t)|\,dt = \int_a^b \sqrt{\rho(t)^2 + \rho'(t)^2}\,dt.$
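Formula (4.55) is easy to test numerically on the circle (4.51), where $\rho(\theta)^2 + \rho'(\theta)^2 = a^2$ and the length should be $\pi a$, the circumference of a circle of diameter $a$. The sketch below is illustrative only: the midpoint rule, the name `polar_arc_length`, and the value $a = 3$ are assumptions of this example, not part of the text.

```python
import math

def polar_arc_length(rho, drho, a, b, n=2000):
    """Midpoint-rule approximation of (4.55): integral of sqrt(rho^2 + rho'^2) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for j in range(n):
        t = a + (j + 0.5) * h
        total += math.hypot(rho(t), drho(t)) * h
    return total

A = 3.0  # hypothetical diameter
length = polar_arc_length(
    lambda t: A * math.cos(t),    # rho(theta) = a cos(theta), as in (4.51)
    lambda t: -A * math.sin(t),   # rho'(theta)
    -math.pi / 2,
    math.pi / 2,
)
print(length, math.pi * A)
```

The derivative $\rho'$ is passed in explicitly so that the sketch does not depend on numerical differentiation.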

Exercises

1. Let $\gamma(t) = (t^2, t^3)$. Compute the length of $\gamma([0, t])$.

2. With $a, b > 0$, the curve
$\gamma(t) = (a \cos t, a \sin t, bt)$
is a helix. Compute the length of $\gamma([0, t])$.

3. Let
$\gamma(t) = \Bigl(t, \frac{2\sqrt{2}}{3}\, t^{3/2}, \frac{1}{2} t^2\Bigr).$
Compute the length of $\gamma([0, t])$.

4. In case $b > a$ for the ellipse (4.44), the length formula (4.46) becomes
$\ell_\gamma(t) = b \int_0^t \sqrt{1 - \beta^2 \sin^2 s}\,ds, \quad \beta^2 = \frac{b^2 - a^2}{b^2} \in (0, 1).$

Apply the change of variable $x = \sin s$ to this integral (cf. (2.46)), and write out the resulting integral.


5. The second half of (4.48) is equivalent to the identity

$(\cos(\theta + \pi), \sin(\theta + \pi)) = -(\cos\theta, \sin\theta).$

Deduce this from the definition (4.31A) of $\pi$, together with the characterization of $C(t)$ in (4.33) as the unit speed parametrization of $S^1$, satisfying (4.32). For a more general identity, see (5.44).

6. The curve defined by (4.51) can be written

$\gamma(t) = (a \cos^2 t, a \cos t \sin t), \quad -\frac{\pi}{2} \le t \le \frac{\pi}{2}.$

Peek ahead at (5.44) and show that

$\gamma(t) = \Bigl(\frac{a}{2} + \frac{a}{2} \cos 2t, \frac{a}{2} \sin 2t\Bigr).$

Verify that this traces out a circle of radius $a/2$, centered at $(a/2, 0)$.

7. Use (4.55) to write the arc length of the curve given by (4.52) as an integral. Show this integral has the same general form as (4.45)–(4.46).


5. The exponential and trigonometric functions

The exponential function is one of the central objects of analysis. In this section we define the exponential function, both for real and complex arguments, and establish a number of basic properties, including fundamental connections to the trigonometric functions.
We construct the exponential function to solve the differential equation
(5.1) $\frac{dx}{dt} = x, \quad x(0) = 1.$
We seek a solution as a power series
(5.2) $x(t) = \sum_{k=0}^{\infty} a_k t^k.$

In such a case, if this series converges for $|t| < R$, then, by Proposition 3.2,
(5.3) $$\begin{aligned}
x'(t) &= \sum_{k=1}^{\infty} k a_k t^{k-1} \\
&= \sum_{\ell=0}^{\infty} (\ell + 1) a_{\ell+1} t^{\ell},
\end{aligned}$$

so for (5.1) to hold we need

(5.4) $a_0 = 1, \quad a_{k+1} = \frac{a_k}{k + 1},$
i.e., $a_k = 1/k!$, where $k! = k(k - 1) \cdots 2 \cdot 1$. Thus (5.1) is solved by
(5.5) $x(t) = e^t = \sum_{k=0}^{\infty} \frac{1}{k!}\, t^k, \quad t \in \mathbb{R}.$

This defines the exponential function $e^t$.
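The recurrence (5.4) translates directly into a way of summing the series (5.5). The following is a minimal sketch under stated assumptions: the name `exp_series` and the cutoff of 60 terms are choices made here for illustration, and `math.exp` is used only as an independent reference value.

```python
import math

def exp_series(t, terms=60):
    """Partial sum of (5.5), building a_k * t**k from the recurrence (5.4)."""
    total = 0.0
    term = 1.0  # a_0 * t**0
    for k in range(terms):
        total += term
        term *= t / (k + 1)  # a_{k+1} t^{k+1} = (a_k t^k) * t / (k + 1)
    return total

for t in (-2.5, 0.0, 1.0, 3.0):
    print(t, exp_series(t), math.exp(t))
```

A fixed number of terms is adequate only for moderate $|t|$; for large arguments more terms would be needed.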


More generally, we can define

(5.6) $e^z = \sum_{k=0}^{\infty} \frac{1}{k!}\, z^k, \quad z \in \mathbb{C}.$

The ratio test then shows that the series (5.6) is absolutely convergent for all $z \in \mathbb{C}$, and uniformly convergent for $|z| \le R$, for each $R < \infty$. Note that, again by Proposition 3.2,

(5.7) $e^{at} = \sum_{k=0}^{\infty} \frac{a^k}{k!}\, t^k$


solves
(5.8) $\frac{d}{dt} e^{at} = a e^{at},$
and this works for each $a \in \mathbb{C}$.
We claim that $e^{at}$ is the unique solution to

(5.9) $\frac{dy}{dt} = ay, \quad y(0) = 1.$

To see this, compute the derivative of $e^{-at} y(t)$:

(5.10) $\frac{d}{dt}\bigl(e^{-at} y(t)\bigr) = -a e^{-at} y(t) + e^{-at} a y(t) = 0,$

where we use the product rule, (5.8) (with $a$ replaced by $-a$) and (5.9). Thus $e^{-at} y(t)$ is independent of $t$. Evaluating at $t = 0$ gives

(5.11) $e^{-at} y(t) = 1, \quad \forall\, t \in \mathbb{R},$

whenever $y(t)$ solves (5.9). Since $e^{at}$ solves (5.9), we have $e^{-at} e^{at} = 1$, hence

(5.12) $e^{-at} = \frac{1}{e^{at}}, \quad \forall\, t \in \mathbb{R},\ a \in \mathbb{C}.$

Thus multiplying both sides of (5.11) by $e^{at}$ gives the asserted uniqueness:

(5.13) $y(t) = e^{at}, \quad \forall\, t \in \mathbb{R}.$

We can draw further useful conclusions from applying $d/dt$ to products of exponential functions. In fact, let $a, b \in \mathbb{C}$; then

(5.14) $$\begin{aligned}
\frac{d}{dt}\Bigl(e^{-at} e^{-bt} e^{(a+b)t}\Bigr)
&= -a e^{-at} e^{-bt} e^{(a+b)t} - b e^{-at} e^{-bt} e^{(a+b)t} + (a + b) e^{-at} e^{-bt} e^{(a+b)t} \\
&= 0,
\end{aligned}$$

so again we are differentiating a function that is independent of $t$. Evaluation at $t = 0$ gives

(5.15) $e^{-at} e^{-bt} e^{(a+b)t} = 1, \quad \forall\, t \in \mathbb{R}.$

Again using (5.12), we get

(5.16) $e^{(a+b)t} = e^{at} e^{bt}, \quad \forall\, t \in \mathbb{R},\ a, b \in \mathbb{C},$


or, setting $t = 1$,

(5.17) $e^{a+b} = e^a e^b, \quad \forall\, a, b \in \mathbb{C}.$

We next record some properties of $\exp(t) = e^t$ for real $t$. The power series (5.5) clearly gives $e^t > 0$ for $t \ge 0$. Since $e^{-t} = 1/e^t$, we see that $e^t > 0$ for all $t \in \mathbb{R}$. Since $de^t/dt = e^t > 0$, the function is monotone increasing in $t$, and since $d^2 e^t/dt^2 = e^t > 0$, this function is convex. (See Proposition 1.5 and the remark that follows it.) Note that, for $t > 0$,

(5.18) $e^t = 1 + t + \frac{t^2}{2} + \cdots > 1 + t \nearrow +\infty,$
as $t \nearrow \infty$. Hence

(5.19) $\lim_{t \to +\infty} e^t = +\infty.$

Since $e^{-t} = 1/e^t$,

(5.20) $\lim_{t \to -\infty} e^t = 0.$

As a consequence,

(5.21) $\exp : \mathbb{R} \longrightarrow (0, \infty)$

is one-to-one and onto, with positive derivative, so there is a smooth inverse

(5.22) $L : (0, \infty) \longrightarrow \mathbb{R}.$

We call this inverse the natural logarithm:

(5.23) $\log x = L(x).$

See Figures 5.1 and 5.2 for graphs of $x = e^t$ and $t = \log x$.

Applying $d/dt$ to

(5.24) $L(e^t) = t$

gives
(5.25) $L'(e^t)\, e^t = 1, \quad \text{hence} \quad L'(e^t) = \frac{1}{e^t},$
i.e.,

(5.26) $\frac{d}{dx} \log x = \frac{1}{x}.$


Figure 5.1

Figure 5.2

Since $\log 1 = 0$, we get

(5.27) $\log x = \int_1^x \frac{dy}{y}.$
An immediate consequence of (5.17) (for $a, b \in \mathbb{R}$) is the identity
(5.28) $\log xy = \log x + \log y, \quad x, y \in (0, \infty).$
We move on to a study of $e^z$ for purely imaginary $z$, i.e., of
(5.29) $\gamma(t) = e^{it}, \quad t \in \mathbb{R}.$


This traces out a curve in the complex plane, and we want to understand which curve it is. Let us set
(5.30) $e^{it} = c(t) + i s(t),$
with $c(t)$ and $s(t)$ real valued. First we calculate $|e^{it}|^2 = c(t)^2 + s(t)^2$. For $x, y \in \mathbb{R}$,
(5.31) $z = x + iy \implies \bar{z} = x - iy \implies z\bar{z} = x^2 + y^2 = |z|^2.$
It is elementary that
(5.32) $z, w \in \mathbb{C} \implies \overline{zw} = \bar{z}\,\bar{w} \implies \overline{z^n} = \bar{z}^n, \quad \text{and} \quad \overline{z + w} = \bar{z} + \bar{w}.$
Hence
(5.33) $\overline{e^z} = \sum_{k=0}^{\infty} \frac{\bar{z}^k}{k!} = e^{\bar{z}}.$
In particular,
(5.34) $t \in \mathbb{R} \implies |e^{it}|^2 = e^{it} e^{-it} = 1.$
Hence $t \mapsto \gamma(t) = e^{it}$ traces out the unit circle centered at the origin in $\mathbb{C}$. Also
(5.35) $\gamma'(t) = i e^{it} \implies |\gamma'(t)| \equiv 1,$
so $\gamma(t)$ moves at unit speed on the unit circle. We have
(5.36) $\gamma(0) = 1, \quad \gamma'(0) = i.$
Thus, for moderate $t > 0$, the arc from $\gamma(0)$ to $\gamma(t)$ is an arc on the unit circle, pictured in Figure 5.3, of length
(5.37) $\ell(t) = \int_0^t |\gamma'(s)|\,ds = t.$

Figure 5.3


In other words, $\gamma(t) = e^{it}$ is the parametrization of the unit circle by arc length, introduced in (4.32). As in (4.33), standard definitions from trigonometry give

(5.38) $\cos t = c(t), \quad \sin t = s(t).$

Thus (5.30) becomes

(5.39) $e^{it} = \cos t + i \sin t,$

which is Euler’s formula. The identity

(5.40) $\frac{d}{dt} e^{it} = i e^{it},$

applied to (5.39), yields

(5.41) $\frac{d}{dt} \cos t = -\sin t, \quad \frac{d}{dt} \sin t = \cos t.$

Compare the derivation of (4.38). We can use (5.17) to derive formulas for sin and cos of the sum of two angles. Indeed, comparing

(5.42) $e^{i(s+t)} = \cos(s + t) + i \sin(s + t)$

with

(5.43) $e^{is} e^{it} = (\cos s + i \sin s)(\cos t + i \sin t)$

gives

(5.44) $$\begin{aligned}
\cos(s + t) &= (\cos s)(\cos t) - (\sin s)(\sin t), \\
\sin(s + t) &= (\sin s)(\cos t) + (\cos s)(\sin t).
\end{aligned}$$

Further material on the trigonometric functions is developed in the exercises below.
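Euler’s formula (5.39) and the addition formulas (5.44) can be spot-checked in floating point. This sketch uses Python’s built-in `cmath.exp` as a stand-in for the series (5.6); the helper names `euler_gap` and `addition_gap` and the sample angles are hypothetical, chosen only for illustration.

```python
import cmath
import math

def euler_gap(t):
    """Distance between e^{it} and cos t + i sin t, as in (5.39)."""
    return abs(cmath.exp(1j * t) - complex(math.cos(t), math.sin(t)))

def addition_gap(s, t):
    """Largest discrepancy in the two identities of (5.44)."""
    cos_gap = math.cos(s + t) - (math.cos(s) * math.cos(t) - math.sin(s) * math.sin(t))
    sin_gap = math.sin(s + t) - (math.sin(s) * math.cos(t) + math.cos(s) * math.sin(t))
    return max(abs(cos_gap), abs(sin_gap))

for s, t in ((0.7, 1.9), (-2.0, 0.3), (3.1, 3.1)):
    # (5.34): e^{it} lies on the unit circle.
    print(s, t, euler_gap(t), addition_gap(s, t), abs(cmath.exp(1j * t)))
```

A numerical check of this kind is no substitute for the argument in the text; it only confirms the identities at the sampled points up to rounding.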

Exercises.

1. Show that
(5.45) $|t| < 1 \ \Rightarrow\ \log(1 + t) = \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k}\, t^k = t - \frac{t^2}{2} + \frac{t^3}{3} - \cdots.$


Hint. Rewrite (5.27) as

$\log(1 + t) = \int_0^t \frac{ds}{1 + s},$
expand
$\frac{1}{1 + s} = 1 - s + s^2 - s^3 + \cdots, \quad |s| < 1,$
and integrate term by term.

2. In §4, $\pi$ was defined to be half the length of the unit circle $S^1$. Equivalently, $\pi$ is the smallest positive number such that $e^{\pi i} = -1$. Show that

$e^{\pi i/2} = i, \quad e^{\pi i/3} = \frac{1}{2} + \frac{\sqrt{3}}{2}\, i.$

Hint. See Figure 5.4.

Figure 5.4

3. Show that
cos2 t + sin2 t = 1,

and
1 + tan2 t = sec2 t,

where
sin t 1
tan t = , sec t = .
cos t cos t


4. Show that
$\frac{d}{dt} \tan t = \sec^2 t = 1 + \tan^2 t,$
$\frac{d}{dt} \sec t = \sec t \tan t.$

5. Evaluate
$\int_0^y \frac{dx}{1 + x^2}.$
Hint. Set $x = \tan t$.

6. Evaluate
$\int_0^y \frac{dx}{\sqrt{1 - x^2}}.$
Hint. Set $x = \sin t$.

7. Show that
$\frac{\pi}{6} = \int_0^{1/2} \frac{dx}{\sqrt{1 - x^2}}.$
Use (4.27)–(4.31) to obtain a rapidly convergent infinite series for $\pi$.
Hint. Show that $\sin \pi/6 = 1/2$. Use Exercise 2 and the identity $e^{\pi i/6} = e^{\pi i/2} e^{-\pi i/3}$. Note that $a_k$ in (4.29)–(4.31) satisfies $a_{k+1} = (k + 1/2) a_k$. Deduce that

(5.45A) $\pi = \sum_{k=0}^{\infty} \frac{b_k}{2k + 1}, \quad b_0 = 3, \quad b_{k+1} = \frac{1}{4}\, \frac{2k + 1}{2k + 2}\, b_k.$

8. Set
$\cosh t = \frac{1}{2}(e^t + e^{-t}), \quad \sinh t = \frac{1}{2}(e^t - e^{-t}).$
Show that
$\frac{d}{dt} \cosh t = \sinh t, \quad \frac{d}{dt} \sinh t = \cosh t,$
and
$\cosh^2 t - \sinh^2 t = 1.$

9. Evaluate
$\int_0^y \frac{dx}{\sqrt{1 + x^2}}.$
Hint. Set $x = \sinh t$.


10. Evaluate
$\int_0^y \sqrt{1 + x^2}\,dx.$

11. Using Exercise 4, verify that

$\frac{d}{dt}(\sec t + \tan t) = \sec t\,(\sec t + \tan t),$
$\frac{d}{dt}(\sec t \tan t) = \sec^3 t + \sec t \tan^2 t = 2 \sec^3 t - \sec t.$

12. Next verify that

$\frac{d}{dt} \log|\sec t| = \tan t,$
$\frac{d}{dt} \log|\sec t + \tan t| = \sec t.$

13. Now verify that

$\int \tan t\,dt = \log|\sec t|,$
$\int \sec t\,dt = \log|\sec t + \tan t|,$
$2 \int \sec^3 t\,dt = \sec t \tan t + \int \sec t\,dt.$

(Here and below, we omit the arbitrary additive constants.) See Exercises 40–43 for other approaches to evaluating these and related integrals.

14. Here is another approach to the evaluation of $\int \sec t\,dt$. Using Exercise 8 and the chain rule, show that
$\frac{d}{du} \cosh^{-1} u = \frac{1}{\sqrt{u^2 - 1}}.$
Take $u = \sec t$ and use Exercises 3–4 to get

$\frac{d}{dt} \cosh^{-1}(\sec t) = \frac{\sec t \tan t}{\tan t} = \sec t,$

hence
$\int \sec t\,dt = \cosh^{-1}(\sec t).$

Compare this with the analogue in Exercise 13.


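15. Show that

$E_n^a(t) = \sum_{k=0}^{n} \frac{a^k}{k!}\, t^k \quad \text{satisfies} \quad \frac{d}{dt} E_n^a(t) = a E_{n-1}^a(t).$

From this, show that

$\frac{d}{dt}\Bigl(e^{-at} E_n^a(t)\Bigr) = -\frac{a^{n+1}}{n!}\, t^n e^{-at}.$

16. Use Exercise 15 and the fundamental theorem of calculus to show that
$$\begin{aligned}
\int t^n e^{-at}\,dt &= -\frac{n!}{a^{n+1}}\, E_n^a(t)\, e^{-at} \\
&= -\frac{n!}{a^{n+1}} \Bigl(1 + at + \frac{a^2 t^2}{2!} + \cdots + \frac{a^n t^n}{n!}\Bigr) e^{-at}.
\end{aligned}$$

17. Take $a = -i$ in Exercise 16 to produce formulas for

$\int t^n \cos t\,dt \quad \text{and} \quad \int t^n \sin t\,dt.$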

Exercises on $x^r$

In §1, we defined $x^r$ for $x > 0$ and $r \in \mathbb{Q}$. Now we define $x^r$ for $x > 0$ and $r \in \mathbb{C}$, as follows:

(5.46) $x^r = e^{r \log x}.$

18. Show that if $r = n \in \mathbb{N}$, (5.46) yields $x^n = x \cdots x$ ($n$ factors).

19. Show that if $r = 1/n$, $x^{1/n}$ defined by (5.46) satisfies

$x = x^{1/n} \cdots x^{1/n}$ ($n$ factors),

and deduce that $x^{1/n}$, defined by (5.46), coincides with $x^{1/n}$ as defined in §1.

20. Show that $x^r$, defined by (5.46), coincides with $x^r$ as defined in §1, for all $r \in \mathbb{Q}$.

21. Show that, for $x > 0$,

$x^{r+s} = x^r x^s, \quad \text{and} \quad (x^r)^s = x^{rs}, \quad \forall\, r, s \in \mathbb{C}.$


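22. Show that, given $r \in \mathbb{C}$,

$\frac{d}{dx} x^r = r x^{r-1}, \quad \forall\, x > 0.$

22A. For $y > 0$, evaluate $\int_0^y \cos(\log x)\,dx$ and $\int_0^y \sin(\log x)\,dx$.
Hint. Deduce from (5.46) and Euler’s formula that

$\cos(\log x) + i \sin(\log x) = x^i.$

Use the result of Exercise 22 to integrate $x^i$.

23. Show that, given $r, r_j \in \mathbb{C}$, $x > 0$,

$r_j \to r \implies x^{r_j} \to x^r.$

24. Given $a > 0$, compute

$\frac{d}{dx} a^x, \quad x \in \mathbb{R}.$

25. Compute
$\frac{d}{dx} x^x, \quad x > 0.$

26. Prove that

$x^{1/x} \longrightarrow 1, \quad \text{as } x \to \infty.$
Hint. Show that
$\frac{\log x}{x} \longrightarrow 0, \quad \text{as } x \to \infty.$

27. Verify that
$$\begin{aligned}
\int_0^1 x^x\,dx &= \int_0^1 e^{x \log x}\,dx \\
&= \int_0^\infty e^{-y e^{-y}} e^{-y}\,dy \\
&= \sum_{n=0}^{\infty} \int_0^\infty \frac{(-1)^n}{n!}\, y^n e^{-(n+1)y}\,dy.
\end{aligned}$$

28. Show that, if $\alpha > 0$, $n \in \mathbb{N}$,

$\int_0^\infty y^n e^{-\alpha y}\,dy = (-1)^n F^{(n)}(\alpha),$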


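where
$F(\alpha) = \int_0^\infty e^{-\alpha y}\,dy = \frac{1}{\alpha}.$

29. Using Exercises 27–28, show that

$$\begin{aligned}
\int_0^1 x^x\,dx &= \sum_{n=0}^{\infty} (-1)^n (n + 1)^{-(n+1)} \\
&= 1 - \frac{1}{2^2} + \frac{1}{3^3} - \frac{1}{4^4} + \cdots.
\end{aligned}$$

Some special series

30. Using (5.45), show that

$\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} = \log 2.$

Hint. Using properties of alternating series, show that if $t \in (0, 1)$,

$\sum_{k=1}^{N} \frac{(-1)^{k-1}}{k}\, t^k = \log(1 + t) + r_N(t), \quad |r_N(t)| \le \frac{t^{N+1}}{N + 1},$

and let $t \nearrow 1$.

31. Using the result of Exercise 5, show that

$\sum_{k=0}^{\infty} \frac{(-1)^k}{2k + 1} = \frac{\pi}{4}.$

Hint. Exercise 5 implies

$\tan^{-1} y = \sum_{k=0}^{\infty} \frac{(-1)^k}{2k + 1}\, y^{2k+1}, \quad \text{for } -1 < y < 1.$

Use an argument like that suggested for Exercise 30, taking $y \nearrow 1$.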

Alternative approach to exponentials and logs

An alternative approach is to define log : (0, ∞) → R first and derive some of its properties,


and then define the exponential function $\operatorname{Exp} : \mathbb{R} \to (0, \infty)$ as its inverse. The following exercises describe how to implement this. To start, we take (5.27) as a definition:
(5.47) $\log x = \int_1^x \frac{dy}{y}, \quad x > 0.$

32. Using (5.47), show that

(5.48) $\log(xy) = \log x + \log y, \quad \forall\, x, y > 0.$

Also show
(5.49) $\log \frac{1}{x} = -\log x, \quad \forall\, x > 0.$

33. Show from (5.47) that

(5.50) $\frac{d}{dx} \log x = \frac{1}{x}, \quad x > 0.$

34. Show that $\log x \to +\infty$ as $x \to +\infty$.
(Hint. See the hint for Exercise 15 in §2.)
Then show that $\log x \to -\infty$ as $x \to 0$.

35. Deduce from Exercises 33 and 34, together with Theorem 1.3, that

$\log : (0, \infty) \longrightarrow \mathbb{R}$ is one-to-one and onto,

with a differentiable inverse. We denote the inverse function

$\operatorname{Exp} : \mathbb{R} \longrightarrow (0, \infty)$, also set $e^t = \operatorname{Exp}(t)$.

36. Deduce from Exercise 32 that

(5.51) $e^{s+t} = e^s e^t, \quad \forall\, s, t \in \mathbb{R}.$

Note. (5.51) is a special case of (5.17).

37. Deduce from (5.50) and Theorem 1.3 that

(5.52) $\frac{d}{dt} e^t = e^t, \quad \forall\, t \in \mathbb{R}.$


As a consequence,

(5.53) $\frac{d^n}{dt^n} e^t = e^t, \quad \forall\, t \in \mathbb{R},\ n \in \mathbb{N}.$

38. Note that $e^0 = 1$, since $\log 1 = 0$. Deduce from (5.53), together with the power series formulas (3.34) and (3.40), that, for all $t \in \mathbb{R}$, $n \in \mathbb{N}$,
(5.54) $e^t = \sum_{k=0}^{n} \frac{1}{k!}\, t^k + R_n(t),$

where

(5.55) $R_n(t) = \frac{t^{n+1}}{(n + 1)!}\, e^{\zeta_n},$

for some $\zeta_n$ between 0 and $t$.

39. Deduce from Exercise 38 that

(5.56) $e^t = \sum_{k=0}^{\infty} \frac{1}{k!}\, t^k, \quad \forall\, t \in \mathbb{R}.$

Remark. Exercises 35–39 develop $e^t$ only for $t \in \mathbb{R}$. At this point, it is natural to segue to (5.6) and from there to arguments involving (5.7)–(5.17), and then on to (5.29)–(5.41), renewing contact with the trigonometric functions.

Some more trigonometric integrals

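These exercises treat integrals of the form

(5.57) $\int R(\cos\theta, \sin\theta)\,d\theta.$

40. Using the substitution $x = \tan \theta/2$, show that

$d\theta = 2\,\frac{dx}{1 + x^2}, \quad \cos\theta = \frac{1 - x^2}{1 + x^2}, \quad \sin\theta = \frac{2x}{1 + x^2}.$

Hint. With $\alpha = \theta/2$, use

$\cos 2\alpha = 2\cos^2\alpha - 1, \quad \text{and} \quad \sec^2\alpha = 1 + \tan^2\alpha.$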


41. Deduce that (5.57) converts to

(5.58) $2 \int R\Bigl(\frac{1 - x^2}{1 + x^2}, \frac{2x}{1 + x^2}\Bigr) \frac{dx}{1 + x^2}.$

42. Use this approach to compute

$\int \frac{1}{\sin\theta}\,d\theta, \quad \text{and} \quad \int \frac{1}{\cos\theta}\,d\theta.$

Compare the second result with that from Exercise 13.

43. Use the substitution $t = \sin\theta$ to show that, for $k \in \mathbb{Z}^+$,

$\int \sec^{2k+1}\theta\,d\theta = \int \frac{dt}{(1 - t^2)^{k+1}}.$

Compare what you get by the methods of Exercises 40–42, and also (for $k = 0, 1$) those of Exercise 13.
Hint. $\sec^{2k+1}\theta = (\cos\theta)/(1 - \sin^2\theta)^{k+1}$.


6. Unbounded integrable functions

There are lots of unbounded functions we would like to be able to integrate. For example, consider $f(x) = x^{-1/2}$ on $(0, 1]$ (defined any way you like at $x = 0$). Since, for $\varepsilon \in (0, 1)$,
(6.1) $\int_\varepsilon^1 x^{-1/2}\,dx = 2 - 2\sqrt{\varepsilon},$

this has a limit as $\varepsilon \searrow 0$, and it is natural to set

(6.2) $\int_0^1 x^{-1/2}\,dx = 2.$

Sometimes (6.2) is called an “improper integral,” but we do not consider that to be a proper designation. Here, we define a class $\mathcal{R}^\#(I)$ of not necessarily bounded “integrable” functions on an interval $I = [a, b]$, as follows.
First, assume $f \ge 0$ on $I$, and for $A \in (0, \infty)$, set

(6.3) $f_A(x) = \begin{cases} f(x) & \text{if } f(x) \le A, \\ A & \text{if } f(x) > A. \end{cases}$

We say $f \in \mathcal{R}^\#(I)$ provided

(6.4) $f_A \in \mathcal{R}(I),\ \forall\, A < \infty, \quad \text{and} \quad \exists \text{ uniform bound } \int_I f_A\,dx \le M.$

If $f \ge 0$ satisfies (6.4), then $\int_I f_A\,dx$ increases monotonically to a finite limit as $A \nearrow +\infty$, and we call the limit $\int_I f\,dx$:
(6.5) $\int_I f_A\,dx \nearrow \int_I f\,dx, \quad \text{for } f \in \mathcal{R}^\#(I),\ f \ge 0.$
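For the example $f(x) = x^{-1/2}$ on $(0, 1]$, the truncations in (6.3) can be integrated by hand: for $A \ge 1$, $f_A = A$ on $(0, 1/A^2)$ and $f_A = f$ on $[1/A^2, 1]$, so $\int_0^1 f_A\,dx = 1/A + (2 - 2/A) = 2 - 1/A$, which increases to 2 as in (6.5). The sketch below checks this numerically; the midpoint rule and the names `truncated` and `midpoint_integral` are assumptions made for illustration.

```python
def truncated(f, A):
    """The truncation f_A of (6.3), for f >= 0: f_A(x) = min(f(x), A)."""
    return lambda x: min(f(x), A)

def midpoint_integral(g, a, b, n=100000):
    """Midpoint-rule approximation; it never evaluates g at the endpoint a."""
    h = (b - a) / n
    return sum(g(a + (j + 0.5) * h) for j in range(n)) * h

def f(x):
    return x ** -0.5

for A in (1, 10, 100):
    numeric = midpoint_integral(truncated(f, A), 0.0, 1.0)
    print(A, numeric, 2 - 1 / A)  # closed form 2 - 1/A, increasing to 2
```

Each $f_A$ is bounded and continuous on $(0, 1]$, so ordinary quadrature applies to it even though $f$ itself is unbounded.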

We also use the notation $\int_a^b f\,dx$, if $I = [a, b]$. If $I$ is understood, we might just write $\int f\,dx$. It is valuable to have the following.
Proposition 6.1. If $f, g : I \to \mathbb{R}^+$ are in $\mathcal{R}^\#(I)$, then $f + g \in \mathcal{R}^\#(I)$, and
(6.6) $\int_I (f + g)\,dx = \int_I f\,dx + \int_I g\,dx.$


Proof. To start, note that $(f + g)_A \le f_A + g_A$. In fact,

(6.7) $(f + g)_A = (f_A + g_A)_A.$

Hence $(f + g)_A \in \mathcal{R}(I)$ and $\int (f + g)_A\,dx \le \int f_A\,dx + \int g_A\,dx \le \int f\,dx + \int g\,dx$, so we have $f + g \in \mathcal{R}^\#(I)$ and
(6.8) $\int (f + g)\,dx \le \int f\,dx + \int g\,dx.$

On the other hand, if $B > 2A$, then $(f + g)_B \ge f_A + g_A$, so

(6.9) $\int (f + g)\,dx \ge \int f_A\,dx + \int g_A\,dx,$

for all $A < \infty$, and hence

(6.10) $\int (f + g)\,dx \ge \int f\,dx + \int g\,dx.$

Together, (6.8) and (6.10) yield (6.6).

Next, we take $f : I \to \mathbb{R}$ and set

(6.11) $f = f^+ - f^-, \quad f^+(x) = \begin{cases} f(x) & \text{if } f(x) \ge 0, \\ 0 & \text{if } f(x) < 0. \end{cases}$

Then we say

(6.12) $f \in \mathcal{R}^\#(I) \iff f^+, f^- \in \mathcal{R}^\#(I),$

and set
(6.13) $\int_I f\,dx = \int_I f^+\,dx - \int_I f^-\,dx,$

where the two terms on the right are defined as in (6.5). To extend the additivity, we begin as follows.
Proposition 6.2. Assume that $g \in \mathcal{R}^\#(I)$ and that $g_j \ge 0$, $g_j \in \mathcal{R}^\#(I)$, and

(6.14) $g = g_0 - g_1.$

Then
(6.15) $\int g\,dx = \int g_0\,dx - \int g_1\,dx.$


Proof. Take $g = g^+ - g^-$ as in (6.11). Then (6.14) implies

(6.16) $g^+ + g_1 = g_0 + g^-,$

which by Proposition 6.1 yields

(6.17) $\int g^+\,dx + \int g_1\,dx = \int g_0\,dx + \int g^-\,dx.$

This implies
(6.18) $\int g^+\,dx - \int g^-\,dx = \int g_0\,dx - \int g_1\,dx,$

which yields (6.15).

We now extend additivity.
Proposition 6.3. Assume $f_1, f_2 \in \mathcal{R}^\#(I)$. Then $f_1 + f_2 \in \mathcal{R}^\#(I)$ and
(6.19) $\int_I (f_1 + f_2)\,dx = \int_I f_1\,dx + \int_I f_2\,dx.$

Proof. If $g = f_1 + f_2 = (f_1^+ - f_1^-) + (f_2^+ - f_2^-)$, then

(6.20) $g = g_0 - g_1, \quad g_0 = f_1^+ + f_2^+, \quad g_1 = f_1^- + f_2^-.$

We have $g_j \in \mathcal{R}^\#(I)$, and then

(6.21) $$\begin{aligned}
\int (f_1 + f_2)\,dx &= \int g_0\,dx - \int g_1\,dx \\
&= \int (f_1^+ + f_2^+)\,dx - \int (f_1^- + f_2^-)\,dx \\
&= \int f_1^+\,dx + \int f_2^+\,dx - \int f_1^-\,dx - \int f_2^-\,dx,
\end{aligned}$$

the first equality by Proposition 6.2, the second tautologically, and the third by Proposition 6.1. Since
(6.22) $\int f_j\,dx = \int f_j^+\,dx - \int f_j^-\,dx,$

this gives (6.19).

If $f : I \to \mathbb{C}$, we set $f = f_1 + i f_2$, $f_j : I \to \mathbb{R}$, and say $f \in \mathcal{R}^\#(I)$ if and only if $f_1$ and $f_2$ belong to $\mathcal{R}^\#(I)$. Then we set
(6.23) $\int f\,dx = \int f_1\,dx + i \int f_2\,dx.$


Similar comments apply to $f : I \to \mathbb{R}^n$.

Given $f \in \mathcal{R}^\#(I)$, we set
(6.24) $\|f\|_{L^1(I)} = \int_I |f(x)|\,dx.$

We have, for $f, g \in \mathcal{R}^\#(I)$, $a \in \mathbb{C}$,

(6.25) $\|af\|_{L^1(I)} = |a|\,\|f\|_{L^1(I)},$

and
(6.26) $$\begin{aligned}
\|f + g\|_{L^1(I)} &= \int_I |f + g|\,dx \\
&\le \int_I (|f| + |g|)\,dx \\
&= \|f\|_{L^1(I)} + \|g\|_{L^1(I)}.
\end{aligned}$$

Note that, if $S \subset I$,
(6.27) $\operatorname{cont}^+(S) = 0 \implies \int_I |\chi_S|\,dx = 0,$

where $\operatorname{cont}^+(S)$ is defined by (2.21). Thus, to get a metric, we need to form equivalence classes. The set of equivalence classes $[f]$ of elements of $\mathcal{R}^\#(I)$, where
(6.28) $f \sim \tilde{f} \iff \int_I |f - \tilde{f}|\,dx = 0,$

forms a metric space, with distance function

(6.29) $D([f], [g]) = \|f - g\|_{L^1(I)}.$

However, this metric space is not complete. One needs the Lebesgue integral to obtain a complete metric space. One can see [Fol] or [T1].
We next show that each $f \in \mathcal{R}^\#(I)$ can be approximated in $L^1$ by a sequence of bounded, Riemann integrable functions.
Proposition 6.4. If $f \in \mathcal{R}^\#(I)$, then there exist $f_k \in \mathcal{R}(I)$ such that

(6.30) $\|f - f_k\|_{L^1(I)} \longrightarrow 0, \quad \text{as } k \to \infty.$


Proof. If we separately approximate $\operatorname{Re} f$ and $\operatorname{Im} f$ by such sequences, then we approximate $f$, so it suffices to treat the case where $f$ is real. Similarly, writing $f = f^+ - f^-$, we see that it suffices to treat the case where $f \ge 0$ on $I$. For such $f$, simply take

(6.31) $f_k = f_A, \quad A = k,$

with $f_A$ as in (6.3). Then (6.5) implies

(6.32) $\int_I f_k\,dx \nearrow \int_I f\,dx,$

and Proposition 6.3 gives

(6.33) $$\begin{aligned}
\int_I |f - f_k|\,dx &= \int_I (f - f_k)\,dx \\
&= \int_I f\,dx - \int_I f_k\,dx \\
&\to 0 \quad \text{as } k \to \infty.
\end{aligned}$$

So far, we have dealt with integrable functions on a bounded interval. Now, we say $f : \mathbb{R} \to \mathbb{R}$ (or $\mathbb{C}$, or $\mathbb{R}^n$) belongs to $\mathcal{R}^\#(\mathbb{R})$ provided $f|_I \in \mathcal{R}^\#(I)$ for each closed, bounded interval $I \subset \mathbb{R}$ and
(6.34) $\exists\, A < \infty \text{ such that } \int_{-R}^{R} |f|\,dx \le A, \quad \forall\, R < \infty.$

In such a case, we set

(6.35) $\int_{-\infty}^{\infty} f\,dx = \lim_{R \to \infty} \int_{-R}^{R} f\,dx.$

One can similarly define $\mathcal{R}^\#(\mathbb{R}^+)$.

Exercises

1. Let $f : [0, 1] \to \mathbb{R}^+$ and assume $f$ is continuous on $(0, 1]$. Show that

$f \in \mathcal{R}^\#([0, 1]) \iff \int_\varepsilon^1 f\,dx \text{ is bounded as } \varepsilon \searrow 0.$


In such a case, show that

$\int_0^1 f\,dx = \lim_{\varepsilon \to 0} \int_\varepsilon^1 f\,dx.$

2. Let $a > 0$. Define $p_a : [0, 1] \to \mathbb{R}$ by $p_a(x) = x^{-a}$ if $0 < x \le 1$. Set $p_a(0) = 0$. Show that

$p_a \in \mathcal{R}^\#([0, 1]) \iff a < 1.$

3. Let $b > 0$. Define $q_b : [0, 1/2] \to \mathbb{R}$ by

$q_b(x) = \frac{1}{x |\log x|^b},$

if $0 < x \le 1/2$. Set $q_b(0) = 0$. Show that

$q_b \in \mathcal{R}^\#([0, 1/2]) \iff b > 1.$

4. Show that if $a \in \mathbb{C}$ and if $f \in \mathcal{R}^\#(I)$, then

$af \in \mathcal{R}^\#(I), \quad \text{and} \quad \int af\,dx = a \int f\,dx.$

Hint. Check this for $a > 0$, $a = -1$, and $a = i$.

5. Show that
$f \in \mathcal{R}(I),\ g \in \mathcal{R}^\#(I) \implies fg \in \mathcal{R}^\#(I).$
Hint. Use (2.53). First treat the case $f, g \ge 1$, $f \le M$. Show that in such a case,

$(fg)_A = (f_A g_A)_A, \quad \text{and} \quad (fg)_A \le M g_A.$

6. Compute
$\int_0^1 \log t\,dt.$
Hint. To compute $\int_\varepsilon^1 \log t\,dt$, first compute

$\frac{d}{dt}(t \log t).$

7. Given $g \in \mathcal{R}(I)$, show that there exist $g_k \in \mathrm{PK}(I)$ such that

$\|g - g_k\|_{L^1(I)} \longrightarrow 0.$


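Given $h \in \mathrm{PK}(I)$, show that there exist $h_k \in C(I)$ such that

$\|h - h_k\|_{L^1(I)} \longrightarrow 0.$

8. Using Exercise 7 and Proposition 6.4, prove the following: given $f \in \mathcal{R}^\#(I)$, there exist $f_k \in C(I)$ such that
$\|f - f_k\|_{L^1(I)} \longrightarrow 0.$

9. Recall Exercise 4 of §2. If $\varphi : [a, b] \to [A, B]$ is $C^1$, with $\varphi'(x) > 0$ for all $x \in [a, b]$, then
(6.36) $\int_A^B f(y)\,dy = \int_a^b f(\varphi(t))\,\varphi'(t)\,dt,$

for each $f \in C([A, B])$, where $A = \varphi(a)$, $B = \varphi(b)$. Using Exercise 8, show that (6.36) holds for each $f \in \mathcal{R}^\#([A, B])$.

10. If $f \in \mathcal{R}^\#(\mathbb{R})$, so (6.34) holds, prove that the limit exists in (6.35).

11. Given $f(x) = x^{-1/2}(1 + x^2)^{-1}$ for $x > 0$, show that $f \in \mathcal{R}^\#(\mathbb{R}^+)$. Show that
$\int_0^\infty \frac{1}{1 + x^2}\, \frac{dx}{\sqrt{x}} = 2 \int_0^\infty \frac{dy}{1 + y^4}.$

12. Let $f_k \in \mathcal{R}^\#([a, b])$, $f : [a, b] \to \mathbb{R}$ satisfy

(a) $|f_k| \le g$, $\forall\, k$, for some $g \in \mathcal{R}^\#([a, b])$,

(b) Given $\varepsilon > 0$, $\exists$ contented $S_\varepsilon \subset [a, b]$ such that
$\int_{S_\varepsilon} g\,dx < \varepsilon$, and $f_k \to f$ uniformly on $[a, b] \setminus S_\varepsilon$.

Show that $f \in \mathcal{R}^\#([a, b])$ and

$\int_a^b f_k(x)\,dx \longrightarrow \int_a^b f(x)\,dx, \quad \text{as } k \to \infty.$

13. Let $g \in \mathcal{R}^\#([a, b])$ be $\ge 0$. Show that for each $\varepsilon > 0$, there exists $\delta > 0$ such that
$S \subset [a, b] \text{ contented},\ \operatorname{cont} S < \delta \implies \int_S g\,dx < \varepsilon.$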

Hint. With $g_A$ defined as in (6.3), pick $A$ such that $\int g_A\,dx \ge \int g\,dx - \varepsilon/2$. Then pick $\delta < \varepsilon/(2A)$.

14. Deduce from Exercises 12–13 the following. Let $f_k \in \mathcal{R}^\#([a, b])$, $f : [a, b] \to \mathbb{R}$ satisfy

(a) $|f_k| \le g$, $\forall\, k$, for some $g \in \mathcal{R}^\#([a, b])$,

(b) Given $\delta > 0$, $\exists$ contented $S_\delta \subset [a, b]$ such that
$\operatorname{cont} S_\delta < \delta$, and $f_k \to f$ uniformly on $[a, b] \setminus S_\delta$.

Show that $f \in \mathcal{R}^\#([a, b])$ and

$\int_a^b f_k(x)\,dx \longrightarrow \int_a^b f(x)\,dx, \quad \text{as } k \to \infty.$

Remark. Compare Exercise 18 of §2. As mentioned there, the Lebesgue theory of integration has a stronger result, known as the Lebesgue dominated convergence theorem.


A. The fundamental theorem of algebra

The following result is the fundamental theorem of algebra.

Theorem A.1. If $p(z)$ is a nonconstant polynomial (with complex coefficients), then $p(z)$ must have a complex root.
Proof. We have, for some $n \ge 1$, $a_n \ne 0$,

(A.1) $$\begin{aligned}
p(z) &= a_n z^n + \cdots + a_1 z + a_0 \\
&= a_n z^n \bigl(1 + O(z^{-1})\bigr), \quad |z| \to \infty,
\end{aligned}$$

which implies

(A.2) $\lim_{|z| \to \infty} |p(z)| = \infty.$

Picking $R \in (0, \infty)$ such that

(A.3) $\inf_{|z| \ge R} |p(z)| > |p(0)|,$

we deduce that

(A.4) $\inf_{|z| \le R} |p(z)| = \inf_{z \in \mathbb{C}} |p(z)|.$

Since $D_R = \{z : |z| \le R\}$ is compact and $p$ is continuous, there exists $z_0 \in D_R$ such that

(A.5) $|p(z_0)| = \inf_{z \in \mathbb{C}} |p(z)|.$

The theorem hence follows from:

Lemma A.2. If $p(z)$ is a nonconstant polynomial and (A.5) holds, then $p(z_0) = 0$.
Proof. Suppose to the contrary that

(A.6) $p(z_0) = a \ne 0.$

We can write

(A.7) $p(z_0 + \zeta) = a + q(\zeta),$

where $q(\zeta)$ is a (nonconstant) polynomial in $\zeta$, satisfying $q(0) = 0$. Hence, for some $k \ge 1$ and $b \ne 0$, we have $q(\zeta) = b\zeta^k + \cdots + b_n \zeta^n$, i.e.,

(A.8) $q(\zeta) = b\zeta^k + O(\zeta^{k+1}), \quad \zeta \to 0,$


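so, uniformly on $S^1 = \{\omega : |\omega| = 1\}$,

(A.9) $p(z_0 + \varepsilon\omega) = a + b\,\omega^k \varepsilon^k + O(\varepsilon^{k+1}), \quad \varepsilon \searrow 0.$

Pick $\omega \in S^1$ such that

(A.10) $\frac{b}{|b|}\,\omega^k = -\frac{a}{|a|},$

which is possible since $a \ne 0$ and $b \ne 0$. In more detail, since $-(a/|a|)(|b|/b) \in S^1$, Euler’s identity implies
$-\frac{a}{|a|}\,\frac{|b|}{b} = e^{i\theta},$
for some $\theta \in \mathbb{R}$, so we can take
$\omega = e^{i\theta/k}.$
Given (A.10),
(A.11) $p(z_0 + \varepsilon\omega) = a\Bigl(1 - \Bigl|\frac{b}{a}\Bigr| \varepsilon^k\Bigr) + O(\varepsilon^{k+1}),$
which contradicts (A.5) for $\varepsilon > 0$ small enough. Thus (A.6) is impossible. This proves Lemma A.2, hence Theorem A.1.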
Now that we have shown that $p(z)$ in (A.1) must have one root, we can show it has $n$ roots (counting multiplicity).
Proposition A.3. For a polynomial $p(z)$ of degree $n$, as in (A.1), there exist $r_1, \dots, r_n \in \mathbb{C}$ such that

(A.12) $p(z) = a_n (z - r_1) \cdots (z - r_n).$

Proof. We have shown that $p(z)$ has one root; call it $r_1$. Dividing $p(z)$ by $z - r_1$, we have

(A.13) $p(z) = (z - r_1)\,\tilde{p}(z) + q,$

where $\tilde{p}(z) = a_n z^{n-1} + \cdots + \tilde{a}_0$ and $q$ is a polynomial of degree $< 1$, i.e., a constant. Setting $z = r_1$ in (A.13) yields $q = 0$, so

(A.14) $p(z) = (z - r_1)\,\tilde{p}(z).$

Since $\tilde{p}(z)$ is a polynomial of degree $n - 1$, the result (A.12) follows by induction on $n$.
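The division step (A.13) can be carried out concretely by synthetic division (Horner’s scheme), a standard technique that the text does not name; it is swapped in here purely as an illustration. The function name `divide_by_linear` and the sample cubic are hypothetical.

```python
def divide_by_linear(coeffs, r):
    """Divide p(z) by (z - r), as in (A.13).

    coeffs lists the coefficients of p from highest degree to lowest.
    Returns (quotient coefficients, remainder q); the remainder equals p(r).
    """
    partial = []
    acc = 0
    for c in coeffs:
        acc = acc * r + c  # Horner step
        partial.append(acc)
    return partial[:-1], partial[-1]

# Sample polynomial (an assumption of this sketch):
# p(z) = z^3 - 6z^2 + 11z - 6 = (z - 1)(z - 2)(z - 3)
p = [1, -6, 11, -6]
quotient, remainder = divide_by_linear(p, 1)
print(quotient, remainder)       # quotient z^2 - 5z + 6, remainder 0, as in (A.14)
print(divide_by_linear(p, 4))    # r = 4 is not a root, so the remainder is p(4)
```

Repeating the step on the quotient with the next root mirrors the induction in the proof of Proposition A.3; the sketch assumes the roots are already known and does not find them.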


The numbers $r_j$, $1 \le j \le n$, in (A.12) are called the roots of $p(z)$. If $k$ of them coincide (say with $r_\ell$) we say $r_\ell$ is a root of multiplicity $k$. If $r_\ell$ is distinct from $r_j$ for all $j \ne \ell$, we say $r_\ell$ is a simple root.

AMS Open Math Notes: Works in Progress; Reference # OMN:201701.110664; 2017-01-10 15:54:33
180

B. π^2 is irrational

The following proof that π^2 is irrational follows a classic argument of I. Niven, [Niv].
The idea is to consider

(B.1)  $I_n = \int_0^\pi \varphi_n(x) \sin x \, dx, \qquad \varphi_n(x) = \frac{1}{n!}\, x^n (\pi - x)^n.$

Clearly I_n > 0 for each n ∈ N, and I_n → 0 very fast, faster than geometrically. Indeed, since
0 ≤ x(π − x) ≤ (π/2)^2 on [0, π] and the integral of sin x over [0, π] equals 2,

(B.1A)  $0 < I_n < \frac{2}{n!} \Bigl(\frac{\pi}{2}\Bigr)^{2n}.$

The next key fact, to be established below, is that I_n is a polynomial of degree at most n in π^2
with integer coefficients:

(B.2)  $I_n = \sum_{k=0}^{n} c_{nk} \pi^{2k}, \qquad c_{nk} \in \mathbb{Z}.$

Given this it follows readily that π^2 is irrational. In fact, if π^2 = a/b, a, b ∈ N, then
multiplying (B.2) by b^n gives

(B.3)  $\sum_{k=0}^{n} c_{nk} a^{k} b^{n-k} = b^{n} I_n.$

But the left side of (B.3) is an integer for each n, while by the estimate (B.1A), the
right side belongs to the interval (0, 1) for large n, yielding a contradiction. It remains to
establish (B.2).
A method of computing the integral in (B.1), which works for any polynomial ϕn (x) is
the following. One looks for an antiderivative of the form

(B.4)  $G_n(x) \sin x - F_n(x) \cos x,$

where F_n and G_n are polynomials. One needs

(B.5)  $G_n(x) = F_n'(x), \qquad G_n'(x) + F_n(x) = \varphi_n(x),$

hence

(B.6)  $F_n''(x) + F_n(x) = \varphi_n(x).$

One can exploit the nilpotence of ∂_x^2 on the space of polynomials of degree ≤ 2n and set

(B.7)  $F_n(x) = (I + \partial_x^2)^{-1} \varphi_n(x) = \sum_{k=0}^{n} (-1)^k \varphi_n^{(2k)}(x).$

Then

(B.8)  $\frac{d}{dx} \bigl(F_n'(x) \sin x - F_n(x) \cos x\bigr) = \varphi_n(x) \sin x.$

Integrating (B.8) over x ∈ [0, π] gives

(B.9)  $\int_0^\pi \varphi_n(x) \sin x \, dx = F_n(0) + F_n(\pi) = 2F_n(0),$

the last identity holding for φ_n(x) as in (B.1) because then φ_n(π − x) = φ_n(x) and hence
F_n(π − x) = F_n(x). For the first identity in (B.9), we use the defining property that
sin π = 0 while cos π = −1.
In light of (B.7), to prove (B.2) it suffices to establish an analogous property for φ_n^{(2k)}(0).
Comparing the binomial formula and Taylor's formula for φ_n(x):

(B.10)  $\varphi_n(x) = \frac{1}{n!} \sum_{\ell=0}^{n} (-1)^\ell \binom{n}{\ell} \pi^{n-\ell} x^{n+\ell}, \quad \text{and} \quad \varphi_n(x) = \sum_{k=0}^{2n} \frac{1}{k!}\, \varphi_n^{(k)}(0)\, x^k,$

we see that

(B.11)  $k = n + \ell \;\Rightarrow\; \varphi_n^{(k)}(0) = (-1)^\ell\, \frac{(n+\ell)!}{n!} \binom{n}{\ell} \pi^{n-\ell},$

so

(B.12)  $2k = n + \ell \;\Rightarrow\; \varphi_n^{(2k)}(0) = (-1)^n\, \frac{(n+\ell)!}{n!} \binom{n}{\ell} \pi^{2(k-\ell)}.$

Of course φ_n^{(2k)}(0) = 0 for 2k < n. Clearly the multiple of π^{2(k−ℓ)} in (B.12) is an integer.
In fact,

(B.13)  $\frac{(n+\ell)!}{n!} \binom{n}{\ell} = \frac{(n+\ell)!}{n!}\, \frac{n!}{\ell!\,(n-\ell)!} = \frac{(n+\ell)!}{n!\,\ell!}\, \frac{n!}{(n-\ell)!} = \binom{n+\ell}{n}\, n(n-1) \cdots (n-\ell+1).$

Thus (B.2) is established, and the proof that π^2 is irrational is complete.
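As a sanity check on (B.7), (B.9) and (B.12), one can compute the integer coefficients of I_n and compare with a direct numerical integration of (B.1). The sketch below is illustrative only; the function names and the use of Simpson's rule with 2000 subintervals are assumptions, not part of the argument.

```python
from math import comb, factorial, pi, sin

def niven_coeffs(n):
    """Return {j: c} with I_n = sum_j c * pi**(2*j), using (B.9), (B.7), (B.12)."""
    coeffs = {}
    for k in range((n + 1) // 2, n + 1):   # phi_n^(2k)(0) = 0 unless n <= 2k <= 2n
        l = 2 * k - n                      # so that 2k = n + l
        mult = factorial(n + l) // factorial(n) * comb(n, l)   # the integer in (B.13)
        sign = (-1) ** (k + n)             # (-1)^k from (B.7) times (-1)^n from (B.12)
        j = k - l                          # exponent of pi^2 in (B.12)
        coeffs[j] = coeffs.get(j, 0) + 2 * sign * mult         # factor 2 from (B.9)
    return coeffs

def niven_integral(n, steps=2000):
    """Simpson approximation to I_n in (B.1); steps must be even."""
    h = pi / steps
    f = lambda x: x ** n * (pi - x) ** n * sin(x) / factorial(n)
    total = f(0.0) + f(pi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

for n in range(1, 6):
    exact = sum(c * pi ** (2 * j) for j, c in niven_coeffs(n).items())
    print(n, niven_coeffs(n), exact, niven_integral(n))
```

For instance, the recipe gives I_1 = 4 and I_2 = 24 − 2π^2, which can be confirmed by hand from F_1 = φ_1 − φ_1'' and F_2 = φ_2 − φ_2'' + φ_2''''.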


C. More on (1 − x)^b

In §3 we showed that

(C.1)  $(1 - x)^b = \sum_{k=0}^{\infty} \frac{a_k}{k!}\, x^k,$

for |x| < 1, with

(C.2)  $a_0 = 1, \qquad a_k = \prod_{\ell=0}^{k-1} (-b + \ell), \quad \text{for } k \ge 1.$

There we required b ∈ Q, but in §5 we defined y^b, for y > 0, for all b ∈ R (and for y ≥ 0 if
b > 0), and noted that such a result extends. Here, we prove a further result, when b > 0.
Proposition C.1. Given b > 0, a_k as in (C.2), the identity (C.1) holds for x ∈ [−1, 1],
and the series converges absolutely and uniformly on [−1, 1].
Proof. Our main task is to show that

(C.3)  $\sum_{k=0}^{\infty} \frac{|a_k|}{k!} < \infty,$

if b > 0. This implies that the right side of (C.1) converges absolutely and uniformly on
[−1, 1] and its limit, g(x), is continuous on [−1, 1]. We already know that g(x) = (1 − x)^b
on (−1, 1), and since both sides are continuous on [−1, 1], the identity also holds at the
endpoints. Now, if k − 1 > b,

(C.4)  $\frac{a_k}{k!} = -\frac{b}{k} \prod_{1 \le \ell \le b} \Bigl(1 - \frac{b}{\ell}\Bigr) \prod_{b < \ell \le k-1} \Bigl(1 - \frac{b}{\ell}\Bigr),$

which we write as (B/k)p_k, where p_k denotes the last product in (C.4). Then

(C.5)  $\log p_k = \sum_{b < \ell \le k-1} \log\Bigl(1 - \frac{b}{\ell}\Bigr) \le -\sum_{b < \ell \le k-1} \frac{b}{\ell} \le -b \log k + \beta,$

for some β ∈ R. Here, we have used

(C.6)  $\log(1 - r) < -r, \quad \text{for } 0 < r < 1,$

and

(C.7)  $\sum_{\ell=1}^{k-1} \frac{1}{\ell} > \int_1^k \frac{dy}{y}.$

It follows from (C.5) that

(C.8)  $p_k \le e^{-b \log k + \beta} = \gamma k^{-b},$

so

(C.9)  $\frac{|a_k|}{k!} \le |B\gamma|\, k^{-(1+b)},$

giving (C.3).
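Proposition C.1 can be illustrated numerically at the endpoints x = ±1, where convergence is slowest. The following sketch (the function name and the sample value b = 1/2 are assumptions for illustration) uses the ratio a_{k+1}/(k+1)! = (a_k/k!)(k − b)/(k + 1), which follows from (C.2).

```python
def binomial_partial_sum(b, x, terms):
    """Sum of (a_k / k!) * x**k for 0 <= k < terms, with a_k as in (C.2)."""
    total = 0.0
    coefficient = 1.0                      # a_0 / 0! = 1
    power = 1.0                            # x**k
    for k in range(terms):
        total += coefficient * power
        coefficient *= (k - b) / (k + 1)   # passes from a_k/k! to a_{k+1}/(k+1)!
        power *= x
    return total

b = 0.5
print(binomial_partial_sum(b, -1.0, 20000), 2 ** b)    # compare with (1 - (-1))^b
print(binomial_partial_sum(b, 1.0, 20000))             # compare with (1 - 1)^b = 0
```

At x = 1 the tail decays only like a multiple of N^{−b} after N terms, consistent with (C.9), so many terms are needed there; at x = −1 the alternating signs speed things up considerably.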

Exercise

1. Why did we not put this argument in §3?


Hint. logs


D. Archimedes’ approximation to π

Here we discuss an approximation to π proposed by Archimedes. It is based on the fact
(known to the ancient Greeks) that the unit disk D = {(x, y) ∈ R^2 : x^2 + y^2 ≤ 1} has the
property

(D.1) Area D = π.

We have not discussed area in this text. This topic is treated in the companion text [T2].
Actually, (D.1) was originally the definition of π. Here, we have taken (4.31A) as the
definition. To get the equivalence, we appeal to notions from first-year calculus, giving
areas of regions bounded by graphs in terms of integrals. We have
(D.2)  $\operatorname{Area} D = 2\int_{-1}^{1} \sqrt{1 - x^2}\, dx = 2\int_{-\pi/2}^{\pi/2} \cos^2\theta \, d\theta = \int_{-\pi/2}^{\pi/2} (\cos 2\theta + 1)\, d\theta = \pi.$

Here, the second identity follows from the substitution x = sin θ and the third from the
identity

$\cos 2\theta = \cos^2\theta - \sin^2\theta = 2\cos^2\theta - 1,$

a consequence of (5.44), with s = t = θ. One can also get (D.1) by computing areas in
polar coordinates (cf. [T2]).
Having (D.1), Archimedes proceeded as follows. If P_n is a regular n-gon inscribed in
the unit circle, then Area P_n → π as n → ∞, with

(D.3)  $\pi - \frac{c}{n^2} < \operatorname{Area} P_n < \pi.$

See (D.18)–(D.20) below for more on this. Note that such a polygon decomposes into n
equal sized isosceles triangles, with two sides of length 1 meeting at an angle α_n = 2π/n.
Such a triangle T_n has

(D.4)  $\operatorname{Area} T_n = \Bigl(\sin \frac{\alpha_n}{2}\Bigr)\Bigl(\cos \frac{\alpha_n}{2}\Bigr) = \frac{1}{2} \sin \alpha_n,$

so

(D.5)  $\operatorname{Area} P_n = \frac{n}{2} \sin \frac{2\pi}{n}.$

One can obtain an inductive formula for Area P_n for n = 2^k as follows. Set

(D.6)  $S_k = \sin \frac{2\pi}{2^k}, \qquad C_k = \cos \frac{2\pi}{2^k}.$

Then, for example, S_2 = 1, C_2 = 0, and

(D.7)  $(C_{k+1} + iS_{k+1})^2 = C_k + iS_k,$

i.e.,

(D.8)  $C_{k+1}^2 - S_{k+1}^2 = C_k, \qquad 2C_{k+1} S_{k+1} = S_k.$

We are in the position of solving

(D.9)  $x^2 - y^2 = a, \qquad 2xy = b,$

for x and y, knowing that a ≥ 0, b, x, y > 0. We substitute y = b/2x into the first equation,
obtaining

(D.10)  $x^2 - \frac{b^2}{4x^2} = a,$

then set u = x^2 and get

(D.11)  $u^2 - au - \frac{b^2}{4} = 0,$

whose positive solution is

(D.12)  $u = \frac{a}{2} + \frac{1}{2} \sqrt{a^2 + b^2}.$

Then

(D.13)  $x = \sqrt{u}, \qquad y = \frac{b}{2\sqrt{u}}.$

Taking a = C_k, b = S_k, and knowing that C_k^2 + S_k^2 = 1, we obtain

(D.14)  $S_{k+1} = \frac{S_k}{2\sqrt{U_k}},$

with

(D.15)  $U_k = \frac{1 + C_k}{2} = \frac{1 + \sqrt{1 - S_k^2}}{2}.$

Then

(D.16)  $\operatorname{Area} P_{2^k} = 2^{k-1} S_k.$

Alternatively, with P_k = Area P_{2^k}, we have

(D.17)  $P_{k+1} = \frac{P_k}{\sqrt{U_k}}.$

As we show below, π is approximated to 15 digits of accuracy in 25 iterations of (D.14)–(D.17),
starting with S_2 = 1 and P_2 = 2.
First, we take a closer look at the error estimate in (D.3). Note that

(D.18)  $\pi - \operatorname{Area} P_n = \frac{n}{2} \Bigl(\frac{2\pi}{n} - \sin \frac{2\pi}{n}\Bigr),$

and that

(D.19)  $\delta - \sin \delta = \frac{\delta^3}{3!} - \frac{\delta^5}{5!} + \cdots < \frac{\delta^3}{3!}, \quad \text{for } 0 < \delta < 6,$

so

(D.20)  $\pi - \operatorname{Area} P_n < \frac{2\pi^3}{3} \cdot \frac{1}{n^2}, \quad \text{for } n \ge 2.$

Thus we can take c = 2π^3/3 in (D.3) for n ≥ 2, and this is asymptotically sharp.
From (D.20) with n = 2^25, we have

(D.21)  $\pi - P_{25} < \frac{2\pi^3}{3} \cdot 2^{-50}.$

Since

(D.22)  $2^{10} = 1024 \;\Rightarrow\; 2^{50} \approx 10^{15}, \quad \text{and} \quad \frac{2\pi^3}{3} \approx 20,$

we get

(D.23)  $\pi - P_{25} \approx 10^{-14}.$
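The iteration (D.14)–(D.17) is short enough to run directly. The following is a minimal sketch in double precision arithmetic; the function name `archimedes` and the dictionary it returns are assumptions made for illustration.

```python
import math

def archimedes(k_max):
    """Return {k: P_k} for k = 2, ..., k_max, where P_k = Area of the regular 2^k-gon."""
    s, p = 1.0, 2.0                                # S_2 = 1, P_2 = 2 (the inscribed square)
    values = {2: p}
    for k in range(2, k_max):
        u = (1.0 + math.sqrt(1.0 - s * s)) / 2.0   # U_k from (D.15)
        s = s / (2.0 * math.sqrt(u))               # S_{k+1} from (D.14)
        p = p / math.sqrt(u)                       # P_{k+1} from (D.17)
        values[k + 1] = p
    return values

areas = archimedes(25)
for k in (3, 5, 10, 25):
    print(k, areas[k], math.pi - areas[k])
```

Each step needs two square roots, and the printed errors shrink by a factor of about 4 per step, in line with (D.20) for n = 2^k.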

The Archimedes method often gets bad press because the error given in (D.20) decreases
slowly with n. However, given that we take n = 2k and iterate on k, the error actually
decreases exponentially in k. Nevertheless, use of the infinite series suggested in Exercise 7
of §5 has advantages over the use of (D.14)–(D.17), particularly in that it does not require
one to calculate a bunch of square roots.
There is another disadvantage of the iteration (D.14)–(D.17), though it does not show up
in a mere 25 iterations (at least, not if one is using double precision arithmetic). Namely,

any error in the approximate calculation of P_k (compared to its exact value), due for
example to truncation error, can get magnified in the approximate calculation of P_{k+ℓ} for
ℓ ≥ 1. This will ultimately lead to an instability, and a breakdown in the viability of the
iterative method (D.14)–(D.17).
We end this appendix by showing how the approximation to π described here can be
justified without any notion of area. In fact, setting

(D.24)  $A_n = \frac{n}{2} \sin \frac{2\pi}{n},$

(cf. (D.5)), we get

(D.25)  $0 < \pi - A_n < \frac{2\pi^3}{3} \cdot \frac{1}{n^2}, \quad \text{for } n \ge 2,$

directly from (D.19); cf. (D.20). Thus, we can simply set P_k = A_{2^k}, and then the estimates
(D.21)–(D.23) hold, and the iteration (D.14)–(D.17) works, without recourse to area.
In effect, the previous paragraph took the geometry out of Archimedes' approximation
to π. Finally, we note the following variant, bringing in arc length (treated thoroughly in
§4) in place of area. Namely, the perimeter Q_n of the regular n-gon P_n is a union of n line
segments, each being the base of an isosceles triangle with two sides of length 1, meeting
at an angle α_n = 2π/n. Hence each such line segment has length 2 sin(α_n/2), so

(D.26)  $\ell(Q_n) = 2n \sin \frac{\pi}{n}.$

The fact that

(D.27)  $\ell(Q_n) \longrightarrow 2\pi, \quad \text{as } n \to \infty$

follows from the definition (4.31A) together with Proposition 4.1. Note that (D.26) implies

(D.28)  $\ell(Q_n) = 2A_{2n},$

leading us back to Archimedes' approximation.

Note. Actually, Archimedes started with the regular hexagon and proceeded from there
to evaluate P̃_k = Area P_{3·2^k}, for k up to 5. The basic iteration (D.7)–(D.15) also applies
to this case. By (D.20),

$0 < \pi - \operatorname{Area} P_{96} < 0.00225.$

Archimedes' presentation was

$3 + \frac{10}{71} < \pi < 3 + \frac{1}{7}.$

E. Computing π using arctangents

In Exercise 3 of §5, we defined tan t = sin t/ cos t. It is readily verified (via Exercise 4
of §5) that

(E.1)  $\tan : \Bigl(-\frac{\pi}{2}, \frac{\pi}{2}\Bigr) \longrightarrow \mathbb{R}$

is one-to-one and onto, with positive derivative, so it has a smooth inverse

(E.2)  $\tan^{-1} : \mathbb{R} \longrightarrow \Bigl(-\frac{\pi}{2}, \frac{\pi}{2}\Bigr).$

It follows from Exercise 5 of §5 that

(E.3)  $\tan^{-1} x = \int_0^x \frac{ds}{1 + s^2}.$

We can insert the power series for (1 + s^2)^{−1} and integrate term by term to get

(E.4)  $\tan^{-1} x = \sum_{k=0}^{\infty} \frac{(-1)^k}{2k + 1}\, x^{2k+1}, \quad \text{if } -1 < x < 1.$

This provides a way to obtain rapidly convergent series for π, alternative to that proposed
in Exercise 7 of §5, which can be called an evaluation of π using the arcsine.
For a first effort, we use

(E.5)  $\tan \frac{\pi}{6} = \frac{1}{\sqrt{3}},$

which follows from

(E.6)  $\sin \frac{\pi}{6} = \frac{1}{2}, \quad \cos \frac{\pi}{6} = \frac{\sqrt{3}}{2} \iff e^{\pi i/6} = \frac{\sqrt{3}}{2} + \frac{1}{2}\, i,$

compare Exercises 2 and 7 of §5. Now (E.4)–(E.5) yield

(E.7)  $\frac{\pi}{6} = \frac{1}{\sqrt{3}} \sum_{k=0}^{\infty} \frac{(-1)^k}{2k + 1} \Bigl(\frac{1}{3}\Bigr)^k.$

We can compare (E.7) with the series (5.45A) for π. One difference is the occurrence of the
factor 1/√3, which is irrational. To be sure, it is not hard to compute √3 to high precision.
Compare Exercises 8–10 of §3; for a faster method, see the treatment of Newton's method
in §5 of Chapter 5. Nevertheless, the presence of this irrational factor in (E.7) is a bit
of a glitch. Another disadvantage of (E.7) is that this series converges more slowly than
(5.45A).
We can do better by expressing π as a finite linear combination of terms tan^{−1} x_j for
certain fairly small rational numbers x_j. The key to this is the following formula for
tan(a + b). Using (5.44), we have

(E.8)  $\tan(a + b) = \frac{\sin(a + b)}{\cos(a + b)} = \frac{\sin a \cos b + \cos a \sin b}{\cos a \cos b - \sin a \sin b} = \frac{\tan a + \tan b}{1 - \tan a \tan b}.$

Since tan π/4 = 1, we have, for a, b, a + b ∈ (−π/2, π/2),

(E.9)  $\frac{\pi}{4} = a + b \;\Longleftarrow\; \frac{\tan a + \tan b}{1 - \tan a \tan b} = 1.$

Taking a = tan^{−1} x, b = tan^{−1} y gives

(E.10)  $\frac{\pi}{4} = \tan^{-1} x + \tan^{-1} y \;\Longleftarrow\; x + y = 1 - xy \;\Longleftarrow\; x = \frac{1 - y}{1 + y}.$

If we set y = 1/2, we get x = 1/3, so

(E.11)  $\frac{\pi}{4} = \tan^{-1} \frac{1}{3} + \tan^{-1} \frac{1}{2}.$

The power series (E.4) for tan^{−1}(1/3) and tan^{−1}(1/2) both converge faster than (E.7), but
that for tan^{−1}(1/2) converges at essentially the same rate as (5.45A). We might optimise
by taking x = y in (E.10), but that yields x = y = √2 − 1, and we do not want to plug this
irrational number into (E.4). Taking a cue from √2 − 1 ≈ 0.414, we set y = 2/5, which
yields x = 3/7, so

(E.12)  $\frac{\pi}{4} = \tan^{-1} \frac{2}{5} + \tan^{-1} \frac{3}{7}.$

Both resulting power series converge faster than (5.45A), but not by much.
To do better, we bring in a formula for tan(a + 2b). Note that setting a = b in (E.8)
yields

(E.13)  $\tan 2b = \frac{2 \tan b}{1 - \tan^2 b},$

and concatenating this with (E.8) (with b replaced by 2b) yields, after some elementary
calculation,

(E.14)  $\tan(a + 2b) = \frac{\tan a\,(1 - \tan^2 b) + 2 \tan b}{1 - \tan^2 b - 2 \tan a \tan b}.$

Thus, parallel to (E.9),

(E.15)  $\frac{\pi}{4} = a + 2b \;\Longleftarrow\; \frac{\tan a\,(1 - \tan^2 b) + 2 \tan b}{1 - \tan^2 b - 2 \tan a \tan b} = 1.$

Taking a = tan^{−1} x, b = tan^{−1} y gives

(E.16)  $\frac{\pi}{4} = \tan^{-1} x + 2 \tan^{-1} y \;\Longleftarrow\; x(1 - y^2) + 2y = 1 - y^2 - 2xy \;\Longleftarrow\; x = \frac{1 - y^2 - 2y}{1 - y^2 + 2y}.$

Taking y = 1/3 yields x = 1/7, so

(E.17)  $\frac{\pi}{4} = \tan^{-1} \frac{1}{7} + 2 \tan^{-1} \frac{1}{3}.$

Both resulting power series converge significantly faster than (5.45A). Alternatively, we
can take y = 1/4, yielding x = 7/23, so

(E.18)  $\frac{\pi}{4} = \tan^{-1} \frac{7}{23} + 2 \tan^{-1} \frac{1}{4}.$

The power series for tan^{−1}(7/23) converges a little faster than that for tan^{−1}(1/3).
One can go still farther, iterating (E.13) to produce a formula for tan 4b, and concatenating
this with (E.8) to produce a formula for

(E.19)  $\tan(a + 4b).$

An argument somewhat parallel to that involving (E.15)–(E.16) yields identities of the
form

(E.20)  $\frac{\pi}{4} = \tan^{-1} x + 4 \tan^{-1} y,$

including the following, known as Machin's formula:

(E.21)  $\frac{\pi}{4} = 4 \tan^{-1} \frac{1}{5} - \tan^{-1} \frac{1}{239},$

with y = 1/5, x = −1/239. For many years, this was the most popular formula for
high precision approximations to π, until the 1970s, when a more sophisticated method
(actually discovered by Gauss in 1799) became available. For more on this, the reader can
consult Chapter 7 of [AH].
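The arctangent identities above are easy to test with partial sums of (E.4). The sketch below uses exact rational arithmetic from Python's standard `fractions` module; the function names and the choice of 30 terms are assumptions for illustration only.

```python
from fractions import Fraction
import math

def arctan_series(x, terms):
    """Partial sum of (E.4) with the given number of terms, in exact rationals."""
    x = Fraction(x)
    total, power = Fraction(0), x          # power holds x**(2k+1)
    for k in range(terms):
        total += Fraction((-1) ** k, 2 * k + 1) * power
        power *= x * x
    return total

def pi_estimates(terms=30):
    """Approximations to pi from (E.11), (E.12), (E.17), (E.18) and (E.21)."""
    at = lambda p, q: arctan_series(Fraction(p, q), terms)
    return {
        "E.11": 4 * (at(1, 3) + at(1, 2)),
        "E.12": 4 * (at(2, 5) + at(3, 7)),
        "E.17": 4 * (at(1, 7) + 2 * at(1, 3)),
        "E.18": 4 * (at(7, 23) + 2 * at(1, 4)),
        "E.21": 4 * (4 * at(1, 5) - at(1, 239)),
    }

for label, value in pi_estimates().items():
    print(label, float(value), float(value) - math.pi)
```

Since (E.4) is an alternating series with decreasing terms for 0 < x < 1, the error of each partial sum is at most the first omitted term, which makes it easy to compare the formulas: the smaller the arguments, the fewer terms are needed.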
Returning to the arctangent function, we record a series that converges somewhat faster
than (E.4), for such values of x as occur in (E.11), (E.12), (E.17), (E.18), and (E.21). The
following is due to Euler.

Proposition E.1. For x ∈ R,

(E.22)  $x \tan^{-1} x = \varphi\Bigl(\frac{x^2}{1 + x^2}\Bigr),$

with

(E.23)  $\varphi(z) = z\Bigl(1 + \frac{2}{3}\, z + \frac{2 \cdot 4}{3 \cdot 5}\, z^2 + \frac{2 \cdot 4 \cdot 6}{3 \cdot 5 \cdot 7}\, z^3 + \cdots\Bigr).$

The power series (E.23) has the same radius of convergence as (E.4). The advantage of
(E.22)–(E.23) over (E.4) lies in the fact that x^2/(1 + x^2) is a bit smaller than x^2, for the
values of x that appear in our various formulas for π.
To start the proof of Proposition E.1, note that

(E.24)  $z = \frac{x^2}{1 + x^2} \iff x^2 = \frac{z}{1 - z}.$

Hence, by (E.4),

(E.25)  $x \tan^{-1} x = \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{2k - 1}\, x^{2k} = \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{2k - 1}\, z^k (1 - z)^{-k}.$

Now

(E.26)  $(1 - z)^{-1} = \sum_{n=0}^{\infty} z^n,$

and differentiating repeatedly gives

(E.27)  $(1 - z)^{-k} = \sum_{n=0}^{\infty} \binom{k + n - 1}{n} z^n,$

for |z| < 1. Thus, with z = x^2/(1 + x^2), we have (E.22) with

(E.28)  $\varphi(z) = \sum_{k=1}^{\infty} \sum_{n=0}^{\infty} \frac{(-1)^{k-1}}{2k - 1} \binom{k + n - 1}{n} z^{n+k} = \sum_{\ell=1}^{\infty} \sum_{n=0}^{\ell-1} \frac{(-1)^{\ell-n-1}}{2\ell - 2n - 1} \binom{\ell - 1}{n} z^\ell.$

Hence

(E.29)  $\varphi(z) = \sum_{\ell=1}^{\infty} \sum_{m=0}^{\ell-1} \frac{(-1)^m}{2m + 1} \binom{\ell - 1}{m} z^\ell.$

To get (E.23), it remains to show that

(E.30)  $\varphi_\ell = \sum_{m=0}^{\ell-1} \frac{(-1)^m}{2m + 1} \binom{\ell - 1}{m} \;\Longrightarrow\; \varphi_\ell = \frac{2 \cdot 4 \cdots 2(\ell - 1)}{3 \cdot 5 \cdots (2\ell - 1)}, \quad \ell \ge 2,$

while

(E.31)  $\varphi_1 = 1.$

In fact, (E.31) is routine, so it suffices to establish that

(E.32)  $\varphi_{\ell+1} = \frac{2\ell}{2\ell + 1}\, \varphi_\ell.$

To see this, note that the binomial formula gives

(E.33)  $(1 - s^2)^{\ell-1} = \sum_{m=0}^{\ell-1} (-1)^m \binom{\ell - 1}{m} s^{2m},$

and integrating over s ∈ [0, 1] gives

(E.34)  $\varphi_\ell = \int_0^1 (1 - s^2)^{\ell-1}\, ds.$

To get the recurrence relation (E.32), we start with

(E.35)  $\frac{d}{ds} (1 - s^2)^{\ell+1} = -2(\ell + 1)\, s (1 - s^2)^\ell, \qquad \frac{d^2}{ds^2} (1 - s^2)^{\ell+1} = -2(\ell + 1)(1 - s^2)^\ell + 4\ell(\ell + 1)\, s^2 (1 - s^2)^{\ell-1}.$

Integrating the last identity over s ∈ [0, 1] gives

(E.36)  $2\ell \int_0^1 (1 - s^2)^{\ell-1} s^2\, ds = \int_0^1 (1 - s^2)^\ell\, ds.$

Hence

(E.37)  $2\ell(-\varphi_{\ell+1} + \varphi_\ell) = \varphi_{\ell+1},$

which gives (E.32). This finishes the proof of Proposition E.1.
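Proposition E.1 can be checked numerically by generating the coefficients of (E.23) from the recurrence (E.32). This is a rough sketch; the function names and the default of 80 terms are assumptions for illustration, and Python's `math.atan` is used only as a reference value.

```python
import math

def euler_phi(z, terms=80):
    """Partial sum of (E.23): sum of phi_l * z**l for 1 <= l <= terms."""
    total = 0.0
    term = z                               # phi_1 * z, with phi_1 = 1 by (E.31)
    for l in range(1, terms + 1):
        total += term
        term *= 2 * l / (2 * l + 1) * z    # (E.32): advance to phi_{l+1} * z**(l+1)
    return total

def arctan_euler(x, terms=80):
    """tan^{-1} x for x != 0, via (E.22)."""
    return euler_phi(x * x / (1 + x * x), terms) / x

print(arctan_euler(0.2), math.atan(0.2))
print(4 * arctan_euler(1.0), math.pi)
```

Note that z = x^2/(1 + x^2) lies in [0, 1) for every real x, so this series can be summed even for |x| ≥ 1, where (E.4) is not available, though many more terms are needed when z is close to 1.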


F. Power series for tan x

Recall that

(F.1)  $\tan x = \frac{\sin x}{\cos x}$

is a smooth function on (−π/2, π/2). Here we desire to represent it as a convergent power
series

(F.2)  $T(x) = \sum_{k=0}^{\infty} \tau_k x^{2k+1}.$

Only odd exponents are involved since tan(−x) = − tan x. We will derive a recursive
formula for the coefficients τ_k.
As seen in Exercise 4 of §5,

(F.3)  $\frac{d}{dx} \tan x = 1 + \tan^2 x.$

To find the coefficients τ_k in (F.2), we construct the power series to solve the differential
equation

(F.4)  $T'(x) = 1 + T(x)^2, \qquad T(0) = 0.$

Indeed, if (F.2) is a convergent power series for |x| < r, then, on such an interval,

(F.5)  $T'(x) = \sum_{k=0}^{\infty} (2k + 1)\tau_k x^{2k},$

and

(F.6)  $T(x)^2 = \sum_{j,k \ge 0} \tau_j \tau_k x^{2(j+k)+2} = \sum_{\ell=0}^{\infty} \sum_{k=0}^{\ell} \tau_k \tau_{\ell-k}\, x^{2\ell+2}.$

We can rewrite (F.5) as

(F.7)  $T'(x) = \tau_0 + \sum_{\ell=0}^{\infty} (2\ell + 3)\tau_{\ell+1} x^{2\ell+2},$

and then the equation (F.4) yields τ_0 = 1 and

(F.8)  $\tau_{\ell+1} = \frac{1}{2\ell + 3} \sum_{k=0}^{\ell} \tau_k \tau_{\ell-k}, \quad \text{for } \ell \ge 0.$

Clearly, given τ_0 = 1, (F.8) uniquely determines τ_k for all k ∈ N. The first few terms
are

(F.9)  $\tau_0 = 1, \quad \tau_1 = \frac{1}{3}, \quad \tau_2 = \frac{2}{3 \cdot 5}, \quad \tau_3 = \frac{1}{7} \Bigl(\frac{2}{3 \cdot 5} + \frac{1}{3 \cdot 3} + \frac{2}{3 \cdot 5}\Bigr).$

An easy induction shows that, for each k ∈ N,

(F.10)  $0 < \tau_k < 1.$

It follows that (F.2) is a convergent power series, at least on |x| < 1, and that on this
interval the equation (F.4) holds. We claim that, on the interval of convergence,

(F.11)  $T(x) = \tan x.$

For this task, remainder formulas such as (3.37) and (3.38) are not so convenient, since
formulas for high derivatives of tan x become quite unwieldy. We take another approach,
bringing in the function tan^{−1}, introduced in (E.2)–(E.3). If (F.2) converges for |x| < r,
then

(F.12)  $\psi(x) = \tan^{-1} T(x)$

defines a smooth function on (−r, r), and, via (E.3) and the chain rule,

(F.13)  $\psi'(x) = \frac{T'(x)}{1 + T(x)^2} = 1.$

Since ψ(0) = 0, it follows that ψ(x) = x, hence that T(x) = tan x, so

(F.14)  $\tan x = \sum_{k=0}^{\infty} \tau_k x^{2k+1},$

with τ_0 = 1 and τ_k for k ∈ N defined recursively by (F.8).
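The recursion (F.8) is easy to run with exact fractions. The sketch below is illustrative; the function names and the choice of 40 coefficients are assumptions, and `math.tan` serves only as a reference value.

```python
from fractions import Fraction
import math

def tan_coefficients(count):
    """Return [tau_0, ..., tau_{count-1}] from the recursion (F.8), as exact fractions."""
    tau = [Fraction(1)]                    # tau_0 = 1
    for l in range(count - 1):
        convolution = sum(tau[k] * tau[l - k] for k in range(l + 1))
        tau.append(convolution / (2 * l + 3))      # this is tau_{l+1}
    return tau

def tan_partial_sum(x, count=40):
    """Partial sum of (F.14) using the first `count` coefficients."""
    return sum(float(t) * x ** (2 * k + 1) for k, t in enumerate(tan_coefficients(count)))

print(tan_coefficients(4))
print(tan_partial_sum(0.5), math.tan(0.5))
```

The first four coefficients come out as 1, 1/3, 2/15 and 17/315, the last being the value of τ_3 in (F.9).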


As one might expect, the radius of convergence of the power series (F.14), seen above to
be ≥ 1, is actually π/2. This is conveniently established using methods of complex analysis,
such as treated in [T3].
The coefficients τ_k in the power series for tan x are closely related to the Bernoulli
numbers B_k, which arise in the power series expansion

(F.15)  $\frac{z}{e^z - 1} = \sum_{k=0}^{\infty} \frac{B_k}{k!}\, z^k.$

In this case, one can multiply the power series on the right side of (F.15) by

(F.16)  $\frac{e^z - 1}{z} = \sum_{j=0}^{\infty} \frac{z^j}{(j + 1)!}$

to get the recursion formula

(F.17)  $B_0 = 1, \qquad \sum_{k=0}^{\ell-1} \frac{1}{(\ell - k)!}\, \frac{B_k}{k!} = 0, \quad \text{for } \ell \ge 2.$

The first few terms are

(F.18)  $B_0 = 1, \quad B_1 = -\frac{1}{2}, \quad B_2 = \frac{1}{6}, \quad B_3 = 0, \quad B_4 = -\frac{1}{30}.$

Methods of complex analysis (cf. [T3]) show that the power series in (F.15) has radius of
convergence 2π. It turns out that B_k = 0 for all odd k ≥ 3. In fact, a formula equivalent
to (F.15) is

(F.19)  $\frac{1}{2}\, \frac{e^z + 1}{e^z - 1} = \frac{1}{z} + \sum_{k=1}^{\infty} \frac{B_{2k}}{(2k)!}\, z^{2k-1}.$

It is an exercise to show that the difference between the left side of (F.19) and 1/z is odd
in z. Now an application of Euler's formula to (F.19) yields

(F.20)  $x \cot x = \sum_{k=0}^{\infty} (-1)^k \frac{B_{2k}}{(2k)!}\, (2x)^{2k}.$

Furthermore, one can show that

(F.21)  $\tan x = \cot x - 2 \cot 2x,$

and then deduce from (F.20) that (F.14) holds, with

(F.22)  $\tau_{k-1} = (-1)^{k-1}\, \frac{2^{2k}(2^{2k} - 1)}{(2k)!}\, B_{2k}, \quad k \ge 1.$

The fact that τ_k is positive for each k is equivalent to the fact that B_{2k} is positive for
k odd, and negative for k even (which might take one some effort to glean from (F.17)).
Note that comparing (F.22) and (F.10) implies that the radius of convergence of the power
series in (F.19) is at least 4.
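The relation (F.22) can be cross-checked by computing the Bernoulli numbers from (F.17) and comparing with the values of τ_k listed in (F.9). In this sketch the recursion is applied for ℓ ≥ 2 and solved for B_{ℓ−1}; the function names are assumptions for illustration.

```python
from fractions import Fraction
from math import factorial

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] from the recursion (F.17)."""
    B = [Fraction(1)]
    for l in range(2, count + 1):
        # (F.17) for this l, solved for B_{l-1}, whose coefficient is 1/(l-1)!
        partial = sum(B[k] / (factorial(k) * factorial(l - k)) for k in range(l - 1))
        B.append(-partial * factorial(l - 1))
    return B

def tau_from_bernoulli(k, B):
    """tau_{k-1} via (F.22), for k >= 1; B must contain B_{2k}."""
    return (-1) ** (k - 1) * Fraction(4 ** k * (4 ** k - 1), factorial(2 * k)) * B[2 * k]

B = bernoulli_numbers(9)
print(B[:5])
print([tau_from_bernoulli(k, B) for k in range(1, 5)])
```

The second printed list should reproduce τ_0, ..., τ_3 from (F.9), and the alternating signs of B_2, B_4, B_6, B_8 are visible in the first.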
For further results, relating (F.20) to results connecting the Bernoulli numbers to ζ(2k),
defined by

(F.23)  $\zeta(2k) = \sum_{n=1}^{\infty} n^{-2k},$

see §30 of [T3]. One upshot of these results is the identity

(F.24)  $\tan \frac{\pi}{2} x = \sum_{k=0}^{\infty} \tau_k^{\#} x^{2k+1}, \qquad \tau_k^{\#} = \frac{4}{\pi}\, \zeta(2k + 2)(1 - 2^{-2k-2}).$

G. Abel’s power series theorem

In this appendix we prove the following result of Abel and derive some applications.
Theorem G.1. Assume we have a convergent series

(G.1)  $\sum_{k=0}^{\infty} a_k = A.$

Then

(G.2)  $f(r) = \sum_{k=0}^{\infty} a_k r^k$

converges uniformly on [0, 1], so f ∈ C([0, 1]).

As a warm up, we look at the following somewhat simpler result.
Proposition G.2. Assume we have an absolutely convergent series

(G.3)  $\sum_{k=0}^{\infty} |a_k| < \infty.$

Then the series (G.2) converges uniformly on [−1, 1], so f ∈ C([−1, 1]).
Proof. Writing (G.2) as Σ_{k=0}^∞ f_k(r) with f_k(r) = a_k r^k, we have |f_k(r)| ≤ |a_k| for |r| ≤ 1,
so the conclusion follows from the Weierstrass M-test, Proposition 2.4 of Chapter 3.
Theorem G.1 is much more subtle than Proposition G.2. One ingredient in the proof is
the following summation by parts formula.
Proposition G.3. Let (a_j) and (b_j) be sequences, and let

(G.4)  $s_n = \sum_{j=0}^{n} a_j.$

If m > n, then

(G.5)  $\sum_{k=n+1}^{m} a_k b_k = (s_m b_m - s_n b_{n+1}) + \sum_{k=n+1}^{m-1} s_k (b_k - b_{k+1}).$

Proof. Write the left side of (G.5) as

(G.6)  $\sum_{k=n+1}^{m} (s_k - s_{k-1}) b_k.$

It is then straightforward to obtain the right side.
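For readers who want to see (G.5) in action before working through the algebra, here is a small check on concrete integer sequences. The function name and the sample data are assumptions for illustration; exact integer arithmetic is used so that both sides can be compared for equality.

```python
def summation_by_parts_sides(a, b, n, m):
    """Return (left, right) sides of (G.5) for lists a, b indexed from 0, with 0 <= n < m."""
    s = []
    running = 0
    for value in a:
        running += value
        s.append(running)                  # s[k] = a_0 + ... + a_k, as in (G.4)
    left = sum(a[k] * b[k] for k in range(n + 1, m + 1))
    right = s[m] * b[m] - s[n] * b[n + 1]
    right += sum(s[k] * (b[k] - b[k + 1]) for k in range(n + 1, m))
    return left, right

print(summation_by_parts_sides([1, 2, 3, 4, 5], [2, -1, 4, 0, 3], 1, 4))
```

With these sample sequences both sides equal 27.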


Before applying Proposition G.3 to the proof of Theorem G.1, we note that, by Proposition
3.3 of Chapter 3, the power series (G.2) converges uniformly on compact subsets of
(−1, 1), and defines f ∈ C((−1, 1)). Our task here is to get uniform convergence up to
r = 1.
To proceed, we apply (G.5) with b_k = r^k and n + 1 = 0, s_{−1} = 0, to get

(G.7)  $\sum_{k=0}^{m} a_k r^k = (1 - r) \sum_{k=0}^{m-1} s_k r^k + s_m r^m.$

Now, we want to add and subtract a function g_m(r), defined for 0 ≤ r < 1 by

(G.8)  $g_m(r) = (1 - r) \sum_{k=m}^{\infty} s_k r^k = A r^m + (1 - r) \sum_{k=m}^{\infty} \sigma_k r^k,$

with A as in (G.1) and

(G.9)  $\sigma_k = s_k - A \longrightarrow 0, \quad \text{as } k \to \infty.$

Note that, for 0 ≤ r < 1, μ ∈ N,

(G.10)  $(1 - r) \Bigl|\sum_{k=\mu}^{\infty} \sigma_k r^k\Bigr| \le \Bigl(\sup_{k \ge \mu} |\sigma_k|\Bigr) (1 - r) \sum_{k=\mu}^{\infty} r^k = \Bigl(\sup_{k \ge \mu} |\sigma_k|\Bigr) r^\mu.$

It follows that

(G.11)  $g_m(r) = A r^m + h_m(r)$

extends to be continuous on [0, 1] and

(G.12)  $|h_m(r)| \le \sup_{k \ge m} |\sigma_k|, \qquad h_m(1) = 0.$

Now adding and subtracting g_m(r) in (G.7) gives

(G.13)  $\sum_{k=0}^{m} a_k r^k = g_0(r) + (s_m - A) r^m - h_m(r),$

and this converges uniformly for r ∈ [0, 1] to g_0(r). We have Theorem G.1, with f(r) =
g_0(r).
Here is one illustration of Theorem G.1. Let a_k = (−1)^{k−1}/k, which produces a convergent
series by the alternating series test (Chapter 1, Proposition 6.3). By (5.45),

(G.14)  $\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k}\, r^k = \log(1 + r),$

for |r| < 1. It follows from Theorem G.1 that this infinite series converges uniformly on
[0, 1], and hence

(G.15)  $\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} = \log 2.$
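The uniform convergence asserted here can be observed numerically. The sketch below measures the largest gap between the partial sums of (G.14) and log(1 + r) over a grid in [0, 1], including r = 1; the function names and the grid size are assumptions for illustration.

```python
import math

def partial_sum(r, m):
    """Sum of (-1)**(k-1) * r**k / k for k = 1, ..., m."""
    total, power = 0.0, 1.0
    for k in range(1, m + 1):
        power *= r
        total += (-1) ** (k - 1) * power / k
    return total

def sup_error(m, grid=200):
    """Largest value of |partial sum - log(1 + r)| over r = 0, 1/grid, ..., 1."""
    return max(abs(partial_sum(j / grid, m) - math.log1p(j / grid)) for j in range(grid + 1))

for m in (10, 100, 1000):
    print(m, sup_error(m))
```

The printed errors decay roughly like 1/(2m), with the worst case at r = 1; this is slow, but it is uniform in r, which is the point of Theorem G.1.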

See Exercise 30 in §5 for a more direct approach to (G.15), using the special behavior of
alternating series. Here is a more subtle generalization.

Claim. For all θ ∈ (0, 2π), the series

(G.16)  $\sum_{k=1}^{\infty} \frac{e^{ik\theta}}{k} = S(\theta)$

converges.
Given this claim, it follows from Theorem G.1 that

(G.17)  $\lim_{r \nearrow 1} \sum_{k=1}^{\infty} \frac{e^{ik\theta}}{k}\, r^k = S(\theta), \quad \forall\, \theta \in (0, 2\pi).$

Note that taking θ = π gives (G.15), up to an overall sign. Incidentally, we mention that the function log :
(0, ∞) → R has a natural extension to

(G.18)  $\log : \mathbb{C} \setminus (-\infty, 0] \longrightarrow \mathbb{C},$

and

(G.19)  $\sum_{k=1}^{\infty} \frac{1}{k}\, z^k = -\log(1 - z), \quad \text{for } |z| < 1,$

from which one can deduce, via Theorem G.1, that S(θ) in (G.16) satisfies

(G.20)  $S(\theta) = -\log(1 - e^{i\theta}), \quad 0 < \theta < 2\pi.$

Details on (G.18)–(G.19) would take us too far into the area of complex analysis for a
treatment here. One can find such material in [T3].
We want to establish the convergence of (G.16) for θ ∈ (0, 2π). In fact, we prove the
following more general result.

Proposition G.4. If b_k ↘ 0, then

(G.21)  $\sum_{k=1}^{\infty} b_k e^{ik\theta} = F(\theta)$

converges for all θ ∈ (0, 2π).

Given Proposition G.4, it then follows from Theorem G.1 that

(G.22)  $\lim_{r \nearrow 1} \sum_{k=1}^{\infty} b_k r^k e^{ik\theta} = F(\theta), \quad \forall\, \theta \in (0, 2\pi).$

In turn, Proposition G.4 is a special case of the following more general result, known as
the Dirichlet test for convergence of an infinite series.
Proposition G.5. If b_k ↘ 0, a_k ∈ C, and there exists B < ∞ such that

(G.23)  $s_k = \sum_{j=1}^{k} a_j \;\Longrightarrow\; |s_k| \le B, \quad \forall\, k \in \mathbb{N},$

then

(G.24)  $\sum_{k=1}^{\infty} a_k b_k \ \text{converges}.$

To apply Proposition G.5 to Proposition G.4, take a_k = e^{ikθ} and observe that

(G.25)  $\sum_{j=1}^{k} e^{ij\theta} = \frac{1 - e^{ik\theta}}{1 - e^{i\theta}}\, e^{i\theta},$

which is uniformly bounded (in k) for each θ ∈ (0, 2π).

To prove Proposition G.5, we use summation by parts, Proposition G.3. We have, via
(G.5) with n = 0, s_0 = 0,

(G.26)  $\sum_{k=1}^{m} a_k b_k = s_m b_m + \sum_{k=1}^{m-1} s_k (b_k - b_{k+1}).$

Now, if |s_k| ≤ B for all k and b_k ↘ 0, then

(G.27)  $\sum_{k=1}^{\infty} |s_k (b_k - b_{k+1})| \le B \sum_{k=1}^{\infty} (b_k - b_{k+1}) = B b_1 < \infty,$

so the infinite series

(G.28)  $\sum_{k=1}^{\infty} s_k (b_k - b_{k+1})$

is absolutely convergent. Since also |s_m b_m| ≤ B b_m → 0 as m → ∞, the convergence of the
left side of (G.26) readily follows.
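The convergence in Proposition G.4, for b_k = 1/k, can be compared numerically with the closed form (G.20). This is a sketch only: the function names and the sample angles are assumptions, and `cmath.log` is the principal branch, which agrees with the extension (G.18) here because 1 − e^{iθ} has positive real part for 0 < θ < 2π.

```python
import cmath

def S_partial(theta, m):
    """Partial sum of (G.16): sum of exp(i*k*theta)/k for k = 1, ..., m."""
    step = cmath.exp(1j * theta)
    total, power = 0j, 1 + 0j
    for k in range(1, m + 1):
        power *= step                      # power = exp(i*k*theta)
        total += power / k
    return total

def S_closed_form(theta):
    """The value -log(1 - exp(i*theta)) from (G.20), using the principal logarithm."""
    return -cmath.log(1 - cmath.exp(1j * theta))

for theta in (1.0, cmath.pi, 5.0):
    print(theta, S_partial(theta, 20000), S_closed_form(theta))
```

The agreement is only to a few digits after 20000 terms, reflecting the fact that the series is conditionally, not absolutely, convergent.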


Chapter V
Further Topics in Analysis

Introduction

In this final chapter we apply results of Chapters 3 and 4 to a selection of topics in
analysis. One underlying theme here is the approximation of a function by a sequence of
“simpler” functions.
In §1 we define the convolution of functions on R,

$f * u(x) = \int_{-\infty}^{\infty} f(y) u(x - y)\, dy,$

and give conditions on a sequence (f_n) guaranteeing that f_n ∗ u → u as n → ∞. In §2 we
treat the Weierstrass approximation theorem, which states that each continuous function
on a closed, bounded interval [a, b] is a uniform limit of a sequence of polynomials. We
give two proofs, one using convolutions and one using the uniform convergence on [−1, 1]
of the power series of (1 − x)^b, whenever b > 0, established in Appendix C of Chapter
4. (Here, we take b = 1/2.) Section 3 treats a far reaching generalization, known as the
Stone-Weierstrass theorem. A special case, of use in §4, is that each continuous function
on T^1 is a uniform limit of a sequence of finite linear combinations of the exponentials
e^{ikθ}, k ∈ Z.
Section 4 introduces Fourier series,

$f(\theta) = \sum_{k=-\infty}^{\infty} a_k e^{ik\theta}.$

A central question is when this holds with

$a_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) e^{-ik\theta}\, d\theta.$

This is the Fourier inversion problem, and we examine several aspects of this. Fourier
analysis is a major area in modern analysis, and it is hoped that the material treated here
will provide a useful stimulus for further study.
For further material on Fourier analysis, one can look at Chapter 13 of [T3], dealing
with Fourier series on a similar level as here, but with a different perspective, followed by
Chapters 14–15 of [T3], on the Fourier transform and Laplace transform. Progressively
more advanced treatments of Fourier analysis can be found in [Fol], Chapter 8, and [T4],
Chapter 3.
Section 5 treats the use of Newton's method to solve

$f(\xi) = y$

for ξ in an interval [a, b] given that f(a) − y and f(b) − y have opposite signs and that

$|f''(x)| \le A, \quad |f'(x)| \ge B > 0, \quad \forall\, x \in [a, b].$

It is seen that if an initial guess x_0 is close enough to ξ, then Newton's method produces
a sequence (x_k) satisfying

$|x_k - \xi| \le C\beta^{2^k}, \quad \text{for some } \beta \in (0, 1).$

It is extremely useful to have such a rapid approximation of the solution ξ.
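To give a feeling for this rate of convergence ahead of §5, here is a small sketch of the standard Newton iteration x_{k+1} = x_k − (f(x_k) − y)/f'(x_k), applied to f(x) = x^2, y = 3. The function name, the starting guess x_0 = 2, and the choice of example are assumptions for illustration; the precise hypotheses and estimates are those developed in §5.

```python
import math

def newton(f, fprime, y, x0, steps):
    """Return the Newton iterates x_0, ..., x_steps for solving f(xi) = y."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - (f(x) - y) / fprime(x))
    return xs

iterates = newton(lambda x: x * x, lambda x: 2 * x, 3.0, 2.0, 5)
for k, x in enumerate(iterates):
    print(k, x, abs(x - math.sqrt(3)))
```

In this example the error is roughly squared at each step, so the number of correct digits about doubles per iteration until machine precision is reached.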


1. Convolutions and bump functions

If u is bounded and continuous on R and f is integrable (say f ∈ R(R)) we define the
convolution f ∗ u by

(1.1)  $f * u(x) = \int_{-\infty}^{\infty} f(y) u(x - y)\, dy.$

Clearly

(1.2)  $\int |f|\, dx = A, \quad |u| \le M \text{ on } \mathbb{R} \;\Longrightarrow\; |f * u| \le AM \text{ on } \mathbb{R}.$

Also, a change of variable gives

(1.3)  $f * u(x) = \int_{-\infty}^{\infty} f(x - y) u(y)\, dy.$

We want to analyze the convolution action of a sequence of integrable functions f_n on
R that satisfy the following conditions:

(1.4)  $f_n \ge 0, \qquad \int f_n\, dx = 1, \qquad \int_{\mathbb{R} \setminus I_n} f_n\, dx = \varepsilon_n \to 0,$

where

(1.5)  $I_n = [-\delta_n, \delta_n], \qquad \delta_n \to 0.$

Let u ∈ C(R) be supported on a bounded interval [−A, A], or more generally, assume

(1.6)  $u \in C(\mathbb{R}), \qquad |u| \le M \text{ on } \mathbb{R},$

and u is uniformly continuous on R, so with δ_n as in (1.5),

(1.7)  $|x - x'| \le 2\delta_n \;\Longrightarrow\; |u(x) - u(x')| \le \tilde{\varepsilon}_n \to 0.$

We aim to prove the following.

Proposition 1.1. If f_n ∈ R(R) satisfy (1.4)–(1.5) and if u ∈ C(R) is bounded and
uniformly continuous (satisfying (1.6)–(1.7)), then

(1.8)  $u_n = f_n * u \longrightarrow u, \quad \text{uniformly on } \mathbb{R}, \text{ as } n \to \infty.$

Proof. To start, write

(1.9)  $u_n(x) = \int f_n(y) u(x - y)\, dy = \int_{I_n} f_n(y) u(x - y)\, dy + \int_{\mathbb{R} \setminus I_n} f_n(y) u(x - y)\, dy = v_n(x) + r_n(x).$

Clearly

(1.10)  $|r_n(x)| \le M\varepsilon_n, \quad \forall\, x \in \mathbb{R}.$

Next,

(1.11)  $v_n(x) - u(x) = \int_{I_n} f_n(y)[u(x - y) - u(x)]\, dy - \varepsilon_n u(x),$

so

(1.12)  $|v_n(x) - u(x)| \le \tilde{\varepsilon}_n + M\varepsilon_n, \quad \forall\, x \in \mathbb{R},$

hence

(1.13)  $|u_n(x) - u(x)| \le \tilde{\varepsilon}_n + 2M\varepsilon_n, \quad \forall\, x \in \mathbb{R},$

yielding (1.8).
Here is a sequence of functions (f_n) satisfying (1.4)–(1.5). First, set

(1.14)  $g_n(x) = \frac{1}{A_n} (x^2 - 1)^n, \qquad A_n = \int_{-1}^{1} (x^2 - 1)^n\, dx,$

and then set

(1.15)  $f_n(x) = \begin{cases} g_n(x), & |x| \le 1, \\ 0, & |x| \ge 1. \end{cases}$

It is readily verified that such (f_n) satisfy (1.4)–(1.5). We will use this sequence in Proposition
1.1 for one proof of the Weierstrass approximation theorem, in the next section.
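Proposition 1.1 with the kernels (1.14)–(1.15) can be tried out numerically. The sketch below approximates the integrals by the midpoint rule; the function names, the number of sample points, and the test function u(x) = max(0, 1 − |x|) (bounded and uniformly continuous, with a corner at 0) are assumptions for illustration.

```python
def polynomial_kernel(n, samples=4000):
    """Midpoint nodes y_i in (-1, 1) and weights f_n(y_i) * dy, with f_n as in (1.14)-(1.15)."""
    dy = 2.0 / samples
    nodes = [-1.0 + (i + 0.5) * dy for i in range(samples)]
    raw = [(1.0 - y * y) ** n for y in nodes]      # |y^2 - 1|^n; the sign (-1)^n cancels in g_n
    norm = sum(raw) * dy                           # approximates |A_n|
    return nodes, [value * dy / norm for value in raw]

def convolve_at(u, x, n):
    """Approximate f_n * u(x), the integral of f_n(y) u(x - y) dy, as in (1.1)."""
    nodes, weights = polynomial_kernel(n)
    return sum(w * u(x - y) for y, w in zip(nodes, weights))

tent = lambda x: max(0.0, 1.0 - abs(x))
for n in (4, 16, 64):
    print(n, abs(convolve_at(tent, 0.0, n) - tent(0.0)))
```

The error at the corner x = 0 decreases as n grows, but only slowly, since the kernel f_n concentrates near 0 at a rate comparable to n^{−1/2}.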
The functions f_n defined by (1.14)–(1.15) have the property

(1.16)  $f_n \in C^{n-1}(\mathbb{R}).$

Furthermore, they have compact support, i.e., vanish outside some compact set. We say

(1.17)  $f \in C_0^k(\mathbb{R}),$

provided f ∈ C^k(R) and f has compact support. The following result is useful.

Proposition 1.2. If f ∈ C_0^k(R) and u ∈ R(R), then f ∗ u ∈ C^k(R), and (provided k ≥ 1)

(1.18)  $\frac{d}{dx}\, f * u(x) = f' * u(x).$

Proof. We start with the case k = 0, and show that

$f \in C_0^0(\mathbb{R}),\ u \in \mathcal{R}(\mathbb{R}) \;\Longrightarrow\; f * u \in C(\mathbb{R}).$

In fact, by (1.3),

$\bigl|f * u(x + h) - f * u(x)\bigr| = \Bigl|\int_{-\infty}^{\infty} [f(x + h - y) - f(x - y)] u(y)\, dy\Bigr| \le \sup_x |f(x + h) - f(x)| \int_{-\infty}^{\infty} |u(y)|\, dy,$

which clearly tends to 0 as h → 0.

From here, it suffices to treat the case k = 1, since if f ∈ C_0^k(R), then f' ∈ C_0^{k−1}(R),
and one can use induction on k. Using (1.3), we have

(1.19)  $\frac{f * u(x + h) - f * u(x)}{h} = \int_{-\infty}^{\infty} g_h(x - y) u(y)\, dy,$

where

(1.20)  $g_h(x) = \frac{1}{h} [f(x + h) - f(x)].$

We claim that

(1.21)  $f \in C_0^1(\mathbb{R}) \;\Longrightarrow\; g_h \to f' \text{ uniformly on } \mathbb{R}, \text{ as } h \to 0.$

Given this,

(1.22)  $\Bigl|\int_{-\infty}^{\infty} g_h(x - y) u(y)\, dy - \int_{-\infty}^{\infty} f'(x - y) u(y)\, dy\Bigr| \le \sup_x |g_h(x) - f'(x)| \int_{-\infty}^{\infty} |u(y)|\, dy,$

which yields (1.18).

It remains to prove (1.21). Indeed, the fundamental theorem of calculus implies

(1.23)  $g_h(x) = \frac{1}{h} \int_x^{x+h} f'(y)\, dy,$

if h > 0, so

(1.24)  $|g_h(x) - f'(x)| \le \sup_{x \le y \le x+h} |f'(y) - f'(x)|,$

if h > 0, with a similar estimate if h < 0. This yields (1.21).


We say

(1.25) f ∈ C ∞ (R) provided f ∈ C k (R) for all k,

and similarly f ∈ C0∞ (R) provided f ∈ C0k (R), for all k. It is useful to have some examples
of functions in C0∞ (R). We start with the following. Set
2
G(x) = e−1/x , if x > 0,
(1.26)
0, if x ≤ 0.

Lemma 1.3. G ∈ C^∞(R).

Proof. Clearly G ∈ C^k for all k on (0, ∞) and on (−∞, 0). We need to check its behavior
at 0. The fact that G is continuous at 0 follows from

(1.27)    e^{-y^2} \to 0, \quad \text{as } y \to \infty.

Note that

(1.28)    G'(x) = \begin{cases} \frac{2}{x^3}\, e^{-1/x^2}, & \text{if } x > 0, \\ 0, & \text{if } x < 0. \end{cases}

Also,

(1.29)    G'(0) = \lim_{h \to 0} \frac{G(h)}{h} = 0,

as a consequence of

(1.30)    y e^{-y^2} \to 0, \quad \text{as } y \to \infty.

Clearly G′ is continuous on (0, ∞) and on (−∞, 0). The continuity at 0 is a consequence
of

(1.31)    y^3 e^{-y^2} \to 0, \quad \text{as } y \to \infty.

The existence and continuity of higher derivatives of G follow a similar pattern, making
use of

(1.32)    y^k e^{-y^2} \to 0, \quad \text{as } y \to \infty,

for each k ∈ N. We leave the details to the reader.


Corollary 1.4. Set

(1.33) g(x) = G(x)G(1 − x).

Then g ∈ C_0^∞(R). In fact, g(x) ≠ 0 if and only if 0 < x < 1.
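As a numerical illustration (not part of the text's argument), the following Python sketch evaluates G from (1.26) and the bump function g from (1.33). The function names and the sample points are our own choices. Note that for x very close to 0 or 1 the computed value underflows to 0.0 in floating point, although g(x) > 0 there.

```python
import math


def G(x: float) -> float:
    """G from (1.26): exp(-1/x^2) for x > 0, and 0 for x <= 0."""
    if x <= 0.0:
        return 0.0
    t = 1.0 / x
    return math.exp(-t * t)


def g(x: float) -> float:
    """Bump function from (1.33): g(x) = G(x) G(1 - x)."""
    return G(x) * G(1.0 - x)


if __name__ == "__main__":
    # g vanishes outside (0, 1) and is positive inside.
    for x in (-0.5, 0.0, 0.25, 0.5, 0.75, 1.0, 1.5):
        print(f"g({x:5.2f}) = {g(x):.6e}")
```

At x = 1/2 both factors equal e^{−4}, so g(1/2) = e^{−8}.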

Exercises

1. Let f ∈ R(R) satisfy

(1.34)    f \ge 0, \quad \int f\, dx = 1,

and set

(1.35)    f_n(x) = n f\left( \frac{x}{n} \right), \quad n \in N.

Show that Proposition 1.1 applies to the sequence f_n.

2. Take

(1.36)    f(x) = \frac{1}{A}\, e^{-x^2}, \quad A = \int_{-\infty}^{\infty} e^{-x^2}\, dx.

Show that Exercise 1 applies to this case.
Note. In [T2] it is shown that A = √π in (1.36).

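3. Modify the proof of Lemma 1.3 to show that, if

    G_1(x) = \begin{cases} e^{-1/x}, & \text{if } x > 0, \\ 0, & \text{if } x \le 0, \end{cases}

then G_1 ∈ C^∞(R).

4. Establish whether each of the following functions is in C^∞(R).

(a)    \varphi(x) = \begin{cases} G(x) \sin \frac{1}{x}, & \text{if } x \ne 0, \\ 0, & \text{if } x = 0. \end{cases}

(b)    \psi(x) = \begin{cases} G_1(x) \sin \frac{1}{x}, & \text{if } x \ne 0, \\ 0, & \text{if } x = 0. \end{cases}

Here G(x) is as in (1.26) and G_1(x) is as in Exercise 3.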


2. The Weierstrass approximation theorem

The following result of Weierstrass is a very useful tool in analysis.


Theorem 2.1. Given a compact interval I, any continuous function f on I is a uniform
limit of polynomials.

Otherwise stated, our goal is to prove that the space C(I) of continuous (real valued)
functions on I is equal to P(I), the uniform closure in C(I) of the space of polynomials.
We will give two proofs of this theorem. Our starting point for the first proof will be
the result that the power series for (1 − x)^a converges uniformly on [−1, 1], for any a > 0.
This was established in Chapter 4, Appendix C, and we will use it, with a = 1/2.
From the identity x^{1/2} = (1 − (1 − x))^{1/2}, we have x^{1/2} ∈ P([0, 2]). More to the point,
from the identity

(2.1)    |x| = \left( 1 - (1 - x^2) \right)^{1/2},

we have |x| ∈ P([−√2, √2]). Using |x| = b^{−1}|bx|, for any b > 0, we see that |x| ∈ P(I) for
any interval I = [−c, c], and also for any closed subinterval, hence for any compact interval
I. By translation, we have

(2.2) |x − a| ∈ P(I)

for any compact interval I. Using the identities

(2.3)    \max(x, y) = \frac{1}{2}(x + y) + \frac{1}{2}|x - y|, \quad \min(x, y) = \frac{1}{2}(x + y) - \frac{1}{2}|x - y|,
we see that for any a ∈ R and any compact I,

(2.4) max(x, a), min(x, a) ∈ P(I).

We next note that P(I) is an algebra of functions, i.e.,

(2.5) f, g ∈ P(I), c ∈ R =⇒ f + g, f g, cf ∈ P(I).

Using this, one sees that, given f ∈ P(I), with range in a compact interval J, one has
h ◦ f ∈ P(I) for all h ∈ P(J). Hence f ∈ P(I) ⇒ |f | ∈ P(I), and, via (2.3), we deduce
that

(2.6) f, g ∈ P(I) =⇒ max(f, g), min(f, g) ∈ P(I).

Suppose now that I′ = [a′, b′] is a subinterval of I = [a, b]. With the notation x_+ =
max(x, 0), we have

(2.7)    f_{II'}(x) = \min\left( (x - a')_+, (b' - x)_+ \right) \in P(I).

This is a piecewise linear function, equal to zero on I \ I′, with slope 1 from a′ to the
midpoint m′ of I′, and slope −1 from m′ to b′.
Now if I is divided into N equal subintervals, any continuous function on I that is linear
on each such subinterval can be written as a linear combination of such “tent functions,”
so it belongs to P(I). Finally, any f ∈ C(I) can be uniformly approximated by such
piecewise linear functions, so we have f ∈ P(I), proving the theorem.
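The first step of this proof can be checked numerically. The Python sketch below (an illustration under our own naming, not the text's method of proof) builds the partial sums of the power series of (1 − t)^{1/2}, substitutes t = 1 − x^2 as in (2.1), and measures how well the resulting polynomials approximate |x| on [−1, 1]. The coefficient recursion c_k = c_{k−1}(k − 3/2)/k is the standard binomial one and is an assumption in the sense that the text does not write it out.

```python
def sqrt_series_coeffs(n_terms: int) -> list:
    """Coefficients c_0..c_n of (1 - t)^(1/2) = sum_k c_k t^k."""
    coeffs = [1.0]
    for k in range(1, n_terms + 1):
        # c_k = c_{k-1} * (k - 1 - 1/2) / k, from the binomial series
        coeffs.append(coeffs[-1] * (k - 1.5) / k)
    return coeffs


def abs_poly(x: float, coeffs: list) -> float:
    """Evaluate sum_k c_k (1 - x^2)^k, a polynomial in x approximating |x|."""
    t = 1.0 - x * x
    total = 0.0
    for c in reversed(coeffs):  # Horner's rule in the variable t
        total = total * t + c
    return total


def max_error(n_terms: int, samples: int = 2001) -> float:
    """Largest deviation from |x| over an equispaced grid in [-1, 1]."""
    coeffs = sqrt_series_coeffs(n_terms)
    xs = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    return max(abs(abs_poly(x, coeffs) - abs(x)) for x in xs)


if __name__ == "__main__":
    for n in (10, 50, 200, 800):
        print(n, max_error(n))
```

The error is largest at x = 0 and decays slowly (roughly like n^{−1/2}), which is consistent with uniform convergence but shows that this route is not an efficient way to compute approximations.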
For the second proof, we bring in the sequence of functions f_n defined by (1.14)–(1.15),
i.e., first set

(2.8)    g_n(x) = \frac{1}{A_n} (x^2 - 1)^n, \quad A_n = \int_{-1}^{1} (x^2 - 1)^n\, dx,

and then set

(2.9)    f_n(x) = \begin{cases} g_n(x), & |x| \le 1, \\ 0, & |x| \ge 1. \end{cases}

It is readily verified that such (f_n) satisfy (1.4)–(1.5). We will use this sequence in
Proposition 1.1 to prove that if I ⊂ R is a closed, bounded interval, and f ∈ C(I), then there
exist polynomials p_n(x) such that

(2.10)    p_n → f, uniformly on I.

To start, we note that by an affine change of variable, there is no loss of generality in
assuming that

(2.11)    I = \left[ -\frac{1}{4}, \frac{1}{4} \right].

Next, given I as in (2.11) and f ∈ C(I), it is easy to extend f to a function

(2.12)    u \in C(R), \quad u(x) = 0 \text{ for } |x| \ge \frac{1}{2}.

Now, with f_n as in (2.8)–(2.9), we can apply Proposition 1.1 to deduce that

(2.13)    u_n(x) = \int f_n(y)\, u(x - y)\, dy \implies u_n \to u \text{ uniformly on } R.

Now

(2.14)    |x| \le \frac{1}{2} \implies u(x - y) = 0 \text{ for } |y| > 1
                              \implies u_n(x) = \int g_n(y)\, u(x - y)\, dy,

that is,

(2.15)    |x| \le \frac{1}{2} \implies u_n(x) = p_n(x),

where

(2.16)    p_n(x) = \int g_n(y)\, u(x - y)\, dy = \int g_n(x - y)\, u(y)\, dy.

The last identity makes it clear that each p_n(x) is a polynomial in x. Since (2.13) and
(2.15) imply

(2.17)    p_n \to u \text{ uniformly on } \left[ -\frac{1}{2}, \frac{1}{2} \right],

we have (2.10).
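The second proof is constructive, and the polynomials p_n in (2.16) can be evaluated numerically. The following Python sketch is an illustration only: the test function u (equal to |x| on I = [−1/4, 1/4] and continued linearly to 0 at |x| = 1/2), the midpoint quadrature, and all function names are our own assumptions. It uses the fact that g_n in (2.8) equals (1 − x^2)^n divided by its integral over [−1, 1], since the sign (−1)^n cancels.

```python
def landau_norm(n: int) -> float:
    """Integral of (1 - x^2)^n over [-1, 1], computed as 2 * prod_{k=1}^{n} 2k/(2k+1)."""
    a = 2.0
    for k in range(1, n + 1):
        a *= 2.0 * k / (2.0 * k + 1.0)
    return a


def u(x: float) -> float:
    """Hypothetical test function: |x| on [-1/4, 1/4], linear down to 0 at |x| = 1/2."""
    ax = abs(x)
    if ax <= 0.25:
        return ax
    if ax <= 0.5:
        return 0.5 - ax
    return 0.0


def p(n: int, x: float, nodes: int = 2000) -> float:
    """Midpoint-rule value of p_n(x) = integral of g_n(x - y) u(y) dy over |y| <= 1/2."""
    a_n = landau_norm(n)
    h = 1.0 / nodes
    total = 0.0
    for j in range(nodes):
        y = -0.5 + (j + 0.5) * h
        t = x - y  # |t| <= 3/4 when |x| <= 1/4, so the kernel formula applies
        total += (1.0 - t * t) ** n * u(y)
    return total * h / a_n


def sup_error(n: int, grid: int = 41) -> float:
    """Largest |p_n(x) - u(x)| over an equispaced grid in I = [-1/4, 1/4]."""
    xs = [-0.25 + 0.5 * i / (grid - 1) for i in range(grid)]
    return max(abs(p(n, x) - u(x)) for x in xs)


if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, sup_error(n))
```

The printed errors decrease as n grows, as (2.17) predicts; the kinks of u at 0 and ±1/4 are where the approximation is slowest.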

Exercises

1. As in Exercises 1–2 of §1, take

    f(x) = \frac{1}{A}\, e^{-x^2}, \quad A = \int_{-\infty}^{\infty} e^{-x^2}\, dx, \quad f_n(x) = n f\left( \frac{x}{n} \right).

Let u ∈ C(R) vanish outside [−1, 1]. Let ε > 0 and take n ∈ N such that

    \sup_x |f_n * u(x) - u(x)| < \varepsilon.

Approximate f_n by a sufficient partial sum of the power series

    f_n(x) = \frac{n}{A} \sum_{k=0}^{\infty} \frac{1}{k!} \left( -\frac{x^2}{n^2} \right)^k,

and use this to obtain a third proof of Theorem 2.1.

Remark. A fourth proof of Theorem 2.1 is indicated in Exercise 8 of §4.

2. Let f be continuous on [−1, 1]. If f is odd, show that it is a uniform limit of finite linear
combinations of x, x^3, x^5, . . . , x^{2k+1}, . . . . If f is even, show it is a uniform limit of finite
linear combinations of 1, x^2, x^4, . . . , x^{2k}, . . . .

3. If g is continuous on [−π/2, π/2], show that g is a uniform limit of finite linear
combinations of
    sin x, sin^2 x, sin^3 x, . . . , sin^k x, . . . .
Hint. Write g(x) = f(sin x) with f continuous on [−1, 1].

4. If g is continuous on [−π, π] and even, show that g is a uniform limit of finite linear
combinations of
    1, cos x, cos^2 x, . . . , cos^k x, . . . .
Hint. cos : [0, π] → [−1, 1] is a homeomorphism.

5. Assume h : R → C is continuous, periodic of period 2π, and odd, so

(2.18)    h(x + 2π) = h(x), h(−x) = −h(x), ∀ x ∈ R.

Show that h is a uniform limit of finite linear combinations of

    sin x, sin x cos x, sin x cos^2 x, . . . , sin x cos^k x, . . . .

Hint. Given ε > 0, find δ > 0 and continuous h_ε, satisfying (2.18), such that

    \sup_x |h(x) - h_\varepsilon(x)| < \varepsilon, \quad h_\varepsilon(x) = 0 \text{ if } |x - j\pi| < \delta, \ j \in Z.

Then apply Exercise 4 to g(x) = h_ε(x)/ sin x.


3. The Stone-Weierstrass theorem

A far reaching extension of the Weierstrass approximation theorem, due to M. Stone,
is the following result, known as the Stone-Weierstrass theorem.

Theorem 3.1. Let X be a compact metric space, A a subalgebra of C_R(X), the algebra of
real valued continuous functions on X. Suppose 1 ∈ A and that A separates points of X,
i.e., for distinct p, q ∈ X, there exists h_{pq} ∈ A with h_{pq}(p) ≠ h_{pq}(q). Then the closure Ā is
equal to C_R(X).

We present the proof in eight steps.

Step 1. Let f ∈ Ā and assume ϕ : R → R is continuous. If sup |f| ≤ A, we can apply
the Weierstrass approximation theorem to get polynomials p_k → ϕ uniformly on [−A, A].
Then p_k ◦ f → ϕ ◦ f uniformly on X, so ϕ ◦ f ∈ Ā.

Step 2. Consequently, if f_j ∈ Ā, then

(3.1)    \max(f_1, f_2) = \frac{1}{2}|f_1 - f_2| + \frac{1}{2}(f_1 + f_2) \in \bar{A},

and similarly min(f_1, f_2) ∈ Ā.

Step 3. It follows from the hypotheses that if p, q ∈ X and p ≠ q, then there exists
f_{pq} ∈ A, equal to 1 at p and to 0 at q.

Step 4. Apply an appropriate continuous ϕ : R → R to get g_{pq} = ϕ ◦ f_{pq} ∈ Ā, equal to 1
on a neighborhood of p and to 0 on a neighborhood of q, and satisfying 0 ≤ g_{pq} ≤ 1 on X.

Step 5. Fix p ∈ X and let U be an open neighborhood of p. By Step 4, given q ∈ X \ U,
there exists g_{pq} ∈ Ā such that g_{pq} = 1 on a neighborhood O_q of p, equal to 0 on a
neighborhood Ω_q of q, satisfying 0 ≤ g_{pq} ≤ 1 on X.
Now {Ω_q} is an open cover of X \ U, so there exists a finite subcover Ω_{q_1}, . . . , Ω_{q_N}. Let

(3.2)    g_{pU} = \min_{1 \le j \le N} g_{p q_j} \in \bar{A}.

Then g_{pU} = 1 on O = ∩_{j=1}^{N} O_{q_j}, an open neighborhood of p, g_{pU} = 0 on X \ U, and
0 ≤ g_{pU} ≤ 1 on X.

Step 6. Take K ⊂ U ⊂ X, K closed, U open. By Step 5, for each p ∈ K, there exists
g_{pU} ∈ Ā, equal to 1 on a neighborhood O_p of p, and equal to 0 on X \ U.
Now {O_p} covers K, so there exists a finite subcover O_{p_1}, . . . , O_{p_M}. Let

(3.3)    g_{KU} = \max_{1 \le j \le M} g_{p_j U} \in \bar{A}.

We have

(3.4)    g_{KU} = 1 on K, 0 on X \ U, and 0 ≤ g_{KU} ≤ 1 on X.

Step 7. Take f ∈ C_R(X) such that 0 ≤ f ≤ 1 on X. Fix k ∈ N and set

(3.5)    K_\ell = \left\{ x \in X : f(x) \ge \frac{\ell}{k} \right\},

so X = K_0 ⊃ · · · ⊃ K_ℓ ⊃ K_{ℓ+1} ⊃ · · · ⊃ K_k ⊃ K_{k+1} = ∅. Define open U_ℓ ⊃ K_ℓ by

(3.6)    U_\ell = \left\{ x \in X : f(x) > \frac{\ell - 1}{k} \right\}, \quad \text{so} \quad X \setminus U_\ell = \left\{ x \in X : f(x) \le \frac{\ell - 1}{k} \right\}.

By Step 6, there exist ψ_ℓ ∈ Ā such that

(3.7)    ψ_ℓ = 1 on K_ℓ, ψ_ℓ = 0 on X \ U_ℓ, and 0 ≤ ψ_ℓ ≤ 1 on X.

Let

(3.8)    f_k = \max_{0 \le \ell \le k} \frac{\ell}{k}\, \psi_\ell \in \bar{A}.

It follows that f_k ≥ ℓ/k on K_ℓ and f_k ≤ (ℓ − 1)/k on X \ U_ℓ, for all ℓ. Hence f_k ≥ (ℓ − 1)/k
on K_{ℓ−1} and f_k ≤ ℓ/k on X \ U_{ℓ+1}. In other words,

(3.9)    \frac{\ell - 1}{k} \le f(x) \le \frac{\ell}{k} \implies \frac{\ell - 1}{k} \le f_k(x) \le \frac{\ell}{k},

so

(3.10)    |f(x) - f_k(x)| \le \frac{1}{k}, \quad \forall\, x \in X.
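To see the staircase construction of Step 7 in action, here is a small Python sketch. It is an illustration under an explicit assumption: instead of the functions ψ_ℓ supplied by Step 6, it uses the hypothetical choice ψ_ℓ = clamp(k f − (ℓ − 1)) with values clamped to [0, 1], which satisfies (3.7) because it equals 1 where f ≥ ℓ/k and 0 where f ≤ (ℓ − 1)/k. With that choice, (3.8) can be evaluated pointwise from the value f(x) alone.

```python
import math


def clamp01(t: float) -> float:
    """Clamp t to the interval [0, 1]."""
    return min(1.0, max(0.0, t))


def staircase(f_value: float, k: int) -> float:
    """f_k(x) from (3.8), given f(x) in [0, 1], with psi_l = clamp01(k f - (l - 1))."""
    return max((l / k) * clamp01(k * f_value - (l - 1)) for l in range(0, k + 1))


def sample_f(x: float) -> float:
    """A hypothetical continuous function on X = [0, 1] with values in [0, 1]."""
    return 0.5 * (1.0 + math.sin(5.0 * x))


if __name__ == "__main__":
    for k in (4, 16, 64):
        xs = [i / 500.0 for i in range(501)]
        err = max(abs(staircase(sample_f(x), k) - sample_f(x)) for x in xs)
        print(k, err, 1.0 / k)  # the error stays within the bound 1/k of (3.10)
```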

Step 8. It follows from Step 7 that if f ∈ C_R(X) and 0 ≤ f ≤ 1 on X, then f ∈ Ā. It is
an easy final step to see that f ∈ C_R(X) ⇒ f ∈ Ā.

Theorem 3.1 has a complex analogue.


Theorem 3.2. Let X be a compact metric space, A a subalgebra (over C) of C(X), the
algebra of complex valued continuous functions on X. Suppose 1 ∈ A and that A separates
the points of X. Furthermore, assume

(3.11)    f ∈ A ⟹ f̄ ∈ A.

Then the closure Ā = C(X).

Proof. Set A_R = {f + f̄ : f ∈ A}. One sees that Theorem 3.1 applies to A_R.

Here are a couple of applications of Theorems 3.1–3.2.

Corollary 3.3. If X is a compact subset of R^n, then every f ∈ C(X) is a uniform limit
of polynomials on R^n.

Corollary 3.4. The space of trigonometric polynomials, given by

(3.12)    \sum_{k=-N}^{N} a_k e^{ik\theta},

is dense in C(S^1).

Exercises

1. Prove Corollary 3.3.

2. Prove Corollary 3.4, using Theorem 3.2.


Hint. e^{ikθ} e^{iℓθ} = e^{i(k+ℓ)θ}, and the complex conjugate of e^{ikθ} is e^{−ikθ}.

3. Use the results of Exercises 4–5 in §2 to provide another proof of Corollary 3.4.
Hint. Use cos^k θ = ((e^{iθ} + e^{−iθ})/2)^k, etc.

4. Let X be a compact metric space, and K ⊂ X a compact subset. Show that A = {f|_K :
f ∈ C(X)} is dense in C(K).

5. In the setting of Exercise 4, take f ∈ C(K), ε > 0. Show that there exists g_1 ∈ C(X)
such that

    \sup_K |g_1 - f| \le \varepsilon, \quad \text{and} \quad \sup_X |g_1| \le \sup_K |f|.

6. Iterate the result of Exercise 5 to get g_k ∈ C(X) such that

    \sup_K |g_k - (f - g_1 - \cdots - g_{k-1})| \le 2^{-k}, \quad \sup_X |g_k| \le 2^{-(k-1)}.

7. Use the results of Exercises 4–6 to show that, if f ∈ C(K), then there exists g ∈ C(X)
such that g|_K = f.


4. Fourier series

We work on T^1 = R/(2πZ), which under θ ↦ e^{iθ} is equivalent to S^1 = {z ∈ C : |z| = 1}.
Given f ∈ C(T^1), or more generally f ∈ R(T^1) (or still more generally, if f ∈ R^#(T^1),
defined as in §6 of Chapter 4), we set, for k ∈ Z,

(4.1)    \hat{f}(k) = \frac{1}{2\pi} \int_0^{2\pi} f(\theta) e^{-ik\theta}\, d\theta.

We call f̂(k) the Fourier coefficients of f. We say

(4.2)    f \in A(T^1) \iff \sum_{k=-\infty}^{\infty} |\hat{f}(k)| < \infty.

We aim to prove the following.


Proposition 4.1. Given f ∈ C(T^1), if f ∈ A(T^1), then

(4.3)    f(\theta) = \sum_{k=-\infty}^{\infty} \hat{f}(k) e^{ik\theta}.

Proof. Given Σ |f̂(k)| < ∞, the right side of (4.3) is absolutely and uniformly convergent,
defining

(4.4)    g(\theta) = \sum_{k=-\infty}^{\infty} \hat{f}(k) e^{ik\theta}, \quad g \in C(T^1),

and our task is to show that f ≡ g. Making use of the identities

(4.5)    \frac{1}{2\pi} \int_0^{2\pi} e^{i\ell\theta}\, d\theta = \begin{cases} 0, & \text{if } \ell \ne 0, \\ 1, & \text{if } \ell = 0, \end{cases}

we get f̂(k) = ĝ(k), for all k ∈ Z. Let us set u = f − g. We have

(4.6)    u ∈ C(T^1), û(k) = 0, ∀ k ∈ Z.

It remains to show that this implies u ≡ 0. To prove this, we use Corollary 3.4, which
implies that, for each v ∈ C(T^1), there exist trigonometric polynomials, i.e., finite linear
combinations v_N of {e^{ikθ} : k ∈ Z}, such that

(4.7)    v_N → v uniformly on T^1.


Now (4.6) implies

    \int_{T^1} u(\theta) v_N(\theta)\, d\theta = 0, \quad \forall\, N,

and passing to the limit, using (4.7), gives

(4.8)    \int_{T^1} u(\theta) v(\theta)\, d\theta = 0, \quad \forall\, v \in C(T^1).

Taking v = ū gives

(4.9)    \int_{T^1} |u(\theta)|^2\, d\theta = 0,

forcing u ≡ 0, and completing the proof.


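We seek conditions on f that imply (4.2). Integration by parts for f ∈ C^1(T^1) gives,
for k ≠ 0,

(4.10)    \hat{f}(k) = \frac{1}{2\pi} \int_0^{2\pi} f(\theta)\, \frac{i}{k} \frac{\partial}{\partial\theta} (e^{-ik\theta})\, d\theta
                    = \frac{1}{2\pi i k} \int_0^{2\pi} f'(\theta) e^{-ik\theta}\, d\theta,

hence

(4.11)    |\hat{f}(k)| \le \frac{1}{2\pi |k|} \int_0^{2\pi} |f'(\theta)|\, d\theta.

If f ∈ C^2(T^1), we can integrate by parts a second time, and get

(4.12)    \hat{f}(k) = -\frac{1}{2\pi k^2} \int_0^{2\pi} f''(\theta) e^{-ik\theta}\, d\theta,

hence

    |\hat{f}(k)| \le \frac{1}{2\pi k^2} \int_0^{2\pi} |f''(\theta)|\, d\theta.

In concert with

(4.13)    |\hat{f}(k)| \le \frac{1}{2\pi} \int_0^{2\pi} |f(\theta)|\, d\theta,

which follows from (4.1), we have

(4.14)    |\hat{f}(k)| \le \frac{1}{2\pi (k^2 + 1)} \int_0^{2\pi} \left[ |f''(\theta)| + |f(\theta)| \right] d\theta.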


Hence

(4.15)    f \in C^2(T^1) \implies \sum |\hat{f}(k)| < \infty.

We will sharpen this implication below. We start with an interesting example. Consider

(4.16)    f(θ) = |θ|, −π ≤ θ ≤ π,

and extend this to be periodic of period 2π, yielding f ∈ C(T^1). We have

(4.17)    \hat{f}(k) = \frac{1}{2\pi} \int_{-\pi}^{\pi} |\theta| e^{-ik\theta}\, d\theta = -\left[ 1 - (-1)^k \right] \frac{1}{\pi k^2},

for k ≠ 0, while f̂(0) = π/2. This is clearly a summable series, so f ∈ A(T^1), and
Proposition 4.1 implies that, for −π ≤ θ ≤ π,

(4.18)    |\theta| = \frac{\pi}{2} - \sum_{k \text{ odd}} \frac{2}{\pi k^2}\, e^{ik\theta}
                  = \frac{\pi}{2} - \frac{4}{\pi} \sum_{\ell=0}^{\infty} \frac{1}{(2\ell+1)^2} \cos (2\ell+1)\theta.
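Now, evaluating this at θ = 0 yields the identity

(4.19)    \sum_{\ell=0}^{\infty} \frac{1}{(2\ell+1)^2} = \frac{\pi^2}{8}.

Using this, we can evaluate

(4.20)    S = \sum_{k=1}^{\infty} \frac{1}{k^2},

as follows. We have

(4.21)    \sum_{k=1}^{\infty} \frac{1}{k^2} = \sum_{k \ge 1 \text{ odd}} \frac{1}{k^2} + \sum_{k \ge 2 \text{ even}} \frac{1}{k^2}
                                          = \frac{\pi^2}{8} + \frac{1}{4} \sum_{\ell=1}^{\infty} \frac{1}{\ell^2},

hence S − S/4 = π^2/8, so

(4.22)    \sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}.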
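The identities (4.18), (4.19), and (4.22) lend themselves to a quick numerical sanity check. The Python sketch below (function names and truncation lengths are our own choices) sums finitely many terms of each series; the agreement is only up to the truncation error, which for these series decays like the reciprocal of the number of terms.

```python
import math


def abs_theta_partial_sum(theta: float, terms: int) -> float:
    """Partial sum of (4.18): pi/2 - (4/pi) * sum_{l < terms} cos((2l+1) theta) / (2l+1)^2."""
    s = sum(math.cos((2 * l + 1) * theta) / (2 * l + 1) ** 2 for l in range(terms))
    return math.pi / 2.0 - 4.0 / math.pi * s


def odd_reciprocal_squares(terms: int) -> float:
    """Partial sum of the series in (4.19)."""
    return sum(1.0 / (2 * l + 1) ** 2 for l in range(terms))


def reciprocal_squares(terms: int) -> float:
    """Partial sum of the series in (4.22)."""
    return sum(1.0 / k ** 2 for k in range(1, terms + 1))


if __name__ == "__main__":
    for theta in (0.0, 1.0, -2.5, math.pi):
        print(theta, abs_theta_partial_sum(theta, 2000), abs(theta))
    print(odd_reciprocal_squares(100_000), math.pi ** 2 / 8.0)
    print(reciprocal_squares(100_000), math.pi ** 2 / 6.0)
```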

We see from (4.17) that if f is given by (4.16), then f̂(k) satisfies

(4.23)    |\hat{f}(k)| \le \frac{C}{k^2 + 1}.

This is a special case of the following generalization of (4.15).


Proposition 4.2. Let f be Lipschitz continuous and piecewise C^2 on T^1. Then (4.23)
holds.

Proof. Here we are assuming f is C^2 on T^1 \ {p_1, . . . , p_ℓ}, and f′ and f″ have limits at each
of the endpoints of the associated intervals in T^1, but f is not assumed to be differentiable
at the endpoints p_ℓ. We can write f as a sum of functions f_ν, each of which is Lipschitz
on T^1, C^2 on T^1 \ p_ν, and f_ν′ and f_ν″ have limits as one approaches p_ν from either side. It
suffices to show that each f̂_ν(k) satisfies (4.23). Now g(θ) = f_ν(θ + p_ν − π) is singular only
at θ = π, and ĝ(k) = f̂_ν(k)e^{ik(p_ν−π)}, so it suffices to prove Proposition 4.2 when f has a
singularity only at θ = π. In other words, f ∈ C^2([−π, π]), and f(−π) = f(π).
In this case, we still have (4.10), since the endpoint contributions from integration by
parts still cancel. A second integration by parts gives, in place of (4.12),

(4.24)    \hat{f}(k) = \frac{1}{2\pi i k} \int_{-\pi}^{\pi} f'(\theta)\, \frac{i}{k} \frac{\partial}{\partial\theta} (e^{-ik\theta})\, d\theta
                    = -\frac{1}{2\pi k^2} \left[ \int_{-\pi}^{\pi} f''(\theta) e^{-ik\theta}\, d\theta + f'(\pi) - f'(-\pi) \right],

which yields (4.23).
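We next make use of (4.5) to produce results on ∫_{T^1} |f(θ)|^2 dθ, starting with the
following.

Proposition 4.3. Given f ∈ A(T^1),

(4.25)    \sum |\hat{f}(k)|^2 = \frac{1}{2\pi} \int_{T^1} |f(\theta)|^2\, d\theta.

More generally, if also g ∈ A(T^1),

(4.26)    \sum \hat{f}(k) \overline{\hat{g}(k)} = \frac{1}{2\pi} \int_{T^1} f(\theta) \overline{g(\theta)}\, d\theta.

Proof. Switching order of summation and integration and using (4.5), we have

(4.27)    \frac{1}{2\pi} \int_{T^1} f(\theta) \overline{g(\theta)}\, d\theta = \frac{1}{2\pi} \int_{T^1} \sum_{j,k} \hat{f}(j) \overline{\hat{g}(k)}\, e^{i(j-k)\theta}\, d\theta
                                                                      = \sum_k \hat{f}(k) \overline{\hat{g}(k)},

giving (4.26). Taking g = f gives (4.25).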
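Identity (4.25) can be tested numerically on a smooth example. In the Python sketch below, the test function e^{sin θ}, the equispaced trapezoid rule used to approximate (4.1), and the cutoff |k| ≤ 10 are all our own assumptions for illustration. For smooth periodic functions the trapezoid rule is very accurate, and the coefficients of this particular function decay so fast that the omitted terms are far below rounding error.

```python
import cmath
import math


def fourier_coefficient(f, k: int, nodes: int = 256) -> complex:
    """Approximate (4.1) by the trapezoid rule on an equispaced grid in [0, 2*pi)."""
    total = 0j
    for j in range(nodes):
        theta = 2.0 * math.pi * j / nodes
        total += f(theta) * cmath.exp(-1j * k * theta)
    return total / nodes


def l2_norm_squared(f, nodes: int = 256) -> float:
    """Approximate (1 / 2 pi) * integral of |f|^2 over one period."""
    total = 0.0
    for j in range(nodes):
        theta = 2.0 * math.pi * j / nodes
        total += abs(f(theta)) ** 2
    return total / nodes


def example_f(theta: float) -> float:
    """Hypothetical smooth test function on T^1."""
    return math.exp(math.sin(theta))


if __name__ == "__main__":
    lhs = sum(abs(fourier_coefficient(example_f, k)) ** 2 for k in range(-10, 11))
    rhs = l2_norm_squared(example_f)
    print(lhs, rhs, abs(lhs - rhs))
```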


We will extend the scope of Proposition 4.3 below. Closely tied to this is the issue of
convergence of S_N f to f as N → ∞, where

(4.28)    S_N f(\theta) = \sum_{|k| \le N} \hat{f}(k) e^{ik\theta}.


Clearly f ∈ A(T^1) ⇒ S_N f → f uniformly on T^1 as N → ∞. Here, we are interested in
convergence in L^2-norm, where

(4.29)    \|f\|_{L^2}^2 = \frac{1}{2\pi} \int_{T^1} |f(\theta)|^2\, d\theta.

Given f ∈ R(T^1), this defines a “norm,” satisfying the following result, called the triangle
inequality:

(4.30)    \|f + g\|_{L^2} \le \|f\|_{L^2} + \|g\|_{L^2}.

See Appendix A for details on this. Behind these results is the fact that

(4.31)    \|f\|_{L^2}^2 = (f, f)_{L^2},

where, when f and g belong to R(T^1), we set

(4.32)    (f, g)_{L^2} = \frac{1}{2\pi} \int_{T^1} f(\theta) \overline{g(\theta)}\, d\theta.

Thus the content of (4.25) is that

(4.33)    \sum |\hat{f}(k)|^2 = \|f\|_{L^2}^2,

and that of (4.26) is that

(4.34)    \sum \hat{f}(k) \overline{\hat{g}(k)} = (f, g)_{L^2}.

The left side of (4.33) is the square norm of the sequence (f̂(k)) in ℓ^2. Generally, a
sequence (a_k) (k ∈ Z) belongs to ℓ^2 if and only if

(4.35)    \|(a_k)\|_{\ell^2}^2 = \sum |a_k|^2 < \infty.

There is an associated inner product

(4.36)    ((a_k), (b_k)) = \sum a_k \overline{b_k}.

As in (4.30), one has (see Appendix A)

(4.37)    \|(a_k) + (b_k)\|_{\ell^2} \le \|(a_k)\|_{\ell^2} + \|(b_k)\|_{\ell^2}.

As for the notion of L^2-norm convergence, we say

(4.38)    f_\nu \to f \text{ in } L^2 \iff \|f - f_\nu\|_{L^2} \to 0.


There is a similar notion of convergence in ℓ^2. Clearly

(4.39)    \|f - f_\nu\|_{L^2} \le \sup_\theta |f(\theta) - f_\nu(\theta)|.

In view of the uniform convergence S_N f → f for f ∈ A(T^1) noted above, we have

(4.40)    f ∈ A(T^1) ⟹ S_N f → f in L^2, as N → ∞.

The triangle inequality implies

(4.41)    \left| \|f\|_{L^2} - \|S_N f\|_{L^2} \right| \le \|f - S_N f\|_{L^2},

and clearly (by Proposition 4.3)

(4.42)    \|S_N f\|_{L^2}^2 = \sum_{k=-N}^{N} |\hat{f}(k)|^2,

so

(4.43)    \|f - S_N f\|_{L^2} \to 0 \text{ as } N \to \infty \implies \|f\|_{L^2}^2 = \sum |\hat{f}(k)|^2.

We now consider more general functions f ∈ R(T^1). With f̂(k) and S_N f defined by
(4.1) and (4.28), we define R_N f by

(4.44)    f = S_N f + R_N f.

Note that ∫_{T^1} f(θ)e^{−ikθ} dθ = ∫_{T^1} S_N f(θ)e^{−ikθ} dθ for |k| ≤ N, hence

(4.45)    (f, S_N f)_{L^2} = (S_N f, S_N f)_{L^2},

and hence

(4.46)    (S_N f, R_N f)_{L^2} = 0.

Consequently,

(4.47)    \|f\|_{L^2}^2 = (S_N f + R_N f, S_N f + R_N f)_{L^2} = \|S_N f\|_{L^2}^2 + \|R_N f\|_{L^2}^2.

In particular,

(4.48)    \|S_N f\|_{L^2} \le \|f\|_{L^2}.

We are now in a position to prove the following.


Lemma 4.4. Let f, f_ν belong to R(T^1). Assume

(4.49)    \lim_{\nu \to \infty} \|f - f_\nu\|_{L^2} = 0,

and, for each ν,

(4.50)    \lim_{N \to \infty} \|f_\nu - S_N f_\nu\|_{L^2} = 0.

Then

(4.51)    \lim_{N \to \infty} \|f - S_N f\|_{L^2} = 0.

Proof. Writing f − S_N f = (f − f_ν) + (f_ν − S_N f_ν) + S_N(f_ν − f), and using the triangle
inequality, we have, for each ν,

(4.52)    \|f - S_N f\|_{L^2} \le \|f - f_\nu\|_{L^2} + \|f_\nu - S_N f_\nu\|_{L^2} + \|S_N (f_\nu - f)\|_{L^2}.

Taking N → ∞ and using (4.48), we have

(4.53)    \limsup_{N \to \infty} \|f - S_N f\|_{L^2} \le 2 \|f - f_\nu\|_{L^2},

for each ν. Then (4.49) yields the desired conclusion (4.51).


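Given f ∈ C(T^1), we have trigonometric polynomials f_ν → f uniformly on T^1, and
clearly (4.50) holds for each such f_ν. Thus Lemma 4.4 yields the following.

(4.54)    f \in C(T^1) \implies S_N f \to f \text{ in } L^2, \text{ and } \sum |\hat{f}(k)|^2 = \|f\|_{L^2}^2.

Lemma 4.4 also applies to many discontinuous functions. Consider, for example,

(4.55)    f(\theta) = \begin{cases} 0 & \text{for } -\pi < \theta < 0, \\ 1 & \text{for } 0 < \theta < \pi. \end{cases}

We can set, for ν ∈ N,

(4.56)    f_\nu(\theta) = \begin{cases} 0 & \text{for } -\pi \le \theta \le 0, \\ \nu\theta & \text{for } 0 \le \theta \le \frac{1}{\nu}, \\ 1 & \text{for } \frac{1}{\nu} \le \theta \le \pi - \frac{1}{\nu}, \\ \nu(\pi - \theta) & \text{for } \pi - \frac{1}{\nu} \le \theta \le \pi. \end{cases}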


Then each f_ν ∈ C(T^1). (In fact, f_ν ∈ A(T^1), by Proposition 4.2.) Also, one can check
that ‖f − f_ν‖^2_{L^2} ≤ 2/ν. Thus the conclusion in (4.54) holds for f given by (4.55).
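The estimate on ‖f − f_ν‖^2_{L^2} can be checked directly. A short computation (ours, not spelled out in the text) gives the exact value 1/(3πν) with the normalization (4.29), since f − f_ν is supported on two intervals of length 1/ν on which it equals 1 − νt for t ∈ [0, 1/ν]. The Python sketch below compares a midpoint-rule approximation with this value and with the bound 2/ν; the function names and the number of quadrature nodes are assumptions made for the illustration.

```python
import math


def step(theta: float) -> float:
    """f from (4.55), evaluated for theta in (-pi, pi)."""
    return 1.0 if 0.0 < theta < math.pi else 0.0


def ramp(theta: float, nu: int) -> float:
    """f_nu from (4.56), evaluated for theta in [-pi, pi]."""
    if theta <= 0.0:
        return 0.0
    if theta <= 1.0 / nu:
        return nu * theta
    if theta <= math.pi - 1.0 / nu:
        return 1.0
    return nu * (math.pi - theta)


def l2_distance_squared(nu: int, nodes: int = 100_000) -> float:
    """Midpoint-rule value of (1 / 2 pi) * integral over [-pi, pi] of |f - f_nu|^2."""
    h = 2.0 * math.pi / nodes
    total = 0.0
    for j in range(nodes):
        theta = -math.pi + (j + 0.5) * h
        total += (step(theta) - ramp(theta, nu)) ** 2
    return total * h / (2.0 * math.pi)


if __name__ == "__main__":
    for nu in (5, 10, 40):
        print(nu, l2_distance_squared(nu), 1.0 / (3.0 * math.pi * nu), 2.0 / nu)
```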
More generally, any piecewise continuous function on T^1 is an L^2 limit of continuous
functions, so the conclusion of (4.54) holds for them. To go further, let us consider the class
of Riemann integrable functions. A function f : T^1 → R is Riemann integrable provided
f is bounded (say |f| ≤ M) and, for each δ > 0, there exist piecewise constant functions
g_δ and h_δ on T^1 such that

(4.57)    g_\delta \le f \le h_\delta, \quad \text{and} \quad \int_{T^1} \left( h_\delta(\theta) - g_\delta(\theta) \right) d\theta < \delta.

Then

(4.58)    \int_{T^1} f(\theta)\, d\theta = \lim_{\delta \to 0} \int_{T^1} g_\delta(\theta)\, d\theta = \lim_{\delta \to 0} \int_{T^1} h_\delta(\theta)\, d\theta.

Note that we can assume |h_δ|, |g_δ| < M + 1, and so

(4.59)    \frac{1}{2\pi} \int_{T^1} |f(\theta) - g_\delta(\theta)|^2\, d\theta \le \frac{M+1}{\pi} \int_{T^1} |h_\delta(\theta) - g_\delta(\theta)|\, d\theta < \frac{M+1}{\pi}\, \delta,

so g_δ → f in L^2-norm. A function f : T^1 → C is Riemann integrable provided its real and
imaginary parts are. In such a case, there are also piecewise constant functions f_ν → f in
L^2-norm, giving the following.

Proposition 4.5. We have

(4.60)    f \in R(T^1) \implies S_N f \to f \text{ in } L^2, \text{ and } \sum |\hat{f}(k)|^2 = \|f\|_{L^2}^2.

This is not the end of the story. Lemma 4.4 extends to unbounded functions on T^1 that
are square integrable, such as

(4.61)    f(\theta) = |\theta|^{-\alpha} \text{ on } [-\pi, \pi], \quad 0 < \alpha < \frac{1}{2}.

In such a case, one can take f_ν(θ) = min(f(θ), ν), ν ∈ N. Then each f_ν is continuous and
‖f − f_ν‖_{L^2} → 0 as ν → ∞. The conclusion of (4.60) holds for such f. We can fit (4.61)
into the following general setting. If f : T^1 → C, we say

    f ∈ R^2(T^1) ⟺ f, |f|^2 ∈ R^#(T^1),

where R^# is defined in §6 of Chapter 4. Though we will not pursue the details, Lemma
4.4 extends to f, f_ν ∈ R^2(T^1), and then (4.60) holds for f ∈ R^2(T^1).
The ultimate theory of functions for which the result

(4.62)    S_N f → f in L^2-norm

holds was produced by H. Lebesgue in what is now known as the theory of Lebesgue
measure and integration. There is the notion of measurability of a function f : T^1 →
C. One says f ∈ L^2(T^1) provided f is measurable and ∫_{T^1} |f(θ)|^2 dθ < ∞, the integral
here being the Lebesgue integral. Actually, L^2(T^1) consists of equivalence classes of such
functions, where f_1 ∼ f_2 if and only if ∫ |f_1(θ) − f_2(θ)|^2 dθ = 0. With ℓ^2 as in (4.35), it is
then the case that

(4.63)    F : L^2(T^1) → ℓ^2,

given by

(4.64)    (F f)(k) = f̂(k),

is one-to-one and onto, with

(4.65)    \sum |\hat{f}(k)|^2 = \|f\|_{L^2}^2, \quad \forall\, f \in L^2(T^1),

and

(4.66)    S_N f → f in L^2, ∀ f ∈ L^2(T^1).

We refer to books on the subject (e.g., [T2]) for information on Lebesgue integration.
We mention two key propositions which, together with the arguments given above,
establish these results. The fact that F f ∈ ℓ^2 for all f ∈ L^2(T^1) and (4.65)–(4.66) hold
follows via Lemma 4.4 from the following.

Proposition A. Given f ∈ L^2(T^1), there exist f_ν ∈ C(T^1) such that f_ν → f in L^2.

As for the surjectivity of F in (4.63), note that, given (a_k) ∈ ℓ^2, the sequence

    f_\nu(\theta) = \sum_{|k| \le \nu} a_k e^{ik\theta}

satisfies, for µ > ν,

    \|f_\mu - f_\nu\|_{L^2}^2 = \sum_{\nu < |k| \le \mu} |a_k|^2 \to 0 \text{ as } \nu \to \infty.

That is to say, (f_ν) is a Cauchy sequence in L^2(T^1). Surjectivity follows from the fact that
Cauchy sequences in L^2(T^1) always converge to a limit:


Proposition B. If (f_ν) is a Cauchy sequence in L^2(T^1), there exists f ∈ L^2(T^1) such
that f_ν → f in L^2-norm.

Proofs of Propositions A and B can be found in the standard texts on measure theory
and integration, such as [T2].
We now establish a sufficient condition for a function f to belong to A(T^1), more general
than that in Proposition 4.2.

Proposition 4.6. If f is a continuous, piecewise C^1 function on T^1, then Σ |f̂(k)| < ∞.

Proof. As in the proof of Proposition 4.2, we can reduce the problem to the case f ∈
C^1([−π, π]), f(−π) = f(π). In such a case, with g = f′ ∈ C([−π, π]), the integration by
parts argument (4.10) gives

(4.67)    \hat{f}(k) = \frac{1}{ik}\, \hat{g}(k), \quad k \ne 0.

By (4.60),

(4.68)    \sum |\hat{g}(k)|^2 = \|g\|_{L^2}^2.

Also, by Cauchy’s inequality (cf. Appendix A),

(4.69)    \sum_{k \ne 0} |\hat{f}(k)| \le \left( \sum_{k \ne 0} \frac{1}{k^2} \right)^{1/2} \left( \sum_{k \ne 0} |\hat{g}(k)|^2 \right)^{1/2} \le C \|g\|_{L^2}.

This completes the proof.


Moving beyond square integrable functions, we now provide some results on Fourier
series for a function f ∈ R^#(T^1). For starters, if f ∈ R^#(T^1), then (4.1) yields

(4.70)    |\hat{f}(k)| \le \frac{1}{2\pi} \int_0^{2\pi} |f(\theta)|\, d\theta = \frac{1}{2\pi} \|f\|_{L^1(T^1)}.

Using this, we can establish the following result, which is part of what is called the
Riemann-Lebesgue lemma.

Proposition 4.7. Given f ∈ R^#(T^1),

(4.71)    \hat{f}(k) \to 0, \quad \text{as } |k| \to \infty.

Proof. By Proposition 6.4 of Chapter 4, there exist f_ν ∈ R(T^1) such that

(4.72)    \|f - f_\nu\|_{L^1(T^1)} \to 0, \quad \text{as } \nu \to \infty.

Now Proposition 4.5 applies to each f_ν, so Σ_k |f̂_ν(k)|^2 < ∞, for each ν. Hence

(4.73)    \hat{f}_\nu(k) \to 0, \quad \text{as } |k| \to \infty, \text{ for each } \nu.

Since

(4.74)    \sup_k |\hat{f}(k) - \hat{f}_\nu(k)| \le \frac{1}{2\pi} \|f - f_\nu\|_{L^1(T^1)},

(4.71) follows.
We now consider conditions on f ∈ R^#(T^1) guaranteeing that S_N f(θ) converges to f(θ)
as N → ∞, at a particular point θ ∈ T^1. Note that

(4.75)    S_N f(\theta) = \sum_{k=-N}^{N} \hat{f}(k) e^{ik\theta}
                       = \frac{1}{2\pi} \sum_{k=-N}^{N} \int_{T^1} f(\varphi) e^{ik(\theta - \varphi)}\, d\varphi
                       = \int_{T^1} f(\varphi) D_N(\theta - \varphi)\, d\varphi,

where D_N(θ), called the Dirichlet kernel, is given by

(4.76)    D_N(\theta) = \frac{1}{2\pi} \sum_{k=-N}^{N} e^{ik\theta}.

The following compact formula is very useful.

Lemma 4.8. We have D_N(0) = (2N + 1)/2π, and if θ ∈ T^1 \ 0,

(4.77)    D_N(\theta) = \frac{1}{2\pi} \frac{\sin (N + 1/2)\theta}{\sin \theta/2}.

Proof. The formula (4.76) can be rewritten

(4.78)    D_N(\theta) = \frac{1}{2\pi}\, e^{-iN\theta} \sum_{k=0}^{2N} e^{ik\theta}.

Using the geometrical series Σ_{k=0}^{2N} z^k = (1 − z^{2N+1})/(1 − z), for z ≠ 1, we have

(4.79)    D_N(\theta) = \frac{1}{2\pi}\, e^{-iN\theta}\, \frac{1 - e^{i(2N+1)\theta}}{1 - e^{i\theta}}
                     = \frac{1}{2\pi} \frac{e^{i(N+1)\theta} - e^{-iN\theta}}{e^{i\theta} - 1},

and multiplying numerator and denominator by e^{−iθ/2} gives (4.77).
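Lemma 4.8 is easy to confirm numerically. The Python sketch below (function names are ours) evaluates the Dirichlet kernel once from the defining sum (4.76) and once from the closed form (4.77), treating θ = 0 separately as in the statement of the lemma.

```python
import cmath
import math


def dirichlet_sum(n: int, theta: float) -> float:
    """D_N(theta) from the defining sum (4.76); the imaginary parts cancel."""
    total = sum(cmath.exp(1j * k * theta) for k in range(-n, n + 1))
    return total.real / (2.0 * math.pi)


def dirichlet_closed(n: int, theta: float) -> float:
    """D_N(theta) from Lemma 4.8, with the value (2N + 1) / (2 pi) at theta = 0."""
    half = math.sin(theta / 2.0)
    if half == 0.0:
        return (2 * n + 1) / (2.0 * math.pi)
    return math.sin((n + 0.5) * theta) / (2.0 * math.pi * half)


if __name__ == "__main__":
    for n in (1, 5, 20):
        for theta in (0.0, 0.1, 1.0, -2.0, 3.0):
            print(n, theta, dirichlet_sum(n, theta), dirichlet_closed(n, theta))
```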


Note that if R_ϕ f(θ) = f(θ + ϕ), then, for each f ∈ R^#(T^1),

(4.80)    S_N R_ϕ f = R_ϕ S_N f,

so to test for convergence of S_N f to f at ϕ, it suffices to test for convergence of S_N R_ϕ f
to R_ϕ f at θ = 0. Thus we seek conditions under which

(4.81)    S_N f(0) = \int_{T^1} f(\varphi) D_N(\varphi)\, d\varphi

converges to f(0) as N → ∞. (Note that D_N(ϕ) = D_N(−ϕ).) We have

(4.82)    S_N f(0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{f(\theta)}{\sin \theta/2}\, \sin\left( N + \tfrac{1}{2} \right)\theta\, d\theta.

Also,

(4.83)    \sin\left( N + \tfrac{1}{2} \right)\theta = (\sin N\theta)(\cos \tfrac{1}{2}\theta) + (\cos N\theta)(\sin \tfrac{1}{2}\theta).

Using this in concert with Proposition 4.7, we have the following.

Lemma 4.9. Let f ∈ R^#(T^1). Assume f “vanishes” at θ = 0 in the sense that

(4.84)    \frac{f(\theta)}{\sin \theta/2} \in R^\#([-\pi, \pi]).

Then

(4.85)    S_N f(0) \to 0, \quad \text{as } N \to \infty.

Applying Lemma 4.9 to f(θ) = g(θ) − g(0), we have the following.

Corollary 4.10. Let g ∈ R^#(T^1), and assume

(4.86)    \frac{g(\theta) - g(0)}{\sin \theta/2} \in R^\#([-\pi, \pi]).

Then

(4.87)    S_N g(0) \to g(0), \quad \text{as } N \to \infty.

Bringing in (4.80), we have the following.


Proposition 4.11. Let f ∈ R^#(T^1). Fix θ_0 ∈ T^1. If

(4.88)    \frac{f(\theta) - f(\theta_0)}{\sin (\theta - \theta_0)/2} \in R^\#([-\pi + \theta_0, \pi + \theta_0]),

then

(4.89)    S_N f(\theta_0) \to f(\theta_0), \quad \text{as } N \to \infty.

Proposition 4.11 has the following application. We say a function f ∈ R^#(T^1) is Hölder
continuous at θ_0 ∈ T^1, with exponent α ∈ (0, 1], provided there exists δ > 0, C < ∞, such
that

(4.90)    |θ − θ_0| ≤ δ ⟹ |f(θ) − f(θ_0)| ≤ C|θ − θ_0|^α.

Proposition 4.11 implies the following.

Proposition 4.12. Let f ∈ R^#(T^1). If f is Hölder continuous at θ_0, with some exponent
α ∈ (0, 1], then (4.89) holds.

Proof. We have

(4.91)    \left| \frac{f(\theta) - f(\theta_0)}{\sin (\theta - \theta_0)/2} \right| \le C' |\theta - \theta_0|^{-(1-\alpha)},

for |θ − θ_0| ≤ δ. Since sin(θ − θ_0)/2 is bounded away from 0 for θ ∈ [−π + θ_0, π + θ_0] \
[θ_0 − δ, θ_0 + δ], the hypothesis (4.88) holds.

We now look at the following class of piecewise regular functions, with jumps. Take
points p_j,

(4.92)    −π = p_0 < p_1 < · · · < p_K = π.

Take functions

(4.93)    f_j : [p_j, p_{j+1}] → C,

Hölder continuous with exponent α > 0, for 0 ≤ j ≤ K − 1. Define f : T^1 → C by

(4.94)    f(\theta) = \begin{cases} f_j(\theta), & \text{if } p_j < \theta < p_{j+1}, \\ \dfrac{f_j(p_{j+1}) + f_{j+1}(p_{j+1})}{2}, & \text{if } \theta = p_{j+1}. \end{cases}

By convention, we take K ≡ 0 (recall that π ≡ −π in T^1).


Proposition 4.13. With f as specified above, we have

(4.95)    S_N f(θ) → f(θ), ∀ θ ∈ T^1.

Proof. If θ ∉ {p_0, . . . , p_K}, this follows from Proposition 4.12. It remains to consider the
case θ = p_j for some j. By (4.80), there is no loss of generality in taking p_j = 0. Parallel
to (4.80), we have

(4.96)    S_N T f = T S_N f, T f(θ) = f(−θ).

Hence

(4.97)    S_N f(0) = \frac{1}{2} S_N (f + T f)(0).

However, f + T f is Hölder continuous at θ = 0, with value 2f(0), so Proposition 4.12
implies

(4.98)    \frac{1}{2} S_N (f + T f)(0) \to f(0), \quad \text{as } N \to \infty.

This gives (4.95) for θ = p_j = 0.
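Proposition 4.13 can be watched at work on a concrete function with a jump. The Python sketch below uses an example of our own choosing, not one from the text: the sawtooth f(θ) = θ on (−π, π), which fits (4.92)–(4.94) with K = 1 and takes the mean value 0 at the jump θ = π. A direct integration by parts (ours) gives f̂(k) = i(−1)^k/k for k ≠ 0 and f̂(0) = 0, so that S_N f(θ) = 2 Σ_{k=1}^{N} (−1)^{k+1} sin(kθ)/k.

```python
import math


def sawtooth_partial_sum(theta: float, n: int) -> float:
    """S_N f(theta) for f(theta) = theta on (-pi, pi): 2 * sum_{k=1}^{N} (-1)^(k+1) sin(k theta) / k."""
    return 2.0 * sum((-1) ** (k + 1) * math.sin(k * theta) / k for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (10, 100, 1000, 5000):
        # At an interior point the sums approach f(1) = 1; at the jump they
        # approach the midpoint value 0, in line with (4.94)-(4.95).
        print(n, sawtooth_partial_sum(1.0, n), sawtooth_partial_sum(math.pi, n))
```

Convergence at interior points is slow (the error is of order 1/N), which reflects the slow decay of the coefficients of a discontinuous function.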

Exercises

1. Prove (4.80).

2. Prove (4.96).

3. Compute f̂(k) when

(4.99)    f(\theta) = \begin{cases} 1 & \text{for } 0 < \theta < \pi, \\ 0 & \text{for } -\pi < \theta < 0. \end{cases}

Then use (4.60) to obtain another proof of (4.22).

4. Apply Proposition 4.13 to f in (4.99), when θ = 0, π/2, π. Use the computation at
θ = π/2 to show the following (compare Exercise 31 in Chapter 4, §5):

    \frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots.

5. Apply (4.60) when f(θ) is given by (4.16). Use this to show that

    \sum_{k=1}^{\infty} \frac{1}{k^4} = \frac{\pi^4}{90}.


6. Use Proposition 4.12 in concert with Proposition 4.2 to demonstrate that (4.3) holds
when f is Lipschitz and piecewise C^2 on T^1, without recourse to Corollary 3.4 (whose
proof in §3 uses the Stone-Weierstrass theorem). Use this in turn to prove Proposition 4.1,
without using Corollary 3.4.

7. Use the results of Exercise 6 to give a proof of Corollary 3.4 that does not use the
Stone-Weierstrass theorem.
Hint. As in the end of the proof of Theorem 2.1, each f ∈ C(T^1) can be uniformly
approximated by a sequence of Lipschitz, piecewise linear functions.

Recall that Corollary 3.4 states that each f ∈ C(T^1) can be uniformly approximated by
a sequence of finite linear combinations of the functions e^{ikθ}, k ∈ Z. The proof given in
§3 relied on the Weierstrass approximation theorem, Theorem 2.1, which was used in the
proof of Theorems 3.1 and 3.2. Exercise 7 indicates a proof of Corollary 3.4 that does not
depend on Theorem 2.1.

8. Give another proof of Theorem 2.1, as a corollary of Corollary 3.4.
Hint. You can take I = [−π/2, π/2]. Given f ∈ C(I), you can extend it to f ∈ C([−π, π]),
vanishing at ±π, and identify such f with an element of C(T^1). Given ε > 0, approximate
f uniformly to within ε on [−π, π] by a finite sum

    \sum_{k=-N}^{N} a_k e^{ik\theta}.

Then approximate e^{ikθ} uniformly to within ε/(2N + 1) for each k ∈ {−N, . . . , N}, by a
partial sum of the power series for e^{ikθ}.

9. Let f ∈ C(T^1). Assume there exist f_ν ∈ A(T^1) and B < ∞ such that f_ν → f uniformly
on T^1 and

    \sum_{k=-\infty}^{\infty} |\hat{f}_\nu(k)| \le B, \quad \forall\, \nu.

Show that f ∈ A(T^1).

10. Let f ∈ C(T^1). Assume there exist f_ν ∈ C(T^1) satisfying the conditions of Proposition
4.6 such that f_ν → f uniformly on T^1, and assume there exists C < ∞ such that

    \int_{T^1} |f_\nu'(\theta)|^2\, d\theta \le C, \quad \forall\, \nu.

Show that f ∈ A(T^1). Hint. Use (4.69), with f replaced by f_ν.


5. Newton’s method

Here we describe a method to approximate the solution to

(5.1)    f(ξ) = 0.

We assume f : [a, b] → R is continuous and f ∈ C^2((a, b)). We assume it is known that
f vanishes somewhere in (a, b). For example, f(a) and f(b) might have opposite signs.
We take x_0 ∈ (a, b) as an initial guess of a solution to (5.1), and inductively construct the
sequence (x_k), going from x_k to x_{k+1} as follows. Replace f by its best linear approximation
at x_k,

(5.2)    g(x) = f(x_k) + f′(x_k)(x − x_k),

and solve g(x_{k+1}) = 0. This yields

(5.3)    x_{k+1} - x_k = -\frac{f(x_k)}{f'(x_k)},

or

(5.4)    x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}.

Naturally, we need to assume f′(x) is bounded away from 0 on (a, b). This production of
the sequence (x_k) is Newton’s method, and as we will see, under appropriate hypotheses
it converges quite rapidly to ξ.
We want to give a condition guaranteeing that |x_{k+1} − ξ| < |x_k − ξ|. Say

(5.5)    x_k = ξ + δ.

Then (5.4) yields

(5.6)    x_{k+1} - \xi = \delta - \frac{f(\xi + \delta)}{f'(\xi + \delta)} = \frac{f'(\xi + \delta)\delta - f(\xi + \delta)}{f'(\xi + \delta)}.

Now the mean value theorem implies

(5.7)    f(ξ + δ) − f(ξ) = f′(ξ + τδ)δ, for some τ ∈ (0, 1).


Since $f(\xi) = 0$, we get from (5.6) that

(5.7)    $x_{k+1} - \xi = \dfrac{f'(\xi + \delta) - f'(\xi + \tau\delta)}{f'(\xi + \delta)}\,\delta.$

A second application of the mean value theorem gives

(5.8)    $f'(\xi + \delta) - f'(\xi + \tau\delta) = (1 - \tau)\delta f''(\xi + \gamma\delta),$

for some $\gamma \in (\tau, 1)$, hence

(5.9)    $x_{k+1} - \xi = (1 - \tau)\dfrac{f''(\xi + \gamma\delta)}{f'(\xi + \delta)}\,\delta^2, \quad \tau \in (0, 1),\ \gamma \in (\tau, 1).$

Consequently,

(5.10)    $|x_{k+1} - \xi| \le \sup_{0 < \gamma < 1} \left|\dfrac{f''(\xi + \gamma\delta)}{f'(\xi + \delta)}\right| \delta^2.$

A favorable condition for convergence is that the right side of (5.10) is $\le \beta\delta$ for some
$\beta < 1$. This leads to the following.
Proposition 5.1. Let $f \in C([a, b])$ be $C^2$ on $(a, b)$. Assume there exists a solution $\xi \in (a, b)$
to (5.1). Assume there exist $A, B \in (0, \infty)$ such that

(5.11)    $|f''(x)| \le A, \quad |f'(x)| \ge B, \quad \forall\, x \in (a, b).$

Pick $x_0 \in (a, b)$. Assume

(5.12)    $|x_0 - \xi| = \delta_0, \quad [\xi - \delta_0, \xi + \delta_0] \subset (a, b),$

and

(5.13)    $\dfrac{A}{B}\,\delta_0 = \beta < 1.$

Then $x_k$, defined inductively by (5.4), converges to $\xi$ as $k \to \infty$.


When Proposition 5.1 applies, one clearly has

(5.14)    $|x_k - \xi| \le \beta^k \delta_0.$
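For readers who want the induction behind (5.14) spelled out, here is one way to organize it, offered as a sketch under the hypotheses of Proposition 5.1. Write $\delta_k = |x_k - \xi|$ and suppose $\delta_k \le \delta_0$, which holds for $k = 0$. Then $x_k$ and every point between $x_k$ and $\xi$ lie in $[\xi - \delta_0, \xi + \delta_0] \subset (a, b)$, so (5.10) and (5.11) apply:

```latex
\delta_{k+1} \le \frac{A}{B}\,\delta_k^2
  = \Bigl(\frac{A}{B}\,\delta_k\Bigr)\delta_k
  \le \Bigl(\frac{A}{B}\,\delta_0\Bigr)\delta_k
  = \beta\,\delta_k
  \le \delta_0 .
```

The last inequality keeps the induction going, and iterating $\delta_{k+1} \le \beta\delta_k$ gives $\delta_k \le \beta^k\delta_0$, which tends to 0 since $\beta < 1$.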

In fact, (5.10) implies much faster convergence than this. With $|x_k - \xi| = \delta_k$, (5.10) implies

(5.15)    $\delta_{k+1} \le \dfrac{A}{B}\,\delta_k^2,$


hence

(5.16)    $\delta_1 \le \dfrac{A}{B}\,\delta_0^2, \quad \delta_2 \le \left(\dfrac{A}{B}\right)^{1+2} \delta_0^4, \quad \delta_3 \le \left(\dfrac{A}{B}\right)^{1+2+4} \delta_0^8,$

and, inductively,

(5.17)    $\delta_k \le \left(\dfrac{A}{B}\right)^{2^k - 1} \delta_0^{2^k} = \beta^{2^k - 1}\,\delta_0,$

with $\beta$ as in (5.13). Note that the exponent on $\beta$ in (5.17) is much larger (for moderately
large $k$) than that in (5.14). One says the sequence $(x_k)$ converges quadratically to the
limit $\xi$, solving (5.1). Roughly speaking, $x_{k+1}$ has twice as many digits of accuracy as $x_k$.
If we change (5.1) to

(5.18)    $f(\xi) = y,$

then the results above apply to $\tilde{f}(x) = f(x) - y$, so we get the sequence of approximate
solutions defined inductively by

(5.19)    $x_{k+1} = x_k - \dfrac{f(x_k) - y}{f'(x_k)},$

and the formula (5.9) and estimate (5.10) remain valid.


As an example, let us take

(5.20)    $f(x) = x^2$ on $[a, b] = [1, 2],$

and approximate $\xi = \sqrt{2}$, which solves (5.18) with $y = 2$. Note that $f(1) = 1 < 2$ and
$f(2) = 4 > 2$. In this case, (5.19) becomes

(5.21)    $x_{k+1} = x_k - \dfrac{x_k^2 - 2}{2x_k} = \dfrac{x_k}{2} + \dfrac{1}{x_k}.$

Let us pick

(5.22)    $x_0 = \dfrac{3}{2}.$

Examining $(1.4)^2$ and $(1.5)^2$, we see that $1.4 < \sqrt{2} < 1.5$. Thus (5.12) holds with $\delta_0 < 1/10$.
Furthermore, (5.11) holds with $A = B = 2$, so (5.13) holds with $\beta < 1/10$. Hence, by
(5.17),

(5.23)    $|x_k - \sqrt{2}| \le 10^{-2^k}.$


Explicit computations give

(5.24)
$x_0 = 1.5$
$x_1 = 1.41666666666666$
$x_2 = 1.41421568627451$
$x_3 = 1.41421356237469$
$x_4 = 1.41421356237309.$

We have $|x_4^2 - 2| \le 4 \cdot 10^{-16}$, consistent with (5.23).
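The iterates in this example are rational numbers, so the bound (5.23) can be checked without any rounding error. The sketch below is an illustration added here, with hypothetical names `sqrt2_iterates` and `bounds_hold`. It uses the elementary observation that, for $x > 0$, $|x - \sqrt{2}| = |x^2 - 2|/(x + \sqrt{2}) < |x^2 - 2|/(x + 1)$, so it is enough to test the rational quantity on the right.

```python
from fractions import Fraction


def newton_sqrt2(steps: int) -> list:
    """Iterate x_{k+1} = x_k/2 + 1/x_k from x_0 = 3/2 in exact rational arithmetic."""
    x = Fraction(3, 2)
    xs = [x]
    for _ in range(steps):
        x = x / 2 + 1 / x
        xs.append(x)
    return xs


sqrt2_iterates = newton_sqrt2(4)

# Since sqrt(2) > 1, |x - sqrt(2)| < |x^2 - 2| / (x + 1); test this upper
# estimate against the bound 10^(-2^k) from (5.23).
bounds_hold = all(
    abs(x * x - 2) / (x + 1) <= Fraction(1, 10 ** (2 ** k))
    for k, x in enumerate(sqrt2_iterates)
)

for k, x in enumerate(sqrt2_iterates):
    print(k, x, float(x))
print("bound (5.23) verified for k = 0..4:", bounds_hold)
```

Working with `Fraction` avoids the question of whether a floating point error in the last digit comes from the method or from the arithmetic. The printed decimal values are then only for comparison with (5.24).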


Under certain circumstances, Newton’s method can be even better than quadratically
convergent. This happens when $f''(\xi) = 0$, assuming also that $f$ is $C^3$. In such a case, the
mean value theorem implies

(5.25)    $f''(\xi + \gamma\delta) = f''(\xi + \gamma\delta) - f''(\xi) = \gamma\delta f^{(3)}(\xi + \sigma\gamma\delta),$

for some $\sigma \in (0, 1)$. Hence, given $|x_k - \xi| = \delta_k$, we get from (5.10) that

(5.26)    $|x_{k+1} - \xi| \le \sup_{0 < \gamma < 1} \left|\dfrac{f^{(3)}(\xi + \gamma\delta_k)}{f'(\xi + \delta_k)}\right| \delta_k^3.$

Thus $x_k \to \xi$ cubically.
Here is an application to the production of a sequence that rapidly converges to $\pi$, based
on

(5.27)    $\sin \pi = 0.$

We take $f(x) = \sin x$. Then $f''(x) = -\sin x$, so the considerations above apply. The
iteration (5.4) becomes

(5.28)    $x_{k+1} = x_k - \dfrac{\sin x_k}{\cos x_k}.$

If $x_k = \pi + \delta_k$, note that

(5.29)    $\cos(\pi + \delta_k) = -1 + O(\delta_k^2),$

so the iteration

(5.30)    $x_{k+1} = x_k + \sin x_k$

is also cubically convergent, if $x_0$ is chosen close enough to $\pi$. Now, the first few terms of
the series (4.27)–(4.31) of Chapter 4, applied to

(5.31)    $\dfrac{\pi}{6} = \displaystyle\int_0^{1/2} \dfrac{dx}{\sqrt{1 - x^2}}$


(cf. Chapter 4, §5, Exercise 7, (5.45A)), yield $\pi = 3.14\cdots$. We take

(5.32)    $x_0 = 3,$

and use the iteration (5.30), obtaining

(5.33)
$x_1 = 3.14112000805987$
$x_2 = 3.14159265357220$
$x_3 = 3.14159265358979.$

The error $\pi - x_2$ is $< 2 \cdot 10^{-11}$, and all the printed digits of $x_3$ are accurate. If the
computation were to higher precision, $x_3$ would approximate $\pi$ to quite a few more digits.
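The iteration (5.30) is easy to try in double precision. The following sketch is an added illustration with hypothetical names `pi_iterates` and `pi_errors`. It measures errors against the library constant `math.pi`, so it cannot exhibit accuracy beyond about 16 digits.

```python
import math


def iterate_x_plus_sin(x0: float, steps: int) -> list:
    """Iterate x_{k+1} = x_k + sin(x_k), the simplified Newton step (5.30)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + math.sin(xs[-1]))
    return xs


pi_iterates = iterate_x_plus_sin(3.0, 3)
pi_errors = [abs(math.pi - x) for x in pi_iterates]

for k, (x, err) in enumerate(zip(pi_iterates, pi_errors)):
    print(k, x, err)
```

Each error is roughly the cube of the previous one divided by 6, as the expansion $\sin(\pi - \delta) = \delta - \delta^3/6 + \cdots$ suggests, until double precision is exhausted.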
By contrast, we apply Newton’s method to

(5.34)    $\sin\dfrac{\pi}{6} = \dfrac{1}{2}$

(equivalent to (5.31)). In this case, $f(x) = \sin(x/6)$, and (5.19) becomes

(5.35)    $x_{k+1} = x_k - 6\,\dfrac{\sin(x_k/6) - 1/2}{\cos(x_k/6)}.$

If we take $x_0 = 3$, as in (5.32), the iteration (5.35) yields

(5.36)
$x_1 = 3.14066684291090$
$x_2 = 3.14159261236234$
$x_3 = 3.14159265358979.$

Note that $x_1$ here is almost as accurate an approximation to $\pi$ as is $x_1$ in (5.33), but $x_2$
here is substantially less accurate than $x_2$ in (5.33). Here, $x_3$ has full accuracy, though
as noted above, $x_3$ in (5.33) could be much more accurate if the computation (5.30) were
done to higher precision.
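The contrast between the two iterations can be seen side by side in a few lines. This is a sketch added for illustration; the names `cubic_errors` and `quadratic_errors` are hypothetical, and errors are again measured against `math.pi` in double precision.

```python
import math


def iterate(step, x0: float, steps: int) -> list:
    """Apply the one-step map `step` repeatedly, returning all iterates."""
    xs = [x0]
    for _ in range(steps):
        xs.append(step(xs[-1]))
    return xs


def cubic_step(x: float) -> float:
    # Iteration (5.30), based on sin(pi) = 0.
    return x + math.sin(x)


def quadratic_step(x: float) -> float:
    # Iteration (5.35), Newton's method for sin(x/6) = 1/2.
    return x - 6.0 * (math.sin(x / 6.0) - 0.5) / math.cos(x / 6.0)


cubic_errors = [abs(math.pi - x) for x in iterate(cubic_step, 3.0, 3)]
quadratic_errors = [abs(math.pi - x) for x in iterate(quadratic_step, 3.0, 3)]

for k in range(4):
    print(k, cubic_errors[k], quadratic_errors[k])
```

At $k = 2$ the error of (5.35) is of order $10^{-8}$ while that of (5.30) is of order $10^{-11}$, which matches the comparison of (5.36) with (5.33).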

Exercises

Using a calculator or a computer, implement Newton’s method to get approximate solutions
to the following equations.

1. $x^5 - x^3 + 1 = 0.$

2. $e^x = 2x.$

3. $\tan x = x.$


4. x log x = 2.

5. $x^x = 3.$

6. Apply Newton’s method to $f(x) = 1/x$, obtaining the sequence

(5.37)    $x_{k+1} = 2x_k - a x_k^2$

of approximate solutions to $f(x) = a$. That is, $x_k \to 1/a$, if $x_0$ is close enough. Try this
out with $a = 3$, $x_0 = 0.3$. Note that the right side of (5.37) involves only multiplication
and subtraction.

7. Prove Proposition 5.1 when the hypothesis (5.12) is replaced by

(5.38)    $|f(x_0)| \le B\delta_0, \quad [x_0 - \delta_0, x_0 + \delta_0] \subset (a, b).$


A. Inner product spaces

In §4, we have looked at norms and inner products on spaces of functions, such as $C(S^1)$
and $\mathcal{R}(S^1)$, which are vector spaces. Generally, a complex vector space $V$ is a set on which
there are operations of vector addition:

(A.1)    $f, g \in V \Longrightarrow f + g \in V,$

and multiplication by an element of $\mathbb{C}$ (called scalar multiplication):

(A.2)    $a \in \mathbb{C},\ f \in V \Longrightarrow af \in V,$

satisfying the following properties. For vector addition, we have

(A.3)    $f + g = g + f, \quad (f + g) + h = f + (g + h), \quad f + 0 = f, \quad f + (-f) = 0.$

For multiplication by scalars, we have

(A.4)    $a(bf) = (ab)f, \quad 1 \cdot f = f.$

Furthermore, we have two distributive laws:

(A.5)    $a(f + g) = af + ag, \quad (a + b)f = af + bf.$

These properties are readily verified for the function spaces mentioned above.
An inner product on a complex vector space $V$ assigns to elements $f, g \in V$ the quantity
$(f, g) \in \mathbb{C}$, in a fashion that obeys the following three rules:

(A.6)
$(a_1 f_1 + a_2 f_2, g) = a_1 (f_1, g) + a_2 (f_2, g),$
$(f, g) = \overline{(g, f)},$
$(f, f) > 0$ unless $f = 0.$

A vector space equipped with an inner product is called an inner product space. For
example,

(A.7)    $(f, g) = \dfrac{1}{2\pi} \displaystyle\int_{S^1} f(\theta)\overline{g(\theta)}\, d\theta$

defines an inner product on $C(S^1)$, and also on $\mathcal{R}(S^1)$, where we identify two functions
that differ only on a set of upper content zero. Similarly,

(A.8)    $(f, g) = \displaystyle\int_{-\infty}^{\infty} f(x)\overline{g(x)}\, dx$


defines an inner product on $\mathcal{R}(\mathbb{R})$ (where, again, we identify two functions that differ only
on a set of upper content zero).
As another example, we define $\ell^2$ to consist of sequences $(a_k)_{k \in \mathbb{Z}}$ such that

(A.9)    $\sum_{k=-\infty}^{\infty} |a_k|^2 < \infty.$

An inner product on $\ell^2$ is given by

(A.10)    $\bigl((a_k), (b_k)\bigr) = \sum_{k=-\infty}^{\infty} a_k \overline{b_k}.$

Given an inner product on $V$, one says the object $\|f\|$ defined by

(A.11)    $\|f\| = \sqrt{(f, f)}$

is the norm on $V$ associated with the inner product. Generally, a norm on $V$ is a function
$f \mapsto \|f\|$ satisfying

(A.12)    $\|af\| = |a| \cdot \|f\|, \quad a \in \mathbb{C},\ f \in V,$
(A.13)    $\|f\| > 0$ unless $f = 0,$
(A.14)    $\|f + g\| \le \|f\| + \|g\|.$

The property (A.14) is called the triangle inequality. A vector space equipped with a norm
is called a normed vector space. We can define a distance function on such a space by

(A.15)    $d(f, g) = \|f - g\|.$

Properties (A.12)–(A.14) imply that $d : V \times V \to [0, \infty)$ makes $V$ a metric space.


If $\|f\|$ is given by (A.11), from an inner product satisfying (A.6), it is clear that (A.12)–
(A.13) hold, but (A.14) requires a demonstration. Note that

(A.16)    $\|f + g\|^2 = (f + g, f + g) = \|f\|^2 + (f, g) + (g, f) + \|g\|^2 = \|f\|^2 + 2\,\mathrm{Re}(f, g) + \|g\|^2,$

while

(A.17)    $(\|f\| + \|g\|)^2 = \|f\|^2 + 2\|f\| \cdot \|g\| + \|g\|^2.$

Thus to establish (A.14) it suffices to prove the following, known as Cauchy’s inequality.


Proposition A.1. For any inner product on a vector space $V$, with $\|f\|$ defined by (A.11),

(A.18)    $|(f, g)| \le \|f\| \cdot \|g\|, \quad \forall\, f, g \in V.$

Proof. We start with

(A.19)    $0 \le \|f - g\|^2 = \|f\|^2 - 2\,\mathrm{Re}(f, g) + \|g\|^2,$

which implies

(A.20)    $2\,\mathrm{Re}(f, g) \le \|f\|^2 + \|g\|^2, \quad \forall\, f, g \in V.$

Replacing $f$ by $af$ for arbitrary $a \in \mathbb{C}$ of absolute value 1 yields $2\,\mathrm{Re}\, a(f, g) \le \|f\|^2 + \|g\|^2$
for all such $a$. Choosing $a$ so that $a(f, g) = |(f, g)|$, we get

(A.21)    $2|(f, g)| \le \|f\|^2 + \|g\|^2, \quad \forall\, f, g \in V.$

Replacing $f$ by $tf$ and $g$ by $t^{-1}g$ for arbitrary $t \in (0, \infty)$, we have

(A.22)    $2|(f, g)| \le t^2\|f\|^2 + t^{-2}\|g\|^2, \quad \forall\, f, g \in V,\ t \in (0, \infty).$

If we take $t^2 = \|g\|/\|f\|$, the right side of (A.22) becomes $2\|f\| \cdot \|g\|$, and we obtain the
desired inequality (A.18). This assumes $f$ and $g$ are both nonzero, but (A.18) is trivial if $f$ or $g$ is 0.
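As a numerical sanity check of (A.6), (A.14), and (A.18), one can work with finitely supported sequences, which are elements of $\ell^2$ with inner product (A.10). The sketch below is an added illustration; the helper names `inner` and `norm` and the sample vectors `a`, `b` are assumptions, and a floating point check of this kind is of course not a proof.

```python
import math


def inner(a, b) -> complex:
    """Inner product (A.10) for finitely supported sequences of equal length."""
    return sum(x * complex(y).conjugate() for x, y in zip(a, b))


def norm(a) -> float:
    """Norm (A.11); (a, a) is real and nonnegative, so take its real part."""
    return math.sqrt(complex(inner(a, a)).real)


# Arbitrary sample sequences (hypothetical data chosen for illustration).
a = [1 + 2j, -3j, 0.5]
b = [2 - 1j, 1 + 1j, -4]
a_plus_b = [x + y for x, y in zip(a, b)]

cauchy_lhs = abs(inner(a, b))
cauchy_rhs = norm(a) * norm(b)

print("conjugate symmetry defect:", abs(inner(a, b) - inner(b, a).conjugate()))
print("Cauchy:", cauchy_lhs, "<=", cauchy_rhs)
print("triangle:", norm(a_plus_b), "<=", norm(a) + norm(b))
```

For these sample vectors one finds $(a, b) = -5 + 2i$, so $|(a, b)| = \sqrt{29}$, comfortably below $\|a\| \cdot \|b\| = \sqrt{14.25}\,\sqrt{23}$.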
An inner product space $V$ is called a Hilbert space if it is a complete metric space, i.e.,
if every Cauchy sequence $(f_\nu)$ in $V$ has a limit in $V$. The space $\ell^2$ has this completeness
property, but $C(S^1)$, with inner product (A.7), does not, nor does $\mathcal{R}(S^1)$. Chapter 2
describes a process of constructing the completion of a metric space. When applied to an
incomplete inner product space, it produces a Hilbert space. When this process is applied
to $C(S^1)$, the completion is the space $L^2(S^1)$. An alternative construction of $L^2(S^1)$ uses
the Lebesgue integral.


References

[AH] J. Arndt and C. Haenel, π Unleashed, Springer-Verlag, New York, 2001.
[BS] R. Bartle and D. Sherbert, Introduction to Real Analysis, J. Wiley, New York,
1992.
[Be] P. Beckmann, A History of π, St. Martin’s Press, New York, 1971.
[C] P. Cohen, Set Theory and the Continuum Hypothesis, Dover, New York, 2008.
[Dev] K. Devlin, The Joy of Sets: Fundamentals of Contemporary Set Theory,
Springer-Verlag, New York, 1993.
[Fol] G. Folland, Real Analysis: Modern Techniques and Applications, Wiley-Interscience,
New York, 1984.
[Niv] I. Niven, A simple proof that π is irrational, Bull. AMS 53 (1947), 509.
[T1] M. Taylor, Measure Theory and Integration, American Mathematical Society,
Providence RI, 2006.
[T2] M. Taylor, Introduction to Analysis in Several Variables (Advanced Calculus),
Lecture notes, available at
http://www.unc.edu/math/Faculty/met/math521.html
[T3] M. Taylor, Introduction to Complex Analysis, Lecture notes, available at
http://www.unc.edu/math/Faculty/met/complex.html
[T4] M. Taylor, Partial Differential Equations, Vols. 1–3, Springer-Verlag, New York,
1996 (2nd ed., 2011).
[T5] M. Taylor, Introduction to Differential Equations, American Mathematical Society,
Providence RI, 2011.
[T6] M. Taylor, Elementary Differential Geometry, Lecture notes, available at
http://www.unc.edu/math/Faculty/met/diffg.html


Index

Abel’s power series theorem 196


absolute value 28, 59
absolutely convergent series 40, 61, 94
accumulation point 74, 79
alternating series 40
arc length 147
Archimedean property 26, 36
arctangent 188
Ascoli’s theorem 104
associative law 9, 16

Baire category theorem 84


ball 74
Banach space 104
Bernoulli numbers 194
bisection method 58
Bolzano-Weierstrass theorem 38

calculus 107
cancellation law 10
Cantor set 57, 134
Card 50
cardinal number 49
cardinality 50
Cauchy inequality 68
Cauchy remainder formula 141
Cauchy sequence 29, 37, 69, 73
change of variable 133
circle 149
cis 61
closed set 54, 62, 74
closure 74
commutative law 9, 16
compact set 54, 62, 70, 79, 111
completeness property 38, 60, 69
completion 73
complex conjugate 59
complex number 59
composite number 21


connected 74, 89
cont+ 123
cont− 123
continued fraction 32
continuous 55, 87, 109, 120
continuum hypothesis 58
convergent sequence 28, 37, 60
convex 116
convolution 202
cos 61, 150, 160
cosh 160
countable 51
countably infinite 51
cover 56
cubic convergence 232
curve 147

Darboux theorem 121, 126


dense 74
derivative 109
diagonal construction 82
differentiable function 109
differential equation 155
Dirichlet convergence test 199
Dirichlet kernel 224
distance 73
disk 96
dot product 67

e 45, 155
elliptic function 152
elliptic integral 152
equicontinuity 104
equivalence class 15
equivalence relation 15
Euclidean space 67
Euler identity 62, 160
exponential function 61, 155

Fourier inversion 200


Fourier series 200, 213
Fundamental theorem of algebra 178
Fundamental theorem of arithmetic 20
Fundamental theorem of calculus 124, 125


function 86

Generalized mean value theorem 117


geometric series 96

Heine-Borel theorem 56
Hölder continuous 226

improper integral 170


induction principle 8
infinite decimal expansion 41
infinite series 39, 60, 93
inner product 235
inner product space 235
integer 15
integral 119
integral remainder formula 141
integral test 134
integration by parts 132
Intermediate value theorem 55, 75, 89
interval 55, 74
Inverse function theorem 112, 149
irrational number 45, 179

Lagrange remainder formula 141


limit 28
log 155
lower content 123

max 55, 63, 72, 203


maximum 87, 101
maxsize 118
Mean value theorem 111, 125, 126, 229
metric space 73
min 55, 63, 72, 91, 207
minimum 87, 110
modulus of continuity 88
monotone function 129
monotone sequence 30, 38
multiplying power series 98

natural logarithm 157


neighborhood 74
Newton’s method 229


norm 67

open set 54, 62, 74


order relation 11, 17, 25
outer measure 123

parametrization by arc length 149


partition 119
path-connected 87
Peano axioms 8
perfect set 57, 75
π 150, 161, 162, 180, 232
piecewise constant 129
piecewise regular 224
polar coordinates 61, 152
polynomial 178, 202
power series 96, 136, 155
prime number 20
principle of induction 8
product rule 109, 156
Pythagorean theorem 59, 68

quadratic convergence 230

radius of convergence 96
ratio test 33
rational number 23
real number 34
refinement 119
remainder in a power series 140
reparametrization 147
Riemann integrable 120
Riemann integral 119
Riemann-Lebesgue lemma 222
Riemann sum 122, 147

Schroeder-Bernstein theorem 50
second derivative 114
second derivative test 114
semicontinuous 90
sec 161
sequence 28
sin 61, 150, 160
sinh 162


speed 147
Stone-Weierstrass theorem 211
subgroup 21
summation by parts 195
sup 39, 88
supremum property 39

tan 159
triangle inequality 28, 59, 68, 73, 236
trigonometric function 150, 160
trigonometric polynomial 213
Tychonov theorem 79

unbounded integrable function 170


uncountable 52
uniform convergence 92
uniformly continuous 88
uniformly equicontinuous 105
upper bound 39
upper content 123

vector space 234


velocity 147

Weierstrass approximation theorem 207


Weierstrass M test 94, 97
well ordering property 12
