Introduction to Analysis in One Variable

Michael E. Taylor

AMS Open Math Notes: Works in Progress; Reference # OMN:201701.110664; 2017-01-10 15:54:33

Contents

Chapter I. Numbers

1. Peano arithmetic

2. The integers

3. Prime factorization and the fundamental theorem of arithmetic

4. The rational numbers

5. Sequences

6. The real numbers

7. Irrational numbers

8. Cardinal numbers

9. Metric properties of R

10. Complex numbers

1. Euclidean spaces

2. Metric spaces

3. Compactness

A. The Baire category theorem

1. Continuous functions

2. Sequences and series of functions

3. Power series

4. Spaces of functions

1. The derivative

2. The integral

3. Power series

4. Curves and arc length

5. Exponential and trigonometric functions

6. Unbounded integrable functions

A. The fundamental theorem of algebra

B. π^2 is irrational

C. More on (1 − x)^b

D. Archimedes’ approximation of π

E. Computing π using arctangents

F. Power series of tan x

G. Abel’s power series theorem

1. Convolutions and bump functions

2. The Weierstrass approximation theorem

3. The Stone-Weierstrass theorem

4. Fourier series

5. Newton’s method

A. Inner product spaces

Introduction

This is a text for students who have had a three course calculus sequence, and who

are ready for a course that explores the logical structure of this area of mathematics,

which forms the backbone of analysis. This is intended for a one semester course. An

accompanying text, Introduction to Analysis in Several Variables [T2], can be used in the

second semester of a one year sequence.

The main goal of Chapter 1 is to develop the real number system. We start with a

treatment of the “natural numbers” N, obtaining its structure from a short list of axioms,

the primary one being the principle of induction. Then we construct the set Z of all integers,

which has a richer algebraic structure, and proceed to construct the set Q of rational

numbers, which are quotients of integers (with a nonzero denominator). After discussing

infinite sequences of rational numbers, including the notions of convergent sequences and

Cauchy sequences, we construct the set R of real numbers, as ideal limits of Cauchy

sequences of rational numbers. At the heart of this chapter is the proof that R is complete,

i.e., Cauchy sequences of real numbers always converge to a limit in R. This provides the

key to studying other metric properties of R, such as the compactness of (nonempty) closed,

bounded subsets. We end Chapter 1 with a section on the set C of complex numbers. Many

introductions to analysis shy away from the use of complex numbers. My feeling is that

this forecloses the study of way too many beautiful results that can be appreciated at

this level. This is not a course in complex analysis. That is for another course, and with

another text (such as [T3]). However, I hope that various topics covered in this text make

it clear that the use of complex numbers in analysis actually simplifies the treatment of a

number of key concepts, while extending their scope in very useful ways.

In fact, the structure of analysis is revealed more clearly by moving beyond R and C,

and we undertake this in Chapter 2. We start with a treatment of n-dimensional Euclidean

space, R^n. There is a notion of Euclidean distance between two points in R^n, leading to notions of convergence and of Cauchy sequences. The spaces R^n are all complete, and

again closed bounded sets are compact. Going through this sets one up to appreciate a

further generalization, the notion of a metric space, introduced in §2. This is followed by

§3, exploring the notion of compactness in a metric space setting.

Chapter 3 deals with functions. It starts in a general setting, of functions from one

metric space to another. We then treat infinite sequences of functions, and study the

notion of convergence, particularly of uniform convergence of a sequence of functions. We

move on to infinite series. In such a case, we take the target space to be R^n, so we can add functions. Section 3 treats power series. Here, we study series of the form

(1) \sum_{k=0}^{\infty} a_k (z - z_0)^k,

with a_k ∈ C and z running over a disk in C. For results obtained in this section, regarding the radius of convergence R and the continuity of the sum on D_R(z_0) = {z ∈ C : |z − z_0| < R}, there is no extra difficulty in allowing a_k and z to be complex, rather than insisting they be real, and the extra level of generality will pay big dividends in Chapter 4. A final section in Chapter 3 is devoted to spaces of functions, illustrating the utility of studying spaces beyond the case of R^n.

Chapter 4 gets to the heart of the matter, a rigorous development of differential and

integral calculus. We define the derivative in §1, and prove the Mean Value Theorem,

making essential use of compactness of a closed, bounded interval and its consequences,

established in earlier chapters. This result has many important consequences, such as the

Inverse Function Theorem, and especially the Fundamental Theorem of Calculus, established in §2, after the Riemann integral is introduced. In §3, we return to power series, this time of the form

(2) \sum_{k=0}^{\infty} a_k (t - t_0)^k.

Results on the radius of convergence R and on the continuity of the sum f(t) on (t_0 − R, t_0 + R) follow from material in Chapter 3. The essential new result in §3 of Chapter 4 is that one can obtain the derivative f′(t) by differentiating the power series for f(t) term by term. In §4 we consider curves in R^n, and obtain a formula for arc length for a smooth curve. We show that a smooth curve with nonvanishing velocity can be parametrized by arc length. When this is applied to the unit circle in R^2 centered at the origin, one is looking at the standard definition of the trigonometric functions,

(3) C(t) = (\cos t, \sin t).

We provide a demonstration that

(4) C'(t) = (-\sin t, \cos t)

that is much shorter than what is usually presented in calculus texts. In §5 we move on to exponential functions. We derive the power series for the function e^t, introduced to solve the differential equation dx/dt = x. We then observe that with no extra work we get an analogous power series for e^{at}, with derivative ae^{at}, and that this works for complex a as well as for real a. It is a short step to realize that e^{it} is a unit speed curve tracing out the unit circle in C ≈ R^2, so comparison with (3) gives Euler’s formula

(5) e^{it} = \cos t + i \sin t.

That the derivative of e^{it} is ie^{it} provides a second proof of (4). Thus we have a unified

treatment of the exponential and trigonometric functions, carried out further in §5, with

details developed in numerous exercises. Section 6 extends the scope of the Riemann

integral to a class of unbounded functions. Chapter 4 has several appendices, one proving

the fundamental theorem of algebra, one showing that π is irrational, one exploring in

more detail than in §3 the power series for (1 − x)^b, and one describing an approximation

to π pioneered by Archimedes.

Chapter 5 treats further topics in analysis. If time permits, the instructor might cover

one or more of these at the end of the course. The topics center around approximating

functions, via various infinite sequences or series. Topics include approximating continuous

functions by polynomials, Fourier series, and Newton’s method for approximating the

inverse of a given function.

Chapter I

Numbers

Introduction

One foundation for a course in analysis is a solid understanding of the real number system. Texts vary on just how to achieve this. Some take an axiomatic approach. In such an

approach, the set of real numbers is hypothesized to have a number of properties, including

various algebraic properties satisfied by addition and multiplication, order axioms, and,

crucially, the completeness property, sometimes expressed as the supremum property.

This is not the approach we will take. Rather, we will start with a small list of axioms for the natural numbers (i.e., the positive integers), and then build the rest of the

edifice logically, obtaining the basic properties of the real number system, particularly the

completeness property, as theorems.

Sections 1–3 deal with the integers, starting in §1 with the set N of natural numbers.

The development proceeds from axioms of G. Peano. The main one is the principle of

mathematical induction. We deduce basic results about integer arithmetic from these

axioms. A high point is the fundamental theorem of arithmetic, presented in §3.

Section 4 discusses the set Q of rational numbers, deriving the basic algebraic properties

of these numbers from the results of §§1–3. Section 5 provides a bridge between §4 and §6.

It deals with infinite sequences, including convergent sequences and “Cauchy sequences.”

This prepares the way for §6, the main section of this chapter. Here we construct the set

R of real numbers, as “ideal limits” of rational numbers. We extend basic algebraic results

from Q to R. Furthermore, we establish the result that R is “complete,” i.e., Cauchy sequences always have limits in R. Section 7 provides examples of irrational numbers, such as √2, √3, √5, ...

Section 8 deals with cardinal numbers, an extension of the natural numbers N, that can

be used to “count” elements of a set, not necessarily finite. For example, N is a “countably”

infinite set, and so is Q. We show that R is “uncountable,” and hence much larger than N

or Q.

Section 9 returns to the real number line R, and establishes further metric properties of

R and various subsets, with an emphasis on the notion of compactness. The completeness

property established in §6 plays a crucial role here.

Section 10 introduces the set C of complex numbers and establishes basic algebraic and

metric properties of C. While some introductory treatments of analysis avoid complex

numbers, we embrace them, and consider their use in basic analysis too precious to omit.

Sections 9 and 10 also have material on continuous functions, defined on a subset of R

or C, respectively. These results give a taste of further results to be developed in Chapter

3, which will be essential to material in Chapters 4 and 5.

1. Peano arithmetic

We assume we are given a set N (the natural numbers) and an element 0 ∉ N, and form Ñ = N ∪ {0}. We assume there is a map

(1.1) s : Ñ → N,

which is bijective. That is to say, for each k ∈ N, there is a j ∈ Ñ such that s(j) = k, so s is surjective; and furthermore, if s(j) = s(j′) then j = j′, so s is injective. The map s plays the role of “addition by 1,” as we will see below. The only other axiom of Peano arithmetic is that the principle of mathematical induction holds. In other words, if S ⊂ Ñ is a set with the properties

(1.2) 0 ∈ S, k ∈ S ⇒ s(k) ∈ S,

then S = Ñ.

Actually, applying the induction principle to S = {0} ∪ s(Ñ), we see that it suffices to assume that s in (1.1) is injective; the induction principle ensures that it is surjective.

We define addition x + y, for x, y ∈ Ñ, inductively on y, by

(1.3) x + 0 = x, x + s(y) = s(x + y).

Next, we define multiplication x · y, inductively on y, by

(1.4) x · 0 = 0, x · s(y) = x · y + x.

We also define

(1.5) 1 = s(0).

Proposition 1.1. x + 1 = s(x).

Proof. x + s(0) = s(x + 0).
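The inductive definitions of addition and multiplication can be tried out in a short Python sketch. It is only an illustration under stated assumptions: elements of Ñ are modeled by Python's non-negative integers, and the names `s`, `pred`, `add`, `mul`, and `ONE` are hypothetical helpers, not notation from the text. The operations are built from 0 and the successor map alone.

```python
# Illustrative model (an assumption, not part of the text): the element n of
# N~ = N ∪ {0} is represented by the Python int n >= 0.

def s(x: int) -> int:
    """Successor map s : N~ -> N."""
    return x + 1

def pred(y: int) -> int:
    """Inverse of s on N (hypothetical helper); 0 is not in the image of s."""
    if y == 0:
        raise ValueError("0 has no predecessor")
    return y - 1

def add(x: int, y: int) -> int:
    """Inductive rule: x + 0 = x,  x + s(y) = s(x + y)."""
    if y == 0:
        return x
    return s(add(x, pred(y)))

def mul(x: int, y: int) -> int:
    """Inductive rule (1.4): x * 0 = 0,  x * s(y) = x * y + x."""
    if y == 0:
        return 0
    return add(mul(x, pred(y)), x)

ONE = s(0)  # definition (1.5)
```

With these definitions, `add(x, ONE)` and `s(x)` agree for every `x` one tries, which is the content of Proposition 1.1. A finite check of this kind illustrates the statements; it does not replace the induction proofs.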

Proposition 1.2. 0 + x = x.

Proof. Use induction on x. First, 0 + 0 = 0. Now, assuming 0 + x = x, we have

0 + s(x) = s(0 + x) = s(x).

Proposition 1.3. s(y + x) = s(y) + x.

Proof. Use induction on x. First, s(y + 0) = s(y) = s(y) + 0. Next, we have

s(y + s(x)) = s(s(y + x)),
s(y) + s(x) = s(s(y) + x).

If s(y + x) = s(y) + x, the two right sides are equal, so the two left sides are equal, completing the induction.

Proposition 1.4. x + y = y + x.

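Proof. Use induction on y. The case y = 0 follows from Proposition 1.2. Now, assuming x + y = y + x, for all x ∈ Ñ, we must show s(y) has the same property. In fact,

x + s(y) = s(x + y) = s(y + x),

and by Proposition 1.3 the last quantity is equal to s(y) + x.

Proposition 1.5. (x + y) + z = x + (y + z).

Proof. Use induction on z. First, (x + y) + 0 = x + y = x + (y + 0). Now, assuming (x + y) + z = x + (y + z), for all x, y ∈ Ñ, we must show s(z) has the same property. In fact,

(x + y) + s(z) = s((x + y) + z),
x + (y + s(z)) = x + s(y + z) = s(x + (y + z)),

and the two right sides are equal by the induction hypothesis.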

Remark. Propositions 1.4 and 1.5 state the commutative and associative laws for addition.

Proposition 1.6. x · 1 = x.

Proof. We have

x · s(0) = x · 0 + x = 0 + x = x,

the last identity by Proposition 1.2.

Proposition 1.7. 0 · y = 0.

Proof. Use induction on y. First, 0 · 0 = 0. Next, assuming 0 · y = 0, we have 0 · s(y) = 0 · y + 0 = 0 + 0 = 0.

Proposition 1.8. s(x) · y = x · y + y.

Proof. Use induction on y. First, s(x) · 0 = 0, while x · 0 + 0 = 0 + 0 = 0. Next, assuming

s(x) · y = x · y + y, for all x, we must show that s(y) has this property. In fact,

s(x) · s(y) = s(x) · y + s(x) = (x · y + y) + (x + 1),

x · s(y) + s(y) = (x · y + x) + (y + 1),

and the identity then follows via the commutative and associative laws of addition, Propositions 1.4 and 1.5.

Proposition 1.9. x · y = y · x.

Proof. Use induction on y. First, x · 0 = 0 = 0 · x, the latter identity by Proposition 1.7.

Next, assuming x · y = y · x for all x ∈ Ñ, we must show that s(y) has the same property.

In fact,

x · s(y) = x · y + x = y · x + x,

s(y) · x = y · x + x,

the last identity by Proposition 1.8.

Proposition 1.10. (x + y) · z = x · z + y · z.

Proof. Use induction on z. First, the identity clearly holds for z = 0. Next, assuming it holds for z (for all x, y ∈ Ñ), we must show it holds for s(z). In fact,

(x + y) · s(z) = (x + y) · z + (x + y) = (x · z + y · z) + (x + y),

x · s(z) + y · s(z) = (x · z + x) + (y · z + y),

and the desired identity follows from the commutative and associative laws of addition.

Proposition 1.11. (x · y) · z = x · (y · z).

Proof. Use induction on z. First, the identity clearly holds for z = 0. Next, assuming it holds for z (for all x, y ∈ Ñ), we have

(x · y) · s(z) = (x · y) · z + x · y,

while

x · (y · s(z)) = x · (y · z + y) = x · (y · z) + x · y,

the last identity by Proposition 1.10 (and 1.9). These observations yield the desired identity.

Remark. Propositions 1.9 and 1.11 state the commutative and associative laws for multiplication. Proposition 1.10 is the distributive law. Combined with Proposition 1.9, it also yields

z · (x + y) = z · x + z · y,

used above.

Proposition 1.12. Given x, y, z ∈ Ñ,

(1.6) x + y = z + y =⇒ x = z.

Proof. Use induction on y. If y = 0, (1.6) obviously holds. Assuming (1.6) holds for y, we

must show that

(1.7) x + s(y) = z + s(y)

implies x = z. In fact, (1.7) is equivalent to s(x + y) = s(z + y). Since the map s is assumed

to be one-to-one, this implies that x + y = z + y, so we are done.

We next define an order relation on Ñ. Given x, y ∈ Ñ, we say

(1.8) x < y ⟺ y = x + u, for some u ∈ N.

Similarly there is a definition of x ≤ y. We have x ≤ y if and only if y ∈ R_x, where

(1.9) R_x = {x + u : u ∈ Ñ}.

Other notation is

y > x ⇐⇒ x < y, y ≥ x ⇐⇒ x ≤ y.

Proposition 1.13. If x ≤ y and y ≤ x then x = y.

Proof. The hypotheses imply

(1.10) y = x + u, x = y + v, u, v ∈ Ñ.

Hence x = x + u + v, so, by Proposition 1.12, u + v = 0. Now, if v ≠ 0, then v = s(w), so u + v = s(u + w) ∈ N. Thus v = 0, and u = 0.

Proposition 1.14. Given x, y ∈ Ñ, either

(1.11) x < y, or x = y, or y < x,

and no two can hold.

Proof. That no two of (1.11) can hold follows from Proposition 1.13. It remains to show that one must hold. Take y ∈ Ñ. We will establish (1.11) by induction on x. Clearly (1.11) holds for x = 0. We need to show that if (1.11) holds for a given x ∈ Ñ, then either

(1.12) s(x) < y, or s(x) = y, or y < s(x).

Consider the three possibilities in (1.11). If either y = x or y < x, then clearly y < s(x) =

x + 1. On the other hand, if x < y, we can use the implication

(1.12A) x < y =⇒ s(x) ≤ y

to complete the proof of (1.12). See Lemma 1.17 for a proof of (1.12A).

We can now establish the cancellation law for multiplication.

Proposition 1.15. Given x, y, z ∈ Ñ,

(1.13) x · y = x · z, x ≠ 0 ⟹ y = z.

Proof. If y ≠ z, then by Proposition 1.14 either y < z or z < y. Suppose y < z, so z = y + u with u ∈ N. Then the hypotheses of (1.13) imply

x · y = x · y + x · u, x ≠ 0,

hence, by Proposition 1.12,

(1.14) x · u = 0, x ≠ 0.

We thus need to show that (1.14) implies u = 0. In fact, if not, then we can write u = s(w), and x = s(a), with w, a ∈ Ñ, and we have

(1.15) x · u = x · w + s(a) = s(x · w + a) ∈ N.

This contradicts (1.14), so we are done.

Remark. Note that (1.15) implies

(1.16) x, y ∈ N ⟹ x · y ∈ N.

We next establish the following variant of the principle of induction, called the well-ordering property of Ñ.

Proposition 1.16. If T ⊂ Ñ is nonempty, then T contains a smallest element.

Proof. Suppose T contains no smallest element. Then 0 ∉ T. Let

(1.17) S = {x ∈ Ñ : x < y, ∀ y ∈ T}.

Then 0 ∈ S. We claim that

(1.18) x ∈ S =⇒ s(x) ∈ S.

Indeed, suppose x ∈ S, so x < y for all y ∈ T. If s(x) ∉ S, we have s(x) ≥ y_0 for some y_0 ∈ T. On the other hand (see Lemma 1.17 below),

(1.19) x < y_0 ⟹ s(x) ≤ y_0.

Thus, by Proposition 1.13,

(1.20) s(x) = y_0.

It follows that y_0 must be the smallest element of T. Thus, if T has no smallest element, (1.18) must hold. The induction principle then implies that S = Ñ, which implies T is empty.

Here is the result behind (1.12A) and (1.19).

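Lemma 1.17. Given x, y ∈ Ñ,

(1.21) x < y ⟹ s(x) ≤ y.

Proof. If x < y, then y = x + u with u ∈ N, so u = s(v) for some v ∈ Ñ, and

y = x + s(v) = s(x + v) = s(x) + v,

hence s(x) ≤ y.

Remark. Proposition 1.16 says

(1.22) T ⊂ Ñ nonempty ⟹ T contains a smallest element.

Conversely, (1.22) implies the induction principle:

(1.23) (0 ∈ S ⊂ Ñ, k ∈ S ⇒ s(k) ∈ S) ⟹ S = Ñ.

To see this, suppose S satisfies the hypotheses of (1.23), and let T = Ñ \ S. If S ≠ Ñ, then T is nonempty, so (1.22) implies T has a smallest element, say x_1. Since 0 ∈ S, x_1 ∈ N, so x_1 = s(x_0), and we must have

(1.24) x_0 ∈ S, s(x_0) ∈ T = Ñ \ S,

contradicting the hypotheses of (1.23).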

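Exercises

Given n ∈ N, we define \sum_{k=1}^{n} a_k inductively, as follows.

(1.25) \sum_{k=1}^{1} a_k = a_1, \qquad \sum_{k=1}^{n+1} a_k = \Bigl( \sum_{k=1}^{n} a_k \Bigr) + a_{n+1}.

Use the principle of induction to establish the following identities.

(1) 2 \sum_{k=1}^{n} k = n(n + 1).

(2) 6 \sum_{k=1}^{n} k^2 = n(n + 1)(2n + 1).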

(3) (a − 1) \sum_{k=1}^{n} a^k = a^{n+1} − a, if a ≠ 1.

In (3), we define a^n inductively by

(1.26) a^1 = a, a^{n+1} = a^n · a.

We also set a^0 = 1 if a ∈ N, and \sum_{k=0}^{n} a_k = a_0 + \sum_{k=1}^{n} a_k. Verify that

(4) (a − 1) \sum_{k=0}^{n} a^k = a^{n+1} − 1, if a ≠ 1.

5. Given k ∈ N, show that

2^k ≥ 2k,

with strict inequality for k > 1.

6. Show that, for x, x′, y, y′ ∈ Ñ,

x < x′, y ≤ y′ ⟹ x + y < x′ + y′,

and

x · y < x′ · y′, if also y′ > 0.

7. Show that the following variant of the principle of induction holds:

(1 ∈ S ⊂ N, k ∈ S ⇒ s(k) ∈ S) ⟹ S = N.

Hint. Consider {0} ∪ S ⊂ Ñ.

More generally, with R_x as in (1.9), show that, for x ∈ N,

(x ∈ S ⊂ R_x, k ∈ S ⇒ s(k) ∈ S) ⟹ S = R_x.

8. With a^n defined inductively as in (1.26) for a ∈ N,

2. The integers

An integer is thought of as having the form x − a, with x, a ∈ Ñ. To be more formal, we will define an element of Z as an equivalence class of ordered pairs (x, a), x, a ∈ Ñ, where we define

(2.1) (x, a) ∼ (y, b) ⟺ x + b = y + a.

Recall that an equivalence relation on a set S is a specification s ∼ t for certain s, t ∈ S, which satisfies the following three conditions.

(a) Reflexive. s ∼ s, ∀ s ∈ S.

(b) Symmetric. s ∼ t ⇐⇒ t ∼ s.

(c) Transitive. s ∼ t, t ∼ u =⇒ s ∼ u.

We will encounter various equivalence relations in this and subsequent sections. Generally,

(a) and (b) are quite easy to verify, and we will be content with verifying (c).

Proposition 2.1. The relation (2.1) is an equivalence relation.

Proof. We need to check that

(2.2) (x, a) ∼ (y, b), (y, b) ∼ (z, c) ⟹ (x, a) ∼ (z, c),

i.e., that, for x, y, z, a, b, c ∈ Ñ,

(2.3) x + b = y + a, y + c = z + b ⟹ x + c = z + a.

In fact, the hypotheses of (2.3) imply

(x + c) + (y + b) = (z + a) + (y + b),

and the conclusion of (2.3) then follows from the cancellation property, Proposition 1.12.

Let us denote the equivalence class containing (x, a) by [(x, a)]. We then define addition and multiplication in Z to satisfy

(2.4) [(x, a)] + [(y, b)] = [(x, a) + (y, b)], [(x, a)] · [(y, b)] = [(x, a) · (y, b)],
      (x, a) + (y, b) = (x + y, a + b), (x, a) · (y, b) = (xy + ab, ay + xb).

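To see that these operations are well defined, we need:

Proposition 2.2. If (x, a) ∼ (x′, a′) and (y, b) ∼ (y′, b′), then

(2.5) (x, a) + (y, b) ∼ (x′, a′) + (y′, b′),

and

(2.6) (x, a) · (y, b) ∼ (x′, a′) · (y′, b′).

Proof. The hypotheses say

(2.7) x + a′ = x′ + a, y + b′ = y′ + b.

The conclusions follow from results of §1. In more detail, adding the two identities in (2.7) gives

x + a′ + y + b′ = x′ + a + y′ + b,

and rearranging, using the commutative and associative laws of addition, yields

(x + y) + (a′ + b′) = (x′ + y′) + (a + b),

implying (2.5). The task of proving (2.6) is simplified by going through the intermediate step

(2.8) (x, a) · (y, b) ∼ (x′, a′) · (y, b).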
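The construction of Z from pairs can be explored with a small Python sketch. This is an illustration under stated assumptions, not part of the text: a pair `(x, a)` of non-negative Python ints stands for the formal difference x − a, and the names `equiv`, `add`, and `mul` are hypothetical. The operations follow the pair formulas in (2.4), and the relation used is x + b = y + a, as in (2.3).

```python
from typing import Tuple

Pair = Tuple[int, int]  # (x, a) models the formal difference x - a, with x, a >= 0

def equiv(p: Pair, q: Pair) -> bool:
    """(x, a) ~ (y, b) exactly when x + b == y + a."""
    (x, a), (y, b) = p, q
    return x + b == y + a

def add(p: Pair, q: Pair) -> Pair:
    """(x, a) + (y, b) = (x + y, a + b), as in (2.4)."""
    (x, a), (y, b) = p, q
    return (x + y, a + b)

def mul(p: Pair, q: Pair) -> Pair:
    """(x, a) * (y, b) = (xy + ab, ay + xb), as in (2.4)."""
    (x, a), (y, b) = p, q
    return (x * y + a * b, a * y + x * b)

# Two representatives of -3 and two representatives of 6.
p, p2 = (2, 5), (0, 3)
q, q2 = (7, 1), (10, 4)

# Replacing a pair by an equivalent one gives equivalent sums and products.
sums_agree = equiv(add(p, q), add(p2, q2))
products_agree = equiv(mul(p, q), mul(p2, q2))
```

Only natural-number addition and multiplication appear inside the functions; no subtraction is needed, which is the point of working with pairs. The spot check on a few representatives illustrates well-definedness but is not a proof of it.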

Similarly, it is routine to verify the basic commutative, associative, etc. laws incorporated in the next proposition. To formulate the results, set

(2.9) m = [(x, a)], n = [(y, b)], k = [(z, c)] ∈ Z.

Also, define

(2.10) 0 = [(0, 0)], 1 = [(1, 0)],

and

(2.11) −m = [(a, x)].

Proposition 2.3. We have

(2.12)
m + n = n + m,
(m + n) + k = m + (n + k),
m + 0 = m,
m + (−m) = 0,
mn = nm,
m(nk) = (mn)k,
m · 1 = m,
m · 0 = 0,
m · (−1) = −m,
m · (n + k) = m · n + m · k.

Proof. To establish the commutative law for multiplication, we must show that (x, a) · (y, b) ∼ (y, b) · (x, a), which is equivalent to

(xy + ab, ay + xb) ∼ (yx + ba, bx + ya).

In fact, commutative laws for addition and multiplication in Ñ imply xy + ab = yx + ba and ay + xb = bx + ya. Verification of the other identities in (2.12) is left to the reader.

We next establish the cancellation law for addition in Z.

Proposition 2.4. Given m, n, k ∈ Z,

(2.13) m + n = k + n =⇒ m = k.

Proof. We give two proofs. For one, we can add −n to both sides and use the results of Proposition 2.3. Alternatively, we can write the hypotheses of (2.13) as

x + y + c + b = z + y + a + b

and use Proposition 1.12 to deduce that x + c = z + a.

Note that it is reasonable to set

(2.14) m − n = m + (−n).

There is a natural injection

(2.15) N ↪ Z, x ↦ [(x, 0)],

whose image we identify with N. Note that the map (2.15) preserves addition and multiplication. There is also an injection x ↦ [(0, x)], whose image we identify with −N.

Proposition 2.5. We have a disjoint union:

(2.16) Z = N ∪ {0} ∪ (−N).

Proof. Suppose m ∈ Z; write m = [(x, a)]. By Proposition 1.14, either

a < x, or x = a, or x < a.

In these three cases,

x = a + u, u ∈ N, or x = a, or a = x + v, v ∈ N.

Then, either

(x, a) ∼ (u, 0), or (x, a) ∼ (0, 0), or (x, a) ∼ (0, v).

We define an order on Z by

(2.17) m < n ⟺ n − m ∈ N.

We then have:

Corollary 2.6. Given m, n ∈ Z, then either

(2.18) m < n, or m = n, or n < m,

and no two can hold.

The map (2.15) is seen to preserve order relations.

Another consequence of (2.16) is the following.

Proposition 2.7. If m, n ∈ Z and m · n = 0, then either m = 0 or n = 0.

Proof. Suppose m ≠ 0 and n ≠ 0. We have four cases:

m > 0, n > 0 ⟹ mn > 0,
m < 0, n < 0 ⟹ mn = (−m)(−n) > 0,
m > 0, n < 0 ⟹ mn = −m(−n) < 0,
m < 0, n > 0 ⟹ mn = −(−m)n < 0,

the first by (1.16), and the rest with the help of Exercise 3 below. This finishes the proof.

Using Proposition 2.7, we have the following cancellation law for multiplication in Z.

Proposition 2.8. Given m, n, k ∈ Z,

(2.19) mk = nk, k ≠ 0 ⟹ m = n.

Proof. First, mk = nk ⇒ mk − nk = 0. Now

mk − nk = (m − n)k.

See Exercise 3 below. Hence

mk = nk ⟹ (m − n)k = 0.

Given k ≠ 0, Proposition 2.7 implies m − n = 0. Hence m = n.

Exercises

2. We define \sum_{k=1}^{n} a_k as in (1.25), this time with a_k ∈ Z. We also define a^k inductively as in Exercise (3) of §1, with a^0 = 1 if a ≠ 0. Use the principle of induction to establish the identity

\sum_{k=1}^{n} (-1)^{k-1} k = \begin{cases} -m & \text{if } n = 2m, \\ m + 1 & \text{if } n = 2m + 1. \end{cases}

3. Show that, if m, n, k ∈ Z,

−(nk) = (−n)k, and mk − nk = (m − n)k.

Hint. For the first part, use Proposition 2.3 to show that nk + (−n)k = 0. Alternatively, compare (a, x) · (y, b) with (x, a) · (y, b).

4. Deduce the following from Proposition 1.16. Let S ⊂ Z be nonempty and assume there

exists m ∈ Z such that m < n for all n ∈ S. Then S has a smallest element.

Hint. Given such m, let S̃ = {(−m) + n : n ∈ S}. Show that S̃ ⊂ N and deduce that S̃ has a smallest element.

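3. Prime factorization and the fundamental theorem of arithmetic

Let x ∈ N. We say x is composite if one can write

(3.1) x = ab, a, b ∈ N,

with neither a nor b equal to 1. If x ≠ 1 is not composite, it is said to be prime. If (3.1) holds, we say a|x (and that b|x), or that a is a divisor of x. Given x ∈ N, x > 1, set

(3.2) D_x = {a ∈ N : a|x, a > 1}.

Thus x ∈ D_x, so D_x is nonempty. By Proposition 1.16, D_x contains a smallest element, say p_1. Clearly p_1 is a prime. Set

(3.3) x = p_1 x_1, x_1 ∈ N, x_1 < x.

The same construction applies to x_1, which is > 1 unless x = p_1. Hence we have either x = p_1 or

(3.4) x_1 = p_2 x_2, p_2 prime, x_2 < x_1.

Continue this process, passing from x_j to x_{j+1} as long as x_j is not prime. The set S of such x_j ∈ N has a smallest element, say x_{µ−1} = p_µ, and we have

(3.5) x = p_1 p_2 ··· p_µ, p_j prime.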
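The construction leading to (3.5) can be sketched in Python: repeatedly split off the smallest divisor greater than 1, which is necessarily prime, and continue with the quotient. The function names `smallest_divisor` and `prime_factors` are illustrative and not from the text, and stopping the trial division at the square root is a standard shortcut that the argument above does not use.

```python
from typing import List

def smallest_divisor(x: int) -> int:
    """Return the smallest a > 1 with a | x, for x > 1. This a is prime."""
    if x <= 1:
        raise ValueError("x must be greater than 1")
    a = 2
    while a * a <= x:
        if x % a == 0:
            return a
        a += 1
    return x  # no divisor up to sqrt(x), so x itself is prime

def prime_factors(x: int) -> List[int]:
    """Return primes p_1 <= ... <= p_mu whose product is x, for x > 1."""
    if x <= 1:
        raise ValueError("x must be greater than 1")
    factors = []
    while x > 1:
        p = smallest_divisor(x)  # plays the role of p_1, p_2, ...
        factors.append(p)
        x //= p                  # passes from x_j to x_{j+1}
    return factors
```

For example, `prime_factors(360)` returns `[2, 2, 2, 3, 3, 5]`. The loop terminates because each quotient is strictly smaller than the previous one, mirroring the inequality x_1 < x in (3.3).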

Theorem 3.1. Given x ∈ N, x ≠ 1, there is a unique product expansion

(3.6) x = p_1 ··· p_µ,

where p_1 ≤ ··· ≤ p_µ are primes.

Only uniqueness remains to be established. This follows from:

Proposition 3.2. Assume a, b ∈ N, and p ∈ N is prime. Then

(3.7) p|ab ⟹ p|a or p|b.

We will deduce this from:

Proposition 3.3. If p ∈ N is prime and a ∈ N is not a multiple of p, or more generally if p, a ∈ N have no common divisors > 1, then there exist m, n ∈ Z such that

(3.8) ma + np = 1.

Proof of Proposition 3.2. Assume p is a prime which does not divide a. Pick m, n such that (3.8) holds. Now, multiply (3.8) by b, to get

mab + npb = b.

Thus, if p|ab, i.e., ab = pk, we have

p(mk + nb) = b,

so p|b, as desired.

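To prove Proposition 3.3, let us set

(3.9) Γ = {ma + np : m, n ∈ Z}.

Clearly Γ satisfies the following criterion:

Definition. A nonempty subset Γ ⊂ Z is a subgroup of Z provided

(3.10) a, b ∈ Γ ⟹ a + b, a − b ∈ Γ.

Proposition 3.4. If Γ ⊂ Z is a subgroup, then either Γ = {0}, or there exists x ∈ N such that

(3.11) Γ = {mx : m ∈ Z}.

Proof. Note that n ∈ Γ ⇔ −n ∈ Γ, so, with Σ = Γ ∩ N, we see that

Γ = Σ ∪ {0} ∪ (−Σ).

If Σ ≠ ∅, let x be its smallest element. Then we want to establish (3.11), so set Γ_0 = {mx : m ∈ Z}. Clearly Γ_0 ⊂ Γ. Similarly, set Σ_0 = {mx : m ∈ N} = Γ_0 ∩ N. We want to show that Σ_0 = Σ. If y ∈ Σ \ Σ_0, then we can pick m_0 ∈ N such that

m_0 x < y < (m_0 + 1)x,

and hence

y − m_0 x ∈ Σ

is smaller than x. This contradiction proves Proposition 3.4.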

Proof of Proposition 3.3. Taking Γ as in (3.9), pick x ∈ N such that (3.11) holds. Since

a ∈ Γ and p ∈ Γ, we have

a = m0 x, p = m1 x

for some mj ∈ Z. The assumption that a and p have no common divisor > 1 implies x = 1.

We conclude that 1 ∈ Γ, so (3.8) holds.
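The proof above shows that suitable m and n exist without producing them. As a complement, here is a sketch of the extended Euclidean algorithm, a different and constructive technique that the text does not use, which computes integers m, n with ma + np equal to the greatest common divisor of a and p. The function name `bezout` is illustrative.

```python
from typing import Tuple

def bezout(a: int, p: int) -> Tuple[int, int, int]:
    """Return (g, m, n) with m*a + n*p == g, where g = gcd(a, p), for a, p >= 1.

    Loop invariants: old_r == old_m*a + old_n*p and r == m*a + n*p.
    """
    old_r, r = a, p
    old_m, m = 1, 0
    old_n, n = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_m, m = m, old_m - q * m
        old_n, n = n, old_n - q * n
    return old_r, old_m, old_n

g, m, n = bezout(10, 7)  # 10 and 7 have no common divisor > 1
```

When a and p have no common divisor greater than 1, the returned `g` is 1 and the pair `(m, n)` satisfies (3.8). For the call shown, one gets `m = -2` and `n = 3`, since (−2)·10 + 3·7 = 1.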

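Exercises

1. Prove that there are infinitely many primes.
Hint. If {p_1, ..., p_m} is a complete list of primes, consider

x = p_1 ··· p_m + 1.

2. Referring to (3.10), show that a nonempty subset Γ ⊂ Z is a subgroup of Z provided

(3.12) a, b ∈ Γ ⟹ a − b ∈ Γ.

3. Let n ∈ N be a 12 digit integer. Show that if n is not prime, then it must be divisible by a prime p < 10^6.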

(3.13) 201367.

5. Find the smallest prime larger than the number in (3.13). Hint. Same as above.

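4. The rational numbers

A rational number is thought of as having the form m/n, with m, n ∈ Z, n ≠ 0. To be more formal, we will define an element of Q as an equivalence class of ordered pairs m/n, m ∈ Z, n ∈ Z \ {0}, where we define

(4.1) m/n ∼ a/b ⟺ mb = an.

Proposition 4.1. This is an equivalence relation.

Proof. We need to check that

(4.2) m/n ∼ a/b, a/b ∼ c/d ⟹ m/n ∼ c/d,

i.e., that, for m, a, c ∈ Z, n, b, d ∈ Z \ {0},

(4.3) mb = an, ad = cb ⟹ md = cn.

Now the hypotheses of (4.3) imply (mb)(ad) = (an)(cb), hence

(md)(ab) = (cn)(ab).

We are assuming b ≠ 0. If also a ≠ 0, then ab ≠ 0, and the conclusion of (4.3) follows from the cancellation property, Proposition 2.8. On the other hand, if a = 0, then m/n ∼ a/b ⇒ mb = 0 ⇒ m = 0 (since b ≠ 0), and similarly a/b ∼ c/d ⇒ cb = 0 ⇒ c = 0, so the desired implication also holds in that case.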

We will (temporarily) denote the equivalence class containing m/n by [m/n]. We then define addition and multiplication in Q to satisfy

(4.4) [m/n] + [a/b] = [(m/n) + (a/b)], [m/n] · [a/b] = [(m/n) · (a/b)],
      (m/n) + (a/b) = (mb + na)/(nb), (m/n) · (a/b) = (ma)/(nb).
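The pair formulas in (4.4) can be tried out directly. The sketch below is an illustration under stated assumptions: a Python tuple `(m, n)` with `n != 0` stands for the symbol m/n, the equivalence used is mb = an, and the names `rat_equiv`, `rat_add`, and `rat_mul` are hypothetical. No reduction to lowest terms is performed, so different tuples may represent the same rational number.

```python
from typing import Tuple

Frac = Tuple[int, int]  # (m, n) models m/n, with n != 0

def rat_equiv(p: Frac, q: Frac) -> bool:
    """m/n ~ a/b exactly when m*b == a*n."""
    (m, n), (a, b) = p, q
    return m * b == a * n

def rat_add(p: Frac, q: Frac) -> Frac:
    """(m/n) + (a/b) = (mb + na)/(nb), as in (4.4)."""
    (m, n), (a, b) = p, q
    return (m * b + n * a, n * b)

def rat_mul(p: Frac, q: Frac) -> Frac:
    """(m/n) * (a/b) = (ma)/(nb), as in (4.4)."""
    (m, n), (a, b) = p, q
    return (m * a, n * b)

# Two representatives of 1/2 and two representatives of 1/3.
x, x2 = (2, 4), (1, 2)
y, y2 = (1, 3), (-2, -6)

sum_ok = rat_equiv(rat_add(x, y), rat_add(x2, y2))
product_ok = rat_equiv(rat_mul(x, y), rat_mul(x2, y2))
```

Both checks come out true for these representatives, and `rat_add(x, y)` is the tuple `(10, 12)`, a representative of 5/6. A spot check of this kind illustrates that the operations respect the equivalence but does not prove it.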

Proposition 4.2. If m/n ∼ m′/n′ and a/b ∼ a′/b′, then

(4.4A) (m/n) + (a/b) ∼ (m′/n′) + (a′/b′),

and

(4.4B) (m/n) · (a/b) ∼ (m′/n′) · (a′/b′).

Proof. The hypotheses say

(4.4C) mn′ = m′n, ab′ = a′b.

The conclusions follow from the results of §2. In more detail, multiplying the two identities in (4.4C) yields

man′b′ = m′a′nb,

which implies (4.4B). To prove (4.4A), it is convenient to establish the intermediate step

(4.4D) (m/n) + (a/b) ∼ (m′/n′) + (a/b).

This is equivalent to

(mb + na)/(nb) ∼ (m′b + n′a)/(n′b),

hence to

(mb + na)n′b = (m′b + n′a)nb,

or to

mn′bb + nn′ab = m′nbb + n′nab.

This in turn follows readily from (4.4C). Having (4.4D), we can use a similar argument to establish that

(m′/n′) + (a/b) ∼ (m′/n′) + (a′/b′),

and hence (4.4A) holds.

From now on, we drop the brackets, simply denoting the equivalence class of m/n by

m/n, and writing (4.1) as m/n = a/b. We also may denote an element of Q by a single

letter, e.g., x = m/n. There is an injection

(4.5) Z ↪ Q, m ↦ m/1,

whose image we identify with Z. This map preserves addition and multiplication. We define

(4.6) −(m/n) = (−m)/n,

and, if x = m/n ≠ 0 (i.e., m ≠ 0 as well as n ≠ 0), we define

(4.7) x^{−1} = n/m.

The results stated in the following proposition are routine consequences of the results of §2.

Proposition 4.3. Given x, y, z ∈ Q, we have

x + y = y + x,
(x + y) + z = x + (y + z),
x + 0 = x,
x + (−x) = 0,
x · y = y · x,
(x · y) · z = x · (y · z),
x · 1 = x,
x · 0 = 0,
x · (−1) = −x,
x · (y + z) = x · y + x · z.

Furthermore,

x ≠ 0 ⟹ x · x^{−1} = 1.

Proof. For example, if x = m/n and y = a/b with m, n, a, b ∈ Z, n, b ≠ 0, the identity x · y = y · x is equivalent to (ma)/(nb) ∼ (am)/(bn). In fact, the identities ma = am and nb = bn follow from Proposition 2.3. We leave the rest of Proposition 4.3 to the reader.

We also have cancellation laws:

Proposition 4.4. Given x, y, z ∈ Q,

(4.8) x + y = z + y =⇒ x = z.

Also,

(4.9) xy = zy, y ≠ 0 ⟹ x = z.

Proof. To get (4.8), add −y to both sides of x + y = z + y and use the results of Proposition 4.3. To get (4.9), multiply both sides of x · y = z · y by y^{−1}.

It is natural to define

(4.10) x − y = x + (−y),

and, if y ≠ 0,

(4.11) x/y = x · y^{−1}.

We also have an order relation on Q. Set

(4.12) Q^+ = {m/n : mn > 0},

where, in (4.12), we use the order relation on Z, discussed in §2. This is well defined. In fact, if m/n = m′/n′, then mn′ = m′n, hence (mn)(m′n′) = (mn′)^2, and therefore mn > 0 ⇔ m′n′ > 0. Results of §2 imply that

(4.13) Q = Q^+ ∪ {0} ∪ (−Q^+)

is a disjoint union, where −Q^+ = {−x : x ∈ Q^+}. Also, clearly

(4.14) x, y ∈ Q^+ ⟹ x + y, xy, x/y ∈ Q^+.

We define

(4.15) x < y ⟺ y − x ∈ Q^+,

and we have, for any x, y ∈ Q, either

(4.16) x < y, or x = y, or y < x,

and no two can hold. The map (4.5) is seen to preserve the order relations. In light of (4.14), we see that

(4.17) given x, y > 0, x < y ⇔ x/y < 1 ⇔ 1/y < 1/x.

As usual, we say x ≤ y provided either x < y or x = y. Similarly there are natural definitions of x > y and of x ≥ y.

The following result implies that Q has the Archimedean property.

Proposition 4.5. Given x ∈ Q, there exists k ∈ Z such that

(4.18) k − 1 < x ≤ k.

Proof. It suffices to prove (4.18) assuming x ∈ Q^+; otherwise, work with −x (and make a few minor adjustments). Say x = m/n, m, n ∈ N. Then

S = {ℓ ∈ N : ℓ ≥ x}

contains m, hence is nonempty. By Proposition 1.16, S has a smallest element; call it k. Then k ≥ x. We cannot have k − 1 ≥ x, for then k − 1 would belong to S. Hence (4.18) holds.
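The proof picks k as the smallest natural number that is at least x. For a positive rational x this can be sketched in Python by searching upward from 1, using the standard library's `fractions.Fraction` for exact arithmetic. The function name `least_natural_at_least` is illustrative and not from the text.

```python
from fractions import Fraction

def least_natural_at_least(x: Fraction) -> int:
    """For x > 0, return the smallest l in N = {1, 2, 3, ...} with l >= x."""
    if x <= 0:
        raise ValueError("this sketch only treats x > 0")
    l = 1
    while l < x:  # terminates: for x = m/n with m, n in N, the value l = m already satisfies l >= x
        l += 1
    return l

x = Fraction(7, 3)
k = least_natural_at_least(x)  # k - 1 < x <= k, as in (4.18)
```

Here `k` is 3, and the inequalities 2 < 7/3 ≤ 3 are the instance of (4.18) for this x. The linear search is chosen to mirror the well-ordering argument, not for efficiency.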

Exercises

2. Look at the exercise set for §1, and verify (3) and (4) for a ∈ Q, a 6= 1, n ∈ N.

(4.19) \sum_{k=0}^{n} a^k = \frac{a^{n+1} - 1}{a - 1}, \quad a ≠ 1.

Denote the left side of (4.19) by S_n(a). Multiply by a and show that

a S_n(a) = S_n(a) + a^{n+1} − 1.

a^{−n} = (a^{−1})^n, with a^{−1} defined as in (4.7). Show that, if a, b ∈ Q \ 0,

Proposition 4.5A. Given ε ∈ Q, ε > 0, there exists n ∈ N such that

ε > \frac{1}{n}.

Assertion. If x = m/n ∈ Q, then x^2 ≠ 2.

Hint. We can arrange that m and n have no common factors. Then

(m/n)^2 = 2 ⇒ m^2 = 2n^2 ⇒ m even (m = 2k)
⇒ 4k^2 = 2n^2
⇒ n^2 = 2k^2
⇒ n even.

x_1 < x_2, y_1 ≤ y_2 ⟹ x_1 + y_1 < x_2 + y_2.

Show that

0 < x_1 < x_2, 0 < y_1 ≤ y_2 ⟹ x_1 y_1 < x_2 y_2.

5. Sequences

In this section, we discuss infinite sequences. For now, we deal with sequences of rational

numbers, but we will not explicitly state this restriction below. In fact, once the set of

real numbers is constructed in §6, the results of this section will be seen to hold also for

sequences of real numbers.

Definition. A sequence (a_j) is said to converge to a limit a provided that, for any n ∈ N, there exists K(n) such that

(5.1) j ≥ K(n) ⟹ |a_j − a| < \frac{1}{n}.

We write a_j → a, or a = lim a_j, or perhaps a = lim_{j→∞} a_j.
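To see the definition at work, consider the sequence a_j = j/(j + 1), chosen here purely as an illustration, with limit 1. Since |a_j − 1| = 1/(j + 1), the choice K(n) = n satisfies (5.1). The Python sketch below checks this on a finite range with exact rational arithmetic; the names `a`, `K`, and `check` are hypothetical, and a finite check is of course not a proof.

```python
from fractions import Fraction

def a(j: int) -> Fraction:
    """Illustrative sequence a_j = j / (j + 1), for j = 1, 2, 3, ..."""
    return Fraction(j, j + 1)

def K(n: int) -> int:
    """A threshold that works for this sequence: |a_j - 1| = 1/(j + 1) < 1/n once j >= n."""
    return n

def check(n: int, span: int = 200) -> bool:
    """Test |a_j - 1| < 1/n for j = K(n), ..., K(n) + span - 1."""
    return all(abs(a(j) - 1) < Fraction(1, n) for j in range(K(n), K(n) + span))

all_ok = all(check(n) for n in range(1, 50))
```

The threshold cannot be lowered for this sequence: at j = K(n) − 1 one has |a_j − 1| = 1/n, so the strict inequality in (5.1) fails there.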

Here, we define the absolute value |x| of x by

(5.2) |x| = \begin{cases} x & \text{if } x ≥ 0, \\ -x & \text{if } x < 0. \end{cases}

The absolute value function has various simple properties, such as |xy| = |x| · |y|, which follow readily from the definition. One basic property is the triangle inequality:

(5.3) |x + y| ≤ |x| + |y|.

In fact, if either x and y are both positive or they are both negative, one has equality in (5.3). If x and y have opposite signs, then |x + y| ≤ max(|x|, |y|), which in turn is dominated by the right side of (5.3).

Proposition 5.1. If aj → a and bj → b, then

(5.4) aj + bj → a + b,

and

(5.5) aj bj → ab.


(5.8) $|a_j b_j - ab| = |(a_j b_j - a b_j) + (a b_j - ab)| \le |b_j| \cdot |a_j - a| + |a| \cdot |b - b_j|.$

The hypotheses imply |bj | ≤ B, for some B, and hence the criterion for convergence is

readily verified. To get (5.6), we have

(5.9) $\left| \frac{a_j}{b_j} - \frac{a}{b} \right| \le \frac{1}{|b| \cdot |b_j|} \bigl\{ |b| \cdot |a - a_j| + |a| \cdot |b - b_j| \bigr\}.$

The hypotheses imply $1/|b_j| \le M$ for some M, so we also verify the criterion for convergence in this case.

We next define the concept of a Cauchy sequence.

Definition. A sequence (aj ) is a Cauchy sequence provided that, for any n ∈ N, there

exists K(n) such that

(5.10) $j, k \ge K(n) \Longrightarrow |a_j - a_k| \le \frac{1}{n}.$

It is clear that any convergent sequence is Cauchy. On the other hand, we have:

Proposition 5.2. Each Cauchy sequence is bounded.

Proof. Take n = 1 in the definition above. Thus, if (aj ) is Cauchy, there is a K such that

j, k ≥ K ⇒ |aj − ak | ≤ 1. Hence, j ≥ K ⇒ |aj | ≤ |aK | + 1, so, for all j,

$|a_j| \le M, \quad M = \max\bigl( |a_1|, \dots, |a_{K-1}|, |a_K| + 1 \bigr).$

Proposition 5.3. If (aj ) and (bj ) are Cauchy sequences, so are (aj + bj ) and (aj bj ).

Furthermore, if, for all j, |bj | ≥ c for some c > 0, then (aj /bj ) is Cauchy.

The following proposition is a bit deeper than the first three.

Proposition 5.4. If (aj ) is bounded, i.e., |aj | ≤ M for all j, then it has a Cauchy

subsequence.

Proof. We may as well assume M ∈ N. Now, either aj ∈ [0, M ] for infinitely many j or

aj ∈ [−M, 0] for infinitely many j. Let I1 be any one of these two intervals containing aj

for infinitely many j, and pick j(1) such that aj(1) ∈ I1 . Write I1 as the union of two closed

intervals, of equal length, sharing only the midpoint of I1 . Let I2 be any one of them with

the property that aj ∈ I2 for infinitely many j, and pick j(2) > j(1) such that aj(2) ∈ I2 .

Continue, picking $I_\nu \subset I_{\nu-1} \subset \cdots \subset I_1$, of length $M/2^{\nu-1}$, containing $a_j$ for infinitely many j, and picking $j(\nu) > j(\nu-1) > \cdots > j(1)$ such that $a_{j(\nu)} \in I_\nu$. Setting $b_\nu = a_{j(\nu)}$, we see that $(b_\nu)$ is a Cauchy subsequence of $(a_j)$, since, for all k ∈ N,

$|b_{\nu+k} - b_\nu| \le M/2^{\nu-1}.$


Proof. To say (aj ) is monotone is to say that either (aj ) is increasing, i.e., aj ≤ aj+1 for

all j, or that (aj ) is decreasing, i.e., aj ≥ aj+1 for all j. For the sake of argument, assume

(aj ) is increasing.

By Proposition 5.4, there is a subsequence (bν ) = (aj(ν) ) which is Cauchy. Thus, given

n ∈ N, there exists K(n) such that

(5.11) $\mu, \nu \ge K(n) \Longrightarrow |a_{j(\nu)} - a_{j(\mu)}| < \frac{1}{n}.$

$a_{j(\nu_0)} \le a_j \le a_k \le a_{j(\nu_1)},$

so

(5.12) $k \ge j \ge j(\nu_0) \Longrightarrow |a_j - a_k| < \frac{1}{n}.$

Proposition 5.6. If |a| < 1, then $a^j \to 0$.

Proof. Set b = |a|; it suffices to show that $b^j \to 0$. Consider c = 1/b > 1, hence c = 1 + y, y > 0. We claim that

$c^j = (1 + y)^j \ge 1 + jy,$

for all j ≥ 1. In fact, this clearly holds for j = 1, and if it holds for j = k, then

$b^j < \frac{1}{jy},$

so the appropriate analogue of (5.1) holds, with K(n) = Kn, for any integer K > 1/y.
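The estimate in this proof can be checked numerically in exact arithmetic. The sketch below, for the illustrative value a = −2/3 (an assumption made for the example), computes b, y, and the smallest integer K > 1/y, then tests the inequality $(1 + y)^j \ge 1 + jy$ and the criterion $b^j < 1/n$ for a finite range of j ≥ Kn. The function name and the ranges are hypothetical choices.

```python
from fractions import Fraction
from math import floor

def bernoulli_data(a):
    """Return (b, y, K) for rational a with 0 < |a| < 1: b = |a|, 1/b = 1 + y, K > 1/y."""
    b = abs(a)
    y = 1 / b - 1
    K = floor(1 / y) + 1   # smallest integer strictly greater than 1/y
    return b, y, K

b, y, K = bernoulli_data(Fraction(-2, 3))
# The inequality (1 + y)^j >= 1 + j*y, tested for a finite range of j.
bernoulli_ok = all((1 + y) ** j >= 1 + j * y for j in range(1, 60))
# With K(n) = K*n, test b^j < 1/n for a finite window of indices j >= K(n).
criterion_ok = all(b ** j < Fraction(1, n)
                   for n in range(1, 15) for j in range(K * n, K * n + 20))
print(K, bernoulli_ok, criterion_ok)
```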

Proposition 5.6 enables us to establish the following result on geometric series.

Proposition 5.7. If |x| < 1 and

$a_j = 1 + x + \cdots + x^j,$

then

$a_j \to \frac{1}{1 - x}.$


$a_j = \frac{1 - x^{j+1}}{1 - x}.$

The conclusion follows from Proposition 5.6.
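To make the argument concrete, the following sketch compares the partial sums $1 + x + \cdots + x^j$ with the closed form $(1 - x^{j+1})/(1 - x)$ and lists the gap to the limit 1/(1 − x). The value x = 1/3 and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction

def partial_sum(x, j):
    """a_j = 1 + x + ... + x^j, summed term by term."""
    return sum(x ** i for i in range(j + 1))

def closed_form(x, j):
    """(1 - x^(j+1)) / (1 - x), valid for x != 1."""
    return (1 - x ** (j + 1)) / (1 - x)

x = Fraction(1, 3)
limit = 1 / (1 - x)
# The gap limit - a_j equals x^(j+1) / (1 - x), which tends to 0 by Proposition 5.6.
gaps = [limit - partial_sum(x, j) for j in range(6)]
print(gaps)
```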

Note in particular that

(5.13) $0 < x < 1 \Longrightarrow 1 + x + \cdots + x^j < \frac{1}{1 - x}.$

It is an important mathematical fact that not every Cauchy sequence of rational numbers

has a rational number as limit. We give one example here. Consider the sequence

(5.14) $a_j = \sum_{\ell=0}^{j} \frac{1}{\ell!}.$

$a_{n+j} - a_n = \sum_{\ell=n+1}^{n+j} \frac{1}{\ell!} < \frac{1}{n!} \left( \frac{1}{n+1} + \frac{1}{(n+1)^2} + \cdots + \frac{1}{(n+1)^j} \right),$

(5.15) $a_{n+j} - a_n < \frac{1}{(n+1)!} \cdot \frac{1}{1 - \frac{1}{n+1}} = \frac{1}{n!} \cdot \frac{1}{n}.$
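The bound (5.15) can be tested directly with exact fractions. The sketch below builds the partial sums (5.14) and checks the inequality for a finite range of n and j; the function names and ranges are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def a(j):
    """Partial sum a_j = sum_{l=0}^{j} 1/l! from (5.14)."""
    return sum(Fraction(1, factorial(l)) for l in range(j + 1))

def tail_bound(n):
    """Right side of (5.15): 1/(n! * n)."""
    return Fraction(1, factorial(n) * n)

ok = all(a(n + j) - a(n) < tail_bound(n)
         for n in range(1, 10) for j in range(1, 15))
print(ok)
```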

Proof. Assume aj → m/n with m, n ∈ N. By (5.16), we must have n > 2. Now, write

(5.17) $\frac{m}{n} = \sum_{\ell=0}^{n} \frac{1}{\ell!} + r, \quad r = \lim_{j\to\infty} (a_{n+j} - a_n).$

where

(5.19) $A = \sum_{\ell=0}^{n} \frac{n!}{\ell!} \in \mathbb{N}.$


Exercises

1. Show that

$\lim_{k\to\infty} \frac{k}{2^k} = 0,$

and more generally for each n ∈ N,

$\lim_{k\to\infty} \frac{k^n}{2^k} = 0.$

Hint. See Exercise 5.

2. Show that

$\lim_{k\to\infty} \frac{2^k}{k!} = 0,$

and more generally for each n ∈ N,

$\lim_{k\to\infty} \frac{2^{nk}}{k!} = 0.$

(5.21) aj ∈ Q, aj ≥ 1, j = 1, 2, 3, . . . ,

and set

(5.22) $f_1 = a_1, \quad f_2 = a_1 + \frac{1}{a_2}, \quad f_3 = a_1 + \frac{1}{a_2 + \frac{1}{a_3}}, \ \dots.$

(5.23) fj = ϕj (a1 , . . . , aj ),

3. Show that

f1 ≤ fj , ∀ j ≥ 2, and f2 ≥ fj , ∀ j ≥ 3.


(5.25) f1 ≤ f3 ≤ f5 ≤ · · · ≤ f6 ≤ f4 ≤ f2 .
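The continued fractions (5.22) and the interlacing pattern (5.25) are easy to explore by machine. The sketch below evaluates $f_j = \varphi_j(a_1, \dots, a_j)$ from the innermost term outward and checks the chain $f_1 \le f_3 \le f_5 \le f_6 \le f_4 \le f_2$ for the illustrative choice $a_j = 1$. The function name `phi` and the sample data are assumptions made for this example.

```python
from fractions import Fraction

def phi(a):
    """phi_j(a_1, ..., a_j): evaluate the finite continued fraction from the inside out."""
    value = Fraction(a[-1])
    for term in reversed(a[:-1]):
        value = term + 1 / value
    return value

a = [1, 1, 1, 1, 1, 1]
f = [phi(a[:j]) for j in range(1, len(a) + 1)]   # f[0] = f_1, ..., f[5] = f_6
print(f)
chain = [f[0], f[2], f[4], f[5], f[3], f[1]]      # f_1, f_3, f_5, f_6, f_4, f_2
print(all(s <= t for s, t in zip(chain, chain[1:])))
```

With all $a_j = 1$ the values are ratios of consecutive Fibonacci numbers, which is one way to see why the limit in this case involves $\sqrt{5}$.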

(5.26)

= ϕj (a1 , . . . , aj−1 , bj ) − ϕj (a1 , . . . , aj−1 , b̃j ),

with

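(5.27) $b_j = a_j + \frac{1}{a_{j+1}}, \quad \tilde{b}_j = a_j + \frac{1}{\tilde{a}_{j+1}},$

$b_j - \tilde{b}_j = \frac{1}{a_{j+1}} - \frac{1}{\tilde{a}_{j+1}} = \frac{\tilde{a}_{j+1} - a_{j+1}}{\tilde{a}_{j+1} a_{j+1}}.$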

$r < 1, \quad K \in \mathbb{N}$

such that

$j \ge K \Longrightarrow \left| \frac{a_{j+1}}{a_j} \right| \le r.$

Show that aj → 0 as j → ∞. How does this result apply to Exercises 1 and 2?

6. If (aj ) satisfies the hypotheses of Exercise 5, show that there exists M < ∞ such that

$\sum_{j=1}^{k} |a_j| \le M, \quad \forall\, k.$

7. Show that you get the same criterion for convergence if (5.1) is replaced by

$j \ge K(n) \Longrightarrow |a_j - a| < \frac{5}{n}.$

Generalize, and note the relevance for the proof of Proposition 5.1. Apply the same

observation to the criterion (5.10) for (aj ) to be Cauchy.


imation arbitrarily closely by rational numbers. Thus, we define an element of R as an

equivalence class of Cauchy sequences of rational numbers, where we define

Proof. This is a straightforward consequence of Proposition 5.1. In particular, to see that

aj − bj → 0, bj − cj → 0 =⇒ aj − cj → 0.

We denote the equivalence class containing a Cauchy sequence (aj ) by [(aj )]. We then

define addition and multiplication on R to satisfy

(6.3)

[(aj )] · [(bj )] = [(aj bj )].

Proposition 5.3 states that the sequences (aj + bj ) and (aj bj ) are Cauchy if (aj ) and (bj )

are. To conclude that the operations in (6.3) are well defined, we need:

Proposition 6.2. If Cauchy sequences of rational numbers are given which satisfy $(a_j) \sim (a'_j)$ and $(b_j) \sim (b'_j)$, then

and

5.1, with due account taken of Proposition 5.2. For example, $a_j b_j - a'_j b'_j = a_j b_j - a_j b'_j + a_j b'_j - a'_j b'_j$, and there are uniform bounds $|a_j| \le A$, $|b'_j| \le B$, so

$\le A|b_j - b'_j| + B|a_j - a'_j|.$


whose image we identify with Q. This map preserves addition and multiplication.

If x = [(aj )], we define

For x ≠ 0, we define $x^{-1}$ as follows. First, to say x ≠ 0 is to say there exists n ∈ N such that $|a_j| \ge 1/n$ for infinitely many j. Since $(a_j)$ is Cauchy, this implies that there exists K such that $|a_j| \ge 1/(2n)$ for all j ≥ K. Now, if we set $\alpha_j = a_{K+j}$, we have $(\alpha_j) \sim (a_j)$; we propose to set

We claim that this is well defined. First, by Proposition 5.3, $(\alpha_j^{-1})$ is Cauchy. Furthermore, if for such x we also have $x = [(b_j)]$, and we pick K so large that also $|b_j| \ge 1/(2n)$ for all j ≥ K, and set $\beta_j = b_{K+j}$, we claim that

Indeed, we have

(6.10) $|\alpha_j^{-1} - \beta_j^{-1}| = \frac{|\beta_j - \alpha_j|}{|\alpha_j| \cdot |\beta_j|} \le 4n^2 |\beta_j - \alpha_j|,$

so (6.9) holds.

It is now a straightforward exercise to verify the basic algebraic properties of addition

and multiplication in R. We state the result.

Proposition 6.3. Given x, y, z ∈ R, all the algebraic properties stated in Proposition 4.3

hold.

For example, if x = [(aj )] and y = [(bj )], the identity xy = yx is equivalent to (aj bj ) ∼

(bj aj ). In fact, the identity aj bj = bj aj for aj , bj ∈ Q, follows from Proposition 4.3. The

rest of Proposition 6.3 is left to the reader.

We now define an order relation on R. Take x ∈ R, x = [(aj )]. From the discussion

above of $x^{-1}$, we see that, if x ≠ 0, then one and only one of the following holds. Either, for some n, K ∈ N,

(6.11) $j \ge K \Longrightarrow a_j \ge \frac{1}{2n},$


(6.12) $j \ge K \Longrightarrow a_j \le -\frac{1}{2n}.$

If (aj ) ∼ (bj ) and (6.11) holds for aj , it also holds for bj (perhaps with different n and K),

and ditto for (6.12). If (6.11) holds, we say x ∈ R+ (and we say x > 0), and if (6.12) holds

we say x ∈ R− (and we say x < 0). Clearly x > 0 if and only if −x < 0. It is also clear

that the map Q ↪ R in (6.6) preserves the order relation.

Thus we have the disjoint union

Also, clearly

(6.14) x, y ∈ R+ =⇒ x + y, xy ∈ R+ .

As in (4.15), we define

(6.15) x < y ⇐⇒ y − x ∈ R+ .

(6.15A) $j \ge K \Rightarrow b_j - a_j \ge \frac{1}{n} \quad \Bigl(\text{i.e., } a_j \le b_j - \frac{1}{n}\Bigr).$

The relation (6.15) can also be written y > x. Similarly we define x ≤ y and y ≥ x, in the obvious fashion.

The following results are straightforward.

Proposition 6.4. For elements of R, we have

Proof. The results (6.16) and (6.18) follow from (6.14); consider, for example, a(y − x).

The result (6.17) follows from (6.13). To prove (6.19), first we see that x > 0 implies


$x^{-1} > 0$, as follows: if $-x^{-1} > 0$, the identity $x \cdot (-x^{-1}) = -1$ contradicts (6.14). As for the rest of (6.19), the hypotheses imply xy > 0, and multiplying both sides of x < y by $a = (xy)^{-1}$ gives the result, by (6.18).

As in (5.2), define |x| by

(6.20) $|x| = \begin{cases} x & \text{if } x \ge 0, \\ -x & \text{if } x < 0. \end{cases}$

Note that

(6.20A) x = [(aj )] =⇒ |x| = [(|aj |)].

It is straightforward to verify

(6.21) |xy| = |x| · |y|, |x + y| ≤ |x| + |y|.

We now show that R has the Archimedean property.

Proposition 6.5. Given x ∈ R, there exists k ∈ Z such that

(6.22) k − 1 < x ≤ k.

Proof. It suffices to prove (6.22) assuming x ∈ R+ . Otherwise, work with −x. Say x = [(aj )]

where (aj ) is a Cauchy sequence of rational numbers. By Proposition 5.2, there exists

M ∈ Q such that $|a_j| \le M$ for all j. By Proposition 4.5, we have M ≤ ℓ for some ℓ ∈ N.

Hence the set S = {ℓ ∈ N : ℓ ≥ x} is nonempty. As in the proof of Proposition 4.5, taking

k to be the smallest element of S gives (6.22).

Proposition 6.6. Given any real ε > 0, there exists n ∈ N such that ε > 1/n.

Proof. Using Proposition 6.5, pick n > 1/ε and apply (6.19). Alternatively, use the reasoning given above (6.8).

We are now ready to consider sequences of elements of R.

Definition. A sequence (xj ) converges to x if and only if, for any n ∈ N, there exists

K(n) such that

(6.23) $j \ge K(n) \Longrightarrow |x_j - x| < \frac{1}{n}.$

In this case, we write xj → x, or x = lim xj .

The sequence (xj ) is Cauchy if and only if, for any n ∈ N, there exists K(n) such that

(6.24) $j, k \ge K(n) \Longrightarrow |x_j - x_k| < \frac{1}{n}.$

We note that it is typical to phrase the definition above in terms of picking any real

ε > 0 and demanding that, e.g., |xj − x| < ε, for large j. The equivalence of the two

definitions follows from Proposition 6.6.

As in Proposition 5.2, we have that every Cauchy sequence is bounded.

It is clear that, if each xj ∈ Q, then the notion that (xj ) is Cauchy given above coincides

with that in §5. If also x ∈ Q, the notion that xj → x also coincides with that given in §5.

Here is another natural but useful observation.


Proof. First assume x = [(aj )]. In particular, (aj ) is Cauchy. Now, given m, we have from

(6.15A) that

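(6.26) $|x - a_k| < \frac{1}{m} \iff \exists\, K, n \text{ such that } j \ge K \Rightarrow |a_j - a_k| < \frac{1}{m} - \frac{1}{n}$

$\Longleftarrow \exists\, K \text{ such that } j \ge K \Rightarrow |a_j - a_k| < \frac{1}{2m}.$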

for each m ∈ N, ∃ K(m) such that $j, k \ge K(m) \Rightarrow |a_j - a_k| < \frac{1}{2m}.$

Hence

$k \ge K(m) \Longrightarrow |x - a_k| < \frac{1}{m}.$

This shows that x = [(aj )] ⇒ aj → x. For the converse, if aj → x, then (aj ) is Cauchy, so

we have [(aj )] = y ∈ R. The previous argument implies aj → y. But

|x − y| ≤ |x − aj | + |aj − y|, ∀ j,

Next, the proof of Proposition 5.1 extends to the present case, yielding:

Proposition 6.7. If xj → x and yj → y, then

(6.27) xj + yj → x + y,

and

(6.28) xj yj → xy.

So far, statements made about R have emphasized similarities of its properties with

corresponding properties of Q. The crucial difference between these two sets of numbers is

given by the following result, known as the completeness property.


Theorem 6.8. If (xj ) is a Cauchy sequence of real numbers, then there exists x ∈ R such

that xj → x.

Proof. Take $x_j = [(a_{j\ell} : \ell \in \mathbb{N})]$ with $a_{j\ell} \in \mathbb{Q}$. Using (6.25), take $a_{j,\ell(j)} = b_j \in \mathbb{Q}$ such that

Then $(b_j)$ is Cauchy, since $|b_j - b_k| \le |x_j - x_k| + 2^{-j} + 2^{-k}$. Now, let

It follows that

and hence xj → x.

If we combine Theorem 6.8 with the argument behind Proposition 5.4, we obtain the

following important result, known as the Bolzano-Weierstrass Theorem.

Theorem 6.9. Each bounded sequence of real numbers has a convergent subsequence.

Proof. If |xj | ≤ M, the proof of Proposition 5.4 applies without change to show that (xj )

has a Cauchy subsequence. By Theorem 6.8, that Cauchy subsequence converges.

Similarly, adding Theorem 6.8 to the argument behind Proposition 5.5 yields:

Proposition 6.10. Each bounded monotone sequence (xj ) of real numbers converges.

A related property of R can be described in terms of the notion of the “supremum” of

a set.

Definition. If S ⊂ R, one says that x ∈ R is an upper bound for S provided x ≥ s for all

s ∈ S, and one says

(6.32) x = sup S

provided x is an upper bound for S and further x ≤ x′ whenever x′ is an upper bound for

S.

For some sets, such as S = Z, there is no x ∈ R satisfying (6.32). However, there is the

following result, known as the supremum property.

Proposition 6.11. If S is a nonempty subset of R that has an upper bound, then there

is a real x = sup S.

Proof. We use an argument similar to the one in the proof of Proposition 5.4. Let x0 be

an upper bound for S, pick s0 in S, and consider

I0 = [s0 , x0 ] = {y ∈ R : s0 ≤ y ≤ x0 }.


$L = x_0 - s_0$. In that case, divide $I_0$ into two equal intervals, having in common only the midpoint; say $I_0 = I_0^\ell \cup I_0^r$, where $I_0^r$ lies to the right of $I_0^\ell$.

Let $I_1 = I_0^r$ if $S \cap I_0^r \neq \emptyset$, and otherwise let $I_1 = I_0^\ell$. Note that $S \cap I_1 \neq \emptyset$. Let $x_1$ be the right endpoint of $I_1$, and pick $s_1 \in S \cap I_1$. Note that $x_1$ is also an upper bound for S.

Continue, constructing

Iν ⊂ Iν−1 ⊂ · · · ⊂ I0 ,

where $I_\nu$ has length $2^{-\nu} L$, such that the right endpoint $x_\nu$ of $I_\nu$ satisfies

(6.33) $x_\nu \ge s, \quad \forall\, s \in S,$

(6.34) $x_\nu - s_\nu \le 2^{-\nu} L.$

The sequence $(x_\nu)$ is bounded and monotone (decreasing) so, by Proposition 6.10, it converges; $x_\nu \to x$. By (6.33), we have x ≥ s for all s ∈ S, and by (6.34) we have $x - s_\nu \le 2^{-\nu} L$.

Hence x satisfies (6.32).
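The bisection in this proof can be carried out explicitly for a concrete set. The sketch below uses the illustrative set S = {y ∈ Q : y ≥ 0, y² < 2} with $s_0 = 1$ and $x_0 = 2$ (both assumptions made for the example). For this particular S, the right half of an interval meets S exactly when its midpoint lies in S, which is what makes the test "S ∩ I^r ≠ ∅" computable here; for a general set no such test need be available.

```python
from fractions import Fraction

def in_S(y):
    """Membership in the illustrative set S = {y in Q : y >= 0, y^2 < 2}."""
    return y >= 0 and y * y < 2

def bisect_sup(s0, x0, steps):
    """Follow the proof: keep the right half if it meets S, else the left half.

    Returns the final left endpoint (a point of S) and the list of right
    endpoints x_0, x_1, ..., x_steps (each an upper bound for S).
    """
    left, right = Fraction(s0), Fraction(x0)
    endpoints = [right]
    for _ in range(steps):
        mid = (left + right) / 2
        if in_S(mid):
            left = mid        # next interval is the right half
        else:
            right = mid       # next interval is the left half; mid is an upper bound
        endpoints.append(right)
    return left, endpoints

left, xs = bisect_sup(1, 2, 30)
print(float(xs[-1]))
```

The right endpoints decrease and squeeze down on sup S, while the gap to a point of S halves at each step, mirroring (6.33) and (6.34).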

We turn to infinite series $\sum_{k=0}^{\infty} a_k$, with $a_k \in \mathbb{R}$. We say this series converges if and only if the sequence of partial sums

(6.35) $S_n = \sum_{k=0}^{n} a_k$

converges:

(6.36) $\sum_{k=0}^{\infty} a_k = A \iff S_n \to A \text{ as } n \to \infty.$

Proposition 6.12. The infinite series $\sum_{k=0}^{\infty} a_k$ converges provided

(6.37) $\sum_{k=0}^{\infty} |a_k| < \infty,$

i.e., there exists B < ∞ such that $\sum_{k=0}^{n} |a_k| \le B$ for all n.

Proof. The triangle inequality (the second part of (6.21)) gives, for ℓ ≥ 1,

(6.38) $|S_{n+\ell} - S_n| = \Bigl| \sum_{k=n+1}^{n+\ell} a_k \Bigr| \le \sum_{k=n+1}^{n+\ell} |a_k|,$


if the right side of (6.38) fails to go to 0 as n → ∞, there exists ε > 0 and infinitely many $n_\nu \to \infty$ and $\ell_\nu \in \mathbb{N}$ such that

(6.39) $\sum_{k=n_\nu+1}^{n_\nu+\ell_\nu} |a_k| \ge \varepsilon.$

$\sum_{k=n_1+1}^{n_\nu+\ell_\nu}$

for all ν, contradicting the bound by B that follows from (6.37). Thus (6.37) ⇒ (Sn ) is

Cauchy. Convergence follows, by Theorem 6.8.

When (6.37) holds, we say the series $\sum_{k=0}^{\infty} a_k$ is absolutely convergent.

The following result on alternating series gives another sufficient condition for convergence.

Proposition 6.13. Assume $a_k > 0$, $a_k \searrow 0$. Then

(6.41) $\sum_{k=0}^{\infty} (-1)^k a_k$

is convergent.

Proof. Denote the partial sums by Sn , n ≥ 0. We see that, for m ∈ N,

and

Here is an example:

$\sum_{k=0}^{\infty} \frac{(-1)^k}{k+1} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$ is convergent.


This series is not absolutely convergent (cf. Exercise 6 below). For an evaluation of this

series, see exercises in §5 of Chapter 4.
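The behavior of the partial sums of this alternating series can be observed in exact arithmetic. The sketch below computes $S_n$ for the example above and separates the odd-indexed and even-indexed partial sums; the standard picture, which the code tests on a finite range only, is that the odd ones increase, the even ones decrease, and their gap is a single term $a_{2m+1} = 1/(2m+2)$. The function name and ranges are illustrative assumptions.

```python
from fractions import Fraction

def S(n):
    """Partial sum S_n = sum_{k=0}^{n} (-1)^k / (k + 1)."""
    return sum(Fraction((-1) ** k, k + 1) for k in range(n + 1))

odd = [S(2 * m + 1) for m in range(10)]    # S_1, S_3, ...
even = [S(2 * m) for m in range(10)]       # S_0, S_2, ...
print(float(odd[-1]), float(even[-1]))
```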

Exercises

we say

(6.45) x = inf S,

provided x is a lower bound for S and further x ≥ x′ whenever x′ is a lower bound for S.

Mirroring Proposition 6.11, show that if S ⊂ R is a nonempty set that has a lower bound,

then there is a real x = inf S.

3. Given a real number ξ ∈ (0, 1), show it has an infinite decimal expansion, i.e., there

exist bk ∈ {0, 1, . . . , 9} such that

(6.46) $\xi = \sum_{k=1}^{\infty} b_k \cdot 10^{-k}.$

Hint. Start by breaking [0, 1] into ten subintervals of equal length, and picking one to

which ξ belongs.
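The subdivision described in the hint can be sketched as follows, for a rational ξ so that the arithmetic is exact (the exercise itself concerns arbitrary real ξ; restricting to rationals, the function name, and the convention of choosing the subinterval closed on the left are assumptions made for this illustration).

```python
from fractions import Fraction

def decimal_digits(xi, count):
    """First `count` digits b_1, ..., b_count of xi in (0, 1), by repeated subdivision."""
    digits = []
    remainder = Fraction(xi)
    for _ in range(count):
        remainder *= 10
        b = int(remainder)      # index of the subinterval [b/10, (b+1)/10) containing the point
        digits.append(b)
        remainder -= b          # rescale that subinterval back to [0, 1)
    return digits

print(decimal_digits(Fraction(1, 7), 12))
```

After N steps the partial sum of (6.46) differs from ξ by less than $10^{-N}$, which is the estimate needed to show the series converges to ξ.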

$\sum_{k=0}^{\infty} x^k = \frac{1}{1 - x} < \infty.$

$\sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x}, \quad x \neq 1.$

(6.47) $\sum_{k=1}^{\infty} a_k < \infty \iff \sum_{k=0}^{\infty} b_k < \infty,$

where

(6.48) $b_k = 2^k a_{2^k}.$


$\frac{1}{2} b_2 + \frac{1}{2} b_3 + \cdots \le (a_3 + a_4) + (a_5 + a_6 + a_7 + a_8) + \cdots,$ and

$(a_3 + a_4) + (a_5 + a_6 + a_7 + a_8) + \cdots \le b_1 + b_2 + \cdots.$

6. Deduce from Exercise 5 that the harmonic series $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots$ diverges, i.e.,

(6.49) $\sum_{k=1}^{\infty} \frac{1}{k} = \infty.$

(6.50) $p > 1 \Longrightarrow \sum_{k=1}^{\infty} \frac{1}{k^p} < \infty.$

For now, we take p ∈ N. We will see later that (6.50) is meaningful, and true, for p ∈

R, p > 1.

$x_j \to x \Longrightarrow x_j^k \to x^k.$

xj ≥ y ∀ j, xj → x =⇒ x ≥ y.

11. Given the alternating series $\sum (-1)^k a_k$ as in Proposition 6.13 (with $a_k \searrow 0$), with sum S, show that, for each N,

$\sum_{k=0}^{N} (-1)^k a_k = S + r_N, \quad |r_N| \le |a_{N+1}|.$

12. Generalize Exercises 5–6 of §5 as follows. Suppose a sequence (aj ) in R has the

property that there exist r < 1 and K ∈ N such that

$j \ge K \Longrightarrow \left| \frac{a_{j+1}}{a_j} \right| \le r.$


$\sum_{j=1}^{k} |a_j| \le M, \quad \forall\, k \in \mathbb{N}.$

Conclude that $\sum_{k=1}^{\infty} a_k$ is convergent.

$\sum_{k=1}^{\infty} \frac{1}{k!} x^k$

is convergent.

The following exercises deal with the sequence (fj ) of continued fractions associated to a

sequence (aj ) as in (5.21), via (5.22)–(5.24), leading to Exercises 3–4 of §5.

fj −→ f, as j → ∞,

ϕj (a1 , . . . , aj ) −→ f, as j → ∞.

$\varphi(1, 1, \dots, 1, \dots) = \frac{1 + \sqrt{5}}{2}.$

Note. The existence of such x implies that 5 has a square root, $\sqrt{5} \in \mathbb{R}$. See Proposition 7.1 for a more general result.


7. Irrational numbers

There are real numbers that are not rational. One, called e, is given by the limit of the

sequence (5.14); in standard notation,

(7.1) $e = \sum_{\ell=0}^{\infty} \frac{1}{\ell!}.$

This number appears naturally in the theory of the exponential function, which plays a

central role in calculus, as exposed in §5 of Chapter 4. Proposition 5.8 implies that e is

not rational. One can approximate e to high accuracy. In fact, as a consequence of (5.15),

one has

(7.2) $e - \sum_{\ell=0}^{n} \frac{1}{\ell!} \le \frac{1}{n!} \cdot \frac{1}{n}.$

and hence

(7.4) $e - \sum_{\ell=0}^{120} \frac{1}{\ell!} < 10^{-200}.$

In a fraction of a second, a personal computer with the right program can perform a highly

accurate approximation to such a sum, yielding

9574966967 6277240766 3035354759 4571382178 5251664274

2746639193 2003059921 8174135966 2904357290 0334295260

5956307381 3232862794 3490763233 8298807531 · · ·
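One possible version of such a program is sketched below. It sums the series (7.1) through ℓ = 120 using exact rational arithmetic, confirms that the error bound from (7.2) is below $10^{-200}$, and prints the leading digits of the partial sum. The variable names and the choice to print 50 decimal places are illustrative assumptions; this is not the program used to produce the digits displayed above.

```python
from fractions import Fraction
from math import factorial

N = 120
partial = sum(Fraction(1, factorial(l)) for l in range(N + 1))
error_bound = Fraction(1, factorial(N) * N)     # right side of (7.2) with n = 120

print(error_bound < Fraction(1, 10 ** 200))

# Integer part of partial * 10^50, i.e. the leading digits of the partial sum.
digits = str(partial.numerator * 10 ** 50 // partial.denominator)
print(digits[0] + "." + digits[1:])
```

Because the partial sum is below e and within the stated bound of it, its leading digits agree with those of e as long as no long run of 9s or 0s intervenes at the cutoff.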

A number in R \ Q is said to be irrational. We present some other common examples of irrational numbers, such as $\sqrt{2}$. To begin, one needs to show that $\sqrt{2}$ is a well defined real number. The following general result includes this fact.


Proof. Consider

Then $S_{a,k}$ is a nonempty bounded subset of R. Note that if y > 0 and $y^k > a$ then y is an upper bound for $S_{a,k}$. Hence 1 + a is an upper bound for $S_{a,k}$. Take $b = \sup S_{a,k}$. We claim that $b^k = a$. In fact, if $b^k < a$, it follows from Exercise 9 of §6 that there exists $b_1 > b$ such that $b_1^k < a$, hence $b_1 \in S_{a,k}$, so $b < \sup S_{a,k}$. Similarly, if $b^k > a$, there exists $b_0 < b$ such that $b_0^k > a$, hence $b_0$ is an upper bound for $S_{a,k}$, so $b > \sup S_{a,k}$.

We write

(7.6) $b = a^{1/k}$.

Proposition 7.2. Take a ∈ N, k ∈ N. If $a^{1/k}$ is not an integer, then $a^{1/k}$ is irrational.

Proof. Assume $a^{1/k} = m/n$, with m, n ∈ N. We can arrange that m and n have no common prime factors. Now

(7.7) $m^k = a n^k,$

so

(7.8) $n \mid m^k.$

Thus, if n > 1 and p is a prime factor of n, then $p \mid m^k$. It follows from Proposition 3.2,

and induction on k, that p|m. This contradicts our arrangement that m and n have no

common prime factors, and concludes the proof.

Noting that $1^2 = 1$, $2^2 = 4$, $3^2 = 9$, we have:

Corollary 7.3. The following numbers are irrational:

(7.9) $\sqrt{2},\ \sqrt{3},\ \sqrt{5},\ \sqrt{6},\ \sqrt{7},\ \sqrt{8}.$

Proposition 7.4. Consider the polynomial

Then

(7.11) z ∈ Q, p(z) = 0 =⇒ z ∈ Z.


Proof. If z ∈ Q but z ∉ Z, we can write z = m/n with m, n ∈ Z, n > 1, and m and n containing no common prime factors. Now multiply (7.12) by $n^k$, to get

It follows that n divides $m^k$, so, as in the proof of Proposition 7.2, m and n must have a

common prime factor. This contradiction proves Proposition 7.4.
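Proposition 7.4 can be illustrated by a brute-force search. The sketch below assumes that p(z) in (7.10) is a monic polynomial with integer coefficients, $z^k + a_{k-1} z^{k-1} + \cdots + a_0$ (the display (7.10) is not reproduced in this excerpt, so this form is an assumption, as are the function names and the search bound). It looks for rational roots m/n in lowest terms within a small box and reports what it finds; in line with the proposition, every root found should have denominator 1.

```python
from fractions import Fraction
from math import gcd

def p(coeffs, z):
    """Evaluate z^k + a_{k-1} z^{k-1} + ... + a_0, coeffs = [a_{k-1}, ..., a_0] (integers)."""
    value = Fraction(1)
    for c in coeffs:
        value = value * z + c   # Horner's rule, starting from leading coefficient 1
    return value

def rational_roots(coeffs, bound=12):
    """Brute-force search for roots m/n in lowest terms with |m| <= bound, 1 <= n <= bound."""
    roots = set()
    for n in range(1, bound + 1):
        for m in range(-bound, bound + 1):
            if gcd(m, n) == 1 and p(coeffs, Fraction(m, n)) == 0:
                roots.add(Fraction(m, n))
    return roots

print(sorted(rational_roots([-2, -1, 2])))   # z^3 - 2z^2 - z + 2
print(sorted(rational_roots([0, -2])))       # z^2 - 2
```

The second polynomial has no rational roots in the search box, consistent with the irrationality of $\sqrt{2}$.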

Note that Proposition 7.2 deals with the special case

(7.13) $p(z) = z^k - a, \quad a \in \mathbb{N}.$

Remark. The existence of solutions to p(z) = 0 for general p(z) as in (7.10) is harder

than Proposition 7.1, especially when k is even. For the case of odd k, see Exercise 1 of

§9. For the general result, see Chapter 4, Appendix A.

The real line is thick with both rational numbers and irrational numbers. By (6.25),

given any x ∈ R, there exist $a_j \in \mathbb{Q}$ such that $a_j \to x$. Also, given any x ∈ R, there exist irrational $b_j$ such that $b_j \to x$. To see this, just take $a_j \in \mathbb{Q}$, $a_j \to x$, and set $b_j = a_j + 2^{-j}\sqrt{2}$.

In a sense that can be made precise, there are more irrational numbers than rational

numbers. Namely, Q is countable, while R is uncountable. See §8 for a treatment of this.

Perhaps the most intriguing irrational number is π. See Chapter 4 for material on π,

including a proof that it is irrational.

Exercises

(7.14) $\xi = \sum_{k=1}^{\infty} b_k \cdot 10^{-k}, \quad b_k \in \{0, 1, \dots, 9\}.$

Show that ξ is rational if and only if (7.14) is eventually repeating, i.e., if and only if there exist N, m ∈ N such that

$k \ge N \Longrightarrow b_{k+m} = b_k.$

2. Show that

$\sum_{k=1}^{\infty} 10^{-k^2}$ is irrational.

3. Making use of Proposition 7.1, define $a^p$ for real a > 0, p = m/n ∈ Q. Show that if also q ∈ Q,

$a^p a^q = a^{p+q}.$


Hint. You might start with $a^{m/n} = (a^{1/n})^m$, given n ∈ N, m ∈ Z. Then you need to show that if k ∈ N,

$(a^{1/nk})^{mk} = (a^{1/n})^m.$

You can use the results of Exercise 8 in §6.

$(ab)^p = a^p b^p.$

Hint. If $a_k = k^{-p}$, then $b_k = 2^k a_{2^k} = 2^k (2^k)^{-p} = 2^{-(p-1)k} = x^k$ with $x = 2^{-(p-1)}$.

6. Show that $\sqrt{2} + \sqrt{3}$ is irrational.

Hint. Square it.

7. Specialize the proof of Proposition 7.2 to a demonstration that 2 has no rational square

root, and contrast this argument with the proof of such a result suggested in Exercise 6 of

§4.


8. Cardinal numbers

We return to the natural numbers considered in §1 and make contact with the fact that

these numbers are used to count objects in collections. Namely, let S be some set. If S is

empty, we say 0 is the number of its elements. If S is not empty, pick an element out of S

and count “1.” If there remain other elements of S, pick another element and count “2.”

Continue. If you pick a final element of S and count “n,” then you say S has n elements.

At least, that is a standard informal description of counting. We wish to restate this a

little more formally, in the setting where we can apply the Peano axioms.

In order to do this, we consider the following subsets of N. Given n ∈ N, set

(8.1) In = {j ∈ N : j ≤ n}.

While the following is quite obvious, it is worthwhile recording that it is a consequence of

the Peano axioms and the material developed in §1.

Lemma 8.1. We have

(8.2) I1 = {1}, In+1 = In ∪ {n + 1}.

Now we propose the following

Definition 8.1. A nonempty set S has n elements if and only if there exists a bijective

map ϕ : S → In .

A reasonable definition of counting should permit one to demonstrate that, if S has n

elements and it also has m elements, then m = n. The key to showing this from the Peano

postulates is the following.

Proposition 8.2. Assume m, n ∈ N. If there exists an injective map ϕ : Im → In , then

m ≤ n.

Proof. Use induction on n. The case n = 1 is clear (by Lemma 8.1). Assume now that

N ≥ 2 and that the result is true for n < N . Then let ϕ : Im → IN be injective. Two

cases arise: either there is an element j ∈ Im such that ϕ(j) = N , or not. (Also, there is

no loss of generality in assuming at this point that m ≥ 2.)

If there is such a j, define ψ : Im−1 → IN −1 by

$\psi(\ell) = \begin{cases} \varphi(\ell) & \text{for } \ell < j, \\ \varphi(\ell + 1) & \text{for } j \le \ell < m. \end{cases}$

Then ψ is injective, so m − 1 ≤ N − 1, and hence m ≤ N .

On the other hand, if there is no such j, then we already have an injective map ϕ :

Im → IN −1 . The induction hypothesis implies m ≤ N − 1, which in turn implies m ≤ N .
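The construction of ψ in the first case of the proof can be written out concretely. In the sketch below, an injective map $\varphi : I_m \to I_N$ is stored as a dictionary on {1, ..., m}; the function name and the sample map are illustrative assumptions.

```python
def reduce_injection(phi, m, N):
    """Given an injective phi: I_m -> I_N (dict on 1..m) with phi[j] = N for some j,
    return psi: I_{m-1} -> I_{N-1} that skips j, as in the proof."""
    j = next(i for i in range(1, m + 1) if phi[i] == N)
    return {l: (phi[l] if l < j else phi[l + 1]) for l in range(1, m)}

phi = {1: 4, 2: 5, 3: 2}          # injective map I_3 -> I_5, with phi(2) = 5
psi = reduce_injection(phi, 3, 5)
print(psi)
```

The resulting ψ takes values in $I_{N-1}$ and is again injective, which is what lets the induction hypothesis apply.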


Proof. We see that m ≤ n and n ≤ m, so Proposition 1.13 applies.

Corollary 8.4. If S is a set, m, n ∈ N, and there exist bijective maps ϕ : S → Im , ψ :

S → In , then m = n.

Proof. Consider ψ ◦ ϕ−1 .

Definition 8.2. If either S = ∅ or S has n elements for some n ∈ N, as in Definition

8.1, we say S is finite.

The next result implies that any subset of a finite set is finite.

Proposition 8.5. Assume n ∈ N. If S ⊂ In is nonempty, then there exists m ≤ n and a

bijective map ϕ : S → Im .

Proof. Use induction on n. The case n = 1 is clear (by Lemma 8.1). Assume the result is

true for n < N. Then let $S \subset I_N$. Two cases arise: either N ∈ S or N ∉ S.

If N ∈ S, consider S′ = S \ {N}, so S = S′ ∪ {N} and $S' \subset I_{N-1}$. The inductive hypothesis yields a bijective map $\psi : S' \to I_m$ (with m ≤ N − 1), and then we obtain $\varphi : S' \cup \{N\} \to I_{m+1}$, equal to ψ on S′ and sending the element N to m + 1.

If N ∉ S, then $S \subset I_{N-1}$, and the inductive hypothesis directly yields the desired

bijective map.

Proposition 8.6. The set N is not finite.

Proof. If there were an n ∈ N and a bijective map ϕ : In → N, then, by restriction, there

would be a bijective map ψ : S → In+1 for some subset S of In , hence by the results above

a bijective map ψ̃ : Im → In+1 for some m ≤ n < n + 1. This contradicts Corollary 8.3.

The next result says that, in a certain sense, N is a minimal set that is not finite.

Proposition 8.7. If S is not finite, then there exists an injective map Φ : N → S.

Proof. We aim to show that there exists a family of injective maps ϕn : In → S, with the

property that

(8.3) $\varphi_n \big|_{I_m} = \varphi_m, \quad \forall\, m \le n.$

We establish this by induction on n. For n = 1, just pick some element of S and call

it ϕ1 (1). Now assume this claim is true for all n < N . So we have ϕN −1 : IN −1 → S

injective, but not surjective (since we assume S is not finite), and (8.3) holds for n ≤ N −1.

Pick x ∈ S not in the range of ϕN −1 . Then define ϕN : IN → S so that

ϕN (j) = ϕN −1 (j), j ≤ N − 1,

(8.3A)

ϕN (N ) = x.

Having the family ϕn , we define Φ : N → S by Φ(j) = ϕn (j) for any n ≥ j.

Two sets S and T are said to have the same cardinality if there exists a bijective map

between them; we write Card(S) = Card(T ). If there exists an injective map ϕ : S → T ,

we write Card(S) ≤ Card(T ). The following result, known as the Schroeder-Bernstein

theorem, implies that Card(S) = Card(T ) whenever one has both Card(S) ≤ Card(T ) and

Card(T ) ≤ Card(S).


Theorem 8.8. Let S and T be sets. Suppose there exist injective maps ϕ : S → T and

ψ : T → S. Then there exists a bijective map Φ : S → T .

Proof. Let us say an element x ∈ T has a parent y ∈ S if ϕ(y) = x. Similarly there is a

notion of a parent of an element of S. Iterating this gives a sequence of “ancestors” of any

element of S or T . For any element of S or T , there are three possibilities:

a) The set of ancestors never terminates.

b) The set of ancestors terminates at an element of S.

c) The set of ancestors terminates at an element of T .

We denote by Sa , Ta the elements of S, T , respectively for which case a) holds. Similarly

we have Sb , Tb and Sc , Tc . We have disjoint unions

S = Sa ∪ Sb ∪ Sc , T = Ta ∪ Tb ∪ Tc .

ϕ : Sa → Ta , ϕ : Sb → Tb , ψ : Tc → Sc

are all bijective. Thus we can set Φ equal to ϕ on $S_a \cup S_b$ and equal to $\psi^{-1}$ on $S_c$, to get

a desired bijection.

The terminology above suggests regarding Card(S) as an object (some sort of number).

Indeed, if S is finite we set Card(S) = n if S has n elements (as in Definition 8.1). A set

that is not finite is said to be infinite. We can also have a notion of cardinality of infinite

sets. A standard notation for the cardinality of N is

(8.4) Card(N) = ℵ0 .

Proposition 8.9. We have

(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1), · · · .
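The listing above proceeds along the anti-diagonals i + j = 2, 3, 4, and so on. A short generator reproducing this order is sketched below; the function name is an illustrative assumption.

```python
from itertools import islice

def pairs():
    """Enumerate N x N along the anti-diagonals i + j = s, for s = 2, 3, 4, ..."""
    s = 2
    while True:
        for i in range(1, s):
            yield (i, s - i)
        s += 1

print(list(islice(pairs(), 6)))
```

Each pair (i, j) appears exactly once, on the diagonal s = i + j, so the position in this list gives a bijection from N onto N × N.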

An infinite set that can be mapped bijectively onto N is called countably infinite. A

set that is either finite or countably infinite is called countable. The following result is a

natural extension of Proposition 8.5.


Proof. If X is finite, then Proposition 8.5 applies. Otherwise, we can assume X = N, and

we are looking at S ⊂ N, so there is an injective map ϕ : S → N. If S is finite, there is

no problem. Otherwise, by Proposition 8.7, there is an injective map ψ : N → S, and then

Theorem 8.8 implies the existence of a bijection between S and N.

There are sets that are not countable; they are said to be uncountable. The following

is a key result of G. Cantor.

Proposition 8.11. The set R of real numbers is uncountable.

Proof. We may as well show that (0, 1) = {x ∈ R : 0 < x < 1} is uncountable. If it were

countable, there would be a bijective map ϕ : N → (0, 1). Expand the real number ϕ(j) in

its infinite decimal expansion:

(8.6) $\varphi(j) = \sum_{k=1}^{\infty} a_{jk} \cdot 10^{-k}, \quad a_{jk} \in \{0, 1, \dots, 9\}.$

Now set

(8.7) $b_k = \begin{cases} 2 & \text{if } a_{kk} \neq 2, \\ 3 & \text{if } a_{kk} = 2, \end{cases}$

and consider

(8.8) $\xi = \sum_{k=1}^{\infty} b_k \cdot 10^{-k}, \quad \xi \in (0, 1).$

It is seen that ξ is not equal to ϕ(j) for any j ∈ N, contradicting the hypothesis that

ϕ : N → (0, 1) is onto.
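The diagonal rule (8.7) can be tried on a finite table of digits. In the sketch below, each row stands for the first few digits $a_{j1}, a_{j2}, \dots$ of some ϕ(j); the table itself and the function name are made-up illustrations, and a finite table can only illustrate the mechanism, not the full argument.

```python
def diagonal_digits(rows):
    """Apply (8.7): b_k = 2 if a_kk != 2, and b_k = 3 if a_kk == 2 (lists are 0-based here)."""
    return [2 if rows[k][k] != 2 else 3 for k in range(len(rows))]

rows = [
    [1, 4, 1, 5],
    [2, 2, 2, 2],
    [7, 1, 8, 2],
    [0, 0, 0, 2],
]
b = diagonal_digits(rows)
print(b)
```

Since the new digits are always 2 or 3, the resulting expansion avoids trailing 0s and 9s, so the ambiguity of decimal expansions (such as 0.1999... = 0.2000...) does not interfere with the comparison.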

A common notation for the cardinality of R is

(8.9) Card(R) = c.

(8.10) Card(R × R) = c.

notions of set theory. In these notes we have used set theoretical notions rather informally.

Our use of such notions has gotten somewhat heavier in this last section. In particular,

in the proof of Proposition 8.7, the innocent looking use of the phrase “pick x ∈ S . . . ”

actually assumes a weak version of the Axiom of Choice. For an introduction to the

axiomatic treatment of set theory we refer to [Dev].


Exercises

2. Let S be a nonempty set and let T be the set of all subsets of S. Adapt the proof of

Proposition 8.11 to show that

Card(S) < Card(T ),

i.e., there is not a surjective map ϕ : S → T .

Hint. There is a natural bijection of T and $\tilde{T}$, the set of functions f : S → {0, 1}, via f ↔ {x ∈ S : f(x) = 1}. Given $\tilde{\varphi} : S \to \tilde{T}$, describe a function g : S → {0, 1}, not in the range of $\tilde{\varphi}$, taking a cue from the proof of Proposition 8.11.

5. Find a one-to-one map of R onto R+ and conclude that Card(R) = Card((0, 1)).

7. Prove (8.10).

Sm,n = {k ∈ Z : m + 1 ≤ k ≤ m + n}.

Show that

Card Sm,n = n.

Hint. Produce a bijective map In → Sm,n .

Card S = m, Card T = n, S ∩ T = ∅,

Card S ∪ T = m + n.

Hint. Produce bijective maps S → Im and T → Sm,n , leading to a bijection S ∪T → Im+n .

9. Metric properties of R

sequence of points (pj ) in R converges to a limit p ∈ R (we write pj → p) if and only if for

every ε > 0 there exists N such that

(9.2) pj ∈ S, pj → p =⇒ p ∈ S.

if, given q ∈ Ω, there exists ε > 0 such that Bε (q) ⊂ Ω, where

We define the closure S̄ of a set S ⊂ R to consist of all points p ∈ R such that

Bε (p) ∩ S ≠ ∅ for all ε > 0. Equivalently, p ∈ S̄ if and only if there exists an infinite

sequence (pj ) of points in S such that pj → p.

An important property of R is completeness, which we recall is defined as follows. A

sequence (pj ) of points in R is called a Cauchy sequence if and only if

(9.4) |pj − pk | −→ 0, as j, k → ∞.

It is easy to see that if pj → p for some p ∈ R, then (9.4) holds. The completeness property

is the converse, given in Theorem 6.8, which we recall here.

Theorem 9.1. If (pj ) is a Cauchy sequence in R, then it has a limit.

Completeness provides a path to the following key notion of compactness. A nonempty

set K ⊂ R is said to be compact if and only if the following property holds.

Each infinite sequence (pj ) in K has a subsequence

(9.5)

that converges to a point in K.

It is clear that if K is compact, then it must be closed. It must also be bounded, i.e., there

exists R < ∞ such that K ⊂ BR (0). Indeed, if K is not bounded, there exist pj ∈ K such

that |pj+1 | ≥ |pj | + 1. In such a case, |pj − pk | ≥ 1 whenever j ≠ k, so (pj ) cannot have a

convergent subsequence. The following converse statement is a key result.

Theorem 9.2. If a nonempty K ⊂ R is closed and bounded, then it is compact.

Clearly every nonempty closed subset of a compact set is compact, so Theorem 9.2 is a

consequence of:

Proof. This is a direct consequence of Theorem 6.9, the Bolzano-Weierstrass theorem.

Let K ⊂ R be compact. Since K is bounded from above and from below, we have well

defined real numbers

the first by Proposition 6.11, and the second by a similar argument (cf. Exercise 2 of §6).

Since a and b are limits of elements of K, we have a, b ∈ K. We use the notation

(9.8) f : S −→ R

The following two results give important connections between continuity and compact-

ness.

Proposition 9.4. If K ⊂ R is compact and f : K → R is continuous, then f (K) is

compact.

Proof. If (qk ) is an infinite sequence of points in f (K), pick pk ∈ K such that f (pk ) = qk .

Since K is compact, we have a subsequence p_{k_ν} → p in K, and then q_{k_ν} = f(p_{k_ν}) → f(p) in R, by continuity of f, with f(p) ∈ f(K).

This leads to the second connection.

Proposition 9.5. If K ⊂ R is compact and f : K → R is continuous, then there exists

p ∈ K such that

x∈K

x∈K

The next result is called the intermediate value theorem.

Proof. Let

Then a ∈ S, so S is a nonempty, closed (hence compact) subset of [a, b]. Note that b ∉ S.

Take

(9.15) x = max S.

Then a < x < b and f (x) ≤ c. If f (x) < c, then there exists ε > 0 such that a < x − ε <

x + ε < b and f (y) < c for x − ε < y < x + ε. Thus x + ε ∈ S, contradicting (9.15).

Returning to the issue of compactness, we establish some further properties of compact

sets K ⊂ R, leading to the important result, Proposition 9.10 below.

Proposition 9.7. Let K ⊂ R be compact. Assume X1 ⊃ X2 ⊃ X3 ⊃ · · · form a decreasing

sequence of closed subsets of K. If each Xm ≠ ∅, then ∩m Xm ≠ ∅.

Proof. Pick xm ∈ Xm . Since K is compact, (xm ) has a convergent subsequence, x_{m_k} → y.

Since {x_{m_k} : k ≥ ℓ} ⊂ X_{m_ℓ}, which is closed, we have y ∈ X_{m_ℓ} for each ℓ, hence y ∈ ∩m Xm .

Corollary 9.8. Let K ⊂ R be compact. Assume U1 ⊂ U2 ⊂ U3 ⊂ · · · form an increasing

sequence of open sets in R. If ∪m Um ⊃ K, then UM ⊃ K for some M .

Proof. Consider Xm = K \ Um .

Before getting to Proposition 9.10, we bring in the following. Let Q denote the set of

rational numbers. The set Q ⊂ R has the following “denseness” property: given p ∈ R

and ε > 0, there exists q ∈ Q such that |p − q| < ε. Let

Note that Q is countable, i.e., it can be put in one-to-one correspondence with N. Hence R

is a countable collection of balls. The following lemma is left as an exercise for the reader.

Lemma 9.9. Let Ω ⊂ R be a nonempty open set. Then

(9.17) $\Omega = \bigcup \{B : B \in \mathcal{R},\ B \subset \Omega\}$.

To state the next result, we say that a collection {Uα : α ∈ A} covers K if K ⊂ ∪α∈A Uα .

If each Uα ⊂ R is open, it is called an open cover of K. If B ⊂ A and K ⊂ ∪β∈B Uβ , we

say {Uβ : β ∈ B} is a subcover. This result is called the Heine-Borel theorem.

(9.19)

has a finite subcover.

(9.20) Um = B1 ∪ · · · ∪ Bm

Exercises

and n is odd. Use the intermediate value theorem to show that p(x) = 0 for some x ∈ R.

We describe the construction of a Cantor set. Take a closed, bounded interval [a, b] = C0 .

Let C1 be obtained from C0 by deleting the open middle third interval, of length (b − a)/3.

At the jth stage, Cj is a disjoint union of 2^j closed intervals, each of length 3^{−j}(b − a). Then

Cj+1 is obtained from Cj by deleting the open middle third of each of these 2^j intervals.

We have C0 ⊃ C1 ⊃ · · · ⊃ Cj ⊃ · · · , each a closed subset of [a, b].
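The construction can be carried out with exact rational arithmetic. This is a minimal sketch, assuming rational endpoints a and b; the names `cantor_stage` and `cantor_intervals` are hypothetical helpers introduced only for illustration.

```python
from fractions import Fraction


def cantor_stage(intervals):
    """Delete the open middle third of each closed interval (lo, hi)."""
    out = []
    for lo, hi in intervals:
        third = (hi - lo) / 3
        out.append((lo, lo + third))   # left closed third
        out.append((hi - third, hi))   # right closed third
    return out


def cantor_intervals(j, a=Fraction(0), b=Fraction(1)):
    """Return the closed intervals whose union is C_j, starting from [a, b]."""
    intervals = [(a, b)]
    for _ in range(j):
        intervals = cantor_stage(intervals)
    return intervals


c2 = cantor_intervals(2)
for lo, hi in c2:
    print(lo, hi)
```

With [a, b] = [0, 1] and j = 2 this yields four intervals, each of length 1/9, in agreement with the count and length stated above.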

2. Show that

(9.21) $C = \bigcap_{j \ge 0} C_j$

3. Suppose C is formed as above, with [a, b] = [0, 1]. Show that points in C are precisely

those of the form

(9.22) $\xi = \sum_{j=0}^{\infty} b_j\, 3^{-j}, \quad b_j \in \{0, 2\}$.

4. If p, q ∈ C (and p < q), show that the interval [p, q] must contain points not in C. One

says C is totally disconnected.

C is closed, one says C is perfect.

Hint. With ξ as in (9.22) show that

$\xi \mapsto \eta = \sum_{j=0}^{\infty} \Bigl( \frac{b_j}{2} \Bigr) 2^{-j}$

Continuum Hypothesis. If S ⊂ R is uncountable, then Card S = Card R.

This hypothesis has been shown not to be amenable to proof or disproof, from the standard

axioms of set theory. See [C]. However, there is a large class of sets for which the conclusion

holds. For example, it holds whenever S ⊂ R is uncountable and compact. See Exercises

7–9 in §3 of Chapter 2 for further results along this line.

8. In the setting of Proposition 9.6 (the intermediate value theorem), in which f : [a, b] → R

is continuous and f (a) < c < f (b), consider the following.

(a) Divide I = [a, b] into two equal intervals I_ℓ and I_r , meeting at the midpoint α0 =

(a + b)/2. Select I1 = I_ℓ if f (α0 ) ≥ c, I1 = I_r if f (α0 ) < c. Say I1 = [x1 , y1 ]. Note that

f (x1 ) < c, f (y1 ) ≥ c.

(b) Divide I1 into two equal intervals I_{1ℓ} and I_{1r} , meeting at the midpoint (x1 + y1 )/2 = α1 .

Select I2 = I_{1ℓ} if f (α1 ) ≥ c, I2 = I_{1r} if f (α1 ) < c. Say I2 = [x2 , y2 ]. Note that

f (x2 ) < c, f (y2 ) ≥ c.

(c) Continue. Having Ik = [xk , yk ], of length 2^{−k}(b − a), with f (xk ) < c, f (yk ) ≥ c,

divide Ik into two equal intervals I_{kℓ} and I_{kr} , meeting at the midpoint αk = (xk + yk )/2.

Select Ik+1 = I_{kℓ} if f (αk ) ≥ c, Ik+1 = I_{kr} if f (αk ) < c. Again, Ik+1 = [xk+1 , yk+1 ] with

f (xk+1 ) < c and f (yk+1 ) ≥ c.
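Steps (a)–(c) describe the bisection method. The following is a minimal floating-point sketch of that procedure; the function name `bisect_ivt`, the step count, and the sample function t^3 are assumptions made for illustration, and rounding means the result is only approximate.

```python
def bisect_ivt(f, a, b, c, steps=50):
    """Follow steps (a)-(c): keep [x, y] with f(x) < c <= f(y).

    Assumes f is continuous on [a, b] and f(a) < c < f(b).
    After `steps` halvings the interval has length 2**(-steps) * (b - a).
    """
    x, y = a, b
    for _ in range(steps):
        mid = (x + y) / 2
        if f(mid) >= c:
            y = mid      # select the left half
        else:
            x = mid      # select the right half
    return x, y


# Hypothetical example: solve t**3 = 2 on [0, 2].
x, y = bisect_ivt(lambda t: t ** 3, 0.0, 2.0, 2.0)
print(x, y)
```

Both endpoints approximate the cube root of 2, and the invariant f(x) < c ≤ f(y) is maintained at every step, which is what drives the limiting argument in the exercise.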

(10.1) z = x + iy, x, y ∈ R,

where the new object i has the property

(10.2) $i^2 = -1$.

We denote the set of complex numbers by C. We have R ↪ C, identifying x ∈ R with

x + i0 ∈ C.

We define addition and multiplication in C as follows. Suppose w = a + ib, a, b ∈ R.

We set

z + w = (x + a) + i(y + b),

(10.3)

zw = (xa − yb) + i(xb + ya).

It is routine to verify various commutative, associative, and distributive laws, parallel to

those in Proposition 4.3. If z ≠ 0, i.e., either x ≠ 0 or y ≠ 0, we can set

(10.4) $z^{-1} = \frac{1}{z} = \frac{x}{x^2 + y^2} - i\, \frac{y}{x^2 + y^2}$,

and verify that $z z^{-1} = 1$.

For some more notation, for z ∈ C of the form (10.1), we set

(10.5) $\bar z = x - iy, \quad \mathrm{Re}\, z = x, \quad \mathrm{Im}\, z = y$.

We say z̄ is the complex conjugate of z, Re z is the real part of z, and Im z is the imaginary

part of z.

We next discuss the concept of the magnitude (or absolute value) of an element z ∈ C.

If z has the form (10.1), we take a cue from the Pythagorean theorem, giving the Euclidean

distance from z to 0, and set

(10.6) $|z| = \sqrt{x^2 + y^2}$.

Note that

(10.7) $|z|^2 = z \bar z$.

With this notation, (10.4) takes the compact (and clear) form

(10.8) $z^{-1} = \frac{\bar z}{|z|^2}$.
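Formulas (10.4) and (10.8) are easy to check numerically. This is a small sketch using Python's built-in complex type; the function name `inverse` and the sample value 3 + 4i are hypothetical choices for illustration.

```python
def inverse(x, y):
    """Return (Re, Im) of 1/(x + iy) using (10.4)."""
    n = x * x + y * y            # this is |z|^2
    if n == 0:
        raise ZeroDivisionError("z = 0 has no inverse")
    return x / n, -y / n


u, v = inverse(3.0, 4.0)
prod = complex(3.0, 4.0) * complex(u, v)
print(u, v, prod)
```

Here |z|^2 = 25, so the inverse is 3/25 − (4/25)i, and multiplying back by z gives 1 up to rounding.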

We have

(10.9) |zw| = |z| · |w|,

for z, w ∈ C, as a consequence of the identity (readily verified from the definition (10.5))

(10.10) $\overline{zw} = \bar z \cdot \bar w$.

In fact, $|zw|^2 = (zw)(\overline{zw}) = z\, w\, \bar z\, \bar w = z \bar z\, w \bar w = |z|^2 |w|^2$. This extends the first part of

(6.21) from R to C. The extension of the second part also holds, but it requires a little

more work. The following is the triangle inequality in C.

(10.11) |z + w| ≤ |z| + |w|.

$|z + w|^2 = (z + w)(\bar z + \bar w)$

(10.12) $= |z|^2 + |w|^2 + w \bar z + z \bar w$

$= |z|^2 + |w|^2 + 2\, \mathrm{Re}\, z \bar w$.

Now, for any ζ ∈ C, Re ζ ≤ |ζ|, so Re z w̄ ≤ |z w̄| = |z| · |w|, so (10.12) is

(10.13) $\le |z|^2 + |w|^2 + 2|z| \cdot |w| = (|z| + |w|)^2$,

and we have (10.11).

We now discuss matters related to convergence in C. Parallel to the real case, we say

a sequence (zj ) in C converges to a limit z ∈ C (and write zj → z) if and only if for each

ε > 0 there exists N such that

(10.14) j ≥ N =⇒ |zj − z| < ε.

Equivalently,

(10.15) zj → z ⇐⇒ |zj − z| → 0.

It is easily seen that

(10.16) zj → z ⇐⇒ Re zj → Re z and Im zj → Im z.

The set C also has the completeness property, given as follows. A sequence (zj ) in C is

said to be a Cauchy sequence if and only if

(10.17) |zj − zk | → 0, as j, k → ∞.

It is easy to see (using the triangle inequality) that if zj → z for some z ∈ C, then (10.17)

holds. Here is the converse:

Proposition 10.2. If (zj ) is a Cauchy sequence in C, then it has a limit.

Proof. If (zj ) is Cauchy in C, then (Re zj ) and (Im zj ) are Cauchy in R, so, by Theorem

6.8, they have limits.

We turn to infinite series $\sum_{k=0}^{\infty} a_k$, with ak ∈ C. We say this converges if and only if

the sequence of partial sums

(10.18) $S_n = \sum_{k=0}^{n} a_k$

converges:

(10.19) $\sum_{k=0}^{\infty} a_k = A \iff S_n \to A$ as n → ∞.

Proposition 10.3. The infinite series $\sum_{k=0}^{\infty} a_k$ converges provided

(10.20) $\sum_{k=0}^{\infty} |a_k| < \infty$,

i.e., there exists B < ∞ such that $\sum_{k=0}^{n} |a_k| \le B$ for all n.

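Proof. The triangle inequality gives, for ℓ ≥ 1,

(10.21) $|S_{n+\ell} - S_n| = \Bigl| \sum_{k=n+1}^{n+\ell} a_k \Bigr| \le \sum_{k=n+1}^{n+\ell} |a_k|$,

which tends to 0 as n → ∞, uniformly in ℓ ≥ 1, provided (10.20) holds.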

Hence (10.20) ⇒ (Sn ) is Cauchy. Convergence then follows, by Proposition 10.2.

As in the real case, if (10.20) holds, we say the infinite series $\sum_{k=0}^{\infty} a_k$ is absolutely

convergent.

An example to which Proposition 10.3 applies is the following power series, giving the

exponential function $e^z$:

(10.22) $e^z = \sum_{k=0}^{\infty} \frac{z^k}{k!}, \quad z \in \mathbb{C}$.
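The partial sums of (10.22) can be computed directly and compared with a library value. This is a minimal sketch; the function name `exp_partial_sum`, the truncation index 40, and the sample point 1 + 2i are assumptions chosen for illustration, and `cmath.exp` is used only as a reference value.

```python
import cmath


def exp_partial_sum(z, n):
    """Return S_n = sum_{k=0}^{n} z**k / k! for complex z."""
    term = 1 + 0j        # current term z**k / k!, starting with k = 0
    total = 0j
    for k in range(n + 1):
        total += term
        term *= z / (k + 1)   # pass from z**k / k! to z**(k+1) / (k+1)!
    return total


z = 1 + 2j
approx = exp_partial_sum(z, 40)
print(approx, cmath.exp(z))
```

Updating the term by the factor z/(k + 1) avoids computing large powers and factorials separately. The same ratio tends to 0 as k grows, which is why the series of absolute values converges for every z.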

4.

We turn to a discussion of polar coordinates on C. Given a nonzero z ∈ C, we can write

(10.23) $z = r\omega, \quad r = |z|, \quad \omega = \frac{z}{|z|}$.

Then ω has unit distance from 0. If the ray from 0 to ω makes an angle θ with the positive

real axis, we have

(10.25) z = r cis θ,

where

If also

(10.27) w = ρ cis ϕ, ρ = |w|,

then

(10.28) zw = rρ cis(θ + ϕ),

as a consequence of the identity

(10.29) cis(θ + ϕ) = (cis θ)(cis ϕ),

which in turn is equivalent to the pair of trigonometric identities

cos(θ + ϕ) = cos θ cos ϕ − sin θ sin ϕ,

(10.30)

sin(θ + ϕ) = cos θ sin ϕ + sin θ cos ϕ.

There is another way to write (10.25), using the classical Euler identity

(10.31) $e^{i\theta} = \cos \theta + i \sin \theta$.

Then (10.25) becomes

(10.32) $z = r e^{i\theta}$.

The identity (10.29) is equivalent to

(10.33) $e^{i(\theta + \varphi)} = e^{i\theta} e^{i\varphi}$.

We will present a self-contained derivation of (10.31) (and also of (10.30) and (10.33)) in

Chapter 4.
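Pending that derivation, (10.28) and (10.31) can be checked numerically. This is a small sketch, not a proof; the helper `cis` and the sample angles 0.7 and 1.1 are hypothetical, and `cmath.exp` is taken as the reference for the complex exponential.

```python
import cmath
import math


def cis(theta):
    """Return cos(theta) + i sin(theta)."""
    return complex(math.cos(theta), math.sin(theta))


z = 2 * cis(0.7)
w = 3 * cis(1.1)

product = z * w               # left side of (10.28)
expected = 6 * cis(0.7 + 1.1) # right side of (10.28)
print(product, expected)
print(cis(0.7), cmath.exp(0.7j))
```

The moduli multiply and the angles add, as (10.28) asserts, and cis θ agrees with the library value of the exponential at iθ up to rounding.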

We next define closed and open subsets of C, and discuss the notion of compactness. A

set S ⊂ C is said to be closed if and only if

(10.34) zj ∈ S, zj → z =⇒ z ∈ S.

The complement C \ S of a closed set S is open. Alternatively, Ω ⊂ C is open if and only

if, given q ∈ Ω, there exists ε > 0 such that Bε (q) ⊂ Ω, where

(10.35) Bε (q) = {z ∈ C : |z − q| < ε},

so q cannot be a limit of a sequence of points in C \ Ω. We define the closure S̄ of a set

S ⊂ C to consist of all points p ∈ C such that Bε (p) ∩ S ≠ ∅ for all ε > 0. Equivalently,

p ∈ S̄ if and only if there exists an infinite sequence (pj ) of points in S such that pj → p.

Parallel to (9.5), we say a nonempty set K ⊂ C is compact if and only if the following

property holds.

Each infinite sequence (pj ) in K has a subsequence

(10.36)

that converges to a point in K.

As in §9, if K ⊂ C is compact, it must be closed and bounded. Parallel to Theorem 9.2,

we have the converse.

Proof. Let (zj ) be a sequence in K. Then (Re zj ) and (Im zj ) are bounded, so Theorem

6.9 implies the existence of a subsequence such that Re z_{j_ν} and Im z_{j_ν} converge. Hence the

subsequence (z_{j_ν}) converges in C. Since K is closed, the limit must belong to K.

If S ⊂ C, a function

(10.37) f : S −→ C

same proof as Proposition 9.4.

Proposition 10.5. If K ⊂ C is compact and f : K → C is continuous, then f (K) is

compact.

Then the following variant of Proposition 9.5 is straightforward.

Proposition 10.6. If K ⊂ C is compact and f : K → C is continuous, then there exists

p ∈ K such that

z∈K

z∈K

the details. But see §1 of Chapter 2 for further extensions.

Exercises

cis π = −1.

1. Show that

$\omega = \mathrm{cis}\, \frac{2\pi}{n} \implies \omega^n = 1$.

For this, use (10.29). In conjunction with (10.25)–(10.28) and Proposition 7.1, use this to

prove the following:

such that $z_j^n = a$.

2. Compute

$\Bigl( \frac{1}{2} + \frac{\sqrt{3}}{2}\, i \Bigr)^3$,

and verify that

(10.41) $\cos \frac{\pi}{3} = \frac{1}{2}, \quad \sin \frac{\pi}{3} = \frac{\sqrt{3}}{2}$.

(10.42) $z_j^n = 1$,

(10.43) n = 3, 4, 6, 8.

Hint. Use (10.41), and also the fact that the equation $u_j^2 = i$ has solutions

(10.44) $u_1 = \frac{1}{\sqrt{2}} + \frac{i}{\sqrt{2}}, \quad u_2 = -u_1$.

(10.45) $z_j^5 = 1$.

solutions to $z^4 + z^3 + z^2 + z + 1 = 0$. Write this as

(10.46) $z^2 + z + 1 + \frac{1}{z} + \frac{1}{z^2} = 0$,

which, for

(10.47) $w = z + \frac{1}{z}$,

becomes

(10.48) $w^2 + w - 1 = 0$.

Use the quadratic formula to find 2 solutions to (10.48). Then solve (10.47), i.e., $z^2 - wz + 1 = 0$, for z. Use these calculations to show that

$\cos \frac{2\pi}{5} = \frac{\sqrt{5} - 1}{4}$.
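A numerical sanity check of this value (not a substitute for the requested argument) can follow the same route. This is a sketch under the assumption that the relevant root of (10.48) is the positive one; the variable names are hypothetical.

```python
import math

# Positive root of w^2 + w - 1 = 0, from the quadratic formula.
w = (-1 + math.sqrt(5)) / 2

# For |z| = 1 we have 1/z = conj(z), so w = z + 1/z = 2 Re z.
cos_value = w / 2

# Solve z^2 - w z + 1 = 0; the discriminant w^2 - 4 is negative here,
# so the roots are w/2 +- i sqrt(4 - w^2)/2. Take the one with Im z > 0.
z = complex(w / 2, math.sqrt(4 - w * w) / 2)

print(cos_value, math.cos(2 * math.pi / 5))
print(abs(z ** 5 - 1))
```

The first line of output compares (√5 − 1)/4 with the library cosine, and the second confirms that the chosen z is a fifth root of 1 up to rounding.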

5. Take the following path to explicitly finding the real and imaginary parts of a solution

to

$z^2 = a + ib$.

Namely, with x = Re z, y = Im z, we have

$x^2 - y^2 = a, \quad 2xy = b$,

and also

$x^2 + y^2 = \rho = \sqrt{a^2 + b^2}$,

hence

$x = \sqrt{\frac{\rho + a}{2}}, \quad y = \frac{b}{2x}$,

as long as a + ib ≠ −|a|.
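The formulas of this exercise translate directly into a short routine. This is a minimal sketch; the function name `complex_sqrt` is hypothetical, and the excluded case a + ib = −|a| (where x = 0) is rejected rather than handled.

```python
import math


def complex_sqrt(a, b):
    """Return z = x + iy with z**2 = a + ib, using the formulas above.

    Requires a + ib != -|a|, so that x > 0 and y = b / (2x) is defined.
    """
    rho = math.hypot(a, b)            # rho = sqrt(a^2 + b^2)
    x = math.sqrt((rho + a) / 2)
    if x == 0:
        raise ValueError("formula does not apply when a + ib = -|a|")
    y = b / (2 * x)
    return complex(x, y)


root = complex_sqrt(3.0, 4.0)
print(root, root ** 2)
```

For a + ib = 3 + 4i one gets ρ = 5, x = 2, y = 1, and indeed (2 + i)^2 = 3 + 4i. The other solution is the negative of the one returned.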

(10.49) $\frac{1}{1 - z} = \sum_{k=0}^{\infty} z^k$, for z ∈ C, |z| < 1.

7. Show that

$\frac{1}{1 - z^2} = \sum_{k=0}^{\infty} z^{2k}$, for z ∈ C, |z| < 1.

8. Produce a power series expansion in z, valid for |z| < 1, for

$\frac{1}{1 + z^2}$.

Chapter II

Spaces

Introduction

In Chapter 1 we developed the real number line R, and established a number of metric

properties, such as completeness of R, and compactness of closed, bounded subsets. We

also produced the complex plane C, and studied analogous metric properties of C. Here

we examine other types of spaces, which are useful in analysis.

Section 1 treats n-dimensional Euclidean space, R^n . This is equipped with a dot product

x · y ∈ R, which gives rise to a norm |x| = √(x · x). Parallel to (6.21) and (10.11) of Chapter

1, this norm satisfies the triangle inequality. In this setting, the proof goes through an

inequality known as Cauchy’s inequality. Then the distance between x and y in R^n is

given by d(x, y) = |x − y|, and it satisfies a triangle inequality. With these structures, we

have the notion of convergent sequences and Cauchy sequences, and can show that R^n is

complete. There is a notion of compactness for subsets of R^n , similar to that given in (9.5)

and in (10.36) of Chapter 1, for subsets of R and of C, and it is shown that nonempty,

closed bounded subsets of R^n are compact.

Analysts have found it useful to abstract some of the structures mentioned above, and

apply them to a larger class of spaces, called metric spaces. A metric space is a set

X, equipped with a distance function d(x, y), satisfying certain conditions (see (2.1)),

including the triangle inequality. For such a space, one has natural notions of a convergent

sequence and of a Cauchy sequence. The space may or may not be complete. If not,

there is a construction of its completion, somewhat similar to the construction of R as the

completion of Q in §6 of Chapter 1. We discuss the definition and some basic properties

of metric spaces in §2. There is also a natural notion of compactness in the metric space

context, which we treat in §3.

Most metric spaces we will encounter are subsets of Euclidean space. One exception

introduced in this chapter is the class of infinite products; see (3.3). Another important

class of metric spaces beyond the Euclidean space setting consists of spaces of functions,

which will be treated in §4 of Chapter 3.

1. Euclidean spaces

(1.1) x = (x1 , . . . , xn ) ∈ Rn , xj ∈ R, 1 ≤ j ≤ n.

The number xj is called the jth component of x. Here we discuss some important algebraic

and metric structures on Rn . First, there is addition. If x is as in (1.1) and also y =

(y1 , . . . , yn ) ∈ Rn , we have

(1.2) x + y = (x1 + y1 , . . . , xn + yn ) ∈ Rn .

We also have the dot product,

(1.4) $x \cdot y = \sum_{j=1}^{n} x_j y_j = x_1 y_1 + \cdots + x_n y_n \in \mathbb{R}$,

x · y = y · x,

(1.5) x · (ay + bz) = a(x · y) + b(x · z),

x · x > 0 unless x = 0.

Note that

We set

(1.7) $|x| = \sqrt{x \cdot x}$,

hence

Taking a cue from the Pythagorean theorem, we say that the distance from x to y in

R^n is

For us, (1.7) and (1.10) are simply definitions. We do not need to depend on the Pythagorean

theorem. Significant properties will be derived below, without recourse to the Pythagorean

theorem.

A set X equipped with a distance function is called a metric space. We will consider

metric spaces in general in the next section. Here, we want to show that the Euclidean

distance, defined by (1.10), satisfies the “triangle inequality,”

This in turn is a consequence of the following, also called the triangle inequality.

Proposition 1.1. The norm (1.7) on Rn has the property

$|x + y|^2 = (x + y) \cdot (x + y)$

(1.13) $= x \cdot x + y \cdot x + x \cdot y + y \cdot y$

$= |x|^2 + 2x \cdot y + |y|^2$.

Next,

We see that (1.12) holds if and only if x · y ≤ |x| · |y|. Thus the proof of Proposition 1.1 is

finished off by the following result, known as Cauchy’s inequality.

Proposition 1.2. For all x, y ∈ Rn ,

which implies

If we replace x by tx and y by t^{−1}y, with t > 0, the left side of (1.17) is unchanged, so we

have

(1.18) $2x \cdot y \le t^2 |x|^2 + t^{-2} |y|^2, \quad \forall\, t > 0$.

Now we pick t so that the two terms on the right side of (1.18) are equal, namely

(1.19) $t^2 = \frac{|y|}{|x|}, \quad t^{-2} = \frac{|x|}{|y|}$.

(At this point, note that (1.15) is obvious if x = 0 or y = 0, so we will assume that x ≠ 0

and y ≠ 0.) Plugging (1.19) into (1.18) gives

(1.20) x · y ≤ |x| · |y|, ∀ x, y ∈ Rn .

This is almost (1.15). To finish, we can replace x in (1.20) by −x = (−1)x, getting

(1.21) −(x · y) ≤ |x| · |y|,

and together (1.20) and (1.21) give (1.15).
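For reference, the substitution of (1.19) into (1.18) can be written out in one line. The following worked step assumes x and y are both nonzero, as in the proof.

```latex
2\, x \cdot y \;\le\; t^2 |x|^2 + t^{-2} |y|^2
  \;=\; \frac{|y|}{|x|}\, |x|^2 + \frac{|x|}{|y|}\, |y|^2
  \;=\; |x|\,|y| + |x|\,|y|
  \;=\; 2\, |x|\,|y| .
```

Dividing by 2 gives (1.20). The choice of t balances the two terms on the right, which is what makes their sum as small as possible.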

We now discuss a number of notions and results related to convergence in Rn . First, a

sequence of points (pj ) in Rn converges to a limit p ∈ Rn (we write pj → p) if and only if

(1.22) |pj − p| −→ 0,

where | · | is the Euclidean norm on Rn , defined by (1.7), and the meaning of (1.22) is

that for every ε > 0 there exists N such that

(1.23) j ≥ N =⇒ |pj − p| < ε.

If we write pj = (p_{1j} , . . . , p_{nj} ) and p = (p1 , . . . , pn ), then (1.22) is equivalent to

$(p_{1j} - p_1)^2 + \cdots + (p_{nj} - p_n)^2 \to 0$, as j → ∞,

which holds if and only if

$|p_{\ell j} - p_\ell| \to 0$ as j → ∞, for each ℓ ∈ {1, . . . , n}.

That is to say, convergence pj → p in R^n is equivalent to convergence of each component.

A set S ⊂ Rn is said to be closed if and only if

(1.24) pj ∈ S, pj → p =⇒ p ∈ S.

The complement Rn \ S of a closed set S is open. Alternatively, Ω ⊂ Rn is open if and

only if, given q ∈ Ω, there exists ε > 0 such that Bε (q) ⊂ Ω, where

(1.25) Bε (q) = {p ∈ Rn : |p − q| < ε},

so q cannot be a limit of a sequence of points in Rn \ Ω.

An important property of Rn is completeness, a property defined as follows. A sequence

(pj ) of points in Rn is called a Cauchy sequence if and only if

(1.26) |pj − pk | −→ 0, as j, k → ∞.

Again we see that (pj ) is Cauchy in R^n if and only if each component is Cauchy in R. It is

easy to see that if pj → p for some p ∈ Rn , then (1.26) holds. The completeness property

is the converse.

Theorem 1.3. If (pj ) is a Cauchy sequence in Rn , then it has a limit, i.e., (1.22) holds

for some p ∈ Rn .

Proof. Since convergence pj → p in Rn is equivalent to convergence in R of each component,

the result is a consequence of the completeness of R. This was proved in Chapter 1.

Completeness provides a path to the following key notion of compactness. A nonempty

set K ⊂ Rn is said to be compact if and only if the following property holds.

Each infinite sequence (pj ) in K has a subsequence

(1.27)

that converges to a point in K.

It is clear that if K is compact, then it must be closed. It must also be bounded, i.e., there

exists R < ∞ such that K ⊂ BR (0). Indeed, if K is not bounded, there exist pj ∈ K such

that |pj+1 | ≥ |pj | + 1. In such a case, |pj − pk | ≥ 1 whenever j ≠ k, so (pj ) cannot have a

convergent subsequence. The following converse statement is a key result.

Theorem 1.4. If a nonempty K ⊂ Rn is closed and bounded, then it is compact.

Proof. If K ⊂ Rn is closed and bounded, it is a closed subset of some box

(1.28) B = {(x1 , . . . , xn ) ∈ Rn : a ≤ xk ≤ b, ∀ k}.

Clearly every closed subset of a compact set is compact, so it suffices to show that B is

compact. Now, each closed bounded interval [a, b] in R is compact, as shown in §9 of

Chapter 1, and (by reasoning similar to the proof of Theorem 1.3) the compactness of B

follows readily from this.

We establish some further properties of compact sets K ⊂ Rn , leading to the important

result, Proposition 1.8 below. This generalizes results established for n = 1 in §9 of Chapter

1. A further generalization will be given in §3.

Proposition 1.5. Let K ⊂ R^n be compact. Assume X1 ⊃ X2 ⊃ X3 ⊃ · · · form a

decreasing sequence of closed subsets of K. If each Xm ≠ ∅, then ∩m Xm ≠ ∅.

Proof. Pick xm ∈ Xm . Since K is compact, (xm ) has a convergent subsequence, x_{m_k} → y.

Since {x_{m_k} : k ≥ ℓ} ⊂ X_{m_ℓ}, which is closed, we have y ∈ X_{m_ℓ} for each ℓ, hence y ∈ ∩m Xm .

Corollary 1.6. Let K ⊂ Rn be compact. Assume U1 ⊂ U2 ⊂ U3 ⊂ · · · form an increasing

sequence of open sets in Rn . If ∪m Um ⊃ K, then UM ⊃ K for some M .

Proof. Consider Xm = K \ Um .

Before getting to Proposition 1.8, we bring in the following. Let Q denote the set of

rational numbers, and let Qn denote the set of points in Rn all of whose components are

rational. The set Qn ⊂ Rn has the following “denseness” property: given p ∈ Rn and

ε > 0, there exists q ∈ Qn such that |p − q| < ε. Let

(1.29) R = {Br (q) : q ∈ Qn , r ∈ Q ∩ (0, ∞)}.

Note that Q and Qn are countable, i.e., they can be put in one-to-one correspondence with

N. Hence R is a countable collection of balls. The following lemma is left as an exercise

for the reader.

(1.30) $\Omega = \bigcup \{B : B \in \mathcal{R},\ B \subset \Omega\}$.

To state the next result, we say that a collection {Uα : α ∈ A} covers K if K ⊂ ∪α∈A Uα .

If each Uα ⊂ Rn is open, it is called an open cover of K. If B ⊂ A and K ⊂ ∪β∈B Uβ , we

say {Uβ : β ∈ B} is a subcover.

Proposition 1.8. If K ⊂ Rn is compact, then it has the following property.

(1.32)

has a finite subcover.

To see this, write R = {Bj : j ∈ N}. Given the cover {Uα }, pass to {Bj : j ∈ J}, where

j ∈ J if and only if Bj is contained in some Uα . By (1.30), {Bj : j ∈ J} covers K. If

(1.32) holds, we have a subcover {Bℓ : ℓ ∈ L} for some finite L ⊂ J. Pick αℓ ∈ A such

that Bℓ ⊂ U_{αℓ} . Then {U_{αℓ} : ℓ ∈ L} is the desired finite subcover advertised in (1.31).

Finally, to prove (1.32), we set

(1.33) Um = B1 ∪ · · · ∪ Bm

Exercises

that the dot product satisfies

z · w = Re zw.

In light of this, compare the proof of Proposition 1.1 with that of Proposition 10.1 in

Chapter 1.

subsets of K. Assume that for each finite set B ⊂ A, ∩α∈B Xα ≠ ∅. Then

$\bigcap_{\alpha \in A} X_\alpha \neq \emptyset$.

Hint. Consider Uα = Rn \ Xα .

|x0 | ≤ |x|, ∀ x ∈ K,

|x1 | ≥ |x|, ∀ x ∈ K.

We say

$|x_0| = \min_{x \in K} |x|, \quad |x_1| = \max_{x \in K} |x|$.

2. Metric spaces

A metric space is a set X, together with a distance function d : X × X → [0, ∞), having

the properties that

d(x, y) = 0 ⇐⇒ x = y,

(2.1) d(x, y) = d(y, x),

d(x, y) ≤ d(x, z) + d(y, z).

The third of these properties is called the triangle inequality. We sometimes denote this

metric space by (X, d). An example of a metric space is the set of rational numbers Q,

with d(x, y) = |x − y|. Another example is X = Rn , with

$d(x, y) = \sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2}$.

If (xν ) is a sequence in X, indexed by ν = 1, 2, 3, . . . , i.e., by ν ∈ Z+ , one says

(2.2) xν → y ⇐⇒ d(xν , y) → 0, as ν → ∞.

(2.3) d(xν , xµ ) → 0 as µ, ν → ∞.

One says X is a complete metric space if every Cauchy sequence converges to a limit in

X. Some metric spaces are not complete; for example, Q is not complete. You can take a

sequence (xν ) of rational numbers such that xν → √2, which is not rational. Then (xν ) is

Cauchy in Q, but it has no limit in Q.
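One concrete choice of such a sequence comes from the Newton (Babylonian) iteration x ↦ (x + 2/x)/2, which maps rationals to rationals. This is a minimal sketch with exact arithmetic; the function name `sqrt2_rationals` and the starting value 1 are assumptions made for illustration, not the construction used in the notes.

```python
from fractions import Fraction


def sqrt2_rationals(n):
    """Return the first n terms of x_{k+1} = (x_k + 2/x_k)/2, x_1 = 1.

    Every term is an exact rational number.
    """
    x = Fraction(1)
    seq = [x]
    for _ in range(n - 1):
        x = (x + 2 / x) / 2
        seq.append(x)
    return seq


seq = sqrt2_rationals(6)
for x in seq:
    print(x, float(x * x - 2))
```

The squares of the terms approach 2 rapidly, yet no term squares to exactly 2, in line with the irrationality of √2. The terms are therefore Cauchy in Q without having a limit in Q.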

If a metric space X is not complete, one can construct its completion X̂ as follows. Let

an element ξ of X̂ consist of an equivalence class of Cauchy sequences in X, where we say

We write the equivalence class containing (xν ) as [xν ]. If ξ = [xν ] and η = [yν ], we can set

(2.5) $\hat d(\xi, \eta) = \lim_{\nu \to \infty} d(x_\nu, y_\nu)$,

and verify that this is well defined, and makes X̂ a complete metric space. Details are

provided at the end of this section.

If the completion of Q is constructed by this process, you get R, the set of real numbers.

This construction was carried out in §6 of Chapter 1.

There are a number of useful concepts related to the notion of closeness. We define

some of them here. First, if p is a point in a metric space X and r ∈ (0, ∞), the set

containing such a ball, for some r > 0.

A set S ⊂ X is said to be closed if and only if

(2.7) pj ∈ S, pj → p =⇒ p ∈ S.

and only if

We state a couple of straightforward propositions, whose proofs are left to the reader.

Proposition 2.1. If Uα is a family of open sets in X, then ∪α Uα is open. If Kα is a

family of closed subsets of X, then ∩α Kα is closed.

Given S ⊂ X, we denote by S̄ (the closure of S) the smallest closed subset of X

containing S, i.e., the intersection of all the closed sets Kα ⊂ X containing S. The

following result is straightforward.

Proposition 2.2. Given S ⊂ X, p ∈ S̄ if and only if there exist xj ∈ S such that xj → p.

Given S ⊂ X, p ∈ X, we say p is an accumulation point of S if and only if, for each

ε > 0, there exists q ∈ S ∩ Bε (p), q ≠ p. It follows that p is an accumulation point of S if

and only if each Bε (p), ε > 0, contains infinitely many points of S. One straightforward

observation is that all points of S̄ \ S are accumulation points of S.

If S ⊂ Y ⊂ X, we say S is dense in Y provided S̄ ⊃ Y .

The interior of a set S ⊂ X is the largest open set contained in S, i.e., the union of all

the open sets contained in S. Note that the complement of the interior of S is equal to

the closure of X \ S.

We next define the notion of a connected space. A metric space X is said to be connected

provided that it cannot be written as the union of two disjoint nonempty open subsets.

The following is a basic example. Here, we treat I as a stand-alone metric space.

Proposition 2.3. Each interval I in R is connected.

Proof. Suppose A ⊂ I is nonempty, with nonempty complement B ⊂ I, and both sets are

open. (Hence both sets are closed.) Take a ∈ A, b ∈ B; we can assume a < b. (Otherwise,

switch A and B.) Let ξ = sup{x ∈ [a, b] : x ∈ A}. This exists, by Proposition 6.11 of

Chapter 1.

Now we obtain a contradiction, as follows. Since A is closed, ξ ∈ A. (Hence ξ < b.) But

then, since A is open, ξ > a, and furthermore there must be a neighborhood (ξ − ε, ξ + ε)

contained in A. This would imply ξ ≥ ξ + ε. Contradiction.

See the next chapter for more on connectedness, and its connection to the Intermediate

Value Theorem.

As indicated earlier in this section, if (X, d) is a metric space, we can construct its

completion (X̂, d̂). This construction can be compared to that done to pass from Q to R

in §6 of Chapter 1. Elements of X̂ consist of equivalence classes of Cauchy sequences in

X, with equivalence relation given by (2.4). To verify that (2.4) defines an equivalence

relation, we need to show that the relation specified there is reflexive, symmetric, and

transitive. The first two properties are completely straightforward. As for the third, we

need to show that

from which (2.9) readily follows. We write the equivalence class containing (xν ) as [xν ].

Given ξ = [xν ] and η = [yν ], we propose to define d̂(ξ, η) by

(2.11) $\hat d(\xi, \eta) = \lim_{\nu \to \infty} d(x_\nu, y_\nu)$.

To obtain a well defined d̂ : X̂ × X̂ → [0, ∞), we need to verify that the limit on the

right side of (2.11) exists whenever (xν ) and (yν ) are Cauchy in X, and that the limit is

unchanged if (xν ) and (yν ) are replaced by (x′ν ) ∼ (xν ) and (y′ν ) ∼ (yν ). First, we show

that dν = d(xν , yν ) is a Cauchy sequence in R. The triangle inequality for d gives

hence

dν − dµ ≤ d(xν , xµ ) + d(yµ , yν ),

and the same upper estimate applies to dµ − dν , hence to |dν − dµ |. Thus the limit on the

right side of (2.11) exists. Next, with d′ν = d(x′ν , y′ν ), we have

hence

d′ν − dν ≤ d(x′ν , xν ) + d(yν , y′ν ),

These observations establish that d̂ : X̂ × X̂ → [0, ∞) is well defined. We next need to

show that it makes X̂ a metric space. First,

d(ξ,

ν→∞

The symmetry d̂(ξ, η) = d̂(η, ξ) follows from (2.11) and the symmetry of d. Finally,

if also ζ = [zν ] ∈ X̂, then

(2.13) $\hat d(\xi, \zeta) = \lim_{\nu} d(x_\nu, z_\nu) \le \lim_{\nu} \bigl[ d(x_\nu, y_\nu) + d(y_\nu, z_\nu) \bigr] = \hat d(\xi, \eta) + \hat d(\eta, \zeta)$,

To proceed, we have a natural map

(2.14) $j : X \to \hat X, \quad j(x) = (x, x, x, \dots)$.

(2.15) $\hat d(j(x), j(y)) = d(x, y)$.

From here on, we will simply identify a point x ∈ X with its image j(x) ∈ X̂, using the

notation x ∈ X̂ (so X ⊂ X̂). It is useful to observe that if (xk ) is a Cauchy sequence in

X, then

(2.16) $\xi = [x_k] \implies \lim_{k \to \infty} \hat d(\xi, x_k) = 0$.

In fact,

d(ξ,

ν→∞

Lemma 2.4. The set X is dense in $\hat X$.

Proof. Given $\xi \in \hat X$, say $\xi = [x_\nu]$, the fact that $x_\nu \to \xi$ in $(\hat X, \hat d)$ follows from (2.16).
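The construction can be tried out numerically in the case X = Q with d(x, y) = |x − y|. The following Python sketch is an illustration only, not part of the argument: the helper names `newton_sqrt2` and `d`, and the choice of Newton iterates for √2 as Cauchy sequences, are assumptions made for this example. It checks the estimate |d_ν − d_µ| ≤ d(x_ν, x_µ) + d(y_µ, y_ν) in exact rational arithmetic, and exhibits two different Cauchy sequences whose mutual distance tends to 0, so that they represent the same point of the completion.

```python
from fractions import Fraction

def newton_sqrt2(start, steps):
    """Rational Newton iterates x -> (x + 2/x)/2; a Cauchy sequence in Q."""
    x = Fraction(start)
    seq = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        seq.append(x)
    return seq

def d(x, y):
    """The usual distance on Q."""
    return abs(x - y)

xs = newton_sqrt2(1, 6)   # one Cauchy sequence in Q
ys = newton_sqrt2(3, 6)   # a different one, with the same limit in R

# d_nu = d(x_nu, y_nu): the sequence whose limit defines the distance
# between the classes [x_nu] and [y_nu] in the completion.
ds = [d(x, y) for x, y in zip(xs, ys)]

# The Cauchy estimate, checked exactly for every pair of indices.
estimate_holds = all(
    abs(ds[n] - ds[m]) <= d(xs[n], xs[m]) + d(ys[m], ys[n])
    for n in range(len(ds)) for m in range(len(ds))
)

print(estimate_holds)
print(float(ds[-1]))  # very small: the two sequences are equivalent
```

Exact fractions are used so that the inequality is tested without rounding error; with floating point numbers the same check would need a tolerance.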

We are now ready for the following analogue of Theorem 6.8 of Chapter 1.


Proposition 2.5. The metric space $(\hat X, \hat d)$ is complete.

Proof. Assume $(\xi_k)$ is Cauchy in $(\hat X, \hat d)$. By Lemma 2.4, we can pick $x_k \in X$ such that $\hat d(\xi_k, x_k) \le 2^{-k}$. We claim $(x_k)$ is Cauchy in X. In fact,

(2.18) $d(x_k, x_\ell) = \hat d(x_k, x_\ell) \le \hat d(x_k, \xi_k) + \hat d(\xi_k, \xi_\ell) + \hat d(\xi_\ell, x_\ell) \le \hat d(\xi_k, \xi_\ell) + 2^{-k} + 2^{-\ell},$

so

(2.19) $d(x_k, x_\ell) \longrightarrow 0$ as $k, \ell \to \infty.$

Since $(x_k)$ is Cauchy in X, it defines an element $\xi = [x_k] \in \hat X$. We claim $\xi_k \to \xi$. In fact,

(2.20) $\hat d(\xi_k, \xi) \le \hat d(\xi_k, x_k) + \hat d(x_k, \xi) \le \hat d(x_k, \xi) + 2^{-k},$

and the fact that $\hat d(x_k, \xi) \to 0$ as k → ∞ follows from (2.16). This completes the proof of Proposition 2.5.

Exercises

3. Suppose the metric space (X, d) is complete, and $(\hat X, \hat d)$ is constructed as indicated in (2.4)–(2.5), and described in detail in (2.9)–(2.17). Show that the natural inclusion $j : X \to \hat X$ is both one-to-one and onto.

Hint. Suppose BR (p) = U ∪ V , a union of two disjoint open sets. Given q1 ∈ U, q2 ∈ V ,

consider the line segment

5. Let $X = \mathbb{R}^n$, but replace the distance $d(x, y) = \sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2}$ by

$d_1(x, y) = |x_1 - y_1| + \cdots + |x_n - y_n|.$


Show that (X, d1 ) is a metric space. In particular, verify the triangle inequality. Show

that a sequence pj converges in (X, d1 ) if and only if it converges in (X, d).

6. Show that if U is an open subset of (X, d), then U is a union of open balls.

B = {Br (p) : p ∈ S, r ∈ Q+ },

with Br (p) defined as in (2.6). Show that if U is an open subset of X, then U is a union

of balls in B. That is, if q ∈ U , there exists B ∈ B such that q ∈ B ⊂ U .

Given a nonempty metric space (X, d), we say it is perfect if it is complete and has no

isolated points. Exercises 8–10 deal with perfect metric spaces.

8. Show that if p ∈ X and ε > 0, then Bε (p) contains infinitely many points.

X0 = Br0 (p0 ) and X1 = Br0 (p1 )

are disjoint perfect subsets of X (i.e., are each perfect metric spaces).

10. Similarly, take distinct p00 , p01 ∈ X0 and distinct p10 , p11 ∈ X1 , and sufficiently small

r1 > 0 such that

Xjk = Br1 (pjk ) for k = 0, 1 are disjoint perfect subsets of Xj .

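Continue in this fashion, producing $X_{j_1 \cdots j_{k+1}} \subset X_{j_1 \cdots j_k}$, closed balls of radius $r_k \searrow 0$, centered at $p_{j_1 \cdots j_{k+1}}$. Show that you can define a function

$\varphi : \prod_{\ell=1}^{\infty} \{0, 1\} \longrightarrow X, \quad \varphi((j_1, j_2, j_3, \dots)) = \lim_{k\to\infty} p_{j_1 j_2 \cdots j_k}.$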

Show that ϕ is one-to-one, and deduce that

Card(X) ≥ Card(R).

11. Let X be a separable metric space, with a dense subset S = {pj : j ∈ N}. Produce a

function

$\psi : X \longrightarrow \prod_{\ell=1}^{\infty} \mathbb{N}$

as follows. Given x ∈ X, choose a sequence (pjν ) of points in S such that pjν → x. Set

ψ(x) = (j1 , j2 , j3 , . . . ).

Show that ψ is one-to-one, and deduce that

Card(X) ≤ Card(R).


3. Compactness

We say a (nonempty) metric space X is compact provided the following property holds:

(A) Each sequence $(x_k)$ in X has a convergent subsequence.

We will establish various properties of compact metric spaces, and provide various equivalent characterizations. For example, it is easily seen that (A) is equivalent to:

(B) Each infinite subset S ⊂ X has an accumulation point.

Proposition 3.1. If X is a compact metric space, then

(C) Given ε > 0, ∃ finite set {x1 , . . . , xN } such that Bε (x1 ), . . . , Bε (xN ) covers X.

Proof. Take ε > 0 and pick x1 ∈ X. If Bε (x1 ) = X, we are done. If not, pick x2 ∈

X \ Bε (x1 ). If Bε (x1 ) ∪ Bε (x2 ) = X, we are done. If not, pick x3 ∈ X \ [Bε (x1 ) ∪ Bε (x2 )].

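Continue, taking $x_{k+1} \in X \setminus [B_\varepsilon(x_1) \cup \cdots \cup B_\varepsilon(x_k)]$, if $B_\varepsilon(x_1) \cup \cdots \cup B_\varepsilon(x_k) \ne X$. Note that, for 1 ≤ i, j ≤ k,

$i \ne j \Longrightarrow d(x_i, x_j) \ge \varepsilon.$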

If one never covers X this way, consider S = {xj : j ∈ N}. This is an infinite set with no

accumulation point, so property (B) is contradicted.
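The selection procedure in this proof is effectively a greedy algorithm, and it can be run on a finite sample. The Python sketch below is only an illustration under stated assumptions: the function name `greedy_eps_net`, the random sample in the unit square, and the Euclidean distance are choices made here, not taken from the text. It picks each new center outside the ε-balls about the earlier centers and then checks the two facts used in the proof: distinct centers are at distance at least ε, and the balls about the centers cover the sample.

```python
import math
import random

def greedy_eps_net(points, eps):
    """Pick centers one at a time, each outside the eps-balls about earlier ones."""
    centers = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers

random.seed(0)
points = [(random.random(), random.random()) for _ in range(500)]
eps = 0.25
centers = greedy_eps_net(points, eps)

# Distinct centers are at distance >= eps, as noted in the proof.
separated = all(
    math.dist(a, b) >= eps
    for i, a in enumerate(centers) for b in centers[i + 1:]
)
# Every sample point lies in some open ball B_eps(center).
covered = all(any(math.dist(p, c) < eps for c in centers) for p in points)

print(len(centers), separated, covered)
```

On a finite sample the procedure always stops; the content of Proposition 3.1 is that in a compact space it also stops after finitely many steps.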

Corollary 3.2. If X is a compact metric space, it has a countable dense subset.

Proof. Given $\varepsilon = 2^{-n}$, let $S_n$ be a finite set of points $x_j$ such that $\{B_\varepsilon(x_j)\}$ covers X.

Then C = ∪n Sn is a countable dense subset of X.

Here is another useful property of compact metric spaces, which will eventually be

generalized even further, in (E) below.

Proposition 3.3. Let X be a compact metric space. Assume $K_1 \supset K_2 \supset K_3 \supset \cdots$ form a decreasing sequence of closed subsets of X. If each $K_n \ne \emptyset$, then $\bigcap_n K_n \ne \emptyset$.

Proof. Pick $x_n \in K_n$. If (A) holds, $(x_n)$ has a convergent subsequence, $x_{n_k} \to y$. Since $\{x_{n_k} : k \ge \ell\} \subset K_{n_\ell}$, which is closed, we have $y \in \bigcap_n K_n$.

Corollary 3.4. Let X be a compact metric space. Assume U1 ⊂ U2 ⊂ U3 ⊂ · · · form an

increasing sequence of open subsets of X. If ∪n Un = X, then UN = X for some N .

Proof. Consider Kn = X \ Un .

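The following is an important extension of Corollary 3.4. Note how this generalizes Proposition 1.8.

Proposition 3.5. If X is a compact metric space, then it has the property:

(D) Every open cover $\{U_\alpha : \alpha \in \mathcal{A}\}$ of X has a finite subcover.

Proof. By Corollary 3.2, X has a countable dense subset C.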


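Given $p \in U_\alpha$, there exist $z_j \in C$ and a rational $r_j > 0$ such that $p \in B_{r_j}(z_j) \subset U_\alpha$. Hence each $U_\alpha$ is a union of balls $B_{r_j}(z_j)$, with $z_j \in C \cap U_\alpha$, $r_j$ rational. Thus it suffices to show that

(D′) Every countable cover $\{B_j : j \in \mathbb{N}\}$ of X by open balls has a finite subcover.

For this, we set

$U_n = B_1 \cup \cdots \cup B_n$

and apply Corollary 3.4.

The following is a convenient alternative to property (D):

(E) If $K_\alpha \subset X$ are closed and $\bigcap_\alpha K_\alpha = \emptyset$, then some finite intersection is empty.

Considering $U_\alpha = X \setminus K_\alpha$, we see that

(D) ⇐⇒ (E).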

The following result, known as the Heine-Borel theorem, completes Proposition 3.5.

Theorem 3.6. For a metric space X,

(A) ⇐⇒ (D).

Proof. By Proposition 3.5, (A) ⇒ (D). To prove the converse, it will suffice to show that

(E) ⇒ (B). So let S ⊂ X and assume S has no accumulation point. We claim:

Such S must be closed.

Indeed, if $z \in \overline{S}$ and $z \notin S$, then z would have to be an accumulation point. To proceed,

say S = {xα : α ∈ A}, and set Kα = S \ {xα }. Then each Kα has no accumulation point,

hence Kα ⊂ X is closed. Also ∩α Kα = ∅. Hence there exists a finite set F ⊂ A such that

∩α∈F Kα = ∅, if (E) holds. Hence S = ∪α∈F {xα } is finite, so indeed (E) ⇒ (B).

We claim that (C) implies the other conditions if X is complete. Of course, compactness

implies completeness, but (C) may hold for incomplete X, e.g., X = (0, 1) ⊂ R.


Proposition 3.7. If X is a complete metric space with property (C), then X is compact.

Proof. It suffices to show that (C) ⇒ (B) if X is a complete metric space. So let S ⊂ X

be an infinite set. Cover X by balls $B_{1/2}(x_1), \dots, B_{1/2}(x_N)$. One of these balls contains infinitely many points of S, and so does its closure, say $X_1 = \overline{B_{1/2}(y_1)}$. Now cover X by finitely many balls of radius 1/4; their intersection with $X_1$ provides a cover of $X_1$. One such set contains infinitely many points of S, and so does its closure $X_2 = \overline{B_{1/4}(y_2)} \cap X_1$. Continue in this fashion, obtaining

$X_1 \supset X_2 \supset X_3 \supset \cdots \supset X_k \supset X_{k+1} \supset \cdots, \quad X_j \subset \overline{B_{2^{-j}}(y_j)},$

each containing infinitely many points of S. Pick zj ∈ Xj . One sees that (zj ) forms

a Cauchy sequence. If X is complete, it has a limit, zj → z, and z is seen to be an

accumulation point of S.

Remark. Note the similarity of this argument with the proof of the Bolzano-Weierstrass theorem in Chapter 1.

Given metric spaces $(X_j, d_j)$, 1 ≤ j ≤ m, we can form a Cartesian product metric space

(3.1) $X = \prod_{j=1}^{m} X_j, \quad d(x, y) = d_1(x_1, y_1) + \cdots + d_m(x_m, y_m).$

Another choice of metric is $\delta(x, y) = \sqrt{d_1(x_1, y_1)^2 + \cdots + d_m(x_m, y_m)^2}$. The metrics d and

δ are equivalent, i.e., there exist constants C0 , C1 ∈ (0, ∞) such that

(3.2) C0 δ(x, y) ≤ d(x, y) ≤ C1 δ(x, y), ∀ x, y ∈ X.

A key example is Rm , the Cartesian product of m copies of the real line R.

We describe some important classes of compact spaces.

Proposition 3.8. If $X_j$ are compact metric spaces, 1 ≤ j ≤ m, so is $X = \prod_{j=1}^{m} X_j$.

Proof. If (xν ) is an infinite sequence of points in X, say xν = (x1ν , . . . , xmν ), pick a

convergent subsequence of (x1ν ) in X1 , and consider the corresponding subsequence of

(xν ), which we relabel (xν ). Using this, pick a convergent subsequence of (x2ν ) in X2 .

Continue. Having a subsequence such that xjν → yj in Xj for each j = 1, . . . , m, we then

have a convergent subsequence in X.

The following result is useful for analysis on Rn .

Proposition 3.9. If K is a closed bounded subset of Rn , then K is compact.

Proof. This has been proved in §1. There it was noted that the result follows from the

compactness of a closed bounded interval I = [a, b] in R, which in turn was proved in §9 of

Chapter 1. Here, we just note that compactness of [a, b] is also a corollary of Proposition

3.7.

We next give a slightly more sophisticated result on compactness. The following exten-

sion of Proposition 3.8 is a special case of Tychonov’s Theorem.


Proposition 3.10. If $\{X_j : j \in \mathbb{Z}^+\}$ are compact metric spaces, so is $X = \prod_{j=1}^{\infty} X_j$.

Here, we can make X a metric space by setting

(3.3) $d(x, y) = \sum_{j=1}^{\infty} 2^{-j}\, \frac{d_j(p_j(x), p_j(y))}{1 + d_j(p_j(x), p_j(y))},$

where pj : X → Xj is the projection onto the jth factor. It is easy to verify that, if xν ∈ X,

then xν → y in X, as ν → ∞, if and only if, for each j, pj (xν ) → pj (y) in Xj .

Proof. Following the argument in Proposition 3.8, if $(x_\nu)$ is an infinite sequence of points in X, we obtain a nested family of subsequences

(3.4) $(x_\nu) \supset (x^1_\nu) \supset (x^2_\nu) \supset \cdots \supset (x^j_\nu) \supset \cdots$

such that $p_\ell(x^j_\nu)$ converges in $X_\ell$, for 1 ≤ ℓ ≤ j. The next step is a diagonal construction. We set

(3.5) $\xi_\nu = x^\nu_\nu \in X.$

Then, for each j, after throwing away a finite number N(j) of elements, one obtains from $(\xi_\nu)$ a subsequence of the sequence $(x^j_\nu)$ in (3.4), so $p_\ell(\xi_\nu)$ converges in $X_\ell$ for all ℓ. Hence $(\xi_\nu)$ is a convergent subsequence of $(x_\nu)$.

Exercises

Prove that if d(x, y) is symmetric and satisfies the triangle inequality, so does

$\delta(x, y) = \frac{d(x, y)}{1 + d(x, y)}.$

Hint. Consider φ(r) = r/(1 + r).

$\delta(x, y) = \bigl\{ d_1(x_1, y_1)^2 + \cdots + d_m(x_m, y_m)^2 \bigr\}^{1/2}.$

Show that

$\delta(x, y) \le d(x, y) \le \sqrt{m}\, \delta(x, y).$


4. Let X be a metric space, p ∈ X, and let K ⊂ X be compact. Show that there exist

x0 , x1 ∈ K such that

d(x0 , p) ≤ d(x, p), ∀ x ∈ K,

d(x1 , p) ≥ d(x, p), ∀ x ∈ K.

Show that there exist y0 , y1 ∈ K such that

d(q0 , q1 ) ≤ d(y0 , y1 ), ∀ q0 , q1 ∈ K.

5. Let X be a metric space that satisfies the total boundedness condition (C), and let $\hat X$ be its completion. Show that $\hat X$ is compact.

Hint. Show that $\hat X$ also satisfies condition (C).

isolated points, then Card(X) = Card(R). Note how this generalizes the result on Cantor

sets in Exercise 6, §9, of Chapter 1.

Card X ≤ Card R).

7. Define K ⊂ X as follows:

Show that

(a) $K \ne \emptyset$.

Hint. Cover X with $B_1(p_j)$, $1 \le j \le N_0$. At least one is uncountable; call it $X_0$. Cover $X_0$ with $X_0 \cap B_{1/2}(p_j)$, $1 \le j \le N_1$, $p_j \in X_0$. At least one is uncountable; call it $X_1$. Continue, obtaining uncountable compact sets $X_0 \supset X_1 \supset \cdots$, with $\operatorname{diam} X_j \le 2^{1-j}$. Show that $\bigcap_j X_j = \{x\}$ with x ∈ K.

(b) K is closed (hence compact), and

(c) K has no isolated points.

Hint for (c). Given x ∈ K, show that, for each ε > 0, there exists δ ∈ (0, ε) such that $B_\varepsilon(x) \setminus B_\delta(x)$ is uncountable. Apply Exercise 7 to this compact metric space.

9. Deduce from Exercises 6–8 that Card K = Card R. Hence conclude that

Card X = Card R.


A. The Baire category theorem

Let X be a metric space. A subset S ⊂ X is said to be nowhere dense if its closure $\overline{S}$ contains no nonempty open set. Consequently, S is nowhere dense if and only if $X \setminus \overline{S}$ is dense. Also, a set U ⊂ X is dense in X if and only if U intersects each nonempty open subset of X.

Our main goal here is to prove the following.

Theorem A.1. A complete metric space X cannot be written as a countable union of

nowhere dense subsets.

Proof. Let Sk ⊂ X be nowhere dense, k ∈ N. Set

(A.1) $T_k = \bigcup_{j=1}^{k} \overline{S_j},$

(A.2) $U_k = X \setminus T_k,$

(A.3) $\bigcup_k S_k = X \Longrightarrow \bigcap_k U_k = \emptyset,$

To do this, pick p1 ∈ U1 and ε1 > 0 such that Bε1 (p1 ) ⊂ U1 . Since U2 is dense in X,

we can then pick p2 ∈ Bε1 (p1 ) ∩ U2 and ε2 ∈ (0, ε1 /2) such that

which is possible at each stage because Uk is dense in X, and hence intersects each

nonempty open set. Note that

It follows that


so (pk ) is Cauchy. Since X is complete, this sequence has a limit p ∈ X. Since each

Bεk (pk ) is closed, (A.6) implies

Theorem A.1 is called the Baire category theorem. The terminology arises as follows.

We say a subset Y ⊂ X is of first category provided Y is a countable union of nowhere

dense sets. If Y is not a set of first category, we say it is of second category. Theorem A.1

says that if X is a complete metric space, then X is of second category.


Chapter III

Functions

Introduction

The playing fields for analysis are spaces, and the players themselves are functions. In

this chapter we develop some frameworks for understanding the behavior of various classes

of functions. We spend about half the chapter studying functions f : X → Y from one

metric space (X) to another (Y), and about half specializing to the case $Y = \mathbb{R}^n$.

Our emphasis is on continuous functions, and §1 presents a number of results on con-

tinuous functions f : X → Y , which by definition have the property

xν → x =⇒ f (xν ) → f (x).

We devote particular attention to the behavior of continuous functions on compact sets.

We bring in the notion of uniform continuity, a priori stronger than continuity, and show

that f continuous on X ⇒ f uniformly continuous on X, provided X is compact. We also

introduce the notion of connectedness, and extend the intermediate value theorem given

in §9 of Chapter 1 to the setting where X is a connected metric space, and f : X → R is

continuous.

In §2 we consider sequences and series of functions, starting with sequences (fj ) of

functions fj : X → Y . We study convergence and uniform convergence. We move to

infinite series

$\sum_{j=1}^{\infty} f_j(x),$

in case $Y = \mathbb{R}^n$, and discuss conditions on $f_j$ yielding convergence, absolute convergence, and uniform convergence. Section 3 introduces a special class of infinite series, power series,

$\sum_{k=0}^{\infty} a_k z^k.$

Here we take ak ∈ C and z ∈ C, and consider conditions yielding convergence on a disk

DR = {z ∈ C : |z| < R}. This section is a prelude to a deeper study of power series, as it

relates to calculus, in Chapter 4.

In §4 we study spaces of functions, including C(X, Y ), the set of continuous functions

f : X → Y . Under certain hypotheses (e.g., if either X or Y is compact) we can take

$D(f, g) = \sup_{x \in X} d_Y(f(x), g(x)),$

as a distance function, making C(X, Y ) a metric space. We investigate conditions under

which this metric space can be shown to be complete. We also investigate conditions

under which certain subsets of C(X, Y ) can be shown to be compact. Unlike §§1–3, this

section will not have much impact on Chapters 4–5, but we include it to indicate further

interesting directions that analysis does take.


1. Continuous functions

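Let X and Y be metric spaces, with distance functions $d_X$ and $d_Y$, respectively. A function f : X → Y is said to be continuous at a point x ∈ X if and only if

(1.1) $x_\nu \to x \Longrightarrow f(x_\nu) \to f(x),$

or, equivalently, for each ε > 0, there exists δ > 0 such that

(1.1A) $d_X(x, x') < \delta \Longrightarrow d_Y(f(x), f(x')) < \varepsilon,$

that is to say,

(1.1B) $f^{-1}(B_\varepsilon(y)) \supset B_\delta(x), \quad y = f(x),$

where the balls $B_\varepsilon(y)$ and $B_\delta(x)$ are defined as in (2.6) of Chapter 2. Here we use the notation

$f^{-1}(S) = \{x \in X : f(x) \in S\},$

given S ⊂ Y.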

We say f is continuous on X if it is continuous at each point of X. Here is an equivalent

condition.

Proposition 1.1. Given f : X → Y, f is continuous on X if and only if

(1.1C) U open in Y $\Longrightarrow f^{-1}(U)$ open in X.

Proof. First, assume f is continuous. Let U ⊂ Y be open, and assume $x \in f^{-1}(U)$, so f(x) = y ∈ U. Given that U is open, pick ε > 0 such that $B_\varepsilon(y) \subset U$. Continuity of f at x forces the image of $B_\delta(x)$ to lie in the ball $B_\varepsilon(y)$ about y, if δ is small enough, hence to lie in U. Thus $B_\delta(x) \subset f^{-1}(U)$ for δ small enough, so $f^{-1}(U)$ must be open.

Conversely, assume (1.1C) holds. If x ∈ X, and f(x) = y, then for all ε > 0, $f^{-1}(B_\varepsilon(y))$ must be an open set containing x, so $f^{-1}(B_\varepsilon(y))$ contains $B_\delta(x)$ for some δ > 0. Hence f is continuous at x.

We record the following important link between continuity and compactness. This

extends Proposition 9.4 of Chapter 1.

Proposition 1.2. If X and Y are metric spaces, f : X → Y continuous, and K ⊂ X

compact, then f (K) is a compact subset of Y.

Proof. If (yν ) is an infinite sequence of points in f (K), pick xν ∈ K such that f (xν ) = yν .

If K is compact, we have a subsequence xνj → p in X, and then yνj → f (p) in Y.

If f : X → R is continuous, we say f ∈ C(X). A useful corollary of Proposition 1.2 is:


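Proposition 1.3. If X is a compact metric space and f ∈ C(X), then f assumes a maximum and a minimum value on X.

Proof. We know from Proposition 1.2 that f(X) is a compact subset of R, hence bounded. Proposition 6.1 of Chapter 1 implies f(X) ⊂ R has a sup and an inf, and, as noted in (9.7) of Chapter 1, these numbers are in f(X). That is, we have

(1.2) $b = \max f(X), \quad a = \min f(X).$

Hence $a = f(x_0)$ and $b = f(x_1)$ for some $x_0, x_1 \in X$.

For later use, we mention that if X is a nonempty set and f : X → R is bounded from above, disregarding any notion of continuity, we set

(1.3) $\sup_{x \in X} f(x) = \sup f(X),$

and, if f is bounded from below,

(1.4) $\inf_{x \in X} f(x) = \inf f(X).$

If f is not bounded from above, we set sup f = +∞, and if f is not bounded from below, we set inf f = −∞.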

Given a set X, f : X → R, and xn ∈ X, we set

(1.5) $\limsup_{n\to\infty} f(x_n) = \lim_{n\to\infty} \Bigl( \sup_{k \ge n} f(x_k) \Bigr),$

and

(1.6) $\liminf_{n\to\infty} f(x_n) = \lim_{n\to\infty} \Bigl( \inf_{k \ge n} f(x_k) \Bigr).$
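Definitions (1.5) and (1.6) can be explored on a finite stretch of a sequence. The Python sketch below is an illustration under assumptions that are not in the text: the helper names `tail_sups` and `tail_infs`, the cutoff N = 2000, and the sample values (−1)^n (1 + 1/n) standing in for f(x_n). A finite window can only approximate a supremum over an infinite tail; for this particular sequence the tail sup and inf are attained at the first even and first odd index of the tail, so the window loses nothing.

```python
def tail_sups(values):
    """s_n = max of values[n:], computed right to left."""
    out, running = [], float("-inf")
    for v in reversed(values):
        running = max(running, v)
        out.append(running)
    return out[::-1]

def tail_infs(values):
    """i_n = min of values[n:], via inf(v) = -sup(-v)."""
    return [-s for s in tail_sups([-v for v in values])]

# Sample sequence f(x_n) = (-1)^n (1 + 1/n), n = 1, ..., N.
N = 2000
vals = [(-1) ** n * (1 + 1 / n) for n in range(1, N + 1)]
sups = tail_sups(vals)   # decreasing toward lim sup = 1
infs = tail_infs(vals)   # increasing toward lim inf = -1

# Index 999 corresponds to n = 1000.
print(sups[999], infs[999])
```

The tail sups decrease and the tail infs increase, which is why the limits on the right sides of (1.5) and (1.6) exist in [−∞, ∞].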

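A function f : X → Y between metric spaces is said to be uniformly continuous provided that, for any ε > 0, there exists δ > 0 such that

$x, y \in X,\ d_X(x, y) \le \delta \Longrightarrow d_Y(f(x), f(y)) \le \varepsilon.$

An equivalent condition is that f have a modulus of continuity, i.e., a monotonic function $\omega : [0, 1) \to [0, \infty)$ such that $\delta \searrow 0 \Rightarrow \omega(\delta) \searrow 0$, and such that

$x, y \in X,\ d_X(x, y) \le \delta < 1 \Longrightarrow d_Y(f(x), f(y)) \le \omega(\delta).$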

Not all continuous functions are uniformly continuous. For example, if X = (0, 1) ⊂ R,

then f (x) = sin 1/x is continuous, but not uniformly continuous, on X. The following

result is useful, for example, in the development of the Riemann integral in Chapter 4.


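Proposition 1.4. If X is a compact metric space and f : X → Y is continuous, then f is uniformly continuous.

Proof. If not, there exist ε > 0 and $x_\nu, y_\nu \in X$ such that $d_X(x_\nu, y_\nu) \le 2^{-\nu}$ but

(1.10) $d_Y(f(x_\nu), f(y_\nu)) \ge \varepsilon.$

Since X is compact, we can pass to a subsequence such that $x_{\nu_j} \to p$, and then also $y_{\nu_j} \to p$. Continuity of f at p implies $f(x_{\nu_j}) \to f(p)$ and $f(y_{\nu_j}) \to f(p)$, contradicting (1.10).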
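The example f(x) = sin 1/x on X = (0, 1) mentioned before this proposition shows what goes wrong without compactness. The Python sketch below is an illustration only; the function name `witness_pair` and the particular points chosen are assumptions of this example. It produces pairs x_ν, y_ν in (0, 1) with |x_ν − y_ν| → 0 while |f(x_ν) − f(y_ν)| stays near 2, so no single δ can serve for ε = 1.

```python
import math

def f(x):
    return math.sin(1.0 / x)

def witness_pair(nu):
    """Points of (0, 1) at which sin(1/x) equals +1 and -1, respectively."""
    x = 1.0 / (2 * math.pi * nu + math.pi / 2)
    y = 1.0 / (2 * math.pi * nu + 3 * math.pi / 2)
    return x, y

for nu in (1, 10, 100, 1000):
    x, y = witness_pair(nu)
    # The distance between the points shrinks; the gap in values does not.
    print(nu, abs(x - y), abs(f(x) - f(y)))
```

These pairs have no convergent subsequence in (0, 1), since they accumulate only at 0, which is exactly the step of the proof that uses compactness.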

If X and Y are metric spaces and f : X → Y is continuous, one-to-one, and onto, and

if its inverse g = f −1 : Y → X is continuous, we say f is a homeomorphism. Here is a

useful sufficient condition for producing homeomorphisms.

Proposition 1.5. Let X be a compact metric space. Assume f : X → Y is continuous,

one-to-one, and onto. Then its inverse g : Y → X is continuous.

Proof. If K ⊂ X is closed, then K is compact, so by Proposition 1.2, f (K) ⊂ Y is

compact, hence closed. Now if U ⊂ X is open, with complement K = X \ U , we see that

f (U ) = Y \ f (K), so U open ⇒ f (U ) open, that is,

U ⊂ X open =⇒ g −1 (U ) open.

We next define the notion of a connected space. A metric space X is said to be connected

provided that it cannot be written as the union of two disjoint nonempty open subsets.

The following is a basic class of examples.

Proposition 1.6. Each interval I in R is connected.

Proof. This is Proposition 2.3 of Chapter 2.

We say X is path-connected if, given any p, q ∈ X, there is a continuous map γ : [0, 1] →

X such that γ(0) = p and γ(1) = q. The following is an easy consequence of Proposition

1.6.

Proposition 1.7. Every path-connected metric space X is connected.

Proof. If X = U ∪ V with U and V open, disjoint, and both nonempty, take p ∈ U, q ∈ V ,

and let γ : [0, 1] → X be a continuous path from p to q. Then

$[0, 1] = \gamma^{-1}(U) \cup \gamma^{-1}(V)$

would be a disjoint union of nonempty open sets, which by Proposition 1.6 cannot happen.

The next result is known as the Intermediate Value Theorem. Note that it generalizes

Proposition 9.6 of Chapter 1.


Proposition 1.8. Let X be a connected metric space and f : X → R continuous. Assume p, q ∈ X, and f(p) = a < f(q) = b. Then, given any c ∈ (a, b), there exists z ∈ X such that f(z) = c.

Proof. Under the hypotheses, A = {x ∈ X : f (x) < c} is open and contains p, while

B = {x ∈ X : f (x) > c} is open and contains q. If X is connected, then A ∪ B cannot be

all of X; so any point in its complement has the desired property.

Exercises

and hence

d : X × X −→ [0, ∞) is continuous.

pn : [a, b] −→ [an , bn ].

continuous.

Hint. Start at n = 1, and use Exercise 4 to produce an inductive proof.

g ◦ f : X → Z by g ◦ f (x) = g(f (x)). Show that g ◦ f is continuous.


(f1 (x), f2 (x)). Show that g is continuous.

We present some exercises that deal with functions that are semicontinuous. Given a metric

space X and f : X → [−∞, ∞], we say f is lower semicontinuous at x ∈ X provided

8. Show that

and

f is upper semicontinuous ⇐⇒ f −1 ([c, ∞]) is closed, ∀ c ∈ R.

9. Show that

Show that

χS is upper semicontinuous ⇐⇒ S is closed.

Here, $\chi_S(x) = 1$ if x ∈ S, 0 if x ∉ S.


2. Sequences and series of functions

Let X and Y be metric spaces, with distance functions $d_X$ and $d_Y$, respectively. Consider a sequence of functions $f_j : X \to Y$, which we denote $(f_j)$. To say $(f_j)$ converges at x to f : X → Y is simply to say that $f_j(x) \to f(x)$ in Y. If such convergence holds for each x ∈ X, we say $(f_j)$ converges to f on X, pointwise.

A stronger type of convergence is uniform convergence. We say fj → f uniformly on X

provided

(2.1) sup dY (fj (x), f (x)) −→ 0, as j → ∞.

x∈X

An equivalent characterization is that, for each ε > 0, there exists K ∈ N such that

(2.2) j ≥ K =⇒ dY (fj (x), f (x)) ≤ ε, ∀ x ∈ X.

A significant property of uniform convergence is that passing to the limit preserves conti-

nuity.

Proposition 2.1. If fj : X → Y is continuous for each j and fj → f uniformly, then

f : X → Y is continuous.

Proof. Fix p ∈ X and take ε > 0. Pick K ∈ N such that (2.2) holds. Then pick δ > 0 such

that

(2.3) x ∈ Bδ (p) =⇒ dY (fK (x), fK (p)) < ε,

which can be done since fK : X → Y is continuous. Together, (2.2) and (2.3) imply

x ∈ Bδ (p) ⇒ dY (f (x), f (p))

(2.4) ≤ dY (f (x), fK (x)) + dY (fK (x), fK (p)) + dY (fK (p), f (p))

≤ 3ε.

Thus f is continuous at p, for each p ∈ X.
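Proposition 2.1 also gives a quick way to see that a pointwise convergent sequence fails to converge uniformly. The Python sketch below uses an example that is not taken from the text: f_j(x) = x^j on [0, 1], whose pointwise limit is 0 on [0, 1) and 1 at x = 1, hence discontinuous. The names `f_j`, `pointwise_limit`, and `gap_at_witness` are hypothetical helpers. At the point x_j = (1/2)^{1/j} one has f_j(x_j) = 1/2, so the sup in (2.1) is at least 1/2 for every j.

```python
def f_j(j, x):
    return x ** j

def pointwise_limit(x):
    """Pointwise limit of x^j on [0, 1]."""
    return 1.0 if x == 1.0 else 0.0

def gap_at_witness(j):
    """|f_j(x_j) - f(x_j)| at x_j = (1/2)^(1/j), a point of [0, 1)."""
    x_j = 0.5 ** (1.0 / j)
    return x_j, abs(f_j(j, x_j) - pointwise_limit(x_j))

for j in (1, 10, 100, 1000):
    x_j, gap = gap_at_witness(j)
    # The witness point moves toward 1, but the gap stays near 1/2.
    print(j, x_j, gap)
```

The witness points x_j move toward 1 as j grows, which is where the limit function jumps.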

We next consider Cauchy sequences of functions fj : X → Y . To say (fj ) is Cauchy

at x ∈ X is simply to say (fj (x)) is a Cauchy sequence in Y . We say (fj ) is uniformly

Cauchy provided

(2.5) sup dY (fj (x), fk (x)) −→ 0, as j, k → ∞.

x∈X

An equivalent characterization is that, for each ε > 0, there exists K ∈ N such that

(2.6) j, k ≥ K =⇒ dY (fj (x), fk (x)) ≤ ε, ∀ x ∈ X.

If Y is complete, a Cauchy sequence (fj ) will have a limit f : X → Y . We have the

following.


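Proposition 2.2. Assume Y is complete, and $(f_j)$ is uniformly Cauchy on X. Then $(f_j)$ converges uniformly to a limit f : X → Y.

Proof. We have already seen that there exists f : X → Y such that $f_j(x) \to f(x)$ for each x ∈ X. To finish the proof, take ε > 0, and pick K ∈ N such that (2.6) holds. Then taking k → ∞ yields

(2.7) $j \ge K \Longrightarrow d_Y(f_j(x), f(x)) \le \varepsilon, \quad \forall\, x \in X,$

which gives the uniform convergence.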

If, in addition, each fj : X → Y is continuous, we can put Propositions 2.1 and 2.2

together. We leave this to the reader.

It is useful to note the following phenomenon in case, in addition, X is compact.

Proposition 2.3. Assume X is compact, fj : X → Y continuous, and fj → f uniformly

on X. Then

(2.8) $K = f(X) \cup \bigcup_{j \ge 1} f_j(X) \subset Y$ is compact.

Proof. Let (yν ) ⊂ K be an infinite sequence. If there exists j ∈ N such that yν ∈ fj (X)

for infinitely many ν, convergence of a subsequence to an element of fj (X) follows from

the known compactness of fj (X). Ditto if yν ∈ f (X) for infinitely many ν. It remains to

consider the situation $y_\nu \in f_{j_\nu}(X)$, $j_\nu \to \infty$ (after perhaps taking a subsequence). That is, suppose $y_\nu = f_{j_\nu}(x_\nu)$, $x_\nu \in X$, $j_\nu \to \infty$. Passing to a further subsequence, we can

assume xν → x in X, and then it follows from the uniform convergence that

(2.9) yν −→ y = f (x) ∈ K.

We move from sequences to series. For this, we need some algebraic structure on Y .

Thus, for the rest of this section, we assume

(2.10) $f_j : X \longrightarrow \mathbb{R}^n,$

for some n ∈ N. We look at the infinite series

(2.11) $\sum_{k=0}^{\infty} f_k(x),$

and seek conditions for convergence, which is the same as convergence of the sequence of partial sums,

(2.12) $S_j(x) = \sum_{k=0}^{j} f_k(x).$


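The series (2.11) converges at x ∈ X provided

(2.13) $\sum_{k=0}^{\infty} |f_k(x)| < \infty,$

i.e., provided there exists $B_x < \infty$ such that

(2.14) $\sum_{k=0}^{j} |f_k(x)| \le B_x, \quad \forall\, j \in \mathbb{N}.$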

In such a case, we say the series (2.11) converges absolutely at x. We say (2.11) converges

uniformly on X if and only if (Sj ) converges uniformly on X. The following sufficient

condition for uniform convergence is called the Weierstrass M test.

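Proposition 2.4. Assume there exist $M_k$ such that $|f_k(x)| \le M_k$, for all x ∈ X, and

(2.15) $\sum_{k=0}^{\infty} M_k < \infty.$

Then the series (2.11) converges uniformly on X, to a limit $S : X \to \mathbb{R}^n$.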

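Proof. This proof is also similar to that of Proposition 6.12 of Chapter 1, but we review it. We have

(2.16) $|S_{m+\ell}(x) - S_m(x)| \le \Bigl| \sum_{k=m+1}^{m+\ell} f_k(x) \Bigr| \le \sum_{k=m+1}^{m+\ell} |f_k(x)| \le \sum_{k=m+1}^{m+\ell} M_k.$

Now (2.15) implies $\sigma_m = \sum_{k=0}^{m} M_k$ is uniformly bounded, so (by Proposition 6.10 of Chapter 1), $\sigma_m \nearrow \beta$ for some $\beta \in \mathbb{R}^+$. Hence

(2.17) $|S_{m+\ell}(x) - S_m(x)| \le \sigma_{m+\ell} - \sigma_m \le \beta - \sigma_m \to 0$, as m → ∞,

independent of ℓ ∈ N and x ∈ X. Thus $(S_j)$ is uniformly Cauchy on X, and uniform convergence follows by Proposition 2.2.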
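The estimate in this proof can be checked numerically on a concrete series. The Python sketch below is an illustration under assumptions not found in the text: the functions f_k(x) = sin(kx)/k², the bounds M_k = 1/k², the grid on [0, 2π], and indexing from k = 1 rather than k = 0 (to avoid dividing by zero) are all choices made for this example. It compares the largest difference of partial sums found on the grid with the tail sum of the M_k, which does not depend on x.

```python
import math

def f_k(k, x):
    return math.sin(k * x) / k ** 2

def M(k):
    """Uniform bound |f_k(x)| <= M(k), valid for all real x."""
    return 1.0 / k ** 2

def partial_sum(j, x):
    return sum(f_k(k, x) for k in range(1, j + 1))

m, ell = 10, 50
grid = [2 * math.pi * i / 200 for i in range(201)]

# Largest difference of partial sums over the grid ...
sup_diff = max(abs(partial_sum(m + ell, x) - partial_sum(m, x)) for x in grid)
# ... against the tail of the dominating series, independent of x.
tail_bound = sum(M(k) for k in range(m + 1, m + ell + 1))

print(sup_diff, tail_bound)
```

A grid only samples the supremum over x, so this is a sanity check of the inequality, not a proof of uniform convergence.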

Bringing in Proposition 2.1, we have the following.


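Corollary 2.5. In the setting of Proposition 2.4, if also each $f_k : X \to \mathbb{R}^n$ is continuous, so is the limit S.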

Exercises

1. For j ∈ N, define $f_j : \mathbb{R} \to \mathbb{R}$ by

$f_1(x) = \frac{x}{1 + x^2}, \quad f_j(x) = f_1(jx).$

Show that fj → 0 pointwise on R.

Show that, for each ε > 0, fj → 0 uniformly on R \ (−ε, ε).

Show that (fj ) does not converge uniformly to 0 on R.

2. For j ∈ N, define $g_j : \mathbb{R} \to \mathbb{R}$ by

$g_1(x) = \frac{x}{\sqrt{1 + x^2}}, \quad g_j(x) = g_1(jx).$

Show that there exists g : R → R such that gj → g pointwise. Show that g is not continuous

on all of R. Where is g discontinuous?

$f_j(x) \nearrow f(x), \quad \forall\, x \in X.$

Prove that fj → f uniformly on X. (This result is called Dini’s theorem.)

Hint. For ε > 0, let Kj (ε) = {x ∈ X : f (x)−fj (x) ≥ ε}. Note that Kj (ε) ⊃ Kj+1 (ε) ⊃ · · · .

What about ∩j≥1 Kj (ε)?

$\sum_{k=1}^{\infty} \frac{1}{k^2}\, g_k(x).$

$\sum_{k=1}^{\infty} \frac{1}{k}\, f_k(x).$

Where does this series converge? Where does it converge uniformly? Where is the sum

continuous?

Hint. For use in the latter questions, note that, for ` ∈ N, ` ≤ k ≤ 2`, we have fk (1/`) ∈

[1/2, 1].


3. Power series

(3.1) $\sum_{k=0}^{\infty} a_k z^k,$

with $a_k \in \mathbb{C}$. Note that if $z_1 \ne 0$ and (3.1) converges for $z = z_1$, then there exists C < ∞ such that

(3.2) $|a_k z_1^k| \le C, \quad \forall\, k.$

Hence, if $|z| \le r|z_1|$, r < 1, we have

(3.3) $\sum_{k=0}^{\infty} |a_k z^k| \le C \sum_{k=0}^{\infty} r^k = \frac{C}{1 - r} < \infty,$

the last identity being the classical geometric series computation. (Compare (10.49) in

Chapter 1.) This yields the following.

Proposition 3.1. If (3.1) converges for some $z_1 \ne 0$, then either this series is absolutely

convergent for all z ∈ C, or there is some R ∈ (0, ∞) such that the series is absolutely

convergent for |z| < R and divergent for |z| > R.

We call R the radius of convergence of (3.1). In case of convergence for all z, we say

the radius of convergence is infinite. If R > 0 and (3.1) converges for |z| < R, it defines a

function

(3.4) $f(z) = \sum_{k=0}^{\infty} a_k z^k, \quad z \in D_R,$

where

(3.5) $D_R = \{z \in \mathbb{C} : |z| < R\}.$

Proposition 3.2. If the series (3.4) converges in $D_R$, then it converges uniformly on $D_S$ for all S < R, and hence f is continuous on $D_R$, i.e., given $z_n, z \in D_R$,

(3.6) $z_n \to z \Longrightarrow f(z_n) \to f(z).$

Proof. For each $z \in D_R$, there exists S < R such that $z \in D_S$, so it suffices to show that f is continuous on $D_S$ whenever 0 < S < R. Pick T such that S < T < R. We know that there exists C < ∞ such that $|a_k T^k| \le C$ for all k. Hence

(3.7) $z \in D_S \Longrightarrow |a_k z^k| \le C \Bigl( \frac{S}{T} \Bigr)^k.$


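Since

(3.8) $\sum_{k=0}^{\infty} \Bigl( \frac{S}{T} \Bigr)^k < \infty,$

the Weierstrass M-test, Proposition 2.4, applies, to yield uniform convergence on $D_S$. Since

(3.9) $\forall\, k, \quad a_k z^k$ is continuous,

continuity of f on $D_S$ follows from Proposition 2.1.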

More generally, a power series has the form

(3.10) $f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n.$

It follows from Proposition 3.1 that to such a series there is associated a radius of convergence R ∈ [0, ∞], with the property that the series converges absolutely whenever $|z - z_0| < R$ (if R > 0), and diverges whenever $|z - z_0| > R$ (if R < ∞). We identify R as follows:

(3.11) $\frac{1}{R} = \limsup_{n\to\infty} |a_n|^{1/n}.$

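Proposition 3.3. The series (3.10) converges whenever $|z - z_0| < R$ and diverges whenever $|z - z_0| > R$, where R is given by (3.11). If R > 0, the series converges uniformly on $\{z : |z - z_0| \le R'\}$, for each R′ < R. Thus, when R > 0, the series (3.10) defines a continuous function

(3.12) $f : D_R(z_0) \longrightarrow \mathbb{C},$

where

(3.13) $D_R(z_0) = \{z \in \mathbb{C} : |z - z_0| < R\}.$

Proof. If R′ < R, then $1/R' > \limsup |a_n|^{1/n}$, so there exists N ∈ N such that

$n \ge N \Longrightarrow |a_n|^{1/n} < \frac{1}{R'} \Longrightarrow |a_n| (R')^n < 1.$

Thus

(3.14) $|z - z_0| < R' < R \Longrightarrow |a_n (z - z_0)^n| \le \Bigl| \frac{z - z_0}{R'} \Bigr|^n,$

for n ≥ N, so (3.10) is dominated by a convergent geometric series on $D_{R'}(z_0)$.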


For the converse, we argue as follows. Suppose R″ > R, so infinitely many $|a_n|^{1/n} \ge 1/R''$, hence infinitely many $|a_n| (R'')^n \ge 1$. Then

$|z - z_0| \ge R'' > R \Longrightarrow$ infinitely many $|a_n (z - z_0)^n| \ge \Bigl| \frac{z - z_0}{R''} \Bigr|^n \ge 1,$

forcing divergence for $|z - z_0| > R$.

The assertions about uniform convergence and continuity follow as in Proposition 3.2.
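Formula (3.11) can be explored numerically, with the caveat that a computer sees only finitely many coefficients, so the lim sup is replaced by a maximum over a finite tail. The Python sketch below is a heuristic illustration; the function name `root_test_estimate`, the window n = 50, ..., 100, and the sample coefficients (a_n = 2^n for even n, a_n = 0 for odd n) are assumptions of this example. The sample is chosen so that |a_n|^{1/n} alternates between 2 and 0 and has no limit, while its lim sup is 2, giving R = 1/2.

```python
def root_test_estimate(coeffs, tail_start):
    """Approximate limsup |a_n|^(1/n) by the max over tail_start <= n < len(coeffs)."""
    roots = [
        abs(a) ** (1.0 / n)
        for n, a in enumerate(coeffs)
        if n >= max(tail_start, 1)   # n = 0 is skipped: the 0-th root is undefined
    ]
    return max(roots)

# a_n = 2^n for even n, 0 for odd n: the sequence |a_n|^(1/n) does not converge.
coeffs = [2 ** n if n % 2 == 0 else 0 for n in range(101)]

limsup_estimate = root_test_estimate(coeffs, tail_start=50)
radius_estimate = 1.0 / limsup_estimate

print(limsup_estimate, radius_estimate)
```

For coefficients such as a_n = 1/n, the roots approach their lim sup slowly, so a finite window gives only a rough estimate; the example above was picked because the window already shows the exact value.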

It is useful to note that we can multiply power series with radius of convergence R > 0.

In fact, there is the following more general result on products of absolutely convergent

series.

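Proposition 3.4. Given absolutely convergent series

(3.15) $A = \sum_{n=0}^{\infty} \alpha_n, \quad B = \sum_{n=0}^{\infty} \beta_n,$

we have the absolutely convergent series

(3.16) $AB = \sum_{n=0}^{\infty} \gamma_n, \quad \gamma_n = \sum_{j=0}^{n} \alpha_j \beta_{n-j}.$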

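Proof. Take $A_k = \sum_{n=0}^{k} \alpha_n$, $B_k = \sum_{n=0}^{k} \beta_n$. Then

(3.17) $A_k B_k = \sum_{n=0}^{k} \gamma_n + R_k$

with

(3.18) $R_k = \sum_{(m,n) \in \sigma(k)} \alpha_m \beta_n, \quad \sigma(k) = \{(m, n) \in \mathbb{Z}^+ \times \mathbb{Z}^+ : m, n \le k,\ m + n > k\}.$

Hence

(3.19) $|R_k| \le \sum_{m \le k/2}\ \sum_{k/2 \le n \le k} |\alpha_m|\, |\beta_n| + \sum_{k/2 \le m \le k}\ \sum_{n \le k} |\alpha_m|\, |\beta_n| \le A \sum_{n \ge k/2} |\beta_n| + B \sum_{m \ge k/2} |\alpha_m|,$

where

(3.20) $A = \sum_{n=0}^{\infty} |\alpha_n| < \infty, \quad B = \sum_{n=0}^{\infty} |\beta_n| < \infty.$

It follows that $R_k \to 0$ as k → ∞. Thus the left side of (3.17) converges to AB and the right side to $\sum_{n=0}^{\infty} \gamma_n$. The absolute convergence of (3.16) follows by applying the same argument with $\alpha_n$ replaced by $|\alpha_n|$ and $\beta_n$ replaced by $|\beta_n|$.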
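The product formula (3.16) is easy to test on a series whose sum is known. The Python sketch below is an illustration only; the function name `cauchy_product` and the choice α_n = β_n = 2^{-n} (so that A = B = 2) are assumptions of this example. In this case γ_n = (n + 1) 2^{-n}, and the partial sums of the γ_n approach AB = 4. Exact fractions are used so that the identities can be checked without rounding.

```python
from fractions import Fraction

def cauchy_product(alpha, beta):
    """gamma_n = sum_{j=0}^{n} alpha_j * beta_{n-j}, for indices both lists cover."""
    n_terms = min(len(alpha), len(beta))
    return [
        sum(alpha[j] * beta[n - j] for j in range(n + 1))
        for n in range(n_terms)
    ]

N = 60
alpha = [Fraction(1, 2 ** n) for n in range(N + 1)]   # partial data for A = 2
gamma = cauchy_product(alpha, alpha)                   # gamma_n = (n + 1) / 2^n
partial = sum(gamma)                                   # partial sum of (3.16)

print(float(partial))
print(float(4 - partial))   # remaining gap to AB = 4
```

Only finitely many terms are available, so the code computes a partial sum of (3.16); the gap to 4 plays the role of the tail that the proof shows tends to 0.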


Corollary 3.5. Suppose the following power series converge for |z| < R:

(3.21) $f(z) = \sum_{n=0}^{\infty} a_n z^n, \quad g(z) = \sum_{n=0}^{\infty} b_n z^n.$

Then, for |z| < R,

(3.22) $f(z) g(z) = \sum_{n=0}^{\infty} c_n z^n, \quad c_n = \sum_{j=0}^{n} a_j b_{n-j}.$

The following result, which is related to Proposition 3.4, has a similar proof.

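Proposition 3.6. If $a_{jk} \in \mathbb{C}$ and $\sum_{j,k} |a_{jk}| < \infty$, then $\sum_j a_{jk}$ is absolutely convergent for each k, $\sum_k a_{jk}$ is absolutely convergent for each j, and

(3.23) $\sum_{j=0}^{\infty} \Bigl( \sum_{k=0}^{\infty} a_{jk} \Bigr) = \sum_{k=0}^{\infty} \Bigl( \sum_{j=0}^{\infty} a_{jk} \Bigr) = \sum_{j,k} a_{jk}.$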

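Proof. Clearly the hypothesis implies $\sum_j |a_{jk}| < \infty$ for each k and $\sum_k |a_{jk}| < \infty$ for each j. It also implies that there exists B < ∞ such that

$S_N = \sum_{j=0}^{N} \sum_{k=0}^{N} |a_{jk}| \le B, \quad \forall\, N.$

Since $(S_N)$ is increasing and bounded, it converges, and it follows that, for each ε > 0, there exists N ∈ N such that

$\sum_{(j,k) \in C(N)} |a_{jk}| < \varepsilon, \quad C(N) = \{(j, k) \in \widetilde{\mathbb{N}} \times \widetilde{\mathbb{N}} : j > N \text{ or } k > N\}.$

Now, whenever M, K ≥ N,

$\Bigl| \sum_{j=0}^{M} \Bigl( \sum_{k=0}^{K} a_{jk} \Bigr) - \sum_{j=0}^{N} \sum_{k=0}^{N} a_{jk} \Bigr| \le \sum_{(j,k) \in C(N)} |a_{jk}|,$

hence, letting K → ∞,

$\Bigl| \sum_{j=0}^{M} \Bigl( \sum_{k=0}^{\infty} a_{jk} \Bigr) - \sum_{j=0}^{N} \sum_{k=0}^{N} a_{jk} \Bigr| \le \sum_{(j,k) \in C(N)} |a_{jk}|.$

Therefore, letting M → ∞,

$\Bigl| \sum_{j=0}^{\infty} \Bigl( \sum_{k=0}^{\infty} a_{jk} \Bigr) - \sum_{j=0}^{N} \sum_{k=0}^{N} a_{jk} \Bigr| \le \sum_{(j,k) \in C(N)} |a_{jk}|.$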


We have a similar result with the roles of j and k reversed, and clearly the two finite sums

agree. It follows that

\Bigl| \sum_{j=0}^{\infty} \Bigl( \sum_{k=0}^{\infty} a_{jk} \Bigr) − \sum_{k=0}^{\infty} \Bigl( \sum_{j=0}^{\infty} a_{jk} \Bigr) \Bigr| < 2ε,  ∀ ε > 0,

yielding (3.23).
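To see why the hypothesis \sum_{j,k} |a_{jk}| < ∞ in Proposition 3.6 cannot simply be dropped, here is a small Python sketch built on a standard example that is not taken from the text: a_{jk} = 1 if j = k, −1 if j = k + 1, and 0 otherwise. Each row and each column has at most two nonzero entries, so the inner sums are computed exactly. The function names are hypothetical.

```python
def a(j, k):
    """Entries of a double array that is NOT absolutely summable."""
    if j == k:
        return 1
    if j == k + 1:
        return -1
    return 0

def row_sum(j):
    # For fixed j, a(j, k) vanishes unless k is j or j - 1, so this finite sum is exact.
    return sum(a(j, k) for k in range(j + 1))

def col_sum(k):
    # For fixed k, a(j, k) vanishes unless j is k or k + 1, so this finite sum is exact.
    return sum(a(j, k) for j in range(k + 2))

M = 50
rows_first = sum(row_sum(j) for j in range(M))   # sum over j of (sum over k)
cols_first = sum(col_sum(k) for k in range(M))   # sum over k of (sum over j)
print(rows_first, cols_first)
```

Here the two iterated sums are 1 and 0 for every M ≥ 1, so the two orders of summation disagree; this does not contradict (3.23), since \sum_{j,k} |a_{jk}| = ∞ for this array.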

Using Proposition 3.6, we demonstrate the following. (Thanks to Shrawan Kumar for

this argument.)

Proposition 3.7. If (3.10) has a radius of convergence R > 0, and z_1 ∈ D_R(z_0), then
f(z) has a convergent power series about z_1:

(3.24)  f(z) = \sum_{k=0}^{\infty} b_k (z − z_1)^k,  for |z − z_1| < R − |z_1 − z_0|.

Proof. There is no loss in generality in taking z_0 = 0, which we will do here, for notational
simplicity. Setting f_{z_1}(ζ) = f(z_1 + ζ), we have from (3.10)

(3.25)  f_{z_1}(ζ) = \sum_{n=0}^{\infty} a_n (ζ + z_1)^n
                   = \sum_{n=0}^{\infty} \sum_{k=0}^{n} a_n \binom{n}{k} ζ^k z_1^{n−k},

the second identity by the binomial formula. Now,

(3.26)  \sum_{n=0}^{\infty} \sum_{k=0}^{n} |a_n| \binom{n}{k} |ζ|^k |z_1|^{n−k} = \sum_{n=0}^{\infty} |a_n| (|ζ| + |z_1|)^n < ∞,

provided |ζ| + |z_1| < R, which is the hypothesis in (3.24) (with z_0 = 0). Hence Proposition
3.6 gives

(3.27)  f_{z_1}(ζ) = \sum_{k=0}^{\infty} \Bigl( \sum_{n=k}^{\infty} a_n \binom{n}{k} z_1^{n−k} \Bigr) ζ^k.

Hence (3.24) holds, with

(3.28)  b_k = \sum_{n=k}^{\infty} a_n \binom{n}{k} z_1^{n−k}.


In particular,

(3.29)  b_1 = \sum_{n=1}^{\infty} n a_n z_1^{n−1}.
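The formula (3.28) can be checked numerically in a case where the answer is known in closed form. The sketch below assumes the illustrative choice f(z) = 1/(1 − z), i.e., a_n = 1 and z_0 = 0, for which the expansion about z_1 has coefficients 1/(1 − z_1)^{k+1}. The function name, the truncation length N, and the value z_1 = 0.25 are assumptions made for this example.

```python
from math import comb

def recentered_coefficient(a, k, z1):
    """Truncated version of (3.28): sum over n >= k of a[n] * C(n, k) * z1**(n - k)."""
    return sum(a[n] * comb(n, k) * z1 ** (n - k) for n in range(k, len(a)))

N = 400
a = [1.0] * N                 # coefficients of 1/(1 - z) about z0 = 0, so R = 1
z1 = 0.25                     # new center, inside the unit disk
b = [recentered_coefficient(a, k, z1) for k in range(6)]
exact = [1 / (1 - z1) ** (k + 1) for k in range(6)]
max_error = max(abs(bk - ek) for bk, ek in zip(b, exact))
print(max_error)
```

The remaining error comes from truncating the series at N terms and from floating point rounding.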

Exercises

1. Let a_k ∈ C, and assume there exist K ∈ N and α < 1 such that

(3.30)  k ≥ K ⟹ \Bigl| \frac{a_{k+1}}{a_k} \Bigr| ≤ α.

Show that \sum_{k=0}^{\infty} a_k is absolutely convergent.

Note. This is the ratio test.

2. Determine the radius of convergence R for each of the following power series. If 0 <

R < ∞, try to determine when convergence holds at points on |z| = R.

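(3.31)  \sum_{n=0}^{\infty} z^n,  \sum_{n=1}^{\infty} \frac{z^n}{n},  \sum_{n=1}^{\infty} \frac{z^n}{n^2},

        \sum_{n=1}^{\infty} \frac{z^n}{n!},  \sum_{n=1}^{\infty} \frac{z^n}{2^n},  \sum_{n=1}^{\infty} \frac{z^{2^n}}{2^n},

        \sum_{n=1}^{\infty} n z^n,  \sum_{n=1}^{\infty} n^2 z^n,  \sum_{n=1}^{\infty} n! z^n.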

(3.32)  \frac{1}{1 − z} = \sum_{k=0}^{\infty} z^k,  |z| < 1.

(3.33)  \frac{1}{z − 2},  \frac{1}{z + 3}.


(3.34)  \frac{1}{z^2 + z − 6}.

6. As an alternative to the use of Corollary 3.5, write (3.34) as a linear combination of the

functions (3.33).

\frac{1}{1 + z^2}.

Hint. Replace z by −z 2 in (3.32).

\frac{1}{a^2 + z^2}.


4. Spaces of functions

If X and Y are metric spaces, the space C(X, Y ) of continuous maps f : X → Y has a

natural metric structure, under some additional hypotheses. We use

(4.1)  D(f, g) = \sup_{x ∈ X} d(f(x), g(x)).

This sup exists provided f (X) and g(X) are bounded subsets of Y, where to say B ⊂ Y is

bounded is to say d : B × B → [0, ∞) has bounded image. In particular, this supremum

exists if X is compact.

Proposition 4.1. If X is a compact metric space and Y is a complete metric space, then

C(X, Y ), with the metric (4.1), is complete.

Proof. That D(f, g) satisfies the conditions to define a metric on C(X, Y ) is straightfor-

ward. We check completeness. Suppose (fν ) is a Cauchy sequence in C(X, Y ), so, as

ν → ∞,

(4.2)  \sup_{k ≥ 0} \sup_{x ∈ X} d(f_{ν+k}(x), f_ν(x)) ≤ ε_ν → 0.

Then in particular (fν (x)) is a Cauchy sequence in Y for each x ∈ X, so it converges, say

to g(x) ∈ Y . It remains to show that g ∈ C(X, Y ) and that fν → g in the metric (4.1).

In fact, taking k → ∞ in the estimate above, we have

(4.3)  \sup_{x ∈ X} d(g(x), f_ν(x)) ≤ ε_ν → 0,

i.e., fν → g uniformly. It remains only to show that g is continuous. For this, let xj → x

in X and fix ε > 0. Pick N so that εN < ε. Since fN is continuous, there exists J such

that j ≥ J ⇒ d(fN (xj ), fN (x)) < ε. Hence

j ≥ J ⟹ d(g(x_j), g(x)) ≤ d(g(x_j), f_N(x_j)) + d(f_N(x_j), f_N(x)) + d(f_N(x), g(x)) < 3ε.

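In case Y = R, we write C(X, R) = C(X). The distance function (4.1) can then be
written

(4.4)  D(f, g) = ‖f − g‖_sup,  ‖f‖_sup = \sup_{x ∈ X} |f(x)|.

Generally, a norm on a vector space V is an assignment f ↦ ‖f‖ ∈ [0, ∞), satisfying

(4.5)  ‖f‖ = 0 ⟺ f = 0,  ‖af‖ = |a| ‖f‖,  ‖f + g‖ ≤ ‖f‖ + ‖g‖,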


given f, g ∈ V and a a scalar (in R or C). A vector space equipped with a norm is called a
normed vector space. It is then a metric space, with distance function D(f, g) = ‖f − g‖.
If the space is complete, one calls V a Banach space.

In particular, by Proposition 4.1, C(X) is a Banach space, when X is a compact metric

space.

The next result is a special case of Ascoli’s Theorem. To state it, we say a modulus of

continuity is a strictly monotonically increasing, continuous function ω : [0, ∞) → [0, ∞)

such that ω(0) = 0.

Proposition 4.2. Let X and Y be compact metric spaces, and fix a modulus of continuity

ω(δ). Then

(4.6)  C_ω = { f ∈ C(X, Y) : d(f(x), f(x')) ≤ ω(d(x, x')) ∀ x, x' ∈ X }

is a compact subset of C(X, Y).

Proof. Let (fν ) be a sequence in Cω . Let Σ be a countable dense subset of X, as in Corollary

3.2 of Chapter 2. For each x ∈ Σ, (fν (x)) is a sequence in Y, which hence has a convergent

subsequence. Using a diagonal construction similar to that in the proof of Proposition 3.10

of Chapter 2, we obtain a subsequence (ϕν ) of (fν ) with the property that ϕν (x) converges

in Y, for each x ∈ Σ, say

(4.7) ϕν (x) → ψ(x),

for all x ∈ Σ, where ψ : Σ → Y.

So far, we have not used (4.6). This hypothesis will now be used to show that ϕν

converges uniformly on X. Pick ε > 0. Then pick δ > 0 such that ω(δ) < ε/3. Since X is

compact, we can cover X by finitely many balls Bδ (xj ), 1 ≤ j ≤ N, xj ∈ Σ. Pick M so

large that ϕ_ν(x_j) is within ε/6 of its limit for all ν ≥ M (when 1 ≤ j ≤ N), so that
d(ϕ_{ν+k}(x_j), ϕ_ν(x_j)) ≤ ε/3 for such ν and all k ≥ 0. Now, for any
x ∈ X, picking ℓ ∈ {1, . . . , N} such that d(x, x_ℓ) ≤ δ, we have, for k ≥ 0, ν ≥ M,

(4.8)  d(ϕ_{ν+k}(x), ϕ_ν(x)) ≤ d(ϕ_{ν+k}(x), ϕ_{ν+k}(x_ℓ)) + d(ϕ_{ν+k}(x_ℓ), ϕ_ν(x_ℓ))
                              + d(ϕ_ν(x_ℓ), ϕ_ν(x))
                            ≤ ε/3 + ε/3 + ε/3.

Thus (ϕν (x)) is Cauchy in Y for all x ∈ X, hence convergent. Call the limit ψ(x), so we

now have (4.7) for all x ∈ X. Letting k → ∞ in (4.8) we have uniform convergence of ϕν

to ψ. Finally, passing to the limit ν → ∞ in

(4.9)  d(ϕ_ν(x), ϕ_ν(x')) ≤ ω(d(x, x'))

gives ψ ∈ Cω .

We want to re-state Proposition 4.2, bringing in the notion of equicontinuity. Given

metric spaces X and Y , and a set of maps F ⊂ C(X, Y ), we say F is equicontinuous at a

point x0 ∈ X provided

(4.10)  ∀ ε > 0, ∃ δ > 0 such that ∀ x ∈ X, f ∈ F,
        d_X(x, x_0) < δ ⟹ d_Y(f(x), f(x_0)) < ε.


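We say F is equicontinuous on X if it is equicontinuous at each point of X. We say F is
uniformly equicontinuous on X provided

(4.11)  ∀ ε > 0, ∃ δ > 0 such that ∀ x, x' ∈ X, f ∈ F,
        d_X(x, x') < δ ⟹ d_Y(f(x), f(x')) < ε.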

Note that (4.11) is equivalent to the existence of a modulus of continuity ω such that

F ⊂ Cω , given by (4.6). It is useful to record the following result.

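Proposition 4.3. Let X and Y be metric spaces, F ⊂ C(X, Y). Assume X is compact.
Then

(4.12)  F equicontinuous ⟹ F is uniformly equicontinuous.

Proof. The argument is a variant of the proof of Proposition 1.4. In more detail, suppose
there exist x_ν, x'_ν ∈ X, ε > 0, and f_ν ∈ F such that d(x_ν, x'_ν) ≤ 2^{−ν} but

(4.13)  d(f_ν(x_ν), f_ν(x'_ν)) ≥ ε.

Since X is compact, there is a convergent subsequence x_{ν_j} → p ∈ X, and then also
x'_{ν_j} → p. The equicontinuity
of F at p implies that there exists N < ∞ such that

(4.14)  d(g(x_{ν_j}), g(p)) < ε/2,  ∀ j ≥ N, g ∈ F,

and likewise with x'_{ν_j} in place of x_{ν_j},
contradicting (4.13).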

Putting together Propositions 4.2 and 4.3 then gives the following.

Proposition 4.4. Let X and Y be compact metric spaces. If F ⊂ C(X, Y ) is equicontin-

uous on X, then it has compact closure in C(X, Y ).

Exercises

1. Let X and Y be compact metric spaces. Show that if F ⊂ C(X, Y ) is compact, then F

is equicontinuous. (This is a converse to Proposition 4.4.)

2. Let X be a compact metric space, and r ∈ (0, 1]. Define Lip^r(X, R^n) to consist of
continuous functions f : X → R^n such that, for some L < ∞ (depending on f),

|f(x) − f(y)| ≤ L d(x, y)^r,  ∀ x, y ∈ X.

Define a norm

‖f‖_r = \sup_{x ∈ X} |f(x)| + \sup_{x,y ∈ X, x ≠ y} \frac{|f(x) − f(y)|}{d(x, y)^r}.


Show that Lip^r(X, R^n) is a complete metric space, with distance function D_r(f, g) =
‖f − g‖_r.

3. In the setting of Exercise 2, show that if 0 < r < s ≤ 1 and f ∈ Lip^s(X, R^n), then

‖f‖_r ≤ C ‖f‖_sup^{1−θ} ‖f‖_s^{θ},  θ = r/s ∈ (0, 1).

{f ∈ Lip^s(X, R^n) : ‖f‖_s ≤ 1}


Chapter IV

Calculus

Introduction

Here we proceed into the heart of analysis, with a rigorous development of calculus, for functions of one real
variable.

Section 1 introduces the derivative, establishes basic identities like the product rule and

the chain rule, and also obtains some important theoretical results, such as the Mean Value

Theorem and the Inverse Function Theorem. One application of the latter is the study of

x^{1/n}, for x > 0, which leads more generally to x^r, for x > 0 and r ∈ Q.

Section 2 brings in the integral, more precisely the Riemann integral. A major result is

the Fundamental Theorem of Calculus, whose proof makes essential use of the Mean Value

Theorem. Another topic is the change of variable formula for integrals (treated in some

exercises).

In §3 we treat power series, continuing the development from §3 of Chapter 3. Here

we treat such topics as term by term differentiation of power series, and formulas for the

remainder when a power series is truncated. An application of such remainder formulas is

made to the study of convergence of the power series about x = 0 of (1 − x)^b.

Section 4 studies curves in Euclidean space Rn , with particular attention to arc length.

We derive an integral formula for arc length. We show that a smooth curve can be

reparametrized by arc length, as an application of the Inverse Function Theorem. We
then take a look at the unit circle S^1 in R^2. Using the parametrization of part of S^1 as
(t, \sqrt{1 − t^2}), we obtain a power series for arc lengths, as an application of material of §3
on power series of (1 − x)^b, with b = −1/2, and x replaced by t^2. We also bring in the
trigonometric functions, having the property that (cos t, sin t) provides a parametrization
of S^1 by arc length.

Section 5 goes much further into the study of the trigonometric functions. Actually,

it begins with a treatment of the exponential function e^t, observes that such treatment
extends readily to e^{at}, given a ∈ C, and then establishes that e^{it} provides a unit speed
parametrization of S^1. This directly gives Euler's formula

e^{it} = cos t + i sin t,

and provides for a unified treatment of the exponential and trigonometric functions. We
also bring in log as the inverse function to the exponential, and we use the formula x^r =
e^{r log x} to generalize results of §1 on x^r from r ∈ Q to r ∈ R, and further, to r ∈ C.

In §6 we give a natural extension of the Riemann integral from the class of bounded (Rie-

mann integrable) functions to a class of unbounded “integrable” functions. The treatment

here is perhaps a desirable alternative to discussions one sees of “improper integrals.”


This chapter concludes with some appendices. Appendix A gives a proof of the Fun-

damental Theorem of Algebra, that every nonconstant polynomial has a complex root.

Appendix B presents a proof that π is irrational. Appendix C refines material on the

power series of (1 − x)^b, in case b > 0. This will prove useful in Chapter 5. Appendix D
discusses a method of calculating π that goes back to Archimedes. Appendix E discusses

calculations of π using arctangents. Appendix F treats the power series for tan x, whose

coefficients require a more elaborate derivation than those for sin x and cos x. Appendix

G discusses a theorem of Abel, giving the optimal condition under which a power series in

t with radius of convergence 1 can be shown to converge uniformly in t ∈ [0, 1], as well as

related issues regarding convergence of infinite series.


1. The derivative

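Consider a function f, defined on an interval (a, b) ⊂ R, taking values in R or C. Given
x ∈ (a, b), we say f is differentiable at x, with derivative f'(x), provided

(1.1)  \lim_{h→0} \frac{f(x + h) − f(x)}{h} = f'(x).

We also use the notation

(1.2)  \frac{df}{dx}(x) = f'(x).

A characterization equivalent to (1.1) is

(1.3)  f(x + h) = f(x) + f'(x)h + r(x, h),  r(x, h) = o(h),

where

(1.4)  r(x, h) = o(h) means \frac{r(x, h)}{h} → 0 as h → 0.

Clearly if f is differentiable at x, it is continuous at x. We say f is differentiable on
(a, b) provided it is differentiable at each point of (a, b). If also g is defined on (a, b) and
differentiable at x, we have

(1.5)  \frac{d}{dx}(f + g)(x) = f'(x) + g'(x).

We also have the following product rule:

(1.6)  \frac{d}{dx}(fg)(x) = f'(x)g(x) + f(x)g'(x).

To prove (1.6), note that

\frac{f(x + h)g(x + h) − f(x)g(x)}{h} = \frac{f(x + h) − f(x)}{h} g(x) + f(x + h) \frac{g(x + h) − g(x)}{h}.

We can use the product rule to show inductively that

(1.7)  \frac{d}{dx} x^n = n x^{n−1},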


for all n ∈ N. In fact, this is immediate from (1.1) if n = 1. Given that it holds for n = k,

we have

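\frac{d}{dx} x^{k+1} = \frac{d}{dx}(x x^k) = \frac{dx}{dx} x^k + x \frac{d}{dx} x^k
                     = x^k + k x^k
                     = (k + 1) x^k,

completing the induction. We also have

\frac{1}{h} \Bigl( \frac{1}{x + h} − \frac{1}{x} \Bigr) = −\frac{1}{x(x + h)} → −\frac{1}{x^2},  as h → 0,

for x ≠ 0, hence

(1.8)  \frac{d}{dx} \frac{1}{x} = −\frac{1}{x^2},  if x ≠ 0.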

From here, we can extend (1.7) from n ∈ N to all n ∈ Z (requiring x ≠ 0 if n < 0).

A similar inductive argument yields

(1.9)  \frac{d}{dx} f(x)^n = n f(x)^{n−1} f'(x),

for n ∈ N, and more generally for n ∈ Z (requiring f(x) ≠ 0 if n < 0).

Going further, we have the following chain rule. Suppose f : (a, b) → (α, β) is differentiable
at x and g : (α, β) → R (or C) is differentiable at y = f(x). Form G = g ◦ f, i.e.,
G(x) = g(f(x)). We claim

(1.10)  G'(x) = g'(f(x)) f'(x).

To see this, write

(1.11)  G(x + h) = g(f(x + h))
                 = g(f(x) + f'(x)h + r_f(x, h))
                 = g(f(x)) + g'(f(x))(f'(x)h + r_f(x, h)) + r_g(f(x), f'(x)h + r_f(x, h)).

Here,

\frac{r_f(x, h)}{h} → 0 as h → 0,

and also

\frac{r_g(f(x), f'(x)h + r_f(x, h))}{h} → 0,  as h → 0,

so the analogue of (1.3) applies.

The derivative has the following important connection to maxima and minima.


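Proposition 1.1. Let f : (a, b) → R. Suppose x ∈ (a, b) and

(1.12)  f(x) ≥ f(y),  ∀ y ∈ (a, b).

If f is differentiable at x, then f'(x) = 0. The same conclusion holds if f(x) ≤ f(y) for
all y ∈ (a, b).

Proof. Given (1.12), we have

(1.13)  \frac{f(x + h) − f(x)}{h} ≤ 0,  ∀ h ∈ (0, b − x),

and

(1.14)  \frac{f(x + h) − f(x)}{h} ≥ 0,  ∀ h ∈ (a − x, 0).

If f is differentiable at x, both quotients converge to f'(x) as h → 0, so we
simultaneously have f'(x) ≤ 0 and f'(x) ≥ 0.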

We next establish a key result known as the Mean Value Theorem.

Theorem 1.2. Let f : [a, b] → R. Assume f is continuous on [a, b] and differentiable on

(a, b). Then there exists ξ ∈ (a, b) such that

(1.15)  f'(ξ) = \frac{f(b) − f(a)}{b − a}.

Proof. Let g(x) = f(x) − κ(x − a), where κ denotes the right side of (1.15). Then g(a) =
g(b). The result (1.15) is equivalent to the assertion that

(1.16)  g'(ξ) = 0

for some ξ ∈ (a, b). Now g is continuous on the compact set [a, b], so it assumes both a

maximum and a minimum on this set. If g has a maximum at a point ξ ∈ (a, b), then

(1.16) follows from Proposition 1.1. If not, the maximum must be g(a) = g(b), and then g

must assume a minimum at some point ξ ∈ (a, b). Again Proposition 1.1 implies (1.16).
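For a concrete instance of Theorem 1.2, the following Python sketch locates a point ξ with f'(ξ) = (f(b) − f(a))/(b − a) for the illustrative choice f(x) = x^3 on [0, 2]. It uses bisection on f' − κ, which relies on f' being continuous here; that is an extra assumption of this example, not of the theorem, and it is a different device from the maximum/minimum argument in the proof. All names are hypothetical.

```python
def bisect(func, lo, hi, tol=1e-12):
    """Find a zero of func in [lo, hi], assuming func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid      # zero lies in the right half
        else:
            hi = mid                   # zero lies in the left half
    return 0.5 * (lo + hi)

f = lambda x: x**3
f_prime = lambda x: 3 * x**2
a, b = 0.0, 2.0
kappa = (f(b) - f(a)) / (b - a)        # right side of (1.15)
xi = bisect(lambda x: f_prime(x) - kappa, a, b)
print(xi, f_prime(xi), kappa)
```

For this f one can also solve by hand: 3ξ^2 = 4 gives ξ = 2/\sqrt{3}, which lies in (0, 2).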

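We use the Mean Value Theorem to produce a criterion for constructing the inverse of
a function. Let

(1.17)  f : [a, b] → R,  f(a) = α,  f(b) = β.

Assume f is continuous on [a, b], differentiable on (a, b), and

(1.18)  0 < γ_0 ≤ f'(x) ≤ γ_1 < ∞,  ∀ x ∈ (a, b).

Then (1.15) implies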


(1.19) γ0 (b − a) ≤ β − α ≤ γ1 (b − a).

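We can also apply Theorem 1.2 to f, restricted to an interval [x_1, x_2] ⊂ [a, b], to get

(1.20)  γ_0 (x_2 − x_1) ≤ f(x_2) − f(x_1) ≤ γ_1 (x_2 − x_1),  if a ≤ x_1 < x_2 ≤ b.

It follows that

(1.21)  f : [a, b] → [α, β] is one-to-one.

The intermediate value theorem implies f : [a, b] → [α, β] is onto. Consequently f has an
inverse

(1.22)  g : [α, β] → [a, b],  g(f(x)) = x,  f(g(y)) = y,

and (1.20) implies

(1.23)  γ_1^{−1} (y_2 − y_1) ≤ g(y_2) − g(y_1) ≤ γ_0^{−1} (y_2 − y_1),  if α ≤ y_1 < y_2 ≤ β.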

Theorem 1.3. If f is continuous on [a, b] and differentiable on (a, b), and (1.17)–(1.18)

hold, then its inverse g : [α, β] → [a, b] is differentiable on (α, β), and

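(1.24)  g'(y) = \frac{1}{f'(x)},  for y = f(x) ∈ (α, β).

Proof. Fix y ∈ (α, β), and let x = g(y), so y = f(x). From (1.22) we have, for h small
enough,

x + h = g(f(x + h)) = g(f(x) + f'(x)h + r(x, h)),

i.e.,

g(y + f'(x)h + r(x, h)) = g(y) + h,  r(x, h) = o(h).

Now the lower bound f' ≥ γ_0 in (1.18), together with the Mean Value Theorem, implies

(1.27)  |g(y_1 + r(x, h)) − g(y_1)| ≤ \frac{1}{γ_0} |r(x, h)|,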


provided y_1, y_1 + r(x, h) ∈ [α, β], so, with \tilde{h} = f'(x)h, and y_1 = y + \tilde{h}, we have

(1.28)  g(y + \tilde{h}) = g(y) + \frac{\tilde{h}}{f'(x)} + o(\tilde{h}),

yielding (1.24), by the analogue of (1.3).

Remark. If one knew that g were differentiable, as well as f , then the identity (1.24) would

follow by differentiating g(f (x)) = x, applying the chain rule. However, an additional

argument, such as given above, is necessary to guarantee that g is differentiable.
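The identity (1.24) can be checked numerically. The sketch below assumes the illustrative function f(x) = x^3 + x on [0, 2], whose derivative lies between 1 and 13, computes the inverse g by bisection, and compares a symmetric difference quotient of g with 1/f'(x). The step size h, the tolerance, and the sample point y = 3 are arbitrary choices for this example.

```python
def f(x):
    return x**3 + x

def f_prime(x):
    return 3 * x**2 + 1

def g(y, lo=0.0, hi=2.0, tol=1e-13):
    """Inverse of f on [lo, hi], computed by bisection (f is strictly increasing)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

y = 3.0                       # a point of (alpha, beta) = (f(0), f(2)) = (0, 10)
x = g(y)                      # so that y = f(x)
h = 1e-4
difference_quotient = (g(y + h) - g(y - h)) / (2 * h)
predicted = 1 / f_prime(x)    # right side of (1.24)
print(difference_quotient, predicted)
```

The two printed numbers should agree to several digits; the difference quotient is only an approximation to g'(y), so exact equality is not expected.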

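We apply Theorem 1.3 to the functions

(1.29)  p_n(x) = x^n,  n ∈ N.

By (1.7), p_n'(x) > 0 for x > 0, so (1.18) holds when 0 < a < b < ∞. We can take a ↘ 0
and b ↗ ∞ and see that

(1.30)  p_n : (0, ∞) → (0, ∞) is invertible, with differentiable inverse q_n : (0, ∞) → (0, ∞).

We use the notation

(1.31)  x^{1/n} = q_n(x),  x > 0,

so, given n ∈ N,

(1.32)  x > 0 ⟹ (x^{1/n})^n = x.

Note. We recall that x^{1/n} was constructed, for x > 0, in Chapter 1, §7, and its continuity
discussed in Chapter 3, §1.

Given m ∈ Z, we can set

(1.33)  x^{m/n} = (x^{1/n})^m,  x > 0,

and verify that (x^{1/kn})^{km} = (x^{1/n})^m. Thus we have x^r defined for all r ∈ Q, when x > 0.
We have

(1.34)  x^{r+s} = x^r x^s,  for x > 0, r, s ∈ Q.

See Exercises 3–5 in §7 of Chapter 1. Applying (1.24) to f(x) = x^n, g(y) = y^{1/n}, we have

(1.35)  \frac{d}{dy} y^{1/n} = \frac{1}{n x^{n−1}},  y = x^n,  x > 0.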


That is, we have

(1.36)  \frac{d}{dy} y^r = r y^{r−1},  y > 0,

when r = 1/n. Putting this together with (1.9) (with m in place of n), we get (1.36) for
all r = m/n ∈ Q.

The definition of x^r for x > 0 and the identity (1.36) can be extended to all r ∈ R, with

some more work. We will find a neat way to do this in §5.

We recall another common notation, namely

(1.37)  \sqrt{x} = x^{1/2},  x > 0.

Hence, by (1.36),

(1.38)  \frac{d}{dx} \sqrt{x} = \frac{1}{2\sqrt{x}}.

In regard to this, note that, if we consider

(1.39)  \frac{\sqrt{x + h} − \sqrt{x}}{h},

we can multiply numerator and denominator by \sqrt{x + h} + \sqrt{x}, to get

(1.40)  \frac{1}{\sqrt{x + h} + \sqrt{x}},

whose convergence to the right side of (1.38) for x > 0 is equivalent to the statement that

(1.41)  \lim_{h→0} \sqrt{x + h} = \sqrt{x},

i.e., to the continuity of x ↦ \sqrt{x} on (0, ∞). Such continuity is a consequence of the fact
that, for 0 < a < b < ∞, n = 2,

(1.42)  p_n : [a, b] → [a^n, b^n]

is continuous, one-to-one, and onto, so, by the compactness of [a, b], its inverse is continu-

ous. Thus we have an alternative derivation of (1.38).

If I ⊂ R is an interval and f : I → R (or C), we say f ∈ C^1(I) if f is differentiable on I
and f' is continuous on I. If f' is in turn differentiable, we have the second derivative of
f:

(1.43)  \frac{d^2 f}{dx^2}(x) = f''(x) = \frac{d}{dx} f'(x).

dx dx


Inductively, we can
define higher order derivatives of f, f^{(k)}, also denoted d^k f/dx^k. Here, f^{(1)} = f', f^{(2)} = f'',
and if f^{(k)} is differentiable,

(1.44)  f^{(k+1)}(x) = \frac{d}{dx} f^{(k)}(x).

dx

Sometimes we will run into functions of more than one variable, and will want to dif-

ferentiate with respect to each one of them. For example, if f (x, y) is defined for (x, y) in

an open set in R2 , we set

(1.45)  \frac{∂f}{∂x}(x, y) = \lim_{h→0} \frac{f(x + h, y) − f(x, y)}{h},

        \frac{∂f}{∂y}(x, y) = \lim_{h→0} \frac{f(x, y + h) − f(x, y)}{h}.

We will not need any more than the definition here. A serious study of the derivative of

a function of several variables is given in the companion [T2] to this volume, Introduction

to Analysis in Several Variables.

We end this section with some results on the significance of the second derivative.

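Proposition 1.4. Assume f is differentiable on (a, b), x_0 ∈ (a, b), and f'(x_0) = 0. Assume
f' is differentiable at x_0 and f''(x_0) > 0. Then there exists δ > 0 such that

(1.46)  f(x_0) < f(x) for all x ∈ (x_0 − δ, x_0 + δ) \ {x_0}.

We say f has a local minimum at x_0.

Proof. Since

(1.47)  f''(x_0) = \lim_{h→0} \frac{f'(x_0 + h) − f'(x_0)}{h},

the assertion that f''(x_0) > 0 implies that there exists δ > 0 such that the difference quotient
on the right side of (1.47) is > 0 for all nonzero h ∈ [−δ, δ]. Hence, since f'(x_0) = 0,

(1.48)  −δ ≤ h < 0 ⟹ f'(x_0 + h) < 0,
        0 < h ≤ δ ⟹ f'(x_0 + h) > 0.

This plus the Mean Value Theorem imply (1.46).

Remark. Similarly,

(1.49)  f''(x_0) < 0 ⟹ f has a local maximum at x_0.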


These two facts constitute the second derivative test for local maxima and local minima.

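Let us now assume that f and f' are differentiable on (a, b), so f'' is defined at each
point of (a, b). Let us further assume

(1.50)  f''(x) > 0,  ∀ x ∈ (a, b).

Proposition 1.5. If (1.50) holds and a < x_0 < x_1 < b, then

(1.52)  f(s x_0 + (1 − s) x_1) < s f(x_0) + (1 − s) f(x_1),  ∀ s ∈ (0, 1).

Proof. For s ∈ [0, 1], set

(1.53)  g(s) = s f(x_0) + (1 − s) f(x_1) − f(s x_0 + (1 − s) x_1).

The result (1.52) is equivalent to

(1.54)  g(s) > 0 for 0 < s < 1.

Note that

(1.55)  g(0) = g(1) = 0.

If (1.54) fails, g must assume a minimum at some point s_0 ∈ (0, 1). At such a point,
g'(s_0) = 0. A computation gives g'(s) = f(x_0) − f(x_1) − (x_0 − x_1) f'(s x_0 + (1 − s) x_1), and
hence

(1.56)  g''(s) = −(x_0 − x_1)^2 f''(s x_0 + (1 − s) x_1).

Thus (1.50) ⇒ g''(s_0) < 0. Then (1.49) ⇒ g has a local maximum at s_0. This contradiction
establishes (1.54), hence (1.52).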

Remark. The result (1.52) implies that the graph of y = f (x) over [x0 , x1 ] lies below the

chord, i.e., the line segment from (x0 , f (x0 )) to (x1 , f (x1 )) in R2 . We say f is convex.

Exercises


Compute the derivative of each of the following functions. Specify where each of these
derivatives is defined.

(1)  \sqrt{1 + x^2},

(2)  (x^2 + x^3)^{−4},

(3)  \frac{\sqrt{1 + x^2}}{(x^2 + x^3)^4}.

Show that

5. Apply Exercise 4 to

(1.59)  f(x) = \frac{x}{1 + x}.

Relate the conclusion to Exercises 1–2 in §3 of Chapter 2. Give a direct proof that (1.58)

holds for f in (1.59), without using calculus.

differentiable at x if and only if each component fj is, and

\frac{d}{dx} f(x) · g(x) = f'(x) · g(x) + f(x) · g'(x).


8. The following is called the generalized mean value theorem. Let f and g be continuous

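on [a, b] and differentiable on (a, b). Then there exists ξ ∈ (a, b) such that

(1.60)  [f(b) − f(a)] g'(ξ) = [g(b) − g(a)] f'(ξ).

Show that this follows from the mean value theorem, applied to

(1.61)  h(x) = [f(b) − f(a)] g(x) − [g(b) − g(a)] f(x).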

9. Take f : [a, b] → [α, β] and g : [α, β] → [a, b] as in the setting of the Inverse Function

Theorem, Theorem 1.3. Write (1.24) as

(1.62)  g'(y) = \frac{1}{f'(g(y))},  y ∈ (α, β).

Show that

f ∈ C^1((a, b)) ⟹ g ∈ C^1((α, β)),

i.e., the right side of (1.62) is continuous on (α, β). Show inductively that, for k ∈ N,

f ∈ C^k((a, b)) ⟹ g ∈ C^k((α, β)).

Example. Show that if f ∈ C^2((a, b)), then (having shown that g ∈ C^1) the right side of
(1.62) is C^1 and hence

g''(y) = −\frac{1}{f'(g(y))^2} f''(g(y)) g'(y).

continuous.) Assume a, b ∈ I, a < b, and

Hint. Reduce to the case u = 0, so f'(a) < 0 < f'(b). Show that then f|_{[a,b]} has a
minimum at a point ξ ∈ (a, b).


2. The integral

In this section, we introduce the Riemann version of the integral, and relate it to the

derivative. We will define the Riemann integral of a bounded function over an interval

I = [a, b] on the real line. For now, we assume f is real valued. To start, we partition

I into smaller intervals. A partition P of I is a finite collection of subintervals {Jk : 0 ≤

k ≤ N }, disjoint except for their endpoints, whose union is I. We can order the Jk so that

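J_k = [x_k, x_{k+1}], where

(2.1)  a = x_0 < x_1 < ··· < x_N < x_{N+1} = b.

We call the points x_k the endpoints of P. We set

(2.2)  ℓ(J_k) = x_{k+1} − x_k,  maxsize(P) = \max_{0 ≤ k ≤ N} ℓ(J_k).

We then set

(2.3)  \overline{I}_P(f) = \sum_k \sup_{J_k} f(x) ℓ(J_k),

       \underline{I}_P(f) = \sum_k \inf_{J_k} f(x) ℓ(J_k).

Here,

\sup_{J_k} f(x) = \sup f(J_k),  \inf_{J_k} f(x) = \inf f(J_k),

and we recall that if S ⊂ R is bounded, sup S and inf S were defined in §6 of Chapter 1;
cf. (6.32) and (6.45). We call \overline{I}_P(f) and \underline{I}_P(f) respectively the upper sum and lower sum
of f, associated to the partition P. Note that \underline{I}_P(f) ≤ \overline{I}_P(f). These quantities should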

approximate the Riemann integral of f, if the partition P is sufficiently “fine.”

To be more precise, if P and Q are two partitions of I, we say P refines Q, and write
P ≻ Q, if P is formed by partitioning each interval in Q. Equivalently, P ≻ Q if and only
if all the endpoints of Q are also endpoints of P. It is easy to see that any two partitions
have a common refinement; just take the union of their endpoints, to form a new partition.
Note also that refining a partition lowers the upper sum of f and raises its lower sum:

(2.4)  P ≻ Q ⟹ \overline{I}_P(f) ≤ \overline{I}_Q(f), and \underline{I}_P(f) ≥ \underline{I}_Q(f).

Consequently, if P_1 and P_2 are any two partitions and Q is a common refinement, we have

(2.5)  \underline{I}_{P_1}(f) ≤ \underline{I}_Q(f) ≤ \overline{I}_Q(f) ≤ \overline{I}_{P_2}(f).


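Now, whenever f : I → R is bounded, the following quantities are well defined:

(2.6)  \overline{I}(f) = \inf_{P ∈ Π(I)} \overline{I}_P(f),  \underline{I}(f) = \sup_{P ∈ Π(I)} \underline{I}_P(f),

where Π(I) is the set of all partitions of I. We call \underline{I}(f) the lower integral of f and \overline{I}(f) its
upper integral. Clearly, by (2.5), \underline{I}(f) ≤ \overline{I}(f). We then say that f is Riemann integrable
provided \overline{I}(f) = \underline{I}(f), and in such a case, we set

(2.7)  \int_a^b f(x) dx = \int_I f(x) dx = \overline{I}(f) = \underline{I}(f).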

We will denote the set of Riemann integrable functions on I by R(I).
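To make the upper and lower sums concrete, here is a short Python sketch for the illustrative choice f(x) = x^2 on I = [0, 1], using partitions into n equal subintervals. Because this f is monotone increasing on I, the sup and inf over each subinterval are attained at the right and left endpoints; the helper below relies on that assumption and is not a general-purpose routine.

```python
def upper_lower_sums(f, a, b, n):
    """Upper and lower sums of a monotone increasing f over n equal subintervals of [a, b]."""
    length = (b - a) / n
    upper = lower = 0.0
    for k in range(n):
        left = a + k * length
        right = a + (k + 1) * length
        lower += f(left) * length     # inf of f over J_k is at the left endpoint
        upper += f(right) * length    # sup of f over J_k is at the right endpoint
    return upper, lower

f = lambda x: x * x
for n in (10, 100, 1000):
    upper, lower = upper_lower_sums(f, 0.0, 1.0, n)
    print(n, lower, upper, upper - lower)
```

For this example the gap between upper and lower sum is (f(1) − f(0))/n = 1/n, so the upper and lower integrals coincide, and their common value is 1/3.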

We derive some basic properties of the Riemann integral.

Proposition 2.1. If f, g ∈ R(I), then f + g ∈ R(I), and

(2.8)  \int_I (f + g) dx = \int_I f dx + \int_I g dx.

Proof. If J_k is any subinterval of I, then

\sup_{J_k} (f + g) ≤ \sup_{J_k} f + \sup_{J_k} g,  and  \inf_{J_k} (f + g) ≥ \inf_{J_k} f + \inf_{J_k} g,

so, for any partition P, we have \overline{I}_P(f + g) ≤ \overline{I}_P(f) + \overline{I}_P(g). Also, using common refinements,
we can simultaneously approximate \overline{I}(f) and \overline{I}(g) by \overline{I}_P(f) and \overline{I}_P(g), and ditto
for \overline{I}(f + g). Thus the characterization (2.6) implies \overline{I}(f + g) ≤ \overline{I}(f) + \overline{I}(g). A parallel
argument implies \underline{I}(f + g) ≥ \underline{I}(f) + \underline{I}(g), and the proposition follows.

Next, there is a fair supply of Riemann integrable functions.

Proposition 2.2. If f is continuous on I, then f is Riemann integrable.

Proof. Any continuous function on a compact interval is bounded and uniformly continuous

(see Propositions 1.1 and 1.3 of Chapter 3). Let ω(δ) be a modulus of continuity for f, so

(2.9) |x − y| ≤ δ =⇒ |f (x) − f (y)| ≤ ω(δ), ω(δ) → 0 as δ → 0.

Then

(2.10)  maxsize(P) ≤ δ ⟹ \overline{I}_P(f) − \underline{I}_P(f) ≤ ω(δ) · ℓ(I),

which yields the proposition.

We denote the set of continuous functions on I by C(I). Thus Proposition 2.2 says

C(I) ⊂ R(I).

The proof of Proposition 2.2 provides a criterion on a partition guaranteeing that \overline{I}_P(f)
and \underline{I}_P(f) are close to \int_I f dx when f is continuous. We produce an extension, giving a
condition under which \overline{I}_P(f) and \overline{I}(f) are close, and \underline{I}_P(f) and \underline{I}(f) are close, given f
bounded on I. Given a partition P_0 of I, set

(2.11)  minsize(P_0) = min{ℓ(J_k) : J_k ∈ P_0}.


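Lemma 2.3. Let P and Q be two partitions of I. Assume

(2.12)  maxsize(P) ≤ \frac{1}{k} minsize(Q).

Let |f| ≤ M on I. Then

(2.13)  \overline{I}_P(f) ≤ \overline{I}_Q(f) + \frac{2M}{k} ℓ(I),

        \underline{I}_P(f) ≥ \underline{I}_Q(f) − \frac{2M}{k} ℓ(I).

Proof. Let P_1 denote the minimal common refinement of P and Q. Consider on the one
hand those intervals in P that are contained in intervals in Q and on the other hand those
intervals in P that are not contained in intervals in Q. Each interval of the first type is also
an interval in P_1. Each interval of the second type gets partitioned, to yield two intervals
in P_1. Denote by P_1^b the collection of such divided intervals. By (2.12), the lengths of the
intervals in P_1^b sum to ≤ ℓ(I)/k. It follows that

|\overline{I}_P(f) − \overline{I}_{P_1}(f)| ≤ \sum_{J ∈ P_1^b} 2M ℓ(J) ≤ 2M \frac{ℓ(I)}{k},

and similarly for the lower sums. Therefore

\overline{I}_P(f) ≤ \overline{I}_{P_1}(f) + \frac{2M}{k} ℓ(I),  \underline{I}_P(f) ≥ \underline{I}_{P_1}(f) − \frac{2M}{k} ℓ(I).

Since P_1 refines Q, we also have \overline{I}_{P_1}(f) ≤ \overline{I}_Q(f) and \underline{I}_{P_1}(f) ≥ \underline{I}_Q(f), by (2.4), and (2.13) follows.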

The following consequence is sometimes called Darboux’s Theorem.

Theorem 2.4. Let Pν be a sequence of partitions of I into ν intervals Jνk , 1 ≤ k ≤ ν,

such that

maxsize(Pν ) −→ 0.

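If f : I → R is bounded, then

(2.14)  \overline{I}_{P_ν}(f) → \overline{I}(f)  and  \underline{I}_{P_ν}(f) → \underline{I}(f).

Consequently,

(2.15)  f ∈ R(I) ⟺ I(f) = \lim_{ν→∞} \sum_{k=1}^{ν} f(ξ_{νk}) ℓ(J_{νk}),

for arbitrary ξ_{νk} ∈ J_{νk}, in which case the limit is \int_I f dx.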


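Proof. Assume |f| ≤ M on I, and pick ε > 0. By (2.6), there is a partition Q of I such that

\overline{I}(f) ≤ \overline{I}_Q(f) ≤ \overline{I}(f) + ε,
\underline{I}(f) ≥ \underline{I}_Q(f) ≥ \underline{I}(f) − ε.

Now pick N such that

ν ≥ N ⟹ maxsize P_ν ≤ ε minsize Q.

Applying (2.13) with 1/k = ε yields, for ν ≥ N,

\overline{I}_{P_ν}(f) ≤ \overline{I}_Q(f) + 2M ℓ(I) ε,
\underline{I}_{P_ν}(f) ≥ \underline{I}_Q(f) − 2M ℓ(I) ε.

Hence, for ν ≥ N,

\overline{I}(f) ≤ \overline{I}_{P_ν}(f) ≤ \overline{I}(f) + [2M ℓ(I) + 1] ε,
\underline{I}(f) ≥ \underline{I}_{P_ν}(f) ≥ \underline{I}(f) − [2M ℓ(I) + 1] ε.

This proves (2.14).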

Remark. The sums on the right side of (2.15) are called Riemann sums, approximating
\int_I f dx (when f is Riemann integrable).
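The Riemann sums in (2.15) allow arbitrary tags ξ_{νk} ∈ J_{νk}. The following Python sketch illustrates this for the assumed example f(x) = x^2 on [0, 1], choosing each tag at random within its subinterval; the fixed seed, the helper name, and the values of n are illustrative choices only.

```python
import random

def riemann_sum(f, a, b, n, rng):
    """Riemann sum over n equal subintervals, with a randomly chosen tag in each."""
    length = (b - a) / n
    total = 0.0
    for k in range(n):
        left = a + k * length
        tag = left + rng.random() * length   # an arbitrary point xi in J_k
        total += f(tag) * length
    return total

rng = random.Random(0)                        # fixed seed, so runs are repeatable
f = lambda x: x * x
sizes = (10, 100, 1000, 10000)
errors = [abs(riemann_sum(f, 0.0, 1.0, n, rng) - 1 / 3) for n in sizes]
print(errors)
```

Since every such sum lies between the lower and upper sums for the same partition, each error here is at most 1/n, whatever tags are drawn.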

Remark. A second proof of Proposition 2.1 can readily be deduced from Theorem 2.4.

One should be warned that, once such a specific choice of Pν and ξνk has been made,

the limit on the right side of (2.15) might exist for a bounded function f that is not

Riemann integrable. This and other phenomena are illustrated by the following example

of a function which is not Riemann integrable. For x ∈ I, set

(2.16)  ϑ(x) = 1 if x ∈ Q,  0 if x ∉ Q,

where Q is the set of rational numbers. Now every interval J ⊂ I of positive length contains
points in Q and points not in Q, so for any partition P of I we have \overline{I}_P(ϑ) = ℓ(I) and
\underline{I}_P(ϑ) = 0, hence

(2.17)  \overline{I}(ϑ) = ℓ(I),  \underline{I}(ϑ) = 0.

Note that, if P_ν is a partition of I into ν equal subintervals, then we could pick each ξ_{νk} to
be rational, in which case the limit on the right side of (2.15) would be ℓ(I), or we could
pick each ξ_{νk} to be irrational, in which case this limit would be zero. Alternatively, we
could pick half of them to be rational and half to be irrational, and the limit would be
(1/2) ℓ(I).


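Associated to the Riemann integral is a notion of size of a set S, called content. If S is
a subset of I, define the “characteristic function”

(2.18)  χ_S(x) = 1 if x ∈ S,  0 if x ∉ S.

We define “upper content” cont^+ and “lower content” cont^− by

(2.19)  cont^+(S) = \overline{I}(χ_S),  cont^−(S) = \underline{I}(χ_S).

We say S “has content,” or “is contented” if these quantities are equal, which happens if
and only if χ_S ∈ R(I), in which case the common value of cont^+(S) and cont^−(S) is

(2.20)  m(S) = \int_I χ_S(x) dx.

It is easy to see that

(2.21)  cont^+(S) = \inf \Bigl\{ \sum_{k=1}^{N} ℓ(J_k) : S ⊂ J_1 ∪ ··· ∪ J_N \Bigr\},

where the J_k are
intervals.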

There is a more sophisticated notion of the size of a subset of I, called Lebesgue measure.

The key to the construction of Lebesgue measure is to cover a set S by a countable (either

finite or infinite) set of intervals. The outer measure of S ⊂ I is defined by

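(2.22)  m^*(S) = \inf \Bigl\{ \sum_{k ≥ 1} ℓ(J_k) : S ⊂ \bigcup_{k ≥ 1} J_k \Bigr\}.

Here {J_k} is a finite or countably infinite collection of intervals. Clearly

(2.23)  m^*(S) ≤ cont^+(S).

Note that, if S = I ∩ Q, then χ_S = ϑ, defined by (2.16). In this case it is easy to see that
cont^+(S) = ℓ(I), but m^*(S) = 0. In fact, (2.22) readily yields the following:

(2.24)  S countable ⟹ m^*(S) = 0.

We point out that we can require the intervals J_k in (2.22) to be open. Consequently, since
each open cover of a compact set has a finite subcover,

(2.25)  S compact ⟹ m^*(S) = cont^+(S).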

See the appendix at the end of this section for a generalization of Proposition 2.2, giving

a sufficient condition for a bounded function to be Riemann integrable on I, in terms of

the upper content of its set of discontinuities, in Proposition 2.11, and then, in Proposition

2.12, a refinement, replacing upper content by outer measure.

It is useful to note that \int_I f dx is additive in I, in the following sense.


Proposition 2.5. If a < b < c, f : [a, c] → R, f_1 = f|_{[a,b]}, f_2 = f|_{[b,c]}, then

(2.26)  f ∈ R([a, c]) ⟺ f_1 ∈ R([a, b]) and f_2 ∈ R([b, c]),

and, if this holds,

(2.27)  \int_a^c f dx = \int_a^b f_1 dx + \int_b^c f_2 dx.

Proof. Since any partition of [a, c] has a refinement for which b is an endpoint, we may as
well consider a partition P = P_1 ∪ P_2, where P_1 is a partition of [a, b] and P_2 is a partition
of [b, c]. Then

(2.28)  \overline{I}_P(f) = \overline{I}_{P_1}(f_1) + \overline{I}_{P_2}(f_2),  \underline{I}_P(f) = \underline{I}_{P_1}(f_1) + \underline{I}_{P_2}(f_2),

so

(2.29)  \overline{I}_P(f) − \underline{I}_P(f) = { \overline{I}_{P_1}(f_1) − \underline{I}_{P_1}(f_1) } + { \overline{I}_{P_2}(f_2) − \underline{I}_{P_2}(f_2) }.

Since both terms in braces in (2.29) are ≥ 0, we have equivalence in (2.26). Then (2.27)

follows from (2.28) upon taking finer and finer partitions, and passing to the limit.

Let I = [a, b]. If f ∈ R(I), then f ∈ R([a, x]) for all x ∈ [a, b], and we can consider the

function

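(2.30)  g(x) = \int_a^x f(t) dt.

If a ≤ x_0 ≤ x_1 ≤ b, then

(2.31)  g(x_1) − g(x_0) = \int_{x_0}^{x_1} f(t) dt,

so, if |f| ≤ M,

(2.32)  |g(x_1) − g(x_0)| ≤ M |x_1 − x_0|.

In other words, if f ∈ R(I), then g is Lipschitz continuous on I.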

Recall from §1 that a function g : (a, b) → R is said to be differentiable at x ∈ (a, b)

provided there exists the limit

(2.33)  \lim_{h→0} \frac{1}{h} [g(x + h) − g(x)] = g'(x).

When such a limit exists, g'(x), also denoted dg/dx, is called the derivative of g at x.

Clearly g is continuous wherever it is differentiable.

The next result is part of the Fundamental Theorem of Calculus.


Theorem 2.6. If f ∈ C([a, b]), then the function g, defined by (2.30), is differentiable at

each point x ∈ (a, b), and

(2.34)  g'(x) = f(x).

Proof. Parallel to (2.31), we have, for h > 0,

(2.35)  \frac{1}{h} [g(x + h) − g(x)] = \frac{1}{h} \int_x^{x+h} f(t) dt.

If f is continuous at x, then, for any ε > 0, there exists δ > 0 such that |f(t) − f(x)| ≤ ε
whenever |t − x| ≤ δ. Thus the right side of (2.35) is within ε of f(x) whenever h ∈ (0, δ].
Thus the desired limit exists as h ↘ 0. A similar argument treats h ↗ 0.
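The averaging identity (2.35) lends itself to a quick numerical check. In the sketch below, the average of f over [x, x + h] is approximated by a midpoint Riemann sum (so no antiderivative is used), with the illustrative choices f = cos and x = 1; the helper name and the number of sample points m are assumptions of this example.

```python
import math

def average_over(f, x, h, m=2000):
    """Approximate (1/h) * integral of f over [x, x + h] by a midpoint Riemann sum."""
    step = h / m
    return sum(f(x + (i + 0.5) * step) for i in range(m)) * step / h

f = math.cos
x = 1.0
gaps = [abs(average_over(f, x, h) - f(x)) for h in (0.1, 0.01, 0.001)]
print(gaps)
```

The gaps shrink as h decreases, in line with the claim that the right side of (2.35) tends to f(x).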

The next result is the rest of the Fundamental Theorem of Calculus.

Theorem 2.7. If G is differentiable and G'(x) is continuous on [a, b], then

(2.36)  \int_a^b G'(t) dt = G(b) − G(a).

Proof. Consider the function

(2.37)  g(x) = \int_a^x G'(t) dt.

We have g ∈ C([a, b]), g(a) = 0, and, by Theorem 2.6,

(2.38)  g'(x) = G'(x),  ∀ x ∈ (a, b).

Thus f(x) = g(x) − G(x) is continuous on [a, b], and

(2.39)  f'(x) = 0,  ∀ x ∈ (a, b).

We claim that (2.39) implies f is constant on [a, b]. Granted this, since f (a) = g(a)−G(a) =

−G(a), we have f (x) = −G(a) for all x ∈ [a, b], so the integral (2.37) is equal to G(x)−G(a)

for all x ∈ [a, b]. Taking x = b yields (2.36).

The fact that (2.39) implies f is constant on [a, b] is a consequence of the Mean Value

Theorem. This was established in §1; see Theorem 1.2. We repeat the statement here.

Theorem 2.8. Let f : [a, β] → R be continuous, and assume f is differentiable on (a, β). Then ∃ ξ ∈ (a, β) such that

(2.40)  $f'(\xi) = \frac{f(\beta) - f(a)}{\beta - a}.$

Now, to see that (2.39) implies f is constant on [a, b], suppose not. Then ∃ β ∈ (a, b] such that f(β) ≠ f(a), and Theorem 2.8, applied to f on [a, β], yields ξ ∈ (a, β) with $f'(\xi) \ne 0$, contradicting (2.39). This completes the proof of Theorem 2.7.


Proposition 2.9. Let f ∈ R([a, b]), and define g by (2.30). If x ∈ [a, b] and f is continuous at x, then g is differentiable at x, and $g'(x) = f(x)$.

The proof is identical to that of Theorem 2.6.

Proposition 2.10. Assume G is differentiable on [a, b] and $G' \in \mathcal{R}([a, b])$. Then (2.36) holds.

Proof. We have

(2.41)  $G(b) - G(a) = \sum_{k=0}^{n-1} \Bigl[ G\Bigl(a + (b-a)\frac{k+1}{n}\Bigr) - G\Bigl(a + (b-a)\frac{k}{n}\Bigr) \Bigr] = \frac{b-a}{n} \sum_{k=0}^{n-1} G'(\xi_{kn}),$

for some $\xi_{kn}$ satisfying

(2.42)  $a + (b-a)\frac{k}{n} < \xi_{kn} < a + (b-a)\frac{k+1}{n},$

as a consequence of the Mean Value Theorem. Given $G' \in \mathcal{R}([a, b])$, Darboux's theorem (Theorem 2.4) implies that as n → ∞ one gets $G(b) - G(a) = \int_a^b G'(t)\,dt$.

Note that the beautiful symmetry in Theorems 2.6–2.7 is not preserved in Propositions

2.9–2.10. The hypothesis of Proposition 2.10 requires G to be differentiable at each x ∈

[a, b], but the conclusion of Proposition 2.9 does not yield differentiability at all points.

For this reason, we regard Propositions 2.9–2.10 as less “fundamental” than Theorems

2.6–2.7. There are more satisfactory extensions of the fundamental theorem of calculus,

involving the Lebesgue integral, and a more subtle notion of the “derivative” of a non-

smooth function. For this, we can point the reader to Chapters 10-11 of the text [T1],

Measure Theory and Integration.

So far, we have dealt with integration of real valued functions. If f : I → C, we set

f = f1 + if2 with fj : I → R and say f ∈ R(I) if and only if f1 and f2 are in R(I). Then

(2.43)  $\int_I f\,dx = \int_I f_1\,dx + i \int_I f_2\,dx.$

Similar comments apply to functions f : I → Rn .

Here we provide a condition, more general than Proposition 2.2, which guarantees Riemann integrability.


Proposition 2.11. Let f : I → R be a bounded function, with I = [a, b]. Suppose that the set S of points of discontinuity of f has the property

(2.44)  $\operatorname{cont}^+ S = 0.$

Then f ∈ R(I).

Proof. Say |f(x)| ≤ M. Take ε > 0. As in (2.21), take intervals $J_1, \dots, J_N$ such that $S \subset J_1 \cup \cdots \cup J_N$ and $\sum_{k=1}^N \ell(J_k) < \varepsilon$. In fact, fatten each $J_k$ such that S is contained in the interior of this collection of intervals. Consider a partition $P_0$ of I, whose intervals include $J_1, \dots, J_N$, amongst others, which we label $I_1, \dots, I_K$. Now f is continuous on each interval $I_\nu$, so, subdividing each $I_\nu$ as necessary, hence refining $P_0$ to a partition $P_1$, we arrange that sup f − inf f < ε on each such subdivided interval. Denote these subdivided intervals $I'_1, \dots, I'_L$. It readily follows that

(2.45)  $0 \le \overline{I}_{P_1}(f) - \underline{I}_{P_1}(f) < \sum_{k=1}^{N} 2M\,\ell(J_k) + \sum_{k=1}^{L} \varepsilon\,\ell(I'_k) < 2\varepsilon M + \varepsilon\,\ell(I).$

With a little more effort, we can establish the following result, which, in light of (2.23),

is a bit sharper than Proposition 2.11.

Proposition 2.12. In the setting of Proposition 2.11, if we replace (2.44) by

(2.46)  $m^*(S) = 0,$

we still conclude that f ∈ R(I).

Proof. As before, we assume |f(x)| ≤ M and pick ε > 0. This time, take a countable collection of open intervals $\{J_k\}$ such that $S \subset \bigcup_{k \ge 1} J_k$ and $\sum_{k \ge 1} \ell(J_k) < \varepsilon$. Now f is continuous at each p ∈ I \ S, so there exists an interval $K_p$, open (in I), containing p, such that $\sup_{K_p} f - \inf_{K_p} f < \varepsilon$. Now $\{J_k : k \in \mathbb{N}\} \cup \{K_p : p \in I \setminus S\}$ is an open cover of I, so it has a finite subcover, which we denote $\{J_1, \dots, J_N, K_1, \dots, K_M\}$. We have

(2.47)  $\sum_{k=1}^{N} \ell(J_k) < \varepsilon, \quad \text{and} \quad \sup_{K_j} f - \inf_{K_j} f < \varepsilon, \quad \forall\, j \in \{1, \dots, M\}.$

Let P be the partition of I obtained by taking the union of all the endpoints of Jk and Kj

in (2.47). Let us write

$P = \{L_k : 0 \le k \le \mu\} = \Bigl(\bigcup_{k \in A} L_k\Bigr) \cup \Bigl(\bigcup_{k \in B} L_k\Bigr),$

where we say k ∈ A provided $L_k$ is contained in an interval of the form $K_j$ for some j ∈ {1, . . . , M}, as in (2.47). Consequently, if k ∈ B, then $L_k \subset J_\ell$ for some ℓ ∈ {1, . . . , N}, so

(2.48)  $\bigcup_{k \in B} L_k \subset \bigcup_{\ell=1}^{N} J_\ell.$

We therefore have

(2.49)  $\sum_{k \in B} \ell(L_k) < \varepsilon, \quad \text{and} \quad \sup_{L_j} f - \inf_{L_j} f < \varepsilon, \quad \forall\, j \in A.$

It follows that

(2.50)  $0 \le \overline{I}_P(f) - \underline{I}_P(f) < \sum_{k \in B} 2M\,\ell(L_k) + \sum_{j \in A} \varepsilon\,\ell(L_j) < 2\varepsilon M + \varepsilon\,\ell(I).$

Remark. Proposition 2.12 is part of the sharp result that a bounded function f on

I = [a, b] is Riemann integrable if and only if its set S of points of discontinuity satisfies

(2.46). Standard books on measure theory, including [Fol] and [T1], establish this.

We give an example of a function to which Proposition 2.11 applies, and then an example

for which Proposition 2.11 fails to apply, but Proposition 2.12 applies.

(2.51)  $f(0) = 0, \qquad f(x) = (-1)^j \ \text{ for } x \in (2^{-(j+1)}, 2^{-j}], \ j \ge 0.$

See Exercises 16-17 below for a more elaborate example to which Proposition 2.11 applies.

(2.53)  $f(x) = 0 \ \text{ if } x \notin \mathbb{Q}, \qquad f(x) = \frac{1}{n} \ \text{ if } x = \frac{m}{n}, \text{ in lowest terms.}$


(2.54) S = I ∩ Q.

As we have seen below (2.23), $\operatorname{cont}^+ S = 1$, so Proposition 2.11 does not apply. Nevertheless, it is fairly easy to see directly that

As indicated below (2.23), (2.46) does apply to this function, so Proposition 2.12 applies.

Example 2 is illustrative of the following general phenomenon, which is worth recording.

Corollary 2.13. If f : I → R is bounded and its set S of points of discontinuity is

countable, then f ∈ R(I).

Proof. By virtue of (2.24), Proposition 2.12 applies.

Here is another useful sufficient condition for Riemann integrability.

Proposition 2.14. If f : I → R is bounded and monotone, then f ∈ R(I).

Proof. It suffices to consider the case that f is monotone increasing. Let $P_N = \{J_k : 1 \le k \le N\}$ be the partition of I into N intervals of equal length. Note that $\sup_{J_k} f \le \inf_{J_{k+1}} f$. Hence

(2.57)  $\overline{I}_{P_N}(f) \le \sum_{k=1}^{N-1} \bigl(\inf_{J_{k+1}} f\bigr)\,\ell(J_k) + \bigl(\sup_{J_N} f\bigr)\,\ell(J_N) \le \underline{I}_{P_N}(f) + 2M\,\frac{\ell(I)}{N},$

if |f| ≤ M. Taking N → ∞, we deduce from Theorem 2.4 that $\overline{I}(f) \le \underline{I}(f)$, which proves f ∈ R(I).
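The estimate (2.57) can be observed numerically. The sketch below assumes the illustrative choice f(x) = x³ on [0, 2] (so M = 8); for an increasing function the supremum and infimum on each subinterval are the right and left endpoint values, which is what the hypothetical helper `darboux_sums_increasing` uses.

```python
def darboux_sums_increasing(f, a, b, n):
    """Upper and lower sums of an increasing f over n equal subintervals of [a, b].

    For increasing f, the sup on each subinterval is attained at the right
    endpoint and the inf at the left endpoint.
    """
    h = (b - a) / n
    pts = [a + k * h for k in range(n + 1)]
    lower = h * sum(f(t) for t in pts[:-1])
    upper = h * sum(f(t) for t in pts[1:])
    return upper, lower

f = lambda x: x ** 3        # increasing on [0, 2], with |f| <= M = 8 there
a, b, M = 0.0, 2.0, 8.0
gaps = {}
for n in (10, 100, 1000):
    upper, lower = darboux_sums_increasing(f, a, b, n)
    gaps[n] = upper - lower  # stays below 2 M l(I) / n, as in (2.57)
```

As N grows, the gap between upper and lower sums shrinks like 1/N, which is the content of the proof.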

Remark. It can be shown that if f is monotone, then its set of points of discontinuity is

countable. Given this, Proposition 2.14 is also a consequence of Corollary 2.13.

We mention an alternative characterization of the upper and lower integrals $\overline{I}(f)$ and $\underline{I}(f)$, which can be useful. Given I = [a, b], we say g : I → R is piecewise constant on I (and write g ∈ PK(I)) provided there exists a partition $P = \{J_k\}$ of I such that g is constant on the interior of each interval $J_k$. Clearly PK(I) ⊂ R(I). It is easy to see that, if f : I → R is bounded,

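(2.58)  $\overline{I}(f) = \inf \Bigl\{ \int_I f_1\,dx : f_1 \in PK(I),\ f_1 \ge f \Bigr\}, \qquad \underline{I}(f) = \sup \Bigl\{ \int_I f_0\,dx : f_0 \in PK(I),\ f_0 \le f \Bigr\}.$

(2.59)  $f \in \mathcal{R}(I) \iff \text{for each } \varepsilon > 0,\ \exists\, f_0, f_1 \in PK(I) \text{ such that } f_0 \le f \le f_1 \text{ and } \int_I (f_1 - f_0)\,dx < \varepsilon.$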

Proposition 2.15. Let f ∈ R(I), and assume |f | ≤ M . Let ϕ : [−M, M ] → R be

continuous. Then ϕ ◦ f ∈ R(I).

Proof. We proceed in steps.

piecewise linear functions. Then ϕν ◦ f → ϕ ◦ f uniformly on I. A uniform limit g of

functions $g_\nu \in \mathcal{R}(I)$ is in R(I) (see Exercise 9). So it suffices to prove Proposition 2.15 when ϕ is continuous and piecewise linear.

ϕ = ϕ1 − ϕ2 , with ϕj : [−M, M ] → R monotone, continuous, and piecewise linear. Now

ϕ1 ◦ f, ϕ2 ◦ f ∈ R(I) ⇒ ϕ ◦ f ∈ R(I).

Lipschitz. By Step 2, this will suffice. So we assume

ϕ ◦ f0 , ϕ ◦ f1 ∈ PK(I), ϕ ◦ f0 ≤ ϕ ◦ f ≤ ϕ ◦ f1 ,


and

$\int_I (\varphi \circ f_1 - \varphi \circ f_0)\,dx \le L \int_I (f_1 - f_0)\,dx \le L\varepsilon.$

Exercises

1. Let c > 0 and let f : [ac, bc] → R be Riemann integrable. Working directly with the

definition of integral, show that

(2.62)  $\int_a^b f(cx)\,dx = \frac{1}{c} \int_{ac}^{bc} f(x)\,dx.$

(2.63)  $\int_{a-d/c}^{b-d/c} f(cx + d)\,dx = \frac{1}{c} \int_{ac}^{bc} f(x)\,dx.$

2. Let f : I × S → R be continuous, where I = [a, b] and $S \subset \mathbb{R}^n$. Take $\varphi(y) = \int_I f(x, y)\,dx$. Show that ϕ is continuous on S.

Hint. If $f_j : I \to \mathbb{R}$ are continuous and $|f_1(x) - f_2(x)| \le \delta$ on I, then

(2.64)  $\Bigl| \int_I f_1\,dx - \int_I f_2\,dx \Bigr| \le \ell(I)\,\delta.$

Take $\varphi(y) = \int_{g_0(y)}^{g_1(y)} f(x, y)\,dx$. Show that ϕ is continuous on S.

Hint. Make a change of variables, linear in x, to reduce this to Exercise 2.

4. Let ϕ : [a, b] → [A, B] be $C^1$ on a neighborhood J of [a, b], with $\varphi'(x) > 0$ for all x ∈ [a, b]. Assume ϕ(a) = A, ϕ(b) = B. Show that the identity

(2.65)  $\int_A^B f(y)\,dy = \int_a^b f(\varphi(t))\,\varphi'(t)\,dt,$

for any f ∈ C([A, B]), follows from the chain rule and the Fundamental Theorem of

Calculus.

Hint. Replace b by x, B by ϕ(x), and differentiate.


4A. Show that (2.65) holds for each f ∈ PK([A, B]). Using (2.58)–(2.59), show that

f ∈ R([A, B]) ⇒ f ◦ ϕ ∈ R([a, b]) and (2.65) holds. (This result contains that of Exercise

1.)

(2.66)  $\int_a^b f(s)\,g'(s)\,ds = -\int_a^b f'(s)\,g(s)\,ds + \bigl[ f(b)g(b) - f(a)g(a) \bigr].$

(2.67)  $f(x) = f(0) + f'(0)x + \frac{f''(0)}{2}x^2 + \cdots + \frac{f^{(j)}(0)}{j!}x^j + R_j(x),$

where

(2.68)  $R_j(x) = \int_0^x \frac{(x - s)^j}{j!}\, f^{(j+1)}(s)\,ds.$

Hint. Use induction. If (2.67)–(2.68) holds for 0 ≤ j ≤ k, show that it holds for j = k + 1, by showing that

(2.69)  $\int_0^x \frac{(x - s)^k}{k!}\, f^{(k+1)}(s)\,ds = \frac{f^{(k+1)}(0)}{(k+1)!}\, x^{k+1} + \int_0^x \frac{(x - s)^{k+1}}{(k+1)!}\, f^{(k+2)}(s)\,ds.$

To establish this, use the integration by parts formula (2.66), with f(s) replaced by $f^{(k+1)}(s)$, and with appropriate g(s). See §3 for another approach. Note that another presentation of (2.68) is

(2.70)  $R_j(x) = \frac{x^{j+1}}{(j+1)!} \int_0^1 f^{(j+1)}\bigl( (1 - t^{1/(j+1)})\,x \bigr)\,dt.$

7. Assume f : (−a, a) → R is a $C^j$ function. Show that, for x ∈ (−a, a), (2.67) holds, with

(2.71)  $R_j(x) = \frac{1}{(j-1)!} \int_0^x (x - s)^{j-1} \bigl[ f^{(j)}(s) - f^{(j)}(0) \bigr]\,ds.$

Hint. Apply (2.68) with j replaced by j − 1. Add and subtract $f^{(j)}(0)$ to the factor $f^{(j)}(s)$ in the resulting integrand.


as advertised in (2.60).

(2.73)  $\int_I f_k\,dx \longrightarrow \int_I f\,dx.$

10. Given I = [a, b], Iε = [a + ε, b − ε], assume fk ∈ R(I), |fk | ≤ M on I for all k, and

(2.74) fk −→ f uniformly on Iε ,

for all ε ∈ (0, (b − a)/2). Prove that f ∈ R(I) and (2.73) holds.

(2.75)  $\int_a^b x^r\,dx, \quad r \in \mathbb{Q} \setminus \{-1\},$

where −∞ < a < b < ∞ if r ≥ 0 and 0 < a < b < ∞ if r < 0. See §5 for (2.75) with

r = −1.

(2.76)  $\int_0^1 x \sqrt{1 + x^2}\,dx.$

13. We say f ∈ R(R) provided f |[k,k+1] ∈ R([k, k + 1]) for each k ∈ Z, and

(2.77)  $\sum_{k=-\infty}^{\infty} \int_k^{k+1} |f(x)|\,dx < \infty.$

If f ∈ R(R), we set

(2.78)  $\int_{-\infty}^{\infty} f(x)\,dx = \lim_{k \to \infty} \int_{-k}^{k} f(x)\,dx.$

Formulate and demonstrate basic properties of the integral over R of elements of R(R).


14. This exercise discusses the integral test for absolute convergence of an infinite series,

which goes as follows. Let f be a positive, monotonically decreasing, continuous function

on [0, ∞), and suppose |ak | = f (k). Then

$\sum_{k=0}^{\infty} |a_k| < \infty \iff \int_0^{\infty} f(x)\,dx < \infty.$

Prove this.

Hint. Use

$\sum_{k=1}^{N} |a_k| \le \int_0^N f(x)\,dx \le \sum_{k=0}^{N-1} |a_k|.$
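The two-sided estimate in the hint can be sanity-checked on a concrete case. This is a small sketch assuming the illustrative function f(x) = (1 + x)^(−2), which is positive, decreasing, and continuous on [0, ∞), and whose integral over [0, N] is 1 − 1/(1 + N).

```python
def f(x):
    """Illustrative positive, decreasing, continuous function on [0, infinity)."""
    return 1.0 / (1.0 + x) ** 2

def integral_0_to_N(N):
    """Exact value of the integral of (1 + x)^(-2) over [0, N]."""
    return 1.0 - 1.0 / (1.0 + N)

def sandwich(N):
    """Return (sum_{k=1}^{N} f(k), integral over [0, N], sum_{k=0}^{N-1} f(k))."""
    left = sum(f(k) for k in range(1, N + 1))
    right = sum(f(k) for k in range(0, N))
    return left, integral_0_to_N(N), right

for N in (1, 10, 1000):
    left, mid, right = sandwich(N)
    print(N, left <= mid <= right)
```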

$\sum_{k=1}^{\infty} \frac{1}{k^p} < \infty \iff p > 1.$

to take $p \in \mathbb{R}^+$.) Hint. Use Exercise 11 to evaluate $I_N(p) = \int_1^N x^{-p}\,dx$, for p ≠ 1, and let N → ∞. See if you can show $\int_1^\infty x^{-1}\,dx = \infty$ without knowing about log N. Subhint. Show that $\int_1^2 x^{-1}\,dx = \int_N^{2N} x^{-1}\,dx$.

In Exercises 16–17, C ⊂ [a, b] is the Cantor set introduced in the exercises for §9 of Chapter 1. As in (9.21) of Chapter 1, $C = \bigcap_{j \ge 0} C_j$.

$\operatorname{cont}^+ C = 0.$

17. Define f : [a, b] → R as follows. We call an interval of length $3^{-j}(b - a)$, omitted in passing from $C_{j-1}$ to $C_j$, a “j-interval.” Set

$f(x) = 0 \ \text{ if } x \in C, \qquad f(x) = (-1)^j \ \text{ if } x \text{ belongs to a } j\text{-interval.}$

Show that the set of discontinuities of f is C. Hence Proposition 2.11 implies f ∈ R([a, b]).

18. Let fk ∈ R([a, b]) and f : [a, b] → R satisfy the following conditions.

(a) |fk | ≤ M < ∞, ∀ k,

(b) fk (x) −→ f (x), ∀ x ∈ [a, b],

(c) Given ε > 0, there exists Sε ⊂ [a, b] such that

cont+ Sε < ε, and fk → f uniformly on [a, b] \ Sε .


$\int_a^b f_k(x)\,dx \longrightarrow \int_a^b f(x)\,dx, \quad \text{as } k \to \infty.$

Remark. In the Lebesgue theory of integration, there is a stronger result, known as the

Lebesgue dominated convergence theorem. See Exercises 12–14 in §6 for more on this.

19. Recall that one ingredient in the proof of Theorem 2.7 was that if f : (a, b) → R, then

Consider the following approach to proving (2.79), which avoids use of the Mean Value

Theorem.

(a) Assume $a < x_0 < y_0 < b$ and $f(x_0) \ne f(y_0)$. Say $f(y_0) = f(x_0) + A(y_0 - x_0)$, and we may as well assume A > 0.

(b) Divide $I_0 = [x_0, y_0]$ into two equal intervals, $I_0^\ell$ and $I_0^r$, meeting at the midpoint $\xi_0 = (x_0 + y_0)/2$. Show that either

Set $I_1 = I_0^\ell$ if the former holds; otherwise, set $I_1 = I_0^r$. Say $I_1 = [x_1, y_1]$.

(c) Inductively, having $I_k = [x_k, y_k]$, of length $2^{-k}(y_0 - x_0)$, divide it into two equal intervals, $I_k^\ell$ and $I_k^r$, meeting at the midpoint $\xi_k = (x_k + y_k)/2$. Show that either

Set $I_{k+1} = I_k^\ell$ if the former holds; otherwise set $I_{k+1} = I_k^r$.

(d) Show that

$x_k \nearrow x, \quad y_k \searrow x, \quad x \in [x_0, y_0],$

and that, if f is differentiable at x, then $f'(x) \ge A$. Note that this contradicts the hypothesis that $f'(x) = 0$ for all x ∈ (a, b).


3. Power series

(3.1)  $f(z) = \sum_{k=0}^{\infty} a_k (z - z_0)^k,$

Proposition 3.1. If the series (3.1) converges for some $z_1 \ne z_0$, then either this series

is absolutely convergent for all z ∈ C or there is some R ∈ (0, ∞) such that the series is

absolutely convergent for |z − z0 | < R and divergent for |z − z0 | > R. The series converges

uniformly on

We now restrict attention to cases where z0 ∈ R and z = t ∈ R, and apply calculus to

the study of such power series. We emphasize that we still allow the coefficients ak to be

complex numbers.

Proposition 3.2. Assume ak ∈ C and

(3.3)  $f(t) = \sum_{k=0}^{\infty} a_k t^k$

converges for real t satisfying |t| < R. Then f is differentiable on the interval −R < t < R, and

(3.4)  $f'(t) = \sum_{k=1}^{\infty} k a_k t^{k-1},$

We first check absolute convergence of the series (3.4). Let S < T < R. Convergence of

(3.3) implies there exists C < ∞ such that

(3.5)  $|a_k|\,T^k \le C, \quad \forall\, k.$

Hence, if |t| ≤ S,

(3.6)  $|k a_k t^{k-1}| \le \frac{C}{S}\, k \Bigl( \frac{S}{T} \Bigr)^k,$


(3.7)  $g(t) = \sum_{k=1}^{\infty} k a_k t^{k-1}$

is continuous on (−R, R). To show that $f'(t) = g(t)$, by the fundamental theorem of calculus, it is equivalent to show

(3.8)  $\int_0^t g(s)\,ds = f(t) - f(0).$

Proposition 3.3. Assume bk ∈ C and

(3.9)  $g(t) = \sum_{k=0}^{\infty} b_k t^k$

converges for real t, satisfying |t| < R. Then, for |t| < R,

(3.10)  $\int_0^t g(s)\,ds = \sum_{k=0}^{\infty} \frac{b_k}{k+1}\, t^{k+1},$

Proof. Since, for |t| < R,

(3.11)  $\Bigl| \frac{b_k}{k+1}\, t^{k+1} \Bigr| \le R\,|b_k t^k|,$

convergence of the series in (3.10) is clear. Next, write

(3.12)  $S_N(t) = \sum_{k=0}^{N} b_k t^k, \qquad R_N(t) = \sum_{k=N+1}^{\infty} b_k t^k.$

As in the proof of Proposition 3.2 in Chapter 3, pick S < T < R. There exists C < ∞ such that $|b_k T^k| \le C$ for all k. Hence

(3.13)  $|t| \le S \Longrightarrow |R_N(t)| \le C \sum_{k=N+1}^{\infty} \Bigl( \frac{S}{T} \Bigr)^k = C\varepsilon_N \to 0, \quad \text{as } N \to \infty.$

so

(3.14)  $\int_0^t g(s)\,ds = \sum_{k=0}^{N} \frac{b_k}{k+1}\, t^{k+1} + \int_0^t R_N(s)\,ds,$

(3.15)  $\Bigl| \int_0^t R_N(s)\,ds \Bigr| \le \int_0^t |R_N(s)|\,ds \le C R \varepsilon_N.$

Second proof of Proposition 3.2. As shown in Proposition 3.7 of Chapter 3, if |t1 | < R,

then f (t) has a convergent power series about t1 :

(3.16)  $f(t) = \sum_{k=0}^{\infty} b_k (t - t_1)^k, \quad \text{for } |t - t_1| < R - |t_1|,$

with

(3.17)  $b_1 = \sum_{n=1}^{\infty} n a_n t_1^{n-1}.$

Remark. The definition of (3.10) for t < 0 follows standard convention. More generally,

if a < b and g ∈ R([a, b]), then

$\int_b^a g(s)\,ds = -\int_a^b g(s)\,ds.$

(3.18)  $f(t) = \sum_{k=0}^{\infty} a_k (t - t_0)^k, \quad \text{for } |t - t_0| < R,$

(3.19)  $f'(t) = \sum_{k=1}^{\infty} k a_k (t - t_0)^{k-1}.$

(3.20)  $f^{(n)}(t) = \sum_{k=n}^{\infty} k(k-1) \cdots (k-n+1)\, a_k (t - t_0)^{k-n}.$


In particular,

(3.22)  $a_n = \frac{f^{(n)}(t_0)}{n!}.$

This suggests the following method of taking a given function and deriving a power series representation. Namely, if we can, we compute $f^{(k)}(t_0)$ and propose that

(3.23)  $f(t) = \sum_{k=0}^{\infty} \frac{f^{(k)}(t_0)}{k!}\, (t - t_0)^k,$

with r ∈ Q (but $-r \notin \mathbb{N}$), and take $t_0 = 0$. (Results of §5 will allow us to extend this analysis to r ∈ R.) Using (1.36), we get

(3.26)  $f^{(k)}(t) = \Bigl[ \prod_{\ell=0}^{k-1} (r + \ell) \Bigr] (1 - t)^{-(r+k)}.$

Hence, for k ≥ 1,

(3.27)  $f^{(k)}(0) = \prod_{\ell=0}^{k-1} (r + \ell) = r(r+1) \cdots (r+k-1).$

(3.28)  $(1 - t)^{-r} = \sum_{k=0}^{\infty} \frac{a_k}{k!}\, t^k, \quad |t| < 1,$

with

(3.29)  $a_0 = 1, \qquad a_k = \prod_{\ell=0}^{k-1} (r + \ell), \quad \text{for } k \ge 1.$
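As an informal check of the proposed expansion (3.28)–(3.29), one can compare partial sums with the closed form. The sketch below assumes the illustrative values r = 1/2 and t = 0.3; the helper name `binomial_series` is hypothetical. It does not replace the remainder analysis needed to prove convergence to $(1 - t)^{-r}$.

```python
def binomial_series(r, t, n_terms):
    """Partial sum of sum_k a_k t^k / k!, with a_k = r (r+1) ... (r+k-1), a_0 = 1."""
    total = 0.0
    term = 1.0                      # term = a_k t^k / k!, starting at k = 0
    for k in range(n_terms):
        total += term
        # Ratio of consecutive terms: a_{k+1}/a_k = r + k, and (k+1)!/k! = k + 1.
        term *= (r + k) * t / (k + 1)
    return total

r, t = 0.5, 0.3                      # illustrative values with |t| < 1
approx = binomial_series(r, t, 60)
exact = (1.0 - t) ** (-r)
print(approx, exact)
```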


We can verify convergence of the right side of (3.28) by using the ratio test:

(3.30)  $\Bigl| \frac{a_{k+1} t^{k+1}/(k+1)!}{a_k t^k/k!} \Bigr| = \frac{k + r}{k + 1}\, |t|.$

This computation implies that the power series on the right side of (3.28) is absolutely convergent for |t| < 1, yielding a function

(3.31)  $g(t) = \sum_{k=0}^{\infty} \frac{a_k}{k!}\, t^k, \quad |t| < 1.$

We take up this task, on a more general level. Establishing that the series

(3.32)  $\sum_{k=0}^{\infty} \frac{f^{(k)}(t_0)}{k!}\, (t - t_0)^k$

converges to f(t) is equivalent to examining the remainder $R_n(t, t_0)$ in the finite expansion

(3.33)  $f(t) = \sum_{k=0}^{n} \frac{f^{(k)}(t_0)}{k!}\, (t - t_0)^k + R_n(t, t_0).$

The series (3.32) converges to f (t) if and only if Rn (t, t0 ) → 0 as n → ∞. To see when this

happens, we need a compact formula for the remainder Rn , which we proceed to derive.

It seems to clarify matters if we switch notation a bit, and write

(3.34)  $f(x) = f(y) + f'(y)(x - y) + \cdots + \frac{f^{(n)}(y)}{n!}\, (x - y)^n + R_n(x, y).$

We now take the y-derivative of each side of (3.34). The y-derivative of the left side is 0,

and when we apply ∂/∂y to the right side, we observe an enormous amount of cancellation.

There results the identity

(3.35)  $\frac{\partial R_n}{\partial y}(x, y) = -\frac{1}{n!}\, f^{(n+1)}(y)\, (x - y)^n.$

Also,

(3.36) Rn (x, x) = 0.

$[R_n(x, y) - R_n(x, x)]/(y - x)$, an immediate consequence of the mean value theorem is that, if f is real valued,

(3.37)  $R_n(x, y) = \frac{1}{n!}\, (x - y)(x - \xi_n)^n f^{(n+1)}(\xi_n),$


for some $\xi_n$ between x and y. This is known as Cauchy’s formula for the remainder. If $f^{(n+1)}$ is continuous, we can apply the fundamental theorem of calculus to (3.35)–(3.36), and obtain the integral formula

(3.38)  $R_n(x, y) = \frac{1}{n!} \int_y^x (x - s)^n f^{(n+1)}(s)\,ds.$

This works regardless of whether f is real valued. Another derivation of (3.38) arose in

the exercise set for §2. The change of variable x − s = t(x − y) gives the integral formula

(3.39)  $R_n(x, y) = \frac{1}{n!}\, (x - y)^{n+1} \int_0^1 t^n f^{(n+1)}(ty + (1 - t)x)\,dt.$

If we think of this integral as 1/(n + 1) times a weighted mean of $f^{(n+1)}$, we get the Lagrange formula for the remainder,

(3.40)  $R_n(x, y) = \frac{1}{(n+1)!}\, (x - y)^{n+1} f^{(n+1)}(\zeta_n),$

for some ζn between x and y, provided f is real valued. The Lagrange formula is shorter

and neater than the Cauchy formula, but the Cauchy formula is actually more powerful.

The calculations in (3.43)–(3.54) below will illustrate this.
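The integral formula (3.38) can be compared numerically with the remainder computed directly. The following sketch assumes the illustrative case f = exp, y = 0, x = 1, n = 5, and uses a midpoint rule for the integral; the function names are hypothetical.

```python
import math

def taylor_remainder_exp(x, n):
    """R_n(x, 0) for f = exp: f(x) minus its degree-n Taylor polynomial at 0."""
    return math.exp(x) - sum(x ** k / math.factorial(k) for k in range(n + 1))

def integral_remainder_exp(x, n, steps=20000):
    """Midpoint-rule value of (1/n!) * integral_0^x (x - s)^n exp(s) ds, cf. (3.38)."""
    h = x / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += (x - s) ** n * math.exp(s)   # every derivative of exp is exp
    return h * total / math.factorial(n)

x, n = 1.0, 5
direct = taylor_remainder_exp(x, n)
via_integral = integral_remainder_exp(x, n)
# Bound obtained from the Lagrange formula (3.40), using sup of exp on [0, x] for x >= 0.
lagrange_bound = abs(x) ** (n + 1) / math.factorial(n + 1) * math.exp(x)
print(direct, via_integral, lagrange_bound)
```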

Note that, if I(x, y) denotes the interval with endpoints x and y (e.g., (x, y) if x < y),

then (3.38) implies

(3.41)  $|R_n(x, y)| \le \frac{|x - y|}{n!} \sup_{\xi \in I(x,y)} |(x - \xi)^n f^{(n+1)}(\xi)|,$

(3.42)  $|R_n(x, y)| \le \frac{|x - y|^{n+1}}{(n+1)!} \sup_{\xi \in I(x,y)} |f^{(n+1)}(\xi)|.$

In case f is real valued, (3.41) also follows from the Cauchy formula (3.37) and (3.42)

follows from the Lagrange formula (3.40).

Let us apply these estimates with f as in (3.24), i.e.,

and y = 0. By (3.26),

(3.44)  $f^{(n+1)}(\xi) = a_{n+1} (1 - \xi)^{-(r+n+1)}, \qquad a_{n+1} = \prod_{\ell=0}^{n} (r + \ell).$


Consequently,

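(3.45)  $\frac{f^{(n+1)}(\xi)}{n!} = b_n (1 - \xi)^{-(r+n+1)}, \qquad b_n = \frac{a_{n+1}}{n!}.$

Note that

(3.46)  $\frac{b_{n+1}}{b_n} = \frac{n + 1 + r}{n + 1} \to 1, \quad \text{as } n \to \infty.$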

bn n+1

Let us first investigate the estimate of Rn (x, 0) given by (3.42) (as in the Lagrange

formula), and see how it leads to a suboptimal conclusion. (The impatient reader might

skip (3.47)–(3.50) and go to (3.51).) By (3.45), if n is sufficiently large that r + n + 1 > 0,

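(3.47)  $\sup_{\xi \in I(x,0)} \frac{|f^{(n+1)}(\xi)|}{(n+1)!} = \begin{cases} \dfrac{|b_n|}{n+1} & \text{if } -1 \le x \le 0, \\[2mm] \dfrac{|b_n|}{n+1}\, (1 - x)^{-(r+n+1)} & \text{if } 0 \le x < 1. \end{cases}$

(3.48)  $|R_n(x, 0)| \le \begin{cases} \dfrac{|b_n|}{n+1}\, |x|^{n+1} & \text{if } -1 \le x \le 0, \\[2mm] \dfrac{|b_n|}{n+1}\, \dfrac{1}{(1 - x)^r} \Bigl( \dfrac{x}{1 - x} \Bigr)^{n+1} & \text{if } 0 \le x < 1. \end{cases}$

$c_n = \frac{|b_n|}{n+1} \Longrightarrow \frac{c_{n+1}}{c_n} = \frac{|b_{n+1}|}{|b_n|}\, \frac{n+1}{n+2} \to 1 \quad \text{as } n \to \infty,$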

On the other hand, x/(1 − x) is < 1 for 0 ≤ x < 1/2, but not for 1/2 ≤ x < 1. Hence the factor $(x/(1 - x))^{n+1}$ decreases geometrically for 0 ≤ x < 1/2, but not for 1/2 ≤ x < 1. Thus the second part of (3.48) yields only

(3.50)  $R_n(x, 0) \longrightarrow 0 \ \text{ as } n \to \infty, \quad \text{if } 0 \le x < \frac{1}{2}.$

This is what the remainder estimate (3.42) yields.

To get the stronger result


we use the remainder estimate (3.41) (as in the Cauchy formula). This gives

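(3.52)  $|R_n(x, 0)| \le |b_n| \cdot |x| \sup_{\xi \in I(x,0)} \frac{|x - \xi|^n}{|1 - \xi|^{n+1+r}},$

(3.53)  $0 \le \xi \le x < 1 \Longrightarrow \frac{x - \xi}{1 - \xi} \le x, \qquad -1 < x \le \xi \le 0 \Longrightarrow \Bigl| \frac{x - \xi}{1 - \xi} \Bigr| \le |x - \xi| \le |x|.$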

The first conclusion holds since it is equivalent to x − ξ ≤ x(1 − ξ) = x − xξ, hence to

xξ ≤ ξ. The second conclusion in (3.53) holds since ξ ≤ 0 ⇒ 1 − ξ ≥ 1. We deduce from

(3.52)–(3.53) that

We can now conclude that (3.28) holds, with ak given by (3.29). For another proof of

(3.28), see Exercise 14.
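The weakness of the Lagrange-type bound for 1/2 ≤ x < 1 can be seen numerically. This sketch assumes the illustrative values r = 1/2 and x = 0.75, and compares the actual remainder of the series (3.28) with the right side of (3.48); the helper names are hypothetical.

```python
import math

def a(k, r):
    """a_k = r (r+1) ... (r+k-1), with a_0 = 1, as in (3.29)."""
    prod = 1.0
    for l in range(k):
        prod *= r + l
    return prod

def actual_remainder(x, r, n):
    """(1 - x)^(-r) minus the partial sum of (3.28) through degree n."""
    partial = sum(a(k, r) / math.factorial(k) * x ** k for k in range(n + 1))
    return (1.0 - x) ** (-r) - partial

def lagrange_bound(x, r, n):
    """Right side of (3.48) in the case 0 <= x < 1, with b_n = a_{n+1} / n!."""
    b_n = a(n + 1, r) / math.factorial(n)
    return abs(b_n) / (n + 1) * (1.0 - x) ** (-r) * (x / (1.0 - x)) ** (n + 1)

r, x = 0.5, 0.75                     # illustrative values with 1/2 <= x < 1
ns = (10, 20, 40)
remainders = [actual_remainder(x, r, n) for n in ns]
bounds = [lagrange_bound(x, r, n) for n in ns]
print(remainders)   # tends to 0
print(bounds)       # grows without bound, since x / (1 - x) = 3 > 1
```

The remainder itself tends to zero, while the bound from (3.48) grows geometrically, which is why the sharper estimate (3.41) is needed on this range.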

There are some important examples of power series representations for which one does

not need to use remainder estimates like (3.41) or (3.42). For example, as seen in Chapter

1, we have

(3.55)  $\sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x},$

(3.56)  $\frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k, \quad |x| < 1,$

without further ado, which is the case r = 1 of (3.28)–(3.29). We can differentiate (3.56)

repeatedly to get

(3.57)  $(1 - x)^{-n} = \sum_{k=0}^{\infty} c_k(n)\, x^k, \quad |x| < 1, \ n \in \mathbb{N},$

and verify that (3.57) agrees with (3.28)–(3.29) with r = n. However, when $r \notin \mathbb{Z}$, such an analysis of $R_n(x, 0)$ as made above seems necessary.

Let us also note that we can apply Proposition 3.3 to (3.56), obtaining

(3.58)  $\sum_{k=0}^{\infty} \frac{x^{k+1}}{k+1} = \int_0^x \frac{dy}{1 - y}, \quad |x| < 1.$


Material covered in §5 will produce another formula for the right side of (3.58).

Exercises

1. Show that (3.6) yields the absolute convergence asserted in the proof of Proposition

3.2. More generally, show that, for any n ∈ N, r ∈ (0, 1),

$\sum_{k=1}^{\infty} k^n r^k < \infty.$

we have $p^{(k)}(0) = k!\, a_k$. Apply this to

$P_n(t) = (1 + t)^n.$

Compute $P_n^{(k)}(t)$ using (1.7) repeatedly, then compute $P_n^{(k)}(0)$, and use this to establish the binomial formula:

$(1 + t)^n = \sum_{k=0}^{n} \binom{n}{k} t^k, \qquad \binom{n}{k} = \frac{n!}{k!\,(n - k)!}.$

$\frac{1}{\sqrt{1 - x^4}} = \sum_{k=0}^{\infty} b_k x^k.$

Show that this series converges to the left side for |x| < 1.

Hint. Take r = 1/2 in (3.28)–(3.29) and set $t = x^4$.

4. Expand

$\int_0^x \frac{dy}{\sqrt{1 - y^4}}$

in a power series in x. Show this holds for |x| < 1.

5. Expand

$\int_0^x \frac{dy}{\sqrt{1 + y^4}}$

as a power series in x. Show that this holds for |x| < 1.


6. Expand

$\int_0^1 \frac{dt}{\sqrt{1 + x t^4}}$

as a power series in x. Show that this holds for |x| < 1.

(3.59)  $R_n(x, y) = \frac{1}{(n-1)!} \int_y^x (x - s)^{n-1} \bigl[ f^{(n)}(s) - f^{(n)}(y) \bigr]\,ds.$

$R_{n-1}(x, y) = \frac{f^{(n)}(y)}{n!}\, (x - y)^n + R_n(x, y).$

Remark. An advantage of (3.59) over (3.38) is that for (3.59), we need only $f \in C^n$, rather than $f \in C^{n+1}$.

8. Note that

$\sqrt{2} = 2 \sqrt{1 - \frac{1}{2}}.$

Expand the right side in a power series, using (3.28)–(3.29). How many terms suffice to approximate $\sqrt{2}$ to 12 digits?

9. In the setting of Exercise 8, investigate series that converge faster, such as series obtained from

$\sqrt{2} = \frac{3}{2} \sqrt{1 - \frac{1}{9}} = \frac{10}{7} \sqrt{1 - \frac{1}{50}}.$

10. Apply variants of the methods of Exercises 8–9 to approximate $\sqrt{3}$, $\sqrt{5}$, $\sqrt{7}$, and $\sqrt{1001}$.

11. Given a rational approximation $x_n$ to $\sqrt{2}$, write

$\sqrt{2} = x_n \sqrt{1 + \delta_n}.$

$x_{n+1} = x_n \Bigl( 1 + \frac{1}{2}\delta_n \Bigr), \qquad 2 = x_{n+1}^2 (1 + \delta_{n+1}).$

Estimate $\delta_{n+1}$. Does the sequence $(x_n)$ approach $\sqrt{2}$ faster than a power series? Apply this method to the last approximation in Exercise 9.
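For readers who want to experiment with the iteration in Exercise 11, here is a small sketch in exact rational arithmetic. It assumes the starting value $x_0 = 10/7$ from Exercise 9 and defines $\delta_n$ through $2 = x_n^2 (1 + \delta_n)$; the helper name `refine` is hypothetical, and the sketch only displays the numbers, leaving the estimate of $\delta_{n+1}$ to the exercise.

```python
from fractions import Fraction
import math

def refine(x):
    """One step x -> x (1 + delta/2), where delta is defined by 2 = x^2 (1 + delta)."""
    delta = Fraction(2) / (x * x) - 1
    return x * (1 + delta / 2), delta

x = Fraction(10, 7)      # starting approximation taken from Exercise 9
history = []             # pairs (new approximation, delta used in that step)
for _ in range(3):
    x, delta = refine(x)
    history.append((x, delta))

for approx, delta in history:
    print(approx, float(delta), float(approx) - math.sqrt(2))
```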

12. Assume F ∈ C([a, b]), g ∈ R([a, b]), F real valued, and g ≥ 0 on [a, b]. Show that

$\int_a^b g(t) F(t)\,dt = \Bigl( \int_a^b g(t)\,dt \Bigr) F(\zeta),$

for some ζ ∈ (a, b). Show how this result justifies passing from (3.39) to (3.40).

Hint. If A = min F, B = max F, and $M = \int_a^b g(t)\,dt$, show that

$AM \le \int_a^b g(t) F(t)\,dt \le BM.$

13. Recall that the Cauchy formula (3.37) for the remainder Rn (x, y) was obtained by

applying the Mean Value Theorem to the difference quotient

$\frac{R_n(x, y) - R_n(x, x)}{y - x}.$

Now apply the generalized mean value theorem, described in Exercise 8 of §1, with

14. Here is an approach to the proof of (3.28) that avoids formulas for the remainder

Rn (x, 0). Set

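$f_r(t) = (1 - t)^{-r}, \qquad g_r(t) = \sum_{k=0}^{\infty} \frac{a_k}{k!}\, t^k, \quad \text{for } |t| < 1,$

$f_r'(t) = \frac{r}{1 - t}\, f_r(t), \quad \text{and} \quad (1 - t)\, g_r'(t) = r\, g_r(t).$

$\frac{d}{dt} \bigl[ (1 - t)^r g_r(t) \bigr] = 0,$

and deduce that $f_r(t) = g_r(t)$.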


The term “curve” is commonly used to refer to a couple of different, but closely related,

objects. In one meaning, a curve is a continuous function from an interval I ⊂ R to

n-dimensional Euclidean space:

(4.1)  $\gamma : I \longrightarrow \mathbb{R}^n, \qquad \gamma(t) = (\gamma_1(t), \dots, \gamma_n(t)).$

We say γ is differentiable provided each component $\gamma_j$ is, in which case

(4.2)  $\gamma'(t) = (\gamma_1'(t), \dots, \gamma_n'(t)).$

$\gamma'(t)$ is the velocity of γ, at “time” t, and its speed is the magnitude of $\gamma'(t)$:

(4.3)  $|\gamma'(t)| = \sqrt{\gamma_1'(t)^2 + \cdots + \gamma_n'(t)^2}.$

We say γ is smooth of class $C^k$ provided each component $\gamma_j(t)$ has this property.

One also calls the image of I under the map γ a curve in $\mathbb{R}^n$. If u : J → I is continuous, one-to-one, and onto, the map

(4.4)  $\sigma : J \longrightarrow \mathbb{R}^n, \qquad \sigma(t) = \gamma(u(t))$

has the same image as γ. We say σ is a reparametrization of γ. We usually require that u be $C^1$, with $C^1$ inverse. If γ is $C^k$ and u is also $C^k$, so is σ, and the chain rule gives

(4.5)  $\sigma'(t) = u'(t)\, \gamma'(u(t)).$

Let us assume I = [a, b] is a closed, bounded interval, and γ is $C^1$. We want to define the length of this curve. To get started, we take a partition P of [a, b], given by

(4.6)  $a = t_0 < t_1 < \cdots < t_N = b,$

and set

(4.7)  $\ell_P(\gamma) = \sum_{j=1}^{N} |\gamma(t_j) - \gamma(t_{j-1})|.$

We will massage the right side of (4.7) into something that looks like a Riemann sum for $\int_a^b |\gamma'(t)|\,dt$. We have

(4.8)  $\gamma(t_j) - \gamma(t_{j-1}) = \int_{t_{j-1}}^{t_j} \gamma'(t)\,dt = \int_{t_{j-1}}^{t_j} \bigl[ \gamma'(t_j) + \gamma'(t) - \gamma'(t_j) \bigr]\,dt = (t_j - t_{j-1})\, \gamma'(t_j) + \int_{t_{j-1}}^{t_j} \bigl[ \gamma'(t) - \gamma'(t_j) \bigr]\,dt.$


We get

with

(4.10)  $|r_j| \le \int_{t_{j-1}}^{t_j} |\gamma'(t) - \gamma'(t_j)|\,dt.$

Now if $\gamma'$ is continuous on [a, b], so is $|\gamma'|$, and hence both are uniformly continuous on [a, b]. We have

(4.12)  $\ell_P(\gamma) = \sum_{j=1}^{N} |\gamma'(t_j)|\,(t_j - t_{j-1}) + R_P,$

with

Since the sum on the right side of (4.12) is a Riemann sum, we can apply Theorem 2.4 to

get the following.

Proposition 4.1. Assume $\gamma : [a, b] \to \mathbb{R}^n$ is a $C^1$ curve. Then

(4.14)  $\ell_P(\gamma) \longrightarrow \int_a^b |\gamma'(t)|\,dt \quad \text{as } \operatorname{maxsize} P \to 0.$

(4.15)  $\ell(\gamma) = \int_a^b |\gamma'(t)|\,dt.$
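The convergence in (4.14) can be watched numerically. The sketch below assumes the illustrative curve γ(t) = (t, t²/2) on [0, 1], uniform partitions, and a midpoint-rule value for the integral of the speed; the function names are hypothetical.

```python
import math

def polygonal_length(gamma, a, b, n):
    """l_P(gamma) from (4.7) for the partition of [a, b] into n equal pieces."""
    pts = [gamma(a + (b - a) * j / n) for j in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def speed_integral(speed, a, b, n=100000):
    """Midpoint-rule approximation of the integral of |gamma'(t)| over [a, b]."""
    h = (b - a) / n
    return h * sum(speed(a + (j + 0.5) * h) for j in range(n))

gamma = lambda t: (t, 0.5 * t * t)          # illustrative C^1 curve in the plane
speed = lambda t: math.sqrt(1.0 + t * t)    # |gamma'(t)| for this curve

lengths = [polygonal_length(gamma, 0.0, 1.0, n) for n in (10, 100, 1000)]
integral = speed_integral(speed, 0.0, 1.0)
print(lengths, integral)
```

Since each partition here refines the previous one, the polygonal lengths increase toward the integral, by the triangle inequality.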

have from (4.5) that $|\sigma'(t)| = |u'(t)| \cdot |\gamma'(u(t))|$, and the change of variable formula (2.65) for the integral gives

(4.16)  $\int_\alpha^\beta |\sigma'(t)|\,dt = \int_a^b |\gamma'(t)|\,dt,$


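(4.18)  $\ell_\gamma(t) = \int_a^t |\gamma'(s)|\,ds, \qquad \ell_\gamma'(t) = |\gamma'(t)|.$

If we assume also that $\gamma'$ is nowhere vanishing on [a, b], Theorem 1.3, the inverse function theorem, implies that $\ell_\gamma : [a, b] \to [0, \ell(\gamma)]$ has a $C^1$ inverse

(4.20)  $\sigma'(t) = u'(t)\, \gamma'(u(t)) = \frac{1}{\ell_\gamma'(s)}\, \gamma'(u(t)), \quad \text{for } t = \ell_\gamma(s), \ s = u(t),$

(4.21)  $|\sigma'(t)| \equiv 1.$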

by arc length.

We now focus on that most classical example of a curve in the plane $\mathbb{R}^2$, the unit circle

(4.23)  $\gamma_+(t) = (t, \sqrt{1 - t^2}), \qquad \gamma_-(t) = (t, -\sqrt{1 - t^2}),$

on the intersection of $S^1$ with {(x, y) : y > 0} and {(x, y) : y < 0}, respectively. Here $\gamma_\pm : (-1, 1) \to \mathbb{R}^2$, and both maps are smooth. In fact, we can take $\gamma_\pm : [-1, 1] \to \mathbb{R}^2$, but these functions are not differentiable at ±1. We can also parametrize $S^1$ away from (x, y) = (0, ±1), by

(4.24)  $\gamma_\ell(t) = (-\sqrt{1 - t^2}, t), \qquad \gamma_r(t) = (\sqrt{1 - t^2}, t),$

(4.25)  $\gamma_+'(t) = (1, -t(1 - t^2)^{-1/2}),$

so

(4.26)  $|\gamma_+'(t)|^2 = 1 + \frac{t^2}{1 - t^2} = \frac{1}{1 - t^2}.$


(4.27)  $\ell(t) = \int_0^t \frac{1}{\sqrt{1 - s^2}}\,ds, \quad \text{for } 0 < t < 1.$

The same formula holds with γ+ replaced by γ− , γ` , or γr .

We can evaluate the integral (4.27) as a power series in t, as follows. As seen in §3,

(4.28)  $(1 - r)^{-1/2} = \sum_{k=0}^{\infty} \frac{a_k}{k!}\, r^k, \quad \text{for } |r| < 1,$

where

(4.29)  $a_0 = 1, \qquad a_1 = \frac{1}{2}, \qquad a_k = \Bigl( \frac{1}{2} \Bigr) \Bigl( \frac{3}{2} \Bigr) \cdots \Bigl( k - \frac{1}{2} \Bigr).$

The power series converges uniformly on [−ρ, ρ], for each ρ ∈ (0, 1). It follows that

(4.30)  $(1 - s^2)^{-1/2} = \sum_{k=0}^{\infty} \frac{a_k}{k!}\, s^{2k}, \quad |s| < 1,$

uniformly convergent on [−a, a] for each a ∈ (0, 1). Hence we can integrate (4.30) term by term to get

(4.31)  $\ell(t) = \sum_{k=0}^{\infty} \frac{a_k}{k!}\, \frac{t^{2k+1}}{2k + 1}, \quad 0 \le t < 1.$
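The series (4.31) is easy to evaluate numerically. The sketch below rests on an assumption not derived at this point in the text: that the arc of the unit circle from (0, 1) to (1/2, √3/2) is one twelfth of the full circle, so that ℓ(1/2) = π/6. The helper name `arc_length_series` is hypothetical.

```python
import math

def arc_length_series(t, n_terms):
    """Partial sum of (4.31): sum of (a_k / k!) * t^(2k+1) / (2k+1), a_k as in (4.29)."""
    total = 0.0
    coeff = 1.0                      # coeff = a_k / k!, starting with a_0 / 0! = 1
    for k in range(n_terms):
        total += coeff * t ** (2 * k + 1) / (2 * k + 1)
        # a_{k+1} / (k+1)! = (a_k / k!) * (k + 1/2) / (k + 1)
        coeff *= (k + 0.5) / (k + 1)
    return total

# Assumption: l(1/2) equals pi/6, so six times the series value estimates pi.
pi_estimate = 6.0 * arc_length_series(0.5, 30)
print(pi_estimate, math.pi)
```

With t = 1/2 the terms decay like $4^{-k}$, so a few dozen terms already reach double precision.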

One can use (4.27)–(4.31) to get a rapidly convergent infinite series for the number π,

defined as

Since $S^1$ is a smooth curve, it can be parametrized by arc length. We will let $C : \mathbb{R} \to S^1$ be such a parametrization, satisfying

rays from (0, 0) to (1, 0) and from (0, 0) to C(t) make an angle that, measured in radians,

is t. This leads to the standard trigonometrical functions cos t and sin t, defined by


We can evaluate the derivative of C(t) by the following device. Applying d/dt to the

identity

since both |C(t)| ≡ 1 and |C 0 (t)| ≡ 1, (4.35) allows only two possibilities. Either

or

(4.38)  $\frac{d}{dt} \cos t = -\sin t, \qquad \frac{d}{dt} \sin t = \cos t.$

One can think of cos t and sin t as special functions arising to analyze the length of arcs

in the circle. Related special functions arise to analyze the length of portions of a parabola

in R2 , say the graph of

(4.39)  $y = \frac{1}{2} x^2.$

(4.40)  $\gamma(t) = \Bigl( t, \frac{1}{2} t^2 \Bigr),$

so

(4.42)  $\ell_\gamma(t) = \int_0^t \sqrt{1 + s^2}\,ds.$

Methods to evaluate the integral in (4.42) are provided in §5. See Exercise 10 of §5.


The study of lengths of other curves has stimulated much work in analysis. Another

example is the ellipse

(4.43)  $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,$

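given a, b ∈ (0, ∞). This curve is parametrized by

(4.44)  $\gamma(t) = (a \cos t, b \sin t)$.

In this case, by (4.38), $\gamma'(t) = (-a \sin t, b \cos t)$, so

(4.45)  $|\gamma'(t)|^2 = a^2 \sin^2 t + b^2 \cos^2 t = b^2 + \gamma \sin^2 t$, where the constant $\gamma$ denotes $a^2 - b^2$,

and hence

(4.46)  $\ell_\gamma(t) = b \int_0^t \sqrt{1 + \sigma \sin^2 s}\, ds, \quad \sigma = \frac{\gamma}{b^2}$.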

If a ≠ b, this is called an elliptic integral, and it gives rise to a more subtle family of

special functions, called elliptic functions. Material on this can be found in §33 of [T3],

Introduction to Complex Analysis.

We end this section with a brief discussion of curves in polar coordinates. We define a

map

We say (r, θ) are polar coordinates of $(x, y) \in \mathbb{R}^2$ if Π(r, θ) = (x, y). Now, Π in (4.47) is

not bijective, since

and Π(0, θ) is independent of θ. So polar coordinates are not unique, but we will not

belabor this point. The point we make is that an equation

(4.51)  $\rho(\theta) = a \cos \theta, \quad -\frac{\pi}{2} \le \theta \le \frac{\pi}{2}$,

To compute the arc length of (4.50), we note that, by (4.38),

(4.53)  $x(t) = \rho(t) \cos t, \ y(t) = \rho(t) \sin t \ \Rightarrow\ x'(t) = \rho'(t) \cos t - \rho(t) \sin t, \ y'(t) = \rho'(t) \sin t + \rho(t) \cos t$,

hence

(4.54)  $x'(t)^2 + y'(t)^2 = \rho'(t)^2 \cos^2 t - 2\rho(t)\rho'(t) \cos t \sin t + \rho(t)^2 \sin^2 t + \rho'(t)^2 \sin^2 t + 2\rho(t)\rho'(t) \sin t \cos t + \rho(t)^2 \cos^2 t = \rho'(t)^2 + \rho(t)^2$.

Therefore

(4.55)  $\ell(\gamma) = \int_a^b |\gamma'(t)|\, dt = \int_a^b \sqrt{\rho(t)^2 + \rho'(t)^2}\, dt$.
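As a numerical illustration of the polar arc length formula (4.55), the following sketch applies a midpoint rule to the curve (4.51), for which the integrand reduces to the constant a, so the length should be πa. The function name `polar_arc_length`, the step count, and the sample value a = 3 are choices made here for illustration; they do not come from the text.

```python
import math

def polar_arc_length(rho, drho, a, b, n=100_000):
    """Midpoint-rule approximation of the integral in (4.55).

    rho and drho are the polar radius function and its derivative
    (hypothetical helper arguments, supplied by the caller).
    """
    h = (b - a) / n
    total = 0.0
    for j in range(n):
        t = a + (j + 0.5) * h
        total += math.sqrt(rho(t) ** 2 + drho(t) ** 2)
    return total * h

a_param = 3.0  # sample value of the constant a in (4.51)
length = polar_arc_length(
    lambda t: a_param * math.cos(t),    # rho(t) = a cos t
    lambda t: -a_param * math.sin(t),   # rho'(t) = -a sin t
    -math.pi / 2,
    math.pi / 2,
)
print(length, math.pi * a_param)
```

The two printed numbers should agree to roundoff, consistent with (4.51) tracing a circle of radius a/2 once.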

Exercises

γ(t) = (a cos t, a sin t, bt)

is a helix. Compute the length of γ([0, t]).

3. Let

$\gamma(t) = \left(t, \frac{2\sqrt{2}}{3}\, t^{3/2}, \frac{1}{2} t^2\right)$.

Compute the length of γ([0, t]).

4. In case b > a for the ellipse (4.44), the length formula (4.46) becomes

$\ell_\gamma(t) = b \int_0^t \sqrt{1 - \beta^2 \sin^2 s}\, ds, \quad \beta^2 = \frac{b^2 - a^2}{b^2} \in (0, 1)$.

Apply the change of variable x = sin s to this integral (cf. (2.46)), and write out the

resulting integral.

Deduce this from the definition (4.31A) of π, together with the characterization of C(t) in (4.33) as the unit speed parametrization of $S^1$, satisfying (4.32). For a more general identity, see (5.44).

$\gamma(t) = (a \cos^2 t, \ a \cos t \sin t), \quad -\frac{\pi}{2} \le t \le \frac{\pi}{2}$.

$\gamma(t) = \left(\frac{a}{2} + \frac{a}{2} \cos 2t, \ \frac{a}{2} \sin 2t\right)$.

Verify that this traces out a circle of radius a/2, centered at (a/2, 0).

7. Use (4.55) to write the arc length of the curve given by (4.52) as an integral. Show this

integral has the same general form as (4.45)–(4.46).

5. Exponential and trigonometric functions

The exponential function is one of the central objects of analysis. In this section we define the exponential function, both for real and complex arguments, and establish a number of basic properties, including fundamental connections to the trigonometric functions.

We construct the exponential function to solve the differential equation

(5.1)  $\frac{dx}{dt} = x, \quad x(0) = 1$.

We seek a solution as a power series

(5.2)  $x(t) = \sum_{k=0}^{\infty} a_k t^k$.

In such a case, if this series converges for |t| < R, then, by Proposition 3.2,

(5.3)  $x'(t) = \sum_{k=1}^{\infty} k a_k t^{k-1} = \sum_{\ell=0}^{\infty} (\ell + 1) a_{\ell+1} t^{\ell}$,

so for (5.1) to hold we need

(5.4)  $a_0 = 1, \quad a_{k+1} = \frac{a_k}{k+1}$,

i.e., $a_k = 1/k!$, where $k! = k(k-1) \cdots 2 \cdot 1$. Thus (5.1) is solved by

(5.5)  $x(t) = e^t = \sum_{k=0}^{\infty} \frac{1}{k!}\, t^k$, $t \in \mathbb{R}$.

More generally, we can define

(5.6)  $e^z = \sum_{k=0}^{\infty} \frac{1}{k!}\, z^k$, $z \in \mathbb{C}$.

The ratio test then shows that the series (5.6) is absolutely convergent for all z ∈ C, and

uniformly convergent for |z| ≤ R, for each R < ∞. Note that, again by Proposition 3.2,

(5.7)  $e^{at} = \sum_{k=0}^{\infty} \frac{a^k}{k!}\, t^k$

solves

(5.8)  $\frac{d}{dt} e^{at} = a e^{at}$,

and this works for each a ∈ C.
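The series (5.6) is easy to test numerically. The sketch below sums the first terms of the series for a real and a complex argument and compares the result with the library exponential; the function name `exp_series` and the cutoff of 60 terms are choices made here, not part of the text.

```python
import cmath
import math

def exp_series(z, terms=60):
    """Partial sum of the power series (5.6), sum of z^k / k! for k < terms."""
    total = 0j
    term = 1 + 0j  # z^0 / 0!
    for k in range(terms):
        total += term
        term *= z / (k + 1)  # turns z^k / k! into z^(k+1) / (k+1)!
    return total

print(exp_series(1.0).real, math.e)
print(exp_series(2 + 3j), cmath.exp(2 + 3j))
```

Since the terms decay factorially, a fixed cutoff of this size is ample for arguments of moderate modulus; larger |z| would need more terms.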

We claim that $e^{at}$ is the unique solution to

(5.9)  $\frac{dy}{dt} = ay, \quad y(0) = 1$.

To see this, compute the derivative of $e^{-at} y(t)$:

(5.10)  $\frac{d}{dt} \left( e^{-at} y(t) \right) = -a e^{-at} y(t) + e^{-at} a y(t) = 0$,

where we use the product rule, (5.8) (with a replaced by −a) and (5.9). Thus $e^{-at} y(t)$ is independent of t. Evaluating at t = 0 gives

(5.11)  $e^{-at} y(t) = 1$, ∀ t ∈ R,

whenever y(t) solves (5.9). Since $e^{at}$ solves (5.9), we have $e^{-at} e^{at} = 1$, hence

(5.12)  $e^{-at} = \frac{1}{e^{at}}$, ∀ t ∈ R, a ∈ C.

Thus multiplying both sides of (5.11) by $e^{at}$ gives the asserted uniqueness:

(5.13)  $y(t) = e^{at}$, ∀ t ∈ R.

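We can draw further useful conclusions from applying d/dt to products of exponential functions. In fact, let a, b ∈ C; then

(5.14)  $\frac{d}{dt} \left( e^{-at} e^{-bt} e^{(a+b)t} \right) = -a e^{-at} e^{-bt} e^{(a+b)t} - b e^{-at} e^{-bt} e^{(a+b)t} + (a + b) e^{-at} e^{-bt} e^{(a+b)t} = 0$,

so again we are differentiating a function that is independent of t. Evaluation at t = 0 gives

(5.15)  $e^{-at} e^{-bt} e^{(a+b)t} = 1$, ∀ t ∈ R.

Using (5.12), we get

(5.16)  $e^{(a+b)t} = e^{at} e^{bt}$, ∀ t ∈ R, a, b ∈ C,

or, setting t = 1,

(5.17)  $e^{a+b} = e^a e^b$, ∀ a, b ∈ C.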

We next record some properties of $\exp(t) = e^t$ for real t. The power series (5.5) clearly gives $e^t > 0$ for t ≥ 0. Since $e^{-t} = 1/e^t$, we see that $e^t > 0$ for all t ∈ R. Since $de^t/dt = e^t > 0$, the function is monotone increasing in t, and since $d^2 e^t/dt^2 = e^t > 0$, this function is convex. (See Proposition 1.5 and the remark that follows it.) Note that, for t > 0,

(5.18)  $e^t = 1 + t + \frac{t^2}{2} + \cdots > 1 + t \nearrow +\infty$,

as $t \nearrow \infty$. Hence

(5.19)  $\lim_{t \to +\infty} e^t = +\infty$.

Since $e^{-t} = 1/e^t$,

(5.20)  $\lim_{t \to -\infty} e^t = 0$.

As a consequence,

(5.22) L : (0, ∞) −→ R.

Applying d/dt to

(5.24)  $L(e^t) = t$

gives

(5.25)  $L'(e^t) e^t = 1$, hence $L'(e^t) = \frac{1}{e^t}$,

i.e.,

(5.26)  $\frac{d}{dx} \log x = \frac{1}{x}$.

Figure 5.1

Figure 5.2

(5.27)  $\log x = \int_1^x \frac{dy}{y}$.

An immediate consequence of (5.17) (for a, b ∈ R) is the identity

(5.28) log xy = log x + log y, x, y ∈ (0, ∞).

We move on to a study of $e^z$ for purely imaginary z, i.e., of

(5.29)  $\gamma(t) = e^{it}$, t ∈ R.

This traces out a curve in the complex plane, and we want to understand which curve it

is. Let us set

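(5.30)  $e^{it} = c(t) + i s(t)$,

with c(t) and s(t) real valued. First we calculate $|e^{it}|^2 = c(t)^2 + s(t)^2$. For x, y ∈ R,

(5.31)  $z = x + iy \ \Longrightarrow\ \bar{z} = x - iy \ \Longrightarrow\ z \bar{z} = x^2 + y^2 = |z|^2$.

It is elementary that

(5.32)  $z, w \in \mathbb{C} \ \Longrightarrow\ \overline{zw} = \bar{z}\, \bar{w} \ \Longrightarrow\ \overline{z^n} = \bar{z}^n$, and $\overline{z + w} = \bar{z} + \bar{w}$.

Hence

(5.33)  $\overline{e^z} = \sum_{k=0}^{\infty} \frac{\bar{z}^k}{k!} = e^{\bar{z}}$.

In particular,

(5.34)  $t \in \mathbb{R} \ \Longrightarrow\ |e^{it}|^2 = e^{it} e^{-it} = 1$.

Hence $t \mapsto \gamma(t) = e^{it}$ traces out the unit circle centered at the origin in C. Also

(5.35)  $\gamma'(t) = i e^{it} \ \Longrightarrow\ |\gamma'(t)| \equiv 1$,

so γ(t) moves at unit speed on the unit circle. We have

(5.36)  $\gamma(0) = 1, \quad \gamma'(0) = i$.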

Thus, for moderate t > 0, the arc from γ(0) to γ(t) is an arc on the unit circle, pictured

in Figure 5.3, of length

(5.37)  $\ell(t) = \int_0^t |\gamma'(s)|\, ds = t$.

Figure 5.3

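In other words, $\gamma(t) = e^{it}$ is the parametrization of the unit circle by arc length, introduced in (4.32). As in (4.33), standard definitions from trigonometry give

(5.38)  $\cos t = c(t), \quad \sin t = s(t)$.

Thus (5.30) becomes

(5.39)  $e^{it} = \cos t + i \sin t$,

which is Euler's formula. The identity

(5.40)  $\frac{d}{dt} e^{it} = i e^{it}$,

applied to (5.39), yields

(5.41)  $\frac{d}{dt} \cos t = -\sin t, \quad \frac{d}{dt} \sin t = \cos t$.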

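Compare the derivation of (4.38). We can use (5.17) to derive formulas for sin and cos of the sum of two angles. Indeed, comparing

(5.42)  $e^{i(s+t)} = \cos(s + t) + i \sin(s + t)$

with

(5.43)  $e^{is} e^{it} = (\cos s + i \sin s)(\cos t + i \sin t)$

gives

(5.44)  $\cos(s + t) = (\cos s)(\cos t) - (\sin s)(\sin t)$, $\quad \sin(s + t) = (\sin s)(\cos t) + (\cos s)(\sin t)$.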
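The addition formulas for sine and cosine can be checked numerically by comparing both sides of the multiplicative identity for $e^{i(s+t)}$. The sketch below does this for one pair of sample angles; the values of `s` and `t` are arbitrary choices made here.

```python
import cmath
import math

s, t = 0.7, 1.9  # arbitrary sample angles

lhs = cmath.exp(1j * (s + t))              # e^{i(s+t)}
rhs = cmath.exp(1j * s) * cmath.exp(1j * t)  # e^{is} e^{it}

# Real and imaginary parts of the product (cos s + i sin s)(cos t + i sin t)
cos_sum = math.cos(s) * math.cos(t) - math.sin(s) * math.sin(t)
sin_sum = math.sin(s) * math.cos(t) + math.cos(s) * math.sin(t)

print(lhs, rhs)
print(cos_sum, math.cos(s + t))
print(sin_sum, math.sin(s + t))
```

Each printed pair should agree to roundoff. This is a spot check for one choice of angles, not a proof.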

Exercises.

1. Show that

(5.45)  $|t| < 1 \ \Rightarrow\ \log(1 + t) = \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k}\, t^k = t - \frac{t^2}{2} + \frac{t^3}{3} - \cdots$.

$\log(1 + t) = \int_0^t \frac{ds}{1 + s}$,

expand

$\frac{1}{1 + s} = 1 - s + s^2 - s^3 + \cdots$, $|s| < 1$,

and integrate term by term.

2. In §4, π was defined to be half the length of the unit circle $S^1$. Equivalently, π is the smallest positive number such that $e^{\pi i} = -1$. Show that

$e^{\pi i/2} = i, \quad e^{\pi i/3} = \frac{1}{2} + \frac{\sqrt{3}}{2}\, i$.

Figure 5.4

3. Show that

$\cos^2 t + \sin^2 t = 1$,

and

$1 + \tan^2 t = \sec^2 t$,

where

$\tan t = \frac{\sin t}{\cos t}, \quad \sec t = \frac{1}{\cos t}$.

4. Show that

$\frac{d}{dt} \tan t = \sec^2 t = 1 + \tan^2 t$,

$\frac{d}{dt} \sec t = \sec t \tan t$.

5. Evaluate

$\int_0^y \frac{dx}{1 + x^2}$.

Hint. Set x = tan t.

6. Evaluate

$\int_0^y \frac{dx}{\sqrt{1 - x^2}}$.

Hint. Set x = sin t.

7. Show that

$\frac{\pi}{6} = \int_0^{1/2} \frac{dx}{\sqrt{1 - x^2}}$.

Use (4.27)–(4.31) to obtain a rapidly convergent infinite series for π.

Hint. Show that sin π/6 = 1/2. Use Exercise 2 and the identity $e^{\pi i/6} = e^{\pi i/2} e^{-\pi i/3}$. Note that $a_k$ in (4.29)–(4.31) satisfies $a_{k+1} = (k + 1/2) a_k$. Deduce that

(5.45A)  $\pi = \sum_{k=0}^{\infty} \frac{b_k}{2k + 1}, \quad b_0 = 3, \quad b_{k+1} = \frac{1}{4}\, \frac{2k + 1}{2k + 2}\, b_k$.
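The recursion in (5.45A) translates directly into a short loop. The following sketch sums the series in floating point; the function name `pi_arcsine_series` and the default of 40 terms are choices made here. Since $b_k$ shrinks roughly like $4^{-k}$, a few dozen terms should exhaust double precision.

```python
import math

def pi_arcsine_series(terms=40):
    """Partial sum of (5.45A): pi = sum of b_k / (2k + 1)."""
    b = 3.0       # b_0
    total = 0.0
    for k in range(terms):
        total += b / (2 * k + 1)
        b *= (2 * k + 1) / (4 * (2 * k + 2))  # b_{k+1} = (1/4) (2k+1)/(2k+2) b_k
    return total

print(pi_arcsine_series(), math.pi)
```

Note that the loop uses only rational arithmetic on the coefficients, with no square roots, which is the advantage of this series over the Archimedes iteration discussed in Appendix D.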

8. Set

$\cosh t = \frac{1}{2}(e^t + e^{-t}), \quad \sinh t = \frac{1}{2}(e^t - e^{-t})$.

Show that

$\frac{d}{dt} \cosh t = \sinh t, \quad \frac{d}{dt} \sinh t = \cosh t$,

and

$\cosh^2 t - \sinh^2 t = 1$.

9. Evaluate

$\int_0^y \frac{dx}{\sqrt{1 + x^2}}$.

Hint. Set x = sinh t.

10. Evaluate

$\int_0^y \sqrt{1 + x^2}\, dx$.

$\frac{d}{dt}(\sec t + \tan t) = \sec t (\sec t + \tan t)$,

$\frac{d}{dt}(\sec t \tan t) = \sec^3 t + \sec t \tan^2 t = 2 \sec^3 t - \sec t$.

$\frac{d}{dt} \log |\sec t| = \tan t$,

$\frac{d}{dt} \log |\sec t + \tan t| = \sec t$.

$\int \tan t\, dt = \log |\sec t|$,

$\int \sec t\, dt = \log |\sec t + \tan t|$,

$2 \int \sec^3 t\, dt = \sec t \tan t + \int \sec t\, dt$.

(Here and below, we omit the arbitrary additive constants.) See Exercises 40–43 for other

approaches to evaluating these and related integrals.

14. Here is another approach to the evaluation of $\int \sec t\, dt$. Using Exercise 8 and the chain rule, show that

$\frac{d}{du} \cosh^{-1} u = \frac{1}{\sqrt{u^2 - 1}}$.

Take u = sec t and use Exercises 3–4 to get

$\frac{d}{dt} \cosh^{-1}(\sec t) = \frac{\sec t \tan t}{\tan t} = \sec t$,

hence

$\int \sec t\, dt = \cosh^{-1}(\sec t)$.

$E_n^a(t) = \sum_{k=0}^{n} \frac{a^k}{k!}\, t^k$ satisfies $\frac{d}{dt} E_n^a(t) = a E_{n-1}^a(t)$.

$\frac{d}{dt} \left( e^{-at} E_n^a(t) \right) = -\frac{a^{n+1}}{n!}\, t^n e^{-at}$.

16. Use Exercise 15 and the fundamental theorem of calculus to show that

$\int t^n e^{-at}\, dt = -\frac{n!}{a^{n+1}}\, E_n^a(t) e^{-at} = -\frac{n!}{a^{n+1}} \left( 1 + at + \frac{a^2 t^2}{2!} + \cdots + \frac{a^n t^n}{n!} \right) e^{-at}$.

$\int t^n \cos t\, dt$ and $\int t^n \sin t\, dt$.

Exercises on $x^r$

In §1, we defined $x^r$ for x > 0 and r ∈ Q. Now we define $x^r$ for x > 0 and r ∈ C, as follows:

(5.46)  $x^r = e^{r \log x}$.

and deduce that $x^{1/n}$, defined by (5.46), coincides with $x^{1/n}$ as defined in §1.

20. Show that $x^r$, defined by (5.46), coincides with $x^r$ as defined in §1, for all r ∈ Q.

$\frac{d}{dx} x^r = r x^{r-1}$, ∀ x > 0.

22A. For y > 0, evaluate $\int_0^y \cos(\log x)\, dx$ and $\int_0^y \sin(\log x)\, dx$.

Hint. Deduce from (5.46) and Euler’s formula that

$\cos(\log x) + i \sin(\log x) = x^i$.

$r_j \to r \ \Longrightarrow\ x^{r_j} \to x^r$.

$\frac{d}{dx} a^x$, x ∈ R.

25. Compute

$\frac{d}{dx} x^x$, x > 0.

$x^{1/x} \to 1$, as x → ∞.

Hint. Show that

$\frac{\log x}{x} \to 0$, as x → ∞.

$\int_0^1 x^x\, dx = \int_0^1 e^{x \log x}\, dx = \int_0^{\infty} e^{-y e^{-y}} e^{-y}\, dy = \sum_{n=0}^{\infty} \int_0^{\infty} \frac{(-1)^n}{n!}\, y^n e^{-(n+1)y}\, dy$.

$\int_0^{\infty} y^n e^{-\alpha y}\, dy = (-1)^n F^{(n)}(\alpha)$,

where

$F(\alpha) = \int_0^{\infty} e^{-\alpha y}\, dy = \frac{1}{\alpha}$.

$\int_0^1 x^x\, dx = \sum_{n=0}^{\infty} (-1)^n (n + 1)^{-(n+1)} = 1 - \frac{1}{2^2} + \frac{1}{3^3} - \frac{1}{4^4} + \cdots$.
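The series for $\int_0^1 x^x\, dx$ can be compared with a direct numerical quadrature. The sketch below sums the alternating series and evaluates the integral by a midpoint rule; the function names and the step counts are choices made here, and the midpoint rule is only a rough independent check.

```python
def series_value(terms=20):
    """Partial sum of sum_{n>=0} (-1)^n (n+1)^{-(n+1)}."""
    return sum((-1) ** n * (n + 1) ** (-(n + 1)) for n in range(terms))

def midpoint_integral(n=200_000):
    """Midpoint-rule approximation of the integral of x^x over [0, 1]."""
    h = 1.0 / n
    return h * sum(((j + 0.5) * h) ** ((j + 0.5) * h) for j in range(n))

print(series_value(), midpoint_integral())
```

The series terms decay extremely fast, so twenty terms are far more than double precision can resolve, whereas the quadrature needs many function evaluations to reach comparable accuracy.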

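$\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} = \log 2$.

$\sum_{k=1}^{N} \frac{(-1)^{k-1}}{k}\, t^k = \log(1 + t) + r_N(t), \quad |r_N(t)| \le \frac{t^{N+1}}{N + 1}$,

and let $t \nearrow 1$.

$\sum_{k=0}^{\infty} \frac{(-1)^k}{2k + 1} = \frac{\pi}{4}$.

$\tan^{-1} y = \sum_{k=0}^{\infty} \frac{(-1)^k}{2k + 1}\, y^{2k+1}$, for −1 < y < 1.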

An alternative approach is to define log : (0, ∞) → R first and derive some of its properties, and then define the exponential function Exp : R → (0, ∞) as its inverse. The following exercises describe how to implement this. To start, we take (5.27) as a definition:

(5.47)  $\log x = \int_1^x \frac{dy}{y}$, x > 0.

Also show

(5.49)  $\log \frac{1}{x} = -\log x$, ∀ x > 0.

(5.50)  $\frac{d}{dx} \log x = \frac{1}{x}$, x > 0.

(Hint. See the hint for Exercise 15 in §2.)

Then show that log x → −∞ as x → 0.

35. Deduce from Exercises 33 and 34, together with Theorem 1.3, that

(5.51)  $e^{s+t} = e^s e^t$, ∀ s, t ∈ R.

(5.52)  $\frac{d}{dt} e^t = e^t$, ∀ t ∈ R.

As a consequence,

(5.53)  $\frac{d^n}{dt^n} e^t = e^t$, ∀ t ∈ R, n ∈ N.

38. Note that $e^0 = 1$, since log 1 = 0. Deduce from (5.53), together with the power series formulas (3.34) and (3.40), that, for all t ∈ R, n ∈ N,

(5.54)  $e^t = \sum_{k=0}^{n} \frac{1}{k!}\, t^k + R_n(t)$,

where

(5.55)  $R_n(t) = \frac{t^{n+1}}{(n + 1)!}\, e^{\zeta_n}$,

(5.56)  $e^t = \sum_{k=0}^{\infty} \frac{1}{k!}\, t^k$, ∀ t ∈ R.

Remark. Exercises 35–39 develop et only for t ∈ R. At this point, it is natural to segue

to (5.6) and from there to arguments involving (5.7)–(5.17), and then on to (5.29)–(5.41),

renewing contact with the trigonometric functions.

(5.57)  $\int R(\cos \theta, \sin \theta)\, d\theta$.

$d\theta = 2\, \frac{dx}{1 + x^2}, \quad \cos \theta = \frac{1 - x^2}{1 + x^2}, \quad \sin \theta = \frac{2x}{1 + x^2}$.

(5.58)  $2 \int R\left( \frac{1 - x^2}{1 + x^2}, \frac{2x}{1 + x^2} \right) \frac{dx}{1 + x^2}$.

$\int \frac{1}{\sin \theta}\, d\theta$, and $\int \frac{1}{\cos \theta}\, d\theta$.

$\int \sec^{2k+1} \theta\, d\theta = \int \frac{dt}{(1 - t^2)^{k+1}}$.

Compare what you get by the methods of Exercises 40–42, and also (for k = 0, 1) those of Exercise 13.

Hint. $\sec^{2k+1} \theta = (\cos \theta)/(1 - \sin^2 \theta)^{k+1}$.

6. Unbounded integrable functions

There are lots of unbounded functions we would like to be able to integrate. For example, consider $f(x) = x^{-1/2}$ on (0, 1] (defined any way you like at x = 0). Since, for ε ∈ (0, 1),

(6.1)  $\int_{\varepsilon}^1 x^{-1/2}\, dx = 2 - 2\sqrt{\varepsilon}$,

it is natural to set

(6.2)  $\int_0^1 x^{-1/2}\, dx = 2$.

Sometimes (6.2) is called an improper integral, but we do not consider that to be a proper designation. Here, we define a class $\mathcal{R}^{\#}(I)$ of not necessarily bounded “integrable” functions on an interval I = [a, b], as follows.

First, assume f ≥ 0 on I, and for A ∈ (0, ∞), set

(6.3)  $f_A(x) = \begin{cases} f(x), & \text{if } f(x) \le A, \\ A, & \text{if } f(x) > A. \end{cases}$

(6.4)  ∃ uniform bound $\int_I f_A\, dx \le M$.

If f ≥ 0 satisfies (6.4), then $\int_I f_A\, dx$ increases monotonically to a finite limit as $A \nearrow +\infty$, and we call the limit $\int_I f\, dx$:

(6.5)  $\int_I f_A\, dx \nearrow \int_I f\, dx$, for $f \in \mathcal{R}^{\#}(I)$, f ≥ 0.
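To see the monotone limit (6.5) in a concrete case, take $f(x) = x^{-1/2}$ on [0, 1]. A direct calculation gives $\int_0^1 f_A\, dx = 2 - 1/A$ for A ≥ 1, which increases to 2. The sketch below checks this with a midpoint rule; the function name `truncated_integral` and the step count are choices made here for illustration.

```python
def truncated_integral(A, n=200_000):
    """Midpoint-rule value of the integral over [0, 1] of min(x^(-1/2), A)."""
    h = 1.0 / n
    return h * sum(min(((j + 0.5) * h) ** -0.5, A) for j in range(n))

for A in (1, 10, 100):
    # numerical value next to the closed form 2 - 1/A
    print(A, truncated_integral(A), 2 - 1 / A)
```

The truncated integrands are bounded, so the ordinary Riemann integral applies to each, and the values approach 2 from below as A grows.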

We also use the notation $\int_a^b f\, dx$, if I = [a, b]. If I is understood, we might just write $\int f\, dx$. It is valuable to have the following.

Proposition 6.1. If $f, g : I \to \mathbb{R}^+$ are in $\mathcal{R}^{\#}(I)$, then $f + g \in \mathcal{R}^{\#}(I)$, and

(6.6)  $\int_I (f + g)\, dx = \int_I f\, dx + \int_I g\, dx$.

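Hence $(f + g)_A \in \mathcal{R}(I)$ and $\int (f + g)_A\, dx \le \int f_A\, dx + \int g_A\, dx \le \int f\, dx + \int g\, dx$, so we have $f + g \in \mathcal{R}^{\#}(I)$ and

(6.8)  $\int (f + g)\, dx \le \int f\, dx + \int g\, dx$.

(6.9)  $\int (f + g)\, dx \ge \int f_A\, dx + \int g_A\, dx$,

(6.10)  $\int (f + g)\, dx \ge \int f\, dx + \int g\, dx$.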

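Next, we take f : I → R and set

(6.11)  $f^+(x) = \begin{cases} f(x), & \text{if } f(x) \ge 0, \\ 0, & \text{if } f(x) < 0, \end{cases} \qquad f^- = f^+ - f$.

Then we say

(6.12)  $f \in \mathcal{R}^{\#}(I) \iff f^+, f^- \in \mathcal{R}^{\#}(I)$,

and set

(6.13)  $\int_I f\, dx = \int_I f^+\, dx - \int_I f^-\, dx$,

where the two terms on the right are defined as in (6.5). To extend the additivity, we begin as follows.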

Proposition 6.2. Assume that $g \in \mathcal{R}^{\#}(I)$ and that $g_j \ge 0$, $g_j \in \mathcal{R}^{\#}(I)$, and

(6.14)  $g = g_0 - g_1$.

Then

(6.15)  $\int g\, dx = \int g_0\, dx - \int g_1\, dx$.

(6.16)  $g^+ + g_1 = g_0 + g^-$,

(6.17)  $\int g^+\, dx + \int g_1\, dx = \int g_0\, dx + \int g^-\, dx$.

This implies

(6.18)  $\int g^+\, dx - \int g^-\, dx = \int g_0\, dx - \int g_1\, dx$,

We now extend additivity.

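Proposition 6.3. Assume $f_1, f_2 \in \mathcal{R}^{\#}(I)$. Then $f_1 + f_2 \in \mathcal{R}^{\#}(I)$ and

(6.19)  $\int_I (f_1 + f_2)\, dx = \int_I f_1\, dx + \int_I f_2\, dx$.

Proof. Set

(6.20)  $g_0 = f_1^+ + f_2^+, \quad g_1 = f_1^- + f_2^-$,

so that $f_1 + f_2 = g_0 - g_1$. Then

(6.21)  $\int (f_1 + f_2)\, dx = \int g_0\, dx - \int g_1\, dx = \int (f_1^+ + f_2^+)\, dx - \int (f_1^- + f_2^-)\, dx = \int f_1^+\, dx + \int f_2^+\, dx - \int f_1^-\, dx - \int f_2^-\, dx$,

the first equality by Proposition 6.2, the second tautologically, and the third by Proposition 6.1. Since

(6.22)  $\int f_j\, dx = \int f_j^+\, dx - \int f_j^-\, dx$,

this gives (6.19).

If f : I → C, we set $f = f_1 + i f_2$, $f_j : I \to \mathbb{R}$, and say $f \in \mathcal{R}^{\#}(I)$ if and only if $f_1$ and $f_2$ belong to $\mathcal{R}^{\#}(I)$. Then we set

(6.23)  $\int f\, dx = \int f_1\, dx + i \int f_2\, dx$.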

Given $f \in \mathcal{R}^{\#}(I)$, we set

(6.24)  $\|f\|_{L^1(I)} = \int_I |f(x)|\, dx$.

and

(6.26)  $\|f + g\|_{L^1(I)} = \int_I |f + g|\, dx \le \int_I (|f| + |g|)\, dx = \|f\|_{L^1(I)} + \|g\|_{L^1(I)}$.

Note that, if S ⊂ I,

(6.27)  $\operatorname{cont}^+(S) = 0 \ \Longrightarrow\ \int_I |\chi_S|\, dx = 0$,

where $\operatorname{cont}^+(S)$ is defined by (2.21). Thus, to get a metric, we need to form equivalence classes. The set of equivalence classes [f] of elements of $\mathcal{R}^{\#}(I)$, where

(6.28)  $f \sim \tilde{f} \iff \int_I |f - \tilde{f}|\, dx = 0$,

However, this metric space is not complete. One needs the Lebesgue integral to obtain a

complete metric space. One can see [Fol] or [T1].

We next show that each $f \in \mathcal{R}^{\#}(I)$ can be approximated in $L^1$ by a sequence of bounded, Riemann integrable functions.

Proposition 6.4. If $f \in \mathcal{R}^{\#}(I)$, then there exist $f_k \in \mathcal{R}(I)$ such that

(6.30)  $\|f - f_k\|_{L^1(I)} \to 0$, as k → ∞.

f, so it suffices to treat the case where f is real. Similarly, writing $f = f^+ - f^-$, we see that it suffices to treat the case where f ≥ 0 on I. For such f, simply take

(6.31)  $f_k = f_A$, A = k,

(6.32)  $\int_I f_k\, dx \nearrow \int_I f\, dx$,

(6.33)  $\int_I |f - f_k|\, dx = \int_I (f - f_k)\, dx = \int_I f\, dx - \int_I f_k\, dx \to 0$ as k → ∞.

So far, we have dealt with integrable functions on a bounded interval. Now, we say f : R → R (or C, or $\mathbb{R}^n$) belongs to $\mathcal{R}^{\#}(\mathbb{R})$ provided $f|_I \in \mathcal{R}^{\#}(I)$ for each closed, bounded interval I ⊂ R and

(6.34)  ∃ A < ∞ such that $\int_{-R}^{R} |f|\, dx \le A$, ∀ R < ∞.

(6.35)  $\int_{-\infty}^{\infty} f\, dx = \lim_{R \to \infty} \int_{-R}^{R} f\, dx$.

Exercises

$f \in \mathcal{R}^{\#}([0, 1]) \iff \int_{\varepsilon}^1 f\, dx$ is bounded as $\varepsilon \searrow 0$.

$\int_0^1 f\, dx = \lim_{\varepsilon \to 0} \int_{\varepsilon}^1 f\, dx$.

2. Let a > 0. Define $p_a : [0, 1] \to \mathbb{R}$ by $p_a(x) = x^{-a}$ if 0 < x ≤ 1. Set $p_a(0) = 0$. Show that

$q_b(x) = \frac{1}{x |\log x|^b}$,

$af \in \mathcal{R}^{\#}(I)$, and $\int af\, dx = a \int f\, dx$.

5. Show that

$f \in \mathcal{R}(I), \ g \in \mathcal{R}^{\#}(I) \ \Longrightarrow\ fg \in \mathcal{R}^{\#}(I)$.

Hint. Use (2.53). First treat the case f, g ≥ 1, f ≤ M. Show that in such a case,

6. Compute

$\int_0^1 \log t\, dt$.

Hint. To compute $\int_{\varepsilon}^1 \log t\, dt$, first compute

$\frac{d}{dt}(t \log t)$.

$\|g - g_k\|_{L^1(I)} \to 0$.

$\|h - h_k\|_{L^1(I)} \to 0$.

8. Using Exercise 7 and Proposition 6.4, prove the following: given $f \in \mathcal{R}^{\#}(I)$, there exist $f_k \in C(I)$ such that

$\|f - f_k\|_{L^1(I)} \to 0$.

9. Recall Exercise 4 of §2. If $\varphi : [a, b] \to [A, B]$ is $C^1$, with $\varphi'(x) > 0$ for all x ∈ [a, b], then

(6.36)  $\int_A^B f(y)\, dy = \int_a^b f(\varphi(t)) \varphi'(t)\, dt$,

for each f ∈ C([A, B]), where A = φ(a), B = φ(b). Using Exercise 8, show that (6.36) holds for each $f \in \mathcal{R}^{\#}([A, B])$.

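10. If $f \in \mathcal{R}^{\#}(\mathbb{R})$, so (6.34) holds, prove that the limit exists in (6.35).

11. Given $f(x) = x^{-1/2} (1 + x^2)^{-1}$ for x > 0, show that $f \in \mathcal{R}^{\#}(\mathbb{R}^+)$. Show that

$\int_0^{\infty} \frac{1}{1 + x^2}\, \frac{dx}{\sqrt{x}} = 2 \int_0^{\infty} \frac{dy}{1 + y^4}$.

Given ε > 0, ∃ contented $S_{\varepsilon} \subset [a, b]$ such that

(b)  $\int_{S_{\varepsilon}} g\, dx < \varepsilon$, and $f_k \to f$ uniformly on $[a, b] \setminus S_{\varepsilon}$.

$\int_a^b f_k(x)\, dx \to \int_a^b f(x)\, dx$, as k → ∞.

13. Let $g \in \mathcal{R}^{\#}([a, b])$ be ≥ 0. Show that for each ε > 0, there exists δ > 0 such that

S ⊂ [a, b] contented, cont S < δ $\ \Longrightarrow\ \int_S g\, dx < \varepsilon$.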

Hint. With $g_A$ defined as in (6.3), pick A such that $\int g_A\, dx \ge \int g\, dx - \varepsilon/2$. Then pick δ < ε/(2A).

14. Deduce from Exercises 12–13 the following. Let $f_k \in \mathcal{R}^{\#}([a, b])$, f : [a, b] → R satisfy

Given δ > 0, ∃ contented $S_{\delta} \subset [a, b]$ such that

(b)  cont $S_{\delta} < \delta$, and $f_k \to f$ uniformly on $[a, b] \setminus S_{\delta}$.

$\int_a^b f_k(x)\, dx \to \int_a^b f(x)\, dx$, as k → ∞.

a a

integration has a stronger result, known as the Lebesgue dominated convergence theorem.

A. The fundamental theorem of algebra

Theorem A.1. If p(z) is a nonconstant polynomial (with complex coefficients), then p(z) must have a complex root.

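Proof. We have, for some n ≥ 1, $a_n \ne 0$,

(A.1)  $p(z) = a_n z^n + \cdots + a_1 z + a_0 = a_n z^n \left( 1 + O(z^{-1}) \right)$, |z| → ∞,

which implies

(A.2)  $\lim_{|z| \to \infty} |p(z)| = \infty$.

Picking R ∈ (0, ∞) such that

(A.3)  $\inf_{|z| \ge R} |p(z)| > |p(0)|$,

we deduce that

(A.4)  $\inf_{|z| \le R} |p(z)| = \inf_{z \in \mathbb{C}} |p(z)|$.

Since $\{z : |z| \le R\}$ is compact and p is continuous, there exists $z_0$ with $|z_0| \le R$ such that

(A.5)  $|p(z_0)| = \inf_{z \in \mathbb{C}} |p(z)|$.

The theorem hence follows from: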

Lemma A.2. If p(z) is a nonconstant polynomial and (A.5) holds, then p(z0 ) = 0.

Proof. Suppose to the contrary that

(A.6) p(z0 ) = a 6= 0.

We can write

and b ≠ 0, we have $q(\zeta) = b \zeta^k + \cdots + b_n \zeta^n$, i.e.,

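(A.10)  $\frac{b}{|b|}\, \omega^k = -\frac{a}{|a|}$,

identity implies

$-\frac{a}{|a|}\, \frac{|b|}{b} = e^{i\theta}$,

for some θ ∈ R, so we can take

$\omega = e^{i\theta/k}$.

Given (A.10),

(A.11)  $p(z_0 + \varepsilon \omega) = a \left( 1 - \left| \frac{b}{a} \right| \varepsilon^k \right) + O(\varepsilon^{k+1})$,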

which contradicts (A.5) for ε > 0 small enough. Thus (A.6) is impossible. This proves

Lemma A.2, hence Theorem A.1.

Now that we have shown that p(z) in (A.1) must have one root, we can show it has n

roots (counting multiplicity).

Proposition A.3. For a polynomial p(z) of degree n, as in (A.1), there exist $r_1, \dots, r_n \in \mathbb{C}$ such that

(A.12)  $p(z) = a_n (z - r_1) \cdots (z - r_n)$.

Proof. We have shown that p(z) has one root; call it $r_1$. Dividing p(z) by $z - r_1$, we have

(A.13)  $p(z) = (z - r_1)\, \tilde{p}(z) + q$,

where $\tilde{p}(z) = a_n z^{n-1} + \cdots + \tilde{a}_0$ and q is a polynomial of degree < 1, i.e., a constant. Setting $z = r_1$ in (A.13) yields q = 0, so

(A.14)  $p(z) = (z - r_1)\, \tilde{p}(z)$.

Since $\tilde{p}(z)$ is a polynomial of degree n − 1, the result (A.12) follows by induction on n.

The numbers $r_j$, 1 ≤ j ≤ n, in (A.12) are called the roots of p(z). If k of them coincide (say with $r_\ell$) we say $r_\ell$ is a root of multiplicity k. If $r_\ell$ is distinct from $r_j$ for all j ≠ ℓ, we say $r_\ell$ is a simple root.
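The division step in the proof of Proposition A.3 can be carried out by synthetic division (Horner's scheme). The sketch below divides a polynomial by $z - r$ and returns the quotient together with the constant remainder, which equals p(r); the function name `divide_by_linear` and the sample cubic are choices made here, not taken from the text.

```python
def divide_by_linear(coeffs, r):
    """Divide p(z) by (z - r) using synthetic division.

    coeffs lists the coefficients [a_n, ..., a_1, a_0], highest degree first.
    Returns (quotient coefficients, remainder), where the remainder is p(r).
    """
    quotient = []
    carry = 0
    for c in coeffs:
        carry = carry * r + c
        quotient.append(carry)
    remainder = quotient.pop()  # the last value produced is p(r)
    return quotient, remainder

# Sample polynomial: z^3 - 6z^2 + 11z - 6 = (z - 1)(z - 2)(z - 3)
p = [1, -6, 11, -6]
q1, rem1 = divide_by_linear(p, 1)
q2, rem2 = divide_by_linear(q1, 2)
q3, rem3 = divide_by_linear(q2, 3)
print(q1, rem1)
print(q2, rem2)
print(q3, rem3)
```

Each remainder is zero because 1, 2, and 3 are roots, and the final quotient is the constant leading coefficient, mirroring the factorization (A.12).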

B. $\pi^2$ is Irrational

The following proof that $\pi^2$ is irrational follows a classic argument of I. Niven, [Niv]. The idea is to consider

(B.1)  $I_n = \int_0^{\pi} \varphi_n(x) \sin x\, dx, \quad \varphi_n(x) = \frac{1}{n!}\, x^n (\pi - x)^n$.

Clearly $I_n > 0$ for each n ∈ N, and $I_n \to 0$ very fast, faster than geometrically. Indeed, $0 \le \varphi_n(x) \le \frac{1}{n!} \left( \frac{\pi}{2} \right)^{2n}$ on [0, π] and $\int_0^{\pi} \sin x\, dx = 2$, so

(B.1A)  $0 < I_n < \frac{2}{n!} \left( \frac{\pi}{2} \right)^{2n}$.

The next key fact, to be established below, is that $I_n$ is a polynomial of degree n in $\pi^2$ with integer coefficients:

(B.2)  $I_n = \sum_{k=0}^{n} c_{nk}\, \pi^{2k}$, $c_{nk} \in \mathbb{Z}$.

(B.3)  $\sum_{k=0}^{n} c_{nk}\, a^{2k} b^{2n-2k} = b^{2n} I_n$.

But the left side of (B.3) is an integer for each n, while by the estimate (B.1A), the right side belongs to the interval (0, 1) for large n, yielding a contradiction. It remains to establish (B.2).

A method of computing the integral in (B.1), which works for any polynomial ϕn (x) is

the following. One looks for an antiderivative of the form

hence

One can exploit the nilpotence of $\partial_x^2$ on the space of polynomials of degree ≤ 2n and set

(B.7)  $F_n(x) = \sum_{k=0}^{n} (-1)^k \varphi_n^{(2k)}(x)$.

Then

(B.8)  $\frac{d}{dx} \left( F_n'(x) \sin x - F_n(x) \cos x \right) = \varphi_n(x) \sin x$.

(B.9)  $\int_0^{\pi} \varphi_n(x) \sin x\, dx = F_n(0) + F_n(\pi) = 2 F_n(0)$,

the last identity holding for $\varphi_n(x)$ as in (B.1) because then $\varphi_n(\pi - x) = \varphi_n(x)$ and hence $F_n(\pi - x) = F_n(x)$. For the first identity in (B.9), we use the defining property that sin π = 0 while cos π = −1.

In light of (B.7), to prove (B.2) it suffices to establish an analogous property for $\varphi_n^{(2k)}(0)$. Comparing the binomial formula and Taylor’s formula for $\varphi_n(x)$:

(B.10)  $\varphi_n(x) = \frac{1}{n!} \sum_{\ell=0}^{n} (-1)^{\ell} \binom{n}{\ell} \pi^{n-\ell} x^{n+\ell}$, and $\varphi_n(x) = \sum_{k=0}^{2n} \frac{1}{k!}\, \varphi_n^{(k)}(0)\, x^k$,

we see that

(B.11)  $k = n + \ell \ \Rightarrow\ \varphi_n^{(k)}(0) = (-1)^{\ell}\, \frac{(n + \ell)!}{n!} \binom{n}{\ell} \pi^{n-\ell}$,

so

(B.12)  $2k = n + \ell \ \Rightarrow\ \varphi_n^{(2k)}(0) = (-1)^n\, \frac{(n + \ell)!}{n!} \binom{n}{\ell} \pi^{2(k-\ell)}$.

Of course $\varphi_n^{(2k)}(0) = 0$ for 2k < n. Clearly the multiple of $\pi^{2(k-\ell)}$ in (B.12) is an integer. In fact,

(B.13)  $\frac{(n + \ell)!}{n!} \binom{n}{\ell} = \frac{(n + \ell)!}{n!}\, \frac{n!}{\ell! (n - \ell)!} = \frac{(n + \ell)!}{n!\, \ell!}\, \frac{n!}{(n - \ell)!} = \binom{n + \ell}{n}\, n(n - 1) \cdots (n - \ell + 1)$.
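The integer coefficients in (B.2) can be generated from (B.9) and (B.11)–(B.13) and compared with a direct numerical evaluation of the integral (B.1). The sketch below does this for small n; the function names, the dictionary layout (exponent of $\pi^2$ mapped to its coefficient), and the quadrature step count are choices made here for illustration.

```python
import math
from math import comb, factorial

def niven_coefficients(n):
    """Return {k: c} such that I_n = sum of c * pi^(2k), via I_n = 2 F_n(0)."""
    coeffs = {}
    for l in range(n + 1):
        if (n + l) % 2:  # only even-order derivatives 2k = n + l enter F_n(0)
            continue
        k = (n + l) // 2
        # integer multiple of pi^(n-l) in the (n+l)-th derivative at 0, by (B.11) and (B.13)
        deriv = (-1) ** l * comb(n + l, n) * (factorial(n) // factorial(n - l))
        power = (n - l) // 2  # exponent of pi^2
        coeffs[power] = coeffs.get(power, 0) + 2 * (-1) ** k * deriv
    return coeffs

def niven_integral_numeric(n, steps=100_000):
    """Midpoint-rule approximation of I_n in (B.1)."""
    h = math.pi / steps
    scale = 1.0 / factorial(n)
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        total += scale * (x * (math.pi - x)) ** n * math.sin(x)
    return total * h

for n in (1, 2, 3):
    c = niven_coefficients(n)
    exact = sum(v * math.pi ** (2 * k) for k, v in c.items())
    print(n, c, exact, niven_integral_numeric(n))
```

For n = 1 this gives $I_1 = 4$ and for n = 2 it gives $I_2 = 24 - 2\pi^2$, both of which can be confirmed by integrating by parts.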

C. More on $(1 - x)^b$

In §3 we showed that

(C.1)  $(1 - x)^b = \sum_{k=0}^{\infty} \frac{a_k}{k!}\, x^k$,

for |x| < 1, with

(C.2)  $a_0 = 1, \quad a_k = \prod_{\ell=0}^{k-1} (-b + \ell)$, for k ≥ 1.

There we required b ∈ Q, but in §5 we defined $y^b$, for y > 0, for all b ∈ R (and for y ≥ 0 if b > 0), and noted that such a result extends. Here, we prove a further result, when b > 0.

Proposition C.1. Given b > 0, $a_k$ as in (C.2), the identity (C.1) holds for x ∈ [−1, 1], and the series converges absolutely and uniformly on [−1, 1].

Proof. Our main task is to show that

(C.3)  $\sum_{k=0}^{\infty} \frac{|a_k|}{k!} < \infty$,

if b > 0. This implies that the right side of (C.1) converges absolutely and uniformly on [−1, 1] and its limit, g(x), is continuous on [−1, 1]. We already know that $g(x) = (1 - x)^b$ on (−1, 1), and since both sides are continuous on [−1, 1], the identity also holds at the endpoints. Now, if k − 1 > b,

(C.4)  $\frac{a_k}{k!} = -\frac{b}{k} \prod_{1 \le \ell \le b} \left( 1 - \frac{b}{\ell} \right) \prod_{b < \ell \le k-1} \left( 1 - \frac{b}{\ell} \right)$,

which we write as $(B/k)\, p_k$, where $p_k$ denotes the last product in (C.4). Then

(C.5)  $\log p_k = \sum_{b < \ell \le k-1} \log \left( 1 - \frac{b}{\ell} \right) \le -\sum_{b < \ell \le k-1} \frac{b}{\ell} \le -b \log k + \beta$,

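and

(C.7)  $\sum_{\ell=1}^{k-1} \frac{1}{\ell} > \int_1^k \frac{dy}{y}$.

Taking exponentials in (C.5) gives

(C.8)  $p_k \le \gamma k^{-b}, \quad \gamma = e^{\beta}$,

so

(C.9)  $\frac{|a_k|}{k!} \le |B\gamma|\, k^{-(1+b)}$,

giving (C.3).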
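The decay estimate (C.9) and the endpoint behavior asserted in Proposition C.1 can be observed numerically. The sketch below uses the recursion $c_{k+1} = c_k (k - b)/(k + 1)$ for $c_k = a_k/k!$, which follows from (C.2), with the sample exponent b = 1/2; the function name and the number of terms are choices made here.

```python
def binomial_coefficients(b, n_terms):
    """Yield c_k = a_k / k! for k = 0, ..., n_terms - 1, with a_k as in (C.2)."""
    c = 1.0
    for k in range(n_terms):
        yield c
        c *= (k - b) / (k + 1)  # a_{k+1} = a_k (k - b), divided by (k + 1)

b = 0.5
N = 20_000
coeffs = list(binomial_coefficients(b, N))

sum_at_minus_one = sum(c * (-1) ** k for k, c in enumerate(coeffs))  # should approach 2^b
sum_at_one = sum(coeffs)                                             # should approach 0
max_scaled = max(abs(c) * k ** (1 + b) for k, c in enumerate(coeffs) if k >= 1)

print(sum_at_minus_one, 2 ** b)
print(sum_at_one)
print(max_scaled)
```

The bounded value of `max_scaled` reflects (C.9). Convergence at x = 1 is visibly slower than at x = −1, where the terms alternate in sign.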

Exercise

Hint. logs

D. Archimedes’ approximation to π

(known to the ancient Greeks) that the unit disk $D = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 \le 1\}$ has the property

(D.1)  Area D = π.

We have not discussed area in this text. This topic is treated in the companion text [T2].

Actually, (D.1) was originally the definition of π. Here, we have taken (4.31A) as the

definition. To get the equivalence, we appeal to notions from first-year calculus, giving

areas of regions bounded by graphs in terms of integrals. We have

(D.2)  Area $D = 2 \int_{-1}^{1} \sqrt{1 - x^2}\, dx = 2 \int_{-\pi/2}^{\pi/2} \cos^2 \theta\, d\theta = \int_{-\pi/2}^{\pi/2} (\cos 2\theta + 1)\, d\theta = \pi$.

Here, the second identity follows from the substitution x = sin θ and the third from the

identity

$\cos 2\theta = \cos^2 \theta - \sin^2 \theta = 2 \cos^2 \theta - 1$,

a consequence of (5.44), with s = t = θ. One can also get (D.1) by computing areas in

polar coordinates (cf. [T2]).

Having (D.1), Archimedes proceeded as follows. If $P_n$ is a regular n-gon inscribed in the unit circle, then Area $P_n \to \pi$ as n → ∞, with

(D.3)  $\pi - \frac{c}{n^2} <$ Area $P_n < \pi$.

See (D.18)–(D.20) below for more on this. Note that such a polygon decomposes into n equal sized isosceles triangles, with two sides of length 1 meeting at an angle $\alpha_n = 2\pi/n$. Such a triangle $T_n$ has

(D.4)  Area $T_n = \left( \sin \frac{\alpha_n}{2} \right) \left( \cos \frac{\alpha_n}{2} \right) = \frac{1}{2} \sin \alpha_n$,

so

(D.5)  Area $P_n = \frac{n}{2} \sin \frac{2\pi}{n}$.

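One can obtain an inductive formula for Area $P_n$ for $n = 2^k$ as follows. Set

(D.6)  $S_k = \sin \frac{2\pi}{2^k}, \quad C_k = \cos \frac{2\pi}{2^k}$.

Then, for example, $S_2 = 1$, $C_2 = 0$, and

(D.7)  $C_k + i S_k = (C_{k+1} + i S_{k+1})^2$,

i.e.,

(D.8)  $C_{k+1}^2 - S_{k+1}^2 = C_k, \quad 2 C_{k+1} S_{k+1} = S_k$.

We are therefore led to solve

(D.9)  $x^2 - y^2 = a, \quad 2xy = b$,

for x and y, knowing that a ≥ 0, b, x, y > 0. We substitute y = b/2x into the first equation, obtaining

(D.10)  $x^2 - \frac{b^2}{4x^2} = a$,

then set $u = x^2$ and get

(D.11)  $u^2 - au - \frac{b^2}{4} = 0$,

whose positive solution is

(D.12)  $u = \frac{a}{2} + \frac{1}{2} \sqrt{a^2 + b^2}$.

Then

(D.13)  $x = \sqrt{u}, \quad y = \frac{b}{2\sqrt{u}}$.

Taking $a = C_k$, $b = S_k$, and using $C_k^2 + S_k^2 = 1$, we obtain

(D.14)  $S_{k+1} = \frac{S_k}{2\sqrt{U_k}}$,

with

(D.15)  $U_k = \frac{1 + C_k}{2} = \frac{1 + \sqrt{1 - S_k^2}}{2}$.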

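Then, with $P_k$ denoting Area $P_{2^k} = 2^{k-1} S_k$ (by (D.5)),

(D.17)  $P_{k+1} = \frac{P_k}{\sqrt{U_k}}$.

One can iterate (D.14)–(D.17), starting with $S_2 = 1$ and $P_2 = 2$.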
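The iteration (D.14)–(D.17) is short enough to run directly. The sketch below carries it out in double precision, starting from $S_2 = 1$, $P_2 = 2$; the function name `archimedes_area` and its argument convention (it returns the area of the inscribed $2^k$-gon) are choices made here.

```python
import math

def archimedes_area(k_max):
    """Area of the regular 2^k_max-gon inscribed in the unit circle, via (D.14)-(D.17)."""
    s, p = 1.0, 2.0  # S_2 and P_2 (the inscribed square)
    for _ in range(2, k_max):
        u = (1 + math.sqrt(1 - s * s)) / 2  # U_k, from (D.15)
        s = s / (2 * math.sqrt(u))          # S_{k+1}, from (D.14)
        p = p / math.sqrt(u)                # P_{k+1}, from (D.17)
    return p

print(archimedes_area(3), 2 * math.sqrt(2))  # octagon
print(archimedes_area(25), math.pi)
```

Each step costs two square roots, which is the practical drawback the text mentions when comparing this method with the series of Exercise 7 in §5.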

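First, we take a closer look at the error estimate in (D.3). Note that

(D.18)  $\pi -$ Area $P_n = \frac{n}{2} \left( \frac{2\pi}{n} - \sin \frac{2\pi}{n} \right)$,

and that

(D.19)  $\delta - \sin \delta = \frac{\delta^3}{3!} - \frac{\delta^5}{5!} + \cdots < \frac{\delta^3}{3!}$, for 0 < δ < 6,

so

(D.20)  $\pi -$ Area $P_n < \frac{2\pi^3}{3} \cdot \frac{1}{n^2}$, for n ≥ 2.

Thus we can take $c = 2\pi^3/3$ in (D.3) for n ≥ 2, and this is asymptotically sharp.

From (D.20) with $n = 2^{25}$, we have

(D.21)  $\pi - P_{25} < \frac{2\pi^3}{3} \cdot 2^{-50}$.

Since

(D.22)  $2^{10} = 1024 \ \Rightarrow\ 2^{50} \approx 10^{15}$, and $\frac{2\pi^3}{3} \approx 20$,

we get

(D.23)  $\pi - P_{25} \lesssim 2 \cdot 10^{-14}$.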

The Archimedes method often gets bad press because the error given in (D.20) decreases

slowly with n. However, given that we take n = 2k and iterate on k, the error actually

decreases exponentially in k. Nevertheless, use of the infinite series suggested in Exercise 7

of §5 has advantages over the use of (D.14)–(D.17), particularly in that it does not require

one to calculate a bunch of square roots.

There is another disadvantage of the iteration (D.14)–(D.17), though it does not show up

in a mere 25 iterations (at least, not if one is using double precision arithmetic). Namely,

any error in the approximate calculation of Pk (compared to its exact value), due for

example to truncation error, can get magnified in the approximate calculation of Pk+` for

` ≥ 1. This will ultimately lead to an instability, and a breakdown in the viability of the

iterative method (D.14)–(D.17).

We end this appendix by showing how the approximation to π described here can be

justified without any notion of area. In fact, setting

(D.24)  $A_n = \frac{n}{2} \sin \frac{2\pi}{n}$,

we get

(D.25)  $0 < \pi - A_n < \frac{2\pi^3}{3} \cdot \frac{1}{n^2}$, for n ≥ 2,

directly from (D.19); cf. (D.20). Thus, we can simply set $P_k = A_{2^k}$, and then the estimates (D.21)–(D.23) hold, and the iteration (D.14)–(D.17) works, without recourse to area.

In effect, the previous paragraph took the geometry out of Archimedes’ approximation to π. Finally, we note the following variant, bringing in arc length (treated thoroughly in §4) in place of area. Namely, the perimeter $Q_n$ of the regular n-gon $P_n$ is a union of n line segments, each being the base of an isosceles triangle with two sides of length 1, meeting at an angle $\alpha_n = 2\pi/n$. Hence each such line segment has length $2 \sin(\alpha_n/2)$, so

(D.26)  $\ell(Q_n) = 2n \sin \frac{\pi}{n}$.

The fact that

follows from the definition (4.31A) together with Proposition 4.1. Note that (D.26) implies

Note. Actually, Archimedes started with the regular hexagon and proceeded from there to evaluate $\tilde{P}_k$ = Area $P_{3 \cdot 2^k}$, for k up to 5. The basic iteration (D.7)–(D.15) also applies to this case. By (D.20),

$0 < \pi -$ Area $P_{96} < 0.00225$.

Archimedes’ presentation was

$3 + \frac{10}{71} < \pi < 3 + \frac{1}{7}$.

E. Computing π using arctangents

In Exercise 3 of §5, we defined tan t = sin t/ cos t. It is readily verified (via Exercise 4 of §5) that

(E.1)  $\tan : \left( -\frac{\pi}{2}, \frac{\pi}{2} \right) \to \mathbb{R}$

is one-to-one and onto, with positive derivative, so it has a smooth inverse

(E.2)  $\tan^{-1} : \mathbb{R} \to \left( -\frac{\pi}{2}, \frac{\pi}{2} \right)$.

It follows from Exercise 5 of §5 that

(E.3)  $\tan^{-1} x = \int_0^x \frac{ds}{1 + s^2}$.

We can insert the power series for $(1 + s^2)^{-1}$ and integrate term by term to get

(E.4)  $\tan^{-1} x = \sum_{k=0}^{\infty} \frac{(-1)^k}{2k + 1}\, x^{2k+1}$, if −1 < x < 1.

This provides a way to obtain rapidly convergent series for π, alternative to that proposed

in Exercise 7 of §5, which can be called an evaluation of π using the arcsine.

For a first effort, we use

(E.5)  $\tan \frac{\pi}{6} = \frac{1}{\sqrt{3}}$,

which follows from

(E.6)  $\sin \frac{\pi}{6} = \frac{1}{2}, \ \cos \frac{\pi}{6} = \frac{\sqrt{3}}{2} \iff e^{\pi i/6} = \frac{\sqrt{3}}{2} + \frac{1}{2}\, i$,

compare Exercises 2 and 7 of §5. Now (E.4)–(E.5) yield

(E.7)  $\frac{\pi}{6} = \frac{1}{\sqrt{3}} \sum_{k=0}^{\infty} \frac{(-1)^k}{2k + 1} \left( \frac{1}{3} \right)^k$.

We can compare (E.7) with the series (5.45A) for π. One difference is the occurrence of the factor $1/\sqrt{3}$, which is irrational. To be sure, it is not hard to compute $\sqrt{3}$ to high precision. Compare Exercises 8–10 of §3; for a faster method, see the treatment of Newton’s method in §5 of Chapter 5. Nevertheless, the presence of this irrational factor in (E.7) is a bit

of a glitch. Another disadvantage of (E.7) is that this series converges more slowly than

(5.45A).

We can do better by expressing π as a finite linear combination of terms tan−1 xj for

certain fairly small rational numbers xj . The key to this is the following formula for

tan(a + b). Using (5.44), we have

sin(a + b) sin a cos b + cos a sin b

tan(a + b) = =

cos(a + b) cos a cos b − sin a sin b

(E.8)

tan a + tan b

= .

1 − tan a tan b

Since tan π/4 = 1, we have, for a, b, a + b ∈ (−π/2, π/2),

π tan a + tan b

(E.9) = a + b ⇐= = 1.

4 1 − tan a tan b

Taking a = tan−1 x, b = tan−1 y gives

π

= tan−1 x + tan−1 y ⇐= x + y = 1 − xy

4

(E.10) 1−y

⇐= x = .

1+y

If we set y = 1/2, we get x = 1/3, so

π 1 1

(E.11) = tan−1 + tan−1 .

4 3 2

The power series (E.4) for $\tan^{-1}(1/3)$ and $\tan^{-1}(1/2)$ both converge faster than (E.7), but that for $\tan^{-1}(1/2)$ converges at essentially the same rate as (5.45A). We might optimise by taking $x = y$ in (E.10), but that yields $x = y = \sqrt{2} - 1$, and we do not want to plug this irrational number into (E.4). Taking a cue from $\sqrt{2} - 1 \approx 0.414$, we set $y = 2/5$, which yields $x = 3/7$, so

(E.12) $\frac{\pi}{4} = \tan^{-1} \frac{2}{5} + \tan^{-1} \frac{3}{7}.$

Both resulting power series converge faster than (5.45A), but not by much.

To do better, we bring in a formula for tan(a + 2b). Note that setting a = b in (E.8)

yields

(E.13) $\tan 2b = \frac{2 \tan b}{1 - \tan^2 b},$

and concatenating this with (E.8) (with b replaced by 2b) yields, after some elementary

calculation,

(E.14) $\tan(a + 2b) = \frac{\tan a\,(1 - \tan^2 b) + 2 \tan b}{1 - \tan^2 b - 2 \tan a \tan b}.$


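Hence, for $a$, $b$, $a + 2b \in (-\pi/2, \pi/2)$,

(E.15) $\frac{\pi}{4} = a + 2b \Longleftarrow \frac{\tan a\,(1 - \tan^2 b) + 2 \tan b}{1 - \tan^2 b - 2 \tan a \tan b} = 1.$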

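Taking $a = \tan^{-1} x$, $b = \tan^{-1} y$ gives

(E.16) $\frac{\pi}{4} = \tan^{-1} x + 2 \tan^{-1} y \Longleftarrow x(1 - y^2) + 2y = 1 - y^2 - 2xy \Longleftarrow x = \frac{1 - y^2 - 2y}{1 - y^2 + 2y}.$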

Taking $y = 1/3$ yields $x = 1/7$, so

(E.17) $\frac{\pi}{4} = \tan^{-1} \frac{1}{7} + 2 \tan^{-1} \frac{1}{3}.$

Both resulting power series converge significantly faster than (5.45A). Alternatively, we

can take y = 1/4, yielding x = 7/23, so

(E.18) $\frac{\pi}{4} = \tan^{-1} \frac{7}{23} + 2 \tan^{-1} \frac{1}{4}.$

The power series for tan−1 (7/23) converges a little faster than that for tan−1 (1/3).
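The rational pairs used in (E.11), (E.12), (E.17), and (E.18) can be checked mechanically. The following is a minimal sketch, assuming exact rational arithmetic via Python's `fractions` module; the function names `partner_single` and `partner_double` are illustrative and do not come from the text.

```python
from fractions import Fraction
import math

def partner_single(y):
    """x solving x + y = 1 - x*y, as in (E.10)."""
    return (1 - y) / (1 + y)

def partner_double(y):
    """x solving x(1 - y^2) + 2y = 1 - y^2 - 2xy, as in (E.16)."""
    return (1 - y * y - 2 * y) / (1 - y * y + 2 * y)

# Pairs behind (E.11) and (E.12).
for y in (Fraction(1, 2), Fraction(2, 5)):
    x = partner_single(y)
    gap = math.atan(float(x)) + math.atan(float(y)) - math.pi / 4
    print(f"x = {x}, y = {y}, floating-point gap = {gap:.1e}")

# Pairs behind (E.17) and (E.18).
for y in (Fraction(1, 3), Fraction(1, 4)):
    x = partner_double(y)
    gap = math.atan(float(x)) + 2 * math.atan(float(y)) - math.pi / 4
    print(f"x = {x}, y = {y}, floating-point gap = {gap:.1e}")
```

The printed gaps are only a floating-point sanity check; the exact statements are the rational identities for x.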

One can go still farther, iterating (E.13) to produce a formula for $\tan 4b$, and concatenating this with (E.8) to produce a formula for $\tan(a + 4b)$. This leads to identities of the form

(E.20) $\frac{\pi}{4} = \tan^{-1} x + 4 \tan^{-1} y,$

including the following, known as Machin’s formula:

(E.21) $\frac{\pi}{4} = 4 \tan^{-1} \frac{1}{5} - \tan^{-1} \frac{1}{239},$

with y = 1/5, x = −1/239. For many years, this was the most popular formula for

high precision approximations to π, until the 1970s, when a more sophisticated method

(actually discovered by Gauss in 1799) became available. For more on this, the reader can

consult Chapter 7 of [AH].
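To see how Machin's formula (E.21) combines with the series (E.4) in practice, here is a small sketch using Python's arbitrary-precision integers. The helper names `arctan_inv` and `machin_pi`, and the choice of ten guard digits, are assumptions made for illustration.

```python
def arctan_inv(n, digits, guard=10):
    """Return floor-type approximation of 10**(digits+guard) * atan(1/n),
    summing the series (E.4) at x = 1/n with integer arithmetic."""
    scale = 10 ** (digits + guard)
    power = scale // n          # scaled x^(2k+1) for k = 0
    total = power
    n_squared = n * n
    k = 0
    sign = 1
    while power:
        power //= n_squared     # next odd power of 1/n
        k += 1
        sign = -sign
        total += sign * (power // (2 * k + 1))
    return total

def machin_pi(digits, guard=10):
    """Integer whose decimal digits are those of pi, truncated to `digits` decimals."""
    scaled = 4 * (4 * arctan_inv(5, digits, guard) - arctan_inv(239, digits, guard))
    return scaled // 10 ** guard

print(machin_pi(30))
```

Each term of the series for $\tan^{-1}(1/5)$ gains roughly 1.4 decimal digits, and each term for $\tan^{-1}(1/239)$ gains roughly 4.8, which is why only a few dozen terms are needed here.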

Returning to the arctangent function, we record a series that converges somewhat faster

than (E.4), for such values of x as occur in (E.11), (E.12), (E.17), (E.18), and (E.21). The

following is due to Euler.


Proposition E.1. For $x \in \mathbb{R}$,

(E.22) $x \tan^{-1} x = \varphi\Bigl(\frac{x^2}{1 + x^2}\Bigr),$

with

(E.23) $\varphi(z) = z \Bigl(1 + \frac{2}{3} z + \frac{2 \cdot 4}{3 \cdot 5} z^2 + \frac{2 \cdot 4 \cdot 6}{3 \cdot 5 \cdot 7} z^3 + \cdots \Bigr).$

The power series (E.23) has the same radius of convergence as (E.4). The advantage of

(E.22)–(E.23) over (E.4) lies in the fact that x2 /(1 + x2 ) is a bit smaller than x2 , for the

values of x that appear in our various formulas for π.
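A quick numerical comparison illustrates the modest gain. The sketch below assumes floating-point arithmetic and generates the coefficients of (E.23) from the ratio of successive terms, each coefficient being the previous one multiplied by $2\ell/(2\ell+1)$; the function names are illustrative.

```python
import math

def atan_alternating(x, terms):
    """Partial sum of the series (E.4)."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

def atan_euler(x, terms):
    """Partial sum of Euler's series (E.22)-(E.23), divided by x (x != 0)."""
    z = x * x / (1 + x * x)
    coeff = 1.0      # coefficient of z^1
    power = z
    total = 0.0
    for ell in range(1, terms + 1):
        total += coeff * power
        coeff *= 2 * ell / (2 * ell + 1)   # next coefficient in (E.23)
        power *= z
    return total / x

x = 0.5
for terms in (5, 10, 15):
    err_alt = abs(atan_alternating(x, terms) - math.atan(x))
    err_eul = abs(atan_euler(x, terms) - math.atan(x))
    print(terms, f"{err_alt:.2e}", f"{err_eul:.2e}")
```

For the values of x in question the two error columns are of the same order of magnitude, with Euler's series somewhat ahead.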

To start the proof of Proposition E.1, note that

(E.24) $z = \frac{x^2}{1 + x^2} \iff x^2 = \frac{z}{1 - z}.$

Hence, by (E.4),

(E.25) $x \tan^{-1} x = \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{2k - 1}\, x^{2k} = \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{2k - 1}\, z^k (1 - z)^{-k}.$

Now

(E.26) $(1 - z)^{-1} = \sum_{n=0}^{\infty} z^n,$

and more generally

(E.27) $(1 - z)^{-k} = \sum_{n=0}^{\infty} \binom{k + n - 1}{n} z^n,$

(E.28) $\varphi(z) = \sum_{k=1}^{\infty} \sum_{n=0}^{\infty} \frac{(-1)^{k-1}}{2k - 1} \binom{k + n - 1}{n} z^{n+k} = \sum_{\ell=1}^{\infty} \sum_{n=0}^{\ell-1} \frac{(-1)^{\ell-n-1}}{2\ell - 2n - 1} \binom{\ell - 1}{n} z^{\ell}.$

Hence

(E.29) $\varphi(z) = \sum_{\ell=1}^{\infty} \sum_{m=0}^{\ell-1} \frac{(-1)^m}{2m + 1} \binom{\ell - 1}{m} z^{\ell}.$

(E.30) $\varphi_\ell = \sum_{m=0}^{\ell-1} \frac{(-1)^m}{2m + 1} \binom{\ell - 1}{m} \Longrightarrow \varphi_\ell = \frac{2 \cdot 4 \cdots 2(\ell - 1)}{3 \cdot 5 \cdots (2\ell - 1)}, \quad \ell \ge 2,$

while

(E.31) ϕ1 = 1.

(E.32) $\varphi_{\ell+1} = \frac{2\ell}{2\ell + 1}\, \varphi_\ell.$

To see this, note that the binomial formula gives

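(E.33) $(1 - s^2)^{\ell-1} = \sum_{m=0}^{\ell-1} (-1)^m \binom{\ell - 1}{m} s^{2m},$

hence, integrating term by term over $s \in [0, 1]$,

(E.34) $\varphi_\ell = \int_0^1 (1 - s^2)^{\ell-1}\, ds.$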

(E.35) $\frac{d}{ds}(1 - s^2)^{\ell+1} = -2(\ell + 1)\, s\, (1 - s^2)^{\ell}, \qquad \frac{d^2}{ds^2}(1 - s^2)^{\ell+1} = -2(\ell + 1)(1 - s^2)^{\ell} + 4\ell(\ell + 1)\, s^2 (1 - s^2)^{\ell-1}.$

Integrating the last identity over s ∈ [0, 1] gives

(E.36) $2\ell \int_0^1 (1 - s^2)^{\ell-1} s^2\, ds = \int_0^1 (1 - s^2)^{\ell}\, ds.$

Hence


Recall that

(F.1) $\tan x = \frac{\sin x}{\cos x}$

is a smooth function on (−π/2, π/2). Here we desire to represent it as a convergent power

series

(F.2) $T(x) = \sum_{k=0}^{\infty} \tau_k x^{2k+1}.$

Only odd exponents are involved since tan(−x) = − tan x. We will derive a recursive

formula for the coefficients τk .

As seen in Exercise 4 of §5,

(F.3) $\frac{d}{dx} \tan x = 1 + \tan^2 x.$

To find the coefficients $\tau_k$ in (F.2), we construct the power series to solve the differential equation

(F.4) $T'(x) = 1 + T(x)^2, \quad T(0) = 0.$

Indeed, if (F.2) is a convergent power series for |x| < r, then, on such an interval,

(F.5) $T'(x) = \sum_{k=0}^{\infty} (2k + 1) \tau_k x^{2k},$

and

(F.6) $T(x)^2 = \sum_{j,k \ge 0} \tau_j \tau_k x^{2(j+k)+2} = \sum_{\ell=0}^{\infty} \sum_{k=0}^{\ell} \tau_k \tau_{\ell-k}\, x^{2\ell+2}.$

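Writing (F.5) in the form

(F.7) $T'(x) = \tau_0 + \sum_{\ell=0}^{\infty} (2\ell + 3) \tau_{\ell+1} x^{2\ell+2},$

and comparing with (F.6), we see that the equation $T'(x) = 1 + T(x)^2$ holds provided $\tau_0 = 1$ and

(F.8) $\tau_{\ell+1} = \frac{1}{2\ell + 3} \sum_{k=0}^{\ell} \tau_k \tau_{\ell-k}, \quad \text{for } \ell \ge 0.$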

Clearly, given τ0 = 1, (F.8) uniquely determines τk for all k ∈ N. The first few terms

are

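(F.9) $\tau_0 = 1, \quad \tau_1 = \frac{1}{3}, \quad \tau_2 = \frac{2}{3 \cdot 5}, \quad \tau_3 = \frac{1}{7} \Bigl( \frac{2}{3 \cdot 5} + \frac{1}{3 \cdot 3} + \frac{2}{3 \cdot 5} \Bigr).$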
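The recursion (F.8) is easy to run exactly. The sketch below assumes Python's `fractions` module for rational arithmetic; the name `tan_coefficients` is illustrative.

```python
from fractions import Fraction
import math

def tan_coefficients(count):
    """Return [tau_0, ..., tau_{count-1}] from tau_0 = 1 and the recursion (F.8)."""
    tau = [Fraction(1)]
    for ell in range(count - 1):
        convolution = sum(tau[k] * tau[ell - k] for k in range(ell + 1))
        tau.append(convolution / (2 * ell + 3))
    return tau

tau = tan_coefficients(6)
print(tau)

# Floating-point comparison of the truncated series with math.tan at a small x.
x = 0.3
approx = sum(float(t) * x ** (2 * k + 1) for k, t in enumerate(tan_coefficients(20)))
print(approx, math.tan(x))
```

In particular the last expression in (F.9) simplifies to $\tau_3 = 17/315$.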

An easy induction shows that, for each k ∈ N,

It follows that (F.2) is a convergent power series, at least on |x| < 1, and that on this

interval the equation (F.4) holds. We claim that, on the interval of convergence,

For this task, remainder formulas such as (3.37) and (3.38) are not so convenient, since

formulas for high derivatives of tan x become quite unwieldy. We take another approach,

bringing in the function $\tan^{-1}$, introduced in (E.2)–(E.3). If (F.2) converges for $|x| < r$, then

(F.12) $\psi(x) = \tan^{-1} T(x)$

defines a smooth function on $(-r, r)$, and, via (E.3) and the chain rule,

(F.13) $\psi'(x) = \frac{T'(x)}{1 + T(x)^2} = 1.$

(F.14) $\tan x = \sum_{k=0}^{\infty} \tau_k x^{2k+1},$

As one might expect, the radius of convergence of the power series (F.14), seen above to be ≥ 1, is actually π/2. This is conveniently established using methods of complex analysis, such as treated in [T3].

The coefficients τk in the power series for tan x are closely related to the Bernoulli

numbers Bk , which arise in the power series expansion

(F.15) $\frac{z}{e^z - 1} = \sum_{k=0}^{\infty} \frac{B_k}{k!}\, z^k.$


In this case, one can multiply the power series on the right side of (F.15) by

(F.16) $\frac{e^z - 1}{z} = \sum_{j=0}^{\infty} \frac{z^j}{(j + 1)!}$

(F.17) $B_0 = 1, \qquad \sum_{k=0}^{\ell-1} \frac{1}{(\ell - k)!} \frac{B_k}{k!} = 0, \quad \text{for } \ell \ge 2.$

The first few terms are

(F.18) $B_0 = 1, \quad B_1 = -\frac{1}{2}, \quad B_2 = \frac{1}{6}, \quad B_3 = 0, \quad B_4 = -\frac{1}{30}.$

Methods of complex analysis (cf. [T3]) show that the power series in (F.15) has radius of

convergence 2π. It turns out that Bk = 0 for all odd k ≥ 3. In fact, a formula equivalent

to (F.15) is

(F.19) $\frac{1}{2}\, \frac{e^z + 1}{e^z - 1} = \frac{1}{z} + \sum_{k=1}^{\infty} \frac{B_{2k}}{(2k)!}\, z^{2k-1}.$

It is an exercise to show that the difference between the left side of (F.19) and 1/z is odd

in z. Now an application of Euler’s formula to (F.19) yields

(F.20) $x \cot x = \sum_{k=0}^{\infty} (-1)^k \frac{B_{2k}}{(2k)!}\, (2x)^{2k}.$

Furthermore, one can show that

(F.21) tan x = cot x − 2 cot 2x,

and then deduce from (F.20) that (F.14) holds, with

(F.22) $\tau_{k-1} = (-1)^{k-1}\, \frac{2^{2k}(2^{2k} - 1)}{(2k)!}\, B_{2k}, \quad k \ge 1.$

The fact that τk is positive for each k is equivalent to the fact that B2k is positive for

k odd, and negative for k even (which might take one some effort to glean from (F.17)).

Note that comparing (F.22) and (F.10) implies that the radius of convergence of the power

series in (F.19) is at least 4.
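The relation (F.22) can be tested against the first few coefficients. The following sketch assumes the recursion (F.17) is read as holding for each $\ell \ge 2$ and solved for $B_{\ell-1}$; the function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def bernoulli_numbers(n):
    """Return [B_0, ..., B_n], solving (F.17) for B_{l-1} with l = 2, ..., n + 1."""
    B = [Fraction(1)]
    for ell in range(2, n + 2):
        # Terms k = 0, ..., l-2 of the sum in (F.17); the k = l-1 term is B_{l-1}/(l-1)!.
        partial = sum(B[k] / (factorial(k) * factorial(ell - k)) for k in range(ell - 1))
        B.append(-partial * factorial(ell - 1))
    return B

def tau_from_bernoulli(k, B):
    """tau_{k-1} computed from B_{2k} via (F.22)."""
    return (-1) ** (k - 1) * Fraction(2 ** (2 * k) * (2 ** (2 * k) - 1), factorial(2 * k)) * B[2 * k]

B = bernoulli_numbers(8)
print(B[:5])
print([tau_from_bernoulli(k, B) for k in range(1, 5)])
```

The second printed list should agree with the values of $\tau_0, \dots, \tau_3$ in (F.9).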

For further results, relating (F.20) to results connecting the Bernoulli numbers to ζ(2k),

defined by

(F.23) $\zeta(2k) = \sum_{n=1}^{\infty} n^{-2k},$

(F.24) $\tan \frac{\pi}{2} x = \sum_{k=0}^{\infty} \tau_k^{\#} x^{2k+1}, \qquad \tau_k^{\#} = \frac{4}{\pi}\, \zeta(2k + 2)\,(1 - 2^{-2k-2}).$

In this appendix we prove the following result of Abel and derive some applications.

Theorem G.1. Assume we have a convergent series

(G.1) $\sum_{k=0}^{\infty} a_k = A.$

Then

(G.2) $f(r) = \sum_{k=0}^{\infty} a_k r^k$

converges uniformly on $[0, 1]$, so $f \in C([0, 1])$.

As a warm up, we look at the following somewhat simpler result.

Proposition G.2. Assume we have an absolutely convergent series

(G.3) $\sum_{k=0}^{\infty} |a_k| < \infty.$

Then the series (G.2) converges uniformly on [−1, 1], so f ∈ C([−1, 1]).

Proof. Writing (G.2) as $\sum_{k=0}^{\infty} f_k(r)$ with $f_k(r) = a_k r^k$, we have $|f_k(r)| \le |a_k|$ for $|r| \le 1$, so the conclusion follows from the Weierstrass M-test, Proposition 2.4 of Chapter 3.

Theorem G.1 is much more subtle than Proposition G.2. One ingredient in the proof is

the following summation by parts formula.

Proposition G.3. Let (aj ) and (bj ) be sequences, and let

(G.4) $s_n = \sum_{j=0}^{n} a_j.$

If m > n, then

(G.5) $\sum_{k=n+1}^{m} a_k b_k = (s_m b_m - s_n b_{n+1}) + \sum_{k=n+1}^{m-1} s_k (b_k - b_{k+1}).$

Proof. Write the left side of (G.5) as

(G.6) $\sum_{k=n+1}^{m} (s_k - s_{k-1})\, b_k,$

and collect the coefficient of each $s_k$.
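The identity (G.5) can be spot-checked on arbitrary finite data. This is a small sketch with integer sequences, so both sides are computed exactly; the sample sequences and the names `left_side` and `right_side` are illustrative.

```python
def left_side(a, b, n, m):
    """Sum of a_k * b_k for k = n+1, ..., m."""
    return sum(a[k] * b[k] for k in range(n + 1, m + 1))

def right_side(a, b, n, m):
    """Right side of (G.5), with s_k = a_0 + ... + a_k as in (G.4)."""
    s = []
    running = 0
    for term in a:
        running += term
        s.append(running)
    boundary = s[m] * b[m] - s[n] * b[n + 1]
    interior = sum(s[k] * (b[k] - b[k + 1]) for k in range(n + 1, m))
    return boundary + interior

a = [1, -2, 3, 5, -7, 11]
b = [2, 3, 5, 7, 11, 13]
print(left_side(a, b, 1, 4), right_side(a, b, 1, 4))
```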


Before applying Proposition G.3 to the proof of Theorem G.1, we note that, by Propo-

sition 3.3 of Chapter 3, the power series (G.2) converges uniformly on compact subsets of

(−1, 1), and defines f ∈ C((−1, 1)). Our task here is to get uniform convergence up to

r = 1.

To proceed, we apply (G.5) with $b_k = r^k$ and $n + 1 = 0$, $s_{-1} = 0$, to get

(G.7) $\sum_{k=0}^{m} a_k r^k = (1 - r) \sum_{k=0}^{m-1} s_k r^k + s_m r^m.$

Now, we want to add and subtract a function $g_m(r)$, defined for $0 \le r < 1$ by

(G.8) $g_m(r) = (1 - r) \sum_{k=m}^{\infty} s_k r^k = A r^m + (1 - r) \sum_{k=m}^{\infty} \sigma_k r^k,$

where

(G.9) $\sigma_k = s_k - A \longrightarrow 0, \quad \text{as } k \to \infty.$

(G.10) $(1 - r) \Bigl| \sum_{k=\mu}^{\infty} \sigma_k r^k \Bigr| \le \Bigl( \sup_{k \ge \mu} |\sigma_k| \Bigr) (1 - r) \sum_{k=\mu}^{\infty} r^k = \Bigl( \sup_{k \ge \mu} |\sigma_k| \Bigr) r^{\mu}.$

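It follows that $g_m(r) = A r^m + h_m(r)$, where $h_m(r) = (1 - r) \sum_{k \ge m} \sigma_k r^k$ satisfies $|h_m(r)| \le \sup_{k \ge m} |\sigma_k|$ for $0 \le r < 1$, and this bound tends to 0 as $m \to \infty$. Since the first sum on the right side of (G.7) is $g_0(r) - g_m(r)$, we obtain

(G.13) $\sum_{k=0}^{m} a_k r^k = g_0(r) + (s_m - A) r^m - h_m(r),$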


and this converges uniformly for r ∈ [0, 1] to g0 (r). We have Theorem G.1, with f (r) =

g0 (r).

Here is one illustration of Theorem G.1. Let ak = (−1)k−1 /k, which produces a con-

vergent series by the alternating series test (Chapter 1, Proposition 6.3). By (5.45),

(G.14) $\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k}\, r^k = \log(1 + r),$

for |r| < 1. It follows from Theorem G.1 that this infinite series converges uniformly on

[0, 1], and hence

(G.15) $\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} = \log 2.$
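The passage from (G.14) to (G.15) can be observed numerically. The sketch below assumes plain floating-point summation and a fixed truncation length; `abel_sum` is an illustrative name.

```python
import math

def abel_sum(r, terms):
    """Partial sum of (G.14): sum over k = 1..terms of (-1)^(k-1) r^k / k."""
    total = 0.0
    power = 1.0
    for k in range(1, terms + 1):
        power *= r
        total += (-1) ** (k - 1) * power / k
    return total

# As r increases to 1, the sums approach log 2, in line with Theorem G.1.
for r in (0.9, 0.99, 0.999):
    print(r, abel_sum(r, 50000), math.log(1 + r))

# At r = 1 itself the series converges slowly (alternating series bound 1/(terms + 1)).
print(abel_sum(1.0, 1000), math.log(2))
```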

See Exercise 30 in §5 for a more direct approach to (G.15), using the special behavior of

alternating series. Here is a more subtle generalization.

Claim. For each $\theta \in (0, 2\pi)$, the series

(G.16) $\sum_{k=1}^{\infty} \frac{e^{ik\theta}}{k} = S(\theta)$

converges.

Given this claim, it follows from Theorem G.1 that

(G.17) $\lim_{r \nearrow 1} \sum_{k=1}^{\infty} \frac{e^{ik\theta}}{k}\, r^k = S(\theta), \quad \forall\, \theta \in (0, 2\pi).$

Note that taking θ = π gives (G.15). Incidentally, we mention that the function log :

(0, ∞) → R has a natural extension to

and

(G.19) $\sum_{k=1}^{\infty} \frac{1}{k}\, z^k = -\log(1 - z), \quad \text{for } |z| < 1,$

from which one can deduce, via Theorem G.1, that S(θ) in (G.16) satisfies

Details on (G.18)–(G.19) would take us too far into the area of complex analysis for a

treatment here. One can find such material in [T3].

We want to establish the convergence of (G.16) for θ ∈ (0, 2π). In fact, we prove the

following more general result.


(G.21) $\sum_{k=1}^{\infty} b_k e^{ik\theta} = F(\theta)$

Given Proposition G.4, it then follows from Theorem G.1 that

(G.22) $\lim_{r \nearrow 1} \sum_{k=1}^{\infty} b_k r^k e^{ik\theta} = F(\theta), \quad \forall\, \theta \in (0, 2\pi).$

In turn, Proposition G.4 is a special case of the following more general result, known as

the Dirichlet test for convergence of an infinite series.

Proposition G.5. If $b_k \searrow 0$, $a_k \in \mathbb{C}$, and there exists $B < \infty$ such that

(G.23) $s_k = \sum_{j=1}^{k} a_j \Longrightarrow |s_k| \le B, \quad \forall\, k \in \mathbb{N},$

then

(G.24) $\sum_{k=1}^{\infty} a_k b_k$ converges.

To apply Proposition G.5 to Proposition G.4, take $a_k = e^{ik\theta}$ and observe that

(G.25) $\sum_{j=1}^{k} e^{ij\theta} = \frac{1 - e^{ik\theta}}{1 - e^{i\theta}}\, e^{i\theta},$
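The closed form (G.25) shows that the partial sums are bounded by $2/|1 - e^{i\theta}| = 1/|\sin(\theta/2)|$, independently of $k$. A short numerical sketch of both facts follows, assuming Python's `cmath` module; the function names and the sample angle are illustrative.

```python
import cmath
import math

def geometric_partial_sum(theta, k):
    """Direct evaluation of the sum of e^{i j theta} for j = 1, ..., k."""
    return sum(cmath.exp(1j * j * theta) for j in range(1, k + 1))

def closed_form(theta, k):
    """Right side of (G.25)."""
    z = cmath.exp(1j * theta)
    return z * (1 - z ** k) / (1 - z)

theta = 0.3
bound = 1 / abs(math.sin(theta / 2))
largest = max(abs(geometric_partial_sum(theta, k)) for k in range(1, 301))
print(largest, bound)
```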

To prove Proposition G.5, we use summation by parts, Proposition G.3. We have, via

(G.5) with n = 0, s0 = 0,

(G.26) $\sum_{k=1}^{m} a_k b_k = s_m b_m + \sum_{k=1}^{m-1} s_k (b_k - b_{k+1}).$

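Now $|s_m b_m| \le B b_m \to 0$ as $m \to \infty$, and

(G.27) $\sum_{k=1}^{\infty} |s_k (b_k - b_{k+1})| \le B \sum_{k=1}^{\infty} (b_k - b_{k+1}) = B b_1 < \infty,$

so

(G.28) $\sum_{k=1}^{\infty} s_k (b_k - b_{k+1})$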

is absolutely convergent, and the convergence of the left side of (G.26) readily follows.


Chapter V

Further Topics in Analysis

Introduction

analysis. One underlying theme here is the approximation of a function by a sequence of

“simpler” functions.

In §1 we define the convolution of functions on R,

$f * u(x) = \int_{-\infty}^{\infty} f(y)\, u(x - y)\, dy,$

treat the Weierstrass approximation theorem, which states that each continuous function

on a closed, bounded interval [a, b] is a uniform limit of a sequence of polynomials. We

give two proofs, one using convolutions and one using the uniform convergence on $[-1, 1]$ of the power series of $(1 - x)^b$, whenever $b > 0$, established in Appendix C of Chapter 4. (Here, we take $b = 1/2$.) Section 3 treats a far reaching generalization, known as the Stone-Weierstrass theorem. A special case, of use in §4, is that each continuous function on $\mathbb{T}^1$ is a uniform limit of a sequence of finite linear combinations of the exponentials $e^{ik\theta}$, $k \in \mathbb{Z}$.

Section 4 introduces Fourier series,

$f(\theta) = \sum_{k=-\infty}^{\infty} a_k e^{ik\theta}.$

$a_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta)\, e^{-ik\theta}\, d\theta.$

This is the Fourier inversion problem, and we examine several aspects of this. Fourier

analysis is a major area in modern analysis, and it is hoped that the material treated here

will provide a useful stimulus for further study.

For further material on Fourier analysis, one can look at Chapter 13 of [T3], dealing

with Fourier series on a similar level as here, but with a different perspective, followed by

Chapters 14–15 of [T3], on the Fourier transform and Laplace transform. Progressively

more advanced treatments of Fourier analysis can be found in [Fol], Chapter 8, and [T4],

Chapter 3.

Section 5 treats the use of Newton’s method to solve

f (ξ) = y


for ξ in an interval [a, b] given that f (a) − y and f (b) − y have opposite signs and that

It is seen that if an initial guess x0 is close enough to ξ, then Newton’s method produces

a sequence (xk ) satisfying

$|x_k - \xi| \le C \beta^{2^k}, \quad \text{for some } \beta \in (0, 1).$

convolution f ∗ u by

(1.1) $f * u(x) = \int_{-\infty}^{\infty} f(y)\, u(x - y)\, dy.$

Clearly

(1.2) $\int |f|\, dx = A, \ |u| \le M \text{ on } \mathbb{R} \Longrightarrow |f * u| \le AM \text{ on } \mathbb{R}.$

(1.3) $f * u(x) = \int_{-\infty}^{\infty} f(x - y)\, u(y)\, dy.$

R that satisfy the following conditions:

(1.4) $f_n \ge 0, \quad \int f_n\, dx = 1, \quad \int_{\mathbb{R} \setminus I_n} f_n\, dx = \varepsilon_n \to 0,$

where

(1.5) In = [−δn , δn ], δn → 0.

Let u ∈ C(R) be supported on a bounded interval [−A, A], or more generally, assume

Proposition 1.1. If fn ∈ R(R) satisfy (1.4)–(1.5) and if u ∈ C(R) is bounded and

uniformly continuous (satisfying (1.6)–(1.7)), then

(1.8) un = fn ∗ u −→ u, uniformly on R, as n → ∞.

(1.9) $u_n(x) = \int f_n(y)\, u(x - y)\, dy = \int_{I_n} f_n(y)\, u(x - y)\, dy + \int_{\mathbb{R} \setminus I_n} f_n(y)\, u(x - y)\, dy = v_n(x) + r_n(x).$

Clearly

Next,

(1.11) $v_n(x) - u(x) = \int_{I_n} f_n(y)\, [u(x - y) - u(x)]\, dy - \varepsilon_n u(x),$

so

hence

yielding (1.8).

Here is a sequence of functions (fn ) satisfying (1.4)–(1.5). First, set

(1.14) $g_n(x) = \frac{1}{A_n} (x^2 - 1)^n, \qquad A_n = \int_{-1}^{1} (x^2 - 1)^n\, dx,$

and then set

(1.15) $f_n(x) = g_n(x)$ for $|x| \le 1$, and $f_n(x) = 0$ for $|x| \ge 1$.

It is readily verified that such (fn ) satisfy (1.4)–(1.5). We will use this sequence in Propo-

sition 1.1 for one proof of the Weierstrass approximation theorem, in the next section.
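A numerical experiment makes the convergence in (1.8) visible for the kernels (1.14)–(1.15). The sketch below is only illustrative: it assumes a midpoint-rule discretization of the integral, and the test function `tent` and all function names are choices made here, not taken from the text.

```python
def tent(x):
    """Illustrative test function u: continuous, supported on [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def kernel_weights(n, samples=2000):
    """Midpoint nodes y_i in (-1, 1) and weights approximating g_n(y_i) dy.
    (1 - y^2)^n differs from (y^2 - 1)^n by the sign (-1)^n, which cancels
    against the same sign in A_n, so the weights are nonnegative and sum to 1."""
    h = 2.0 / samples
    nodes = [-1.0 + (i + 0.5) * h for i in range(samples)]
    raw = [(1.0 - y * y) ** n for y in nodes]
    total = sum(raw)
    return nodes, [w / total for w in raw]

def sup_error(n, u=tent):
    """Largest value of |f_n * u - u| over a grid of points in [-2, 2]."""
    nodes, weights = kernel_weights(n)
    worst = 0.0
    for i in range(41):
        x = -2.0 + 0.1 * i
        approx = sum(w * u(x - y) for y, w in zip(nodes, weights))
        worst = max(worst, abs(approx - u(x)))
    return worst

for n in (10, 100):
    print(n, sup_error(n))
```

The error is largest at the corner of the tent function, where it decays roughly like $n^{-1/2}$; smoother choices of u give faster convergence.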

The functions fn defined by (1.14)–(1.15) have the property

Furthermore, they have compact support, i.e., vanish outside some compact set. We say

provided f ∈ C k (R) and f has compact support. The following result is useful.


Proposition 1.2. If $f \in C_0^k(\mathbb{R})$ and $u \in \mathcal{R}(\mathbb{R})$, then $f * u \in C^k(\mathbb{R})$, and (provided $k \ge 1$)

(1.18) $\frac{d}{dx}\, f * u(x) = f' * u(x).$

In fact, by (1.3),

$\bigl| f * u(x + h) - f * u(x) \bigr| = \Bigl| \int_{-\infty}^{\infty} [f(x + h - y) - f(x - y)]\, u(y)\, dy \Bigr| \le \sup_x |f(x + h) - f(x)| \int_{-\infty}^{\infty} |u(y)|\, dy,$

From here, it suffices to treat the case $k = 1$, since if $f \in C_0^k(\mathbb{R})$, then $f' \in C_0^{k-1}(\mathbb{R})$, and one can use induction on $k$. Using (1.3), we have

(1.19) $\frac{f * u(x + h) - f * u(x)}{h} = \int_{-\infty}^{\infty} g_h(x - y)\, u(y)\, dy,$

where

(1.20) $g_h(x) = \frac{1}{h}\, [f(x + h) - f(x)].$

We claim that

Given this,

(1.22) $\Bigl| \int_{-\infty}^{\infty} g_h(x - y)\, u(y)\, dy - \int_{-\infty}^{\infty} f'(x - y)\, u(y)\, dy \Bigr| \le \sup_x |g_h(x) - f'(x)| \int_{-\infty}^{\infty} |u(y)|\, dy,$

It remains to prove (1.21). Indeed, the fundamental theorem of calculus implies

(1.23) $g_h(x) = \frac{1}{h} \int_x^{x+h} f'(y)\, dy,$

if h > 0, so

x≤y≤x+h

We say

and similarly f ∈ C0∞ (R) provided f ∈ C0k (R), for all k. It is useful to have some examples

of functions in C0∞ (R). We start with the following. Set

(1.26) $G(x) = e^{-1/x^2}$ if $x > 0$, and $G(x) = 0$ if $x \le 0$.

Proof. Clearly $G \in C^k$ for all $k$ on $(0, \infty)$ and on $(-\infty, 0)$. We need to check its behavior at 0. The fact that $G$ is continuous at 0 follows from

(1.27) $e^{-y^2} \longrightarrow 0, \quad \text{as } y \to \infty.$

Note that

(1.28) $G'(x) = \frac{2}{x^3}\, e^{-1/x^2}$ if $x > 0$, and $G'(x) = 0$ if $x < 0$.

also

(1.29) $G'(0) = \lim_{h \to 0} \frac{G(h)}{h} = 0,$

as a consequence of

(1.30) $y e^{-y^2} \longrightarrow 0, \quad \text{as } y \to \infty.$

of

(1.31) $y^3 e^{-y^2} \longrightarrow 0, \quad \text{as } y \to \infty.$

The existence and continuity of higher derivatives of G follows a similar pattern, making

use of

(1.32) $y^k e^{-y^2} \longrightarrow 0, \quad \text{as } y \to \infty,$

Exercises

Z

(1.34) f ≥ 0, f dx = 1,

and set

³x´

(1.35) fn (x) = nf , n ∈ N.

n

Show that Proposition 1.1 applies to the sequence fn .

2. Take

(1.36) $f(x) = \frac{1}{A}\, e^{-x^2}, \qquad A = \int_{-\infty}^{\infty} e^{-x^2}\, dx.$

case.

Note. In [T2] it is shown that $A = \sqrt{\pi}$ in (1.36).

0, if x ≤ 0,

then G1 ∈ C ∞ (R).

1

ϕ(x) = G(x) sin , if x 6= 0,

(a) x

0, if x = 0.

1

ψ(x) = G1 (x) sin , if x 6= 0,

(b) x

0, if x = 0.


Theorem 2.1. Given a compact interval I, any continuous function f on I is a uniform

limit of polynomials.

Otherwise stated, our goal is to prove that the space C(I) of continuous (real valued)

functions on I is equal to P(I), the uniform closure in C(I) of the space of polynomials.

We will give two proofs of this theorem. Our starting point for the first proof will be

the result that the power series for $(1 - x)^a$ converges uniformly on $[-1, 1]$, for any $a > 0$.

This was established in Chapter 4, Appendix C, and we will use it, with a = 1/2.

From the identity $x^{1/2} = (1 - (1 - x))^{1/2}$, we have $x^{1/2} \in \mathcal{P}([0, 2])$. More to the point, from the identity

(2.1) $|x| = \bigl(1 - (1 - x^2)\bigr)^{1/2},$

we have $|x| \in \mathcal{P}([-\sqrt{2}, \sqrt{2}])$. Using $|x| = b^{-1} |bx|$, for any $b > 0$, we see that $|x| \in \mathcal{P}(I)$ for any interval $I = [-c, c]$, and also for any closed subinterval, hence for any compact interval $I$. By translation, we have

(2.2) |x − a| ∈ P(I)

(2.3) $\max(x, y) = \frac{1}{2}(x + y) + \frac{1}{2}|x - y|, \qquad \min(x, y) = \frac{1}{2}(x + y) - \frac{1}{2}|x - y|,$

we see that for any a ∈ R and any compact I,

Using this, one sees that, given f ∈ P(I), with range in a compact interval J, one has

h ◦ f ∈ P(I) for all h ∈ P(J). Hence f ∈ P(I) ⇒ |f | ∈ P(I), and, via (2.3), we deduce

that

Suppose now that $I' = [a', b']$ is a subinterval of $I = [a, b]$. With the notation $x_+ = \max(x, 0)$, we have

(2.7) $f_{II'}(x) = \min\bigl((x - a')_+, (b' - x)_+\bigr) \in \mathcal{P}(I).$

This is a piecewise linear function, equal to zero on $I \setminus I'$, with slope 1 from $a'$ to the midpoint $m'$ of $I'$, and slope −1 from $m'$ to $b'$.

Now if I is divided into N equal subintervals, any continuous function on I that is linear

on each such subinterval can be written as a linear combination of such “tent functions,”

so it belongs to P(I). Finally, any f ∈ C(I) can be uniformly approximated by such

piecewise linear functions, so we have f ∈ P(I), proving the theorem.

For the second proof, we bring in the sequence of functions fn defined by (1.14)–(1.15),

i.e., first set

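(2.8) $g_n(x) = \frac{1}{A_n} (x^2 - 1)^n, \qquad A_n = \int_{-1}^{1} (x^2 - 1)^n\, dx,$

and then set

(2.9) $f_n(x) = g_n(x)$ for $|x| \le 1$, and $f_n(x) = 0$ for $|x| \ge 1$.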

It is readily verified that such (fn ) satisfy (1.4)–(1.5). We will use this sequence in Propo-

sition 1.1 to prove that if I ⊂ R is a closed, bounded interval, and f ∈ C(I), then there

exist polynomials pn (x) such that

(2.10) pn −→ f, uniformly on I.

assuming that

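(2.11) $I = \Bigl[ -\frac{1}{4}, \frac{1}{4} \Bigr].$

(2.12) $u \in C(\mathbb{R}), \quad u(x) = 0 \text{ for } |x| \ge \frac{1}{2}.$

(2.13) $u_n(x) = \int f_n(y)\, u(x - y)\, dy \Longrightarrow u_n \to u \text{ uniformly on } \mathbb{R}.$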

Now

(2.14) $|x| \le \frac{1}{2} \Longrightarrow u(x - y) = 0 \text{ for } |y| > 1 \Longrightarrow u_n(x) = \int g_n(y)\, u(x - y)\, dy,$

that is,

(2.15) $|x| \le \frac{1}{2} \Longrightarrow u_n(x) = p_n(x),$

where

(2.16) $p_n(x) = \int g_n(y)\, u(x - y)\, dy = \int g_n(x - y)\, u(y)\, dy.$

The last identity makes it clear that each pn (x) is a polynomial in x. Since (2.13) and

(2.15) imply

(2.17) $p_n \longrightarrow u$ uniformly on $\bigl[ -\frac{1}{2}, \frac{1}{2} \bigr]$,

we have (2.10).

Exercises

Z ∞

1 2 2

f (x) = e−x , A = e−x dx,

A −∞

³x´

fn (x) = nf .

n

Let u ∈ C(R) vanish outside [−1, 1]. Let ε > 0 and take n ∈ N such that

x

n X 1 ³ x 2 ´k

∞

fn (x) = − 2 ,

A k! n

k=0

2. Let f be continuous on [−1, 1]. If f is odd, show that it is a uniform limit of finite linear

combinations of x, x3 , x5 , . . . , x2k+1 , . . . . If f is even, show it is a uniform limit of finite

linear combinations of 1, x2 , x4 , . . . , x2k , . . . .


3. If g is continuous on [−π/2, π/2], show that g is a uniform limit of finite linear combi-

nations of

$\sin x, \ \sin^2 x, \ \sin^3 x, \ \dots, \ \sin^k x, \ \dots.$

Hint. Write g(x) = f (sin x) with f continuous on [−1, 1].

4. If g is continuous on [−π, π] and even, show that g is a uniform limit of finite linear

combinations of

$1, \ \cos x, \ \cos^2 x, \ \dots, \ \cos^k x, \ \dots.$

Hint. cos : [0, π] → [−1, 1] is a homeomorphism.

Hint. Given ε > 0, find δ > 0 and continuous hε , satisfying (2.18), such that

x


is the following result, known as the Stone-Weierstrass theorem.

Theorem 3.1. Let $X$ be a compact metric space, $\mathcal{A}$ a subalgebra of $C_{\mathbb{R}}(X)$, the algebra of real valued continuous functions on $X$. Suppose $1 \in \mathcal{A}$ and that $\mathcal{A}$ separates points of $X$, i.e., for distinct $p, q \in X$, there exists $h_{pq} \in \mathcal{A}$ with $h_{pq}(p) \ne h_{pq}(q)$. Then the closure $\overline{\mathcal{A}}$ is equal to $C_{\mathbb{R}}(X)$.

We present the proof in eight steps.

the Weierstrass approximation theorem to get polynomials pk → ϕ uniformly on [−A, A].

Then pk ◦ f → ϕ ◦ f uniformly on X, so ϕ ◦ f ∈ A.

(3.1) $\max(f_1, f_2) = \frac{1}{2} |f_1 - f_2| + \frac{1}{2} (f_1 + f_2) \in \overline{\mathcal{A}},$

Step 3. It follows from the hypotheses that if $p, q \in X$ and $p \ne q$, then there exists $f_{pq} \in \mathcal{A}$, equal to 1 at $p$ and to 0 at $q$.

on a neighborhood of p and to 0 on a neighborhood of q, and satisfying 0 ≤ gpq ≤ 1 on X.

there exists gpq ∈ A such that gpq = 1 on a neighborhood Oq of p, equal to 0 on a

neighborhood Ωq of q, satisfying 0 ≤ gpq ≤ 1 on X.

Now {Ωq } is an open cover of X \ U , so there exists a finite subcover Ωq1 , . . . , ΩqN . Let

1≤j≤N

Then $g_{pU} = 1$ on $O = \bigcap_{j=1}^{N} O_{q_j}$, an open neighborhood of $p$, $g_{pU} = 0$ on $X \setminus U$, and $0 \le g_{pU} \le 1$ on $X$.

gpU ∈ A, equal to 1 on a neighborhood Op of p, and equal to 0 on X \ U .


Now {Op } covers K, so there exists a finite subcover Op1 , . . . , Opm . Let

1≤j≤M

We have

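(3.5) $K_\ell = \Bigl\{ x \in X : f(x) \ge \frac{\ell}{k} \Bigr\},$

(3.6) $U_\ell = \Bigl\{ x \in X : f(x) > \frac{\ell - 1}{k} \Bigr\}, \quad \text{so} \quad X \setminus U_\ell = \Bigl\{ x \in X : f(x) \le \frac{\ell - 1}{k} \Bigr\}.$

(3.7) $\psi_\ell = 1$ on $K_\ell$, $\psi_\ell = 0$ on $X \setminus U_\ell$, and $0 \le \psi_\ell \le 1$ on $X$.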

Let

(3.8) $f_k = \max_{0 \le \ell \le k} \frac{\ell}{k}\, \psi_\ell \in \overline{\mathcal{A}}.$

on K`−1 and fk ≤ `/k on U`+1 . In other words,

(3.9) $\frac{\ell - 1}{k} \le f(x) \le \frac{\ell}{k} \Longrightarrow \frac{\ell - 1}{k} \le f_k(x) \le \frac{\ell}{k},$

so

(3.10) $|f(x) - f_k(x)| \le \frac{1}{k}, \quad \forall\, x \in X.$

an easy final step to see that f ∈ CR (X) ⇒ f ∈ A.


Theorem 3.2. Let X be a compact metric space, A a subalgebra (over C) of C(X), the

algebra of complex valued continuous functions on X. Suppose 1 ∈ A and that A separates

the points of X. Furthermore, assume

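(3.11) $f \in \mathcal{A} \Longrightarrow \overline{f} \in \mathcal{A}.$

Proof. Set $\mathcal{A}_{\mathbb{R}} = \{ f + \overline{f} : f \in \mathcal{A} \}$. One sees that Theorem 3.1 applies to $\mathcal{A}_{\mathbb{R}}$.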

Here are a couple of applications of Theorems 3.1–3.2.

Corollary 3.3. If X is a compact subset of Rn , then every f ∈ C(X) is a uniform limit

of polynomials on Rn .

Corollary 3.4. The space of trigonometric polynomials, given by

(3.12) $\sum_{k=-N}^{N} a_k e^{ik\theta},$

is dense in $C(S^1)$.

Exercises

Hint. $e^{ik\theta} e^{i\ell\theta} = e^{i(k+\ell)\theta}$, and $\overline{e^{ik\theta}} = e^{-ik\theta}$.

3. Use the results of Exercises 4–5 in §2 to provide another proof of Corollary 3.4.

Hint. Use $\cos^k \theta = ((e^{i\theta} + e^{-i\theta})/2)^k$, etc.

f ∈ C(X)} is dense in C(K).

5. In the setting of Exercise 4, take f ∈ C(K), ε > 0. Show that there exists g1 ∈ C(X)

such that

sup |g1 − f | ≤ ε, and sup |g1 | ≤ sup |f |.

K X K

K X

7. Use the results of Exercises 4–6 to show that, if f ∈ C(K), then there exists g ∈ C(X)

such that g|K = f .


4. Fourier series

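Given $f \in C(\mathbb{T}^1)$, or more generally $f \in \mathcal{R}(\mathbb{T}^1)$ (or still more generally, if $f \in \mathcal{R}^{\#}(\mathbb{T}^1)$, defined as in §6 of Chapter 4), we set, for $k \in \mathbb{Z}$,

(4.1) $\hat{f}(k) = \frac{1}{2\pi} \int_0^{2\pi} f(\theta)\, e^{-ik\theta}\, d\theta.$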

(4.2) $f \in A(\mathbb{T}^1) \iff \sum_{k=-\infty}^{\infty} |\hat{f}(k)| < \infty.$

Proposition 4.1. Given $f \in C(\mathbb{T}^1)$, if $f \in A(\mathbb{T}^1)$, then

(4.3) $f(\theta) = \sum_{k=-\infty}^{\infty} \hat{f}(k)\, e^{ik\theta}.$

Proof. Given $\sum |\hat{f}(k)| < \infty$, the right side of (4.3) is absolutely and uniformly convergent, defining

(4.4) $g(\theta) = \sum_{k=-\infty}^{\infty} \hat{f}(k)\, e^{ik\theta}, \quad g \in C(\mathbb{T}^1),$

(4.5) $\frac{1}{2\pi} \int_0^{2\pi} e^{i\ell\theta}\, d\theta = 0$ if $\ell \ne 0$, and $= 1$ if $\ell = 0$,

It remains to show that this implies u ≡ 0. To prove this, we use Corollary 3.4, which

implies that, for each $v \in C(\mathbb{T}^1)$, there exist trigonometric polynomials, i.e., finite linear combinations $v_N$ of $\{ e^{ik\theta} : k \in \mathbb{Z} \}$, such that

(4.7) $v_N \longrightarrow v$ uniformly on $\mathbb{T}^1$.

$\int_{\mathbb{T}^1} u(\theta)\, v_N(\theta)\, d\theta = 0, \quad \forall\, N,$

(4.8) $\int_{\mathbb{T}^1} u(\theta)\, v(\theta)\, d\theta = 0, \quad \forall\, v \in C(\mathbb{T}^1).$

Taking $v = \overline{u}$ gives

(4.9) $\int_{\mathbb{T}^1} |u(\theta)|^2\, d\theta = 0,$

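We seek conditions on $f$ that imply (4.2). Integration by parts for $f \in C^1(\mathbb{T}^1)$ gives, for $k \ne 0$,

(4.10) $\hat{f}(k) = \frac{1}{2\pi} \int_0^{2\pi} f(\theta)\, \frac{i}{k} \frac{\partial}{\partial\theta} (e^{-ik\theta})\, d\theta = \frac{1}{2\pi i k} \int_0^{2\pi} f'(\theta)\, e^{-ik\theta}\, d\theta,$

hence

(4.11) $|\hat{f}(k)| \le \frac{1}{2\pi |k|} \int_0^{2\pi} |f'(\theta)|\, d\theta.$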

(4.12) $\hat{f}(k) = -\frac{1}{2\pi k^2} \int_0^{2\pi} f''(\theta)\, e^{-ik\theta}\, d\theta,$

hence

$|\hat{f}(k)| \le \frac{1}{2\pi k^2} \int_0^{2\pi} |f''(\theta)|\, d\theta.$

In concert with

(4.13) $|\hat{f}(k)| \le \frac{1}{2\pi} \int_0^{2\pi} |f(\theta)|\, d\theta,$

we have

(4.14) $|\hat{f}(k)| \le \frac{1}{2\pi (k^2 + 1)} \int_0^{2\pi} \bigl[ |f''(\theta)| + |f(\theta)| \bigr]\, d\theta.$

Hence

(4.15) $f \in C^2(\mathbb{T}^1) \Longrightarrow \sum |\hat{f}(k)| < \infty.$

We will sharpen this implication below. We start with an interesting example. Consider

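(4.16) $f(\theta) = |\theta|, \quad -\pi \le \theta \le \pi,$

and extend this to be periodic of period 2π, yielding $f \in C(\mathbb{T}^1)$. We have

(4.17) $\hat{f}(k) = \frac{1}{2\pi} \int_{-\pi}^{\pi} |\theta|\, e^{-ik\theta}\, d\theta = -[1 - (-1)^k]\, \frac{1}{\pi k^2},$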

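for $k \ne 0$, while $\hat{f}(0) = \pi/2$. This is clearly a summable series, so $f \in A(\mathbb{T}^1)$, and Proposition 4.1 implies that, for $-\pi \le \theta \le \pi$,

(4.18) $|\theta| = \frac{\pi}{2} - \sum_{k \text{ odd}} \frac{2}{\pi k^2}\, e^{ik\theta} = \frac{\pi}{2} - \frac{4}{\pi} \sum_{\ell=0}^{\infty} \frac{1}{(2\ell + 1)^2} \cos (2\ell + 1)\theta.$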

Now, evaluating this at θ = 0 yields the identity

(4.19) $\sum_{\ell=0}^{\infty} \frac{1}{(2\ell + 1)^2} = \frac{\pi^2}{8}.$

Using this, we can evaluate

(4.20) $S = \sum_{k=1}^{\infty} \frac{1}{k^2},$

as follows. We have

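(4.21) $\sum_{k=1}^{\infty} \frac{1}{k^2} = \sum_{k \ge 1 \text{ odd}} \frac{1}{k^2} + \sum_{k \ge 2 \text{ even}} \frac{1}{k^2} = \frac{\pi^2}{8} + \frac{1}{4} \sum_{\ell=1}^{\infty} \frac{1}{\ell^2},$

that is, $S = \pi^2/8 + S/4$, and hence

(4.22) $\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}.$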
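The expansion (4.18) and the sums (4.19) and (4.22) lend themselves to a quick numerical check. The sketch below assumes plain floating-point partial sums with fixed truncation lengths; the function names are illustrative.

```python
import math

def abs_theta_partial_sum(theta, terms):
    """Partial sum of the cosine series in (4.18), using l = 0, ..., terms - 1."""
    series = sum(math.cos((2 * l + 1) * theta) / (2 * l + 1) ** 2 for l in range(terms))
    return math.pi / 2 - (4 / math.pi) * series

def odd_reciprocal_squares(terms):
    """Partial sum of (4.19)."""
    return sum(1.0 / (2 * l + 1) ** 2 for l in range(terms))

def reciprocal_squares(terms):
    """Partial sum of (4.22)."""
    return sum(1.0 / k ** 2 for k in range(1, terms + 1))

for theta in (0.0, 1.0, 2.5, -3.0):
    print(theta, abs_theta_partial_sum(theta, 2000), abs(theta))

print(odd_reciprocal_squares(20000), math.pi ** 2 / 8)
print(reciprocal_squares(20000), math.pi ** 2 / 6)
```

Since the coefficients decay like $1/k^2$, a partial sum with N terms is accurate only to order $1/N$, uniformly in θ.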

(4.23) $|\hat{f}(k)| \le \frac{C}{k^2 + 1}.$

This is a special case of the following generalization of (4.15).


holds.

Proof. Here we are assuming $f$ is $C^2$ on $\mathbb{T}^1 \setminus \{ p_1, \dots, p_\ell \}$, and $f'$ and $f''$ have limits at each of the endpoints of the associated intervals in $\mathbb{T}^1$, but $f$ is not assumed to be differentiable at the endpoints $p_\ell$. We can write $f$ as a sum of functions $f_\nu$, each of which is Lipschitz on $\mathbb{T}^1$, $C^2$ on $\mathbb{T}^1 \setminus p_\nu$, and $f_\nu'$ and $f_\nu''$ have limits as one approaches $p_\nu$ from either side. It suffices to show that each $\hat{f}_\nu(k)$ satisfies (4.23). Now $g(\theta) = f_\nu(\theta + p_\nu - \pi)$ is singular only at $\theta = \pi$, and $\hat{g}(k) = \hat{f}_\nu(k)\, e^{ik(p_\nu - \pi)}$, so it suffices to prove Proposition 4.2 when $f$ has a singularity only at $\theta = \pi$. In other words, $f \in C^2([-\pi, \pi])$, and $f(-\pi) = f(\pi)$.

In this case, we still have (4.10), since the endpoint contributions from integration by

parts still cancel. A second integration by parts gives, in place of (4.12),

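(4.24) $\hat{f}(k) = \frac{1}{2\pi i k} \int_{-\pi}^{\pi} f'(\theta)\, \frac{i}{k} \frac{\partial}{\partial\theta} (e^{-ik\theta})\, d\theta = -\frac{1}{2\pi k^2} \Bigl[ \int_{-\pi}^{\pi} f''(\theta)\, e^{-ik\theta}\, d\theta - (-1)^k \bigl( f'(\pi) - f'(-\pi) \bigr) \Bigr],$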

which yields (4.23).

We next make use of (4.5) to produce results on $\int_{\mathbb{T}^1} |f(\theta)|^2\, d\theta$, starting with the following.

Proposition 4.3. Given f ∈ A(T1 ),

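(4.25) $\sum |\hat{f}(k)|^2 = \frac{1}{2\pi} \int_{\mathbb{T}^1} |f(\theta)|^2\, d\theta.$

(4.26) $\sum \hat{f}(k)\, \overline{\hat{g}(k)} = \frac{1}{2\pi} \int_{\mathbb{T}^1} f(\theta)\, \overline{g(\theta)}\, d\theta.$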

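Proof. Switching order of summation and integration and using (4.5), we have

(4.27) $\frac{1}{2\pi} \int_{\mathbb{T}^1} f(\theta)\, \overline{g(\theta)}\, d\theta = \frac{1}{2\pi} \int_{\mathbb{T}^1} \sum_{j,k} \hat{f}(j)\, \overline{\hat{g}(k)}\, e^{i(j-k)\theta}\, d\theta = \sum_k \hat{f}(k)\, \overline{\hat{g}(k)},$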

We will extend the scope of Proposition 4.3 below. Closely tied to this is the issue of

convergence of SN f to f as N → ∞, where

(4.28) $S_N f(\theta) = \sum_{|k| \le N} \hat{f}(k)\, e^{ik\theta}.$

convergence in $L^2$-norm, where

(4.29) $\| f \|_{L^2}^2 = \frac{1}{2\pi} \int_{\mathbb{T}^1} |f(\theta)|^2\, d\theta.$

Given f ∈ R(T1 ), this defines a “norm,” satisfying the following result, called the triangle

inequality:

See Appendix A for details on this. Behind these results is the fact that

(4.32) $(f, g)_{L^2} = \frac{1}{2\pi} \int_{S^1} f(\theta)\, \overline{g(\theta)}\, d\theta.$

(4.33) $\sum |\hat{f}(k)|^2 = \| f \|_{L^2}^2,$

(4.34) $\sum \hat{f}(k)\, \overline{\hat{g}(k)} = (f, g)_{L^2}.$

The left side of (4.33) is the square norm of the sequence $(\hat{f}(k))$ in $\ell^2$. Generally, a sequence $(a_k)$ ($k \in \mathbb{Z}$) belongs to $\ell^2$ if and only if

(4.35) $\| (a_k) \|_{\ell^2}^2 = \sum |a_k|^2 < \infty.$

(4.36) $((a_k), (b_k)) = \sum a_k \overline{b_k}.$

(4.38) $f_\nu \to f$ in $L^2 \iff \| f - f_\nu \|_{L^2} \to 0.$


θ

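(4.40) $f \in A(\mathbb{T}^1) \Longrightarrow S_N f \to f$ in $L^2$, as $N \to \infty$.

(4.41) $\bigl| \| f \|_{L^2} - \| S_N f \|_{L^2} \bigr| \le \| f - S_N f \|_{L^2},$

(4.42) $\| S_N f \|_{L^2}^2 = \sum_{k=-N}^{N} |\hat{f}(k)|^2,$

so

(4.43) $\| f - S_N f \|_{L^2} \to 0 \text{ as } N \to \infty \Longrightarrow \| f \|_{L^2}^2 = \sum |\hat{f}(k)|^2.$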

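We now consider more general functions $f \in \mathcal{R}(\mathbb{T}^1)$. With $\hat{f}(k)$ and $S_N f$ defined by (4.1) and (4.28), we define $R_N f$ by

(4.44) $f = S_N f + R_N f.$

Note that $\int_{\mathbb{T}^1} f(\theta)\, e^{-ik\theta}\, d\theta = \int_{\mathbb{T}^1} S_N f(\theta)\, e^{-ik\theta}\, d\theta$ for $|k| \le N$, hence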

and hence

Consequently,

(4.47)

= kSN f k2L2 + kRN f k2L2 .

In particular,


ν→∞

N →∞

Then

N →∞

inequality, we have, for each ν,

N →∞

Given f ∈ C(T1 ), we have trigonometric polynomials fν → f uniformly on T1 , and

clearly (4.50) holds for each such fν . Thus Lemma 4.4 yields the following.

(4.54) $f \in C(\mathbb{T}^1) \Longrightarrow S_N f \to f$ in $L^2$, and $\sum |\hat{f}(k)|^2 = \| f \|_{L^2}^2.$

Lemma 4.4 also applies to many discontinuous functions. Consider, for example

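(4.55) $f(\theta) = 0$ for $-\pi < \theta < 0$, and $f(\theta) = 1$ for $0 < \theta < \pi$.

(4.56) $f_\nu(\theta) = 0$ for $-\pi \le \theta \le 0$; $\ f_\nu(\theta) = \nu\theta$ for $0 \le \theta \le \frac{1}{\nu}$; $\ f_\nu(\theta) = 1$ for $\frac{1}{\nu} \le \theta \le \pi - \frac{1}{\nu}$; $\ f_\nu(\theta) = \nu(\pi - \theta)$ for $\pi - \frac{1}{\nu} \le \theta \le \pi$.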

Then each $f_\nu \in C(\mathbb{T}^1)$. (In fact, $f_\nu \in A(\mathbb{T}^1)$, by Proposition 4.2.) Also, one can check that $\| f - f_\nu \|_{L^2}^2 \le 2/\nu$. Thus the conclusion in (4.54) holds for $f$ given by (4.55).

More generally, any piecewise continuous function on T1 is an L2 limit of continuous

functions, so the conclusion of (4.54) holds for them. To go further, let us consider the class

of Riemann integrable functions. A function f : T1 → R is Riemann integrable provided

f is bounded (say |f | ≤ M ) and, for each δ > 0, there exist piecewise constant functions

gδ and hδ on T1 such that

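(4.57)  $g_\delta \le f \le h_\delta, \quad \text{and} \quad \int_{T^1} \bigl( h_\delta(\theta) - g_\delta(\theta) \bigr)\, d\theta < \delta.$

Then

(4.58)  $\int_{T^1} f(\theta)\, d\theta = \lim_{\delta \to 0} \int_{T^1} g_\delta(\theta)\, d\theta = \lim_{\delta \to 0} \int_{T^1} h_\delta(\theta)\, d\theta.$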

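Note that we can assume $|g_\delta|, |h_\delta| \le M + 1$, and then

(4.59)  $\frac{1}{2\pi} \int_{T^1} |f(\theta) - g_\delta(\theta)|^2\, d\theta \le \frac{M+1}{\pi} \int_{T^1} |h_\delta(\theta) - g_\delta(\theta)|\, d\theta < \frac{M+1}{\pi}\, \delta,$

so $g_\delta \to f$ in $L^2$-norm. A function $f : T^1 \to \mathbb{C}$ is Riemann integrable provided its real and imaginary parts are. In such a case, there are also piecewise constant functions $f_\nu \to f$ in $L^2$-norm, giving the following.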

Proposition 4.5. We have

(4.60)  $f \in R(T^1) \Longrightarrow S_N f \to f \text{ in } L^2, \text{ and } \sum_k |\hat f(k)|^2 = \|f\|_{L^2}^2.$

This is not the end of the story. Lemma 4.4 extends to unbounded functions on $T^1$ that are square integrable, such as

(4.61)  $f(\theta) = |\theta|^{-\alpha} \text{ on } [-\pi, \pi], \quad 0 < \alpha < \frac{1}{2}.$

In such a case, one can take $f_\nu(\theta) = \min(f(\theta), \nu)$, $\nu \in \mathbb{N}$. Then each $f_\nu$ is continuous and $\|f - f_\nu\|_{L^2} \to 0$ as $\nu \to \infty$. The conclusion of (4.60) holds for such $f$. We can fit (4.61) into the following general setting. If $f : T^1 \to \mathbb{C}$, we say

$f \in R^2(T^1) \iff f, |f|^2 \in R^\#(T^1),$


where $R^\#$ is defined in §6 of Chapter 4. Though we will not pursue the details, Lemma 4.4 extends to $f, f_\nu \in R^2(T^1)$, and then (4.60) holds for $f \in R^2(T^1)$.

The ultimate theory of functions for which the result

(4.62)  $S_N f \longrightarrow f \text{ in } L^2\text{-norm}$

holds was produced by H. Lebesgue in what is now known as the theory of Lebesgue measure and integration. There is the notion of measurability of a function $f : T^1 \to \mathbb{C}$. One says $f \in L^2(T^1)$ provided $f$ is measurable and $\int_{T^1} |f(\theta)|^2\, d\theta < \infty$, the integral here being the Lebesgue integral. Actually, $L^2(T^1)$ consists of equivalence classes of such functions, where $f_1 \sim f_2$ if and only if $\int |f_1(\theta) - f_2(\theta)|^2\, d\theta = 0$. With $\ell^2$ as in (4.35), it is then the case that

(4.63)  $\mathcal{F} : L^2(T^1) \longrightarrow \ell^2,$

given by

(4.64)  $(\mathcal{F} f)(k) = \hat f(k),$

is one-to-one and onto, with

(4.65)  $\sum_k |\hat f(k)|^2 = \|f\|_{L^2}^2, \quad \forall\, f \in L^2(T^1),$

and

(4.66)  $S_N f \longrightarrow f \text{ in } L^2, \quad \forall\, f \in L^2(T^1).$

We refer to books on the subject (e.g., [T2]) for information on Lebesgue integration.

We mention two key propositions which, together with the arguments given above, establish these results. The fact that $\mathcal{F} f \in \ell^2$ for all $f \in L^2(T^1)$ and (4.65)–(4.66) hold follows via Lemma 4.4 from the following.

Proposition A. Given $f \in L^2(T^1)$, there exist $f_\nu \in C(T^1)$ such that $f_\nu \to f$ in $L^2$.

As for the surjectivity of $\mathcal{F}$ in (4.63), note that, given $(a_k) \in \ell^2$, the sequence

$f_\nu(\theta) = \sum_{|k| \le \nu} a_k e^{ik\theta}$

satisfies, for $\mu > \nu$,

$\|f_\mu - f_\nu\|_{L^2}^2 = \sum_{\nu < |k| \le \mu} |a_k|^2 \to 0 \text{ as } \nu \to \infty.$

That is to say, $(f_\nu)$ is a Cauchy sequence in $L^2(T^1)$. Surjectivity follows from the fact that Cauchy sequences in $L^2(T^1)$ always converge to a limit:


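Proposition B. If $(f_\nu)$ is a Cauchy sequence in $L^2(T^1)$, there exists $f \in L^2(T^1)$ such that $f_\nu \to f$ in $L^2$-norm.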

Proofs of Propositions A and B can be found in the standard texts on measure theory

and integration, such as [T2].

We now establish a sufficient condition for a function f to belong to A(T1 ), more general

than that in Proposition 4.2.

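Proposition 4.6. If $f$ is a continuous, piecewise $C^1$ function on $T^1$, then $\sum_k |\hat f(k)| < \infty$.

Proof. As in the proof of Proposition 4.2, we can reduce the problem to the case $f \in C^1([-\pi, \pi])$, $f(-\pi) = f(\pi)$. In such a case, with $g = f' \in C([-\pi, \pi])$, the integration by parts argument (4.10) gives

(4.67)  $\hat f(k) = \frac{1}{ik}\, \hat g(k), \quad k \ne 0.$

By (4.60),

(4.68)  $\sum_k |\hat g(k)|^2 = \|g\|_{L^2}^2.$

Also, by Cauchy's inequality,

(4.69)  $\sum_{k \ne 0} |\hat f(k)| \le \Bigl( \sum_{k \ne 0} \frac{1}{k^2} \Bigr)^{1/2} \Bigl( \sum_{k \ne 0} |\hat g(k)|^2 \Bigr)^{1/2} \le C \|g\|_{L^2}.$

This completes the proof.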

Moving beyond square integrable functions, we now provide some results on Fourier

series for a function f ∈ R# (T1 ). For starters, if f ∈ R# (T1 ), then (4.1) yields

(4.70)  $|\hat f(k)| \le \frac{1}{2\pi} \int_0^{2\pi} |f(\theta)|\, d\theta = \frac{1}{2\pi} \|f\|_{L^1(T^1)}.$

Using this, we can establish the following result, which is part of what is called the

Riemann-Lebesgue lemma.

Proposition 4.7. Given $f \in R^\#(T^1)$,

(4.71)  $\hat f(k) \to 0, \quad \text{as } |k| \to \infty.$


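Proof. Given $f \in R^\#(T^1)$, there exist $f_\nu \in R(T^1)$ such that

(4.72)  $\|f - f_\nu\|_{L^1(T^1)} \to 0, \quad \text{as } \nu \to \infty.$

Now Proposition 4.5 applies to each $f_\nu$, so $\sum_k |\hat f_\nu(k)|^2 < \infty$, for each $\nu$. Hence

(4.73)  $\hat f_\nu(k) \to 0, \quad \text{as } |k| \to \infty, \text{ for each } \nu.$

Since

(4.74)  $\sup_k |\hat f(k) - \hat f_\nu(k)| \le \frac{1}{2\pi} \|f - f_\nu\|_{L^1(T^1)},$

(4.71) follows.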

We now consider conditions on f ∈ R# (T1 ) guaranteeing that SN f (θ) converges to f (θ)

as N → ∞, at a particular point θ ∈ T1 . Note that

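(4.75)  $S_N f(\theta) = \sum_{k=-N}^{N} \hat f(k) e^{ik\theta} = \frac{1}{2\pi} \sum_{k=-N}^{N} \int_{T^1} f(\varphi) e^{ik(\theta - \varphi)}\, d\varphi = \int_{T^1} f(\varphi) D_N(\theta - \varphi)\, d\varphi,$

where $D_N$, called the Dirichlet kernel, is given by

(4.76)  $D_N(\theta) = \frac{1}{2\pi} \sum_{k=-N}^{N} e^{ik\theta}.$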

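Lemma 4.8. We have $D_N(0) = (2N + 1)/2\pi$, and if $\theta \in T^1 \setminus 0$,

(4.77)  $D_N(\theta) = \frac{1}{2\pi}\, \frac{\sin(N + 1/2)\theta}{\sin \theta/2}.$

Proof. The formula for $D_N(0)$ is clear from (4.76). For $\theta \ne 0$, write

(4.78)  $D_N(\theta) = \frac{1}{2\pi}\, e^{-iN\theta} \sum_{k=0}^{2N} e^{ik\theta}.$

Using the geometric series $\sum_{k=0}^{2N} z^k = (1 - z^{2N+1})/(1 - z)$, for $z \ne 1$, we have

(4.79)  $D_N(\theta) = \frac{1}{2\pi}\, e^{-iN\theta}\, \frac{1 - e^{i(2N+1)\theta}}{1 - e^{i\theta}} = \frac{1}{2\pi}\, \frac{e^{i(N+1)\theta} - e^{-iN\theta}}{e^{i\theta} - 1},$

and multiplying numerator and denominator by $e^{-i\theta/2}$ gives (4.77).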
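As a sanity check on Lemma 4.8, the defining sum (4.76) can be compared numerically with the closed form (4.77). This is only an illustrative sketch; the function names and the sample values of $N$ and $\theta$ are arbitrary choices, not part of the text.

```python
import cmath
import math


def dirichlet_sum(n: int, theta: float) -> float:
    """D_N(theta) computed directly from the defining sum (4.76)."""
    total = sum(cmath.exp(1j * k * theta) for k in range(-n, n + 1))
    # The sum is real (terms k and -k are complex conjugates).
    return total.real / (2 * math.pi)


def dirichlet_closed(n: int, theta: float) -> float:
    """D_N(theta) from the closed form (4.77), with the value at theta = 0."""
    s = math.sin(theta / 2)
    if s == 0.0:
        return (2 * n + 1) / (2 * math.pi)
    return math.sin((n + 0.5) * theta) / (2 * math.pi * s)


max_gap = max(
    abs(dirichlet_sum(n, t) - dirichlet_closed(n, t))
    for n in (1, 5, 20)
    for t in (-3.0, -1.0, -0.1, 0.0, 0.2, 1.5, 3.1)
)
print(max_gap)
```

The two expressions should agree up to floating point rounding.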


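Note that if $R_\varphi f(\theta) = f(\theta + \varphi)$, then, for each $f \in R^\#(T^1)$,

(4.80)  $S_N R_\varphi f = R_\varphi S_N f,$

so to test for convergence of $S_N f$ to $f$ at $\varphi$, it suffices to test for convergence of $S_N R_\varphi f$ to $R_\varphi f$ at $\theta = 0$. Thus we seek conditions that

(4.81)  $S_N f(0) = \int_{T^1} f(\varphi) D_N(\varphi)\, d\varphi$

converges to $f(0)$ as $N \to \infty$. (Note that $D_N(-\varphi) = D_N(\varphi)$.) By (4.77), we have

(4.82)  $S_N f(0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{f(\theta)}{\sin \theta/2}\, \sin\bigl(N + \tfrac{1}{2}\bigr)\theta\, d\theta.$

Also,

(4.83)  $\sin\bigl(N + \tfrac{1}{2}\bigr)\theta = (\sin N\theta)(\cos \tfrac{1}{2}\theta) + (\cos N\theta)(\sin \tfrac{1}{2}\theta).$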

Using this in concert with Proposition 4.7, we have the following.

Lemma 4.9. Let $f \in R^\#(T^1)$. Assume $f$ “vanishes” at $\theta = 0$ in the sense that

(4.84)  $\frac{f(\theta)}{\sin \theta/2} \in R^\#([-\pi, \pi]).$

Then

(4.85)  $S_N f(0) \longrightarrow 0, \quad \text{as } N \to \infty.$

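Corollary 4.10. Let $g \in R^\#(T^1)$, and assume

(4.86)  $\frac{g(\theta) - g(0)}{\sin \theta/2} \in R^\#([-\pi, \pi]).$

Then

(4.87)  $S_N g(0) \longrightarrow g(0), \quad \text{as } N \to \infty.$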


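Proposition 4.11. Let $f \in R^\#(T^1)$. If

(4.88)  $\frac{f(\theta) - f(\theta_0)}{\sin(\theta - \theta_0)/2} \in R^\#([-\pi + \theta_0, \pi + \theta_0]),$

then

(4.89)  $S_N f(\theta_0) \longrightarrow f(\theta_0), \quad \text{as } N \to \infty.$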

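Proposition 4.11 has the following application. We say a function $f \in R^\#(T^1)$ is Hölder continuous at $\theta_0 \in T^1$, with exponent $\alpha \in (0, 1]$, provided there exist $\delta > 0$, $C < \infty$, such that

(4.90)  $|\theta - \theta_0| \le \delta \Longrightarrow |f(\theta) - f(\theta_0)| \le C |\theta - \theta_0|^\alpha.$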

Proposition 4.12. Let f ∈ R# (T1 ). If f is Hölder continuous at θ0 , with some exponent

α ∈ (0, 1], then (4.89) holds.

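Proof. We have

(4.91)  $\Bigl| \frac{f(\theta) - f(\theta_0)}{\sin(\theta - \theta_0)/2} \Bigr| \le C' |\theta - \theta_0|^{-(1-\alpha)},$

for $|\theta - \theta_0| \le \delta$. Since the right side of (4.91) is integrable over $[\theta_0 - \delta, \theta_0 + \delta]$, and the quotient belongs to $R^\#$ on the complement of $[\theta_0 - \delta, \theta_0 + \delta]$, the hypothesis (4.88) holds.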

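We now look at the following class of piecewise regular functions, with jumps. Take points $p_j$,

(4.92)  $-\pi = p_0 < p_1 < \cdots < p_K = \pi.$

Take functions

(4.93)  $f_j : [p_j, p_{j+1}] \to \mathbb{C},$

Hölder continuous, for $0 \le j \le K - 1$. Define $f$ on $T^1$ by

(4.94)  $f(\theta) = \begin{cases} f_j(\theta), & \text{if } p_j < \theta < p_{j+1}, \\ \dfrac{f_j(p_{j+1}) + f_{j+1}(p_{j+1})}{2}, & \text{if } \theta = p_{j+1}, \end{cases}$

where at $\theta = p_K$ we interpret $f_K(p_K)$ as $f_0(p_0)$, since $\pi$ and $-\pi$ are identified in $T^1$.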


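Proposition 4.13. With $f$ as in (4.94), we have

(4.95)  $S_N f(\theta) \longrightarrow f(\theta), \quad \forall\, \theta \in T^1.$

Proof. If $\theta \notin \{p_0, \dots, p_K\}$, this follows from Proposition 4.12. It remains to consider the case $\theta = p_j$ for some $j$. By (4.80), there is no loss of generality in taking $p_j = 0$. Parallel to (4.80), we have

(4.96)  $S_N T f = T S_N f, \quad T f(\theta) = f(-\theta).$

Hence

(4.97)  $S_N f(0) = \frac{1}{2}\, S_N (f + Tf)(0).$

However, $f + Tf$ is Hölder continuous at $\theta = 0$, with value $2f(0)$, so Proposition 4.12 implies

(4.98)  $\frac{1}{2}\, S_N (f + Tf)(0) \longrightarrow f(0), \quad \text{as } N \to \infty.$

This gives (4.95) for $\theta = p_j = 0$.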
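To see the averaging at a jump in a concrete case, one can evaluate the partial sums for a step function equal to 1 on $(0, \pi)$ and 0 on $(-\pi, 0)$. The sketch below assumes the normalization $\hat f(k) = \frac{1}{2\pi} \int f(\theta) e^{-ik\theta}\, d\theta$, under which $\hat f(0) = 1/2$ and $\hat f(k) = 1/(\pi i k)$ for odd $k$ (zero for even $k \ne 0$), so that pairing the terms $k$ and $-k$ gives a sine series. The function name is illustrative.

```python
import math


def step_partial_sum(n: int, theta: float) -> float:
    """S_N f(theta) for the step function: 1/2 + sum over odd k <= N of 2 sin(k theta) / (pi k)."""
    total = 0.5
    for k in range(1, n + 1, 2):
        total += 2.0 * math.sin(k * theta) / (math.pi * k)
    return total


for n in (11, 101, 1001):
    print(n, step_partial_sum(n, 0.0), step_partial_sum(n, math.pi / 2))
```

At the jump $\theta = 0$ every partial sum equals $1/2$, the average of the one-sided limits, while at $\theta = \pi/2$ the partial sums approach 1, the value of the function there.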

Exercises

1. Prove (4.80).

2. Prove (4.96).

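3. Compute $\hat f(k)$ when

(4.99)  $f(\theta) = \begin{cases} 1 & \text{for } 0 < \theta < \pi, \\ 0 & \text{for } -\pi < \theta < 0. \end{cases}$

4. Use the convergence result (4.95), applied to $f$ in (4.99) at $\theta = \pi/2$, to show the following (compare Exercise 31 in Chapter 4, §5):

$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots.$

5. Apply (4.60) when $f(\theta)$ is given by (4.16). Use this to show that

$\sum_{k=1}^{\infty} \frac{1}{k^4} = \frac{\pi^4}{90}.$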


6. Use Proposition 4.12 in concert with Proposition 4.2 to demonstrate that (4.3) holds when $f$ is Lipschitz and piecewise $C^2$ on $T^1$, without recourse to Corollary 3.4 (whose proof in §3 uses the Stone-Weierstrass theorem). Use this in turn to prove Proposition 4.1, without using Corollary 3.4.

7. Use the results of Exercise 6 to give a proof of Corollary 3.4 that does not use the Stone-Weierstrass theorem.

Hint. As in the end of the proof of Theorem 2.1, each $f \in C(T^1)$ can be uniformly approximated by a sequence of Lipschitz, piecewise linear functions.

Recall that Corollary 3.4 states that each $f \in C(T^1)$ can be uniformly approximated by a sequence of finite linear combinations of the functions $e^{ik\theta}$, $k \in \mathbb{Z}$. The proof given in §3 relied on the Weierstrass approximation theorem, Theorem 2.1, which was used in the proof of Theorems 3.1 and 3.2. Exercise 7 indicates a proof of Corollary 3.4 that does not depend on Theorem 2.1.

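Hint. You can take $I = [-\pi/2, \pi/2]$. Given $f \in C(I)$, you can extend it to $f \in C([-\pi, \pi])$, vanishing at $\pm\pi$, and identify such $f$ with an element of $C(T^1)$. Given $\varepsilon > 0$, approximate $f$ uniformly to within $\varepsilon$ on $[-\pi, \pi]$ by a finite sum

$\sum_{k=-N}^{N} a_k e^{ik\theta}.$

Then approximate each $e^{ik\theta}$ uniformly on $[-\pi, \pi]$ by a partial sum of the power series for $e^{ik\theta}$.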

9. Let $f \in C(T^1)$. Assume there exist $f_\nu \in A(T^1)$ and $B < \infty$ such that $f_\nu \to f$ uniformly on $T^1$ and

$\sum_{k=-\infty}^{\infty} |\hat f_\nu(k)| \le B, \quad \forall\, \nu.$

10. Let $f \in C(T^1)$. Assume there exist $f_\nu \in C(T^1)$ satisfying the conditions of Proposition 4.6 such that $f_\nu \to f$ uniformly on $T^1$, and assume there exists $C < \infty$ such that

$\int_{T^1} |f_\nu'(\theta)|^2\, d\theta \le C, \quad \forall\, \nu.$


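5. Newton’s method

Here we describe a method to approximate a solution to

(5.1)  $f(\xi) = 0.$

We assume $f : [a, b] \to \mathbb{R}$ is continuous and $C^2$ on $(a, b)$, and that it is known that $f$ vanishes somewhere in $(a, b)$. For example, $f(a)$ and $f(b)$ might have opposite signs. We take $x_0 \in (a, b)$ as an initial guess of a solution to (5.1), and inductively construct the sequence $(x_k)$, going from $x_k$ to $x_{k+1}$ as follows. Replace $f$ by its best linear approximation at $x_k$,

(5.2)  $g(x) = f(x_k) + f'(x_k)(x - x_k),$

and solve $g(x_{k+1}) = 0$. This yields

(5.3)  $x_{k+1} - x_k = -\frac{f(x_k)}{f'(x_k)},$

or

(5.4)  $x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}.$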

Naturally, we need to assume $f'(x)$ is bounded away from 0 on $(a, b)$. This production of the sequence $(x_k)$ is Newton’s method, and as we will see, under appropriate hypotheses it converges quite rapidly to $\xi$.
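The recursion (5.4) is short enough to write out directly. The following is a minimal sketch, not a robust root finder: the function name `newton_iterates` and the sample cubic $x^3 - 2x - 5$ are illustrative choices that do not come from the text, and no attempt is made to verify the hypotheses under which the method converges.

```python
from typing import Callable, List


def newton_iterates(
    f: Callable[[float], float],
    fprime: Callable[[float], float],
    x0: float,
    steps: int,
) -> List[float]:
    """Return [x_0, x_1, ..., x_steps], where x_{k+1} = x_k - f(x_k) / f'(x_k)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        slope = fprime(x)
        if slope == 0.0:
            # The step is undefined when the derivative vanishes.
            raise ZeroDivisionError("f'(x_k) vanished; the Newton step is undefined")
        xs.append(x - f(x) / slope)
    return xs


def sample_f(x: float) -> float:
    # Hypothetical test problem, chosen only for illustration.
    return x**3 - 2 * x - 5


def sample_fprime(x: float) -> float:
    return 3 * x**2 - 2


cubic_iterates = newton_iterates(sample_f, sample_fprime, 2.0, 6)
for k, x in enumerate(cubic_iterates):
    print(k, x, sample_f(x))
```

Printing $f(x_k)$ alongside $x_k$ shows how quickly the residual drops once the iterates are close to a root.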

We want to give a condition guaranteeing that $|x_{k+1} - \xi| < |x_k - \xi|$. Say

(5.5)  $x_k = \xi + \delta.$

Then (5.4) gives

(5.6)  $x_{k+1} - \xi = \delta - \frac{f(\xi + \delta)}{f'(\xi + \delta)} = \frac{f'(\xi + \delta)\delta - f(\xi + \delta)}{f'(\xi + \delta)}.$


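Now the mean value theorem implies $f(\xi + \delta) - f(\xi) = f'(\xi + \tau\delta)\delta$ for some $\tau \in (0, 1)$. Since $f(\xi) = 0$, we get from (5.6) that

(5.7)  $x_{k+1} - \xi = \frac{f'(\xi + \delta) - f'(\xi + \tau\delta)}{f'(\xi + \delta)}\, \delta.$

A second application of the mean value theorem gives

(5.8)  $f'(\xi + \delta) - f'(\xi + \tau\delta) = (1 - \tau)\delta\, f''(\xi + \gamma\delta),$

for some $\gamma \in (\tau, 1)$, hence

(5.9)  $x_{k+1} - \xi = (1 - \tau)\, \frac{f''(\xi + \gamma\delta)}{f'(\xi + \delta)}\, \delta^2, \quad \tau \in (0, 1),\ \gamma \in (\tau, 1).$

Consequently,

(5.10)  $|x_{k+1} - \xi| \le \sup_{0 < \gamma < 1} \Bigl| \frac{f''(\xi + \gamma\delta)}{f'(\xi + \delta)} \Bigr| \delta^2.$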

A favorable condition for convergence is that the right side of (5.10) is ≤ βδ for some

β < 1. This leads to the following.

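Proposition 5.1. Let $f \in C([a, b])$ be $C^2$ on $(a, b)$. Assume there exists a solution $\xi \in (a, b)$ to (5.1). Assume there exist $A, B \in (0, \infty)$ such that

(5.11)  $|f''(x)| \le A, \quad |f'(x)| \ge B, \quad \forall\, x \in (a, b).$

Pick $x_0 \in (a, b)$ and $\delta_0 > 0$ such that

(5.12)  $|x_0 - \xi| = \delta_0, \quad [\xi - \delta_0, \xi + \delta_0] \subset (a, b),$

and

(5.13)  $\frac{A}{B}\, \delta_0 = \beta < 1.$

Then $x_k$, defined inductively by (5.4), converges to $\xi$ as $k \to \infty$.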

When Proposition 5.1 applies, one clearly has

(5.14)  $|x_k - \xi| \le \beta^k \delta_0.$

In fact, (5.10) implies much faster convergence than this. With $|x_k - \xi| = \delta_k$, (5.10) implies

(5.15)  $\delta_{k+1} \le \frac{A}{B}\, \delta_k^2,$


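hence

(5.16)  $\delta_1 \le \frac{A}{B}\, \delta_0^2, \quad \delta_2 \le \Bigl( \frac{A}{B} \Bigr)^{1+2} \delta_0^4, \quad \delta_3 \le \Bigl( \frac{A}{B} \Bigr)^{1+2+4} \delta_0^8,$

and, inductively,

(5.17)  $\delta_k \le \Bigl( \frac{A}{B} \Bigr)^{2^k - 1} \delta_0^{2^k} = \beta^{2^k - 1} \delta_0,$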

with β as in (5.13). Note that the exponent on β in (5.17) is much larger (for moderately

large k) than that in (5.14). One says the sequence (xk ) converges quadratically to the

limit ξ, solving (5.1). Roughly speaking, xk+1 has twice as many digits of accuracy as xk .

If we change (5.1) to

(5.18)  $f(\xi) = y,$

then the results above apply to $\tilde f(x) = f(x) - y$, so we get the sequence of approximate solutions defined inductively by

(5.19)  $x_{k+1} = x_k - \frac{f(x_k) - y}{f'(x_k)}.$

As an example, let us take

(5.20)  $f(x) = x^2 \text{ on } [a, b] = [1, 2],$

and approximate $\xi = \sqrt{2}$, which solves (5.18) with $y = 2$. Note that $f(1) = 1 < 2$ and $f(2) = 4 > 2$. In this case, (5.19) becomes

(5.21)  $x_{k+1} = x_k - \frac{x_k^2 - 2}{2 x_k} = \frac{x_k}{2} + \frac{1}{x_k}.$

Let us pick

(5.22)  $x_0 = \frac{3}{2}.$

Examining $(1.4)^2$ and $(1.5)^2$, we see that $1.4 < \sqrt{2} < 1.5$. Thus (5.12) holds with $\delta_0 < 1/10$. Furthermore, (5.11) holds with $A = B = 2$, so (5.13) holds with $\beta < 1/10$. Hence, by (5.17),

(5.23)  $|x_k - \sqrt{2}| \le 10^{-2^k}.$


(5.24)  $\begin{aligned} x_0 &= 1.5 \\ x_1 &= 1.41666666666666 \\ x_2 &= 1.41421568627451 \\ x_3 &= 1.41421356237469 \\ x_4 &= 1.41421356237309. \end{aligned}$
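The values in (5.24) are limited by double precision, which hides how fast the error actually falls. The sketch below repeats the iteration $x_{k+1} = x_k/2 + 1/x_k$ from $x_0 = 3/2$ using Python's `decimal` module, with 60 significant digits (an arbitrary choice), and compares each error with the bound $10^{-2^k}$ of (5.23).

```python
from decimal import Decimal, getcontext

# Work with 60 significant digits so that errors far below 1e-16 are visible.
getcontext().prec = 60

sqrt2 = Decimal(2).sqrt()
x = Decimal(3) / Decimal(2)  # x_0 = 3/2

errors = []
for k in range(6):
    errors.append(abs(x - sqrt2))
    print(k, x, errors[-1])
    x = x / 2 + 1 / x  # Newton step for x^2 = 2

# Compare |x_k - sqrt(2)| with the bound 10^(-2^k) for k = 0, ..., 5.
bound_holds = [errors[k] <= Decimal(10) ** (-(2 ** k)) for k in range(6)]
print(bound_holds)
```

The number of correct digits roughly doubles at each step, which is the behavior described as quadratic convergence.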

Under certain circumstances, Newton’s method can be even better than quadratically convergent. This happens when $f''(\xi) = 0$, assuming also that $f$ is $C^3$. In such a case, the mean value theorem implies

(5.25)  $f''(\xi + \gamma\delta) = f''(\xi + \gamma\delta) - f''(\xi) = \gamma\delta\, f^{(3)}(\xi + \sigma\gamma\delta),$

for some $\sigma \in (0, 1)$. Hence, given $|x_k - \xi| = \delta_k$, we get from (5.10) that

(5.26)  $|x_{k+1} - \xi| \le \sup_{0 < \gamma < 1} \Bigl| \frac{f^{(3)}(\xi + \gamma\delta_k)}{f'(\xi + \delta_k)} \Bigr| \delta_k^3.$

Thus $x_k \to \xi$ cubically.

Here is an application to the production of a sequence that rapidly converges to $\pi$, based on

(5.27)  $\sin \pi = 0.$

We take $f(x) = \sin x$. Then $f''(x) = -\sin x$, so the considerations above apply. The iteration (5.4) becomes

(5.28)  $x_{k+1} = x_k - \frac{\sin x_k}{\cos x_k}.$

If $x_k = \pi + \delta_k$, note that

(5.29)  $\cos x_k = -1 + O(\delta_k^2),$

so the iteration

(5.30)  $x_{k+1} = x_k + \sin x_k$

is also cubically convergent, if $x_0$ is chosen close enough to $\pi$. Now, the first few terms of the series (4.27)–(4.31) of Chapter 4, applied to

(5.31)  $\frac{\pi}{6} = \int_0^{1/2} \frac{dx}{\sqrt{1 - x^2}}$


yield the initial approximation

(5.32)  $x_0 = 3.$

Iterating (5.30), we obtain

(5.33)  $\begin{aligned} x_1 &= 3.14112000805987 \\ x_2 &= 3.14159265357220 \\ x_3 &= 3.14159265358979. \end{aligned}$

The error $\pi - x_2$ is $< 2 \cdot 10^{-11}$, and all the printed digits of $x_3$ are accurate. If the computation were to higher precision, $x_3$ would approximate $\pi$ to quite a few more digits.

By contrast, we apply Newton’s method to

(5.34)  $\sin \frac{\pi}{6} = \frac{1}{2}$

(equivalent to (5.31)). In this case, $f(x) = \sin(x/6)$, and (5.19) becomes

(5.35)  $x_{k+1} = x_k - 6\, \frac{\sin(x_k/6) - 1/2}{\cos(x_k/6)}.$

Taking $x_0 = 3$ as in (5.32), we obtain

(5.36)  $\begin{aligned} x_1 &= 3.14066684291090 \\ x_2 &= 3.14159261236234 \\ x_3 &= 3.14159265358979. \end{aligned}$

Note that $x_2$ here is substantially less accurate than $x_2$ in (5.33). Here, $x_3$ has full accuracy, though as noted above, $x_3$ in (5.33) could be much more accurate if the computation (5.30) were done to higher precision.
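The two tables can be reproduced in a few lines. The sketch below assumes that the cubically convergent iteration (5.30) is $x_{k+1} = x_k + \sin x_k$, which matches the printed value of $x_1$ in (5.33), and that (5.35) is the Newton step (5.19) for $f(x) = \sin(x/6)$, $y = 1/2$. The helper `iterate` is a hypothetical name used only here.

```python
import math
from typing import Callable, List


def iterate(step: Callable[[float], float], x0: float, n: int) -> List[float]:
    """Return [x_0, ..., x_n] for the recursion x_{k+1} = step(x_k)."""
    xs = [x0]
    for _ in range(n):
        xs.append(step(xs[-1]))
    return xs


def cubic_step(x: float) -> float:
    # Assumed form of (5.30): x + sin x.
    return x + math.sin(x)


def newton_step(x: float) -> float:
    # Newton's method (5.19) for sin(x/6) = 1/2; f'(x) = cos(x/6) / 6.
    return x - 6.0 * (math.sin(x / 6.0) - 0.5) / math.cos(x / 6.0)


cubic = iterate(cubic_step, 3.0, 3)
newton = iterate(newton_step, 3.0, 3)

for k in range(4):
    print(k, cubic[k], math.pi - cubic[k], newton[k], math.pi - newton[k])
```

In double precision both sequences reach $\pi$ to within rounding error at $x_3$, but the error column for $x_2$ shows the gap between cubic and quadratic convergence.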

Exercises

Use Newton’s method to obtain approximate solutions to the following equations.

1. $x^5 - x^3 + 1 = 0$.

2. $e^x = 2x$.

3. $\tan x = x$.


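4. $x \log x = 2$.

5. $x^x = 3$.

6. Apply Newton’s method to $f(x) = 1/x$, obtaining the sequence

(5.37)  $x_{k+1} = 2x_k - a x_k^2$

of approximate solutions to $f(x) = a$. That is, $x_k \to 1/a$, if $x_0$ is close enough. Try this out with $a = 3$, $x_0 = 0.3$. Note that the right side of (5.37) involves only multiplication and subtraction.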
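For readers who want to try the last exercise on a computer, here is a minimal sketch. It assumes (5.37) is the Newton recursion for $f(x) = 1/x$, namely $x_{k+1} = 2x_k - a x_k^2$, which is what (5.19) gives with $y = a$; the function name is illustrative.

```python
from typing import List


def reciprocal_iterates(a: float, x0: float, steps: int) -> List[float]:
    """Iterate x_{k+1} = 2 x_k - a x_k^2, which uses no division."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(2 * x - a * x * x)
    return xs


approx = reciprocal_iterates(3.0, 0.3, 6)
for k, x in enumerate(approx):
    # 1 - a * x_k is squared at every step, which is quadratic convergence.
    print(k, x, 1 - 3.0 * x)
```

With $a = 3$ and $x_0 = 0.3$, the quantity $1 - a x_k$ starts at $0.1$ and is squared at each step, so a handful of iterations suffice in double precision.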


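A. Inner product spaces

In §4, we have looked at norms and inner products on spaces of functions, such as $C(S^1)$ and $R(S^1)$, which are vector spaces. Generally, a complex vector space $V$ is a set on which there are operations of vector addition:

(A.1)  $f, g \in V \Longrightarrow f + g \in V,$

and multiplication by an element of $\mathbb{C}$ (called scalar multiplication):

(A.2)  $a \in \mathbb{C},\ f \in V \Longrightarrow af \in V,$

satisfying the usual associative, commutative, and distributive laws. These properties are readily verified for the function spaces mentioned above.

An inner product on a complex vector space $V$ assigns to elements $f, g \in V$ the quantity $(f, g) \in \mathbb{C}$, in a fashion that obeys the following three rules:

(A.6)  $\begin{aligned} &(a_1 f_1 + a_2 f_2, g) = a_1 (f_1, g) + a_2 (f_2, g), \\ &(f, g) = \overline{(g, f)}, \\ &(f, f) > 0 \text{ unless } f = 0. \end{aligned}$

A vector space equipped with an inner product is called an inner product space. For example,

(A.7)  $(f, g) = \frac{1}{2\pi} \int_{S^1} f(\theta) \overline{g(\theta)}\, d\theta$

defines an inner product on $C(S^1)$, and also on $R(S^1)$, where we identify two functions that differ only on a set of upper content zero. Similarly,

(A.8)  $(f, g) = \int_{-\infty}^{\infty} f(x) \overline{g(x)}\, dx$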


defines an inner product on $R(\mathbb{R})$ (where, again, we identify two functions that differ only on a set of upper content zero).

As another example, we define $\ell^2$ to consist of sequences $(a_k)_{k \in \mathbb{Z}}$ such that

(A.9)  $\sum_{k=-\infty}^{\infty} |a_k|^2 < \infty.$

An inner product on $\ell^2$ is given by

(A.10)  $\bigl( (a_k), (b_k) \bigr) = \sum_{k=-\infty}^{\infty} a_k \overline{b_k}.$

Given an inner product on $V$, one says the object $\|f\|$ defined by

(A.11)  $\|f\| = \sqrt{(f, f)}$

is the norm on $V$ associated with the inner product. Generally, a norm on $V$ is a function $f \mapsto \|f\|$ satisfying

(A.12)  $\|af\| = |a| \cdot \|f\|, \quad a \in \mathbb{C},\ f \in V,$

(A.13)  $\|f\| > 0 \text{ unless } f = 0,$

(A.14)  $\|f + g\| \le \|f\| + \|g\|.$

The property (A.14) is called the triangle inequality. A vector space equipped with a norm is called a normed vector space. We can define a distance function on such a space by

(A.15)  $d(f, g) = \|f - g\|.$

If $\|f\|$ is given by (A.11), from an inner product satisfying (A.6), it is clear that (A.12)–(A.13) hold, but (A.14) requires a demonstration. Note that

(A.16)  $\|f + g\|^2 = (f + g, f + g) = \|f\|^2 + (f, g) + (g, f) + \|g\|^2 = \|f\|^2 + 2 \operatorname{Re}(f, g) + \|g\|^2,$

while

(A.17)  $(\|f\| + \|g\|)^2 = \|f\|^2 + 2 \|f\| \cdot \|g\| + \|g\|^2.$

Thus to establish (A.14) it suffices to prove the following, known as Cauchy’s inequality.


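Proposition A.1. For any inner product on a vector space $V$, with $\|f\|$ defined by (A.11),

(A.18)  $|(f, g)| \le \|f\| \cdot \|g\|, \quad \forall\, f, g \in V.$

Proof. We start with

$0 \le \|f - g\|^2 = \|f\|^2 - 2 \operatorname{Re}(f, g) + \|g\|^2,$

which implies

$2 \operatorname{Re}(f, g) \le \|f\|^2 + \|g\|^2, \quad \forall\, f, g \in V.$

Replacing $f$ by $af$ for arbitrary $a \in \mathbb{C}$ of absolute value 1 yields $2 \operatorname{Re} a(f, g) \le \|f\|^2 + \|g\|^2$ for all such $a$, hence

$2 |(f, g)| \le \|f\|^2 + \|g\|^2, \quad \forall\, f, g \in V.$

Replacing $f$ by $tf$ and $g$ by $t^{-1} g$ for arbitrary $t \in (0, \infty)$, we have

$2 |(f, g)| \le t^2 \|f\|^2 + t^{-2} \|g\|^2, \quad \forall\, f, g \in V,\ t \in (0, \infty).$

If we take $t^2 = \|g\| / \|f\|$, we obtain the desired inequality (A.18). This assumes $f$ and $g$ are both nonzero, but (A.18) is trivial if $f$ or $g$ is 0.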
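Cauchy's inequality and the triangle inequality are easy to test on finitely supported sequences, viewed as elements of $\ell^2$ with the inner product $\sum_k a_k \overline{b_k}$. The sketch below uses randomly generated complex vectors; the helper names, the vector length, and the random seed are arbitrary illustrative choices.

```python
import math
import random
from typing import List


def inner(a: List[complex], b: List[complex]) -> complex:
    """Inner product sum_k a_k * conj(b_k) for finite sequences."""
    return sum(x * y.conjugate() for x, y in zip(a, b))


def norm(a: List[complex]) -> float:
    """Norm associated with the inner product: sqrt((a, a))."""
    return math.sqrt(inner(a, a).real)


def random_vector(n: int) -> List[complex]:
    return [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]


random.seed(0)
f = random_vector(8)
g = random_vector(8)

# Both gaps should be nonnegative.
cauchy_gap = norm(f) * norm(g) - abs(inner(f, g))
triangle_gap = norm(f) + norm(g) - norm([x + y for x, y in zip(f, g)])
print(cauchy_gap, triangle_gap)

# Equality in Cauchy's inequality when g is a scalar multiple of f.
g_parallel = [2j * x for x in f]
equality_gap = norm(f) * norm(g_parallel) - abs(inner(f, g_parallel))
print(equality_gap)
```

The last line illustrates that the inequality is sharp: the gap vanishes (up to rounding) when one vector is a multiple of the other.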

An inner product space $V$ is called a Hilbert space if it is a complete metric space, i.e., if every Cauchy sequence $(f_\nu)$ in $V$ has a limit in $V$. The space $\ell^2$ has this completeness property, but $C(S^1)$, with inner product (A.7), does not, nor does $R(S^1)$. Chapter 2 describes a process of constructing the completion of a metric space. When applied to an incomplete inner product space, it produces a Hilbert space. When this process is applied to $C(S^1)$, the completion is the space $L^2(S^1)$. An alternative construction of $L^2(S^1)$ uses the Lebesgue integral.


References

[BS] R. Bartle and D. Sherbert, Introduction to Real Analysis, J. Wiley, New York, 1992.

[Be] P. Beckmann, A History of π, St. Martin’s Press, New York, 1971.

[C] P. Cohen, Set Theory and the Continuum Hypothesis, Dover, New York, 2008.

[Dev] K. Devlin, The Joy of Sets: Fundamentals of Contemporary Set Theory, Springer-Verlag, New York, 1993.

[Fol] G. Folland, Real Analysis: Modern Techniques and Applications, Wiley-Interscience, New York, 1984.

[Niv] I. Niven, A simple proof that π is irrational, Bull. AMS 53 (1947), 509.

[T1] M. Taylor, Measure Theory and Integration, American Mathematical Society, Providence RI, 2006.

[T2] M. Taylor, Introduction to Analysis in Several Variables (Advanced Calculus), Lecture notes, available at http://www.unc.edu/math/Faculty/met/math521.html

[T3] M. Taylor, Introduction to Complex Analysis, Lecture notes, available at http://www.unc.edu/math/Faculty/met/complex.html

[T4] M. Taylor, Partial Differential Equations, Vols. 1–3, Springer-Verlag, New York, 1996 (2nd ed., 2011).

[T5] M. Taylor, Introduction to Differential Equations, American Mathematical Society, Providence RI, 2011.

[T6] M. Taylor, Elementary Differential Geometry, Lecture notes, available at http://www.unc.edu/math/Faculty/met/diffg.html


Index

absolute value 28, 59

absolutely convergent series 40, 61, 94

accumulation point 74, 79

alternating series 40

arc length 147

Archimedean property 26, 36

arctangent 188

Ascoli’s theorem 104

associative law 9, 16

ball 74

Banach space 104

Bernoulli numbers 194

bisection method 58

Bolzano-Weierstrass theorem 38

calculus 107

cancellation law 10

Cantor set 57, 134

Card 50

cardinal number 49

cardinality 50

Cauchy inequality 68

Cauchy remainder formula 141

Cauchy sequence 29, 37, 69, 73

change of variable 133

circle 149

cis 61

closed set 54, 62, 74

closure 74

commutative law 9, 16

compact set 54, 62, 70, 79, 111

completeness property 38, 60, 69

completion 73

complex conjugate 59

complex number 59

composite number 21


connected 74, 89

cont+ 123

cont− 123

continued fraction 32

continuous 55, 87, 109, 120

continuum hypothesis 58

convergent sequence 28, 37, 60

convex 116

convolution 202

cos 61, 150, 160

cosh 160

countable 51

countably infinite 51

cover 56

cubic convergence 232

curve 147

dense 74

derivative 109

diagonal construction 82

differentiable function 109

differential equation 155

Dirichlet convergence test 199

Dirichlet kernel 224

distance 73

disk 96

dot product 67

e 45, 155

elliptic function 152

elliptic integral 152

equicontinuity 104

equivalence class 15

equivalence relation 15

Euclidean space 67

Euler identity 62, 160

exponential function 61, 155

Fourier series 200, 213

Fundamental theorem of algebra 178

Fundamental theorem of arithmetic 20

Fundamental theorem of calculus 124, 125


function 86

geometric series 96

Heine-Borel theorem 56

Hölder continuous 226

induction principle 8

infinite decimal expansion 41

infinite series 39, 60, 93

inner product 235

inner product space 235

integer 15

integral 119

integral remainder formula 141

integral test 134

integration by parts 132

Intermediate value theorem 55, 75, 89

interval 55, 74

Inverse function theorem 112, 149

irrational number 45, 179

limit 28

log 155

lower content 123

maximum 87, 101

maxsize 118

Mean value theorem 111, 125, 126, 229

metric space 73

min 55, 63, 72, 91, 207

minimum 87, 110

modulus of continuity 88

monotone function 129

monotone sequence 30, 38

multiplying power series 98

neighborhood 74

Newton’s method 229


norm 67

order relation 11, 17, 25

outer measure 123

partition 119

path-connected 87

Peano axioms 8

perfect set 57, 75

π 150, 161, 162, 180, 232

piecewise constant 129

piecewise regular 224

polar coordinates 61, 152

polynomial 178, 202

power series 96, 136, 155

prime number 20

principle of induction 8

product rule 109, 156

Pythagorean theorem 59, 68

radius of convergence 96

ratio test 33

rational number 23

real number 34

refinement 119

remainder in a power series 140

reparametrization 147

Riemann integrable 120

Riemann integral 119

Riemann-Lebesgue lemma 222

Riemann sum 122, 147

Schroeder-Bernstein theorem 50

second derivative 114

second derivative test 114

semicontinuous 90

sec 161

sequence 28

sin 61, 150, 160

sinh 162


speed 147

Stone-Weierstrass theorem 211

subgroup 21

summation by parts 195

sup 39, 88

supremum property 39

tan 159

triangle inequality 28, 59, 68, 73, 236

trigonometric function 150, 160

trigonometric polynomial 213

Tychonov theorem 79

uncountable 52

uniform convergence 92

uniformly continuous 88

uniformly equicontinuous 105

upper bound 39

upper content 123

velocity 147

Weierstrass M test 94, 97

well ordering property 12
