
Lecture Notes on Measure Theory and Functional Analysis

P. Cannarsa & T. D'Aprile
Dipartimento di Matematica
Università di Roma "Tor Vergata"
cannarsa@mat.uniroma2.it daprile@mat.uniroma2.it
a.a. 2006/07
Contents
1 Measure Spaces
1.1 Algebras and σ-algebras of sets
1.1.1 Notation and preliminaries
1.1.2 Algebras and σ-algebras
1.2 Measures
1.2.1 Additive and σ-additive functions
1.2.2 Borel-Cantelli Lemma
1.2.3 Measure spaces
1.3 The basic extension theorem
1.3.1 Monotone classes
1.3.2 Outer measures
1.4 Borel measures in $\mathbb{R}^N$
1.4.1 Lebesgue measure in [0, 1)
1.4.2 Lebesgue measure in $\mathbb{R}$
1.4.3 Examples
1.4.4 Regularity of Radon measures
2 Integration
2.1 Measurable functions
2.1.1 Inverse image of a function
2.1.2 Measurable maps and Borel functions
2.1.3 Approximation by continuous functions
2.2 Integral of Borel functions
2.2.1 Repartition function
2.2.2 Integral of positive simple functions
2.2.3 The archimedean integral
2.2.4 Integral of positive Borel functions
2.2.5 Integral of functions with variable sign
2.3 Convergence of integrals
2.3.1 Dominated Convergence
2.3.2 Uniform integrability
2.3.3 Integrals depending on a parameter
3 $L^p$ spaces
3.1 Spaces $\mathcal{L}^p(X, \mathcal{E}, \mu)$ and $L^p(X, \mathcal{E}, \mu)$
3.2 Space $L^\infty(X, \mathcal{E}, \mu)$
3.3 Convergence in measure
3.4 Convergence and approximation in $L^p$
3.4.1 Convergence results
3.4.2 Dense subsets of $L^p$
4 Hilbert spaces
4.1 Definitions and examples
4.2 Orthogonal projections
4.2.1 Projection onto a closed convex set
4.2.2 Projection onto a closed subspace
4.3 The Riesz Representation Theorem
4.3.1 Bounded linear functionals
4.3.2 Riesz Theorem
4.4 Orthonormal sets and bases
4.4.1 Bessel's inequality
4.4.2 Orthonormal bases
4.4.3 Completeness of the trigonometric system
5 Banach spaces
5.1 Definitions and examples
5.2 Bounded linear operators
5.2.1 The principle of uniform boundedness
5.2.2 The open mapping theorem
5.3 Bounded linear functionals
5.3.1 The Hahn-Banach Theorem
5.3.2 Separation of convex sets
5.3.3 The dual of $\ell^p$
5.4 Weak convergence and reflexivity
5.4.1 Reflexive spaces
5.4.2 Weak convergence and BW property
6 Product measures
6.1 Product spaces
6.1.1 Product measure
6.1.2 Fubini-Tonelli Theorem
6.2 Compactness in $L^p$
6.3 Convolution and approximation
6.3.1 Convolution Product
6.3.2 Approximation by smooth functions
7 Functions of bounded variation and absolutely continuous functions
7.1 Monotonic functions
7.1.1 Differentiation of a monotonic function
7.2 Functions of bounded variation
7.3 Absolutely continuous functions
A Appendix
A.1 Distance function
A.2 Legendre transform
A.3 Baire's Lemma
A.4 Precompact families of continuous functions
A.5 Vitali's covering theorem
Chapter 1
Measure Spaces
1.1 Algebras and σ-algebras of sets
1.1.1 Notation and preliminaries
We shall denote by $X$ a nonempty set, by $\mathcal{P}(X)$ the set of all parts (i.e., subsets) of $X$, and by $\emptyset$ the empty set.
For any subset $A$ of $X$ we shall denote by $A^c$ its complement, i.e.,
\[ A^c = \{ x \in X \mid x \notin A \}. \]
For any $A, B \in \mathcal{P}(X)$ we set $A \setminus B = A \cap B^c$.
Let $(A_n)$ be a sequence in $\mathcal{P}(X)$. The following De Morgan identity holds:
\[ \Big( \bigcup_{n=1}^{\infty} A_n \Big)^c = \bigcap_{n=1}^{\infty} A_n^c . \]
We define(1)
\[ \limsup_{n\to\infty} A_n = \bigcap_{n=1}^{\infty} \bigcup_{m=n}^{\infty} A_m , \qquad \liminf_{n\to\infty} A_n = \bigcup_{n=1}^{\infty} \bigcap_{m=n}^{\infty} A_m . \]
If $L := \limsup_{n\to\infty} A_n = \liminf_{n\to\infty} A_n$, then we set $L = \lim_{n\to\infty} A_n$, and we say that $(A_n)$ converges to $L$ (in this case we shall write $A_n \to L$).
(1) Observe the relationship with inf and sup limits for a sequence $(a_n)$ of real numbers. We have $\limsup_{n\to\infty} a_n = \inf_{n\in\mathbb{N}} \sup_{m\ge n} a_m$ and $\liminf_{n\to\infty} a_n = \sup_{n\in\mathbb{N}} \inf_{m\ge n} a_m$.
Remark 1.1 (a) As is easily checked, $\limsup_{n\to\infty} A_n$ (resp. $\liminf_{n\to\infty} A_n$) consists of those elements of $X$ that belong to infinitely many elements of $(A_n)$ (resp. that belong to all elements of $(A_n)$ except perhaps a finite number). Therefore,
\[ \liminf_{n\to\infty} A_n \subset \limsup_{n\to\infty} A_n . \]
(b) It is also easy to check that, if $(A_n)$ is increasing ($A_n \subset A_{n+1}$, $n \in \mathbb{N}$), then
\[ \lim_{n\to\infty} A_n = \bigcup_{n=1}^{\infty} A_n , \]
whereas, if $(A_n)$ is decreasing ($A_n \supset A_{n+1}$, $n \in \mathbb{N}$), then
\[ \lim_{n\to\infty} A_n = \bigcap_{n=1}^{\infty} A_n . \]
In the first case we shall write $A_n \uparrow L$, and in the second $A_n \downarrow L$.
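The characterization in Remark 1.1 can be explored on a small example. The following Python sketch is an illustration only and is not part of the original notes: it truncates the infinite unions and intersections at a finite `horizon`, which is exact only for eventually periodic sequences such as the hypothetical sequence `A` below. All names are assumptions made for this example.

```python
def tail_union(A, n, horizon):
    """Union of A(m) for n <= m <= horizon (finite stand-in for the infinite union)."""
    return set().union(*(A(m) for m in range(n, horizon + 1)))


def tail_intersection(A, n, horizon):
    """Intersection of A(m) for n <= m <= horizon."""
    return set.intersection(*(A(m) for m in range(n, horizon + 1)))


def limsup_sets(A, n_max, horizon):
    """Truncated limsup: intersection over n <= n_max of the tail unions."""
    return set.intersection(*(tail_union(A, n, horizon) for n in range(1, n_max + 1)))


def liminf_sets(A, n_max, horizon):
    """Truncated liminf: union over n <= n_max of the tail intersections."""
    return set().union(*(tail_intersection(A, n, horizon) for n in range(1, n_max + 1)))


def A(n):
    """Hypothetical sequence: 'always' is in every A_n, 'even'/'odd' alternate,
    and 'early' belongs only to A_1, A_2, A_3."""
    s = {"always", "even" if n % 2 == 0 else "odd"}
    if n <= 3:
        s.add("early")
    return s


ls = limsup_sets(A, n_max=10, horizon=20)
li = liminf_sets(A, n_max=10, horizon=20)
print(sorted(ls))  # elements lying in infinitely many A_n
print(sorted(li))  # elements lying in all but finitely many A_n
```

Here 'even' and 'odd' each belong to infinitely many sets but miss infinitely many others, so they appear in the limsup and not in the liminf, while 'early' appears in neither.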
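1.1.2 Algebras and σ-algebras
Let $\mathcal{A}$ be a nonempty subset of $\mathcal{P}(X)$.
Definition 1.2 $\mathcal{A}$ is said to be an algebra if
(a) $\emptyset, X \in \mathcal{A}$
(b) $A, B \in \mathcal{A} \implies A \cup B \in \mathcal{A}$
(c) $A \in \mathcal{A} \implies A^c \in \mathcal{A}$
Remark 1.3 It is easy to see that, if $\mathcal{A}$ is an algebra and $A, B \in \mathcal{A}$, then $A \cap B$ and $A \setminus B$ belong to $\mathcal{A}$. Therefore, the symmetric difference
\[ A \Delta B := (A \setminus B) \cup (B \setminus A) \]
also belongs to $\mathcal{A}$. Moreover, $\mathcal{A}$ is stable under finite union and intersection, that is,
\[ A_1, \dots, A_n \in \mathcal{A} \implies A_1 \cup \dots \cup A_n \in \mathcal{A} \ \text{ and } \ A_1 \cap \dots \cap A_n \in \mathcal{A}. \]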
Definition 1.4 An algebra $\mathcal{E}$ in $\mathcal{P}(X)$ is said to be a σ-algebra if, for any sequence $(A_n)$ of elements of $\mathcal{E}$, we have $\bigcup_{n=1}^{\infty} A_n \in \mathcal{E}$.
We note that, if $\mathcal{E}$ is a σ-algebra and $(A_n) \subset \mathcal{E}$, then $\bigcap_{n=1}^{\infty} A_n \in \mathcal{E}$ owing to the De Morgan identity. Moreover,
\[ \liminf_{n\to\infty} A_n \in \mathcal{E} , \qquad \limsup_{n\to\infty} A_n \in \mathcal{E} . \]
Example 1.5 The following examples explain the difference between algebras and σ-algebras.
1. Obviously, $\mathcal{P}(X)$ and $\mathcal{E} = \{\emptyset, X\}$ are σ-algebras in $X$. Moreover, $\mathcal{P}(X)$ is the largest σ-algebra in $X$, and $\mathcal{E}$ the smallest.
2. In $X = [0, 1)$, the class $\mathcal{A}_0$ consisting of $\emptyset$ and of all finite unions
\[ A = \bigcup_{i=1}^{n} [a_i, b_i) \quad \text{with } 0 \le a_i < b_i \le a_{i+1} \le 1 , \tag{1.1} \]
is an algebra. Indeed, for $A$ as in (1.1), we have
\[ A^c = [0, a_1) \cup [b_1, a_2) \cup \dots \cup [b_n, 1) \in \mathcal{A}_0 . \]
Moreover, in order to show that $\mathcal{A}_0$ is stable under finite union, it suffices to observe that the union of two (not necessarily disjoint) intervals $[a, b)$ and $[c, d)$ in $[0, 1)$ belongs to $\mathcal{A}_0$.
3. In an infinite set $X$ consider the class
\[ \mathcal{E} = \{ A \in \mathcal{P}(X) \mid A \text{ is finite, or } A^c \text{ is finite} \} . \]
Then, $\mathcal{E}$ is an algebra. Indeed, the only point that needs to be checked is that $\mathcal{E}$ is stable under finite union. Let $A, B \in \mathcal{E}$. If $A$ and $B$ are both finite, then so is $A \cup B$. In all other cases, $(A \cup B)^c$ is finite.
4. In an uncountable set $X$ consider the class
\[ \mathcal{E} = \{ A \in \mathcal{P}(X) \mid A \text{ is countable, or } A^c \text{ is countable} \} \]
(here, "countable" stands for "finite or countable"). Then, $\mathcal{E}$ is a σ-algebra. Indeed, $\mathcal{E}$ is stable under countable union: if $(A_n)$ is a sequence in $\mathcal{E}$ and all $A_n$ are countable, then so is $\bigcup_n A_n$; otherwise, $(\bigcup_n A_n)^c$ is countable.
Exercise 1.6 1. Show that the algebra $\mathcal{A}_0$ in Example 1.5.2 fails to be a σ-algebra.
2. Show that the algebra $\mathcal{E}$ in Example 1.5.3 fails to be a σ-algebra.
3. Show that the σ-algebra $\mathcal{E}$ in Example 1.5.4 is strictly smaller than $\mathcal{P}(X)$.
4. Let $\mathcal{K}$ be a subset of $\mathcal{P}(X)$. Show that the intersection of all σ-algebras including $\mathcal{K}$ is a σ-algebra (the minimal σ-algebra including $\mathcal{K}$).
Let $\mathcal{K}$ be a subset of $\mathcal{P}(X)$.
Definition 1.7 The intersection of all σ-algebras including $\mathcal{K}$ is called the σ-algebra generated by $\mathcal{K}$, and will be denoted by $\sigma(\mathcal{K})$.
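On a finite set every algebra is automatically a σ-algebra, so the generated σ-algebra can be computed by brute force. The sketch below is an illustration under that finiteness assumption and is not taken from the notes: it closes a generating class under complement and pairwise union until nothing new appears. The function name `generated_sigma_algebra` and the sample data are hypothetical.

```python
from itertools import combinations


def generated_sigma_algebra(X, K):
    """Smallest family containing K, the empty set and X that is closed under
    complement and union. Valid as sigma(K) only because X is finite."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(k) for k in K}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # Close under complement.
        for A in current:
            if X - A not in family:
                family.add(X - A)
                changed = True
        # Close under pairwise (hence finite) union.
        for A, B in combinations(current, 2):
            if A | B not in family:
                family.add(A | B)
                changed = True
    return family


sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
for S in sorted(sigma, key=lambda s: (len(s), sorted(s))):
    print(sorted(S))
```

In this example the points 3 and 4 are never separated by the generators, so the family has the three atoms {1}, {2}, {3, 4} and therefore 2^3 = 8 elements.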
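Exercise 1.8 In the following, let $\mathcal{K}, \mathcal{K}' \subset \mathcal{P}(X)$.
1. Show that, if $\mathcal{E}$ is a σ-algebra, then $\sigma(\mathcal{E}) = \mathcal{E}$.
2. Find $\sigma(\mathcal{K})$ for $\mathcal{K} = \{\emptyset\}$ and $\mathcal{K} = \{X\}$.
3. Show that, if $\mathcal{K} \subset \mathcal{K}' \subset \sigma(\mathcal{K})$, then $\sigma(\mathcal{K}') = \sigma(\mathcal{K})$.
Example 1.9 1. Let $E$ be a metric space. The σ-algebra generated by all open subsets of $E$ is called the Borel σ-algebra of $E$, and is denoted by $\mathcal{B}(E)$. Obviously, $\mathcal{B}(E)$ coincides with the σ-algebra generated by all closed subsets of $E$.
2. Let $X = \mathbb{R}$, and let $\mathcal{I}$ be the class of all semiclosed intervals $[a, b)$ with $a \le b$. Then, $\sigma(\mathcal{I})$ coincides with $\mathcal{B}(\mathbb{R})$. For let $[a, b) \subset \mathbb{R}$. Then, $[a, b) \in \mathcal{B}(\mathbb{R})$ since
\[ [a, b) = \bigcap_{n=n_0}^{\infty} \Big( a - \frac{1}{n} , b \Big) . \]
So, $\sigma(\mathcal{I}) \subset \mathcal{B}(\mathbb{R})$. Conversely, let $A$ be an open set in $\mathbb{R}$. Then, as is well known, $A$ is the countable union of some family of open intervals(2). Since any open interval $(a, b)$ can be represented as
\[ (a, b) = \bigcup_{n=n_0}^{\infty} \Big[ a + \frac{1}{n} , b \Big) , \]
where $\frac{1}{n_0} < b - a$, we conclude that $A \in \sigma(\mathcal{I})$. Thus, $\mathcal{B}(\mathbb{R}) \subset \sigma(\mathcal{I})$.
(2) Indeed, each point $x \in A$ belongs to an open interval $(p_x, q_x) \subset A$ with $p_x, q_x \in \mathbb{Q}$. Hence, $A$ is contained in the union of the family $\{ (p, q) \mid p, q \in \mathbb{Q}, \ (p, q) \subset A \}$, and this family is countable.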
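Exercise 1.10 Let $\mathcal{E}$ be a σ-algebra in $X$, and $X_0 \subset X$.
1. Show that $\mathcal{E}_0 = \{ A \cap X_0 \mid A \in \mathcal{E} \}$ is a σ-algebra in $X_0$.
2. Show that, if $\mathcal{E} = \sigma(\mathcal{K})$, then $\mathcal{E}_0 = \sigma(\mathcal{K}_0)$, where
\[ \mathcal{K}_0 = \{ A \cap X_0 \mid A \in \mathcal{K} \} . \]
Hint: $\mathcal{E}_0 \supset \sigma(\mathcal{K}_0)$ follows from point 1. To prove the converse, show that
\[ \mathcal{F} := \{ A \in \mathcal{P}(X) \mid A \cap X_0 \in \sigma(\mathcal{K}_0) \} \]
is a σ-algebra in $X$ including $\mathcal{K}$.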
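1.2 Measures
1.2.1 Additive and σ-additive functions
Let $\mathcal{A} \subset \mathcal{P}(X)$ be an algebra.
Definition 1.11 Let $\mu : \mathcal{A} \to [0, +\infty]$ be such that $\mu(\emptyset) = 0$.
We say that $\mu$ is additive if, for any family $A_1, \dots, A_n \in \mathcal{A}$ of mutually disjoint sets, we have
\[ \mu\Big( \bigcup_{k=1}^{n} A_k \Big) = \sum_{k=1}^{n} \mu(A_k) . \]
We say that $\mu$ is σ-additive if, for any sequence $(A_n) \subset \mathcal{A}$ of mutually disjoint sets such that $\bigcup_{k=1}^{\infty} A_k \in \mathcal{A}$, we have
\[ \mu\Big( \bigcup_{k=1}^{\infty} A_k \Big) = \sum_{k=1}^{\infty} \mu(A_k) . \]
Remark 1.12 Let $\mathcal{A} \subset \mathcal{P}(X)$ be an algebra.
1. Any σ-additive function on $\mathcal{A}$ is also additive.
2. If $\mu$ is additive, $A, B \in \mathcal{A}$, and $A \supset B$, then $\mu(A) = \mu(B) + \mu(A \setminus B)$. Therefore, $\mu(A) \ge \mu(B)$.
3. Let $\mu$ be additive on $\mathcal{A}$, and let $(A_n) \subset \mathcal{A}$ be mutually disjoint sets such that $\bigcup_{k=1}^{\infty} A_k \in \mathcal{A}$. Then,
\[ \mu\Big( \bigcup_{k=1}^{\infty} A_k \Big) \ge \sum_{k=1}^{n} \mu(A_k) \quad \text{for all } n \in \mathbb{N} . \]
Therefore,
\[ \mu\Big( \bigcup_{k=1}^{\infty} A_k \Big) \ge \sum_{k=1}^{\infty} \mu(A_k) . \]
4. Any σ-additive function $\mu$ on $\mathcal{A}$ is also countably subadditive, that is, for any sequence $(A_n) \subset \mathcal{A}$ such that $\bigcup_{k=1}^{\infty} A_k \in \mathcal{A}$,
\[ \mu\Big( \bigcup_{k=1}^{\infty} A_k \Big) \le \sum_{k=1}^{\infty} \mu(A_k) . \]
5. In view of points 3 and 4, an additive function is σ-additive if and only if it is countably subadditive.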
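Definition 1.13 A σ-additive function $\mu$ on an algebra $\mathcal{A} \subset \mathcal{P}(X)$ is said to be:
• finite if $\mu(X) < \infty$;
• σ-finite if there exists a sequence $(A_n) \subset \mathcal{A}$ such that $\bigcup_{n=1}^{\infty} A_n = X$ and $\mu(A_n) < \infty$ for all $n \in \mathbb{N}$.
Example 1.14 In $X = \mathbb{N}$, consider the algebra
\[ \mathcal{E} = \{ A \in \mathcal{P}(X) \mid A \text{ is finite, or } A^c \text{ is finite} \} \]
of Example 1.5. The function $\mu : \mathcal{E} \to [0, \infty]$ defined as
\[ \mu(A) = \begin{cases} \#(A) & \text{if } A \text{ is finite} \\ \infty & \text{if } A^c \text{ is finite} \end{cases} \]
(where $\#(A)$ stands for the number of elements of $A$) is σ-additive (Exercise). On the other hand, the function $\nu : \mathcal{E} \to [0, \infty]$ defined as
\[ \nu(A) = \begin{cases} \displaystyle\sum_{n \in A} \frac{1}{2^n} & \text{if } A \text{ is finite} \\ \infty & \text{if } A^c \text{ is finite} \end{cases} \]
is additive but not σ-additive (Exercise).
For an additive function $\mu$, σ-additivity is equivalent to continuity in the sense of the following proposition.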
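Proposition 1.15 Let $\mu$ be additive on $\mathcal{A}$. Then (i) $\iff$ (ii), where:
(i) $\mu$ is σ-additive;
(ii) $(A_n)$ and $A \subset \mathcal{A}$, $A_n \uparrow A \implies \mu(A_n) \uparrow \mu(A)$.
Proof. (i)$\Rightarrow$(ii) Let $(A_n), A \subset \mathcal{A}$, $A_n \uparrow A$. Then,
\[ A = A_1 \cup \bigcup_{n=1}^{\infty} (A_{n+1} \setminus A_n) , \]
the above being disjoint unions. Since $\mu$ is σ-additive, we deduce that
\[ \mu(A) = \mu(A_1) + \sum_{n=1}^{\infty} \big( \mu(A_{n+1}) - \mu(A_n) \big) = \lim_{n\to\infty} \mu(A_n) , \]
and (ii) follows.
(ii)$\Rightarrow$(i) Let $(A_n) \subset \mathcal{A}$ be a sequence of mutually disjoint sets such that $A = \bigcup_{k=1}^{\infty} A_k \in \mathcal{A}$. Define
\[ B_n = \bigcup_{k=1}^{n} A_k . \]
Then, $B_n \uparrow A$. So, in view of (ii), $\mu(B_n) = \sum_{k=1}^{n} \mu(A_k) \uparrow \mu(A)$. This implies (i).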
Proposition 1.16 Let $\mu$ be σ-additive on $\mathcal{A}$. If $\mu(A_1) < \infty$ and $A_n \downarrow A$ with $A \in \mathcal{A}$, then $\mu(A_n) \downarrow \mu(A)$.
Proof. We have
\[ A_1 = \bigcup_{k=1}^{\infty} (A_k \setminus A_{k+1}) \cup A , \]
the above being disjoint unions. Consequently,
\[ \mu(A_1) = \sum_{k=1}^{\infty} \big( \mu(A_k) - \mu(A_{k+1}) \big) + \mu(A) = \mu(A_1) - \lim_{n\to+\infty} \mu(A_n) + \mu(A) . \]
Since $\mu(A_1) < +\infty$, the conclusion follows.
Example 1.17 The conclusion of Proposition 1.16 above may be false without assuming $\mu(A_1) < \infty$. This is easily checked taking $\mathcal{E}$ and $\mu$ as in Example 1.14, and $A_n = \{ m \in \mathbb{N} \mid m \ge n \}$.
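Corollary 1.18 Let $\mu$ be a finite σ-additive function on a σ-algebra $\mathcal{E}$. Then, for any sequence $(A_n)$ of elements of $\mathcal{E}$, we have
\[ \mu\Big( \liminf_{n\to\infty} A_n \Big) \le \liminf_{n\to\infty} \mu(A_n) \le \limsup_{n\to\infty} \mu(A_n) \le \mu\Big( \limsup_{n\to\infty} A_n \Big) . \tag{1.2} \]
In particular, $A_n \to A \implies \mu(A_n) \to \mu(A)$.
Proof. Set $L = \limsup_{n\to\infty} A_n$. Then we can write $L = \bigcap_{n=1}^{\infty} B_n$, where $B_n = \bigcup_{m=n}^{\infty} A_m \downarrow L$. Now, by Proposition 1.16 it follows that
\[ \mu(L) = \lim_{n\to\infty} \mu(B_n) = \inf_{n\in\mathbb{N}} \mu(B_n) \ge \inf_{n\in\mathbb{N}} \sup_{m\ge n} \mu(A_m) = \limsup_{n\to\infty} \mu(A_n) . \]
Thus, we have proved that
\[ \limsup_{n\to\infty} \mu(A_n) \le \mu\Big( \limsup_{n\to\infty} A_n \Big) . \]
The remaining part of (1.2) can be proved similarly.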
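1.2.2 Borel-Cantelli Lemma
The following result is very useful, as we shall see later.
Lemma 1.19 Let $\mu$ be a finite σ-additive function on a σ-algebra $\mathcal{E}$. Then, for any sequence $(A_n)$ of elements of $\mathcal{E}$ satisfying $\sum_{n=1}^{\infty} \mu(A_n) < +\infty$, we have
\[ \mu\Big( \limsup_{n\to\infty} A_n \Big) = 0 . \]
Proof. Set $L = \limsup_{n\to\infty} A_n$. Then, $L = \bigcap_{n=1}^{\infty} B_n$, where $B_n = \bigcup_{m=n}^{\infty} A_m$ decreases to $L$. Consequently,
\[ \mu(L) \le \mu(B_n) \le \sum_{m=n}^{\infty} \mu(A_m) \]
for all $n \in \mathbb{N}$. The right-hand side is the tail of a convergent series, so letting $n \to \infty$ we obtain $\mu(L) = 0$.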
1.2.3 Measure spaces
Definition 1.20 Let $\mathcal{E}$ be a σ-algebra of subsets of $X$.
• We say that the pair $(X, \mathcal{E})$ is a measurable space.
• A σ-additive function $\mu : \mathcal{E} \to [0, +\infty]$ is called a measure on $(X, \mathcal{E})$.
• The triple $(X, \mathcal{E}, \mu)$, where $\mu$ is a measure on a measurable space $(X, \mathcal{E})$, is called a measure space.
• A measure $\mu$ is called a probability measure if $\mu(X) = 1$.
• A measure $\mu$ is said to be complete if
\[ A \in \mathcal{E} , \ B \subset A , \ \mu(A) = 0 \implies B \in \mathcal{E} \]
(and so $\mu(B) = 0$).
• A measure $\mu$ is said to be concentrated on a set $A \in \mathcal{E}$ if $\mu(A^c) = 0$. In this case we say that $A$ is a support of $\mu$.
Example 1.21 Let $X$ be a nonempty set and $x \in X$. Define, for every $A \in \mathcal{P}(X)$,
\[ \delta_x(A) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases} \]
Then, $\delta_x$ is a measure in $X$, called the Dirac measure in $x$. Such a measure is concentrated on the singleton $\{x\}$.
Example 1.22 In a set $X$ let us define, for every $A \in \mathcal{P}(X)$,
\[ \mu(A) = \begin{cases} \#(A) & \text{if } A \text{ is finite} \\ \infty & \text{if } A \text{ is infinite} \end{cases} \]
(see Example 1.14). Then, $\mu$ is a measure in $X$, called the counting measure on $X$. It is easy to see that $\mu$ is finite if and only if $X$ is finite, and that $\mu$ is σ-finite if and only if $X$ is countable.
Let $(X, \mathcal{E}, \mu)$ be a measure space and let $A \in \mathcal{E}$.
Definition 1.23 The restriction of $\mu$ to $A$ (or $\mu$ restricted to $A$), written $\mu \llcorner A$, is the set function
\[ (\mu \llcorner A)(B) = \mu(A \cap B) \quad \forall B \in \mathcal{E} . \tag{1.3} \]
Exercise 1.24 Show that $\mu \llcorner A$ is a measure on $(X, \mathcal{E})$.
1.3 The basic extension theorem
A natural question arising both in theory and applications is the following.
Problem 1.25 Let $\mathcal{A}$ be an algebra in $X$, and $\mu$ be an additive function in $\mathcal{A}$. Does there exist a σ-algebra $\mathcal{E}$ including $\mathcal{A}$, and a measure $\bar\mu$ on $(X, \mathcal{E})$ that extends $\mu$, i.e.,
\[ \bar\mu(A) = \mu(A) \quad \forall A \in \mathcal{A} \, ? \tag{1.4} \]
Should the above problem have a solution, one could assume $\mathcal{E} = \sigma(\mathcal{A})$ since $\sigma(\mathcal{A})$ would be included in $\mathcal{E}$ anyway. Moreover, for any sequence $(A_n) \subset \mathcal{A}$ of mutually disjoint sets such that $\bigcup_{k=1}^{\infty} A_k \in \mathcal{A}$, we would have
\[ \mu\Big( \bigcup_{k=1}^{\infty} A_k \Big) = \bar\mu\Big( \bigcup_{k=1}^{\infty} A_k \Big) = \sum_{k=1}^{\infty} \bar\mu(A_k) = \sum_{k=1}^{\infty} \mu(A_k) . \]
Thus, for Problem 1.25 to have a positive answer, $\mu$ must be σ-additive. The following remarkable result shows that such a property is also sufficient for the existence of an extension, and more. We shall see an important application of this result to the construction of the Lebesgue measure later on in this chapter.
Theorem 1.26 Let $\mathcal{A}$ be an algebra, and $\mu : \mathcal{A} \to [0, +\infty]$ be σ-additive. Then, $\mu$ can be extended to a measure on $\sigma(\mathcal{A})$. Moreover, such an extension is unique if $\mu$ is σ-finite.
To prove the above theorem we need to develop suitable tools, namely Halmos' Monotone Class Theorem for uniqueness, and the concepts of outer measure and additive set for existence. This is what we shall do in the next sections.
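1.3.1 Monotone classes
Definition 1.27 A nonempty class $\mathcal{M} \subset \mathcal{P}(X)$ is called a monotone class in $X$ if, for any sequence $(A_n) \subset \mathcal{M}$,
• $A_n \uparrow A \implies A \in \mathcal{M}$
• $A_n \downarrow A \implies A \in \mathcal{M}$
Remark 1.28 Clearly, any σ-algebra in $X$ is a monotone class. Conversely, if a monotone class $\mathcal{M}$ in $X$ is also an algebra, then $\mathcal{M}$ is a σ-algebra (Exercise).
Let us prove now the following result.
Theorem 1.29 (Halmos) Let $\mathcal{A}$ be an algebra, and $\mathcal{M}$ be a monotone class in $X$ including $\mathcal{A}$. Then, $\sigma(\mathcal{A}) \subset \mathcal{M}$.
Proof. Let $\mathcal{M}_0$ be the minimal monotone class including $\mathcal{A}$(3). We are going to show that $\mathcal{M}_0$ is an algebra, and this will prove the theorem in view of Remark 1.28. To begin with, we note that $\emptyset$ and $X$ belong to $\mathcal{M}_0$.
(3) Exercise: show that the intersection of all monotone classes including $\mathcal{A}$ is also a monotone class in $X$.
Now, define, for any $A \in \mathcal{M}_0$,
\[ \mathcal{M}_A = \{ B \in \mathcal{M}_0 \mid A \cup B , \ A \setminus B , \ B \setminus A \in \mathcal{M}_0 \} . \]
We claim that $\mathcal{M}_A$ is a monotone class. For let $(B_n) \subset \mathcal{M}_A$ be an increasing sequence such that $B_n \uparrow B$. Then,
\[ A \cup B_n \uparrow A \cup B , \qquad A \setminus B_n \downarrow A \setminus B , \qquad B_n \setminus A \uparrow B \setminus A . \]
Since $\mathcal{M}_0$ is a monotone class, we conclude that
\[ B , \ A \cup B , \ A \setminus B , \ B \setminus A \in \mathcal{M}_0 . \]
Therefore, $B \in \mathcal{M}_A$. By a similar reasoning, one can check that
\[ (B_n) \subset \mathcal{M}_A , \ B_n \downarrow B \implies B \in \mathcal{M}_A . \]
So, $\mathcal{M}_A$ is a monotone class as claimed.
Next, let $A \in \mathcal{A}$. Then, $\mathcal{A} \subset \mathcal{M}_A$ since any $B \in \mathcal{A}$ belongs to $\mathcal{M}_0$ and satisfies
\[ A \cup B , \ A \setminus B , \ B \setminus A \in \mathcal{M}_0 . \tag{1.5} \]
But $\mathcal{M}_0$ is the minimal monotone class including $\mathcal{A}$, so $\mathcal{M}_0 \subset \mathcal{M}_A$. Therefore, $\mathcal{M}_0 = \mathcal{M}_A$ or, equivalently, (1.5) holds true for any $A \in \mathcal{A}$ and $B \in \mathcal{M}_0$.
Finally, let $A \in \mathcal{M}_0$. Since (1.5) is satisfied by any $B \in \mathcal{A}$, we deduce that $\mathcal{A} \subset \mathcal{M}_A$. Then, $\mathcal{M}_A = \mathcal{M}_0$. This implies that $\mathcal{M}_0$ is an algebra.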
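Proof of Theorem 1.26: uniqueness. Let $\mathcal{E} = \sigma(\mathcal{A})$, and let $\mu_1, \mu_2$ be two measures extending $\mu$ to $\mathcal{E}$. We shall assume, first, that $\mu$ is finite, and set
\[ \mathcal{M} = \{ A \in \mathcal{E} \mid \mu_1(A) = \mu_2(A) \} . \]
We claim that $\mathcal{M}$ is a monotone class including $\mathcal{A}$. Indeed, for any sequence $(A_n) \subset \mathcal{M}$, using Propositions 1.15 and 1.16 we have that
\[ A_n \uparrow A \implies \mu_1(A) = \lim_{n\to\infty} \mu_i(A_n) = \mu_2(A) \quad (i = 1, 2) \]
\[ A_n \downarrow A , \ \mu_1(X), \mu_2(X) < \infty \implies \mu_1(A) = \lim_{n\to\infty} \mu_i(A_n) = \mu_2(A) \quad (i = 1, 2) \]
Therefore, in view of Halmos' Theorem, $\mathcal{M} = \mathcal{E}$, and this implies that $\mu_1 = \mu_2$.
In the general case of a σ-finite function $\mu$, we have that $X = \bigcup_{k=1}^{\infty} X_k$ for some $(X_k) \subset \mathcal{A}$ such that $\mu(X_k) < \infty$ for all $k \in \mathbb{N}$. It is not restrictive to assume that the sequence $(X_k)$ is increasing. Now, define $\mu_k(A) = \mu(A \cap X_k)$ for all $A \in \mathcal{A}$, and
\[ \mu_{1,k}(A) = \mu_1(A \cap X_k) , \qquad \mu_{2,k}(A) = \mu_2(A \cap X_k) \qquad \forall A \in \mathcal{E} . \]
Then, as is easily checked, $\mu_k$ is a finite σ-additive function on $\mathcal{A}$, and $\mu_{1,k}, \mu_{2,k}$ are measures extending $\mu_k$ to $\mathcal{E}$. So, by the conclusion of the first part of this proof, $\mu_{1,k}(A) = \mu_{2,k}(A)$ for all $A \in \mathcal{E}$ and any $k \in \mathbb{N}$. Therefore, since $A \cap X_k \uparrow A$, using again Proposition 1.15 we obtain
\[ \mu_1(A) = \lim_{k\to\infty} \mu_{1,k}(A) = \lim_{k\to\infty} \mu_{2,k}(A) = \mu_2(A) \quad \forall A \in \mathcal{E} . \]
The proof is thus complete.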
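1.3.2 Outer measures
Definition 1.30 A function $\mu^* : \mathcal{P}(X) \to [0, \infty]$ is called an outer measure in $X$ if $\mu^*(\emptyset) = 0$, and if $\mu^*$ is monotone and countably subadditive, i.e.,
\[ A \subset B \implies \mu^*(A) \le \mu^*(B) \]
\[ \mu^*\Big( \bigcup_{i=1}^{\infty} E_i \Big) \le \sum_{i=1}^{\infty} \mu^*(E_i) \quad \forall (E_i) \subset \mathcal{P}(X) . \]
The following proposition studies an example of outer measure that will be essential for the proof of Theorem 1.26.
Proposition 1.31 Let $\mu$ be a σ-additive function on an algebra $\mathcal{A}$. Define, for any $E \in \mathcal{P}(X)$,
\[ \mu^*(E) = \inf \Big\{ \sum_{i=1}^{\infty} \mu(A_i) \ \Big| \ A_i \in \mathcal{A} , \ E \subset \bigcup_{i=1}^{\infty} A_i \Big\} . \tag{1.6} \]
Then,
1. $\mu^*$ is finite whenever $\mu$ is finite;
2. $\mu^*$ is an extension of $\mu$, that is,
\[ \mu^*(A) = \mu(A) \quad \forall A \in \mathcal{A} ; \tag{1.7} \]
3. $\mu^*$ is an outer measure.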
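Proof. The first assertion being obvious, let us proceed to check (1.7). Observe that the inequality $\mu^*(A) \le \mu(A)$ for any $A \in \mathcal{A}$ is trivial. To prove the converse inequality, let $A_i \in \mathcal{A}$ be a countable covering of a set $A \in \mathcal{A}$. Then, $A_i \cap A \in \mathcal{A}$ is also a countable covering of $A$ satisfying $\bigcup_i (A_i \cap A) \in \mathcal{A}$. Since $\mu$ is countably subadditive (see point 4 in Remark 1.12),
\[ \mu(A) \le \sum_{i=1}^{\infty} \mu(A_i \cap A) \le \sum_{i=1}^{\infty} \mu(A_i) . \]
Thus, taking the infimum as in (1.6), we conclude that $\mu^*(A) \ge \mu(A)$.
Finally, we show that $\mu^*$ is countably subadditive. Let $(E_i) \subset \mathcal{P}(X)$, and set $E = \bigcup_{i=1}^{\infty} E_i$. Assume, without loss of generality, that all the $\mu^*(E_i)$ are finite (otherwise the assertion is trivial). Then, for any $i \in \mathbb{N}$ and any $\varepsilon > 0$ there exists $(A_{i,j}) \subset \mathcal{A}$ such that
\[ \sum_{j=1}^{\infty} \mu(A_{i,j}) < \mu^*(E_i) + \frac{\varepsilon}{2^i} , \qquad E_i \subset \bigcup_{j=1}^{\infty} A_{i,j} . \]
Consequently,
\[ \sum_{i,j=1}^{\infty} \mu(A_{i,j}) \le \sum_{i=1}^{\infty} \mu^*(E_i) + \varepsilon . \]
Since $E \subset \bigcup_{i,j=1}^{\infty} A_{i,j}$ we have that $\mu^*(E) \le \sum_{i=1}^{\infty} \mu^*(E_i) + \varepsilon$ for any $\varepsilon > 0$. The conclusion follows.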
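Exercise 1.32 1. Let $\mu^*$ be an outer measure in $X$, and $A \in \mathcal{P}(X)$. Show that
\[ \nu^*(B) = \mu^*(A \cap B) \quad \forall B \in \mathcal{P}(X) \]
is an outer measure in $X$.
2. Let $\mu_n^*$ be outer measures in $X$ for all $n \in \mathbb{N}$. Show that
\[ \mu^*(A) = \sum_{n} \mu_n^*(A) \quad \text{and} \quad \mu_\infty^*(A) = \sup_{n} \mu_n^*(A) \qquad \forall A \in \mathcal{P}(X) \]
are outer measures in $X$.
Given an outer measure $\mu^*$ in $X$, we now proceed to define additive sets.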
Definition 1.33 A subset $A \in \mathcal{P}(X)$ is called additive (or $\mu^*$-measurable) if
\[ \mu^*(E) = \mu^*(E \cap A) + \mu^*(E \cap A^c) \quad \forall E \in \mathcal{P}(X) . \tag{1.8} \]
We denote by $\mathcal{G}$ the family of all additive sets.
Notice that, since $\mu^*$ is countably subadditive, (1.8) is equivalent to
\[ \mu^*(E) \ge \mu^*(E \cap A) + \mu^*(E \cap A^c) \quad \forall E \in \mathcal{P}(X) . \tag{1.9} \]
Also, observe that $A^c \in \mathcal{G}$ for all $A \in \mathcal{G}$. Other important properties of $\mathcal{G}$ are listed in the next theorem.
Theorem 1.34 (Carathéodory) Let $\mu^*$ be an outer measure in $X$. Then, $\mathcal{G}$ is a σ-algebra, and $\mu^*$ is σ-additive on $\mathcal{G}$.
Before proving Carathéodory's Theorem, let us use it to complete the proof of Theorem 1.26.
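Proof of Theorem 1.26: existence. Given a σ-additive function $\mu$ on an algebra $\mathcal{A}$, define the outer measure $\mu^*$ as in Proposition 1.31. Then, as noted above, $\mu^*(A) = \mu(A)$ for any $A \in \mathcal{A}$. Moreover, in light of Theorem 1.34, $\mu^*$ is a measure on the σ-algebra $\mathcal{G}$ of additive sets. So, the proof will be complete if we show that $\mathcal{A} \subset \mathcal{G}$. Indeed, in this case, $\sigma(\mathcal{A})$ turns out to be included in $\mathcal{G}$, and it suffices to take the restriction of $\mu^*$ to $\sigma(\mathcal{A})$ to obtain the required extension.
Now, let $A \in \mathcal{A}$ and $E \in \mathcal{P}(X)$. Assume $\mu^*(E) < \infty$ (otherwise (1.9) trivially holds), and fix $\varepsilon > 0$. Then, there exists $(A_i) \subset \mathcal{A}$ such that $E \subset \bigcup_i A_i$, and
\[ \mu^*(E) + \varepsilon > \sum_{i=1}^{\infty} \mu(A_i) = \sum_{i=1}^{\infty} \mu(A_i \cap A) + \sum_{i=1}^{\infty} \mu(A_i \cap A^c) \ge \mu^*(E \cap A) + \mu^*(E \cap A^c) . \]
Since $\varepsilon$ is arbitrary, $\mu^*(E) \ge \mu^*(E \cap A) + \mu^*(E \cap A^c)$. Thus $A \in \mathcal{G}$.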
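We now proceed with the proof of Carathéodory's Theorem.
Proof of Theorem 1.34. We will split the reasoning into four steps.
1. $\mathcal{G}$ is an algebra. We note that $\emptyset$ and $X$ belong to $\mathcal{G}$. We already know that $A \in \mathcal{G}$ implies $A^c \in \mathcal{G}$. Let us now prove that, if $A, B \in \mathcal{G}$, then $A \cup B \in \mathcal{G}$. For any $E \in \mathcal{P}(X)$, we have
\[ \begin{aligned} \mu^*(E) &= \mu^*(E \cap A) + \mu^*(E \cap A^c) \\ &= \mu^*(E \cap A) + \mu^*(E \cap A^c \cap B) + \mu^*(E \cap A^c \cap B^c) \\ &= \big[ \mu^*(E \cap A) + \mu^*(E \cap A^c \cap B) \big] + \mu^*(E \cap (A \cup B)^c) . \end{aligned} \tag{1.10} \]
Since
\[ (E \cap A) \cup (E \cap A^c \cap B) = E \cap (A \cup B) , \]
the subadditivity of $\mu^*$ implies that
\[ \mu^*(E \cap A) + \mu^*(E \cap A^c \cap B) \ge \mu^*(E \cap (A \cup B)) . \]
So, by (1.10),
\[ \mu^*(E) \ge \mu^*(E \cap (A \cup B)) + \mu^*(E \cap (A \cup B)^c) , \]
and $A \cup B \in \mathcal{G}$ as required.
2. $\mu^*$ is additive on $\mathcal{G}$. Let us prove that, if $A, B \in \mathcal{G}$ and $A \cap B = \emptyset$, then
\[ \mu^*(E \cap (A \cup B)) = \mu^*(E \cap A) + \mu^*(E \cap B) . \tag{1.11} \]
Indeed, replacing $E$ with $E \cap (A \cup B)$ in (1.8) yields
\[ \mu^*(E \cap (A \cup B)) = \mu^*(E \cap (A \cup B) \cap A) + \mu^*(E \cap (A \cup B) \cap A^c) , \]
which is equivalent to (1.11) since $A \cap B = \emptyset$. In particular, taking $E = X$, it follows that $\mu^*$ is additive on $\mathcal{G}$.
3. $\mathcal{G}$ is a σ-algebra. Let $(A_n) \subset \mathcal{G}$. We will show that $S := \bigcup_{i=1}^{\infty} A_i \in \mathcal{G}$. Since $\mathcal{G}$ is an algebra, it is not restrictive to assume that all the sets in $(A_n)$ are mutually disjoint. Set $S_n := \bigcup_{i=1}^{n} A_i$, $n \in \mathbb{N}$. For any $E \in \mathcal{P}(X)$ we have, by the countable subadditivity of $\mu^*$,
\[ \begin{aligned} \mu^*(E \cap S) + \mu^*(E \cap S^c) &\le \sum_{i=1}^{\infty} \mu^*(E \cap A_i) + \mu^*(E \cap S^c) \\ &= \lim_{n\to\infty} \Big[ \sum_{i=1}^{n} \mu^*(E \cap A_i) + \mu^*(E \cap S^c) \Big] \\ &= \lim_{n\to\infty} \Big[ \mu^*(E \cap S_n) + \mu^*(E \cap S^c) \Big] \end{aligned} \]
in view of (1.11). Since $S^c \subset S_n^c$, it follows that
\[ \mu^*(E \cap S) + \mu^*(E \cap S^c) \le \limsup_{n\to\infty} \Big[ \mu^*(E \cap S_n) + \mu^*(E \cap S_n^c) \Big] = \mu^*(E) . \]
So, $S \in \mathcal{G}$, and $\mathcal{G}$ is a σ-algebra.
4. $\mu^*$ is σ-additive on $\mathcal{G}$. Since $\mu^*$ is countably subadditive, and additive by Step 2, point 5 in Remark 1.12 gives the conclusion.
The proof is now complete.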
Remark 1.35 The σ-algebra $\mathcal{G}$ of additive sets is complete, that is, it contains all the sets with outer measure 0. Indeed, for any $M \subset X$ with $\mu^*(M) = 0$, and any $E \in \mathcal{P}(X)$, we have
\[ \mu^*(E \cap M) + \mu^*(E \cap M^c) = \mu^*(E \cap M^c) \le \mu^*(E) . \]
Thus, $M \in \mathcal{G}$.
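The outer measure (1.6) and the additivity condition (1.8) can be checked by brute force on a toy example. The Python sketch below is an illustration and not part of the notes. It assumes a finite set $X$ and a finite algebra, in which case the infimum in (1.6) reduces to the minimum of $\mu(A)$ over the sets $A$ of the algebra containing $E$ (a countable cover can be replaced by its union, which lies in the algebra and has no larger measure). The names `outer_measure`, `additive_sets` and the two sample measures are hypothetical.

```python
from itertools import combinations


def powerset(X):
    """All subsets of X as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def outer_measure(E, mu):
    """mu maps every set of a finite algebra (as a frozenset) to its measure.
    For a finite algebra, (1.6) reduces to min{mu(A) : A in algebra, E subset of A}."""
    return min(value for A, value in mu.items() if E <= A)


def additive_sets(X, mu):
    """Sets A satisfying (1.8) for every test set E in P(X)."""
    subsets = powerset(X)
    return {A for A in subsets
            if all(outer_measure(E, mu)
                   == outer_measure(E & A, mu) + outer_measure(E - A, mu)
                   for E in subsets)}


X = {1, 2, 3}
# Algebra with atoms {1} and {2, 3}; integer values avoid rounding issues.
mu = {frozenset(): 0, frozenset({1}): 1, frozenset({2, 3}): 2, frozenset(X): 3}
G = additive_sets(X, mu)

# Same algebra, but now the atom {2, 3} is a null set.
mu_null = {frozenset(): 0, frozenset({1}): 1, frozenset({2, 3}): 0, frozenset(X): 1}
G_null = additive_sets(X, mu_null)

print(len(G), len(G_null))
```

In the first case the additive sets are exactly the four sets of the algebra, since {2} and {3} each have outer measure 2 and fail (1.8) with $E = \{2, 3\}$. In the second case every subset of the null set {2, 3} is additive, in line with Remark 1.35, and all eight subsets of $X$ are additive.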
Remark 1.36 In our proof of Theorem 1.26 we have constructed the σ-algebra $\mathcal{G}$ of additive sets such that
\[ \sigma(\mathcal{A}) \subset \mathcal{G} \subset \mathcal{P}(X) . \tag{1.12} \]
We shall see later on that the above inclusions are both strict, in general.
1.4 Borel measures in $\mathbb{R}^N$
Let $(X, d)$ be a metric space. We recall that $\mathcal{B}(X)$ denotes the Borel σ-algebra in $X$.
Definition 1.37 A measure $\mu$ on the measurable space $(X, \mathcal{B}(X))$ is called a Borel measure. A Borel measure $\mu$ is called a Radon measure if $\mu(K) < \infty$ for every compact set $K \subset X$.
In this section we will study specific properties of Borel measures on $(\mathbb{R}^N, \mathcal{B}(\mathbb{R}^N))$. We begin by introducing Lebesgue measure on the unit interval.
1.4.1 Lebesgue measure in [0, 1)
Let $\mathcal{I}$ be the class of all semiclosed intervals $[a, b) \subset [0, 1)$, and $\mathcal{A}_0$ be the algebra of all finite disjoint unions of elements of $\mathcal{I}$ (see Example 1.5.2). Then, $\sigma(\mathcal{I}) = \sigma(\mathcal{A}_0) = \mathcal{B}([0, 1))$.
On $\mathcal{I}$, consider the additive set function
\[ \lambda([a, b)) = b - a , \qquad 0 \le a \le b \le 1 . \tag{1.13} \]
If $a = b$ then $[a, b)$ reduces to the empty set, and we have $\lambda([a, b)) = 0$.
Exercise 1.38 Let $[a, b) \in \mathcal{I}$ be contained in $[a_1, b_1) \cup \dots \cup [a_n, b_n)$, with $[a_i, b_i) \in \mathcal{I}$. Prove that
\[ b - a \le \sum_{i=1}^{n} (b_i - a_i) . \]
Proposition 1.39 The set function $\lambda$ defined in (1.13) is σ-additive on $\mathcal{I}$.
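Proof. Let $(I_i)$ be a disjoint sequence of sets in $\mathcal{I}$, with $I_i = [a_i, b_i)$, and suppose $I = [a_0, b_0) = \bigcup_i I_i \in \mathcal{I}$. Then, for any $n \in \mathbb{N}$, we have (relabeling $I_1, \dots, I_n$, if necessary, so that $b_i \le a_{i+1}$)
\[ \sum_{i=1}^{n} \lambda(I_i) = \sum_{i=1}^{n} (b_i - a_i) \le b_n - a_1 \le b_0 - a_0 = \lambda(I) . \]
Therefore,
\[ \sum_{i=1}^{\infty} \lambda(I_i) \le \lambda(I) . \]
To prove the converse inequality, suppose $a_0 < b_0$. For any $0 < \delta < b_0 - a_0$ and $\varepsilon > 0$, we have(4)
\[ [a_0, b_0 - \delta] \subset \bigcup_{i=1}^{\infty} \big( (a_i - \varepsilon 2^{-i}) \vee 0 , \ b_i \big) . \]
Then, the Heine-Borel Theorem implies that, for some $i_0 \in \mathbb{N}$,
\[ [a_0, b_0 - \delta) \subset [a_0, b_0 - \delta] \subset \bigcup_{i=1}^{i_0} \big( (a_i - \varepsilon 2^{-i}) \vee 0 , \ b_i \big) . \]
(4) If $a, b \in \mathbb{R}$ we set $\min\{a, b\} = a \wedge b$ and $\max\{a, b\} = a \vee b$.
Consequently, in view of Exercise 1.38,
\[ \lambda(I) - \delta = (b_0 - a_0) - \delta \le \sum_{i=1}^{i_0} \big( b_i - a_i + \varepsilon 2^{-i} \big) \le \sum_{i=1}^{\infty} \lambda(I_i) + \varepsilon . \]
Since $\varepsilon$ and $\delta$ are arbitrary, we obtain
\[ \lambda(I) \le \sum_{i=1}^{\infty} \lambda(I_i) . \]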
We now proceed to extend $\lambda$ to $\mathcal{A}_0$. For any set $A \in \mathcal{A}_0$ such that $A = \bigcup_i I_i$, where $I_1, \dots, I_n$ are disjoint sets in $\mathcal{I}$, let us define
\[ \lambda(A) := \sum_{i=1}^{n} \lambda(I_i) . \]
It is easy to see that the above definition is independent of the representation of $A$ as a finite disjoint union of elements of $\mathcal{I}$.
Exercise 1.40 Show that, if $J_1, \dots, J_m$ is another family of disjoint sets in $\mathcal{I}$ such that $A = \bigcup_j J_j$, then
\[ \sum_{i=1}^{n} \lambda(I_i) = \sum_{j=1}^{m} \lambda(J_j) . \]
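Theorem 1.41 $\lambda$ is σ-additive on $\mathcal{A}_0$.
Proof. Let $(A_n) \subset \mathcal{A}_0$ be a sequence of disjoint sets such that
\[ A := \bigcup_{n=1}^{\infty} A_n \in \mathcal{A}_0 . \]
Then
\[ A = \bigcup_{i=1}^{k} I_i , \qquad A_n = \bigcup_{j=1}^{k_n} I_{n,j} \quad (n \in \mathbb{N}) \]
for some disjoint families $I_1, \dots, I_k$ and $I_{n,1}, \dots, I_{n,k_n}$ in $\mathcal{I}$. Now, observe that, for any $i = 1, \dots, k$,
\[ I_i = I_i \cap A = \bigcup_{n=1}^{\infty} (I_i \cap A_n) = \bigcup_{n=1}^{\infty} \bigcup_{j=1}^{k_n} (I_i \cap I_{n,j}) \]
(with $I_i \cap I_{n,j} \in \mathcal{I}$), and apply Proposition 1.39 to obtain
\[ \lambda(I_i) = \sum_{n=1}^{\infty} \sum_{j=1}^{k_n} \lambda(I_i \cap I_{n,j}) = \sum_{n=1}^{\infty} \lambda(I_i \cap A_n) . \]
Hence,
\[ \lambda(A) = \sum_{i=1}^{k} \lambda(I_i) = \sum_{i=1}^{k} \sum_{n=1}^{\infty} \lambda(A_n \cap I_i) = \sum_{n=1}^{\infty} \sum_{i=1}^{k} \lambda(A_n \cap I_i) = \sum_{n=1}^{\infty} \lambda(A_n) . \]
Summing up, thanks to Theorem 1.26, we conclude that $\lambda$ can be uniquely extended to a measure on the σ-algebra $\mathcal{B}([0, 1))$. Such an extension is called Lebesgue measure.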
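1.4.2 Lebesgue measure in $\mathbb{R}$
We now turn to the construction of Lebesgue measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$. Usually, this is done by an intrinsic procedure, applying an extension result for σ-additive set functions on semirings. In these notes, we will follow a shortcut, based on the following simple observations.
Proceeding as in the above section, one can define Lebesgue measure on $\mathcal{B}([a, b))$ for any interval $[a, b) \subset \mathbb{R}$. Such a measure will be denoted by $\lambda_{[a,b)}$. Let us begin by characterizing the associated Borel sets as follows.
Proposition 1.42 A set $A$ belongs to $\mathcal{B}([a, b))$ if and only if $A = B \cap [a, b)$ for some $B \in \mathcal{B}(\mathbb{R})$.
Proof. Consider the class $\mathcal{E} := \{ A \in \mathcal{P}([a, b)) \mid \exists B \in \mathcal{B}(\mathbb{R}) : A = B \cap [a, b) \}$. Let us check that $\mathcal{E}$ is a σ-algebra in $[a, b)$.
1. By the definition of $\mathcal{E}$ we have that $\emptyset, [a, b) \in \mathcal{E}$.
2. Let $A \in \mathcal{E}$ and $B \in \mathcal{B}(\mathbb{R})$ be such that $A = B \cap [a, b)$. Then, $[a, b) \setminus B \in \mathcal{B}(\mathbb{R})$. So, $[a, b) \setminus A = [a, b) \cap ([a, b) \setminus B) \in \mathcal{E}$.
3. Let $(A_n) \subset \mathcal{E}$ and $(B_n) \subset \mathcal{B}(\mathbb{R})$ be such that $A_n = B_n \cap [a, b)$. Then, $\bigcup_n B_n \in \mathcal{B}(\mathbb{R})$. So, $\bigcup_n A_n = (\bigcup_n B_n) \cap [a, b) \in \mathcal{E}$.
Since $\mathcal{E}$ contains all open subsets of $[a, b)$, we conclude that $\mathcal{B}([a, b)) \subset \mathcal{E}$. This proves the "only if" part of the conclusion.
Next, to prove the "if" part, let $\mathcal{F} := \{ B \in \mathcal{P}(\mathbb{R}) \mid B \cap [a, b) \in \mathcal{B}([a, b)) \}$. Then, arguing as in the first part of the proof, we have that $\mathcal{F}$ is a σ-algebra in $\mathbb{R}$.
1. $\emptyset, \mathbb{R} \in \mathcal{F}$ by definition.
2. Let $B \in \mathcal{F}$. Since $B \cap [a, b) \in \mathcal{B}([a, b))$, we have that $B^c \cap [a, b) = [a, b) \setminus (B \cap [a, b)) \in \mathcal{B}([a, b))$. So, $B^c \in \mathcal{F}$.
3. Let $(B_n) \subset \mathcal{F}$. Then, $(\bigcup_n B_n) \cap [a, b) = \bigcup_n (B_n \cap [a, b)) \in \mathcal{B}([a, b))$. So, $\bigcup_n B_n \in \mathcal{F}$.
Since $\mathcal{F}$ contains all open subsets of $\mathbb{R}$, we conclude that $\mathcal{B}(\mathbb{R}) \subset \mathcal{F}$. The proof is thus complete.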
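Thus, for any pair of nested intervals $[a, b) \subset [c, d) \subset \mathbb{R}$, we have that $\mathcal{B}([a, b)) \subset \mathcal{B}([c, d))$. Moreover, a unique extension argument yields
\[ \lambda_{[a,b)}(E) = \lambda_{[c,d)}(E) \quad \forall E \in \mathcal{B}([a, b)) . \tag{1.14} \]
Now, since $\mathbb{R} = \bigcup_{k=1}^{\infty} [-k, k)$, it is natural to define Lebesgue measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ as
\[ \lambda(E) = \lim_{k\to\infty} \lambda_{[-k,k)}(E \cap [-k, k)) \quad \forall E \in \mathcal{B}(\mathbb{R}) . \tag{1.15} \]
Our next exercise is intended to show that the definition of $\lambda$ would be the same taking any other sequence of intervals invading $\mathbb{R}$.
Exercise 1.43 Let $(a_n)$ and $(b_n)$ be real sequences satisfying
\[ a_k < b_k , \qquad a_k \downarrow -\infty , \qquad b_k \uparrow \infty . \]
Show that
\[ \lambda(E) = \lim_{k\to\infty} \lambda_{[a_k,b_k)}(E \cap [a_k, b_k)) \quad \forall E \in \mathcal{B}(\mathbb{R}) . \]
In order to show that $\lambda$ is a measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$, we still have to check σ-additivity.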
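Proposition 1.44 The set function defined in (1.15) is σ-additive in $\mathcal{B}(\mathbb{R})$.
Proof. Let $(E_n) \subset \mathcal{B}(\mathbb{R})$ be a sequence of disjoint Borel sets satisfying $E := \bigcup_n E_n \in \mathcal{B}(\mathbb{R})$. Then, by the σ-additivity of $\lambda_{[-k,k)}$,
\[ \lambda(E) = \lim_{k\to\infty} \lambda_{[-k,k)}(E \cap [-k, k)) = \lim_{k\to\infty} \sum_{n=1}^{\infty} \lambda_{[-k,k)}(E_n \cap [-k, k)) . \]
Now, observe that, owing to (1.14),
\[ \lambda_{[-k,k)}(E_n \cap [-k, k)) = \lambda_{[-k-1,k+1)}(E_n \cap [-k, k)) \le \lambda_{[-k-1,k+1)}(E_n \cap [-k-1, k+1)) . \]
So, for any $n \in \mathbb{N}$, $k \mapsto \lambda_{[-k,k)}(E_n \cap [-k, k))$ is nondecreasing. The conclusion follows applying Lemma 1.45 below.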
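Lemma 1.45 Let $(a_{nk})_{n,k\in\mathbb{N}}$ be a sequence in $[0, \infty]$ such that, for any $n \in \mathbb{N}$,
\[ h \le k \implies a_{nh} \le a_{nk} . \tag{1.16} \]
Set, for any $n \in \mathbb{N}$,
\[ \lim_{k\to\infty} a_{nk} =: \alpha_n \in [0, \infty] . \tag{1.17} \]
Then,
\[ \lim_{k\to\infty} \sum_{n=1}^{\infty} a_{nk} = \sum_{n=1}^{\infty} \alpha_n . \]
Proof. Suppose, first, $\sum_n \alpha_n < \infty$, and fix $\varepsilon > 0$. Then, there exists $n_\varepsilon \in \mathbb{N}$ such that
\[ \sum_{n=n_\varepsilon+1}^{\infty} \alpha_n < \varepsilon . \]
Recalling (1.17), for $k$ sufficiently large, say $k \ge k_\varepsilon$, we have $\alpha_n - \frac{\varepsilon}{n_\varepsilon} < a_{nk}$ for $n = 1, \dots, n_\varepsilon$. Therefore,
\[ \sum_{n=1}^{\infty} a_{nk} \ge \sum_{n=1}^{n_\varepsilon} \alpha_n - \varepsilon > \sum_{n=1}^{\infty} \alpha_n - 2\varepsilon \]
for any $k \ge k_\varepsilon$. Since $\sum_n a_{nk} \le \sum_n \alpha_n$, the conclusion follows.
The analysis of the case $\sum_n \alpha_n = \infty$ is similar. Fix $M > 0$, and let $n_M \in \mathbb{N}$ be such that
\[ \sum_{n=1}^{n_M} \alpha_n > 2M . \]
For $k$ sufficiently large, say $k \ge k_M$, $\alpha_n - \frac{M}{n_M} < a_{nk}$ for $n = 1, \dots, n_M$. Therefore, for all $k \ge k_M$,
\[ \sum_{n=1}^{\infty} a_{nk} \ge \sum_{n=1}^{n_M} a_{nk} > \sum_{n=1}^{n_M} \alpha_n - M > M . \]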
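Example 1.46 The monotonicity assumption of the above lemma is essential. Indeed, the sequence
\[ a_{nk} = \delta_{nk} = \begin{cases} 1 & \text{if } n = k \\ 0 & \text{if } n \ne k \end{cases} \qquad \text{[Kronecker delta]} \]
does not satisfy (1.16), and the conclusion of the lemma fails for it, since
\[ \lim_{k\to\infty} \sum_{n=1}^{\infty} a_{nk} = 1 \ne 0 = \sum_{n=1}^{\infty} \lim_{k\to\infty} a_{nk} . \]
Since $\lambda$ is bounded on bounded sets, Lebesgue measure is a Radon measure. Another interesting property of Lebesgue measure is translation invariance.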
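Proposition 1.47 Let $A \in \mathcal{B}(\mathbb{R})$. Then, for every $x \in \mathbb{R}$,
\[ A + x := \{ a + x \mid a \in A \} \in \mathcal{B}(\mathbb{R}) \tag{1.18} \]
\[ \lambda(A + x) = \lambda(A) . \tag{1.19} \]
Proof. Define, for any $x \in \mathbb{R}$,
\[ \mathcal{E}_x = \{ A \in \mathcal{P}(\mathbb{R}) \mid A + x \in \mathcal{B}(\mathbb{R}) \} . \]
Let us check that $\mathcal{E}_x$ is a σ-algebra in $\mathbb{R}$.
1. $\emptyset, \mathbb{R} \in \mathcal{E}_x$ by direct inspection.
2. Let $A \in \mathcal{E}_x$. Since $A^c + x = (A + x)^c \in \mathcal{B}(\mathbb{R})$, we conclude that $A^c \in \mathcal{E}_x$.
3. Let $(A_n) \subset \mathcal{E}_x$. Then, $(\bigcup_n A_n) + x = \bigcup_n (A_n + x) \in \mathcal{B}(\mathbb{R})$. So, $\bigcup_n A_n \in \mathcal{E}_x$.
Since $\mathcal{E}_x$ contains all open subsets of $\mathbb{R}$, $\mathcal{B}(\mathbb{R}) \subset \mathcal{E}_x$ for any $x \in \mathbb{R}$. This proves (1.18).
Let us prove (1.19). Fix $x \in \mathbb{R}$, and define
\[ \lambda_x(A) = \lambda(A + x) \quad \forall A \in \mathcal{B}(\mathbb{R}) . \]
It is straightforward to check that $\lambda_x$ and $\lambda$ agree on the class
\[ \mathcal{I}_{\mathbb{R}} := \{ (-\infty, a) \mid -\infty < a \le \infty \} \cup \{ [a, b) \mid -\infty < a \le b \le \infty \} . \]
So, $\lambda_x$ and $\lambda$ also agree on the algebra $\mathcal{A}_{\mathbb{R}}$ of all finite disjoint unions of elements of $\mathcal{I}_{\mathbb{R}}$. By the uniqueness result of Theorem 1.26, we conclude that $\lambda_x(A) = \lambda(A)$ for all $A \in \mathcal{B}(\mathbb{R})$.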
1.4.3 Examples
In this section we shall construct three examples of sets that are hard to
visualize but possess very interesting properties.
Example 1.48 (Two unusual Borel sets) Let $\{r_n\}$ be an enumeration of $\mathbb{Q} \cap [0,1]$, and fix $\varepsilon > 0$. Set
\[
A = \bigcup_{n=1}^{\infty} \Big( r_n - \frac{\varepsilon}{2^n},\; r_n + \frac{\varepsilon}{2^n} \Big) .
\]
Then, $A \cap [0,1]$ is an open (with respect to the relative topology) dense Borel set. By subadditivity, $0 < \lambda(A \cap [0,1]) < 2\varepsilon$. Moreover, the compact set $B := [0,1] \setminus A$ has no interior and measure nearly 1.
Example 1.49 (Cantor triadic set) To begin with, let us note that any $x \in [0,1]$ has a triadic expansion of the form
\[
x = \sum_{i=1}^{\infty} \frac{a_i}{3^i}, \qquad a_i = 0, 1, 2 . \tag{1.20}
\]
Such a representation is not unique due to the presence of periodic expansions. We can, however, choose a unique representation of the form (1.20) by picking the expansion with fewer digits equal to 1. Now, observe that the set
\[
C_1 := \Big\{ x \in [0,1] \;\Big|\; x = \sum_{i=1}^{\infty} \frac{a_i}{3^i} \text{ with } a_1 \ne 1 \Big\}
\]
is obtained from $[0,1]$ by removing the middle third $(\frac13, \frac23)$. It is, therefore, the union of 2 closed intervals, each of which has measure $\frac13$. More generally, for any $n \in \mathbb{N}$,
\[
C_n := \Big\{ x \in [0,1] \;\Big|\; x = \sum_{i=1}^{\infty} \frac{a_i}{3^i} \text{ with } a_1, \dots, a_n \ne 1 \Big\}
\]
is the union of $2^n$ closed intervals, each of which has measure $\big(\frac13\big)^n$. So,
\[
C_n \downarrow C := \Big\{ x \in [0,1] \;\Big|\; x = \sum_{i=1}^{\infty} \frac{a_i}{3^i} \text{ with } a_i \ne 1 \;\; \forall i \in \mathbb{N} \Big\} ,
\]
where $C$ is the so-called Cantor set. It is a closed set by construction, with measure 0 since
\[
\lambda(C) \le \lambda(C_n) \le \Big(\frac23\Big)^n \qquad \forall n \in \mathbb{N} .
\]
Nevertheless, $C$ is uncountable. Indeed, the function
\[
f\Big( \sum_{i=1}^{\infty} \frac{a_i}{3^i} \Big) = \sum_{i=1}^{\infty} a_i\, 2^{-(i+1)} \tag{1.21}
\]
maps $C$ onto $[0,1]$.
Exercise 1.50 Show that $f$ in (1.21) is onto.
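As a small illustration of Example 1.49 (not a solution of Exercise 1.50), the sketch below evaluates the measure of $C_n$ and applies the map $f$ of (1.21) to finite ternary expansions with digits in $\{0, 2\}$. The function names are hypothetical, and only finitely many digits are used, so the output covers dyadic rationals only.

```python
from fractions import Fraction
from itertools import product


def cantor_level_measure(n):
    """Total length of C_n: 2**n intervals, each of length 3**-n."""
    return Fraction(2, 3) ** n


def cantor_point(digits):
    """The point sum_i a_i / 3**i for a finite list of ternary digits."""
    return sum(Fraction(d, 3 ** i) for i, d in enumerate(digits, start=1))


def cantor_map(digits):
    """Apply f from (1.21): sum_i a_i * 2**-(i+1), digits in {0, 2}."""
    assert all(d in (0, 2) for d in digits)
    return sum(Fraction(d, 2 ** (i + 1)) for i, d in enumerate(digits, start=1))


# Images of all expansions with 4 digits in {0, 2}: every j/16, j = 0..15.
images = {cantor_map(ds) for ds in product((0, 2), repeat=4)}

print(cantor_level_measure(10))                      # measure of C_10
print(cantor_point([2, 0, 2]), cantor_map([2, 0, 2]))
print(sorted(images))
```

With $n$ digits the images are exactly the dyadic rationals $j/2^n$, $0 \le j < 2^n$, which is consistent with $f$ mapping $C$ onto $[0,1]$ once infinite expansions are allowed.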
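Remark 1.51 Observe that $\mathcal{B}(\mathbb{R})$ has the cardinality of $\mathcal{P}(\mathbb{Q})$. On the other hand, the $\sigma$-algebra $\mathcal{G}$ of Lebesgue measurable sets is complete. So, $\mathcal{P}(C) \subset \mathcal{G}$, where $C$ is Cantor set. Since $C$ has the cardinality of $\mathbb{R}$ (by (1.21) it can be mapped onto $[0,1]$), $\mathcal{P}(C)$, and hence $\mathcal{G}$, must have a higher cardinality than the $\sigma$-algebra of Borel sets. In other terms, $\mathcal{B}(\mathbb{R})$ is strictly included in $\mathcal{G}$.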
Example 1.52 (A nonmeasurable set) We shall now show that $\mathcal{G}$ is also strictly included in $\mathcal{P}([0,1))$. For $x, y \in [0,1)$ define
\[
x \oplus y = \begin{cases} x + y & \text{if } x + y < 1 \\ x + y - 1 & \text{if } x + y \ge 1 \end{cases}
\]
Observe that, if $E \subset [0,1)$ is a measurable set, then $E \oplus x \subset [0,1)$ is also measurable, and $\lambda(E \oplus x) = \lambda(E)$ for any $x \in [0,1)$. Indeed,
\[
E \oplus x = \big( (E + x) \cap [0,1) \big) \cup \big( ((E + x) \setminus [0,1)) - 1 \big) .
\]
In $[0,1)$, define $x$ and $y$ to be equivalent if $x - y \in \mathbb{Q}$. By the Axiom of Choice, there exists a set $P \subset [0,1)$ such that $P$ consists of exactly one representative point from each equivalence class. We claim that $P$ provides the required example of a nonmeasurable set. Indeed, consider the countable family $(P_n) \subset \mathcal{P}([0,1))$, where $P_n = P \oplus r_n$ and $(r_n)$ is an enumeration of $\mathbb{Q} \cap [0,1)$. Observe the following.
1. $(P_n)$ is a disjoint family: if there exist $p, q \in P$ such that $p \oplus r_n = q \oplus r_m$ with $n \ne m$, then $p - q \in \mathbb{Q}$. So, $p = q$, and the fact that $p \oplus r_n = p \oplus r_m$ with $r_n, r_m \in [0,1)$ implies that $r_n = r_m$, a contradiction.
2. $\cup_n P_n = [0,1)$. Indeed, let $x \in [0,1)$. Since $x$ is equivalent to some element of $P$, $x - p = r$ for some $p \in P$ and some $r \in \mathbb{Q}$ satisfying $|r| < 1$. Now, if $r \ge 0$, then $r = r_n$ for some $n \in \mathbb{N}$, whence $x \in P_n$. On the other hand, for $r < 0$, we have $1 + r = r_n$ for some $n \in \mathbb{N}$. So, $x \in P_n$ once again.
Should $P$ be measurable, it would follow that $\lambda([0,1)) = \sum_n \lambda(P_n)$. But this is impossible: since $\lambda(P_n) = \lambda(P)$ for every $n$, the right-hand side is either $0$ or $+\infty$.
1.4.4 Regularity of Radon measures
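In this section, we shall prove regularity properties of a Radon measure in $\mathbb{R}^N$. We begin by studying finite measures.
Proposition 1.53 Let $\mu$ be a finite measure on $(\mathbb{R}^N, \mathcal{B}(\mathbb{R}^N))$. Then, for any $B \in \mathcal{B}(\mathbb{R}^N)$,
\[
\mu(B) = \sup\{ \mu(C) : C \subset B, \text{ closed} \} = \inf\{ \mu(A) : A \supset B, \text{ open} \} . \tag{1.22}
\]
Proof. Let us set
\[
\mathcal{K} = \{ B \in \mathcal{B}(\mathbb{R}^N) \mid \text{(1.22) holds} \} .
\]
It is enough to show that $\mathcal{K}$ is a $\sigma$-algebra of parts of $\mathbb{R}^N$ including all open sets. Obviously, $\mathcal{K}$ contains $\mathbb{R}^N$ and $\emptyset$. Moreover, if $B \in \mathcal{K}$ then its complement $B^c$ belongs to $\mathcal{K}$. Let us now prove that $(B_n) \subset \mathcal{K} \implies \cup_{n=1}^{\infty} B_n \in \mathcal{K}$. We are going to show that, for any $\varepsilon > 0$, there is a closed set $C$ and an open set $A$ such that
\[
C \subset \bigcup_{n=1}^{\infty} B_n \subset A, \qquad \mu(A \setminus C) \le \varepsilon . \tag{1.23}
\]
Since $B_n \in \mathcal{K}$ for any $n \in \mathbb{N}$, there is an open set $A_n$ and a closed set $C_n$ such that
\[
C_n \subset B_n \subset A_n, \qquad \mu(A_n \setminus C_n) \le \frac{\varepsilon}{2^{n+1}} .
\]
Now, take $A = \cup_{n=1}^{\infty} A_n$ and $S = \cup_{n=1}^{\infty} C_n$ to obtain $S \subset \cup_{n=1}^{\infty} B_n \subset A$ and
\[
\mu(A \setminus S) \le \sum_{n=1}^{\infty} \mu(A_n \setminus S) \le \sum_{n=1}^{\infty} \mu(A_n \setminus C_n) \le \frac{\varepsilon}{2} .
\]
However, $A$ is open but $S$ is not necessarily closed. To overcome this difficulty, let us approximate $S$ by the sequence $S_n = \cup_{k=1}^{n} C_k$. For any $n \in \mathbb{N}$, $S_n$ is obviously closed, $S_n \uparrow S$, and so $\mu(S_n) \uparrow \mu(S)$. Therefore, there exists $n_\varepsilon \in \mathbb{N}$ such that $\mu(S \setminus S_{n_\varepsilon}) < \frac{\varepsilon}{2}$. Now, $C := S_{n_\varepsilon}$ satisfies $C \subset \cup_{n=1}^{\infty} B_n \subset A$ and $\mu(A \setminus C) = \mu(A \setminus S) + \mu(S \setminus C) < \varepsilon$. Therefore, $\cup_{n=1}^{\infty} B_n \in \mathcal{K}$. We have thus proved that $\mathcal{K}$ is a $\sigma$-algebra.
It remains to show that $\mathcal{K}$ contains the open subsets of $\mathbb{R}^N$. For this, let $A$ be open, and set
\[
C_n = \Big\{ x \in \mathbb{R}^N \;\Big|\; d_{A^c}(x) \ge \frac1n \Big\} ,
\]
where $d_{A^c}(x)$ is the distance of $x$ from $A^c$. Since $d_{A^c}$ is continuous, $C_n$ is a closed subset of $A$. Moreover, $C_n \uparrow A$. So, recalling that $\mu$ is finite, we conclude that $\mu(A \setminus C_n) \downarrow 0$.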
The following result is a straightforward consequence of Proposition 1.53.
Corollary 1.54 Let $\mu$ and $\nu$ be finite measures on $(\mathbb{R}^N, \mathcal{B}(\mathbb{R}^N))$ such that $\mu(C) = \nu(C)$ for any closed subset $C$ of $\mathbb{R}^N$. Then, $\mu = \nu$.
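Finally, we will extend Proposition 1.53 to Radon measures.
Theorem 1.55 Let $\mu$ be a Radon measure on $(\mathbb{R}^N, \mathcal{B}(\mathbb{R}^N))$, and let $B$ be a Borel set. Then,
\[
\mu(B) = \inf\{ \mu(A) \mid A \supset B, \; A \text{ open} \} \tag{1.24}
\]
\[
\mu(B) = \sup\{ \mu(K) \mid K \subset B, \; K \text{ compact} \} \tag{1.25}
\]
Proof. Since (1.24) is trivial if $\mu(B) = \infty$, we shall assume that $\mu(B) < \infty$. For any $n \in \mathbb{N}$, denote by $Q_n$ the cube $(-n, n)^N$, and consider the finite measure $\mu \llcorner Q_n$ (5). Fix $\varepsilon > 0$ and apply Proposition 1.53 to conclude that, for any $n \in \mathbb{N}$, there exists an open set $A_n \supset B$ such that
\[
(\mu \llcorner Q_n)(A_n \setminus B) < \frac{\varepsilon}{2^n} .
\]
Now, consider the open set $A := \cup_n (A_n \cap Q_n) \supset B$. We have
\[
\mu(A \setminus B) \le \sum_{n=1}^{\infty} \mu\big( (A_n \cap Q_n) \setminus B \big) = \sum_{n=1}^{\infty} (\mu \llcorner Q_n)(A_n \setminus B) < 2\varepsilon
\]
which in turn implies (1.24).
Next, let us prove (1.25) for $\mu(B) < \infty$. Fix $\varepsilon > 0$, and apply Proposition 1.53 to $\mu \llcorner Q_n$ to construct, for any $n \in \mathbb{N}$, a closed set $C_n \subset B$ satisfying
\[
(\mu \llcorner Q_n)(B \setminus C_n) < \varepsilon .
\]
Consider the sequence of compact sets $K_n = C_n \cap \overline{Q}_n$, where $\overline{Q}_n$ is the closure of $Q_n$. Since
\[
\mu(B \cap Q_n) \uparrow \mu(B) ,
\]
for some $n_\varepsilon \in \mathbb{N}$ we have that $\mu(B \cap Q_{n_\varepsilon}) > \mu(B) - \varepsilon$. Therefore, using $\mu(K_{n_\varepsilon}) \ge \mu(C_{n_\varepsilon} \cap Q_{n_\varepsilon})$,
\[
\mu(B \setminus K_{n_\varepsilon}) = \mu(B) - \mu(K_{n_\varepsilon}) < \mu(B \cap Q_{n_\varepsilon}) - \mu(C_{n_\varepsilon} \cap Q_{n_\varepsilon}) + \varepsilon = (\mu \llcorner Q_{n_\varepsilon})(B \setminus C_{n_\varepsilon}) + \varepsilon < 2\varepsilon .
\]
If $\mu(B) = +\infty$, then, setting $B_n = B \cap Q_n$, we have $B_n \uparrow B$, and so $\mu(B_n) \to +\infty$. Since $\mu(B_n) < +\infty$, for every $n$ there exists a compact set $K_n$ such that $K_n \subset B_n$ and $\mu(K_n) > \mu(B_n) - 1$, by which $K_n \subset B$ and $\mu(K_n) \to +\infty = \mu(B)$.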
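(5) That is, $\mu$ restricted to $Q_n$ (see Definition 1.23).
Exercise 1.56 A Radon measure $\mu$ on $(\mathbb{R}^N, \mathcal{B}(\mathbb{R}^N))$ is obviously $\sigma$-finite. Conversely, is a $\sigma$-finite Borel measure on $\mathbb{R}^N$ necessarily Radon?
Hint: consider $\mu = \sum_{n=1}^{\infty} \delta_{1/n}$ on $\mathcal{B}(\mathbb{R})$, where $\delta_{1/n}$ is the Dirac measure at $1/n$. To prove $\sigma$-additivity observe that, if $(B_k)$ are disjoint Borel sets, then
\[
\mu\Big( \bigcup_{k=1}^{\infty} B_k \Big) = \sum_{n=1}^{\infty} \delta_{1/n}\Big( \bigcup_{k=1}^{\infty} B_k \Big) = \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \delta_{1/n}(B_k)
\]
\[
= \lim_{n\to\infty} \sum_{i=1}^{n} \sum_{k=1}^{\infty} \delta_{1/i}(B_k) = \lim_{n\to\infty} \sum_{k=1}^{\infty} \sum_{i=1}^{n} \delta_{1/i}(B_k)
\]
\[
= \sum_{k=1}^{\infty} \sum_{i=1}^{\infty} \delta_{1/i}(B_k) = \sum_{k=1}^{\infty} \mu(B_k)
\]
where we have used Lemma 1.45.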
In subsections 1.4.1-1.4.2 we constructed Lebesgue measure on $\mathbb{R}$, starting from an additive function defined on the algebra of finite disjoint unions of semi-closed intervals $[a, b)$. This construction can be carried out in $\mathbb{R}^N$ provided that we replace the semi-closed intervals by semi-closed rectangles of the type
\[
\prod_{i=1}^{N} [a_i, b_i), \qquad a_i \le b_i, \quad i = 1, \dots, N
\]
whose measure is given by
\[
\lambda\Big( \prod_{i=1}^{N} [a_i, b_i) \Big) = \prod_{i=1}^{N} (b_i - a_i) .
\]
In what follows $\lambda$ will denote Lebesgue measure on $(\mathbb{R}^N, \mathcal{B}(\mathbb{R}^N))$. Then $\lambda$ is clearly a Radon measure and, by the analogue of Proposition 1.47, $\lambda$ is translation invariant. The next proposition characterizes all Radon measures that are translation invariant.
Proposition 1.57 Let $\mu$ be a Radon measure on $(\mathbb{R}^N, \mathcal{B}(\mathbb{R}^N))$ such that $\mu$ is translation invariant, that is
\[
\mu(E + x) = \mu(E) \qquad \forall E \in \mathcal{B}(\mathbb{R}^N), \; \forall x \in \mathbb{R}^N .
\]
Then there exists $c \ge 0$ such that $\mu(E) = c\lambda(E)$ for every $E \in \mathcal{B}(\mathbb{R}^N)$.
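Proof. For every $n \in \mathbb{N}$ define the set
\[
\Omega_n = \Big\{ \prod_{k=1}^{N} \Big[ \frac{a_k}{2^n}, \frac{a_k + 1}{2^n} \Big) , \; a_k \in \mathbb{Z} \Big\} ,
\]
that is, $\Omega_n$ is the set of semi-closed cubes with edges of length $\frac{1}{2^n}$ and with vertices whose coordinates are multiples of $\frac{1}{2^n}$. The sets $\Omega_n$ have the following properties:
a) for every $n$, $\mathbb{R}^N = \cup_{Q \in \Omega_n} Q$ with disjoint union;
b) if $Q \in \Omega_n$ and $P \in \Omega_r$ with $r \le n$, then $Q \subset P$ or $P \cap Q = \emptyset$;
c) if $Q \in \Omega_n$, then $\lambda(Q) = 2^{-nN}$.
Observe that $[0,1)^N$ is the union of $2^{nN}$ disjoint cubes of $\Omega_n$, and these cubes are identical up to a translation. Setting $c = \mu([0,1)^N)$ and using the translation invariance of $\mu$ and $\lambda$, for every $Q \in \Omega_n$ we have
\[
2^{nN} \mu(Q) = \mu([0,1)^N) = c\,\lambda([0,1)^N) = 2^{nN} c\,\lambda(Q) .
\]
Then $\mu$ and $c\lambda$ coincide on the cubes of the sets $\Omega_n$. If $A$ is an open nonempty set of $\mathbb{R}^N$, then by property a) we have $A = \cup_n \cup_{Q \in \Omega_n,\, Q \subset A} Q = \cup_n Z_n$ where $Z_n = \cup_{Q \in \Omega_n,\, Q \subset A} Q$. By property b) we deduce that if $Q \in \Omega_n$ and $Q \subset A$, then $Q \subset Z_{n-1}$ or $Q \cap Z_{n-1} = \emptyset$. Then $A$ can be rewritten as
\[
A = \bigcup_{n} \; \bigcup_{Q \in \Omega_n,\; Q \subset A \setminus Z_{n-1}} Q
\]
and the above union is disjoint. Then the $\sigma$-additivity of $\mu$ and $\lambda$ gives $\mu(A) = c\lambda(A)$; finally, by (1.24), $\mu(B) = c\lambda(B)$ for every $B \in \mathcal{B}(\mathbb{R}^N)$.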
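The decomposition of an open set into disjoint dyadic cubes used in the proof above can be tried out in the plane. The sketch below does so for the open unit disc, under two simplifying assumptions that are not in the notes: a square is kept only when its closure lies in the disc (a slightly stricter test than $Q \subset A$, which still exhausts the disc as the level grows), and the recursion stops at a finite level, so the result is a lower approximation of the area $\pi$.

```python
import math


def dyadic_area_in_disc(max_level):
    """Total area of disjoint dyadic squares selected inside the open unit disc."""

    def closed_square_in_disc(x, y, s):
        # The farthest point of the closed square from the origin is a corner.
        return max(cx * cx + cy * cy
                   for cx in (x, x + s) for cy in (y, y + s)) < 1.0

    def square_misses_disc(x, y, s):
        # Nearest point of the closed square to the origin (clamp 0 to the sides).
        nx = min(max(0.0, x), x + s)
        ny = min(max(0.0, y), y + s)
        return nx * nx + ny * ny >= 1.0

    def visit(x, y, s, level):
        if square_misses_disc(x, y, s):
            return 0.0
        if closed_square_in_disc(x, y, s):
            return s * s          # keep the whole square, never subdivide it
        if level == max_level:
            return 0.0            # boundary square at the finest level: dropped
        h = s / 2
        return sum(visit(x + dx, y + dy, h, level + 1)
                   for dx in (0.0, h) for dy in (0.0, h))

    # Level-0 squares [a, a+1) x [b, b+1), a, b in {-1, 0}, cover the disc.
    return sum(visit(a, b, 1.0, 0) for a in (-1.0, 0.0) for b in (-1.0, 0.0))


areas = [dyadic_area_in_disc(n) for n in range(9)]
print(areas, math.pi)
```

Because a selected square is never subdivided, the squares kept at different levels are disjoint, which mirrors the role of the condition $Q \subset A \setminus Z_{n-1}$ in the proof.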
The next theorem shows how Lebesgue measure changes under non-singular linear transformations.
Theorem 1.58 Let $T : \mathbb{R}^N \to \mathbb{R}^N$ be a linear non-singular transformation. Then
i) $T(E) \in \mathcal{B}(\mathbb{R}^N)$ for every $E \in \mathcal{B}(\mathbb{R}^N)$;
ii) $\lambda(T(E)) = |\det T| \, \lambda(E)$ for every $E \in \mathcal{B}(\mathbb{R}^N)$.
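Proof. Consider the family
\[
\mathcal{E} = \{ E \in \mathcal{B}(\mathbb{R}^N) \mid T(E) \in \mathcal{B}(\mathbb{R}^N) \} .
\]
Since $T$ is non-singular, then $T(\emptyset) = \emptyset$, $T(\mathbb{R}^N) = \mathbb{R}^N$, $T(E^c) = (T(E))^c$, $T(\cup_n E_n) = \cup_n T(E_n)$ for all $E, E_n \subset \mathbb{R}^N$. Hence $\mathcal{E}$ is a $\sigma$-algebra. Furthermore $T$ maps open sets into open sets; so $\mathcal{E} = \mathcal{B}(\mathbb{R}^N)$ and i) follows.
Next define
\[
\mu(B) = \lambda(T(B)) \qquad \forall B \in \mathcal{B}(\mathbb{R}^N) .
\]
Since $T$ maps compact sets into compact sets, we deduce that $\mu$ is a Radon measure. Furthermore if $B \in \mathcal{B}(\mathbb{R}^N)$ and $x \in \mathbb{R}^N$, since $\lambda$ is translation invariant, we have
\[
\mu(B + x) = \lambda(T(B + x)) = \lambda(T(B) + T(x)) = \lambda(T(B)) = \mu(B) ,
\]
and so $\mu$ is translation invariant too. Proposition 1.57 implies that there exists $\Delta(T) \ge 0$ such that
\[
\mu(B) = \Delta(T) \lambda(B) \qquad \forall B \in \mathcal{B}(\mathbb{R}^N) . \tag{1.26}
\]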
It remains to show that $\Delta(T) = |\det T|$. To prove this, let $\{e_1, \dots, e_N\}$ denote the standard basis in $\mathbb{R}^N$, i.e. $e_i$ has the $j$-th coordinate equal to 1 if $j = i$ and equal to 0 if $j \ne i$. We begin with the case of the following elementary transformations:
a) there exist $i \ne j$ such that $T(e_i) = e_j$, $T(e_j) = e_i$ and $T(e_k) = e_k$ for $k \ne i, j$.
In this case $T([0,1)^N) = [0,1)^N$ and $\det T = -1$. By taking $B = [0,1)^N$ in (1.26), we deduce $\Delta(T) = 1 = |\det T|$.
b) there exist $\alpha \ne 0$ and $i$ such that $T(e_i) = \alpha e_i$ and $T(e_k) = e_k$ for $k \ne i$.
Assume $i = 1$. Then $T([0,1)^N) = [0, \alpha) \times [0,1)^{N-1}$ if $\alpha > 0$ and $T([0,1)^N) = (\alpha, 0] \times [0,1)^{N-1}$ if $\alpha < 0$. Therefore by taking $B = [0,1)^N$ in (1.26), we obtain $\Delta(T) = \lambda(T([0,1)^N)) = |\alpha| = |\det T|$.
c) there exist $i \ne j$ and $\alpha \ne 0$ such that $T(e_i) = e_i + \alpha e_j$, $T(e_k) = e_k$ for $k \ne i$.
Assume $i = 1$ and $j = 2$ and set $Q_\alpha = \{ \alpha x_2 e_2 + \sum_{i \ne 2} x_i e_i \mid 0 \le x_i < 1 \}$. Then we have
\[
T(Q_\alpha) = \{ x_1 e_1 + \alpha (x_1 + x_2) e_2 + \dots + x_N e_N \mid 0 \le x_i < 1 \}
\]
\[
= \Big\{ \alpha \xi_2 e_2 + \sum_{i \ne 2} \xi_i e_i \;\Big|\; \xi_1 \le \xi_2 < \xi_1 + 1, \; 0 \le \xi_i < 1 \text{ for } i \ne 2 \Big\} = E_1 \cup E_2
\]
with disjoint union, where
\[
E_1 = \Big\{ \alpha \xi_2 e_2 + \sum_{i \ne 2} \xi_i e_i \;\Big|\; \xi_1 \le \xi_2 < 1, \; 0 \le \xi_i < 1 \text{ for } i \ne 2 \Big\} ,
\]
\[
E_2 = \Big\{ \alpha \xi_2 e_2 + \sum_{i \ne 2} \xi_i e_i \;\Big|\; 1 \le \xi_2 < \xi_1 + 1, \; 0 \le \xi_i < 1 \text{ for } i \ne 2 \Big\} .
\]
Observe that $E_1 \subset Q_\alpha$ and $E_2 - \alpha e_2 = Q_\alpha \setminus E_1$; then
\[
\lambda(T(Q_\alpha)) = \lambda(E_1) + \lambda(E_2) = \lambda(E_1) + \lambda(E_2 - \alpha e_2) = \lambda(Q_\alpha) .
\]
By taking $B = Q_\alpha$ in (1.26) we deduce $\Delta(T) = 1 = |\det T|$.


If $T = T_1 \cdot \ldots \cdot T_k$ with $T_i$ elementary transformations of type a)-b)-c), since (1.26) implies $\Delta(T) = \Delta(T_1) \cdot \ldots \cdot \Delta(T_k)$, then we have
\[
\Delta(T) = |\det T_1| \cdot \ldots \cdot |\det T_k| = |\det T| .
\]
Therefore the thesis will follow if we prove the following claim: any linear non-singular transformation $T$ is the product of elementary transformations of type a)-b)-c). We proceed by induction on the dimension $N$. The claim is trivially true for $N = 1$; assume that the claim holds for $N - 1$. Set $T = (a_{i,j})_{i,j=1,\dots,N}$, that is
\[
T(e_i) = \sum_{j=1}^{N} a_{ij} e_j \qquad i = 1, \dots, N .
\]
For $k = 1, \dots, N$, consider $T_k = (a_{i,j})_{j=1,\dots,N-1,\; i=1,\dots,N,\; i \ne k}$. Since $\det T = \sum_{k=1}^{N} (-1)^{k+N} a_{kN} \det T_k$, possibly changing two variables by a transformation of type a), we may assume $\det T_N \ne 0$. Then by induction the following transformation $S_1 : \mathbb{R}^N \to \mathbb{R}^N$
\[
S_1(e_i) = T_N(e_i) = \sum_{j=1}^{N-1} a_{ij} e_j \quad i = 1, \dots, N-1, \qquad S_1(e_N) = e_N
\]
is the product of elementary transformations. By applying transformations of type c) we add $a_{iN} S_1(e_N)$ to $S_1(e_i)$ for $i = 1, \dots, N-1$ and we arrive at $S_2 : \mathbb{R}^N \to \mathbb{R}^N$ defined by
\[
S_2(e_i) = \sum_{j=1}^{N} a_{ij} e_j \quad i = 1, \dots, N-1, \qquad S_2(e_N) = e_N .
\]
Next we compose $S_2$ with a transformation of type b) to obtain
\[
S_3(e_i) = \sum_{j=1}^{N} a_{ij} e_j \quad i = 1, \dots, N-1, \qquad S_3(e_N) = b\, e_N
\]
where $b$ will be chosen later. Now set $T_N^{-1} = (m_{ki})_{k,i=1,\dots,N-1}$. By applying again transformations of type c), for every $i = 1, \dots, N-1$ we multiply $S_3(e_i)$ by $\sum_{k=1}^{N-1} a_{Nk} m_{ki}$ and add the results to $S_3(e_N)$; then we obtain:
\[
S_4(e_i) = \sum_{j=1}^{N} a_{ij} e_j \quad i = 1, \dots, N-1, \qquad S_4(e_N) = b\, e_N + \sum_{i,k=1}^{N-1} a_{Nk} m_{ki} \sum_{j=1}^{N} a_{ij} e_j .
\]
Since $\sum_{i,k=1}^{N-1} a_{Nk} m_{ki} \sum_{j=1}^{N-1} a_{ij} e_j = \sum_{k=1}^{N-1} a_{Nk} e_k$, by choosing $b = a_{NN} - \sum_{i,k=1}^{N-1} a_{Nk} m_{ki} a_{iN}$ we have that $T = S_4$.
Remark 1.59 As a corollary of Theorem 1.58 we obtain that the Lebesgue
measure is rotation invariant.
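Statement ii) of Theorem 1.58 and Remark 1.59 can be sanity-checked in dimension 2, where the image of the unit square is a parallelogram whose area is computable by the shoelace formula. This is only a sketch: the matrices are arbitrary examples of types a), b), c) and of a rotation, and the column convention $T(x, y) = (ax + by, cx + dy)$ is an assumption of the code, not the index convention used in the proof.

```python
import math


def image_area_of_unit_square(t):
    """Area of T([0,1)^2) for a 2x2 matrix t = ((a, b), (c, d)), via the shoelace formula."""
    (a, b), (c, d) = t
    # Images of the corners (0,0), (1,0), (1,1), (0,1) under T(x, y) = (a*x + b*y, c*x + d*y).
    pts = [(0.0, 0.0), (a, c), (a + b, c + d), (b, d)]
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]))
    return abs(twice_area) / 2


def det2(t):
    (a, b), (c, d) = t
    return a * d - b * c


theta = 0.7  # arbitrary rotation angle
examples = {
    "swap (type a)": ((0.0, 1.0), (1.0, 0.0)),
    "scaling (type b)": ((-3.0, 0.0), (0.0, 1.0)),
    "shear (type c)": ((1.0, 0.0), (2.5, 1.0)),
    "rotation": ((math.cos(theta), -math.sin(theta)),
                 (math.sin(theta), math.cos(theta))),
}

for name, t in examples.items():
    print(name, image_area_of_unit_square(t), abs(det2(t)))
```

The swap, the shear and the rotation all preserve area, while the scaling multiplies it by $|\alpha| = 3$.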
Chapter 2
Integration
2.1 Measurable functions
2.1.1 Inverse image of a function
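Let $X, Y$ be nonempty sets. For any map $\varphi : X \to Y$ and any $A \in \mathcal{P}(Y)$ we set
\[
\varphi^{-1}(A) := \{ x \in X \mid \varphi(x) \in A \} = \{ \varphi \in A \} .
\]
$\varphi^{-1}(A)$ is called the inverse image of $A$.
Let us recall some elementary properties of $\varphi^{-1}$. The easy proofs are left to the reader as an exercise.
(i) $\varphi^{-1}(A^c) = (\varphi^{-1}(A))^c$ for all $A \in \mathcal{P}(Y)$.
(ii) If $A, B \in \mathcal{P}(Y)$, then $\varphi^{-1}(A \cap B) = \varphi^{-1}(A) \cap \varphi^{-1}(B)$. In particular, if $A \cap B = \emptyset$, then $\varphi^{-1}(A) \cap \varphi^{-1}(B) = \emptyset$.
(iii) If $\{A_k\} \subset \mathcal{P}(Y)$ we have
\[
\varphi^{-1}\Big( \bigcup_{k=1}^{\infty} A_k \Big) = \bigcup_{k=1}^{\infty} \varphi^{-1}(A_k) .
\]
Consequently, if $(Y, \mathcal{F})$ is a measurable space, then the family of parts of $X$
\[
\varphi^{-1}(\mathcal{F}) := \{ \varphi^{-1}(A) : A \in \mathcal{F} \}
\]
is a $\sigma$-algebra in $X$.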
Exercise 2.1 Let $\varphi : X \to Y$ and let $A \in \mathcal{P}(X)$. Set
\[
\varphi(A) := \{ \varphi(x) \mid x \in A \} .
\]
Show that properties like (i), (ii) fail, in general, for $\varphi(A)$.
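A concrete finite instance may help to see the difference between inverse images and direct images. The sketch below uses the squaring map on a small set of integers; the sets and names are illustrative choices, and the example is one possible witness for Exercise 2.1, not a full solution.

```python
def preimage(phi, domain, subset):
    """Inverse image of `subset` under phi, computed inside `domain`."""
    return {x for x in domain if phi(x) in subset}


def image(phi, subset):
    """Direct image of `subset` under phi."""
    return {phi(x) for x in subset}


def square(x):
    return x * x


X = {-2, -1, 0, 1, 2}
Y = {0, 1, 4, 9}
A, B = {0, 1}, {1, 4}

# Properties (i) and (ii) for inverse images.
complement_ok = preimage(square, X, Y - A) == X - preimage(square, X, A)
intersection_ok = (preimage(square, X, A & B)
                   == preimage(square, X, A) & preimage(square, X, B))

# The analogous statements fail for direct images.
S, T = {-2, -1}, {1, 2}
image_of_intersection = image(square, S & T)              # image of the empty set
intersection_of_images = image(square, S) & image(square, T)
image_of_complement = image(square, X - S)
complement_of_image = Y - image(square, S)

print(complement_ok, intersection_ok)
print(image_of_intersection, intersection_of_images)
print(image_of_complement, complement_of_image)
```

Here $S$ and $T$ are disjoint but have the same image, because the squaring map is not injective.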
2.1.2 Measurable maps and Borel functions
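Let $(X, \mathcal{E})$ and $(Y, \mathcal{F})$ be measurable spaces.
Definition 2.2 We say that a map $\varphi : X \to Y$ is measurable if $\varphi^{-1}(\mathcal{F}) \subset \mathcal{E}$. When $Y$ is a metric space and $\mathcal{F} = \mathcal{B}(Y)$, we also call $\varphi$ a Borel map. If, in addition, $Y = \mathbb{R}$, then we say that $\varphi$ is a Borel function.
Proposition 2.3 Let $\mathcal{A} \subset \mathcal{F}$ be such that $\sigma(\mathcal{A}) = \mathcal{F}$. Then $\varphi : X \to Y$ is measurable if and only if $\varphi^{-1}(\mathcal{A}) \subset \mathcal{E}$.
Proof. Clearly, if $\varphi$ is measurable, then $\varphi^{-1}(\mathcal{A}) \subset \mathcal{E}$. Conversely, suppose $\varphi^{-1}(\mathcal{A}) \subset \mathcal{E}$, and consider the family
\[
\mathcal{G} := \{ B \in \mathcal{F} \mid \varphi^{-1}(B) \in \mathcal{E} \} .
\]
Using properties (i), (ii), and (iii) of $\varphi^{-1}$ from the previous section, one can easily show that $\mathcal{G}$ is a $\sigma$-algebra in $Y$ including $\mathcal{A}$. So, $\mathcal{G}$ coincides with $\mathcal{F}$ and the proof is complete.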
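Exercise 2.4 Show that a function $\varphi : X \to \mathbb{R}$ is Borel if any of the following conditions holds:
(i) $\varphi^{-1}((-\infty, t]) \in \mathcal{E}$ for all $t \in \mathbb{R}$.
(ii) $\varphi^{-1}((-\infty, t)) \in \mathcal{E}$ for all $t \in \mathbb{R}$.
(iii) $\varphi^{-1}([a, b]) \in \mathcal{E}$ for all $a, b \in \mathbb{R}$.
(iv) $\varphi^{-1}([a, b)) \in \mathcal{E}$ for all $a, b \in \mathbb{R}$.
(v) $\varphi^{-1}((a, b)) \in \mathcal{E}$ for all $a, b \in \mathbb{R}$.
Exercise 2.5 Let $\varphi(X)$ be countable. Show that $\varphi$ is measurable if, for any $y \in Y$, $\varphi^{-1}(y) \in \mathcal{E}$.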
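Proposition 2.6 Let $X, Y$ be metric spaces, $\mathcal{E} = \mathcal{B}(X)$, and $\mathcal{F} = \mathcal{B}(Y)$. Then, any continuous map $\varphi : X \to Y$ is measurable.
Proof. Let $\mathcal{A}$ be the family of all open subsets of $Y$. Then, $\sigma(\mathcal{A}) = \mathcal{B}(Y)$ and $\varphi^{-1}(\mathcal{A}) \subset \mathcal{B}(X)$. So, the conclusion follows from Proposition 2.3.
Proposition 2.7 Let $\varphi : X \to Y$ be measurable, let $(Z, \mathcal{G})$ be a measurable space, and let $\psi : Y \to Z$ be another measurable map. Then $\psi \circ \varphi$ is measurable.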
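Example 2.8 Let $(X, \mathcal{E})$ be a measurable space, and let $\varphi : X \to \mathbb{R}^N$. We regard $\mathbb{R}^N$ as a measurable space with the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^N)$. Denoting by $\varphi_i$ the components of $\varphi$, that is, $\varphi = (\varphi_1, \dots, \varphi_N)$, let us show that
\[
\varphi \text{ is Borel} \iff \varphi_i \text{ is Borel } \; \forall i \in \{1, \dots, N\} . \tag{2.1}
\]
Indeed, for any $y \in \mathbb{R}^N$ let
\[
A_y = \prod_{i=1}^{N} (-\infty, y_i] = \{ z \in \mathbb{R}^N \mid z_i \le y_i \;\; \forall i \} ,
\]
and define $\mathcal{A} = \{ A_y \mid y \in \mathbb{R}^N \}$. Observe that $\mathcal{B}(\mathbb{R}^N) = \sigma(\mathcal{A})$ to deduce, from Proposition 2.3, that $\varphi$ is measurable if and only if $\varphi^{-1}(\mathcal{A}) \subset \mathcal{E}$. Now, for any $y \in \mathbb{R}^N$,
\[
\varphi^{-1}(A_y) = \bigcap_{i=1}^{N} \{ x \in X \mid \varphi_i(x) \le y_i \} = \bigcap_{i=1}^{N} \varphi_i^{-1}((-\infty, y_i]) .
\]
This shows the $\Leftarrow$ part of (2.1). To complete the reasoning, assume that $\varphi$ is Borel and let $i \in \{1, \dots, N\}$ be fixed. Then, for any $t \in \mathbb{R}$,
\[
\varphi_i^{-1}((-\infty, t]) = \varphi^{-1}(\{ z \in \mathbb{R}^N \mid z_i \le t \})
\]
which implies $\varphi_i^{-1}((-\infty, t]) \in \mathcal{E}$, and so $\varphi_i$ is Borel.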
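Exercise 2.9 Let $\varphi, \psi : X \to \mathbb{R}$ be Borel. Then $\varphi + \psi$, $\varphi\psi$, $\varphi \wedge \psi$, and $\varphi \vee \psi$ are Borel.
Hint: define $f(x) = (\varphi(x), \psi(x))$ and $g(y_1, y_2) = y_1 + y_2$. Then, $f$ is a Borel map owing to Example 2.8, and $g$ is a Borel function since it is continuous. Thus, $\varphi + \psi = g \circ f$ is also Borel. The remaining assertions can be proved similarly.
Exercise 2.10 Let $\varphi : X \to \mathbb{R}$ be Borel. Prove that the function
\[
\psi(x) = \begin{cases} \dfrac{1}{\varphi(x)} & \text{if } \varphi(x) \ne 0 \\ 0 & \text{if } \varphi(x) = 0 \end{cases}
\]
is also Borel.
Hint: show, by a direct argument, that $f : \mathbb{R} \to \mathbb{R}$ defined by
\[
f(x) = \begin{cases} \dfrac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}
\]
is a Borel function.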
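Proposition 2.11 Let $(\varphi_n)$ be a sequence of Borel functions in $(X, \mathcal{E})$ such that $|\varphi_n(x)| \le M$ for all $x \in X$, all $n \in \mathbb{N}$, and some $M > 0$. Then, the functions
\[
\sup_{n \in \mathbb{N}} \varphi_n(x), \quad \inf_{n \in \mathbb{N}} \varphi_n(x), \quad \limsup_{n \to \infty} \varphi_n(x), \quad \liminf_{n \to \infty} \varphi_n(x),
\]
are Borel.
Proof. Let us prove that $\varphi(x) := \sup_{n \in \mathbb{N}} \varphi_n(x)$ is Borel. Let $\mathcal{F}$ be the set of all intervals of the form $(-\infty, a]$ with $a \in \mathbb{R}$. Since $\sigma(\mathcal{F}) = \mathcal{B}(\mathbb{R})$, by Proposition 2.3 it suffices to check that $\varphi^{-1}((-\infty, a]) \in \mathcal{E}$ for every $a \in \mathbb{R}$. This holds because
\[
\varphi^{-1}((-\infty, a]) = \bigcap_{n=1}^{\infty} \varphi_n^{-1}((-\infty, a]) \in \mathcal{E} .
\]
In a similar way one can prove the other assertions.
It is convenient to consider functions with values on the extended space $\overline{\mathbb{R}} = \mathbb{R} \cup \{\infty, -\infty\}$. These are called extended functions. We say that a mapping $\varphi : X \to \overline{\mathbb{R}}$ is Borel if
\[
\varphi^{-1}(\{-\infty\}), \; \varphi^{-1}(\{\infty\}) \in \mathcal{E}
\]
and $\varphi^{-1}(I) \in \mathcal{E}$ for all $I \in \mathcal{B}(\mathbb{R})$.
All previous results can be generalized, with obvious modifications, to extended Borel functions. In particular, the following result holds.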
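Proposition 2.12 Let $(\varphi_n)$ be a sequence of Borel functions on $(X, \mathcal{E})$. Then the following functions:
\[
\sup_{n \in \mathbb{N}} \varphi_n(x), \quad \inf_{n \in \mathbb{N}} \varphi_n(x), \quad \limsup_{n \to \infty} \varphi_n(x), \quad \liminf_{n \to \infty} \varphi_n(x),
\]
are Borel.
Exercise 2.13 Let $\varphi, \psi : X \to \mathbb{R}$ be Borel functions on $(X, \mathcal{E})$. Prove that $\{\varphi = \psi\} \in \mathcal{E}$.
Exercise 2.14 Let $(\varphi_n)$ be a sequence of Borel functions in $(X, \mathcal{E})$. Show that $\{ x \in X \mid \exists \lim_n \varphi_n(x) \} \in \mathcal{E}$.
Exercise 2.15 Let $\varphi : X \to \mathbb{R}$ be a Borel function on $(X, \mathcal{E})$, and let $A \in \mathcal{E}$. Prove that
\[
\varphi_A(x) = \begin{cases} \varphi(x) & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases}
\]
is Borel.
Exercise 2.16 1. Let $X$ be a metric space and $\mathcal{E} = \mathcal{B}(X)$. Then, any lower semicontinuous map $\varphi : X \to \mathbb{R}$ is Borel.
2. Any monotone function $\varphi : \mathbb{R} \to \mathbb{R}$ is Borel.
Exercise 2.17 Let $\mathcal{E}$ be a $\sigma$-algebra in $\mathbb{R}$. Show that $\mathcal{E} \supset \mathcal{B}(\mathbb{R})$ if and only if any continuous function $\varphi : \mathbb{R} \to \mathbb{R}$ is $\mathcal{E}$-measurable, that is, $\varphi^{-1}(B) \in \mathcal{E}$ for any $B \in \mathcal{B}(\mathbb{R})$.
Exercise 2.18 Show that Borel functions on $\mathbb{R}$ are the smallest class of functions which includes all continuous functions and is stable under pointwise limits.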
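Definition 2.19 A Borel function $\varphi : X \to \mathbb{R}$ is said to be simple, if its range $\varphi(X)$ is a finite set. The class of all simple functions $\varphi : X \to \mathbb{R}$ is denoted by $\mathcal{S}(X)$.
It is immediate that the class $\mathcal{S}(X)$ is closed under sum, product, $\wedge$, $\vee$, and so on.
We recall that $\chi_A : X \to \mathbb{R}$ stands for the characteristic function of a set $A \subset X$, i.e.,
\[
\chi_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases}
\]
Then $\chi_A \in \mathcal{S}(X)$ if and only if $A \in \mathcal{E}$.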
Remark 2.20 1. We note that $\varphi : X \to \mathbb{R}$ is simple if and only if there exist disjoint sets $A_1, \dots, A_n \in \mathcal{E}$ and real numbers $a_1, \dots, a_n$ such that
\[
X = \bigcup_{i=1}^{n} A_i \quad \text{and} \quad \varphi(x) = \sum_{i=1}^{n} a_i \chi_{A_i}(x) \quad \forall x \in X . \tag{2.2}
\]
Indeed, any function given by (2.2) is simple. Conversely, if $\varphi$ is simple, then
\[
\varphi(X) = \{ a_1, \dots, a_n \} \quad \text{with } a_i \ne a_j \text{ for } i \ne j .
\]
So, taking $A_i := \varphi^{-1}(a_i)$, $i \in \{1, \dots, n\}$, we obtain a representation of $\varphi$ of type (2.2).
Obviously, the choice of sets $A_1, \dots, A_n \in \mathcal{E}$ and real numbers $a_1, \dots, a_n$ is far from being unique.
2. Given two simple functions $\varphi$ and $\psi$, they can always be represented as linear combinations of the characteristic functions of the same family of sets. To see this, let $\varphi$ be given by (2.2), and let
\[
X = \bigcup_{h=1}^{m} B_h \quad \text{and} \quad \psi(x) = \sum_{h=1}^{m} b_h \chi_{B_h}(x), \quad \forall x \in X .
\]
Since $A_i = \cup_{h=1}^{m} (A_i \cap B_h)$, we have that
\[
\chi_{A_i}(x) = \sum_{h=1}^{m} \chi_{A_i \cap B_h}(x) \qquad i \in \{1, \dots, n\} .
\]
So,
\[
\varphi(x) = \sum_{i=1}^{n} \sum_{h=1}^{m} a_i \chi_{A_i \cap B_h}(x), \quad x \in X .
\]
Similarly,
\[
\psi(x) = \sum_{h=1}^{m} \sum_{i=1}^{n} b_h \chi_{A_i \cap B_h}(x), \quad x \in X .
\]
Now, we show that any positive Borel function can be approximated by simple functions.
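Proposition 2.21 Let $\varphi$ be a positive extended Borel function on a measurable space $(X, \mathcal{E})$. Define for any $n \in \mathbb{N}$
\[
\varphi_n(x) = \begin{cases} \dfrac{i-1}{2^n} & \text{if } \dfrac{i-1}{2^n} \le \varphi(x) < \dfrac{i}{2^n}, \; i = 1, 2, \dots, n2^n, \\ n & \text{if } \varphi(x) \ge n. \end{cases} \tag{2.3}
\]
Then, $(\varphi_n)_n \subset \mathcal{S}(X)$, $(\varphi_n)_n$ is increasing, and $\varphi_n(x) \to \varphi(x)$ for every $x \in X$. If, in addition, $\varphi$ is bounded, then the convergence is uniform.
Proof. For every $n \in \mathbb{N}$ and $i = 1, \dots, n2^n$ set
\[
E_{n,i} = \varphi^{-1}\Big( \Big[ \frac{i-1}{2^n}, \frac{i}{2^n} \Big) \Big), \qquad F_n = \varphi^{-1}([n, +\infty]) .
\]
Since $\varphi$ is Borel, we have $E_{n,i}, F_n \in \mathcal{E}$ and
\[
\varphi_n = \sum_{i=1}^{n2^n} \frac{i-1}{2^n} \chi_{E_{n,i}} + n \chi_{F_n} .
\]
Then, by Remark 2.20, $\varphi_n \in \mathcal{S}(X)$. Let $x \in X$ be such that $\frac{i-1}{2^n} \le \varphi(x) < \frac{i}{2^n}$. Then, $\frac{2i-2}{2^{n+1}} \le \varphi(x) < \frac{2i}{2^{n+1}}$ and we have
\[
\varphi_{n+1}(x) = \frac{2i-2}{2^{n+1}} \quad \text{or} \quad \varphi_{n+1}(x) = \frac{2i-1}{2^{n+1}} .
\]
In any case, $\varphi_n(x) \le \varphi_{n+1}(x)$. If $\varphi(x) \ge n$, then we have $\varphi(x) \ge n+1$ or $n \le \varphi(x) < n+1$. In the first case $\varphi_{n+1}(x) = n+1 > n = \varphi_n(x)$. In the second case let $i \in \{1, \dots, (n+1)2^{n+1}\}$ be such that $\frac{i-1}{2^{n+1}} \le \varphi(x) < \frac{i}{2^{n+1}}$. Since $\varphi(x) \ge n$, we deduce $\frac{i}{2^{n+1}} > n$, by which $i - 1 \ge n2^{n+1}$. Then $\varphi_{n+1}(x) = \frac{i-1}{2^{n+1}} \ge n = \varphi_n(x)$. This proves that $(\varphi_n)_n$ is increasing.
Next, fix $x \in X$. If $\varphi(x) = \infty$, then $\varphi_n(x) = n \to \infty$. Otherwise, let $n > \varphi(x)$. Then,
\[
0 \le \varphi(x) - \varphi_n(x) < \frac{1}{2^n} . \tag{2.4}
\]
So, $\varphi_n(x) \to \varphi(x)$ as $n \to \infty$. Finally, if $0 \le \varphi(x) \le M$ for all $x \in X$ and some $M > 0$, then (2.4) holds for any $x \in X$ provided that $n > M$. Thus $\varphi_n \to \varphi$ uniformly.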
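The approximating functions (2.3) depend on $x$ only through the value $\varphi(x)$, so the construction is easy to experiment with. The sketch below computes $\varphi_n$ at a point from that value; the function name `phi_n` and the sample value $\pi$ are illustrative choices.

```python
import math


def phi_n(value, n):
    """Value of the n-th simple approximation (2.3) at a point where phi equals `value` >= 0."""
    if value >= n:
        return float(n)
    # value lies in [(i-1)/2**n, i/2**n) with i - 1 = floor(value * 2**n).
    return math.floor(value * 2 ** n) / 2 ** n


value = math.pi
approximations = [phi_n(value, n) for n in range(1, 21)]
print(approximations[:6])

# Error bound (2.4) once n exceeds the value of phi.
errors = [value - phi_n(value, n) for n in range(4, 21)]
print(errors[:3])

# At a point where phi is +infinity the approximations are n, which diverges.
print(phi_n(math.inf, 7))
```

The first three values are capped at $n$ (since $\pi \ge n$ for $n \le 3$); from $n = 4$ on, the dyadic truncation takes over and the error is below $2^{-n}$.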
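Definition 2.22 Let $(X, \mathcal{E}, \mu)$ be a measure space and $\varphi_n, \varphi : X \to \mathbb{R}$. We say that $(\varphi_n)_n$ converges to a function $\varphi$
• almost everywhere (a.e.) if there exists a set $F \in \mathcal{E}$, of measure 0, such that
\[
\lim_{n \to \infty} \varphi_n(x) = \varphi(x) \qquad \forall x \in X \setminus F ;
\]
• almost uniformly (a.u.) if, for any $\varepsilon > 0$, there exists $F_\varepsilon \in \mathcal{E}$ such that $\mu(F_\varepsilon) < \varepsilon$ and $\varphi_n \to \varphi$ uniformly on $X \setminus F_\varepsilon$.
Exercise 2.23 Let $(\varphi_n)_n$ be a sequence of Borel functions on a measure space $(X, \mathcal{E}, \mu)$.
1. Show that the pointwise limit of $\varphi_n$, when it exists, is also a Borel function.
2. Show that, if $\varphi_n \xrightarrow{a.u.} \varphi$, then $\varphi_n \xrightarrow{a.e.} \varphi$.
3. Show that, if $\varphi_n \xrightarrow{a.e.} \varphi$ and $\varphi_n \xrightarrow{a.e.} \psi$, then $\varphi = \psi$ except on a set of measure 0.
4. We say that $\varphi_n \to \varphi$ uniformly almost everywhere if there exists $F \in \mathcal{E}$ of measure 0 such that $\varphi_n \to \varphi$ uniformly in $X \setminus F$. Show that almost uniform convergence does not imply uniform convergence almost everywhere.
Hint: consider $\varphi_n(x) = x^n$ for $x \in [0,1]$.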
Example 2.24 Observe that the a.e. limit of Borel functions may not be Borel. Indeed, the trivial sequence $\varphi_n \equiv 0$ defined on $(\mathbb{R}, \mathcal{B}(\mathbb{R}), \lambda)$ ($\lambda$ denoting the Lebesgue measure) converges a.e. to $\chi_C$, where $C$ is Cantor set (see Example 1.49), and also to $\chi_A$ where $A$ is any subset of $C$ which is not a Borel set. This is a consequence of the fact that Lebesgue measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ is not complete. On the other hand, if the domain $(X, \mathcal{E}, \mu)$ of $(\varphi_n)_n$ is a complete measure space, then the a.e. limit of $(\varphi_n)_n$ is always a Borel function.
The following result establishes a surprising consequence of a.e. convergence on sets of finite measure.
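Theorem 2.25 (Severini-Egorov) Let $(\varphi_n)_n$ be a sequence of Borel functions on a measure space $(X, \mathcal{E}, \mu)$. If $\mu(X) < \infty$ and $\varphi_n$ converges a.e. to a Borel function $\varphi$, then $\varphi_n \xrightarrow{a.u.} \varphi$.
Proof. For any $k, n \in \mathbb{N}$ define
\[
A_n^k = \bigcup_{i=n}^{\infty} \Big\{ x \in X \;\Big|\; |\varphi(x) - \varphi_i(x)| > \frac1k \Big\} .
\]
Observe that $(A_n^k)_n \subset \mathcal{E}$ because $\varphi_n$ and $\varphi$ are Borel functions. Also,
\[
A_n^k \downarrow \limsup_{n \to \infty} \Big\{ x \in X \;\Big|\; |\varphi(x) - \varphi_n(x)| > \frac1k \Big\} =: A^k \qquad (n \to \infty) .
\]
So, $A^k \in \mathcal{E}$. Moreover, for every $x \in A^k$, $|\varphi(x) - \varphi_n(x)| > \frac1k$ for infinitely many indices $n$. Thus, $\mu(A^k) = 0$ by our hypothesis. Recalling that $\mu$ is finite, we conclude that, for every $k \in \mathbb{N}$, $\mu(A_n^k) \downarrow 0$ as $n \to \infty$. Therefore, for any fixed $\varepsilon > 0$, there exists an increasing sequence of integers $(n_k)_k$ such that $\mu(A_{n_k}^k) < \frac{\varepsilon}{2^k}$ for all $k \in \mathbb{N}$. Let us set
\[
F_\varepsilon := \bigcup_{k=1}^{\infty} A_{n_k}^k .
\]
Then, $\mu(F_\varepsilon) \le \sum_k \mu(A_{n_k}^k) < \varepsilon$. Moreover, for any $k \in \mathbb{N}$, we have that
\[
i \ge n_k \implies |\varphi(x) - \varphi_i(x)| \le \frac1k \qquad \forall x \in X \setminus F_\varepsilon .
\]
This shows that $\varphi_n \to \varphi$ uniformly on $X \setminus F_\varepsilon$.
Example 2.26 The above result is false when $\mu(X) = \infty$. For instance, let $\varphi_n = \chi_{[n,\infty)}$ defined on $(\mathbb{R}, \mathcal{B}(\mathbb{R}), \lambda)$. Then, $\varphi_n \to 0$ pointwise, but $\lambda(\{ x \in \mathbb{R} \mid |\varphi_n| = 1 \}) = +\infty$.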
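The sequence $\varphi_n(x) = x^n$ on $[0,1]$ with Lebesgue measure gives a concrete picture of almost uniform convergence. The sketch below takes $F_\varepsilon = (1 - \varepsilon, 1]$ as exceptional set (an explicit choice that happens to work here, not the set produced by the proof of Theorem 2.25) and uses the fact that the supremum of $x^n$ over $[0, 1 - \varepsilon]$ is $(1 - \varepsilon)^n$.

```python
def sup_on_good_set(n, eps):
    """Supremum of x**n over [0, 1 - eps], the complement of F_eps = (1 - eps, 1]."""
    return (1.0 - eps) ** n


def first_index_below(tol, eps):
    """Smallest n such that the supremum over [0, 1 - eps] is below tol."""
    n = 1
    while sup_on_good_set(n, eps) >= tol:
        n += 1
    return n


for eps in (0.5, 0.1, 0.01):
    print(eps, first_index_below(0.01, eps))
```

The index needed grows as $\varepsilon$ shrinks, and no single index works on $[0, 1)$, where the supremum of $x^n$ equals 1 for every $n$: the convergence is almost uniform but not uniform outside a set of measure 0.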
2.1.3 Approximation by continuous functions
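The object of this section is to prove that a Borel function can be approximated in a measure theoretical sense by a continuous function, as shown by the following result known as Lusin's theorem.
Theorem 2.27 (Lusin) Let $\mu$ be a Radon measure on $\mathbb{R}^N$ and $\varphi : \mathbb{R}^N \to \mathbb{R}$ be a Borel function. Assume $A \subset \mathbb{R}^N$ is a Borel set such that
\[
\mu(A) < \infty \quad \& \quad \varphi(x) = 0 \;\; \forall x \notin A .
\]
Then, for every $\varepsilon > 0$, there exists a continuous function $f_\varepsilon : \mathbb{R}^N \to \mathbb{R}$ with compact support (1) such that
\[
\mu\big( \{ x \in \mathbb{R}^N \mid \varphi(x) \ne f_\varepsilon(x) \} \big) < \varepsilon \tag{2.5}
\]
\[
\sup_{x \in \mathbb{R}^N} |f_\varepsilon(x)| \le \sup_{x \in \mathbb{R}^N} |\varphi(x)| \tag{2.6}
\]
Proof. We split the reasoning into several steps.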
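1. Assume $A$ is compact and $0 \le \varphi < 1$, and let $V$ be a bounded open set such that $A \subset V$. Consider the sequence $(T_n)$ of measurable sets defined by
\[
T_1 = \Big\{ x \in A \;\Big|\; \frac12 \le \varphi(x) < 1 \Big\}
\]
\[
T_n = \Big\{ x \in A \;\Big|\; \frac{1}{2^n} \le \varphi(x) - \sum_{i=1}^{n-1} \frac{1}{2^i} \chi_{T_i}(x) < \frac{1}{2^{n-1}} \Big\} \qquad n \ge 2
\]
Arguing by induction, it is soon realized that, for any $x \in A$ and any $i \in \mathbb{N}$, $\chi_{T_i}(x) = a_i$, where $a_i$ is the $i$-th digit in the binary expansion of $\varphi(x)$, i.e., $\varphi(x) = 0.a_1 a_2 \dots a_i \dots$. Therefore,
\[
0 \le \varphi(x) - \sum_{i=1}^{n} \frac{1}{2^i} \chi_{T_i}(x) < \frac{1}{2^n} \qquad \forall x \in \mathbb{R}^N, \; \forall n \in \mathbb{N} .
\]
Hence,
\[
\varphi(x) = \sum_{n=1}^{\infty} \frac{1}{2^n} \chi_{T_n}(x) \qquad \forall x \in \mathbb{R}^N , \tag{2.7}
\]
where the series converges uniformly in $\mathbb{R}^N$.
(1) For any continuous function $f : \mathbb{R}^N \to \mathbb{R}$, the support of $f$ is the closure of the set $\{ x \in \mathbb{R}^N \mid f(x) \ne 0 \}$. Such a set will be denoted by $\operatorname{supp}(f)$.
2. Fix $\varepsilon > 0$. Owing to Theorem 1.55, for every $n$ there exist a compact set $K_n$ and an open set $V_n$ such that
\[
K_n \subset T_n \subset V_n \quad \& \quad \mu(V_n \setminus K_n) < \frac{\varepsilon}{2^n}
\]
Possibly replacing $V_n$ by $V_n \cap V$, we may assume $V_n \subset V$. Define
\[
f_n(x) = \frac{d_{V_n^c}(x)}{d_{K_n}(x) + d_{V_n^c}(x)} \qquad \forall x \in \mathbb{R}^N
\]
It is immediate to check that $f_n$ is continuous for all $n \in \mathbb{N}$ and
\[
0 \le f_n(x) \le 1 \;\; \forall x \in \mathbb{R}^N \quad \& \quad f_n \equiv \begin{cases} 1 & \text{on } K_n \\ 0 & \text{on } V_n^c \end{cases}
\]
So, in some sense, $f_n$ approximates $\chi_{T_n}$.
3. Now, let us set
\[
f_\varepsilon(x) = \sum_{n=1}^{\infty} \frac{1}{2^n} f_n(x) \qquad \forall x \in \mathbb{R}^N \tag{2.8}
\]
Since the series $\sum_{n=1}^{\infty} \frac{1}{2^n} f_n$ converges normally (that is, $\sum_n \sup |2^{-n} f_n| < \infty$), $f_\varepsilon$ is continuous. Moreover,
\[
\{ x \in \mathbb{R}^N \mid f_\varepsilon(x) \ne 0 \} = \bigcup_{n=1}^{\infty} \{ x \in \mathbb{R}^N \mid f_n(x) \ne 0 \} \subset \bigcup_{n=1}^{\infty} V_n \subset V ,
\]
and so $\operatorname{supp}(f_\varepsilon) \subset \overline{V}$. Consequently, $\operatorname{supp}(f_\varepsilon)$ is compact. Furthermore, by (2.7) and (2.8),
\[
\{ x \in \mathbb{R}^N \mid f_\varepsilon(x) \ne \varphi(x) \} \subset \bigcup_{n=1}^{\infty} \{ x \in \mathbb{R}^N \mid f_n(x) \ne \chi_{T_n}(x) \} \subset \bigcup_{n=1}^{\infty} (V_n \setminus K_n)
\]
which implies, in turn,
\[
\mu\big( \{ x \in \mathbb{R}^N \mid f_\varepsilon(x) \ne \varphi(x) \} \big) \le \sum_{n=1}^{\infty} \mu(V_n \setminus K_n) < \sum_{n=1}^{\infty} \frac{\varepsilon}{2^n} = \varepsilon
\]
Thus, conclusion (2.5) holds when $A$ is compact and $0 \le \varphi < 1$.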
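4. Obviously, (2.5) also holds when $A$ is compact and $0 \le \varphi < M$ for some $M > 0$ (it suffices to replace $\varphi$ by $\varphi / M$). Moreover, if $A$ is compact and $\varphi$ is bounded, then $|\varphi| < M$ for some $M > 0$. So, in order to derive (2.5) in this case it suffices to decompose $\varphi = \varphi^+ - \varphi^-$ (2) and observe that $0 \le \varphi^+, \varphi^- < M$.
(2) $\varphi^+ = \max\{\varphi, 0\}$, $\varphi^- = \max\{-\varphi, 0\}$.
5. We will now remove the compactness assumption for $A$. By Theorem 1.55, there exists a compact set $K \subset A$ such that $\mu(A \setminus K) < \varepsilon$. Let us set
\[
\overline{\varphi} = \chi_K \varphi
\]
Since $\overline{\varphi}$ vanishes outside $K$, we can approximate $\overline{\varphi}$, in the above sense, by a continuous function with compact support, say $f_\varepsilon$. Then,
\[
\{ x \in \mathbb{R}^N \mid f_\varepsilon(x) \ne \varphi(x) \} \subset \{ x \in \mathbb{R}^N \mid f_\varepsilon(x) \ne \overline{\varphi}(x) \} \cup (A \setminus K) ,
\]
since, for any $x \in K \cup A^c$, $f_\varepsilon(x) \ne \varphi(x)$ implies that $f_\varepsilon(x) \ne \overline{\varphi}(x)$. Hence,
\[
\mu\big( \{ x \in \mathbb{R}^N \mid f_\varepsilon(x) \ne \varphi(x) \} \big) < 2\varepsilon
\]
6. In order to remove the boundedness hypothesis for $\varphi$, define measurable sets $(B_n)$ by
\[
B_n = \{ x \in A \mid |\varphi(x)| \ge n \} \qquad n \in \mathbb{N}
\]
Clearly,
\[
B_{n+1} \subset B_n \quad \& \quad \bigcap_{n \in \mathbb{N}} B_n = \emptyset
\]
Since $\mu(A) < \infty$, Proposition 1.16 yields $\mu(B_n) \downarrow 0$. Therefore, for some $n \in \mathbb{N}$, we have $\mu(B_n) < \varepsilon$. Proceeding as above, we define
\[
\overline{\varphi} = (1 - \chi_{B_n}) \varphi
\]
Since $\overline{\varphi}$ is bounded (by $n$), we can approximate $\overline{\varphi}$, in the above sense, by a continuous function with compact support, that we again label $f_\varepsilon$. Then,
\[
\{ x \in \mathbb{R}^N \mid f_\varepsilon(x) \ne \varphi(x) \} \subset \{ x \in \mathbb{R}^N \mid f_\varepsilon(x) \ne \overline{\varphi}(x) \} \cup B_n
\]
So,
\[
\mu\big( \{ x \in \mathbb{R}^N \mid f_\varepsilon(x) \ne \varphi(x) \} \big) < 2\varepsilon
\]
The proof of (2.5) is thus complete.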
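7. Finally, in order to prove (2.6), suppose $R := \sup_{\mathbb{R}^N} |\varphi| < \infty$. Define
\[
\theta_R : \mathbb{R} \to \mathbb{R}, \qquad \theta_R(t) = \begin{cases} t & \text{if } |t| < R, \\ R \dfrac{t}{|t|} & \text{if } |t| \ge R \end{cases}
\]
and $\overline{f}_\varepsilon = \theta_R \circ f_\varepsilon$ to obtain $|\overline{f}_\varepsilon| \le R$. Since $\theta_R$ is continuous, so is $\overline{f}_\varepsilon$. Furthermore, $\operatorname{supp}(\overline{f}_\varepsilon) = \operatorname{supp}(f_\varepsilon)$ and
\[
\{ x \in \mathbb{R}^N \mid f_\varepsilon(x) = \varphi(x) \} \subset \{ x \in \mathbb{R}^N \mid \overline{f}_\varepsilon(x) = \varphi(x) \} .
\]
This completes the proof.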
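The continuous functions $f_n$ built in step 2 of the proof are easy to write down explicitly in dimension one. The following sketch does so for an interval $K = [k_0, k_1]$ contained in an open interval $V = (v_0, v_1)$; the restriction to intervals and all function names are simplifying assumptions made for illustration.

```python
def dist_to_closed_interval(x, lo, hi):
    """Distance from x to the closed interval [lo, hi]."""
    return max(lo - x, 0.0, x - hi)


def dist_to_complement_of_open_interval(x, lo, hi):
    """Distance from x to the complement of the open interval (lo, hi)."""
    return max(0.0, min(x - lo, hi - x))


def cutoff(x, k, v):
    """f(x) = d_{V^c}(x) / (d_K(x) + d_{V^c}(x)) for K = [k[0], k[1]] inside V = (v[0], v[1])."""
    d_k = dist_to_closed_interval(x, *k)
    d_vc = dist_to_complement_of_open_interval(x, *v)
    # The denominator never vanishes because K is contained in V.
    return d_vc / (d_k + d_vc)


K = (0.0, 1.0)
V = (-0.5, 1.5)
for x in (-1.0, -0.5, -0.25, 0.0, 0.5, 1.0, 1.25, 2.0):
    print(x, cutoff(x, K, V))
```

The function equals 1 on $K$, vanishes outside $V$, and interpolates continuously in between, which is why it differs from $\chi_{T_n}$ only on $V_n \setminus K_n$.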
2.2 Integral of Borel functions
Let $(X, \mathcal{E}, \mu)$ be a given measure space. In this section we will construct the integral of a Borel function $\varphi : X \to \overline{\mathbb{R}}$ with respect to $\mu$. We will first consider the special case of positive functions, and then the case of functions with variable sign. We begin with what can rightfully be considered the central notion of Lebesgue integration.
2.2.1 Repartition function
Let $\varphi : X \to [0, \infty]$ be a Borel function. The repartition function $F$ of $\varphi$ is defined by
\[
F(t) := \mu(\{\varphi > t\}) = \mu(\varphi > t), \qquad t \ge 0 .
\]
By definition, $F : [0, \infty) \to [0, \infty]$ is a decreasing (3) function; then $F$ possesses limit at $\infty$. Moreover, since
\[
\{\varphi = \infty\} = \bigcap_{n=1}^{\infty} \{\varphi > n\} ,
\]
we have
\[
\lim_{t \to \infty} F(t) = \lim_{n \to \infty} F(n) = \lim_{n \to \infty} \mu(\varphi > n) = \mu(\varphi = \infty)
\]
whenever $\mu$ is finite. Other important properties of $F$ are provided by the following result.
(3) A function $f : \mathbb{R} \to \mathbb{R}$ is decreasing if $t_1 < t_2 \implies f(t_1) \ge f(t_2)$, positive if $f(t) \ge 0$ for all $t \in \mathbb{R}$.
Proposition 2.28 Let $\varphi : X \to [0, \infty]$ be a Borel function and let $F$ be its repartition function. Then, the following properties hold:
(i) For any $t_0 \ge 0$,
\[
\lim_{t \downarrow t_0} F(t) = F(t_0),
\]
(that is, $F$ is right continuous).
(ii) If $\mu(X) < \infty$, then, for any $t_0 > 0$,
\[
\lim_{t \uparrow t_0} F(t) = \mu(\varphi \ge t_0)
\]
(that is, $F$ possesses left limits (4)).
Proof. First observe that, since $F$ is a monotonic function, $F$ possesses a left limit at any $t > 0$ and a right limit at any $t \ge 0$. Let us prove (i). We have
\[
\lim_{t \downarrow t_0} F(t) = \lim_{n \to \infty} F\Big( t_0 + \frac1n \Big) = \lim_{n \to \infty} \mu\Big( \varphi > t_0 + \frac1n \Big) = \mu(\varphi > t_0) = F(t_0),
\]
since
\[
\Big\{ \varphi > t_0 + \frac1n \Big\} \uparrow \{ \varphi > t_0 \} .
\]
Now, to prove (ii), we note that
\[
\Big\{ \varphi > t_0 - \frac1n \Big\} \downarrow \{ \varphi \ge t_0 \} .
\]
Thus, recalling that $\mu$ is finite, we have
\[
\lim_{t \uparrow t_0} F(t) = \lim_{n \to \infty} F\Big( t_0 - \frac1n \Big) = \lim_{n \to \infty} \mu\Big( \varphi > t_0 - \frac1n \Big) = \mu(\varphi \ge t_0) ,
\]
and (ii) follows.
From Proposition 2.28 it follows that, when $\mu$ is finite, $F$ is continuous at $t_0$ iff $\mu(\varphi = t_0) = 0$.
(4) In the literature, a function that is right-continuous and has left limits is called a càdlàg function.
2.2.2 Integral of positive simple functions
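We now proceed to define the integral in the class $\mathcal{S}_+(X)$ of positive simple functions. Let $\varphi \in \mathcal{S}_+(X)$. Then, according to Remark 2.20.1,
\[
\varphi(x) = \sum_{k=1}^{n} a_k \chi_{A_k}(x) \qquad x \in X,
\]
where $a_1, \dots, a_n \ge 0$, and $A_1, \dots, A_n$ are mutually disjoint sets of $\mathcal{E}$ such that $A_1 \cup \dots \cup A_n = X$. Using the convention $0 \cdot \infty = 0$, we define the integral of $\varphi$ over $X$ with respect to $\mu$ by
\[
\int_X \varphi(x) \mu(dx) = \int_X \varphi \, d\mu = \sum_{k=1}^{n} a_k \mu(A_k). \tag{2.9}
\]
It is easy to see that the above definition is independent of the representation of $\varphi$. Indeed, given disjoint sets $B_1, \dots, B_m \in \mathcal{E}$ with $B_1 \cup \dots \cup B_m = X$ and real numbers $b_1, \dots, b_m \ge 0$ such that
\[
\varphi(x) = \sum_{i=1}^{m} b_i \chi_{B_i}(x) \qquad x \in X,
\]
we have that
\[
A_k = \bigcup_{i=1}^{m} (A_k \cap B_i), \qquad B_i = \bigcup_{k=1}^{n} (A_k \cap B_i)
\]
and
\[
A_k \cap B_i \ne \emptyset \implies a_k = b_i .
\]
Therefore,
\[
\sum_{k=1}^{n} a_k \mu(A_k) = \sum_{k=1}^{n} \sum_{i=1}^{m} a_k \mu(A_k \cap B_i) = \sum_{i=1}^{m} \sum_{k=1}^{n} b_i \mu(A_k \cap B_i) = \sum_{i=1}^{m} b_i \mu(B_i) .
\]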
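Proposition 2.29 Let $\varphi, \psi \in \mathcal{S}_+(X)$ and let $\alpha, \beta \ge 0$. Then,
\[
\int_X (\alpha\varphi + \beta\psi) \, d\mu = \alpha \int_X \varphi \, d\mu + \beta \int_X \psi \, d\mu
\]
Proof. Owing to Remark 2.20.2, $\varphi$ and $\psi$ can be represented using the same family of mutually disjoint sets $A_1, \dots, A_n$ of $\mathcal{E}$ as
\[
\varphi = \sum_{k=1}^{n} a_k \chi_{A_k}, \qquad \psi = \sum_{k=1}^{n} b_k \chi_{A_k} .
\]
Then,
\[
\int_X (\alpha\varphi + \beta\psi) \, d\mu = \sum_{k=1}^{n} (\alpha a_k + \beta b_k) \mu(A_k) = \alpha \sum_{k=1}^{n} a_k \mu(A_k) + \beta \sum_{k=1}^{n} b_k \mu(A_k) = \alpha \int_X \varphi \, d\mu + \beta \int_X \psi \, d\mu
\]
as required.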
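Example 2.30 Let us choose a representation of a positive simple function $\varphi$ of the form
\[
\varphi(x) = \sum_{k=1}^{n} a_k \chi_{A_k}(x) \qquad x \in X,
\]
with $0 < a_1 < a_2 < \dots < a_n$. Then, the repartition function $F$ of $\varphi$ is given by
\[
F(t) = \begin{cases}
\mu(A_1) + \mu(A_2) + \dots + \mu(A_n) = F(0) & \text{if } 0 \le t < a_1, \\
\quad \dots \\
\mu(A_k) + \mu(A_{k+1}) + \dots + \mu(A_n) = F(a_{k-1}) & \text{if } a_{k-1} \le t < a_k, \\
\quad \dots \\
\mu(A_n) = F(a_{n-1}) & \text{if } a_{n-1} \le t < a_n, \\
0 = F(a_n) & \text{if } t \ge a_n.
\end{cases}
\]
Thus, setting $a_0 = 0$, we have $\mu(A_k) = F(a_{k-1}) - F(a_k)$ and $F(t) = \sum_{k=1}^{n} F(a_{k-1}) \chi_{[a_{k-1}, a_k)}(t)$. Then $F$ is a simple function itself on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ and
\[
\int_X \varphi \, d\mu = \sum_{k=1}^{n} a_k \mu(A_k) = \sum_{k=1}^{n} a_k \big( F(a_{k-1}) - F(a_k) \big) = \sum_{k=1}^{n} F(a_{k-1}) (a_k - a_{k-1}) = \int_0^{\infty} F(t) \, dt,
\]
where $\int_0^{\infty} F(t) \, dt$ denotes the integral of the simple function $F$ with respect to the Lebesgue measure.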
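The identity of Example 2.30 is a finite computation and can be verified directly. In the sketch below a positive simple function is encoded by its increasing values $a_k$ and the masses $\mu(A_k)$ (the specific numbers are arbitrary); exact rational arithmetic is used so that the two sides can be compared with equality.

```python
from fractions import Fraction


def integral_simple(values, masses):
    """Sum of a_k * mu(A_k), as in formula (2.9)."""
    return sum(a * m for a, m in zip(values, masses))


def repartition(values, masses, t):
    """F(t) = mu(phi > t) for phi = sum_k a_k chi_{A_k}."""
    return sum(m for a, m in zip(values, masses) if a > t)


def integral_of_repartition(values, masses):
    """Sum of F(a_{k-1}) * (a_k - a_{k-1}) with a_0 = 0; `values` must be increasing."""
    knots = [0] + list(values)
    return sum(repartition(values, masses, lo) * (hi - lo)
               for lo, hi in zip(knots, knots[1:]))


values = [1, 2, 5]                                        # 0 < a_1 < a_2 < a_3
masses = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # mu(A_1), mu(A_2), mu(A_3)

lhs = integral_simple(values, masses)
rhs = integral_of_repartition(values, masses)
print(lhs, rhs)
```

Both sides are sums over the same data, regrouped by summation by parts, which is the content of the chain of equalities in the example.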
2.2.3 The archimedean integral
The identity we have obtained in Example 2.30 for simple functions, that is,
$$\int_X \varphi\,d\mu = \int_0^\infty \mu(\varphi > t)\,dt \tag{2.10}$$
makes perfect sense because the repartition function $t \mapsto \mu(\varphi > t)$ of a simple function is a simple function itself (and even a step function). In order to be able to take such an identity as the definition of the integral of $\varphi$ when $\varphi$ is a positive Borel function, we first have to give its right-hand side a meaning. For this, we need to define, first, the integral of any positive decreasing function $f : [0, \infty) \to [0, \infty]$.
Let $\Sigma$ be the family of all finite sets of points $\sigma = \{t_0, \dots, t_N\}$ of $[0, \infty)$, where $N \in \mathbb{N}$ and $0 = t_0 < t_1 < \dots < t_N < \infty$. For any decreasing function $f : [0, \infty) \to [0, \infty]$ and any $\sigma = \{t_0, t_1, \dots, t_N\} \in \Sigma$, we set
$$I_f(\sigma) = \sum_{k=0}^{N-1} f(t_{k+1})(t_{k+1} - t_k).$$
The archimedean integral of $f$ is defined by
$$\int_0^\infty f(t)\,dt := \sup\{I_f(\sigma) : \sigma \in \Sigma\}.$$
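Exercise 2.31 Let $f : [0, \infty) \to [0, \infty]$ be a decreasing function.
1. Show that, if $\sigma, \zeta \in \Sigma$ and $\sigma \subset \zeta$, then $I_f(\sigma) \le I_f(\zeta)$.
2. Show that, for any pair of decreasing functions $f, g : [0, \infty) \to [0, \infty]$ such that $f(x) \le g(x)$, we have
$$\int_0^\infty f(t)\,dt \le \int_0^\infty g(t)\,dt .$$
3. Show that, if $f(t) = 0$ for all $t > 0$, then $\int_0^\infty f(t)\,dt = 0$.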
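The sums $I_f(\sigma)$ are easy to experiment with. Below is a minimal sketch, assuming the sample function $f(t) = e^{-t}$ (our choice for illustration, with $\int_0^\infty e^{-t}\,dt = 1$) and three hypothetical partitions. It illustrates the monotonicity under refinement of Exercise 2.31.1 and the fact that the sums approach the integral from below.

```python
import math


def lower_sum(f, sigma):
    """I_f(sigma) = sum of f(t_{k+1}) * (t_{k+1} - t_k) over the points of sigma."""
    pts = sorted(sigma)
    return sum(f(pts[k + 1]) * (pts[k + 1] - pts[k]) for k in range(len(pts) - 1))


def f(t):
    """A decreasing function on [0, inf) whose integral equals 1."""
    return math.exp(-t)


coarse = [0.0, 1.0, 2.0, 4.0]
fine = sorted(set(coarse) | {0.5, 1.5, 3.0})   # a refinement of `coarse`
uniform = [k * 0.001 for k in range(20001)]    # mesh 0.001 on [0, 20]

i_coarse = lower_sum(f, coarse)
i_fine = lower_sum(f, fine)
i_uniform = lower_sum(f, uniform)
```

Since $f$ is decreasing, each term uses the smallest value of $f$ on its subinterval, so every $I_f(\sigma)$ stays below the integral; refining the partition can only increase the sum.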
Proposition 2.32 Let $f_n : [0, \infty) \to [0, \infty]$ be a sequence of decreasing functions such that
$$f_n(t) \uparrow f(t) \quad (n \to \infty) \qquad \forall t \ge 0 .$$
Then,
$$\int_0^\infty f_n(t)\,dt \uparrow \int_0^\infty f(t)\,dt .$$
Proof. According to Exercise 2.31.2, since $f_n \le f_{n+1} \le f$, we obtain
$$\int_0^\infty f_n(t)\,dt \le \int_0^\infty f_{n+1}(t)\,dt \le \int_0^\infty f(t)\,dt$$
for every $n$. Then the inequality $\lim_n \int_0^\infty f_n(t)\,dt \le \int_0^\infty f(t)\,dt$ is clear. To prove the opposite inequality, let $L < \int_0^\infty f(t)\,dt$. Then there exists $\sigma = \{t_0, \dots, t_N\} \in \Sigma$ such that
$$\sum_{k=0}^{N-1} f(t_{k+1})(t_{k+1} - t_k) > L .$$
Therefore, for $n$ sufficiently large, say $n \ge n_L$,
$$\int_0^\infty f_n(t)\,dt \ge \sum_{k=0}^{N-1} f_n(t_{k+1})(t_{k+1} - t_k) > L .$$
Thus, $\lim_n \int_0^\infty f_n(t)\,dt > L$. Since $L$ is any number less than $\int_0^\infty f(t)\,dt$, we conclude that
$$\lim_n \int_0^\infty f_n(t)\,dt \ge \int_0^\infty f(t)\,dt .$$
The definition of archimedean integral can be easily adapted to the case of a bounded interval $[0, a]$. Given a decreasing function $f : [0, a] \to [0, \infty]$ it suffices to set
$$\int_0^a f(t)\,dt = \int_0^\infty f^*(t)\,dt$$
where
$$f^*(t) = \begin{cases} f(t) & \text{if } t \in [0, a], \\ 0 & \text{if } t > a. \end{cases} \tag{2.11}$$
Exercise 2.33
1. Given a decreasing function $f : [0, a] \to [0, \infty]$, show that
$$\int_0^a f(t)\,dt \ge a f(a).$$
2. Given a decreasing function $f : [0, \infty) \to [0, \infty]$, show that
$$\int_0^\infty f(t)\,dt \ge \int_0^a f(t)\,dt \qquad \forall a > 0.$$
2.2.4 Integral of positive Borel functions
Given a measure space $(X, \mathcal{E}, \mu)$ and an extended positive Borel function $\varphi$, we can now define the integral of $\varphi$ over $X$ with respect to $\mu$ according to (2.10), that is,
$$\int_X \varphi\,d\mu = \int_X \varphi(x)\,\mu(dx) := \int_0^\infty \mu(\varphi > t)\,dt , \tag{2.12}$$
where the integral on the right-hand side is the archimedean integral of the decreasing positive function $t \mapsto \mu(\varphi > t)$. If the integral of $\varphi$ is finite we say that $\varphi$ is summable.
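Proposition 2.34 (Markov) Let $\varphi : X \to [0, \infty]$ be a Borel function. Then, for any $a \in (0, \infty)$,
$$\mu(\varphi > a) \le \frac{1}{a} \int_X \varphi\,d\mu . \tag{2.13}$$
Proof. Recalling Exercise 2.33, we have that, for any $a \in (0, \infty)$,
$$\int_X \varphi\,d\mu = \int_0^{+\infty} \mu(\varphi > t)\,dt \ge \int_0^a \mu(\varphi > t)\,dt \ge a\,\mu(\varphi > a) .$$
The conclusion follows.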
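Markov's inequality can be tested on a finite measure space, where integrals are finite sums. The following is a minimal sketch, assuming a hypothetical space of five points with weights `weights[i]` playing the role of $\mu(\{i\})$ and a hypothetical positive function `phi`; none of these data come from the notes.

```python
# Hypothetical finite measure space: five points with weights mu({i}).
weights = [0.2, 1.0, 0.5, 2.0, 0.3]
phi = [0.0, 1.5, 4.0, 0.7, 9.0]   # a positive function on the five points


def integral(phi, weights):
    """Integral of phi with respect to the weighted counting measure."""
    return sum(p * w for p, w in zip(phi, weights))


def measure_of_superlevel(phi, weights, a):
    """mu(phi > a)."""
    return sum(w for p, w in zip(phi, weights) if p > a)


def markov_holds(phi, weights, a):
    """Check mu(phi > a) <= (1/a) * integral of phi, for a > 0."""
    return measure_of_superlevel(phi, weights, a) <= integral(phi, weights) / a


checks = {a: markov_holds(phi, weights, a) for a in (0.1, 0.5, 1.0, 2.0, 5.0, 10.0)}
```

The bound is far from sharp for most levels $a$; it becomes an equality only when $\varphi$ takes the single value $a$ on the set where it is positive, with the inequality read as $\mu(\varphi \ge a)$.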
Markov's inequality has important consequences. Generalizing the notion of a.e. convergence (see Definition 2.22), we say that a property concerning the points of $X$ holds almost everywhere (a.e.) if it holds for all points of $X$ except for a set $E \in \mathcal{E}$ with $\mu(E) = 0$.
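Proposition 2.35 Let $\varphi : X \to [0, \infty]$ be a Borel function.
(i) If $\varphi$ is summable, then the set $\{\varphi = \infty\}$ has measure 0, that is, $\varphi$ is a.e. finite.
(ii) The integral of $\varphi$ vanishes if and only if $\varphi$ is equal to 0 a.e.
Proof.
(i) From (2.13) it follows that $\mu(\varphi > a) < \infty$ for all $a > 0$ and
$$\lim_{a \to \infty} \mu(\varphi > a) = 0.$$
Since
$$\{\varphi > n\} \downarrow \{\varphi = \infty\},$$
we have that
$$\mu(\varphi = \infty) = \lim_{n \to \infty} \mu(\varphi > n) = 0.$$
(ii) If $\varphi = 0$ a.e., we have $\mu(\varphi > t) = 0$ for all $t > 0$. Then $\int_X \varphi\,d\mu = \int_0^{+\infty} \mu(\varphi > t)\,dt = 0$ (see Exercise 2.31.3). Conversely, let $\int_X \varphi\,d\mu = 0$. Then, Markov's inequality yields $\mu(\varphi > a) = 0$ for all $a > 0$. Since $\{\varphi > \frac{1}{n}\} \uparrow \{\varphi > 0\}$, we obtain
$$\mu(\varphi > 0) = \lim_{n \to \infty} \mu\Big(\varphi > \frac{1}{n}\Big) = 0 .$$
The proof is complete.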
The following is a first result studying the passage to the limit under the integral sign. It is referred to as the Monotone Convergence Theorem.
Proposition 2.36 (Beppo Levi) Let $\varphi_n : X \to [0, \infty]$ be an increasing sequence of Borel functions, and set
$$\varphi(x) = \lim_{n \to \infty} \varphi_n(x) \qquad \forall x \in X .$$
Then,
$$\int_X \varphi_n\,d\mu \uparrow \int_X \varphi\,d\mu .$$
Proof. Observe that, in consequence of the assumptions,
$$\{\varphi_n > t\} \uparrow \{\varphi > t\} \qquad \forall t > 0 .$$
Therefore, $\mu(\varphi_n > t) \uparrow \mu(\varphi > t)$ for any $t > 0$. The conclusion follows from Proposition 2.32.
Combining Propositions 2.21 and 2.36 we deduce the following result.
Proposition 2.37 Let $\varphi : X \to [0, \infty]$ be a Borel function. Then, there exist positive simple functions $\varphi_n : X \to [0, \infty)$ such that $\varphi_n \uparrow \varphi$ pointwise and
$$\int_X \varphi_n\,d\mu \uparrow \int_X \varphi\,d\mu .$$
Let us state some basic properties of the integral.
Proposition 2.38 Let $\varphi, \psi : X \to [0, \infty]$ be summable. Then the following properties hold.
(i) If $a, b \ge 0$, then
$$\int_X (a\varphi + b\psi)\,d\mu = a \int_X \varphi\,d\mu + b \int_X \psi\,d\mu .$$
(ii) If $\varphi \le \psi$, then
$$\int_X \varphi\,d\mu \le \int_X \psi\,d\mu .$$
Proof. The conclusion of point (i) holds for $\varphi, \psi \in \mathcal{S}_+(X)$, thanks to Proposition 2.29. To obtain it for Borel functions it suffices to apply Proposition 2.37.
To justify (ii), observe that the trivial inclusion $\{\varphi > t\} \subset \{\psi > t\}$ yields $\mu(\varphi > t) \le \mu(\psi > t)$. The conclusion follows (see also Exercise 2.31.2).
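Proposition 2.39 Let $\varphi_n : X \to [0, \infty]$ be a sequence of Borel functions and set
$$\varphi(x) = \sum_{n=1}^\infty \varphi_n(x) \qquad \forall x \in X .$$
Then
$$\sum_{n=1}^\infty \int_X \varphi_n\,d\mu = \int_X \varphi\,d\mu .$$
Proof. For every $n$ set
$$f_n = \sum_{k=1}^n \varphi_k .$$
Then $f_n \uparrow \varphi$. By applying Proposition 2.36 we get
$$\int_X f_n\,d\mu \to \int_X \varphi\,d\mu .$$
On the other hand, (i) of Proposition 2.38 implies
$$\int_X f_n\,d\mu = \sum_{k=1}^n \int_X \varphi_k\,d\mu \to \sum_{k=1}^\infty \int_X \varphi_k\,d\mu .$$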
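The following basic result, known as Fatou's Lemma, provides a semicontinuity property of the integral.
Lemma 2.40 (Fatou) Let $\varphi_n : X \to [0, \infty]$ be a sequence of Borel functions and set $\varphi = \liminf_{n \to \infty} \varphi_n$. Then,
$$\int_X \varphi\,d\mu \le \liminf_{n \to \infty} \int_X \varphi_n\,d\mu . \tag{2.14}$$
Proof. Setting $\psi_n(x) = \inf_{m \ge n} \varphi_m(x)$, we have that $\psi_n(x) \uparrow \varphi(x)$ for every $x \in X$. Consequently, by the Monotone Convergence Theorem,
$$\int_X \varphi\,d\mu = \lim_{n \to \infty} \int_X \psi_n\,d\mu = \sup_{n \in \mathbb{N}} \int_X \psi_n\,d\mu .$$
On the other hand, since $\psi_n \le \varphi_m$ for every $m \ge n$, we have
$$\int_X \psi_n\,d\mu \le \inf_{m \ge n} \int_X \varphi_m\,d\mu .$$
So,
$$\int_X \varphi\,d\mu \le \sup_{n \in \mathbb{N}} \inf_{m \ge n} \int_X \varphi_m\,d\mu = \liminf_{n \to \infty} \int_X \varphi_n\,d\mu .$$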
Corollary 2.41 Let $\varphi_n : X \to [0, \infty]$ be a sequence of Borel functions converging to $\varphi$ pointwise. If, for some $M > 0$,
$$\int_X \varphi_n\,d\mu \le M \qquad \forall n \in \mathbb{N},$$
then $\int_X \varphi\,d\mu \le M$.
Remark 2.42 Proposition 2.36 and Corollary 2.41 can be given a version that applies to a.e. convergence. In this case, the fact that the limit $\varphi$ is a Borel function is no longer guaranteed (see Example 2.24). Therefore, such a property must be assumed a priori, or else the measure $\mu$ must be complete.
Exercise 2.43 State and prove the analogues of Proposition 2.36 and of Corollary 2.41 for a.e. convergence.
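Example 2.44 Consider the counting measure $\mu$ on $(\mathbb{N}, \mathcal{P}(\mathbb{N}))$. Then any function $x : i \mapsto x(i)$ is Borel and $x = \sum_{i=1}^\infty x(i)\chi_{\{i\}}$. Then, by Proposition 2.39, if $x$ is positive we have
$$\int_{\mathbb{N}} x\,d\mu = \sum_{i=1}^\infty x(i)\mu(\{i\}) = \sum_{i=1}^\infty x(i).$$
Example 2.45 Consider the measure space $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$ of the previous example. Let $(x_n)_n$ be a sequence of positive functions such that, for every $i \in \mathbb{N}$, $x_n(i) \uparrow x(i)$ as $n \to \infty$. Then, Beppo Levi's Theorem ensures that
$$\lim_{n \to \infty} \sum_{i=1}^\infty x_n(i) = \sum_{i=1}^\infty x(i) .$$
Compare with Lemma 1.45.
Exercise 2.46 Let $a_{ni} \ge 0$ for $n, i \in \mathbb{N}$. Show that
$$\sum_{n=1}^\infty \sum_{i=1}^\infty a_{ni} = \sum_{i=1}^\infty \sum_{n=1}^\infty a_{ni} .$$
Hint: Set $x_n : i \mapsto a_{ni}$. Then $x_n$ is a sequence of positive Borel functions on $(\mathbb{N}, \mathcal{P}(\mathbb{N}))$. Use Proposition 2.39 to conclude.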
Exercise 2.47 Let $\varphi, \psi : X \to [0, \infty]$ be Borel functions.
1. Show that, if $\varphi \le \psi$ a.e., then $\int_X \varphi\,d\mu \le \int_X \psi\,d\mu$.
2. Show that, if $\varphi = \psi$ a.e., then $\int_X \varphi\,d\mu = \int_X \psi\,d\mu$.
3. Show that the monotonicity of $\varphi_n$ is an essential hypothesis for Beppo Levi's Theorem.
Hint: consider $\varphi_n(x) = \chi_{[n, n+1)}(x)$ for $x \in \mathbb{R}$.
4. Give an example to show that the inequality in Fatou's Lemma can be strict.
Hint: set $\varphi_{2n}(x) = \chi_{[0,1)}(x)$ and $\varphi_{2n+1}(x) = \chi_{[1,2)}(x)$ for $x \in \mathbb{R}$.
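The hint of Exercise 2.47.4 can be made concrete. The sketch below, which is only an illustration under our own choice of grid and helper names (`chi`, `phi`, `integral` are hypothetical), evaluates both sides of Fatou's inequality for the alternating indicators $\chi_{[0,1)}$ and $\chi_{[1,2)}$, using midpoint Riemann sums that are exact for these step functions.

```python
def chi(a, b):
    """Indicator function of the interval [a, b)."""
    return lambda x: 1.0 if a <= x < b else 0.0


def phi(n):
    """phi_n = chi_[0,1) for even n and chi_[1,2) for odd n."""
    return chi(0.0, 1.0) if n % 2 == 0 else chi(1.0, 2.0)


# Midpoints of a uniform grid on [0, 2); both indicators vanish outside [0, 2).
h = 0.01
grid = [(k + 0.5) * h for k in range(200)]


def integral(f):
    """Midpoint Riemann sum over [0, 2)."""
    return sum(f(x) for x in grid) * h


def liminf_pointwise(x):
    """The sequence is 2-periodic in n, so the liminf is the min of two terms."""
    return min(phi(0)(x), phi(1)(x))


lhs = integral(liminf_pointwise)                 # integral of liminf phi_n
rhs = min(integral(phi(0)), integral(phi(1)))    # liminf of the integrals
```

Here the pointwise lower limit is identically 0 while every $\varphi_n$ has integral 1, so the inequality (2.14) reads $0 < 1$.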
Exercise 2.48 Let $(X, \mathcal{E}, \mu)$ be a measure space. The following statements are equivalent:
1. $\mu$ is $\sigma$-finite;
2. there exists a $\mu$-summable function $\varphi$ on $X$ such that $\varphi(x) > 0$ for all $x \in X$.
2.2.5 Integral of functions with variable sign
Let $\varphi : X \to \overline{\mathbb{R}}$ be a Borel function. We say that $\varphi$ is summable if there exist two summable Borel functions $f, g : X \to [0, \infty]$ such that
$$\varphi(x) = f(x) - g(x) \qquad \forall x \in X . \tag{2.15}$$
In this case, the number
$$\int_X \varphi\,d\mu := \int_X f\,d\mu - \int_X g\,d\mu \tag{2.16}$$
is called the integral of $\varphi$ over $X$ with respect to $\mu$. Let us check, as usual, that the integral of $\varphi$ is independent of the choice of functions $f, g$ used to represent $\varphi$ as in (2.15). Indeed, let $f_1, g_1 : X \to [0, \infty]$ be summable Borel functions such that
$$\varphi(x) = f_1(x) - g_1(x) \qquad \forall x \in X .$$
Then, $f$, $g$, $f_1$ and $g_1$ are finite a.e., and
$$f(x) + g_1(x) = f_1(x) + g(x) \qquad \text{for a.e. } x \in X .$$
Therefore, owing to Exercise 2.47.2 and Proposition 2.38, we have
$$\int_X f\,d\mu + \int_X g_1\,d\mu = \int_X f_1\,d\mu + \int_X g\,d\mu .$$
Since the above integrals are all finite, we deduce that
$$\int_X f\,d\mu - \int_X g\,d\mu = \int_X f_1\,d\mu - \int_X g_1\,d\mu$$
as claimed.
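Remark 2.49 Let $\varphi : X \to \overline{\mathbb{R}}$ be a summable function.
1. The positive and negative parts
$$\varphi^+(x) = \max\{\varphi(x), 0\}, \qquad \varphi^-(x) = \max\{-\varphi(x), 0\}$$
are positive Borel functions such that $\varphi = \varphi^+ - \varphi^-$. We claim that $\varphi^+$ and $\varphi^-$ are summable. Indeed, let $f, g : X \to [0, \infty]$ be summable Borel functions satisfying (2.15). If $x \in X$ is such that $\varphi(x) \ge 0$, then $\varphi^+(x) = \varphi(x) \le f(x)$. So, $\varphi^+(x) \le f(x)$ for all $x \in X$ and, recalling Exercise 2.47.1, we conclude that $\varphi^+$ is summable. Similarly, one can show that $\varphi^-$ is summable. Therefore,
$$\int_X \varphi\,d\mu = \int_X \varphi^+\,d\mu - \int_X \varphi^-\,d\mu .$$
2. From the above remark we deduce that $\varphi$ is summable iff both $\varphi^+$ and $\varphi^-$ are summable. Since $|\varphi| = \varphi^+ + \varphi^-$, it is also true that $\varphi$ is summable iff $|\varphi|$ is summable. Moreover,
$$\Big| \int_X \varphi\,d\mu \Big| \le \int_X |\varphi|\,d\mu . \tag{2.17}$$
Indeed,
$$\Big| \int_X \varphi\,d\mu \Big| = \Big| \int_X \varphi^+\,d\mu - \int_X \varphi^-\,d\mu \Big| \le \int_X \varphi^+\,d\mu + \int_X \varphi^-\,d\mu = \int_X |\varphi|\,d\mu .$$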
Remark 2.50 The notion of integral can be further extended allowing infinite values. A Borel function $\varphi : X \to \overline{\mathbb{R}}$ is said to be integrable if at least one of the two functions $\varphi^+$ and $\varphi^-$ is summable. In this case, we define
$$\int_X \varphi\,d\mu = \int_X \varphi^+\,d\mu - \int_X \varphi^-\,d\mu .$$
Notice that $\int_X \varphi\,d\mu \in \overline{\mathbb{R}}$, in general.
In order to state the analogue of Proposition 2.38, we point out that the sum of two functions with values in the extended space $\overline{\mathbb{R}}$ may not be well defined; thus we need to assume that at least one of the functions is real-valued.
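Proposition 2.51 Let $\varphi, \psi : X \to \overline{\mathbb{R}}$ be summable functions. Then, the following properties hold.
(i) If $\varphi : X \to \mathbb{R}$, then, for any $\alpha, \beta \in \mathbb{R}$, $\alpha\varphi + \beta\psi$ is summable and
$$\int_X (\alpha\varphi + \beta\psi)\,d\mu = \alpha \int_X \varphi\,d\mu + \beta \int_X \psi\,d\mu .$$
(ii) If $\varphi \le \psi$, then $\int_X \varphi\,d\mu \le \int_X \psi\,d\mu$.
Proof.
(i) Assume first $\alpha, \beta > 0$ and let $f, g, f_1, g_1$ be positive $\mu$-summable functions such that
$$\varphi(x) = f(x) - g(x), \qquad \psi(x) = f_1(x) - g_1(x) \qquad \forall x \in X .$$
Then, since $f$ and $g$ are finite, we have $\alpha\varphi + \beta\psi = (\alpha f + \beta f_1) - (\alpha g + \beta g_1)$ and so,
$$\int_X (\alpha\varphi + \beta\psi)\,d\mu = \int_X (\alpha f + \beta f_1)\,d\mu - \int_X (\alpha g + \beta g_1)\,d\mu .$$
The conclusion follows from Proposition 2.38(i). The case when $\alpha, \beta$ have different signs can be handled similarly.
(ii) Let $\varphi \le \psi$. It is immediate that $\varphi^+ \le \psi^+$ and $\psi^- \le \varphi^-$. Then by Proposition 2.38(ii) we obtain
$$\int_X \varphi\,d\mu = \int_X \varphi^+\,d\mu - \int_X \varphi^-\,d\mu \le \int_X \psi^+\,d\mu - \int_X \psi^-\,d\mu = \int_X \psi\,d\mu .$$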
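Let $\varphi : X \to \overline{\mathbb{R}}$ be summable and let $A \in \mathcal{E}$. Then, $\varphi\chi_A$ is summable because $|\varphi\chi_A| \le |\varphi|$. Let us define
$$\int_A \varphi\,d\mu := \int_X \varphi\chi_A\,d\mu .$$
Since $\varphi = \varphi\chi_A + \varphi\chi_{A^c}$, from Proposition 2.51(i) we obtain
$$\int_A \varphi\,d\mu + \int_{A^c} \varphi\,d\mu = \int_X \varphi\,d\mu . \tag{2.18}$$
Notation 2.52 If $X = \mathbb{R}^N$ and $A \in \mathcal{B}(\mathbb{R}^N)$, we will write $\int_A \varphi(x)\,dx$, $\int_A \varphi(y)\,dy$ etc. rather than $\int_A \varphi\,d\mu$ when the integrals are taken with respect to the Lebesgue measure.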
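Proposition 2.53 Let $\varphi : X \to \overline{\mathbb{R}}$ be a $\mu$-summable function.
(i) The set $\{|\varphi| = \infty\}$ has measure 0;
(ii) If $\varphi = 0$ a.e., then $\int_X \varphi\,d\mu = 0$;
(iii) If $A \in \mathcal{E}$ has measure 0, then $\int_A \varphi\,d\mu = 0$;
(iv) If $\int_E \varphi\,d\mu = 0$ for every $E \in \mathcal{E}$, then $\varphi = 0$ a.e.
Proof. Parts (i), (ii) and (iii) follow immediately from Proposition 2.35. Let us prove (iv). Set $E = \{\varphi^+ > 0\}$. Then we have
$$0 = \int_E \varphi\,d\mu = \int_X \varphi^+\,d\mu .$$
Proposition 2.35(ii) implies $\varphi^+ = 0$ a.e. In a similar way we obtain $\varphi^- = 0$ a.e.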
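The key result provided by the next proposition is referred to as the absolute continuity property of the integral.
Proposition 2.54 Let $\varphi : X \to \overline{\mathbb{R}}$ be summable. Then, for any $\varepsilon > 0$ there exists $\delta_\varepsilon > 0$ such that
$$\mu(A) < \delta_\varepsilon \implies \int_A |\varphi|\,d\mu \le \varepsilon . \tag{2.19}$$
Proof. Without loss of generality, $\varphi$ may be assumed to be positive. Then,
$$\varphi_n(x) := \min\{\varphi(x), n\} \uparrow \varphi(x) \qquad \forall x \in X .$$
Therefore, by Beppo Levi's Theorem, $\int_X \varphi_n\,d\mu \uparrow \int_X \varphi\,d\mu$. So, for any $\varepsilon > 0$ there exists $n_\varepsilon \in \mathbb{N}$ such that
$$0 \le \int_X (\varphi - \varphi_n)\,d\mu < \frac{\varepsilon}{2} \qquad \forall n \ge n_\varepsilon .$$
Then, for $\mu(A) < \frac{\varepsilon}{2n_\varepsilon}$, we have
$$\int_A \varphi\,d\mu \le \int_A \varphi_{n_\varepsilon}\,d\mu + \int_X (\varphi - \varphi_{n_\varepsilon})\,d\mu < \varepsilon .$$
We have thus obtained (2.19) with $\delta_\varepsilon = \frac{\varepsilon}{2n_\varepsilon}$.
Exercise 2.55 Let $\varphi : X \to \overline{\mathbb{R}}$ be summable. Show that
$$\lim_{n \to \infty} \int_{\{|\varphi| > n\}} |\varphi|\,d\mu = 0 .$$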
2.3 Convergence of integrals
We have already obtained two results that allow passage to the limit in integrals, namely Beppo Levi's Theorem and Fatou's Lemma. In this section, we will further analyze the problem.
2.3.1 Dominated Convergence
We begin with the following classical result, also known as Lebesgue's Dominated Convergence Theorem.
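Proposition 2.56 (Lebesgue) Let $\varphi_n : X \to \overline{\mathbb{R}}$ be a sequence of Borel functions converging to $\varphi$ pointwise. Assume that there exists a positive $\mu$-summable function $\psi : X \to [0, \infty]$ such that
$$|\varphi_n(x)| \le \psi(x) \qquad \forall x \in X ,\ \forall n \in \mathbb{N}. \tag{2.20}$$
Then, $\varphi_n$, $\varphi$ are $\mu$-summable and
$$\lim_{n \to \infty} \int_X \varphi_n\,d\mu = \int_X \varphi\,d\mu . \tag{2.21}$$
Proof. First, we note that $\varphi_n$, $\varphi$ are $\mu$-summable because they are Borel and, in view of (2.20), $|\varphi(x)| \le \psi(x)$ for any $x \in X$. Let us prove (2.21) when $\psi : X \to [0, +\infty)$. Since $\psi + \varphi_n$ is positive, Fatou's Lemma yields
$$\int_X (\psi + \varphi)\,d\mu \le \liminf_{n \to \infty} \int_X (\psi + \varphi_n)\,d\mu = \int_X \psi\,d\mu + \liminf_{n \to \infty} \int_X \varphi_n\,d\mu .$$
Consequently, since $\int_X \psi\,d\mu$ is finite, we deduce
$$\int_X \varphi\,d\mu \le \liminf_{n \to \infty} \int_X \varphi_n\,d\mu . \tag{2.22}$$
Similarly,
$$\int_X (\psi - \varphi)\,d\mu \le \liminf_{n \to \infty} \int_X (\psi - \varphi_n)\,d\mu = \int_X \psi\,d\mu - \limsup_{n \to \infty} \int_X \varphi_n\,d\mu .$$
Whence,
$$\int_X \varphi\,d\mu \ge \limsup_{n \to \infty} \int_X \varphi_n\,d\mu . \tag{2.23}$$
The conclusion follows from (2.22) and (2.23).
In the general case $\psi : X \to [0, \infty]$, consider $E = \{x \in X \mid \psi(x) = \infty\}$. Then (2.21) holds over $E^c$ and, by Proposition 2.35(i), we have $\mu(E) = 0$. Hence we deduce
$$\int_X \varphi_n\,d\mu = \int_{E^c} \varphi_n\,d\mu \to \int_{E^c} \varphi\,d\mu = \int_X \varphi\,d\mu .$$
Exercise 2.57 Derive (2.21) if (2.20) is satisfied a.e. and $\varphi_n \to \varphi$ a.e., with $\varphi$ Borel.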
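Exercise 2.58 Let $\varphi, \psi : X \to \overline{\mathbb{R}}$ be Borel functions such that $\varphi$ is $\mu$-summable and $\psi$ is $\mu$-integrable. Assume that $\varphi$ or $\psi$ is finite. Prove that $\varphi + \psi$ is $\mu$-integrable and
$$\int_X (\varphi + \psi)\,d\mu = \int_X \varphi\,d\mu + \int_X \psi\,d\mu .$$
Exercise 2.59 Let $\varphi_n : X \to \overline{\mathbb{R}}$ be Borel functions satisfying, for some summable function $\psi : X \to \mathbb{R}$ and some (Borel) function $\varphi$,
$$\varphi_n(x) \ge \psi(x), \qquad \varphi_n(x) \uparrow \varphi(x) \qquad \forall x \in X .$$
Show that $\varphi_n$, $\varphi$ are $\mu$-integrable and
$$\lim_{n \to \infty} \int_X \varphi_n\,d\mu = \int_X \varphi\,d\mu .$$
Exercise 2.60 Let $\varphi_n : X \to \overline{\mathbb{R}}$ be Borel functions satisfying, for some summable function $\psi$ and some (Borel) function $\varphi$,
$$\varphi_n(x) \ge \psi(x), \qquad \varphi_n(x) \to \varphi(x) \qquad \forall x \in X .$$
Show that $\varphi_n$, $\varphi$ are $\mu$-integrable and
$$\int_X \varphi\,d\mu \le \liminf_{n \to \infty} \int_X \varphi_n\,d\mu .$$
Exercise 2.61 Let $\varphi_n : X \to \overline{\mathbb{R}}$ be Borel functions. Prove that, if $\mu$ is finite and, for some constant $M$ and some (Borel) function $\varphi$,
$$|\varphi_n(x)| \le M, \qquad \varphi_n(x) \to \varphi(x) \qquad \forall x \in X ,$$
then $\varphi_n$ and $\varphi$ are $\mu$-summable and
$$\lim_{n \to \infty} \int_X \varphi_n\,d\mu = \int_X \varphi\,d\mu .$$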
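Exercise 2.62 Let $\varphi_n : \mathbb{R} \to \mathbb{R}$ be defined by
$$\varphi_n(x) = \begin{cases} 0 & x \le 0; \\ (x|\log x|)^{-\frac{1}{n}} & 0 < x \le 1; \\ (x - \log x)^{-n} & x > 1. \end{cases}$$
Prove that
i) $\varphi_n$ is summable (with respect to the Lebesgue measure) for every $n \ge 2$;
ii) $\lim_{n \to +\infty} \int_{\mathbb{R}} \varphi_n(x)\,dx = 1$.
Exercise 2.63 Let $(\varphi_n)_n$ be defined by
$$\varphi_n(x) = \frac{n}{x^{3/2}} \log\Big(1 + \frac{x}{n}\Big), \qquad x \in [0, 1].$$
Prove that
i) $\varphi_n$ is summable for every $n \ge 1$;
ii) $\lim_{n \to +\infty} \int_0^1 \varphi_n(x)\,dx = 2$.
Exercise 2.64 Let $(\varphi_n)_n$ be defined by
$$\varphi_n(x) = \frac{n\sqrt{x}}{1 + n^2 x^2}, \qquad x \in [0, 1].$$
Prove that:
i) $\varphi_n(x) \le \frac{1}{\sqrt{x}}$ for every $n \ge 1$;
ii) $\lim_{n \to +\infty} \int_0^1 \varphi_n(x)\,dx = 0$.
Exercise 2.65 Let $(\varphi_n)_n$ be defined by
$$\varphi_n(x) = \frac{1}{x^{3/2}} \sin\frac{x}{n}, \qquad x > 0.$$
Prove that
1. $\varphi_n$ is summable for every $n \ge 1$;
2. $\lim_{n \to +\infty} \int_0^{+\infty} \varphi_n(x)\,dx = 0$.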
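2.3.2 Uniform integrability
Definition 2.66 A sequence $\varphi_n : X \to \overline{\mathbb{R}}$ of $\mu$-summable functions is said to be uniformly summable if for any $\varepsilon > 0$ there exists $\delta_\varepsilon > 0$ such that
$$\mu(A) < \delta_\varepsilon \implies \int_A |\varphi_n|\,d\mu \le \varepsilon \qquad \forall n \in \mathbb{N}. \tag{2.24}$$
In other terms, $(\varphi_n)_n$ is uniformly summable iff
$$\lim_{\mu(A) \to 0} \int_A |\varphi_n|\,d\mu = 0 \qquad \text{uniformly in } n.$$
Notice that such a property holds for a single summable function, see Proposition 2.54.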
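The following theorem due to Vitali uses the notion of uniform summability to provide another sufficient condition for taking limits under the integral sign.
Theorem 2.67 (Vitali) Let $\varphi_n : X \to \mathbb{R}$ be a sequence of uniformly $\mu$-summable functions satisfying
$$\forall \varepsilon > 0 \ \ \exists B_\varepsilon \in \mathcal{E} \ \text{ such that } \ \mu(B_\varepsilon) < +\infty \ \text{ and } \ \int_{B_\varepsilon^c} |\varphi_n|\,d\mu < \varepsilon \quad \forall n. \tag{2.25}$$
If $(\varphi_n)_n$ converges to $\varphi : X \to \mathbb{R}$ pointwise, then $\varphi$ is $\mu$-summable and
$$\lim_{n \to \infty} \int_X \varphi_n\,d\mu = \int_X \varphi\,d\mu .$$
Proof. Let $\varepsilon > 0$ be fixed and let $\delta_\varepsilon > 0$, $B_\varepsilon \in \mathcal{E}$ be such that (2.24)-(2.25) hold true. Since, by Theorem 2.25, $\varphi_n \to \varphi$ almost uniformly in $B_\varepsilon$, there exists a measurable set $A_\varepsilon \subset B_\varepsilon$ such that $\mu(A_\varepsilon) < \delta_\varepsilon$ and
$$\varphi_n \to \varphi \quad \text{uniformly in } B_\varepsilon \setminus A_\varepsilon . \tag{2.26}$$
So,
$$\int_X |\varphi_n - \varphi|\,d\mu = \int_{B_\varepsilon^c} |\varphi_n - \varphi|\,d\mu + \int_{A_\varepsilon} |\varphi_n - \varphi|\,d\mu + \int_{B_\varepsilon \setminus A_\varepsilon} |\varphi_n - \varphi|\,d\mu$$
$$\le \int_{B_\varepsilon^c} |\varphi_n|\,d\mu + \int_{B_\varepsilon^c} |\varphi|\,d\mu + \int_{A_\varepsilon} |\varphi_n|\,d\mu + \int_{A_\varepsilon} |\varphi|\,d\mu + \mu(B_\varepsilon) \sup_{B_\varepsilon \setminus A_\varepsilon} |\varphi_n - \varphi| .$$
Notice that $\int_{A_\varepsilon} |\varphi_n|\,d\mu \le \varepsilon$ and $\int_{B_\varepsilon^c} |\varphi_n|\,d\mu \le \varepsilon$ by (2.24)-(2.25). Also, owing to Corollary 2.41, $\int_{A_\varepsilon} |\varphi|\,d\mu \le \varepsilon$ and $\int_{B_\varepsilon^c} |\varphi|\,d\mu \le \varepsilon$. Thus,
$$\int_X |\varphi_n - \varphi|\,d\mu \le 4\varepsilon + \mu(B_\varepsilon) \sup_{B_\varepsilon \setminus A_\varepsilon} |\varphi_n - \varphi| .$$
Since $\mu(B_\varepsilon) < +\infty$, by (2.26) the upper limit of the left-hand side is at most $4\varepsilon$; since $\varepsilon$ is arbitrary, we deduce
$$\int_X |\varphi_n - \varphi|\,d\mu \to 0. \tag{2.27}$$
Then $\varphi_n - \varphi$ is $\mu$-summable; consequently, since $\varphi = (\varphi - \varphi_n) + \varphi_n$, by Proposition 2.51(i) $\varphi$ is $\mu$-summable. The conclusion follows by (2.17) and (2.27).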
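Exercise 2.68 Derive (2.21) if $\varphi_n, \varphi : X \to \overline{\mathbb{R}}$, $\varphi_n$, $\varphi$ are a.e. finite, $\varphi_n \to \varphi$ a.e. and $\varphi$ is Borel.
For finite measures, (2.25) is always satisfied by taking $B_\varepsilon = X$; hence Vitali's Theorem states that uniform summability is a sufficient condition to pass to the limit under the integral sign.
Corollary 2.69 Let $\mu(X) < \infty$ and let $\varphi_n : X \to \mathbb{R}$ be a sequence of uniformly $\mu$-summable functions converging to $\varphi : X \to \mathbb{R}$ pointwise. Then
$$\lim_{n \to \infty} \int_X \varphi_n\,d\mu = \int_X \varphi\,d\mu .$$
Exercise 2.70 Give an example to show that when $\mu(X) = \infty$ (2.25) is an essential condition for Vitali's Theorem.
Hint: consider $\varphi_n(x) = \chi_{[n, n+1)}(x)$ in $\mathbb{R}$.
Remark 2.71 We note that property (2.25) holds for a single summable function $\varphi$. Indeed, by Proposition 2.34, the sets $\{|\varphi| > \frac{1}{n}\}$ have finite measure and, by Lebesgue's Theorem,
$$\int_{\{|\varphi| \le \frac{1}{n}\}} |\varphi|\,d\mu = \int_X \chi_{\{|\varphi| \le \frac{1}{n}\}} |\varphi|\,d\mu \to 0 \quad \text{as } n \to \infty .$$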
Remark 2.72 We point out that Vitali's Theorem can be regarded as a generalization of Lebesgue's Theorem. Indeed, by Proposition 2.54 and Remark 2.71 it follows that properties (2.24)-(2.25) hold for a single summable function. Therefore, if $(\varphi_n)_n$ is a sequence of Borel functions satisfying (2.20) for some summable function $\psi$, then $\varphi_n$ is uniformly summable and satisfies (2.25). The converse is not true, in general. To see this, consider the sequence $\varphi_n = n\chi_{[\frac{1}{n}, \frac{1}{n} + \frac{1}{n^2})}$; since $\int_{\mathbb{R}} \varphi_n\,dx = \frac{1}{n}$, then $(\varphi_n)_n$ satisfies (2.24)-(2.25); on the other hand $\sup_n \varphi_n = \psi$ where $\psi = \sum_{n=1}^{+\infty} n\chi_{[\frac{1}{n}, \frac{1}{n} + \frac{1}{n^2})}$ and $\int_{\mathbb{R}} \psi\,dx = \sum_{n=1}^{+\infty} \frac{1}{n} = \infty$; consequently the sequence $(\varphi_n)_n$ cannot be dominated by any summable function.
2.3.3 Integrals depending on a parameter
Let $(X, \mathcal{E}, \mu)$ be a finite measure space. In this section we shall see how to differentiate the integral on $X$ of a function $\varphi(x, y)$ depending on the extra variable $y$, which is called a parameter. We begin with a continuity result.
Proposition 2.73 Let $(Y, d)$ be a metric space, let $y_0 \in Y$, let $U$ be a neighbourhood of $y_0$, and let
$$\varphi : X \times Y \to \mathbb{R}$$
be a function such that
(a) $x \mapsto \varphi(x, y)$ is Borel for every $y \in Y$;
(b) $y \mapsto \varphi(x, y)$ is continuous at $y_0$ for every $x \in X$;
(c) for some summable function $\psi$,
$$|\varphi(x, y)| \le \psi(x) \qquad \forall x \in X ,\ \forall y \in U .$$
Then, $\Phi(y) := \int_X \varphi(x, y)\,\mu(dx)$ is continuous at $y_0$.
Proof. Let $(y_n)$ be any sequence in $Y$ that converges to $y_0$. Suppose, further, $y_n \in U$ for every $n \in \mathbb{N}$. Then, for every $x \in X$,
$$\varphi(x, y_n) \to \varphi(x, y_0) \ \text{ as } n \to \infty, \qquad |\varphi(x, y_n)| \le \psi(x) \quad \forall n \in \mathbb{N}.$$
Therefore, by Lebesgue's Theorem,
$$\int_X \varphi(x, y_n)\,\mu(dx) \to \int_X \varphi(x, y_0)\,\mu(dx) \quad \text{as } n \to \infty .$$
Since $(y_n)$ is arbitrary, the conclusion follows.
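Exercise 2.74 Let $p > 0$ be given. For $t \in \mathbb{R}$ define
$$\varphi_t(x) = \begin{cases} \dfrac{1}{|t|}\, x^p e^{-x/|t|} & x \in [0, 1] \quad (t \ne 0), \\ 0 & (t = 0). \end{cases}$$
For what values of $p$ does each of the following hold true?
(a) $\varphi_t(x) \to 0$ a.e. as $t \to 0$;
(b) $\varphi_t \to 0$ uniformly in $[0, 1]$ as $t \to 0$;
(c) $\int_0^1 \varphi_t(x)\,dx \to 0$ as $t \to 0$.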
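For differentiability, we shall restrict the analysis to a real parameter.
Proposition 2.75 Assume $\varphi : X \times (a, b) \to \mathbb{R}$ satisfies the following:
(a) $x \mapsto \varphi(x, y)$ is Borel for every $y \in (a, b)$;
(b) $y \mapsto \varphi(x, y)$ is differentiable in $(a, b)$ for every $x \in X$;
(c) for some summable function $\psi$,
$$\sup_{a < y < b} \Big| \frac{\partial \varphi}{\partial y}(x, y) \Big| \le \psi(x) \qquad \forall x \in X .$$
Then, $\Phi(y) := \int_X \varphi(x, y)\,\mu(dx)$ is differentiable on $(a, b)$ and
$$\Phi'(y) = \int_X \frac{\partial \varphi}{\partial y}(x, y)\,\mu(dx) , \qquad \forall y \in (a, b) .$$
Proof. We note, first, that $x \mapsto \frac{\partial \varphi}{\partial y}(x, y)$ is Borel for every $y \in (a, b)$ because
$$\frac{\partial \varphi}{\partial y}(x, y) = \lim_{n \to \infty} n \Big[ \varphi\Big(x, y + \frac{1}{n}\Big) - \varphi(x, y) \Big] \qquad \forall (x, y) \in X \times (a, b) .$$
Now, fix $y_0 \in (a, b)$ and let $(y_n)$ be any sequence in $(a, b)$ converging to $y_0$. Then,
$$\frac{\Phi(y_n) - \Phi(y_0)}{y_n - y_0} = \int_X \frac{\varphi(x, y_n) - \varphi(x, y_0)}{y_n - y_0}\,\mu(dx),$$
where the integrand converges to $\frac{\partial \varphi}{\partial y}(x, y_0)$ as $n \to \infty$, and
$$\Big| \frac{\varphi(x, y_n) - \varphi(x, y_0)}{y_n - y_0} \Big| \le \psi(x) \qquad \forall x \in X ,\ \forall n \in \mathbb{N}$$
thanks to the mean value theorem. Therefore, Lebesgue's Theorem yields
$$\frac{\Phi(y_n) - \Phi(y_0)}{y_n - y_0} \to \int_X \frac{\partial \varphi}{\partial y}(x, y_0)\,\mu(dx) \quad \text{as } n \to \infty .$$
Since $(y_n)$ is arbitrary, the conclusion follows.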
Remark 2.76 Note that assumption (b) above must be satisfied on the whole interval $(a, b)$ (not just a.e.) in order to be able to differentiate under the integral sign. Indeed, for $X = (a, b) = (0, 1)$, let
$$\varphi(x, y) = \begin{cases} 1 & \text{if } y \ge x, \\ 0 & \text{if } y < x . \end{cases}$$
Then, $\frac{\partial \varphi}{\partial y}(x, y) = 0$ for all $y \ne x$, but
$$\Phi(y) = \int_0^1 \varphi(x, y)\,dx = y \implies \Phi'(y) = 1 .$$
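Example 2.77 Let us compute the integral
$$\Phi(y) := \int_0^\infty e^{-x^2 - \frac{y^2}{x^2}}\,dx \qquad y \in \mathbb{R}.$$
Since $\Phi(y) = \Phi(-y)$, without loss of generality we can suppose $y \ge 0$. Observe that
$$\Big| \frac{\partial}{\partial y} e^{-x^2 - \frac{y^2}{x^2}} \Big| = \frac{2y}{x^2} e^{-x^2 - \frac{y^2}{x^2}} = \frac{2e^{-x^2}}{y} \underbrace{\frac{y^2}{x^2} e^{-\frac{y^2}{x^2}}}_{\le 1/e} \le \frac{2e^{-x^2}}{r} \qquad \text{for } y \ge r > 0 .$$
Therefore, for any $y > 0$,
$$\Phi'(y) = -\int_0^\infty \frac{2y}{x^2} e^{-x^2 - \frac{y^2}{x^2}}\,dx \overset{t = y/x}{=} -2 \int_0^\infty \frac{y t^2}{y^2} e^{-t^2 - \frac{y^2}{t^2}} \frac{y}{t^2}\,dt = -2\Phi(y) .$$
Since
$$\int_0^\infty e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2},$$
solving the Cauchy problem
$$\begin{cases} \Phi'(y) = -2\Phi(y) \\ \Phi(0) = \dfrac{\sqrt{\pi}}{2} \end{cases}$$
and recalling that $\Phi$ is an even function, we obtain
$$\Phi(y) = \frac{\sqrt{\pi}}{2} e^{-2|y|} \qquad (y \in \mathbb{R}).$$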
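The closed form of Example 2.77 can be compared with a direct numerical quadrature. The following is a minimal sketch, assuming a plain midpoint rule truncated at a hypothetical upper limit `upper = 10.0` (beyond which the integrand is below $e^{-100}$); the function names and the step count are our own choices.

```python
import math


def Phi(y, upper=10.0, steps=20_000):
    """Midpoint-rule approximation of the integral of exp(-x^2 - y^2/x^2) on (0, upper)."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h                      # midpoints never hit x = 0
        total += math.exp(-x * x - (y * y) / (x * x))
    return total * h


def closed_form(y):
    """The value (sqrt(pi)/2) * exp(-2|y|) obtained in Example 2.77."""
    return math.sqrt(math.pi) / 2 * math.exp(-2 * abs(y))


errors = {y: abs(Phi(y) - closed_form(y)) for y in (0.0, 0.5, 1.0, -1.0)}
```

The midpoint rule is very accurate here because the integrand and all its derivatives are negligible at both ends of the truncated interval (for $y = 0$ the integrand is even, which has the same effect).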
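Example 2.78 Applying Lebesgue's Theorem to counting measure, we shall compute
$$\lim_{n \to \infty} n \sum_{i=1}^\infty \sin\Big(\frac{2^{-i}}{n}\Big).$$
Indeed, observe that
$$\varphi_n(i) := n \sin\Big(\frac{2^{-i}}{n}\Big)$$
satisfies $|\varphi_n(i)| \le 2^{-i}$. Then, by Lebesgue's Theorem we have
$$\lim_{n \to \infty} \sum_{i=1}^\infty \varphi_n(i) = \sum_{i=1}^\infty \lim_{n \to \infty} \varphi_n(i) = \sum_{i=1}^\infty 2^{-i} = 1 .$$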
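The limit in Example 2.78 is easy to observe numerically. The sketch below assumes a truncation of the series at 60 terms (our choice; the neglected tail is of order $2^{-60}$) and evaluates the sum for a few values of $n$.

```python
import math


def partial_series(n, terms=60):
    """Sum over i = 1..terms of phi_n(i) = n * sin(2**-i / n)."""
    return sum(n * math.sin(2.0 ** -i / n) for i in range(1, terms + 1))


ns = (1, 10, 100, 1000)
values = [partial_series(n) for n in ns]
```

Since $\sin s / s$ is decreasing on $(0, \pi)$, each term $n \sin(2^{-i}/n)$ increases with $n$ towards $2^{-i}$, so the sums increase towards 1.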
Exercise 2.79 Compute the integral
$$\int_0^\infty \frac{\sin x}{x}\,dx$$
proceeding as follows.
(i) Show that
$$\Phi(t) := \int_0^\infty e^{-tx} \frac{\sin x}{x}\,dx$$
is differentiable for all $t > 0$.
Hint: recall
$$|e^{-tx} \sin x| \le e^{-tx} \le e^{-rx} \qquad \forall t \ge r > 0 ,\ \forall x \in \mathbb{R}_+ .$$
(ii) Compute $\Phi'(t)$ for $t \in\ ]0, \infty[$.
Hint: proceed as in Example 2.77 noting that
$$\int e^{-tx} \sin x\,dx = -\frac{t \sin x + \cos x}{1 + t^2}\, e^{-tx} .$$
(iii) Compute $\Phi(t)$ (up to an additive constant) for all $t \in\ ]0, \infty[$.
(iv) Show that $\Phi$ is continuous at 0 and conclude that
$$\int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2} .$$
Hint: observe that, for any > 0,
[(t)[ +

e
tx
sin x
x
dx

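Chapter 3
$L^p$ spaces
3.1 Spaces $\mathcal{L}^p(X, \mathcal{E}, \mu)$ and $L^p(X, \mathcal{E}, \mu)$
For any $p \in [1, \infty)$, we denote by $\mathcal{L}^p(X, \mathcal{E}, \mu)$ the class of all Borel functions $\varphi : X \to \overline{\mathbb{R}}$ such that $|\varphi|^p$ is summable, and we define
$$\|\varphi\|_p = \Big( \int_X |\varphi|^p\,d\mu \Big)^{1/p} \qquad \forall \varphi \in \mathcal{L}^p(X, \mathcal{E}, \mu).$$
Remark 3.1 It is easy to check that $\mathcal{L}^p(X, \mathcal{E}, \mu)$ is closed under the following operations: sum of two functions (provided that at least one is finite everywhere) and multiplication of a function by a real number. Indeed,
$$\alpha \in \mathbb{R},\ \varphi \in \mathcal{L}^p(X, \mathcal{E}, \mu) \implies \alpha\varphi \in \mathcal{L}^p(X, \mathcal{E}, \mu) \ \ \&\ \ \|\alpha\varphi\|_p = |\alpha|\,\|\varphi\|_p .$$
Moreover, if $\varphi, \psi \in \mathcal{L}^p(X, \mathcal{E}, \mu)$ and $\varphi : X \to \mathbb{R}$, then we have (1)
$$|\varphi(x) + \psi(x)|^p \le 2^{p-1} \big( |\varphi(x)|^p + |\psi(x)|^p \big) \qquad \forall x \in X ,$$
and so $\varphi + \psi \in \mathcal{L}^p(X, \mathcal{E}, \mu)$.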
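Example 3.2 Let $\mu$ be the counting measure on $\mathbb{N}$. Then, we will use the notation $\ell^p$ for the space $\mathcal{L}^p(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$. We have
$$\ell^p = \Big\{ (x_n)_n \ \Big|\ x_n \in \mathbb{R},\ \sum_{n=1}^\infty |x_n|^p < \infty \Big\}.$$
Observe that
$$1 \le p \le q \implies \ell^p \subset \ell^q .$$
Indeed, since $\sum_n |x_n|^p < \infty$, $(x_n)_n$ is bounded, say $|x_n| \le M$ for all $n \in \mathbb{N}$. Then, $|x_n|^q \le M^{q-p} |x_n|^p$. So, $\sum_n |x_n|^q < \infty$.
(1) Since $f(t) = t^p$ is convex on $[0, \infty)$, we have that $\big| \frac{a+b}{2} \big|^p \le \frac{|a|^p + |b|^p}{2}$ for all $a, b \ge 0$.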
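Example 3.3 Consider Lebesgue measure $\lambda$ on $((0, 1], \mathcal{B}((0, 1]))$. We use the abbreviated notation $\mathcal{L}^p(0, 1)$ for the space $\mathcal{L}^p((0, 1], \mathcal{B}((0, 1]), \lambda)$. Let us set, for any $\alpha \in \mathbb{R}$,
$$\varphi_\alpha(x) = x^\alpha \qquad \forall x \in (0, 1] .$$
Then, $\varphi_\alpha \in \mathcal{L}^p(0, 1)$ iff $\alpha p + 1 > 0$. Thus, $\mathcal{L}^p(0, 1)$ fails to be an algebra. For instance, $\varphi_{-1/2} \in \mathcal{L}^1(0, 1)$ but $\varphi_{-1} = \varphi_{-1/2}^2 \notin \mathcal{L}^1(0, 1)$.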
We have already observed that $\|\cdot\|_p$ is positively homogeneous of degree one. However, $\|\cdot\|_p$ in general is not a norm (2) since $\|\varphi\|_p = 0$ if and only if $\varphi(x) = 0$ for a.e. $x \in X$.
In order to construct a vector space on which $\|\cdot\|_p$ is a norm, let us consider the following equivalence relation on $\mathcal{L}^p(X, \mathcal{E}, \mu)$:
$$\varphi \sim \psi \iff \varphi = \psi \ \text{ a.e.} \tag{3.1}$$
Let us denote by $L^p(X, \mathcal{E}, \mu)$ the quotient space $\mathcal{L}^p(X, \mathcal{E}, \mu)/\!\sim$. For any $\varphi \in \mathcal{L}^p(X, \mathcal{E}, \mu)$ we shall denote by $\tilde\varphi$ the equivalence class determined by $\varphi$. It is easy to check that $L^p(X, \mathcal{E}, \mu)$ is a vector space. Indeed, the precise definition of addition of two elements $\tilde\varphi_1, \tilde\varphi_2 \in L^p(X, \mathcal{E}, \mu)$ is the following: let $f_1, f_2$ be representatives of $\tilde\varphi_1$ and $\tilde\varphi_2$ respectively, i.e. $f_1 \in \tilde\varphi_1$, $f_2 \in \tilde\varphi_2$, such that $f_1, f_2$ are finite everywhere (such representatives exist by Proposition 2.53(i)). Then $\tilde\varphi_1 + \tilde\varphi_2$ is the class containing $f_1 + f_2$.
We set
$$\|\tilde\varphi\|_p = \|\varphi\|_p \qquad \forall \tilde\varphi \in L^p(X, \mathcal{E}, \mu) .$$
It is easy to see that this definition is independent of the particular element $\varphi$ chosen in $\tilde\varphi$. Then, since the zero element of $L^p(X, \mathcal{E}, \mu)$ is the class consisting of all functions vanishing almost everywhere, it is clear that $\|\tilde\varphi\|_p = 0$ iff $\tilde\varphi = 0$. To simplify notation, we will hereafter identify $\tilde\varphi$ with $\varphi$ and we will talk about "functions in $L^p(X, \mathcal{E}, \mu)$" when there is no danger of confusion, with the understanding that we regard equivalent functions (i.e. functions differing only on a set of measure zero) as identical elements of the space $L^p(X, \mathcal{E}, \mu)$.
(2) Let $Y$ be a vector space. A norm on $Y$ is a mapping $Y \to [0, +\infty)$, $y \mapsto \|y\|$ such that: (i) $\|y\| = 0$ iff $y = 0$, (ii) $\|\alpha y\| = |\alpha|\,\|y\|$ for all $\alpha \in \mathbb{R}$ and $y \in Y$, (iii) $\|y_1 + y_2\| \le \|y_1\| + \|y_2\|$ for all $y_1, y_2 \in Y$. The space $Y$, endowed with the norm $\|\cdot\|$, is called a normed space. It is a metric space with the distance $d(y_1, y_2) = \|y_1 - y_2\|$, $y_1, y_2 \in Y$. If it is a complete metric space, then $Y$ is called a Banach space.
In order to check that $\|\cdot\|_p$ is a norm we need only to verify that $\|\cdot\|_p$ is sublinear. First we derive two classical inequalities that play an essential role in real analysis. Let $1 < p, q < \infty$. We say that $p$ and $q$ are conjugate exponents if
$$\frac{1}{p} + \frac{1}{q} = 1 .$$
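Proposition 3.4 (Hölder) Let $p, q \in (1, \infty)$ be conjugate exponents. Then, for any $\varphi \in L^p(X, \mathcal{E}, \mu)$ and $\psi \in L^q(X, \mathcal{E}, \mu)$, we have that $\varphi\psi \in L^1(X, \mathcal{E}, \mu)$ and
$$\|\varphi\psi\|_1 \le \|\varphi\|_p \|\psi\|_q . \tag{3.2}$$
Proof. The conclusion is trivial if $\|\varphi\|_p = 0$ or $\|\psi\|_q = 0$. Assume next $\|\varphi\|_p > 0$ and $\|\psi\|_q > 0$, and set
$$f(x) = \frac{|\varphi(x)|}{\|\varphi\|_p}, \qquad g(x) = \frac{|\psi(x)|}{\|\psi\|_q} \qquad \forall x \in X.$$
Then, by Young's inequality (A.4),
$$f(x)g(x) \le \frac{f(x)^p}{p} + \frac{g(x)^q}{q} \qquad \forall x \in X. \tag{3.3}$$
Integrating over $X$ with respect to $\mu$ yields
$$\frac{\int_X |\varphi\psi|\,d\mu}{\|\varphi\|_p \|\psi\|_q} = \int_X fg\,d\mu \le \frac{1}{p} \int_X f^p\,d\mu + \frac{1}{q} \int_X g^q\,d\mu = 1 .$$
Remark 3.5 Suppose equality holds in (3.2). Then, equality must hold in (3.3) for a.e. $x \in X$. Therefore, recalling Example A.4, $f(x)^p = g(x)^q$ for a.e. $x \in X$. We conclude that equality holds in (3.2) iff $|\varphi(x)|^p = \alpha |\psi(x)|^q$ for a.e. $x \in X$ and some $\alpha \ge 0$.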
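Corollary 3.6 Let $\mu(X) < \infty$. If $1 \le p < q$, then
$$L^q(X, \mathcal{E}, \mu) \subset L^p(X, \mathcal{E}, \mu)$$
and
$$\|\varphi\|_p \le (\mu(X))^{\frac{1}{p} - \frac{1}{q}} \|\varphi\|_q \qquad \forall \varphi \in L^q(X, \mathcal{E}, \mu) . \tag{3.4}$$
Proof. By hypothesis, $|\varphi|^p \in L^{\frac{q}{p}}(X, \mathcal{E}, \mu)$. Therefore, Hölder's inequality yields
$$\int_X |\varphi|^p\,d\mu \le (\mu(X))^{1 - \frac{p}{q}} \Big( \int_X |\varphi|^q\,d\mu \Big)^{\frac{p}{q}} .$$
The conclusion follows.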
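Exercise 3.7 Let $\varphi_1, \varphi_2, \dots, \varphi_k$ be functions such that
$$\varphi_i \in L^{p_i}(X, \mathcal{E}, \mu), \qquad \frac{1}{p} = \frac{1}{p_1} + \frac{1}{p_2} + \dots + \frac{1}{p_k} \le 1.$$
Then $\varphi_1 \varphi_2 \cdots \varphi_k \in L^p(X, \mathcal{E}, \mu)$ and
$$\|\varphi_1 \varphi_2 \cdots \varphi_k\|_p \le \|\varphi_1\|_{p_1} \|\varphi_2\|_{p_2} \cdots \|\varphi_k\|_{p_k} .$$
Exercise 3.8 Let $1 \le p < r < q$ and let $\varphi \in L^p(X, \mathcal{E}, \mu) \cap L^q(X, \mathcal{E}, \mu)$. Then $\varphi \in L^r(X, \mathcal{E}, \mu)$ and
$$\|\varphi\|_r \le \|\varphi\|_p^{\theta}\, \|\varphi\|_q^{1 - \theta}$$
where $\frac{1}{r} = \frac{\theta}{p} + \frac{1 - \theta}{q}$.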
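Proposition 3.9 (Minkowski) Let $p \in [1, \infty)$ and let $\varphi, \psi \in L^p(X, \mathcal{E}, \mu)$. Then, $\varphi + \psi \in L^p(X, \mathcal{E}, \mu)$ and
$$\|\varphi + \psi\|_p \le \|\varphi\|_p + \|\psi\|_p . \tag{3.5}$$
Proof. The thesis is immediate if $p = 1$. Assume $p > 1$. We have
$$\int_X |\varphi + \psi|^p\,d\mu \le \int_X |\varphi + \psi|^{p-1} |\varphi|\,d\mu + \int_X |\varphi + \psi|^{p-1} |\psi|\,d\mu .$$
Since $|\varphi + \psi|^{p-1} \in L^q(X, \mathcal{E}, \mu)$, where $q = \frac{p}{p-1}$, using Hölder's inequality we find
$$\int_X |\varphi + \psi|^p\,d\mu \le \Big( \int_X |\varphi + \psi|^p\,d\mu \Big)^{1/q} \big( \|\varphi\|_p + \|\psi\|_p \big),$$
and the conclusion follows.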
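For the counting measure on a finite set, inequalities (3.2) and (3.5) reduce to statements about finite sums, which can be checked directly. The following is a minimal sketch, assuming two hypothetical sample vectors `x`, `y` and the exponent $p = 3$; these data are illustrative only.

```python
def norm(x, p):
    """l^p norm of a finite sequence with respect to the counting measure."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


x = [1.0, -2.0, 0.5, 3.0]    # hypothetical sample data
y = [0.3, 4.0, -1.0, 2.0]

p = 3.0
q = p / (p - 1.0)            # conjugate exponent: 1/p + 1/q = 1

# Hoelder: ||x y||_1 <= ||x||_p ||y||_q
holder_lhs = sum(abs(a * b) for a, b in zip(x, y))
holder_rhs = norm(x, p) * norm(y, q)

# Minkowski: ||x + y||_p <= ||x||_p + ||y||_p
minkowski_lhs = norm([a + b for a, b in zip(x, y)], p)
minkowski_rhs = norm(x, p) + norm(y, p)
```

Choosing `y` proportional to `x` turns Minkowski's inequality into an equality, in agreement with the equality case discussed in Remark 3.5 for Hölder's inequality.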
Thus the space $L^p(X, \mathcal{E}, \mu)$, endowed with the norm $\|\cdot\|_p$, is a normed space. Our next result shows that $L^p(X, \mathcal{E}, \mu)$ is a Banach space.
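Proposition 3.10 (Riesz-Fischer) Let $(\varphi_n)_n$ be a Cauchy sequence (3) in the normed space $L^p(X, \mathcal{E}, \mu)$. Then, a subsequence $(\varphi_{n_k})_{k \in \mathbb{N}}$ and a function $\varphi$ in $L^p(X, \mathcal{E}, \mu)$ exist such that
(i) $\varphi_{n_k} \to \varphi$ a.e.;
(ii) $\varphi_n \to \varphi$ in $L^p$.
(3) That is, for any $\varepsilon > 0$ there exists $n_\varepsilon \in \mathbb{N}$ such that $n, m > n_\varepsilon \implies \|\varphi_n - \varphi_m\|_p < \varepsilon$.
Proof. Since $(\varphi_n)_n$ is a Cauchy sequence in $L^p(X, \mathcal{E}, \mu)$, for any $i \in \mathbb{N}$ there exists $n_i \in \mathbb{N}$ such that
$$\|\varphi_n - \varphi_m\|_p < 2^{-i} \qquad \forall n, m \ge n_i . \tag{3.6}$$
Consequently, we can construct an increasing sequence $n_i$ such that
$$\|\varphi_{n_{i+1}} - \varphi_{n_i}\|_p < 2^{-i} \qquad \forall i \in \mathbb{N}.$$
Next, let us set
$$g(x) = \sum_{i=1}^\infty |\varphi_{n_{i+1}}(x) - \varphi_{n_i}(x)|, \qquad g_k(x) = \sum_{i=1}^k |\varphi_{n_{i+1}}(x) - \varphi_{n_i}(x)|, \quad k \ge 1.$$
Minkowski's inequality shows that $\|g_k\|_p < 1$ for every $k$; since $g_k \uparrow g$, the Monotone Convergence Theorem ensures that
$$\int_X |g|^p\,d\mu = \lim_{k \to \infty} \int_X |g_k|^p\,d\mu \le 1.$$
Then, owing to Proposition 2.35, $g$ is finite a.e.; therefore the series
$$\sum_{i=1}^\infty (\varphi_{n_{i+1}} - \varphi_{n_i}) + \varphi_{n_1}$$
converges almost everywhere on $X$ to some function $\varphi$. Since
$$\sum_{i=1}^k (\varphi_{n_{i+1}} - \varphi_{n_i}) + \varphi_{n_1} = \varphi_{n_{k+1}},$$
then
$$\varphi(x) = \lim_{k \to \infty} \varphi_{n_k}(x) \qquad \text{for a.e. } x \in X.$$
Observe that $\varphi$ is a Borel function; moreover, $|\varphi(x)| \le g(x) + |\varphi_{n_1}(x)|$ for a.e. $x \in X$. So, $\varphi \in L^p(X, \mathcal{E}, \mu)$. This concludes the proof of point (i).
Next, to derive (ii), fix $\varepsilon > 0$; there exists $N \in \mathbb{N}$ such that
$$\|\varphi_n - \varphi_m\|_p \le \varepsilon \qquad \forall n, m \ge N.$$
Taking $m = n_k$ and passing to the limit as $k \to \infty$, Fatou's Lemma yields
$$\int_X |\varphi_n - \varphi|^p\,d\mu \le \liminf_{k \to \infty} \int_X |\varphi_n - \varphi_{n_k}|^p\,d\mu \le \varepsilon^p \qquad \forall n \ge N .$$
The proof is thus complete.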
Notation 3.11 If $A \in \mathcal{B}(\mathbb{R}^N)$, we will use the abbreviated notation $L^p(A)$ for the space $L^p(A, \mathcal{B}(A), \lambda)$ where $\lambda$ is the Lebesgue measure.
Example 3.12 We note that the conclusion of point (i) in Proposition 3.10 only holds for a subsequence. Indeed, given any positive integer $k$, consider the function
$$\varphi_i^k(x) = \begin{cases} 1 & \dfrac{i-1}{k} \le x < \dfrac{i}{k}, \\ 0 & \text{otherwise}, \end{cases} \qquad 1 \le i \le k,$$
defined on the interval $[0, 1)$. The sequence
$$\varphi_1^1, \varphi_1^2, \varphi_2^2, \dots, \varphi_1^k, \varphi_2^k, \dots, \varphi_k^k, \dots$$
converges to 0 in $L^p([0, 1))$, but does not converge at any point whatsoever. Observe that the subsequence $\varphi_1^k = \chi_{[0, \frac{1}{k})}$ converges to 0 a.e.
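The behaviour described in Example 3.12 can be observed by enumerating the sequence. The following is a minimal sketch, assuming a hypothetical sample point `x0 = 0.3` and a truncation of the enumeration at $k = 50$; the helper names are ours.

```python
def phi(k, i):
    """Indicator of [(i-1)/k, i/k) on [0, 1), for 1 <= i <= k."""
    return lambda x: 1.0 if (i - 1) / k <= x < i / k else 0.0


def lp_norm(k, p):
    """||phi_i^k||_p = (length of the interval)^(1/p) = k^(-1/p), for every i."""
    return (1.0 / k) ** (1.0 / p)


# Enumerate phi_1^1, phi_1^2, phi_2^2, phi_1^3, ... up to k = K.
K = 50
sequence = [(k, i) for k in range(1, K + 1) for i in range(1, k + 1)]

x0 = 0.3                                   # an arbitrary sample point of [0, 1)
values_at_x0 = [phi(k, i)(x0) for k, i in sequence]

# For each k, count how many of phi_1^k, ..., phi_k^k equal 1 at x0.
ones_per_block = [
    sum(phi(k, i)(x0) for i in range(1, k + 1)) for k in range(1, K + 1)
]
```

Every block $k$ contains exactly one function equal to 1 at `x0` and, for $k \ge 2$, at least one equal to 0, so the values at `x0` oscillate forever, while the norms $k^{-1/p}$ tend to 0.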
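Exercise 3.13 Generalize Exercise 2.55 showing that, if $\varphi_n \to \varphi$ in $L^1$, then
$$\lim_{k \to \infty} \sup_{n \in \mathbb{N}} \int_{\{|\varphi_n| \ge k\}} |\varphi_n|\,d\mu = 0 .$$
Hint: observe that
$$\int_{\{|\varphi_n| \ge 2k\}} |\varphi_n|\,d\mu \le 2 \int_{\{|\varphi_n - \varphi| \vee |\varphi| \ge k\}} \big( |\varphi_n - \varphi| \vee |\varphi| \big)\,d\mu \le 2 \int_{\{|\varphi_n - \varphi| \ge k\}} |\varphi_n - \varphi|\,d\mu + 2 \int_{\{|\varphi| \ge k\}} |\varphi|\,d\mu .$$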
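Example 3.14 There are measure spaces $(X, \mathcal{E}, \mu)$ such that
$$L^p(X, \mathcal{E}, \mu) \not\subset L^q(X, \mathcal{E}, \mu)$$
for $p \ne q$. For instance, consider Lebesgue measure $\lambda$ in $[0, 1)$ and set
$$\mu = \lambda + \sum_{n=1}^\infty \delta_{1/n}$$
where $\delta_y$ denotes the Dirac measure concentrated at $y$. Then, $\varphi(x) := x$ is in $L^2(X, \mathcal{E}, \mu) \setminus L^1(X, \mathcal{E}, \mu)$ because
$$\int_{[0,1)} x^2\,d\mu = \frac{1}{3} + \sum_{n=1}^\infty \frac{1}{n^2} < \infty, \qquad \int_{[0,1)} x\,d\mu = \frac{1}{2} + \sum_{n=1}^\infty \frac{1}{n} = \infty .$$
On the other hand,
$$\psi(x) := \begin{cases} \dfrac{1}{\sqrt{x}} & \text{if } x \in [0, 1) \setminus \mathbb{Q} \\ 0 & \text{if } x \in [0, 1) \cap \mathbb{Q} \end{cases}$$
belongs to $L^1(X, \mathcal{E}, \mu) \setminus L^2(X, \mathcal{E}, \mu)$ since
$$\int_{[0,1)} \psi(x)\,d\mu = \int_0^1 \frac{dx}{\sqrt{x}} < \infty, \qquad \int_{[0,1)} \psi^2(x)\,d\mu = \int_0^1 \frac{dx}{x} = \infty .$$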
Exercise 3.15 Show that $L^p(\mathbb{R}) \not\subset L^q(\mathbb{R})$ for $p \ne q$.
Hint: consider $f(x) = |x(\log^2 |x| + 1)|^{-1/p}$ and show that $f \in L^p(\mathbb{R})$ but $f \notin L^q(\mathbb{R})$ for $q \ne p$.
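Exercise 3.16 Let $(\varphi_n)_n$ be a sequence in $L^1(X, \mathcal{E}, \mu)$. If
$$\sum_{n=1}^\infty \int_X |\varphi_n|\,d\mu < \infty,$$
then
(i) $\sum_{n=1}^\infty |\varphi_n(x)| < \infty$ a.e.,
(ii) $\sum_{n=1}^\infty \varphi_n \in L^1(X, \mathcal{E}, \mu)$,
(iii) $\sum_{n=1}^\infty \int_X \varphi_n\,d\mu = \int_X \sum_{n=1}^\infty \varphi_n\,d\mu$.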
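Exercise 3.17 Let $1 \le p < \infty$. Show that if $\varphi \in L^p(\mathbb{R}^N)$ and $\varphi$ is uniformly continuous, then
$$\lim_{|x| \to \infty} \varphi(x) = 0.$$
Hint: if, by contradiction, $(x_n)_n \subset \mathbb{R}^N$ is such that $|x_n| \to \infty$ and $|\varphi(x_n)| \ge \delta > 0$ for every $n$, then the uniform continuity of $\varphi$ implies the existence of $\eta > 0$ such that $|\varphi(x)| \ge \frac{\delta}{2}$ in $B(x_n, \eta)$. Show that this yields $\int_{\mathbb{R}^N} |\varphi|^p\,dx = \infty$.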
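Exercise 3.18 Show that the result in Exercise 3.17 is false in general if one only assumes that $\varphi$ is continuous.
Hint: Consider
$$f_n(x) = \begin{cases} nx + 1 & \text{if } -\frac{1}{n} \le x \le 0, \\ 1 - nx & \text{if } 0 \le x \le \frac{1}{n}, \\ 0 & \text{if } x \notin \big[ -\frac{1}{n}, \frac{1}{n} \big], \end{cases}$$
defined on $\mathbb{R}$ and set $\varphi(x) = \sum_{n=1}^\infty n^{1/p} f_n(x - n)$.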
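3.2 Space $L^\infty(X, \mathcal{E}, \mu)$
Let $\varphi : X \to \overline{\mathbb{R}}$ be a Borel function. We say that $\varphi$ is essentially bounded if there exists $M > 0$ such that $\mu(|\varphi| > M) = 0$. In this case, we set
$$\|\varphi\|_\infty = \inf\{M \ge 0 \mid \mu(|\varphi| > M) = 0\} . \tag{3.7}$$
We denote by $\mathcal{L}^\infty(X, \mathcal{E}, \mu)$ the class of all essentially bounded functions.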
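Example 3.19 The function $\varphi : (0, 1] \to \mathbb{R}$ defined by
$$\varphi(x) = \begin{cases} 1 & \text{if } x \ne \frac{1}{n} \\ n & \text{if } x = \frac{1}{n} \end{cases}$$
is essentially bounded and $\|\varphi\|_\infty = 1$.
Example 3.20 Let $\mu$ be the counting measure on $\mathbb{N}$. In the following we will use the notation $\ell^\infty$ for the space $\mathcal{L}^\infty(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$. We have
$$\ell^\infty = \Big\{ (x_n)_n \ \Big|\ x_n \in \mathbb{R},\ \sup_n |x_n| < \infty \Big\}.$$
Observe that
$$\ell^p \subset \ell^\infty \qquad \forall p \in [1, \infty).$$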
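Remark 3.21 Recalling that $t \mapsto \mu(|\varphi| > t)$ is right continuous (see Proposition 2.28), we conclude that
$$M_n \downarrow M_0 \ \ \&\ \ \mu(|\varphi| > M_n) = 0 \implies \mu(|\varphi| > M_0) = 0 .$$
So, the infimum in (3.7) is actually a minimum. In particular, for any $\varphi \in \mathcal{L}^\infty(X, \mathcal{E}, \mu)$,
$$|\varphi(x)| \le \|\varphi\|_\infty \qquad \text{for a.e. } x \in X. \tag{3.8}$$
In order to construct a vector space on which $\|\cdot\|_\infty$ is a norm we proceed as in the previous section, defining $L^\infty(X, \mathcal{E}, \mu)$ as the quotient space of $\mathcal{L}^\infty(X, \mathcal{E}, \mu)$ modulo the equivalence relation introduced in (3.1). So, $L^\infty(X, \mathcal{E}, \mu)$ is obtained by identifying functions in $\mathcal{L}^\infty(X, \mathcal{E}, \mu)$ that coincide almost everywhere.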
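Exercise 3.22 Show that $L^\infty(X, \mathcal{E}, \mu)$ is a vector space and $\|\cdot\|_\infty$ is a norm in $L^\infty(X, \mathcal{E}, \mu)$.
Hint: use (3.8). For instance, for any $\alpha \ne 0$, we have $|\alpha\varphi(x)| \le |\alpha|\,\|\varphi\|_\infty$ for a.e. $x \in X$. So, $\|\alpha\varphi\|_\infty \le |\alpha|\,\|\varphi\|_\infty$. Hence, we also have
$$\|\varphi\|_\infty = \Big\| \frac{1}{\alpha}\,\alpha\varphi \Big\|_\infty \le \frac{1}{|\alpha|} \|\alpha\varphi\|_\infty .$$
Thus, $\|\alpha\varphi\|_\infty = |\alpha|\,\|\varphi\|_\infty$.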
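Proposition 3.23 $L^\infty(X, \mathcal{E}, \mu)$ is a Banach space.
Proof. For a given Cauchy sequence $(\varphi_n)_n$ in $L^\infty(X, \mathcal{E}, \mu)$, let us set, for any $n, m \in \mathbb{N}$,
$$A_n = \{|\varphi_n| > \|\varphi_n\|_\infty\}, \qquad B_{m,n} = \{|\varphi_n - \varphi_m| > \|\varphi_n - \varphi_m\|_\infty\}.$$
Observe that, in view of Remark 3.21,
$$\mu(A_n) = 0 \ \ \&\ \ \mu(B_{m,n}) = 0 \qquad \forall m, n \in \mathbb{N}.$$
Therefore,
$$X_0 := \Big( \bigcup_n A_n \Big) \cup \Big( \bigcup_{m,n} B_{m,n} \Big)$$
has measure zero and $(\varphi_n)_n$ is a Cauchy sequence for uniform convergence on $X_0^c$. Thus, a Borel function $\varphi : X \to \mathbb{R}$ exists such that $\varphi_n \to \varphi$ uniformly on $X_0^c$. This suffices to get the conclusion.
Corollary 3.24 Let $(\varphi_n)_n \subset L^\infty(X, \mathcal{E}, \mu)$ be such that $\varphi_n \to \varphi$ in $L^\infty$. Then $\varphi_n \to \varphi$ a.e.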
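Exercise 3.25 Show that
$$\varphi \in L^p(X, \mathcal{E}, \mu),\ \psi \in L^\infty(X, \mathcal{E}, \mu) \implies \varphi\psi \in L^p(X, \mathcal{E}, \mu)$$
and
$$\|\varphi\psi\|_p \le \|\varphi\|_p \|\psi\|_\infty .$$
Notation 3.26 If $A \in \mathcal{B}(\mathbb{R}^N)$, we will use the abbreviated notation $L^\infty(A)$ for the space $L^\infty(A, \mathcal{B}(A), \lambda)$ where $\lambda$ is the Lebesgue measure.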
Example 3.27 It is easy to realize that spaces $L^\infty([0, 1])$ and $\ell^\infty$ fail to be separable (4).

1. Set
\[
\varphi_t(x) = \chi_{[0,t]}(x) \qquad \forall t, x \in [0, 1] .
\]
We have
\[
t \neq s \ \Longrightarrow\ \|\varphi_t - \varphi_s\|_\infty = 1.
\]
Let us argue by contradiction: assume that $(\psi_n)_n$ is a dense countable set in $L^\infty([0, 1])$. Then,
\[
L^\infty([0, 1]) \subset \bigcup_n B_{1/2}(\psi_n) \ \text{(5)},
\]
in contrast with the fact that no pair of functions of the uncountable family $(\varphi_t)_{t \in [0,1]}$ belongs to the same ball $B_{1/2}(\psi_n)$.

2. Let $(x_n)_n$ be a countable set in $\ell^\infty$ and define the function
\[
x : \mathbb{N} \to \mathbb{R}, \qquad
x(k) = \begin{cases} 0 & \text{if } |x_k(k)| \ge 1, \\ 1 + x_k(k) & \text{if } |x_k(k)| < 1. \end{cases}
\]
We have $x \in \ell^\infty$ and $\|x\|_\infty \le 2$. Furthermore, for every $n \in \mathbb{N}$,
\[
\|x - x_n\|_\infty = \sup_k |x(k) - x_n(k)| \ge |x(n) - x_n(n)| \ge 1;
\]
consequently $(x_n)_n$ is not dense in $\ell^\infty$.

(4) A metric space is said to be separable if it contains a countable dense subset.
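The diagonal construction in point 2 can be tried out on finitely many truncated sequences. The following is only a sketch under stated assumptions: the three sample sequences and the helper names `diagonal_element` and `sup_distance` are hypothetical, and indices are 0-based in the code while they are 1-based in the text.

```python
def diagonal_element(sequences):
    """Build x(k) from the diagonal entries x_k(k), as in point 2.

    sequences[k] is a list holding the first terms of the k-th sequence.
    """
    x = []
    for k, x_k in enumerate(sequences):
        d = x_k[k]                                  # diagonal entry x_k(k)
        x.append(0.0 if abs(d) >= 1 else 1.0 + d)   # the two cases of the definition
    return x


def sup_distance(x, y):
    """Sup norm of x - y over the indices that are available."""
    return max(abs(a - b) for a, b in zip(x, y))


# Hypothetical sample: three bounded sequences truncated to three terms.
samples = [
    [0.5, 2.0, -1.0],
    [3.0, -0.25, 0.0],
    [1.0, 1.0, 1.5],
]
x = diagonal_element(samples)
distances = [sup_distance(x, s) for s in samples]
```

Each entry of `distances` is at least 1, because the k-th coordinate alone already contributes a gap of at least 1. This is exactly why no countable family can be dense.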
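Proposition 3.28 Let $p \in [1, +\infty)$ and $\varphi \in L^p(X, \mathcal{E}, \mu) \cap L^\infty(X, \mathcal{E}, \mu)$. Then,
\[
\varphi \in \bigcap_{q \ge p} L^q(X, \mathcal{E}, \mu) \quad \& \quad \lim_{q \to \infty} \|\varphi\|_q = \|\varphi\|_\infty .
\]
Proof. For $q \ge p$ we have
\[
|\varphi(x)|^q \le \|\varphi\|_\infty^{q-p}\, |\varphi(x)|^p \qquad \text{for a.e. } x \in X,
\]
by which, after integration,
\[
\|\varphi\|_q \le \|\varphi\|_p^{\frac{p}{q}}\, \|\varphi\|_\infty^{1 - \frac{p}{q}} .
\]
Consequently $\varphi \in \bigcap_{q \ge p} L^q(X, \mathcal{E}, \mu)$ and
\[
\limsup_{q \to \infty} \|\varphi\|_q \le \|\varphi\|_\infty . \tag{3.9}
\]
Conversely, let $0 < a < \|\varphi\|_\infty$ (for $\|\varphi\|_\infty = 0$ the conclusion is trivial). By Markov's inequality, for every $q \ge p$,
\[
\mu(|\varphi| > a) = \mu(|\varphi|^q > a^q) \le a^{-q}\, \|\varphi\|_q^q .
\]
Consequently,
\[
\|\varphi\|_q \ge a\, \mu(|\varphi| > a)^{1/q},
\]
whence, since $0 < \mu(|\varphi| > a) < \infty$,
\[
\liminf_{q \to \infty} \|\varphi\|_q \ge a.
\]
Since $a$ is any number less than $\|\varphi\|_\infty$, we conclude that
\[
\liminf_{q \to \infty} \|\varphi\|_q \ge \|\varphi\|_\infty . \tag{3.10}
\]
From (3.9) and (3.10) the conclusion follows.

(5) Given a metric space $(Y, d)$, for any $y_0 \in Y$ and $r > 0$ we denote by $B_r(y_0)$ the open ball of radius $r$ centered at $y_0$, i.e. $B_r(y_0) = \{ y \in Y \mid d(y, y_0) < r \}$.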
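Corollary 3.29 Let $\mu$ be finite and let $\varphi \in L^\infty(X, \mathcal{E}, \mu)$. Then,
\[
\varphi \in \bigcap_{p \ge 1} L^p(X, \mathcal{E}, \mu) \quad \& \quad \lim_{p \to \infty} \|\varphi\|_p = \|\varphi\|_\infty . \tag{3.11}
\]
Proof. For $p \ge 1$ we have
\[
\int_X |\varphi(x)|^p\, \mu(dx) \le \mu(X)\, \|\varphi\|_\infty^p .
\]
So, $\varphi \in \bigcap_p L^p(X, \mathcal{E}, \mu)$. The conclusion follows from Proposition 3.28.

It is noteworthy that
\[
\bigcap_{p \ge 1} L^p(X, \mathcal{E}, \mu) \neq L^\infty(X, \mathcal{E}, \mu) .
\]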
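The limit in (3.11) can be observed numerically on a finite measure space. The sketch below assumes a hypothetical three-point space with masses `weights` (total mass 1) and a function with values `values`; the helper `p_norm` is not part of the notes.

```python
def p_norm(values, weights, p):
    """(sum_i w_i |v_i|^p)^(1/p) for a measure with point masses w_i."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)


# Hypothetical data: a function on three points and a probability measure.
values = [0.5, 1.0, 2.0]
weights = [0.5, 0.3, 0.2]

# Essential supremum: largest |value| on a point of positive mass.
sup_norm = max(abs(v) for v, w in zip(values, weights) if w > 0)

exponents = [1, 2, 4, 8, 16, 32, 64, 128]
norms = [p_norm(values, weights, p) for p in exponents]
```

With total mass 1 the list `norms` is nondecreasing and bounded by `sup_norm`, in agreement with the estimate used in the proof of Corollary 3.29. Very large exponents would overflow in floating point, so the sketch stops at 128.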
Exercise 3.30 Show that
\[
\varphi(x) := \log x \qquad \forall x \in (0, 1]
\]
belongs to $L^p((0, 1])$ for all $p \in [1, \infty)$, but $\varphi \notin L^\infty((0, 1])$.
3.3 Convergence in measure
We now present a kind of convergence for sequences of Borel functions which
is of considerable importance in probability theory.
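Definition 3.31 A sequence $\varphi_n : X \to \mathbb{R}$ of Borel functions is said to converge in measure to a Borel function $\varphi$ if for every $\varepsilon > 0$:
\[
\mu(|\varphi_n - \varphi| \ge \varepsilon) \to 0 \quad \text{as } n \to +\infty.
\]
Let us compare convergence in measure with other kinds of convergence.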
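Proposition 3.32 Let $\varphi_n, \varphi : X \to \mathbb{R}$ be Borel functions. The following holds:
1. If $\varphi_n \to \varphi$ a.e. and $\mu(X) < +\infty$, then $\varphi_n \to \varphi$ in measure;
2. If $\varphi_n \to \varphi$ in measure, then there exists a subsequence $(\varphi_{n_k})_k$ such that $\varphi_{n_k} \to \varphi$ a.e.;
3. If $1 \le p \le +\infty$ and $\varphi_n \to \varphi$ in $L^p$, then $\varphi_n \to \varphi$ in measure.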
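Proof. 1. Fix $\varepsilon, \eta > 0$. According to Theorem 2.25 there exists $E \in \mathcal{E}$ such that $\mu(E) < \eta$ and $\varphi_n \to \varphi$ uniformly in $X \setminus E$. Then, for $n$ sufficiently large
\[
\{ |\varphi_n - \varphi| \ge \varepsilon \} \subset E,
\]
by which
\[
\mu(|\varphi_n - \varphi| \ge \varepsilon) \le \mu(E) < \eta .
\]
2. For every $k \in \mathbb{N}$ we have
\[
\mu\Big( |\varphi_n - \varphi| \ge \frac{1}{k} \Big) \to 0 \quad \text{as } n \to \infty;
\]
consequently, we can construct an increasing sequence $(n_k)_k$ of positive integers such that
\[
\mu\Big( |\varphi_{n_k} - \varphi| \ge \frac{1}{k} \Big) < \frac{1}{2^k} \qquad \forall k \in \mathbb{N}.
\]
Now set
\[
A_k = \bigcup_{i=k}^{\infty} \Big\{ |\varphi_{n_i} - \varphi| \ge \frac{1}{i} \Big\}, \qquad A = \bigcap_{k=1}^{\infty} A_k .
\]
Observe that $\mu(A_k) \le \sum_{i=k}^{\infty} \frac{1}{2^i}$ for every $k \in \mathbb{N}$. Since $A_k \downarrow A$, Proposition 1.16 implies
\[
\mu(A) = \lim_{k \to \infty} \mu(A_k) = 0.
\]
For any $x \in A^c$ there exists $k \in \mathbb{N}$ such that $x \in A_k^c$, that is
\[
|\varphi_{n_i}(x) - \varphi(x)| < \frac{1}{i} \qquad \forall i \ge k.
\]
This shows that $(\varphi_{n_k})_k$ converges to $\varphi$ in $A^c$.
3. Let $\varepsilon > 0$ be fixed. First assume $1 \le p < \infty$. Then Markov's inequality implies
\[
\mu(|\varphi_n - \varphi| > \varepsilon) \le \frac{1}{\varepsilon^p} \int_X |\varphi_n - \varphi|^p\, d\mu \to 0 \quad \text{as } n \to +\infty.
\]
If $p = \infty$, for large $n$ we have $|\varphi_n - \varphi| \le \varepsilon$ a.e. in $X$, by which $\mu(|\varphi_n - \varphi| > \varepsilon) = 0$. In both cases, since $\{ |\varphi_n - \varphi| \ge \varepsilon \} \subset \{ |\varphi_n - \varphi| > \varepsilon/2 \}$ and $\varepsilon$ is arbitrary, the condition of Definition 3.31 follows.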
Exercise 3.33 Show that the conclusion of Part 1 in Proposition 3.32 is false in general if $\mu(X) = \infty$.
Hint: Consider $f_n = \chi_{[n, +\infty)}$ in $\mathbb{R}$.
Example 3.34 Consider the sequence constructed in Example 3.12: it converges to 0 in $L^1([0, 1))$ and, consequently, in measure. This example shows that Part 2 of Proposition 3.32 and Part (i) of Proposition 3.10 only hold for a subsequence.
Exercise 3.35 Give an example to show that the converse of Part 3 in Proposition 3.32 is not true in general.
Hint: Consider the sequence $f_n = n\, \chi_{[0, \frac{1}{n})}$ in $[0, 1]$.
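The two quantities relevant to the hint of Exercise 3.35 can be tabulated exactly with rational arithmetic. This is only an illustrative sketch: the helper names are hypothetical, and the closed-form expressions inside them are read off from the shape of $f_n$ (height $n$ on an interval of length $1/n$) rather than computed by numerical integration.

```python
from fractions import Fraction


def measure_of_large_set(n, eps):
    """Lebesgue measure of {x in [0,1] : |f_n(x)| >= eps} for f_n = n * chi_[0,1/n)."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    # f_n equals n on [0, 1/n) and 0 elsewhere.
    return Fraction(1, n) if eps <= n else Fraction(0)


def l1_norm(n):
    """Integral over [0,1] of |f_n|: height n times width 1/n."""
    return n * Fraction(1, n)


eps = Fraction(1, 2)
indices = (1, 10, 100, 1000)
measures = [measure_of_large_set(n, eps) for n in indices]
norms = [l1_norm(n) for n in indices]
```

The entries of `measures` shrink like $1/n$, while every entry of `norms` equals 1, so the sequence tends to 0 in measure but its $L^1$ norms do not tend to 0.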
3.4 Convergence and approximation in $L^p$
In this section, we will exhibit techniques to derive convergence in mean of order $p$ from a.e. convergence. Then, we will show that all elements of $L^p(X, \mathcal{E}, \mu)$ can be approximated in mean by continuous functions.
3.4.1 Convergence results
In this section we shall use the abbreviated notation $L^p(X)$ for $L^p(X, \mathcal{E}, \mu)$ when there is no danger of confusion.
The following is a direct consequence of Fatou's Lemma and Lebesgue's Theorem.
Corollary 3.36 Let $1 \le p < \infty$ and let $(\varphi_n)_n$ be a sequence in $L^p(X)$ such that $\varphi_n \to \varphi$ a.e.
(i) If $(\varphi_n)_n$ is bounded in $L^p(X)$, then $\varphi \in L^p(X)$ and
\[
\|\varphi\|_p \le \liminf_{n \to \infty} \|\varphi_n\|_p .
\]
(ii) If, for some $\psi \in L^p(X)$, $|\varphi_n(x)| \le \psi(x)$ for all $n \in \mathbb{N}$ and a.e. $x \in X$, then $\varphi \in L^p(X)$ and $\varphi_n \to \varphi$ in $L^p$.
Exercise 3.37 Show that, for $p = \infty$, point (i) above is still true, while (ii) is false.
Hint: consider the sequence $\varphi_n(x) = \chi_{(\frac{1}{n}, 1)}(x)$ in $(0, 1)$.
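Now, observe that, since $\big|\, \|\varphi_n\|_p - \|\varphi\|_p \big| \le \|\varphi_n - \varphi\|_p$, the following holds:
\[
\varphi_n \to \varphi \ \text{in } L^p \ \Longrightarrow\ \|\varphi_n\|_p \to \|\varphi\|_p .
\]
Then a necessary condition for convergence in $L^p(X)$ is convergence of $L^p$-norms. Our next result shows that, if $\varphi_n \to \varphi$ a.e., such a condition is also sufficient.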
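Proposition 3.38 Let $1 \le p < \infty$ and let $(\varphi_n)_n$ be a sequence in $L^p(X)$ such that $\varphi_n \to \varphi$ a.e. If $\varphi \in L^p(X)$ and $\|\varphi_n\|_p \to \|\varphi\|_p$, then $\varphi_n \to \varphi$ in $L^p$.
Proof (6). Define
\[
\psi_n(x) = \frac{|\varphi_n(x)|^p + |\varphi(x)|^p}{2} - \Big| \frac{\varphi_n(x) - \varphi(x)}{2} \Big|^p \qquad \forall x \in X .
\]
Since $p \ge 1$, a simple convexity argument shows that $\psi_n \ge 0$. Moreover, $\psi_n \to |\varphi|^p$ a.e. Therefore, Fatou's Lemma yields, using $\|\varphi_n\|_p^p \to \|\varphi\|_p^p$ in the second step,
\[
\int_X |\varphi|^p\, d\mu \le \liminf_{n \to \infty} \int_X \psi_n\, d\mu
= \int_X |\varphi|^p\, d\mu - \limsup_{n \to \infty} \int_X \Big| \frac{\varphi_n(x) - \varphi(x)}{2} \Big|^p d\mu .
\]
So, $\limsup_{n \to \infty} \|\varphi_n - \varphi\|_p \le 0$, by which $\varphi_n \to \varphi$ in $L^p$.

(6) By Novinger, 1972.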
The results below generalize Vitali's uniform summability condition, and give applications to $L^p(X)$ for $p \ge 1$. We begin by giving the following definition.
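Definition 3.39 Let $1 \le p < \infty$. A sequence $(\varphi_n)_n$ in $L^p(X)$ is said to be tight if for any $\varepsilon > 0$ there exists $A_\varepsilon \in \mathcal{E}$ such that
\[
\mu(A_\varepsilon) < \infty \quad \& \quad \int_{A_\varepsilon^c} |\varphi_n|^p\, d\mu < \varepsilon \qquad \forall n \in \mathbb{N}.
\]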
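Corollary 3.40 Let $1 \le p < \infty$ and let $(\varphi_n)_n$ be a sequence in $L^p(X)$ satisfying the following:
(i) $\varphi_n \to \varphi$ a.e.
(ii) for every $\varepsilon > 0$ there exists $\delta > 0$ such that
\[
\mu(A) < \delta \ \Longrightarrow\ \int_A |\varphi_n|^p\, d\mu < \varepsilon \qquad \forall n \in \mathbb{N}.
\]
(iii) $(\varphi_n)_n$ is tight.
Then, $\varphi \in L^p(X)$ and $\varphi_n \to \varphi$ in $L^p$.
Proof. Let us set $\psi_n = |\varphi_n|^p$. Then, $(\psi_n)_n$ is uniformly $\mu$-summable, satisfies (2.25) and converges to $|\varphi|^p$ a.e. in $X$. Therefore, Theorem 2.67 implies $\varphi \in L^p(X)$ and
\[
\|\varphi_n\|_p^p = \int_X \psi_n\, d\mu \to \|\varphi\|_p^p .
\]
The conclusion now follows from Proposition 3.38.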
Remark 3.41 If $\mu$ is finite, then, by taking $A_\varepsilon = X$ we deduce that every sequence is tight; hence, (i) and (ii) of Corollary 3.40 provide sufficient conditions for convergence in $L^p(X)$.
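Corollary 3.42 Assume $\mu(X) < \infty$. Let $1 < q < \infty$ and let $(\varphi_n)_n$ be a bounded sequence in $L^q(X)$ such that $\varphi_n \to \varphi$ a.e. Then, $\varphi \in \bigcap_{1 \le p \le q} L^p(X)$ and $\varphi_n \to \varphi$ in $L^p$ for any $p \in [1, q)$.
Proof. Let $M \ge 0$ be such that $\|\varphi_n\|_q \le M$ for any $n \in \mathbb{N}$. Point (i) of Corollary 3.36 implies $\varphi \in L^q(X)$; consequently, by Corollary 3.6, $\varphi \in \bigcap_{1 \le p \le q} L^p(X)$. Let $1 \le p < q$: by Hölder's inequality, for any $A \in \mathcal{E}$ we have
\[
\int_A |\varphi_n|^p\, d\mu \le \Big( \int_A |\varphi_n|^q\, d\mu \Big)^{\frac{p}{q}} (\mu(A))^{1 - \frac{p}{q}} \le M^p\, (\mu(A))^{1 - \frac{p}{q}} .
\]
The conclusion follows from Corollary 3.40.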
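Corollary 3.43 Assume $\mu(X) < \infty$. Let $(\varphi_n)_n$ be a sequence in $L^1(X)$ such that $\varphi_n \to \varphi$ a.e. and suppose that, for some $M \ge 0$,
\[
\int_X |\varphi_n| \log^+(|\varphi_n|)\, d\mu \le M \ \text{(7)} \qquad \forall n \in \mathbb{N}.
\]
Then, $\varphi \in L^1(X)$ and $\varphi_n \to \varphi$ in $L^1$.
Proof. Fix $\varepsilon \in (0, 1)$, $t \in X$, and apply estimate (A.5) with $x = \frac{1}{\varepsilon}$ and $y = |\varphi_n(t)|$ to obtain
\[
|\varphi_n(t)| \le \varepsilon\, |\varphi_n(t)| \log(|\varphi_n(t)|) + e^{\frac{1}{\varepsilon}} \le \varepsilon\, |\varphi_n(t)| \log^+(|\varphi_n(t)|) + e^{\frac{1}{\varepsilon}} .
\]
Consequently, for any $A \in \mathcal{E}$,
\[
\int_A |\varphi_n|\, d\mu \le \varepsilon M + \mu(A)\, e^{\frac{1}{\varepsilon}} \qquad \forall n \in \mathbb{N}.
\]
This implies that $(\varphi_n)_n$ is uniformly $\mu$-summable. The conclusion follows from Theorem 2.67.
Exercise 3.44 Show how Corollary 3.43 can be adapted to generic measures for tight sequences.

(7) Here, $\log^+(x) = (\log x) \vee 0$ for any $x \ge 0$.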
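3.4.2 Dense subsets of $L^p$
Let $\Omega \subset \mathbb{R}^N$ be an open set and denote by $\mathcal{C}_c(\Omega)$ the space of all real-valued continuous functions on $\Omega$ which are zero outside a compact set $K \subset \Omega$. Clearly, if $\mu$ is a Radon measure on $(\Omega, \mathcal{B}(\Omega))$, then
\[
\mathcal{C}_c(\Omega) \subset L^p(\Omega, \mu) \ \text{(8)} \qquad \forall p \in [1, \infty].
\]
Theorem 3.45 Let $\Omega \subset \mathbb{R}^N$ be an open set and let $\mu$ be a Radon measure on $(\Omega, \mathcal{B}(\Omega))$. Then, for any $p \in [1, +\infty)$, $\mathcal{C}_c(\Omega)$ is dense in $L^p(\Omega, \mu)$.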
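Proof. We begin by proving the theorem when $\Omega = \mathbb{R}^N$. We shall start by imposing additional assumptions and split the reasoning into several steps, each of which will achieve a higher degree of generality.

1. Let us show how to approximate, by continuous functions with compact support, any function $\varphi \in L^p(\mathbb{R}^N, \mu)$ that satisfies, for some $M, r > 0$ (9),
\[
0 \le \varphi(x) \le M \qquad x \in \mathbb{R}^N \ \text{a.e.} \tag{3.12}
\]
\[
\varphi(x) = 0 \qquad x \in \mathbb{R}^N \setminus B_r \ \text{a.e.} \tag{3.13}
\]
Let $\varepsilon > 0$. Since $\mu$ is Radon, we have $\mu(B_r) < \infty$. Then, by Lusin's Theorem (Theorem 2.27), there exists a function $\varphi_\varepsilon \in \mathcal{C}_c(\mathbb{R}^N)$ such that
\[
\mu(\varphi_\varepsilon \neq \varphi) < \frac{\varepsilon}{(2M)^p} \quad \& \quad \|\varphi_\varepsilon\|_\infty \le M .
\]
Then,
\[
\int_{\mathbb{R}^N} |\varphi - \varphi_\varepsilon|^p\, d\mu \le (2M)^p\, \mu(\varphi_\varepsilon \neq \varphi) < \varepsilon .
\]

2. We now proceed to remove assumption (3.13). Let $\varphi \in L^p(\mathbb{R}^N, \mu)$ be a function satisfying (3.12) and fix $\varepsilon > 0$. Since $B_n \uparrow \mathbb{R}^N$, owing to Lebesgue's Theorem,
\[
\int_{B_n^c} |\varphi|^p\, d\mu = \int_{\mathbb{R}^N} |\varphi|^p\, \chi_{B_n^c}\, d\mu \to 0 \quad \text{as } n \to \infty .
\]
Then, there exists $n_\varepsilon \in \mathbb{N}$ such that
\[
\int_{B_{n_\varepsilon}^c} |\varphi|^p\, d\mu < \varepsilon^p . \tag{3.14}
\]
Set $\varphi_\varepsilon := \varphi\, \chi_{B_{n_\varepsilon}}$. In view of Step 1, there exists $\psi_\varepsilon \in \mathcal{C}_c(\mathbb{R}^N)$ such that $\|\varphi_\varepsilon - \psi_\varepsilon\|_p < \varepsilon$. Then, by (3.14) we conclude that
\[
\|\varphi - \psi_\varepsilon\|_p \le \|\varphi - \varphi_\varepsilon\|_p + \|\varphi_\varepsilon - \psi_\varepsilon\|_p = \|\varphi\, \chi_{B_{n_\varepsilon}^c}\|_p + \|\varphi_\varepsilon - \psi_\varepsilon\|_p < 2\varepsilon .
\]

3. Next, let us dispense with the upper bound in (3.12). Since
\[
0 \le \varphi_n(x) := \min\{\varphi(x), n\} \uparrow \varphi(x) \qquad x \in \mathbb{R}^N \ \text{a.e.},
\]
we have that $\varphi_n \to \varphi$ in $L^p$. Therefore, there exists $n_\varepsilon \in \mathbb{N}$ such that $\|\varphi - \varphi_{n_\varepsilon}\|_p < \varepsilon$. In view of Step 2, there exists $\psi_\varepsilon \in \mathcal{C}_c(\mathbb{R}^N)$ such that $\|\varphi_{n_\varepsilon} - \psi_\varepsilon\|_p < \varepsilon$. Then,
\[
\|\varphi - \psi_\varepsilon\|_p \le \|\varphi - \varphi_{n_\varepsilon}\|_p + \|\varphi_{n_\varepsilon} - \psi_\varepsilon\|_p < 2\varepsilon .
\]
Finally, the extra assumption that $\varphi \ge 0$ can be disposed of by applying Step 3 to $\varphi^+$ and $\varphi^-$. The proof is thus complete in the case $\Omega = \mathbb{R}^N$.

(8) Hereafter we shall use the abbreviated notation $(\Omega, \mu)$ for measure space $(\Omega, \mathcal{B}(\Omega), \mu)$.
(9) Hereafter, $B_r = B_r(0)$.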
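Next consider $\Omega \subset \mathbb{R}^N$ an open set and $\varphi \in L^p(\Omega, \mu)$. The function
\[
\tilde\varphi(x) = \begin{cases} \varphi(x) & \text{if } x \in \Omega, \\ 0 & \text{if } x \in \mathbb{R}^N \setminus \Omega \end{cases}
\]
belongs to $L^p(\mathbb{R}^N, \tilde\mu)$ where $\tilde\mu(A) = \mu(A \cap \Omega)$ for every $A \in \mathcal{B}(\mathbb{R}^N)$. Since $\tilde\mu$ is a Radon measure on $(\mathbb{R}^N, \mathcal{B}(\mathbb{R}^N))$, then there exists $\varphi_\varepsilon \in \mathcal{C}_c(\mathbb{R}^N)$ such that
\[
\int_{\mathbb{R}^N} |\tilde\varphi - \varphi_\varepsilon|^p\, d\tilde\mu \le \varepsilon .
\]
Let $(V_n)_n$ be a sequence of open sets of $\mathbb{R}^N$ such that
\[
\overline{V}_n \ \text{is compact}, \qquad \overline{V}_n \subset V_{n+1}, \qquad \bigcup_n V_n = \Omega \tag{3.15}
\]
(for example, we can choose $V_n = B_n \cap \{ x \in \Omega \mid d_{\Omega^c}(x) > \frac{1}{n} \}$ (10)) and set
\[
\varphi_n(x) = \varphi_\varepsilon(x)\, \frac{d_{V_{n+1}^c}(x)}{d_{V_{n+1}^c}(x) + d_{V_n}(x)}, \qquad x \in \Omega .
\]
We have $\varphi_n = 0$ outside $V_{n+1}$, by which $\varphi_n \in \mathcal{C}_c(\Omega)$. Furthermore $\varphi_n = \varphi_\varepsilon$ in $V_n$ and $|\varphi_n| \le |\varphi_\varepsilon|$; then, since $V_n \uparrow \Omega$, we deduce $\varphi_n \to \varphi_\varepsilon$ in $L^p(\Omega, \mu)$. Therefore, there exists $n_\varepsilon \in \mathbb{N}$ such that
\[
\int_\Omega |\varphi_{n_\varepsilon} - \varphi_\varepsilon|^p\, d\mu < \varepsilon .
\]
Then,
\[
\int_\Omega |\varphi - \varphi_{n_\varepsilon}|^p\, d\mu
\le 2^{p-1} \int_\Omega |\varphi - \varphi_\varepsilon|^p\, d\mu + 2^{p-1} \int_\Omega |\varphi_\varepsilon - \varphi_{n_\varepsilon}|^p\, d\mu
\]
\[
= 2^{p-1} \int_{\mathbb{R}^N} |\tilde\varphi - \varphi_\varepsilon|^p\, d\tilde\mu + 2^{p-1} \int_\Omega |\varphi_\varepsilon - \varphi_{n_\varepsilon}|^p\, d\mu \le 2^p \varepsilon .
\]

(10) We recall that, given a nonempty set $S \subset \mathbb{R}^N$, $d_S(x)$ denotes the distance of $x$ from $S$, see Appendix A.1.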
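Exercise 3.46 Given $\Omega \subset \mathbb{R}^N$ an open set, explain why $\mathcal{C}_c(\Omega)$ is not dense in $L^\infty(\Omega)$ (with respect to the Lebesgue measure), and characterize the closure of $\mathcal{C}_c(\Omega)$ in $L^\infty(\Omega)$.
Hint: show that the closure is given by the set $\mathcal{C}_0(\Omega)$ of the continuous functions $\varphi : \Omega \to \mathbb{R}$ satisfying
\[
\forall \varepsilon > 0 \ \ \exists K \subset \Omega \ \text{compact s.t.} \ \sup_{x \in \Omega \setminus K} |\varphi(x)| \le \varepsilon .
\]
In particular, if $\Omega = \mathbb{R}^N$, we have
\[
\mathcal{C}_0(\mathbb{R}^N) = \Big\{ \varphi : \mathbb{R}^N \to \mathbb{R} \;\Big|\; \varphi \ \text{continuous} \ \& \ \lim_{|x| \to \infty} \varphi(x) = 0 \Big\},
\]
while, if $\Omega$ is bounded,
\[
\mathcal{C}_0(\Omega) = \Big\{ \varphi : \Omega \to \mathbb{R} \;\Big|\; \varphi \ \text{continuous} \ \& \ \lim_{d_{\Omega^c}(x) \to 0} \varphi(x) = 0 \Big\}.
\]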
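Proposition 3.47 Let $A \in \mathcal{B}(\mathbb{R}^N)$ and $\mu$ a Radon measure on $(A, \mathcal{B}(A))$. Then $L^p(A, \mu)$ is separable for $1 \le p < \infty$.
Proof. First assume $A = \mathbb{R}^N$. Denote by $\mathcal{R}$ the set of the rectangles in $\mathbb{R}^N$ of the form
\[
R = \prod_{k=1}^{N} [a_k, b_k), \qquad a_k, b_k \in \mathbb{Q}, \ a_k < b_k .
\]
Let $\mathcal{P}$ be the vector space on $\mathbb{Q}$ generated by $(\chi_R)_{R \in \mathcal{R}}$, that is
\[
\mathcal{P} = \Big\{ \sum_{i=1}^{n} c_i\, \chi_{R_i} \;\Big|\; n \in \mathbb{N}, \ c_i \in \mathbb{Q}, \ R_i \in \mathcal{R} \Big\}.
\]
Then $\mathcal{P}$ is countable. We are going to verify that $\mathcal{P}$ is dense in $L^p(\mathbb{R}^N, \mu)$ for $1 \le p < \infty$. Indeed, let $\varphi \in L^p(\mathbb{R}^N, \mu)$ and $\varepsilon > 0$. According to Theorem 3.45 there exists $\varphi_\varepsilon \in \mathcal{C}_c(\mathbb{R}^N)$ such that $\|\varphi - \varphi_\varepsilon\|_p \le \varepsilon$. Let $m \in \mathbb{N}$ be sufficiently large such that, setting $Q = [-m, m)^N$, we have $\operatorname{supp}(\varphi_\varepsilon) \subset Q$. Since $\mu$ is Radon, we have $\mu(Q) < \infty$. By the uniform continuity of $\varphi_\varepsilon$ we get the existence of $\delta > 0$ such that
\[
|\varphi_\varepsilon(x) - \varphi_\varepsilon(y)| \le \frac{\varepsilon}{(\mu(Q))^{1/p}} \qquad \forall x, y \in \mathbb{R}^N \ \text{s.t.} \ |x - y| \le \delta .
\]
Next split the cube $Q$ in a finite number of disjoint cubes $Q_1, \dots, Q_n \in \mathcal{R}$ such that $\operatorname{diam}(Q_i) \le \delta$, and define
\[
\psi_\varepsilon = \sum_{i=1}^{n} c_i\, \chi_{Q_i}
\]
where $c_i \in \mathbb{Q}$ is chosen in the interval $\big( \inf_{Q_i} \varphi_\varepsilon, \ \frac{\varepsilon}{(\mu(Q))^{1/p}} + \inf_{Q_i} \varphi_\varepsilon \big)$. Then $\psi_\varepsilon \in \mathcal{P}$ and $\|\varphi_\varepsilon - \psi_\varepsilon\|_\infty \le \frac{\varepsilon}{(\mu(Q))^{1/p}}$, by which we have
\[
\|\varphi - \psi_\varepsilon\|_p \le \|\varphi - \varphi_\varepsilon\|_p + \|\varphi_\varepsilon - \psi_\varepsilon\|_p \le \varepsilon + (\mu(Q))^{1/p}\, \|\varphi_\varepsilon - \psi_\varepsilon\|_\infty \le 2\varepsilon .
\]
If $A \in \mathcal{B}(\mathbb{R}^N)$, then the set
\[
\mathcal{P}_A = \Big\{ \sum_{i=1}^{n} c_i\, \chi_{R_i \cap A} \;\Big|\; n \in \mathbb{N}, \ c_i \in \mathbb{Q}, \ R_i \in \mathcal{R} \Big\}
\]
is dense in $L^p(A, \mu)$.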
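Remark 3.48 If $A \in \mathcal{B}(\mathbb{R})$ and $\mu$ is a Radon measure on $(A, \mathcal{B}(A))$, then the set
\[
\Big\{ \sum_{k=0}^{n-1} c_k\, \chi_{[t_k, t_{k+1}) \cap A} \;\Big|\; n \in \mathbb{N}, \ c_k, t_k \in \mathbb{Q}, \ t_0 < t_1 < \dots < t_n \Big\}
\]
is countable and dense in $L^p(A, \mu)$ for $1 \le p < \infty$.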
Exercise 3.49 $\ell^p$ is separable for $1 \le p < \infty$.
Hint: show that the set
\[
\mathcal{P} = \Big\{ (x_n)_n \;\Big|\; x_n \in \mathbb{Q}, \ \sup_{x_n \neq 0} n < \infty \Big\}
\]
is countable and dense in $\ell^p$.
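Our next result shows that the integral with respect to Lebesgue measure is translation continuous.
Proposition 3.50 Let $p \in [1, +\infty)$ and let $\varphi \in L^p(\mathbb{R}^N)$ (with respect to the Lebesgue measure). Then,
\[
\lim_{|h| \to 0} \int_{\mathbb{R}^N} |\varphi(x + h) - \varphi(x)|^p\, dx = 0 .
\]
Proof. Let $\varepsilon > 0$. Theorem 3.45 ensures the existence of $\varphi_\varepsilon \in \mathcal{C}_c(\mathbb{R}^N)$ such that $\|\varphi_\varepsilon - \varphi\|_p^p < \varepsilon$. Let $A_\varepsilon = \operatorname{supp}(\varphi_\varepsilon)$. Then, $B_\varepsilon := \{ x \in \mathbb{R}^N \mid d_{A_\varepsilon}(x) \le 1 \}$ is a compact set and, since the Lebesgue measure $\lambda$ is translation invariant, for $|h| \le 1$ we have
\[
\int_{\mathbb{R}^N} |\varphi(x + h) - \varphi(x)|^p\, dx \le 3^{p-1} \int_{\mathbb{R}^N} |\varphi(x + h) - \varphi_\varepsilon(x + h)|^p\, dx
\]
\[
+\, 3^{p-1} \int_{\mathbb{R}^N} |\varphi_\varepsilon(x + h) - \varphi_\varepsilon(x)|^p\, dx + 3^{p-1} \int_{\mathbb{R}^N} |\varphi_\varepsilon(x) - \varphi(x)|^p\, dx
\]
\[
\le 3^p \varepsilon + 3^{p-1}\, \lambda(B_\varepsilon) \sup_{|x - y| \le |h|} |\varphi_\varepsilon(x) - \varphi_\varepsilon(y)|^p .
\]
Therefore,
\[
\limsup_{|h| \to 0} \int_{\mathbb{R}^N} |\varphi(x + h) - \varphi(x)|^p\, dx \le 3^p \varepsilon .
\]
Since $\varepsilon$ is arbitrary, the conclusion follows.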
Chapter 4
Hilbert spaces
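4.1 Definitions and examples
Let $H$ be a real vector space.
Definition 4.1 A scalar product $\langle \cdot, \cdot \rangle$ in $H$ is a mapping $\langle \cdot, \cdot \rangle : H \times H \to \mathbb{R}$ with the following properties:
1. $\langle x, x \rangle \ge 0$ for all $x \in H$ and $\langle x, x \rangle = 0$ iff $x = 0$;
2. $\langle x, y \rangle = \langle y, x \rangle$ for all $x, y \in H$;
3. $\langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle$ for all $x, y, z \in H$ and $\alpha, \beta \in \mathbb{R}$.
A real pre-Hilbert space is a pair $(H, \langle \cdot, \cdot \rangle)$.
Remark 4.2 Since, for any $y \in H$, $0y = 0$, we have
\[
\langle x, 0 \rangle = 0 \langle x, y \rangle = 0 \qquad \forall x \in H .
\]
Let us set
\[
\|x\| = \sqrt{\langle x, x \rangle} \qquad \forall x \in H . \tag{4.1}
\]
The following inequality is fundamental.
Proposition 4.3 (Cauchy-Schwarz) Let $(H, \langle \cdot, \cdot \rangle)$ be a pre-Hilbert space. Then
\[
|\langle x, y \rangle| \le \|x\|\, \|y\| \qquad \forall x, y \in H . \tag{4.2}
\]
Moreover, equality holds iff $x$ and $y$ are linearly dependent.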
Proof. The conclusion is trivial if $y = 0$. So, we will suppose $y \neq 0$. In fact, to begin with, let $\|y\| = 1$. Then,
\[
0 \le \|x - \langle x, y \rangle y\|^2 = \|x\|^2 - \langle x, y \rangle^2 , \tag{4.3}
\]
whence the conclusion follows. In the general case, it suffices to apply the above inequality to $y / \|y\|$.
If $x$ and $y$ are linearly dependent, then it is clear that $|\langle x, y \rangle| = \|x\|\, \|y\|$. Conversely, if $|\langle x, y \rangle| = \|x\|\, \|y\|$ and $y \neq 0$, then (4.3), applied to $y / \|y\|$, implies that $x$ and $y$ are linearly dependent.
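Exercise 4.4 Define
\[
F(\lambda) = \|x + \lambda y\|^2 = \lambda^2 \|y\|^2 + 2\lambda \langle x, y \rangle + \|x\|^2 \qquad \forall \lambda \in \mathbb{R}.
\]
Observing that $F(\lambda) \ge 0$ for all $\lambda \in \mathbb{R}$, give an alternative proof of (4.2).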
Corollary 4.5 Let $(H, \langle \cdot, \cdot \rangle)$ be a pre-Hilbert space. Then the function $\|\cdot\|$ defined in (4.1) has the following properties:
1. $\|x\| \ge 0$ for all $x \in H$ and $\|x\| = 0$ iff $x = 0$;
2. $\|\alpha x\| = |\alpha|\, \|x\|$ for any $x \in H$ and $\alpha \in \mathbb{R}$;
3. $\|x + y\| \le \|x\| + \|y\|$ for all $x, y \in H$.
Function $\|\cdot\|$ is called the norm associated with $\langle \cdot, \cdot \rangle$.
Proof. The only assertion that needs a justification is property 3. For this, observe that for all $x, y \in H$ we have, by (4.2),
\[
\|x + y\|^2 = \langle x + y, x + y \rangle = \|x\|^2 + \|y\|^2 + 2\langle x, y \rangle \le \|x\|^2 + \|y\|^2 + 2\|x\|\, \|y\| = (\|x\| + \|y\|)^2 .
\]
Remark 4.6 It is easy to see that, in a pre-Hilbert space $(H, \langle \cdot, \cdot \rangle)$, the function
\[
d(x, y) = \|x - y\| \qquad \forall x, y \in H \tag{4.4}
\]
is a metric.
Definition 4.7 A pre-Hilbert space $(H, \langle \cdot, \cdot \rangle)$ is called a Hilbert space if it is complete with respect to the metric defined in (4.4).
Example 4.8 1. $\mathbb{R}^N$ is a Hilbert space with the scalar product
\[
\langle x, y \rangle = \sum_{k=1}^{N} x_k y_k ,
\]
where $x = (x_1, \dots, x_N)$, $y = (y_1, \dots, y_N) \in \mathbb{R}^N$.
2. Let $(X, \mathcal{E}, \mu)$ be a measure space. Then $L^2(X, \mathcal{E}, \mu)$, endowed with the scalar product
\[
\langle \varphi, \psi \rangle = \int_X \varphi(x) \psi(x)\, \mu(dx), \qquad \varphi, \psi \in L^2(X, \mathcal{E}, \mu),
\]
is a Hilbert space (completeness follows from Proposition 3.10).
3. Let $\ell^2$ be the space of all sequences of real numbers $x = (x_k)$ such that
\[
\sum_{k=1}^{\infty} x_k^2 < \infty .
\]
$\ell^2$ is a vector space with the usual operations,
\[
a(x_k) = (a x_k), \qquad (x_k) + (y_k) = (x_k + y_k), \qquad a \in \mathbb{R}, \ (x_k), (y_k) \in \ell^2 .
\]
The space $\ell^2$, endowed with the scalar product
\[
\langle x, y \rangle = \sum_{k=1}^{\infty} x_k y_k , \qquad x = (x_k), \ y = (y_k) \in \ell^2 ,
\]
is a Hilbert space. This is a special case of the above example, with $X = \mathbb{N}$, $\mathcal{E} = \mathcal{P}(\mathbb{N})$, and $\mu$ given by counting measure.
Exercise 4.9 1. Show that $\ell^2$ is complete arguing as follows. Take a Cauchy sequence $(x^{(n)})$ in $\ell^2$, that is, $x^{(n)} = (x^{(n)}_k)$.
(a) Show that, for any $k \in \mathbb{N}$, $(x^{(n)}_k)_{n \in \mathbb{N}}$ is a Cauchy sequence in $\mathbb{R}$, and deduce that the limit $x_k := \lim_{n \to \infty} x^{(n)}_k$ does exist.
(b) Show that $(x_k) \in \ell^2$.
(c) Show that $x^{(n)} \to (x_k)$ as $n \to \infty$.
2. Let $H = \mathcal{C}([-1, 1])$ be the linear space of all real continuous functions on $[-1, 1]$. Show that
(a) $H$ is a pre-Hilbert space with the scalar product
\[
\langle f, g \rangle = \int_{-1}^{1} f(t) g(t)\, dt
\]
(b) $H$ is not a Hilbert space.
Hint: let
\[
f_n(t) = \begin{cases} 1 & \text{if } t \in [1/n, 1] \\ nt & \text{if } t \in (-1/n, 1/n) \\ -1 & \text{if } t \in [-1, -1/n] \end{cases}
\]
and show that $(f_n)$ is a Cauchy sequence in $H$. Observe that, if $f_n \to f$ in $H$, then
\[
f(t) = \begin{cases} 1 & \text{if } t \in (0, 1] \\ -1 & \text{if } t \in [-1, 0) \end{cases}
\]
3. In a pre-Hilbert space $H$, show that the following parallelogram identity holds:
\[
\|x + y\|^2 + \|x - y\|^2 = 2\big( \|x\|^2 + \|y\|^2 \big) \qquad \forall x, y \in H . \tag{4.5}
\]
(One can prove that the parallelogram identity characterizes the norms that are associated with a scalar product.)
4.2 Orthogonal projections
Let $H$ be a Hilbert space with scalar product $\langle \cdot, \cdot \rangle$.
Definition 4.10 Two elements $x$ and $y$ of $H$ are said to be orthogonal if $\langle x, y \rangle = 0$. In this case, we write $x \perp y$. Two subsets $A, B$ of $H$ are said to be orthogonal ($A \perp B$) if $x \perp y$ for all $x \in A$ and $y \in B$.
The following proposition is the Hilbert space version of the Pythagorean Theorem.
Proposition 4.11 If $x_1, \dots, x_n$ are pairwise orthogonal vectors in $H$, then
\[
\|x_1 + x_2 + \dots + x_n\|^2 = \|x_1\|^2 + \|x_2\|^2 + \dots + \|x_n\|^2 .
\]
Exercise 4.12 Prove Proposition 4.11.
4.2.1 Projection onto a closed convex set
Definition 4.13 A set $K \subset H$ is said to be convex if, for any $x, y \in K$,
\[
[x, y] := \{ \lambda x + (1 - \lambda) y \mid \lambda \in [0, 1] \} \subset K .
\]
For instance, any subspace of $H$ is convex. Similarly, for any $x_0 \in H$ and $r > 0$ the ball
\[
B_r(x_0) = \{ x \in H \mid \|x - x_0\| < r \}
\]
is a convex set. We shall also use the notation $B(x_0, r)$ to denote such a set.
Exercise 4.14 Show that, if $(K_i)_{i \in I}$ are convex subsets of $H$, then $\bigcap_i K_i$ is convex.
We know that, in a finite dimensional space, a point $x$ has a nonempty projection onto a closed set, see Proposition A.2. The following result extends such a property to convex subsets of a Hilbert space.
Theorem 4.15 Let $K \subset H$ be a nonempty closed convex set. Then, for any $x \in H$ there exists a unique element $y_x = p_K(x) \in K$, called the orthogonal projection of $x$ onto $K$, such that
\[
\|x - y_x\| = \inf_{y \in K} \|x - y\| . \tag{4.6}
\]
Moreover, $p_K(x)$ is the unique solution of the problem
\[
\begin{cases} y \in K \\ \langle x - y, z - y \rangle \le 0 \quad \forall z \in K. \end{cases} \tag{4.7}
\]
[Figure 4.1: inequality (4.7) has a simple geometric meaning]
Proof. Let $d = \inf_{y \in K} \|x - y\|$. We shall split the reasoning into 4 steps.
1. Let $y_n \in K$ be a minimizing sequence, that is,
\[
\|x - y_n\| \to d \quad \text{as } n \to \infty . \tag{4.8}
\]
We claim that $(y_n)$ is a Cauchy sequence. Indeed, for any $m, n \in \mathbb{N}$, parallelogram identity (4.5) yields
\[
\|(x - y_n) + (x - y_m)\|^2 + \|(x - y_n) - (x - y_m)\|^2 = 2\|x - y_n\|^2 + 2\|x - y_m\|^2 .
\]
Hence, since $K$ is convex and $\frac{y_n + y_m}{2} \in K$,
\[
\|y_n - y_m\|^2 = 2\|x - y_n\|^2 + 2\|x - y_m\|^2 - 4 \Big\| x - \frac{y_n + y_m}{2} \Big\|^2 \le 2\|x - y_n\|^2 + 2\|x - y_m\|^2 - 4d^2 .
\]
So, $\|y_n - y_m\| \to 0$ as $m, n \to \infty$, as claimed.
2. Since $H$ is complete and $K$ is closed, $(y_n)$ converges to some $y_x \in K$ satisfying $\|x - y_x\| = d$. The existence of $y_x$ is thus proved.
3. We now proceed to show that (4.7) holds for any point $y \in K$ at which the infimum in (4.6) is attained. Let $z \in K$ and let $\lambda \in (0, 1]$. Since $\lambda z + (1 - \lambda) y \in K$, we have that $\|x - y\| \le \|x - y - \lambda(z - y)\|$. So,
\[
0 \ge \frac{1}{\lambda} \Big[ \|x - y\|^2 - \|x - y - \lambda(z - y)\|^2 \Big] = 2 \langle x - y, z - y \rangle - \lambda \|z - y\|^2 . \tag{4.9}
\]
Taking the limit as $\lambda \downarrow 0$ yields (4.7).
4. We will complete the proof showing that (4.7) has at most one solution. Let $y$ be another solution of (4.7). Then,
\[
\langle x - y_x, y - y_x \rangle \le 0 \quad \text{and} \quad \langle x - y, y_x - y \rangle \le 0 .
\]
Adding the above inequalities gives $\|y - y_x\|^2 \le 0$, or $y = y_x$.
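Characterization (4.7) is easy to test numerically in a case where the projection is known in closed form. The sketch below assumes $H = \mathbb{R}^2$ and $K$ the closed unit ball, for which the projection is the radial rescaling $x / \max(1, \|x\|)$; this choice of $K$, the sample points and the helper names are illustrative assumptions, not part of the notes.

```python
import math


def dot(u, v):
    """Euclidean scalar product."""
    return sum(a * b for a, b in zip(u, v))


def project_onto_unit_ball(x):
    """Orthogonal projection of x onto K = {y : ||y|| <= 1} in R^n."""
    norm = math.sqrt(dot(x, x))
    scale = 1.0 if norm <= 1.0 else 1.0 / norm
    return [scale * a for a in x]


def variational_gap(x, y, z):
    """<x - y, z - y>, which (4.7) requires to be <= 0 for every z in K."""
    return dot([a - b for a, b in zip(x, y)], [c - b for c, b in zip(z, y)])


x = [3.0, 4.0]
y = project_onto_unit_ball(x)          # approximately (0.6, 0.8)
# Hypothetical test points, all inside the closed unit ball.
test_points = [[1.0, 0.0], [0.0, -1.0], [0.6, 0.8], [-0.5, 0.5], [0.0, 0.0]]
gaps = [variational_gap(x, y, z) for z in test_points]
```

All entries of `gaps` are nonpositive up to rounding: the vector from the projection to $x$ makes an obtuse (or right) angle with every direction pointing into $K$.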
Exercise 4.16 Let $K \subset H$ be a nonempty closed convex set. Show that
\[
\langle x - y, p_K(x) - p_K(y) \rangle \ge \|p_K(x) - p_K(y)\|^2 \qquad \forall x, y \in H .
\]
Hint: apply (4.7) to $z = p_K(x)$ and $z = p_K(y)$.
Example 4.17 In an infinite dimensional Hilbert space the projection of a point onto a closed set may be empty (in absence of convexity). Indeed, let $Q$ consist of all sequences $x^{(n)} = (x^{(n)}_k)_{k \in \mathbb{N}} \in \ell^2$ such that
\[
x^{(n)}_k = \begin{cases} 0 & \text{if } k \neq n \\ 1 + \frac{1}{n} & \text{if } k = n \end{cases} \qquad (n \ge 1) .
\]
Then, $Q$ is closed. Indeed, since
\[
n \neq m \ \Longrightarrow\ \|x^{(n)} - x^{(m)}\|_2 > \sqrt{2} ,
\]
$Q$ has no cluster points in $H$. On the other hand, $Q$ has no element of minimal norm (i.e., 0 has no projection onto $Q$) as well, for
\[
d_Q(0) = \inf_{n \ge 1} \|x^{(n)}\|_2 = \inf_{n \ge 1} \Big( 1 + \frac{1}{n} \Big) = 1 ,
\]
but $\|x^{(n)}\|_2 > 1$ for every $n \ge 1$.
4.2.2 Projection onto a closed subspace
Theorem 4.15 applies, in particular, to closed subspaces of $H$. In this case, however, the variational inequality in (4.7) takes a special form.
Corollary 4.18 Let $M$ be a closed subspace of a Hilbert space $H$. Then $p_M(x)$ is the unique solution of
\[
\begin{cases} y \in M \\ \langle x - y, v \rangle = 0 \quad \forall v \in M. \end{cases} \tag{4.10}
\]
Proof. It suffices to show that (4.7) and (4.10) are equivalent when $M$ is a subspace. If $y$ is a solution of (4.10), then (4.7) follows taking $v = z - y$. Conversely, suppose $y$ satisfies (4.7). Then, taking $z = y + \lambda v$ with $\lambda \in \mathbb{R}$ and $v \in M$ we obtain
\[
\lambda \langle x - y, v \rangle \le 0 \qquad \forall \lambda \in \mathbb{R}.
\]
Since $\lambda$ is any real number, necessarily $\langle x - y, v \rangle = 0$.
Exercise 4.19 1. It is well known that any subspace of a finite dimensional space $H$ is closed. Show that this is not the case if $H$ is infinite dimensional.
Hint: consider the set of all sequences $x = (x_k) \in \ell^2$ such that $x_k = 0$ but for a finite number of subscripts $k$, and show that this is a dense subspace of $\ell^2$.
2. Show that, if $M$ is a closed subspace of $H$ and $M \neq H$, then there exists $x_0 \in H \setminus \{0\}$ such that $\langle x_0, y \rangle = 0$ for all $y \in M$.
3. Let $Y$ be a subspace of $H$. Show that $\overline{Y}$ is a (closed) subspace of $H$.
4. For any $A \subset H$ let us set
\[
A^\perp = \{ x \in H \mid x \perp A \} . \tag{4.11}
\]
Show that, if $A, B \subset H$, then
(a) $A^\perp$ is a closed subspace of $H$
(b) $A \subset B \ \Longrightarrow\ B^\perp \subset A^\perp$
(c) $(A \cup B)^\perp = A^\perp \cap B^\perp$
$A^\perp$ is called the orthogonal complement of $A$ in $H$.
[Figure 4.2: Riesz orthogonal decomposition]
Proposition 4.20 Let $M$ be a closed subspace of a Hilbert space $H$. Then, the following properties hold.
(i) For any $x \in H$ there exists a unique pair $(y_x, z_x) \in M \times M^\perp$ giving the Riesz orthogonal decomposition $x = y_x + z_x$. Moreover,
\[
y_x = p_M(x) \quad \text{and} \quad z_x = p_{M^\perp}(x) . \tag{4.12}
\]
(ii) $p_M : H \to H$ is linear and $\|p_M(x)\| \le \|x\|$ for all $x \in H$.
(iii) (a) $p_M \circ p_M = p_M$
(b) $\ker p_M = M^\perp$
(c) $p_M(H) = M$
Proof. Let $x \in H$.
(i): define $y_x = p_M(x)$ and $z_x = x - y_x$ to obtain, by (4.10), that $z_x \in M^\perp$ and
\[
\langle x - z_x, v \rangle = \langle y_x, v \rangle = 0 \qquad \forall v \in M^\perp .
\]
Therefore, $z_x = p_{M^\perp}(x)$ in view of (4.10). Suppose $x = y + z$ for some $y \in M$ and $z \in M^\perp$. Then,
\[
y_x - y = z - z_x \in M \cap M^\perp = \{0\} .
\]
(ii): for any $x_1, x_2 \in H$, $\alpha_1, \alpha_2 \in \mathbb{R}$ and $y \in M$, we have
\[
\big\langle (\alpha_1 x_1 + \alpha_2 x_2) - (\alpha_1 p_M(x_1) + \alpha_2 p_M(x_2)), y \big\rangle
= \alpha_1 \langle x_1 - p_M(x_1), y \rangle + \alpha_2 \langle x_2 - p_M(x_2), y \rangle = 0 .
\]
Then, by Corollary 4.18, $p_M(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 p_M(x_1) + \alpha_2 p_M(x_2)$. Moreover, since $\langle x - p_M(x), p_M(x) \rangle = 0$ for any $x \in H$, we obtain
\[
\|p_M(x)\|^2 = \langle x, p_M(x) \rangle \le \|x\|\, \|p_M(x)\| .
\]
(iii): the first assertion follows from the fact that $p_M(x) = x$ for any $x \in M$. The rest is a consequence of (i).
Exercise 4.21 1. In the Hilbert space $H = L^2(0, 1)$ consider sets
\[
N = \Big\{ u \in H \;\Big|\; \int_0^1 u(x)\, dx = 0 \Big\}
\]
and
\[
M = \{ u \in H \mid u \ \text{is constant a.e. on } (0, 1) \} .
\]
(a) Show that $N$ and $M$ are closed subspaces of $H$.
(b) Prove that $N = M^\perp$.
(c) Does $u(x) := 1 / \sqrt[3]{x}$, $0 < x < 1$, belong to $H$? If so, find the Riesz orthogonal decomposition of $u$ with respect to $N$ and $M$.
2. For any $A \subset H$, show that the intersection of all closed linear subspaces including $A$ is a closed linear subspace of $H$. Such a subspace, the so-called closed linear subspace generated by $A$, will be denoted by $\overline{\mathrm{sp}}(A)$.
Given $A \subset H$, we will denote by $\mathrm{sp}(A)$ the linear subspace generated by $A$, that is,
\[
\mathrm{sp}(A) = \Big\{ \sum_{k=1}^{n} c_k x_k \;\Big|\; n \ge 1, \ c_k \in \mathbb{R}, \ x_k \in A \Big\} .
\]
Exercise 4.22 Show that $\overline{\mathrm{sp}}(A)$ is the closure of $\mathrm{sp}(A)$.
Hint: since $\overline{\mathrm{sp}(A)}$ is a closed subspace containing $A$, we have that $\overline{\mathrm{sp}}(A) \subset \overline{\mathrm{sp}(A)}$. Conversely, $\mathrm{sp}(A) \subset \overline{\mathrm{sp}}(A)$ yields $\overline{\mathrm{sp}(A)} \subset \overline{\mathrm{sp}}(A)$.
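Corollary 4.23 In a Hilbert space $H$ the following properties hold.
(i) If $M$ is a closed linear subspace of $H$, then $(M^\perp)^\perp = M$.
(ii) For any $A \subset H$, $(A^\perp)^\perp = \overline{\mathrm{sp}}(A)$.
(iii) If $N$ is a subspace of $H$, then $N$ is dense iff $N^\perp = \{0\}$.
Proof. We will show each point of the conclusion in sequence.
(i): from point (i) of Proposition 4.20 we deduce that
\[
p_{M^\perp} = I - p_M .
\]
Similarly, $p_{(M^\perp)^\perp} = I - p_{M^\perp} = p_M$. Thus, owing to point (iii) of the same proposition,
\[
(M^\perp)^\perp = p_{(M^\perp)^\perp}(H) = p_M(H) = M .
\]
(ii): let $M = \overline{\mathrm{sp}}(A)$. Since $A \subset M$, we have $A^\perp \supset M^\perp$ (recall Exercise 4.19.4). So, $(A^\perp)^\perp \subset (M^\perp)^\perp = M$. Conversely, observe that $A$ is included in the closed subspace $(A^\perp)^\perp$. So, $M \subset (A^\perp)^\perp$.
(iii): first, observe that, since $\overline{N}$ is a closed subspace, $\overline{N} = \overline{\mathrm{sp}}(N)$. So, in view of point (ii) above,
\[
\overline{N} = H \iff (N^\perp)^\perp = H \iff N^\perp = \{0\} .
\]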
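Exercise 4.24 1. Using Corollary 4.23 show that
\[
\ell^1 := \Big\{ (x_n)_{n \in \mathbb{N}} \;\Big|\; x_n \in \mathbb{R}, \ \sum_{n=1}^{\infty} |x_n| < \infty \Big\}
\]
is a dense subspace of $\ell^2$.
2. Let $x, y \in H$ be linearly independent unit vectors. Show that
\[
\|\lambda x + (1 - \lambda) y\| < 1 \qquad \forall \lambda \in (0, 1) .
\]
Hint: observe that, setting $x_\lambda := \lambda x + (1 - \lambda) y$,
\[
\|x_\lambda\|^2 = 1 + 2\lambda(1 - \lambda) \big( \langle x, y \rangle - 1 \big) \tag{4.13}
\]
and recall the Cauchy-Schwarz inequality. (Property (4.13), recast as $\|\lambda x + (1 - \lambda) y\|^2 = 1 - \lambda(1 - \lambda) \|x - y\|^2$, implies that a Hilbert space is uniformly convex, see [3].)
[Figure 4.3: uniform convexity]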


4.3 The Riesz Representation Theorem
Let $H$ be a Hilbert space with scalar product $\langle \cdot, \cdot \rangle$.
4.3.1 Bounded linear functionals
A linear functional $F$ on $H$ is a linear mapping $F : H \to \mathbb{R}$.
Definition 4.25 A linear functional $F$ on $H$ is said to be bounded if
\[
|F(x)| \le C \|x\| \qquad \forall x \in H
\]
for some constant $C \ge 0$.
Proposition 4.26 For any linear functional $F$ on $H$ the following properties are equivalent.
(a) $F$ is continuous.
(b) $F$ is continuous at 0.
(c) $F$ is continuous at some point.
(d) $F$ is bounded.
Proof. The implications (a)$\Rightarrow$(b)$\Rightarrow$(c) and (d)$\Rightarrow$(b) are trivial. So, it suffices to show that (c)$\Rightarrow$(a) and (b)$\Rightarrow$(d).
(c)$\Rightarrow$(a): let $F$ be continuous at $x_0$ and let $y_0 \in H$. For any sequence $(y_n)$ in $H$, converging to $y_0$, we have that
\[
x_n = y_n - y_0 + x_0 \to x_0 .
\]
Then, $F(x_n) = F(y_n) - F(y_0) + F(x_0) \to F(x_0)$. Therefore, $F(y_n) \to F(y_0)$. So, $F$ is continuous at $y_0$.
(b)$\Rightarrow$(d): by hypothesis, for some $\delta > 0$ we have that $|F(x)| < 1$ for every $x \in H$ satisfying $\|x\| < \delta$. Now, let $\varepsilon > 0$ and $x \in H$. Then,
\[
\Big| F\Big( \frac{\delta x}{\|x\| + \varepsilon} \Big) \Big| < 1 .
\]
So, $|F(x)| < \frac{1}{\delta} (\|x\| + \varepsilon)$. Since $\varepsilon$ is arbitrary, the conclusion follows.


Definition 4.27 The family of all bounded linear functionals on $H$ is called the (topological) dual of $H$ and is denoted by $H^*$. For any $F \in H^*$ we set
\[
\|F\|_* = \sup_{\|x\| \le 1} |F(x)| .
\]
Exercise 4.28 1. Show that $H^*$ is a vector space on $\mathbb{R}$, and that $\|\cdot\|_*$ is a norm in $H^*$.
2. For any $F \in H^*$ show that
\[
\|F\|_* = \sup_{\|x\| = 1} |F(x)| = \sup_{x \neq 0} \frac{|F(x)|}{\|x\|} = \inf \big\{ C \ge 0 \;\big|\; |F(x)| \le C \|x\| \ \ \forall x \in H \big\} .
\]
4.3.2 Riesz Theorem
Example 4.29 For any fixed vector $y \in H$ define the linear functional $F_y$ by
\[
F_y(x) = \langle x, y \rangle \qquad \forall x \in H .
\]
Then, $|F_y(x)| \le \|y\|\, \|x\|$ for any $x \in H$. So, $F_y \in H^*$ and $\|F_y\|_* \le \|y\|$. We have thus defined a map
\[
\begin{cases} j : H \to H^* \\ j(y) = F_y \quad \forall y \in H \end{cases} \tag{4.14}
\]
It is easy to check that $j$ is linear. Also, since $|F_y(y)| = \|y\|^2$ for any $y \in H$, we conclude that $\|F_y\|_* = \|y\|$. Therefore, $j$ is a linear isometry.

Our next result will show that map $j$ above is onto. So, $j$ is an isometric isomorphism, called the Riesz isomorphism.
Theorem 4.30 (Riesz-Fréchet) Let $F$ be a bounded linear functional on $H$. Then there is a unique vector $y_F \in H$ such that
\[
F(x) = \langle x, y_F \rangle \qquad \forall x \in H . \tag{4.15}
\]
Moreover, $\|F\|_* = \|y_F\|$.
Proof. To show the existence of a vector $y_F$ satisfying (4.15), suppose $F \neq 0$ (otherwise the conclusion is trivial taking $y_F = 0$) and let $M = \ker F$. Since $M$ is a closed proper (1) subspace of $H$, there exists $y_0 \in M^\perp \setminus \{0\}$. We can also assume, without loss of generality, that $F(y_0) = 1$. Thus, for any $x \in H$ we have that $F(x - F(x) y_0) = 0$. So, $x - F(x) y_0 \in M$. Hence, $\langle x - F(x) y_0, y_0 \rangle = 0$, or
\[
F(x)\, \|y_0\|^2 = \langle x, y_0 \rangle \qquad \forall x \in H .
\]
This implies that $y_F := y_0 / \|y_0\|^2$ satisfies (4.15). The rest of the conclusion follows from the fact that the map $j$ of Example 4.29 is an isometry.
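In $\mathbb{R}^n$ the construction in the proof can be carried out step by step. The sketch below is an illustration under explicit assumptions: $H = \mathbb{R}^3$ with the Euclidean scalar product, a hypothetical functional `F`, and a helper `riesz_vector` that relies on the finite dimensional fact that the vector of values of $F$ on the canonical basis is orthogonal to $\ker F$.

```python
def dot(u, v):
    """Euclidean scalar product."""
    return sum(a * b for a, b in zip(u, v))


def riesz_vector(F, dim):
    """Representing vector of a nonzero linear functional F on R^dim.

    Mirrors the proof: pick y0 orthogonal to ker F with F(y0) = 1,
    then return y0 / ||y0||^2.
    """
    basis = [[1.0 if i == k else 0.0 for i in range(dim)] for k in range(dim)]
    a = [F(e) for e in basis]                 # a is orthogonal to ker F
    y0 = [c / dot(a, a) for c in a]           # normalised so that F(y0) = 1
    return [c / dot(y0, y0) for c in y0]      # y_F = y0 / ||y0||^2


def F(x):
    # Hypothetical bounded linear functional on R^3.
    return 2.0 * x[0] - x[1] + 0.5 * x[2]


y_F = riesz_vector(F, 3)
sample = [1.0, 2.0, 3.0]
difference = abs(F(sample) - dot(sample, y_F))
```

Up to rounding, `y_F` coincides with the coefficient vector of `F`, and `difference` vanishes, which is identity (4.15) evaluated at one sample point.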
Example 4.31 From the above theorem we deduce that, if $(X, \mathcal{E}, \mu)$ is a measure space and $F : L^2(X) \to \mathbb{R}$ is a bounded linear functional, then there exists a unique $\psi \in L^2(X)$ such that
\[
F(\varphi) = \int_X \varphi \psi\, d\mu \qquad \forall \varphi \in L^2(X) .
\]

(1) that is, $M \neq H$
A hyperplane $\Pi$ in $H$ is an affine subspace of codimension (2) 1. Given a bounded linear functional $F \in H^*$, $F \neq 0$, for any $c \in \mathbb{R}$ let
\[
\Pi_c = \{ x \in H \mid F(x) = c \} .
\]
From the proof of Theorem 4.30 it follows that $(\ker F)^\perp = \Pi_0^\perp = \{ \lambda y_F \mid \lambda \in \mathbb{R} \}$. So, $\Pi_0$ can be viewed as a closed hyperplane through the origin. Moreover, fixed any $x_c \in \Pi_c$, we have that $\Pi_c = x_c + \Pi_0$. Therefore, $\Pi_c$ is a closed hyperplane in $H$.
Our next result provides sufficient conditions for two convex sets to be strictly separated by closed hyperplanes.
Proposition 4.32 Let $A$ and $B$ be nonempty closed convex subsets of a Hilbert space $H$ such that $A \cap B = \emptyset$. Suppose further that $A$ is compact. Then there exist a bounded linear functional $F \in H^*$ and two constants $c_1, c_2$ such that
\[
F(x) \le c_1 < c_2 \le F(y) \qquad \forall x \in A, \ \forall y \in B .
\]
[Figure 4.4: separation of convex subsets]
Proof. Let $C = B - A := \{ z \in H \mid z = y - x, \ x \in A, \ y \in B \}$. Then, it is easy to see that $C$ is a nonempty convex set such that $0 \notin C$. We claim that $C$ is closed. For let $C \ni y_n - x_n \to z$. Since $A$ is compact, there exists a subsequence $(x_{k_n})$ such that $x_{k_n} \to x \in A$. Therefore,
\[
y_{k_n} = (y_{k_n} - x_{k_n}) + x_{k_n} \to z + x =: y
\]
and so $y \in B$ since $B$ is closed; hence $z = y - x \in C$. Then, $z_0 := p_C(0)$ satisfies $z_0 \neq 0$ and
\[
\langle 0 - z_0, y - x - z_0 \rangle \le 0 \qquad \forall x \in A, \ \forall y \in B .
\]
Hence,
\[
\langle x, z_0 \rangle + \|z_0\|^2 \le \langle y, z_0 \rangle \qquad \forall x \in A, \ \forall y \in B
\]
and the conclusion follows taking
\[
F = F_{z_0}, \qquad c_1 = \sup_{x \in A} \langle x, z_0 \rangle , \qquad c_2 = \inf_{y \in B} \langle y, z_0 \rangle .
\]

(2) Here, $\operatorname{codim} \Pi = \dim \Pi^\perp$.
2
.
1. For N 1 let us set F((x
n
)
n
) = x
N
. Find y H satisfying (4.15).
2. Show that, for any x = (x
n
)
n
H, the power series

n
x
n
z
n
has radius
of convergence at least 1.
3. For a given z (1, 1), set F((x
n
)
n
) =

n
x
n
z
n
. Find y H repre-
senting F, and determine |F|

.
4. Consider the sets
A :=
_
(x
n
) H [ n[x
n
n
2/3
[ x
1
n 2
_
and
B :=
_
(x
n
) H [ x
n
= 0 n 2
_
.
(a) Prove that A and B are disjoint closed convex subsets of H.
(b) Show that
A B =
_
(x
n
) H [ C 0 : n[x
n
n
2/3
[ C n 2
_
.
(c) Deduce that A B is dense in H.
Hint: x x = (x
n
) H and dene the sequence (x
(k)
) in A B
by
x
(k)
n
=
_
x
n
if k n
1/n
2/3
if k n + 1 .
(d) Prove that A and B cannot be separated by a closed hyperplane.
Hint: otherwise AB would be included in a closed half-space.
(This example shows that the compactness assumption of Proposi-
tion 4.32 cannot be dropped.)
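4.4 Orthonormal sets and bases
Let $H$ be a Hilbert space with scalar product $\langle \cdot, \cdot \rangle$.
Definition 4.34 A sequence $(e_k)_{k \in \mathbb{N}}$ is called orthonormal if
\[
\forall h, k \in \mathbb{N} \qquad \langle e_h, e_k \rangle = \begin{cases} 1 & \text{if } h = k \\ 0 & \text{if } h \neq k \end{cases}
\]
Example 4.35 1. The sequence of vectors
\[
e_k = (\underbrace{0, \dots, 0}_{k-1}, 1, 0, \dots) \qquad k = 1, 2, \dots
\]
is orthonormal in $\ell^2$.
2. Let $(e_k)_{k \in \mathbb{N}}$ be the sequence of functions in $L^2(-\pi, \pi)$ given by
\[
\forall t \in [-\pi, \pi] \qquad
\begin{cases}
e_0(t) = \dfrac{1}{\sqrt{2\pi}} \\[2mm]
e_{2j-1}(t) = \dfrac{\sin(jt)}{\sqrt{\pi}} \\[2mm]
e_{2j}(t) = \dfrac{\cos(jt)}{\sqrt{\pi}}
\end{cases} \qquad (j \ge 1) \tag{4.16}
\]
Since, for any $j, h \ge 1$,
\[
\frac{1}{\pi} \int_{-\pi}^{\pi} \cos(jt) \sin(ht)\, dt = 0
\]
\[
\frac{1}{\pi} \int_{-\pi}^{\pi} \sin(jt) \sin(ht)\, dt = \begin{cases} 0 & \text{if } j \neq h \\ 1 & \text{if } j = h \end{cases}
\]
\[
\frac{1}{\pi} \int_{-\pi}^{\pi} \cos(jt) \cos(ht)\, dt = \begin{cases} 0 & \text{if } j \neq h \\ 1 & \text{if } j = h, \end{cases}
\]
it is easy to check that $(e_k)_{k \in \mathbb{N}}$ is an orthonormal sequence in $L^2(-\pi, \pi)$. Such a sequence is called the trigonometric system.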
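The orthonormality relations of the trigonometric system can be checked numerically for the first few indices. The sketch below approximates the $L^2(-\pi, \pi)$ scalar product by a midpoint rule on equally spaced nodes, which is accurate to rounding error for low-degree trigonometric polynomials; the functions `e` and `inner`, the node count and the range of indices are illustrative assumptions.

```python
import math


def e(k, t):
    """k-th function of the trigonometric system, following (4.16)."""
    if k == 0:
        return 1.0 / math.sqrt(2.0 * math.pi)
    j = (k + 1) // 2
    if k % 2 == 1:                                    # k = 2j - 1
        return math.sin(j * t) / math.sqrt(math.pi)
    return math.cos(j * t) / math.sqrt(math.pi)       # k = 2j


def inner(h, k, nodes=64):
    """Midpoint-rule approximation of the integral of e_h * e_k over (-pi, pi)."""
    step = 2.0 * math.pi / nodes
    points = [-math.pi + (i + 0.5) * step for i in range(nodes)]
    return step * sum(e(h, t) * e(k, t) for t in points)


# Gram matrix of e_0, ..., e_6; it should be close to the identity matrix.
gram = [[inner(h, k) for k in range(7)] for h in range(7)]
```

The diagonal entries of `gram` are close to 1 and the off-diagonal ones close to 0, matching the three integral identities above.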
4.4.1 Bessels inequality
Let (e
k
)
kN
be an orthonormal sequence in H.
Chapter 4 111
Proposition 4.36 1. For any N N Bessels identity holds
_
_
_x
N

k=1
x, e
k
)e
k
_
_
_
2
= |x|
2

k=1

x, e
k
)

2
x H (4.17)
2. Bessels inequality holds

k=1

x, e
k
)

2
|x|
2
x H (4.18)
In particular, the series in the left-hand side converges.
3. For any sequence (c
k
) R

k=1
c
k
e
k
H

k=1
[c
k
[
2
<
Proof. Let x H. Bessels identity can be easily checked by induction on
N. For N = 1, (4.17) is true
(3)
. Suppose it holds for some N 1. Then,
_
_
_x
N+1

k=1
x, e
k
)e
k
_
_
_
2
=
_
_
_x
N

k=1
x, e
k
)e
k
_
_
_
2
+

x, e
N+1
)

2
2
_
x
N

k=1
x, e
k
)e
k
, x, e
N+1
)e
N+1
_
= |x|
2

k=1

x, e
k
)

x, e
N+1
)

2
So, (4.17) holds for any N 1. Moreover, Bessels identity implies that
all the partial sums of the series in (4.18) are bounded above by |x|
2
. So,
Bessels inequality holds as well. Finally, for all n N we have
_
_
_
n+p

k=n+1
c
k
e
k
_
_
_
2
=
n+p

k=n+1
[c
k
[
2
p = 1, 2, . . .
Therefore, Cauchys convergence test amounts to the same condition for the
two series of point 3.
For any $x\in H$, the numbers $\langle x,e_k\rangle$ are called the Fourier coefficients of $x$, and $\sum_{k=1}^{\infty}\langle x,e_k\rangle e_k$ is called the Fourier series of $x$.
(3) indeed, we used it to prove Cauchy's inequality (4.2)
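Remark 4.37 Fix $n\in\mathbb{N}$ and let $M_n := \mathrm{sp}\{e_1,\dots,e_n\}$. Then
\[ p_{M_n}(x) = \sum_{k=1}^{n}\langle x,e_k\rangle e_k \qquad \forall x\in H. \]
Indeed, for any $x\in H$ and any point $\sum_{k=1}^{n} c_k e_k\in M_n$, we have
\[ \Big\| x - \sum_{k=1}^{n} c_k e_k \Big\|^2 = \|x\|^2 - 2\sum_{k=1}^{n} c_k\langle x,e_k\rangle + \sum_{k=1}^{n} |c_k|^2 \]
\[ = \Big( \|x\|^2 - \sum_{k=1}^{n}\big|\langle x,e_k\rangle\big|^2 \Big) + \sum_{k=1}^{n}\big| c_k - \langle x,e_k\rangle\big|^2 \]
\[ = \Big\| x - \sum_{k=1}^{n}\langle x,e_k\rangle e_k \Big\|^2 + \sum_{k=1}^{n}\big| c_k - \langle x,e_k\rangle\big|^2 \]
thanks to Bessel's identity (4.17).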
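The computation in Remark 4.37 can be checked numerically in a finite-dimensional Hilbert space. The following is only an illustrative sketch: the space $\mathbb{R}^3$, the orthonormal pair `e1`, `e2`, the vector `x` and the helper names are all assumptions made for the example, not part of the notes.

```python
import math

def dot(u, v):
    """Euclidean scalar product on R^n."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical data: an orthonormal pair in R^3 and a vector x.
e1 = (1.0, 0.0, 0.0)
e2 = (0.0, 1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))
basis = [e1, e2]
x = (3.0, -1.0, 2.0)

# Fourier coefficients <x, e_k>.
coeffs = [dot(x, e) for e in basis]

def dist_sq(c):
    """Squared distance from x to the point sum_k c_k e_k of M_n."""
    point = [sum(ck * e[i] for ck, e in zip(c, basis)) for i in range(len(x))]
    return sum((xi - pi) ** 2 for xi, pi in zip(x, point))

# Both sides of Bessel's identity (4.17).
lhs = dist_sq(coeffs)
rhs = dot(x, x) - sum(c * c for c in coeffs)

# Any other choice of coefficients increases the squared distance
# by sum_k |c_k - <x, e_k>|^2 (here 0.3^2 + 0.7^2).
other = [coeffs[0] + 0.3, coeffs[1] - 0.7]
print(lhs, rhs, dist_sq(other))
```

The two printed values `lhs` and `rhs` should agree up to rounding, and `dist_sq(other)` should exceed them by the sum of the squared perturbations of the coefficients.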
4.4.2 Orthonormal bases
To begin this section, let us characterize situations where a vector $x\in H$ is given by the sum of its Fourier series. This fact has important consequences.
Theorem 4.38 Let $(e_k)_{k\in\mathbb{N}}$ be an orthonormal sequence in $H$. Then the following properties are equivalent.
(a) $\mathrm{sp}(e_k \mid k\in\mathbb{N})$ is dense in $H$.
(b) Every $x\in H$ is given by the sum of its Fourier series, that is,
\[ x = \sum_{k=1}^{\infty}\langle x,e_k\rangle e_k. \]
(c) Every $x\in H$ satisfies Parseval's identity
\[ \|x\|^2 = \sum_{k=1}^{\infty}\big|\langle x,e_k\rangle\big|^2. \tag{4.19} \]
(d) If $x\in H$ and $\langle x,e_k\rangle = 0$ for every $k\in\mathbb{N}$, then $x = 0$.
Proof. We will show that (a) $\Rightarrow$ (b) $\Rightarrow$ (c) $\Rightarrow$ (d) $\Rightarrow$ (a).
(a) $\Rightarrow$ (b): for any $n\in\mathbb{N}$ let $M_n := \mathrm{sp}\{e_1,\dots,e_n\}$. Then, by hypothesis, $d(x,M_n)\to 0$ as $n\to\infty$ for any $x\in H$. Thus, owing to Remark 4.37,
\[ \Big\| x - \sum_{k=1}^{n}\langle x,e_k\rangle e_k \Big\|^2 = \|x - p_{M_n}(x)\|^2 = d^2(x,M_n)\to 0 \qquad (n\to\infty). \]
This yields (b).
(b) $\Rightarrow$ (c): this part of the conclusion follows from Bessel's identity.
(c) $\Rightarrow$ (d): obvious.
(d) $\Rightarrow$ (a): let $N := \mathrm{sp}(e_k \mid k\in\mathbb{N})$. Then, $N^{\perp} = \{0\}$ owing to (d). So, $N$ is dense on account of point (iii) of Corollary 4.23.
Definition 4.39 The orthonormal sequence $(e_k)_{k\in\mathbb{N}}$ is called complete if $\mathrm{sp}(e_k \mid k\in\mathbb{N})$ is dense in $H$ (or any of the four equivalent conditions of Theorem 4.38 holds). In this case, $(e_k)_{k\in\mathbb{N}}$ is also said to be an orthonormal basis of $H$.
Exercise 4.40 1. Prove that, if $H$ possesses an orthonormal basis $(e_k)_{k\in\mathbb{N}}$, then $H$ is separable, that is, $H$ contains a dense countable set.
Hint: Consider all linear combinations of the $e_k$'s with rational coefficients.
2. Let $(y_n)_{n\in\mathbb{N}}$ be a sequence in $H$. Show that there exists an at most countable set of linearly independent vectors $(x_j)_{j\in J}$ in $H$ such that
\[ \mathrm{sp}(y_n \mid n\in\mathbb{N}) = \mathrm{sp}(x_j \mid j\in J). \]
Hint: for any $j = 0,1,\dots$, let $n_j$ be the first integer $n\in\mathbb{N}$ such that
\[ \dim \mathrm{sp}(y_1,\dots,y_n) = j. \]
Set $x_j := y_{n_j}$. Then, $\mathrm{sp}(x_1,\dots,x_j) = \mathrm{sp}(y_1,\dots,y_{n_j})$ . . .
3. Let $(e_k)_{k\in\mathbb{N}}$ be an orthonormal basis of $H$. Show that
\[ \langle x,y\rangle = \sum_{k=1}^{\infty}\langle x,e_k\rangle\langle y,e_k\rangle \qquad \forall x,y\in H. \]
Hint: observe that
\[ \langle x,y\rangle = \frac{\|x+y\|^2 - \|x\|^2 - \|y\|^2}{2}. \]
Our next result shows the converse of the property described in Exercise 4.40.1.
Proposition 4.41 Let H be a separable Hilbert space. Then H possesses an
orthonormal basis.
Proof. Let $(y_n)_{n\in\mathbb{N}}$ be a dense subset of $H$ and let $(x_j)_{j\in J}$ be linearly independent vectors such that $\overline{\mathrm{sp}}(x_j \mid j\in J) = H$ (constructed, e.g., as in Exercise 4.40.2). Define
\[ e_1 = \frac{x_1}{\|x_1\|} \qquad\text{and}\qquad e_k = \frac{x_k - \sum_{j<k}\langle x_k,e_j\rangle e_j}{\big\| x_k - \sum_{j<k}\langle x_k,e_j\rangle e_j \big\|} \quad (k\ge 2)^{(4)}. \]
Then, $(e_k)$ is an orthonormal sequence by construction. Moreover,
\[ \mathrm{sp}(e_1,\dots,e_k) = \mathrm{sp}(x_1,\dots,x_k) \qquad \forall k\ge 1. \]
So, $\mathrm{sp}(e_k \mid k\ge 1)$ is dense in $H$.
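The orthonormalization formula used in the proof above can be tried out on finitely many vectors. What follows is a minimal sketch in $\mathbb{R}^n$ with the Euclidean scalar product; the function name `gram_schmidt`, the tolerance `1e-12` and the sample vectors are assumptions made for illustration only.

```python
import math

def dot(u, v):
    """Euclidean scalar product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return e_1, ..., e_m with e_k = (x_k - sum_{j<k} <x_k, e_j> e_j) / norm."""
    basis = []
    for x in vectors:
        r = list(x)
        for e in basis:
            c = dot(x, e)                      # coefficient <x_k, e_j>
            r = [ri - c * ei for ri, ei in zip(r, e)]
        n = math.sqrt(dot(r, r))
        if n < 1e-12:                          # hypothetical tolerance
            raise ValueError("vectors are not linearly independent")
        basis.append([ri / n for ri in r])
    return basis

# Hypothetical linearly independent vectors in R^3.
vectors = [(1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
basis = gram_schmidt(vectors)
for e in basis:
    print([round(c, 4) for c in e])
```

In floating point arithmetic the variant that subtracts the projections of the running remainder `r` (instead of `x`) is usually more stable; the version above follows the formula of the proof literally.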
Example 4.42 In $H = \ell^2$, it is immediate to check that the orthonormal sequence $(e_k)_{k\in\mathbb{N}}$ of Example 4.35.1 is complete.
4.4.3 Completeness of the trigonometric system
In this section we will show that the orthonormal sequence $(e_k)_{k\in\mathbb{N}}$ defined in (4.16), that is,
\[ \forall t\in[-\pi,\pi] \qquad \begin{cases} e_0(t) = \dfrac{1}{\sqrt{2\pi}}\\[1ex] e_{2j-1}(t) = \dfrac{\sin(jt)}{\sqrt{\pi}}\\[1ex] e_{2j}(t) = \dfrac{\cos(jt)}{\sqrt{\pi}} \end{cases} \qquad (j\ge 1) \]
is an orthonormal basis of $L^2(-\pi,\pi)$.
(4) This is the so-called Gram-Schmidt orthonormalization process.
We begin by constructing a sequence of trigonometric polynomials with special properties. We recall that a trigonometric polynomial $q(t)$ is a linear combination of the above functions, i.e., an element of $\mathrm{sp}(e_k \mid k\in\mathbb{N})$. Any trigonometric polynomial $q$ is a continuous $2\pi$-periodic function.
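Lemma 4.43 There exists a sequence of trigonometric polynomials $(q_k)_{k\in\mathbb{N}}$ such that, for any $k\in\mathbb{N}$,
\[ \begin{cases} \text{(a)}\ \ q_k(t)\ge 0 \quad \forall t\in\mathbb{R}\\[0.5ex] \text{(b)}\ \ \dfrac{1}{2\pi}\displaystyle\int_{-\pi}^{\pi} q_k(t)\,dt = 1\\[1.5ex] \text{(c)}\ \ \forall\delta>0 \quad \displaystyle\lim_{k\to\infty}\ \sup_{\delta\le|t|\le\pi} q_k(t) = 0. \end{cases} \tag{4.20} \]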
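Proof. For any $k\in\mathbb{N}$ define
\[ q_k(t) = c_k\Big(\frac{1+\cos t}{2}\Big)^{k} \qquad \forall t\in\mathbb{R}, \]
where $c_k$ is chosen so as to satisfy property (b). Recalling that
\[ \cos(kt)\cos t = \frac{1}{2}\Big[\cos\big((k+1)t\big) + \cos\big((k-1)t\big)\Big], \]
it is easy to check that each $q_k$ is a linear combination of the functions $\cos(jt)$, $j = 0,1,\dots,k$. So $q_k$ is a trigonometric polynomial.
Since (a) is immediate, it only remains to check (c). Observe that, since $q_k$ is even,
\[ 1 = \frac{c_k}{\pi}\int_{0}^{\pi}\Big(\frac{1+\cos t}{2}\Big)^{k} dt \ \ge\ \frac{c_k}{\pi}\int_{0}^{\pi}\Big(\frac{1+\cos t}{2}\Big)^{k}\sin t\,dt = \frac{c_k}{\pi(k+1)}\Big[-2\Big(\frac{1+\cos t}{2}\Big)^{k+1}\Big]_{0}^{\pi} = \frac{2c_k}{\pi(k+1)} \]
to conclude that
\[ c_k \le \frac{\pi(k+1)}{2} \qquad \forall k\in\mathbb{N}. \]
Now, fix $0<\delta<\pi$. Since $q_k$ is even on $[-\pi,\pi]$ and decreasing on $[0,\pi]$, using the above estimate for $c_k$ we obtain
\[ \sup_{\delta\le|t|\le\pi} q_k(t) = q_k(\delta) \le \frac{\pi(k+1)}{2}\Big(\frac{1+\cos\delta}{2}\Big)^{k} \to 0 \qquad (k\to\infty). \]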
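The properties of the kernels $q_k$ built in the proof of Lemma 4.43 can be explored numerically. The sketch below is an illustration under stated assumptions: the constant $c_k$ is computed with a midpoint rule on `m` nodes (the helper names and the value of `m` are choices made here, not taken from the notes).

```python
import math

def kernel_profile(t, k):
    """The unnormalized kernel ((1 + cos t) / 2)^k."""
    return ((1.0 + math.cos(t)) / 2.0) ** k

def normalizing_constant(k, m=2000):
    """c_k such that (1 / 2pi) * integral of q_k over [-pi, pi] equals 1.

    The integral is approximated by a midpoint rule with m nodes.
    """
    h = 2.0 * math.pi / m
    total = h * sum(kernel_profile(-math.pi + (i + 0.5) * h, k) for i in range(m))
    return 2.0 * math.pi / total

def sup_away_from_zero(k, delta):
    """sup of q_k over delta <= |t| <= pi, attained at t = delta."""
    return normalizing_constant(k) * kernel_profile(delta, k)

for k in (1, 5, 20, 200):
    c_k = normalizing_constant(k)
    bound = math.pi * (k + 1) / 2.0
    print(k, c_k, bound, sup_away_from_zero(k, 1.0))
```

For each `k` the computed constant should stay below the bound $\pi(k+1)/2$ obtained in the proof, and the last column (the supremum over $1\le|t|\le\pi$) should shrink as `k` grows.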
Our next step is to derive a classical uniform approximation theorem by trigonometric polynomials.
Theorem 4.44 (Weierstrass) Let $f$ be a continuous $2\pi$-periodic function. Then there exists a sequence of trigonometric polynomials $(p_n)_{n\in\mathbb{N}}$ such that
\[ \|f - p_n\|_{\infty}\to 0 \quad\text{as } n\to\infty. \]
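Proof.$^{(5)}$ Let $(q_n)$ be a sequence of trigonometric polynomials enjoying properties (4.20), e.g. the sequence given by Lemma 4.43. For any $n\in\mathbb{N}$ and $t\in\mathbb{R}$, a simple periodicity argument shows that
\[ p_n(t) := \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t-s)\,q_n(s)\,ds = \frac{1}{2\pi}\int_{t-\pi}^{t+\pi} f(\tau)\,q_n(t-\tau)\,d\tau = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\tau)\,q_n(t-\tau)\,d\tau. \]
This implies that $p_n$ is a trigonometric polynomial. Indeed, if
\[ q_n(t) = a_0 + \sum_{k=1}^{k_n}\big[a_k\cos(kt) + b_k\sin(kt)\big], \]
then
\[ p_n(t) - \frac{a_0}{2\pi}\int_{-\pi}^{\pi} f(\tau)\,d\tau = \frac{1}{2\pi}\sum_{k=1}^{k_n}\int_{-\pi}^{\pi} f(\tau)\Big[a_k\cos\big(k(t-\tau)\big) + b_k\sin\big(k(t-\tau)\big)\Big]d\tau \]
\[ = \frac{1}{2\pi}\sum_{k=1}^{k_n} a_k\Big[\cos(kt)\int_{-\pi}^{\pi} f(\tau)\cos(k\tau)\,d\tau + \sin(kt)\int_{-\pi}^{\pi} f(\tau)\sin(k\tau)\,d\tau\Big] \]
\[ \quad + \frac{1}{2\pi}\sum_{k=1}^{k_n} b_k\Big[\sin(kt)\int_{-\pi}^{\pi} f(\tau)\cos(k\tau)\,d\tau - \cos(kt)\int_{-\pi}^{\pi} f(\tau)\sin(k\tau)\,d\tau\Big]. \]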
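Next, for any $\delta>0$ let
\[ \omega_f(\delta) = \sup_{|x-y|<\delta} |f(x) - f(y)|. \]
Since $f$ is uniformly continuous, $\omega_f(\delta)\to 0$ as $\delta\to 0$. Now, for $\delta\in(0,\pi]$ properties (4.20) (a) and (b) ensure that
\[ |f(t) - p_n(t)| = \Big|\frac{1}{2\pi}\int_{-\pi}^{\pi}\big[f(t) - f(t-s)\big]q_n(s)\,ds\Big| \le \frac{1}{2\pi}\int_{-\pi}^{\pi}\big|f(t) - f(t-s)\big|\,q_n(s)\,ds \]
\[ \le \frac{1}{2\pi}\int_{-\delta}^{\delta}\omega_f(\delta)\,q_n(s)\,ds + \frac{1}{2\pi}\int_{\delta\le|s|\le\pi} 2\|f\|_{\infty}\,q_n(s)\,ds \le \omega_f(\delta) + 2\|f\|_{\infty}\sup_{\delta\le|s|\le\pi} q_n(s) \]
for any $t\in\mathbb{R}$. Now, fix $\varepsilon>0$ and let $\delta_\varepsilon\in(0,\pi]$ be such that $\omega_f(\delta_\varepsilon)<\varepsilon$. Owing to (4.20) (c), $n_\varepsilon\in\mathbb{N}$ exists such that $\sup_{\delta_\varepsilon\le|s|\le\pi} q_n(s)<\varepsilon$ for all $n\ge n_\varepsilon$. Thus,
\[ \|f - p_n\|_{\infty} < (1 + 2\|f\|_{\infty})\,\varepsilon \qquad \forall n\ge n_\varepsilon. \]
(5) This proof, based on a convolution method, is due to de la Vallée Poussin.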
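The convolution argument of the proof can be observed numerically. The following sketch, offered only as an illustration, replaces the integrals by midpoint sums on `m` nodes and uses the kernels of Lemma 4.43; the test function $f(t) = |\sin t|$, the grid size and the function names are assumptions of this example.

```python
import math

def sup_error(f, n, m=400):
    """Approximate sup_t |f(t) - p_n(t)|, where p_n = f * q_n (discretized).

    The weights below play the role of (1 / 2pi) q_n(s) ds on a midpoint
    grid of [-pi, pi]; they are normalized so that they sum to 1.
    """
    h = 2.0 * math.pi / m
    nodes = [-math.pi + (i + 0.5) * h for i in range(m)]
    weights = [((1.0 + math.cos(s)) / 2.0) ** n for s in nodes]
    total = sum(weights)
    weights = [w / total for w in weights]
    err = 0.0
    for t in nodes:
        p = sum(w * f(t - s) for w, s in zip(weights, nodes))
        err = max(err, abs(f(t) - p))
    return err

def f(t):
    """A continuous 2pi-periodic function (hypothetical test case)."""
    return abs(math.sin(t))

for n in (4, 16, 64):
    print(n, sup_error(f, n))
```

The printed errors should decrease as `n` grows, in agreement with Theorem 4.44; the rate observed here depends on the regularity of the chosen `f`.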
We are now ready to deduce the announced completeness of the trigonometric system. We recall that $\mathcal{C}_c(a,b)$ denotes the space of all continuous functions in $(a,b)$ with compact support.
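Theorem 4.45 $(e_k)_{k\in\mathbb{N}}$ is an orthonormal basis of $L^2(-\pi,\pi)$.
Proof. We will show that trigonometric polynomials are dense in $L^2(-\pi,\pi)$. Let $f\in L^2(-\pi,\pi)$ and fix $\varepsilon>0$. Since $\mathcal{C}_c(-\pi,\pi)$ is dense in $L^2(-\pi,\pi)$ on account of Theorem 3.45, there exists $f_\varepsilon\in\mathcal{C}_c(-\pi,\pi)$ such that $\|f - f_\varepsilon\|_2<\varepsilon$. Clearly, we can extend $f_\varepsilon$, by periodicity, to a continuous function on the whole real line. Also, by Weierstrass' Theorem 4.44 we can find a trigonometric polynomial $p_\varepsilon$ such that $\|f_\varepsilon - p_\varepsilon\|_{\infty}<\varepsilon$. Then,
\[ \|f - p_\varepsilon\|_2 \le \|f - f_\varepsilon\|_2 + \|f_\varepsilon - p_\varepsilon\|_2 \le \varepsilon + \varepsilon\sqrt{2\pi}. \]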
Exercise 4.46 Applying (4.19) to the function
\[ x(t) = t \qquad t\in[-\pi,\pi], \]
derive Euler's identity
\[ \sum_{k=1}^{\infty}\frac{1}{k^2} = \frac{\pi^2}{6}. \]
Chapter 5
Banach spaces
5.1 Definitions and examples
Let $X$ be a real vector space.
Definition 5.1 A norm $\|\cdot\|$ in $X$ is a map $\|\cdot\| : X\to\mathbb{R}$ with the following properties:
1. $\|x\|\ge 0$ for all $x\in X$ and $\|x\| = 0$ iff $x = 0$;
2. $\|\lambda x\| = |\lambda|\,\|x\|$ for any $x\in X$ and $\lambda\in\mathbb{R}$;
3. $\|x+y\|\le\|x\|+\|y\|$ for all $x,y\in X$.
A normed space is a pair $(X,\|\cdot\|)$.
As we already observed in Chapter 4, in a normed space $(X,\|\cdot\|)$, the function
\[ d(x,y) = \|x-y\| \qquad \forall x,y\in X \tag{5.1} \]
is a metric.
Definition 5.2 Two norms in $X$, $\|\cdot\|_1$ and $\|\cdot\|_2$, are said to be equivalent if there exist constants $C\ge c>0$ such that
\[ c\|x\|_1 \le \|x\|_2 \le C\|x\|_1 \qquad \forall x\in X. \]
Exercise 5.3 1. Show that two norms are equivalent if and only if they induce the same topology on $X$.
2. In $\mathbb{R}^n$, show that the following norms are equivalent
\[ \|x\|_p = \Big(\sum_{k=1}^{n}|x_k|^p\Big)^{1/p} \qquad\text{and}\qquad \|x\|_{\infty} = \max_{1\le k\le n}|x_k|. \]
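As a sanity check for Exercise 5.3.2 (not a proof), one can test numerically the elementary bounds $\|x\|_\infty\le\|x\|_p\le n^{1/p}\|x\|_\infty$ on random vectors. The function names, the sampling scheme and the rounding tolerance below are assumptions of this sketch.

```python
import random

def p_norm(x, p):
    """The norm (sum |x_k|^p)^(1/p) on R^n, for 1 <= p < infinity."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def sup_norm(x):
    """The norm max_k |x_k| on R^n."""
    return max(abs(v) for v in x)

def check_equivalence(n, p, trials=1000, seed=0):
    """Test sup_norm(x) <= p_norm(x) <= n^(1/p) * sup_norm(x) on random x."""
    rng = random.Random(seed)
    slack = 1.0 + 1e-12                        # allowance for rounding errors
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        s, q = sup_norm(x), p_norm(x, p)
        if not (s <= q * slack and q <= n ** (1.0 / p) * s * slack):
            return False
    return True

print(check_equivalence(5, 1.0), check_equivalence(5, 2.0), check_equivalence(5, 3.5))
```

With these bounds one may take $c = 1$ and $C = n^{1/p}$ in Definition 5.2 when comparing $\|\cdot\|_\infty$ with $\|\cdot\|_p$.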
Definition 5.4 A normed space $(X,\|\cdot\|)$ is called a Banach space if it is complete with respect to the metric defined in (5.1).
Example 5.5 1. Every Hilbert space is a Banach space.
2. Given any set $S\neq\emptyset$, the family $B(S)$ of all bounded functions $f : S\to\mathbb{R}$ is a vector space on $\mathbb{R}$ with the usual sum and product defined, for any $f,g\in B(S)$ and $\alpha\in\mathbb{R}$, by
\[ \forall x\in S \qquad \begin{cases} (f+g)(x) = f(x)+g(x)\\ (\alpha f)(x) = \alpha f(x). \end{cases} \]
Moreover, $B(S)$ equipped with the norm
\[ \|f\|_{\infty} = \sup_{x\in S}|f(x)| \qquad \forall f\in B(S), \]
is a Banach space.
3. Let $(M,d)$ be a metric space. The family, $\mathcal{C}_b(M)$, of all bounded continuous functions on $M$ is a closed subspace of $B(M)$. So, $\big(\mathcal{C}_b(M),\|\cdot\|_{\infty}\big)$ is a Banach space.
4. Let $(X,\mathcal{E},\mu)$ be a measure space. For any $p\in[1,\infty]$, spaces $L^p(X,\mathcal{E},\mu)$, introduced in Chapter 3, are some of the main examples of Banach spaces with norm defined by
\[ \|\varphi\|_p = \Big(\int_X |\varphi|^p\,d\mu\Big)^{1/p} \qquad \forall\varphi\in L^p(X,\mathcal{E},\mu) \]
for $p\in[1,\infty)$, and, for $p = \infty$, by
\[ \|\varphi\|_{\infty} = \inf\{m\ge 0 \mid \mu(|\varphi|>m) = 0\} \qquad \forall\varphi\in L^{\infty}(X,\mathcal{E},\mu). \]
We recall that, when $\mu$ is the counting measure on $\mathbb{N}$, we use the abbreviated notation $\ell^p$ for $L^p(\mathbb{N},\mathcal{P}(\mathbb{N}),\mu)$. In this case we have
\[ \|x\|_p = \Big(\sum_{n=1}^{\infty}|x_n|^p\Big)^{1/p} \qquad\text{and}\qquad \|x\|_{\infty} = \sup_{n}|x_n|. \]
The case of $p = 2$ was studied in Chapter 4.
Exercise 5.6 1. Let $(M,d)$ be a locally compact metric space. Show that the set, $\mathcal{C}_0(M)$, of all functions $f\in\mathcal{C}_b(M)$ such that, for all $\varepsilon>0$,
\[ \{x\in M \mid |f(x)|\ge\varepsilon\} \]
is compact, is a closed subspace of $\mathcal{C}_b(M)$ (so, it is a Banach space).
2. Show that
\[ c_0 := \Big\{(x_n)\in\ell^{\infty} \ \Big|\ \lim_{n\to\infty} x_n = 0\Big\} \tag{5.2} \]
is a closed subspace of $\ell^{\infty}$.
3. Show that $\|\cdot\|_{\infty}$ (in $B(S)$, $\mathcal{C}_b(M)$ or $\ell^{\infty}$) is not induced by a scalar product.
4. In a Banach space $X$, let $(x_n)$ be a sequence such that $\sum_n\|x_n\|<\infty$. Show that
\[ \sum_{n=1}^{\infty} x_n := \lim_{k\to\infty}\sum_{n=1}^{k} x_n \in X. \]
5.2 Bounded linear operators
Let $X$, $Y$ be normed spaces. We denote by $\mathcal{L}(X,Y)$ the space of all continuous linear mappings $\Lambda : X\to Y$. The elements of $\mathcal{L}(X,Y)$ are also called bounded operators between $X$ and $Y$. In the special case of $X = Y$, we abbreviate $\mathcal{L}(X,X) = \mathcal{L}(X)$ and any $\Lambda\in\mathcal{L}(X)$ is called a bounded operator on $X$. Another special case of interest is when $Y = \mathbb{R}$. As in the Hilbert space case, $\mathcal{L}(X,\mathbb{R})$ is called the topological dual of $X$ and will be denoted by $X^*$. The elements of $X^*$ are called bounded linear functionals.
Arguing exactly as in the proof of Proposition 4.26 one can show the following.
Proposition 5.7 For any linear mapping $\Lambda : X\to Y$ the following properties are equivalent.
(a) $\Lambda$ is continuous.
(b) $\Lambda$ is continuous at $0$.
(c) $\Lambda$ is continuous at some point.
(d) There exists $C\ge 0$ such that $\|\Lambda x\|\le C\|x\|$ for all $x\in X$.
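As in Definition 4.27, let us set
\[ \|\Lambda\| = \sup_{\|x\|\le 1}\|\Lambda x\| \qquad \forall\Lambda\in\mathcal{L}(X,Y). \tag{5.3} \]
Then, for any $\Lambda\in\mathcal{L}(X,Y)$, we have
\[ \|\Lambda\| = \sup_{\|x\|=1}\|\Lambda x\| = \sup_{x\neq 0}\frac{\|\Lambda x\|}{\|x\|} = \inf\big\{C\ge 0 \ \big|\ \|\Lambda x\|\le C\|x\|,\ \forall x\in X\big\} \]
(see also Exercise 4.28).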
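For a concrete feeling of definition (5.3), consider a matrix acting on $\mathbb{R}^n$ equipped with the sup norm. It is a standard fact that in this case the operator norm equals the largest absolute row sum, and that the supremum in (5.3) is attained at a vector with entries $\pm 1$. The sketch below checks this on a hypothetical $2\times 2$ matrix `A`; all names are chosen here for illustration.

```python
import itertools

def sup_norm(x):
    """The norm max_k |x_k| on R^n."""
    return max(abs(v) for v in x)

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def operator_norm_sup(A):
    """Operator norm of A on (R^n, sup norm): the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

# Hypothetical example matrix.
A = [[1.0, -2.0], [3.0, 4.0]]

# sup of ||A x|| over the unit ball, evaluated at the vertices (+-1, +-1).
attained = max(
    sup_norm(apply(A, list(signs)))
    for signs in itertools.product((-1.0, 1.0), repeat=len(A[0]))
)
print(operator_norm_sup(A), attained)
```

Both printed numbers should coincide; restricting to the vertices is legitimate because $x\mapsto\|Ax\|_\infty$ is convex, so its maximum over the cube is reached at a vertex.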
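Exercise 5.8 Show that $\|\cdot\|$ is a norm in $\mathcal{L}(X,Y)$.
Proposition 5.9 If $Y$ is complete, then $\mathcal{L}(X,Y)$ is a Banach space. In particular, the topological dual of $X$, $X^*$, is a Banach space.
Proof. Let $(\Lambda_n)$ be a Cauchy sequence in $\mathcal{L}(X,Y)$. Then, for any $x\in X$, $(\Lambda_n x)$ is a Cauchy sequence in $Y$. Since $Y$ is complete, $(\Lambda_n x)$ converges to a point in $Y$ that we label $\Lambda x$. We have thus defined a mapping $\Lambda : X\to Y$ which is easily checked to be linear. Moreover, since $(\Lambda_n)$ is bounded in $\mathcal{L}(X,Y)$, say $\|\Lambda_n\|\le M$ for all $n\in\mathbb{N}$, we also have that $\|\Lambda\|\le M$. Thus, $\Lambda\in\mathcal{L}(X,Y)$. Finally, to show that $\Lambda_n\to\Lambda$ in $\mathcal{L}(X,Y)$, fix $\varepsilon>0$ and let $n_\varepsilon\in\mathbb{N}$ be such that $\|\Lambda_n - \Lambda_m\|<\varepsilon$ for all $n,m\ge n_\varepsilon$. Then,
\[ \|\Lambda_n x - \Lambda_m x\| \le \varepsilon\|x\| \qquad \forall x\in X. \]
Taking the limit as $m\to\infty$, we obtain
\[ \|\Lambda_n x - \Lambda x\| \le \varepsilon\|x\| \qquad \forall x\in X. \]
Hence, $\|\Lambda_n - \Lambda\|\le\varepsilon$ for all $n\ge n_\varepsilon$ and the proof is complete.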
Exercise 5.10 Given $f\in C([a,b])$, define $\Lambda : L^1(a,b)\to L^1(a,b)$ by
\[ \Lambda g(t) = f(t)g(t) \qquad t\in[a,b]. \]
Show that $\Lambda$ is a bounded operator and $\|\Lambda\| = \|f\|_{\infty}$.
Hint: $\|\Lambda\|\le\|f\|_{\infty}$ follows from Hölder's inequality; to prove the equality, suppose $|f(x)|>\|f\|_{\infty}-\varepsilon$ for all $x\in[x_0,x_1]$ and let $g(x) = \chi_{[x_0,x_1]}$ be the characteristic function of such interval . . .
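5.2.1 The principle of uniform boundedness
Theorem 5.11 (Banach-Steinhaus) Let $X$ be a Banach space, $Y$ be a normed space, and let $\{\Lambda_i\}_{i\in I}\subset\mathcal{L}(X,Y)$. Then,
either a number $M\ge 0$ exists such that
\[ \|\Lambda_i\|\le M \qquad \forall i\in I, \tag{5.4} \]
or a dense set $D\subset X$ exists such that
\[ \sup_{i\in I}\|\Lambda_i x\|_Y = \infty \qquad \forall x\in D. \tag{5.5} \]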
[Figure 5.1: the Banach-Steinhaus Theorem. Axes $x$, $y$; lines labelled $y = Mx$; region $\|\Lambda_i\|\le M$.]
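Proof. Define
\[ \varphi(x) := \sup_{i\in I}\|\Lambda_i x\| \qquad \forall x\in X. \]
Since $\varphi : X\to[0,\infty]$ is a lower semicontinuous function, for any $n\in\mathbb{N}$
\[ A_n := \{x\in X \mid \varphi(x)>n\} \]
is an open set$^{(1)}$. If all sets $A_n$ are dense, then (5.5) holds on $D := \bigcap_n A_n$ which is, in turn, a dense set owing to Baire's Lemma, see Proposition A.6. Now, suppose $A_N$ fails to be dense for some $N\in\mathbb{N}$. Then, there exists a closed ball $\overline{B}_r(x_0)\subset X\setminus A_N$. Therefore,
\[ \|x\|\le r \implies x_0 + x\notin A_N \implies \varphi(x_0+x)\le N. \]
Consequently, since $\Lambda_i x = \Lambda_i(x_0+x) - \Lambda_i x_0$, we have $\|\Lambda_i x\|\le 2N$ for all $i\in I$ and $\|x\|\le r$. Hence, for all $i\in I$,
\[ \|\Lambda_i x\| = \frac{\|x\|}{r}\,\Big\|\Lambda_i\frac{rx}{\|x\|}\Big\| \le \frac{2N}{r}\,\|x\| \qquad \forall x\in X\setminus\{0\}. \]
We have thus shown that (5.4) holds with $M = 2N/r$.
(1) Alternatively, let $x\in A_n$. Then, for some $i_x\in I$, we have that $\|\Lambda_{i_x}x\|>n$. Since $\Lambda_{i_x}$ is continuous, there exists a neighbourhood $V$ of $x$ such that $\|\Lambda_{i_x}y\|>n$ for all $y\in V$. Thus, $\varphi(y)>n$ for all $y\in V$. So, $V\subset A_n$.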
Exercise 5.12 1. Let $y = (y_n)$ be a real sequence and let $1\le p,q\le\infty$ be conjugate exponents. Show that, if $\sum_n x_n y_n$ converges for all $x = (x_n)\in\ell^p$, then $y\in\ell^q$.
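2. Let $1\le p,q\le\infty$ be conjugate exponents, and let $f\in L^p_{loc}(\mathbb{R})^{(2)}$. Show that, if
\[ \int_{-\infty}^{\infty} f(x)g(x)\,dx\in\mathbb{R} \qquad \forall g\in L^q(\mathbb{R}), \]
then $f\in L^p(\mathbb{R})$.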
5.2.2 The open mapping theorem
Bounded operators between two Banach spaces, $X$ and $Y$, enjoy topological properties, closely related to one another, that are very useful for applications, for instance, to differential equations. The first and main of these results is the so-called Open Mapping Theorem that we give below.
Theorem 5.13 (Schauder) If $\Lambda\in\mathcal{L}(X,Y)$ is onto, then $\Lambda$ is open$^{(3)}$.
Proof. We split the reasoning into four steps.
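1. Let us show that a radius $r>0$ exists such that
\[ B_{2r}\subset\overline{\Lambda(B_1)}. \tag{5.6} \]
Observe that, since $\Lambda$ is onto,
\[ Y = \bigcup_{k=1}^{\infty}\overline{\Lambda(B_k)}. \]
Therefore, by Proposition A.6 (Baire's Lemma), at least one of the closed sets $\overline{\Lambda(B_k)}$ must contain a ball, say $B_s(y)\subset\overline{\Lambda(B_k)}$. Since $\overline{\Lambda(B_k)}$ is symmetric with respect to $0$,
\[ B_s(-y) = -B_s(y)\subset-\overline{\Lambda(B_k)} = \overline{\Lambda(B_k)}. \]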
(2) We denote by $L^p_{loc}(\mathbb{R})$ the vector space of all measurable functions $f : \mathbb{R}\to\mathbb{R}$ such that $f\in L^p(a,b)$ for every interval $[a,b]\subset\mathbb{R}$.
(3) that is, $U$ open in $X$ $\implies$ $\Lambda(U)$ open in $Y$.
[Figure 5.2: the Open Mapping Theorem]
Hence, for any $x\in B_s$, we have that $x\pm y\in B_s(\pm y)\subset\overline{\Lambda(B_k)}$. Since $\overline{\Lambda(B_k)}$ is convex, we conclude that
\[ x = \frac{(x+y)+(x-y)}{2}\in\overline{\Lambda(B_k)}. \]
Thus, $B_s\subset\overline{\Lambda(B_k)}$. Let us show how (5.6) follows with $r = s/2k$ by a rescaling argument. Indeed, let $z\in B_{2r} = B_{s/k}$. Then, $kz\in B_s$ and there exists a sequence $(x_n)$ in $B_k$ such that $\Lambda x_n\to kz$. So, $x_n/k\in B_1$ and $\Lambda(x_n/k)\to z$ as claimed.
2. Note that, by linearity, (5.6) yields the family of inclusions
\[ B_{2^{1-n}r}\subset\overline{\Lambda(B_{2^{-n}})} \qquad \forall n\in\mathbb{N}. \tag{5.7} \]
3. We now proceed to show that
\[ B_r\subset\Lambda(B_1). \tag{5.8} \]
Let $y\in B_r$. We have to prove that $y = \Lambda x$ for some $x\in B_1$. Applying (5.7) with $n = 1$, we can find a point
\[ x_1\in B_{2^{-1}} \quad\text{such that}\quad \|y - \Lambda x_1\|<\frac{r}{2}. \]
Thus, $y - \Lambda x_1\in B_{2^{-1}r}$. So, applying (5.7) with $n = 2$ we find a point
\[ x_2\in B_{2^{-2}} \quad\text{such that}\quad \|y - \Lambda(x_1+x_2)\|<\frac{r}{2^2}. \]
Repeated application of this construction yields a sequence $(x_n)$ in $X$ such that
\[ x_n\in B_{2^{-n}} \quad\text{and}\quad \|y - \Lambda(x_1+\dots+x_n)\|<\frac{r}{2^n}. \]
Since
\[ \sum_{n=1}^{\infty}\|x_n\| < \sum_{n=1}^{\infty}\frac{1}{2^n} = 1, \]
recalling Exercise 5.6.5 we conclude that $x := \sum_n x_n\in B_1$, and, by the continuity of $\Lambda$, $\Lambda x = \sum_n\Lambda x_n = y$.
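4. To conclude the proof, let $U\subset X$ be open and let $x\in U$. Then, for some $\rho>0$, $B_\rho(x)\subset U$, whence $\Lambda x + \Lambda(B_\rho)\subset\Lambda(U)$. Therefore, by (5.8),
\[ B_{r\rho}(\Lambda x) = \Lambda x + B_{r\rho}\subset\Lambda x + \Lambda(B_\rho)\subset\Lambda(U). \]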
The first consequence we deduce from the above result is the following Inverse Mapping Theorem.
Corollary 5.14 (Banach) If $\Lambda\in\mathcal{L}(X,Y)$ is bijective, then $\Lambda^{-1}$ is continuous.
Proof. We have to show that, for any open set $U\subset X$, $(\Lambda^{-1})^{-1}(U)$ is open. But this follows from Theorem 5.13 since $(\Lambda^{-1})^{-1} = \Lambda$.
Exercise 5.15 1. Let $\Lambda\in\mathcal{L}(X,Y)$ be bijective. Show that a constant $\lambda>0$ exists such that
\[ \|\Lambda x\|_Y\ge\lambda\|x\|_X \qquad \forall x\in X. \tag{5.9} \]
Hint: use Corollary 5.14 and apply Proposition 5.7 to $\Lambda^{-1}$.
2. Let $\|\cdot\|_1$ and $\|\cdot\|_2$ be norms on a vector space $Z$ such that $Z$ is complete with respect to both $\|\cdot\|_1$ and $\|\cdot\|_2$. If a constant $c\ge 0$ exists such that $\|x\|_2\le c\|x\|_1$ for any $x\in Z$, then there also exists $C\ge 0$ such that $\|x\|_1\le C\|x\|_2$ for any $x\in Z$ (so, $\|\cdot\|_1$ and $\|\cdot\|_2$ are equivalent norms).
Hint: apply (5.9) to the identity map $(Z,\|\cdot\|_1)\to(Z,\|\cdot\|_2)$.
To introduce our next result, let us observe that the Cartesian product $X\times Y$ is naturally equipped with the product norm
\[ \|(x,y)\|_{X\times Y} := \|x\|_X + \|y\|_Y \qquad \forall(x,y)\in X\times Y. \]
Exercise 5.16 $\big(X\times Y,\|(\cdot,\cdot)\|_{X\times Y}\big)$ is a Banach space.
We conclude with the so-called Closed Graph Theorem.
Corollary 5.17 (Banach) Let $\Lambda : X\to Y$ be a linear mapping. Then $\Lambda\in\mathcal{L}(X,Y)$ if and only if the graph of $\Lambda$, that is
\[ \mathrm{Graph}(\Lambda) := \{(x,y)\in X\times Y \mid y = \Lambda x\}, \]
is closed in $X\times Y$.
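Proof. Suppose $\Lambda\in\mathcal{L}(X,Y)$. Then, it is easy to see that
\[ \Phi : X\times Y\to Y \qquad \Phi(x,y) = y - \Lambda x \]
is continuous. Therefore, $\mathrm{Graph}(\Lambda) = \Phi^{-1}(0)$ is closed.
Conversely, let $\mathrm{Graph}(\Lambda)$ be a closed subspace of the Banach space $X\times Y$. Then, $\mathrm{Graph}(\Lambda)$ is, in turn, a Banach space with the product norm. Moreover, the linear map
\[ \Pi_\Lambda : \mathrm{Graph}(\Lambda)\to X \qquad \Pi_\Lambda(x,\Lambda x) := x \]
is bounded and bijective. Therefore, owing to Corollary 5.14,
\[ \Pi_\Lambda^{-1} : X\to\mathrm{Graph}(\Lambda) \qquad \Pi_\Lambda^{-1}x = (x,\Lambda x) \]
is continuous, and so is $\Lambda = \Pi_Y\circ\Pi_\Lambda^{-1}$, where
\[ \Pi_Y : X\times Y\to Y \qquad \Pi_Y(x,y) := y. \]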
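Example 5.18 Let $X = \mathcal{C}^1([0,1])$ and $Y = \mathcal{C}([0,1])$ be both equipped with the sup norm $\|\cdot\|_{\infty}$. Define
\[ \Lambda x(t) = x'(t) \qquad \forall x\in X,\ \forall t\in[0,1]. \]
Then $\mathrm{Graph}(\Lambda)$ is closed since
\[ \left\{\begin{array}{l} x_n\xrightarrow{\,L^{\infty}\,}x_\infty\\[0.5ex] x_n'\xrightarrow{\,L^{\infty}\,}y_\infty \end{array}\right. \implies x_\infty\in\mathcal{C}^1([0,1]) \ \ \&\ \ x_\infty' = y_\infty. \]
On the other hand, $\Lambda$ fails to be a bounded operator. Indeed, taking
\[ x_n(t) = t^n \qquad t\in[0,1], \]
we have that
\[ x_n\in X, \qquad \|x_n\|_{\infty} = 1, \qquad \|\Lambda x_n\|_{\infty} = n \qquad \forall n\ge 1. \]
This shows the necessity of $X$ being a Banach space in Theorem 5.13.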
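The unboundedness claimed in Example 5.18 is easy to see numerically. In the sketch below, offered only as an illustration, the sup norms on $[0,1]$ are approximated by a maximum over a uniform grid that contains the endpoint $t = 1$; the grid size and the function names are assumptions of the example.

```python
def sup_on_grid(g, m=1000):
    """Approximate the sup norm of g on [0, 1] by a maximum over a uniform grid."""
    return max(abs(g(i / m)) for i in range(m + 1))

def ratio(n):
    """The quotient ||x_n'|| / ||x_n|| for x_n(t) = t^n, with sup norms on [0, 1]."""
    def x(t):
        return t ** n

    def dx(t):
        return n * t ** (n - 1)

    return sup_on_grid(dx) / sup_on_grid(x)

for n in (1, 2, 5, 10, 100):
    print(n, ratio(n))
```

The printed quotients should equal `n`, so no constant $C$ can satisfy $\|\Lambda x\|_\infty\le C\|x\|_\infty$ for all $x\in X$.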
Exercise 5.19 1. For a given operator $\Lambda\in\mathcal{L}(X,Y)$ show that the following properties are equivalent:
(a) there exists $c>0$ such that $\|\Lambda x\|\ge c\|x\|$ for all $x\in X$;
(b) $\ker\Lambda = \{0\}$ and $\Lambda(X)$ is closed in $Y$.
Hint: use Exercise 5.15.1.
2. Let $H$ be a Hilbert space and let $A,B : H\to H$ be two linear maps such that
\[ \langle Ax,y\rangle = \langle x,By\rangle \qquad \forall x,y\in H. \tag{5.10} \]
Show that $A,B\in\mathcal{L}(H)$.
Hint: use (5.10) to deduce that $\mathrm{Graph}(A)$ and $\mathrm{Graph}(B)$ are closed in $H\times H$; then, apply Corollary 5.17.
3. Let $(X,\|\cdot\|)$ be an infinite dimensional separable Banach space and let $(e_i)_{i\in I}$ be a Hamel basis of $X^{(4)}$ such that $\|e_i\| = 1$ for all $i\in I$.
(a) Show that $I$ is uncountable.
Hint: use Baire's Lemma.
(b) Prove that
\[ \|x\|_1 = \sum_{i\in I}|\lambda_i| \qquad\text{if}\quad x = \sum_{i\in I}\lambda_i e_i \]
is a norm in $X$ and that $\|x\|\le\|x\|_1$ for every $x\in X$ (observe that both series above are finite sums).
(c) Show that $X$ is not complete with respect to $\|\cdot\|_1$.
Hint: should $(X,\|\cdot\|_1)$ be a Banach space, then $\|\cdot\|$ and $\|\cdot\|_1$ would be equivalent norms by Exercise 5.15.2, but, for any $i\neq j$, we have $\|e_i - e_j\|_1 = \dots$
(4) that is, a maximal linearly independent subset of $X$. Let us recall that, applying Zorn's Lemma, one can show that any linearly independent subset of $X$ can be completed to a Hamel basis. Moreover, given a Hamel basis $(e_i)_{i\in I}$, we have that $X = \mathrm{sp}\{e_i \mid i\in I\}$.
5.3 Bounded linear functionals
In this section we shall study a special case of bounded linear operators, namely $\mathbb{R}$-valued operators or, as we usually say, bounded linear functionals. We will see that functionals enjoy an important extension property described by the Hahn-Banach Theorem. Then we will derive useful analytical and geometric consequences of such a property. These results will be essential for the analysis of dual spaces that we shall develop in the next section. Finally, we will characterize the duals of the Banach spaces $\ell^p$.
5.3.1 The Hahn-Banach Theorem
Let us consider the following extension problem: given a subspace $M\subset X$ (not necessarily closed) and a continuous linear functional $f : M\to\mathbb{R}$, find $F\in X^*$ such that
\[ \begin{cases} F\big|_M = f\\ \|F\| = \|f\|. \end{cases} \tag{5.11} \]
Remark 5.20 1. Observe that a bounded linear functional $f$ defined on a subspace $M$ can be extended to the closure $\overline{M}$ by a standard completeness argument. For let $x\in\overline{M}$ and let $(x_n)\subset M$ be such that $x_n\to x$. Since
\[ |f(x_n) - f(x_m)|\le\|f\|\,\|x_n - x_m\|, \]
$(f(x_n))$ is a Cauchy sequence in $\mathbb{R}$. Therefore, $(f(x_n))$ converges. Then, it is easy to see that $F(x) := \lim_n f(x_n)$ is the required extension of $f$. So, the problem of finding an extension of $f$ satisfying (5.11) has a unique solution when $M$ is dense in $X$.
2. Another case where the extension satisfying (5.11) is unique is when $X$ is a Hilbert space. Indeed, let us still denote by $f$ the extension of the given functional to $\overline{M}$, obtained by the procedure described at point 1. Note that $\overline{M}$ is also a Hilbert space. So, by the Riesz-Fréchet Theorem, there exists a unique vector $y_f\in\overline{M}$ such that $\|y_f\| = \|f\|$ and
\[ f(x) = \langle x,y_f\rangle \qquad \forall x\in\overline{M}. \]
Define
\[ F(x) = \langle x,y_f\rangle \qquad \forall x\in X. \]
Then, $F\in X^*$ satisfies (5.11) and $\|F\| = \|y_f\| = \|f\|$. We claim that $F$ is the unique extension of $f$ with these properties. For let $G$ be another bounded linear functional satisfying (5.11) and let $y_G$ be the vector in $X$ associated with $G$ in the Riesz representation of $G$. Consider the Riesz orthogonal decomposition of $y_G$, that is,
\[ y_G = y_G' + y_G'' \qquad\text{where}\quad y_G'\in\overline{M} \ \text{ and }\ y_G''\perp M. \]
Then,
\[ \langle x,y_G'\rangle = G(x) = f(x) = \langle x,y_f\rangle \qquad \forall x\in\overline{M}. \]
So, $y_G' = y_f$. Moreover,
\[ \|y_G''\|^2 = \|y_G\|^2 - \|y_G'\|^2 = \|f\|^2 - \|y_f\|^2 = 0. \]
Hence $y_G = y_f$, that is, $G = F$.
In general, the following classical result ensures the existence of an extension of $f$ satisfying (5.11) even though its uniqueness is no longer guaranteed.
Theorem 5.21 (Hahn-Banach: first analytic form) Let $(X,\|\cdot\|)$ be a normed space and let $M$ be a subspace of $X$. If $f : M\to\mathbb{R}$ is a continuous linear functional on $M$, then there is a functional $F\in X^*$ such that
\[ F\big|_M = f \qquad\text{and}\qquad \|F\| = \|f\|. \]
Proof. To begin with, let us suppose that $\|f\|\neq 0$ for otherwise one can take $F\equiv 0$ and the conclusion becomes trivial. Then we can assume, without loss of generality, that $\|f\| = 1$. We will show, first, how to extend $f$ to a subspace of $X$ which strictly includes $M$. The general case will be treated later, in steps 2 and 3, by a maximality argument.
1. Suppose $M\neq X$ and let $x_0\in X\setminus M$. Let us construct an extension of $f$ to the subspace
\[ M_0 := \mathrm{sp}(M\cup\{x_0\}) = \{x + \lambda x_0 \mid x\in M,\ \lambda\in\mathbb{R}\}. \]
Define
\[ f_0(x + \lambda x_0) := f(x) + \lambda\alpha \qquad \forall x\in M,\ \forall\lambda\in\mathbb{R}, \tag{5.12} \]
where $\alpha$ is a real number to be fixed. Clearly, $f_0$ is a linear functional on $M_0$ that extends $f$. We must find $\alpha\in\mathbb{R}$ such that
\[ |f_0(x + \lambda x_0)|\le\|x + \lambda x_0\| \qquad \forall x\in M,\ \forall\lambda\in\mathbb{R}. \]
A simple re-scaling argument shows that the last inequality is equivalent to
\[ |f_0(x_0 - y)|\le\|x_0 - y\| \qquad \forall y\in M. \]
Therefore, replacing $f_0$ by its definition (5.12), we conclude that $\alpha\in\mathbb{R}$ must satisfy $|\alpha - f(y)|\le\|x_0 - y\|$ for all $y\in M$, or
\[ f(y) - \|x_0 - y\|\le\alpha\le f(y) + \|x_0 - y\| \qquad \forall y\in M. \]
Now, such a choice of $\alpha$ is possible because
\[ f(y) - f(z) = f(y - z)\le\|y - z\|\le\|x_0 - y\| + \|x_0 - z\| \]
for all $y,z\in M$, and so
\[ \sup_{y\in M}\big\{f(y) - \|x_0 - y\|\big\}\le\inf_{z\in M}\big\{f(z) + \|x_0 - z\|\big\}. \]
2. Denote by $\mathcal{P}$ the family of all pairs $(\widetilde{M},\widetilde{f})$ where $\widetilde{M}$ is a subspace of $X$ including $M$, and $\widetilde{f}$ is a bounded linear functional extending $f$ to $\widetilde{M}$ such that $\|\widetilde{f}\| = 1$. $\mathcal{P}\neq\emptyset$ since it contains $(M,f)$. We can turn $\mathcal{P}$ into a partially ordered set defining, for all pairs $(M_1,f_1),(M_2,f_2)\in\mathcal{P}$,
\[ (M_1,f_1)\le(M_2,f_2) \iff \begin{cases} M_1 \text{ subspace of } M_2\\ f_2 = f_1 \text{ on } M_1. \end{cases} \tag{5.13} \]
We claim that $\mathcal{P}$ is inductive. For let $\mathcal{Q} = \{(M_i,f_i)\}_{i\in I}$ be a totally ordered subset of $\mathcal{P}$. Then, it is easy to check that
\[ \begin{cases} \widetilde{M} := \bigcup_{i\in I} M_i\\[0.5ex] \widetilde{f}(x) := f_i(x) \quad\text{if } x\in M_i \end{cases} \]
defines a pair $(\widetilde{M},\widetilde{f})\in\mathcal{P}$ which is an upper bound for $\mathcal{Q}$.
3. By Zorn's Lemma, $\mathcal{P}$ has a maximal element, say $(\mathcal{M},F)$. We claim that $\mathcal{M} = X$ and $F$ is the required extension. Indeed, $F = f$ on $M$ and $\|F\| = 1$ by construction. Moreover $\mathcal{M} = X$, for if $\mathcal{M}\neq X$ then the first step of this proof would imply the existence of a proper extension of $(\mathcal{M},F)$, contradicting its maximality.
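Step 1 of the proof can be made concrete in $\mathbb{R}^2$. In the sketch below, which is only an illustration, $M = \{(t,0)\}$, $f(t,0) = t$ and $x_0 = (0,1)$; the admissible values of $\alpha$ form the interval between $\sup_y\{f(y)-\|x_0-y\|\}$ and $\inf_z\{f(z)+\|x_0-z\|\}$, which is approximated here by sampling finitely many points of $M$. The two norms, the sample `ts` and the function names are assumptions of the example.

```python
def norm_1(v):
    """The norm |v_1| + |v_2| on R^2."""
    return sum(abs(c) for c in v)

def norm_sup(v):
    """The norm max(|v_1|, |v_2|) on R^2."""
    return max(abs(c) for c in v)

def admissible_alphas(norm, ts):
    """Interval [lower, upper] of admissible alpha, sampled over y = (t, 0).

    Here f(t, 0) = t and x0 - y = (-t, 1).
    """
    lower = max(t - norm((-t, 1.0)) for t in ts)
    upper = min(t + norm((-t, 1.0)) for t in ts)
    return lower, upper

# Hypothetical sample of the subspace M (integer abscissas keep arithmetic exact).
ts = [float(k) for k in range(-50, 51)]
print(admissible_alphas(norm_1, ts))    # a whole interval of extensions
print(admissible_alphas(norm_sup, ts))  # a single admissible value
```

With the norm $|v_1|+|v_2|$ every $\alpha\in[-1,1]$ gives a norm-preserving extension $F(v) = v_1+\alpha v_2$, whereas with the sup norm only $\alpha = 0$ works. This is one way to see that uniqueness of the extension depends on the norm.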
Example 5.22 In general, the extension provided by the Hahn-Banach Theorem is not unique. For instance, consider the space
\[ c_1 := \Big\{x = (x_n)\in\ell^{\infty} \ \Big|\ \exists\lim_{n\to\infty} x_n\Big\}. \]
As it is easy to see, $c_1$ is a closed subspace of $\ell^{\infty}$ and $c_0$ is a closed subspace of $c_1$. The map
\[ f(x) := \lim_{n\to\infty} x_n \qquad \forall x = (x_n)\in c_1 \]
is a bounded linear functional on $c_1$ such that $f\equiv 0$ on $c_0$. Then, $f$ is a nontrivial extension of the null map on $c_0$.
We shall now discuss some useful consequences of the Hahn-Banach Theorem.
Corollary 5.23 Let $M$ be a closed subspace of $X$ and let $x_0\notin M$. Then there exists $F\in X^*$ such that
\[ \begin{cases} \text{(a)}\ \ F(x_0) = 1\\ \text{(b)}\ \ F\big|_M = 0\\ \text{(c)}\ \ \|F\| = 1/d_M(x_0). \end{cases} \tag{5.14} \]
Proof. Let $M_0 = \mathrm{sp}(M\cup\{x_0\}) = M + \mathbb{R}x_0$. Define $f : M_0\to\mathbb{R}$ by
\[ f(x + \lambda x_0) = \lambda \qquad \forall x\in M,\ \forall\lambda\in\mathbb{R}. \]
Then, $f(x_0) = 1$ and $f\big|_M = 0$. Also, since
\[ \|x + \lambda x_0\| = |\lambda|\,\Big\|\frac{x}{\lambda} + x_0\Big\|\ge|\lambda|\,d_M(x_0), \]
we have that $\|f\|\le 1/d_M(x_0)$. Moreover, let $(x_n)$ be a sequence in $M$ such that
\[ \|x_n - x_0\| < \Big(1 + \frac{1}{n}\Big)d_M(x_0) \qquad \forall n\ge 1. \]
Then,
\[ \|f\|\,\|x_n - x_0\|\ge f(x_0 - x_n) = 1 > \frac{n}{n+1}\,\frac{\|x_n - x_0\|}{d_M(x_0)} \qquad \forall n\ge 1. \]
Therefore, $\|f\| = 1/d_M(x_0)$. Now, the existence of an extension $F\in X^*$ satisfying (5.14) follows from the Hahn-Banach Theorem.
Corollary 5.24 Let $x_0\in X\setminus\{0\}$. Then there exists $F\in X^*$ such that
\[ F(x_0) = \|x_0\| \qquad\text{and}\qquad \|F\| = 1. \tag{5.15} \]
Proof. Let $M = \mathbb{R}x_0$. Define $f : M\to\mathbb{R}$ by
\[ f(\lambda x_0) = \lambda\|x_0\| \qquad \forall\lambda\in\mathbb{R}. \]
Then, one can easily check that $f(x_0) = \|x_0\|$ and $\|f\| = 1$. Now, the existence of an extension $F\in X^*$ satisfying (5.15) follows from the Hahn-Banach Theorem.
Exercise 5.25 Hereafter, for any $f\in X^*$, we will use the standard notation
\[ \langle f,x\rangle := f(x) \qquad \forall x\in X. \]
1. Let $x_1,\dots,x_n$ be linearly independent vectors in $X$ and let $\lambda_1,\dots,\lambda_n$ be real numbers. Show that there exists $f\in X^*$ such that
\[ f(x_i) = \lambda_i \qquad \forall i = 1,\dots,n. \]
2. Let $M$ be a subspace of $X$.
(a) Show that a point $x\in X$ belongs to $\overline{M}$ iff $f(x) = 0$ for every $f\in X^*$ such that $f\big|_M = 0$.
(b) Show that $M$ is dense iff the only functional $f\in X^*$ that vanishes on $M$ is $f\equiv 0$.
3. Show that $X^*$ separates the points of $X$, that is, for any $x_1,x_2\in X$ with $x_1\neq x_2$ there exists $f\in X^*$ such that $f(x_1)\neq f(x_2)$.
4. Show that $\|x\| = \max\big\{\langle f,x\rangle \ \big|\ f\in X^*,\ \|f\|\le 1\big\}$.
5.3.2 Separation of convex sets
It turns out that the Hahn-Banach Theorem has significant geometric applications. To achieve this, we shall extend our analysis to vector spaces.
Definition 5.26 A sublinear functional on a vector space $X$ is a function $p : X\to\mathbb{R}$ such that
(a) $p(\lambda x) = \lambda p(x)$ $\forall x\in X$, $\forall\lambda\ge 0$
(b) $p(x+y)\le p(x)+p(y)$ $\forall x,y\in X$.
The Hahn-Banach Theorem can be extended in the following way.
Theorem 5.27 (Hahn-Banach: second analytic form) Let $p$ be a sublinear functional on a vector space $X$ and let $M$ be a subspace of $X$. If $f : M\to\mathbb{R}$ is a linear functional such that
\[ f(x)\le p(x) \qquad \forall x\in M, \tag{5.16} \]
then there is a linear functional $F : X\to\mathbb{R}$ such that
\[ \begin{cases} F\big|_M = f\\ F(x)\le p(x) \quad \forall x\in X. \end{cases} \tag{5.17} \]
The proof of Theorem 5.27 will be omitted. The reader is invited to check that the proof of Theorem 5.21 can be easily adapted to the present context.
Theorem 5.28 (Hahn-Banach: first geometric form) Let $A$ and $B$ be nonempty disjoint convex subsets of a normed space $X$. If $A$ is open, then there is a functional $f\in X^*$ and a real number $\alpha$ such that
\[ f(x)<\alpha\le f(y) \qquad \forall x\in A,\ \forall y\in B. \tag{5.18} \]
Remark 5.29 Observe that (5.18) yields, in particular, $f\neq 0$. It can be proved that, given a functional $f\in X^*\setminus\{0\}$, for any $\alpha\in\mathbb{R}$ the set
\[ \Pi_\alpha := f^{-1}(\alpha) = \{x\in X \mid f(x) = \alpha\} \tag{5.19} \]
is a closed affine subspace of $X$. We will call any such set a closed hyperplane in $X$. Therefore, an equivalent way to state the conclusion of Theorem 5.28 is that $A$ and $B$ can be separated by a closed hyperplane.
Lemma 5.30 Let $C$ be a nonempty convex open subset of a normed space $X$ such that $0\in C$. Then
\[ p_C(x) := \inf\{\tau\ge 0 \mid x\in\tau C\} \qquad \forall x\in X \tag{5.20} \]
is a sublinear functional on $X$ called the Minkowski function of $C$ or the gauge of $C$. Moreover,
\[ \exists c\ge 0 \ \text{ such that }\ 0\le p_C(x)\le c\|x\| \qquad \forall x\in X \tag{5.21} \]
\[ C = \{x\in X \mid p_C(x)<1\}. \tag{5.22} \]
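Proof. To begin with, observe that, being open, $C$ contains a ball $B_R$.
1. Let us prove (5.21). For any $\varepsilon>0$ we have that
\[ \frac{Rx}{\|x\|+\varepsilon}\in B_R\subset C. \]
Since $\varepsilon$ is arbitrary, this yields $0\le p_C(x)\le\|x\|/R$.
2. We now proceed to show that $p_C$ is a sublinear functional. Let $\lambda>0$ and $x\in X$. Fix $\varepsilon>0$ and let $0\le\tau_\varepsilon<p_C(x)+\varepsilon$ be such that $x\in\tau_\varepsilon C$. Then, $\lambda x\in\lambda\tau_\varepsilon C$. Thus, $p_C(\lambda x)\le\lambda\tau_\varepsilon<\lambda(p_C(x)+\varepsilon)$. Since $\varepsilon$ is arbitrary, we conclude that
\[ p_C(\lambda x)\le\lambda p_C(x) \qquad \forall\lambda>0,\ \forall x\in X. \tag{5.23} \]
To obtain the converse inequality observe that, in view of (5.23),
\[ p_C(x) = p_C\Big(\frac{1}{\lambda}\,\lambda x\Big)\le\frac{1}{\lambda}\,p_C(\lambda x). \]
Finally, let us check that $p_C$ satisfies property (b) of Definition 5.26. Fix $x,y\in X$ and $\varepsilon>0$. Let $0<\tau_\varepsilon<p_C(x)+\varepsilon$ and $0<\sigma_\varepsilon<p_C(y)+\varepsilon$ be such that $x\in\tau_\varepsilon C$ and $y\in\sigma_\varepsilon C$. Then, $x = \tau_\varepsilon x_\varepsilon$ and $y = \sigma_\varepsilon y_\varepsilon$ for some points $x_\varepsilon,y_\varepsilon\in C$. Since $C$ is convex,
\[ x + y = \tau_\varepsilon x_\varepsilon + \sigma_\varepsilon y_\varepsilon = (\tau_\varepsilon+\sigma_\varepsilon)\underbrace{\Big(\frac{\tau_\varepsilon}{\tau_\varepsilon+\sigma_\varepsilon}\,x_\varepsilon + \frac{\sigma_\varepsilon}{\tau_\varepsilon+\sigma_\varepsilon}\,y_\varepsilon\Big)}_{\in C}. \]
Thus,
\[ p_C(x+y)\le\tau_\varepsilon+\sigma_\varepsilon<p_C(x)+p_C(y)+2\varepsilon \qquad \forall\varepsilon>0, \]
whence $p_C(x+y)\le p_C(x)+p_C(y)$.
3. Denote by $\widetilde{C}$ the set in the right-hand side of (5.22). Since $\lambda C\subset C$ for every $\lambda\in[0,1]$, we have that $\widetilde{C}\subset C$. Conversely, since $C$ is open, any point $x\in C$ belongs to some ball $B_r(x)\subset C$. Therefore, $(1+\rho)x\in C$ for any $\rho>0$ such that $\rho\|x\|<r$, and so $p_C(x)\le 1/(1+\rho)<1$.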
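Definition (5.20) lends itself to a simple numerical experiment. The sketch below approximates the gauge of a convex open set $C\ni 0$ by bisection on $\tau$, using only a membership test for $C$; it relies on the fact that $x\in\tau C$ for every $\tau>p_C(x)$. The ellipse used as $C$, the upper bound `hi` (assumed to exceed $p_C(x)$) and the function names are all assumptions of this illustration.

```python
def gauge(x, contains, hi=1e6, iters=100):
    """Approximate p_C(x) = inf{tau >= 0 : x in tau * C} by bisection.

    `contains` is a membership test for the convex open set C (with 0 in C);
    `hi` is assumed to be larger than p_C(x).
    """
    if all(v == 0 for v in x):
        return 0.0
    lo = 0.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if contains(tuple(v / mid for v in x)):
            hi = mid                      # x belongs to mid * C
        else:
            lo = mid
    return hi

def in_ellipse(v):
    """Hypothetical convex open set C = {v : v_1^2 / 4 + v_2^2 < 1}."""
    return v[0] ** 2 / 4.0 + v[1] ** 2 < 1.0

x, y = (2.0, 0.0), (0.0, 3.0)
s = (x[0] + y[0], x[1] + y[1])
print(gauge(x, in_ellipse), gauge(y, in_ellipse), gauge(s, in_ellipse))
```

For this particular $C$ the gauge is $p_C(v) = \sqrt{v_1^2/4+v_2^2}$, so the printed values should be close to $1$, $3$ and $\sqrt{10}$, in agreement with the subadditivity proved in step 2.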
Lemma 5.31 Let $C$ be a nonempty convex open subset of a normed space $X$ and let $x_0\in X\setminus C$. Then there is a functional $f\in X^*$ such that $f(x)<f(x_0)$ for all $x\in C$.
Proof. First, we note that we can assume that $0\in C$ without loss of generality. Indeed, this is always the case up to translation. Define $M := \mathbb{R}x_0$ and
\[ g : M\to\mathbb{R} \quad\text{by}\quad g(\lambda x_0) = \lambda p_C(x_0) \qquad \forall\lambda\in\mathbb{R}, \]
where $p_C$ is the Minkowski function of $C$. Observe that $g$ satisfies condition (5.16) with respect to the sublinear functional $p_C$ since, for any $x = \lambda x_0\in M$, it is easy to see that
\[ g(x) = \lambda p_C(x_0)\le p_C(x) \qquad \forall\lambda\in\mathbb{R}. \]
Therefore, Theorem 5.27 ensures the existence of a linear extension of $g$, say $f$, which satisfies (5.17). Moreover, $f$ is continuous since $f(x)\le p_C(x)\le c\|x\|$ for all $x\in X$ by (5.21). Then, $f(x_0) = g(x_0) = p_C(x_0)\ge 1$ and, owing to (5.22),
\[ f(x)\le p_C(x)<1 \qquad \forall x\in C. \]
Proof of Theorem 5.28. It is easy to see that
\[ C := A - B = \{x - y \mid x\in A,\ y\in B\} \]
is a convex open set and that $0\notin C$. Then, Lemma 5.31 ensures the existence of a linear functional $f\in X^*$ such that $f(z)<0 = f(0)$ for all $z\in C$. Hence, $f(x)<f(y)$ for all $x\in A$ and $y\in B$. So,
\[ \alpha := \sup_{x\in A} f(x)\le f(y) \qquad \forall y\in B. \]
We claim that $f(x)<\alpha$ for all $x\in A$. For suppose $f(x_0) = \alpha$ for some $x_0\in A$. Then, since $A$ is open, $B_r(x_0)\subset A$ for some $r>0$. So,
\[ f(x_0 + rx)\le\alpha \qquad \forall x\in B_1. \]
Now, taking $x\in B_1$ such that $f(x)>\|f\|/2\ (>0)$, we obtain
\[ f(x_0 + rx) = f(x_0) + rf(x) > \alpha + \frac{r\|f\|}{2}, \]
a contradiction that concludes the proof.
Theorem 5.32 (Hahn-Banach: second geometric form) Let $C$ and $D$ be nonempty disjoint convex subsets of a normed space $X$. If $C$ is closed and $D$ is compact, then there is a functional $f\in X^*$ such that
\[ \sup_{x\in C} f(x) < \inf_{y\in D} f(y). \tag{5.24} \]
Proof. Let us denote by $d_C$ the distance function from $C$. Since $C$ is closed and $D$ is compact, the continuity of $d_C$ yields
\[ \delta := \min_{x\in D} d_C(x) > 0. \]
Define
\[ C_\delta := C + B_{\delta/2} \qquad\text{and}\qquad D_\delta := D + B_{\delta/2}. \]
It is easy to see that $C_\delta$ and $D_\delta$ are nonempty open convex sets. They are also disjoint for if $c + x = d + y$ for some points $c\in C$, $d\in D$ and $x,y\in B_{\delta/2}$, then
\[ d_C(d)\le\|c - d\| = \|y - x\| < \delta. \]
Then, by Theorem 5.28, there is a linear functional $f\in X^*$ and a number $\alpha$ such that
\[ f\Big(c + \frac{\delta}{2}x\Big)\le\alpha\le f\Big(d + \frac{\delta}{2}y\Big) \qquad \forall c\in C,\ d\in D,\ x,y\in B_1. \]
Now, let $x\in B_1$ be such that $f(x)>\|f\|/2^{(5)}$. Then
\[ f(c) + \frac{\delta\|f\|}{4} < f\Big(c + \frac{\delta}{2}x\Big)\le\alpha\le f\Big(d - \frac{\delta}{2}x\Big) < f(d) - \frac{\delta\|f\|}{4} \]
for all $c\in C$ and $d\in D$. The conclusion follows.
5.3.3 The dual of $\ell^p$
In this section we will study the dual of the Banach spaces $c_0$ and $\ell^p$ defined in Example 5.5.4. To begin, let $p\in[1,\infty]$ and let $q$ be the conjugate exponent, that is, $1/p + 1/q = 1$. With any $y = (y_n)\in\ell^q$ we can associate the linear map $f_y : \ell^p\to\mathbb{R}$ defined by
\[ f_y(x) = \sum_{n=1}^{\infty} x_n y_n \qquad \forall x = (x_n)\in\ell^p. \tag{5.25} \]
(5) Recall that $\|f\|>0$, see Remark 5.29.
Hölder's inequality ensures that
\[ |f_y(x)|\le\|y\|_q\,\|x\|_p \qquad \forall x\in\ell^p. \]
Hence, $f_y\in(\ell^p)^*$ and $\|f_y\|\le\|y\|_q$. Therefore,
\[ \begin{cases} j_p : \ell^q\to(\ell^p)^*\\ j_p(y) := f_y \end{cases} \tag{5.26} \]
is a bounded linear operator such that $\|j_p\|\le 1$. Moreover, since $c_0$ is a subspace of $\ell^{\infty}$, $f_y$ is also a bounded linear functional on $c_0$ for any $y\in\ell^1$. In this section, for $p = \infty$, we shall restrict our attention to the bounded linear operator $j_\infty : \ell^1\to(c_0)^*$.
Proposition 5.33 The bounded linear operator
j
p
:
_

q
(
p
)

if 1 p <

1
(c
0
)

if p =
is an isometric isomorphism.
Let us rst prove the following
Lemma 5.34 Let
X =
_

p
if 1 p <
c
0
if p = .
Then X is the closed linear subspace (with respect to | |
p
) generated by the
set of vectors
e
k
= (
k1
..
0, . . . , 0, 1, 0, . . . ) k = 1, 2 . . . (5.27)
Consequently, X is separable.
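Proof. For any $x = (x_n)\in\ell^p$, $1\le p<\infty$, we have
\[
\Big\| x - \sum_{k=1}^{n} x_k e_k \Big\|_p^p = \sum_{k=n+1}^{\infty} |x_k|^p \to 0 \qquad (n\to\infty) .
\]
Similarly, for any $x = (x_n)\in c_0$,
\[
\Big\| x - \sum_{k=1}^{n} x_k e_k \Big\|_\infty = \max\{ |x_k| \;:\; k>n \} \to 0 \qquad (n\to\infty)
\]
because $x_n\to 0$ by definition. The conclusion follows.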
Remark 5.35 We note that the conclusion of the above lemma is false for $\ell^\infty$ since
\[
\overline{\mathrm{sp}}(e_k \mid k\ge 1) = c_0 \subsetneq \ell^\infty . \tag{5.28}
\]
In fact, we know that $\ell^\infty$ is not separable, see Exercise 5.6.4.
Proof of Proposition 5.33. Let us consider, first, the case $1<p<\infty$. Fix $f\in(\ell^p)^*$ and set
\[
y_k := f(e_k) \qquad k\ge 1 \tag{5.29}
\]
where $e_k$ is defined in (5.27). It suffices to show that $y := (y_k)$ satisfies
\[
y\in\ell^q , \qquad \|y\|_q \le \|f\| , \qquad f = f_y . \tag{5.30}
\]
For any $n\ge 1$ let (6)
\[
z^{(n)}_k = \begin{cases} |y_k|^{q-2}y_k & \text{if } k\le n \\ 0 & \text{if } k>n . \end{cases}
\]
Then $z^{(n)}\in\ell^p$, since all but finitely many of its components vanish, and, since $(q-1)p = q$,
\[
\sum_{k=1}^{n} |y_k|^q = f(z^{(n)}) \le \|f\|\,\|z^{(n)}\|_p = \|f\| \Big( \sum_{k=1}^{n} |y_k|^q \Big)^{1/p} ,
\]
whence
\[
\Big( \sum_{k=1}^{n} |y_k|^q \Big)^{1/q} \le \|f\| \qquad \forall n\ge 1 .
\]
This yields the first two assertions in (5.30). To obtain the third one, fix $x\in\ell^p$ and let
\[
x^{(n)}_k := \begin{cases} x_k & \text{if } k\le n \\ 0 & \text{if } k>n . \end{cases}
\]
Observe that
\[
f(x^{(n)}) = \sum_{k=1}^{n} x_k f(e_k) = \sum_{k=1}^{n} x_k y_k .
\]
Since $x^{(n)}\to x$ in $\ell^p$ and the series $\sum_k x_k y_k$ converges, we conclude that $f = f_y$. This completes the analysis of the case $1<p<\infty$. The similar reasoning for the remaining cases is left as an exercise.
(6) Observe that $|y_k|^{q-2}y_k = 0$ if $y_k = 0$ since $q>1$.
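The choice of $z^{(n)}$ in the proof is exactly the one that turns Hölder's inequality into an equality. The sketch below checks this on a short vector with $q = 3$ (so $p = 3/2$); the function names and the data are hypothetical and only serve the illustration.

```python
def ls_norm(v, s):
    """Norm of a finitely supported real sequence in l^s, finite s >= 1."""
    return sum(abs(t) ** s for t in v) ** (1.0 / s)

def extremal(y, q):
    """z_k = |y_k|^(q-2) * y_k, with z_k = 0 wherever y_k = 0 (footnote (6))."""
    return [abs(t) ** (q - 2.0) * t if t != 0 else 0.0 for t in y]

q = 3.0
p = q / (q - 1.0)                 # conjugate exponent, here 1.5
y = [1.0, -2.0, 0.0, 0.5]         # hypothetical values of y_k = f(e_k), k <= n
z = extremal(y, q)

pairing = sum(a * b for a, b in zip(y, z))      # f_y(z) = sum |y_k|^q
bound = ls_norm(y, q) * ls_norm(z, p)           # ||y||_q * ||z||_p

print(z, pairing, bound)
```

Both printed numbers agree up to rounding, which is the identity $\sum_k |y_k|^q = \|y\|_q\,\|z^{(n)}\|_p$ behind the estimate $\|y\|_q \le \|f\|$.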
Exercise 5.36 1. Prove Proposition 5.33 for $p = 1$.
Hint: defining $y$ as in (5.29), the bound $\|y\|_\infty \le \|f\|$ is immediate . . .
2. Prove Proposition 5.33 for $p = \infty$.
Hint: define $y$ as in (5.29) and
\[
z^{(n)}_k = \begin{cases} \dfrac{y_k}{|y_k|} & \text{if } k\le n \text{ and } y_k\ne 0 \\ 0 & \text{if } y_k = 0 \text{ or } k>n . \end{cases}
\]
Then $\|z^{(n)}\|_\infty \le 1$ . . .
5.4 Weak convergence and reflexivity
Let $(X, \|\cdot\|)$ be a normed space. Then the dual space $X^*$ is itself a Banach space with the dual norm.
Definition 5.37 The space $X^{**} = (X^*)^*$ is called the bidual of $X$.
Let $J_X : X \to X^{**}$ be the linear map defined by
\[
\langle J_X(x), f\rangle := \langle f, x\rangle \qquad \forall x\in X ,\ \forall f\in X^* . \tag{5.31}
\]
Then, $|\langle J_X(x), f\rangle| \le \|f\|\,\|x\|$ by definition. So, $\|J_X(x)\| \le \|x\|$. Moreover, by Corollary 5.24, for any $x\in X$ a functional $f_x\in X^*$ exists such that $f_x(x) = \|x\|$ and $\|f_x\| = 1$. Thus, $\|x\| = |\langle J_X(x), f_x\rangle| \le \|J_X(x)\|$. Therefore, $\|J_X(x)\| = \|x\|$ for every $x\in X$, that is, $J_X$ is a linear isometry.
5.4.1 Reflexive spaces
The above considerations imply that $J_X(X)$ is a subspace of $X^{**}$. It is useful to single out the case where such a subspace coincides with the bidual.
Definition 5.38 A space $X$ is called reflexive if the map $J_X : X \to X^{**}$ defined in (5.31) is onto.
Recalling that $J_X$ is a linear isometry, we conclude that any reflexive space $X$ is isometrically isomorphic to its bidual $X^{**}$. Since $X^{**}$ is complete, like every dual space, every reflexive normed space must also be complete.
Example 5.39 1. If $X$ is a Hilbert space, then $X^*$ is isometrically isomorphic to $X$ by the Riesz-Fréchet Theorem. Therefore, so is $X^{**}$. In other words, any Hilbert space is reflexive.
2. Let $1<p<\infty$. Then Proposition 5.33 ensures that $(\ell^p)^* \cong \ell^q$, where $p$ and $q$ are conjugate exponents. So, $\ell^p$ is reflexive for all $p\in(1,\infty)$.
Theorem 5.40 Let $X$ be a Banach space.
(a) If $X^*$ is separable, then $X$ is separable.
(b) If $X^*$ is reflexive, then $X$ is reflexive.
Proof.
(a) Let $(f_k)$ be a dense sequence in $X^*$. There exists a sequence $(x_k)$ in $X$ such that
\[
\|x_k\| = 1 \quad\text{and}\quad |f_k(x_k)| \ge \frac{\|f_k\|}{2} \qquad \forall k\ge 1 .
\]
We claim that $X$ coincides with the closed subspace generated by $(x_k)$. For let $M = \overline{\mathrm{sp}}(x_k \mid k\ge 1)$ and suppose there exists $x_0\in X\setminus M$. Then, applying Corollary 5.23 we can find a functional $f\in X^*$ such that
\[
f(x_0) = 1 , \qquad f\big|_M = 0 , \qquad \|f\| = \frac{1}{d_M(x_0)} .
\]
So,
\[
\frac{\|f_k\|}{2} \le |f_k(x_k)| = |f_k(x_k) - f(x_k)| \le \|f_k - f\| ,
\]
whence
\[
\frac{1}{d_M(x_0)} = \|f\| \le \|f - f_k\| + \|f_k\| \le 3\,\|f - f_k\| .
\]
Thus, $(f_k)$ cannot be dense in $X^*$, a contradiction.
(b) Observe that, since $X$ is a Banach space, $J_X(X)$ is a closed subspace of $X^{**}$. Suppose there exists $\phi_0\in X^{**}\setminus J_X(X)$. Then, by Corollary 5.23 applied to the bidual, we can find a bounded linear functional on $X^{**}$, valued 1 at $\phi_0$ and 0 on $J_X(X)$. Since $X^*$ is reflexive, such a functional will belong to $J_{X^*}(X^*)$. So, for some $f\in X^*$,
\[
\langle \phi_0, f\rangle = 1 \quad\text{and}\quad 0 = \langle J_X(x), f\rangle = \langle f, x\rangle \qquad \forall x\in X ,
\]
a contradiction that concludes the proof.
Remark 5.41 1. From point (a) of Theorem 5.40 we conclude that, since $\ell^\infty$ is not separable, $(\ell^\infty)^*$ also fails to be separable. So, $(\ell^\infty)^*$ is not isomorphic to $\ell^1$, and $\ell^1$ is not reflexive. Moreover, $\ell^\infty$ also fails to be reflexive since otherwise $\ell^1$ would be reflexive by point (b) above.
2. The result of point (b) of Theorem 5.40 is an equivalence since the implication
\[
X \text{ reflexive} \implies X^* \text{ reflexive}
\]
is trivial. On the contrary, the implication of point (a) cannot be reversed. Indeed, $\ell^1$ is separable, whereas $\ell^\infty \cong (\ell^1)^*$ is not.
Corollary 5.42 A Banach space $X$ is reflexive and separable iff $X^*$ is reflexive and separable.
Proof. The only part of the conclusion that needs to be justified is the fact that, if $X$ is reflexive and separable, then $X^*$ is separable. But this follows from the fact that $X^{**}$ is separable, since it is isomorphic to $X$, and from Theorem 5.40 (a).
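We conclude this section with a result on the reflexivity of subspaces.
Proposition 5.43 Let $M$ be a closed linear subspace of a reflexive Banach space $X$. Then $M$ is reflexive.
Proof. Let $\phi$ be a bounded linear functional on $M^*$. Define a functional $\tilde\phi$ on $X^*$ by
\[
\langle \tilde\phi, f\rangle = \big\langle \phi, f\big|_M \big\rangle \qquad \forall f\in X^* .
\]
Since $\tilde\phi\in X^{**}$, by hypothesis we have that $\tilde\phi = J_X(x)$ for some $x\in X$. The proof is completed by the following two steps.
1. We claim that $x\in M$. For if $x\in X\setminus M$, then by Corollary 5.23 there exists $f\in X^*$ such that
\[
\langle f, x\rangle = 1 \quad\text{and}\quad f\big|_M = 0 .
\]
This yields a contradiction since
\[
1 = \langle \tilde\phi, f\rangle = \big\langle \phi, f\big|_M \big\rangle = 0 .
\]
2. We claim that $\phi = J_M(x)$. Indeed, for any $f\in M^*$ let $\tilde f\in X^*$ be the extension of $f$ to $X$ provided by the Hahn-Banach Theorem. Then,
\[
\langle \phi, f\rangle = \langle \tilde\phi, \tilde f\rangle = \langle \tilde f, x\rangle = \langle f, x\rangle \qquad \forall f\in M^* .
\]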
5.4.2 Weak convergence and BW property
It is well known that the unit ball $B_1$ of a finite dimensional Banach space is relatively compact. We refer to such a property as the Bolzano-Weierstrass property. One of the most striking phenomena that occur in infinite dimensions is that the Bolzano-Weierstrass property is no longer true. In fact, the following result holds.
Theorem 5.44 Any Banach space with the Bolzano-Weierstrass property must be finite dimensional.
Lemma 5.45 Let $M$ be a closed linear subspace of a Banach space $X$ such that $M\ne X$. Then a sequence $(x_n)\subset X$ exists such that
\[
\|x_n\| = 1 \quad \forall n\ge 1 \quad\text{and}\quad d_M(x_n)\to 1 \text{ as } n\to\infty . \tag{5.32}
\]
Proof. Invoking Corollary 5.23, we can find a functional $f\in X^*$ such that
\[
\|f\| = 1 \quad\text{and}\quad f\big|_M = 0 .
\]
Then, for every $n\ge 1$ there exists $x_n\in X$ such that
\[
\|x_n\| = 1 \quad\text{and}\quad |f(x_n)| > 1 - \frac{1}{n} .
\]
Therefore, for every $n\ge 1$,
\[
1 - \frac{1}{n} < |f(x_n) - f(y)| \le \|x_n - y\| \qquad \forall y\in M .
\]
Taking the infimum over all $y\in M$ we obtain that $1 - 1/n \le d_M(x_n) \le 1$. The conclusion follows.
Proof of Theorem 5.44. Suppose $\dim X = \infty$. Let $x_1$ be a fixed unit vector and let $V_1 := \mathbb{R}x_1 = \mathrm{sp}(x_1)$. Since $V_1\ne X$, the above lemma implies the existence of a vector $x_2\in X$ such that
\[
\|x_2\| = 1 \quad\text{and}\quad d_{V_1}(x_2) > \frac{1}{2} .
\]
Let $V_2 := \mathrm{sp}(x_1, x_2)$ and observe that $V_1\subset V_2\subsetneq X$. Again by Lemma 5.45 we can find a vector $x_3\in X$ such that
\[
\|x_3\| = 1 \quad\text{and}\quad d_{V_2}(x_3) > 1 - \frac{1}{3} .
\]
Iterating this process we can construct a sequence $(x_n)$ in $X$ such that
\[
\|x_n\| = 1 \quad\text{and}\quad d_{V_n}(x_{n+1}) > 1 - \frac{1}{n+1} ,
\]
where $V_n = \mathrm{sp}(x_1,\dots,x_n)\subsetneq X$. Then, $(x_n)$ has no cluster point in $X$ since, for any $1\le m<n$, we have $1 - 1/n < d_{V_m}(x_n) \le \|x_n - x_m\|$.
A surrogate for the Bolzano-Weierstrass property in infinite dimensional spaces is the notion of convergence we introduce below.
Definition 5.46 A sequence $(x_n)\subset X$ is said to converge weakly to a point $x\in X$ if
\[
\lim_{n\to\infty} \langle f, x_n\rangle = \langle f, x\rangle \qquad \forall f\in X^* .
\]
In this case we write $w\text{-}\lim_{n\to\infty} x_n = x$ or $x_n \rightharpoonup x$.
A sequence $(x_n)$ that converges in norm is also said to converge strongly. Since $|\langle f, x_n\rangle - \langle f, x\rangle| \le \|f\|\,\|x_n - x\|$, it is easy to see that any strongly convergent sequence is also weakly convergent. The converse is not true, as is shown by the following example.
Example 5.47 Let $(e_n)$ be an orthonormal sequence in an infinite dimensional Hilbert space $X$. Then, owing to Bessel's inequality, $\langle x, e_n\rangle \to 0$ as $n\to\infty$ for every $x\in X$. Therefore, $e_n \rightharpoonup 0$ as $n\to\infty$. On the other hand, $\|e_n\| = 1$ for every $n$. So, $(e_n)$ does not converge strongly to 0.
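Proposition 5.48 Let $(x_n)$, $(y_n)$ be sequences in a Banach space $X$, and let $x, y\in X$.
(a) If $x_n \rightharpoonup x$ and $x_n \rightharpoonup y$, then $x = y$.
(b) If $x_n \rightharpoonup x$ and $y_n \rightharpoonup y$, then $x_n + y_n \rightharpoonup x + y$.
(c) If $x_n \rightharpoonup x$ and $(\lambda_n)\subset\mathbb{R}$ converges to $\lambda$, then $\lambda_n x_n \rightharpoonup \lambda x$.
(d) If $x_n \rightharpoonup x$ in $X$ and $\Lambda\in\mathcal{L}(X, Y)$, then $\Lambda x_n \rightharpoonup \Lambda x$ in $Y$.
(e) If $x_n \rightharpoonup x$, then $(x_n)$ is bounded.
(f) If $x_n \rightharpoonup x$, then $\|x\| \le \liminf_{n\to\infty} \|x_n\|$.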
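Proof.
(a) By hypothesis we have that $\langle f, x - y\rangle = 0$ for every $f\in X^*$. Then, the conclusion follows recalling Exercise 5.25.3.
(b) The proof is left to the reader.
(c) Since $(\lambda_n)$ is bounded, say $|\lambda_n| \le C$, for any $f\in X^*$ we have that
\[
|\lambda_n\langle f, x_n\rangle - \lambda\langle f, x\rangle| \le \underbrace{|\lambda_n|}_{\le C}\,\underbrace{|\langle f, x_n - x\rangle|}_{\to 0} + \underbrace{|\lambda_n - \lambda|}_{\to 0}\,|\langle f, x\rangle| .
\]
(d) Let $g\in Y^*$. Then $\langle g, \Lambda x_n\rangle = \langle g\circ\Lambda, x_n\rangle \to \langle g\circ\Lambda, x\rangle = \langle g, \Lambda x\rangle$ since $g\circ\Lambda\in X^*$.
(e) Consider the sequence $(J_X(x_n))$ in $X^{**}$. Since
\[
\langle J_X(x_n), f\rangle = \langle f, x_n\rangle \to \langle f, x\rangle \qquad \forall f\in X^* ,
\]
we have that $\sup_n |\langle J_X(x_n), f\rangle| < \infty$ for all $f\in X^*$. So, the Banach-Steinhaus Theorem implies that
\[
\sup_n \|x_n\| = \sup_n \|J_X(x_n)\| < \infty .
\]
(f) Let $f\in X^*$ be such that $\|f\| \le 1$. Then,
\[
\underbrace{|\langle f, x_n\rangle|}_{\to |\langle f, x\rangle|} \le \|x_n\| \implies |\langle f, x\rangle| \le \liminf_{n\to\infty} \|x_n\| .
\]
The conclusion follows recalling Exercise 5.25.4.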
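Exercise 5.49 1. Let $f_n : \mathbb{R}\to\mathbb{R}$ be defined by
\[
f_n(x) = \begin{cases} \dfrac{1}{2^n} & \text{if } x\in[2^n, 2^{n+1}], \\ 0 & \text{otherwise.} \end{cases}
\]
Show that
• $f_n \rightharpoonup 0$ in $L^p(\mathbb{R})$ for all $1<p\le+\infty$;
• $f_n$ does not converge weakly in $L^1(\mathbb{R})$.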
2. Show that, in a Hilbert space $X$,
\[
x_n \to x \iff x_n \rightharpoonup x \ \text{ and } \ \|x_n\| \to \|x\| .
\]
Hint: observe that $\|x_n - x\|^2 = \|x_n\|^2 + \|x\|^2 - 2\langle x_n, x\rangle$ . . .
3. Let $C$ be a closed convex subset of $X$ and let $(x_n)\subset C$. Show that, if $x_n \rightharpoonup x$, then $x\in C$.
Hint: use Lemma 5.31.
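Besides strong and weak convergence, in dual spaces one can define another notion of convergence.
Definition 5.50 A sequence $(f_n)\subset X^*$ is said to converge weakly-$*$ to a functional $f\in X^*$ iff
\[
\langle f_n, x\rangle \to \langle f, x\rangle \quad\text{as } n\to\infty \qquad \forall x\in X . \tag{5.33}
\]
In this case we write
\[
w^*\text{-}\lim_{n\to\infty} f_n = f \quad\text{or}\quad f_n \overset{*}{\rightharpoonup} f \quad (\text{as } n\to\infty) .
\]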
Remark 5.51 It is interesting to compare weak and weak-$*$ convergence on $X^*$. By definition, a sequence $(f_n)\subset X^*$ converges weakly to $f$ iff
\[
\langle \phi, f_n\rangle \to \langle \phi, f\rangle \quad\text{as } n\to\infty \tag{5.34}
\]
for all $\phi\in X^{**}$, whereas $f_n \overset{*}{\rightharpoonup} f$ iff (5.34) holds for all $\phi\in J_X(X)$. Therefore, weak convergence is equivalent to weak-$*$ convergence if $X$ is reflexive but, in general, weak convergence is stronger than weak-$*$ convergence.
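Example 5.52 In $\ell^\infty \cong (\ell^1)^*$ consider the sequence $(x^{(n)})$ defined by
\[
x^{(n)}_k := \begin{cases} 0 & \text{if } k\le n \\ 1 & \text{if } k>n . \end{cases}
\]
Then $x^{(n)} \overset{*}{\rightharpoonup} 0$. Indeed, for every $y = (y_k)\in\ell^1$,
\[
\big\langle j_1\big(x^{(n)}\big), y \big\rangle = \sum_{k=1}^{\infty} x^{(n)}_k y_k = \sum_{k=n+1}^{\infty} y_k \to 0 \qquad (n\to\infty) ,
\]
where $j_1 : \ell^\infty \to (\ell^1)^*$ is the linear isometry defined in (5.26). On the other hand, we have that $x^{(n)} \not\rightharpoonup 0$. Indeed, define
\[
f(x) := \lim_{k\to\infty} x_k \qquad \forall x = (x_k)\in c_1
\]
where $c_1 := \{ x = (x_k)\in\ell^\infty \mid \exists \lim_k x_k \}$ (see Example 5.22). Then, denoting by $F$ any bounded linear functional extending $f$ to $\ell^\infty$ (for instance, the one provided by the Hahn-Banach Theorem), we have that
\[
\big\langle F, x^{(n)} \big\rangle = \lim_{k\to\infty} x^{(n)}_k = 1 \qquad \forall n\ge 1 .
\]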
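Exercise 5.53 1. Show that any $(f_n)\subset X^*$ that converges weakly-$*$ is bounded in $X^*$.
2. Show that, if $x_n \rightharpoonup x$ and $f_n \to f$, then $\langle f_n, x_n\rangle \to \langle f, x\rangle$ as $n\to\infty$.
3. Show that, if $x_n \to x$ and $f_n \overset{*}{\rightharpoonup} f$, then $\langle f_n, x_n\rangle \to \langle f, x\rangle$ as $n\to\infty$.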
One of the nice features of weak-$*$ convergence is the following result yielding a sort of weak-$*$ Bolzano-Weierstrass property of $X^*$.
Theorem 5.54 (Banach-Alaoglu) Let $X$ be a separable normed space. Then every bounded sequence $(f_n)\subset X^*$ has a weakly-$*$ convergent subsequence.
Proof. Let $(x_n)$ be a dense sequence in $X$ and let $C\ge 0$ be an upper bound for $\|f_n\|$. Then $|f_n(x_1)| \le C\|x_1\|$. So, there exists a subsequence of $(f_n)$, say $(f_{1,n})$, such that $f_{1,n}(x_1)$ converges. Next, since $|f_{1,n}(x_2)| \le C\|x_2\|$, there exists a subsequence $(f_{2,n})\subset(f_{1,n})$ such that $f_{2,n}(x_2)$ converges. Iterating this process, for any $k\ge 1$ we can construct nested subsequences
\[
(f_{k,n}) \subset (f_{k-1,n}) \subset \dots \subset (f_{1,n}) \subset (f_n)
\]
such that $|f_n(x_k)| \le C\|x_k\|$ and $f_{k,n}(x_k)$ converges as $n\to\infty$ for every $k\ge 1$. Define, for $n\ge 1$, $g_n(x) := f_{n,n}(x)$ for all $x\in X$. Then, $(g_n)\subset(f_n)$, $|g_n(x)| \le C\|x\|$, and $g_n(x_k)$ converges as $n\to\infty$ for every $k\ge 1$ since it is, for $n\ge k$, a subsequence of $f_{k,n}(x_k)$.
Let us complete the proof by showing that $g_n(x)$ converges for every $x\in X$. Fix $x\in X$ and $\varepsilon>0$. Then, there exist $k_\varepsilon, n_\varepsilon\ge 1$ such that
\[
\begin{cases} \|x - x_{k_\varepsilon}\| < \varepsilon \\ |g_n(x_{k_\varepsilon}) - g_m(x_{k_\varepsilon})| < \varepsilon \quad \forall m, n\ge n_\varepsilon . \end{cases}
\]
Therefore, for all $m, n\ge n_\varepsilon$,
\[
|g_n(x) - g_m(x)| \le \underbrace{|g_n(x) - g_n(x_{k_\varepsilon})| + |g_m(x_{k_\varepsilon}) - g_m(x)|}_{\le 2C\|x - x_{k_\varepsilon}\|} + |g_n(x_{k_\varepsilon}) - g_m(x_{k_\varepsilon})| \le (2C+1)\,\varepsilon .
\]
Thus, $(g_n(x))$ is a Cauchy sequence satisfying $|g_n(x)| \le C\|x\|$ for all $x\in X$. This implies that $f(x) := \lim_n g_n(x)$ is an element of $X^*$.
The main result of this section is that reflexive Banach spaces have the weak Bolzano-Weierstrass property, as we show next.
Theorem 5.55 In a reflexive Banach space, every bounded sequence has a weakly convergent subsequence.
Proof. Let $(x_n)$ be a bounded sequence and define $M := \overline{\mathrm{sp}}(x_n \mid n\ge 1)$. Observe that, in view of Proposition 5.43, $M$ is a separable reflexive Banach space. Therefore, Corollary 5.42 ensures that $M^*$ is separable and reflexive too. Consider the bounded sequence $(J_M(x_n))\subset M^{**}$. Applying Alaoglu's Theorem, we can find a subsequence $(x_{n_k})$ such that $J_M(x_{n_k}) \overset{*}{\rightharpoonup} \phi\in M^{**}$ as $k\to\infty$. The reflexivity of $M$ guarantees that $\phi = J_M(x)$ for some $x\in M$. Therefore, for every $f\in M^*$,
\[
f(x_{n_k}) = \langle J_M(x_{n_k}), f\rangle \to \langle J_M(x), f\rangle = f(x) \quad\text{as } k\to\infty .
\]
Finally, for any $F\in X^*$ we have that $F\big|_M\in M^*$. So,
\[
F(x_{n_k}) = F\big|_M(x_{n_k}) \to F\big|_M(x) = F(x) \quad\text{as } k\to\infty .
\]
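Exercise 5.56 1. Let $1<p<\infty$ and let $x^{(n)} = (x^{(n)}_k)_{k\ge 1}$ be a bounded sequence in $\ell^p$. Show that $x^{(n)} \rightharpoonup x = (x_k)_{k\ge 1}$ in $\ell^p$ if and only if, for every $k\ge 1$, $x^{(n)}_k \to x_k$ as $n\to\infty$.
Hint: suppose $x = 0$, fix $y\in\ell^q$, and let $\|x^{(n)}\|_p \le C$ for all $n\ge 1$. For any $\varepsilon>0$ let $k_\varepsilon\ge 1$ be such that
\[
\Big( \sum_{k=k_\varepsilon+1}^{\infty} |y_k|^q \Big)^{1/q} < \varepsilon ,
\]
and let $n_\varepsilon\ge 1$ be such that
\[
\Big( \sum_{k=1}^{k_\varepsilon} |x^{(n)}_k|^p \Big)^{1/p} < \varepsilon \qquad \forall n\ge n_\varepsilon .
\]
Then, for all $n\ge n_\varepsilon$,
\[
|\langle j_p(y), x^{(n)}\rangle| \le \Big| \sum_{k=1}^{k_\varepsilon} y_k x^{(n)}_k \Big| + \Big| \sum_{k=k_\varepsilon+1}^{\infty} y_k x^{(n)}_k \Big|
\le \underbrace{\Big( \sum_{k=1}^{k_\varepsilon} |y_k|^q \Big)^{1/q}}_{\le \|y\|_q} \underbrace{\Big( \sum_{k=1}^{k_\varepsilon} |x^{(n)}_k|^p \Big)^{1/p}}_{<\varepsilon} + \underbrace{\Big( \sum_{k=k_\varepsilon+1}^{\infty} |y_k|^q \Big)^{1/q}}_{<\varepsilon} \underbrace{\Big( \sum_{k=k_\varepsilon+1}^{\infty} |x^{(n)}_k|^p \Big)^{1/p}}_{\le C} .
\]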
2. Find a counterexample to show that the above conclusion is false if $x^{(n)}$ fails to be bounded.
Hint: in $\ell^2$ let $x^{(n)} = n^2 e_n$ where $e_n$ is the sequence of vectors defined in (5.27). Then, for every $k\ge 1$, $x^{(n)}_k \to 0$ as $n\to\infty$. On the other hand, taking $y = (1/k)_{k\ge 1}$ we have that
\[
y\in\ell^2 \quad\text{and}\quad \langle j_2(y), x^{(n)}\rangle = n .
\]
3. Let $x^{(n)} = (x^{(n)}_k)_{k\ge 1}$ be bounded in $c_0$. Show that $x^{(n)} \rightharpoonup x = (x_k)_{k\ge 1}$ in $c_0$ if and only if, for every $k\ge 1$, $x^{(n)}_k \to x_k$ as $n\to\infty$.
Hint: argue as in point 1 above.
4. Let $1<p<\infty$ and let $x, x^{(n)}\in\ell^p$. Show that
\[
x^{(n)} \to x \iff \begin{cases} x^{(n)} \rightharpoonup x \\ \|x^{(n)}\| \to \|x\| \end{cases} \tag{5.35}
\]
Hint: use point 1 of this exercise and adapt the proof of Proposition 3.38 observing that, for any $k\ge 1$,
\[
0 \le \frac{|x^{(n)}_k|^p + |x_k|^p}{2} - \Big| \frac{x^{(n)}_k - x_k}{2} \Big|^p \to |x_k|^p \qquad (n\to\infty) .
\]
5. Show that property (5.35) fails in $c_0$.
Hint: consider the sequence $x^{(n)} = e_1 + e_n$ where $(e_k)$ is the sequence of vectors defined in (5.27).
Remark 5.57 1. We say that a Banach space $X$ has the Radon-Riesz property if (5.35) holds true for every sequence $x^{(n)}$ in $X$. By the above exercise, such a property holds in $\ell^p$ for all $1<p<\infty$, but not in $c_0$. Owing to Exercise 5.49.2, all Hilbert spaces have the Radon-Riesz property.
2. A surprising result known as Schur's Theorem (7) ensures that, in $\ell^1$, weak and strong convergence coincide, that is, for all $x^{(n)}, x\in\ell^1$, we have that
\[
x^{(n)} \rightharpoonup x \iff x^{(n)} \to x .
\]
Then, in view of Schur's Theorem, $\ell^1$ has the Radon-Riesz property. On the other hand, this very theorem makes it easy to check that the property described in Exercise 5.56.1 fails in $\ell^1$. Indeed, the sequence $(e_k)$ in (5.27) does not converge strongly (and thus not weakly) to 0.
(7) See, for instance, Proposition 2.19 in [3].
Chapter 6
Product measures
6.1 Product spaces
6.1.1 Product measure
Let $(X, \mathcal{F})$ and $(Y, \mathcal{G})$ be measurable spaces. We will turn the product $X\times Y$ into a measurable space in a canonical way.
A set of the form $A\times B$, where $A\in\mathcal{F}$ and $B\in\mathcal{G}$, is called a measurable rectangle. Let us denote by $\mathcal{R}$ the family of all finite disjoint unions of measurable rectangles.
Proposition 6.1 $\mathcal{R}$ is an algebra.
Proof. Clearly, $\emptyset$ and $X\times Y$ are measurable rectangles. It is also easy to see that the intersection of any two measurable rectangles is again a measurable rectangle. Moreover, the intersection of any two elements of $\mathcal{R}$ stays in $\mathcal{R}$. Indeed, let $\bigsqcup_i (A_i\times B_i)$ and $\bigsqcup_j (C_j\times D_j)$ (1) be finite disjoint unions of measurable rectangles. Then,
\[
\Big( \bigsqcup_i (A_i\times B_i) \Big) \cap \Big( \bigsqcup_j (C_j\times D_j) \Big) = \bigsqcup_{i,j} \big( (A_i\times B_i)\cap(C_j\times D_j) \big) \in \mathcal{R} .
\]
Let us show that the complement of any set $E\in\mathcal{R}$ is again in $\mathcal{R}$. This is true if $E = A\times B$ is a measurable rectangle since
\[
E^c = (A^c\times B) \sqcup (A\times B^c) \sqcup (A^c\times B^c) .
\]
Now, proceeding by induction, let
\[
E = \underbrace{\Big( \bigsqcup_{i=1}^{n} (A_i\times B_i) \Big)}_{F} \sqcup\, (A_{n+1}\times B_{n+1}) \in \mathcal{R}
\]
and suppose $F^c\in\mathcal{R}$. Then, $E^c = F^c\cap(A_{n+1}\times B_{n+1})^c \in \mathcal{R}$ because $(A_{n+1}\times B_{n+1})^c\in\mathcal{R}$ and we have already proven that $\mathcal{R}$ is closed under intersection. This completes the proof.
(1) Hereafter the symbol $\sqcup$ denotes a disjoint union.
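Definition 6.2 The $\sigma$-algebra generated by $\mathcal{R}$ is called the product $\sigma$-algebra of $\mathcal{F}$ and $\mathcal{G}$. It is denoted by $\mathcal{F}\times\mathcal{G}$.
For any $E\in\mathcal{F}\times\mathcal{G}$ we define the sections of $E$ putting, for $x\in X$ and $y\in Y$,
\[
E_x = \{ y\in Y : (x, y)\in E \} , \qquad E^y = \{ x\in X : (x, y)\in E \} .
\]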
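Proposition 6.3 Let $(X, \mathcal{F}, \mu)$, $(Y, \mathcal{G}, \nu)$ be $\sigma$-finite measure spaces and let $E\in\mathcal{F}\times\mathcal{G}$. Then the following statements hold.
(a) $E_x\in\mathcal{G}$ and $E^y\in\mathcal{F}$ for any $(x, y)\in X\times Y$.
(b) The functions
\[
\begin{cases} X\to\mathbb{R} \\ x\mapsto\nu(E_x) \end{cases} \quad\text{and}\quad \begin{cases} Y\to\mathbb{R} \\ y\mapsto\mu(E^y) \end{cases}
\]
are $\mu$-measurable and $\nu$-measurable, respectively. Moreover,
\[
\int_X \nu(E_x)\,d\mu = \int_Y \mu(E^y)\,d\nu . \tag{6.1}
\]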
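Proof. Suppose, first, that $E = \bigsqcup_{i=1}^{n} (A_i\times B_i)$ stays in $\mathcal{R}$. Then, for $(x, y)\in X\times Y$ we have $E_x = \bigsqcup_{i=1}^{n} (A_i\times B_i)_x$ and $E^y = \bigsqcup_{i=1}^{n} (A_i\times B_i)^y$, where
\[
(A_i\times B_i)_x = \begin{cases} B_i & \text{if } x\in A_i , \\ \emptyset & \text{if } x\notin A_i , \end{cases} \qquad (A_i\times B_i)^y = \begin{cases} A_i & \text{if } y\in B_i , \\ \emptyset & \text{if } y\notin B_i . \end{cases}
\]
Consequently,
\[
\nu(E_x) = \sum_{i=1}^{n} \nu\big((A_i\times B_i)_x\big) = \sum_{i=1}^{n} \nu(B_i)\,\chi_{A_i}(x) ,
\]
\[
\mu(E^y) = \sum_{i=1}^{n} \mu\big((A_i\times B_i)^y\big) = \sum_{i=1}^{n} \mu(A_i)\,\chi_{B_i}(y) ,
\]
so that (6.1) follows. Then the thesis holds true when $E$ stays in $\mathcal{R}$.
Now, let $\mathcal{E}$ be the family of all sets $E\in\mathcal{F}\times\mathcal{G}$ satisfying (a). Clearly, $X\times Y\in\mathcal{E}$. Furthermore, for any $E, (E_n)_n\subset\mathcal{F}\times\mathcal{G}$ and $(x, y)\in X\times Y$ we have
\[
(E^c)_x = (E_x)^c , \qquad (E^c)^y = (E^y)^c ,
\]
\[
\bigcup_n (E_n)_x = \Big( \bigcup_n E_n \Big)_x , \qquad \bigcup_n (E_n)^y = \Big( \bigcup_n E_n \Big)^y .
\]
Hence $\mathcal{E}$ is a $\sigma$-algebra including $\mathcal{R}$ and, consequently, $\mathcal{E} = \mathcal{F}\times\mathcal{G}$.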
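We are going to prove (b). First assume that $\mu$ and $\nu$ are finite and define
\[
\mathcal{M} = \big\{ E\in\mathcal{F}\times\mathcal{G} \;\big|\; E \text{ satisfies (b)} \big\} .
\]
We claim that $\mathcal{M}$ is a monotone class. For let $(E_n)_n\subset\mathcal{M}$ be such that $E_n\uparrow E$. Then, for any $(x, y)\in X\times Y$,
\[
(E_n)_x \uparrow E_x \quad\text{and}\quad (E_n)^y \uparrow E^y .
\]
Thus
\[
\nu\big((E_n)_x\big) \uparrow \nu(E_x) \quad\text{and}\quad \mu\big((E_n)^y\big) \uparrow \mu(E^y) .
\]
Since $x\mapsto\nu\big((E_n)_x\big)$ is $\mu$-measurable for all $n\in\mathbb{N}$, we have that $x\mapsto\nu(E_x)$ is $\mu$-measurable too. Similarly, $y\mapsto\mu(E^y)$ is $\nu$-measurable. Furthermore, by the Monotone Convergence Theorem,
\[
\int_X \nu(E_x)\,d\mu = \lim_{n\to\infty} \int_X \nu\big((E_n)_x\big)\,d\mu = \lim_{n\to\infty} \int_Y \mu\big((E_n)^y\big)\,d\nu = \int_Y \mu(E^y)\,d\nu .
\]
Therefore, $E\in\mathcal{M}$. Next consider $(E_n)_n\subset\mathcal{M}$ such that $E_n\downarrow E$. Then, a similar argument as above shows that for every $(x, y)\in X\times Y$
\[
\nu\big((E_n)_x\big) \downarrow \nu(E_x) \quad\text{and}\quad \mu\big((E_n)^y\big) \downarrow \mu(E^y) .
\]
Consequently the functions $x\mapsto\nu(E_x)$ and $y\mapsto\mu(E^y)$ are $\mu$-measurable and $\nu$-measurable, respectively. Furthermore,
\[
\nu\big((E_n)_x\big) \le \nu(Y) \quad \forall x\in X , \qquad \mu\big((E_n)^y\big) \le \mu(X) \quad \forall y\in Y ,
\]
and, since $\mu$ and $\nu$ are finite, the constants are summable. Then, Lebesgue's Theorem yields
\[
\int_X \nu(E_x)\,d\mu = \lim_{n\to\infty} \int_X \nu\big((E_n)_x\big)\,d\mu = \lim_{n\to\infty} \int_Y \mu\big((E_n)^y\big)\,d\nu = \int_Y \mu(E^y)\,d\nu ,
\]
which implies $E\in\mathcal{M}$. So, $\mathcal{M}$ is a monotone class as claimed. By the first part of the proof, $\mathcal{R}\subset\mathcal{M}$. Theorem 1.29 implies that $\mathcal{M} = \mathcal{F}\times\mathcal{G}$. Then the thesis is proved when $\mu$ and $\nu$ are finite.
Now assume that $\mu$ and $\nu$ are $\sigma$-finite; we have $X = \bigcup_k X_k$, $Y = \bigcup_k Y_k$ for some increasing sequences $(X_k)_k\subset\mathcal{F}$ and $(Y_k)_k\subset\mathcal{G}$ such that
\[
\mu(X_k) < \infty , \qquad \nu(Y_k) < \infty \qquad \forall k\in\mathbb{N} . \tag{6.2}
\]
Define $\mu_k = \mu\llcorner X_k$, $\nu_k = \nu\llcorner Y_k$ and fix $E\in\mathcal{F}\times\mathcal{G}$. For any $(x, y)\in X\times Y$,
\[
E_x\cap Y_k \uparrow E_x \quad\text{and}\quad E^y\cap X_k \uparrow E^y .
\]
Thus
\[
\nu_k(E_x) = \nu\big(E_x\cap Y_k\big) \uparrow \nu(E_x) \quad\text{and}\quad \mu_k(E^y) = \mu\big(E^y\cap X_k\big) \uparrow \mu(E^y) .
\]
Since $\mu_k$ and $\nu_k$ are finite measures, for all $k\in\mathbb{N}$ the function $x\mapsto\nu_k(E_x)$ is $\mu$-measurable; consequently $x\mapsto\nu(E_x)$ is $\mu$-measurable too. Similarly, $y\mapsto\mu(E^y)$ is $\nu$-measurable. Furthermore, applying the finite case to $\mu_k$, $\nu_k$ and then the Monotone Convergence Theorem,
\[
\int_X \nu(E_x)\,d\mu = \lim_{k\to\infty} \int_X \nu_k(E_x)\,d\mu_k = \lim_{k\to\infty} \int_Y \mu_k(E^y)\,d\nu_k = \int_Y \mu(E^y)\,d\nu .
\]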
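Identity (6.1) is easiest to see in the simplest finite setting. The sketch below assumes that $X$ and $Y$ are finite sets carrying their counting measures (an illustrative choice, not the general case treated in the proof), so that both integrals in (6.1) become finite sums of section cardinalities. The names `section_x`, `section_y` and the set `E` are hypothetical.

```python
def section_x(E, x):
    """E_x = {y : (x, y) in E}."""
    return {b for (a, b) in E if a == x}

def section_y(E, y):
    """E^y = {x : (x, y) in E}."""
    return {a for (a, b) in E if b == y}

X = range(4)
Y = range(3)
E = {(0, 0), (0, 2), (1, 1), (3, 0), (3, 1), (3, 2)}   # an arbitrary subset of X x Y

# With counting measures, nu(E_x) = |E_x| and mu(E^y) = |E^y|.
integral_over_X = sum(len(section_x(E, x)) for x in X)
integral_over_Y = sum(len(section_y(E, y)) for y in Y)

print(integral_over_X, integral_over_Y, len(E))
```

Both sums count the points of $E$, once row by row and once column by column, which is the content of (6.1) in this toy case.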

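Theorem 6.4 Let $(X, \mathcal{F}, \mu)$ and $(Y, \mathcal{G}, \nu)$ be $\sigma$-finite measure spaces. The set function $\mu\times\nu$ defined by
\[
(\mu\times\nu)(E) = \int_X \nu(E_x)\,d\mu = \int_Y \mu(E^y)\,d\nu \qquad \forall E\in\mathcal{F}\times\mathcal{G} \tag{6.3}
\]
is a $\sigma$-finite measure on $(X\times Y, \mathcal{F}\times\mathcal{G})$, called the product measure of $\mu$ and $\nu$. Moreover, if $\lambda$ is any measure on $(X\times Y, \mathcal{F}\times\mathcal{G})$ satisfying
\[
\lambda(A\times B) = \mu(A)\,\nu(B) \qquad \forall A\in\mathcal{F} ,\ \forall B\in\mathcal{G} , \tag{6.4}
\]
then $\lambda = \mu\times\nu$.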
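Proof. First, to check that $\mu\times\nu$ is $\sigma$-additive let $(E_n)_n$ be a disjoint sequence in $\mathcal{F}\times\mathcal{G}$. Then, for any $(x, y)\in X\times Y$, $\big((E_n)_x\big)_n$ and $\big((E_n)^y\big)_n$ are disjoint families in $\mathcal{G}$ and $\mathcal{F}$, respectively. Therefore,
\[
(\mu\times\nu)\Big( \bigcup_n E_n \Big) = \int_X \nu\Big( \Big( \bigcup_n E_n \Big)_x \Big)\,d\mu = \int_X \nu\Big( \bigcup_n (E_n)_x \Big)\,d\mu = \int_X \sum_n \nu\big((E_n)_x\big)\,d\mu
\]
\[
\text{[Proposition 2.39]} \quad = \sum_n \int_X \nu\big((E_n)_x\big)\,d\mu = \sum_n (\mu\times\nu)(E_n) .
\]
To prove that $\mu\times\nu$ is $\sigma$-finite, observe that if $(X_k)_k\subset\mathcal{F}$ and $(Y_k)_k\subset\mathcal{G}$ are two increasing sequences with union $X$ and $Y$, respectively, such that
\[
\mu(X_k) < \infty , \qquad \nu(Y_k) < \infty \qquad \forall k\in\mathbb{N} ,
\]
then, setting $Z_k = X_k\times Y_k$, we have $Z_k\in\mathcal{F}\times\mathcal{G}$, $(\mu\times\nu)(Z_k) = \mu(X_k)\,\nu(Y_k) < \infty$ and $X\times Y = \bigcup_k Z_k$. Next, if $\lambda$ is a measure on $(X\times Y, \mathcal{F}\times\mathcal{G})$ satisfying (6.4), then $\lambda$ and $\mu\times\nu$ coincide on $\mathcal{R}$. Theorem 1.26 ensures that $\lambda$ and $\mu\times\nu$ coincide on $\sigma(\mathcal{R})$.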
The following result is a straightforward consequence of (6.3).
Corollary 6.5 Under the same assumptions of Theorem 6.4, let $E\in\mathcal{F}\times\mathcal{G}$ be such that $(\mu\times\nu)(E) = 0$. Then, $\mu(E^y) = 0$ for $\nu$-a.e. $y\in Y$, and $\nu(E_x) = 0$ for $\mu$-a.e. $x\in X$.
Example 6.6 We note that $\mu\times\nu$ may not be a complete measure even when both $\mu$ and $\nu$ are complete. Indeed, let $\lambda$ denote Lebesgue measure on $X = [0, 1]$ and take $\mathcal{G}$ to be the $\sigma$-algebra of all Lebesgue measurable sets in $[0, 1]$ (that is, $\mathcal{G}$ consists of all additive sets, see Definition 1.33). Let $A\subset[0, 1]$ be a nonempty negligible set and let $B\subset[0, 1]$ be a set that is not measurable (see Example 1.52). Then, $A\times B\subset A\times[0, 1]$ and $(\lambda\times\lambda)(A\times[0, 1]) = 0$. On the other hand, $A\times B\notin\mathcal{G}\times\mathcal{G}$, for otherwise one would get a contradiction with Proposition 6.3(a).
6.1.2 Fubini-Tonelli Theorem
In this section we will reduce the computation of a double integral with respect to $\mu\times\nu$ to the computation of two simple integrals. The following two theorems are basic in the theory of multiple integration.
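Theorem 6.7 (Tonelli) Let $(X, \mathcal{F}, \mu)$ and $(Y, \mathcal{G}, \nu)$ be $\sigma$-finite measure spaces. Let $F : X\times Y\to[0, \infty]$ be a $(\mu\times\nu)$-measurable function. Then the following statements hold true.
(a) (i) For every $x\in X$ the function $F(x, \cdot) : y\mapsto F(x, y)$ is $\nu$-measurable.
(ii) For every $y\in Y$ the function $F(\cdot, y) : x\mapsto F(x, y)$ is $\mu$-measurable.
(b) (i) The function $x\mapsto\int_Y F(x, y)\,d\nu(y)$ is $\mu$-measurable.
(ii) The function $y\mapsto\int_X F(x, y)\,d\mu(x)$ is $\nu$-measurable.
(c) We have the identities
\[
\int_{X\times Y} F(x, y)\,d(\mu\times\nu)(x, y) = \int_X \Big[ \int_Y F(x, y)\,d\nu(y) \Big]\,d\mu(x) \tag{6.5}
\]
\[
\phantom{\int_{X\times Y} F(x, y)\,d(\mu\times\nu)(x, y)} = \int_Y \Big[ \int_X F(x, y)\,d\mu(x) \Big]\,d\nu(y) \tag{6.6}
\]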
Proof. Assume first that $F = \chi_E$ with $E\in\mathcal{F}\times\mathcal{G}$. Then,
\[
F(x, \cdot) = \chi_{E_x} \quad \forall x\in X , \qquad F(\cdot, y) = \chi_{E^y} \quad \forall y\in Y .
\]
So, properties (a) and (b) follow from Proposition 6.3, while (c) reduces to formula (6.3), used to define the product measure. Consequently, (a), (b), and (c) hold true when $F$ is a simple function. In the general case, owing to Proposition 2.37 we can approximate $F$ pointwise by an increasing sequence of simple functions
\[
F_n : X\times Y\to[0, \infty] .
\]
The $F_n(x, \cdot)$'s are themselves simple functions on $Y$ such that
\[
F_n(x, \cdot) \uparrow F(x, \cdot) \quad\text{pointwise as } n\to\infty \qquad \forall x\in X .
\]
So, the function $F(x, \cdot)$ is $\nu$-measurable and (a)(i) is proven. Moreover, $x\mapsto\int_Y F_n(x, y)\,d\nu(y)$ is an increasing sequence of positive $\mu$-measurable functions satisfying
\[
\int_Y F_n(x, y)\,d\nu(y) \uparrow \int_Y F(x, y)\,d\nu(y) \qquad \forall x\in X ,
\]
thanks to the Monotone Convergence Theorem. Therefore, (b)(i) holds true and, again by monotone convergence,
\[
\int_X \Big[ \int_Y F_n(x, y)\,d\nu(y) \Big]\,d\mu(x) \uparrow \int_X \Big[ \int_Y F(x, y)\,d\nu(y) \Big]\,d\mu(x) .
\]
Since we also have that
\[
\underbrace{\int_{X\times Y} F_n(x, y)\,d(\mu\times\nu)(x, y)}_{= \int_X [ \int_Y F_n(x, y)\,d\nu(y) ]\,d\mu(x)} \uparrow \int_{X\times Y} F(x, y)\,d(\mu\times\nu)(x, y) ,
\]
we have obtained (6.5). By a similar reasoning one can show (a)-(b)(ii) and (6.6). The proof is thus complete.
Theorem 6.8 (Fubini) Let $(X, \mathcal{F}, \mu)$, $(Y, \mathcal{G}, \nu)$ be $\sigma$-finite measure spaces and let $F$ be a $(\mu\times\nu)$-summable function on $X\times Y$. Then the following statements hold true.
(a) (i) For a.e. $x\in X$ the function $y\mapsto F(x, y)$ is summable on $Y$.
(ii) For a.e. $y\in Y$ the function $x\mapsto F(x, y)$ is summable on $X$.
(b) (i) The function $x\mapsto\int_Y F(x, y)\,d\nu(y)$ is summable on $X$.
(ii) The function $y\mapsto\int_X F(x, y)\,d\mu(x)$ is summable on $Y$.
(c) Identities (6.5) and (6.6) are valid.
Proof. Let $F^+$ and $F^-$ be the positive and negative parts of $F$. Theorem 6.7 (c), applied to $F^+$ and $F^-$, yields identities (6.5) and (6.6). Also, we have that
\[
\int_X \Big[ \int_Y F^\pm(x, y)\,d\nu(y) \Big]\,d\mu(x) < \infty , \qquad \int_Y \Big[ \int_X F^\pm(x, y)\,d\mu(x) \Big]\,d\nu(y) < \infty .
\]
Therefore, (b) holds true for $F^+$ and $F^-$, hence for $F$. So, on account of Proposition 2.35,
• $x\mapsto\int_Y F^\pm(x, y)\,d\nu(y)$ is a.e. finite;
• $y\mapsto\int_X F^\pm(x, y)\,d\mu(x)$ is a.e. finite.
Consequently, (a) holds true and the proof is complete.
Example 6.9 Let $X = Y = [-1, 1]$ with the Lebesgue measure and set
\[
f(x, y) = \frac{xy}{(x^2 + y^2)^2} .
\]
Observe that the iterated integrals exist and are equal; indeed
\[
\int_{-1}^{1} dy \int_{-1}^{1} f(x, y)\,dx = \int_{-1}^{1} dx \int_{-1}^{1} f(x, y)\,dy = 0 .
\]
On the other hand the double integral fails to exist, since
\[
\int_{[-1,1]^2} |f(x, y)|\,dx\,dy \ge \int_0^1 dr \int_0^{2\pi} \frac{|\sin\theta\cos\theta|}{r}\,d\theta = 2\int_0^1 \frac{dr}{r} = \infty .
\]
This example shows that the existence of the iterated integrals does not imply the existence of the double integral.
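The two computations in Example 6.9 can be checked numerically. The following sketch is only an illustration under stated assumptions: it uses a midpoint rule with hypothetical step counts, evaluates the inner integral for a fixed $y \neq 0$, and computes the integral of $|f|$ over the annulus $\varepsilon < r < 1$ through the polar formula $\int_0^{2\pi} |\sin\theta\cos\theta|\,d\theta \cdot \ln(1/\varepsilon)$.

```python
import math

def f(x, y):
    return x * y / (x * x + y * y) ** 2

def inner_integral(y, n=2000):
    """Midpoint rule for the integral of f(., y) over [-1, 1], with y != 0."""
    h = 2.0 / n
    return sum(f(-1.0 + (i + 0.5) * h, y) for i in range(n)) * h

def angular_factor(nt=4000):
    """Midpoint rule for the integral of |sin t cos t| over [0, 2*pi] (exact value 2)."""
    dt = 2.0 * math.pi / nt
    return sum(abs(math.sin((j + 0.5) * dt) * math.cos((j + 0.5) * dt))
               for j in range(nt)) * dt

def abs_integral_annulus(eps):
    """Integral of |f| over {eps < r < 1}: angular factor times ln(1/eps)."""
    return angular_factor() * math.log(1.0 / eps)

print(inner_integral(0.5))                       # vanishes up to rounding (odd integrand)
for eps in (1e-1, 1e-3, 1e-6):
    print(eps, abs_integral_annulus(eps))        # grows like 2*ln(1/eps)
```

The inner integral vanishes because $x\mapsto f(x, y)$ is odd, while the integral of $|f|$ over the shrinking annuli grows without bound, in agreement with the example.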
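Example 6.10 Consider the spaces
\[
([0, 1], \mathcal{P}([0, 1]), \mu^\#) , \qquad ([0, 1], \mathcal{B}([0, 1]), \lambda)
\]
where $\mu^\#$ denotes the counting measure and $\lambda$ the Lebesgue measure. Consider the diagonal of $[0, 1]^2$, that is
\[
\Delta = \{ (x, x) \mid x\in[0, 1] \} .
\]
For every $n\in\mathbb{N}$, set
\[
Q_n = \Big[ 0, \frac{1}{n} \Big]^2 \cup \Big[ \frac{1}{n}, \frac{2}{n} \Big]^2 \cup \dots \cup \Big[ \frac{n-1}{n}, 1 \Big]^2 .
\]
$Q_n$ is a finite union of measurable rectangles and $\Delta = \bigcap_n Q_n$, by which $\Delta\in\mathcal{P}([0, 1])\times\mathcal{B}([0, 1])$. So the function $\chi_\Delta$ is $(\mu^\#\times\lambda)$-measurable. We have
\[
\int_0^1 dy \int_0^1 \chi_\Delta(x, y)\,d\mu^\#(x) = \int_0^1 1\,dy = 1 ,
\]
\[
\int_0^1 d\mu^\#(x) \int_0^1 \chi_\Delta(x, y)\,dy = \int_0^1 0\,d\mu^\# = 0 .
\]
Then, since $\mu^\#$ is not $\sigma$-finite, the thesis of Tonelli's theorem fails.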
Exercise 6.11 Show that
\[
\mathcal{B}(\mathbb{R}^2) = \mathcal{B}(\mathbb{R})\times\mathcal{B}(\mathbb{R}) .
\]
6.2 Compactness in $L^p$
In this section we shall characterize all relatively compact subsets of $L^p(\mathbb{R}^N)$ (2) for any $1\le p<\infty$, that is, all families of functions $\mathcal{M}\subset L^p(\mathbb{R}^N)$ whose closure $\overline{\mathcal{M}}$ in $L^p(\mathbb{R}^N)$ is compact. We shall see that two properties that were examined in Chapter 3, namely tightness and continuity under translations, characterize relatively compact sets in $L^p(\mathbb{R}^N)$.
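Definition 6.12 Let $1\le p<\infty$. For any $r>0$ and $\varphi\in L^p(\mathbb{R}^N)$ define $S_r\varphi : \mathbb{R}^N\to\mathbb{R}$ by the Steklov formula
\[
S_r\varphi(x) = \frac{1}{\omega_N r^N} \int_{B(0,r)} \varphi(x + y)\,dy \qquad \forall x\in\mathbb{R}^N ,
\]
where $\omega_N$ is the Lebesgue measure of the unit ball in $\mathbb{R}^N$, so that $\omega_N r^N$ is the measure of $B(0, r)$ and $S_r\varphi(x)$ is the mean value of $\varphi$ over $B(x, r)$.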
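To get a feeling for the Steklov formula, consider the one-dimensional case and normalize by the length $2r$ of the interval $B(0, r) = (-r, r)$. For $\varphi$ the indicator function of an interval $[a, b]$ the average can be written in closed form. The sketch below uses this hypothetical special case; the function name and the default interval are illustrative assumptions.

```python
def steklov_indicator(x, r, a=0.0, b=1.0):
    """S_r(phi)(x) for phi = indicator of [a, b] in dimension N = 1:
    the length of [x - r, x + r] intersected with [a, b], divided by 2r."""
    overlap = max(0.0, min(x + r, b) - max(x - r, a))
    return overlap / (2.0 * r)

r = 0.25
for x in (-0.25, 0.0, 0.125, 0.5, 1.0, 1.125, 1.5):
    print(x, steklov_indicator(x, r))
```

Although the indicator function jumps at $0$ and $1$, its Steklov average passes continuously from $0$ to $1$ across each jump, and it never exceeds the supremum of $|\varphi|$.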
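Proposition 6.13 Let $1\le p<\infty$ and $\varphi\in L^p(\mathbb{R}^N)$. Then for every $r>0$ $S_r\varphi$ is a continuous function. Furthermore $S_r\varphi\in L^p(\mathbb{R}^N)$ and, using the notation $\tau_h\varphi(x) = \varphi(x + h)$, the following hold:
\[
|S_r\varphi(x)| \le \frac{1}{(\omega_N r^N)^{1/p}}\,\|\varphi\|_p ; \tag{6.7}
\]
\[
\|S_r\varphi\|_p \le \|\varphi\|_p ;
\]
\[
|S_r\varphi(x) - S_r\varphi(x + h)| \le \frac{1}{(\omega_N r^N)^{1/p}}\,\|\varphi - \tau_h\varphi\|_p ; \tag{6.8}
\]
\[
\|\varphi - S_r\varphi\|_p \le \sup_{0\le|h|\le r} \|\varphi - \tau_h\varphi\|_p . \tag{6.9}
\]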
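Proof. (6.7) can be derived using Hölder's inequality:
\[
|S_r\varphi(x)| \le \frac{1}{(\omega_N r^N)^{1/p}} \Big( \int_{B(0,r)} |\varphi(x + y)|^p\,dy \Big)^{1/p} . \tag{6.10}
\]
(6.8) follows from (6.7) applied to $\varphi - \tau_h\varphi$. Thus, (6.8) and Proposition 3.50 imply that $S_r\varphi$ is a continuous function. By (6.10), using Fubini's theorem we get
\[
\int_{\mathbb{R}^N} |S_r\varphi|^p\,dx \le \frac{1}{\omega_N r^N} \int_{B(0,r)} \Big( \int_{\mathbb{R}^N} |\varphi(x + y)|^p\,dx \Big)\,dy = \frac{\|\varphi\|_p^p}{\omega_N r^N} \int_{B(0,r)} dy = \|\varphi\|_p^p .
\]
To obtain (6.9) observe that
\[
(\varphi - S_r\varphi)(x) = \frac{1}{\omega_N r^N} \int_{B(0,r)} \big( \varphi(x) - \varphi(x + y) \big)\,dy ,
\]
by which
\[
|(\varphi - S_r\varphi)(x)| \le \frac{1}{(\omega_N r^N)^{1/p}} \Big( \int_{B(0,r)} |\varphi(x) - \varphi(x + y)|^p\,dy \Big)^{1/p} .
\]
Therefore, Fubini's Theorem yields
\[
\int_{\mathbb{R}^N} |\varphi - S_r\varphi|^p\,dx \le \frac{1}{\omega_N r^N} \int_{\mathbb{R}^N} \Big( \int_{B(0,r)} |\varphi(x) - \varphi(x + y)|^p\,dy \Big)\,dx = \frac{1}{\omega_N r^N} \int_{B(0,r)} \Big( \int_{\mathbb{R}^N} |\varphi(x) - \varphi(x + y)|^p\,dx \Big)\,dy
\]
and (6.9) follows.
(2) $L^p(\mathbb{R}^N) = L^p(\mathbb{R}^N, \mathcal{B}(\mathbb{R}^N), \lambda)$ where $\lambda$ is Lebesgue measure.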
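Theorem 6.14 (M. Riesz) Let $1\le p<\infty$ and let $\mathcal{M}$ be a bounded family in $L^p(\mathbb{R}^N)$. Then, $\mathcal{M}$ is relatively compact iff
\[
\sup_{\varphi\in\mathcal{M}} \int_{|x|>R} |\varphi|^p\,dx \to 0 \quad\text{as } R\to\infty \tag{6.11}
\]
\[
\sup_{\varphi\in\mathcal{M}} \int_{\mathbb{R}^N} |\varphi(x + h) - \varphi(x)|^p\,dx \to 0 \quad\text{as } h\to 0 \tag{6.12}
\]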
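Proof. Let us set $\tau_h\varphi(x) = \varphi(x + h)$ for any $x, h\in\mathbb{R}^N$. We already know that (6.11) and (6.12) hold for a single element of $L^p(\mathbb{R}^N)$ ((6.11) follows from Lebesgue's Theorem; see Proposition 3.50 for (6.12)). If $\mathcal{M}$ is relatively compact, then for any $\varepsilon>0$ there exist functions $\varphi_1,\dots,\varphi_m\in\mathcal{M}$ such that $\mathcal{M}\subset B_\varepsilon(\varphi_1)\cup\dots\cup B_\varepsilon(\varphi_m)$. As we have just recalled, each $\varphi_i$ satisfies (6.11) and (6.12). So, there exist $R_\varepsilon, \delta_\varepsilon>0$ such that, for every $i = 1,\dots,m$,
\[
\int_{|x|>R_\varepsilon} |\varphi_i|^p\,dx < \varepsilon^p \quad\&\quad \|\varphi_i - \tau_h\varphi_i\|_p < \varepsilon \quad \forall |h|<\delta_\varepsilon . \tag{6.13}
\]
Let $\varphi\in\mathcal{M}$ and let $\varphi_i$ be such that $\varphi\in B_\varepsilon(\varphi_i)$. Therefore, recalling (6.13), we have
\[
\Big( \int_{|x|>R_\varepsilon} |\varphi|^p\,dx \Big)^{1/p} \le \Big( \int_{|x|>R_\varepsilon} |\varphi - \varphi_i|^p\,dx \Big)^{1/p} + \Big( \int_{|x|>R_\varepsilon} |\varphi_i|^p\,dx \Big)^{1/p} \le \|\varphi - \varphi_i\|_p + \Big( \int_{|x|>R_\varepsilon} |\varphi_i|^p\,dx \Big)^{1/p} < 2\varepsilon
\]
and, for $|h|<\delta_\varepsilon$,
\[
\|\varphi - \tau_h\varphi\|_p \le \|\varphi - \varphi_i\|_p + \|\varphi_i - \tau_h\varphi_i\|_p + \|\tau_h\varphi_i - \tau_h\varphi\|_p < 3\varepsilon .
\]
The necessity of (6.11) and (6.12) is thus proved.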
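To prove sufficiency it will suffice to show that $\mathcal{M}$ is totally bounded. Let $\varepsilon>0$ be fixed. On account of assumption (6.11),
\[
\exists R_\varepsilon>0 \ \text{ such that } \ \int_{|x|>R_\varepsilon} |\varphi|^p\,dx < \varepsilon^p \qquad \forall\varphi\in\mathcal{M} . \tag{6.14}
\]
Also, recalling (6.9), assumption (6.12) yields
\[
\exists\delta_\varepsilon>0 \ \text{ such that } \ \|\varphi - S_{\delta_\varepsilon}\varphi\|_p < \varepsilon \qquad \forall\varphi\in\mathcal{M} , \tag{6.15}
\]
where $S_{\delta_\varepsilon}$ is the Steklov operator introduced in Definition 6.12. Moreover, properties (6.7) and (6.8) ensure that $\{S_{\delta_\varepsilon}\varphi\}_{\varphi\in\mathcal{M}}$ is a bounded equicontinuous family on $\overline{B}(0, R_\varepsilon)$. Thus, $\{S_{\delta_\varepsilon}\varphi\}_{\varphi\in\mathcal{M}}$ is relatively compact thanks to the Ascoli-Arzelà Theorem. Consequently, there exists a finite set of continuous functions $\psi_1,\dots,\psi_m$ on $\overline{B}(0, R_\varepsilon)$ such that for each $\varphi\in\mathcal{M}$ the function $S_{\delta_\varepsilon}\varphi : \overline{B}(0, R_\varepsilon)\to\mathbb{R}$ belongs to a ball of sufficiently small radius centered at some $\psi_i$, say
\[
|S_{\delta_\varepsilon}\varphi(x) - \psi_i(x)| < \frac{\varepsilon}{(\omega_N R_\varepsilon^N)^{1/p}} \qquad \forall x\in\overline{B}(0, R_\varepsilon) . \tag{6.16}
\]
Set
\[
\tilde\psi_i(x) := \begin{cases} \psi_i(x) & |x|\le R_\varepsilon \\ 0 & |x|>R_\varepsilon . \end{cases}
\]
Then, $\tilde\psi_i\in L^p(\mathbb{R}^N)$ and, by (6.14), (6.15), and (6.16),
\[
\|\varphi - \tilde\psi_i\|_p \le \Big( \int_{|x|>R_\varepsilon} |\varphi|^p\,dx \Big)^{1/p} + \Big( \int_{B(0,R_\varepsilon)} |\varphi - \psi_i|^p\,dx \Big)^{1/p}
< \varepsilon + \Big( \int_{B(0,R_\varepsilon)} |\varphi - S_{\delta_\varepsilon}\varphi|^p\,dx \Big)^{1/p} + \Big( \int_{B(0,R_\varepsilon)} |S_{\delta_\varepsilon}\varphi - \psi_i|^p\,dx \Big)^{1/p} < 3\varepsilon .
\]
This shows that $\mathcal{M}$ is totally bounded and completes the proof.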
6.3 Convolution and approximation
In this section we will develop a systematic procedure for approximating an $L^p$ function by smooth functions. The operation of convolution provides the tool to build such smooth approximations. The measure space of interest is $\mathbb{R}^N$ with Lebesgue measure $\lambda$.
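6.3.1 Convolution Product
Definition 6.15 Let $f, g : \mathbb{R}^N\to\mathbb{R}$ be two Borel functions such that for a.e. $x\in\mathbb{R}^N$ the function
\[
y\in\mathbb{R}^N \mapsto f(x - y)\,g(y) \tag{6.17}
\]
is summable. We define the convolution product of $f$ and $g$ by
\[
(f * g)(x) = \int_{\mathbb{R}^N} f(x - y)\,g(y)\,dy \qquad \text{for a.e. } x\in\mathbb{R}^N .
\]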
Remark 6.16 1. If $f, g : \mathbb{R}^N\to[0, \infty]$ are Borel functions, then, since the function (6.17) is positive and Borel, $f * g : \mathbb{R}^N\to[0, \infty]$ is well defined for every $x\in\mathbb{R}^N$.
2. By making the change of variable $z = x - y$ and using the translation invariance of the Lebesgue measure we obtain that the function (6.17) is summable iff the function $z\in\mathbb{R}^N\mapsto f(z)\,g(x - z)$ is summable, and $(f * g)(x) = (g * f)(x)$. This proves that the convolution is commutative.
The next proposition gives a sufficient condition to guarantee that $f * g$ is well-defined a.e. in $\mathbb{R}^N$.
Proposition 6.17 (Young) Let $p, q, r\in[1, \infty]$ be such that
\[
\frac{1}{p} + \frac{1}{q} = \frac{1}{r} + 1 \tag{6.18}
\]
and let $f\in L^p(\mathbb{R}^N)$ and $g\in L^q(\mathbb{R}^N)$. Then for a.e. $x\in\mathbb{R}^N$ the function (6.17) is summable. Furthermore $f * g\in L^r(\mathbb{R}^N)$ and
\[
\|f * g\|_r \le \|f\|_p\,\|g\|_q . \tag{6.19}
\]
Moreover, if $r = \infty$, then $f * g$ is a continuous function on $\mathbb{R}^N$.
Proof. First assume $r = \infty$; then $\frac{1}{p} + \frac{1}{q} = 1$. By the translation invariance of the Lebesgue measure we have that for every $x\in\mathbb{R}^N$ the function $y\in\mathbb{R}^N\mapsto f(x - y)$ stays in $L^p(\mathbb{R}^N)$ and has the same $L^p$-norm as $f$. Then, by Hölder's inequality and Exercise 3.25 we deduce that for every $x\in\mathbb{R}^N$ the function (6.17) is summable and
\[
|(f * g)(x)| \le \|f\|_p\,\|g\|_q \qquad \forall x\in\mathbb{R}^N . \tag{6.20}
\]
Since $p$ and $q$ are conjugate, at least one of them is finite and, since the convolution is commutative, without loss of generality we may assume $p<\infty$. Then, for any $x, h\in\mathbb{R}^N$, the above estimate yields
\[
|(f * g)(x + h) - (f * g)(x)| = |((\tau_h f - f) * g)(x)| \le \|\tau_h f - f\|_p\,\|g\|_q
\]
where $\tau_h f(x) = f(x + h)$. Since $\|\tau_h f - f\|_p\to 0$ as $h\to 0$ by Proposition 3.50, the continuity of $f * g$ follows; (6.19) can be derived immediately from (6.20).
Thus, assume $r<\infty$ (whence $p, q<\infty$). We will get the conclusion in four steps.
1. Suppose $f, g\ge 0$. Then $f * g : \mathbb{R}^N\to[0, +\infty]$ (see Remark 6.16.1) is a Borel function.
Indeed the function
\[
F : \mathbb{R}^N\times\mathbb{R}^N\to[0, \infty] , \qquad (x, y)\mapsto f(x - y)\,g(y)
\]
is Borel in the product space $\mathbb{R}^N\times\mathbb{R}^N$. Then Tonelli's Theorem ensures that the function $x\in\mathbb{R}^N\mapsto\int_{\mathbb{R}^N} F(x, y)\,dy = (f * g)(x)$ is Borel.
2. Suppose $p = 1 = q$ (whence $r = 1$). Then, $|f| * |g|\in L^1(\mathbb{R}^N)$ and
\[
\| |f| * |g| \|_1 = \|f\|_1\,\|g\|_1 .
\]
Indeed, according to Step 1, $|f| * |g|$ is a Borel function and Tonelli's Theorem ensures that
\[
\int_{\mathbb{R}^N} (|f| * |g|)(x)\,dx = \int_{\mathbb{R}^N} \Big[ \int_{\mathbb{R}^N} |f(x - y)\,g(y)|\,dy \Big]\,dx = \int_{\mathbb{R}^N} |g(y)| \Big[ \int_{\mathbb{R}^N} |f(x - y)|\,dx \Big]\,dy = \|f\|_1\,\|g\|_1 .
\]
Therefore the thesis of Step 2 follows.
3. We claim that, for all $f \in L^p(\mathbb{R}^N)$ and $g \in L^q(\mathbb{R}^N)$,
\[ (|f| * |g|)^r(x) \le \|f\|_p^{r-p} \, \|g\|_q^{r-q} \, (|f|^p * |g|^q)(x) \qquad \forall x \in \mathbb{R}^N. \tag{6.21} \]
First assume $1 < p, q < \infty$ and let $p'$ and $q'$ be the conjugate exponents of $p$ and $q$, respectively. Then,
\[ \frac{1}{p'} + \frac{1}{q'} = 2 - \frac{1}{p} - \frac{1}{q} = 1 - \frac{1}{r}. \]
Thus,
\[ 1 - \frac{p}{r} = p\Big(1 - \frac{1}{q}\Big) = \frac{p}{q'}, \qquad 1 - \frac{q}{r} = q\Big(1 - \frac{1}{p}\Big) = \frac{q}{p'}. \]
Using the above relations, for every $x, y \in \mathbb{R}^N$ we obtain
\[ |f(x-y)g(y)| = \big(|f(x-y)|^p\big)^{1/q'} \big(|g(y)|^q\big)^{1/p'} \big(|f(x-y)|^p |g(y)|^q\big)^{1/r}, \]
whence, by Exercise 3.7,
\[ (|f| * |g|)(x) \le \|f\|_p^{p/q'} \, \|g\|_q^{q/p'} \, (|f|^p * |g|^q)^{1/r}(x) \qquad \forall x \in \mathbb{R}^N. \]
Since $rp/q' = r - p$ and $rq/p' = r - q$, (6.21) follows.
(6.21) is immediate for $p = 1 = q$.
Consider the case $p = 1$ and $1 < q < \infty$ (whence $r = q$). We have
\[ |f(x-y)g(y)| = |f(x-y)|^{1/q'} \big(|f(x-y)| \, |g(y)|^q\big)^{1/q}. \]
Thus, by Hölder's inequality we get
\[ (|f| * |g|)(x) \le \|f\|_p^{1/q'} \, (|f| * |g|^q)^{1/q}(x) \qquad \text{for a.e. } x \in \mathbb{R}^N. \]
The last case $q = 1$, $1 < p < \infty$ follows from the previous one since the convolution is commutative.
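4. Owing to Step 1, $|f| * |g|$ is a Borel function and
\[ \int_{\mathbb{R}^N} (|f| * |g|)^r\,dx \le \|f\|_p^{r-p} \, \|g\|_q^{r-q} \, \big\| |f|^p * |g|^q \big\|_1 = \|f\|_p^r \, \|g\|_q^r, \tag{6.22} \]
where the inequality follows from (6.21) and the equality from Step 2.
Then $|f| * |g| \in L^r(\mathbb{R}^N)$, that is,
\[ \int_{\mathbb{R}^N} \Big( \int_{\mathbb{R}^N} |f(x-y)g(y)|\,dy \Big)^r dx < \infty. \]
Therefore, $y \mapsto f(x-y)g(y)$ is summable for a.e. $x \in \mathbb{R}^N$. Hence, $f * g$ is well defined and a.e. finite. Since $f^+, f^- \in L^p(\mathbb{R}^N)$ and $g^+, g^- \in L^q(\mathbb{R}^N)$, the functions $f^+ * g^+$, $f^- * g^-$, $f^+ * g^-$, $f^- * g^+$ are finite a.e. and, according to Step 1, are Borel. Moreover we have
\[ f * g = f^+ * g^+ + f^- * g^- - (f^+ * g^- + f^- * g^+) \qquad \text{a.e. in } \mathbb{R}^N. \]
We deduce that $f * g$ is Borel and, by (6.22),
\[ \int_{\mathbb{R}^N} |f * g|^r\,dx \le \big\| |f| * |g| \big\|_r^r \le \|f\|_p^r \, \|g\|_q^r. \]
$\square$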
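Remark 6.18 For $r = \infty$ and $1 < p, q < \infty$ in (6.18),
\[ \lim_{|x| \to \infty} (f * g)(x) = 0. \]
Indeed, for $\varepsilon > 0$ let $R_\varepsilon > 0$ be such that
\[ \int_{|y| \ge R_\varepsilon} |f(y)|^p\,dy < \varepsilon^p \quad \& \quad \int_{|y| \ge R_\varepsilon} |g(y)|^q\,dy < \varepsilon^q. \]
Then,
\[ |(f * g)(x)| \le \Big| \int_{|y| \ge R_\varepsilon} f(x-y)g(y)\,dy \Big| + \Big| \int_{|y| < R_\varepsilon} f(x-y)g(y)\,dy \Big| \le \|f\|_p \Big( \int_{|y| \ge R_\varepsilon} |g(y)|^q\,dy \Big)^{1/q} + \|g\|_q \Big( \int_{B(x, R_\varepsilon)} |f(z)|^p\,dz \Big)^{1/p}. \]
Therefore, for all $|x| \ge 2R_\varepsilon$,
\[ |(f * g)(x)| \le (\|f\|_p + \|g\|_q)\,\varepsilon. \]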
Remark 6.19 As a particular case of Young's Theorem, if $f \in L^1(\mathbb{R}^N)$ and $g \in L^p(\mathbb{R}^N)$ with $1 \le p \le \infty$, then $f * g$ is well defined and, further, $f * g \in L^p(\mathbb{R}^N)$ with
\[ \|f * g\|_p \le \|f\|_1 \, \|g\|_p. \tag{6.23} \]
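Estimate (6.19) can be checked numerically in dimension $N = 1$. The following is only a sketch under stated assumptions: the grid, the step `h`, the two sample functions and the exponents $p = 3/2$, $q = 2$ (so $r = 6$) are illustrative choices, and the integrals are replaced by Riemann sums, for which the discrete analogue of Young's inequality applies.

```python
import math

def lp_norm(values, h, p):
    """Riemann-sum approximation of the L^p norm on a uniform grid of step h."""
    return (h * sum(abs(v) ** p for v in values)) ** (1.0 / p)

def convolve(a, b, h):
    """Full discrete convolution scaled by h, approximating f*g on the grid."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += h * ai * bj
    return out

h = 0.05                                            # grid step (assumption)
xs = [-5.0 + h * k for k in range(201)]             # grid on [-5, 5]
f = [1.0 if abs(x) <= 1.0 else 0.0 for x in xs]     # indicator of [-1, 1]
g = [math.exp(-abs(x)) for x in xs]                 # exp(-|x|)

p, q = 1.5, 2.0
r = 1.0 / (1.0 / p + 1.0 / q - 1.0)                 # exponent given by (6.18)

lhs = lp_norm(convolve(f, g, h), h, r)              # approximates ||f*g||_r
rhs = lp_norm(f, h, p) * lp_norm(g, h, q)           # approximates ||f||_p ||g||_q
print(r, lhs, rhs)
```

The printed left-hand side should not exceed the right-hand side; other admissible triples $(p, q, r)$ can be tried by changing `p` and `q`.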
Remark 6.20 By taking $p = 1$ in Remark 6.19, we obtain that the operation of convolution
\[ * : L^1(\mathbb{R}^N) \times L^1(\mathbb{R}^N) \to L^1(\mathbb{R}^N) \]
provides a multiplication structure for $L^1(\mathbb{R}^N)$. This operation is commutative (see Remark 6.16.2) and associative. Indeed, if $f, g, h \in L^1(\mathbb{R}^N)$, then, by using the change of variables $z = t - y$ and Fubini's Theorem,
\[ ((f * g) * h)(x) = \int_{\mathbb{R}^N} (f * g)(x-y)h(y)\,dy = \int_{\mathbb{R}^N} h(y)\,dy \int_{\mathbb{R}^N} f(x-y-z)g(z)\,dz \]
\[ = \int_{\mathbb{R}^N} f(x-t)\,dt \int_{\mathbb{R}^N} g(t-y)h(y)\,dy = \int_{\mathbb{R}^N} f(x-t)(g * h)(t)\,dt = (f * (g * h))(x), \]
which proves the associativity. Finally, it is apparent that convolution obeys the distributive laws. However, there is no unit in $L^1(\mathbb{R}^N)$ under this multiplication. Indeed, assume by contradiction the existence of $g \in L^1(\mathbb{R}^N)$ such that $g * f = f$ for every $f \in L^1(\mathbb{R}^N)$. Then the absolute continuity of the integral implies the existence of $\delta > 0$ such that
\[ A \in \mathcal{B}(\mathbb{R}^N) \ \&\ \lambda(A) \le \delta \implies \int_A |g|\,dx < 1. \]
Let $\rho > 0$ be sufficiently small such that $\lambda(B(0, \rho)) < \delta$ and, taking $f = \chi_{B(0,\rho)} \in L^1(\mathbb{R}^N)$, for a.e. $x \in \mathbb{R}^N$ we compute
\[ |f(x)| = |(g * f)(x)| \le \int_{\mathbb{R}^N} |g(x-y)| \, |f(y)|\,dy = \int_{B(0,\rho)} |g(x-y)|\,dy = \int_{B(x,\rho)} |g(z)|\,dz < 1. \]
Since $f = 1$ on $B(0, \rho)$, the contradiction follows.
Exercise 6.21 Compute $f * g$ for $f(x) = \chi_{[-1,1]}(x)$ and $g(x) = e^{-|x|}$.
6.3.2 Approximation by smooth functions
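Definition 6.22 A family $(f_\varepsilon)_\varepsilon$ in $L^1(\mathbb{R}^N)$ is called an approximate identity if it satisfies the following conditions:
\[ f_\varepsilon \ge 0, \qquad \int_{\mathbb{R}^N} f_\varepsilon(x)\,dx = 1 \qquad \forall \varepsilon > 0, \tag{6.24} \]
\[ \forall \delta > 0 : \quad \int_{|x| \ge \delta} f_\varepsilon(x)\,dx \to 0 \quad \text{as } \varepsilon \to 0^+. \tag{6.25} \]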
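Remark 6.23 A common way to produce approximate identities in $L^1(\mathbb{R}^N)$ is to take a function $f \in L^1(\mathbb{R}^N)$ such that $f \ge 0$ and $\int_{\mathbb{R}^N} f(x)\,dx = 1$ and to define, for $\varepsilon > 0$,
\[ f_\varepsilon(x) = \varepsilon^{-N} f(\varepsilon^{-1} x). \]
Conditions (6.24)-(6.25) are satisfied since, introducing the change of variables $y = \varepsilon^{-1} x$, we obtain
\[ \int_{\mathbb{R}^N} f_\varepsilon(x)\,dx = \int_{\mathbb{R}^N} f(y)\,dy = 1 \]
and
\[ \int_{|x| \ge \delta} f_\varepsilon(x)\,dx = \int_{|y| \ge \varepsilon^{-1}\delta} f(y)\,dy \to 0 \quad \text{as } \varepsilon \to 0^+; \]
the latter convergence follows from the Lebesgue dominated convergence theorem.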
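The rescaling $f_\varepsilon(x) = \varepsilon^{-N} f(\varepsilon^{-1}x)$ can be illustrated numerically for $N = 1$. The sketch below is not part of the notes: the kernel $f(y) = \frac{1}{2}e^{-|y|}$, the midpoint rule, and the truncation parameter `cutoff` are assumptions made only for illustration.

```python
import math

def f(y):
    """A nonnegative kernel with integral 1 on the real line (N = 1)."""
    return 0.5 * math.exp(-abs(y))

def f_eps(x, eps):
    """Rescaled kernel f_eps(x) = eps**(-N) * f(x / eps) with N = 1."""
    return f(x / eps) / eps

def integrate(func, a, b, n=20000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def total_mass(eps, cutoff=40.0):
    """Approximate integral of f_eps over the line, truncated at |x| = cutoff*eps."""
    return integrate(lambda x: f_eps(x, eps), -cutoff * eps, cutoff * eps)

def tail_mass(eps, delta, cutoff=40.0):
    """Approximate mass of f_eps on {|x| >= delta}, using the symmetry of f."""
    right = integrate(lambda x: f_eps(x, eps), delta, delta + cutoff * eps)
    return 2.0 * right

for eps in (1.0, 0.5, 0.1):
    print(eps, total_mass(eps), tail_mass(eps, delta=0.5))
```

For this particular kernel the tail mass equals $e^{-\delta/\varepsilon}$, so the printed tails should decrease towards zero while the total mass stays close to 1, in agreement with (6.24)-(6.25).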
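Proposition 6.24 Let $(f_\varepsilon)_\varepsilon \subset L^1(\mathbb{R}^N)$ be an approximate identity. Then the following hold:
1. If $f \in L^\infty(\mathbb{R}^N)$ and $f$ is continuous in $\mathbb{R}^N$, then $f * f_\varepsilon \to f$ uniformly on compact sets of $\mathbb{R}^N$ as $\varepsilon \to 0^+$;
2. If $f \in L^\infty(\mathbb{R}^N)$ and $f$ is uniformly continuous in $\mathbb{R}^N$, then $f * f_\varepsilon \to f$ in $L^\infty(\mathbb{R}^N)$ as $\varepsilon \to 0^+$;
3. If $1 \le p < \infty$ and $f \in L^p(\mathbb{R}^N)$, then $f * f_\varepsilon \to f$ in $L^p(\mathbb{R}^N)$ as $\varepsilon \to 0^+$.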
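Proof. 1. By Young's theorem we get that $f * f_\varepsilon$ is continuous and $f * f_\varepsilon \in L^\infty(\mathbb{R}^N)$. Let $K \subset \mathbb{R}^N$ be a compact set. Then the set $\{x \in \mathbb{R}^N \mid d_K(x) \le 1\}$ is compact and $f$ is uniformly continuous over it; hence, given $\eta > 0$, there exists $\delta \in (0, 1)$ such that
\[ |f(x-y) - f(x)| \le \eta \qquad \forall x \in K \quad \forall y \in B(0, \delta). \]
Since $\int_{\mathbb{R}^N} f_\varepsilon(y)\,dy = 1$, for every $x \in K$ we have
\[ |(f * f_\varepsilon)(x) - f(x)| = \Big| \int_{\mathbb{R}^N} \big( f(x-y) - f(x) \big) f_\varepsilon(y)\,dy \Big| \]
\[ \le \int_{|y| < \delta} |f(x-y) - f(x)| \, f_\varepsilon(y)\,dy + \int_{|y| \ge \delta} |f(x-y) - f(x)| \, f_\varepsilon(y)\,dy \]
\[ \le \eta \int_{\mathbb{R}^N} f_\varepsilon(y)\,dy + 2\|f\|_\infty \int_{|y| \ge \delta} f_\varepsilon(y)\,dy = \eta + 2\|f\|_\infty \int_{|y| \ge \delta} f_\varepsilon(y)\,dy. \tag{6.26} \]
The conclusion follows from (6.25).
2. The proof is the same as in Part 1 except that in this case estimate (6.26) holds for every $x \in \mathbb{R}^N$.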
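3. According to Remark 6.19, $f * f_\varepsilon \in L^p(\mathbb{R}^N)$ for all $\varepsilon > 0$. Since $\int_{\mathbb{R}^N} f_\varepsilon(y)\,dy = 1$, for every $x \in \mathbb{R}^N$ we have
\[ |(f * f_\varepsilon)(x) - f(x)| = \Big| \int_{\mathbb{R}^N} \big( f(x-y) - f(x) \big) f_\varepsilon(y)\,dy \Big| \le \int_{\mathbb{R}^N} |f(x-y) - f(x)| \, f_\varepsilon(y)\,dy. \tag{6.27} \]
If $p > 1$, let $p' \in (1, \infty)$ be the conjugate exponent of $p$. Then
\[ |(f * f_\varepsilon)(x) - f(x)| \le \int_{\mathbb{R}^N} |f(x-y) - f(x)| \, (f_\varepsilon(y))^{1/p} (f_\varepsilon(y))^{1/p'}\,dy. \]
By applying Hölder's inequality we obtain
\[ |(f * f_\varepsilon)(x) - f(x)|^p \le \Big( \int_{\mathbb{R}^N} |f(x-y) - f(x)|^p f_\varepsilon(y)\,dy \Big) \Big( \int_{\mathbb{R}^N} f_\varepsilon(y)\,dy \Big)^{p/p'} = \int_{\mathbb{R}^N} |f(x-y) - f(x)|^p f_\varepsilon(y)\,dy. \]
Combining this with (6.27) we deduce that the following inequality holds for $1 \le p < \infty$:
\[ |(f * f_\varepsilon)(x) - f(x)|^p \le \int_{\mathbb{R}^N} |f(x-y) - f(x)|^p f_\varepsilon(y)\,dy. \]
After integration over $\mathbb{R}^N$, by applying Tonelli's Theorem, we have
\[ \|f * f_\varepsilon - f\|_p^p \le \int_{\mathbb{R}^N} \|\tau_{-y} f - f\|_p^p \, f_\varepsilon(y)\,dy \]
where $\tau_{-y} f(x) = f(x-y)$. Setting $\Delta(y) = \|\tau_{-y} f - f\|_p$, the above inequality becomes
\[ \|f * f_\varepsilon - f\|_p^p \le (\Delta^p * f_\varepsilon)(0). \]
For every $y, y_0 \in \mathbb{R}^N$, by using the translation invariance of the Lebesgue measure,
\[ |\Delta(y) - \Delta(y_0)| = \big| \|\tau_{-y} f - f\|_p - \|\tau_{-y_0} f - f\|_p \big| \le \|\tau_{-y} f - \tau_{-y_0} f\|_p = \|\tau_{-y+y_0} f - f\|_p \to 0 \quad \text{as } y \to y_0; \]
the latter fact follows by Proposition 3.50. Hence $\Delta$ is a continuous function. Since $\Delta^p(y) \le 2^p \|f\|_p^p$, then $\Delta^p \in L^\infty(\mathbb{R}^N)$. By part 1 we conclude $(\Delta^p * f_\varepsilon)(0) \to \Delta^p(0) = 0$. $\square$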
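Notation 6.25 Let $\Omega \subset \mathbb{R}^N$ be an open set and $k \in \mathbb{N}$. $\mathcal{C}^k(\Omega)$ is the space of the functions $f : \Omega \to \mathbb{R}$ which are $k$ times continuously differentiable, $\mathcal{C}_c(\Omega)$ is the space of the continuous functions $f : \Omega \to \mathbb{R}$ which are zero outside a compact set $K \subset \Omega$, and
\[ \mathcal{C}^\infty(\Omega) = \bigcap_k \mathcal{C}^k(\Omega), \qquad \mathcal{C}^k_c(\Omega) = \mathcal{C}^k(\Omega) \cap \mathcal{C}_c(\Omega), \qquad \mathcal{C}^\infty_c(\Omega) = \mathcal{C}^\infty(\Omega) \cap \mathcal{C}_c(\Omega). \]
In particular, if $k = 0$, $\mathcal{C}^0(\Omega) = \mathcal{C}(\Omega)$ is the space of the continuous functions $f : \Omega \to \mathbb{R}$. If $f \in \mathcal{C}^k(\Omega)$ and $\alpha = (\alpha_1, \ldots, \alpha_N)$ is a multiindex such that $|\alpha| := \alpha_1 + \ldots + \alpha_N \le k$, then we set
\[ D^\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \partial x_2^{\alpha_2} \cdots \partial x_N^{\alpha_N}}. \]
If $\alpha = (0, \ldots, 0)$, we set $D^0 f = f$.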
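Proposition 6.26 Let $f \in L^1(\mathbb{R}^N)$ and $g \in \mathcal{C}^k(\mathbb{R}^N)$ be such that $D^\alpha g \in L^\infty(\mathbb{R}^N)$ for every $\alpha \in \mathbb{N}^N$ with $0 \le |\alpha| \le k$. Then $f * g \in \mathcal{C}^k(\mathbb{R}^N)$ and
\[ D^\alpha (f * g) = f * D^\alpha g \qquad \forall \alpha \in \mathbb{N}^N \text{ s.t. } 0 \le |\alpha| \le k. \]
Proof. The continuity of $f * g$ follows from Young's Theorem. By induction it will be sufficient to prove the thesis when $k = 1$. Setting
\[ \varphi(x, y) = f(y)g(x-y), \]
we have
\[ \Big| \frac{\partial \varphi}{\partial x_i}(x, y) \Big| = \Big| f(y) \frac{\partial g}{\partial x_i}(x-y) \Big| \le \Big\| \frac{\partial g}{\partial x_i} \Big\|_\infty |f(y)|. \]
Since $(f * g)(x) = \int_{\mathbb{R}^N} \varphi(x, y)\,dy$, Proposition 2.75 implies that $f * g$ is differentiable and
\[ \frac{\partial (f * g)}{\partial x_i}(x) = \int_{\mathbb{R}^N} f(y) \frac{\partial g}{\partial x_i}(x-y)\,dy = \Big( f * \frac{\partial g}{\partial x_i} \Big)(x). \]
By hypothesis $\frac{\partial g}{\partial x_i} \in \mathcal{C}(\mathbb{R}^N) \cap L^\infty(\mathbb{R}^N)$. Again Young's Theorem implies $f * \frac{\partial g}{\partial x_i} \in \mathcal{C}(\mathbb{R}^N)$; hence $f * g \in \mathcal{C}^1(\mathbb{R}^N)$. $\square$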
Thus convolution with a smooth function produces a smooth function.
This fact provides us with a powerful technique to prove a variety of density
theorems.
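Definition 6.27 For every $\varepsilon > 0$ define the function $\rho_\varepsilon : \mathbb{R}^N \to \mathbb{R}$ by
\[ \rho_\varepsilon(x) = \begin{cases} C \varepsilon^{-N} \exp\Big( \dfrac{\varepsilon^2}{|x|^2 - \varepsilon^2} \Big) & \text{if } |x| < \varepsilon, \\ 0 & \text{if } |x| \ge \varepsilon, \end{cases} \]
where $C = \Big( \int_{|x| < 1} \exp\Big( \dfrac{1}{|x|^2 - 1} \Big) dx \Big)^{-1}$. The family $(\rho_\varepsilon)_\varepsilon$ is called the standard mollifier.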
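Lemma 6.28 The standard mollifier $(\rho_\varepsilon)_\varepsilon$ satisfies:
• $\rho_\varepsilon \in \mathcal{C}^\infty_c(\mathbb{R}^N)$, $\operatorname{supp}(\rho_\varepsilon) = \overline{B(0, \varepsilon)}$ $\forall \varepsilon > 0$;
• $(\rho_\varepsilon)_\varepsilon$ is an approximate identity.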
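Proof. Let $f : \mathbb{R} \to \mathbb{R}$ be defined by
\[ f(t) = \begin{cases} \exp\Big( \dfrac{1}{t - 1} \Big) & \text{if } t < 1, \\ 0 & \text{if } t \ge 1. \end{cases} \]
Then $f$ is a $\mathcal{C}^\infty$ function. Indeed we only need to check the smoothness at $t = 1$. As $t \downarrow 1$ all the derivatives are zero. As $t \uparrow 1$ the derivatives are finite linear combinations of terms of the form $\frac{1}{(t-1)^l} \exp\big( \frac{1}{t-1} \big)$, $l$ being an integer greater than or equal to zero, and these terms tend to zero as $t \uparrow 1$.
Observe that for every $\varepsilon > 0$
\[ \rho_\varepsilon(x) = \frac{1}{\varepsilon^N} \rho_1\Big( \frac{x}{\varepsilon} \Big) = C \frac{1}{\varepsilon^N} f\Big( \frac{|x|^2}{\varepsilon^2} \Big). \]
Then $\rho_\varepsilon \in \mathcal{C}^\infty_c(\mathbb{R}^N)$ and $\operatorname{supp}(\rho_\varepsilon) = \overline{B(0, \varepsilon)}$. Further, the definition of $C$ implies $\int_{\mathbb{R}^N} \rho_1(x)\,dx = 1$. Remark 6.23 allows us to conclude. $\square$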
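The smoothing effect of the standard mollifier can be observed numerically. The sketch below works in dimension $N = 1$ and makes several assumptions that are not in the notes: integrals are approximated with the midpoint rule, the constant $C$ is computed numerically, and the test function is the tent function $x \mapsto \max(0, 1 - |x|)$, which is continuous with compact support.

```python
import math

def bump(t):
    """The function f(t) = exp(1/(t-1)) for t < 1 and 0 for t >= 1."""
    return math.exp(1.0 / (t - 1.0)) if t < 1.0 else 0.0

def midpoint(func, a, b, n=2000):
    """Midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

# Normalising constant for N = 1: C = 1 / integral of exp(1/(x^2 - 1)) over (-1, 1).
C = 1.0 / midpoint(lambda x: bump(x * x), -1.0, 1.0)

def rho(x, eps):
    """Standard mollifier in dimension N = 1: C * eps**(-1) * bump(x^2 / eps^2)."""
    return C / eps * bump((x / eps) ** 2)

def tent(x):
    """Continuous, compactly supported test function."""
    return max(0.0, 1.0 - abs(x))

def mollify(func, x, eps):
    """(func * rho_eps)(x) = integral of func(x - y) rho_eps(y) dy over |y| < eps."""
    return midpoint(lambda y: func(x - y) * rho(y, eps), -eps, eps)

def sup_error(eps, grid_size=81):
    """Largest value of |(tent * rho_eps)(x) - tent(x)| on a grid in [-2, 2]."""
    xs = [-2.0 + 4.0 * k / (grid_size - 1) for k in range(grid_size)]
    return max(abs(mollify(tent, x, eps) - tent(x)) for x in xs)

for eps in (0.4, 0.2, 0.1):
    print(eps, sup_error(eps))
```

The printed errors should shrink as `eps` decreases, which is the behaviour predicted by Proposition 6.24.2 for a uniformly continuous function.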
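Lemma 6.29 Let $f, g \in \mathcal{C}_c(\mathbb{R}^N)$. Then $f * g \in \mathcal{C}_c(\mathbb{R}^N)$ and
\[ \operatorname{supp}(f * g) \subset \operatorname{supp}(f) + \operatorname{supp}(g), \]
where for sets $A$ and $B$ of $\mathbb{R}^N$: $A + B = \{x + y \mid x \in A,\ y \in B\}$.
Proof. By Proposition 6.26 we get $f * g \in \mathcal{C}(\mathbb{R}^N)$. Set $A = \operatorname{supp}(f)$, $B = \operatorname{supp}(g)$. For every $x \in \mathbb{R}^N$ we have
\[ (f * g)(x) = \int_{(x - \operatorname{supp}(f)) \cap \operatorname{supp}(g)} f(x-y)g(y)\,dy. \]
In order to obtain that $(f * g)(x) \ne 0$, necessarily $(x - \operatorname{supp}(f)) \cap \operatorname{supp}(g) \ne \emptyset$, that is, $x \in \operatorname{supp}(f) + \operatorname{supp}(g)$. $\square$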
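Proposition 6.30 Let $\Omega \subset \mathbb{R}^N$ be an open set. Then
• the space $\mathcal{C}^\infty_c(\Omega)$ is dense in $\mathcal{C}_0(\Omega)$ (3);
• the space $\mathcal{C}^\infty_c(\Omega)$ is dense in $L^p(\Omega)$ for every $1 \le p < \infty$.
(3) See Exercise 3.46 for the definition of $\mathcal{C}_0(\Omega)$.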
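Proof. According to Theorem 3.45 and Exercise 3.46 it is sufficient to prove that, given $f \in \mathcal{C}_c(\Omega)$, there exists a sequence $(f_n)_n \subset \mathcal{C}^\infty_c(\Omega)$ such that $f_n \to f$ in $L^\infty(\Omega)$ and $f_n \to f$ in $L^p(\Omega)$. Indeed, fixed $f \in \mathcal{C}_c(\Omega)$, set
\[ \tilde f(x) = \begin{cases} f(x) & \text{if } x \in \Omega, \\ 0 & \text{if } x \in \mathbb{R}^N \setminus \Omega. \end{cases} \]
Then $\tilde f \in \mathcal{C}_c(\mathbb{R}^N)$. Let $(\rho_\varepsilon)_\varepsilon$ be the mollifier constructed in Definition 6.27 and for every $n$ define $f_n := \tilde f * \rho_{1/n}$. According to Proposition 6.26, $f_n \in \mathcal{C}^\infty(\mathbb{R}^N)$. Next let $K = \operatorname{supp}(f)$ and $\eta = \inf_{x \in K} d_{\partial\Omega}(x) > 0$. Then
\[ \tilde K := \Big\{ x \in \mathbb{R}^N \ \Big|\ d_K(x) \le \frac{\eta}{2} \Big\} \]
is a compact set and $\tilde K \subset \Omega$. By Lemma 6.29, if $n$ is such that $\frac{1}{n} < \frac{\eta}{2}$ we obtain
\[ \operatorname{supp}(f_n) \subset K + \overline{B\Big(0, \frac{1}{n}\Big)} = \Big\{ x \in \mathbb{R}^N \ \Big|\ d_K(x) \le \frac{1}{n} \Big\} \subset \tilde K. \]
Then $f_n \in \mathcal{C}^\infty_c(\Omega)$ for $n$ sufficiently large. Since $\tilde f$ is uniformly continuous, Proposition 6.24.2 gives $f_n \to \tilde f$ in $L^\infty(\mathbb{R}^N)$, which implies
\[ f_n \to f \quad \text{in } L^\infty(\Omega). \]
Finally, for large $n$,
\[ \int_\Omega |f_n - f|^p\,dx = \int_{\tilde K} |f_n - f|^p\,dx \le \lambda(\tilde K) \, \|f_n - f\|_\infty^p \to 0. \]
$\square$
An interesting consequence of the smoothing properties of convolution is the following Weierstrass approximation Theorem.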
Theorem 6.31 (Weierstrass) Let $f \in \mathcal{C}_c(\mathbb{R}^N)$. Then there exists a sequence of polynomials $(p_n)_n$ such that $p_n \to f$ uniformly on compact sets of $\mathbb{R}^N$.
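Proof. For every $\varepsilon > 0$ define
\[ u_\varepsilon(x) = \varepsilon^{-N} u(\varepsilon^{-1} x), \qquad x \in \mathbb{R}^N, \]
where
\[ u(x) = \pi^{-N/2} \exp(-|x|^2), \qquad x \in \mathbb{R}^N. \]
The well-known Poisson formula
\[ \int_{\mathbb{R}^N} \exp(-|x|^2)\,dx = \pi^{N/2} \]
and Remark 6.23 imply that $(u_\varepsilon)_\varepsilon$ is an approximate identity. Proposition 6.24.2 yields
\[ u_\varepsilon * f \to f \quad \text{in } L^\infty(\mathbb{R}^N) \text{ as } \varepsilon \to 0. \tag{6.28} \]
Fix $\varepsilon > 0$ and let $K \subset \mathbb{R}^N$ be a compact set. We claim that there exists a sequence of polynomials $(P_n)_n$ such that
\[ P_n \to u_\varepsilon * f \quad \text{uniformly in } K. \tag{6.29} \]
Indeed the function $u_\varepsilon$ is analytic and so on any compact set it can be uniformly approximated by the partial sums of its Taylor series, which are polynomials. The set $\tilde K := K - \operatorname{supp}(f)$ is compact, so there exists a sequence $(p_n)_n$ of polynomials on $\mathbb{R}^N$ such that $p_n \to u_\varepsilon$ uniformly in $\tilde K$. Next set
\[ P_n(x) = \int_{\mathbb{R}^N} p_n(x-y)f(y)\,dy. \tag{6.30} \]
Since $f$ is compactly supported, the integrand in (6.30) is bounded by $|f| \sup_{y \in \operatorname{supp}(f)} |p_n(x-y)|$, which is summable for every $x \in \mathbb{R}^N$. Then $P_n$ is well defined on $\mathbb{R}^N$. Observe that $p_n(x-y)$ is a polynomial in the variables $(x, y)$, that is, $p_n(x-y) = \sum_{k=1}^{m} q_k(x)s_k(y)$ with $q_k, s_k$ polynomials in $\mathbb{R}^N$; substituting in (6.30) we obtain that each $P_n$ is also a polynomial. Furthermore, for every $x \in K$,
\[ |P_n(x) - (u_\varepsilon * f)(x)| \le \int_{\operatorname{supp}(f)} |p_n(x-y) - u_\varepsilon(x-y)| \, |f(y)|\,dy \le \sup_{t \in \tilde K} |p_n(t) - u_\varepsilon(t)| \int_{\mathbb{R}^N} |f(y)|\,dy \]
and (6.29) follows.
To conclude, consider a sequence $\varepsilon_n \to 0^+$. For every $n \in \mathbb{N}$, since the set $\{|x| \le n\}$ is compact, we can find a polynomial $Q_n$ such that
\[ \sup_{|x| \le n} |Q_n(x) - (u_{\varepsilon_n} * f)(x)| \le \varepsilon_n. \]
If $K \subset \mathbb{R}^N$ is a compact set, then for $n$ sufficiently large $K \subset B(0, n)$, which implies
\[ \sup_{x \in K} |Q_n(x) - f(x)| \le \sup_{x \in K} |Q_n(x) - (u_{\varepsilon_n} * f)(x)| + \sup_{x \in K} |(u_{\varepsilon_n} * f)(x) - f(x)| \le \varepsilon_n + \|(u_{\varepsilon_n} * f) - f\|_\infty \to 0 \]
by (6.28). $\square$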
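Corollary 6.32 Let $A \in \mathcal{B}(\mathbb{R}^N)$ be a bounded set and $1 \le p < \infty$. Then the set $\mathcal{P}_A$ of all polynomials defined on $A$ is dense in $L^p(A)$.
Proof. Consider $f \in L^p(A)$ and let $\tilde f$ be the extension of $f$ by zero outside $A$. Then $\tilde f \in L^p(\mathbb{R}^N)$; fixed $\varepsilon > 0$, Proposition 6.30 implies the existence of $g \in \mathcal{C}_c(\mathbb{R}^N)$ such that $\int_{\mathbb{R}^N} |\tilde f - g|^p\,dx \le \varepsilon$. Since $\overline{A}$ is a compact set, by Theorem 6.31 we get the existence of a polynomial $p$ such that $\sup_{x \in \overline{A}} |p(x) - g(x)| \le \big( \frac{\varepsilon}{\lambda(A)} \big)^{1/p}$. Then
\[ \int_A |g(x) - p(x)|^p\,dx \le \Big( \sup_{x \in \overline{A}} |g(x) - p(x)| \Big)^p \lambda(A) \le \varepsilon, \]
by which
\[ \int_A |f(x) - p(x)|^p\,dx \le 2^{p-1} \int_A |f(x) - g(x)|^p\,dx + 2^{p-1} \int_A |g(x) - p(x)|^p\,dx \le 2^{p-1} \int_{\mathbb{R}^N} |\tilde f(x) - g(x)|^p\,dx + 2^{p-1}\varepsilon \le 2^p \varepsilon. \]
$\square$
Remark 6.33 By Corollary 6.32 we deduce that if $A \in \mathcal{B}(\mathbb{R}^N)$ is a bounded set, then the set of all polynomials defined on $A$ with rational coefficients is countable and everywhere dense in $L^p(A)$ for $1 \le p < \infty$ (see Proposition 3.47).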
Chapter 7
Functions of bounded variation
and absolutely continuous
functions
Let $f$ and $F$ be two functions on $[a, b]$ such that $f$ is continuous and $F$ has a continuous derivative. Recall from elementary calculus that the connection between the operations of differentiation and integration is expressed by the familiar formulas
\[ \frac{d}{dx} \int_a^x f(t)\,dt = f(x), \tag{7.1} \]
\[ \int_a^x F'(t)\,dt = F(x) - F(a). \tag{7.2} \]
This immediately suggests the following questions:
1. Does (7.1) continue to hold almost everywhere for an arbitrary summable function $f$?
2. What is the largest class of functions for which (7.2) holds?
These questions will be answered in this chapter. We observe that if $f$ is nonnegative, then the indefinite Lebesgue integral
\[ \int_a^x f(t)\,dt, \qquad x \in [a, b], \tag{7.3} \]
as a function of its upper limit, is nondecreasing. Moreover, since every summable function $f$ is the difference of two nonnegative summable functions $f^+$ and $f^-$, the integral (7.3) is the difference between two nondecreasing functions. Hence, the study of the indefinite Lebesgue integral is closely related to the study of monotonic functions. Monotonic functions have a number of simple and important properties which we now discuss.
7.1 Monotonic functions
Definition 7.1 A function $f : [a, b] \to \mathbb{R}$ is said to be nondecreasing if $a \le x_1 \le x_2 \le b$ implies $f(x_1) \le f(x_2)$, and nonincreasing if $a \le x_1 \le x_2 \le b$ implies $f(x_1) \ge f(x_2)$. By a monotonic function is meant a function which is either nondecreasing or nonincreasing.
Definition 7.2 Given a monotonic function $f : [a, b] \to \mathbb{R}$ and $x_0 \in [a, b)$, the limit
\[ f(x_0^+) := \lim_{h \to 0,\ h > 0} f(x_0 + h) \]
(which always exists) is said to be the right-hand limit of $f$ at the point $x_0$. Similarly, if $x_0 \in (a, b]$, the limit
\[ f(x_0^-) := \lim_{h \to 0,\ h > 0} f(x_0 - h) \]
is called the left-hand limit of $f$ at $x_0$.
Remark 7.3 Let $f$ be nondecreasing on $[a, b]$. If $a \le x < y \le b$, then
\[ f(x^+) \le f(y^-). \]
Analogously, if $f$ is nonincreasing on $[a, b]$ and $a \le x < y \le b$, then
\[ f(x^+) \ge f(y^-). \]
We now establish the basic properties of monotonic functions.
Theorem 7.4 Every monotonic function $f$ on $[a, b]$ is Borel and bounded, and hence summable.
Proof. Assume that $f$ is nondecreasing. Since $f(a) \le f(x) \le f(b)$ for all $x \in [a, b]$, $f$ is obviously bounded. For every $c \in \mathbb{R}$ consider the set
\[ E_c = \{x \in [a, b] \mid f(x) < c\}. \]
If $E_c$ is empty, then $E_c$ is (trivially) a Borel set. If $E_c$ is nonempty, let $y$ be the least upper bound of all $x \in E_c$. Then $E_c$ is either the closed interval $[a, y]$, if $y \in E_c$, or the half-open interval $[a, y)$, if $y \notin E_c$. In either case, $E_c$ is a Borel set; this proves that $f$ is Borel. Finally we have
\[ \int_a^b |f(x)|\,dx \le \max\{|f(a)|, |f(b)|\}\,(b - a), \]
by which $f$ is summable. $\square$
Theorem 7.5 Let $f : [a, b] \to \mathbb{R}$ be a monotonic function. Then the set of points of $[a, b]$ at which $f$ is discontinuous is at most countable.
Proof. Suppose, for the sake of definiteness, that $f$ is nondecreasing, and let $E$ be the set of points at which $f$ is discontinuous. If $x \in E$ we have $f(x^-) < f(x^+)$; then with every point $x$ of $E$ we associate a rational number $r(x)$ such that
\[ f(x^-) < r(x) < f(x^+). \]
Since by Remark 7.3 $x_1 < x_2$ implies $f(x_1^+) \le f(x_2^-)$, we see that $r(x_1) \ne r(x_2)$. We have thus established a one-to-one correspondence between the set $E$ and a subset of the rational numbers. $\square$
7.1.1 Differentiation of a monotonic function
The key result of this section is that a monotonic function $f$ defined on an interval $[a, b]$ has a finite derivative almost everywhere in $[a, b]$. Before proving this result, due to Lebesgue, we must first introduce some further notation. For every $x \in (a, b)$ the following four quantities (which may take infinite values) always exist:
\[ D'_L f(x) = \liminf_{h \to 0,\ h < 0} \frac{f(x+h) - f(x)}{h}, \qquad D''_L f(x) = \limsup_{h \to 0,\ h < 0} \frac{f(x+h) - f(x)}{h}, \]
\[ D'_R f(x) = \liminf_{h \to 0,\ h > 0} \frac{f(x+h) - f(x)}{h}, \qquad D''_R f(x) = \limsup_{h \to 0,\ h > 0} \frac{f(x+h) - f(x)}{h}. \]
These four quantities are called the derived numbers of $f$ at $x$. It is clear that the inequalities
\[ D'_L f(x) \le D''_L f(x), \qquad D'_R f(x) \le D''_R f(x) \tag{7.4} \]
always hold. If $D'_L f(x)$ and $D''_L f(x)$ are finite and equal, their common value is just the left-hand derivative of $f$ at $x$. Similarly, if $D'_R f(x)$ and $D''_R f(x)$ are finite and equal, their common value is just the right-hand derivative of $f$ at $x$. Moreover, $f$ has a derivative at $x$ if and only if all four derived numbers $D'_L f(x)$, $D''_L f(x)$, $D'_R f(x)$ and $D''_R f(x)$ are finite and equal.
Theorem 7.6 (Lebesgue) Let $f : [a, b] \to \mathbb{R}$ be a monotonic function. Then $f$ has a derivative almost everywhere on $[a, b]$. Furthermore $f' \in L^1([a, b])$ and
\[ \int_a^b |f'(t)|\,dt \le |f(b) - f(a)|. \tag{7.5} \]
Proof. There is no loss of generality in assuming that $f$ is nondecreasing, since if $f$ is nonincreasing, we can apply the result to $-f$, which is obviously nondecreasing. We begin by proving that the derived numbers of $f$ are equal (with possibly infinite value) almost everywhere on $[a, b]$. It will be enough to show that the inequality
\[ D'_L f(x) \ge D''_R f(x) \tag{7.6} \]
holds almost everywhere on $[a, b]$. In fact, setting $f^*(x) = -f(-x)$, we see that $f^*$ is nondecreasing on $[-b, -a]$; moreover, it is easily verified that
\[ D'_L f^*(x) = D'_R f(-x), \qquad D''_L f^*(x) = D''_R f(-x). \]
Therefore, applying (7.6) to $f^*$, we get
\[ D'_L f^*(x) \ge D''_R f^*(x) \]
or
\[ D'_R f(x) \ge D''_L f(x). \]
Combining this inequality with (7.6), we obtain
\[ D''_R f \le D'_L f \le D''_L f \le D'_R f \le D''_R f, \]
after using (7.4), and the equality of the four derived numbers follows. To prove that (7.6) holds almost everywhere, observe that the set of points where $D'_L f < D''_R f$ can clearly be represented as the union over $u, v \in \mathbb{Q}$ with $v > u > 0$ of the sets
\[ E_{u,v} = \{x \in (a, b) \mid D''_R f(x) > v > u > D'_L f(x)\}. \]
It will then follow that (7.6) holds almost everywhere, if we succeed in showing that $\lambda(E_{u,v}) = 0$. Let $s = \lambda(E_{u,v})$. Then, given $\varepsilon > 0$, according to Proposition 1.53 there is an open set $A$ such that $E_{u,v} \subset A$ and $\lambda(A) < s + \varepsilon$. For every $x \in E_{u,v}$ and $\delta > 0$, since $D'_L f(x) < u$, there exists $h_{x,\delta} \in (0, \delta)$ such that $[x - h_{x,\delta}, x] \subset A$ and
\[ f(x) - f(x - h_{x,\delta}) < u\,h_{x,\delta}. \]
Since the collection of closed intervals $([x - h_{x,\delta}, x])_{x \in E_{u,v},\ \delta > 0}$ is a fine cover of $E_{u,v}$, by Vitali's covering lemma there exists a finite number of disjoint intervals of such collection, say
\[ I_1 := [x_1 - h_1, x_1], \ \ldots, \ I_N := [x_N - h_N, x_N], \]
such that, setting $B = E_{u,v} \cap \bigcup_{i=1}^N (x_i - h_i, x_i)$,
\[ \lambda(B) = \lambda\Big( E_{u,v} \cap \bigcup_{i=1}^N I_i \Big) > s - \varepsilon. \]
Summing up over these intervals we get
\[ \sum_{i=1}^N \big( f(x_i) - f(x_i - h_i) \big) < u \sum_{i=1}^N h_i \le u\,\lambda(A) < u(s + \varepsilon). \tag{7.7} \]
Now we reason as above and use the inequality $D''_R f > v$ on $B$: for every $y \in B$ and $\delta > 0$, since $D''_R f(y) > v$, there exists $k_{y,\delta} \in (0, \delta)$ such that $[y, y + k_{y,\delta}] \subset I_i$ for some $i \in \{1, \ldots, N\}$ and
\[ f(y + k_{y,\delta}) - f(y) > v\,k_{y,\delta}. \]
Since the collection of closed intervals $([y, y + k_{y,\delta}])_{y \in B,\ \delta > 0}$ is a fine cover of $B$, by Vitali's covering lemma there exists a finite number of disjoint intervals of such collection, say
\[ J_1 := [y_1, y_1 + k_1], \ \ldots, \ J_M := [y_M, y_M + k_M], \]
such that
\[ \lambda\Big( B \cap \bigcup_{j=1}^M J_j \Big) \ge \lambda(B) - \varepsilon > s - 2\varepsilon. \]
Summing up over these intervals we get
\[ \sum_{j=1}^M \big( f(y_j + k_j) - f(y_j) \big) > v \sum_{j=1}^M k_j = v\,\lambda\Big( \bigcup_{j=1}^M J_j \Big) > v(s - 2\varepsilon). \tag{7.8} \]
For every $i \in \{1, \ldots, N\}$, we sum up over all the intervals $J_j$ such that $J_j \subset I_i$, and, using that $f$ is nondecreasing, we obtain
\[ \sum_{j,\ J_j \subset I_i} \big( f(y_j + k_j) - f(y_j) \big) \le f(x_i) - f(x_i - h_i), \]
by which, summing over $i$ and taking into account that every interval $J_j$ is contained in some interval $I_i$,
\[ \sum_{i=1}^N \big( f(x_i) - f(x_i - h_i) \big) \ge \sum_{i=1}^N \sum_{j,\ J_j \subset I_i} \big( f(y_j + k_j) - f(y_j) \big) = \sum_{j=1}^M \big( f(y_j + k_j) - f(y_j) \big). \]
Combining this with (7.7)-(7.8),
\[ u(s + \varepsilon) > v(s - 2\varepsilon). \]
The arbitrariness of $\varepsilon$ implies $us \ge vs$; since $u < v$, then $s = 0$. This shows that $\lambda(E_{u,v}) = 0$, as asserted.
We have thus proved that the function
\[ \Phi(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \]
is defined almost everywhere on $[a, b]$ and $f$ has a derivative at $x$ if and only if $\Phi(x)$ is finite. Let
\[ \Phi_n(x) = n \Big( f\Big( x + \frac{1}{n} \Big) - f(x) \Big) \]
where, to make $\Phi_n$ meaningful for all $x \in [a, b]$, we set $f(x) = f(b)$ for $x \ge b$, by definition. Since $f$ is summable on $[a, b]$, so is every $\Phi_n$. Integrating $\Phi_n$, we get
\[ \int_a^b \Phi_n(x)\,dx = n \int_a^b \Big( f\Big( x + \frac{1}{n} \Big) - f(x) \Big) dx = n \Big( \int_{a + \frac{1}{n}}^{b + \frac{1}{n}} f(x)\,dx - \int_a^b f(x)\,dx \Big) \]
\[ = n \Big( \int_b^{b + \frac{1}{n}} f(x)\,dx - \int_a^{a + \frac{1}{n}} f(x)\,dx \Big) = f(b) - n \int_a^{a + \frac{1}{n}} f(x)\,dx \le f(b) - f(a), \]
where in the last step we use the fact that $f$ is nondecreasing. Since $\Phi_n \ge 0$ and $\Phi_n \to \Phi$ a.e., from Fatou's lemma it follows that
\[ \int_a^b \Phi(x)\,dx \le f(b) - f(a). \]
In particular $\Phi$ is summable and, consequently, a.e. finite. Then $f$ has a derivative almost everywhere and $f'(x) = \Phi(x)$ a.e. in $[a, b]$. $\square$
Example 7.7 It is easy to find monotonic functions $f$ for which (7.5) becomes a strict inequality. For example, given points $a = x_0 < x_1 < \ldots < x_n = b$ in the interval $[a, b]$ and corresponding numbers $h_1, h_2, \ldots, h_n$, consider the function
\[ f(x) = \begin{cases} h_1 & \text{if } a \le x < x_1, \\ h_2 & \text{if } x_1 \le x < x_2, \\ \ \ldots \\ h_n & \text{if } x_{n-1} \le x \le b. \end{cases} \]
A function of this particularly simple type is called a step function. If $h_1 \le h_2 \le \ldots \le h_n$, then $f$ is obviously nondecreasing and, provided $h_1 < h_n$,
\[ 0 = \int_a^b f'(x)\,dx < f(b) - f(a) = h_n - h_1. \]
Example 7.8 [Vitali's function] In the preceding example, $f$ is discontinuous. However, it is also possible to find continuous nondecreasing functions satisfying the strict inequality (7.5). To this end let
\[ (a_1^1, b_1^1) = \Big( \frac{1}{3}, \frac{2}{3} \Big) \]
be the middle third of the interval $[0, 1]$, let
\[ (a_1^2, b_1^2) = \Big( \frac{1}{9}, \frac{2}{9} \Big), \qquad (a_2^2, b_2^2) = \Big( \frac{7}{9}, \frac{8}{9} \Big) \]
be the middle thirds of the intervals remaining after deleting $(a_1^1, b_1^1)$ from $[0, 1]$, let
\[ (a_1^3, b_1^3) = \Big( \frac{1}{27}, \frac{2}{27} \Big), \quad (a_2^3, b_2^3) = \Big( \frac{7}{27}, \frac{8}{27} \Big), \quad (a_3^3, b_3^3) = \Big( \frac{19}{27}, \frac{20}{27} \Big), \quad (a_4^3, b_4^3) = \Big( \frac{25}{27}, \frac{26}{27} \Big) \]
be the middle thirds of the intervals remaining after deleting $(a_1^1, b_1^1)$, $(a_1^2, b_1^2)$, $(a_2^2, b_2^2)$ from $[0, 1]$, and so on. Note that the complement of the union of all the intervals $(a_k^n, b_k^n)$ is the Cantor set constructed in Example 1.49. Now define a function
\[ f(0) = 0, \qquad f(1) = 1, \qquad f(t) = \frac{2k - 1}{2^n} \quad \text{if } t \in (a_k^n, b_k^n), \]
so that
\[ f(t) = \frac{1}{2} \quad \text{if } \frac{1}{3} < t < \frac{2}{3}, \]
\[ f(t) = \begin{cases} \frac{1}{4} & \text{if } \frac{1}{9} < t < \frac{2}{9}, \\ \frac{3}{4} & \text{if } \frac{7}{9} < t < \frac{8}{9}, \end{cases} \]
\[ f(t) = \begin{cases} \frac{1}{8} & \text{if } \frac{1}{27} < t < \frac{2}{27}, \\ \frac{3}{8} & \text{if } \frac{7}{27} < t < \frac{8}{27}, \\ \frac{5}{8} & \text{if } \frac{19}{27} < t < \frac{20}{27}, \\ \frac{7}{8} & \text{if } \frac{25}{27} < t < \frac{26}{27}, \end{cases} \]
and so on. Then $f$ is defined everywhere except at points of the Cantor set $C$; furthermore $f$ is nondecreasing on $[0, 1] \setminus C$ and $f([0, 1] \setminus C) = \big\{ \frac{2k-1}{2^n} \ \big|\ n \in \mathbb{N},\ 1 \le k \le 2^{n-1} \big\}$, which is dense in $[0, 1]$, that is,
\[ \overline{f([0, 1] \setminus C)} = [0, 1]. \tag{7.9} \]
Given any point $t^* \in C$, let $(t_n)_n$ be an increasing sequence of points in $[0, 1] \setminus C$ converging to $t^*$ and let $(t'_n)_n$ be a decreasing sequence of points in $[0, 1] \setminus C$ converging to $t^*$. Such sequences exist since $[0, 1] \setminus C$ is dense in $[0, 1]$. Then the limits $\lim_n f(t_n)$ and $\lim_n f(t'_n)$ exist (since $f$ is nondecreasing in $[0, 1] \setminus C$); we claim that they are equal. Otherwise, setting $a = \lim_n f(t_n)$ and $b = \lim_n f(t'_n)$, we would have $(a, b) \subset [0, 1] \setminus f([0, 1] \setminus C)$, in contradiction with (7.9). Then let
\[ f(t^*) = \lim_n f(t_n) = \lim_n f(t'_n). \]
Completing the definition of $f$ in this way, we obtain a continuous nondecreasing function on the whole interval $[0, 1]$, known as Vitali's function. The derivative $f'$ obviously vanishes on every interval $(a_k^n, b_k^n)$, and hence vanishes almost everywhere, since the Cantor set has measure zero. It follows that
\[ 0 = \int_0^1 f'(x)\,dx < f(1) - f(0) = 1. \]
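Vitali's function can also be evaluated digit by digit from the ternary expansion of $t$, which gives a convenient way to check the values listed in the example. The following is a sketch, not part of the notes: the name `vitali`, the use of exact rationals and the truncation parameter `depth` are assumptions. The first ternary digit equal to 1 signals that $t$ lies in the closure of a removed middle third, where the value is a dyadic rational; otherwise each digit 2 contributes the corresponding binary digit 1.

```python
from fractions import Fraction

def vitali(t, depth=60):
    """Evaluate Vitali's function at t in [0, 1] from its ternary digits.

    For points of the Cantor set with a non-terminating expansion the result
    is a truncation, accurate up to 2**(-depth).
    """
    t = Fraction(t)
    if t <= 0:
        return Fraction(0)
    if t >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        t *= 3
        digit = int(t)          # next ternary digit: 0, 1 or 2
        t -= digit
        if digit == 1:          # t is in the closure of a removed middle third
            return value + scale
        if digit == 2:          # ternary digit 2 becomes binary digit 1
            value += scale
        scale /= 2
    return value

# Values on the first removed intervals, as listed in the example.
print(vitali(Fraction(1, 2)), vitali(Fraction(1, 5)), vitali(Fraction(4, 5)))
```

Here $1/2 \in (1/3, 2/3)$, $1/5 \in (1/9, 2/9)$ and $4/5 \in (7/9, 8/9)$, so the three printed values are expected to be $1/2$, $1/4$ and $3/4$.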
7.2 Functions of bounded variation
Definition 7.9 A function $f$ defined on an interval $[a, b]$ is said to be of bounded variation if there is a constant $C > 0$ such that
\[ \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)| \le C \tag{7.10} \]
for every partition
\[ a = x_0 < x_1 < \ldots < x_n = b \tag{7.11} \]
of $[a, b]$. By the total variation of $f$ on $[a, b]$, denoted by $V_a^b(f)$, is meant the quantity
\[ V_a^b(f) = \sup \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)| \tag{7.12} \]
where the least upper bound is taken over all partitions (7.11) of the interval $[a, b]$.
Remark 7.10 It is an immediate consequence of the above definition that if $\alpha \in \mathbb{R}$ and $f$ is a function of bounded variation on $[a, b]$, then so is $\alpha f$ and
\[ V_a^b(\alpha f) = |\alpha| \, V_a^b(f). \]
Example 7.11 1. If $f$ is a monotonic function on $[a, b]$, then the left-hand side of (7.10) equals $|f(b) - f(a)|$ regardless of the choice of partition. Then $f$ is of bounded variation and $V_a^b(f) = |f(b) - f(a)|$.
2. If $f$ is a step function of the type considered in Example 7.7 with $h_1, \ldots, h_n \in \mathbb{R}$, then $f$ is of bounded variation, with total variation given by the sum of the jumps, i.e.
\[ V_a^b(f) = \sum_{i=1}^{n-1} |h_{i+1} - h_i|. \]
Example 7.12 Suppose $f$ is a Lipschitz function on $[a, b]$ with Lipschitz constant $K$; then for any partition (7.11) of $[a, b]$ we have
\[ \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)| \le K \sum_{k=0}^{n-1} |x_{k+1} - x_k| = K(b - a). \]
Then $f$ is of bounded variation and $V_a^b(f) \le K(b - a)$.
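Example 7.13 It is easy to find a continuous function which is not of bounded variation. Indeed consider the function
\[ f(x) = \begin{cases} x \sin \dfrac{1}{x} & \text{if } 0 < x \le 1, \\ 0 & \text{if } x = 0 \end{cases} \]
and, fixed $n \in \mathbb{N}$, take the following partition of $[0, 1]$:
\[ 0, \ \frac{2}{(4n-1)\pi}, \ \frac{2}{(4n-3)\pi}, \ \ldots, \ \frac{2}{3\pi}, \ \frac{2}{\pi}, \ 1. \]
The sum on the left-hand side of (7.10) associated to such partition is given by
\[ \frac{4}{\pi} \sum_{k=1}^{2n-1} \frac{1}{2k+1} + \frac{2}{\pi} + \Big| \sin 1 - \frac{2}{\pi} \Big|. \]
Taking into account that $\sum_{k=1}^{\infty} \frac{1}{2k+1} = \infty$, we deduce that the least upper bound on the right-hand side of (7.12) over all partitions of $[0, 1]$ is infinity.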
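The growth of the variation sums in Example 7.13 can be reproduced numerically. In the sketch below the helper names `variation_sum`, `partition` and `closed_form` are illustrative assumptions; the code evaluates the sum in (7.10) on the partition of the example and compares it with the closed-form expression stated there.

```python
import math

def f(x):
    """The function x*sin(1/x) on (0, 1], extended by 0 at x = 0."""
    return x * math.sin(1.0 / x) if x > 0 else 0.0

def variation_sum(func, points):
    """Sum of |func(x_{k+1}) - func(x_k)| over consecutive partition points."""
    return sum(abs(func(b) - func(a)) for a, b in zip(points, points[1:]))

def partition(n):
    """Points 0 < 2/((4n-1)pi) < 2/((4n-3)pi) < ... < 2/(3pi) < 2/pi < 1."""
    odd = range(4 * n - 1, 0, -2)            # 4n-1, 4n-3, ..., 3, 1
    return [0.0] + [2.0 / (m * math.pi) for m in odd] + [1.0]

def closed_form(n):
    """(4/pi) * sum_{k=1}^{2n-1} 1/(2k+1) + 2/pi + |sin 1 - 2/pi|."""
    partial = sum(1.0 / (2 * k + 1) for k in range(1, 2 * n))
    return 4.0 / math.pi * partial + 2.0 / math.pi + abs(math.sin(1.0) - 2.0 / math.pi)

for n in (1, 10, 100, 1000):
    print(n, variation_sum(f, partition(n)), closed_form(n))
```

The two printed columns should agree up to rounding, and they grow without bound (roughly like $\frac{2}{\pi}\log n$) as $n$ increases.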
Proposition 7.14 If $f$ and $g$ are functions of bounded variation on $[a, b]$, then so is $f + g$ and
\[ V_a^b(f + g) \le V_a^b(f) + V_a^b(g). \]
Proof. For any partition of the interval $[a, b]$, we have
\[ \sum_{k=0}^{n-1} |f(x_{k+1}) + g(x_{k+1}) - f(x_k) - g(x_k)| \le \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)| + \sum_{k=0}^{n-1} |g(x_{k+1}) - g(x_k)| \le V_a^b(f) + V_a^b(g). \]
Taking the least upper bound on the left-hand side over all partitions of $[a, b]$ we immediately get the thesis. $\square$
It follows from Remark 7.10 and Proposition 7.14 that any linear combination of functions of bounded variation is itself a function of bounded variation. In other words, the set $BV([a, b])$ of all functions of bounded variation on the interval $[a, b]$ is a linear space (unlike the set of all monotonic functions).
Proposition 7.15 If $f$ is a function of bounded variation on $[a, b]$ and $a < c < b$, then
\[ V_a^b(f) = V_a^c(f) + V_c^b(f). \]
Proof. First we consider a partition of the interval $[a, b]$ such that $c$ is one of the points of subdivision, say $x_r = c$. Then
\[ \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)| = \sum_{k=0}^{r-1} |f(x_{k+1}) - f(x_k)| + \sum_{k=r}^{n-1} |f(x_{k+1}) - f(x_k)| \le V_a^c(f) + V_c^b(f). \tag{7.13} \]
Now consider an arbitrary partition of $[a, b]$. It is clear that adding an extra point of subdivision to this partition can never decrease the sum $\sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)|$. Therefore (7.13) holds for any subdivision of $[a, b]$, and hence
\[ V_a^b(f) \le V_a^c(f) + V_c^b(f). \]
On the other hand, given any $\varepsilon > 0$, there are partitions of the intervals $[a, c]$ and $[c, b]$, respectively, such that
\[ \sum_i |f(x'_{i+1}) - f(x'_i)| > V_a^c(f) - \frac{\varepsilon}{2}, \qquad \sum_j |f(x''_{j+1}) - f(x''_j)| > V_c^b(f) - \frac{\varepsilon}{2}. \]
Combining all points of subdivision $x'_i$, $x''_j$, we get a partition of the interval $[a, b]$, with points of subdivision $x_k$, such that
\[ V_a^b(f) \ge \sum_k |f(x_{k+1}) - f(x_k)| = \sum_i |f(x'_{i+1}) - f(x'_i)| + \sum_j |f(x''_{j+1}) - f(x''_j)| > V_a^c(f) + V_c^b(f) - \varepsilon. \]
Since $\varepsilon > 0$ is arbitrary, it follows that $V_a^b(f) \ge V_a^c(f) + V_c^b(f)$. $\square$
Corollary 7.16 If $f$ is a function of bounded variation on $[a, b]$, then the function
\[ x \mapsto V_a^x(f) \]
is nondecreasing.
Proof. If $a \le x < y \le b$, Proposition 7.15 implies
\[ V_a^y(f) = V_a^x(f) + V_x^y(f) \ge V_a^x(f). \]
$\square$
Proposition 7.17 A function $f : [a, b] \to \mathbb{R}$ is of bounded variation if and only if $f$ can be represented as the difference between two nondecreasing functions on $[a, b]$.
Proof. Since, by Example 7.11, any monotonic function is of bounded variation and since the set $BV([a, b])$ is a linear space, we get that the difference of two nondecreasing functions is of bounded variation. To prove the converse, set
\[ g_1(x) = V_a^x(f), \qquad g_2(x) = V_a^x(f) - f(x). \]
By Corollary 7.16, $g_1$ is a nondecreasing function. We claim that $g_2$ is nondecreasing too. Indeed, if $x < y$, then, using Proposition 7.15, we get
\[ g_2(y) - g_2(x) = V_x^y(f) - (f(y) - f(x)). \tag{7.14} \]
But from Definition 7.9
\[ |f(y) - f(x)| \le V_x^y(f) \]
and hence the right-hand side of (7.14) is nonnegative. Writing $f = g_1 - g_2$, we get the desired representation of $f$ as the difference between two nondecreasing functions. $\square$
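The decomposition $f = g_1 - g_2$ used in the proof has a simple discrete analogue, in which $V_a^x(f)$ is replaced by the variation of $f$ along a fixed grid. The sketch below is an illustration under that assumption (a grid variation is only a lower bound for the true total variation); the function name `jordan_decomposition`, the sample function $\sin(6x)$ and the grid are hypothetical choices.

```python
import math

def jordan_decomposition(values):
    """Given samples f(x_0), ..., f(x_n), return lists (g1, g2) where
    g1[k] is the variation of the samples up to index k and g2 = g1 - f."""
    g1 = [0.0]
    for prev, cur in zip(values, values[1:]):
        g1.append(g1[-1] + abs(cur - prev))
    g2 = [v - fv for v, fv in zip(g1, values)]
    return g1, g2

xs = [k / 200 for k in range(201)]               # uniform grid on [0, 1]
samples = [math.sin(6.0 * x) for x in xs]        # an oscillating sample function
g1, g2 = jordan_decomposition(samples)

tol = 1e-12                                      # tolerance for rounding errors
print(all(b - a >= -tol for a, b in zip(g1, g1[1:])))   # g1 nondecreasing on the grid
print(all(b - a >= -tol for a, b in zip(g2, g2[1:])))   # g2 nondecreasing on the grid
```

Both checks are expected to print `True`: each increment of `g2` equals $|\Delta f| - \Delta f \ge 0$, which is the discrete counterpart of (7.14).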
Theorem 7.18 Let $f : [a, b] \to \mathbb{R}$ be a function of bounded variation. Then the set of points of $[a, b]$ at which $f$ is discontinuous is at most countable. Furthermore $f$ has a derivative almost everywhere on $[a, b]$, $f' \in L^1([a, b])$ and
\[ \int_a^b |f'(x)|\,dx \le V_a^b(f). \tag{7.15} \]
Proof. Combining Theorem 7.5, Theorem 7.6 and Proposition 7.17 we immediately obtain that $f$ has no more than countably many points of discontinuity, has a derivative almost everywhere on $[a, b]$ and $f' \in L^1([a, b])$. Since for $a \le x < y \le b$
\[ |f(y) - f(x)| \le V_x^y(f) = V_a^y(f) - V_a^x(f), \]
we get
\[ |f'(x)| \le \big( V_a^x(f) \big)' \qquad \text{a.e. in } [a, b]. \]
Finally, using (7.5),
\[ \int_a^b |f'(x)|\,dx \le \int_a^b \big( V_a^x(f) \big)'\,dx \le V_a^b(f). \]
$\square$
Remark 7.19 Any nonconstant step function and Vitali's function (see Example 7.8) provide examples of functions of bounded variation satisfying the strict inequality in (7.15).
Proposition 7.20 A function $f : [a, b] \to \mathbb{R}$ is of bounded variation if and only if the curve
\[ y = f(x), \qquad a \le x \le b \]
is rectifiable, i.e. has finite length (1).
Proof. For any partition of $[a, b]$ we get
\[ \sum_{i=0}^{n-1} |f(x_{i+1}) - f(x_i)| \le \sum_{i=0}^{n-1} \sqrt{(x_{i+1} - x_i)^2 + (f(x_{i+1}) - f(x_i))^2} \le (b - a) + \sum_{i=0}^{n-1} |f(x_{i+1}) - f(x_i)|. \]
Taking the least upper bound over all partitions we get the thesis. $\square$
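Exercise 7.21 Let $(a_n)_n$ be a sequence of positive numbers and let
\[ f(x) = \begin{cases} a_n & \text{if } x = \frac{1}{n},\ n \ge 1; \\ 0 & \text{otherwise.} \end{cases} \]
Prove that $f$ is of bounded variation on $[0, 1]$ iff $\sum_{n=1}^{\infty} a_n < \infty$.
Exercise 7.22 Let $f$ be a function of bounded variation on $[a, b]$ such that
\[ f(x) \ge c > 0 \qquad \forall x \in [a, b]. \]
Prove that $\frac{1}{f}$ is of bounded variation and
\[ V_a^b\Big( \frac{1}{f} \Big) \le \frac{1}{c^2} V_a^b(f). \]
Exercise 7.23 Prove that the function
\[ f(x) = \begin{cases} x^2 \sin \dfrac{1}{x^3} & \text{if } 0 < x \le 1, \\ 0 & \text{if } x = 0 \end{cases} \]
is not of bounded variation on $[0, 1]$.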
(1) By the length of the curve $y = f(x)$ ($a \le x \le b$) is meant the quantity
\[ \sup \sum_{i=0}^{n-1} \sqrt{(x_{i+1} - x_i)^2 + (f(x_{i+1}) - f(x_i))^2} \]
where the least upper bound is taken over all possible partitions of $[a, b]$.
7.3 Absolutely continuous functions
We now address ourselves to the problems posed at the beginning of the chap-
ter. The object of this section is to describe the class of functions satisfying
(7.2).
Definition 7.24 A function $f$ defined on an interval $[a, b]$ is said to be absolutely continuous if, given $\varepsilon > 0$, there is a $\delta > 0$ such that
\[ \sum_{k=1}^{n} |f(b_k) - f(a_k)| < \varepsilon \tag{7.16} \]
for every finite system of pairwise disjoint subintervals
\[ (a_k, b_k) \subset [a, b], \qquad k = 1, \ldots, n, \]
of total length $\sum_{k=1}^{n} (b_k - a_k)$ less than $\delta$.
Example 7.25 Suppose f is a Lipschitz function on [a, b] with Lipschitz
constant K; then, choosing =

K
, we immediately get that f is absolutely
continuous.
Remark 7.26 Clearly every absolutely continuous function is uniformly continuous, as we see by choosing a single subinterval $(a_1, b_1) \subset [a, b]$. However, a uniformly continuous function need not be absolutely continuous. For example, Vitali's function $f$ constructed in Example 7.8 is continuous (and hence uniformly continuous) on $[0, 1]$, but not absolutely continuous on $[0, 1]$. In fact, for every $n$ consider the set
\[ C_n = \Big\{ x \in [0, 1] \;\Big|\; x = \sum_{i=1}^{\infty} \frac{a_i}{3^i} \ \text{with } a_1, \dots, a_n \neq 1 \Big\}, \]
which is the union of $2^n$ pairwise disjoint closed intervals $I_i$, each of which has measure $\frac{1}{3^n}$ (so the total length is $(\frac{2}{3})^n$). Denoting by $C$ the Cantor set (see Example 1.49), we have $C \subset C_n$; since, by construction, Vitali's function is constant on the subintervals of $[0, 1] \setminus C$, the sum (7.16) associated to the system $(I_i)$ is equal to 1. Hence the Cantor set $C$ can be covered by a finite system of subintervals of arbitrarily small total length, but the sum (7.16) associated to every such system is equal to 1. The same example shows that a function of bounded variation need not be absolutely continuous. On the other hand, an absolutely continuous function is necessarily of bounded variation (see Proposition 7.27).
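The computation in Remark 7.26 can be checked with exact rational arithmetic. The sketch below is only an illustration: it assumes that Vitali's function of Example 7.8 coincides with the standard Cantor function computed from ternary digits, and the names `cantor_function`, `cantor_intervals` and `sum_716` are hypothetical helpers, not notation from the notes.

```python
from fractions import Fraction

def cantor_function(x, max_digits=64):
    """Standard Cantor function at a rational x in [0, 1] (assumed to be Vitali's function)."""
    x = Fraction(x)
    if x >= 1:
        return Fraction(1)
    value, weight = Fraction(0), Fraction(1, 2)
    for _ in range(max_digits):
        x *= 3
        digit = int(x)              # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:              # first digit equal to 1: the function is constant from here
            return value + weight
        value += weight * (digit // 2)
        weight /= 2
        if x == 0:                  # terminating expansion, nothing left to add
            break
    return value

def cantor_intervals(n):
    """The 2**n closed intervals whose union is C_n."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for lo, hi in intervals:
            third = (hi - lo) / 3
            refined.append((lo, lo + third))    # keep the left third
            refined.append((hi - third, hi))    # keep the right third
        intervals = refined
    return intervals

def sum_716(n):
    """Total length of the intervals of C_n and the sum (7.16) over them."""
    intervals = cantor_intervals(n)
    total_length = sum(hi - lo for lo, hi in intervals)
    increments = sum(abs(cantor_function(hi) - cantor_function(lo)) for lo, hi in intervals)
    return total_length, increments
```

For each `n`, `sum_716(n)` returns a total length equal to $(2/3)^n$ together with a sum of increments equal to 1, which is the obstruction to absolute continuity described above.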
Proposition 7.27 If $f$ is absolutely continuous on $[a, b]$, then $f$ is of bounded variation on $[a, b]$.
Proof. Given any $\varepsilon > 0$, there is a $\delta > 0$ such that
\[ \sum_{k=1}^{n} |f(b_k) - f(a_k)| < \varepsilon \]
for every finite system of pairwise disjoint subintervals $(a_k, b_k) \subset [a, b]$ such that
\[ \sum_{k=1}^{n} (b_k - a_k) < \delta. \]
Hence if $[\alpha, \beta]$ is any subinterval of length less than $\delta$, we have
\[ V_\alpha^\beta(f) \le \varepsilon. \]
Let $a = x_0 < x_1 < \dots < x_N = b$ be a partition of $[a, b]$ into $N$ subintervals $[x_k, x_{k+1}]$ all of length less than $\delta$. Then, by Proposition 7.15,
\[ V_a^b(f) \le N\varepsilon. \qquad \square \]
An immediate consequence of Definition 7.24 and obvious properties of absolute value is the following.
Proposition 7.28 If $f$ is absolutely continuous on $[a, b]$, then so is $\alpha f$, where $\alpha$ is any constant. Moreover, if $f$ and $g$ are absolutely continuous on $[a, b]$, then so is $f + g$.
It follows from Proposition 7.28 together with Remark 7.26 that the set $AC([a, b])$ of all absolutely continuous functions on $[a, b]$ is a proper subspace of the linear space $BV([a, b])$ of all functions of bounded variation on $[a, b]$.
We now study the close connection between absolute continuity and the indefinite Lebesgue integral. To this aim we need the following result.
Lemma 7.29 Let $g \in L^1([a, b])$ be such that $\int_I g(t)\,dt = 0$ for every subinterval $I \subset [a, b]$. Then $g(x) = 0$ a.e. in $[a, b]$.
Proof. If we denote by $\mathcal{I}$ the family of all finite disjoint unions of subintervals of $[a, b]$, it is immediate to see that $\mathcal{I}$ is an algebra and $\int_A g(t)\,dt = 0$ for every $A \in \mathcal{I}$. Let $V$ be an open set in $[a, b]$; then $V = \bigcup_{n=1}^{\infty} I_n$ where each $I_n \subset [a, b]$ is a subinterval. For every $n$, since $\bigcup_{i=1}^{n} I_i \in \mathcal{I}$, we have $\int_{\bigcup_{i=1}^{n} I_i} g(t)\,dt = 0$; Lebesgue's Theorem implies
\[ \int_V g(t)\,dt = \lim_{n \to \infty} \int_{\bigcup_{i=1}^{n} I_i} g(t)\,dt = 0. \]
Assume by contradiction the existence of $E \in \mathcal{B}([a, b])$ such that $\lambda(E) > 0$ and $g(x) > 0$ in $E$. By Theorem 1.55 there exists a compact set $K \subset E$ such that $\lambda(K) > 0$. Setting $V = [a, b] \setminus K$, $V$ is an open set in $[a, b]$; then
\[ 0 = \int_a^b g(t)\,dt = \int_V g(t)\,dt + \int_K g(t)\,dt = \int_K g(t)\,dt > 0, \]
and the contradiction follows. The case of a set of positive measure on which $g < 0$ is excluded in the same way.
Returning to the problem of differentiating the indefinite Lebesgue integral, in the following Theorem we evaluate the derivative (7.1), thereby giving an affirmative answer to the first of the two questions posed at the beginning of the chapter.
Theorem 7.30 Let $f \in L^1([a, b])$ and set
\[ F(x) = \int_a^x f(t)\,dt, \qquad x \in [a, b]. \]
Then $F$ is absolutely continuous on $[a, b]$ and
\[ F'(x) = f(x) \quad \text{for a.e. } x \in [a, b]. \tag{7.17} \]
Proof. Given any finite collection of pairwise disjoint intervals $(a_k, b_k)$, we have
\[ \sum_{k=1}^{n} |F(b_k) - F(a_k)| = \sum_{k=1}^{n} \Big| \int_{a_k}^{b_k} f(t)\,dt \Big| \le \sum_{k=1}^{n} \int_{a_k}^{b_k} |f(t)|\,dt = \int_{\bigcup_k (a_k, b_k)} |f(t)|\,dt. \]
By the absolute continuity of the integral, the last expression on the right approaches zero as the total length of the intervals $(a_k, b_k)$ approaches zero. This proves that $F$ is absolutely continuous on $[a, b]$. By Proposition 7.27 $F$ is of bounded variation; consequently, by Theorem 7.18, $F$ has a derivative almost everywhere on $[a, b]$ and $F' \in L^1([a, b])$. It remains to prove (7.17).
First assume that there exists $K > 0$ such that $|f(x)| \le K$ for every $x \in [a, b]$ and let
\[ g_n(x) = n\Big[ F\Big(x + \frac{1}{n}\Big) - F(x) \Big], \]
where, to make $g_n$ meaningful for all $x \in [a, b]$, we set $F(x) = F(b)$ for $b < x \le b + 1$, by definition. Clearly
\[ \lim_{n \to \infty} g_n(x) = F'(x) \]
almost everywhere on $[a, b]$. Furthermore
\[ |g_n(x)| = \Big| n \int_x^{x + \frac{1}{n}} f(t)\,dt \Big| \le K \quad \forall x \in [a, b]. \]
Consider $a \le c < d \le b$; by using Lebesgue's Theorem, we get
\[ \int_c^d F'(x)\,dx = \lim_{n \to \infty} \int_c^d g_n(x)\,dx = \lim_{n \to \infty} n\Big[ \int_{c + \frac{1}{n}}^{d + \frac{1}{n}} F(x)\,dx - \int_c^d F(x)\,dx \Big] \]
\[ = \lim_{n \to \infty} n\Big[ \int_d^{d + \frac{1}{n}} F(x)\,dx - \int_c^{c + \frac{1}{n}} F(x)\,dx \Big] = F(d) - F(c), \]
where the last equality follows from the mean value theorem, $F$ being continuous. Hence we deduce
\[ \int_c^d F'(x)\,dx = F(d) - F(c) = \int_c^d f(t)\,dt, \]
by which, using Lemma 7.29, we conclude $F'(x) = f(x)$ a.e. in $[a, b]$.
Next we want to remove the hypothesis on the boundedness of $f$. Without loss of generality we may assume $f \ge 0$ (otherwise, we can consider separately $f^+$ and $f^-$). Then $F$ is a nondecreasing function on $[a, b]$. Define $f_n$ as follows:
\[ f_n(x) = \begin{cases} f(x) & \text{if } 0 \le f(x) \le n, \\ n & \text{if } f(x) \ge n. \end{cases} \]
Since $f - f_n \ge 0$, the function $H_n(x) := \int_a^x (f(t) - f_n(t))\,dt$ is nondecreasing; hence, by Theorem 7.6, $H_n$ has nonnegative derivative almost everywhere. Since $0 \le f_n \le n$, by the first part of the proof we have
\[ \frac{d}{dx} \int_a^x f_n(t)\,dt = f_n(x) \]
a.e. in $[a, b]$; therefore for every $n \in \mathbb{N}$
\[ F'(x) = H_n'(x) + \frac{d}{dx} \int_a^x f_n(t)\,dt \ge f_n(x) \quad \text{a.e. in } [a, b], \]
by which $F'(x) \ge f(x)$ for a.e. $x \in [a, b]$ and so, after integration,
\[ \int_a^b F'(x)\,dx \ge \int_a^b f(x)\,dx = F(b) - F(a). \]
On the other hand, since $F$ is nondecreasing on $[a, b]$, (7.5) gives $\int_a^b F'(x)\,dx \le F(b) - F(a)$, and then
\[ \int_a^b F'(x)\,dx = F(b) - F(a) = \int_a^b f(x)\,dx. \]
We obtain $\int_a^b (F'(x) - f(x))\,dx = 0$; since $F'(x) \ge f(x)$ a.e., we conclude $F'(x) = f(x)$ a.e. in $[a, b]$.
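As a numerical illustration of (7.17) in the unbounded case, the following sketch evaluates the difference quotients $g_n$ from the proof for one concrete integrand. The choice $f(t) = 1/(2\sqrt{t})$ on $(0, 1]$, with $F(x) = \sqrt{x}$, is an assumption made here for the example and does not come from the notes; the function names are hypothetical.

```python
import math

B = 1.0  # right endpoint b of the interval [0, 1] used in this example

def f(x):
    """An unbounded integrand in L^1([0, 1]): f(t) = 1 / (2 sqrt(t))."""
    return 1.0 / (2.0 * math.sqrt(x))

def F(x):
    """F(x) = integral of f over [0, x] = sqrt(x), extended by F(b) for x > b as in the proof."""
    return math.sqrt(min(x, B))

def difference_quotient(x, n):
    """g_n(x) = n * (F(x + 1/n) - F(x)), the quotient used in the proof of Theorem 7.30."""
    return n * (F(x + 1.0 / n) - F(x))

if __name__ == "__main__":
    x = 0.25
    for n in (10, 1000, 10**6):
        print(n, difference_quotient(x, n), f(x))
```

At any fixed $x \in (0, 1)$ the printed quotients approach $f(x)$ as $n$ grows, in agreement with $F' = f$ almost everywhere.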
We are going to give a definite answer to the second of the questions posed at the beginning of the chapter.
Lemma 7.31 Let $f$ be an absolutely continuous function on $[a, b]$ such that $f'(x) = 0$ a.e. in $[a, b]$. Then $f$ is constant on $[a, b]$.
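Proof. Fix $c \in (a, b)$; we want to show that $f(c) = f(a)$. Let $E \subset (a, c)$ be the set of points $x$ such that $f'(x) = 0$. Then $E \in \mathcal{B}([a, b])$ and $\lambda(E) = c - a$. Given $\varepsilon > 0$, there is a $\delta > 0$ such that
\[ \sum_{k=1}^{n} |f(b_k) - f(a_k)| < \varepsilon \]
for every finite system of pairwise disjoint subintervals $(a_k, b_k) \subset [a, b]$ such that
\[ \sum_{k=1}^{n} (b_k - a_k) < \delta. \]
Fix $\eta > 0$. For every $x \in E$ and $\gamma > 0$, since $\lim_{y \to x} \frac{f(y) - f(x)}{y - x} = 0$, there exists $y_{x,\gamma} > x$ such that $[x, y_{x,\gamma}] \subset (a, c)$, $|y_{x,\gamma} - x| \le \gamma$ and
\[ |f(y_{x,\gamma}) - f(x)| \le \eta\, (y_{x,\gamma} - x). \tag{7.18} \]
The intervals $([x, y_{x,\gamma}])_{x \in E,\, \gamma > 0}$ provide a fine cover of $E$; hence, by Vitali's covering Theorem, there exists a finite number of such disjoint subintervals of $(a, c)$
\[ I_1 = [x_1, y_1], \dots, I_n = [x_n, y_n] \]
with $x_k < x_{k+1}$, such that $\lambda\big(E \setminus \bigcup_{k=1}^{n} I_k\big) < \delta$. Then we have
\[ y_0 := a < x_1 < y_1 < x_2 < \dots < y_n < c =: x_{n+1}, \qquad \sum_{k=0}^{n} (x_{k+1} - y_k) < \delta. \]
From the absolute continuity of $f$ we obtain
\[ \sum_{k=0}^{n} |f(x_{k+1}) - f(y_k)| < \varepsilon, \tag{7.19} \]
while, by (7.18),
\[ \sum_{k=1}^{n} |f(y_k) - f(x_k)| \le \eta \sum_{k=1}^{n} (y_k - x_k) \le \eta\, (b - a). \tag{7.20} \]
Combining (7.19)-(7.20) we deduce
\[ |f(c) - f(a)| = \Big| \sum_{k=0}^{n} \big(f(x_{k+1}) - f(y_k)\big) + \sum_{k=1}^{n} \big(f(y_k) - f(x_k)\big) \Big| \le \varepsilon + \eta\, (b - a). \]
The arbitrariness of $\varepsilon$ and $\eta$ gives $f(c) = f(a)$ for every $c \in (a, b)$; by continuity, also $f(b) = f(a)$.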
Theorem 7.32 If $f$ is absolutely continuous on $[a, b]$, then $f$ has a derivative almost everywhere on $[a, b]$, $f' \in L^1([a, b])$ and
\[ f(x) = f(a) + \int_a^x f'(t)\,dt \quad \forall x \in [a, b]. \tag{7.21} \]
Proof. By Proposition 7.27 $f$ is of bounded variation; hence, by Theorem 7.18, $f$ has a derivative almost everywhere and $f' \in L^1([a, b])$. To prove (7.21) consider the function
\[ g(x) = \int_a^x f'(t)\,dt. \]
Then, by Theorem 7.30, $g$ is absolutely continuous on $[a, b]$ and $g'(x) = f'(x)$ a.e. in $[a, b]$. Setting $\phi = f - g$, $\phi$ is absolutely continuous, being the difference of two absolutely continuous functions, and $\phi'(x) = 0$ a.e. in $[a, b]$. It follows from the previous lemma that $\phi$ is constant, that is $\phi(x) = \phi(a) = f(a) - g(a) = f(a)$, by which
\[ f(x) = \phi(x) + g(x) = f(a) + \int_a^x f'(t)\,dt \quad \forall x \in [a, b]. \qquad \square \]
Remark 7.33 Combining Theorems 7.30 and 7.32 we can now give a definitive answer to the second question posed at the beginning of the chapter: the formula
\[ \int_a^x F'(t)\,dt = F(x) - F(a) \]
holds for all $x \in [a, b]$ if and only if $F$ is absolutely continuous on $[a, b]$.
Proposition 7.34 Let $f : [a, b] \to \mathbb{R}$. The following properties are equivalent:
a) $f$ is absolutely continuous on $[a, b]$;
b) $f$ is of bounded variation on $[a, b]$ and
\[ \int_a^b |f'(t)|\,dt = V_a^b(f). \]
Proof. a) $\Rightarrow$ b) For any partition $a = x_0 < x_1 < \dots < x_n = b$ of $[a, b]$, by Theorem 7.32 we have
\[ \sum_{k=0}^{n-1} |f(x_{k+1}) - f(x_k)| = \sum_{k=0}^{n-1} \Big| \int_{x_k}^{x_{k+1}} f'(t)\,dt \Big| \le \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} |f'(t)|\,dt = \int_a^b |f'(t)|\,dt, \]
which implies
\[ V_a^b(f) \le \int_a^b |f'(t)|\,dt. \]
On the other hand, by Theorem 7.18, $\int_a^b |f'(t)|\,dt \le V_a^b(f)$, and so $V_a^b(f) = \int_a^b |f'(t)|\,dt$.
b) $\Rightarrow$ a) For every $x \in [a, b]$, using (7.15), we have
\[ V_a^x(f) \ge \int_a^x |f'(t)|\,dt = \int_a^b |f'(t)|\,dt - \int_x^b |f'(t)|\,dt = V_a^b(f) - \int_x^b |f'(t)|\,dt \ge V_a^b(f) - V_x^b(f) = V_a^x(f), \]
where the last equality follows from Proposition 7.15. Then we get
\[ V_a^x(f) = \int_a^x |f'(t)|\,dt. \]
Since $f' \in L^1([a, b])$, Theorem 7.30 implies that the function $x \mapsto V_a^x(f)$ is absolutely continuous. Given any collection of pairwise disjoint intervals $(a_k, b_k)$, we have
\[ \sum_{k=1}^{n} |f(b_k) - f(a_k)| \le \sum_{k=1}^{n} V_{a_k}^{b_k}(f) = \sum_{k=1}^{n} \big( V_a^{b_k}(f) - V_a^{a_k}(f) \big). \]
By the absolute continuity of $x \mapsto V_a^x(f)$, the last expression on the right approaches zero as the total length of the intervals $(a_k, b_k)$ approaches zero. This proves that $f$ is absolutely continuous on $[a, b]$.
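The identity in b) can be observed numerically for a smooth function. The sketch below compares partition sums with a midpoint-rule approximation of $\int_a^b |f'|$; the test function $\sin$ on $[0, 2\pi]$ and the helper names are assumptions chosen for this illustration only.

```python
import math

def partition_variation(f, a, b, n):
    """Sum of |f(x_{k+1}) - f(x_k)| over the uniform partition of [a, b] into n pieces."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(abs(f(xs[k + 1]) - f(xs[k])) for k in range(n))

def integral_abs(g, a, b, n):
    """Midpoint-rule approximation of the integral of |g| over [a, b]."""
    h = (b - a) / n
    return h * sum(abs(g(a + (k + 0.5) * h)) for k in range(n))

if __name__ == "__main__":
    a, b, n = 0.0, 2.0 * math.pi, 4000
    print(partition_variation(math.sin, a, b, n))   # approximates V_a^b(sin)
    print(integral_abs(math.cos, a, b, n))          # approximates the integral of |sin'|
```

Both quantities are close to 4, the total variation of $\sin$ over one period. Partition sums never exceed the total variation, so the first number approximates it from below.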
By applying the above proposition to the particular case of monotonic functions, we obtain the following result.
Corollary 7.35 Let $f : [a, b] \to \mathbb{R}$ be a monotonic function. The following properties are equivalent:
a) $f$ is absolutely continuous on $[a, b]$;
b) $\int_a^b |f'(t)|\,dt = |f(b) - f(a)|$.
Remark 7.36 Let $f, g$ be absolutely continuous functions on $[a, b]$. Then the following formula of integration by parts holds:
\[ \int_a^b f(x) g'(x)\,dx = f(b) g(b) - f(a) g(a) - \int_a^b f'(x) g(x)\,dx. \]
Indeed, by Tonelli's Theorem
\[ \iint_{[a,b]^2} |f'(x) g'(y)|\,dx\,dy = \int_a^b |f'(x)|\,dx \int_a^b |g'(y)|\,dy < \infty, \]
that is $f'(x) g'(y) \in L^1([a, b]^2)$. Then consider the set
\[ A = \{ (x, y) \in [a, b]^2 \mid a \le x \le y \le b \} \]
and let us evaluate the integral
\[ I = \iint_A f'(x) g'(y)\,dx\,dy \]
in two ways using Fubini's theorem and formula (7.21). On the one hand
\[ I = \int_a^b g'(y) \Big[ \int_a^y f'(x)\,dx \Big] dy = \int_a^b g'(y) f(y)\,dy - f(a) \int_a^b g'(y)\,dy = \int_a^b g'(y) f(y)\,dy - f(a) \big[ g(b) - g(a) \big] \]
and, on the other hand
\[ I = \int_a^b f'(x) \Big[ \int_x^b g'(y)\,dy \Big] dx = g(b) \int_a^b f'(x)\,dx - \int_a^b f'(x) g(x)\,dx = g(b) \big[ f(b) - f(a) \big] - \int_a^b f'(x) g(x)\,dx. \]
Equating the two expressions for $I$ yields the formula.
Exercise 7.37 Prove that if $f$ and $g$ are absolutely continuous functions on $[a, b]$, then so is $fg$.
Exercise 7.38 Let $(f_n)_n$ be a sequence of absolutely continuous functions on $[0, 1]$, which converges pointwise to a function $f$ on $[0, 1]$, such that
\[ \int_0^1 |f_n'(x)|\,dx \le M \quad \forall n \in \mathbb{N}, \]
where $M > 0$ is a constant.
• Show that $\lim_{n \to \infty} \int_0^1 f_n(x)\,dx = \int_0^1 f(x)\,dx$;
• Prove that $f$ is of bounded variation on $[0, 1]$;
• Give an example to show that, in general, $f$ is not absolutely continuous on $[0, 1]$.
Appendix A
A.1 Distance function
In this section we recall the basic properties of the distance function from a nonempty set $S \subset \mathbb{R}^N$.
Definition A.1 The distance function from $S$ is the function $d_S : \mathbb{R}^N \to \mathbb{R}$ defined by
\[ d_S(x) = \inf_{y \in S} \|x - y\| \quad \forall x \in \mathbb{R}^N. \]
The projection of $x$ onto $S$ consists of those points (if any) at which the infimum defining $d_S(x)$ is attained. Such a set will be denoted by $\mathrm{proj}_S(x)$.
Proposition A.2 Let $S$ be a nonempty subset of $\mathbb{R}^N$. Then the following properties hold true.
1. $d_S$ is Lipschitz continuous of rank 1 (1).
2. For any $x \in \mathbb{R}^N$ we have that $d_S(x) = 0$ iff $x \in \overline{S}$.
3. $\mathrm{proj}_S(x) \neq \emptyset$ for every $x \in \mathbb{R}^N$ iff $S$ is closed.
Proof. We shall prove the three properties in sequence.
(1) A function $f : \Omega \subset \mathbb{R}^N \to \mathbb{R}$ is said to be Lipschitz of rank $L \ge 0$ in $\Omega$ iff $|f(x) - f(y)| \le L \|x - y\|$ for all $x, y \in \Omega$.
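1. Let $x, x' \in \mathbb{R}^N$ and $\varepsilon > 0$ be fixed. Then there exists $y_\varepsilon \in S$ such that $\|x - y_\varepsilon\| < d_S(x) + \varepsilon$. Thus, by the triangle inequality for the Euclidean norm,
\[ d_S(x') - d_S(x) \le \|x' - y_\varepsilon\| - \|x - y_\varepsilon\| + \varepsilon \le \|x' - x\| + \varepsilon. \]
Since $\varepsilon$ is arbitrary, $d_S(x') - d_S(x) \le \|x' - x\|$. Exchanging the roles of $x$ and $x'$ we conclude that $|d_S(x') - d_S(x)| \le \|x' - x\|$, as desired.
2. For any $x \in \mathbb{R}^N$ we have that $d_S(x) = 0$ iff a sequence $(y_n) \subset S$ exists such that $\|x - y_n\| \to 0$ as $n \to \infty$, hence iff $x \in \overline{S}$.
3. Let $S$ be closed and $x \in \mathbb{R}^N$ be fixed. Then
\[ K := \{ y \in S \mid \|x - y\| \le d_S(x) + 1 \} \]
is a nonempty compact set. Therefore, any point $\bar{x} \in K$ such that
\[ \|x - \bar{x}\| = \min_{y \in K} \|x - y\| \]
lies in $\mathrm{proj}_S(x)$. Conversely, let $x \in \overline{S}$. Observe that, by point 2, $d_S(x) = 0$. Take $\bar{x} \in \mathrm{proj}_S(x)$. Then $\|x - \bar{x}\| = 0$. So $x = \bar{x} \in S$.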
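For a finite set $S$ the infimum defining $d_S$ is a minimum, so both $d_S$ and $\mathrm{proj}_S$ can be computed directly. The sketch below does this and spot-checks the rank 1 Lipschitz property on random points; the names `dist_to_set`, `projection` and `lipschitz_gap` are hypothetical, and a finite $S$ is an assumption made only for this illustration.

```python
import math
import random

def dist_to_set(x, S):
    """d_S(x) for a finite nonempty set S of points in R^N (tuples of floats)."""
    return min(math.dist(x, y) for y in S)

def projection(x, S, tol=1e-12):
    """proj_S(x): the points of S at which the minimum distance is attained (up to tol)."""
    d = dist_to_set(x, S)
    return [y for y in S if abs(math.dist(x, y) - d) <= tol]

def lipschitz_gap(S, trials=1000, seed=0):
    """Largest observed value of |d_S(x') - d_S(x)| - ||x' - x|| over random pairs in R^2."""
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(trials):
        x = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        xp = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        gap = abs(dist_to_set(xp, S) - dist_to_set(x, S)) - math.dist(xp, x)
        worst = max(worst, gap)
    return worst

if __name__ == "__main__":
    S = [(0.0, 0.0), (2.0, 0.0)]
    print(dist_to_set((1.0, 1.0), S), projection((1.0, 1.0), S))
    print(lipschitz_gap(S))   # should not be positive, up to rounding
```

The point $(1, 1)$ is equidistant from the two points of $S$, which shows that the projection need not be a single point.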
A.2 Legendre transform
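Let $f : \mathbb{R}^N \to \mathbb{R}$ be a convex function. The function $f^* : \mathbb{R}^N \to \mathbb{R} \cup \{\infty\}$ defined by
\[ f^*(y) = \sup_{x \in \mathbb{R}^N} \{ x \cdot y - f(x) \} \quad \forall y \in \mathbb{R}^N \tag{A.1} \]
is called the Legendre transform (and, sometimes, the Fenchel transform or convex conjugate) of $f$. From the definition of $f^*$ it follows that
\[ x \cdot y \le f(x) + f^*(y) \quad \forall x, y \in \mathbb{R}^N. \tag{A.2} \]
Some properties of the Legendre transform of a superlinear function are described below.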
Proposition A.3 Let $f \in C^1(\mathbb{R}^N)$ be a convex function satisfying
\[ \lim_{\|x\| \to \infty} \frac{f(x)}{\|x\|} = \infty. \tag{A.3} \]
Then, the following properties hold:
(a) $\forall y \in \mathbb{R}^N$ $\exists x_y \in \mathbb{R}^N$ such that $f^*(y) = x_y \cdot y - f(x_y)$;
(b) $y = Df(x)$ if and only if $f^*(y) + f(x) = x \cdot y$;
(c) $f^*$ is convex;
(d) $f^*$ is superlinear;
(e) $f^{**} = f$.
Proof.
(a): the conclusion is a straightforward consequence of the continuity and superlinearity of $f$.
(b): let $x, y \in \mathbb{R}^N$ satisfy $f^*(y) + f(x) = x \cdot y$. Then, $F(z) := z \cdot y - f(z)$ attains its maximum at $x$, whence $y = Df(x)$. Conversely, $F$ being concave, the supremum in (A.1) is attained at every point $x$ at which $0 = DF(x) = y - Df(x)$.
(c): take any $y_1, y_2 \in \mathbb{R}^N$ and $t \in [0, 1]$, and let $x_t$ be a point such that
\[ f^*(t y_1 + (1 - t) y_2) = [t y_1 + (1 - t) y_2] \cdot x_t - f(x_t). \]
Since $f^*(y_i) \ge y_i \cdot x_t - f(x_t)$ for $i = 1, 2$, we conclude that
\[ f^*(t y_1 + (1 - t) y_2) \le t f^*(y_1) + (1 - t) f^*(y_2), \]
i.e., $f^*$ is convex.
(d): for all $M > 0$ and $y \in \mathbb{R}^N \setminus \{0\}$, we have
\[ f^*(y) \ge M \frac{y}{\|y\|} \cdot y - f\Big( M \frac{y}{\|y\|} \Big) \ge M \|y\| - \max_{\|x\| = M} f(x). \]
So, for all $M > 0$,
\[ \liminf_{\|y\| \to \infty} \frac{f^*(y)}{\|y\|} \ge M. \]
Since $M$ is arbitrary, $f^*$ must be superlinear.
(e): by definition, $f(x) \ge x \cdot y - f^*(y)$ for all $x, y \in \mathbb{R}^N$. So, $f \ge f^{**}$. To prove the converse inequality, fix $x \in \mathbb{R}^N$ and let $y_x = Df(x)$. Then, owing to point (b) above,
\[ f(x) = x \cdot y_x - f^*(y_x) \le f^{**}(x). \]
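Example A.4 (Young's inequality) Define, for $p > 1$,
\[ f(x) = \frac{|x|^p}{p} \quad \forall x \in \mathbb{R}. \]
Then, $f$ is a superlinear function of class $C^1(\mathbb{R})$. Moreover,
\[ f'(x) = |x|^{p-1} \operatorname{sign}(x), \]
where
\[ \operatorname{sign}(x) = \begin{cases} \dfrac{x}{|x|} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \]
So, $f'$ is an increasing function, and $f$ is convex.
In view of point (b) of Proposition A.3, we can compute $f^*(y)$ by solving $y = |x|^{p-1} \operatorname{sign}(x)$. We find $x_y = |y|^{\frac{1}{p-1}} \operatorname{sign}(y)$, whence
\[ f^*(y) = x_y\, y - f(x_y) = \frac{|y|^q}{q} \quad \forall y \in \mathbb{R}, \]
where $q = \frac{p}{p-1}$. Thus, on account of (A.2), we obtain the following estimate:
\[ |xy| \le \frac{|x|^p}{p} + \frac{|y|^q}{q} \quad \forall x, y \in \mathbb{R}, \tag{A.4} \]
where $\frac{1}{p} + \frac{1}{q} = 1$. Moreover, owing to point (b) above, we conclude that equality holds in (A.4) iff $|y|^q = |x|^p$.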
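The closed form $f^*(y) = |y|^q / q$ can be compared with a brute-force evaluation of the supremum in (A.1). The sketch below takes the maximum over a finite grid, which is an approximation from below and not an exact computation; the exponent $p = 3$, the grid, and the function names are assumptions made for this illustration.

```python
def legendre_on_grid(f, y, xs):
    """Approximate f*(y) = sup_x (x*y - f(x)) by a maximum over the grid xs."""
    return max(x * y - f(x) for x in xs)

def power_over_p(p):
    """The function x -> |x|**p / p of Example A.4."""
    return lambda x: abs(x) ** p / p

if __name__ == "__main__":
    p = 3.0
    q = p / (p - 1.0)                                    # conjugate exponent, 1/p + 1/q = 1
    f = power_over_p(p)
    f_star = power_over_p(q)                             # closed form claimed in Example A.4
    xs = [-5.0 + 0.001 * k for k in range(10001)]        # grid on [-5, 5]
    for y in (-2.0, 0.5, 2.0):
        print(y, legendre_on_grid(f, y, xs), f_star(y))
```

The grid value never exceeds the closed form, and the two agree up to the grid resolution. The same helpers can be used to spot-check Young's inequality (A.4) at sample points.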
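Exercise A.5 Let $f(x) = e^x$, $x \in \mathbb{R}$. Show that
\[ f^*(y) = \sup_{x \in \mathbb{R}} \{ xy - e^x \} = \begin{cases} \infty & \text{if } y < 0, \\ 0 & \text{if } y = 0, \\ y \log y - y & \text{if } y > 0. \end{cases} \]
Deduce the following estimate
\[ xy \le e^x + y \log y - y \quad \forall x, y > 0. \tag{A.5} \]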
A.3 Baire's Lemma
Let $(X, d)$ be a nonempty metric space. The following result is often referred to as Baire's Lemma. It is a classical result in topology.
Proposition A.6 (Baire) Let $(X, d)$ be a complete metric space. Then the following properties hold.
(a) Any countable intersection of dense open sets $G_n \subset X$ is dense.
(b) If $X$ is the countable union of nonempty closed sets $F_k$, then at least one $F_k$ has nonempty interior.
Proof. We shall use the closed balls
\[ B_r(x) := \{ y \in X \mid d(x, y) \le r \}, \qquad r > 0,\ x \in X. \]
(a) Let us fix any ball $B_{r_0}(x_0)$. We shall prove that
\[ \Big( \bigcap_n G_n \Big) \cap B_{r_0}(x_0) \neq \emptyset. \]
Since $G_1$ is dense, there exists a point $x_1 \in G_1 \cap B_{r_0}(x_0)$. Since $G_1$ is open, there also exists $0 < r_1 < 1$ such that
\[ B_{r_1}(x_1) \subset G_1 \cap B_{r_0}(x_0). \]
Since $G_2$ is dense, we can find a point $x_2 \in G_2 \cap B_{r_1}(x_1)$ and, since $G_2$ is open, a radius $0 < r_2 < 1/2$ such that
\[ B_{r_2}(x_2) \subset G_2 \cap B_{r_1}(x_1). \]
Iterating the above procedure, we can construct a decreasing sequence of closed balls $B_{r_k}(x_k)$ such that
\[ B_{r_k}(x_k) \subset G_k \cap B_{r_{k-1}}(x_{k-1}) \quad \text{and} \quad 0 < r_k < 1/k. \]
We note that $(x_n)_{n \in \mathbb{N}}$ is a Cauchy sequence in $X$. Indeed, for any $h, k \ge n$ we have that $x_h, x_k \in B_{r_n}(x_n)$. So, $d(x_k, x_h) < 2/n$. Therefore, $X$ being complete, $(x_n)_{n \in \mathbb{N}}$ converges to a point $x \in X$, which must belong to every closed ball $B_{r_n}(x_n)$ and hence to $\big( \bigcap_n G_n \big) \cap B_{r_0}(x_0)$.
(b) Suppose, by contradiction, that all the $F_k$'s have empty interior. Applying point (a) to $G_k := X \setminus F_k$, we can find a point $x \in \bigcap_n G_n$. Then, $x \in X \setminus \bigcup_k F_k$, in contrast with the fact that the $F_k$'s do cover $X$.
A.4 Precompact families of continuous functions
Let $K$ be a compact topological space. We denote by $\mathcal{C}(K)$ the Banach space of all continuous functions $f : K \to \mathbb{R}$ endowed with the uniform norm
\[ \|f\|_\infty = \max_{x \in K} |f(x)| \quad \forall f \in \mathcal{C}(K). \]
We recall that convergence in $\mathcal{C}(K)$ is equivalent to uniform convergence.
Definition A.7 A family $\mathcal{M} \subset \mathcal{C}(K)$ is said to be:
(i) equicontinuous if, for any $\varepsilon > 0$ and any $x \in K$ there exists a neighbourhood $V$ of $x$ in $K$ such that
\[ |f(x) - f(y)| < \varepsilon \quad \forall y \in V,\ \forall f \in \mathcal{M}; \]
(ii) pointwise bounded if, for any $x \in K$, $\{ f(x) \mid f \in \mathcal{M} \}$ is a bounded subset of $\mathbb{R}$.
Theorem A.8 (Ascoli-Arzelà) A family $\mathcal{M} \subset \mathcal{C}(K)$ is relatively compact iff $\mathcal{M}$ is equicontinuous and pointwise bounded.
Proof. Let $\mathcal{M}$ be relatively compact. Then, $\mathcal{M}$ is bounded, hence pointwise bounded, in $\mathcal{C}(K)$. So, it suffices to show that $\mathcal{M}$ is equicontinuous. For any $\varepsilon > 0$ there exist $f_1, \dots, f_m \in \mathcal{M}$ such that $\mathcal{M} \subset B_\varepsilon(f_1) \cup \dots \cup B_\varepsilon(f_m)$. Let $x \in K$. Since each function $f_i$ is continuous at $x$, $x$ possesses neighbourhoods $V_1, \dots, V_m \subset K$ such that
\[ |f_i(x) - f_i(y)| < \varepsilon \quad \forall y \in V_i,\ i = 1, \dots, m. \]
Set $V := V_1 \cap \dots \cap V_m$ and fix $f \in \mathcal{M}$. Let $i \in \{1, \dots, m\}$ be such that $f \in B_\varepsilon(f_i)$. Thus, for any $y \in V$,
\[ |f(y) - f(x)| \le |f(y) - f_i(y)| + |f_i(y) - f_i(x)| + |f_i(x) - f(x)| < 3\varepsilon. \]
This shows that $\mathcal{M}$ is equicontinuous.
Conversely, given a pointwise bounded equicontinuous family $\mathcal{M}$, since $K$ is compact, for any $\varepsilon > 0$ there exist points $x_1, \dots, x_m \in K$ and corresponding neighbourhoods $V_1, \dots, V_m$ such that $K = V_1 \cup \dots \cup V_m$ and
\[ |f(x) - f(x_i)| < \varepsilon \quad \forall f \in \mathcal{M},\ \forall x \in V_i,\ i = 1, \dots, m. \tag{A.6} \]
Since $\{ (f(x_1), \dots, f(x_m)) \mid f \in \mathcal{M} \}$ is relatively compact in $\mathbb{R}^m$, there exist functions $f_1, \dots, f_n \in \mathcal{M}$ such that
\[ \{ (f(x_1), \dots, f(x_m)) \mid f \in \mathcal{M} \} \subset \bigcup_{j=1}^{n} B_\varepsilon\big( f_j(x_1), \dots, f_j(x_m) \big). \tag{A.7} \]
We claim that
\[ \mathcal{M} \subset B_{3\varepsilon}(f_1) \cup \dots \cup B_{3\varepsilon}(f_n), \tag{A.8} \]
which implies that $\mathcal{M}$ is totally bounded (2), hence relatively compact. To obtain (A.8), let $f \in \mathcal{M}$ and let $j \in \{1, \dots, n\}$ be such that
\[ (f(x_1), \dots, f(x_m)) \in B_\varepsilon\big( f_j(x_1), \dots, f_j(x_m) \big). \]
Now, fix $x \in K$ and let $i \in \{1, \dots, m\}$ be such that $x \in V_i$. Then, in view of (A.6) and (A.7),
\[ |f(x) - f_j(x)| \le |f(x) - f(x_i)| + |f(x_i) - f_j(x_i)| + |f_j(x_i) - f_j(x)| < 3\varepsilon. \]
This proves (A.8) and completes the proof.
Remark A.9 The compactness of $K$ is essential for the above result. Indeed, the sequence
\[ f_n(x) := e^{-(x - n)^2}, \qquad x \in \mathbb{R}, \]
is a bounded equicontinuous family in $\mathcal{C}(\mathbb{R})$. On the other hand,
\[ n \neq m \implies \|f_n - f_m\|_\infty \ge 1 - \frac{1}{e}. \]
So, $(f_n)_n$ fails to be relatively compact.
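The lower bound in Remark A.9 comes from evaluating $|f_n - f_m|$ at the single point $x = n$, where $f_n = 1$ and $f_m \le e^{-1}$. A short numerical check, with hypothetical helper names, is sketched below.

```python
import math

def f(n, x):
    """The translated Gaussian f_n(x) = exp(-(x - n)^2) of Remark A.9."""
    return math.exp(-(x - n) ** 2)

def sup_norm_lower_bound(n, m):
    """|f_n(n) - f_m(n)|, a lower bound for the uniform distance between f_n and f_m."""
    return abs(f(n, n) - f(m, n))

if __name__ == "__main__":
    threshold = 1.0 - 1.0 / math.e
    for n in range(1, 5):
        for m in range(1, 5):
            if n != m:
                print(n, m, sup_norm_lower_bound(n, m), threshold)
```

Since the mutual distances stay bounded away from zero, no subsequence of $(f_n)_n$ can be Cauchy in the uniform norm.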
(2) Given a metric space $X$ and a subset $M \subset X$, we say that $M$ is totally bounded if for every $\varepsilon > 0$ there exists a finite set $\{x_1, \dots, x_m\} \subset X$ such that $M \subset \bigcup_{i=1}^{m} B_\varepsilon(x_i)$. A subset $M$ of a complete metric space $X$ is relatively compact iff it is totally bounded.
A.5 Vitali's covering theorem
We present in this section the fundamental covering theorem of Vitali.
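Definition A.10 A collection $\mathcal{F}$ of closed balls in $\mathbb{R}^N$ is a fine cover of a set $E \subset \mathbb{R}^N$ iff
\[ E \subset \bigcup_{B \in \mathcal{F}} B, \]
and, for every $x \in E$,
\[ \inf\{ \operatorname{diam}(B) \mid x \in B,\ B \in \mathcal{F} \} = 0, \]
where $\operatorname{diam}(B)$ denotes the diameter of the ball $B$.
Theorem A.11 (Vitali) Let $E \in \mathcal{B}(\mathbb{R}^N)$ be such that $\lambda(E) < \infty$ (3). Assume that $\mathcal{F}$ is a fine cover of $E$. Then, for every $\varepsilon > 0$ there exists a finite collection of disjoint balls $B_1, \dots, B_n \in \mathcal{F}$ such that
\[ \lambda\Big( E \setminus \bigcup_{i=1}^{n} B_i \Big) < \varepsilon. \]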
(3) $\mathcal{B}(\mathbb{R}^N)$ is the $\sigma$-algebra of the Borel sets of $\mathbb{R}^N$ and $\lambda$ denotes the Lebesgue measure.
Proof. According to Proposition 1.53, there exists an open set $V$ such that $E \subset V$ and $\lambda(V) < \infty$. Possibly substituting $\mathcal{F}$ by the subcollection $\tilde{\mathcal{F}} = \{ B \in \mathcal{F} \mid B \subset V \}$, which is still a fine cover of $E$, we may assume without loss of generality that all the balls of $\mathcal{F}$ are contained in $V$. This implies
\[ \sup\{ \operatorname{diam}(B) \mid B \in \mathcal{F} \} < \infty. \]
We describe by induction the choice of $B_1, B_2, \dots, B_k, \dots$. We choose $B_1$ so that $\operatorname{diam}(B_1) > \frac{1}{2} \sup\{ \operatorname{diam}(B) \mid B \in \mathcal{F} \}$. Let us suppose that $B_1, \dots, B_k$ have already been chosen. There are two possibilities: either
a) $E \subset \bigcup_{i=1}^{k} B_i$;
or
b) there exists $x \in E \setminus \bigcup_{i=1}^{k} B_i$.
In case a), we terminate at $B_k$ and the thesis immediately follows. Assume that b) holds true. Since $\bigcup_{i=1}^{k} B_i$ is a compact set, we denote by $\delta > 0$ the distance of $x$ from $\bigcup_{i=1}^{k} B_i$. Since $\mathcal{F}$ is a fine cover of $E$, there exists a ball $B \in \mathcal{F}$ such that $x \in B$ and $\operatorname{diam}(B) < \frac{\delta}{2}$. In particular $B$ is disjoint from $B_1, \dots, B_k$. Then the set $\{ B \in \mathcal{F} \mid B \text{ disjoint from } B_1, \dots, B_k \}$ is nonempty, hence we can define
\[ d_k = \sup\{ \operatorname{diam}(B) \mid B \in \mathcal{F},\ B \text{ disjoint from } B_1, \dots, B_k \} > 0. \]
We choose $B_{k+1} \in \mathcal{F}$ such that $B_{k+1}$ is disjoint from $B_1, \dots, B_k$ and $\operatorname{diam}(B_{k+1}) > \frac{d_k}{2}$. If the process does not terminate, we get a sequence $B_1, B_2, \dots, B_k, \dots$ of disjoint balls in $\mathcal{F}$ such that
\[ \frac{d_k}{2} < \operatorname{diam}(B_{k+1}) \le d_k. \]
Since $\bigcup_{k=1}^{\infty} B_k \subset V$, we have $\sum_{k=1}^{\infty} \lambda(B_k) \le \lambda(V) < \infty$. Then there exists $n \in \mathbb{N}$ such that
\[ \sum_{k=n+1}^{\infty} \lambda(B_k) < \frac{\varepsilon}{5^N}. \]
We claim that
\[ E \setminus \bigcup_{k=1}^{n} B_k \subset \bigcup_{k=n+1}^{\infty} B_k^*, \tag{A.9} \]
where $B_k^*$ denotes the ball having the same center as $B_k$ but whose diameter is five times as large. Indeed let $x \in E \setminus \bigcup_{k=1}^{n} B_k$. By reasoning as in case b), there exists a ball $B \in \mathcal{F}$ such that $x \in B$ and $B$ is disjoint from $B_1, \dots, B_n$. We state that $B$ must intersect at least one of the balls $B_k$ (with $k > n$), otherwise from the definition of $d_k$ for every $k$ it would result
\[ \operatorname{diam}(B) \le d_k \le 2 \operatorname{diam}(B_{k+1}); \]
since $\sum_k \lambda(B_k) < \infty$, then $\lambda(B_k) \to 0$, by which $\operatorname{diam}(B_k) \to 0$; consequently the above inequality cannot be true for large $k$.
Then we take the first $j$ such that $B \cap B_j \neq \emptyset$. We have $j > n$ and
\[ \operatorname{diam}(B) \le d_{j-1} < 2 \operatorname{diam}(B_j). \]
From an obvious geometric consideration it is then evident that $B$ is contained in the ball that has the same center as $B_j$ and five times the diameter of $B_j$, i.e. $B \subset B_j^*$. Thus we have proved (A.9), and so
\[ \lambda\Big( E \setminus \bigcup_{k=1}^{n} B_k \Big) \le \sum_{k=n+1}^{\infty} \lambda(B_k^*) = 5^N \sum_{k=n+1}^{\infty} \lambda(B_k) < \varepsilon, \]
which proves the theorem.
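The selection step of the proof has a simple finite analogue in dimension one, sketched below. For a finite family of closed intervals the supremum $d_k$ is attained, so the sketch simply takes the longest admissible interval at each step; this satisfies the rule $\operatorname{diam}(B_{k+1}) > d_k / 2$ but is a simplification of the argument above, not a reproduction of it. The helper names and the random test family are assumptions made for the illustration.

```python
import random

def greedy_disjoint(intervals):
    """Select pairwise disjoint closed intervals greedily, longest first."""
    chosen = []
    for lo, hi in sorted(intervals, key=lambda iv: iv[1] - iv[0], reverse=True):
        # keep [lo, hi] only if it misses every interval selected so far
        if all(hi < c_lo or c_hi < lo for c_lo, c_hi in chosen):
            chosen.append((lo, hi))
    return chosen

def enlarge(interval, factor=5):
    """Interval with the same center and `factor` times the length (the ball B* in the proof)."""
    lo, hi = interval
    center, radius = (lo + hi) / 2, factor * (hi - lo) / 2
    return (center - radius, center + radius)

def covered_by_enlargements(intervals, chosen, factor=5):
    """Check that every interval of the family lies inside some enlarged selected interval."""
    big = [enlarge(iv, factor) for iv in chosen]
    return all(any(b_lo <= lo and hi <= b_hi for b_lo, b_hi in big)
               for lo, hi in intervals)

if __name__ == "__main__":
    rng = random.Random(0)
    family = []
    for _ in range(200):
        lo = rng.randint(0, 1000)
        family.append((lo, lo + rng.randint(1, 100)))
    chosen = greedy_disjoint(family)
    print(len(chosen), covered_by_enlargements(family, chosen))
```

Every discarded interval meets an earlier, at least as long, selected interval, and is therefore contained in its fivefold enlargement. This is the geometric consideration used to prove (A.9).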
Bibliography
[1] Conway J.B., A course in functional analysis (second edition), Graduate Texts in Mathematics 96, Springer, New York, 1990.
[2] Evans L.C., Gariepy R.F., Measure theory and fine properties of functions, Studies in Advanced Mathematics, CRC Press, Ann Arbor, 1992.
[3] Komornik V., Précis d'analyse réelle. Analyse fonctionnelle, intégrale de Lebesgue, espaces fonctionnels (volume 2), Mathématiques pour le 2e cycle, Ellipses, Paris, 2002.
[4] Rudin W., Analisi reale e complessa, Programma di matematica, fisica, elettronica, Boringhieri, Torino, 1974.