
Texts and Readings in Mathematics 77

S. Kesavan

Measure and
Integration
Texts and Readings in Mathematics

Volume 77

Advisory Editor
C. S. Seshadri, Chennai Mathematical Institute, Chennai

Managing Editor
Rajendra Bhatia, Ashoka University, Sonepat

Editors
Manindra Agrawal, Indian Institute of Technology, Kanpur
V. Balaji, Chennai Mathematical Institute, Chennai
R. B. Bapat, Indian Statistical Institute, New Delhi
V. S. Borkar, Indian Institute of Technology, Mumbai
Apoorva Khare, Indian Institute of Sciences, Bangalore
T. R. Ramadas, Chennai Mathematical Institute, Chennai
V. Srinivas, Tata Institute of Fundamental Research, Mumbai

Technical Editor
P. Vanchinathan, Vellore Institute of Technology, Chennai
The Texts and Readings in Mathematics series publishes high-quality textbooks,
research-level monographs, lecture notes and contributed volumes. Undergraduate
and graduate students of mathematics, research scholars, and teachers would find
this book series useful. The volumes are carefully written as teaching aids and
highlight characteristic features of the theory. The books in this series are
co-published with Hindustan Book Agency, New Delhi, India.

More information about this series at http://www.springer.com/series/15141


S. Kesavan (emeritus)

Measure and Integration

S. Kesavan (emeritus)
Institute of Mathematical Sciences
Chennai, Tamil Nadu, India

ISSN 2366-8725 (electronic)


Texts and Readings in Mathematics
ISBN 978-981-13-6678-9 (eBook)
https://doi.org/10.1007/978-981-13-6678-9

Library of Congress Control Number: 2019932597

This work is a co-publication with Hindustan Book Agency, New Delhi, licensed for sale in all countries
in electronic form only. Sold and distributed in print across the world by Hindustan Book Agency, P-19
Green Park Extension, New Delhi 110016, India. ISBN: 978-93-86279-77-4 © Hindustan Book Agency
2019.

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019
This work is subject to copyright. All rights are reserved by the Publishers, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publishers, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publishers nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publishers remain neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Dedicated to
Professor Philippe G. Ciarlet,
to whom I owe more than I can
possibly express,
on the occasion of his eightieth
birthday.
Preface
A course on the theory of measure and the Lebesgue integral is now
an essential component in any masters or graduate programme in math-
ematics in Indian universities. It is part of the training a student receives
in analysis. The most interesting examples of Banach spaces are function
spaces of various kinds and the Lebesgue spaces, also known as $L^p$
spaces, are amongst the most important of these. A knowledge of the
theory of measure and integration is essential for the study of several
advanced topics in functional analysis like the theory of distributions
and Sobolev spaces, which constitute the functional analytic framework
for the modern study of partial differential equations. Of course, the
theory of measure and integration is vital to the study of probability
and stochastic processes.

This book grew out of the notes I prepared for lectures on measure
theory and the theory of integration. These lectures were delivered,
over the past four decades, to masters and graduate students in sev-
eral leading institutions like the Centre for Applicable Mathematics,
Tata Institute of Fundamental Research, Bangalore, The Institute of
Mathematical Sciences, Chennai, the Chennai Mathematical Institute,
Siruseri, and the Indian Institute of Technology, Madras. Portions of
the book were also taught at numerous refresher or summer courses.
In particular, it was taught by me at several refresher courses at the
Ramanujan Institute for Advanced Study in Mathematics, of the Uni-
versity of Madras, Chennai. I am indeed thankful to these institutions
and organizers of refresher courses for having given me the opportunity
to deliver these lectures.

The book starts with a preamble, where the Riemann integral is
briefly discussed. Some of the shortcomings of this theory of integration
motivate the need to develop the theory of measure and the Lebesgue
integral.

Chapter 1 develops the abstract theory of a measure defined over
classes of subsets of a non-empty set, like rings, σ-rings and σ-algebras.
The extension of a measure from a smaller class (like a ring) to a larger
class (typically, a σ-algebra), is done via the method of Carathéodory,
using outer-measures. The completion of a measure is also discussed.

Chapter 2 is devoted to the construction and the study of the important
properties of the Lebesgue measure on the euclidean space $\mathbb{R}^N$.

Chapter 3 studies important properties of measurable functions.

Chapter 4 introduces various notions of convergence like pointwise
convergence, almost uniform convergence and convergence in measure
and studies their inter-relationships.

Chapter 5 is the core of this book. It develops the theory of the
Lebesgue integral and proves the important limit theorems. It also compares
the Riemann and Lebesgue integrals on the real line.

Chapter 6 is devoted to the fundamental theorem of calculus, viz.
the relationship between differentiation and integration. Various classes
of functions, which are differentiable almost everywhere, are studied and
the relationship between the integrand and the derivative of its indefinite
integral is explored.

Chapter 7 is devoted to the change of variable formula, viz. the effect
on the integral under the action of transformations of the domain.

Chapter 8 studies product spaces and Fubini’s theorem is proved.
Polar coordinates in $\mathbb{R}^N$ are discussed.

Chapter 9 concerns signed measures and the main result of this chapter
is the Radon-Nikodym theorem.

Finally, Chapter 10 studies $L^p$ spaces. Density theorems and duality
are discussed. The notion of the convolution product is introduced.

Most of the material in this book can be covered in a one semester
introductory course. The pre-requisite for following this book is familiarity
with basic real analysis and elementary topological notions, with
special emphasis on the topology of the euclidean space $\mathbb{R}^N$. The instructor
may omit certain sections or results if (s)he feels it may be too
heavy for the students taking the course. Each chapter is provided with
a variety of exercises, which, it is hoped, the students will try to solve.

No originality is claimed regarding the contents and the presentation
of the material in this book. I have learnt from, and have been influenced
by, many earlier works on this topic, especially those of Halmos,
Royden and Rudin, to mention a few. These appear in the bibliographic
references.

Since this book is meant to serve as a text book for an introductory
course, I have kept the bibliographic references to a minimum.

I wish to thank the Director of the Institute of Mathematical Sciences
for the excellent facilities accorded to me during the preparation of this
work. I also wish to thank Prof. R. Bhatia, Managing Editor of the
TRIM Series, and Shri J. K. Jain of the Hindustan Book Agency, for
their support. I thank the anonymous referee who went through the entire
manuscript with such great care and pointed out several misprints
and other slips. Eliminating these has certainly made the book much
better. Finally, I wish to thank several students of the Indian Institute
of Technology, Madras, who followed my lectures and made my sojourn
there as Visiting (and then Adjunct) Professor a very enjoyable experience.
In particular, I wish to mention Ashok Kumar and Nirjan Biswas.

Chennai S. Kesavan
November, 2018
Notations
Certain general conventions followed throughout the text regarding
notations are described below. All other specific notations are explained
as and when they appear in the text.
• The set of natural numbers {1, 2, 3, · · ·}, is denoted by the symbol
N, the integers by Z, the rationals by Q, the reals by R and the
complex numbers by C.

• If A and B are two sets, then by A ⊂ B, we mean that every
element of A is also an element of B, i.e. A is a subset of B. The
inclusion is not necessarily strict.

• If X is a non-empty set and A ⊂ X, then $A^c$ denotes the complement
of A in X, i.e. the set of elements in X which do not belong
to A.

• The empty set is denoted by the symbol ∅.

• The union and intersection of sets are denoted using the usual
symbols ∪ and ∩ respectively.

• If X is a non-empty set and if A and B are subsets of X, then

$$A \setminus B = A \cap B^c, \quad \text{and} \quad A \Delta B = (A \setminus B) \cup (B \setminus A).$$

• If a, b ∈ R ∪ {±∞}, then

(a, b) = {x ∈ R | a < x < b},

[a, b] = {x ∈ R | a ≤ x ≤ b},
[a, b) = {x ∈ R | a ≤ x < b},
(a, b] = {x ∈ R | a < x ≤ b}.

• The symbol $\mathbb{R}^N$, $N \in \mathbb{N}$, stands for the N-dimensional euclidean
space. If $x = (x_1, \cdots, x_N) \in \mathbb{R}^N$, then

$$|x| = \left( \sum_{i=1}^{N} |x_i|^2 \right)^{\frac{1}{2}}.$$

Contents

Preamble

1 Measure
1.1 Algebras of sets
1.2 Measures on rings
1.3 Outer-measure and measurable sets
1.4 Completion of a measure
1.5 Exercises

2 The Lebesgue measure
2.1 Construction of the Lebesgue measure
2.2 Approximation
2.3 Translation invariance
2.4 Non-measurable sets
2.5 Exercises

3 Measurable functions
3.1 Basic properties
3.2 The Cantor function
3.3 Almost everywhere
3.4 Exercises

4 Convergence
4.1 Egorov’s theorem
4.2 Convergence in measure
4.3 Exercises

5 Integration
5.1 Non-negative simple functions
5.2 Non-negative functions
5.3 Integrable functions
5.4 The Riemann and Lebesgue integrals
5.5 Weierstrass’ theorem
5.6 Probability
5.7 Exercises

6 Differentiation
6.1 Monotonic functions
6.2 Functions of bounded variation
6.3 Differentiation of an indefinite integral
6.4 Absolute Continuity
6.5 Exercises

7 Change of variable
7.1 The Fréchet derivative
7.2 Sard’s theorem
7.3 Diffeomorphisms

8 Product spaces
8.1 Measurability in the product space
8.2 The product measure
8.3 Fubini’s theorem
8.4 Polar coordinates in $\mathbb{R}^N$
8.5 Exercises

9 Signed measures
9.1 Hahn and Jordan decompositions
9.2 Absolute continuity
9.3 The Radon-Nikodym theorem
9.4 Singularity
9.5 Exercises

10 $L^p$ spaces
10.1 Basic properties
10.2 Approximation
10.3 Some applications
10.4 Duality
10.5 Convolutions
10.6 Exercises

Bibliography
About the Author
S. KESAVAN is former professor at the Institute of Mathematical
Sciences, Chennai, and adjunct faculty at the Indian Institute of
Technology Madras, Chennai, India. He started his career at the Tata
Institute of Fundamental Research Centre for Applicable Mathematics
(TIFR-CAM), Bangalore, India, in 1973. He has also been associated
with the Chennai Mathematical Institute, where he was deputy director
during 2007–2010 and he is currently adjunct professor at the Indian
Institute of Technology, Madras. He earned his PhD from Université
Pierre-et-Marie-Curie, Paris, in 1979. He is a fellow of the National
Academy of Sciences, India, at Allahabad and the Indian Academy of
Sciences, Bangalore, India. He is a life member of the National Board
for Higher Mathematics, since 2000, Indian Mathematical Society,
International Society for the Interaction of Mechanics and Mathematics
(ISIMM), Indian Society of Industrial and Applicable Mathematics
(ISIAM), Ramanujan Mathematical Society, American Mathematical
Society and an elected fellow of the Forum d’Analystes, Chennai.
He was also the Secretary (Grant Selection) of the Commission for
Developing Countries of the International Mathematical Union, during
2011–2014 and 2015–2018. He has published four books and authored
over 50 research articles, apart from several contributions to conference
proceedings and popular articles. His research interests are in partial
differential equations, homogenization, control theory, and isoperimetric
inequalities.

Preamble
From the time of the Greeks, the problem of computing the area
enclosed by a curve had been exercising the minds of scientific thinkers.
This crucial question, at the base of the theory of integral calculus, was
treated as early as the third century B.C. by Archimedes, who calcu-
lated the area of a circular disc, the area of a segment of a parabola
and other such figures. He used the ‘method of exhaustion’. The basic
idea was to exhaust the given area by a sequence of polygonal domains
and calculate the area as the limit of the area of the inscribed polygons.
During the seventeenth century, many such areas were calculated and in
each case the problem was solved by an ingenious device specially suited
for the case in hand. One of the achievements of calculus was to develop
a general and powerful method to replace these special restricted proce-
dures.

From the time of Archimedes until the time of Gauss, the attitude
was that the area was an intuitively obvious entity which need not be
defined, but which had to be computed. Before Cauchy, there was no
definition of the integral in the precise sense of the term. One was often
limited to saying which areas one had to add, or subtract, to get the
integral.

Cauchy, with his concern for rigour, which is characteristic of modern
mathematics, defined continuous functions and their integrals in much
the same way as we do now. To arrive at the integral of a continuous
function $f$ defined on an interval $[a, b]$ of the real line, he looked at sums
of the form

$$S = \sum_i f(\xi_i)(x_{i+1} - x_i)$$

where $a = x_0 < x_1 < \cdots < x_i < x_{i+1} < \cdots < x_N = b$ is a partition of
$[a, b]$ and $\xi_i \in [x_i, x_{i+1}]$. He then deduced the value of the integral

$$\int_a^b f(x)\,dx$$

by a suitable passage to the limit.

For a long time, certain discontinuous functions were integrated by
showing that Cauchy’s definition still applied to these integrals. It was
Riemann who systematically investigated the exact scope of this definition.

In what follows, we will briefly recall the salient features of the Riemann
integral and see its principal drawbacks, which will motivate
the study of Lebesgue’s theory of measure and integration.

The Riemann Integral

Let $[a, b] \subset \mathbb{R}$ be a finite interval and let $f : [a, b] \to \mathbb{R}$ be a bounded
function. Let $\mathcal{P} = \{a = x_0 < x_1 < \cdots < x_N = b\}$ be a partition of the
interval. Set

$$m_i = \inf_{x \in [x_{i-1}, x_i]} f(x) \quad \text{and} \quad M_i = \sup_{x \in [x_{i-1}, x_i]} f(x), \quad \text{for } 1 \le i \le N.$$

Then, we define the lower and upper (Darboux) sums associated to the
function $f$ and the partition $\mathcal{P}$ by

$$L(\mathcal{P}, f) = \sum_{i=1}^{N} m_i (x_i - x_{i-1}), \qquad U(\mathcal{P}, f) = \sum_{i=1}^{N} M_i (x_i - x_{i-1}).$$

Then, we define the lower and upper integrals of $f$ by

$$\underline{\int_a^b} f(x)\,dx = \sup_{\mathcal{P}} L(\mathcal{P}, f), \qquad \overline{\int_a^b} f(x)\,dx = \inf_{\mathcal{P}} U(\mathcal{P}, f),$$

where the supremum and infimum are taken over all possible partitions
of $[a, b]$. The function $f$ is said to be Riemann integrable over $[a, b]$
if its lower and upper integrals are equal and the common value, called
the Riemann integral of $f$ over $[a, b]$, is denoted by the symbol

$$\int_a^b f(x)\,dx.$$

Since $f$ is bounded, there exist real numbers $m$ and $M$ such that
$m \le f(x) \le M$ for all $x \in [a, b]$ and it is immediate to see that

$$m(b - a) \le L(\mathcal{P}, f) \le U(\mathcal{P}, f) \le M(b - a)$$

for all partitions $\mathcal{P}$. Thus, the lower and upper integrals of $f$ always
exist but the question of their being equal is a delicate one.
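
As an illustration of the inequalities above, here is a minimal numerical sketch, which is not part of the original text. It computes the lower and upper Darboux sums of $f(x) = x^2$ on $[0, 1]$ for uniform partitions. The names `darboux_sums` and `square` are hypothetical, and the code assumes that $f$ is increasing, so that the infimum and supremum over $[x_{i-1}, x_i]$ are attained at the left and right end points respectively.

```python
def darboux_sums(f, a, b, n):
    """Lower and upper Darboux sums of an increasing function f on [a, b]
    for the uniform partition with n subintervals.

    Assumption: f is increasing, so on [x_{i-1}, x_i] the infimum m_i is
    f(x_{i-1}) and the supremum M_i is f(x_i).
    """
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    lower = sum(f(xs[i - 1]) * h for i in range(1, n + 1))
    upper = sum(f(xs[i]) * h for i in range(1, n + 1))
    return lower, upper


def square(x):
    return x * x


for n in (10, 100, 1000):
    lower, upper = darboux_sums(square, 0.0, 1.0, n)
    # Here m = 0 and M = 1, so 0 <= L(P, f) <= U(P, f) <= 1.
    print(n, lower, upper, upper - lower)
```

For this particular function the difference $U(\mathcal{P}, f) - L(\mathcal{P}, f)$ telescopes to $1/n$, so the two sums squeeze the common value $1/3$ as the partition is refined.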

Given a partition $\mathcal{P}$ as above, we set

$$\mu(\mathcal{P}) = \max_{1 \le i \le N} (x_i - x_{i-1}).$$

Let $t_i \in [x_{i-1}, x_i]$ for $1 \le i \le N$. Denote

$$S(\mathcal{P}, f) = \sum_{i=1}^{N} f(t_i)(x_i - x_{i-1}).$$

The above notation is incomplete. The sum $S(\mathcal{P}, f)$ depends not
only on the partition $\mathcal{P}$ and the function $f$, but also on the choice of the
points $t_i$. But in order to avoid cumbersome notation, we will leave it
as it is.

Definition We say that

$$\lim_{\mu(\mathcal{P}) \to 0} S(\mathcal{P}, f) = A$$

if, for every $\varepsilon > 0$, there exists a $\delta > 0$ such that, for all partitions $\mathcal{P}$
such that $\mu(\mathcal{P}) < \delta$, and for all choices of points $t_i$ compatible with the
partition, we have

$$|S(\mathcal{P}, f) - A| < \varepsilon.$$

Theorem 1 (cf. Rudin [7])
The function $f$ is Riemann integrable, if and only if, the limit defined
in the above definition exists and, in this case,

$$\int_a^b f(x)\,dx = \lim_{\mu(\mathcal{P}) \to 0} S(\mathcal{P}, f).$$

Thus, we see that the requirement that a function be Riemann integrable
is a very strong one. We have the following result.

Theorem 2 (cf. Rudin [7])
If $f$ is continuous, or if $f$ has at most a countable number of discontinuities,
then $f$ is Riemann integrable. ■

Example Let us consider the unit interval $[0, 1]$. Let us choose some
numbering of all the rational numbers in this interval and write them as
$r_1, r_2, \ldots$. Define

$$f_n(x) = \begin{cases} 1, & \text{if } x = r_1, r_2, \ldots, r_n, \\ 0, & \text{otherwise.} \end{cases}$$

The function $f_n$ is discontinuous only at the points $r_1, \ldots, r_n$ which are
finite in number and so, by the previous theorem, $f_n$ is Riemann integrable.
In fact, it is a simple exercise to check this fact directly using the
definition of Riemann integrability and show that the integral is equal
to zero.

Let us now consider the function $f(x) = \lim_{n \to \infty} f_n(x)$. It is easy to
see that

$$f(x) = \begin{cases} 1, & \text{if } x \text{ is rational,} \\ 0, & \text{if } x \text{ is irrational.} \end{cases}$$

This function is discontinuous everywhere. Given any partition $\mathcal{P}$, it
is easy to see that $m_i = 0$ and that $M_i = 1$ for all $1 \le i \le N$. Thus
$L(\mathcal{P}, f) = 0$ and $U(\mathcal{P}, f) = 1$. Thus the lower integral is zero while the
upper integral is unity and so $f$ fails to be Riemann integrable. ■

This brings us to a major drawback of the Riemann integral. The
limit of a sequence of Riemann integrable functions need not be Riemann
integrable. Even if the limit is a Riemann integrable function,
the limit of the integrals need not be the integral of the limit, as the
following example shows.

Example Let $f_n(x) = n^2 x (1 - x^2)^n$ for $x \in [0, 1]$. Then $f_n(x) \to 0$ as
$n \to \infty$ (why?). Now,

$$\int_0^1 x (1 - x^2)^n\,dx = \frac{1}{2n + 2}.$$

Thus,

$$\int_0^1 f_n(x)\,dx = \frac{n^2}{2n + 2} \to \infty$$

while, since $f \equiv 0$, we have $\int_0^1 f(x)\,dx = 0$. Similarly, if we define

$$f_n(x) = n x (1 - x^2)^n,$$

again $f_n \to f \equiv 0$ pointwise but $\int_0^1 f_n(x)\,dx \to 1/2 \neq 0$.
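
The closed forms in this example can be checked numerically. The following is only a sketch under stated assumptions: the helper names `midpoint_integral` and `make_f` are hypothetical, and the midpoint rule is simply one convenient quadrature, not a method used in the text.

```python
def midpoint_integral(g, a, b, steps=20_000):
    """Approximate the integral of g over [a, b] by the midpoint rule."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


def make_f(n, power):
    """Return the function x -> n**power * x * (1 - x**2)**n on [0, 1].

    power = 2 gives the first sequence of the example, power = 1 the second.
    """
    def f(x):
        return (n ** power) * x * (1.0 - x * x) ** n
    return f


for n in (1, 10, 100):
    for power in (2, 1):
        f = make_f(n, power)
        exact = n ** power / (2 * n + 2)  # closed form from the text
        approx = midpoint_integral(f, 0.0, 1.0)
        # f(0.5) illustrates the pointwise convergence to 0 at a fixed point.
        print(n, power, f(0.5), approx, exact)
```

The printed values at the fixed point $x = 0.5$ tend to zero, while the integrals follow $n^2/(2n+2)$ and $n/(2n+2)$ respectively.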

So, when do the two limit processes - the pointwise limit of functions
and Riemann integration (which has been defined as a limit of sums as
shown in Theorem 1) - commute?

Definition We say that $f_n \to f$ uniformly on $[a, b]$ if, for every $\varepsilon > 0$,
there exists a positive integer $N$ such that, for all $x \in [a, b]$ and for all
$n \ge N$, we have

$$|f_n(x) - f(x)| < \varepsilon.$$

Theorem 3 (cf. Rudin [7])
If $f_n \to f$ uniformly on $[a, b]$, and if all the $f_n$ are Riemann integrable,
then $f$ is Riemann integrable and, further,

$$\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx.$$

In the preceding example, the sequence $\{f_n\}$ failed to converge uniformly.
In fact, the non-commutativity of the operations of taking the
pointwise limit and the Riemann integral is a useful test to prove that
a sequence of functions is not uniformly convergent.

Thus, a sequence of functions which does not converge uniformly
may converge to a function which is not integrable or it can happen
that the limit function is Riemann integrable but the limit of the integrals
is not the integral of the limit function. But uniform convergence
is a very strong condition as well.

We thus feel the need for a theory of integration, wherein a larger
class of functions is integrable and such that the process of taking pointwise
limits of functions commutes with the process of integration under
fairly easily verifiable hypotheses. This is where the alternative approach
of Lebesgue comes in useful.

The way the Riemann integral is defined, a certain amount of continuity
is forced on integrable functions. As we saw in Theorem 1, if $f$ is
Riemann integrable, then, for all admissible choices of points $t_i$, the value
of $S(\mathcal{P}, f)$ cannot vary too much, since the limit exists as $\mu(\mathcal{P}) \to 0$.
Thus, nearby points must have nearby values ‘to a large extent’ and this
is what Theorem 2 is all about. We can excuse a countable number of
discontinuities. But the function which takes the value 1 on the rationals
and the value 0 on irrationals is discontinuous everywhere and it fails to
be Riemann integrable.

The idea of Riemann in formulating the definition of the integral is
to consider the function following the abscissa. We take the values of the
function as we proceed along the x-axis. Thus, we are forced to consider
and compare the values of the function at nearby points and hence we
are dependent on some amount of continuity.

The idea of Lebesgue is to work, not from the domain, but from the
range of a function. We take a particular value and consider the set of
all points where this value is assumed when we define the integral. Let
us illustrate this via an example.

Example Let $\mathcal{P}$ be a partition of the interval $[a, b]$. Let

$$f(x) = \sum_{i=1}^{N} \alpha_i \chi_{E_i}(x)$$

where $E_i = [x_{i-1}, x_i]$ and for any subset $A$ of $\mathbb{R}$,

$$\chi_A(x) = \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{if } x \notin A. \end{cases}$$

This function has a finite number of discontinuities and the Riemann
integral is easily seen (Exercise!) to be

$$\int_a^b f(x)\,dx = \sum_{i=1}^{N} \alpha_i (x_i - x_{i-1}).$$

By Lebesgue’s method, we will be looking at sets of the form
$E_\alpha = \{x \in [a, b] \mid f(x) = \alpha\}$ for each $\alpha \in \mathbb{R}$ and multiply $\alpha$ by the ‘length’ of the
set $E_\alpha$ and ‘add’ all these products. In our example, $E_\alpha = \emptyset$ if $\alpha \neq \alpha_i$
for any $1 \le i \le N$ and $E_{\alpha_i} = E_i$. Thus the (Lebesgue) integral is given
again by the same expression as the Riemann integral, in this case. ■

Imagine a merchant in a shop wanting to add all the money he has
collected from sales during a particular day. He has two methods. First,
he can take the money one at a time from the till and add the amounts
as he takes them out. The other is for him to sort out all the money
according to each denomination, count the number of coins or notes in
each denomination, multiply the number by the value of the denomina-
tion and add all these products. Both procedures will yield the same
result, but the latter is more efficient, especially if it involves large quan-
tities of money (have you seen how they count the Hundi collections in a
large temple, say, Tirumala?). The approach of Riemann is like the first
method where we take a function as it comes along the x-axis, while the
approach of Lebesgue is like the second, where we sort it out according
to the values in the range. Obviously, this does not say anything about
the values of nearby points, and so, hopefully, will not depend on the
continuity of the function.
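
The merchant's two procedures can be sketched in a few lines of code. This is only an illustration, with hypothetical function names and a made-up list of takings: the first function adds the money in the order in which it comes out of the till, while the second sorts it by denomination and multiplies each value by the number of times it occurs.

```python
from collections import Counter


def total_one_by_one(till):
    """'Riemann' style: add the amounts in the order in which they appear."""
    total = 0
    for value in till:
        total += value
    return total


def total_by_denomination(till):
    """'Lebesgue' style: group by value, then sum value * (number of pieces)."""
    counts = Counter(till)
    return sum(value * count for value, count in counts.items())


# Hypothetical day's takings, listed in the order they come out of the till.
till = [10, 5, 10, 1, 5, 10, 100, 1, 1, 5]

print(total_one_by_one(till))
print(total_by_denomination(till))
```

Both functions return the same total. In the second one, the count of each denomination plays the role that the ‘length’ of the set where the function takes a given value will play in the Lebesgue integral.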

The Riemann integral approximates a function by another of the
form

$$\sum_{i=1}^{N} f(t_i) \chi_{E_i}(x)$$

where $t_i \in E_i$ and $\mathcal{P} = \{E_i \mid 1 \le i \le N\}$ is a partition of $[a, b]$ and
passes to the limit in sums of the form $S(\mathcal{P}, f)$.

The Lebesgue integral approximates a function by one of the form

$$\sum_{i=1}^{N} \alpha_i \chi_{A_i}(x)$$

where $A_i$, $1 \le i \le N$ are ‘more general’ sets than just intervals. It then
defines the integral of the simpler function by

$$\sum_{i=1}^{N} \alpha_i \mu(A_i)$$

where $\mu(A)$ is the ‘length’ of the set $A$, and then passes to the limit
suitably to get the integral of $f$.

Here is the catch. What do we mean by the ‘length’ of a set A which
is not an interval? This brings us to the theory of measures, which will
generalize the notion of length (area or volume, in higher dimensions)
to a fairly large class of sets.
Chapter 1

Measure

1.1 Algebras of sets

Throughout this section X will stand for a non-empty set. We will define
various classes of subsets of X. The power set of X, i.e. the collection
of all subsets of X, will be denoted by P(X).
Definition 1.1.1 A non-empty collection $\mathcal{R}$ of subsets of $X$ is called a
ring if it is closed under the formation of unions and differences, i.e. if
$E$ and $F$ are members of $\mathcal{R}$, then so are $E \cup F$ and $E \setminus F$. A ring is said
to be an algebra if, in addition, $X$ itself is a member of $\mathcal{R}$. ■

Remark 1.1.1 By induction, it is clear that every ring is closed under
the formation of finite unions. ■

Remark 1.1.2 The empty set always belongs to any ring since, if $E$ is
a member, so is $\emptyset = E \setminus E$. ■

Remark 1.1.3 If $\mathcal{R}$ is a ring and if $E, F \in \mathcal{R}$, then

$$E \Delta F = (E \setminus F) \cup (F \setminus E) \in \mathcal{R},$$
$$E \cap F = E \setminus (E \setminus F) \in \mathcal{R}.$$

Thus, a ring is closed under the formation of symmetric differences and
intersections as well. ■

Remark 1.1.4 Conversely, if a non-empty collection of subsets $\mathcal{R}$ is
closed under the formation of unions and symmetric differences, then it
is a ring. Indeed, if $E, F \in \mathcal{R}$, we have, by hypothesis, that $E \cup F \in \mathcal{R}$
and, further,

$$E \setminus F = (E \cup F) \Delta F \in \mathcal{R}.$$
© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019 9
S. Kesavan, Measure and Integration, Texts and Readings in
Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_1

Similarly, if $\mathcal{R}$ is closed under the formation of symmetric differences
and intersections, then also it is a ring, for

$$E \cup F = (E \Delta F) \Delta (E \cap F) \in \mathcal{R},$$
$$E \setminus F = (E \cup F) \Delta F \in \mathcal{R}.$$

Remark 1.1.5 If $\mathcal{R}$ is an algebra, then it is closed under complementation
since $E \in \mathcal{R}$ implies that $E^c = X \setminus E \in \mathcal{R}$. Conversely, if a non-empty
collection of subsets $\mathcal{R}$ is closed under the formation of unions
and complementation, it is an algebra. To see this, notice that if $E \in \mathcal{R}$,
then $E^c \in \mathcal{R}$ and so $X = E \cup E^c \in \mathcal{R}$. Further, if $E, F \in \mathcal{R}$, then

$$E \setminus F = E \cap F^c = (E^c \cup F)^c \in \mathcal{R}.$$ ■
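
The set identities used in Remarks 1.1.3 to 1.1.5 can be verified exhaustively on a small set. The following sketch is not part of the original text; the choice $X = \{0, 1, 2, 3, 4\}$ and the helper names are assumptions made for illustration only. Python's operators `|`, `&`, `-` and `^` on sets stand for union, intersection, difference and symmetric difference.

```python
from itertools import combinations

# A small, arbitrarily chosen ambient set X.
X = frozenset(range(5))


def subsets(s):
    """All subsets of the finite set s, as frozensets."""
    items = sorted(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def complement(A):
    """Complement of A in X."""
    return X - A


def identities_hold(E, F):
    """Check the identities of Remarks 1.1.3, 1.1.4 and 1.1.5 for E and F."""
    return (
        E & F == E - (E - F)                         # Remark 1.1.3
        and E - F == (E | F) ^ F                     # Remark 1.1.4
        and E | F == (E ^ F) ^ (E & F)               # Remark 1.1.4
        and E - F == complement(complement(E) | F)   # Remark 1.1.5
    )


all_ok = all(identities_hold(E, F) for E in subsets(X) for F in subsets(X))
print(all_ok)
```

Such a check is of course no substitute for the proofs, but it is a quick way to catch a mis-stated identity.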

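Example 1.1.1 The collections $\mathcal{R} = \{\emptyset\}$ and $\mathcal{R} = \mathcal{P}(X)$ are trivial
examples of rings for any non-empty set $X$. ■

Example 1.1.2 Let $X = \mathbb{Z}$, the set of all integers. Define

$$\mathcal{R} = \{A \subset \mathbb{Z} \mid A \text{ is a non-empty finite set, or } A = \emptyset\}.$$

Then $\mathcal{R}$ defines a ring. ■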

The next example is one which we will deal with in detail in this book
since it will be the starting point of the construction of the Lebesgue
measure.

Example 1.1.3 Let $X = \mathbb{R}$, the real line. Define

$$\mathcal{P} = \{[a, b) \mid a, b \in \mathbb{R},\ a \le b\}$$

where $[a, b) = \{x \in \mathbb{R} \mid a \le x < b\}$ with the convention that this stands
for the empty set if $a = b$. Define $\mathcal{R}$ to be the collection of all finite
unions of members of $\mathcal{P}$. Then $\mathcal{R}$ is a ring. To see this, first of all, $\mathcal{R}$
is closed under the formation of finite unions, by definition. Further,
$[a, b) \setminus [c, d)$ will be $[a, b)$ if the two intervals are disjoint, or empty if
$[a, b) \subset [c, d)$. If $a < c < b \le d$, then $[a, b) \setminus [c, d) = [a, c)$ and if
$a \le c < d < b$, we have

$$[a, b) \setminus [c, d) = [a, c) \cup [d, b) \in \mathcal{R}.$$

Finally, if $c < a < d < b$, we have

$$[a, b) \setminus [c, d) = [d, b).$$

From this it easily follows that $\mathcal{R}$ is closed under the formation of differences
as well. It is also clear that each member of $\mathcal{R}$ can, in fact, be
written as a finite disjoint union of members of $\mathcal{P}$. ■
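
The case analysis for $[a, b) \setminus [c, d)$ can be condensed into a single formula, namely $[a, \min(b, c)) \cup [\max(a, d), b)$, with empty pieces discarded. The sketch below is an illustration and not part of the original text: it encodes a half-open interval $[lo, hi)$ as a pair `(lo, hi)`, and the function name `difference` and the sample end points are hypothetical.

```python
def difference(I, J):
    """Return [a, b) minus [c, d) as a list of disjoint, non-empty
    half-open intervals, each encoded as a pair (lo, hi) meaning [lo, hi).

    Assumes a <= b and c <= d, as in the definition of the class P.
    """
    a, b = I
    c, d = J
    # Points of [a, b) lying to the left of c, and points lying at or beyond d.
    pieces = [(a, min(b, c)), (max(a, d), b)]
    # Discard empty pieces, i.e. those with lo >= hi.
    return [(lo, hi) for lo, hi in pieces if lo < hi]


# The cases listed in the example, with hypothetical end points:
print(difference((0, 2), (3, 4)))  # disjoint intervals
print(difference((1, 2), (0, 3)))  # [a, b) contained in [c, d)
print(difference((0, 2), (1, 3)))  # a < c < b <= d
print(difference((0, 4), (1, 2)))  # a <= c < d < b
print(difference((1, 4), (0, 2)))  # c < a < d < b
```

In every case the result consists of at most two members of $\mathcal{P}$, which is the key observation behind the closure of $\mathcal{R}$ under differences.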

Definition 1.1.2 A non-empty collection $\mathcal{S}$ of subsets of a non-empty
set $X$ is said to be a σ-ring if it is closed under the formation of
differences and countable unions. In other words, if $E, F \in \mathcal{S}$, then
$E \setminus F \in \mathcal{S}$ and if $\{E_i\}_{i=1}^{\infty}$ is a countable collection of members of $\mathcal{S}$, then
$\bigcup_{i=1}^{\infty} E_i \in \mathcal{S}$. A σ-ring $\mathcal{S}$ is called a σ-algebra if, in addition, $X \in \mathcal{S}$. ■

Remark 1.1.6 Thus, a σ-ring is a ring which is closed under the formation
of countable unions. If $\{E_i\}_{i=1}^{\infty}$ is a countable collection in a σ-ring
$\mathcal{S}$, then

$$\bigcap_{i=1}^{\infty} E_i = E \setminus \bigcup_{i=1}^{\infty} (E \setminus E_i) \in \mathcal{S},$$

where $E = \bigcup_{i=1}^{\infty} E_i$. Thus, a σ-ring is closed under the formation of
countable intersections as well. It is also easy to see that a σ-algebra
can be described as a non-empty collection of subsets which is closed
under the formation of countable unions and complementation. ■

Let $X$ be a non-empty set and let $\mathcal{E}$ be a non-empty collection of subsets
of $X$. Clearly, the power set of $X$, i.e. $\mathcal{P}(X)$, is a ring (respectively,
σ-ring) containing $\mathcal{E}$. Now, it is immediate to see that the intersection of
a collection of rings (respectively, σ-rings) is again a ring (respectively, a
σ-ring). Consequently, there exists a smallest ring (respectively, σ-ring)
containing $\mathcal{E}$. This is called the ring (respectively, σ-ring) generated by
$\mathcal{E}$ and is denoted by $\mathcal{R}(\mathcal{E})$ (respectively, $\mathcal{S}(\mathcal{E})$).

The collection of all sets in $X$ which can be covered by finite (respectively,
countable) unions of members of $\mathcal{E}$ is clearly a ring (respectively, a
σ-ring) containing $\mathcal{E}$. Thus, every member of $\mathcal{R}(\mathcal{E})$ (respectively, $\mathcal{S}(\mathcal{E})$)
can be covered by a finite (respectively, countable) union of members of
$\mathcal{E}$.
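
When the generating collection is a finite family of finite sets, the generated ring can be computed explicitly by repeatedly adding unions and differences until nothing new appears. The sketch below is an illustration under that finiteness assumption and is not part of the original text; the function name `generated_ring` and the sample generators are hypothetical.

```python
def generated_ring(collection):
    """Smallest ring containing the given finite family of finite sets,
    obtained by closing the family under unions and differences.
    """
    ring = {frozenset(s) for s in collection}
    changed = True
    while changed:
        changed = False
        # Iterate over snapshots, since new sets are added along the way.
        for E in list(ring):
            for F in list(ring):
                for G in (E | F, E - F):
                    if G not in ring:
                        ring.add(G)
                        changed = True
    return ring


# Hypothetical generators inside X = {1, 2, 3, 4}.
generators = [{1, 2}, {2, 3}]
ring = generated_ring(generators)

for member in sorted(ring, key=lambda s: (len(s), sorted(s))):
    print(sorted(member))
```

Every set produced in this way lies in any ring containing the generators, so the result is indeed the smallest such ring. In this example each member is contained in $\{1, 2, 3\}$, the union of the generators, in agreement with the covering property stated above.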

1.2 Measures on rings

Let $X$ be a non-empty set and let $\mathcal{R}$ be a ring of subsets of $X$.

Definition 1.2.1 A measure, $\mu$, on the ring $\mathcal{R}$, is an extended real-valued
function on $\mathcal{R}$ such that
(i) $\mu(E) \ge 0$, for all $E \in \mathcal{R}$,
(ii) $\mu(\emptyset) = 0$, and,
(iii) $\mu$ is countably additive, i.e. if $\{E_i\}_{i=1}^{\infty}$ is a sequence of pairwise
disjoint sets in $\mathcal{R}$ such that $E = \bigcup_{i=1}^{\infty} E_i \in \mathcal{R}$, then

$$\mu(E) = \sum_{i=1}^{\infty} \mu(E_i). \tag{1.2.1}$$ ■
i=1

Remark 1.2.1 Since µ is an extended real-valued function on R, it is
possible that µ(E) = +∞ for some E ∈ R. If there exists at least one
E ∈ R such that µ(E) < +∞, then (ii) in the above definition will
follow as a consequence of (iii), since we can write

E = E ∪ ∅ ∪ ∅ ∪ ∅ ··· ,

so that µ(E) = µ(E) + µ(∅) + µ(∅) + ··· , which forces µ(∅) = 0 because
µ(E) is finite. ∎

Remark 1.2.2 Since µ(Ei ) ≥ 0 for all i, the order of the summands in
(1.2.1) is unimportant. 

Remark 1.2.3 A measure is always finitely additive as well. If
$\{E_i\}_{i=1}^{N}$ is a finite collection of mutually disjoint sets in R
whose union is E (which will automatically be in R), then since

$$E = \bigcup_{i=1}^{N} E_i \cup \emptyset \cup \emptyset \cup \emptyset \cdots,$$

we have

$$\mu(E) = \sum_{i=1}^{N} \mu(E_i).$$ ∎

Example 1.2.1 Let X be any non-empty set and let R = P(X). If
E ⊂ X, define

$$\mu(E) = \begin{cases} 0, & \text{if } E = \emptyset,\\ \text{number of elements in } E, & \text{if } E \text{ is a finite non-empty set},\\ +\infty, & \text{otherwise}. \end{cases}$$

We need to check only the countable additivity. Let $\{E_i\}_{i=1}^{\infty}$
be a collection of mutually disjoint subsets whose union is E. If E is a
finite set, then at most finitely many of the $E_i$ will be non-empty and
they will also be finite sets. Then (1.2.1) is obviously true. If E is an
infinite set, then either at least one of the $E_i$ is an infinite set, or
there are infinitely many $E_i$ which are non-empty finite sets. In either
case, both sides of (1.2.1) take the value +∞ and so countable additivity
is established.
This measure is called the counting measure on the set X. ∎
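The finite case of this verification can be tried out numerically. The following is only an illustrative sketch: Python sets are finite, so the value +∞ never occurs here, and the name `counting_measure` and the sample sets are our own choices.

```python
def counting_measure(E):
    """Counting measure of a finite set: its number of elements (0 for the empty set)."""
    return len(E)

# Pairwise disjoint finite sets E_1, E_2, E_3 and their union E.
parts = [{1, 2}, {5}, {7, 8, 9}]
E = set().union(*parts)

# Both sides of (1.2.1) for this finite disjoint family.
lhs = counting_measure(E)
rhs = sum(counting_measure(Ei) for Ei in parts)
print(lhs, rhs)
```

Both printed numbers agree, as (1.2.1) requires for a disjoint family.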

Example 1.2.2 Let X and R be as in the previous example. Let $x_0 \in X$
be fixed. Let E ⊂ X. Define

$$\mu(E) = \begin{cases} 1, & \text{if } x_0 \in E,\\ 0, & \text{if } x_0 \notin E. \end{cases}$$

Again, we need only to check the countable additivity. Let
$E = \bigcup_{i=1}^{\infty} E_i$ be the union of a sequence of mutually
disjoint sets in X. If $x_0 \in E$, then $x_0 \in E_{i_0}$ for exactly one
index $i_0$. Consequently, both sides of (1.2.1) will be unity. If
$x_0 \notin E$, then $x_0 \notin E_i$ for every index i and so both sides
of (1.2.1) are zero in this case.
This measure is called the Dirac measure concentrated at the point
$x_0$. ∎

Example 1.2.3 Let X be any non-empty set. Let R be the ring of
finite subsets of X. Let f : X → ℝ be a given non-negative real-valued
function. Define µ to be zero on the empty set and set

$$\mu(\{x_1, \cdots, x_n\}) = \sum_{i=1}^{n} f(x_i).$$

It is easy to check that this defines a measure on R. ∎
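A minimal sketch of this example follows, assuming finite Python sets stand in for the members of the ring. The helper names `weighted_measure` and `dirac` are hypothetical. The Dirac measure of Example 1.2.2, restricted to finite sets, is recovered by taking f to be the indicator of the point x0.

```python
def weighted_measure(f):
    """Return the set function mu(E) = sum of f(x) over the finite set E."""
    def mu(E):
        return sum(f(x) for x in E)  # the empty sum is 0, so mu(empty set) = 0
    return mu

def dirac(x0):
    """Dirac measure at x0 on finite sets: f is the indicator function of {x0}."""
    return weighted_measure(lambda x: 1 if x == x0 else 0)

mu = weighted_measure(lambda x: x * x)   # f(x) = x^2, non-negative
delta = dirac(3)

A, B = {1, 2}, {3, 4}                    # two disjoint finite sets
print(mu(A | B), mu(A) + mu(B))          # finite additivity for mu
print(delta(A), delta(B), delta(A | B))  # 1 exactly when 3 belongs to the set
```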

The most interesting example will be the Lebesgue measure, to be
defined on the euclidean space $\mathbb{R}^N$, N ≥ 1, which we will study
in detail later.

We will now prove some basic, but important, properties of measures.

Proposition 1.2.1 Let µ be a measure on a ring R of subsets of a
non-empty set X. Then,
(i) µ is monotone, i.e. if E, F ∈ R and if E ⊂ F, we have µ(E) ≤
µ(F), and
(ii) µ is subtractive, i.e. if E, F ∈ R, E ⊂ F and if µ(E) < +∞, then

$$\mu(F \setminus E) = \mu(F) - \mu(E).$$



Proof: If E ⊂ F, then, by finite additivity, we have that
µ(F) = µ(F\E) + µ(E). Then (i) is a consequence of the non-negativity of
the measure and (ii) follows from the fact that µ(E) is finite and hence
we can subtract it from both sides of the above relation. ∎

Proposition 1.2.2 (Subadditivity) Let µ be a measure on a ring R of
subsets of a non-empty set X. Let $\{E_i\}$ be a finite, or infinite,
sequence of sets in R and let E ∈ R be such that $E \subset \bigcup_i E_i$.
Then

$$\mu(E) \le \sum_i \mu(E_i).$$

Proof: Set $F_i = E \cap E_i$. Define $G_1 = F_1$ and, for i ≥ 2, define
$G_i = F_i \setminus (\bigcup_{j=1}^{i-1} F_j)$. Then the sets $G_i$ are
all mutually disjoint and $G_i \subset F_i$ for all i. Further

$$\bigcup_i G_i = \bigcup_i F_i = E.$$

Thus, by countable additivity and monotonicity, we have

$$\mu(E) = \sum_i \mu(G_i) \le \sum_i \mu(F_i) \le \sum_i \mu(E_i).$$ ∎
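The disjointification step used in this proof can be sketched as follows for a finite list of finite sets. The function name `disjointify` and the sample sets are our own, and the counting measure of Example 1.2.1 is used only to illustrate the resulting inequality.

```python
def disjointify(sets):
    """Given F_1, ..., F_n, return G_i = F_i minus (F_1 ∪ ... ∪ F_{i-1})."""
    seen = set()   # union of the sets processed so far
    out = []
    for F in sets:
        out.append(set(F) - seen)
        seen |= set(F)
    return out

F_list = [{1, 2, 3}, {2, 3, 4}, {4, 5}]
G_list = disjointify(F_list)
print(G_list)

# The G_i are pairwise disjoint, G_i ⊂ F_i, and the unions coincide.
union_F = set().union(*F_list)
union_G = set().union(*G_list)
# With the counting measure: mu(union) = sum mu(G_i) <= sum mu(F_i).
print(len(union_G), sum(len(G) for G in G_list), sum(len(F) for F in F_list))
```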

Proposition 1.2.3 Let µ be a measure on a ring R of subsets of a
non-empty set X. Let $\{E_i\}$ be a finite, or infinite, sequence of
mutually disjoint sets in R such that $\bigcup_i E_i \subset E$, where
E ∈ R. Then,

$$\sum_i \mu(E_i) \le \mu(E).$$

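Proof: For any positive integer n, we have $\bigcup_{i=1}^{n} E_i \subset E$.
By the finite additivity and monotonicity of the measure, we have

$$\sum_{i=1}^{n} \mu(E_i) = \mu\Big(\bigcup_{i=1}^{n} E_i\Big) \le \mu(E),$$

from which we deduce the result immediately (letting n → ∞ when the
sequence is infinite). ∎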

Proposition 1.2.4 (Continuity from below) Let µ be a measure on a
ring R of subsets of a non-empty set X. Let $\{E_i\}_{i=1}^{\infty}$ be an
increasing sequence of sets in R such that
$\bigcup_{i=1}^{\infty} E_i \in R$. Then

$$\mu\Big(\bigcup_{i=1}^{\infty} E_i\Big) = \lim_{n\to\infty} \mu(E_n). \tag{1.2.2}$$

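Proof: Set $E_0 = \emptyset$. Then

$$\begin{aligned}
\mu\Big(\bigcup_{i=1}^{\infty} E_i\Big) &= \mu\Big(\bigcup_{i=1}^{\infty} (E_i \setminus E_{i-1})\Big) = \sum_{i=1}^{\infty} \mu(E_i \setminus E_{i-1})\\
&= \lim_{n\to\infty} \sum_{i=1}^{n} \mu(E_i \setminus E_{i-1}) = \lim_{n\to\infty} \mu(E_n),
\end{aligned}$$

since the sets $E_i \setminus E_{i-1}$ are all mutually disjoint. This
completes the proof. ∎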

Proposition 1.2.5 (Continuity from above) Let µ be a measure on a
ring R of subsets of a non-empty set X. Let $\{E_i\}_{i=1}^{\infty}$ be a
decreasing sequence of sets in R such that
$\bigcap_{i=1}^{\infty} E_i \in R$ and such that for some positive
integer m, we have $\mu(E_m) < +\infty$. Then

$$\mu\Big(\bigcap_{i=1}^{\infty} E_i\Big) = \lim_{n\to\infty} \mu(E_n). \tag{1.2.3}$$

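Proof: Since the sequence of sets is decreasing, we have that
$\mu(E_n) < +\infty$ for all n ≥ m. Further $\{E_m \setminus E_n\}_{n \ge m}$
is an increasing sequence of sets. Hence by the preceding proposition and
by the subtractive property of the measure, we have

$$\begin{aligned}
\mu(E_m) - \mu\Big(\bigcap_{i=1}^{\infty} E_i\Big) &= \mu(E_m) - \mu\Big(\bigcap_{i=m}^{\infty} E_i\Big) = \mu\Big(E_m \setminus \bigcap_{i=m}^{\infty} E_i\Big)\\
&= \mu\Big(\bigcup_{i=m}^{\infty} (E_m \setminus E_i)\Big) = \lim_{n\to\infty} \mu(E_m \setminus E_n)\\
&= \mu(E_m) - \lim_{n\to\infty} \mu(E_n).
\end{aligned}$$

The result now follows on subtracting $\mu(E_m)$, which is finite, from
both sides of the above relation. ∎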

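Example 1.2.4 The preceding proposition is not valid without the
assumption that $\mu(E_m)$ is finite for some m. Consider the set of
natural numbers, ℕ, equipped with the counting measure (cf. Example 1.2.1).
Let $E_n = \{m \in \mathbb{N} \mid m \ge n\}$. Then $\mu(E_n) = +\infty$ for
all n while $\bigcap_{n=1}^{\infty} E_n = \emptyset$, so that the limit of
$\mu(E_n)$ is +∞ whereas the measure of the intersection is zero. ∎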
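To see the role of the finiteness assumption, one may compare the counting measure of the tails $E_n$ with a finite measure on the same sets. The sketch below uses closed-form values rather than actual infinite sets. The weights $2^{-m}$, the convention that ℕ starts at 1, and the function names are assumptions made only for this illustration.

```python
import math
from fractions import Fraction

def counting_tail(n):
    """Counting measure of E_n = {m in N : m >= n}: every tail is infinite."""
    return math.inf

def geometric_tail(n):
    """Measure of E_n when the point m carries weight 2**(-m), m = 1, 2, ...

    The geometric series gives sum_{m >= n} 2**(-m) = 2**(-(n - 1)).
    """
    return Fraction(1, 2 ** (n - 1))

for n in (1, 5, 10):
    print(n, counting_tail(n), geometric_tail(n))
```

For the weighted measure the values decrease to 0, the measure of the empty intersection, in agreement with Proposition 1.2.5; for the counting measure they remain +∞.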

Definition 1.2.2 Let µ be a measure on a ring R of subsets of a
non-empty set X. We say that µ is finite if µ(E) < +∞ for every E ∈ R.
We say that µ is σ-finite if every set E in R can be covered by a sequence
$\{E_i\}_{i=1}^{\infty}$ of sets in R with $\mu(E_i) < +\infty$ for every i. ∎

Thus, the Dirac measure (cf. Example 1.2.2) and the measure de-
fined in Example 1.2.3 are both finite measures. The counting measure
on N is a σ-finite measure, since any subset of N can be covered by a
countable number of singleton sets, and each singleton set has measure
unity.

We conclude this section with a very useful result.


Proposition 1.2.6 (Borel-Cantelli Lemma) Let µ be a measure on a
σ-algebra S of subsets of a non-empty set X. Let $\{E_i\}_{i=1}^{\infty}$
be a sequence of sets in S such that

$$\sum_{i=1}^{\infty} \mu(E_i) < +\infty.$$

Then, except for a set of measure zero, every point x ∈ X belongs to at
most finitely many of the sets $E_i$.
Proof: Let E be the set of all points x ∈ X which belong to infinitely
many of the sets $E_i$. Then,

$$E = \bigcap_{n=1}^{\infty} \bigcup_{i=n}^{\infty} E_i.$$

Then, for every positive integer n, we have

$$\mu(E) \le \mu\Big(\bigcup_{i=n}^{\infty} E_i\Big) \le \sum_{i=n}^{\infty} \mu(E_i).$$

But the sum on the extreme right is the tail of a convergent series and
hence can be made arbitrarily small for large n. Thus it follows that
µ(E) = 0. ∎

1.3 Outer-measure and measurable sets

In the sequel, we will really be interested only in measures defined on
σ-algebras. However, as we shall see in the construction of the Lebesgue
measure, it will be simpler to explicitly construct it on a ring and then
try to extend it to larger collections like the σ-ring generated by the ring
itself. We now investigate the possibility of extending a measure defined
on a ring to the σ-ring generated by it or to even larger classes of sets.
Definition 1.3.1 Let X be a non-empty set and let S be a σ-ring of
subsets of X. It is said to be a hereditary σ-ring if, whenever E ∈ S,
we have that every subset of E is also a member of S. 

The power set of X is clearly a hereditary σ-ring, and the intersection
of any collection of hereditary σ-rings is also a hereditary σ-ring. It
then follows that given
any collection E of subsets of X, there is a smallest hereditary σ-ring, de-
noted H(E), containing E. This is called the hereditary σ-ring generated
by the class E.
Notice that the collection of all sets in X which can be covered by
a countable union of members of E is a hereditary σ-ring containing E.
Thus, every member of H(E) can be covered by a countable union of
members of E.

Definition 1.3.2 Let X be a non-empty set and let H be a hereditary
σ-ring of subsets of X. An extended real-valued set function µ∗ defined
on H is said to be an outer-measure if the following properties hold:
(i) (non-negativity) µ∗(E) ≥ 0 for every E ∈ H;
(ii) (monotonicity) if E, F ∈ H such that E ⊂ F, then µ∗(E) ≤ µ∗(F);
(iii) µ∗(∅) = 0;
(iv) (countable subadditivity) if $\{E_n\}_{n=1}^{\infty}$ is a sequence of
sets in H, then

$$\mu^*\Big(\bigcup_{n=1}^{\infty} E_n\Big) \le \sum_{n=1}^{\infty} \mu^*(E_n). \tag{1.3.1}$$ ∎

An outer-measure is said to be σ-finite if every set in the hereditary
σ-ring can be covered by a countable union of sets of finite outer-measure.

Outer-measures occur naturally when we try to extend a measure
defined on a ring.

Proposition 1.3.1 Let X be a non-empty set and let R be a ring of
subsets of X. Let µ be a measure on R. For any set E ∈ H(R), the
hereditary σ-ring generated by R, define

$$\mu^*(E) = \inf\Big\{ \sum_{n=1}^{\infty} \mu(E_n) \;\Big|\; E \subset \bigcup_{n=1}^{\infty} E_n,\ E_n \in R \Big\}.$$

Then, µ∗ is an outer-measure on H(R) which extends µ. Further, if µ
is σ-finite, so is µ∗.

Proof: Step 1: The non-negativity of µ∗ is obvious. Now, let E ∈ R.
Then since E covers itself, we have µ∗(E) ≤ µ(E). On the other hand,
given any countable cover of E, say $\{E_n\}_{n=1}^{\infty}$ with
$E_n \in R$ for all n, we have, by subadditivity of the measure,
$\mu(E) \le \sum_{n=1}^{\infty} \mu(E_n)$ and so, by definition of µ∗, we
have µ(E) ≤ µ∗(E). Thus, for E ∈ R, we have µ(E) = µ∗(E) and so µ∗
extends µ. In particular, we have that µ∗(∅) = 0.

Step 2: Let F ⊂ E, where E ∈ H(R). Then every countable cover
of E by sets from R also covers F. Thus, it follows immediately that
µ∗(F) ≤ µ∗(E).

Step 3: We now prove the subadditivity of µ∗. Let E ∈ H(R) and
assume that $E \subset \bigcup_{i=1}^{\infty} E_i$, where each $E_i$ is
also a member of H(R). If there exists even a single index i such that
$\mu^*(E_i) = +\infty$, there is nothing to prove. Assume, therefore, that
$\mu^*(E_i) < +\infty$ for each i. Then, given ε > 0, there exist, by
definition, sets $E_{ij} \in R$ such that
$E_i \subset \bigcup_{j=1}^{\infty} E_{ij}$ and such that

$$\sum_{j=1}^{\infty} \mu(E_{ij}) < \mu^*(E_i) + \frac{\varepsilon}{2^i}.$$

Then $E \subset \bigcup_{i=1}^{\infty} \bigcup_{j=1}^{\infty} E_{ij}$ and

$$\mu^*(E) \le \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \mu(E_{ij}) \le \sum_{i=1}^{\infty} \mu^*(E_i) + \varepsilon.$$

Since ε > 0 was arbitrarily chosen, we deduce that

$$\mu^*(E) \le \sum_{i=1}^{\infty} \mu^*(E_i).$$

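Step 4: Assume now that µ is σ-finite and let E ∈ H(R). Then, there
exists a countable cover $\{E_i\}_{i=1}^{\infty}$ of E such that
$E_i \in R$ for each i. Since µ is σ-finite, we can find
$\{E_{ij}\}_{i,j=1}^{\infty}$ in R such that
$E_i \subset \bigcup_{j=1}^{\infty} E_{ij}$ and $\mu(E_{ij}) < +\infty$ for
each 1 ≤ i, j < ∞. Now, we have
$E \subset \bigcup_{i=1}^{\infty} \bigcup_{j=1}^{\infty} E_{ij}$ and
$\mu^*(E_{ij}) = \mu(E_{ij}) < +\infty$. Thus µ∗ is σ-finite. This
completes the proof. ∎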
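On a finite set, the infimum defining µ∗ can be computed by brute force, which gives a concrete feel for the proposition. The sketch below makes several assumptions of our own: X = {0, 1, 2, 3}, the ring consists of all unions of the atoms {0, 1}, {2}, {3}, and µ counts elements. Since this ring is finite and µ is non-negative, repeating a set in a cover never lowers the sum, so it suffices to search over subfamilies of the ring.

```python
from itertools import chain, combinations

ATOMS = [frozenset({0, 1}), frozenset({2}), frozenset({3})]

def powerset(items):
    """All finite subfamilies (as tuples) of the given collection."""
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

# The ring R: all unions of atoms, including the empty union.
RING = {frozenset().union(*combo) for combo in powerset(ATOMS)}

def mu(E):
    """The measure on the ring: number of elements."""
    return len(E)

def outer_measure(E):
    """Minimum of sum(mu(F)) over subfamilies of the ring whose union contains E."""
    E = frozenset(E)
    best = None
    for family in powerset(RING):
        if E <= frozenset().union(*family):
            total = sum(mu(F) for F in family)
            best = total if best is None else min(best, total)
    return best

print(outer_measure({0}), outer_measure({0, 2}), outer_measure(set()))
```

The set {0} is not in the ring, and its outer-measure is that of the smallest ring member containing it, namely {0, 1}. On members of the ring, `outer_measure` agrees with `mu`, as Step 1 of the proof asserts.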

Example 1.3.1 Let X = ℕ and consider the ring R of all finite subsets,
with the counting measure. Then µ is finite. Since any countable union
of singletons has to be in H(R), it follows that ℕ is in H(R) and by
heredity, it follows that H(R) = P(ℕ), the power set of ℕ. It is now
immediate to see that if E is any infinite subset of ℕ, then µ∗(E) = +∞.
Thus, even though µ is a finite measure, we can only say that µ∗ is
σ-finite. ∎

Definition 1.3.3 Let X be a non-empty set and let H be a hereditary
σ-ring of subsets of X. Let µ∗ be an outer-measure defined on H. A set
E ∈ H is said to be µ∗-measurable if for every A ∈ H, we have

$$\mu^*(A) = \mu^*(A \cap E) + \mu^*(A \cap E^c). \tag{1.3.2}$$ ∎

Remark 1.3.1 Since µ∗ is subadditive, the µ∗-measurability of E is
equivalent to verifying, for every A ∈ H,

$$\mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \cap E^c).$$ ∎

Proposition 1.3.2 Let µ∗ be an outer-measure on a hereditary σ-ring
H of subsets of a non-empty set X. Then the collection of all
µ∗-measurable sets, denoted S, is a ring.

Proof: Let A ∈ H and let E and F be µ∗-measurable sets. Then, by
definition, we have

$$\begin{aligned}
\mu^*(A) &= \mu^*(A \cap E) + \mu^*(A \cap E^c),\\
\mu^*(A \cap E) &= \mu^*(A \cap E \cap F) + \mu^*(A \cap E \cap F^c),\\
\mu^*(A \cap E^c) &= \mu^*(A \cap E^c \cap F) + \mu^*(A \cap E^c \cap F^c).
\end{aligned}$$

Thus,

$$\mu^*(A) = \mu^*(A \cap E \cap F) + \mu^*(A \cap E \cap F^c) + \mu^*(A \cap E^c \cap F) + \mu^*(A \cap E^c \cap F^c). \tag{1.3.3}$$

If we replace A by A ∩ (E ∪ F) in the above relation, we get

$$\mu^*(A \cap (E \cup F)) = \mu^*(A \cap E \cap F) + \mu^*(A \cap E \cap F^c) + \mu^*(A \cap E^c \cap F). \tag{1.3.4}$$

It then follows from (1.3.3) that

$$\mu^*(A) = \mu^*(A \cap (E \cup F)) + \mu^*(A \cap (E \cup F)^c).$$

Thus, E ∪ F ∈ S. Similarly, replacing A in (1.3.3) by
$A \cap (E \setminus F)^c = A \cap (E^c \cup F)$, we get

$$\mu^*(A \cap (E \setminus F)^c) = \mu^*(A \cap E \cap F) + \mu^*(A \cap E^c \cap F) + \mu^*(A \cap E^c \cap F^c). \tag{1.3.5}$$

It now follows from (1.3.3) that

$$\begin{aligned}
\mu^*(A) &= \mu^*(A \cap (E \setminus F)^c) + \mu^*(A \cap E \cap F^c)\\
&= \mu^*(A \cap (E \setminus F)^c) + \mu^*(A \cap (E \setminus F)).
\end{aligned}$$

This shows that E\F ∈ S. Clearly ∅ ∈ S. This completes the proof. ∎
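The measurability condition (1.3.2) can be tested exhaustively on a small example. The sketch below uses a toy outer-measure of our own choosing on X = {0, 1, 2}: it is 0 on the empty set, 2 on X and 1 on every other subset. One checks directly that it is monotone and subadditive. The function names are hypothetical.

```python
from itertools import combinations

X = frozenset({0, 1, 2})

def subsets(S):
    """All subsets of S as frozensets, ordered by size."""
    S = list(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def outer(E):
    """Toy outer-measure: 0 on the empty set, 2 on X, 1 on all other subsets."""
    E = frozenset(E)
    if not E:
        return 0
    return 2 if E == X else 1

def is_measurable(E):
    """Check (1.3.2) against every test set A ⊂ X."""
    E = frozenset(E)
    return all(outer(A) == outer(A & E) + outer(A - E) for A in subsets(X))

measurable = [E for E in subsets(X) if is_measurable(E)]
print(measurable)
```

Here only ∅ and X pass the test. These two sets do form a ring, in agreement with the proposition, though a very small one: for E = {0} the test set A = {0, 1} already violates (1.3.2).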

In fact, we can prove much more.

Proposition 1.3.3 Let µ∗ be an outer-measure on a hereditary σ-ring
H of subsets of a non-empty set X. Then, S, the collection of all
µ∗-measurable sets, is a σ-ring. Further, if $\{E_i\}_{i=1}^{\infty}$ is a
sequence of mutually disjoint sets in S whose union is E and if A is any
arbitrary set in H, we have

$$\mu^*(A \cap E) = \sum_{i=1}^{\infty} \mu^*(A \cap E_i). \tag{1.3.6}$$

Proof: It follows from (1.3.4) that

$$\mu^*(A \cap (E_1 \cup E_2)) = \mu^*(A \cap E_1 \cap E_2) + \mu^*(A \cap E_1 \cap E_2^c) + \mu^*(A \cap E_1^c \cap E_2).$$

Since $E_1 \cap E_2 = \emptyset$, it follows that $E_1 \subset E_2^c$ and
that $E_2 \subset E_1^c$. Consequently, the above relation yields,

$$\mu^*(A \cap (E_1 \cup E_2)) = \mu^*(A \cap E_1) + \mu^*(A \cap E_2).$$

By induction, it follows that for any n mutually disjoint sets
$\{E_i\}_{i=1}^{n}$, we have

$$\mu^*\Big(A \cap \Big(\bigcup_{i=1}^{n} E_i\Big)\Big) = \sum_{i=1}^{n} \mu^*(A \cap E_i). \tag{1.3.7}$$

Set $F_n = \bigcup_{i=1}^{n} E_i$. Since S is a ring, we have that
$F_n \in S$. Further, since $F_n \subset E$, we have that
$E^c \subset F_n^c$. Consequently we have, by (1.3.7) and the monotonicity
of µ∗,

$$\begin{aligned}
\mu^*(A) &= \mu^*(A \cap F_n) + \mu^*(A \cap F_n^c)\\
&\ge \sum_{i=1}^{n} \mu^*(A \cap E_i) + \mu^*(A \cap E^c).
\end{aligned}$$

Since n was arbitrarily fixed, we deduce that

$$\mu^*(A) \ge \sum_{i=1}^{\infty} \mu^*(A \cap E_i) + \mu^*(A \cap E^c).$$

Replacing A by A ∩ E and by the subadditivity of the outer-measure,
we get

$$\mu^*(A \cap E) \ge \sum_{i=1}^{\infty} \mu^*(A \cap E_i) \ge \mu^*(A \cap E).$$

This proves (1.3.6) and we also immediately see that

$$\mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \cap E^c)$$

from which we deduce that E ∈ S (cf. Remark 1.3.1). Thus, S is closed
under countable disjoint unions. But since S is a ring, we can express
any countable union of sets in it as a countable disjoint union of sets in
it. Thus it follows that S is indeed a σ-ring. This completes the proof. ∎

Definition 1.3.4 A measure µ, defined on a σ-ring S of subsets of a
non-empty set X, is said to be complete if, whenever a set E ∈ S has
measure zero, then every subset of E is also a member of S. 

Theorem 1.3.1 Let µ∗ be an outer-measure on a hereditary σ-ring H of
subsets of a non-empty set X and let S be the σ-ring of all µ∗-measurable
subsets. For E ∈ S, define

µ(E) = µ∗(E).

Then µ is a complete measure on S.

Proof: The fact that µ is a measure on S follows immediately from the
preceding proposition (take A = E in (1.3.6)). Now, let µ∗(E) = 0 for
some E ∈ H. Let A ∈ H. Then

$$\mu^*(A) = \mu^*(E) + \mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \cap E^c)$$

by monotonicity and hence (cf. Remark 1.3.1) we deduce that E ∈ S.
Thus S contains all sets of outer-measure zero. If E ∈ S is such that
µ(E) = 0, then µ∗(E) = 0 and so µ∗(F) = 0 for all subsets F of E.
Thus every subset F of E is in S, which completes the proof. ∎

We say that the measure µ is the measure induced by the
outer-measure µ∗.

Now, let µ be a measure on a ring R of subsets of a non-empty set
X. Then we can define the hereditary σ-ring H(R), generated by R and
also define the outer-measure µ∗ on H(R), as described in Proposition
1.3.1. This outer-measure will now induce a complete measure, denoted
µ̄ to distinguish it from µ, on S, the σ-ring of all µ∗-measurable sets.
Proposition 1.3.4 Let R be a ring of subsets of a non-empty set X
and let µ be a measure on R. Let µ∗ be the induced outer-measure.
Then S(R) ⊂ S, where S(R) is the σ-ring generated by R and S is
the σ-ring of all µ∗-measurable subsets. In particular, the measure µ̄
induced by µ∗ is an extension of the measure µ to the σ-rings S(R) and
S, and it is complete on the latter.
Proof: Let E ∈ R and let A ∈ H(R). Assume that µ∗(A) < +∞.
Then, given ε > 0, there exists a sequence of sets
$\{E_i\}_{i=1}^{\infty}$ in R such that
$A \subset \bigcup_{i=1}^{\infty} E_i$ and such that

$$\sum_{i=1}^{\infty} \mu(E_i) < \mu^*(A) + \varepsilon.$$

But µ is a measure on R and so

$$\sum_{i=1}^{\infty} \mu(E_i) = \sum_{i=1}^{\infty} \big(\mu(E_i \cap E) + \mu(E_i \cap E^c)\big).$$

Now, µ∗ extends µ and so, by the subadditivity of the outer-measure,
we get

$$\mu^*(A) + \varepsilon > \mu^*(A \cap E) + \mu^*(A \cap E^c).$$

Since ε > 0 has been arbitrarily chosen, we deduce that

$$\mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \cap E^c).$$

This inequality is trivially true if µ∗(A) = +∞. Thus we have shown
that R ⊂ S (cf. Remark 1.3.1). Since S is a σ-ring containing R, it also
contains S(R). ∎

Remark 1.3.2 If µ on R is σ-finite, then we saw that µ∗ on H(R) is
σ-finite as well, and that, in fact, each set in H(R) can be covered by
countably many sets from R with finite measure. In particular, it follows
that the induced measure µ̄ is σ-finite on S(R) and on S. ∎
Proposition 1.3.5 Let µ be a measure on a ring R of subsets of a
non-empty set X and let µ̄ be its extension to S(R) and S as described
above. Let E ∈ H(R). Then

$$\begin{aligned}
\mu^*(E) &= \inf\{\bar{\mu}(F) \mid F \in S,\ E \subset F\}\\
&= \inf\{\bar{\mu}(F) \mid F \in S(R),\ E \subset F\}.
\end{aligned}$$

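Proof: The proof follows from the following chain of inequalities.

$$\begin{aligned}
\mu^*(E) &= \inf\Big\{\sum_{i=1}^{\infty} \mu(E_i) \;\Big|\; E \subset \bigcup_{i=1}^{\infty} E_i,\ E_i \in R\Big\}\\
&= \inf\Big\{\sum_{i=1}^{\infty} \bar{\mu}(E_i) \;\Big|\; E \subset \bigcup_{i=1}^{\infty} E_i,\ E_i \in R\Big\}\\
&\ge \inf\Big\{\sum_{i=1}^{\infty} \bar{\mu}(E_i) \;\Big|\; E \subset \bigcup_{i=1}^{\infty} E_i,\ E_i \in S(R)\Big\}\\
&\ge \inf\Big\{\bar{\mu}\Big(\bigcup_{i=1}^{\infty} E_i\Big) \;\Big|\; E \subset \bigcup_{i=1}^{\infty} E_i,\ E_i \in S(R)\Big\}\\
&\ge \inf\{\bar{\mu}(F) \mid E \subset F,\ F \in S(R)\}\\
&\ge \inf\{\bar{\mu}(F) \mid E \subset F,\ F \in S\}\\
&= \inf\{\mu^*(F) \mid E \subset F,\ F \in S\}\\
&\ge \mu^*(E).
\end{aligned}$$ ∎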
Remark 1.3.3 Let µ be a measure on a ring R of subsets of a non-empty
set X and let µ̄ be its extension, as described in this section, to
S(R). Starting from this, one could again try to define an outer-measure
µ̄∗ on H(S(R)) = H(R) and try to extend it further. The proof of the
preceding proposition shows that µ̄∗ = µ∗ and so the σ-ring of measurable
sets will still be S and so the induced measure will also only be µ̄. ∎
Definition 1.3.5 Let µ be a measure on a ring R of subsets of a
non-empty set X and let µ̄ be the induced measure on S(R). Let E ∈ H(R)
and let F ∈ S(R). We say that F is a measurable cover of E if E ⊂ F and
for all G ∈ S(R) such that G ⊂ F\E, we have µ̄(G) = 0. ∎

Proposition 1.3.6 Let µ be a measure on a ring R of subsets of a
non-empty set X and let µ̄ be the induced measure. Let E ∈ H(R) be such
that µ∗(E) < +∞. Then, there exists a measurable cover F of E such that
µ̄(F) = µ∗(E).

Proof: By Proposition 1.3.5, for every positive integer n, there exists
$F_n \in S(R)$ such that $E \subset F_n$ and

$$\mu^*(F_n) < \mu^*(E) + \frac{1}{n}.$$

Set $F = \bigcap_{n=1}^{\infty} F_n$. Then F ∈ S(R) and E ⊂ F. Thus,

$$\mu^*(E) \le \mu^*(F) \le \mu^*(F_n) < \mu^*(E) + \frac{1}{n}.$$

Since this is true for all n, we deduce that

$$\mu^*(E) = \mu^*(F) = \bar{\mu}(F).$$

Let G ∈ S(R) be such that G ⊂ F\E. Then E ⊂ F\G and µ̄(G) is
finite. Thus,

$$\bar{\mu}(F) = \mu^*(E) \le \mu^*(F \setminus G) = \bar{\mu}(F \setminus G) = \bar{\mu}(F) - \bar{\mu}(G)$$

from which it follows that µ̄(G) = 0. This completes the proof. ∎

1.4 Completion of a measure

In the previous section we started with a measure µ on a ring R of
subsets of a non-empty set X and extended it to a complete measure µ̄ on
S, the σ-ring of all µ∗-measurable sets.

Given a measure on a σ-ring, it is always possible to extend it to a
complete measure by another process, which we now describe.

Theorem 1.4.1 Let S be a σ-ring of subsets of a non-empty set X and
let µ be a measure on S. Let

$$\tilde{S} = \{E \Delta N \mid E \in S,\ N \subset A,\ A \in S,\ \mu(A) = 0\}.$$

Then, $\tilde{S}$ is a σ-ring. Define $\tilde{\mu}$ on $\tilde{S}$ by

$$\tilde{\mu}(E \Delta N) = \mu(E).$$

Then $\tilde{\mu}$ is a complete measure on $\tilde{S}$ and it extends µ.

Proof: Let E, A ∈ S. Let µ(A) = 0 and let N ⊂ A. Then E\A ∈ S.
Consequently,

$$\begin{aligned}
E \cup N &= (E \setminus A)\, \Delta\, (A \cap (E \cup N)),\\
E \Delta N &= (E \setminus A) \cup (A \cap (E \Delta N)).
\end{aligned} \tag{1.4.1}$$

Thus,

$$\tilde{S} = \{E \cup N \mid E \in S,\ N \subset A,\ A \in S,\ \mu(A) = 0\}.$$

It is now clear that $\tilde{S}$ is closed under the formation of
countable unions. Let $E_1, E_2 \in S$ and let $N_1 \subset A_1$,
$N_2 \subset A_2$ where $A_i \in S$, $\mu(A_i) = 0$ for i = 1, 2. Since the
symmetric difference is an associative operation (check!!) which is also
obviously commutative, we have that

$$(E_1 \Delta N_1) \Delta (E_2 \Delta N_2) = (E_1 \Delta E_2) \Delta (N_1 \Delta N_2)$$

with $E_1 \Delta E_2 \in S$ and $N_1 \Delta N_2 \subset A_1 \cup A_2$ where
$A_1 \cup A_2 \in S$, $\mu(A_1 \cup A_2) = 0$. Thus $\tilde{S}$ is closed
under the formation of symmetric differences as well. Thus it follows
(cf. Remark 1.1.4) that $\tilde{S}$ is a σ-ring.

Again let $E_i, A_i, N_i$, i = 1, 2 be as above. Assume that
$E_1 \Delta N_1 = E_2 \Delta N_2$. Again, by the commutativity and
associativity of the symmetric difference, we easily see that

$$E_1 \Delta E_2 = N_1 \Delta N_2.$$

It then follows that $E_1 \Delta E_2 \subset A_1 \cup A_2$ and so
$\mu(E_1 \Delta E_2) = 0$. We then deduce that
$\mu(E_1) = \mu(E_2) = \mu(E_1 \cap E_2)$. This shows that $\tilde{\mu}$ is
well-defined. Using the latter definition of $\tilde{S}$ in terms of unions
rather than symmetric differences, it is very easy to check that
$\tilde{\mu}$ defines a measure on $\tilde{S}$. Also it is clear that
$\tilde{\mu}$ extends µ.

Now let E, A ∈ S be such that µ(E) = µ(A) = 0 and let N ⊂ A. Then
$\tilde{\mu}(E \cup N) = \mu(E \setminus A) = 0$ (cf. (1.4.1)). If
F ⊂ E ∪ N, then F ⊂ E ∪ A and µ(E ∪ A) = 0. Consequently,
$F \in \tilde{S}$. This shows that $\tilde{\mu}$ is complete. ∎
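The completion can be computed explicitly on a small example. In the sketch below, which rests on our own choice of data, X = {0, 1, 2}, the σ-ring is S = {∅, {0}, {1, 2}, X} and µ gives mass 1 to {0} and mass 0 to {1, 2}. The code uses the description of the completed σ-ring in terms of unions E ∪ N, with the value of the completed measure taken to be µ(E), and checks along the way that two representations of the same set never give different values.

```python
from itertools import combinations

def subsets(S):
    """All subsets of S as frozensets."""
    S = list(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

# A measure on the sigma-ring {∅, {0}, {1,2}, {0,1,2}}, stored as a dict.
MU = {
    frozenset(): 0,
    frozenset({0}): 1,
    frozenset({1, 2}): 0,
    frozenset({0, 1, 2}): 1,
}

def completion(mu):
    """Adjoin subsets N of null sets: return the dict E ∪ N -> mu(E)."""
    null_parts = {N for A, m in mu.items() if m == 0 for N in subsets(A)}
    completed = {}
    for E, m in mu.items():
        for N in null_parts:
            F = E | N
            # Well-definedness: an earlier representation must give the same value.
            assert completed.get(F, m) == m
            completed[F] = m
    return completed

MU_TILDE = completion(MU)
for F in sorted(MU_TILDE, key=lambda s: (len(s), sorted(s))):
    print(sorted(F), MU_TILDE[F])
```

In this example the completed σ-ring is the whole power set of X: the subsets {1} and {2} of the null set {1, 2} have been adjoined, together with their unions with {0}.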


Now, if µ is a measure defined on a ring R of subsets of a non-empty
set X, we have two complete extensions of µ: first, the measure µ̄ defined
on the σ-ring S of µ∗-measurable sets, described in the previous section,
and, second, the measure $\tilde{\mu}$ on the σ-ring $\tilde{S}$ got by
adjoining subsets of sets of measure zero to sets in the σ-ring S(R). If µ
is σ-finite, these two processes yield the same measure, as the following
theorem shows.

Theorem 1.4.2 Let µ be a σ-finite measure on a ring R of subsets of a
non-empty set X. Let µ̄ be its extension to S(R), the σ-ring generated
by R, and to S, the set of all µ∗-measurable sets. Let $\tilde{\mu}$ be the
measure on the σ-ring $\tilde{S}$ got by adjoining subsets of sets of
measure zero in S(R) to sets in S(R). Then $\tilde{S} = S$ and
$\tilde{\mu} = \bar{\mu}$.

Proof: Since S is a σ-ring containing S(R) and since µ̄ is complete,
it follows that $\tilde{S} \subset S$. Further, if E, A ∈ S(R) with
µ̄(A) = 0 and if N ⊂ A, we have, by definition,

$$\tilde{\mu}(E \cup N) = \bar{\mu}(E) = \bar{\mu}(E \cup N)$$

since

$$\begin{aligned}
\bar{\mu}(E \cup N) = \mu^*(E \cup N) &\le \mu^*(E) + \mu^*(N) = \mu^*(E)\\
&= \bar{\mu}(E) \le \bar{\mu}(E \cup N).
\end{aligned}$$

The proof will, therefore, be complete if we show that
$S \subset \tilde{S}$. Let E ∈ S be such that µ̄(E) = µ∗(E) < +∞. Then,
by Proposition 1.3.6, there exists a measurable cover F of E such that
µ̄(F) = µ∗(F) = µ∗(E) = µ̄(E). Recall that F ∈ S(R) and that E ⊂ F. Then,
since µ̄ is a measure, we have that µ∗(F\E) = µ̄(F\E) = 0. Now, let
G ∈ S(R) be a measurable cover of F\E, with µ̄(G) = 0. Then,

$$E = (F \setminus G) \cup (E \cap G).$$

Since F\G ∈ S(R) and since E ∩ G ⊂ G where G ∈ S(R) with measure
zero, it follows that $E \in \tilde{S}$. Since µ is σ-finite, we have that
µ̄ is σ-finite and so any set E ∈ S can be expressed as the countable
union of sets of finite measure and so it follows that
$S \subset \tilde{S}$ and the proof is complete. ∎


1.5 Exercises

1.1 Let $X = \mathbb{R}^N$, N ≥ 2. Let

$$P = \Big\{ \prod_{i=1}^{N} [a_i, b_i) \;\Big|\; a_i, b_i \in \mathbb{R},\ a_i \le b_i,\ 1 \le i \le N \Big\}.$$

Let R be the collection of all finite unions of members of P. Show that
R is a ring.

1.2 Let X be an uncountable set. Let R be the collection of all at most
countable subsets of X, including the empty set. Show that R is a ring.
Is it a σ-ring? Is it a σ-algebra?

1.3 Let R be a ring of subsets of a non-empty set X. Define

$$S = \{F \subset X \mid F \in R \text{ or } F^c \in R\}.$$

Show that S is the smallest algebra containing R.



1.4 Let X be a non-empty set and let E ⊂ X. Let $\mathcal{E} = \{E\}$.
Compute $R(\mathcal{E})$.

1.5 Let X be a non-empty set and let E ⊂ X. Let
$\mathcal{E} = \{F \subset X \mid E \subset F\}$. Compute $R(\mathcal{E})$.

1.6 Let X = ℕ. Let R be the collection of all finite subsets (including
the empty set) and their complements. Show that R is an algebra.

1.7 Let R be a ring of subsets of a non-empty set X. Let µ be a measure
on R. If E, F ∈ R, show that

µ(E) + µ(F ) = µ(E ∪ F ) + µ(E ∩ F ).

1.8 Let R be a ring of subsets of a non-empty set X. Let µ be a
non-negative, finite and additive set function on R. If either µ is
continuous from below for every set E ∈ R or if it is continuous from
above for E = ∅, show that µ is a measure on R.

1.9 Let X = ℕ and let R be the ring described in Exercise 1.6 above.
Define, for E ∈ R,

$$\mu(E) = \begin{cases} +\infty, & \text{if } E \text{ is an infinite set},\\ 0, & \text{otherwise}. \end{cases}$$

Show that µ is continuous from above for E = ∅, but that µ is not a
measure. (This shows that finiteness is essential in the previous
exercise.)

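1.10 Let S be a σ-ring of subsets of a non-empty set X and let µ be a
measure on S. Let $\{E_i\}_{i=1}^{\infty}$ be a sequence of sets in S.
Define

$$\liminf_{n\to\infty} E_n = \bigcup_{n=1}^{\infty} \bigcap_{i=n}^{\infty} E_i, \qquad \limsup_{n\to\infty} E_n = \bigcap_{n=1}^{\infty} \bigcup_{i=n}^{\infty} E_i.$$

(a) Show that

$$\mu\big(\liminf_{n\to\infty} E_n\big) \le \liminf_{n\to\infty} \mu(E_n).$$

(b) If, for some n ∈ ℕ, we have
$\mu(\bigcup_{i=n}^{\infty} E_i) < +\infty$, show that

$$\mu\big(\limsup_{n\to\infty} E_n\big) \ge \limsup_{n\to\infty} \mu(E_n).$$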

1.11 Let H be a hereditary σ-ring of subsets of a non-empty set X and
let µ∗ be an outer-measure on H. If E, F ∈ H, and if at least one of
them is µ∗-measurable, show that

µ∗ (E) + µ∗ (F ) = µ∗ (E ∪ F ) + µ∗ (E ∩ F ).

1.12 Let E ⊂ ℝ. E is said to have an infinite condensation point if E has
uncountably many points outside every finite interval. Let H = P(ℝ).
Define, for E ⊂ ℝ,

$$\mu^*(E) = \begin{cases} 0, & \text{if } E \text{ is empty, finite or countable},\\ 1, & \text{if } E \text{ is uncountable, but without an infinite condensation point},\\ +\infty, & \text{if } E \text{ has an infinite condensation point}. \end{cases}$$

Show that
(i) µ∗ is a σ-finite outer-measure on H;
(ii) the only µ∗-measurable sets are at most countable sets or their
complements;
(iii) the induced measure µ is not σ-finite.

1.13 Let X be a non-empty set and let H = P(X). Let $\mu_i^*$, i = 1, 2
be two finite outer-measures on H. Let $S_i$, i = 1, 2 be the respective
classes of measurable sets. If $\mu^* = \mu_1^* + \mu_2^*$, show that µ∗ is
an outer-measure and that the class of µ∗-measurable sets is
$S_1 \cap S_2$.

1.14 Let µ be a measure on a σ-ring S of subsets of a non-empty set X.
Let µ̄ be the induced measure defined on $\bar{S}$, the σ-ring of
µ∗-measurable sets. Let A, B ∈ S be such that µ(B\A) = 0. If
A ⊂ E ⊂ B, show that $E \in \bar{S}$.

1.15 Let X be an uncountable set and let S be the collection of at most
countable sets (including the empty set) and their complements.
(a) Show that S is a σ-algebra.
(b) If µ is the counting measure on S, show that it is complete.
(c) Show that every subset of X is µ∗ -measurable.
(Thus, without σ-finiteness, the completion via µ∗ -measurability and
the completion as in Section 1.4 need not coincide.)

1.16 Let R be a ring of subsets of a non-empty set X and let µ be a
σ-finite measure on R. Show that for every set E ∈ S(R) and for every
ε > 0, there exists $E_0 \in R$ such that

$$\mu(E \Delta E_0) \le \varepsilon.$$
Chapter 2

The Lebesgue measure

2.1 Construction of the Lebesgue measure

We will now study in detail the construction and properties of the
Lebesgue measure on the euclidean space $\mathbb{R}^N$. We will start with
a measure on the ring which arises from the notion of the length of an
interval (area or volume of a box in higher dimensions) and extend it
to a complete measure on the class of measurable sets, as described in
Section 1.3. To simplify the exposition we will describe in detail the
construction on the real line, ℝ. The generalization to higher dimensions
will be obvious.

Let P denote the class of all intervals of the form [a, b), where a ≤
b, a, b ∈ ℝ. Let R be the ring of all finite unions of members of P (cf.
Example 1.1.3). As observed earlier, we can express each member of R
as a finite disjoint union of members of P. Let us define

$$\mu([a, b)) = b - a.$$

If a = b, then [a, b) = ∅ and we have µ(∅) = 0. We will now construct a
measure on R starting from µ. The definition is almost obvious: if E ∈
R is expressed as the disjoint union of intervals, i.e. if
$E = \bigcup_{j=1}^{k} I_j$, where $I_j$, 1 ≤ j ≤ k, are mutually disjoint
members of P, then, necessarily, we must have

$$\mu(E) = \sum_{j=1}^{k} \mu(I_j).$$

However, we need to check that this is well-defined and also that it
satisfies the properties of a measure. In particular, we need to verify
that this set function is countably additive on R.

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019
S. Kesavan, Measure and Integration, Texts and Readings in
Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_2

We start by formally proving a few fairly obvious properties of µ,
defined on P.

Lemma 2.1.1 (a) Let $\{E_i\}_{i=1}^{n}$ be a finite set of mutually
disjoint intervals in P, such that each of them is contained in
$E_0 \in P$. Then

$$\sum_{i=1}^{n} \mu(E_i) \le \mu(E_0). \tag{2.1.1}$$

(b) Let $F = [a_0, b_0]$ be a finite closed interval contained in the
finite union of open intervals $U_i$, where $U_i = (a_i, b_i)$,
1 ≤ i ≤ n. Then

$$b_0 - a_0 < \sum_{i=1}^{n} (b_i - a_i). \tag{2.1.2}$$

Proof: (a) Let $E_i = [a_i, b_i)$, 0 ≤ i ≤ n. Since the $E_i$,
1 ≤ i ≤ n, are disjoint, we have (by renumbering the sets, if necessary)
that

$$a_0 \le a_1 < b_1 \le a_2 < b_2 \le \cdots < b_{i-1} \le a_i < b_i \le a_{i+1} < \cdots < b_n \le b_0.$$

Thus,

$$\begin{aligned}
\sum_{i=1}^{n} \mu(E_i) = \sum_{i=1}^{n} (b_i - a_i) &\le \sum_{i=1}^{n} (b_i - a_i) + \sum_{i=1}^{n-1} (a_{i+1} - b_i)\\
&= b_n - a_1 \le b_0 - a_0 = \mu(E_0).
\end{aligned}$$

This proves (2.1.1).

(b) We may renumber the intervals $U_i$, after getting rid of the
superfluous ones, so that we have

$$b_i \in (a_{i+1}, b_{i+1}) = U_{i+1}, \quad 1 \le i \le m - 1,$$

where m ≤ n. Also $a_0 \in U_1$ and $b_0 \in U_m$. Then

$$b_0 - a_0 < b_m - a_1 = (b_1 - a_1) + \sum_{i=1}^{m-1} (b_{i+1} - b_i) \le \sum_{i=1}^{m} (b_i - a_i),$$

which proves (2.1.2), since m ≤ n. ∎



Proposition 2.1.1 If $\{E_i\}_{i=0}^{\infty}$ is a sequence in P such that
$E_0 \subset \bigcup_{i=1}^{\infty} E_i$, then,

$$\mu(E_0) \le \sum_{i=1}^{\infty} \mu(E_i). \tag{2.1.3}$$

Proof: The result is trivially true if $E_0 = \emptyset$. Let
$E_i = [a_i, b_i)$, 0 ≤ i < ∞, with $a_i < b_i$ for all i, and choose
ε > 0 such that $0 < \varepsilon < b_0 - a_0$. Let δ > 0 be an arbitrarily
small positive quantity. Set

$$F_0 = [a_0, b_0 - \varepsilon], \qquad U_i = \Big(a_i - \frac{\delta}{2^i},\ b_i\Big), \quad 1 \le i < \infty.$$

Then, $F_0 \subset E_0$ and $E_i \subset U_i$, 1 ≤ i < ∞. Thus
$F_0 \subset \bigcup_{i=1}^{\infty} U_i$. Since $F_0$ is compact, there
exists a positive integer n such that
$F_0 \subset \bigcup_{i=1}^{n} U_i$. It now follows from the preceding
lemma (cf. (2.1.2)) that

$$b_0 - a_0 - \varepsilon < \sum_{i=1}^{n} \Big(b_i - a_i + \frac{\delta}{2^i}\Big) \le \sum_{i=1}^{\infty} (b_i - a_i) + \delta.$$

In other words,

$$\mu(E_0) - \varepsilon < \sum_{i=1}^{\infty} \mu(E_i) + \delta.$$

The result now follows since ε and δ are arbitrarily small quantities. ∎
Proposition 2.1.2 The set function µ is countably additive on P.
Proof: Let $E_0 = \bigcup_{i=1}^{\infty} E_i$, where
$\{E_i\}_{i=1}^{\infty}$ is a sequence of mutually disjoint members of P.
Assume that $E_0 \in P$ as well. By the preceding proposition, we have

$$\mu(E_0) \le \sum_{i=1}^{\infty} \mu(E_i).$$

On the other hand, for any positive integer n, we have by Lemma 2.1.1
(cf. (2.1.1)),

$$\sum_{i=1}^{n} \mu(E_i) \le \mu(E_0)$$

which yields

$$\sum_{i=1}^{\infty} \mu(E_i) \le \mu(E_0)$$

from which the result follows. ∎

Theorem 2.1.1 There exists a unique finite measure µ on R which
extends µ defined on P.

Proof: Let E ∈ R. Then $E = \bigcup_{i=1}^{n} E_i$, where
$\{E_i\}_{i=1}^{n}$ is a collection of mutually disjoint intervals in P.
Then the only possible way we can define a measure on R is by setting

$$\mu(E) = \sum_{i=1}^{n} \mu(E_i).$$

This is obviously an extension of µ defined on P and so, in particular,
µ(∅) = 0 and µ(E) ≥ 0 for all E ∈ R. However, we need to verify that
µ is well-defined. Let

$$E = \bigcup_{i=1}^{n} E_i = \bigcup_{j=1}^{m} F_j,$$

where $\{E_i\}_{i=1}^{n}$ and $\{F_j\}_{j=1}^{m}$ are two collections of
mutually disjoint intervals in P. Then, for each 1 ≤ i ≤ n, we can write

$$E_i = \bigcup_{j=1}^{m} (E_i \cap F_j).$$

Each set $E_i \cap F_j$ is either empty, or is a non-empty interval in P.
Hence, by Proposition 2.1.2, since µ is also finitely additive in P, we
have

$$\mu(E_i) = \sum_{j=1}^{m} \mu(E_i \cap F_j).$$

Similarly, for each 1 ≤ j ≤ m, we have

$$\mu(F_j) = \sum_{i=1}^{n} \mu(F_j \cap E_i).$$

Thus,

$$\sum_{i=1}^{n} \mu(E_i) = \sum_{i=1}^{n} \sum_{j=1}^{m} \mu(E_i \cap F_j) = \sum_{j=1}^{m} \sum_{i=1}^{n} \mu(E_i \cap F_j) = \sum_{j=1}^{m} \mu(F_j).$$

This establishes that µ is well-defined on R. It is also clear, from the
definition of µ, that it is finitely additive on R.

We now need to show that µ is countably additive on R. Let
$E = \bigcup_{i=1}^{\infty} E_i$, where E ∈ R and $\{E_i\}_{i=1}^{\infty}$
is a collection of mutually disjoint sets in R. Since $E_i \in R$, we can
write

$$E_i = \bigcup_{k=1}^{n_i} E_{ik}, \quad 1 \le i < \infty,$$

where $\{E_{ik}\}_{k=1}^{n_i}$ is a finite collection of mutually disjoint
intervals in P. Notice that, for each 1 ≤ i < ∞, we have, by definition
of µ,

$$\mu(E_i) = \sum_{k=1}^{n_i} \mu(E_{ik}).$$

Case 1: E ∈ P.
In this case, since

$$E = \bigcup_{i=1}^{\infty} \bigcup_{k=1}^{n_i} E_{ik}$$

is a countable disjoint union within P, we have, by Proposition 2.1.2,
that

$$\mu(E) = \sum_{i=1}^{\infty} \sum_{k=1}^{n_i} \mu(E_{ik}) = \sum_{i=1}^{\infty} \mu(E_i).$$

Case 2: E = ∪nj=1 Fj ,
where Fj ∈ P for each 1 ≤ j ≤ n, and the Fj ’s are
all mutually disjoint. Then, for each 1 ≤ j ≤ n, we have

Fj = ∪ ∞
i=1 Fj ∩ Ei .

Thus {Fj ∩ Ei }∞
i=1 is a collection of mutually disjoint sets in R whose
union is Fj ∈ P. Thus, by Case 1 above, we have

X
µ(Fj ) = µ(Fj ∩ Ei ).
i=1

Hence, by definition of µ on R, we deduce that


n
X X ∞
n X ∞ X
X n
µ(E) = µ(Fj ) = µ(Fj ∩ Ei ) = µ(Fj ∩ Ei ).
j=1 j=1 i=1 i=1 j=1

On the other hand, we have, for each 1 ≤ i < ∞,

Ei = ∪nj=1 Ei ∩ Fj .
2.1 Construction of the Lebesgue measure 35

Now, {Ei ∩ Fj }nj=1 is a finite collection of mutually disjoint sets in R


and so, by finite additivity, we have
n
X
µ(Ei ) = µ(Ei ∩ Fj )
j=1

and so we deduce that



X
µ(E) = µ(Ei )
i=1

which proves the countable additivity of µ in the general case as well.


This completes the proof. 
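The well-definedness argument in the proof can be checked on a small example. The following is only an illustrative sketch: the helper names `mu` and `refine` and the sample decompositions are hypothetical, and intervals [a, b) are modelled as pairs (a, b) of rational endpoints.

```python
from fractions import Fraction

def mu(intervals):
    """Sum of lengths b - a over a finite family of disjoint intervals [a, b)."""
    return sum(Fraction(b) - Fraction(a) for a, b in intervals)

def refine(first, second):
    """Common refinement: all non-empty intersections E_i ∩ F_j, as in the proof."""
    pieces = []
    for a, b in first:
        for c, d in second:
            lo, hi = max(a, c), min(b, d)
            if lo < hi:  # [lo, hi) is non-empty
                pieces.append((lo, hi))
    return pieces

# Two different decompositions of the same set E = [0, 2) ∪ [3, 5)
E_parts = [(0, 1), (1, 2), (3, 5)]
F_parts = [(0, 2), (3, 4), (4, 5)]

# Both decompositions, and their common refinement, give the same value.
assert mu(E_parts) == mu(F_parts) == mu(refine(E_parts, F_parts)) == 4
```

Exact rational arithmetic is used so that the equality of the three sums is tested exactly, not up to rounding.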

Remark 2.1.1 In the same way, when N ≥ 2, if
\[ P = \left\{ \prod_{i=1}^{N} [a_i, b_i) \mid a_i \le b_i,\ a_i, b_i \in \mathbb{R},\ 1 \le i \le N \right\}, \]
and R is the ring of finite (disjoint) unions of members of P (cf. Exercise 1.1), we can define a unique measure on R which, when restricted to P, is given by
\[ \mu\left( \prod_{i=1}^{N} [a_i, b_i) \right) = \prod_{i=1}^{N} (b_i - a_i). \]

Let B = S(R) be the σ-ring generated by R. Sets in B are called


Borel sets and B is called the Borel σ-algebra. Since

\[ \mathbb{R} = \cup_{n \in \mathbb{Z}} [n, n + 1), \]

we see that B is, indeed, a σ-algebra. Thus, the hereditary σ-ring gen-
erated by R is clearly the power set of R. We can now define the
induced outer-measure µ∗ for all subsets of R. The collection L of all
µ∗ -measurable sets is thus a σ-algebra which is called the Lebesgue σ-
algebra and its members are called the Lebesgue measurable sets;
the induced measure on this σ-algebra is called the Lebesgue measure
on R. It is clear that the Lebesgue measure is σ-finite and complete.
Thus the Lebesgue measure is the completion of the measure induced
on the Borel σ-algebra (cf. Theorem 1.4.2) by µ.

Remark 2.1.2 In an identical manner, we can define the Borel and


Lebesgue σ-algebras in RN , N ≥ 2, starting from the class P, the ring
R and the measure µ defined in Remark 2.1.1. We will then get the
Lebesgue measure in RN which will be a σ-finite complete measure, and

which is also the completion of the measure induced on the Borel σ-


algebra by µ. 

Notation: We will denote the Lebesgue measure on $\mathbb{R}^N$ by the symbol $m_N$. In particular, the Lebesgue measure on R will be denoted by $m_1$. The Borel and Lebesgue σ-algebras on $\mathbb{R}^N$ will be denoted, respectively, by $B_N$ and $L_N$.
Definition 2.1.1 Any measure defined on the Borel σ-algebra, BN , will
be called a Borel measure on RN . 

Proposition 2.1.3 Every countable set in R is a Borel set of measure


zero.

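Proof: Let a ∈ R. Then
\[ \{a\} = \cap_{n=1}^{\infty} \left[ a, a + \frac{1}{n} \right) \in B_1. \]
Thus singletons are all in $B_1$ and, consequently, any countable set is in $B_1$. Further, by Proposition 1.2.5, it follows that
\[ m_1(\{a\}) = \lim_{n \to \infty} m_1\left( \left[ a, a + \frac{1}{n} \right) \right) = \lim_{n \to \infty} \frac{1}{n} = 0. \]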

Thus, the measure of any countable set will also be zero. 


Proposition 2.1.4 The Borel σ-algebra is also the σ-algebra generated
by all the open sets in R.
Proof: Let a, b ∈ R. Then (a, b) = [a, b)\{a} ∈ B1 . Since any open
set can be written as the countable union of such intervals, it follows
that every open set is contained in the Borel σ-algebra, B1 . Hence, the
σ-algebra generated by open sets is also contained in B1 . Conversely,
[a, b) = (a, b) ∪ {a} and
\[ \{a\} = \cap_{n=1}^{\infty} \left( a - \frac{1}{n}, a + \frac{1}{n} \right). \]
Consequently, P, and hence, R and B1 are all contained in the σ-algebra
generated by the open sets. This completes the proof. 

Remark 2.1.3 It is an easy exercise to adapt the proofs of the preceding


two propositions to the case RN , N ≥ 2 as well. Thus countable sets in

RN are Borel sets of measure zero and BN is the σ-algebra generated by


the open sets in RN (with its usual topology). 

Example 2.1.1 It follows from Proposition 2.1.3 that

m1 ((a, b)) = m1 ([a, b)) = m1 ([a, b]) = b − a.

Since R is the countable disjoint union of the intervals [n, n + 1), as n


varies over Z, it follows that m1 (R) = +∞.

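Now consider any interval [a, b) as the set $[a, b) \times \{0\} \subset \mathbb{R}^2$. Then
\[ [a, b) \times \{0\} = \cap_{n=1}^{\infty} \left( [a, b) \times \left[ 0, \frac{1}{n} \right) \right). \]
Then, again, it follows, from Proposition 1.2.5, that
\[ m_2([a, b) \times \{0\}) = \lim_{n \to \infty} (b - a) \frac{1}{n} = 0. \]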
Since the real line, considered as a coordinate axis in R2 , can be written
as the countable disjoint union of sets of the form [n, n + 1) × {0}, it
follows that
m2 (R) = 0.
More generally, the Lebesgue measure in RN of any proper linear sub-
space, will be zero. 

Example 2.1.2 (i) Since any non-empty open set contains a non-empty
open interval, it follows that the Lebesgue measure of any non-empty
open set is strictly positive.
(ii) The set of rationals, Q, being countable, has Lebesgue measure zero.
Thus Q is an example of a dense set which has measure zero. Its complement, the set of irrationals, is a dense set of infinite measure.
(iii) If E ⊂ R is a measurable set of measure zero, then it cannot contain
any non-empty open set. Thus, every non-empty open set will intersect
E c and so E c will be dense.
(iv) If K ⊂ R is a compact set, then it is bounded and closed and so it
has finite measure. 

Example 2.1.3 The Cantor Set


Let X = [0, 1]. Set $X_1 = \left( \frac{1}{3}, \frac{2}{3} \right)$. Let $X_2$ be the union of the open middle-thirds of the subintervals of $X \setminus X_1$, i.e.
\[ X_2 = \left( \frac{1}{9}, \frac{2}{9} \right) \cup \left( \frac{7}{9}, \frac{8}{9} \right). \]

Now let X3 be the union of the open middle-thirds of the four subinter-
vals of X\(X1 ∪ X2 ) and so on. Set

\[ C = X \setminus \cup_{n=1}^{\infty} X_n. \]

The set C is called the Cantor set.


(i) Since each Xn is open, it follows that C is a closed set.
(ii) It is easy to see that
\[ m_1(X_n) = \frac{2^{n-1}}{3^n}. \]
Since all these sets $X_n$ are disjoint, we have
\[ m_1\left( \cup_{n=1}^{\infty} X_n \right) = \sum_{n=1}^{\infty} \frac{2^{n-1}}{3^n} = 1 = m_1([0, 1]). \]
Consequently, $m_1(C) = 0$.
(iii) It then follows that since it is closed and has measure zero, it cannot
contain a non-empty open set. Thus C is nowhere dense.
(iv) Let x ∈ C. If (a, b) is any interval containing x, then, for n suffi-
ciently large, it must also contain a sub-interval of Xn . End-points of all
such sub-intervals are in C. Thus no point of C is isolated. Since C is
closed as well, it follows that C is a perfect set and so it is uncountable
(cf. Rudin [7]). Thus C is an example of an uncountable set of measure
zero.
(v) Another way of proving the uncountability of the Cantor set is as follows. Consider the ternary expansion of real numbers in [0, 1]:
\[ x = \sum_{n=1}^{\infty} a_n 3^{-n}, \]

where an = 0, 1 or 2. To determine an , we proceed in the following


manner. First we divide the interval [0, 1] into three equal sub-intervals.
Then a1 = 0, 1 or 2 according as x falls in the first, second or third
sub-interval, respectively. If it falls at one of the nodes of the partition,
then the expansion of x has only one term. Next, we divide each of

the above subintervals into three equal parts and again a2 = 0, 1 or 2,


according as x falls in the first, second or third of the intervals of the
block in which it falls. We proceed inductively to determine all the an .
The expansion will terminate at a finite stage, if x falls at a node of the
partition at some level.

It is now clear that the Cantor set contains the set of all points in
[0, 1] with infinite ternary expansion such that the digits in its ternary
expansion are either 0 or 2. Now a simple Cantor diagonalisation argu-
ment shows that C is uncountable. 
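The two computations in this example (the total length removed, and membership in the Cantor set) can be explored numerically. This is a rough sketch under stated assumptions: the function names are hypothetical, only finitely many levels are examined, and membership is tested by rescaling thirds, which is equivalent to reading off ternary digits.

```python
from fractions import Fraction

def removed_measure(levels):
    """Total length of X_1 ∪ ... ∪ X_levels: the sum of 2^(n-1) / 3^n."""
    return sum(Fraction(2 ** (n - 1), 3 ** n) for n in range(1, levels + 1))

def in_cantor_approx(x, levels):
    """True if x in [0, 1] avoids the open middle thirds X_1, ..., X_levels."""
    x = Fraction(x)
    for _ in range(levels):
        if Fraction(1, 3) < x < Fraction(2, 3):
            return False  # x lies in an open middle third
        # rescale the left or the right third back onto [0, 1]
        x = 3 * x if x <= Fraction(1, 3) else 3 * x - 2
    return True

# The partial sums equal 1 - (2/3)^n, which tends to 1 = m_1([0, 1]).
assert removed_measure(10) == 1 - Fraction(2, 3) ** 10
# 1/4 = 0.020202... in base 3 is never removed; 1/2 is removed at the first step.
assert in_cantor_approx(Fraction(1, 4), 50)
assert not in_cantor_approx(Fraction(1, 2), 1)
```

Note that 1/4 is not an end-point of any removed interval, so the Cantor set contains more than the end-points used in part (iv).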

We have three distinguished collections of subsets of RN , viz. the


Borel σ-algebra, BN , the Lebesgue σ-algebra, LN and, the power set of
RN , P(RN ). We have

BN ⊂ LN ⊂ P(RN ).

One would, naturally, like to know if these inclusions are strict. We will
show, in the sequel, that these inclusions are, indeed, strict. To start
with, since the Lebesgue measure is complete, we have that every subset
of C is Lebesgue measurable. Since C is uncountable, the cardinality of
L1 is $2^{c}$, where c is the cardinality of the continuum. It can be shown
that the cardinality of B1 is just c. Consequently the inclusion B1 ⊂ L1
is strict. We will later define the Cantor function and use it to prove the
existence of a Lebesgue measurable set which is not Borel measurable.
We will also prove the existence of sets in R which are not Lebesgue
measurable.

2.2 Approximation

In this section, we will present various results on the approximation of measurable sets by topological sets in the context of the Lebesgue measure.

By a box in $\mathbb{R}^N$, N ≥ 1, we will mean a set of the form
\[ B = \prod_{j=1}^{N} I_j, \]
where each set $I_j$, 1 ≤ j ≤ N, is a finite interval in R. If all the $I_j$ are open intervals, we will call it an open box and if they are all closed, we will call it a closed box. If all the $I_j$ are of the form $[a_j, b_j)$, where $a_j < b_j$, we will call it a half-open box.

Clearly, if B ⊂ RN is a box as above, we have


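\[ m_N(B) = \prod_{j=1}^{N} m_1(I_j). \tag{2.2.1} \]
It is also clear that given ε > 0, we can always find boxes $B_1^{\varepsilon}$ and $B_2^{\varepsilon}$ which are of any kind (open, closed or half-open) such that
\[ B_1^{\varepsilon} \subset B \subset B_2^{\varepsilon} \]
and such that
\[ m_N(B \setminus B_1^{\varepsilon}) < \varepsilon \quad \text{and} \quad m_N(B_2^{\varepsilon} \setminus B) < \varepsilon. \]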
Recall that the class P is the collection of all half-open boxes and R
is the ring generated by P. The measure µ on R is the same as mN ,
given by (2.2.1) on P, and extended by finite additivity to R. We will
denote the corresponding outer-measure defined on all subsets of RN by
µ∗ . Thus µ∗ restricted to LN is mN . Unless specified otherwise, the
term measurable set will mean a Lebesgue measurable set.
Proposition 2.2.1 Let E ⊂ RN , N ≥ 1. Then
µ∗ (E) = inf{µ∗ (U ) |E ⊂ U, U an open set}.
Proof: The result is obvious if µ∗ (E) = +∞. So let us assume that
µ∗ (E) < +∞. Since E ⊂ U implies that µ∗ (E) ≤ µ∗ (U ), it is immediate
to see that
µ∗ (E) ≤ inf{µ∗ (U ) | E ⊂ U, U an open set}.
Let ε > 0. Then, by the definition of the outer-measure (cf. Proposition 1.3.1), there exist half-open boxes $B_n$ such that $E \subset \cup_{n=1}^{\infty} B_n$ and such that
\[ \sum_{n=1}^{\infty} m_N(B_n) < \mu^*(E) + \frac{\varepsilon}{2}. \]
Now, construct open boxes $\{B_n'\}_{n=1}^{\infty}$ such that $B_n \subset B_n'$ and $m_N(B_n' \setminus B_n) < \frac{\varepsilon}{2^{n+1}}$. Set $U = \cup_{n=1}^{\infty} B_n'$. Then U is an open set and E ⊂ U. Further
\[ \mu^*(U) = m_N(U) \le \sum_{n=1}^{\infty} m_N(B_n) + \frac{\varepsilon}{2}. \]
Consequently, we have E ⊂ U and $\mu^*(U) < \mu^*(E) + \varepsilon$. This completes the proof.
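The last estimate in this proof compresses a short computation. Written out, assuming only countable subadditivity of $m_N$ and the choice of the enlarged open boxes (denoted $B_n'$ here), it reads as follows.

```latex
\begin{aligned}
m_N(U) &\le \sum_{n=1}^{\infty} m_N(B_n')
        = \sum_{n=1}^{\infty} \bigl( m_N(B_n) + m_N(B_n' \setminus B_n) \bigr) \\
       &\le \sum_{n=1}^{\infty} m_N(B_n) + \sum_{n=1}^{\infty} \frac{\varepsilon}{2^{n+1}}
        = \sum_{n=1}^{\infty} m_N(B_n) + \frac{\varepsilon}{2}
        < \mu^*(E) + \varepsilon .
\end{aligned}
```

The geometric series $\sum_{n \ge 1} 2^{-(n+1)} = \frac{1}{2}$ is what makes the total enlargement cost exactly $\varepsilon/2$.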

Proposition 2.2.2 Let E ⊂ RN . The following statements are equiva-


lent.
(i) The set E is Lebesgue measurable.
(ii) Given any ε > 0, there exists an open set U such that E ⊂ U and
such that µ∗ (U \E) < ε.
(iii) Given any ε > 0, there exists a closed set F such that F ⊂ E and
such that µ∗ (E\F ) < ε.
(iv) There exists a Gδ set G such that E ⊂ G and such that µ∗ (G\E) =
0.
(v) There exists an Fσ set F such that F ⊂ E and such that µ∗ (E\F ) =
0.

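Proof: (i) ⇒ (ii): If $\mu^*(E) < +\infty$, then, by the previous proposition, there exists an open set U containing E such that $\mu^*(U) < \mu^*(E) + \varepsilon$. Since E is Lebesgue measurable, we have that $\mu^* = m_N$ and since a measure is subtractive, we deduce that $\mu^*(U \setminus E) < \varepsilon$. If $\mu^*(E) = +\infty$, then since $\mu^* = m_N$ is σ-finite, we can find disjoint measurable sets $E_n$ such that each of them has finite measure and such that $E = \cup_{n=1}^{\infty} E_n$. Then, we can find open sets $U_n$ such that $E_n \subset U_n$ and such that $\mu^*(U_n \setminus E_n) < \frac{\varepsilon}{2^n}$. Then $U = \cup_{n=1}^{\infty} U_n$ is open, contains E and
\[ \mu^*(U \setminus E) \le \mu^*\left( \cup_{n=1}^{\infty} (U_n \setminus E_n) \right) \le \sum_{n=1}^{\infty} \mu^*(U_n \setminus E_n) < \sum_{n=1}^{\infty} \frac{\varepsilon}{2^n} = \varepsilon. \]

(ii) ⇒ (iv): For each positive integer n, choose $U_n$ open such that $E \subset U_n$ and $\mu^*(U_n \setminus E) < \frac{1}{n}$. Set $G = \cap_{n=1}^{\infty} U_n$. Then G is a $G_\delta$ set containing E and, for each n,
\[ \mu^*(G \setminus E) \le \mu^*(U_n \setminus E) < \frac{1}{n}, \]
from which we deduce that $\mu^*(G \setminus E) = 0$.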

(iv) ⇒ (i): By completeness of the Lebesgue measure, if $\mu^*(G \setminus E) = 0$, it follows that $G \setminus E$ is measurable. Since G is a $G_\delta$ set, it is measurable as well. Now
\[ E = G \setminus (G \setminus E) \]
and so E is measurable as well.

(i) ⇒ (iii): If E is measurable, so is $E^c$. Hence, there exists an open set U containing $E^c$ and such that $\mu^*(U \setminus E^c) < \varepsilon$. Then $F = U^c$ is closed and is contained in E. Further
\[ \mu^*(E \setminus F) = \mu^*(E \cap F^c) = \mu^*(E \cap U) = \mu^*(U \setminus E^c) < \varepsilon. \]

(iii) ⇒ (v): For every positive integer n, choose $F_n$, a closed set contained in E, such that $\mu^*(E \setminus F_n) < \frac{1}{n}$. Set $F = \cup_{n=1}^{\infty} F_n$. Then F is an $F_\sigma$ set contained in E. Further,
\[ \mu^*(E \setminus F) \le \mu^*(E \setminus F_n) < \frac{1}{n} \]
for each n, from which it follows that $\mu^*(E \setminus F) = 0$.

(v) ⇒ (i): Since µ∗ (E\F ) = 0, once again, by the completeness of the


Lebesgue measure, we have that E\F is measurable. Every Fσ set is
measurable as well. Thus

E = F ∪ (E\F )

is measurable as well. 

Proposition 2.2.3 Let E ⊂ RN be a measurable set of finite measure.


Given ε > 0, there exists a compact set K ⊂ E such that mN (E\K) < ε.

Proof: Step 1: Let η > 0 be an arbitrary positive number. By Proposition 2.2.2, there exists an open set V such that E ⊂ V and $m_N(V \setminus E) < \eta$. Let B(0; r) denote the open ball in $\mathbb{R}^N$ with centre at 0 and of radius r > 0; let the corresponding closed ball be denoted by $\overline{B}(0; r)$. If n is a positive integer, set $V_n = B(0; n) \cap V$. Then the sequence of open sets $\{V_n\}_{n=1}^{\infty}$ increases to V and since V also has finite measure, there exists a positive integer m such that $m_N(V \setminus V_m) < \eta$. Then $E \setminus V_m \subset V \setminus V_m$ and so $m_N(E \setminus V_m) < \eta$.

Step 2: Now, Vm is a bounded open set and again, by Proposition 2.2.2,


there exists a closed set F ⊂ Vm such that mN (Vm \F ) < η. Since F is
bounded and closed, it is compact.

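Step 3: Thus, for any ε > 0, and a set $E \subset \mathbb{R}^N$ of finite measure, we can find a bounded open set W such that $m_N(E \setminus W) < \frac{\varepsilon}{3}$. Further, by Step 2, there exists a compact set $K_1 \subset W$ such that $m_N(W \setminus K_1) < \frac{\varepsilon}{3}$. Finally, once again by Proposition 2.2.2, there exists a closed set $F_1 \subset E$ such that $m_N(E \setminus F_1) < \frac{\varepsilon}{3}$. Then $K = F_1 \cap K_1$ is a compact set contained in E and
\[ \begin{aligned} E \setminus K &= (E \setminus W) \cup ((E \cap W) \setminus F_1) \cup ((W \cap F_1) \setminus K_1) \\ &\subset (E \setminus W) \cup (E \setminus F_1) \cup (W \setminus K_1). \end{aligned} \]
Thus, $m_N(E \setminus K) < \varepsilon$. This completes the proof.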

Remark 2.2.1 Let E ⊂ RN be a measurable set. Then, by Proposition


2.2.1, we have
mN (E) = inf{mN (U ) |E ⊂ U, U is an open set}. (2.2.2)
If mN (E) < +∞, then, by Proposition 2.2.3, we have
mN (E) = sup{mN (K) | K ⊂ E, K is a compact set}. (2.2.3)
Any Borel measure µ (cf. Definition 2.1.1) satisfying (2.2.2) with µ in
the place of mN , is called outer-regular. If it satisfies (2.2.3) with µ
in place of mN , when µ(E) < +∞, it is said to be inner-regular. If
both are valid, we say that the measure is regular. Thus, the Lebesgue
measure is a regular measure. 
Definition 2.2.1 Given any set X and a subset A thereof, the characteristic function of A is the function $\chi_A : X \to \mathbb{R}$ defined by
\[ \chi_A(x) = \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{if } x \notin A. \end{cases} \]

Definition 2.2.2 Let $\Omega \subset \mathbb{R}^N$ be an open set. A step function defined on Ω is a function f of the form
\[ f = \sum_{j=1}^{k} \alpha_j \chi_{I_j}, \]
where the $\alpha_j$, 1 ≤ j ≤ k, are constants and the sets $I_j$, 1 ≤ j ≤ k, are boxes contained in Ω.

Proposition 2.2.4 Let $I \subset \mathbb{R}^N$ be a box. Let ε > 0 be an arbitrary positive number. Then, there exists a function $\varphi \in C_c(\mathbb{R}^N)$, the space of continuous real-valued functions with compact support, such that 0 ≤ ϕ(x) ≤ 1 for all x and such that
\[ m_N(\{x \in \mathbb{R}^N \mid \varphi(x) \ne \chi_I(x)\}) < \varepsilon. \]
Further, the support of ϕ will be contained in I.

Proof: We can find a closed box $J_1$ and an open box $J_2$, such that $J_1 \subset J_2 \subset \overline{J_2} \subset I$ and such that $m_N(I \setminus J_1) < \varepsilon$. By Urysohn's lemma, there exists a continuous function ϕ such that 0 ≤ ϕ(x) ≤ 1 for all x and such that ϕ(x) = 1 for all $x \in J_1$ and ϕ(x) = 0 for all $x \notin J_2$. Then, the support of ϕ is contained in $\overline{J_2} \subset I$, and so it is compact. Thus $\varphi \in C_c(\mathbb{R}^N)$. Now,
\[ \{x \in \mathbb{R}^N \mid \varphi(x) \ne \chi_I(x)\} \subset I \setminus J_1 \]
from which the result follows immediately.
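Corollary 2.2.1 Let $\Omega \subset \mathbb{R}^N$ be an open set and let f : Ω → R be a step function. Let ε > 0 be an arbitrary positive number. Then there exists $\varphi \in C_c(\Omega)$ such that
\[ m_N(\{x \in \Omega \mid \varphi(x) \ne f(x)\}) < \varepsilon, \]
and such that
\[ \max_{x \in \Omega} |\varphi(x)| \le \max_{x \in \Omega} |f(x)|. \]
Proof: Let $f = \sum_{j=1}^{k} \alpha_j \chi_{I_j}$ be a step function. Without loss of generality, we can assume that the boxes are disjoint. By the preceding proposition, there exist functions $\varphi_j \in C_c(\mathbb{R}^N)$, 1 ≤ j ≤ k, such that, for each such j, we have $0 \le \varphi_j \le 1$, the support of $\varphi_j$ is contained in the box $I_j$, and
\[ m_N(\{x \in \mathbb{R}^N \mid \varphi_j(x) \ne \chi_{I_j}(x)\}) < \frac{\varepsilon}{k}. \]
Set $\varphi = \sum_{j=1}^{k} \alpha_j \varphi_j$. Then
\[ \{x \in \Omega \mid \varphi(x) \ne f(x)\} \subset \cup_{j=1}^{k} \{x \in \Omega \mid \varphi_j(x) \ne \chi_{I_j}(x)\} \subset \cup_{j=1}^{k} \{x \in \mathbb{R}^N \mid \varphi_j(x) \ne \chi_{I_j}(x)\} \]
and so
\[ m_N(\{x \in \Omega \mid \varphi(x) \ne f(x)\}) < \varepsilon. \]
Since the supports of the $\varphi_j$, 1 ≤ j ≤ k, are all disjoint, it follows that
\[ \max_{x \in \Omega} |\varphi(x)| \le \max_{1 \le j \le k} |\alpha_j| = \max_{x \in \Omega} |f(x)|. \]
Finally, the function ϕ has compact support contained in $\cup_{j=1}^{k} I_j \subset \Omega$. Thus $\varphi \in C_c(\Omega)$. This completes the proof.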

We conclude this section with one more approximation result. In


order to prove it, we need a topological result.

Lemma 2.2.1 Every open set in RN can be written as a countable dis-


joint union of half-open boxes.
Proof: For a fixed positive integer n, let Fn denote the set of all points
in RN whose coordinates are all integral multiples of 2−n . Let Gn denote
the collection of all half-open boxes with each edge of length 2−n and
with vertices at the points of Fn . The following conclusions are obvious:
• For a fixed positive integer n, each point x ∈ RN belongs to exactly
one box in Gn .
• Let n > m. If $Q \in G_m$ and if $Q' \in G_n$, then either $Q' \subset Q$ or $Q \cap Q' = \emptyset$.
Let $\Omega \subset \mathbb{R}^N$ be an open set. Let x ∈ Ω. Then x lies in an open ball
contained in Ω and so, for sufficiently large n, we can find a box Q ∈ Gn
such that
x ∈ Q ⊂ Ω.
In other words, Ω is the union of all boxes contained within it and
belonging to the collection Gn , for some n. This collection of boxes is
clearly countable but may not be disjoint.
Now choose all those boxes in this collection which belong to G1 and
discard those of Gk , k ≥ 2, which are contained inside these selected
boxes. From the remaining collection of boxes in Ω, select those in G2
and discard those which are in Gk , k ≥ 3 and contained within these
selected boxes. Proceeding iteratively like this, we can express Ω as the
countable disjoint union of (half-open) boxes, as is obvious from the two
observations made above. 
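The selection procedure in this proof can be tried out in one dimension. The sketch below is an illustration only: `dyadic_cover` is a hypothetical helper, the open set is a single interval (a, b), and the construction is truncated at a finite level, so it produces only the first finitely many boxes of the countable union.

```python
from fractions import Fraction
from math import floor

def dyadic_cover(a, b, max_level):
    """Disjoint dyadic intervals [k/2^n, (k+1)/2^n) inside (a, b), levels 1..max_level."""
    a, b = Fraction(a), Fraction(b)
    chosen = []
    for n in range(1, max_level + 1):
        step = Fraction(1, 2 ** n)
        k = floor(a / step) + 1          # smallest k with k * step > a
        while (k + 1) * step <= b:       # [k*step, (k+1)*step) lies inside (a, b)
            lo, hi = k * step, (k + 1) * step
            # discard a box already contained in one selected at a coarser level
            if not any(c <= lo and hi <= d for c, d in chosen):
                chosen.append((lo, hi))
            k += 1
    return chosen

boxes = dyadic_cover(0, 1, 10)
# For (0, 1) level n contributes [2^-n, 2^-(n-1)), so the total length is 1 - 2^-10.
assert sum(hi - lo for lo, hi in boxes) == 1 - Fraction(1, 2 ** 10)
```

At each level the uncovered part of (a, b) has length at most twice the current side length, so the total length of the selected boxes tends to b − a as the level grows.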
Proposition 2.2.5 Let Ω ⊂ RN be an open set and let E ⊂ Ω be a
measurable set with finite measure. Then, for any ε > 0, there exists a
set F , which is a finite disjoint union of boxes, such that
mN (E∆F ) < ε.
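Proof: Let $G \subset \mathbb{R}^N$ be an open set such that E ⊂ G and such that $m_N(G \setminus E) < \frac{\varepsilon}{2}$ (cf. Proposition 2.2.2). Set $G' = \Omega \cap G$. Then $G'$ is also open, $E \subset G' \subset \Omega$ and $m_N(G' \setminus E) < \frac{\varepsilon}{2}$.
Now, $G'$ can be written as the countable disjoint union of half-open boxes, say, $\{I_j\}_{j=1}^{\infty}$, and so, for each j, we have $I_j \subset \Omega$. Since E has finite measure, so have G and $G'$. Thus
\[ \sum_{j=1}^{\infty} m_N(I_j) < +\infty. \]
Choose a positive integer k such that
\[ \sum_{j=k+1}^{\infty} m_N(I_j) < \frac{\varepsilon}{2}. \]
Set $F = \cup_{j=1}^{k} I_j$, which is a finite disjoint union of boxes. Since $F \subset G'$, we have
\[ m_N(F \setminus E) \le m_N(G' \setminus E) < \frac{\varepsilon}{2} \]
and
\[ m_N(E \setminus F) \le m_N(G' \setminus F) \le \sum_{j=k+1}^{\infty} m_N(I_j) < \frac{\varepsilon}{2}. \]
Since $E \Delta F = (E \setminus F) \cup (F \setminus E)$, it follows that $m_N(E \Delta F) < \varepsilon$.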

This completes the proof. 

2.3 Translation invariance

We will now study a very important property of the Lebesgue measure.

Lemma 2.3.1 Let Ω and Ω′ be open sets in $\mathbb{R}^N$. Let T : Ω → Ω′ be a bijection which is a homeomorphism. Then E ⊂ Ω is a Borel set if, and only if, T(E) is a Borel set.

Proof: Set
\[ S = \{E \subset \Omega \mid T(E) \text{ is a Borel set}\}. \]
Clearly Ω and ∅ are in S. Also, if E ⊂ Ω, $T(E^c) = (T(E))^c$ and if $\{E_i\}_{i=1}^{\infty}$ is a sequence of subsets of Ω, we have that $T(\cup_{i=1}^{\infty} E_i) = \cup_{i=1}^{\infty} T(E_i)$. It follows from these observations that S is closed under the formation of countable unions and under complementation. Thus, S is a σ-algebra on Ω. Since open sets get mapped onto open sets, we get that all open sets are in S. Then it follows that all Borel sets are in S as well. The converse follows by applying this reasoning to the map $T^{-1}$.

As a special case, let us consider $\Omega = \Omega' = \mathbb{R}^N$ and let $Tx = x + x_0$, where $x_0$ is a fixed point in $\mathbb{R}^N$. Thus E is a Borel set if, and only if, T(E) is a Borel set.

If I is a half-open box, then T (I) = I +x0 is also a half-open box and


both of them have the same measure. It is now immediate to see from
the definition of the induced outer-measure, µ∗ , that µ∗ (T (E)) = µ∗ (E),

for any set E ⊂ RN . The same is obviously true for T −1 as well.

Now let E be a Lebesgue measurable subset of RN . Let A ⊂ RN .


Then
\[ \begin{aligned} \mu^*(A \cap T(E)) + \mu^*(A \cap (T(E))^c) &= \mu^*(T(T^{-1}(A) \cap E)) + \mu^*(T(T^{-1}(A) \cap E^c)) \\ &= \mu^*(T^{-1}(A) \cap E) + \mu^*(T^{-1}(A) \cap E^c) \\ &= \mu^*(T^{-1}(A)) \\ &= \mu^*(A). \end{aligned} \]

This shows that T (E) is Lebesgue measurable. Applying this to T −1 ,


we deduce the following result.

Theorem 2.3.1 Let x0 ∈ RN be a fixed point. Let T (x) = x + x0 , x ∈


RN . Then E ⊂ RN is Lebesgue measurable if, and only if, T (E) is
Lebesgue measurable, and in this case,

mN (T (E)) = mN (E).  (2.3.1)

Definition 2.3.1 Let µ be a Borel measure defined on RN . We say


that it is translation invariant if for every Borel set E, and for every
mapping T : RN → RN such that T (x) = x + x0 for some x0 ∈ RN , we
have that µ(T (E)) = µ(E). 

Thus, the Lebesgue measure is translation invariant. In fact the properties of outer-regularity, translation invariance and finiteness of the measure for compact sets characterize the Lebesgue measure, as the following theorem shows.

Theorem 2.3.2 Let ν be a Borel measure on RN such that


(i) ν(K) < +∞ for every compact set K ⊂ RN ;
(ii) ν(E) = inf{ν(V ) | E ⊂ V, V an open set}, for every Borel set E ⊂
RN ; and
(iii) it is translation invariant.
Then, there exists a constant c > 0 such that ν(E) = cmN (E) for every
Borel set E ⊂ RN .

Proof: Let Q = [0, 1) × [0, 1) × · · · × [0, 1) (N times). Then $m_N(Q) = 1$. Let n ≥ 2 be an arbitrary positive integer. Q can be written as the disjoint union of $2^{Nn}$ boxes in the collection $G_n$ described in the proof of Lemma 2.2.1. Assume that ν(Q) = c > 0. Since ν is translation invariant, all the $2^{Nn}$ boxes of side $2^{-n}$ which make up Q will have the same measure. Let $\tilde{Q}$ be one such box. Then
\[ 2^{Nn} \nu(\tilde{Q}) = \nu(Q) = c = c\, m_N(Q) = c\, 2^{Nn} m_N(\tilde{Q}). \]
Thus, $\nu(\tilde{Q}) = c\, m_N(\tilde{Q})$ as well. Since any open set can be written as the countable disjoint union of such boxes (cf. Lemma 2.2.1), it follows that if V is any open set, then $\nu(V) = c\, m_N(V)$. Then, for any Borel set E, it follows, from condition (ii) in the statement of this theorem, that $\nu(E) = c\, m_N(E)$.

As an application of this result, we have the following theorem.


Theorem 2.3.3 Let A : RN → RN be a linear transformation. Let E
be any Borel set in $\mathbb{R}^N$. Then

mN (A(E)) = |det(A)|mN (E). (2.3.2)

Proof: Step 1: Let A be singular. Then det(A) = 0. Also, the range of


A will be a proper subspace of RN . Then, if E is a Borel set, A(E) is
contained in a proper subspace of RN and hence will have measure zero
(cf. Example 2.1.1). Thus (2.3.2) is valid in this case.

Step 2: Let A be non-singular. Then, by Lemma 2.3.1, A(E) is Borel


measurable whenever E is so. Define

ν(E) = mN (A(E)).

It is easy to see that ν is a Borel measure. If K ⊂ RN is compact, then


so is A(K). Thus, ν takes only finite values on compact sets.

If E ⊂ V , where V is an open set, then A(V ) is an open set and


A(E) ⊂ A(V ) and vice-versa. Thus

inf{ν(V ) | E ⊂ V, V an open set}


= inf{mN (A(V )) | E ⊂ V, V an open set}
= inf{mN (U ) | A(E) ⊂ U, U an open set}
= mN (A(E)) = ν(E).
Finally,

ν(E + x0 ) = mN (A(E + x0 )) = mN (A(E) + Ax0 ) = mN (A(E)) = ν(E).



Thus, ν is translation invariant.

Thus, by the preceding theorem, it follows that there exists a con-


stant cA such that mN (A(E)) = ν(E) = cA mN (E) for every Borel set
E ⊂ RN .

Step 3: It is easy to see that if A and B are two non-singular linear


transformations of RN onto itself, then cAB = cBA = cA cB .

Step 4: Let A be an orthogonal transformation. Then, if E is the


unit ball in RN , we have that A(E) = E. It follows from this that
cA = 1 = |det(A)| whenever A is orthogonal.

Step 5: Let A be represented by a diagonal matrix $\mathrm{diag}(\lambda_1, \cdots, \lambda_N)$, where $\lambda_i > 0$, 1 ≤ i ≤ N. Let E = [0, 1] × · · · × [0, 1] (N times). Then
\[ A(E) = \prod_{i=1}^{N} [0, \lambda_i]. \]
Thus
\[ m_N(A(E)) = \prod_{i=1}^{N} \lambda_i. \]
Once again, in this case, we have $c_A = \det(A) = |\det(A)|$.

Step 6: Given any non-singular matrix A, we can decompose it as A = RQ, where R is a positive definite matrix and Q is orthogonal. (It suffices to find a positive definite matrix R such that $R^2 = AA^T$ and set $Q = R^{-1}A$.) The positive definite matrix R can, in turn, be decomposed as $R = PDP^T$, where P is orthogonal and D is diagonal with positive diagonal entries. Notice then that det(D) = |det(A)|. The result now follows from the observations made in Steps 3 to 5.
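Formula (2.3.2) can be sanity-checked in the plane on the unit square, whose image under A is a parallelogram. The sketch below is hypothetical illustration code; it assumes, without proof here, that the Lebesgue measure of a polygon equals its elementary area, computed by the shoelace formula.

```python
def shoelace_area(vertices):
    """Area of a simple polygon from its vertices listed in order (shoelace formula)."""
    total = 0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def apply(A, p):
    """Image of the point p = (x, y) under the 2x2 matrix A given as two rows."""
    (a, b), (c, d) = A
    x, y = p
    return (a * x + b * y, c * x + d * y)

def det2(A):
    (a, b), (c, d) = A
    return a * d - b * c

A = ((2, 1), (1, 3))                        # det(A) = 5
square = [(0, 0), (1, 0), (1, 1), (0, 1)]   # unit square, area 1
image = [apply(A, p) for p in square]       # vertices of the parallelogram A(square)

assert shoelace_area(image) == abs(det2(A)) * shoelace_area(square)
```

The absolute value matters: an orientation-reversing map such as the coordinate swap has determinant −1 but still preserves area.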

2.4 Non-measurable sets

We will now prove the existence of a subset of R which is not Lebesgue


measurable. The construction is essentially a consequence of the trans-
lation invariance of the Lebesgue measure and also uses the axiom of
choice. We will follow the treatment given in Royden [6].

Let x, y ∈ [0, 1). Define
\[ x \overset{\circ}{+} y = \begin{cases} x + y, & \text{if } x + y < 1, \\ x + y - 1, & \text{if } x + y \ge 1. \end{cases} \]
If E is a subset of [0, 1) and if y ∈ [0, 1), we set
\[ E \overset{\circ}{+} y = \{x \overset{\circ}{+} y \mid x \in E\}. \]

Lemma 2.4.1 Let E ⊂ [0, 1) and let y ∈ [0, 1). If E is measurable, then, so is $E \overset{\circ}{+} y$ and
\[ m_1(E \overset{\circ}{+} y) = m_1(E). \]

Proof: Set $E_1 = E \cap [0, 1 - y)$ and $E_2 = E \cap [1 - y, 1)$. Then $E_1$ and $E_2$ are measurable and are disjoint. Thus, $m_1(E) = m_1(E_1) + m_1(E_2)$. By definition, we clearly have $E_1 \overset{\circ}{+} y = E_1 + y$ and $E_2 \overset{\circ}{+} y = E_2 + (y - 1)$. Thus, $E_i \overset{\circ}{+} y$, i = 1, 2, are measurable, and by the translation invariance of the Lebesgue measure, we have $m_1(E_i \overset{\circ}{+} y) = m_1(E_i)$, i = 1, 2. Further, the sets $E_i \overset{\circ}{+} y$, i = 1, 2, are disjoint. (If not, we will have a, b ∈ [0, 1) such that a + y = b + y − 1 which implies that b − a = 1, which is impossible.) Since $E \overset{\circ}{+} y$ is the disjoint union of $E_1 \overset{\circ}{+} y$ and $E_2 \overset{\circ}{+} y$, the result follows immediately.
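The splitting used in this proof can be illustrated for sets that are finite disjoint unions of half-open intervals. This is a minimal sketch with hypothetical helper names (`shift_mod_one`, `length`); it covers only such elementary sets, not general measurable sets.

```python
from fractions import Fraction

def shift_mod_one(intervals, y):
    """Image of a finite disjoint union of intervals [a, b) in [0, 1) under addition modulo 1."""
    y = Fraction(y)
    out = []
    for a, b in intervals:
        a, b = Fraction(a), Fraction(b)
        cut = min(max(1 - y, a), b)   # split point between the parts E_1 and E_2
        if a < cut:                   # part inside [0, 1 - y): translate by y
            out.append((a + y, cut + y))
        if cut < b:                   # part inside [1 - y, 1): translate by y - 1
            out.append((cut + y - 1, b + y - 1))
    return sorted(out)

def length(intervals):
    """Total length of a finite disjoint union of intervals."""
    return sum(b - a for a, b in intervals)

E = [(Fraction(1, 4), Fraction(3, 4))]
shifted = shift_mod_one(E, Fraction(1, 2))
# The interval wraps around: [1/4, 3/4) becomes [0, 1/4) ∪ [3/4, 1).
assert shifted == [(0, Fraction(1, 4)), (Fraction(3, 4), 1)]
assert length(shifted) == length(E)
```

The two translated pieces never overlap, which mirrors the disjointness argument in the proof.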

Let x, y ∈ [0, 1). We say that x ∼ y if x − y is rational. It is easy


to verify that this defines an equivalence relation and hence partitions
[0, 1) into equivalence classes. Let P be a set containing exactly one
element from each equivalence class (axiom of choice!).

Proposition 2.4.1 The set P ⊂ [0, 1) defined above is not Lebesgue


measurable.

Proof: Let $\{r_i\}_{i=0}^{\infty}$ be an enumeration of the rationals in [0, 1) with $r_0 = 0$. Set $P_i = P \overset{\circ}{+} r_i$, 0 ≤ i < ∞. Thus, $P_0 = P$. If $x \in P_i \cap P_j$, where i ≠ j, then $x = p_1 \overset{\circ}{+} r_i = p_2 \overset{\circ}{+} r_j$, where $p_1, p_2 \in P$. If $p_1 = p_2$, then, since $r_i \ne r_j$, we must have $|r_i - r_j| = 1$, which is not possible. Thus, $p_1 \ne p_2$. But this means that $p_1 - p_2$ is rational, i.e. $p_1 \sim p_2$, which is not possible, since P contains exactly one element from each equivalence class. Thus, $P_i \cap P_j = \emptyset$ whenever i ≠ j. Further, since P contains a representative of each equivalence class, it follows that $[0, 1) = \cup_{i=0}^{\infty} P_i$.

Now, if P were Lebesgue measurable, then so is each $P_i$ and $m_1(P_i) = m_1(P)$ for each i. In that case
\[ 1 = m_1([0, 1)) = \sum_{i=0}^{\infty} m_1(P_i) \]

and the last sum is either zero or infinity depending on whether m1 (P )


is zero or non-zero, which gives us a contradiction. Thus P cannot be
measurable. 

Let E ⊂ P be a measurable set. Then $E_i = E \overset{\circ}{+} r_i$ is measurable for each 0 ≤ i < ∞ and $m_1(E_i) = m_1(E)$ for each such i. Once again, since E ⊂ P, we see that the $E_i$ are all mutually disjoint. Since their union is contained in [0, 1), we have that
\[ \sum_{i=0}^{\infty} m_1(E_i) \le 1. \]

Thus it follows that we must have that m1 (E) = 0. Thus the only mea-
surable subsets of P are those of measure zero. The same is true for any
Pi , 1 ≤ i < ∞.

Now let A ⊂ [0, 1) be a measurable set such that $m_1(A) > 0$. Set $E_i = A \cap P_i$. If $E_i$ is measurable, then $m_1(E_i) = 0$. Thus if all the $E_i$ are measurable, we have, since $A = \cup_{i=0}^{\infty} E_i$,
\[ 0 < m_1(A) \le \sum_{i=0}^{\infty} m_1(E_i) = 0, \]

a contradiction. Thus, there exists at least one i such that Ei is not


measurable. Thus every subset of strictly positive measure in [0, 1) con-
tains a non-measurable subset.

We can draw the same conclusion for any interval of the form [n, n + 1). Thus, if A ⊂ R is a measurable set with strictly positive measure, then there exists an integer n such that A ∩ [n, n + 1) has strictly positive measure and hence will contain a non-measurable subset. Thus, we conclude that every measurable set in R with strictly positive measure will contain a non-measurable subset.

2.5 Exercises

2.1 Let g : R → R be a continuous and increasing function. Define, for


[a, b) ∈ P,
µg ([a, b)) = g(b) − g(a).
Show that there exists a unique complete measure µg , on a σ-ring con-
taining all the Borel sets, which extends µg . (This measure is called the
Lebesgue-Stieltjes measure induced by g.)

2.2 Let S 1 denote the unit circle in the plane. The Borel sets in S 1 are
the members of the σ-algebra generated by all open arcs. Show that
there exists a Borel measure µ on S 1 such that µ(S 1 ) = 1 and such that
µ is invariant under all rotations of S 1 .

2.3 Show that every subset of the plane {(x, y, z) | 2x + 3y + 4z + 1 = 0}


in R3 is Lebesgue measurable.

2.4 Show that the plane R2 cannot be expressed as the countable union
of straight lines.

2.5 Let ωN = mN (BN ), where BN denotes the unit ball in RN . Show


that the Lebesgue measure of any ball of radius r > 0 in $\mathbb{R}^N$ is $\omega_N r^N$.

2.6 Compute m2 (S 1 ), where S 1 is the unit circle in the plane R2 .

2.7 (a) Let T be a triangle in the plane R2 with vertices at the points
(0, 0), (1, 0) and (0, 1). What is the value of m2 (T )?
(b) Let T be a triangle in the plane R2 with vertices at the points
$(x_i, y_i)$, i = 1, 2, 3. Show that $m_2(T) = |A|$, where
\[ A = \frac{1}{2} \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{vmatrix}. \]
2.8 Let $A : \mathbb{R}^N \to \mathbb{R}^N$ be a non-singular linear transformation. Let $E \subset \mathbb{R}^N$. With the notations of Section 2.2, show that
\[ \mu^*(A(E)) = |\det(A)| \mu^*(E). \]
Deduce that E is Lebesgue measurable if, and only if, A(E) is Lebesgue measurable.

2.9 Let T : R → R be a bijection such that both T and T −1 map


Lebesgue measurable sets onto Lebesgue measurable sets. Define

µ(E) = m1 (T (E)),

for each Lebesgue measurable set E. Show that µ is a complete measure


on L1 .

2.10 Let µ1 and µ2 be two measures defined on a σ-algebra S. Then


µ1 is said to be absolutely continuous with respect to µ2 if µ1 (E) = 0
whenever µ2 (E) = 0. Show that the measure µ defined in Exercise 2.9
is absolutely continuous with respect to the Lebesgue measure.

2.11 Does there exist a non-measurable subset of [0, 1] consisting only


of irrational numbers?
Chapter 3

Measurable functions

3.1 Basic properties

Let X be a non-empty set and let S be a σ-algebra of subsets of X. We


then say that (X, S) is a measurable space. The members of S are
called measurable sets.

An extended real-valued function on X is a function defined on X


which takes values in the set R ∪ {±∞}.
Definition 3.1.1 Let (X, S) be a measurable space and let f be an ex-
tended real-valued function defined on X. We say that f is a measur-
able function if f −1 ((α, +∞]) ∈ S for every α ∈ R. When X = RN ,
if the above condition is satisfied by f with S = BN , we say that f is a
Borel measurable function and if it is satisfied with S = LN , we say
that f is a Lebesgue measurable function. 
Remark 3.1.1 Evidently, if a function, f , defined on RN , is Borel
measurable, it is also Lebesgue measurable. 
Proposition 3.1.1 Let (X, S) be a measurable space. Let f be an ex-
tended real-valued function defined on X. The following statements are
equivalent:
(i) for every α ∈ R, f −1 ((α, +∞]) ∈ S, i.e. f is measurable;
(ii) For every α ∈ R, f −1 ([α, +∞]) ∈ S;
(iii) For every α ∈ R, f −1 ([−∞, α)) ∈ S;
(iv) For every α ∈ R, f −1 ([−∞, α]) ∈ S.
Proof: (i) ⇒ (ii):
\[ f^{-1}([\alpha, +\infty]) = \cap_{n=1}^{\infty} f^{-1}\left( \left( \alpha - \frac{1}{n}, +\infty \right] \right) \in S. \]
© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019 54
S. Kesavan, Measure and Integration, Texts and Readings in
Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_3

(ii) ⇒ (iii):

f −1 ([−∞, α)) = (f −1 ([α, +∞]))c ∈ S.

(iii) ⇒ (iv):
\[ f^{-1}([-\infty, \alpha]) = \cap_{n=1}^{\infty} f^{-1}\left( \left[ -\infty, \alpha + \frac{1}{n} \right) \right) \in S. \]

(iv) ⇒ (i):

f −1 ((α, +∞]) = (f −1 ([−∞, α]))c ∈ S. 

Corollary 3.1.1 (i) Let (X, S) be a measurable space. Let f be a mea-


surable function on X. Then, for every α ∈ R ∪ {±∞}, we have

f −1 ({α}) ∈ S.

(ii) If U ⊂ R is an open set, then f −1 (U ) ∈ S.


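Proof: (i) If α ∈ R, then
\[ f^{-1}(\{\alpha\}) = \bigcap_{n=1}^{\infty} f^{-1}\left(\left(\alpha - \frac{1}{n}, +\infty\right] \cap \left[-\infty, \alpha + \frac{1}{n}\right)\right) \in S. \]
Next,
\[ f^{-1}(\{+\infty\}) = \bigcap_{n=1}^{\infty} f^{-1}((n, +\infty]) \in S \]
and
\[ f^{-1}(\{-\infty\}) = \bigcap_{n=1}^{\infty} f^{-1}([-\infty, -n)) \in S. \]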
(ii) Given (a, b) ⊂ R, we have

f −1 ((a, b)) = f −1 ([−∞, b)) ∩ f −1 ((a, +∞]) ∈ S.

The result now follows since every open set can be written as the count-
able union of open intervals. 

Example 3.1.1 Let $X = \mathbb{R}^N$. Then every continuous real-valued
function f defined on $\mathbb{R}^N$ is both Borel and Lebesgue measurable, since
$f^{-1}((\alpha, +\infty]) = f^{-1}((\alpha, \infty))$, which is open and hence belongs to both
the Borel and Lebesgue σ-algebras. □

Example 3.1.2 Let (X, S) be a measurable space and let f be a real-


valued function defined on X. It is clear that if f −1 (U ) is measurable

for every open set U ⊂ R, then f is measurable. However, the converse


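of part (i) of the preceding corollary is not true. Let E ⊂ [0, 1) be a
non-measurable subset (cf. Section 2.4), i.e. $E \notin \mathcal{L}_1$. Define
\[ f(x) = \begin{cases} x, & \text{if } x \in E, \\ -x, & \text{if } x \in [0, 1) \setminus E, \\ -2, & \text{if } x \notin [0, 1). \end{cases} \]
Then
\[ f^{-1}(\{\alpha\}) = \begin{cases} \mathbb{R} \setminus [0, 1), & \text{if } \alpha = -2, \\ \{-\alpha\}, & \text{if } -\alpha \in [0, 1) \setminus E, \\ \{\alpha\}, & \text{if } \alpha \in E, \\ \emptyset, & \text{otherwise.} \end{cases} \]
Thus, it follows that $f^{-1}(\{\alpha\})$ is measurable for every α ∈ R. However,
$f^{-1}((0, \infty)) = E \setminus \{0\}$, which is not Lebesgue measurable (it differs from E
by at most one point), and so f is not Lebesgue measurable. □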

Example 3.1.3 Let (X, S) be a measurable space and let A ⊂ X. Let


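$f = \chi_A$, the characteristic function of the set A. Then
\[ f^{-1}((\alpha, +\infty]) = \begin{cases} X, & \text{if } \alpha < 0, \\ A, & \text{if } 0 \le \alpha < 1, \\ \emptyset, & \text{if } \alpha \ge 1. \end{cases} \]
Thus, $\chi_A$ is measurable if, and only if, A ∈ S. □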

Example 3.1.4 Let (X, S) be a measurable space. Any constant func-


tion is measurable. Let f (x) = c for all x ∈ X. If α ∈ R, then
f −1 ((α, +∞]) = X if α < c and is equal to the empty set if α ≥ c. 

Proposition 3.1.2 Let (X, S) be a measurable space and let f and g


be measurable real-valued functions defined on X. Let c ∈ R. Then
f + c, cf, f ± g and f g are all measurable functions.

Proof: (i) Let α ∈ R. Let c > 0. Then
\[ \{x \in X \mid cf(x) < \alpha\} = \left\{x \in X \mid f(x) < \frac{\alpha}{c}\right\} \in S. \]
If c < 0, then
\[ \{x \in X \mid cf(x) < \alpha\} = \left\{x \in X \mid f(x) > \frac{\alpha}{c}\right\} \in S. \]

If c = 0, then cf is the constant function taking the value zero. Thus, it
follows that cf is measurable for all c ∈ R.

(ii) Let α ∈ R. Then

{x ∈ X | f (x) + g(x) < α} = {x ∈ X | f (x) < α − g(x)}


= ∪r∈Q ({x ∈ X | f (x) < r} ∩ {x ∈ X | g(x) < α − r}).

Since the rationals are countable, it follows that (f +g)−1 ([−∞, α)) ∈ S.
Thus f + g is measurable. Since f − g = f + (−1)g, it follows from (i)
above that f − g is also measurable.

(iii) Since constant functions are measurable, it follows from (ii) above
that f + c is measurable.

(iv) Let α ∈ R. If α ≥ 0, then
\[ \{x \in X \mid (f(x))^2 > \alpha\} = \{x \in X \mid f(x) > \sqrt{\alpha}\} \cup \{x \in X \mid f(x) < -\sqrt{\alpha}\}. \]
If α < 0, then $\{x \in X \mid (f(x))^2 > \alpha\} = X$. Thus, it follows that the
function $f^2$ is measurable.
Now,
\[ fg = \frac{1}{4}\left((f + g)^2 - (f - g)^2\right). \]
Thus, by the preceding assertions, it follows that fg is also measurable.


Remark 3.1.2 Whenever the concerned functions are well-defined, the


preceding proposition holds for extended real-valued functions as well.
For instance, f + g is not defined at points x ∈ X where f (x) = +∞
and g(x) = −∞. 

Proposition 3.1.3 Let (X, S) be a measurable space and let f be a


real-valued measurable function defined on X. Then |f | is measurable.

Proof: Let α ∈ R. Then,

{x ∈ X | |f (x)| < α} = {x ∈ X | f (x) > −α} ∩ {x ∈ X | f (x) < α}

if α > 0 and is the empty set if α ≤ 0. Thus, |f | is measurable. 



Corollary 3.1.2 Let (X, S) be a measurable space and let f and g be


measurable real-valued functions defined on X. Then, max{f, g} and
min{f, g} are measurable. In particular, if f is a measurable real-valued
function defined on X, then

f + = max{f, 0} and f − = − min{f, 0}

are measurable.
Proof: The result follows from the previous propositions and the following relations:
\[ \max\{f, g\} = \frac{1}{2}(f + g + |f - g|), \qquad \min\{f, g\} = \frac{1}{2}(f + g - |f - g|). \] □

Remark 3.1.3 The functions f + and f − are called the positive and
negative parts of the function f . We have f = f + − f − and |f | =
f + + f − . Notice that both f + and f − are non-negative functions. 
Lemma 3.1.1 Let (X, S) be a measurable space and let f be a real-
valued measurable function defined on X. Then f −1 (E) ∈ S whenever
E is a Borel set.
Proof: Consider
\[ \widetilde{S} = \{E \subset \mathbb{R} \mid f^{-1}(E) \in S\}. \]
Clearly, $\mathbb{R} \in \widetilde{S}$. Now $f^{-1}(E^c) = (f^{-1}(E))^c$ and so if $E \in \widetilde{S}$, then so is
$E^c$. Similarly, if $\{E_i\}_{i=1}^{\infty}$ is a sequence in $\widetilde{S}$, we have
$f^{-1}(\cup_{i=1}^{\infty} E_i) = \cup_{i=1}^{\infty} f^{-1}(E_i)$ and so $\cup_{i=1}^{\infty} E_i \in \widetilde{S}$ as well. Thus $\widetilde{S}$ is a σ-algebra and,
by the definition of measurability, it contains all the open sets of R (cf.
Corollary 3.1.1). Thus, $\widetilde{S}$ contains all the Borel sets, and this completes
the proof. □
Corollary 3.1.3 Let (X, S) be a measurable space and let f be a real-valued
function defined on X. Then f is measurable if, and
only if, $f^{-1}(U) \in S$ whenever U is a Borel set.
Proof: If f is measurable, the above lemma shows that the inverse im-
age of every Borel set is measurable. Conversely, if the inverse image of
every Borel set is measurable, then f −1 ((α, +∞)) ∈ S for every α ∈ R,
and since the function is real-valued, it follows that f is measurable, by
definition. 

Proposition 3.1.4 Let (X, S) be a measurable space and let f be a


measurable real-valued function defined on X. Let ϕ : R → R be a Borel
measurable function. Then ϕ ◦ f is a measurable function on X.
Proof: Let α ∈ R. We have
{x ∈ X | (ϕ ◦ f )(x) > α} = f −1 (ϕ−1 ((α, +∞))).
Now ϕ−1 ((α, +∞)) is a Borel set and the proof now follows from the
preceding lemma. 

Remark 3.1.4 In general, the composition of two measurable functions


can fail to be measurable. We will see an example of this in the next
section. 
Proposition 3.1.5 Let (X, S) be a measurable space and let {fn } be
a sequence of extended real-valued measurable functions defined on X.
Define, for x ∈ X,
\[ h(x) = \sup_{n} f_n(x) \quad \text{and} \quad g(x) = \inf_{n} f_n(x). \]

Then h and g are measurable.


Proof: Let α ∈ R. Then
\[ \{x \in X \mid h(x) > \alpha\} = \bigcup_{i=1}^{\infty} \{x \in X \mid f_i(x) > \alpha\}, \]
\[ \{x \in X \mid g(x) < \alpha\} = \bigcup_{i=1}^{\infty} \{x \in X \mid f_i(x) < \alpha\}, \]
and the result follows immediately. □


Corollary 3.1.4 Let (X, S) be a measurable space. Let {fn } be a se-
quence of real-valued measurable functions defined on X. We have that
lim supn→∞ fn and lim inf n→∞ fn are measurable. Hence, if fn (x) →
f (x) for all x ∈ X, then f is a measurable function.
Proof: Notice that $g_n = \sup_{m \ge n} f_m$ is measurable. Then
\[ \limsup_{n \to \infty} f_n = \inf_{n} g_n \]
is measurable. Similarly, $h_n = \inf_{m \ge n} f_m$ is measurable and so
\[ \liminf_{n \to \infty} f_n = \sup_{n} h_n \]
is measurable. If $f_n(x) \to f(x)$ for all x ∈ X, then
\[ f = \liminf_{n \to \infty} f_n = \limsup_{n \to \infty} f_n, \]
and the result follows. □



Definition 3.1.2 Let (X, S) be a measurable space. A simple function
defined on X is a function of the form
\[ f = \sum_{i=1}^{k} \alpha_i \chi_{A_i}, \]
where the $\alpha_i$, 1 ≤ i ≤ k, are real constants and the $A_i$, 1 ≤ i ≤ k, are
measurable sets. □

Remark 3.1.5 By definition, a simple function is measurable. 

Simple functions are the building blocks with which we develop


Lebesgue’s theory of integration, just as Riemann’s theory of integra-
tion was based on step functions. As a first step towards this, we have
the following result.

Theorem 3.1.1 Let (X, S) be a measurable space and let f be a non-negative
extended real-valued measurable function defined on X. Then f is the
increasing limit of a sequence of non-negative simple functions defined on
X.

Proof: Let n be a fixed positive integer. For $1 \le i \le n2^n$, define
\[ E_{n,i} = f^{-1}\left(\left[\frac{i-1}{2^n}, \frac{i}{2^n}\right)\right) \quad \text{and} \quad F_n = f^{-1}([n, +\infty]). \]
In other words, we divide the interval [0, n) into subintervals of length
$1/2^n$ and consider the inverse images under f of each of these to define the
sets $E_{n,i}$. The sets $E_{n,i}$ and $F_n$ are all clearly measurable. Now define
\[ f_n = n\chi_{F_n} + \sum_{i=1}^{n2^n} \frac{i-1}{2^n} \chi_{E_{n,i}}. \]
Thus, $f_n$ is a non-negative simple function. If f(x) ≥ n, then $f_n(x) = n$.
If f(x) < n and if
\[ \frac{i-1}{2^n} \le f(x) < \frac{i}{2^n}, \]
then $f_n(x) = \frac{i-1}{2^n}$. Thus, $f_n(x) \le f(x)$ for all x ∈ X.

We claim that $f_n(x) \le f_{n+1}(x)$ for each positive integer n and for
each x ∈ X. Indeed, if f(x) ≥ n + 1, then $f_{n+1}(x) = n + 1$ and $f_n(x) = n$.
If n ≤ f(x) < n + 1, then $f_{n+1}(x) = \frac{i-1}{2^{n+1}}$ for some i such that
\[ f(x) \in \left[\frac{i-1}{2^{n+1}}, \frac{i}{2^{n+1}}\right) \subset [n, n+1). \]
In this case, $f_n(x) = n$ and so we still have $f_n(x) \le f_{n+1}(x)$. Finally, if
f(x) < n, then for some $1 \le i \le n2^n$, we have
\[ f(x) \in \left[\frac{i-1}{2^n}, \frac{i}{2^n}\right) = \left[\frac{2(i-1)}{2^{n+1}}, \frac{2i}{2^{n+1}}\right). \]
Consequently, $f_n(x) = \frac{i-1}{2^n}$, while either
\[ f_{n+1}(x) = \frac{2(i-1)}{2^{n+1}} = \frac{i-1}{2^n} = f_n(x), \quad \text{if } f(x) \in \left[\frac{2(i-1)}{2^{n+1}}, \frac{2i-1}{2^{n+1}}\right), \]
or
\[ f_{n+1}(x) = \frac{2i-1}{2^{n+1}} > \frac{i-1}{2^n} = f_n(x), \quad \text{if } f(x) \in \left[\frac{2i-1}{2^{n+1}}, \frac{2i}{2^{n+1}}\right). \]
This establishes the claim.

Thus, {fn } is an increasing sequence of simple functions bounded


above by f. If f(x) = +∞, then $f_n(x) = n$ for all n, and so $f_n(x) \to +\infty$.
If f(x) < +∞, then there exists a positive integer N such that f(x) < N. Then, for all
n ≥ N, we see, from the construction above, that $|f(x) - f_n(x)| =
f(x) - f_n(x) < \frac{1}{2^n}$. Thus we have established that $f_n \uparrow f$. □
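The construction in the proof can be tried numerically. The following is a minimal sketch, assuming f is given as an ordinary Python function with values in [0, +∞]; the names `dyadic_approx`, `f` and `values` are illustrative and not part of the text. It uses the fact that, when f(x) < n, the index i with x ∈ E_{n,i} satisfies i − 1 = ⌊2^n f(x)⌋.

```python
import math

def dyadic_approx(f, n):
    """Return the n-th simple function f_n from the proof of Theorem 3.1.1."""
    def f_n(x):
        y = f(x)
        if y >= n:                      # x lies in F_n (this also covers y = +inf)
            return float(n)
        # x lies in E_{n,i} with i - 1 = floor(2^n * y), so f_n(x) = (i - 1) / 2^n
        return math.floor(y * 2**n) / 2**n
    return f_n

f = lambda x: x * x                     # hypothetical non-negative test function
x = 1.3
values = [dyadic_approx(f, n)(x) for n in range(1, 8)]
print(values)                           # a non-decreasing list bounded above by f(x)
```

In this sketch the values never decrease with n and never exceed f(x), and once n exceeds f(x) the gap f(x) − f_n(x) stays below 2^{-n}, in line with the estimate at the end of the proof.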

Corollary 3.1.5 Let (X, S) be a measurable space and let f be a real-


valued measurable function defined on X. Then f is the limit of a se-
quence of simple functions.

Proof: We can split f into its positive and negative parts. Thus f =
f + − f − , where f ± are non-negative measurable functions. We can find
sequences of non-negative simple functions {ϕn } and {ψn } such that
ϕn ↑ f + and ψn ↑ f − . Thus, fn = ϕn − ψn gives a sequence of simple
functions converging pointwise to f. 

3.2 The Cantor function

The Cantor function, like the Cantor set, provides a lot of interesting
examples, or counter-examples, to illustrate fine points in the theory
of measure and integration. Several constructions are possible but the
essential properties are the same for all these functions and they serve

the same purpose. In this section, we will present one such construction.

Before we construct the function, which will be the uniform limit


of a sequence of piecewise linear functions, we will present a basic con-
struction which will be used iteratively, at different scales, to produce
the next member of the desired sequence from the current one.

Consider an interval [a, b] in the real line and let f : [a, b] → R be a
linear function, i.e.
\[ f(x) = f(a) + \frac{f(b) - f(a)}{b - a}(x - a), \quad x \in [a, b]. \]
We then divide the interval [a, b] into three equal parts. Let us denote
the two interior points of this partition by $c_j$, j = 1, 2. Thus,
\[ c_j = a + j\,\frac{b - a}{3}, \quad j = 1, 2. \]
The value of f at the point $c_1$ will, therefore, be given by the relation
\[ f(c_1) = \frac{2f(a) + f(b)}{3}. \]
We then define the ‘next iterate’, g, of f as follows:
\[ g(x) = \begin{cases} f(x), & \text{if } x \in [a, c_1], \\ f(c_1), & \text{if } x \in [c_1, c_2], \\ f(c_1) + \dfrac{f(b) - f(c_1)}{b - c_2}(x - c_2), & \text{if } x \in [c_2, b]. \end{cases} \]

In other words, we move along f in the first third of the interval, then
move horizontally along the second third, and finally climb up to f (b)
in a straight line on the third interval (cf. Figure 3.2.1 below).

[Figure 3.2.1: graph over the interval [a, b], with the values f(a) and f(b) marked on the vertical axis.]

A simple computation shows that the slope of g is the same as that of


f in the first third of the interval, equal to zero in the middle third and
twice the slope of f in the final third of the interval.
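The computation referred to here can be written out as follows, using only the definitions of $c_1$, $c_2$ and g given above (so that $b - c_2 = (b-a)/3$):

```latex
\[
\frac{f(b) - f(c_1)}{b - c_2}
  = \frac{f(b) - \tfrac{2f(a) + f(b)}{3}}{\tfrac{b - a}{3}}
  = \frac{\tfrac{2}{3}\,\bigl(f(b) - f(a)\bigr)}{\tfrac{1}{3}\,(b - a)}
  = 2\,\frac{f(b) - f(a)}{b - a}.
\]
```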

Let us consider the interval [0, 1] and the function $f_0(x) = x$ defined
on it. If we apply the procedure described above to this function, we
will get the function $f_1$ given by
\[ f_1(x) = \begin{cases} x, & \text{if } x \in [0, \tfrac{1}{3}], \\ \tfrac{1}{3}, & \text{if } x \in [\tfrac{1}{3}, \tfrac{2}{3}], \\ 2x - 1, & \text{if } x \in [\tfrac{2}{3}, 1]. \end{cases} \]
We now apply the iteration procedure described above in each of the
intervals $[0, \tfrac{1}{3}]$ and $[\tfrac{2}{3}, 1]$ to get the next function $f_2$ and so on (cf. Figure
3.2.2 below).

Figure 3.2.2

For each positive integer n, we apply the iteration procedure de-


scribed earlier only to those sub-intervals where fn is not constant. Thus,
if fn is constant on any sub-interval, we will have fm = fn = the same
constant on that sub-interval for all m ≥ n. Notice that the union of the
sub-intervals where fn is a constant is precisely the set Xn described in
the construction of the Cantor set (cf. Example 2.1.3).

In this manner, we can construct a sequence of continuous piecewise


linear functions {fn }. By construction, we have a decreasing sequence
of functions, each of which is monotonically non-decreasing.

The maximum slope occurs in the last sub-interval and, as seen ear-
lier, each time we apply the iteration procedure, the slope doubles. Thus,

in the last sub-interval of length $\frac{1}{3^n}$, the slope of $f_n$ will be $2^n$. By the
mean value theorem, we thus see that for any x ∈ [0, 1],
\[ |f_n(x) - f_{n+1}(x)| \le |f_{n+1}(1) - f_{n+1}(1 - 3^{-(n+1)})| \le \left(\frac{2}{3}\right)^{n+1}. \]
Since the series $\sum_{n=1}^{\infty} \left(\frac{2}{3}\right)^n$ is a convergent geometric series, it follows
that {fn } is uniformly Cauchy and so it converges uniformly to a con-
tinuous function f . This function is called the Cantor function.
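The iterates can be evaluated exactly with rational arithmetic. The sketch below rests on an assumption that is not stated in the text but follows from applying the basic construction at each scale: on [0, 1/3] the function f_n is a copy of f_{n−1} scaled by 1/3, on [1/3, 2/3] it equals 1/3, and on [2/3, 1] it is 1/3 plus a copy of f_{n−1} scaled by 2/3. The names `cantor_iterate`, `grid` and `gaps` are illustrative.

```python
from fractions import Fraction

def cantor_iterate(n, x):
    """Evaluate f_n at a rational point x in [0, 1] (self-similar form, assumed)."""
    if n == 0:
        return x                                    # f_0(x) = x
    third = Fraction(1, 3)
    if x <= third:                                  # scaled copy on [0, 1/3]
        return cantor_iterate(n - 1, 3 * x) / 3
    if x <= 2 * third:                              # constant on the middle third
        return third
    # steeper copy on [2/3, 1], climbing from 1/3 to 1
    return third + 2 * cantor_iterate(n - 1, 3 * x - 2) / 3

grid = [Fraction(k, 243) for k in range(244)]       # contains all breakpoints up to level 5
gaps = [max(abs(cantor_iterate(n, x) - cantor_iterate(n + 1, x)) for x in grid)
        for n in range(4)]
print(gaps)
```

On this grid the gaps between consecutive iterates stay within the bound (2/3)^{n+1} used above, and each iterate lies below the previous one, consistent with the sequence being decreasing.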

Since each fn is non-decreasing, so is f . We also have that f (0) = 0


and f (1) = 1. If C is the Cantor set (cf. Example 2.1.3), then, by
construction, f is constant on each sub-interval of C c , since in the con-
struction of fn+1 from fn , we set the value in each middle third interval
as a constant and once fixed thus, it remains unaltered in the construc-
tion of fm , m ≥ n + 1.

Let us now define ψ(y) = y + f (y) for y ∈ [0, 1]. Then ψ is strictly
monotonic increasing and continuous. We have ψ(0) = 0 and ψ(1) = 2.
Thus ψ is a continuous bijection of [0, 1] onto [0, 2].

Let ϕ denote the inverse of ψ. Then ϕ is also monotonic increasing.


We have that x = ϕ(x) + f (ϕ(x)), for every x ∈ [0, 2]. If x ≥ y, then
ϕ(x) ≥ ϕ(y) and

x − y = ϕ(x) − ϕ(y) + f (ϕ(x)) − f (ϕ(y)).

Since f is non-decreasing, we deduce, from the above relation, that

ϕ(x) − ϕ(y) ≤ x − y

whenever x ≥ y from which we have

|ϕ(x) − ϕ(y)| ≤ |x − y|.

Thus, ϕ is continuous as well.

Now, ψ is a bijection and so it maps disjoint sets into disjoint sets.


If I is an interval contained in C c , where C is the Cantor set, then f is a
constant, cI , on I and so ψ(x) = x + cI on I. Thus, ψ(I) just translates
I and so m1 (ψ(I)) = m1 (I). Since C c is made up of disjoint intervals,

it follows that m1 (ψ(C c )) = m1 (C c ) = 1. Since the range of ψ is [0, 2],


we thus conclude that m1 (ψ(C)) = 1 as well. Thus, ψ maps C, a set of
measure zero, onto a set of measure one.

Since ψ(C) has positive measure, it contains a non-measurable set,


say, S. Let M = ψ −1 (S) = ϕ(S). Then M ⊂ C and by the completeness
of the Lebesgue measure, it follows that M is Lebesgue measurable. If
M were Borel measurable, then it would follow that S = ϕ−1 (M ) is also
Borel measurable, since ϕ is continuous and hence a Borel measurable
function. But that would imply that S is also Lebesgue measurable,
which contradicts our assumption on S.

Thus, the set M described above is an example of a Lebesgue mea-


surable set which is not Borel measurable.

Finally, let Φ = χM , which is a Lebesgue measurable function. Set


ζ = Φ ◦ ϕ. Thus ζ is the composition of a Lebesgue measurable function
and a continuous (and hence, Lebesgue measurable) function. Now

ζ −1 ({1}) = {x ∈ [0, 2] | ζ(x) = 1} = ϕ−1 (M ) = S,

which, by choice, is not Lebesgue measurable. Thus ζ is not a Lebesgue


measurable function (cf. Corollary 3.1.1(i)). Thus, we have constructed
an example to show that the composition of two measurable functions
need not be measurable (cf. Proposition 3.1.4 and Remark 3.1.4).

Remark 3.2.1 Notice, however, that by Proposition 3.1.4, the mapping


ϕ ◦ Φ will be measurable. 

3.3 Almost everywhere

Let (X, S) be a measurable space. Let µ be a measure defined on S. We


say that (X, S, µ) is a measure space.

Given a measure space (X, S, µ), we say that a measurable function,


or a collection of measurable functions, enjoys a certain property almost
everywhere if that property is valid at all points in X except, possibly,
on a set of measure zero. We abbreviate ‘almost everywhere’ by a.e.
More specifically, we will frequently encounter the following situations
in the sequel.

• A measurable extended real-valued function f defined on X is finite


a.e. if there exists a set E ∈ S such that µ(E) = 0 and such that
f (x) ∈ R for all x ∈ E c .

• Given two measurable functions f and g defined on X, we say that


f = g a.e. if there exists a set E ∈ S such that µ(E) = 0 and such
that f (x) = g(x) for all x ∈ E c .

• Given a sequence of measurable functions {fn } and a measurable


function f defined on X, we say that fn converges to f a.e. if there
exists a set E ∈ S such that µ(E) = 0 and such that fn (x) → f (x)
for every x ∈ E c .

Definition 3.3.1 Let (X, S, µ) be a measure space and let f : X → R


be a measurable function. We say that f is essentially bounded if
there exists M > 0 such that the set

{x ∈ X | |f (x)| > M }

has measure zero. The essential supremum of f is the infimum of all
such M, and is denoted $\|f\|_\infty$, i.e.
\[ \|f\|_\infty = \inf\{M \mid \mu(\{x \in X \mid |f(x)| > M\}) = 0\}. \] □
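For intuition, the definition can be evaluated on a finite measure space. The following is a small sketch under the assumption that X is a finite set, every subset is measurable, and µ is given by non-negative point weights; the names `essential_sup`, `values` and `weights` are hypothetical. Points of weight zero form a null set and therefore do not affect the essential supremum, even where |f| is large.

```python
def essential_sup(values, weights):
    """Essential supremum of |f| on a finite set with point masses `weights`."""
    # Only points carrying positive mass can contribute to {|f| > M} having positive measure.
    charged = [abs(v) for v, w in zip(values, weights) if w > 0]
    return max(charged, default=0.0)

values = [1.0, -3.0, 100.0, 2.5]    # f(x_1), ..., f(x_4)
weights = [0.5, 0.2, 0.0, 0.3]      # mu({x_i}); the third point is a null set
ess = essential_sup(values, weights)
print(ess, max(abs(v) for v in values))
```

Here the ordinary supremum of |f| is 100, attained on a null set, while the essential supremum is 3.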

3.4 Exercises

3.1 Let (X, S) be a measurable space. Let f : X → R be a measurable


function. Define
\[ g(x) = \begin{cases} \dfrac{1}{f(x)}, & \text{if } f(x) \ne 0, \\ 0, & \text{if } f(x) = 0. \end{cases} \]

Show that g is measurable.

3.2 Let (X, S) be a measurable space. Let f : X → R be a function


such that |f | is measurable. Is it necessary that f be measurable?

3.3 Let (X, S) be a measurable space. Let f : X → R be a function


such that f −1 ((r, +∞)) ∈ S for every rational number r. Show that f
is measurable.

3.4 Let (X, S) be a measurable space. Let {fn } be a sequence of mea-


surable functions defined on X. Show that the set of all points x ∈ X,
where the sequence {fn (x)} is not Cauchy, is a measurable set.

3.5 Let (X, S) be a measurable space. Let {fn } be a sequence of mea-


surable functions defined on X converging to a function f a.e. Is it
necessary that f is measurable?

3.6 Let (X, S) be a measurable space. Let f : X × [0, 1] → R be a


function such that, for each fixed y ∈ [0, 1], the mapping x ↦ f(x, y)
is measurable, and, for each fixed x ∈ X, the mapping y ↦ f(x, y) is
continuous. Define
\[ h(x) = \min_{y \in [0, 1]} f(x, y), \quad \text{for } x \in X. \]

Show that h : X → R is measurable.

3.7 Let f : R → R be a Lebesgue measurable function. Show that there


exists a Borel measurable function g : R → R such that g = f a.e.
Chapter 4

Convergence

4.1 Egorov’s theorem

Theorem 4.1.1 (Egorov) Let (X, S, µ) be a finite measure space, i.e.


µ(X) < +∞. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable
functions, defined on X, converging almost everywhere to a real-valued
measurable function f. Then, given any ε > 0, there exists a measurable set
F ⊂ X such that µ(F) < ε and such that $f_n \to f$ uniformly on $F^c$.

Proof: Let E ∈ S be such that µ(E) = 0 and such that fn → f


pointwise on $E^c$. Set $Y = E^c$. Given positive integers m and n, define
\[ E_{n,m} = \bigcap_{i=n}^{\infty} \left\{x \in Y \mid |f_i(x) - f(x)| < \frac{1}{m}\right\}. \]

Then, clearly,

E1,m ⊂ E2,m ⊂ · · · ⊂ En,m ⊂ En+1,m ⊂ · · · .

Further, for every x ∈ Y, we have $f_n(x) \to f(x)$ and so for any m,
there exists N such that for all i ≥ N, we have $|f_i(x) - f(x)| < \frac{1}{m}$, i.e.
$x \in E_{N,m}$. Thus,
\[ Y = \bigcup_{n=1}^{\infty} E_{n,m}. \]
Consequently (cf. Proposition 1.2.4),
\[ \mu(Y) = \lim_{n \to \infty} \mu(E_{n,m}). \]
Since µ(Y) = µ(X) < +∞, given ε > 0, there exists $n_0(m) \in \mathbb{N}$ such
that
\[ \mu(Y \setminus E_{n_0(m),m}) = \mu(Y) - \mu(E_{n_0(m),m}) < \frac{\varepsilon}{2^m}. \]
© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019 68
S. Kesavan, Measure and Integration, Texts and Readings in
Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_4

Set $G = \bigcup_{m=1}^{\infty} (Y \setminus E_{n_0(m),m})$. Then G is measurable and
\[ \mu(G) < \sum_{m=1}^{\infty} \frac{\varepsilon}{2^m} = \varepsilon. \]
Now set F = G ∪ E so that µ(F) = µ(G) < ε. Observe that
\[ F^c = \bigcap_{m=1}^{\infty} E_{n_0(m),m}. \]
Given any η > 0, choose m such that $\frac{1}{m} < \eta$. If $x \in F^c$, then
$x \in E_{n_0(m),m} \subset E_{n,m}$ for all $n \ge n_0(m)$. Thus, for all $x \in F^c$, and for all
$n \ge n_0(m)$, we have
\[ |f_n(x) - f(x)| < \frac{1}{m} < \eta. \]
Since the choice of m depended only on η, this shows that we have uni-
form convergence of the sequence {fn } to f on F c . This completes the
proof. 
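A standard illustration, not taken from the text, is the sequence f_n(x) = x^n on [0, 1) with the Lebesgue measure. It converges pointwise to zero, but not uniformly, since the points x = 1 − 1/n keep f_n away from zero. After removing the set F = [1 − ε/2, 1), of measure ε/2 < ε, the convergence is uniform on the complement. The sketch below, with the illustrative names `sup_outside`, `sups` and `witness`, computes the relevant suprema.

```python
def sup_outside(eps, n):
    """Supremum of x**n over F^c = [0, 1 - eps/2), where F = [1 - eps/2, 1)."""
    return (1 - eps / 2) ** n

eps = 0.1
sups = [sup_outside(eps, n) for n in (10, 100, 1000)]      # tends to 0: uniform on F^c
witness = [(1 - 1 / n) ** n for n in (10, 100, 1000)]      # stays near 1/e: not uniform on [0, 1)
print(sups)
print(witness)
```

The first list decreases to zero for the fixed set F, whereas the second list shows that the supremum of f_n over all of [0, 1) does not tend to zero.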

Example 4.1.1 The result of the theorem does not hold, in general,
in infinite measure spaces. For instance, consider the set N of natural
numbers with the counting measure defined on the σ-algebra of all subsets
of N. If F ⊂ N is such that µ(F) < ε < 1, then, clearly, F = ∅.
Thus uniform convergence on $F^c$ means uniform convergence on N. Now
consider the sequence $\{f_n\}_{n=1}^{\infty}$ defined by $f_n = \chi_{\{1, 2, \cdots, n\}}$. Then $f_n \to f$
on N, where f(i) = 1 for all i ∈ N, but this convergence is not uniform. □
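The failure of uniform convergence in this example can be seen directly: for each n, the point i = n + 1 satisfies |f_n(i) − f(i)| = 1. A minimal sketch, with the hypothetical helper names `f_n` and `sup_gap` and with the supremum taken over a finite window of N that contains n + 1:

```python
def f_n(n, i):
    """Characteristic function of {1, ..., n} evaluated at the natural number i."""
    return 1 if i <= n else 0

def sup_gap(n, window):
    """Maximum of |f_n(i) - 1| over i = 1, ..., window (a finite stand-in for the sup over N)."""
    return max(abs(f_n(n, i) - 1) for i in range(1, window + 1))

gaps = [sup_gap(n, window=n + 1) for n in (1, 10, 100)]
pointwise = [f_n(n, 7) for n in (5, 7, 50)]   # at the fixed point i = 7 the values settle at 1
print(gaps, pointwise)
```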

Inspired by the statement of Egorov’s theorem, we can formulate the


following definition.
Definition 4.1.1 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a
sequence of real-valued measurable functions defined on X. We say that
this sequence converges almost uniformly to a real-valued measurable
function f defined on X, if for every ε > 0, there exists a measurable
set F such that µ(F) < ε and such that $f_n \to f$ uniformly on $F^c$. □
The converse of Egorov’s theorem holds for any measure space.
Proposition 4.1.1 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be
a sequence of real-valued measurable functions defined on X converging
almost uniformly to a real-valued measurable function f defined on X.
Then $f_n \to f$ a.e. on X.

Proof: Let m ∈ N. Choose $F_m \in S$ such that $\mu(F_m) < \frac{1}{m}$ and such
that $f_n \to f$ uniformly on $F_m^c$. Set $F = \bigcap_{m=1}^{\infty} F_m$. Then µ(F) = 0. Since
$F^c = \bigcup_{m=1}^{\infty} F_m^c$, we have that $f_n(x) \to f(x)$ for every $x \in F^c$. □

Definition 4.1.2 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be a
sequence of real-valued measurable functions defined on X. We say that
this sequence is almost uniformly Cauchy if, for every ε > 0, there
exists a set F ∈ S such that µ(F) < ε and such that $\{f_n\}$ is a uniformly
Cauchy sequence on $F^c$. □

Clearly, if a sequence $\{f_n\}_{n=1}^{\infty}$ of real-valued measurable functions defined on
X converges almost uniformly, then it is almost uniformly
Cauchy. We now prove the converse.

Proposition 4.1.2 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$
be an almost uniformly Cauchy sequence of real-valued measurable functions
defined on X. Then there exists a real-valued measurable function
f defined on X such that $f_n \to f$ almost uniformly.
Proof: For each m ∈ N, choose $F_m \in S$ such that $\mu(F_m) < \frac{1}{m}$ and such
that the sequence $\{f_n\}$ is uniformly Cauchy on $F_m^c$. Set $F = \bigcap_{m=1}^{\infty} F_m$.
Then µ(F) = 0. Since $F^c = \bigcup_{m=1}^{\infty} F_m^c$, we have that $\{f_n(x)\}$ is a Cauchy
sequence for every $x \in F^c$. Define
\[ f(x) = \begin{cases} \lim_{n \to \infty} f_n(x), & \text{if } x \in F^c, \\ 0, & \text{if } x \in F. \end{cases} \]
Set $g_n = \chi_{F^c} f_n$. Then $g_n$ is measurable for each positive integer n.
Further if x ∈ F, we have $g_n(x) = f(x) = 0$ for all n and if $x \in F^c$,
we have $g_n(x) = f_n(x) \to f(x)$. Thus $g_n \to f$ everywhere and so (cf.
Corollary 3.1.4) f is measurable. In particular, $f_n \to f$ on $F^c$ and
$f_n \to f$ uniformly on $F_m^c$, where $\mu(F_m) < \frac{1}{m}$. Thus, we see that the
sequence $\{f_n\}$ converges to f almost uniformly. □

4.2 Convergence in measure

In this section, we will investigate a new notion of convergence of mea-


surable functions defined on a measure space and compare it with the
notions of pointwise convergence a.e. and almost uniform convergence.

Definition 4.2.1 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be
a sequence of real-valued measurable functions defined on X. Let f be a

real-valued measurable function defined on X. We say that the sequence


$\{f_n\}$ converges in measure to the function f if for every ε > 0, we
have
\[ \lim_{n \to \infty} \mu(\{x \in X \mid |f_n(x) - f(x)| \ge \varepsilon\}) = 0. \]

We say that the sequence {fn } is Cauchy in measure if for every ε > 0
and for every δ > 0, there exists N ∈ N such that for all n, m ≥ N , we
have
µ({x ∈ X | |fn (x) − fm (x)| ≥ ε}) < δ. 

Notation Let (X, S, µ) be a measure space. If a sequence of real-valued


measurable functions $\{f_n\}$ defined on X converges in measure to a real-valued
measurable function f, we write
\[ f_n \xrightarrow{\mu} f. \]

Proposition 4.2.1 Let (X, S, µ) be a finite measure space, i.e. µ(X) <
+∞. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of real-valued measurable functions,
defined on X, converging a.e. to a real-valued measurable function f.
Then $f_n \xrightarrow{\mu} f$.

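Proof: Let D denote the set of all points x ∈ X such that the sequence
$\{f_n(x)\}$ fails to converge to f(x). Thus µ(D) = 0. Let ε > 0. If we set
\[ E_m(\varepsilon) = \{x \in X \mid |f_m(x) - f(x)| \ge \varepsilon\}, \]
then,
\[ D = \bigcup_{\varepsilon > 0} \bigcap_{n=1}^{\infty} \bigcup_{m=n}^{\infty} E_m(\varepsilon) = \bigcup_{\varepsilon > 0} \limsup_{n \to \infty} E_n(\varepsilon). \]
Thus, $\mu(\limsup_{n \to \infty} E_n(\varepsilon)) = 0$. Since µ(X) < +∞, we have (cf. Exercise 1.10),
\[ 0 = \mu\left(\limsup_{n \to \infty} E_n(\varepsilon)\right) \ge \limsup_{n \to \infty} \mu(E_n(\varepsilon)). \]
Thus,
\[ 0 \le \liminf_{n \to \infty} \mu(E_n(\varepsilon)) \le \limsup_{n \to \infty} \mu(E_n(\varepsilon)) \le 0, \]
from which we deduce that $\lim_{n \to \infty} \mu(E_n(\varepsilon)) = 0$, i.e. $f_n \xrightarrow{\mu} f$. □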

Example 4.2.1 This result is not valid, in general, in infinite mea-


sure spaces. If we consider the set N with the σ-algebra of all subsets
equipped with the counting measure, then it is easy to verify that con-
vergence in measure is just uniform convergence, since µ(E) < 1 implies
that E = ∅. Once again, the same sequence as in Example 4.1.1 gives a
sequence converging pointwise everywhere, but not in measure. 

The converse is not true, even in finite measure spaces, i.e. conver-
gence in measure does not imply pointwise convergence as the following
example shows.

Example 4.2.2 Consider X = [0, 1) equipped with the Lebesgue measure.
Consider the functions
\[ \chi_n^i = \chi_{\left[\frac{i-1}{n}, \frac{i}{n}\right)}, \quad 1 \le i \le n. \]
Consider the sequence
\[ \{\chi_1^1, \chi_2^1, \chi_2^2, \chi_3^1, \chi_3^2, \chi_3^3, \cdots\}. \]

Let x ∈ [0, 1). For each n ∈ N, there exists exactly one i, 1 ≤ i ≤ n, such that
$\chi_n^i(x) = 1$, while $\chi_n^j(x) = 0$ for all 1 ≤ j ≤ n, j ≠ i. Thus the sequence,
evaluated at x, takes each of the values 0 and 1 infinitely often, and so
the above sequence fails to converge at every point x ∈ [0, 1).
On the other hand, if 0 < ε < 1, we have
\[ m_1(\{x \in [0, 1) \mid |\chi_n^i(x)| \ge \varepsilon\}) = m_1\left(\left[\frac{i-1}{n}, \frac{i}{n}\right)\right) = \frac{1}{n}, \]
from which we deduce that this sequence converges to zero in measure. □
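The behaviour at a fixed point can be checked numerically. The following sketch, with the illustrative names `chi`, `block` and `ones_per_block`, lists the values at x of the n consecutive terms $\chi_n^1, \ldots, \chi_n^n$ for each n; the choice x = 0.3 is arbitrary.

```python
def chi(n, i, x):
    """Characteristic function of [(i-1)/n, i/n) evaluated at x."""
    return 1 if (i - 1) / n <= x < i / n else 0

def block(n, x):
    """Values at x of the n consecutive terms chi_n^1, ..., chi_n^n of the sequence."""
    return [chi(n, i, x) for i in range(1, n + 1)]

x = 0.3
ones_per_block = [sum(block(n, x)) for n in range(1, 51)]   # how often the value 1 occurs in block n
measures = [1 / n for n in range(1, 51)]                    # m_1 of the support of each term in block n
print(ones_per_block[:5], measures[:5])
```

Every block contributes exactly one term equal to 1 at x, and from n = 2 onwards also terms equal to 0, so the sequence of values at x oscillates, while the measure 1/n of the supports tends to zero.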

In the above example, notice that for any x ∈ (0, 1), we have $\chi_n^1(x) = 0$
for every $n \ge x^{-1}$. Thus, the subsequence $\{\chi_n^1\}$ converges to
the zero function pointwise at every point except x = 0, i.e. almost
everywhere. This behaviour is generic, as the following proposition shows.

Proposition 4.2.2 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$
be a sequence of real-valued measurable functions defined on X which
converges in measure to a real-valued measurable function f defined on
X. Then, there exists a subsequence of $\{f_n\}$ which converges to f almost
everywhere.

Proof: Set
\[ E_{n,m} = \left\{x \in X \mid |f_n(x) - f(x)| \ge \frac{1}{m}\right\}. \]
Then, for every m ∈ N, we can find a positive integer $n_0(m)$, which we may
take to be strictly increasing in m, such that
\[ \mu(E_{n_0(m),m}) < \frac{1}{2^m}. \]
Thus,
\[ \sum_{m=1}^{\infty} \mu(E_{n_0(m),m}) < +\infty. \]
Hence, by the Borel-Cantelli lemma (cf. Proposition 1.2.6), there exists
a measurable set E, with measure zero, such that every point of $E^c$
belongs to at most finitely many of the sets $E_{n_0(m),m}$. In other words,
for every $x \in E^c$, there exists a positive integer N such that $x \notin E_{n_0(m),m}$
for all m ≥ N, i.e. for all m ≥ N, we have
\[ |f_{n_0(m)}(x) - f(x)| < \frac{1}{m}. \]
This shows that $f_{n_0(m)}(x) \to f(x)$ for all $x \in E^c$. This completes the
proof. □

The next result shows that the limit function, under convergence in
measure, is defined uniquely up to a set of measure zero.
Proposition 4.2.3 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be
a sequence of real-valued measurable functions defined on X. Let f and
g be real-valued measurable functions defined on X such that $f_n \xrightarrow{\mu} f$
and $f_n \xrightarrow{\mu} g$. Then f = g almost everywhere.
Proof: Let ε > 0. Then, for every n,
\[ \{x \in X \mid |f(x) - g(x)| \ge \varepsilon\} \subset \left\{x \in X \mid |f_n(x) - f(x)| \ge \frac{\varepsilon}{2}\right\} \cup \left\{x \in X \mid |f_n(x) - g(x)| \ge \frac{\varepsilon}{2}\right\}, \]
since
\[ |f(x) - g(x)| \le |f_n(x) - f(x)| + |f_n(x) - g(x)|. \]
Since $f_n \xrightarrow{\mu} f$ and $f_n \xrightarrow{\mu} g$, we deduce that
\[ \mu(\{x \in X \mid |f(x) - g(x)| \ge \varepsilon\}) = 0. \]



The result now follows from the relation
\[ \{x \in X \mid |f(x) - g(x)| > 0\} = \bigcup_{n=1}^{\infty} \left\{x \in X \mid |f(x) - g(x)| \ge \frac{1}{n}\right\}. \]
We now investigate the relationship between a sequence being Cauchy
in measure and its convergence in measure.
Proposition 4.2.4 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$
be a sequence of real-valued measurable functions defined on X. If the
sequence converges in measure, then it is Cauchy in measure.
Proof: Let $f_n \xrightarrow{\mu} f$. Then, by a similar reasoning as in the preceding
proof, we have, for ε > 0,
\[ \{x \in X \mid |f_n(x) - f_m(x)| \ge \varepsilon\} \subset \left\{x \in X \mid |f_n(x) - f(x)| \ge \frac{\varepsilon}{2}\right\} \cup \left\{x \in X \mid |f_m(x) - f(x)| \ge \frac{\varepsilon}{2}\right\}. \]
Given δ > 0, we can then find a positive integer N such that for n and
m greater than, or equal to, N, we have that the measure of each of the
two sets on the right-hand side of the above relation will be less than $\frac{\delta}{2}$.
Thus for m, n ≥ N, we have
\[ \mu(\{x \in X \mid |f_n(x) - f_m(x)| \ge \varepsilon\}) < \delta. \]
Thus the sequence $\{f_n\}$ is Cauchy in measure. □


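Proposition 4.2.5 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be
a sequence of real-valued measurable functions, defined on X, which is Cauchy in
measure. If there exists a subsequence $\{f_{n_k}\}$ which converges in measure
to a real-valued measurable function f defined on X, then $f_n \xrightarrow{\mu} f$.
Proof: Let ε > 0. Then
\[ \{x \in X \mid |f_n(x) - f(x)| \ge \varepsilon\} \subset \left\{x \in X \mid |f_n(x) - f_{n_k}(x)| \ge \frac{\varepsilon}{2}\right\} \cup \left\{x \in X \mid |f_{n_k}(x) - f(x)| \ge \frac{\varepsilon}{2}\right\}. \]
Let δ > 0 be given. Then, there exists N ∈ N such that for all n ≥ N
and for all $n_k \ge N$, we have
\[ \mu\left(\left\{x \in X \mid |f_n(x) - f_{n_k}(x)| \ge \frac{\varepsilon}{2}\right\}\right) < \frac{\delta}{2}, \]
\[ \mu\left(\left\{x \in X \mid |f_{n_k}(x) - f(x)| \ge \frac{\varepsilon}{2}\right\}\right) < \frac{\delta}{2}. \]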

Thus, for all n ≥ N,
\[ \mu(\{x \in X \mid |f_n(x) - f(x)| \ge \varepsilon\}) < \delta, \]
which shows that $f_n \xrightarrow{\mu} f$. □
Proposition 4.2.6 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be
a sequence of real-valued measurable functions, defined on X, which
converges almost uniformly to a real-valued measurable function f defined
on X. Then $f_n \xrightarrow{\mu} f$.
Proof: Let ε > 0 and δ > 0 be given. There exists F ∈ S such that
µ(F) < δ and such that $f_n \to f$ uniformly on $F^c$. Thus, there exists
$n_0 \in \mathbb{N}$ (which depends on ε and also on δ since we have chosen F
based on δ) such that, for all $x \in F^c$, and for all $n \ge n_0$, we have
$|f_n(x) - f(x)| < \varepsilon$. Hence, for all $n \ge n_0$,
\[ \mu(\{x \in X \mid |f_n(x) - f(x)| \ge \varepsilon\}) \le \mu(F) < \delta, \]
which proves that $f_n \xrightarrow{\mu} f$. □
Proposition 4.2.7 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be
a sequence of real-valued measurable functions, defined on X, which is
Cauchy in measure. Then, there exists a subsequence which is almost
uniformly Cauchy.
Proof: Since the sequence is Cauchy in measure, given k ∈ N, there
exists n(k) ∈ N such that for all n, m ≥ n(k), we have
\[ \mu\left(\left\{x \in X \mid |f_n(x) - f_m(x)| \ge \frac{1}{2^k}\right\}\right) < \frac{1}{2^k}. \]
Choose $n_1 = n(1) + 1$, $n_2 = \max\{n(2), n_1 + 2\}$, $n_3 = \max\{n(3), n_2 + 3\}$
and so on. Thus, $n_k = \max\{n(k), n_{k-1} + k\}$. Hence we get a strictly
increasing sequence $\{n_k\}$ which also satisfies $n_k > k$ for each k. Thus,
we have a subsequence $\{f_{n_k}\}$ of $\{f_n\}$.

Set
\[ E_k = \left\{x \in X \mid |f_{n_k}(x) - f_{n_{k+1}}(x)| \ge \frac{1}{2^k}\right\}. \]
Then $\mu(E_k) < 2^{-k}$. Given δ > 0, choose k such that $2^{-(k-1)} < \delta$. Set
$F = \bigcup_{i=k}^{\infty} E_i$. Then
\[ \mu(F) \le \sum_{i=k}^{\infty} \mu(E_i) < \frac{1}{2^{k-1}} < \delta. \]

Given ε > 0, choose N ≥ k such that $2^{-(N-1)} < \varepsilon$. Now $F^c = \bigcap_{i=k}^{\infty} E_i^c$.
Thus, for all $x \in F^c$ and for all m ≥ ℓ ≥ N, we have
\[ |f_{n_\ell}(x) - f_{n_m}(x)| \le \sum_{j=\ell}^{m} |f_{n_j}(x) - f_{n_{j+1}}(x)| < \sum_{j=\ell}^{m} \frac{1}{2^j} < \frac{1}{2^{\ell-1}} \le \frac{1}{2^{N-1}} < \varepsilon. \]
Thus $\{f_{n_k}\}$ is a uniformly Cauchy sequence on $F^c$ and µ(F) < δ, i.e.
$\{f_{n_k}\}$ is almost uniformly Cauchy. □
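The choice of indices in this proof is a simple recursion, which can be sketched as follows. The thresholds n(k) are supplied by a hypothetical function `n_of`, chosen here to be deliberately non-monotone; `choose_indices` and `idx` are illustrative names.

```python
def choose_indices(n_of, K):
    """Return [n_1, ..., n_K] with n_1 = n(1) + 1 and n_k = max(n(k), n_{k-1} + k)."""
    indices = [n_of(1) + 1]
    for k in range(2, K + 1):
        # n_k >= n(k) keeps the Cauchy estimate valid; n_k >= n_{k-1} + k forces strict growth
        indices.append(max(n_of(k), indices[-1] + k))
    return indices

n_of = lambda k: 3 * k if k % 2 else 1   # hypothetical thresholds n(k), not monotone in k
idx = choose_indices(n_of, 6)
print(idx)   # [4, 6, 9, 13, 18, 24]
```

The resulting indices are strictly increasing, satisfy n_k ≥ n(k) and n_k > k, and so define a genuine subsequence to which the estimate defining n(k) applies.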

Proposition 4.2.8 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$ be
a sequence of real-valued measurable functions, defined on X, which is
Cauchy in measure. Then, there exists a real-valued measurable function
f defined on X such that $f_n \xrightarrow{\mu} f$.

Proof: Let $\{f_{n_k}\}$ be a subsequence which is almost uniformly Cauchy.
Then (cf. Proposition 4.1.2), there exists a real-valued measurable function
f, defined on X, such that $\{f_{n_k}\}$ converges almost uniformly to
f. By Proposition 4.2.6, we deduce that $f_{n_k} \xrightarrow{\mu} f$, which implies that
$f_n \xrightarrow{\mu} f$, by Proposition 4.2.5. ∎

Let us summarize the results on the inter-relationships of the various


convergences studied in this chapter so far. We have defined pointwise
convergence a.e., almost uniform convergence and convergence in mea-
sure.

• If a sequence of real-valued measurable functions converges almost


uniformly, then it converges pointwise a.e. (cf. Proposition 4.1.1)
as well as in measure (cf. Proposition 4.2.6).

• If a sequence of real-valued measurable functions converges point-


wise a.e., then it converges almost uniformly (cf. Egorov’s theo-
rem) and in measure (cf. Proposition 4.2.1), provided the space is
a finite measure space.

• If a sequence of real-valued measurable functions converges in mea-


sure, then there is a subsequence which converges pointwise a.e.
(cf. Proposition 4.2.3) and almost uniformly (cf. Propositions
4.2.7 and 4.1.2). In fact, Propositions 4.2.7, 4.1.2 and 4.1.1 yield
another proof of Proposition 4.2.2.

• Almost uniform convergence of a sequence of real-valued measur-


able functions obviously implies that the sequence is almost uni-
formly Cauchy and vice-versa (cf. Proposition 4.1.2).

• Convergence in measure of a sequence of real-valued measurable


functions implies that the sequence is Cauchy in measure (cf.
Proposition 4.2.4) and vice-versa (cf. Proposition 4.2.8).
We will conclude this section by studying the behaviour of convergence
in measure with respect to basic algebraic operations on functions.
Proposition 4.2.9 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$
and $\{g_n\}_{n=1}^{\infty}$ be sequences of real-valued measurable functions defined on
X. Let $f_n \xrightarrow{\mu} f$ and let $g_n \xrightarrow{\mu} g$, where f and g are real-valued measurable
functions defined on X. Let $\alpha$ and $\beta$ be non-zero real scalars. Then
$\alpha f_n + \beta g_n \xrightarrow{\mu} \alpha f + \beta g$. We also have that $|f_n| \xrightarrow{\mu} |f|$.
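Proof: Let $\varepsilon > 0$. The result follows immediately from the following
relations:
\[ \{x \in X \mid |(\alpha f_n + \beta g_n)(x) - (\alpha f + \beta g)(x)| \ge \varepsilon\}
   \subset \left\{x \in X \mid |f_n(x) - f(x)| \ge \frac{\varepsilon}{2|\alpha|}\right\}
   \cup \left\{x \in X \mid |g_n(x) - g(x)| \ge \frac{\varepsilon}{2|\beta|}\right\}, \]
and
\[ \{x \in X \mid |\,|f_n(x)| - |f(x)|\,| \ge \varepsilon\} \subset \{x \in X \mid |f_n(x) - f(x)| \ge \varepsilon\}. \]
∎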
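The first inclusion can be verified by contraposition. The short derivation below is a sketch added for the reader; it uses only the triangle inequality and the symbols of the proposition.

```latex
% Suppose x lies in neither set on the right-hand side, i.e.
%   |f_n(x) - f(x)| < eps / (2|alpha|)  and  |g_n(x) - g(x)| < eps / (2|beta|).
\[
|(\alpha f_n + \beta g_n)(x) - (\alpha f + \beta g)(x)|
  \le |\alpha|\,|f_n(x) - f(x)| + |\beta|\,|g_n(x) - g(x)|
  < |\alpha| \cdot \frac{\varepsilon}{2|\alpha|} + |\beta| \cdot \frac{\varepsilon}{2|\beta|}
  = \varepsilon ,
\]
% so x does not lie in the set on the left-hand side. Taking measures,
\[
\mu(\{|(\alpha f_n + \beta g_n) - (\alpha f + \beta g)| \ge \varepsilon\})
  \le \mu\Big(\Big\{|f_n - f| \ge \frac{\varepsilon}{2|\alpha|}\Big\}\Big)
    + \mu\Big(\Big\{|g_n - g| \ge \frac{\varepsilon}{2|\beta|}\Big\}\Big)
  \longrightarrow 0 .
\]
```

The division by $|\alpha|$ and $|\beta|$ is where the hypothesis that the scalars are non-zero is used.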

Proposition 4.2.10 Let (X, S, µ) be a finite measure space and let
$\{f_n\}_{n=1}^{\infty}$ and $\{g_n\}_{n=1}^{\infty}$ be sequences of real-valued measurable functions
defined on X. Let $f_n \xrightarrow{\mu} f$ and let $g_n \xrightarrow{\mu} g$, where f and g are real-valued
measurable functions defined on X. Then $f_n g_n \xrightarrow{\mu} fg$.
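Proof: It follows from the relation
\[ fg = \frac{1}{4}\left[(f + g)^2 - (f - g)^2\right], \]
and from Proposition 4.2.9, that it is enough to show that if $f_n \xrightarrow{\mu} f$, then $f_n^2 \xrightarrow{\mu} f^2$.

Step 1: Let f = 0. Then, since
\[ \{x \in X \mid |f_n(x)|^2 \ge \varepsilon\} = \{x \in X \mid |f_n(x)| \ge \sqrt{\varepsilon}\}, \]
it follows that $f_n^2 \xrightarrow{\mu} 0$ as well.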
Step 2: If $f_n \xrightarrow{\mu} f$, then $f_n - f \xrightarrow{\mu} 0$. Now, let
\[ E_n = \{x \in X \mid |f(x)| > n\}, \]
so that $E_n \downarrow \emptyset$. Since $\mu(X) < +\infty$, it follows that (cf. Proposition 1.2.5)
$\mu(E_n) \downarrow 0$. Given $\delta > 0$, choose $m \in \mathbb{N}$ such that $\mu(E_m) < \delta$.
Now,
\[ \{x \in X \mid |f_n f(x) - f^2(x)| \ge \varepsilon\}
   = \left(\{x \in X \mid |f_n f(x) - f^2(x)| \ge \varepsilon\} \cap E_m\right)
   \cup \left(\{x \in X \mid |f_n f(x) - f^2(x)| \ge \varepsilon\} \cap E_m^c\right). \]
The measure of the first set on the right-hand side of the above relation
is, evidently, less than $\delta$. Since $|f(x)| \le m$ for $x \in E_m^c$, it follows that
for all points x in the second set on the right-hand side of the above
relation, we have
\[ \varepsilon \le m|f_n(x) - f(x)|, \quad \text{i.e.} \quad |f_n(x) - f(x)| \ge \frac{\varepsilon}{m}. \]
Thus, we can find $N \in \mathbb{N}$ such that, for all $n \ge N$, we have
\[ \mu(\{x \in X \mid |f_n f(x) - f^2(x)| \ge \varepsilon\} \cap E_m^c) < \delta. \]
Thus, for all $n \ge N$, we have
\[ \mu(\{x \in X \mid |f_n f(x) - f^2(x)| \ge \varepsilon\}) < 2\delta, \]
which shows that $f_n f \xrightarrow{\mu} f^2$.

Step 3: Now
\[ f_n^2 - f^2 = (f_n - f)^2 + 2(f_n f - f^2). \]
Since $f_n - f \xrightarrow{\mu} 0$, we have $(f_n - f)^2 \xrightarrow{\mu} 0$ by Step 1. Combining this with the result
of Step 2 above, we deduce that $f_n^2 \xrightarrow{\mu} f^2$, which completes the proof. ∎
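The two algebraic identities on which this proof rests can be checked by direct expansion. The following worked equations are a reader's sketch and introduce no new symbols.

```latex
% Polarization identity used at the start of the proof:
\[
(f+g)^2 - (f-g)^2 = (f^2 + 2fg + g^2) - (f^2 - 2fg + g^2) = 4fg .
\]
% Decomposition used in Step 3:
\[
(f_n - f)^2 + 2(f_n f - f^2)
  = f_n^2 - 2 f_n f + f^2 + 2 f_n f - 2 f^2
  = f_n^2 - f^2 .
\]
```

With these in hand, the general case reduces to squares: $f_n \pm g_n \xrightarrow{\mu} f \pm g$ by Proposition 4.2.9, their squares converge in measure by the three steps, and a further application of Proposition 4.2.9 with the scalars $\tfrac14$ and $-\tfrac14$ gives the product.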

Example 4.2.3 The above result is not true, in general, in infinite
measure spaces. Consider the set N equipped with the counting measure
defined on the σ-algebra of all subsets. Let
\[ f_n(k) = \begin{cases} \frac{1}{n}, & \text{if } 1 \le k \le n, \\ 0, & \text{if } k > n. \end{cases} \]
Then $f_n \to 0$ uniformly on N and hence, as already observed earlier,
$f_n \xrightarrow{\mu} 0$. Let $g(n) = n$ for all $n \in \mathbb{N}$. Now $(f_n g)(n) = 1$ for all $n \in \mathbb{N}$, and
so, for every n, the set $\{k \in \mathbb{N} \mid |(f_n g)(k)| \ge 1\}$ contains the point n and has
measure at least 1. Thus $f_n g$ does not converge uniformly to zero and it does not
converge to zero in measure. ∎
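The counterexample can be checked numerically on a finite truncation of N. The sketch below is illustrative only: the helper names `f`, `g`, `measure_of_exceedance` and the truncation bound `K` are assumptions introduced here and do not appear in the text. It uses exact rational arithmetic and counts the points where a function is at least a given threshold in absolute value, which is the counting measure of that set.

```python
from fractions import Fraction

def f(n, k):
    """f_n(k) = 1/n for 1 <= k <= n, and 0 for k > n."""
    return Fraction(1, n) if 1 <= k <= n else Fraction(0)

def g(k):
    """g(k) = k."""
    return Fraction(k)

def measure_of_exceedance(h, eps, K):
    """Counting measure of {k in {1, ..., K} : |h(k)| >= eps}."""
    return sum(1 for k in range(1, K + 1) if abs(h(k)) >= eps)

K = 1000  # hypothetical truncation bound; f_n vanishes beyond n, so any K >= n suffices
eps = Fraction(1, 10)
for n in (20, 100, 500):
    # measure of {|f_n| >= eps}: zero as soon as 1/n < eps
    m_f = measure_of_exceedance(lambda k: f(n, k), eps, K)
    # measure of {|f_n g| >= 1}: the point k = n always belongs to this set
    m_fg = measure_of_exceedance(lambda k: f(n, k) * g(k), Fraction(1), K)
    print(n, m_f, m_fg)
```

For the values of n used here the first count is 0 while the second stays equal to 1, matching the argument of the example.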

4.3 Exercises

4.1 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of
real-valued measurable functions defined on X converging in measure
to a real-valued measurable function f defined on X. If g is a real-valued
measurable function defined on X such that f = g a.e., show
that $f_n \xrightarrow{\mu} g$.

4.2 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ and $\{g_n\}_{n=1}^{\infty}$ be two
sequences of real-valued measurable functions defined on X such that
$f_n = g_n$ a.e. for every n. If $f_n \xrightarrow{\mu} f$, show that $g_n \xrightarrow{\mu} f$, where f is a
real-valued measurable function defined on X.

4.3 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of
real-valued measurable functions defined on X converging in measure
to a real-valued measurable function f defined on X. If the sequence
$\{f_n\}$ is almost uniformly Cauchy, show that it converges to f almost
uniformly.

4.4 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of
real-valued measurable functions defined on X converging in measure to
a real-valued measurable function f defined on X. If the sequence $\{f_n\}$
is pointwise Cauchy a.e., show that $f_n \to f$ almost everywhere.

4.5 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of
real-valued measurable functions defined on X such that every subsequence
has a further subsequence which converges in measure to a fixed
real-valued measurable function f defined on X. Show that $f_n \xrightarrow{\mu} f$.

4.6 (a) Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of
real-valued measurable functions defined on X. Let f be a real-valued
measurable function defined on X. Show that the following statements
are equivalent:
(i) $f_n \xrightarrow{\mu} f$.
(ii) Every subsequence of $\{f_n\}$ has a further subsequence converging to
f almost uniformly.
(b) If, in addition, $\mu(X) < +\infty$, show that the above statements are
equivalent to the following statement:
(iii) Every subsequence of $\{f_n\}$ has a further subsequence converging to
f almost everywhere.

4.7 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of
real-valued measurable functions defined on X such that $f_n \xrightarrow{\mu} 0$. Let
$\{a_n\}$ be a sequence of real numbers such that $a_n \downarrow 0$. Show that there
exists a subsequence $\{f_{n_k}\}$ such that for almost every $x \in X$, we have
$|f_{n_k}(x)| < a_k$ for sufficiently large k.

4.8 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of
real-valued measurable functions defined on X converging in measure to
a real-valued measurable function f defined on X. If, for every $n \in \mathbb{N}$,
we have that $f_n \ge 0$ a.e., show that $f \ge 0$ almost everywhere. Deduce
that
(i) if $f_n \xrightarrow{\mu} f$ and if for every n, $f_n \le g$ a.e., then $f \le g$ a.e., where g is
a real-valued measurable function defined on X;
(ii) if $f_n \xrightarrow{\mu} f$ and if for every n, $|f_n| \le g$ a.e., then $|f| \le g$ a.e., where g
is a real-valued measurable function defined on X.

4.9 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of
real-valued measurable functions defined on X converging in measure
to a real-valued measurable function f defined on X. If $f_n \le f_{n+1}$ for
every n, show that $f_n \uparrow f$ almost everywhere.
Chapter 5

Integration

5.1 Non-negative simple functions

Let (X, S, µ) be a measure space. Let $\varphi : X \to \mathbb{R}$ be a (measurable)
simple function which is non-negative. Let $\{\alpha_i\}_{i=1}^{n}$ be the set of non-zero
values assumed by $\varphi$. Set
\[ A_i = \varphi^{-1}(\{\alpha_i\}), \quad 1 \le i \le n. \]
The sets $\{A_i\}_{i=1}^{n}$ are mutually disjoint and we can write
\[ \varphi = \sum_{i=1}^{n} \alpha_i \chi_{A_i}. \tag{5.1.1} \]

In order to define the integral of $\varphi$, over the set X, with respect to
the measure µ, we imitate what one does in order to define the Riemann
integral of a step function. Given a step function of the form
\[ \varphi = \sum_{i=1}^{n} \alpha_i \chi_{I_i}, \]
where $\{I_i\}_{i=1}^{n}$ is a finite collection of disjoint intervals and the $\alpha_i$ are all
non-negative, the Riemann integral of $\varphi$ is nothing but the area under
the graph of $\varphi$, i.e.
\[ \int_{\mathbb{R}} \varphi(x)\,dx = \sum_{i=1}^{n} \alpha_i m_1(I_i). \]

Imitating this, we can define, when $\varphi$ is of the form given in (5.1.1),
\[ \int_X \varphi\,d\mu = \sum_{i=1}^{n} \alpha_i \mu(A_i). \]

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019 81
S. Kesavan, Measure and Integration, Texts and Readings in
Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_5

However, a simple function may be written in more than one way as a


finite linear combination of characteristic functions. For instance, each
set Ai may be partitioned into subsets and ϕ could be written in terms
of the characteristic functions of those subsets. It is also possible to
express ϕ in the form (5.1.1), with the sets Ai not being mutually dis-
joint. Thus, we would like to define the integral in a manner which is
independent of the way the function is written.

Let us assume that we can write $\varphi$, given by (5.1.1), in the form
\[ \varphi = \sum_{j=1}^{m} \beta_j \chi_{B_j}, \]
where the sets $\{B_j\}_{j=1}^{m}$ are also mutually disjoint. Then, it
follows that each $\beta_j$ is equal to $\alpha_i$ for some unique index i. In that case,
we have that $B_j \subset A_i$. Further, we have that
\[ A_i = \cup_{\{j \mid \beta_j = \alpha_i\}} B_j. \]

Since µ is finitely additive, we have that
\[ \mu(A_i) = \sum_{\{j \mid \beta_j = \alpha_i\}} \mu(B_j). \]

It is now immediate to see that
\[ \sum_{j=1}^{m} \beta_j \mu(B_j) = \sum_{i=1}^{n} \alpha_i \mu(A_i). \tag{5.1.2} \]

Now let us assume that we write
\[ \varphi = \sum_{i=1}^{k} \gamma_i \chi_{E_i}, \tag{5.1.3} \]
where the sets $\{E_i\}_{i=1}^{k}$ are not necessarily disjoint.

Let $\sigma = (\sigma_1, \cdots, \sigma_k)$ be a k-tuple, where $\sigma_i = \pm 1$ for each $1 \le i \le k$.
Define, for $A \subset X$,
\[ A^{\sigma_i} = \begin{cases} A, & \text{if } \sigma_i = 1, \\ A^c, & \text{if } \sigma_i = -1. \end{cases} \]
Set
\[ E^{\sigma} = \cap_{i=1}^{k} E_i^{\sigma_i}. \]
Thus, if $\sigma_0 = (-1, \cdots, -1)$, then
\[ E^{\sigma_0} = \cap_{i=1}^{k} E_i^c = \left(\cup_{i=1}^{k} E_i\right)^c. \]

Given two such k-tuples $\sigma$ and $\sigma'$ which are not equal, there must
exist i with $1 \le i \le k$ such that $\sigma_i \ne \sigma'_i$. Without loss of generality,
assume that $\sigma_i = +1$ and $\sigma'_i = -1$. In that case, by definition, $E^{\sigma} \subset E_i$
while $E^{\sigma'} \subset E_i^c$. Thus, if $\sigma \ne \sigma'$, we have that $E^{\sigma}$ and $E^{\sigma'}$ are disjoint.

Lemma 5.1.1 With the preceding notations, we have, for each $1 \le i \le k$,
\[ E_i = \cup_{\{\sigma \mid \sigma_i = +1\}} E^{\sigma}. \tag{5.1.4} \]

Proof: If $\sigma_i = +1$, then $E^{\sigma} \subset E_i$. Thus the set on the right-hand
side of (5.1.4) is contained in $E_i$. Conversely, let $x \in E_i$. Define $\sigma$ as
follows: $\sigma_j = +1$ if $x \in E_j$ and $\sigma_j = -1$ if $x \notin E_j$, where $1 \le j \le k$. In
particular, $\sigma_i = +1$ and $x \in E^{\sigma}$. This establishes the reverse inclusion
in (5.1.4) and thus completes the proof. ∎

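Now let us assume that $\varphi$ is given in the form (5.1.3). For any
$1 \le i \le k$, we have
\[ \chi_{E_i} = \chi_{\cup_{\{\sigma \mid \sigma_i = +1\}} E^{\sigma}} = \sum_{\{\sigma \mid \sigma_i = +1\}} \chi_{E^{\sigma}}. \]

Consequently, we have
\[ \varphi = \sum_{i=1}^{k} \gamma_i \sum_{\{\sigma \mid \sigma_i = +1\}} \chi_{E^{\sigma}}
   = \sum_{\sigma \ne \sigma_0} \Big( \sum_{\{i \mid \sigma_i = +1\}} \gamma_i \Big) \chi_{E^{\sigma}}. \]

By virtue of (5.1.2), we get, since the $E^{\sigma}$ are disjoint,
\[ \sum_{i=1}^{n} \alpha_i \mu(A_i)
   = \sum_{\sigma \ne \sigma_0} \Big( \sum_{\{i \mid \sigma_i = +1\}} \gamma_i \Big) \mu(E^{\sigma})
   = \sum_{i=1}^{k} \gamma_i \sum_{\{\sigma \mid \sigma_i = +1\}} \mu(E^{\sigma})
   = \sum_{i=1}^{k} \gamma_i \mu(E_i). \]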

The last equality comes from the finite additivity of the measure and
from the result of Lemma 5.1.1 above.
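The bookkeeping with the sign tuples σ can be traced on a small finite example. The sketch below is only an illustration under stated assumptions: the finite set `X`, the point masses `mu`, the overlapping sets `sets` and the coefficients `gammas` are hypothetical data chosen here, and the function names are not from the text. It computes the sum of the $\gamma_i \mu(E_i)$ directly and again by summing over the disjoint pieces $E^{\sigma}$, using exact rationals.

```python
from fractions import Fraction
from itertools import product

def integral_direct(gammas, sets, mu):
    """Sum of gamma_i * mu(E_i) over the given (possibly overlapping) sets."""
    return sum(c * sum(mu[x] for x in E) for c, E in zip(gammas, sets))

def integral_via_atoms(gammas, sets, mu, X):
    """Sum over sigma != sigma_0 of (sum of gamma_i with sigma_i = +1) * mu(E^sigma)."""
    total = Fraction(0)
    for sigma in product((+1, -1), repeat=len(sets)):
        if all(s == -1 for s in sigma):
            continue  # sigma_0: the piece outside every E_i, where phi vanishes
        # E^sigma: points lying in E_i exactly when sigma_i = +1
        atom = [x for x in X
                if all((x in E) == (s == +1) for E, s in zip(sets, sigma))]
        coeff = sum(c for c, s in zip(gammas, sigma) if s == +1)
        total += coeff * sum(mu[x] for x in atom)
    return total

# Hypothetical data: X = {0, ..., 5} with point masses, and three overlapping sets.
X = range(6)
mu = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(3),
      3: Fraction(1, 4), 4: Fraction(2), 5: Fraction(5)}
sets = [{0, 1, 2}, {2, 3}, {1, 2, 4}]
gammas = [Fraction(2), Fraction(1, 3), Fraction(7)]

print(integral_direct(gammas, sets, mu), integral_via_atoms(gammas, sets, mu, X))
```

Both computations return the same rational number, which is the content of the chain of equalities above: the value of the sum does not depend on whether the sets in the representation are disjoint.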

Thus we can now make the following definition, which is independent


of the representation of a simple function.

Definition 5.1.1 Let (X, S, µ) be a measure space and let $\varphi$ be a
non-negative simple function given by
\[ \varphi = \sum_{i=1}^{k} \gamma_i \chi_{E_i}. \]

The (Lebesgue) integral of $\varphi$, over the set X, with respect to
the measure µ, is given by
\[ \int_X \varphi\,d\mu = \sum_{i=1}^{k} \gamma_i \mu(E_i). \]
∎

Remark 5.1.1 Notice that the measure of some (or all) of the sets Ei
could be +∞. Thus the integral of ϕ could be +∞ as well. This is
the reason why we only consider non-negative functions. If the γi were
of different signs and if the corresponding sets were of infinite measure,
then we cannot add them meaningfully. For consistency, if $\gamma_i = 0$ and
the set $E_i$ has infinite measure, we adopt the convention that $0 \cdot \infty = 0$. ∎

Remark 5.1.2 Let (X, S, µ) be a measure space and let E be a measurable
subset of X. Then we can consider the σ-algebra $S_E$ of sets of the
form $A \cap E$, where $A \in S$, defined on E, and the restriction of the measure
µ to this σ-algebra. If $\varphi$ is a non-negative simple function defined
on X given by (5.1.3), then its restriction to E is given by
\[ \varphi|_E = \sum_{i=1}^{k} \gamma_i \chi_{E_i \cap E}. \]

We define the integral of $\varphi|_E$, over E, with respect to the measure µ
restricted to E, as the integral of $\varphi$, over the set E, with respect to the
measure µ, and denote it by $\int_E \varphi\,d\mu$. Clearly we have
\[ \int_E \varphi\,d\mu = \sum_{i=1}^{k} \gamma_i \mu(E_i \cap E) = \int_X \varphi\chi_E\,d\mu. \]
∎

Remark 5.1.3 Let (X, S, µ) be a measure space and let $\varphi$ be a
non-negative simple function given by (5.1.1). Let $\psi$ be another non-negative
simple function such that $\psi \le \varphi$. Then, clearly, we can write
\[ \psi = \sum_{j=1}^{m} \beta_j \chi_{F_j}, \]
where each $F_j$ is a subset of some unique $A_i$ and, in that case, $0 \le \beta_j \le \alpha_i$.
Then it is clear that we have
\[ \int_X \psi\,d\mu \le \int_X \varphi\,d\mu. \]
∎

5.2 Non-negative functions

Let (X, S, µ) be a measure space. Let f be a non-negative, extended


real-valued measurable function defined on X. Recall that (cf. Theorem
3.1.1) f is the increasing limit of a sequence of non-negative simple
functions.

Definition 5.2.1 Let (X, S, µ) be a measure space and let f be a
non-negative, extended real-valued measurable function defined on X. Then
the (Lebesgue) integral of f, over the set X, with respect to the
measure µ, is defined by
\[ \int_X f\,d\mu = \sup_{\substack{0 \le \varphi \le f \\ \varphi \text{ simple}}} \int_X \varphi\,d\mu. \]

If $E \subset X$ is a measurable set, then we define the integral of f, over the
set E, with respect to the measure µ, by
\[ \int_E f\,d\mu = \int_X f\chi_E\,d\mu. \]
∎

Remark 5.2.1 In view of Remarks 5.1.2 and 5.1.3, it is clear that the
above definition is consistent with the definitions made in the previous
section in case f is itself a non-negative simple function. Again, if ϕ is a
simple function such that 0 ≤ ϕ ≤ f , then ϕχE is a non-negative simple
function defined on E which is bounded above by f |E . Conversely, if ϕ
is a non-negative simple function defined on E bounded above by f |E ,

then its extension to all of X by setting it to be zero outside E is also


a non-negative simple function defined on X and is bounded above by
f . Thus, we can easily see that the integral of f , over the set E, with
respect to the measure µ, defined above is the same as the integral of
the function f |E , over the set E, with respect to the restriction of the
measure µ to E .

Remark 5.2.2 Notice that the integral of a non-negative real-valued


function may be infinite. 

The following proposition is an immediate consequence of the defi-


nition of the integral for non-negative functions.

Proposition 5.2.1 Let (X, S, µ) be a measure space and let f be a
non-negative, extended real-valued measurable function defined on X.
(a) If g is a measurable function defined on X such that $0 \le g \le f$, and
if E is a measurable subset of X, then
\[ \int_E g\,d\mu \le \int_E f\,d\mu. \tag{5.2.1} \]
(b) If E and F are measurable subsets of X such that $E \subset F$, then
\[ \int_E f\,d\mu \le \int_F f\,d\mu. \tag{5.2.2} \]
(c) If c is a non-negative real number, and if E is a measurable subset
of X, then
\[ \int_E cf\,d\mu = c \int_E f\,d\mu. \tag{5.2.3} \]
(d) If E is a measurable subset of X such that $f(x) = 0$ for all $x \in E$,
then
\[ \int_E f\,d\mu = 0. \]
(e) If E is a measurable subset of X such that $\mu(E) = 0$, then
\[ \int_E f\,d\mu = 0. \]
∎

Proposition 5.2.2 Let (X, S, µ) be a measure space and let $f : X \to \mathbb{R}$
be a non-negative measurable function. If $\int_X f\,d\mu = 0$, then f = 0
almost everywhere.

Proof: Set
\[ F_n = \left\{x \in X \mid f(x) > \frac{1}{n}\right\}, \quad n \in \mathbb{N}. \]
Then
\[ \{x \in X \mid f(x) \ne 0\} = \cup_{n=1}^{\infty} F_n. \]

Then, by virtue of (5.2.1) and (5.2.2), we get
\[ \frac{1}{n}\mu(F_n) \le \int_{F_n} f\,d\mu \le \int_X f\,d\mu = 0. \]

Thus, $\mu(F_n) = 0$ for each $n \in \mathbb{N}$ and the result follows. ∎
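The two steps that the proof leaves implicit can be written out as follows. This is a reader's sketch using only the definitions above and the countable subadditivity of µ.

```latex
% On F_n we have f > 1/n, hence (1/n) chi_{F_n} <= f chi_{F_n}; by (5.2.1),
\[
\frac{1}{n}\,\mu(F_n) = \int_X \frac{1}{n}\,\chi_{F_n}\,d\mu
   \le \int_X f \chi_{F_n}\,d\mu = \int_{F_n} f\,d\mu .
\]
% Once mu(F_n) = 0 for every n, countable subadditivity gives
\[
\mu(\{x \in X \mid f(x) \ne 0\}) = \mu\Big(\bigcup_{n=1}^{\infty} F_n\Big)
   \le \sum_{n=1}^{\infty} \mu(F_n) = 0 .
\]
```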


Proposition 5.2.3 Let (X, S, µ) be a measure space and let $\varphi : X \to \mathbb{R}$
be a non-negative simple function. Define, for $E \in S$,
\[ \nu(E) = \int_E \varphi\,d\mu. \]

Then, ν defines a measure on S.

Proof: Clearly ν is non-negative and $\nu(\emptyset) = 0$. We just need to check
countable additivity. Let $\{E_i\}_{i=1}^{\infty}$ be a sequence of measurable sets in
X which are mutually disjoint. Let $E = \cup_{i=1}^{\infty} E_i$. Let $\varphi = \sum_{j=1}^{k} \alpha_j \chi_{A_j}$.
Then
\[ \nu(E) = \int_E \varphi\,d\mu = \sum_{j=1}^{k} \alpha_j \mu(A_j \cap E)
   = \sum_{j=1}^{k} \alpha_j \sum_{i=1}^{\infty} \mu(A_j \cap E_i)
   = \sum_{i=1}^{\infty} \sum_{j=1}^{k} \alpha_j \mu(A_j \cap E_i)
   = \sum_{i=1}^{\infty} \int_{E_i} \varphi\,d\mu = \sum_{i=1}^{\infty} \nu(E_i). \]

This completes the proof. ∎


Proposition 5.2.4 Let (X, S, µ) be a measure space and let $\varphi$ and $\psi$
be non-negative simple functions defined on X. Then
\[ \int_X (\varphi + \psi)\,d\mu = \int_X \varphi\,d\mu + \int_X \psi\,d\mu. \tag{5.2.4} \]
Proof: Let $\varphi = \sum_{i=1}^{n} \alpha_i \chi_{A_i}$ and let $\psi = \sum_{j=1}^{m} \beta_j \chi_{B_j}$, where the $\{A_i\}_{i=1}^{n}$
and the $\{B_j\}_{j=1}^{m}$ are collections of mutually disjoint sets. Set $E_{ij} = A_i \cap B_j$. Then
\[ \varphi + \psi = \sum_{i=1}^{n} \sum_{j=1}^{m} (\alpha_i + \beta_j)\chi_{E_{ij}}. \]
Then,
\[ \int_{E_{ij}} (\varphi + \psi)\,d\mu = (\alpha_i + \beta_j)\mu(E_{ij}) = \int_{E_{ij}} \varphi\,d\mu + \int_{E_{ij}} \psi\,d\mu. \]

The result now follows immediately from the preceding proposition since
the $E_{ij}$ are all disjoint. ∎

We are now in a position to prove the first important theorem which


shows how the (Lebesgue) integral handles limit processes.
Theorem 5.2.1 (Monotone convergence theorem) Let (X, S, µ) be a
measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of non-negative measurable
functions defined on X such that, for every $x \in X$,
(i) $0 \le f_1(x) \le f_2(x) \le \cdots \le f_n(x) \le \cdots$, and,
(ii) $\lim_{n \to \infty} f_n(x) = f(x)$.
Then
\[ \lim_{n \to \infty} \int_X f_n\,d\mu = \int_X f\,d\mu. \]
Proof: Let
\[ \alpha = \sup_n \int_X f_n\,d\mu. \]
By Proposition 5.2.1 (a), we have
\[ \int_X f_1\,d\mu \le \int_X f_2\,d\mu \le \cdots \le \int_X f_n\,d\mu \le \cdots \]
Since $f_n \le f$, we also have $\int_X f_n\,d\mu \le \int_X f\,d\mu$. Thus, it follows that
\[ \alpha \le \int_X f\,d\mu. \]

We now need to prove the reverse inequality, which will complete the
proof.

Let $0 < c < 1$ be any fixed constant. Let $\varphi$ be a simple function such
that $0 \le \varphi \le f$. For $n \in \mathbb{N}$, define
\[ E_n = \{x \in X \mid f_n(x) \ge c\varphi(x)\}. \]
Then each set $E_n$ is measurable. Since the sequence $\{f_n\}_{n=1}^{\infty}$ is increasing,
we have that $E_1 \subset E_2 \subset \cdots \subset E_n \subset \cdots$. Further, given any $x \in X$,
we have two possibilities.

• Either, $f(x) = 0$, which implies that $f_n(x) = 0$ for all n and also
that $\varphi(x) = 0$. In this case $x \in E_1$.

• Or, $f(x) > 0$, which implies that $f(x) > c\varphi(x)$, since $0 < c < 1$. In
this case, there exists $n \in \mathbb{N}$ such that $c\varphi(x) < f_n(x) \le f(x)$ and
so $x \in E_n$.

Thus, $X = \cup_{n=1}^{\infty} E_n$. Now,
\[ \int_X f_n\,d\mu \ge \int_{E_n} f_n\,d\mu \ge c \int_{E_n} \varphi\,d\mu \overset{\text{def}}{=} c\,\nu(E_n). \]

But, by Proposition 5.2.3, ν is a measure and so
\[ \alpha \ge c \lim_{n \to \infty} \nu(E_n) = c\,\nu(X) = c \int_X \varphi\,d\mu. \]

Since this is true for any simple function $\varphi$ satisfying $0 \le \varphi \le f$, we get,
by definition of the integral, that
\[ \alpha \ge c \int_X f\,d\mu. \]

Since $0 < c < 1$ was arbitrarily chosen, it now follows, on letting c tend
to unity, that
\[ \alpha \ge \int_X f\,d\mu, \]
which completes the proof. ∎
Remark 5.2.3 It is possible that $\int_X f\,d\mu$ is infinite. In that case,
we conclude from the preceding theorem that the limit of $\int_X f_n\,d\mu$, as
$n \to \infty$, is also infinite. ∎
Proposition 5.2.5 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$
be a sequence of non-negative extended real-valued measurable functions
defined on X. Set
\[ f(x) = \sum_{n=1}^{\infty} f_n(x), \quad x \in X. \]

Then f is a non-negative measurable function and
\[ \int_X f\,d\mu = \sum_{n=1}^{\infty} \int_X f_n\,d\mu. \]

Proof: Any finite sum of non-negative measurable functions is measurable
(cf. Remark 3.1.2). If $g_n = f_1 + \cdots + f_n$, then $g_n$ increases to f.
Thus, for any $\alpha \in \mathbb{R}$, we have
\[ \{x \in X \mid f(x) \le \alpha\} = \cap_{n=1}^{\infty} \{x \in X \mid g_n(x) \le \alpha\}, \]
which shows that f is measurable.

Let $\{\varphi_n\}_{n=1}^{\infty}$ and $\{\psi_n\}_{n=1}^{\infty}$ be increasing sequences of non-negative
simple functions increasing to $f_1$ and $f_2$ respectively (cf. Theorem 3.1.1).
By Proposition 5.2.4,
\[ \int_X (\varphi_n + \psi_n)\,d\mu = \int_X \varphi_n\,d\mu + \int_X \psi_n\,d\mu. \]
We also have that $\varphi_n + \psi_n$ increases to $f_1 + f_2$. Hence, by the monotone
convergence theorem, we get
\[ \lim_{n \to \infty} \int_X (\varphi_n + \psi_n)\,d\mu = \int_X (f_1 + f_2)\,d\mu, \]
and
\[ \lim_{n \to \infty} \int_X \varphi_n\,d\mu = \int_X f_1\,d\mu, \qquad
   \lim_{n \to \infty} \int_X \psi_n\,d\mu = \int_X f_2\,d\mu. \]
We thus conclude that
\[ \int_X (f_1 + f_2)\,d\mu = \int_X f_1\,d\mu + \int_X f_2\,d\mu. \]
It now follows by induction that, for any $n \in \mathbb{N}$,
\[ \int_X (f_1 + \cdots + f_n)\,d\mu = \sum_{k=1}^{n} \int_X f_k\,d\mu. \]

Setting $g_n = f_1 + \cdots + f_n$, we get that $g_n$ is non-negative, measurable
and increases to f. Thus, the result follows, once again, by an application
of the monotone convergence theorem. ∎

Example 5.2.1 (Integration with respect to the counting measure) Let
$X = \mathbb{N}$ be equipped with the counting measure. Let $f : X \to \mathbb{R}$ be a
given non-negative function. Let $f(k) = a_k \ge 0$, $k \in \mathbb{N}$. If we define
\[ f_n(k) = \begin{cases} a_k, & \text{if } 1 \le k \le n, \\ 0, & \text{if } k > n, \end{cases} \]
then $f_n$ increases to f. Notice that
\[ f_n = \sum_{k=1}^{n} a_k \chi_{\{k\}} \]
is a non-negative simple function and so, by definition,
\[ \int_X f_n\,d\mu = \sum_{k=1}^{n} a_k \mu(\{k\}) = \sum_{k=1}^{n} a_k. \]

Thus, by the monotone convergence theorem, it follows that
\[ \int_X f\,d\mu = \sum_{k=1}^{\infty} a_k. \]

The integral of a non-negative function over N with respect to the counting
measure is just the summation of the values of the function. ∎

Example 5.2.2 (Integration with respect to the Dirac measure) Let
X be a non-empty set and let $x_0 \in X$. Let µ be the Dirac measure
concentrated at the point $x_0$ (cf. Example 1.2.2). Let $\varphi$ be a simple
function defined as in (5.1.1). Then $x_0$ can belong to at most one set
$A_i$. If $x_0 \notin A_j$ for any $1 \le j \le n$, then $\mu(A_j) = 0$ for all $1 \le j \le n$ and
we have
\[ \int_X \varphi\,d\mu = 0 = \varphi(x_0). \]
If $1 \le i_0 \le n$ is such that $x_0 \in A_{i_0}$, then we see that
\[ \int_X \varphi\,d\mu = \alpha_{i_0} = \varphi(x_0). \]

Now, if f is any non-negative extended real-valued measurable function
defined on X, it is the increasing limit of non-negative simple functions
and, by the monotone convergence theorem, it immediately follows that
\[ \int_X f\,d\mu = f(x_0). \]

Thus, integration of a non-negative function with respect to the Dirac
measure is just evaluation of the function at the point where the measure
is concentrated. ∎

Example 5.2.3 Let $\{a_{ij}\}_{i,j=1}^{\infty}$ be a double sequence of non-negative real
numbers. Let $X = \mathbb{N}$ be equipped with the counting measure. Define
$f_i(j) = a_{ij}$, $1 \le i, j < \infty$. Define $f = \sum_{i=1}^{\infty} f_i$. Then
\[ f(j) = \sum_{i=1}^{\infty} a_{ij}. \]

Now, by Proposition 5.2.5, we get
\[ \int_X f\,d\mu = \sum_{i=1}^{\infty} \int_X f_i\,d\mu. \]

Using the result of Example 5.2.1, this translates into the following relation:
\[ \sum_{j=1}^{\infty} f(j) = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} f_i(j). \]

Substituting the values for $f(j)$ and $f_i(j)$, we get
\[ \sum_{j=1}^{\infty} \sum_{i=1}^{\infty} a_{ij} = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij}. \]

Thus, we have shown that, for a non-negative double sequence of reals,
the order of summation can be reversed. (Of course, both sums could be
infinite.) This result is not true in general for sequences which change
sign. We will later see sufficient conditions which ensure that the order
of summation is immaterial. ∎
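To see that the sign restriction cannot simply be dropped, here is one standard counterexample. It is not taken from the text; the particular double sequence below is an assumption chosen for illustration.

```latex
% A double sequence with entries of both signs:
\[
a_{ij} = \begin{cases} 1, & \text{if } i = j,\\ -1, & \text{if } i = j+1,\\ 0, & \text{otherwise.} \end{cases}
\]
% Summing over i first: column j contains exactly one +1 (i = j) and one -1 (i = j+1).
\[
\sum_{j=1}^{\infty} \sum_{i=1}^{\infty} a_{ij} = \sum_{j=1}^{\infty} (1 - 1) = 0 .
\]
% Summing over j first: row 1 contains only the entry +1; each row i >= 2 contains
% +1 (j = i) and -1 (j = i-1).
\[
\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij} = 1 + \sum_{i=2}^{\infty} (1 - 1) = 1 .
\]
```

Every inner sum here has at most two non-zero terms, so both iterated sums are well defined, yet they differ.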

Theorem 5.2.2 (Fatou's lemma) Let (X, S, µ) be a measure space and
let $\{f_n\}_{n=1}^{\infty}$ be a sequence of non-negative extended real-valued measurable
functions defined on X. Then
\[ \int_X (\liminf_{n \to \infty} f_n)\,d\mu \le \liminf_{n \to \infty} \int_X f_n\,d\mu. \]

Proof: Set $g_n(x) = \inf_{i \ge n} f_i(x)$ for $x \in X$. Then $\{g_n\}_{n=1}^{\infty}$ is an increasing
sequence of non-negative measurable functions whose limit is
$\liminf_{n \to \infty} f_n$. Thus, by the monotone convergence theorem, we get
\[ \int_X (\liminf_{n \to \infty} f_n)\,d\mu = \lim_{n \to \infty} \int_X g_n\,d\mu
   \le \lim_{n \to \infty} \inf_{i \ge n} \int_X f_i\,d\mu = \liminf_{n \to \infty} \int_X f_n\,d\mu. \]
This completes the proof. ∎

Example 5.2.4 We can have strict inequality in Fatou's lemma. Let
$X = \mathbb{R}$ be equipped with the Lebesgue measure, $m_1$. Set
\[ f_n = \chi_{[n,n+1)}. \]
Then $f_n(x) \to 0$ for each $x \in \mathbb{R}$ and so $\int_{\mathbb{R}} (\liminf_{n \to \infty} f_n)\,dm_1 = 0$. On
the other hand, for every $n \in \mathbb{N}$, we have $\int_{\mathbb{R}} f_n\,dm_1 = m_1([n, n+1)) = 1$. ∎

The following result is a variation of the monotone convergence the-


orem.

Proposition 5.2.6 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^{\infty}$
and f be non-negative, extended real-valued measurable functions defined
on X. Assume that, for all $n \in \mathbb{N}$, and for all $x \in X$, we have
$0 \le f_n(x) \le f(x)$, and that $f_n(x) \to f(x)$ as $n \to \infty$. Then
\[ \lim_{n \to \infty} \int_X f_n\,d\mu = \int_X f\,d\mu. \]

Proof: By Fatou's lemma and the fact that the $f_n$ are all bounded
above by f, we get
\[ \int_X f\,d\mu \le \liminf_{n \to \infty} \int_X f_n\,d\mu \le \limsup_{n \to \infty} \int_X f_n\,d\mu \le \int_X f\,d\mu, \]
from which the desired result follows immediately. ∎

We conclude this section with a generalization of Proposition 5.2.3.

Proposition 5.2.7 Let (X, S, µ) be a measure space and let f be a
non-negative extended real-valued measurable function defined on X. Define
\[ \nu(E) = \int_E f\,d\mu, \quad E \in S. \]

Then ν defines a measure on S. If g is any non-negative extended
real-valued measurable function defined on X, then we have
\[ \int_X g\,d\nu = \int_X gf\,d\mu. \tag{5.2.5} \]

Proof: Clearly $\nu(\emptyset) = 0$ and $\nu(E) \ge 0$ for all $E \in S$. Let $\{E_i\}_{i=1}^{\infty}$ be a
sequence of mutually disjoint sets in S whose union is E. Then
\[ \chi_E = \sum_{i=1}^{\infty} \chi_{E_i}. \]

Thus, using Proposition 5.2.5, we get
\[ \nu(E) = \int_E f\,d\mu = \int_X f\chi_E\,d\mu
   = \sum_{i=1}^{\infty} \int_X f\chi_{E_i}\,d\mu = \sum_{i=1}^{\infty} \int_{E_i} f\,d\mu
   = \sum_{i=1}^{\infty} \nu(E_i). \]

This establishes the countable additivity and thus ν is a measure.

Now, let $g = \chi_E$, where $E \in S$. Then,
\[ \int_X g\,d\nu = \nu(E) = \int_E f\,d\mu = \int_X fg\,d\mu. \]

Thus (5.2.5) is true when g is a characteristic function. By the linearity
of the integral with respect to the integrand (cf. (5.2.3) and (5.2.4)), it
follows that (5.2.5) is true when g is any non-negative simple function.
Then, by the monotone convergence theorem, (5.2.5) is true when g is
any non-negative extended real-valued measurable function. ∎

Remark 5.2.4 The above method of proof is very useful in proving


several identities involving integrals of non-negative measurable func-
tions. We first prove an identity for characteristic functions, and then,
by linearity, for simple functions and then, by the monotone convergence
theorem, for arbitrary non-negative measurable functions. 

Remark 5.2.5 The result of (5.2.5) is often symbolically written as


dν = f dµ. An important converse of this result is known as the
Radon-Nikodym theorem which we will see much later. 

5.3 Integrable functions

Let (X, S, µ) be a measure space. We now consider an arbitrary measurable
function f defined on X. Since any function f can be split into its
positive and negative parts (cf. Remark 3.1.3) as $f = f^+ - f^-$, we may,
in view of the linearity of the integral with respect to the integrand, try
to define the integral of f as the difference between the integrals of $f^+$
and $f^-$, which are well-defined, since these functions are non-negative.
However, if both these integrals turn out to be infinite, we cannot define
their difference. So we need that at least one of the two quantities,
$\int_X f^+\,d\mu$ or $\int_X f^-\,d\mu$, be finite. In view of this requirement, we make
the following definition.

Definition 5.3.1 Let (X, S, µ) be a measure space and let f be a measurable
function defined on X. The function f is said to be (Lebesgue)
integrable if
\[ \int_X |f|\,d\mu < +\infty. \]
∎

Since $|f| = f^+ + f^-$, it now follows from the definition of integrability
that both $\int_X f^+\,d\mu$ and $\int_X f^-\,d\mu$ are finite and so we are now in a
position to define the integral of an integrable function.

Definition 5.3.2 Let (X, S, µ) be a measure space and let f be an
integrable function defined on X. Then the (Lebesgue) integral of f
over X with respect to the measure µ, is defined by
\[ \int_X f\,d\mu = \int_X f^+\,d\mu - \int_X f^-\,d\mu. \tag{5.3.1} \]
∎

At this point, it is easy for us to consider complex-valued functions


as well. Let f : X → C be a given function, written in terms of its real
and imaginary parts, as f = u + iv.

Definition 5.3.3 Let (X, S, µ) be a measure space and let f be a complex-


valued function defined on X. It is said to be measurable if its real
and imaginary parts are measurable real-valued functions. It is said to
be integrable if, in addition, |f | is integrable. 

If f is an integrable complex-valued function, then, clearly, its real
and imaginary parts are also integrable, for, if $f = u + iv$, then $|u| \le |f|$
and $|v| \le |f|$. Thus we may now define, in this case,
\[ \int_X f\,d\mu = \int_X u\,d\mu + i \int_X v\,d\mu. \tag{5.3.2} \]

Notation Let (X, S, µ) be a measure space. We generally use the symbol
$\int_X f\,d\mu$ to denote the (Lebesgue) integral of an integrable function
defined over X, with respect to the measure µ. However, there may arise
situations when f depends on more than one variable. For instance, if
f is a function of two variables x and y varying over different measure
spaces, say, $(X, S, \mu)$ and $(Y, S', \nu)$, and if we wish to integrate f as a
function of x with y fixed in Y, we will write the integral as
\[ \int_X f(x, y)\,d\mu(x). \]
∎

We now prove the full linearity of the Lebesgue integral with respect to
the integrand.

Theorem 5.3.1 Let (X, S, µ) be a measure space and let f and g be
integrable, complex-valued functions defined on X. Let $\alpha$ and $\beta$ be complex
constants. Then
\[ \int_X (\alpha f + \beta g)\,d\mu = \alpha \int_X f\,d\mu + \beta \int_X g\,d\mu. \tag{5.3.3} \]

Proof: By definition, αf +βg is clearly measurable. Further, since |αf +


βg| ≤ |α||f |+|β||g|, it follows, from the linearity of the Lebesgue integral
with respect to non-negative integrands and non-negative constants (cf.
(5.2.3) and the proof of Proposition 5.2.5), that αf + βg is integrable as
well.
We will now show that
\[ \int_X (f + g)\,d\mu = \int_X f\,d\mu + \int_X g\,d\mu. \tag{5.3.4} \]

Again, by definition of the integral for complex-valued functions, it is
enough to prove (5.3.4) when f and g are real-valued. Assuming this,
let $h = f + g$. Then
\[ h^+ - h^- = f + g = f^+ - f^- + g^+ - g^-, \]
which implies that
\[ h^+ + f^- + g^- = f^+ + g^+ + h^-. \]

Since all the functions involved in the above relation are non-negative,
we deduce that (cf. the proof of Proposition 5.2.5)
\[ \int_X h^+\,d\mu + \int_X f^-\,d\mu + \int_X g^-\,d\mu
   = \int_X f^+\,d\mu + \int_X g^+\,d\mu + \int_X h^-\,d\mu. \]
Since all the quantities in the above relation are finite, we can rearrange
the terms to get
\[ \int_X h^+\,d\mu - \int_X h^-\,d\mu
   = \int_X f^+\,d\mu - \int_X f^-\,d\mu + \int_X g^+\,d\mu - \int_X g^-\,d\mu, \]
which is exactly (5.3.4).


Finally, we show that if $c \in \mathbb{C}$ and if f is a complex-valued integrable
function defined on X, we have
\[ \int_X cf\,d\mu = c \int_X f\,d\mu, \tag{5.3.5} \]
which will complete the proof. If $c \ge 0$, then (5.3.5) follows from the
definition of the integral and from (5.2.3). If $c = -1$, then the result is
again true since, for a real-valued function f, we have $(-f)^+ = f^-$ and
$(-f)^- = f^+$, and we can again use the definition of the integral. Thus,
(5.3.5) is true for all real constants c. The proof will be complete if we
can prove the relation when $c = i$. Let $f = u + iv$ be the decomposition
of f into its real and imaginary parts. Then, by definition,
\[ \int_X if\,d\mu = \int_X (-v + iu)\,d\mu = -\int_X v\,d\mu + i \int_X u\,d\mu = i \int_X f\,d\mu. \]

This completes the proof. ∎

The next result is very important for estimating integrals.


Theorem 5.3.2 Let (X, S, µ) be a measure space and let f be a complex-valued
integrable function defined on X. Then
\[ \left| \int_X f\,d\mu \right| \le \int_X |f|\,d\mu. \tag{5.3.6} \]

Proof: If $z \in \mathbb{C}$, then $z = |z|e^{i\theta}$, where $0 \le \theta < 2\pi$. Thus, we can write
$|z| = \alpha z$, where $|\alpha| = 1$. Let
\[ \left| \int_X f\,d\mu \right| = \alpha \int_X f\,d\mu, \]
where $|\alpha| = 1$. Let u denote the real part of the function $\alpha f$. Then
$u \le |\alpha f| \le |f|$. Then, by the preceding theorem, we have
\[ \left| \int_X f\,d\mu \right| = \alpha \int_X f\,d\mu = \int_X \alpha f\,d\mu = \int_X u\,d\mu \le \int_X |f|\,d\mu. \]
(In the above chain of equalities, we used the fact that the integral of $\alpha f$
is, in fact, a real quantity and so $\int_X \alpha f\,d\mu = \int_X u\,d\mu$.) This completes
the proof. ∎

We will now prove a result, which, without exaggeration, could be


called the high point of the theory of the Lebesgue integral. It proves
that we can interchange the processes of limits and integration under
fairly simple conditions.
Theorem 5.3.3 (Dominated convergence theorem) Let (X, S, µ) be a
measure space and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of (complex-valued) integrable
functions defined on X, converging pointwise to a function f.
Assume, further, that for all $x \in X$, and for all $n \in \mathbb{N}$, we have
$|f_n(x)| \le g(x)$, where g is a non-negative integrable function defined
on X. Then, f is integrable. Further,
\[ \lim_{n \to \infty} \int_X |f_n - f|\,d\mu = 0. \tag{5.3.7} \]

In particular, we have
\[ \lim_{n \to \infty} \int_X f_n\,d\mu = \int_X f\,d\mu. \tag{5.3.8} \]

Proof: By the preceding theorem, (5.3.8) is an immediate consequence


of (5.3.7).
Since we also have |f (x)| ≤ g(x), we deduce that f is integrable.
Now, |fn − f | ≤ 2g and so 2g − |fn − f | is non-negative and converges
to 2g as n → ∞. Consequently, by Fatou’s lemma, we get
Z Z
2 g dµ ≤ lim inf (2g − |fn − f |) dµ
X n→∞ X
R R
= 2 X g dµ − lim supn→∞ X |fn − f | dµ.
R
Since g dµ < +∞, we deduce from the above that
X
Z Z
0 ≤ lim inf |fn − f | dµ ≤ lim sup |fn − f | dµ ≤ 0.
n→∞ X n→∞ X

This proves (5.3.7). 

Remark 5.3.1 Since the integral of a function over a set of measure
zero is zero, in all the convergence theorems proved up to now (say, the
monotone convergence theorem and the dominated convergence theorem),
the theorems remain valid even if we assume that fn → f almost
everywhere. If E is the set of measure zero where convergence fails,
we can work with X\E in the proofs and the results remain valid for
integrals over X since the addition of the integrals over E does not alter
anything.

Example 5.3.1 Let X = N be equipped with the counting measure.
Define

$$f_n(k) = \begin{cases} \dfrac{1}{n}, & \text{if } 1 \le k \le n, \\ 0, & \text{if } k > n. \end{cases}$$

Then $f_n \to f \equiv 0$ uniformly. However, while $\int_X f \, d\mu = 0$, we have
$\int_X f_n \, d\mu = n \cdot \frac{1}{n} = 1$ for each n. This is because the $f_n$ are not bounded
above by any integrable function.
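
The failure of domination in this example can be checked by direct computation. The following is a minimal Python sketch, not part of the original text; the helper names `f_n`, `integral_counting`, `envelope` and `envelope_partial_sum` are hypothetical. It relies on the observation that the smallest function dominating every $f_n$ is $g(k) = \sup_n f_n(k) = 1/k$, whose integral with respect to the counting measure is the divergent harmonic series.

```python
from fractions import Fraction

def f_n(n, k):
    """f_n(k) = 1/n for 1 <= k <= n, and 0 for k > n."""
    return Fraction(1, n) if 1 <= k <= n else Fraction(0)

def integral_counting(n):
    """Integral of f_n w.r.t. the counting measure on N.
    f_n vanishes beyond k = n, so this finite sum is exact."""
    return sum(f_n(n, k) for k in range(1, n + 1))

def envelope(k, n_max):
    """sup of f_n(k) over 1 <= n <= n_max; equals 1/k once n_max >= k."""
    return max(f_n(n, k) for n in range(1, n_max + 1))

def envelope_partial_sum(K):
    """Partial sum over k = 1..K of g(k) = 1/k (a harmonic number)."""
    return sum(Fraction(1, k) for k in range(1, K + 1))

for n in (1, 10, 100):
    # sup-norm of f_n is f_n(1) = 1/n, yet the integral stays equal to 1
    print(n, float(f_n(n, 1)), integral_counting(n))

# Partial integrals of the envelope grow without bound (slowly).
print([float(envelope_partial_sum(K)) for K in (10, 100, 1000)])
```

The sup-norm column tends to zero while the integral column stays at 1, and the partial sums of the envelope keep growing, which is consistent with the absence of an integrable dominating function.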
Definition 5.3.4 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^\infty$ be
a sequence of integrable functions defined on X. Then we say that the
sequence converges in the mean to an integrable function f if (5.3.7)
holds.

Proposition 5.3.1 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^\infty$ be
a sequence of integrable functions defined on X, converging in the mean
to an integrable function f defined on X. Then $f_n \xrightarrow{\mu} f$. In particular,
there exists a subsequence $\{f_{n_k}\}$ which converges to f pointwise, almost
everywhere.

Proof: Let ε > 0. Set

$$E_n(\varepsilon) = \{x \in X \mid |f_n(x) - f(x)| \ge \varepsilon\}.$$

Then,

$$0 \le \int_{E_n(\varepsilon)} |f_n - f| \, d\mu \le \int_X |f_n - f| \, d\mu.$$

It now follows from the definition of $E_n(\varepsilon)$ that (cf. (5.2.1)),

$$\mu(E_n(\varepsilon)) \le \frac{1}{\varepsilon} \int_X |f_n - f| \, d\mu,$$

from which we deduce that $\mu(E_n(\varepsilon)) \to 0$ as n → ∞. This proves that
$f_n \xrightarrow{\mu} f$. The other conclusion now follows from Proposition 4.2.2.

We now prove a generalization of the dominated convergence theorem.

Theorem 5.3.4 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^\infty$ be
a sequence of integrable functions defined on X converging pointwise,
almost everywhere, to an integrable function f defined on X. Assume
that for all x ∈ X, and for all n ∈ N, we have that $|f_n(x)| \le g_n(x)$, where
$\{g_n\}_{n=1}^\infty$ is a sequence of non-negative integrable functions defined on X
converging to a non-negative integrable function g almost everywhere in
X. Finally, assume that

$$\lim_{n\to\infty} \int_X g_n \, d\mu = \int_X g \, d\mu.$$

Then, (5.3.8) holds.

Proof: Assume that the functions $f_n$ and f are all real-valued. By
hypotheses, we have that for each n ∈ N, $g_n \pm f_n \ge 0$ and $g_n \pm f_n \to g \pm f$
as n → ∞. Applying Fatou’s lemma, we get

$$\int_X (g - f) \, d\mu \le \liminf_{n\to\infty} \int_X (g_n - f_n) \, d\mu = \int_X g \, d\mu - \limsup_{n\to\infty} \int_X f_n \, d\mu,$$

and

$$\int_X (g + f) \, d\mu \le \liminf_{n\to\infty} \int_X (g_n + f_n) \, d\mu = \int_X g \, d\mu + \liminf_{n\to\infty} \int_X f_n \, d\mu.$$

Since $\int_X g \, d\mu < +\infty$, we deduce that

$$\int_X f \, d\mu \le \liminf_{n\to\infty} \int_X f_n \, d\mu \le \limsup_{n\to\infty} \int_X f_n \, d\mu \le \int_X f \, d\mu,$$

from which (5.3.8) follows.

If the functions $f_n$ and f are complex valued, then the inequality
$|f_n(x)| \le g_n(x)$ also implies that the same is valid for the sequences of
the real and imaginary parts of the $f_n$. Hence, the theorem applies to
these two sequences from which we easily deduce (5.3.8).

As a corollary of the preceding theorem, we have the following useful
result.

Theorem 5.3.5 Let (X, S, µ) be a measure space and let $\{f_n\}_{n=1}^\infty$ be
a sequence of integrable functions defined on X converging pointwise
(almost everywhere) to an integrable function f defined on X. Then,
the sequence converges to f in the mean if, and only if,

$$\lim_{n\to\infty} \int_X |f_n| \, d\mu = \int_X |f| \, d\mu. \tag{5.3.9}$$

Proof: Assume that the sequence converges to f in the mean. Since

$$\big|\, |f_n| - |f| \,\big| \le |f_n - f|,$$

we see that $\{|f_n|\}_{n=1}^\infty$ converges in the mean to |f| and then (5.3.9) is
an immediate consequence.

Conversely, assume that (5.3.9) holds. Define $F_n = |f_n - f|$ which
converges to zero (almost everywhere). Let $G_n = |f_n| + |f|$ and $G = 2|f|$.
Then $G_n$ and G are non-negative integrable functions, $|F_n| \le G_n$, and
$G_n \to G$ as n → ∞. Finally, thanks to (5.3.9), we have that
$\int_X G_n \, d\mu \to \int_X G \, d\mu$, as n → ∞. Consequently, by the preceding theorem,
we have that $\int_X F_n \, d\mu \to 0$ as n → ∞, which is the same as saying that the
sequence $\{f_n\}_{n=1}^\infty$ converges to f in the mean.

We will now see a few examples of application of the dominated
convergence theorem.

Example 5.3.2 (Fourier transform) Let $\mathbb{R}^N$ be equipped with the Lebesgue
measure, $m_N$. Given two vectors $x = (x_1, \cdots, x_N)$ and $\xi = (\xi_1, \cdots, \xi_N)$
in $\mathbb{R}^N$, we define the usual euclidean inner-product between them by

$$x.\xi = \sum_{i=1}^N x_i \xi_i.$$

Let $f : \mathbb{R}^N \to \mathbb{R}$ be an integrable function. The Fourier transform of the
function f, denoted $\hat{f}$, is defined by

$$\hat{f}(\xi) = \int_{\mathbb{R}^N} e^{-2\pi i x.\xi} f(x) \, dm_N(x).$$

By Theorem 5.3.2, we have

$$|\hat{f}(\xi)| \le \int_{\mathbb{R}^N} |f| \, dm_N < +\infty,$$

since the exponential has absolute value equal to unity. Thus $\hat{f}$ is a
well-defined and bounded function. Let $\{\xi^{(n)}\}_{n=1}^\infty$ be a sequence in $\mathbb{R}^N$
converging to a vector ξ. Then, since the exponential is continuous, we
have that

$$e^{-2\pi i \xi^{(n)}.x} f(x) \to e^{-2\pi i \xi.x} f(x)$$

as n → ∞. Further,

$$\left| e^{-2\pi i \xi^{(n)}.x} f(x) \right| \le |f(x)|$$

for all $x \in \mathbb{R}^N$ and for all n ∈ N. Since f is integrable, it now follows,
from the dominated convergence theorem, that

$$\lim_{n\to\infty} \hat{f}(\xi^{(n)}) = \hat{f}(\xi).$$

Thus, the Fourier transform of an integrable function is a continuous
and bounded function.

Example 5.3.3 Let $\{a_{ij}\}_{i,j=1}^\infty$ be a double sequence of real numbers.
Assume that, for each j ∈ N,

$$\sum_{i=1}^\infty |a_{ij}| \le b_j,$$

where $\sum_{j=1}^\infty b_j < +\infty$.

Let X = N be equipped with the counting measure. Define

$$f_i(j) = a_{ij}, \ j \in \mathbb{N}, \quad \text{and} \quad f = \sum_{i=1}^\infty f_i.$$

Thus,

$$f(j) = \sum_{i=1}^\infty a_{ij}, \ j \in \mathbb{N}.$$

Define, for j ∈ N, $g(j) = b_j$. Then g is a non-negative integrable
function. By hypothesis, for each n ∈ N,

$$\left| \sum_{i=1}^n f_i(j) \right| \le \sum_{i=1}^\infty |f_i(j)| = \sum_{i=1}^\infty |a_{ij}| \le b_j = g(j).$$

Since $\sum_{i=1}^n f_i \to f$, we have, by the dominated convergence theorem,

$$\lim_{n\to\infty} \sum_{i=1}^n \sum_{j=1}^\infty a_{ij} = \lim_{n\to\infty} \sum_{j=1}^\infty \sum_{i=1}^n a_{ij} = \sum_{j=1}^\infty f(j) = \sum_{j=1}^\infty \sum_{i=1}^\infty a_{ij}.$$

In other words, we have

$$\sum_{i=1}^\infty \sum_{j=1}^\infty a_{ij} = \sum_{j=1}^\infty \sum_{i=1}^\infty a_{ij}.$$

Thus, the given conditions are sufficient to ensure that the order of
summation can be reversed when the double sequence is not necessarily
non-negative (cf. Example 5.2.3).
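
To see that some summability hypothesis is needed, here is a small Python sketch of a standard counterexample that is not taken from the text: the hypothetical double sequence with $a_{ii} = 1$, $a_{i,i+1} = -1$ and all other entries zero. For this sequence $\sum_i |a_{ij}| = 2$ for every j ≥ 2, so no summable sequence $\{b_j\}$ can dominate it. Every row and column has at most two non-zero entries, so the inner sums below are exact.

```python
def a(i, j):
    """Hypothetical double sequence: 1 on the diagonal, -1 just to its right."""
    if j == i:
        return 1
    if j == i + 1:
        return -1
    return 0

def row_sum(i):
    # sum over j of a(i, j); only j = i and j = i + 1 contribute
    return a(i, i) + a(i, i + 1)

def col_sum(j):
    # sum over i of a(i, j); only i = j and i = j - 1 contribute
    return a(j, j) + (a(j - 1, j) if j >= 2 else 0)

N = 1000
rows_first = sum(row_sum(i) for i in range(1, N + 1))  # sum_i (sum_j a_ij)
cols_first = sum(col_sum(j) for j in range(1, N + 1))  # sum_j (sum_i a_ij)
print(rows_first, cols_first)  # prints "0 1"
```

Every row sums to 0, while the first column sums to 1 and all other columns to 0, so the two iterated sums are 0 and 1 for every truncation level N.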
Proposition 5.3.2 Let (X, S, µ) be a measure space and let f be an
integrable function defined on X. Then, given ε > 0, there exists δ > 0
such that, whenever µ(E) < δ, E ∈ S, we have

$$\int_E |f| \, d\mu < \varepsilon. \tag{5.3.10}$$

Proof: Step 1: Let us assume that f is bounded. Let |f(x)| ≤ M for
(almost) every x ∈ X. Then, if E ∈ S, we have

$$\int_E |f| \, d\mu \le M \mu(E).$$

The result follows on choosing $\delta < \frac{\varepsilon}{M}$.

Step 2: Given an arbitrary integrable function f, define, for n ∈ N,

$$f_n(x) = \begin{cases} |f(x)|, & \text{if } |f(x)| \le n, \\ n, & \text{if } |f(x)| > n. \end{cases}$$

Then $\{f_n\}_{n=1}^\infty$ is a non-negative sequence of bounded functions
increasing to |f|. Thus, by the monotone convergence theorem, there exists
N ∈ N such that, for all n ≥ N, we have

$$\int_X \big|\, |f| - f_n \big| \, d\mu = \int_X |f| \, d\mu - \int_X f_n \, d\mu < \frac{\varepsilon}{2}.$$

Hence, for every E ∈ S, we have, for all n ≥ N,

$$\int_E |f| \, d\mu - \int_E f_n \, d\mu = \int_E \big|\, |f| - f_n \big| \, d\mu \le \int_X \big|\, |f| - f_n \big| \, d\mu < \frac{\varepsilon}{2}.$$

Since $f_N$ is bounded, choose, by Step 1, a δ > 0 such that, whenever
µ(E) < δ, E ∈ S, we have

$$\int_E f_N \, d\mu < \frac{\varepsilon}{2}.$$

Adding the last two inequalities (with n = N), we see that, for every set
E ∈ S such that µ(E) < δ, (5.3.10) holds.

Remark 5.3.2 Let (X, S, µ) be a measure space. We know that (cf.
Proposition 5.2.7) if f is an integrable function defined on X, then

$$\nu(E) = \int_E |f| \, d\mu, \quad E \in S,$$

defines a measure on S. The above proposition states that if ε > 0 is
given, we can find δ > 0 such that ν(E) < ε whenever µ(E) < δ. We
say that the measure ν is absolutely continuous with respect to the
measure µ. The Radon-Nikodym theorem (which we shall see much
later) states that, for σ-finite measure spaces, every measure ν which is
absolutely continuous with respect to a measure µ occurs in this form
(see also Remark 5.2.5).

5.4 The Riemann and Lebesgue integrals

On the real line, R, we have two notions of integrals. The first is that of
the Riemann integral, defined primarily for bounded functions defined
over bounded intervals. If f : [a, b] → R is a bounded function, then the
Riemann integral, if it exists, is denoted by the symbol

$$\int_a^b f(x) \, dx.$$

The integral, when f and/or the interval is unbounded, is then defined
by appropriate limit processes and, again, may or may not exist. When
the integral exists, it is always a real number.

The Lebesgue integral, on the other hand, is defined for all
non-negative measurable functions defined on any (Lebesgue) measurable
subset of R. The value of the integral may be finite or infinite. If
it is finite, we say the function is Lebesgue integrable. The Lebesgue
integral is then defined for any measurable function such that |f| is
Lebesgue integrable. In this case, of course, the integral will be again a
real number. The Lebesgue integral of a non-negative or an integrable
function, f, over a (Lebesgue) measurable set E, is denoted by the
symbol

$$\int_E f \, dm_1.$$

We have seen, in the Preamble, that there exist bounded functions
which are not Riemann integrable. An example is that of the
characteristic function of the rationals in the interval [0, 1]. On the other hand,
the set of rationals is countable and hence is a measurable set of measure
zero. Thus, its characteristic function is integrable and the value of the
Lebesgue integral is zero.

We now ask ourselves if a Riemann integrable function is always
Lebesgue integrable. Our aim in this section is to show that this is
indeed the case when the function and the interval are both bounded and
that, in fact, the two theories yield the same value for the integral.

Let [a, b] be a finite interval in R. Let

$$\mathcal{P} = \{a = x_0 < x_1 < \cdots < x_n = b\}$$

be a partition of the interval [a, b]. The points $\{x_i\}_{i=0}^n$ are called the
nodes of the partition. The mesh size of the partition, denoted $\Delta(\mathcal{P})$, is
defined as follows:

$$\Delta(\mathcal{P}) = \max_{1 \le i \le n} (x_i - x_{i-1}).$$

A partition $\mathcal{P}'$ is said to be a refinement of $\mathcal{P}$ if the nodes of $\mathcal{P}$ form a
subset of those of $\mathcal{P}'$.

Let f : [a, b] → R be a bounded function. Let us consider a sequence
$\{\mathcal{P}_k\}_{k=1}^\infty$ of partitions of the interval [a, b] such that, for each k, we have
that $\mathcal{P}_{k+1}$ is a refinement of $\mathcal{P}_k$ and such that $\Delta(\mathcal{P}_k) \to 0$, as k → ∞. If

$$\mathcal{P}_k = \{a = x_0 < x_1 < \cdots < x_n = b\},$$

define the functions $U_k$ and $L_k$ as follows:

$$U_k = f(a)\chi_{\{a\}} + \sum_{i=1}^n M_i \chi_{(x_{i-1}, x_i]}, \quad \text{and} \quad L_k = f(a)\chi_{\{a\}} + \sum_{i=1}^n m_i \chi_{(x_{i-1}, x_i]},$$

where, for 1 ≤ i ≤ n,

$$M_i = \sup_{[x_{i-1}, x_i]} f(x), \quad \text{and} \quad m_i = \inf_{[x_{i-1}, x_i]} f(x).$$

(The term $f(a)\chi_{\{a\}}$ only fixes the value at the single point a, which the
half-open intervals $(x_{i-1}, x_i]$ do not cover; it does not affect the integrals.)
Then the upper and lower (Darboux) sums associated to f and this
partition (cf. Preamble) are given by

$$U(\mathcal{P}_k, f) = \int_{[a,b]} U_k \, dm_1, \quad \text{and} \quad L(\mathcal{P}_k, f) = \int_{[a,b]} L_k \, dm_1. \tag{5.4.1}$$

Since, for each k ∈ N, we have that the partition $\mathcal{P}_{k+1}$ is a refinement
of the partition $\mathcal{P}_k$, it follows that, for each x ∈ [a, b],

$$L_1(x) \le L_2(x) \le \cdots \le f(x) \le \cdots \le U_2(x) \le U_1(x). \tag{5.4.2}$$
Theorem 5.4.1 Let f : [a, b] → R be a bounded function which is
Riemann integrable. Then, it is also Lebesgue integrable and

$$\int_{[a,b]} f \, dm_1 = \int_a^b f(x) \, dx.$$

Proof: With the notations established above, the sequence $\{U_k(x)\}_{k=1}^\infty$
is monotonic decreasing and bounded below and the sequence $\{L_k(x)\}_{k=1}^\infty$
is monotonic increasing and bounded above, for each x ∈ [a, b]. Thus
both sequences are convergent. Let their respective limits be U(x) and
L(x). Then

$$L(x) \le f(x) \le U(x), \quad x \in [a, b].$$

Since f is bounded, assume that |f(x)| ≤ M for all x ∈ [a, b]. Then,
for all x ∈ [a, b], and for all k ∈ N, we also have $|L_k(x)| \le M$ and
$|U_k(x)| \le M$. Consequently, by the dominated convergence theorem, we
have

$$\lim_{k\to\infty} \int_{[a,b]} U_k \, dm_1 = \int_{[a,b]} U \, dm_1, \quad \text{and} \quad \lim_{k\to\infty} \int_{[a,b]} L_k \, dm_1 = \int_{[a,b]} L \, dm_1. \tag{5.4.3}$$

Since f is Riemann integrable, the upper and lower Darboux sums
converge to the Riemann integral of f. Thus, in view of (5.4.1), we get

$$\int_{[a,b]} U \, dm_1 = \int_{[a,b]} L \, dm_1 = \int_a^b f(x) \, dx.$$

But U ≥ L and so, by the above result, it follows from Proposition
5.2.2 that U(x) = L(x) = f(x) almost everywhere. Thus f is Lebesgue
integrable and

$$\int_{[a,b]} f \, dm_1 = \int_{[a,b]} U \, dm_1 = \int_{[a,b]} L \, dm_1 = \int_a^b f(x) \, dx.$$

This completes the proof.
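
The monotone behaviour of the upper and lower sums used in this proof can be observed on a concrete case. The sketch below is an illustration only, with assumptions that are not in the text: the function is f(x) = x² on [0, 1], the partitions are the nested dyadic ones with 2^k equal cells, and the name `darboux_sums` is hypothetical. Because f is increasing, the supremum and infimum on each cell are attained at its endpoints, so the sums are computed exactly with rational arithmetic.

```python
from fractions import Fraction

def darboux_sums(k):
    """Upper and lower Darboux sums of f(x) = x**2 on [0, 1] for the
    dyadic partition with 2**k equal cells. f is increasing, so
    M_i = f(x_i) and m_i = f(x_{i-1}) on the cell [x_{i-1}, x_i]."""
    n = 2 ** k
    h = Fraction(1, n)
    upper = sum((i * h) ** 2 * h for i in range(1, n + 1))
    lower = sum(((i - 1) * h) ** 2 * h for i in range(1, n + 1))
    return upper, lower

for k in range(1, 8):
    U, L = darboux_sums(k)
    # L increases, U decreases, and both squeeze the value 1/3
    print(k, float(L), float(U), float(U - L))
```

For this choice the gap U − L equals (f(1) − f(0))/2^k, so both sums converge to the common value 1/3, which is the Riemann and the Lebesgue integral of f.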


Theorem 5.4.2 Let f : [a, b] → R be a bounded function. Then f is
Riemann integrable if, and only if, it is continuous almost everywhere.

Proof: With the preceding notations, assume that x ∈ [a, b] is not a
node of any of the partitions $\mathcal{P}_k$, k ∈ N. (The set of all nodes is countable
and hence is of measure zero.) Now, f is continuous at such a point x
if, and only if,

$$U(x) = f(x) = L(x).$$

Thus, from the proof of the preceding theorem, we see that if f is a
bounded and Riemann integrable function, then it is continuous almost
everywhere.

Conversely, assume that f : [a, b] → R is a bounded function which
is continuous almost everywhere. Then U = f = L almost everywhere.
Let ε > 0 be given. Then, by virtue of (5.4.1) and (5.4.3), it follows that
we can find k, sufficiently large, such that

$$\left| \int_{[a,b]} U_k \, dm_1 - \int_{[a,b]} L_k \, dm_1 \right| < \varepsilon,$$

i.e.

$$|U(\mathcal{P}_k, f) - L(\mathcal{P}_k, f)| < \varepsilon,$$

which proves that f is Riemann integrable.

Example 5.4.1 The result of Theorem 5.4.1 is not true for unbounded
intervals. Consider the interval (0, ∞) ⊂ R. Let

$$f(x) = \frac{\sin x}{x}.$$

It is well known (a standard exercise on contour integration) that the
Riemann integral $\int_0^\infty f(x) \, dx$ is well-defined and that its value is, in fact,
$\frac{\pi}{2}$ (cf., for example, Ahlfors [1]). However, f is not Lebesgue integrable.
To see this, consider the intervals

$$I_n = \left[ n\pi + \frac{\pi}{4}, \ n\pi + \frac{\pi}{2} \right], \quad n \in \mathbb{N},$$

which are all disjoint. On $I_n$, we have

$$|\sin x| \ge \frac{1}{\sqrt{2}} \quad \text{and} \quad x = |x| \le (2n + 1)\frac{\pi}{2}.$$

Thus, for $x \in I_n$, we have

$$\left| \frac{\sin x}{x} \right| \ge \frac{\sqrt{2}}{\pi} \, \frac{1}{2n + 1}.$$

Thus, since each $I_n$ has length $\frac{\pi}{4}$, for any N ∈ N, we have

$$\int_{(0,\infty)} |f| \, dm_1 \ge \sum_{n=1}^N \int_{[n\pi + \frac{\pi}{4}, \, n\pi + \frac{\pi}{2}]} |f| \, dm_1 \ge \frac{1}{2\sqrt{2}} \sum_{n=1}^N \frac{1}{2n + 1},$$

and the sum on the right becomes arbitrarily large, for large N, since it
is a partial sum of a divergent series.
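
The lower bound in this example can be compared with a numerical estimate. The following Python sketch is only an illustration: the function names are hypothetical and the integral over each interval $I_n$ is approximated by a midpoint rule, which is an assumption of the sketch and not part of the argument in the text.

```python
import math

def lower_bound(N):
    """(1 / (2*sqrt(2))) * sum_{n=1}^{N} 1/(2n+1), the bound in the example."""
    partial = sum(1.0 / (2 * n + 1) for n in range(1, N + 1))
    return partial / (2.0 * math.sqrt(2.0))

def integral_over_intervals(N, steps=200):
    """Midpoint-rule estimate of the integral of |sin x| / x over the
    intervals I_n = [n*pi + pi/4, n*pi + pi/2], n = 1..N."""
    total = 0.0
    h = (math.pi / 4) / steps          # each I_n has length pi/4
    for n in range(1, N + 1):
        a = n * math.pi + math.pi / 4  # left end-point of I_n
        for j in range(steps):
            x = a + (j + 0.5) * h
            total += h * abs(math.sin(x)) / x
    return total

for N in (10, 100, 1000):
    print(N, round(lower_bound(N), 4), round(integral_over_intervals(N), 4))
```

Both columns keep growing with N, roughly like a multiple of log N, in line with the divergence of the series on the right-hand side.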

We can use Theorem 5.4.1 to study the Lebesgue integrability of
functions on subsets of the real line.

Example 5.4.2 Let $f(x) = \frac{1}{\sqrt{x}}$ on the interval (0, 1). This is a
non-negative function and its Lebesgue integral is well-defined. Define

$$f_n(x) = \begin{cases} 0, & \text{if } x \in (0, \frac{1}{n}), \\ \dfrac{1}{\sqrt{x}}, & \text{if } x \in [\frac{1}{n}, 1). \end{cases}$$

Then the sequence of non-negative functions $\{f_n\}_{n=1}^\infty$ increases to f and
so

$$\int_{(0,1)} f \, dm_1 = \lim_{n\to\infty} \int_{(0,1)} f_n \, dm_1.$$

Now

$$\int_{(0,1)} f_n \, dm_1 = \int_{[\frac{1}{n}, 1)} f \, dm_1.$$

But, on the interval $[\frac{1}{n}, 1)$, the function f is a bounded and continuous
function. Hence it is Riemann integrable and we can easily calculate
the Riemann integral using the familiar rules of the calculus as

$$\int_{\frac{1}{n}}^1 \frac{1}{\sqrt{x}} \, dx = 2\sqrt{x} \, \Big|_{\frac{1}{n}}^1 = 2\left( 1 - \frac{1}{\sqrt{n}} \right).$$

Thus, it follows that

$$\int_{(0,1)} f \, dm_1 = 2$$

and so f is integrable over (0, 1).


Example 5.4.3 Let

$$f(x) = \begin{cases} \left( \dfrac{\sin x}{x} \right)^2, & \text{if } x \in (0, \infty), \\ 1, & \text{if } x = 0. \end{cases}$$

Again, this is a non-negative function and its Lebesgue integral is
well-defined. We can write (using Proposition 5.2.7)

$$\int_{[0,\infty)} f \, dm_1 = \int_{[0,1]} f \, dm_1 + \int_{(1,\infty)} f \, dm_1.$$

On the interval [0, 1], we have that f is a bounded and continuous
function and so it is Riemann and hence Lebesgue integrable. Now, set

$$f_n(x) = \frac{1}{x^2} \chi_{(1,n)}(x).$$

Then for each x ∈ (1, ∞), $f_n(x)$ increases to $\frac{1}{x^2}$. Since $f_n$ is a bounded
and continuous function on the interval (1, n), it is Riemann integrable
there. Consequently,

$$\int_{(1,\infty)} \frac{1}{x^2} \, dm_1(x) = \lim_{n\to\infty} \int_1^n \frac{1}{x^2} \, dx = \lim_{n\to\infty} \left( 1 - \frac{1}{n} \right) = 1.$$

Thus, the function $x \mapsto \frac{1}{x^2}$ is integrable on the interval (1, ∞). Since

$$\left( \frac{\sin x}{x} \right)^2 \le \frac{1}{x^2},$$

it follows that f is integrable on (1, ∞) as well and, therefore, f is
integrable on [0, ∞).

5.5 Weierstrass’ theorem

In this section, we will prove the famous theorem of Weierstrass, on
the uniform approximation of a continuous function on a compact
interval by means of polynomials, using the notion of the Lebesgue integral.
Without loss of generality, we work on the interval [0, 1].

If $x_0 \in [0, 1]$, we denote the Dirac measure concentrated at $x_0$ by $\delta_{x_0}$.

Let t ∈ [0, 1] and n ∈ N be fixed. Let X = [0, 1]. On the σ-algebra
of all subsets of X, define the measure

$$\mu_n^t = \sum_{k=0}^n \binom{n}{k} t^k (1 - t)^{n-k} \, \delta_{\frac{k}{n}},$$

where

$$\binom{n}{k} = \frac{n!}{k!(n - k)!}$$

is the usual binomial coefficient.

Let $f_i(x) = x^i$, i = 0, 1, 2. Now,

$$\mu_n^t(X) = \int_X f_0 \, d\mu_n^t = \sum_{k=0}^n \binom{n}{k} t^k (1 - t)^{n-k} = 1.$$

Next,

$$\int_X f_1 \, d\mu_n^t = \sum_{k=0}^n \binom{n}{k} t^k (1 - t)^{n-k} \frac{k}{n} = t \sum_{k=1}^n \binom{n-1}{k-1} t^{k-1} (1 - t)^{(n-1)-(k-1)} = t.$$

In the same way, a similar computation yields

$$\int_X f_2 \, d\mu_n^t = \frac{1}{n} \left( (n - 1)t^2 + t \right).$$

Setting $f(x) = (x - t)^2 = f_2(x) - 2t f_1(x) + t^2 f_0(x)$, we get, on
simplification,

$$\int_X f \, d\mu_n^t = \frac{t - t^2}{n}. \tag{5.5.1}$$

Lemma 5.5.1 Let t ∈ [0, 1] and n ∈ N be fixed. Let ε > 0 be given. Set

$$A_\varepsilon = \{x \in X \mid |x - t| \ge \varepsilon\}.$$

Then $\mu_n^t(A_\varepsilon)$ converges uniformly to zero (with respect to t) as n → ∞.

Proof: By the definition of the set $A_\varepsilon$, we get

$$\varepsilon^2 \mu_n^t(A_\varepsilon) \le \int_{A_\varepsilon} (x - t)^2 \, d\mu_n^t(x) \le \int_X (x - t)^2 \, d\mu_n^t(x) \le \frac{1}{4n},$$

using (5.5.1) and the fact that $t(1 - t) \le \frac{1}{4}$ when t ∈ [0, 1]. Hence
$\mu_n^t(A_\varepsilon) \le \frac{1}{4n\varepsilon^2}$ for every t ∈ [0, 1], which completes the proof.

Lemma 5.5.2 Let f ∈ C[0, 1]. Let t ∈ [0, 1]. Then

$$\lim_{n\to\infty} \int_X f \, d\mu_n^t = f(t),$$

and the limit is uniform with respect to t.

Proof: Since f is continuous over a compact interval, it is uniformly
continuous. Given ε > 0, let δ > 0 be such that whenever |x − y| < δ,
we have |f(x) − f(y)| < ε. Set

$$A_\delta = \{x \in X \mid |x - t| \ge \delta\}.$$

Now, since $\mu_n^t(X) = 1$, we have

$$\left| \int_X f \, d\mu_n^t - f(t) \right| = \left| \int_X (f(x) - f(t)) \, d\mu_n^t(x) \right| \le \int_X |f(x) - f(t)| \, d\mu_n^t(x).$$

We can write

$$\int_X |f(x) - f(t)| \, d\mu_n^t(x) = I_1 + I_2,$$

where

$$I_1 = \int_{A_\delta} |f(x) - f(t)| \, d\mu_n^t(x), \quad \text{and} \quad I_2 = \int_{A_\delta^c} |f(x) - f(t)| \, d\mu_n^t(x).$$

Let $M = \max_{x \in [0,1]} |f(x)|$. Then, by Lemma 5.5.1 (applied with δ in place
of ε), we have

$$I_1 \le 2M \mu_n^t(A_\delta) \le \frac{2M}{4n\delta^2}.$$

On the other hand, for $x \in A_\delta^c$, we have that |f(x) − f(t)| < ε and so

$$I_2 \le \varepsilon \mu_n^t(A_\delta^c) \le \varepsilon \mu_n^t(X) \le \varepsilon.$$

Now, given any η > 0, choose $\varepsilon < \frac{\eta}{2}$, fix the corresponding δ, and choose
N ∈ N such that, for all n ≥ N, we have

$$\frac{2M}{4n\delta^2} < \frac{\eta}{2}.$$

Then, for all n ≥ N and for all t ∈ [0, 1], we have

$$\left| \int_X f \, d\mu_n^t - f(t) \right| < \eta,$$

which completes the proof.

Remark 5.5.1 In the above proof, to estimate the integral of
|f(x) − f(t)|, we split the integral over two sets. On one of the sets ($A_\delta^c$), we had
information which controlled the integrand while on the other ($A_\delta$), we
had minimal information on the integrand, but the measure of the set
was small. This method of ‘divide and rule’ is often useful in estimating
integrals.

Now, by the definition of the measure $\mu_n^t$, we get

$$\int_X f \, d\mu_n^t = \sum_{k=0}^n \binom{n}{k} t^k (1 - t)^{n-k} f\left( \frac{k}{n} \right), \tag{5.5.2}$$

which is a polynomial in t converging uniformly to f(t). Thus, we have
proved the following theorem:

Theorem 5.5.1 (Weierstrass approximation theorem) Every
continuous function on a compact interval can be uniformly approximated by a
sequence of polynomials.

The polynomials occurring on the right-hand side of (5.5.2) are called
the Bernstein polynomials.
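
The Bernstein polynomials are easy to evaluate directly from (5.5.2). The following Python sketch is an illustration under stated assumptions: the names `bernstein`, `sup_error` and `f_abs` are hypothetical, the test function f(x) = |x − 1/2| is chosen here and does not appear in the text, and the uniform error is only estimated on a finite grid of values of t.

```python
from math import comb

def bernstein(f, n, t):
    """Right-hand side of (5.5.2): the integral of f with respect to mu_n^t,
    i.e. sum_{k=0}^{n} C(n, k) t^k (1 - t)^(n - k) f(k / n)."""
    return sum(comb(n, k) * t**k * (1 - t)**(n - k) * f(k / n)
               for k in range(n + 1))

def sup_error(f, n, grid=101):
    """Largest value of |B_n f(t) - f(t)| over an equally spaced grid in [0, 1]."""
    points = [j / (grid - 1) for j in range(grid)]
    return max(abs(bernstein(f, n, t) - f(t)) for t in points)

def f_abs(x):
    """A continuous but non-differentiable test function (an assumption)."""
    return abs(x - 0.5)

for n in (4, 16, 64, 256):
    print(n, sup_error(f_abs, n))

# Check of (5.5.1) at t = 0.3, n = 10: the integral of (x - t)^2 is t(1 - t)/n.
t = 0.3
print(bernstein(lambda x: (x - t) ** 2, 10, t), t * (1 - t) / 10)
```

The grid errors decrease as n grows, although slowly, which is typical of Bernstein approximation; the last line reproduces the identity (5.5.1) up to rounding.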

5.6 Probability

We are now in a position to indicate the connections between the theory
of probability and that of measure and integration.

A probability space is a measure space (Ω, B, p) such that p(Ω) = 1.
The set Ω is called a sample space and the σ-algebra B is said to
be the collection of events. Thus, the measure p(A) of a set A ∈ B, is
called the probability of the event A.

Let B ∈ B be such that p(B) > 0. Consider the σ-algebra of subsets of B, given by

$$\mathcal{B}_B = \{A \cap B \mid A \in \mathcal{B}\}.$$

We define the measure $p_B$ on $\mathcal{B}_B$ by

$$p_B(A \cap B) = \frac{p(A \cap B)}{p(B)}, \quad A \in \mathcal{B},$$

so that $p_B(B) = 1$. Then $(B, \mathcal{B}_B, p_B)$ is a probability space. The
conditional probability of an event A, given B, denoted p(A | B), is nothing
but $p_B(A \cap B)$. The events A and B are said to be independent if
p(A | B) = p(A). In this case, we deduce that

$$p(A \cap B) = p(A)p(B).$$

A random variable, X, on Ω, is a measurable real-valued function
defined on Ω. The expected value (also called expectation or mean)
of the random variable X, denoted E(X), is defined by

$$E(X) = \int_\Omega X \, dp.$$

Pointwise convergence, almost everywhere, of a sequence of random
variables is referred to as convergence almost surely and convergence
in measure is referred to as convergence in probability.

A distinguishing feature of probability theory, which does not have
a parallel in the theory of measure and integration, is the study of
independent and identically distributed random variables.

Two random variables X and Y defined on Ω are said to be
independent if for any pair of Borel sets in R, say, A and B, we have

$$p(X^{-1}(A) \cap Y^{-1}(B)) = p(X^{-1}(A)) \, p(Y^{-1}(B)).$$

The distribution function of a random variable X defined on Ω is
defined by

$$F(t) = p(X^{-1}((-\infty, t])).$$

In other words, it is the probability that the random variable takes a
value less than, or equal to t. Two random variables are said to be
identically distributed if they have the same distribution function.
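
These definitions can be made concrete on a finite sample space. The sketch below is an illustration only and rests on assumptions not in the text: Ω is the set of outcomes of two fair dice with the uniform probability measure, and all function names are hypothetical. Exact rational arithmetic is used so that the identities can be compared with equality.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes of two fair dice (an assumption of this sketch).
OMEGA = list(product(range(1, 7), repeat=2))

def p(event):
    """Uniform probability of the event {w in OMEGA : event(w) is true}."""
    return Fraction(sum(1 for w in OMEGA if event(w)), len(OMEGA))

def conditional(A, B):
    """p(A | B) = p(A and B) / p(B); requires p(B) > 0."""
    return p(lambda w: A(w) and B(w)) / p(B)

def expectation(X):
    """E(X): integral of X with respect to p, a finite average here."""
    return sum(Fraction(X(w)) for w in OMEGA) / len(OMEGA)

def distribution(X, t):
    """F(t) = p(X <= t), the distribution function of X."""
    return p(lambda w: X(w) <= t)

def first_even(w):
    return w[0] % 2 == 0

def total_is_seven(w):
    return w[0] + w[1] == 7

def total(w):
    return w[0] + w[1]

print(conditional(first_even, total_is_seven), p(first_even))
print(expectation(total), distribution(total, 2))
```

Here the events "the first die is even" and "the total is seven" are independent, since the conditional probability agrees with the unconditional one.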

5.7 Exercises

5.1 Let (X, S, µ) be a measure space. Let $\{f_n\}_{n=1}^\infty$ be a sequence of
measurable functions defined on X converging to a measurable function
f almost everywhere. Assume that

$$f_1 \ge f_2 \ge \cdots \ge f_n \ge \cdots \ge 0,$$

and that $f_1$ is integrable. Show that

$$\lim_{n\to\infty} \int_X f_n \, d\mu = \int_X f \, d\mu.$$

5.2 Let (X, S, µ) be a measure space such that µ(X) < +∞. Let
$\{f_n\}_{n=1}^\infty$ be a sequence of integrable functions defined on X
converging uniformly to a function f, on X. Show that

$$\lim_{n\to\infty} \int_X f_n \, d\mu = \int_X f \, d\mu.$$

(This is not true, in general, if µ(X) = +∞; cf. Example 5.3.1.)

5.3 Let f : R → R be integrable. Let t ∈ R be fixed. Define g(x) =
f(x + t), x ∈ R. If [a, b] ⊂ R is an interval, show that

$$\int_{[a,b]} g \, dm_1 = \int_{[a+t,b+t]} f \, dm_1.$$

5.4 Let f : [0, 1] × [0, 1] → R be (Lebesgue) measurable as a function
of x, for each fixed t, and let g : [0, 1] → R be an integrable function.
Assume that for each (x, t) ∈ [0, 1] × [0, 1],

$$|f(x, t)| \le g(x).$$

If $\lim_{t\to 0} f(x, t) = h(x)$, show that

$$\lim_{t\to 0} \int_{[0,1]} f(x, t) \, dm_1(x) = \int_{[0,1]} h \, dm_1.$$

5.5 (Differentiation under the integral sign) Let f : [0, 1] × [0, 1] → R
be a function such that, for each t ∈ [0, 1], the function $x \mapsto f(x, t)$ is
integrable and for each x ∈ [0, 1], the map $t \mapsto f(x, t)$ is differentiable
and that the derivative $\frac{\partial f}{\partial t}(x, t)$ is uniformly bounded. Show that

$$\frac{d}{dt} \int_{[0,1]} f(x, t) \, dm_1(x) = \int_{[0,1]} \frac{\partial f}{\partial t}(x, t) \, dm_1(x).$$
5.6 In each of the following cases, check the function f for (Lebesgue)
integrability over the indicated domain.

(a) $f(x) = \frac{1}{1+x^2}$ on R.

(b) $f(x) = \frac{1}{x}$ on (0, 1).

(c) $f(x) = \frac{1}{x} \sin \frac{1}{x}$ on (0, 1).

5.7 Show that the function defined by

$$f(x) = \frac{x^{n-1}}{(1 + x^2)^k}$$

is integrable over (0, ∞) if $k > \frac{n}{2}$.

5.8 Let (X, S, µ) be a measure space. Show that the dominated
convergence theorem is true if we replace ‘$f_n \to f$ almost everywhere’ by
‘$f_n \xrightarrow{\mu} f$’.

5.9 (a) Let f : [0, ∞) → R be uniformly continuous. If f is integrable,
show that $\lim_{x\to+\infty} f(x) = 0$.
(b) Show by means of an example that this result is not true if we
replace ‘uniformly continuous’ by ‘continuous’.

5.10 Let f ∈ C[0, 1]. Assume that, for each n ∈ N, we have

$$\int_0^1 x^n f(x) \, dx = 0.$$

Show that f ≡ 0.

5.11 Let (X, S, µ) be a measure space and let f be a non-negative
integrable function defined on X. Let $\{\varphi_n\}_{n=1}^\infty$ be a sequence of non-negative
simple functions increasing to f. Show that

$$\lim_{n\to\infty} \int_X |\varphi_n - f| \, d\mu = 0.$$

5.12 (Korovkin’s theorem) Let ϕ : [0, +∞) → [0, +∞) be a
continuous function such that ϕ(t) > 0 for t > 0. Let X = [0, 1] and let
$\{\mu_n^x\}_{n \in \mathbb{N}, x \in [0,1]}$ be a collection of finite Borel measures (cf. Definition
2.1.1) on X. Define, for x ∈ X,

$$\psi_n(x) = \int_X \varphi(|x - y|) \, d\mu_n^x(y).$$

Assume that (i) $\mu_n^x(X) \to 1$, uniformly with respect to x, as n → ∞,
and (ii) $\psi_n \to 0$, as n → ∞, uniformly on [0, 1]. Show that, for any
f ∈ C[0, 1], we have

$$\int_X f(y) \, d\mu_n^x(y) \to f(x), \quad \text{as } n \to \infty,$$

uniformly on X.

5.13 Show that the Weierstrass approximation theorem, as proved in
Section 5.5, is a particular case of Korovkin’s theorem, as stated above.

5.14 Let (X, S, µ) be a measure space such that µ(X) = 1. Let g : R → R
be a bounded and uniformly continuous function. Let $\{f_n\}_{n=1}^\infty$ be a
sequence of measurable functions defined on X such that $f_n \xrightarrow{\mu} f$, where
f is a measurable function defined on X. Show that

$$\lim_{n\to\infty} \int_X g \circ f_n \, d\mu = \int_X g \circ f \, d\mu,$$

where, for x ∈ X, and for any measurable function h defined on X,

$$(g \circ h)(x) = g(h(x)).$$
5.15 Let (X, S, µ) be a measure space and let M denote the collection
of all equivalence classes of real-valued measurable functions defined
on X, modulo equality almost everywhere. If f : X → R is a
measurable function, denote the equivalence class containing f by $\bar{f}$. Let
ϕ : [0, +∞) → [0, 1] be a strictly monotonic increasing continuous
function such that ϕ(0) = 0. Assume, further, that, for all x, y ∈ [0, +∞), we
have

$$\varphi(x + y) \le \varphi(x) + \varphi(y).$$

Let µ(X) = 1. For $\bar{f}, \bar{g} \in M$, define

$$d(\bar{f}, \bar{g}) = \int_X \varphi(|f - g|) \, d\mu.$$

(a) Show that d(·, ·) is well-defined.
(b) Show that d(·, ·) defines a metric on M.
(c) Show that a sequence $\{\bar{f}_n\}_{n=1}^\infty$ converges to $\bar{f}$ with respect to this
metric if, and only if, $f_n \xrightarrow{\mu} f$.
(d) Show that the function

$$\varphi(x) = \frac{x}{1 + x}, \quad 0 \le x < +\infty,$$

satisfies the conditions stated above.
Chapter 6

Differentiation

6.1 Monotonic functions

One of the important features of differential and integral calculus is that
differentiation and integration are essentially two sides of the same coin.
More precisely, the fundamental theorem of calculus states that, if f is
a Riemann integrable function which is the derivative of a function F
on an interval [a, b], then

$$F(b) - F(a) = \int_a^b f(x) \, dx. \tag{6.1.1}$$

We would like to investigate how far such a result is true when we deal
with functions which are only differentiable almost everywhere and with
the Lebesgue integral of the derivative of that function.

Consider the Cantor function (cf. Section 3.2), f, defined on the
interval [0, 1]. It is continuous and monotonic increasing, with f(0) = 0
and f(1) = 1. If C is the Cantor set, then f is constant on each
sub-interval of $C^c$. Thus, f is differentiable almost everywhere and its
derivative, wherever it exists, is zero. Thus we have f(1) − f(0) = 1, while the
integral of the derivative of f vanishes. In other words, (6.1.1) is not
true for this function.

Our aim in this chapter is to give necessary and sufficient
conditions on a function f which is differentiable almost everywhere, with the
derivative f′ being integrable, such that (6.1.1) is true.

We will first study, briefly, various classes of functions which are
differentiable almost everywhere. Following the treatment as in Royden [6],
we start with monotonic functions.

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019
S. Kesavan, Measure and Integration, Texts and Readings in
Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_6

Definition 6.1.1 Let $\mathcal{I}$ be a collection of intervals covering a set E ⊂ R.
It is said to be a Vitali covering of E if, for every ε > 0, and
for every x ∈ E, there exists an interval $I \in \mathcal{I}$ such that x ∈ I and
$m_1(I) < \varepsilon$.

In other words, every point in E can be covered by intervals of
arbitrarily small length. Such coverings occur naturally when we study
local properties of functions, especially, derivability.

In the sequel, when we use the word ‘outer-measure’, we will mean
the outer-measure which generates the Lebesgue measure on R, which
will be denoted by the symbol µ∗.

Lemma 6.1.1 (Vitali covering lemma) Let E ⊂ R be a set of finite
outer-measure and let $\mathcal{I}$ be a Vitali covering of E. Then, given ε > 0,
we can find a finite collection of disjoint intervals $\{I_1, \cdots, I_N\}$ in $\mathcal{I}$ such
that

$$\mu^*\left( E \setminus \cup_{j=1}^N I_j \right) < \varepsilon.$$

Proof: We observe, first of all, that the intervals can be of any type:
open, closed, or half-open. The addition, or removal, of end-points does
not change the results since these points constitute a set of measure
zero. Consequently, without loss of generality, we will assume that the
intervals are all closed.

Since E has finite outer-measure, we can find (cf. Proposition 2.2.1)
an open set U, which contains E and is such that $m_1(U) < +\infty$. Since
$\mathcal{I}$ is a Vitali covering of E, we can also assume, again, without loss of
generality, that all the intervals in $\mathcal{I}$ are also contained in U.

Let us choose an interval $I_1$ arbitrarily from the collection $\mathcal{I}$. We
will now inductively choose disjoint intervals $I_k$, for k > 1, as follows.

Assume that the intervals $\{I_1, \cdots, I_n\}$ have been chosen.

• If $E \subset \cup_{k=1}^n I_k$, then we are through.

• If not, let $x \in E \setminus (\cup_{k=1}^n I_k)$. Since the intervals are closed, the
distance, d, of x from $\cup_{k=1}^n I_k$ is strictly positive. Since $\mathcal{I}$ is a
Vitali covering, there exists an interval $I \in \mathcal{I}$, containing x and of
length less than, say, $\frac{d}{2}$. Then, it is clear that I will not intersect
any of the intervals $I_k$, 1 ≤ k ≤ n. Set

$$k_n = \sup \{ m_1(I) \mid I \in \mathcal{I}, \ I \cap I_k = \emptyset \text{ for all } 1 \le k \le n \}.$$

Since all intervals of $\mathcal{I}$ are contained in U, we have that
$k_n \le m_1(U) < +\infty$. Thus, we can find an interval $I_{n+1} \in \mathcal{I}$ such that
$I_{n+1} \cap I_k = \emptyset$ for all 1 ≤ k ≤ n and such that

$$\frac{1}{2} k_n < m_1(I_{n+1}).$$

Thus, if the process does not terminate at any finite stage, we will
have a sequence of disjoint intervals $\{I_k\}_{k=1}^\infty$ in $\mathcal{I}$. Since they are all
contained in U, we have that $\sum_{k=1}^\infty m_1(I_k) \le m_1(U) < +\infty$. Consequently,
$m_1(I_k) \to 0$ as k → ∞. Now, choose N ∈ N such that

$$\sum_{k=N+1}^\infty m_1(I_k) < \frac{\varepsilon}{5}.$$

Set

$$R = E \setminus \cup_{k=1}^N I_k.$$

We complete the proof by showing that µ∗(R) < ε.

Let x ∈ R. Then, as observed earlier, there exists $I \in \mathcal{I}$ such that
x ∈ I and $I \cap I_k = \emptyset$ for all 1 ≤ k ≤ N. Assume, if possible, that
$I \cap I_n = \emptyset$ for all n ∈ N. Then, by definition, $0 < m_1(I) \le k_n \le 2m_1(I_{n+1})$ for all
n, which is impossible, since $m_1(I_n) \to 0$ as n → ∞. Thus, there exists
n ∈ N, such that n > N, $I \cap I_n \ne \emptyset$, and $I \cap I_k = \emptyset$ for all 1 ≤ k < n.
Notice that

$$m_1(I) \le k_{n-1} \le 2m_1(I_n).$$

Let $c_n$ denote the mid-point of the interval $I_n$. Then, since x ∈ I and
$I \cap I_n \ne \emptyset$, we have

$$|x - c_n| \le m_1(I) + \frac{1}{2} m_1(I_n) \le \frac{5}{2} m_1(I_n).$$

Set

$$J_n = \left[ c_n - \frac{5}{2} m_1(I_n), \ c_n + \frac{5}{2} m_1(I_n) \right].$$

Then, $x \in J_n$ and $m_1(J_n) \le 5m_1(I_n)$. Thus, $R \subset \cup_{k=N+1}^\infty J_k$ and so

$$\mu^*(R) \le \sum_{k=N+1}^\infty m_1(J_k) \le 5 \sum_{k=N+1}^\infty m_1(I_k) < \varepsilon.$$

This completes the proof.

Remark 6.1.1 There are several similar results in the literature, each of
them being called a Vitali covering lemma. The spirit of all of them is the
same: a set of finite outer-measure in $\mathbb{R}^N$ is covered by basic open sets
of arbitrarily small size and we can find a finite disjoint sub-collection
which almost completely covers the given set, i.e. the outer-measure of
the uncovered portion can be made as small as we wish.

Let f : [a, b] → R be a given measurable function. We can define the


following ‘one-sided derivatives’ of f at any point x ∈ (a, b):

f (x+h)−f (x)
D+ f (x) = lim suph↓0 h ,

f (x)−f (x−h)
D− f (x) = lim suph↓0 h ,

f (x+h)−f (x)
D+ f (x) = lim inf h↓0 h ,

f (x)−f (x−h)
D− f (x) = lim inf h↓0 h .

We have that f is differentiable at x if, and only if, these four values co-
incide, and, in that case, we denote the common value as f 0 (x), which is
the derivative of f at x. Notice that we always have D+ f (x) ≥ D+ f (x)
and D− f (x) ≥ D− f (x) at any point x ∈ (a, b).

We say that f is differentiable almost everywhere on (a, b) if the


derivative exists for almost every x ∈ (a, b).

Theorem 6.1.1 Let $f : [a,b] \to \mathbb{R}$ be a monotonic increasing real-valued function. Then $f$ is differentiable almost everywhere on $(a,b)$. The derivative $f'$ is measurable and
\[ \int_{(a,b)} f' \, dm_1 \le f(b) - f(a). \tag{6.1.2} \]

Proof: Step 1. Consider the set
\[ E = \{ x \in (a,b) \mid D^+ f(x) > D_- f(x) \}. \]
We will show that $m_1(E) = 0$. All other sets involving inequalities between the various one-sided derivatives can be handled in a similar manner. This will establish the proof.

We can write
\[ E = \bigcup_{r,s \in \mathbb{Q},\ r > s} E_{rs}, \]
where
\[ E_{rs} = \{ x \in (a,b) \mid D^+ f(x) > r > s > D_- f(x) \}. \]
Since $\mathbb{Q}$ is countable, again, it suffices to show that $m_1(E_{rs}) = 0$.

Let $m = m_1(E_{rs})$. Let $\varepsilon > 0$ be arbitrary. Then, there exists an open set $U$ such that $E_{rs} \subset U$ and $m_1(U) < m + \varepsilon$. Let $x \in E_{rs}$. Since $D_- f(x) < s$, there exist arbitrarily small $h > 0$ such that $[x-h, x] \subset U$ and
\[ f(x) - f(x-h) < sh. \]

The collection of all such closed intervals, as $x$ varies over $E_{rs}$, forms a Vitali covering of $E_{rs}$ and so, by the Vitali covering lemma, we can find a disjoint collection of such intervals $\{I_1, \cdots, I_N\}$ such that the union of their interiors covers a set $A \subset E_{rs}$ such that
\[ \mu^*(A) > m - \varepsilon. \]
If, for $1 \le k \le N$, we have $I_k = [x_k - h_k, x_k]$, then
\[ \sum_{k=1}^{N} \bigl( f(x_k) - f(x_k - h_k) \bigr) < s \sum_{k=1}^{N} h_k < s\, m_1(U) < s(m + \varepsilon). \tag{6.1.3} \]

Now, let $y \in A$. Since $D^+ f(y) > r$, there exist arbitrarily small $h' > 0$ such that $(y, y+h')$ is contained in an interval $I_k$, where $1 \le k \le N$, and
\[ f(y + h') - f(y) > r h'. \]
Again the collection of such intervals, as $y$ varies over $A$, is a Vitali covering of $A$ and so there exists a finite collection of disjoint intervals $\{J_1, \cdots, J_M\}$ which cover a set $B \subset A$ of outer-measure greater than $m - 2\varepsilon$. If $J_i = (y_i, y_i + h'_i)$, $1 \le i \le M$, then
\[ \sum_{i=1}^{M} \bigl( f(y_i + h'_i) - f(y_i) \bigr) > r \sum_{i=1}^{M} h'_i > r(m - 2\varepsilon). \tag{6.1.4} \]

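Since each $J_i$ is contained in some $I_k$, since the $J_i$ are pairwise disjoint, and since $f$ is monotonic increasing, we have, for each fixed $k$, that
\[ \sum_{J_i \subset I_k} \bigl( f(y_i + h'_i) - f(y_i) \bigr) \le f(x_k) - f(x_k - h_k). \]
Thus, summing over $k$,
\[ \sum_{i=1}^{M} \bigl( f(y_i + h'_i) - f(y_i) \bigr) \le \sum_{k=1}^{N} \bigl( f(x_k) - f(x_k - h_k) \bigr). \]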

From (6.1.3) and (6.1.4), we then deduce that
\[ r(m - 2\varepsilon) < s(m + \varepsilon). \]
Since $\varepsilon > 0$ was arbitrarily chosen, we get that $mr \le ms$. Since $r > s$, this is possible only if $m = 0$. This completes the proof of the differentiability, almost everywhere, of $f$.

Step 2. Define $f(x) = f(b)$ for $x \ge b$. Define
\[ g_n(x) = n \left( f\left( x + \frac{1}{n} \right) - f(x) \right). \]
Since $f$ is differentiable almost everywhere, $f'$ is defined almost everywhere and the sequence $\{g_n\}_{n=1}^{\infty}$ converges to $f'$, wherever it is defined. Since the Lebesgue measure is complete, it follows that $f'$ is measurable. Also, since $f$ is monotonically increasing, we have $g_n(x) \ge 0$ for all $x$. Thus, by Fatou’s lemma, we have that
\[ \int_{(a,b)} f' \, dm_1 \le \liminf_{n \to \infty} \int_{(a,b)} g_n \, dm_1. \]

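By Exercise 5.3, we get
\[
\begin{aligned}
\int_{(a,b)} g_n \, dm_1 &= n \left( \int_{(a+\frac{1}{n},\, b+\frac{1}{n})} f \, dm_1 - \int_{(a,b)} f \, dm_1 \right) \\
&= n \left( \int_{[b,\, b+\frac{1}{n})} f \, dm_1 - \int_{(a,\, a+\frac{1}{n}]} f \, dm_1 \right) \\
&= f(b) - n \int_{(a,\, a+\frac{1}{n}]} f \, dm_1.
\end{aligned}
\]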

Thus,
\[ \int_{(a,b)} f' \, dm_1 \le f(b) - \limsup_{n \to \infty} n \int_{(a,\, a+\frac{1}{n}]} f \, dm_1 \le f(b) - f(a), \]
since $f(x) \ge f(a)$ for all $x \in (a, a + \frac{1}{n}]$. This completes the proof.

Remark 6.1.2 As the example of the Cantor function shows, we can have strict inequality in (6.1.2).

6.2 Functions of bounded variation

Let f : [a, b] → R be a given function. Consider a partition

P = {a = x0 < x1 < · · · < xn−1 < xn = b}. (6.2.1)

Define
\[ t(P, f) = \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})|. \]

Definition 6.2.1 Let $f : [a,b] \to \mathbb{R}$ be a given function. The total variation of $f$, over the interval $[a,b]$, is defined as
\[ T_a^b(f) = \sup_{P} t(P, f), \]
where the supremum is taken over all possible partitions of the interval $[a,b]$. The function $f$ is said to be of bounded variation over the interval $[a,b]$ if $T_a^b(f) < +\infty$.
Example 6.2.1 Let f : [a, b] → R be Lipschitz continuous, i.e. there
exists L > 0 such that, for all x, y ∈ [a, b], we have

|f (x) − f (y)| ≤ L|x − y|.

Then, given any partition $P$ as in (6.2.1), we have
\[ t(P, f) = \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| \le L \sum_{i=1}^{n} (x_i - x_{i-1}) = L(b - a). \]
Thus, $f$ is of bounded variation over $[a,b]$ and $T_a^b(f) \le L(b-a)$.

In particular, if $f$ is continuous on $[a,b]$ and differentiable on $(a,b)$, and if $|f'(x)| \le L$ for all $x \in (a,b)$, then, by the mean value theorem, $f$ is Lipschitz continuous (with Lipschitz constant $L$) and hence is of bounded variation over $[a,b]$.

Example 6.2.2 Let $f : [a,b] \to \mathbb{R}$ be monotonic. Then, for any partition $P$ as in (6.2.1), we have
\[ t(P, f) = \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| = \left| \sum_{i=1}^{n} \bigl( f(x_i) - f(x_{i-1}) \bigr) \right| = |f(b) - f(a)|, \]
by the monotonicity of $f$. Thus $f$ is of bounded variation over the interval $[a,b]$ and $T_a^b(f) = |f(b) - f(a)|$.

Example 6.2.3 Define
\[ f(x) = \begin{cases} x^2 \sin \dfrac{1}{x^2}, & \text{if } x \in (0, 1], \\ 0, & \text{if } x = 0. \end{cases} \]
Then $f$ is a continuous function which is not of bounded variation. To see this, consider the partition $P$ of the interval $[0,1]$ defined by the set of points
\[ \{0, 1\} \cup \left\{ \sqrt{\frac{2}{\pi(2k+1)}} \right\}_{k=0}^{n}. \]
Let us denote the points of the partition which are not on the boundary by $\{x_k\}_{k=0}^{n}$. Then
\[ |f(x_k) - f(x_{k-1})| = \frac{2}{\pi} \frac{1}{2k+1} + \frac{2}{\pi} \frac{1}{2k-1} = \frac{2}{\pi} \frac{4k}{4k^2 - 1} \ge \frac{2}{\pi} \frac{1}{k} \]
for $1 \le k \le n$. Thus,
\[ t(P, f) \ge \frac{2}{\pi} \sum_{k=1}^{n} \frac{1}{k}, \]
and the right-hand side will become arbitrarily large for large $n$.
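The growth of $t(P, f)$ in Example 6.2.3 can be observed numerically. The following is a minimal sketch in floating-point arithmetic; the function names `variation_sum` and `harmonic_bound` are illustrative choices and not part of the text.

```python
import math

def f(x):
    """f(x) = x^2 sin(1/x^2) on (0, 1], with f(0) = 0."""
    return 0.0 if x == 0 else x * x * math.sin(1.0 / (x * x))

def variation_sum(n):
    """t(P, f) for the partition {0, 1} together with sqrt(2 / (pi (2k + 1))), k = 0..n."""
    pts = [math.sqrt(2.0 / (math.pi * (2 * k + 1))) for k in range(n + 1)]
    pts = sorted(set(pts + [0.0, 1.0]))
    return sum(abs(f(right) - f(left)) for left, right in zip(pts, pts[1:]))

def harmonic_bound(n):
    """The lower bound (2 / pi) * (1 + 1/2 + ... + 1/n) from the example."""
    return (2.0 / math.pi) * sum(1.0 / k for k in range(1, n + 1))

for n in (10, 100, 1000, 10000):
    # t(P, f) should dominate the harmonic lower bound and keep growing with n.
    print(n, variation_sum(n), harmonic_bound(n))
```

The printed variation sums grow roughly like $\log n$, in line with the divergence of the harmonic series.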
Proposition 6.2.1 Let $[a,b] \subset \mathbb{R}$ be a given bounded interval. If $f$ is a function of bounded variation on $[a,b]$, then $f$ is bounded. The function $|f|$ is also of bounded variation. If $f$ and $g$ are functions of bounded variation on $[a,b]$ and if $\alpha$ and $\beta$ are real constants, then $\alpha f + \beta g$ and $fg$ are also of bounded variation on $[a,b]$.
Proof: Let $x \in (a,b]$. Consider the partition defined by the points $a < x \le b$. Then $|f(x) - f(a)| \le T_a^b(f)$. Thus, for any $x \in [a,b]$, we have
\[ |f(x)| \le |f(a)| + T_a^b(f). \]


The rest of the proof is an immediate consequence of the following inequalities:
\[ \bigl|\, |f(x)| - |f(y)| \,\bigr| \le |f(x) - f(y)|, \]
\[ |(\alpha f(x) + \beta g(x)) - (\alpha f(y) + \beta g(y))| \le |\alpha|\, |f(x) - f(y)| + |\beta|\, |g(x) - g(y)|, \]
and
\[ |f(x)g(x) - f(y)g(y)| \le |f(x)|\, |g(x) - g(y)| + |g(y)|\, |f(x) - f(y)|. \]

Given a real number $r$, define
\[ r^+ = \max\{r, 0\} \quad \text{and} \quad r^- = -\min\{r, 0\}, \]
so that $r = r^+ - r^-$ and $|r| = r^+ + r^-$. Given a partition $P$ as in (6.2.1) of an interval $[a,b]$ and a function $f : [a,b] \to \mathbb{R}$, define
\[ p(P, f) = \sum_{i=1}^{n} \bigl( f(x_i) - f(x_{i-1}) \bigr)^+ \quad \text{and} \quad n(P, f) = \sum_{i=1}^{n} \bigl( f(x_i) - f(x_{i-1}) \bigr)^-. \]
Thus,
\[ \begin{aligned} t(P, f) &= p(P, f) + n(P, f), \\ f(b) - f(a) &= p(P, f) - n(P, f). \end{aligned} \tag{6.2.2} \]
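The two identities in (6.2.2) can be checked on a concrete partition. The sketch below uses exact rational arithmetic; the test function $x^3 - x$ on $[-1, 1]$ and the uniform partition are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

def partition_sums(f, points):
    """Return (p, n, t) for f over the partition given by increasing points."""
    p = n = t = Fraction(0)
    for left, right in zip(points, points[1:]):
        d = f(right) - f(left)
        p += max(d, 0)    # positive part of the increment
        n += -min(d, 0)   # negative part of the increment
        t += abs(d)       # absolute increment
    return p, n, t

def f(x):
    # Illustrative choice: a function that is not monotonic on [-1, 1].
    return x**3 - x

# Uniform partition of [-1, 1] into 16 pieces, with exact rational end-points.
points = [Fraction(i, 8) for i in range(-8, 9)]
p, n, t = partition_sums(f, points)

print("p =", p, " n =", n, " t =", t)
print("t == p + n:", t == p + n)
print("f(b) - f(a) == p - n:", f(points[-1]) - f(points[0]) == p - n)
```

Because the increments are split into positive and negative parts term by term, both identities hold for every partition, not only for the one chosen here.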
Define
\[ P_a^b(f) = \sup_{P} p(P, f) \quad \text{and} \quad N_a^b(f) = \sup_{P} n(P, f). \]

Proposition 6.2.2 Let $f : [a,b] \to \mathbb{R}$ be a function of bounded variation. Then
\[ T_a^b(f) = P_a^b(f) + N_a^b(f), \tag{6.2.3} \]
and
\[ f(b) - f(a) = P_a^b(f) - N_a^b(f). \tag{6.2.4} \]

Proof: Since $f$ is of bounded variation, we have that $T_a^b(f)$, $P_a^b(f)$ and $N_a^b(f)$ are all finite. Now, for any partition $P$ of $[a,b]$, we have, using (6.2.2),
\[ p(P, f) = n(P, f) + f(b) - f(a) \le N_a^b(f) + f(b) - f(a). \]
We then immediately deduce that $P_a^b(f) \le N_a^b(f) + f(b) - f(a)$, or, equivalently,
\[ P_a^b(f) - N_a^b(f) \le f(b) - f(a). \]

Interchanging the roles of $p(P, f)$ and $n(P, f)$ in the use of (6.2.2), we deduce, in the same way, that
\[ N_a^b(f) - P_a^b(f) \le f(a) - f(b). \]
Thus, we have established (6.2.4).

Since $t(P, f) = p(P, f) + n(P, f)$ for any partition $P$, it follows that
\[ T_a^b(f) \le P_a^b(f) + N_a^b(f). \]
Again, for any partition $P$,
\[ \begin{aligned} T_a^b(f) \ge p(P, f) + n(P, f) &= p(P, f) + p(P, f) - (f(b) - f(a)) \\ &= 2p(P, f) + N_a^b(f) - P_a^b(f), \end{aligned} \]
by virtue of (6.2.4). Thus,
\[ T_a^b(f) \ge 2P_a^b(f) + N_a^b(f) - P_a^b(f) = P_a^b(f) + N_a^b(f), \]
which gives the reverse inequality, thereby establishing (6.2.3). This completes the proof.
Theorem 6.2.1 A function f : [a, b] → R is of bounded variation if,
and only if, it is the difference of two monotonic functions.
Proof: Since monotonic functions are of bounded variation, and since
the sum, or difference, of functions of bounded variation is also of
bounded variation, we see that the difference of two monotonic func-
tions is of bounded variation.

Conversely, let $f$ be of bounded variation on the interval $[a,b]$. Let $x \in (a,b]$. Define
\[ g(x) = P_a^x(f) \quad \text{and} \quad h(x) = N_a^x(f). \]
By definition, it is easy to see that both $g$ and $h$ are monotonic increasing functions. Consequently, the function $x \mapsto h(x) - f(a)$ is also monotonic increasing. Further, by the preceding proposition, we have
\[ f(x) - f(a) = g(x) - h(x). \]
Thus,
\[ f(x) = g(x) - (h(x) - f(a)), \]
which completes the proof.

Corollary 6.2.1 Let $f : [a,b] \to \mathbb{R}$ be a function of bounded variation. Then, it is differentiable almost everywhere.

Proof: This is a direct consequence of the preceding theorem and Theorem 6.1.1.

Proposition 6.2.3 Let $f : [a,b] \to \mathbb{R}$ be a function of bounded variation. Then $f'$ is integrable and
\[ \int_{[a,b]} |f'| \, dm_1 \le T_a^b(f). \]
In addition, if $f \in C^1[a,b]$, then we have equality in the above relation.

Proof: The functions $x \mapsto P_a^x(f)$, $x \mapsto N_a^x(f)$ and $x \mapsto T_a^x(f)$ are all monotonic increasing and so are differentiable almost everywhere. Since we have
\[ f(x) - f(a) = P_a^x(f) - N_a^x(f), \]
we have
\[ f'(x) = (P_a^x(f))' - (N_a^x(f))' \quad \text{a.e.} \]
Since $(P_a^x(f))'$ and $(N_a^x(f))'$ are non-negative (the functions concerned being monotonic increasing), we have
\[ |f'(x)| \le |(P_a^x(f))'| + |(N_a^x(f))'| = (P_a^x(f))' + (N_a^x(f))' = (T_a^x(f))', \]
since $T_a^x(f) = P_a^x(f) + N_a^x(f)$. Since $T_a^x(f)$ is monotonic increasing, we have, by Theorem 6.1.1, that
\[ \int_{[a,b]} |f'| \, dm_1 \le \int_{[a,b]} (T_a^x(f))' \, dm_1 \le T_a^b(f) - T_a^a(f) = T_a^b(f). \]

If $f$ is continuously differentiable, and if $P$ is any partition as in (6.2.1), we have, for $1 \le i \le n$,
\[ f(x_i) - f(x_{i-1}) = \int_{x_{i-1}}^{x_i} f'(t) \, dt. \]
Thus, it follows that
\[ t(P, f) \le \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} |f'(t)| \, dt = \int_a^b |f'(t)| \, dt = \int_{[a,b]} |f'| \, dm_1. \]
Consequently, we have the reverse inequality
\[ T_a^b(f) \le \int_{[a,b]} |f'| \, dm_1, \]
which completes the proof.

Let us now consider a vector-valued map $f : [a,b] \to \mathbb{R}^N$, where, for $x \in [a,b]$, we have $f(x) = (f_1(x), \cdots, f_N(x))$. We write
\[ |f(x)| = \left( \sum_{i=1}^{N} |f_i(x)|^2 \right)^{\frac{1}{2}}. \]
We then say that $f$ is of bounded variation over $[a,b]$ if
\[ T_a^b(f) = \sup_{P} t(P, f) < +\infty, \]
where the supremum is taken over all partitions $P$ of $[a,b]$ and, if $P$ is a partition of $[a,b]$ as in (6.2.1), we set
\[ t(P, f) = \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})|. \]
If each $f_i$, $1 \le i \le N$, is integrable over $[a,b]$, we define
\[ \int_{[a,b]} f \, dm_1 = \left( \int_{[a,b]} f_i \, dm_1 \right)_{i=1}^{N}. \]
If each $f_i$, $1 \le i \le N$, is differentiable in $(a,b)$, we define
\[ f'(x) = (f_i'(x))_{i=1}^{N}. \]

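Lemma 6.2.1 Let $f : [a,b] \to \mathbb{R}^N$ be integrable. Then
\[ \left| \int_{[a,b]} f \, dm_1 \right| \le \int_{[a,b]} |f| \, dm_1. \tag{6.2.5} \]
Proof: Set $y = \int_{[a,b]} f \, dm_1$, and $y_i = \int_{[a,b]} f_i \, dm_1$, $1 \le i \le N$. The result is trivially true if $y = 0$. Assume that $y \neq 0$. Then, using the Cauchy-Schwarz inequality in $\mathbb{R}^N$ for the middle step,
\[ \begin{aligned} |y|^2 = \sum_{i=1}^{N} y_i^2 &= \sum_{i=1}^{N} y_i \int_{[a,b]} f_i \, dm_1 \\ &= \int_{[a,b]} \sum_{i=1}^{N} y_i f_i \, dm_1 \le \int_{[a,b]} |y|\,|f| \, dm_1 \\ &= |y| \int_{[a,b]} |f| \, dm_1, \end{aligned} \]
from which (6.2.5) follows on dividing throughout by $|y|$.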

Proposition 6.2.4 Let $f : [a,b] \to \mathbb{R}^N$ be a continuously differentiable map. Then $f$ is of bounded variation and
\[ T_a^b(f) = \int_{[a,b]} |f'| \, dm_1. \]
Proof: Proceeding exactly as in the latter part of the proof of Proposition 6.2.3, we can easily see that, owing to the continuous differentiability of $f$,
\[ T_a^b(f) \le \int_{[a,b]} |f'| \, dm_1 < +\infty. \]

Thus, f is of bounded variation.

We now prove the reverse inequality. Since $f'$ is continuous over $[a,b]$, it is uniformly continuous. Given $\varepsilon > 0$, let $\delta > 0$ be such that, whenever $|x - y| < \delta$, we have $|f'(x) - f'(y)| < \varepsilon$. Let $P$ be a partition of $[a,b]$ as in (6.2.1) and such that
\[ \max_{1 \le i \le n} (x_i - x_{i-1}) < \delta. \]
Thus, if $x_{i-1} \le t \le x_i$, we have
\[ |f'(t)| < |f'(x_i)| + \varepsilon. \]
Therefore,
\[ \begin{aligned} \int_{x_{i-1}}^{x_i} |f'(t)| \, dt - \varepsilon (x_i - x_{i-1}) &\le |f'(x_i)| (x_i - x_{i-1}) \\ &= \left| \int_{x_{i-1}}^{x_i} \bigl( f'(t) + f'(x_i) - f'(t) \bigr) \, dt \right| \\ &\le \left| \int_{x_{i-1}}^{x_i} f'(t) \, dt \right| + \left| \int_{x_{i-1}}^{x_i} \bigl( f'(x_i) - f'(t) \bigr) \, dt \right| \\ &\le |f(x_i) - f(x_{i-1})| + \varepsilon (x_i - x_{i-1}). \end{aligned} \]

Summing over all $1 \le i \le n$, we get
\[ \int_a^b |f'(t)| \, dt - \varepsilon (b - a) \le t(P, f) + \varepsilon (b - a). \]
Thus,
\[ \int_a^b |f'(t)| \, dt \le T_a^b(f) + 2\varepsilon (b - a). \]
Since $\varepsilon > 0$ was arbitrarily chosen, we get
\[ \int_{[a,b]} |f'| \, dm_1 = \int_a^b |f'(t)| \, dt \le T_a^b(f), \]
which completes the proof.

Example 6.2.4 (Rectifiable arcs) An arc, or a curve, in the plane, is a continuous map of the form $\gamma : [a,b] \to \mathbb{R}^2$. To compute the ‘length’ of the curve, we partition $[a,b]$ as in (6.2.1) and get the approximate length as
\[ \sum_{i=1}^{n} |\gamma(x_i) - \gamma(x_{i-1})|, \]
which is none other than the sum of the lengths of the chords connecting the successive points of the set $\{\gamma(x_i)\}_{i=0}^{n}$ lying on the curve $\gamma$. We say that the arc is rectifiable, i.e. its length is well-defined, if the supremum of the above sum, taken over all partitions, is finite, and that number is called the length of the arc. In other words, an arc given by the mapping $\gamma$ is rectifiable if, and only if, the map $\gamma$ is of bounded variation, and the length of the arc is, in fact, $T_a^b(\gamma)$. Let us assume that the arc is given by the parametric equations $x = x(t)$, $y = y(t)$, where $t \in [a,b]$. If the functions $x(t)$ and $y(t)$ are continuously differentiable, then, by the preceding proposition, we get that the length, $L$, of the curve is given by the formula
\[ L = \int_a^b \sqrt{(x'(t))^2 + (y'(t))^2} \, dt, \]
which is precisely what we define as the length of a curve in an undergraduate calculus course.
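The chord sums of Example 6.2.4 can be computed directly. The sketch below assumes, purely for illustration, the unit circle $\gamma(t) = (\cos t, \sin t)$ on $[0, 2\pi]$ and uniform partitions; the name `chord_length` is not from the text.

```python
import math

def chord_length(gamma, a, b, n):
    """Sum of chord lengths of gamma over the uniform partition of [a, b] into n pieces."""
    pts = [gamma(a + (b - a) * i / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def circle(t):
    # Illustrative curve: the unit circle, whose length is 2*pi.
    return (math.cos(t), math.sin(t))

for n in (4, 16, 64, 256):
    # The chord sums increase towards the integral of |gamma'| = 1 over [0, 2*pi].
    print(n, chord_length(circle, 0.0, 2 * math.pi, n), 2 * math.pi)
```

For this curve $|\gamma'(t)| = 1$, so the formula of the example gives $L = 2\pi$, and the chord sums approach this value from below as the partition is refined.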

6.3 Differentiation of an indefinite integral

In this section, we will show that the derivative of the indefinite integral of an integrable function is equal to the integrand, almost everywhere.
Proposition 6.3.1 Let $f : [a,b] \to \mathbb{R}$ be an integrable function. The indefinite integral of $f$, defined by
\[ F(x) = \int_{[a,x]} f \, dm_1, \quad x \in [a,b], \]
is a uniformly continuous function of bounded variation over $[a,b]$.

Proof: Let $x, y \in [a,b]$, with $x < y$. Then
\[ |F(x) - F(y)| = \left| \int_{[x,y]} f \, dm_1 \right| \le \int_{[x,y]} |f| \, dm_1. \]
Given $\varepsilon > 0$, we know that (cf. Proposition 5.3.2) there exists $\delta > 0$ such that, whenever $|x - y| < \delta$, we have
\[ \int_{[x,y]} |f| \, dm_1 < \varepsilon, \]
since $f$ is integrable. This proves the uniform continuity of $F$.


If
\[ P = \{a = x_0 < x_1 < \cdots < x_n = b\} \]
is any partition of $[a,b]$, then
\[ \sum_{i=1}^{n} |F(x_i) - F(x_{i-1})| \le \sum_{i=1}^{n} \int_{[x_{i-1}, x_i]} |f| \, dm_1 = \int_{[a,b]} |f| \, dm_1. \]
Thus, it follows that
\[ T_a^b(F) \le \int_{[a,b]} |f| \, dm_1 < +\infty, \]
which proves that $F$ is of bounded variation over $[a,b]$.

Remark 6.3.1 The above result shows that in order that a function
be the indefinite integral of an integrable function, it must be at least a
function of bounded variation. In fact, we will see in the next section that
even more will be required. Thus, in general, a differentiable function
may not be the indefinite integral of its derivative, even if the derivative
is integrable. It needs to be at least a uniformly continuous function of
bounded variation, with additional properties. 

Proposition 6.3.2 Let $f : [a,b] \to \mathbb{R}$ be an integrable function such that
\[ \int_{[a,x]} f \, dm_1 = 0, \]
for all $x \in [a,b]$. Then $f(x) = 0$ for almost every $x \in [a,b]$.

Proof: Set
\[ E_+ = \{x \in [a,b] \mid f(x) > 0\} \quad \text{and} \quad E_- = \{x \in [a,b] \mid f(x) < 0\}. \]
Assume that $m_1(E_+) > 0$. Then (cf. Proposition 2.2.2), there exists a closed set $F \subset E_+$ such that $m_1(F) > 0$. Set $U = (a,b) \setminus F$, which is open. Then, $U$ can be written as the disjoint union of a countable number of half-open intervals (cf. Lemma 2.2.1):
\[ U = \bigcup_{n=1}^{\infty} [a_n, b_n). \]
Set
\[ f_n = f \, \chi_{\bigcup_{k=1}^{n} [a_k, b_k)}. \]
Then $f_n \to f$ in $U$ and $|f_n| \le |f|$, which is integrable. Thus, by the dominated convergence theorem, we have that
\[ \int_U f \, dm_1 = \sum_{n=1}^{\infty} \int_{[a_n, b_n)} f \, dm_1. \]
Since $f > 0$ on $F$ and since $F$ has positive measure, we have (cf. Proposition 5.2.2) $\int_F f \, dm_1 > 0$ and so, since
\[ 0 = \int_{[a,b]} f \, dm_1 = \int_U f \, dm_1 + \int_F f \, dm_1, \]
we deduce that $\int_U f \, dm_1 \neq 0$. Consequently, there exists $n \in \mathbb{N}$ such that $\int_{[a_n, b_n)} f \, dm_1 \neq 0$. But this is a contradiction, since
\[ \int_{[a_n, b_n)} f \, dm_1 = \int_{[a, b_n)} f \, dm_1 - \int_{[a, a_n)} f \, dm_1 = 0, \]
by hypothesis. Thus, it follows that $m_1(E_+) = 0$. Similarly, we can show that $m_1(E_-) = 0$. This shows that $f = 0$ almost everywhere.

Proposition 6.3.3 Let $f : [a,b] \to \mathbb{R}$ be a bounded measurable function. Define
\[ F(x) = F(a) + \int_{[a,x]} f \, dm_1, \quad x \in [a,b], \tag{6.3.1} \]
where $F(a)$ is an arbitrary constant. Then $F$ is differentiable almost everywhere and $F'(x) = f(x)$ for almost every $x \in [a,b]$.

Proof: By Proposition 6.3.1, we know that $F$ is uniformly continuous and of bounded variation and hence that it is differentiable almost everywhere. Let $|f(x)| \le M$ for all $x \in [a,b]$. For $n \in \mathbb{N}$, define
\[ f_n(x) = n \left( F\left( x + \frac{1}{n} \right) - F(x) \right) = n \int_{[x,\, x+\frac{1}{n}]} f \, dm_1. \]
Then $|f_n(x)| \le M$ for all $x \in [a,b]$ and $f_n \to F'$, almost everywhere, as $n \to \infty$. Hence, by the dominated convergence theorem, for any $c \in (a,b)$, we have
\[ \begin{aligned} \int_{[a,c]} F' \, dm_1 &= \lim_{n \to \infty} \int_{[a,c]} f_n \, dm_1 \\ &= \lim_{n \to \infty} n \int_{[a,c]} \left( F\left( x + \frac{1}{n} \right) - F(x) \right) dm_1(x) \\ &= \lim_{n \to \infty} n \left( \int_{[c,\, c+\frac{1}{n}]} F \, dm_1 - \int_{[a,\, a+\frac{1}{n}]} F \, dm_1 \right). \end{aligned} \]
(We have used the result of Exercise 5.3.) Now, since $F$ is uniformly continuous, given $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $n \ge N$, and for all $x \in [a,b)$, we have $|F(t) - F(x)| < \varepsilon$ whenever $x \le t \le x + \frac{1}{n}$. Consequently, for any $x \in [a,b)$ and for any $n \ge N$, we have
\[ \left| n \int_{[x,\, x+\frac{1}{n}]} F \, dm_1 - F(x) \right| = \left| n \int_x^{x+\frac{1}{n}} (F(t) - F(x)) \, dt \right| \le \varepsilon. \]
Thus, we deduce that, for any $c \in [a,b)$,
\[ \int_{[a,c]} F' \, dm_1 = F(c) - F(a) = \int_{[a,c]} f \, dm_1. \]
We then conclude, by applying the result of Proposition 6.3.2, that $F' = f$ almost everywhere.

We can discard the hypothesis of boundedness of the integrand.

Theorem 6.3.1 Let $f : [a,b] \to \mathbb{R}$ be an integrable function. Let $F$ be defined as in (6.3.1). Then $F$ is differentiable almost everywhere and $F'(x) = f(x)$ for almost every $x \in [a,b]$.
Proof: Let us first assume that $f \ge 0$. Define
\[ f_n(x) = \begin{cases} f(x), & \text{if } f(x) \le n, \\ n, & \text{if } f(x) > n. \end{cases} \]

Then each $f_n$ is bounded and the sequence $\{f_n\}_{n=1}^{\infty}$ increases to $f$. Thus, $f - f_n \ge 0$. Define
\[ G_n(x) = \int_{[a,x]} (f - f_n) \, dm_1. \]
Then $G_n$ is monotonic increasing and is hence differentiable almost everywhere and its derivative is non-negative. Further, since $f_n$ is bounded, by the preceding proposition, we have
\[ \frac{d}{dx} \int_{[a,x]} f_n \, dm_1 = f_n(x), \]
for almost every $x \in [a,b]$. Now, we can write
\[ F(x) = F(a) + G_n(x) + \int_{[a,x]} f_n \, dm_1, \]
and so, for almost every $x \in [a,b]$, we have
\[ F'(x) = G_n'(x) + f_n(x) \ge f_n(x). \]
Since $n \in \mathbb{N}$ was arbitrarily chosen, we have
\[ F'(x) \ge f(x) \quad \text{a.e.,} \tag{6.3.2} \]
which implies that
\[ \int_{[a,b]} F' \, dm_1 \ge \int_{[a,b]} f \, dm_1 = F(b) - F(a). \tag{6.3.3} \]
On the other hand, since $f$ is non-negative, we have that $F$ is monotonic increasing and hence, by Theorem 6.1.1,
\[ \int_{[a,b]} F' \, dm_1 \le F(b) - F(a). \tag{6.3.4} \]
It now follows, from (6.3.3) and (6.3.4), that
\[ \int_{[a,b]} F' \, dm_1 = \int_{[a,b]} f \, dm_1 = F(b) - F(a). \]
Since, by (6.3.2), we have that $F' \ge f$ almost everywhere, the above relation implies that $F' = f$ almost everywhere (cf. Proposition 5.2.2).

This completes the proof for non-negative $f$.

In the general case, we write $f = f^+ - f^-$. Then
\[ F(x) = F(a) + \int_{[a,x]} f^+ \, dm_1 - \int_{[a,x]} f^- \, dm_1. \]
It now follows, from the preceding arguments, that for almost every $x \in [a,b]$, we have
\[ F'(x) = f^+(x) - f^-(x) = f(x). \]
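Theorem 6.3.1 can be illustrated with an unbounded integrable integrand. The sketch below assumes, for illustration only, $f(x) = 1/(2\sqrt{x})$ on $(0, 1]$, whose indefinite integral is $F(x) = \sqrt{x}$, and evaluates the difference quotients $n(F(x + 1/n) - F(x))$ used in the proof of Proposition 6.3.3; the names are not from the text.

```python
import math

def f(x):
    """Illustrative integrand: integrable on (0, 1] but unbounded near 0."""
    return 1.0 / (2.0 * math.sqrt(x))

def F(x):
    """Closed form of the indefinite integral of f over [0, x]."""
    return math.sqrt(x)

def quotient(x, n):
    """The difference quotient n * (F(x + 1/n) - F(x))."""
    return n * (F(x + 1.0 / n) - F(x))

for n in (10, 1000, 100000):
    # At x = 0.25 the quotients approach f(0.25); at x = 0 they blow up.
    print(n, quotient(0.25, n), f(0.25), quotient(0.0, n))
```

At every $x > 0$ the quotients converge to $f(x)$, while at $x = 0$ they diverge; the single point $\{0\}$ is a set of measure zero, consistent with the almost-everywhere statement.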

6.4 Absolute Continuity

From the previous section, we see that in order that a given function f :
[a, b] → R be written as an indefinite integral of an integrable function, it
must be at least uniformly continuous and of bounded variation. We will
now introduce a new concept which will provide both a necessary and
sufficient condition for a function to be written as an indefinite integral.

Definition 6.4.1 A function $f : [a,b] \to \mathbb{R}$ is said to be absolutely continuous if, for every $\varepsilon > 0$, there exists $\delta > 0$ such that the following holds: given any finite collection of disjoint intervals $\{(x_k, y_k)\}_{k=1}^{n}$ in $[a,b]$ such that
\[ \sum_{k=1}^{n} |y_k - x_k| < \delta, \tag{6.4.1} \]
we have
\[ \sum_{k=1}^{n} |f(y_k) - f(x_k)| < \varepsilon. \tag{6.4.2} \]

Remark 6.4.1 Clearly an absolutely continuous function is uniformly continuous.

Example 6.4.1 If $f : [a,b] \to \mathbb{R}$ is Lipschitz continuous, then it is absolutely continuous. If $L$ is the Lipschitz constant (cf. Example 6.2.1), then choose $\delta = \varepsilon / L$. Then, if $\{(x_k, y_k)\}_{k=1}^{n}$ is a collection of disjoint intervals in $[a,b]$, satisfying (6.4.1), we have
\[ \sum_{k=1}^{n} |f(y_k) - f(x_k)| \le L \sum_{k=1}^{n} |y_k - x_k| < \varepsilon. \]
Thus, every differentiable function whose derivative is bounded on $[a,b]$ will be absolutely continuous.

Example 6.4.2 Let $f : [a,b] \to \mathbb{R}$ be an integrable function and let $F$ be defined as in (6.3.1). Then $F$ is absolutely continuous. To see this, let $\{(x_k, y_k)\}_{k=1}^{n}$ be a collection of disjoint intervals in $[a,b]$, satisfying (6.4.1). Then
\[ \sum_{k=1}^{n} |F(y_k) - F(x_k)| \le \int_{\bigcup_{k=1}^{n} (x_k, y_k)} |f| \, dm_1. \]
The existence of $\delta > 0$, given $\varepsilon > 0$, such that (6.4.2) is true follows from Proposition 5.3.2. (See also Remark 5.3.2.)

Our aim now is to prove the converse of the result in the above
example, thereby establishing absolute continuity as the necessary and
sufficient condition for a function to be written as an indefinite integral
(of its derivative).
Proposition 6.4.1 Let f : [a, b] → R be absolutely continuous. Then,
f is of bounded variation over [a, b]. In particular, f is differentiable
almost everywhere in [a, b].
Proof: Let $\delta > 0$ correspond to $\varepsilon = 1$ in the definition of absolute continuity of $f$. Let $K$ be the integral part of $1 + \frac{b-a}{\delta}$. Then, given any partition $P$ of $[a,b]$, we can refine it to a partition $P'$ such that the constituent intervals of $P'$ can be grouped into $K$ sets of intervals, each with total length less than $\delta$. Then,
\[ t(P, f) \le t(P', f) \le K. \]
Thus $T_a^b(f) \le K < +\infty$.
Proposition 6.4.2 Let $f : [a,b] \to \mathbb{R}$ be an absolutely continuous function such that $f' = 0$ almost everywhere in $[a,b]$. Then $f$ is a constant function.
Proof: Let $c \in (a,b]$ be arbitrarily chosen. Set
\[ E = \{x \in (a,c) \mid f'(x) = 0\}. \]
By hypothesis, we have that $m_1(E) = c - a$. Let $\varepsilon$ and $\eta$ be arbitrarily small positive numbers, and let $\delta > 0$ correspond to $\varepsilon$ in the definition of absolute continuity of $f$.

If $x \in E$, then, for sufficiently small $h > 0$, we have $[x, x+h] \subset (a,c)$ and also that $|f(x+h) - f(x)| < \eta h$. Then, by the Vitali covering lemma, we can find a finite disjoint set of intervals, $\{(x_k, x_k + h_k)\}_{k=1}^{n}$, such that
\[ m_1\left( E \setminus \bigcup_{k=1}^{n} (x_k, y_k) \right) < \delta, \]
where we have set $y_k = x_k + h_k$, $1 \le k \le n$. Without loss of generality, we may assume that the $\{x_k\}_{k=1}^{n}$ have been labelled in increasing order of magnitude. Thus we have
\[ y_0 = a \le x_1 < y_1 \le x_2 < y_2 \le \cdots \le x_n < y_n \le c = x_{n+1}, \]
and
\[ \sum_{k=0}^{n} |x_{k+1} - y_k| < \delta. \]
Now, on one hand, we have
\[ \sum_{k=1}^{n} |f(y_k) - f(x_k)| < \eta \sum_{k=1}^{n} |y_k - x_k| \le \eta (c - a). \]
On the other hand, by absolute continuity, we have
\[ \sum_{k=0}^{n} |f(x_{k+1}) - f(y_k)| < \varepsilon. \]
Together, these relations yield
\[ |f(c) - f(a)| < \varepsilon + \eta (c - a). \]
Since $\varepsilon$ and $\eta$ were arbitrarily chosen, it follows that $f(c) = f(a)$, and this completes the proof, since $c$ was fixed arbitrarily in $(a,b]$.

Remark 6.4.2 The Cantor function is an example of a non-constant function whose derivative vanishes almost everywhere. Thus, the Cantor function is not absolutely continuous.
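The failure of absolute continuity in Remark 6.4.2 can be seen directly from Definition 6.4.1. The sketch below assumes the standard Cantor function obtained from the ternary construction, and evaluates it exactly at the end-points of the $2^n$ intervals remaining at the $n$-th stage of the construction of the Cantor set; the helper names `cantor` and `stage_intervals` are hypothetical.

```python
from fractions import Fraction

def cantor(x):
    """Standard Cantor function at a rational x in [0, 1] whose denominator is a power of 3."""
    if x == 0:
        return Fraction(0)
    if x == 1:
        return Fraction(1)
    if x < Fraction(1, 3):
        return cantor(3 * x) / 2
    if x > Fraction(2, 3):
        return Fraction(1, 2) + cantor(3 * x - 2) / 2
    return Fraction(1, 2)  # the function equals 1/2 on the middle third [1/3, 2/3]

def stage_intervals(n):
    """The 2^n closed intervals left after n steps of removing open middle thirds."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for left, right in intervals:
            third = (right - left) / 3
            refined.append((left, left + third))
            refined.append((right - third, right))
        intervals = refined
    return intervals

for n in (1, 5, 10):
    ivs = stage_intervals(n)
    total_length = sum(right - left for left, right in ivs)
    total_jump = sum(abs(cantor(right) - cantor(left)) for left, right in ivs)
    # The total length tends to 0 while the total increment stays the same.
    print(n, float(total_length), total_jump)
```

The total length of the intervals is $(2/3)^n$, which can be made smaller than any $\delta$, yet the sum of the increments of the function over them does not shrink, so no $\delta$ works for small $\varepsilon$ in (6.4.2).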

Theorem 6.4.1 A function $F : [a,b] \to \mathbb{R}$ can be written as an indefinite integral of an integrable function if, and only if, it is absolutely continuous.

Proof: We have already seen, in Example 6.4.2, that if $F$ is an indefinite integral, then it is absolutely continuous.

Conversely, let $F$ be absolutely continuous. Since it is of bounded variation, it can be written as the difference of two monotonic increasing functions (cf. Theorem 6.2.1). Let $F = F_1 - F_2$, where $F_i$, $i = 1, 2$, are monotonic increasing. Then $F' = F_1' - F_2'$ and so, by Theorem 6.1.1, we have
\[ \begin{aligned} \int_{[a,b]} |F'| \, dm_1 &\le \int_{[a,b]} |F_1'| \, dm_1 + \int_{[a,b]} |F_2'| \, dm_1 \\ &= \int_{[a,b]} F_1' \, dm_1 + \int_{[a,b]} F_2' \, dm_1 \\ &\le \sum_{i=1}^{2} (F_i(b) - F_i(a)) < +\infty. \end{aligned} \]
Thus, $F'$ is integrable. Let
\[ G(x) = \int_{[a,x]} F' \, dm_1. \]
Then $G$ is absolutely continuous and hence so is the function $f = F - G$. Now, by Theorem 6.3.1, we have that $G' = F'$ almost everywhere, i.e. $f$ is an absolutely continuous function whose derivative $f' = F' - G'$ vanishes almost everywhere. Thus, $f$ is a constant, equal to $f(a) = F(a)$. Thus, we have
\[ F(a) = F(x) - \int_{[a,x]} F' \, dm_1, \]
or, equivalently,
\[ F(x) = F(a) + \int_{[a,x]} F' \, dm_1. \]
This completes the proof.

6.5 Exercises

6.1 Let $f : [a,b] \to \mathbb{R}$ be a monotonic function. For $x \in (a,b)$, define
\[ f(x+) = \lim_{h \downarrow 0} f(x + h) \quad \text{and} \quad f(x-) = \lim_{h \downarrow 0} f(x - h). \]
Show that $f(x+)$ and $f(x-)$ always exist. Deduce that the set of discontinuities of $f$ is at most countable.

6.2 For $x \in [-1, 1]$, define
\[ f(x) = \begin{cases} x \sin \dfrac{1}{x}, & \text{if } x \neq 0, \\ 0, & \text{if } x = 0. \end{cases} \]
Compute $D^+ f(0)$, $D_+ f(0)$, $D^- f(0)$ and $D_- f(0)$.

6.3 Show that the function $f$ defined on $[0, 1]$ by
\[ f(x) = \begin{cases} x^2 \sin \dfrac{1}{x}, & \text{if } x \neq 0, \\ 0, & \text{if } x = 0, \end{cases} \]
is of bounded variation.

6.4 Let $f : [a,b] \to \mathbb{R}$ be a function of bounded variation. Let $a \le c \le b$. Show that
\[ T_a^b(f) = T_a^c(f) + T_c^b(f). \]

6.5 Let $f : [a,b] \to \mathbb{R}$ be an absolutely continuous function. Show that
\[ T_a^b(f) = \int_{[a,b]} |f'| \, dm_1, \quad P_a^b(f) = \int_{[a,b]} (f')^+ \, dm_1, \quad \text{and} \quad N_a^b(f) = \int_{[a,b]} (f')^- \, dm_1. \]

6.6 Let $f : [a,b] \to \mathbb{R}$ be a function of bounded variation. Define, for $x \in [a,b]$,
\[ v_f(x) = T_a^x(f). \]
(a) If $f$ is continuous, show that $v_f$ is also continuous.
(b) If $f$ is absolutely continuous, show that $v_f$ is also absolutely continuous.

6.7 A monotone function is said to be singular if its derivative vanishes almost everywhere. Show that if $f : [a,b] \to \mathbb{R}$ is a monotonic increasing function, then it can be written as the sum of a singular function and an absolutely continuous function.

6.8 Let $f : [0, 1] \to \mathbb{R}$ be a continuous function which is absolutely continuous on $[\varepsilon, 1]$ for every $0 < \varepsilon < 1$.
(a) Show, by means of an example, that $f$ need not be absolutely continuous on $[0, 1]$.
(b) If, in addition, $f$ is of bounded variation on $[0, 1]$, show that it is absolutely continuous on $[0, 1]$.

6.9 (Another example of a Cantor function) Consider the Cantor set $C$ (cf. Example 2.1.3). For each $n \in \mathbb{N}$, let $E_n = [0, 1] \setminus X_n$, where $X_n$ is as described in Example 2.1.3. Define, for $x \in [0, 1]$,
\[ g_n(x) = \left( \frac{3}{2} \right)^n \chi_{E_n}(x) \quad \text{and} \quad f_n(x) = \int_{[0,x]} g_n \, dm_1. \]
(a) Show that, for each $n \in \mathbb{N}$, $f_n$ is a monotonic increasing function such that $f_n(0) = 0$, $f_n(1) = 1$ and that $f_n$ is constant on each constituent interval of $X_n = E_n^c$.
(b) If $I$ is any constituent interval of $E_n$, $n \in \mathbb{N}$, show that
\[ \int_I g_n \, dm_1 = \int_I g_{n+1} \, dm_1 = 2^{-n}. \]
(c) Let $n \in \mathbb{N}$. Show that $f_{n+1}(x) = f_n(x)$ if $x \notin E_n$, and that
\[ |f_{n+1}(x) - f_n(x)| \le \frac{3}{2^n} \]
for all $x \in E_n$.
(d) Deduce the existence of a continuous function $f : [0, 1] \to \mathbb{R}$ which is monotonic increasing, whose derivative vanishes at every point $x \notin C$, where $C$ is the Cantor set, and such that $f(0) = 0$, $f(1) = 1$.
Chapter 7

Change of variable

7.1 The Fréchet derivative

Let $A : \mathbb{R}^N \to \mathbb{R}^N$ be a linear transformation. We have seen (cf. Theorem 2.3.3) that if $E \subset \mathbb{R}^N$ is a measurable set, then
\[ m_N(A(E)) = |\det(A)| \, m_N(E). \]
Now, using the procedure outlined in Remark 5.2.4, it is a simple exercise to see that, if $f : \mathbb{R}^N \to \mathbb{R}$ is an integrable function, then
\[ \int_{\mathbb{R}^N} f \, dm_N = |\det(A)| \int_{\mathbb{R}^N} (f \circ A) \, dm_N, \]
where $f \circ A$ stands for the composition of the two mappings $A$ and $f$. The aim of this chapter is to generalize this result to suitable transformations on open sets in $\mathbb{R}^N$. In order to do this, we need the tools of differential calculus in $\mathbb{R}^N$, which we recall in this section. For proofs of all assertions made in this section, see, for example, Kesavan [4].

Definition 7.1.1 Let $U \subset \mathbb{R}^N$ be an open set and let $T : U \to \mathbb{R}^M$ be a given mapping. The mapping is said to be differentiable at a point $a \in U$, if there exists a linear transformation $A : \mathbb{R}^N \to \mathbb{R}^M$ such that
\[ \lim_{h \to 0} \frac{|T(a + h) - T(a) - A(h)|}{|h|} = 0, \tag{7.1.1} \]
where $|\cdot|$ denotes the euclidean length of a vector in the appropriate euclidean space. The linear map $A$ is called the Fréchet derivative of $T$ at the point $a \in U$ and is denoted by $T'(a)$.

Remark 7.1.1 The following facts are immediate consequences of the definition:
(i) If $T$ is differentiable at a point $a \in U$, then it is continuous at that point.
(ii) The derivative at a point, if it exists, is unique. It is for this purpose that we work in an open set.
(iii) If $T$ is itself a linear map, then it is differentiable at every point of $U$ and, for every $a \in U$, we have
\[ T'(a) = T. \]
Remark 7.1.2 The relation (7.1.1) can be written in an equivalent fashion as follows:
\[ T(a + h) = T(a) + T'(a)(h) + \varepsilon(h), \]
where the error term $\varepsilon(h)$ satisfies the condition
\[ \lim_{h \to 0} \frac{|\varepsilon(h)|}{|h|} = 0. \]

Definition 7.1.2 Let $U \subset \mathbb{R}^N$ be an open set. A mapping $T : U \to \mathbb{R}^M$ is said to be differentiable on $U$ if $T'(x)$ exists for every $x \in U$. The mapping $T$ is said to be of class $C^1$ on $U$, or, equivalently, $T$ is said to be continuously differentiable on $U$, if it is differentiable on $U$ and, in addition, the mapping $x \mapsto T'(x)$ from $U$ into the space of all linear transformations from $\mathbb{R}^N$ into $\mathbb{R}^M$, denoted $\mathcal{L}(\mathbb{R}^N, \mathbb{R}^M)$, is continuous when the latter space is endowed with its usual topology. If $V \subset \mathbb{R}^N$ is an open set, a mapping $T : U \to V$ is said to be a diffeomorphism if $T$ is a bijection and if both $T$ and $T^{-1}$ are continuously differentiable maps.
Example 7.1.1 If $N = M = 1$, then the Fréchet derivative is the familiar derivative that we define when studying the calculus of functions of a single variable. In this case, the derivative $T'(a)$ at a point $a \in U$ is a real number, which can be visualised as a linear map from $\mathbb{R}$ into itself, acting on $\mathbb{R}$ by multiplication, i.e. $T'(a)(h) = T'(a)h$.

Example 7.1.2 Let $N > 1$ and let $M = 1$. Then $T'(a)$ is a linear functional on $\mathbb{R}^N$, and so, it can be represented by a vector in $\mathbb{R}^N$. Indeed, we have
\[ T'(a) = \nabla T(a) = \left( \frac{\partial T}{\partial x_1}(a), \cdots, \frac{\partial T}{\partial x_N}(a) \right), \]
where $\{\frac{\partial T}{\partial x_i}(a)\}_{i=1}^{N}$ are the usual partial derivatives of $T$ at the point $a$. It can be shown that if $T$ is differentiable at $a \in U$, then all partial derivatives of $T$ exist at that point. Thus, the action of $T'(a)$ on a vector $h = (h_1, \cdots, h_N)$ is given by
\[ T'(a)(h) = \sum_{i=1}^{N} \frac{\partial T}{\partial x_i}(a) h_i. \]

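Example 7.1.3 Consider the mapping $T : \mathbb{R}^2 \to \mathbb{R}$ defined, for $(x, y) \in \mathbb{R}^2$, by
\[ T(x, y) = \begin{cases} \dfrac{x^5}{(y - x^2)^2 + x^4}, & \text{if } (x, y) \neq (0, 0), \\ 0, & \text{if } (x, y) = (0, 0). \end{cases} \]
Then one can verify that both partial derivatives exist at the point $(0, 0)$: since $T(t, 0) = t/2$ and $T(0, t) = 0$ for $t \neq 0$, we have $\frac{\partial T}{\partial x}(0, 0) = \frac{1}{2}$ and $\frac{\partial T}{\partial y}(0, 0) = 0$. If $T$ were differentiable at the origin, then it would follow, from the preceding example, that $T'((0, 0))(h) = \frac{1}{2} h_1$ for $h = (h_1, h_2)$. In particular, we must then have
\[ \lim_{h \to 0} \frac{|T(h) - \frac{1}{2} h_1|}{|h|} = 0. \]
However, if we take $h = (t, t^2)$, where $t \to 0$, then $T(h) = t$, and we can easily see that the above limit is $\frac{1}{2}$.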
Thus, while the partial derivatives will all exist at a point if the
mapping is differentiable there, the converse is not true. The partial
derivatives may exist at a point, but the mapping can still fail to be
differentiable there. 
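The behaviour of the mapping in Example 7.1.3 near the origin can be examined numerically. The sketch below is only a floating-point illustration; it prints the difference quotients $T(t, 0)/t$ and $T(0, t)/t$, which approximate the two partial derivatives at the origin, together with the quotient $|T(h)|/|h|$ along the parabola $h = (t, t^2)$.

```python
import math

def T(x, y):
    """The mapping of Example 7.1.3, with T(0, 0) = 0."""
    if x == 0 and y == 0:
        return 0.0
    return x**5 / ((y - x**2) ** 2 + x**4)

for t in (1e-1, 1e-2, 1e-3):
    dx = T(t, 0.0) / t                               # quotient for the x-partial at the origin
    dy = T(0.0, t) / t                               # quotient for the y-partial at the origin
    along = abs(T(t, t**2)) / math.hypot(t, t**2)    # |T(h)| / |h| for h = (t, t^2)
    print(t, dx, dy, along)
```

The quotient along the parabola does not tend to the value that a linear map built from the partial derivatives would predict, which is the obstruction to differentiability described in the example.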

Example 7.1.4 Let $N > 1$ and $M > 1$. Let $T(x) = (T_1(x), \cdots, T_M(x))$, where $T_i$ is a mapping of $U \subset \mathbb{R}^N$ into $\mathbb{R}$. If $T$ is differentiable at a point $a \in U$, then all the $T_i$, $1 \le i \le M$, are also differentiable at $a$. The derivative $T'(a)$ can now be represented by an $M \times N$ matrix. We have, for $h = (h_1, \cdots, h_N) \in \mathbb{R}^N$,
\[ T'(a)(h) = \begin{pmatrix} \frac{\partial T_1}{\partial x_1}(a) & \cdots & \frac{\partial T_1}{\partial x_N}(a) \\ \vdots & & \vdots \\ \frac{\partial T_M}{\partial x_1}(a) & \cdots & \frac{\partial T_M}{\partial x_N}(a) \end{pmatrix} \begin{pmatrix} h_1 \\ \vdots \\ h_N \end{pmatrix}. \]

Definition 7.1.3 Let $U \subset \mathbb{R}^N$ be an open set and let $T : U \to \mathbb{R}^N$ be a differentiable map defined on $U$. The Jacobian of $T$ at a point $a \in U$ is denoted by $J_T(a)$ and is equal to $\det(T'(a))$.

We will now recall (without poof) some important results from the
differential calculus in RN , which are well known when N = 1.

Theorem 7.1.1 (Chain rule) Let N_i ∈ N, 1 ≤ i ≤ 3. Let U ⊂ R^{N_1}
and V ⊂ R^{N_2} be open sets. Let f : U → R^{N_2} and let
g : V → R^{N_3} be continuous mappings. Let a ∈ U be such that
f(a) ∈ V. Assume that f is differentiable at a and that g is
differentiable at f(a). Then the composition h = g ∘ f, which is
defined on the open set U' = f^{-1}(V) ⊂ U, is differentiable at a and

\[ h'(a) = g'(f(a)) \circ f'(a). \]

Notation

• Let a and b be vectors in R^N. We denote the (closed) line segment
connecting these points by [a, b]. Thus,

\[ [a, b] = \{ta + (1 - t)b \mid 0 \le t \le 1\}. \]

• Let A : R^N → R^N be a linear transformation. We denote by ‖A‖
its norm, i.e.

\[ \|A\| = \max_{|x| = 1} |A(x)|. \]

Theorem 7.1.2 (Mean value theorem) Let U ⊂ R^N be an open set and
let T : U → R^M be a given mapping. Let a, b ∈ U be such that
[a, b] ⊂ U. If T is differentiable in U, then

\[ |T(b) - T(a)| \le \sup_{x \in [a, b]} \|T'(x)\|\, |b - a|. \tag{7.1.2} \]

Remark 7.1.3 The mean value theorem has numerous applications. For
instance, let T : U → R be a mapping such that its partial derivatives
exist at all points of U. Assume that the mappings x ↦ ∂T/∂x_i (x) are
continuous at a point a ∈ U for all 1 ≤ i ≤ N. Then, using the mean
value theorem, we can show that T is differentiable at a (cf. Example
7.1.3). □

Theorem 7.1.3 Let U ⊂ R^N be an open set and let T : U → R^M be a
mapping of class C^1. Then, if [a, a + h] ⊂ U, we have

\[ T(a + h) = T(a) + \int_0^1 T'(a + th)(h)\,dt. \tag{7.1.3} \]

7.2 Sard’s theorem

Definition 7.2.1 Let U ⊂ R^N be an open set. Let T : U → R^N be a
C^1 map. Let x ∈ U. We say that the point x is a singular point, or a
critical point, if the rank of T'(x) is strictly less than N, i.e. the
linear transformation T'(x) is singular. The value T(x) is called a
singular value or critical value of T. If T^{-1}({y}) does not contain
any critical point, we say that y is a regular value of T. □

Theorem 7.2.1 (Sard) Let U ⊂ R^N be an open set and let T : U → R^N
be a C^1 map. Then the set of critical values of T has measure zero.

Proof: Let S be the set of critical points of T, i.e.

\[ S = \{x \in U \mid J_T(x) = 0\}. \]

We need to show that m_N(T(S)) = 0.

Step 1. Let C be a closed cube of side a, with sides parallel to the
coordinate axes, contained in U. Since T' is bounded and uniformly
continuous on C (since T is continuously differentiable), given ε > 0,
there exists δ > 0 such that

\[ \|T'(x) - T'(y)\| < \varepsilon, \]

whenever |x − y| < δ and x, y ∈ C. Let us divide C into k^N similar
cubes, each of side a/k, with k being chosen large enough such that the
diameter of each sub-cube is less than δ, i.e. √N a/k < δ. If
‖T'(x)‖ ≤ L, for all x ∈ C, we have

\[ |T(x) - T(y)| \le L|x - y|, \]

for all x, y ∈ C, by virtue of the mean value theorem (cf. Theorem 7.1.2).

Step 2. Let x ∈ C ∩ S. Then, x must belong to one of the sub-cubes,
say, C̃. Given any y ∈ C̃, we have, on one hand,

\[ |T(x) - T(y)| \le L\sqrt{N}\,\frac{a}{k}. \tag{7.2.1} \]

On the other hand, by virtue of Theorem 7.1.3, we have

\[ T(y) - T(x) - T'(x)(y - x) = \int_0^1 \big(T'(x + t(y - x)) - T'(x)\big)(y - x)\,dt. \]

Using the uniform continuity of T' in C, we get

\[ |T(y) - T(x) - T'(x)(y - x)| \le \varepsilon|y - x| \le \varepsilon\sqrt{N}\,\frac{a}{k}. \tag{7.2.2} \]
Step 3. Set H = T'(x)(R^N). Since x is a critical point, it follows that
the dimension of the subspace H is at most N − 1. Hence by (7.2.2), we
deduce that

\[ \operatorname{dist}(T(y), T(x) + H) \le \varepsilon\sqrt{N}\,\frac{a}{k}, \tag{7.2.3} \]

for every y ∈ C̃. Combining (7.2.1) and (7.2.3), we deduce that T(C̃) is
contained within a cylindrical block of radius L√N a/k and height
2ε√N a/k. If ω_{N−1} is the measure of the unit ball in R^{N−1}, we have

\[ m_N(T(\widetilde{C})) \le \omega_{N-1}\Big(L\sqrt{N}\,\frac{a}{k}\Big)^{N-1}\, 2\varepsilon\sqrt{N}\,\frac{a}{k} = K(N, C)\,\varepsilon k^{-N}. \]

Thus

\[ m_N(T(C \cap S)) \le \sum_{\substack{\widetilde{C} \subset C \\ \widetilde{C} \cap S \neq \emptyset}} m_N(T(\widetilde{C})) \le k^N K(N, C)\,\varepsilon k^{-N} = K(N, C)\,\varepsilon. \]

Since ε can be chosen arbitrarily small, we deduce that
m_N(T(C ∩ S)) = 0. Now U can be covered by a countable number of such
cubes and so the result follows. □
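
The mechanism of the proof can be observed numerically in the simplest
case N = 1. The sketch below is an illustration only: the map
T(x) = x^3 − 3x, the interval [−2, 2] and the helper `covered_length`
are assumptions made here, not taken from the text. It sums the lengths
of the images of those sub-intervals that contain a critical point;
this total shrinks as the subdivision is refined, in line with the
estimate on m_N(T(C ∩ S)).

```python
def T(x):
    """Illustrative C^1 map on R with critical points at x = -1 and x = 1."""
    return x**3 - 3.0 * x

CRITICAL_POINTS = (-1.0, 1.0)  # zeros of T'(x) = 3x^2 - 3

def covered_length(k, left=-2.0, right=2.0):
    """Total length of T(I) over the sub-intervals I, among k equal
    pieces of [left, right], that contain a critical point."""
    width = (right - left) / k
    total = 0.0
    for j in range(k):
        lo = left + j * width
        hi = lo + width
        inside = [c for c in CRITICAL_POINTS if lo <= c <= hi]
        if inside:
            # T is monotone between consecutive critical points, so its
            # extrema on [lo, hi] occur at an endpoint or a critical point.
            values = [T(lo), T(hi)] + [T(c) for c in inside]
            total += max(values) - min(values)
    return total

for k in (11, 101, 1001):
    print(k, covered_length(k))
```

Odd values of k are used so that no critical point falls on a grid
point of the subdivision.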

7.3 Diffeomorphisms

Lemma 7.3.1 Let U ⊂ R^N be an open set and let T : U → R^N be a
C^1 map. Let x ∈ U be such that T'(x) is non-singular. Let C ⊂ U be a
closed cube with centre at x, its sides parallel to the coordinate axes
and of length ν. Then, given ε > 0, there exists δ > 0 such that, if
ν < δ, and if C̃ is any sub-cube of C with sides parallel to the
coordinate axes, we have

\[ m_N(T(\widetilde{C})) \le \frac{(1 + \varepsilon)^N}{1 - \varepsilon} \int_{\widetilde{C}} |J_T|\,dm_N. \tag{7.3.1} \]

Proof: Let B ⊂ U be a closed and bounded neighbourhood of x so that
T' is bounded and uniformly continuous on B. We will only work with
cubes C contained in B. Let

\[ L = \max_{\xi \in B} \|T'(\xi)\|. \]

If we consider any sub-cube C̃ of C, with sides parallel to the
coordinate axes and with centre at x', we have, by the mean value
theorem,

\[ |T(y) - T(x')| \le L|y - x'|, \]

for every y ∈ C̃. Thus, it follows that for any such sub-cube,

\[ m_N(T(\widetilde{C})) \le L^N m_N(\widetilde{C}). \tag{7.3.2} \]

Let ε > 0 be arbitrarily chosen. Then, by the uniform continuity of T',
we can find δ > 0 such that, if |y − x| < δ, we have

\[ \|(T'(x))^{-1} T'(y)\| < 1 + \varepsilon, \qquad |J_T(y)| > (1 - \varepsilon)|J_T(x)|. \tag{7.3.3} \]

Let ν < δ so that the above relations are valid throughout C. Let C̃
be any sub-cube of C, as described earlier. Then, by virtue of Theorem
2.3.3, we have

\[ |J_T(x)|^{-1}\, m_N(T(\widetilde{C})) = m_N\big((T'(x))^{-1} T(\widetilde{C})\big). \]

Now, T'(x) is a fixed linear transformation. Hence, by Remark 7.1.1
(iii) and the chain rule (cf. Theorem 7.1.1), we have that
(T'(x))^{-1} T is differentiable and its derivative at any point ξ is
(T'(x))^{-1} T'(ξ). Since, for ξ ∈ C, we have that

\[ \max_{\xi \in C} \|(T'(x))^{-1} T'(\xi)\| < 1 + \varepsilon, \]

we deduce from (7.3.2) that

\[ m_N\big((T'(x))^{-1} T(\widetilde{C})\big) < (1 + \varepsilon)^N m_N(\widetilde{C}). \]

Thus,

\[ m_N(T(\widetilde{C})) < (1 + \varepsilon)^N |J_T(x)|\, m_N(\widetilde{C}). \tag{7.3.4} \]

On the other hand, we also have from (7.3.3), that

\[ \int_{\widetilde{C}} |J_T(y)|\,dm_N(y) > (1 - \varepsilon)|J_T(x)|\, m_N(\widetilde{C}). \tag{7.3.5} \]

Thus, combining (7.3.4) and (7.3.5), we deduce (7.3.1). □


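Lemma 7.3.2 Let U ⊂ R^N be open and let K ⊂ U be a compact set.
Let T : U → R^N be a C^1 map. Then

\[ \int_K |J_T|\,dm_N = \inf_{\substack{W \text{ open},\ \overline{W} \text{ compact} \\ K \subset W \subset \overline{W} \subset U}} \int_W |J_T|\,dm_N. \tag{7.3.6} \]

Proof: Since T is continuously differentiable and since K is compact,
|J_T| is integrable over K and over any open set W whose closure is
compact. Clearly,

\[ \int_K |J_T|\,dm_N \le \inf_{\substack{W \text{ open},\ \overline{W} \text{ compact} \\ K \subset W \subset \overline{W} \subset U}} \int_W |J_T|\,dm_N. \]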
To prove the reverse inequality, observe that by absolute continuity
(cf. Proposition 5.3.2), if we restrict our attention to subsets of a
relatively compact open set W_0 containing K and contained in U, given
any ε > 0, we can find δ > 0 such that, if F ⊂ W_0 satisfies
m_N(F) < δ, then

\[ \int_F |J_T|\,dm_N < \varepsilon. \]

Now we can find W open such that K ⊂ W ⊂ W̄ ⊂ W_0 ⊂ W̄_0 ⊂ U and
such that m_N(W\K) < δ (cf. Proposition 2.2.2 (ii)). Thus,

\[ \int_W |J_T|\,dm_N - \int_K |J_T|\,dm_N < \varepsilon. \]

In other words, we have found W, a relatively compact open set
contained in U and containing K such that

\[ \int_W |J_T|\,dm_N < \int_K |J_T|\,dm_N + \varepsilon. \]

This establishes the reverse inequality and completes the proof. □



Proposition 7.3.1 Let U and V be open subsets of R^N and let
T : U → V be a C^1 map, which is also a homeomorphism. Then

\[ m_N(T(E)) \le \int_E |J_T|\,dm_N, \tag{7.3.7} \]

where E ⊂ U is either a compact set or an open set.

Proof: Step 1. Let K ⊂ U be a compact set and let W be a relatively
compact open set such that

\[ K \subset W \subset \overline{W} \subset U. \]

Let

\[ L = \max_{x \in \overline{W}} \|T'(x)\|. \]

Let ε > 0. Let δ_1 > 0 be such that, whenever |x − y| < δ_1,
x, y ∈ W̄, we have

\[ \|T'(x) - T'(y)\| < \varepsilon. \]

If x ∈ K is such that T'(x) is singular, set ν(x) = δ_1. If x ∈ K is
such that T'(x) is non-singular, set ν(x) = min{δ_1, δ(x)}, where δ(x)
is the number δ chosen in the proof of Lemma 7.3.1 such that (7.3.3) is
valid. Now cover K by cubes {C(x)}_{x∈K}, where C(x) is centered at
x ∈ K and with sides of length ν(x), the sides being parallel to the
coordinate axes. Since K is compact, there exists a finite subcover. We
can further ensure that we have a finite collection of disjoint
(half-open) cubes whose union covers K and each of these is a sub-cube
of one of the C(x). Thus we have a finite cover of K consisting of
disjoint cubes {C'(x)}_{x∈S}, where S is a finite set and x is the
centre of the cube C'(x). Then S = J_s ∪ J_ns, where

\[ J_s = \{x \in S \mid T'(x) \text{ is singular}\}, \qquad J_{ns} = \{x \in S \mid T'(x) \text{ is non-singular}\}. \]

If x ∈ J_s, then we can proceed exactly as in the proof of Sard's
theorem (Theorem 7.2.1) to get

\[ m_N(T(C'(x))) \le 2\omega_{N-1} L^{N-1} \varepsilon\, m_N(C'(x)). \tag{7.3.8} \]

If x ∈ J_ns, then, by Lemma 7.3.1, we get

\[ m_N(T(C'(x))) \le \frac{(1 + \varepsilon)^N}{1 - \varepsilon} \int_{C'(x)} |J_T|\,dm_N. \tag{7.3.9} \]

(We must remember that each C'(x) is a sub-cube of one of the original
cubes. The estimates in Sard's theorem and Lemma 7.3.1 were observed
to be valid for any sub-cube of the cube of admissible size.) Now,

\[ \begin{aligned} m_N(T(K)) &\le m_N\Big(T\Big(\bigcup_{x \in S} C'(x)\Big)\Big) \\ &\le \sum_{x \in J_s} m_N(T(C'(x))) + \sum_{x \in J_{ns}} m_N(T(C'(x))) \\ &\le C(K)\,\varepsilon \sum_{x \in J_s} m_N(C'(x)) + \frac{(1 + \varepsilon)^N}{1 - \varepsilon} \sum_{x \in J_{ns}} \int_{C'(x)} |J_T|\,dm_N \\ &\le C(K)\,\varepsilon\, m_N(W) + \frac{(1 + \varepsilon)^N}{1 - \varepsilon} \int_W |J_T|\,dm_N, \end{aligned} \]

where

\[ C(K) = 2\omega_{N-1} L^{N-1}. \]

We have used here the disjointness of the cubes C'(x) when using
m_N(W) as an upper bound for Σ_{x∈J_s} m_N(C'(x)) and when using the
integral over W as an upper bound for the sum of the integrals over the
cubes C'(x).
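Since ε can be chosen arbitrarily small, we get

\[ m_N(T(K)) \le \int_W |J_T|\,dm_N. \]

Consequently, we have, by Lemma 7.3.2,

\[ m_N(T(K)) \le \inf_{\substack{W \text{ open},\ \overline{W} \text{ compact} \\ K \subset W \subset \overline{W} \subset U}} \int_W |J_T|\,dm_N = \int_K |J_T|\,dm_N. \]
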
Step 2. Let W ⊂ U be an open set. Then, W can be written as the
countable increasing union of compact sets, i.e.

\[ W = \bigcup_{n=1}^{\infty} K_n, \]

where the sets K_n, n ∈ N, are all compact and K_n ⊂ K_{n+1} for all
n ∈ N. Then, since T is a homeomorphism,

\[ T(W) = \bigcup_{n=1}^{\infty} T(K_n), \]

and for all n ∈ N, we have that T(K_n) ⊂ T(K_{n+1}) and T(K_n) is
compact. Thus, by Step 1,

\[ m_N(T(W)) = \lim_{n \to \infty} m_N(T(K_n)) \le \lim_{n \to \infty} \int_{K_n} |J_T|\,dm_N \le \int_W |J_T|\,dm_N. \]

This completes the proof. □


Corollary 7.3.1 Let U and V be bounded open subsets of R^N. Let T
be a C^1 map which maps U homeomorphically onto V. Let E ⊂ U be a
Borel set. Then (7.3.7) holds.
Proof: By Lemma 2.3.1, T(E) is a Borel set. Since V is bounded, it
follows that T(E) has finite measure. Consequently (cf. Proposition
2.2.3 and Remark 2.2.1), since T is a homeomorphism, we have

\[ m_N(T(E)) = \sup\{m_N(T(K)) \mid K \subset E,\ K \text{ compact}\}. \]

But

\[ m_N(T(K)) \le \int_K |J_T|\,dm_N \le \int_E |J_T|\,dm_N, \]

from which (7.3.7) follows. □

Remark 7.3.1 We needed the boundedness of V to ensure that T (E)


has finite measure. If T : U → V is such that |JT | is integrable over U ,
then, if E ⊂ U is any Borel set, we can find an open set W such that
E ⊂ W ⊂ U (cf. Proposition 2.2.2). Then, by Proposition 7.3.1, T (W )
will have finite measure and so the measure of T (E) will also be finite.
Then the proof of Corollary 7.3.1 will go through. 

The following result is an immediate consequence of the preceding


corollary.
Corollary 7.3.2 Let U and V be bounded open subsets of R^N and let
T be a homeomorphism of U onto V, which is also a C^1 map. If E ⊂ U
is a Borel set of measure zero, then so is T(E). □

Proposition 7.3.2 Let U and V be bounded open subsets of R^N and
let T be a homeomorphism of U onto V, which is also a C^1 map. Let
f : V → R be a non-negative Borel measurable function. Then

\[ \int_V f\,dm_N \le \int_U (f \circ T)|J_T|\,dm_N. \tag{7.3.10} \]

Proof: Let F ⊂ V be a Borel set. Then F = T (E), where E ⊂ U is


a Borel set. Then χE = χF ◦ T . Thus, if f = χF , then (7.3.10) is just
a restatement of (7.3.7). The result is now true for any non-negative
simple function and, hence, by the monotone convergence theorem, for
any non-negative Borel measurable function. 

Henceforth, we will work with diffeomorphisms.



Proposition 7.3.3 Let U and V be bounded open subsets of R^N and
let T be a diffeomorphism of U onto V. Let f be a non-negative Borel
measurable function defined on V. Then

\[ \int_V f\,dm_N = \int_U (f \circ T)|J_T|\,dm_N. \tag{7.3.11} \]

Proof: We apply (7.3.10) to the function (f ∘ T)|J_T|, defined on U,
and to the diffeomorphism T^{-1} : V → U. Set Tx = y. We then get

\[ \begin{aligned} \int_U (f \circ T)(x)\,|J_T(x)|\,dm_N(x) &\le \int_V f(y)\,|J_T(T^{-1}(y))|\,|J_{T^{-1}}(y)|\,dm_N(y) \\ &= \int_V f(y)\,dm_N(y), \end{aligned} \]

since |J_T(T^{-1}(y))| |J_{T^{-1}}(y)| = 1 for all y ∈ V. This gives
the reverse inequality of (7.3.10), thereby establishing (7.3.11). □
Corollary 7.3.3 Let U and V be bounded open subsets of R^N and let
T be a diffeomorphism of U onto V. If E is any Borel subset of U, then

\[ m_N(T(E)) = \int_E |J_T|\,dm_N. \tag{7.3.12} \]
□

We can now extend Lemma 2.3.1 to Lebesgue measurable sets.


Proposition 7.3.4 Let U and V be bounded open subsets of RN and let
T be a diffeomorphism of U onto V . If E ⊂ U is Lebesgue measurable,
then T (E) is a Lebesgue measurable subset of V .
Proof: We can write (cf. Theorem 1.4.1), E = F ∪ N , where F is a
Borel set and N is a subset of a Borel set A, where mN (A) = 0. Then
T (E) = T (F ) ∪ T (N ) and T (F ) is a Borel set. We also have that T (A)
is a Borel set of measure zero (cf. Corollary 7.3.2) and T (N ) ⊂ T (A).
Thus, T (E) is Lebesgue measurable. 
Theorem 7.3.1 (Change of variable formula) Let U and V be bounded
open subsets of RN and let T be a diffeomorphism of U onto V . Let
f : V → R be an integrable function. Then (7.3.11) holds.
Proof: Let E ⊂ U be a Lebesgue measurable set. Then if we write
E = F ∪ N as in the proof of the preceding proposition, we may assume,
without loss of generality, that F ∩ N = ∅. Then

\[ m_N(T(E)) = m_N(T(F)) = \int_F |J_T|\,dm_N = \int_{F \cup N} |J_T|\,dm_N = \int_E |J_T|\,dm_N. \]

If G ⊂ V is a Lebesgue measurable set, then we can write G = T(E),
where E ⊂ U is Lebesgue measurable. The preceding considerations then
show that (7.3.11) holds for f = χ_G. Consequently, the relation remains
valid for any non-negative simple function, and hence, by the monotone
convergence theorem, for any non-negative measurable function. If f
is an integrable function, then (7.3.11) holds for both f^+ and f^−, and
hence for f as well. □

Example 7.3.1 Consider a continuous function f : [−1, 1] → R. When
studying the change of variable y = −x in an undergraduate class, we
usually set dy = −dx and we get

\[ \int_{-1}^{1} f(x)\,dx = -\int_{1}^{-1} f(-y)\,dy. \]

We then declare that

\[ \int_{1}^{-1} f(-y)\,dy = -\int_{-1}^{1} f(-y)\,dy, \]

to get

\[ \int_{-1}^{1} f(x)\,dx = \int_{-1}^{1} f(-y)\,dy. \tag{7.3.13} \]

The correct way of interpreting this is to use (7.3.11). We set
T(x) = −x. Then |J_T(x)| = |T'(x)| = 1 for all x ∈ (−1, 1). We also
have T((−1, 1)) = (−1, 1). Thus (7.3.11) gives us

\[ \int_{(-1,1)} f(x)\,dm_1(x) = \int_{(-1,1)} f(-y)\,dm_1(y), \]

which is the same as (7.3.13). □
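
A quick numerical check of (7.3.13) is sketched below. It is an
illustration under assumptions made here: f is taken to be the
exponential function, and `midpoint_integral` is a hypothetical helper
implementing the composite midpoint rule.

```python
import math

def midpoint_integral(g, a, b, n=2000):
    """Composite midpoint rule for the integral of g over (a, b)."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

f = math.exp  # illustrative choice of a continuous f on [-1, 1]

# Left side of (7.3.13): integral of f(x) over (-1, 1).
lhs = midpoint_integral(f, -1.0, 1.0)
# Right side: integral of f(T(y)) |J_T(y)| with T(y) = -y and |J_T| = 1.
rhs = midpoint_integral(lambda y: f(-y), -1.0, 1.0)
print(lhs, rhs)
```

Both values approximate e − 1/e, the exact value of the integral for
this choice of f.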

Example 7.3.2 (Polar coordinates) Let D ⊂ R^2 denote the open disc
of radius a > 0, i.e.

\[ D = \{(x, y) \in \mathbb{R}^2 \mid |x|^2 + |y|^2 < a^2\}. \]

Consider the open set V = D\{(x, 0) | 0 ≤ x < a}. Let
U = (0, a) × (0, 2π) ⊂ R^2. Then the mapping T : U → V defined by
T(r, θ) = (x, y), where

\[ x = r\cos\theta, \qquad y = r\sin\theta, \]

defines a diffeomorphism between U and V. We have

\[ J_T = \begin{vmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{vmatrix} = r. \]

Thus, if f : V → R is an integrable function, we have

\[ \int_V f\,dm_2 = \int_U r f(r\cos\theta, r\sin\theta)\,dm_2(r, \theta). \]

Since D and V differ by a set of measure zero, we have

\[ \int_D f\,dm_2 = \int_U r f(r\cos\theta, r\sin\theta)\,dm_2(r, \theta). \]

If f : R^2 → R is a non-negative function, then, by the monotone
convergence theorem, we have

\[ \int_{\mathbb{R}^2} f\,dm_2 = \int_{(0,+\infty) \times (0,2\pi)} r f(r\cos\theta, r\sin\theta)\,dm_2(r, \theta). \tag{7.3.14} \]

By considering the positive and negative parts of f, we have that
(7.3.14) is valid for any integrable function f : R^2 → R. In the next
chapter, we will write (7.3.14) in a more familiar form. □
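
The formula for the disc can be tested numerically. The following is a
rough sketch, not a definitive implementation: `polar_integral` is a
hypothetical helper that applies the midpoint rule on
U = (0, a) × (0, 2π) with the Jacobian factor r, and the radius and test
functions are arbitrary choices.

```python
import math

def polar_integral(f, a, n_r=200, n_theta=200):
    """Midpoint-rule approximation of the integral of f over the disc of
    radius a, computed on (0, a) x (0, 2*pi) with the factor r."""
    dr = a / n_r
    dtheta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            total += r * f(r * math.cos(theta), r * math.sin(theta))
    return total * dr * dtheta

a = 1.5
area = polar_integral(lambda x, y: 1.0, a)               # exact value: pi * a^2
moment = polar_integral(lambda x, y: x * x + y * y, a)    # exact value: pi * a^4 / 2
print(area, math.pi * a**2)
print(moment, math.pi * a**4 / 2)
```
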
Chapter 8

Product spaces

8.1 Measurability in the product space

Let (X, S, µ) and (Y, T , λ) be two measure spaces. We would like to de-
fine a σ-algebra and a measure on the product X ×Y which is compatible
with the structures given on X and Y and also relate the process of in-
tegration with respect to this measure with the processes of integration
on X and Y .

Definition 8.1.1 Let (X, S) and (Y, T ) be two measurable spaces. A


measurable rectangle is a subset of X × Y of the form A × B, where
A ∈ S and B ∈ T . An elementary set is a finite disjoint union of
measurable rectangles. The σ-algebra generated by the collection of all
elementary sets is denoted by S × T . 

Definition 8.1.2 Let X and Y be non-empty sets. Let E ⊂ X × Y. Let
x ∈ X. Then the x-section of E, denoted E_x, is defined by

\[ E_x = \{y \in Y \mid (x, y) \in E\}. \]

Similarly, for y ∈ Y, the y-section of E, denoted E^y, is defined by

\[ E^y = \{x \in X \mid (x, y) \in E\}. \]

Thus, E_x ⊂ Y and E^y ⊂ X. □

Proposition 8.1.1 Let (X, S) and (Y, T ) be measurable spaces. Let


E ∈ S × T . Then Ex ∈ T and E y ∈ S for every x ∈ X and for every
y ∈Y.
© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019 156
S. Kesavan, Measure and Integration, Texts and Readings in
Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_8

Proof: Let U denote the collection of all subsets E of X × Y such that
E_x ∈ T for every x ∈ X. If E = A × B is a measurable rectangle, then

\[ E_x = \begin{cases} B, & \text{if } x \in A, \\ \emptyset, & \text{if } x \notin A. \end{cases} \]

Thus, every measurable rectangle belongs to U. In particular,
X × Y ∈ U. Now, if E ⊂ X × Y, and if x ∈ X, we have

\[ (E_x)^c = \{y \in Y \mid (x, y) \notin E\} = (E^c)_x. \]

Thus, if E ∈ U, then E^c ∈ U. Similarly, if {E_i}_{i=1}^∞ is a sequence
of sets in X × Y and if E = ∪_{i=1}^∞ E_i, then, for any x ∈ X, we have

\[ E_x = \bigcup_{i=1}^{\infty} (E_i)_x. \]

This shows that if {E_i}_{i=1}^∞ is a countable collection of sets in U,
then E = ∪_{i=1}^∞ E_i ∈ U. Thus, U is a σ-algebra on X × Y which
contains all measurable rectangles and so it must contain all members of
S × T. This shows that if E ∈ S × T, then, for every x ∈ X, we have
that E_x ∈ T. In the same way we can show that if E is in S × T, then
E^y ∈ S for every y ∈ Y. This completes the proof. □
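
The two identities used in the proof, namely that sections commute with
complements and with unions, can be checked on a small finite example.
This is a sketch with hypothetical names (`x_section`, `y_section`) and
arbitrarily chosen sets.

```python
def x_section(E, x):
    """E_x = {y : (x, y) in E}."""
    return {y for (u, y) in E if u == x}

def y_section(E, y):
    """E^y = {x : (x, y) in E}."""
    return {x for (x, v) in E if v == y}

X = {1, 2, 3}
Y = {"a", "b"}
E = {(1, "a"), (1, "b"), (3, "a")}
F = {(2, "b"), (3, "a")}
complement = {(x, y) for x in X for y in Y} - E

print(x_section(E, 1), y_section(E, "a"))

# Sections commute with complements and unions, as used in the proof.
for x in X:
    assert x_section(complement, x) == Y - x_section(E, x)
    assert x_section(E | F, x) == x_section(E, x) | x_section(F, x)
```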
Definition 8.1.3 Let X be any non-empty set. A monotone class is
a collection M of subsets of X which is closed under countable
increasing unions and countable decreasing intersections, i.e. if
{A_i}_{i=1}^∞ and {B_i}_{i=1}^∞ are two countable collections of subsets
of X in M such that, for all i ∈ N, we have A_i ⊂ A_{i+1} and
B_i ⊃ B_{i+1}, then

\[ \bigcup_{i=1}^{\infty} A_i \in \mathcal{M} \quad \text{and} \quad \bigcap_{i=1}^{\infty} B_i \in \mathcal{M}. \]
□

Remark 8.1.1 Any σ-ring, and so, in particular, any σ-algebra, is a


monotone class. 

Remark 8.1.2 The intersection of monotone classes is obviously a


monotone class. The collection P(X), of all subsets of a non-empty
set X, is obviously a monotone class. Thus, if A is any collection of
subsets of X, there exists a smallest monotone class containing A. We
will denote it by M(A) and will call it the monotone class generated by
A. 
Lemma 8.1.1 Let X be a non-empty set and let M be a monotone
class of subsets of X. Let P ⊂ X. Define
U (P ) = {Q ⊂ X |P ∪ Q, P \Q and Q\P are all in M}.
Then U (P ) is a monotone class.

Proof: Let {Q_i}_{i=1}^∞ be an increasing sequence of sets in U(P).
Then {P ∪ Q_i}_{i=1}^∞ and {Q_i\P}_{i=1}^∞ are increasing sequences of
sets in M. Consequently,

\[ P \cup \Big(\bigcup_{i=1}^{\infty} Q_i\Big) = \bigcup_{i=1}^{\infty} (P \cup Q_i) \in \mathcal{M}, \]

and

\[ \Big(\bigcup_{i=1}^{\infty} Q_i\Big) \setminus P = \bigcup_{i=1}^{\infty} (Q_i \setminus P) \in \mathcal{M}. \]

Finally, {P\Q_i}_{i=1}^∞ is a decreasing sequence of sets in M. Hence

\[ P \setminus \Big(\bigcup_{i=1}^{\infty} Q_i\Big) = \bigcap_{i=1}^{\infty} (P \setminus Q_i) \in \mathcal{M}. \]

Thus, we see that ∪_{i=1}^∞ Q_i ∈ U(P). In the same way, it is easy to
see that if {Q_i}_{i=1}^∞ is a decreasing sequence of sets in U(P), then
∩_{i=1}^∞ Q_i ∈ U(P) as well. This completes the proof. □
Lemma 8.1.2 Let X be a non-empty set and let R be an algebra of
subsets of X. Let M(R) denote the monotone class generated by R.
Then M(R) = S(R), the σ-algebra generated by R.
Proof: Let P ∈ R. If Q ∈ R, then, P ∪ Q, P \Q and Q\P belong to
R and hence to M(R) as well. Thus, if U (P ) is as defined in the pre-
ceding lemma, we have that Q ∈ U (P ). Since U (P ) is a monotone class
containing R, it follows that U (P ) ⊃ M(R).

Now, let Q ∈ M(R). Then, we have just seen that Q ∈ U(P).
By symmetry of the definition of U(P), we immediately deduce that
P ∈ U(Q). Since P ∈ R was arbitrary, U(Q) is a monotone class
containing R, and so U(Q) ⊃ M(R). Thus, for all P and Q in M(R), we
have that P ∪ Q and P\Q belong to M(R). Thus, we deduce that M(R) is
an algebra as well.

Now let {E_i}_{i=1}^∞ be a countable collection of members of M(R).
Then F_n = ∪_{i=1}^n E_i ∈ M(R), since the latter collection is an
algebra. Since {F_n}_{n=1}^∞ is an increasing sequence of sets in M(R),
which is a monotone class, we have

\[ \bigcup_{i=1}^{\infty} E_i = \bigcup_{n=1}^{\infty} F_n \in \mathcal{M}(\mathcal{R}). \]

Thus M(R) is a σ-algebra containing R, from which we deduce that
M(R) ⊃ S(R). Since every σ-algebra is a monotone class, we also have
that M(R) ⊂ S(R). This completes the proof. □
Proposition 8.1.2 Let (X, S) and (Y, T ) be measurable spaces. Then
S × T is the smallest monotone class containing all elementary sets.

Proof: Let us denote by E, the class of elementary sets. Let Ai ∈ S


and Bi ∈ T for i = 1, 2. Then (check!)

(A1 × B1 ) ∩ (A2 × B2 ) = (A1 ∩ A2 ) × (B1 ∩ B2 ),


(A1 × B1 )\(A2 × B2 ) = ((A1 \A2 ) × B1 ) ∪ ((A1 ∩ A2 ) × (B1 \B2 )).

Thus, the intersection of two measurable rectangles is a measurable rect-


angle and their difference is the disjoint union of two measurable rect-
angles. It then follows that if P and Q are in E, we have that P ∪ Q and
P \Q are also in E. Thus E is an algebra. The result now follows from
Lemma 8.1.2. 
Definition 8.1.4 Let (X, S) and (Y, T ) be measurable spaces. Let f :
X × Y → R be a given function. Let x ∈ X and y ∈ Y . The x-section,
fx , and the y-section, f y , are defined by

fx (y) = f (x, y) = f y (x). 

We are now dealing with three σ-algebras, viz. the σ-algebra S on


X, the σ-algebra T on Y , and the σ-algebra S × T on X × Y . To avoid
confusion, we will say a set (respectively, function) is S-measurable, T -
measurable or S × T measurable according to the context.
Proposition 8.1.3 Let (X, S) and (Y, T ) be measurable spaces. Let f
be a S ×T -measurable function defined on X ×Y . Then, for every x ∈ X
and for every y ∈ Y , the section fx is T -measurable and the section f y
is S-measurable.
Proof: Let c ∈ R. Then

Q = {(x, y) | f (x, y) > c}

is S × T measurable. Then, for x ∈ X,

{y ∈ Y | fx (y) > c} = Qx

is T -measurable by Proposition 8.1.1. Thus, fx is T -measurable. Simi-


larly, f y is S-measurable. 

Example 8.1.1 Let (X, S) and (Y, T ) be measurable spaces. Let f :


X → R be a S-measurable function. Define F : X ×Y → R by F (x, y) =
f (x) for every (x, y) ∈ X × Y . Then, if c ∈ R, we have

{(x, y) ∈ X × Y | F (x, y) > c} = {x ∈ X | f (x) > c} × Y,



which is S × T -measurable. Thus, F is S × T -measurable. Similarly, if


g : Y → R is T -measurable, we also have that the function (x, y) 7→ g(y)
is S × T -measurable. Since the product of measurable functions is
measurable, we have that the function ϕ : X × Y → R defined by
ϕ(x, y) = f (x)g(y) is also S × T -measurable. 

Example 8.1.2 Let R be equipped with the Borel or the Lebesgue σ-


algebra. Let (X, S) be a measurable space. Let f : R × X → R be a
function such that

• for every x ∈ X, the function t 7→ f (t, x) is continuous on R;


• for every t ∈ R, the function x 7→ f (t, x) is S-measurable.
(Such a function is called a Carathéodory function.) Then, f is
measurable on the product space R × X.
To see this, first observe that for any fixed n ∈ N,

\[ \mathbb{R} = \bigcup_{k \in \mathbb{Z}} \Big( \frac{k-1}{2^n}, \frac{k}{2^n} \Big]. \]

Define f_n(t, x) = f(k/2^n, x) if t ∈ ((k−1)/2^n, k/2^n]. Thus, we can
write

\[ f_n(t, x) = \sum_{k \in \mathbb{Z}} f\Big(\frac{k}{2^n}, x\Big)\, \chi_{(\frac{k-1}{2^n}, \frac{k}{2^n}]}(t). \]

By the previous example, and by hypothesis, we see that each term in
the above series is a measurable function on R × X, from which we
deduce that f_n is a measurable function on R × X.

Now, let (t_0, x_0) ∈ R × X. Let ε > 0 be given. Then, there exists
δ > 0 such that, if |t − t_0| < δ, we have, by hypothesis, that
|f(t, x_0) − f(t_0, x_0)| < ε. If n is large enough so that
1/2^n < δ, it then follows that |f_n(t_0, x_0) − f(t_0, x_0)| < ε, by
construction of the function f_n. Thus, f_n → f pointwise and so f is
measurable on R × X. □
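
The dyadic approximation f_n is easy to realise in code. The sketch
below is illustrative only: the sample function f(t, x) = sin(t) x and
the helper `dyadic_approximation` are assumptions made here. It uses the
fact that the index k with (k−1)/2^n < t ≤ k/2^n is the ceiling of
2^n t.

```python
import math

def dyadic_approximation(f, n):
    """Return f_n with f_n(t, x) = f(k / 2**n, x) for t in ((k-1)/2**n, k/2**n]."""
    scale = 2**n
    def f_n(t, x):
        k = math.ceil(t * scale)  # unique integer k with (k-1)/2^n < t <= k/2^n
        return f(k / scale, x)
    return f_n

def f(t, x):
    """Illustrative function, continuous in t for each fixed x."""
    return math.sin(t) * x

t0, x0 = 0.3, 2.0
for n in (2, 6, 10, 20):
    print(n, abs(dyadic_approximation(f, n)(t0, x0) - f(t0, x0)))
```

The printed errors decrease as n grows, reflecting the pointwise
convergence of f_n to f.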

8.2 The product measure

Theorem 8.2.1 Let (X, S, µ) and (Y, T, λ) be σ-finite measure spaces.
Let Q ∈ S × T. Define, for x ∈ X and y ∈ Y,

\[ \varphi(x) = \lambda(Q_x) \quad \text{and} \quad \psi(y) = \mu(Q^y). \]

Then ϕ is S-measurable and ψ is T-measurable. Further

\[ \int_X \varphi\,d\mu = \int_Y \psi\,d\lambda. \tag{8.2.1} \]

Proof: Let U be the collection of all sets in S ×T such that (8.2.1) holds.

Step 1. Let Q = A × B, where A ∈ S and B ∈ T . Then (cf. the


proof of Proposition 8.1.1), ϕ = λ(B)χA and ψ = µ(A)χB . Thus ϕ is
S-measurable and ψ is T -measurable. We also have
Z Z
ϕ dµ = µ(A)λ(B) = ψ dλ.
X Y

Thus, every measurable rectangle is in U .

Step 2. Let {Q_i}_{i=1}^∞ be an increasing sequence of sets in U. Set
Q = ∪_{i=1}^∞ Q_i. Let ϕ_i(x) = λ((Q_i)_x) and let ψ_i(y) = µ((Q_i)^y)
for x ∈ X and y ∈ Y. Since {(Q_i)_x}_{i=1}^∞ is an increasing sequence
of sets whose union is Q_x and, similarly, since {(Q_i)^y}_{i=1}^∞ is an
increasing sequence of sets whose union is Q^y, we have that
{ϕ_i}_{i=1}^∞ and {ψ_i}_{i=1}^∞ are increasing sequences of non-negative
functions such that (cf. Proposition 1.2.4)

\[ \lim_{i \to \infty} \varphi_i(x) = \lambda(Q_x) \overset{\text{def}}{=} \varphi(x), \quad \text{and} \quad \lim_{i \to \infty} \psi_i(y) = \mu(Q^y) \overset{\text{def}}{=} \psi(y). \]

Since ∫_X ϕ_i dµ = ∫_Y ψ_i dλ, for each i, it follows from the monotone
convergence theorem that (8.2.1) holds. Thus Q ∈ U.

Step 3. It is very easy to see that if {Q_i}_{i=1}^n is a finite
collection of disjoint sets in U, then ∪_{i=1}^n Q_i ∈ U as well. Now,
given any countable collection of disjoint sets {Q_i}_{i=1}^∞ in U, we
set R_n = ∪_{i=1}^n Q_i so that R_n ∈ U for each n ∈ N. Since
{R_n}_{n=1}^∞ is an increasing sequence, it follows, from Step 2, that

\[ \bigcup_{i=1}^{\infty} Q_i = \bigcup_{n=1}^{\infty} R_n \in \mathcal{U}. \]

Step 4. Let A ∈ S and B ∈ T be such that µ(A) < +∞ and λ(B) < +∞.
Let {Q_i}_{i=1}^∞ be a sequence of sets in U such that

\[ A \times B \supset Q_1 \supset Q_2 \supset \cdots \supset Q_i \supset Q_{i+1} \supset \cdots. \]

Then in the same manner as in Step 2 (using the dominated convergence
theorem instead of the monotone convergence theorem), we can show that
∩_{i=1}^∞ Q_i ∈ U.

Step 5. Since both the measure spaces are σ-finite, we can write

\[ X = \bigcup_{n=1}^{\infty} X_n \quad \text{and} \quad Y = \bigcup_{m=1}^{\infty} Y_m, \]

where {X_n}_{n=1}^∞ and {Y_m}_{m=1}^∞ are sequences of disjoint sets
such that for all n and m we have µ(X_n) < +∞ and λ(Y_m) < +∞. Let
Q ∈ S × T. Set Q_nm = Q ∩ (X_n × Y_m). Let M be the collection of all
sets Q in S × T such that Q_nm ∈ U for all n and m. By Steps 2 and 4,
it follows that M is a monotone class. By Steps 1 and 3, it follows
that all elementary sets are in M. Thus, M is a monotone class
containing all elementary sets and is contained in S × T. It now
follows, from Proposition 8.1.2, that M = S × T.

Step 6. If Q ∈ S × T, then by Step 5, Q_nm ∈ U for all n and m. Then,
since the Q_nm are all disjoint and since

\[ Q = \bigcup_{n=1}^{\infty} \bigcup_{m=1}^{\infty} Q_{nm}, \]

it follows, from Step 3, that Q ∈ U as well. Thus U = S × T. This
completes the proof. □

We can use the preceding theorem to define the product measure on
S × T.
Definition 8.2.1 Let (X, S, µ) and (Y, T, λ) be σ-finite measure spaces.
The product measure, denoted µ × λ, is defined for Q ∈ S × T by

\[ (\mu \times \lambda)(Q) = \int_X \lambda(Q_x)\,d\mu(x) = \int_Y \mu(Q^y)\,d\lambda(y). \]
□
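
On finite sets with weighted counting measures the two integrals in the
definition become finite sums, which gives a small sanity check. The
sketch below uses hypothetical weights and a hypothetical helper
`product_measure`; it is an illustration, not part of the text.

```python
def product_measure(Q, mu, lam):
    """Compute (mu x lam)(Q) on finite sets via both iterated sums.

    mu and lam are dicts mapping the points of X and Y to non-negative
    weights; Q is a set of pairs (x, y)."""
    # Integrate lam(Q_x) with respect to mu.
    via_x = sum(mu[x] * sum(lam[y] for y in lam if (x, y) in Q) for x in mu)
    # Integrate mu(Q^y) with respect to lam.
    via_y = sum(lam[y] * sum(mu[x] for x in mu if (x, y) in Q) for y in lam)
    return via_x, via_y

mu = {0: 2, 1: 1, 2: 3}     # hypothetical weights on X = {0, 1, 2}
lam = {"a": 5, "b": 7}      # hypothetical weights on Y = {"a", "b"}
Q = {(0, "a"), (1, "a"), (1, "b"), (2, "b")}

via_x, via_y = product_measure(Q, mu, lam)
print(via_x, via_y)
```

Integer weights are used so that the two sums can be compared exactly.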
Example 8.2.1 Let X = Y = R be equipped with the Lebesgue measure.
The x-axis can be written as the disjoint union of measurable
rectangles:

\[ \{(x, 0) \mid x \in \mathbb{R}\} = \bigcup_{n \in \mathbb{Z}} [n, n + 1) \times \{0\}, \]

and so its measure, for the product measure m_1 × m_1, is zero. Let
E ⊂ [0, 1] ⊂ R be a non-measurable set, i.e. E ∉ L_1. Then (cf.
Proposition 8.1.1), the set E × {0} ∉ L_1 × L_1. However,

\[ E \times \{0\} \subset [0, 1] \times \{0\}, \]

and (cf. Example 2.1.1)

\[ m_2([0, 1] \times \{0\}) = 0 = (m_1 \times m_1)([0, 1] \times \{0\}). \]

Since m_2 is complete, it follows that E × {0} ∈ L_2. This shows that
even though m_1 is complete, it does not follow that m_1 × m_1 is
complete and also shows that L_1 × L_1 ≠ L_2. □

Remark 8.2.1 In view of the above example, we can ask ourselves what
is the relationship between the product of Lebesgue measures and the
Lebesgue measure of the product space. We sketch the argument below.
Let ℓ = k + n and let us consider R^ℓ as the product space R^k × R^n.
We have the Borel sets B_ℓ, B_k and B_n in R^ℓ, R^k and R^n
respectively. Similarly we have the Lebesgue measurable sets L_ℓ, L_k
and L_n as well.
Now, any open set in R^ℓ can be expressed as the countable disjoint
union of (half-open) boxes (cf. Lemma 2.2.1). Thus, all open sets are
in L_k × L_n and so we deduce that

\[ \mathcal{B}_\ell \subset \mathcal{L}_k \times \mathcal{L}_n. \]

If E ⊂ R^k is a Lebesgue measurable subset, then (cf. Proposition
2.2.2) it can be approximated from above by a G_δ set and from below
by an F_σ set. It follows from this that E × R^n is Lebesgue measurable
in R^ℓ. Similarly, if F ⊂ R^n is Lebesgue measurable, then R^k × F will
be Lebesgue measurable in R^ℓ. Then, their intersection E × F will be
Lebesgue measurable in R^ℓ. Thus, L_ℓ contains all measurable rectangles
and it follows from this that

\[ \mathcal{B}_\ell \subset \mathcal{L}_k \times \mathcal{L}_n \subset \mathcal{L}_\ell. \]

We know that m_ℓ and m_k × m_n both agree on all boxes. Both these
measures are also easily seen to be translation invariant, outer-regular
(cf. Remark 2.2.1) and finite on all compact sets. Thus (cf. Theorem
2.3.2), it will follow that these measures agree on B_ℓ.
Now, if Q is L_k × L_n-measurable, it is also L_ℓ-measurable and so
there exist P_i ∈ B_ℓ, i = 1, 2, such that P_1 ⊂ Q ⊂ P_2 and such that
m_ℓ(P_2\P_1) = 0 (cf. Proposition 2.2.2). Thus,

\[ (m_k \times m_n)(Q \setminus P_1) \le (m_k \times m_n)(P_2 \setminus P_1) = m_\ell(P_2 \setminus P_1) = 0. \]

Consequently,

\[ (m_k \times m_n)(Q) = (m_k \times m_n)(P_1) = m_\ell(P_1) = m_\ell(Q). \]

Thus, m_k × m_n and m_ℓ agree on L_k × L_n as well. Since the Lebesgue
measure is the completion of the same measure on Borel sets, it follows
that m_ℓ is the completion of m_k × m_n as well. □

8.3 Fubini’s theorem

Theorem 8.3.1 (Fubini's theorem) Let (X, S, µ) and (Y, T, λ) be two
σ-finite measure spaces. Let f be an extended real-valued function
defined on X × Y which is S × T-measurable.
(a) Let f be non-negative. Define, for x ∈ X and y ∈ Y,

\[ \varphi(x) = \int_Y f_x\,d\lambda \quad \text{and} \quad \psi(y) = \int_X f^y\,d\mu. \tag{8.3.1} \]

Then ϕ is S-measurable, ψ is T-measurable and

\[ \int_X \varphi\,d\mu = \int_{X \times Y} f\,d(\mu \times \lambda) = \int_Y \psi\,d\lambda. \tag{8.3.2} \]

(b) Assume that ϕ* is integrable over X, with respect to the measure µ,
where, for x ∈ X,

\[ \varphi^*(x) = \int_Y |f|_x\,d\lambda. \]

Then f is integrable over X × Y, with respect to the measure µ × λ.
(c) Let f be integrable over X × Y , with respect to the measure µ × λ.
Then, for almost every x ∈ X, the function fx is integrable over Y , with
respect to the measure λ, and, for almost every y ∈ Y , the function f y is
integrable over X, with respect to the measure µ. Further, the functions
ϕ and ψ defined by (8.3.1) above are integrable over X, with respect to
the measure µ, and over Y , with respect to the measure λ, respectively,
and (8.3.2) holds.

Proof: (a) Since f_x and f^y are non-negative, the functions ϕ and ψ
are defined. By definition of the product measure, (8.3.2) is exactly
the conclusion of Theorem 8.2.1, when f = χ_Q, where Q ∈ S × T. By
linearity, the result holds for all non-negative simple functions. Let
f be a non-negative S × T-measurable function. Let {f_n}_{n=1}^∞ be a
sequence of non-negative simple functions increasing to f. Let
ϕ_n(x) = ∫_Y (f_n)_x dλ. Then, by the monotone convergence theorem,
ϕ_n ↑ ϕ. Further,

\[ \int_{X \times Y} f_n\,d(\mu \times \lambda) = \int_X \varphi_n\,d\mu. \]

Once again, we can pass to the limit, as n tends to infinity, to get

\[ \int_{X \times Y} f\,d(\mu \times \lambda) = \int_X \varphi\,d\mu. \]

This proves one part of (8.3.2). The proof of the other part is similar.

(b) We apply the result of (a) to the function |f|. Thus, by hypothesis
and by (8.3.2), we get

\[ \int_{X \times Y} |f|\,d(\mu \times \lambda) = \int_X \varphi^*\,d\mu < +\infty. \]

This shows that f is integrable over X × Y, with respect to the measure
µ × λ.

(c) We write f = f^+ − f^−. Since f is integrable, we have that f^± are
integrable non-negative functions. Let

\[ \varphi_\pm(x) = \int_Y (f^\pm)_x\,d\lambda, \quad \text{and} \quad \psi_\pm(y) = \int_X (f^\pm)^y\,d\mu. \]

Then (8.3.2) holds for the triples (f^±, ϕ_±, ψ_±) replacing the triple
(f, ϕ, ψ). All the integrals are now finite and so subtracting the
relations for f^− from those of f^+, we deduce (8.3.2) for the function
f. This completes the proof. □

Remark 8.3.1 When f is non-negative (case (a) of Theorem 8.3.1), all


the integrals could be infinite. If even one of them is finite, all are finite
and will be equal. 

Remark 8.3.2 In the case (b) of the preceding theorem, of course, an
analogous statement involving $\psi^{*}(y) = \int_X |f|^y \, d\mu$ is valid. Thus, if one
of the iterated integrals
\[
\int_Y \int_X |f|^y \, d\mu \, d\lambda \quad \text{or} \quad \int_X \int_Y |f|_x \, d\lambda \, d\mu
\]
is finite, then we have that f is integrable over X × Y and that (8.3.2)
holds.

Remark 8.3.3 The relation (8.3.2) can also be written as


\[
\int_X \int_Y f(x, y) \, d\lambda(y) \, d\mu(x) = \int_{X \times Y} f \, d(\mu \times \lambda) = \int_Y \int_X f(x, y) \, d\mu(x) \, d\lambda(y).
\]
The first and the last term are referred to as iterated integrals.

Example 8.3.1 Let X = Y = N, and let µ = λ be the counting measure.


Let f (m, n) = amn be non-negative for all m and n. Then, by case (a)
of Theorem 8.3.1, we get that
\[
\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn} = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn}.
\]

This was proved earlier as a consequence of the monotone convergence


theorem (cf. Example 5.2.3). The same result is true without the non-
negativity condition if we assume the extra condition
\[
\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |a_{mn}| < +\infty,
\]

as an application of case (b) of Theorem 8.3.1. We proved this earlier,


using the dominated convergence theorem (cf. Example 5.3.3). 

Example 8.3.2 Once again, let X = Y = N and let µ = λ be the


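counting measure. Define
\[
f(m, n) = a_{m,n} =
\begin{cases}
1, & \text{if } n = 1, \\
-1, & \text{if } n = m + 1, \\
0, & \text{otherwise.}
\end{cases}
\]
Then
\[
\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{m,n} = \sum_{m=1}^{\infty} (a_{m,1} + a_{m,m+1}) = 0,
\]
while
\[
\sum_{m=1}^{\infty} a_{m,1} = +\infty
\]
and so
\[
\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{m,n} = +\infty.
\]
Note that in this case,
\[
\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |a_{m,n}| = +\infty.
\]
Thus, the integrability of f cannot be relaxed for a general function for
the validity of (8.3.2).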
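The two iterated sums of Example 8.3.2 can be explored with a small Python sketch. The function names (`a`, `row_sum`, `column_sum`, `first_column_partial_sum`) are illustrative choices, not notation from the text. Each row has only two non-zero entries, so its sum is computed exactly; the first column has infinitely many entries equal to 1, so only its partial sums can be computed, and they grow without bound.

```python
def a(m: int, n: int) -> int:
    """Entry a_{m,n} of Example 8.3.2 (indices start at 1)."""
    if n == 1:
        return 1
    if n == m + 1:
        return -1
    return 0


def row_sum(m: int) -> int:
    """Exact sum over n of a_{m,n}: row m vanishes beyond column m + 1."""
    return sum(a(m, n) for n in range(1, m + 2))


def column_sum(n: int) -> int:
    """Exact sum over m of a_{m,n} for n >= 2: only m = n - 1 contributes."""
    if n < 2:
        raise ValueError("column 1 has infinitely many non-zero entries")
    return sum(a(m, n) for m in range(1, n + 1))


def first_column_partial_sum(M: int) -> int:
    """Partial sum over m = 1..M of a_{m,1}; it equals M and is unbounded."""
    return sum(a(m, 1) for m in range(1, M + 1))


row_sums = [row_sum(m) for m in range(1, 11)]
first_column = [first_column_partial_sum(M) for M in (10, 100, 1000)]
```

Every row sums to zero, which gives the value 0 for the sum taken first over n and then over m, while the very first inner sum in the other order already diverges.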

Example 8.3.3 Let X = Y = [0, 1]. Let S = T = L1 . Let µ = m1 and


let λ be the counting measure. Thus λ is not a σ-finite measure. Let

D = {(x, x) | x ∈ [0, 1]} ⊂ X × Y.

Since D is a closed set, it is Borel measurable and so D ∈ L1 × L1 (cf.


Remark 8.2.1). Let f = χD . Then
\[
\int_Y \int_X f(x, y) \, d\mu(x) \, d\lambda(y) = 0, \qquad
\int_X \int_Y f(x, y) \, d\lambda(y) \, d\mu(x) = 1.
\]

Thus, σ-finiteness cannot be dispensed with. 

Example 8.3.4 (Integration by parts for absolutely continuous func-


tions) Let [a, b] ⊂ R and let f, g : [a, b] → R be absolutely continuous
functions. Then, f′ and g′ exist almost everywhere and are integrable.
Let us consider the integral of ϕ(x, y) = f′(x)g′(y) on the set

E = {(x, y) ∈ [a, b] × [a, b] | x ≤ y},

which is closed and hence is Borel measurable. Consequently, it also
belongs to $\mathcal{L}_1 \times \mathcal{L}_1$. By case (a) of Theorem 8.3.1, we have
\[
\begin{aligned}
\int_{[a,b] \times [a,b]} |\varphi| \, d(m_1 \times m_1)
  &= \int_{[a,b]} \left( \int_{[a,b]} |f'(x)| \, |g'(y)| \, dm_1(x) \right) dm_1(y) \\
  &= \int_{[a,b]} |f'| \, dm_1 \int_{[a,b]} |g'| \, dm_1 < +\infty.
\end{aligned}
\]

Thus ϕ is integrable on [a, b] × [a, b] with respect to the measure $m_1 \times m_1$
and so we can apply Fubini’s theorem. Consequently, we have
\[
\begin{aligned}
&\int_{[a,b]} \int_{[a,b]} f'(x) g'(y) \chi_E(x, y) \, dm_1(y) \, dm_1(x) \\
&\qquad = \int_{[a,b]} \int_{[a,b]} f'(x) g'(y) \chi_E(x, y) \, dm_1(x) \, dm_1(y).
\end{aligned}
\tag{8.3.3}
\]
The left-hand side of (8.3.3) is equal to
\[
\int_{[a,b]} \left( \int_{[x,b]} g'(y) \, dm_1(y) \right) f'(x) \, dm_1(x),
\]
which yields
\[
\int_{[a,b]} (g(b) - g(x)) f'(x) \, dm_1(x) = g(b)f(b) - g(b)f(a) - \int_{[a,b]} g f' \, dm_1.
\]

We have used here the fact that both f and g are absolutely continuous
and so the integral of the derivative is the difference of the values of the
function at the end points of the interval (cf. Theorem 6.4.1). Similarly,
the right-hand side of (8.3.3) is equal to
\[
\int_{[a,b]} \left( \int_{[a,y]} f'(x) \, dm_1(x) \right) g'(y) \, dm_1(y),
\]
which yields
\[
-g(b)f(a) + g(a)f(a) + \int_{[a,b]} f g' \, dm_1.
\]
Equating these two, we get
\[
\int_{[a,b]} f g' \, dm_1 = g(b)f(b) - g(a)f(a) - \int_{[a,b]} f' g \, dm_1,
\]

which is the formula for integration by parts. 
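As a numerical sanity check of the integration by parts formula, the following Python sketch compares both sides for one absolutely continuous function that is not differentiable everywhere. The choices f(x) = |x − 1/2| and g(x) = x² on [0, 1], the helper `midpoint_integral` and the step count are all assumptions made for illustration; the midpoint rule merely stands in for the Lebesgue integral.

```python
def midpoint_integral(h, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    width = (b - a) / steps
    return width * sum(h(a + (k + 0.5) * width) for k in range(steps))


a, b = 0.0, 1.0

def f(x):
    return abs(x - 0.5)          # absolutely continuous, kink at x = 1/2

def f_prime(x):
    return -1.0 if x < 0.5 else 1.0   # derivative, defined almost everywhere

def g(x):
    return x * x

def g_prime(x):
    return 2.0 * x


# Left-hand side: integral of f g'.
lhs = midpoint_integral(lambda x: f(x) * g_prime(x), a, b)
# Right-hand side: boundary terms minus the integral of f' g.
rhs = g(b) * f(b) - g(a) * f(a) - midpoint_integral(lambda x: f_prime(x) * g(x), a, b)
```

Both sides come out close to 1/4, which is the value obtained by integrating the two pieces by hand.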

Example 8.3.5 (Polar coordinates.) In Example 7.3.2, we described


the transformation which led to polar coordinates. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be
an integrable function which is Borel measurable. Since the Lebesgue
measure $m_2$ agrees with $m_1 \times m_1$ on Borel sets, we can apply Fubini’s
theorem. Then, we can write (7.3.14) in the form
\[
\int_{\mathbb{R}^2} f \, dm_2 = \int_{(0,2\pi)} \int_{(0,+\infty)} r f(r \cos\theta, r \sin\theta) \, dm_1(r) \, dm_1(\theta).
\]
If the integrand on the right-hand side is Riemann integrable, then we
may write
\[
\int_{\mathbb{R}^2} f \, dm_2 = \int_0^{2\pi} \int_0^{\infty} f(r \cos\theta, r \sin\theta) \, r \, dr \, d\theta.
\]

Example 8.3.6 Let $x \in \mathbb{R}^N$ and consider the function
\[
f(x) = e^{-|x|^2},
\]
which is continuous and hence is Borel measurable. We wish to show
that this function is integrable over $\mathbb{R}^N$ and evaluate that integral. Let
$I_N = \int_{\mathbb{R}^N} f \, dm_N$. By repeated use of Fubini’s theorem (case (a)), we
get
\[
I_N = \prod_{i=1}^{N} \int_{\mathbb{R}} e^{-|x_i|^2} \, dm_1(x_i) = I_1^N.
\]
Now, on one hand $I_2 = I_1^2$, by the above reasoning. On the other hand,
by the previous example,
\[
I_2 = \int_0^{\infty} \int_0^{2\pi} e^{-r^2} r \, d\theta \, dr = 2\pi \int_0^{\infty} e^{-r^2} r \, dr.
\]
The last integral is easily evaluated (it equals 1/2) to give $I_1^2 = I_2 = \pi$. Thus
\[
I_1 = \sqrt{\pi} \quad \text{and} \quad I_N = \pi^{N/2}, \quad N \ge 2.
\]
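The values obtained in Example 8.3.6 can be checked numerically. The sketch below approximates the one-dimensional integral and the polar form of the two-dimensional integral with a midpoint rule; the truncation of the domain to [−10, 10] and [0, 10], the helper `midpoint_integral` and the step count are assumptions made for illustration.

```python
import math


def midpoint_integral(h, a, b, steps=200000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    width = (b - a) / steps
    return width * sum(h(a + (k + 0.5) * width) for k in range(steps))


# I_1: integral of exp(-x^2) over R, truncated to [-10, 10].
I1 = midpoint_integral(lambda x: math.exp(-x * x), -10.0, 10.0)

# I_2 in polar form: 2*pi times the integral of exp(-r^2) r over [0, 10].
I2_polar = 2.0 * math.pi * midpoint_integral(lambda r: math.exp(-r * r) * r, 0.0, 10.0)

# I_N = I_1^N, compared with pi^(N/2) for a few dimensions.
powers = {N: (I1 ** N, math.pi ** (N / 2)) for N in (2, 3, 4)}
```

The truncation error is negligible here because the integrand is smaller than exp(−100) outside the chosen intervals.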

Example 8.3.7 Let (X, S, µ) be a measure space and let f : X → R


be an integrable function. The distribution function of f is a function
F : [0, +∞) → [0, +∞], defined by

F (t) = µ(E(t)),

where E(t) = {x ∈ X | |f(x)| > t}. We have
\[
\int_{[0,+\infty)} F \, dm_1 = \int_{[0,+\infty)} \int_X \chi_{E(t)}(x) \, d\mu(x) \, dm_1(t).
\]
Since we are dealing with a non-negative integrand, we can change the
order of integration to get
\[
\int_{[0,+\infty)} F \, dm_1 = \int_X \int_{[0,|f(x)|]} dm_1(t) \, d\mu(x) = \int_X |f(x)| \, d\mu(x).
\]
Thus,
\[
\int_X |f| \, d\mu = \int_{[0,+\infty)} F \, dm_1.
\]

Two functions defined on X are said to be equimeasurable, or rearrangements
of each other, if they have the same distribution function.
Thus, the integrals of the absolute values of functions which are
rearrangements of each other are equal.
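The identity between the integral of |f| and the integral of its distribution function can be verified exactly on a finite measure space. In the sketch below, a finite set carries point masses `weights` and a function with values `values`; both lists are hypothetical sample data, and the function names are illustrative. Since F is then a step function, its integral is a finite sum.

```python
from fractions import Fraction


def distribution_integral(values, weights):
    """Integrate F(t) = mu({|f| > t}) over [0, +inf) exactly.

    F is a step function that can only jump at the values taken by |f|.
    """
    levels = sorted({abs(v) for v in values})
    total = Fraction(0)
    previous = Fraction(0)
    for level in levels:
        # For t in [previous, level), {|f| > t} is the set where |f| >= level.
        measure = sum(w for v, w in zip(values, weights) if abs(v) >= level)
        total += (level - previous) * measure
        previous = level
    return total


def integral_of_abs(values, weights):
    """Integral of |f| with respect to the weighted counting measure."""
    return sum(abs(v) * w for v, w in zip(values, weights))


# Hypothetical data: f takes these values on four points with these masses.
values = [Fraction(-3), Fraction(1, 2), Fraction(2), Fraction(0)]
weights = [Fraction(1), Fraction(4), Fraction(1, 3), Fraction(5)]

layer_cake = distribution_integral(values, weights)
direct = integral_of_abs(values, weights)
```

Permuting `values` against equal weights produces a rearrangement in the sense just defined, and leaves both quantities unchanged.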

Example 8.3.8 (Convolutions) Let f and g be Borel measurable real-


valued functions defined on RN . Consider the mappings ϕ and ψ defined
on RN × RN taking values in RN defined by

ϕ(x, y) = x − y, ψ(x, y) = y.

These are continuous and hence Borel measurable. Thus (cf. Proposition
3.1.4), we have that the mappings

(x, y) 7→ f (x − y) and (x, y) 7→ g(y)

are Borel measurable and so their product (x, y) 7→ f (x − y)g(y) is also


Borel measurable. We would like to know if the integral
\[
h(x) \overset{\text{def}}{=} \int_{\mathbb{R}^N} f(x - y) g(y) \, dm_N(y)
\]

exists and is finite.

Assume that f and g are integrable as well. Since the Lebesgue mea-
sure agrees with the product measure on Borel measurable sets, we can
apply Fubini’s theorem. We have
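\[
\begin{aligned}
\int_{\mathbb{R}^N} \left| \int_{\mathbb{R}^N} f(x - y) g(y) \, dm_N(y) \right| dm_N(x)
  &\le \int_{\mathbb{R}^N} \int_{\mathbb{R}^N} |f(x - y)| \, |g(y)| \, dm_N(y) \, dm_N(x) \\
  &= \int_{\mathbb{R}^N} |g(y)| \left( \int_{\mathbb{R}^N} |f(x - y)| \, dm_N(x) \right) dm_N(y).
\end{aligned}
\]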

Since the Lebesgue measure is translation invariant, we have that


\[
\int_{\mathbb{R}^N} |f(x - y)| \, dm_N(x) = \int_{\mathbb{R}^N} |f| \, dm_N
\]

for every fixed y. Consequently, we get


\[
\int_{\mathbb{R}^N} \left| \int_{\mathbb{R}^N} f(x - y) g(y) \, dm_N(y) \right| dm_N(x)
\le \int_{\mathbb{R}^N} |f| \, dm_N \int_{\mathbb{R}^N} |g| \, dm_N.
\]

Since, by hypothesis, the right-hand side is finite, it follows from Fubini’s
theorem that the function h is defined for almost every x and is
integrable and, in fact, we also have that
\[
\int_{\mathbb{R}^N} |h| \, dm_N \le \int_{\mathbb{R}^N} |f| \, dm_N \int_{\mathbb{R}^N} |g| \, dm_N. \tag{8.3.4}
\]

Now, if f and g are Lebesgue measurable functions, then we can


find Borel measurable functions f0 and g0 such that f = f0 and g = g0
almost everywhere (cf. Exercise 3.7). Since the integrals of functions
which are equal almost everywhere are the same, it follows that if f and
g are integrable functions (with respect to the Lebesgue measure) on
RN , then h is well-defined almost everywhere. The function h is called
the convolution of f and g and is denoted by the symbol f ∗ g. 
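A discrete analogue may help to fix ideas. The sketch below replaces the Lebesgue measure by the counting measure on the integers and convolves two finitely supported sequences; this swap, the sample sequences and the function names are assumptions made for illustration, not part of the example above. The inequality corresponding to (8.3.4) and the identity for the total sum can then be checked directly.

```python
def convolve(f, g):
    """Discrete convolution h(k) = sum_j f(k - j) g(j) of finite sequences."""
    h = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[i + j] += fi * gj
    return h


def l1_norm(seq):
    """Sum of absolute values, i.e. the integral of |seq| for counting measure."""
    return sum(abs(v) for v in seq)


# Hypothetical finitely supported sequences.
f = [1, -2, 3]
g = [4, 0, -1, 2]

h = convolve(f, g)
```

Here `h` equals `[4, -8, 11, 4, -7, 6]`, with norm 40, while the product of the norms of `f` and `g` is 42. Swapping the arguments gives the same sequence, in line with Exercise 8.4 (a).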

8.4 Polar coordinates in RN

We saw that the transformation

x = r cos θ, y = r sin θ,

in the plane $\mathbb{R}^2$ allowed us to write the integral of a non-negative
function, or an integrable function, in the form (cf. Example 7.3.2 and
Example 8.3.5)
\[
\int_{\mathbb{R}^2} f \, dm_2 = \int_0^{2\pi} \int_0^{\infty} f(r \cos\theta, r \sin\theta) \, r \, dr \, d\theta.
\]

In the same way the transformation in $\mathbb{R}^3$ defined by the spherical polar
coordinate system

x = r sin θ cos ϕ, y = r sin θ sin ϕ, z = r cos θ,

will convert $\int_{\mathbb{R}^3} f \, dm_3$ into the multiple integral
\[
\int_0^{2\pi} \int_0^{\pi} \int_0^{\infty} f(r \sin\theta \cos\varphi, r \sin\theta \sin\varphi, r \cos\theta) \, r^2 \sin\theta \, dr \, d\theta \, d\varphi,
\]
when f is a Borel measurable function defined on $\mathbb{R}^3$, which is
non-negative or which is integrable.

When N > 3, it is difficult to write down explicitly the ‘polar
coordinates’ and the computation of the Jacobian will surely be horrendous.
We will describe below the transformation of the integral when
$f : \mathbb{R}^N \to \mathbb{R}$ is a radial function.
Definition 8.4.1 We say that a function $f : \mathbb{R}^N \to \mathbb{R}$ is radial, if there
exists a function $\tilde{f} : \mathbb{R} \to \mathbb{R}$ such that, for $x \in \mathbb{R}^N$, we have
\[
f(x) = \tilde{f}(|x|).
\]

Let B be the unit ball in RN . Let us set

ωN = mN (B).

If R > 0, then the linear map T (x) = Rx maps the open unit ball
diffeomorphically onto B(0; R), the open ball of radius R, and so we
have (cf. Theorem 2.3.)

\[
m_N(B(0; R)) = \omega_N R^N.
\]
By translation invariance, any ball in $\mathbb{R}^N$ with radius R will have
measure $\omega_N R^N$.

Let us denote the closed ball centred at the origin and of radius
R by $\overline{B}(0; R)$. Let us assume that we have a continuous function
$f : \overline{B}(0; R) \to \mathbb{R}$ which is radial. Thus, $f(x) = \tilde{f}(|x|)$, where
$\tilde{f} : [0, R] \to \mathbb{R}$ is continuous. Consider a partition of the interval [0, R]:
\[
\mathcal{P} = \{0 = r_0 < r_1 < \cdots < r_n = R\}.
\]
For 1 ≤ i ≤ n, let us set
\[
A_i = \{x \in \mathbb{R}^N \mid r_{i-1} \le |x| < r_i\},
\]
so that $B(0; R) = \cup_{i=1}^{n} A_i$.

Now, by the mean value theorem, there exists $\xi_i \in (r_{i-1}, r_i)$ such
that
\[
r_i^N - r_{i-1}^N = N \xi_i^{N-1} (r_i - r_{i-1}), \tag{8.4.1}
\]
for each 1 ≤ i ≤ n. Let us choose $y_i \in A_i$ such that $|y_i| = \xi_i$, 1 ≤ i ≤ n.
Now define the function
\[
f_{\mathcal{P}} = \sum_{i=1}^{n} f(y_i) \chi_{A_i}.
\]
Let
\[
\Delta(\mathcal{P}) = \max_{1 \le i \le n} (r_i - r_{i-1}).
\]
Now, if $x \in A_i$, we have for any 1 ≤ i ≤ n,
\[
|f(x) - f_{\mathcal{P}}(x)| = |\tilde{f}(|x|) - \tilde{f}(|y_i|)| = |\tilde{f}(|x|) - \tilde{f}(\xi_i)|.
\]



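Since $\tilde{f}$ is uniformly continuous, given ε > 0, we can find δ > 0 such
that, if $\Delta(\mathcal{P}) < \delta$, we have
\[
|f(x) - f_{\mathcal{P}}(x)| < \varepsilon,
\]
for every x ∈ B(0; R). Thus, as $\Delta(\mathcal{P}) \to 0$, we have that $f_{\mathcal{P}} \to f$
uniformly on B(0; R). Consequently (cf. Exercise 5.2),
\[
\lim_{\Delta(\mathcal{P}) \to 0} \int_{B(0;R)} f_{\mathcal{P}} \, dm_N = \int_{B(0;R)} f \, dm_N.
\]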

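On the other hand, we have
\[
\begin{aligned}
\int_{B(0;R)} f_{\mathcal{P}} \, dm_N
  &= \sum_{i=1}^{n} \tilde{f}(\xi_i) \, m_N(A_i) \\
  &= \sum_{i=1}^{n} \tilde{f}(\xi_i) \, \omega_N (r_i^N - r_{i-1}^N) \\
  &= \sum_{i=1}^{n} \tilde{f}(\xi_i) \, N \omega_N \xi_i^{N-1} (r_i - r_{i-1}),
\end{aligned}
\]
in view of (8.4.1). Since f is continuous, we have that
\[
\lim_{\Delta(\mathcal{P}) \to 0} \sum_{i=1}^{n} \tilde{f}(\xi_i) \, N \omega_N \xi_i^{N-1} (r_i - r_{i-1}) = N \omega_N \int_0^R \tilde{f}(r) r^{N-1} \, dr.
\]
Thus we have
\[
\int_{B(0;R)} f \, dm_N = N \omega_N \int_0^R \tilde{f}(r) r^{N-1} \, dr. \tag{8.4.2}
\]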
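The construction leading to (8.4.2) can be imitated numerically. The Python sketch below evaluates the integral of the step function over B(0; R) for a uniform partition, choosing each ξ_i from (8.4.1), and compares the result with a closed form in one test case. The uniform partition, the function name `shell_sum`, its parameters and the test function exp(−r²) are assumptions made for illustration; the sketch requires N ≥ 2.

```python
import math


def shell_sum(f_tilde, N, omega_N, R, n):
    """Integral of the step function f_P over B(0; R).

    Uses the uniform partition r_i = i R / n and picks xi_i so that
    r_i^N - r_{i-1}^N = N xi_i^(N-1) (r_i - r_{i-1}), as in (8.4.1).
    """
    total = 0.0
    for i in range(1, n + 1):
        r_prev, r_cur = R * (i - 1) / n, R * i / n
        power_gap = r_cur ** N - r_prev ** N
        shell_volume = omega_N * power_gap                 # m_N(A_i)
        xi = (power_gap / (N * (r_cur - r_prev))) ** (1.0 / (N - 1))
        total += f_tilde(xi) * shell_volume
    return total


# N = 2, f(x) = exp(-|x|^2): the exact integral over B(0; R) is pi (1 - exp(-R^2)).
approx = shell_sum(lambda r: math.exp(-r * r), N=2, omega_N=math.pi, R=2.0, n=2000)
exact = math.pi * (1.0 - math.exp(-4.0))

# For the constant function 1 the sum telescopes to omega_N R^N.
ball_volume = shell_sum(lambda r: 1.0, N=3, omega_N=4.0 * math.pi / 3.0, R=1.5, n=50)
```

For N = 2 the point ξ_i is simply the midpoint of the i-th interval, so the sum reduces to a midpoint rule for the one-dimensional integral on the right-hand side of (8.4.2).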

If f is a continuous non-negative, or an integrable, radial function defined
on $\mathbb{R}^N$, we then have
\[
\int_{\mathbb{R}^N} f \, dm_N = N \omega_N \int_0^{\infty} \tilde{f}(r) r^{N-1} \, dr. \tag{8.4.3}
\]

The formula (8.4.3) is a particular case of a more general result known


in the literature as the coarea formula.
Theorem 8.4.1 There exists a unique Borel measure $\sigma_{N-1}$ on the unit
sphere $S^{N-1}$ in $\mathbb{R}^N$ such that, if $f : \mathbb{R}^N \to \mathbb{R}$ is a Borel measurable
function which is either non-negative or integrable over $\mathbb{R}^N$ (with respect
to the Lebesgue measure), then
\[
\int_{\mathbb{R}^N} f \, dm_N = \int_{[0,\infty)} \int_{S^{N-1}} f(r x') \, r^{N-1} \, d\sigma_{N-1}(x') \, dm_1(r),
\]
where r = |x| and $x' = x/r \in S^{N-1}$.

The interested reader is referred to the books of Evans and Gariepy [2] or
Folland [3]. Essentially, the coarea formula says that when we integrate
over $\mathbb{R}^N$, we first integrate over the surface of the sphere of radius r, centred
at the origin, and then integrate over r. If R > 0, we have
\[
\int_{B(0;R)} f \, dm_N = \int_{[0,R)} \int_{S^{N-1}} f(r x') \, r^{N-1} \, d\sigma_{N-1}(x') \, dm_1(r).
\]
Setting R = 1 and f ≡ 1, we get
\[
\omega_N = \sigma_{N-1}(S^{N-1}) \int_0^1 r^{N-1} \, dr,
\]
which yields
\[
\sigma_{N-1}(S^{N-1}) = N \omega_N.
\]
The quantity $\sigma_{N-1}(S^{N-1})$ is the natural ‘N − 1 dimensional surface
measure’ of the unit sphere. Indeed, if N = 2, we have that the area of
the unit circle is $\omega_2 = \pi$ while its perimeter is $\sigma_1(S^1) = 2\pi = 2\omega_2$. If
N = 3, the volume of the unit sphere is $\omega_3 = \frac{4}{3}\pi$ and its surface area is
$\sigma_2(S^2) = 4\pi = 3\omega_3$.

Remark 8.4.1 There is a rich theory of measures defined on surfaces,


or more generally, lower dimensional manifolds, in $\mathbb{R}^N$. In fact there are
several methods to do it, depending on how we wish to handle singular-
ities in the geometry of these sets. The main theory is that of Hausdorff
measures. See Evans and Gariepy [2], or Folland [3], for a treatment of
these notions. 

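Example 8.4.1 (Volume of the unit ball) We now compute the value of
$\omega_N = m_N(B(0; 1))$. We start with the function $f(x) = e^{-|x|^2}$. We saw
earlier (cf. Example 8.3.6) that
\[
\int_{\mathbb{R}^N} e^{-|x|^2} \, dm_N(x) = \pi^{N/2}.
\]
Since f is a radial function, we can also compute it using polar
coordinates. By (8.4.3), we get
\[
\int_{\mathbb{R}^N} e^{-|x|^2} \, dm_N(x) = N \omega_N \int_0^{\infty} e^{-r^2} r^{N-1} \, dr.
\]
Setting $s = r^2$, we get
\[
\int_{\mathbb{R}^N} e^{-|x|^2} \, dm_N(x) = \frac{N}{2} \omega_N \int_0^{\infty} e^{-s} s^{\frac{N}{2} - 1} \, ds = \frac{N}{2} \omega_N \Gamma\!\left(\frac{N}{2}\right),
\]
where Γ(s) is the familiar gamma function. Thus, equating the two
expressions we got for the integral, we obtain
\[
\omega_N = \frac{\pi^{N/2}}{\frac{N}{2} \Gamma\!\left(\frac{N}{2}\right)} = \frac{\pi^{N/2}}{\Gamma\!\left(\frac{N}{2} + 1\right)},
\]
since sΓ(s) = Γ(s + 1).

Using the last mentioned property of the gamma function and the
fact that $\Gamma\!\left(\frac{1}{2}\right) = \sqrt{\pi}$, we can easily verify that
\[
\omega_2 = \pi \quad \text{and} \quad \omega_3 = \frac{4}{3}\pi.
\]
We can also see that
\[
\omega_4 = \frac{1}{2}\pi^2 \quad \text{and that} \quad \omega_5 = \frac{8}{15}\pi^2,
\]
and so on.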
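The closed form for ω_N is easy to evaluate with the gamma function from the Python standard library. The function names `unit_ball_volume` and `unit_sphere_area` below are illustrative; the second one uses the relation between the surface measure of the unit sphere and N ω_N obtained earlier in this section.

```python
import math


def unit_ball_volume(N: int) -> float:
    """omega_N = pi^(N/2) / Gamma(N/2 + 1)."""
    return math.pi ** (N / 2) / math.gamma(N / 2 + 1)


def unit_sphere_area(N: int) -> float:
    """Surface measure of the unit sphere in R^N, equal to N * omega_N."""
    return N * unit_ball_volume(N)


volumes = {N: unit_ball_volume(N) for N in range(2, 6)}
```

The dictionary `volumes` reproduces the four values listed in the example, up to floating-point rounding.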

8.5 Exercises

8.1 Give an example of a non-empty set X and a monotone class M of


subsets of X which contains X and ∅ and which is not a σ-algebra.

8.2 Let p ≥ 1. Let (X, S, µ) be a measure space. Let f be a real-valued
function defined on X such that $\int_X |f|^p \, d\mu < +\infty$. Show that
\[
\int_X |f|^p \, d\mu = p \int_{[0,\infty)} t^{p-1} \mu(E(t)) \, dm_1(t),
\]
where
\[
E(t) = \{x \in X \mid |f(x)| > t\}.
\]
8.3 (a) For x > 0, show that
\[
\int_{[0,+\infty)} e^{-xt} \, dm_1(t) = \frac{1}{x}.
\]
(b) Use the above relation and Fubini’s theorem to show that
\[
\lim_{R \to +\infty} \int_0^R \frac{\sin x}{x} \, dx = \frac{\pi}{2}.
\]

8.4 Let f, g and h be integrable real-valued functions defined on $\mathbb{R}^N$.
Show that
(a) f ∗ g = g ∗ f;
(b) f ∗ (g ∗ h) and (f ∗ g) ∗ h are well-defined and that they
are equal.

8.5 Let f and g be integrable real-valued functions defined on $\mathbb{R}^N$. Show
that (cf. Example 5.3.2)
\[
\widehat{f * g} = \hat{f} \cdot \hat{g}.
\]

8.6 Let (X, S) be a measurable space. Let f be a real-valued,
non-negative function defined on X. Define the upper and lower ordinate
sets of f by
\[
V^{*}(f) = \{(x, t) \in X \times \mathbb{R} \mid 0 \le t \le f(x)\}, \quad \text{and} \quad
V_{*}(f) = \{(x, t) \in X \times \mathbb{R} \mid 0 \le t < f(x)\},
\]
respectively.
(a) If f is a non-negative simple function, show that V ∗ (f ) and V∗ (f ) are
measurable in X × R (where R is equipped with the Lebesgue measure).
(b) If f and g are non-negative functions such that f (x) ≤ g(x) for all
x ∈ X, show that V ∗ (f ) ⊂ V ∗ (g) and that V∗ (f ) ⊂ V∗ (g).
(c) Let $\{f_n\}_{n=1}^{\infty}$ be a sequence of non-negative measurable functions
defined on X. If $f_n \uparrow f$, show that $\{V_{*}(f_n)\}_{n=1}^{\infty}$ is an increasing sequence
of sets whose union is $V_{*}(f)$. If $f_n \downarrow f$, show that $\{V^{*}(f_n)\}_{n=1}^{\infty}$ is a
decreasing sequence of sets whose intersection is $V^{*}(f)$.
(d) If f is a non-negative measurable function defined on X, show that
V ∗ (f ) and V∗ (f ) are measurable subsets of X × R.
(e) If f is any measurable real-valued function defined on X, show that
its graph, G(f ), is a measurable subset of X × R, where

G(f ) = {(x, t) ∈ X × R | f (x) = t}.

(f) Let (X, S, µ) be a σ-finite measure space. Set $\lambda = \mu \times m_1$. Show
that, if f is a non-negative measurable function defined on X, then
\[
\lambda(V^{*}(f)) = \lambda(V_{*}(f)) = \int_X f \, d\mu.
\]
(This is a generalization of the notion that the (Riemann) integral of a
non-negative real-valued function defined on (a sub-interval of) R, is the
area under the graph of the function.)

8.7 Let A be a real, symmetric and positive definite N × N matrix.
Show that
\[
\int_{\mathbb{R}^N} e^{-x^{T} A x} \, dm_N(x) = \sqrt{\frac{\pi^N}{\det(A)}},
\]
where $x^{T}$ denotes the transpose of (the column vector) $x \in \mathbb{R}^N$.


Chapter 9

Signed measures

9.1 Hahn and Jordan decompositions

Let (X, S) be a measurable space. Let µi , i = 1, 2, be two measures


defined on this space. Let αi , i = 1, 2, be non-negative real numbers.
Then α1 µ1 + α2 µ2 defines a measure on this space. We now consider
the possibility that αi , i = 1, 2, be arbitrary real numbers. Thus, it is
possible that certain sets have negative measure. The principal difficulty
in doing this is that if µi (E), i = 1, 2, are both infinite for some E ∈ S,
then we cannot define µ1 (E) − µ2 (E). The situation is similar to the one
we encountered when defining the integral of a function. In that case
we needed that at least one of the functions, f + or f − , be integrable.
In the same way, if we assume that one of µ1 or µ2 is a finite measure,
then, at least formally, we can define the set function µ1 − µ2 , which will
still be countably additive.

Motivated by these remarks, we make the following definition.

Definition 9.1.1 Let (X, S) be a measurable space and let µ be an ex-


tended real-valued set function defined on S. It is said to be a signed
measure if
(i) µ(∅) = 0,
(ii) µ takes at most one of the values +∞ or −∞, and
(iii) µ is countably additive.
A signed measure, µ, is said to be finite if |µ(E)| < +∞ for every
E ∈ S. It is said to be σ-finite if $X = \cup_{n=1}^{\infty} E_n$, with $|\mu(E_n)| < +\infty$ for
each n ∈ N.

Example 9.1.1 As already observed, if µi, i = 1, 2, are two measures
on a measurable space (X, S), and if at least one of them is finite, then
µ1 − µ2 is a signed measure. One of our objectives in this section will
be to show that every signed measure can be written as the difference
of two measures, one of them finite.

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019
S. Kesavan, Measure and Integration, Texts and Readings in
Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_9

Example 9.1.2 Let (X, S, µ) be a measure space. Let f be an integrable


function defined on X. Define
\[
\nu(E) = \int_E f \, d\mu, \quad E \in S.
\]

Then ν defines a signed measure on (X, S). 

Remark 9.1.1 A signed measure is clearly finitely additive. If µ(E) is


finite, then µ(F \E) = µ(F ) − µ(E), where E, F ∈ S and E ⊂ F. 
Proposition 9.1.1 Let (X, S) be a measurable space and let µ be a
signed measure defined on it. Let E and F be measurable sets such that
E ⊂ F . If µ(F ) is finite, then µ(E) is also finite.
Proof: We have F = (F \E) ∪ E and the two sets on the right-hand side
are disjoint. Thus, µ(F ) = µ(F \E) + µ(E). If both the summands on
the right-hand side of this equation are infinite, then so is µ(F ), since we
have assumed that µ can take at most one of the two infinite values +∞
or −∞. If one of them alone is finite, then again, µ(F ) will be infinite.
Thus both summands have to be finite, which completes the proof. 
Proposition 9.1.2 Let (X, S) be a measurable space and let µ be a
signed measure defined on it. Let $\{E_n\}_{n=1}^{\infty}$ be a sequence of disjoint
measurable sets such that $|\mu(\cup_{n=1}^{\infty} E_n)| < +\infty$. Then, the series $\sum_{n=1}^{\infty} \mu(E_n)$
is absolutely convergent.
Proof: Set
\[
E_n^{+} =
\begin{cases}
E_n, & \text{if } \mu(E_n) \ge 0, \\
\emptyset, & \text{if } \mu(E_n) < 0,
\end{cases}
\]
and
\[
E_n^{-} =
\begin{cases}
E_n, & \text{if } \mu(E_n) \le 0, \\
\emptyset, & \text{if } \mu(E_n) > 0.
\end{cases}
\]
Then
\[
\begin{aligned}
\mu(\cup_{n=1}^{\infty} E_n^{+}) &= \sum_{n=1}^{\infty} \mu(E_n^{+}), \\
\mu(\cup_{n=1}^{\infty} E_n^{-}) &= \sum_{n=1}^{\infty} \mu(E_n^{-}),
\end{aligned}
\tag{9.1.1}
\]
and the sum of the two series is $\sum_{n=1}^{\infty} \mu(E_n) = \mu(\cup_{n=1}^{\infty} E_n)$. Since µ can
take at most one of the two values +∞ or −∞, and since $\sum_{n=1}^{\infty} \mu(E_n)$
is convergent by the given hypothesis, it follows that both the series in
(9.1.1) are finite. But these are the series of positive terms and the series
of negative terms of $\sum_{n=1}^{\infty} \mu(E_n)$ and so this latter series is absolutely
convergent.

Proposition 9.1.3 Let (X, S) be a measurable space equipped with a
signed measure µ. If $\{E_n\}_{n=1}^{\infty}$ is an increasing sequence of measurable
sets, then
\[
\mu(\cup_{n=1}^{\infty} E_n) = \lim_{n \to \infty} \mu(E_n). \tag{9.1.2}
\]
If $\{E_n\}_{n=1}^{\infty}$ is a decreasing sequence of measurable sets such that $\mu(E_m)$
is finite for some m ∈ N, then
\[
\mu(\cap_{n=1}^{\infty} E_n) = \lim_{n \to \infty} \mu(E_n). \tag{9.1.3}
\]
Proof: The proof is exactly as in the case of measures (cf. Propositions
1.2.4 and 1.2.5). Proposition 9.1.1 ensures that subsets of sets of
finite measure are also of finite measure and we can use the subtractive
property of signed measures (cf. Remark 9.1.1) to get (9.1.3).

Definition 9.1.2 Let (X, S) be a measurable space equipped with a signed


measure µ. Let E be a measurable subset of X. We say that E is a pos-
itive set (respectively, a negative set), if for every measurable set F ,
we have µ(E ∩ F ) ≥ 0 (respectively, µ(E ∩ F ) ≤ 0). Equivalently, E
is a positive (respectively, negative) set if for every measurable subset
F ⊂ E, we have µ(F ) ≥ 0 (respectively, µ(F ) ≤ 0). 

Remark 9.1.2 The empty set is both a positive and a negative set. 

Remark 9.1.3 Any subset of a positive (respectively, negative) set is


positive (respectively, negative). Any (finite or countable) disjoint union
of positive (respectively, negative) sets is positive (respectively, nega-
tive). If Ai , i = 1, 2 are positive (respectively, negative) sets, then so
is A1 \A2 . Consequently, any countable union of positive (respectively,
negative) sets is positive (respectively, negative). 

Proposition 9.1.4 Let (X, S) be a measurable space equipped with a
signed measure µ. Let $\{B_i\}_{i=1}^{n}$ be a finite collection of negative sets. Let
$B = \cup_{i=1}^{n} B_i$. Then
\[
\mu(B) \le \min_{1 \le i \le n} \mu(B_i). \tag{9.1.4}
\]

Proof: We have B1 ∪ B2 = B1 ∪ (B2 \B1 ) and the latter is a disjoint


union. By the preceding remark, we have

µ(B1 ∪ B2 ) = µ(B1 ) + µ(B2 \B1 ) ≤ µ(B1 ).

Similarly, µ(B1 ∪ B2 ) ≤ µ(B2 ). This proves (9.1.4) when n = 2. The


general case now follows by induction on n. 
Theorem 9.1.1 (Hahn decomposition) Let (X, S) be a measurable space
equipped with a signed measure µ. There exist two disjoint sets A and
B such that X = A ∪ B, and such that A is a positive set and B is a
negative set.
Proof: Without loss of generality, let us assume that for all E ∈ S, we
have
\[
-\infty < \mu(E) \le +\infty.
\]
Step 1. Let us denote by $\mathcal{N}$ the collection of all negative subsets of X.
Set
\[
\beta = \inf_{B \in \mathcal{N}} \mu(B).
\]
Let $\{B_i\}_{i=1}^{\infty}$ be a sequence of sets in $\mathcal{N}$ such that $\mu(B_i) \downarrow \beta$. If
$B = \cup_{i=1}^{\infty} B_i$, we have seen that $B \in \mathcal{N}$ and so β ≤ µ(B). On the other hand,
if we set $C_n = \cup_{i=1}^{n} B_i$, then $\mu(B) = \lim_{n \to \infty} \mu(C_n)$, by Proposition
9.1.3. But, by Proposition 9.1.4, we have that $\mu(C_n) \le \mu(B_n)$ and so
µ(B) ≤ β. Thus, µ(B) = β. In particular, by our assumption on µ, we
have that β is finite.

Step 2. Let A = X\B. We will show that A is a positive set. If not,


there exists a measurable set E0 ⊂ A such that µ(E0 ) < 0.
Assume, if possible, that E0 is a negative set. Then, B ∪ E0 is a
negative set and, since B and E0 are disjoint, µ(B ∪ E0 ) = µ(B) +
µ(E0 ) < β, which is impossible by the definition of β. Thus, there exists
a measurable subset of E0 with positive measure. Since µ(E0 ), being
negative, is finite, so is the measure of any subset of E0 . Let k1 be the
smallest positive integer such that there exists a measurable set $E_1 \subset E_0$
with $\mu(E_1) \ge \frac{1}{k_1}$. Now,
\[
\mu(E_0 \setminus E_1) = \mu(E_0) - \mu(E_1) \le \mu(E_0) - \frac{1}{k_1} < 0.
\]
Step 3. We can now apply the procedure of Step 2 to the set $E_0 \setminus E_1$.
Then there exist measurable subsets of $E_0 \setminus E_1$ with positive measure,
and let $k_2$ be the smallest positive integer with the property that there
exists such a set $E_2$ with $\mu(E_2) \ge \frac{1}{k_2}$. (In other words, at each stage,
we choose a set with positive measure with the measure being as large
as possible.)
Proceeding in this way, for each positive integer i, there exists a
measurable set of positive measure contained in $E_0 \setminus \cup_{k=1}^{i-1} E_k$ and let $k_i$
be the smallest positive integer with the property that there exists such
a measurable subset $E_i$ with $\mu(E_i) \ge \frac{1}{k_i}$. Then, since $E_i \subset E_0 \setminus \cup_{\ell=1}^{i-1} E_\ell$,
the sets $\{E_i\}_{i=1}^{\infty}$ are clearly disjoint, and so
\[
\sum_{i=1}^{\infty} \mu(E_i) = \mu(\cup_{i=1}^{\infty} E_i) < +\infty,
\]
since $\cup_{i=1}^{\infty} E_i \subset E_0$, which has finite measure. In particular, it follows
that $\mu(E_i) \to 0$ as i → ∞ and so $k_i \to \infty$.

Step 4. Let F be a measurable set such that $F \subset E_0 \setminus \cup_{i=1}^{\infty} E_i$. Then
µ(F) ≤ 0. If not, let $k_n$ be such that $\mu(F) \ge \frac{1}{k_n}$, which is possible,
since $k_i \to \infty$. Then, for all m ≥ n, we have
\[
F \subset E_0 \setminus \cup_{i=1}^{\infty} E_i \subset E_0 \setminus \cup_{i=1}^{m} E_i,
\]
which yields, by definition of the $k_i$, that $k_m \le k_n$, which is a
contradiction, since $k_i \to \infty$. Thus,
\[
F_0 = E_0 \setminus \cup_{i=1}^{\infty} E_i
\]
is a negative set and $\mu(F_0) \le \mu(E_0) < 0$. Once again, this is a contradiction
since we then have that $B \cup F_0$ is a negative set and $\mu(B \cup F_0) < \beta$.
Thus A is a positive set. This completes the proof.

Remark 9.1.4 The decomposition of X into two disjoint sets, one positive
and the other negative, is called a Hahn decomposition of X. Such a
decomposition is not unique. Let X = A ∪ B be a Hahn decomposition
and assume that there exists N ⊂ B such that µ(N ) = 0. Let F ⊂ N .
Then µ(F ) ≤ 0. If µ(F ) < 0, then 0 = µ(N ) = µ(F ) + µ(N \F ), which
implies that µ(N \F ) > 0, which is not possible since B and, hence, N ,
is a negative set. Thus, µ(F ) = 0, for all F ⊂ N . Then it is clear that
X = (A ∪ N ) ∪ (B\N ) gives another Hahn decomposition of X. 

The situation described in the preceding remark is, in fact, the only
way non-uniqueness can occur for the Hahn decomposition. More pre-
cisely, we have the following result.

Proposition 9.1.5 Let (X, S) be a measurable space equipped with a


signed measure µ. Let X = Ai ∪Bi , i = 1, 2, be two Hahn decompositions
of X. Then, for every E ∈ S, we have

µ(E ∩ A1 ) = µ(E ∩ A2 ) and µ(E ∩ B1 ) = µ(E ∩ B2 ).

Proof: Let E ∈ S. We have E ∩(A1 \A2 ) ⊂ A1 and so µ(E ∩(A1 \A2 )) ≥


0. On the other hand,

E ∩ (A1 \A2 ) = E ∩ A1 ∩ Ac2 = E ∩ A1 ∩ B2 ⊂ B2 ,

and so µ(E ∩ (A1 \A2 )) ≤ 0. Thus µ(E ∩ (A1 \A2 )) = 0 and, similarly,
µ(E ∩ (B1 \B2 )) = 0 as well. Consequently,

µ(E ∩ (A1 ∪ A2 )) = µ(E ∩ A2 ) + µ(E ∩ (A1 \A2 )) = µ(E ∩ A2 ).

Interchanging the roles of A1 and A2 , we get µ(E ∩ (A1 ∪ A2 )) = µ(E ∩


A1 ). Thus
µ(E ∩ A1 ) = µ(E ∩ A2 ).
This proves the first relation in the statement of the proposition. The
proof of the other one is similar. 

Let (X, S) be a measurable space equipped with a signed measure


µ. Let us now define two set functions on S by
\[
\begin{aligned}
\mu^{+}(E) &= \mu(E \cap A), \\
\mu^{-}(E) &= -\mu(E \cap B),
\end{aligned}
\tag{9.1.5}
\]
where E ∈ S and X = A ∪ B is a Hahn decomposition of X. By the
preceding proposition, µ± are well-defined, since they do not depend on
the Hahn decomposition chosen. Further, it is clear that they are both
measures. Also, since µ takes at most one of the two infinite values +∞
or −∞, it follows that one of these two measures is finite and, we have

µ(E) = µ+ (E) − µ− (E), E ∈ S.

If µ is finite (respectively, σ-finite), then the same is true for µ± .

We have thus proved the following result.


Theorem 9.1.2 (Jordan decomposition) Let (X, S) be a measurable space
equipped with a signed measure µ. Then µ is the difference of two (pos-
itive) measures µ+ and µ− , at least one of which is finite. If µ is finite
(respectively, σ-finite), then so are µ± . 

Definition 9.1.3 Let (X, S) be a measurable space equipped with a signed


measure µ. The relation $\mu = \mu^{+} - \mu^{-}$ is called the Jordan decomposition
of µ. The measures $\mu^{+}$ and $\mu^{-}$ are respectively called the
upper and lower variations of the signed measure µ. The measure
$|\mu| = \mu^{+} + \mu^{-}$ is called the total variation of the signed measure µ.

Definition 9.1.4 Let (X, S) be a measurable space. A complex mea-


sure is a set function µ defined on S which can be written as µ =
µ1 + iµ2 , where µj , j = 1, 2, are signed measures and i is a square root
of −1. 

Proposition 9.1.6 Let (X, S) be a measurable space equipped with a


signed measure µ. Let E ∈ S. Let $\mu^{\pm}$ be the upper and lower variations
of µ. Then
\[
\begin{aligned}
\mu^{+}(E) &= \sup\{\mu(F) \mid F \subset E, \ F \in S\}, \\
\mu^{-}(E) &= -\inf\{\mu(F) \mid F \subset E, \ F \in S\}.
\end{aligned}
\tag{9.1.6}
\]
Proof: Let X = A ∪ B be a Hahn decomposition of X. Let E, F ∈ S
such that F ⊂ E. Since $\mu^{+}$ is a measure, we have that µ(F ∩ A) ≤
µ(E ∩ A). Thus
\[
\begin{aligned}
\mu(F) &= \mu(F \cap A) + \mu(F \cap B) \\
       &\le \mu(F \cap A) \\
       &\le \mu(E \cap A) \\
       &= \mu^{+}(E).
\end{aligned}
\]
It then follows that
\[
\sup\{\mu(F) \mid F \subset E, \ F \in S\} \le \mu^{+}(E).
\]
Since $\mu^{+}(E) = \mu(E \cap A)$ and E ∩ A ⊂ E, the reverse inequality is
obvious. This proves the first relation in (9.1.6). The proof of the second
relation is similar.
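On a finite set, where every subset is measurable, the Hahn and Jordan decompositions and the relations (9.1.6) can be checked by brute force. In the Python sketch below, the point masses in `weights` are hypothetical sample data and all function names are illustrative. The positive set A collects the points of strictly positive mass; points of zero mass are placed in B, which is one of several admissible choices.

```python
from itertools import chain, combinations

# Hypothetical point masses of a signed measure on a five-point set.
weights = {"a": 2, "b": -3, "c": 0, "d": 5, "e": -1}


def mu(E):
    """Signed measure of a subset E of the underlying set."""
    return sum(weights[x] for x in E)


def subsets(E):
    """All subsets of E, as tuples (every subset is measurable here)."""
    items = sorted(E)
    return chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))


# A Hahn decomposition: A is a positive set, B = X \ A is a negative set.
A = {x for x, w in weights.items() if w > 0}
B = set(weights) - A


def mu_plus(E):
    """Upper variation, defined as in (9.1.5)."""
    return mu(set(E) & A)


def mu_minus(E):
    """Lower variation, defined as in (9.1.5)."""
    return -mu(set(E) & B)


def sup_over_subsets(E):
    """Right-hand side of the first relation in (9.1.6)."""
    return max(mu(F) for F in subsets(E))


def inf_over_subsets(E):
    """Infimum appearing in the second relation in (9.1.6)."""
    return min(mu(F) for F in subsets(E))
```

Moving the point of zero mass from B to A gives another Hahn decomposition with the same upper and lower variations, as Proposition 9.1.5 predicts.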

Example 9.1.3 Let (X, S, µ) be a measure space and let f be an integrable
function defined on X. Consider the signed measure ν defined by
\[
\nu(E) = \int_E f \, d\mu, \quad E \in S.
\]
Then we have a Hahn decomposition X = A ∪ B where
\[
A = \{x \in X \mid f^{+}(x) > 0\}, \qquad
B = X \setminus A = \{x \in X \mid f(x) \le 0\}.
\]
The upper and lower variations $\nu^{\pm}$ of ν are given by
\[
\nu^{+}(E) = \int_E f^{+} \, d\mu, \quad \text{and} \quad \nu^{-}(E) = \int_E f^{-} \, d\mu,
\]
and the total variation |ν| is given by
\[
|\nu|(E) = \int_E |f| \, d\mu,
\]
for E ∈ S.

Let (X, S) be a measurable space equipped with a signed measure


µ. It is clear that a measurable function f defined on X is integrable
with respect to |µ| if, and only if, it is integrable with respect to both
µ+ and µ− . In that case we can define the integral of f over X, with
respect to µ.

Definition 9.1.5 Let (X, S) be a measurable space equipped with a signed


measure µ. Let f be a measurable function defined on X which is
integrable with respect to |µ|. Then, we say that f is integrable with respect
to the signed measure µ and we define
\[
\int_X f \, d\mu = \int_X f \, d\mu^{+} - \int_X f \, d\mu^{-},
\]
where $\mu = \mu^{+} - \mu^{-}$ is the Jordan decomposition of µ. If µi, i = 1, 2 are
two signed measures defined on the measurable space (X, S), and if µ is
the complex measure defined by µ = µ1 + iµ2, we say that a measurable
function f defined on X is integrable with respect to µ if it is integrable
with respect to both µ1 and µ2 and we define
\[
\int_X f \, d\mu = \int_X f \, d\mu_1 + i \int_X f \, d\mu_2.
\]

9.2 Absolute continuity

Definition 9.2.1 Let (X, S) be a measurable space and let µ and ν be


signed measures defined on it. We say that ν is absolutely continuous
with respect to µ if ν(E) = 0 whenever |µ|(E) = 0, E ∈ S. In this case
we write ν << µ. 

Example 9.2.1 Let (X, S, µ) be a measure space and let f be an
integrable function defined on X. For E ∈ S, define the signed measure
\[
\nu(E) = \int_E f \, d\mu.
\]

Then ν << µ. 

Example 9.2.2 Let (X, S) be a measurable space and let µ and ν be
measures defined on it. Then µ << µ + ν and ν << µ + ν.

Example 9.2.3 Let (X, S) be a measurable space equipped with a


signed measure µ. Then µ+ << µ, µ− << µ. We also have µ << |µ|
and |µ| << µ. 

Example 9.2.4 Let X = [0, 1] be equipped with the Lebesgue measure


$m_1$. Let $F = [0, \frac{1}{2}]$ and, for x ∈ X, set
\[
f_1(x) = 2\chi_F(x) - 1, \qquad f_2(x) = x.
\]
Let, for $E \in \mathcal{L}_1$,
\[
\mu_i(E) = \int_E f_i \, dm_1, \quad i = 1, 2.
\]
Then, since $|f_1| \equiv 1$, we have $|\mu_1| = m_1$. Consequently (cf. Example
9.2.1), µ2 << µ1. However,
\[
\mu_1(X) = \int_X f_1 \, dm_1 = 0,
\]
while
\[
\mu_2(X) = \int_0^1 x \, dx = \frac{1}{2} \ne 0.
\]
Thus, if µ2 << µ1 and if µ1(E) = 0, it does not imply that µ2(E) = 0.

Proposition 9.2.1 Let (X, S) be a measurable space and let µ and ν be


signed measures defined on it. The following statements are equivalent.
(i) ν << µ.
(ii) ν ± << µ.
(iii) |ν| << |µ|.

Proof: (i) ⇒ (ii). Let X = A ∪ B be a Hahn decomposition of X


with respect to ν, so that, for E ∈ S, we have ν + (E) = ν(E ∩ A) and
ν − (E) = −ν(E ∩ B). Let E ∈ S such that |µ|(E) = 0. Then

0 ≤ |µ|(E ∩ A) ≤ |µ|(E) = 0,
0 ≤ |µ|(E ∩ B) ≤ |µ|(E) = 0.

Then ν(E ∩ A) = ν(E ∩ B) = 0. Thus, ν ± << µ.

(ii) ⇒ (iii). Clearly, ν ± << µ implies that ν ± << |µ|. Thus, if


|µ|(E) = 0, then ν ± (E) = 0 and so |ν|(E) = 0 as well. Thus, |ν| << |µ|.

(iii) ⇒ (i). If |µ|(E) = 0, then |ν|(E) = 0 and so ν ± (E) = 0 from which


we get
ν(E) = ν + (E) − ν − (E) = 0.
This completes the proof. 

Definition 9.2.2 Let (X, S) be a measurable space and let µ and ν be


signed measures defined on it. We say that µ is equivalent to ν if both
the relations µ << ν and ν << µ hold. In this case we write µ ≡ ν. 

Example 9.2.5 If µ is a signed measure defined on a measurable space


(X, S), then µ ≡ |µ|. 
We defined a notion of absolute continuity of a measure with respect
to another in Remark 5.3.2, based on the result of Proposition 5.3.2. In
that case, the measure ν is also absolutely continuous with respect to µ
according to Definition 9.2.1 above. We reconcile these two definitions
in the proposition below.

Proposition 9.2.2 Let (X, S) be a measurable space and let µ and ν be


signed measures defined on it. Assume that ν is finite and that ν << µ.
Then, given ε > 0, we can find δ > 0 such that, whenever |µ|(E) <
δ, E ∈ S, we have |ν|(E) < ε.

Proof: Assume the contrary. Then, there exists $\varepsilon > 0$ such that, for every $n \in \mathbb{N}$, there exists a set $E_n \in \mathcal{S}$ with $|\mu|(E_n) < \frac{1}{2^n}$ and $|\nu|(E_n) \ge \varepsilon$. Set $E = \limsup_{n \to \infty} E_n$. Then, for every $n \in \mathbb{N}$,

$$|\mu|(E) \le \sum_{m=n}^{\infty} |\mu|(E_m) < \frac{1}{2^{n-1}},$$

and so $|\mu|(E) = 0$. Since $|\nu|$ is finite (cf. Exercise 1.10 (b)), we have $|\nu|(E) \ge \varepsilon$, which contradicts the absolute continuity of $\nu$ with respect to $\mu$. ∎

Example 9.2.6 The above result is not true, in general, if $\nu$ is not finite. Let $X = \mathbb{N}$ and let $\mathcal{S} = \mathcal{P}(\mathbb{N})$. Let $\mu(\{n\}) = 2^{-n}$ and let $\nu(\{n\}) = 2^{n}$. Then $\mu$ and $\nu$ define measures and since the only set, in either case, with measure zero is the empty set, we have that $\nu \ll \mu$. However, for any $\delta > 0$, we can find $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$, we have $\mu(\{n\}) < \delta$, while $\{\nu(\{n\})\}_{n \ge n_0}$ is unbounded. ∎
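The failure of the ε-δ property in Example 9.2.6 can be observed numerically. The following is a minimal Python sketch; the helper names `mu_point`, `nu_point` and `witness` are illustrative and do not come from the text. For a given δ it finds a singleton {n} with µ({n}) < δ whose ν-measure is nevertheless large.

```python
from fractions import Fraction

def mu_point(n: int) -> Fraction:
    """mu({n}) = 2^(-n), as in Example 9.2.6."""
    return Fraction(1, 2 ** n)

def nu_point(n: int) -> Fraction:
    """nu({n}) = 2^n, as in Example 9.2.6."""
    return Fraction(2 ** n)

def witness(delta: Fraction) -> int:
    """Smallest n in N = {1, 2, ...} with mu({n}) < delta."""
    n = 1
    while mu_point(n) >= delta:
        n += 1
    return n

for delta in (Fraction(1, 10), Fraction(1, 1000)):
    n = witness(delta)
    # mu({n}) is below delta, yet nu({n}) is large
    print(delta, n, mu_point(n), nu_point(n))
```

Shrinking δ only pushes the witness n further out, where ν({n}) is even larger, so no δ can work for, say, ε = 1.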

Proposition 9.2.3 Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ and $\nu$ be finite measures on it such that $\nu \ll \mu$. Assume that $\nu$ is not identically zero. Then, there exist $\varepsilon > 0$ and a measurable set $A$ with $\mu(A) > 0$, such that $A$ is a positive set for the signed measure $\nu - \varepsilon\mu$.

Proof: For each $n \in \mathbb{N}$, consider the signed measure $\nu - \frac{1}{n}\mu$ and let $X = A_n \cup B_n$ be a Hahn decomposition for this signed measure. Set $A_0 = \cup_{n=1}^{\infty} A_n$ and $B_0 = \cap_{n=1}^{\infty} B_n$. Since $B_0 \subset B_n$ for each $n$, and since $B_n$ is a negative set for the signed measure $\nu - \frac{1}{n}\mu$, we have

$$0 \le \nu(B_0) \le \frac{1}{n}\mu(B_0).$$

Thus, $\nu(B_0) = 0$. Since $A_0 = B_0^c$, and since $\nu$ is not identically zero, it follows that $\nu(A_0) > 0$. By absolute continuity, it follows that $\mu(A_0) > 0$ as well. Then, there exists $n \in \mathbb{N}$ such that $\mu(A_n) > 0$. We can now set $A = A_n$ and $\varepsilon = \frac{1}{n}$. ∎

9.3 The Radon-Nikodym theorem

Let $(X, \mathcal{S}, \mu)$ be a measure space and let $f$ be a non-negative integrable function defined on $X$. If we define the measure $\nu$ by

$$\nu(E) = \int_E f \, d\mu, \qquad E \in \mathcal{S},$$

then $\nu$ is absolutely continuous with respect to $\mu$. The Radon-Nikodym theorem states that in the σ-finite case, every signed measure $\nu$, which is absolutely continuous with respect to $\mu$, arises in this fashion. We will first prove this for finite measures and then extend it to the general case.

Theorem 9.3.1 (Radon-Nikodym) Let $(X, \mathcal{S}, \mu)$ be a finite measure space and let $\nu$ be a finite measure defined on $\mathcal{S}$ such that $\nu \ll \mu$. Then, there exists a non-negative function $f$ defined on $X$, which is integrable with respect to $\mu$, and such that

$$\nu(E) = \int_E f \, d\mu,$$

for every $E \in \mathcal{S}$. The function $f$ is unique in the sense that, if $g$ is another integrable function such that $\nu(E) = \int_E g \, d\mu$ for every $E \in \mathcal{S}$, then $f = g$ almost everywhere with respect to the measure $\mu$.

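Proof: Step 1. Uniqueness. If $f$ and $g$ are two functions such that

$$\nu(E) = \int_E f \, d\mu = \int_E g \, d\mu,$$

for every $E \in \mathcal{S}$, then, for every $n \in \mathbb{N}$, we have that $\mu(E_n) = 0$, where

$$E_n = \left\{ x \in X \;\Big|\; f(x) - g(x) > \frac{1}{n} \right\},$$

since $0 = \int_{E_n} (f - g) \, d\mu \ge \frac{1}{n}\mu(E_n)$. It then immediately follows that

$$\mu(\{x \in X \mid f(x) - g(x) > 0\}) = 0.$$

Similarly, we have

$$\mu(\{x \in X \mid f(x) - g(x) < 0\}) = 0,$$

whence we deduce that

$$\mu(\{x \in X \mid f(x) \neq g(x)\}) = 0.$$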

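Step 2. Let us denote by $L(\mu)$, the set of all measurable functions defined on $X$ which are integrable with respect to $\mu$. Define

$$K = \left\{ f \in L(\mu) \;\Big|\; f \ge 0 \text{ and } \int_E f \, d\mu \le \nu(E) \text{ for every } E \in \mathcal{S} \right\}.$$

Then $K \neq \emptyset$. To see this, let $\varepsilon > 0$ and $A \in \mathcal{S}$ be as in the statement of Proposition 9.2.3. Then, if we set $f = \varepsilon\chi_A$, we have

$$\int_E f \, d\mu = \varepsilon\mu(E \cap A) \le \nu(E \cap A) \le \nu(E),$$

for every $E \in \mathcal{S}$. Thus $f \in K$ and $\int_X f \, d\mu = \varepsilon\mu(A) > 0$. Now set

$$\alpha = \sup\left\{ \int_X f \, d\mu \;\Big|\; f \in K \right\}.$$

Then,

$$0 < \alpha \le \nu(X) < +\infty.$$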
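Step 3. Let $\{g_n\}_{n=1}^{\infty}$ be a sequence of functions in $K$ such that

$$\int_X g_n \, d\mu > \alpha - \frac{1}{n}.$$

Let $f_n = \max\{g_1, \cdots, g_n\} \ge 0$. We claim that $f_n \in K$.

Indeed, let $E_i^n = \{x \in X \mid f_n(x) = g_i(x)\}$, for $1 \le i \le n$. Then $X = \cup_{i=1}^{n} E_i^n$. Set $F_1^n = E_1^n$ and, for $2 \le i \le n$,

$$F_i^n = E_i^n \setminus \left( \cup_{k=1}^{i-1} F_k^n \right).$$

Then the $F_i^n$, $1 \le i \le n$, are disjoint, $F_i^n \subset E_i^n$ for $1 \le i \le n$ and $X = \cup_{i=1}^{n} F_i^n$. Thus, if $E \in \mathcal{S}$, we have

$$\int_E f_n \, d\mu = \sum_{i=1}^{n} \int_{E \cap F_i^n} f_n \, d\mu = \sum_{i=1}^{n} \int_{E \cap F_i^n} g_i \, d\mu \le \sum_{i=1}^{n} \nu(E \cap F_i^n) = \nu(E).$$

This proves that $f_n \in K$.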

Step 4. We have that $\{f_n\}_{n=1}^{\infty}$ is an increasing sequence of non-negative functions. Let $f = \lim_{n \to \infty} f_n$. Then, by the monotone convergence theorem,

$$\int_E f \, d\mu = \lim_{n \to \infty} \int_E f_n \, d\mu \le \nu(E),$$

for every $E \in \mathcal{S}$. Thus, $f \in K$ as well. Consequently, $\int_X f \, d\mu \le \alpha$. On the other hand,

$$\int_X f \, d\mu \ge \int_X f_n \, d\mu \ge \int_X g_n \, d\mu > \alpha - \frac{1}{n},$$

for every $n \in \mathbb{N}$. Consequently, $\int_X f \, d\mu \ge \alpha$ as well and so we have that

$$f \in K \quad \text{and} \quad \int_X f \, d\mu = \alpha.$$

Step 5. Define the measure $\nu_1$ by

$$\nu_1(E) = \int_E f \, d\mu, \qquad E \in \mathcal{S}.$$

Then $\nu_1$ is a finite measure and $\nu_1 \ll \mu$. Set $\nu_0 = \nu - \nu_1$, which is a well-defined signed measure since both $\nu$ and $\nu_1$ are finite measures. Since $f \in K$, we have that $\nu_1(E) \le \nu(E)$ for every $E \in \mathcal{S}$. Thus $\nu_0$ is a measure and we also have that $\nu_0 \ll \mu$.

Step 6. We claim that $\nu_0 \equiv 0$. This will complete the proof. Assume the contrary. Then, since $\nu_0$ is a finite measure which is absolutely continuous with respect to $\mu$, we may again apply Proposition 9.2.3. Thus, there exist $\eta > 0$ and a set $F \in \mathcal{S}$, which is a positive set for the signed measure $\nu_0 - \eta\mu$, and such that $\mu(F) > 0$. Hence, for every $E \in \mathcal{S}$, we have

$$\eta\mu(E \cap F) \le \nu_0(E \cap F) = \nu(E \cap F) - \int_{E \cap F} f \, d\mu.$$

Now, set $h = f + \eta\chi_F$. Then, if $E \in \mathcal{S}$, we have

$$\begin{aligned}
\int_E h \, d\mu &= \int_E f \, d\mu + \eta\mu(E \cap F) \\
&\le \int_E f \, d\mu - \int_{E \cap F} f \, d\mu + \nu(E \cap F) \\
&= \int_{E \cap F^c} f \, d\mu + \nu(E \cap F) \\
&\le \nu(E \cap F^c) + \nu(E \cap F) \\
&= \nu(E).
\end{aligned}$$

Thus $h \in K$. But

$$\int_X h \, d\mu = \int_X f \, d\mu + \eta\mu(F) > \alpha,$$

which is a contradiction. Thus, $\nu_0 \equiv 0$. Hence $f$ is the required function and this completes the proof. ∎

If $\mu$ and $\nu$ are σ-finite measures, then we can write $X$ as the disjoint union of a countable number of measurable sets, such that on each of them $\mu$ and $\nu$ are finite. Thus we can ‘patch up’ the function $f$ obtained on each of these sets to get $f$ defined on $X$ as in the theorem. Next, if $\nu$ is a σ-finite signed measure, we can write $\nu = \nu^+ - \nu^-$. Since $\nu^\pm$ will also be σ-finite, the preceding argument gives two functions $f_+$ and $f_-$ such that, for every $E \in \mathcal{S}$, we have

$$\nu^\pm(E) = \int_E f_\pm \, d\mu.$$

Then we can set $f = f_+ - f_-$ to get the result of the theorem in this case. Finally, let us assume that $\mu$ is also a σ-finite signed measure. Let $X = A \cup B$ be a Hahn decomposition with respect to $\mu$. Then, if $E \subset A$, we have $|\mu|(E) = \mu^+(E)$ and if $E \subset B$, we have $|\mu|(E) = \mu^-(E)$. Thus, when restricted to $A$ or to $B$, since we have that $\nu \ll \mu$, we deduce that $\nu \ll \mu^+$ on $A$ and $\nu \ll \mu^-$ on $B$. Hence, there exist functions $f_A$ and $f_B$ such that

$$\nu(E) = \int_E f_A \, d\mu^+ \quad \text{for every } E \in \mathcal{S},\ E \subset A,$$

$$\nu(E) = \int_E f_B \, d\mu^- \quad \text{for every } E \in \mathcal{S},\ E \subset B.$$

Now, if $E \in \mathcal{S}$, we have

$$\begin{aligned}
\nu(E) &= \nu(E \cap A) + \nu(E \cap B) \\
&= \int_{E \cap A} f_A \, d\mu^+ + \int_{E \cap B} f_B \, d\mu^- \\
&= \int_E (f_A\chi_A - f_B\chi_B) \, d\mu.
\end{aligned}$$

Combining all these, we get the following result.

Theorem 9.3.2 (Radon-Nikodym) Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ and $\nu$ be σ-finite signed measures defined on $\mathcal{S}$ such that $\nu \ll \mu$. Then, there exists a measurable function $f$ defined on $X$ such that

$$\nu(E) = \int_E f \, d\mu, \tag{9.3.1}$$

for every $E \in \mathcal{S}$. ∎

Definition 9.3.1 Let $(X, \mathcal{S})$ be a measurable space and let $\mu$ and $\nu$ be σ-finite signed measures defined on $\mathcal{S}$ such that $\nu \ll \mu$. The function $f$ occurring in (9.3.1) is called the Radon-Nikodym derivative of $\nu$ with respect to $\mu$ and we formally write $f = \frac{d\nu}{d\mu}$.

Proposition 9.3.1 Let $(X, \mathcal{S})$ be a measurable space and let $\lambda$ and $\mu$ be σ-finite measures defined on $\mathcal{S}$. Let $\mu \ll \lambda$. Let $\nu$ be a σ-finite signed measure defined on $\mathcal{S}$ such that $\nu \ll \mu$. Then

$$\frac{d\nu}{d\lambda} = \frac{d\nu}{d\mu} \, \frac{d\mu}{d\lambda},$$

almost everywhere (with respect to the measure $\lambda$).

Proof: As usual, by considering the upper and lower variations $\nu^\pm$ separately, we may assume that $\nu$ is a measure, without loss of generality. Thus, let $f = \frac{d\nu}{d\mu} \ge 0$ and let $g = \frac{d\mu}{d\lambda} \ge 0$. By virtue of Proposition 5.2.7, we have, for $E \in \mathcal{S}$,

$$\nu(E) = \int_E f \, d\mu = \int_E f g \, d\lambda,$$

which completes the proof. ∎
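On a finite set, where every measure is given by its point masses, the Radon-Nikodym derivative is simply a ratio of point masses and the chain rule of Proposition 9.3.1 can be verified by hand. The following Python sketch illustrates this; the data and the helper names `derivative` and `integrate` are assumptions chosen for illustration, not taken from the text.

```python
from fractions import Fraction

# Point masses of three measures on X = {0, 1, 2, 3}; the numbers are
# illustrative assumptions chosen so that nu << mu << lam.
lam = [Fraction(1), Fraction(2), Fraction(1), Fraction(4)]
mu = [Fraction(3), Fraction(0), Fraction(2), Fraction(1)]
nu = [Fraction(6), Fraction(0), Fraction(1), Fraction(0)]

def derivative(top, bottom):
    """Density d(top)/d(bottom) for point masses, assuming top << bottom.

    Where bottom vanishes the value is arbitrary (a null set); 0 is used.
    """
    assert all(t == 0 for t, b in zip(top, bottom) if b == 0), "not absolutely continuous"
    return [t / b if b != 0 else Fraction(0) for t, b in zip(top, bottom)]

def integrate(f, measure, E):
    """Integral of f over the subset E with respect to the given point masses."""
    return sum(f[x] * measure[x] for x in E)

f = derivative(nu, mu)    # d(nu)/d(mu)
g = derivative(mu, lam)   # d(mu)/d(lam)
h = derivative(nu, lam)   # d(nu)/d(lam)

E = [0, 2, 3]
print(integrate(f, mu, E) == sum(nu[x] for x in E))   # nu(E) recovered from f
print(all(h[x] == f[x] * g[x] for x in range(4)))     # chain rule holds pointwise
```

Exact rational arithmetic is used so that the identities hold with equality rather than up to rounding.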

9.4 Singularity

If a measure is absolutely continuous with respect to another, then the


former vanishes whenever the latter vanishes. We now consider the
opposite notion.

Definition 9.4.1 Let (X, S) be a measurable space and let µ and ν be


two measures defined on S. We say that ν is singular with respect to
µ, and we write ν ⊥ µ, if there exists E ∈ S such that µ(E) = 0 and
ν ≡ 0 on E c , i.e. if F ⊂ E c , F ∈ S, then ν(F ) = 0. 

Example 9.4.1 Let m1 be the Lebesgue measure on R and let δ be the


Dirac measure concentrated at the origin. Then, if we set E = {0}, we
have m1 (E) = 0 and δ ≡ 0 on E c . Thus δ ⊥ m1 . 

Example 9.4.2 Let (X, S) be a measurable space equipped with a


signed measure µ. Then µ+ ⊥ µ− and µ− ⊥ µ+ . 

Theorem 9.4.1 (Lebesgue decomposition) Let (X, S) be a measurable


space and let µ and ν be two σ-finite measures defined on S. Then,
there exist two uniquely determined measures ν0 and ν1 such that ν =
ν0 + ν1 , ν0 ⊥ µ and ν1 << µ.

Proof: Since $\nu \ll \mu + \nu$, there exists a non-negative function $f$ such that

$$\nu(E) = \int_E f \, d(\mu + \nu),$$

for every $E \in \mathcal{S}$. Set

$$A = \{x \in X \mid f(x) \ge 1\} \quad \text{and} \quad B = A^c.$$

Then,

$$\nu(A) \ge \int_A d(\mu + \nu) = \mu(A) + \nu(A),$$

whence we deduce that $\mu(A) = 0$. Define, for $E \in \mathcal{S}$,

$$\nu_0(E) = \nu(E \cap A) \quad \text{and} \quad \nu_1(E) = \nu(E \cap B).$$

Then $\nu = \nu_0 + \nu_1$ and $\nu_0 \perp \mu$. We will now show that $\nu_1 \ll \mu$, which will establish the decomposition of $\nu$.

Let $E \in \mathcal{S}$ be such that $\mu(E) = 0$. Then

$$\nu_1(E) = \int_{E \cap B} d\nu = \int_{E \cap B} f \, d\mu + \int_{E \cap B} f \, d\nu.$$

Since $\mu(E) = 0$, we have that $\int_{E \cap B} f \, d\mu = 0$. Thus we get

$$\int_{E \cap B} (1 - f) \, d\nu = 0.$$

But on $B$, we have that $0 \le f < 1$. Hence it follows that $\nu(E \cap B) = 0$, i.e. $\nu_1(E) = 0$. This shows that $\nu_1 \ll \mu$.

To complete the proof, we need to show that $\nu_0$ and $\nu_1$ are uniquely determined. Let $\nu = \nu_0 + \nu_1 = \widetilde{\nu}_0 + \widetilde{\nu}_1$ be two Lebesgue decompositions of $\nu$ with respect to $\mu$. Let $A$ (respectively, $\widetilde{A}$) be such that $\mu(A) = \mu(\widetilde{A}) = 0$ and $\nu_0 \equiv 0$ on $A^c$ (respectively, $\widetilde{\nu}_0 \equiv 0$ on $\widetilde{A}^c$). Then $\mu(A \cup \widetilde{A}) = 0$ and, on $A^c \cap \widetilde{A}^c$, $\nu_0$ and $\widetilde{\nu}_0$ are both zero. Now, set

$$\lambda = \nu_0 - \widetilde{\nu}_0 = \widetilde{\nu}_1 - \nu_1.$$

Then $\lambda = \widetilde{\nu}_1 - \nu_1$ is clearly absolutely continuous with respect to $\mu$ and so we have that $\lambda$ vanishes on $A \cup \widetilde{A}$ and on all of its measurable subsets. On the other hand, we have just seen above that $\lambda = \nu_0 - \widetilde{\nu}_0$ vanishes identically on the complement of $A \cup \widetilde{A}$. Thus $\lambda \equiv 0$, which proves the uniqueness of the Lebesgue decomposition. ∎
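The construction in the proof can be traced on a finite set, where measures are lists of point masses. The Python sketch below follows the same recipe (form f = dν/d(µ+ν), set A = {f ≥ 1}, split ν along A and its complement); the data and the function name `lebesgue_decomposition` are illustrative assumptions.

```python
from fractions import Fraction

# Illustrative point masses on X = {0, 1, 2, 3} (assumed data).
mu = [Fraction(1), Fraction(0), Fraction(2), Fraction(0)]
nu = [Fraction(3), Fraction(5), Fraction(0), Fraction(1)]

def lebesgue_decomposition(mu, nu):
    """Split nu = nu0 + nu1 with nu0 singular to mu and nu1 << mu.

    Follows the proof: f = d(nu)/d(mu + nu), A = {f >= 1}, B = A^c.
    """
    total = [m + n for m, n in zip(mu, nu)]
    # f is only determined where (mu + nu) charges the point; use 0 elsewhere.
    f = [n / t if t != 0 else Fraction(0) for n, t in zip(nu, total)]
    A = [x for x in range(len(f)) if f[x] >= 1]
    nu0 = [nu[x] if x in A else Fraction(0) for x in range(len(nu))]
    nu1 = [nu[x] if x not in A else Fraction(0) for x in range(len(nu))]
    return A, nu0, nu1

A, nu0, nu1 = lebesgue_decomposition(mu, nu)
print(A, nu0, nu1)
```

In this discrete setting A is exactly the set of points charged by ν but not by µ, which is why ν0 is carried by a µ-null set.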

9.5 Exercises

9.1 Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. If $\mu$ is finite, show that, for every $E \in \mathcal{S}$,

$$|\mu|(E) = \sup\left\{ \int_E f \, d\mu \;\Big|\; f : X \to \mathbb{R} \text{ measurable},\ |f| \le 1 \right\}.$$

9.2 Let (X, S) be a measurable space equipped with a signed measure


µ. Let µi , i = 1, 2 be two measures defined on S such that µ = µ1 − µ2 .
Show that µ+ ≤ µ1 and that µ− ≤ µ2 .

9.3 Let (X, S) be a measurable space equipped with a signed measure


µ. Let E ∈ S. Show that |µ|(E) = 0 if, and only if, µ(F ) = 0 for every
measurable subset F of E.

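9.4 Let $(X, \mathcal{S})$ be a measurable space equipped with a signed measure $\mu$. Compute $\frac{d\mu^+}{d|\mu|}$ and $\frac{d\mu^-}{d|\mu|}$.

9.5 Let $(X, \mathcal{S})$ be a measurable space. Let $\mu$ and $\nu$ be finite measures defined on $\mathcal{S}$. Let $f = \frac{d\nu}{d(\mu + \nu)}$. Show that, for every $E \in \mathcal{S}$, we have

$$\nu(E) = \int_E \frac{f}{1 - f} \, d\mu.$$

9.6 Let $\nu$ be any σ-finite signed measure on the measurable space $(\mathbb{N}, \mathcal{P}(\mathbb{N}))$. Let $\mu$ be the counting measure on this measurable space. Show that $\nu \ll \mu$ and compute $\frac{d\nu}{d\mu}$.

9.7 Let $(X, \mathcal{S})$ be a measurable space. Let $\mu$ and $\nu$ be σ-finite measures defined on $\mathcal{S}$ such that $\mu \equiv \nu$. Show that

$$\frac{d\mu}{d\nu} = \left( \frac{d\nu}{d\mu} \right)^{-1}$$

almost everywhere (with respect to $\mu$, and hence, with respect to $\nu$ as well).

9.8 Let $(X, \mathcal{S})$ be a measurable space. Let $\mu$ and $\nu$ be σ-finite measures defined on $\mathcal{S}$ such that $\nu \ll \mu$. Show that

$$\nu\left( \left\{ x \in X \;\Big|\; \frac{d\nu}{d\mu}(x) = 0 \right\} \right) = 0.$$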

Chapter 10

Lp spaces

10.1 Basic properties

The Lebesgue spaces, also known as the $L^p$ spaces, constitute a rich source of examples and counter-examples in functional analysis. They also form an important class of function spaces when studying the applications of mathematical analysis.

Definition 10.1.1 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $f : X \to \mathbb{R}$ be a measurable function. Let $1 \le p < \infty$. We define

$$\|f\|_p = \left( \int_X |f|^p \, d\mu \right)^{\frac{1}{p}} \tag{10.1.1}$$

and we say that $f$ is p-integrable (integrable, if $p = 1$ and square integrable, if $p = 2$) if $\|f\|_p < +\infty$. ∎

Let $M > 0$. We set

$$\{|f| > M\} = \{x \in X \mid |f(x)| > M\}.$$

We now define (cf. Definition 3.3.1)

$$\|f\|_\infty = \inf\{M > 0 \mid \mu(\{|f| > M\}) = 0\}. \tag{10.1.2}$$

Definition 10.1.2 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $f : X \to \mathbb{R}$ be a measurable function. We say that $f$ is essentially bounded if $\|f\|_\infty < +\infty$. ∎

© Hindustan Book Agency 2019 and Springer Nature Singapore Pte Ltd. 2019 196
S. Kesavan, Measure and Integration, Texts and Readings in
Mathematics 77, https://doi.org/10.1007/978-981-13-6678-9_10

Definition 10.1.3 Let $1 < p < \infty$. The conjugate exponent of $p$, denoted $p'$, is given by the relation:

$$\frac{1}{p} + \frac{1}{p'} = 1.$$

If $p = 1$ we define its conjugate exponent as $\infty$ and vice-versa. ∎

Lemma 10.1.1 Let $1 < p < \infty$. Let $p'$ be its conjugate exponent. Then, if $a$ and $b$ are non-negative real numbers, we have

$$a^{\frac{1}{p}} b^{\frac{1}{p'}} \le \frac{a}{p} + \frac{b}{p'}. \tag{10.1.3}$$

Proof: Let $t \ge 1$. Consider the function

$$f(t) = k(t - 1) - t^k + 1,$$

where $k \in (0, 1)$. Then $f'(t) = k(1 - t^{k-1}) \ge 0$ since $0 < k < 1$. Thus, $f$ is an increasing function on $[1, +\infty)$ and, since $f(1) = 0$, we deduce that

$$t^k \le k(t - 1) + 1, \tag{10.1.4}$$

for $t \ge 1$ and for $0 < k < 1$.

If $a$ or $b$ is zero, then (10.1.3) is obviously true. Let us assume, without loss of generality (since $p$ and $p'$ are conjugate exponents of each other), that $a \ge b > 0$. Then (10.1.3) follows from (10.1.4) on setting $t = \frac{a}{b}$, $k = \frac{1}{p}$ and using the relation between $p$ and $p'$. ∎

Proposition 10.1.1 (Hölder’s inequality) Let $1 \le p < \infty$ and let $p'$ be the conjugate exponent. If $f$ is p-integrable and $g$ is $p'$-integrable (essentially bounded, if $p = 1$), then

$$\int_X |fg| \, d\mu \le \|f\|_p \|g\|_{p'}. \tag{10.1.5}$$

Proof: If $p = 1$, then $p' = \infty$. Then

$$|f(x)g(x)| \le |f(x)| \cdot \|g\|_\infty$$

for almost every $x \in X$ and then (10.1.5) follows on integrating this inequality over $X$.

Let us now assume that $1 < p < \infty$ so that $1 < p' < \infty$ as well. The relation (10.1.5) is trivially true if $\|f\|_p$ (respectively, $\|g\|_{p'}$) equals zero, for then $f$ (respectively, $g$) will be equal to zero almost everywhere. So we assume further that $\|f\|_p \neq 0$ and that $\|g\|_{p'} \neq 0$. Then, by Lemma 10.1.1,

$$|f(x)g(x)| \le \frac{1}{p}|f(x)|^p + \frac{1}{p'}|g(x)|^{p'}$$

for all $x \in X$. Assume now that $\|f\|_p = \|g\|_{p'} = 1$. Then, integrating the above inequality over $X$, we get

$$\int_X |fg| \, d\mu \le \frac{1}{p} + \frac{1}{p'} = 1.$$

For the general case, apply the preceding result to the functions $f/\|f\|_p$ and $g/\|g\|_{p'}$ to get (10.1.5). ∎
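For the counting measure on a finite set, Hölder's inequality is a statement about finite sums and can be spot-checked numerically. The Python sketch below does this for randomly generated vectors; the exponent, the sample size and the helper name `p_norm` are illustrative assumptions, and a numerical check of this kind is of course no substitute for the proof.

```python
import random

def p_norm(v, p):
    """(sum |v_i|^p)^(1/p): the L^p norm for the counting measure on a finite set."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

random.seed(0)
p = 3.0
p_conj = p / (p - 1.0)          # conjugate exponent: 1/p + 1/p_conj = 1
f = [random.uniform(-1, 1) for _ in range(50)]
g = [random.uniform(-1, 1) for _ in range(50)]

lhs = sum(abs(a * b) for a, b in zip(f, g))     # integral of |fg|
rhs = p_norm(f, p) * p_norm(g, p_conj)          # ||f||_p ||g||_{p'}
print(lhs <= rhs)
```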

Remark 10.1.1 When $p = p' = 2$, the inequality (10.1.5) is known as the Cauchy-Schwarz inequality. ∎

Proposition 10.1.2 (Minkowski’s Inequality) Let $1 \le p \le \infty$. Let $f$ and $g$ be p-integrable (essentially bounded, if $p = \infty$). Then $f + g$ is also p-integrable (essentially bounded, if $p = \infty$) and

$$\|f + g\|_p \le \|f\|_p + \|g\|_p. \tag{10.1.6}$$

Proof: Let $1 < p < \infty$. We assume that $\|f + g\|_p \neq 0$, since, otherwise, the result is trivially true. Since the function $t \mapsto |t|^p$ is convex for $1 \le p < \infty$, we have that

$$|f(x) + g(x)|^p \le 2^{p-1}(|f(x)|^p + |g(x)|^p)$$

from which it follows that $f + g$ is also p-integrable. Thus, if $1 < p < \infty$, we have

$$\int_X |f + g|^p \, d\mu \le \int_X |f + g|^{p-1}|f| \, d\mu + \int_X |f + g|^{p-1}|g| \, d\mu.$$

We apply Hölder’s inequality to each of the terms on the right-hand side. Notice that $|f(x) + g(x)|^{(p-1)p'} = |f(x) + g(x)|^p$ by the definition of $p'$. Thus $|f + g|^{p-1}$ is $p'$-integrable and

$$\| \, |f + g|^{p-1} \|_{p'} = \|f + g\|_p^{\frac{p}{p'}}.$$

Thus,

$$\|f + g\|_p^p \le \|f + g\|_p^{\frac{p}{p'}} (\|f\|_p + \|g\|_p).$$

Dividing both sides by $\|f + g\|_p^{\frac{p}{p'}}$ and using, once again, the definition of $p'$, we get (10.1.6). The cases where $p = 1$ and $p = \infty$ follow trivially from the inequality

$$|f(x) + g(x)| \le |f(x)| + |g(x)|.$$

This completes the proof. ∎
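Minkowski's inequality can likewise be spot-checked for the counting measure on a finite set. In the Python sketch below, the helper names `p_norm` and `minkowski_gap` and the random data are illustrative assumptions; the gap is the right-hand side of (10.1.6) minus the left-hand side.

```python
import random

def p_norm(v, p):
    """L^p norm for the counting measure; p = inf gives the maximum."""
    if p == float("inf"):
        return max(abs(t) for t in v)
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def minkowski_gap(f, g, p):
    """||f||_p + ||g||_p - ||f + g||_p, which (10.1.6) says is non-negative."""
    s = [a + b for a, b in zip(f, g)]
    return p_norm(f, p) + p_norm(g, p) - p_norm(s, p)

random.seed(1)
f = [random.uniform(-2, 2) for _ in range(40)]
g = [random.uniform(-2, 2) for _ in range(40)]
for p in (1.0, 1.5, 2.0, 4.0, float("inf")):
    # a tiny tolerance absorbs floating-point rounding
    print(p, minkowski_gap(f, g, p) >= -1e-12)
```

When g is a positive multiple of f the gap is zero up to rounding, which shows that (10.1.6) cannot be improved in general.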

It is now easy to see that the space of all p-integrable functions ($1 \le p < \infty$) and that of all essentially bounded functions are vector spaces and that the map $f \mapsto \|f\|_p$ for $1 \le p \le \infty$ verifies all the properties of the norm, except that $\|f\|_p = 0$ does not imply that $f = 0$, but that $f = 0$ almost everywhere.

Given two measurable functions $f$ and $g$, we say that $f \sim g$ if $f = g$ almost everywhere. This defines an equivalence relation. If $f \sim g$, then for $1 \le p \le \infty$, we have that $\|f\|_p = \|g\|_p$. Further the set of all equivalence classes forms a vector space with respect to pointwise addition and scalar multiplication defined via arbitrary representatives of the equivalence classes. In other words, if $f_1 \sim f_2$ and if $g_1 \sim g_2$, then $f_1 + g_1 \sim f_2 + g_2$ and, for any scalar $\alpha$, we also have $\alpha f_1 \sim \alpha f_2$ and so on. Since $\|\cdot\|_p$ is also constant on any equivalence class, we can define the ‘norm’ of an equivalence class via any representative function of that class. Further, if $\|f\|_p = 0$, then $f$ will belong to the equivalence class of the function which is identically zero. Thus the set of all equivalence classes, with $\|\cdot\|_p$, becomes a normed linear space.

Definition 10.1.4 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $1 \le p < \infty$. The space of all equivalence classes, under the equivalence relation defined by equality of functions almost everywhere, of all p-integrable functions is a normed linear space with the norm of an equivalence class being the $\|\cdot\|_p$-‘norm’ of any representative of that class. This space is denoted $L^p(\mu)$. The space of all equivalence classes of all essentially bounded functions with the norm of an equivalence class being defined as the $\|\cdot\|_\infty$-‘norm’ of any representative of that class, is denoted $L^\infty(\mu)$. ∎

While we may often talk of ‘$L^p$-functions’, we must keep in mind that we are really talking about equivalence classes of functions and that we carry out computations via representatives of those equivalence classes.

Notation If $X = \Omega$, a non-empty open set of $\mathbb{R}^N$, provided with the Lebesgue measure $m_N$, then the corresponding $L^p$ spaces will be denoted $L^p(\Omega)$. In particular, if $\mathbb{R}$ is provided with the Lebesgue measure $m_1$, and if $(a, b)$ is an interval, where $-\infty \le a < b \le +\infty$, then the $L^p$ spaces on $(a, b)$ will be denoted $L^p(a, b)$.

Example 10.1.1 Let $X = \{1, 2, \cdots, N\}$. Let $\mathcal{S}$ be the collection of all subsets of $X$ and let $\mu$ be the counting measure. Then a measurable function can be identified with an N-tuple $(a_1, a_2, \cdots, a_N)$. In this case $L^p(\mu) = \mathbb{R}^N$ equipped with the norm

$$\|x\|_p = \left( \sum_{i=1}^{N} |x_i|^p \right)^{\frac{1}{p}},$$

if $1 \le p < \infty$, and with the norm

$$\|x\|_\infty = \max_{1 \le i \le N} |x_i|,$$

if $p = \infty$, where $x = (x_1, \cdots, x_N) \in \mathbb{R}^N$. Notice that in this example, equality almost everywhere is the same as equality everywhere and so every equivalence class is a singleton. ∎

Example 10.1.2 Let $X = \mathbb{N}$, and let $\mathcal{S}$ be the collection of all subsets of $X$. Let $\mu$ be the counting measure. In this case, functions are identified with real sequences and $L^p(\mu) = \ell^p$, the space of all real sequences $x = (x_k)$ for which the norm below is finite, equipped with the norm

$$\|x\|_p = \left( \sum_{k=1}^{\infty} |x_k|^p \right)^{\frac{1}{p}},$$

if $1 \le p < \infty$ and with the norm

$$\|x\|_\infty = \sup_{k \in \mathbb{N}} |x_k|,$$

if $p = \infty$. Again, in this example, equivalence classes are just singletons.

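Proposition 10.1.3 Let $(X, \mathcal{S}, \mu)$ be a finite measure space. Then

$$L^p(\mu) \subset L^q(\mu)$$

with the inclusion being continuous, whenever $1 \le q \le p$.

Proof: The result is trivial if $p = \infty$. Let $1 \le q < p < \infty$ and let $f \in L^p(\mu)$. Then, by Hölder’s inequality, we have

$$\begin{aligned}
\int_X |f|^q \, d\mu &\le \left( \int_X (|f|^q)^{\frac{p}{q}} \, d\mu \right)^{\frac{q}{p}} \left( \int_X d\mu \right)^{1 - \frac{q}{p}} \\
&= \left( \int_X |f|^p \, d\mu \right)^{\frac{q}{p}} (\mu(X))^{1 - \frac{q}{p}} \\
&= \|f\|_p^q \, (\mu(X))^{1 - \frac{q}{p}},
\end{aligned}$$

which yields

$$\|f\|_q \le C \|f\|_p$$

where

$$C = (\mu(X))^{\frac{1}{q} - \frac{1}{p}}.$$

This completes the proof. ∎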
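The constant C in the proof can be checked on a finite measure space given by point masses. The following Python sketch uses assumed weights and an assumed function; `weighted_p_norm` is a hypothetical helper name.

```python
def weighted_p_norm(f, weights, p):
    """(sum |f_i|^p w_i)^(1/p) for the measure with point masses w_i."""
    return sum(abs(t) ** p * w for t, w in zip(f, weights)) ** (1.0 / p)

weights = [0.5, 1.5, 0.25, 2.0]      # assumed point masses; mu(X) = 4.25
f = [3.0, -1.0, 7.0, 0.5]
q, p = 1.5, 4.0                      # 1 <= q < p
total_mass = sum(weights)
C = total_mass ** (1.0 / q - 1.0 / p)

lhs = weighted_p_norm(f, weights, q)        # ||f||_q
rhs = C * weighted_p_norm(f, weights, p)    # C ||f||_p
print(lhs <= rhs)

# For a constant function both sides agree (up to rounding), so C is sharp.
one = [1.0] * len(weights)
print(weighted_p_norm(one, weights, q), C * weighted_p_norm(one, weights, p))
```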

Example 10.1.3 No such inclusions hold in infinite measure spaces. For instance, the sequence $(\frac{1}{n})$ belongs to $\ell^2$ but not to $\ell^1$. ∎

Example 10.1.4 Nothing can be said about the reverse inclusions. For example, if $f(x) = 1/\sqrt{x}$, then $f \in L^1(0, 1)$ but $f \notin L^2(0, 1)$. ∎
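Both examples rest on elementary computations, which the Python sketch below illustrates (the function names are illustrative assumptions). The partial sums of 1/n² remain below π²/6 while those of 1/n grow like log N; and over (ε, 1) the integral of x^(-1/2) equals 2(1 − √ε), which stays below 2, while the integral of its square 1/x equals −log ε, which is unbounded as ε → 0.

```python
import math

def partial_sum(power, N):
    """sum_{n=1}^{N} n^(-power)."""
    return sum(n ** (-power) for n in range(1, N + 1))

for N in (10, 1000, 100000):
    # sum 1/n^2 stays below pi^2/6, while sum 1/n grows like log N
    print(N, partial_sum(2, N), partial_sum(1, N))

def truncated_integrals(eps):
    """Closed forms of the integrals over (eps, 1) of x^(-1/2) and of x^(-1)."""
    return 2.0 * (1.0 - math.sqrt(eps)), -math.log(eps)

for eps in (1e-2, 1e-6, 1e-12):
    print(eps, truncated_integrals(eps))
```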

Example 10.1.5 While we cannot say anything, in general, about the reverse inclusions, as mentioned in the previous example, nevertheless, if $1 \le p < q \le \infty$, we do have the continuous inclusion

$$\ell^p \hookrightarrow \ell^q.$$

In fact, we have that, if $x \in \ell^p$, then

$$\|x\|_q \le \|x\|_p.$$

Let $q = \infty$. If $x \in \ell^p$, where $x = (x_1, \cdots, x_i, \cdots)$, we have

$$|x_i| \le \|x\|_p,$$

for each $i$. Consequently,

$$\|x\|_\infty = \sup_i |x_i| \le \|x\|_p.$$

Let $1 \le p < q < \infty$. Let $x \in \ell^p$. Assume, for the moment, that $\|x\|_p = 1$. Then, if $x = (x_1, \cdots, x_i, \cdots)$, we have seen that $|x_i| \le 1$ for all $i$. Now

$$\sum_{i=1}^{\infty} |x_i|^q = \sum_{i=1}^{\infty} |x_i|^p |x_i|^{q-p} \le \sum_{i=1}^{\infty} |x_i|^p = 1.$$

Thus, $x \in \ell^q$ and $\|x\|_q \le 1 = \|x\|_p$. Now, if $x \in \ell^p$ is non-zero, consider $y = \frac{x}{\|x\|_p}$. Then $\|y\|_p = 1$. Consequently, $y \in \ell^q$ and $\|y\|_q \le 1$. Thus, it follows that $x \in \ell^q$ as well (since it is a constant multiple of $y$) and

$$\frac{\|x\|_q}{\|x\|_p} \le 1,$$

which establishes our claim. ∎
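The monotonicity of the norms in p can be observed on a finitely supported sequence. The Python sketch below uses an assumed sequence and a hypothetical helper `p_norm`; the computed norms should be non-increasing as the exponent grows.

```python
def p_norm(x, p):
    """l^p norm of a finitely supported sequence; p = inf gives the supremum."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

x = [1.0 / n for n in range(1, 201)]     # a finitely supported sequence (assumed data)
exponents = [1.0, 1.5, 2.0, 3.0, 10.0, float("inf")]
norms = [p_norm(x, p) for p in exponents]
print(norms)   # non-increasing as the exponent increases
```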

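Remark 10.1.2 Let $(X, \mathcal{S}, \mu)$ be a finite measure space and let $f \in L^\infty(\mu)$, $f \neq 0$. Then, we have seen that $f \in L^p(\mu)$ for all $1 \le p \le \infty$. Let $1 \le p < \infty$. Then

$$\int_X |f|^p \, d\mu \le \|f\|_\infty^{p-1} \|f\|_1.$$

Thus,

$$\|f\|_p \le \|f\|_\infty^{\frac{p-1}{p}} \|f\|_1^{\frac{1}{p}}.$$

Consequently,

$$\limsup_{p \to \infty} \|f\|_p \le \|f\|_\infty.$$

Now let $0 < \varepsilon < \|f\|_\infty$ and let

$$E = \{x \in X \mid |f(x)| > \|f\|_\infty - \varepsilon > 0\}.$$

Then $\mu(E) > 0$ and

$$\int_X |f|^p \, d\mu \ge \int_E |f|^p \, d\mu > (\|f\|_\infty - \varepsilon)^p \mu(E).$$

Then

$$\|f\|_p \ge (\|f\|_\infty - \varepsilon)(\mu(E))^{\frac{1}{p}},$$

from which we get, on letting $p \to \infty$ and then using the fact that $\varepsilon$ was arbitrary,

$$\liminf_{p \to \infty} \|f\|_p \ge \|f\|_\infty.$$

Thus, we have

$$\lim_{p \to \infty} \|f\|_p = \|f\|_\infty.$$

This justifies, to some extent, the notation $\|f\|_\infty$ for the norm given by the essential supremum. ∎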
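The convergence of the p-norms to the essential supremum can be watched numerically on a small finite measure space. In the Python sketch below the point masses, the function and the helper name `weighted_p_norm` are assumptions for illustration; the exponents are kept moderate so that the powers stay within floating-point range.

```python
def weighted_p_norm(f, weights, p):
    """(sum |f_i|^p w_i)^(1/p) for the measure with point masses w_i."""
    return sum(abs(t) ** p * w for t, w in zip(f, weights)) ** (1.0 / p)

weights = [0.2, 0.3, 0.5]    # a finite measure on three points (assumed data)
f = [1.0, -3.0, 2.0]
# every point has positive mass, so the essential supremum is the maximum
sup_norm = max(abs(t) for t, w in zip(f, weights) if w > 0)

for p in (1, 2, 10, 100, 400):
    print(p, weighted_p_norm(f, weights, p))
print("sup norm:", sup_norm)
```

Here the p-norm behaves like 3 · (0.3)^(1/p) for large p, so it approaches the sup norm 3 from below.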

A Cauchy sequence $\{f_n\}_{n=1}^{\infty}$ in $L^p(\mu)$ is a sequence such that, given any $\varepsilon > 0$, there exists $N \in \mathbb{N}$ satisfying $\|f_n - f_m\|_p < \varepsilon$ whenever $n$ and $m$ are larger than $N$. If $f \in L^p(\mu)$, we say that the sequence converges to $f$ in $L^p(\mu)$ if $\|f_n - f\|_p \to 0$ as $n \to \infty$.

Lemma 10.1.2 Let $1 \le p < \infty$. Let $(X, \mathcal{S}, \mu)$ be a measure space. If $\{f_n\}_{n=1}^{\infty}$ is a Cauchy sequence in $L^p(\mu)$, then the sequence is Cauchy in measure.

Proof: Let $\varepsilon > 0$ be fixed. For positive integers $n$ and $m$, set

$$A_{n,m}(\varepsilon) = \{x \in X \mid |f_n(x) - f_m(x)| \ge \varepsilon\}.$$

Then

$$\int_X |f_n - f_m|^p \, d\mu \ge \int_{A_{n,m}(\varepsilon)} |f_n - f_m|^p \, d\mu \ge \varepsilon^p \mu(A_{n,m}(\varepsilon)).$$

Thus,

$$\mu(A_{n,m}(\varepsilon)) \le \frac{\|f_n - f_m\|_p^p}{\varepsilon^p},$$

and, since the right-hand side of the above inequality can be made arbitrarily small for large $n$ and $m$, the same is true for $\mu(A_{n,m}(\varepsilon))$ as well and that completes the proof. ∎
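The estimate used in this proof is a Chebyshev-type inequality, and it can be spot-checked on a finite measure space. In the Python sketch below, `h` plays the role of the difference of two terms of the sequence; the data and the helper name `chebyshev_sides` are illustrative assumptions.

```python
def chebyshev_sides(h, weights, p, eps):
    """Return (mu({|h| >= eps}), ||h||_p^p / eps^p) for point masses `weights`."""
    measure_of_set = sum(w for t, w in zip(h, weights) if abs(t) >= eps)
    bound = sum(abs(t) ** p * w for t, w in zip(h, weights)) / eps ** p
    return measure_of_set, bound

h = [0.1, -0.7, 0.05, 1.2, -0.3]       # plays the role of f_n - f_m (assumed data)
weights = [1.0, 0.5, 2.0, 0.25, 1.0]
for eps in (0.2, 0.5, 1.0):
    lhs, rhs = chebyshev_sides(h, weights, 2, eps)
    print(eps, lhs, rhs, lhs <= rhs)
```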

Theorem 10.1.1 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $1 \le p \le \infty$. Then $L^p(\mu)$ is a Banach space.

Proof: We need to show that every Cauchy sequence in $L^p(\mu)$ is convergent.

Case 1. Let $1 \le p < \infty$. Let $\{f_n\}_{n=1}^{\infty}$ be a Cauchy sequence in $L^p(\mu)$. Then we saw, in the preceding lemma, that the sequence is Cauchy in measure. Then, by Proposition 4.2.7, there exists a subsequence $\{f_{n_k}\}$ which is almost uniformly Cauchy and, hence, by Proposition 4.1.2, there exists a measurable function $f$ such that $f_{n_k} \to f$ pointwise almost everywhere.

Let $\varepsilon > 0$. Let $N \in \mathbb{N}$ be such that $\|f_n - f_m\|_p < \varepsilon$ for all $n, m \ge N$. We have, by Fatou’s lemma,

$$\int_X |f - f_n|^p \, d\mu \le \liminf_{k \to \infty} \int_X |f_{n_k} - f_n|^p \, d\mu \le \varepsilon^p,$$

for all $n \ge N$. Thus, it follows that $f - f_n \in L^p(\mu)$ and so $f \in L^p(\mu)$ as well. Further, it also shows that $f_n \to f$ in $L^p(\mu)$.

Case 2. Let $p = \infty$. Let $\{f_n\}_{n=1}^{\infty}$ be Cauchy in $L^\infty(\mu)$. Then, for each $k$, there exists a positive integer $N_k$ such that

$$\|f_m - f_n\|_\infty < \frac{1}{k}$$

for all $m, n \ge N_k$. Thus, there exists a set $E_k$ of measure zero, such that

$$|f_m(x) - f_n(x)| \le \frac{1}{k}$$

for all $m, n \ge N_k$ and for all $x \in X \setminus E_k$. Setting $E = \cup_{k=1}^{\infty} E_k$, we see that $E$ is of measure zero and for all $x \in X \setminus E$, the sequence $\{f_n(x)\}$ is a Cauchy sequence in $\mathbb{R}$. Thus, for all such $x$, $f_n(x) \to f(x)$. Passing to the limit as $m \to \infty$, we see that, for all $x \in X \setminus E$, and for all $n \ge N_k$,

$$|f(x) - f_n(x)| \le \frac{1}{k}.$$

Hence, it follows that $f$ is essentially bounded and that $f_n \to f$ in $L^\infty(\mu)$. This completes the proof. ∎

Corollary 10.1.1 Let (X, S, µ) be a measure space and let fn → f in


Lp (µ) for some 1 ≤ p ≤ ∞. Then, there exists a subsequence {fnk } such
that fnk (x) → f (x) almost everywhere.

Proof: If 1 ≤ p < ∞, we have already proved this in the course of the


proof of the preceding theorem. If p = ∞, then we have that fn → f
pointwise almost everywhere. 

Remark 10.1.3 An explicit construction of a subsequence, which con-


verges pointwise almost everywhere, of a Cauchy sequence in Lp (µ), 1 ≤
p < ∞, can be found in Kesavan [5] and in Rudin [8]. This has the added

advantage that it also shows that the subsequence is bounded above by


a fixed function belonging to Lp (µ). 

Just as in Theorem 5.3.5, we can prove the $L^p$-convergence of a sequence converging pointwise almost everywhere and whose $L^p$-norm also converges.

Theorem 10.1.2 Let $1 \le p < \infty$. Let $\{f_n\}_{n=1}^{\infty}$ be a sequence in $L^p(\mu)$ converging pointwise almost everywhere to a function $f \in L^p(\mu)$. Then $f_n \to f$ in $L^p(\mu)$ if, and only if, $\|f_n\|_p \to \|f\|_p$ as $n \to \infty$.

Proof: Since the norm defines a continuous function on any normed linear space, it follows that if $f_n \to f$ in $L^p(\mu)$, we have that $\|f_n\|_p \to \|f\|_p$.

Conversely, let $f_n \to f$ pointwise almost everywhere and let $\|f_n\|_p \to \|f\|_p$. As in the case of Theorem 5.3.5, we apply the generalised dominated convergence theorem (cf. Theorem 5.3.4). Let $F_n = |f_n - f|^p$, which converges to zero almost everywhere. Since the function $t \mapsto |t|^p$ is convex, we get $F_n \le 2^{p-1}(|f_n|^p + |f|^p) = G_n$, say. Then $G_n$ is integrable and $G_n \to G = 2^p|f|^p$ pointwise almost everywhere. Further, by hypothesis, we get that $\int_X G_n \, d\mu \to \int_X G \, d\mu < +\infty$. Thus, by Theorem 5.3.4, we deduce that $\int_X F_n \, d\mu \to 0$, which is the same as saying that $f_n \to f$ in $L^p(\mu)$. ∎

10.2 Approximation

Let $\Omega \subset \mathbb{R}^N$ be a non-empty open set. Let $S$ denote the set of all real-valued simple functions defined on $\Omega$ which vanish outside a set of finite (Lebesgue) measure. If $1 \le p < \infty$, a simple function $\varphi$ belongs to $L^p(\Omega)$ if, and only if, $\varphi \in S$.

Lemma 10.2.1 Let $\Omega \subset \mathbb{R}^N$ be a non-empty open set and let $1 \le p < \infty$. Then $S$ is dense in $L^p(\Omega)$.

Proof: Let $f \in L^p(\Omega)$ be a non-negative function and let $\{\varphi_n\}_{n=1}^{\infty}$ be a sequence of non-negative simple functions increasing to $f$. Then, clearly $\varphi_n \in L^p(\Omega)$ for each $n$ and so $\varphi_n \in S$ as well.

Now, $|\varphi_n - f|^p \le 2^p|f|^p$, and since $|f|^p$ is integrable, it follows, from the dominated convergence theorem, that $\varphi_n \to f$ in $L^p(\Omega)$. If $f \in L^p(\Omega)$ is any generic function, then we can write $f = f^+ - f^-$ and $f^\pm$ are non-negative functions in $L^p(\Omega)$. Then, we can find sequences $\{\varphi_n\}_{n=1}^{\infty}$ and $\{\psi_n\}_{n=1}^{\infty}$ in $S$ such that $\varphi_n \to f^+$ and $\psi_n \to f^-$ in $L^p(\Omega)$. Thus $\varphi_n - \psi_n \in S$ for each positive integer $n$ and $\varphi_n - \psi_n \to f$ in $L^p(\Omega)$. This completes the proof. ∎

Lemma 10.2.2 Let $\Omega \subset \mathbb{R}^N$ be a non-empty open set and let $1 \le p < \infty$. Let $f \in S$. Then, $f$ can be approximated by step functions in $L^p(\Omega)$.

Proof: Let $E \subset \Omega$ be a measurable set of finite measure. By Proposition 2.2.5, given $\varepsilon > 0$, we can find a set $F \subset \Omega$ which is a finite disjoint union of boxes, such that $m_N(E \Delta F) < \varepsilon^p$. Then

$$\|\chi_E - \chi_F\|_p^p = m_N(E \Delta F) < \varepsilon^p$$

and so $\|\chi_E - \chi_F\|_p < \varepsilon$. Now, let $f \in S$. Then we can write $f = \sum_{j=1}^{k} \alpha_j \chi_{E_j}$, where the $\alpha_j$ are all non-zero and the $E_j$ are all mutually disjoint sets of finite measure. For each $1 \le j \le k$, we can find $F_j$, a finite disjoint union of boxes, such that

$$\|\chi_{E_j} - \chi_{F_j}\|_p < \frac{\varepsilon}{k|\alpha_j|}.$$

Then $\varphi = \sum_{j=1}^{k} \alpha_j \chi_{F_j}$ is a step function, and, by the triangle inequality, we have $\|f - \varphi\|_p < \varepsilon$. This completes the proof. ∎


Theorem 10.2.1 Let $\Omega \subset \mathbb{R}^N$ be a non-empty open set and let $1 \le p < \infty$. Let $C_c(\Omega)$ denote the space of continuous real-valued functions defined on $\Omega$, having compact support contained in $\Omega$. Then, $C_c(\Omega)$ is dense in $L^p(\Omega)$.

Proof: By Lemmas 10.2.1 and 10.2.2, it follows that step functions are dense in $L^p(\Omega)$. So it suffices to show that any step function in $L^p(\Omega)$ can be approximated, in $L^p(\Omega)$, by a continuous function with compact support. Let $f$ be a non-zero step function and let $\varepsilon > 0$ be given. Then, by Corollary 2.2.1, there exists $\varphi \in C_c(\Omega)$, such that

$$m_N(\{x \in \Omega \mid \varphi(x) \neq f(x)\}) < \left( \frac{\varepsilon}{2\|f\|_\infty} \right)^p$$

and such that

$$\|\varphi\|_\infty \le \|f\|_\infty.$$

Then

$$\|\varphi - f\|_p^p \le 2^p \|f\|_\infty^p \, m_N(\{x \in \Omega \mid \varphi(x) \neq f(x)\}) < \varepsilon^p$$

so that $\|\varphi - f\|_p < \varepsilon$. This completes the proof. ∎

Remark 10.2.1 The above result is not true when $p = \infty$. In fact, we can show that the closure of $C_c(\Omega)$ under the sup-norm (i.e. the norm $\|\cdot\|_\infty$), is the space of continuous functions which vanish at infinity, i.e. functions such that, given any $\varepsilon > 0$, there exists a compact set $K \subset \Omega$ such that $|f(x)| < \varepsilon$ for all $x \in \Omega \setminus K$. ∎

There are several interesting applications of the fact that Cc (Ω) is


dense in Lp (Ω) for 1 ≤ p < ∞. We will see a few in the next section.
To conclude this section, we discuss separability properties of the spaces
Lp (Ω).

Proposition 10.2.1 Let $\Omega \subset \mathbb{R}^N$ be a non-empty open set and let $1 \le p < \infty$. Then the space $L^p(\Omega)$ is separable.

Proof: The set $\Omega$ can be expressed as the increasing union of compact sets $K_n$, $n \in \mathbb{N}$. Given $f \in L^p(\Omega)$, we can approximate it by a continuous function with compact support, say $\varphi$, and the support of $\varphi$ will lie in some $K_n$. By the Weierstrass approximation theorem, we can approximate $\varphi$ uniformly in $K_n$ by means of a polynomial and, in fact, by means of a polynomial whose coefficients are rational. Since $K_n$ is compact, it also means that these polynomials approximate $\varphi$ in the $L^p$-norm over $K_n$. Since the set of all polynomials with rational coefficients is countable, let us number them as $\{p_{m,n}\}_{m=1}^{\infty}$. Denote by $\widetilde{p}_{m,n}$ the extension of $p_{m,n}$ to $\Omega$ by setting it to be equal to zero outside $K_n$. Thus

$$\cup_{n=1}^{\infty} \cup_{m=1}^{\infty} \{\widetilde{p}_{m,n}\}$$

is a countable and dense set in $L^p(\Omega)$. This completes the proof. ∎

Proposition 10.2.2 Let $\Omega \subset \mathbb{R}^N$ be a non-empty open set. The space $L^\infty(\Omega)$ is not separable.

Proof: Let $x \in \Omega$. Let $r > 0$ be such that $B(x; r) \subset \Omega$, where $B(x; r)$ is the open ball of radius $r$ which is centered at $x$. Let

$$\varphi_x = \chi_{B(x;r)}.$$

Set

$$U_x = \left\{ f \in L^\infty(\Omega) \;\Big|\; \|f - \varphi_x\|_\infty < \frac{1}{2} \right\}.$$

Then $U_x$ is an open set in $L^\infty(\Omega)$.

Let $x$ and $y$ be distinct points in $\Omega$. Then, by definition, it follows that $\|\varphi_x - \varphi_y\|_\infty = 1$. Consequently, if $x \neq y$, then $U_x \cap U_y = \emptyset$. Thus $\{U_x\}_{x \in \Omega}$ is an uncountable collection of disjoint open sets in $L^\infty(\Omega)$. Hence, given any countable set $\{f_n\}_{n=1}^{\infty}$, there exist open sets $U_x$ which do not contain any of the $f_n$. Thus, no countable set in $L^\infty(\Omega)$ can be dense. This completes the proof. ∎

10.3 Some applications

In the previous section, we proved the density of Cc (Ω), the space of


continuous functions with compact support, in Lp (Ω), where Ω is an
open subset of RN and 1 ≤ p < ∞. We used it to examine the sepa-
rability of Lp (Ω). In this section we will deduce a few more important
and interesting consequences of this density theorem.

Theorem 10.3.1 (Lusin’s theorem) Let $E \subset \mathbb{R}^N$ be a measurable set of finite measure. Let $f : E \to \mathbb{R}$ be a measurable function. Let $\varepsilon > 0$ be given. Then, there exists $\varphi \in C_c(\mathbb{R}^N)$ such that

$$m_N(\{x \in E \mid \varphi(x) \neq f(x)\}) < \varepsilon.$$

Further, if $f$ is bounded, we can ensure that

$$\|\varphi\|_\infty \le \|f\|_\infty.$$

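Proof: Step 1. For each positive integer $n$, define

$$E_n = \{x \in E \mid |f(x)| \le n\}.$$

Then $E_n \uparrow E$. Since $E$ has finite measure, we can choose $m$ such that $m_N(E \setminus E_m) < \frac{\varepsilon}{3}$. Now, define $\widetilde{f} : \mathbb{R}^N \to \mathbb{R}$ by

$$\widetilde{f}(x) = \begin{cases} f(x), & \text{if } x \in E_m, \\ 0, & \text{if } x \in \mathbb{R}^N \setminus E_m. \end{cases}$$

Since $\widetilde{f}$ is bounded and since $E_m$ has finite measure, it follows that $\widetilde{f}$ is integrable on $\mathbb{R}^N$. Hence, there exists a sequence $\{\varphi_n\}_{n=1}^{\infty}$ in $C_c(\mathbb{R}^N)$ such that $\varphi_n \to \widetilde{f}$ in $L^1(\mathbb{R}^N)$. Then, there exists a subsequence $\{\varphi_{n_k}\}$ which converges to $\widetilde{f}$ pointwise almost everywhere on $\mathbb{R}^N$.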

Step 2. Since $E_m$ has finite measure, we can find $F \subset E_m$ such that $m_N(E_m \setminus F) < \frac{\varepsilon}{3}$ and such that $\varphi_{n_k} \to \widetilde{f}$ uniformly on $F$, by virtue of Egorov’s theorem (cf. Theorem 4.1.1). Again, since $F$ has finite measure, we can find a compact set $K \subset F$ such that $m_N(F \setminus K) < \frac{\varepsilon}{3}$ (cf. Proposition 2.2.3). Clearly, $m_N(E \setminus K) < \varepsilon$.

Step 3. Since $\{\varphi_{n_k}\}$ converges uniformly to $\widetilde{f}$ on $K$, it follows that the restriction of $\widetilde{f}$ to $K$ is continuous. But $K \subset F \subset E_m$ and so $\widetilde{f}(x) = f(x)$ for every $x \in K$. Thus, we deduce that the restriction of $f$ to $K$ is continuous and $|f(x)| \le m$ for $x \in K$.

Step 4. Now, by the Tietze extension theorem (cf., for instance, Simmons [9]) we can find a continuous function $g : \mathbb{R}^N \to \mathbb{R}$ such that $\|g\|_\infty \le m$ and such that $g = f$ on $K$.

Step 5. Finally, let $\psi \in C_c(\mathbb{R}^N)$ be such that $0 \le \psi \le 1$ and such that $\psi \equiv 1$ on $K$ (which exists, by Urysohn’s lemma, cf. Simmons [9]). Let $\varphi = \psi g$. Then $\varphi \in C_c(\mathbb{R}^N)$, and

$$\{x \in E \mid \varphi(x) \neq f(x)\} \subset E \setminus K,$$

which has measure less than $\varepsilon$. Also $\|\varphi\|_\infty \le m$ and, if $f$ is bounded, $m \le \|f\|_\infty$. This completes the proof. ∎

Remark 10.3.1 In most texts, Lusin’s theorem is used to prove the density of continuous functions with compact support in $L^p(\Omega)$. Here we have proved the density result directly and used it to prove Lusin’s theorem. This presentation appeared in the expository article A note on some approximation theorems in measure theory, by S. Kesavan and M. T. Nair, in the Mathematics Newsletter, 27, No. 2, 2016, of the Ramanujan Mathematical Society.
Theorem 10.3.2 (Hardy’s inequality) Let $1 < p < \infty$. Let $f \in L^p(0, \infty)$. For $0 < x < \infty$, define

\[ F(x) = \frac{1}{x} \int_{(0,x)} f \, dm_1. \tag{10.3.1} \]

Then $F \in L^p(0, \infty)$ and

\[ \|F\|_p \le \frac{p}{p-1} \|f\|_p. \tag{10.3.2} \]

Proof: Step 1. Let $f \in C_c((0, \infty))$ be a non-negative function. Then, if $F$ is defined as in (10.3.1), we have $xF(x) = \int_0^x f(t)\,dt$. Further $(xF(x))' = f(x)$ for all $x > 0$. Since $f$ has compact support in $(0, \infty)$, it follows that $f \equiv 0$ near the origin and so $F \equiv 0$ near the origin. So if we define $F(0) = 0$, then $F$ is continuous on $[0, \infty)$. If the support of $f$ is contained in a finite interval $[a, b]$, where $0 < a < b$, then $f \equiv 0$ on $[b, \infty)$ and so $\int_0^x f(t)\,dt$ is constant for $x \ge b$. Hence $F(x) \to 0$ as $x \to \infty$. Also, since $F$ is continuous on $[0, b]$ and since $x \mapsto x^{-1}$ is in $L^p(b, \infty)$ for $1 < p < \infty$, it follows that $F \in L^p(0, \infty)$ as well and it is a non-negative function. By our earlier observation, $F'(x)$ exists for $x > 0$ and, for such $x$,

\[ f(x) = F(x) + xF'(x). \tag{10.3.3} \]
Multiplying both sides of this equation by $F^{p-1}(x)$ and integrating, we get

\[ \int_{(0,\infty)} F^{p-1} f \, dm_1 = \int_{(0,\infty)} F^p \, dm_1 + \int_{(0,\infty)} xF^{p-1}(x)F'(x) \, dm_1(x). \tag{10.3.4} \]

Since $f$ and $F$ are non-negative, we deduce from (10.3.4) and the monotone convergence theorem, that

\[ \int_{(0,\infty)} xF^{p-1}(x)F'(x) \, dm_1(x) = \lim_{n \to \infty} \int_{(0,n)} xF^{p-1}(x)F'(x) \, dm_1(x). \]

By virtue of (10.3.3), the integrand $xF^{p-1}(x)F'(x)$ is continuous and so the integral in the extreme right in the above relation is, in fact, a Riemann integral. Now,

\[ \int_0^n xF^{p-1}(x)F'(x) \, dx = \frac{1}{p} \int_0^n x \frac{d}{dx}\left(F^p(x)\right) dx. \]

Now, since $F(x) = \frac{c}{x}$ for $x \ge b$, where $c = \int_0^b f(t)\,dt$, and since $p > 1$, we have that $xF^p(x) \to 0$ as $x \to \infty$. Consequently, by integration by parts, we get, on passing to the limit as $n \to \infty$,

\[ \int_{(0,\infty)} xF^{p-1}(x)F'(x) \, dm_1(x) = -\frac{1}{p} \int_{(0,\infty)} F^p \, dm_1. \]

Using this, and the fact that $F \ge 0$, in (10.3.4), and applying Hölder’s inequality, we get

\[ \frac{p-1}{p} \|F\|_p^p = \int_{(0,\infty)} F^{p-1} f \, dm_1 \le \|f\|_p \|F^{p-1}\|_{p'}, \]

where $p'$ is the conjugate exponent of $p$. But

\[ \|F^{p-1}\|_{p'}^{p'} = \int_{(0,\infty)} F^{(p-1)p'} \, dm_1 = \int_{(0,\infty)} F^p \, dm_1 = \|F\|_p^p, \]

by the definition of $p'$. Thus,

\[ \frac{p-1}{p} \|F\|_p^p \le \|f\|_p \cdot \|F\|_p^{p/p'}, \]

which yields (10.3.2), once again, by using the definition of $p'$. Thus the result is true for non-negative continuous functions with compact support.

Step 2. If $f \in C_c((0, \infty))$ is an arbitrary function, set

\[ T(f)(x) = \frac{1}{x} \int_{(0,x)} f \, dm_1, \]

for $x > 0$. Then, clearly, $|f|$ is a non-negative continuous function with compact support and we have $|T(f)(x)| \le T(|f|)(x)$ and so $T(f) \in L^p(0, \infty)$ and

\[ \|T(f)\|_p \le \|T(|f|)\|_p \le \frac{p}{p-1} \|\,|f|\,\|_p = \frac{p}{p-1} \|f\|_p. \]

Thus the result is proven for all continuous functions with compact support in $(0, \infty)$.

Step 3. Let $f \in L^p(0, \infty)$. Let $\{f_n\}_{n=1}^\infty$ be a sequence of continuous functions with compact support in $(0, \infty)$ converging to $f$ in $L^p(0, \infty)$. If $F_n = T(f_n)$, then $F_n \in L^p(0, \infty)$ for all $n \in \mathbb{N}$ and

\[ \|F_n - F_m\|_p \le \frac{p}{p-1} \|f_n - f_m\|_p, \]

which implies that $\{F_n\}_{n=1}^\infty$ is a Cauchy sequence in $L^p(0, \infty)$. Let $F_n \to G$ in $L^p(0, \infty)$.

Step 4. Now, for any $x > 0$, $f_n \to f$ in $L^p(0, x)$ and so, since $(0, x)$ has finite measure, $f_n \to f$ in $L^1(0, x)$. Consequently,

\[ \int_{(0,x)} f_n \, dm_1 \to \int_{(0,x)} f \, dm_1, \]

and so $F_n(x) \to F(x)$ for each $x > 0$. But, for a subsequence $F_{n_k} \to G$ pointwise, almost everywhere, and so $F = G$ almost everywhere. Thus, it follows that $F \in L^p(0, \infty)$ and that $F_n \to F$ in $L^p(0, \infty)$. Since

\[ \|F_n\|_p \le \frac{p}{p-1} \|f_n\|_p, \]

for each $n \in \mathbb{N}$, we get (10.3.2) on passing to the limit as $n \to \infty$. This completes the proof.
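
As an illustration of (10.3.2), here is a minimal sketch for one test function that is not taken from the text: $f = \chi_{(0,1)}$. In that case $F = 1$ on $(0,1]$ and $F(x) = 1/x$ for $x > 1$, so that $\|f\|_p = 1$ and $\|F\|_p^p = 1 + 1/(p-1) = p/(p-1)$. The function names `hardy_ratio` and `hardy_constant` are hypothetical helpers introduced only for this example.

```python
def hardy_ratio(p: float) -> float:
    """Return ||F||_p / ||f||_p for the assumed test function f = indicator of (0, 1).

    For this f, F = 1 on (0, 1] and F(x) = 1/x on (1, oo), hence
    ||F||_p^p = 1 + 1/(p - 1) = p/(p - 1), while ||f||_p = 1.
    """
    if p <= 1.0:
        raise ValueError("Hardy's inequality requires p > 1")
    return (p / (p - 1.0)) ** (1.0 / p)


def hardy_constant(p: float) -> float:
    """The constant p/(p - 1) appearing on the right of (10.3.2)."""
    return p / (p - 1.0)


# Compare the two sides for a few exponents; the ratio stays below the constant.
for p in (1.01, 1.5, 2.0, 4.0):
    print(p, hardy_ratio(p), hardy_constant(p))
```

For this particular $f$ the ratio is strictly smaller than $p/(p-1)$, and it grows without bound as $p$ decreases to 1.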

Example 10.3.1 The preceding theorem is not true when $p = 1$. Consider the function $f(x) = e^{-x}$ which belongs to $L^1(0, \infty)$. Then, by (10.3.1), we get

\[ F(x) = \frac{1 - e^{-x}}{x}. \]

For $x \ge 1$, we have $1 - e^{-x} \ge 1 - e^{-1}$. Thus,

\[ F(x) \ge \frac{1 - e^{-1}}{x}, \quad x \ge 1, \]

and so $F$ is not integrable on $(1, \infty)$ and so it cannot be integrable over $(0, \infty)$ either.

Remark 10.3.2 Hardy’s inequality is valid in $\ell_p$ as well, when $1 < p < \infty$. If $x \in \ell_p$, where $x = (x_i)$, then the sequence $y = (y_i)$ defined by

\[ y_n = \frac{x_1 + \cdots + x_n}{n} \]

is also in $\ell_p$ and

\[ \|y\|_p \le \frac{p}{p-1} \|x\|_p. \]

For a proof, see, for instance, Kesavan [5].
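
The discrete inequality can be checked numerically on a finitely supported sequence. The sketch below is only an illustration: the sample $x_n = 1/n$ truncated at 2000 terms and the helper names `lp_norm` and `running_means` are assumptions made for this example. Since the means $y_1, \dots, y_N$ depend only on $x_1, \dots, x_N$, the truncated sums still satisfy the inequality.

```python
def lp_norm(seq, p):
    """The l_p norm of a finite sequence."""
    return sum(abs(t) ** p for t in seq) ** (1.0 / p)


def running_means(x):
    """Return y with y_n = (x_1 + ... + x_n) / n."""
    means, total = [], 0.0
    for n, t in enumerate(x, start=1):
        total += t
        means.append(total / n)
    return means


p = 2.0
x = [1.0 / n for n in range(1, 2001)]   # assumed sample sequence
y = running_means(x)
lhs = lp_norm(y, p)
rhs = p / (p - 1.0) * lp_norm(x, p)
print(lhs, rhs, lhs <= rhs)
```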

We conclude this section with another useful result.

Proposition 10.3.1 Let $1 \le p < \infty$. Let $f \in L^p(\mathbb{R}^N)$. For $h \in \mathbb{R}^N$, define

\[ \tau_h(f)(x) = f(x - h), \quad x \in \mathbb{R}^N. \]

Then

\[ \lim_{h \to 0} \|\tau_h(f) - f\|_p = 0. \tag{10.3.5} \]

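Proof: By the translation invariance of the Lebesgue measure, it is clear that $\tau_h(f) \in L^p(\mathbb{R}^N)$, whenever $f \in L^p(\mathbb{R}^N)$ and also that $\|\tau_h(f)\|_p = \|f\|_p$.

Let $\varepsilon > 0$ be given. Choose $\varphi \in C_c(\mathbb{R}^N)$ such that

\[ \|f - \varphi\|_p < \frac{\varepsilon}{3}. \tag{10.3.6} \]

Then, we also have

\[ \|\tau_h(f) - \tau_h(\varphi)\|_p = \|f - \varphi\|_p < \frac{\varepsilon}{3}. \tag{10.3.7} \]

Let the support of $\varphi$ be contained in the box $[-a, a]^N$. Since $\varphi$ is uniformly continuous, there exists $0 < \delta < 1$ such that, whenever $|h| < \delta$, we have

\[ |\varphi(x - h) - \varphi(x)| < \frac{\varepsilon}{3} (2(a+1))^{-N/p}, \]

for all $x \in \mathbb{R}^N$. Then, for $|h| < \delta$, both $\varphi$ and $\tau_h(\varphi)$ vanish outside the box $[-(a+1), (a+1)]^N$, whose measure is $(2(a+1))^N$, and so

\[ \int_{\mathbb{R}^N} |\tau_h(\varphi) - \varphi|^p \, dm_N = \int_{[-(a+1),(a+1)]^N} |\varphi(x - h) - \varphi(x)|^p \, dm_N(x) < \left(\frac{\varepsilon}{3}\right)^p, \]

so that

\[ \|\tau_h(\varphi) - \varphi\|_p < \frac{\varepsilon}{3}. \tag{10.3.8} \]

The result now follows on combining the relations (10.3.6)-(10.3.8).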
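
A simple one-dimensional case makes (10.3.5) concrete. The sketch below assumes the test function $f = \chi_{[0,1]}$ on $\mathbb{R}$, which is not taken from the text. Here $|\tau_h(f) - f|$ is the indicator of the symmetric difference of $[0,1]$ and $[h, 1+h]$, a set of measure $\min(2|h|, 2)$. The helper name `translation_gap` is hypothetical.

```python
def translation_gap(h: float, p: float) -> float:
    """||tau_h f - f||_p for the assumed test function f = indicator of [0, 1]."""
    if p == float("inf"):
        # The difference takes the value 1 on a set of positive measure when h != 0.
        return 0.0 if h == 0.0 else 1.0
    symmetric_difference = min(2.0 * abs(h), 2.0)
    return symmetric_difference ** (1.0 / p)


for h in (0.5, 0.1, 0.01, 0.001):
    print(h, translation_gap(h, 1.0), translation_gap(h, 2.0),
          translation_gap(h, float("inf")))
```

For finite $p$ the gap tends to zero with $h$, while for $p = \infty$ it stays equal to 1 for every $h \neq 0$, which shows that the restriction $p < \infty$ in the proposition cannot be dropped.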

10.4 Duality

When studying normed linear spaces, one of the important objectives is to identify the dual space of a normed linear space, i.e. the space of all continuous linear functionals on that space. In this section, we will try to identify the dual spaces of the spaces $L^p(\mu)$.

Let $(X, \mathcal{S}, \mu)$ be a measure space and let $1 \le p \le \infty$. Let $p'$ be the conjugate exponent of $p$. Let $g \in L^{p'}(\mu)$. Define $T_g : L^p(\mu) \to \mathbb{R}$ by

\[ T_g(f) = \int_X f g \, d\mu, \quad f \in L^p(\mu). \]

Then, clearly, $T_g$ is a linear functional defined on $L^p(\mu)$. It is also continuous. Indeed, by Hölder’s inequality, we have

\[ |T_g(f)| \le \|f\|_p \|g\|_{p'}, \]

which establishes the continuity of $T_g$. We also have

\[ \|T_g\| \le \|g\|_{p'}. \tag{10.4.1} \]

In this section, we will show that if $(X, \mathcal{S}, \mu)$ is a $\sigma$-finite measure space, then every continuous linear functional on $L^p(\mu)$ occurs in this way, if $1 \le p < \infty$, and that we have equality in (10.4.1). Thus, we have an isometric isomorphism (i.e. an isomorphism which preserves norms) between the dual of $L^p(\mu)$ and the space $L^{p'}(\mu)$ and so we can identify the latter space with the dual of the former, when $1 \le p < \infty$. The result does not hold when $p = \infty$.

Proposition 10.4.1 (Uniqueness) Let $(X, \mathcal{S}, \mu)$ be a $\sigma$-finite measure space. Let $1 \le p < \infty$. If $g_i$, $i = 1, 2$, are in $L^{p'}(\mu)$, where $p'$ is the conjugate exponent of $p$, such that $T_{g_1} = T_{g_2}$, then $g_1 = g_2$ almost everywhere.

Proof: If $f \in L^p(\mu)$, we have

\[ \int_X f (g_1 - g_2) \, d\mu = 0. \]

Let $E \subset X$ be a measurable set of finite measure. Then $\chi_E \in L^p(\mu)$ and so we deduce that

\[ \int_E (g_1 - g_2) \, d\mu = 0, \]

for all $E \in \mathcal{S}$, $\mu(E) < +\infty$. If $E \in \mathcal{S}$, then, by the $\sigma$-finiteness, we can find a collection of disjoint sets $\{E_i\}_{i=1}^\infty$ in $\mathcal{S}$ such that $E = \cup_{i=1}^\infty E_i$ and such that $\mu(E_i) < +\infty$ for each $i$. Then, it follows that

\[ \int_E (g_1 - g_2) \, d\mu = 0, \]

for all $E \in \mathcal{S}$, from which we easily deduce that $g_1 = g_2$ almost everywhere.

It follows from the above proposition that the mapping $g \mapsto T_g$ from $L^{p'}(\mu)$ into the dual of $L^p(\mu)$ is injective. It is also continuous, by virtue of (10.4.1). We now need to show that it is surjective and that it is an isometry. We will first prove this when the measure space is finite and then deduce the general case of a $\sigma$-finite measure space.

Lemma 10.4.1 Let $(X, \mathcal{S}, \mu)$ be a finite measure space and let $g : X \to \mathbb{R}$ be a measurable function such that

\[ \left| \frac{1}{\mu(E)} \int_E g \, d\mu \right| \le K, \tag{10.4.2} \]

for all $E \in \mathcal{S}$ with $\mu(E) > 0$. Then $|g| \le K$ almost everywhere.

Proof: Let $U = \{t \in \mathbb{R} \mid |t| > K\}$. This is an open set in $\mathbb{R}$ and hence can be written as the countable union of open intervals. Let $(a - r, a + r)$ be one such interval. Let

\[ E = \{x \in X \mid g(x) \in (a - r, a + r)\}. \]

If $\mu(E) > 0$, then set

\[ A_E(g) = \frac{1}{\mu(E)} \int_E g \, d\mu. \]

Then

\[ |A_E(g) - a| = \left| \frac{1}{\mu(E)} \int_E (g - a) \, d\mu \right| < r. \]

Thus, $A_E(g) \in (a - r, a + r)$ as well and so $|A_E(g)| > K$, which is a contradiction. Thus, $\mu(E) = 0$ and since the set $\{x \in X \mid |g(x)| > K\}$ can be covered by a countable number of sets like $E$, we deduce that $|g(x)| \le K$ except on a set of measure zero. This completes the proof.


Theorem 10.4.1 Let $(X, \mathcal{S}, \mu)$ be a finite measure space. Let $1 \le p < \infty$. Let $T$ be a continuous linear functional on $L^p(\mu)$. Then, there exists a unique $g \in L^{p'}(\mu)$ such that $T = T_g$ and, further, $\|T\| = \|g\|_{p'}$.

Proof: Step 1. For $E \in \mathcal{S}$, define

\[ \lambda(E) = T(\chi_E). \]

This is well-defined, since $\mu(E) < +\infty$ and so $\chi_E \in L^p(\mu)$. If $A$ and $B$ are disjoint measurable sets, then

\[ \chi_{A \cup B} = \chi_A + \chi_B, \]

and so, by the linearity of $T$, we get that $\lambda$ is finitely additive. Let $\{E_i\}_{i=1}^\infty$ be a countable collection of disjoint sets in $\mathcal{S}$. Let their union be $E$. Set $F_k = \cup_{i=1}^k E_i$. Then $F_k$ increases to $E$. Since the measure space is finite, we have

\[ \mu(E \setminus F_k) = \sum_{i=k+1}^\infty \mu(E_i) \to 0 \quad \text{as } k \to \infty, \]

since $\mu(E) = \sum_{i=1}^\infty \mu(E_i) < +\infty$. Now,

\[ \|\chi_E - \chi_{F_k}\|_p = \mu(E \setminus F_k)^{1/p}, \]

and so $\chi_{F_k} \to \chi_E$ in $L^p(\mu)$. Then $T(\chi_{F_k}) \to T(\chi_E)$ which shows that

\[ \lambda(E) = \sum_{i=1}^\infty \lambda(E_i). \]

Thus $\lambda$ defines a signed measure on $(X, \mathcal{S})$. Further, if $\mu(E) = 0$, then $\chi_E = 0$ in $L^p(\mu)$ and so $\lambda(E) = 0$ as well. Thus, $\lambda \ll \mu$. Hence, by the Radon-Nikodym theorem, there exists a measurable function $g$ such that, for every $E \in \mathcal{S}$, we have

\[ \lambda(E) = \int_E g \, d\mu, \]

or, in other words,

\[ T(\chi_E) = \int_X \chi_E \, g \, d\mu. \]

Since both $\lambda$ and $\mu$ are finite, it also follows that $g \in L^1(\mu)$. Further, if $\varphi$ is any simple function, we get by the linearity of $T$, that

\[ T(\varphi) = \int_X \varphi g \, d\mu. \]

Step 2. Let $f \in L^\infty(\mu)$ be a non-negative function. Since we are in a finite measure space, it follows that $f \in L^p(\mu)$ as well for every $1 \le p < \infty$. Let $\{\varphi_n\}_{n=1}^\infty$ be a sequence of non-negative simple functions increasing to $f$. Then (cf. Lemma 10.2.1) $\varphi_n \to f$ in $L^p(\mu)$ for $1 \le p < \infty$ and so $T(\varphi_n) \to T(f)$. On the other hand, $\varphi_n g \to f g$ pointwise and

\[ |\varphi_n g| \le f |g| \le \|f\|_\infty |g|. \]

Since $g$ is integrable, it follows, from the dominated convergence theorem, that $\int_X \varphi_n g \, d\mu \to \int_X f g \, d\mu$. Thus, for all non-negative bounded functions $f$, we have

\[ T(f) = \int_X f g \, d\mu. \tag{10.4.3} \]

By splitting any bounded function $f$ into its positive and negative parts, $f^\pm$, we deduce that (10.4.3) is true for any bounded measurable function.

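Step 3. Let $p = 1$. Let $E \in \mathcal{S}$ with $\mu(E) > 0$. Then

\[ \left| \int_E g \, d\mu \right| = \left| \int_X \chi_E \, g \, d\mu \right| = |T(\chi_E)| \le \|T\| \cdot \|\chi_E\|_1 = \|T\| \mu(E). \]

Thus,

\[ \left| \frac{1}{\mu(E)} \int_E g \, d\mu \right| \le \|T\|. \]

It then follows, from Lemma 10.4.1, that $|g| \le \|T\|$ almost everywhere. Thus, in this case $g \in L^\infty(\mu)$ and $\|g\|_\infty \le \|T\|$.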

Step 4. Let $1 < p < \infty$. Let $\psi$ be a measurable function (taking values $\pm 1$) such that $\psi g = |g|$. Let

\[ E_n = \{x \in X \mid |g(x)| \le n\}, \quad n \in \mathbb{N}. \]

Set $f = \chi_{E_n} |g|^{p'-1} \psi$. Then

\[ |f|^p = \chi_{E_n} |g|^{p'p - p} = \chi_{E_n} |g|^{p'}, \]

by the definition of the conjugate exponent. Further, by the definition of $E_n$, it follows that $f$ is bounded as well. Since $f g = \chi_{E_n} |g|^{p'}$, we get, from Step 2, that

\[ \int_{E_n} |g|^{p'} \, d\mu = \int_X f g \, d\mu = T(f), \]

and so

\[ \int_{E_n} |g|^{p'} \, d\mu \le \|T\| \cdot \|f\|_p = \|T\| \left( \int_{E_n} |g|^{p'} \, d\mu \right)^{1/p}, \]

which yields

\[ \left( \int_{E_n} |g|^{p'} \, d\mu \right)^{1/p'} \le \|T\|. \]

But $E_n \uparrow X$ and so, by the monotone convergence theorem, we have

\[ \left( \int_X |g|^{p'} \, d\mu \right)^{1/p'} \le \|T\|. \]

Thus, $g \in L^{p'}(\mu)$ and $\|g\|_{p'} \le \|T\|$.
Step 5. Let $1 \le p < \infty$. Then, we have seen that $g \in L^{p'}(\mu)$ and that $\|g\|_{p'} \le \|T\|$. Further, since $\mu(X)$ is finite, simple functions are dense in $L^p(\mu)$ (cf. Lemma 10.2.1) and so $L^\infty(\mu)$ is also dense in $L^p(\mu)$. Both sides of (10.4.3) define continuous linear functionals on $L^p(\mu)$ and agree on the dense subspace $L^\infty(\mu)$ and so, they agree on all of $L^p(\mu)$. Thus, we get that, in fact, $T = T_g$, in which case, we have that

\[ \|T\| = \|T_g\| \le \|g\|_{p'} \le \|T\|. \]

Thus $T = T_g$ and $\|T\| = \|T_g\| = \|g\|_{p'}$. This completes the proof.
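
The test function used in Step 4 can be tried out on a toy finite measure space. The sketch below assumes a four-point space with weights $\mu(\{i\})$ and a sample $g$, none of which come from the text. It checks that $f = |g|^{p'-1}\psi$, with $\psi$ the sign of $g$, gives $T_g(f)/\|f\|_p = \|g\|_{p'}$, which is the equality case behind $\|T_g\| = \|g\|_{p'}$. The helper names are hypothetical.

```python
import math


def weighted_norm(values, weights, p):
    """L^p norm on a finite set where the point i has measure weights[i]."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)


def pairing(f, g, weights):
    """T_g(f), i.e. the integral of f*g over the finite measure space."""
    return sum(w * a * b for a, b, w in zip(f, g, weights))


def extremal(g, p):
    """The test function |g|^(p'-1) * sign(g) of Step 4, with p' = p/(p-1)."""
    q = p / (p - 1.0)
    return [math.copysign(abs(v) ** (q - 1.0), v) for v in g]


p = 3.0
q = p / (p - 1.0)
weights = [0.1, 0.2, 0.3, 0.4]        # assumed measure of each point
g = [2.0, -1.0, 0.5, 0.0]             # assumed sample function
f = extremal(g, p)
print(pairing(f, g, weights) / weighted_norm(f, weights, p),
      weighted_norm(g, weights, q))
```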

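Let $(X, \mathcal{S}, \mu)$ be a $\sigma$-finite measure space. Let $X = \cup_{n=1}^\infty X_n$, where the $X_n$ are all disjoint and $0 < \mu(X_n) < +\infty$ for each $n \in \mathbb{N}$. Define

\[ h(x) = \sum_{n=1}^\infty \frac{1}{n^2 \mu(X_n)} \chi_{X_n}(x). \]

Then $h \in L^1(\mu)$ and $h > 0$. Define, for $E \in \mathcal{S}$,

\[ \nu(E) = \int_E h \, d\mu. \]

Then $(X, \mathcal{S}, \nu)$ is a finite measure space and $\nu \ll \mu$.

Let $1 \le p < \infty$. Then $f \in L^p(\nu)$ if, and only if, $h^{1/p} f \in L^p(\mu)$ and

\[ \|f\|_{L^p(\nu)} = \|h^{1/p} f\|_{L^p(\mu)}. \]

This defines an isometric isomorphism between the spaces $L^p(\nu)$ and $L^p(\mu)$, when $1 \le p < \infty$. Since $\nu \ll \mu$ and, because $h > 0$, also $\mu \ll \nu$, the two measures have the same null sets. Hence $L^\infty(\nu) = L^\infty(\mu)$ and $\|f\|_{L^\infty(\nu)} = \|f\|_{L^\infty(\mu)}$.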

Now, let $T$ be a continuous linear functional defined on $L^p(\mu)$, for $1 \le p < \infty$. Define $S$, a linear functional on $L^p(\nu)$, by

\[ S(f) = T(h^{1/p} f), \quad f \in L^p(\nu). \]

Thus,

\[ |S(f)| = |T(h^{1/p} f)| \le \|T\| \cdot \|h^{1/p} f\|_{L^p(\mu)} = \|T\| \cdot \|f\|_{L^p(\nu)}, \]

\[ |T(f)| = |T(h^{1/p} h^{-1/p} f)| \le \|S\| \cdot \|h^{-1/p} f\|_{L^p(\nu)} = \|S\| \cdot \|f\|_{L^p(\mu)}, \]

where $f \in L^p(\nu)$ in the first line above and $f \in L^p(\mu)$ in the second. This shows that $S$ defines a continuous linear functional on $L^p(\nu)$ and that $\|S\| = \|T\|$.
Theorem 10.4.2 (Riesz representation theorem) Let $(X, \mathcal{S}, \mu)$ be a $\sigma$-finite measure space. Let $1 \le p < \infty$. Let $T$ be a continuous linear functional on $L^p(\mu)$. Then, there exists a unique $g \in L^{p'}(\mu)$ such that $T = T_g$ and, further, $\|T\| = \|g\|_{p'}$.

Proof: We write $X = \cup_{n=1}^\infty X_n$ as a disjoint union of sets of finite measure. We adopt the notation and definitions made in the preceding paragraphs and define the function $h$ and the measure $\nu$ as before. Given a linear functional $T$ on $L^p(\mu)$, we define, as before, the linear functional $S$ on $L^p(\nu)$. Then, by the preceding theorem, there exists $\tilde g$ in $L^{p'}(\nu)$ such that

\[ S(f) = \int_X f \tilde g \, d\nu, \quad \text{for all } f \in L^p(\nu), \]

and such that $\|S\| = \|\tilde g\|_{L^{p'}(\nu)}$.

Define $g = h^{1/p'} \tilde g$ if $1 < p < \infty$ and $g = \tilde g$, if $p = 1$.

Case 1. Let $1 < p < \infty$. Then

\[ \|g\|_{L^{p'}(\mu)}^{p'} = \int_X |g|^{p'} \, d\mu = \int_X |\tilde g|^{p'} h \, d\mu = \int_X |\tilde g|^{p'} \, d\nu = \|S\|^{p'} = \|T\|^{p'}. \]

Further, if $f \in L^p(\mu)$, we have

\[ \int_X f g \, d\mu = \int_X f \tilde g \, h^{1/p'} \, d\mu = \int_X f \tilde g \, h^{1 - 1/p} \, d\mu = \int_X h^{-1/p} f \tilde g \, d\nu = S(h^{-1/p} f) = T(f). \]

This proves the result when $1 < p < \infty$.

Case 2. Let $p = 1$. Then

\[ \|g\|_{L^\infty(\mu)} = \|\tilde g\|_{L^\infty(\nu)} = \|S\| = \|T\|. \]

If $f \in L^1(\mu)$, then

\[ \int_X f g \, d\mu = \int_X f \tilde g \, h^{-1} \, d\nu = S(h^{-1} f) = T(f). \]

This proves the result for $p = 1$ and the proof of the theorem is complete.

Example 10.4.1 The above result is not true when $p = \infty$. For example, consider the interval $(0, 1)$. Then $C[0, 1]$ is a subspace of $L^\infty(0, 1)$. If $f \in C[0, 1]$, define

\[ T(f) = f(0). \]

This defines a linear functional on $C[0, 1]$ and

\[ |T(f)| = |f(0)| \le \|f\|_\infty. \]

Thus $T$ is a continuous linear functional on $C[0, 1]$ equipped with the norm from $L^\infty(0, 1)$. By the Hahn-Banach theorem, we can extend it to a continuous linear functional on $L^\infty(0, 1)$.

We claim that this functional cannot be represented in the form $T_g$ where $g \in L^1(0, 1)$. (Recall that the conjugate exponent of $p = \infty$ is $p' = 1$.) Assume the contrary. Now consider the sequence of functions $\{f_n\}_{n=1}^\infty$ in $C[0, 1]$ given by

\[ f_n(t) = \begin{cases} 1 - nt, & \text{if } 0 \le t \le \frac{1}{n}, \\ 0, & \text{if } \frac{1}{n} \le t \le 1. \end{cases} \]

Then $T(f_n) = f_n(0) = 1$ for all $n \in \mathbb{N}$. On the other hand, if $T = T_g$, where $g \in L^1(0, 1)$, then,

\[ T(f_n) = \int_{(0,1)} f_n g \, dm_1. \]

But $|f_n g| \le |g|$ for all $n$ and $g$ is integrable, while $(f_n g)(t) \to 0$ for all $0 < t \le 1$. Thus, by the dominated convergence theorem, we have that $T(f_n) \to 0$, which is a contradiction.
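
The mismatch between $f_n(0) = 1$ and the integrals $\int f_n g$ can be seen numerically for one concrete choice of $g$. The sketch below assumes $g \equiv 1$, for which $\int_{(0,1)} f_n g \, dm_1 = 1/(2n)$, and uses a midpoint rule. The names `tent` and `integrate_against` are hypothetical helpers for this illustration only.

```python
def tent(n: int, t: float) -> float:
    """The function f_n of Example 10.4.1: 1 - n*t on [0, 1/n], zero afterwards."""
    return max(0.0, 1.0 - n * t)


def integrate_against(n: int, g, steps: int = 10000) -> float:
    """Midpoint-rule approximation of the integral of f_n * g over (0, 1)."""
    dt = 1.0 / steps
    return sum(tent(n, (k + 0.5) * dt) * g((k + 0.5) * dt) for k in range(steps)) * dt


for n in (1, 10, 100, 1000):
    # f_n(0) stays equal to 1 while the integral against g = 1 shrinks like 1/(2n).
    print(n, tent(n, 0.0), integrate_against(n, lambda t: 1.0))
```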

Remark 10.4.1 The spaces $L^p(\mu)$ are reflexive (cf. for instance, Kesavan [5]) if $1 < p < \infty$ since we have the canonical identifications $(L^p(\mu))' = L^{p'}(\mu)$ and $(L^{p'}(\mu))' = L^p(\mu)$. The spaces $L^1(\mu)$ and $L^\infty(\mu)$ are not reflexive. We have that $(L^1(\mu))' = L^\infty(\mu)$ but the reverse identity fails.

Remark 10.4.2 We can prove an inequality called Clarkson’s inequality when $2 \le p < \infty$ (see the exercises at the end of this chapter) which will show that the spaces $L^p(\mu)$ are uniformly convex (cf. Kesavan [5]), which is a geometric property of the norm. Then, it follows, from a theorem in functional analysis, that the spaces $L^p(\mu)$ are reflexive when $2 \le p < \infty$. One can then easily prove that the spaces $L^p(\mu)$, when $1 < p < 2$, are also reflexive, using functional analytic arguments. Then it is very easy to show that the dual space of $L^p(\mu)$ is $L^{p'}(\mu)$ for $1 < p < \infty$. The advantage of this proof is that it does not need the measure space to be $\sigma$-finite. See Kesavan [5] for details.

10.5 Convolutions

In this section, we will study some properties of a very important tool in analysis called the convolution product. We will assume throughout that $\mathbb{R}^N$ is equipped with the Lebesgue measure $m_N$.

Definition 10.5.1 Let $f$ and $g$ be integrable functions defined on $\mathbb{R}^N$. The convolution or convolution product of $f$ and $g$, denoted $f * g$, is given by

\[ (f * g)(x) = \int_{\mathbb{R}^N} f(x - y) g(y) \, dm_N(y), \quad \text{for } x \in \mathbb{R}^N. \tag{10.5.1} \]

We have already encountered this earlier (cf. Example 8.3.8): we have seen that the convolution is well-defined and that it is a commutative and associative binary operation on integrable functions (cf. Exercise 8.4).

We also saw that $f * g \in L^1(\mathbb{R}^N)$ and that

\[ \|f * g\|_1 \le \|f\|_1 \|g\|_1. \tag{10.5.2} \]

We will now try to extend the definition of the convolution product to other classes of functions as well.

Theorem 10.5.1 Let $1 < p < \infty$. Let $f \in L^1(\mathbb{R}^N)$ and let $g \in L^p(\mathbb{R}^N)$. Then $f * g$ is well-defined. Further, $f * g \in L^p(\mathbb{R}^N)$ and

\[ \|f * g\|_p \le \|f\|_1 \|g\|_p. \tag{10.5.3} \]



Proof: Let $p'$ be the conjugate exponent of $p$. Let $h \in L^{p'}(\mathbb{R}^N)$. Then

\[ (x, y) \mapsto f(x - y) g(y) h(x) \]

is measurable and using Hölder’s inequality and the translation invariance of the Lebesgue measure, we have

\[
\begin{aligned}
\int_{\mathbb{R}^N} \int_{\mathbb{R}^N} |f(x - y) g(y) h(x)| \, dm_N(x) \, dm_N(y)
&= \int_{\mathbb{R}^N} |h(x)| \int_{\mathbb{R}^N} |f(x - y) g(y)| \, dm_N(y) \, dm_N(x) \\
&= \int_{\mathbb{R}^N} |h(x)| \int_{\mathbb{R}^N} |f(w) g(x - w)| \, dm_N(w) \, dm_N(x) \\
&= \int_{\mathbb{R}^N} |f(w)| \int_{\mathbb{R}^N} |h(x)| |g(x - w)| \, dm_N(x) \, dm_N(w) \\
&\le \|h\|_{p'} \|g\|_p \|f\|_1 < +\infty.
\end{aligned}
\]

Thus, by Fubini’s theorem, the integral

\[ \int_{\mathbb{R}^N} h(x) f(x - y) g(y) \, dm_N(y) \]

exists for almost all $x$. We can choose $h(x) \neq 0$ for all $x$ (for instance, $h(x) = \exp(-|x|^2)$, which belongs to all $L^p$ spaces) and so we deduce that $f * g$ defined via (10.5.1) is well-defined. Further, by the preceding computation, it follows that

\[ h \mapsto \int_{\mathbb{R}^N} (f * g) h \, dm_N \]

is a continuous linear functional on $L^{p'}(\mathbb{R}^N)$ whose norm is bounded by the quantity $\|g\|_p \|f\|_1$ which shows, by the Riesz representation theorem, that $f * g \in L^p(\mathbb{R}^N)$ and that (10.5.3) holds.

Remark 10.5.1 Notice that (10.5.2) is the same as (10.5.3) for the case $p = 1$. The relation (10.5.3) is a particular case of Young’s inequality. Let $1 \le p, q, r < \infty$ be such that

\[ \frac{1}{p} + \frac{1}{q} = 1 + \frac{1}{r}. \]

If $f \in L^p(\mathbb{R}^N)$ and $g \in L^q(\mathbb{R}^N)$, then $f * g$ is well-defined via (10.5.1), $f * g \in L^r(\mathbb{R}^N)$ and

\[ \|f * g\|_r \le \|f\|_p \|g\|_q. \]

If $f$ and $g$ are continuous real valued functions on $\mathbb{R}^N$ and if at least one of them has compact support, then the integral in (10.5.1) makes sense and the convolution $f * g$ is well-defined. More generally, if $f_1, \cdots, f_n$ are continuous real valued functions on $\mathbb{R}^N$ such that all but at most one of them have compact support, then we can define the convolution product $f_1 * \cdots * f_n$ by taking the products two at a time. For instance we can have

\[ f_1 * ((f_2 * f_3) * \cdots * (f_{n-1} * f_n)). \]

The actual pairing and order will be unimportant since we have commutativity and associativity. The convolution product of functions occurring within any pair of parentheses is well-defined since at least one of them will have compact support as shown by the following result.

Theorem 10.5.2 Let $f$ and $g$ be continuous real valued functions on $\mathbb{R}^N$ and let one of them have compact support so that $f * g$ is well-defined. Then

\[ \mathrm{supp}(f * g) \subset \mathrm{supp}(f) + \mathrm{supp}(g) \]

where $\mathrm{supp}(\varphi)$ denotes the support of a continuous function $\varphi : \mathbb{R}^N \to \mathbb{R}$, and, for subsets $A$ and $B$ of $\mathbb{R}^N$ we define

\[ A + B = \{x + y \mid x \in A, \ y \in B\}. \]

In particular, if both $f$ and $g$ have compact support, then $f * g$ has compact support.

Proof: Let $A = \mathrm{supp}(f)$ and $B = \mathrm{supp}(g)$ and, without loss of generality, assume that $B$ is compact. Then $A + B$ is closed. To see this, let $x_n + y_n \in A + B$ such that $x_n + y_n \to z$ in $\mathbb{R}^N$. Since $B$ is compact, for a subsequence, we have $y_{n_k} \to y \in B$. Then $x_{n_k} \to z - y = x$ which will belong to $A$, since $A$ is closed. Thus, $z = x + y \in A + B$, which establishes our claim.
We clearly need to consider the integral in (10.5.1) only on the set $B = \mathrm{supp}(g)$. In order that $(f * g)(x) \neq 0$, it is necessary that $x - y \in A = \mathrm{supp}(f)$ for $y$ varying over a subset of $B$ with positive measure. In particular, it follows that $x \in \mathrm{supp}(f) + \mathrm{supp}(g)$ and the result follows since this set is closed.
If both functions have compact supports, then $\mathrm{supp}(f) + \mathrm{supp}(g)$ is also compact (why?) and so $f * g$ has compact support.
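
The support inclusion has an elementary discrete analogue, which is easier to experiment with. The sketch below is an analogy under stated assumptions, not the setting of the theorem: it replaces $\mathbb{R}^N$ with the integers and the Lebesgue measure with the counting measure, and stores a finitely supported sequence as a dictionary from index to value. The names `convolve` and `support` are hypothetical.

```python
def convolve(f, g):
    """Discrete convolution (f*g)(k) = sum over i + j = k of f(i) * g(j)."""
    out = {}
    for i, a in f.items():
        for j, b in g.items():
            out[i + j] = out.get(i + j, 0.0) + a * b
    return out


def support(f):
    """Indices at which the sequence is non-zero."""
    return {k for k, v in f.items() if v != 0.0}


def sum_set(a, b):
    """The set A + B = {x + y : x in A, y in B}."""
    return {x + y for x in a for y in b}


f = {0: 1.0, 1: 1.0}
g = {0: 1.0, 1: -1.0}
h = convolve(f, g)
# The support of f*g lies inside supp(f) + supp(g); cancellation can make it smaller.
print(sorted(support(h)), sorted(sum_set(support(f), support(g))))
```

In this sample the middle term cancels, so the inclusion is strict: the support of the convolution can be a proper subset of the sum of the supports.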

If $f$ is continuous with compact support and if $g$ is integrable on $\mathbb{R}^N$, then also it is easy to see that $f * g$ is well-defined. One of the important properties of the convolution product is that it has a smoothing effect on functions. More precisely, we have the following result.

Theorem 10.5.3 Let $f$ be a continuous real valued function on $\mathbb{R}^N$ with compact support and let $g$ be integrable. Then $f * g$ is continuous. If $f$ is $C^\infty$, then so is $f * g$.

Proof: We will show that $f * g$ is continuous and that, for any $1 \le i \le N$,

\[ \frac{\partial}{\partial x_i}(f * g) = \frac{\partial f}{\partial x_i} * g \tag{10.5.4} \]

if $f$ is differentiable. Iterating this, we can complete the proof.
(i) Let $x \in \mathbb{R}^N$ be fixed and let $h \in \mathbb{R}^N$ be such that $|h| \le 1$. Then

\[ |(f * g)(x + h) - (f * g)(x)| \le \int_{\mathbb{R}^N} |f(x + h - y) - f(x - y)| \, |g(y)| \, dm_N(y). \]

The above integral needs to be taken only over a compact set $K(x)$ containing the supports of the functions $y \mapsto f(x - y)$ and $y \mapsto f(x + h - y)$. For example, we can take $K(x) = x + B(0; R) + B(0; 1)$ where $B(0; r)$ denotes the closed ball in $\mathbb{R}^N$ with centre at the origin and radius $r$, and $R > 0$ is such that $\mathrm{supp}(f) \subset B(0; R)$. Since $f$ has compact support, it is uniformly continuous and so, given $\varepsilon > 0$, there exists $\eta > 0$ such that $|f(u) - f(v)| < \varepsilon$ whenever $|u - v| < \eta$. Thus, if $|h| < \eta$ (we can assume that $0 < \eta < 1$), we get

\[ |(f * g)(x + h) - (f * g)(x)| \le \varepsilon \int_{K(x)} |g| \, dm_N \]

which proves the continuity of $f * g$ at any arbitrary point $x \in \mathbb{R}^N$.


(ii) Let $x \in \mathbb{R}^N$ be fixed and let $h \in \mathbb{R}$, $|h| \le 1$. Let $e_i$ be the $i$-th standard basis vector of $\mathbb{R}^N$, i.e. the vector with 1 in the $i$-th coordinate and zero elsewhere. Then

\[ \frac{\partial}{\partial x_i}(f * g)(x) = \lim_{h \to 0} \frac{(f * g)(x + h e_i) - (f * g)(x)}{h} \]

if the limit exists. If $K(x)$ is a compact set containing the supports of the functions $y \mapsto f(x - y)$ and $y \mapsto f(x + h e_i - y)$, then

\[
\begin{aligned}
\frac{(f * g)(x + h e_i) - (f * g)(x)}{h}
&= \frac{1}{h} \int_{K(x)} (f(x + h e_i - y) - f(x - y)) g(y) \, dy \\
&= \int_{K(x)} \frac{\partial f}{\partial x_i}(x - y + \theta h e_i) g(y) \, dy
\end{aligned}
\]

where $\theta \in (0, 1)$ (and depends on $x$, $y$ and $h$). Since $\frac{\partial f}{\partial x_i}$ is assumed to be continuous and has compact support, it is bounded, say by $M$, and so the integrand in the last integral above is bounded by $M |g(y)|$ which is integrable on the compact set $K(x)$. Further, as $h \to 0$, the integrand converges to $\frac{\partial f}{\partial x_i}(x - y) g(y)$. Thus, by the dominated convergence theorem, we deduce the validity of (10.5.4).

Notation Let $N$ be a fixed positive integer. A multi-index $\alpha$ (of size $N$), is an $N$-tuple of non-negative integers. If $\alpha = (\alpha_1, \cdots, \alpha_N)$ is a multi-index, we denote by $|\alpha|$, the sum of its components, i.e.

\[ |\alpha| = \alpha_1 + \cdots + \alpha_N. \]

If $f$ is a sufficiently smooth real-valued function defined on $\mathbb{R}^N$ (or an open set thereof), we define

\[ D^\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_N^{\alpha_N}}. \]

For example, if $N = 3$ and if $\alpha = (3, 0, 1)$, then

\[ D^\alpha f = \frac{\partial^4 f}{\partial x_1^3 \, \partial x_3}. \]

Remark 10.5.2 It is easy to write down a similar proof when $f$ is $C^\infty$ and $g$ is continuous with compact support. More generally, if $f$ is $C^\infty$ and if one of the two functions has compact support, we have

\[ D^\alpha (f * g)(x) = ((D^\alpha f) * g)(x) \]

for any multi-index $\alpha$. It is also easy to see that if $f$ is only $C^k$, then the above relation is valid for all multi-indices $\alpha$ such that $|\alpha| \le k$. If $g$ also has differentiability properties, then any derivative of $f * g$ could be got by taking the convolution product of appropriate derivatives of $f$ and $g$.

Thus, the convolution of a smooth function with compact support with any integrable function produces a smooth function. This fact, used together with the mollifiers defined below, provides us with a powerful technique to prove a variety of density and approximation theorems.

Lemma 10.5.1 Define $f : \mathbb{R} \to \mathbb{R}$ by

\[ f(x) = \begin{cases} \exp(-x^{-2}) & \text{if } x > 0, \\ 0 & \text{if } x \le 0. \end{cases} \]

Then $f \in C^\infty(\mathbb{R})$.

Proof: We only need to check the smoothness at $x = 0$. As $x \uparrow 0$, the function and all the derivatives are zero. As $x \downarrow 0$, the derivatives are all finite linear combinations of terms of the form $x^{-k} \exp(-x^{-2})$, where $k$ is a non-negative integer.
Consider the function $g(t) = t^k e^{-t}$. Then

\[ g'(t) = t^{k-1} e^{-t} (k - t) \]

which is non-positive for $t \ge k$. Thus for all such $t$, we have that $g(t) \le g(k)$. Now

\[ x^{-k} e^{-x^{-2}} = x^k \left( \frac{1}{x^2} \right)^k e^{-x^{-2}} \le x^k k^k e^{-k} \]

for $\frac{1}{x^2} \ge k$, i.e. $x \le \frac{1}{\sqrt{k}}$. It then follows that $x^{-k} e^{-x^{-2}} \to 0$ as $x \downarrow 0$. This completes the proof.
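
The bound used in the proof can be observed numerically. The sketch below fixes $k = 4$ as an assumed sample value and compares the term $x^{-k} e^{-x^{-2}}$ with the bound $x^k k^k e^{-k}$ for a few points $0 < x < 1/\sqrt{k}$. The names `term` and `bound` are hypothetical helpers.

```python
import math


def term(x: float, k: int) -> float:
    """The quantity x^(-k) * exp(-x^(-2)) appearing in the derivatives for x > 0."""
    return x ** (-k) * math.exp(-x ** (-2))


def bound(x: float, k: int) -> float:
    """The upper bound x^k * k^k * e^(-k), valid when 1/x^2 >= k."""
    return x ** k * k ** k * math.exp(-k)


k = 4                                   # assumed sample exponent
for x in (0.4, 0.2, 0.1, 0.05):         # all below 1/sqrt(k) = 0.5
    print(x, term(x, k), bound(x, k))
```

Both columns tend to zero as $x$ decreases, the first one much faster than the second.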

We can use the above lemma to construct examples of $C^\infty$ functions with compact support.

Example 10.5.1 Consider the function

\[ \rho(x) = \begin{cases} \exp(-a^2/(a^2 - x^2)) & \text{if } |x| < a, \\ 0 & \text{if } |x| \ge a. \end{cases} \]

A simple application of the preceding lemma shows that $\rho$ is a $C^\infty$ function and that its support is the interval $[-a, a]$.

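Example 10.5.2 This is a slight, but very useful, variation of the preceding example. Let $x = (x_1, \cdots, x_N) \in \mathbb{R}^N$. Let

\[ |x| = \left( \sum_{i=1}^N |x_i|^2 \right)^{1/2}. \tag{10.5.5} \]

Given $\varepsilon > 0$, define

\[ \rho_\varepsilon(x) = \begin{cases} \kappa \varepsilon^{-N} \exp(-\varepsilon^2/(\varepsilon^2 - |x|^2)) & \text{if } |x| < \varepsilon, \\ 0 & \text{if } |x| \ge \varepsilon, \end{cases} \tag{10.5.6} \]

where

\[ \kappa^{-1} = \int_{|x| \le 1} \exp(-1/(1 - |x|^2)) \, dx. \]

It follows then that $\rho_\varepsilon$ is a $C^\infty$ function with support in the closed ball $B(0; \varepsilon)$, centered at the origin and of radius $\varepsilon$. Further $\rho_\varepsilon \ge 0$ and by the change of variable $y = \frac{x}{\varepsilon}$, we see that

\[
\begin{aligned}
\int_{\mathbb{R}^N} \rho_\varepsilon \, dm_N
&= \frac{\kappa}{\varepsilon^N} \int_{|x| \le \varepsilon} \exp(-\varepsilon^2/(\varepsilon^2 - |x|^2)) \, dm_N(x) \\
&= \kappa \int_{|y| \le 1} \exp(-1/(1 - |y|^2)) \, dm_N(y) = 1.
\end{aligned}
\]

Thus, as $\varepsilon \to 0$, the functions $\rho_\varepsilon$ have decreasing supports, but preserve the volume contained under the graph and so will be concentrated near the origin.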
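
The normalisation can be checked numerically in dimension one. The sketch below assumes $N = 1$ and approximates $\kappa$ by a midpoint rule, so `KAPPA` is a numerical approximation and not an exact constant. The names `bump`, `midpoint`, `KAPPA` and `mollifier` are hypothetical helpers for this illustration.

```python
import math


def bump(x: float) -> float:
    """exp(-1/(1 - x^2)) inside (-1, 1), and zero outside."""
    d = 1.0 - x * x
    return math.exp(-1.0 / d) if d > 0.0 else 0.0


def midpoint(func, a: float, b: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return sum(func(a + (k + 0.5) * h) for k in range(steps)) * h


# Numerical approximation of the constant kappa in (10.5.6), for N = 1.
KAPPA = 1.0 / midpoint(bump, -1.0, 1.0)


def mollifier(x: float, eps: float) -> float:
    """rho_eps(x) in dimension one, following (10.5.6)."""
    d = eps * eps - x * x
    return KAPPA / eps * math.exp(-eps * eps / d) if d > 0.0 else 0.0


for eps in (1.0, 0.5, 0.1):
    # The peak value grows like 1/eps while the integral stays close to 1.
    print(eps, mollifier(0.0, eps), midpoint(lambda x: mollifier(x, eps), -eps, eps))
```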

Definition 10.5.2 The family of functions $\{\rho_\varepsilon\}_{\varepsilon > 0}$ is called the family of mollifiers.

Theorem 10.5.4 Let $\{\rho_\varepsilon\}_{\varepsilon > 0}$ be the family of mollifiers.

(i) If $f : \mathbb{R}^N \to \mathbb{R}$ is continuous, then $\rho_\varepsilon * f \to f$ pointwise, as $\varepsilon \to 0$.
(ii) If $f : \mathbb{R}^N \to \mathbb{R}$ is continuous with compact support, then $\rho_\varepsilon * f \to f$ uniformly, as $\varepsilon \to 0$.

Proof: (i) Let $x \in \mathbb{R}^N$. Then, given $\eta > 0$, there exists $\delta > 0$ such that for all $|y| < \delta$, we have $|f(x - y) - f(x)| < \eta$. Thus, if $\varepsilon < \delta$, we have, on observing that the integral of $\rho_\varepsilon$ is unity and that this function is supported on $B(0; \varepsilon)$,

\[ (\rho_\varepsilon * f)(x) - f(x) = \int_{|y| \le \varepsilon} (f(x - y) - f(x)) \rho_\varepsilon(y) \, dm_N(y) \]

which yields (since $\rho_\varepsilon \ge 0$)

\[
\begin{aligned}
|(\rho_\varepsilon * f)(x) - f(x)| &\le \int_{|y| \le \varepsilon} |f(x - y) - f(x)| \rho_\varepsilon(y) \, dm_N(y) \\
&< \eta \int_{|y| \le \varepsilon} \rho_\varepsilon(y) \, dm_N(y) = \eta.
\end{aligned}
\]

This proves the first statement.

(ii) If $\mathrm{supp}(f) = K$ which is compact, then

\[ \mathrm{supp}(\rho_\varepsilon * f) \subset K + B(0; \varepsilon) \]

which is compact and is contained within a fixed compact set, say, $K + B(0; 1)$ if we restrict $\varepsilon$ to be less than or equal to unity. Since $f$ has compact support, it is uniformly continuous and the $\delta$ corresponding to $\eta$ in the previous step is now independent of the point $x$ and so the pointwise convergence is now uniform.
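
The uniform convergence in (ii) can be watched on a simple example. The sketch below assumes $N = 1$ and the test function $f(x) = \max(0, 1 - |x|)$, which is continuous with compact support but is not taken from the text. The convolution integral is replaced by a midpoint sum whose weights are normalised to add up to one, so the computed values are approximations. The names `tent`, `mollify` and `sup_error` are hypothetical.

```python
import math


def tent(x: float) -> float:
    """Assumed test function: continuous, supported in [-1, 1], 1-Lipschitz."""
    return max(0.0, 1.0 - abs(x))


def mollify(func, eps: float, steps: int = 400):
    """Return a midpoint-rule approximation of rho_eps * func in dimension one."""
    h = 2.0 * eps / steps
    nodes = [-eps + (k + 0.5) * h for k in range(steps)]
    raw = [math.exp(-eps * eps / (eps * eps - y * y)) for y in nodes]
    total = sum(raw)
    weights = [r / total for r in raw]   # discrete stand-in for rho_eps(y) dy

    def smoothed(x: float) -> float:
        return sum(w * func(x - y) for w, y in zip(weights, nodes))

    return smoothed


def sup_error(eps: float) -> float:
    """Largest value of |rho_eps * f - f| over a grid covering [-2, 2]."""
    smooth = mollify(tent, eps)
    grid = [-2.0 + 0.1 * k for k in range(41)]
    return max(abs(smooth(x) - tent(x)) for x in grid)


for eps in (0.4, 0.2, 0.1):
    print(eps, sup_error(eps))
```

Since this $f$ is 1-Lipschitz, the argument of part (i) gives $|(\rho_\varepsilon * f)(x) - f(x)| \le \varepsilon$ for every $x$, and the computed errors stay below $\varepsilon$.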

Corollary 10.5.1 Let $f$ be a continuous real-valued function on $\mathbb{R}^N$ with compact support. Then $\rho_\varepsilon * f \to f$, as $\varepsilon \to 0$, in $L^p(\mathbb{R}^N)$ for all $1 \le p \le \infty$.

Proof: The case $p = \infty$ is already covered in the previous theorem. If $1 \le p < \infty$, then let $K$ be the compact set containing the support of $f$ and all the functions $\rho_\varepsilon * f$. Then, on this set we have uniform convergence, which automatically implies convergence in $L^p(\mathbb{R}^N)$.

Remark 10.5.3 Notice that in the above case, $\rho_\varepsilon * f$ is a $C^\infty$ function with compact support in $\mathbb{R}^N$.

Theorem 10.5.5 Let $1 \le p < \infty$. Then, the space of $C^\infty$ functions with compact support in $\mathbb{R}^N$ is dense in $L^p(\mathbb{R}^N)$.

Proof: By Corollary 10.5.1 and Remark 10.5.3 above, continuous functions with compact support can be approximated in $L^p(\mathbb{R}^N)$ by $C^\infty$ functions with compact support in $\mathbb{R}^N$. This completes the proof, since continuous functions with compact support are dense in $L^p(\mathbb{R}^N)$.

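Corollary 10.5.2 Let $\{\rho_\varepsilon\}_{\varepsilon > 0}$ be the family of mollifiers. If $f \in L^p(\mathbb{R}^N)$, then $\rho_\varepsilon * f \to f$ as $\varepsilon \to 0$, in $L^p(\mathbb{R}^N)$, for $1 \le p < \infty$.

Proof: Given $f \in L^p(\mathbb{R}^N)$, we can find, for every $\eta > 0$, a continuous function $g$ with compact support such that

\[ \|f - g\|_p < \frac{\eta}{3}. \]

Then, for $\varepsilon$ sufficiently small, we have, by Corollary 10.5.1, that

\[ \|\rho_\varepsilon * g - g\|_p < \frac{\eta}{3}. \]

Then

\[ \|\rho_\varepsilon * f - f\|_p \le \|f - g\|_p + \|g - \rho_\varepsilon * g\|_p + \|\rho_\varepsilon * (g - f)\|_p. \]

But by (10.5.2) when $p = 1$, and by (10.5.3) when $1 < p < \infty$,

\[ \|\rho_\varepsilon * (g - f)\|_p \le \|\rho_\varepsilon\|_1 \|g - f\|_p < \frac{\eta}{3} \]

since the integral of $\rho_\varepsilon$ is unity and the result now follows immediately.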


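Theorem 10.5.6 Let $\Omega \subset \mathbb{R}^N$ be an open set and let $1 \le p < \infty$. Then the space of $C^\infty$ functions with compact support in $\Omega$ is dense in $L^p(\Omega)$.

Proof: We know that continuous functions with compact support in $\Omega$ are dense in $L^p(\Omega)$ for $1 \le p < \infty$. Thus, given $\eta > 0$ and $f \in L^p(\Omega)$, there exists $g$, a continuous function with compact support in $\Omega$, such that $\|f - g\|_p < \eta/2$. Now, let $\tilde g$ be the extension of $g$ by zero outside $\Omega$. Then $\rho_\varepsilon * \tilde g$ is a $C^\infty$ function and since its support is compact and is contained in $B(0; \varepsilon) + \mathrm{supp}(\tilde g) = B(0; \varepsilon) + \mathrm{supp}(g) \subset \Omega$, for $\varepsilon$ sufficiently small, we have that $(\rho_\varepsilon * \tilde g)|_\Omega$ is a $C^\infty$ function with compact support in $\Omega$. But $\rho_\varepsilon * \tilde g \to \tilde g$, as $\varepsilon \to 0$, in $L^p(\mathbb{R}^N)$. Hence, for sufficiently small $\varepsilon$, we have

\[ \|(\rho_\varepsilon * \tilde g)|_\Omega - g\|_p < \frac{\eta}{2}, \]

which yields

\[ \|(\rho_\varepsilon * \tilde g)|_\Omega - f\|_p < \eta \]

which completes the proof.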

Bibliographical comment: Apart from the books cited in the text, the following are highly recommended for further study of the $L^p$-spaces as well as their applications:

1. Brézis, H., Functional Analysis, Sobolev Spaces and Partial Differential Equations, Springer, Universitext, 2011.

2. Ciarlet, P. G., Linear and Nonlinear Functional Analysis with Applications, SIAM, 2013.

3. Lieb, E. H. and Loss, M., Analysis, Graduate Studies in Mathematics, Volume 14, American Mathematical Society, 1997. (Indian Edition: Narosa, 1998.)

The five volume treatise entitled A Comprehensive Course in Analysis by Barry Simon is an excellent reference for all topics in Analysis. In particular Part 1 of this set has material relevant to topics treated in this book:
Real Analysis: A Comprehensive Course in Analysis (Part 1), American Mathematical Society, 2015. (Indian Edition: Universities Press, 2017.)

10.6 Exercises

10.1 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $1 \le p, q, r < \infty$ be such that

\[ \frac{1}{p} + \frac{1}{q} = \frac{1}{r}. \]

(a) If $f \in L^p(\mu)$ and $g \in L^q(\mu)$, show that $f g \in L^r(\mu)$ and that

\[ \|f g\|_r \le \|f\|_p \|g\|_q. \]

(b) If $f_n \to f$ in $L^p(\mu)$ and if $g_n \to g$ in $L^q(\mu)$, show that $f_n g_n \to f g$ in $L^r(\mu)$.

10.2 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $1 \le p < \infty$. Let $\{f_n\}_{n=1}^\infty$ be a sequence in $L^p(\mu)$. Assume that there exists $g \in L^p(\mu)$ such that $|f_n| \le g$ for each $n \in \mathbb{N}$. If $f_n \to f$ pointwise, show that $f \in L^p(\mu)$ and that $f_n \to f$ in $L^p(\mu)$.

10.3 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $1 \le p < \infty$. Let $f_n \to f$ in $L^p(\mu)$. Let $\{g_n\}_{n=1}^\infty$ be a sequence of measurable real-valued functions which are uniformly bounded by $M > 0$ and which converge almost everywhere to a measurable function $g$ on $X$. Show that $f_n g_n \to f g$ in $L^p(\mu)$.

10.4 Let $(X, \mathcal{S}, \mu)$ be a measure space. Let $1 < p < \infty$. Let $f : X \times X \to \mathbb{R}$ be such that, for every $y \in X$, the section $f^y$ is $p$-integrable and that

\[ \int_X \|f^y\|_p \, d\mu(y) < +\infty. \]

Define, for $x \in X$,

\[ g(x) = \int_X f(x, y) \, d\mu(y). \]

Show that $g \in L^p(\mu)$ and that

\[ \|g\|_p \le \int_X \|f^y\|_p \, d\mu(y). \]

10.5 (Riemann-Lebesgue lemma) Let $h : (0, \infty) \to \mathbb{R}$ be a bounded and (Lebesgue) measurable function such that
\[ \lim_{c \to \infty} \frac{1}{c} \int_{(0,c)} h \, dm_1 = 0. \]
(a) Let $[c, d] \subset (0, \infty)$ and let $f = \chi_{[c,d]}$. Show that
\[ \lim_{\omega \to \infty} \int_{(0,\infty)} f(t) h(\omega t) \, dm_1(t) = 0. \tag{10.6.1} \]
(b) Show that (10.6.1) holds for all $f \in L^1(0, \infty)$.
(c) If $f \in L^1(a, b)$, where $(a, b) \subset (0, \infty)$, show that
\[ \lim_{n \to \infty} \int_{(a,b)} f(t) \cos nt \, dm_1(t) = \lim_{n \to \infty} \int_{(a,b)} f(t) \sin nt \, dm_1(t) = 0. \]

10.6 (a) Consider the trigonometric series
\[ \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nt + b_n \sin nt). \]
Show that it can be written as
\[ \frac{a_0}{2} + \sum_{n=1}^{\infty} d_n \cos(nt - \phi_n). \]
(This is called the amplitude-phase form of the series.) Write down the relations between $a_n, b_n$ and $d_n, \phi_n$.
(b) Using the amplitude-phase form of a trigonometric series, show that if the series converges pointwise over a set $E$ whose (Lebesgue) measure is strictly positive, then $a_n \to 0$ and $b_n \to 0$ as $n \to \infty$. (This is called the Cantor-Lebesgue theorem.)
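As a hedged illustration of part (a) of Exercise 10.6, the relations can be read off from the addition formula for the cosine. The normalisation $d_n \ge 0$ used below is an assumption made here for definiteness; the exercise itself does not fix it.

```latex
% Sketch for Exercise 10.6(a): expand the amplitude-phase term and
% compare coefficients of cos(nt) and sin(nt). Assumes d_n >= 0.
\[
d_n \cos(nt - \phi_n)
= d_n \cos\phi_n \, \cos nt + d_n \sin\phi_n \, \sin nt ,
\]
\[
a_n = d_n \cos\phi_n , \qquad
b_n = d_n \sin\phi_n , \qquad
d_n = \sqrt{a_n^2 + b_n^2} .
\]
```

With this form, $a_n \to 0$ and $b_n \to 0$ together are equivalent to $d_n \to 0$, which is the quantity part (b) asks about.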

In Exercises 10.7-10.11, which follow, give direct proofs of the results, without appealing to the general results proved in this chapter.

10.7 Show that the spaces $\ell_p$ are complete, for $1 \le p \le \infty$.

10.8 Show that $\ell_p$ is separable if $1 \le p < \infty$ and that $\ell_\infty$ is not separable.

10.9 Let $V'$ denote the dual space of a Banach space $V$. If $p'$ is the conjugate exponent of $p$, show that $\ell_p'$ is isometrically isomorphic to $\ell_{p'}$, when $1 \le p < \infty$.

10.10 Give an example of a continuous linear functional on $\ell_\infty$ which does not arise from any element of $\ell_1$.

10.11 Let $c_0$ denote the space of all real sequences which converge to zero, equipped with the norm $\|\cdot\|_\infty$.
(a) Show that $c_0$ is complete.
(b) Show that $c_0'$ is isometrically isomorphic to $\ell_1$.

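10.12 (Clarkson's inequality)
(a) Let $2 \le p < \infty$. If $x \ge 0$, show that
\[ (x^2 + 1)^{\frac{p}{2}} \ge x^p + 1. \]
Deduce that, if $\alpha$ and $\beta$ are positive real numbers, then
\[ (\alpha^2 + \beta^2)^{\frac{p}{2}} \ge \alpha^p + \beta^p. \]
(b) Combining the above with the fact that the map $t \mapsto t^{\frac{p}{2}}$ is convex on the set $\{t \in \mathbb{R} \mid t \ge 0\}$, show that, if $f$ and $g$ are in $L^p(\mu)$, where $(X, \mathcal{S}, \mu)$ is a measure space, then
\[ \left\| \frac{1}{2}(f + g) \right\|_p^p + \left\| \frac{1}{2}(f - g) \right\|_p^p \le \frac{1}{2} \left( \|f\|_p^p + \|g\|_p^p \right). \]
(c) Deduce that if $(X, \mathcal{S}, \mu)$ is a measure space and if $2 \le p < \infty$, then the space $L^p(\mu)$ is uniformly convex, i.e. given $\varepsilon > 0$, there exists $\delta > 0$ such that, whenever $f$ and $g$ are in $L^p(\mu)$ with
\[ \|f\|_p = \|g\|_p = 1, \qquad \|f - g\|_p > \varepsilon, \]
we have
\[ \left\| \frac{1}{2}(f + g) \right\|_p < 1 - \delta. \]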
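One way to approach part (a) of Exercise 10.12, given here only as a sketch under the assumption that a calculus argument is acceptable: study the difference of the two sides as a function of $x$ and check that it starts at zero and is non-decreasing. The name $\psi$ is introduced here for illustration and does not appear in the text.

```latex
% Sketch for Exercise 10.12(a), with 2 <= p < infinity and x >= 0.
% \psi is a helper function introduced only for this illustration.
\[
\psi(x) = (x^2 + 1)^{\frac{p}{2}} - x^p - 1 , \qquad \psi(0) = 0 ,
\]
\[
\psi'(x) = p\,x \left[ (x^2 + 1)^{\frac{p}{2} - 1} - (x^2)^{\frac{p}{2} - 1} \right] \ge 0
\quad \text{since } \tfrac{p}{2} - 1 \ge 0 .
\]
% For the second inequality, take x = \alpha / \beta and multiply by \beta^p:
\[
(\alpha^2 + \beta^2)^{\frac{p}{2}}
= \beta^p \left( \tfrac{\alpha^2}{\beta^2} + 1 \right)^{\frac{p}{2}}
\ge \beta^p \left( \tfrac{\alpha^p}{\beta^p} + 1 \right)
= \alpha^p + \beta^p .
\]
```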

10.13 Let $(X, \mathcal{S}, \mu)$ be a measure space and let $1 < p < \infty$. Let $p'$ denote the conjugate exponent of $p$.
(a) If $g \in L^{p'}(\mu)$, define
\[ f(x) = \begin{cases} |g(x)|^{p' - 2} g(x), & \text{if } g(x) \ne 0, \\ 0, & \text{if } g(x) = 0. \end{cases} \]
Show that $f \in L^p(\mu)$.
(b) Define the continuous linear functional (as in Section 10.5)
\[ T_g(f) = \int_X f g \, d\mu, \quad \text{for all } f \in L^p(\mu). \]
Show that $\|T_g\| = \|g\|_{p'}$.
(c) Given that every uniformly convex Banach space is reflexive (cf. Kesavan [5], Theorem 5.5.1), deduce that $L^p(\mu)$ is reflexive for all $p$ such that $1 < p < \infty$.
(d) Deduce that the dual of $L^p(\mu)$ is isometrically isomorphic to $L^{p'}(\mu)$ for every $1 < p < \infty$.
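The exponent arithmetic behind parts (a) and (b) of Exercise 10.13 can be sketched as follows. This is a hedged outline rather than a full solution; it uses only the relation $1/p + 1/p' = 1$, and it covers only the lower bound for $\|T_g\|$ (the upper bound would come from Hölder's inequality).

```latex
% Sketch for Exercise 10.13(a),(b). From 1/p + 1/p' = 1 we get (p'-1)p = p'.
\[
|f|^p = |g|^{(p' - 1)p} = |g|^{p'} ,
\qquad\text{so}\qquad
\|f\|_p = \|g\|_{p'}^{\,p'/p} .
\]
% Testing T_g on this particular f (assuming g is not the zero element):
\[
T_g(f) = \int_X |g|^{p'} \, d\mu = \|g\|_{p'}^{\,p'} ,
\qquad
\frac{T_g(f)}{\|f\|_p} = \|g\|_{p'}^{\,p' - p'/p} = \|g\|_{p'} .
\]
```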

Remark 10.6.1 This gives a proof of the Riesz representation theorem when $1 < p < \infty$, without the assumption of $\sigma$-finiteness on the measure space. ∎

10.14 Let $f \in L^1(\mathbb{R}^N)$ be such that for every non-negative $C^\infty$ function $\varphi$ with compact support, we have
\[ \int_{\mathbb{R}^N} f \varphi \, dm_N \ge 0. \]
Show that $f \ge 0$ almost everywhere on $\mathbb{R}^N$.

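10.15 Let $(a, b) \subset \mathbb{R}$ be a finite interval. Let $\{\varphi_k\}_{k=1}^{\infty}$ be an orthonormal sequence in $L^2(a, b)$, i.e.
\[ \int_{(a,b)} \varphi_j \varphi_k \, dm_1 = \begin{cases} 1, & \text{if } j = k, \\ 0, & \text{if } j \ne k. \end{cases} \]
Let $\{c_k\}_{k=1}^{\infty}$ be scalars such that $\sum_{k=1}^{\infty} |c_k|^2 < +\infty$. Show that there exists $f \in L^2(a, b)$ such that:
(i) $\int_{(a,b)} f \varphi_k \, dm_1 = c_k$ for every $k \in \mathbb{N}$.
(ii)
\[ \|f\|_2^2 = \sum_{k=1}^{\infty} |c_k|^2. \]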
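A possible line of attack for Exercise 10.15, sketched under the assumption that the completeness of $L^2(a, b)$ may be used: show that the partial sums of $\sum c_k \varphi_k$ form a Cauchy sequence. The symbol $s_n$ is introduced here for illustration only.

```latex
% Sketch for Exercise 10.15. s_n is a helper notation for the partial sums.
\[
s_n = \sum_{k=1}^{n} c_k \varphi_k , \qquad
\|s_n - s_m\|_2^2 = \sum_{k=m+1}^{n} |c_k|^2 \longrightarrow 0
\quad (n > m \to \infty) ,
\]
% where the equality uses the orthonormality of the \varphi_k.
% If f denotes the limit of s_n in L^2(a,b), then for fixed k and n >= k:
\[
\int_{(a,b)} s_n \varphi_k \, dm_1 = c_k , \qquad
\|s_n\|_2^2 = \sum_{j=1}^{n} |c_j|^2 .
\]
```

Passing to the limit $n \to \infty$ in both identities, using the continuity of the inner product and of the norm, would then give (i) and (ii).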

Remark 10.6.2 This result is known as the Riesz-Fischer theorem. The completeness of the $L^p$ spaces (cf. Theorem 10.1.1) is also known by the same name. ∎

Remark 10.6.3 Let $(X, \mathcal{S}, \mu)$ be a measure space. The space $L^2(\mu)$ is a Hilbert space with the inner product defined by
\[ (f, g) = \int_X f g \, d\mu. \]
(If we are dealing with complex-valued functions, then $g$ above should be replaced by its complex conjugate.) The Riesz-Fischer theorem, as stated in the preceding exercise, is valid in any Hilbert space. ∎
Bibliography

[1] Ahlfors, L. V. Complex Analysis, International Student Edition, Third Edition, McGraw-Hill, 1979.

[2] Evans, L. C. and Gariepy, R. F. Measure Theory and Fine Properties of Functions, CRC Press, 1992.

[3] Folland, G. B. Real Analysis: Modern Techniques and their Applications, John Wiley and Sons Inc., 1984.

[4] Kesavan, S. Nonlinear Functional Analysis, A First Course, Texts and Readings in Mathematics (TRIM), 28, Hindustan Book Agency, 2004.

[5] Kesavan, S. Functional Analysis, Texts and Readings in Mathematics (TRIM), 52, Hindustan Book Agency, 2009.

[6] Royden, H. L. Real Analysis, 2nd Edition, Macmillan, 1964.

[7] Rudin, W. Principles of Mathematical Analysis, Third Edition, McGraw-Hill International Edition, 1976.

[8] Rudin, W. Real and Complex Analysis, Tata McGraw-Hill, 1974.

[9] Simmons, G. F. Introduction to Topology and Modern Analysis, McGraw-Hill, 1963.

Texts and Readings in Mathematics

1. R. B. Bapat: Linear Algebra and Linear Models (3/E)
2. Rajendra Bhatia: Fourier Series (2/E)
3. C. Musili: Representations of Finite Groups
4. Henry Helson: Linear Algebra (2/E)
5. Donald Sarason: Complex Function Theory (2/E)
6. M. G. Nadkarni: Basic Ergodic Theory (3/E)
7. Henry Helson: Harmonic Analysis (2/E)
8. K. Chandrasekharan: A Course on Integration Theory
9. K. Chandrasekharan: A Course on Topological Groups
10. Rajendra Bhatia(ed.): Analysis, Geometry and Probability
11. K. R. Davidson: C* -Algebras by Example
12. Meenaxi Bhattacharjee et al.: Notes on Infinite Permutation Groups
13. V. S. Sunder: Functional Analysis - Spectral Theory
14. V. S. Varadarajan: Algebra in Ancient and Modern Times
15. M. G. Nadkarni: Spectral Theory of Dynamical Systems
16. A. Borel: Semi-Simple Groups and Symmetric Spaces
17. Matilde Marcolli: Seiberg-Witten Gauge Theory
18. Albrecht Böttcher: Toeplitz Matrices, Asymptotic Linear Algebra and Functional
Analysis
19. A. Ramachandra Rao and P. Bhimasankaram: Linear Algebra (2/E)
20. C. Musili: Algebraic Geometry for Beginners
21. A. R. Rajwade: Convex Polyhedra with Regularity Conditions and Hilbert’s Third
Problem
22. S. Kumaresan: A Course in Differential Geometry and Lie Groups
23. Stef Tijs: Introduction to Game Theory
24. B. Sury: The Congruence Subgroup Problem - An Elementary Approach Aimed at
Applications
25. Rajendra Bhatia (ed.): Connected at Infinity - A Selection of Mathematics by
Indians
26. Kalyan Mukherjea: Differential Calculus in Normed Linear Spaces (2/E)
27. Satya Deo: Algebraic Topology - A Primer (2/E)
28. S. Kesavan: Nonlinear Functional Analysis - A First Course
29. Sandor Szabo: Topics in Factorization of Abelian Groups
30. S. Kumaresan and G.Santhanam: An Expedition to Geometry
31. David Mumford: Lectures on Curves on an Algebraic Surface (Reprint)
32. John. W Milnor and James D Stasheff: Characteristic Classes(Reprint)
33. K.R. Parthasarathy: Introduction to Probability and Measure
34. Amiya Mukherjee: Topics in Differential Topology
35. K.R. Parthasarathy: Mathematical Foundation of Quantum Mechanics (Corrected
Reprint)
36. K. B. Athreya and S.N.Lahiri: Measure Theory
37. Terence Tao: Analysis - I (3/E)
38. Terence Tao: Analysis - II (3/E)
39. Wolfram Decker and Christoph Lossen: Computing in Algebraic Geometry
40. A. Goswami and B.V.Rao: A Course in Applied Stochastic Processes
41. K. B. Athreya and S. N. Lahiri: Probability Theory
42. A. R. Rajwade and A.K. Bhandari: Surprises and Counterexamples in Real
Function Theory
43. Gene H. Golub and Charles F. Van Loan: Matrix Computations (Reprint of the 4/E)
44. Rajendra Bhatia: Positive Definite Matrices
45. K.R. Parthasarathy: Coding Theorems of Classical and Quantum Information
Theory (2/E)
46. C.S. Seshadri: Introduction to the Theory of Standard Monomials (2/E)
47. Alain Connes and Matilde Marcolli: Noncommutative Geometry, Quantum Fields
and Motives
48. Vivek S. Borkar: Stochastic Approximation - A Dynamical Systems Viewpoint
49. B.J. Venkatachala: Inequalities - An Approach Through Problems (2/E)
50. Rajendra Bhatia: Notes on Functional Analysis
51. A. Clebsch: Jacobi’s Lectures on Dynamics (2/E)
52. S. Kesavan: Functional Analysis
53. V.Lakshmibai and Justin Brown: Flag Varieties - An Interplay of Geometry,
Combinatorics and Representation Theory (2/E)
54. S. Ramasubramanian: Lectures on Insurance Models
55. Sebastian M. Cioaba and M. Ram Murty: A First Course in Graph Theory and
Combinatorics
56. Bamdad R. Yahaghi: Iranian Mathematics Competitions 1973-2007
57. Aloke Dey: Incomplete Block Designs
58. R.B. Bapat: Graphs and Matrices (2/E)
59. Hermann Weyl: Algebraic Theory of Numbers(Reprint)
60. C L Siegel: Transcendental Numbers(Reprint)
61. Steven J. Miller and Ramin Takloo-Bighash: An Invitation to Modern Number
Theory (Reprint)
62. John Milnor: Dynamics in One Complex Variable (3/E)
63. R. P. Pakshirajan: Probability Theory: A Foundational Course
64. Sharad S. Sane: Combinatorial Techniques
65. Hermann Weyl: The Classical Groups-Their Invariants and Representations
(Reprint)
66. John Milnor: Morse Theory (Reprint)
67. Rajendra Bhatia(Ed.): Connected at Infinity II- A Selection of Mathematics by
Indians
68. Donald Passman: A Course in Ring Theory (Reprint)
69. Amiya Mukherjee: Atiyah-Singer Index Theorem- An Introduction
70. Fumio Hiai and Denes Petz: Introduction to Matrix Analysis and Applications
71. V. S. Sunder: Operators on Hilbert Space
72. Amiya Mukherjee: Differential Topology
73. David Mumford and Tadao Oda: Algebraic Geometry II
74. Kalyan B. Sinha and Sachi Srivastava: Theory of Semigroups and Applications
75. Arup Bose and Snigdhansu Chatterjee: U-Statistics, M_m-Estimators and
Resampling
76. Rajeeva L. Karandikar and B. V. Rao: Introduction to Stochastic Calculus
