
Measure Theory and Integration

Richard F. Bass
Department of Mathematics
University of Connecticut
September 18, 1998
These notes are © 1998 by Richard Bass. They may be used for personal use or
class use, but not for commercial purposes.
1. Measures.
Let $X$ be a set. We will use the notation $A^c = \{x \in X : x \notin A\}$ and $A - B = A \cap B^c$.
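Definition. An algebra or a field is a collection $\mathcal{A}$ of subsets of $X$ such that
(a) $\emptyset, X \in \mathcal{A}$;
(b) if $A \in \mathcal{A}$, then $A^c \in \mathcal{A}$;
(c) if $A_1, \ldots, A_n \in \mathcal{A}$, then $\bigcup_{i=1}^n A_i$ and $\bigcap_{i=1}^n A_i$ are in $\mathcal{A}$.
$\mathcal{A}$ is a $\sigma$-algebra or $\sigma$-field if in addition
(d) if $A_1, A_2, \ldots$ are in $\mathcal{A}$, then $\bigcup_{i=1}^\infty A_i$ and $\bigcap_{i=1}^\infty A_i$ are in $\mathcal{A}$.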
In (d) we allow countable unions and intersections only; we do not allow uncountable
unions and intersections.
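Example. Let $X = \mathbb{R}$ and let $\mathcal{A}$ be the collection of all subsets of $\mathbb{R}$.
Example. Let $X = \mathbb{R}$ and let $\mathcal{A} = \{A \subset \mathbb{R} : A \text{ is countable or } A^c \text{ is countable}\}$.
Definition. A measure on $(X, \mathcal{A})$ is a function $\mu : \mathcal{A} \to [0, \infty]$ such that
(a) $\mu(A) \ge 0$ for all $A \in \mathcal{A}$;
(b) $\mu(\emptyset) = 0$;
(c) if $A_i \in \mathcal{A}$, $i = 1, 2, \ldots$, are disjoint, then
$$\mu\Big(\bigcup_{i=1}^\infty A_i\Big) = \sum_{i=1}^\infty \mu(A_i).$$
Example. $X$ is any set, $\mathcal{A}$ is the collection of all subsets, and $\mu(A)$ is the number of elements in $A$.
Example. $X = \mathbb{R}$, $\mathcal{A}$ the collection of all subsets, $x_1, x_2, \ldots \in \mathbb{R}$, $a_1, a_2, \ldots > 0$, and
$$\mu(A) = \sum_{\{i : x_i \in A\}} a_i.$$
Example. $\delta_x(A) = 1$ if $x \in A$ and 0 otherwise. This measure is called point mass at $x$.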
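The three examples above can be sketched in code for finite sets. This is only an illustration under stated assumptions: the function names and the sample atoms and weights are hypothetical, and sets are restricted to finite Python sets so that sums are finite.

```python
from fractions import Fraction

def counting_measure(A):
    """Counting measure: the number of elements of the finite set A."""
    return len(A)

def point_mass(x):
    """Return the point mass at x as a function of sets."""
    return lambda A: 1 if x in A else 0

def weighted_measure(points, weights):
    """mu(A) = sum of a_i over those i with x_i in A."""
    return lambda A: sum(a for x, a in zip(points, weights) if x in A)

# Hypothetical data: three atoms with positive weights.
mu = weighted_measure([0, 1, 2],
                      [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)])
A, B = {0}, {1, 2}                     # two disjoint sets
assert mu(A | B) == mu(A) + mu(B)      # additivity on disjoint sets
assert mu(set()) == 0                  # the empty set has measure 0
```

Exact rational weights are used so that the additivity check is an exact equality rather than a floating-point comparison.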
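Proposition 1.1. The following hold:
(a) If $A, B \in \mathcal{A}$ with $A \subset B$, then $\mu(A) \le \mu(B)$.
(b) If $A_i \in \mathcal{A}$ and $A = \bigcup_{i=1}^\infty A_i$, then $\mu(A) \le \sum_{i=1}^\infty \mu(A_i)$.
(c) If $A_i \in \mathcal{A}$, $A_1 \subset A_2 \subset \cdots$, and $A = \bigcup_{i=1}^\infty A_i$, then $\mu(A) = \lim_{n \to \infty} \mu(A_n)$.
(d) If $A_i \in \mathcal{A}$, $A_1 \supset A_2 \supset \cdots$, $\mu(A_1) < \infty$, and $A = \bigcap_{i=1}^\infty A_i$, then we have $\mu(A) = \lim_{n \to \infty} \mu(A_n)$.
Proof. (a) Let $A_1 = A$, $A_2 = B - A$, and $A_3 = A_4 = \cdots = \emptyset$. Now use part (c) of the definition of measure.
(b) Let $B_1 = A_1$, $B_2 = A_2 - B_1$, $B_3 = A_3 - (B_1 \cup B_2)$, and so on. The $B_i$ are disjoint and $\bigcup_{i=1}^\infty B_i = \bigcup_{i=1}^\infty A_i$. So $\mu(A) = \sum \mu(B_i) \le \sum \mu(A_i)$.
(c) Define the $B_i$ as in (b). Since $\bigcup_{i=1}^n B_i = \bigcup_{i=1}^n A_i$, then
$$\mu(A) = \mu\Big(\bigcup_{i=1}^\infty A_i\Big) = \mu\Big(\bigcup_{i=1}^\infty B_i\Big) = \sum_{i=1}^\infty \mu(B_i)
= \lim_{n \to \infty} \sum_{i=1}^n \mu(B_i) = \lim_{n \to \infty} \mu\Big(\bigcup_{i=1}^n B_i\Big) = \lim_{n \to \infty} \mu\Big(\bigcup_{i=1}^n A_i\Big),$$
and $\bigcup_{i=1}^n A_i = A_n$ because the $A_i$ increase.
(d) Apply (c) to the sets $A_1 - A_i$, $i = 1, 2, \ldots$.
Definition. A probability or probability measure is a measure such that $\mu(X) = 1$. In this case we usually write $(\Omega, \mathcal{F}, P)$ instead of $(X, \mathcal{A}, \mu)$.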
2. Construction of Lebesgue measure.
Define $m((a, b)) = b - a$. If $G$ is an open set and $G \subset \mathbb{R}$, then $G = \bigcup_{i=1}^\infty (a_i, b_i)$ with the intervals disjoint. Define $m(G) = \sum_{i=1}^\infty (b_i - a_i)$. If $A \subset \mathbb{R}$, define
$$m^*(A) = \inf\{m(G) : G \text{ open}, A \subset G\}.$$
We will show the following.
(1) $m^*$ is not a measure on the collection of all subsets of $\mathbb{R}$.
(2) $m^*$ is a measure on the $\sigma$-algebra consisting of what are known as $m^*$-measurable sets.
(3) Let $\mathcal{A}_0$ be the algebra (not $\sigma$-algebra) consisting of all finite unions of sets of the form $(a_i, b_i]$. If $\mathcal{A}$ is the smallest $\sigma$-algebra containing $\mathcal{A}_0$, then $m^*$ is a measure on $(\mathbb{R}, \mathcal{A})$.
We will prove these three facts (and a bit more) in a moment, but let's first make some remarks about the consequences of (1)-(3).
If you take any collection of $\sigma$-algebras and take their intersection, it is easy to see that this will again be a $\sigma$-algebra. The smallest $\sigma$-algebra containing $\mathcal{A}_0$ will be the intersection of all $\sigma$-algebras containing $\mathcal{A}_0$.
Since $(a, b]$ is in $\mathcal{A}_0$ for all $a$ and $b$, then $(a, b) = \bigcup_{i=i_0}^\infty (a, b - 1/i] \in \mathcal{A}$, where we choose $i_0$ so that $1/i_0 < b - a$. Then sets of the form $\bigcup_{i=1}^\infty (a_i, b_i)$ will be in $\mathcal{A}$, hence all open sets. Therefore all closed sets are in $\mathcal{A}$ as well.
The smallest $\sigma$-algebra containing the open sets is called the Borel $\sigma$-algebra. It is often written $\mathcal{B}$.
A set $N$ is a null set if $m^*(N) = 0$. Let $\mathcal{L}$ be the smallest $\sigma$-algebra containing $\mathcal{B}$ and all the null sets. $\mathcal{L}$ is called the Lebesgue $\sigma$-algebra, and sets in $\mathcal{L}$ are called Lebesgue measurable.
As part of our proofs of (2) and (3) we will show that $m^*$ is a measure on $\mathcal{L}$. Lebesgue measure is the measure $m^*$ on $\mathcal{L}$. (1) shows that $\mathcal{L}$ is strictly smaller than the collection of all subsets of $\mathbb{R}$.
Proof of (1). Suppose $m^*$ were a measure on the collection of all subsets of $\mathbb{R}$. Define $x \sim y$ if $x - y$ is rational. This is an equivalence relation on $[0, 1]$. For each equivalence class, pick an element out of that class (by the axiom of choice). Call the collection of such points $A$. Given a set $B$, define $B + x = \{y + x : y \in B\}$. Note $m^*(A + q) = m^*(A)$ since this translation invariance holds for intervals, hence for open sets, hence for all sets. Moreover, the sets $A + q$ are disjoint for different rationals $q$.
Now
$$[0, 1] \subset \bigcup_{q \in [-2, 2]} (A + q),$$
where the union is only over rational $q$, so $1 \le \sum_{q \in [-2, 2]} m^*(A + q)$, and therefore $m^*(A) > 0$. But
$$\bigcup_{q \in [-2, 2]} (A + q) \subset [-6, 6],$$
where again the union is only over rational $q$, so by countable additivity $12 \ge \sum_{q \in [-2, 2]} m^*(A + q)$, which implies $m^*(A) = 0$, a contradiction.
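Proposition 2.1. The following hold:
(a) $m^*(\emptyset) = 0$;
(b) if $A \subset B$, then $m^*(A) \le m^*(B)$;
(c) $m^*\big(\bigcup_{i=1}^\infty A_i\big) \le \sum_{i=1}^\infty m^*(A_i)$.
Proof. (a) and (b) are obvious. To prove (c), let $\varepsilon > 0$. For each $i$ there exist intervals $I_{i1}, I_{i2}, \ldots$ such that $A_i \subset \bigcup_{j=1}^\infty I_{ij}$ and $\sum_j m(I_{ij}) \le m^*(A_i) + \varepsilon/2^i$. Then $\bigcup_{i=1}^\infty A_i \subset \bigcup_{i,j} I_{ij}$ and
$$\sum_{i,j} m(I_{ij}) \le \sum_i m^*(A_i) + \sum_i \varepsilon/2^i = \sum_i m^*(A_i) + \varepsilon.$$
Since $\varepsilon$ is arbitrary, $m^*\big(\bigcup_{i=1}^\infty A_i\big) \le \sum_{i=1}^\infty m^*(A_i)$.
A function on the collection of all subsets satisfying (a), (b), and (c) is called an outer measure.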
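The $\varepsilon/2^i$ device in the proof of (c) can be checked numerically. The sketch below is an illustration only (the helper name `cover_lengths` is hypothetical): it covers the first $n$ points of an enumerated countable set by intervals of lengths $\varepsilon/2^i$ and confirms that the total length stays below $\varepsilon$, which is the same bookkeeping that shows a countable set has outer measure zero.

```python
from fractions import Fraction

def cover_lengths(n, eps):
    """Lengths eps/2^i, i = 1..n, of intervals covering n enumerated points."""
    return [eps / 2**i for i in range(1, n + 1)]

eps = Fraction(1, 100)
total = sum(cover_lengths(50, eps))
assert total < eps                     # partial sums of eps/2^i stay below eps
assert eps - total == eps / 2**50      # what is left is the geometric tail
```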
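Definition. Let $m^*$ be an outer measure. A set $A \subset X$ is $m^*$-measurable if
$$m^*(E) = m^*(E \cap A) + m^*(E \cap A^c) \qquad (2.1)$$
for all $E \subset X$.
Theorem 2.2. If $m^*$ is an outer measure on $X$, then the collection $\mathcal{A}$ of $m^*$-measurable sets is a $\sigma$-algebra and the restriction of $m^*$ to $\mathcal{A}$ is a measure. Moreover, $\mathcal{A}$ contains all the null sets.
Proof. By Proposition 2.1(c),
$$m^*(E) \le m^*(E \cap A) + m^*(E \cap A^c)$$
for all $E \subset X$. So to check (2.1) it is enough to show $m^*(E) \ge m^*(E \cap A) + m^*(E \cap A^c)$. This will be trivial in the case $m^*(E) = \infty$.
If $A \in \mathcal{A}$, then $A^c \in \mathcal{A}$ by symmetry and the definition of $\mathcal{A}$. Suppose $A, B \in \mathcal{A}$ and $E \subset X$. Then
$$m^*(E) = m^*(E \cap A) + m^*(E \cap A^c)
= \big(m^*(E \cap A \cap B) + m^*(E \cap A \cap B^c)\big) + \big(m^*(E \cap A^c \cap B) + m^*(E \cap A^c \cap B^c)\big).$$
The first three terms on the right have a sum greater than or equal to $m^*(E \cap (A \cup B))$ because $A \cup B \subset (A \cap B) \cup (A \cap B^c) \cup (A^c \cap B)$. Therefore
$$m^*(E) \ge m^*(E \cap (A \cup B)) + m^*(E \cap (A \cup B)^c),$$
which shows $A \cup B \in \mathcal{A}$. Therefore $\mathcal{A}$ is an algebra.
Let $A_i$ be disjoint sets in $\mathcal{A}$, let $B_n = \bigcup_{i=1}^n A_i$, and $B = \bigcup_{i=1}^\infty A_i$. If $E \subset X$,
$$m^*(E \cap B_n) = m^*(E \cap B_n \cap A_n) + m^*(E \cap B_n \cap A_n^c)
= m^*(E \cap A_n) + m^*(E \cap B_{n-1}).$$
Repeating for $m^*(E \cap B_{n-1})$, we obtain
$$m^*(E \cap B_n) = \sum_{i=1}^n m^*(E \cap A_i).$$
So
$$m^*(E) = m^*(E \cap B_n) + m^*(E \cap B_n^c) \ge \sum_{i=1}^n m^*(E \cap A_i) + m^*(E \cap B^c).$$
Let $n \to \infty$. Then
$$m^*(E) \ge \sum_{i=1}^\infty m^*(E \cap A_i) + m^*(E \cap B^c)
\ge m^*\Big(\bigcup_{i=1}^\infty (E \cap A_i)\Big) + m^*(E \cap B^c)
= m^*(E \cap B) + m^*(E \cap B^c)
\ge m^*(E).$$
This shows $B \in \mathcal{A}$.
If we set $E = B$ in this last equation, we obtain
$$m^*(B) = \sum_{i=1}^\infty m^*(A_i),$$
or $m^*$ is countably additive on $\mathcal{A}$.
If $m^*(A) = 0$ and $E \subset X$, then
$$m^*(E \cap A) + m^*(E \cap A^c) = m^*(E \cap A^c) \le m^*(E),$$
which shows $\mathcal{A}$ contains all null sets.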
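Condition (2.1) can be tested by brute force on a small finite set. The following sketch uses a toy outer measure chosen purely for illustration (it is an assumption of this example, not something from the notes): on $X = \{0, 1, 2\}$ it assigns 0 to the empty set, 2 to $X$, and 1 to every other subset. It satisfies (a), (b), (c) of Proposition 2.1, yet only $\emptyset$ and $X$ pass the measurability test.

```python
from itertools import chain, combinations

X = frozenset({0, 1, 2})

def subsets(S):
    """All subsets of the finite set S, as frozensets."""
    S = list(S)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(S, r) for r in range(len(S) + 1))]

def outer(E):
    """Toy outer measure: 0 on the empty set, 2 on X, 1 otherwise."""
    if not E:
        return 0
    return 2 if E == X else 1

def is_measurable(A):
    """Check (2.1): m*(E) = m*(E & A) + m*(E - A) for every E."""
    return all(outer(E) == outer(E & A) + outer(E - A) for E in subsets(X))

measurable = [A for A in subsets(X) if is_measurable(A)]
```

For instance, $A = \{0\}$ fails with $E = \{0, 1\}$, since $1 \ne 1 + 1$. The measurable sets still form a $\sigma$-algebra, as Theorem 2.2 predicts, but here it is the trivial one.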
None of this is useful if $\mathcal{A}$ does not contain the intervals. There are two main steps in showing this. Let $\mathcal{A}_0$ be the algebra consisting of all finite unions of intervals of the form $(a, b]$. The first step is
Proposition 2.3. If $A_i \in \mathcal{A}_0$ are disjoint and $\bigcup_{i=1}^\infty A_i \in \mathcal{A}_0$, then we have $m\big(\bigcup_{i=1}^\infty A_i\big) = \sum_{i=1}^\infty m(A_i)$.
Proof. Since $\bigcup_{i=1}^\infty A_i$ is a finite union of intervals $(a_k, b_k]$, we may look at $A_i \cap (a_k, b_k]$ for each $k$. So we may assume that $A = \bigcup_{i=1}^\infty A_i = (a, b]$.
First,
$$m(A) = m\Big(\bigcup_{i=1}^n A_i\Big) + m\Big(A - \bigcup_{i=1}^n A_i\Big) \ge m\Big(\bigcup_{i=1}^n A_i\Big) = \sum_{i=1}^n m(A_i).$$
Letting $n \to \infty$,
$$m(A) \ge \sum_{i=1}^\infty m(A_i).$$
Let us assume $a$ and $b$ are finite, the other case being similar. By linearity, we may assume $A_i = (a_i, b_i]$. Let $\varepsilon > 0$. The collection of open intervals $(a_i, b_i + \varepsilon/2^i)$ covers $[a + \varepsilon, b]$, and so there exists a finite subcover. Discarding any interval contained in another one, and relabeling, we may assume $a_1 < a_2 < \cdots < a_N$ and $b_i + \varepsilon/2^i \in (a_{i+1}, b_{i+1} + \varepsilon/2^{i+1})$. Then
$$m(A) = b - a = b - (a + \varepsilon) + \varepsilon \le \sum_{i=1}^N (b_i + \varepsilon/2^i - a_i) + \varepsilon \le \sum_{i=1}^\infty m(A_i) + 2\varepsilon.$$
Since $\varepsilon$ is arbitrary, $m(A) \le \sum_{i=1}^\infty m(A_i)$.
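The second step is the Caratheodory extension theorem. We say that a measure $m$ is $\sigma$-finite if there exist $E_1, E_2, \ldots$, such that $m(E_i) < \infty$ for all $i$ and $X \subset \bigcup_{i=1}^\infty E_i$.
Theorem 2.4. Suppose $\mathcal{A}_0$ is an algebra and $m$ restricted to $\mathcal{A}_0$ is a measure. Define
$$m^*(E) = \inf\Big\{\sum_{i=1}^\infty m(A_i) : A_i \in \mathcal{A}_0, E \subset \bigcup_{i=1}^\infty A_i\Big\}.$$
Then
(a) $m^*(A) = m(A)$ if $A \in \mathcal{A}_0$;
(b) every set in $\mathcal{A}_0$ is $m^*$-measurable;
(c) if $m$ is $\sigma$-finite, then there is a unique extension to the smallest $\sigma$-field containing $\mathcal{A}_0$.
Proof. We start with (a). Suppose $E \in \mathcal{A}_0$. We know $m^*(E) \le m(E)$ since we can take $A_1 = E$ and $A_2, A_3, \ldots$ empty in the definition of $m^*$. If $E \subset \bigcup_{i=1}^\infty A_i$ with $A_i \in \mathcal{A}_0$, let $B_n = E \cap \big(A_n - \bigcup_{i=1}^{n-1} A_i\big)$. Then the $B_n$ are disjoint, they are each in $\mathcal{A}_0$, and their union is $E$. Therefore
$$m(E) = \sum_{i=1}^\infty m(B_i) \le \sum_{i=1}^\infty m(A_i).$$
Thus $m(E) \le m^*(E)$.
Next we look at (b). Suppose $A \in \mathcal{A}_0$. Let $\varepsilon > 0$ and let $E \subset X$. Pick $B_i \in \mathcal{A}_0$ such that $E \subset \bigcup_{i=1}^\infty B_i$ and $\sum_i m(B_i) \le m^*(E) + \varepsilon$. Then
$$m^*(E) + \varepsilon \ge \sum_{i=1}^\infty m(B_i) = \sum_{i=1}^\infty m(B_i \cap A) + \sum_{i=1}^\infty m(B_i \cap A^c)
\ge m^*(E \cap A) + m^*(E \cap A^c).$$
Since $\varepsilon$ is arbitrary, $m^*(E) \ge m^*(E \cap A) + m^*(E \cap A^c)$. So $A$ is $m^*$-measurable.
Finally, suppose we have two extensions to the smallest $\sigma$-field containing $\mathcal{A}_0$; let the other extension be called $n$. We will show that if $E$ is in this smallest $\sigma$-field, then $m^*(E) = n(E)$.
Since $E$ must be $m^*$-measurable, $m^*(E) = \inf\{\sum_{i=1}^\infty m(A_i) : E \subset \bigcup_{i=1}^\infty A_i, A_i \in \mathcal{A}_0\}$. But $m = n$ on $\mathcal{A}_0$, so $\sum_i m(A_i) = \sum_i n(A_i)$. Therefore $n(E) \le \sum_i n(A_i)$, which implies $n(E) \le m^*(E)$.
Let $\varepsilon > 0$ and choose $A_i \in \mathcal{A}_0$ such that $m^*(E) + \varepsilon \ge \sum_i m(A_i)$ and $E \subset \bigcup_i A_i$. Let $A = \bigcup_i A_i$ and $B_k = \bigcup_{i=1}^k A_i$. Observe $m^*(E) + \varepsilon \ge m^*(A)$, hence $m^*(A - E) \le \varepsilon$. We have
$$m^*(A) = \lim_{k \to \infty} m^*(B_k) = \lim_{k \to \infty} n(B_k) = n(A).$$
Then
$$m^*(E) \le m^*(A) = n(A) = n(E) + n(A - E) \le n(E) + m^*(A - E) \le n(E) + \varepsilon.$$
Since $\varepsilon$ is arbitrary, this completes the proof.
We now drop the $*$ from $m^*$ and call $m$ Lebesgue measure.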


3. Lebesgue-Stieltjes measures. Let $\alpha : \mathbb{R} \to \mathbb{R}$ be nondecreasing and right continuous (i.e., $\alpha(x+) = \alpha(x)$ for all $x$). Suppose we define $m^*((a, b)) = \alpha(b) - \alpha(a)$, define $m^*\big(\bigcup_{i=1}^\infty (a_i, b_i)\big) = \sum_i (\alpha(b_i) - \alpha(a_i))$ when the intervals $(a_i, b_i)$ are disjoint, and define $m^*(A) = \inf\{m^*(G) : A \subset G, G \text{ open}\}$. Very much as in the previous section we can show that $m^*$ is a measure on the Borel $\sigma$-algebra. The only differences in the proof are that where we had $a + \varepsilon$, we replace this by $a'$, where $a'$ is chosen so that $a' > a$ and $\alpha(a') \le \alpha(a) + \varepsilon$, and we replace $b_i + \varepsilon/2^i$ by $b_i'$, where $b_i'$ is chosen so that $b_i' > b_i$ and $\alpha(b_i') \le \alpha(b_i) + \varepsilon/2^i$. These choices are possible because $\alpha$ is right continuous.
Lebesgue measure is the special case of $m^*$ when $\alpha(x) = x$.
Given a measure $\mu$ on $\mathbb{R}$ such that $\mu(K) < \infty$ whenever $K$ is compact, define $\alpha(x) = \mu((0, x])$ if $x \ge 0$ and $\alpha(x) = -\mu((x, 0])$ if $x < 0$. Then $\alpha$ is nondecreasing, right continuous, and it is not hard to see that $\mu = m^*$.
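The passage from a measure $\mu$ to its distribution function $\alpha$ can be tried out on a purely atomic measure. The sketch below is an assumption-laden illustration: the atoms and weights are made up, the helper names are hypothetical, and it works with half-open intervals $(a, b]$, matching the definition $\alpha(x) = \mu((0, x])$. It checks that $\alpha(b) - \alpha(a)$ recovers $\mu((a, b])$.

```python
# Hypothetical point masses: location -> weight (dyadic values, exact in floats).
atoms = {-1.5: 2.0, 0.0: 1.0, 0.5: 0.25, 2.0: 3.0}

def mu_half_open(a, b):
    """mu((a, b]) for the purely atomic measure above."""
    return sum(w for x, w in atoms.items() if a < x <= b)

def alpha(x):
    """alpha(x) = mu((0, x]) for x >= 0 and -mu((x, 0]) for x < 0."""
    return mu_half_open(0, x) if x >= 0 else -mu_half_open(x, 0)

# alpha(b) - alpha(a) should equal mu((a, b]) whenever a < b.
for a, b in [(-2, 1), (0.5, 2), (-1.5, 0), (-3, -1)]:
    assert alpha(b) - alpha(a) == mu_half_open(a, b)
```

Note the sign for negative $x$: without it, $\alpha$ would fail to be nondecreasing across 0.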
4. Measurable functions. Suppose we have a set $X$ together with a $\sigma$-algebra $\mathcal{A}$.
Definition. $f : X \to \mathbb{R}$ is measurable if $\{x : f(x) > a\} \in \mathcal{A}$ for all $a \in \mathbb{R}$.
Proposition 4.1. The following are equivalent.
(a) $\{x : f(x) > a\} \in \mathcal{A}$ for all $a$;
(b) $\{x : f(x) \le a\} \in \mathcal{A}$ for all $a$;
(c) $\{x : f(x) < a\} \in \mathcal{A}$ for all $a$;
(d) $\{x : f(x) \ge a\} \in \mathcal{A}$ for all $a$.
Proof. The equivalence of (a) and (b) and of (c) and (d) follows from taking complements. The remaining equivalences follow from the equations
$$\{x : f(x) \ge a\} = \bigcap_{n=1}^\infty \{x : f(x) > a - 1/n\},$$
$$\{x : f(x) > a\} = \bigcup_{n=1}^\infty \{x : f(x) \ge a + 1/n\}.$$
Proposition 4.2. If $X$ is a metric space, $\mathcal{A}$ contains all the open sets, and $f$ is continuous, then $f$ is measurable.
Proof. $\{x : f(x) > a\} = f^{-1}((a, \infty))$ is open.
Proposition 4.3. If $f$ and $g$ are measurable, so are $f + g$, $cf$, $fg$, $\max(f, g)$, and $\min(f, g)$.
Proof. If $f(x) + g(x) < \alpha$, then $f(x) < \alpha - g(x)$, and there exists a rational $r$ such that $f(x) < r < \alpha - g(x)$. So
$$\{x : f(x) + g(x) < \alpha\} = \bigcup_{r \text{ rational}} \big(\{x : f(x) < r\} \cap \{x : g(x) < \alpha - r\}\big).$$
$f^2$ is measurable since for $a \ge 0$ we have $\{x : f(x)^2 > a\} = \{x : f(x) > \sqrt{a}\} \cup \{x : f(x) < -\sqrt{a}\}$. The measurability of $fg$ follows since $fg = \frac{1}{2}[(f + g)^2 - f^2 - g^2]$.
$\{x : \max(f(x), g(x)) > a\} = \{x : f(x) > a\} \cup \{x : g(x) > a\}$.
Proposition 4.4. If $f_i$ is measurable for each $i$, then so are $\sup_i f_i$, $\inf_i f_i$, $\limsup_{i \to \infty} f_i$, and $\liminf_{i \to \infty} f_i$.
Proof. The result will follow for limsup and liminf once we have the result for the sup and inf by using the definitions. We have $\{x : \sup_i f_i > a\} = \bigcup_{i=1}^\infty \{x : f_i(x) > a\}$, and the proof for $\inf f_i$ is similar.
Definition. We say $f = g$ almost everywhere, written $f = g$ a.e., if $\{x : f(x) \ne g(x)\}$ has measure zero. Similarly, we say $f_i \to f$ a.e., if the set of $x$ where this fails has measure zero.
5. Integration. In this section we introduce the Lebesgue integral.
Definition. If $E \subset X$, define the characteristic function of $E$ by
$$\chi_E(x) = \begin{cases} 1, & x \in E; \\ 0, & x \notin E. \end{cases}$$
A simple function $s$ is one of the form
$$s(x) = \sum_{i=1}^n a_i \chi_{E_i}(x)$$
for reals $a_i$ and sets $E_i$.
Proposition 5.1. Suppose $f \ge 0$ is measurable. Then there exists a sequence of nonnegative measurable simple functions increasing to $f$.
Proof. Let $E_{ni} = \{x : (i - 1)/2^n \le f(x) < i/2^n\}$ and $F_n = \{x : f(x) \ge n\}$ for $n = 1, 2, \ldots$, and $i = 1, 2, \ldots, n2^n$. Then define
$$s_n = \sum_{i=1}^{n2^n} \frac{i - 1}{2^n} \chi_{E_{ni}} + n \chi_{F_n}.$$
It is easy to see that $s_n$ has the desired properties.
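The construction in the proof of Proposition 5.1 depends on $x$ only through the value $f(x)$, so it can be sketched as a function of that value. This is a minimal illustration; the name `s_n` and the float-based evaluation are choices made for this example.

```python
import math

def s_n(f_value, n):
    """Value of the n-th dyadic simple function where f takes the value f_value >= 0."""
    if f_value >= n:
        return n                                   # the term n * chi_{F_n}
    # On E_{ni}, (i-1)/2^n <= f < i/2^n, so floor(f * 2^n) = i - 1.
    return math.floor(f_value * 2**n) / 2**n

x = math.pi                                        # a sample value of f
approx = [s_n(x, n) for n in range(1, 12)]
assert all(a <= b for a, b in zip(approx, approx[1:]))   # s_n increases in n
assert all(a <= x for a in approx)                       # and stays below f
```

Once $n$ exceeds the value of $f$ at a point, the error there is at most $2^{-n}$, which is why $s_n$ increases to $f$.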
Definition. If $s = \sum_{i=1}^n a_i \chi_{E_i}$ is a nonnegative measurable simple function, define the Lebesgue integral of $s$ to be
$$\int s \, d\mu = \sum_{i=1}^n a_i \mu(E_i). \qquad (5.1)$$
If $f \ge 0$ is a measurable function, define
$$\int f \, d\mu = \sup\Big\{\int s \, d\mu : 0 \le s \le f, s \text{ simple}\Big\}. \qquad (5.2)$$
If $f$ is measurable and at least one of the integrals $\int f^+ \, d\mu$, $\int f^- \, d\mu$ is finite, where $f^+ = \max(f, 0)$ and $f^- = -\min(f, 0)$, define
$$\int f \, d\mu = \int f^+ \, d\mu - \int f^- \, d\mu. \qquad (5.3)$$
A few remarks are in order. A function $s$ might be written as a simple function in more than one way. For example $\chi_{A \cup B} = \chi_A + \chi_B$ if $A$ and $B$ are disjoint. It is clear that the definition of $\int s \, d\mu$ is unaffected by how $s$ is written. Secondly, if $s$ is a simple function, one has to think a moment to verify that the definition of $\int s \, d\mu$ by means of (5.1) agrees with its definition by means of (5.2).
Definition. If $\int |f| \, d\mu < \infty$, we say $f$ is integrable.
The proof of the next proposition follows from the definitions.
Proposition 5.2. (a) If $f$ is measurable, $a \le f(x) \le b$ for all $x$, and $\mu(X) < \infty$, then $a\mu(X) \le \int f \, d\mu \le b\mu(X)$;
(b) If $f(x) \le g(x)$ for all $x$ and $f$ and $g$ are measurable and integrable, then $\int f \, d\mu \le \int g \, d\mu$.
(c) If $f$ is integrable, then $\int cf \, d\mu = c \int f \, d\mu$ for all real $c$.
(d) If $\mu(A) = 0$ and $f$ is measurable, then $\int f \chi_A \, d\mu = 0$.
The integral $\int f \chi_A \, d\mu$ is often written $\int_A f \, d\mu$. Other notation for the integral is to omit the $\mu$ if it is clear which measure is being used, to write $\int f(x) \, \mu(dx)$, or to write $\int f(x) \, d\mu(x)$.
Proposition 5.3. If $f$ is integrable,
$$\Big|\int f\Big| \le \int |f|.$$
Proof. $f \le |f|$, so $\int f \le \int |f|$. Also $-f \le |f|$, so $-\int f \le \int |f|$. Now combine these two facts.
One of the most important results concerning Lebesgue integration is the monotone
convergence theorem.
Theorem 5.4. Suppose $f_n$ is a sequence of nonnegative measurable functions with $f_1(x) \le f_2(x) \le \cdots$ for all $x$ and with $\lim_{n \to \infty} f_n(x) = f(x)$ for all $x$. Then $\int f_n \, d\mu \to \int f \, d\mu$.
Proof. By Proposition 5.2(b), $\int f_n$ is an increasing sequence of real numbers. Let $L$ be the limit. Since $f_n \le f$ for all $n$, then $L \le \int f$. We must show $L \ge \int f$.
Let $s = \sum_{i=1}^m a_i \chi_{E_i}$ be any nonnegative simple function less than or equal to $f$ and let $c \in (0, 1)$. Let $A_n = \{x : f_n(x) \ge cs(x)\}$. Since $f_n(x)$ increases to $f(x)$ for each $x$ and $c < 1$, then $A_1 \subset A_2 \subset \cdots$, and the union of the $A_n$ is all of $X$. For each $n$,
$$\int f_n \ge \int_{A_n} f_n \ge c \int_{A_n} s = c \int_{A_n} \sum_{i=1}^m a_i \chi_{E_i} = c \sum_{i=1}^m a_i \mu(E_i \cap A_n).$$
If we let $n \to \infty$, by Proposition 1.1(c), the right hand side converges to
$$c \sum_{i=1}^m a_i \mu(E_i) = c \int s.$$
Therefore $L \ge c \int s$. Since $c$ is arbitrary in the interval $(0, 1)$, then $L \ge \int s$. Taking the supremum over all simple $s \le f$, we obtain $L \ge \int f$.
Once we have the monotone convergence theorem, we can prove that the Lebesgue
integral is linear.
Theorem 5.5. If $f_1$ and $f_2$ are integrable, then
$$\int (f_1 + f_2) = \int f_1 + \int f_2.$$
Proof. First suppose $f_1$ and $f_2$ are nonnegative and simple. Then it is clear from the definition that the theorem holds in this case. Next suppose $f_1$ and $f_2$ are nonnegative. Take $s_n$ simple and increasing to $f_1$ and $t_n$ simple and increasing to $f_2$. Then $s_n + t_n$ increases to $f_1 + f_2$, so the result follows from the monotone convergence theorem and the result for simple functions. Finally in the general case, write $f_1 = f_1^+ - f_1^-$ and similarly for $f_2$, and use the definitions and the result for nonnegative functions.
Suppose $f_n$ are nonnegative measurable functions. We will frequently need the observation
$$\int \sum_{n=1}^\infty f_n = \int \lim_{N \to \infty} \sum_{n=1}^N f_n = \lim_{N \to \infty} \int \sum_{n=1}^N f_n
= \lim_{N \to \infty} \sum_{n=1}^N \int f_n = \sum_{n=1}^\infty \int f_n. \qquad (5.4)$$
We used here the monotone convergence theorem and the linearity of the integral.
The next theorem is known as Fatou's lemma.
Theorem 5.6. Suppose the $f_n$ are nonnegative and measurable. Then
$$\int \liminf_{n \to \infty} f_n \le \liminf_{n \to \infty} \int f_n.$$
Proof. Let $g_n = \inf_{i \ge n} f_i$. Then the $g_n$ are nonnegative and $g_n$ increases to $\liminf f_n$. Clearly $g_n \le f_i$ for each $i \ge n$, so $\int g_n \le \int f_i$. Therefore
$$\int g_n \le \inf_{i \ge n} \int f_i.$$
If we take the supremum over $n$, on the left hand side we obtain $\int \liminf f_n$ by the monotone convergence theorem, while on the right hand side we obtain $\liminf_{n \to \infty} \int f_n$.
A second very important theorem is the dominated convergence theorem.
Theorem 5.7. Suppose $f_n$ are measurable functions and $f_n(x) \to f(x)$. Suppose there exists an integrable function $g$ such that $|f_n(x)| \le g(x)$ for all $x$. Then $\int f_n \, d\mu \to \int f \, d\mu$.
Proof. Since $f_n + g \ge 0$, by Fatou's lemma,
$$\int (f + g) \le \liminf \int (f_n + g).$$
Since $g$ is integrable,
$$\int f \le \liminf \int f_n.$$
Similarly, $g - f_n \ge 0$, so
$$\int (g - f) \le \liminf \int (g - f_n),$$
and hence
$$-\int f \le \liminf \int (-f_n) = -\limsup \int f_n.$$
Therefore
$$\int f \ge \limsup \int f_n,$$
which with the above proves the theorem.
Example. Suppose $f_n = n\chi_{(0, 1/n)}$. Then $f_n \ge 0$, $f_n \to 0$ for each $x$, but $\int f_n = 1$ does not converge to $\int 0 = 0$. The trouble here is that the $f_n$ do not increase for each $x$, nor is there a function $g$ that dominates all the $f_n$ simultaneously.
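The example can be checked with exact arithmetic. The sketch below (function names are chosen for this illustration) evaluates $f_n$ at a fixed point and computes $\int f_n$ as height times length, which is what formula (5.1) gives for this simple function with respect to Lebesgue measure.

```python
from fractions import Fraction

def f(n, x):
    """f_n(x) = n on (0, 1/n) and 0 elsewhere."""
    return n if 0 < x < Fraction(1, n) else 0

def integral(n):
    """Integral of the simple function f_n: height n times length 1/n."""
    return n * Fraction(1, n)

x = Fraction(1, 10)
values = [f(n, x) for n in range(1, 40)]       # f_n(x) for n = 1, ..., 39
assert all(v == 0 for v in values[9:])         # f_n(1/10) = 0 once n >= 10
assert all(integral(n) == 1 for n in range(1, 40))
```

At any fixed $x > 0$ the values are eventually 0, while every integral equals 1, so pointwise convergence alone does not carry over to the integrals.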
If in the monotone convergence theorem or dominated convergence theorem we have only $f_n(x) \to f(x)$ almost everywhere, the conclusion still holds. For if $A = \{x : f_n(x) \to f(x)\}$, then $f_n \chi_A \to f \chi_A$ for each $x$. And since $A^c$ has measure 0, we see from Proposition 5.2(d) that $\int f \chi_A = \int f$, and similarly with $f$ replaced by $f_n$.
Later on we will need the following two propositions.
Proposition 5.8. Suppose $f$ is measurable and for every measurable set $A$ we have $\int_A f \, d\mu = 0$. Then $f = 0$ almost everywhere.
Proof. Let $A = \{x : f(x) > \varepsilon\}$. Then
$$0 = \int_A f \ge \int \varepsilon \chi_A = \varepsilon \mu(A)$$
since $f \chi_A \ge \varepsilon \chi_A$. Hence $\mu(A) = 0$. We use this argument for $\varepsilon = 1/n$ and $n = 1, 2, \ldots$, so $\mu\{x : f(x) > 0\} = 0$. Similarly $\mu\{x : f(x) < 0\} = 0$.
Proposition 5.9. Suppose $f$ is measurable and nonnegative and $\int f \, d\mu = 0$. Then $f = 0$ almost everywhere.
Proof. If $f$ is not almost everywhere equal to 0, there exists an $n$ such that $\mu(A_n) > 0$ where $A_n = \{x : f(x) > 1/n\}$. But then since $f$ is nonnegative,
$$\int f \ge \int_{A_n} f \ge \frac{1}{n} \mu(A_n) > 0,$$
a contradiction.
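6. Product measures. If $A_1 \subset A_2 \subset \cdots$ and $A = \bigcup_{i=1}^\infty A_i$, we write $A_i \uparrow A$. If $A_1 \supset A_2 \supset \cdots$ and $A = \bigcap_{i=1}^\infty A_i$, we write $A_i \downarrow A$.
Definition. $\mathcal{M}$ is a monotone class if $\mathcal{M}$ is a collection of subsets of $X$ such that
(a) if $A_i \uparrow A$ and each $A_i \in \mathcal{M}$, then $A \in \mathcal{M}$;
(b) if $A_i \downarrow A$ and each $A_i \in \mathcal{M}$, then $A \in \mathcal{M}$.
The intersection of monotone classes is a monotone class, and the intersection of all monotone classes containing a given collection of sets is the smallest monotone class containing that collection.
The next theorem, the monotone class lemma, is rather technical, but very useful.
Theorem 6.1. Suppose $\mathcal{A}_0$ is an algebra, $\mathcal{A}$ is the smallest $\sigma$-algebra containing $\mathcal{A}_0$, and $\mathcal{M}$ is the smallest monotone class containing $\mathcal{A}_0$. Then $\mathcal{M} = \mathcal{A}$.
Proof. A $\sigma$-algebra is clearly a monotone class, so $\mathcal{M} \subset \mathcal{A}$. We must show $\mathcal{A} \subset \mathcal{M}$.
Let $\mathcal{N}_1 = \{A \in \mathcal{M} : A^c \in \mathcal{M}\}$. Note $\mathcal{N}_1$ is contained in $\mathcal{M}$, contains $\mathcal{A}_0$, and is a monotone class. So $\mathcal{N}_1 = \mathcal{M}$, and therefore $\mathcal{M}$ is closed under the operation of taking complements.
Let $\mathcal{N}_2 = \{A \in \mathcal{M} : A \cap B \in \mathcal{M} \text{ for all } B \in \mathcal{A}_0\}$. $\mathcal{N}_2$ is contained in $\mathcal{M}$; $\mathcal{N}_2$ contains $\mathcal{A}_0$ because $\mathcal{A}_0$ is an algebra; $\mathcal{N}_2$ is a monotone class because $\big(\bigcup_{i=1}^\infty A_i\big) \cap B = \bigcup_{i=1}^\infty (A_i \cap B)$, and similarly for intersections. Therefore $\mathcal{N}_2 = \mathcal{M}$; in other words, if $B \in \mathcal{A}_0$ and $A \in \mathcal{M}$, then $A \cap B \in \mathcal{M}$.
Let $\mathcal{N}_3 = \{A \in \mathcal{M} : A \cap B \in \mathcal{M} \text{ for all } B \in \mathcal{M}\}$. As in the preceding paragraph, $\mathcal{N}_3$ is a monotone class contained in $\mathcal{M}$. By the last sentence of the preceding paragraph, $\mathcal{N}_3$ contains $\mathcal{A}_0$. Hence $\mathcal{N}_3 = \mathcal{M}$.
We thus have that $\mathcal{M}$ is a monotone class closed under the operations of taking complements and taking intersections. This shows $\mathcal{M}$ is a $\sigma$-algebra, and so $\mathcal{A} \subset \mathcal{M}$.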
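Suppose $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ are two measure spaces, i.e., $\mathcal{A}$ and $\mathcal{B}$ are $\sigma$-algebras on $X$ and $Y$, resp., and $\mu$ and $\nu$ are measures on $\mathcal{A}$ and $\mathcal{B}$, resp. A rectangle is a set of the form $A \times B$, where $A \in \mathcal{A}$ and $B \in \mathcal{B}$. Define a set function $\mu \times \nu$ on rectangles by
$$\mu \times \nu(A \times B) = \mu(A)\nu(B).$$
Lemma 6.2. Suppose $A \times B = \bigcup_{i=1}^\infty A_i \times B_i$, where $A, A_i \in \mathcal{A}$ and $B, B_i \in \mathcal{B}$ and the rectangles $A_i \times B_i$ are disjoint. Then
$$\mu \times \nu(A \times B) = \sum_{i=1}^\infty \mu \times \nu(A_i \times B_i).$$
Proof. We have
$$\chi_{A \times B}(x, y) = \sum_{i=1}^\infty \chi_{A_i \times B_i}(x, y),$$
and so
$$\chi_A(x)\chi_B(y) = \sum_{i=1}^\infty \chi_{A_i}(x)\chi_{B_i}(y).$$
Holding $x$ fixed and integrating over $y$ with respect to $\nu$, we have, using (5.4),
$$\chi_A(x)\nu(B) = \sum_{i=1}^\infty \chi_{A_i}(x)\nu(B_i).$$
Now use (5.4) again and integrate over $x$ with respect to $\mu$ to obtain the result.
Let $\mathcal{C}_0$ = finite unions of rectangles. It is clear that $\mathcal{C}_0$ is an algebra. By Lemma 6.2 and linearity, we see that $\mu \times \nu$ is a measure on $\mathcal{C}_0$. Let $\mathcal{A} \times \mathcal{B}$ be the smallest $\sigma$-algebra containing $\mathcal{C}_0$; this is called the product $\sigma$-algebra. By the Caratheodory extension theorem, $\mu \times \nu$ can be extended to a measure on $\mathcal{A} \times \mathcal{B}$.
We will need the following observation. Suppose a measure $\mu$ is $\sigma$-finite. So there exist $E_i$ which have finite measure and whose union is $X$. If we let $F_n = \bigcup_{i=1}^n E_i$, then $F_n \uparrow X$ and $\mu(F_n)$ is finite for each $n$.
If $\mu$ and $\nu$ are both $\sigma$-finite, say with $F_i \uparrow X$ and $G_i \uparrow Y$, then $\mu \times \nu$ will be $\sigma$-finite, using the sets $F_i \times G_i$.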
The main result of this section is Fubini's theorem, which allows one to interchange the order of integration.
Theorem 6.3. Suppose $f : X \times Y \to \mathbb{R}$ is measurable with respect to $\mathcal{A} \times \mathcal{B}$. If $f$ is nonnegative or $\int |f(x, y)| \, d(\mu \times \nu)(x, y) < \infty$, then
(a) the function $g(x) = \int f(x, y) \, \nu(dy)$ is measurable with respect to $\mathcal{A}$;
(b) the function $h(y) = \int f(x, y) \, \mu(dx)$ is measurable with respect to $\mathcal{B}$;
(c) we have
$$\int f(x, y) \, d(\mu \times \nu)(x, y) = \int \Big(\int f(x, y) \, d\mu(x)\Big) d\nu(y)
= \int \Big(\int f(x, y) \, d\nu(y)\Big) \mu(dx).$$
Proof. First suppose $\mu$ and $\nu$ are finite measures. If $f$ is the characteristic function of a rectangle, then (a)-(c) are obvious. By linearity, (a)-(c) hold if $f$ is the characteristic function of a set in $\mathcal{C}_0$, the set of finite unions of rectangles.
Let $\mathcal{M}$ be the collection of sets $C$ such that (a)-(c) hold for $\chi_C$. If $C_i \uparrow C$ and $C_i \in \mathcal{M}$, then (c) holds for $\chi_C$ by monotone convergence. If $C_i \downarrow C$, then (c) holds for $\chi_C$ by dominated convergence. (a) and (b) are easy. So $\mathcal{M}$ is a monotone class containing $\mathcal{C}_0$, so $\mathcal{M} = \mathcal{A} \times \mathcal{B}$.
If $\mu$ and $\nu$ are $\sigma$-finite, applying monotone convergence to $C \cap (F_n \times G_n)$ for suitable $F_n$ and $G_n$, we see that (a)-(c) hold for the characteristic functions of sets in $\mathcal{A} \times \mathcal{B}$ in this case as well.
By linearity, (a)-(c) hold for nonnegative simple functions. By monotone convergence, (a)-(c) hold for nonnegative functions. In the case $\int |f| < \infty$, writing $f = f^+ - f^-$ and using linearity proves (a)-(c) for this case, too.
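On finite sets with weighted counting measures, conclusion (c) of Theorem 6.3 reduces to interchanging two finite sums. The sketch below illustrates this; the two measures and the function `f` are hypothetical data chosen for the example.

```python
from fractions import Fraction
from itertools import product

mu = {0: Fraction(1, 2), 1: Fraction(3, 2)}        # hypothetical measure on X
nu = {'a': Fraction(2), 'b': Fraction(1, 3)}       # hypothetical measure on Y

def f(x, y):
    """An arbitrary nonnegative function on X x Y."""
    return (x + 1) * (2 if y == 'a' else 5)

# Integral against the product measure, then the two iterated integrals.
double = sum(f(x, y) * mu[x] * nu[y] for x, y in product(mu, nu))
x_then_y = sum(sum(f(x, y) * mu[x] for x in mu) * nu[y] for y in nu)
y_then_x = sum(sum(f(x, y) * nu[y] for y in nu) * mu[x] for x in mu)
assert double == x_then_y == y_then_x
```

The content of the theorem is that the same interchange survives the passage to infinite measure spaces under the nonnegativity or integrability hypothesis.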


7. The Radon-Nikodym theorem. Suppose $f$ is nonnegative, measurable, and integrable with respect to $\mu$. If we define $\nu$ by
$$\nu(A) = \int_A f \, d\mu,$$
then $\nu$ is a measure. The only part that needs thought is the countable additivity, and this follows from (5.4) applied to the functions $f\chi_{A_i}$. Moreover, $\nu(A)$ is zero whenever $\mu(A)$ is.
Definition. A measure $\nu$ is called absolutely continuous with respect to a measure $\mu$ if $\nu(A) = 0$ whenever $\mu(A) = 0$.
Definition. A function $\mu : \mathcal{A} \to (-\infty, \infty]$ is called a signed measure if $\mu(\emptyset) = 0$ and $\mu\big(\bigcup_{i=1}^\infty A_i\big) = \sum_{i=1}^\infty \mu(A_i)$ whenever the $A_i$ are disjoint and all the $A_i$ are in $\mathcal{A}$.
Definition. Let $\mu$ be a signed measure. A set $A \in \mathcal{A}$ is called a positive set for $\mu$ if $\mu(B) \ge 0$ whenever $B \subset A$ and $B \in \mathcal{A}$. We define a negative set similarly.
Proposition 7.1. Let $\mu$ be a signed measure and let $M > 0$ be such that $\mu(A) \ge -M$ for all $A \in \mathcal{A}$. If $\mu(F) < 0$, then there exists a subset $E$ of $F$ that is a negative set with $\mu(E) < 0$.
Proof. Suppose $\mu(F) < 0$. Let $F_1 = F$ and let $a_1 = \sup\{\mu(A) : A \subset F_1\}$. Since $\mu(F_1 - A) = \mu(F_1) - \mu(A)$ if $A \subset F_1$, we see that $a_1$ is finite. Let $B_1$ be a subset of $F_1$ such that $\mu(B_1) \ge a_1/2$. Let $F_2 = F_1 - B_1$, let $a_2 = \sup\{\mu(A) : A \subset F_2\}$, and choose $B_2$ a subset of $F_2$ such that $\mu(B_2) \ge a_2/2$. Let $F_3 = F_2 - B_2$ and continue.
One possibility is that this procedure stops after finitely many steps. This happens only if for some $i$ every subset of $F_i$ has nonpositive mass. In this case $E = F_i$ is the desired negative set.
The other possibility is that this procedure continues indefinitely. In this case, let $E = \bigcap_{i=1}^\infty F_i$. Note $E = F - \big(\bigcup_{i=1}^\infty B_i\big)$, and the $B_i$ are disjoint. So
$$\mu(E) = \mu(F) - \sum_{i=1}^\infty \mu(B_i),$$
and $\mu(E) \le \mu(F) < 0$. Also
$$\sum_{i=1}^\infty \mu(B_i) = \mu(F) - \mu(E) \le M.$$
This implies the series converges, so $\mu(B_i) \to 0$. Since $\mu(B_i) \ge a_i/2$, then $a_i \to 0$. Suppose $E$ is not a negative set. Then there exists $A \subset E$ with $\mu(A) > 0$. Choose $n$ such that $a_n < \mu(A)$. But $A$ is a subset of $F_n$, so $a_n \ge \mu(A)$, a contradiction. Therefore $E$ is a negative set.
Proposition 7.2. Let $\mu$ be a signed measure and $M > 0$ such that $\mu(A) \ge -M$ for all $A \in \mathcal{A}$. There exist sets $E$ and $F$ that are disjoint whose union is $X$ and such that $E$ is a negative set and $F$ is a positive set.
Proof. Let $L = \inf\{\mu(A) : A \text{ is a negative set}\}$. Choose negative sets $A_n$ such that $\mu(A_n) \to L$. Let $E = \bigcup_{n=1}^\infty A_n$. Let $B_n = A_n - (B_1 \cup \cdots \cup B_{n-1})$ for each $n$. Since $A_n$ is a negative set, so is each $B_n$. Also, the $B_n$ are disjoint. If $C \subset E$, then
$$\mu(C) = \lim_{n \to \infty} \mu\Big(C \cap \Big(\bigcup_{i=1}^n B_i\Big)\Big) = \lim_{n \to \infty} \sum_{i=1}^n \mu(C \cap B_i) \le 0.$$
So $E$ is a negative set.
Since $E$ is negative,
$$\mu(E) = \mu(A_n) + \mu(E - A_n) \le \mu(A_n).$$
Letting $n \to \infty$, we obtain $\mu(E) = L$.
Let $F = E^c$. If $F$ were not a positive set, there would exist $B \subset F$ with $\mu(B) < 0$. By Proposition 7.1 there exists a negative set $C$ contained in $B$ with $\mu(C) < 0$. But then $E \cup C$ would be a negative set with $\mu(E \cup C) < \mu(E) = L$, a contradiction.
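For a signed measure carried by finitely many atoms, the decomposition of Proposition 7.2 can be read off directly from the signs of the weights. The sketch below is an illustration only; the atoms and weights are hypothetical.

```python
from fractions import Fraction

# Hypothetical signed weights on a four-point space.
weights = {'p': Fraction(2), 'q': Fraction(-1, 2),
           'r': Fraction(-3), 's': Fraction(0)}

def mu(A):
    """Signed measure of a set A of atoms."""
    return sum(weights[x] for x in A)

E = {x for x, w in weights.items() if w < 0}   # a negative set
F = set(weights) - E                           # its complement, a positive set

assert all(mu({x}) <= 0 for x in E)            # every atom in E is nonpositive
assert all(mu({x}) >= 0 for x in F)            # every atom in F is nonnegative
assert mu(E) == Fraction(-7, 2)                # the infimum L over negative sets
```

The atom of weight 0 could be placed in either set, which shows that the decomposition is in general not unique.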
We now are ready for the Radon-Nikodym theorem.
Theorem 7.3. Suppose $\mu$ is a $\sigma$-finite measure and $\nu$ is a finite measure such that $\nu$ is absolutely continuous with respect to $\mu$. There exists a $\mu$-integrable nonnegative function $f$ such that $\nu(A) = \int_A f \, d\mu$ for all $A \in \mathcal{A}$. Moreover, if $g$ is another such function, then $f = g$ almost everywhere.
Proof. Let us first prove the uniqueness assertion. For every set $A$ we have
$$\int_A (f - g) \, d\mu = \nu(A) - \nu(A) = 0.$$
By Proposition 5.8 we have $f - g = 0$ a.e.
Since $\mu$ is $\sigma$-finite, there exist $F_i \uparrow X$ such that $\mu(F_i) < \infty$ for each $i$. Let $\mu_i$ be the restriction of $\mu$ to $F_i$, that is, $\mu_i(A) = \mu(A \cap F_i)$. Define $\nu_i$, the restriction of $\nu$ to $F_i$, similarly. If $f_i$ is a function such that $\nu_i(A) = \int_A f_i \, d\mu_i$ for all $A$, the argument of the first paragraph shows that $f_i = f_j$ on $F_i$ if $i \le j$. If we define $f$ by $f(x) = f_i(x)$ if $x \in F_i$, we see that $f$ will be the desired function. So it suffices to restrict attention to the case where $\mu$ is finite.
Let
$$\mathcal{F} = \Big\{g : 0 \le g, \int_A g \, d\mu \le \nu(A) \text{ for all } A \in \mathcal{A}\Big\}.$$
$\mathcal{F}$ is not empty because $0 \in \mathcal{F}$. Let $L = \sup\{\int g \, d\mu : g \in \mathcal{F}\}$, and let $g_n$ be a sequence in $\mathcal{F}$ such that $\int g_n \, d\mu \to L$. Let $h_n = \max(g_1, \ldots, g_n)$.
If $g_1$ and $g_2$ are in $\mathcal{F}$, then $h_2 = \max(g_1, g_2)$ is also in $\mathcal{F}$. To see this,
$$\int_A h_2 \, d\mu = \int_{A \cap \{x : g_1(x) \ge g_2(x)\}} h_2 \, d\mu + \int_{A \cap \{x : g_1(x) < g_2(x)\}} h_2 \, d\mu
= \int_{A \cap \{x : g_1(x) \ge g_2(x)\}} g_1 \, d\mu + \int_{A \cap \{x : g_1(x) < g_2(x)\}} g_2 \, d\mu
\le \nu(A \cap \{x : g_1(x) \ge g_2(x)\}) + \nu(A \cap \{x : g_1(x) < g_2(x)\}) = \nu(A).$$
By an induction argument, $h_n$ is in $\mathcal{F}$.
The $h_n$ increase, say to $f$. By the monotone convergence theorem, $\int f \, d\mu = L$ and
$$\int_A f \, d\mu \le \nu(A) \qquad (7.1)$$
for all $A$.
Let $A$ be a set where there is strict inequality in (7.1); let $\varepsilon$ be chosen sufficiently small so that if $\rho$ is defined by
$$\rho(B) = \nu(B) - \int_B f \, d\mu - \varepsilon\mu(B),$$
then $\rho(A) > 0$. $\rho$ is a signed measure; let $F$ be the positive set as constructed in Proposition 7.2. In particular, $\rho(F) > 0$. So for every $B$
$$\int_{B \cap F} f \, d\mu + \varepsilon\mu(B \cap F) \le \nu(B \cap F).$$
We then have, using (7.1), that
$$\int_B (f + \varepsilon\chi_F) \, d\mu = \int_B f \, d\mu + \varepsilon\mu(B \cap F)
= \int_{B \cap F^c} f \, d\mu + \int_{B \cap F} f \, d\mu + \varepsilon\mu(B \cap F)
\le \nu(B \cap F^c) + \nu(B \cap F) = \nu(B).$$
This says that $f + \varepsilon\chi_F \in \mathcal{F}$. However,
$$L \ge \int (f + \varepsilon\chi_F) \, d\mu = \int f \, d\mu + \varepsilon\mu(F) = L + \varepsilon\mu(F),$$
which implies $\mu(F) = 0$. But then $\nu(F) = 0$, and hence $\rho(F) = 0$, contradicting the fact that $F$ is a positive set for $\rho$ with $\rho(F) > 0$.
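On a finite set the function $f$ of Theorem 7.3 can be written down explicitly: it is the ratio of the point masses wherever $\mu$ charges the point. The sketch below illustrates this with hypothetical measures; the choice of the value 0 on $\mu$-null points is arbitrary, which mirrors the fact that $f$ is unique only almost everywhere.

```python
from fractions import Fraction

# Hypothetical measures on {1, 2, 3}; nu vanishes wherever mu does.
mu = {1: Fraction(1, 4), 2: Fraction(3, 4), 3: Fraction(0)}
nu = {1: Fraction(1, 2), 2: Fraction(1, 2), 3: Fraction(0)}

def density(x):
    """Ratio of point masses; set to 0 (arbitrarily) on mu-null points."""
    return nu[x] / mu[x] if mu[x] > 0 else Fraction(0)

def nu_via_density(A):
    """Integral of the density over A with respect to mu."""
    return sum(density(x) * mu[x] for x in A)

for A in [{1}, {2, 3}, {1, 2, 3}, set()]:
    assert nu_via_density(A) == sum(nu[x] for x in A)
```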
8. Differentiation of real-valued functions.
Let $E \subset \mathbb{R}$ be a measurable set and let $\mathcal{G}$ be a collection of intervals. We say $\mathcal{G}$ is a Vitali cover of $E$ if for each $x \in E$ and each $\varepsilon > 0$ there exists an interval $G \in \mathcal{G}$ containing $x$ whose length is less than $\varepsilon$. $m$ will denote Lebesgue measure.
Lemma 8.1. Let $E$ have finite measure and let $\mathcal{G}$ be a Vitali cover of $E$. Given $\varepsilon > 0$ there exists a finite subcollection of disjoint intervals $I_1, \ldots, I_n$ such that $m\big(E - \bigcup_{i=1}^n I_i\big) < \varepsilon$.
Proof. We may replace each interval in $\mathcal{G}$ by a closed one, since the set of endpoints of a finite subcollection will have measure 0.
Let $O$ be an open set of finite measure containing $E$. Since $\mathcal{G}$ is a Vitali cover, we may suppose without loss of generality that each set of $\mathcal{G}$ is contained in $O$. Let $a_1 = \sup\{m(I) : I \in \mathcal{G}\}$. Let $I_1$ be an element of $\mathcal{G}$ with $m(I_1) \ge a_1/2$. Let $a_2 = \sup\{m(I) : I \in \mathcal{G}, I \text{ disjoint from } I_1\}$, and choose $I_2 \in \mathcal{G}$ disjoint from $I_1$ such that $m(I_2) \ge a_2/2$. Continue in this way, choosing $I_{n+1}$ disjoint from $I_1, \ldots, I_n$ and in $\mathcal{G}$ with length at least one half as large as any other such interval in $\mathcal{G}$ that is disjoint from $I_1, \ldots, I_n$.
If the process stops at some finite stage, we are done. If not, we generate a sequence of disjoint intervals $I_1, I_2, \ldots$ Since they are disjoint and all contained in $O$, then $\sum_{i=1}^\infty m(I_i) \le m(O) < \infty$. So there exists $N$ such that $\sum_{i=N+1}^\infty m(I_i) < \varepsilon/5$.
Let $R = E - \bigcup_{i=1}^N I_i$; we will show $m(R) < \varepsilon$. Let $J_n$ be the interval with the same center as $I_n$ but five times the length. Let $x \in R$. There exists an interval $I \in \mathcal{G}$ containing $x$ with $I$ disjoint from $I_1, \ldots, I_N$. Since $\sum m(I_n) < \infty$, then $\sum a_n \le 2\sum m(I_n) < \infty$, and $a_n \to 0$. So $I$ must either be one of the $I_n$ for some $n > N$ or at least intersect it, for otherwise we would have chosen $I$ at some stage. Let $n$ be the smallest integer such that $I$ intersects $I_n$; note $n > N$. Since $I$ is disjoint from $I_1, \ldots, I_{n-1}$, we have $m(I) \le a_n \le 2m(I_n)$. Since $x$ is in $I$ and $I$ intersects $I_n$, the distance from $x$ to the midpoint of $I_n$ is at most $m(I) + m(I_n)/2 \le (5/2)m(I_n)$. Therefore $x \in J_n$.
Then $R \subset \bigcup_{n=N+1}^\infty J_n$, so $m(R) \le \sum_{n=N+1}^\infty m(J_n) = 5\sum_{n=N+1}^\infty m(I_n) < \varepsilon$.
Given a function $f$, we define the derivates of $f$ at $x$ by
$$D^+ f(x) = \limsup_{h \to 0+} \frac{f(x + h) - f(x)}{h}, \qquad D^- f(x) = \limsup_{h \to 0+} \frac{f(x) - f(x - h)}{h},$$
$$D_+ f(x) = \liminf_{h \to 0+} \frac{f(x + h) - f(x)}{h}, \qquad D_- f(x) = \liminf_{h \to 0+} \frac{f(x) - f(x - h)}{h}.$$
If all the derivates are equal, we say that $f$ is differentiable at $x$ and define $f'(x)$ to be the common value.
Theorem 8.2. Suppose $f$ is nondecreasing on $[a, b]$. Then $f$ is differentiable almost everywhere, $f'$ is measurable, and $\int_a^b f'(x) \, dx \le f(b) - f(a)$.
Proof. We will show that the set where any two derivates are unequal has measure zero. We consider the set $E$ where $D^+ f(x) > D_- f(x)$, the other sets being similar. Let $E_{u,v} = \{x : D^+ f(x) > u > v > D_- f(x)\}$. If we show $m(E_{u,v}) = 0$, then taking the union over all pairs of rationals $u > v$ shows $m(E) = 0$.
Let $s = m(E_{u,v})$, let $\varepsilon > 0$, and choose an open set $O$ such that $E_{u,v} \subset O$ and $m(O) < s + \varepsilon$. For each $x \in E_{u,v}$ there exists an arbitrarily small interval $[x - h, x]$ contained in $O$ such that $f(x) - f(x - h) < vh$. Use Lemma 8.1 to choose $I_1, \ldots, I_N$ which are disjoint and whose interiors cover a subset $A$ of $E_{u,v}$ of measure greater than $s - \varepsilon$. Suppose $I_n = [x_n - h_n, x_n]$. Summing over these intervals,
$$\sum_{n=1}^N [f(x_n) - f(x_n - h_n)] < v \sum_{n=1}^N h_n < v\,m(O) < v(s + \varepsilon).$$
Each point $y \in A$ is the left endpoint of an arbitrarily small interval $(y, y + k)$ that is contained in some $I_n$ and for which $f(y + k) - f(y) > uk$. Using Lemma 8.1 again, we pick out a finite collection $J_1, \ldots, J_M$ whose union contains a subset of $A$ of measure larger than $s - 2\varepsilon$. Summing over these intervals yields
$$\sum_{i=1}^M [f(y_i + k_i) - f(y_i)] > u \sum_{i=1}^M k_i > u(s - 2\varepsilon).$$
Each interval $J_i$ is contained in some interval $I_n$, and if we sum over those $i$ for which $J_i \subset I_n$ we find
$$\sum [f(y_i + k_i) - f(y_i)] \le f(x_n) - f(x_n - h_n),$$
since $f$ is increasing. Thus
$$\sum_{n=1}^N [f(x_n) - f(x_n - h_n)] \ge \sum_{i=1}^M [f(y_i + k_i) - f(y_i)],$$
and so $v(s + \varepsilon) > u(s - 2\varepsilon)$. This is true for each $\varepsilon$, so $vs \ge us$. Since $u > v$, this implies $s = 0$.
This shows that
$$g(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$
is defined almost everywhere and that $f$ is differentiable wherever $g$ is finite. Define $f(x) = f(b)$ if $x \ge b$. Let $g_n(x) = n[f(x + 1/n) - f(x)]$. Then $g_n(x) \to g(x)$ for almost all $x$, and so $g$ is measurable. Since $f$ is increasing, $g_n \ge 0$. By Fatou's lemma
$$\int_a^b g \le \liminf \int_a^b g_n = \liminf n \int_a^b [f(x + 1/n) - f(x)] \, dx
= \liminf \Big[n \int_b^{b+1/n} f - n \int_a^{a+1/n} f\Big] = \liminf \Big[f(b) - n \int_a^{a+1/n} f\Big] \le f(b) - f(a).$$
This shows that $g$ is integrable and hence finite almost everywhere.
A function is of bounded variation if sup

k
i=1
[f(x
i
) f(x
i1
)[ is nite, where
the supremum is over all partitions a = x
0
< x
1
< < x
k
= b of [a, b].
Lemma 8.3. If $f$ is of bounded variation on $[a, b]$, then $f$ can be written as the difference of two nondecreasing functions on $[a, b]$.
Proof. Define
$$P(y) = \sup \Big\{ \sum_{i=1}^{k} [f(x_i) - f(x_{i-1})]^+ \Big\}, \qquad N(y) = \sup \Big\{ \sum_{i=1}^{k} [f(x_i) - f(x_{i-1})]^- \Big\},$$
where the supremum is over all partitions $a = x_0 < x_1 < \cdots < x_k = y$ for $y \in [a, b]$. Since
$$\sum_{i=1}^{k} [f(x_i) - f(x_{i-1})]^+ = \sum_{i=1}^{k} [f(x_i) - f(x_{i-1})]^- + f(y) - f(a),$$
taking the supremum over all partitions of $[a, y]$ yields
$$P(y) = N(y) + f(y) - f(a).$$
Clearly $P$ and $N$ are nondecreasing in $y$, and the result follows by solving for $f(y)$.
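The identity behind this proof can be checked on a single partition. The following is a minimal Python sketch, assuming the function is known only through sample values at the partition points; the helper name `variation_sums` and the sample data are hypothetical and chosen for illustration.

```python
from fractions import Fraction

def variation_sums(values):
    """Sum the positive and negative parts of the increments over one partition."""
    pos = Fraction(0)
    neg = Fraction(0)
    for prev, cur in zip(values, values[1:]):
        d = Fraction(cur) - Fraction(prev)
        if d > 0:
            pos += d      # contributes to the positive variation
        else:
            neg += -d     # contributes to the negative variation
    return pos, neg

# Hypothetical values f(x_0), ..., f(x_k) on a partition a = x_0 < ... < x_k = y.
samples = [0, 3, 1, 4, 2, 2, 5]
pos, neg = variation_sums(samples)
print(pos, neg, pos - neg)
```

For every partition the difference `pos - neg` equals $f(y) - f(a)$, which is why the relation survives when suprema are taken on both sides.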
Define the indefinite integral of an integrable function $f$ by
$$F(x) = \int_a^x f(t)\,dt.$$
Lemma 8.4. If f is integrable, then F is continuous and of bounded variation.
Proof. The continuity follows from the dominated convergence theorem. The bounded variation follows from
$$\sum_{i=1}^{k} |F(x_i) - F(x_{i-1})| = \sum_{i=1}^{k} \Big| \int_{x_{i-1}}^{x_i} f(t)\,dt \Big| \le \sum_{i=1}^{k} \int_{x_{i-1}}^{x_i} |f(t)|\,dt \le \int_a^b |f(t)|\,dt$$
for all partitions.
Lemma 8.5. If f is integrable and F(x) = 0 for all x, then f = 0 a.e.
Proof. For any interval, $\int_c^d f = \int_a^d f - \int_a^c f = 0$. By dominated convergence and the fact that any open set is the countable union of disjoint open intervals, $\int_O f = 0$ for any open set $O$.
If $E$ is any measurable set, take $O_n$ open such that $\chi_{O_n}$ decreases to $\chi_E$ a.e. By dominated convergence,
$$\int_E f = \int f \chi_E = \lim \int f \chi_{O_n} = \lim \int_{O_n} f = 0.$$
This with Proposition 5.8 implies $f$ is zero a.e.
Proposition 8.6. If $f$ is bounded and measurable, then $F'(x) = f(x)$ for almost every $x$.
Proof. By Lemma 8.4, $F$ is of bounded variation, and so $F'$ exists a.e. Let $K$ be a bound for $|f|$. If
$$f_n(x) = \frac{F(x + 1/n) - F(x)}{1/n},$$
then
$$f_n(x) = n \int_x^{x+1/n} f(t)\,dt,$$
so $|f_n|$ is also bounded by $K$. Since $f_n \to F'$ a.e., then by dominated convergence,
$$\begin{aligned}
\int_a^c F'(x)\,dx &= \lim \int_a^c f_n(x)\,dx = \lim\, n \int_a^c [F(x + 1/n) - F(x)]\,dx \\
&= \lim \Big[ n \int_c^{c+1/n} F(x)\,dx - n \int_a^{a+1/n} F(x)\,dx \Big] = F(c) - F(a) = \int_a^c f(x)\,dx,
\end{aligned}$$
using the fact that $F$ is continuous. So $\int_a^c [F'(x) - f(x)]\,dx = 0$ for all $c$, which implies $F' = f$ a.e. by Lemma 8.5.
Theorem 8.7. If $f$ is integrable, then $F' = f$ almost everywhere.
Proof. Without loss of generality we may assume $f \ge 0$. Let $f_n(x) = f(x)$ if $f(x) \le n$ and let $f_n(x) = n$ if $f(x) > n$. Then $f - f_n \ge 0$. If $G_n(x) = \int_a^x [f - f_n]$, then $G_n$ is nondecreasing, and hence has a derivative almost everywhere. By Proposition 8.6, we know the derivative of $\int_a^x f_n$ is equal to $f_n$ almost everywhere. Therefore
$$F'(x) = G_n'(x) + \Big( \int_a^x f_n \Big)' \ge f_n(x)$$
a.e. Since $n$ is arbitrary, $F' \ge f$ a.e. So $\int_a^b F' \ge \int_a^b f = F(b) - F(a)$. On the other hand, by Theorem 8.2, $\int_a^b F'(x)\,dx \le F(b) - F(a) = \int_a^b f$. We conclude that $\int_a^b [F' - f] = 0$; since $F' - f \ge 0$, this tells us that $F' = f$ a.e.
A function is absolutely continuous on $[a, b]$ if given $\varepsilon$ there exists $\delta$ such that $\sum_{i=1}^{k} |f(x_i') - f(x_i)| < \varepsilon$ whenever $\{(x_i, x_i')\}$ is a finite collection of nonoverlapping intervals with $\sum_{i=1}^{k} |x_i' - x_i| < \delta$.
Lemma 8.8. If $F(x) = \int_a^x f(t)\,dt$ for $f$ integrable on $[a, b]$, then $F$ is absolutely continuous.
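Proof. Let $\varepsilon > 0$. Choose a simple function $s$ such that $\int_a^b |f - s| < \varepsilon/2$. Let $K$ be a bound for $|s|$ and let $\delta = \varepsilon/2K$. If $\{(x_i, x_i')\}$ is a collection of nonoverlapping intervals, the sum of whose lengths is less than $\delta$, then set $A = \cup_{i=1}^{k} (x_i, x_i')$ and note $\int_A |f - s| < \varepsilon/2$ and $\int_A |s| < K\delta = \varepsilon/2$. Hence $\sum_{i=1}^{k} |F(x_i') - F(x_i)| \le \int_A |f| \le \int_A |f - s| + \int_A |s| < \varepsilon$.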
Lemma 8.9. If f is absolutely continuous, then it is of bounded variation.
Proof. Let $\delta$ correspond to $\varepsilon = 1$ in the definition of absolute continuity. Given a partition, add points if necessary so that each subinterval has length at most $\delta$. We can then group the subintervals into at most $K$ collections, each of total length less than $\delta$, where $K$ is an integer larger than $(1 + b - a)/\delta$. So the total variation is then less than $K$.
Lemma 8.10. If $f$ is absolutely continuous on $[a, b]$ and $f'(x) = 0$ a.e., then $f$ is constant.
Proof. Let $c \in [a, b]$, let $E = \{x \in [a, c] : f'(x) = 0\}$, and let $\varepsilon > 0$. For each point $x \in E$ there exist arbitrarily small intervals $[x, x + h] \subset [a, c]$ such that $|f(x + h) - f(x)| < \varepsilon h$. By Lemma 8.1 we can find a finite collection of such intervals that cover all of $E$ except for a set of measure less than $\delta$, where $\delta$ is the $\delta$ in the definition of absolute continuity. If the intervals are $[x_i, y_i]$ with $x_i < y_i \le x_{i+1}$, then $\sum |f(x_{i+1}) - f(y_i)| < \varepsilon$ by the definition of absolute continuity, while $\sum |f(y_i) - f(x_i)| < \varepsilon \sum (y_i - x_i) \le \varepsilon (c - a)$. So adding these two inequalities together,
$$|f(c) - f(a)| = \Big| \sum [f(x_{i+1}) - f(y_i)] + \sum [f(y_i) - f(x_i)] \Big| \le \varepsilon + \varepsilon (c - a).$$
Since $\varepsilon$ is arbitrary, then $f(c) = f(a)$, which implies that $f$ is constant.
Theorem 8.11. F is an indenite integral if and only if it is absolutely continuous.
Proof. One direction was Lemma 8.8. Suppose $F$ is absolutely continuous on $[a, b]$. Then $F$ is of bounded variation, $F = F_1 - F_2$ where $F_1$ and $F_2$ are nondecreasing, and $F'$ exists a.e. Since $|F'(x)| \le F_1'(x) + F_2'(x)$, then
$$\int |F'(x)|\,dx \le F_1(b) + F_2(b) - F_1(a) - F_2(a),$$
and so $F'$ is integrable. If $G(x) = \int_a^x F'(t)\,dt$, then $G$ is absolutely continuous by Lemma 8.8, so $F - G$ is absolutely continuous. Then $(F - G)' = 0$ a.e. by Theorem 8.7, and therefore $F - G$ is constant by Lemma 8.10. Thus $F(x) = \int_a^x F'(t)\,dt + F(a)$.
9. $L^p$ spaces.
For $1 \le p < \infty$, define the $L^p$ norm of $f$ by
$$\|f\|_p = \Big( \int |f(x)|^p\,d\mu \Big)^{1/p}.$$
For $p = \infty$, define the $L^\infty$ norm of $f$ by
$$\|f\|_\infty = \inf\{M : \mu(\{x : |f(x)| \ge M\}) = 0\}.$$
For $1 \le p \le \infty$ the space $L^p$ is the set $\{f : \|f\|_p < \infty\}$.
The $L^\infty$ norm of a function $f$ is the supremum of $|f|$ provided we disregard sets of measure 0.
It is clear that $\|f\|_p = 0$ if and only if $f = 0$ a.e.
Proposition 9.1. (Hölder's inequality) If $1 < p, q < \infty$ and $p^{-1} + q^{-1} = 1$, then
$$\int f(x) g(x)\,d\mu \le \|f\|_p \|g\|_q.$$
This also holds if $p = \infty$ and $q = 1$.
Proof. If $M = \|f\|_\infty$, then $\int fg \le M \int |g|$ and the case $p = \infty$ and $q = 1$ follows. So let us assume $1 < p, q < \infty$. If $\|f\|_p = 0$, then $f = 0$ a.e. and $\int fg = 0$, so the result is clear if $\|f\|_p = 0$ and similarly if $\|g\|_q = 0$. Let $F(x) = |f(x)|/\|f\|_p$ and $G(x) = |g(x)|/\|g\|_q$. Note $\|F\|_p = 1$ and $\|G\|_q = 1$, and it suffices to show that $\int FG \le 1$.
The second derivative of the function $e^x$ is again $e^x$, which is positive, and so $e^x$ is convex. Therefore if $0 \le \lambda \le 1$, we have
$$e^{\lambda a + (1 - \lambda) b} \le \lambda e^a + (1 - \lambda) e^b.$$
If $F(x), G(x) \ne 0$, let $a = p \log F(x)$, $b = q \log G(x)$, $\lambda = 1/p$, and $1 - \lambda = 1/q$. We then obtain
$$F(x) G(x) \le \frac{F(x)^p}{p} + \frac{G(x)^q}{q}.$$
Clearly this inequality also holds if $F(x) = 0$ or $G(x) = 0$. Integrating,
$$\int FG \le \frac{\|F\|_p^p}{p} + \frac{\|G\|_q^q}{q} = \frac{1}{p} + \frac{1}{q} = 1.$$
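Hölder's inequality can be sanity-checked numerically when the measure is counting measure on a finite set, so that integrals become finite sums. The sketch below makes that assumption; the helper names `p_norm` and `holder_sides` and the sample vectors are hypothetical.

```python
def p_norm(xs, p):
    """l^p norm of a finite sequence (L^p norm for counting measure)."""
    return sum(abs(x) ** p for x in xs) ** (1.0 / p)

def holder_sides(f, g, p):
    """Return (sum |f g|, ||f||_p * ||g||_q) with q the conjugate exponent of p."""
    q = p / (p - 1.0)  # solves 1/p + 1/q = 1
    lhs = sum(abs(a * b) for a, b in zip(f, g))
    rhs = p_norm(f, p) * p_norm(g, q)
    return lhs, rhs

f = [1.0, -2.0, 3.0, 0.5]
g = [0.5, 4.0, -1.0, 2.0]
lhs, rhs = holder_sides(f, g, 3.0)
print(lhs, rhs)
```

A numerical check of this kind is not a proof, but it is a quick way to catch a wrong conjugate exponent.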
One application of Hölder's inequality is to prove Minkowski's inequality, which is simply the triangle inequality for $L^p$.
Proposition 9.2. (Minkowski's inequality) If $1 \le p \le \infty$, then
$$\|f + g\|_p \le \|f\|_p + \|g\|_p.$$
Proof. Since $|(f + g)(x)| \le |f(x)| + |g(x)|$, integrating gives the case when $p = 1$. The case $p = \infty$ is also easy. So let us suppose $1 < p < \infty$. If $\|f\|_p$ or $\|g\|_p$ is infinite, the result is obvious, so we may assume both are finite. The inequality $(a + b)^p \le 2^p a^p + 2^p b^p$ with $a = |f(x)|$ and $b = |g(x)|$ yields, after an integration,
$$\int |(f + g)(x)|^p\,d\mu \le 2^p \int |f(x)|^p\,d\mu + 2^p \int |g(x)|^p\,d\mu.$$
So we have $\|f + g\|_p < \infty$. Clearly we may assume $\|f + g\|_p > 0$.
Now write
$$|f + g|^p \le |f|\,|f + g|^{p-1} + |g|\,|f + g|^{p-1}$$
and apply Hölder's inequality with $q = (1 - \frac{1}{p})^{-1}$. We obtain
$$\int |f + g|^p \le \|f\|_p \Big( \int |f + g|^{(p-1)q} \Big)^{1/q} + \|g\|_p \Big( \int |f + g|^{(p-1)q} \Big)^{1/q}.$$
Since $p^{-1} + q^{-1} = 1$, then $(p - 1)q = p$, so we have
$$\|f + g\|_p^p \le \big( \|f\|_p + \|g\|_p \big) \|f + g\|_p^{p/q}.$$
Dividing both sides by $\|f + g\|_p^{p/q}$ and using the fact that $p - (p/q) = 1$ gives us our result.
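As a quick numerical illustration of Minkowski's inequality, the sketch below compares both sides for several exponents, again assuming counting measure on a finite set; `p_norm` and the sample vectors are hypothetical names and data.

```python
def p_norm(xs, p):
    """l^p norm of a finite sequence (L^p norm for counting measure)."""
    return sum(abs(x) ** p for x in xs) ** (1.0 / p)

f = [1.0, -2.0, 3.0]
g = [-1.5, 0.5, 2.0]
total = [a + b for a, b in zip(f, g)]

# For each exponent p, store (||f + g||_p, ||f||_p + ||g||_p).
checks = {
    p: (p_norm(total, p), p_norm(f, p) + p_norm(g, p))
    for p in (1.0, 1.5, 2.0, 4.0)
}
for p, (lhs, rhs) in checks.items():
    print(p, lhs, rhs)
```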
Minkowski's inequality says that $L^p$ is a normed linear space, provided we identify functions that are equal a.e. The next proposition says that $L^p$ is complete. This is often phrased as saying that $L^p$ is a Banach space, i.e., a complete normed linear space.
Before proving this we need two easy preliminary results. The first is sometimes called Chebyshev's inequality.
Lemma 9.3. If $1 \le p < \infty$,
$$\mu(\{x : |f(x)| \ge a\}) \le \frac{\|f\|_p^p}{a^p}.$$
Proof. If $A = \{x : |f(x)| \ge a\}$, then
$$\mu(A) \le \int_A \frac{|f(x)|^p}{a^p}\,d\mu \le \frac{1}{a^p} \int |f|^p\,d\mu.$$
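Chebyshev's inequality is easy to see in the simplest setting. The following sketch assumes counting measure on a finite set, so the measure of a set is its number of elements; the function name `chebyshev_check` and the data are hypothetical.

```python
def chebyshev_check(values, a, p):
    """Return (mu({|f| >= a}), ||f||_p^p / a^p) for counting measure."""
    count = sum(1 for v in values if abs(v) >= a)        # measure of the level set
    bound = sum(abs(v) ** p for v in values) / a ** p    # Chebyshev bound
    return count, bound

count, bound = chebyshev_check([0.2, -1.5, 3.0, 0.9, -2.2], a=1.0, p=2)
print(count, bound)
```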
The next lemma is sometimes called the Borel-Cantelli lemma.
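Lemma 9.4. If $\sum \mu(A_j) < \infty$, then
$$\mu\big( \cap_{j=1}^{\infty} \cup_{m=j}^{\infty} A_m \big) = 0.$$
Proof.
$$\mu\big( \cap_{j=1}^{\infty} \cup_{m=j}^{\infty} A_m \big) = \lim_{j \to \infty} \mu\big( \cup_{m=j}^{\infty} A_m \big) \le \lim_{j \to \infty} \sum_{m=j}^{\infty} \mu(A_m) = 0.$$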
Proposition 9.5. If $1 \le p \le \infty$, then $L^p$ is complete.
Proof. We do only the case $p < \infty$; the case $p = \infty$ is easy. Suppose $f_n$ is a Cauchy sequence in $L^p$. Given $\varepsilon = 2^{-(j+1)}$, there exists $n_j$ such that if $n, m \ge n_j$, then $\|f_n - f_m\|_p \le 2^{-(j+1)}$. Without loss of generality we may assume $n_j \ge n_{j-1}$ for each $j$.
Set $n_0 = 0$ and define $f_0 \equiv 0$. If $A_j = \{x : |f_{n_j}(x) - f_{n_{j-1}}(x)| > 2^{-j/2}\}$, then from Lemma 9.3, $\mu(A_j) \le 2^{-jp/2}$. By Lemma 9.4, $\mu(\cap_{j=1}^{\infty} \cup_{m=j}^{\infty} A_m) = 0$. So except for a set of measure 0, for each $x$ there is a last $j$ for which $x \in \cup_{m=j}^{\infty} A_m$, hence a last $j$ for which $x \in A_j$. So for each $x$ (except for the null set) there is a $j_0$ (depending on $x$) such that if $j \ge j_0$, then $|f_{n_j}(x) - f_{n_{j-1}}(x)| \le 2^{-j/2}$.
Set
$$g_j(x) = \sum_{m=1}^{j} |f_{n_m}(x) - f_{n_{m-1}}(x)|.$$
$g_j(x)$ increases for each $x$, and the limit is finite for almost every $x$ by the preceding paragraph. Let us call the limit $g(x)$. We have
$$\|g_j\|_p \le \sum_{m=1}^{j} 2^{-m} + \|f_{n_1}\|_p \le 2 + \|f_{n_1}\|_p$$
by Minkowski's inequality, and so by Fatou's lemma, $\|g\|_p \le 2 + \|f_{n_1}\|_p < \infty$. We have
$$f_{n_j}(x) = \sum_{m=1}^{j} \big( f_{n_m}(x) - f_{n_{m-1}}(x) \big).$$
Suppose $x$ is not in the null set where $g(x)$ is infinite. Since $|f_{n_j}(x) - f_{n_k}(x)| \le |g_j(x) - g_k(x)| \to 0$ as $j, k \to \infty$, then $f_{n_j}(x)$ is a Cauchy sequence (in $\mathbb{R}$), and hence converges, say to $f(x)$. We have $\|f - f_{n_j}\|_p = \lim_{m \to \infty} \|f_{n_m} - f_{n_j}\|_p$; this follows by dominated convergence, since $|f_{n_m} - f_{n_j}|^p \le (2g)^p$ with the function $g$ defined above.
We have thus shown that $\|f - f_{n_j}\|_p \to 0$. Given $\varepsilon = 2^{-(j+1)}$, if $m \ge n_j$, then $\|f - f_m\|_p \le \|f - f_{n_j}\|_p + \|f_m - f_{n_j}\|_p$. This shows that $f_m$ converges to $f$ in $L^p$ norm.
The following is very useful.
Proposition 9.6. For $1 < p < \infty$ and $p^{-1} + q^{-1} = 1$,
$$\|f\|_p = \sup \Big\{ \int fg : \|g\|_q \le 1 \Big\}. \qquad (9.1)$$
When $p = 1$ (9.1) holds if we take $q = \infty$, and if $p = \infty$ (9.1) holds if we take $q = 1$.
Proof. The right hand side of (9.1) is less than the left hand side by Hölder's inequality. So we need only show that the right hand side is greater than the left hand side.
First suppose $p = 1$. Take $g(x) = \operatorname{sgn} f(x)$, where $\operatorname{sgn} a$ is $1$ if $a > 0$, is $0$ if $a = 0$, and is $-1$ if $a < 0$. Then $g$ is bounded by 1 and $fg = |f|$. This takes care of the case $p = 1$.
Next suppose $p = \infty$. Since $\mu$ is $\sigma$-finite, there exist sets $F_n$ increasing up to $X$ such that $\mu(F_n) < \infty$ for each $n$. If $M = \|f\|_\infty$, let $a$ be any finite real less than $M$. By the definition of $L^\infty$ norm, the measure of $A = \{x \in F_n : |f(x)| > a\}$ must be positive if $n$ is sufficiently large. Let $g(x) = (\operatorname{sgn} f(x)) \chi_A(x)/\mu(A)$. Then the $L^1$ norm of $g$ is 1 and $\int fg = \int_A |f|/\mu(A) \ge a$. Since $a$ is arbitrary, the supremum on the right hand side must be $M$.
Now suppose $1 < p < \infty$. We may suppose $\|f\|_p > 0$. Let $q_n$ be a sequence of nonnegative simple functions increasing to $f^+$, $r_n$ a sequence of nonnegative simple functions increasing to $f^-$, and $s_n(x) = (q_n(x) - r_n(x)) \chi_{F_n}(x)$. Then $s_n(x) \to f(x)$ for each $x$, $|s_n(x)| \uparrow |f(x)|$ for each $x$, $s_n$ is a simple function, and $\|s_n\|_p < \infty$ for each $n$. If $f \in L^p$, then $\|s_n\|_p \to \|f\|_p$ by dominated convergence. If $\int |f|^p = \infty$, then $\int |s_n|^p \to \infty$ by monotone convergence. For $n$ sufficiently large, $\|s_n\|_p > 0$.
Let
$$g_n(x) = (\operatorname{sgn} f(x)) \frac{|s_n(x)|^{p-1}}{\|s_n\|_p^{p/q}}.$$
Since $(p - 1)q = p$, then
$$\|g_n\|_q = \frac{\big( \int |s_n|^{(p-1)q} \big)^{1/q}}{\|s_n\|_p^{p/q}} = \frac{\|s_n\|_p^{p/q}}{\|s_n\|_p^{p/q}} = 1.$$
On the other hand, since $|f| \ge |s_n|$,
$$\int f g_n = \int \frac{|f|\,|s_n|^{p-1}}{\|s_n\|_p^{p/q}} \ge \int \frac{|s_n|^p}{\|s_n\|_p^{p/q}} = \|s_n\|_p^{p - (p/q)}.$$
Since $p - (p/q) = 1$, then $\int f g_n \ge \|s_n\|_p$, which tends to $\|f\|_p$.
The above proof also establishes
Corollary 9.7. For $1 < p < \infty$ and $p^{-1} + q^{-1} = 1$,
$$\|f\|_p = \sup \Big\{ \int fg : \|g\|_q \le 1,\ g \text{ simple} \Big\}.$$
The space $L^p$ is a normed linear space. We can thus talk about its dual, namely, the set of bounded linear functionals on $L^p$. The dual of a space $Y$ is denoted $Y^*$. If $H$ is a bounded linear functional on $L^p$, we define the norm of $H$ to be $\|H\| = \sup\{H(f) : \|f\|_p \le 1\}$.
Theorem 9.8. If $1 < p < \infty$ and $p^{-1} + q^{-1} = 1$, then $(L^p)^* = L^q$.
Proof. If $g \in L^q$, then setting $H(f) = \int fg$ for $f \in L^p$ yields a bounded linear functional; the boundedness follows from Hölder's inequality. Moreover, from Hölder's inequality and Proposition 9.6 we see that $\|H\| = \|g\|_q$.
Now suppose we are given a bounded linear functional $H$ on $L^p$ and we must show there exists $g \in L^q$ such that $H(f) = \int fg$. First suppose $\mu(X) < \infty$. Define $\nu(A) = H(\chi_A)$. If $A$ and $B$ are disjoint, then
$$\nu(A \cup B) = H(\chi_{A \cup B}) = H(\chi_A + \chi_B) = H(\chi_A) + H(\chi_B) = \nu(A) + \nu(B).$$
To show $\nu$ is countably additive, it suffices to show that if $A_n \uparrow A$, then $\nu(A_n) \to \nu(A)$. But if $A_n \uparrow A$, then $\chi_{A_n} \to \chi_A$ in $L^p$, and so $\nu(A_n) = H(\chi_{A_n}) \to H(\chi_A) = \nu(A)$; we use here the fact that $\mu(X) < \infty$. Therefore $\nu$ is a countably additive signed measure. Moreover, if $\mu(A) = 0$, then $\chi_A = 0$ a.e., hence $\nu(A) = H(\chi_A) = 0$. By writing $\nu = \nu^+ - \nu^-$ and using the Radon-Nikodym theorem for both the positive and negative parts, we see there exists an integrable $g$ such that $\nu(A) = \int_A g$ for all sets $A$. If $s = \sum a_i \chi_{A_i}$ is a simple function, by linearity we have
$$H(s) = \sum a_i H(\chi_{A_i}) = \sum a_i \nu(A_i) = \sum a_i \int g \chi_{A_i} = \int gs.$$
By Corollary 9.7,
$$\|g\|_q = \sup \Big\{ \int gs : \|s\|_p \le 1,\ s \text{ simple} \Big\} \le \sup\{H(s) : \|s\|_p \le 1\} \le \|H\|.$$
If $s_n$ are simple functions tending to $f$ in $L^p$, then $H(s_n) \to H(f)$, while by Hölder's inequality $\int s_n g \to \int fg$. We thus have $H(f) = \int fg$ for all $f \in L^p$, and $\|g\|_q \le \|H\|$. By Hölder's inequality, $\|H\| \le \|g\|_q$.
In the case where $\mu$ is $\sigma$-finite, but not finite, let $F_n \uparrow X$ be such that $\mu(F_n) < \infty$ for each $n$. Define functionals $H_n$ by $H_n(f) = H(f \chi_{F_n})$. Clearly each $H_n$ is a bounded linear functional on $L^p$. Applying the above argument, we see there exist $g_n$ such that $H_n(f) = \int f g_n$ and $\|g_n\|_q = \|H_n\| \le \|H\|$. It is easy to see that $g_n(x)$ is 0 if $x \notin F_n$. Moreover, by the uniqueness part of the Radon-Nikodym theorem, if $n > m$, then $g_n = g_m$ on $F_m$. Define $g$ by setting $g(x) = g_n(x)$ if $x \in F_n$. Then $g$ is well defined. By Fatou's lemma, $g$ is in $L^q$ with a norm bounded by $\|H\|$. Since $f \chi_{F_n} \to f$ in $L^p$ by dominated convergence, then $H_n(f) = H(f \chi_{F_n}) \to H(f)$, since $H$ is a bounded linear functional on $L^p$. On the other hand $H_n(f) = \int_{F_n} f g_n = \int_{F_n} fg \to \int fg$ by dominated convergence. So $H(f) = \int fg$. Again by Hölder's inequality $\|H\| \le \|g\|_q$.
10. Fourier transforms.
Fourier transforms give a representation of a function in terms of frequencies. We
give the basic properties here.
If $f \in L^1(\mathbb{R}^n)$, define the Fourier transform $\hat f$ by
$$\hat f(u) = \int_{\mathbb{R}^n} e^{iu \cdot x} f(x)\,dx, \qquad u \in \mathbb{R}^n. \qquad (10.1)$$
We are using $u \cdot x$ for the standard inner product in $\mathbb{R}^n$. Various books have slightly different definitions. Some put a negative sign before the $iu \cdot x$, some have a $2\pi$ either in front of the integral or in the exponent. The basic theory is the same in any case.
Some basic properties of the Fourier transform are given by
Proposition 10.1. Suppose $f$ and $g$ are in $L^1$. Then
(a) $\hat f$ is bounded and continuous;
(b) $\widehat{(f + g)}(u) = \hat f(u) + \hat g(u)$; $\widehat{(af)}(u) = a \hat f(u)$;
(c) if $f_a(x) = f(x + a)$, then $\hat f_a(u) = e^{-iu \cdot a} \hat f(u)$;
(d) if $g_a(x) = e^{ia \cdot x} g(x)$, then $\hat g_a(u) = \hat g(u + a)$;
(e) if $h_a(x) = f(ax)$, then $\hat h_a(u) = a^{-n} \hat f(u/a)$.
Proof. (a) $\hat f$ is bounded because $f \in L^1$ and $|e^{iu \cdot x}| = 1$. We have
$$\hat f(u + h) - \hat f(u) = \int \big( e^{i(u+h) \cdot x} - e^{iu \cdot x} \big) f(x)\,dx.$$
So
$$|\hat f(u + h) - \hat f(u)| \le \int |e^{iu \cdot x}|\,|e^{ih \cdot x} - 1|\,|f(x)|\,dx.$$
The integrand is bounded by $2|f(x)|$, which is integrable, and $e^{ih \cdot x} - 1 \to 0$ as $h \to 0$, and thus the continuity follows by dominated convergence.
(b) is obvious. (c) follows because
$$\hat f_a(u) = \int e^{iu \cdot x} f(x + a)\,dx = \int e^{iu \cdot (x - a)} f(x)\,dx = e^{-iu \cdot a} \hat f(u)$$
by a change of variables. For (d),
$$\hat g_a(u) = \int e^{iu \cdot x} e^{ia \cdot x} g(x)\,dx = \int e^{i(u + a) \cdot x} g(x)\,dx = \hat g(u + a).$$
Finally for (e), by a change of variables,
$$\hat h_a(u) = \int e^{iu \cdot x} f(ax)\,dx = a^{-n} \int e^{iu \cdot (y/a)} f(y)\,dy = a^{-n} \int e^{i(u/a) \cdot y} f(y)\,dy = a^{-n} \hat f(u/a).$$
One reason for the usefulness of Fourier transforms is that they relate derivatives
and multiplication.
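Proposition 10.2. Suppose $f \in L^1$ and $x_j f(x) \in L^1$, where $x_j$ is the $j$th coordinate of $x$. Then
$$\frac{\partial \hat f}{\partial u_j}(u) = i \int e^{iu \cdot x} x_j f(x)\,dx.$$
Proof. Let $e_j$ be the unit vector in the $j$th direction. Then
$$\frac{\hat f(u + h e_j) - \hat f(u)}{h} = \frac{1}{h} \int \big( e^{i(u + h e_j) \cdot x} - e^{iu \cdot x} \big) f(x)\,dx = \int e^{iu \cdot x} \Big( \frac{e^{ih x_j} - 1}{h} \Big) f(x)\,dx.$$
Since
$$\Big| \frac{1}{h} \big( e^{ih x_j} - 1 \big) \Big| \le |x_j|$$
and $x_j f(x) \in L^1$, the right hand side converges to $\int e^{iu \cdot x}\, i x_j f(x)\,dx$ by dominated convergence. Therefore the left hand side converges. Of course, the limit is $\partial \hat f/\partial u_j$.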
The convolution of $f$ and $g$ is defined by
$$f * g(x) = \int f(x - y) g(y)\,dy.$$
By a change of variables, this is the same as $\int f(y) g(x - y)\,dy$, so $f * g = g * f$.
Proposition 10.3. (a) If $f, g \in L^1$, then $f * g$ is in $L^1$ and $\|f * g\|_1 \le \|f\|_1 \|g\|_1$.
(b) The Fourier transform of $f * g$ is $\hat f(u) \hat g(u)$.
Proof. (a) We have
$$\int |f * g(x)|\,dx \le \int \int |f(x - y)|\,|g(y)|\,dy\,dx.$$
Since the integrand is nonnegative, we can apply Fubini and the right hand side is equal to
$$\int \int |f(x - y)|\,dx\,|g(y)|\,dy = \int \int |f(x)|\,dx\,|g(y)|\,dy = \|f\|_1 \|g\|_1.$$
The first equality here follows by a change of variables.
(b) We have
$$\begin{aligned}
\widehat{f * g}(u) &= \int e^{iu \cdot x} \int f(x - y) g(y)\,dy\,dx \\
&= \int \int e^{iu \cdot (x - y)} f(x - y)\,dx\; e^{iu \cdot y} g(y)\,dy \\
&= \int \hat f(u) e^{iu \cdot y} g(y)\,dy = \hat f(u) \hat g(u).
\end{aligned}$$
We applied Fubini in the second equality; this is valid because as we saw in (a), the absolute value of the integrand is integrable.
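A discrete analogue makes part (a) concrete. The sketch below replaces $\mathbb{R}^n$ by the integers with counting measure, so functions become finitely supported sequences and the convolution integral becomes a finite sum; this setting, the helper names `convolve` and `l1`, and the data are assumptions made for illustration.

```python
def convolve(f, g):
    """Discrete convolution: (f * g)[k] is the sum of f[i] * g[k - i] over i."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def l1(xs):
    """l^1 norm, the counting-measure analogue of the L^1 norm."""
    return sum(abs(x) for x in xs)

f = [1.0, -2.0, 0.5]
g = [3.0, 1.0, -1.0, 2.0]
fg = convolve(f, g)
print(fg, l1(fg), l1(f) * l1(g))
```

The inequality is strict here because terms of opposite sign cancel inside the convolution; equality holds, for example, when both sequences are nonnegative.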
We want to give a formula for recovering $f$ from $\hat f$. First we need to calculate the Fourier transform of a particular function.
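Proposition 10.4. (a) Suppose $f_1 : \mathbb{R} \to \mathbb{R}$ is defined by
$$f_1(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}.$$
Then $\hat f_1(u) = e^{-u^2/2}$.
(b) Suppose $f_n : \mathbb{R}^n \to \mathbb{R}$ is given by
$$f_n(x) = \frac{1}{(2\pi)^{n/2}} e^{-|x|^2/2}.$$
Then $\hat f_n(u) = e^{-|u|^2/2}$.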
Proof. (a) may be proved using contour integration, but let's give a real variable proof. Let $g(u) = \int e^{iux} e^{-x^2/2}\,dx$. Differentiate with respect to $u$. We may differentiate under the integral sign because $(e^{i(u+h)x} - e^{iux})/h$ is bounded in absolute value by $|x|$ and $|x| e^{-x^2/2}$ is integrable; therefore dominated convergence applies. We then obtain
$$g'(u) = i \int e^{iux} x e^{-x^2/2}\,dx.$$
By integration by parts this is equal to
$$-u \int e^{iux} e^{-x^2/2}\,dx = -u g(u).$$
Solving the differential equation $g'(u) = -u g(u)$, we have
$$[\log g(u)]' = \frac{g'(u)}{g(u)} = -u,$$
so $\log g(u) = -u^2/2 + c_1$, and so then
$$g(u) = c_2 e^{-u^2/2}. \qquad (10.2)$$
Since $g(0) = \int e^{-x^2/2}\,dx = \sqrt{2\pi}$, $c_2 = \sqrt{2\pi}$. Substituting this value of $c_2$ in (10.2) and dividing both sides by $\sqrt{2\pi}$ proves (a).
For (b), since $f_n(x) = f_1(x_1) \cdots f_1(x_n)$ if $x = (x_1, \ldots, x_n)$,
$$\hat f_n(u) = \int \cdots \int e^{i \sum_j u_j x_j} f_1(x_1) \cdots f_1(x_n)\,dx_1 \cdots dx_n = \hat f_1(u_1) \cdots \hat f_1(u_n) = e^{-|u|^2/2}.$$
One more preliminary before proving the inversion theorem.
Proposition 10.5. Suppose $\varphi$ is in $L^1$ and $\int \varphi(x)\,dx = 1$. Let $\varphi_A(x) = A^{-n} \varphi(x/A)$.
(a) Then $\|f * \varphi_A - f\|_1 \to 0$ as $A \to 0$.
(b) If $f$ is continuous with compact support, then $f * \varphi_A$ converges to $f$ pointwise.
Proof. (a) Let $\varepsilon > 0$. Choose $g$ continuous with compact support so that $\|f - g\|_1 < \varepsilon$. Let $h = f - g$. A change of variables shows that $\|\varphi_A\|_1 = \|\varphi\|_1$. Observe
$$\|f * \varphi_A - f\|_1 \le \|g * \varphi_A - g\|_1 + \|h * \varphi_A - h\|_1$$
and
$$\|h * \varphi_A - h\|_1 \le \|h\|_1 + \|h * \varphi_A\|_1 \le \|h\|_1 + \|h\|_1 \|\varphi_A\|_1 < \varepsilon (1 + \|\varphi\|_1).$$
So since $\varepsilon$ is arbitrary, it suffices to show that $g * \varphi_A \to g$ in $L^1$.
We start by writing
$$\begin{aligned}
g * \varphi_A(x) - g(x) &= \int g(x - y) \varphi_A(y)\,dy - g(x) = \int g(x - Ay) \varphi(y)\,dy - g(x) \\
&= \int [g(x - Ay) - g(x)] \varphi(y)\,dy.
\end{aligned}$$
We used a change of variables and the fact that $\int \varphi(y)\,dy = 1$. Because $g$ is continuous with compact support, then $g$ is bounded, and the integral on the right goes to 0 by dominated convergence, the dominating function being $2\|g\|_\infty |\varphi(y)|$. Therefore $g * \varphi_A(x)$ converges to $g(x)$ pointwise.
To show the convergence in $L^1$, we have
$$\int |g * \varphi_A(x) - g(x)|\,dx \le \int \int |g(x - Ay) - g(x)|\,|\varphi(y)|\,dy\,dx = \int \int |g(x - Ay) - g(x)|\,|\varphi(y)|\,dx\,dy.$$
Since $g$ is continuous with compact support and hence bounded, for each $y$
$$G_A(y) = \int |g(x - Ay) - g(x)|\,dx$$
converges to 0 as $A \to 0$ by dominated convergence. Also
$$G_A(y) \le \int |g(x - Ay)|\,dx + \int |g(x)|\,dx \le 2\|g\|_1 < \infty.$$
Then
$$\int G_A(y) |\varphi(y)|\,dy$$
converges to 0 as $A \to 0$ by dominated convergence, the dominating function being $2\|g\|_1 |\varphi(y)|$.
(b) This follows from the argument we used for $g$ above.
Now we are ready to give the inversion formula. The proof seems longer than it might be, but there is no avoiding the introduction of the function $H_a$ or some similar function.
Theorem 10.6. Suppose $f, \hat f \in L^1$. Then
$$f(y) = \frac{1}{(2\pi)^n} \int e^{-iu \cdot y} \hat f(u)\,du, \quad \text{a.e.}$$
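Proof. Let
$$H_a(x) = \frac{1}{(2\pi)^n} e^{-|x|^2/2a^2}.$$
By Propositions 10.1(e) and 10.4(b),
$$\hat H_a(u) = (2\pi)^{-n/2} a^n e^{-a^2 |u|^2/2}.$$
We have
$$\begin{aligned}
\int \hat f(u) e^{-iu \cdot y} H_a(u)\,du &= \int \int e^{iu \cdot x} f(x) e^{-iu \cdot y} H_a(u)\,dx\,du \\
&= \int \int e^{iu \cdot (x - y)} H_a(u)\,du\, f(x)\,dx \\
&= \int \hat H_a(x - y) f(x)\,dx. \qquad (10.3)
\end{aligned}$$
We can interchange the order of integration because $\int \int |f(x)|\,|H_a(u)|\,dx\,du < \infty$. The left hand side of the first line of (10.3) converges to $(2\pi)^{-n} \int \hat f(u) e^{-iu \cdot y}\,du$ as $a \to \infty$ by dominated convergence and the fact that $\hat f \in L^1$. The last line of (10.3) is equal to
$$\int \hat H_a(y - x) f(x)\,dx = f * \hat H_a(y) \qquad (10.4)$$
since $\hat H_a$ is symmetric. But by Proposition 10.5 (applied with $\varphi = \hat H_1$ and $A = 1/a$), $f * \hat H_a$ converges to $f$ in $L^1$ as $a \to \infty$. Since convergence in $L^1$ implies almost everywhere convergence along a subsequence, the two limits agree a.e.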
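The inversion formula can be tested on the Gaussian of Proposition 10.4(a), whose transform is known in closed form. The sketch below takes $n = 1$ and approximates $\frac{1}{2\pi} \int e^{-iuy} \hat f(u)\,du$ by a midpoint sum; the truncation width, step count, and helper names `inverse_transform`, `gauss_hat` and `f1` are assumptions made for this illustration.

```python
import cmath
import math

def inverse_transform(fhat, y, half_width=12.0, steps=4800):
    """Midpoint-rule approximation of (1 / 2 pi) times the integral of e^{-iuy} fhat(u) du."""
    du = 2.0 * half_width / steps
    total = 0j
    for k in range(steps):
        u = -half_width + (k + 0.5) * du
        total += cmath.exp(-1j * u * y) * fhat(u)
    return total * du / (2.0 * math.pi)

def gauss_hat(u):
    """Transform of the standard Gaussian density, by Proposition 10.4(a)."""
    return math.exp(-u * u / 2.0)

def f1(x):
    """Standard Gaussian density."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

errors = [abs(inverse_transform(gauss_hat, y) - f1(y)) for y in (0.0, 0.7, 1.5)]
print(errors)
```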
The last topic that we consider is the Plancherel theorem.
Theorem 10.7. (a) Suppose $f$ is continuous with compact support. Then $\hat f \in L^2$ and
$$\|f\|_2 = (2\pi)^{-n/2} \|\hat f\|_2. \qquad (10.5)$$
(b) We can use the result in (a) to define $\hat f$ when $f \in L^2$ and so that (10.5) holds.
Proof. (a) Let $g(x) = \overline{f(-x)}$. Note
$$\hat g(u) = \int e^{iu \cdot x}\, \overline{f(-x)}\,dx = \int e^{-iu \cdot x}\, \overline{f(x)}\,dx = \overline{\int e^{iu \cdot x} f(x)\,dx} = \overline{\hat f(u)}.$$
By (10.3) and (10.4) with $y = 0$
$$f * g * \hat H_a(0) = \int \widehat{f * g}(u) H_a(u)\,du. \qquad (10.6)$$
Since $\widehat{f * g}(u) = \hat f(u) \hat g(u) = |\hat f(u)|^2$, the right hand side of (10.6) converges by monotone convergence to $(2\pi)^{-n} \int |\hat f(u)|^2\,du$ as $a \to \infty$. Since $f$ and $g$ are continuous with compact support, then it is easy to see that $f * g$ is also, and so the left hand side of (10.6) converges to $f * g(0) = \int f(y) g(-y)\,dy = \int |f(y)|^2\,dy$ by Proposition 10.5(b).
(b) The set of continuous functions with compact support is dense in $L^2$. Given a function $f$ in $L^2$, choose a sequence of continuous functions with compact support $\{f_m\}$ such that $f_m \to f$ in $L^2$. By the result in (a), $\{\hat f_m\}$ is a Cauchy sequence in $L^2$, and therefore converges to a function in $L^2$, which we call $\hat f$. If $\{f_m'\}$ is another sequence of continuous functions with compact support converging to $f$ in $L^2$, then $\{f_m - f_m'\}$ is a sequence of continuous functions with compact support converging to 0 in $L^2$; by the result in (a), $\hat f_m - \hat f_m'$ converges to 0 in $L^2$, and therefore $\hat f$ is defined uniquely up to almost everywhere equivalence. By passing to the limit in $L^2$ on both sides of (10.5), we see that (10.5) holds for $f \in L^2$.
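As a consistency check on the constant in (10.5), one can work it out for the Gaussian $f_1$ of Proposition 10.4(a) with $n = 1$. This function is not compactly supported, but it lies in $L^2$, so part (b) applies; the choice of example is ours. Using $\int e^{-x^2}\,dx = \sqrt{\pi}$,
$$\|f_1\|_2^2 = \frac{1}{2\pi} \int e^{-x^2}\,dx = \frac{1}{2\sqrt{\pi}}, \qquad \|\hat f_1\|_2^2 = \int e^{-u^2}\,du = \sqrt{\pi},$$
so that
$$(2\pi)^{-1} \|\hat f_1\|_2^2 = \frac{\sqrt{\pi}}{2\pi} = \frac{1}{2\sqrt{\pi}} = \|f_1\|_2^2,$$
in agreement with (10.5) after taking square roots.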
References.
1. G.B. Folland, Real analysis: modern techniques and their applications, New York,
Wiley, 1984.
2. H.L. Royden, Real analysis, New York, Macmillan, 1963.
3. W. Rudin, Real and complex analysis, New York, McGraw-Hill, 1966.