Sie sind auf Seite 1von 207

Lectures on Stochastic

Processes
By
K. Ito
Tata Institute of Fundamental Research, Bombay
1960
(Reissued 1968)
Lectures on Stochastic
Processes
By
K. Ito
Notes by
K. Muralidhara Rao
No part of this book may be reproduced in any
form by print, microlm or any other means with-
out written permission from the Tata Institute of
Fundamental Research, Colaba, Bombay 5
Tata Institute of Fundamental Research, Bombay
1961
(Reissued 1968)
Preface
In this course of lectures I have discussed the elementary parts of Stochas-
tic Processes from the view point of Markov Processes. I owe much to
Professor H.P. McKeans lecture at Kyoto University (195758) in the
preparation of these lectures.
I would like to express my hearty thanks to Professor K. Chan-
drasekharan, Dr.K. Balagangadharan, Dr.J.R. Choksi and Mr.K.M. Rao
for their friendly aid in preparing the manuscript.
K. Ito
iii
Contents
0 Preliminaries 1
1 Measurable space . . . . . . . . . . . . . . . . . . . . . 1
2 Probability space . . . . . . . . . . . . . . . . . . . . . 2
3 Independence . . . . . . . . . . . . . . . . . . . . . . . 4
4 Conditional expectation . . . . . . . . . . . . . . . . . . 5
5 Wiener and Poisson processes . . . . . . . . . . . . . . 6
1 Markov Processes 13
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 13
3 Transition Probability . . . . . . . . . . . . . . . . . . . 16
4 Semi-groups . . . . . . . . . . . . . . . . . . . . . . . . 22
5 Green operator . . . . . . . . . . . . . . . . . . . . . . 23
6 The Generator . . . . . . . . . . . . . . . . . . . . . . . 25
7 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 28
8 Dual notions . . . . . . . . . . . . . . . . . . . . . . . . 36
9 A Theorem of Kac . . . . . . . . . . . . . . . . . . . . 39
2 Srong Markov Processes 45
1 Markov time . . . . . . . . . . . . . . . . . . . . . . . . 45
2 Examples of Markov time . . . . . . . . . . . . . . . . . 46
3 Denition of strong Markov process . . . . . . . . . . . 48
4 A condition for a Markov process... . . . . . . . . . . . 48
5 Example of a Markov process.... . . . . . . . . . . . . . 51
6 Dynkins formula and generalized ... . . . . . . . . . . . 54
7 Blumenthals 0 1 law . . . . . . . . . . . . . . . . . . 60
v
vi Contents
8 Markov process with discrete state space . . . . . . . . . 61
9 Generator in the restricted sence . . . . . . . . . . . . . 65
3 Multi-dimensional Brownian Motion 73
1 Denition . . . . . . . . . . . . . . . . . . . . . . . . . 73
2 Generator of the k-dimensional Brownian motion . . . . 75
3 Stochastic solution of the Dirichlet problem . . . . . . . 78
4 Recurrence . . . . . . . . . . . . . . . . . . . . . . . . 86
5 Green function . . . . . . . . . . . . . . . . . . . . . . 88
6 Hitting probability . . . . . . . . . . . . . . . . . . . . 96
7 Regular points (k 3) . . . . . . . . . . . . . . . . . . . 103
8 Plane measure of a two dimensional... . . . . . . . . . . 105
4 Additive Processes 109
1 Denitions . . . . . . . . . . . . . . . . . . . . . . . . . 109
2 Gaussian additive processes and ... . . . . . . . . . . . . 110
3 Levys canonical form . . . . . . . . . . . . . . . . . . 114
4 Temporally homogeneouos L evy processes . . . . . . . 126
5 Stable processes . . . . . . . . . . . . . . . . . . . . . . 129
6 L evy process as a Markov process . . . . . . . . . . . . 133
7 Multidimensional Levy processes . . . . . . . . . . . . 139
5 Stochastic Dierential Equations 143
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . 143
2 Stochastic integral (1) Function spaces E , L
2
, E
s
. . . . 146
3 Stochastic Integral (II) Denitions and properties . . . . 149
4 Denition of stochastic integral (III) Continuous version 152
5 Stochstic dierentials . . . . . . . . . . . . . . . . . . . 156
6 Stochastic dierential equations . . . . . . . . . . . . . 163
7 Construction of diusion . . . . . . . . . . . . . . . . . 173
6 Linear Diusion 179
1 Generalities . . . . . . . . . . . . . . . . . . . . . . . . 179
2 Generator in the restricted sense . . . . . . . . . . . . . 181
3 Local generator . . . . . . . . . . . . . . . . . . . . . . 184
4 Fellers form of generators (1) Scale . . . . . . . . . . . 187
Contents vii
5 Fellers form of generator (2) Speed measure . . . . . . 190
6 Fellers form of generators (3) . . . . . . . . . . . . . . 191
7 Fellers form of generators... . . . . . . . . . . . . . . . 193
Section 0
Preliminaries
1 Measurable space
1
Let be a set and let S () denote the set of all subsets of . A S ()
is called an algebra if it is closed under nite unions and complemen-
tations; an algebra B closed under countable unions is called a Borel
algebra. For C S () we denote by A(C) and B(C), the algebra and
Borel algebra, respectively, generated by C. M S () is called a mono-
tone class if A
n
M, n = 1, 2, . . ., and {A
n
} monotone implies that
lim
n
A
n
M. We have the following lemma.
Monotone Lemma. If M is a monotone class containing an algebra A
then M B(A).
The proof of this lemma can be found in P. Halmos: Measure the-
ory.
For any given set we denote by B() a Borel algebra of subsets
of .
Denition (). A pair (, B()) is called a measurable space. A is
called measurable if A B().
Let (
1
, B
1
(
1
)) and (
2
, B
2
(
2
) be measurable spaces. A function
f :
1

2
is called measurable with respect to B
1
(
1
) if for every
A B
2
(
2
), f
1
(A) B
1
(
1
).
Now suppose that
1
is a set, (
2
, B(
2
)) a measurable space and
f a function on
1
into
2
. Let B( f ) be the class of all sets of the form 2
1
2 0. Preliminaries
f
1
(A) for A B(
2
). Then B( f ) is a Borel algebra, and is the least
Borel algebra with respect to which f is measurable.
Let (
i
, B
i
(
i
), i I, be measurable spaces. Let =

i
denote
the Cartesian product of
i
and let
i
:
i
be dened by
i
(w) = w
i
.
Let B() be the least Borel algebra with respect to which all the
i
s are
measurable. The pair (, B()) is called the product measurable space.
B() is the least Borel algebra containing the class of all sets of the form
{ f : f (i) E
i
},
where E
i
B
i
(
i
). A function F into is measurable if and only if
i
F
is measurable for every i I.
2 Probability space
Let be a set, A S () an algebra. A function P on A such p() = 1,
0 p(E) 1 for E A, and such that p(EUF) = p(E)+p(F) whenever
E, F A and EF = , is called an elementary probability measure on
A. Let (, B()) be a measurable space and p an elementary probability
measure on B. If A
n
B, A
n
disjoint, imply p(
_
n
A
n
) =
_
p(A
n
) we say
that p is a probability measure on B(). The proof of the following
important theorem can be found in P. Halmos: Measure theory.
Theorem ((Kolmogoro)). If p is an elementary probability measure 3
on A then p can be extended to a probability measure P on B(A) if and
only if the following continuity condition is satised:
A
n
A, A
n
A
n+1
,
_
n
A
n
= imply lim
n
p(A
n
) = 0.
Further under the above condition the extension is unique.
Denition (). A triple (, B, P), where P is a probability measure on B,
is called a probability space.
A real-valued measurable function on a probability space is called
a random variable. If a vandom variable x is integrable we denote the
integral by E(x) and call it the expectation of X.
2. Probability space 3
Let (
2
, B
2
) be a measurable space, (
1
, B
1
, P
1
)a probability space
and f :
1

2
a measurable function. Dene P
2
(E) = P
1
( f
1
(E))
for every E B
2
. Then (
2
, B
2
, P
2
) is a probability space and for every
integrable function g on
2
, E(g0f ) =
_
g0f dP
1
=
_
gdP
2
= E(g). We
say that f induces a measure on B
2
. In case x is a random variable, the
measure induced on the line is called the probability distribution of x.
We shall prove the following formulae which we use later Inclusion-
exclusion formula. Let (, B, P) be a probability space and A
i
B,
i = 1, 2, . . . , n. Then
P
_

_
n
_
i=1
A
i
_

_
=

i
P(A
i
)

i<j
P(A
i
A
j
) +

i<j<k
P(A
i
A
j
A
k
) . . .
To prove this, let
B
denote the characteristic function of B. Then 4
P(A
i
) = E(
A
i
) = E(1
A
i
c) = 1 E
__

A
c
i
_
= 1 E((1
A
1
)(1
A
2
) . . . (1
A
n
))
= 1 E
_
1

A
i
+

i<j

A
i

A
j

i<j<k

A
i

A
j

A
k
+ . . .
_
=

i
E(
A
i
)

i<j
E(
A
i
A
j
) +

i<j<k
E(
A
i
A
j
A
k
) . . .
=

i
P(A
i
)

i<j
P(A
i
A
j
) + . . .
The following dual inclusion-exclusion formula is due to Hunt. We
have
P(A
i
) = 1 P(A
c
i
) = 1
_

i
P(A
c
i
)

i<j
P(A
C
i
A
c
j
) + . . .
_
= 1
_

i
(1 P(A
i
))

i<j
(1 P(A
i
A
j
)) + . . .
_
= 1
_
n

i
P(A
i
)
_
n
2
_
+

i<j
P(A
i
A
j
) + . . .
_
=
_
1
_
n
1
_
+
_
n
2
_
. . .
_
+

i
P(A
i
)

i<j
P(A
i
A
j
) + . . .
4 0. Preliminaries
=

i
P(A
i
)

i<j
P(A
i
A
j
) + . . .
A collection (
t
, t T) of random variables x
t
, T being some index- 5
ing set, is called a stochastic or random process. We generally assume
that the indexing set T is an interval of real numbers.
Let {x
t
, t T} be a stochastic process. For a xed x
t
() is a
function on T, called a sample function of the process.
Lastly, an n-dimensional random variable is a measurable func-
tion into R
n
; an n-dimensional random process is a collection of n-
dimensional random variables.
3 Independence
Let (, B, P) be a probability space and B
i
, i = 1, 2, . . . , n, n Borel
subalgebras of B. They are said to be independent if for any E
i
B
i
, i
i n, P(E
1
, . . . E
n
) = P(E
1
) . . . P(E
n
). A collection (B

)
I
of Borel
subalgebras of B is said to be independent if every nite subcollection
is independent.
Let X
1
, . . . , x
n
be n random variables on (, B, P) and B(x
i
), 1 i
n, the least Borel subalgebra of B with respect to which x
i
is measurable.
x
1
, . . . x
n
are said to be independent if B
1
, . . . , B
n
are independent.
Finally, suppose that {x

(t, w)}
I
is a system of random processes
on and B

the least Borel subalgebra of B with respect to which x

(t,
w) is measurable for all t. The processes are said to be stochastically
independent if the B

are independent.
We give some important facts about independence. If x and y are
random variables on the following statements are equivalent:
(1) E(e
ix+iy
) = E(e
ix
)E(e
iy
), , and real; 6
(2) The measure induced by z(w) = (x(w), y(w))on the plane is the
product of the measures induced by x and y on the line;
(3) x and y are independent.
4. Conditional expectation 5
4 Conditional expectation
Let (, B, P) be a probability space and C a Borel subalgebra of B.
Let x(w) be a real-valued integrable function. We follow Doob in the
denition of the conditional expectation of x.
Consider the set function on C dened by (C) = E(x : C). Then
(C) is a bounded signed measure and (C) = 0 if P(C) = 0. Therefore
by the Radon-nikodym theorem there exists a unique (upto P-measure
0) function (w) measurable with respect to C such that
(C) = E( : C).
Denition (). (w) is called the conditional expectation of x with re-
spect to C and is denoted by E(x/C).
The conditional expectation is not a random variable but a set of
random variables which are equal to each other except for a set of P-
measure zero. Each of these random variables is called a version of
E(x/C).
The following conclusions (which are valid with probability 1) re-
sult from the denition.
1. E(1/C) = 1. 7
2. E(x/C) 0 if x 0.
3. E(x + y/C) = E(x/C) + E(y/C).
4. |E(x/C)| E(|x|/C).
5. If x
n
x, |x
n
| S with E(S ) < ,then
lim
n
E(x
n
/C) = E(x/C).
6. If
_
n
E(|x
n
|) < , then E(
_
n
x
n
/C) =
_
n
E(x
n
/C).
7. If x is C-measurable, then E(xy/C) = xE(y/C).
In particular, if x is C-measurable, then E(x/C) = x.
6 0. Preliminaries
8. If x and C are independent, then E(x/C) = E(x).
9. If C = {A : P(A) = 0 or 1}, then E(x/C) = E(x).
10. If C
1
C
2
, then E(x/C
2
) = E(E(x/C
1
)/C
2
) and, in particular,
E(E(x/C))) = E(x).
5 Wiener and Poisson processes
The following processes are very important and we shall encounter
many examples of these.
We shall dene a Wiener process and establish its existence.
Let
_
x
t
(w), 0 t <
_
be a stochastic process such that
(1) for almost all w the sample function x
t
(w) is a continuous function
on [0, ] and vanishes at t = 0;
(2) P(w : x
t
1
(w) E
1
, . . . , x
t
n
(w) x
t
n1
(w) E
n
) = P(w : x
t
1
(w)
E
1
) . . . P(w : x
t
n
(w) x
t
n1
(w) E
n
, where t
1
< t
2
< . . . < t
n
.
This means that x
t
1
, x
t
2
x
t
1
, . . . , x
t
n
x
t
n1
are independent if 8
t
1
< . . . < t
n
;
(3) P(w : x
t
(w) x
s
(w) E) = [2(t s)]
1
2
_
E
e
x
2
/2(ts)
dx.
Then the process is called a Wiener process. This process is ex-
tremely important and we shall now construct a Wiener process which
we shall use later. This incidentally will establish the existence of Wie-
ner process.
Let = C[0, ) be the space of all real continuous functions on [0,
). We introduce an elementary probability measure on as follows.
For any integer n, 0 < t
1
< t
2
. . . < t
n
< and a Borel set B
n
in R
n
,
let
E =
_
w : w and (w(t
1
), . . . , w(t
n
)) B
n
_
,
and
5. Wiener and Poisson processes 7
p
t
1
...t
n
(E) =
_

_
B
n
N(t
1
, 0, x
1
)N(t
2
t
1
, x
1
, x
2
) . . . N(t
n
t
n1
, x
n1
, x
n
)
dx
1
. . . dx
n
where N(t, x, y) =
1

2t
e
(yx)
2
/2t
If 0 < u
1
< . . . < u
m
< is a set of points containing t
1
, . . . , t
n
and
t
r
= u
i
r
, r = 1, 2, . . . , n, then E can also be written as
E =
_
w : w and (w(u
1
), . . . , w(u
m
)) B
m
_
,
and then 9
p
u
1
...u
m
(E) =

B
m
N(u
1
, 0, x
1
) . . . N(u
m
u
m1
, x
m1
, x
m
)dx
1
. . . dx
m
,
where B
m
is the inverse image of B
m
under the mapping (x
1
, . . . , x
m
)
(x
1
1
, . . . , x
i
n
of R
m
into R
n
. Using the formula
_
N(t, x, y)N(s, y, z)dz = N(t + s, x, z),
we can show that p
u
1
...u
m
(E) = p
t
1
...t
n
(E).
Now suppose that E has two representations
E =
_
w : (w(t
1
), . . . , w(t
n
)) B
n
, B
n
R
n
_
=
_
w : (w(s
1
), . . . , w(s
m
)) B
m
, B
m
R
m
_
,
and 0 < u
1
< . . . < u
r
is the union of the sets {t
1
, . . . , t
n
} and {s
1
, . . . , s
m
}.
Then from the above, p
t
1
...t
n
(E) = p
u
1
...u
r
(E) = p
s
1
...s
m
(E). Hence
p
t
1
...t
n
(E) does not depend on the choice of the representation for E. We
denote this by p(E).
The class A of all such sets E, for all n, for all such n-tuples
(t
1
, . . . , t
n
) and all Borel sets of R
n
, is easily shown to be an algebra.
It is not dicult to show that p is an elementary probability measure on
8 0. Preliminaries
A. This elementary probability measure is called the elementary Wiener
measure.
We shall presently prove that p satises the continuity condition of 10
Kolmogoros theorem. Hence p can be extended to a probability mea-
sure P on B(A), which we call the Wiener measure on (, B(A)). It will
then follow that P(w : w(0) = 0) = 1.
Now let x
t
(w) = w(t). Then evidently {x
t
, 0 t < } is a stochastic
process with almost all sample functions continuous and vanishing at
t = 0. We show that {x
t
, 0 t < } is a Wiener process.
The function f : (x
1
, x
2
) x
2
x
1
of R
2
R
1
is continuous and
hence for any Borel set E R
1
, the set B = f
1
(E) = {(x
1
, x
2
) : x
2
x
1

E} is a Borel set in R
2
. Therefore
p{w : x
t
x
s
E} = P{w : (w(s), w(t)) B = f
1
(E)}
=

B
N(s, o, x
1
)N(t s, x
1
, x
2
)dx
1
dx
2
.
The transformation (x
1
, x
2
) (x, y) with x = x
1
, y = x
2
x
1
gives
P(w : x
t
x
s
E) =

{(x,y):yE}
N(s, 0, x)N(t s, x, y + x)dxdy
=
_
E
N(t s, 0, y)dy.
Again
P{w : x
t
1
E
1
, . . . , x
t
n
x
t
n1
E
n
} = P{w : (w(t
1
), . . . , w(t
n
) B
n
},
where B
n
= {(x
1
, . . . , x
n
) : x
1
E
1
, x
2
x
1
E
2
, . . . , x
n
n
n1
E
n
}.
Therefore 11
P{w : x
t
1
E
1
, . . . , x
t
n
x
t
n1
E
n
}
=

B
n
N(t
1
, o, x
1
) . . . N(t
n
t
n1
, x
n1
, x
n
)dx
1
. . . dx
n
=
_
E
1
x
. . .
_
xE
n
N(t
1
, 0, x

1
) . . . N(t
n
t
n1
, 0, x

n
)dx

1
. . . dx

n
5. Wiener and Poisson processes 9
= P{w : x
t
1
E
1
}P{w : x
t
2
x
t
1
E
2
} . . . P{w : x
t
n
x
t
n1
E
n
},
where x

1
= x
1
, x

2
= x
i
x
i1
i = 2, . . . , n. We have proved that (x
t
) is a
Wiener process.
It remains to prove that p satises the continuity condition. We shall
prove the following more general theorem.
Theorem (). (Prohorov, Convergence of stochastic processes and limit
theorems in Probability Theory, Teoria veroyatnesteii e eyo primenania
Vol. I Part 2, 1956).
Let p be an elementary probability measure on A which is a prob-
ability measure when restricted to sets of A dependent on a xed set
t
1
, . . . , t
n
. Let E denote expectations with respect to p. If there exist
a > 0, b > 1 and c > 0 such that E(|x
t
x
s
|
a
) C
|ts|
b then p can be
extended to a probability measure on B(A).
Proof. Let A
n
A
n+1
, n = 1, 2, . . . , A
n
A be such that p(A
n
) >> 0,
for all n. We prove that
_
n
A
n
.
Let A
n
= {w : (w(t
(n)
1
), . . . , w(t
(n)
r
n
)) B
n
}, where B
n
B(R
r
n
)
(the set of Borel subsets of R
r
n
). For each n there exists a q
n
such 12
that (a) each t
(n)
i
q
n
, (b) at most one t
(n)
i
is contained in any closed
interval
_
(k 1)2
q
n
, k2
q
n
_
for k = 1, 2, . . . , q
n
2
q
n
. By adding super-
uous suxes if necessary, one can assume that each point k2
q
n
, k =
0, 1, . . . , q
n
2
q
n
, is in {t
(n)
i
, . . . t
(n)
r
n
}, and moreover (by adding, say, the mid-
point if necessary) that in each open interval ((k 1)2
q
n
, k2
q
n
) there is
exactly one point of (t
(n)
1
, . . . , t
(n)
r
n
). Thus r
n
= q
n
2
q
n
+1
and t
(n)
2k
= k2
q
n
.
Finally, by adding superuous sets when necessary one may assume that
q
n
= n i.e., that
A
n
=
_
w :
_
w
_
t
(n)
1
_
, . . . , w
_
t
(n)
n2
n+1
__
B
n
_
,
where t
(n)
2k
= k2
n
and
_
t
(n)
1
, . . . , t
(n)
n2
n+1
_

_
t
(n+1)
1
, . . . , t
(n+1)
(n+2)2
n+2
_
.
Since p is a probability measure when restricted to sets dependent
on a xed set s
1
, . . . , s
k
, we can further assume that each B
n
is a closed
bounded subset R
n2
n+1
. Now, since E(|x(s) x(t)|
a
) C|s t|
b
,
p(w : |w(t
(n)
i
) w(t
(n)
i1
)| |t
(n)
i
t
(n)
i1
|

) = p(w : |w(t
(n)
i
) w(t
(n)
i1
)|
a

10 0. Preliminaries
|t
(n)
i
t
(n)
i1
|
a
) C|t
(n)
i
t
(n)
i1
|
ba
Choose > 0 such that = b a 1 > 0. Then
p(w : |w(t
(n)
i
) w(t
(n)
i1
)| |t
(n)
i
t
(n)
i1
|

) C|t
(n)
i
t
(n)
i1
|
1+
C2
n(1+)
Hence 13
p
_

_
n2
n+1
_
i=2
_
w : |w(t
(n)
i
) w(t
(n)
i1
)| |t
(n)
i
t
(n)
i1
|

_
_

_

Cn
n+1
2
2
n(1+)
= 2.C.n.2
n
.
Since
_
n2
n
is convergent, there exists m
o
such that 2C

_
n=m
0
n
n
2
<

2
.
Then for l m
0
,
p
_

_
l
_
n=m
0
n
n+1
2
_
i=2
_
w : |w(t
(n)
i
) w(t
(n)
i1
)| |t
(n)
i
t
(n)
i1
|

_
_

_
<

2
,
and so p
_

_
n=m
o
n
n+1
2
_
i=2
_
w : |w(t
(n)
i
) w(t
(n)
i1
)| < |t
(n)
i
t
(n)
i1
|

_
_

_
> 1

2
.
It follows that
p
_

_
A
1

l
_
n=m
0
n
n+1
2
_
i=2
_
w : |w(t
(n)
i
) w(t
(n)
i
)| < |t
(n)
i
t
(n)
i1
|

_
_

_
> 1

2
,
and so this set is non-empty. Call this set B

l
. Then B

l
B

l+1
and
A
l
B

l
. We prove that

_
m
0
B

l
.
From each B

1
choose a function w
l
linear in each interval [t
(l)
i1
, t
(l)
i
].
Such a function exists since for each w B

l
there corresponds such a
function determined completely by (w(t
(l)
1
), . . . , w(t
(l)
l
2
l+1
)). We can as-
sume that w(t
(l)
1
) = 0, since zero never occurs in the points which dene
sets of A. Now if l m
o
|w
l
(t
(n)
i
) w
l
(t
(n)
i1
)| < |t
(n)
i
t
(n)
i1
|

2
n
, m
0
n 1, 1 i n2
n+1
,
5. Wiener and Poisson processes 11
so that |w
l
(k2
n
) w
l
((k 1)2
n
)| 2.2
n
, 1 k n2
n
, m
0
n l. 14
Given k2
l
, k

2
l
, k

< k, k2
l
< 2
m
0
, there exists q l such that 2
q

k2
l
k

2
l
< 2
q+1
. In the interval [k

2
l
, k2
l
] there exist at most
two points of the form j2
q
, ( j + 1)2
q
. Then since w
l
B

q
, |w
l
( j2
q
)
w
l
(( j +1)2
q
)| < 2.2
q
. Repeating similar arguments we can prove that
|w
l
(k2
l
) w
l
(k

2
l
)| 4(1 2

)
1
2
q
|k2
l
k

2
l
|

,
being a constant. Now we can easily see that
|w
1
(t
(l)
i
) w
l
(t
(l)
j
)| < |t
(l)
i
t
(l)
j
|

if |t
(l)
i
t
(l)
j
| 2
m
0
say.
From this easily follows, using linearity of w
l
in each interval
_
t
(l)
i
, t
(l)
i+1
_
, that if t
(l)
i
t s t
(l)
j
, then
|w
l
(t) w
l
(s)| 4|t
(l)
i
t
(l)
j
|

.
Now since w
l+p
A
l
for every p 0, (w
l+p
(t
(l)
1
), . . . , w
l+p
(t
(l)
l2
l+1
))
B
l
. Since B
l
is compact, this sequence has a limit point in B
l
. Since
the same is true for every l, we can by the diagonal method, extract a
subsequence {w
n
}, say, such that w
n
(t
(l)
i
) converges for all i and for all l.
Now let t
0
and > 0 be given. For large n
0
suppose that t
(n
0
)
i
t
0
15
t
(n
0
)
i+1
, |t
(n
0
)
i
t
(n
0
)
i+1
< 2
n
0
<
2
. Then if l and m are large and t
(n
0
)
i
t
(l)
j

t
0
t
(l)
j+1
t
(n
0
)
i+1
, t
(n
0
)
i
t
(m)
k
t
0
t
(m)
k+1
t
(n
0
)
i+1
, we have
|w
l
(t
0
) w
m
(t
0
)| |w
l
(t
0
) w
l
(t
(l)
j
)| + |w
1
(t
(l)
j
) w
l
(t
(n
0
)
i
)| + |w
l
(t
(n
0
)
i
)
w
m
(t
(n
0
)
i
)| + |w
m
(t
(n
0
)
i
) w
m
(t
(m)
k
)| + |w
m
(t
(m)
k
) w
m
(t
0
)|
|t
0
t
(l)
j
|

+ |t
(l)
j
t
(n
0
)
i
|

+
2
+ |t
(n
0
)
i
t
(m)
k
|

+ |t
(m)
k
t
0
|

< A
2
,
A being some constant. This is true for any t [t
(n
0
)
i
, t
(n
0
)
i+1
]. This
shows that the limit exists at every point of R

. Also using |w
l
(t)
w
l
(s)| < 4|t
(l)
i
t
(l)
j
|

, we easily see that the limit function say w, is


continuous. Also since (w(t
(l)
1
, . . . , w(t
(l)
l2
l+1
)) B
l
for all l,
_
lm
0
B

l
.
We have proved the theorem.
12 0. Preliminaries
In our case we have
p(x
t
x
s
E) = [2(t s)]

1
2
_
E
e
x
2
2(ts)
dx

Poisson processes. Let (x


t
, 0 t < ) be a stochastic process such that 16
1. for almost all w the sample function x
t
(w) is a step function in-
creasing with jump 1 and vanishes at t = 0;
2. P(x
t
x
s
= k) = e
(ts)
(t s)
k

k
k
with > 0;
3. P(x
t
1
E
1
, x
t
2
x
t
1
E
2
, . . . , x
t
n
E
n
) = P(x
t
1
E
1
) . . . P(x
t
n

x
t
n1
E
n
); i.e., x
t
1
, . . . , x
t
n
x
t
n1
are independent if t
1
< t
2
. . . <
t
n
; then the process is called a Poisson process.
Section 1
Markov Processes
1 Introduction
17
In the following lectures we shall be mainly concerned with Markov
processes, and in particular with diusion processes.
We shall rst give an intuitive explanation and then a mathematical
denition. The intuitive model of a Markov process is a phenomenon
changing with time according to a certain stochastic rule and admitting
the possibility of a complete stop. The space of the Markov process
has the set of possible states of the phenomenon as its counter-part in
the intuitive model. Specically, consider a moving particle. Its pos-
sible positions are points of a space S and its motion is governed by a
stochastic rule. The particle may possibly disappear at some time; we
then say it has gone to its death point. A possible motion is a mapping
of [0, ) into the space of positions. Such a function is a sample path.
The set of all sample paths is the sample space of the process, denoted
by W. A probability law P
a
governing the path of the particle starting at
a point a S is a probability distribution on a Borel algebra of subset
of W. The stochastic rule consists of a system of probability laws gov-
erning the path. Finally, the condition on the system, that if the particle
arrives at a position a at time t it starts afresh according to the prob-
ability law P
a
ingonoring its past history will correspond intuitively to
the basic Markov property.
Denitions (). We turn now to the mathematical denitions. We rst 18
13
14 1. Markov Processes
explain the notation and terminology which we shall use.
Let S denote a locally compact Hausdor space satisfying the sec-
ond axiom of countability. Let B(S ) denote the set of all Borel sub-
sets of S , B(S ) the set of all B(S )-measurable bounded functions on S .
Since S satises the second axiom of countability, this class coincides
with the class of all bounded Baire functions on S . We shall add to S a
point to get a space S {}. S {} has the topology which makes
S an open sub-space and and isolated point. Then if B(S {}), and
B(S {}) are dened in the same way, B(S ) B(S {}), and for
any f B(S ) if we put f () = 0, then f B(S {}). A function
w : [0, ] S Vis called a sample path if
(1) w() = ;
(2) there exists a number

(w) [0, ] such that w(t) = for


t

(w) and w(t) S for t <

(w);
(3) w(t) is right continuous for t <

(w).
For any sample path w,

(w) is called the killing time of the path.


For any path w we denote by x
t
(w) the value of w at t i.e., x
t
(w) = w(t).
Then we can regard x as a function of the pair (t, w). Given a sample
path w the paths w

s
and w
+
s
dened for any s by
x
t
(w

s
) = x
ts
(w)
0
i f t < ,
and 19
x

(w

s
) = ,
where
t s = min(t, s);
x
t
(w
+
s
) = x
t+s
(w),
are called the stopped path and the shifted path at time s, respectively.
A system W of sample paths is called a sample space if w W implies
w

s
W, w
+
s
W for each s. For a sample space W the Borel algebra
generated by sets of the form (w : w W, x
t
(w) E), t [0, ),
1. Introduction 15
E B(S ) is denoted by B or B(W), and B or B(W) denotes the set of
all bounded B-measurable functions on W. The class of all sets of the
form (w : w

s
B)B B, is called the stopped Borel algebra at s, and
is denoted by B
s
or B
s
(W). B will denote the system of all bounded
B
s
-measurable functions. Note that B
s
increases with s and B

= B.
Consider the function x(t, w) on R W into S {}. Let
x
n
(t, w) = x
_
j + 1
2n
, w
_
= w
_
j + 1
2
n
_
for
j
2
n
< t
j + 1
2
n
.
Then x
n
(t, w) is measurable with respect to R(R) B(W) and x
n
(t, w)
x(t, w) pointwise. x
t
(w) is therefore a measurable function of the pair 20
(t, w)
Denition (). A Markov process is a triple
M = (S, W, P
a
, a S {})
where
(1) S is a locally compact Hausdor space with the second axiom of
countability;
(2) W is a sample space;
(3) P
a
(B) are probability laws for a S {}, B B, i.e.,
(a) P
a
(B) is a probability measure in B for every a S {},
(b) P
a
(B), for xed B, is B(S )-measurable in a,
(c) P
a
(x
0
= a) = 1,
(d) P
a
has the Markov property i.e.,
B
1
B
t
, B
2
B imply
P
a
[w : w B
1
, w
+
t
B
2
] = E
a
[w B
1
; P
x
t
(B
2
)],
where the second member is by denition equal to
_
B
1
P
x
t
(w)
(B
2
)dP
a
. [For
xed t, B
2
, p
x
t
(w)
(B
2
) is a bounded measurable function on W.]
16 1. Markov Processes
Remark 1. (d) is equivalent to the following:
f B, g B imply
_
f (w)g(w
+
t
)dP
a
= E
a
[ f (w)E
x
t
(w)
(g(w

))]
More generally (d) is equivalent to
(d

) f B
t
, g B, B
1
B
t
, B
2
B imply 21
E
a
[ f (w)g(w
+
t
) : w B
1
, w
+
t
B
2
]
= E
a
[w B
1
; f (w)E
x
t
(w)
(w

B
2
: g(w

))].
S , W, P
a
are called the state space, sample space and probability
law of the process respectively.
We give below three important examples of the sample space in a
Markov process.
(a) W = W
rc
= the set of all sample paths. These processes are called
right continuous Markov processes.
(b) W = W
d
1
= the set of sample paths whose only discontinuities
before the killing time are of the kind, i.e., w(t 0), w (t + 0)
exist and w(t 0) w(t + 0) = w(t), t <

(w). These are called


Markov processes of type d
1
.
(c) W = W
c
= the set of all sample paths which are continuous before
the killing time. These are continuous Markov processes.
Remark 2. A Markove process is called conservative if P
a
(

= ) =
1 for all a.
3 Transition Probability
The function P(t, a, E) = P
a
(x
t
E) on B(S ), a S and 0 < t <
being xed, is a measure on B(S ) called the transition probability of P
a
at time t. The transition probability has the following properties:
(T.1) P(t, a, E) is a sub - stochastic measure in E, i.e., it is a measure in 22
E with total measure 1.
For P(t, a, S ) = P
a
(x
t
S ) = 1 P
a
(X
t
= ) 1.
3. Transition Probability 17
(T.2) P(t, a, E) B(S ) for xed t and E.
For P(t, a, E) = P
a
(B) where B = {x
t
E} and P
a
(B) is by deni-
tion B(S )-measurable in a for xed B.
(T.3) P(t, a, E) is measurable in the pair (t, a) for xed E. For f B(S )
let
H
t
( f (a)) =
_
S
P(t, a, db) f (b) =
_
W{x
t
}
f (w(t))dP
a
.
=
_
W
f (w(t))dP
a
, since f () = 0.
If f is a bounded continuous function and
n
0
lim

n
0
H
t+
n
f (a) = lim

n
0
_
W
f (w(t +
n
))dp
a
.
=
_
W
f ( lim

n
0
)w(t +
n
))dP
a
=
_
W
f (w(t))dP
a
,
since w(t) is right continuous.
H
t
f (a) is thus right continuous in t, if f is bounded and contin-
uous. It is not dicult to show (by considering simple functions
and then generalizing) that H
t
f (a) is measurable in a if f is mea- 23
surable. Therefore H
t
f (a) is measurable in the pair (t, a) if f is
continuous and bounded. Further, if { f
n
} is a sequence of measur-
able functions with | f
n
| and f
n
f , then H
t
f
n
H
t
f . The
class of those measurable functions f for which H
t
f (a) is measur-
able in the pair (t, a) thus contains bounded continuous functions
and is closed for limits. Therefore H
t
f (a) is measurable in the
pair (t, a) for f B(s). If f =
E
, H
t
f (a) = P(t, a, E).
(T.4) lim
t 0
P(t, a, U
a
) = 1, where U
a
is an open set containing a.
18 1. Markov Processes
Let t
n
0, and B
n
= {w : w(t
n
) U
a
}. Since w(t) is right continu-
ous, {w : w(0) U
a

_
n=1

_
m=n
B
m
.
Therefore
liminf
t
n
0
P(t
n
, a, U
a
) P
a
_

_
n=1

_
m=n
B
m
_

_
P
a
{w : w(0) U
a
} P
a
{w : w(0) = a} = 1.
(T.5) Chapman-Kolmogoro equation :
P(t + s, a, E) =
_
S
P(t, a, db)P(s, b, E).
P(t + s, a, E) = P
a
{x
t+s
E} = P
a
{x
t
S, x
t+s
E}
= P
a
{x
t
S.x
s
(w
+
t
) E}
= E
a
[x
t
S : P
x
t
{x
s
(w) E}]
= E
a
[x
t
S : P(s, x
t
, E)]
=
_
S
P(t, a, ab)P(s, b, E)
24
(T.6)
P
a
(x
t
1
E
1
, . . . , x
t
n
E
n
) =

a
i
E
i
P(t
1
, a, da
1
)
P(t
2
t
1
, a
1
, da
2
) P(t
n
t
n1
, a
n1
, da
n
)
We shall prove this for n = 2.
P
a
(x
t
1
E
1
, x
t
2
E
2
)
= P
a
(x
t
1
E
1
, x
(t
2
t
1
)+t
1
= E
2
)
= P
a
(x
t
1
E
1
, x
t
2
t
1
w
+
t
1
E
2
)
= P
a
(w B
1
, w
+
t
1
B
2
), B
1
= {x
t
1
E
1
} and B
2
=
_
x
t
2
t
1
E
2
_
3. Transition Probability 19
=
_
B
1
P
x
t
1
(B
2
)dP
a
=
_
B
1
P
x
t
1
(w : w(t
2
t
1
) E
2
)dP
a
=
_
E
1
P(t
1
, a, da
1
)P(t
2
t
1
, a
1
, E
2
)
=

a
i
E
i
P(t
1
, a, da
1
)P(t
2
t
1
, a
1
, da
2
).
(T.7) Suppose that M
1
= (S
1
, W
1
, P

a
, a S
1
{}) and M
2
= (S
2
, W
2
,
P
2
a
, a S
2
{}) are two Markov processes with S
1
= S
2
, W
1
= 25
W
2
and P

(t, a, E) = P
2
(t, a, E): then M
1
M
2
, i.e. P
1
a
= P
2
a
.
Proof. Any sub-set of W of the form
_
(x
t
1
, . . . , x
t
n
) E
1
E
n
_
, E
i
B(S ),
is in B(W). Since B(W) is a Borel algebra, any set of the form
_
(x
t
1
, . . . , x
t
n
) E
n
B(S
n
)
_
is in B(W). The class of all sets of the form
_
(x
t
1
, . . . , x
t
n
) E
n
, E
n
B(S
n
)
_
for all n, for all n-tuples 0 t
1
, . . . , t
n
< and all Borel sets E
n
of S
n
,
is an algebra A(W) B(W). Further A(W) generates B(W).
For xed 0 t
1
, . . . , t
n
< , let
P
i
a
(E
n
) = P
i
a
_
(x
t
1
, . . . , x
t
n
) E
n
_
, i = 1, 2.
Then P
i
a
is a measure on the Borel sets of S
n
. From (T.6) it follows
that P
1
a
(E
n
) = P
2
a
(E
n
), for all sets E
n
which are nite disjoint unions of
sets of the form
E
1
. . . E
n
, E
i
B(S ).
Such sets E
n
form an algebra which generates B(S
n
). Using the 26
uniqueness part of the Kolmogoro theorem, we get P
1
a
(E
n
) = P
2
a
(E
n
)
for all E
n
B(S
n
).
Thus P
1
a
= P
2
a
on A(W). One more application of the uniqueness of
the extension gives the result.
20 1. Markov Processes
T.8 Suppose that M = (S, W, P
a
, a S {}) is a triple with S and W
being as in the denition of a Markov process, and P
a
, a S {}
are probability distributions on B(W) and let
P(t, a, E) = P
a
{w : w(t) E}.
Suppose further that p(t, a, E) satises the properties (T.2), (T.4) and
(T.6). Then the contention is that Mis a Markov process with P(t, a, E)
as the transition probability of P
a
.
To prove this we have to verify conditions (b), (c) and (d) on P
a
.
The proof of b) is similar to that of (T.6). (T.6) shows that P
a
(B) is
measurable in a if B is of the form
_
(x(t
1
), . . . , x(t
a
)) E
n
, E
n
S
n
_
where E
n
is a nite disjoint union of sets of the form E
1
E
2
E
n
,
E
i
B(S ). For xed t
1
, . . . , t
n
, consider the class X of sets E
n
B(S
n
)
for which
P
a
_
(x
t
1
, . . . , x
t
n
E
n
_
is measurable in a. If E
n
i
is a monotone sequence of sets in X and
lim
i
E
n
i
= E
n
, P
a
_
(x
t
1
, . . . , x
t
n
) E
n
i
_
is a monotone sequence and 27
lim
i
P
a
_
(x
t
1
, . . . , x
t
n
) E
n
i
_
= P
a
_
(x
t
1
, . . . , x
t
n
) E
n
_
.
X is therefore a monotone class and hence X B(S
n
). We have thus
shown that P
a
(B) is measurable in a for all B A(W). Similarly we
show that the class of sets B B(W) for which P
a
(B) is measurable in
a, is a monotone class.
We now verify (c). Choose t
n
0 such that
P
a
{B
n
} = P
a
_
x
t
n
U
a
_
> 1 .
Since w(t) is right continuous
_
w : w(0)

U
a
_

_
n=1

U
m=n
[

_
m=n
B
m
].
3. Transition Probability 21
Where

U
a
denotes the closure of U
a
. Therefore
1 P
a
(w : w(0)

U
a
) 1 .
Since arbitrary, P
a
(w : w(0)

U
a
) = 1. Now we choose a decreas-
ing sequence {U
i
a
} of open sets such that U
i
a


U
i+1
a
and

_
i=1
U
i
a
= {a}.
We then have
P
a
(x
o
= a) = P
a
(

_
i=t
(x
o
U
i
a
)) = lim
i
P
a
(x
o
U
i
a
) = 1.
To prove (d) we proceed as follows. First remark that if f B(S
n
) 28
and E
n
B(S
n
) and B = ((X
t
1
, . . . , x
t
n
) E
n
then
_
E
n
P(t
1
, a, da
a
) P(t
a
t
n1
, da
n
) f (a
,
. . . , a
n
) =
_
B
f [x
t
1
, . . . , x
t
n
]dP
a
.
Let B
1
B
t
be given by B
1
= (w : w

t
B

) where B

= (x
t

1

E
1
, . . . , x
t

n
E
n
); then B
1
= (x
t
i
E
i
, 1 i n) with t
i
= tt

i
, 1 i
n. Let B
2
B
2
be given by
B
2
= (x
s
j
F
j
, 1 j m).
We have
P
a
(w B
1
, w
+
t
B
2
) = P
a
(x
t
i
E
i
, x
t+s
j
F
j
)
= P(x
t
i
E
i
, x
t
S, x
t+s
j
F
j
)
=
_
a
i
E
i
cS
P(t
1
, a, da
1
) P(t
n
t
n1
, a
n1
, da
n
)P(t t
n
, a
n
, dc)
_
b
j
F
j
P(s
1
, c, db
1
) . . . P(s
m
s
m1
, b
m1
, db
m
)
=
_
a
i
E
i
,cS
P(t
1
, a, da
1
) . . . P(t t
n
, a
n
, dc)P
c
(B
2
) =
_
B
P
x
t
(B
2
)
by the above remark. We now x B
2
and prove that the above equation
holds for all B
1
B
t
[the proof runs along the same lines as the proof
of b)]. Finally x B
1
B
t
and prove the same for all B
2
B.
22 1. Markov Processes
4 Semi-groups
Let H
t
f (a) =
_
S
P(t, a, db) f (b) = E
a
{ f (x
t
)}. Then H
t
is a map of B(S ) 29
into B(S ) with the following properties:
(H.1) It is linear on B(S ) into B(S ). It is continous in the sense that if
| f
n
| M and f
n
f then H
t
f
n
H
t
f .
(H.2) H
t
0, in the sense that if f 0, H
t
f 0.
(H.3) It has the semi-group property i.e. H
t
H
s
= H
t+s
.
H
t+s
f (a) = E
a
( f (x
t+s
)) =
_
S
P(t + s, a, db) f (b)
=
_
S
f (b)
_
S
P(t, a, dc)P(s, c, db)
=
_
S
P(t, a, dc)
_

_
_
S
f (b)P(s, c, db)
_

_
=
_
S
P(t, a, dc)H
t
f (c)
= H
t
H
s
f (a)
(H.4) H
t
| 1
(H.5) H
t
f (a) is B(R

)-measurable in t.
(H.6) If f is continuous at a, lim
t0
H
t
f (a) = f (a).
For if U
a
is an open set containing a
H
t
f (a) =
_
S
P(t, a, db) f (b) =
_
U
a
P(t, a, db) f (a)
+
_
U
a
P(t, a, db)[ f (b) f (a)] +
_
S U
a
P(t, a, db) f (b).
5. Green operator 23
= f (a)P(t, a, u
a
) +
_
U
a
P(t, a, db)[ f (b) f (a)] +
_
S U
a
P(t, a, db) f (b).
30
Now use the fact that P(t, a, U
a
) 1 and f is continuous at a.
5 Green operator
We have seen that the operators {H
t
} form a semi-group. We now in-
troduce one more operator, the Green operator, as the formal Laplace
transform of H
t
, which will lead to the concept of a generator.
Consider the operator G

_
0
e
t
H
t
dt, dened for > 0 by
G

f (a) =

_
0
e
t
H
t
f (a)dt, f B(S ).
G is called the Green operator on B(S ). Interchanging the orders of
integration, we also have
G

f (a) = E
a
_

_
0
e
t
f (x
t
(w))dt
_

_
.
Let G(, a, E) =

_
0
e
t
P(t, a, E)dt. This measure on B(S ) is called
Greens measure on B(S ). We have
G

f (a) =

_
0
e
t
H
t
f (a)dt =
_
S
f (b)

_
0
e
t
P(t, a, db)dt
=
_
S
G(, a, db) f (b).
The operator G

has the following properties:


24 1. Markov Processes
(G.1) G

is linear, and continuous in the sense that if | f


n
| < and f
n
31
f , then G

f
n
(a) G

f (a).
(G.2) G

0, i.e. G

f 0 if f 0.
(G.3) G

satises the following equation, called the resolvent equation:


G

+ ( )G

= 0.
We have
H
s
G

f (a) =
_
S
P(s, a, db)G

f (b)
=
_
S
P(s, a, db)

_
o
e
t
H
t
f (b)dt
=

_
0
e
t
H
t+s
f (a)dt (interchanging the order of integration)
= e
s

_
s
e
t
H
t
f (a)dt.
Therefore
G

f (a) =

_
0
e

s
H
s
G

f (a)ds
=

_
0
e
()
s
ds

_
s
e
t
H
t
f (a)dt
=

_
0
e
t
H
t
f (a)dt
t
_
o
e
()
s
ds
=
G

f (a) G

f (a)

6. The Generator 25
Remark. H
t
H
s
= H
t+s
= H
s
H
t
and 32
G

=
G


= G

(G.4) G

1
1

, because H
t
l 1
(G.5) The integral dening G

exists for complex numbers whose real


part > 0 for every f B(S ). Then G

f (a) is analytic in for


every f B(S ) and every a S .
(G.6) f is continuous at a implies
G

f (a) f (a) as .
For G

f (a) =

_
0
e
t
H
t
f (a)dt =

_
0
e
t
H
t

f (a)dt and
H
t
f (a) f (a) as t 0 if f is continuous at a.
6 The Generator
Dene, for f B(S )
|| f || = sup
aS
| f (a)|.
Then |H
t
f (a)| || f ||.
B(S ) is a Banach space with the norm || f ||, and H
t
becomes a semi-
group of continuous linear operators on B(S ).
Consider the following purely formal calculations.
G = lim
t0
H
t
I
t
= [
dH
t
dt
]
t=0
Then 33
dH
t
dt
= lim
0
H
t+
H
t

= lim
0
H

H
t
= G H
t
.
26 1. Markov Processes
Therefore H
t
= e
tG
and
G

_
0
e
t
H
t
dt =

_
0
e
(G )t
dt = ( G)
1
or
G = G
1

.
The above purely formal calculations have been given precise mean-
ing, and the steps justied by Hille and Yosida [ ] when H
t
satisfy certain
conditions. In our case, however, H
t
do not in general satisfy these con-
ditions, and we shall dene G with the last equation in view. We now
proceed to the rigorius denition.
Let R

= G[B(S )], N

= G
1

{0} be the image and kernel of G

respectively. We show that R

and N

are independent of and that


R

= {0}. The resolvent equation gives


G

f + ( )G

f = 0
i.e.
G

f = G

[ f + ( )G

f ]
Since f + ( ) G

f B(S ), it follows that


G

f G

[B(S )] = R

,
or that R

. Interchanging the roles of and , R

or
R

. We denote G

[B(S )] by R. Similarly f N

gives G

f = 0 34
and the resolvent equation then gives G

f = 0 or N

. We denote
G
1

{0} by N. Let u RN Then u = G

f for some f B(S ), and for


every , G
u
= 0. Now
H
s
u(a) = H
s
G

f (a) = e
s

_
s
e
t
H
t
f (a)dt,
and so H
s
u(a) is continuous in s and u(a) as s 0. Also, since

_
0
e
s
H
s
u(a)ds = G

u(a) = 0 for all , H


s
u(a) 0.
6. The Generator 27
Letting s 0 we see that u(a) 0.
For u R dene
G

u = u G
1

u.
G

u is then determined mod N. We now prove that G


u

is independent of
. If f = G

u( mod N) then f = u G
1

u, ( mod N) and
G

f = G

u u
G

f = G

u G

u,
G


f =
G


u G

u,
G

f G

f = G

u G

u,
G

f = G

f G

u + G

u
= u + G

u
f = u G
1

u (mod N) = G

u (mod N)
35
We denote G

u by Gu. Then if G

f = u we have
Gu = u f (mod N).
Thus u = G

f if and only if ( G)u = f ( mod N). The domain


D(G) of G is R and we have G = G
1

. G is called the generator of


the Markov process.
The following theorem shows that the generator determines the Mar-
kov process uniquely.
Theorem (). Let M
i
= (S, W, P
i
a
, a S {}), i = 1, 2, be two Markov
processes, and G
i
, i = 1, 2 their generators. Then if G
1
= G
2
, P
1
a
= P
2
a
,
i.e. M
1
= M
2
.
Proof. D(G
i
) = G
i

[B(S )] = R
i
. Since G
1
= G
2
, D(G
1
) = D(G
2
),
i.e. R
1
= R
2
= R (say). Since their ranges must also be the same
N
1
= N
2
= N(say). We have therefore
( G
1
)G
1

f = f (mod N)
28 1. Markov Processes
= f (mod N)[( G
2
)G
2

f ( G
1
)G
1

f ]
= ( G
2
)G
2

f (mod N),
( G)G
1

f = ( G)G
2

f (mod N) since G
1
= G
2

By denition G = G
1
= G
1

Therefore 36
G
1
1

G
1

f = G
1
1

G
2

f (mod N)
G
1

G
1
1

G
1

f = G
1

G
1
1

G
2

f
Therefore G
1

f = G
2

f . This gives

_
0
e
t
H
1
t
f (a)dt =

_
0
e
t
H
2
t
f (a)dt for everyF B(S )
Thus if f is continuous, H
1
t
f (a) H
2
t
f (a)
_
P
1
(t, a, db) f (b) =
_
P
2
(t, a, db) f (b)
for every f B(S ) which is continuous. Therefore
P
1
(t, a, E) = P
2
(t, a, E),
Hence
P
1
a
= P
2
a
.
7 Examples
We rst prove a lemma which will have applications later, and then we
give a few examples of Markov processes.
Let f be a real -valued function on an open interval (a, b). When f
is of bounded variation in every compact sub-interval of (a, b) we write
f B(a, b) and then there exists a unique signed measure d f Lebesgue- 37
Stieltjes measure) such that d f (, ] = f (+) f (+), (, ] (a, b).
7. Examples 29
Suppose that is any measure on (a, b) which is nite on compact sub-
sets of (a, b). Suppose further that there exists a function on (a, b)
which is -summbale on every compact sub-interval of (a, b) and satis-
es
f (+) f (+) =

()d ().
Then d f = d and f is absolutely continuous with respect to d.
We now prove that following
Lemma (). If f , g BW(a, b) then f g BW(a, b) and
d( f g)x = f (x+)dg(x) + g(x)d f (x).
Proof. We can assume that f and g are non-negative and non-decreasing
in (a, b). It is enough to prove that if h is continuous in (a, b) and has
compact support, then,
_
h(x)d( f g)(x) =
_
h(x) f (x+)dg(x) +
_
h(x)g(x)d f (x).

For n = 1, 2, . . . let {
n,k
}, k = 0, 1, 2, . . . be a sequence of points
such that
a <
n,o
<
n,1
< b,
n,i

n,i1
<
1
n
Dene

n
(x) =
n,i

n
(x) =
n,i1
if
n,i1
< x <
n,i
.
Then 38
I
n
=
_
h[
n
(x)]d( f g)(x)
=

i=
h(
n,i
)[ f (
n,i1
+)g(
n,i
+) f (
n,i1
+)g(
n,i1
+)
30 1. Markov Processes
=

i=
h(
n,i
) f (
n,i
+) [g(
n,i1
+)] g(
n,i1
+)
+

i=
h(
n,i1
)g(
n,i1
+)[ f (
n,i1
+) f (
n,i1
+)]
=
_
h[
n
(x)] f [
n
(x)+]dg(x) +
_
h[
n
(x)]g[
n
(x)+]d f (x)
Since h has compact support, letting n we get the result.
Ex.1 Standard Brownian motion
Let S = R
1
, W = C[0, ) [we dene w() = ]. Let P be the
Wiener measure on W and dene for a S ,
P
a
(B) = P{w : w + a B}, B B(W).
It is not dicult to show that (S, W, P
a
) is a Markov process ; that is
a continuous process and is called the Standard Brownian motion.
We shall determine the generator of this process. We have
P(t, a, E) = P(w : w + a E) =
1

2t
_
Ea
e
x
2
/2t
dx
=
_
E
N(t, a, c)dc.
H
t
f (a) =
_
R

N(t, a, b) f (b)db =

e
(ba)
2
/2t

2t
f (b)db.
39
If u R, u = G

f for some f B(R


1
) and
u(a) = G

f (a) =

_
0
e
t
H
t
f (a)dt =

_
0
f (b)db

e
t
(ba)
2
2t

2t
dt
=

2
e

2|ba|
f (b)db
7. Examples 31
= e

2a
a
_

2
e

2b
f (b)db + e

2a

_
a
1

2
e

2b
f (b)db.
Since e

2a
and
a
_

2b
f (b)db are both in BW(, ) we get
from the lemma
du(a) =

2e

2a
da
a
_

2
e

2b
f (b)db+
e

2a
e

2a
f (a)

2
da
+

2e

2a
da

_
a
1

2
e

2b
f (b)db
e

2a
e

2a
f (a)

2
da.
Therefore u is absolutely continuous and 40
u(a) = e

2a
a
_

2b
f (b)db + e

2a

_
a
e

2b
f (b)db
almost everywhere. Using the lemma again we see that u

is absolutely
continuous and we get
u

= 2u 2f almost everywhere.
Let R
+
= {u : B(R

), u abs. cont, u

abs. cont, u

B(R
1
)}. We
have seen above that if u R, then u R
+
. Conversely let u R
+
and
put f = u
1
2
u

. Then f B(R
1
) and v = G

f satises
1
2
v

= v f
Therefore w = v u satises
1
2
w

w = 0.
Hence w = c
1
e

2a
+ c
2
e

2a
. Since w is bounded, c
1
= c
2
= 0 or
u = G

f . Thus we have proved that


R =
_
u : u B(R
1
), u abs.cont, u

abs.cont, u

B(R
1
)
_
32 1. Markov Processes
If f N, u = G

f = 0 and since u

= 2u 2f a.e. we see that


f = 0 a.e. Therefore
N = { f : f = 0 a.e.}.
Also the formula u

= 2u 2f (a.e.) shows that G =


u

2
(a.e.) and 41
hence G =
1
2
d
2
da
2
.
Ex.2 Brownian motion with reecting barrier at t = 0.
Let (S = (, ),

W,

P
a
) denote the Standard Brownian motion.
Let S = [0, ) and W the set of all continuous functions on [0, )
into S . If B B(W) then B B(

W). Dene P
a
(B) = p
a
[w : |w| B] for
a s. Then (S, w, P
a
) is a continous Markov Process and is called the
Brownian motion with reecting barrier at t = 0.
We have
P(t, a, E) =

P
a
{w : |w(t)| E}
=

P
a
{w : w(t) E (E)}
=
_
E
[N(t, a, b) + N(t, a, b)]db
H
t
f (a) =

_
0
[N(t, a, b)| + N(t, a, b)
_
f (b)db
=

N(t, a, b)

f (d)db =

H
t

f (a)
where

f (b) = f (|b|). Therefore
u(a) = G

f (a) =

_
0
e
at
H
t
f (a)dt
=

_
0
e
t

H
t

f (a)dt =

G


f (a) = u(a), say
7. Examples 33
From the previous example it follows that u B(

S), u is absolutely 42
continuous, u

is absolutely continuous and u

B(

S ). Since u(a) =
u(a) for a > 0, we see that u B(S ), u is absolutely continuous for a >
0, u

is absolutely continuous, u

B(S ). Further since u(a) = u(a)


we see that u

(a) = u

(a) and hence u

(0) = 0. This gives u


+
(0) = 0.
The relation
1
2
u

= u

f gives
1
2
u

= u f
N = { f : f = 0 a.e.}
R = {u : u B(S ), u, u

abs.cont, u
+
(0) = 0 and u

B(S )}
Gu = u f =
1
2
u

(a.e.)
Ex.3 Poisson process
Let (, P) be a probability measure space and {(t, ), 0 t < } a
Poisson process on .
Let S = {0, 1, 2, . . . . . .}, W = W
d
1
= the set of all sample paths
whose only discontinuities are of the rst kind, and hence they are step
functions with integral values. For almost all , (t, ) is a step function
with jump 1 and vanishes at t = 0; therefore, for almost all , (t, ) is
a step function with integral values and hence belongs to W.
Let
(k)
(t, ) = k + (t, ) and dene
P
k
(B) = P{ :
(k)
(., ) B}, B B(W).
If E S ,
P(t, k, E) = P( : k + (t, ) E)
= P( : (t, ) E k)
=

0nEk
e
t
(t)
n
n!
=

knE
e
t
(t)
nk
(n k)!
H
t
f (k) =

n=0
f (n + k)e
t
(t)
n
n!
34 1. Markov Processes
u(k) = G

f (k) =

_
0
e
t

f (n + k)e
t
(t)
n
n!
=

n=0
f (n + k)

n
( + )
n+1
.
43
Therefore we obtain
u(k + 1) u(k) =
f (k)

u(k).
If u = G

f = 0, from the above we see that f 0, and


N = { f : f 0}.
Let u B(S ) and put f (k) = u(k) [u(k + 1) u(k)]. If v(k) =
G

f (k), v satises
v(k + 1) v(k) =
f (k)

v(k)
and hence, subtracting, 44
[v(k) u(k)] [v(k + 1) u(k + 1) v(k) + u(k)] = 0
and so
v(k + 1) u(k + 1) =
+

[v(k) u(k)]
If v(0) u(0), |v(k) u(k)| = (
+

)
k
|v(0) u(0)| which
is impossible since v u B(S ). Therefore v(0) = u(0) and hence
v(k) = u(k). Thus we have R = B(S ).
Ex. 4 Constant velocity motion
Let S = R
1
, W = C[0, ). Let
P
a
{w(t) a + t, 0 t < } = 1.
7. Examples 35
Then for any B B if w(t) = a + t B, P
a
(B) = 1 and otherwise
P
a
(B) = 0.
P(t, a, E) = (E, a + t) =
_

_
1 if a + t E
0 if a + t E
H
t
f (a) = f (a + t)
u(a) = G

f (a) =
1

_
a
e

t
f (t)dt.
From the lemma and the absolute continuity of u,
u

(a) =

u(a)
f (a)

(a.e.)
45
So if G

f = 0, f = 0 a.e.
N =
_
f : f = 0 a.e.,
_
R {u : u B(R
1
), u, abs.cont, u

B(R
1
)}
Gu = u f = u

So that G =
d
da
.
If u R, we have u B(R
1
), u abs.cont. and u

B(R
1
). Con-
versely, let u satisfy these conditions and f = u u

. Then v = G

f
satises
y y = f .
The general solution therefore is
y = G

f + Ce

a
.
Since y is to be bounded, C = 0. Thus
R =
_
u : u B(R
1
), u abs.cont, u

B(R
1
)
_
.
Ex.5 Positive velocity motion
36 1. Markov Processes
Let S = (r
1
, r
2
) and v(x) > 0 a function continuous on (r
1
, r
2
) such
that for r
1
< < < r
2

dx
v(x)
< + and
r
2
_
dx
v(x)
= +.
46
Then there exists a solution
(a)
(t) of
d
dt
= v() with the initial con-
dition
(a)
(0) = a.
Let W = W
c
and
P
a
_
x
t
(w) =
(a)
(t), 0 t <
_
= 1.
This is similar to Ex.4 and we can proceed on the same lines.
8 Dual notions
Let M = (S, W, P
a
) be a Markov process and M the set of all bounded
signed measures on B(S ). Mis a linear space. For E B(S ) and M
dene
|||| = total variation of = sup
EB(S )
[(E) (E
c
)].
H

t
(E) =
_
S
P(t, a, E)(da)
G

(E) =

_
0
e
t
H

t
(E)dt.
Then H

t
and G

are in M and
||H

t
|| ||||, ||G

||
||||

Also, for f B(S ), denote by ( f , H

t
) and ( f , G

) the integrals
_
f (a)H

t
(da) and
_
f (a)G

(da) respectively. We have


( f , H

t
) =
_
f (a)H

t
(da) =

f (a)P(t, b, da)(db)
8. Dual notions 37
=
_
H
t
f (b)(db) = (H
t
f , )
Similarly ( f , G

) = (G

f , ).
47
Theorem 1.
G

+ ( )G

= 0.
Follows easily from ( f , G

) = (G

f , ) and the resolvent equation


for G

.
Theorem 2. R

= G

M is independent of . We denote this by R

.
Follows from Theorem 1.
Theorem 3. If G

= 0, M, then = 0.
Proof. Let f C(S ). Then since G

= 0 we have
0 = ( f , G

) = (G

f , ) ( f , )
as . Hence, for every f C(S ), ( f , ) = 0. It follows that
0.
Theorem 4. G

= (G

)
1
is independent of . We denote this by
G

, and call it the dual generator of G.


Proof is easy
Theorem 5. If u R = G, v R

= D(G

) then 48
(Gu, y) (u, G

v).
Proof. Let u = G

f , = G

G. Then
(Gu, v) = (
f
, ) = (
u
, ) ( f , )
= (
u
, ) ( f , G

) = (u, ) (G

, f , )
= (u, , ) (u, ) = (u, ) = (u, G

).

38 1. Markov Processes
This theorem justies the name dual generator G

.
Theorem 6. G

determines the Matkov process i.e. if M


i
= (S
i
, W
i
, P
i
a
),
i = 1, 2 are two Markov processes with S
1
= S
2
, W
1
= W
2
and G
1
=
G
2
, then P
1
1
= P
2
a
.
Proof. Since G
1
= G
2
, D(G
1
) = D(G
2
). Let Mand = G
1

.
Since D(G
2
), = G
2


1
. Now = G
1
= G
2
=
1
.
Hence
1
=
2
i.e. G
1

= G
2

. Now for any f B(S ), and for any


M,
(G
1

f , ) = ( f , G
1

) = ( f , G
2

) = (G
2

f , ).
It follows that G
1

f G
2

f , i.e. H
1
t
f (a) = H
2
t
f (a) for almost all
t. If f C(S ), H
i
t
f (a)i = 1, 2 are right continuous in t and are equal
almost everywhere. They are therefore identical. Now the proof can be
completed easily. 49
Example. Consider the standard Brownian motion. Then
R

= { : (db) = db
_
(da)G(, |a b|)}.
This means (E) =
_
E
db
_
(da)G(, |a b|) where
G(, |a b|) =

_
0
e
dt
N(t, a, b)dt =
1

2
e

2|ba|
.
The formula shows that has the density
u(b) =

2
e

2|ba|
(da)
= e

2b
b
_

2
e

2a
(da) + e

2b

_
b
1

2
e

2a
(da).
9. A Theorem of Kac 39
Now using the lemma of 7 we see that
du(b) = e

2b
db
b
_

2a(da) + e

2b
db

_
b
e

2a
(da)
and hence u is absolutely continuous and
u

(b) = e

2b
b
_

2a
(da) + e

2b

_
b
e

2a
(da)
Using the same lemma again we see that
du

(b) = 2(db) +

2db

2|ba|
(da)
= 2(db) + 2(db).
50
Thus we have G

= v =
1
2
du
1
.
9 A Theorem of Kac
We prove the following
Theorem (Kac). Let M = (S, W, P
a
) be a Markov process. For k, f ,
B(S ) we dene
v(a) = v(, a) = E
a
_

_
0
e
t
f (x
t
)e

_
t
0
k(x
s
)ds
dt
_

_
where > ||k

|| sup(k(a)v0), {(avb) = max(a, b)}. Then


(k + G)v = f .
[If k 0, ||k

|| = 0 and > 0].


40 1. Markov Processes
Proof. We have
v u = E
a
_

_
0
e
t
f (x
t
)[e

t
_
0
k(x
s
)ds 1]dt
_

_
= E
a
_

_
0
e
t
f (x
t
)
t
_
0
e

t
_
s
k(k)d
k(x
s
)ds dt
_

_
Now

_
0
t
_
0

e
t
f (x
t
)e

t
_
0
k(k

)d
k(x
s
)

dsdt

_
0
|| f || ||k||e
(||k

||)t
tdt < .
Changing the order of integration 51
v u = E
a
_

_
0
k(x
s
)ds

_
s
e
t
f (x
t
)e

t
_
s
k(x

)d
dt
_

_
= E
a
_

_
0
k(x
s
)ds

_
s
e
(t+s)
f (x
t+s
)e

t+s
_
s
k(x

)d
dt
_

_
= E
a
_

_
0
e
s
k(x
s
)ds

_
s
e
t
f (x
t+s
)e

t
_
0
k(x
+s
)d
dt
_

_
= E
a
_

_
0
e
s
k(x
s
)ds

_
0
e
t
f [x
t
(w
+
s
)]e

t
_
0
k[x

(w
+
s
)]d
dt
_

_
=

_
0
e
s
ds E
a
_

_
k(x
s
)

_
0
e
t
f [x
t
(w
+
s
)]
_

_
e

t
_
0
k[x

(w
+
s
)]d
dt
9. A Theorem of Kac 41
=

_
0
e
s
ds E
a
_
k(x
s
)E
x
s
_

_
0
e
t
f (x
t
)e

t
_
0
k(x

)d
dt
_

_
_
=

_
0
e
s
ds E
a
[k(x
s
)v(x
s
)]
= G

(k, v)(a) D(G).


Since
u D(G),
v D(G)
Further 52
G
1

[v u] = kv mod (N) and G


1

= G
( G)(v u) = kv (mod N)
( G)v ( G)u = kv (mod N)
( G)v f = kv (mod N), ( G)u = f (mod N)
= ( + k G)v = f .
This proves the result.
As an application of Kacs theorem consider the standard Brownian
motion (S, W, P
a
). Let
(t) = the Lebesgue measure of (s : x
s
> 0 and 0 < s t).
= the time spent in the positive half line up to t.
Note that (t) is continuous in t.
Then we shall prove that
P
0
[(t) d]
d
=
1

(t )
so that
P
0
(w : (t) < ) =
2

are sin
_

t
, 0 t.
42 1. Markov Processes
We have (t) =
t
_
0
k(x
s
)ds where
k(a) = if a > 0
= 0 if a 0.
53
Therfore (t) =
t
_
0
k[x(s, w)]ds, considered as a function of w is
measurable in w. Let
(, t, a) = E
a
_
e
(t)
_
.
Then
(, t, a) = E
a
_

_
e

t
_
0
k(x
s
)ds
_

_
=
_

P
a
((t) d)
=
_

0
e

P
a
((t) d)
for, if B (, 0) then P
a
{w : (t) B} = 0. Also if v(a) = v(, , a) =

_
0
e
t
(, t, a)dt we have
v(a) = E
a
__

0
e
t
f (x
t
)e

_
t
0
k(x
s
)ds
dt
_
where f 1.
From Kacs theorem, v is a solution of the dierential equation
_
+ k
1
2
d
2
da
2
_
y = 1 (a.e.)
i.e., v satises
_
+
1
2
d
2
da
2
_
y = 1 if a > 0
9. A Theorem of Kac 43
_

1
2
d
2
da
2
_
y = 1 if a < 0
54
The general solution of this equation is
y =
1
+
+ A
1
e

2(+)x
+ A
2
e

2(+)x
, x > 0
=
1

+ B
1
e

2x
+ B
2
e

2x
, x < 0.
Since v is bounded A
2
= B
1
= 0 and using the fact v is continuous
and v

is continuous at 0 we have
v(0) = v(, , 0) =
1

+
.
Now

_
0
e
t
t
_
0
1

(t s)
1

s
e
s
ds dt
=

_
0
e
s
ds

_
s
1

(t s)
e
t
dt
=

_
0
e
s
ds

_
0
1

t
e
(t+s)
dt
=

_
0
e
(+)s
ds

_
0
1

t
e
t
dt
=
1
_
( + )
.
Therefore,

_
0
e
t
(, t, 0)dt = v(0) =

_
0
e
t
t
_
0
1

(t s)
1

s
e
s
dsdt.
44 1. Markov Processes
55
Fixing , since this is true for all , and (, t, a) and
t
_
0
1

(ts)
1

s
e
s
ds are continuous in t, we have
(, t, 0)
t
_
0
1

(t s)
1

s
e
s
ds
i.e.,

_
0
e

P
0
((t) d) =
t
_
0
e
s
1

s(t s)
ds.
Thus nally
P
0
((t) ds) =
1

s(t s)
ds.
Section 2
Srong Markov Processes
1 Markov time
56
Denition (). Let (S, W, P
a
) be a Markov process with W = W
rc
, W
d
1
or
W
c
. A mapping : W [0, ] is called Markov time if
(w : (w) t) B
t
.
It is easily seen that w w

is a measurable map of W W. In
fact, it is enough to show that
w w

(t) = x(t, W)
is measurable, and this is immediate since x(s, w), (w), t and w are all
measurable in the pair (s, w). Similarly, w w
+

is measurable.
The system of all subsets of W of the form (w : w

(w)
B), B B,
is denoted by B

. B

is a Borel algebra contained in B. We shall give


examples to show that is not always B

-measurable. However, if <


, x

= w((w)) is B

-measurable, for x

= lim
t
w

(t) and x

(w
t
) is
B

-measurable for every t.


If is a Markov time, then + is also a Markov time for every
0. It is not dicult to see that B
+
increases with . Let
B
+
=
_
>0
B
+
=
_
n
B
+ 1/n
.
45
46 2. Srong Markov Processes
Then B
+
B

and B
+
B
+
for every > 0. The class of all 57
bounded B

-measurable functions is denoted by B

and the class of all


bounded B
+
-measurable funcitons by B
+
.
Theorem (). (w) is B
+
-measurable.
Proof. We shall prove that for every > 0, (w) = (w

+
), from this
the theorem follows. Let w
0
W. If (w
0
) = the equality is trivial.
Let t = (w
0
) < . Now
(w : (w) t) B
t
B
t+
for any > 0. Also
(w : (w) > t) =
_
n
_
w : (w) t +

n
_
B
t+
.
It follows that
(w : (w) = t) B
t+
.
Hence
(w : (w) = t) = (w : w

t+
B)
for some B B. Since (w
0
) = t we see that (w
0
)

t+
B.
Hence
_
(w
0
)

t+
_

t+
= (w
0
)

t+
B.
So

_
(w
0
)

t+
_
= t,
i.e.,

_
(w
0
)

(w
0
)+
_
= (w
0
),
completing the proof.
2 Examples of Markov time
58
1. t.
2. Examples of Markov time 47
2. =
G
= inf . {t : x
t
(w) G}
= rst passage time for the open set G S .
We have
{w :
G
< t} = {w : s < t and x
s
G}
= {w : r < t, x
r
G and r rational }
=
_
r
r rational, r<t
_
w : x
r
(w

t
) G
_
.
Thus
G
is a Markov time.
Remark. =
G
is not always B

-measurable. If is a Markov time


which is B

-measurable, then
{w : ((w) < c} =
_
w : w

B, B B
_
,
and since (w

= w

, we should have (w

) < c. In particular, if
is B

-measurable and (w) < , then (w

) < . Now consider a


Markov process with S = (, ), W = W
c
and let =
G
where
G = (0, ). Let w(t) = 1 + t. Then (w) = 1. Also w

(t) = 1 + t if
t 1 and w

(t) = 0 if t 1. Therefore (w

) = . Hence cannot be
B

-measurable.
3. If G = {},
G
=

= killing time.
4. Let W = W
c
and
=
F
= inf . {t : x
t
F} .
where F is closed in S . Let G
m
G
m+1
be a sequence of open sets 59
such that
_
m
G
m
= F, and let = lim
m

G
m
. Then is measurable
[actually it is a Markov time]. We easily verify that

F
=
_

_
if <

if =

.
48 2. Srong Markov Processes
It follows that
F
is measurable. Now it is easily veried that
[w :
F
(w) < t] =
_
w :
F
(w

t
) < t
_
;
in fact the closedness of F is not necessary to prove this. Since w w

t
is B
t
-measurable, it follows that
F
is a Markov time.
3 Denition of strong Markov process
Let Mbe a Markov process. Mis said to have the strong Markov prop-
erty with respect to the Markov time if
P
a
(w : w B
1
, w
+

B
2
) = E
a
(w B
1
: P
x

(B
2
)),
where B
1
B
+
and B
2
B.
Remark. The above condition is equivalent to
E
a
( f (w)g(w
+

)) = E
a
( f (w)E
x

(g(w

))),
or, more generally, to
E
a
(w B
1
, w
+

B
2
: f (w)g(w
+

)) =
= E
a
(w B
1
: f (w)E
x

(w

B
2
: g(w

))),
where 60
f B
+
, g B, B
1
B
+
and B
2
B.
Denition (). M is called a strong Markov process if it has the strong
markov property with respect to all Markov times. A strong Markov
process is called a diusion process if W = W
c
(S ).
4 A condition for a Markov process to be a storng
Markov process
We shall later give examples to show that not all Markov processes are
strong Markov processes. The following theorem gives a sucient con-
dition for a Markov process to be a strong Markov process.
4. A condition for a Markov process... 49
Theorem (). Let M = (S, W, P
a
) be Markov process and C(S ) the set of
all real continuous bounded functions on S . If H
t
maps C(S ) into C(S ),
then Mis a strong Markov process.
Proof. Let be a Markov time. We have to show that
E
a
( f (w)g(w
+

)) = E
a
( f (w)E
x

(g(w

))).

Let > 0 and f B


+
. Then, since f B
+
for every > 0,
(w : t f (w) < t) = (w : w

+
B), B B.
Also (w

+
)

+
= w

+
. Therefore
i f (w

+
) < t.
61
Putting t = f (w) + and letting 0 we get f (w) = f (w

+
).
If
m
=
[m] + 1
m
, then
m
> and
m
as m . We have
for f B
+
and g
1
, g
2
B(S ),
E
a
( f (w)g
1
(x
t
1
+
)g
2
(x
t
2
+
)) = E
a
( < : f (w)g
1
(x
t
1
+
)g
2
(x
t
2
+
))
and
f (w

m
) = f (w

+
m

) = f (w).
If g
1
, g
2
C(S ), g
i
(x
t
i
+
m
) g
i
(x
t
i
+
), i = 1, 2, as m .
We have therefore
E
a
( f (w)g
1
(x
t
1
+
)g
2
(x
t
2
+
)) =
= lim
m
E
a
( < : f (w

m
)g
1
(x
t
1
+
m
)g
2
(x
t
2
+
m
))
= lim
m

k=1
E
a
_
: f (w

m
)g
1
(x
t
1
+
m
)g
2
(x
t
2
+
m
),
where
_

k 1
m
_

_

k
m
_
,
50 2. Srong Markov Processes
= lim
m

k=1
E
a
_
: f (w

k/m
)g
1
(x
t
1
+k/m
)g
2
(x
t
2
+k/m
),
since
m
= k/m if
k 1
m
<
k
m
. From the denition of Markov time,
_

k 1
m
_
Bk1
m
Bk
m
, ( k/m) B
k/m
,
so that B
k/m
. Therefore form the Markov property we have the last 62
expression equal to
lim
m

k=1
E
a
_
: f (w

k/m
)E
x
k/m
_
_
g
1
(x
t
1
)g
2
(x
t
2
)
_
= lim
m

k=1
E
a
_
: f (w

m
)E
x
m
_
g
1
(x
t
1
)g
2
(x
t
2
)
_
_
= lim
m
E
a
_
< : f (w

m
)F(x

m
)
_
,
where F(x

m
) = E
x
m
_
g
1
(x
t
1
)g
2
(x
t
2
)
_
. Also,
F(b) = E
b
(g
1
(x
t
1
)g
2
(x
t
2
))
= E
b
_
g
1
(x
t
2
(w

t
1
))g
2
(x
t
2
t
1
(w
+
t
1
))
_
, if t
2
> t
1
,
= E
b
_
g
1
(x
t
2
(w

t
1
))E
x
t
1
(g
2
(x
t
2
t
1
(w

)))
_
= E
b
_
g
1
(x
t
1
)H
t
2
t
1
g
2
(x
t
1
)
_
= H
t
1
_
g
1
H
t
2
t
1
g
2
_
(b).
Thus F(b) is continuous in b since H
t
: C(S ) C(S ).
Therefore
E
a
_
f (w)g
1
(x
t
1
+
)g
2
(x
t
2
+
)
_
=
= E
a
_
< : f (w)E
x

(g
1
(x
t
1
)g
2
(x
t
2
))
_
= E
a
_
f (w)E
x

(g
1
(x
t
1
)g
2
(x
t
2
))
_
.
5. Example of a Markov process.... 51
63
Generalizing this to n > 2, we have, if g
i
C(S ),
E
a
_
f (w)g
1
(x
t
1
+
) . . . g
n
(x
t
n
+
)
_
= E
a
_
f (w)E
x

(g
1
(x
t
1
) . . . g
n
(x
t
n
))
_
.
The same equation holds if g
i
B(S ). If B B and B = (w :
w(t
1
) E
1
, . . . , w(t
n
) E
n
), then
X
B
(w) = X
E
1
(x
t
1
) . . . X
E
n
(x
t
n
),
and therefore,
E
a
f (w)X
B
(w
+

) = E
a
_
f (w)E
x

(X
E
1
(x
t
1
) . . . X
E
n
(x
t
n
)
= E
a
_
f (w)E
x

(X
B
(w

))
_
.
The equation
E
a
( f (w)g(w
+

)) = E
a
_
f (w)E
x

(g(w

))
_
follows easily now for g B.
5 Example of a Markov process which is not a
strong Markov process
The above theorem shows the all the preceding examples of Markov 64
processes are strong Markov processes. The natural question is whether
there exist Markov processes which are not strong Markov processes.
The following example answers this question in the armative.
Suppose that (P) is a probability space and (w), w , a random
variable on (P) such that
P(w : (w) E) =
_
E
e
t
dt, > 0.
Such a random variable is often called exponetial holding time.
Let S =
_
0, ) and W = W
c
. Dene

(a)
(t, w) = a + t, a > 0;
52 2. Srong Markov Processes

(0)
(t.w) = 0, if t < (w),
t , if t (w).

(a)
(t, w) are random variables on (p) and for xed w are in W
c
. For
B B(W) and 0 a < , dene
P
a
(B) = P
_
w :
(a)
(., w) B
_
.
For a > 0, then, P
a
(B) = 1 if
(a)
B and P
a
(B) = 0 otherwise.
To show that M = (S, W, P
a
) is a Markov process, we have only to
verify the Markov property. To do this, we show that if f
1
, f
2
B(S ),
then
E
a
( f
1
(x
t
1
) f
2
(x
t
2
)) = H
t
1
( f
1
H
t
2
t
1
f
2
)(a).
Denoting by E the expectiation on , we have 65
H
t
f (a) = f (a + t), a > 0;
H
t
f (0) = E
0
( f (x
t
)) = E( t; f (t )) + E(t < ; f (0)).
So if a > 0,
E
a
( f
1
(x
t
1
) f
2
(x
t
1
)) = f
1
(a + t
1
) f
2
(a + t
2
).
= f
1
(a + t
1
)H
t
2
t
1
f (a + t
1
)
= H
t
1
( f
1
H
t
2
t
1
f
2
)(a)
If a = 0, we have
E
0
( f
1
(x
t
1
) f
2
(x
t
2
)) = E( t
1
; f
1
(t
1
) f
2
(t
2
))
+ E(t
1
< t
2
; f
1
(0) f
2
(t
2
)) + E(t
2
< : f
1
(0) f
2
(0))
=
t
1
_
0
f
1
(t
1
s) f
2
(t
2
s)e
s
ds + f
1
(0)
t
2
_
t
1
f
2
(t
2
s)e
s
ds + f
1
(0) f
2
(0)e
t
2
5. Example of a Markov process.... 53
=
t
1
_
0
f
1
(t
1
s) f
2
(t
2
s)e
s
ds + f
1
(0)e
t
1
t
2
t
1
_
0
f
2
(t
2
t
1
s)e
s
ds + f
1
(0) f
2
(0)e
t
2
=
t
1
_
0
f
1
(t
1
s)
_
H
t
2
t
1
f
2
(t
1
s)
_
e
s
ds + f
1
(0)e
t
1
_

_
t
2
t
1
_
0
f
2
(t
2
t
1
s)e
s
ds + f
2
(0)e
(t
2
t
1
)
_

_
=
t
1
_
0
f
1
(t
1
s)H
t
2
t
1
f
2
(t
1
s)e
s
+ f
1
(0)e
t
1
H
t
2
t
1
f
2
(0)
= H
t
1
_
f
1
H
t
2
t
1
f
2
_
(0).
66
The following facts are easily veried:
N = { f : f = 0 a.e., f (0) = 0} ;
R =
_
u : u abs.cont. in (0, ), u, u

B (0, )
_
;
Gu(a) = u

(a) for a > 0 and Gu(0) =


_
u(0+) u(0)
_
.
We now show that M is not strong Marko process. Let =
G
,
where G = (0, ). We shall show that M does not have the strong
Markov property with respect to the Markov time . We have,
A = P
0
( > 0, (W
+

) > 0) = 0,
since (w
+

) = 0. Also
_
w : (w) > 0
_

_
w : (
(0)
(., w)) > 0
_
,
and hence
P
0
(w : (w) > 0) = P(w : (
(0)
(., w)) > 0 P
_
w : (w) > 0
_
= 1.
54 2. Srong Markov Processes
Note that w((w)) = 0. If M has the strong Markov property with
respect to , we should have
0 = A = P
0
( > 0, (w
+

) > 0) = E
0
( > 0; P
x

( > 0))
= E
0
( > 0; P
0
( > 0)) = 1 P
0
( > 0) = 1,
but this is absurd. 67
6 Dynkins formula and generalized rst passage
time relation
We now prove some theorems on Markov processes which have the
strong Markov property with respect to the Markov time .
Theorem 1 (Dynkin). If u(a) = G

f (a), then
u(a) = E
a
_

_
0
e
t
f (x
t
)dt
_

_
+ E
a
(e

u(x

)).
Proof. We have
u(a) = E
a
_

_
0
e
t
f (x
t
)dt
_

_
= E
a
_

_
0
e
t
f (x
t
)dt
_

_
+ E
a
_

e
t
f (x
t
)dt
_

_
,
and
E
a
_

e
t
f (x
t
)dt
_

_
= E
a
_

_
e

_
0
e
t
f (x
t
(w
+

))dt
_

_
=

_
0
e
t
E
a
(e

f (x
t
(w
+

)))dt
6. Dynkins formula and generalized ... 55
=

_
0
e
t
E
a
(e

E
x

( f (x
t
))dt,
because Mhas the strong Markov property with respect to . (Note that 68
if is a Borel function on the real line, then () B
+
).
Therefore
E
a
_

_
0
e
t
f (x
t
)dt
_

_
= E
a
_

_
0
e

E
x

(e
t
f (x
t
))dt
_

_
= E
a
(e

u(x

)).

Before proving Theorem 2, we prove the following


Lemma (). Let

a
(dt db) = P
a
[ dt, x

db]
be the meausre induced on the Borel sets of R

S by the mapping
w (, x

) of W into R

S . Let (t, b) be a bounded Borel measurable


function on R

S . Then
_
_
o,)S
e
t

a
(dt db)

_
s=0
e
s
(s, b)ds
=

_
0
e
t
dt
_
_
0,t
_
S
(t s, b)
a
(ds db)
Proof. We have
_
[0,)S
e
t

a
(dtdb)

_
0
e
s
(s, b)db
56 2. Srong Markov Processes
=
_
[0,)S

a
(dt db)

_
t
e
s
(s t, b)ds
=
_
[0,)S

a
(dt db)

_
0
F(t, s, b)ds
where 69
F(t, s, b) = e
s
(s t, b), if s t;
0, if s < t.
Changing the order of integration we get the last expression equal to

_
0
ds

_
[0,)S
F(t, s, b)
a
(dt db) =

_
0
e
s
ds

_
[0,)S
(s t, b)
a
(dt db)
This proves the lemma.
Theorem 2. (Generalized rst passage time relation). Put
Q(t, a, E) = P
a
(x
t
E and > t).
Then
P(t, a, E) = Q(t, a, E) +
_
[0,t]S
P(t s, b, E)
a
(ds db)
Remark. When is the rst passage time, this is usually known as the
rst passage time relation.
Proof. We have
E
a
_

_
0
e
t
f (x
t
)dt
_

_
= E
a
_

_
0
e
t
f (x
t
)
[0,]
(t)dt
_

_
6. Dynkins formula and generalized ... 57
=

_
0
e
t
E
a
( > t : f (x
t
))dt.
Further 70
E
a
(e

u(x

)) =
_
[0,)S
e
t
u(b)
a
(dt db),
and since u(b) =

_
0
e
s
H
s
f (b)ds, we have from the Lemma,
E
a
(e

u(x

)) =

_
t=0
e
t
dt
_
[0,t]S
H
ts
f (b)
a
(ds db).
From Theorem 1, therefore

_
0
e
t
H
t
f (a)dt = u(a) =

_
0
e
t
E
a
( > t : f (x
t
)dt
+

_
t=0
e
t
dt
_
[0,t]S
H
ts
f (b)
a
(ds db).
Since the last equation is true for all > 0, we have for almost all t,
H
t
f (a) = E
a
( > t; f (x
t
)) +
_
[0,t]S
H
ts
f (b)
a
(dsdb).
Now suppose that f is bounded and continuous. Then
H
t
f (a) E
a
( > t : f (x
t
)) = E
a
( t; f (x
t
)) = E
a
( f (x
t
)
[0,t]
((w)))
is right continuous in t since f (x
t
) and
[0,t]
are right continuous in t.
Further
_
[0,t]S
H
ts
f (b)
a
(dsdb) =
_
[0,]S
X
[0,t]
(s)H
ts
f (b)
a
(ds db)
58 2. Srong Markov Processes
and so is also right continuous in t. Therefore the above equation holds 71
for all t if f is continuous and bounded. It follows easily that for any
f B(s) the equation is true identically in t. Putting f = X
E
we get
Theorem 2.
The following rough proof should give us an intuitive explanation
of Theorem 2.
P(t, a, E) Q(t, a, E) = P
a
(x
t
E, t )
=
t
_
s=0
t
_
S
P
a
( ds, X

db, x
t
E)
=
t
_
s=0
t
_
S
P
a
( ds, X
S
db, x
ts
(W
+
S
) E
=
t
_
s=0
t
_
S
P
a
( ds, X
S
db)P
b
(x
ts
E)
=
t
_
s=0
t
_
S
P(t s, b, E)(ds db).
We give below two examples to illustrate the use of Theorem 2.
Example 1. Let M be the standard Brownian motion, E B(0, ) and
a > 0. Then we shall prove that
P
a
(x
t
E, t <
0
) =
_
E
_
N(t, a, b) N(t, a, b)
_
db
where
0
(w) = inf(t : w(t) = 0), 72
Since w(
0
(w)) = 0, for E B[0.] and F B(S ) we have

a
(EF) = P
a
(
0
E, x

0
F) = 0, if 0 F; P
a
(
0
E), if 0 F.
Therefore form Theorem 2, with =
0
, we have
P(t, a, E) = Q(t, a, E) +
t
_
s=0
P(t s, o, E)
a
(ds),
6. Dynkins formula and generalized ... 59
P(t, a, E) = Q(t, a, E) +
t
_
s=0
P(t s, o, E)
a
(ds).
Since a > 0, E B(0, ) and all continuous paths starting at a and
going into E pass through o, Q(t, a, E) = 0. Also P(t s, o, E) =
P(t s, o, E). Therefor, subtractiong,
P(t, a, E) P(t, a, E) = Q(t, a, E) = P
a
(x
t
E, t <
0
)
i.e.,
_
E
_
N(t, a, b) N(t, a, b)
_
db = P
a
(x
t
E, t <
0
).
Remark . P
a
(x
t
E and t <

) = P
a
(x
t
E and x
s
> 0,
0 s t).
Example 2.
P
a
(x
s
> 0, 0 s t) = 2
a
_
0
1

2t
e

2
/2t
d = P
0
(|x
t
| < a).
Put E = (0, ) in Example 1. Then we get 73
P
a
(x
s
> 0, 0 s t) =
_

0
1

2t
_
e

(ba)
2
2t
e
(b+a)
2
2t
_
db
= 2
_
a
0
1

2t
e

2
/2t
d
= P
0
(|x
t
| < a).
Note that if a > 0
P
a
(x
s
> 0, 0 s t) = P
a
(
0
> t) =
= P
a
_
min
0st
x
s
> 0
_
= P
a
_
min
0st
x
s
> a
_
= P
0
_
max
0st
x
s
< a
_
.
The following important theorem which follows easily from Theo-
rem 1 gives what is called Dynkins formula.
60 2. Srong Markov Processes
Theorem 3. If E
a
() < and u D(G), then
E
a
__

0
Gu(x
t
)dt
_
= E
a
(u(x

)) u(a).
Proof. From Theorem 1,
u(a) = E
a
__

0
e
t
f (x
t
)dt
_
+ E
a
(e

u(x

)).
Also 74
f (x
t
) = u(x
t
) Gu(x
t
).
Therefore
u(a) = E
a
__

0
e
t
(u(x
t
) u(x
t
))dt
_
+ E
a
(e

(u(x

)).
Letting 0, we get the result.
7 Blumenthals 0 1 law
Let Mdenote a strong Markov process.
Theorem 1. If A B
0+
(=
_
>0
B

), then P
a
(A) = 1 or 0.
Proof. For P
a
(A) = P
a
(A, w A) = P
a
(A, w
+
0
A)
= E
a
(A : P
x
0
(A)) = E
a
(P
a
(A) : A) = (P
a
(A))
2

Theorem 2. If f (w)B
0+
, then P
a
( f = E
a
( f )) = 1.
Proof. Since f B
0+
, f is bounded. From Theorem 1,
P
a
[ f > E
a
( f )] = 1 or 0.
Obviously it cannot be 1, since then E
a
( f ) < E
a
( f ). Hence P
a
[ f >
E
a
( f )] = 0. For the same reason, P
a
[ f < E
a
( f )] = 0. Hence P
a
[ f =
E
a
( f )] = 1.
8. Markov process with discrete state space 61
We consider the following
Example. Let (t) be a function of t, positive and increasing for t > 0.
Let x
t
be a real valued strong Markov process. Consider
P
a
() = P
a
_

_
lim
0
_
0t
(|x
t
a|) (t)
_

_
.
75
By Theorem 1, P
a
() = 1 or 0. If P
a
() = 1, we say that U

(the upper class) and if P


a
() = 0, we say that L
a
(the lower class).
Wiener proved that for the Brownian notion,
(t) = t
1
2
U
a
and (t) = t
1
2
+
L
a
for every > 0
These results have been made more precise by P. Lavy, Kolmogor
and Er ods P. Levys theorem is that
(t) (1 + c)
_
2t log log 1/t U
a
, c > 0
L
a
, c < 0.
8 Markov process with discrete state space
Let M be a right continuous Markov process with discrete state space
S . Since S satises the second countability axiom, it is countable. We
denote the elements of S by (1, 2, 3, . . .). Since S is discrete, B(S ) =
C(S ) and W consists of the set of all step functions before their killing
-time. Mis a Markov process because H
t
C(S ) C(S ).
Let
a
= inf(t : x
t
a) = inf(t : x
t
G) where G = (S {a}) {}.

a
is called the rst leaving time from a. Clearly
a

.
a
has the
following properties:
1.
a
is a Markov time. 76
For,
(
a
t) = (x
s
= a for all s < t)
62 2. Srong Markov Processes
= (x
r
= a for all r < t, r rational) B
t
.
Note that
(
a
> t) = (x
s
= a for all s t)
= (x
r
= a for all rational r < t and x
t
= a) B
t
.
2. P
a
(
a
> t) = e
p
a
t
where
1
p
a
= E
a
(
a
)
Indeed we have
P
a
(
a
> t + s) = P
a
(
a
> t,
a
(w
+
t
) > s)
= E
a
(
a
> t, p
x
t
(
a
> s))
= E
a
(
a
> t, P
a
(
a
> s)), since x
t
= a,
= P
a
(
a
> t)P
a
(
a
> s)
Therefore, if (t) = P
a
(
a
> t), then (t) is right continuous, as is
easily seen, 0 (t) 1 and (t + s) = (t)(s). Further
(0) = P
a
(
a
> 0) = P
a
(w : x
o
= a) = 1.
If (t) = 0 for some t > 0, then (t) = ((t/n))
n
= 0 and so we
should have (t/n) = 0 for all n, and by right continuity, (0) = 0. 77
Therefore 0 < (t) 1 for all t. Thus
(t) = e
P
a
t
, 0 p
a
< .
If p
a
= 0, then (t) 1, i.e. P
a
(
a
> t) = 1, i.e. P
a
(
a
= ) = 1
and so
E
a
(
a
) =
_

a
=

a
(w)dP
a
(w) = .
If p
a
> 0, the map w
a
(w) induces the mesure p
a
e
p
a
t
dt.
Therefore
E
a
(
a
) =
_

0
t p
a
e
p
a
t
dt =
1
p
a
.
8. Markov process with discrete state space 63
3. x

a
and
a
are independent with respect to P
a
.
Indeed, noticing that
a
(w) = t +
a
(w
+
t
) if
a
(w) > t, we have
P
a
(
a
> t, x

a
E) = P
a
(
a
> t, x
t+
a
(w
+
t
)
(w) E)
= P
a
(
a
> t, x

a
(w
+
t
)
(w
+
t
) E
= E
a
(
a
> t, p
x
t
(x
a
E))
= E
a
(
a
> t, P
a
(x

a
E))
= P
a
(x

a
E)P
a
(
a
> t).
We now determine the generator. From Theorem 1, 78
u(a) = E
a
__

a
0
f (x
t
)e
t
dt
_
+ E
a
(e

a
u(x

a
))
Since w(t) = a for t <
a
(w) and
a
, x

a
are independent, we have
u(a) = E
a
_
f (a)
_

a
0
e
t
dt
_
+ E
a
(e

a
)E
a
(u(x

a
))
= f (a)E
a
_
1 e

_
+ E
a
(e

a
)E
a
(u(x

a
))
= f (a)
_

0
1 e
t

e
p
a
t
p
a
dt + E
a
(u(x
a
))
_

0
e
t
e
p
a
t
p
a
dt
=
f (a)
p
a
+
+
p
a
p
+
a
E
a
(u(x

a
)).
Let now

ab
= P
a
(x

a
= b).
Then
E
a
(u(x

a
)) =

bS

ab
u(b).
Since u() is by denition zero,
u(a) =
f (a)
p
a
+
+
p
a
p
a
+

bS

ab
u(b).
64 2. Srong Markov Processes
From the last equation we see that u 0 implies f 0. Therefore 79
M =
_
f : f 0
_
.
Also from the above we get
u(a) f (a) = p
a

bS

ab
u(b) p
a
u(a)
and hence
Gu(a) = p
a
_

bS

ab
u(b) u(a)
_

_
= p
a
_

bS

ab
(u(b) u(a))
a
u(a)
_

_
since
_
bS

ab
+
a
= 1.
Remark. It is generally dicult to determine R = D(G). We can also
nd G from Dynkins formula as follows:
E
a
__

a
0
Gu(x
t
)dt
_
= E
a
(u(x

a
)) u(a).
Therefore
Gu(a)E
a
__

a
0
dt
_
=

bS

ab
u(b) u(a)
i.e., Gu(a)E
a
(
a
) =
_
bS

ab
u(b) u(a) and since E
a
(
a
) = 1/p
a
, we get
the result.
Example. Suppose that
ab
= 0 expect for b = a 1 or b = and let 80

a,a+1
=
a
,
a,a1
=
a
,
a
=
a
1
a

a
.
This process is called the birth and death process. We have
Gu(a) = p
a
(
a
u(a + 1) +
a
u(a 1) u(a))
= p
a
_

a
(u(a + 1) u(a)) +
a
(u(a 1) u(a))
a
u(a)
_
In this particular case we can derermine DG which will depend on
the behaviour of p
a
,
a
and
a
at a = .
9. Generator in the restricted sence 65
9 Generator in the restricted sence
In case of the generator G dened previously there was some ambiguity
so that Gu(a) had no meaning unless we took a version of Gu. We shall
now aviod this ambiguity by restricting the domain of the generator; we
can then speak of Gu(a). Before doing this we prove some theorems
on the domain of the new generator. We rst dene the function space
D(S ).
Denition (). Let y
t
, t > 0, be a random process on a probability space
(B, P). We say that y
t
tends to y essentially (P) as t t
0
, in symbols:
y
t

ess.(P)
y, if for any countable t-set C with t
0
c,
p
_
lim
tC,tt
0
y
t
= y
_
= 1.
Let M = (S, WP
a
) be a strong Markov proces. We make the follow- 81
ing
Denition (). D(S ) =
_
f : f B(S ) and for every a, f (x
t
)
ess.(P
a
)
f (a), as t 0
_
.
Theorem 1. D(S ) (S ).
Proof. Clear.
Theorem 2. G

B(S ) D(S ). In particular, G

D(S ) D(S ).
The proof depends on the following Lemma, the proof of which can
be in Doobs book (p.355).
Lemma (). Let z be a random variable on a probability space (B, p),
with E(|z|) < . Let B
t
B, 0 < t < , be Borel algebras such that if
t < s, B
t
B
s
. Then, if B
o+
=
_
t>0
B
t
, we have
E(z/B
t
)
ess. (P)
E(z/B
o+
).
66 2. Srong Markov Processes
Proof of Theorem 2. We prove rst that
G

f (x
t
) = e
t
E
a
(z/B
t
) e
t
_
t
o
e
s
f (x
s
)ds
with P
a
probability 1, where z =
_

0
e
s
f (x
s
)ds. Indeed, if B
t
B
t
, by
the Markov property,
E
a
(G

f (x
t
) : B
t
) = E
a
_
E
x
t
__

0
e
s
f (x
s
)ds
_
: B
t
_
= E
a
__

0
e
s
f (x
s
(w
+
t
))ds : B
t
_
= e
t
E
a
__

0
e
s
f (x
s
)ds : B
t
_
.
82
Since G

f (x
t
) B
t
, by the denition of conditional expectation we
have
G

f (x
t
) = e
t
E
a
__

t
e
s
f (x
s
)ds
_
B
t
_
= e
t
E
a
__

0
e
s
f (x
s
)ds
_
B
t
_
e
t
E
a
__
t
0
e
s
f (x
s
)ds
_
B
t
_
= e
t
E
a
(z/B
t
) e
t
_

_
t
_
0
e
s
f (x
s
)ds
_

_
.
Since
_
t
0
e
s
f (x
s
)ds B
t
, the conditional expectation of
_
t
0
e
s
f (x
s
)ds is
_
t
0
e
s
f (x
s
)ds with probability 1. Using the lemma, there-
fore,
G

f (x
t
)
ess (P
a
)
E
a
(z
_
B
0+
).
From Blumenthals 0 1 law, if EB
0+
, P
a
(E) = 0 or 1. Hence
E
a
(z
_
B
0+
) = E
a
(z) = G

f (a).
This proves the theorem. 83
9. Generator in the restricted sence 67
Theorem 3. If f D(S ), f (x
t
) is right continuous with respect to L

-
norm.
Proof. Since f D(S ), if t
n
0, P
a
( f (x
t
) f (a)) = 1, so that
E
a
(| f (x
t
) f (a)|) 0 as n .
Now
E
a
(| f (x
s+t
) f (x
s
)|) = E
a
(| f (x
t
(w
+
s
)) f (x
0
(w
+
0
))|)
= E
a
(E
x
s
(| f (x
t
) f (x
0
)|)) 0 as n .
This proves the result
Theorem 4. If F D(S ) and G

f = 0, then f 0.
Proof. Note that if g

f = 0 for some , G

f = 0 for all , from the


resolvent equation. From Theorem 3,
H
t
f (a) = E
a
( f (x
t
)) f (a) as t 0.
Now
0 = G

f (a) =
_

0
e
t
H
t
f (a)dt
=
_

0
e
s
H
s/
f (a)ds f (a) as .
Q.E.D.
Theorem 5. If f D(S ),
P
a
_
1
t
_
t
0
f (x
s
)ds f (a) as t 0
_
= 1.
84
Proof. Put y(s, w) = f (x
s
(w)) f (a) and let C =
_
2
n
k
, k, n = 1, 2, . . .
_
be the set of dyadic rational numbers. Then from the denition of D(S ),
lim
t0
sup
sC,0st
|y(s, w)| = 0,
68 2. Srong Markov Processes
for w
1
, with P
a
(
1
) = 1. Put
n
(s) =
[2
n
s] + 1
2
n
. Then
n
(s) s
for every s. From Theorem 3,
_
1
0
E
a
(|y(
n
(s), w) y(s, w)|)ds 0 as n ,
i.e., y(
n
(s), w) y(s, w) in L

-norm on L

([0, 1] W). Therefore,


there exists a subsequence,
n
(s) =
k
n
(s), say, such that y(
n
(s), w)
y(s, w) for (s, w) A, say, with (m P
a
)(A) = 1, m P
a
denoting the
product measure on [0, 1] W. Now
(m P
a
)(A) =
_
m(s : (s, w) A)dP
a
(w) = 1,
so that m(s : (s, w) A) = 1 for w
2
, P
a
(
2
) = 1. Let
1

2
= .
Then if w , w
2
so that

1
t
_
t
0
y(s, w)ds

= lim
n

1
t
_
t
0
y(
n
(s), w)ds

lim
n
sup
nC,0st
y(s, w) 0 as t 0,
since w
1
. 85
Denition of generator in the restricted sence. Let M be a strong
Markov process. Consider the restriction of G

to D(S ). We shall de-


note this also by G

.
Theorem 6. R

= G

D(S ) is indepentent of . (We can therefore denote


R

by R.)
The proof is similar to that is the case of the generator dened ear-
lier.
Theorem 7. G

: D(S ) R is 1 : 1 and linear.


Proof. Since G

f = 0 implies f 0, G

is 1 : 1. Let us write G

=
G
1

.
Theorem 8. G

is independent of .
9. Generator in the restricted sence 69
This is obvious.
Denition (). G = G
1

is called the generator in the restricted sence.


Since G

is 1 : 1, Gu B(S ).
Theorem 9 (Dynkins formula). If u D(G) and a Markov time with
E
a
() < , then
E
a
__

0
Gu(x
t
)dt
_
= E
a
(u(x

)) u(a).
proof as before.
Theorem 10 (Dynkin). If Gu is continuous at a and if Gu(a) 0, then
Gu(a) = lim
Ua
E
a
(u(x

U
)) u(a)
E
a
(
U
)
where U denotes a closed neighbourhood of a and
U
is the leaving 86
time for U, i.e.
U
= inf {t : x
t
(S U) }.
Proof. Since Gu(a) 0, we may suppose that Gu(a) > > 0. Let U
be a closed neighbourhood of a such that for b U, Gu(b) > /2. Let

n
=
U
n; then E
a
(
n
) < and
E
a
__

n
0
Gu(x
t
)dt
_
= E
a
(u(x

n )) u(a).
If T <
n
, u(x
t
) U and Gu(x
t
) > /2. Hence 2||u||

2
E
a
(
n
). If
follows that E
a
(
U
) < . Therfore
E
a
__

U
0
Gu(x
t
)dt
_
= E
a
(u(x

U
)) u(a).
Since Gu(a) ia continuous at a, sup
b
|Gu(a)Gu(b)| 0 as U a.
Therefore

Gu(a)
E
a
(u(x

U
)) u(a)
E
a
(
U
)

=
1
E
a
(
U
)

E
a
__

U
0
(Gu(x
t
) Gu(a))dt
_

1
E
a
(
U
)
E
a
(
U
)

sup
b
|Gu(a) Gu(b)|

0
as U a

70 2. Srong Markov Processes


Theorem 11. If u D(G) = R, then given any sequence of Markov 87
times {
n
} such that
n
> 0, we can nd a sequence {
n
} of Markov
times, 0 <
n

n
such that
Gu(a) = lim
n
E
a
(u(x

n
)) u(a)
E
a
(
n
)
Proof. Let

( f ) = inf
_
t :
1
t

_
t
0
f (x
s
)ds f (a)

>
_
. If is easily seen
that

( f ) is a Morkov time, and since P


a
(
1
t
_
t
0
f (x
s
)ds f (a)) = 1,
P
a
(

( f ) > 0) = 1. Let now

n
=
1/n
(Gu)
n
1.
Then P
a
(
n
> 0) = 1 and 0 < E
a
(
n
) < 1. Therefore
E
a
__

n
0
Gu(x
t
)dt
_
= E
a
(u(x

n
)) u(a).
We have

Gu(a)
E
a
(u(x

n
)) u(a)
E
a
(
n
)

1
E
a
[
n
]
E
a
_
1

_
n
0
(Gu(x
t
) Gu(a))dt

1
E
a
(
n
)
E
a
(
n
)
1
n
0 as n .
Properties of generator in the restricted sense:
Theorem 12 (Mean value property). Let U be an open subset of S and 88

U
the leaving time from

U and u D(G).
(1) If u(a) = E
a
(u(x

U
)) for every a

U, then Gu(a) = 0 in U.
(2) Conversely, if E
a
(
U
) < , Gu(a) = 0 in U, then
u(a) = E
a
(u(x

U
)) for every a U.
9. Generator in the restricted sence 71
Proof. (1) If u(a) = E
a
(u(x

U
)) for every a

U, then u(a) = E
a
(u(x

U
)) for every a S . For if a

U, P
a
(
U
= 0) = 1. If follows
that E
a
(u(x

U
)) = u(a). Noting this, let be a Markov time
U
.
Then since
U
= +
U
(w
+

), we have
u(a) = E
a
(u(x

U
)) = E
a
(u(x
+
U
(w
+

)
(w)))
= E
a
(u(x

U
(w
+

)
(w
+

)))
= E
a
(E
x

(u(x

U
))) = E
a
(u(x

)).
Now we can choose a sequence of Markov times
n

U
so that
Gu(a) = lim
n
E
a
(u(x

n
)) u(a)
E
a
(
n
)
= 0.
(2) If E
a
(
U
) < we have from Dynkins formula
E
a
__

U
0
Gu(x
t
)dt
_
= E
a
(u(x

U
)) u(a),
so that if Gu(a) = 0 for a

U, Gu(x
t
) = 0 for t <
0
and we get 89
the result.

Theorem 13 (Local property). Let u, v D(G) and u = v in a closed


neighbourhood U of a. Suppose that there exists a Markov thime such
that P
a
( > 0) = 1 and P
a
(x
t
is continuous for 0 < ) = 1. Then
Gu(a) = Gv(a).
Proof. Let h = u v. Then h(b) = 0 for b U. Let =
U
. Then
since x
t
is continuous for 0 t , x

U so that E
a
(h(x

)) = 0 = h(a).
Now
Gh(a) = lim
n
E
a
(h(x

n
)) h(a)
E(
n
)
= 0,
since
n
can be chosen so that
n

U
.
Section 3
Multi-dimensional Brownian
Motion
We have already studied one-dimonsional Brownian motion. We shall 90
now dene k-dimensional Brownian motion, determine its generator and
deduce the main result of Potential Theory using properties of the k-
dimensional Brownian motion.
1 Denition
We rst dene k-dimensional Wiener process. Let x(t, w) = (x
i
(t, w), i =
1, 2, . . . , k) be a k-dimensional stochastic process on a probability space
(P). x(t, w) is called a k-dimensional Wiener process if (1) its compo-
nents x
i
(t, w) are one-dimensional Wiener processes, and (2) x
i
(t, w), 1
i k, are stochastically independent processes.
It is easy to construct a k-dimensional Wiener process x(t, w) on
(P) from a 1-dimensional Wiener process (t, ) on (Q). It is su-
cient to take =
k
and P = the product probability Q
k
, and dene for
w = (
1
, . . . ,
k
),
x(t, w) = ((t,
1
), . . . , (t,
k
)).
We now study the k-dimensional standard Brownian motions. Let
73
74 3. Multi-dimensional Brownian Motion
S = R
k
, W = the set of all continuous functions into S and dene
P
a
(B) = P[a + x(., w) B].
Here a = (a
1
, . . . , a
k
). It is easily veried that M = (S, W, P
a
) is 91
a Markov process M is called the k-dimensional standard Brownian
motion. The transition probability of the process is
P(t, a, E) =
_
E
N
k
(t, a, b)db,
where N
k
(t, a, b) = N(t, a
1
, b
1
) N(t, a
k
, b
k
)
Since, for f C(S ),
H
t
f (a) =
1
(2t)
k/2
_
e
|b|
2
/2t
f (a + b)db, |b|
2
= b
2
1
+ + b
2
k
is also in c(S ), Mis a strong Markov process.
Let denote the group of congruence (distance-preserving) trans-
formations of R
k
. If O , then O indues a transformation, which again
we denote by O, of W W dened by
(Ow)(t) = OW(t).
O carries measurable subsets of W into measurable subsets. For any
subset L W, we dene
OL = (Ow : w L).
The following facts are easily veried
(0.1) P(t, Oa, OE) = P(t, a, E)
(0.2) P
O
a
(OB) = P
a
(B).
If O is a rotation around a, i.e. if Oa = a, P
a
(OB) = P
a
(B), so 92
that O is a P
a
-measure preserving transformation of W onto W.
2. Generator of the k-dimensional Brownian motion 75
2 Generator of the k-dimensional Brownian motion
Let D = D(R
k
) be the space of all C

function with compact supports.


For D, put
(t, a) = H
t
(a),
and
(a) (a, ) = G

(a) =
_

0
e
t
(t, a)dt.
Now
(t, a) =
_
R
k
1
(2t)
k/2
e
|b|
2
/2t
(a + b)db
=
_
R
k
1
(2)
k/2
e
|b|
2
/2
(a + b

t)db,
and a simple calculation gives

t
=
1
2
, (0 + a) = (a).
Taking Laplace transform, the last equation gives
(
1
2
) = .
In order to show that is the unique solution of this equation, it is
enough to show that if C
2
, (a) 0 as |a| and (
1
2
) = 0, 93
then 0. To prove this, suppose that (a) > 0 for some a. Then since
(a) 0 as |a| , the maximum of (a) is attained at a nite point
a
0
and
(a
0
) = max (a) > 0.
Therefore
(a
0
) 0,
and hence
(
1
2
)(a
0
) > 0.
76 3. Multi-dimensional Brownian Motion
Thus (a) 0. Replacing by , we see that 0. This proves
our contention.
Now, let f B(R
k
). Then
u(a) = G

f (a) =
_
G(, |b a|) f (b)db,
where
G(, |b a| =
_

0
e
t|ba|
2
/2t
(2t)
k/2
dt.
Note that G(, |b a|) is continuous in (a, b). It is immediate that
u B(R
k
) and we can consider u as a distribution in the Schwartz sense.
Then by the denition of the derivative of a distribution, for any D,
_

1
2

_
u() =
_
u(a)
_

1
2

_
(a)da
=

G(, |a b|) f (b)


_

1
2

_
(a)da db
=
_
f (b)db
_
G(, |b a|)
_

1
2

_
(a)da
=
_
f (b)(b)db;
where 94
(b) =
_
G(, |a b|)
_

1
2

_
(a)da.
If =
_

1
2

_
, then D and
= G

,
and from the above we get
_

1
2

_
= =
_

1
2

_
,
and hence = .
2. Generator of the k-dimensional Brownian motion 77
Thus
_

1
2

_
u() =
_
f (b)(b)db,
and this means that the distribution
_

1
2

_
u is dened by the function
f . (Of course, any function equal to f almost every-where denes the
same distribution.)
What we have above also shows that if u = 0 then the distribution
_

1
2

_
u = 0 so that f = 0 a.e. Hence
N = { f : f = 0 a.e.} .
Let R =
_
u : u, u B(R
k
), u is the distribution sense
_
=
_
u : u B(R
k
) and the distribution u is dened
by a function in B(R
k
)
_
.
We see form the above that R R
+
. Now suppose u R
+
. Then 95
u B(R
k
) and u is dened by a function in B(R
k
). Let (
1
2
)u
be dened by f B(R
k
). Put G

f = v and from the above we see that


(
1
2
)v is dened by f . Hence (u
1

1
2
u
1
) = 0 where u
1
= u v.
We prove that u
1
= 0 a.e. Now
_ _

1
2

_
(a)u
1
(a)da = 0
for every D(R
k
), so that
_ _

1
2

_
(a + b)u
1
(a)da = 0
for every D(R
k
). Also

G(, |b|)|u
1
(a)||
_

1
2

_
(a + b)|da db
=
_
G(, |b|)db
_
|u
1
(a)||
_

1
2

_
(a + b)|da
78 3. Multi-dimensional Brownian Motion
=
_
G(, |b|)db
_
aK
|u
1
(a b)||
_

1
2

_
(a)|da
M
_
G(, |b|)db,
where K is the compact set outside whish is zero and
M = (diam.K). Sup |u(a b)(
1
2
)(a)|.
Therefore,
0 =
_
G(, |b|)db
_ _

1
2

_
(a + b)u
1
(a)da
=
_
u
1
(a)da
_
G(, |b|)
_

1
2

_
(a + b)db
=
_
u
1
(a)da(a).
96
Hence u
1
= 0 a.e. Thus
R =
_
u : u, u B(R
k
)
_
and Gu =
1
2
u in the distribution sense.
3 Stochastic solution of the Dirichlet problem
Let U be a bounded open set and f a function which is bounded and
continuous on the boundary U of U. The problem of niding a function
h(a : f , U), dened and harmonic in U and such that h(a : f , U) f ()
as a from within U, is called the Dirichlet problem. h(a), if it
exists, is unique and is called the classical solution. The denition of
a solution can be generalized, in various ways, so as to include cases
in which the classical solution does not exist. The generalized solution
will still be harmonic in U, but will tend to the boundary value f () in a
slightly weaker sense.
3. Stochastic solution of the Dirichlet problem 79
The stochastic solution. which we shall discuss, gives one way of
dening a generalized solution. Let

U
= rst leaving time from U = inf {t : x
t
U}
By denition, u(a) u(a : f , U) = E
a
( f (x

u
)) is the stochastic
solution of the Dirichlet problem with boundary value f . We shall see
that the stochastic solution is identical with the classical solution, in case 97
the latter exists.
We rst establish some results on
U
.
Theorem 1. P
a
(
u
< ) = 1 if U is a bounded domain.
This is a corollary of the following stronger
Theorem 2.
E
a
[
U
] < , if U is bounded .
Proof. Since P
a
[B + a] = P
0
[B], we can assume that a = 0. Further,
since
u

v
for U V, we can assume that U is the sphere =
{x : |x| < r}. Let u D(G) be such that Gu has a version satisfying
Gu(a) >
0
in for some
0
> 0. For example, if u(a) = e
|a|
2
/
4r
2
, then
Gu(a) =
1
2
u(a) =
1
2
_
k
2
r
2

|a|
2
4
r
4
_
u(a) > 0, if |a| r.
Let
n
=
U
n. Then
n
is a Markov time, and E
0
(
n
) n < .
Therefore, from Dynkins formula,
E
0
__

n
0
Gu(x
t
)dt
_
= E
0
(u(x

u
)) u(0).
For 0 t
n
, x
t
and Gu(x
t
)
0
and
0
E
0
(
n
) 2||u||
Therefore
E
0
() = limE

(
n
)
2||u||

< .

80 3. Multi-dimensional Brownian Motion


Theorem 3. If U is open and bounded, if f is continuous on U and if 98
there exists a classical solution h(a) = h(a : f , U), then
u(a : f , U) = h(a : f , U)
Proof. For any open subset V of U such that

V U, let h denote a C

function which vanishes outside U and such that h


V
= h or

V. Such
a function can easily be constructed. Then h
V
D(G). Since V is
bounded, E
a
(
V
) < and Dynkins formula gives
E
a
__

v
0
Gh
V
(x
t
)dt
_
= E
a
(h
V
(x

v
)) h
V
(a).
For t <
V
, x
t
V and Gh
V
(x
t
) =
1
2
h
V
(x
t
) =
1
2
h(x
t
) = 0. If

V
< , x

v
V so that h
V
(x
v
) = h(x

v
), and since V is bounded,
P
a
(
V
< ) = 1. Therefore E
a
(h(x

v
)) = h
V
(a). Hence, if a

V, then
E
a
(h(x

v
)) = h(a).
Now let {V
n
} be an increasing sequaence of open subsets of U such
that

V
n
U and V
n
U. Then
u
= lim
n

V
n
. Since P
a
(
u
< ) = 1, we
have with P
a
-measure. 1,
f (x

u
= lim
n
h(x

Vn
)
u(a) = E
a
( f (x

u
)) = lim
n
E
a
(h(x

vn
)) = h(a)
for every a U. This completes the proof.
A natural question is When does the classical solution exist? The
simplest case is that of a ball = (a
0
; r). For a , let a

denote the
inverse of a with respect to , i.e.,
a

= a
0
+
r
2
||a a
0
||
2
(a a
0
).
99
Let
G(b, a) =
1
|b a|
k2

r
k2
|a a
0
|
k2
1
|b a

|
k2
, k 3;
3. Stochastic solution of the Dirichlet problem 81
= log
1
|b a|
log
1
|b a

1
log
r
|a a
0
|
, k = 2,
and

(a, ) =
1
k 2

b
G(b, a)
_
b=
xr
k1
for k 3,

(a, ) =

b
G(b, a)
_
b=
r
k1
for 1 k 2,
where

b
denotes the derivative in the radial direction of . Then, if
f is dened and continuous on the boundary of the ball, and if (d) is
the uniform probability distribution (i.e. the normed rotation invariant
measure on the boundary of ), the classical solution is given by the
Poisson integral
h(a : f , U) =
_

(a, ) f ()(d).
The concrete form of

(a, ) is not of importance to us. The only


fact we need is
Theorem 4. If , V are two concentric balls, with radii r, (r > ) ; then
c
1
= min
a

V,

(a, )
and 100
c
2
= max
a

V,

(a, )
depend only on /r and c
1
, c
2
1 as /r 0.
The hitting measure
U
(a, E) of E is dened as

U
(a, E) = P
a
(x

U
E), E B(U).
Clearly
u(a : f , U) =
_

U
(a, d) f ().
We have the following
82 3. Multi-dimensional Brownian Motion
Theorem 5. If is a ball, for a we have

(a, d) =

(a, )(d)
= the harmonic measure on with respect to a.
Proof. The proof is immediate since, from the above, for every contin-
uous function f on ,
_

(a, d) f () =
_

(a, )(d) f (),


and hence the same equation holds for all bounded Borel functions on
.
Using the notation of Theorem 4, we have
Theorem 6.
c
1
(E)

(a, E) c
2
(E).
We now proceed to prove that if the boundary of a bounded open set 101
U is smooth in a certain sense, then the stochastic solution is also the
classical solution.
Denition (). Let U, where U is an open set. If there exists a cone
C U
c
, with vertex at then is called a Poincare point for U.
Theorem 7. If is a Poincare point for a bounded open set U, then
for any > 0 and for any neighbourhood of , there exists a smaller
neighbourhood

of such that
P
a
(x

u
) <
for any a

U.
Proof. Let C U
c
be a cone with vertex at . We can assume that
is a ball of radius r such that C . Let
n
be the ball with the
same centre as and radius r
n
=
n
r, where < 1 is to be chosen
subsequently. Let
n
be the rst leaving time from
n
. If x

U
,


3. Stochastic solution of the Dirichlet problem 83

u
and since P
a
(
n

) = 1 for any a we have x

n
U. Therefore
x

n
C. But x

n

n
so that x

n

n
c. Therefore for any a
n
,
P
a
(x

u
) P
a
(x

n1

n1
C, . . . , x

1

1
C)
= P
a
(x

n1

n1
C, . . . , x

2
C, x

1
(

2
+)
(
+

)
1
C)
since
1
=
2
+
1
(w
+

2
). Also since
i
<
2
, for i > 2 we have 102
x

i
= x(
i
(w), w) = x(
i
(w), w

2
)
= x(
1
(w

2
), w

2
) B

2
B

2+
.

Using the strong Markov property we have


P
a
(x

u
) E
a
(x

n1

n1
C, . . . , x

2
C : P
x

2
(x

1

1
C)
c
2
P
a
(x

n1

n1
C, . . . , x

2

2
C),
if a
n
U. where = (
1
C) < 1. Since depends only on
the solid angle at the vertex , (
1
C) = (
2
C) = =
(
n1
C). We have repeating the argument,
P
a
(x

U
) (c
2
)
n1
Since c
2
1 as 0, we can choose so small that c
2
< 1.
Now choose n large enough so that (c
2
)
n1
<.
Theorem 8. For any open set U and any bounded Borel function f on
U,
u(a) = u(a : f , U)
is harmonic in U.
Proof. Let a U and be a ball with centre at a and contained in U.
Then since
U
=

+
U
(w
+

)
, we have
u(a : f , U) = E
a
( f (x

U
)) = E
a
( f (x

U(w
+

)
(w
+

)))
84 3. Multi-dimensional Brownian Motion
= E
a
(E
x

( f (x

U
))
= E
a
(u(x

))
=
_
P
a
(x

d)u()
=
_

(a, )u()(d),
and the last term is harmonic for a . This proves that u is harmonic 103
in a neighbourhood of every a U. Hence u is harmonic in U.
Theorem 9. If U is a bounded open set such that every point of U is a
Poincare point and if f is continuous on U, then the stochastic solution
u = u(a : f , U) is also the classical solution.
Proof. By Theorem 8, u is harmonic in U. Let U. Since f is
continuous, we can choose a ball = () such that | f () f ()| < for
. By Theorem 7 we can choose

so that
P
a
(x

U
) <, a

.
For a

,
|u(a) f ()| E
a
(| f (x

U
) f ()|)
= E
a
(| f (x

U
) f ()| : x

U
)
+ E
a
(| f (x

U
) f ()| : x

u
)
+2|| f || .

Remark . When k = 1, harmonic functions are linear functions. If 104


(a
1
, a
2
) is an interval and f (a
1
), f (a
2
) are given; then
h(a : f , (a
1
, a
2
)) =
a
2
a
a
2
a
1
f (a
1
) +
a a
1
a
2
a
1
f (a
2
)
= u(a; f , (a
1
, a
2
))
= f (a
1
)P
a
(x

(a
1
,a
2
)
= a
1
) + f (a
2
)P
a
(x

(a
1
,a
2
)
= a
2
)
= f (a
1
)P
a
(
a
1
<
a
2
) + f (a
2
)P
a
(
a
2
<
a
1
),
3. Stochastic solution of the Dirichlet problem 85
where
a
i
, i = 1, 2, is the rst passage time for a
i
, i = 1, 2. Since f is
arbitrary,
P
a
(
a
1
<
a
2
) =
a
2
a
a
2
a
1
,
P
a
(
a
2
<
a
1
) =
a a
1
a
2
a
1
,
We have seen that if U is bounded and open and if every point of
U is a Poincare point, the Dirichlet problem for U has a solution. We
now dene a generalized solution.
Suppose that U is open and bounded and that f is bounded and
continuous on U. Let {U
n
} U be an increasing sequence of open sets
with

U
n
U
n+1
and such that every point of U
n
is a Poincare point.
Let F be a continuous extension of f to

U and F
n
= FU
n
. Denote
the classical solution for U
n
with boundary values F
n
by h(a; F
n
, U
n
). 105
Then lim
n
h(a; F
n
, U
n
) is, by denition, the generalized solution (in the
Wiener sense) of the Dirichlet problem with boundary values f . We
have of course to show that the limit exists and is independent of the
choice of U
r
and of F.
Theorem 10. For a bounded open set U, u(a; f , U) is the generalized
solution.
Proof. We have only to show that h(a : F
n
, U
n
) u(a : f , U). In fact
since
u
n

u
< with probability 1,
h(a : F
n
, U
n
) = u(a : F
n
, U
n
) = E
a
(F(x

un
))
E
a
(F(x

u
))
= E
a
( f (x

U
)) = u(a : f , U).

Remark. u(a) = u(a : f , U) does not always satisfy the boundary con-
dition lim
aU,a
u(a) = f () for U. In 7 we shall discuss these
boundary conditions.
86 3. Multi-dimensional Brownian Motion
4 Recurrence
Denition (). A Markov process Mis called recurrent if
P
a
(x
t
U for some t) P
a
(
U
< ) = 1
for any a S and any open U; otherwise it is called non-recurrent.
We shall now show that the standard Brownian motion is recurrent
for k 2 and is non-recurrent for k 3.
Theorem 1. Let
1
,
2
be the balls with centres a
0
and radii r
1
, r
2
(r
2
> 106
r
1
). If
1
=

1
,
2
=

2
are the rst passage times for
1
and
2
,
then for a
2

1
,
P
a
(
1
<
2
) =
_

_
r
k+2
r
k+2
2
r
k+2
1
r
k+2
2
, k 3;
log
1
r
log
1
r
2
log
1
r
1
log
1
r
2
, k = 2;
r
2
r
r
2
r
1
, k = 1;
where r = |a a
0
|.
Proof. In fact, if U =
2

1
, U =
1

2
, and the function f which
is 1 as
1
and 0 as
2
is continouous on U. Since every point in U
is a Poincar e point, the classical solution h(a; f , U) = u(a; f , U) exists
and
p(a) P
a
(
1
<
2
) = P
a
(x

U

1
) = u(a; f , U).
The function given in the statement of the theorem is harmonic in U
and takes the boundary value f . Since such a function is unique, we get
the result.
Theorem 2. Let = (a
0
, r) be a ball with centre a
0
and radius r and
let

be the rst passage time for . For a and = |a a


0
|, 107
P
a
(

< ) =
_

_
(r/)
k2
, k 3
, k 2
Therefore k-dimensional Brownian motion is recurrent or not ac-
cording as k 2 or k 3.
4. Recurrence 87
Proof. Observe that

for any path whose starting point is not


in . Let

(a
0
, r

) and

. If t <

(w), then since w(t) is


continous, F
t
= {x
s
: 0 s t} is a compact set and hence we can nd
r

such that

F
t
. Then (w) t. It follows that
lim
r

= .
Therefore
P
a
(

< ) = P
a
(

< lim
r

) = lim
r

a
(

<

).
Now take r
2
= r

and
2
=

in Theorem 1 and we get the result.

Theorem 3. If k 3, P
a
(|x
t
| as t ) = 1. If k 2, P
a
(w :
(x
s
, s t, is dense in R
k
for all t)) =1.
Proof. Case k 3. We can, without loss of generality, assume that
a = 0. Let
n
=
(0,n)
and
n
=

n
. For any path w, |x
t
| if and
only if for every given n we can nd s such that the image of [0, ] by 108
w
+
s
is contained in
c
n
. Therefore |x
t
| if and only if we can nd n
such that for every s 0, the image of [0, ] by w
+
s
has a non-empty
intersection with
n
and therefore if w
+
s
(0)
n
, then
n
(w
+
s
) < .
Therefore
P
0
[|x
t
| ] = P
0
[ n such that for every s
0 with w
+
s
(0)
n
,
n
(w
+
s
) < ]

n
P
0
_
for every s 0 with w
+
s
(0)
n
,
n
(w
+
s
) <
_

n
P
0
_
for every m > n,
n
(w
+

m
) <
_
.
Now
P
0
( for every m,
n
(m
+

m
) < ) P
0
(
n
(w
+

m
) < for some m)
= E
0
(P
x
m
(
n
< )
88 3. Multi-dimensional Brownian Motion
=
_
n
m
_
k2
0, as m .
Case k 2. Let be any ball and

= the rst passage time for . We


have
P
a
(

< ) = 1
for every a, so that for any t,
P
a
(

(w
+
t
) < ) = E
a
(P
x
t
(

< )) = 1.
Now 109
P
a
( for every t,

(w
+
t
) < ) = P
a
( for every n,

(w
+
n
) < )
= 1.
Let
1
,
2
, . . . be a complete fundamental system of neighbourhoods.
Then
P
a
( for every n, for every t,

n
(w
+
t
) < ) = 1,
i.e.,
P
a
((x
s
(w) : s t is dense in R
k
)) = 1.

5 Green function
Case k 3.
Denition (). Let U be a bounded open set. Then
G
U
(a, b) =
1
|a b|
k2

_
U

U
(a, d)
| b|
k2
is called the Green function for U, where
U
(a, d) is the harmonic
measure on U with respect to a. This is the potential at b due to a unit
charge at a and the induced charge on U.
5. Green function 89
As the limiting case, when U R
k
, we can dene the Green func-
tion (relative to the whole space R
k
by
G(a, b) =
1
|a b|
k2
.
Theorem 1. If f is bounded, Borel and has compact support, then 110
E
a
(
_

0
f (x
t
)dt) < and
E
a
(
_

0
f (x
t
)dt) =
2
K
_
f (b)db
|b a|
k2
, where K = 4
k
2
/(
k
2
1)
Proof. It is enough to prove the theorem for f 0. We have
E
a
__

0
f (x
t
)dt
_
=
_

0
E
a
( f (x
t
))dt
=
_

0
dt
_
R
k
1
(2t)
k
2
e

|ba|
2
2t
f (b)db
=
_
R
k
f (b)db
_

0
1
(2t)
k
2
e

|ba|
2
2t
dt
=
_
R
k
f (b)db
|b a|
k2
(k/2 1)
2
k
2
=
2
K
_
R
k
f (b)db
|b a|
k2
< ,
because, if is a ball containing the support of f ,
_

f (b)db
(b a)
k2
|| f ||
_

db
(b a)
k2
< .

Theorem 2. Let v(a) = E


a
(
_

0
f (x
t
)dt). Then v(a) D(G), 111
1
2
v = f a.e., and v(a) 0 as |a|
90 3. Multi-dimensional Brownian Motion
Therefore, if
u(a) =
_
G(a, b) f (b)db,
u = k f a.e. ( Poissons equation )
and u(a) 0 as /a/ .
Proof. By Theorem 1, v(a) is bounded and Borel. If
G

f (a) = E
a
__

0
e
t
f (x
t
)dt
_
,
we have
v(a) = lim
0
G

f (a)
and the resolvent equation gives
G

f G

f + ( )G

f = 0.
Letting 0,
G

f v + G

v = 0,
or
v = G

( f + v) D(G).
Also, since Gv = v G
1

v = v f v = f , a.e.,
1
2
v = f a.e.

Denition (). Let A be a bounded subset of R


k
. Then 112
S (A, w) = the Lebesgue measure of {t : x
t
(w) A}
is called the sojourn (visiting) time for the set A.
From Theorem 1, we have
5. Green function 91
Theorem 3.
E
a
(S (db))
db
=
2
K
G(a, b).
Let now U be a bounded open set and f B(U). Let
v
U
(a) = v
U
(a; f , U) = E
a
__

U
0
f (x
t
)dt
_
,

U
being the rst leaving time from U.
Theorem 4.
v
U
(a) =
2
K
_
U
G
U
(a, b) f (b)db.
Proof. Extend f by putting f = 0 in U
c
. Then
v
0
(a) = E
a
__

0
f (x
t
)dt
_
=
2
K
_
U
f (b)db
|b a|
k2
,
by Theorem 1. Also
v
0
(a) = E
a
__

U
0
f (x
t
)dt
_
+ E
a
__

U
f (x
t
)dt
_
= v
U
(a) + E
a
__

0
f (x
t
(w
+
U
))dt
_
= v
U
(a) + E
a
_
Ex

U
__

0
f (x
t
)dt
__
= v
U
(a) + E
a
(v
0
(x

U
)).
= v
U
(a) +
_
U

U
(a, d)v
0
()
= v
U
(a) +
2
K
_
f (b)db
_
U

U
(a, d)
|b |
k2
113
This gives the result.
Theorem 5. v
U
(a) satises
1
2
v
U
= f , a.e.,
and v
U
(a) 0 as a , being a regular point of U.
92 3. Multi-dimensional Brownian Motion
Proof. v
U
(a) = v
0
(a) E
a
(v
0
(x

U
)). Since E
a
(v
0
(x

U
)) is harmonic in U
and
1
2
v
0
(a) = f a.e., we have
1
2
v
U
(a) = f , a.e.

Further if U is regular, E
a
(v
0
(x

U
)) v
0
() as a and since
v
0
(a) is continuous by Theorem 1 v
0
(a) v
0
() as a . The result
follows.
Theorem 6. Let S (A/U, w) = the Lebesgue measure of {t : x
t
A, t <

U
}. Then
E
a
(S (db/U))
db
=
2
K
G
U
(a, b).
As an example we compute v
U
(a) for U = the open cube (0, 1)
3
, k =
3. Since every boundary point of the unit cube is regular (in fact every
point is a Poincar e point), v
U
= 0 as U. Therefore v = v
U
(a) is the
solution of
1
2
v = f and v = 0 on U.
114
Since v = 0 as U we can put
v(x, y, z) =

l+m+n>0
a
lmn
sin l x sin my sin nz
Then
1
2

v
=

2
2

l+m+n>0
(1
2
+ m
2
+ n
2
)a
lmn
sin lx sin my sin nz
If
f (x, y, z) =

b
lmn
sin lx sin my sin nz,
we have therefore
a
lmn
=
2b
lmn

2
(l
2
+ m
2
+ n
2
)
5. Green function 93
=
16

2
(l
2
+ m
2
+ n
2
)
_
1
0
_
1
0
_
1
0
f (, , ) sin l sin m in nddd
This gives
v(x, y, z) =

f (, , )
16

sin l sin lx sin m sin my sin n sin nz


l
2
+ m
2
+ n
2
ddd.
Hence
G
U
(x, y, z; , , ) =
32

sin l sin lx sin m sin my sin n sin nz


1
2
+ m
2
+ n
2
in the distribution sense. 115
Case k 2.
We cannot apply the preceding method to discuss the Green function
for k 2 because E
a
(
_

0
f (x
t
)dt) may be innite even if f has compact
support. We therefore follow a dierent method.
Let = (o, r) be a ball. If u C

(R
2
), [i.e. compact support and
C

] then u D(G) and Dynkins formula gives


E
a
_

_
0
_

_
1
2
u(x
t
)dt) = E
a
(u(x

)) u(a)
=
_

(a, )u()(d) u(a),

(a, ) =
r
2
a
2
|a |
2
,
=
_

G(a, b)
n
_
b=
xru()(d) u(a), G

(a, b) = log
|a

b r
2
|
|a b|
=
_

1
2
G

(a, b)
1
2
u(b)2db.
94 3. Multi-dimensional Brownian Motion
If C

(R
2
) and v(a) =
1

(a, b)(b)db, then


1
2
v = There-
fore we have for any C

(R
2
),
E
a
_

_
0
(x
t
)dt
_

_
=
1

(a, b)(b)db.
It follows that the same equation holds for any f B(R
2
), i.e.,
E
a
_

_
0
f (x
t
)dt
_

_
=
1

G(a, b) f (b)db.
Now let U be a bounded domain,

U , a ball. Then
E
a
_

_
0
f (x
t
)dt
_

_
= E
a
_

U
_
0
f (x
t
)dt
_

_
+ E
a
_

_
0
(w
+

U
) f (x
0
(w
+

U
))dt
_

_
= E
a
_

U
_
0
f (x
t
)dt
_

_
+ E
a
_
E
x
U
__

0
f (x
t
)dt
__
,
so that 116
E
a
_

U
_
0
f (x
t
)dt
_

_
=
1

(a, b) f (b)db
_
U

U
(a, d

)E

_
0
f (x
t
)dt
_

_
=
1

(a, b) f (b)db
1

_
U

U
(a, d)
_

(, b) f (b)db
=
1

G
U
(a, b) f (b)db,
where
G
U
(a, b) = G

(a, b)
_
U

U
(a, d

)G

(, b)
= log
1
|a b|

_
U

U
(a, d

) log
1
| b|
5. Green function 95
log
1
|a

b r
2
|
+
_
U

U
(a, d

) log
1
|

b r
2
|
Since b U

U , |a

b| < r
2
for a U and log |a

b r
2
| is
harmonic for a U, with boundary values log |

b r
2
|. Hence if every
point of U is regular,
log |a

b r
2
| =
_
U

U
(a, d) log |

b r
2
|.
Thus we have
G
U
(a, b) = log
1
|a b|

_
U

U
(a, d) log
1
| b|
.
Theorem 1. If U is a bounded open set such that every point of U is 117
regular and if u(a = E
a
(

U
_
0
f (x
t
)dt), then
1
2
u = f and u(a) 0 as a U.
Proof. In fact
u(a) = E
a
__

U
0
f (x
t
)dt
_
=
1

_
U
G
U
(a, b) f (b)db
and the theorem follows from the denition of G
U
(a, b).
Theorem 2.
E
a
(S (db/U))
db
=
1

G
U
(a, b).
If k = 1, we can proceed directly. Suppose that U = (, ).
Then
E
a
_

,
_
0
1
2
u

(x
t
)dt
_

_
= E
a
(u(x
(,)
)) u(a)
96 3. Multi-dimensional Brownian Motion
=
a

u() +
a

u() u(a)
=

G
(,)
(a, b)
1
2
u

(b)2db,
where
G
(,)
(x, y) = G
(,)
(y, x) =
( y)(x )

, x y .
Threfore we have
Theorem 3.
E
a
_

(,)
_
0
f (x
t
)dt
_

_
=
_

G
(,)
(a, b) f (b)2db
Theorem 4.
E
a
(S (db/(, ))
db
= 2G
(,)
(a, b).
6 Hitting probability
118
We have already discussed the hitting probability for spheres. Here we
shall discuss it for more general sets, especially compact sets.
Absolute hitting probability (k 3).
For simplicity we consider the case k = 3.
Let F be a compact set and
F
= inf{t : t > 0 and x
t
F}. Put
p
F
(a) = P
a
(
F
< ) = P
a
(x
t
F for some t > 0); p
F
(a) is called the
absolute hitting probability for a F with respect to a.
Lemma (). Let = (a, r) and
r
=

= the rst leaving time for .


Then P
a
(
r
0 as r 0) = 1.
Proof. Clearly
r
decreases as r decreases. If = lim
r0

r
, we have only
to show that P
a
( > 0) = 0. Now P
a
( > t) P
a
(
r
> t) P
a
(x
t
) =
(2t)
3/2
_

exp((x a)
2 1
2t
)dx 0 as r 0.
6. Hitting probability 97
Theorem 1. P
F
(a) is expressible as a potential induced by a bounded
measure
F
i.e. P
F
(a) =
_

F
(db)
|b a|
, where
F
is concentrated on F.
Further
F
is uniquely determined by F.
Proof. Firstly we show that p
F
(a) is harmonic in F
c
F
0
. Since p
F
(a)
1 in F
0
, we have only to show that it is harmonic in F
c
. Let be ball
such that

F
c
. If

is the rst leaving time for ,

<
F
and
p
F
(a) = P
a
(

<
F
< ) = P
a
(
F
(w
+

) < ) = E
a
(P
x

(
F
< )) =
E
a
(p
F
(x

)) =
_

(a, d

)p
F
() =
_

(a, )p
F
()(d), showing that
p
F
(a) is harmonic for a .
Let be a ball and a . Then 119
p
F
(a) = P
a
(
F
< ) p
a
(
F
(w
+

) < ) =
_

(a, )p
F
()(d).
This show that p
F
(a) is super harmonic in the wide sense (i.e. its
value at the centre of a ball in not less than the average value on the
boundary).
Finally we show that p
F
(a) is lower semi-continuous. It is enough
to show this for a F. Let a
0
F, and (a
0
, r) =
r
,
r
=

r
. Then
p
F
(a
0
) = P
a
0
(x
t
F for some t >
r
) +
r
,
r
0 (from the lemma)
=
_

r
p
F
()(d) +
r
. On the other hand
p
F
(a)
_

(a, )p
F
()(d)
so that
lim
aa
0
p
F
(a)
_

r
lim
aa
0

(a, )p
F
()(d) =
_

r
p
F
()
(d)

= p
F
(a
0
)
r
.
Now letting r 0, lim
aa
0
p
F
(a) p
F
(a
0
), showing that p
F
(a) is
lower semi-continuous. Now if is a ball containing F,

, the rst
passage time for , then we have seen that P
a
(

< ) = r
1
, = |a|.
Therefore P
a
(

< ) 0 as a and since P


a
(
F
< )
P
a
(

< ), P
a
(
F
< ) 0 as a . Since p
F
(a) is super
98 3. Multi-dimensional Brownian Motion
harmonic in R
3
, from the Reisz representation theorem there exists a
unique bounded measure
F
with p
F
(a) =
_

F
(db)
|b a|
+H(a) where H(a)
is harmonic in R
3
. But p
F
(a) 0|a| and also
_

F
(db)
|ba|
0 as 120
|a| since
F
is a bounded measure. It follows that H(a) 0 as
|a| i.e. H(a) 0. Therefore p
F
(a) =
_

F
(db)
|b a|
. Since
F
is
concentrated in the set where p
F
is not harmonic,
F
is concentrated in
F. This proves the theorem completely.
Theorem 2. If u(a) is any potential induced by a measure which is
concentrated in F and if u(a) 1, then
u(a) p
F
(a) and (F)
F
(F)(=
F
(F)).
Proof. We have u(a) =
_
F
(db)
|a b|
. Since F is compact, for xed a we
can nd n such that |a b| n. It follows that (F) < and therefore
u(a) is harmonic in F
c
. Let G
n
F
c
be a sequence of bounded open sets
such that

G
n
G
n+1
. Let
n
=
G
n
= the rst leaving time from G
n
. If
we put f = u/G
n
then u is the classical solution with boundary values
f . Therefore for every a G
n
u(a) = E
a
( f (x

n
))
= E
a
(u(x

n
)) E
a
(u(x

n
) :
F
= ) + E
a
(u(x

n
) :
F
< ).

Now
n

F
. If
F
= ,
n
and x

n
with probability 1;
and by the formula for u, u(x

n
) 0. Since u(a) 1 we have therefore
u(a) E
a
(
F
< ) = p
F
(a).
If a F
0
, p
F
(a) = 1 and u(a) 1 = p
F
(a). 121
Let now a F and = (a, r) and
r
the rst leaving time for .
Then
p
F
(a) P
a
(x
t
F for some t
r
)
6. Hitting probability 99
= P
a
(x
t
(w
+

r
) F for some t 0)
= E
a
[P
x
r
(x
t
F for some t 0)]
= E
a
P
x
r
(x
t
F for some t 0) : x

r
F
c
+ E
a
P
x
r
(x
t
F for some t 0) : x

r
F)
E
a
[x

r
F
c
: u(x

r
)] + E
a
[x

r
F : 1] E
a
[u(x

r
)]
since P
a
(x
t
F for some t 0) = 1 for a F. Letting r 0 we get
p
F
(a) lim
r0
E
a
(u(x

r
)) E
a
(lim
r0
u(x

r
)) u(a)
since u(a) is lower semi-continuous. It remains to prove that (F)

F
(F).
Let E be a compact set with E E
0
F and consider p
E
(a). Then
p
E
(a) =
_

E
(db)
|a b|
and p
E
(a) = 1 for a E
0
F. Since
_

F
(db)
|a b|

_
(db)
|b a|
we have


F
(db)
|a b|

E
(da)

(db)
|a b|

E
(da)
i.e., 122
_
F

F
(db)
_
F
(db).
An alternative proof of the last fact is the following. Since p
F
(a)
u(a)
_
F
|a|

F
(db)
|b a|

_
F
|a|
(db)
|b a|
Letting a we get the result.
From the above theorem we have
Theorem 3. C(F) =
F
(F) is the maximal total charge for those
charge distributions which induce potentials 1.
100 3. Multi-dimensional Brownian Motion
Theorem 4 (Kakutani). C(F) > 0 if and only if p
F
(a) > 0 i.e.
P
a
(x
t
F for some t > 0) > 0.
C(F) is called the capacity of F.
Theorem 5.
C(F)
max
bF
|a b|
p
F
(a)
C(F)
min
bF
|b a|
and C(F) = lim
|a|
|a| p
F
(a).
We shall now prove the subadditivity of p
F
(a) and C(F) following
Hunt. This means that p
F
(a) and C(F) are both strong capacities in the
sense of Choquet.
Theorem 6. p
F
(a) and C(F) are subadditive in the following sense.
(F
1
F
n
)
i
_
(F
i
)
i
_
j
(F
i
F
j
)+
_
i<j<k
(F
i
F
j
F
k
)
where (F) denotes either of p
F
(a) and C(F).
Proof. Put F

= {w :
F
(w) < }. Then (F
1
F
n
)

= F

1
123
F

n
, (F
1
F
n
)

1
F

n
and p
F
(a) = P
a
(F

). Using the dual


inclusion - exclusion formula of Hunt, we have
p
F
1
F
n
(a) = P
a
[(F
1
F
n
)

] P
a
[(F

1
F

n
)]
=

i
P
a
(F

i
)

i<j
P
a
(F

i
F

j
) +
=

i
P
F
i
(a)

i<j
p
F
i
F
j
(a) +

i<j<k
p
F
i
F
j
F
k
(a)

Multiplying by |a| both sides and letting |a| we get the said
inequality for C(F).
Hitting probability for open sets.
Let U be a bounded open set and dene
U
and P
U
(a) as in the
case of compact sets F. Then p
U
(a) is harmonic outside U and super
harmonic in the whole space. Therefore p
U
(a) =
_
U

U
(db)
|a b|
in (U)
c
,
6. Hitting probability 101
and
U
(U) = lim
|a|
|a| p
U
(a). Let F
n
U be compact subsets of U.
Then C(F
n
) = lim
|a|
|a| p
F
n
(a) and the convergence is uniform in n since
F
n
are contained in a bounded set. Also since
P
U
(a) = P
a
(
U
< ) = lim
n
P
a
(
F
n
< ) = lim p
F
n
(a)
we have

U
(U) = lim
n
C(F
n
).
Therefore
U
U) is the supremum of capacities of compact sets 124
contained in U; it is the capacity C(U) of U by denition. Again
p
U
(a)
C(U)
min
bU
|b a|
.
Remark. The capacity of any set is dened as follows. We have already
dened the notion of capacity for compact sets. The capacity of any
open set is by denition the supermum of the capacities of compact sets
contained in it. The outer capacity of a set is the inmum of the capaci-
ties of open sets containing it, while the inner capacity is the supremum
of the capacities of compact sets contained in it. If both are equal the
set is called capacitable and the outer (or inner) capacity is called the
capacity of the set. Choquet has proved that every Borel (even analytic)
set is capacitable.
Relative hitting probability (k 1).
Let F be a compact set contained in a bounded open set U and put

F/U
= inf{t :
U
> t 0 and x
t
F} where
U
is the rst leaving
time from U. Let p
F/U
(a) = P
a
(
F/U
< ) = P
a
{ for some t > 0x
t
reaches F before it leaves U. p
F/U
(a) is called the (relative) hitting
probability for F with respect to a and relative to U. Using the same
idea as before we can prove
Theorem 1

. p
F/U
(a) is expressible as a potential induced by a bounded
measure
F/U
with the Green function G
U
(a, b), i.e.
p
F/U
(a) =
_
G
U
(a, b)
F/U
(db), a U,
102 3. Multi-dimensional Brownian Motion
where
F/U
is concentrated on F. Further
F/U
is uniquely determined 125
by F.
We can dene the relative capacity C
U
(F) of F as
F/U
(F) and
carry out similar discussions.
Remark on absolute hitting probability (k 2).
In case k = 1, p
F
(a) 0 or 1 according as F or = .
In case k = 2, we contend that p
F
(a) 1 or 0 according as C
U
(F) >
0 or = 0, where U is a bounded open set containing F. To prove this
let V be another bounded open set such that F V

V U. Let

1
(w) =
U
(w) +
V
(w
+
U
),
2
(w) =
1
(w) +
1
(w
+

1
),
3
(w) =
2
(w) +

1
(w
+

2
), . . . ,
n
(w) =
n1
(w)+
1
(w
+
n1
), etc. Then evidently x

n
V
and
n
; for let,

n
(w) =
n1
(w) +
U
(w
+

n1
). Then
n1

n
, and x

n
V, x

n
U so that if
n
, x

V U = which
is a contradiction. Hence
n
with P
a
-probability 1. If C
U
(F) = 0,
then p
F/U
(x

n
) = 0. Now
P
a
(x
t
F,
n
< t
n+1
= P
a
(
F
(w
+

n
) <
U
(w
+

n
))
= E
a
(P
x
n
(
F
(w) <
U
(w))) = E
a
(p
F/U
(x

n
)) = 0.
Hence P
F
(a) = P
a
(x
t
F for some t > 0)
_
P
a
(x
t
F,
n
< t

n+1
) = 0. Now
1 p
F
(a) P
a
(x
t
F, o < t <
n
P
a
(
F
(w
+

r
) >
U
(w

r
)( r n)
126
The set
_

F
(w
+

r
)
_
>
U
(w
+

r
), 1 r n 1) is B

n
+
-meansurable.
For
(
F
(w
+

r
) <
U
(w
+

r
)) = (
F
[w

+1
)
+

r
(w

r
1
)
] >
U
[(w

r+1
)
+

r
(w

r+1
)
])
Hence (
F
(w
+

r
) <
U
(w
+

r
)) B

r+1
B

n
, for r +1 = n. [Note that
if
1
,
2
are two Markov times and
1
<
2
then B

1
B

2
]. Hance
by strong Markov property
7. Regular points (k 3) 103
P
a
(
F
(w
+

r
) >
U
(w
+

r
), 1 r n)
= E
a
[p
x
n
(
F
>
F
) :
F
(w
+

r
) >
U
(w
+

r
), 1 r n 1]
If C
U
(F) > 0, since p
F/U
(a) is continuous on V and always > 0 it
has a minimum > 0. Then
P
a
(
F
(w
+

r
) >
U
(w
+

r
), 1 r n) (1 )P
a
(
F
(w
+

r
)
>
U
(w
+

r
), 1 r n 1) . . . (1 )
n
C.
This proves our contertion.
7 Regular points (k 3)
In order to decide whether the garalied solution (the stochastic solution)
u(a) = u(a : f , v) satises the boundary conditions
lim
aU,
a
u(a) = f (), U
we introduce the notion of regularity of boundary points. 127
Let U be an open set and U. Let

U
= inf{t : t > 0 and x
t
U},
and consider the event

U
= 0. This clearly belongs to B
o
+ and Blumen-
thal 0 1 law gives P

U
= 0) = 1 or 0. If it is 1, is called a regular
point for U; if it is zero it is called irregular for U. Regularity is a local
property. In fact, if is regular for U, is regular for U for any
open neighbourhood of and vice versa. We state here two important
criteria for regularity.
Theorem 1. Let U.
(a) is regular for U if and only if lim
aU,a
P
a
(x(

U
) U ) = 1.
(b) is irregular for U if and only if lim

lim
a
P
a
(x(

U
) U ) = 0.
104 3. Multi-dimensional Brownian Motion
Theorem 2 (Winers test). If U and
F
n
= (b : 2
(n+1)(k2)
|b | 2
n(k2)
, b U
c
)
is regular or irregular according as
_
n
2
n(k2)
C(F
n
) = or < .
We can prove the above two theorems using the same idea we used
for the proof of Poincares test.
The following theorem, an immediate corollary of Theorem 1 gives 128
the boundary values of the stochastic solution.
Theorem 3. If U is a bounded open set, if is regular for U and if f is
bounded Borel on U and continuous at , then
lim
aU,a
u(a : f , U) = f ().
On the other hand if is irregular for U, then there exists a contin-
uous funtion f on U such that the above equality is not true.
The following thorem, which we shall state whithout proof, shows
that the set of irregular points is very small compared with the rest.
Theorem 4. Let U be a bounded open set. Then the set of irregular
points has capacity zero.
Using Theorem 3 and 4 we prove the following
Theorem 5. If U is a bounded open set and if f is continuous on U the
stochastic solution u(a) = u(a : f , U) is the unique bounded harmonic
function dened in U such that
lim
aU,a
u(a) = f (), U
except for a - set of capacity zero.
Proof. It follows at once from Theorems 3 and 4 that the stochastic solution is a bounded harmonic function with boundary values $f$ at regular points. Conversely, let $v$ be any bounded harmonic function with the boundary values $f$ up to capacity zero. Let $N$ be the set of all points $\xi \in \partial U$ such that $v(a) \not\to f(\xi)$. Then $C(N) = 0$ by assumption. Therefore there exists a decreasing sequence of open sets $G_m \supset N$ such that $\bar G_{m+1} \subset G_m$ and $C(G_m) \to 0$. Since $N$ is bounded we can assume that the $G_m$ are also bounded, and since $N \subset \partial U$ we can assume that $\bigcap_m G_m \subset \partial U$. Let $a \in U$. Then $\rho(a, G_m) = \inf_{b \in G_m} \rho(a, b) >$ some positive constant for large $m$. Therefore
$$P_a(x_{\sigma_U} \in N) \le P_a\Big(x_{\sigma_U} \in \bigcap_m G_m\Big) \le P_a[\sigma_{G_m} < \infty] \le \frac{C(G_m)}{(\rho(a, G_m))^{k-2}} \to 0,$$
so that $P_a(x_{\sigma_U} \notin N) = 1$. Let now $U_n$ be open sets, $U_n \uparrow U$, such that $\bar U_n \subset U$ and every boundary point of $U_n$ is a Poincaré point for $U_n$. Then $v(a) = E_a(v(x_{\sigma_{U_n}}))$, $a \in U_n$, so that
$$v(a) = \lim_n E_a\big(v(x_{\sigma_{U_n}})\big) = E_a\big(\lim_n v(x_{\sigma_{U_n}})\big) = E_a\big(\lim_n v(x_{\sigma_{U_n}}) : \lim_n x_{\sigma_{U_n}} = x_{\sigma_U} \notin N\big)$$
$$= E_a\big(f(x_{\sigma_U}) : x_{\sigma_U} \notin N\big) = E_a\big(f(x_{\sigma_U})\big) = u(a; f, U).$$
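As a numerical illustration of the stochastic solution $u(a) = E_a(f(x_{\sigma_U}))$, the following sketch estimates it by Monte Carlo for two-dimensional Brownian motion on the unit disc, where the harmonic extension of the chosen boundary data is known in closed form. The time step, sample size and function names are illustrative assumptions of mine, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_solution(a, f, in_U, dt=1e-3, n_paths=4000):
    """Monte Carlo estimate of u(a) = E_a[ f(x_{sigma_U}) ] for 2-d Brownian motion,
    where sigma_U is the exit time from the open set U (membership test in_U)."""
    total = 0.0
    for _ in range(n_paths):
        x = np.array(a, dtype=float)
        while in_U(x):
            x = x + np.sqrt(dt) * rng.standard_normal(2)
        total += f(x)            # x approximates the exit position x_{sigma_U}
    return total / n_paths

# Example: U = unit disc, boundary data f(xi) = xi_1^2 - xi_2^2 (already harmonic),
# so the exact solution is u(a) = a_1^2 - a_2^2.
in_disc = lambda x: x @ x < 1.0
f = lambda x: x[0] ** 2 - x[1] ** 2
a = (0.3, 0.4)
print(stochastic_solution(a, f, in_disc), a[0] ** 2 - a[1] ** 2)
```

The two printed values should nearly agree, up to discretisation and sampling error.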
8 Plane measure of a two dimensional Brownian motion curve

We have seen in Theorem 3 of §4 that the two-dimensional Brownian motion is dense in the plane. We now prove the following interesting theorem due to Paul Lévy.

Theorem 1 (P. Lévy). The two-dimensional Lebesgue measure of a two-dimensional Brownian motion curve is zero with probability 1, i.e. if $C(w) = \{x_s : 0 \le s < \infty\}$ and $|C|$ = the Lebesgue measure of $C(w)$, then
$$P_a(|C| = 0) = 1.$$

We first prove the following lemma.

Lemma. Let $S$ be a Hausdorff space with the second countability axiom and $W$ a class of continuous functions of $[0, t]$ into $S$. Let $B$ be the Borel algebra generated by the class of all sets of the form $\{w : w \in W \text{ and } w(s) \in E\}$ where $0 \le s \le t$ and $E \in B(S)$, $B(S)$ being the class of Borel subsets of $S$ (i.e. the Borel algebra generated by the open sets of $S$). Let $C(w) = \{w(s) : 0 \le s \le t\}$. Then the function defined by
$$f(a, w) = 1 \text{ if } a \in C(w), \qquad f(a, w) = 0 \text{ if } a \notin C(w),$$
is $B(S) \times B$-measurable in the pair $(a, w)$.

Proof. It is clearly enough to prove that
$$\{(a, w) : a \in C(w)\} \in B(S) \times B.$$
For any open set $U \subset S$ we have
$$\{w : C(w) \subset U^c\} = \bigcap_{r \le t,\ r \text{ rational}} \{w : w(r) \in U^c\},$$
so that $\{w : C(w) \subset U^c\} \in B$. Let now $\{U_n\}$ be a countable base for $S$. Then it is not difficult to see that
$$\{(a, w) : a \notin C(w)\} = \bigcup_{n=1}^\infty \Big[U_n \times \{w : C(w) \subset U_n^c\}\Big],$$
using the fact that $C(w)$, being the continuous image of $[0, t]$, is closed. Q.E.D.
Proof of Theorem. To prove the theorem it is enough to consider two-dimensional Brownian motion curves starting at zero, i.e. a two-dimensional Wiener process. Let $x_t(w)$ be a two-dimensional Wiener process on $(\Omega, B, P)$. It is enough to show that $E(|c_t|) = 0$, where $c_t = \{x_s : 0 \le s \le t\}$ and $|c_t|$ = the two-dimensional Lebesgue measure of $c_t$. From the lemma, the function $\chi(a, c_t)$ defined as
$$\chi(a, c_t) = 1 \text{ if } a \in c_t, \qquad \chi(a, c_t) = 0 \text{ if } a \notin c_t,$$
is measurable in the pair $(a, w)$. Since $|c_t| = \int_{R^2} \chi(a, c_t)\,da$, $|c_t|$ is measurable in $w$. Consider the following four processes:

1. $x_s(w)$, $0 \le s \le t$;
2. $y_s(w) = x_{s+t}(w) - x_t(w)$, $0 \le s \le t$;
3. $z_s(w) = x_{t-s}(w) - x_t(w)$, $0 \le s \le t$;
4. $u_s(w) = x_{2s}(w)/\sqrt 2$, $0 \le s \le t$.

All four processes are continuous processes, i.e. processes whose sample functions are continuous. Let
$$c^x_t = \{x_s : 0 \le s \le t\},$$
with similar meanings for $c^u_t$, $c^y_t$ and $c^z_t$. Now the form of the Gaussian distribution shows that all the above four processes have the same joint distributions at any finite system of points. It follows that the distributions induced on $[R^2]^{[0,t]}$ by the above processes are the same. Also $\chi(a, c^x_t) = f(a, x)$ where $f$ is the function in the lemma and $x$ denotes the path. Thus we have
$$E(\chi(a, c^x_t)) = E(\chi(a, c^y_t)) = E(\chi(a, c^z_t)) = E(\chi(a, c^u_t)).$$
Hence
$$E(|c^x_t|) = \int_{R^2} E(\chi(a, c^x_t))\,da = \int_{R^2} E(\chi(a, c^u_t))\,da = E(|c^u_t|).$$
We have
$$c^x_{2t} = \{x_s : 0 \le s \le 2t\} = c^x_t \cup [c^y_t + x_t], \qquad [c^x_t - x_t] \cup c^y_t = c^z_t \cup c^y_t,$$
where the two unions are congruent under translation. Therefore $|c^x_{2t}| + |c^y_t \cap c^z_t| = |c^z_t| + |c^y_t|$, and
$$E(|c^x_{2t}|) + E(|c^y_t \cap c^z_t|) = E(|c^z_t|) + E(|c^y_t|) = 2E(|c^x_t|).$$
Also
$$E(|c^x_{2t}|) = E(|\sqrt 2\, c^u_t|) = E(2|c^u_t|) = 2E(|c^u_t|) = 2E(|c^x_t|).$$
Therefore $E(|c^y_t \cap c^z_t|) = 0$, i.e.
$$\int_{R^2} E\big(\chi(a, c^y_t)\,\chi(a, c^z_t)\big)\,da = 0.$$
Since the processes $y$ and $z$ are easily seen to be independent,
$$E\big(\chi(a, c^y_t)\,\chi(a, c^z_t)\big) = E(\chi(a, c^y_t))\,E(\chi(a, c^z_t)) = [E(\chi(a, c^x_t))]^2.$$
Therefore $\int [E(\chi(a, c^x_t))]^2\,da = 0$, giving $E(\chi(a, c^x_t)) = 0$ for almost all $a$. Hence $\int E(\chi(a, c^x_t))\,da = 0$, i.e. $E(|c^x_t|) = 0$. This proves the theorem.
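The theorem can be seen qualitatively in simulation: if a sampled two-dimensional Wiener path is covered by grid squares of side $h$, the total covered area decreases (slowly) as $h$ decreases. This is only a heuristic sketch under my own discretisation choices; it is not a proof and the decrease is logarithmically slow.

```python
import numpy as np

rng = np.random.default_rng(1)

# One sample path of a two-dimensional Wiener process on [0, 1].
n_steps = 200_000
path = np.cumsum(np.sqrt(1.0 / n_steps) * rng.standard_normal((n_steps, 2)), axis=0)

# Area of the union of grid squares of side h that the (discretised) curve visits.
for h in [0.1, 0.05, 0.02, 0.01]:
    cells = {tuple(c) for c in np.floor(path / h).astype(int)}
    print(f"h = {h:5.2f}   covered area ~ {len(cells) * h * h:.4f}")
```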
Section 4
Additive Processes

1 Definitions

Let $x = (x_t,\ 0 \le t < a)$ be a stochastic process on a probability space $(\Omega, P)$. If $I = (t_1, t_2]$, the increment of $x$ in $I$ is by definition the random variable $x(I) = x_{t_2} - x_{t_1}$.

Definition. A process $x = (x_t)$ with $x_0 \equiv 0$ is called an additive (or differential) process if for every finite disjoint system $I_1, \dots, I_n$ of intervals, $x(I_1), \dots, x(I_n)$ are independent.

We shall only consider additive processes $x$ for which $E(x_t^2) < \infty$ for all $t$. In this case $E(x_t) = m(t)$ exists and is called the first moment of $x_t$; $E((x_t - m(t))^2)$ is called the variance of $x_t$ and is denoted by $V(x_t)$ or $v(t)$. If $y_t = x_t - m(t)$, then $y = (y_t)$ is also additive.

Definition. A process $x = (x_t)$ is said to be continuous in probability at $t_0$, or said to have no fixed discontinuity at $t_0$, if for every $\varepsilon > 0$,
$$\lim_{t \to t_0} P[|x_t - x_{t_0}| > \varepsilon] = 0.$$
If it is continuous in probability at every point $t$ it is said to be continuous in probability.

The following theorem is due to Doob.

Theorem 1. If an additive process $(x_t)$ has no fixed discontinuity then there exists a process $(y_t)$ such that

(1) $P(x_t = y_t) = 1$ for all $t$;

(2) almost all sample functions of $(y_t)$ are $d_1$.

If further $(y_t)$, $(y'_t)$ are two such processes, then
$$P(\text{for every } t,\ y_t = y'_t) = 1.$$
$y = (y_t)$ is called the standard modification of $(x_t)$. The proof can be seen in Doob's "Stochastic Processes".

Definition. An additive process $(x_t)$ with no point of fixed discontinuity and whose sample paths are $d_1$ with probability 1 is called a Lévy process.

It can be seen easily that Wiener processes and Poisson processes are particular cases of Lévy processes.

Definition. A process $(x_t)$ is called temporally homogeneous if the probability distribution of $x_s - x_t$ ($s > t$) depends only on $s - t$.

The above theorem of Doob shows that it is enough to study Lévy processes in order to study additive processes with no point of fixed discontinuity.
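As a simple illustration of these definitions, the sketch below builds a Lévy process as the sum of a Wiener process and an independent Poisson process and checks numerically that its increments over disjoint intervals are uncorrelated, as independence requires. The rate, grid and sample sizes are illustrative assumptions of mine.

```python
import numpy as np

rng = np.random.default_rng(2)

def levy_path(T=2.0, n=2000, lam=3.0):
    """One sample path x_t = B_t + N_t on a grid: a Wiener process plus an
    independent Poisson process of rate lam (a simple Levy process, x_0 = 0)."""
    dt = T / n
    dB = np.sqrt(dt) * rng.standard_normal(n)
    dN = rng.poisson(lam * dt, n)
    return np.concatenate([[0.0], np.cumsum(dB + dN)])

# Increments over the disjoint intervals (0, 1] and (1, 2] should be independent;
# here we only check that they are uncorrelated over many sample paths.
paths = np.array([levy_path() for _ in range(5000)])
inc1 = paths[:, 1000] - paths[:, 0]
inc2 = paths[:, 2000] - paths[:, 1000]
print(np.corrcoef(inc1, inc2)[0, 1])   # close to 0
```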
2 Gaussian additive processes and Poisson additive processes

The following two theorems give two elementary types of Lévy processes.

Definition. An additive process $(x_t)$ for which almost all sample paths are continuous is called a Gaussian additive process. If for almost all $w$ the sample functions are step functions increasing with jump 1, the process is called a Poisson additive process.

We prove the following two theorems, which justify the above nomenclature.

Theorem 1. Let $(x_t)$ be a Lévy process. If $x_t(w)$ is continuous in $t$ for almost all $w$, then $x(I)$ is a Gaussian variable.

The condition that $x_t$ is continuous for almost all $w$ is sometimes referred to as "$(x_t)$ has no moving discontinuity", in contrast with "$(x_t)$ has no fixed discontinuity".

Proof. Let $I = (t_0, t_1]$. Since almost all sample functions are continuous, for any $\varepsilon > 0$ there exists a $\delta(\varepsilon) > 0$ such that
$$P\big(\text{for all } t, s \in I,\ |t - s| < \delta \Rightarrow |x_t - x_s| < \varepsilon\big) > 1 - \varepsilon.$$
Noting this, let for each $n$
$$t_0 = t_{n0} < t_{n1} < \dots < t_{np_n} = t_1$$
be a subdivision of $(t_0, t_1]$ with $0 < t_{ni} - t_{n,i-1} < \delta(\varepsilon_n)$, where $\varepsilon_n \downarrow 0$. Let $x_{nk} = x(t_{nk}) - x(t_{n,k-1})$. Then $x = x(I) = \sum_{k=1}^{p_n} x_{nk}$. Define $x'_{nk} = x_{nk}$ if $|x_{nk}| < \varepsilon_n$ and zero otherwise. Put $x_n = \sum_{k=1}^{p_n} x'_{nk}$. Then from the above it follows that
$$P(x = x_n) > 1 - \varepsilon_n,$$
i.e. $x_n \to x$ in probability. Since the $x_{nk}$ are independent, so are the $x'_{nk}$. Therefore
$$E(e^{i\alpha x}) = \lim_n E(e^{i\alpha x_n}) = \lim_n \prod_{k=1}^{p_n} E(e^{i\alpha x'_{nk}}).$$
Let $m_{nk} = E(x'_{nk})$, $V_{nk} = V(x'_{nk})$, $m_n = \sum_{k=1}^{p_n} m_{nk}$ and $V_n = \sum_{k=1}^{p_n} V_{nk}$. Then $|m_{nk}| \le \varepsilon_n$ and $V_{nk} \le 4\varepsilon_n^2$. Now
$$E(e^{i\alpha x}) = \lim_n e^{i\alpha m_n} \prod_{k=1}^{p_n} E\big(e^{i\alpha(x'_{nk} - m_{nk})}\big) = \lim_n e^{i\alpha m_n} \prod_{k=1}^{p_n} \Big[1 - \frac{\alpha^2}{2} V_{nk}\big(1 + O(\varepsilon_n)\big)\Big],$$
so that
$$|E(e^{i\alpha x})| \le \lim_n \prod_k e^{-\frac{\alpha^2}{2} V_{nk}} = \lim_n e^{-\frac{\alpha^2}{2} V_n} \le e^{-\frac{\alpha^2}{2} \varliminf V_n}.$$
Since $E(e^{i\alpha x})$ is continuous in $\alpha$ and is 1 at $\alpha = 0$, for sufficiently small $\alpha$, $E(e^{i\alpha x}) \ne 0$. Hence $\varliminf_n V_n < \infty$, i.e. $V_n$ is bounded. By taking a subsequence if necessary we can assume that $V_n \to V$.

We can very easily prove that if $z_n = \sum_{i=1}^{p_n} z_{ni}$ is such that

(1) $\sup_{1 \le i \le p_n} |z_{ni}| \to 0$ as $n \to \infty$;

(2) $\sum_{i=1}^{p_n} |z_{ni}|$ is bounded uniformly in $n$; and

(3) $z_n \to z$,

then
$$\lim_n \prod_{i=1}^{p_n} [1 - z_{ni}] = e^{-z}.$$
Now in our case $\max_k |V_{nk}| \le 4\varepsilon_n^2 \to 0$, $\sum_{k=1}^{p_n} V_{nk}[1 + O(\varepsilon_n)] \to V$ and $V_{nk} \ge 0$, so that
$$\lim_n \prod_{k=1}^{p_n} \Big[1 - \frac{\alpha^2}{2} V_{nk}\big(1 + O(\varepsilon_n)\big)\Big] = e^{-\frac{\alpha^2}{2} V}.$$
Therefore $E(e^{i\alpha x}) = \lim_n e^{i\alpha m_n}\, e^{-\alpha^2 V/2}$. This implies that $\varphi(\alpha) = \lim_n e^{i\alpha m_n}$ exists. Now if $0 < \delta \le \pi/2$,
$$\int_0^\delta \varphi(\alpha)\,d\alpha = \lim_n \int_0^\delta e^{i\alpha m_n}\,d\alpha = \lim_n \frac{e^{i\delta m_n} - 1}{i m_n} = 0$$
if $m_n \to \infty$, and then $\varphi(\alpha) = 0$ for almost all $\alpha \le \pi/2$, i.e. $E(e^{i\alpha x}) = 0$ for almost all $\alpha \le \pi/2$, and this is a contradiction. Therefore $m_n \to m$ and
$$E(e^{i\alpha x}) = e^{i\alpha m - \alpha^2 V/2}.$$
Theorem 2. Let $(x_t)$ be a Lévy process. If almost all sample functions are step functions with jump 1, then $x(I)$ is a Poisson variable.

Proof. From the continuity in probability of $x_t$,
$$\sup_{|t-s| < 1/n,\ t_0 \le t, s \le t_1} P(|x_t - x_s| \ge 1) \to 0 \quad \text{as } n \to \infty.$$
For each $n$, let $t_0 = t_{n0} < t_{n1} < \dots < t_{np_n} = t_1$, $t_{ni} - t_{n,i-1} \le \frac1n$, be a subdivision of $[t_0, t_1]$, and let $x_{nk} = x_{t_{nk}} - x_{t_{n,k-1}}$, $x'_{nk} = x_{nk}$ if $x_{nk} = 0$ or $1$, and $x'_{nk} = 1$ if $x_{nk} \ge 2$. Put $x_n = \sum_k x'_{nk}$. Then, since $P(x_n \to x) = 1$,
$$E(e^{-\alpha x}) = \lim_n E(e^{-\alpha x_n}) = \lim_n \prod_{k=1}^{p_n} E(e^{-\alpha x'_{nk}}) = \lim_n \prod_{k=1}^{p_n} \big[(1 - p_{nk}) + p_{nk} e^{-\alpha}\big]$$
$$= \lim_n \prod_{k=1}^{p_n} \big[1 - p_{nk}(1 - e^{-\alpha})\big] \le \lim_n \prod_{k=1}^{p_n} e^{-p_{nk}(1 - e^{-\alpha})} = \lim_n e^{-P_n(1 - e^{-\alpha})} = e^{-(1 - e^{-\alpha}) \lim P_n},$$
where $p_{nk} = P(x_{nk} \ge 1) = P(x'_{nk} = 1)$ and $P_n = \sum_{k=1}^{p_n} p_{nk}$. Therefore $P_n$ is bounded. We can assume that $P_n \to P$. Again, since $\max_{1 \le k \le p_n} p_{nk} \to 0$,
$$E(e^{-\alpha x}) = e^{-P(1 - e^{-\alpha})}.$$
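Theorem 2 can be checked numerically: for a unit-jump counting path, the increment over an interval should be Poisson, so in particular its empirical mean and variance should agree. The simulation below is a sketch under my own choice of rate and interval.

```python
import numpy as np

rng = np.random.default_rng(3)

# Jump times of a rate-2 Poisson process on [0, 5]: a step path increasing by 1 at each jump.
lam, T, n_rep = 2.0, 5.0, 20000
counts = []
for _ in range(n_rep):
    n_jumps = rng.poisson(lam * T)
    jump_times = np.sort(rng.uniform(0.0, T, n_jumps))
    # increment x(I) over I = (1, 3]
    counts.append(np.sum((jump_times > 1.0) & (jump_times <= 3.0)))
counts = np.array(counts)
# For a Poisson variable, mean and variance agree (both should be close to lam * |I| = 4).
print(counts.mean(), counts.var())
```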
3 Lévy's canonical form

Before considering the decomposition of a Lévy process we prove some lemmas.

Lemma 1. Let $(x_t)$ be a Lévy process and $(y_t)$ a Poisson additive process. Suppose further that $(z_t) = ((x_t, y_t))$ is a vector-valued additive process. Then, if
$$P(\text{for every } t,\ x_t = x_{t-} \text{ or } y_t = y_{t-}) = 1,$$
the processes $(x_t)$ and $(y_t)$ are independent.

Proof. It is enough to prove that
$$P(x(I) \in E,\ y(I) \in F) = P(x(I) \in E)\,P(y(I) \in F).$$
For once this is proved we have, by the additivity of $(z_t)$, for any finite disjoint system $I_1, \dots, I_n$ of intervals,
$$P\big(x(I_i) \in E_i,\ y(I_i) \in F_i,\ i = 1, \dots, n\big) = \prod_{i=1}^n P\big(x(I_i) \in E_i,\ y(I_i) \in F_i\big) = \prod_{i=1}^n P(x(I_i) \in E_i)\,P(y(I_i) \in F_i)$$
$$= P[x(I_i) \in E_i,\ i = 1, \dots, n]\,P[y(I_i) \in F_i,\ i = 1, \dots, n],$$
and the proof can be completed easily.

Since $y(I)$ is a Poisson variable, it is enough to prove that $E(e^{i\alpha x(I)} : y(I) = k) = E(e^{i\alpha x(I)})\,P(y(I) = k)$.

Let $I = (t_0, t_1]$. For each $n$ let $t_0 = t_{n0} < t_{n1} < \dots < t_{nn} = t_1$, $t_{ni} - t_{n,i-1} = \frac1n(t_1 - t_0)$, be the subdivision of $I$ into $n$ equal intervals. Put $x_{ni} = x(t_{ni}) - x(t_{n,i-1})$, $y_{ni} = y(t_{ni}) - y(t_{n,i-1})$, $x'_{ni} = x_{ni}$ if $y_{ni} = 0$, $x'_{ni} = 0$ if $y_{ni} \ge 1$, and $x_n = \sum_{i=1}^n x'_{ni} = \sum_{y_{ni}=0} x_{ni}$. We have $x = x(I) = \sum_{i=1}^n x_{ni}$ and $|x(w) - x_n(w)| \le \sum_{y_{ni}(w) \ge 1} |x_{ni}(w)|$. Since $y_t(w)$ is a step path increasing with jump 1, the number of terms on the right hand side of the last inequality is at most $y(w) = p$ (say). Suppose that $\tau_1(w), \dots, \tau_p(w)$ are the points in $I$ at which $y_t(w)$ has jumps. Then $|x(w) - x_n(w)| \le \sum_{j=1}^p |x(s'_{nj}) - x(s_{nj})|$, where $(s_{nj}, s'_{nj}]$ is the interval of the $n$-th subdivision which contains $\tau_j(w)$. Now $|x(s'_{nj}) - x(s_{nj})| \le |x(s'_{nj}) - x(\tau_j)| + |x(\tau_j) - x(s_{nj})|$. Since $y_t(w)$ has jumps at $\tau_1, \dots, \tau_p$, $x_t(w)$ has no jumps at these points. Therefore $|x(\tau_j) - x(s_{nj})|$ and $|x(s'_{nj}) - x(\tau_j)| \to 0$ as $n \to \infty$. Thus $P(x_n \to x) = 1$. Now
$$E(e^{i\alpha x_n} : y = k) = \sum_{r \le k}\ \sum_{0 \le \nu_1 < \dots < \nu_r \le n}\ \sum_{\substack{p_1 + \dots + p_r = k \\ p_1, \dots, p_r \ge 1}} E\Big[e^{i\alpha \sum_\nu x'_{n\nu}} : y_{n\nu_\lambda} = p_\lambda,\ 1 \le \lambda \le r;\ y_{n\nu} = 0,\ \nu \ne \nu_\lambda\Big].$$
Put
$$E_{r,(\nu),(p)} = E\Big[e^{i\alpha \sum_\nu x'_{n\nu}} : y_{n\nu_\lambda} = p_\lambda,\ 1 \le \lambda \le r;\ y_{n\nu} = 0,\ \nu \ne \nu_\lambda\Big].$$
Using the hypothesis that $(x_t, y_t)$ is additive, one shows without difficulty that
$$E_{r,(\nu),(p)} = E\Big(e^{i\alpha \sum_{\nu \ne \nu_\lambda} x_{n\nu}} : y_{n\nu} = 0,\ \nu \ne \nu_\lambda\Big)\,P(y_{n\nu_\lambda} = p_\lambda,\ 1 \le \lambda \le r).$$
Also $P(y = 0) = P(y_{n\nu} = 0 \text{ for all } \nu) = P(y_{n\nu} = 0,\ \nu \ne \nu_\lambda)\,P(y_{n\nu_\lambda} = 0,\ 1 \le \lambda \le r)$. Therefore (using the additivity of $(x_t, y_t)$ again)
$$E_{r,(\nu),(p)}\,P(y = 0) = E\Big(e^{i\alpha \sum_{\nu \ne \nu_\lambda} x_{n\nu}} : y = 0\Big)\,P\big(y_{n\nu_\lambda} = p_\lambda,\ 1 \le \lambda \le r;\ y_{n\nu} = 0,\ \nu \ne \nu_\lambda\big),$$
so that
$$P(y = 0)\,E(e^{i\alpha x_n} : y = k) = \sum_{r \le k}\ \sum_{(\nu)}\ \sum_{(p)} E\Big(e^{i\alpha \sum_{\nu \ne \nu_\lambda} x_{n\nu}} : y = 0\Big)\,P\big(y_{n\nu_\lambda} = p_\lambda,\ y_{n\nu} = 0,\ \nu \ne \nu_\lambda\big).$$
Now
$$\Big|E\Big(e^{i\alpha \sum_{\nu \ne \nu_\lambda} x_{n\nu}} : y = 0\Big) - E\big(e^{i\alpha x} : y = 0\big)\Big| \le \sum_{\lambda=1}^r E\big(|e^{i\alpha x_{n\nu_\lambda}} - 1|\big) \le k \sup_{\substack{|t-s| \le (t_1 - t_0)/n \\ t_0 \le t, s \le t_1}} E\big(|e^{i\alpha x_t} - e^{i\alpha x_s}|\big) \to 0$$
as $n \to \infty$, since $x_t$ has no point of fixed discontinuity. We thus have, since $P(y = k) = \sum_{r \le k} \sum_{(\nu),(p)} P(y_{n\nu_\lambda} = p_\lambda,\ y_{n\nu} = 0,\ \nu \ne \nu_\lambda)$,
$$\big|P(y = 0)\,E(e^{i\alpha x_n} : y = k) - P(y = k)\,E(e^{i\alpha x} : y = 0)\big| \le \sup_{(\nu),(p)} \Big|E\Big(e^{i\alpha \sum_{\nu \ne \nu_\lambda} x_{n\nu}} : y = 0\Big) - E\big(e^{i\alpha x} : y = 0\big)\Big| \to 0.$$
Therefore $P(y = 0)\,E(e^{i\alpha x} : y = k) = E(e^{i\alpha x} : y = 0)\,P(y = k)$. Summing this over $k = 0, 1, 2, \dots$ we get $P(y = 0)\,E(e^{i\alpha x}) = E(e^{i\alpha x} : y = 0)$. Hence finally
$$P(y = 0)\,E(e^{i\alpha x} : y = k) = E(e^{i\alpha x} : y = 0)\,P(y = k) = E(e^{i\alpha x})\,P(y = 0)\,P(y = k),$$
i.e. $E(e^{i\alpha x} : y = k) = P(y = k)\,E(e^{i\alpha x})$. We have proved the lemma.

Remark. We can prove that if $x = (x_t)$, $y = (y_t)$ are independent Lévy processes, then
$$P(\text{for every } t,\ x_t = x_{t-} \text{ or } y_t = y_{t-}) = 1.$$
Lemma 2 (Ottaviani). If $r_1(\cdot), \dots, r_n(\cdot)$ are independent stochastic processes almost all of whose sample functions are of type $d_1$, then for any $\delta > 0$,
$$P\Big[\max_{1 \le m \le n} \|r_1(\cdot) + \dots + r_m(\cdot)\| > 2\delta\Big] \le \frac{P\big[\|r_1 + \dots + r_n\| > \delta\big]}{1 - \max_{1 \le m \le n-1} P\big[\|r_{m+1} + \dots + r_n\| > \delta\big]},$$
where $\|r\| = \|r(\cdot)\| = \sup_{0 \le s \le t} |r(s)|$.

Proof. Let
$$A_m = \Big\{\max_{\lambda \le m-1} \|r_1 + \dots + r_\lambda\| \le 2\delta,\ \|r_1 + \dots + r_m\| > 2\delta\Big\}, \qquad B_m = \big\{\|r_{m+1} + \dots + r_n\| \le \delta\big\}.$$
Then, since the $A_m$ are disjoint, the $A_m \cap B_m$ are also disjoint. Further $\bigcup_{m=1}^n (A_m \cap B_m) \subset C = \{\|r_1 + \dots + r_n\| > \delta\}$, so that
$$P(C) \ge \sum P(A_m \cap B_m) = \sum P(A_m)\,P(B_m) \ge P\Big(\bigcup A_m\Big)\,\min_{m=1}^n P(B_m).$$
If we now note that $\min_{m=1}^n P(B_m) = 1 - \max_{1 \le m \le n} P(B_m^c)$, we get the result.

Lemma 3. Let $(x_t)$ be a Lévy process such that $E(x(t)) = 0$, $E(x(t)^2) < \infty$. Then for any $\delta > 0$,
$$P\Big[\sup_{0 \le s \le t} |x(s)| > \delta\Big] < \frac{1}{\delta^2}\,E(x(t)^2).$$

Proof. This lemma is the continuous version of Kolmogorov's inequality, which is as follows.

Kolmogorov's inequality. If $x_1, \dots, x_n$ are independent random variables with $E(x_i) = 0$, $E(x_i^2) < \infty$, $i = 1, \dots, n$, and if $S_m = x_1 + \dots + x_m$, then
$$P\Big[\max_{1 \le m \le n} |S_m| > \delta\Big] < \frac{1}{\delta^2}\,E(S_n^2).$$
The lemma follows easily from this inequality.
Let now $(x_t,\ 0 \le t < a)$ be a Lévy process, $S = \{(s, u) : 0 \le s < a,\ -\infty < u < \infty\}$. Let $B(S)$ be the set of Borel subsets of $S$ and
$$B^+(S) = \{E : E \in B(S) \text{ and } \rho(E,\ s\text{-axis}) > 0\}.$$
For every $w$ we define
$$J(w) = \{(t, u) \in S : x_t(w) - x_{t-}(w) = u \ne 0,\ 0 \le t < a\}.$$
For $E \in B(S)$ put $p(E)$ = number of points in $J(w) \cap E$. For fixed $w$, therefore, $p$ is a measure on $B(S)$. We can prove that $p(E)$ is measurable in $w$ for fixed $E \in B^+(S)$. Let $\nu(M) = E(p(M))$ for $M \in B^+(S)$. Then we have the

Theorem.
$$x_t = x^c(t) + \lim_n \int_{[0,t] \times (u : 1 \ge |u| > 1/n)} \big[u\,p(ds\,du) - u\,\nu(ds\,du)\big] + \int_{[0,t] \times (|u| > 1)} u\,p(ds\,du),$$
where $x^c(t)$ is continuous.
Proof. The proof is in several stages. Let $E_t = E \cap \{(s, u) : 0 \le s \le t\}$ for $E \in B^+(S)$.

1. We shall first prove that $y^E_t = p(E_t)$ is an additive Poisson process. Using the fact that $x_t$ is of type $d_1$ it is not difficult to see that $y^E_t < \infty$ and that it increases with jumps 1.

Let $B_{ts}$ be the least Borel algebra with respect to which $x_u - x_v$, $s \le u, v \le t$, are measurable. We shall prove that $y^E_t - y^E_s$ is $B_{ts}$-measurable. It suffices to prove this when $E = G$ is open. Let $G_m \uparrow G$, $\bar G_m \subset G_{m+1}$, be a sequence of open sets such that $\bar G_m$ is compact. Let $y^G_t - y^G_s = N$, $y^{G_m}_t - y^{G_m}_s = N_m$. For every $n$ let $t^n_k = s + k\frac{t-s}{n}$, $k = 1, 2, \dots, n$, and let $N^m_n$ be the number of $k$ such that $(t^n_k,\ x_{t^n_k} - x_{t^n_{k-1}}) \in G_m$. Then $N_{m-1} \le \lim_n N^m_n \le N_{m+1}$, $N^m_n$ is measurable with respect to $B_{ts}$, and $y^G_t - y^G_s = \lim_m \lim_n N^m_n$.

Now suppose that $I_i = (s_i, t_i]$, $i = 1, 2, \dots, n$, are disjoint. Then the $B_{t_i s_i}$, $1 \le i \le n$, are independent and $y^E(I_i)$ is $B_{t_i s_i}$-measurable. Therefore $y^E_t$ is an additive process.

Finally $y^E_t$ has no fixed discontinuity, for a fixed discontinuity of $y^E_t$ would also be a fixed discontinuity of $x_t$.

Thus we have proved that $p(E_t)$ is an additive Poisson process.

2. Let $r(E_t) = \sum_{(s,u) \in E_t \cap J} u = \int_{E_t} u\,p(ds\,du)$. We prove that $r(E_t)$ is additive. For every $w \in \Omega$, $p$ is a measure on $B(E_t - E_s)$. Any simple function on $E_t - E_s$ is of the form $\sum_{i=1}^n a_i \chi_{F_i}$ where the $F_i \in B(E_t - E_s)$, $i = 1, \dots, n$, are disjoint. Also $\int_{E_t - E_s} (\sum a_i \chi_{F_i})\,p(ds\,du) = \sum a_i\,p(F_i)$, so that $\int_{E_t - E_s} (\sum a_i \chi_{F_i})\,p(ds\,du)$ is $B_{ts}$-measurable. It follows that
$$r(E_t) - r(E_s) = \int_{E_t - E_s} u\,p(ds\,du)$$
is $B_{ts}$-measurable. Let $x^E_t = x_t - r(E_t)$. Using the fact that $r(E_t) - r(E_s)$ is $B_{ts}$-measurable, it is seen without difficulty that $x^E_t$ is a Lévy process. Since $z^E_t = (x^E_t, y^E_t)$ is additive and $P[x^E_t = x^E_{t-}$ or $y^E_t = y^E_{t-}$ for every $t] = 1$, it follows from Lemma 1 that $x^E_\cdot$ and $y^E_\cdot$ are independent.

3. Now we prove that if $E_1, \dots, E_n \in B^+(S)$ are disjoint, then $x^{E_1 \cup \dots \cup E_n},\ y^{E_1}, \dots, y^{E_n}$ are independent. For simplicity we prove this for $n = 2$. Put $x'_t = x^{E_1}_t$. Then $(x')^{E_2}_t = x^{E_1 \cup E_2}_t$ and the process $y^{E_2}$ defined with respect to $x'_t$ is the same as $y^{E_2}_t$ defined with respect to $x_t$. Hence, since $(x')^{E_2}$ and $y^{E_2}$ are independent by 2, $x^{E_1 \cup E_2}$ and $y^{E_2}$ are independent. Further $(x^{E_1 \cup E_2}, y^{E_2})$ is measurable with respect to $B(x^{E_1})$, the least Borel algebra with respect to which $x^{E_1}_t$ is measurable for all $t$, and $B(x^{E_1})$, $B(y^{E_1})$ are independent. Therefore $(x^{E_1 \cup E_2}, y^{E_2})$ and $y^{E_1}$ are independent. It follows that $x^{E_1 \cup E_2}$, $y^{E_1}$ and $y^{E_2}$ are independent.

4. $x^E_t$ and $r(E_t)$ are independent. Since $r(E_t) = \int_{E_t} u\,p(ds\,du)$, it is enough to prove that if $F$ is a simple function on $E_t$, then $\int_{E_t} F\,p(ds\,du)$ and $x^E_t$ are independent; this follows from 3.

5. If $\nu(M) = E(p(M))$ then
$$E\big(e^{i\alpha r(E_t)}\big) = \exp\Big[\int_{E_t} (e^{i\alpha u} - 1)\,\nu(ds\,du)\Big].$$
It is again enough to prove this for simple functions on $E_t$. Note that if $y$ is a Poisson variable then $E(e^{i\alpha y}) = e^{\lambda(e^{i\alpha} - 1)}$ where $\lambda = E(y)$, so that for any $\beta$ we have $E(e^{i\beta y}) = e^{\lambda(e^{i\beta} - 1)}$.

Let $f = \sum s_j \chi_{F_j}$ be a simple function on $E_t$ with $F_j$, $1 \le j \le n$, disjoint. Since the $p(F_j)$ are independent random variables, we have
$$E\Big[\exp\Big(i\alpha \int_{E_t} f\,p(ds\,du)\Big)\Big] = E\Big[e^{i\alpha \sum_{j=1}^n s_j p(F_j)}\Big] = \prod_{j=1}^n E\big(e^{i\alpha s_j p(F_j)}\big) = \prod_{1 \le j \le n} \exp\big[\nu(F_j)(e^{i\alpha s_j} - 1)\big]$$
$$= \prod_{1 \le j \le n} \exp\Big[\int_{E_t} \big(e^{i\alpha s_j \chi_{F_j}} - 1\big)\,\nu(ds\,du)\Big] = \exp\Big[\int_{E_t} \big(e^{i\alpha \sum_{j=1}^n s_j \chi_{F_j}} - 1\big)\,\nu(ds\,du)\Big] = \exp\Big[\int_{E_t} (e^{i\alpha f} - 1)\,\nu(ds\,du)\Big].$$

6. Let $U = \{(s, u) \in S : |u| > 1\}$, $U^n = \{(s, u) \in S : \frac1n \le |u| \le 1\}$. Then $x_t = x^{U^n}_t + r(U^n_t)$, and since $x^E_t$ and $r(E_t)$ are independent,
$$|E(e^{i\alpha x_t})| = |E(e^{i\alpha x^{U^n}_t})|\,|E(e^{i\alpha r(U^n_t)})| \le |E(e^{i\alpha r(U^n_t)})| = \Big|\exp\Big[\int_{U^n_t} (e^{i\alpha u} - 1)\,\nu(ds\,du)\Big]\Big|$$
$$= \exp\Big[\int_{U^n_t} (\cos \alpha u - 1)\,\nu(ds\,du)\Big] \le \exp\Big[-\frac{\alpha^2}{4} \int_{U^n_t} u^2\,\nu(ds\,du)\Big],$$
because $\cos \alpha u - 1 \le -\frac{\alpha^2 u^2}{4}$ for $|\alpha u| \le 1$. Since $E(e^{i\alpha x_t}) \ne 0$ for sufficiently small $\alpha$, it follows that $\int_{U^n_t} u^2\,\nu(ds\,du) < \infty$ for every $n$, and therefore $\lim_n \int_{U^n_t} u^2\,\nu(ds\,du) < \infty$.

7. Let $r_n(t) = r(U^n_t) - E(r(U^n_t))$; then $r_n(t)$ converges uniformly in every compact subinterval of $[0, a)$. The limit we denote by $r^\infty(t)$.

Now $r(U^{m+k+1}_t) - r(U^{m+k}_t) = r(U^{m+k+1}_t - U^{m+k}_t)$. It follows that $r_{m+k}(\cdot) - r_{m+k-1}(\cdot)$, $k = 1, 2, \dots, n-m$, are independent. Using Lemmas 2 and 3,
$$P\Big[\max_{1 \le k \le n-m} \|r_{m+k} - r_m\| > 2\delta\Big] \le \frac{P(\|r_n - r_m\| > \delta)}{1 - \max_{1 \le k \le n-m-1} P(\|r_n - r_{m+k}\| > \delta)} \le \frac{\frac{1}{\delta^2}\int_{U^n_t - U^m_t} u^2\,\nu(ds\,du)}{1 - \frac{1}{\delta^2}\int_{U^n_t - U^m_t} u^2\,\nu(ds\,du)} \to 0$$
as $m, n \to \infty$, since
$$E\big(|r_n(t) - r_m(t)|^2\big) = E\Big(\big|r(U^n_t - U^m_t) - E(r(U^n_t - U^m_t))\big|^2\Big) = E\Big[\Big(\int_{U^n_t - U^m_t} u\,[p(ds\,du) - \nu(ds\,du)]\Big)^2\Big]$$
and
$$E\Big[\Big(\int_{E_t} u\,[p(ds\,du) - \nu(ds\,du)]\Big)^2\Big] = \int_{E_t} u^2\,\nu(ds\,du),$$
which can be proved by first considering simple functions etc., and noting the fact that if $y$ is a Poisson variable, then $E[(y - E(y))^2] = E(y)$.

8. Let $x_n(t) = x^{U^n}_t + E(r(U^n_t)) - r(U_t) = x_t - r_n(t) - r(U_t)$. Since $r_n(t)$ converges uniformly in every compact subinterval of $[0, a)$ with probability 1, $x_n(t)$ converges uniformly in $[0, a)$, say to $x^c(t)$. Since $x_n(t)$ has no jumps exceeding $\frac1n$ in absolute value, $x^c(t)$ is continuous. We have
$$x_t = r(U_t) + \lim_n r_n(t) + \lim_n x_n(t) = x^c(t) + \int_{U_t} u\,p(ds\,du) + \lim_n \int_{U^n_t} \big[u\,p(ds\,du) - u\,\nu(ds\,du)\big],$$
since $E(r(U^n_t)) = \int_{U^n_t} u\,\nu(ds\,du)$. The theorem is proved.
Since $\int_{U_t} \nu(ds\,du) = E(p(U_t)) < \infty$, we have $\int_{U_t} \frac{u}{1+u^2}\,\nu(ds\,du) < \infty$. We have seen that $\lim_n \int_{U^n_t} u^2\,\nu(ds\,du) < \infty$. Therefore $\lim_n \int_{U^n_t} \frac{u^3}{1+u^2}\,\nu(ds\,du) < \infty$, and we can also write the last equation as
$$x_t = g(t) + \lim_n \int_{[0,t] \times (u : |u| > 1/n)} \Big[u\,p(ds\,du) - \frac{u}{1+u^2}\,\nu(ds\,du)\Big],$$
where
$$g(t) = x^c(t) + \int_{U_t} \frac{u}{1+u^2}\,\nu(ds\,du) - \lim_n \int_{U^n_t} \frac{u^3}{1+u^2}\,\nu(ds\,du).$$
For simplicity we shall write
$$x_t = g(t) + \int_{s=0}^t \int_{-\infty}^\infty \Big[u\,p(ds\,du) - \frac{u}{1+u^2}\,\nu(ds\,du)\Big].$$
In the general case when $x_0 \not\equiv 0$ we have
$$x_t = x_0 + g(t) + \int_{s=0}^t \int_{-\infty}^\infty \Big[u\,p(ds\,du) - \frac{u}{1+u^2}\,\nu(ds\,du)\Big].$$
From now on we shall write
$$x_t = \int u\,p([0,t]\,du) - \int \frac{u}{1+u^2}\,\nu([0,t]\,du) + g(t).$$
Since $x_t$ has no fixed discontinuity, $P(|x_t - x_{t-}| > 0) = 0$. It follows that $\nu(\{t\} \times U) = 0$. Noting this, it is not difficult to see that $\int_{U_t} \frac{u}{1+u^2}\,\nu(ds\,du)$ and $\lim_n \int_{U^n_t} \frac{u^3}{1+u^2}\,\nu(ds\,du)$ are both continuous in $t$. Therefore $g(t)$ is continuous and hence is a Gaussian additive process. Further, we can show that $g(t)$ and
$$\int \Big[u\,p([0,t]\,du) - \frac{u}{1+u^2}\,\nu([0,t]\,du)\Big]$$
are independent. We have
$$E\big(e^{i\alpha(x_t - x_s)}\big) = E\Big(\exp\Big(i\alpha \int \Big[u\,p([s,t]\,du) - \frac{u}{1+u^2}\,\nu([s,t]\,du)\Big]\Big)\Big)\,E\big(e^{i\alpha[g(t) - g(s)]}\big)$$
$$= \lim_n E\Big(\exp\Big(i\alpha \int_{|u| > 1/n} \Big[u\,p([s,t]\,du) - \frac{u}{1+u^2}\,\nu([s,t]\,du)\Big]\Big)\Big)\,\exp\Big[i\alpha(m(t) - m(s)) - \frac{v(t) - v(s)}{2}\alpha^2\Big]$$
$$= \lim_n \exp\Big[\int_{|u| > 1/n} \Big(e^{i\alpha u} - 1 - \frac{i\alpha u}{1+u^2}\Big)\,\nu([s,t]\,du)\Big]\,\exp\Big[i\alpha(m(t) - m(s)) - \frac{v(t) - v(s)}{2}\alpha^2\Big].$$
Therefore
$$\log E\big(e^{i\alpha(x_t - x_s)}\big) = i\alpha[m(t) - m(s)] - \frac{v(t) - v(s)}{2}\alpha^2 + \int_{-\infty}^\infty \Big[e^{i\alpha u} - 1 - \frac{i\alpha u}{1+u^2}\Big]\,\nu([s,t]\,du).$$
Since $g(t)$ is Gaussian, $m(t)$ and $v(t)$ are continuous in $t$ and $v(t)$ increases with $t$.

Conversely, given $m$, $v$ and $\nu$ such that (1) $m(t)$ is continuous in $t$, (2) $v(t)$ is continuous and increasing, (3) $\nu[\{t\} \times U] = 0$ and $\int \frac{u^2}{1+u^2}\,\nu([0,t]\,du) < \infty$, we can construct a unique (in law) Lévy process.
Let us now consider some special cases. If $\int u^2\,\nu([0,t]\,du) < \infty$ we can write
$$x_t = g_1(t) + \int u\,\big[p([0,t]\,du) - \nu([0,t]\,du)\big].$$
The condition $\int \frac{|u|}{1+|u|}\,\nu([0,t]\,du) < \infty$ is equivalent to the two conditions (1) $\int \frac{u^2}{1+u^2}\,\nu([0,t]\,du) < \infty$ and (2) $\int \frac{|u|}{1+u^2}\,\nu([0,t]\,du) < \infty$, so that if $\int \frac{|u|}{1+|u|}\,\nu([0,t]\,du) < \infty$ we can write
$$x_t = \int u\,p([0,t]\,du) + g_2(t)$$
and
$$\log E(e^{i\alpha x_t}) = i\alpha m(t) - \frac{v(t)}{2}\alpha^2 + \int \big[e^{i\alpha u} - 1\big]\,\nu([0,t]\,du).$$
The condition $\int \frac{u^2}{1+|u|}\,\nu([0,t]\,du) < \infty$ is equivalent to the conditions (1) $\int \frac{u^2}{1+u^2}\,\nu([0,t]\,du) < \infty$ and (2) $\int \frac{|u|^3}{1+u^2}\,\nu([0,t]\,du) < \infty$. Therefore, if $\int \frac{u^2}{1+|u|}\,\nu([0,t]\,du) < \infty$ we can write
$$\log E(e^{i\alpha x_t}) = i\alpha m(t) - \frac{v(t)}{2}\alpha^2 + \int \big[e^{i\alpha u} - 1 - i\alpha u\big]\,\nu([0,t]\,du).$$
Lemma. If
$$f(\alpha) = i\alpha m - \frac{v}{2}\alpha^2 + \int \Big[e^{i\alpha u} - 1 - \frac{i\alpha u}{1+u^2}\Big]\,\nu(du) \equiv 0,$$
where $m$ and $v$ are real and $\nu$ is a signed measure such that $\int \frac{u^2}{1+u^2}\,\nu(du) < \infty$, then $m = v = \nu = 0$.

Proof. We have
$$0 \equiv f(\alpha) - \frac12 \int_{\alpha-1}^{\alpha+1} f(\beta)\,d\beta = \frac{v}{6} + \int e^{i\alpha u}\Big[1 - \frac{\sin u}{u}\Big]\,\nu(du),$$
so that if $\delta_0$ is the Dirac measure at $0$,
$$\int e^{i\alpha u}\Big[\frac{v}{6}\,\delta_0(du) + \Big(1 - \frac{\sin u}{u}\Big)\,\nu(du)\Big] \equiv 0.$$
It follows that $\frac{v}{6}\,\delta_0(A) + \int_A \big(1 - \frac{\sin u}{u}\big)\,\nu(du) = 0$ for every Borel set $A$. Taking $A = \{0\}$, since $\int_{\{0\}} \big(1 - \frac{\sin u}{u}\big)\,\nu(du) = 0$, we see that $v = 0$. It then follows that $\nu = 0$ and hence $m = 0$.

From this lemma we can easily deduce that in the expression
$$\log E(e^{i\alpha x_t}) = i\alpha m(t) - \frac{v(t)}{2}\alpha^2 + \int \Big[e^{i\alpha u} - 1 - \frac{i\alpha u}{1+u^2}\Big]\,\nu([0,t]\,du)$$
the quantities $m(t)$, $v(t)$ and $\nu$ are unique.
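The canonical form is easy to verify numerically in the simple situation of the second special case above, where the jump measure is finite and no compensation is needed. The sketch below compares the empirical log-characteristic function of $x_t$ with $t\psi(\alpha)$, $\psi(\alpha) = i\alpha m - \frac{v}{2}\alpha^2 + \int (e^{i\alpha u}-1)\nu(du)$, for a drift plus Wiener plus compound Poisson process. All parameters and the sampling scheme are my own assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

# Levy process with drift m, Gaussian part v and finite jump measure nu = lam * N(1, 0.25).
m, v, lam, t = 0.5, 1.0, 2.0, 1.5
n_rep = 200_000

def sample_x_t():
    jumps = rng.normal(1.0, 0.5, rng.poisson(lam * t)).sum()
    return m * t + np.sqrt(v * t) * rng.standard_normal() + jumps

x = np.array([sample_x_t() for _ in range(n_rep)])

alpha = 0.7
empirical = np.log(np.mean(np.exp(1j * alpha * x)))

u = rng.normal(1.0, 0.5, 100_000)                      # Monte Carlo for the nu-integral
jump_part = lam * np.mean(np.exp(1j * alpha * u) - 1.0)
psi = 1j * alpha * m - v * alpha**2 / 2 + jump_part
print(empirical, t * psi)                              # the two should be close
```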
126 4. Additive Processes
4 Temporally homogeneouos L evy processes
We shall prove that if (x
t
) is a temporally homogeneous L evy process,
then log E(e
ix
t
) = t() where
() = im
v
2

2
+ +
_

_
i
iu
1
iu
1 + u
2
_
(du).
Denition (). Two random variables x and y on a probability space
(P) are said to be equivalent in law and we write s
L
y if they yield
the same distribution.
A stochastic process (x
t
, 0 t < a) on can be regarded as a
measurable function into R
[0,a]
. Two stochastic processes (x
t
), (y
t
), 0
t < a are said to be equivalent in law if they induce the same probability
distribution on R
[0,a]
and we write x
L
y. If (x
t
) and (y
t
) are addtive
processes such that x
t
x
s

L
y
t
y
s
, then we can be prove that x
L
y.
Let D

denote the set of all d


1
-type functions on [0, a) into R

. Then 154
D

R
[0,a)
and let B(D

) be the induced Borel algebra on D

by B(R
[0,a)
).
If (x
t
, 0 t < a) is a L evy process then the map w x(w) into D

is
measurable; also if x
(h)
t
= x
x+h
x
t
, 0 t < a h we can show that (x
t
)
is temporally homogeneous if and only if x
L
x
(h)
.
Now consider D

. Let E B
+
(S ),
J( f ) = {(s, u) : f (s) f (s) = u 0}, f D

and F
E
t
( f ) = number of points in J( f ) E
t
.
We can show that F
t
( f ) < and that F
t
is measurable on D

. The
proof of measurability of F
t
follows exactly on the same lines as that of
the mesurability of Y
E
t
. We have clearly p(E
t
) = y
E
t
= F
E
t
(x).
Let E B([0, t]) and U B(R

) be such that E
1
= E U B
+
(S ).
If h is such that t + h < a we prove that ((E + h) U) = (E U). Let
E
2
= (E +h) U. Then (E
2
U) = E(y
E
2
t+h
) = E(y
E
2
t+h
y
E
2
h
). Let x
(h)
t
=
x
t+h
x
h
. Since x
t
is temporally homogeneous x
(h)

L
x. Also y
E
2
t+h
y
E
2
h
=
F
E
1
t
[x
(h)
] and y
E
1
t
= F
E
1
t
[x]. It follows that E(y
E
1
t
) = E(y
E
2
t+h
). Thus for
xed U, is a translation-invariant measure on B[(0, a))] and hence is
4. Temporally homogeneouos L evy processes 127
the Lebesgue measure, i.e. (E U) = m(E)
1
(U), where m(E) is
the Lebesgue measure of E and
1
(U) is a constant depending on U.
Since is a measure on R
2
, it follows that
1
is also a measure. Hence
(ds du) = ds
1
(du). We shall drop the sux 1 and use same symbol 155
. Thus
t
_
0

_
e
iu
1
iu
1 + u
2
_
(ds du) = t

_
e
iu
1
iu
1 + u
2
_
(du).
Now log E(e
i(x
t
x
s
)
) depends only on t s. Therefore m(t) m(s)
and v(t) v(s) depend only on t s. Hence m(t) = m.t, v(t) = v.t.
Therefore, nally,
log E(e
ix
t
) = Imt
vt
2

2
+ t
_

_
e
iu
1
iu
1 + u
2
_
(du).
We shall now consider some special cases of temporally homoge-
neous L evy process. We have seen that
x
t
= g(t) +

_
up
t
(du)
u
1 + u
2
t(du)
_
,
where p
t
(du) = p([0, t] du). Since g(t) is Gaussian additive and tem-
porally homogeneous g(t) = mt +

vB
t
, where B
t
is a Wiener process.
Thus
x
t
= mt +

vB
t
+

_
up
t
(du)
u
1 + u
2
t(du)
_
and
() = Im
v
2

2
+
_

_
e
iu
1
iu
1 + u
2
_
(du)
Special cases:

1. $\nu \equiv 0$. Then $x_t = mt + \sqrt v\,B_t$.

2. In case $m^* = -\lim_{\varepsilon \to 0} \int_{|u| > \varepsilon} \frac{u}{1+u^2}\,\nu(du)$ exists, we can write $x_t = (m + m^*)t + \sqrt v\,B_t + \int u\,p_t(du)$, and
$$\psi(\alpha) = i\alpha(m + m^*) - \frac{v}{2}\alpha^2 + \int \big[e^{i\alpha u} - 1\big]\,\nu(du).$$
Note that if $\nu$ is symmetric, $m^* = 0$.

3. If $m + m^* = 0$ and $v = 0$, then $x_t = \lim_{\varepsilon \to 0} \int_{|u| > \varepsilon} u\,p_t(du)$ and
$$\psi(\alpha) = \lim_{\varepsilon \to 0} \int_{|u| > \varepsilon} \big[e^{i\alpha u} - 1\big]\,\nu(du).$$
Such a process is called a pure jump process.

4. If $\lambda = \nu(R) < \infty$, then
$$x_t = \int u\,p_t(du), \qquad \psi(\alpha) = \int \big[e^{i\alpha u} - 1\big]\,\nu(du) = \lambda\,[\varphi(\alpha) - 1],$$
where $n(E) = \nu(E)/\lambda$ and $\varphi(\alpha)$ is the characteristic function of $n$. We have
$$E(e^{i\alpha x_t}) = e^{t\lambda(\varphi(\alpha) - 1)} = e^{-t\lambda} \sum_k \frac{t^k \lambda^k}{k!}\,\varphi(\alpha)^k = e^{-t\lambda} \sum_k \frac{t^k \lambda^k}{k!}\,[\text{characteristic function of } n^{k*}],$$
where $n^{k*}$ denotes the $k$-fold convolution of $n$. Since $E(e^{i\alpha x_t})$ is the characteristic function of the measure $\pi(t, \cdot)$, we have
$$\pi(t, E) = e^{-t\lambda} \sum_k \frac{\lambda^k t^k}{k!}\,n^{k*}(E).$$

Remark. If $\pi(t, E) = P(x_t \in E)$ is symmetric, i.e. if $P(x_t \in E) = P(x_t \in -E)$, then $E(e^{i\alpha x_t})$ is real. Hence $\psi(\alpha)$ is real. Further, since $x_t \overset{L}{=} -x_t$, we have $x \overset{L}{=} -x$. It follows that $\nu(db) = \nu(-db)$. Therefore
$$\psi(\alpha) = i\alpha m - \frac{v}{2}\alpha^2 + 2\int_0^\infty [\cos \alpha u - 1]\,\nu(du).$$
Since $\psi(\alpha)$ is real, $m = 0$, so that
$$\psi(\alpha) = -\frac{v}{2}\alpha^2 + 2\int_0^\infty [\cos \alpha u - 1]\,\nu(du).$$
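A quick numerical check of the statement $\log E(e^{i\alpha x_t}) = t\psi(\alpha)$ for a temporally homogeneous process: for a compound Poisson process, the empirical value of $\log E(e^{i\alpha x_t})/t$ should not depend on $t$. The rate and jump law below are arbitrary assumptions of mine.

```python
import numpy as np

rng = np.random.default_rng(5)

# Compound Poisson process: rate 3, Exp(1) jumps.  Compare log E(exp(i*alpha*x_t)) / t
# for two different times t; both should estimate psi(alpha).
def sample(t, n):
    return np.array([rng.exponential(1.0, rng.poisson(3.0 * t)).sum() for _ in range(n)])

alpha, n = 0.8, 100_000
log_cf = lambda xs: np.log(np.mean(np.exp(1j * alpha * xs)))
print(log_cf(sample(1.0, n)), log_cf(sample(2.0, n)) / 2.0)   # nearly equal
```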
5 Stable processes

Let $(x_t,\ 0 \le t < \infty)$ be a temporally homogeneous Lévy process. If $x_t \overset{L}{=} c_t x_1$, where $c_t$ is a constant depending on $t$, we say that $(x_t)$ is a stable process. We shall now give a theorem which characterises stable processes completely. From Lévy's canonical form we have $E(e^{i\alpha x_t}) = e^{t\psi(\alpha)}$, where
$$\psi(\alpha) = i\alpha m - \frac{v}{2}\alpha^2 + \int \Big[e^{i\alpha u} - 1 - \frac{i\alpha u}{1+u^2}\Big]\,\nu(du).$$

Theorem 1. $\psi(\alpha)$ is one of
$$i\alpha m\ (m \text{ real}), \qquad -a_0|\alpha|^2, \qquad -\Big(a_0 + i\frac{\alpha}{|\alpha|}a_1\Big)|\alpha|^c,$$
where $a_0 > 0$, $0 < c \le 2$ and $a_1$ is real.

Proof. Suppose that $\psi(\alpha)$ is not of the form $i\alpha m$.

We prove that if $\psi(c\alpha) = \psi(d\alpha)$ for all $\alpha$, then $c = d$. For if not, let $\gamma = \min\big(\frac{c}{d}, \frac{d}{c}\big) < 1$; then $\psi(\alpha) = \psi(\gamma\alpha)$, so that $\psi(\alpha) = \psi(\gamma^n\alpha) \to 0$. Hence $\psi(\alpha) \equiv 0$, and this is the omitted case.

Since $e^{\psi(c_t\alpha)} = E(e^{i\alpha c_t x_1}) = E(e^{i\alpha x_t}) = e^{t\psi(\alpha)}$, we have $\psi(c_t\alpha) = t\psi(\alpha)$. Therefore
$$\psi(c_{ts}\alpha) = ts\,\psi(\alpha) = t\,\psi(c_s\alpha) = \psi(c_t c_s \alpha).$$
It follows that $c_{ts} = c_t c_s$.

We prove next that $c_t$ is continuous. Let $c_{t_n} \to d$ as $t_n \to t$. If $d = \infty$ we should have, since $\psi(c_{t_n}\alpha) = t_n\psi(\alpha)$,
$$\psi(\alpha) = t_n\,\psi(c_{t_n}^{-1}\alpha) \to 0.$$
Therefore $d < \infty$, and $\psi(c_t\alpha) = t\psi(\alpha) = \lim_n t_n\psi(\alpha) = \lim_n \psi(c_{t_n}\alpha) = \psi(d\alpha)$. Hence $\lim c_{t_n} = d = c_t$. One shows easily that $c_t = t^{1/c}$ for some $c > 0$. Therefore, if $\alpha > 0$,
$$\psi(\alpha \cdot 1) = \psi\big((\alpha^c)^{1/c} \cdot 1\big) = \alpha^c\,\psi(1),$$
and if $\alpha < 0$,
$$\psi(\alpha) = \psi(|\alpha|(-1)) = |\alpha|^c\,\psi(-1) = |\alpha|^c\,\overline{\psi(1)},$$
for from the form of $\psi(\alpha)$ we see that $\psi(-\alpha) = \overline{\psi(\alpha)}$. Thus, if $\psi(1) = -a_0 - a_1 i$, we have
$$\psi(\alpha) = -|\alpha|^c\Big(a_0 + i a_1 \frac{\alpha}{|\alpha|}\Big).$$
Since $|e^{\psi(\alpha)}| \le 1$, $a_0 \ge 0$; if $a_0 = 0$, $E(e^{i\alpha x_1}) = e^{-i a_1 \alpha|\alpha|^{c-1}}$, so that
$$E\big[\cos(\alpha x_1 - a_1\alpha|\alpha|^{c-1})\big] = 1, \quad \text{i.e.} \quad E\big[1 - \cos(\alpha x_1 - a_1\alpha|\alpha|^{c-1})\big] = 0.$$

Hmm, we should therefore have $\cos(\alpha x_1 - a_1\alpha|\alpha|^{c-1}) = 1$ a.e., or $\alpha[x_1(w) - a_1|\alpha|^{c-1}] = 2k\pi(\alpha, w)$, $k(\alpha, w)$ being an integer depending on $\alpha$ and $w$. For fixed $w$, $k(\alpha, w)$ is continuous in $\alpha$. Letting $\alpha \to 0$ we see that $k(\alpha, w) \equiv 0$. Therefore $x_1(w) - a_1|\alpha|^{c-1} \equiv 0$. If $a_1 \ne 0$ this shows that $c = 1$, so that $\psi(\alpha) = -i a_1\alpha$, which is again the omitted case. Hence $a_0 > 0$.
We shall now show that $0 < c \le 2$. We have $x_t \overset{L}{=} t^{1/c} x_1$, $x_{st} \overset{L}{=} (st)^{1/c} x_1 \overset{L}{=} s^{1/c} x_t$. By using the additivity and homogeneity of $x_t$ and $x_{st}$ we can show that $x_{s\cdot} \overset{L}{=} s^{1/c} x_\cdot$ (as random processes). It follows that the expectations of the numbers of jumps of these processes are the same (because if $p_1(E_t)$ and $p_2(E_t)$ correspond to $x_{s\cdot}$ and $s^{1/c} x_\cdot$ then $p_1(E_t)$, $p_2(E_t)$ are equivalent in law). The expected numbers of jumps of $x_{s\cdot}$ and $s^{1/c} x_\cdot$ in $dt \times du$ are $s\,dt\,\nu(du)$ and $dt\,\nu(s^{-1/c}du)$ respectively. We have therefore $\nu(s^{-1/c}du) = s\,\nu(du)$. Let $\nu^+(u) = \int_u^\infty \nu(du')$ for $u > 0$. Then, since $s\,\nu(du) = \nu(s^{-1/c}du)$,
$$s\,\nu^+(u) = s\int_u^\infty \nu(du') = \int_u^\infty \nu(s^{-1/c}du') = \int_{s^{-1/c}u}^\infty \nu(du') = \nu^+(u s^{-1/c}).$$
Putting $s = u^c$ and $a_+ = c\,\nu^+(1)\ (\ge 0)$ we get $u^c\,\nu^+(u) = \nu^+(1) = \frac{a_+}{c}$, so that $\nu^+(u) = \frac{a_+}{c}u^{-c}$. Therefore $\nu(du) = a_+ u^{-c-1}\,du$ for $u > 0$. Similarly we see that $\nu(du) = a_-|u|^{-c-1}\,du$ for $u < 0$. If $a_+ = a_- = 0$ then $\psi(\alpha) = i\alpha m - v\alpha^2/2$ and $x_t$ is Gaussian additive. Also $\psi(\alpha) = -|\alpha|^c(a_0 + i a_1 \alpha/|\alpha|)$, so that $c = 2$, $v/2 = a_0$ and $a_1 = m = 0$. Therefore $\psi(\alpha) = -a_0\alpha^2$, $a_0 > 0$.

Let us now assume that at least one of $a_+$ or $a_-$ is positive, say $a_+$. Since $\int_{-1}^1 u^2\,\nu(du) < \infty$, $\int_0^1 u^2\,\nu(du) < \infty$, so that $a_+\int_0^1 \frac{u^2\,du}{u^{c+1}} < \infty$. This proves that $c < 2$. Again, using $\int_1^\infty \nu(du) < \infty$, we can see that $0 < c$. The theorem is completely proved.
The number $c$ is called the index of the stable process. We shall discuss the cases $0 < c < 1$, $c = 1$ and $1 < c < 2$.

Case (a): $0 < c < 1$.

In this case we have $\int \nu(du) = \infty$, $\int_{-1}^1 |u|\,\nu(du) < \infty$. The second inequality implies $E\big(\int_{-1}^1 |u|\,p([0,t]\,du)\big) < \infty$, so that
$$P\Big(\int_{-1}^1 |u|\,p([0,t]\,du) < \infty\Big) = 1.$$
Let
$$f(n) = t\int_{|u| \ge 1/n} \nu(du) = \int_{|u| \ge 1/n} \nu([0,t]\,du) = E\Big[\int_{|u| \ge 1/n} p([0,t]\,du)\Big] = E\Big[p\Big([0,t] \times \big(|u| \ge \tfrac1n\big)\Big)\Big].$$
Since $p(E_t)$ is a Poisson variable, we have
$$P\Big[p\Big([0,t] \times \big(|u| \ge \tfrac1n\big)\Big) \ge N\Big] = \sum_{k \ge N} e^{-f(n)}\frac{[f(n)]^k}{k!} = 1 - e^{-f(n)}\sum_{k < N}\frac{[f(n)]^k}{k!}.$$
Letting $n \to \infty$, since $f(n) \to \infty$ we have
$$P\big[p\big([0,t] \times (|u| > 0)\big) \ge N\big] = 1.$$
Hence $P\,[\text{the number of jumps in } [0,t] = \infty] = 1$. Now $\int \frac{|u|}{1+u^2}\,\nu(du) < \infty$, so that we can write
$$x_t = g_2(t) + \int u\,p([0,t]\,du).$$
We can now show that
$$\psi(\alpha) = i\alpha m - \frac{v}{2}\alpha^2 + a_+\int_0^\infty \big[e^{i\alpha u} - 1\big]\frac{du}{u^{c+1}} + a_-\int_{-\infty}^0 \big[e^{i\alpha u} - 1\big]\frac{du}{|u|^{c+1}}.$$
Also, for $\alpha > 0$, $\int_0^\infty (e^{i\alpha u} - 1)\frac{du}{u^{c+1}} = \alpha^c\int_0^\infty \big[e^{iu} - 1\big]\frac{du}{u^{c+1}} = O(|\alpha|^c) = o(|\alpha|) = o(|\alpha|^2)$ as $\alpha \to \infty$; similarly $\int_{-\infty}^0 (e^{i\alpha u} - 1)\frac{du}{|u|^{c+1}} = O(|\alpha|^c) = o(|\alpha|) = o(|\alpha|^2)$ as $\alpha \to \infty$. Since $\psi(\alpha)$ itself is $O(|\alpha|^c)$, comparing orders we get $v = m = 0$, and
$$\psi(\alpha) = a_+\int_0^\infty \big[e^{i\alpha u} - 1\big]\frac{du}{u^{c+1}} + a_-\int_{-\infty}^0 \big[e^{i\alpha u} - 1\big]\frac{du}{|u|^{c+1}}, \qquad x_t = \int u\,p([0,t]\,du).$$
Case (b): $1 < c < 2$.

In this case $\int \frac{u^2}{1+|u|}\frac{du}{|u|^{c+1}} < \infty$. Hence we can write
$$\psi(\alpha) = i\alpha m - \frac{v}{2}\alpha^2 + a_+\int_0^\infty \big[e^{i\alpha u} - 1 - i\alpha u\big]\frac{du}{u^{c+1}} + a_-\int_{-\infty}^0 \big[e^{i\alpha u} - 1 - i\alpha u\big]\frac{du}{|u|^{c+1}}.$$
Now $\psi(\alpha) = O(|\alpha|^c)$, so that comparing the orders as $\alpha \to \infty$ and $\alpha \to 0$ we see immediately that $m = v = 0$. Hence
$$\psi(\alpha) = a_+\int_0^\infty \big[e^{i\alpha u} - 1 - i\alpha u\big]\frac{du}{u^{c+1}} + a_-\int_{-\infty}^0 \big[e^{i\alpha u} - 1 - i\alpha u\big]\frac{du}{|u|^{c+1}}.$$
We have
$$E\Big[\int_{-1}^1 |u|^{c'}\,p([0,t]\,du)\Big] = a_+\,t\int_0^1 \frac{|u|^{c'}\,du}{u^{c+1}} + a_-\,t\int_{-1}^0 \frac{|u|^{c'}\,du}{|u|^{c+1}},$$
which is finite or infinite according as $c' > c$ or $c' \le c$. Therefore
$$P\Big[\sum_{s \le t} |x_s - x_{s-}|^{c'} < \infty\Big] = 1 \quad \text{if } c' > c.$$
We can easily show that
$$E\Big[\exp\Big(-\int_{-1}^1 |u|^{c'}\,p([0,t]\,du)\Big)\Big] = \exp\Big[-t\int_{-1}^1 \big(1 - e^{-|u|^{c'}}\big)\,\nu(du)\Big].$$
Since $\int_{-1}^1 |u|^{c'}\,\nu(du) = \infty$ for $c' \le c$, the right side is zero in the limit. It follows that $\exp\big(-\int_{-1}^1 |u|^{c'}\,p([0,t]\,du)\big) = 0$ with probability 1. Hence
$$P\Big[\sum_{s \le t} |x_s - x_{s-}|^{c'} = \infty\Big] = 1 \quad \text{if } c' \le c.$$
Case (c): $c = 1$.

We have $\psi(\alpha) = -i a_1\alpha - a_0|\alpha|$. Since
$$|\alpha| = -\frac{2}{\pi}\int_0^\infty [\cos \alpha u - 1]\frac{du}{u^2} = -\frac{1}{\pi}\int_{-\infty}^\infty \Big[e^{i\alpha u} - 1 - \frac{i\alpha u}{1+u^2}\Big]\frac{du}{u^2},$$
we have
$$\psi(\alpha) = -i a_1\alpha + \frac{a_0}{\pi}\int \Big[e^{i\alpha u} - 1 - \frac{i\alpha u}{1+u^2}\Big]\frac{du}{u^2} = i\alpha m - \frac{v}{2}\alpha^2 + \int \Big[e^{i\alpha u} - 1 - \frac{i\alpha u}{1+u^2}\Big]\,\nu(du).$$
From the uniqueness of the representation of $\psi(\alpha)$ we get $v = 0$ and $\nu(du) = \frac{a_0}{\pi}\frac{du}{u^2}$. In this case thus $a_+ = a_-$ and
$$\psi(\alpha) = -i a_1\alpha + \frac{a_0}{\pi}\int \Big[e^{i\alpha u} - 1 - \frac{i\alpha u}{1+u^2}\Big]\frac{du}{u^2}.$$

Definition. Processes for which $c = 1$ are called Cauchy processes.
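The defining scaling property $x_t \overset{L}{=} c_t x_1 = t^{1/c} x_1$ is easy to see in simulation for the symmetric Cauchy case ($c = 1$): a value of $x_t$ built from many Cauchy increments has the same law as $t$ times a single Cauchy variable. The sketch below compares quantiles; the sizes and seed are my own choices.

```python
import numpy as np

rng = np.random.default_rng(6)

# Symmetric Cauchy process: an increment over time h is Cauchy with scale h,
# so x_t (built from n increments) has the same law as t * x_1.
t, n, n_rep = 3.0, 300, 100_000
x_t = (t / n) * rng.standard_cauchy((n_rep, n)).sum(axis=1)  # sum of n Cauchy(scale = t/n) increments
scaled = t * rng.standard_cauchy(n_rep)                      # t * x_1

q = [10, 25, 50, 75, 90]
print(np.percentile(x_t, q))
print(np.percentile(scaled, q))   # the two sets of quantiles should nearly agree
```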
6 Lévy process as a Markov process

Let $(x_t(w))$, $w \in \Omega = (\Omega, B, P)$, be a temporally homogeneous Lévy process. Let $M = (R, W, P_a)$, where $W = W_{d_1}$ and $P_a(B) = P(x + a \in B)$. We show that $M$ is a Markov process.

If $x$ is a random variable on a probability space $\Omega$, then the map $(w, a) \to (x(w), a)$ is measurable. It follows that the map $(w, a) \to x(w) + a$ is measurable in the pair $(w, a)$. Now note that if $F$ is a fixed measurable subset of $\Omega \times R$, then $f(a) = P(w : (w, a) \in F)$ is measurable in $a$. Hence $P(w : x(w) + a \in E)$, for $E \in B(R)$, is measurable in $a$.

Therefore $P(t, a, E) = P(w : x_t(w) + a \in E)$ is measurable in $a$. If $U$ is an open set containing $a$, then $U - a$ is an open set containing 0. Since $x_t$ is continuous in probability,
$$\lim_{t \to 0} P(t, a, U) = \lim_{t \to 0} P[x_t(w) \in U - a] = 1.$$
It remains to prove that if $t_1 < \dots < t_n$, then
$$P_a(x_{t_i} \in E_i,\ 1 \le i \le n) = \int_{a_i \in E_i} P(t_1, a, da_1)\,P(t_2 - t_1, a_1, da_2)\cdots P(t_n - t_{n-1}, a_{n-1}, da_n).$$
We prove this for $n = 2$. We have, since $x_{t_2} - x_{t_1} \overset{L}{=} x_{t_2 - t_1}$,
$$\int_{a_1 \in E_1}\int_{a_2 \in E_2} P(t_1, a, da_1)\,P(t_2 - t_1, a_1, da_2) = \int_{a_1 \in E_1} P(t_1, a, da_1)\,P(t_2 - t_1, a_1, E_2)$$
$$= \int_{a_1 \in E_1} P(x_{t_1} \in da_1 - a)\,P(x_{t_2 - t_1} \in E_2 - a_1) = \int_{a_1 \in E_1} P(x_{t_1} \in da_1 - a)\,P\big(x_{t_2} - x_{t_1} \in E_2 - a - (a_1 - a)\big)$$
$$= P\big[(x_{t_1}, x_{t_2} - x_{t_1}) \in (E_2 - a)^\sim \cap ((E_1 - a) \times R)\big] = P[x_{t_1} \in E_1 - a,\ x_{t_2} \in E_2 - a] = P_a[x_{t_1} \in E_1,\ x_{t_2} \in E_2],$$
where $(E_2 - a)^\sim = \{(\xi, \eta) : (\xi, \eta) \in R^2 \text{ and } \xi + \eta \in E_2 - a\}$.

Thus $M$ is a Markov process. Further, since $H_t f(a) = \int f(b)\,P(t, a, db) = \int f(a + b)\,P(x_t \in db)$, we see that $H_t(C(R)) \subset C(R)$. $M$ is thus strongly Markov. $M$ is conservative. Recall that $W_{d_1}$ consists of all functions which are of $d_1$ type before their killing time. We have, with $\sigma$ denoting the killing time,
$$P_a(\sigma = \infty) = P_a(w : w(n) \in R \text{ for every integer } n) = \lim_n P_a(w(n) \in R) = \lim_n P(w : x_n(w) \in R) = 1.$$
Also $M$ is translation invariant, i.e. if $\tau_h b = b + h$ then $P_{\tau_h a}(\tau_h B) = P_a(B)$.

Conversely, any conservative translation-invariant Markov process with state space $R$ can be got in the above way from a temporally homogeneous Lévy process.

We shall now prove that the kernel of $G_\alpha$ is the set of functions which are zero a.e., i.e. $G_\alpha f = 0$ implies $f = 0$ a.e. To prove this, firstly observe that $G_\alpha f = 0$ implies $H_t f(a) = 0$ for almost all $t$. Hence we can find a sequence $t_n \downarrow 0$ such that $H_{t_n} f(a) = 0$. Now $\int f(a + b)\,\pi(t_n, db) = 0$. Since $f$ is bounded, it is locally summable. Hence for any interval $(\beta, \gamma)$ we have
$$\int_\beta^\gamma da \int f(a + b)\,\pi(t_n, db) = 0,$$
i.e.
$$0 = \int \pi(t_n, db)\int_\beta^\gamma f(a + b)\,da = \int \pi(t_n, db)\int_{\beta + b}^{\gamma + b} f(a)\,da = \int g(b)\,\pi(t_n, db),$$
where $g(b) = \int_{\beta + b}^{\gamma + b} f(a)\,da$ is continuous. It follows that $\int g(b)\,\pi(t_n, db) \to g(0)$ as $t_n \to 0$, i.e. $\int_\beta^\gamma f(a)\,da = 0$. Since this is true for every interval $(\beta, \gamma)$, $f = 0$ a.e. This proves our contention.
Generator. It is difficult to determine the generator of this process in the general case. However, we will determine $Gu$ when $u$ satisfies some conditions.

Theorem 1. Let $\hat f(\xi) = \int e^{ia\xi} f(a)\,da$ denote the Fourier transform of $f$. If $u = G_\alpha h$ with $h \in L_1(-\infty, \infty)$, then $u \in L_1$ and $\hat u = \dfrac{\hat h}{\alpha - \psi}$. Therefore $Gu \in L_1$ and $\widehat{Gu} = \psi\,\hat u$.

Proof. Let $\pi(t, E) = P(x_t \in E)$. Then if $f \ge 0$ we have
$$\int H_t f(a)\,da = \int da \int f(a + b)\,\pi(t, db) = \int \pi(t, db)\int f(a + b)\,da = \int \pi(t, db)\int f(a)\,da = \int f(a)\,da,$$
so that if $f \in L_1$ so is $H_t f(a)$. Now $H_t f(a)$ is measurable in the pair $(t, a)$. We have similarly, if $f \ge 0$,
$$\int G_\alpha f(a)\,da = \int da\int_0^\infty e^{-\alpha t} H_t f(a)\,dt = \int_0^\infty e^{-\alpha t}\,dt\int H_t f(a)\,da = \int_0^\infty e^{-\alpha t}\,dt\int f(a)\,da = \frac1\alpha\int f(a)\,da,$$
so that $G_\alpha f(a) \in L_1$. Therefore
$$\hat u(\xi) = \int e^{ia\xi}\,da\int_0^\infty e^{-\alpha t}\,dt\int h(a + c)\,\pi(t, dc).$$
Since
$$\iiint e^{-\alpha t}\,|h(a + c)|\,da\,dt\,\pi(t, dc) = \int G_\alpha|h|(a)\,da = \frac1\alpha\int |h(a)|\,da < \infty,$$
we can interchange the orders of integration as we like. We have
$$\hat u(\xi) = \int_0^\infty e^{-\alpha t}\,\hat h(\xi)\int e^{ic\xi}\,\pi(t, dc)\,dt = \int_0^\infty e^{-\alpha t}\,\hat h(\xi)\,e^{t\psi(\xi)}\,dt = \frac{\hat h(\xi)}{\alpha - \psi(\xi)},$$
since $\int e^{ia\xi}\,\pi(t, da) = E(e^{i\xi x_t}) = e^{t\psi(\xi)}$, and since the real part of $\psi(\xi)$ is non-positive, $\int_0^\infty e^{-(\alpha - \psi(\xi))t}\,dt$ exists and equals $\frac{1}{\alpha - \psi(\xi)}$. Since $u = G_\alpha h$ is in $L_1$, $Gu = \alpha u - h \in L_1$. Also, from the last equation, $\alpha\hat u - \hat h = \psi\hat u$, so that $\widehat{Gu} = \psi\hat u$.

Corollary. If $\alpha > 0$ and $(\alpha - \psi)\hat u = \hat f$ for some function $f \in L_1$, then $u = G_\alpha f \in D(G)$ and $\widehat{Gu} = \psi\hat u$.

For we have from Theorem 1 that $\widehat{G_\alpha f} = \dfrac{\hat f}{\alpha - \psi} = \hat u$, so that $u = G_\alpha f$ (a.e.) and $\widehat{Gu} = \psi\hat u$.

Theorem 2. If $u$, $u'$ and $u''$ are in $L_1$, then $u \in D(G)$ and $Gu$ is given a.e. by
$$Gu(a) = m u'(a) + \frac{v}{2}u''(a) + \int \Big[u(a + b) - u(a) - \frac{b\,u'(a)}{1 + b^2}\Big]\,\nu(db).$$

Proof. Let $f_1 = m u'$, $f_2 = \frac{v}{2}u''$,
$$f_3 = \int_{|b| > 1}\Big[u(a + b) - u(a) - \frac{b\,u'(a)}{1 + b^2}\Big]\,\nu(db), \qquad f_4 = \int_{|b| \le 1}\frac{b^3}{1 + b^2}\,u'(a)\,\nu(db),$$
$$f_5 = \int_{|b| \le 1}\big[u(a + b) - u(a) - b\,u'(a)\big]\,\nu(db).$$
From the hypothesis we see that $f_i \in L_1$, $i = 1, 2, 3, 4$. We prove that $f_5$ exists and is in $L_1$. We have
$$u(a + b) - u(a) - b\,u'(a) = \int_0^b u'(a + x)\,dx - b\,u'(a) = \int_0^b \big[u'(a + x) - u'(a)\big]\,dx = \int_{x=0}^b dx\int_{y=0}^x u''(a + y)\,dy.$$
Therefore
$$\int_{-\infty}^\infty da\int_{|b| \le 1}\big|u(a + b) - u(a) - b\,u'(a)\big|\,\nu(db) \le \int da\int_{|b| \le 1}\nu(db)\int_{x=0}^b dx\int_{y=0}^x |u''(a + y)|\,dy$$
$$= \int_{|b| \le 1}\nu(db)\int_{x=0}^b dx\int_{y=0}^x dy\int |u''(a + y)|\,da = \int_{|b| \le 1}\nu(db)\int_{x=0}^b dx\int_{y=0}^x dy\,\|u''\|_1 = \frac{\|u''\|_1}{2}\int_{|b| \le 1}b^2\,\nu(db) < \infty.$$
This shows that $f_5$ exists, is in $L_1$, and $\|f_5\|_1 \le \frac{\|u''\|_1}{2}\int_{|b| \le 1}b^2\,\nu(db)$. We can easily see that
$$\hat f_1(\xi) = i m\xi\,\hat u(\xi), \qquad \hat f_2(\xi) = -\frac{v}{2}\xi^2\,\hat u(\xi), \qquad \hat f_3(\xi) = \hat u(\xi)\int_{|b| > 1}\Big[e^{ib\xi} - 1 - \frac{ib\xi}{1 + b^2}\Big]\,\nu(db),$$
and $\hat f_4(\xi) = i\xi\,\hat u(\xi)\int_{|b| \le 1}\frac{b^3}{1 + b^2}\,\nu(db)$. Further $f_5 \in L_1$, and we have seen that $\iint |u(a + b) - u(a) - b\,u'(a)|\,\nu(db)\,da$ exists as a double integral. Hence we can interchange the order of integration in
$$\iint e^{ia\xi}\big[u(a + b) - u(a) - b\,u'(a)\big]\,\nu(db)\,da.$$
Thus we have $\hat f_5(\xi) = \hat u(\xi)\int_{|b| \le 1}\big[e^{ib\xi} - 1 - ib\xi\big]\,\nu(db)$. Hence finally, if $f = f_1 + \dots + f_5$,
$$\hat f(\xi) = \hat f_1(\xi) + \dots + \hat f_5(\xi) = \psi(\xi)\,\hat u(\xi).$$
We have $[\alpha - \psi(\xi)]\,\hat u(\xi) = \alpha\hat u - \hat f = \widehat{\alpha u - f}$. Using the corollary of Theorem 1, we see that $u \in D(G)$ and $u = G_\alpha[\alpha u - f]$, so that $Gu = \alpha u - (\alpha u - f) = f$ (a.e.). This proves the theorem.

Remark. If $\pi(t, E)$ is symmetric, then $Gu(a) = \frac{v}{2}u''(a) + \int_0^\infty \big[u(a + b) + u(a - b) - 2u(a)\big]\,\nu(db)$. In the case of a symmetric Cauchy process $v = 0$ and
$$Gu(a) = \int_0^\infty \big[u(a + b) + u(a - b) - 2u(a)\big]\,\nu(db).$$
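For a compound Poisson process with a symmetric jump law (so $m = v = 0$ and $\nu$ finite), the drift correction inside the integral of Theorem 2 vanishes and the formula reduces to $Gu(a) = \int [u(a+b) - u(a)]\,\nu(db)$. The sketch below checks this against the finite-difference approximation $(H_t u(a) - u(a))/t$ for small $t$; the jump law, test function and step size are illustrative assumptions of mine, and the agreement is only up to Monte Carlo and discretisation error.

```python
import numpy as np

rng = np.random.default_rng(7)

# Compound Poisson process: rate lam, jump law N(0, 1) (symmetric, so m = v = 0).
# Generator:  Gu(a) = lam * E[ u(a + b) - u(a) ],   b ~ N(0, 1).
lam, a, t = 2.0, 0.3, 0.05
u = lambda x: np.sin(x)

b = rng.normal(0.0, 1.0, 1_000_000)
Gu = lam * np.mean(u(a + b) - u(a))

# Compare with (H_t u(a) - u(a)) / t, where H_t u(a) = E[ u(a + x_t) ].
x_t = np.array([rng.normal(0.0, 1.0, rng.poisson(lam * t)).sum() for _ in range(400_000)])
print(Gu, (np.mean(u(a + x_t)) - u(a)) / t)   # roughly equal
```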
7 Multidimensional Lévy processes

A $k$-dimensional stochastic process $(x_t)$ is called a $k$-dimensional Lévy process if it is additive, almost all sample functions are $d_1$ and it has no point of fixed discontinuity; note that, unlike the $k$-dimensional Brownian motion, the component processes need not be independent.

A $k$-dimensional random variable $x$ is called Gaussian if and only if $E(e^{i(\alpha, x)}) = e^{i(m, \alpha) - \frac12(v\alpha, \alpha)}$, where $m$ is a vector, $v$ a positive definite matrix and $(a, b)$ denotes the scalar product of $a$ and $b$.

Let $x = (x^1, \dots, x^k)$ be a $k$-dimensional random variable such that for any real $c_1, \dots, c_k$, $\sum c_i x^i$ is a Gaussian variable. Then $x$ is also Gaussian. For $E(e^{i\lambda\sum\alpha_i x^i}) = e^{i\lambda m_\alpha - \frac{v_\alpha}{2}\lambda^2}$, where $m_\alpha = \sum\alpha_i m_i$, $v_\alpha = E\big((\sum\alpha_i(x^i - m_i))^2\big)$ with $m_i = E(x^i)$. Now $v_\alpha = \sum\alpha_i^2 v_{ii} + 2\sum_{i<j}\alpha_i\alpha_j v_{ij} = (v\alpha, \alpha)$, where $v_{ij} = E((x^i - m_i)(x^j - m_j))$ and $v = (v_{ij})$. Since $v_\alpha \ge 0$, $v$ is a positive definite matrix. Putting $\lambda = 1$ we have $E(e^{i(\alpha, x)}) = E(e^{i\sum\alpha_i x^i}) = e^{i(m, \alpha) - \frac12(v\alpha, \alpha)}$.

Thus, if almost all sample functions of a $k$-dimensional Lévy process $(x_t)$ are continuous, then $x_t - x_s$ is Gaussian.

Let $(x_t(w))$ be a $k$-dimensional Lévy process. Proceeding exactly as in the case $k = 1$ we can show that
$$x_t = g(t) + \int_{R^k \times [0,t]}\Big[u\,p(ds\,du) - \frac{u}{1 + |u|^2}\,\nu(ds\,du)\Big],$$
where $g(t)$ is continuous; hence we can obtain
$$\log E\big(e^{i(\alpha, x_t)}\big) = i(m(t), \alpha) - \frac12(v(t)\alpha, \alpha) + \int_{R^k \times [0,t]}\Big[e^{i(\alpha, b)} - 1 - \frac{i(\alpha, b)}{|b|^2 + 1}\Big]\,\nu(ds\,db).$$
If $\nu \equiv 0$ the path functions are continuous.

If $(x_t)$ is rotation invariant, i.e. if $E(e^{i(\alpha, x_t)}) = E(e^{i(\alpha, Ox_t)})$ where $O$ is any rotation, then, since $(\alpha, O^{-1}x_t) = (O\alpha, x_t)$, we have $(m(t), O\alpha) = (m(t), \alpha)$ and $(v(t)O\alpha, O\alpha) = (v(t)\alpha, \alpha)$. Since this is true for every rotation $O$, we should have $m(t) \equiv 0$ and $v(t)$ a diagonal matrix in which all the diagonal elements are the same, and we can write
$$\log E\big(e^{i(\alpha, x_t)}\big) = -\frac12 v(t)|\alpha|^2 + \int_{[0,t] \times R^k}\Big[e^{i(\alpha, b)} - 1 - \frac{i(\alpha, b)}{1 + |b|^2}\Big]\,\nu(ds\,db).$$
If the process is temporally homogeneous, $E(e^{i(\alpha, x_t)}) = e^{t\psi(\alpha)}$, where
$$\psi(\alpha) = i(m, \alpha) - \frac12(v\alpha, \alpha) + \int_{R^k}\Big[e^{i(\alpha, b)} - 1 - \frac{i(\alpha, b)}{1 + |b|^2}\Big]\,\nu(db).$$
Now suppose that $(x_t)$ is a stable process, i.e. $(x_t)$ is temporally homogeneous and $x_t \overset{L}{=} c_t x_1$. We can show (proceeding in the same way as for $k = 1$) that $\nu(aE) = \frac{1}{a^c}\nu(E)$ for $a > 0$. Now we prove that $0 < c < 2$ unless $\nu \equiv 0$. Let $E = \{b : 1 \ge |b| > \frac12\}$. Since $\int_{|b| \le 1}|b|^2\,\nu(db) < \infty$ we have
$$\sum_{n=0}^\infty\int_{\frac12\cdot\frac{1}{2^n} < |b| \le \frac{1}{2^n}}|b|^2\,\nu(db) < \infty,$$
so that
$$\sum_{n=0}^\infty \frac{1}{2^2}\,\frac{1}{2^{2n}}\,\nu\Big(\Big\{b : \frac{1}{2^n} \ge |b| > \frac12\cdot\frac{1}{2^n}\Big\}\Big) < \infty, \quad \text{i.e.} \quad \sum \frac{1}{2^{2n}}\,\nu\Big(\frac{1}{2^n}E\Big) < \infty.$$
Hence, since $\nu(rE) = \frac{1}{r^c}\nu(E)$, we should have $\nu(E)\sum \frac{2^{nc}}{2^{2n}} < \infty$. If $\nu(E) \ne 0$, then $c < 2$. Similarly, considering $\int_{|b| \ge 1}\nu(db) < \infty$, we can prove that $c > 0$.

Let $S$ denote the surface of the unit sphere in $R^k$. Then $R^k$ minus the point $(0, 0, \dots, 0)$ can be regarded as the product of $S$ and the half line $(0, \infty)$. For any Borel subset $\Gamma$ of $S$ let $c^{-1}\nu^+(\Gamma) = \nu(\Gamma\cdot[1, \infty))$. Then $c^{-1}\nu^+(d\theta)$ is a measure on $B(S)$ and
$$\nu(\Gamma\cdot[r, \infty)) = \nu(r\,\Gamma\cdot[1, \infty)) = \frac{1}{r^c}\,c^{-1}\nu^+(\Gamma) = \int_{[r,\infty)}\frac{dr'}{r'^{c+1}}\,\nu^+(\Gamma).$$
It follows that $\nu(db) = \dfrac{dr}{r^{c+1}}\,\nu^+(d\theta)$.

If $x_t$ is rotation invariant, $\nu^+(d\theta)$ will be rotation invariant and hence must be the uniform distribution $d\theta$, so that $\nu(db) = \text{const.}\,\dfrac{dr\,d\theta}{r^{c+1}}$.

We can consider a $k$-dimensional temporally homogeneous Lévy process $(x_t)$ as a Markov process with state space $R^k$, and we can prove that if $f \in L_1(R^k)$ and $u = G_\alpha f$, then $(\alpha - \psi(\xi))\hat u(\xi) = \hat f(\xi)$ and $\widehat{Gu}(\xi) = \psi(\xi)\hat u(\xi)$. If $u \in L_1$ and $(\alpha - \psi(\xi))\hat u(\xi) = \hat f(\xi)$ for some $f \in L_1$, then $u \in D(G)$ and $\widehat{Gu}(\xi) = \psi(\xi)\hat u(\xi)$. To prove this, let $\hat f = (\alpha - \psi)\hat u$ and $v = G_\alpha f$. Then $(\alpha - \psi)\hat v = \hat f = (\alpha - \psi)\hat u$, so that $u = v$ a.e. and $u \in D(G)$.

Now suppose that $(x_t)$ is stable and rotation invariant. We can show that $\psi(\alpha) = |\alpha|^c\,\psi(\alpha/|\alpha|)$, so that if $\psi(\alpha)$ is rotation invariant, $\psi(\alpha) = \kappa|\alpha|^c$, $\kappa$ constant. If we look at the expression for $\psi(\alpha)$, we see that the real part of $\psi(\alpha)$ is $\le 0$. It follows that $\kappa \le 0$. In this case we thus have
$$\widehat{Gu}(\xi) = \kappa\,\hat u(\xi)\,|\xi|^c, \quad \text{i.e.} \quad \hat u(\xi) = \frac{1}{\kappa|\xi|^c}\,\widehat{Gu}(\xi).$$
The Fourier transform (in the distribution sense) of $|a|^{c-k}$ is $\gamma|\xi|^{-c}$, $\gamma = \pi^{k/2}\,2^c\,\Gamma(c/2)/\Gamma\big(\frac{k-c}{2}\big)$ (refer to "Théorie des distributions" by Schwartz, page 113, Example 5). Since $Gu$ is bounded, it is a rapidly decreasing distribution. Hence (see page 124, "Théorie des distributions", Schwartz)
$$\hat u(\xi) = A\,\widehat{Gu}(\xi)\,\widehat{\frac{1}{|a|^{k-c}}}(\xi) = A\,\widehat{Gu * \frac{1}{|a|^{k-c}}}(\xi), \qquad A = \frac{1}{\kappa\gamma}.$$
Therefore $u(a) = A\int Gu(b)\,\dfrac{1}{|a - b|^{k-c}}\,db$. Thus $\dfrac{1}{|a - b|^{k-c}}$ is the potential kernel corresponding to this process. Potentials with such kernels are called Riesz potentials.

When $c = 2$, $u(a) = A\,Gu * \dfrac{1}{|a|^{k-2}}$, so that $\Delta u(a) = A\,Gu * \Delta\dfrac{1}{|a|^{k-2}} = A'\,Gu(a)$.
Section 5
Stochastic Differential Equations

1 Introduction

The standard Brownian motion is a one-dimensional diffusion whose generator is $\frac12\frac{d^2}{da^2}$. We shall here construct a more general one-dimensional diffusion whose generator $G$ is the differential operator
$$D = \frac12 p^2(a)\frac{d^2}{da^2} + r(a)\frac{d}{da};$$
precisely, if $u \in C^2(R) = \{u : u, u', u'' \text{ continuous and bounded}\}$, then $u \in D(G)$ and $Gu = Du$. To do this we consider the stochastic differential equation
$$dx_t = p(x_t)\,d\beta_t + r(x_t)\,dt,$$
where $\beta_t$ is a Wiener process. The meaning of the above equation is
$$x_u - x_t = \int_t^u p(x_s)\,d\beta_s + \int_t^u r(x_s)\,ds, \qquad 0 \le t < u < \infty.$$
The meaning of $\int_t^u p(x_s)\,d\beta_s$ has to be made clear; we do this in article 3. Note that it cannot be interpreted as a Stieltjes integral for a fixed path, because it can be shown that, as a function of $s$, $\beta_s$ is not of bounded variation for almost all paths. We make the following formal considerations, postponing the definition of the integral to §3.

Let $x^{(a)}_t$ be a solution of the differential equation with the initial condition $x^{(a)}_0 = a$, i.e. let $x^{(a)}_t$ be a solution of the integral equation
$$x_t = a + \int_0^t p(x_s)\,d\beta_s + \int_0^t r(x_s)\,ds.$$
Then, under certain regularity conditions on $p$ and $r$, we can define a strong Markov process $M = (S, W, P_a)$ with $S = R$, $W = W_c(R)$, $P_a(B) = P(x^{(a)} \in B)$, and such that
$$Gu(a) = \frac12 p^2(a)\frac{d^2u}{da^2} + r(a)\frac{du}{da}, \qquad u \in C^2(R),$$
where $G$ is the generator in the restricted sense. The same can be done in the multi-dimensional case, replacing $\beta$, $p$, $r$ by a multi-dimensional Wiener process, a matrix-valued function and a vector-valued function respectively. Componentwise we will have
$$dx^i_t = \sum_j p^i_j(x_t)\,d\beta^j_t + r^i(x_t)\,dt, \qquad i = 1, \dots, n,$$
and the generator will be given by
$$Gu(a) = \frac12\sum_{i,j} q^{ij}(a)\frac{\partial^2 u(a)}{\partial a_i\,\partial a_j} + \sum_i r^i(a)\frac{\partial u(a)}{\partial a_i}, \qquad \text{where } q^{ij} = \sum_k p^i_k\,p^j_k.$$
Taking local coordinates we can extend the above to the case in which the state space $S$ is a manifold.
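The equation $dx_t = p(x_t)\,d\beta_t + r(x_t)\,dt$ suggests the obvious discrete scheme $x_{t+\Delta t} \approx x_t + p(x_t)\,\Delta\beta + r(x_t)\,\Delta t$ (the Euler scheme), which is also how solutions are usually simulated. The following sketch is an illustration only; the coefficient functions and step sizes are my own choices, and convergence questions are not addressed here.

```python
import numpy as np

rng = np.random.default_rng(11)

def euler_maruyama(a, p, r, T=1.0, n=10_000):
    """Approximate a solution of dx_t = p(x_t) d(beta_t) + r(x_t) dt with x_0 = a
    by the Euler scheme x_{t+dt} = x_t + p(x_t) * d(beta) + r(x_t) * dt."""
    dt = T / n
    x = np.empty(n + 1)
    x[0] = a
    for i in range(n):
        d_beta = np.sqrt(dt) * rng.standard_normal()
        x[i + 1] = x[i] + p(x[i]) * d_beta + r(x[i]) * dt
    return x

# Example: p(a) = 0.4 * a, r(a) = 0.1 * a (a geometric Brownian motion), started at a = 1.
path = euler_maruyama(1.0, lambda a: 0.4 * a, lambda a: 0.1 * a)
print(path[-1])
```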
Coming back to stochastic integrals, we prove the following theorem, which shows that $\int_t^u f(s, w)\,d\beta(s, w)$ cannot be interpreted as a Stieltjes integral.

Theorem. Let $\Delta$ be the subdivision $t = s_0 < s_1 < \dots < s_n = u$, and $\delta(\Delta) = \max_i(s_{i+1} - s_i)$. Then

1. $r_2(\Delta) = \sum_i \big(\beta(s_{i+1}) - \beta(s_i)\big)^2 \to u - t$ in $L_2$-mean as $\delta(\Delta) \to 0$;

2. $r(\beta, t, u) = \sup_\Delta \sum_i |\beta(s_{i+1}) - \beta(s_i)| = \infty$ with probability 1.

Proof. (1) $E(r_2(\Delta)) = \sum_i E\big[(\beta(s_{i+1}) - \beta(s_i))^2\big] = \sum_i(s_{i+1} - s_i) = u - t$, and
$$E(r_2(\Delta)^2) = \sum_i E\big[(\beta(s_{i+1}) - \beta(s_i))^4\big] + 2\sum_{i<j} E\big[(\beta(s_{i+1}) - \beta(s_i))^2(\beta(s_{j+1}) - \beta(s_j))^2\big]$$
$$= \sum_i 3(s_{i+1} - s_i)^2 + 2\sum_{i<j} E\big[(\beta(s_{i+1}) - \beta(s_i))^2\big]\,E\big[(\beta(s_{j+1}) - \beta(s_j))^2\big]$$
$$= \sum_i 3(s_{i+1} - s_i)^2 + 2\sum_{i<j}(s_{i+1} - s_i)(s_{j+1} - s_j) = 2\sum_i(s_{i+1} - s_i)^2 + \Big[\sum_i(s_{i+1} - s_i)\Big]^2 = 2\sum_i(s_{i+1} - s_i)^2 + (u - t)^2,$$
because $\beta(s_{i+1}) - \beta(s_i)$ and $\beta(s_{j+1}) - \beta(s_j)$ are independent for $i \ne j$ and $E\big[(\beta(t) - \beta(s))^4\big] = 3(t - s)^2$. We thus have
$$E\big[(r_2(\Delta) - (u - t))^2\big] = E(r_2(\Delta)^2) - (u - t)^2 = 2\sum_i(s_{i+1} - s_i)^2 \le 2\delta(\Delta)\sum_i(s_{i+1} - s_i) \to 0 \quad \text{as } \delta(\Delta) \to 0.$$

(2) From (1) we can find a sequence $\Delta_n = (t = s^{(n)}_0 < \dots < s^{(n)}_{p_n} = u)$ such that $r_2(\Delta_n) \to u - t$ with probability 1. We have
$$r(\beta, t, u) \ge \sum_i\big|\beta(s^{(n)}_{i+1}) - \beta(s^{(n)}_i)\big| \ge \frac{\sum_i\big|\beta(s^{(n)}_{i+1}) - \beta(s^{(n)}_i)\big|^2}{\max_i\big|\beta(s^{(n)}_{i+1}) - \beta(s^{(n)}_i)\big|} \to \infty,$$
since $\sum_i|\beta(s^{(n)}_{i+1}) - \beta(s^{(n)}_i)|^2 \to u - t$ and $\max_i|\beta(s^{(n)}_{i+1}) - \beta(s^{(n)}_i)| \to 0$ because of the continuity of the path functions.
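Both statements of the theorem are visible numerically: the sum of squared increments of a Wiener path over $[0, 1]$ stabilises near $u - t = 1$ as the subdivision is refined, while the sum of absolute increments blows up. In the sketch below each subdivision uses an independent sample path for simplicity; this is an assumption of the illustration, not of the theorem.

```python
import numpy as np

rng = np.random.default_rng(8)

# Quadratic variation of a Wiener path on [t, u] = [0, 1]:
# sum of squared increments tends to 1; sum of absolute increments grows without bound.
for n in [10, 100, 1000, 10_000, 100_000]:
    d_beta = np.sqrt(1.0 / n) * rng.standard_normal(n)   # increments over an equal subdivision
    print(n, (d_beta**2).sum(), np.abs(d_beta).sum())
```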
2 Stochastic integral (I): Function spaces E, L_2, L_s

Let $T$ be a time interval $[u, v)$, $0 \le u < v < \infty$, and $\beta_t$, $t \in T$, a Wiener process, i.e. (1) the sample functions are continuous for almost all $w$, (2) $P(\beta_t - \beta_s \in E) = \int_E \frac{1}{\sqrt{2\pi(t-s)}}\,e^{-x^2/2(t-s)}\,dx$, and (3) $\beta_{t_1},\ \beta_{t_2} - \beta_{t_1},\ \dots,\ \beta_{t_n} - \beta_{t_{n-1}}$ are independent if $t_1 < \dots < t_n \in T$. Let $B_t$, $t \in T$, be a monotone increasing system of Borel subalgebras of $B$ such that $B_t$ includes all null sets for each $t$, $\beta_t \in (B_t)$, and $\beta_{t+h} - \beta_t$ is independent of $B_t$ for $h > 0$.

We shall use the notation $f \in (B)$ to denote that $f$ is $B$-measurable.

Let $L_s$ be the set of all functions $f$ such that (1) $f$ is measurable in $(t, w)$, (2) $f_t \in (B_t)$ for almost all $t \in T$, and (3) $\int_T f_t^2\,dt < \infty$ for almost all $w \in \Omega$. Instead of (3) we also consider the two stronger conditions:

(3′) $\int_\Omega\int_T f(t, w)^2\,dt\,dP < \infty$;

(3″) there exist a subdivision $u = t_0 < t_1 < \dots < t_n = v$ and $M < \infty$ such that
$$f_t(w) = f_{t_i}(w), \quad t_i \le t < t_{i+1}, \quad 0 \le i \le n - 1, \quad \text{and} \quad |f_t(w)| \le M.$$

We define the function spaces $L_2$ and $E$ by
$$L_2 = \{f : (1), (2) \text{ and } (3') \text{ hold}\}, \qquad E = \{f : (1), (2) \text{ and } (3'') \text{ hold}\}.$$
Clearly $E \subset L_2 \subset L_s$. $L_2$ is a (real) Hilbert space with the norm
$$\|f\|^2 = \int_\Omega\int_T |f|^2\,dt\,dP,$$
and $L_s$ is a (real) Fréchet space with the norm
$$\|f\|_{L_s} = \int_\Omega \frac{\big(\int_T |f|^2\,dt\big)^{1/2}}{1 + \big(\int_T |f|^2\,dt\big)^{1/2}}\,dP.$$
$\|f\|_{L_s} \to 0$ is equivalent to $\int_T |f|^2\,dt \to 0$ in probability, and if $f \in L_2$ then $\|f\|_{L_s} \le \|f\|$.

Theorem 1. (1) $E$ is dense in $L_2$ (with the norm $\|\cdot\|$). (2) $E$ is dense in $L_s$ (with the norm $\|\cdot\|_{L_s}$).

Proof. (1) We shall prove that, given $f \in L_2$, there exists a sequence $f_n \in E$ such that $\|f_n - f\| \to 0$. We can assume that $f$ is bounded. Put $f(t, w) = 0$ for $t \notin T$. Then $f$ is defined for all $t$ (this is to avoid changing $T$ each time) and $\int_\Omega\int_{-\infty}^\infty f^2\,dt\,dP < \infty$, so that $\int_{-\infty}^\infty f^2\,dt < \infty$ for almost all $w$. Therefore
$$\int_{-\infty}^\infty |f(t + h) - f(t)|^2\,dt \to 0 \quad \text{as } h \to 0.$$
Also $\int_{-\infty}^\infty |f(t + h) - f(t)|^2\,dt \le 4\int_{-\infty}^\infty f(t)^2\,dt \in L_1(\Omega)$. We get
$$\int_\Omega\int_{-\infty}^\infty |f(t + h) - f(t)|^2\,dt\,dP \to 0 \quad \text{as } h \to 0.$$
If $\varphi_n(t) = \dfrac{[2^n t]}{2^n}$, $n \ge 1$, then
$$\int_\Omega\int_{-\infty}^\infty |f(s + \varphi_n(t)) - f(s + t)|^2\,ds\,dP \to 0 \quad \text{as } n \to \infty.$$
Also $\int_\Omega\int |f(s + \varphi_n(t)) - f(s + t)|^2\,ds\,dP \le 4\int_\Omega\int f(s)^2\,ds\,dP$. Since $T = [u, v)$ is a finite interval,
$$\int_{u-1}^v\int_\Omega\int |f(s + \varphi_n(t)) - f(s + t)|^2\,ds\,dP\,dt \to 0 \quad \text{as } n \to \infty,$$
i.e.
$$\int ds\int_{u-1}^v\int_\Omega |f(s + \varphi_n(t)) - f(s + t)|^2\,dP\,dt \to 0 \quad \text{as } n \to \infty.$$
Therefore there exists a subsequence $\{n_i\}$ such that $\int_{u-1}^v\int_\Omega |f(s + \varphi_{n_i}(t)) - f(s + t)|^2\,dP\,dt \to 0$ for almost all $s$. Choose such an $s \in [0, 1]$ and fix it. Then
$$\int_{u-1}^v\int_\Omega |f(s + \varphi_{n_i}(t)) - f(s + t)|^2\,dP\,dt \to 0 \quad \text{as } n_i \to \infty.$$
Changing the variable,
$$\int_{u-1+s}^{v+s}\int_\Omega |f(s + \varphi_{n_i}(t - s)) - f(t)|^2\,dP\,dt \to 0 \quad \text{as } n_i \to \infty;$$
since $0 \le s \le 1$,
$$\int_u^v\int_\Omega |f(s + \varphi_{n_i}(t - s)) - f(t)|^2\,dP\,dt \to 0.$$
Let $h_i(t) = f(s + \varphi_{n_i}(t - s))$. Then $h_i \in E$ and $\|h_i - f\| \to 0$.

(2) Let $f \in L_s$. We prove that there exists a sequence $f_n \in E$ with $\|f_n - f\|_{L_s} \to 0$. We can assume that $f$ is bounded, so that $f \in L_2$. We can find $f_n \in E$ such that $\|f_n - f\| \to 0$. But
$$\|f_n - f\|_{L_s} \le \int_\Omega\Big(\int_T |f_n - f|^2\,dt\Big)^{1/2}\,dP \le \Big(\int_\Omega\int_T |f_n - f|^2\,dt\,dP\Big)^{1/2} = \|f_n - f\| \to 0.$$

Remark. Let $f^M$ be the truncation of $f$ by $M$, i.e. $f^M = (f \vee (-M)) \wedge M$, and for a subdivision $\Delta = (u = t_0 < t_1 < \dots < t_n = v)$ let $f^\Delta$ be the function $f^\Delta(t, w) = f(t_i, w)$, $t_i \le t < t_{i+1}$, $0 \le i \le n - 1$. Then the approximating functions $f_n$ in the above theorem are of the form $f_n = (f^{M_n})^{\Delta_n}$ for some $M_n$, $\Delta_n$.
3 Stochastic integral (II): Definition and properties

Let $L_2(\Omega)$ be the real $L_2$-space with the usual $L_2$-norm $\|\cdot\|$, and let $S(\Omega)$ be the space of all measurable functions with the norm
$$\|f\|_s = \int \frac{|f(w)|}{1 + |f(w)|}\,dP(w).$$
$S(\Omega)$ is a real Fréchet space and $\|f_n - f\|_s \to 0$ is equivalent to $f_n \to f$ in probability. Clearly $L_2(\Omega) \subset S(\Omega)$, and if $f \in L_2(\Omega)$, $\|f\|_s \le \|f\|$.

We first define $I(f) = \int_T f\,d\beta$ for $f \in E$, show that it is continuous in the norms $\|\cdot\|$, $\|\cdot\|_s$, and hence that it is extendable to $L_2$ and $L_s$.

We define, for $f \in E$,
$$I(f) = \int_T f_t\,d\beta_t = \sum_{i=0}^{n-1} f(t_i)\big(\beta(t_{i+1}) - \beta(t_i)\big),$$
where $t_0 = u < t_1 < \dots < t_n = v$ is any subdivision by which $f$ is expressed. This definition is independent of the division points with respect to which $f$ is expressed, and $I(f) \in L_2(\Omega) \subset S(\Omega)$. That $I$ is linear is easy to see, and
$$E(I(f)) = \sum_i E\big[f(t_i)\big(\beta(t_{i+1}) - \beta(t_i)\big)\big] = \sum_i E(f(t_i))\,E\big(\beta(t_{i+1}) - \beta(t_i)\big) = 0,$$
since $f(t_i)$ and $\beta(t_{i+1}) - \beta(t_i)$ are independent and $E(\beta(t)) = 0$.

Now we prove the following:

(A) $\|f\| = \|I(f)\|$. (Though we use the same notation, note that $f \in L_2$, while $I(f) \in L_2(\Omega)$.)

(B) $\|I(f)\|_s = O(\|f\|_{L_s}^{1/3})$.

Proof of (A). Let $(f, g) = E\big(\int_T fg\,dt\big)$. It is enough to show that $(f, g) = (I(f), I(g))$. Let $f$, $g$ be expressed by the division points $(t_i)$. Then
$$(I(f), I(g)) = \Big(\sum f_i X_i,\ \sum g_j X_j\Big)$$
with $f_i = f(t_i)$, $g_j = g(t_j)$, $X_i = \beta(t_{i+1}) - \beta(t_i)$. Note that $f_i \in (B_{t_i})$ and $X_i$ is independent of $B_{t_i}$. The cross terms vanish, since for $i < j$ the variable $f_i X_i g_j$ is in $(B_{t_j})$ and $X_j$ is independent of $B_{t_j}$ with $E(X_j) = 0$. We have therefore
$$(I(f), I(g)) = \sum_i E(f_i g_i)\,E(X_i^2) = \sum_i E(f_i g_i)(t_{i+1} - t_i) = E\Big[\sum_i f_i g_i(t_{i+1} - t_i)\Big] = E\Big[\int_T fg\,dt\Big] = (f, g).$$
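Property (A), the isometry $\|I(f)\| = \|f\|$, can be checked by simulation. The integrand used below, $f(t, w) = \beta_t(w)$, is adapted and belongs to $L_2$ (it is unbounded, hence not literally in $E$), to which the isometry extends as shown later in this article; the grid and sample sizes are my own choices.

```python
import numpy as np

rng = np.random.default_rng(9)

# Check ||I(f)||^2 = ||f||^2 for the step approximation of f(t, w) = beta_t(w) on [0, 1].
n, n_rep, T = 200, 50_000, 1.0
dt = T / n
d_beta = np.sqrt(dt) * rng.standard_normal((n_rep, n))
beta = np.cumsum(d_beta, axis=1) - d_beta            # beta(t_i), starting from beta(0) = 0
I_f = (beta * d_beta).sum(axis=1)                    # I(f) = sum f(t_i)(beta(t_{i+1}) - beta(t_i))

lhs = np.mean(I_f**2)                                # ||I(f)||^2 = E[ I(f)^2 ]
rhs = np.mean((beta**2 * dt).sum(axis=1))            # ||f||^2 = E[ integral of f^2 dt ]
print(lhs, rhs)                                      # both close to T^2 / 2 = 0.5
```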
Proof of (B). Let $f$ be expressed by the division points $(t_i)$ and put $f_i = f(t_i)$, $X_i = \beta(t_{i+1}) - \beta(t_i)$, $\delta_i = t_{i+1} - t_i$ and $\varepsilon = \|f\|_{L_s}$. Then
$$P\Big[\int_T f^2\,dt > \lambda^2\Big] \le \varepsilon\,\frac{1 + \lambda}{\lambda}.$$
Let
$$Y_i = 1 \text{ if } \sum_{j=0}^i f_j^2\,\delta_j \le \lambda^2, \qquad Y_i = 0 \text{ if } \sum_{j=0}^i f_j^2\,\delta_j > \lambda^2.$$
Then $Y_i \in (B_{t_i})$, and since $X_i$ is independent of $B_{t_i}$,
$$E\Big[\Big(\sum_{i=0}^{n-1} Y_i f_i X_i\Big)^2\Big] = \sum_{i=0}^{n-1} E(Y_i^2 f_i^2)\,\delta_i = E\Big[\sum_{i=0}^{n-1} Y_i f_i^2\,\delta_i\Big],$$
since from the definition of $Y_i$, $Y_i^2 = Y_i$. Again from the definition of $Y_i$, $\sum_{i=0}^{n-1} Y_i f_i^2\,\delta_i \le \lambda^2$, so that $E(S^2) \le \lambda^2$, where $S = \sum_{i=0}^{n-1} Y_i f_i X_i$. Now $P(|S| > \mu) \le \lambda^2/\mu^2$. If $\int_T f^2\,dt = \sum_i f_i^2\,\delta_i \le \lambda^2$ then $Y_0 = Y_1 = \dots = Y_{n-1} = 1$, so that $S = \sum f_i X_i = I(f)$. Therefore
$$P(I(f) \ne S) \le P\Big[\int_T f^2\,dt > \lambda^2\Big] \le \varepsilon\,\frac{1 + \lambda}{\lambda}, \qquad P(|I(f)| > \mu) \le \varepsilon\,\frac{1 + \lambda}{\lambda} + \frac{\lambda^2}{\mu^2},$$
and
$$\|I(f)\|_s = \int \frac{|I(f)|}{1 + |I(f)|}\,dP = \int_{|I(f)| \le \mu} \frac{|I(f)|}{1 + |I(f)|}\,dP + \int_{|I(f)| > \mu} \frac{|I(f)|}{1 + |I(f)|}\,dP \le \mu + \varepsilon\,\frac{1 + \lambda}{\lambda} + \frac{\lambda^2}{\mu^2}.$$
Putting $\lambda = \varepsilon^{2/3}$, $\mu = \varepsilon^{1/3}$, we get $\|I(f)\|_s \le 4\varepsilon^{1/3}$.

Using the linearity of $I$ and the fact $\|I(f)\| = \|f\|$ for $f \in E$, we can extend $I$ to $L_2$ (with $\|\cdot\|$) [since $E$ is dense in $L_2$ with $\|\cdot\|$] in such a way that $I$ is linear. For $f \in L_2$, $I(f) \in L_2(\Omega)$, $\|f\| = \|I(f)\|$ and $E(I(f)) = 0$. The linearity of $I$ and the fact $\|I(f)\|_s \le 4\|f\|_{L_s}^{1/3}$ imply that we can extend $I$ to the closure of $E$ in $\|\cdot\|_{L_s}$, i.e. to $L_s$. Since for $f \in L_2$, $\|f\|_{L_s} \le \|f\|$, we see that this extension coincides with the above for $f \in L_2$. Further, for $f \in L_s$ we have $\|I(f)\|_s \le 4\|f\|_{L_s}^{1/3}$.

Using the remark at the end of the previous article, we can show that for $f \in L_2$,
$$I(f) = \lim_n \sum_i f^{M_n}(t^{(n)}_i)\big[\beta(t^{(n)}_{i+1}) - \beta(t^{(n)}_i)\big]$$
for some $\Delta_n = (t^{(n)}_i)$ and $M_n$.

Finally, if $f, g \in L_s$ and if $f = g$ on a measurable set $\Omega_1$, then $I(f) = I(g)$ a.e. on $\Omega_1$.
4 Definition of stochastic integral (III): Continuous version

Let $B_t$, $0 \le t < \infty$, be a monotone increasing system of Borel subalgebras of $B$ such that $B_t$ includes all null sets for each $t$. Let $\beta_t$, $0 \le t < \infty$, be a Wiener process such that $\beta_t \in (B_t)$ and $\beta_{t+h} - \beta_t$ is independent of $B_t$ for $h > 0$.

Let $f_t = f_t(w) = f(t, w)$, $0 \le t < \infty$, be such that

(1) $f$ is measurable in the pair $(t, w)$;

(2) $f_t \in (B_t)$ for almost all $t$;

(3) $\int_u^v f_t^2\,dt < \infty$ for almost all $w$, for any finite interval $[u, v] \subset [0, \infty)$.

Consider also the following conditions besides (1) and (2):

(3′) $\int_\Omega\int_u^v f_t^2\,dt\,dP < \infty$ for any finite interval $[u, v] \subset [0, \infty)$;

(3″) there exist points $0 \le t_0 < t_1 < t_2 < \dots$ and constants $M_i$ independent of $w$ such that
$$f_t(w) = f(t_i, w), \quad |f(t_i)| \le M_i, \quad t_i \le t < t_{i+1}, \quad i \ge 0.$$

In the same way as in §2 we introduce three function classes $E$, $L_2$ and $L_s$ as follows:
$$E = \{f : (1), (2), (3'') \text{ hold}\}, \qquad L_2 = \{f : (1), (2), (3') \text{ hold}\}, \qquad L_s = \{f : (1), (2), (3) \text{ hold}\}.$$
From §3 we can define $I(u, v) = \int_u^v f(t, w)\,d\beta(t, w)$ for $f \in L_s$ and for any bounded interval $[u, v] \subset [0, \infty)$.

Now we shall show

Theorem 1. $I(u, v)$ has a continuous version in $[u, v]$, i.e. there exists $\tilde I(u, v)$ such that
$$P\Big[\tilde I(u, v) = \int_u^v f\,d\beta\Big] = 1 \quad \text{for any pair } (u, v)$$
and $\tilde I(u, v)$ is continuous in the pair $(u, v)$ for almost all $w$; $\tilde I(u, v)$ is uniquely determined in the sense that if $\tilde I_i(u, v)$, $i = 1, 2$, satisfy the above conditions, then
$$P\big[\tilde I_1(u, v) = \tilde I_2(u, v) \text{ for all } u, v\big] = 1.$$

Proof. It is enough to show that $I(t, f) = \int_0^t f\,d\beta$ has a continuous (in $t$) version $\tilde I(t, f)$ in $0 \le t \le v$ for any given $v > 0$, because $I(u, v) = I(0, v) - I(0, u)$. If $f \in E$ then $I(t)$ itself is such a version and
$$P\Big[\sup_{0 \le t \le v} |I(t, f)| > \varepsilon\Big] \le \frac{1}{\varepsilon^2}\,\|f\|^2, \tag{1}$$
where $\|f\|^2 = \int_0^v\int_\Omega f^2\,dt\,dP$.

To prove (1), let the restriction of $f$ to $[0, v)$ be expressed by the division set $\Delta = (0 = t_0 < t_1 < \dots < t_n = v)$ and let $s_0, s_1, \dots$ be a dense set in $[0, v)$ such that $t_i = s_i$, $0 \le i \le n$. Let now $\tau_1, \dots, \tau_m$ ($m \ge n$) be a rearrangement of $s_0, s_1, \dots, s_m$ in order of magnitude. Then
$$I(\tau_i, f) = \sum_{j < i} f(\tau_j)\big(\beta(\tau_{j+1}) - \beta(\tau_j)\big).$$
Using arguments similar to those employed in the proof of Kolmogorov's inequality, we can prove the following

Lemma. If $x_1, \dots, x_n, y_1, \dots, y_n$ are random variables satisfying

(1) $y_i$ is independent of $(x_1, \dots, x_i, y_1, \dots, y_{i-1})$,

(2) $E(y_i) = 0$ and $E(x_i^2), E(y_i^2) < \infty$,

then
$$P\Big[\max_{1 \le k \le n}\Big|\sum_{i=1}^k x_i y_i\Big| \ge \varepsilon\Big] \le \frac{1}{\varepsilon^2}\sum E(x_i^2)\,E(y_i^2).$$

Thus we have
$$P\Big[\max_{0 \le i \le m} |I(\tau_i, f)| > \varepsilon\Big] \le \frac{1}{\varepsilon^2}\int_\Omega\int_0^v f^2\,dt\,dP,$$
i.e. $P\big[\max_{0 \le i \le m} |I(s_i, f)| > \varepsilon\big] \le \frac{1}{\varepsilon^2}\|f\|^2$. Letting $m \to \infty$ we have (1).

Let $C_s$ denote the space of all functions $(h(t, w),\ 0 \le t \le v,\ w \in \Omega)$ which are continuous in $[0, v]$, and introduce the norm $\|\cdot\|_{C_s}$ by
$$\|h\|_{C_s} = E\Big[\frac{\sup_{0 \le t \le v} |h(t, w)|}{1 + \sup_{0 \le t \le v} |h(t, w)|}\Big].$$
We shall prove that for $f \in E$
$$\|I(\cdot, f)\|_{C_s} = O\big(\|f\|_{L_s}^{1/3}\big), \tag{2}$$
where $\|f\|_{L_s} = E\Big[\dfrac{\big(\int_0^v |f|^2\,dt\big)^{1/2}}{1 + \big(\int_0^v |f|^2\,dt\big)^{1/2}}\Big]$.

Define $Y_i = 1$ if $\sum_{j=0}^{i-1} f_j^2(t_{j+1} - t_j) \le \lambda^2$ and $Y_i = 0$ if $\sum_{j=0}^{i-1} f_j^2(t_{j+1} - t_j) > \lambda^2$, and let $g(t) = Y_i f(t) = Y_i f(t_i)$ for $t_i \le t < t_{i+1}$. Then $g(t) \in (B_t)$,
$$\|g\|^2 = \int_\Omega\int_0^v g^2\,dt\,dP \le \lambda^2,$$
and
$$P\big(I(t, f) \ne I(t, g) \text{ for some } t \in [0, v)\big) \le P\Big[\int_0^v f^2\,dt > \lambda^2\Big] < \varepsilon\,\frac{1 + \lambda}{\lambda},$$
where $\varepsilon = \|f\|_{L_s}$. Thus
$$P\Big[\sup_{0 \le t \le v} |I(t, f)| > \mu\Big] \le P\Big[\sup_{0 \le t \le v} |I(t, g)| > \mu\Big] + \varepsilon\,\frac{1 + \lambda}{\lambda} \le \frac{\lambda^2}{\mu^2} + \varepsilon\,\frac{1 + \lambda}{\lambda}$$
from (1). Therefore
$$\|I(\cdot, f)\|_{C_s} \le \mu + \frac{\lambda^2}{\mu^2} + \varepsilon\,\frac{1 + \lambda}{\lambda}.$$
Putting $\lambda = \varepsilon^{2/3}$ and $\mu = \varepsilon^{1/3}$ we get
$$\|I(\cdot, f)\|_{C_s} \le \varepsilon^{1/3}\big[2 + \varepsilon^{1/3} + \varepsilon^{2/3}\big] \le 4\varepsilon^{1/3} = O(\varepsilon^{1/3}) = O\big(\|f\|_{L_s}^{1/3}\big).$$
Since $C_s$ is complete in the norm $\|\cdot\|_{C_s}$, we can extend the mapping
$$E \ni f \to I(\cdot, f) \in C_s$$
to the closure of $E$ with respect to $\|\cdot\|_{L_s}$, i.e. to $L_s$. This extension gives the continuous version of $I(t, f)$, $0 \le t \le v$. Since (2) is also true for this extension, we have

Theorem 2. If $\int_0^v |f_n - f|^2\,dt \to 0$ in probability, then $\sup_{0 \le t \le v} |I(t, f_n) - I(t, f)| \to 0$ in probability.

For any Borel set $E \subset [u, v)$ we define
$$\int_E f\,d\beta = \int_u^v f\,\chi_E\,d\beta.$$
For $f \in L_2$ we have seen that $\|I(t, f)\| = \|f\|$. Let $f \in L_s$ and consider the truncation $f^M$. Since $\int_u^v |f^M - f|^2\,\chi_E\,ds \to 0$, we see that $\sup_{u \le t \le v} |I(t, f\chi_E) - I(t, f^M\chi_E)| \to 0$ in probability. Since $f^M\chi_E \in L_2$ we have, if $E$ has Lebesgue measure zero,
$$\|I(t, f^M\chi_E)\|^2 = \int_u^t E\big((f^M)^2\big)\chi_E\,ds = 0.$$
Thus $\int_E f\,d\beta = 0$ if the Lebesgue measure of $E$ is zero.

Remark. Henceforth, when we speak of the stochastic integral we shall always understand it to mean the continuous version.

If $f \in L_2$ we have

Theorem 3. If $f \in L_2$ and $\varepsilon > 0$,
$$P\big[\,|||I(t, f)||| > \varepsilon\,\big] \le \frac{1}{\varepsilon^2}\,\|f\|^2, \qquad \text{where } |||I(t, f)||| = \sup_{0 \le t \le v} |I(t, f)|.$$

Proof. Let $f_n \in E$ be such that $\|f_n - f\| \to 0$. Then for any $\delta > 0$,
$$P\big[\,|||I(t, f_n - f)||| > \delta\,\big] \to 0. \tag{1}$$
For $g \in E$ we have proved that
$$P\big[\,|||I(t, g)||| > \varepsilon\,\big] \le \frac{1}{\varepsilon^2}\|g\|^2. \tag{2}$$
Therefore, if $\delta > 0$,
$$P\big[\,|||I(t, f)||| > \varepsilon + \delta\,\big] \le P\big[\,|||I(t, f_n - f)||| + |||I(t, f_n)||| > \varepsilon + \delta\,\big] \le P\big[\,|||I(t, f_n - f)||| > \delta\,\big] + P\big[\,|||I(t, f_n)||| > \varepsilon\,\big]$$
$$\le P\big[\,|||I(t, f_n - f)||| > \delta\,\big] + \frac{1}{\varepsilon^2}\|f_n\|^2,$$
from (2). From (1), letting $n \to \infty$, $P\big[\,|||I(t, f)||| > \varepsilon + \delta\,\big] \le \frac{1}{\varepsilon^2}\|f\|^2$. Letting $\delta \to 0$ we get the result.
5 Stochstic dierentials
Let
t
,
t
be denes as before. If x
t
= x
0
+
_
t
0
f
s
d
s
+
_
t
0
g
s
ds, where
x
0
(w) B
0
and
1. f , g are measurable in the pair (t, w)
2. f
s
, g
s
(B
s
) for almost all s, 0 s <
3.
_
t
0
f
2
s
ds < ,
_
t
0
|g
s
|ds < for almost all w, for any nite t then
we write
dx
t
= f
t
d
t
+ g
t
dt.
5. Stochstic dierentials 157
If dx
t
= f
t
d
t
+ g
t
dt, dx
i
t
= f
i
t
d
t
+ g
i
t
dt, f
t
=
_
i

i
t
f
i
t
, g
t
=
_
i

i
t
g
i
t
+
(t) then we shall write
dx
t
=

i
t
dx
i
t
+
t
dt.
Theorem (). If F(
1
, . . . ,
k
, t) is C
2
in (
1
, . . . ,
k
, ) and C
1
in t, if dx
i
t
=
f
i
t
d
t
+ g
i
t
dt and if y
t
= F(x
1
t
, . . . , x
k
t
, t), then
dy
t
=

i
F
i
dx
i
t
+
_

_
y
2
k

i, j=1
F
i j
f
i
t
f
j
t
+ F
k+1
_

_
dt
where F
i
=
F

i
, F
i j
=

2
F

j
, F
k+1
=
F
t
.
188
Remark. We can get the result formally as follows:
1. Expand dy
t
i.e. dy
t
= dF(x
1
t
, . . . , x
k
t
, t) =
_
i
F
i
dx
i
t
+ F
k+1
dt +
1
2
k
_
i, j=1
f
i j
dx
i
t
dx
j
t
+
2. Put dx
i
t
= f
i
t
d
t
+ g
i
t
dt.
3. Use d
t

dt
4. Ignore 0(dt).
Lemma 1. If f , g L
s
(as dened in 2) then
_

_
u
_
t
f
s
d
s
_

_
_

_
u
_
t

s
d
s
_

_
=
u
_
t
f
s
G
s
d
s
+
u
_
t
g
s
F
s
d
s
+
u
_
t
f
s
g
s
ds,
where
F
s
=
s
_
t
f

, G
s
s
_
t
g

.
158 5. Stochastic Dierential Equations
Proof.
Case 1. f , g E (as dened in 2).
We can express f and g by the same set of division points =
(t = t
(n)
0
< . . . < t
(n)
n
= u). Now let
n
= (t = t
(n)
0
) < t
(n)
1
< < t
(n)
n
=
u)(n m) be a sequence of sets of division points containing such that
(
n
) = max
0in1
|t
(n)
i+1
t
(n)
i
| 0. Put X
(n)
i
= f (t
(n)
i
), Y
(n)
i
= g(t
(n)
i
), B
(n)
i
=
(t
(n)
i+1
) (t
n
i
). We have
_

_
u
_
t
f
s
d
s
_

_
_

_
u
_
t
g
s
d
s
_

_
=
_

_
n1

i=0
X
(n)
i
B
(n)
_

_
_

_
n1

j=0
Y
(n)
J
B
(n)
J
_

_
=
n1

i=1
X
(n)
i
G(t
(n)
i
)B
(n)
i
+
n1

i=1
Y
(n)
i
F(t
(n)
i
)B
(n)
i
+
n1

i=0
X
(n)
i
Y
(n)
i
(B
(n)
i
)
2
.
Put
n
(s) = t
(n)
i
for f
(n)
i
s < t
(n)
i+1
and let G
n
, F
n
be dened as 189
G
n
(S, W) = G(
n
(S ), w), F
n
(s, w) = F(
n
(s), w).
Then G
n
, F
n
E and since the set
n
contains
m
, f G
n
, gF
n
E .
Thus
u
_
t
f
s
d
s
u
_
t
g
s
d
s
=
u
_
t
f (s)G
n
(s)d
s
+
u
_
t
g
s
F
n
d
s
+
n1

i=0
X
(n)
i
Y
(n)
i
(B
(n)
i
)
2
.
Now
u
_
t
| f (s)G(
n
(s)) f (s)G(s)|
2
ds max
tsu
|G
n
(s) G(s)|
u
_
t
| f (s)|
2
ds 0 with probabulity 1 since G(s) is continuous in s. Similarly
_
u
t
|g(s)F
n
(s) g(s)F(s)|
2
ds 0 with probability 1. Further
E
_

_
_

_
n1

i=0
X
(n)
i
Y
(n)
i
_
(B
(n)
i
)
2
t
(n)
i+1
t
(n)
i
_
_

_
2
_

_
=
n1

i=o
E((X
(n)
i
)
2
(Y
(n)
i
)
2
_
__
(B
(n)
i
)
2
(t
(n)
i+1
t
(n)
i
)
2
__
2
_
5. Stochstic dierentials 159
+ 2

i<j
E
_
X
(n)
i
Y
(n)
i
X
(n)
j
Y
(n)
j
_
_
B
(n)
i
_
2

_
t
(n)
n+1
_
_ _
_
B
(n)
j
_
2

_
t
(n)
j+1
t
(n)
j
_
__
=
n1

i=0
E((X
(n)
i
Y
(n)
i
)
2
E
_
_
(B
(n)
i
)
2
(t
i+1
t
(n)
i
)
_
2
_
,
since E((B
(n)
j
)
2
) = t
(n)
j+1
t
(n)
j
= 2
n1

i=0
E((X
(n)
i
Y
(n)
i
)(t
(n)
i+1
t
(n)
i
)
2
2(A
n
)E
_

_
n1

i=0
(X
(n)
i
Y
(n)
i
)
2
(t
(n)
i+1
t
(n)
i
)
_

_
= 2(
n
)E
_

_
u
_
t
f
2
(s)g
2
(s)ds
_

_
,
since
n1

i=0
_
X
(n)
i
Y
(n)
i
_
2
_
t
(n)
i+1
t
(n)
i
_
=
u
_
t
f
2
(s)g
2
(s)ds.
190
The lemma for f , g E , then follows Theorem 2 of 4.
Case 2. Let f , g L
s
. There exist sequences f
n
, g
n
E such that
u
_
t
| f
n
f |
2
ds and
u
_
t
|g
n
g|
2
ds 0 in probability.
Therefore sup
tsu
s
|F
n
F| and sup
tsu
s
|G
n
G| 0 in probability
where F(s) =
_
t
f ()d

, G(s) =
s
_
t
g()d

, F
n
(s) =
s
_
t
f
n
()d

, g
n
(s) =
_
s
t
g
n
()d

. Choosing a subsequene if necessary we can assume that


the above limits are true almost every where. Then for any w
u
_
t
| f
n
G
n
f G|ds 2
u
_
t
| f
n
f |
2
G
2
n
ds + 2
u
_
t
f
2
|G
n
G|
2
ds
2 sup
tsu
G
2
n
(s)
_
u
t
| f
n
f |
2
ds + 2 sup
tsu
|G
n
G|
2
u
_
t
f
2
ds 0.
The proof of the lemma can be completed easily.
160 5. Stochastic Dierential Equations
Proceeding on the same lines and noting that
_
i
f (t
(n)
i
)g(t
(n)
i
)((t
(n)
i+1
)
(t
(n)
i
))(t
(n)
i+1
t
(n)
i
) 0, for f , g E , as n we can prove 191
Lemma 2. (
_
u
t
f
s
d
s
)(
_
u
t
g
s
ds) =
_
u
t
f
s
G
s
d
s
+
_
u
t
g
s
F
s
ds where F
s
=
_
s
t
f

, G
s
=
_
s
t
g

.
Proof of Theorem. Write F(x
1
t
, . . . , x
k
t
, t) = F(x
t
). Let
n
= (0 = t
(n)
0
<
t
(n)
1
< t
(n)
1
< . . . < t
(n)
n
= t) be a sequence of sub divisions such that
(
n
) 0. Then
y
t
= y
0
+
n1

l=0
k

i=1
F
1
_
x(t
(n)
l
)
_ _
x
i
(t
(n)
l+1
) x
i
(t
(n)
l
)
_
+
n1

l=0
F
K+1
_
x(t
(n)
l
)
_ _
t
(n)
l+1
t
(n)
l
_
+
1
2
n1

l=0
k

i, j=1
F
i j
_
x(t
(n)
l
)
_ _
x
i
(t
(n)
l+1
) x
i
(t
(n)
l
)
_ _
x
j
(t
(n)
l+1
) x
j
(t
(n)
l
)
_
+
1
2
n1

l=0
k

i, j=1

(n)
i jl
_
x
i
(t
(n)
l+1
) x
i
(t
(n)
l
)
_ _
x
i
(t
(n)
l+1
) x
j
(t
n
l
)
_
= y
0
+
k

i=1
I
1
in
+ I
2
n
+
1
2
k

i, j=1
I
3
i jn
+
1
2
k

i, j=1
I
4
i jn
, say.
From the hypotheses on F and the continuity of x
j
(t),

(n)
i jl
0 uniformly in i, j, l as n .
Let
n
(t) = t
(n)
l
for t
(n)
l
t < t
(n)
l+1
. Then we have
I
1
in
=
n1

l=0
F
i
(x(t
(n)
l
))
_

_
t
l+1
_
t
(n)
l
f
i
s
d
s
+
t
l+1
_
t
(n)
l
g
i
s
ds
_

_
=
n1

l=0
_

_
t
l+1
_
t
(n)
l
F
i
(x(
n
(s))) f
i
s
d
s
+
t
l+1
_
t
(n)
l
F
i
(x(
n
(s)))g
i
s
ds
_

_
5. Stochstic dierentials 161
=
t
_
0
F
i
(x(
n
(s))) f
i
s
d
s
+
_
t
0
F
i
(x(
n
(s)))g
i
s
ds.
192
Also
_
t
0
|F
i
(x(
n
(s))) F
i
(x(s))|
2
( f
i
s
)
2
ds
max
0st
|F
i
(x(
n
(s))) F
i
(x(s))|
2

_
t
0
( f
i
s
)
2
ds 0
for every w. Thus
k

i=1
I
1
in

k

i=1
__
t
0
F
i
(x(s)) f
i
s
d
s
+
_
t
0
F
i
(x(s))g
i
s
ds
_
in probability. Similarly
I
2
n
=
_
t
0
F
K+1
(x(
n
(s)))ds
t
_
0
F
k+1
(x(s))ds.
Using Lemma 1 and 2 we have
(x
i
(v) x
i
(u))(x
j
(v) x
j
(u))
=
_
v
u
_
f
i
s
(x
j
(s) x
j
(u)) + f
j
s
(x
i
(s) x
i
(u)
_
d
s
+
v
_
u
f
i
s
f
j
s
ds +
v
_
u
g
i
s
_

_
s
_
u
f
j

d
_

_
ds
+
v
_
u
g
j
s
_

_
s
_
u
f
i

d
_

_
+
_

_
v
_
u
g
i
s
ds
_

_
_

_
v
_
u
g
j
s
ds
_

_
=
v
_
u
_
f
i
s
(x
j
(s) x
j
(u)) + f
j
s
(x
i
(s) x
i
(u))
_
d
s
+
v
_
u
f
i
s
f
j
s
ds
+
v
_
u
_
g
i
s
(Y
j
s
Y
j
u
) + g
j
s
(Y
i
s
Y
i
u
)
_
ds
162 5. Stochastic Dierential Equations
since
_

_
v
_
u
g
i
s
ds
_

_
_

_
v
_
u
g
j
s
ds
_

_
=
v
_
u
g
i
s
_

_
s
_
u
g
j

d
_

_
+
v
_
u
g
j
s
_

_
s
_
u
g
i

d
_

_
ds
where Y
i
s
=
s
_
0
_
f
i

+ g
i

_
d, Y
j
s
=
s
_
0
_
f
j

+ g
j

_
d.
193
Thus
I
3
i jn
=
t
_
0
F
i j
(x(
n
(s)))
_
f
i
s
(x
j
(s) x
j
(s))) + f
j
s
(x
i
(s) x
i
(
n
(s)))
_
d
s
+
t
_
0
F
i j
(x(
n
(s))) f
i
s
f
j
s
ds +
t
_
0
F
i j
(x(
n
(s)))
_
g
i
s
(Y
j
(s) Y
j
(
n
(s))) + g
j
s
(Y
i
(s) Y
i
(
n
(s)))
_
ds
t
_
0
F
i j
(x(s)) f
i
s
f
j
s
ds
in probability because othet terms can, without diculty, be shown, to
tend to zero in probability. Again
|I
4
i jn
| max
0ln1
|
(n)
i jl
|
n1

l=0
|x
i
(t
(n)
l+1
) x
i
(t
(n)
l
)||x
j
(t
(n)
l+1
) x
j
(t
(n)
l
)|

1
2
max
0ln1
|
(n)
i jl
|
n1

l=0
_
_
x
i
(t
(n)
l+1
) x
i
(t
(n)
l
)
_
2
+
_
x
j
(t
(n)
l+1
) x
j
(t
(n)
l
)
_
2
_
In the same ways as above we can show that
n1

l=0
_
x
i
(t
(n)
l+1
) x(t
(n)
l
)
_
2

t
_
0
f
i
s
f
i
s
ds.
Thus |I
4
i jn
| 0 in probability. We have proved the theorem
6. Stochastic dierential equations 163
6 Stochastic dierential equations
The notation in this article is as in the previous ones.
Theorem 1. Let p(), r(), R

satisying Lipschitz condition


| p() p()| A| |, |r() r()| A| |.
194
Then
dx
t
= p(x
t
)d
t
+ r(x
t
)dt, x
0
(w) = (w) B
0
has one and olny one solution.
[|(w)| < for almost all w]
Proof. (a) Existence. We show that
x
t
(w) = (w) +
t
_
0
p(x
s
)d
s
+
t
_
0
r(x
s
)ds
has a solution. We use successive approximation to get a solution.
Let
M
(w) be the truncation of at M (i.e., ( V M) M) and
put
x
0
(t, w)
M
(w).
Dene by induction on k
x
k+1
(t, w) =
M
(w) +
t
_
0
p(x
k
s
)d
s
+
t
_
0
r(x
k
s
)ds
=
M
+ y
k
(t) + z
k
(t), say.
Note that if f L
2
then I(t, f ) L
2
and
E(|I(t, f )|
2
) =
t
_
0
E(| f (s)|
2
)ds,
164 5. Stochastic Dierential Equations
where I(t, f ) =
t
_
0
f
s
d
s
. From the hypotheses on p and r,
x
k
(t, w) L
2
for all k. Now
E(|x
k+1
(t) x
k
(t))|
2
)
2E
_
|y
k
(t) y
k1
(t)|
2
+ 2E
_
|z
k
(t) z
k1
(t)|
2
__
2
_
t
0
E(| p(x
k
(s)) p(x
k1
(s))|
2
)ds
+ 2tE
__
t
0
|r(x
k
(s) r(x
k1
(s))|
2
ds)
_
_

_
since |z
k
(t) z
k1
(t)|
2
t
t
_
0
|r(x
k
(s)) r(x
k1
(s))|
2
ds
_

_
2A
2
(1 + t)
t
_
0
E(|x
k
(s) x
k1
(s)|
2
)ds
2A
2
(1 + v)
t
_
0
E(|x
k1
s
x
k1
x
|
2
)ds
where 0 t v < and v is xed for the present. Therefore 195
E(|x
k+1
(t) x
k
(t)|
2
) [2A
2
(1 +v)]
k
t
_
0
ds
1
s
1
_
0
ds
2
. . .
s
k1
_
0
E(|x
1
(s
k
)
x
0
(x
k
)|
2
)ds
k
[2A
2
(1 + v)]
k
t
_
0
ds
1
. . .
s
k1
_
0
2E(p
2
(
M
)s
k
+ r
2
(
M
)s
2
k
)ds
k
= [2A
2
(1 + v)]
k
2
_
E(p
2
(
M
))
t
k+1
(k + 1)!
+ 2E(r
2
(
M
))
t
k+2
(k + 2)!
_
which gives
v
_
0
E(|x
k+1
() x
k
()|
2
)d
6. Stochastic dierential equations 165
2
_
2A
2
(1 + v)
_
k
E(p
2
(
M
))
v
k+2
(k + 2)!
+ 2E(r
2
(
M
))
v
k+3
(k + 3)!
]
Let |||F(t, w)||| = sup
0tv
|F(t, w). Then
P
_
|||x
k+1
(t) x
k
(t)||| >
_
P
_
|||y
k
(t) y
k1
(t)||| >

2
+ P
_
|||z
k
(t) z
k1
(t)||| >

2
__

2
v
_
0
E
_
|(p(x
k
(s)) p(x
k1
(s))|
2
_
ds
+
4

2
v
v
_
0
E
_
|r(x
k
(s)) r(x
k1
(s))
2
_
ds.
(from Theorem 3 of 4)

4A
2
(1 + v)

2
2
_
2A
2
(1 + v)
_
k1
_
E(p
2
(
M
))
v
k+1
(k + 1)!
+ 2E(r
2
(
M
))
v
k+2
(K + 2)!
_
<
B

2
[2A
2
v(1 + v)]
k
k!
where B = 2
_
E(P
2
(
M
))v + 2v
2
E(r
2
(
M
))
_
.
Putting
k
=
[2A
2
v(1 + v)]
k/3
(k!)
1/3
we get 196
P
_
|||x
k+1
(t) x
k
(t)||| >
k
_
B
k
.

Since
_

k
is a convergent series Borel-Cantelli lemma implies that,
with probability 1, w belongs only to a nite number of sets in the
bracket of the last inequality. Therefore
P[|||x
k+1
(t) x
k
(t)||| <
k
for all k some l] = 1.
Since
_

k
< , we get
166 5. Stochastic Dierential Equations
|||x
m
(t) x
n
(t)||| o with probability 1 as m, n i.e., P[x
k
(t)
converges uniformly for 0 t v] = 1.
Taking v = 1, 2, 3, . . . we get
P[x
k
(t) converges unifolmly for 0 t n] for every n] = 1. Let
x
M
(t, w) be the limit of x
k
(t, w). This is clearly continuous in t for almost
all w. Also for any v < ,
P
_
|||x
k
(t) x
M
(t)||| 0 as k
_
= 1.
so that
_
t
0
p(x
k
(s))d
s

_
t
0
p(x
M
)(s))d
s
in probability. Now we prove
without diculty that
X
M
(t, w) =
M
(w) +
_
t
0
p(x
M
(s, w))d(s, w) +
t
_
0
r(x
M
(s, w))ds.
Let
M
= (w : |(w)| M) and dene x(t, w) = x
M
(t, w) on
M
. 197
If M < M

then on
M
,
M
=
M

so that from the construction [and


the fact that if f = g on a measurable set B then I(t, f ) = I(t, g) a.e.
on B] it follows that x
M
(t, w) = x
M

(t, w). Also since on


M
, x(t, w) =
x
M
(t, w), x(t, w) is a solution.
(b) Uniqueness. Let
x
t
= a +
_
t
0
p(x
s
)d
s
+
_
t
0
r(x
s
)ds 0 t v, a (B
0
).
y
t
= a +
_
t
0
p(y
s
)d
s
+
_
t
0
r(y
s
)ds
Case 1. E(x
2
t
) and E(y
2
t
) are bounded by some G < for 0 t v. We
have
E((x(t) y(t))
2
) 2E
_

_
__
t
0
(p(x(s)) p(y(s)))d(s)
_
2
_

_
+ 2E
_

_
__
t
0
(r(x(s)) r(y(s)))ds
_
2
_

_
2
_
t
0
E(
_
(p(x(s)) p(y(s)))
2
_
ds + 2t
_
t
0
E
_
(r(x(s)) r(y(s)))
2
_
ds
6. Stochastic dierential equations 167
since
__
t
0
(s)ds
_
2
t
_
t
0

2
(s)ds. Thus
E
_
(x
t
y
t
)
2
_
2A
2
(1 + t)
_
t
0
E
_
(x
s
y
s
)
2
_
ds
2A
2
(1 + v)
_
t
0
E
_
(x
s
y
s
)
2
_
ds
put C
t
= E((x
t
y
t
)
2
)
_
4G
2
_
. Then
C
t
2A
2
(1 + v)
_
t
o
c
s
ds
_
2A
2
(1 + v)
_
2
_
t
0
ds
_
s
0
c


Therefore
C
t

[2A
2
(1 + v)]
n
n!
t
n
4G
2
0 as n .
198
Case 2. Let x
tM
= (x
t
M)(M), y
tM
= (y
t
M)(M) and a
M
= (a
M)M. Dene x
0
, x
1
, . . . , y
0
, y
1
, . . ., inductively as follows
x
0
t
= x
tM
, x
n+1
t
= a
M
+
_
t
o
p(x
n
s
)d
s
+
_
t
o
r(x
n
s
)ds
y
0
t
= y
tM
, y
n+1
t
= a
M
+
_
t
o
p(y
n
s
)d
s
+
_
t
o
r(y
n
s
)ds.
Arguments similar to those used in the proof of existence of a solu-
tion prove that
x
t
= lim
n
x
n
t
, y
t
= lim
n
y
n
t
exist and
x
t
= a
M
+
_
t
0
p( x
s
)d
s
+
_
t
0
r( x
s
)ds
y
t
(t) = a
M
+
_
t
0
p( y
s
)d
s
+
_
t
0
r( y
s
)ds
168 5. Stochastic Dierential Equations
and
sup
0tv
E( x
2
t
) < , sup
0tv
E( y
2
t
) < .
Therefore from Case 1, x
t
= y
t
for 0 t v.
Let
M
= (w : |a| < M, sup
0tv
| x
t
|< M, sup
0tv
| y
t
|< M). Then since
x
t
and y
t
have continuous paths p[U
M

M
] = 1. But on
M
,
y
t
= y
tM
= y
0
t
= y
1
t
=
x
t
= x
tM
= x
0
t
= x
1
t
=
_

_
0 t v
199
Note that if f , g L
s
and f = g on a measurable subset B then
I(t, f ) = I(t, g) on B with probability 1.
Thus x
t
= y
t
on
M
. We have proved the theorem.
Corollary (). Let (w) L
2
() and x(t, w) satisfy
x(t, w) = (w) +
_
t
o
p(x(s, w)d(s, w) +
_
t
o
r(x(s, w))ds.
Then
E(x
2
t
) e
t
for 0 t v
where = 3E(
2
) + 6vp
2
(0) + 6v
2
r
2
(0) and = 6A
2
(1 + v).
Proof. From the proof of Theorem 1 we gather that x(t, w) L
2
(for
any v < ). If | x(t)||
2
= E(| x(t)|
2
),
||x(t) x(s)||
2
2
_
t
S
E(p(x())
2
)d + 2(t s)
_
t
s
E(r(x())
2
)d
so that ||x(t)||
2
is continuous in t. Let l = sup
0tv
||x
t
||
2
.
Now
||x
t
||
2
3E(
2
) + 3
_
t
o
E(p(x(s))
2
)ds + 3t
_
t
o
E(r(x(s))
2
)ds
3E(
2
) + 6[t p
2
(0) + t
2
r
2
(0)] + 6A
2
(1 + t)
_
t
o
||x(s
1
)||
2
ds
1
.
6. Stochastic dierential equations 169
+
_
t
o
||x(s
1
)||
2
ds
1
+
_
t
o
ds
1
[ +
_
s
1
o
||x(s
2
)||
2
ds
2
]
= (1 + t) +
2
_
t
o
ds
1
_
S
1
o
||x(s
2
)||
2
ds
2
(1 + t)
+
2
_
t
o
ds
1
_
S
1
o
ds
2
_
+
_
S
2
0
||x(s
3
)||
2
ds
3
_
=
_
1 + t +

2
t
2
2!
_
+
3
_
t
0
ds
1
_
S
1
0
ds
2
_
S
2
0
||x(s
3
)||
2
ds
3
200
Continuting this, for any n we have
||x(t)||
2

_
1 + t +

2
t
2
2!
+ +

n
t
n
n!
_
+
n+1
_
t
0
ds
1
_
S
1
0
ds
2
. . .
_
S
n
0
||x(s
n+1
)||
2
ds
n+1
e
t
+
n+1
_
t
o
ds
1
. . .
_
S
n
0
||x(s
n+1
)||
2
ds
n+1
e
t
+
n+1
l
t
n+1
(n + 1)!
Q.E.D.
Theorem 2. There exists a function x(t, a, w) measurable in the pair
(a, w) such that for every xed a R
1
,
x(t, a, w) = a +
_
t
0
p(x(s, a, w))d(s, w) +
_
t
0
r(x(s, a, w))ds
for every t and for almost all w. That is there exists a version of the
solutin of
dx
t
= p(x
t
)d
t
+ r(x
t
)dt, x(0) = a
which is measurable in the pair (a, w).
Proof. Let x
0
(t, a, w) a.x
0
(t, a, w) is measurable in the pair (a, w).
Assume that x
1
, . . . , x
k
have been dened, are measurable in the pair 201
(a, w) and for every xed a
x
i
(t, a, k) = a +
_
t
0
p(x
i1
(s, a, w))d(s, w)
170 5. Stochastic Dierential Equations
+
_
t
0
r(x
i1
(s, a, w))ds, 1 i k.
for almost all w and for all t. We shall dene x
k+1
. Let
n
= (0 =
s
n
o
< s
n
1
< . . .) be a sequence of subdivisions of [0, ) such that
n
=
sup
i
|s
n
i+1
s
n
i
| tends to zero. Let v < . Then since x
k
(s, a, w) is contin-
uous in s for almost all w,
_
v
0
| p(x
k
(s, a, w)) p(x
k
(
n
(s), a, w))|
2
ds 0
for almost all w, where
n
(t) = s
n
i
for s
n
i
t < s
n
i+1
. Hence
sup
0tv

_
t
0
[p(x
k
(s, a, w)) p(x
k
(
n
(s), a, w))]d
s
0] in probabil-
ity. By the diagonal process we can nd a subsequence n
j
suct that
p[ sup
0tv
|
_
t
0
p(x
k
(s, a, w))d
s

_
t
0
p(x
k
(
nj
(s), a, w))d
s

0 for every
v < ] = 1
Since p(x
k
(
nj
(s), a, w)) E
_
t
0
p(x
k
(
nj
(s), a, w) is measurable in
(a, w). It follows that M(t, a, w) = lim
_
t
0
p(x
k
(
nj
(s), a, w))d
s
is mea-
surable in (a, w). Now dene
x
k+1
(t, a, w) = a + M(t, a, w) +
_
t
0
r(x
k
(s, a, w))ds.
202
Proceeding as in Theorem 1 we can show that x
k
(t, a, w) converges
with probability 1. Let now
x(t, a, w) = limx
k
(t, a, w)
We can show that x(t, a, w) is the required function.
Remark. We can easily prove that if a
n
a then x(t, a
n
, w) x(t, a, w)
in probability. In fact
E(|x(t, a, w) x(t, a
n
, w)|
2
) 3|a a
n
|
2
+ 3
_
t
0
E(| p(x(s, a
n
, w)
6. Stochastic dierential equations 171
p(x(s, a, w))|
2
)ds + 3t
_
t
0
E(|r(x(s, a
n
, w)) r(x(s, a, w))|
2
)ds
so that
lim
a
n
a
E
_
lx(t, a, w) x(t, a
n
, w)|
2
_
3A
2
(1 + t)
lim
a
n
a
_
t
0
E(|x(s, a
n
, w) x(s, a, w)|
2
)ds
and now using the corollary to Theorem 1 and Fatou lemma we get
lim
a
n
a
E[lx(t, a, w) x(t, a
n
, w)|
2
]
3A
2
(1 + t)
_
t
o
lim
a
n
a
E(|n(s, a
n
, w) x(s, a, w)|
2
)ds
Q.E.D.
Theorem 3. Let x(t, a, w) be as in Theorem 2 and x(t, w) be the solution
of
dx
t
= p(x
t
)d
t
+ r(x
t
)dt, x(o, w) (w), (w) (B
0
)
Then 203
p[x(t, (w), w) = x(t, w)] = 1
Proof. We shall prove that
x(t, (w), w) = (w) +
_
t
0
p(x(s, (w), w))d(s, w) +
_
t
o
r(x(s, (w), w))ds
with probability 1; then by uniqueness part of Theorem 1 the result will
follow.
1. Since x(t, a, w) is measurable in (a, w), x(t, (w), w) is measurable
in w. In fact, x(t, (w), w) is the composite of
w ((w), w) and (a, w) x(t, a, w).
2. Consider the function-space valued random variable
(w) = (., w) (o, w).
172 5. Stochastic Dierential Equations
This induces a measure on C, the space of all continuous functions
on [0, ) and with respect to this measure the set of coordinate functions
is a Wiener process i.e., if for w C

(t, w) = w(t)
then

(t, w) is a Wiener process on C. Let

B
t
correspond to B
t
. There
exists a unique solution of the equation
d x
t
= p( x
t
)d

(t) + r( x
t
)dt, x
o
= a.
i.e., x(t, a, w) = a +
_
t
0
p( x(s, a, w))d

(s, w) +
_
t
0
r( x(s, a, w))ds
for almost all w. Hence we have by uniqueness 204
x(t, a, w) = x(t, a, (w)) a.e.
Let L(a, w) = x(t, a, w) and
R(a, w) = a +
_
t
0
p( x(s, a, w))d

(s, w) +
_
t
0
r( x(s, a, w))ds
If (w) (B
0
) then (w) and (w) are independent. Hence the mea-
sure induced by (, ) on R
1
C is the product P

where P

is the
distribution of and P

is the probability induced on C by . Hence we


have
P[w : x(t, (w), w) = (w) +
_
t
0
p(x(s, (w), w)d(s, w)
+
_
t
0
r(x(s, (w), w)ds
= (P

)[(a, w) : L(a, w) = R(a, w)]


=
_
R
P

[(a, w) : L(a, w) = R(a, w)]P

(da)
=
_
R
1.P

(da) = 1.
This proves the theorem.
7. Construction of diusion 173
7 Construction of diusion
In this article we shall answer the question of 1 i.e. we shall prove that
if p and r satisfy Lipschitz condition then there exists a diusion with
state space R

such that if u, u

, u

are continuous, u and


1
2
P
2
u

+ ru

are bounded and G is the generator in the restricted sense, then


Gu =
1
2
P
2
u

+ ru

.
We have proved in 6 that
x(t) = a +
_
t
0
p(x(s))d(s) +
_
t
0
r(x(s))ds
has a unique solution x(t, a, w). Let S = R

, W = W
c
(R
1
) and 205
P
a
(B) = P(w : x(., a, w) B), B B(W).
Then M = (S, W, P
a
) is a diusion.
We shall rst prove that M is a Markov process. We verify the
Markov property of P
a
.
Let

t
(w) denote the stopped path at t of (., w) i.e. (t., w) and let

(w) = (t + ., w) (t, w),

(s, w) = (t + s, w), B

= B
t+
. Then

(s, w) is also a Wiener process on . Let x(t, a, w), y(t, a, w) denote


solutions with respect to these processes of
dz
t
= p(z
t
)d
t
+ r(z
t
)dt
i.e.
x(t, a, w) = a +
_
t
0
p(x(s, a, w))d(s, w) +
_
t
0
r(x(s, a, w))ds
y(t, b, w) = b +
_
t
0
p(y(s, b, w))d

(s, w) +
_
t
0
r(y(s, b, w))ds
If (w) = (., w) (0, w) then (w) and

(w) induce the same


probability on C. Hence (see the proof of Theorem 3 of 6)
x(t, a, w) = x(t, a, (w)), y(t, a, w) = x(t, a,

(w)).
174 5. Stochastic Dierential Equations
Consider the C-valued random variable

t
(w). This induces a prob-
ability on C and with respect to this the process

(s, w) = w(s), 0 s t
is a Wiener process on C and there exists a unique solution for 206
d x
s
= p( x
s
)d

s
+ r( x
s
)ds, x
0
= a, 0 s t
i.e. there exists f (s, a, w) such that
f (s, a, w) = a+
_
s
0
p( f (, a, w))d

(, w) +
_
s
0
r( f (, a, w))d, 0 s t.
Then we have
f (s, a,

t
(w)) = a +
_
s
0
p( f ( , a,

t
(w)))d(, w)
+
_
s
0
r( f (, a,

t
(w)))d, 0 s t.
Therefore the stopped path at t of x(., a, w) is
F(s, a,

t
(w)) =
_

_
f (s, a,

t
(w)), 0 s t
f (t, a,

t
(w)), s > t.
Now
x(t + s, a, w)
= x(t, a, w) +
_
s
0
p(x(+t, a, w))d( + t, w) +
_
s
o
r(x( + t, a, w))d
= x(t, a, w) +
_
s
0
p(x( + t, a, w))d

(, w) +
_
s
0
r(x( + t, a, w))d.
From Theorem 3 and uniqueness part of Theorem of 1 of 6 we
have therefore
x(t + s, a, w) = y(s, x(t, a, w), w) = x(s, x(t, a, w),

(w))
7. Construction of diusion 175
= x(s, E(t, a,

t
(w)),

(w))
Let B
1
B
t
(w) and B
2
B(w). Then by denition of P
a
207
P
a
[W B
1
, W
+
t
B
2
] = P[x(., a, w) B
1
, x(t + ., a, w) B
2
]
= P[F(., a,

t
(w)) B

1
, x(., F(t, a,

t
(w)),

(w)) B
2
]
where B
1
= (w : w

t
B

1
). Let P

t
and P

be the probabilities induced


on C by

t
and

; since they are independent they induce the product


probability P

t
P

on C C. We have therefore,
P
a
(w B
1
, w
+
t
B
2
)
= (P

t
P

)[( w,

w

) : F(., a, w) B

1
, x(., F(t, a, w), w

) B
2
]
=
_
P

t
(d w)P[w

: F(., a. w) B
1
, x(., F(t, a, w),

w

) B
2
]
=
_
P

t
(dw)P[w

: F(., a. w) B
1
, x(., F(t, a, w),

(w

)) B
2
]
=
_
P(dw)P[w

: F(., a,

t
(w)) B

1
, x(., F(t, a,

t
(w)),

(w

)) B
2
]
=
_
(:F(.,a,

t
(w))B

1
)
P[w

: x(., F(t, a,

t
(w)),

(w

)) B
2
]P(dw)
=
_
(:F(.,a,

t
(w))B

1
)
P[w

: x(., x(t, a, w), (w

)) B
2
]P(dw)
since and

induce the same probability on C. Thus by denition of


P
b
we have
P
a
[w : w B
1
, w
+
t
B
2
] =
_
(w:F(.,a,

t
(w)B

1
)
P
x(t,a,w)
[B
2
]P(dw)
= E
a
[B
1
: P
x
t
(w)
(B
2
)]
We have derived the Markov property.
From the remark at the end of Theorem 2 of 6 we see that if a
n
a 208
176 5. Stochastic Dierential Equations
there exists a subsequence a
nk
such that
x(t, a
nk
, w) x(t, a, w) a.e.
Since H
t
f (a) = E
a
[ f (x
t
(w))] =
_

f (x(t, a, w))P(dw) if f is continu-


ous and a
n
a then there exists a subsequnce a
nk
such that H
t
f (a
nk
)
H
t
f (a). Since this is true of every sequence a
n
a,we should have
lim
ba
H
t
f (b) = H
t
f (a).
Mis therefore a strong Markov process. The denition of P
a
shows that
Mis conservative.
Theorem 1. If u, u

, u

are all continuous and if u and


1
2
p
2
u

+ru

are
bounded, then u D(G)(G in the restricted sense) and
Gu =
1
2
P
2
u

+ ru

.
Proof. It is enough to prove that G

u u = G

[
1
2
P
2
u

+ ru

]. From
the theorem of 5 we have
u(x(t, a, w)) = u(x(0, a, w)) +
_
t
0
u

(x(s, a, w))p(x(s, a, w))d(s)


+
_
t
0
_
1
2
P
2
(x(s, a, w))u

(x(s, a, w)) + u

(x(s, a, w))r(x(s, a, w))


_
ds.
Write F(s, a, w) =
1
2
P
2
(x(s, a, w))u

(x(s, a, w)) + u

(x(s, a, w))
r(x(s, a, w)). Then since x(0, a, w) = a, 209
_

u(x(t, a, w))P(dw) = u(a) +


_
t
0
ds
_

F(s, a, w)P(dw) (1)


since the expectation of a stochastic integral is zero.
Thus

_

0
e
t
dt
_

u(x(t, a, w))dP(w)
7. Construction of diusion 177
= u(a) +
_

0
e
t
dt
_
t
0
ds
_

F(s, a, w)dP(dw)
= u(a) +
_

0
ds
_

F(s, a, w)P(dw)
_

s
e
t
dt
= u(a) +
_

0
ds
_

F(s, a, w)P(dw)e
s
Q.E.D
Theorem 2. If u satises the conditions of Theorem 1, then
lim
t0
H
t
u(a) u(a)
t
=
1
2
P
2
(a)u

(a) + r(a)u

(a).
This is immediate from equation (1) above, since all the functions
involved are continuous.
Theorem 3. Let P(t, a, E) be the transition probability of the above dif-
fusion. Then the following Kolmogoro conditions are true.
(A) lim
t0
1
t
P(t, a, U
c
a
) = 0
(B) lim
t0
1
t
_
U
a
(b a)P(t, a, db) = r(a)
(C) lim
t0
1
t
_
U
a
(b a)
2
P(t, a, db) = P
2
(a)
where U
a
is any bounded open set containing a.
Proof. We can prove these facts using stochastic dierential equations;
but we shall deduce them from Theorem 2 above.
(A) Let V
a
be any open set containing a with

V
a
U
a
and let u be a 210
C
2
function such that u = 0 on V
a
, u = 1 on U
c
a
and 0 u 1 on
U
a
V
a
. Then u satises the conditions of Theorem 1. We have
0
1
t
P(t, a, U
c
a
)
1
t
[H
t
u(a) u(a)]
1
2
p
2
u

(a) + ru

(a) = 0.
178 5. Stochastic Dierential Equations
(B) Let V
a


U
a
and let be a C
2
function vanishing outside V
a
, 1 on

U
a
and 0 1. Put u(b) = (b a) (a). Then
lim
t0
1
t
[H
t
u(a) u(a)] =
1
2
p
2
u

(a) + ru

(a) = r(a)
i.e., lim
t0
1
t
_
R

u(b)P(t, a, db) = r(a).


Also
lim
t0
|
1
t
_
R

u(b)P(t, a, db)
1
t
_
U
a
(b a)P(t, a, db)|
lim
t0
1
t
_
V
a
U
a
|b a|| (b) 1|P(t, a, db)
lim
t0
C
1
t
_
U
c
a
P(t, a, db) = 0
from (A), where C is a bound for |b a|| (b) 1| on V
a
U
a
.
(C) Take u(b) = (b a)
2
(b) in (b).

Remark. Theorem 3 means (in an intuitive sense)


P
a
(|dx
t
| >) = 0(dt), E
a
(dx
t
) r(a)dt,
V
a
(dx
t
) = E
a
((dx
t
)
2
) p
2
(a)dt.
Section 6
Linear Diusion
We recall the denition of a diusion. A strong Markov process whose 211
path functions are continuous before the killing time is called a dision.
In this section we develop the theory (due to Feller) of linear diusion.
1 Generalities
Denition (). A diusion whose state space S is a linear connected set
is called a linear diusion.
S is therefore one of the following sets, upto isomorphism i.e., order
preserving homeomorphism of linear connected sets:
(1) [0, 1], (2) [0, 1), (3) (0, 1], (4) (0, 1), (5) {0}.
Let
b
denote the rst passage time for b, i.e.
b
= inf{t : x
t
= b}.
If P
a
(
b
< ) > 0, we write a b; if a b for some b > a, we
write a C
+
; if a b for some b < a, we write a C

. If a b for
any b > a, i.e. if a C
+
we write a K

; similarly if a b for any


b < a, i.e. if a C

, we wrire a K
+
. Thus if a K
+
and b < a then
P
a
(
b
= ) = 1, i.e. P
a
(x
t
a for all t <

) = 1.
Every point of the state space S belongs to one of the following sets:
1. C
+
C

= K
c
+
K
c

. These points are called regular points or 212


second order points.
179
180 6. Linear Diusion
2. C
+
C

= K
+
K

. A point of this set is called a pure right shunt.


3. C

C
+
= K = K
+
. A point of this set is called a pure left
Shunt. Both left and right shunts are sometimes called points of
rst order, i.e. a point of rst order is an element of C
+
C

)
(C

C
+
).
4. C
c

C
c
+
= K

K
+
. These points are called trap points or points
of order zero.
The intuitive meanings of the above should be clear; for instance of
particle starting at a pure right shunt travels to the right with probabil-
ity 1.
Theorem 1. If a C
+
( c

), then P
a
(
a
+
= 0) = 1(P
a
(
a

= 0) = 1),
where
a
+
= inf{t : x
t
> a}(
a

= inf{t : x
t
< a}).
Proof. Let =
a
+
and a C
+
. There exists b > a, and t such that
P
a
(
b
< t) > 0. Now E
a
(d

b
) e
t
P
a
(
b
< t) > 0. Since
b
(w) =
(w) +
b
(w
+

) we have, by the strong Markov property,


0 < E
a
(e

b
) = E
a
(e

b
:
b
< ) = E
a
(e

b
: < ,
b
<
= E
a
(e

b
(w
+

)
: < ,
b
(w
+

) < )
= E
a
[e

E
x

(e

b
:
b
< ) : < ]
= E
a
(e

E
a
(e

b
) : < ) = E
a
(e

b
)E
a
(e

). Q.E.D

Remark. If M is not strong Markov the theorem is not true (e.g. expo- 213
nential holding time process).
Theorem 2. If a b > a(< a) then [a, b) C
+
((b, a] C

.
Proof. Let a < b. Then P
a
(

<
b
) = 1, because the paths are
continuous. We have
0 < P
a
(
b
< ) = P
a
(

< ,
b
< ) = P
a
(

< ,
b
(w
+

) < )
= E
a
[P
x

(
b
< ) :

< = P

[
b
< ]P
a
(

< ),
since by continuity x(

) = . Therefore P

(
b
< ) > 0. Q.E.D.
2. Generator in the restricted sense 181
Corollary (). If a C
+
then some right neighbourhood (i.e. a set which
contains an interval [a, b )) of a, U
+
(a) C
+
.
Theorem 3. The set of regular points is open.
Proof. Since a C
+
, there exists b > a with P
a
(
b
< ) > 0 and since
a C

, by Theorem 1, P
a
(
a

= 0) = 1. Hence P
a
(
a

<
b
< ) > 0;
this implies that there exists c < a such that P
a
(
c
<
b
< ) > 0.
Noting that a C
+
and using Theorem 1, there exists d, a < d < b, with
P
a
(
d
<
c
<
d
< ) > 0. Using the strong Markov property
P
a
(
d
< )P
d
(
c
< )P
c
(
d
< ) > 0,
so that P
d
(
c
< ) > 0, P
c
(
b
< ) > 0. Hence (c, d] C

and
[c, b) C
+
. Q.E.D.
Theorem 4. K
+
is right closed, i.e. a
n
K
+
, a
n
a imply a K
+
(k

is
left closed).
Proof. If a K
+
then a c

. There exists b < a and a b.


Then (b, a] c

so that (b, a] K
+
= . Q.E.D. 214
2 Generator in the restricted sense
In the section of strong Markov processes we introduced a generator in
the restricted sense; we modify this to suit our special requirements. Let
D(S ) = { f : f B(S ) and f is right continuous at every point of
C
+
and left continuous at every point of C

. D(S ) is smaller than the


classes D(S ) introduced before in the section of strong Markov pro-
cesses. Clearly f D(S ) is continuous at every regular point and
D(S ) C(S ).
Theorem 1. D(S ) G

(S ); a fortiori G

D(S ) D(S ).
Proof. Let a C
+
. Then
G

f (a) = E
a
__

0
e
t
f (x
t
)dt
_
182 6. Linear Diusion
= E
a
__

b
0
e
t
f (x
t
)dt
_
+ E
a
__

b
e
t
f (x
t
)dt
_
= E
a
__

b
0
e
t
f (x
t
)dt
_
+ E
a
(e

b
G

f (x

b
))
= E
a
__

b
0
e
t
f (x
t
)dt
_
+ E
a
(e

b
)G

f (b).
Now
|E
a
__

b
0
e
t
f (x
t
)dt
_
| || f ||
1 E(e

b
)

ba
|| f ||
1 E(e

a+
)

= 0,
since
P
a
(
a
+
= 0) = 1.
Q.E.D
We can prove that G

D(S ) is independent of and the other results


easily.
Theorem 2. G

f = 0, f D(S ) imply f 0. 215


Proof. It is enough to show that P
a
( f (x
t
) f (a) as t 0) = 1.
If a is regular, f is continuous at a and there is nothing to prove. If
a C
+
C

, f is right continuous at a and P


a
(x
t
a for 0 t <

) = 1
and again the result is immediate. If a C

C
+
the same is true. If
a is a trap P
a
(x
t
= a for 0 t <

) = 1 and since P
a
(

> 0) = 1
(because P
a
(x
0
= a) = 1) the result follows again.
Denition (). We dene the generator in the restricted sense as Gu =
u f where u = G

f with f D(S ). One easily veries that Gu is


independent of .
Theorem 3. If a is a trap, then P
a
(

> t) P
a
(
a
> t) = e
kt
and Gu(a) = ku(a) where k 0 and
a
= rst leaving time from
a = inf{t : x
t
a}.
Proof. Proceeding as in the case of a Morkov process with discrete
space (Section 2, 8) we show that P
a
(
a
> t) = e
kt
and
1
k
= E
a
(
a
) if
2. Generator in the restricted sense 183
> k > 0. If k = 0, P
a
(
a
> t) = 1 for all t, giving P
a
(
a
= ) = 1 i.e.
P
a
(x
t
= a for all t) = 1 (such a point is called a conservative trap). We
have Gu(a) = u(a)f (a) =
_

o
e
t
E
a
( f (x
t
))dtf (a) = f (a)f (a) =
0. Let now > k > 0. Since
1
k
= E
a
(
a
), by Dynkins formula,
E
a
__

a
o
Gu(x
t
)dt
_
= E
a
(u(x

a
)) u(a),
i.e.,
E
a
(
a
Gu(a)) = u(a), since u(x

a
) = u() = 0.
Q.E.D.
Theorem 4 (Dynkin). If a is not a trap then E
a
(
U
) < for a su- 216
ciently small open neighbourhood U of a and
Gu(a) = lim
Ua
E
a
(u(x

U
)) u(a)
E
a
(
U
)
,
where
U
= rst leaving time from U.
Proof. We prove that if a is not a trap, there exists u
0
D(G) such
that u
0
(a) > 0. Let Gu(a) = 0 for every u D(G). Then for all
f C(S ), .G

f (a) f (a) = 0 i.e.


_

o
H
t
f (a)e
t
dt =
1

f (a) =
_

0
e
t
f (a)dt. Since for f C(S ), H
t
f is right continuous in t,
H
t
f (a) = f (a) i.e.
_
f (b)P(t, a, db) = f (a) for all f C(S ). It fol-
lows that P(t, a, db) =
s
(db) i.e. P
a
(x
t
= a) = 1 for all t. By right
continuity P
a
(x
t
= a) for all t) = 1, i.e. a is a trap. Thus there exists u
0
such that Gu
0
(a) 0.
From the denition of D(S ) we see that there exists
0
> 0 and a
neighbourhood U
0
(a) such that
Gu
0
(b) >
0
_

_
for b U
0
(a) if a is regular,
for b U
0
(a) and b a if a is a pure right shunt,
for b U
0
(a) and b a if a is a pure left shunt.
184 6. Linear Diusion
Therefore P
a
(Gu
0
(x
t
) >
0
for 0 t <
U
0
) = 1. Now put
n
=
n
U
0
. Then
E
a
__

n
0
Gu
0
(x
t
)dt) = E
a
(u
0
(x

n
)) u
0
(z)
_
so that

0
E
a
(
n
) 2||u
0
||.
Letting n , E
a
(
U
0
) 2
||u
0
||

0
< . Therefore for U U
0
(a), 217
E
a
(
U
) < .
Now let u D(G). For every > 0, there exists a open neighbour-
hood U(a) U
0
(a) such that
|Gu(b) Gu(a)| <
_

_
for b U(a) if a is regular,
for b U(a) and b a if a is a pure right shunt,
for b U(a) and b a if a is a pure left shunt.
Therefore P
a
(|Gu(x
t
) Gu(a)| < for 0 t <
U
) = 1. Using
Dynkins formula the proof can be easily completed.
3 Local generator
Let M = (S, W, P
a
) denote a linear diusion, and S

a closed interval
in S . Put W

= W
c
(S

), P

a
(B

) = P
a
[w

], where
(S

)
0 (w)
is the rst leaving time from the interior (S

)
0
of (S

). We prove that
M

= (S

, W

, P

a
) is also a linear diusion. We shall verify the strong
Markov property for M

. First we show that, if

(w

) is a Markov time
in W

, then (w) =

(w

(w)
) is a Markov time in W Now
(w : (w) t) = [w :

(w

(w)
) t)] = (w : w

(w)
) B

t
), B

t
B

t
= (w : (w

(w)
)

t
B

), B

B
= (w : t (w), w

t
B

) (w : (w) < t, (w

t
)

(w)
B

)
= (w : t (w), w

t
B

) (w : (w) < t, (w

t
)
(w

t
)
B

) B
t
since w w

t
is B
t
-measurable and w w

1
is B-measurable for any 218
Markov time
1
we have (w : (w

t
)

1
(w

t
)
B) B
t
for any B B.
3. Local generator 185
Thus is a Markov time in W. Let f

1
B

+
and f

2
B

. Then
by denition of P

a
we have
E

a
_
f

1
(w

) f

2
(w

(w

)
)
_
= E
a
_
f

1
(w

(w)
) f

2
((w

(w)
)
+
(w)
)
_
Put f
1
(w) = f

1
(w

(w)
) and f
2
(w) = f
1
2
(w

(w)
). Let
2
= . We
show that f
1
B

2
+
. Now
f
1
(w

2(w)+
= f

1
((w

2
(w)+
)

(w

2
(w)+)
) = f

1
_
(w

(w)
)

(w)+
_
= f

1
_
(w

(w)
)

(w

(w)
)+
] +
_
= f

1
(w

(w)
), since f

1
B

+
This proves that f
1
B

2
+
. From the denition of and
2
we can
see without diculty that for any t 0,

2
(w) + (t (w
+

2
(w)
)) = (w) ((w) + t)
Hence (w
+

2
(w)
)

(w
+

2
(w)
) = (w

(w)
)
+
(w)
, so that
f
2
[w
+

2
(w)
] = f
1
2
_
(w
+

2
(w)
)

(w
+

2
(w))
_
= f

2
_
(w

(w)
)
+
(w)
_
.
Thus
E

a
_
f

1
(w

) f

2
(w

(w

)
)
_
= E
a
_
f

1
(w

(w)
) f

2
((w

(w)
)
+
(w)
)
_
= E
a
_
f
1
(w) f
2
(w
+

2
)
_
= E
a
_
f
1
(w)E
x
2
( f
2
(w))
_
= E
a
_
f

1
(w

(w)
)E
x

(w)
(w

(w)
)( f

2
(w

(w)
))
_
= E

a
_
f

1
(w

)E
x

( f

2
(w

))
_
which proves that M

is a linear diusion.M

is called the stopped pro- 219


cess at the boundary S

of S

. We also denote M

by M
S
, its generator
by G

or G
s
etc.
A point a S is called a conservative point if there exists a neigh-
bourhood U such that M
U
is conservative. The set of all conservative
points is evidently open. Let a be a conservative regular point and S

a
closed interval containing a such that M
S
is conservative. We shall
186 6. Linear Diusion
prove that if u D(G), then u

= u|S

D(G

) and G

= Gu in
(S

)
0
; more generally if S

, if u

D(G
S
) then u

= u

/S

(restriction to S

) is in D(G
S
) and G

= G

in (S

)
0
. Then we
can dene G
a
the local generator as the inductive limit of G
S
as S

a
in the following way. Consider the set D
a
of all functions dened in a
neighbourhood (which may depend on the function) right (left) contin-
uous at points of C
+
(C ). Introduce an equivalence relation in D
a
by
putting f g if only if there exists a neighbourhood U of a such that
f = g in U. Let

D
a
(S ) = D
a
(S )/ (the equivalence classes). Dene
D(G
a
) =
_
u : u D
a
(S ) and there exist U = U(a) with u|U D(G
U
).
Dene DG
a
u = (G
U
u)/ where u = u| , u|U D(G
U
). From above 220
it follows that this is independent of the choice of u. We now prove that
if uD(G ) then u

= u|S

D(G

) and G
u
= G

in (S

)
0
. Note that if
[b, c] = S

, =
U
=
b

c
, U = (S

)
o
. We have
u() = G

f () = E

_
0
e
t
f (x
t
)dt
_

_
= E

__

0
e
t
f (x
t
)dt
_
+ E

b
e
t
f (x
t
)dt :
b
<
c
_

_
+ E

c
e
t
f (x
t
)dt
c
<
b
_

_
= E

_
0
e
t
f (x
t
(w

))dt
_

_
+ G

f (b)E

(e

b
:
b
<
c
)
+ G

f (c)E

(e

c
:
c
<
b
)
by strong Markov property. Put f

= f in U, f

(b) = G

f (b) and
f

(c) = G

f (c). Then it is easy to show that u

= u|s

= G

and
G

= Gu in U.
Denition (). G
a
is called the local generator at a.
4. Fellers form of generators (1) Scale 187
4 Fellers form of generators (1) Scale
We shall derive Fellers cannocial form of generators by purely proba-
bilistic methods following Dynkin in the following articles.
Let a be a conservative regular point. There exists U = U(a) = (b, c)
such that

U has only conservative regular point and E

(
U
) = E

(
U
) <
for U. Put s() = P

(
c
<
b
).
(1

) s D(G
U
) and G
U
s = 0 in

U.
Let f (c) = 1 and f () = 0, [b, c). Then f D(

U) = D

and
G

f () = E

_
0
e
t
f (x
t
)dt
_

_
= E

__

0
e
t
f (x
t
(w

U
))dt
_
=
E

c
e
t
dt :
c
<
b
_

_
.
221
Hence lim
0
G

f () = P

(
c
<
b
) = s(). The resolvent equa-
tion gives
(G

) f + ( )G

f = 0 or
(G

) f + ( )G

f = 0.
Letting 0 we get s() + G

s() = 0. Therefore rstly


s D

and again since s = G

s, s D(G

) and
G

s = s (G

)
1
s = s s = 0 in

U.
(2

) is continuous in

U.
Since s D

and all points of U are regular for s

, s is continuous
in U.
It remains to prove that s is continuous at b and c. We prove the
continuity at c; continuity at b is proved in the same way. To prove
188 6. Linear Diusion
this we shall rst prove that e = lim
c
E(e

c
) = 1 or 0,
c
=
lim
c

. Let < < < c. Then E

(e

) = E

(e

)E

(e

c
).
Letting c, now and c nally we get e = e
2
so that e = 1 or
0. Since c is regular, there exists < c such that P

(
c
< ) > 0
and then E

(e

c
) > 0. Also
c

c
. It follows that E

(e

c
) > 222
0. Hence e = 1. Since is conservative
P

(x

c
= ,
c
< ) P

< ) = 0.
Therefore since
c

c
and the paths are continuous before the
killing time, P

(
c
=
c
) = 1. We have proved that lim
c
E

(e

c
)
= 1. For every > 0, therefore, lim
c
P

(
c
< ) = 1. Also
S () = P

(
c
<
b
) P

(
c
< ,
b
) P

(
c
< ) P

(
b
< )
If b <
0
< < c, then
P

(
b
< ) P

0
< ,
b
(w
+

0
) < )
= P

0
< )P

0
(
b
< ) P

0
(
b
< )
Therefore s() P

(
c
< ) P

0
(
b
< ). Letting c rst and
0 next, we get lim
c
s() 1 i.e. s() is continuous at = c.
(3

) s() is strictly increasing.


The set of points , b < c such that s() = 0 is closed in (b, c].
If P

0
(
c
<
b
) = 0, the same is evidently true for any b < <
0
.
Since
0
is regular lim

0
P

0
(

< ) = 1 for any > 0. Also


P
xi
0
(
b
> 0) = 1. It easily follows that lim

0
P

0
(

<
b
) = 1.
Choose
0
>
0
with P

0
(

0
<
b
) > 0. Then P

0
(

<
b
) > 0
for any
0
< <
0
. Now that if a < then (
a
<

) = (w : 223

(w

a
) = ), and hence is in B

a
. We have 0 = P

0
(
a
<

b
) = P

0
(

<
b
)P

(
c
<
b
). Thus P

(
c
<
b
) = 0.
The connectedness of (b, c] shows that s() 0 in (b, c]. Exactly
4. Fellers form of generators (1) Scale 189
similar argument also shows that s() < 1 in [b, c). Now if < ,
we replace c by and repeat the argument to get P

<
b
) < 1.
Thus if <
s() = P

(
c
<
b
) = P

<
b
)P

(
c
<
b
) < P

(
c
<
b
)
(4

) s + is the general solution of G

u = 0.
Let f () = 1 for b c. Then f () = 1 = E

(
_

0
e
t
f (x
t
)dt)
= 0

f (). This rstly shows that f D

(s

) and then the same


equation shows that f D(G

). Thus G

11(G

)
1
1 = =
0. Hence since G

s = 0 G

(s + ) = 0. Now let G

u = 0. Then
0 = E

U
_
0
G

u(x
t
)dt
_

_
= E

(u(x

U
)) u() = E

(u(x

U
)) u()
= u(b)P

(
b
<
c
) + u(c)P

(
c
<
b
) u().
Therefore u is linear in s.
(5

) If b < b

< c then P

(
c
<
b
) =
s() s(b

)
s(c

) s(b

)
Let x = P

(
c
<
b
), y = P

(
b
<
c
); then x + y = 1 and
P

(
c
<
b
) = P

(
c
<
b
)P
c
(
c
<
b
)
Also
(
c
<
b
) = (
c
<
b
) (
c
>
b
,
b
(w
+

) >
c
(w
+

))
Therefore 224
P

(
c
<
b
) = P

(
c
<
b
)P
c
(
c
<
b
) + P

(
c
>
b
,
b
(w
+

)
>
c
(w
+

))P
c
(
c
<
b
)
= xs(c

) + P

(
c
>
b
)P
b
(
b
>
c
)P
c
(
c
<
b
)
= xs(c

) + P

(
b
<
c
)P
b
(
c
<
b
)
i.e., s() = x s(c

) + y s(b

). Solving for x we get the result.

Denition (). s is called the canonical sacle in b, c.


190 6. Linear Diusion
5 Fellers form of generator (2) Speed measure
Let p() = E

(
U
) = E

(
U
), U = (b, c). Put f = 1 for x U, f (b) =
f (c) = 0. Then G

f () = E

_
0
e
t
f (x
t
)dt
_

_
= E

U
_
0
e
t
dt
_

_
, so that
lim
0
G

f () = p()
We have G

f G

f + ( )G

f = 0. Letting 0
G

f p + G

p = 0.
This shows that p D(G

), because, f being indentically 1 in U, is


continuous at every regular-point and b, c are traps for M

. We have,
(1

) G

p = f i.e. G

p = 1 in U, G

p(b) = G

p(c) = 0
p is continuous in

U and p(b) = p(c) = 0.
We prove that p(c) = p(c) = 0. Let b < < c and

=
(b,)
. 225
Then if b <
0
<
E

0
(
U
) = E

0
(

) + E

0
(
U
(w
+

)) = E

0
(

)
+ E

0
(E
x

(
U
) :

<
b
) = E

0
(

) + E

(
U
)P

0
(

<
b
).
Now as c, E

0
(

) E

0
(
U
) and


c
. Therefore
lim
c
E

(
U
)P

0
(
c
<
b
) = 0 i.e. p(c) = 0.
(3

) p is the only solution of G

u = 1 in U, u(b) = u(c) = 0.
For p() = E

(
U
) = E

U
_
0
G

u(x
t
)dt) = E

(u(x

U
)) u().
Since x

U
= b or c, u(x

U
) = 0. Q.E.D.
(4

) We have proved that s : [b, c] [0, 1] is 1 1 continuous. We


dene a mapping p

on [0, 1] by p

(s()) = p(). To prove that p

6. Fellers form of generators (3) 191


is strictly concave in [0, 1]. We express this by p is concave in
s. We have to prove that, if b < < c
p() >
s() s()
s() s()
p() +
s() s()
s() s()
p().
Now p() = E

(
U
) = E

( =
U
(w
+

)), =
,
> E

(
U
(w
+

)) =
right side of the above inequlity
(5

) m() =
d
+
p
ds
is strictly increasing and bounded if there exists an
interval V such that E

(
V
) and M
V
is conservative. (The
measure dm is called the speed measure for

U).
From (4

) the right derivative


d
+
p
ds
exists and strictly increases. We 226
prove that it is bounded. Let V = (b
1
, c
1
) [b, c]. Put p
1
() = E

(
V
),
s
1
() = P

(
c
1
<
b
1
). We have
P
1
() = E

(
U
) + E

(
V
(w
+

U
)) = p() + s()P
1
(c) + (1 s())P
1
(b).
From this one easily sees that
m
1
() = d + p
1
()ds
1
= [m() (P
1
(c) p
1
(b))]
1
s
1
(c) s
1
(b)
Q.E.D.
6 Fellers form of generators (3)
Theorem (Feller). u D(G

) if and only if
(1) u is of bounded variation in U
(2) du < ds i.e. du is absolutely continuous with respect to ds.
(3)
du
ds
(Radon Nikodym derivative) is of bounded variation in U.
(4) d
du
ds
< dm in U.
192 6. Linear Diusion
(5) (d
du
ds
)/dm (which we shall write
d
dm
du
ds
has a continuous version in
U.
(6) u is continuous at b and c i.e. u is continuous in

U and G

u =
d
dm
du
ds
in U, G

u = 0 at b and c.
Proof. (Dynkin) Let u (G

). Then for some f D

u() = G

f () = E

U
_
0
e
t
f (x
t
)dt
_

_
+ E

(e

b
:
b
<
c
)
f (b)

+
f (c)

(e

c
:
c
<
b
).
Thus lim
b
u() =
f (b)

= u(b) and lim


c
u() =
f (c)

= u(c). u is
therefore continuous in

U. 227
Let [, ] U. If G

V 0 in (, ) then Dynkins formula shows


0 E
_

_
0
G

v(x
t
)dt
_

_
= v()
s() s()
s() s()
+ v()
s() s()
s() s()
v(),
so that v is convex in s and hence is of bounded variation in [, ]. Also
d
+
v
ds
exists and increases in [, ] and dv is absolutely continuous with
respect to ds. If G

v in (, ), then G

(v + p) 0 in (, ).
Therefore d
d
+
v
ds
dm. Similarly if G

v in (, ) then d
d
+
v
ds

dm.
Consider a division = (b =
0
<
1
< <
n
= c) of [b, c]. Put

i
= inf
(
i
,
i+1
)
G

u(),
i
= sup
(
i
,
i+1
)
G

u().
Then
i
dm d
d
+
u
ds

i
dm in (
i
,
i+1
) and
i
dm G

udm
i
dm
in (
i
,
i+1
). Putting () =
i
and () =
i
for
i
<
i+1
we
7. Fellers form of generators (4). . . 193
have ()dm d
d
+
u
ds
()dm, and ()dm G

u()dm ()dm.
Therefore (() ())dm d
d
+
u
ds
G

u()dm (() ())dm. As


() = max
i
[
i+1

i
] tends to zero, () () 0. We have
d
d
+
u
ds
= G

u dm in U.
228
Conversely suppose that u satises all the above six conditions. De-
ne f = u
d
dm
d
ds
u in U and f (b) = u(b), f (c) = u(c). Then
since f is continuous in U, f D

. Let v = G

f . From what we
have already proved v
d
dm
d
ds
v = f in U, v(b) =
b

f (b), v(c) =
1

f (c). If = u v then is continuous in



U, (b) = (c) = 0, and

d
dm
d
ds
= 0. There exists a point
0
such that (
0
)is a maxi-
mum. Now (
0
) > () > 0 near
0

d
dm
d
ds
> 0 near
0

d
ds

strictly increases near


0
. Then if >
o
> are near
o
we have
(
o
) () =

o
_

d
ds
ds <
d
ds
(
0
)[s(
0
) s()]. Hence
d
ds
(
0
) > 0. On the
other hand ()(
o
) =

o
d
ds
ds >
d
ds
(
o
)[s()s(
o
)], a contradiction.
Therefore () 0. Smiliarly we prove () 0. Q.E.D.
7 Fellers form of generators (4) Conservative com-
pact interval
Let I = [b, c] be a conservative compact regular interval i.e., a compact
interval consisting only of conservative regular points. We shall prove
the following
Theorem (Feller). All the results of the three articles hold for M
I
.
194 6. Linear Diusion
Proof. Since every a I is conservative regular, we can associate with
any a I an open interval U(a) such that E

(
U(a)
) < for U(a)
and then the results of the last three articles are true for M
U(a)
. Denote 229
the quantities s, p, m etc. for

U by s
U
, p
U
, m
U
, etc. Let s = P

(
c
<
b
).
Then from (5
0
) of 4 we get
s() = s(b

) + [s(c

) s(b

)]P

(
c
<
b
),
where (b

, c

) is an interval such that the results of the last three ar-


ticles are true for M
[b

,c

]
. This equation shows that s() is strictly in-
creasing and continuous in some neighbourhood of the point . There-
fore s() is strictly increasing and continuous in I, and s is linear in s
U
in U, for every interval U such that the results of the last three arti-
cles are true for M
U
. Let dm be a measure dened on B(I) as follows.
dm =
1

U
dm
U
if in U, s =
U
s
U
+
U
. Let V U = W . Since
p
U
= p
W
+ p
U
(b

) + s
W
()[p
U
(c

) p
U
(b

)] where U = (b

, c

) we have
1

U
dm
U
= dm
W
=
1

V
dm
V
. Therefore the measure dm is uniquely de-
ned on B(I) and
d
dm
d
ds
=
d
ds
U
d
ds
U
in U. dm is dened by a strictly
increasing function m (say) in I. Consider now the following dieren-
tial equation
d
dm
d
ds
u = 1 in (b, c) and u(b+) = u(c) = 0.
Then p() =

_
b+
m()ds() +
c
_
b+
m()ds()[s() S (b)]
1
s(c) s(b)
is a solution and (
d
dm
d
ds
)p = p+1 in (b, c) and p(b+) = p(c) = 0.
Let f = p + 1 in (b, c) and f (b) = f (c) = 0. Since f is continuous in
(b, c), f D
I
. Let v = G
I
f . Then v D(G
I
) and ( G
I
)v = p + 1. 230
v (G
I
) v D(G

U
) so that (
d
dm
U
d
ds
U
)v = p+1 in U. Therefore
(
d
dm
d
dS
)v = p + 1 in (b, c). Since v D(G
I
), it is continuous in I.
Let = p v. is continuous in I and (
d
dm
d
dS
) = 0. We prove as
7. Fellers form of generators (4). . . 195
in 6 that = 0. Thus p() = v() D(G
I
). Using Dynkins formula
we have, if
n
=
(b,c)
n = n (say)
E

n
_
0
G
I
(p(x
t
)dt) = E

(p(x

n
)) p()
_

_
i.e.,
E

(
n
) 2|| p|| < . We get E

() < .
Again using Dynkins formula
E

_
0
G
I
p(x
t
)dt
_

_
= E

(p(x

)) p() i.e. p() = E

(
(b,c)
).
The proof of the theorem can be completed as in 6.
Bibliography
231
[1] DOOB, J.L. Stochastic Processes, New York (1952)
[2] HALMOS, P.R. Measure Theory, New York (1950)
[3] KOLMOGROV, A. N. Grundbegrie der Wahrscheinlickeit-
srechnung Ergeb. der Math. 2, 3 (1993) [English translation: Foun-
dations of the theory of Probability (1956)]
Sections 1 and 2
[4] BLUMENTHAL, R.M. -An extended Markov property, Trans.
Amer Math. Soc., 85(1957), pp. 52-72
[5] CHUNG, K.L -On a basic property of Markov chains, Ann. Math.
68(1958), pp.126-149
[6] DYNKIN, E. B. Innitesimal operators of Markov processes, The-
ory of Probility and its applications 1; 1 (1956) (in Russian),pp.
38-60
[7] KAC, M. -On some connections between Probability Theory and
dierential and integral equations. Proc. of the Second Berkely
Symposium on Mathematical Statistics and Probability, 1951,pp.
199-215
[8] RAY, D. B. -Resolvents, Translation functions and strongly
Markovian Process, Ann. Math. 70, 1(1959), pp. 43-72
197
198 BIBLIOGRAPHY
[9] RAY, D. B. -Stationary Markov process with continuous paths,
Trans. Amer. Math. Soc., 49 (1951), pp. 137-172
Sections 3
[10] DOOB, J. L. -A probability approach to the heat equation, Trans. 232
Amer. Math. Soc.80, 1(1955), pp. 216-280
[11] DOOB, J. L. -Conditional Brownian mation and the boundary lim-
its of harmonic functions, Bull. Soc Math. France, 85(1957), pp.
431-458
[12] DOOB, J. L. -Probability methods applied to the rst boundary
value problem, Proc. of Thrid Berkeley Symposium on Math. stat.
and Probability, 2, pp. 49-80
[13] DOOB, J. L. -Probability theory and the rst boundary value prob-
lem, IIi. Jour. Math., 2 -1 (1958), pp. 19-36
[14] DOOB, J. L. -Semi-martingales and subharmonic functions, Trans
Amer. Math. Soc, 77-1 (1954), pp. 86-121
[15] HUNT, G. A. -Markov processes and potentials,III. Jour. Math.,
1(1957), pp. 44-93; pp. 316-369 and 2(1958), pp. 151-213
[16] ITO, K & McKEAN, H. P -Potentials and the random walk, Ill.
Jour. Math.4-1(1960), pp. 119-132
[17] LEVY, P. -Processes Stochastique et Movement Brownien, Paris,
1948
Section 4
[18] DOOB, J. L. - loc.cit.
[19] FROSTMAN, O. - Potential dequilibre et capacite des ensembles
avecquelques applications a la theorie des fonctions, These pour le
doctract, Luad, 1935, pp. 1-118
[20] ITO, K. - Stochastic Processes, jap. Jour. Math., 18(1942), pp.
261-301
BIBLIOGRAPHY 199
[21] LEVY, P. - Theorie de laddition des variables aleatoires, 2nd ed.,
Paris, (1954) [Chap. 7]
[22] RIESZ, M. - Integrale de Riemann-Lioville et Potentiels Acta Sci.
Szeged. 9(1938); pp.1 - 42
Section 5
[23] DOOB, J. L. -loc.cit [Chap. 10]
[24] ITO, K. -Stochastic dierential equations, Memoir. Amer. Math.
Soc., 4, 1951
Section 6 233
[25] DYNKIN, E. B. -loc.cit.
[26] FELLER, W. -The parabolic dierential equations and the asso-
ciated semi groups of transformations, Ann. Math.55(1952), pp.
468-519
[27] FELLER, W. -On second order dierntial operatorts, Ann. Math
61-1(1955), pp. 90-108
[28] FELLER, W. -On the intrinsic form for second order dierential
operators, Ill. Jour. Math. 2-1(1958), pp. 1-18
[29] RAY, D. B loc.cit.

Das könnte Ihnen auch gefallen