Springer Texts in Electrical Engineering
Stochastic Processes in
Engineering Systems
Springer-Verlag
New York Berlin Heidelberg Tokyo
Eugene Wong
Department of Electrical Engineering and Computer Sciences
University of California
Berkeley, California 94720 U.S.A.

Bruce Hajek
Department of Electrical and Computer Engineering
University of Illinois, Urbana-Champaign
Urbana, Illinois 61801 U.S.A.
ISBN-13: 978-1-4612-9545-7 e-ISBN-13: 978-1-4612-5060-9
DOI: 10.1007/978-1-4612-5060-9
Preface
Eugene Wong
Bruce Hajek
Contents
PREFACE v
2 STOCHASTIC PROCESSES 37
3 SECOND-ORDER PROCESSES 74
1. Introduction 74
2. Second-order continuity 77
3. Linear operations and second-order calculus 78
4. Orthogonal expansions 81
5. Wide-sense stationary processes 88
6. Spectral representation 97
7. Lowpass and bandpass processes 105
8. White noise and white-noise integrals 109
9. Linear prediction and filtering 116
1. Introduction 139
2. Stochastic integrals 141
3. Processes defined by stochastic integrals 145
4. Stochastic differential equations 149
5. White noise and stochastic calculus 155
6. Generalizations of the stochastic integral 163
7. Diffusion equations 169
1. Introduction 180
2. The Markov semigroup 182
3. Strong Markov processes 190
4. Characteristic operators 193
5. Diffusion processes 198
1. Martingales 209
2. Sample-path integrals 217
3. Predictable processes 222
4. Isometric integrals 227
5. Semimartingale integrals 233
1. Introduction 250
2. Likelihood ratio representation 254
3. Filter representation: change-of-measure derivation 257
4. Filter representation: innovations derivation 262
5. Recursive estimation 269
1. Introduction 279
2. Homogeneous random fields 280
3. Spherical harmonics and isotropic random fields 285
4. Markovian random fields 292
5. Multiparameter martingales 296
6. Stochastic differential forms 303
REFERENCES 311
INDEX 355
1
Elements of Probability Theory
If A_1 ⊃ A_2 ⊃ · · · and ⋂_{n=1}^∞ A_n = ∅, then lim_{n→∞} P(A_n) = 0 (monotone sequential continuity at ∅)   (1.1c)
1 Complementation, union, and intersection are the most familiar Boolean set operations. Only complementation and either union or intersection need be defined. All other set operations are then expressible in terms of the two basic operations.
2 Since all set operations are expressible in terms of complementation and union, to verify that a class is a σ-algebra, we only need to verify that it is closed under complementation and countable union.
1. EVENTS AND PROBABILITY 3
Proposition 1.1 (Extension Theorem). Let ℬ be an algebra and let σ(ℬ) be its generated σ-algebra. If P is a probability measure defined on ℬ, then there is one and only one probability measure defined on σ(ℬ) whose restriction to ℬ is P [Neveu, 1965, p. 23].
Thus, we have arrived at the basic concept of a probability space. A probability space is a triplet (Ω, 𝒜, P), where Ω is a nonempty set whose elements are usually interpreted as outcomes of a random experiment, 𝒜 is a σ-algebra of subsets of Ω, and P is a probability measure defined on 𝒜. The set Ω will be called the basic space, and its elements are called points. Elements of 𝒜 are called events.
A subset of an event of zero probability is called a null set. Note that a null set need not be an event. A probability space (Ω, 𝒜, P) is said to be complete if every null set is an event (necessarily of zero probability). If (Ω, 𝒜, P) is not already complete, P can be uniquely extended to the σ-algebra 𝒜̄ generated by 𝒜 and its null sets. This procedure is called completion. The process of completion is equivalent to the following: For a given probability space (Ω, 𝒜, P), define for every subset A of Ω an outer probability P*(A) and an inner probability P_*(A) by
semiopen interval, we set P(A) = b − a. If A = ⋃_{i=1}^m A_i is a union of disjoint intervals A_i, we set P(A) = Σ_{i=1}^m P(A_i). Clearly, P satisfies conditions (1.1a) and (1.1b). We shall show that P is, in fact, σ-additive.
Suppose that A_1, A_2, … are disjoint sets in ℬ such that A = ⋃_{i=1}^∞ A_i is also in ℬ. Then A is a finite union of disjoint semiopen intervals I_1, …, I_m, and each I_k ∩ A_i is again a finite union of disjoint semiopen intervals. Therefore, to prove that P is σ-additive, it is enough to show that if the intervals [a_i, b_i), i = 1, 2, …, are disjoint and
[a,b) = ⋃_{i=1}^∞ [a_i, b_i)
then
b − a = Σ_{i=1}^∞ (b_i − a_i)
First, for any finite n we can renumber the intervals so that a_1 ≤ b_1 ≤ a_2 ≤ b_2 ≤ · · · ≤ a_n ≤ b_n. Then
Σ_{i=1}^n (b_i − a_i) + Σ_{i=1}^{n−1} (a_{i+1} − b_i) = b_n − a_1 ≤ b − a
so that Σ_{i=1}^n (b_i − a_i) ≤ b − a for every n, and hence Σ_{i=1}^∞ (b_i − a_i) ≤ b − a.
For the reverse inequality, let δ > 0 and note that the open intervals (a_i − δ/2^i, b_i) cover the compact interval [a, b − δ]. The Heine-Borel theorem [see, e.g., Rudin, 1966, p. 36] then states that there is a finite N such that
[a, b − δ] ⊂ ⋃_{i=1}^N (a_i − δ/2^i, b_i)
It follows that
[a,b) ⊂ [b − δ, b) ∪ ⋃_{i=1}^N [a_i − δ/2^i, b_i)
whence
b − a ≤ δ + Σ_{i=1}^N (b_i − a_i + δ/2^i) ≤ 2δ + Σ_{i=1}^∞ (b_i − a_i)
Since δ > 0 is arbitrary, b − a ≤ Σ_{i=1}^∞ (b_i − a_i), and the two inequalities together prove σ-additivity.
intervals. Sets in 𝒜 are called Borel sets of [0,1). P can be further extended by completion to 𝒜̄. Sets in 𝒜̄ are called Lebesgue-measurable sets of [0,1). We note that 𝒜 and 𝒜̄ include all intervals in [0,1) and not just semiopen ones. In particular, a point x in [0,1) can be considered to be a degenerate interval [x,x] and is in 𝒜.
Let Ω be a basic space, and let 𝒜 be a σ-algebra of subsets of Ω. The pair (Ω, 𝒜) is called a measurable space and is sometimes referred to as a preprobability space. A nonnegative σ-additive set function μ defined on 𝒜 is called a measure. Thus, a probability measure is a measure satisfying μ(Ω) = 1. More generally, μ is said to be a finite measure if μ(Ω) < ∞. Even more generally, if Ω is a countable union of sets A_1, A_2, … in 𝒜 such that μ(A_i) is finite for each i, then μ is said to be a σ-finite measure. For example, let Ω = R be the real line, let 𝒜 be the σ-algebra generated by the class of all intervals, and for any finite interval A define μ(A) = length of A. Then, μ can be extended to a unique σ-finite measure on (Ω, 𝒜), and upon completion it is just the Lebesgue measure of the real line.
called a product and will be denoted by ⋂_{i=1}^n A_i.
The pair (R^n, ℬ^n) is obviously a measurable space. Measures defined on (R^n, ℬ^n) are called Borel measures, and probability measures defined on (R^n, ℬ^n) are called Borel probability measures. Let μ be a finite Borel measure defined on (R^n, ℬ^n), and define a function M on R^n by
M(x) = μ({y: y_i < x_i, i = 1, …, n})
If A_i denotes {x: a_i ≤ x_i < b_i} and (−∞, c) denotes {x: x_n < c}, then
μ(⋂_{i=1}^{n−1} A_i ∩ (−∞, b_n)) = μ(⋂_{i=1}^{n−1} A_i ∩ (−∞, a_n)) + μ(⋂_{i=1}^{n} A_i)
Hence,
μ(⋂_{i=1}^{n} A_i) = μ(⋂_{i=1}^{n−1} A_i ∩ (−∞, b_n)) − μ(⋂_{i=1}^{n−1} A_i ∩ (−∞, a_n))
A = {x: a_i ≤ x_i < b_i, i = 1, …, n}   (2.3)
A = {x: a_i ≤ x_i < b_i, i = 1, …, n}   (2.5)
Let ℬ(𝒞) be the smallest algebra (but not σ-algebra) containing 𝒞. Then, every set in ℬ(𝒞) is a finite union of disjoint rectangles of the form (2.5). Given a distribution function M, we can define a nonnegative set function μ on 𝒞 by
μ(A) = Σ_{x∈S(A)} (−1)^{k(x)} M(x)
where S(A) is the set of corners of the rectangle A and k(x) is the number of coordinates of the corner x at which x_i = a_i.
P_X(A) = Σ_{x_i ∈ A} P({ω: X(ω) = x_i})   (3.7)
3. MEASURABLE FUNCTIONS AND RANDOM VARIABLES 9
The function p_X is called the probability density function for the random variables X_1, …, X_n. Representation (3.8) results from an application of the Radon-Nikodym theorem [Loève, 1963, p. 132]. In terms of the distribution function, (3.8) takes on the more familiar form
P_X(a_1, a_2, …, a_n) = ∫_{−∞}^{a_1} · · · ∫_{−∞}^{a_n} p_X(x_1, x_2, …, x_n) dx_1 dx_2 · · · dx_n   (3.9)
(3.13)
where J(y) is the matrix with elements ∂g_i(y)/∂y_j and |J(y)| stands for the absolute value of the determinant. As an example of (3.15), suppose that (X_1, X_2) has the joint density function p_X(x_1, x_2) = (1/2π) exp[−½(x_1² + x_2²)] and Y = AX, where A is a nonsingular 2 × 2 matrix. Then
p_X(A^{−1}y) = (1/2π) e^{−½ y′(AA′)^{−1} y}
and
p_Y(y) = (1/(2π|A|)) e^{−½ y′(AA′)^{−1} y}
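The change-of-variables formula above can be checked numerically. The sketch below is not from the text; the matrix A, the test point, and the counting-box size are illustrative choices.

```python
import numpy as np

# Monte Carlo check of p_Y(y) = p_X(g(y)) |J(y)| for the linear map Y = A X
# with X standard bivariate normal, so that Y ~ N(0, AA').
rng = np.random.default_rng(0)
A = np.array([[2.0, 1.0], [0.0, 1.0]])   # nonsingular matrix (illustrative)

def p_Y(y):
    """Density of Y = A X predicted by the formula: the N(0, AA') density."""
    cov = A @ A.T
    inv = np.linalg.inv(cov)
    det = np.linalg.det(cov)
    return np.exp(-0.5 * y @ inv @ y) / (2.0 * np.pi * np.sqrt(det))

# Estimate the density of Y at a point by counting samples in a small square.
X = rng.standard_normal((200_000, 2))
Y = X @ A.T
y0 = np.array([0.5, -0.5])
h = 0.2
inside = np.all(np.abs(Y - y0) < h / 2, axis=1)
empirical = inside.mean() / h**2
print(empirical, p_Y(y0))   # the two numbers should be close
```

At the origin the predicted density is 1/(2π√det(AA′)), which for this A equals 1/(4π).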
is unduly clumsy. We shall use the simpler, but less exact, notation
P(X ∈ A)
instead.
The limit lim_{n→∞} A_n is defined for every monotone sequence as ⋃_{n=1}^∞ A_n or ⋂_{n=1}^∞ A_n according as {A_n} is increasing or decreasing. For a general sequence {A_n} (not necessarily monotone), we define superior and inferior limits as follows:
lim sup_{n→∞} A_n = ⋂_{n=1}^∞ ⋃_{k≥n} A_k   (4.1)
lim inf_{n→∞} A_n = ⋃_{n=1}^∞ ⋂_{k≥n} A_k   (4.2)
The superior limit is the set of all points which occur in an infinite number of the A_n's, while the inferior limit is the set of all points which occur in all but a finite number of the A_n's. Hence, lim sup A_n ⊃ lim inf A_n. If the superior limit and the inferior limit coincide, then we say that {A_n} is a convergent sequence, and we set
lim_{n→∞} A_n = lim sup_{n→∞} A_n = lim inf_{n→∞} A_n
We note that lim sup, lim inf, and lim all involve only countable set operations. Hence all such sequential limits of events are again events.
Suppose that {A_n} is a convergent sequence of events, and A is its limit. Then A is an event, and the following proposition relates P(A) to P(A_n).
P(A) = lim_{n→∞} P(A_n)   (4.4)
12 ELEMENTS OF PROBABILITY THEORY
Proof: First, if {A_n} decreases to the empty set ∅, then (4.4) is simply (1.1c) (sequential monotone continuity at ∅). If {A_n} is a decreasing sequence with a nonempty limit A, then {A_n − A} decreases to ∅, so that P(A_n) = P(A) + P(A_n − A) → P(A). The increasing case follows by complementation, and the general case by applying these facts to ⋃_{k≥n} A_k and ⋂_{k≥n} A_k.
Proposition 4.2 (Borel-Cantelli Lemma). If Σ_{n=1}^∞ P(A_n) < ∞, then
P(lim sup_{n→∞} A_n) = 0
Proof: Since lim sup A_n ⊂ ⋃_{k≥n} A_k for every n,
P(lim sup_{n→∞} A_n) = lim_{n→∞} P(⋃_{k≥n} A_k) ≤ lim_{n→∞} Σ_{k≥n} P(A_k) = 0
the last equality because Σ_{n=1}^∞ P(A_n) < ∞.
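The Borel-Cantelli lemma can be illustrated numerically. The sketch below is mine, not the text's; the choice A_n = {ω: U(ω) < 1/n²} on [0,1) with Lebesgue measure is illustrative, and Σ P(A_n) = Σ 1/n² < ∞, so each sample point should lie in only finitely many A_n.

```python
import numpy as np

# For a uniform sample u, the indices n with u < 1/n^2 are exactly those with
# n < 1/sqrt(u); hence the number of events A_n that occur is floor(1/sqrt(u)),
# which is finite for every sample point, as Borel-Cantelli predicts.
rng = np.random.default_rng(1)
u = rng.uniform(1e-12, 1.0, size=10_000)   # bounded away from 0 for safety

counts = np.floor(1.0 / np.sqrt(u)).astype(int)
print(counts.max(), counts.mean())   # finite for every sample point
```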
If the inferior and superior limits of {X_n} agree, we say that the sequence {X_n} converges and set
lim_{n→∞} X_n = lim inf_{n→∞} X_n = lim sup_{n→∞} X_n   (4.7)
Again, we note that even though X_n is finitely valued for each n, the limit lim_{n→∞} X_n may assume values ±∞.
{ω: lim inf_{n→∞} X_n(ω) < a} = {ω: sup_n inf_{k≥n} X_k(ω) < a}
and
{ω: lim sup_{n→∞} X_n(ω) < a} = {ω: inf_n sup_{k≥n} X_k(ω) < a}
= ⋃_n {ω: sup_{k≥n} X_k(ω) < a}   (4.9)
{ω: lim inf_{n→∞} X_n(ω) < a} or {ω: lim sup_{n→∞} X_n(ω) < a}
(4.10)
where α_1, α_2, …, α_n are real numbers, and A_1, A_2, …, A_n are events. Then, X is called a simple random variable.
where the integral needs to be defined. For these reasons, we shall give a definition for EX by defining the integral in (5.2) for a random variable X.
First, we define EX when X is a simple random variable. By definition, this means that X has the form
X = Σ_{i=1}^n x_i I_{A_i}   (5.3)
where A_i are events and x_i are real constants. For such X we define
EX = Σ_{i=1}^n x_i P(A_i)   (5.4)
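Definition (5.4) can be checked against a sample average. This sketch is mine; the partition of [0,1) into three atoms and the values x_i are illustrative choices.

```python
import numpy as np

# Expectation of a simple random variable, computed two ways: by the
# definition EX = sum_i x_i P(A_i), and empirically as a sample average
# of X = sum_i x_i 1_{A_i} on the basic space [0,1) with Lebesgue measure.
rng = np.random.default_rng(2)
omega = rng.uniform(size=100_000)          # sample points of [0,1)

# Events A_1 = [0, .2), A_2 = [.2, .7), A_3 = [.7, 1) with values x_i
pieces = [(0.0, 0.2, 1.0), (0.2, 0.7, -2.0), (0.7, 1.0, 5.0)]
EX = sum((b - a) * x for a, b, x in pieces)         # definition (5.4)

X = np.zeros_like(omega)
for a, b, x in pieces:
    X[(omega >= a) & (omega < b)] = x               # X = sum x_i 1_{A_i}
print(EX, X.mean())   # should agree up to sampling error
```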
EX = lim_{n→∞} EX_n   (5.6)
lim_{n→∞} Z_n = Y_p
5. EXPECTATION OF RANDOM VARIABLES 17
Hence, lim_{n→∞} EX_n ≥ lim_{p→∞} EY_p. Reversing the roles of {X_n} and {Y_n},
EX = ∫_Ω X(ω) P(dω)
We shall also use the abbreviation ∫_Ω X dP. A random variable X is said to be integrable if E|X| < ∞. The following results on sequences of integrable random variables are counterparts of standard results on sequences of integrable functions in integration theory. Proofs will be omitted [Loève, 1963, pp. 124-125].
EX = lim_{n→∞} EX_n   (5.9)
lim_{n→∞} EX_n = EX   (5.11)
We leave the proof of (5.12) to Exercise 1.3. If f is not nonnegative, then we write f = f⁺ − f⁻ as before, and (5.12) still holds provided that one of the pair Ef⁺(X) and Ef⁻(X) is finite. The integral ∫_{R^n} f(x) P_X(dx) is called a Lebesgue-Stieltjes integral and is often written as ∫_{R^n} f(x) dP_X(x) to emphasize the role of P_X as a distribution function. However, for the definition of the integral, it is the role of P_X as a measure which is crucial.
If X = (X_1, …, X_n) are real random variables, the function F_X(u), u ∈ R^n, defined by
F_X(u) = E exp(i Σ_{k=1}^n u_k X_k) = E cos(Σ_{k=1}^n u_k X_k) + i E sin(Σ_{k=1}^n u_k X_k)   (5.13)
(5.16)
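Definition (5.13) can be checked empirically. The sketch below is mine; taking X standard normal, whose characteristic function is exp(−u²/2), is an illustrative choice.

```python
import numpy as np

# Empirical check of the characteristic function definition (5.13):
# for X standard normal, F_X(u) = E exp(iuX) should equal exp(-u^2/2),
# and the imaginary part E sin(uX) should vanish by symmetry.
rng = np.random.default_rng(3)
X = rng.standard_normal(500_000)

u = 1.3
F_emp = np.mean(np.cos(u * X)) + 1j * np.mean(np.sin(u * X))  # E cos + i E sin
F_exact = np.exp(-u**2 / 2)
print(abs(F_emp - F_exact))   # small sampling error
```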
6. CONVERGENCE CONCEPTS
We abbreviate the words "almost sure" and "almost surely" to a.s., and we adopt the equivalent notations lim a.s. X_n = X and X_n → X a.s. as n → ∞ to denote the a.s. convergence of {X_n} to X. We observe that as we have defined it, the limit of an a.s. convergent sequence of random variables is always finite except on a set of probability zero. Furthermore, two functions which are both a.s. limits of the same sequence are equal except on a set of probability zero. With these considerations, we see that we can always take the limit X of an a.s. convergent sequence of random variables {X_n} to be a random variable, i.e., it is finite and measurable.
We note that the convergence theorems for expectation, Propositions 5.2 to 5.4, remain valid if in the statements of these theorems "convergence at every point" is replaced by "convergence almost surely." This follows simply from the fact that if P(A) = 0, then ∫_A X dP = 0. Thus, if X_n → X a.s., and we denote by A the set on which convergence does not take place, then ∫_{Ω−A} X_n dP → ∫_{Ω−A} X dP is the same as ∫_Ω X_n dP → ∫_Ω X dP.
Given a sequence {X_n} which may or may not be almost surely convergent, it is difficult to apply the definition of a.s. convergence to it, because we have no candidate for the limit X. For this reason, the concept of mutual convergence is useful. We say that a sequence {X_n} converges mutually a.s. if sup_{m≥n} |X_m − X_n| → 0 a.s. as n → ∞. By virtue of the Cauchy
Remark: The only part of Proposition 6.2 that is easy to prove is (a). If X_n → X a.s. as n → ∞, then
for every ε > 0. The proof for parts (b) and (c) will be omitted.
There is a third type of convergence which is important to us. We define it as follows.
Proposition 6.3.
(a) If {X_n} converges in νth mean, then it converges in probability to the same limit.
(b) {X_n} converges in νth mean if and only if
sup_{m≥n} E|X_m − X_n|^ν → 0 as n → ∞   (6.7)
Proof:
(a) We make use of the following rather obvious inequality known as the Markov inequality: for any random variable Z and ε > 0,
P(|Z| ≥ ε) ≤ E|Z|^ν / ε^ν
Applying it to Z = X_n − X gives P(|X_n − X| ≥ ε) ≤ E|X_n − X|^ν/ε^ν → 0, which proves that X_n → X in p.
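The Markov inequality holds sample-by-sample, so it can be verified on simulated data. The sketch below is mine; the exponential distribution for Z is an illustrative choice (the bound is distribution-free).

```python
import numpy as np

# Numerical check of the Markov inequality P(|Z| >= eps) <= E|Z|^v / eps^v,
# the bound used in the proof of Proposition 6.3(a).
rng = np.random.default_rng(4)
Z = rng.exponential(scale=1.0, size=200_000)

def markov_slack(z, eps, v):
    """Right side minus left side of the Markov inequality on a sample."""
    lhs = np.mean(np.abs(z) >= eps)
    rhs = np.mean(np.abs(z) ** v) / eps ** v
    return rhs - lhs

slacks = [markov_slack(Z, eps, v) for v in (1, 2, 3) for eps in (0.5, 1.0, 2.0)]
print(min(slacks))   # nonnegative: the inequality holds on the sample itself
```

Because the pointwise bound indicator(|z| ≥ ε) ≤ |z|^ν/ε^ν holds for every sample value, the empirical inequality is exact, not merely approximate.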
(b) We first suppose that {X_n} converges in νth mean to X. Then,
sup_{m≥n} E|X_m − X_n|^ν ≤ 2^ν { E|X_n − X|^ν + sup_{m≥n} E|X_m − X|^ν } → 0 as n → ∞
Σ_n sup_{m≥n} P(|X_m − X_n| ≥ ε) < ∞   (6.9)
Proof: Since (6.9) implies sup_{m≥n} P(|X_m − X_n| ≥ ε) → 0 as n → ∞, the sequence
It follows from (6.9) that Σ_n P(A_n') < ∞, and it follows from the Borel-Cantelli lemma (Proposition 4.2) that
P(lim sup_n A_n') = 0
so that for every ε > 0, |X_n − X| ≥ 2ε for, at most, a finite number of values of n with probability 1. If we take A = ⋃_{k≥1} lim sup_n A_n^{1/k}, then P(A) = 0 and ω ∉ A implies that
E|X_m − X_n|² = (1/√(2π σ_{mn}²)) ∫_{−∞}^∞ x² exp(−x²/(2σ_{mn}²)) dx = σ_{mn}² = |m − n| / mn   (6.11)
Therefore,
∫_{−∞}^∞ x^{2k+2} exp(−½x²) dx = −∫_{−∞}^∞ x^{2k+1} (d/dx)[exp(−½x²)] dx = (2k + 1) ∫_{−∞}^∞ x^{2k} exp(−½x²) dx
Therefore,
E|X_m − X_n|⁴ = 3 (|m − n|/mn)²
E|X_m − X_n|⁶ = 15 (|m − n|/mn)³   (6.12)
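The Gaussian moment values behind (6.12) can be sanity-checked by simulation. This sketch is mine; the scale s is an arbitrary illustrative choice.

```python
import numpy as np

# For Z ~ N(0, s^2), E Z^4 = 3 s^4 and E Z^6 = 15 s^6: the 2k-th moment
# carries the double-factorial constant (2k - 1)!!, exactly the recursion
# obtained above by integration by parts.
rng = np.random.default_rng(5)
s = 0.7
Z = s * rng.standard_normal(2_000_000)

m4 = np.mean(Z**4)
m6 = np.mean(Z**6)
print(m4 / s**4, m6 / s**6)   # approximately 3 and 15
```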
Proof: We write
Therefore,
Similarly, we find
P_n(x) ≤ P(x + ε) + P(|X − X_n| > ε)
P(⋂_{i=1}^r A_{k_i}) = ∏_{i=1}^r P(A_{k_i})   (7.2)
dent events. If A and B are events such that P(B) > 0, then we define the conditional probability of A given B by
p(x_1, …, x_N) = ∏_{i=1}^N p_i(x_i)   (7.5)
(7.7)
P(y|x_1, …, x_N)
= lim_{ε_i↓0, i=1,…,N} P(Y < y | x_i ≤ X_i < x_i + ε_i, i = 1, …, N)   (7.8)
p(y|x_1, …, x_N) dy = P(y ≤ Y < y + dy | X_1 = x_1, …, X_N = x_N)   (7.10)
(7.12)
μ⁺(B) = E I_B X⁺
μ⁻(B) = E I_B X⁻        B ∈ 𝒜′   (7.13)
Therefore, there exist 𝒜′-measurable functions φ⁺ and φ⁻ such that for all B in 𝒜′,
(7.14)
If we set
E^{𝒜′}X = φ⁺ − φ⁻
then E^{𝒜′}X will have the defining properties, and uniqueness follows from the uniqueness of φ⁺ and φ⁻.
If EX⁺ and EX⁻ are both finite, that is, if E|X| is finite, then E^{𝒜′}X can always be taken to be finite valued so that it is a random variable as we have defined it. If not, E^{𝒜′}X may have to assume values of ±∞ and is a random variable only in an extended sense. Of course, if we had defined random variables to be extended-valued functions to begin with, this difficulty would be avoided. This is done by many authors. However, this approach also has its disadvantages. For example, extended-valued functions cannot always be added, because the sum may involve ∞ − ∞. We shall continue to define random variables as real-valued functions. When the need arises, we shall make free use of extended-valued measurable functions. While they are not random variables in our sense, the difference is seldom important.
Roughly speaking, a conditional expectation has (almost surely)
all the properties of an expectation. We make this precise by the following
proposition.
7. INDEPENDENCE AND CONDITIONAL EXPECTATION 29
Proposition 7.3. If X = c a.s., then E^{𝒜′}X = c a.s., and if X ≥ Y a.s., then E^{𝒜′}X ≥ E^{𝒜′}Y a.s. Furthermore, E^{𝒜′} is a linear operation, that is,
a.s.   (7.16)
E^{𝒜′}X_n → E^{𝒜′}X as n → ∞   (7.17)
(7.18)
(7.20)
(a) If B is an atom of 𝒜′ and P(B) > 0, then the value of E^{𝒜′}X on B is given by
a.s.   (7.24)
Proof:
(a) If B is an atom of 𝒜′, then E^{𝒜′}X must be constant on B, because E^{𝒜′}X is 𝒜′ measurable. By definition,
E I_B E^{𝒜′}X = E I_B X
Since E^{𝒜′}X is constant on B, we also have
E I_B E^{𝒜′}X = (E I_B)(E^{𝒜′}X)_B
Hence, (E^{𝒜′}X)_B = (1/E I_B) E I_B X, which proves (a).
(b) Let B ∈ 𝒜′. Then I_B and X are independent. Therefore,
E I_B E^{𝒜′}X = E I_B X = (EX)(E I_B)
On the other hand, EX is 𝒜′ measurable and
E(I_B EX) = (EX)(E I_B)
Hence, E^{𝒜′}X = EX a.s. by virtue of the uniqueness of E^{𝒜′}X.
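Part (a) gives a concrete recipe when 𝒜′ is generated by a finite partition: on each atom, E^{𝒜′}X is the average of X over that atom. The sketch below is mine; the four-atom partition of [0,1) and the choice of X are illustrative.

```python
import numpy as np

# Conditional expectation with respect to a finite sigma-algebra: on an atom
# B, E^{A'}X equals (1/E 1_B) E 1_B X, i.e. the average of X over B.
rng = np.random.default_rng(6)
omega = rng.uniform(size=400_000)
X = np.sin(2 * np.pi * omega)            # a random variable on [0,1)

atoms = [(k / 4, (k + 1) / 4) for k in range(4)]
condX = np.empty_like(X)
for a, b in atoms:
    B = (omega >= a) & (omega < b)
    condX[B] = X[B].mean()               # constant value of E^{A'}X on atom B

# Defining property E 1_B E^{A'}X = E 1_B X holds exactly on each atom by
# construction, and summing over the atoms gives E[E^{A'}X] = EX.
print(condX.mean(), X.mean())
```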
(c) If B is an event in 𝒜′, and Y = I_B, then for every A ∈ 𝒜′,
E I_A E^{𝒜′}(YX) = E I_A I_B X = E I_{A∩B} X
E I_A Y E^{𝒜′}X = E I_A I_B E^{𝒜′}X = E I_{A∩B} X
E I_A Y_n E^{𝒜′}X → E I_A Y E^{𝒜′}X as n → ∞
E I_A E^{𝒜′}(X Y_n) = E I_A X Y_n → E I_A X Y = E I_A E^{𝒜′}(XY) as n → ∞
a.s.   (7.27)
The notation E^{𝒜_X}Y is unnecessarily cumbersome, and one usually writes E(Y|X) instead.
To show (7.27), we begin by recalling the probability measure P_X on the σ-algebra of Borel sets ℬ defined by
P_X(B) = P({ω: X(ω) ∈ B})   (7.28)
Let I_B(x), x ∈ R, B ∈ ℬ, be the indicator function
I_B(x) = 1 if x ∈ B, and I_B(x) = 0 if x ∉ B   (7.29)
E I_B(X) Y = ∫_{{ω: X(ω)∈B}} (E^{𝒜_X}Y)(ω) P(dω)   (7.31)
It is not hard to verify that (7.36) reduces to (7.6) and (7.7) under the corresponding conditions.
We close this chapter with an important result which is used repeatedly in Chapters 6 and 7. Let 𝒳 be a vector space of bounded real-valued functions on a set Ω. 𝒳 is said to be closed under uniform (resp. bounded monotone) convergence if whenever a sequence f_n in 𝒳 converges uniformly to a function f (resp. whenever there is a pointwise monotone increasing sequence of functions f_n such that the functions are uniformly bounded above by a constant and f is the limit), it holds that f is in 𝒳.
EXERCISES
1. Let Ω = [0, ∞).
(a) Let C_1 be the class of all intervals of the form [0,a). Show that C_1 is not an algebra.
(b) Let C_2 be the class of all unions of a finite number of intervals of the form [a,b). Show that C_2 is an algebra, but not a σ-algebra.
(d) Show that the σ-algebra generated by C_2 contains all intervals in [0, ∞), closed or open at either end.
2. Suppose that Ω = [0,∞), and C is the class of all intervals of the form [0,a), 0 < a ≤ ∞. Let P(x), 0 ≤ x < ∞, be a left-continuous nondecreasing function such that P(0) = 0 and lim_{x→∞} P(x) = 1. We define 𝒫 on C by
𝒫([0,a)) = P(a)
(b) How must 𝒫((a,b)), 𝒫([a,b)), and 𝒫([a,b]) be defined in terms of P in order for 𝒫 to be σ-additive?
4. Let {A_n, n = 1, 2, …} be a sequence of sets. We can show that a point ω belongs to ⋂_{n=1}^∞ ⋃_{k≥n} A_k if and only if it belongs to an infinite number of the A_n's as follows.
contains ⋂_{n=1}^∞ ⋃_{k≥n} A_k, this proves that ω cannot belong to ⋂_{n=1}^∞ ⋃_{k≥n} A_k if ω belongs to only a finite number of the A_n's.
5. Suppose that X_1, …, X_n are n random variables with a joint density function p_X. Let Y_1, …, Y_n be defined by Y_i = f_i(X_1, …, X_n). Suppose that f has a differentiable inverse g so that X_i = g_i(Y_1, …, Y_n). Show that the joint density function of Y_1, …, Y_n is given by
where |J| denotes the absolute value of the determinant of J(y) = [∂g_i/∂y_j].
Suggestion: Consider the incremental "rectangle" in R^n with sides dy_1, dy_2, …, dy_n and located at the point y. Under the transformation g this rectangle is mapped approximately into a "parallelepiped" located at g(y) with sides J(y) dy, the volume of which is |J(y)| dy_1 dy_2 · · · dy_n [see, for example, Birkhoff and Mac Lane, 1953, pp. 307-310]. The desired result now follows from the interpretation of the probability density function as probability per unit volume [see also, Thomasian, 1969, pp. 362-363].
Let Y_k = Σ_{j=1}^k X_j, k = 1, …, n. Find the joint density p_Y for Y_1, …, Y_n.
8. Prove that
lim_{n→∞} E( |X_n − X| / (1 + |X_n − X|) ) = 0
if and only if X_n converges to X in probability.
11. Starting from the Schwarz inequality |EXY|² ≤ EX²EY², prove that X_n → X in q.m. as n → ∞ implies that
EX_n → EX as n → ∞
and
12. If {X_n} converges in q.m. to X and each X_n has a density function p_n(x) = (1/√(2πσ_n²)) exp[−½(x − μ_n)²/σ_n²], prove that X has a density.
p(x_1, x_2) = (1/(2π√(1 − ρ²))) exp[ −(x_1² − 2ρx_1x_2 + x_2²) / (2(1 − ρ²)) ]
find E(X_1|X_2).
Let Y = √(X_1² + X_2²). Find E(X_1|Y).
Hint: Introduce a random variable Φ so that X_1 = Y cos Φ and X_2 = Y sin Φ.
16. Let 𝒜_X denote the smallest σ-algebra with respect to which the random variables X = (X_1, X_2, …, X_n) are all measurable. Let X^{−1}(B) denote {ω: X(ω) ∈ B}. Show that 𝒜_X = {X^{−1}(B), B ∈ ℬ^n}. In other words, 𝒜_X is the collection of all inverse images of Borel sets.
2
Stochastic Processes
We illustrate the previous remarks by an example. Suppose that {X_t, 0 ≤ t ≤ 1} is a stochastic process. The event
{… , n = 1, 2, …}   (1.2)
and
may not even be in 𝒜, that is, it may not be an event, because it involves
uncountable set operations on events. Whether it is an event or not
depends on the precise nature of the probability space and the process.
We shall consider these questions in greater detail in Sec. 2.
In practice, one seldom begins with a given probability space and a given family of random variables defined on it. Instead one often starts with a proposed collection of finite-dimensional distributions {P_{T_n}, all finite T_n in T}, which is usually obtained by a combination of observations and hypotheses. The question then arises as to whether we can always find a stochastic process having these distributions. We shall answer this question as clearly as we can, because it is a source of some confusion.
1. DEFINITION AND PRELIMINARY CONSIDERATIONS 39
First, the collection of finite-dimensional distributions must be compatible in the following sense: If T_n and T_m are two ordered finite sets from T such that T_n contains T_m, then P_{T_m} must be equal to P_{T_n} with the appropriate variables set to ∞. For example,
(1.5)
Given a compatible family of finite-dimensional distributions {P_{T_n}, all finite T_n in T}, we can always find a probability space (Ω, 𝒜, P) and a family of random variables {X_t, t ∈ T} having the given finite-dimensional distributions. The proof is by construction. Let Ω = R^T = {the set of all real-valued functions defined on T}. Let X_t(ω) = the value of ω at t. Let ℬ_X and 𝒜_X be, respectively, the smallest Boolean algebra and the smallest σ-algebra with respect to which every X_t is measurable. Now, every set in ℬ_X is of the form
{ω: (X_{t_1}(ω), X_{t_2}(ω), …, X_{t_n}(ω)) ∈ B}   (1.6)
where B is an n-dimensional Borel set. Given a compatible family of finite-dimensional distributions {P_{T_n}}, we set
P({ω: (X_{t_1}(ω), X_{t_2}(ω), …, X_{t_n}(ω)) ∈ B}) = ∫_B dP_{t_1,t_2,…,t_n}(x_1, …, x_n)   (1.7)
This defines an elementary probability measure P on (Ω, ℬ_X). Now, it can be shown that P, so defined, is not only finitely additive, but also σ-additive. This means that P is a probability measure and can be uniquely extended to 𝒜_X. To show that P is σ-additive, and not merely finitely additive, is fairly difficult [see Neveu, 1965, pp. 82-83], and the proof will be omitted. To summarize, we take Ω = R^T,
X_t(ω) = value of ω at t   (1.8)
and 𝒜_X to be the minimal σ-algebra generated by {X_t, t ∈ T}. The probability measure P is defined by (1.7). So defined, the process {X_t, t ∈ T} has the prescribed finite-dimensional distributions. We note that {X_t(ω), ω ∈ R^T, t ∈ T} as defined by (1.8) is called the coordinate function, because X_t(ω) is the t-th coordinate of ω. Sets of the form (1.6) are called cylinder sets.
In the construction that we have just given, the basic space Ω was taken to be R^T, and the σ-algebra was taken to be 𝒜_X, the minimal σ-algebra containing all cylinder sets. In a sense, R^T is too big and 𝒜_X is rather small. For example, sets of the form
{ω: a ≤ X_t(ω) ≤ b for all t ∈ T}
are not in 𝒜_X, if T is uncountable. Sometimes, it may be convenient to work with a different Ω. Suppose that we are given a compatible family
40 STOCHASTIC PROCESSES
P(X_1 > ε, X_2 < −ε) = [ ∫_ε^∞ (1/√(2π)) exp(−½z²) dz ]²   (1.12)
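Equation (1.12) can be checked by simulation, assuming (as in the surrounding example) that X_1 and X_2 are independent standard normal variables; the probability of the quadrant event then factors into the square of a one-sided tail integral. The sketch and the value of ε are mine.

```python
import math
import numpy as np

# Monte Carlo check of (1.12) for independent standard normal X1, X2.
rng = np.random.default_rng(7)
eps = 0.5
X1, X2 = rng.standard_normal((2, 1_000_000))

empirical = np.mean((X1 > eps) & (X2 < -eps))
tail = 0.5 * math.erfc(eps / math.sqrt(2.0))  # integral of the N(0,1) density over (eps, inf)
print(empirical, tail**2)
```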
Since T ⊃ S, the opposite inequality holds also. Hence for any open interval I
Remarks:
(a) Although the set {ω: X_t(ω) ≠ X̃_t(ω)} is a null event for each t, the set
{ω: X_t(ω) ≠ X̃_t(ω) for at least one t in T} = ⋃_{t∈T} {ω: X_t(ω) ≠ X̃_t(ω)}   (2.4)
need not be an event and need not have zero probability even if it is an event. If it is a null event, then {X_t, t ∈ T} is itself a separable process.
(b) Obviously, {X_t, t ∈ T} and {X̃_t, t ∈ T} have the same finite-dimensional distributions.
(c) It may be necessary for X̃_t to assume values ±∞.
A proof of Proposition 2.1 will be omitted. Suffice it to note that the standard proof is by construction, and the separating set S that results is, in general, quite complicated. The situation improves if a continuity condition is satisfied by the finite-dimensional distributions. A process {X_t, t ∈ T} is said to be continuous in probability at t if
(2.5)
The process {X_t, t ∈ [0,1]} is nonseparable, because for any set S ⊂ [0,1],
{ω: X_t(ω) = 0 for all t ∈ S} = [0,1] − S
Therefore,
P({ω: X_t(ω) = 0 for all t in [0,1]}) = 0
while if S is any countable set, then
P({ω: X_t(ω) = 0 for all t in S}) = 1
Now, let {X̃_t, t ∈ [0,1]} be defined by
X̃_t(ω) = 0 for all t and ω   (2.8)
The process {X̃_t, t ∈ [0,1]} is clearly separable. Indeed, for every closed set K,
{ω: X̃_t(ω) ∈ K for all t ∈ [0,1]} = [0,1] if K ∋ 0, and = ∅ if K ∌ 0
For each t,
{ω: X_t(ω) = X̃_t(ω)} = [0,1] − {t}
which is an event with probability 1.
It is often desirable to be able to define integrals of the form ∫_a^b X_t(ω) dt. If the integral is to be interpreted as a Lebesgue integral of sample functions, Lebesgue integrability of almost all sample functions is clearly a necessity. Even if almost all sample functions of {X_t, t ∈ T} are Lebesgue integrable, the resulting integral ∫_a^b X_t(ω) dt still may not be a random variable. What is needed is that X_t(ω) defines a (t,ω) function measurable with respect to ℒ ⊗ 𝒜, where ℒ denotes the σ-algebra
2. SEPARABILITY AND MEASURABILITY 45
∫_A E|X_t| dt < ∞
P(Z < a) = ∫_{−∞}^a (1/√(2πσ²)) exp[ −(z − μ)²/(2σ²) ] dz   (3.1)
Z = Σ_{i=1}^N α_i X_{t_i}   (3.5)
EZ = Σ_{k=1}^N u_k μ(t_k)
and
E(Z − EZ)² = Σ_{k,l=1}^N u_k u_l R(t_k, t_l)
Conversely, suppose conditions (a) and (b) are satisfied. Let Z = Σ_{k=1}^N α_k X_{t_k}. Then
E e^{iuZ} = exp[ iu Σ_k α_k μ(t_k) − ½u² Σ_{k,l} α_k α_l R(t_k, t_l) ]
= e^{iuEZ} e^{−½u² E(Z−EZ)²}
and
(3.10)
p(x_1, …, x_N) = (1/(2π)^N) ∫_{−∞}^∞ · · · ∫_{−∞}^∞ F(u_1, …, u_N) exp(−i Σ_{k=1}^N u_k x_k) du_1 · · · du_N
If 0 < tl < t2 < ... < tN, the matrix R = [min (tk,tl)] is positive
definite. Thus, the density function can be written down immediately
by the use of (3.11). After a little rearrangement, we find
(3.14)
= ∫_{−∞}^{x_n} (1/√(2π(t_n − t_{n−1}))) exp[ −(ξ − x_{n−1})²/(2(t_n − t_{n−1})) ] dξ
= P(X_{t_n} ≤ x_n | X_{t_{n−1}} = x_{n−1})
Indeed, a little reflection would show that any process with independent increments {X_t, t ∈ T} is also a Markov process.
A Brownian motion {X_t, t ≥ 0} also has the property (see Exercises 8 and 9)
E(X_t | X_τ, 0 ≤ τ ≤ s) = X_s   a.s.
3. GAUSSIAN PROCESSES AND BROWNIAN MOTION 51
E X_b² = Σ_{k=1}^N P(ν = k) E(X_b² | ν = k)
≥ Σ_{k=1}^N P(ν = k) E(X_{t_k}² | ν = k)
From the definition of ν, we have E(X_{t_k}² | ν = k) ≥ ε². Because the event ν = k depends only on X_{t_1}, X_{t_2}, …, X_{t_k}, we have
proving (3.18). ∎
Remarks:
(a) (3.18) holds for separable complex-valued martingales with E|X_b|² replacing E X_b². The proof is almost identical.
If Σ_{n=1}^∞ Δ_n < ∞, then the convergence is also almost sure.
S_n = Σ_{ν=1}^{N(n)} (X_{t_ν^{(n)}} − X_{t_{ν−1}^{(n)}})² − (b − a)
= Σ_{ν=1}^{N(n)} [ (X_{t_ν^{(n)}} − X_{t_{ν−1}^{(n)}})² − (t_ν^{(n)} − t_{ν−1}^{(n)}) ]
E S_n² = Σ_{ν=1}^{N(n)} E[ (X_{t_ν^{(n)}} − X_{t_{ν−1}^{(n)}})² − (t_ν^{(n)} − t_{ν−1}^{(n)}) ]²
= 2 Σ_{ν=1}^{N(n)} (t_ν^{(n)} − t_{ν−1}^{(n)})² ≤ 2Δ_n(b − a)
Since Δ_n → 0 as n → ∞, this proves S_n → 0 in quadratic mean.
For the second part of the proposition, using the Chebyshev inequality (1.5.5), we find
P(|S_n| > ε) ≤ E S_n²/ε² ≤ 2(b − a) Δ_n/ε²
Since by assumption Σ_n Δ_n < ∞, we have for every ε > 0,
Σ_n P(|S_n| > ε) < ∞
Remark: The condition Σ_n Δ_n < ∞ for a.s. convergence can be replaced by the condition that the T_n be nested, that is, T_{n+1} ⊃ T_n for each n [Doob, 1953, pp. 395-396].
As we said earlier, an intuitive interpretation of Proposition 3.4 is that dX_t ∼ √dt. This has some important consequences. For example,
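Proposition 3.4 is easy to see in simulation: over a partition of [a,b] with mesh going to zero, the sum of squared Brownian increments concentrates at b − a. The sketch is mine; taking [a,b] = [0,1] and a dyadic partition is an illustrative choice.

```python
import numpy as np

# Quadratic variation of simulated Brownian motion on [0, 1]: the sum of
# squared increments over a fine uniform partition should be close to 1,
# since each increment over a cell of width 1/N has variance 1/N.
rng = np.random.default_rng(8)
N = 2**16                                        # number of partition cells
dX = rng.standard_normal(N) * np.sqrt(1.0 / N)   # increments X_{t_k} - X_{t_{k-1}}
S = np.sum(dX**2)
print(S)   # close to b - a = 1
```

The standard deviation of S is √(2/N), so for N = 2¹⁶ the fluctuation around 1 is below one percent.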
4. CONTINUITY
For a stochastic process {X_t, t ∈ T} defined on an interval T, we distinguish the following types of continuity. The process is said to be
(4.2)
3. Almost-surely continuous at t, if
Then,
sup_{t,s∈S, |t−s|<2^{−n}} |X_t − X_s| ≤ 2 Σ_{ν=n+1}^∞ Z_ν   (4.8)
Proof: If |t − s| < 2^{−n}, then we can find k such that 0 ≤ k < 2^n and |t − k/2^n| < 2^{−n}, |s − k/2^n| < 2^{−n}. If t ∈ S and |t − k/2^n| < 2^{−n}, t must be of the form
t = k/2^n + Σ_{ν=1}^m t_ν 2^{−(n+ν)}   (t_ν = 0, 1)
Thus,
|X_t − X_{k/2^n}| ≤ Σ_{ν=n+1}^{n+m} Z_ν ≤ Σ_{ν=n+1}^∞ Z_ν
and therefore
sup_{t,s∈S, |t−s|<2^{−n}} |X_t − X_s| ≤ 2 Σ_{ν=n+1}^∞ Z_ν
lim_{n→∞} Σ_{ν=n}^∞ Z_ν = 0   (4.9)
and hence
lim_{n→∞} sup_{t,s∈S, |t−s|<2^{−n}} |X_t − X_s| = 0   (4.10)
= P( sup_{0≤k≤2^ν−1} |X_{(k+1)/2^ν} − X_{k/2^ν}| ≥ (1/2^ν)^γ )
≤ 2^ν · C 2^{−ν(1+β)} 2^{ναγ} = C 2^{−θν}   (θ = β − αγ > 0)
Since Σ_{ν=0}^∞ 2^{−θν} < ∞, we have
By the Borel-Cantelli lemma, Z_ν ≥ 1/2^{νγ} for at most a finite number of ν's with probability 1. That is, there exists N(ω), almost surely finite, such that
Z_ν(ω) < 1/2^{νγ}   for all ν ≥ N(ω)
and lim_{n→∞} Σ_{ν=n+1}^∞ Z_ν(ω) = 0 with probability 1. From Proposition 4.1,
sup_{t,s∈S, |t−s|<2^{−n}} |X_t − X_s| → 0 a.s. as n → ∞
P(X_t − X_s < x) = ∫_{−∞}^x (1/√(2π|t − s|)) exp( −u²/(2|t − s|) ) du
Therefore,
E|X_t − X_s|⁴ = ∫_{−∞}^∞ u⁴ (1/√(2π|t − s|)) exp( −u²/(2|t − s|) ) du = 3(t − s)²
and
probability measure P′ on (C, ℬ). However, P′ is, in general, only finitely additive, but not σ-additive. If P′ is σ-additive (equivalently, if P′ is sequentially continuous), then it can be extended to a probability measure P on (C, 𝒜), where 𝒜 is the minimal σ-algebra with respect to which every X_t is measurable. What results then is a stochastic process {X_t, t ∈ [0,1]} defined on (C, 𝒜, P). Since P(C) = 1, every sample function of X_t is obviously continuous. We can interpret the Kolmogorov condition to mean that every consistent family of finite-dimensional distributions which satisfies (4.11) defines an elementary probability measure P′ on (C, ℬ) which is σ-additive [Prokhorov, 1956]. It is possible to show this directly, but we won't do it here. In the case of Brownian motion, the resulting P is known as the Wiener measure. We should note that every process {X_t, t ∈ T} defined on (C, 𝒜, P) in this way is necessarily separable and measurable.
If a process is not sample continuous, then it is clearly of interest to know the nature of its discontinuities. We shall say that a function f(t), 0 ≤ t ≤ 1, has only discontinuities of the first kind if (1) f is bounded, and (2) at every t ∈ (0,1) the limit from the left lim_{s↑t} f(s) = f₋(t) and the limit from the right lim_{s↓t} f(s) = f₊(t) exist. Clearly, [0,1] can easily be replaced by [a,b] in everything that we say. We give without proof a condition, similar to the Kolmogorov condition, which guarantees that a process {X_t, t ∈ [0,1]} has sample functions which have only discontinuities of the first kind with probability 1 [Cramér, 1966].
It is clear that with probability 1, {X_t, −∞ < t < ∞} takes only the values ±1. It has independent increments, and
E(|X_{t+h} − X_s| |X_s − X_t|) = E|X_{t+h} − X_s| E|X_s − X_t|
= (1 − e^{−|t+h−s|})(1 − e^{−|s−t|})
≤ h²/4   for all s in [t, t + h]
Hence, (4.13) is satisfied and with probability 1, every sample function has only discontinuities of the first kind.
As a final topic, we note that the proof of Proposition 4.2 contains an estimate of the modulus of continuity which we now make explicit. As in Proposition 4.2, we assume {X_t, 0 ≤ t ≤ 1} to be separable and for some positive constants C, α, β
We then found that there existed an a.s. finite random variable N(ω) so that
Z_ν(ω) = sup_{0≤k≤2^ν−1} |X_{(k+1)/2^ν}(ω) − X_{k/2^ν}(ω)| < 1/2^{νγ}   for all ν ≥ N(ω)   (4.15)
where γ is any constant in [0, β/α). If we take S = {k/2^n, k = 0, …, 2^n − 1, n = 0, 1, …}, then from Proposition 4.1
sup_{t,s∈S, |t−s|<2^{−n}} |X_t(ω) − X_s(ω)| ≤ 2 Σ_{ν=n+1}^∞ Z_ν(ω)
for any γ in [0, β/α). For any h < 2^{−N(ω)},
with probability 1 for any ε > 0. For a Brownian motion process, the
largest β/α that we can take is 1/2. Therefore, for a separable Brownian
motion process,
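The dyadic estimate (4.15) can be seen numerically. The sketch below (Python/NumPy; the grid depth, γ = 0.4, and the seed are arbitrary choices, not from the text) simulates a Brownian path on a dyadic grid and compares the largest level-ν increment Z_ν against 2^{-νγ} for a γ below the Brownian threshold 1/2.

```python
import numpy as np

rng = np.random.default_rng(1)

nu_max = 20
n = 2 ** nu_max
dB = rng.normal(0.0, np.sqrt(1.0 / n), size=n)   # increments over steps of 1/2^20
B = np.concatenate([[0.0], np.cumsum(dB)])       # Brownian path on [0, 1]

def Z(nu):
    """Largest dyadic increment at level nu, as in (4.15)."""
    step = 2 ** (nu_max - nu)
    return np.max(np.abs(np.diff(B[::step])))

gamma = 0.4          # any gamma < beta/alpha = 1/2 works for Brownian motion
z10, z20 = Z(10), Z(20)
print(z10, z20, z20 / z10, 2 ** (-10 * gamma))
```

The ratio Z_20/Z_10 falls below 2^{-10γ}, in line with Z_ν shrinking faster than 2^{-νγ} for large ν.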
5. MARKOV PROCESSES
Therefore, by using (5.2) repeatedly, we find for t_1 < t_2 < ... < t_n,

p(x_1,t_1; ...; x_n,t_n) = p(x_1,t_1) \prod_{ν=2}^{n} p(x_ν,t_ν | x_{ν-1},t_{ν-1})   (5.3)
which means that all finite-dimensional distributions are completely
determined by the two-dimensional distributions. This fact is true with
all Markov processes and does not depend on the existence of density
functions.
where dξ_i stands for [ξ_i, ξ_i + dξ_i). Using the Markov property, we get
This condition relates P(x,t) to P(x,t|ξ,s). The second condition is obtained
by noting that if t_0 < s < t, then
P(x,t|x_0,t_0) = P(X_t < x | X_{t_0} = x_0)
= P(X_t < x, -∞ < X_s < ∞ | X_{t_0} = x_0)
= \int_{-∞}^{∞} P(X_t < x | X_s = ξ, X_{t_0} = x_0) P(X_s ∈ dξ | X_{t_0} = x_0)
= \int_{-∞}^{∞} P(x,t|ξ,s) dP(ξ,s|x_0,t_0)

This yields a condition that must be satisfied by the transition function
P(x,t|ξ,s), t > s, namely,

P(x,t|x_0,t_0) = \int_{-∞}^{∞} P(x,t|ξ,s) dP(ξ,s|x_0,t_0)   t_0 < s < t   (5.8)
q(t - t_0) = q(t - s)q(s - t_0) + [1 - q(t - s)][1 - q(s - t_0)]   (5.9)
f(t) = e^{-λt}

and λ must be nonnegative, because f(t) ≤ 1. This means that q(t) must
be of the form

q(t) = (1/2)(1 + e^{-λt})   (5.11)

The resulting process is precisely the random telegraph process that we
introduced in Sec. 4 (4.14), except for the trivial scale factor λ.
As a second example of using (5.8), we shall derive a set of conditions
which must be satisfied by the covariance function of a Gaussian-Markov
process. We can always assume that {X_t, t ∈ T} has zero mean, because
X_t is Markov if and only if X_t - μ(t) is Markov. If {X_t, t ∈ T} is Gaussian,
then the Markov property and joint Gaussianness give

E(X_t | X_s = ξ) = (R(t,s)/R(s,s)) ξ + E[X_t - (R(t,s)/R(s,s)) X_s]
= (R(t,s)/R(s,s)) ξ

Therefore, using (5.8),

E(X_t | X_{t_0} = x_0) = \int_{-∞}^{∞} (R(t,s)/R(s,s)) ξ dP(ξ,s|x_0,t_0)
= (R(t,s)/R(s,s)) (R(s,t_0)/R(t_0,t_0)) x_0

On the other hand, directly,

E(X_t | X_{t_0} = x_0) = (R(t,t_0)/R(t_0,t_0)) x_0   t > s > t_0
Therefore, for {X_t, t ∈ T} to be Gaussian and Markov we must have

R(t,t_0) = R(t,s)R(s,t_0)/R(s,s)   t > s > t_0   (5.13)
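Condition (5.13) is easy to test for a given covariance. A minimal check (Python; the Ornstein-Uhlenbeck covariance e^{-|t-s|} is used here as the standard stationary Gaussian-Markov example, and the squared-exponential covariance as a Gaussian but non-Markov contrast):

```python
import math

def R(t, s):
    """Covariance of the stationary Gaussian-Markov (Ornstein-Uhlenbeck) process."""
    return math.exp(-abs(t - s))

t0, s, t = 0.3, 1.1, 2.5                       # t > s > t0
# (5.13) holds exactly: e^{-(t-t0)} = e^{-(t-s)} e^{-(s-t0)}
assert abs(R(t, t0) - R(t, s) * R(s, t0) / R(s, s)) < 1e-12

# a Gaussian covariance that does NOT factor, hence is not Markov:
Q = lambda t, s: math.exp(-(t - s) ** 2)
print(Q(t, t0), Q(t, s) * Q(s, t0) / Q(s, s))  # clearly unequal
```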
We also suppose that {X_t, -∞ < t < ∞} is a separable and measurable
process. Let S denote the space of all A-measurable random variables.
We agree that two random variables which are equal almost surely
count as the same element of S. Then S is a linear space closed under
almost-sure convergence. Now, we define a family {T_t, -∞ < t < ∞}
of linear mappings of S onto S as follows:

T_t X_s = X_{t+s}   -∞ < t, s < ∞   (6.1a)
Definition. A stationary process {Xt, - <Xl < t < <Xl} is said to be ergodic
if every invariant random variable of the process is almost surely
equal to a constant.
We note that a random variable almost surely equal to a constant is
always invariant. The great interest in ergodic processes, from the point
of view of applications, is largely due to the following theorem.
Proposition 6.1. Let {X_t, -∞ < t < ∞} be a separable and measurable
ergodic process. Let f be any Borel function such that E|f(X_0)| < ∞.
Then

lim_{T→∞} (1/2T) \int_{-T}^{T} f(X_t(ω)) dt = E f(X_0)   with probability 1

As an example of a stationary process that is not ergodic, consider

X_t = A cos(2πt + Θ)

where A and Θ are independent random variables, and Θ is uniformly
distributed on [0,2π). It is easy to show (e.g., by computing the char-
acteristic functions) that {X_t, -∞ < t < ∞} is stationary. Now, for
this simple example, every Z in S is some Borel function of the pair
(A,Θ), and
lim_{T→∞} (1/2T) \int_{-T}^{T} f(X_t(ω)) dt = lim_{N→∞} (1/2N) \int_{-N}^{N} f(A(ω) cos [2πt + Θ(ω)]) dt
= lim_{N→∞} (1/2N) \int_{-N}^{N} f(A(ω) cos 2πt) dt
= \int_{0}^{1} f(A(ω) cos 2πt) dt   (6.6)
where we made repeated use of the fact that f(A cos (2πt + Θ)) is periodic
in t with period 1. On the other hand,
E f(X_0) = E f(A cos Θ)
= \int_{-∞}^{∞} [ (1/2π) \int_{0}^{2π} f(a cos φ) dφ ] dP_A(a)
If we denote \int_{0}^{1} f(A cos 2πt) dt by \hat{f}(A), then the time average is \hat{f}(A(ω)).
These two are equal if and only if A(ω) is almost surely equal to a constant.
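The failure of ergodicity here can be seen numerically: time averages computed from single sample paths depend on A(ω). A small sketch (Python/NumPy; the choice f(x) = x² and the 100-period time window are illustrative assumptions) — for this f the time average is A²/2, which differs from path to path:

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(0.0, 100.0, 0.001)        # 100 full periods of the cosine

def time_average_of_square(a, theta):
    x = a * np.cos(2 * np.pi * t + theta)
    return np.mean(x ** 2)              # discrete stand-in for (1/2T) int f(X_t) dt

avg1 = time_average_of_square(1.0, rng.uniform(0, 2 * np.pi))   # A = 1 -> 0.5
avg2 = time_average_of_square(2.0, rng.uniform(0, 2 * np.pi))   # A = 2 -> 2.0
print(avg1, avg2)
```

Two paths of the same stationary process give time averages 0.5 and 2.0; no single constant E f(X_0) can equal both, so the process is not ergodic.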
In general, it is not easy to give a simple condition which would
ensure ergodicity. For Gaussian processes, however, we have the following
sufficient condition for ergodicity:¹ Assume that {X_t, -∞ < t < ∞} is
Gaussian and stationary, with mean μ and covariance function

R(τ) = E(X_{t+τ} - μ)(X_t - μ)

and suppose that

R(τ) →_{τ→∞} 0   (6.9a)

that is, widely separated samples of the process are asymptotically uncorrelated.
We note that E f(X_t) = E f(X_0) for every t. Therefore,

E | (1/2T) \int_{-T}^{T} [f(X_t) - E f(X_t)] dt |²
= (1/(2T)²) \int_{-T}^{T} \int_{-T}^{T} E{[f(X_t) - E f(X_t)][f(X_s) - E f(X_s)]} dt ds
= (1/(2T)²) \int_{-T}^{T} \int_{-T}^{T} R_f(t - s) dt ds   (6.10)
EXERCISES
1. Suppose that {X" 0 S t S 1) is a family of independent random variables, i.e.,
every finite subcollection X", . . . , X'n is mutually independent. Show that
{X" 0 S t S 1) cannot be continuous in probability unless there is a continuous
function f(·) such that for each t, X,(w) = f(t) for almost all w.
Note that even though {w: X,( w) = 0 for at least one t in T l is an uncountable
union, it is an event in this particular case.
Find the one-dimensional distribution function P_t for this process. Repeat for
the two-dimensional distribution function P_{t,s}.
4. Find the mean function μ(t) = EX_t and the covariance function R(t,s) = E[X_t -
μ(t)][X_s - μ(s)] for the process defined in Exercise 3.
5. Verify that the process defined in Exercise 3 is both separable and measurable.
6. Let Z and Θ be independent random variables such that Z has a density function
p_Z given by

p_Z(z) = { 0,   z < 0
         { z e^{-z²/2},   z ≥ 0

and Θ is uniformly distributed in the interval [0,2π). Define

X_t = Z cos(2πt + Θ)

Show that {X_t, -∞ < t < ∞} is a Gaussian process.
Now define Y_t = X_{-t} for t ≥ 0. Show that {Y_t, 0 ≤ t < ∞} and {X_t, 0 ≤ t < ∞}
are two independent processes.
8. Let {X" t E Tl be a stochastic process such that EIX,I < 00 for every t E T,
and let ax, denote the smallest rr algebra with respect to which X. is measurable
for every s ::; t. Suppose that
Note: E(i••X, can also be written in a more suggestive way as E(X,IX" T ::; s).
9. Let {X_t, t ≥ 0} be a Brownian motion. Use the result of Exercise 8 to prove
that for t ≥ s,
10. Let {X"~ t E [O,l]} be a separable Gaussian process with zero mean. If
1'>0
show that {X, E [O,l]l must be sample continuous no matter how small l' may be.
11. Suppose that {X" - 00 < t < 00 I is a Gaussian process with zero mean and
EX,X. = e- 1t -. I• Express X, in the form
X, = f(t)Wg(t)lf(t)
12. Let {X_t, -∞ < t < ∞} be a stationary Markov process which assumes only a
finite number of values x_1, x_2, ..., x_n. Let P(τ) be an n × n matrix with elements

p_ij(τ) = P(X_{t+τ} = x_i | X_t = x_j)

(a) Suppose that lim_{τ↓0} (1/τ)[P(τ) - I] = A exists and is finite. Show that P(τ)
must have the form

P(τ) = e^{τA}

(b) Show that the vector q of one-dimensional probabilities q_i = P(X_t = x_i) satisfies

P(τ)q = q
and

P(τ) = (1/2) [ 1 + e^{-λτ}   1 - e^{-λτ} ]
              [ 1 - e^{-λτ}   1 + e^{-λτ} ]   λ ≥ 0
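For the two-state case just displayed, the matrix exponential P(τ) = e^{τA} can be verified directly. A sketch (Python/NumPy; the generator A = (λ/2)[[-1,1],[1,-1]] is inferred from the displayed P(τ) and is an assumption; the exponential is computed via the symmetric eigendecomposition):

```python
import numpy as np

lam, tau = 1.3, 0.7
A = (lam / 2) * np.array([[-1.0, 1.0], [1.0, -1.0]])   # candidate generator

# matrix exponential via A = V diag(w) V^T (A is symmetric)
w, V = np.linalg.eigh(A)
expm_tauA = V @ np.diag(np.exp(tau * w)) @ V.T

e = np.exp(-lam * tau)
P_tau = 0.5 * np.array([[1 + e, 1 - e], [1 - e, 1 + e]])
print(np.max(np.abs(expm_tauA - P_tau)))               # ~ 0 up to roundoff
```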
13. Let T be an interval and R(t,s), t, s ∈ T, be a continuous covariance function such
that R(t,t) > 0 for every t in the interior of T. Suppose that R satisfies

R(t,t_0) = R(t,s)R(s,t_0)/R(s,s)   t_0 < s < t

(a) Define ρ(t,s) = R(t,s)/[R(t,t)R(s,s)]^{1/2} and show that

ρ(t,t_0) = ρ(t,s)ρ(s,t_0)   t_0, s, t in int(T)
(b) Show that ρ(t,s) > 0 for all t and s in the interior of T.
(c) Fix a in int(T) and define

a(t) = { ρ(t,a),   t ≤ a
       { 1/ρ(t,a),   t ≥ a

Show that

ρ(t,s) = a(min(t,s))/a(max(t,s))   t, s ∈ int(T)
14. Let {X_t, -∞ < t < ∞} be a q.m. continuous and stationary Gaussian-Markov
process. Find its covariance function.
(b) Show that EX_t = 0 for all t provided that EA < ∞. Does

M_T = (1/2T) \int_{-T}^{T} X_t dt

converge in probability to EX_t as T → ∞?
(c) Show that {X_t, -∞ < t < ∞} is a Gaussian process if and only if A has
a Rayleigh distribution, that is, A has a density p_A given by

p_A(r) = (r/σ²) exp(-r²/2σ²)   r ≥ 0
16. Suppose that {X_t, t ≥ 0} and {Y_t, t ≥ 0} are two independent standard Brownian
motions. Let Z_t = (X_t² + Y_t²)^{1/2}.
(a) Suppose that X_t, Y_t are observed at t = 1, 2, 3 and the data are summarized
as follows:

t    1     2     3
x   0.3    1    -2
y  -0.1  -0.2    1

Show that

√5 ≤ E{Z_4 | observed data} ≤ √7
(b) Is {Z_t, t ≥ 0} a Markov process?
3
Second-Order Processes
1. INTRODUCTION
\sum_{j=1}^{n} \sum_{k=1}^{n} a_j \bar{a}_k R(t_j,t_k) = lim_{ν→∞} \sum_{j=1}^{n} \sum_{k=1}^{n} a_j \bar{a}_k R_ν(t_j,t_k) ≥ 0
\sum_{ν=1}^{N} u_ν(t) \bar{u}_ν(s)   (1.10)

More generally, any pointwise limit of a sequence of bilinear sums of the
form (1.10) is a covariance function. This includes not only infinite sums
such as \sum_{ν=1}^{∞} u_ν(t) \bar{u}_ν(s), but also integrals of the form
\int_a^b u(t,λ) \bar{u}(s,λ) dλ. It will be seen later that most covariance functions
can be represented in the form of bilinear sums and/or integrals, and
these representations play an extremely useful role in the application
of stochastic processes.
2. SECOND-ORDER CONTINUITY
Definition. A second-order process {X_t, t ∈ T} is said to be continuous in
quadratic mean (q.m. continuous) at t if

E|X_{t+h} - X_t|² →_{h→0} 0
Proof:
(a) If R is continuous at (t,t), then

E|X_{t+h} - X_t|² = R(t+h, t+h) - R(t, t+h) - R(t+h, t) + R(t,t)
= [R(t+h, t+h) - R(t,t)] - [R(t, t+h) - R(t,t)] - [R(t+h, t) - R(t,t)] →_{h→0} 0

Conversely, if {X_t, t ∈ T} is q.m. continuous at t, then

R(t+h, t+h') - R(t,t) = E X_{t+h} \bar{X}_{t+h'} - E X_t \bar{X}_t
= E(X_{t+h} - X_t) \bar{X}_{t+h'} + E X_t (\bar{X}_{t+h'} - \bar{X}_t)
|R(t+h, s+h') - R(t,s)| = |E X_{t+h} \bar{X}_{s+h'} - E X_t \bar{X}_s|
= |E(X_{t+h} - X_t) \bar{X}_{s+h'} + E X_t (\bar{X}_{s+h'} - \bar{X}_s)|
≤ (E|X_{t+h} - X_t|² E|X_{s+h'}|²)^{1/2} + (E|X_t|² E|X_{s+h'} - X_s|²)^{1/2} → 0
(c) Since every nonnegative definite function on T × T is the covariance
function of some second-order process on T, part (c) follows immediately
from (a) and (b). ∎
(3.6)
which holds if the mixed partial derivative of the covariance function, that is,
∂²R(t_1,t_2)/∂t_1 ∂t_2, exists in a neighborhood of (t,t) and is continuous at
(t,t). We note that
X_t ∈ H_X by definition.
Quadratic-mean integrals arise even more frequently than q.m.
derivatives. Let {X_t, t ∈ T} be a second-order process, and let f(t) be a
complex-valued function defined on the interval T. We define the q.m.
integral \int_T f(t) X_t dt as an element in H_X as follows. Let {T_n} be a sequence
of partitions of T such that as n → ∞, T_n becomes dense in T. To be
definite, write T_n = {t_0^{(n)}, ..., t_n^{(n)}} and form the sum

\sum_{i=0}^{n-1} f(t_i'^{(n)}) X_{t_i'^{(n)}} (t_{i+1}^{(n)} - t_i^{(n)})

where the t_i'^{(n)} are any points satisfying t_i^{(n)} ≤ t_i'^{(n)} < t_{i+1}^{(n)}.
The integral \int_a^b f(t) X_t dt is well defined by this procedure, provided that
the q.m. limit exists and is independent of the choice of {T_n} and, for
each {T_n}, is independent of the choice of {t_i'^{(n)}}. In that case, we say that
the integral \int_a^b f(t) X_t dt exists.
Proposition 3.1. The q.m. integral \int_a^b f(t) X_t dt exists if and only if
\int_a^b \int_a^b f(t) \bar{f}(s) R(t,s) dt ds exists as a Riemann integral.
Remark:
4. ORTHOGONAL EXPANSIONS
A family 5" of elements of Xx is said to be an orthonormal (O-N) family if
any two distinct elements Y and Z of 5" satisfy
II YII = 1 = IIZII (4.1)
(Y,Z) = EYZ = 0
The second of these conditions is called orthogonality. An O-N family 5" is
said to be complete in Xx if there exists no element of Xx, except the
zero element, which is orthogonal to every element of 5".
Suppose that {X" t E T} is a q.m. continuous process, where T is
an interval, finite or infinite. Let T' be the set of all rational points in T.
For every t E T, there exists a sequence Itn } in T' such that in ~ n--->
t. 00
E|Y - \sum_{n=1}^{N} (Y,Z_n) Z_n|² = E|Y|² - \sum_{n=1}^{N} |(Y,Z_n)|² ≥ 0

Therefore,

\sum_{n=1}^{∞} |(Y,Z_n)|² ≤ E|Y|² < ∞

so that Y - \sum_{n=1}^{N} (Y,Z_n) Z_n converges in q.m. as N → ∞, and
completeness of the family implies

Y = \sum_{n=1}^{∞} (Y,Z_n) Z_n   (4.2)
X_t = \sum_{n=1}^{∞} σ_n(t) Z_n   t ∈ T   (4.4)

To see this, note that \sum_{n=1}^{N} a_n σ_n(t) = 0 for all t ∈ T implies that
\sum_{n=1}^{N} a_n Z_n is orthogonal to X_t for every t ∈ T, hence also orthogonal
to every Z_n, which implies a_n = 0 for every n. It follows from (4.4) that

R(t,s) = \sum_{n=1}^{∞} σ_n(t) \bar{σ}_n(s)   (4.5)

Conversely, (4.5) implies that there exists an O-N family {Z_n} such that

X_t(ω) = \sum_{n=1}^{∞} σ_n(t) Z_n(ω)   t ∈ T
Thus, (4.4) and (4.5) imply each other. Representations of the form (4.4) are
useful because they permit the continuum of random variables {X_t, t ∈ T}
to be represented by a countable number of orthonormal random variables
{Z_n}. However, their use is, in general, limited by the fact that it is usually
difficult to express the random variables Z_n explicitly in terms of {X_t, t ∈ T}.
An exceptional case is when {σ_n} are orthogonal, that is, \int_T σ_m(t) \bar{σ}_n(t) dt = 0
whenever m ≠ n. This motivates the expansion widely known as the
Karhunen-Loève expansion.
Consider a q.m. continuous process {X_t, a ≤ t ≤ b}, where the parame-
ter set is explicitly assumed to be a closed and finite interval.
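Before developing the expansion, it may help to see the underlying eigenvalue problem solved numerically for a concrete covariance. The sketch below (Python/NumPy, not from the text) discretizes the integral operator with kernel R(t,s) = min(t,s) — Brownian motion on [0,1] — and compares its leading eigenvalues with the known values 1/((n+1/2)²π²); the grid size is an arbitrary choice.

```python
import numpy as np

m = 1000
h = 1.0 / m
t = (np.arange(m) + 0.5) * h                  # midpoint grid on [0, 1]
K = np.minimum.outer(t, t) * h                # discretized covariance operator
evals = np.sort(np.linalg.eigvalsh(K))[::-1]  # eigenvalues, largest first

exact = np.array([1.0 / ((n + 0.5) * np.pi) ** 2 for n in range(3)])
print(evals[:3], exact)                       # 4/pi^2, 4/(9 pi^2), 4/(25 pi^2)
```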
X_t(ω) = \sum_{n=1}^{∞} u_n(t) Z_n(ω)   (4.6)

R(t,s) = \sum_{n=1}^{∞} u_n(t) \bar{u}_n(s)   (4.9)
for each (t,s) in [a,b] × [a,b]. Now from the Schwarz inequality and the
fact that R is continuous on [a,b] × [a,b] we have

\sup_{a≤t,s≤b} |\sum_{n=1}^{N} u_n(t) \bar{u}_n(s)| ≤ \sup_{a≤t≤b} \sum_{n=1}^{N} |u_n(t)|²
≤ \sup_{a≤t≤b} R(t,t) < ∞   (4.10)
(4.11)
λ_0 = \max_{‖φ‖=1} \int_a^b \int_a^b R(t,s) φ(s) \bar{φ}(t) ds dt   (4.13)

φ_0(t) = (1/λ_0) \int_a^b R(t,s) φ_0(s) ds   (4.14)
(4.18)
7. The sequence λ_0, λ_1, ..., may terminate after a finite number of
terms, in which case we have

R(t,s) = \sum_{n=0}^{N} λ_n φ_n(t) \bar{φ}_n(s)   (4.19)
lim_{N→∞} \sup_{a≤t,s≤b} | R(t,s) - \sum_{n=0}^{N} λ_n φ_n(t) \bar{φ}_n(s) | = 0   (4.20)

In other words,

lim_{N→∞} \sum_{n=0}^{N} λ_n φ_n(t) \bar{φ}_n(s) = R(t,s)   uniformly on [a,b]²   (4.21)

lim_{N→∞} \int_a^b \int_a^b | R(t,s) - \sum_{n=0}^{N} λ_n φ_n(t) \bar{φ}_n(s) |² dt ds = 0   (4.22)
9. In general, the O-N family {φ_n} is not complete in the space L²(a,b).
The most that can be said is that given f ∈ L²(a,b) (that is,
\int_a^b |f(t)|² dt < ∞), we can write

(4.25)
and
(4.28)
(b) Conversely, if X(ω,t) has an expansion of the form (4.26) with
\int_a^b φ_m(t) \bar{φ}_n(t) dt = δ_mn = E b_m \bar{b}_n, then {φ_n} and {λ_n} must be eigen-
functions and eigenvalues of (4.25).
Proof:
(a) By direct computation we have

E | X_t - \sum_{n=0}^{N} \sqrt{λ_n} φ_n(t) b_n |² = R(t,t) - \sum_{n=0}^{N} λ_n |φ_n(t)|²

which goes to zero as N → ∞ uniformly in t by virtue of Mercer's
theorem.
(b) Suppose X_t has the stated expansion. Then, we have

R(t,s) = E X_t \bar{X}_s = \sum_{n=0}^{∞} λ_n φ_n(t) \bar{φ}_n(s)

Hence,

\int_a^b R(t,s) φ_m(s) ds = \sum_{n=0}^{∞} λ_n φ_n(t) \int_a^b \bar{φ}_n(s) φ_m(s) ds = λ_m φ_m(t)
φ(t) = A sin(t/\sqrt{λ})

Applying the condition φ'(T) = 0 obtained from (4.30), we find

cos(T/\sqrt{λ}) = 0   (4.33)
It is clear from the definition of H_X (3.1) that U_t Z is well defined for
every Z in H_X by (5.3). Equation (5.2) can now be rewritten as

for all t_0, t, s   (5.6a)
lim_{|t|→∞} | d^n f(t)/dt^n | = 0   (5.6c)
The spaces L_p and C_0 are complete normed linear spaces (Banach spaces)
with respective norms [\int_{-∞}^{∞} |f(t)|^p dt]^{1/p} and \sup_t |f(t)|. The space S is
dense in both L_p and C_0. That is, the completion of S with respect to the
norm of L_p is L_p, and its completion with respect to the norm of C_0 is C_0.
Therefore, for every f in L_p, we can find a sequence {f_n} in S such that
[f(t-0) + f(t+0)]/2 = lim_{N→∞} \int_{-N}^{N} e^{i2πνt} \hat{f}(ν) dν   (5.8)
These two equations are nearly identical, the only difference being the
terms e^{±i2πνt} in the integrals. By convention, \hat{f} is called the Fourier
transform of f, and f is called the inverse Fourier transform of \hat{f}.
Depending on f, one or both of the integrals in (5.9) may have to be
defined as the limit in some sense of a sequence of finite integrals, and
the equality may only hold in a restricted sense. For example, if f ∈ L_1
and is of bounded variation, then the first integral is absolutely convergent,
but the second equation, strictly speaking, should be replaced by (5.8).
If f ∈ S, then \hat{f} is also in S. In this case both integrals are absolutely
convergent, and equality holds for every ν and t in (-∞,∞). If f ∈ L_2,
then the first equation really says that there exists \hat{f} ∈ L_2 such that

\int_{-∞}^{∞} | \hat{f}(ν) - \int_{-T_1}^{T_2} f(t) e^{-i2πνt} dt |² dν →_{T_1,T_2→∞} 0
A T_a f = T_a A f

If the input space V_i is L_1, and a filter A is defined by A f = h * f, then
A is linear and time invariant. The function h is known as the impulse
response of the filter A. The Fourier transform \hat{h} of the impulse response
is known as the transfer function. More generally, suppose a filter A
is defined by

Then, A is again time invariant and linear. The function \hat{h} is again called
a transfer function. In general, \hat{h}(ν) may not be the Fourier integral of
any impulse response. To put it another way, the inverse Fourier
transform of \hat{h} may not exist except as a generalized function. We have
been deliberately vague in specifying the input space Vi and the output
space V o, because they depend very much on the filter. For example,
¹ If f, h ∈ L_1, then f * h ∈ L_1. If f, h ∈ L_2, then f * h is bounded. If f ∈ L_1 and h is
bounded, then f * h is again bounded.
(5.15)
(5.16)
E|Y_t|² = \int\int_{-∞}^{∞} h(t-τ) \bar{h}(t-u) R(τ-u) dτ du

E Y_t \bar{Y}_s = \int\int_{-∞}^{∞} h(t-τ) \bar{h}(s-u) R(τ-u) dτ du
= \int_{-∞}^{∞} e^{i2πν(t-s)} |\hat{h}(ν)|² S(ν) dν   (5.18)
it follows that \int_{ν_1}^{ν_2} S(ν) dν is just the average power of the X process in
[ν_1,ν_2]. This justifies our earlier assertion that S is nonnegative and that it
In the more general situation, there may be spectral lines, i.e., distinct
frequencies with finite amount of power. Even more complicated situa-
tions involving continuous, but not absolutely continuous, distributions
may arise. The general statement concerning the harmonic representation
of a stationary covariance function is given by Bochner's theorem,
stated below.
Proposition 5.1. A function R(τ), -∞ < τ < ∞, is the covariance function
of a q.m. continuous and wide-sense stationary process if and
only if it is of the form

(5.21)
at t = s, because

R_n(τ) = { (1 - |τ|/2n) R(τ),   |τ| ≤ 2n
         { 0,   |τ| > 2n   (5.25)
The fact that R_n(τ) is a nonnegative definite function follows from the
fact that max(0, 1 - |τ|/2n) is a nonnegative definite function. Now,
clearly R_n ∈ L_1 ∩ C_0. Therefore, there corresponds a sequence of spectral-
density functions {S_n} defined by
S_n(ν) = \int_{-2n}^{2n} (1 - |τ|/2n) R(τ) e^{-i2πντ} dτ   (5.26)
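Equation (5.26) can be evaluated numerically for a concrete covariance. The sketch below (Python/NumPy; the choice R(τ) = e^{-|τ|}, the taper width n = 40, and the quadrature grid are illustrative assumptions) compares S_n with the limiting spectral density 2/(1 + (2πν)²), the Fourier transform of e^{-|τ|}:

```python
import numpy as np

def S_n(nu, n):
    """Tapered spectral-density estimate (5.26) for R(tau) = exp(-|tau|)."""
    tau, dtau = np.linspace(-2 * n, 2 * n, 80_001, retstep=True)
    integrand = (1 - np.abs(tau) / (2 * n)) * np.exp(-np.abs(tau)) \
                * np.exp(-2j * np.pi * nu * tau)
    return np.sum(integrand).real * dtau     # taper vanishes at +-2n

for nu in (0.0, 0.2, 0.5):
    exact = 2.0 / (1.0 + (2 * np.pi * nu) ** 2)
    print(nu, S_n(nu, n=40), exact)
```

The residual gap (about 1% here) is the taper bias, which disappears as n → ∞, consistent with S_n → S.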
= \sup_ν |\hat{f}(ν)| R(0)
then

F_{ab} = lim_{n→∞} \int_{-∞}^{∞} R(τ) [ (e^{-i2πτb} - e^{-i2πτa}) / (-i2πτ) ] exp[ -(1/2)(2πτ/n)² ] dτ   (5.32)
6. SPECTRAL REPRESENTATION
Proposition 6.1. Defined as above, the process {X_λ, -∞ < λ < ∞}
satisfies

(6.2)

δ_{λμ} = { 1,   λ = μ
         { 0,   λ ≠ μ   (6.3)
The process {X", - 00 < A < oo} will be called the spectral process
of {XI' - 00 < t < oo}. We shall show that
It is clear that f-"'. . f(A) dX" is well defined for every ffor which there
exists a sequence of step functions {fn} (linear combinations of indicator
Although we won't prove it, the class of all such f is precisely L_2(F),
that is, the class of functions satisfying \int_{-∞}^{∞} |f(λ)|² F(dλ) < ∞. For a
continuous function f in L_2(F), a suitable approximating sequence of step
functions can be constructed by sampling f at continuity points of F.
For an arbitrary f in L_2(F), the construction of an approximating sequence
is somewhat more complicated [Doob, 1953, pp. 439-441].
f ∈ S   (6.9)

where 1_{ab} denotes the indicator function of [a,b). Let {f_n} be a sequence
of step functions obtained by sampling f at continuity points of the spectral
measure F. As the sampling points become dense in (-∞,∞), f_n converges
to f uniformly and in L_1 metric. Therefore, {\hat{f}_n} converges to \hat{f} uniformly.
Now, for each n,
Therefore, if f ∈ S,

E | \int_{-∞}^{∞} f(λ) dX_λ - \int_{-∞}^{∞} \hat{f}(t) X_t dt |² ≤ 2 E | \int_{-∞}^{∞} [f(λ) - f_n(λ)] dX_λ |²
+ 2 E | \int_{-∞}^{∞} [\hat{f}(t) - \hat{f}_n(t)] X_t dt |²
= 2 \int_{-∞}^{∞} |f(λ) - f_n(λ)|² F(dλ)
+ 2 \int\int_{-∞}^{∞} R(t-s) [\hat{f}(t) - \hat{f}_n(t)] \overline{[\hat{f}(s) - \hat{f}_n(s)]} dt ds
The first of these integrals goes to zero as n → ∞ by virtue of the con-
struction of f_n. The second integral also goes to zero, because {\hat{f}_n}
converges to \hat{f} uniformly.
Remark: Note the similarity between (6.10) and (5.29). Indeed, (5.29)
can be obtained from (6.10) by using

(6.11)

the process defined by \int_{-∞}^{∞} e^{i2πλt} dX_λ must be q.m. continuous and wide-
sense stationary by virtue of Bochner's theorem (Proposition 5.1).
Conversely, let {X_t, -∞ < t < ∞} be q.m. continuous and wide-
sense stationary and define X_λ by (6.1). Then, (6.11) follows from (6.9)
by using the familiar approximations
Proposition 6.5. Let {X_t, -∞ < t < ∞} be a q.m. continuous and wide-
sense stationary process with spectral process {X_λ, -∞ < λ < ∞}
and spectral measure F. Let H_X be the Hilbert space generated by
{X_t, -∞ < t < ∞}. Then, a random variable Y belongs to H_X
if and only if there exists η ∈ L_2(F) such that

(6.12)
\sum_{ν=1}^{n} a_ν X_{t_ν}; then from (6.11), we have

This means that {η_n} is a Cauchy sequence in L_2(F), and the completeness
of L_2(F) implies the existence of η ∈ L_2(F) such that

(6.13)
t ∈ (-∞,∞)   (6.14)
(6.15)
(c) For arbitrary t and s in (-∞,∞),

To prove that (a) implies (b), we make use of the fact that

U_t dX_λ = e^{i2πλt} dX_λ   (6.17)

More precisely, this means that

Thus, (6.14) follows upon using (6.17). To prove that (c) implies (a), we
first note that from Proposition 6.4 for each t there exists η(·,t) ∈ L_2(F)
such that

If we denote \tilde{Y}_t = \int_{-∞}^{∞} e^{i2πλt} η(λ,0) dX_λ, then the above equation is equiva-
lent to

t, s ∈ (-∞,∞)

Since (Y_t - \tilde{Y}_t), being in H_X, is the q.m. limit of a sequence of finite sums
of the form \sum a_ν X_{t_ν}, we have E|Y_t - \tilde{Y}_t|² = 0. The proof is now
complete. ∎
Remark: The process {Y_t, -∞ < t < ∞} is necessarily wide-sense sta-
tionary. More than that, {X_t, Y_t, -∞ < t < ∞} may be said to
be jointly stationary (wide-sense) in the sense that every linear
combination αX_t + βY_t defines a wide-sense stationary process. On
the other hand, suppose that for every t in (-∞,∞), Z_t ∈ H_X and
{Z_t, -∞ < t < ∞} is wide-sense stationary. The process {Z_t,
-∞ < t < ∞} does not necessarily satisfy the conditions (6.14)
to (6.16) and is not necessarily jointly stationary with {X_t, -∞ <
t < ∞} (see Exercise 12).
If h(t), -∞ < t < ∞, is a function such that its Fourier transform
\hat{h} is in L_2(F), then

Y_t = \int_{-∞}^{∞} h(t-τ) X_τ dτ = \int_{-∞}^{∞} \hat{h}(λ) e^{i2πλt} dX_λ   (6.18)
(6.22)

whenever it exists. It is rather obvious from (6.22) that the q.m. deriva-
tive exists if and only if \int_{-∞}^{∞} |λ|² F(dλ) < ∞. As a second example, con-
sider the process {Y_t, -∞ < t < ∞} defined by

(6.23)

X_t + iY_t = 2 \int_{0}^{∞} e^{i2πλt} dX_λ

we can say that X_t + iY_t has no negative frequencies. Hilbert transforms
will be made use of again in the next section.
X_t = \int_{-W}^{W} e^{i2πνt} dX_ν   (7.2)

E | X_t - \int_{-W+0}^{W-0} e^{i2πνt} dX_ν |² = E | \int_{|ν|≥W} e^{i2πνt} dX_ν |²
= F((-∞,-W]) + F([W,∞)) = 0
Proposition 7.1 (Sampling Theorem). Let {X_t, -∞ < t < ∞} be a process
bandlimited to frequency W. Then

X_t = lim in q.m._{N→∞} \sum_{k=-N}^{N} X_{k/2W} [sin 2πW(t - k/2W)] / [2πW(t - k/2W)]   (7.3)
Proof: We begin with the representation (7.2) and for emphasis rewrite
it as

X_t = \int_{-W+0}^{W-0} e^{i2πνt} dX_ν   (7.4)
\sum_{k=-N}^{N} e^{i2πνk/2W} (1/2W) \int_{-W}^{W} e^{i2π(t-k/2W)ν'} dν'

E | X_t - \sum_{k=-N}^{N} X_{k/2W} [sin 2πW(t - k/2W)]/[2πW(t - k/2W)] |²
= \int_{-W+0}^{W-0} | e^{i2πνt} - \sum_{k=-N}^{N} e^{i2πνk/2W} [sin 2πW(t - k/2W)]/[2πW(t - k/2W)] |² F(dν) →_{N→∞} 0
Remark: The proof fails if F merely satisfies F((-∞,-W)) = 0 and
F((W,∞)) = 0 but has a finite mass at one or both of the points
±W. Then, we would have to write

X_t = \int_{-W-0}^{W+0} e^{i2πνt} dX_ν
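The interpolation formula of Proposition 7.1 is straightforward to try out on a deterministic bandlimited signal. A sketch (Python/NumPy; the band limit W, the test frequencies, and the truncation point N are arbitrary choices) using np.sinc, which computes sin(πu)/(πu), so that the kernel above is exactly np.sinc(2Wt - k):

```python
import numpy as np

W = 0.5                                       # band limit
x = lambda t: np.cos(2 * np.pi * 0.1 * t) + 0.5 * np.sin(2 * np.pi * 0.23 * t)

N = 5000
k = np.arange(-N, N + 1)
samples = x(k / (2 * W))                      # samples taken at rate 2W

def reconstruct(t):
    # sin(2 pi W (t - k/2W)) / (2 pi W (t - k/2W)) == np.sinc(2 W t - k)
    return np.sum(samples * np.sinc(2 * W * t - k))

for t in (0.123, 3.7, -2.25):
    print(t, reconstruct(t), x(t))            # truncated series matches x(t)
```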
Y_t = \int_{-∞}^{∞} (-i sgn ν) e^{i2πνt} dX_ν   (7.7)

where sgn ν = 1 or -1 according as ν > 0 or ν < 0. The value of sgn 0
is immaterial, since (7.7) can be rewritten as
Y_t = i \int_{-W_0-W}^{-W_0+W} e^{i2πνt} dX_ν - i \int_{W_0-W}^{W_0+W} e^{i2πνt} dX_ν   (7.8)
Again, we note that the limits in the integrals in (7.8) can be replaced by
their limits from the left or the right without any material effect. The
process {Xt, - 00 < t < oo} can be written as
(7.9)

X_t + iY_t = 2 \int_{W_0-W}^{W_0+W} e^{i2πνt} dX_ν = 2 e^{i2πW_0t} \int_{-W}^{W} e^{i2πνt} dX_{ν+W_0}
(""c.c+k/2We-i2 .. Wot
0/
+ I: k
C;a+ /2
we'"
02 W
ot
) sin 21rW(t - k/2W - a)
21rW(t _ k/2W _ a)
n= -00
Proposition 7.2. Let {X_t, -∞ < t < ∞} be a bandpass process with
center frequency W_0 and bandwidth 2W. Let Y_t be its Hilbert
transform defined by (7.7). Then

X_t = lim in q.m._{N→∞} \sum_{k=-N}^{N} [sin 2πW(t - k/2W - a)] / [2πW(t - k/2W - a)]
  × [X_{k/2W+a} cos 2πW_0(t - k/2W - a) - Y_{k/2W+a} sin 2πW_0(t - k/2W - a)]   (7.12)

Equation (7.12) involves both X_t and Y_t being sampled at rate 2W,
giving a total sampling rate of 4W.
It is interesting to note that

(7.13)

Since ξ_t and η_t are both bandlimited to W, we can expect |ξ_t|² + |η_t|² to be
relatively slowly varying when W ≪ W_0. Therefore, |X_t|² + |Y_t|² is also
slowly varying (relative to a sinusoid with frequency W_0). On the other
hand, most of the average power of X_t is concentrated near ±W_0, so
X_t itself is rapidly varying. If X_t is real valued, then it can be written as
would then give (8.1). These considerations are purely formal and require
elaboration and substantiation. At the outset, we should distinguish
between the problem of handling white noise in a mathematically con-
sistent way and the problem of interpreting white noise as an abstraction
of physical phenomena. As far as the calculus of white noise is concerned,
the problem is not difficult, at least for linear problems. Nonlinear
problems involving Gaussian white noise are substantially more complex
and will be dealt with in a later chapter. The principal tool that we shall
use in establishing a self-consistent calculus for white noise is the second-
order stochastic integral that we introduced in Sec. 6 in connection with
spectral representation. There remains the problem of interpretation.
Since R(O) = 00 implies an infinite average power, a white noise
cannot be a physical process. If a white noise is not a physical process,
and if it leads to mathematical complications, then why is it used at all?
First, even though the calculus of white noise requires justification,
once justified it leads to a tremendous analytical simplification in many
problems. Secondly, many processes that one encounters in practice are
well approximated by white noise, but this statement requires amplifi-
cation. Because a white noise is not a second-order process (indeed, it
is not a stochastic process at all!), no sequence of processes {Xn(t), t E
(- oo,oo)} which is q.m. convergent for each t can converge to a white
noise. The way out of this difficulty is to recall that just as a δ function is
never used outside of an integral, the same is true with white noise.
This definition helps to make clear the idea that a process {X_t, -∞ <
t < ∞} is approximately a white noise. What we really mean is that for
all functions f that we are concerned with, the quantity E X(f) \bar{X}(f) is
very nearly equal to S_0 \int_{-∞}^{∞} |f(t)|² dt.
Suppose {X_t^{(n)}, t ∈ (-∞,∞)} is a sequence of processes converging
to a white noise. By definition, for every f ∈ L_2, there exists a second-
Then
= f-"'", f(t) dZ t
Proposition 8.2. Let {X_t, -∞ < t < ∞} be a white noise. Then there
exists a second white noise {\hat{X}_ν, -∞ < ν < ∞} such that

for all h, k ∈ L_2.

Remark: We repeat once again that the integrals in (8.12) are merely
symbolic representations of \int_{-∞}^{∞} k(t) dZ_t and \int_{-∞}^{∞} h(ν) d\hat{Z}_ν.
Then,

E (Z_b - Z_a) \overline{(\hat{Z}_d - \hat{Z}_c)} = \int_{-∞}^{∞} 1_{ab}(t) \overline{\hat{1}_{cd}(t)} dt
= \int_{-∞}^{∞} \hat{1}_{ab}(ν) \overline{1_{cd}(ν)} dν

Hence, E d\hat{Z}_ν \overline{d\hat{Z}_μ} = δ_{νμ} dν, and {\hat{Z}_ν, -∞ < ν < ∞} has the same second-
order properties as {Z_t, -∞ < t < ∞}. For an arbitrary h ∈ L_2, by
approximating h by step functions in the familiar way, we find

that is,

= \int_a^b dν = b - a
Y_t = \int_{-∞}^{∞} [1/(1 + i2πν)] e^{i2πνt} \hat{X}_ν dν   (8.17)
Proposition 8.3. Let α and β ∈ L_2(a,b). Then (8.19) has one and only
one solution with the same initial condition Y_a, provided that
E|Y_a|² < ∞.

converges to a white noise X_t. Then we can show that {Y_t^{(n)}, a ≤ t ≤ b}
converges to a q.m. continuous process {Y_t, a ≤ t ≤ b} for any b such
that α, β ∈ L_2(a,b). Further, {Y_t, a ≤ t ≤ b} satisfies the equation
These considerations justify the use of (8.19), even when the driving
force is "not quite white." Finally, we note that for our earlier example,
we can easily show that
Y_t = \int_{-∞}^{∞} [1/(1 + i2πν)] e^{i2πνt} d\hat{Z}_ν   (8.21)

t ≥ a   (8.22)

Y_a = \int_{-∞}^{∞} [1/(1 + i2πν)] e^{i2πνa} d\hat{Z}_ν   (8.23)
(8.24)
By Proposition 8.2,

\int_{-∞}^{t} e^{-(t-s)} dZ_s = \int_{-∞}^{∞} [1/(1 + i2πν)] e^{i2πνt} d\hat{Z}_ν
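The first-order system above can be simulated directly, with Brownian increments standing in for the white-noise driving term. A sketch (Python/NumPy; Euler stepping of dY = -Y dt + dZ, with step size, horizon, and path count as illustrative choices) checks the stationary variance against the frequency-domain value ∫|1/(1+i2πν)|² dν = 1/2:

```python
import numpy as np

rng = np.random.default_rng(3)

dt, n_steps, n_paths = 0.01, 2000, 20_000
Y = np.zeros(n_paths)
for _ in range(n_steps):                      # Euler steps for dY = -Y dt + dZ
    Y += -Y * dt + rng.normal(0.0, np.sqrt(dt), size=n_paths)

# stationary variance should be int |1/(1 + i 2 pi nu)|^2 dnu = 1/2
print(Y.var())
```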
Z = Y - [E(Y \bar{X}_t) / E|X_t|²] X_t   (9.4b)

(9.4c)
The spectral-density functions S" and Sy are necessarily real and positive.
The cross-spectral density SX1/ need not be real. Under these assumptions
XI, Y I are not only individually wide-sense stationary processes, but are
in fact jointly wide-sense stationary in the sense that for arbitrary
complex constants a and b, Zt = aX t +
bY t is again a wide-sense sta-
tionary process. For such a process ZI, we have
\int_{-∞}^{∞} [ |ln S(ν)| / (1 + ν²) ] dν < ∞   (9.9)
ν = -tan(θ/2)   (9.11)

ln S(-tan(θ/2)) = \sum_{n=-∞}^{∞} a_n e^{inθ}   (9.12)

a_n = (1/2π) \int_{-π}^{π} e^{-inθ} ln S(-tan(θ/2)) dθ   (9.13)

Note that

e^{iθ} = (1 - iν)/(1 + iν)   (9.14)

so that (9.12) and (9.13) become

ln S(ν) = \sum_{n=-∞}^{∞} a_n [(1 - iν)/(1 + iν)]^n   (9.15)

and

a_n = (1/π) \int_{-∞}^{∞} [1/(1 + ν²)] [(1 + iν)/(1 - iν)]^n ln S(ν) dν   (9.16)
If |\hat{h}|² = S, then

ln S(ν) = ln \hat{h}(ν) + ln \overline{\hat{h}(ν)}   (9.17)

We now identify

\hat{h}(ν) = exp{ a_0/2 + \sum_{n=1}^{∞} a_n [(1 - iν)/(1 + iν)]^n }   (9.18)
Because a_{-n} = \bar{a}_n, (9.17) is satisfied, hence also (9.10a). Conditions (9.10b)
f(u + iv) = a_0/2 + \sum_{n=1}^{∞} a_n [ (1 - i(u + iv)) / (1 + i(u + iv)) ]^n   (9.19)

is analytic for v < 0. Hence, \hat{h}(u + iv) is analytic for v < 0, and (9.10b)
follows (more or less) from contour integration, closing the contour in
the lower half plane. Condition (9.10c) follows from the fact that if
f(z) is analytic, then e^{f(z)} has no zero. We call the above procedure the
spectral-factorization procedure, since S is usually identified with a
spectral-density function.
Let {X_t, -∞ < t < ∞} have a spectral-density function S_x, and
let \hat{h} be obtained by factoring S_x so that (9.10) is satisfied with S_x replacing
S in (9.10a). Then, in view of our discussion on (8.15), {X_t, -∞ < t < ∞}
can be regarded as the output of filtering a white noise {ζ_t, -∞ <
t < ∞} with a nonanticipative filter having transfer function \hat{h}. Condi-
tion (9.10c) means that the white noise {ζ_t, -∞ < t < ∞} can in turn
be obtained by filtering {X_t, -∞ < t < ∞} by 1/\hat{h}, which is also
nonanticipative. A more precise statement of these results can be made
in terms of the process with orthogonal increments {Z_t, -∞ < t < ∞}
corresponding to {ζ_t, -∞ < t < ∞}. Condition (9.10b) then implies that
there exists a process {Z_t, -∞ < t < ∞} with Z_0 = 0 and E dZ_t d\bar{Z}_s = δ_{ts} dt
such that

which points out even more clearly that ζ_t (= Ż_t) is obtained by filtering
X_t by 1/\hat{h}.
We shall now give the main result of the Wiener theory of filtering
as follows [Wiener, 1949].

Proposition 9.3. Let {X_t, Y_t, -∞ < t < ∞} be a pair of wide-sense sta-
tionary processes satisfying (9.4). Let \hat{h} be obtained by factoring S_x
so that (9.10a) to (9.10c) are satisfied. Let {Z_t, -∞ < t < ∞} be
where g is given by

g(t) = \int_{-∞}^{∞} e^{i2πνt} [ S_{xy}(ν) / \overline{\hat{h}(ν)} ] dν   -∞ < t < ∞   (9.24)
To verify (9.25a), we note that \int_{-∞}^{t} g(t-s) dZ_s is in H_Z^t, where H_Z^t
denotes the Hilbert space spanned by {Z_s, -∞ < s ≤ t}. Since for each
t, Z_t ∈ H_X^t, we have H_Z^t ⊂ H_X^t, verifying (9.25a). To verify (9.25b), we note
that

Therefore,

= E Y_t \bar{X}_τ   τ ≤ t

which verifies (9.25b) and completes the proof. ∎
The solution (9.23) can be put in a more useful form by using (9.22).
If we define
(9.29)
S_x(ν) = K² [ \prod_{k=1}^{m} (ν - z_k)(ν - \bar{z}_k) ] / [ \prod_{k=1}^{n} (ν - p_k)(ν - \bar{p}_k) ]   (9.30)
where every z_k and p_k has positive imaginary part. Since |\hat{h}|² = S_x and
\hat{h}(u + iv) has neither poles nor zeros for v < 0, \hat{h} must be of the form

\hat{h}(ν) = A [ \prod_{k=1}^{m} (ν - z_k) ] / [ \prod_{k=1}^{n} (ν - p_k) ]   (9.31)

\hat{h}(-i) = e^{a_0/2} = A [ \prod_{k=1}^{m} (-i - z_k) ] / [ \prod_{k=1}^{n} (-i - p_k) ]
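For the rational case (9.30)-(9.31), the factorization amounts to sorting the zeros and poles by the sign of their imaginary parts. A sketch (Python/NumPy; the particular S(ν) below is a made-up example with K = 1):

```python
import numpy as np

# Example: S(nu) = (nu^2 + 4) / ((nu^2 + 1)(nu^2 + 9))
num = [1.0, 0.0, 4.0]
den = np.polymul([1.0, 0.0, 1.0], [1.0, 0.0, 9.0])

zeros = [r for r in np.roots(num) if r.imag > 0]   # z_k in the upper half plane
poles = [r for r in np.roots(den) if r.imag > 0]   # p_k in the upper half plane

def h_hat(nu):
    out = np.ones_like(nu, dtype=complex)
    for z in zeros:
        out *= nu - z
    for p in poles:
        out /= nu - p
    return out

nu = np.linspace(-5.0, 5.0, 101)
S = (nu**2 + 4) / ((nu**2 + 1) * (nu**2 + 9))
err = np.max(np.abs(np.abs(h_hat(nu))**2 - S))
print(err)                                         # |h(nu)|^2 reproduces S(nu)
```

Since the retained zeros and poles all lie in the upper half plane, this ĥ(u + iv) is analytic and zero-free for v < 0, i.e. the factor is nonanticipative, as the text requires.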
and a_0 is given by (9.16) with n = 0.
get) = f"
-""
e1,' 1I"'pt
2e'1.21rlla
1
1
+ i27r/J
d 'II
e-(t+a) t> -a
{O
t < -a
1
= e- a - - - -
1 + i27rIJ
so that
E(Yt/JCx t ) = e- a f-"""" e i27rPt elX. = e-aX t
For this simple example, the predictor is nothing more than an attenuator.
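The attenuator result E(Y_t | H_X^t) = e^{-a} X_t can be checked by simulation. A sketch (Python/NumPy; exact discretization of the stationary Gaussian process with R(τ) = e^{-|τ|}; step size, horizon, and seed are illustrative assumptions) estimates the regression of X_{t+a} on X_t:

```python
import numpy as np

rng = np.random.default_rng(4)

# exact-discretization simulation of the process with R(tau) = e^{-|tau|}
dt, n = 0.05, 400_000
rho = np.exp(-dt)
x = np.empty(n)
x[0] = rng.normal()
for i in range(1, n):
    x[i] = rho * x[i - 1] + np.sqrt(1 - rho ** 2) * rng.normal()

a = 1.0
lag = int(round(a / dt))
coef = np.mean(x[lag:] * x[:-lag]) / np.mean(x[:-lag] ** 2)
print(coef, np.exp(-a))   # regression coefficient of X_{t+a} on X_t vs e^{-a}
```

Because the process is Markov, the regression on the single most recent sample already realizes the best predictor based on the whole past.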
As a second example, suppose
1 + (27rIJ)2
g(t) = { [3/(\sqrt{5} + \sqrt{2})] e^{-t},   t > 0
       { [(2 - \sqrt{5/2})/(\sqrt{5} + \sqrt{2})] e^{\sqrt{5/2} t},   t < 0

and (9.28) gives

γ(ν) = [3/(\sqrt{5} + \sqrt{2})] [1/(1 + i2πν)]

Finally, from (9.28) we get
cases where spectral-density functions exist, one would expect that the
Wiener theory can again be developed. If we denote by A+ the Hermitian
adjoint of a matrix A, we can define the matrix spectral-density function
Sx by
\[
S_X(\nu) = h(\nu)h^{+}(\nu)
\]
such that the matrix h(ν) satisfies conditions similar to (9.10b) and (9.10c), except that instead of (9.10c), the determinant of h(u + iv) is to have no zero for v > 0. The final solution can be expressed in a form which generalizes (9.29), with the matrix γ(ν) defined as in the scalar case. The matrix spectral-factorization problem is considerably more difficult than the
scalar problem. For the rational case, a number of finite algorithms to
achieve the factorization have been derived [Wiener and Masani, 1958;
Wong and Thomas, 1961; Youla, 1961].
In some areas of application, the Wiener formulation of the filtering
problem is not appropriate because of some of its inherent assumptions.
Among these are the following: (1) wide-sense stationarity and existence
of spectral densities; (2) the second-order properties of the processes
{X_t, Y_t, −∞ < t < ∞} are known, and nothing else is known;
(3) the estimator is to be based on the infinite past of the observation
process. These limitations are removed in the formulation of the filtering
problem due to Kalman and Bucy. Instead, they made other assumptions
which are more natural in a great variety of applications. The form of the
solution is also different. While in the Wiener theory, the final solution
is in the form of a time-invariant linear and nonanticipative filter, the
Kalman-Bucy theory yields a differential equation which is satisfied by
the estimator. Implementation of the "filter" in feedback form is thus
immediate [Kalman and Bucy, 1961].
The Kalman-Bucy filter problem is usually stated in vector form as follows: Let {X_t, Y_t, t ≥ t₀} be a pair of vector-valued second-order processes. The X process will be the observation process, and the Y process is to be estimated. While this notational convention is consistent with our earlier discussion, it is not universal. Often in the literature, the two letters X and Y are used in just the opposite way. The basic assumptions are the following: Throughout, boldface will be used to denote vectors and matrices, prime denotes transpose, + denotes Hermitian adjoint, and I denotes the identity matrix.
1. The process to be estimated satisfies
(9.33)
2. The observation process satisfies
\[
\dot X_t = H(t)Y_t + B(t)\xi_t \qquad t > t_0 \tag{9.34}
\]
Remarks:
(a) Both (9.32) and (9.34) are to be interpreted along the lines discussed in Sec. 8. We shall denote the process with orthogonal increments which corresponds to ξ_t by Z_t, so that formally Ż_t = ξ_t. If t₀ > −∞, it is convenient to set Z_{t₀} = 0.
(b) We note that (9.34) is not really a differential equation, since
X t can be immediately expressed explicitly in terms of the Y and {
process by integrating (9.34). However, the problem would be no
more general if we replace (9.34) by a linear differential equation in
Xt. Such an equation can always be changed into (9.34) by redefining
the observation process.
(c) It is necessary to assume that the initial values X_{t₀} and Y_{t₀} are random variables orthogonal to {Z_t, t ≥ t₀}; in particular, they can simply be constants.
As usual, let ℋ_X^t denote the smallest Hilbert space generated by {X_τ, t₀ ≤ τ ≤ t}, and let Ê denote projection. The Kalman-Bucy filtering problem is to find Ê(Y_s|ℋ_X^t), and the main results can be summarized as follows.
Proposition 9.4. Let {X_t, Y_t, t ≥ t₀} satisfy (9.32) and (9.34). Let Φ(s|t) be the unique solution of

\[
\frac{d}{ds}\Phi(s|t) = F(s)\Phi(s|t) \qquad s > t \tag{9.35}
\]

with initial condition Φ(t|t) = I. Let A(t) and B(t) in (9.32) and (9.33) be continuous functions on [t₀, ∞).
Remarks:
(a) A complete proof is rather complicated, and will not be pre-
sented. Instead, we shall give a heuristic derivation.
(b) The continuity conditions on A(t) and B(t) are sufficient, but
not necessary. However, some smoothness condition is needed.
Unfortunately, this point is largely lost in our formal derivation.
(c) Equation (9.39) can be simplified somewhat by using (9.40). If
B(t)B+(t) is invertible, these two equations can be combined to give
a single equation in l:(t), which is a nonlinear differential equation
of the Riccati type.
(d) Once K is determined from (9.39) and (9.40), implementation
of (9.37) in feedback form is immediate and yields a continuous
estimate. Feedback implementation of (9.37) is often referred to as the Kalman-Bucy filter.
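A minimal scalar sketch of the filter in this feedback form. Since the displays (9.36)-(9.40) are not reproduced in full here, the gain K = ΣH/B² and the Riccati equation Σ̇ = 2FΣ + A² − (ΣH/B)² below are the standard scalar forms and should be read as assumptions; the model is dY = FY dt + A dW₁, dX = HY dt + B dW₂:

```python
import numpy as np

# Hedged scalar Kalman-Bucy sketch (standard forms assumed, see lead-in):
#   dY = F*Y dt + A dW1,   dX = H*Y dt + B dW2
#   gain K = Sigma*H/B**2,  dSigma/dt = 2*F*Sigma + A**2 - (Sigma*H/B)**2
rng = np.random.default_rng(1)
F, A, H, B = -1.0, 1.0, 1.0, 0.5
dt, n = 1e-3, 20000

y, yhat, sigma = 0.0, 0.0, 1.0
for i in range(n):
    dw1, dw2 = rng.standard_normal(2) * np.sqrt(dt)
    dx = H * y * dt + B * dw2                 # observation increment
    K = sigma * H / B**2                      # filter gain
    yhat += F * yhat * dt + K * (dx - H * yhat * dt)   # feedback form
    sigma += (2 * F * sigma + A**2 - (sigma * H / B) ** 2) * dt
    y += F * y * dt + A * dw1

# Steady state of the Riccati equation: 2*F*S + A^2 - (S*H/B)^2 = 0
S = np.roots([-(H / B) ** 2, 2 * F, A**2])
S_inf = S[S > 0][0]
print(sigma, S_inf)
```

The error variance Σ(t) settles at the positive root of the algebraic Riccati equation, which is what makes the time-invariant steady-state filter usable in practice.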
First, we derive (9.36) as follows: From (9.32) we can write, for s ≥ t,

\[
Y_s = \Phi(s|t)Y_t + \int_t^s \Phi(s|\tau)A(\tau)\,dZ_\tau
\]
Therefore,
Now, let ℋ^t denote the smallest Hilbert space containing X_{t₀}, Y_{t₀}, and {Z_τ, τ ≤ t}. Because of (9.32) and (9.34), ℋ_X^t is contained in ℋ^t. Because Z_t is a process with orthogonal increments, and X_{t₀}, Y_{t₀} are orthogonal to {Z_t, t ≥ t₀},

Hence,

\[
\hat E(dZ_\tau\mid\mathcal{H}_X^t) = \hat E[\hat E(dZ_\tau\mid\mathcal{H}^t)\mid\mathcal{H}_X^t] = 0 \qquad \tau \ge t
\]

It follows that

\[
\hat E(Y_s\mid\mathcal{H}_X^t) = \Phi(s|t)\hat Y_t
\]

which was to be derived.
To derive (9.37), we first note that every element in xxt can be
written in the form of
(9.41)
Thus,

\[
d\hat Y_t = K(t|t)\,dX_t + dt\left[\dot a(t)X_{t_0} + \int_{t_0}^{t}\frac{\partial}{\partial t}K(t|\tau)\,dX_\tau\right] \tag{9.42}
\]
Since the bracketed terms in (9.42) are in ℋ_X^t, we can rewrite (9.42) as

\[
d\hat Y_t = K(t|t)\,dX_t + \hat E[d\hat Y_t - K(t|t)\,dX_t\mid\mathcal{H}_X^t] \tag{9.43}
\]
Now, from (9.34)

\[
\hat E(dX_t\mid\mathcal{H}_X^t) = H(t)\hat Y_t\,dt \tag{9.44}
\]
Now,

\[
d\Sigma(t) = \Sigma(t+dt) - \Sigma(t) = E\varepsilon_{t+dt}\varepsilon_{t+dt}^{+} - E\varepsilon_t\varepsilon_t^{+}
= E(\varepsilon_{t+dt}-\varepsilon_t)(\varepsilon_{t+dt}-\varepsilon_t)^{+} + E(\varepsilon_{t+dt}-\varepsilon_t)\varepsilon_t^{+} + E\varepsilon_t(\varepsilon_{t+dt}-\varepsilon_t)^{+}
= E\,d\varepsilon_t\,d\varepsilon_t^{+} + E\,d\varepsilon_t\,\varepsilon_t^{+} + E\varepsilon_t\,d\varepsilon_t^{+}
\]
Using (9.46) and the fact that dZ_t is orthogonal to ℋ^t, we find

\[
d\Sigma(t) = [A(t) - K(t)B(t)][A(t) - K(t)B(t)]^{+}\,dt + [F(t) - K(t)H(t)]\Sigma(t)\,dt + \Sigma(t)[F(t) - K(t)H(t)]^{+}\,dt \tag{9.47}
\]
where tl! satisfies
(9.54)
and
(9.55)
(9.56)
and

\[
N_t = \frac12\int_{-\infty}^{t} e^{-2(t-\tau)}\,dV_\tau \tag{9.57}
\]
The initial value Σ(t₀) can be evaluated as follows: First, we note that the linear least-squares estimator of Y_{t₀} given X_{t₀} has the form aX_{t₀}, where a is determined by

\[
E(Y_{t_0} - aX_{t_0})X_{t_0} = 0 \tag{9.67}
\]

This yields

(9.68)

Finally,

\[
\Sigma(t_0) = E|Y_{t_0} - aX_{t_0}|^2 = E\left|\tfrac12 Y_{t_0} - \tfrac12 N_{t_0}\right|^2 = \tfrac14\left(E|Y_{t_0}|^2 + E|N_{t_0}|^2\right) = \tfrac12 \tag{9.69}
\]

This completes the solution for Σ(t), and via (9.63), also completes the solution for K(t), and hence the Kalman-Bucy filter.
If we let t₀ → −∞ in (9.66), we get

\[
\Sigma(t) \xrightarrow[t_0\to-\infty]{} \left(\sqrt{10} - 3\right) \tag{9.70}
\]

This gives us

\[
\hat Y_t = \int_{-\infty}^{t}\left(\sqrt{\tfrac52}-1\right)\exp\left[-\sqrt{\tfrac52}\,(t-\tau)\right]dX_\tau
\]

\[
\frac{\sqrt{5/2}-1}{(2\pi i\nu)+\sqrt{5/2}} = \frac{3}{\sqrt5+\sqrt2}\,\frac{1}{\sqrt5+\sqrt2\,i2\pi\nu}
\]

which should be compared with (9.31a).
EXERCISES
1. Test whether each of the following functions is nonnegative definite.
(d) \( R(t,s) = \begin{cases} 1 & |t-s| \le 1 \\ 0 & |t-s| > 1 \end{cases} \)
\[
R(t,t) - \sum_{n=0}^{N}\lambda_n|\varphi_n(t)|^2 = E\left|X_t - \sum_{n=0}^{N}\varphi_n(t)\int_a^b \bar\varphi_n(s)X_s\,ds\right|^2 \ge 0
\]

\[
\sum_{n=0}^{N}\lambda_n \le \int_a^b R(t,t)\,dt < \infty
\]

Hence, \(\sum_{n=0}^{\infty}\lambda_n\) must converge and \(\lambda_n \xrightarrow[n\to\infty]{} 0\).
3. Suppose that a q.m. continuous and wide-sense stationary process {X_t, −∞ < t < ∞} has a covariance function R(·) which is periodic with period T, that is,
Define for n = 0, ±1, ±2, . . . ,

\[
Z_n = \frac{1}{T}\int_0^T X_t\,e^{-in(2\pi/T)t}\,dt
\]

whenever m ≠ n

\[
X_t = \sum_{n=-\infty}^{\infty} Z_n\,e^{in(2\pi/T)t}
\]
\[
\sum_{k=0}^{n} a_k\,\frac{\partial^{2k}}{\partial t^{2k}}R(t,s) = \sum_{k=0}^{m} b_k\,\frac{\partial^{2k}}{\partial t^{2k}}\,\delta(t-s) \qquad a < t,\ s < b
\]

Show that the integral equation with kernel R(t,s) satisfying

\[
\sum_{k=0}^{n} a_k\,\frac{\partial^{2k}}{\partial t^{2k}}R(t,s) = \sum_{k=0}^{m} b_k\,\frac{\partial^{2k}}{\partial t^{2k}}\,\delta(t-s)
\]
\[
X_t(\omega) = \sum_{n=0}^{\infty}\alpha_n(t)Z_n(\omega)
\]

where {Z_n, n = 0, 1, 2, . . .} are independent Gaussian random variables with EZ_mZ_n = δ_mn. What conditions do we need, if any, in order that each Z_n belongs to ℋ_X?
8. Suppose that {X_t, −∞ < t < ∞} has the spectral density given in Exercise 7, and {Y_t, −∞ < t < ∞} is such that:
(a) Y_t ∈ ℋ_X for every t
9. Suppose that {X_t, −∞ < t < ∞} is wide-sense stationary. Show that for a fixed constant W, {e^{i2πWt}X_t, −∞ < t < ∞} is again wide-sense stationary. Is {cos(2πWt)X_t, −∞ < t < ∞} wide-sense stationary?
10. Suppose that \(X_t = \sum_{k=1}^{N} X_{kt}\), −∞ < t < ∞, is the sum of N wide-sense stationary and q.m. continuous processes X_{1t}, X_{2t}, . . . , X_{Nt}. Show that we have a representation

\[
X_t = \int_{-\infty}^{\infty} e^{i2\pi\nu t}\,d\xi_\nu
\]

Is the process {ξ_ν, −∞ < ν < ∞} always a process with orthogonal increments? (Note: X_t = cos(2πWt)Z_t, with Z_t stationary, is an example of such a process.)
11. For a process of the type given in Exercise 10, show that
12. Let {X_t, −∞ < t < ∞} be a wide-sense stationary process with a spectral representation

\[
X_t = \int_{-\infty}^{\infty} e^{i2\pi\nu t}\,d\hat X_\nu
\]

Let

where ψ(ν), −∞ < ν < ∞, is a real-valued function. Show that {Y_t, −∞ < t < ∞} is wide-sense stationary.
13. Suppose that {X_t, −∞ < t < ∞} is a real-valued wide-sense stationary process, and let X̂_t denote its Hilbert transform

with ψ(ν) = 2π[ν − ν₀ sgn ν]. [Hence, it must be wide-sense stationary (see Exercise 12).]
14. Let {Z_t, −∞ < t < ∞} be a process with orthogonal increments such that EZ_t = 0, E|Z_t − Z_s|² = |t − s|, and Z₀ = 0.
(a) Show that

Show that for each n, {ξ_{nt}, −∞ < t < ∞} is wide-sense stationary and find its spectral-density function S_n(ν), −∞ < ν < ∞.
(c) Show that ξ_{nt} converges to a white noise in the sense of (8.2) and (8.3).
15. Let f be a differentiable function such that its derivative ḟ is continuous on [a,b]. Show that

\[
\int_a^b f(t)\,dZ_t + \int_a^b \dot f(t)\,Z_t\,dt = f(b)Z_b - f(a)Z_a
\]

where {Z_t, −∞ < t < ∞} is the process described in Exercise 14. (Hint: Make use of the sequence {ξ_{nt}} defined in Exercise 14 and show that

\[
\int_a^b \dot f(t)\int_0^t \xi_{ns}\,ds\,dt \xrightarrow[n\to\infty]{} \int_a^b \dot f(t)\,Z_t\,dt
\]

The rest is easy.)
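The identity of Exercise 15 can be checked by simulation. A sketch, taking Z_t to be a standard Brownian motion on [a,b] = [0,1] and f(t) = t² (both illustrative choices; any C¹ function works):

```python
import numpy as np

# Hedged simulation of the integration-by-parts identity of Exercise 15,
# with Z a standard Brownian motion on [0,1] and f(t) = t^2.
rng = np.random.default_rng(2)
n, n_paths = 2000, 2000
dt = 1.0 / n
t = np.linspace(0.0, 1.0, n + 1)
dW = rng.standard_normal((n_paths, n)) * np.sqrt(dt)
W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)

f, fdot = t**2, 2.0 * t
# int f dZ as a left-endpoint sum, plus the ordinary integral of fdot * Z
lhs = (f[:-1] * dW).sum(axis=1) + (fdot[:-1] * W[:, :-1]).sum(axis=1) * dt
rhs = f[-1] * W[:, -1] - f[0] * W[:, 0]
gap = np.abs(lhs - rhs).max()
print(gap)   # shrinks as the partition is refined
```

The worst-case pathwise gap is a discretization artifact; the identity itself holds exactly in the limit of fine partitions.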
16. Use the results of Exercise 15 and show that the solution of the integral equation

\[
Y_t = Y_a - \int_a^t Y_s\,ds + Z_t - Z_a
\]

\[
\frac{\partial^2}{\partial t\,\partial s}EX_tX_s = \delta(t-s) + \rho(t,s) \qquad a < t,\ s < b
\]
(a) Show that \(X(f) \equiv \int_a^b f(t)\,dX_t\) is well defined for all f satisfying \(\int_a^b |f(t)|^2\,dt < \infty\) such that
(1) X(·) is linear, that is, X(αf + βg) = αX(f) + βX(g)
(2) If a ≤ c < d ≤ b and f = 1_{[c,d)} is the indicator function of [c,d), then X(f) = X_d − X_c
(b) Find an expression for EX(f)X̄(g)

\[
Y = \int_a^b \eta(t)\,dX_t + kX_a
\]
18. Let {W_t, t ≥ 0} be a standard Brownian motion process, and let A be a real-valued second-order random variable (EA = 0, EA² = 1) independent of the W process. Suppose that the process

\[
X_t = At + W_t \qquad t \ge 0
\]

is observed. Let Â_t denote the linear least-squares estimator of A given X_s, 0 ≤ s ≤ t. Find an explicit expression for Â_t in the form of
20. Suppose that {X_t, −∞ < t < ∞} has a spectral density given by

\[
S_X(\nu) = \frac{1}{4 + (2\pi\nu)^4}
\]

Find the predictor \(\hat E(X_{t+a}\mid\mathcal{H}_X^t)\), and express it in the form

21. Let {X_t, −∞ < t < ∞} be as in Exercise 20, and let {N_t, −∞ < t < ∞} have spectral density S_N(ν) = 1/[1 + (2πν)²]. Assume that EX_tN_s = 0 for all t and s, and define

\[
Y_t = X_t + N_t
\]

Find \(\hat E(X_t\mid\mathcal{H}_Y^t)\) and express it in the form
A(t) = [~]
B(t) = 1
F(t) = [~ - ~J
H(t) = [0 1)
and
24. Suppose that in Exercise 23 the process ξ_t, instead of being white, satisfies Eξ_tξ_s = e^{−|t−s|}. Reexpress the two differential equations in the form of (9.32) and (9.34) with suitable choices for X_t and ξ_t. (Hint: Now (d/dt)ξ_t + ξ_t is a white noise.)
4
Stochastic Integrals and
Stochastic Differential
Equations
1. INTRODUCTION
Roughly speaking, stochastic differential equations are differential
equations driven by Gaussian white noise. Here, we are using the term
"stochastic differential equations" in a restricted sense and not merely to
denote differential equations with some probabilistic aspects. The impor-
tance of stochastic differential equations is largely due to the fact that the
solution of such an equation is a sample-continuous Markov process, and
conversely, a large and important class of sample-continuous Markov
processes can be modeled by the solutions of stochastic differential equa-
tions. From the point of view of applications, this is a direct benefit of
using white noise as a noise model, and this fact accounts for its popular-
ity. After all, white noise is, at best, a tolerable abstraction and is never a
completely faithful representation of a physical noise source. Its raison
d'être is the simplicity of analysis that it brings about. We have seen this
in connection with Kalman filtering, and we shall see it again in the
Markovian nature of solutions to stochastic differential equations.
2. STOCHASTIC INTEGRALS
Let (Ω,𝒜,𝒫) be a fixed probability space. Let {𝒜_t, −∞ < t < ∞} be an increasing family of sub-σ-algebras of 𝒜, and let {W_t, −∞ < t < ∞} be a Brownian motion process such that for each s, the aggregate {W_t − W_s, t ≥ s} is independent of 𝒜_s and W_t is 𝒜_t measurable for each t. It follows that for s ≥ 0

\[
E^{\mathcal{A}_t}W_{t+s} = W_t \quad \text{a.s.} \tag{2.1}
\]
\[
E^{\mathcal{A}_t}(W_{t+s} - W_t)^2 = s \quad \text{a.s.}
\]
We recall that we refer to this situation by saying that {W_t, 𝒜_t, −∞ < t < ∞} is a Brownian motion. By a stochastic integral we mean a quantity of the form

(2.2)

and if φ satisfies (2.3) and (2.4), then we call φ an (ω,t)-step function and define the stochastic integral by

\[
\int_a^b \varphi(\omega,t)\,dW(\omega,t) = \sum_{\nu=0}^{n-1} \varphi_\nu(\omega)\left[W(\omega,t_{\nu+1}) - W(\omega,t_\nu)\right] \tag{2.6}
\]
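Definition (2.6) and the isometry \(E|I(\varphi)|^2 = \int_a^b E|\varphi_t|^2\,dt\) proved below can be illustrated numerically. A sketch, taking the (ω,t)-step integrand φ(ω,t) = W(ω,t_ν) on each partition interval (admissible, since W_{t_ν} is 𝒜_{t_ν} measurable):

```python
import numpy as np

# Monte Carlo check of (2.6) and the isometry E|I(phi)|^2 = int E|phi_t|^2 dt
# for the step integrand phi(omega, t) = W(omega, t_nu) on [t_nu, t_{nu+1}).
rng = np.random.default_rng(3)
n, n_paths, T = 200, 50000, 1.0
dt = T / n
dW = rng.standard_normal((n_paths, n)) * np.sqrt(dt)
W_left = np.cumsum(dW, axis=1) - dW     # W at the left endpoint of each step

I = (W_left * dW).sum(axis=1)           # sum phi_nu * [W_{t_{nu+1}} - W_{t_nu}]
lhs = (I**2).mean()
t_left = np.arange(n) * dt
rhs = t_left.sum() * dt                 # discrete version of int_0^T t dt = T^2/2
print(lhs, rhs)
```

The left-endpoint evaluation of φ is what makes the increments W_{t_{ν+1}} − W_{t_ν} independent of the coefficients, and the isometry depends on exactly that.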
2. If φ satisfies (2.3) and (2.4), then we shall show that there exists a sequence of (ω,t)-step functions {φ_n(ω,t)} satisfying (2.3) and (2.4) such that
(2.7)
Proof:
(a) Suppose Eφ(·,t)φ̄(·,s) is continuous on [a,b] × [a,b], that is, φ_t is q.m. continuous on [a,b]. Then an approximating sequence of (ω,t)-step functions {φ_n} can be constructed by partitioning [a,b], sampling φ(ω,t) at partition points t_ν^{(n)}, defining φ_n(ω,t) = φ(ω,t_ν^{(n)}) for t_ν^{(n)} ≤ t < t_{ν+1}^{(n)}, and refining the partitions [max_ν (t_{ν+1}^{(n)} − t_ν^{(n)}) → 0 as n → ∞]. Since φ_t is q.m. continuous, E|φ(·,t) − φ_n(·,t)|² → 0 as n → ∞ for every t in [a,b]. By the dominated convergence theorem, we have

\[
\int_a^b E|\varphi(\cdot,t) - \varphi_n(\cdot,t)|^2\,dt \xrightarrow[n\to\infty]{} 0
\]
More generally, if φ merely satisfies (2.3) and (2.4) and is not necessarily q.m. continuous on [a,b], we construct a sequence of approximating (ω,t)-step functions in the following manner: First, let f_n be φ with its real and imaginary parts truncated to ±n, then

(2.11)

(2.12)

Then, g_n(·,t) is q.m. continuous on [a,b] and satisfies (2.3) and (2.4). Now,

(2.13)
(2.15)
and

\[
E|I(\varphi_n)|^2 = \sum_\nu E\left[|\varphi_{n\nu}|^2(\Delta_\nu W)^2\right] = \sum_\nu E|\varphi_{n\nu}|^2\,E(\Delta_\nu W)^2 = \sum_\nu E|\varphi_{n\nu}|^2\left(t_{\nu+1}^{(n)} - t_\nu^{(n)}\right) \tag{2.16}
\]
Now, I(φ_{m+n}) − I(φ_n) = I(φ_{m+n} − φ_n), and φ_{m+n} − φ_n is again a step function. Therefore,

(2.18)

which means that (2.18) follows from the seemingly more special case \(E|I(\varphi)|^2 = \int_a^b E|\varphi_t|^2\,dt\). Now if φ is an (ω,t)-step function, we have already proved in (2.16) that
a.s. (3.2)
, we find

\[
E^{\mathcal{A}_s}(X_t - X_s) = E^{\mathcal{A}_s}E^{\mathcal{A}_{t_1}}\cdots E^{\mathcal{A}_{t_{n-1}}}(X_t - X_s) = 0 \quad \text{a.s.}
\]

Since X_s is obviously 𝒜_s measurable, this proves the proposition for φ equal to a step function. If φ is not a step function, let {φ_n} be step approximations to φ, and define

\[
X_n(\omega,t) = \int_a^t \varphi_n(\omega,\tau)\,dW(\omega,\tau) \tag{3.4}
\]

\[
\xrightarrow[n\to\infty]{\text{q.m.}} 0 \tag{3.5}
\]

Hence, \(E^{\mathcal{A}_s}(X_t - X_s) = 0\) a.s., and the proof is complete. ∎
A process {X_t, a ≤ t ≤ b} as defined by (3.1) is obviously q.m. continuous. Thus, we can choose a version of {X_t, a ≤ t ≤ b} which is separable and measurable. If we choose such a version and if we assume that the Brownian process {W_t, a ≤ t ≤ b} in (3.1) is also separable, then {X_t, a ≤ t ≤ b} is sample continuous with probability 1. When φ is an (ω,t)-step function, sample continuity is obvious since {X_t, a ≤ t ≤ b} is then a separable Brownian motion pieced together in a continuous way. If φ is not a step function, let {φ_n} be a sequence of (ω,t)-step functions satisfying (2.3) and (2.4) such that

If we set \(X_{nt} = \int_a^t \varphi_n(\omega,s)\,dW(\omega,s)\), then for each n, {X_{nt}, a ≤ t ≤ b} is sample continuous with probability 1. For each n, {X_{nt} − X_t, a ≤ t ≤ b} is a separable second-order martingale. If we apply the version of Proposition 2.3.2 for complex-valued martingales, we get

and X_t(ω), a ≤ t ≤ b, being the uniform limit of a sequence of continuous functions, is itself continuous. This proves the sample continuity of {X_t, a ≤ t ≤ b}.
One immediate consequence of the martingale property is that a stochastic integral does not behave like an ordinary integral. Consider, for example, the stochastic integral \(\int_0^t W_s\,dW_s\). If the integral were like an ordinary integral, surely it would equal \(\tfrac12(W_t^2 - W_0^2) = \tfrac12 W_t^2\). However, \(\tfrac12 W_t^2\) is not a martingale, as is seen from the relationship

\[
E^{\mathcal{A}_s}\left(\tfrac12 W_t^2\right) = \tfrac12 W_s^2 + \tfrac12(t - s)
\]
\[
\varphi_n(\omega,t) = \begin{cases} \varphi(\omega,t) & \text{if } \int_a^t |\varphi(\omega,s)|^2\,ds \le n\\ 0 & \text{otherwise} \end{cases}
\]
Stochastic integrals appearing in the following proposition will be assumed
to be defined in this way if the integrands satisfy (3.6) rather than (2.4).
\[
\cdots\,dX_j(\omega,t') + \frac12\sum_{j=1}^{n}\sum_{k=1}^{n}\int_a^b \psi_{jk}(X(\omega,t'),t')\,\varphi_j(\omega,t')\varphi_k(\omega,t')\,dt' \tag{3.8}
\]
Remark: The surprising thing about (3.8) is the last term. It comes about in roughly the following way. We recall that a Brownian motion W_t has the curious property (dW_t)² ≈ dt. Therefore, dX_j(t) dX_k(t) ≈ φ_jφ_k dt. Now,

\[
dY_t = Y_{t+dt} - Y_t = \psi(X_{t+dt},\,t+dt) - \psi(X_t,t)
= \dot\psi\,dt + \sum_k \psi_k\,dX_k(t) + \frac12\sum_j\sum_k \psi_{jk}\,dX_j(t)\,dX_k(t) + \cdots \tag{3.9}
\]

Both the first and the third terms in (3.9) are of order dt, hence

\[
dY_t = \dot\psi\,dt + \sum_k \psi_k\,dX_k(t) + \frac12\sum_{j,k}\psi_{jk}\varphi_j\varphi_k\,dt + o(dt) \tag{3.10}
\]
or

\[
\int_0^t W_s\,dW_s = Y_t - Y_0 = \tfrac12 W_t^2 - \tfrac12 t \tag{3.11}
\]
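Identity (3.11) is easy to see numerically: left-endpoint (Itô) sums converge to ½W_t² − ½t, not to the ordinary-calculus value ½W_t². A short Monte Carlo sketch:

```python
import numpy as np

# Monte Carlo illustration of (3.11): the Ito integral of W against itself
# is (1/2)W_t^2 - (1/2)t, not the ordinary-calculus value (1/2)W_t^2.
rng = np.random.default_rng(4)
n, n_paths, t_end = 400, 20000, 1.0
dt = t_end / n
dW = rng.standard_normal((n_paths, n)) * np.sqrt(dt)
W_right = np.cumsum(dW, axis=1)
W_left = W_right - dW

ito_sum = (W_left * dW).sum(axis=1)            # left-endpoint (Ito) sums
claim = 0.5 * W_right[:, -1] ** 2 - 0.5 * t_end
print(np.abs(ito_sum - claim).mean())          # pathwise gap, shrinks with n
```

The gap between the Itô sum and ½W_t² on each path is ½Σ(ΔW)² ≈ ½t, which is exactly the correction term in (3.11).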
where prime denotes differentiation with respect to the first variable. For
the second special case, consider the product Y(ω,t) = X₁(ω,t)X₂(ω,t), where X₁ and X₂ satisfy (3.7) with k = 1, 2. Then (3.8) becomes

\[
Y_t = Y_a + \int_a^t X_2(\omega,t')\,dX_1(\omega,t') + \int_a^t X_1(\omega,t')\,dX_2(\omega,t')
\]
(4.1)
Under the conditions that we shall assume, we can in fact assert a stronger
result than a.s. equality of the two random variables, viz., q.m. difference
between the two is zero.
We shall first state and prove an existence and uniqueness theorem
following Ito.
\[
X_0(\omega,t) = X(\omega)
\]
\[
X_{n+1}(\omega,t) = X(\omega) + \int_a^t m(X_n(\omega,s),s)\,ds + \int_a^t \sigma(X_n(\omega,s),s)\,dW(\omega,s)
\]
integral for each n. That is, we need to show that for each n, σ(X_n(ω,t), t) is jointly measurable in (ω,t), and for each t

is 𝒜_t measurable (4.7)

and

(4.8)

This can be done by induction. First, we verify (4.7) and (4.8) for n = 0. Since X₀(ω,t) = X(ω), (4.7) is satisfied, because σ is a Borel measurable function, and σ(X,t) is not only 𝒜_t measurable, it is 𝒜_a measurable. Using (4.5), we have

\[
\sigma^2(X,t) \le K^2(1 + X^2)
\]

so that

\[
\int_a^T E\sigma^2(X_0(\cdot,t),t)\,dt \le K^2(1 + EX^2)(T - a) < \infty
\]

and (4.8) is verified for n = 0. Now, assume that (4.7) and (4.8) are both satisfied for n = 0, 1, 2, . . . , k. Then from (4.6),
\[
X_{k+1}(\omega,t) = X + \int_a^t m(X_k(\omega,s),s)\,ds + \int_a^t \sigma(X_k(\omega,s),s)\,dW(\omega,s)
\]

each of the three terms on the right is 𝒜_t measurable, because {X_k(·,s), a ≤ s ≤ t} is 𝒜_t measurable. Next, we note that for a ≤ t₀ ≤ t ≤ T,

\[
E[X_{k+1}(\cdot,t) - X_{k+1}(\cdot,t_0)]^2 \le 2\left\{K^2[1 + (t - t_0)]\int_{t_0}^{t}[1 + EX_k^2(\cdot,s)]\,ds\right\} \tag{4.11}
\]
The induction is complete, and we have verified (4.7) and (4.8) for every n. Therefore, the sequence of processes {X_n(·,t), a ≤ t ≤ T, n = 0, 1, . . .} is well defined.

Next we prove that for each t, {X_n(·,t), n = 0, 1, . . .} converges
in quadratic mean. To do this, define

\[
\Delta_0(\omega,t) = X(\omega)
\]
\[
\Delta_n(\omega,t) = X_n(\omega,t) - X_{n-1}(\omega,t) \qquad n = 1, 2, \ldots \tag{4.13}
\]

Using (4.6), we get

The inequality (4.15) can be iterated starting from \(E\Delta_0^2(\cdot,t) = EX^2\), and we get

(4.16)
Now,

\[
X_{n+m}(\omega,t) - X_n(\omega,t) = \sum_{k=1}^{m}\Delta_{n+k}(\omega,t) \tag{4.17}
\]
Because for each n, {X_n(·,t), a ≤ t ≤ T} is q.m. continuous, the limit process {X_t, a ≤ t ≤ T} is also q.m. continuous, hence continuous in probability. It follows from Proposition 2.2.3 that a separable and measurable version can be chosen for {X_t, a ≤ t ≤ T}. We shall now show that {X_t, a ≤ t ≤ T} so constructed satisfies P₁-P₅.

First, for each t, X_n(·,t) is 𝒜_t measurable for every n. Therefore, X_t is also 𝒜_t measurable for each t, and P₁ is proved. Next,

\[
\sup_n \int_a^T EX_n^2(\cdot,t)\,dt \le 2\left[\int_a^T e^{at}\,dt + (T - a)\right]EX^2 = A < \infty \tag{4.22}
\]
Hence, using (4.20) on (4.21) we get
It is now easy to show that each of the three terms on the right-hand side goes to zero in quadratic mean as n → ∞. Therefore

\[
ED_t^2 = 0
\]

and for each t ∈ [a,T],

\[
X_t = X + \int_a^t m(X_s,s)\,ds + \int_a^t \sigma(X_s,s)\,dW_s \tag{4.23}
\]
(4.29)

X_a = X

remains unanswered. Indeed, this question cannot be answered without more being said about what we want (4.29) to mean. As it stands, (4.29) is merely a string of symbols, nothing more. We shall take up this question in the next section.

Finally, we note that the existence of a solution to (4.2) is ensured even without the Lipschitz condition (4.4), but then the uniqueness is no longer guaranteed [Skorokhod, 1965, p. 59].
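In practice, solutions of (4.23) are approximated on a grid. A minimal Euler-Maruyama sketch, taking the illustrative coefficients m(x,t) = −x and σ(x,t) = 1 (both satisfy the Lipschitz and growth conditions (4.4)-(4.5)); the variance of the resulting Ornstein-Uhlenbeck solution at time t is (1 − e^{−2t})/2:

```python
import numpy as np

# Euler-Maruyama sketch for (4.23) with m(x,t) = -x, sigma(x,t) = 1.
rng = np.random.default_rng(5)

def euler_maruyama(m, sigma, x0, t_end, n, n_paths, rng):
    """Simulate dX = m(X,t) dt + sigma(X,t) dW on [0, t_end]."""
    dt = t_end / n
    x = np.full(n_paths, float(x0))
    for i in range(n):
        dW = rng.standard_normal(n_paths) * np.sqrt(dt)
        x = x + m(x, i * dt) * dt + sigma(x, i * dt) * dW
    return x

x = euler_maruyama(lambda x, t: -x, lambda x, t: 1.0, 0.0, 2.0, 2000, 50000, rng)
var_exact = 0.5 * (1.0 - np.exp(-4.0))   # Var X_t = (1 - e^{-2t})/2 at t = 2
print(x.var(), var_exact)
```

The scheme is just the Picard-type recursion of (4.6) collapsed to one step per grid interval; under the Lipschitz condition it converges to the unique solution as the step size shrinks.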
is the following:
\[
\frac{d}{dt}X(\omega,t) = m(X(\omega,t),t) + \sigma(X(\omega,t),t)\,\xi(\omega,t) \tag{5.1}
\]

\[
\frac{d}{dt}X_n(\omega,t) = m(X_n(\omega,t),t) + \sigma(X_n(\omega,t),t)\,\xi_n(\omega,t) \tag{5.2}
\]

(5.3)
5. WHITE NOISE AND STOCHASTIC CALCULUS 157
\[
\psi(W_n(\omega,t),t) \xrightarrow[n\to\infty]{} \psi(W(\omega,t),t)
\]

\[
\psi(W_n(\omega,t),t) \xrightarrow[n\to\infty]{} \psi(W(\omega,t),t)
\]
\[
Y(\omega,t) = \int_a^t \varphi(W(\omega,s),s)\,dW(\omega,s) + \frac12\int_a^t \varphi'(W(\omega,s),s)\,ds \tag{5.12}
\]
so that

\[
d\psi(X_n(\omega,t),t) = \dot\psi(X_n(\omega,t),t)\,dt + \psi'(X_n(\omega,t),t)\,dX_n(\omega,t)
= \left[\dot\psi(X_n(\omega,t),t) + \frac{m(X_n(\omega,t),t)}{\sigma(X_n(\omega,t),t)}\right]dt + dW_n(\omega,t) \tag{5.16}
\]

or

\[
\psi(X_n(\omega,t),t) - \psi(X_n(\omega,a),a) = \int_a^t \mu(X_n(\omega,s),s)\,ds + W_n(\omega,t) - W_n(\omega,a) \tag{5.17}
\]
then we can apply Itô's differentiation formula (3.8) to ψ(X_t,t) and get

(5.24)

Again, we note the presence of an extra term ½σσ′, which will be referred to as the correction term.
We shall now state some convergence results concerning (5.13) and (5.23) [Wong and Zakai, 1965a and b, 1966]. We need to define some types of approximations {W_n(ω,t)} to a Brownian motion W(ω,t) as follows:

A₁: For each t, W_n(·,t) → W(·,t) a.s. as n → ∞. For each n and almost all ω, W_n(ω,·) is sample continuous and of bounded variation on [a,T].

A₂: A₁ and also, for almost all ω, W_n(ω,·) is uniformly bounded, i.e., for almost all ω,

\[
\sup_n \sup_{t\in[a,b]} |W_n(\omega,t)| < \infty
\]

A₃: A₂ and, for each n and almost all ω, W_n(ω,t) has a continuous derivative Ẇ_n(ω,t).

A₄: For each n, W_n(ω,t) is a polygonal approximation of W(ω,t) defined by

\[
W_n(\omega,t) = W(\omega,t_j^{(n)}) + \left[W(\omega,t_{j+1}^{(n)}) - W(\omega,t_j^{(n)})\right]\frac{t - t_j^{(n)}}{t_{j+1}^{(n)} - t_j^{(n)}}
\]
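The effect of the correction term is easy to exhibit numerically with a type-A₄ approximation. For the illustrative equation dX = X dW with X₀ = 1, the Itô solution is exp(W_t − t/2), while an ODE driven by the polygonal path reproduces the Stratonovich-type solution exp(W_t):

```python
import numpy as np

# Wong-Zakai illustration: driving x' = x * W_n' with a polygonal (A4)
# approximation of W gives exp(W_t), not the Ito solution exp(W_t - t/2).
rng = np.random.default_rng(6)
n_fine, t_end = 4096, 1.0
dt = t_end / n_fine
dW = rng.standard_normal(n_fine) * np.sqrt(dt)
W = np.concatenate([[0.0], np.cumsum(dW)])

# Between partition points W_n' is the constant slope dw/dt, and the ODE
# x' = x * slope integrates exactly to x *= exp(dw) across each segment.
x = 1.0
for dw in dW:
    x *= np.exp(dw)

stratonovich = np.exp(W[-1])
ito = np.exp(W[-1] - 0.5 * t_end)
print(x, stratonovich, ito)
```

The smooth-approximation limit and the Itô solution differ exactly by the factor e^{t/2}, i.e., by the ½σσ′ correction term noted above.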
Proposition 5.2. Let m(x,t), σ(x,t), σ′(x,t) = (∂/∂x)σ(x,t), and σ̇(x,t) = (∂/∂t)σ(x,t) be continuous in −∞ < x < ∞, a ≤ t ≤ b. Let m(x,t), σ(x,t), and σ(x,t)σ′(x,t) satisfy a uniform Lipschitz condition, i.e., if f denotes any of the three quantities m, σ, σσ′, then

\[
|f(x,t) - f(y,t)| \le K|x - y| \tag{5.28}
\]

Let {X_n(ω,t), t ≥ a} satisfy (5.2), and let {X(ω,t), t ≥ a} satisfy the
is a good approximation to (5.35). The difference here is that Z₁, Z₂, . . . are no longer independent.
for integrands satisfying (1) φ jointly measurable in (ω,t), (2) for each t, φ_t is 𝒜_t measurable, and (3) \(\int_a^b E|\varphi_t|^2\,dt < \infty\). The stochastic integral

a.s. (6.2)
Proposition 6.1. Let {W_t, 𝒜_t} be a Brownian motion and let φ(ω,t) satisfy:
(a) φ is jointly measurable in (ω,t).
(b) For each t, φ_t is 𝒜_t measurable.
(c) \(\int_a^b |\varphi(\omega,t)|^2\,dt < \infty\) almost surely.
Let φ^m be defined by

and we define

Proof: Let φ^m be defined by (6.3). For each m, φ^m satisfies (2.3) and (2.4) so that I(φ^m) is well defined. Now,

\[
\mathcal{P}\left(|I(\varphi^m) - I(\varphi^n)| \ge \epsilon\right) \le \mathcal{P}\left(\int_a^b |\varphi_t|^2\,dt > \min(m,n)\right) \xrightarrow[m,n\to\infty]{} 0
\]

which proves that {I(φ^n)} converges in probability, so that (6.4) is an adequate definition for I(φ). ∎
Remarks.
(a) If {φ_m} is a sequence of functions satisfying conditions (2.3) and (6.2), if |φ_m(ω,t)| ≤ |φ(ω,t)|, and if

\[
\varphi_m \xrightarrow[m\to\infty]{} \varphi \quad \text{in } \mathcal{P}\times\mathcal{L}\text{ measure} \tag{6.5}
\]
then
(6.6)
6. GENERALIZATIONS OF THE STOCHASTIC INTEGRAL 165
(6.8)
The procedure for defining (6.12) is exactly the same as before and will
not be repeated.
The class of processes satisfying both (6.9) and (6.10) is still quite
restricted. In particular, if Z_t is almost surely sample continuous, then F(t) is necessarily continuous [for convenience, we set F(0) = 0] and Z_t can be expressed as

\[
Z_t = W_{F(t)} \tag{6.15}
\]
where W t is a Brownian motion. Therefore, if we consider only sample-
continuous Zt, then the stochastic integral (6.12) is really the same as
\[
I(\varphi,\omega) = \sum_\nu \varphi_\nu(\omega)\left[Z(\omega,t_{\nu+1}) - Z(\omega,t_\nu)\right]
\]

Remark: It is clear that ⟨Z⟩_t now plays the role played by t in the original definition of the stochastic integral.
If X_t is of the form

\[
\cdots + \frac12\int_a^t \psi''(X_s,s)\,\varphi_s^2\,d\langle Z\rangle_s \tag{6.20}
\]
a.s. (6.21)
\[
Z_1(\omega,t) = \int_a^t \frac{z(\omega,s)}{Ez_s}\,d(EZ_{1s}) \tag{6.22}
\]
provided that the first integral exists almost surely as a Stieltjes integral
and the second as a stochastic integral. A process that can be decomposed
as in (6.25) was termed a quasi-martingale by Fisk [1965], who also gave necessary and sufficient conditions for the existence of such a decomposition. Unfortunately, these conditions are not always easily verified.
Proof: We shall give an outline of the proof. Let {Z_t, a ≤ t ≤ b} be defined by

\[
Z_t = X_t - X_a - \int_a^t m(X_s,s)\,ds \tag{6.33}
\]

Because of (6.30), we can show that {Z_t, 𝒜_t, a ≤ t ≤ b} is a sample-continuous martingale. Because of (6.27), it is also second order. Furthermore, if we define

\[
Y_t = Z_t^2 - \int_a^t \sigma^2(X_s,s)\,ds \tag{6.34}
\]
7. DIFFUSION EQUATIONS
In this section, we shall try to show that the transition probabilities of a
process satisfying a stochastic differential equation can be obtained by
solving either of a pair of partial differential equations. These equations
are called the backward and forward equations of Kolmogorov or, alter-
natively, diffusion equations. The forward equation is also sometimes
called the Fokker-Planck equation. The situation, however, is not com-
pletely satisfactory. As we shall see, the original derivation of Kolmogorov
involved assumptions that cannot be directly verified. Attempts at circumventing these assumptions involve other difficulties. We begin with a
derivation of the diffusion equations following the lines of Kolmogorov
[1931].
Let {X_t, a ≤ t ≤ b} be a Markov process, and denote

\[
P(x,t|x_0,t_0) = \mathcal{P}(X_t < x \mid X_{t_0} = x_0) \tag{7.1}
\]

We call P(x,t|x₀,t₀) the transition function of the process. If there is a function p(x,t|x₀,t₀) so that
\[
\frac{1}{\Delta}M_3(x,t;\,\epsilon,\Delta) \xrightarrow[\Delta\downarrow 0]{} 0 \tag{7.8}
\]

It is clear that if \(1 - M_0(x,t;\,\epsilon,\Delta) \xrightarrow[\Delta\downarrow 0]{} 0\), then by dominated convergence,

\[
\cdots + \frac12\,\frac{\partial^2 P(x,t|x_0,t_0+\Delta)}{\partial x_0^2}(z-x_0)^2 + \frac16\,\frac{\partial^3 P(x,t|z,t_0+\Delta)}{\partial z^3}\bigg|_{z=\theta}(z-x_0)^3 \qquad |\theta - x_0| \le |z - x_0| \tag{7.11}
\]
\[
\left|\frac{1}{\Delta}\left[P(x,t|x_0,t_0) - P(x,t|x_0,t_0+\Delta)\right] - \frac{1}{\Delta}M_1(x_0,t_0;\,\epsilon,\Delta)\,\frac{\partial P(x,t|x_0,t_0+\Delta)}{\partial x_0} - \frac{1}{2\Delta}M_2(x_0,t_0;\,\epsilon,\Delta)\,\frac{\partial^2 P(x,t|x_0,t_0+\Delta)}{\partial x_0^2}\right|
\le \frac{1}{\Delta}\left[1 - M_0(x_0,t_0;\,\epsilon,\Delta)\right] + \frac{M_3(x_0,t_0;\,\epsilon,\Delta)}{6\Delta}\sup_{|z-x_0|\le\epsilon}\left|\frac{\partial^3 P(x,t|z,t_0+\Delta)}{\partial z^3}\right| \tag{7.13}
\]

If we let Δ ↓ 0 and use conditions (7.5) through (7.8), (7.13) becomes

\[
-\frac{\partial}{\partial t_0}P(x,t|x_0,t_0) = m(x_0,t_0)\,\frac{\partial}{\partial x_0}P(x,t|x_0,t_0) + \frac12\,\sigma^2(x_0,t_0)\,\frac{\partial^2}{\partial x_0^2}P(x,t|x_0,t_0) \qquad a < t_0 < t < b \tag{7.14}
\]
Now, write

\[
f(x) = \sum_{k=0}^{2}\frac{1}{k!}f^{(k)}(z)(x-z)^k + \frac16 f^{(3)}(\theta)(x-z)^3 \tag{7.19}
\]

\[
\cdots + \frac{\partial}{\partial x}\left[m(x,t)p(x,t|x_0,t_0)\right]\bigg\}\,dx = 0 \tag{7.22}
\]
Since (7.22) holds for all f ∈ S, the quantity in the brackets must be zero for almost all x, but being continuous, it must be zero for all x. Therefore,

\[
\frac{\partial}{\partial t}p(x,t|x_0,t_0) = \frac12\,\frac{\partial^2}{\partial x^2}\left[\sigma^2(x,t)p(x,t|x_0,t_0)\right] - \frac{\partial}{\partial x}\left[m(x,t)p(x,t|x_0,t_0)\right] \qquad b > t > t_0 > a \tag{7.23}
\]

Equation (7.23) is the forward equation of diffusion, and is also called the Fokker-Planck equation. The initial condition to be imposed is

\[
\int_{-\infty}^{\infty} f(x)\,p(x,t|x_0,t_0)\,dx \xrightarrow[t\downarrow t_0]{} f(x_0) \qquad \forall f \in S \tag{7.24}
\]
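Equation (7.23) can be checked directly in the simplest case m = 0, σ = 1, where the transition density is the Gaussian heat kernel and the forward equation reduces to ∂p/∂t = ½ ∂²p/∂x². A finite-difference sketch:

```python
import numpy as np

# Finite-difference check of (7.23) with m = 0, sigma = 1: the Gaussian
# heat kernel p(x,t|0,0) should satisfy dp/dt = (1/2) d^2 p / dx^2.
def p(x, t):
    return np.exp(-x**2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)

x = np.linspace(-3.0, 3.0, 601)
t, h, k = 1.0, x[1] - x[0], 1e-4

lhs = (p(x, t + k) - p(x, t - k)) / (2.0 * k)                    # dp/dt
rhs = 0.5 * (p(x + h, t) - 2.0 * p(x, t) + p(x - h, t)) / h**2   # (1/2) p_xx
print(np.abs(lhs - rhs).max())
```

The residual is of the order of the central-difference truncation error, as expected for a true solution of the forward equation.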
Proposition 7.1. Let m(x,t) and σ(x,t) satisfy the following conditions on −∞ < x < ∞, a ≤ t ≤ b:

\[
|m(x,t)| \le K\sqrt{1 + x^2}
\]
\[
0 < \sigma_0 \le \sigma(x,t) \le K\sqrt{1 + x^2} \tag{7.25}
\]

There exist positive constants γ and K so that

\[
|m(x,t) - m(y,t)| \le K|x - y|^\gamma
\]
\[
|\sigma(x,t) - \sigma(y,t)| \le K|x - y|^\gamma \tag{7.26}
\]

(Hölder condition)

Then, the following conclusions are valid:
(a) The backward equation

\[
\frac12\,\sigma^2(x_0,t_0)\,\frac{\partial^2 P(x,t|x_0,t_0)}{\partial x_0^2} + m(x_0,t_0)\,\frac{\partial P(x,t|x_0,t_0)}{\partial x_0} = -\frac{\partial}{\partial t_0}P(x,t|x_0,t_0) \qquad t > t_0 \tag{7.27}
\]

has a unique solution corresponding to condition (7.15). Further, for t > t₀, P(x,t|x₀,t₀) is differentiable with respect to x so we have the transition density

\[
p(x,t|x_0,t_0) = \frac{\partial}{\partial x}P(x,t|x_0,t_0) \tag{7.28}
\]
(7.29)
Therefore,

\[
\frac{d^2}{dx^2}\left[(1+x^2)p(x)\right] + \frac{d}{dx}\left[xp(x)\right] = 0
\]

or

\[
\frac{d}{dx}\left[(1+x^2)p(x)\right] + xp(x) = \text{constant}
\]

\[
u(x,t;x_0,t_0) = \frac{\sinh^{-1}x - \sinh^{-1}x_0}{\sqrt2\,\sqrt{t-t_0}}
\]
\[
\cdots + \frac{1}{\sqrt\pi}\int_{u-\sqrt{t-t_0}}^{u+\sqrt{t-t_0}} e^{-z^2}\,dz\bigg\} \tag{7.34}
\]
\[
p(x,t|x_0,t_0) = \frac{1}{t-t_0}\,\exp\left[-\frac12\,\frac{x^2+x_0^2}{t-t_0}\right] I_0\!\left(\frac{xx_0}{t-t_0}\right) \tag{7.40}
\]
and

\[
\frac12\sum_{i,j}\frac{\partial^2}{\partial x_i\,\partial x_j}\left[\beta_{ij}(\mathbf{x},t)\,p(\mathbf{x},t|\mathbf{x}_0,t_0)\right] - \sum_i\frac{\partial}{\partial x_i}\left[m_i(\mathbf{x},t)\,p(\mathbf{x},t|\mathbf{x}_0,t_0)\right]
\]
EXERCISES
1. Let φ satisfy (2.3) and (2.4). In addition, assume that its derivative φ̇ is continuous on [a,b]. Show that
5. Let ψ(x), −∞ < x < ∞, be a real-valued Borel function with bounded continuous second derivative ψ″, and let {W_t, t ≥ 0} be a standard Brownian motion. Show that

\[
X_t = \psi(W_t) - \frac12\int_0^t \psi''(W_s)\,ds
\]

is a martingale. Thus, if ψ″ ≥ 0 (ψ″ ≤ 0) then ψ(W_t) is a submartingale (respectively, supermartingale).
\[
\mathbf{X}_t = \begin{bmatrix} X_{1t} \\ X_{2t} \end{bmatrix} = \begin{bmatrix} W_t \\ \psi(W_t) \end{bmatrix}
\]
and
Verify, both directly and by using (5.34), that X_t can also be considered to be the solution of a white-noise equation
\[
X_t = \sum_{k=1}^{n} W_{kt}^2
\]

Show that X_t satisfies a stochastic differential equation, and find this equation. Is X_t Markov?
9. Consider the forward equation (7.23) for the case where m and σ² are functions only of x and not t, that is,

\[
\frac{\partial}{\partial t}p(x,t|x_0,t_0) = \frac12\,\frac{\partial^2}{\partial x^2}\left[\sigma^2(x)p(x,t|x_0,t_0)\right] - \frac{\partial}{\partial x}\left[m(x)p(x,t|x_0,t_0)\right]
\]
where Λ is a Borel subset of the real line, μ a Borel measure, and φ_λ, ψ_λ are both solutions of the Sturm-Liouville equation

\[
\frac12\,\frac{d}{dx}\left[\sigma^2(x)w(x)\,\frac{df(x)}{dx}\right] + \lambda w(x)f(x) = 0
\]
10. For σ²(x) = 1, m(x) = 0, we have the Brownian motion case and

\[
p(x,t|x_0,t_0) = \frac{1}{\sqrt{2\pi(t-t_0)}}\,\exp\left[-\frac12\,\frac{(x-x_0)^2}{t-t_0}\right]
\]

\[
p(x,t|x_0,t_0) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-\frac12\nu^2(t-t_0)}\,e^{i\nu(x-x_0)}\,d\nu
\]
11. Suppose that σ²(x,t) = β(t) and m(x,t) = α(t)x. Show that the fundamental solution of the Fokker-Planck equation has the form

12. Suppose that σ²(x,t) = 1 and m(x,t) = −sgn x, where sgn x is +1 or −1 according as x ≥ 0 or x < 0.
(a) Find the limiting density p(x) = lim_{t→∞} p(x,t|x₀,t₀).
1. INTRODUCTION
This chapter is an introduction to the semigroup treatment of Markov processes with stationary transition functions. The modern theory of Markov processes is primarily a semigroup theory. Even though much of this theory has not found its way into applications in physical problems, the elucidation that is made possible with the semigroup approach makes it indispensable in any treatment of Markov processes.

Consider a Markov process {X(ω,t), t ∈ [0,∞)} defined on a probability space (Ω,𝒜,𝒫). We assume that the transition function

\[
\mathcal{P}(X_t < b \mid X_s = a) = P(b,t|a,s) \qquad t > s
\]
depends only on t - s and not on t and s separately. We call such transi-
tion functions stationary transition functions. A process with a stationary
transition function need not be a stationary process. For example,
Brownian motion has a stationary transition function, but is not a
stationary process.
It is rather important that the set of values that X(w,t) can assume be
(1.2)
If a transition density p_a(b,t) exists for t > 0, that is, if P_a(E,t) can be written as

\[
P_a(A,0) = \begin{cases} 1 & \text{if } a \in A\\ 0 & \text{if } a \notin A \end{cases}
\]

\[
H_t = e^{tA} \equiv \sum_{n=0}^{\infty}\frac{t^n}{n!}A^n
\]
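For a finite state space the exponential formula above is concrete: A is a generator matrix and H_t = e^{tA} is the transition matrix at time t. A sketch with an illustrative 3-state generator (the matrix and truncation order are arbitrary choices), verifying the semigroup property H_{t+s} = H_tH_s:

```python
import numpy as np

# Finite-state sketch of H_t = e^{tA}: A is a 3-state Markov generator
# (off-diagonal rates, rows summing to zero); the series is truncated.
A = np.array([[-1.0, 0.5, 0.5],
              [0.2, -0.7, 0.5],
              [0.3, 0.3, -0.6]])

def expm_series(M, terms=60):
    """Truncated power series for the matrix exponential."""
    out, term = np.eye(3), np.eye(3)
    for n in range(1, terms):
        term = term @ M / n
        out = out + term
    return out

H = lambda t: expm_series(t * A)
err = np.abs(H(0.7) - H(0.3) @ H(0.4)).max()   # semigroup property
row_sums = H(0.7).sum(axis=1)                  # H_t preserves constants
print(err, row_sums)
```

Because A annihilates constants (rows sum to zero), every H_t is a stochastic matrix, the matrix analogue of H_t1 = 1 for transition semigroups.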
184 ONE-DIMENSIONAL DIFFUSIONS
It is clear that B₀ ⊃ 𝒟_A. It turns out that 𝒟_A is dense in B₀. That is, every f ∈ B₀ is the strong limit of a sequence from 𝒟_A. To show this, take any f ∈ B₀ and set

\[
f_n = n\int_0^{1/n} H_s f\,ds \tag{2.15}
\]

Then, for n = 1, 2, . . . ,

\[
\operatorname{s-lim}_{\delta\downarrow 0}\frac{1}{\delta}\left(H_\delta f_n - f_n\right) = n\left(H_{1/n}f - f\right)
\]
2. THE MARKOV SEMIGROUP 185
Proposition 2.1. For every g E Bo and every X > 0, RAg E :DA. Further-
more, f = RAg is the unique solution to the equation
Ai - Af = g f E:DA (2.17)
Proof: We shall sketch a proof with some details omitted. First, we verify that R_λ g ∈ 𝒟_A by computing AR_λ g directly. For uniqueness, suppose that φ ∈ 𝒟_A satisfies λφ − Aφ = 0. Then e^{−λt}H_t φ is constant in t, and

e^{−λt}H_t φ = H₀φ = φ

It follows that

0 ≤ ||φ|| = e^{−λt}||H_t φ|| ≤ e^{−λt}||φ|| → 0  as t → ∞

so that φ = 0.
Next, define A_λ by

A_λ = λAR_λ   (2.21)

From (2.19) we have

||A_λ f|| = λ||AR_λ f|| = λ||λR_λ f − f|| ≤ λ||λR_λ f|| + λ||f|| ≤ 2λ||f||   (2.22)

so that A_λ is a bounded operator. We can define
f_n = n ∫₀^{1/n} H_s f ds   (2.24)

Then, f_n ∈ B₀ for each n, and

(2.25)

for each x ∈ S and each t ≥ 0. Finally, we note that
f(u,x) = e^{iux}
and
Therefore,
If we set φ_t = ∫₀^t ||H_s f − e^{sA_λ} f|| ds and make use of (2.22), we find
or
by dominated convergence. ∎
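For a finite state space, the resolvent is R_λ = (λI − A)⁻¹ and equation (2.17) can be verified by direct linear algebra. A minimal sketch, assuming a hypothetical two-state generator of our own choosing:

```python
# hypothetical two-state generator (an assumption for illustration)
A = [[-1.0, 1.0], [1.0, -1.0]]
lam = 2.0
g = [1.0, 3.0]

# R_lambda g = (lam*I - A)^{-1} g : solve the 2x2 linear system
a11 = lam - A[0][0]; a12 = -A[0][1]
a21 = -A[1][0];      a22 = lam - A[1][1]
det = a11 * a22 - a12 * a21
f = [( a22 * g[0] - a12 * g[1]) / det,
     (-a21 * g[0] + a11 * g[1]) / det]

# verify that f solves  lam*f - A f = g,  i.e. equation (2.17)
Af = [A[0][0] * f[0] + A[0][1] * f[1],
      A[1][0] * f[0] + A[1][1] * f[1]]
residual = [lam * f[i] - Af[i] - g[i] for i in range(2)]
print(residual)  # both entries are (numerically) zero
```

Uniqueness in Proposition 2.1 corresponds here to the invertibility of λI − A for λ > 0.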
Example. As an example, we shall derive the generator for a standard Brownian motion. We recall that a Brownian motion has a transition-density function given by

p_a(x,t) = (1/√(2πt)) e^{−(x−a)²/2t}   (2.28)
Let f be any function in B with a bounded-continuous second derivative f″, and let C² denote the set of all such functions. Then,
∫₀^∞ e^{−λt} p_a(x,t) dt = (1/√(2λ)) exp(−√(2λ)|x − a|)   0 < λ < ∞   (2.30)
Hence,
and
Af = ½f″   (2.33)
Further, for every t > 0, p_a(x,t) is C² in a so that

(∂/∂t) p_a(x,t) = ½ (∂²/∂a²) p_a(x,t)
which is the familiar backward equation for Brownian motion.
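The generator computation in this example can be checked numerically: approximate (H_t f)(a) = E_a f(X_t) by integrating f against the transition density (2.28), and compare ((H_t f)(a) − f(a))/t with ½f″(a) for small t. A sketch; the quadrature parameters are arbitrary choices of ours:

```python
import math

def Ht_f(f, a, t, n=4000, width=10.0):
    """(H_t f)(a) = E_a f(X_t) for Brownian motion, approximated by a
    midpoint Riemann sum against the transition density (2.28)."""
    s = math.sqrt(t)
    lo = a - width * s
    h = 2.0 * width * s / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        density = math.exp(-(x - a) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)
        total += f(x) * density * h
    return total

f = lambda x: x ** 2     # f'' = 2, so (Af)(a) should equal 1 everywhere
a, t = 0.7, 0.01
approx_Af = (Ht_f(f, a, t) - f(a)) / t
print(approx_Af)  # close to (1/2) f''(a) = 1
```

For the quadratic f the difference quotient is exactly ½f″ (since E_a X_t² = a² + t), so only quadrature error remains.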
3. STRONG MARKOV PROCESSES

{ω: T_a(ω) < t} = ∪_{n=1}^∞ {ω: X_s(ω) = a for some s in [0, t − 1/n]}

Since for each n, the set {ω: X_s(ω) = a for some s in [0, t − 1/n]} is in 𝒜_t,

{ω: T_a(ω) < t} ∈ 𝒜_t
Now, let S be the state space of the Markov process, and let a be in
the interior of S. We define
T_{a+} = lim_{b↓a} T_b   (3.4)

T_{a−} = lim_{b↑a} T_b   (3.5)

and these are also Markov times. We should again note that if τ is a Markov time, the set {ω: τ(ω) = t} need not be in 𝒜_t. For example, neither {T_{a+} = t} nor {T_{a−} = t} is necessarily in 𝒜_t.
Let τ be a Markov time. We define the σ-algebra 𝒜_{τ+} as follows:

E ∈ 𝒜_{τ+} if and only if E ∩ {ω: τ(ω) < t} ∈ 𝒜_t for every t

It is obvious that τ is 𝒜_{τ+} measurable. If τ = t₀ is a deterministic time, then

𝒜_{τ+} = ∩_{s>t₀} 𝒜_s

Thus, we see that if τ represents the present, then 𝒜_{τ+} is a little bit more than the past and present.
Definition. {X_t, 0 ≤ t < ∞} is said to be a strong Markov process if for every Markov time τ,

𝒫(X_{τ+t} ∈ E | 𝒜_{τ+}) = P_{X_τ}(E, t)   (3.6)
Every strong Markov process is Markov in the ordinary sense. This is
because if (3.6) is satisfied, then
𝒫(X_{t+s} ∈ E | 𝒜_t) = E^{𝒜_t} [𝒫(X_{t+s} ∈ E | 𝒜_{t+})]
1. If (3.6) is satisfied for the following classes of Markov times, then the
process is strongly Markov:
τ = T_a,   a ∈ S
τ = T_{a+},   a ∈ int (S)
τ = T_{a−},   a ∈ int (S)
2. If for every t ≥ 0, the operator H_t maps bounded-continuous functions into bounded-continuous functions, then the process is a strong Markov process.
Proposition 3.1. Let {X_t, 0 ≤ t < ∞} be a strong Markov process. Let τ(ω) be a Markov time.
(a) Let f ∈ B₀ and define

u_λ(a) = (R_λ f)(a) = E_a ∫₀^∞ e^{−λt} f(X_t) dt   (3.12)

Then

u_λ(a) = E_a [∫₀^τ e^{−λt} f(X_t) dt] + E_a [e^{−λτ} u_λ(X_τ)]   (3.13)
4. CHARACTERISTIC OPERATORS
For the remainder of this chapter we shall restrict ourselves to processes {X_t, 0 ≤ t < ∞} satisfying the following conditions:
Every point in the state space S is reachable from every point in the interior of S.   (4.1)
Proof: Under assumption (4.1), 𝒫_a(T_b < ∞) > 0, so that for some t < ∞,

𝒫_a(T_b > t) = α(t) < 1   (4.5)

Now, for a ≤ x ≤ b,

𝒫_x(T_{ab} > t) ≤ 𝒫_x(T_b > t) ≤ 𝒫_a(T_b > t) = α(t) < 1   (4.6)
Next, by the Markov property, we have

𝒫_x(T_{ab} > (n+1)t) ≤ α(t) 𝒫_x(T_{ab} > nt)

Now, we write

E_x T_{ab} ≤ Σ_{n=0}^∞ (n+1)t [𝒫_x(T_{ab} > nt) − 𝒫_x(T_{ab} > (n+1)t)]
(Ag)(x) = ½ d²g(x)/dx²   (4.12)

𝒫_a(T₀ > t) = (1/√(2π)) ∫_{−a/√t}^{a/√t} e^{−z²/2} dz
It is clear that knowing the two functions p_{ab+} and m_{ab} for every [a,b] ⊂ S completely determines (Ag)(x) at every interior point of S. It will turn out that they also elucidate the behavior of Ag at any closed endpoints. The converse is also true. That is, knowing Ag in int (S) completely determines p_{ab+} and m_{ab} for every [a,b] ⊂ S, as the following proposition shows.
Proposition 4.2. For every [a,b] ⊂ S, the functions p_{ab+} and m_{ab} are, respectively, the unique continuous solutions to the equations
But p_{ay+}(x) →_{x↑y} 1, so that p_{ab+}(x) →_{x↑y} p_{ab+}(y). Similarly, we can show p_{ab+}(x) →_{x↓y} p_{ab+}(y), so that p_{ab+} is continuous. Continuity of m_{ab} can be proved in a similar way. Uniqueness is much more difficult to prove. It depends on the fact that A satisfies a "minimum principle" [Dynkin, 1965, Chap. 1, p. 145]. ∎
If S is a closed interval, say [0,1], let

u(x) = p_{01+}(x)   (4.23)

Then, for 0 ≤ a < b ≤ 1, we have from (4.22)

p_{ab+}(x) = (u(x) − u(a)) / (u(b) − u(a))   (4.24)

(Ag)(x) = (1/μ′(x)) (d/dx) [(1/u′(x)) dg(x)/dx]
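Formula (4.24) can be illustrated with a symmetric random walk, the discrete analogue of Brownian motion, for which the scale function is u(x) = x: started from x, the walk hits b before a with probability (x − a)/(b − a). The grid and solver below are our own construction, not the text's:

```python
def hit_prob(N):
    """p[k] = probability that a symmetric random walk started at k
    reaches N before 0.  Solves p_k = (p_{k-1} + p_{k+1})/2 with
    p_0 = 0, p_N = 1 by forward elimination / back substitution."""
    # represent p_k = c[k] * p_{k+1} + d[k]
    c = [0.0] * N
    d = [0.0] * N
    for k in range(1, N):
        denom = 1.0 - 0.5 * c[k - 1]
        c[k] = 0.5 / denom
        d[k] = 0.5 * d[k - 1] / denom
    p = [0.0] * (N + 1)
    p[N] = 1.0
    for k in range(N - 1, 0, -1):
        p[k] = c[k] * p[k + 1] + d[k]
    return p

p = hit_prob(10)
print(p[3])  # scale-function prediction: (3 - 0) / (10 - 0) = 0.3
```

The discrete harmonic equation p_k = ½(p_{k−1} + p_{k+1}) is exactly the statement that p is linear in the natural scale, which is what (4.24) expresses in general.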
5. DIFFUSION PROCESSES
There is some disagreement in the recent literature as to the definition of a diffusion process. Some authors call any process satisfying condition (4.1) a diffusion, while others restrict the name to a smaller class of processes. We shall adopt the more restrictive definition and define a diffusion process as any process satisfying (4.1) and for which the limits

lim_{a↑x, b↓x} E_x(X_{T_{ab}} − x) / E_x T_{ab} = m(x)   (5.1)

and

lim_{a↑x, b↓x} E_x(X_{T_{ab}} − x)² / E_x T_{ab} = σ²(x)   (5.2)

exist at every x ∈ int (S). We shall always assume that m and σ² satisfy a Hölder condition (cf. 4.7) and

σ²(x) > 0   x ∈ int (S)   (5.3)
½σ²(x) d²g(x)/dx² + m(x) dg(x)/dx = 0   a < x < b   (5.5)

½σ²(x) d²m_{ab}(x)/dx² + m(x) dm_{ab}(x)/dx = −1   a < x < b   (5.8)

m_{ab}(a) = 0 = m_{ab}(b)

½σ²(x) ∂²G_{ab}(x,y)/∂x² + m(x) ∂G_{ab}(x,y)/∂x = 0

Therefore,

and

m_{ab}(x) = ∫_a^b [2 / (σ²(y)u′(y))] G_{ab}(x,y) dy   (5.11)
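Formula (5.11) can be checked numerically for Brownian motion on S = [0,1], where u(y) = y and σ²(y) = 1. Under these assumptions the Green's function takes the explicit form G₀₁(x,y) = min(x,y)(1 − max(x,y)) (our own closed form, used here for illustration), and the integral should reproduce the mean exit time m₀₁(x) = x(1 − x):

```python
def mean_exit(x, n=2000):
    """m_{01}(x) for Brownian motion on [0,1] via the Green's-function
    formula (5.11), with u(y) = y, sigma^2(y) = 1 and the assumed
    kernel G(x,y) = min(x,y) * (1 - max(x,y))."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        total += 2.0 * min(x, y) * (1.0 - max(x, y)) * h
    return total

x = 0.3
print(mean_exit(x), x * (1 - x))  # both ~ 0.21
```

The exact solution x(1 − x) also solves (5.8) directly: ½·(−2) = −1 with zero boundary values.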
Given m and σ² in int (S), p_{ab+} and m_{ab} are determined, and thus A is completely determined in int (S). However, we know that if S is closed at one or both endpoints, knowing A in the interior of S may not be enough to determine the semigroup {H_t, 0 ≤ t < ∞} (equivalently, the transition function) uniquely. To clarify the situation, we need to study the possible behavior at the boundaries. The first result in this direction is the following.
The left endpoint a of S belongs to S if and only if u(a) > −∞ and, for some c > a,

∫_a^c [u(y) − u(a)] / (σ²(y)u′(y)) dy < ∞   (5.12)

The right endpoint b of S belongs to S if and only if u(b) < ∞, and for some c < b,

∫_c^b [u(b) − u(y)] / (σ²(y)u′(y)) dy < ∞   (5.13)
Proof: We shall only prove the first half, the proof for the second half
being nearly identical.
First, suppose that a ∈ S. Let c be any point such that [a,c] ⊂ S. Then from (4.4) we have

sup_{a<x<c} m_{ac}(x) < ∞
m_{ac}(x) = lim_{z↓a} ∫_z^c G_{zc}(x,y) [2 / (σ²(y)u′(y))] dy
= 2 [(u(c) − u(x)) / (u(c) − u(a))] ∫_a^x [(u(y) − u(a)) / (σ²(y)u′(y))] dy
  + 2 [(u(x) − u(a)) / (u(c) − u(a))] ∫_x^c [(u(c) − u(y)) / (σ²(y)u′(y))] dy
Proposition 5.3. Let a be a closed left endpoint of S. One of two cases must hold: for some c ∈ int (S),

∫_a^c dy / (σ²(y)u′(y)) < ∞   (5.14)

or, for every c ∈ int (S),

∫_a^c dy / (σ²(y)u′(y)) = ∞   (5.16)

Remarks:
(a) The point a is called a regular boundary if (5.14) is satisfied for some c ∈ int (S); otherwise it is called an exit boundary.
(b) The criterion of Proposition 5.3 can be modified for a closed right endpoint in an obvious way, and we won't repeat it.
Let a be a regular left endpoint of S. Define

κ_a = lim_{c↓a} E_a T_c / (c − a)   (5.17)

We note that κ_a cannot be determined from m and σ². Indeed, each choice of κ_a gives us a process with a different transition function. If b is a regular right endpoint of S, then we set

κ_b = lim_{c↑b} E_b T_c / (b − c)   (5.18)
If a is a regular left endpoint, then

(Ag)(a) = (1/κ_a) g′(a+)   (5.19)

and if b is a regular right endpoint, then

(Ag)(b) = −(1/κ_b) g′(b−)   (5.20)
Remark: We note that because f_λ ∈ 𝒟_A ∩ C², (5.22) takes on a differential form in the interior of S, viz.,

Regular:  (1/κ_a) f_λ′(a+) = λf_λ(a) − g(a),   −(1/κ_b) f_λ′(b−) = λf_λ(b) − g(b)
solution to

½f_λ″(x) − λf_λ(x) = −g(x)   0 < x < ∞   (5.25)

subject to the boundary condition

(1/2κ₀) f_λ′(0+) = λf_λ(0+) − g(0)   (5.26)
(5.27)
½ ∂²F_λ(x,y)/∂x² − λF_λ(x,y) = 0   0 < y < x < ∞  and  0 < x < y < ∞

F_λ(y+,y) = F_λ(y−,y)

(∂/∂x)F_λ(x,y)|_{x=y+} − (∂/∂x)F_λ(x,y)|_{x=y−} = −2   (5.28)

½ d²v_λ(x)/dx² − λv_λ(x) = 0   0 < x < ∞   (5.29)

(1/2κ₀) v_λ′(0+) = λv_λ(0+) − 1
where A(y), B(y), C(y) and D are determined by the subsidiary conditions
in (5.28) and (5.29).
Let g(u,x) = e^{−ux}. Then, f_λ(u,x) is given by

f_λ(u,x) = ∫₀^∞ e^{−λt} (E_x e^{−uX_t}) dt = ∫₀^∞ e^{−λt} ∫₀^∞ e^{−ub} P_x(db,t) dt   (5.32)
(5.33)
for λ > 0 has a pair of solutions f_λ = φ_λ, θ_λ with the following properties:

φ_λ(x) > 0,  θ_λ(x) > 0,   x ∈ int (S)
φ_λ(x) nondecreasing,  θ_λ(x) nonincreasing   (5.38)
φ_λ(x) bounded if and only if the right endpoint of S is closed
θ_λ(x) bounded if and only if the left endpoint of S is closed

The two functions φ_λ and θ_λ are linearly independent, and the Wronskian Δ is given by

Δ(x) = θ_λ(x)φ_λ′(x) − φ_λ(x)θ_λ′(x) = u′(x)   (5.39)

where u(x) is the scale function. Therefore, every solution of (5.37) is a linear combination of φ_λ and θ_λ. We now seek a bounded solution of

½σ²(x) d²f_λ(x)/dx² + m(x) df_λ(x)/dx = λf_λ(x) − g(x)
where a and b are endpoints of S. The functions F_λ, u_λ, v_λ are of the form given by

F_λ(x,y) = [2 / (σ²(y)u′(y))] φ_λ(y)θ_λ(x)   x ≥ y
         = [2 / (σ²(y)u′(y))] θ_λ(y)φ_λ(x)   x ≤ y   (5.49)
From (5.49) and the fact that F_λ(x,y) is the Laplace transform of the transition density p_x(y,t), we can finally deduce that p_x(y,t) must satisfy both the forward and the backward equations of diffusion. Why it is so hard to prove that p_x(y,t) satisfies the diffusion equations (without assuming the differentiability conditions) is not well understood.
EXERCISES
1. Let {X_t, −∞ < t < ∞} be a random telegraph process, which we define as a stationary Markov process such that
2. Let {X_t, t ≥ 0} be a Markov process with stationary transition function

𝒫(X_t ≤ x | X_s = a) = (1/√(2π)) ∫_{−∞}^{(x−a)/√(t−s)} e^{−z²/2} dz + (1/√(2π)) ∫_{(x+a)/√(t−s)}^∞ e^{−z²/2} dz

for t > s, x > 0, a > 0. For a = 0 we have 𝒫(X_t = 0 | X_s = 0) = 1. Find the generator A.
3. Let {X_t, t ≥ 0} be a Brownian motion starting from 0, that is, X₀ = 0 with probability 1. Let T_c be defined by

T_c = min {t > 0: X_t = c}   c > 0
4. Let {X_t, −∞ < t < ∞} be a zero-mean Gaussian process with EX_tX_s = e^{−|t−s|}; namely, X is an Ornstein-Uhlenbeck process. Use (3.13) to find

𝒫(X_t ≥ 0, 0 ≤ t ≤ 1)
6. For the process in Exercise 5 find a suitable scale function u(x), and determine whether the state space S is closed at its left endpoint 0. If S is closed at 0, determine whether it is a regular boundary or an exit boundary. If it is a regular boundary, determine (Ag)(0).
7. Show that, except for notational changes, (5.22) can be viewed as a generalization of (2.12); that is, if H_t g ∈ 𝒟_A for each t, then it follows from (2.12) that

Af_λ = λf_λ − g
8. Let X_t, t ≥ 0, be a diffusion process so that in int (S), A = ½σ²(x) d²/dx² + m(x) d/dx. Let h(x,t) be defined by
This is the celebrated arc-sine law of Lévy. [Hint: Set k(x) = β(1 + sgn x)/2 and f(x) = 1 in Kac's theorem.]
10. Let X_t, t ≥ 0, be a standard Brownian motion. Use the result of Exercise 8 to find the distribution of the quadratic functional

∫₀¹ X_t² dt
11. Suppose that int (S) = (0,∞). For the following pairs of σ² and m, determine:
(a) Whether S is closed at 0.

σ²(x)   m(x)
1       −1
1       −x
x       −x
1       −½x
6
Martingale Calculus
1. MARTINGALES

E[X_t | 𝒜_s] = X_s   a.s.,  s ≤ t,  s, t ∈ T   (1.1)

Definition. We say that 𝒜. = (𝒜_t: t ≥ 0) satisfies the usual conditions if

X_t(ω) = lim_{s↓t, s>t} X_s(ω)
and   (1.2)
X_{t−}(ω) = lim_{s↑t, s<t} X_s(ω)
exist.
Remark: Proposition 1.1 and the closely related result, Proposition 1.2, are due to Doob. Proofs can be found in [Doob, 1953], [Doob, 1984], [Meyer, 1966] and in [Dellacherie and Meyer, 1982]. In typical proofs, Proposition 1.2 in the case T = ℤ₊ is deduced first, and then Propositions 1.1 and 1.2 are deduced from it.
A process X which satisfies the conditions (1.2) for all t, for ω in a set (not depending on t) of probability one, is said to have continuous-on-the-right with limits-on-the-left (corlol) sample paths, or is simply said to be a corlol process. A corlol process is clearly separable, with any countable dense subset of ℝ₊ serving as a separating set.
Given two numbers a and b we use a ∧ b to denote the smaller and a ∨ b to denote the larger of the two numbers. Given two σ-algebras 𝒜 and ℬ of subsets of a given set we use 𝒜 ∨ ℬ to denote the smallest σ-algebra containing both 𝒜 and ℬ.
Remark: Clearly X_∞ is 𝒜_∞ measurable, where 𝒜_∞ = ∨_t 𝒜_t; that is, 𝒜_∞ is the smallest σ-algebra containing the algebra ∪_t 𝒜_t. A similar convergence theorem is true when T = ℝ or T = ℤ and t tends to minus infinity.
An arbitrary family 𝒞 of random variables is called uniformly integrable if

lim_{c→∞} sup_{X∈𝒞} E[|X| I_{{|X|>c}}] = 0
The following is a proposition from classical measure theory, the first half of
which is a generalization of the dominated convergence theorem.
Proposition 1.3.
(a) Let (X_n) be a sequence of random variables, with E|X_n| finite for each n, converging a.s. to a random variable X. Then (E|X| is finite and X_n converges to X in 1-mean) if and only if the family of random variables {X_n} is uniformly integrable.
(b) Let X be a random variable on (Ω, 𝒜) with E|X| < +∞, and let {𝒜_γ: γ ∈ Γ} be an arbitrary family of sub-σ-algebras of 𝒜, with arbitrary parameter set Γ. Then

𝒞 = {E[X | 𝒜_γ]: γ ∈ Γ}

is uniformly integrable.
following proposition:

E[B | 𝒜_∞] = X_∞   a.s.   (1.5)

Thus E[(B − X_∞)I_Λ] = 0 for all Λ in the algebra ∪_t 𝒜_t, and thus for all Λ in 𝒜_∞ by the extension theorem. Since X_∞ is 𝒜_∞ measurable, Eq. (1.5) follows. ∎
A nonnegative random variable S, S ≤ +∞, is called an 𝒜. stopping time if

{ω: S(ω) ≤ t} ∈ 𝒜_t   for all t ≥ 0
a.s.   (1.6)

Proof: Let λ > 0 and define a stopping time T to be the first k such that Y_k ≥ λ, and T = n if there is no such k. By the optional sampling theorem, Y_T ≤ E[Y_n | 𝒜_T], so that

and so

Since the event that Y_T ≥ λ is the same as the event that Y_n* ≥ λ, this proves (a).
Now Y_n* ≤ Y₁ + ··· + Y_n and E[Y_k²] ≤ E[Y_n²] < +∞, so that E[(Y_n*)²] is finite, and (b) is implied. ∎
Proof: Suppose s₁, s₂, . . . is a separating set for the interval [0, T] with s₁ = T. For n ≥ 1, let t₁ⁿ, . . . , t_nⁿ be s₁, . . . , s_n arranged in increasing order. Then (|M_{t_kⁿ}|, 𝒜_{t_kⁿ}: 1 ≤ k ≤ n) is a positive submartingale, so, by the lemma,

(1.8)
Proposition 1.7. Let (Ω, 𝒜) be a measurable space, and let M₀ and M be two finite measures defined on (Ω, 𝒜). Then there exists a nonnegative, 𝒜-measurable function Λ and a finite measure μ such that for every A in 𝒜,

M(A) = ∫_A Λ dM₀ + μ(A)

We write Λ = dM/dM₀.
If M ≪ M₀ and ℬ is a sub-σ-algebra of 𝒜, then the restriction of M to ℬ is always absolutely continuous with respect to the corresponding restriction of M₀. It is easy to verify this fact and the formula (see Exercise 6)

dM^ℬ/dM₀^ℬ = E₀[dM/dM₀ | ℬ]   a.s. M₀   (1.9)
This equation provides a useful method for computing d𝒫/d𝒫₀, since L_n can often be computed directly, as we now demonstrate.
First, by (1.9) we have

(1.10)
L_n = ∏_{ν=0}^{2ⁿ−1} (1/√(2π2⁻ⁿ)) e^{−(2ⁿ/2)(x_{ν+1} − x_ν − 2⁻ⁿ)²} / ∏_{ν=0}^{2ⁿ−1} (1/√(2π2⁻ⁿ)) e^{−(2ⁿ/2)(x_{ν+1} − x_ν)²}

= exp[−½ + Σ_{ν=0}^{2ⁿ−1} (x_{ν+1} − x_ν)]

= e^{−½} e^{(x₁ − x₀)}
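The computation above telescopes path by path: for any sampled path, the product of Gaussian increment-density ratios (drift 1 versus drift 0 over [0,1]) collapses to e^{(x₁−x₀)−½}, independent of the number of subdivision points. A sketch; the sample path is an arbitrary choice of ours:

```python
import math

def discrete_lr(xs):
    """Likelihood ratio (unit drift vs. no drift) of a path sampled at
    len(xs)-1 equal steps over [0, 1]; normalizing constants cancel."""
    n = len(xs) - 1
    dt = 1.0 / n
    log_ratio = 0.0
    for v in range(n):
        dx = xs[v + 1] - xs[v]
        # log of the ratio of the two Gaussian increment densities
        log_ratio += (dx ** 2 - (dx - dt) ** 2) / (2.0 * dt)
    return math.exp(log_ratio)

xs = [math.sin(7.0 * k / 64) for k in range(65)]   # an arbitrary fixed path
print(discrete_lr(xs), math.exp(xs[-1] - xs[0] - 0.5))  # equal
```

Working with the log of the ratio avoids underflow when individual increment densities are tiny; the algebra (dx² − (dx − dt)²)/(2dt) = dx − dt/2 makes the telescoping explicit.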
∫ Φ(s,ω) dV(s,ω)

or simply as ∫ Φ dV, for random processes Φ and V. Such an integral can in some cases be defined as an integral involving deterministic functions (namely the sample paths of Φ and V) for each ω. Before defining the integral as a Lebesgue-Stieltjes integral, we will briefly review the theory of Lebesgue-Stieltjes integrals for deterministic functions.
Let G be a bounded increasing function on ℝ₊ which is right continuous. Then there exists a unique measure μ defined on the Borel subsets of ℝ₊
such that

∫_a^b |dv_s| = sup_{t∈π_{ab}} Σ_{k=0}^{n−1} |v(t_{k+1}) − v(t_k)| ≤ +∞
We say that v has finite variation if its variation over IR+ is finite.
A real function v has finite variation if and only if v can be written as the difference of two bounded increasing functions,

v = v₁ − v₂   (2.2)
∫₀^t |dv_s| = lim_{n→∞} Σ_{k=0}^{2ⁿ−1} |v(tθⁿ_{k+1}) − v(tθⁿ_k)|   (2.3)

v₁(t) = lim_{n→∞} Σ_{k=0}^{2ⁿ−1} (v(tθⁿ_{k+1}) − v(tθⁿ_k))₊   (2.4)

and

v₂(t) = lim_{n→∞} Σ_{k=0}^{2ⁿ−1} (v(tθⁿ_{k+1}) − v(tθⁿ_k))₋   (2.5)

where θⁿ_k = k2⁻ⁿ.
∫₀^∞ φ dv_s = ∫₀^∞ φ₊ dv₁(s) − ∫₀^∞ φ₋ dv₁(s) − ∫₀^∞ φ₊ dv₂(s) + ∫₀^∞ φ₋ dv₂(s)
The integral is well defined and finite if the integral of |φ| with respect to the variation of v, defined by

∫₀^∞ |φ_s| |dv_s|   (2.6)

is finite. The integral of φ with respect to v over a compact interval [0, t] is defined by

(2.7)

If the left-hand side of (2.6) is finite then the integral in (2.7) is finite, and as a function of t the integral in (2.7) is right continuous and its total variation is equal to the left-hand side of (2.6). It is convenient to use the notation
φ • v to denote the integral as a function of t; thus

φ • v_t = ∫₀^T I_{(0,t]}(s) φ_s dv_s   (for some T, T ≥ t)   (2.8)

because the value of the integral in (2.8) is the same for all T exceeding t. Then φ • V_t is well defined and finite for all t if

∫₀^t |φ_s| |dV_s| < ∞   (2.9)

and the integral in (2.9) is the variation of φ • v over the interval [0, t].
Suppose now that (Ω, 𝒜, 𝒫) is a probability space equipped with an increasing family of sub-σ-algebras of 𝒜, 𝒜. = (𝒜_t, t ≥ 0). Suppose that V = (V_t: t ≥ 0) is a real corlol random process which is 𝒜. adapted. We then define ∫|dV_s|, V₁(t) and V₂(t) by the same equations, Eqs. (2.3)-(2.5), which were used for deterministic functions v. Then, for example, for each ω fixed,

∫₀^t |dV_s|(ω) = lim_{n→∞} Σ_{k=0}^{2ⁿ−1} |V(tθⁿ_{k+1}, ω) − V(tθⁿ_k, ω)|   for each t
For the remainder of this section we will assume that V has locally finite
variation.
The corresponding facts for deterministic processes immediately imply that ∫₀^t|dV_s|, V₁(t) and V₂(t) are increasing and right continuous in t and that Eqs. (2.1) and (2.2) hold, for a.e. ω. On the other hand, for each t fixed these quantities are defined as (pointwise) limits of sequences of 𝒜_t measurable random variables. Therefore ∫₀^·|dV_s|, V₁ and V₂ are adapted random processes.
Next, if φ(ω,t) is a real Borel measurable function of t for each ω, then we define φ • V_t(ω) for each ω to be the Lebesgue-Stieltjes integral of the sample function s ↦ φ(s,ω) with respect to the sample function s ↦ V(s,ω).
In order that the resulting integral be measurable and adapted, a condition on φ as a function of ω is needed. The appropriate condition is that of progressive measurability.
Proposition 2.1. Let V be a corlol, 𝒜.-adapted random process and let Φ be an 𝒜.-progressively measurable process. Then the sample path Lebesgue-Stieltjes integral
Proof: The only part of the proposition that does not follow immediately from the facts given above for deterministic functions is that the integrals in (2.10) and (2.11) are 𝒜_t measurable for each t. It is sufficient to consider the case when Φ is bounded, because in the general case Φ is the pointwise limit of a sequence of bounded Φⁿ with |Φⁿ| ≤ |Φ|, and the integrals of the Φⁿ converge pointwise to those of Φ by the dominated convergence theorem. Fix t with t ≥ 0. It suffices to prove (the stronger result) that the random variable U defined by
and
For example,

∫₀^t N_s dN_s = Σ_{s≤t, ΔN_s=1} N_s = 1 + 2 + ··· + N_t
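The sample-path integral against a Poisson path can be computed directly from the jump times, since each jump at s contributes the value of the integrand there. A sketch with hypothetical jump times of our own:

```python
def poisson_stieltjes(jump_times, t):
    """Sample-path integral of N_s dN_s over [0, t] for a counting path
    N with the given (sorted) jump times: each jump at time s
    contributes N_s, the right-continuous value there."""
    total = 0
    n = 0
    for s in jump_times:
        if s > t:
            break
        n += 1          # N jumps to n at time s
        total += n      # integrand N_s evaluated at the jump
    return total, n

jumps = [0.2, 0.9, 1.3, 2.4, 3.1]
integral, Nt = poisson_stieltjes(jumps, 3.0)
print(integral, Nt * (Nt + 1) // 2)  # 1 + 2 + ... + N_t = 10
```

Because the integrator is a pure-jump path, the Lebesgue-Stieltjes integral reduces to the finite sum 1 + 2 + ··· + N_t = N_t(N_t + 1)/2.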
3. PREDICTABLE PROCESSES
a.s. when t ≥ s
(3.1)
Proposition 3.1.
Hⁿ(t,ω) = Σ_{k=0}^{n²} H(k/n, ω) I_{(k/n, (k+1)/n]}(t)
Proposition 3.2.
Proof: Rather than give a proof (which can be found in [Dellacherie and Meyer, 1982] and in [Doob, 1984]), we only attempt to make plausible the existence of predictable compensators for a class D submartingale B. Given a positive integer M, define a process A^M by (note the similarity to Eq. (3.5)):
zero, and has corlol locally finite variation sample paths. Thus, assertion (a) of the proposition is equivalent to the fact that such martingales are equal to zero for all t, with probability one.
Meyer's original proof, as well as those of Doléans (1968) and Rao (1969), were based on a different characterization of compensators; they were called "natural increasing processes." Doléans (1967) first established the equivalence of the two characterizations (see [Dellacherie and Meyer, 1982, p. 126] and [Doob, 1984, p. 483]).
for some nonnegative 𝒜. progressive process λ, then λ is called the intensity of C with respect to (𝒜., 𝒫).
4. ISOMETRIC INTEGRALS
a.s. ( 4.1)
The fact that the right-hand side is nonnegative for all complex λ implies the useful inequality:
The result is a random process in ℳ² which we denote by φ • M; that is,
Proposition 4.1. There is a unique mapping φ → φ • M from ℒ²(M) to ℳ² which satisfies properties (a)-(c):
(a) If φ is a step function in ℒ²(M) such that (4.4) holds, then

(4.5)

(b) (Linearity)
(c) Δ(φ • M)_t = φ_t ΔM_t for all t, with probability one, where ΔZ_t = Z_t − Z_{t−} is the jump of a process Z at time t.
Proof: If φ is a step function then the set t₀, …, t_n in (4.4) is not unique. For example, if (4.4) is true for t₀, …, t_n then it is also true for any sequence t₀′, t₁′, …, t_m′ for which t₀, …, t_n is a subsequence. In spite of this nonuniqueness, the sum in (4.5) does not depend on the choice of t₀, …, t_n, so we can use (a) to define φ • M_∞ for a step function φ. If ψ is another step function in ℒ²(M), then a single sequence t₀, …, t_n can be found so that (4.3) and the corresponding statement for ψ are both true. The linearity property is easily established in this case. Next,
|φ • M_∞|² = Σ_{k=0}^{n−1} Σ_{j=0}^{n−1} φ_{t_k+} φ_{t_j+} (M_{t_{k+1}} − M_{t_k})(M_{t_{j+1}} − M_{t_j})

Let D_{kj} denote the kj-th term on the right-hand side. The fact that φ_{t_k+} is 𝒜_{t_k} measurable for each k implies that for k > j, E D_{kj} = 0. Hence

E|φ • M_∞|² = E ∫₀^∞ |φ_t|² d⟨M,M⟩_t = ||φ||²

Thus, properties (b) and (c) are both true when φ and ψ are step functions.
So far, we have only defined a random variable φ • M_∞ for each step function φ. However, because of the one-to-one correspondence between ℳ² and the space of square integrable 𝒜_∞-measurable random variables discussed at the beginning of the section, this serves to define the process φ • M in ℳ² for any step function φ. The case of general φ will be considered next.
Let L denote the set of functions φ in ℒ²(M) such that there exists a sequence of step functions φⁿ in ℒ²(M) such that ||φ − φⁿ|| → 0.
This proves the assertion that the map satisfying properties (a)-(c) is unique, if it exists.
Continuing now with the existence proof, we attempt to define φ • M by Eq. (4.8). To see that this works we need only check two things. First (use |a + b|² ≤ 2|a|² + 2|b|²),

E[|φⁿ • M_∞ − φᵐ • M_∞|²] = ||φⁿ − φᵐ||² ≤ 2(||φⁿ − φ||² + ||φᵐ − φ||²) → 0

so that the sequence of random variables φⁿ • M_∞ is Cauchy in 2-mean and hence the limit in (4.8) exists. Second, if (ψⁿ) is another sequence of step functions in ℒ²(M) such that ||φ − ψⁿ|| tends to zero as n tends to infinity, then
the left-hand side of (4.9) tends to zero as n tends to infinity. Thus ψⁿ • M converges in 2-mean to φ • M, so our definition of φ • M does not depend on the sequence φⁿ chosen. Thus φ • M is well defined for any φ in ℒ²(M).
If φⁿ and ψⁿ are step functions then

(4.10)

and if the functions are chosen so that ||φ − φⁿ||, ||ψ − ψⁿ||, and hence also ||φ + ψ − φⁿ − ψⁿ||, tend to zero as n tends to infinity, we obtain (4.6) from (4.10) by taking the limit in 2-mean of each term in (4.10). Property (c) is easily verified for any φ in ℒ²(M) by a similar argument.
The isometry property (c′) is implied by property (c) and the identities obtained by substituting φ • M and ψ • M, or φ and ψ, in for M and N in Eq. (4.2).
For any s and t,
so that
a.s.
and
for all t, ω
Thus, by the monotone class theorem, (e) is true for all bounded real functions φ in ℒ²(M). Appealing to the approximation argument again, one easily sees that (e) is true for all φ in ℒ²(M).
Property (f) is proved by the same method used to prove property (e).
To prove Eq. (4.7) we only need to establish that U defined by

≤ 2||φ − φⁿ|| (E|N_∞|²)^{1/2}

Thus, U_tⁿ converges to U_t in 1-mean for each t. To see that this implies that U is a martingale, note that for s > t

Then

Thus, the martingales ∫₀^t H_sⁿ dY_s converge to ∫₀^t H_s dY_s in 1-mean for each t, which yields the desired result. ∎
5. SEMIMARTINGALE INTEGRALS
X_t^T = 0     if T = 0
X_t^T = X_t   if 0 ≤ t ≤ T and T > 0
X_t^T = X_T   if t ≥ T and T > 0

or, equivalently, X_t^T = I{T > 0} X_{t∧T}. A random process M is called a local martingale (resp. local square integrable martingale) if it is corlol and if there is a sequence (T_n) of stopping times which is increasing (i.e., T_n(ω) ≤ T_{n+1}(ω) a.s.) and which satisfies T_n → +∞ a.s., such that M^{T_n} is a martingale (resp. square integrable martingale) for each n. Similarly, a random process H is called locally bounded if there is a sequence T_n ↑ +∞ of stopping times such that H^{T_n} is a bounded random process for each n.
for t ≥ 0, a.s.
where M ∈ ℳ² and A has finite variation. For any good stopping time R, the integral H • X^R of H with respect to X^R can thus be defined by the procedure given at the beginning of the section. If S is another good stopping time, then so is R ∧ S and
for any finite T. If U is an adapted corlol process then for each t > 0 (write θ_kⁿ for k2⁻ⁿ):

(5.3)
E sup_t |Hⁿ • M_t|² ≤ 4E ∫₀^∞ |H_tⁿ|² d⟨M,M⟩_t →_{n→∞} 0
We thus have that sup_t |Hⁿ • X_t| converges to zero in probability as n tends to infinity.
In the general case there are stopping times R_k ↑ ∞ such that
for t ≥ 0, a.s.
→ 0 as n → ∞
by our conclusion in the special case. This implies Eq. (5.2). Applying what was just proved to
as n tends to infinity. Proposition 5.1 implies that the limit exists in the
6. QUADRATIC VARIATION AND THE CHANGE OF VARIABLE FORMULA
(6.2)
This equation shows that we can choose a corlol version of [X, Y], which we
always will. Since Eq. (6.1) defines [X, Y]t up to a set of zero probability for
each t, and since a corlol process is determined by its values at a countable
set of times, any two corlol versions of [X, Y] are equal for all t, with
probability one.
The process [X, Y] is linear in X and [X, Y] = [Y, X], so (ignoring a set of zero probability)

[X, Y] = ¼{[X + Y, X + Y] − [X − Y, X − Y] + i[−iX + Y, −iX + Y] − i[iX + Y, iX + Y]}   (6.3)
Processes of the form [Z, Z]_t are increasing in t, and thus [X, Y] is a locally finite variation process. The continuous part, [X, Y]^c, of the process [X, Y] is defined by

[X, Y]^c_t = [X, Y]_t − X₀Y₀ − Σ_{0<s≤t} ΔX_s ΔY_s   (6.4)
(6.5)

Since the predictable compensator of N is t, it follows that ⟨N, N⟩_t = ⟨n, n⟩_t = t.
Equation (6.2) (which is virtually the definition of [X, Y]) yields that for semimartingales X and Y,

(6.6)
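Equation (6.6) is the integration-by-parts (product) formula; its exact discrete counterpart follows by summing the identity x_{k+1}y_{k+1} − x_ky_k = x_kΔy_k + y_kΔx_k + Δx_kΔy_k, with left endpoints playing the role of the left limits in the stochastic integrals. A sketch on arbitrary toy sequences of our own:

```python
def product_decomposition(xs, ys):
    """Discrete integration by parts:
    x_N y_N = x_0 y_0 + sum x_k dy_k + sum y_k dx_k + sum dx_k dy_k,
    with x_k, y_k the left-endpoint (predictable-side) values."""
    m = len(xs) - 1
    int_x_dy = sum(xs[k] * (ys[k + 1] - ys[k]) for k in range(m))
    int_y_dx = sum(ys[k] * (xs[k + 1] - xs[k]) for k in range(m))
    bracket  = sum((xs[k + 1] - xs[k]) * (ys[k + 1] - ys[k]) for k in range(m))
    return xs[0] * ys[0] + int_x_dy + int_y_dx + bracket

xs = [0.0, 1.0, -0.5, 2.0]
ys = [1.0, 1.5, 1.5, 0.25]
print(product_decomposition(xs, ys), xs[-1] * ys[-1])  # equal
```

The "bracket" sum of increment products is the discrete analogue of [X, Y]_t; dropping it is exactly the error one makes by using the classical (smooth-calculus) product rule.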
(6.8)
where K is random and is defined by
Thus, the sum of jumps term on the right-hand side of Eq. (6.7) converges
absolutely for each finite t, with probability one, and its variation is locally
finite. The right-hand side of Eq. (6.7) is therefore a semimartingale.
Equation (6.7) is certainly true if f(x) = 1 or if f(x) = x. The equation when f(x) = xⁿ becomes

For n = 2 this equation reduces to the product formula (6.6) and it is hence true in that case. Now, argue by induction and suppose that for some n, n ≥ 2, Eq. (6.9) for xⁿ is correct. Then, since x^{n+1} = x·xⁿ, the product formula yields

X_t^{n+1} = ∫₀^t X_{s−} dX_sⁿ + ∫₀^t X_{s−}ⁿ dX_s + [X, Xⁿ]_t   (6.10)
[X, Xⁿ]_t = X₀^{n+1} + n ∫₀^t X_{s−}^{n−1} d[X, X]_s + Σ_{s≤t} ΔX_s (X_sⁿ − X_{s−}ⁿ − nX_{s−}^{n−1} ΔX_s)

and

∫₀^t X_{s−} dX_sⁿ = n ∫₀^t X_{s−}ⁿ dX_s + Σ_{s≤t} X_{s−} (X_sⁿ − X_{s−}ⁿ − nX_{s−}^{n−1} ΔX_s)
if |z| ≤ D

and since the sum of |ΔX_s|² over s in any finite interval is a.s. finite, the product term in (7.1) converges absolutely for all t, with probability one. The product term thus represents a process with locally finite variation, so the process ℰ(X) is itself a semimartingale. Note that ℰ(X)_t = exp(t) if X_t ≡ t.
Using the generalized Ito formula, it is not hard (although it is tedious) to show that

(7.2)
Proof: Since W and N are adapted, it suffices to prove that for any s, t with 0 ≤ s < t,

W_t − W_s is Gaussian, mean 0, variance t − s;
N_t − N_s is Poisson, mean t − s; and
W_t − W_s, N_t − N_s and 𝒜_s are independent   (7.4)

X_t = iuW_t + (e^{iv} − 1)n_t

[X, X]^c_t = −u²t,   ΔX_t = (e^{iv} − 1)ΔN_t,   and   ΔN_t ∈ {0, 1}

so we obtain
or, equivalently,
This translates to the statement that the conditional characteristic function is that of a pair of independent random variables, the first of which is Gaussian with mean 0 and variance t − s, and the second of which is Poisson with mean t − s. Thus, by the uniqueness of characteristic functions, the conditions in (7.4) must be true. ∎
The next application of semimartingale exponentials is to the description of local martingales when the probability measure on the underlying probability space is changed. We start out with a probability measure 𝒫₀ on (Ω, 𝒜). Suppose that M is a local martingale relative to (𝒜_t) on the probability space (Ω, 𝒜, 𝒫₀). For short, we say that M is a 𝒫₀ local martingale. Assume also that M is real valued, that M₀ = 0, and that ΔM_t > −1 for all t, and let L = ℰ(M). Then L₀ = 1, and L is a positive 𝒫₀ local martingale. Hence, L is also a 𝒫₀ supermartingale (see Exercise 8) and the a.s. limit L_∞
7. SEMIMARTINGALE EXPONENTIALS AND APPLICATIONS
(7.5)
Lemma 7.1. For any random variable D with E|D| finite and any sub-σ-algebra ℬ of 𝒜,

𝒫 a.s.   (7.6)

Let V denote the left-hand side of this equation. To show that it is equal to the right-hand side we will show that V has the two properties which characterize the right-hand side. First, V is ℬ measurable. Second, we must prove that

for all bounded ℬ measurable Z   (7.7)

Now

= E₀[E₀[L_∞ | ℬ] I_{{E₀[L_∞|ℬ]=0}}] = 0
Lemma 7.2. Suppose that U is an adapted random process such that L_tU_t is a 𝒫₀ martingale (resp. 𝒫₀ local martingale). Then U is a 𝒫 martingale (resp. 𝒫 local martingale).
Proof: Suppose that L_tU_t is a 𝒫₀ martingale. Then, by Lemma 7.1, for s > t,

𝒫 a.s.

and the first assertion is proved. If, instead, L_tU_t is only a 𝒫₀ local martingale, then there exist stopping times R_n ↑ ∞ such that (LU)_{t∧R_n} is a 𝒫₀ martingale for each n. Then, by the argument above, U_{t∧R_n} is a 𝒫 martingale for each n, which implies that U is a 𝒫 local martingale. ∎
Proposition 7.2 (Abstract Girsanov's Theorem, [Girsanov, 1960], [Van Schuppen and Wong, 1974]). Let Y be a 𝒫₀ local martingale. If ⟨Y, M⟩ exists (computed under measure 𝒫₀), then Y − ⟨Y, M⟩ is a 𝒫 local martingale.
Proof: By Lemma 7.2, it suffices to prove that (Y − ⟨Y, M⟩)L is a 𝒫₀ local martingale. By Ito's product formula,
Since ⟨Y, M⟩ has locally finite variation and is predictable, and since dL_t = L_{t−} dM_t,
and that 𝒫 is defined by Eq. (7.5). Then, computing under measure 𝒫₀,

⟨M, N − ∫₀^· λ_s ds⟩_t = ⟨M, N⟩_t − ∫₀^t λ_s d⟨N, N⟩_s

and

are each local martingales. Since the quadratic variation process of the first of these is t, the first of these processes is in fact a 𝒫 Wiener process. The intensity of N under measure 𝒫 is ρ.
The following lemma ensures that the family (𝒢_t) satisfies the usual conditions, so that, for example, (𝒢_t, 𝒫₀) martingales have corlol modifications.
Fix t and let X be a bounded 𝒢_{t+}-measurable random variable. Since 𝒢_{t+} is independent of ℋ_{t+1/n} for each n ≥ 1, we have
The martingale convergence theorem ensures that the left-hand side converges a.s. to E₀[X | ℋ_t], so that E₀[X | ℋ_t] = E₀X a.s. By considering all possible X we conclude that the σ-algebras ℋ_t and 𝒢_{t+} are independent. Now if Z is a sum of finitely many terms A_iB_i, where the A_i are bounded 𝒢_t-measurable random variables and the B_i are bounded ℋ_t-measurable
(7.9)
0 ≤ t ≤ T   a.s.
0 ≤ t ≤ T   a.s.
and
We will show that Ỹ_t = Y_t. First define Z_t = Ỹ_t − Y_t and note that ⟨Z, W⟩ = ⟨Z, n⟩ = 0. The jumps of Z are uniformly bounded, so that we can choose λ > 0 so that λΔZ_t > −1 for all t. Then ℰ(λZ) is a positive local martingale, so there exist 𝒢. stopping times R_k ↑ T such that L^k = ℰ(λZ)_{t∧R_k} is a martingale for each k. Let 𝒫_k denote the probability measure on (Ω, 𝒜) which is absolutely continuous with respect to 𝒫₀ such that d𝒫_k = L_T^k d𝒫₀. Since ⟨Z, W⟩ = ⟨Z, n⟩ = 0, Proposition 7.1 yields that W is a 𝒫_k Wiener martingale and N is a 𝒫_k Poisson process. Thus 𝒫_k and 𝒫₀ are identical on 𝒢_T, which implies that L_T^k = 1, 𝒫₀ a.s. Thus, Z_t = 0 and Ỹ_t = Y_t for 0 ≤ t ≤ R_k. Since k was arbitrary, Ỹ_t = Y_t for 0 ≤ t ≤ T. Now, Ỹ_t converges in 2-mean to Y_t as c → +∞, and so Eqs. (7.9) and (7.11) are obtained in the limit. By an easy patching-together argument, Y has a representation of the form (7.9) for all t ≥ 0.
Next, suppose that Y is a square integrable (𝒢_t, 𝒫₀) martingale, and let T > 0. Then there exists a sequence of bounded 𝒢_T-measurable random variables Y_T^k converging in 2-mean to Y_T. Let Y^k denote the corlol martingale on [0, T] defined by Y_t^k = E₀[Y_T^k | 𝒢_t]. Then Y^k is bounded, so that
Equations (7.9) and (7.11) thus hold for each Y^k, and it is then an easy matter to show that there exist H and K so that Eqs. (7.9) and (7.11) hold for 0 ≤ t ≤ T.
The general case of the proposition can be proved by a more elaborate approximation argument [Liptser and Shiryayev, 1977], which we omit. See [Jacod, 1979] for a variety of extensions. ∎
EXERCISES
2. Suppose that S and T are stopping times. Show that their minimum S ∧ T and maximum S ∨ T are stopping times. If (T_n) is a sequence of stopping times, show that T* = sup_n T_n is a stopping time.

3. Use the optional sampling theorem to show that if M is a corlol martingale and T is a stopping time, then M^T defined by M_t^T = M_{t∧T} is also a martingale.

4. Prove the optional sampling theorem (Proposition 1.5) in the special case that there are two nonrandom times t_1 < t_2 such that S, R ∈ {t_1, t_2} a.s.
9. Let (F_n) be an increasing family of sub-σ-algebras of F, let F_∞ = ∨_n F_n, let P_0 and P be probability measures on F_∞, and let P_0^n and P^n denote the restrictions of P_0 and P to F_n. Suppose that P^n ≪ P_0^n for each n and let L_n = dP^n/dP_0^n. Then L is a positive P_0 martingale, so that L_∞ = lim_n L_n exists a.s. Show that P ≪ P_0 if and only if E_0 L_∞ = 1. (Hint: E_0 L_∞ = 1 is equivalent to L being uniformly integrable.)
10. [Kakutani, 1948] Let Ω = {(x_1, x_2, ...): x_i ∈ ℝ} and let X_i denote the function on Ω defined by X_i(x) = x_i. Let F_n = σ(X_1, ..., X_n) and F_∞ = ∨_n F_n. Let P_0 and P be probability measures on (Ω, F_∞) so that X_1, X_2, ... are independent unit-variance Gaussian random variables under P_0 and under P. Assume that E_0 X_i = 0 and E X_i = a_i for all i. Let S = Σ_{i=1}^∞ a_i². Using the previous problem, show that P_0 ≡ P if S < +∞ and that P_0 ⊥ P otherwise. Show also that if P_0 ≡ P then the Radon–Nikodym derivative of P with respect to P_0 is given by

dP/dP_0 = exp( Σ_{i=1}^∞ a_i X_i − (1/2) Σ_{i=1}^∞ a_i² )
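The Kakutani dichotomy in this exercise can be illustrated numerically. The sketch below (an illustration added here, not part of the text) samples X_1, ..., X_50 under P_0 and evaluates L_n = exp(Σ a_i X_i − ½ Σ a_i²) for a square-summable sequence a_i = 1/i and a non-square-summable sequence a_i = 1/√i; all parameter choices are hypothetical.

```python
import math, random

def likelihood_ratio(a, xs):
    # L_n = exp( sum a_i x_i - 0.5 sum a_i^2 )
    return math.exp(sum(ai * xi for ai, xi in zip(a, xs))
                    - 0.5 * sum(ai * ai for ai in a))

def sample_L(a, n_samples, rng):
    # evaluate L_n on samples drawn under P_0 (all means zero)
    return [likelihood_ratio(a, [rng.gauss(0.0, 1.0) for _ in a])
            for _ in range(n_samples)]

rng = random.Random(0)
a_sq_summable = [1.0 / i for i in range(1, 51)]            # sum a_i^2 < oo
a_divergent = [1.0 / math.sqrt(i) for i in range(1, 51)]   # sum a_i^2 = oo

L1 = sample_L(a_sq_summable, 10000, rng)
L2 = sample_L(a_divergent, 10000, rng)
mean_L1 = sum(L1) / len(L1)           # near 1: E_0 L_n = 1 (martingale property)
median_L2 = sorted(L2)[len(L2) // 2]  # near exp(-S_n/2): collapsing toward 0
print(mean_L1, median_L2)
```

As n grows, L_n stays a mean-one positive martingale in the square-summable case, while in the divergent case its typical value exp(−S_n/2) collapses to zero, the numerical face of P_0 ⊥ P.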
11. Find the predictable compensators for wt4 and for N t4 , where w is a Wiener Illartingale
and N is an !'t. Poisson process.
12. Let (V_n, F_n: n ∈ ℤ_+) be a martingale with V_0 = 0, let B_n = V_{n+1} − V_n for n ≥ 1, and suppose that |B_n| = 1 for all n and ω.

(a) Show that B_{n+1} is independent of F_n for each n (so that B_1, B_2, ... are mutually independent) and that P(B_i = 1) = P(B_i = −1) = 0.5.

(b) Show that any martingale (M_n) relative to the σ-algebras (F_n^V) generated by V has a representation

M_n = M_0 + Σ_{k=1}^n H_k B_k,   n ≥ 1
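Part (b) can be made concrete with a choice of martingale of our own (not the text's): for the simple random walk V and M_n = V_n² − n, the predictable integrand works out to H_k = 2V_{k−1}, since M_k − M_{k−1} = 2V_{k−1}B_k + B_k² − 1 = 2V_{k−1}B_k. The sketch verifies the representation exactly, path by path.

```python
import random

def representation_check(n_steps, rng):
    # V is a simple random walk with increments B_k = ±1 (a martingale with
    # V_0 = 0), and M_n = V_n^2 - n is a martingale for the filtration
    # generated by V.  The predictable integrand in the representation
    # M_n = M_0 + sum_{k=1}^n H_k B_k works out to H_k = 2 V_{k-1}.
    V = [0]
    B = []
    for _ in range(n_steps):
        b = rng.choice([-1, 1])
        B.append(b)
        V.append(V[-1] + b)
    M = [V[n] ** 2 - n for n in range(n_steps + 1)]
    S = 0
    for k in range(1, n_steps + 1):
        H_k = 2 * V[k - 1]        # depends only on the path up to time k-1
        S += H_k * B[k - 1]
        if M[k] - M[0] != S:      # representation must hold exactly, pathwise
            return False
    return True

rng = random.Random(1)
ok = all(representation_check(50, rng) for _ in range(100))
print(ok)  # → True
```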
1. INTRODUCTION

In both the detection problem and the filtering problem a sample path from some random process θ is observed. In the detection problem, it is postulated that one of two known probability measures P or P_0 is given on the underlying space (Ω, F). Presumably the distribution of the observed process is different under P and P_0. The problem is to intelligently guess which of the two measures is in effect, on the basis of the observation. In the filtering problem a random process ξ, which may be a coordinate of some other random process, is given and the problem is to produce at each time t an estimate of ξ_t.

The detection and filtering problems are closely related. On the one hand, the solution of the detection problem can involve estimating processes under the assumption that one of the given probabilities is in effect. On the other hand, one treatment of filtering is to introduce a probability measure in addition to the one given, thus leading to the situation encountered in the detection problem.

For both the detection and the filtering problem the role of the observations is to provide information, and this information is summarized by the increasing family of σ-algebras O. = (O_t) generated by the observations:

O_t = σ( θ_s: 0 ≤ s ≤ t )
Proof: Consider first the case that ξ(t, ω) = f(t)U(ω) where f is a bounded Borel-measurable function and U is a bounded random variable. Let (M_t) denote a separable version of the martingale E[U | O_t]. There exists a countable subset D of ℝ_+ such that for each t not in D, M is a.s. left continuous at t [Doob 1953, Th. 11.2, p. 358]. Then the process ξ̂ defined by

M_t   if t ∈ D

Proof: Let X denote the space of bounded ξ for which Proposition 1.2 is true. If ξ(t, ω) = f(t)U(ω) and M and D are as in the proof of Proposition 1.1, then ξ̂ defined by

ξ̂_t = f(t) limsup_{n→∞} M_{⌊t(n−1)⌋/n}   if t ∉ D
ξ̂_t = 0   if t ∈ D

(with the convention that M_t = 0 for t < 0) and D satisfy the conclusion of Proposition 1.2. Thus ξ is in X, and X also includes finite linear combinations of such processes. The rest of the proof is nearly identical to that of Proposition 1.1. ∎
Z_t = Z_0 + ∫_0^t l_s ds + m_t

where E|Z_0| is finite, l is a progressively measurable process with

and

Proof: If we use Eq. (1.2) to define D and use the representation for Z and the definition of Ẑ_t, we obtain

where O. progressive versions are chosen for conditional expectations where appropriate. Now for 0 ≤ a < t,

y_t = ∫_0^t h_s ds + w_t   (1.4)

N_t = ∫_0^t λ_s ds + M_t   (1.7)

where N is a counting process,   (1.8)
254 DETECTION AND FILTERING
ỹ_t = y_t − ∫_0^t ĥ_s ds   and   (1.11)

are O. martingales. The processes ỹ and Ñ are called the innovations processes corresponding to the observation processes y and N.

We also assume that O_t = O_{t+} and that O_0 consists only of P_0 null sets and their complements. Let P_0^t and P^t denote the restrictions of P_0 and P to O_t. We assume for simplicity that P_0^T and P^T are mutually absolutely continuous and define Λ_t for 0 ≤ t ≤ T by

compute Λ_T and

and for any ε between zero and one there is a choice of the parameters γ and α such that P_F = ε. By the well-known Neyman–Pearson lemma of statistics, the resulting likelihood ratio test achieves the minimum P_M over all decision rules with P_F = ε.
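As a small added illustration of the Neyman–Pearson recipe (the Gaussian model and all parameter values below are assumptions of this sketch, not taken from the text), the threshold γ is set so that the false-alarm probability equals ε for testing n i.i.d. N(0,1) samples against N(μ,1), and the false-alarm rate is then checked by simulation.

```python
import random
from statistics import NormalDist

# Testing P_0: X_i ~ N(0,1) against P: X_i ~ N(mu,1), i = 1,...,n.
# The log likelihood ratio is increasing in sum(X_i), so thresholding
# sum(X_i) at gamma gives a likelihood ratio test.
mu, n, eps = 0.5, 25, 0.05
std = NormalDist()
gamma = std.inv_cdf(1.0 - eps) * n ** 0.5       # P_0(sum X_i > gamma) = eps
p_miss = std.cdf((gamma - n * mu) / n ** 0.5)   # miss probability under P

# Monte Carlo check of the false-alarm probability under P_0
rng = random.Random(0)
trials = 20000
false_alarms = sum(
    sum(rng.gauss(0.0, 1.0) for _ in range(n)) > gamma for _ in range(trials)
)
fa_rate = false_alarms / trials
print(gamma, p_miss, fa_rate)
```

By the Neyman–Pearson lemma, no other decision rule with false-alarm probability ε has a smaller miss probability than this threshold test.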
The key to implementing and evaluating a likelihood ratio test is to compute the likelihood ratio Λ_T, and the key to understanding the likelihood ratio process (Λ_t) is to connect it to the behavior of the observations under P and P_0. We will first represent Λ as a martingale exponential.
Define, for n ≥ 1,

Thus E_0[Λ_T I_{{R ≤ T}}] = 0, and since P_0(Λ_T > 0) = 1 it follows that P_0(R ≤ T) = 0. Therefore Λ_{s−} is locally bounded and locally bounded away from zero. We can thus define

or

0 ≤ t ≤ T

for some O. predictable processes φ and ψ such that

P_0 a.s.

To see how φ and ψ are related to P, first apply the abstract Girsanov theorem, Proposition 6.7.2, to deduce that

Comparing this with the fact that the innovations processes ỹ and Ñ in (1.11) are also (O., P) martingales and using the fact that the (O., P) predictable compensators of y and N are unique, we conclude that
X = ĥ · y + ( λ̂ − 1 ) · n
3. FILTER REPRESENTATION - CHANGE OF MEASURE DERIVATION 257
so that
where
and
Remark: The likelihood ratio representation (2.1) can be traced back to work of Sosulin and Stratonovich (1965), Duncan (1968), and Kailath (1969). For an account of its history for point process observations, see [Bremaud, 1981].
Let (Ω, F, P) be a complete probability space and let S. = (S_t) and O. = (O_t) be two increasing families of sub-σ-algebras of F which satisfy the usual conditions. The σ-algebra S_t represents "state information" up to time t and O_t represents "observed information" up to time t. It is useful to define a third increasing family F. by F_t = S_t ∨ O_t. We will assume that the observation σ-algebras are generated by a pair of processes y and N:

(3.3)

Observation:

(3.4)

where

λ is F. predictable,   E ∫_0^T λ_s ds < +∞,   λ_s > 0,   (3.8)

M is an O_t ∨ S_T martingale   (3.9)
Proposition 3.1.

and

are P_0 independent. In particular (take s = 0), S_T and O_T are P_0 independent.

Proof: Using the identity (6.7.3) we easily check that L_t V_t = 1, which implies (a). Next, V is an (S_T ∨ O_t, P) martingale with V_0 = 1 so that E[V_T | S_T] = 1. Thus, for any bounded S_T-measurable random variable X,

and

M_t − ⟨ −h · w + ((λ − 1)/λ) · M, M ⟩_t = N_t − t = n_t

so by the abstract Girsanov theorem, y and n are each (O_t ∨ S_T, P_0) local martingales. Moreover, y is sample continuous with [y, y]_t = t and N is a counting process. Properties (c)–(e) are then implied by the Lévy–Watanabe characterization theorem, Proposition 6.7.1. ∎
Remark: We will no longer appeal directly to assumption (3.10), but will only use the fact that a P_0 satisfying the conclusions of Proposition 3.1 exists. This is important since Proposition 3.1 can be established under much less restrictive assumptions. In particular, it is not necessary to have P_0 ≪ P.
By Lemma 6.7.1, if η is a measurable process with E(η_t) finite for all t, then

(3.11)

Note that

where Λ is the likelihood ratio process for P versus P_0 relative to O..

It follows that if we define η̂ to be any predictable process such that η̂_t = η_t P_0 a.s. unless t is in some Borel subset D of ℝ_+ with Lebesgue measure zero, then we have

dt dP(ω) a.e.

E_0 ∫_0^T ( η̂_s L_s h_s )² ds < +∞

a.s. for t ≥ 0
Lemma 3.1. [y, m]_t = [n, m]_t = 0 for t ≥ 0, P_0 a.s.

Proof of Lemma 3.1: y, and therefore also [y, m], is sample continuous so that

[y, m]_t = ⟨y, m⟩_t

Since the probability that n has a jump at any fixed time is zero and since n and m are independent under P_0, it follows from (3.12) that [n, m] = 0. ∎
Lemma 3.2.

and the second σ-algebra on the right-hand side is P_0 independent of F_s. Since η̂_s L_s is F_s measurable and F_s ⊇ O_s, we claim that

a.s. (3.14)

= ∫_0^t η̂_s dy_s

The first equality can be proved by reducing it to the case that η̂_s L_{s−} is replaced by a bounded F. predictable step function, and the second equality is true by the definition of η̂_s. The proof of part (c) is similar.

The fact that m is an (F., P_0) martingale and that O_T is P_0 independent of S_T implies that m, and therefore (η̂ ∘ m)_t, is a (P_0, S_t ∨ O_T) martingale. Thus,
M is an F. martingale

We will also make the following assumptions, which can be relaxed:

λ_t ≥ ε for all (t, ω), where ε > 0

E ∫_0^T h_t² λ_t dt < +∞   (e.g., λ bounded)

By the argument used in the first part of the proof of Proposition 6.7.3, there exist F. predictable processes φ and ψ so that

(4.1)

P a.s.
and

Proof: Define X = −ĥ ∘ ỹ − (1 − 1/λ̂) ∘ Ñ and let L = ℰ(X), or

L_t = exp( −ĥ ∘ ỹ_t − ∫_0^t ( ĥ_s²/2 + 1 − λ̂_s ) ds ) ∏_{s ≤ t, ΔN_s = 1} λ̂_s^{−1}

and

are P_n local martingales. Now

and

Thus, under measure P_n, w up to time τ_n is an O. Wiener martingale and N up to time τ_n is an O. Poisson process.
4. FILTER REPRESENTATION - INNOVATIONS DERIVATION 265
By inequality (6.4.3),

so that ⟨D, X^n⟩_T, as well as D_T², has finite expectation under P. Since L^n is bounded, both ⟨D, X^n⟩_T and D_T² have finite means under P_n as well. Thus, D − ⟨D, X^n⟩ is an (O., P_n) square integrable martingale with initial value zero, so by the representation theorem of Section 6.7,

E_n ∫_0^T ( H_s² + K_s² ) ds < +∞

0 ≤ t ≤ T   and a.s. ∎
Lemma 4.3. H and K in Lemma 4.2 are given by Eqs. (4.2) and (4.3).

Proof: (See the remark below for a heuristic approach.) Define a sequence

ξy = yβ ∘ μ + y ∘ m + ξh ∘ μ + ξ ∘ w + ⟨m, w⟩   (4.4)

so that

(b) (ξy)_{t∧S_n} = ( yβ + ξh + φ ) ∘ μ_{t∧S_n} + an F. martingale

Thus, projecting this process onto the observations σ-algebras (Proposition 1.3) yields

and

is an O. local martingale. It is also predictable, has initial value zero, and has locally finite variation corlol sample paths. It is hence zero for all t with probability one, so that H_t must be given by Eq. (4.2) for t ≤ S_n. Since S_n increases to T as n tends to infinity, Eq. (4.2) is thus true in general.
The identification of K is similar. First define a sequence (T_n) of O. stopping times by

T_n = inf{ t: N_t ≥ n }

Since the jumps of N have size at most one, N_t is bounded above by n for 0 ≤ t ≤ T_n. The analogue of Eq. (4.4) is

ξN = N_−β ∘ μ + N_− ∘ m + ξ_−λ ∘ μ + ξ_− ∘ M + ( [m, M] − ⟨m, M⟩ ) + ⟨m, M⟩

where N_− at s is equal to N_{s−} (and similarly for ξ_−). Since [m, M] − ⟨m, M⟩ is an F. martingale, we have

which is analogous to Eq. (4.5). The rest of the proof for identifying K is so similar to that for identifying H that we omit it. ∎
Remark: Since the proof of Lemma 4.3 is somewhat mysterious, we will give a heuristic derivation of the equations for H and K by appealing directly to the "orthogonality principle." Fix t, let dt be a small positive number, and use the notation dξ_t = ξ_{t+dt} − ξ_t, etc. We know that
Example [Kalman and Bucy, 1961]. Let ξ be the unique solution to the linear stochastic differential equation

dξ_t = aξ_t dt + b dw̃_t   (4.7)

dξ̂_t = aξ̂_t dt + Σ_t ( dy_t − ξ̂_t dt )   (4.8)

where

(4.9)

(4.10)

dΣ_t/dt = 2aΣ_t + b² − Σ_t²,   Σ_0 = σ_0²   (4.12)

Equations (4.8) and (4.12) are a special case of the Kalman–Bucy filter and associated Riccati equation given by Eqs. (3.9.37) and (3.9.39). These equations provide a recursive method for computing ξ̂_t (see Section 5).
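The scalar filter and Riccati pair above can be exercised numerically. The sketch below is an Euler–Maruyama discretization under assumed parameter values (a = −1, b = 1, unit observation noise); the discretization scheme and constants are choices of this illustration, not the book's.

```python
import math, random

def kalman_bucy(a=-1.0, b=1.0, sigma0=1.0, T=10.0, dt=1e-3, seed=0):
    # Euler-Maruyama discretization of the assumed scalar model
    #   dx_t = a x_t dt + b dw'_t            (state)
    #   dy_t = x_t dt + dw_t                 (observation, unit noise)
    # together with the filter and Riccati equation
    #   dxh_t = a xh_t dt + S_t (dy_t - xh_t dt)
    #   dS_t/dt = 2 a S_t + b^2 - S_t^2,     S_0 = sigma0^2.
    rng = random.Random(seed)
    x = rng.gauss(0.0, sigma0)
    xh, S = 0.0, sigma0 ** 2
    sq_err = 0.0
    for _ in range(int(T / dt)):
        dy = x * dt + rng.gauss(0.0, math.sqrt(dt))
        xh += a * xh * dt + S * (dy - xh * dt)
        S += (2.0 * a * S + b * b - S * S) * dt
        x += a * x * dt + b * rng.gauss(0.0, math.sqrt(dt))
        sq_err += (x - xh) ** 2 * dt
    return xh, S, sq_err / T

xh, S, mse = kalman_bucy()
S_star = -1.0 + math.sqrt(2.0)   # positive root of S^2 - 2aS - b^2 = 0 for a=-1, b=1
print(S, S_star, mse)
```

The Riccati variance Σ_t settles at the positive root of Σ² − 2aΣ − b² = 0, and the time-averaged squared filtering error is of the same order.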
5. RECURSIVE ESTIMATION
The estimation equations of the previous two sections are specialized in this section to the case that a Markov process X represents an unobserved state, and conditional moments

N_t = ∫_0^t λ(X_s) ds + M_t   (5.2)

and we suppose that conditions (3.5)–(3.9) are true with h_s = h(X_s) and λ_s = λ(X_{s−}).
To apply the estimation equations of Propositions 3.2 and 4.1 to processes ξ of the form ξ_t = g(X_t), we must find a semimartingale representation of ξ as in Eq. (3.2). If g is in D_A, then an easy modification of the proof of Dynkin's identity given in Chapter 5 yields that for 0 ≤ s ≤ t,

(since here s and t are not random, it is not necessary that X be strong Markov). Equivalently, C_t^g defined by

(5.3)

Remark: The fact that C_t^g in Eq. (5.3) is a martingale for g in D_A characterizes Ag. Moreover, if we use such a martingale property as a definition for Ag, we can extend the operator A to a much larger domain (possibly including some unbounded functions).

Thus, if g ∈ D_A then ξ_t = g(X_t) has the desired representation (3.2) with β_s = Ag(X_{s−}) and M_t = C_t^g. Proposition 3.2 then yields the representation

then a set of equations of the form (5.4), one for each g in D_A, yields a stochastic differential equation for
Example [Wonham, 1964]. Suppose the state space S is {1, 2, ..., n}. Functions and measures on S are represented by column and row vectors, respectively. The generator A is then represented by an n × n matrix which we also call A, so that for a function f on S, it is consistent to interpret Af as the product of the matrix A and the column vector f.

Let δ_i denote the function on S such that δ_i(z) is one for z = i and is zero otherwise, and let Π denote the vector process defined by
We will now obtain an analogue of Eq. (5.6) for the general case. For disjoint sets A_n in B(S),

a.s. (5.7)

⟨μ, f⟩ = ∫_S f dμ

The adjoint A* of A relative to this product is characterized by

(We will not enter into a discussion of the domain of A*.) If l is a reference measure on B(S) we use ⟨·, ·⟩_l to denote the usual inner product

⟨f, g⟩_l = ∫_S f(x) g(x) l(dx)

⟨g, Ãf⟩_l = ⟨Ã*g, f⟩_l
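On a finite state space with l the counting measure, the adjoint relation reduces to the statement that the adjoint matrix is the transpose. A minimal check (the 3-state generator matrix below is an assumed example):

```python
# Finite-state illustration: S = {1,2,3}, l = counting measure, so
# <f,g>_l is the ordinary dot product and the adjoint is the transpose.
A = [[-2.0, 1.0, 1.0],
     [0.5, -0.5, 0.0],
     [1.0, 2.0, -3.0]]   # generator: nonnegative off-diagonal rates, zero row sums

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def inner(f, g):
    return sum(fi * gi for fi, gi in zip(f, g))

f = [1.0, -2.0, 0.5]
g = [3.0, 0.0, -1.0]
lhs = inner(g, matvec(A, f))             # <g, A f>_l
rhs = inner(matvec(transpose(A), g), f)  # <A* g, f>_l with A* = A^T
print(lhs, rhs)  # → -6.0 -6.0
```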
For example, if X is a diffusion with drift term m(x) and diffusion term σ²(x) as described in Section 4.7, then

Af(x) = m(x) (∂f/∂x)(x) + (1/2) σ²(x) (∂²f/∂x²)(x)

or, using M_f to denote the operator on measures defined by M_f μ(A) = ∫_A f dμ,
we have

This equation is true for g in D_A and D_A is dense in the space of all bounded B(S)-measurable functions, so we conclude that

(5.9)

Ancestors of Eq. (5.11) were first given by Stratonovich (1960) and Kushner (1967). In the case that h and λ are identically constant functions (equivalent to no observations) this equation reduces to dq_t/dt = Ã*q_t, which is the Kolmogorov forward equation for the family of densities of X_t as t varies.
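For the finite-state Wonham example, the filtering equation can be approximated by a discrete-time predict/correct recursion. The sketch below is an approximation scheme of our own construction (the generator, observation drifts, and step sizes are assumptions), not a transcription of the book's Eq. (5.10)/(5.11): it propagates the conditional distribution with the transition kernel I + A dt and reweights by the Gaussian observation likelihood.

```python
import math, random

def wonham_filter(A, h, T=5.0, dt=1e-3, seed=0):
    # Discrete-time approximation of the finite-state filter for
    # dy_t = h(X_t) dt + dw_t: predict with pi <- pi (I + A dt), correct by
    # the likelihood exp(h_j dy - 0.5 h_j^2 dt), then renormalize.
    n = len(A)
    rng = random.Random(seed)
    x = 0                        # current state of the simulated chain
    pi = [1.0 / n] * n           # filtered distribution (row vector)
    for _ in range(int(T / dt)):
        dy = h[x] * dt + rng.gauss(0.0, math.sqrt(dt))
        pred = [pi[j] + dt * sum(pi[i] * A[i][j] for i in range(n)) for j in range(n)]
        w = [pred[j] * math.exp(h[j] * dy - 0.5 * h[j] ** 2 * dt) for j in range(n)]
        z = sum(w)
        pi = [wj / z for wj in w]
        u, acc = rng.random(), 0.0      # jump the chain with prob A[x][j] dt
        for j in range(n):
            if j != x:
                acc += A[x][j] * dt
                if u < acc:
                    x = j
                    break
    return x, pi

A = [[-1.0, 1.0], [2.0, -2.0]]   # assumed 2-state generator
h = [0.0, 4.0]                   # assumed observation drifts
x, pi = wonham_filter(A, h)
print(x, pi)
```

By construction the filtered vector π stays a probability distribution at every step; with well-separated drifts h it concentrates near the true state.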
(5.12)

Then r(t, x) = π_t(x) exp(−h(x) y_t), so we can apply Itô's formula and

(5.14)

Thus, π_t is the density of Π_t with respect to the unconditional distribution of X_t, ρ(·, t).
Example. Let X_t = X for all t for some random variable X. Suppose that h and λ have the form

processes

and
F(x, φ, ψ, t) = exp( a(x)φ + (a²(x)/2) ∫_0^t u²(s) ds + b(x)ψ + ∫_0^t ( b(x)v(s) − 1 ) ds )

Note that π_t determines all the conditional moments of X and that Φ and Ψ can be recursively updated using the equations
Example [Beneš, 1981]. Suppose the state process X is the solution to the stochastic differential equation

y_0 = 0   (5.15)

where

F(x) = ∫_0^x f(u) du,   −∞ < x < ∞
ds_t/dt = 1 − k²s_t²,   s_0 = 0,   μ_0 = x_0

The process ρ does not satisfy the nonlinear filtering equation (5.10). However, using direct computation and Eq. (5.15), one readily finds that
EXERCISES

1. Let Z = (Z_k: k ∈ ℤ_+) and θ = (θ_k: k ∈ ℤ_+) be random processes and let Y_k = σ(θ_0, ..., θ_k). Suppose that each Z_k takes values in S = {1, 2, ..., n} and that θ_k takes values in some finite set (not depending on k or ω). Suppose that for each possible value θ of θ_k there is an n × n matrix R(θ) such that

(i.e., (Z_{k+1}, θ_{k+1}) is conditionally independent of Y_k given Z_k, and the transition mechanisms are time homogeneous.) Define Π_k to be the row vector with ith entry P(Z_k = i | θ_0, ..., θ_k). Derive the recursive filtering equation,

where e is the column vector of all ones. (Note that the numerator on the right-hand side is an unnormalized version of Π_{k+1}.)
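The recursion Π_{k+1} = Π_k R(θ_{k+1}) / (Π_k R(θ_{k+1}) e) from this exercise can be run directly. In the sketch below the matrices R(0) and R(1) are a hypothetical two-state example (their entrywise sum is a stochastic matrix), and the recursion is cross-checked against brute-force summation over all state paths.

```python
# R(theta)[i][j] = P(Z_{k+1} = j, theta_{k+1} = theta | Z_k = i); the entrywise
# sum R(0) + R(1) must be a stochastic matrix.  Values here are hypothetical.
R = {
    0: [[0.56, 0.06], [0.09, 0.21]],
    1: [[0.14, 0.24], [0.21, 0.49]],
}

def filter_step(pi, theta):
    # Pi_{k+1} = Pi_k R(theta_{k+1}) / (Pi_k R(theta_{k+1}) e)
    M = R[theta]
    num = [sum(pi[i] * M[i][j] for i in range(len(pi))) for j in range(len(pi))]
    z = sum(num)
    return [v / z for v in num]

def brute_force(p0, thetas):
    # P(Z_k = i | theta_1,...,theta_k) by direct summation over state paths
    n = len(p0)
    paths = [[i] for i in range(n)]
    for _ in thetas:
        paths = [p + [j] for p in paths for j in range(n)]
    post = [0.0] * n
    for path in paths:
        w = p0[path[0]]
        for step, theta in enumerate(thetas):
            w *= R[theta][path[step]][path[step + 1]]
        post[path[-1]] += w
    z = sum(post)
    return [v / z for v in post]

p0 = [0.5, 0.5]
thetas = [1, 0, 0, 1, 1]
pi = p0
for th in thetas:
    pi = filter_step(pi, th)
bf = brute_force(p0, thetas)
print(pi, bf)
```

Both computations normalize the same matrix product p0 R(θ_1)···R(θ_k), so they agree to rounding error.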
2. Find E[(ξ̂_t − ξ_t)²] in terms of a and b for the example of Section 7.4 in the case that σ_0² = 0. Find its limiting value as t tends to infinity.

3. Let U be a positive random variable with density function f and distribution function F. Suppose that N is a counting process with intensity λ = a I_{[0,U)} + b I_{[U,+∞)} relative to the family of σ-algebras

σ(U) ∨ σ( N_s: s ≤ t ),   t ≥ 0
1. INTRODUCTION

We have used the term stochastic process to denote a collection of random variables indexed by a single real parameter. In other words, the parameter space is a subset of the real line and usually an interval. In most applications, this parameter is interpreted as time. There are many applications where it is more appropriate to consider collections of random variables indexed by points in a more general parameter space. For example, in problems involving propagation of electromagnetic waves through random media, the natural parameter space is a subset of R⁴, representing space and time. A similar example is the velocity field in turbulence theory. The term random field is often used to denote a collection of random variables with a parameter space which is a subset of Rⁿ. There are other possible parameter spaces. For example, the parameter space can be taken to be a function space of some kind. Such is the case with generalized processes. Alternatively, we can also take the parameter space to be a collection of subsets of Rⁿ. Such, for example,
280 RANDOM FIELDS
is the situation for random measures, which have already been made use of in connection with second-order stochastic integrals. Generally speaking, the kind of assumptions that we make concerning mutual dependence of the collection of random variables reflects something of the parameter space. For example, if the parameter space is an interval, we usually assume continuity in probability. If the parameter space is a linear topological space, we usually assume that the collection of random variables as a function of the parameter is both linear and continuous in probability. If the parameter space is a σ-algebra, then we usually assume that the collection of random variables is σ-additive, and so on.

Compared to the one-parameter case, relatively little is known concerning processes with a more general parameter space. Of course, a great deal of the results concerning stochastic processes with a one-dimensional parameter space do not depend on the fact that the parameter is one dimensional. These results are easily generalized to more general collections of random variables. Such generalizations require little elaboration. In this chapter, we shall focus our attention on problems of the following two kinds: (1) problems which arise only when the parameter space is more complex than one-dimensional, and (2) important properties of one-dimensional processes which are not easily extended, because they depend on the parameter space being one dimensional. As an example of category (1), we have the rich interplay between the probabilistic properties of a random field and the geometry of its parameter space. Although this interplay already appears in the one-dimensional case in the form of stationarity, the geometry of the real line is obviously both degenerate and rather trivial by comparison with the geometry of higher dimensions. As an example of category (2), consider Markov processes. The definition of a Markov process makes explicit use of the well-orderedness of the real line. It is difficult to see how it can be generalized to a multidimensional parameter space. The way that it is done is one of the most interesting problems that we shall discuss in this chapter.

To avoid confusion, we shall adopt the following terminology: A collection of random variables defined on a common probability space will be called a stochastic process or a random field according as its parameter space is one-dimensional or multidimensional. We should note that while this terminology is widely used, it is by no means universal. For example, a random field is often called a stochastic process with a several-dimensional time.
(2.2)

As in Chap. 3, we allow X_z to be complex valued. The most straightforward generalizations of wide-sense stationary processes are homogeneous random fields, defined as follows: We say that {X_z, z ∈ Rⁿ} is homogeneous if EX_z = μ does not depend on z and

(2.3)

for all z_0, z, z′ in Rⁿ. Setting z_0 = −z′ in (2.3), we see that the covariance function

E(X_z − EX_z)(X_{z′} − EX_{z′}) = R(z − z′)   (2.4)

depends only on z − z′. Of course, R(z − z′) is also nonnegative definite; i.e., for any finite number of points z_1, z_2, ..., z_N in Rⁿ and any collection of complex constants a_1, a_2, ..., a_N, we have

Σ_{i,j=1}^N a_i ā_j R(z_i − z_j) ≥ 0   (2.5)

where F is a finite Borel measure on Rⁿ, and (ν, z) denotes the inner product

(ν, z) = Σ_{i=1}^n ν_i z_i   (2.7)
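The spectral picture behind (2.5)–(2.7) can be illustrated with a field whose spectral measure is concentrated on finitely many frequencies (the frequencies and weights below are assumptions of this sketch). Such a field is homogeneous with covariance R(z − z′) = Σ_k F_k cos(2π(ν_k, z − z′)), which the code checks by Monte Carlo.

```python
import math, random

# Assumed discrete spectral measure: frequencies nu_k in R^2 with weights F_k.
freqs = [(0.5, 0.0), (0.3, 0.7), (1.1, 0.4)]
F = [1.0, 0.5, 0.25]

def sample_field(rng):
    # X(z) = sum_k sqrt(F_k) (xi_k cos(2 pi (nu_k, z)) + zeta_k sin(2 pi (nu_k, z)))
    # with xi_k, zeta_k iid N(0,1): a real homogeneous Gaussian field.
    coeffs = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in freqs]
    def X(z):
        s = 0.0
        for nu, Fk, (xi, zeta) in zip(freqs, F, coeffs):
            phase = 2.0 * math.pi * (nu[0] * z[0] + nu[1] * z[1])
            s += math.sqrt(Fk) * (xi * math.cos(phase) + zeta * math.sin(phase))
        return s
    return X

def R(dz):
    # covariance depends only on the lag: R(dz) = sum_k F_k cos(2 pi (nu_k, dz))
    return sum(Fk * math.cos(2.0 * math.pi * (nu[0] * dz[0] + nu[1] * dz[1]))
               for nu, Fk in zip(freqs, F))

rng = random.Random(0)
z1, z2 = (0.2, 0.9), (1.0, 0.3)
emp = 0.0
for _ in range(20000):
    X = sample_field(rng)
    emp += X(z1) * X(z2)
emp /= 20000
theo = R((z1[0] - z2[0], z1[1] - z2[1]))
print(emp, theo)
```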
(2.8)

∫_{Rⁿ} f(ν) X̂(dν)

can be defined for any f ∈ L²(F) as the q.m. limit of a sequence of random variables resulting from approximating f by a sequence {f_k}, where each f_k is a linear combination of indicator functions of Borel sets and

t_i(z) = Σ_{j=1}^n a_ij z_j   (2.10)

(2.13)

(2.14)
For any rotation t,

R(‖t(z)‖) = R(‖z‖) = ∫_{Rⁿ} e^{i2π(ν, t(z))} F(dν) = ∫_{Rⁿ} e^{i2π(ν, z)} F(dν)   (2.15)

(2.20)

The constant K is just the total area of S^{n−2} and can be absorbed into F_0. The inside integral in (2.20) can be evaluated to be

(2.21)
Proposition 2.2. A function R(r), 0 ≤ r < ∞, is the covariance function of an isotropic and homogeneous q.m. continuous random field if and only if

3. SPHERICAL HARMONICS AND ISOTROPIC RANDOM FIELDS

where ψ(θ, θ′) is just the angle between the two straight lines connecting the origin in R^{n+1} to θ and θ′. It is easy to verify that (3.1) defines a metric. Now, suppose that we consider the set of all transformations t: Sⁿ → Sⁿ such that

(3.3)
Δ(R^{n+1}) = ∂²/∂z_1² + ∂²/∂z_2² + ··· + ∂²/∂z_{n+1}²   (3.5)

(3.8)

Δ(Sⁿ) = (1/sin^{n−1} θ_n) ∂/∂θ_n ( sin^{n−1} θ_n ∂/∂θ_n ) + (1/sin² θ_n) Δ(S^{n−1})   (3.9)

From (3.8) it is easy to see that Δ(Sⁿ) commutes with each T_g, g ∈ G(Sⁿ), since r is left invariant by any g ∈ G(Sⁿ), and T_g commutes with Δ(R^{n+1}). It is now convenient to take as the domain of Δ(Sⁿ) the space C²(Sⁿ) of functions in C²(R^{n+1}) which do not depend on the radial distance r. Now, consider the eigenvalues and eigenfunctions of Δ(Sⁿ). An eigenvalue λ of Δ(Sⁿ) is any complex number such that the equation

(3.10)
(3.11)
(3.17)
F(θ, θ′) = Σ_{l=1}^{d_m} h_{ml}^{(n)}(θ) h_{ml}^{(n)}(θ′)   (3.24)

We shall show that the F in (3.24) satisfies both (3.19) and (3.21). For an arbitrary g ∈ G(Sⁿ), the span of {h_{ml}^{(n)}: l = 1, ..., d_m} is invariant under T_g, so that we can write

(T_g h_{ml}^{(n)})(θ) = h_{ml}^{(n)}(g(θ)) = Σ_{l′=1}^{d_m} a_{ll′}(g) h_{ml′}^{(n)}(θ)   (3.25)

Because

∫_{Sⁿ} h_{mk}^{(n)}(g(θ)) h_{ml}^{(n)}(g(θ)) dθ = ∫_{Sⁿ} h_{mk}^{(n)}(θ) h_{ml}^{(n)}(θ) dθ = δ_{kl}   (3.26)

Σ_{l′=1}^{d_m} a_{ll′}(g) a_{kl′}(g) = δ_{kl}   (3.27)

This means that A(g) = [a_{ij}(g)] is an orthogonal matrix, so that we also have

Σ_{l′=1}^{d_m} a_{l′l}(g) a_{l′k}(g) = δ_{lk}   (3.28)

It follows that

Σ_{l=1}^{d_m} h_{ml}^{(n)}(g(θ)) h_{ml}^{(n)}(g(θ′)) = Σ_{k′=1}^{d_m} Σ_{l′=1}^{d_m} h_{ml′}^{(n)}(θ) h_{mk′}^{(n)}(θ′) Σ_{l=1}^{d_m} a_{ll′}(g) a_{lk′}(g)
(3.32)

X(r, θ) = Σ_{m=0}^∞ Σ_{l=1}^{d_m} X_{ml}(r) h_{ml}^{(n−1)}(θ)   (3.37)

= ∫_{S^{n−1}×S^{n−1}} R[r, r′, cos ψ(θ, θ′)] h_{ml}^{(n−1)}(θ) h_{ml}^{(n−1)}(θ′) dθ dθ′   (3.39)

(3.40)

The bilinear forms (3.32) and (3.40) can now be used in (3.39) to yield

E X_{ml}(r) X_{m′l′}(r′) = δ_{mm′} δ_{ll′} R_m(r, r′)   (3.41)

This means that {X_{ml}(r)} is a countable family of orthogonal one-dimensional stochastic processes.

Suppose that {X_z, z ∈ Rⁿ} is not only isotropic, but also homogeneous; then we know from (2.8) that we can write
where X̂ is a random set function defined on the Borel sets of Rⁿ. Now, adopt a polar-coordinate system for both ν and z in (3.42) so that

ν = (λ, φ),   λ = 2π‖ν‖,   0 ≤ λ < ∞,   φ ∈ S^{n−1}
z = (r, θ),   0 ≤ r < ∞,   θ ∈ S^{n−1}
It is obvious that

Δ(Rⁿ) e^{i2π(ν, z)} = −λ² e^{i2π(ν, z)}

It follows that we must be able to write

e^{iλr cos ψ(θ, φ)} = Σ_{m=0}^∞ C_m^{(n−2)/2}(cos ψ(θ, φ)) f_m(λr)   (3.43)

where f_m satisfies

(1/r^{n−1}) (d/dr) ( r^{n−1} (d/dr) f_m(λr) ) − ( m(m + n − 2)/r² ) f_m(λr) = −λ² f_m(λr)   (3.44)

f_m(λr) = K_m J_{(n−2)/2+m}(λr) / (λr)^{(n−2)/2}   (3.45)
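For n = 2 the expansion (3.43)–(3.45) reduces to the classical Jacobi–Anger expansion e^{iz cos θ} = J_0(z) + 2 Σ_{m≥1} i^m J_m(z) cos(mθ), which can be checked numerically. The sketch below (an added illustration; the quadrature scheme and truncation level are our choices) evaluates the integer-order Bessel functions by their standard integral representation.

```python
import math, cmath

def bessel_j(m, z, steps=2000):
    # Integral representation for integer order:
    # J_m(z) = (1/pi) * int_0^pi cos(m t - z sin t) dt   (trapezoid rule)
    h = math.pi / steps
    s = 0.5 * (math.cos(0.0) + math.cos(m * math.pi - z * math.sin(math.pi)))
    for k in range(1, steps):
        t = k * h
        s += math.cos(m * t - z * math.sin(t))
    return s * h / math.pi

def plane_wave(z, theta, M=30):
    # e^{i z cos(theta)} = J_0(z) + 2 sum_{m>=1} i^m J_m(z) cos(m theta)
    total = complex(bessel_j(0, z))
    for m in range(1, M):
        total += 2.0 * (1j ** m) * bessel_j(m, z) * math.cos(m * theta)
    return total

z, theta = 3.0, 0.7
lhs = cmath.exp(1j * z * math.cos(theta))
rhs = plane_wave(z, theta)
print(abs(lhs - rhs))  # truncation + quadrature error, very small
```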
e^{iλr cos ψ(θ, φ)} = Σ_{m=0}^∞ K_m ( J_{(n−2)/2+m}(λr) / (λr)^{(n−2)/2} ) C_m^{(n−2)/2}(cos ψ(θ, φ))

= Σ_{m=0}^∞ Σ_{l=1}^{d_m} ( K_m / d_m ) ( J_{(n−2)/2+m}(λr) / (λr)^{(n−2)/2} ) h_{ml}^{(n−1)}(θ) h_{ml}^{(n−1)}(φ)   (3.46)

where {X_{ml}} is a family of random set functions defined on the Borel sets of [0, ∞) by the formula

X_{ml}(A) = ( K_m / d_m ) ∫_{A×S^{n−1}} h_{ml}^{(n−1)}(φ) X̂( (dλ/2π) × dφ )   (3.48)
(3.49)
X(r,6) = r r
..
m=OI=l
dm
X ml (r)h ml (n-l)(6) (4.1)
4. MARKOVIAN RANDOM FIELDS 293
(4.3)

(A_m f)(r) = (1/r^{n−1}) (d/dr) ( r^{n−1} df(r)/dr ) − ( m(m + n − 2)/r² ) f(r)   (4.6)

= [A_m R_m(r, ·)](r′)   (4.9)

so that for r > r′,

(4.10)

Since the two sides of (4.10) are functions of different variables, they must be equal to a constant, i.e.,

(A_m f_m)(r) = ν_m f_m(r),   r > 0   (4.11)

(A_m g_m)(r) = ν_m g_m(r),   r > 0   (4.12)
Now, consider

R(r) = E X(r, θ) X(0, ·) = E X_z X_0
     = Σ_{m,l} R_m(r, 0) h_{ml}^{(n−1)}(θ) h_{ml}^{(n−1)}(φ)

where the terms with m ≥ 1 vanish, because h_{01}^{(n−1)}(θ) = 1 and all other h_{ml}^{(n−1)} are orthogonal to it. Therefore,

R(r) = R_0(r, 0) = f_0(r) g_0(0)   (4.13)

It follows from (4.12) that R(·) must satisfy

R(r) = A J_{(n−2)/2}(λ_0 r) / (λ_0 r)^{(n−2)/2}   (4.15)

X_{ml}(r) = X_{ml} J_{(n−2)/2+m}(λ_0 r) / (λ_0 r)^{(n−2)/2}   (4.16)
for which R(‖z − z′‖) is positive definite, but R(0) = ∞. Equation (4.17)

(4.19)

also has the property that given X_z, the values X_{z′} and X_{z″} are independent whenever
5. MULTIPARAMETER MARTINGALES

In view of the important role that martingales have played in the development of a theory of filtering and detection, one is motivated to generalize the martingale concept to multiparameter processes. This can be done in a number of ways. One of the simplest and most natural is to make use of the

(increasing family)

a.s. (5.1)

E η(A) η(B) = area(A ∩ B)

where we have used the fact that η(A_s) is F_s measurable while η(A_t − A_s) (A_t − A_s being disjoint from A_s) is independent of F_s.
Denote the collection of all such random fields by H_1. Then H_1 is a Hilbert space with inner product

and call such φ's simple functions. For simple φ's, we have

(d) It can be shown in a way similar to that in Section 4.2 that every φ ∈ H_1 is the limit of a sequence of simple functions. Hence, φ · W is well defined for all φ ∈ H_1.

It is clear that φ · W is a straightforward generalization of the Itô integral (defined in Section 4.2) and its properties are similar. These include:

(a) linearity: (αφ + βφ′) · W = α(φ · W) + β(φ′ · W)
(b) isometry: E(φ · W)(φ′ · W) = (φ, φ′)
(c) martingale: E(φ · W | 𝒜_t) = φ I_{A_t} · W almost surely for every t ∈ T
    (i.e., E[ ∫_T φ_s dW_s | 𝒜_t ] = ∫_{s<t} φ_s dW_s)

Z = EZ + φ · W
which contradicts

as follows:

(a) Suppose that there exist rectangles Δ_1 and Δ_2 such that Δ_1 × Δ_2 ⊂ G and

and each ψ_k is of the form given in (a). If ψ is simple, we set

ψ · W₂ = Σ_{k=1}^m ψ_k · W₂

(c) It can be shown (Wong and Zakai, 1974) that simple ψ's are dense in H_2 with respect to the norm

‖ψ‖ = ( ∫_G E ψ_{ts}² dt ds )^{1/2}
∫_γ a(u) f(u, W_u)
M_t = ∫_{A_t} u(s) dW_s

and
and

d_{t_1}M d_{t_2}M = ∫_{s ∨ s′ = t} u(s ∧ s′) η(ds) η(ds′)
If a curve γ in ℝ² is represented by

then ∫_γ X is given by

X(γ) = ∫_γ X

∂ = ∂′ + ∂″

A stochastic differential 1-form X is a family of random variables parametrized by

where the a_i are real constants and the σ_i are oriented line segments, with the requirements that: (a) elements of Γ_1 equal under subdivision are not distinguished, and (b) a(−σ) = (−a)σ. A differential 1-form X is easily extended to Γ_1 by linearity, i.e.,
lim in probability X(ρ) = 0 as ‖ρ‖ → 0   (6.4)

With the continuity conditions (6.2)–(6.4), the 1- and 2-forms can now be further extended. For example, a sequence of approximating 1-chains can be constructed for any smooth curve γ by successively subdividing γ and constructing a staircase approximation using the subdivision. If the subdivisions are nested then the difference between two staircase approximations is the boundary of a 2-chain. Continuity (6.3) then allows a 1-form to be extended to γ. Similarly, (6.4) allows a 2-form to be extended to a two-dimensional set that can be approximated by 2-chains.
Before proceeding further, consider the following example. Let {η(A), A ∈ B_2} be a Gaussian collection of random variables parametrized by the collection B_2 of Borel sets in ℝ² such that Eη(A) = 0 and

E η(A) η(B) = area(A ∩ B)

The set-parameter process η will be called a Gaussian white noise. Now, for any oriented rectangle, set

when A_t is the rectangle (unoriented) bounded by the two axes and t. The sign is + if t is in the first or third quadrant and − otherwise.

A 1-form 𝔚 can be defined in terms of W as follows. Let ab denote an oriented horizontal or vertical line segment from point a to point b, and set

𝔚(ab) = W(b) − W(a)
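The Gaussian white noise η and the associated process W (a Brownian sheet on the first quadrant) are easy to simulate on a grid: fill each cell with an independent N(0, area) increment and take rectangle sums. The sketch below (grid size, lag points, and trial count are assumptions of this illustration) checks the covariance E W(s,t)W(s′,t′) = min(s,s′) min(t,t′) by Monte Carlo.

```python
import random

def brownian_sheet(n, h, seed):
    # W(ih, jh) = eta([0,ih] x [0,jh]) built from iid Gaussian cell increments
    # eta(cell) ~ N(0, area) with area = h^2, hence standard deviation h.
    rng = random.Random(seed)
    W = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            W[i][j] = (W[i - 1][j] + W[i][j - 1] - W[i - 1][j - 1]
                       + rng.gauss(0.0, h))
    return W

n, h, trials = 10, 0.1, 5000
acc = 0.0
for k in range(trials):
    W = brownian_sheet(n, h, seed=k)
    acc += W[4][8] * W[7][3]      # points (0.4, 0.8) and (0.7, 0.3)
emp = acc / trials
theo = min(4, 7) * h * min(8, 3) * h   # min(s,s') * min(t,t') = 0.4 * 0.3
print(emp, theo)
```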
dY(ab) = Y(b) − Y(a)

and for a 1-form γ,

𝔚 = dW

but the relationship between Z and 𝔚 (or W) remains obscure. To expose that relationship requires one more concept. Given a 1-form, say X, we can express it in terms of a "coordinate system" similar to (6.1) as follows. Define X_i (i = 1, 2) as 1-forms such that

ρ = the oriented rectangle with opposite corners a and b

Then,
6. STOCHASTIC DIFFERENTIAL FORMS 307
Now,

(6.6)

Now, the question is: What is the relationship among the martingale r-forms for different r? The following result, due in its original form to Cairoli and Walsh (1975), relates 0-form martingales to 1-form martingales.

if σ is vertical

where the integral is a type-1 integral introduced in Section 5. Now 𝔚 ∧ 𝔚 ≡ 0, but 𝔚_1 ∧ 𝔚_2 is given by the type-2 integral

t, t′ ∈ ℝ²

The pair (ξ_1, ξ_2) defines a differential 1-form X via the relationship

X(σ) = ∫_σ ξ_1t dt   if σ is horizontal
     = ∫_σ ξ_2t dt   if σ is vertical

dX + α ∧ X = η   (6.8)

α_1² + α_2² = 1
REFERENCES

Beneš, V. E. (1981): Exact finite-dimensional filters for certain diffusions with nonlinear drift, Stochastics, 5:65-92.
Birkhoff, G. and S. MacLane (1953): "A Survey of Modern Algebra," Macmillan, New York.
Breiman, L. (1968): "Probability," Addison-Wesley, Reading, Mass.
Bremaud, P. (1981): "Point Processes and Queues, Martingale Dynamics," Springer-
Verlag, New York.
Bucy, R. S. and P. D. Joseph (1968): "Filtering for Stochastic Processes with
Applications to Guidance," Interscience, Wiley, New York.
Cairoli, R. and J. B. Walsh (1975): Stochastic integrals in the plane, Acta Math.
134:111-183.
Cramer, H. (1966): On stochastic processes whose trajectories have no discontinuities
of the second kind, Ann. di Matematica (iv), 71:85-92.
Davenport, W. B., Jr. and W. L. Root (1958): "An Introduction to the Theory of
Random Signals and Noise," McGraw-Hill, New York.
Davis, M. H. A. (1980): On a multiplicative functional transformation arising in
nonlinear filtering theory, Z. Wahrscheinlichkeitstheorie verw. Geb., 54:125-139.
Dellacherie, C. and P. A. Meyer (1978): "Probabilities and Potential," North-Holland,
New York.
Dellacherie, C. and P. A. Meyer (1982): "Probabilities and Potential B, Theory of
Martingales," North-Holland, New York.
Doléans, C. (= Doléans-Dade, C.) (1967): Processus croissants naturels et processus
très-bien-mesurables, C.R. Acad. Sci. Paris, 264:874-876.
312 REFERENCES
Kalman, R. E. and R. S. Bucy (1961): New results in linear filtering and prediction
theory, Trans. Am. Soc. Mech. Engn. Series D, J. Basic Eng., 83:95-108.
Karhunen, K. (1947): Über lineare Methoden in der Wahrscheinlichkeitsrechnung, Ann.
Acad. Sci. Fenn., 37.
Kohlmann, M. and W. Vogel (Eds.) (1979): "Stochastic control theory and stochastic
differential systems, Proceedings of a Workshop of the 'Sonderforschungsbereich 72
der Deutschen Forschungsgemeinschaft an der Universität Bonn,' which took place
in January 1979 at Bad Honnef," Lecture Notes in Control and Information
Sciences, 16, Springer-Verlag, New York.
Kolmogorov, A. N. (1931): Über die analytischen Methoden in der Wahrscheinlichkeits-
rechnung, Math. Ann., 104:415-458.
Kunita, H. and S. Watanabe (1967): On square integrable martingales, Nagoya Math.
J.,30:209-245.
Kushner, H. J. (1967): Dynamical equations for optimal nonlinear filtering, J. Diff.
Equat., 3:179-190.
Lévy, P. (1956): A special problem of Brownian motion, and a general theory of
Gaussian random functions, Proc. 3rd Berkeley Symp. Math. Stat. and Prob.,
2:133-175.
Liptser, R. S. and A. N. Shiryayev (1977): "Statistics of Random Processes, I and II,"
Springer-Verlag, New York.
Loève, M. (1963): "Probability Theory," 3d ed., Van Nostrand, Princeton, N.J.
McKean, H. P., Jr. (1960): The Bessel motion and a singular integral equation, Mem.
Coll. Sci. Univ. Kyoto, Series A, 33:317-322.
McKean, H. P., Jr. (1963): Brownian motion with a several dimensional time, Theory
of Prob. and Appl., 8:335-365.
McKean, H. P., Jr. (1969): "Stochastic Integrals," Academic, New York.
McShane, E. J. (1969): Toward a stochastic calculus, II, Proc. National Academy of
Sciences, 63:1084-1087.
McShane, E. J. (1970): Stochastic differential equations and models of random
processes, Proc. 6th Berkeley Symp. Math. Stat. and Prob., 3:263-294.
Meyer, P. A. (1966): "Probability and Potentials," Blaisdell, Waltham, Mass.
Mortensen, R. E. (1966): "Optimal Control of Continuous Time Stochastic Systems,"
Ph.D. dissertation, Dept. of Electrical Engineering, University of California, Berke-
ley.
Neveu, J. (1965): "Mathematical Foundations of the Calculus of Probability," Amiel
Feinstein (trans.), Holden-Day, San Francisco.
Paley, R. E. A. C. and N. Wiener (1934): "Fourier Transforms in the Complex
Domain," Amer. Math. Soc. Coll. Pub., Am. Math. Soc., 19.
Prokhorov, Yu. V. (1956): Convergence of random processes and limit theorems in
probability theory, Theory of Prob. and Appl., 1:157-214.
Rao, K. M. (1969): On decomposition theorems of Meyer, Math. Scand., 24:66-78.
Riesz, F. and B. Sz.-Nagy (1955): "Functional Analysis," Ungar, New York.
Root, W. L. (1962): Singular Gaussian measures in detection theory, Proc. Symp.
Time Series Analysis, Brown University, 1962, Wiley, New York, 1963, pp. 292-316.
Rudin, Walter (1966): "Real and Complex Analysis," McGraw-Hill, New York.
Skorokhod, A. V. (1965): "Studies in the Theory of Random Processes," (trans. from
Russian), Addison-Wesley, Reading, Mass.
Slepian, D. (1958): Some comments on the detection of Gaussian signals in Gaussian
noise, IRE Trans. Inf. Th., 4:65-68.
Sosulin, Yu. and R. L. Stratonovich (1965): Optimum detection of a diffusion process
in white noise, Radio Engrg. Electron. Phys., 10:704-714.
Stratonovich, R. L. (1960): Conditional Markov processes, Theory Prob. Appl.,
5:156-178.
Stratonovich, R. L. (1966): A new form of representation of stochastic integrals and
equations, SIAM J. Control, 4:362-371.
Taylor, A. E. (1961): "Introduction to Functional Analysis," Wiley, New York.
Thomasian, A. J. (1969): "The Structure of Probability Theory with Applications,"
McGraw-Hill, New York.
Van Schuppen, J. H. and E. Wong (1974): Transformation of local martingales under
a change of law, Annals of Prob., 2:878-888.
Whitney, H. (1957): "Geometric Integration Theory," Princeton University Press,
Princeton, N.J.
Whittle, P. (1963): Stochastic processes in several dimensions, Bull. Inst. Int. Statist.,
40:974-994.
Wiener, N. (1949): "Extrapolation, Interpolation, and Smoothing of Stationary Time
Series," Wiley, New York.
Wiener, N. and P. Masani (1958): The prediction theory of multivariate stochastic
processes-II, the linear predictor, Acta Mathematica, 99:93-137.
Wong, E. (1964): The construction of a class of stationary Markoff processes, Proc.
Symp. in Appl. Math., Am. Math. Soc., 16:264-276.
Wong, E. (1969): Homogeneous Gauss-Markov random fields, Ann. Math. Stat.,
40:1625-1634.
Wong, E. and J. B. Thomas (1961): On the multidimensional prediction and filtering
problem and the factorization of spectral matrices, J. Franklin Institute, 272:87-99.
Wong, E. and M. Zakai (1965a): On the relationship between ordinary and stochastic
differential equations, Int. J. Engng. Sci., 3:213-229.
Wong, E. and M. Zakai (1965b): On the convergence of ordinary integrals to stochas-
tic integrals, Ann. Math. Stat., 36:1560-1564.
Wong, E. and M. Zakai (1966): On the relationship between ordinary and stochastic
differential equations and applications to stochastic problems in control theory,
Proc. 3rd IFAC Congress, paper 3B.
Wong, E. and M. Zakai (1969): Riemann-Stieltjes approximations of stochastic in-
tegrals, Z. Wahrscheinlichkeitstheorie verw. Geb., 12:87-97.
Wong, E. and M. Zakai (1974): Martingales and stochastic integrals for processes with
a multi-dimensional parameter, Z. Wahrscheinlichkeitstheorie verw. Geb., 29:
109-122.
CHAPTER 1
1. (a) We need only note that [0,a)ᶜ ∩ [0, a + 1) = [a, a + 1) ∈ C₁.
(b) Let A = ⋃_{i=1}^m [a_i, b_i) and B = ⋃_{i=m+1}^{m+n} [a_i, b_i)
Then
A ∪ B = ⋃_{i=1}^{m+n} [a_i, b_i)  and  A ∩ B = ⋃_{i=1}^m ⋃_{j=m+1}^{m+n} [a_i, b_i) ∩ [a_j, b_j)
But [a_i, b_i) ∩ [a_j, b_j) is either empty or of the form [max(a_i, a_j), min(b_i, b_j)). Hence,
A ∩ B is again a finite union of intervals of the form [a,b).
(c) Let C be any Boolean algebra containing C₁. Because [a,b) = [0,b) ∩ [0,a)ᶜ,
C must contain all sets of the form [a,b) and, hence, all finite unions of such sets. Hence,
every Boolean algebra containing C₁ must also contain C₂. Because C₂ is a Boolean
algebra, it must also be the smallest.
(a,b) = ⋃_{n=1}^∞ [a + 1/n, b)
(b) If Φ is σ-additive, then it is also sequentially continuous, so that from Solution
1.1d, we have
and
f(x) = j_k,  x ∈ A_k,  k = 1, …, m
∫_{ℝⁿ} f(x) P(dx) = Σ_{k=1}^m j_k P(A_k) = ∫_Ω f(X(ω)) 𝒫(dω)
∫_{ℝⁿ} f(x) P(dx) = lim_{m→∞} ∫_{ℝⁿ} f_m(x) P(dx)
318 SOLUTIONS TO EXERCISES
= ∫_Ω f(X(ω)) 𝒫(dω)
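As a quick numerical illustration of this identity, the two ways of computing E f(X) — summing j_k P(A_k) over the partition versus averaging f(X(ω)) over samples — agree. The partition, the values j_k, and the sample size below are invented for the sketch:

```python
import numpy as np

# Hypothetical simple function: f = 2 on A1 = [0, 0.3), f = 5 on A2 = [0.3, 1),
# with X uniform on [0, 1).
rng = np.random.default_rng(0)
x = rng.random(200_000)                  # samples X(omega)
f_of_x = np.where(x < 0.3, 2.0, 5.0)     # f(X(omega))

exact = 2.0 * 0.3 + 5.0 * 0.7            # sum_k j_k P(A_k)
monte_carlo = f_of_x.mean()              # approximates the integral of f(X(omega)) dP
```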
6. Let X₁ = Y cos Θ, X₂ = Y sin Θ cos Φ, and X₃ = Y sin Θ sin Φ. If we denote the
joint density function of Y, Θ, and Φ by p, then
7. Since Y_k = Σ_{j=1}^k X_j, we have X₁ = Y₁ and
Y_k − Y_{k−1} = X_k,  k = 2, …, n
Therefore,
p_Y(y₁, …, y_n) = |det [ 1 0 ⋯ 0; −1 1 ⋯ 0; ⋱; 0 ⋯ −1 1 ]| p_X(y₁, y₂ − y₁, …, y_n − y_{n−1})
= (2π)^{−n/2} exp[−½ Σ_{k=1}^n (y_k − y_{k−1})²]
where y₀ ≡ 0.
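The Jacobian of the change of variables from the X's to the partial sums is lower bidiagonal with unit diagonal, so its determinant is 1. A small numerical sketch (the dimension n = 5 is arbitrary):

```python
import numpy as np

n = 5
C = np.tril(np.ones((n, n)))        # partial-sum map: y = C x
D = np.eye(n) - np.eye(n, k=-1)     # inverse map x_k = y_k - y_{k-1}

det_D = np.linalg.det(D)            # |Jacobian| of the change of variables = 1
inv_ok = np.allclose(C @ D, np.eye(n))
```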
E[|X|/(1 + |X|)] = ∫_{|X|≥ε} [|X|/(1 + |X|)] d𝒫 + ∫_{|X|<ε} [|X|/(1 + |X|)] d𝒫
≥ [ε/(1 + ε)] 𝒫(|X| ≥ ε)
This implies that
𝒫(|X| ≥ ε) ≤ [(1 + ε)/ε] E[|X|/(1 + |X|)]
and
so that
15. Let X₁ = Y cos Φ. The joint density function of Y and Φ is given by
p(y,φ) = |det [ cos φ  −y sin φ; sin φ  y cos φ ]| (1/2π) e^{−½(y² cos²φ + y² sin²φ)}
= y (1/2π) e^{−y²/2}
∫₀^{2π} cos φ dφ = 0
16. By definition 𝒜_X contains every set of the form {ω: X_i(ω) ∈ A}, A ∈ ℬ¹, i = 1,
…, n. It follows that if A₁, …, A_n are one-dimensional Borel sets, then
X⁻¹(∏_{i=1}^n A_i) = ⋂_{i=1}^n {ω: X_i(ω) ∈ A_i} ∈ 𝒜_X
Since ℬⁿ is the smallest σ-algebra containing all n-fold products of one-dimensional Borel
sets, it follows that for every B ∈ ℬⁿ,
X⁻¹(B) ∈ 𝒜_X  so that  𝒜_X ⊃ {X⁻¹(B), B ∈ ℬⁿ}
Conversely, consider the collection {X⁻¹(B), B ∈ ℬⁿ}. It is a σ-algebra, and
every X_i is clearly measurable with respect to {X⁻¹(B), B ∈ ℬⁿ}. Hence, 𝒜_X ⊂
{X⁻¹(B), B ∈ ℬⁿ}, and our assertion is proved.
CHAPTER 2
1. For any real number a,
𝒫(|X_t − X_s| > δ) ≥ 𝒫(X_t > a + δ, X_s < a − δ)
= [1 − P_t(a + δ)] P_s(a − δ)
→_{s→t} [1 − P_t(a + δ)] P_t(a − δ)
2. (a) 𝒫(X + t = 0 for some t ∈ T) = Σ_{t∈T} 𝒫(X = −t) = 0
(b) 𝒫(X + t = 0 for at least one t in [0,1)) = 𝒫(X ∈ [−1,0)) = ∫_{−1}^0 (1/√(2π)) e^{−x²/2} dx > 0
3. P_t(x) = 𝒫({ω: X_t(ω) < x}) = 𝒫({ω: tω < x})
= Lebesgue measure of [0, x/t) ∩ [0,1)
= min(1, x/t)
P_{t₁,t₂}(x₁,x₂) = 𝒫({ω: ω < x₁/t₁, ω < x₂/t₂})
or
p(z cos θ, z sin θ) = (1/2π) e^{−z²/2}
and
p(a,b)
;=1
L (XiX" = (.L (Xi cos 27rti) A - (L
;=1 ;=1
(Xi sin 27rti) B
8. Let ℬ_{X_s} denote the smallest algebra (not σ-algebra) such that for every τ ≤ s, all
sets of the form {ω: X_τ(ω) < a} are in ℬ_{X_s}. It is clear that 𝒜_{X_s} is generated by ℬ_{X_s}.
Now, every set A in ℬ_{X_s} depends on only a finite collection X_{t₁}, X_{t₂}, …, X_{t_n}, t_i ≤ s.
Therefore, for every A ∈ ℬ_{X_s},
E 1_A X_t = E{E[1_A X_t | X_{t₁}, X_{t₂}, …, X_{t_n}, X_s]}
Writing X_t = X_t⁺ − X_t⁻, where both X_t⁺ and X_t⁻ are nonnegative, we have
E 1_A X_t⁺ = E 1_A X_s⁺,  A ∈ ℬ_{X_s}
E 1_A X_t⁻ = E 1_A X_s⁻,  A ∈ ℬ_{X_s}
Each of these four terms defines a finite measure on ℬ_{X_s} which has a unique extension
to 𝒜_{X_s}. It follows that
E 1_A X_t⁺ = E 1_A X_s⁺,  A ∈ 𝒜_{X_s}
for otherwise, we would have two different extensions of the same measure. Similarly,
E 1_A X_t⁻ = E 1_A X_s⁻
and
Hence, we can take f(t) = ke^{−t}, where k is any nonzero constant. Thus,
X_t = k e^{−t} W((1/k²) e^{2t})
12. (a) Since {X_t, −∞ < t < ∞} is Markov, the Chapman-Kolmogorov equation
yields
𝒫(X_{t+s} = x_i | X₀ = x_j) = Σ_{k=1}^n 𝒫(X_{t+s} = x_i | X_s = x_k) 𝒫(X_s = x_k | X₀ = x_j)
or, equivalently,
P_{ij}(t + s) = Σ_{k=1}^n P_{ik}(t) P_{kj}(s),  t, s > 0
so that
i-I
If q = [t]. then we have p(r)l = I and pT(r)1 = I, and p(r) must have the form
p(r) = L~1(r) 1 - !(r)]
!(r)
must be less than or equal to zero. Setting j(O) = - X, we have .from part (a)
14. Because {X_t, −∞ < t < ∞} is stationary, its covariance function depends only
on the time difference. Set
ρ(t + s) = ρ(t) ρ(s),  t, s ≥ 0
ρ(t) = e^{t ln ρ(1)},  t ≥ 0
and by symmetry
ρ(t) = e^{|t| ln ρ(1)}
= e^{−λ|t|}
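The semigroup identity P(t + s) = P(t)P(s) from the Chapman-Kolmogorov equation is easy to verify numerically for a finite-state chain with P(t) = e^{tQ}; the two-state generator below is invented for the sketch:

```python
import numpy as np

def expm(A, terms=40):
    # Matrix exponential by truncated power series; adequate for small matrices.
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

# Hypothetical two-state generator; the rates 1.5 and 0.7 are arbitrary.
Q = np.array([[-1.5, 1.5],
              [0.7, -0.7]])
t, s = 0.8, 1.3
P_sum = expm((t + s) * Q)              # P(t + s)
P_prod = expm(t * Q) @ expm(s * Q)     # P(t) P(s)
row_sums = P_sum.sum(axis=1)           # each row of a transition matrix sums to 1
```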
= E{(1/2π) ∫₀^{2π} exp[iA Σ_{k=1}^n u_k cos(2πt_k + ψ)] dψ}
= E exp(i Σ_{k=1}^n u_k X_{t_k})
Hence, by the transformation rule for random variables (see, e.g., Exercise 1.5), we
have
p_{YZ}(r cos θ, r sin θ) |det [ cos θ  −r sin θ; sin θ  r cos θ ]|
= (r/2πσ²) exp(−r²/2σ²)
and
p_A(r) = (r/σ²) exp(−½ r²/σ²),  r ≥ 0
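The amplitude density just derived is the Rayleigh law, easy to confirm by sampling two independent Gaussians; the sample size and seed below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, n = 1.0, 300_000
y = rng.normal(0.0, sigma, n)
z = rng.normal(0.0, sigma, n)
a = np.hypot(y, z)                     # amplitude sqrt(Y^2 + Z^2)

# Rayleigh(sigma): mean sigma*sqrt(pi/2), P(A <= r) = 1 - exp(-r^2 / (2 sigma^2))
mean_err = abs(a.mean() - sigma * np.sqrt(np.pi / 2))
cdf_err = abs((a <= 1.0).mean() - (1.0 - np.exp(-0.5)))
```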
E(Z_n | data) ≤ √(E(Z_n² | data))
= 0
On the other hand, the Schwarz inequality applied to summations yields
so that
√(x² + y²) ≥ |xx₀ + yy₀|/√(x₀² + y₀²) ≥ (xx₀ + yy₀)/√(x₀² + y₀²)
= √(x₀² + y₀²) + [x₀/√(x₀² + y₀²)](x − x₀) + [y₀/√(x₀² + y₀²)](y − y₀)
It follows that
v'X;-;- + Y.2
=
and
= ∫₀^∞ dr ∫₀^{2π} dθ [r f(r)/(2π(t_n − t_{n−1}))] exp{−[r² + r₀² − 2rr₀ cos(θ − θ₀)]/[2(t_n − t_{n−1})]}
Because cos θ is periodic with period 2π, a change in variable yields
and
so that {|Z_t|, t ≥ 0} is Markov (see Proposition 4.6.4 for an easy means of proving
this fact).
CHAPTER 3
1. (a) Compute
S(ν) = ∫_{−1}^1 (1 − |τ|) cos 2πντ dτ
= 2 ∫₀^1 (1 − τ) cos 2πντ dτ
= 2(1 − cos 2πν)/(2πν)²
= [sin πν/(πν)]² ≥ 0
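That this spectral density is nonnegative can be confirmed by direct numerical integration of the triangular correlation; the grid resolution and test frequencies below are arbitrary choices:

```python
import numpy as np

def trapezoid(y, h):
    # composite trapezoidal rule with uniform spacing h
    return h * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])

tau = np.linspace(-1.0, 1.0, 20001)
h = tau[1] - tau[0]
tri = 1.0 - np.abs(tau)                  # the triangular correlation 1 - |tau|
nus = np.array([0.25, 0.5, 1.3, 2.7])    # arbitrary test frequencies

S_num = np.array([trapezoid(tri * np.cos(2 * np.pi * nu * tau), h) for nu in nus])
S_exact = np.sinc(nus) ** 2              # np.sinc(x) = sin(pi x)/(pi x)
```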
E Z_m Z̄_n = (1/T²) ∫₀^T ∫₀^T R(t − s) e^{−i(2π/T)(mt − ns)} dt ds
= Σ_{k=−∞}^∞ R_k δ_{mk} δ_{nk} = R_n δ_{mn}
(b) E|X_t − Σ_{n=−N}^N Z_n e^{in(2π/T)t}|² = R(0) − Σ_{n=−N}^N E|Z_n|²
= R(0) − Σ_{n=−N}^N R_n →_{N→∞} 0
where R_n = 1/[2(1 + n²)], n ≠ 0, and R₀ = 1. Since the family {e^{in(2π/T)t}, 0 ≤ t ≤ T,
n = 0, ±1, …} is orthogonal, we can clearly take φ_n(t) = (1/√T) e^{in(2π/T)t} to be
the orthonormal eigenfunctions. The eigenvalues are λ_n = R_n T.
4. First, we write
as was suggested by the hint. The second of the above equations yields the boundary
condition −ψ′(0) = ψ′(T). The first and second equations yield ψ(0) + ψ(T) = λψ′(0).
From the equation λψ″(t) = −2ψ(t), we get
ψ(t) = C cos √(2/λ) (t − T/2)
_ F- cos l V2/A _~
t V 2/A = = cot l V 2/A
sini
4
vI27X 4
W_{τ(t)} = Σ_{n=0}^∞ √(λ_n) φ_n(τ(t)) Z_n,  0 ≤ t ≤ T
where λ_n and φ_n are given by (4.32) and (4.33), respectively, and {Z_n} are Gaussian and
orthonormal. Hence, the desired expansion is obtained by setting
+ sin (1 - 211"/I)r) dr
= ⅛ [3/(1 + (1 + 2πν)²) + 3/(1 + (1 − 2πν)²) + (1 + 2πν)/(1 + (1 + 2πν)²) + (1 − 2πν)/(1 + (1 − 2πν)²)]
= ⅛ [16 + 4(2πν)²]/[4 + (2πν)⁴] = ½ [4 + (2πν)²]/[4 + (2πν)⁴]
Therefore,
H(ν) = (1/S(ν)) ∫_{−∞}^∞ e^{−i2πντ} ρ(τ) dτ
= (1/S(ν)) · 1/[4 + (2πν)⁴]
H(ν) = H(−ν)
To prove this, first assume H ∈ L²(−∞, ∞); then H is the Fourier transform of a
=
1
It follows that
= 4 ∫₀^∞ [sin² πν(t − s)]/[1 + (2πν)²] dν
(c) Z_t = ∫_{−∞}^∞ (cos 2πν₀t − i sgn ν sin 2πν₀t) e^{i2πνt} dX_ν
= ∫_{−∞}^∞ e^{−i sgn ν (2πν₀)t + i2πνt} dX_ν
14. (a) Check the three cases min(t,s) > 0, max(t,s) < 0, and max(t,s) > 0 >
min(t,s).
= ~ (I t + ~ / + /s + ~ /- It - sl + It I + lsi - It - sl - It I
- 18
I
+ -I
1 I + II t
n
- 8 -
1/
-
n
-
It + -1 I -
I n
lsi + Ii.t + -1 - 8 II')
It
= ½ (|t − s − 1/n| + |t − s + 1/n| − 2|t − s|)
= 1/n − |t − s|,  0 ≤ |t − s| ≤ 1/n
= 0,  elsewhere
= ∫_{−1/n}^{1/n} n(1 − n|τ|) e^{−i2πντ} dτ = 2n ∫₀^{1/n} (1 − nτ) cos 2πντ dτ
= (2n²/2πν) ∫₀^{1/n} sin 2πντ dτ = [2n²/(2πν)²] (1 − cos(2πν/n))
= ∫_{−∞}^∞ f̂(ν) ĝ(ν) [sin(πν/n)/(πν/n)]² dν
→_{n→∞} ∫_{−∞}^∞ f̂(ν) ĝ(ν) dν   (dominated convergence)
= ∫_{−∞}^∞ f(t) g(t) dt
→ 0 as n → ∞, uniformly in t
It follows that
∫_a^b f(t) Ż_{nt} dt + ∫_a^b ḟ(t) Z_{nt} dt = f(b) Z_{nb} − f(a) Z_{na}
and
Y, = U, = -Ut + Ya + Z, - Za
cator function of a half-open interval, call f a step function, and set X(f) =
Σ_{i=1}^n α_i X(f_i).
EX(f)X(g) -
a
It follows that
(b) EXU)X(g) =
a
X(η) = Σ_{i=1}^n α_i X_{t_i}
where η is a step function. Every Y ∈ ℋ_X is the q.m. limit of a sequence of finite sums.
Since X(η_n) converges in q.m. if and only if {η_n} is an L²-convergent sequence, the result
is proved.
E(A − Â_t)X_s = 0
and find
∫₀^s h(t,τ) dτ = s/(1 + t)
h(t,s) = 1/(1 + t),  0 ≤ s ≤ t
E(Y − Ŷ_t)X_s = 0,  s ≤ t
and find
∫₀^t h(t,τ) (∂/∂τ)(E X_τ X_s) dτ + E X₀ X_s = cos 2πWs
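The conditions E(A − Â)X_s = 0 are the orthogonality (normal) equations of linear least-squares estimation. In a finite-dimensional sketch with synthetic data, the least-squares solution satisfies them exactly (all data below are invented):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 5000, 3
X = rng.normal(size=(n, k))                               # "observations" X_s
A = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)   # quantity to estimate

h, *_ = np.linalg.lstsq(X, A, rcond=None)   # coefficients from the normal equations
resid = A - X @ h                           # A - A_hat

ortho = X.T @ resid / n                     # sample analogue of E(A - A_hat) X_s
```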
- (21r W)' cos 21r W 8 [ 21r W lot h(t,T) sin 21r WT dTJ
= vo'e- v,' - 1'0 Ja(t h(t,T) sgn (T - s)e-Vo!T-.! ciT
ah(t,s)
+ 21'0-----;;;-
or
21'0 ah~~8) + {[v o' + (21rW)'](21rW) 101 h(t,T) sin 21rWT ciT} cos 21rlFs = 0
f(t) = −(1/2ν₀)[ν₀² + (2πW)²] ∫₀^t h(t,τ) sin 2πWτ dτ
which yields
f(t) {1 + (1/2ν₀)[ν₀² + (2πW)²] ∫₀^t sin² 2πWτ dτ}
+ {(1/2ν₀)[ν₀² + (2πW)²] ∫₀^t sin 2πWτ dτ} g(t) = 0   (**)
Equations (**) and (***) suffice to determine f(t) and g(t), which in turn determine
h(t,s) completely.
(b) We begin by noting that ∫_{−∞}^∞ e^{−ν₀|τ|} e^{i2πντ} dτ = 2ν₀/(|2πν|² + ν₀²), so that {N_t,
−∞ < t < ∞} can be viewed as white noise filtered by the transfer function √(2ν₀)/
(ν₀ + i2πν), or
Ṅ_t + ν₀ N_t = √(2ν₀) ζ_t
where ζ_t is a standard white-noise process. It follows that
Ẋ_t = (−2πW sin 2πWt) Y + Ṅ_t
= (−2πW sin 2πWt) Y − ν₀(X_t − cos 2πWt · Y) + √(2ν₀) ζ_t
Continuing, we get
. H2(t)
~(t) = - - - ~2(t)
B2(t)
'li(t) H2(t)
u(t) = B2(t) 1;(t)
we then get
d H2(t)
u(t) = li(t) - - -
dt B2(t)
S(ν) = L / {[(i2πν) + 1 + i][(i2πν) + 1 − i][(i2πν) − 1 + i][(i2πν) − 1 − i]}
It is clear that an ĥ satisfying (9.10) can be taken to be
ĥ(ν) = 1/{[(i2πν) + 1 + i][(i2πν) + 1 − i]}
corresponding to
= e^{−a} [cos a + sin a (2πiν + 1)] / [(2πiν + 1 − i)(2πiν + 1 + i)]
From (9.29) we get
Therefore,
Since g(·) is the inverse transform of [Ŝ(ν)/ĥ(ν)] and only g(t) for t ≥ 0 contributes
H(II)
Ẏ_t = F Y_t + [0; 1] ζ_t
X_t = H Y_t + ζ_t
Hence,
Ŷ̇_t = F Ŷ_t + [1; 0] (X_t − H Ŷ_t)
X_t = [0 1] [Y_{1t}; Y_{2t}] + [0 1] [η_t; ζ_t]
CHAPTER 4
1. Without loss of generality we can assume a = 0 and b = 1 and set
W_t^{(n)} = W_{k/n},  k/n ≤ t < (k + 1)/n,  k = 0, 1, …, n − 1
Because Eφ_tφ_s is clearly continuous on [0,1]², we have
Σ_{k=0}^{n−1} φ_{k/n} (W_{(k+1)/n} − W_{k/n}) →_{n→∞} ∫₀^1 φ_t dW_t
φ₁W₁ − φ₀W₀ − ∫₀^1 φ̇_t W_t^{(n)} dt = φ₁W₁ − φ₀W₀ − Σ_{k=0}^{n−1} W_{k/n} ∫_{k/n}^{(k+1)/n} φ̇_t dt
= Σ_{k=1}^n φ_{k/n} (W_{k/n} − W_{(k−1)/n})
= Σ_{k=0}^{n−1} φ_{k/n} (W_{(k+1)/n} − W_{k/n}) + Σ_{k=1}^n (φ_{k/n} − φ_{(k−1)/n})(W_{k/n} − W_{(k−1)/n})
The first term goes to ∫₀^1 φ_t dW_t as n → ∞, while
|Σ_{k=1}^n (φ_{k/n} − φ_{(k−1)/n})(W_{k/n} − W_{(k−1)/n})|
≤ [Σ_{k=1}^n (φ_{k/n} − φ_{(k−1)/n})²]^{1/2} [Σ_{k=1}^n (W_{k/n} − W_{(k−1)/n})²]^{1/2} →_{n→∞} 0  a.s.
Since ∫₀^1 φ̇_t W_t^{(n)} dt →^{a.s.}_{n→∞} ∫₀^1 φ̇_t W_t dt, we have
2. Set X_t = ln Z_t; then
dX_t = (1/Z_t) dZ_t − (1/2Z_t²) Z_t² φ_t² dt = φ_t dW_t − ½ φ_t² dt
Since X₀ = ln Z₀ = 0, we have
dY_t = (1/X_t) dX_t − (1/2X_t²) σ²(t) X_t² dt
= m(t) dt + σ(t) dW_t − (σ²(t)/2) dt
Since Y₀ = 0, we have
Y_t = ∫₀^t σ(s) dW_s + ∫₀^t [m(s) − σ²(s)/2] ds
or
X_t = X₀ e^{Y_t} = exp{∫₀^t σ(s) dW_s + ∫₀^t [m(s) − σ²(s)/2] ds}
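With constant m and σ the displayed solution is a geometric Brownian motion, and the property E X_t = X₀ e^{mt} gives a quick Monte Carlo check (parameter values and seed below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
m, sigma, t, x0 = 0.1, 0.4, 1.0, 1.0      # arbitrary constant coefficients
n = 400_000
W = rng.normal(0.0, np.sqrt(t), n)        # samples of W_t

# X_t = x0 exp(sigma W_t + (m - sigma^2/2) t): the solution with constant m, sigma
X = x0 * np.exp(sigma * W + (m - 0.5 * sigma**2) * t)
mean_err = abs(X.mean() - x0 * np.exp(m * t))   # E X_t = x0 e^{m t}
```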
4. We first write
(d/dt) f(t) − 3K(1 + t) f(t) ≤ 1 + 3EX₀²
or
(d/dt) [e^{−3K(t + t²/2)} f(t)] ≤ (1 + 3EX₀²) e^{−3K(t + t²/2)}
and
φ(W_t) − φ(0) = ∫₀^t φ′(W_s) dW_s + ½ ∫₀^t φ″(W_s) ds
Therefore, X_t = φ(0) + ∫₀^t φ′(W_s) dW_s. Since φ″ is bounded (say |φ″| ≤ B), we have
|φ′(x)| ≤ |φ′(0)| + B|x|. Hence, ∫₀^t E|φ′(W_s)|² ds < ∞, and ∫₀^t φ′(W_s) dW_s is a
martingale and so is X_t.
7. First, let Z_t = Σ_{k=1}^n W_{kt}². Then,
Z_t − Z_s = Σ_{k=1}^n (W_{kt} − W_{ks})² + 2 Σ_{k=1}^n W_{ks}(W_{kt} − W_{ks})
Therefore,
E[(Z_t − Z_s) | W_{kτ}, τ ≤ s, k = 1, …, n] = Σ_{k=1}^n E(W_{kt} − W_{ks})² = n(t − s)
and
E[(Z_t − Z_s)² | W_{kτ}, τ ≤ s, k = 1, …, n]
= Σ_{k=1}^n E(W_{kt} − W_{ks})⁴ + Σ_{k≠l} E(W_{kt} − W_{ks})² E(W_{lt} − W_{ls})²
+ 4 Σ_{k=1}^n W_{ks}² E(W_{kt} − W_{ks})² + 4 Σ_{k≠l} W_{ks} W_{ls} E[(W_{kt} − W_{ks})(W_{lt} − W_{ls})]
= 3n(t − s)² + n(n − 1)(t − s)² + 4(t − s) Z_s
It follows that
= (~2 - ~) ~ dt
2 X,
+ dW,.
We note that neither the stochastic differential equation for Z nor the one for X
satisfies the conditions of Proposition 4.1. However, since Z is Markov and
X_t = √(Z_t), the process X is also Markov.
and get
(b) For μ_t = a(t)X_t, the equivalent stochastic differential equation is now
Therefore,
= g(t) [1/(W(x)ψ(x))] ½ (d/dx)[q²(x) W(x) dψ(x)/dx]
Therefore,
(1/g(t)) dg(t)/dt = [1/(W(x)ψ(x))] ½ (d/dx)[q²(x) W(x) dψ(x)/dx]
The two sides, being functions of different variables, must be constants. Hence, if
f(x,t) is a product, then it must have the form
f(x,t) = e^{−λt} ψ_λ(x)
½ (d/dx)[q²(x) W(x) dψ(x)/dx] + λ W(x) ψ(x) = 0
Under rather general conditions, it can be shown that every solution f(x,t) can
be represented as a linear combination of products. Since p(x,t|x₀,t₀) is a function of
t − t₀, x, and x₀, it must have the assumed form.
U"(x) + Vex) = 0
and
[1/√(2π(t − t₀))] exp[−(x − x₀)²/(2(t − t₀))]
11. The Fokker-Planck equation for this case has the form
∂p/∂t = ½ q²(t) ∂²p/∂x² − a(t) (∂/∂x)(xp)
Therefore,
12. (a) We assume ∂p/∂t →_{t→∞} 0, so that
½ d²p(x)/dx² + (d/dx)[sgn(x) p(x)] = 0
or
dp(x)/dx + 2 sgn(x) p(x) = constant
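Integrability forces the constant to zero, so p′(x) = −2 sgn(x) p(x) and the stationary density is p(x) = e^{−2|x|}, which integrates to 1. A quick Euler-Maruyama check of dX_t = −sgn(X_t) dt + dW_t (step size, run length, and seed are arbitrary choices for the sketch):

```python
import numpy as np

rng = np.random.default_rng(5)
dt, steps, burn = 0.01, 400_000, 50_000
noise = rng.normal(0.0, np.sqrt(dt), steps)
x = 0.0
samples = np.empty(steps)
for i in range(steps):
    # Euler-Maruyama step for dX = -sgn(X) dt + dW
    x += -np.sign(x) * dt + noise[i]
    samples[i] = x

# p(x) = e^{-2|x|} gives P(|X| <= 0.5) = 1 - e^{-1} ~ 0.632
frac = np.mean(np.abs(samples[burn:]) <= 0.5)
```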
½ (d/dx)[p(x) dφ(x)/dx] + λ p(x) φ(x) = 0
The solutions satisfying (1) dφ/dx continuous at x = 0, and (2) pφ² bounded, are of
the form
φ(x) = A e^{|x|} sin νx
Therefore,
CHAPTER 5
1. (H_t f)(x) = E_x(f(X_t))
= ½(1 + e^{−t}) f(x) + ½(1 − e^{−t}) f(−x),  x = 1, −1
If we set f = [f(1); f(−1)], then the operator H_t has a representation as a matrix, and
we have
H_t = ½ [1 + e^{−t}, 1 − e^{−t}; 1 − e^{−t}, 1 + e^{−t}],  A = ½ [−1, 1; 1, −1]
We can verify that H_t = e^{tA} in many ways. For example, we note that
A² = −A
so that for n = 1, 2, …,
Aⁿ = (−1)^{n−1} A
Therefore,
𝒟_A = {f: f bounded continuous on [0, ∞), f″ bounded continuous on (0, ∞)}
and
Since ∫₀^∞ [1/√(2πt)] exp[−(x − a)²/(2t)] e^{−λt} dt = exp(−√(2λ) |x − a|)/√(2λ), we find
u_λ(a) = ∫_c^∞ [exp(−√(2λ) |x − a|)/√(2λ)] dx
= (1/2λ) exp[−√(2λ)(c − a)],  a ≤ c
∫₀^∞ e^{−λt} ∫₀^t q(s) ds dt = (1/λ) ∫₀^∞ e^{−λt} q(t) dt = (1/λ) E e^{−λτ_c} = (1/λ) exp(−√(2λ) c)
On the other hand,
t > 0,
;2;
4. Let τ₀ = min {t: X_t = 0}. Then,
Finally,
p(x,t;x₀,0) = [1/(2π√(1 − ρ²))] exp[−(x² − 2ρxx₀ + x₀²)/(2(1 − ρ²))]
where ρ = e^{−t}. Therefore,
∫₀^∞ (1/√(2π)) e^{−a²/2} 𝒫(X_t ≤ 0 | X₀ = a) da
= ∫₀^∞ da ∫_{−∞}^0 dx [1/(2π√(1 − ρ²))] exp[−(x² − 2ρxa + a²)/(2(1 − ρ²))]
(Letting a = r cos θ, x = r sin θ, we get)
= ∫₀^∞ r dr ∫_{−π/2}^0 [dθ/(2π√(1 − ρ²))] exp[−r²(1 − ρ sin 2θ)/(2(1 − ρ²))]
= [√(1 − ρ²)/2π] ∫_{−π/2}^0 dθ/(1 − ρ sin 2θ)
= ¼ − (1/2π) sin⁻¹ ρ
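The closed form ¼ − (1/2π) sin⁻¹ρ is the classical orthant probability for a standard bivariate normal pair with correlation ρ, and is easy to confirm by simulation (the value of ρ and the sample size below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
rho = 0.5                                  # plays the role of e^{-t}; arbitrary value
n = 400_000
x0 = rng.normal(size=n)
xt = rho * x0 + np.sqrt(1 - rho**2) * rng.normal(size=n)  # correlated pair

prob = np.mean((x0 >= 0) & (xt <= 0))      # P(X_0 >= 0, X_t <= 0)
exact = 0.25 - np.arcsin(rho) / (2 * np.pi)
```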
m(x) = 2,  q²(x) = 4x
u(x) = ∫_c^x exp[−∫_c^y (2m(z)/q²(z)) dz] dy
u(x) = ∫₁^x e^{−ln y} dy = ln x
∫₀^∞ e^{−λt} (d/dt) H_t g dt = λ ∫₀^∞ e^{−λt} H_t g dt − g = λ f_λ − g
Hence,
A f_λ = λ f_λ − g
8. First, we write
h(x, t + δ) = E_x (exp[−∫₀^δ k(X_s) ds] E{f(X_{t+δ}) exp[−∫₀^t k(X_{s+δ}) ds] | X_δ})
9. Setting k(x) = β(1 + sgn x)/2 and f(x) = 1 in Kac's theorem, we get
u_λ(x) = E_x ∫₀^∞ e^{−λt} exp(−β ∫₀^t [(1 + sgn X_s)/2] ds) dt
Therefore,
½ d²u_λ(x)/dx² − (β + λ) u_λ(x) = −1,  x > 0
½ d²u_λ(x)/dx² − λ u_λ(x) = −1,  x < 0
If we define q(s,t) as the density of Γ(t), that is, q(s,t) ds = 𝒫(Γ(t) ∈ ds), then
u_λ(0) = 1/√(λ(λ + β)) = ∫₀^∞ ∫₀^∞ e^{−(λt + βs)} q(s,t) ds dt
∫₀^∞ e^{−λt} q(s,t) dt = [1/√(πs)] (1/√λ) e^{−λs}
Inverting once again yields
q(s,t) = 1/(π √s √(t − s)),  t > s
= 0,  t < s
Finally,
ȧ(t) = 2β − a²(t)
With the initial condition a(0) = 0, we get
a(t) = √(2β) tanh(√(2β) t)
and
A(t) = exp(−∫₀^t √(2β) tanh(√(2β) s) ds)
= exp(−ln cosh(√(2β) t))
= 1/cosh(√(2β) t)
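Both closed forms are easy to verify by integrating the Riccati equation a′ = 2β − a² numerically; β and the step size below are arbitrary choices:

```python
import numpy as np

beta, dt, T = 1.0, 1e-4, 1.0
n = int(T / dt)
a = 0.0
integral_a = 0.0
for _ in range(n):
    integral_a += a * dt             # accumulates int_0^T a(s) ds
    a += (2 * beta - a**2) * dt      # Euler step for a' = 2*beta - a^2

root = np.sqrt(2 * beta)
err_a = abs(a - root * np.tanh(root * T))                  # a(T) = sqrt(2b) tanh(sqrt(2b) T)
err_A = abs(np.exp(-integral_a) - 1 / np.cosh(root * T))   # A(T) = 1/cosh(sqrt(2b) T)
```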
Therefore,
h(0,t) = 1/cosh(√(2β) t)
and the density function for Z = ∫₀^1 X_t² dt can be found by inverting h(0,1); that is,
p_Z(z) = (1/2πi) ∫_{c−i∞}^{c+i∞} e^{βz} [1/cosh √(2β)] dβ
e" - 1
-1 Closed Regular
2
1 -x lox eu2 dy Closed Regular
ezu - 1
x -x Closed Regular
2
- 3
:r~r loX y3e-2. dy Open
CHAPTER 6
2. It suffices to prove that for any t ≥ 0 the events {S ∧ T ≤ t}, {S ∨ T ≤ t}, and
{T* ≤ t} are in 𝒜_t. These events can be expressed in terms of events in 𝒜_t as {S ≤ t} ∪ {T
≤ t}, {S ≤ t} ∩ {T ≤ t}, and ∩_n {T_n ≤ t}, respectively, which implies the desired result.
3. Let s < t. Then T′ = (T ∧ t) ∨ s is a bounded stopping time with T′ ≥ s so that
E[M_{T′} | 𝒜_s] = M_s by the optional sampling theorem. Now
so
Since E[X_{t₂} − X_{t₁} | 𝒜_{t₁}] ≥ 0, so is the left-hand side of this equation. This completes the
proof.
5. (a) For m ≥ n,
and E[(ln(u₁))²] < +∞. Then, by the weak law of large numbers, (1/n) ln(Z_n) →^𝒫_{n→∞}
ln 2 − 1. Thus ln(Z_n) converges in probability to −∞, which means that Z_n converges in
probability to zero. Thus Z_∞ = 0; that is, Z_n converges a.s. to zero. Since EZ_n = 1 for all
n, Z_n does not converge in p-mean for any p ≥ 1.
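The limit (1/n) ln Z_n → ln 2 − 1 identifies the example as Z_n = ∏_{i≤n} 2U_i with U_i independent uniform on [0,1], since E ln(2U) = ln 2 − 1. A simulation makes the contrast vivid: every Z_n has mean 1, yet almost every path collapses to 0 (sample sizes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
n, paths = 200, 20_000
U = rng.random((paths, n))
logZ = np.log(2.0 * U).sum(axis=1)     # log Z_n for Z_n = prod_{i<=n} 2 U_i

lln = logZ.mean() / n                  # law of large numbers: ~ E ln(2U) = ln 2 - 1 < 0
frac_tiny = np.mean(logZ < -20.0)      # almost every path has Z_n ~ 0
mean_Z1 = (2.0 * U[:, 0]).mean()       # yet E Z_n = 1 for every n (checked for n = 1)
```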
6. Let A ∈ ℬ. Then M(A) = ∫_A (dM^𝒢/dM₀^𝒢) dM₀. On the other hand, by the definition of
conditional expectation, we also have
Equating these two expressions for M(A) and noting that each side of Eq. (1.9) is 𝒢-
measurable, the conclusion follows.
7. Let R_n ↑ +∞ be stopping times such that (M_{R_n ∧ t}, 𝒜_t: t ≥ 0) is a martingale for each
n ≥ 1. Then, as n tends to +∞, we use a conditional version of Fatou's lemma to deduce
that for t > s,
which implies that ℰ(h·w) is of class D. Finally, since ℰ(h·w) is a local martingale, there
exists a sequence τ_n ↑ +∞ of stopping times such that E ℰ(h·w)_{τ_n} = 1 for each n. Then
E ℰ(h·w)_∞ = 1 by Proposition 1.3.
9. If 𝒫 ≪ 𝒫₀ then the Radon-Nikodym derivative Λ = d𝒫/d𝒫₀ exists and L_n = E[Λ | 𝒜_n].
By Proposition 1.4, L_∞ = Λ a.s. so that E L_∞ = 1. Conversely, suppose that E L_∞ = 1.
Then for c ≥ 1,
E[L_n 1_{{L_n ≥ c}}] ≤ E[L_n 1_{{L_∞ ≥ c − 1/c, |L_n − L_∞| ≤ 1/c}}]
so that
a bounded random variable which is 𝒜_n-measurable for some n, then EU = E₀U L_n =
E₀U L_∞. Then, by the monotone class theorem, EU = E₀U L_∞ for all 𝒜_∞-measurable
bounded U. Hence 𝒫 ≪ 𝒫₀ and L_∞ is the Radon-Nikodym derivative.
so we see that 6 ∫₀^t W_s² ds is the predictable compensator of W_t⁴. Next (simply check the
jumps)
and
Thus, 𝒫[B_{n+1} = 1 | 𝒜_n] = 𝒫[B_{n+1} = −1 | 𝒜_n] = 0.5 for each n, and this implies part
(a).
There exists a sequence of functions F_n so that
n ≥ 0
Now
n > 0
then M_{n+1} = M_n + B_{n+1} H_{n+1} for n ≥ 0. This implies that the representation to be proved
is valid.
CHAPTER 7
1. For each possible value θ of Θ_{k+1},
Now the ith term in the numerator of the last expression is equal to
which is the same as Π_k(i) R_i(θ). The denominator above is the sum of similar terms, and
the desired conclusion follows.
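The update just described — predict with the transition matrix, multiply by the observation likelihood, renormalize by the sum of similar terms — can be sketched in a few lines (the matrix, prior, and likelihood values are invented):

```python
import numpy as np

# One filter step: Pi_{k+1}(i) proportional to [sum_j Pi_k(j) P_{ji}] R_i(theta).
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])          # transition matrix (invented)
Pi = np.array([0.6, 0.4])           # current conditional distribution Pi_k (invented)
R = np.array([0.2, 0.5])            # observation likelihoods R_i(theta) (invented)

num = (Pi @ P) * R                  # numerator terms, summed over j
Pi_next = num / num.sum()           # normalize by the sum of similar terms

# Brute-force check of the same computation with explicit sums
brute = np.array([sum(Pi[j] * P[j, i] for j in range(2)) * R[i] for i in range(2)])
brute = brute / brute.sum()
```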
2. Since H_t is not random, it is equal to E[(a_t − â_t)²]. Equation (4.12) for H can be written
for some constants A and B. Using the initial condition u̇(0)/u(0) = H₀ − a = −a, we get
Since H_t = a + u̇(t)/u(t), this yields H_t. We find that H_t tends to a + p as t tends to
infinity. This limit value can also be obtained by setting Ḣ_t = 0 in Eq. (4.12).
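A scalar Riccati equation of this type can be integrated numerically and compared with its stationary point; the sketch below normalizes the observation-noise intensity to 1 and uses invented coefficients, so it illustrates the limiting behavior rather than Eq. (4.12) itself:

```python
import numpy as np

# Scalar Riccati sketch: H' = 2 a H + q - H^2; steady state solves H' = 0,
# giving H_inf = a + sqrt(a^2 + q).  The values of a and q are invented.
a, q = -0.5, 2.0
dt, T = 1e-4, 20.0
H = 0.0
for _ in range(int(T / dt)):
    H += (2 * a * H + q - H**2) * dt   # Euler step

H_inf = a + np.sqrt(a**2 + q)          # positive root of 2 a H + q - H^2 = 0
err = abs(H - H_inf)
```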
3. The goal is to find a recursive equation for π̂_t, where π_t = 1_{[0,V)}(t). Let 𝒜_t = σ(a_s, N_s:
0 ≤ s ≤ t). Then, arguing heuristically,
This suggests the easily verified fact that if μ̂_t = −π_t f(t)/(1 − F(t)), then m defined by
Eq. (3.2) is an 𝒜_t martingale. Now λ_s = (a − b)π_s + b, π_s² = π_s, π_s = π_{s−}, and Ψ in Eq. (4.3)
is zero, so Eq. (4.3) becomes
Eigenfunction
  of integral equation, 83
  of Laplacian operator, 287
Eigenvalue
  of integral equation, 83
  of Laplacian operator, 286
Envelope, 109
Ergodic process, 67
  condition for a Gaussian process, 69
Ergodic theorem, 68
Estimator
  linear least squares, 116-117
  recursive, 269
Euclidean distance, 282
Euclidean group, 282
Euclidean norm, 281
Event, 1-3
  convergence, 11
Exit boundary, 202
Exit time, 194
Expectation, 15-17
  conditional, 27, 29-31
  convergence, 17
Exponential of a semimartingale, 241
Extension theorem, 3
Gauss-Markov process, condition for, 64-65
Gauss-Markov random field, isotropic, 292-293
  and homogeneous, 293-295
Gaussian process, 46
  characteristic function, 47
  density function, 49
  linear operations on, 47-48
  Markov, 64-65
Gaussian random variable, 46
Gaussian white noise
  convergence, 160-161
  correction term, 160-162
  as derivative of Brownian motion, 156-157
  in differential equations, 156-160
  in integrals, 157
  simulation, 162-163
Gegenbauer equation, 287
Gegenbauer polynomials, 287
Generalized process, 279
Generator (of Markov semigroup), 183
Girsanov's theorem, 244
Green's function, 199
358 INDEX