
Stochastic Calculus for Finance II:

Continuous-Time Models
Solution of Exercise Problems
Yan Zeng
Version 1.0.8, last revised on 2015-03-13.

Abstract
This is a solution manual for Shreve [14]. If you find any typos/errors or have any comments, please
email me at zypublic@hotmail.edu. This version skips Exercise 7.1, 7.2, 7.5–7.9.

Contents
1 General Probability Theory 2

2 Information and Conditioning 10

3 Brownian Motion 16

4 Stochastic Calculus 26

5 Risk-Neutral Pricing 44

6 Connections with Partial Differential Equations 54

7 Exotic Options 65

8 American Derivative Securities 67

9 Change of Numéraire 72

10 Term-Structure Models 78

11 Introduction to Jump Processes 94

1 General Probability Theory
⋆ Comments:
Example 1.1.4 of the textbook illustrates the paradigm of extending a probability measure from a σ-algebra
consisting of finitely many elements to a σ-algebra consisting of infinitely many elements. This procedure is
typical of constructing probability measures, and its validity is justified by Carathéodory's Extension Theorem
(see, for example, Durrett [4, page 402]).

I Exercise 1.1. Using the properties of Definition 1.1.2 for a probability measure P, show the following.
(i) If A ∈ F , B ∈ F , and A ⊂ B, then P(A) ≤ P(B).
Proof. P(B) = P((B − A) ∪ A) = P(B − A) + P(A) ≥ P(A).

(ii) If A ∈ F and {A_n}_{n=1}^∞ is a sequence of sets in F with lim_{n→∞} P(A_n) = 0 and A ⊂ A_n for every n, then
P(A) = 0. (This property was used implicitly in Example 1.1.4 when we argued that the sequence of all
heads, and indeed any particular sequence, must have probability zero.)
Proof. According to (i), P(A) ≤ P(An ), which implies P(A) ≤ limn→∞ P(An ) = 0. So 0 ≤ P(A) ≤ 0. This
means P(A) = 0.

I Exercise 1.2. The infinite coin-toss space Ω∞ of Example 1.1.4 is uncountably infinite. In other words,
we cannot list all its elements in a sequence. To see that this is impossible, suppose there were such a
sequential list of all elements of Ω∞ :
ω^{(1)} = ω_1^{(1)} ω_2^{(1)} ω_3^{(1)} ω_4^{(1)} · · · ,
ω^{(2)} = ω_1^{(2)} ω_2^{(2)} ω_3^{(2)} ω_4^{(2)} · · · ,
ω^{(3)} = ω_1^{(3)} ω_2^{(3)} ω_3^{(3)} ω_4^{(3)} · · · ,
⋮
An element that does not appear in this list is the sequence whose first component is H if ω_1^{(1)} is T and is
T if ω_1^{(1)} is H, whose second component is H if ω_2^{(2)} is T and is T if ω_2^{(2)} is H, whose third component is H
if ω_3^{(3)} is T and is T if ω_3^{(3)} is H, etc. Thus, the list does not include every element of Ω_∞.
Now consider the set of sequences of coin tosses in which the outcome on each even-numbered toss matches
the outcome of the toss preceding it, i.e.,

A = {ω = ω1 ω2 ω3 ω4 ω5 · · · ; ω1 = ω2 , ω3 = ω4 , · · · }.

(i) Show that A is uncountably infinite.


Proof. We define a mapping ϕ from A to Ω∞ as follows: ϕ(ω1 ω2 · · · ) = ω1 ω3 ω5 · · · . Then ϕ is one-to-one
and onto. So the cardinality of A is the same as that of Ω∞ , which means in particular that A is uncountably
infinite.

(ii) Show that, when 0 < p < 1, we have P(A) = 0.


Proof. Let An = {ω = ω1 ω2 · · · : ω1 = ω2 , · · · , ω2n−1 = ω2n }. Then An ↓ A as n → ∞. So

P(A) = lim_{n→∞} P(A_n) = lim_{n→∞} [P(ω_1 = ω_2) · · · P(ω_{2n−1} = ω_{2n})] = lim_{n→∞} [p^2 + (1 − p)^2]^n.

Since p2 + (1 − p)2 ≤ max{p, 1 − p}[p + (1 − p)] < 1 for 0 < p < 1, we have limn→∞ (p2 + (1 − p)2 )n = 0.
This implies P(A) = 0.

I Exercise 1.3. Consider the set function P defined for every subset of [0, 1] by the formula that P(A) = 0
if A is a finite set and P(A) = ∞ if A is an infinite set. Show that P satisfies (1.1.3)-(1.1.5), but P does not
have the countable additivity property (1.1.2). We see then that the finite additivity property (1.1.5) does
not imply the countable additivity property (1.1.2).
Proof. Clearly P(∅) = 0. For any A and B, if both of them are finite, then A ∪ B is also finite. So
P(A ∪ B) = 0 = P(A) + P(B). If at least one of them is infinite, then A ∪ B is also infinite. So
P(A ∪ B) = ∞ = P(A) + P(B). Similarly, we can prove P(∪_{n=1}^N A_n) = Σ_{n=1}^N P(A_n), even if the A_n's are not disjoint.
To see that the countable additivity property does not hold for P, let A_n = {1/n}. Then A = ∪_{n=1}^∞ A_n is an
infinite set and therefore P(A) = ∞. However, P(A_n) = 0 for each n. So P(A) ≠ Σ_{n=1}^∞ P(A_n).

I Exercise 1.4.
(i) Construct a standard normal random variable Z on the probability space (Ω_∞, F_∞, P) of Example 1.1.4
under the assumption that the probability for head is p = 1/2. (Hint: Consider Examples 1.2.5 and 1.2.6.)

Solution. By Example 1.2.5, we can construct a random variable X on the coin-toss space which is uniformly
distributed on [0, 1]. For the strictly increasing and continuous function N(x) = ∫_{−∞}^x \frac{1}{\sqrt{2π}} e^{−ξ^2/2} dξ, we let
Z = N^{−1}(X). Then P(Z ≤ a) = P(X ≤ N(a)) = N(a) for any real number a, i.e. Z is a standard normal
random variable on the coin-toss space (Ω_∞, F_∞, P).

(ii) Define a sequence of random variables {Z_n}_{n=1}^∞ on Ω_∞ such that

lim_{n→∞} Z_n(ω) = Z(ω) for every ω ∈ Ω_∞

and, for each n, Z_n depends only on the first n coin tosses. (This gives us a procedure for approximating a
standard normal random variable by random variables generated by a finite number of coin tosses, a useful
algorithm for Monte Carlo simulation.)

Solution. Define

X_n = Σ_{i=1}^n \frac{1}{2^i} 1_{\{ω_i = H\}}.

Then X_n(ω) → X(ω) for every ω ∈ Ω_∞, where X is defined as in Example 1.2.5. So Z_n = N^{−1}(X_n) →
Z = N^{−1}(X) for every ω. Clearly Z_n depends only on the first n coin tosses, and {Z_n}_{n≥1} is the desired
sequence.
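The approximation above is easy to test numerically. The following minimal sketch (our own illustration, assuming Python with only the standard library; the helper name z_from_coin_tosses is not from the textbook) builds X_n from n simulated fair coin tosses and maps it through N^{−1}; the sample mean and standard deviation of many such draws should be close to 0 and 1.

```python
import random
from statistics import NormalDist, fmean, pstdev

def z_from_coin_tosses(n, rng=random):
    """Approximate a standard normal draw from the first n fair coin tosses.

    X_n = sum_{i=1}^n 2^{-i} * 1{toss i is H} is approximately Uniform(0,1),
    and Z_n = N^{-1}(X_n) is approximately standard normal.
    """
    x = sum(2.0 ** -(i + 1) for i in range(n) if rng.random() < 0.5)
    # Guard against x == 0, where the inverse CDF would be -infinity.
    x = min(max(x, 1e-12), 1 - 1e-12)
    return NormalDist().inv_cdf(x)

if __name__ == "__main__":
    random.seed(0)
    samples = [z_from_coin_tosses(30) for _ in range(100_000)]
    print("sample mean ", fmean(samples))   # should be close to 0
    print("sample stdev", pstdev(samples))  # should be close to 1
```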

I Exercise 1.5. When dealing with double Lebesgue integrals, just as with double Riemann integrals, the
order of integration can be reversed. The only assumption required is that the function being integrated be
either nonnegative or integrable. Here is an application of this fact.
Let X be a nonnegative random variable with cumulative distribution function F(x) = P{X ≤ x}. Show
that
EX = ∫_0^∞ (1 − F(x)) dx
by showing that
∫_Ω ∫_0^∞ 1_{[0,X(ω))}(x) dx dP(ω)
is equal to both EX and ∫_0^∞ (1 − F(x)) dx.
Proof. First, by the information given in the problem, we have
∫_Ω ∫_0^∞ 1_{[0,X(ω))}(x) dx dP(ω) = ∫_0^∞ ∫_Ω 1_{[0,X(ω))}(x) dP(ω) dx.
The left side of this equation equals
∫_Ω ∫_0^{X(ω)} dx dP(ω) = ∫_Ω X(ω) dP(ω) = E[X].
The right side of the equation equals
∫_0^∞ ∫_Ω 1_{\{x<X(ω)\}} dP(ω) dx = ∫_0^∞ P(x < X) dx = ∫_0^∞ (1 − F(x)) dx.
So E[X] = ∫_0^∞ (1 − F(x)) dx.

I Exercise 1.6. Let u be a fixed number in R, and define the convex function φ(x) = e^{ux} for all x ∈ R.
Let X be a normal random variable with mean µ = EX and standard deviation σ = [E(X − µ)^2]^{1/2}, i.e., with
density
f(x) = \frac{1}{σ\sqrt{2π}} e^{−(x−µ)^2/(2σ^2)}.
(i) Verify that
Ee^{uX} = e^{uµ + \frac{1}{2}u^2σ^2}.

Proof.
E[e^{uX}] = ∫_{−∞}^∞ e^{ux} \frac{1}{σ\sqrt{2π}} e^{−\frac{(x−µ)^2}{2σ^2}} dx
= ∫_{−∞}^∞ \frac{1}{σ\sqrt{2π}} e^{−\frac{(x−µ)^2 − 2σ^2 ux}{2σ^2}} dx
= ∫_{−∞}^∞ \frac{1}{σ\sqrt{2π}} e^{−\frac{[x−(µ+σ^2 u)]^2 − (2σ^2 uµ + σ^4 u^2)}{2σ^2}} dx
= e^{uµ + \frac{σ^2 u^2}{2}} ∫_{−∞}^∞ \frac{1}{σ\sqrt{2π}} e^{−\frac{[x−(µ+σ^2 u)]^2}{2σ^2}} dx
= e^{uµ + \frac{σ^2 u^2}{2}}.

(ii) Verify that Jensen's inequality holds (as it must):

Eφ(X) ≥ φ(EX).

Proof. E[φ(X)] = E[e^{uX}] = e^{uµ + \frac{u^2σ^2}{2}} ≥ e^{uµ} = φ(E[X]).

I Exercise 1.7. For each positive integer n, define f_n to be the normal density with mean zero and variance
n, i.e.,
f_n(x) = \frac{1}{\sqrt{2πn}} e^{−x^2/(2n)}.
(i) What is the function f(x) = lim_{n→∞} f_n(x)?

Solution. Since |f_n(x)| ≤ \frac{1}{\sqrt{2πn}}, f(x) = lim_{n→∞} f_n(x) = 0.

(ii) What is lim_{n→∞} ∫_{−∞}^∞ f_n(x) dx?

Solution. By the change of variable formula, ∫_{−∞}^∞ f_n(x) dx = ∫_{−∞}^∞ \frac{1}{\sqrt{2π}} e^{−x^2/2} dx = 1. So we must have
lim_{n→∞} ∫_{−∞}^∞ f_n(x) dx = 1.

(iii) Note that
lim_{n→∞} ∫_{−∞}^∞ f_n(x) dx ≠ ∫_{−∞}^∞ f(x) dx.
Explain why this does not violate the Monotone Convergence Theorem, Theorem 1.4.5.

Solution. This does not contradict the Monotone Convergence Theorem because {f_n}_{n≥1} does not increase to
its pointwise limit f ≡ 0: each f_n is strictly positive, so the convergence cannot be monotone increasing.

I Exercise 1.8 (Moment-generating function). Let X be a nonnegative random variable, and assume
that
φ(t) = Ee^{tX}
is finite for every t ∈ R. Assume further that E[Xe^{tX}] < ∞ for every t ∈ R. The purpose of this exercise is
to show that φ′(t) = E[Xe^{tX}] and, in particular, φ′(0) = EX.
We recall the definition of derivative:
φ′(t) = lim_{s→t} \frac{φ(t) − φ(s)}{t − s} = lim_{s→t} \frac{Ee^{tX} − Ee^{sX}}{t − s} = lim_{s→t} E[\frac{e^{tX} − e^{sX}}{t − s}].
The limit above is taken over a continuous variable s, but we can choose a sequence of numbers {s_n}_{n=1}^∞
converging to t and compute
lim_{s_n→t} E[\frac{e^{tX} − e^{s_n X}}{t − s_n}],
where now we are taking a limit of the expectation of the sequence of random variables
Y_n = \frac{e^{tX} − e^{s_n X}}{t − s_n}.
If this limit turns out to be the same, regardless of how we choose the sequence {s_n}_{n=1}^∞ that converges to
t, then this limit is also the same as lim_{s→t} E[\frac{e^{tX} − e^{sX}}{t − s}] and is φ′(t).
The Mean Value Theorem from calculus states that if f(t) is a differentiable function, then for any two
numbers s and t, there is a number θ between s and t such that

f(t) − f(s) = f′(θ)(t − s).

If we fix ω ∈ Ω and define f(t) = e^{tX(ω)}, then this becomes

e^{tX(ω)} − e^{sX(ω)} = (t − s)X(ω)e^{θ(ω)X(ω)},    (1.9.1)

where θ(ω) is a number depending on ω (i.e., a random variable lying between t and s).
(i) Use the Dominated Convergence Theorem (Theorem 1.4.9) and equation (1.9.1) to show that
lim_{n→∞} EY_n = E[lim_{n→∞} Y_n] = E[Xe^{tX}].    (1.9.2)
This establishes the desired formula φ′(t) = E[Xe^{tX}].

Proof. By (1.9.1), |Y_n| = |\frac{e^{tX} − e^{s_n X}}{t − s_n}| = |Xe^{θ_n X}| = Xe^{θ_n X} ≤ Xe^{max\{2|t|,1\}X}. The last inequality is by X ≥ 0
and the fact that θ_n is between t and s_n, and hence smaller than max{2|t|, 1} for n sufficiently large. So by
the Dominated Convergence Theorem, φ′(t) = lim_{n→∞} E[Y_n] = E[lim_{n→∞} Y_n] = E[Xe^{tX}].

(ii) Suppose the random variable X can take both positive and negative values and Ee^{tX} < ∞ and
E[|X|e^{tX}] < ∞ for every t ∈ R. Show that once again φ′(t) = E[Xe^{tX}]. (Hint: Use the notation
(1.3.1) to write X = X^+ − X^−.)

Proof. Since e^{t|X|} = e^{tX}1_{\{X≥0\}} + e^{(−t)X}1_{\{X<0\}}, we have E[e^{t|X|}] ≤ E[e^{tX}] + E[e^{−tX}] < ∞ for every t ∈ R.
Similarly, we have E[|X|e^{t|X|}] < ∞ for every t ∈ R. So, similar to (i), we have |Y_n| = |Xe^{θ_n X}| ≤ |X|e^{max\{2|t|,1\}|X|}
for n sufficiently large, and by the Dominated Convergence Theorem, φ′(t) = lim_{n→∞} E[Y_n] = E[lim_{n→∞} Y_n] = E[Xe^{tX}].

I Exercise 1.9. Suppose X is a random variable on some probability space (Ω, F, P), A is a set in F, and
for every Borel subset B of R, we have
∫_A 1_B(X(ω)) dP(ω) = P(A) · P{X ∈ B}.    (1.9.3)
Then we say that X is independent of the event A.
Show that if X is independent of an event A, then
∫_A g(X(ω)) dP(ω) = P(A) · Eg(X)
for every nonnegative, Borel-measurable function g.

Proof. If g(x) is of the form 1_B(x), where B is a Borel subset of R, then the desired equality is just (1.9.3).
By the linearity of the Lebesgue integral, the desired equality also holds for nonnegative simple functions, i.e. g of
the form g(x) = Σ_{i=1}^n c_i 1_{B_i}(x), where each c_i ≥ 0 and each B_i is a Borel subset of R. Since any nonnegative,
Borel-measurable function g is the limit of an increasing sequence of simple functions, the desired equality can be
proved by the Monotone Convergence Theorem.

I Exercise 1.10. Let P be the uniform (Lebesgue) measure on Ω = [0, 1]. Define

Z(ω) = 0 if 0 ≤ ω < 1/2,  and  Z(ω) = 2 if 1/2 ≤ ω ≤ 1.

For A ∈ B[0, 1], define
P̃(A) = ∫_A Z(ω) dP(ω).

(i) Show that P̃ is a probability measure.

Proof. If {A_i}_{i=1}^∞ is a sequence of disjoint Borel subsets of [0, 1], then by the Monotone Convergence Theorem,
P̃(∪_{i=1}^∞ A_i) = ∫ 1_{∪_{i=1}^∞ A_i} Z dP = lim_{n→∞} ∫ 1_{∪_{i=1}^n A_i} Z dP = lim_{n→∞} Σ_{i=1}^n ∫_{A_i} Z dP = Σ_{i=1}^∞ P̃(A_i).
Meanwhile, P̃(Ω) = 2P([1/2, 1]) = 1. So P̃ is a probability measure.

(ii) Show that if P(A) = 0, then P̃(A) = 0. We say that P̃ is absolutely continuous with respect to P.

Proof. If P(A) = 0, then P̃(A) = ∫_A Z dP = 2∫_{A∩[1/2,1]} dP = 2P(A ∩ [1/2, 1]) = 0.

(iii) Show that there is a set A for which P̃(A) = 0 but P(A) > 0. In other words, P̃ and P are not equivalent.

Proof. Let A = [0, 1/2). Then P̃(A) = ∫_A Z dP = 0 while P(A) = 1/2 > 0.

I Exercise 1.11. In Example 1.6.6, we began with a standard normal random variable X under a measure
P. According to Exercise 1.6, this random variable has the moment-generating function
Ee^{uX} = e^{u^2/2} for all u ∈ R.
The moment-generating function of a random variable determines its distribution. In particular, any random
variable that has moment-generating function e^{u^2/2} must be standard normal.
In Example 1.6.6, we also defined Y = X + θ, where θ is a constant, we set Z = e^{−θX − θ^2/2}, and we
defined P̃ by the formula (1.6.9):
P̃(A) = ∫_A Z(ω) dP(ω) for all A ∈ F.
We showed by considering its cumulative distribution function that Y is a standard normal random variable
under P̃. Give another proof that Y is standard normal under P̃ by verifying the moment-generating function
formula
Ẽe^{uY} = e^{u^2/2} for all u ∈ R.

Proof.
Ẽ[e^{uY}] = E[e^{uY} Z] = E[e^{uX+uθ} e^{−θX − \frac{θ^2}{2}}] = e^{uθ − \frac{θ^2}{2}} E[e^{(u−θ)X}] = e^{uθ − \frac{θ^2}{2}} e^{\frac{(u−θ)^2}{2}} = e^{\frac{u^2}{2}}.

I Exercise 1.12. In Example 1.6.6, we began with a standard normal random variable X on a probability
space (Ω, F, P) and defined the random variable Y = X + θ, where θ is a constant. We also defined
Z = e^{−θX − θ^2/2} and used Z as the Radon-Nikodým derivative to construct the probability measure P̃ by the
formula (1.6.9):
P̃(A) = ∫_A Z(ω) dP(ω) for all A ∈ F.
Under P̃, the random variable Y was shown to be standard normal.
We now have a standard normal random variable Y on the probability space (Ω, F, P̃), and X is related
to Y by X = Y − θ. By what we have just stated, with X replaced by Y and θ replaced by −θ, we could
define Ẑ = e^{θY − θ^2/2} and then use Ẑ as a Radon-Nikodým derivative to construct a probability measure P̂
by the formula
P̂(A) = ∫_A Ẑ(ω) dP̃(ω) for all A ∈ F,
so that, under P̂, the random variable X is standard normal. Show that Ẑ = \frac{1}{Z} and P̂ = P.

Proof. First, Ẑ = e^{θY − \frac{θ^2}{2}} = e^{θ(X+θ) − \frac{θ^2}{2}} = e^{\frac{θ^2}{2} + θX} = Z^{−1}. Second, for any A ∈ F, P̂(A) = ∫_A Ẑ dP̃ =
∫ (1_A Ẑ)Z dP = ∫ 1_A dP = P(A). So P̂ = P. In particular, X is standard normal under P̂, since it is standard
normal under P.

I Exercise 1.13 (Change of measure for a normal random variable). A nonrigorous but informative
derivation of the formula for the Radon-Nikodým derivative Z(ω) in Example 1.6.6 is provided by this
exercise. As in that example, let X be a standard normal random variable on some probability space
(Ω, F, P), and let Y = X + θ. Our goal is to define a strictly positive random variable Z(ω) so that when
we set
P̃(A) = ∫_A Z(ω) dP(ω) for all A ∈ F,    (1.9.4)
the random variable Y under P̃ is standard normal. If we fix ω ∈ Ω and choose a set A that contains ω and
is “small,” then (1.9.4) gives
P̃(A) ≈ Z(ω)P(A),
where the symbol ≈ means “is approximately equal to.” Dividing by P(A), we see that
P̃(A)/P(A) ≈ Z(ω)
for “small” sets A containing ω. We use this observation to identify Z(ω).
With ω fixed, let x = X(ω). For ϵ > 0, we define B(x, ϵ) = [x − ϵ/2, x + ϵ/2] to be the closed interval
centered at x and having length ϵ. Let y = x + θ and B(y, ϵ) = [y − ϵ/2, y + ϵ/2].
(i) Show that
\frac{1}{ϵ} P{X ∈ B(x, ϵ)} ≈ \frac{1}{\sqrt{2π}} exp{−\frac{X^2(ω)}{2}}.

Proof.
\frac{1}{ϵ} P(X ∈ B(x, ϵ)) = \frac{1}{ϵ} ∫_{x−ϵ/2}^{x+ϵ/2} \frac{1}{\sqrt{2π}} e^{−u^2/2} du ≈ \frac{1}{ϵ} · \frac{1}{\sqrt{2π}} e^{−x^2/2} · ϵ = \frac{1}{\sqrt{2π}} e^{−X^2(ω)/2}.

(ii) In order for Y to be a standard normal random variable under P̃, show that we must have
\frac{1}{ϵ} P̃{Y ∈ B(y, ϵ)} ≈ \frac{1}{\sqrt{2π}} exp{−\frac{Y^2(ω)}{2}}.

Proof. Similar to (i).

(iii) Show that {X ∈ B(x, ϵ)} and {Y ∈ B(y, ϵ)} are the same set, which we call A(ω, ϵ). This set contains
ω and is “small” when ϵ > 0 is small.

Proof. {X ∈ B(x, ϵ)} = {X ∈ B(y − θ, ϵ)} = {X + θ ∈ B(y, ϵ)} = {Y ∈ B(y, ϵ)}.

(iv) Show that
\frac{P̃(A)}{P(A)} ≈ exp{−θX(ω) − \frac{1}{2}θ^2}.
The right-hand side is the value we obtained for Z(ω) in Example 1.6.6.

Proof. By (i)-(iii), \frac{P̃(A)}{P(A)} is approximately
\frac{\frac{ϵ}{\sqrt{2π}} e^{−Y^2(ω)/2}}{\frac{ϵ}{\sqrt{2π}} e^{−X^2(ω)/2}} = e^{−\frac{Y^2(ω) − X^2(ω)}{2}} = e^{−\frac{(X(ω)+θ)^2 − X^2(ω)}{2}} = e^{−θX(ω) − \frac{θ^2}{2}}.

I Exercise 1.14 (Change of measure for an exponential random variable). Let X be a nonnegative
random variable defined on a probability space (Ω, F, P) with the exponential distribution, which is
P{X ≤ a} = 1 − e^{−λa}, a ≥ 0,
where λ is a positive constant. Let λ̃ be another positive constant, and define
Z = \frac{λ̃}{λ} e^{−(λ̃−λ)X}.
Define P̃ by
P̃(A) = ∫_A Z dP for all A ∈ F.

(i) Show that P̃(Ω) = 1.

Proof.
P̃(Ω) = ∫ \frac{λ̃}{λ} e^{−(λ̃−λ)X} dP = ∫_0^∞ \frac{λ̃}{λ} e^{−(λ̃−λ)x} λe^{−λx} dx = ∫_0^∞ λ̃ e^{−λ̃x} dx = 1.

(ii) Compute the cumulative distribution function
P̃{X ≤ a} for a ≥ 0
for the random variable X under the probability measure P̃.

Solution.
P̃(X ≤ a) = ∫_{\{X≤a\}} \frac{λ̃}{λ} e^{−(λ̃−λ)X} dP = ∫_0^a \frac{λ̃}{λ} e^{−(λ̃−λ)x} λe^{−λx} dx = ∫_0^a λ̃ e^{−λ̃x} dx = 1 − e^{−λ̃a}.

I Exercise 1.15 (Provided by Alexander Ng). Let X be a random variable on a probability space
(Ω, F, P), and assume X has a density function f(x) that is positive for every x ∈ R. Let g be a strictly
increasing, differentiable function satisfying
lim_{y→−∞} g(y) = −∞,  lim_{y→∞} g(y) = ∞,
and define the random variable Y = g(X).
Let h(y) be an arbitrary nonnegative function satisfying ∫_{−∞}^∞ h(y) dy = 1. We want to change the
probability measure so that h(y) is the density function for the random variable Y. To do this, we define
Z = \frac{h(g(X))g′(X)}{f(X)}.

(i) Show that Z is nonnegative and EZ = 1. Now define P̃ by
P̃(A) = ∫_A Z dP for all A ∈ F.

Proof. Clearly Z ≥ 0. Furthermore, we have
E[Z] = E[\frac{h(g(X))g′(X)}{f(X)}] = ∫_{−∞}^∞ \frac{h(g(x))g′(x)}{f(x)} f(x) dx = ∫_{−∞}^∞ h(g(x)) dg(x) = ∫_{−∞}^∞ h(u) du = 1.

(ii) Show that Y has density h under P̃.

Proof.
P̃(Y ≤ a) = ∫_{\{g(X)≤a\}} \frac{h(g(X))g′(X)}{f(X)} dP = ∫_{−∞}^{g^{−1}(a)} \frac{h(g(x))g′(x)}{f(x)} f(x) dx = ∫_{−∞}^{g^{−1}(a)} h(g(x)) dg(x).
By the change of variable formula, the last integral equals ∫_{−∞}^a h(u) du. So Y has density h under P̃.

2 Information and Conditioning
I Exercise 2.1. Let (Ω, F, P) be a general probability space, and suppose a random variable X on this
space is measurable with respect to the trivial σ-algebra F0 = {∅, Ω}. Show that X is not random (i.e.,
there is a constant c such that X(ω) = c for all ω ∈ Ω). Such a random variable is called degenerate.

Proof. For any real number a, we have {X ≤ a} ∈ F0 = {∅, Ω}. So P(X ≤ a) is either 0 or 1. Since
lima→∞ P(X ≤ a) = 1 and lima→−∞ P(X ≤ a) = 0, we can find a number x0 such that P(X ≤ x0 ) = 1 and
P(X ≤ x) = 0 for any x < x0 . So
P(X = x_0) = lim_{n→∞} P(x_0 − \frac{1}{n} < X ≤ x_0) = lim_{n→∞} [P(X ≤ x_0) − P(X ≤ x_0 − \frac{1}{n})] = 1.

Since {X = x0 } ∈ {∅, Ω}, we conclude {X = x0 } = Ω.

I Exercise 2.2. Independence of random variables can be affected by changes of measure. To illustrate
this point, consider the space of two coin tosses Ω2 = {HH, HT, T H, T T }, and let stock prices be given by

S0 = 4, S1 (H) = 8, S1 (T ) = 2,
S2 (HH) = 16, S2 (HT ) = S2 (T H) = 4, S2 (T T ) = 1.

Consider the two probability measures given by

P̃(HH) = \frac{1}{4}, P̃(HT) = \frac{1}{4}, P̃(TH) = \frac{1}{4}, P̃(TT) = \frac{1}{4},
P(HH) = \frac{4}{9}, P(HT) = \frac{2}{9}, P(TH) = \frac{2}{9}, P(TT) = \frac{1}{9}.
Define the random variable
X = 1 if S_2 = 4,  and  X = 0 if S_2 ≠ 4.

(i) List all the sets in σ(X).

Solution. σ(X) = {∅, Ω, {HT, T H}, {T T, HH}}.


(ii) List all the sets in σ(S1 ).

Solution. σ(S1 ) = {∅, Ω, {HH, HT }, {T H, T T }}.


(iii) Show that σ(X) and σ(S_1) are independent under the probability measure P̃.

Proof. P̃({HT, TH} ∩ {HH, HT}) = P̃({HT}) = \frac{1}{4}, P̃({HT, TH}) = P̃({HT}) + P̃({TH}) = \frac{1}{4} + \frac{1}{4} = \frac{1}{2},
and P̃({HH, HT}) = P̃({HH}) + P̃({HT}) = \frac{1}{4} + \frac{1}{4} = \frac{1}{2}. So we have
P̃({HT, TH} ∩ {HH, HT}) = P̃({HT, TH}) P̃({HH, HT}).
Similarly, we can work on other elements of σ(X) and σ(S_1) and show that P̃(A ∩ B) = P̃(A)P̃(B) for any
A ∈ σ(X) and B ∈ σ(S_1). So σ(X) and σ(S_1) are independent under P̃.

(iv) Show that σ(X) and σ(S_1) are not independent under the probability measure P.

Proof. P({HT, TH} ∩ {HH, HT}) = P({HT}) = \frac{2}{9}, P({HT, TH}) = \frac{2}{9} + \frac{2}{9} = \frac{4}{9} and P({HH, HT}) =
\frac{4}{9} + \frac{2}{9} = \frac{6}{9}. So
P({HT, TH} ∩ {HH, HT}) ≠ P({HT, TH})P({HH, HT}).
Hence σ(X) and σ(S_1) are not independent under P.

(v) Under P, we have P{S_1 = 8} = \frac{2}{3} and P{S_1 = 2} = \frac{1}{3}. Explain intuitively why, if you are told that
X = 1, you would want to revise your estimate of the distribution of S_1.
Solution. Because S1 and X are not independent under the probability measure P, knowing the value of X
will affect our opinion on the distribution of S1 .

I Exercise 2.3 (Rotating the axes). Let X and Y be independent standard normal random variables.
Let θ be a constant, and define random variables

V = X cos θ + Y sin θ and W = −X sin θ + Y cos θ.

Show that V and W are independent standard normal random variables.


Proof. We note (V, W ) are jointly Gaussian. In order to prove their independence, it suffices to show they
are uncorrelated. Indeed,

E[V W ] = E[−X 2 sin θ cos θ + XY cos2 θ − XY sin2 θ + Y 2 sin θ cos θ]


= − sin θ cos θ + 0 + 0 + sin θ cos θ
= 0.
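As a quick sanity check (not part of the textbook's argument), one can sample (X, Y), rotate by an arbitrary angle, and verify that V and W have unit variance and essentially zero sample correlation. A minimal sketch in Python with illustrative parameter values:

```python
import math
import random

random.seed(1)
theta = 0.7          # arbitrary rotation angle, chosen only for illustration
n = 200_000
c, s = math.cos(theta), math.sin(theta)

sum_v = sum_w = sum_vv = sum_ww = sum_vw = 0.0
for _ in range(n):
    x, y = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    v = x * c + y * s          # V = X cos(theta) + Y sin(theta)
    w = -x * s + y * c         # W = -X sin(theta) + Y cos(theta)
    sum_v += v; sum_w += w
    sum_vv += v * v; sum_ww += w * w; sum_vw += v * w

print("Var(V)  ≈", sum_vv / n - (sum_v / n) ** 2)                 # ≈ 1
print("Var(W)  ≈", sum_ww / n - (sum_w / n) ** 2)                 # ≈ 1
print("Cov(V,W)≈", sum_vw / n - (sum_v / n) * (sum_w / n))        # ≈ 0
```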

I Exercise 2.4. In Example 2.2.8, X is a standard normal random variable and Z is an independent
random variable satisfying
P{Z = 1} = P{Z = −1} = \frac{1}{2}.
We defined Y = XZ and showed that Y is standard normal. We established that although X and Y are
uncorrelated, they are not independent. In this exercise, we use moment-generating functions to show that
Y is standard normal and X and Y are not independent.
(i) Establish the joint moment-generating function formula
Ee^{uX+vY} = e^{\frac{1}{2}(u^2+v^2)} · \frac{e^{uv} + e^{−uv}}{2}.

Solution.
E[e^{uX+vY}] = E[e^{uX+vXZ}]
= E[e^{uX+vXZ} | Z = 1] P(Z = 1) + E[e^{uX+vXZ} | Z = −1] P(Z = −1)
= \frac{1}{2} E[e^{uX+vX}] + \frac{1}{2} E[e^{uX−vX}]
= \frac{1}{2} [e^{\frac{(u+v)^2}{2}} + e^{\frac{(u−v)^2}{2}}]
= e^{\frac{u^2+v^2}{2}} · \frac{e^{uv} + e^{−uv}}{2}.

(ii) Use the formula above to show that Ee^{vY} = e^{\frac{1}{2}v^2}. This is the moment-generating function for a standard
normal random variable, and thus Y must be a standard normal random variable.

Proof. Let u = 0, and we are done.

(iii) Use the formula in (i) and Theorem 2.2.7(iv) to show that X and Y are not independent.

Proof. E[e^{uX}] = e^{\frac{u^2}{2}} and E[e^{vY}] = e^{\frac{v^2}{2}}. Since \frac{e^{uv}+e^{−uv}}{2} > 1 whenever uv ≠ 0, we have
E[e^{uX+vY}] ≠ E[e^{uX}]E[e^{vY}]. Therefore X and Y cannot be independent.

I Exercise 2.5. Let (X, Y) be a pair of random variables with joint density function
f_{X,Y}(x, y) = \frac{2|x|+y}{\sqrt{2π}} exp{−\frac{(2|x|+y)^2}{2}} if y ≥ −|x|,  and  f_{X,Y}(x, y) = 0 if y < −|x|.
Show that X and Y are standard normal random variables and that they are uncorrelated but not independent.

Proof. The density f_X(x) of X can be obtained by
f_X(x) = ∫ f_{X,Y}(x, y) dy = ∫_{\{y≥−|x|\}} \frac{2|x|+y}{\sqrt{2π}} e^{−\frac{(2|x|+y)^2}{2}} dy = ∫_{\{ξ≥|x|\}} \frac{ξ}{\sqrt{2π}} e^{−\frac{ξ^2}{2}} dξ = \frac{1}{\sqrt{2π}} e^{−\frac{x^2}{2}}.
The density f_Y(y) of Y can be obtained by
f_Y(y) = ∫ f_{X,Y}(x, y) dx
= ∫ 1_{\{|x|≥−y\}} \frac{2|x|+y}{\sqrt{2π}} e^{−\frac{(2|x|+y)^2}{2}} dx
= ∫_{0∨(−y)}^∞ \frac{2x+y}{\sqrt{2π}} e^{−\frac{(2x+y)^2}{2}} dx + ∫_{−∞}^{0∧y} \frac{−2x+y}{\sqrt{2π}} e^{−\frac{(−2x+y)^2}{2}} dx
= 2 ∫_{|y|}^∞ \frac{ξ}{\sqrt{2π}} e^{−\frac{ξ^2}{2}} d(\frac{ξ}{2})
= \frac{1}{\sqrt{2π}} e^{−\frac{y^2}{2}}.
So both X and Y are standard normal random variables. Since f_{X,Y}(x, y) ≠ f_X(x)f_Y(y), X and Y are not
independent. However, if we set F(t) = ∫_t^∞ \frac{u^2}{\sqrt{2π}} e^{−\frac{u^2}{2}} du, we have
E[XY] = ∫_{−∞}^∞ ∫_{−∞}^∞ xy f_{X,Y}(x, y) dx dy
= ∫_{−∞}^∞ ∫_{−∞}^∞ xy 1_{\{y≥−|x|\}} \frac{2|x|+y}{\sqrt{2π}} e^{−\frac{(2|x|+y)^2}{2}} dx dy
= ∫_{−∞}^∞ x dx ∫_{−|x|}^∞ y \frac{2|x|+y}{\sqrt{2π}} e^{−\frac{(2|x|+y)^2}{2}} dy
= ∫_{−∞}^∞ x dx ∫_{|x|}^∞ (ξ − 2|x|) \frac{ξ}{\sqrt{2π}} e^{−\frac{ξ^2}{2}} dξ
= ∫_{−∞}^∞ x (∫_{|x|}^∞ \frac{ξ^2}{\sqrt{2π}} e^{−\frac{ξ^2}{2}} dξ − 2|x| \frac{e^{−\frac{x^2}{2}}}{\sqrt{2π}}) dx
= ∫_0^∞ x ∫_x^∞ \frac{ξ^2}{\sqrt{2π}} e^{−\frac{ξ^2}{2}} dξ dx + ∫_{−∞}^0 x ∫_{−x}^∞ \frac{ξ^2}{\sqrt{2π}} e^{−\frac{ξ^2}{2}} dξ dx
= ∫_0^∞ xF(x) dx + ∫_{−∞}^0 xF(−x) dx
= 0.
(In the next-to-last step, the term −2x|x| e^{−x^2/2}/\sqrt{2π} integrates to zero because it is an odd function of x, and the
two remaining integrals cancel under the substitution x ↦ −x.)

I Exercise 2.6. Consider a probability space Ω with four elements, which we call a, b, c, and d (i.e.,
Ω = {a, b, c, d}). The σ-algebra F is the collection of all subsets of Ω; i.e., the sets in F are

Ω, {a, b, c}, {a, b, d}, {a, c, d}, {b, c, d},


{a, b}, {a, c}, {a, d}, {b, c}, {b, d}, {c, d},
{a}, {b}, {c}, {d}, ∅.

We define a probability measure P by specifying that
P{a} = \frac{1}{6}, P{b} = \frac{1}{3}, P{c} = \frac{1}{4}, P{d} = \frac{1}{4},
and, as usual, the probability of every other set in F is the sum of the probabilities of the elements in the
set, e.g., P{a, b, c} = P{a} + P{b} + P{c} = \frac{3}{4}.
We next define two random variables, X and Y , by the formulas

X(a) = 1, X(b) = 1, X(c) = −1, X(d) = −1,


Y (a) = 1, Y (b) = −1, Y (c) = 1, Y (d) = −1.

We then define Z = X + Y .
(i) List the sets in σ(X).
Solution. σ(X) = {∅, Ω, {a, b}, {c, d}}.
(ii) Determine E[Y |X] (i.e., specify the values of this random variable for a, b, c, and d). Verify that the
partial-averaging property is satisfied.
Solution.
E[Y|X] = Σ_{α∈\{1,−1\}} E[Y | X = α] 1_{\{X=α\}}
= Σ_{α∈\{1,−1\}} \frac{E[Y 1_{\{X=α\}}]}{P(X = α)} 1_{\{X=α\}}
= \frac{1 · P(Y = 1, X = 1) − 1 · P(Y = −1, X = 1)}{P(X = 1)} 1_{\{X=1\}} + \frac{1 · P(Y = 1, X = −1) − 1 · P(Y = −1, X = −1)}{P(X = −1)} 1_{\{X=−1\}}
= \frac{P(\{a\}) − P(\{b\})}{P(\{a, b\})} 1_{\{X=1\}} + \frac{P(\{c\}) − P(\{d\})}{P(\{c, d\})} 1_{\{X=−1\}}
= −\frac{1}{3} 1_{\{X=1\}}.
To verify the partial-averaging property, we note
E[E[Y|X] 1_{\{X=1\}}] = −\frac{1}{3} E[1_{\{X=1\}}] = −\frac{1}{3} P(\{a, b\}) = −\frac{1}{6}
and
E[Y 1_{\{X=1\}}] = P(Y = 1, X = 1) − P(Y = −1, X = 1) = P(\{a\}) − P(\{b\}) = −\frac{1}{6}.
Similarly,
E[E[Y|X] 1_{\{X=−1\}}] = 0
and
E[Y 1_{\{X=−1\}}] = P(Y = 1, X = −1) − P(Y = −1, X = −1) = P(\{c\}) − P(\{d\}) = 0.
Together, we can conclude the partial-averaging property holds.
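Because Ω has only four points, the partial-averaging check can also be done by brute force. Below is a small sketch (our own illustration in Python, not from the textbook) that enumerates Ω with exact rational arithmetic and compares E[E[Y|X] 1_A] with E[Y 1_A] for A = {X = 1} and A = {X = −1}.

```python
from fractions import Fraction as F

P = {"a": F(1, 6), "b": F(1, 3), "c": F(1, 4), "d": F(1, 4)}
X = {"a": 1, "b": 1, "c": -1, "d": -1}
Y = {"a": 1, "b": -1, "c": 1, "d": -1}

def cond_exp_Y_given_X(omega):
    # E[Y|X](omega) averages Y over the atom {X = X(omega)} of sigma(X).
    atom = [w for w in P if X[w] == X[omega]]
    return sum(P[w] * Y[w] for w in atom) / sum(P[w] for w in atom)

for x_value in (1, -1):
    A = [w for w in P if X[w] == x_value]
    lhs = sum(P[w] * cond_exp_Y_given_X(w) for w in A)   # E[ E[Y|X] 1_A ]
    rhs = sum(P[w] * Y[w] for w in A)                    # E[ Y 1_A ]
    print(x_value, lhs, rhs, lhs == rhs)
```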

(iii) Determine E[Z|X]. Again, verify the partial-averaging property.

Solution.
E[Z|X] = X + E[Y|X] = \frac{2}{3} 1_{\{X=1\}} − 1_{\{X=−1\}}.
Verification of the partial-averaging property is skipped.

(iv) Compute E[Z|X] − E[Y |X]. Citing the appropriate properties of conditional expectation from Theorem
2.3.2, explain why you get X.
Solution. E[Z|X] − E[Y |X] = E[Z − Y |X] = E[X|X] = X, where the first equality is due to linearity
(Theorem 2.3.2(i)) and the last equality is due to “taking out what is known” (Theorem 2.3.2(ii)).

I Exercise 2.7. Let Y be an integrable random variable on a probability space (Ω, F, P) and let G be a
sub-σ-algebra of F. Based on the information in G, we can form the estimate E[Y |G] of Y and define the
error of the estimation Err = Y − E[Y |G]. This is a random variable with expectation zero and some variance
Var(Err). Let X be some other G-measurable random variable, which we can regard as another estimate of
Y . Show that
Var(Err) ≤ Var(Y − X).
In other words, the estimate E[Y |G] minimizes the variance of the error among all estimates based on the
information in G. (Hint: Let µ = E(Y − X). Compute the variance of Y − X as
E[(Y − X − µ)^2] = E[((Y − E[Y|G]) + (E[Y|G] − X − µ))^2].

Multiply out the right-hand side and use iterated conditioning to show the cross-term is zero.)
Proof. Let µ = E[Y − X] and ξ = E[Y − X − µ | G] = E[Y|G] − X − µ. Note that ξ is G-measurable, and we have
Var(Y − X) = E[(Y − X − µ)^2]
= E[((Y − E[Y|G]) + (E[Y|G] − X − µ))^2]
= Var(Err) + 2E[(Y − E[Y|G])ξ] + E[ξ^2]
= Var(Err) + 2E[Yξ − E[Yξ|G]] + E[ξ^2]
= Var(Err) + E[ξ^2]
≥ Var(Err).

I Exercise 2.8. Let X and Y be integrable random variables on a probability space (Ω, F, P). Then
Y = Y1 + Y2 , where Y1 = E[Y |X] is σ(X)-measurable and Y2 = Y − E[Y |X]. Show that Y2 and X are
uncorrelated. More generally, show that Y2 is uncorrelated with every σ(X)-measurable random variable.

Proof. It suffices to prove the more general case. For any σ(X)-measurable random variable ξ, E[Y2 ξ] =
E[(Y − E{Y |X})ξ] = E[Y ξ − E{Y |X}ξ] = E[Y ξ] − E[Y ξ] = 0.

I Exercise 2.9. Let X be a random variable.


(i) Give an example of a probability space (Ω, F, P), a random variable X defined on this probability space,
and a function f so that the σ-algebra generated by f (X) is not the trivial σ-algebra {∅, Ω} but is strictly
smaller than the σ-algebra generated by X.

Solution. Consider the dice-toss space similar to the coin-toss space. Then a typical element ω in this
space is an infinite sequence ω1 ω2 ω3 · · · , with ωi ∈ {1, 2, · · · , 6} (i ∈ N). We define X(ω) = ω1 and
f (x) = 1{odd integers} (x). Then it’s easy to see

σ(X) = {∅, Ω, {ω : ω1 = 1}, · · · , {ω : ω1 = 6}}

and σ(f (X)) equals to

{∅, Ω, {ω : ω1 = 1} ∪ {ω : ω1 = 3} ∪ {ω : ω1 = 5}, {ω : ω1 = 2} ∪ {ω : ω1 = 4} ∪ {ω : ω1 = 6}}.

So {∅, Ω} ⊊ σ(f(X)) ⊊ σ(X), and each of these containments is strict.


(ii) Can the σ-algebra generated by f (X) ever be strictly larger than the σ-algebra generated by X?
Solution. No. σ(f (X)) ⊂ σ(X) is always true.

I Exercise 2.10. Let X and Y be random variable (on some unspecified probability space (Ω, F, P)),
assume they have a joint density fX,Y (x, y), and assume E|Y | < ∞. In particular, for every Borel subset C
of R^2, we have
P{(X, Y) ∈ C} = ∫_C f_{X,Y}(x, y) dx dy.
In elementary probability, one learns to compute E[Y|X = x], which is a nonrandom function of the
dummy variable x, by the formula
E[Y|X = x] = ∫_{−∞}^∞ y f_{Y|X}(y|x) dy,    (2.6.1)
where f_{Y|X}(y|x) is the conditional density defined by
f_{Y|X}(y|x) = \frac{f_{X,Y}(x, y)}{f_X(x)}.
The denominator in this expression, f_X(x) = ∫_{−∞}^∞ f_{X,Y}(x, η) dη, is the marginal density of X, and we must
assume it is strictly positive for every x. We introduce the symbol g(x) for the function E[Y|X = x] defined
by (2.6.1); i.e.,
g(x) = ∫_{−∞}^∞ y f_{Y|X}(y|x) dy = ∫_{−∞}^∞ \frac{y f_{X,Y}(x, y)}{f_X(x)} dy.
In measure-theoretic probability, conditional expectation is a random variable E[Y|X]. This exercise is
to show that when there is a joint density for (X, Y), this random variable can be obtained by substituting
the random variable X in place of the dummy variable x in the function g(x). In other words, this exercise
is to show that
E[Y|X] = g(X).
(We introduced the symbol g(x) in order to avoid the mathematically confusing expression E[Y|X = X].)
Since g(X) is obviously σ(X)-measurable, to verify that E[Y|X] = g(X), we need only check that the
partial-averaging property is satisfied. For every Borel-measurable function h mapping R to R and satisfying
E|h(X)| < ∞, we have
Eh(X) = ∫_{−∞}^∞ h(x)f_X(x) dx.    (2.6.2)
This is Theorem 1.5.2 in Chapter 1. Similarly, if h is a function of both x and y, then
Eh(X, Y) = ∫_{−∞}^∞ ∫_{−∞}^∞ h(x, y)f_{X,Y}(x, y) dx dy    (2.6.3)

whenever (X, Y ) has a joint density fX,Y (x, y). You may use both (2.6.2) and (2.6.3) in your solution to
this problem.

Let A be a set in σ(X). By the definition of σ(X), there is a Borel subset B of R such that A = {ω ∈
Ω; X(ω) ∈ B} or, more simply, A = {X ∈ B}. Show the partial-averaging property
∫_A g(X) dP = ∫_A Y dP.

Proof.
∫_A g(X) dP = E[g(X)1_B(X)] = ∫_{−∞}^∞ g(x)1_B(x)f_X(x) dx = ∫ ∫ \frac{y f_{X,Y}(x, y)}{f_X(x)} dy 1_B(x) f_X(x) dx
= ∫ ∫ y 1_B(x) f_{X,Y}(x, y) dx dy = E[Y 1_B(X)] = E[Y 1_A] = ∫_A Y dP.

I Exercise 2.11.
(i) Let X be a random variable on a probability space (Ω, F, P), and let W be a nonnegative σ(X)-measurable
random variable. Show there exists a function g such that W = g(X). (Hint: Recall that every set in σ(X)
is of the form {X ∈ B} for some Borel set B ⊂ R. Suppose first that W is the indicator of such a set, and
then use the standard machine.)
Proof. We can find a sequence {W_n}_{n≥1} of σ(X)-measurable simple functions such that W_n ↑ W. Each W_n
can be written in the form Σ_{i=1}^{K_n} a_i^n 1_{A_i^n}, where the A_i^n's belong to σ(X) and are disjoint. So each A_i^n can be
written as {X ∈ B_i^n} for some Borel subset B_i^n of R, i.e. W_n = Σ_{i=1}^{K_n} a_i^n 1_{\{X∈B_i^n\}} = Σ_{i=1}^{K_n} a_i^n 1_{B_i^n}(X) = g_n(X),
where g_n(x) = Σ_{i=1}^{K_n} a_i^n 1_{B_i^n}(x). Define g = lim sup_n g_n; then g is a Borel function. By taking upper limits on
both sides of W_n = g_n(X), we get W = g(X).
(ii) Let X be a random variable on a probability space (Ω, F, P), and let Y be a nonnegative random variable
on this space. We do not assume that X and Y have a joint density. Nonetheless, show there is a function
g such that E[Y |X] = g(X).

Proof. Note E[Y |X] is σ(X)-measurable. By (i), we can find a Borel function g such that E[Y |X] = g(X).

3 Brownian Motion
I Exercise 3.1. According to Definition 3.3.3(iii), for 0 ≤ t < u, the Brownian motion increment W (u) −
W (t) is independent of the σ-algebra F(t). Use this property and property (i) of that definition to show
that, for 0 ≤ t < u1 < u2, the increment W(u2) − W(u1) is also independent of F(t).
Proof. We have F(t) ⊂ F(u1 ) and W (u2 )−W (u1 ) is independent of F(u1 ). So in particular, W (u2 )−W (u1 )
is independent of F(t).

I Exercise 3.2. Let W (t), t ≥ 0, be a Brownian motion, and let F(t), t ≥ 0, be a filtration for this
Brownian motion. Show that W 2 (t) − t is a martingale. (Hint: For 0 ≤ s ≤ t, write W 2 (t) as (W (t) −
W (s))2 + 2W (t)W (s) − W 2 (s).)

Proof. E[W 2 (t) − W 2 (s)|F(s)] = E[(W (t) − W (s))2 + 2W (t)W (s) − 2W 2 (s)|F(s)] = t − s + 2W (s)E[W (t) −
W (s)|F(s)] = t − s. Simple algebra gives E[W 2 (t) − t|F(s)] = W 2 (s) − s. So W 2 (t) − t is a martingale.

I Exercise 3.3 (Normal kurtosis). The kurtosis of a random variable is defined to be the ratio of its
fourth central moment to the square of its variance. For a normal random variable, the kurtosis is 3. This
fact was used to obtain (3.4.7). This exercise verifies this fact.
Let X be a normal random variable with mean µ, so that X − µ has mean zero. Let the variance of X,
which is also the variance of X − µ, be σ^2. In (3.2.13), we computed the moment-generating function of
X − µ to be φ(u) = Ee^{u(X−µ)} = e^{\frac{1}{2}u^2σ^2}, where u is a real variable. Differentiating this function with respect
to u, we obtain
φ′(u) = E[(X − µ)e^{u(X−µ)}] = σ^2 u e^{\frac{1}{2}σ^2 u^2}
and, in particular, φ′(0) = E(X − µ) = 0. Differentiating again, we obtain
φ″(u) = E[(X − µ)^2 e^{u(X−µ)}] = (σ^2 + σ^4 u^2)e^{\frac{1}{2}σ^2 u^2}
and, in particular, φ″(0) = E[(X − µ)^2] = σ^2. Differentiate two more times and obtain the normal kurtosis
formula E[(X − µ)^4] = 3σ^4.

Solution.
φ^{(3)}(u) = 2σ^4 u e^{\frac{1}{2}σ^2 u^2} + (σ^2 + σ^4 u^2)σ^2 u e^{\frac{1}{2}σ^2 u^2} = e^{\frac{1}{2}σ^2 u^2}(3σ^4 u + σ^6 u^3),
and
φ^{(4)}(u) = σ^2 u e^{\frac{1}{2}σ^2 u^2}(3σ^4 u + σ^6 u^3) + e^{\frac{1}{2}σ^2 u^2}(3σ^4 + 3σ^6 u^2) = e^{\frac{1}{2}σ^2 u^2}(σ^8 u^4 + 6σ^6 u^2 + 3σ^4).
So E[(X − µ)^4] = φ^{(4)}(0) = 3σ^4.

I Exercise 3.4 (Other variations of Brownian motion). Theorem 3.4.3 asserts that if T is a positive
number and we choose a partition Π with points 0 = t_0 < t_1 < t_2 < · · · < t_n = T, then as the number
n of partition points approaches infinity and the length of the longest subinterval ||Π|| approaches zero, the
sample quadratic variation
Σ_{j=0}^{n−1} (W(t_{j+1}) − W(t_j))^2
approaches T for almost every path of the Brownian motion W. In Remark 3.4.5, we further showed that
Σ_{j=0}^{n−1} (W(t_{j+1}) − W(t_j))(t_{j+1} − t_j) and Σ_{j=0}^{n−1} (t_{j+1} − t_j)^2 have limit zero. We summarize these facts by the
multiplication rules
dW(t)dW(t) = dt,  dW(t)dt = 0,  dtdt = 0.    (3.10.1)
(i) Show that as the number n of partition points approaches infinity and the length of the longest subinterval
approaches zero, the sample first variation
Σ_{j=0}^{n−1} |W(t_{j+1}) − W(t_j)|
approaches ∞ for almost every path of the Brownian motion W. (Hint:
Σ_{j=0}^{n−1} (W(t_{j+1}) − W(t_j))^2 ≤ max_{0≤k≤n−1} |W(t_{k+1}) − W(t_k)| · Σ_{j=0}^{n−1} |W(t_{j+1}) − W(t_j)|.)

Proof. Assume there exists A ∈ F such that P(A) > 0 and for every ω ∈ A,
lim sup_n Σ_{j=0}^{n−1} |W_{t_{j+1}} − W_{t_j}|(ω) < ∞.
Then for every ω ∈ A,
Σ_{j=0}^{n−1} (W_{t_{j+1}} − W_{t_j})^2(ω) ≤ max_{0≤k≤n−1} |W_{t_{k+1}} − W_{t_k}|(ω) · Σ_{j=0}^{n−1} |W_{t_{j+1}} − W_{t_j}|(ω)
≤ max_{0≤k≤n−1} |W_{t_{k+1}} − W_{t_k}|(ω) · lim sup_n Σ_{j=0}^{n−1} |W_{t_{j+1}} − W_{t_j}|(ω)
→ 0,
since by uniform continuity of continuous functions over a closed interval, lim_{n→∞} max_{0≤k≤n−1} |W_{t_{k+1}} − W_{t_k}|(ω) = 0.
This contradicts lim_{n→∞} Σ_{j=0}^{n−1} (W_{t_{j+1}} − W_{t_j})^2 = T a.s.

(ii) Show that as the number n of partition points approaches infinity and the length of the longest subinterval
approaches zero, the sample cubic variation
Σ_{j=0}^{n−1} |W(t_{j+1}) − W(t_j)|^3
approaches zero for almost every path of the Brownian motion W.

Proof. We note that by an argument similar to (i),
Σ_{j=0}^{n−1} |W(t_{j+1}) − W(t_j)|^3 ≤ max_{0≤k≤n−1} |W(t_{k+1}) − W(t_k)| · Σ_{j=0}^{n−1} (W(t_{j+1}) − W(t_j))^2 → 0
as n → ∞.

I Exercise 3.5 (Black-Scholes-Merton formula). Let the interest rate r and the volatility σ > 0 be
constant. Let
S(t) = S(0)e^{(r − \frac{1}{2}σ^2)t + σW(t)}
be a geometric Brownian motion with mean rate of return r, where the initial stock price S(0) is positive.
Let K be a positive constant. Show that, for T > 0,
E[e^{−rT}(S(T) − K)^+] = S(0)N(d_+(T, S(0))) − Ke^{−rT}N(d_−(T, S(0))),
where
d_±(T, S(0)) = \frac{1}{σ\sqrt{T}} [log\frac{S(0)}{K} + (r ± \frac{σ^2}{2})T],
and N is the cumulative standard normal distribution function
N(y) = \frac{1}{\sqrt{2π}} ∫_{−∞}^y e^{−\frac{1}{2}z^2} dz = \frac{1}{\sqrt{2π}} ∫_{−y}^∞ e^{−\frac{1}{2}z^2} dz.

Proof.
E[e^{−rT}(S_T − K)^+]
= e^{−rT} ∫_{\frac{1}{σ}[ln\frac{K}{S_0} − (r − \frac{1}{2}σ^2)T]}^∞ (S_0 e^{(r − \frac{1}{2}σ^2)T + σx} − K) \frac{e^{−\frac{x^2}{2T}}}{\sqrt{2πT}} dx
= e^{−rT} ∫_{\frac{1}{σ\sqrt{T}}[ln\frac{K}{S_0} − (r − \frac{1}{2}σ^2)T]}^∞ (S_0 e^{(r − \frac{1}{2}σ^2)T + σ\sqrt{T}y} − K) \frac{e^{−\frac{y^2}{2}}}{\sqrt{2π}} dy
= S_0 ∫_{\frac{1}{σ\sqrt{T}}[ln\frac{K}{S_0} − (r − \frac{1}{2}σ^2)T]}^∞ \frac{1}{\sqrt{2π}} e^{−\frac{y^2}{2} + σ\sqrt{T}y − \frac{1}{2}σ^2 T} dy − Ke^{−rT} ∫_{\frac{1}{σ\sqrt{T}}[ln\frac{K}{S_0} − (r − \frac{1}{2}σ^2)T]}^∞ \frac{1}{\sqrt{2π}} e^{−\frac{y^2}{2}} dy
= S_0 ∫_{\frac{1}{σ\sqrt{T}}[ln\frac{K}{S_0} − (r − \frac{1}{2}σ^2)T] − σ\sqrt{T}}^∞ \frac{1}{\sqrt{2π}} e^{−\frac{ξ^2}{2}} dξ − Ke^{−rT} N(\frac{1}{σ\sqrt{T}}[ln\frac{S_0}{K} + (r − \frac{1}{2}σ^2)T])
= S_0 N(d_+(T, S_0)) − Ke^{−rT} N(d_−(T, S_0)).
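The closed-form value can be cross-checked against a direct Monte Carlo average of e^{−rT}(S(T) − K)^+. A minimal sketch (assuming Python with the standard library; the parameter values are illustrative, not from the textbook):

```python
import math
import random
from statistics import NormalDist

def bs_call(s0, k, r, sigma, t):
    """Black-Scholes-Merton call price S(0)N(d+) - K e^{-rT} N(d-)."""
    d_plus = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d_minus = d_plus - sigma * math.sqrt(t)
    N = NormalDist().cdf
    return s0 * N(d_plus) - k * math.exp(-r * t) * N(d_minus)

def mc_call(s0, k, r, sigma, t, n=500_000, seed=0):
    """Monte Carlo estimate of E[e^{-rT}(S(T)-K)^+] with S(T) = S(0)e^{(r-sigma^2/2)T + sigma W(T)}."""
    rng = random.Random(seed)
    disc = math.exp(-r * t)
    total = 0.0
    for _ in range(n):
        w_t = rng.gauss(0.0, math.sqrt(t))
        s_t = s0 * math.exp((r - 0.5 * sigma**2) * t + sigma * w_t)
        total += disc * max(s_t - k, 0.0)
    return total / n

if __name__ == "__main__":
    print(bs_call(100, 105, 0.05, 0.2, 1.0))  # closed form
    print(mc_call(100, 105, 0.05, 0.2, 1.0))  # should agree to roughly two decimals
```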

I Exercise 3.6. Let W (t) be a Brownian motion and let F(t), t ≥ 0, be an associated filtration.
(i) For µ ∈ R, consider the Brownian motion with drift µ:

X(t) = µt + W (t).

Show that for any Borel-measurable function f(y), and for any 0 ≤ s < t, the function
g(x) = \frac{1}{\sqrt{2π(t−s)}} ∫_{−∞}^∞ f(y) exp{−\frac{(y − x − µ(t−s))^2}{2(t−s)}} dy
satisfies E[f(X(t))|F(s)] = g(X(s)), and hence X has the Markov property. We may rewrite g(x) as
g(x) = ∫_{−∞}^∞ f(y)p(τ, x, y) dy, where τ = t − s and
p(τ, x, y) = \frac{1}{\sqrt{2πτ}} exp{−\frac{(y − x − µτ)^2}{2τ}}
is the transition density for Brownian motion with drift µ.

Proof.
E[f(X_t)|F_s] = E[f(W_t − W_s + a)|F_s]|_{a = W_s + µt}
= E[f(W_{t−s} + a)]|_{a = W_s + µt}
= ∫_{−∞}^∞ f(x + W_s + µt) \frac{e^{−\frac{x^2}{2(t−s)}}}{\sqrt{2π(t−s)}} dx
= ∫_{−∞}^∞ f(y) \frac{e^{−\frac{(y − W_s − µs − µ(t−s))^2}{2(t−s)}}}{\sqrt{2π(t−s)}} dy
= g(X_s).
So E[f(X_t)|F_s] = ∫_{−∞}^∞ f(y)p(t − s, X_s, y) dy with p(τ, x, y) = \frac{1}{\sqrt{2πτ}} e^{−\frac{(y−x−µτ)^2}{2τ}}.

(ii) For ν ∈ R and σ > 0, consider the geometric Brownian motion
S(t) = S(0)e^{σW(t) + νt}.
Set τ = t − s and
p(τ, x, y) = \frac{1}{σy\sqrt{2πτ}} exp{−\frac{(log\frac{y}{x} − ντ)^2}{2σ^2τ}}.
Show that for any Borel-measurable function f(y) and for any 0 ≤ s < t the function
g(x) = ∫_0^∞ f(y)p(τ, x, y) dy ¹
satisfies E[f(S(t))|F(s)] = g(S(s)), and hence S has the Markov property and p(τ, x, y) is its transition
density.

Proof. E[f(S_t)|F_s] = E[f(S_0 e^{σX_t})|F_s], where X_t = W_t + \frac{ν}{σ}t, i.e. µ = \frac{ν}{σ} in the notation of (i). So by (i),
E[f(S_t)|F_s] = ∫_{−∞}^∞ f(S_0 e^{σy}) \frac{1}{\sqrt{2π(t−s)}} e^{−\frac{[y − X_s − µ(t−s)]^2}{2(t−s)}} dy
(substituting z = S_0 e^{σy})
= ∫_0^∞ f(z) \frac{1}{\sqrt{2π(t−s)}} e^{−\frac{[\frac{1}{σ}ln\frac{z}{S_0} − \frac{1}{σ}ln\frac{S_s}{S_0} − µ(t−s)]^2}{2(t−s)}} \frac{dz}{σz}
= ∫_0^∞ f(z) \frac{e^{−\frac{[ln\frac{z}{S_s} − ν(t−s)]^2}{2σ^2(t−s)}}}{σz\sqrt{2π(t−s)}} dz
= ∫_0^∞ f(z)p(t − s, S_s, z) dz
= g(S_s).

¹ The textbook wrote ∫_0^∞ h(y)p(τ, x, y) dy by mistake.

I Exercise 3.7. Theorem 3.6.2 provides the Laplace transform of the density of the first passage time for
Brownian motion. This problem derives the analogous formula for Brownian motions with drift. Let W be
a Brownian motion. Fix m > 0 and µ ∈ R. For 0 ≤ t < ∞, define

X(t) = µt + W (t),
τm = min{t ≥ 0; X(t) = m}.

As usual, we set τ_m = ∞ if X(t) never reaches the level m. Let σ be a positive number and set
Z(t) = exp{σX(t) − (σµ + \frac{1}{2}σ^2)t}.

(i) Show that Z(t), t ≥ 0, is a martingale.

Proof. For 0 ≤ s ≤ t,
E[\frac{Z_t}{Z_s} | F_s] = E[exp{σ(W_t − W_s) + σµ(t−s) − (σµ + \frac{σ^2}{2})(t−s)}]
= exp{σµ(t−s) − (σµ + \frac{σ^2}{2})(t−s)} ∫_{−∞}^∞ e^{σ\sqrt{t−s}·x} \frac{e^{−\frac{x^2}{2}}}{\sqrt{2π}} dx
= exp{−\frac{σ^2}{2}(t−s)} ∫_{−∞}^∞ \frac{e^{−\frac{(x−σ\sqrt{t−s})^2}{2}}}{\sqrt{2π}} dx · exp{\frac{σ^2}{2}(t−s)}
= 1,
so E[Z_t|F_s] = Z_s and Z(t) is a martingale.

(ii) Use (i) to conclude that
E[exp{σX(t ∧ τ_m) − (σµ + \frac{1}{2}σ^2)(t ∧ τ_m)}] = 1, t ≥ 0.

Proof. By the optional stopping theorem, E[Z_{t∧τ_m}] = E[Z_0] = 1, that is,
E[exp{σX_{t∧τ_m} − (σµ + \frac{σ^2}{2})(t ∧ τ_m)}] = 1.

(iii) Now suppose µ ≥ 0. Show that, for σ > 0,
E[exp{σm − (σµ + \frac{1}{2}σ^2)τ_m} 1_{\{τ_m<∞\}}] = 1.

Proof. If µ ≥ 0 and σ > 0, then Z_{t∧τ_m} ≤ e^{σm}. By the bounded convergence theorem,
E[1_{\{τ_m<∞\}} Z_{τ_m}] = E[lim_{t→∞} Z_{t∧τ_m}] = lim_{t→∞} E[Z_{t∧τ_m}] = 1,
since on the event {τ_m = ∞}, Z_{t∧τ_m} ≤ e^{σm − \frac{1}{2}σ^2 t} → 0 as t → ∞. Therefore, E[e^{σm − (σµ + \frac{σ^2}{2})τ_m} 1_{\{τ_m<∞\}}] = 1.
Letting σ ↓ 0, by the bounded convergence theorem, we have P(τ_m < ∞) = 1. Setting α = σµ + \frac{σ^2}{2}, we get
E[e^{−ατ_m}] = e^{−σm} = e^{mµ − m\sqrt{2α + µ^2}}.

(iv) Show that if µ > 0, then Eτ_m < ∞. Obtain a formula for Eτ_m. (Hint: Differentiate the formula in (iii)
with respect to α.)

Proof. We note that for α > 0, E[τ_m e^{−ατ_m}] < ∞ since xe^{−αx} is bounded on [0, ∞). So by an argument similar
to Exercise 1.8, E[e^{−ατ_m}] is differentiable in α and
\frac{∂}{∂α} E[e^{−ατ_m}] = −E[τ_m e^{−ατ_m}] = e^{mµ − m\sqrt{2α+µ^2}} · \frac{−m}{\sqrt{2α + µ^2}}.
Letting α ↓ 0, by the monotone convergence theorem, E[τ_m] = \frac{m}{µ} < ∞ for µ > 0.
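The formula E[τ_m] = m/µ can be checked by simulating the drifted Brownian motion on a fine time grid and recording the first time the path reaches level m. A rough sketch (our own illustration in Python; Euler time-stepping is used, which slightly biases the hitting time upward, and the parameter values are arbitrary):

```python
import math
import random

def first_passage_time(m, mu, dt=1e-3, t_max=200.0, rng=random):
    """First time X(t) = mu*t + W(t) reaches level m on a grid of step dt (capped at t_max)."""
    x, t = 0.0, 0.0
    sqrt_dt = math.sqrt(dt)
    while x < m and t < t_max:
        x += mu * dt + rng.gauss(0.0, sqrt_dt)
        t += dt
    return t

if __name__ == "__main__":
    random.seed(2)
    m, mu, n_paths = 1.0, 0.5, 2_000
    est = sum(first_passage_time(m, mu) for _ in range(n_paths)) / n_paths
    print("simulated E[tau_m] ≈", est, "   theoretical m/mu =", m / mu)
```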

(v) Now suppose µ < 0. Show that, for σ > −2µ,
E[exp{σm − (σµ + \frac{1}{2}σ^2)τ_m} 1_{\{τ_m<∞\}}] = 1.
Use this fact to show that P{τ_m < ∞} = e^{−2m|µ|},² which is strictly less than one, and to obtain the Laplace
transform
Ee^{−ατ_m} = e^{mµ − m\sqrt{2α+µ^2}} for all α > 0.

Proof. By σ > −2µ > 0, we get σµ + \frac{σ^2}{2} > 0. Then Z_{t∧τ_m} ≤ e^{σm} and, on the event {τ_m = ∞},
Z_{t∧τ_m} ≤ e^{σm − (\frac{σ^2}{2} + σµ)t} → 0 as t → ∞. Therefore,
E[e^{σm − (σµ + \frac{σ^2}{2})τ_m} 1_{\{τ_m<∞\}}] = E[lim_{t→∞} Z_{t∧τ_m}] = lim_{t→∞} E[Z_{t∧τ_m}] = 1.
Letting σ ↓ −2µ, we then get P(τ_m < ∞) = e^{2µm} = e^{−2|µ|m} < 1. Setting α = σµ + \frac{σ^2}{2} > 0, we have
E[e^{−ατ_m}] = E[e^{−ατ_m} 1_{\{τ_m<∞\}}] = e^{−σm} = e^{mµ − m\sqrt{2α+µ^2}}.

I Exercise 3.8. This problem presents the convergence of the distribution of stock prices in a sequence of
binomial models to the distribution of geometric Brownian motion. In contrast to the analysis of Subsection
3.2.7, here we allow the interest rate to be different from zero.
Let σ > 0 and r ≥ 0 be given. For each positive integer n, we consider a binomial model taking n steps
per unit time. In this model, the interest rate per period is r/n, the up factor is u_n = e^{σ/\sqrt{n}}, and the down
factor is d_n = e^{−σ/\sqrt{n}}. The risk-neutral probabilities are then
p̃_n = \frac{\frac{r}{n} + 1 − e^{−σ/\sqrt{n}}}{e^{σ/\sqrt{n}} − e^{−σ/\sqrt{n}}},  q̃_n = \frac{e^{σ/\sqrt{n}} − \frac{r}{n} − 1}{e^{σ/\sqrt{n}} − e^{−σ/\sqrt{n}}}.
Let t be an arbitrary positive rational number, and for each positive integer n for which nt is an integer,
define
M_{nt,n} = Σ_{k=1}^{nt} X_{k,n},
where X_{1,n}, · · · , X_{nt,n}³ are independent, identically distributed random variables with
P̃{X_{k,n} = 1} = p̃_n,  P̃{X_{k,n} = −1} = q̃_n,  k = 1, · · · , nt⁴.
² The textbook wrote e^{−2x|µ|} by mistake.
³ The textbook wrote X_{n,n} by mistake.
⁴ The textbook wrote n by mistake.

The stock price at time t in this binomial model, which is the result of nt steps from the initial time, is given
by (see (3.2.15) for a similar equation)
S_n(t) = S(0) u_n^{\frac{1}{2}(nt + M_{nt,n})} d_n^{\frac{1}{2}(nt − M_{nt,n})}
= S(0) exp{\frac{σ}{2\sqrt{n}}(nt + M_{nt,n})}⁵ exp{−\frac{σ}{2\sqrt{n}}(nt − M_{nt,n})}
= S(0) exp{\frac{σ}{\sqrt{n}} M_{nt,n}}.
This problem shows that as n → ∞, the distribution of the sequence of random variables \frac{σ}{\sqrt{n}} M_{nt,n} appearing
in the exponent above converges to the normal distribution with mean (r − \frac{1}{2}σ^2)t and variance σ^2 t. There-
fore, the limiting distribution of S_n(t) is the same as the distribution of the geometric Brownian motion
S(0) exp{σW(t) + (r − \frac{1}{2}σ^2)t} at time t.
(i) Show that the moment-generating function φ_n(u) of \frac{1}{\sqrt{n}} M_{nt,n} is given by
φ_n(u) = [e^{u/\sqrt{n}} \frac{\frac{r}{n} + 1 − e^{−σ/\sqrt{n}}}{e^{σ/\sqrt{n}} − e^{−σ/\sqrt{n}}} − e^{−u/\sqrt{n}} \frac{\frac{r}{n} + 1 − e^{σ/\sqrt{n}}}{e^{σ/\sqrt{n}} − e^{−σ/\sqrt{n}}}]^{nt}.

Proof.
φ_n(u) = Ẽ[e^{\frac{u}{\sqrt{n}} M_{nt,n}}] = (Ẽ[e^{\frac{u}{\sqrt{n}} X_{1,n}}])^{nt} = (e^{\frac{u}{\sqrt{n}}} p̃_n + e^{−\frac{u}{\sqrt{n}}} q̃_n)^{nt}
= [e^{\frac{u}{\sqrt{n}}} \frac{\frac{r}{n} + 1 − e^{−σ/\sqrt{n}}}{e^{σ/\sqrt{n}} − e^{−σ/\sqrt{n}}} + e^{−\frac{u}{\sqrt{n}}} \frac{−\frac{r}{n} − 1 + e^{σ/\sqrt{n}}}{e^{σ/\sqrt{n}} − e^{−σ/\sqrt{n}}}]^{nt}.

(ii) We want to compute
lim_{n→∞} φ_n(u) = lim_{x↓0} φ_{1/x^2}(u),
where we have made the change of variable x = \frac{1}{\sqrt{n}}. To do this, we will compute log φ_{1/x^2}(u) and then take
the limit as x ↓ 0. Show that
log φ_{1/x^2}(u) = \frac{t}{x^2} log[\frac{(rx^2 + 1) sinh ux + sinh(σ − u)x}{sinh σx}]
(the definitions are sinh z = \frac{e^z − e^{−z}}{2}, cosh z = \frac{e^z + e^{−z}}{2}), and use the formula
sinh(A − B) = sinh A cosh B − cosh A sinh B
to rewrite this as
log φ_{1/x^2}(u) = \frac{t}{x^2} log[cosh ux + \frac{(rx^2 + 1 − cosh σx) sinh ux}{sinh σx}].

Proof.
φ_{1/x^2}(u) = [e^{ux} \frac{rx^2 + 1 − e^{−σx}}{e^{σx} − e^{−σx}} − e^{−ux} \frac{rx^2 + 1 − e^{σx}}{e^{σx} − e^{−σx}}]^{\frac{t}{x^2}}.

So,
log φ_{1/x^2}(u) = \frac{t}{x^2} log[\frac{(rx^2 + 1)(e^{ux} − e^{−ux}) + e^{(σ−u)x} − e^{−(σ−u)x}}{e^{σx} − e^{−σx}}]
= \frac{t}{x^2} log[\frac{(rx^2 + 1) sinh ux + sinh(σ − u)x}{sinh σx}]
= \frac{t}{x^2} log[\frac{(rx^2 + 1) sinh ux + sinh σx cosh ux − cosh σx sinh ux}{sinh σx}]
= \frac{t}{x^2} log[cosh ux + \frac{(rx^2 + 1 − cosh σx) sinh ux}{sinh σx}].

(iii) Use the Taylor series expansions
cosh z = 1 + \frac{1}{2}z^2 + O(z^4),  sinh z = z + O(z^3),
to show that
cosh ux + \frac{(rx^2 + 1 − cosh σx) sinh ux}{sinh σx} = 1 + \frac{1}{2}u^2x^2 + \frac{rux^2}{σ} − \frac{1}{2}uσx^2 + O(x^4).    (3.10.2)
The notation O(x^j) is used to represent terms of the order x^j.

Proof.
cosh ux + \frac{(rx^2 + 1 − cosh σx) sinh ux}{sinh σx}
= 1 + \frac{u^2x^2}{2} + O(x^4) + \frac{(rx^2 + 1 − 1 − \frac{σ^2x^2}{2} + O(x^4))(ux + O(x^3))}{σx + O(x^3)}
= 1 + \frac{u^2x^2}{2} + \frac{(r − \frac{σ^2}{2})ux^3 + O(x^5)}{σx + O(x^3)} + O(x^4)
= 1 + \frac{u^2x^2}{2} + \frac{(r − \frac{σ^2}{2})ux^3(1 + O(x^2))}{σx(1 + O(x^2))} + O(x^4)
= 1 + \frac{u^2x^2}{2} + \frac{rux^2}{σ} − \frac{1}{2}σux^2 + O(x^4).

(iv) Use the Taylor series expansion log(1 + x) = x + O(x^2) to compute lim_{x↓0} log φ_{1/x^2}(u). Now explain how
you know that the limiting distribution for \frac{σ}{\sqrt{n}} M_{nt,n} is normal with mean (r − \frac{1}{2}σ^2)t and variance σ^2 t.

Solution.
log φ_{1/x^2}(u) = \frac{t}{x^2} log(1 + \frac{u^2x^2}{2} + \frac{ru}{σ}x^2 − \frac{σux^2}{2} + O(x^4)) = \frac{t}{x^2} (\frac{u^2x^2}{2} + \frac{ru}{σ}x^2 − \frac{σux^2}{2} + O(x^4)).
So lim_{x↓0} log φ_{1/x^2}(u) = t(\frac{u^2}{2} + \frac{ru}{σ} − \frac{σu}{2}), and Ẽ[e^{u\frac{1}{\sqrt{n}}M_{nt,n}}] = φ_n(u) → exp{\frac{1}{2}tu^2 + t(\frac{r}{σ} − \frac{σ}{2})u}. By the
one-to-one correspondence between distribution and moment-generating function⁶, (\frac{1}{\sqrt{n}}M_{nt,n})_n converges in
distribution to a Gaussian random variable with mean t(\frac{r}{σ} − \frac{σ}{2}) and variance t. Hence (\frac{σ}{\sqrt{n}}M_{nt,n})_n converges in
distribution to a Gaussian random variable with mean t(r − \frac{σ^2}{2}) and variance σ^2 t.
⁶ This correspondence holds only under certain conditions, which the normal distribution certainly satisfies. For details, see
Shiryaev [12, pages 293-296].
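The convergence can be checked numerically by comparing the moment-generating function φ_n(u) of M_{nt,n}/√n with its claimed limit exp{tu²/2 + t(r/σ − σ/2)u}. A short sketch (our own illustration in Python; the parameter values are arbitrary):

```python
import math

def phi_n(u, n, t, r, sigma):
    """MGF of M_{nt,n}/sqrt(n) in the n-step-per-unit-time binomial model."""
    a, b = math.exp(sigma / math.sqrt(n)), math.exp(-sigma / math.sqrt(n))
    p = (r / n + 1 - b) / (a - b)          # risk-neutral up probability
    q = (a - r / n - 1) / (a - b)          # risk-neutral down probability
    one_step = math.exp(u / math.sqrt(n)) * p + math.exp(-u / math.sqrt(n)) * q
    return one_step ** (n * t)

def phi_limit(u, t, r, sigma):
    return math.exp(0.5 * t * u**2 + t * (r / sigma - sigma / 2) * u)

u, t, r, sigma = 0.7, 1.0, 0.05, 0.3
for n in (10, 100, 1000, 10000):
    print(n, phi_n(u, n, t, r, sigma), phi_limit(u, t, r, sigma))
```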

I Exercise 3.9 (Laplace transform of first passage density). The solution to this problem is long
and technical. It is included for the sake of completeness, but the reader may safely skip it.
Let m > 0 be given, and define
f(t, m) = \frac{m}{t\sqrt{2πt}} exp{−\frac{m^2}{2t}}.
According to (3.7.3) in Theorem 3.7.1, f(t, m) is the density in the variable t of the first passage time
τ_m = min{t ≥ 0; W(t) = m}, where W is a Brownian motion without drift. Let
g(α, m) = ∫_0^∞ e^{−αt} f(t, m) dt, α > 0,
be the Laplace transform of the density f(t, m). This problem verifies that g(α, m) = e^{−m\sqrt{2α}}, which is the
formula derived in Theorem 3.6.2.
(i) For k ≥ 1, define
a_k(m) = \frac{1}{\sqrt{2π}} ∫_0^∞ t^{−k/2} exp{−αt − \frac{m^2}{2t}} dt,
so g(α, m) = m a_3(m). Show that
g_m(α, m) = a_3(m) − m^2 a_5(m),
g_mm(α, m) = −3m a_5(m) + m^3 a_7(m).

Proof. We note
\frac{∂}{∂m} a_k(m) = \frac{1}{\sqrt{2π}} ∫_0^∞ t^{−k/2} \frac{∂}{∂m} exp{−αt − \frac{m^2}{2t}} dt = \frac{1}{\sqrt{2π}} ∫_0^∞ t^{−k/2} exp{−αt − \frac{m^2}{2t}} (−\frac{m}{t}) dt = −m a_{k+2}(m).
So
g_m(α, m) = \frac{∂}{∂m}[m a_3(m)] = a_3(m) − m^2 a_5(m),
and
g_mm(α, m) = −m a_5(m) − 2m a_5(m) + m^3 a_7(m) = −3m a_5(m) + m^3 a_7(m).

(ii) Use integration by parts to show that
a_5(m) = −\frac{2α}{3} a_3(m) + \frac{m^2}{3} a_7(m).

Proof. For k > 2 and α > 0, we have
a_k(m) = \frac{1}{\sqrt{2π}} ∫_0^∞ t^{−k/2} exp{−αt − \frac{m^2}{2t}} dt
= −\frac{2}{k−2} · \frac{1}{\sqrt{2π}} ∫_0^∞ exp{−αt − \frac{m^2}{2t}} d(t^{−(k−2)/2})
= −\frac{2}{k−2} · \frac{1}{\sqrt{2π}} [exp{−αt − \frac{m^2}{2t}} t^{−(k−2)/2} \Big|_0^∞ − ∫_0^∞ t^{−(k−2)/2} exp{−αt − \frac{m^2}{2t}} (−α + \frac{m^2}{2t^2}) dt]
= −\frac{2}{k−2} · \frac{1}{\sqrt{2π}} [α ∫_0^∞ t^{−(k−2)/2} exp{−αt − \frac{m^2}{2t}} dt − \frac{m^2}{2} ∫_0^∞ t^{−(k+2)/2} exp{−αt − \frac{m^2}{2t}} dt]
= −\frac{2α}{k−2} a_{k−2}(m) + \frac{m^2}{k−2} a_{k+2}(m).
Plugging in k = 5, we obtain a_5(m) = −\frac{2α}{3} a_3(m) + \frac{m^2}{3} a_7(m).

(iii) Use (i) and (ii) to show that g satisfies the second-order ordinary differential equation

gmm (α, m) = 2αg(α, m).

Proof. We note

gmm (α, m) = −3ma5 (m) + m3 a7 (m) = 2αma3 (m) − m3 a7 (m) + m3 a7 (m) = 2αma3 (m) = 2αg(α, m).

(iv) The general solution to a second-order ordinary differential equation of the form

ay ′′ (m) + by ′ (m) + cy(m) = 0

is
y(m) = A1 eλ1 m + A2 eλ2 m ,
where λ1 and λ2 are roots of the characteristic equation

aλ2 + bλ + c = 0.

Here we are assuming that these roots are distinct. Find the general solution of the equation in (iii) when
α > 0. This solution has two undetermined parameters A1 and A2 , and these may depend on α.
Solution. The characteristic equation of gmm (α, m) = 2αg(α, m) is

λ^2 − 2α = 0.
So λ_1 = \sqrt{2α} and λ_2 = −\sqrt{2α}, and the general solution of the equation in (iii) is
A_1 e^{\sqrt{2α}m} + A_2 e^{−\sqrt{2α}m}.

(v) Derive the bound
g(α, m) ≤ \frac{m}{\sqrt{2π}} ∫_0^m \sqrt{\frac{m}{t}} t^{−3/2} exp{−\frac{m^2}{2t}} dt + \frac{1}{\sqrt{2πm}} ∫_m^∞ e^{−αt} dt
and use it to show that, for every α > 0,
lim_{m→∞} g(α, m) = 0.
Use this fact to determine one of the parameters in the general solution to the equation in (iii).

Solution. We have
g(α, m) = m a_3(m)
= \frac{m}{\sqrt{2π}} ∫_0^∞ t^{−3/2} exp{−αt − \frac{m^2}{2t}} dt
= \frac{m}{\sqrt{2π}} (∫_0^m t^{−3/2} exp{−αt − \frac{m^2}{2t}} dt + ∫_m^∞ t^{−3/2} exp{−αt − \frac{m^2}{2t}} dt)
≤ \frac{m}{\sqrt{2π}} (∫_0^m t^{−3/2} exp{−αt − \frac{m^2}{2t}} dt + m^{−3/2} ∫_m^∞ e^{−αt} dt)
= \frac{m}{\sqrt{2π}} ∫_0^m e^{−αt} t^{−3/2} exp{−\frac{m^2}{2t}} dt + \frac{1}{\sqrt{2πm}} ∫_m^∞ e^{−αt} dt.
Over the interval (0, m], e^{−αt} is monotone decreasing with range contained in [e^{−αm}, 1), while \sqrt{\frac{m}{t}} is monotone
decreasing with range [1, ∞). So e^{−αt} < \sqrt{\frac{m}{t}} over the interval (0, m]. This gives the desired bound
g(α, m) ≤ \frac{m}{\sqrt{2π}} ∫_0^m \sqrt{\frac{m}{t}} t^{−3/2} exp{−\frac{m^2}{2t}} dt + \frac{1}{\sqrt{2πm}} ∫_m^∞ e^{−αt} dt.
Now let m → ∞. Obviously lim_{m→∞} \frac{1}{\sqrt{2πm}} ∫_m^∞ e^{−αt} dt = 0, while the first term, through the change of
variable u = −\frac{1}{t}, satisfies
\frac{m}{\sqrt{2π}} ∫_0^m \sqrt{\frac{m}{t}} t^{−3/2} exp{−\frac{m^2}{2t}} dt = \frac{m^{3/2}}{\sqrt{2π}} ∫_0^m t^{−2} exp{−\frac{m^2}{2t}} dt = \frac{m^{3/2}}{\sqrt{2π}} ∫_{−∞}^{−\frac{1}{m}} e^{\frac{m^2 u}{2}} du = \frac{m^{3/2}}{\sqrt{2π}} · \frac{e^{−m/2}}{m^2/2} → 0
as m → ∞. Combined, we can conclude that for every α > 0, lim_{m→∞} g(α, m) = lim_{m→∞}(A_1 e^{\sqrt{2α}m} +
A_2 e^{−\sqrt{2α}m}) = 0, which requires A_1 = 0 as a necessary condition. Therefore, g(α, m) = A_2 e^{−\sqrt{2α}m}.


(vi) Using first the change of variable s = t/m2 and then the change of variable y = 1/ s, show that

lim g(α, m) = 1.
m↓0

Using this fact to determine the other parameter in the general solution to the equation in (iii).
Solution. Following the hint of the problem, we have
∫ ∞
m m2
g(α, m) = √ t−3/2 e−αt− 2t dt
2π 0
∫ ∞
s=t/m 2
m
(sm2 )−3/2 e−αm s− 2s m2 ds
2 1
= √
2π 0
∫ ∞
1
s−3/2 e−αm s− 2s ds
2 1
= √
2π 0
√ ∫ ∞ 2
y=1/ s 1 −αm2 y12 − y2 2
= √ y3 e dy
2π 0 y3
∫ ∞ 2
2 −αm2 y12 − y2
= √ e dy.
2π 0
So by dominated convergence theorem,
∫ ∞ 2
∫ ∞
2 −αm2 y12 − y2 2 y2
lim g(α, m) = √ lim e dy = √ e− 2 dy = 1.
m↓0 2π 0 m↓0 2π 0

We√ already proved in (v) that g(α, m) = A2 e− 2αm
. So we must have A2 = 1 and hence g(α, m) =
e− 2αm .
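The identity g(α, m) = e^{−m√(2α)} can also be verified numerically by integrating the first-passage density directly. A crude sketch (our own illustration in Python; a simple midpoint Riemann sum with an arbitrary truncation point is used, so agreement is only to a few decimals):

```python
import math

def f(t, m):
    """First passage density: (m / (t*sqrt(2*pi*t))) * exp(-m^2 / (2t))."""
    return m / (t * math.sqrt(2 * math.pi * t)) * math.exp(-m * m / (2 * t))

def g(alpha, m, upper=100.0, steps=500_000):
    """Midpoint-rule approximation of the Laplace transform of f(., m)."""
    dt = upper / steps
    return sum(math.exp(-alpha * (k + 0.5) * dt) * f((k + 0.5) * dt, m) * dt
               for k in range(steps))

for alpha, m in [(0.5, 1.0), (1.0, 2.0)]:
    print(g(alpha, m), math.exp(-m * math.sqrt(2 * alpha)))
```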

4 Stochastic Calculus
⋆ Comments:

1) To see how we obtain the solution to the Vasicek interest rate model (equation (4.4.33)), we recall
the method of integrating factors, a technique often used in solving first-order linear ordinary differential
equations (see, for example, Logan [8, page 74]). Starting from (4.4.32)
dR(t) = (α − βR(t))dt + σdW(t),
we move the term containing R(t) to the left side of the equation and multiply both sides by the integrating
factor e^{βt}:
e^{βt}[βR(t)dt + dR(t)] = e^{βt}[αdt + σdW(t)].
Applying Itô's formula lets us know the left side of the equation is
d(e^{βt}R(t)).
Integrating from 0 to t on both sides, we obtain
e^{βt}R(t) − R(0) = \frac{α}{β}(e^{βt} − 1) + σ ∫_0^t e^{βs} dW(s).
This gives us (4.4.33)
R(t) = e^{−βt}R(0) + \frac{α}{β}(1 − e^{−βt}) + σe^{−βt} ∫_0^t e^{βs} dW(s).
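The closed-form solution (4.4.33) implies that, conditionally on R(t), the value R(t+Δ) is Gaussian with mean e^{−βΔ}R(t) + (α/β)(1 − e^{−βΔ}) and variance σ²(1 − e^{−2βΔ})/(2β), so the process can be simulated exactly on a time grid. Below is a minimal sketch (our own illustration in Python, with made-up parameter values), not part of the textbook.

```python
import math
import random

def simulate_vasicek(r0, alpha, beta, sigma, t_grid, rng=random):
    """Exact simulation of dR = (alpha - beta*R) dt + sigma dW on the given time grid."""
    path = [r0]
    for t_prev, t_next in zip(t_grid[:-1], t_grid[1:]):
        dt = t_next - t_prev
        mean = math.exp(-beta * dt) * path[-1] + alpha / beta * (1 - math.exp(-beta * dt))
        var = sigma**2 * (1 - math.exp(-2 * beta * dt)) / (2 * beta)
        path.append(mean + math.sqrt(var) * rng.gauss(0.0, 1.0))
    return path

if __name__ == "__main__":
    random.seed(3)
    grid = [i * 0.01 for i in range(1001)]      # 10 years with step 0.01
    path = simulate_vasicek(r0=0.03, alpha=0.06, beta=1.2, sigma=0.02, t_grid=grid)
    print("R(10) =", path[-1], "   long-run mean alpha/beta =", 0.06 / 1.2)
```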

2) The distribution of CIR interest rate process R(t) for each positive t can be found in reference [41]
of the textbook (Cox, J. C., INGERSOLL, J. E., AND ROSS, S. (1985) A theory of the term structure of
interest rates, Econometrica 53, 373-384).
3) Regarding the comment at the beginning of Section 4.5.2, “Black, Scholes, and Merton argued that
the value of this call at any time should depend on the time (more precisely, on the time to expiration)
and on the value of the stock price at that time”, we note it is justified by risk-neutral pricing and the Markov
property:
C(t) = Ẽ[e^{−r(T−t)}(S_T − K)^+ | F(t)] = Ẽ[e^{−r(T−t)}(S_T − K)^+ | S_t] = c(t, S_t).
For details, see Chapter 5.
4) For the PDE approach to solving the Black-Scholes-Merton equation, see Wilmott [15] for details.

I Exercise 4.1. Suppose M (t), 0 ≤ t ≤ T , is a martingale with respect to some filtration F(t), 0 ≤ t ≤ T .
Let ∆(t), 0 ≤ t ≤ T , be a simple process adapted to F(t) (i.e., there is a partition Π = {t0 , t1 , · · · , tn }
of [0, T ] such that, for every j, ∆(tj ) is F(tj )-measurable and ∆(t) is constant in t on each subinterval
[tj , tj+1 )). For t ∈ [tk , tk+1 ), define the stochastic integral


I(t) = Σ_{j=0}^{k−1} Δ(t_j)[M(t_{j+1}) − M(t_j)] + Δ(t_k)[M(t) − M(t_k)].

We think of M (t) as the price of an asset at time t and ∆(tj ) as the number of shares of the asset held by
an investor between times tj and tj+1 . Then I(t) is the capital gains that accrue to the investor between
times 0 and t. Show that I(t), 0 ≤ t ≤ T , is a martingale.
Proof. Fix t and for any s < t, assume s ∈ [t_m, t_{m+1}) for some m.
Case 1. m = k. Then I(t) − I(s) = Δ_{t_k}(M_t − M_{t_k}) − Δ_{t_k}(M_s − M_{t_k}) = Δ_{t_k}(M_t − M_s). So E[I(t) − I(s)|F_s] =
Δ_{t_k} E[M_t − M_s|F_s] = 0.
Case 2. m < k. Then t_m ≤ s < t_{m+1} ≤ t_k ≤ t < t_{k+1}. So
I(t) − I(s) = Σ_{j=m}^{k−1} Δ_{t_j}(M_{t_{j+1}} − M_{t_j}) + Δ_{t_k}(M_t − M_{t_k}) − Δ_{t_m}(M_s − M_{t_m})
= Σ_{j=m+1}^{k−1} Δ_{t_j}(M_{t_{j+1}} − M_{t_j}) + Δ_{t_k}(M_t − M_{t_k}) + Δ_{t_m}(M_{t_{m+1}} − M_s).
Hence
E[I(t) − I(s)|F_s] = Σ_{j=m+1}^{k−1} E[Δ_{t_j} E[M_{t_{j+1}} − M_{t_j}|F_{t_j}]|F_s] + E[Δ_{t_k} E[M_t − M_{t_k}|F_{t_k}]|F_s] + Δ_{t_m} E[M_{t_{m+1}} − M_s|F_s]
= 0.
Combined, we can conclude that I(t) is a martingale.

I Exercise 4.2. Let W (t), 0 ≤ t ≤ T , be a Brownian motion, and let F(t), 0 ≤ t ≤ T , be an associated
filtration. Let ∆(t), 0 ≤ t ≤ T , be a nonrandom simple process (i.e., there is a partition Π = {t0 , t1 , · · · , tn }
of [0, T ] such that for every j, ∆(tj ) is a nonrandom quantity and ∆(t) = ∆(tj ) is constant in t on the
subinterval [tj , tj+1 )). For t ∈ [tk , tk+1 ), define the stochastic integral


I(t) = Σ_{j=0}^{k−1} Δ(t_j)[W(t_{j+1}) − W(t_j)] + Δ(t_k)[W(t) − W(t_k)].

(i) Show that whenever 0 ≤ s < t ≤ T , the increment I(t) − I(s) is independent of F(s). (Simplification: If
s is between two partition points, we can always insert s as an extra partition point. Then we can relabel
the partition points so that they are still called t0 , t1 , · · · , tn , but with a larger value of n and now with
s = tk for some value of k. Of course, we must set ∆(s) = ∆(tk−1 ) so that ∆ takes the same value on the
interval [s, tk+1 ) as on the interval [tk−1 , s). Similarly, we can insert t as an extra partition point if it is not
already one. Consequently, to show that I(t) − I(s) is independent of F(s) for all 0 ≤ s < t ≤ T , it suffices
to show that I(tk ) − I(tl ) is independent of Fl whenever tk and tl are two partition points with tl < tk . This
is all you need to do.)
Proof. We follow the simplification in the hint and consider I(t_k) − I(t_l) with t_l < t_k. Then I(t_k) − I(t_l) =
Σ_{j=l}^{k−1} Δ(t_j)[W(t_{j+1}) − W(t_j)]. Since Δ(t) is a non-random process and W(t_{j+1}) − W(t_j) ⊥ F(t_j) ⊃ F(t_l)
for j ≥ l, we must have I(t_k) − I(t_l) ⊥ F(t_l).

(ii) Show that whenever 0 ≤ s < t ≤ T, the increment I(t) − I(s) is a normally distributed random variable
with mean zero and variance ∫_s^t Δ^2(u) du.

Proof. We use the notation in (i), and it is clear that I(t_k) − I(t_l) is normal since it is a linear combination
of independent normal random variables. Furthermore, E[I(t_k) − I(t_l)] = Σ_{j=l}^{k−1} Δ(t_j)E[W(t_{j+1}) − W(t_j)] = 0
and Var(I(t_k) − I(t_l)) = Σ_{j=l}^{k−1} Δ^2(t_j)Var[W(t_{j+1}) − W(t_j)] = Σ_{j=l}^{k−1} Δ^2(t_j)(t_{j+1} − t_j) = ∫_{t_l}^{t_k} Δ^2(u) du.

(iii) Use (i) and (ii) to show that I(t), 0 ≤ t ≤ T , is a martingale.


Proof. By (i), E[I(t) − I(s)|F(s)] = E[I(t) − I(s)] = 0, for s < t, where the last equality is due to (ii).
(iv) Show that I^2(t) − ∫_0^t Δ^2(u) du, 0 ≤ t ≤ T, is a martingale.

Proof. For s < t,
E[I^2(t) − ∫_0^t Δ^2(u) du − (I^2(s) − ∫_0^s Δ^2(u) du) | F_s]
= E[(I(t) − I(s))^2 + 2I(t)I(s) − 2I^2(s) | F_s] − ∫_s^t Δ^2(u) du
= E[(I(t) − I(s))^2] + 2I(s)E[I(t) − I(s)|F_s] − ∫_s^t Δ^2(u) du
= ∫_s^t Δ^2(u) du + 0 − ∫_s^t Δ^2(u) du
= 0.
I Exercise 4.3. We now consider a case in which ∆(t) in Exercise 4.2 is simple but random. In particular,
let t0 = 0, t1 = s, and t2 = t, and let ∆(0) be nonrandom and ∆(s) = W (s). Which of the following
assertions is true?
(i) I(t) − I(s) is independent of F(s).
Solution. We first note

I(t) − I(s) = ∆(0)[W (t1 ) − W (0)] + ∆(t1 )[W (t2 ) − W (t1 )] − ∆(0)[W (t1 ) − W (0)]
= ∆(t1 )[W (t2 ) − W (t1 )]
= W (s)[W (t) − W (s)].

So I(t) − I(s) is not independent of F(s), since W (s) ∈ F(s).


(ii) I(t) − I(s) is normally distributed. (Hint: Check if the fourth moment is three times the square of the
variance; see Exercise 3.3 of Chapter 3.)
Solution. Following the hint, recall from (i), we know I(t) − I(s) = W (s)[W (t) − W (s)]. So

E[(I(t) − I(s))4 ] = E[Ws4 ]E[(Wt − Ws )4 ] = 3s2 · 3(t − s)2 = 9s2 (t − s)2

and ( )2 ( )2
3 E[(I(t) − I(s))2 ] = 3 E[Ws2 ]E[(Wt − Ws )2 ] = 3s2 (t − s)2 .
( )2
Since E[(I(t) − I(s))4 ] ̸= 3 E[(I(t) − I(s))2 ] , I(t) − I(s) is not normally distributed.
(iii) E[I(t)|F(s)] = I(s).
Solution. E[I(t) − I(s)|F(s)] = W (s)E[W (t) − W (s)|F(s)] = 0. So E[I(t)|F(s)] = I(s) is true.
(iv) E[I^2(t) − ∫_0^t Δ^2(u) du | F(s)] = I^2(s) − ∫_0^s Δ^2(u) du.

Solution.
E[I^2(t) − ∫_0^t Δ^2(u) du − (I^2(s) − ∫_0^s Δ^2(u) du) | F(s)]
= E[(I(t) − I(s))^2 + 2I(t)I(s) − 2I^2(s) − ∫_s^t Δ^2(u) du | F(s)]
= E[(I(t) − I(s))^2 + 2I(s)(I(t) − I(s)) − ∫_s^t Δ^2(u) du | F(s)]
= E[W^2(s)(W(t) − W(s))^2 + 2Δ(0)W^2(s)(W(t) − W(s)) − W^2(s)(t − s) | F(s)]
= W^2(s)E[(W(t) − W(s))^2] + 2Δ(0)W^2(s)E[W(t) − W(s)|F(s)] − W^2(s)(t − s)
= W^2(s)(t − s) − W^2(s)(t − s)
= 0.
So E[I^2(t) − ∫_0^t Δ^2(u) du | F(s)] = I^2(s) − ∫_0^s Δ^2(u) du is true.

I Exercise 4.4 (Stratonovich integral). Let W (t), t ≥ 0, be a Brownian motion. Let T be a fixed
positive number and let Π = {t0 , t1 , · · · , tn } be a partition of [0, T ] (i.e., 0 = t0 < t1 < · · · < tn = T ). For
each j, define t*_j = (t_j + t_{j+1})/2 to be the midpoint of the interval [t_j, t_{j+1}].

(i) Define the half-sample quadratic variation corresponding to Π to be
Q_{Π/2} = ∑_{j=0}^{n−1} (W(t*_j) − W(t_j))².

Show that Q_{Π/2} has limit ½T as ||Π|| → 0. (Hint: It suffices to show E[Q_{Π/2}] = ½T and lim_{||Π||→0} Var(Q_{Π/2}) = 0.)
Proof. Following the hint, we first note that
E[Q_{Π/2}] = ∑_{j=0}^{n−1} E[(W(t*_j) − W(t_j))²] = ∑_{j=0}^{n−1} (t*_j − t_j) = ∑_{j=0}^{n−1} (t_{j+1} − t_j)/2 = T/2.
Then, by noting W(t*_j) − W(t_j) is equal to W(t*_j − t_j) = W((t_{j+1} − t_j)/2) in distribution and E[(W²(t) − t)²] = E[W⁴(t) − 2tW²(t) + t²] = 3E[W²(t)]² − 2t² + t² = 2t², we have
Var(Q_{Π/2})
= E[(∑_{j=0}^{n−1} (W(t*_j) − W(t_j))² − T/2)²]
= E[(∑_{j=0}^{n−1} {(W(t*_j) − W(t_j))² − (t_{j+1} − t_j)/2})²]
= ∑_{j,k=0}^{n−1} E[((W(t*_j) − W(t_j))² − (t_{j+1} − t_j)/2)((W(t*_k) − W(t_k))² − (t_{k+1} − t_k)/2)]
= ∑_{j≠k} E[((W(t*_j) − W(t_j))² − (t_{j+1} − t_j)/2)((W(t*_k) − W(t_k))² − (t_{k+1} − t_k)/2)] + ∑_{j=0}^{n−1} E[((W(t*_j) − W(t_j))² − (t_{j+1} − t_j)/2)²]
= 0 + ∑_{j=0}^{n−1} E[(W²(t*_j − t_j) − (t_{j+1} − t_j)/2)²]
= ∑_{j=0}^{n−1} 2·((t_{j+1} − t_j)/2)²
≤ (T/2)·max_{1≤j≤n} |t_{j+1} − t_j| → 0.
Combined, we can conclude lim_{||Π||→0} Q_{Π/2} = T/2 in L²(P).
(ii) Define the Stratonovich integral of W(t) with respect to W(t) to be
∫_0^T W(t) ∘ dW(t) = lim_{||Π||→0} ∑_{j=0}^{n−1} W(t*_j)(W(t_{j+1}) − W(t_j)).   (4.10.1)

In contrast to the Itô integral ∫_0^T W(t)dW(t) = ½W²(T) − ½T of (4.3.4), which evaluates the integrand at the left endpoint of each subinterval [t_j, t_{j+1}], here we evaluate the integrand at the midpoint t*_j. Show that
∫_0^T W(t) ∘ dW(t) = ½W²(T).
(Hint: Write the approximating sum in (4.10.1) as the sum of an approximating sum for the Itô integral ∫_0^T W(t)dW(t) and Q_{Π/2}. The approximating sum for the Itô integral is the one corresponding to the partition 0 = t_0 < t*_0 < t_1 < t*_1 < ··· < t*_{n−1} < t_n = T, not the partition Π.)

Proof. Following the hint and using the result in (i), we have
∑_{j=0}^{n−1} W(t*_j)(W(t_{j+1}) − W(t_j))
= ∑_{j=0}^{n−1} W(t*_j){[W(t_{j+1}) − W(t*_j)] + [W(t*_j) − W(t_j)]}
= ∑_{j=0}^{n−1} {W(t*_j)[W(t_{j+1}) − W(t*_j)] + W(t_j)[W(t*_j) − W(t_j)]} + ∑_{j=0}^{n−1} [W(t*_j) − W(t_j)]².
By the construction of the Itô integral,
lim_{||Π*||→0} ∑_{j=0}^{n−1} {W(t*_j)[W(t_{j+1}) − W(t*_j)] + W(t_j)[W(t*_j) − W(t_j)]} = ∫_0^T W(t)dW(t),
where Π* is the partition 0 = t_0 < t*_0 < t_1 < t*_1 < ··· < t*_{n−1} < t_n = T. (i) already shows
lim_{||Π||→0} ∑_{j=0}^{n−1} [W(t*_j) − W(t_j)]² = T/2.
Combined, we conclude
∫_0^T W(t) ∘ dW(t) = lim_{||Π||→0} ∑_{j=0}^{n−1} W(t*_j)(W(t_{j+1}) − W(t_j)) = ∫_0^T W(t)dW(t) + T/2 = ½W²(T).
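The difference between the two evaluation rules can be seen on a single simulated path. The sketch below (an illustration added here, not part of the original solution; grid size and seed are arbitrary) forms the left-endpoint and midpoint sums on the same Brownian path.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n = 1.0, 200_000
dt = T / n
# Brownian path on the refined grid 0, dt/2, dt, 3dt/2, ..., so midpoints W(t*_j) are available
W = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt / 2), 2 * n))))

left, mid, right = W[0:-1:2], W[1::2], W[2::2]
ito = np.sum(left * (right - left))      # left-endpoint sum -> (W_T^2 - T)/2
strat = np.sum(mid * (right - left))     # midpoint sum      -> W_T^2 / 2

WT = W[-1]
print("Ito sum      :", ito,   " vs (W_T^2 - T)/2 =", 0.5 * WT**2 - 0.5 * T)
print("Stratonovich :", strat, " vs  W_T^2 / 2    =", 0.5 * WT**2)
```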

I Exercise 4.5 (Solving the generalized geometric Brownian motion equation). Let S(t) be a
positive stochastic process that satisfies the generalized geometric Brownian motion differential equation (see
Example 4.4.8)
dS(t) = α(t)S(t)dt + σ(t)S(t)dW (t), (4.10.2)
where α(t) and σ(t) are processes adapted to the filtration F(t), t ≥ 0, associated with the Brownian motion
W (t), t ≥ 0. In this exercise, we show that S(t) must be given by formula (4.4.26) (i.e., that formula provides
the only solution to the stochastic differential equation (4.10.2)). In the process, we provide a method for
solving this equation.
(i) Using (4.10.2) and the Itô-Doeblin formula, compute d log S(t). Simplify so that you have a formula for
d log S(t) that does not involve S(t).
Solution.
d log S_t = dS_t/S_t − d⟨S⟩_t/(2S_t²) = [2S_t dS_t − d⟨S⟩_t]/(2S_t²) = [2S_t(α_t S_t dt + σ_t S_t dW_t) − σ_t²S_t²dt]/(2S_t²) = σ_t dW_t + (α_t − ½σ_t²)dt.

(ii) Integrate the formula you obtained in (i), and then exponentiate the answer to obtain (4.4.26).

Solution.
log S_t = log S_0 + ∫_0^t σ_s dW_s + ∫_0^t (α_s − ½σ_s²)ds.
So S_t = S_0 exp{∫_0^t σ_s dW_s + ∫_0^t (α_s − ½σ_s²)ds}.
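For a numerical sanity check of formula (4.4.26), one can run an Euler scheme for (4.10.2) and the closed-form expression on the same Brownian path; they should agree up to discretization error. This sketch (not part of the original solution) uses hypothetical deterministic α(t) and σ(t):

```python
import numpy as np

rng = np.random.default_rng(3)
T, n, S0 = 1.0, 100_000, 100.0
t = np.linspace(0.0, T, n + 1)
dt = T / n
alpha = lambda u: 0.05 + 0.02 * u      # assumed deterministic drift
sigma = lambda u: 0.20 + 0.10 * u      # assumed deterministic volatility

dW = rng.normal(0.0, np.sqrt(dt), n)

# Euler scheme for dS = alpha S dt + sigma S dW
S_euler = np.empty(n + 1); S_euler[0] = S0
for i in range(n):
    S_euler[i + 1] = S_euler[i] * (1 + alpha(t[i]) * dt + sigma(t[i]) * dW[i])

# closed-form solution (4.4.26) on the same path
log_S = np.log(S0) + np.cumsum(sigma(t[:-1]) * dW + (alpha(t[:-1]) - 0.5 * sigma(t[:-1])**2) * dt)
S_exact = np.concatenate(([S0], np.exp(log_S)))

print("terminal values (Euler vs closed form):", S_euler[-1], S_exact[-1])
```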

I Exercise 4.6. Let S(t) = S(0) exp{σW(t) + (α − ½σ²)t} be a geometric Brownian motion. Let p be a positive constant. Compute d(S^p(t)), the differential of S(t) raised to the power p.
Solution. Without loss of generality, we assume p ≠ 1. Since (x^p)′ = px^{p−1} and (x^p)″ = p(p − 1)x^{p−2}, we have
d(S_t^p) = pS_t^{p−1}dS_t + ½p(p − 1)S_t^{p−2}d⟨S⟩_t
= pS_t^{p−1}(αS_t dt + σS_t dW_t) + ½p(p − 1)S_t^{p−2}σ²S_t²dt
= S_t^p[pα dt + pσ dW_t + ½p(p − 1)σ²dt]
= pS_t^p[σ dW_t + (α + ((p − 1)/2)σ²)dt].
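The drift rate p(α + (p − 1)σ²/2) just derived implies E[S^p(t)] = S(0)^p e^{(pα + p(p−1)σ²/2)t}, which can be checked by Monte Carlo. A small sketch (added here, not part of the original solution; all parameter values are assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
S0, alpha, sigma, p, t = 1.0, 0.08, 0.3, 2.5, 1.0   # assumed parameters
n = 1_000_000

W_t = rng.normal(0.0, np.sqrt(t), n)
S_t = S0 * np.exp(sigma * W_t + (alpha - 0.5 * sigma**2) * t)

mc = np.mean(S_t**p)
theory = S0**p * np.exp((p * alpha + 0.5 * p * (p - 1) * sigma**2) * t)
print("Monte Carlo E[S^p(t)]:", mc)
print("drift-implied value  :", theory)
```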

I Exercise 4.7.
(i) Compute dW 4 (t) and then write W 4 (T ) as the sum of an ordinary (Lebesgue) integral with respect to
time and an Itô integral.
Solution. dW_t⁴ = 4W_t³dW_t + ½·4·3·W_t²d⟨W⟩_t = 4W_t³dW_t + 6W_t²dt. So W_T⁴ = 4∫_0^T W_t³dW_t + 6∫_0^T W_t²dt.

(ii) Take expectations on both sides of the formula you obtained in (i), use the fact that EW 2 (t) = t, and
derive the formula EW 4 (T ) = 3T 2 .
Solution. E[W_T⁴] = 4E[∫_0^T W_t³dW_t] + 6∫_0^T E[W_t²]dt = 6∫_0^T t dt = 3T².

(iii) Use the method of (i) and (ii) to derive a formula for EW 6 (T ).
Solution. dW_t⁶ = 6W_t⁵dW_t + ½·6·5·W_t⁴dt. So W_T⁶ = 6∫_0^T W_t⁵dW_t + 15∫_0^T W_t⁴dt. Hence E[W_T⁶] = 15∫_0^T E[W_t⁴]dt = 15∫_0^T 3t²dt = 15T³.
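Both moment formulas can be verified directly by simulation, since W_T ~ N(0, T). A one-line Monte Carlo check (an illustration, not part of the original solution; T and the sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
T, n = 2.0, 2_000_000
W_T = rng.normal(0.0, np.sqrt(T), n)

print("E[W_T^4]:", np.mean(W_T**4), " vs 3 T^2  =", 3 * T**2)
print("E[W_T^6]:", np.mean(W_T**6), " vs 15 T^3 =", 15 * T**3)
```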

I Exercise 4.8 (Solving the Vasicek equation). The Vasicek interest rate stochastic differential equation
(4.4.32) is
dR(t) = (α − βR(t))dt + σdW (t),
where α, β, and σ are positive constants. The solution to this equation is given in Example 4.4.10. This
exercise shows how to derive this solution.
(i) Use (4.4.32) and the Itô-Doeblin formula to compute d(eβt R(t)). Simplify it so that you have a formula
for d(eβt R(t)) that does not involve R(t).
Solution. d(e^{βt}R_t) = βe^{βt}R_t dt + e^{βt}dR_t = e^{βt}[βR_t dt + (α − βR_t)dt + σdW(t)] = e^{βt}(α dt + σdW(t)).

(ii) Integrate the equation you obtain in (i) and solve for R(t) to obtain (4.4.33).
Solution.
e^{βt}R_t = R_0 + ∫_0^t e^{βs}(α ds + σdW_s) = R_0 + (α/β)(e^{βt} − 1) + σ∫_0^t e^{βs}dW_s.
Therefore R_t = R_0 e^{−βt} + (α/β)(1 − e^{−βt}) + σ∫_0^t e^{−β(t−s)}dW_s.
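Since the stochastic integral above is Gaussian, R_t has mean R_0 e^{−βt} + (α/β)(1 − e^{−βt}) and variance σ²(1 − e^{−2βt})/(2β). A minimal Euler-scheme check of these two quantities (added for illustration, not part of the original solution; the Vasicek parameters are assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)
alpha, beta, sigma = 0.5, 1.2, 0.3             # assumed Vasicek parameters
R0, T, n_steps, n_paths = 0.03, 2.0, 2_000, 100_000
dt = T / n_steps

R = np.full(n_paths, R0)
for _ in range(n_steps):   # Euler discretization of dR = (alpha - beta R)dt + sigma dW
    R += (alpha - beta * R) * dt + sigma * np.sqrt(dt) * rng.normal(size=n_paths)

mean_exact = R0 * np.exp(-beta * T) + alpha / beta * (1 - np.exp(-beta * T))
var_exact = sigma**2 / (2 * beta) * (1 - np.exp(-2 * beta * T))
print("mean:", R.mean(), "vs", mean_exact)
print("var :", R.var(),  "vs", var_exact)
```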

I Exercise 4.9. For a European call expiring at time T with strike price K, the Black-Scholes-Merton
price at time t, if the time-t stock price is x, is

c(t, x) = xN (d+ (T − t, x)) − Ke−r(T −t) N (d− (T − t, x)),

where
d+(τ, x) = (1/(σ√τ))[log(x/K) + (r + ½σ²)τ],
d−(τ, x) = d+(τ, x) − σ√τ,

and N (y) is the cumulative standard normal distribution


N(y) = (1/√(2π))∫_{−∞}^y e^{−z²/2}dz = (1/√(2π))∫_{−y}^∞ e^{−z²/2}dz.

The purpose of this exercise is to show that the function c satisfies the Black-Scholes-Merton partial differ-
ential equation
c_t(t, x) + rx c_x(t, x) + ½σ²x²c_xx(t, x) = rc(t, x), 0 ≤ t < T, x > 0,   (4.10.3)
the terminal condition
lim_{t↑T} c(t, x) = (x − K)⁺, x > 0, x ≠ K,   (4.10.4)

and the boundary conditions

lim_{x↓0} c(t, x) = 0,  lim_{x→∞} [c(t, x) − (x − e^{−r(T−t)}K)] = 0,  0 ≤ t < T.   (4.10.5)

Equation (4.10.4) and the first part of (4.10.5) are usually written more simply but less precisely as

c(T, x) = (x − K)+ , x ≥ 0

and
c(t, 0) = 0, 0 ≤ t ≤ T.
For this exercise, we abbreviate c(t, x) as simply c and d± (T − t, x) as simply d± .
(i) Verify first the equation
Ke−r(T −t) N ′ (d− ) = xN ′ (d+ ). (4.10.6)

Proof.
Ke^{−r(T−t)}N′(d−) = Ke^{−r(T−t)}·e^{−d−²/2}/√(2π)
= Ke^{−r(T−t)}·e^{−(d+ − σ√(T−t))²/2}/√(2π)
= Ke^{−r(T−t)} e^{σ√(T−t)d+} e^{−σ²(T−t)/2} N′(d+)
= Ke^{−r(T−t)}·(x/K)e^{(r+σ²/2)(T−t)}·e^{−σ²(T−t)/2} N′(d+)
= xN′(d+).

(ii) Show that cx = N (d+ ). This is the delta of the option. (Be careful! Remember that d+ is a function of
x.)

Proof. By equation (4.10.6) from (i),
c_x = N(d+) + xN′(d+)(∂/∂x)d+(T−t, x) − Ke^{−r(T−t)}N′(d−)(∂/∂x)d−(T−t, x)
= N(d+) + xN′(d+)(∂/∂x)d+(T−t, x) − xN′(d+)(∂/∂x)d+(T−t, x)
= N(d+),
where we also used (∂/∂x)d−(T−t, x) = (∂/∂x)d+(T−t, x).

(iii) Show that
c_t = −rKe^{−r(T−t)}N(d−) − (σx/(2√(T−t)))N′(d+).
This is the theta of the option.

Proof.
c_t = xN′(d+)(∂/∂t)d+(T−t, x) − rKe^{−r(T−t)}N(d−) − Ke^{−r(T−t)}N′(d−)(∂/∂t)d−(T−t, x)
= xN′(d+)(∂/∂t)d+(T−t, x) − rKe^{−r(T−t)}N(d−) − xN′(d+)[(∂/∂t)d+(T−t, x) + σ/(2√(T−t))]
= −rKe^{−r(T−t)}N(d−) − (σx/(2√(T−t)))N′(d+).

(iv) Use the formulas above to show that c satisfies (4.10.3).

Proof.
c_t + rxc_x + ½σ²x²c_xx
= −rKe^{−r(T−t)}N(d−) − (σx/(2√(T−t)))N′(d+) + rxN(d+) + ½σ²x²N′(d+)(∂/∂x)d+(T−t, x)
= rc − (σx/(2√(T−t)))N′(d+) + ½σ²x²N′(d+)·1/(σ√(T−t)x)
= rc.
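The three Greeks used above can be coded directly, and plugging them into (4.10.3) gives a residual of essentially zero. The sketch below (an illustration, not part of the original solution) implements the formulas from (ii) and (iii); the numerical inputs are arbitrary assumptions.

```python
import numpy as np
from scipy.stats import norm

def bsm_call(t, x, K, r, sigma, T):
    """Black-Scholes-Merton call price and the Greeks derived in parts (ii)-(iii)."""
    tau = T - t
    d_plus = (np.log(x / K) + (r + 0.5 * sigma**2) * tau) / (sigma * np.sqrt(tau))
    d_minus = d_plus - sigma * np.sqrt(tau)
    price = x * norm.cdf(d_plus) - K * np.exp(-r * tau) * norm.cdf(d_minus)
    delta = norm.cdf(d_plus)                                        # c_x, part (ii)
    theta = (-r * K * np.exp(-r * tau) * norm.cdf(d_minus)
             - sigma * x / (2 * np.sqrt(tau)) * norm.pdf(d_plus))   # c_t, part (iii)
    gamma = norm.pdf(d_plus) / (sigma * x * np.sqrt(tau))           # c_xx
    return price, delta, theta, gamma

t, x, K, r, sigma, T = 0.0, 105.0, 100.0, 0.04, 0.25, 1.0          # assumed inputs
c, cx, ct, cxx = bsm_call(t, x, K, r, sigma, T)
residual = ct + r * x * cx + 0.5 * sigma**2 * x**2 * cxx - r * c   # LHS - RHS of (4.10.3)
print("call price:", c)
print("PDE residual (should be ~0):", residual)
```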

(v) Show that for x > K, limt↑T d± = ∞, but for 0 < x < K, limt↑T d± = −∞. Use these equalities to
derive the terminal condition (4.10.4).
Proof. For x > K, d+(T−t, x) > 0 and lim_{t↑T} d+(T−t, x) = lim_{τ↓0} d+(τ, x) = ∞. Moreover,
lim_{t↑T} d−(T−t, x) = lim_{τ↓0} d−(τ, x) = lim_{τ↓0} [(1/(σ√τ))log(x/K) + (1/σ)(r + ½σ²)√τ − σ√τ] = ∞.
Similarly, lim_{t↑T} d± = −∞ for x ∈ (0, K). Also it is clear that lim_{t↑T} d± = 0 for x = K. So
lim_{t↑T} c(t, x) = xN(lim_{t↑T} d+) − KN(lim_{t↑T} d−) = x − K if x > K, and 0 if x ≤ K; that is, lim_{t↑T} c(t, x) = (x − K)⁺.

(vi) Show that for 0 ≤ t < T , limx↓0 d± = −∞. Use this fact to verify the first part of boundary condition
(4.10.5) as x ↓ 0.

Proof. We note
lim_{x↓0} d+ = (1/(σ√τ)) lim_{x↓0} [log(x/K) + (r + ½σ²)τ] = −∞
and
lim_{x↓0} d− = lim_{x↓0} d+ − σ√τ = −∞.
So for t ∈ [0, T),
lim_{x↓0} c(t, x) = lim_{x↓0} xN(lim_{x↓0} d+(T−t, x)) − Ke^{−r(T−t)}N(lim_{x↓0} d−(T−t, x)) = 0.

(vii) Show that for 0 ≤ t < T , limx→∞ d± = ∞. Use this fact to verify the second part of boundary condition
(4.10.5) as x → ∞. In this verification, you will need to show that

lim_{x→∞} (N(d+) − 1)/x^{−1} = 0.
This is an indeterminate form 0/0, and L'Hôpital's rule implies that this limit is
lim_{x→∞} [d/dx (N(d+) − 1)] / [d/dx x^{−1}].
Work out this expression and use the fact that
x = K exp{σ√(T−t) d+ − (T−t)(r + ½σ²)}

to write this expression solely in terms of d+ (i.e., without the appearance of any x except the x in the
argument of d+ (T − t, x)). Then argue that the limit is zero as d+ → ∞.

Proof. For t ∈ [0, T), it is clear that lim_{x→∞} d± = ∞. Following the hint, we note
lim_{x→∞} x(N(d+) − 1) = lim_{x→∞} (N(d+) − 1)/x^{−1} = lim_{x→∞} [N′(d+)·(∂/∂x)d+]/(−x^{−2}) = lim_{x→∞} [N′(d+)/(σ√(T−t)x)]/(−x^{−2}) = lim_{x→∞} [−xN′(d+)/(σ√(T−t))].
By the expression of d+, we have x = K exp{σ√(T−t)d+ − (T−t)(r + ½σ²)}. So
lim_{x→∞} x(N(d+) − 1) = lim_{d+→∞} [−K e^{σ√(T−t)d+ − (T−t)(r + ½σ²)}·e^{−d+²/2}/(√(2π)σ√(T−t))] = 0.
Therefore
lim_{x→∞} [c(t, x) − (x − e^{−r(T−t)}K)]
= lim_{x→∞} [xN(d+) − Ke^{−r(T−t)}N(d−) − x + Ke^{−r(T−t)}]
= lim_{x→∞} [x(N(d+) − 1) + Ke^{−r(T−t)}(1 − N(d−))]
= lim_{x→∞} x(N(d+) − 1) + Ke^{−r(T−t)}(1 − N(lim_{x→∞} d−))
= 0.

****************************** UPDATE STOPPED HERE *************************************
I Exercise 4.10 (Self-financing trading). The fundamental idea behind no-arbitrage pricing is to
reproduce the payoff of a derivative security by trading in the underlying asset (which we call a stock) and
the money market account. In discrete time, we let Xk denote the value of the hedging portfolio at time k
and let ∆k denote the number of shares of stock held between times k and k + 1. Then, at time k, after
rebalancing (i.e., moving from a position of ∆k−1 to a position ∆k in the stock), the amount in the money
market account is Xk − Sk ∆k . The value of the portfolio at time k + 1 is

Xk+1 = ∆k Sk+1 + (1 + r)(Xk − ∆k Sk ). (4.10.7)

This formula can be rearranged to become

Xk+1 − Xk = ∆k (Sk+1 − Sk ) + r(Xk − ∆k Sk ), (4.10.8)

which says that the gain between time k and time k + 1 is the sum of the capital gain on the stock holdings,
∆k (Sk+1 − Sk ), and the interest earnings on the money market account, r(Xk − ∆k Sk ). The continuous-time
analogue of (4.10.8) is
dX(t) = ∆(t)dS(t) + r(X(t) − ∆(t)S(t))dt. (4.10.9)
Alternatively, one could define the value of a share of the money market account at time k to be

Mk = (1 + r)k

and formulate the discrete-time model with two processes, ∆k as before and Γk denoting the number of
shares of the money market account held at time k after rebalancing. Then

Xk = ∆k Sk + Γk Mk , (4.10.10)

so that (4.10.7) becomes

Xk+1 = ∆k Sk+1 + (1 + r)Γk Mk = ∆k Sk+1 + Γk Mk+1 . (4.10.11)

Subtracting (4.10.10) from (4.10.11), we obtain in place of (4.10.8) the equation

Xk+1 − Xk = ∆k (Sk+1 − Sk ) + Γk (Mk+1 − Mk ), (4.10.12)

which says that the gain between time k and time k + 1 is the sum of the capital gain on stock holdings,
∆k (Sk+1 − Sk ), and the earnings from the money market investment, Γk (Mk+1 − Mk ).
But ∆k and Γk cannot be chosen arbitrarily. The agent arrives at time
(i)

Proof. We show (4.10.16) + (4.10.9) ⇐⇒ (4.10.16) + (4.10.15), i.e. assuming X has the representation
Xt = ∆t St + Γt Mt , “continuous-time self-financing condition” has two equivalent formulations, (4.10.9) or
(4.10.15). Indeed, dXt = ∆t dSt +Γt dMt +(St d∆t +dSt d∆t +Mt dΓt +dMt dΓt ). So dXt = ∆t dSt +Γt dMt ⇐⇒
St d∆t + dSt d∆t + Mt dΓt + dMt dΓt = 0, i.e. (4.10.9) ⇐⇒ (4.10.15).

(ii)
Proof. First, we clarify the problems by stating explicitly the given conditions and the result to be proved.
We assume we have a portfolio Xt = ∆t St + Γt Mt . We let c(t, St ) denote the price of call option at time t
and set ∆t = cx (t, St ). Finally, we assume the portfolio is self-financing. The problem is to show
rN_t dt = [c_t(t, S_t) + ½σ²S_t²c_xx(t, S_t)]dt,
where N_t = c(t, S_t) − ∆_t S_t.

Indeed, by the self-financing property and ∆_t = c_x(t, S_t), we have c(t, S_t) = X_t (by the calculations in Subsections 4.5.1–4.5.3). This uniquely determines Γ_t as
Γ_t = (X_t − ∆_t S_t)/M_t = (c(t, S_t) − c_x(t, S_t)S_t)/M_t = N_t/M_t.
Moreover,
dN_t = [c_t(t, S_t)dt + c_x(t, S_t)dS_t + ½c_xx(t, S_t)d⟨S⟩_t] − d(∆_t S_t)
= [c_t(t, S_t) + ½c_xx(t, S_t)σ²S_t²]dt + [c_x(t, S_t)dS_t − d(X_t − Γ_t M_t)]
= [c_t(t, S_t) + ½c_xx(t, S_t)σ²S_t²]dt + M_t dΓ_t + dM_t dΓ_t + [c_x(t, S_t)dS_t + Γ_t dM_t − dX_t].
By the self-financing property, c_x(t, S_t)dS_t + Γ_t dM_t = ∆_t dS_t + Γ_t dM_t = dX_t, so
[c_t(t, S_t) + ½c_xx(t, S_t)σ²S_t²]dt = dN_t − M_t dΓ_t − dM_t dΓ_t = Γ_t dM_t = Γ_t rM_t dt = rN_t dt.

4.11.

Proof. First, we note c(t, x) solves the Black-Scholes-Merton PDE with volatility σ₁:
(∂/∂t + rx ∂/∂x + ½σ₁²x² ∂²/∂x² − r)c(t, x) = 0.
So
c_t(t, S_t) + rS_t c_x(t, S_t) + ½σ₁²S_t²c_xx(t, S_t) − rc(t, S_t) = 0,
and
dc(t, S_t) = c_t(t, S_t)dt + c_x(t, S_t)(αS_t dt + σ₂S_t dW_t) + ½c_xx(t, S_t)σ₂²S_t²dt
= [c_t(t, S_t) + αc_x(t, S_t)S_t + ½σ₂²S_t²c_xx(t, S_t)]dt + σ₂S_t c_x(t, S_t)dW_t
= [rc(t, S_t) + (α − r)c_x(t, S_t)S_t + ½S_t²(σ₂² − σ₁²)c_xx(t, S_t)]dt + σ₂S_t c_x(t, S_t)dW_t.
Therefore
dX_t = [rc(t, S_t) + (α − r)c_x(t, S_t)S_t + ½S_t²(σ₂² − σ₁²)c_xx(t, S_t) + rX_t − rc(t, S_t) + rS_t c_x(t, S_t) − ½(σ₂² − σ₁²)S_t²c_xx(t, S_t) − c_x(t, S_t)αS_t]dt + [σ₂S_t c_x(t, S_t) − c_x(t, S_t)σ₂S_t]dW_t
= rX_t dt.
This implies X_t = X_0 e^{rt}. Since X_0 = 0, we conclude X_t = 0 for all t ∈ [0, T].

4.12. (i)

Proof. By (4.5.29), c(t, x) − p(t, x) = x − e^{−r(T−t)}K. So p_x(t, x) = c_x(t, x) − 1 = N(d+(T−t, x)) − 1, p_xx(t, x) = c_xx(t, x) = N′(d+(T−t, x))/(σx√(T−t)), and
p_t(t, x) = c_t(t, x) + re^{−r(T−t)}K
= −rKe^{−r(T−t)}N(d−(T−t, x)) − (σx/(2√(T−t)))N′(d+(T−t, x)) + rKe^{−r(T−t)}
= rKe^{−r(T−t)}N(−d−(T−t, x)) − (σx/(2√(T−t)))N′(d+(T−t, x)).

(ii)
Proof. For an agent hedging a short position in the put, since ∆_t = p_x(t, S_t) < 0, he should short the underlying stock and put p(t, S_t) − p_x(t, S_t)S_t (> 0) cash in the money market account.
(iii)

Proof. By the put-call parity, it suffices to show f(t, x) = x − Ke^{−r(T−t)} satisfies the Black-Scholes-Merton partial differential equation. Indeed,
(∂/∂t + ½σ²x²∂²/∂x² + rx∂/∂x − r)f(t, x) = −rKe^{−r(T−t)} + ½σ²x²·0 + rx·1 − r(x − Ke^{−r(T−t)}) = 0.

Remark: The Black-Scholes-Merton PDE has many solutions. Proper boundary conditions are the key
to uniqueness. For more details, see Wilmott [15].

4.13.

Proof. We suppose (W₁, W₂) is a pair of local martingales defined by the SDE
dW₁(t) = dB₁(t),  dW₂(t) = α(t)dB₁(t) + β(t)dB₂(t).   (1)
We want to find α(t) and β(t) such that
(dW₂(t))² = [α²(t) + β²(t) + 2ρ(t)α(t)β(t)]dt = dt,  dW₁(t)dW₂(t) = [α(t) + β(t)ρ(t)]dt = 0.   (2)
Solving these equations for α(t) and β(t), we have β(t) = 1/√(1 − ρ²(t)) and α(t) = −ρ(t)/√(1 − ρ²(t)). So
W₁(t) = B₁(t),  W₂(t) = ∫_0^t (−ρ(s)/√(1 − ρ²(s)))dB₁(s) + ∫_0^t (1/√(1 − ρ²(s)))dB₂(s)   (3)
is a pair of independent BM's. Equivalently, we have
B₁(t) = W₁(t),  B₂(t) = ∫_0^t ρ(s)dW₁(s) + ∫_0^t √(1 − ρ²(s))dW₂(s).   (4)
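Representation (4) is easy to verify on simulated increments: building dB₂ = ρ dW₁ + √(1 − ρ²) dW₂ from independent increments reproduces quadratic variation t for each B_i and cross variation ∫ρ(s)ds. A small sketch (not part of the original solution; the correlation function and grid are assumptions):

```python
import numpy as np

rng = np.random.default_rng(7)
T, n = 1.0, 100_000
dt = T / n
t = np.linspace(0.0, T, n + 1)
rho = lambda u: 0.5 * np.cos(2 * np.pi * u)   # assumed instantaneous correlation, |rho| < 1

dW1 = rng.normal(0.0, np.sqrt(dt), n)          # independent Brownian increments
dW2 = rng.normal(0.0, np.sqrt(dt), n)

r = rho(t[:-1])
dB1 = dW1                                       # formula (4)
dB2 = r * dW1 + np.sqrt(1 - r**2) * dW2

print("sum dB1^2  :", np.sum(dB1**2), " ~ T =", T)
print("sum dB2^2  :", np.sum(dB2**2), " ~ T =", T)
print("sum dB1 dB2:", np.sum(dB1 * dB2), " ~ int rho dt =", np.sum(r) * dt)
```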

4.14. (i)

Proof. Clearly Zj ∈ Ftj+1 . Moreover

E[Zj |Ftj ] = f ′′ (Wtj )E[(Wtj+1 − Wtj )2 − (tj+1 − tj )|Ftj ] = f ′′ (Wtj )(E[Wt2j+1 −tj ] − (tj+1 − tj )) = 0,

since Wtj+1 − Wtj is independent of Ftj and Wt ∼ N (0, t). Finally, we have

E[Zj2 |Ftj ] = [f ′′ (Wtj )]2 E[(Wtj+1 − Wtj )4 − 2(tj+1 − tj )(Wtj+1 − Wtj )2 + (tj+1 − tj )2 |Ftj ]
= [f ′′ (Wtj )]2 (E[Wt4j+1 −tj ] − 2(tj+1 − tj )E[Wt2j+1 −tj ] + (tj+1 − tj )2 )
= [f ′′ (Wtj )]2 [3(tj+1 − tj )2 − 2(tj+1 − tj )2 + (tj+1 − tj )2 ]
= 2[f ′′ (Wtj )]2 (tj+1 − tj )2 ,

where we used the independence of Brownian motion increments and the fact that E[X⁴] = 3E[X²]² if X is Gaussian with mean 0.
(ii)
Proof. E[∑_{j=0}^{n−1} Z_j] = ∑_{j=0}^{n−1} E[E[Z_j|F_{t_j}]] = 0 by part (i).
(iii)

Proof.
Var[∑_{j=0}^{n−1} Z_j] = E[(∑_{j=0}^{n−1} Z_j)²]
= E[∑_{j=0}^{n−1} Z_j² + 2∑_{0≤i<j≤n−1} Z_i Z_j]
= ∑_{j=0}^{n−1} E[E[Z_j²|F_{t_j}]] + 2∑_{0≤i<j≤n−1} E[Z_i E[Z_j|F_{t_j}]]
= ∑_{j=0}^{n−1} E[2[f″(W_{t_j})]²(t_{j+1} − t_j)²]
= ∑_{j=0}^{n−1} 2E[(f″(W_{t_j}))²](t_{j+1} − t_j)²
≤ 2 max_{0≤j≤n−1}|t_{j+1} − t_j| · ∑_{j=0}^{n−1} E[(f″(W_{t_j}))²](t_{j+1} − t_j)
→ 0,
since ∑_{j=0}^{n−1} E[(f″(W_{t_j}))²](t_{j+1} − t_j) → ∫_0^T E[(f″(W_t))²]dt < ∞.

4.15. (i)

Proof. B_i is a local martingale with
(dB_i(t))² = (∑_{j=1}^d (σ_ij(t)/σ_i(t))dW_j(t))² = ∑_{j=1}^d (σ_ij²(t)/σ_i²(t))dt = dt.
So B_i is a Brownian motion.
(ii)

Proof.
dB_i(t)dB_k(t) = [∑_{j=1}^d (σ_ij(t)/σ_i(t))dW_j(t)][∑_{l=1}^d (σ_kl(t)/σ_k(t))dW_l(t)]
= ∑_{1≤j,l≤d} (σ_ij(t)σ_kl(t)/(σ_i(t)σ_k(t)))dW_j(t)dW_l(t)
= ∑_{j=1}^d (σ_ij(t)σ_kj(t)/(σ_i(t)σ_k(t)))dt
= ρ_ik(t)dt.

4.16.
Proof. To find the m independent Brownian motions W₁(t), · · · , W_m(t), we need to find A(t) = (a_ij(t)) so that
(dB₁(t), · · · , dB_m(t))^tr = A(t)(dW₁(t), · · · , dW_m(t))^tr,
or equivalently

(dW1 (t), · · · , dWm (t))tr = A(t)−1 (dB1 (t), · · · , dBm (t))tr ,

and

(dW1 (t), · · · , dWm (t))tr (dW1 (t), · · · , dWm (t))


= A(t)−1 (dB1 (t), · · · , dBm (t))tr (dB1 (t), · · · , dBm (t))(A(t)−1 )tr dt
= Im×m dt,

where Im×m is the m × m unit matrix. By the condition dBi (t)dBk (t) = ρik (t)dt, we get

(dB1 (t), · · · , dBm (t))tr (dB1 (t), · · · , dBm (t)) = C(t).

So A(t)^{−1}C(t)(A(t)^{−1})^tr = I_{m×m}, which gives C(t) = A(t)A(t)^tr. This motivates us to define A as the square root of C. Reversing the above analysis, we obtain a formal proof.
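In practice one convenient "square root" A(t) with C(t) = A(t)A(t)^tr is the Cholesky factor of the correlation matrix. A minimal numerical sketch (added for illustration, not part of the original solution; the matrix entries are assumed values):

```python
import numpy as np

# a correlation matrix C(t) at one fixed time t (assumed, must be positive definite)
C = np.array([[ 1.0, 0.3, -0.2],
              [ 0.3, 1.0,  0.5],
              [-0.2, 0.5,  1.0]])

A = np.linalg.cholesky(C)          # one choice of square root: C = A A^tr
print(np.allclose(A @ A.T, C))     # True

# independent increments dW mapped to correlated increments dB = A dW
rng = np.random.default_rng(8)
dt = 1e-3
dW = rng.normal(0.0, np.sqrt(dt), size=(3, 500_000))
dB = A @ dW
print(np.cov(dB) / dt)             # empirical instantaneous covariance, approximately C
```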

4.17.
Proof. We will try to solve all the sub-problems in a single, long solution. We start with the general Xi :
X_i(t) = X_i(0) + ∫_0^t Θ_i(u)du + ∫_0^t σ_i(u)dB_i(u), i = 1, 2.
The goal is to show
lim_{ϵ↓0} C(ϵ)/√(V₁(ϵ)V₂(ϵ)) = ρ(t₀).
First, for i = 1, 2, we have

Mi (ϵ) = E[Xi (t0 + ϵ) − Xi (t0 )|Ft0 ]


[∫ t0 +ϵ ∫ t0 +ϵ ]
= E Θi (u)du + σi (u)dBi (u)|Ft0
t0 t0
[∫ t0 +ϵ ]
= Θi (t0 )ϵ + E (Θi (u) − Θi (t0 ))du|Ft0 .
t0

By Conditional Jensen’s Inequality,
[∫ t0 +ϵ ] [∫ ]
t0 +ϵ
E
(Θi (u) − Θi (t0 ))du|Ft0 ≤ E |Θi (u) − Θi (t0 )|du|Ft0

t0 t0
∫ t +ϵ ∫ t +ϵ
Since 1ϵ t00 |Θi (u) − Θi (t0 )|du ≤ 2M and limϵ→0 1ϵ t00 |Θi (u) − Θi (t0 )|du = 0 by the continuity of Θi ,
the Dominated Convergence Theorem under Conditional Expectation implies
[∫ t0 +ϵ ] [ ∫ ]
1 1 t0 +ϵ
lim E |Θi (u) − Θi (t0 )|du|Ft0 = E lim |Θi (u) − Θi (t0 )|du|Ft0 = 0.
ϵ→0 ϵ t0 ϵ→0 ϵ t
0

So Mi (ϵ) = Θi (t0 )ϵ + o(ϵ). This proves (iii).


∫t
To calculate the variance and covariance, we note Yi (t) = 0 σi (u)dBi (u) is a martingale and by Itô’s
∫t
formula Yi (t)Yj (t) − 0 σi (u)σj (u)du is a martingale (i = 1, 2). So
E[(Xi (t0 + ϵ) − Xi (t0 ))(Xj (t0 + ϵ) − Xj (t0 ))|Ft0 ]
[( ∫ t0 +ϵ )( ∫ t0 +ϵ ) ]
= E Yi (t0 + ϵ) − Yi (t0 ) + Θi (u)du Yj (t0 + ϵ) − Yj (t0 ) + Θj (u)du |Ft0
t0 t0
[∫ t0 +ϵ ∫ t0 +ϵ ]
= E [(Yi (t0 + ϵ) − Yi (t0 )) (Yj (t0 + ϵ) − Yj (t0 )) |Ft0 ] + E Θi (u)du Θj (u)du|Ft0
t0 t0
[ ∫ t0 +ϵ ] [ ∫ t0 +ϵ ]
+E (Yi (t0 + ϵ) − Yi (t0 )) Θj (u)du|Ft0 + E (Yj (t0 + ϵ) − Yj (t0 )) Θi (u)du|Ft0
t0 t0
= I + II + III + IV.
[∫ t0 +ϵ ]
I = E[Yi (t0 + ϵ)Yj (t0 + ϵ) − Yi (t0 )Yj (t0 )|Ft0 ] = E σi (u)σj (u)ρij (t)dt|Ft0 .
t0

By an argument similar to that involved in the proof of part (iii), we conclude I = σi (t0 )σj (t0 )ρij (t0 )ϵ + o(ϵ)
and
[∫ t0 +ϵ ∫ t0 +ϵ ] [∫ t0 +ϵ ]
II = E (Θi (u) − Θi (t0 ))du Θj (u)du|Ft0 + Θi (t0 )ϵE Θj (u)du|Ft0
t0 t0 t0
= o(ϵ) + (Mi (ϵ) − o(ϵ))Mj (ϵ)
= Mi (ϵ)Mj (ϵ) + o(ϵ).
By Cauchy’s inequality under conditional expectation (note E[XY |F] defines an inner product on L2 (Ω)),
[ ∫ t0 +ϵ ]
III ≤ E |Yi (t0 + ϵ) − Yi (t0 )| |Θj (u)|du|Ft0
t0

≤ M ϵ E[(Yi (t0 + ϵ) − Yi (t0 ))2 |Ft0 ]

≤ M ϵ E[Yi (t0 + ϵ)2 − Yi (t0 )2 |Ft0 ]
√ ∫
t0 +ϵ
≤ M ϵ E[ Θi (u)2 du|Ft0 ]
t0

≤ Mϵ · M ϵ
= o(ϵ)
Similarly, IV = o(ϵ). In summary, we have
E[(Xi (t0 + ϵ) − Xi (t0 ))(Xj (t0 + ϵ) − Xj (t0 ))|Ft0 ] = Mi (ϵ)Mj (ϵ) + σi (t0 )σj (t0 )ρij (t0 )ϵ + o(ϵ) + o(ϵ).
This proves part (iv) and (v). Finally,
C(ϵ) ρ(t0 )σ1 (t0 )σ2 (t0 )ϵ + o(ϵ)
lim √ = lim √ 2 = ρ(t0 ).
ϵ↓0 V1 (ϵ)V2 (ϵ) ϵ↓0 (σ1 (t0 )ϵ + o(ϵ))(σ22 (t0 )ϵ + o(ϵ))

This proves part (vi). Part (i) and (ii) are consequences of general cases.

4.18. (i)

Proof.
d(e^{rt}ζ_t) = d(e^{−θW_t − ½θ²t}) = −e^{−θW_t − ½θ²t}θdW_t = −θ(e^{rt}ζ_t)dW_t,
where for the second "=", we used the fact that e^{−θW_t − ½θ²t} solves dX_t = −θX_t dW_t. Since d(e^{rt}ζ_t) = re^{rt}ζ_t dt + e^{rt}dζ_t, we get dζ_t = −θζ_t dW_t − rζ_t dt.
(ii)

Proof.

d(ζt Xt ) = ζt dXt + Xt dζt + dXt dζt


= ζt (rXt dt + ∆t (α − r)St dt + ∆t σSt dWt ) + Xt (−θζt dWt − rζt dt)
+(rXt dt + ∆t (α − r)St dt + ∆t σSt dWt )(−θζt dWt − rζt dt)
= ζt (∆t (α − r)St dt + ∆t σSt dWt ) − θXt ζt dWt − θ∆t σSt ζt dt
= ζt ∆t σSt dWt − θXt ζt dWt .

So ζt Xt is a martingale.
(iii)

Proof. By part (ii), X_0 = ζ_0 X_0 = E[ζ_T X_T] = E[ζ_T V_T]. (This can be seen as a version of risk-neutral pricing, only that the pricing is carried out under the actual probability measure.)

4.19. (i)
Proof. B_t is a local martingale with [B]_t = ∫_0^t sign(W_s)²ds = t. So by Lévy's theorem, B_t is a Brownian motion.
motion.

(ii)
Proof. d(B_t W_t) = B_t dW_t + sign(W_t)W_t dW_t + sign(W_t)dt. Integrate both sides of the resulting equation and take the expectation; we get
E[B_t W_t] = ∫_0^t E[sign(W_s)]ds = ∫_0^t E[1_{{W_s ≥ 0}} − 1_{{W_s < 0}}]ds = ½t − ½t = 0.

(iii)

Proof. By Itô’s formula, dWt2 = 2Wt dWt + dt.


(iv)

Proof. By Itô’s formula,

d(Bt Wt2 ) = Bt dWt2 + Wt2 dBt + dBt dWt2


= Bt (2Wt dWt + dt) + Wt2 sign(Wt )dWt + sign(Wt )dWt (2Wt dWt + dt)
= 2Bt Wt dWt + Bt dt + sign(Wt )Wt2 dWt + 2sign(Wt )Wt dt.

So
E[B_t W_t²] = E[∫_0^t B_s ds] + 2E[∫_0^t sign(W_s)W_s ds]
= ∫_0^t E[B_s]ds + 2∫_0^t E[sign(W_s)W_s]ds
= 2∫_0^t (E[W_s 1_{{W_s ≥ 0}}] − E[W_s 1_{{W_s < 0}}])ds
= 4∫_0^t ∫_0^∞ x·e^{−x²/(2s)}/√(2πs) dx ds
= 4∫_0^t √(s/(2π)) ds
≠ 0 = E[B_t]·E[W_t²].

Since E[Bt Wt2 ] ̸= E[Bt ] · E[Wt2 ], Bt and Wt are not independent.

4.20. (i)

Proof. f(x) = x − K if x ≥ K, and f(x) = 0 if x < K. So f′(x) = 1 if x > K, f′(x) = 0 if x < K, and f′(x) is undefined at x = K; and f″(x) = 0 if x ≠ K, while f″(x) is undefined at x = K.

(ii)
Proof. E[f(W_T)] = ∫_K^∞ (x − K)·e^{−x²/(2T)}/√(2πT) dx = √(T/(2π))·e^{−K²/(2T)} − KΦ(−K/√T), where Φ is the distribution function of a standard normal random variable. If we suppose ∫_0^T f″(W_t)dt = 0, the expectation of the RHS of (4.10.42) is equal to 0. So (4.10.42) cannot hold.
(iii)
Proof. This is trivial to check.
(iv)
Proof. If x = K, lim_{n→∞} f_n(x) = lim_{n→∞} 1/(8n) = 0; if x > K, for n large enough, x ≥ K + 1/(2n), so lim_{n→∞} f_n(x) = lim_{n→∞} (x − K) = x − K; if x < K, for n large enough, x ≤ K − 1/(2n), so lim_{n→∞} f_n(x) = lim_{n→∞} 0 = 0. In summary, lim_{n→∞} f_n(x) = (x − K)⁺. Similarly, we can show
lim_{n→∞} f_n′(x) = 0 if x < K, ½ if x = K, 1 if x > K.   (5)

(v)
Proof. Fix ω so that W_t(ω) < K for any t ∈ [0, T]. Since W_t(ω) attains its maximum on [0, T], there exists n₀ so that for any n ≥ n₀, max_{0≤t≤T} W_t(ω) < K − 1/(2n). So
L_K(T)(ω) = lim_{n→∞} n∫_0^T 1_{(K−1/(2n), K+1/(2n))}(W_t(ω))dt = 0.

(vi)
Proof. Take expectation on both sides of the formula (4.10.45), we have

E[LK (T )] = E[(WT − K)+ ] > 0.

So we cannot have LK (T ) = 0 a.s..


Remark 1. Cf. Kallenberg Theorem 22.5, or the paper by Alsmeyer and Jaeger (2005).

4.21. (i)
Proof. There are two problems. First, the transaction cost could be big due to active trading; second, the
purchases and sales cannot be made at exactly the same price K. For more details, see Hull [6].
(ii)
Proof. No. The RHS of (4.10.26) is a martingale, so its expectation is 0. But E[(ST − K)+ ] > 0. So
XT ̸= (ST − K)+ .

5 Risk-Neutral Pricing
⋆ Comments:
1) Heuristics to memorize the conditional Bayes formula (Lemma 5.2.2 on page 212 of the textbook)

Ẽ[Y|F(s)] = (1/Z(s)) E[Y Z(t)|F(s)].

We recall an example in Durrett [4, page 223] (Example 5.1.3): suppose Ω1 , Ω2 , · · · is a finite or infinite
partition of Ω into disjoint sets, each of which has positive probability, and let F = σ(Ω1 , Ω2 , · · · ) be the
σ-field generated by these sets. Then
E[X|F(s)] = ∑_i (E[X; Ω_i]/P(Ω_i)) 1_{Ω_i}.
Therefore, if F(s) = σ(Ω₁, Ω₂, · · ·), we have
Ẽ[Y|F(s)] = ∑_i (Ẽ[Y; Ω_i]/P̃(Ω_i)) 1_{Ω_i} = ∑_i (E[Y Z(t); Ω_i]/E[Z(s); Ω_i]) 1_{Ω_i}.
Consequently, on any Ω_i,
Ẽ[Y|F(s)] = (E[Y Z(t); Ω_i]/P(Ω_i)) / (E[Z(s); Ω_i]/P(Ω_i)) = E[Y Z(t)|F(s)]/E[Z(s)|F(s)] = (1/Z(s)) E[Y Z(t)|F(s)].

2) Heuristics to memorize the one-dimensional Girsanov’s Theorem (Theorem 5.2.3 on page 212 of the

textbook): suppose under a change of measure with density process (Z(t))_{t≥0}, W̃(t) = W(t) + ∫_0^t Θ(u)du becomes a martingale; then we must have
Z(t) = exp{−∫_0^t Θ(u)dW(u) − ½∫_0^t Θ²(u)du}.

Recall Z is a positive martingale under the original probability measure P, so by martingale representation
theorem under the Brownian filtration, Z must satisfy the SDE

dZ(t) = Z(t)X(t)dW (t)

for some adapted process X(t).⁷ Since an adapted process M is a P̃-martingale if and only if MZ is a P-martingale,⁸ we conclude W̃ is a P̃-martingale if and only if W̃Z is a P-martingale. Since
d[W̃(t)Z(t)] = Z(t)[dW(t) + Θ(t)dt] + W̃(t)Z(t)X(t)dW(t) + [dW(t) + Θ(t)dt]Z(t)X(t)dW(t)
= (· · ·)dW(t) + Z(t)[Θ(t) + X(t)]dt,
we must require X(t) = −Θ(t). So Z(t) = exp{−∫_0^t Θ(u)dW(u) − ½∫_0^t Θ²(u)du}.

3) Main idea of Example 5.4.4. Combining formula (5.4.15) and (5.4.22), we have
d[D(t)X(t)] = ∑_{i=1}^m ∆_i(t)d[D(t)S_i(t)] = ∑_{i=1}^m ∆_i(t)D(t)S_i(t)[(α_i(t) − R(t))dt + σ_i(t)dB_i(t)].
To create arbitrage, we want D(t)X(t) to have a deterministic and positive return rate, that is,
∑_{i=1}^m ∆_i(t)S_i(t)σ_i(t) = 0 and ∑_{i=1}^m ∆_i(t)S_i(t)[α_i(t) − R(t)] > 0.
In Example 5.4.4, m = 2, ∆₁(t) = 1/(S₁(t)σ₁), ∆₂(t) = −1/(S₂(t)σ₂) and
(α₁ − r)/σ₁ − (α₂ − r)/σ₂ > 0.

5.1. (i)

Proof.
df(X_t) = f′(X_t)dX_t + ½f″(X_t)d⟨X⟩_t
= f(X_t)(dX_t + ½d⟨X⟩_t)
= f(X_t)[σ_t dW_t + (α_t − R_t − ½σ_t²)dt + ½σ_t²dt]
= f(X_t)(α_t − R_t)dt + f(X_t)σ_t dW_t.

This is formula (5.2.20).

(ii)
Proof. d(Dt St ) = St dDt + Dt dSt + dDt dSt = −St Rt Dt dt + Dt αt St dt + Dt σt St dWt = Dt St (αt − Rt )dt +
Dt St σt dWt . This is formula (5.2.20).

5.2.
Proof. By Lemma 5.2.2, Ẽ[D_T V_T|F_t] = E[D_T V_T Z_T|F_t]/Z_t. Therefore (5.2.30) is equivalent to D_t V_t Z_t = E[D_T V_T Z_T|F_t].

5.3. (i)
7 The ξ(t)
existence of X is easy: if dZ(t) = ξ(t)dW (t), then X(t) = Z(t)
.
e (t)|F(s)] = E[M (t)Z(t)|F(s)]/Z(s).
8 To see this, note E[M

Proof.
c_x(0, x) = (d/dx) Ẽ[e^{−rT}(xe^{σW̃_T + (r−½σ²)T} − K)⁺]
= Ẽ[e^{−rT}(d/dx)h(xe^{σW̃_T + (r−½σ²)T})]
= Ẽ[e^{−rT}e^{σW̃_T + (r−½σ²)T}1_{{xe^{σW̃_T + (r−½σ²)T} > K}}]
= e^{−½σ²T}Ẽ[e^{σW̃_T}1_{{W̃_T > (1/σ)(ln(K/x) − (r−½σ²)T)}}]
= e^{−½σ²T}∫_{−∞}^∞ (1/√(2π))e^{−z²/2}e^{σ√T z}1_{{z − σ√T > −d+(T,x)}}dz
= ∫_{−∞}^∞ (1/√(2π))e^{−(z−σ√T)²/2}1_{{z − σ√T > −d+(T,x)}}dz
= N(d+(T, x)).

(ii)
f
Proof. If we set ZbT = eσWT − 2 σ T and Zbt = E[ e ZbT |Ft ], then Zb is a Pe-martingale, Zbt > 0 and E[ZbT ] =
1 2

e f
σ WT − 2 σ T
] = 1. So if we define P by dP = ZT dPe on FT , then Pb is a probability measure equivalent to
b b
1 2
E[e
e
P , and
e ZbT 1{S >K} ] = Pb(ST > K).
cx (0, x) = E[ T
∫t
Moreover, by Girsanov’s Theorem, W ct = W ft + (−σ)du = W ft − σt is a Pb-Brownian motion (set Θ = −σ
0
in Theorem 5.4.1.)
(iii)
f 1 2 c 1 2
Proof. ST = xeσWT +(r− 2 σ )T
= xeσWT +(r+ 2 σ )T
. So
( )
cT +(r+ 1 σ 2 )T cT
W
Pb(ST > K) = Pb(xe σW 2 > K) = Pb √ > −d+ (T, x) = N (d+ (T, x)).
T

5.4. First, a few typos. In the SDE for S, "σ(t)dW̃(t)" → "σ(t)S(t)dW̃(t)". In the first equation for c(0, S(0)), E → Ẽ. In the second equation for c(0, S(0)), the variable for BSM should be
BSM(T, S(0); K, (1/T)∫_0^T r(t)dt, √((1/T)∫_0^T σ²(t)dt)).

(i)
Proof. d ln S_t = dS_t/S_t − d⟨S⟩_t/(2S_t²) = r_t dt + σ_t dW̃_t − ½σ_t²dt. So S_T = S_0 exp{∫_0^T (r_t − ½σ_t²)dt + ∫_0^T σ_t dW̃_t}. Let X = ∫_0^T (r_t − ½σ_t²)dt + ∫_0^T σ_t dW̃_t. The first term in the expression of X is a number and the second term is a Gaussian random variable N(0, ∫_0^T σ_t²dt), since both r and σ are deterministic. Therefore, S_T = S_0 e^X, with X ∼ N(∫_0^T (r_t − ½σ_t²)dt, ∫_0^T σ_t²dt).

(ii)

Proof. For the standard BSM model with constant volatility Σ and interest rate R, under the risk-neutral measure, we have S_T = S_0 e^Y, where Y = (R − ½Σ²)T + ΣW̃_T ∼ N((R − ½Σ²)T, Σ²T), and Ẽ[(S_0 e^Y − K)⁺] = e^{RT}·BSM(T, S_0; K, R, Σ). Note R = (1/T)(E[Y] + ½Var(Y)) and Σ = √((1/T)Var(Y)), so we can write
Ẽ[(S_0 e^Y − K)⁺] = e^{E[Y] + ½Var(Y)}·BSM(T, S_0; K, (1/T)(E[Y] + ½Var(Y)), √((1/T)Var(Y))).
So for the model in this problem,
c(0, S_0) = e^{−∫_0^T r_t dt} Ẽ[(S_0 e^X − K)⁺]
= e^{−∫_0^T r_t dt} e^{E[X] + ½Var(X)}·BSM(T, S_0; K, (1/T)(E[X] + ½Var(X)), √((1/T)Var(X)))
= BSM(T, S_0; K, (1/T)∫_0^T r_t dt, √((1/T)∫_0^T σ_t²dt)).
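The conclusion says that with deterministic r(t) and σ(t) one simply feeds the time-averaged rate and root-mean-square volatility into the usual BSM formula. A small numerical sketch comparing this with direct Monte Carlo under the risk-neutral measure (added for illustration, not part of the original solution; the rate and volatility curves are assumptions):

```python
import numpy as np
from scipy.stats import norm

def bsm(T, S0, K, r, sigma):
    d1 = (np.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S0 * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

r_fun = lambda t: 0.02 + 0.03 * t          # assumed deterministic short rate
s_fun = lambda t: 0.20 + 0.10 * t          # assumed deterministic volatility
T, S0, K = 1.0, 100.0, 95.0

grid = np.linspace(0.0, T, 100_001)
r_bar = np.trapz(r_fun(grid), grid) / T                    # (1/T) int r dt
sig_bar = np.sqrt(np.trapz(s_fun(grid)**2, grid) / T)      # sqrt((1/T) int sigma^2 dt)
print("averaged-input BSM price:", bsm(T, S0, K, r_bar, sig_bar))

# Monte Carlo: S_T = S_0 e^X with X Gaussian as in part (i)
rng = np.random.default_rng(9)
X = rng.normal(np.trapz(r_fun(grid) - 0.5 * s_fun(grid)**2, grid),
               np.sqrt(np.trapz(s_fun(grid)**2, grid)), 1_000_000)
mc = np.exp(-r_bar * T) * np.mean(np.maximum(S0 * np.exp(X) - K, 0.0))
print("Monte Carlo price       :", mc)
```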

5.5. (i)

Proof. Let f(x) = 1/x, then f′(x) = −1/x² and f″(x) = 2/x³. Note dZ_t = −Z_tΘ_t dW_t, so
d(1/Z_t) = f′(Z_t)dZ_t + ½f″(Z_t)dZ_t dZ_t = −(1/Z_t²)(−Z_t)Θ_t dW_t + (1/Z_t³)Z_t²Θ_t²dt = (Θ_t/Z_t)dW_t + (Θ_t²/Z_t)dt.

(ii)
[ ]
fs = E[
eM ft |Fs ] = E ft ft |Fs ] =
Proof. By Lemma 5.2.2., for s, t ≥ 0 with s < t, M Zs |Fs
Zt M
. That is, E[Zt M
fs . So M = Z M
Zs M f is a P -martingale.

(iii)
Proof.
( ) 2
ft = d Mt · 1 = 1 dMt + Mt d 1 + dMt d 1 = Γt dWt + Mt Θt dWt + Mt Θt dt + Γt Θt dt.
dM
Zt Zt Zt Zt Zt Zt Zt Zt

(iv)
Proof. In part (iii), we have

ft = Γt Mt Θt Mt Θ2t Γ t Θt Γt M t Θt
dM dWt + dWt + dt + dt = (dWt + Θt dt) + (dWt + Θt dt).
Zt Zt Zt Zt Zt Zt

et =
Let Γ Γt +Mt Θt
, ft = Γ
then dM e t dW
ft . This proves Corollary 5.3.2.
Zt

5.6.

Proof. By Theorem 4.6.5, it suffices to show W fi (t) is an Ft -martingale under Pe and [W fi , W
fj ](t) = tδij
fi (t) is an Ft -martingale under Pe if and only if W
(i, j = 1, 2). Indeed, for i = 1, 2, W fi (t)Zt is an Ft -martingale
under P , since [ ]
fi (t)Zt
W
eW
E[ fi (t)|Fs ] = E |Fs .
Zs
By Itô’s product formula, we have
fi (t)Zt )
d(W = fi (t)dZt + Zt dW
W fi (t) + dZt dW
fi (t)
fi (t)(−Zt )Θ(t) · dWt + Zt (dWi (t) + Θi (t)dt) + (−Zt Θt · dWt )(dWi (t) + Θi (t)dt)
= W

d
fi (t)(−Zt )
= W Θj (t)dWj (t) + Zt (dWi (t) + Θi (t)dt) − Zt Θi (t)dt
j=1


d
fi (t)(−Zt )
= W Θj (t)dWj (t) + Zt dWi (t)
j=1

fi (t)Zt is an Ft -martingale under P . So W


This shows W fi (t) is an Ft -martingale under Pe. Moreover,
[ ∫ · ∫ · ]
f f
[Wi , Wj ](t) = Wi + Θi (s)ds, Wj + Θj (s)ds (t) = [Wi , Wj ](t) = tδij .
0 0

Combined, this proves the two-dimensional Girsanov’s Theorem.


5.7. (i)
Proof. Let a be any strictly positive number. We define X2 (t) = (a + X1 (t))D(t)−1 . Then
( )
X2 (0)
P X2 (T ) ≥ = P (a + X1 (T ) ≥ a) = P (X1 (T ) ≥ 0) = 1,
D(T )
( )
and P X2 (T ) > X 2 (0)
D(T ) = P (X1 (T ) > 0) > 0, since a is arbitrary, we have proved the claim of this
problem.
Remark 2. The intuition is that we invest the positive starting fund a into the money market account, and
construct portfolio X1 from zero cost. Their sum should be able to beat the return of money market account.
(ii)
Proof. We define X1 (t) = X2 (t)D(t) − X2 (0). Then X1 (0) = 0,
( ) ( )
X2 (0) X2 (0)
P (X1 (T ) ≥ 0) = P X2 (T ) ≥ = 1, P (X1 (T ) > 0) = P X2 (T ) > > 0.
D(T ) D(T )

5.8. The basic idea is that for any positive Pe-martingale M , dMt = Mt · M1t dMt . By Martingale Repre-
sentation Theorem, dMt = Γ e t dW
ft for some adapted process Γe t . So dMt = Mt ( Γet )dW
ft , i.e. any positive
Mt
martingale must be the exponential of an integral w.r.t. Brownian motion. Taking into account discounting
factor and apply Itô’s product rule, we can show every strictly positive asset is a generalized geometric
Brownian motion.
(i)

e − 0T Ru du VT |Ft ] = E[D
Proof. Vt Dt = E[e e T VT |Ft ]. So (Dt Vt )t≥0 is a Pe-martingale. By Martingale Represen-

tation Theorem, there exists an adapted process Γ e t , 0 ≤ t ≤ T , such that Dt Vt = t Γ e s dW
fs , or equivalently,

−1 t e f
∫0
−1 t e f
Vt = Dt 0 Γs dWs . Differentiate both sides of the equation, we get dVt = Rt Dt 0 Γs dWs dt + Dt−1 Γ e t dW
ft ,
et
Γ
i.e. dVt = Rt Vt dt + Dt dWt .

(ii)

Proof. We prove the following more general lemma.


Lemma 5.1. Let X be an almost surely positive random variable (i.e. X > 0 a.s.) defined on the probability
space (Ω, G, P ). Let F be a sub σ-algebra of G, then Y = E[X|F] > 0 a.s.
Proof. By the property of conditional expectation Yt ≥ 0 a.s. Let A = {Y = 0},∑we shall show P (A) = 0. In-

deed, note A ∈ F, 0 = E[Y IA ] = E[E[X|F]IA ] = E[XIA ] = E[X1A∩{X≥1} ] + n=1 E[X1A∩{ n1 >X≥ n+1 1
}] ≥
∑∞
P (A∩{X ≥ 1})+ n=1 n+1 P (A∩{ n > X ≥ n+1 }). So P (A∩{X ≥ 1}) = 0 and P (A∩{ n > X ≥ n+1 }) = 0,
1 1 1 1 1
∑∞
∀n ≥ 1. This in turn implies P (A) = P (A ∩ {X > 0}) = P (A ∩ {X ≥ 1}) + n=1 P (A ∩ { n1 > X ≥ n+1 1
}) =
0.

By the above lemma, it is clear that for each t ∈ [0, T ], Vt = E[ee − tT Ru du VT |Ft ] > 0 a.s.. Moreover,
by a classical result of martingale theory (Revuz and Yor [11], Chapter II, Proposition (3.4)), we have the
following stronger result: for a.s. ω, Vt (ω) > 0 for any t ∈ [0, T ].
(iii)
( )
et f
Γ e ft = Rt Vt dt +
Proof. By (ii), V > 0 a.s., so dVt = Vt V1t dVt = Vt V1t Rt Vt dt + Dt dWt = Vt Rt dt + Vt VtΓDt t dW
ft , where σt = et
Γ
σt Vt dW Vt Dt . This shows V follows a generalized geometric Brownian motion.

5.9.
Proof. c(0, T, x, K) = xN(d+) − Ke^{−rT}N(d−) with d± = (1/(σ√T))(ln(x/K) + (r ± ½σ²)T). Let f(y) = (1/√(2π))e^{−y²/2}, then f′(y) = −yf(y),
c_K(0, T, x, K) = xf(d+)(∂d+/∂K) − e^{−rT}N(d−) − Ke^{−rT}f(d−)(∂d−/∂K)
= −xf(d+)/(σ√T K) − e^{−rT}N(d−) + e^{−rT}f(d−)/(σ√T),
and
c_KK(0, T, x, K)
= xf(d+)/(σ√T K²) − (x/(σ√T K))f(d+)(−d+)(∂d+/∂K) − e^{−rT}f(d−)(∂d−/∂K) + (e^{−rT}/(σ√T))(−d−)f(d−)(∂d−/∂K)
= (x/(σ√T K²))f(d+) − (xd+/(K²σ²T))f(d+) + (e^{−rT}/(Kσ√T))f(d−) + (e^{−rT}d−/(Kσ²T))f(d−)
= (x/(K²σ√T))f(d+)[1 − d+/(σ√T)] + (e^{−rT}/(Kσ√T))f(d−)[1 + d−/(σ√T)]
= (e^{−rT}/(Kσ²T))f(d−)d+ − (x/(K²σ²T))f(d+)d−.
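The resulting c_KK is (up to the discount factor) the risk-neutral density of S_T evaluated at K, and it can be cross-checked against a central finite difference of the call price in the strike. A sketch (added here, not part of the original solution; the inputs are assumed values):

```python
import numpy as np
from scipy.stats import norm

x, K, r, sigma, T = 100.0, 95.0, 0.03, 0.2, 1.5   # assumed inputs

def call(K_):
    d_plus = (np.log(x / K_) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d_minus = d_plus - sigma * np.sqrt(T)
    return x * norm.cdf(d_plus) - K_ * np.exp(-r * T) * norm.cdf(d_minus)

# closed-form c_KK from the computation above
d_plus = (np.log(x / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
d_minus = d_plus - sigma * np.sqrt(T)
f = norm.pdf
cKK_closed = (np.exp(-r * T) / (K * sigma**2 * T) * f(d_minus) * d_plus
              - x / (K**2 * sigma**2 * T) * f(d_plus) * d_minus)

# central finite difference in the strike
h = 1e-3
cKK_fd = (call(K + h) - 2 * call(K) + call(K - h)) / h**2
print("closed form c_KK :", cKK_closed)
print("finite difference:", cKK_fd)
```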

5.10. (i)
Proof. At time t0 , the value of the chooser option is V (t0 ) = max{C(t0 ), P (t0 )} = max{C(t0 ), C(t0 ) −
F (t0 )} = C(t0 ) + max{0, −F (t0 )} = C(t0 ) + (e−r(T −t0 ) K − S(t0 ))+ .
(ii)

Proof. By the risk-neutral pricing formula, V(0) = Ẽ[e^{−rt₀}V(t₀)] = Ẽ[e^{−rt₀}C(t₀) + (e^{−rT}K − e^{−rt₀}S(t₀))⁺] = C(0) + Ẽ[e^{−rt₀}(e^{−r(T−t₀)}K − S(t₀))⁺]. The first term is the value of a call expiring at time T with strike price K and the second term is the value of a put expiring at time t₀ with strike price e^{−r(T−t₀)}K.

5.11.
Proof. We first make an analysis which leads to the hint, then we give a formal proof.
(Analysis) If we want to construct a portfolio X that exactly replicates the cash flow, we must find a
solution to the backward SDE
{
dXt = ∆t dSt + Rt (Xt − ∆t St )dt − Ct dt
XT = 0.

Multiply Dt on both sides of the first equation and apply Itô’s product rule, we get d(Dt Xt ) = ∆t d(Dt St ) −
∫T ∫T
Ct Dt dt. Integrate from 0 to T , we have DT XT − D0 X0 = 0 ∆t d(Dt St ) − 0 Ct Dt dt. By the terminal
∫ ∫
condition, we get X0 = D0−1 ( 0 Ct Dt dt − 0 ∆t d(Dt St )). X0 is the theoretical, no-arbitrage price of
T T

the cash flow, provided we can find a trading strategy ∆ that solves the BSDE. Note the SDE for S
gives d(Dt St ) = (Dt St )σt (θt dt + dWt ), where θt = αtσ−R t
. Take the proper change of measure so that
∫t t

Wt = θs ds + Wt is a Brownian motion under the new measure Pe, we get


f
0
∫ T ∫ T ∫ T
Ct Dt dt = D0 X0 + ∆t d(Dt St ) = D0 X0 + ft .
∆t (Dt St )σt dW
0 0 0
∫T ∫T
ft .
This says the random variable 0 Ct Dt dt has a stochastic integral representation D0 X0 + 0 ∆t Dt St σt dW
∫T
This inspires us to consider the martingale generated by 0 Ct Dt dt, so that we can apply Martingale Rep-
resentation Theorem and get a formula for ∆ by comparison of the integrands.
∫T
(Formal proof) Let MT = 0 Ct Dt dt, and Mt = E[M e T |Ft ]. Then by Martingale Representation Theo-

rem, we can find an adapted process Γ e t , so that Mt = M0 + t Γ ft . If we set ∆t = Γet , we can check
e dW
0 t
∫ ∫ ∫ Dt St σt
Xt = Dt−1 (D0 X0 + 0 ∆u d(Du Su ) − 0 Cu Du du), with X0 = M0 = E[
t t e T Ct Dt dt] solves the SDE
0
{
dXt = ∆t dSt + Rt (Xt − ∆t St )dt − Ct dt
XT = 0.

Indeed, it is easy to see that X satisfies the first equation. To check the terminal condition, we note
∫T ∫ ∫
XT DT = D0 X0 + 0 ∆t Dt St σt dWft − T Ct Dt dt = M0 + T Γ e t dW
ft − MT = 0. So XT = 0. Thus, we have
0 0
found a trading strategy ∆, so that the corresponding portfolio X replicates the cash flow and has zero

e T Ct Dt dt] is the no-arbitrage price of the cash flow at time zero.
terminal value. So X0 = E[ 0

Remark 3. As shown in the analysis, d(Dt Xt ) = ∆t d(Dt St ) − Ct Dt dt. Integrate from t to T , we get
∫T ∫T
0 − Dt Xt = t ∆u d(Du Su ) − t Cu Du du. Take conditional expectation w.r.t. Ft on both sides, we get
∫ ∫
−Dt Xt = −E[e T Cu Du du|Ft ]. So Xt = Dt−1 E[
e T Cu Du du|Ft ]. This is the no-arbitrage price of the cash
t t
flow at time t, and we have justified formula (5.6.10) in the textbook.
5.12. (i)
ei (t) = dBi (t) + γi (t)dt = ∑d σij (t) dWj (t) + ∑d σij (t) Θj (t)dt = ∑d σij (t) dW
Proof. dB fj (t). So Bi is a
j=1 σi (t) j=1 σi (t) j=1 σi (t)
ei (t)dB ∑
ei (t) = d σij (t)2 dt = dt, by Lévy’s Theorem, B
2
ei is a Brownian motion under
martingale. Since dB j=1 σi (t)
Pe.
(ii)

Proof.

dSi (t) = ei (t) + (αi (t) − R(t))Si (t)dt − σi (t)Si (t)γi (t)dt
R(t)Si (t)dt + σi (t)Si (t)dB

d ∑
d
ei (t) +
= R(t)Si (t)dt + σi (t)Si (t)dB σij (t)Θj (t)Si (t)dt − Si (t) σij (t)Θj (t)dt
j=1 j=1
ei (t).
= R(t)Si (t)dt + σi (t)Si (t)dB

(iii)
ei (t)dB
Proof. dB ek (t) = (dBi (t) + γi (t)dt)(dBj (t) + γj (t)dt) = dBi (t)dBj (t) = ρik (t)dt.

(iv)

Proof. By Itô’s product rule and martingale property,


∫ t ∫ t ∫ t
E[Bi (t)Bk (t)] = E[ Bi (s)dBk (s)] + E[ Bk (s)dBi (s)] + E[ dBi (s)dBk (s)]
0 0 0
∫ t ∫ t
= E[ ρik (s)ds] = ρik (s)ds.
0 0
∫t
eB
Similarly, by part (iii), we can show E[ ei (t)B
ek (t)] = ρik (s)ds.
0

(v)
Proof. By Itô’s product formula,
∫ t ∫ t
E[B1 (t)B2 (t)] = E[ sign(W1 (u))du] = [P (W1 (u) ≥ 0) − P (W1 (u) < 0)]du = 0.
0 0

Meanwhile,
∫ t
eB
E[ e1 (t)B
e2 (t)] = e
E[ sign(W1 (u))du
0
∫ t
= [Pe(W1 (u) ≥ 0) − Pe(W1 (u) < 0)]du
0
∫ t
= [Pe(W
f1 (u) ≥ u) − Pe(W
f1 (u) < u)]du
0
∫ t ( )
1
= 2 − Pe(W
f1 (u) < u) du
0 2
< 0,
eB
for any t > 0. So E[B1 (t)B2 (t)] = E[ e1 (t)B
e2 (t)] for all t > 0.

5.13. (i)
∫t
e 1 (t)] = E[
Proof. E[W eW f1 (t)] = 0 and E[W
e 2 (t)] = E[
eW f2 (t) − f1 (u)du] = 0, for all t ∈ [0, T ].
W
0

(ii)

Proof.
e
Cov[W 1 (T ), W2 (T )] =
e 1 (T )W2 (T )]
E[W
[∫ ∫ ]
T T
e
= E W1 (t)dW2 (t) + W2 (t)dW1 (t)
0 0
[∫ ] [∫ ]
T T
e
= E f f f e
W1 (t)(dW2 (t) − W1 (t)dt) + E f1 (t)
W2 (t)dW
0 0
[∫ ]
T
e
= −E f1 (t)2 dt
W
0
∫ T
= − tdt
0
1
= − T 2.
2

5.14. Equation (5.9.6) can be transformed into d(e−rt Xt ) = ∆t [d(e−rt St ) − ae−rt dt] = ∆t e−rt [dSt − rSt dt −
adt]. So, to make the discounted portfolio value e−rt Xt a martingale, we are motivated to change the measure
∫t
in such a way that St −r 0 Su du−at is a martingale under the new measure. To do this,[we note the SDE for ]S
is dSt = αt St dt+σSt dWt . Hence dSt −rSt dt−adt = [(αt −r)St −a]dt+σSt dWt = σSt (αt −r)S σSt
t −a
dt + dWt .

Set θt = (αt −r)St −a ft = θs ds + Wt , we can find an equivalent probability measure Pe, under which
and W
t
σSt 0
S satisfies the SDE dSt = rSt dt + σSt dW ft + adt and W ft is a BM. This is the rational for formula (5.9.7).
This is a good place to pause and think about the meaning of “martingale measure.” What is to be
a martingale? The new measure Pe should be such that the discounted value process of the replicating
portfolio is a martingale, not the discounted price process of the underlying. First, we want Dt Xt to be a
martingale under Pe because we suppose that X is able to replicate the derivative payoff at terminal time,
XT = VT . In order to avoid arbitrage, we must have Xt = Vt for any t ∈ [0, T ]. The difficulty is how
to calculate Xt and the magic is brought by the martingale measure in the following line of reasoning:
Vt = Xt = Dt−1 E[D e T VT |Ft ]. You can think of martingale measure as a calculational
e T XT |Ft ] = Dt−1 E[D
convenience. That is all about martingale measure! Risk neutral is just a perception, referring to the
actual effect of constructing a hedging portfolio! Second, we note when the portfolio is self-financing, the
discounted price process of the underlying is a martingale under Pe, as in the classical Black-Scholes-Merton
model without dividends or cost of carry. This is not a coincidence. Indeed, we have in this case the
relation d(Dt Xt ) = ∆t d(Dt St ). So Dt Xt being a martingale under Pe is more or less equivalent to Dt St
being a martingale under Pe. However, when the underlying pays dividends, or there is cost of carry,
d(Dt Xt ) = ∆t d(Dt St ) no longer holds, as shown in formula (5.9.6). The portfolio is no longer self-financing,
but self-financing with consumption. What we still want to retain is the martingale property of Dt Xt , not
that of Dt St . This is how we choose martingale measure in the above paragraph.
Let VT be a payoff at time T , then for the martingale Mt = E[e e −rT VT |Ft ], by Martingale Representation

Theorem, we can find an adapted process Γ e t , so that Mt = M0 + t Γ fs . If we let ∆t = Γet ert , then the
e dW
0 s σSt
value of the corresponding portfolio X satisfies d(e−rt Xt ) = Γ e t dW
ft . So by setting X0 = M0 = E[e e −rT VT ],
−rt
we must have e Xt = Mt , for all t ∈ [0, T ]. In particular, XT = VT . Thus the portfolio perfectly hedges
VT . This justifies the risk-neutral pricing of European-type contingent claims in the model where cost of
carry exists. Also note the risk-neutral measure is different from the one in case of no cost of carry.
Another perspective for perfect replication is the following. We need to solve the backward SDE
{
dXt = ∆t dSt − a∆t dt + r(Xt − ∆t St )dt
XT = VT

for two unknowns, X and ∆. To do so, we find a probability measure Pe, under which e−rt Xt is a martingale,

e −rT VT |Ft ] := Mt . Martingale Representation Theorem gives Mt = M0 + t Γ
then e−rt Xt = E[e e dWfu for
0 u

some adapted process Γ.e This would give us a theoretical representation of ∆ by comparison of integrands,
hence a perfect replication of VT .
(i)

Proof. As indicated in the above analysis, if we have (5.9.7) under Pe, then d(e−rt Xt ) = ∆t [d(e−rt St ) −
ft . So (e−rt Xt )t≥0 , where X is given by (5.9.6), is a Pe-martingale.
ae−rt dt] = ∆t e−rt σSt dW

(ii)

Proof. By Itô’s formula, dYt = Yt [σdW ft + (r − 1 σ 2 )dt] + 1 Yt σ 2 dt = Yt (σdWft + rdt). So d(e−rt Yt ) =


2 2 ∫
ft and e−rt Yt is a Pe-martingale. Moreover, if St = S0 Yt + Yt
σe−rt Yt dW
t a
ds, then
0 Ys
∫ t ( ∫ t )
a a ft + rdt) + adt = St (σdW
ft + rdt) + adt.
dSt = S0 dYt + dsdYt + adt = S0 + ds Yt (σdW
0 Ys 0 Ys

This shows S satisfies (5.9.7).

Remark 4. To obtain this formula for S, we first set Ut = e−rt St to remove the rSt dt term. The SDE for
U is dUt = σUt dW ft + ae−rt dt. Just like solving linear ODE, to remove U in the dWft term, we consider
ft
−σ W
Vt = Ut e . Itô’s product formula yields
( ) ( )
ft
−σ W ft
−σ W f 1 2 ft
−σ W f 1 2
dVt = e dUt + Ut e (−σ)dWt + σ dt + dUt · e (−σ)dWt + σ dt
2 2
f 1
= e−σWt ae−rt dt − σ 2 Vt dt.
2
1 2
Note V appears only in the dt term, so multiply the integration factor e 2 σ t on both sides of the equation,
we get
f 1 2
d(e 2 σ t Vt ) = ae−rt−σWt + 2 σ t dt.
1 2

f 1 2 ∫t
Set Yt = eσWt +(r− 2 σ )t , we have d(St /Yt ) = adt/Yt . So St = Yt (S0 + 0 ads Ys ).

(iii)
Proof.
[ ∫ ∫ ]
t T
e T |Ft ] = e T |Ft ] + Ee YT a a
E[S S0 E[Y ds + YTds|Ft
0 Yst Ys
∫ t ∫ T [ ]
e T |Ft ] + a e T |Ft ] + a e YT |Ft ds
= S0 E[Y dsE[Y E
0 Ys t Ys
∫ t ∫ T
e T −t ] + a e T −t ] + a e T −s ]ds
= S0 Yt E[Y dsYt E[Y E[Y
0 Ys t
( ∫ t ) ∫ T
a
= S0 + ds Yt er(T −t) + a er(T −s) ds
0 Ys t
( ∫ t )
ads a
= S0 + Yt er(T −t) − (1 − er(T −t) ).
0 Y s r

e T ] = S0 erT − a (1 − erT ).
In particular, E[S r

(iv)

Proof.
( ∫ t )
e T |Ft ] = r(T −t) ads a
dE[S ae dt + S0 + (er(T −t) dYt − rYt er(T −t) dt) + er(T −t) (−r)dt
0 Ys r
( ∫ t )
ads r(T −t) ft .
= S0 + e σYt dW
0 Ys

e T |Ft ] is a Pe-martingale. As we have argued at the beginning of the solution, risk-neutral pricing is
So E[S
e T |Ft ]
valid even in the presence of cost of carry. So by an argument similar to that of §5.6.2, the process E[S
is the futures price process for the commodity.
(v)
e −r(T −t) (ST − K)|Ft ] = 0 for K, and get K = E[S
Proof. We solve the equation E[e e T |Ft ]. So F orS (t, T ) =
F utS (t, T ).
(vi)
Proof. We follow the hint. First, we solve the SDE
{
dXt = dSt − adt + r(Xt − St )dt
X0 = 0.
By our analysis in part (i), d(e−rt Xt ) = d(e−rt St ) − ae−rt dt. Integrate from 0 to t on both sides, we get
Xt = St − S0 ert + ar (1 − ert ) = St − S0 ert − ar (ert
( − 1).∫ In particular,
) XT = ST − S0 erT − ar (erT − 1).
e T |Ft ] = S0 +
Meanwhile, F orS (t, T ) = F uts (t, T ) = E[S
t ads
Yt er(T −t) − a (1−er(T −t) ). So F orS (0, T ) =
0 Ys r
S0 erT − ar (1 − erT ) and hence XT = ST − F orS (0, T ). After the agent delivers the commodity, whose value
is ST , and receives the forward price F orS (0, T ), the portfolio has exactly zero value.

6 Connections with Partial Differential Equations


⋆ Comments:
1) A rigorous presentation of (strong) Markov property can be found in Revuz and Yor [11], Chapter III.
2) A rigorous presentation of the Feymann-Kac formula can be found in Øksendal [9, page 143] (also
see 钱敏平等 [10, page 236]). To reconcile the version presented in this book and the one in Øksendal [9]
(Theorem 8.2.1), we note for the undiscounted version, in this book
g(t, X(t)) = E[h(X(T ))|F(t)] = EX(t) [h(X(T − t))]
while in Øksendal [9]
v(t, x) = E x [h(Xt )].
So v(t, x) = g(T − t, x). The discounted version can connected similarly.
3) Hedging equation. Recall the SDE satisfied by the value process X(t) of a self-financing portfolio is
(see formula (5.4.21) and (5.4.22) on page 230)
d[D(t)X(t)] = D(t)[dX(t) − R(t)X(t)dt]
{m [ ] }
∑ ∑
m
= D(t) ∆i (t)dSi (t) + R(t) X(t) − ∆i (t)Si (t) dt − R(t)X(t)dt
i=1 i=1
[m ]
∑ ∑
m
= D(t) ∆i (t)dSi (t) − R(t) ∆i (t)Si (t)dt
i=1 i=1

m
= ∆i (t)d[D(t)Si (t)].
i=1

Under the multidimensional market model (formula (5.4.6) on page 226)


d
dSi (t) = αi (t)Si (t)dt + Si (t) σij (t)dWj (t), i = 1, · · · , m.
j=1

and assuming the existence of the risk-neutral measure (that is, the market price of risk equations has a
∑d
solution (formula (5.4.18) on page 228): αi (t) − R(t) = j=1 σij (t)Θj (t), i = 1, · · · , m), D(t)Si (t) is a
martingale under the risk-neutral measure (formula (5.4.17) on page 228) satisfying the SDE


d ∑
d
d[D(t)Si (t)] = D(t)Si (t) fj (t), dSi (t) = R(t)Si (t)dt + Si (t)
σij (t)dW fj (t), i = 1, · · · , m.
σij (t)dW
j=1 j=1
∑m
Consequently, d[D(t)X(t)] = ∆i (t)d[D(t)Si (t)] becomes
i=1
 
∑m ∑
d
d[D(t)X(t)] = D(t) ∆i (t)Si (t) fj (t) .
σij (t)dW
i=1 j=1

If we further assume X(t) has the form v(t, S(t)), then the above equation becomes
 
∑ m
1 ∑m
D(t) vt (t, S(t))dt + vxi (t, S(t))dSi (t) + vxk xl (t, S(t))dSk (t)dSl (t) − R(t)v(t, X(t))dt
i=1
2
k,l=1
 
∑m ∑
d
= (· · · )dt + D(t) vxi (t, S(t)) Si (t) fj (t)
σij (t)dW
i=1 j=1
 

m ∑
d
= D(t) ∆i (t)Si (t) fj (t) .
σij (t)dW
i=1 j=1

fj (t), we have the hedging equation


Equating the coefficient of each dW


m ∑
m
vxi (t, S(t))Si (t)σij (t) = ∆i (t)Si (t)σij (t), j = 1, · · · , d.
i=1 i=1

One solution of the hedging equation is

∆i (t) = vxi (t, S(t)), i = 1, · · · , m.

I Exercise 6.1. Consider the stochastic differential equation

dX(u) =

(i)

Proof. Zt = 1 is obvious. Note the form of Z is similar to that of a geometric Brownian motion. So by Itô’s
formula, it is easy to obtain dZu = bu Zu du + σu Zu dWu , u ≥ t.

(ii)

Proof. If Xu = Yu Zu (u ≥ t), then Xt = Yt Zt = x · 1 = x and

dXu = Yu dZu + Zu dYu + dYu Zu


( )
au − σu γu γu γu
= Yu (bu Zu du + σu Zu dWu ) + Zu du + dWu + σu Zu du
Zu Zu Zu
= [Yu bu Zu + (au − σu γu ) + σu γu ]du + (σu Zu Yu + γu )dWu
= (bu Xu + au )du + (σu Xu + γu )dWu .

Remark 5. To see how to find the above solution, we manipulate the equation (6.2.4) ∫as follows. First, to
remove the term bu Xu du, we multiply on both sides of (6.2.4) the integrating factor e− t bv dv . Then
u

∫u ∫u
d(Xu e− t
bv dv
) = e− t
bv dv
(au du + (γu + σu Xu )dWu ).
∫u ∫u ∫u
Let X̄u = e− t
bv dv
Xu , āu = e− t
bv dv
au and γ̄u = e− t
bv dv
γu , then X̄ satisfies the SDE

dX̄u = āu du + (γ̄u + σu X̄u )dWu = (āu du + γ̄u dWu ) + σu X̄u dWu .
∫u
To deal with the term σu X̄u dWu , we consider X̂u = X̄u e− . Then t
σv dWv


( ∫
)
− tu σv dWv − tu σv dWv 1 − ∫ u σv dWv 2
dX̂u = e [(āu du + γ̄u dWu ) + σu X̄u dWu ] + X̄u e (−σu )dWu + e t σu du
2
∫u
+(γ̄u + σu X̄u )(−σu )e− t
σv dWv
du
1
= âu du + γ̂u dWu + σu X̂u dWu − σu X̂u dWu + X̂u σu2 du − σu (γ̂u + σu X̂u )du
2
1
= (âu − σu γ̂u − X̂u σu2 )du + γ̂u dWu ,
2
∫u ∫u ∫u
where âu = āu e− and γ̂u = γ̄u e−
1 2
σv dWv σv dWv 2 σv dv
t t . Finally, use the integrating factor e t , we have
( ∫u ) ∫u 1 ∫u 2
1
σv2 dv 1
σv2 dv 1
d X̂u e 2 t = e2 t (dX̂u + X̂u · σu2 du) = e 2 t σv dv [(âu − σu γ̂u )du + γ̂u dWu ].
2
Write everything back into the original X, a and γ, we get
( ∫u ∫u ∫u 2 ) ∫u 2 ∫u ∫u
d Xu e− t bv dv− t σv dWv + 2 t σv dv = e 2 t σv dv− t σv dWv − t bv dv [(au − σu γu )du + γu dWu ],
1 1

i.e. ( )
Xu 1
d = [(au − σu γu )du + γu dWu ] = dYu .
Zu Zu
This inspired us to try Xu = Yu Zu .

6.2. (i)
Proof. The portfolio is self-financing, so for any t ≤ T1 , we have

dXt = ∆1 (t)df (t, Rt , T1 ) + ∆2 (t)df (t, Rt , T2 ) + Rt (Xt − ∆1 (t)f (t, Rt , T1 ) − ∆2 (t)f (t, Rt , T2 ))dt,

and

d(Dt Xt )
= −Rt Dt Xt dt + Dt dXt
= Dt [∆1 (t)df (t, Rt , T1 ) + ∆2 (t)df (t, Rt , T2 ) − Rt (∆1 (t)f (t, Rt , T1 ) + ∆2 (t)f (t, Rt , T2 ))dt]
( )
1
= Dt [∆1 (t) ft (t, Rt , T1 )dt + fr (t, Rt , T1 )dRt + frr (t, Rt , T1 )γ 2 (t, Rt )dt
2
( )
1
+∆2 (t) ft (t, Rt , T2 )dt + fr (t, Rt , T2 )dRt + frr (t, Rt , T2 )γ 2 (t, Rt )dt
2
−Rt (∆1 (t)f (t, Rt , T1 ) + ∆2 (t)f (t, Rt , T2 ))dt]
1
= ∆1 (t)Dt [−Rt f (t, Rt , T1 ) + ft (t, Rt , T1 ) + α(t, Rt )fr (t, Rt , T1 ) + γ 2 (t, Rt )frr (t, Rt , T1 )]dt
2
1
+∆2 (t)Dt [−Rt f (t, Rt , T2 ) + ft (t, Rt , T2 ) + α(t, Rt )fr (t, Rt , T2 ) + γ 2 (t, Rt )frr (t, Rt , T2 )]dt
2
+Dt γ(t, Rt )[Dt γ(t, Rt )[∆1 (t)fr (t, Rt , T1 ) + ∆2 (t)fr (t, Rt , T2 )]]dWt
= ∆1 (t)Dt [α(t, Rt ) − β(t, Rt , T1 )]fr (t, Rt , T1 )dt + ∆2 (t)Dt [α(t, Rt ) − β(t, Rt , T2 )]fr (t, Rt , T2 )dt
+Dt γ(t, Rt )[∆1 (t)fr (t, Rt , T1 ) + ∆2 (t)fr (t, Rt , T2 )]dWt .

(ii)

Proof. Let ∆1 (t) = St fr (t, Rt , T2 ) and ∆2 (t) = −St fr (t, Rt , T1 ), then

d(Dt Xt ) = Dt St [β(t, Rt , T2 ) − β(t, Rt , T1 )]fr (t, Rt , T1 )fr (t, Rt , T2 )dt


= Dt |[β(t, Rt , T1 ) − β(t, Rt , T2 )]fr (t, Rt , T1 )fr (t, Rt , T2 )|dt.

Integrate from 0 to T on both sides of the above equation, we get


∫ T
DT XT − D0 X0 = Dt |[β(t, Rt , T1 ) − β(t, Rt , T2 )]fr (t, Rt , T1 )fr (t, Rt , T2 )|dt.
0

If β(t, Rt , T1 ) ̸= β(t, Rt , T2 ) for some t ∈ [0, T ], under the assumption that fr (t, r, T ) ̸= 0 for all values of r
and 0 ≤ t ≤ T , DT XT − D0 X0 > 0. To avoid arbitrage (see, for example, Exercise 5.7), we must have for
a.s. ω, β(t, Rt , T1 ) = β(t, Rt , T2 ), ∀t ∈ [0, T ]. This implies β(t, r, T ) does not depend on T .

(iii)
Proof. In (6.9.4), let ∆1 (t) = ∆(t), T1 = T and ∆2 (t) = 0, we get
[ ]
1
d(Dt Xt ) = ∆(t)Dt −Rt f (t, Rt , T ) + ft (t, Rt , T ) + α(t, Rt )fr (t, Rt , T ) + γ 2 (t, Rt )frr (t, Rt , T ) dt
2
+Dt γ(t, Rt )∆(t)fr (t, Rt , T )dWt .

This is formula (6.9.5). [ ]


then d(Dt Xt ) = ∆(t)Dt −Rt f (t, Rt , T ) + ft (t, Rt , T ) +]}12 γ 2 (t, Rt )frr (t, Rt , T ) dt. We
If fr (t, r, T ) = 0, {[
choose ∆(t) = sign −Rt f (t, Rt , T ) + ft (t, Rt , T ) + 12 γ 2 (t, Rt )frr (t, Rt , T ) . To avoid arbitrage in this
case, we must have ft (t, Rt , T ) + 12 γ 2 (t, Rt )frr (t, Rt , T ) = Rt f (t, Rt , T ), or equivalently, for any r in the
range of Rt , ft (t, r, T ) + 12 γ 2 (t, r)frr (t, r, T ) = rf (t, r, T ).

6.3.

Proof. We note
d [ − ∫ s bv dv ] ∫s ∫s
e 0 C(s, T ) = e− 0 bv dv [C(s, T )(−bs ) + bs C(s, T ) − 1] = −e− 0 bv dv .
ds
So integrate on both sides of the equation from t to T, we obtain
∫T ∫t
∫ T ∫s
e− 0
bv dv
C(T, T ) − e− 0
bv dv
C(t, T ) = − e− 0
bv dv
ds.
t
∫t ∫T ∫s ∫T ∫t
Since C(T, T ) = 0, we have C(t, T ) = e 0
bv dv
t
e− 0
bv dv
ds = t
e s
bv dv
ds. Finally, by A′ (s, T ) =
−a(s)C(s, T ) + 12 σ 2 (s)C 2 (s, T ), we get
∫ T ∫ T
1
A(T, T ) − A(t, T ) = − a(s)C(s, T )ds + σ 2 (s)C 2 (s, T )ds.
t 2 t
∫T
Since A(T, T ) = 0, we have A(t, T ) = t
(a(s)C(s, T ) − 21 σ 2 (s)C 2 (s, T ))ds.
6.4. (i)
Proof. By the definition of φ, we have
∫T
C(u,T )du 1 1
φ′ (t) = e 2 σ
1 2
t σ 2 (−1)C(t, T ) = − φ(t)σ 2 C(t, T ).
2 2

2φ (t) ′
So C(t, T ) = − ϕ(t)σ 2 . Differentiate both sides of the equation φ (t) = − 2 φ(t)σ C(t, T ), we get
1 2

1
φ′′ (t) =− σ 2 [φ′ (t)C(t, T ) + φ(t)C ′ (t, T )]
2
1 1
= − σ 2 [− φ(t)σ 2 C 2 (t, T ) + φ(t)C ′ (t, T )]
2 2
1 4 1
= σ φ(t)C 2 (t, T ) − σ 2 φ(t)C ′ (t, T ).
4 2
[ ] ′′
So C ′ (t, T ) = 41 σ 4 φ(t)C 2 (t, T ) − φ′′ (t) / 12 φ(t)σ 2 = 12 σ 2 C 2 (t, T ) − 2φ (t)
σ 2 φ(t) .

(ii)
Proof. Plug formulas (6.9.8) and (6.9.9) into (6.5.14), we get

2φ′′ (t) 1 2 2 2φ′ (t) 1


− + σ C (t, T ) = b(−1) + σ 2 C 2 (t, T ) − 1,
σ 2 φ(t) 2 σ 2 φ(t) 2

i.e. φ′′ (t) − bφ′ (t) − 21 σ 2 φ(t) = 0.


(iii)
Proof. The characteristic equation of φ′′ (t) − bφ′ (t) − 12 σ 2 φ(t) = 0 is λ2 − bλ − 21 σ 2 = 0, which gives two
√ √
roots 21 (b ± b2 + 2σ 2 ) = 21 b ± γ with γ = 12 b2 + 2σ 2 . Therefore by standard theory of ordinary differential
equations, a general solution of φ is φ(t) = e 2 bt (a1 eγt + a2 e−γt ) for some constants a1 and a2 . It is then
1

easy to see that we can choose appropriate constants c1 and c2 so that


c1 c2
e−( 2 b+γ)(T −t) − 1 e−( 2 b−γ)(T −t) .
1 1
φ(t) =
2b − γ
1
2b +γ

(iv)

Proof. From part (iii), it is easy to see φ′ (t) = c1 e−( 2 b+γ)(T −t) − c2 e−( 2 b−γ)(T −t) . In particular,
1 1

2φ′ (T ) 2(c1 − c2 )
0 = C(T, T ) = − =− 2 .
σ 2 φ(T ) σ φ(T )
So c1 = c2 .
(v)
Proof. We first recall the definitions and properties of sinh and cosh:
ez − e−z ez + e−z
sinh z = , cosh z = , (sinh z)′ = cosh z, and (cosh z)′ = sinh z.
2 2
Therefore
[ ]
− 12 b(T −t) e−γ(T −t) eγ(T −t)
φ(t) = c1 e −
2b − γ
1 1
2b + γ
[ ]
1
2b − γ −γ(T −t) 1
2b + γ
= c1 e− 2 b(T −t) eγ(T −t)
1
e −
1 2
4b − γ2 1
b 2 − γ2
[ 4
]
2c1 − 1 b(T −t) 1 −γ(T −t) 1 γ(T −t)
= e 2 −( b − γ)e + ( b + γ)e
σ2 2 2
2c1 − 1 b(T −t)
= e 2 [b sinh(γ(T − t)) + 2γ cosh(γ(T − t))].
σ2
and
1 2c1 − 1 b(T −t)
φ′ (t) = b· 2 e 2 [b sinh(γ(T − t)) + 2γ cosh(γ(T − t))]
2 σ
2c1 1
+ 2 e− 2 b(T −t) [−γb cosh(γ(T − t)) − 2γ 2 sinh(γ(T − t))]
σ [ 2
b bγ bγ
= 2c1 e− 2 b(T −t)
1
sinh(γ(T − t)) + 2 cosh(γ(T − t)) − 2 cosh(γ(T − t))
2σ 2 σ σ
]
2γ 2
− 2 sinh(γ(T − t))
σ
b2 − 4γ 2
= 2c1 e− 2 b(T −t)
1
sinh(γ(T − t))
2σ 2
= −2c1 e− 2 b(T −t) sinh(γ(T − t)).
1

This implies
2φ′ (t) sinh(γ(T − t))
C(t, T ) = − = .
σ 2 φ(t) γ cosh(γ(T − t)) + 12 b sinh(γ(T − t))

(vi)
Proof. By (6.5.15) and (6.9.8), A′(t, T) = 2aφ′(t)/(σ²φ(t)). Hence
A(T, T) − A(t, T) = ∫_t^T (2aφ′(s)/(σ²φ(s)))ds = (2a/σ²) ln(φ(T)/φ(t)),
and
A(t, T) = −(2a/σ²) ln(φ(T)/φ(t)) = −(2a/σ²) ln[γe^{½b(T−t)}/(γ cosh(γ(T−t)) + ½b sinh(γ(T−t)))].
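With C(t, T) and A(t, T) in closed form, the CIR zero-coupon bond price can be evaluated as f(t, R(t), T) = e^{−R(t)C(t,T) − A(t,T)} (the exponential-affine form of Section 6.5). A small sketch that also verifies the Riccati equation C_τ = 1 − bC − ½σ²C (in τ = T − t) by finite differences (added for illustration, not part of the original solution; all parameter values are assumptions):

```python
import numpy as np

a, b, sigma = 0.10, 0.8, 0.3                    # assumed CIR parameters
gamma = 0.5 * np.sqrt(b**2 + 2 * sigma**2)

def C(tau):
    return np.sinh(gamma * tau) / (gamma * np.cosh(gamma * tau) + 0.5 * b * np.sinh(gamma * tau))

def A(tau):
    return -2 * a / sigma**2 * np.log(gamma * np.exp(0.5 * b * tau)
                                      / (gamma * np.cosh(gamma * tau) + 0.5 * b * np.sinh(gamma * tau)))

# check dC/dtau = 1 - b C - 0.5 sigma^2 C^2 by a central difference
tau, h = 1.7, 1e-6
lhs = (C(tau + h) - C(tau - h)) / (2 * h)
rhs = 1 - b * C(tau) - 0.5 * sigma**2 * C(tau)**2
print("Riccati residual:", lhs - rhs)            # should be ~0

# zero-coupon bond price for a given short rate and maturity (assumed values)
R_t, T_minus_t = 0.04, 5.0
print("bond price:", np.exp(-R_t * C(T_minus_t) - A(T_minus_t)))
```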

6.5. (i)
Proof. Since g(t, X1 (t), X2 (t)) = E[h(X1 (T ), X2 (T ))|Ft ] and
e−rt f (t, X1 (t), X2 (t)) = E[e−rT h(X1 (T ), X2 (T ))|Ft ],
iterated conditioning argument shows g(t, X1 (t), X2 (t)) and e−rt f (t, X1 (t), X2 (t)) ar both martingales.
(ii) and (iii)
Proof. We note

dg(t, X1(t), X2(t))
= gt dt + gx1 dX1(t) + gx2 dX2(t) + ½gx1x1 dX1(t)dX1(t) + ½gx2x2 dX2(t)dX2(t) + gx1x2 dX1(t)dX2(t)
= [ gt + gx1 β1 + gx2 β2 + ½gx1x1 (γ11² + γ12² + 2ργ11γ12) + gx1x2 (γ11γ21 + ργ11γ22 + ργ12γ21 + γ12γ22)
    + ½gx2x2 (γ21² + γ22² + 2ργ21γ22) ] dt + martingale part.

So we must have

gt + gx1 β1 + gx2 β2 + ½gx1x1 (γ11² + γ12² + 2ργ11γ12) + gx1x2 (γ11γ21 + ργ11γ22 + ργ12γ21 + γ12γ22) + ½gx2x2 (γ21² + γ22² + 2ργ21γ22) = 0.
Taking ρ = 0 will give part (ii) as a special case. The PDE for f can be similarly obtained.
6.6. (i)
Proof. Multiplying e^{½bt} on both sides of (6.9.15), we get

d(e^{½bt} Xj(t)) = e^{½bt}[ ½b Xj(t) dt - ½b Xj(t) dt + ½σ dWj(t) ] = ½σ e^{½bt} dWj(t).

So e^{½bt} Xj(t) - Xj(0) = ½σ ∫_0^t e^{½bu} dWj(u) and Xj(t) = e^{-½bt}( Xj(0) + ½σ ∫_0^t e^{½bu} dWj(u) ). By Theorem 4.4.9, Xj(t) is normally distributed with mean Xj(0)e^{-½bt} and variance e^{-bt}(σ²/4) ∫_0^t e^{bu} du = (σ²/4b)(1 - e^{-bt}).

(ii)
Proof. Suppose R(t) = Σ_{j=1}^d Xj²(t). Then

dR(t) = Σ_{j=1}^d ( 2Xj(t) dXj(t) + dXj(t) dXj(t) )
      = Σ_{j=1}^d ( 2Xj(t) dXj(t) + ¼σ² dt )
      = Σ_{j=1}^d ( -bXj²(t) dt + σXj(t) dWj(t) + ¼σ² dt )
      = ( (d/4)σ² - bR(t) ) dt + σ√R(t) Σ_{j=1}^d (Xj(t)/√R(t)) dWj(t).

Let B(t) = Σ_{j=1}^d ∫_0^t (Xj(s)/√R(s)) dWj(s); then B is a local martingale with dB(t)dB(t) = Σ_{j=1}^d (Xj²(t)/R(t)) dt = dt. So by Lévy's Theorem, B is a Brownian motion. Therefore dR(t) = (a - bR(t))dt + σ√R(t) dB(t) with a := (d/4)σ², and R is a CIR interest rate process.

(iii)

Proof. By (6.9.16), Xj(t) depends on Wj only and is normally distributed with mean e^{-½bt}Xj(0) and variance (σ²/4b)[1 - e^{-bt}]. So X1(t), · · · , Xd(t) are i.i.d. normal with the same mean μ(t) and variance v(t).
(iv)

Proof. For u < 1/(2v(t)),

E[e^{uXj²(t)}] = ∫_{-∞}^{∞} e^{ux²} e^{-(x-μ(t))²/(2v(t))} / √(2πv(t)) dx
             = ∫_{-∞}^{∞} (1/√(2πv(t))) e^{-[(1-2uv(t))x² - 2μ(t)x + μ²(t)]/(2v(t))} dx
             = ∫_{-∞}^{∞} (1/√(2πv(t))) exp{ -[ (x - μ(t)/(1-2uv(t)))² + μ²(t)/(1-2uv(t)) - μ²(t)/(1-2uv(t))² ] / (2v(t)/(1-2uv(t))) } dx
             = e^{uμ²(t)/(1-2uv(t))} · (1/√(1-2uv(t))) ∫_{-∞}^{∞} (1/√(2πv(t)/(1-2uv(t)))) exp{ -(x - μ(t)/(1-2uv(t)))² / (2v(t)/(1-2uv(t))) } dx
             = e^{uμ²(t)/(1-2uv(t))} / √(1-2uv(t)).

(v)
Proof. By R(t) = Σ_{j=1}^d Xj²(t) and the fact that X1(t), · · · , Xd(t) are i.i.d.,

E[e^{uR(t)}] = (E[e^{uX1²(t)}])^d = (1 - 2uv(t))^{-d/2} e^{udμ²(t)/(1-2uv(t))} = (1 - 2uv(t))^{-2a/σ²} e^{u e^{-bt} R(0)/(1-2uv(t))}.

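The moment-generating function in (iv)-(v) is easy to check by simulation: by parts (i) and (iii), each Xj(t) is an independent N(μ(t), v(t)) random variable, so R(t) can be sampled directly. The sketch below uses illustrative parameters (assumptions, not from the textbook); u must satisfy 2uv(t) < 1 for the expectation to be finite.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (assumptions, not from the textbook).
d, b, sigma, t, u = 4, 0.8, 0.5, 1.0, 0.3
x0 = 0.2                      # common initial value X_j(0), so R(0) = d * x0**2
a = d * sigma**2 / 4

mu_t = x0 * np.exp(-0.5 * b * t)                    # mean of X_j(t), part (i)
v_t = sigma**2 / (4 * b) * (1 - np.exp(-b * t))     # variance of X_j(t), part (i)
assert 2 * u * v_t < 1

# Closed form from part (v).
mgf_exact = (1 - 2 * u * v_t) ** (-2 * a / sigma**2) * np.exp(
    u * np.exp(-b * t) * d * x0**2 / (1 - 2 * u * v_t))

# Monte Carlo: sample X_j(t) from its exact normal distribution.
X = rng.normal(mu_t, np.sqrt(v_t), size=(1_000_000, d))
mgf_mc = np.exp(u * (X**2).sum(axis=1)).mean()

print(f"closed form {mgf_exact:.5f}, Monte Carlo {mgf_mc:.5f}")
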
6.7. (i)
e −rT (ST − K)+ |Ft ] is a martingale by iterated conditioning argument. Since
Proof. e−rt c(t, St , Vt ) = E[e

d(e−rt c(t, St , Vt ))
[
−rt 1
= e c(t, St , Vt )(−r) + ct (t, St , Vt ) + cs (t, St , Vt )rSt + cv (t, St , Vt )(a − bVt ) + css (t, St , Vt )Vt St2 +
2
]
1
cvv (t, St , Vt )σ 2 Vt + csv (t, St , Vt )σVt St ρ dt + martingale part,
2

we conclude rc = ct + rscs + cv (a − bv) + 12 css vs2 + 12 cvv σ 2 v + csv σsvρ. This is equation (6.9.26).

(ii)

Proof. Suppose c(t, s, v) = sf (t, log s, v) − e−r(T −t) Kg(t, log s, v), then

ct = sft (t, log s, v) − re−r(T −t) Kg(t, log s, v) − e−r(T −t) Kgt (t, log s, v),
1 1
cs = f (t, log s, v) + sfs (t, log s, v) − e−r(T −t) Kgs (t, log s, v) ,
s s
cv = sfv (t, log s, v) − e−r(T −t) Kgv (t, log s, v),
1 1 1 1
css = fs (t, log s, v) + fss (t, log s, v) − e−r(T −t) Kgss (t, log s, v) 2 + e−r(T −t) Kgs (t, log s, v) 2 ,
s s s s
K
csv = fv (t, log s, v) + fsv (t, log s, v) − e−r(T −t) gsv (t, log s, v),
s
cvv = sfvv (t, log s, v) − e−r(T −t) Kgvv (t, log s, v).

So
1 1
ct + rscs + (a − bv)cv + s2 vcss + ρσsvcsv + σ 2 vcvv
2 2
= sft − re−r(T −t) Kg − e−r(T −t) Kgt + rsf + rsfs − rKe−r(T −t) gs + (a − bv)(sfv − e−r(T −t) Kgv )
[ ] ( )
1 1 1 K gs K
+ s2 v − fs + fss − e−r(T −t) 2 gss + e−r(T −t) K 2 + ρσsv fv + fsv − e−r(T −t) gsv
2 s s s s s
1 2
+ σ v(sfvv − e−r(T −t) Kgvv )
[2 ] [
1 1 1 1
= s ft + (r + v)fs + (a − bv + ρσv)fv + vfss + ρσvfsv + σ 2 vfvv − Ke−r(T −t) gt + (r − v)gs
2 2 2 2
]
1 1
+(a − bv)gv + vgss + ρσvgsv + σ 2 vgvv + rsf − re−r(T −t) Kg
2 2
= rc.

That is, c satisfies the PDE (6.9.26).

(iii)
Proof. First, by Markov property, f (t, Xt , Vt ) = E[1{XT ≥log K} |Ft ]. So f (T, Xt , Vt ) = 1{XT ≥log K} , which
implies f (T, x, v) = 1{x≥log K} for all x ∈ R, v ≥ 0. Second, f (t, Xt , Vt ) is a martingale, so by differentiating
f and setting the dt term as zero, we have the PDE (6.9.32) for f . Indeed,
[
1 1
df (t, Xt , Vt ) = ft (t, Xt , Vt ) + fx (t, Xt , Vt )(r + Vt ) + fv (t, Xt , Vt )(a − bvt + ρσVt ) + fxx (t, Xt , Vt )Vt
2 2
]
1
+ fvv (t, Xt , Vt )σ 2 Vt + fxv (t, Xt , Vt )σVt ρ dt + martingale part.
2

So we must have ft + (r + 12 v)fx + (a − bv + ρσv)fv + 21 fxx v + 12 fvv σ 2 v + σvρfxv = 0. This is (6.9.32).

(iv)
Proof. Similar to (iii).

(v)
Proof. c(T, s, v) = sf (T, log s, v) − e−r(T −t) Kg(T, log s, v) = s1{log s≥log K} − K1{log s≥log K} = 1{s≥K} (s −
K) = (s − K)+ .
6.8.

Proof. We follow the hint. Suppose h is smooth and compactly supported, then it is legitimate to exchange
integration and differentiation:
∫ ∞ ∫ ∞

gt (t, x) = h(y)p(t, T, x, y)dy = h(y)pt (t, T, x, y)dy,
∂t 0
∫ ∞ 0

gx (t, x) = h(y)px (t, T, x, y)dy,


∫ ∞
0

gxx (t, x) = h(y)pxx (t, T, x, y)dy.


0
[ ∫∞ ]
So (6.9.45) implies 0 h(y) pt (t, T, x, y) + β(t, x)px (t, T, x, y) + 12 γ 2 (t, x)pxx (t, T, x, y) dy = 0. By the ar-
bitrariness of h and assuming β, pt , px , v, pxx are all continuous, we have
1
pt (t, T, x, y) + β(t, x)px (t, T, x, y) + γ 2 (t, x)pxx (t, T, x, y) = 0.
2
This is (6.9.43).
6.9.
Proof. We first note
1
dhb (Xu ) = h′b (Xu )dXu + h′′b (Xu )dXu dXu
[ 2 ]
1 2
= hb (Xu )β(u, Xu ) + γ (u, Xu )hb (Xu ) du + h′b (Xu )γ(u, Xu )dWu .
′ ′′
2
Integrate on both sides of the equation, we have
∫ T[ ]
′ 1 2 ′′
hb (XT ) − hb (Xt ) = hb (Xu )β(u, Xu ) + γ (u, Xu )hb (Xu ) du + martingale part.
t 2
Take expectation on both sides, we get
∫ ∞
E t,x
[hb (XT ) − hb (Xt )] = hb (y)p(t, T, x, y)dy − h(x)
−∞
∫ T
1
= E t,x [h′b (Xu )β(u, Xu ) + γ 2 (u, Xu )h′′b (Xu )]du
t 2
∫ T ∫ ∞[ ]
1
= h′b (y)β(u, y) + γ 2 (u, y)h′′b (y) p(t, u, x, y)dydu.
t −∞ 2

Since hb vanishes outside (0, b), the integration range can be changed from (−∞, ∞) to (0, b), which gives
(6.9.48).
By integration-by-parts formula, we have
∫ b ∫ b

β(u, y)p(t, u, x, y)h′b (y)dy = hb (y)β(u, y)p(t, u, x, y)|b0 − hb (y) (β(u, y)p(t, u, x, y))dy
0 0 ∂y
∫ b

= − hb (y) (β(u, y)p(t, u, x, y))dy,
0 ∂y
and
∫ b ∫ b
∂ 2
γ 2 (u, y)p(t, u, x, y)h′′b (y)dy = − (γ (u, y)p(t, u, x, y))h′b (y)dy
0 0 ∂y
∫ b
∂2 2
= (γ (u, y)p(t, u, x, y))hb (y)dy.
0 ∂y

Plug these formulas into (6.9.48), we get (6.9.49).
Differentiate w.r.t. T on both sides of (6.9.49), we have
∫ b ∫ b ∫
∂ ∂ 1 b ∂2 2
hb (y) p(t, T, x, y)dy = − [β(T, y)p(t, T, x, y)]hb (y)dy + [γ (T, y)p(t, T, x, y)]hb (y)dy,
0 ∂T 0 ∂y 2 0 ∂y 2
that is,
∫ b [ ]
∂ ∂ 1 ∂2 2
hb (y) p(t, T, x, y) + (β(T, y)p(t, T, x, y)) − (γ (T, y)p(t, T, x, y)) dy = 0.
0 ∂T ∂y 2 ∂y 2
This is (6.9.50).
By (6.9.50) and the arbitrariness of hb , we conclude for any y ∈ (0, ∞),
∂ ∂ 1 ∂2 2
p(t, T, x, y) + (β(T, y)p(t, T, x, y)) − (γ (T, y)p(t, T, x, y)) = 0.
∂T ∂y 2 ∂y 2

6.10.
Proof. Under the assumption that limy→∞ (y − K)rye
p(0, T, x, y) = 0, we have
∫ ∞ ∫ ∞

− (y − K) (rye p(0, T, x, y))dy = −(y − K)rye p(0, T, x, y)|∞
K + rye
p(0, T, x, y)dy
∂y
K
∫ ∞ K

= ryep(0, T, x, y)dy.
K

If we further assume (6.9.57) and (6.9.58), then use integration-by-parts formula twice, we have

1 ∞ ∂2
(y − K) 2 (σ 2 (T, y)y 2 pe(0, T, x, y))dy
2 K ∂y
[ ∫ ∞ ]
1 ∂ ∂ 2
= (y − K) (σ 2 (T, y)y 2 pe(0, T, x, y))|∞
K − (σ (T, y)y 2
e
p(0, T, x, y))dy
2 ∂y K ∂y
1 2
= − (σ (T, y)y 2 pe(0, T, x, y)|∞ K)
2
1 2
= σ (T, K)K 2 pe(0, T, x, K).
2
Therefore,
∫ ∞
cT (0, T, x, K) = −rc(0, T, x, K) + e−rT (y − K)e pT (0, T, x, y)dy
∫ ∞ K
∫ ∞
−rT −rT
= −re (y − K)ep(0, T, x, y)dy + e (y − K)e pT (0, T, x, y)dy
∫K∞ ∫K∞

= −re−rT (y − K)ep(0, T, x, y)dy − e−rT (y − K) (rye p(t, T, x, y))dy
K K ∂y
∫ ∞
1 ∂2 2
+e−rT (y − K) 2
(σ (T, y)y 2 pe(t, T, x, y))dy
2 ∂y
∫K ∞ ∫ ∞
−rT
= −re (y − K)ep(0, T, x, y)dy + e−rT ryep(0, T, x, y)dy
K K
1 2
−rT
+e σ (T, K)K 2 pe(0, T, x, K)
2∫

1
= re−rT K pe(0, T, x, y)dy + e−rT σ 2 (T, K)K 2 pe(0, T, x, K)
K 2
1 2
= −rKcK (0, T, x, K) + σ (T, K)K 2 cKK (0, T, x, K).
2

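The identity just derived is Dupire's equation. As a sanity check, one can take a constant-volatility Black-Scholes world, where c(0, T, x, K) is known in closed form, and verify cT = -rK cK + ½σ²K² cKK by finite differences; the sketch below does this with arbitrary illustrative parameters.

from math import log, sqrt, exp
from statistics import NormalDist

N = NormalDist().cdf

def bs_call(x, K, T, r, sigma):
    """Black-Scholes call price c(0, T, x, K) with constant volatility."""
    d1 = (log(x / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return x * N(d1) - K * exp(-r * T) * N(d2)

# Arbitrary parameters for the check.
x, K, T, r, sigma = 100.0, 95.0, 1.5, 0.03, 0.25
h = 1e-3

cT = (bs_call(x, K, T + h, r, sigma) - bs_call(x, K, T - h, r, sigma)) / (2 * h)
cK = (bs_call(x, K + h, T, r, sigma) - bs_call(x, K - h, T, r, sigma)) / (2 * h)
cKK = (bs_call(x, K + h, T, r, sigma) - 2 * bs_call(x, K, T, r, sigma)
       + bs_call(x, K - h, T, r, sigma)) / h**2

print(f"c_T = {cT:.6f}")
print(f"-rK c_K + 0.5 sigma^2 K^2 c_KK = {-r * K * cK + 0.5 * sigma**2 * K**2 * cKK:.6f}")
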
7 Exotic Options
⋆ Comments:
On the PDE approach to pricing knock-out barrier options. We give some clarification to the explanation
below Theorem 7.3.1 (page 301-302), where the key is that V (t) and v(t, x) differ by an indicator function
and all the hassles come from this difference.
More precisely, we define the first passage time ρ by following the notation of the textbook:

ρ = inf{t > 0 : S(t) = B}.

Then risk-neutral pricing gives the time-t price of the knock-out barrier option as
[ ]
e e−r(T −t) V (T ) F(t)
V (t) = E

where V (T ) = (S(T ) − K)+ 1{ρ≥T } = (S(T ) − K)+ 1{ρ>T } , since the event {ρ = T } ⊂ {S(T ) = B} has zero
probability. [ ]
Therefore, the discounted value process e−rt V (t) = E e e−rT V (T ) F(t) is a martingale under the risk-
e but we cannot say V (t) is solely a function of t and S(t), since it depends on the path
neutral measure P,
property before t as well. To see this analytically, we note

{ρ > T } = {ρ > t, ρ ◦ θt > T − t},

where θt is the shift operator used for sample paths (i.e. θt (ω· ) = ωt+· . See Øksendal [9, page 119] for
details). Then
[ ] [ ]
e e−r(T −t) V (T ) F(t) = E
V (t) = E e e−r(T −t) (S(T ) − K)+ 1{ρ◦θ >T −t} F(t) 1{ρ>t} .
t

Note ρ ◦ θt is solely dependent on the behavior of sample paths between time t and T . Therefore, we can
apply Markov property
e S(t) [(S(T − t) − K)+ 1{ρ>T −t} ]1{ρ>t}
V (t) = e−r(T −t) E

Define v(t, x) = e−r(T −t) E e x [(S(T − t) − K)+ 1{ρ>T −t} ], then V (t) = v(t, x)1{ρ>t} and v(t, x) satisfies the
conditions (7.3.4)-(7.3.7) listed in Theorem 7.3.1. Indeed, it is easy to see v(t, x) satisfies all the boundary
conditions (7.3.5)-(7.3.7), by the arguments in the paragraph immediately after Theorem 7.3.1. For the
Black-Scholes-Merton PDE (7.3.4), we note

d[e−rt V (t)] = d[e−rt v(t, S(t))1{ρ>t} ] = 1{ρ>t} d[e−rt v(t, S(t))] + e−rt v(t, S(t))d1{ρ>t} .

Following the computation in equation (7.3.13), We can see the Black-Scholes-Merton equation (7.3.4) must
hold for {(t, x) : 0 ≤ t < T, 0 ≤ x < B}.

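For a concrete illustration of the risk-neutral pricing formula above, here is a minimal Monte Carlo sketch (parameters are illustrative assumptions, not from the textbook) for the time-zero price of an up-and-out call, V(0) = Ẽ[e^{-rT}(S(T) - K)^+ 1_{max_{0≤t≤T} S(t) < B}]. Monitoring the barrier only on a discrete grid biases the price upward, so a fine grid is used; this is a numerical companion to, not a substitute for, the PDE characterization in Theorem 7.3.1.

import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameters (assumptions, not from the textbook).
S0, K, B, r, sigma, T = 100.0, 100.0, 130.0, 0.05, 0.2, 1.0
n_steps, n_paths = 2_000, 100_000
dt = T / n_steps

S = np.full(n_paths, S0)
S_max = np.full(n_paths, S0)
for _ in range(n_steps):
    Z = rng.standard_normal(n_paths)
    S *= np.exp((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * Z)
    S_max = np.maximum(S_max, S)        # running maximum for barrier monitoring

payoff = np.where(S_max < B, np.maximum(S - K, 0.0), 0.0)   # knocked out if max >= B
print("up-and-out call price ~", np.exp(-r * T) * payoff.mean())
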
7.1. (i)
log s − 12 r± 21 σ 2 √
Proof. Since δ± (τ, s) = 1

σ τ
[log s + (r ± 12 σ 2 )τ ] = σ τ + σ τ,

∂ log s 1 − 3 ∂τ r ± 21 σ 2 1 − 1 ∂τ
δ± (τ, s) = (− )τ 2 + τ 2
∂t σ 2 ∂t σ 2 ∂t
[ ]
1 log s 1 r ± 2σ1 2

= − √ (−1) − τ (−1)
2τ σ τ σ
[ ]
1 1 1
= − · √ − log ss + (r ± σ 2 )τ )
2τ σ τ 2
1 1
= − δ± (τ, ).
2τ s

(ii)

Proof. ( [ ])
∂ x ∂ 1 x 1 1
δ± (τ, ) = √ log + (r ± σ 2 )τ = √ ,
∂x c ∂x σ τ c 2 xσ τ
( [ ])
∂ c ∂ 1 c 1 1
δ± (τ, ) = √ log + (r ± σ 2 )τ =− √ .
∂x x ∂x σ τ x 2 xσ τ

(iii)
Proof.
(log s+rτ )2 ±σ 2 τ (log s+rτ )+ 1 σ 4 τ 2
1 δ± (τ,s) 1
N ′ (δ± (τ, s)) = √ e− 2 = √ e−
4
2σ 2 τ .
2π 2π
Therefore
N ′ (δ+ (τ, s)) −
2σ 2 τ (log s+rτ ) e−rτ
= e 2σ 2 τ =
N ′ (δ− (τ, s)) s
and e−rτ N ′ (δ− (τ, s)) = sN ′ (δ+ (τ, s)).
(iv)

Proof.

N ′ (δ± (τ, s)) [(log s+rτ ) ]


2 −(log 1 +rτ )2 ±σ 2 τ (log s−log 1 )
s s 4rτ log s±2σ 2 τ log s
= e− = e− = e−( σ2 ±1) log s = s−( σ2 ±1) .
2r 2r
2σ 2 τ 2σ 2 τ
′ −1
N (δ± (τ, s ))

So N ′ (δ± (τ, s−1 )) = s( σ2 ±1) N ′ (δ± (τ, s)).


2r

(v)
[ ] [ ] √
Proof. δ+ (τ, s) − δ− (τ, s) = 1

σ τ
log s + (r + 12 σ 2 )τ − 1

σ τ
log s + (r − 12 σ 2 )τ = 1

σ τ
σ2 τ = σ τ.

(vi)
[ ] [ ]
Proof. δ± (τ, s) − δ± (τ, s−1 ) = 1

σ τ
log s + (r ± 12 σ 2 )τ − 1

σ τ
log s−1 + (r ± 12 σ 2 )τ = 2 log
√ s.
σ τ

(vii)
y 2 y 2 2
Proof. N ′ (y) = √1 e− 2

, so N ′′ (y) = √1 e− 2

(− y2 )′ = −yN ′ (y).

To be continued ...
7.3.
c c c cT − W
ct = (W
fT − W
ft ) + α(T − t) is independent of Ft ,
Proof. We note ST = S0 eσWT = St eσ(WT −Wt ) , W
c c
supt≤u≤T (Wu − Wt ) is independent of Ft , and
c
YT = S0 e σ M T
c c
= S0 eσ supt≤u≤T Wu 1{M
ct ≤sup ct }
W
+ S0 eσMt 1{M
ct >sup cu }
W
t≤u≤T t≤u≤T

cu −W
σ supt≤u≤T (W ct )
= St e 1 Y cu −W
σ supt≤u≤T (W c ) + Yt 1 Y cu −W
σ supt≤u≤T (W c ) .
{ St ≤e t } { St ≤e t }
t t

So E[f (ST , YT )|Ft ] = E[f (x STS0−t , x YTS−t


0
1{ y ≤ YT −t } + y1{ y ≤ YT −t } )], where x = St , y = Yt . Therefore
x S0 x S0
E[f (ST , YT )|Ft ] is a Borel function of (St , Yt ).

7.4.
Proof. By Cauchy’s inequality and the monotonicity of Y , we have

m ∑
m
| (Ytj − Ytj−1 )(Stj − Stj−1 )| ≤ |Ytj − Ytj−1 ||Stj − Stj−1 |
j=1 j=1
v v
u∑ u∑
um um
≤ t (Ytj − Ytj−1 ) t (Stj − Stj−1 )2
2

j=1 j=1
v
√ u∑
um
≤ max |Ytj − Ytj−1 |(YT − Y0 )t (Stj − Stj−1 )2 .
1≤j≤m
j=1

If we increase the number of partition points


√∑ to infinity and let√the length of the longest subinterval
m
max1≤j≤m |tj − tj−1 | approach zero, then j=1 (Stj − Stj−1 ) →
2 [S]T − [S]0 < ∞ and max1≤j≤m |Ytj −
∑m
Ytj−1 | → 0 a.s. by the continuity of Y . This implies j=1 (Ytj − Ytj−1 )(Stj − Stj−1 ) → 0.

8 American Derivative Securities


⋆ Comments:
∫t
Justification of Definition 8.3.1. For any given stopping time τ , let Ct = 0
(K − S(u))d1{τ ≤u} , then
∫ ∞
e−rτ (K − S(τ ))1{τ <∞} = e−rt dCt .
0

Regarding C as a cumulative cash flow, valuation of put option via (8.3.2) is justified by the risk-neutral
valuation of a cash flow via (5.6.10).

8.1.

Proof. vL′(L+) = (K - L)(-2r/σ²)(x/L)^{-2r/σ² - 1}(1/L)|_{x=L} = -(2r/(σ²L))(K - L). So vL′(L+) = vL′(L-) if and only if -(2r/(σ²L))(K - L) = -1. Solving for L, we get L = 2rK/(2r + σ²).

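A small numerical illustration of this boundary (parameters are arbitrary): the sketch below computes L = 2rK/(2r + σ²), builds the value function vL(x) = K - x for x ≤ L and (K - L)(x/L)^{-2r/σ²} for x > L, and checks the smooth-pasting condition vL′(L±) = -1 by finite differences.

import numpy as np

# Arbitrary parameters for illustration.
r, sigma, K = 0.04, 0.3, 100.0
L = 2 * r * K / (2 * r + sigma**2)

def v(x):
    """Perpetual American put value when exercised at the boundary L."""
    x = np.asarray(x, dtype=float)
    hold = (K - L) * (x / L) ** (-2 * r / sigma**2)
    return np.where(x <= L, K - x, hold)

h = 1e-6
dm = float(v(L) - v(L - h)) / h       # derivative from the stopping region
dp = float(v(L + h) - v(L)) / h       # derivative from the continuation region
print(f"L = {L:.4f}, v'(L-) = {dm:.6f}, v'(L+) = {dp:.6f}")
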
8.2.
Proof. By the calculation in Section 8.3.3, we can see v2 (x) ≥ (K2 − x)+ ≥ (K1 − x)+ , rv2 (x) − rxv2′ (x) −
1 2 2 ′′
2 σ x v2 (x) ≥ 0 for all x ≥ 0, and for 0 ≤ x < L1∗ < L2∗ ,

1
rv2 (x) − rxv2′ (x) − σ 2 x2 v2′′ (x) = rK2 > rK1 > 0.
2
So the linear complementarity conditions for v2 imply v2 (x) = (K2 − x)+ = K2 − x > K1 − x = (K1 − x)+
on [0, L1∗ ]. Hence v2 (x) does not satisfy the third linear complementarity condition for v1 : for each x ≥ 0,
equality holds in either (8.8.1) or (8.8.2) or both.
8.3. (i)
Proof. Suppose x takes its values in a domain bounded away from 0. By the general theory of linear
differential equations, if we can find two linearly independent solutions v1 (x), v2 (x) of (8.8.4), then any
solution of (8.8.4) can be represented in the form of C1 v1 +C2 v2 where C1 and C2 are constants. So it suffices
to find two linearly independent special solutions of (8.8.4). Assume v(x) = xp for some constant p to be
determined, (8.8.4) yields xp (r−pr− 21 σ 2 p(p−1)) = 0. Solve the quadratic equation 0 = r−pr− 12 σ 2 p(p−1) =
(− 21 σ 2 p − r)(p − 1), we get p = 1 or − σ2r2 . So a general solution of (8.8.4) has the form C1 x + C2 x− σ2 .
2r

(ii)
Proof. Assume there is an interval [x1 , x2 ] where 0 < x1 < x2 < ∞, such that v(x) ̸≡ 0 satisfies (8.3.19)
with equality on [x1 , x2 ] and satisfies (8.3.18) with equality for x at and immediately to the left of x1 and
for x at and immediately to the right of x2 , then we can find some C1 and C2 , so that v(x) = C1 x + C2 x− σ2
2r

on [x1 , x2 ]. If for some x0 ∈ [x1 , x2 ], v(x0 ) = v ′ (x0 ) = 0, by the uniqueness of the solution of (8.8.4), we
would conclude v ≡ 0. This is a contradiction. So such an x0 cannot exist. This implies 0 < x1 < x2 < K
(if K ≤ x2 , v(x2 ) = (K − x2 )+ = 0 and v ′ (x2 )=the right derivative of (K − x)+ at x2 , which is 0). 9 Thus
we have four equations for C1 and C2 :
 − σ2r2

 C x + C x = K − x1

 1 1 2 1

C x + C x− σ2r2 = K − x
1 2 2 2 2
 − σ2r2 −1

 C1 − σ2 C2 x1
2r
= −1


 − 2r
−1
C1 − σ2r2 C2 x2 σ 2
= −1.

Since x1 ̸= x2 , the last two equations imply C2 = 0. Plug C2 = 0 into the first two equations, we have
C1 = K−x = K−xx2 ; plug C2 = 0 into the last two equations, we have C1 = −1. Combined, we would have
1 2
x1
x1 = x2 . Contradiction. Therefore our initial assumption is incorrect, and the only solution v that satisfies
the specified conditions in the problem is the zero solution.
(iii)
Proof. If in a right neighborhood of 0, v satisfies (8.3.19) with equality, then part (i) implies v(x) = C1 x +
C2 x− σ2 for some constants C1 and C2 . Then v(0) = limx↓0 v(x) = 0 < (K − 0)+ , i.e. (8.3.18) will be
2r

violated. So we must have rv − rxv ′ − 21 σ 2 x2 v ′′ > 0 in a right neighborhood of 0. According to (8.3.20),


v(x) = (K − x)+ near 0. So v(0) = K. We have thus concluded simultaneously that v cannot satisfy (8.3.19)
with equality near 0 and v(0) = K, starting from first principles (8.3.18)-(8.3.20).
(iv)
Proof. This is already shown in our solution of part (iii): near 0, v cannot satisfy (8.3.19) with equality.
(v)
Proof. If v satisfy (K − x)+ with equality for all x ≥ 0, then v cannot have a continuous derivative as stated
in the problem. This is a contradiction.
(vi)

Proof. By the result of part (i), we can start with v(x) = (K − x)+ on [0, x1 ] and v(x) = C1 x + C2 x− σ2 on
2r

[x1 , ∞). By the assumption of the problem, both v and v ′ are continuous. Since (K −x)+ is not differentiable
at K, we must have x1 ≤ K. This gives us the equations

K − x = (K − x )+ = C x + C x− σ2r2
1 1 1 1 2 1
−1 = C − 2r C x− σ2 −1 .
2r

1 σ2 2 1

Because v is assumed to be bounded, we must have C1 = 0 and the above equations only have two unknowns:
C2 and x1 . Solve them for C2 and x1 , we are done.
8.4. (i)
Proof. This is already shown in part (i) of Exercise 8.3.
9 Note we have interpreted the condition “v(x) satisfies (8.3.18) with equality for x at and immediately to the right of x ”
2
as “v(x2 ) = (K − x2 )+ and v ′ (x2 ) =the right derivative of (K − x)+ at x2 .” This is weaker than “v(x) = (K − x) in a right
neighborhood of x2 .”

(ii)

Proof. We solve for A, B the equations


{
AL− σ2 + BL = K − L
2r

− σ2r2 AL− σ2 −1 + B = −1,


2r

2r
σ 2 KL σ2
and we obtain A = σ 2 +2r ,B= 2rK
L(σ 2 +2r) − 1.

(iii)

Proof. By (8.8.5), B > 0. So for x ≥ K, f (x) ≥ BK > 0 = (K − x)+ . If L ≤ x < K,


2r
σ 2 KL σ2 − 2r2 2rKx
f (x) − (K − x) +
= 2
x σ + −K
σ + 2r L(σ 2 + 2r)
2r
[ ]
x σ2r2 +1 x σ2r2
KL σ2 σ 2
+ 2r( L ) − (σ 2
+ 2r)( L )
= x− σ2
2r
.
(σ 2 + 2r)L
2r 2r 2r
Let g(θ) = σ 2 + 2rθ σ2 +1 − (σ 2 + 2r)θ σ2 with θ ≥ 1. Then g(1) = 0 and g ′ (θ) = 2r( σ2r2 + 1)θ σ2 − (σ 2 +
2r) σ2r2 θ σ2 −1 = σ2r2 (σ 2 + 2r)θ σ2 −1 (θ − 1) ≥ 0. So g(θ) ≥ 0 for any θ ≥ 1. This shows f (x) ≥ (K − x)+ for
2r 2r

L ≤ x < K. Combined, we get f (x) ≥ (K − x)+ for all x ≥ L.


(iv)

Proof. Since limx→∞ v(x) = limx→∞ f (x) = ∞ and limx→∞ vL∗ (x) = limx→∞ (K − L∗ )( Lx∗ )− σ2 = 0, v(x)
2r

and vL∗ (x) are different. By part (iii), v(x) ≥ (K − x)+ . So v satisfies (8.3.18). For x ≥ L, rv − rxv ′ −
1 2 2 ′′ 1 2 2 ′′ ′ 1 2 2 ′′
2 σ x v = rf − rxf − 2 σ x f = 0. For 0 ≤ x ≤ L, rv − rxv − 2 σ x v = r(K − x) + rx = rK. Combined,
′ 1 2 2 ′′
rv − rxv − 2 σ x v ≥ 0 for x ≥ 0. So v satisfies (8.3.19). Along the way, we also showed v satisfies (8.3.20).
In summary, v satisfies the linear complementarity condition (8.3.18)-(8.3.20), but v is not the function vL∗
given by (8.3.13).

(v)

In this case, v(x) = Ax− σ2 =


2r
Proof. By part (ii), B = 0 if and only if 2rK
L(σ 2 +2r) − 1 = 0, i.e. L = 2rK
2r+σ 2 .
σ2 K x − σ2r2 x − σ2 2r

σ 2 +2r ( L ) = (K − L)( L ) = vL∗ (x), on the interval [L, ∞).

e −(r−a)τL ],
8.5. The difficulty of the dividend-paying case is that from Lemma 8.3.4, we can only obtain E[e
e
not E[e −rτL
]. So we have to start from Theorem 8.3.2.
(i)
f 1 2
ft − 1 (r − a −
Proof. By (8.8.9), St = S0 eσWt +(r−a− 2 σ )t
. Assume S0 = x, then St = L if and only if −W σ
1 2 1 x
2 σ )t = σ log L . By Theorem 8.3.2,
[ √ ]
e −rτL ] = e− σ log L σ (r−a− 2 σ )+ σ2 (r−a− 2 σ ) +2r .
1 x 1 1 2 1 1 2 2
E[e

e −rτL ] as e−γ log Lx = ( x )−γ . So
If we set γ = σ12 (r − a − 12 σ 2 ) + σ1 σ12 (r − a − σ12 )2 + 2r, we can write E[e L
the risk-neutral expected discounted pay off of this strategy is
{
K − x, 0≤x≤L
vL (x) = x −γ
(K − L)( L ) , x > L.

(ii)
x −γ γ(K−L)
Proof. ∂
∂L vL (x) = −( L ) (1 − L ). Set ∂
∂L vL (x) = 0 and solve for L∗ , we have L∗ = γK
γ+1 .

(iii)

Proof. By Itô’s formula, we have


[ ]
[ −rt ] 1 ′′
d e vL∗ (St ) = e −rt
−rvL∗ (St ) + vL∗ (St )(r − a)St + vL∗ (St )σ St dt + e−rt vL
′ 2 2 ′ ft .
(St )σSt dW

2

If x > L∗ ,

′ 1 ′′
−rvL∗ (x) + vL ∗
(x)(r − a)x + vL (x)σ 2 x2
2 ∗
( )−γ
x x−γ−1 1 x−γ−2
= −r(K − L∗ ) + (r − a)x(K − L∗ )(−γ) −γ + σ 2 x2 (−γ)(−γ − 1)(K − L∗ ) −γ
L∗ L∗ 2 L∗
( )−γ [ ]
x 1 2
= (K − L∗ ) −r − (r − a)γ + σ γ(γ + 1) .
L∗ 2

By the definition of γ, if we define u = r − a − 21 σ 2 , we have

1
r + (r − a)γ − σ 2 γ(γ + 1)
2
1 2 2 1
= r − σ γ + γ(r − a − σ 2 )
2 2
( √ )2 ( √ )
1 2 u 1 u2 u 1 u2
= r− σ + + 2r + + + 2r u
2 σ2 σ σ2 σ2 σ σ2
( √ ( )) √
1 2 u2 2u u2 1 u2 u2 u u2
= r− σ + 3 + 2r + 2 + 2r + 2+ + 2r
2 σ4 σ σ2 σ σ2 σ σ σ2
√ ( ) √
u2 u u2 1 u2 u2 u u2
= r− 2 − + 2r − + 2r + 2 + + 2r
2σ σ σ2 2 σ2 σ σ σ2
= 0.
′ ′′
If x < L∗ , −rvL∗ (x) + vL ∗
(x)(r − a)x + 21 vL ∗
(x)σ 2 x2 = −r(K − x) + (−1)(r − a)x = −rK + ax. Combined,
we get [ ]
d e−rt vL∗ (St ) = −e−rt 1{St <L∗ } (rK − aSt )dt + e−rt vL ′ ft .
(St )σSt dW

Following the reasoning in the proof of Theorem 8.3.5, we only need to show 1{x<L∗ } (rK − ax) ≥ 0 to finish
the solution. This is further equivalent to proving rK − aL∗ ≥ 0. Plug L∗ = γ+1 γK
into the expression and

note γ ≥ σ1 σ12 (r − a − 12 σ 2 )2 + σ12 (r − a − 12 σ 2 ) ≥ 0, the inequality is further reduced to r(γ + 1) − aγ ≥ 0.
We prove this inequality as follows.
Assume for some K, r, a and σ (K and σ are assumed to be strictly positive, r and a are assumed to be
non-negative), rK − aL∗ < 0, then necessarily r < a, since L∗ = γ+1 γK
≤ K. As shown before, this means

r(γ + 1) − aγ < 0. Define θ = r−a σ , then θ < 0 and γ = σ
1
2 (r − a − 1 2
2 σ ) + 1
σ σ 2 (r − a − 2 σ ) + 2r =
1 1 2 2


1
σ (θ − 12 σ) + 1
σ (θ − 21 σ)2 + 2r. We have

r(γ + 1) − aγ < 0 ⇐⇒ (r − a)γ + r < 0


[ √ ]
1 1 1 1 2
⇐⇒ (r − a) (θ − σ) + (θ − σ) + 2r + r < 0
σ 2 σ 2

1 1
⇐⇒ θ(θ − σ) + θ (θ − σ)2 + 2r + r < 0
2 2

1 2 1
⇐⇒ θ (θ − σ) + 2r < −r − θ(θ − σ)(< 0)
2 2
1 2 1 1
⇐⇒ θ [(θ − σ) + 2r] > r + θ (θ − σ)2 + 2θr(θ − σ 2 )
2 2 2
2 2 2
⇐⇒ 0 > r2 − θrσ 2
⇐⇒ 0 > r − θσ 2 .

Since θσ 2 < 0, we have obtained a contradiction. So our initial assumption is incorrect, and rK − aL∗ ≥ 0
must be true.

(iv)
Proof. The proof is similar to that of Corollary 8.3.6. Note the only properties used in the proof of Corollary
8.3.6 are that e−rt vL∗ (St ) is a supermartingale, e−rt∧τL∗ vL∗ (St ∧τL∗ ) is a martingale, and vL∗ (x) ≥ (K −x)+ .
Part (iii) already proved the supermartingale-martingale property, so it suffices to show vL∗ (x) ≥ (K − x)+
in our problem. Indeed, by γ ≥ 0, L∗ = γ+1 γK
< K. For x ≥ K > L∗ , vL∗ (x) > 0 = (K − x)+ ; for 0 ≤ x < L∗ ,
vL∗ (x) = K − x = (K − x)+ ; finally, for L∗ ≤ x ≤ K,

d x−γ−1 L−γ−1
∗ γK 1
(vL∗ (x) − (K − x)) = −γ(K − L∗ ) −γ + 1 ≥ −γ(K − L∗ ) −γ + 1 = −γ(K − ) γK + 1 = 0.
dx L∗ L∗ γ + 1 γ+1

and (vL∗ (x) − (K − x))|x=L∗ = 0. So for L∗ ≤ x ≤ K, vL∗ (x) − (K − x)+ ≥ 0. Combined, we have
vL∗ (x) ≥ (K − x)+ ≥ 0 for all x ≥ 0.

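To put numbers on parts (ii)-(iv) (arbitrary illustrative parameters, with a the dividend rate): the sketch below computes γ, the optimal boundary L∗ = γK/(γ + 1), and the quantity rK - aL∗, which part (iii) shows must be nonnegative.

import numpy as np

# Arbitrary illustrative parameters; a is the dividend rate.
r, a, sigma, K = 0.05, 0.02, 0.25, 100.0

nu = r - a - 0.5 * sigma**2
gamma = nu / sigma**2 + np.sqrt(nu**2 / sigma**4 + 2 * r / sigma**2)
L_star = gamma * K / (gamma + 1)

print(f"gamma = {gamma:.4f}, L* = {L_star:.4f}, rK - a L* = {r * K - a * L_star:.4f}")
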
8.6.
Proof. By Lemma 8.5.1, Xt = e−rt (St − K)+ is a submartingale. For any τ ∈ Γ0,T , Theorem 8.8.1 implies

e −rT (ST − K)+ ] ≥ E[e


E[e e −rτ ∧T (Sτ ∧T − K)+ ] ≥ E[e−rτ (Sτ − K)+ 1{τ <∞} ] = E[e−rτ (Sτ − K)+ ],

e −rT (ST −
where we take the convention that e−rτ (Sτ −K)+ = 0 when τ = ∞. Since τ is arbitrarily chosen, E[e
+ e
K) ] ≥ maxτ ∈Γ0,T E[e −rτ
(Sτ − K) ]. The other direction “≤” is trivial since T ∈ Γ0,T .
+

8.7.
Proof. Suppose λ ∈ [0, 1] and 0 ≤ x1 ≤ x2 , we have f ((1 − λ)x1 + λx2 ) ≤ (1 − λ)f (x1 ) + λf (x2 ) ≤
(1 − λ)h(x1 ) + λh(x2 ). Similarly, g((1 − λ)x1 + λx2 ) ≤ (1 − λ)h(x1 ) + λh(x2 ). So

h((1 − λ)x1 + λx2 ) = max{f ((1 − λ)x1 + λx2 ), g((1 − λ)x1 + λx2 )} ≤ (1 − λ)h(x1 ) + λh(x2 ).

That is, h is also convex.

9 Change of Numéraire
⋆ Comments:
1) To provide an intuition for change of numéraire, we give a summary of results for change of numéraire
in discrete case. This summary is based on Shiryaev [13].
Consider a model of financial market (B, e B̄, S) as in Delbaen and Schachermayer [2] Definition 2.1.1 or
e
Shiryaev [13, page 383]. Here B and B̄ are both one-dimensional while S could be a vector price process.
Suppose B e and B̄ are both strictly positive, then both of them can be chosen as numéaire.
Several results hold under this model. First, no-arbitrage and completeness properties of market are
independent of the choice of numéraire (see, for example, Shiryaev [13, page 413, 481]).
Second, if the market is arbitrage-free, then e
( ) corresponding
( ) to B (resp. B̄), there is an equivalent proba-
e B̄ S e S e (resp. P̄).
bility measure P (resp. P̄), such that , (resp.
e
B e
B
B
, ) is a martingale under P
B̄ B̄
Third, if the market is both arbitrage-free and complete, we have the relation

B̄T 1 e
dP̄ = [ ] dP. (6)
e
BT E B̄e0
B 0

See Shiryaev [13, page 510], formula (12).


Finally, if fT is a European contingent claim with maturity N and the market is both arbitrage-free and
complete, then 10 [ ] [ ]
fT e e fT
B̄t Ē Ft = Bt E F .
B̄T eT t
B
That is, the risk-neutral price of fT is independent of the choice of numéraire. See Shiryaev [13], Chapter
VI, §1b.2.11
2) The above theoretical results can be applied to market involving foreign money market account. We
consider the following market: a domestic money market account M (M0 = 1), a foreign money market
account M f (M0f = 1), a (vector) asset price process S denominated in domestic currency called stock.
Suppose the domestic vs. foreign currency exchange rate is Q. Note Q is not a traded asset. Denominated
by domestic currency, the traded assets are (M, M f Q, S), where M f Q can(be seen as ) the price process
of one unit foreign currency. Domestic risk-neutral measure P e is such that M f
Q S e
is a P-martingale.
M , M
( )
S ef
Denominated by foreign currency, the traded assets are M f , M Q , Q . Foreign risk-neutral measure P is
( )
such that QM M S
f , QM f
ef -martingale. This is a change of numéraire in the market denominated by
is a P
domestic currency, from M to M f Q. If we assume the market is arbitrage-free and complete, the foreign
10 If the market is incomplete but the contingent claim is still replicable, this result still holds for t = 0. Indeed,
[ ]in Shiryaev
[13], Chapter V §1c.2, it is easy to generalize formula (12) to the case of N ≥ 1 with x∗ = supeP∈P(P) E e fN B0 , x∗ =
[ ] BN
e fN B0 , C ∗ (P) = inf{x ≥ 0 : ∃π ∈ SF, xπ = x, xπ ≥ fN }, and C∗ (P) = sup{x ≥ 0 : ∃π ∈ SF, xπ = x, xπ ≤ fN }.
infeP∈P(P) E BN 0 N 0 N
No-arbitrage implies C∗ ≤ C ∗ . Since
π ∑N ( )
XN Xπ Sk
= 0 + γk ∆ ,
BN B0 k=1
B k

we must have C∗ ≤ x∗ ≤ x∗ ≤ C ∗ . The replicability of fN implies C ∗ ≤ C∗ . So, if the market is incomplete but the contingent
claim is still replicable, we still have C∗ = x∗ = x∗ = C ∗ , i.e. the risk-neutral pricing formula still holds for t = 0.
11 The invariance of risk-neutral price gives us a mnemonics to memorize formula (6): take f = 1 B̄ with A ∈ F and set
T A T T
t = 0, we have [ ]
B̄0 P̄(A) = B e B̄T ; A .
e0 E
eT
B
e=
So the Radon-Nikodým derivative is dP̄/dP
B̄T /B̄0
BeT /Be0 . This leads to formula (9.2.6) in Shreve [14, page 378]. The textbook’s
chapter summary also provides a good way to memorize: the Radon-Nikodým derivative “is the numéraire itself, discounted in
order to be a martingale and normalized by its initial condition in order to have expected value 1”.

risk-neutral measure is given by

ef = QT MTf e QT DT MTf e
dP [ ] dP = dP
Q0 M0f Q0
MT E M 0

on FT . For a European contingent claim fT [denominated


] in domestic currency, its payoff in foreign currency
e DTf
fT
is fT /QT . Therefore its foreign price is E Df Q Ft . Convert this price into domestic currency, we have
f
[ f ] t T

e DT fT e e on FT and the Bayes formula, we get


Qt E Df Q Ft . Use the relation between P and P
f f
t T

[ ] [ ]
ef DTf fT e DT fT
Qt E Ft = E Ft .
Dtf QT Dt

The RHS is exactly the price of fT in domestic market if we apply risk-neutral pricing.
3) Alternative proof of Theorem 9.4.2 (Black-Scholes-Merton option pricing with random interest rate).
e T [V (T )|F(t)], where V (T ) = (S(T ) − K)+ with
By (9.4.7), V (t) = B(t, T )E
{ }
f f 1 2
S(T ) = ForS (T, T ) = ForS (t, T ) exp σ[W (T ) − W (t)] − σ (T − t) .
T T
2

e T [V (T )|F(t)] is like the Black-Scholes-Merton model for stock options, where interest
Then the valuation of E
rate is 0 and the underlying has price ForS (t, T ) at time t. So its value is

e T [V (T )|F(t)] = ForS (t, T )N (d+ (t)) − KN (d− (t))


E
[ ]
where d± (t) = σ√1T −t log ForSK(t,T ) ± 12 σ 2 (T − t) . Therefore

V (t) = B(t, T )[ForS (t, T )N (d+ ) − KN (d− )] = S(t)N (d+ (t)) − KB(t, T )N (d− (t)).

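A minimal sketch of this pricing formula (all inputs below are illustrative assumptions): the only market data needed at time t are S(t), the zero-coupon bond price B(t, T), the strike K, and the volatility σ of the forward price ForS(t, T) = S(t)/B(t, T).

from math import log, sqrt
from statistics import NormalDist

N = NormalDist().cdf

def call_with_random_rate(S_t, B_tT, K, sigma, tau):
    """V(t) = S(t) N(d+) - K B(t,T) N(d-), using the forward price S_t / B(t,T);
    sigma is the volatility of the forward price and tau = T - t."""
    forward = S_t / B_tT
    d_plus = (log(forward / K) + 0.5 * sigma**2 * tau) / (sigma * sqrt(tau))
    d_minus = d_plus - sigma * sqrt(tau)
    return S_t * N(d_plus) - K * B_tT * N(d_minus)

# Illustrative inputs (assumptions only).
print(call_with_random_rate(S_t=100.0, B_tT=0.94, K=105.0, sigma=0.2, tau=2.0))
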
9.1. (i)
Proof. For any 0 ≤ t ≤ T , by Lemma 5.5.2,
[ ] [ ]
(M2 ) M1 (T )
M2 (T ) M1 (T ) E[M1 (T )|Ft ] M1 (t)
E Ft = E Ft = = .
M2 (T ) M2 (t) M2 (T ) M2 (t) M2 (t)
M1 (t)
So M2 (t) is a martingale under P M2 .

(ii)

Proof. Let M1 (t) = Dt St and M2 (t) = Dt Nt /N0 . Then Pe(N ) as defined in (9.2.6) is P (M2 ) as defined in
Remark 9.2.5. Hence M 1 (t) St e(N ) , which implies St(N ) = St is a martingale
M2 (t) = Nt N0 is a martingale under P Nt
under Pe(N ) .
9.2. (i)
f
Proof. Since Nt−1 = N0−1 e−ν Wt −(r− 2 ν
1 2
)t
, we have

f
d(Nt−1 ) = N0−1 e−ν Wt −(r− 2 ν
1 2
)t ft − (r − 1 ν 2 )dt + 1 ν 2 dt] = Nt−1 (−νdW
[−νdW ct − rdt).
2 2

(ii)

Proof.
( ) ( )
ct = Mt d 1 1 1 ct (−νdW
ct − rdt) + rM
ct dt = −ν M
ct dW
ct .
dM + dMt + d dMt = M
Nt Nt Nt
Remark: This can also be obtained directly from Theorem 9.2.2.
(iii)
Proof.
( ) ) ( ( )
bt Xt 1 1 1
dX = d = Xt d + dXt + d dXt
Nt Nt Nt Nt
( ) ( )
1 1 1
= (∆t St + Γt Mt )d + (∆t dSt + Γt dMt ) + d (∆t dSt + Γt dMt )
Nt Nt Nt
[ ( ) ( ) ] [ ( ) ( ) ]
1 1 1 1 1 1
= ∆ t St d + dSt + d dSt + Γt Mt d + dMt + d dMt
Nt Nt Nt Nt Nt Nt
= ∆t dSbt + Γt dM
ct .

9.3. To avoid singular cases, we need to assume −1 < ρ < 1.


(i)
f 1 2
Proof. Nt = N0 eν W3 (t)+(r− 2 ν )t
. So
f
dNt−1 = d(N0−1 e−ν W3 (t)−(r− 2 ν
1 2
)t
)
[ ]
f f 1 2 1 2
N0−1 e−ν W3 (t)−(r− 2 ν )t
1 2
= −νdW3 (t) − (r − ν )dt + ν dt
2 2
f3 (t) − (r − ν 2 )dt],
= Nt−1 [−νdW
and
= Nt−1 dSt + St dNt−1 + dSt dNt−1
(N )
dSt
f1 (t)) + St Nt−1 [−νdW
= Nt−1 (rSt dt + σSt dW f3 (t) − (r − ν 2 )dt]
= St
(N ) f1 (t)) + St(N ) [−νdW
(rdt + σdW f3 (t) − (r − ν 2 )dt] − σSt(N ) ρdt
(N ) (N )
= St (ν 2 − σρ)dt + St (σdW f1 (t) − νdWf3 (t)).

Define γ = f4 (t) =
σ 2 − 2ρσν + ν 2 and W σf
− νf f4 is a martingale with quadratic
then W
γ W1 (t) γ W3 (t),
variation
f4 ]t = σ2 σν ν2
[W 2
t − 2 2 ρt + 2 t = t.
γ γ r

f4 is a BM and therefore, St(N ) has volatility γ =
By Lévy’s Theorem, W σ 2 − 2ρσν + ν 2 .
(ii)
f2 (t) = √−ρ W
Proof. This problem is the same as Exercise 4.13, we define W f1 (t) + √ 1 f3 (t), then W
W f2
2 1−ρ 1−ρ2
is a martingale, with
( )2 ( )
f ρ f 1 f ρ2 1 2ρ2
(dW2 (t)) = − √
2
dW1 (t) + √ dW3 (t) = + − dt = dt,
1 − ρ2 1 − ρ2 1 − ρ2 1 − ρ2 1 − ρ2

f2 (t)dW
and dW f1 (t) = − √ ρ dt + √ ρ 2 dt = 0. So W f2 is a BM independent of W
f1 , and dNt = rNt dt +
1−ρ21−ρ

f3 (t) = rNt dt + νNt [ρdW
νNt dW f2 (t)].
f1 (t) + 1 − ρ2 dW

(iii)

Proof. Under Pe, (W


f1 , W
f2 ) is a two-dimensional BM, and
 ( )

 f1 (t)
dW
 f
dSt = rSt dt + σSt dW1 (t) = rSt dt + St (σ, 0) · dW
 f (t) 2
( )

 √ f1 (t)
dW
 f
dNt = rNt dt + νNt dW3 (t) = rNt dt + Nt (νρ, ν 1 − ρ ) ·
 2 .
f2 (t)
dW

So under Pe, the volatility vector for S is (σ, 0), and the volatility vector for N is (νρ, ν √1 − ρ2 ). By Theorem
9.2.2, under the measure Pe(N ) , the volatility vector for S (N ) is (v1 , v2 ) = (σ − νρ, −ν 1 − ρ2 . In particular,
the volatility of S (N ) is
√ √ √ √
v12 + v22 = (σ − νρ)2 + (−ν 1 − ρ2 )2 = σ 2 − 2νρσ + ν 2 ,

consistent with the result of part (i).


9.4.
∫t ∫t
f3 (s)+ (Rs − 21 σ22 (s))ds
Proof. From (9.3.15), we have Mtf Qt = M0f Q0 e 0
σ2 (s)dW 0 . So
Dtf ∫ ∫
f3 (s)− t (Rs − 1 σ 2 (s))ds
= D0f Q−1
0 e
− 0t σ2 (s)dW 0 2 2
Qt
and
( )
Dtf Dtf f
f3 (t) − (Rt − 1 σ 2 (t))dt + 1 σ 2 (t)dt] = Dt [−σ2 (t)dW
f3 (t) − (Rt − σ 2 (t))dt].
d = [−σ2 (t)dW 2 2 2
Qt Qt 2 2 Qt

To get (9.3.22), we note


( ) ( ) ( )
Mt Dtf Dtf Dtf Dtf
d = Mt d + dMt + dMt d
Qt Qt Qt Qt
Mt Dtf f
f3 (t) − (Rt − σ22 (t))dt] + Rt Mt Dt dt
= [−σ2 (t)dW
Qt Qt
Mt Dtf f3 (t) − σ22 (t)dt)
= − (σ2 (t)dW
Qt
Mt Dtf f f (t).
= − σ2 (t)dW 3
Qt
To get (9.3.23), we note
( ) ( ) ( )
Dtf St Dtf Dtf Dtf
d = dSt + St d + dSt d
Qt Qt Qt Qt
Dtf f
f1 (t)) + St Dt [−σ2 (t)dW
f3 (t) − (Rt − σ22 (t))dt]
= St (Rt dt + σ1 (t)dW
Qt Qt
f1 (t) Dtf f3 (t)
+St σ1 (t)dW (−σ2 (t))dW
Qt
Dtf St f1 (t) − σ2 (t)dW
f3 (t) + σ22 (t)dt − σ1 (t)σ2 (t)ρt dt]
= [σ1 (t)dW
Qt
Dtf St f f (t) − σ2 dW
f f (t)].
= [σ1 (t)dW 1 3
Qt

9.5.

Proof. We combine the solutions of all the sub-problems into a single solution as follows. The payoff of a
ST
quanto call is ( QT
− K)+ units of domestic currency at time T . By risk-neutral pricing formula, its price at
e −r(T −t) ( ST − K)+ |Ft ]. So we need to find the SDE for St under risk-neutral measure Pe. By
time t is E[e QT Qt
f 1 2
formula (9.3.14) and (9.3.16), we have St = S0 eσ1 W1 (t)+(r− 2 σ1 )t and
√ 2
f f f
Qt = Q0 eσ2 W3 (t)+(r−r − 2 σ2 )t = Q0 eσ2 ρW1 (t)+σ2 1−ρ W2 (t)+(r−r − 2 σ2 )t .
f 1 2 f 1 2


St f1 (t)−σ2 1−ρ2 W
S0 (σ1 −σ2 ρ)W f2 (t)+(r f + 1 σ 2 − 1 σ 2 )t
So Q t
=Q 0
e 2 2 2 1 . Define

√ √ √
f4 (t) = σ1 − σ2 ρ f1 (t) − σ2 1 − ρ2 f
σ4 = (σ1 − σ2 ρ)2 + σ22 (1 − ρ2 ) = σ12 − 2ρσ1 σ2 + σ22 and W W W2 (t).
σ4 σ4

f4 is a martingale with [W
f4 ]t = (σ1 −σ2 ρ)2 f4 is a Brownian motion under Pe. So
2
Then W σ42
t + σ2 (1−ρ
σ42
)
t + t. So W
if we set a = r − rf + ρσ1 σ2 − σ22 , we have
( )
St S0 σ4 W
f4 (t)+(r−a− 1 σ 2 )t St St f4 (t) + (r − a)dt].
= e 2 4 and d = [σ4 dW
Qt Q0 Qt Qt

Therefore, under Pe, QSt


t
behaves like dividend-paying stock and the price of the quanto call option is like the
price of a call option on a dividend-paying stock. Thus formula (5.5.12) gives us the desired price formula
for quanto call option.

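A short sketch of the resulting quanto-call price (illustrative inputs, not from the textbook): S_t/Q_t is treated as a stock with dividend rate a = r - r^f + ρσ1σ2 - σ2² and volatility σ4 = √(σ1² - 2ρσ1σ2 + σ2²), and then priced with the dividend-paying-stock call formula (5.5.12).

from math import log, sqrt, exp
from statistics import NormalDist

N = NormalDist().cdf

def quanto_call(S_over_Q, K, tau, r, rf, sigma1, sigma2, rho):
    """Time-t price of the quanto call paying (S_T/Q_T - K)^+ in domestic currency."""
    a = r - rf + rho * sigma1 * sigma2 - sigma2**2          # effective dividend rate
    sigma4 = sqrt(sigma1**2 - 2 * rho * sigma1 * sigma2 + sigma2**2)
    d_plus = (log(S_over_Q / K) + (r - a + 0.5 * sigma4**2) * tau) / (sigma4 * sqrt(tau))
    d_minus = d_plus - sigma4 * sqrt(tau)
    return S_over_Q * exp(-a * tau) * N(d_plus) - K * exp(-r * tau) * N(d_minus)

# Illustrative inputs (assumptions only).
print(quanto_call(S_over_Q=100.0, K=100.0, tau=1.0, r=0.03, rf=0.01,
                  sigma1=0.25, sigma2=0.12, rho=0.3))
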
9.6. (i)
√ √
Proof. d+ (t) − d− (t) = √1
σ T −t
σ 2 (T − t) = σ T − t. So d− (t) = d+ (t) − σ T − t.

(ii)

Proof. d+ (t) + d− (t) = √2


σ T −t
log ForSK(t,T ) . So

ForS (t, T )
d2+ (t) − d2− (t) = (d+ (t) + d− (t))(d+ (t) − d− (t)) = 2 log .
K

(iii)
Proof.

ForS (t, T )e−d+ (t)/2 − Ke−d− (t) = e−d+ (t)/2 [ForS (t, T ) − Ked+ (t)/2−d− (t)/2 ]
2 2 2 2 2

ForS (t,T )
= e−d+ (t)/2 [ForS (t, T ) − Kelog
2
K ]
= 0.

(iv)

Proof.

dd+ (t)
[ ]
1√ √ ForS (t, T ) 1 2 1 dForS (t, T ) (dForS (t, T ))2 1
= 1σ (T − t) [log
3 + σ (T − t)]dt + √ − − σdt
2 K 2 σ T − t ForS (t, T ) 2ForS (t, T )2 2
1 ForS (t, T ) σ 1 1 1
f T (t) − σ 2 dt − σ 2 dt)
= √ log dt + √ dt + √ (σdW
2σ (T − t)3 K 4 T −t σ T −t 2 2
1 ForS (t, T ) 3σ f T (t)
dW
= log dt − √ dt + √ .
2σ(T − t)3/2 K 4 T −t T −t

(v)

Proof. dd− (t) = dd+ (t) − d(σ T − t) = dd+ (t) + σdt

2 T −t
.

(vi)
dt
Proof. By (iv) and (v), (dd− (t))2 = (dd+ (t))2 = T −t .

(vii)
Proof.
1
dN (d+ (t)) = N ′ (d+ (t))dd+ (t) + N ′′ (d+ (t))(dd+ (t))2
2
1 − d2+ (t) 1 1 d2
+ (t) dt
= √ e 2 dd+ (t) + √ e− 2 (−d+ (t)) .
2π 2 2π T −t

(viii)

Proof.
1
dN (d− (t)) = N ′ (d− (t))dd− (t) + N ′′ (d− (t))(dd− (t))2
2
( ) d2
− (t)
1 − d− (t)2
σdt 1 e− 2 dt
= √ e 2 dd+ (t) + √ + √ (−d− (t))
2π 2 T −t 2 2π T −t

d2
− (t)(σ T −t−d+ (t))
σe−d− (t)/2
2
1 e− 2
√ e−d− (t)/2 dd+ (t) + √
2
= dt + √ dt
2π 2 2π(T − t) 2(T − t) 2π
d2
− (t)
σe−d− (t)/2
2
1 d+ (t)e− 2
√ e−d− (t)/2 dd+ (t) + √
2
= dt − √ dt.
2π 2π(T − t) 2(T − t) 2π

(ix)

Proof.

e−d+ (t)/2 −d (t)/2


2 2

f T (t) 1 c T (t) = σFor√


S (t, T )e +
dForS (t, T )dN (d+ (t)) = σForS (t, T )dW √ √ dW dt.
2π T −t 2π(T − t)

(x)

Proof.

ForS (t, T )dN (d+ (t)) + dForS (t, T )dN (d+ (t)) − KdN (d− (t))
[ ]
σForS (t, T )e−d+ (t)/2
2
1 d+ (t)
= ForS (t, T ) √ e−d+ (t)/2 dd+ (t) − √ e−d+ (t)/2 dt +
2 2
√ dt
2π 2(T − t) 2π 2π(T − t)
[ ]
e−d− (t)/2
2
σ −d2− (t)/2 d+ (t) −d2− (t)/2
−K √ dd+ (t) + √ e dt − √ e dt
2π 2π(T − t) 2(T − t) 2π
[ ]
ForS (t, T )d+ (t) −d2+ (t)/2 σForS (t, T )e−d+ (t)/2 Kσe−d− (t)/2
2 2
Kd+ (t) −d2− (t)/2
= √ e + √ − √ − √ e dt
2(T − t) 2π 2π(T − t) 2π(T − t) 2(T − t) 2π
1 ( )
ForS (t, T )e−d+ (t)/2 − Ke−d− (t)/2 dd+ (t)
2 2
+√

= 0.

The last “=” comes from (iii), which implies e−d− (t)/2 = ForSK(t,T ) e−d+ (t)/2 .
2 2

10 Term-Structure Models
⋆ Comments:
1) Computation of eΛt (Lemma 10.2.3). For a systematic treatment of the computation of matrix
exponential function eΛt , see 丁同仁等 [3, page 175] or Arnold et al. [1, page 221]. For the sake of Lemma
10.2.3, a direct computation to find (10.2.35) goes [ as follows.
] [ ]
λ1 0 0 0
i) The case of λ1 = λ2 . In this case, set Γ1 = and Γ2 = . Then Λ = Γ1 +Γ2 , Γn2 = 02×2
[ n ] 0 λ 2 λ21 0
λ 0
for n ≥ 2, and Γn1 = 1 . Note Γ1 and Γ2 commute, we therefore have
0 λn2
[ ]( [ ]) [ λ t ] [ ]
eλ 1 t 0 00 e 1 0 eλ 1 t 0
e Λt
=e Γ1 t
·e Γ2 t
= I2×2 + = = .
0 eλ2 t λ21 t 0 λ21 teλ2 t eλ2 t λ21 teλ1 t eλ 2 t

ii) The case of λ1 ̸= λ2 . In this case, Λ has two distinct eigenvalues λ1 and λ2 . So it can be diagonalized,
that is, we can find a matrix P such that
[ ]
−1 λ1 0
P ΛP = J = .
0 λ2

Solving the matrix equation P Λ = P J, we have


[ ]
1 0
P = λ21 ,
λ1 −λ2 1

and consequently [ ]
1 0
P −1 = .
− λ1λ−λ
21
2
1
Therefore

( ∞
) [ ] [ ]
∑ P J n P −1 n ∑ Jn n eλ1 t λ1 t
e Λt
= t =P· t · P −1 = P
0
P −1 = ( eλ t ) 0
.
n! n! 0 eλ 2 t λ21
λ1 −λ2 e 1 − eλ2 t eλ2 t
n=0 n=0

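Both cases of the computation above are easy to verify numerically; the sketch below (assuming scipy is available) compares the closed forms with scipy.linalg.expm for Λ = [[λ1, 0], [λ21, λ2]].

import numpy as np
from scipy.linalg import expm

def exp_Lambda_t(lam1, lam2, lam21, t):
    """Closed form of e^{Lambda t} for the lower-triangular Lambda of Lemma 10.2.3."""
    if np.isclose(lam1, lam2):
        off = lam21 * t * np.exp(lam1 * t)
    else:
        off = lam21 * (np.exp(lam1 * t) - np.exp(lam2 * t)) / (lam1 - lam2)
    return np.array([[np.exp(lam1 * t), 0.0], [off, np.exp(lam2 * t)]])

for lam1, lam2 in [(0.7, 0.3), (0.5, 0.5)]:      # distinct and repeated eigenvalues
    Lam = np.array([[lam1, 0.0], [0.4, lam2]])
    print(np.allclose(exp_Lambda_t(lam1, lam2, 0.4, 1.3), expm(1.3 * Lam)))
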
2) Intuition of Theorem 10.4.1 (Price of backset LIBOR). The intuition here is the same as that of a
discrete-time model: the payoff δL(T, T ) = (1 + δL(T, T )) − 1 at time T + δ is equivalent to an investment
of 1 at time T (which gives the payoff 1 + δL(T, T ) at time T + δ), subtracting a payoff of 1 at time T + δ.
The first part has time t price B(t, T ) while the second part has time t price B(t, T + δ).
Pricing under (T + δ)-forward measure: Using the (T + δ)-forward measure, the time-t no-arbitrage price
of a payoff δL(T, T ) at time (T + δ) is

e T +δ [δL(T, T )] = B(t, T + δ) · δL(t, T ),


B(t, T + δ)E
( )
B(t,T )
where we have used the observation that the forward rate process L(t, T ) = 1δ B(t,T +δ) − 1 (0 ≤ t ≤ T ) is
a martingale under the (T + δ)-forward measure.

I Exercise 10.1 (Statistics in the two-factor Vasicek model). According to Example 4.7.3, Y1 (t) and
Y2 (t) in (10.2.43)-(10.2.46) are Gaussian processes.
(i) Show that
e 1 (t) = e−λ1 t Y1 (0),
EY (10.7.1)
that when λ1 ̸= λ2 , then

e 2 (t) = λ21 ( −λ1 t )


EY e − e−λ2 t Y1 (0) + e−λ2 t Y2 (0), (10.7.2)
λ1 − λ2
and when λ1 = λ2 , then
e 2 (t) = −λ21 te−λ1 t Y1 (0) + e−λ1 t Y2 (0).
EY (10.7.3)
We can write
e 1 (t) = e−λ1 t I1 (t),
Y1 (t) − EY
when λ1 ̸= λ2 ,
e 2 (t) = λ21 ( −λ1 t )
Y2 (t) − EY e I1 (t) − e−λ2 t I2 (t) − e−λ2 t I3 (t),
λ1 − λ2
and when λ1 = λ2 ,
e 2 (t) = −λ21 te−λ1 t I1 (t) + λ21 e−λ1 t I4 (t) + e−λ1 t I3 (t),
Y2 (t) − EY

where the Itô integrals


∫ t ∫ t
I1 (t) = f1 (u),
eλ 1 u d W I2 (t) = f1 (u),
eλ2 u dW
0 0
∫ t ∫ t
I3 (t) = f2 (u),
eλ 2 u d W I4 (t) = f1 (u),
ueλ1 u dW
0 0

all have expectation zero under the risk-neutral measure P. e Consequently, we can determine the variances
of Y1 (t) and Y2 (t) and the covariance of Y1 (t) and Y2 (t) under the risk-neutral measure from the variances
and covariances of Ij (t) and Ik (t). For example, if λ1 = λ2 , then

Var(Y1 (t)) e 2 (t),


= e−2λ1 t EI 1
2 2 −2λ1 t e 2 e 2 (t) + e−2λ1 t EI
e 2 (t)
Var(Y2 (t)) = λ21 t e EI1 (t) + λ221 e−2λ1 t EI 4 3
−2λ1 t e −2λ1 t e
−2λ te
2
E[I1 (t)I4 (t)] − 2λ21 te
21 E[I1 (t)I3 (t)]
e 4 (t)I3 (t)],
+2λ21 e−2λ1 t E[I
Cov(Y1 (t), Y2 (t)) e 2 (t) + λ21 e−2λ1 t E[I
= −λ21 te−2λ1 t EI e 1 (t)I4 (t)] + e−2λ1 t E[I
e 1 (t)I3 (t)],
1

e
where the variances and covariance above are under the risk-neutral measure P.

Proof. Using the notation I1 (t), I2 (t), I3 (t) and I4 (t) introduced in the problem, we can write Y1 (t) and
Y2 (t) as
Y1 (t) = e−λ1 t Y1 (0) + e−λ1 t I1 (t)
and
Y2 (t)
{ [ ]
−λ1 t
λ21
λ1 −λ2 (e − e−λ2 t )Y1 (0) + e−λ2 t Y2 (0) + λ1λ−λ
21
e−λ1 t I1 (t) − e−λ2 t I2 (t) − e−λ2 t I3 (t), ̸ λ2 ;
λ1 =
= [ 2 ]
−λ21 te−λ1 t Y1 (0) + e−λ1 t Y2 (0) − λ21 te−λ1 t I1 (t) − e−λ1 t I4 (t) + e−λ1 t I3 (t), λ1 = λ2 .

Since all the Ik (t)’s (k = 1, · · · , 4) are normally distributed with zero mean, we can conclude
e 1 (t)] = e−λ1 t Y1 (0)
E[Y
and
{
−λ1 t
e 2 (t)] =
λ21
λ1 −λ2 (e − e−λ2 t )Y1 (0) + e−λ2 t Y2 (0), if λ1 ̸= λ2 ;
E[Y −λ1 t
−λ21 te Y1 (0) + e−λ1 t Y2 (0), if λ1 = λ2 .

(ii) Compute the five terms


e 2 (t), E[I
EI e 1 (t)I2 (t)], E[I
e 1 (t)I3 (t)], E[I
e 1 (t)I4 (t)], E[I
e 2 (t)].
1 4

The five other terms, which you are not being asked to compute, are
1 ( 2λ2 t )
EI22 (t) = e −1 ,
2λ2
E[I2 (t)I3 (t)] = 0,
t 1 ( )
E[I2 (t)I4 (t)] = e(λ1 +λ2 )t + 1 − e (λ1 +λ2 )t
,
λ1 + λ2 (λ1 + λ2 )2
1 ( 2λ2 t )
EI32 (t) = e −1 ,
λ2
E[I3 (t)I4 (t)] = 0.
Solution. The calculation relies on the following fact: if Xt and Yt are both martingales, then Xt Yt − [X, Y ]t
is also a martingale. In particular, E[X e t Yt ] = E{[X,
e Y ]t }. Thus
∫ t ∫ t
e 2 (t)] = e 2λ1 t
−1 e e(λ1 +λ2 )t − 1
E[I 1 e 2λ1 u
du = , E[I 1 (t)I 2 (t)] = e (λ1 +λ2 )u
du = ,
0 2λ1 0 λ1 + λ2
∫ t [ ]
e 1 (t)I3 (t)] = 0, E[I
e 1 (t)I4 (t)] = 1 e2λ1 t − 1
E[I ue2λ1 u du = te2λ1 t −
0 2λ1 2λ1
and
∫ t
e 2 (t)] = t2 e2λ1 t te2λ1 t e2λ1 t − 1
E[I 4 u2 e2λ1 u du = − 2 + .
0 2λ1 2λ1 4λ31

(iii) Some derivative securities involve time spread (i.e., they depend on the interest rate at two different
times). In such cases, we are interested in the joint statistics of the factor processes at different times.
These are still jointly normal and depend on the statistics of the Itô integral Ij at different times. Compute
e 1 (s)I2 (t)], where 0 ≤ s < t. (Hint: Fix s ≥ 0 and define
E[I
∫ t
J1 (t) = f1 (u),
eλ1 u 1{u≤s} dW
0

where 1{u≤s} is the function of u that is 1 if u ≤ s and 0 if u > s. Note that J1 (t) = I1 (s) when t ≥ s.)

Solution. Following the hint, we have

e 1 (s)I2 (t)] = E[J
e 1 (t)I2 (t)] =
t
e(λ1 +λ2 )s − 1
E[I e(λ1 +λ2 )u 1{u≤s} du = .
0 λ1 + λ2

I Exercise 10.2 (Ordinary differential equations for the mixed affine-yield model). In the mixed
model of Subsection 10.2.3, as in the two-factor Cox-Ingersoll-Ross model, zero-coupon bond prices have the
affine-yield form
f (t, y1 , y2 ) = e−y1 C1 (T −t)−y2 C2 (T −t)−A(T −t) ,
where C1 (0) = C2 (0) = A(0) = 0.
(i) Find the partial differential equation satisfied by f (t, y1 , y2 ).
[ ∫ ]
Solution. Assume B(t, T ) = E e e− tT Rs ds Ft = f (t, Y1 (t), Y2 (t)). Then

d(D(t)B(t, T )) = D(t)[−R(t)f (t, Y1 (t), Y2 (t))dt + df (t, Y1 (t), Y2 (t))].

By Itô’s formula,

df (t, Y1 (t), Y2 (t)) = [ft (t, Y1 (t), Y2 (t)) + fy1 (t, Y1 (t), Y2 (t))(µ − λ1 Y1 (t)) + fy2 (t, Y1 (t), Y2 (t))(−λ2 )Y2 (t)
1
+fy1 y2 (t, Y1 (t), Y2 (t))σ21 Y1 (t) + fy1 y1 (t, Y1 (t), Y2 (t))Y1 (t)
2 ]
1 2
+ fy2 y2 (t, Y1 (t), Y2 (t))(σ21 Y1 (t) + α + βY1 (t)) dt + martingale part.
2

Since D(t)B(t, T ) is a martingale, we must have


[ ]
∂ ∂ ∂
−(δ0 + δ1 y1 + δ2 y2 ) + + (µ − λ1 y1 ) − λ2 y2 f
∂t ∂y1 ∂y2
[ ( )]
1 ∂2 ∂2 2 ∂2
+ 2σ21 y1 + y1 2 + (σ21 y1 + α + βy1 ) 2 f
2 ∂y1 ∂y2 ∂y1 ∂y2
= 0.

(ii) Show that C1 , C2 , and A satisfy the system of ordinary differential equations12
1 1 2
C1′ = −λ1 C1 − C12 − σ21 C1 C2 − (σ21 + β)C22 + δ1 , (10.7.4)
2 2
C2′ = −λ2 C2 + δ2 , (10.7.5)
1
A′ = µC1 − αC22 + δ0 . (10.7.6)
2
12 The textbook has a typo in formula (10.7.4): the coefficient of C 2 should be − 1 (σ 2 + β) instead of −(1 + β). See
2 2 21
http://www.math.cmu.edu/∼shreve/ for details.

Proof. If we suppose f (t, y1 , y2 ) = e−y1 C1 (T −t)−y2 C2 (T −t)−A(T −t) , then
∂f
= [y1 C1′ (T − t) + y2 C2′ (T − t) + A′ (T − t)]f
∂t
∂f
= −C1 (T − t)f
∂y1
∂f
= −C2 (T − t)f
∂y2
∂2f
= C1 (T − t)C2 (T − t)f
∂y1 ∂y2
∂2f
= C12 (T − t)f
∂y12
∂2f
= C22 (T − t)f.
∂y22

So the PDE in part (i) becomes

−(δ0 + δ1 y1 + δ2 y2 ) + y1 C1′ + y2 C2′ + A′ − (µ − λ1 y1 )C1 + λ2 y2 C2


1[ ]
+ 2σ21 y1 C1 C2 + y1 C12 + (σ21 2
y1 + α + βy1 )C22
2
= 0.

Sorting out the LHS according to the independent variables y1 and y2 , we get

 ′
−δ1 + C1 + λ1 C1 + σ21 C1 C2 + 2 C1 + 2 (σ21 + β)C2 = 0
1 2 1 2 2


−δ2 + C2 + λ2 C2 = 0


−δ0 + A′ − µC1 + 12 αC22 = 0.

In other words, we can obtain the ODEs for C1 , C2 and A as follows



 ′
C1 = −λ1 C1 − σ21 C1 C2 − 2 C1 − 2 (σ21 + β)C2 + δ1
1 2 1 2 2

C2′ = −λ2 C2 + δ2

 ′
A = µC1 − 12 αC22 + δ0 .

I Exericse 10.3 (Calibration of the two-factor Vasicek model). Consider the canonical two-factor
Vasicek model (10.2.4), (10.2.5), but replace the interest rate equation (10.2.6) by

R(t) = δ0 (t) + δ1 Y1 (t) + δ2 Y2 (t), (10.7.7)

where δ1 and δ2 are constant but δ0 (t) is a nonrandom function of time. Assume that for each T there is a
zero-coupon bond maturing at time T . The price of this bond at time t ∈ [0, T ] is
[ ∫ ]
e e− tT R(u)du F(t) .
B(t, T ) = E

Because the pair of processes (Y1 (t), Y2 (t)) is Markov, there must exist some function f (t, T, y1 , y2 ) such
that B(t, T ) = f (t, T, Y1 (t), Y2 (t)). (We indicate the dependence of f on the maturity T because, unlike in
Subsection 10.2.1, here we shall consider more than one value of T .)
(i) The function f (t, T, y1 , y2 ) is of the affine-yield form

f (t, T, y1 , y2 ) = e−y1 C1 (t,T )−y2 C2 (t,T )−A(t,T ) . (10.7.8)


d d d
Holding T fixed, derive a system of ordinary differential equations for dt C1 (t, T ), dt C2 (t, T ), and dt A(t, T ).

Solution. We have d(Dt B(t, T )) = Dt [−Rt f (t, T, Y1 (t), Y2 (t))dt + df (t, T, Y1 (t), Y2 (t))] and
df (t, T, Y1 (t), Y2 (t))
= [ft (t, T, Y1 (t), Y2 (t)) + fy1 (t, T, Y1 (t), Y2 (t))(−λ1 Y1 (t)) + fy2 (t, T, Y1 (t), Y2 (t))(−λ21 Y1 (t) − λ2 Y2 (t))
1 1
+ fy1 y1 (t, T, Y1 (t), Y2 (t)) + fy2 y2 (t, T, Y1 (t), Y2 (t))]dt + martingale part.
2 2
Since Dt B(t, T ) is a martingale under risk-neutral measure, we have the following PDE:
[ ]
∂ ∂ ∂ 1 ∂2 1 ∂
−(δ0 (t) + δ1 y1 + δ2 y2 ) + − λ 1 y1 − (λ21 y1 + λ2 y2 ) + + f (t, T, y1 , y2 ) = 0.
∂t ∂y1 ∂y2 2 ∂y12 2 ∂y22

Suppose f (t, T, y1 , y2 ) = e−y1 C1 (t,T )−y2 C2 (t,T )−A(t,T ) , then


 [ ]

 ft (t, T, y1 , y2 ) = −y1 dt d
C1 (t, T ) − y2 dt d
C2 (t, T ) − d
dt A(t, T ) f (t, T, y1 , y2 ),



 f (t, T, y , y ) = −C (t, T )f (t, T, y 1 2 ),
, y


y1 1 2 1
f (t, T, y , y ) = −C (t, T )f (t, T, y , y ),
y2 1 2 2 1 2

 fy1 y2 (t, T, y1 , y2 ) = C1 (t, T )C2 (t, T )f (t, T, y1 , y2 ),



 2


f y1 y1 (t, T, y1 , y2 ) = C1 (t, T )f (t, T, y1 , y2 ),

fy2 y2 (t, T, y1 , y2 ) = C22 (t, T )f (t, T, y1 , y2 ).
So the PDE becomes
( )
d d d
−(δ0 (t) + δ1 y1 + δ2 y2 ) + −y1 C1 (t, T ) − y2 C2 (t, T ) − A(t, T ) + λ1 y1 C1 (t, T )
dt dt dt
1 1
+(λ21 y1 + λ2 y2 )C2 (t, T ) + C12 (t, T ) + C22 (t, T ) = 0.
2 2
Sorting out the terms according to independent variables y1 and y2 , we get


−δ0 (t) − dt A(t, T ) + 2 C1 (t, T ) + 2 C2 (t, T ) = 0
d 1 2 1 2

−δ1 − dtd
C1 (t, T ) + λ1 C1 (t, T ) + λ21 C2 (t, T ) = 0


−δ2 − dt C2 (t, T ) + λ2 C2 (t, T ) = 0.
d

That is


 dt C1 (t, T ) = λ1 C1 (t, T ) + λ21 C2 (t, T ) − δ1
d

dt C2 (t, T ) = λ2 C2 (t, T ) − δ2
d

d
dt A(t, T ) = 2 C1 (t, T ) + 2 C2 (t, T ) − δ0 (t).
1 2 1 2

(ii) Using the terminal conditions C1 (T, T ) = C2 (T, T ) = 0, solve the equations in (i) for C1 (t, T ) and
C2 (t, T ). (As in Subsection 10.2.1, the functions C1 and C2 depend on t and T only through the difference
τ = T − t; however, the function A discussed in part (iii) below depends on t and T separately.)
d −λ2 t
Solution. For C2 , we note dt [e C2 (t, T )] = −e−λ2 t δ2 from the ODE in (i). Integrate from t to T , we have
−λ2 t
∫ T −λ s ( )
0−e C2 (t, T ) = −δ2 t e 2 ds = λδ22 (e−λ2 T − e−λ2 t ). So C2 (t, T ) = λδ22 1 − e−λ2 (T −t) . For C1 , we note

d −λ1 t λ21 δ2 −λ1 t


(e C1 (t, T )) = (λ21 C2 (t, T ) − δ1 )e−λ1 t = (e − e−λ2 T +(λ2 −λ1 )t ) − δ1 e−λ1 t .
dt λ2
Integrate from t to T , we get
−e−λ1 t C1 (t, T )
{
δ2 −λ1 T λ21 δ2 −λ2 T e(λ2 −λ1 )T −e(λ2 −λ1 )t
− λλ21
2 λ1
(e − e−λ1 t ) − λ2 e λ2 −λ1 + λδ11 (e−λ1 T − e−λ1 T ) if λ1 ̸= λ2
= δ2 −λ1 T
− λλ21
2 λ1
(e − e−λ1 t ) − λ21 δ2 −λ2 T
λ2 e (T − t) + λδ11 (e−λ1 T − e−λ1 T ) if λ1 = λ2 .

So
{
λ21 δ2 −λ1 (T −t) λ21 δ2 e−λ1 (T −t) −e−λ2 (T −t)
λ2 λ1 (e − 1) + λ2 λ2 −λ1 − λδ11 (e−λ1 (T −t) − 1) if λ1 ̸= λ2
C1 (t, T ) = λ21 δ2 −λ1 (T −t) λ21 δ2 −λ2 T +λ1 t
.
λ2 λ1 (e − 1) + λ2 e (T − t) − λδ11 (e−λ1 (T −t) − 1) if λ1 = λ2 .

(iii) Using the terminal condition A(T, T ) = 0, write a formula for A(t, T ) as an integral involving C1 (u, T ),
C2 (u, T ), and δ0 (u). You do not need to evaluate this integral.
Solution. From the ODE d
dt A(t, T ) = 12 (C12 (t, T ) + C22 (t, T )) − δ0 (t), we get
∫ T [ ]
1
A(t, T ) = δ0 (s) − (C12 (s, T ) + C22 (s, T )) ds.
t 2

(iv) Assume that the model parameters λ1 > 0 λ2 > 0, λ21 , δ1 , and δ2 and the initial conditions Y1 (0) and
Y2 (0) are given. We wish to choose a function δ0 so that the zero-coupon bond prices given by the model
match the bond prices given by the market at the initial time zero. In other words, we want to choose a
function δ(T ), T ≥ 0, so that
f (0, T, Y1 (0), Y2 (0)) = B(0, T ), T ≥ 0.

In this part of the exercise, we regard both t and T as variables and use the notation ∂/∂t to indicate the derivative with respect to t when T is held fixed and the notation ∂/∂T to indicate the derivative with respect to T when t is held fixed. Give a formula for δ0(T) in terms of (∂/∂T) log B(0, T) and the model parameters. (Hint: Compute (∂/∂T)A(0, T) in two ways, using (10.7.8) and also using the formula obtained in (iii). Because Ci(t, T) depends only on t and T through τ = T − t, there are functions C̄i(τ) such that C̄i(τ) = C̄i(T − t) = Ci(t, T), i = 1, 2. Then

(∂/∂t) Ci(t, T) = −C̄i′(τ),    (∂/∂T) Ci(t, T) = C̄i′(τ),

where ′ denotes differentiation with respect to τ. This shows that

(∂/∂T) Ci(t, T) = −(∂/∂t) Ci(t, T),   i = 1, 2,    (10.7.9)
∂T ∂t
a fact that you will need.)
Solution. We want to find δ0 so that f (0, T, Y1 (0), Y2 (0)) = e−Y1 (0)C1 (0,T )−Y2 (0)C2 (0,T )−A(0,T ) = B(0, T ) for
all T > 0. Take logarithm on both sides and plug in the expression of A(t, T ), we get
∫ T [ ]
1 2
log B(0, T ) = −Y1 (0)C1 (0, T ) − Y2 (0)C2 (0, T ) + (C (s, T ) + C2 (s, T )) − δ0 (s) ds.
2
0 2 1

Taking derivative w.r.t. T , we have


∂ ∂ ∂ 1 1
log B(0, T ) = −Y1 (0) C1 (0, T ) − Y2 (0) C2 (0, T ) + C12 (T, T ) + C22 (T, T ) − δ0 (T ).
∂T ∂T ∂T 2 2
Therefore
∂ ∂ ∂
δ0 (T ) = −Y1 (0) C1 (0, T ) − Y2 (0) C2 (0, T ) − log B(0, T )
∂T[ ∂T ] ∂T
{
−Y1 (0) δ1 e−λ1 T − λ21 δ2 −λ2 T
λ2 e − Y2 (0)δ2 e−λ2 T − ∂T

log B(0, T ) if λ1 ̸= λ2
= [ −λ T −λ
] −λ
−Y1 (0) δ1 e 1 − λ21 δ2 e 2 T − Y2 (0)δ2 e 2 − ∂T log B(0, T )
T T ∂
if λ1 = λ2 .

I Exercise 10.4. Hull and White [89] propose the two-factor model

e2 (t),
dU (t) = −λ1 U (t)dt + σ1 dB (10.7.10)

e1 (t),
dR(t) = [θ(t) + U (t) − λ2 R(t)]dt + σ2 dB (10.7.11)
where λ1 , λ2 , σ1 , and σ2 are positive constants, θ(t) is a nonrandom function, and B e1 (t) and B
e2 (t) are
e e
correlated Brownian motions with dB1 (t)dB2 (t) = ρdt for some ρ ∈ (−1, 1). In this exercise, we discuss
how to reduce this to the two-factor Vasicek model of Subsection 10.2.1, except that, instead of (10.2.6), the
interest rate is given by (10.7.7), in which δ0 (t) is a nonrandom function of time.
(i) Define [ ] [ ] [ ]
U (t) λ1 0 σ1 0
X(t) = , K= , Σ=
R(t) −1 λ2 0 σ2
[ ] [ ]
0 e
Θ(t) = e = B1 (t) ,
, B(t)
θ(t) Be2 (t)

so that (10.7.10) and (10.7.11) can be written in vector notation as

e
dX(t) = Θ(t)dt − KX(t)dt + ΣdB(t). (10.7.12)

Now set ∫ t
b
X(t) = X(t) − e−Kt eKu Θ(u)du.
0

Show that
b
dX(t) b
= −K X(t)dt e
+ ΣdB(t). (10.7.13)

Proof.
∫ t
b
dX(t) = dX(t) + Ke−Kt eKu Θ(u)dudt − Θ(t)dt
0
∫ t
e + Ke−Kt
= −KX(t)dt + ΣdB(t) eKu Θ(u)dudt
0
b
= −K X(t)dt e
+ ΣdB(t).

Remark 6. The definition of X b is motivated by the observation that Y has homogeneous dt term in (10.2.4)-
( )
(10.2.5) and e is the integrating factor for X: d eKt X(t) = eKt (dX(t) − KX(t)dt).
Kt

(ii) With [ ]
1
σ1 0
C= − √ρ √1 ,
σ1 1−ρ2 σ2 1−ρ2

b
define Y (t) = C X(t), f (t) = CΣB(t).
W e f1 (t) and W
Show that the components of W f2 (t) are independent
Brownian motions and
f (t),
dY (t) = −ΛY (t)dt + dW (10.7.14)
where [ ]
λ1 0
−1
Λ = CKC 2 −λ1 )−σ1
= ρσ2 (λ√
2
λ2 .
σ2 1−ρ

Equation (10.7.14) is the vector form of the canonical two-factor Vasicek equations (10.2.4) and (10.2.5).

Proof.
( 1
)( ) ( )
0 σ1 0 e 1 0
f (t) = CΣB(t)
W e = σ1
B(t) = − √ ρ √1 e
B(t).
− √ρ 2 √1 0 σ2
σ1 1−ρ σ2 1−ρ2 1−ρ2 1−ρ2

f is a martingale with ⟨W
So W f 1 ⟩(t) = ⟨B
e 1 ⟩(t) = t,
⟨ ⟩
f 2 ⟩(t) = − √ ρ e1 + √ 1 2
e 2 (t) = ρ t + t ρ ρ2 + 1 − 2ρ2
⟨W B B − 2 ρt = t = t,
1 − ρ2 1 − ρ2 1 − ρ2 1 − ρ2 1 − ρ2 1 − ρ2

and ⟨ ⟩
f1, W
⟨W f 2 ⟩(t) = e1, − √ ρ
B e1 + √ 1
B e2
B (t) = − √
ρt
+√
ρt
= 0.
1−ρ2 1 − ρ2 1−ρ 2 1 − ρ2
f is a two-dimensional BM. Moreover, dYt = CdX
Therefore W bt = −CK X bt dt + CΣdB et = −CKC −1 Yt dt +
ft = −ΛYt dt + dW
dW ft , where
( )(  
1 ) √1 2 0
0 λ 0 1
Λ = CKC −1 = − √ρ
σ 1
√1 2
1
·  2 ρ
σ 1−ρ 
σ 1−ρ2 σ 1−ρ
−1 λ2 |C| √ 2 σ1
1 2 σ1 1−ρ 1

( λ1
)( )
0 σ1
=
σ1 √0
− √ 2 − √1 2
ρλ 1 √λ2 ρσ2 σ2 1 − ρ2
σ1 1−ρ σ2 1−ρ σ2 1−ρ2
( )
λ1 0
= ρσ2 (λ2 −λ1 )−σ1 .
√ λ2
σ2 1−ρ2

(iii) Obtain a formula for R(t) of the form (10.7.7). What are δ0 (t), δ1 , and δ2 ?

Solution.
∫ t ∫ t
X(t) b
= X(t) + e −Kt Ku −1
e Θ(u)du = C Yt + e −Kt
eKu Θ(u)du
0 0
( )( ) ∫ t
σ1 √ 0 Y1 (t) −Kt
= + e eKu Θ(u)du
ρσ2 σ2 1 − ρ2 Y2 (t) 0
( ) ∫ t
σ1 Y1√(t) −Kt
= + e eKu Θ(u)du.
ρσ2 Y1 (t) + σ2 1 − ρ2 Y2 (t) 0

So √
R(t) = X2 (t) = ρσ2 Y1 (t) + σ2 1 − ρ2 Y2 (t) + δ0 (t),
∫t
where δ0 (t) is the second coordinate of e−Kt 0 eKu Θ(u)du and can be derived explicitly by Lemma 10.2.3.
Note
∫ t ∫ t [ λ1 (u−t) ][ ]
−Kt Ku e 0 0
e e Θ(u)du = du
0 0 ∗ eλ2 (u−t) θ(u)
∫ t[ ]
0
= du
0 θ(u)eλ2 (u−t)
∫t √
Then δ0 (t) = e−λ2 t 0
eλ2 u θ(u)du, δ1 = ρσ2 , and δ2 = σ2 1 − ρ2 .

I Exercise 10.5 (Correlation between long rate and short rate in the one-factor Vasicek model).
The one-factor Vasicek model is the one-factor Hull-White model of Example 6.5.1 with constant parameters,
f (t),
dR(t) = (a − bR(t))dt + σdW (10.7.15)

where a, b, and σ are positive constants and W f (t) is a one-dimensional Brownian motion. In this model, the
price at time t ∈ [0, T ] of the zero-coupon bond maturing at time T is

B(t, T ) = e−C(t,T )R(t)−A(t,T ) ,

where C(t, T ) and A(t, T ) are given by (6.5.10) and (6.5.11):


∫ ∫s
T
1( )
C(t, T ) = e− t bdv ds = 1 − e−b(T −t) ,
t b
∫ T( )
1 2 2
A(t, T ) = aC(s, t) − σ C (s, T ) ds
t 2
2ab − σ 2 σ 2 − ab ( −b(T −t)
) σ2 ( −2b(T −t)
)
= (T − t) + 1 − e − 1 − e .
2b2 b3 4b3
In the spirit of the discussion of the short rate and the long rate in Subsection 10.2.1, we fix a positive
relative maturity τ̄ and define the long rate L(t) at time t by (10.2.30):
1
L(t) = − log B(t, t + τ̄ ).
τ̄
Show that changes in L(t) and R(t) are perfectly correlated (i.e., for any 0 ≤ t1 < t2 , the correlation
coefficient between L(t2 ) − L(t1 ) and R(t2 ) − R(t1 ) is one). This characteristic of one-factor models caused
the development of models with more than one factor.
Proof. We note C(t, T ) and A(t, T ) are dependent only on T − t. So C(t, t + τ̄ ) and A(t, t + τ̄ ) are constants
when τ̄ is fixed. So
d B(t, t + τ̄ )[−C(t, t + τ̄ )R′ (t) − A(t, t + τ̄ )]
L(t) = −
dt τ̄ B(t, t + τ̄ )
1
= [C(t, t + τ̄ )R′ (t) + A(t, t + τ̄ )]
τ̄
1
= [C(0, τ̄ )R′ (t) + A(0, τ̄ )].
τ̄
Integrating from t1 to t2 on both sides, we have L(t2 ) − L(t1 ) = τ̄1 C(0, τ̄ )[R(t2 ) − R(t1 )] + τ̄1 A(0, τ̄ )(t2 − t1 ).
Since L(t2 ) − L(t1 ) is a linear transformation of R(t2 ) − R(t1 ), their correlation is 1.

I Exercise 10.6 (Degenerate two-factor Vasicek model). In the discussion of short rates and long
rates in the two-factor Vasicek model of Subsection 10.2.1, we made the assumptions that δ2 ̸= 0 and
(λ1 − λ2 )δ1 + λ21 δ2 ̸= 0 (see Lemma 10.2.2). In this exercise, we show that if either of these conditions is
violated, the two-factor Vasicek model reduces to a one-factor model, for which long rates and short rates
are perfectly correlated (see Exercise 10.5).
(i) Show that if δ2 = 0 (and δ0 > 0, δ1 > 0), then the short rate R(t) given by the system of equations
(10.2.4)-(10.2.6) satisfies the one-dimensional stochastic differential equation

f1 (t).
dR(t) = (a − bR(t))dt + dW (10.7.16)

Define a and b in terms of the parameters in (10.2.4)-(10.2.6).

Proof. If δ2 = 0, then
( )
dR(t) = δ1 −λ1 Y1 (t)dt + dWf1 (t)
[( ) ]
δ0 R(t) f
= δ1 − λ1 dt + dW1 (t)
δ1 δ1
f1 (t).
= (δ0 λ1 − λ1 R(t))dt + δ1 dW

So a = δ0 λ1 and b = λ1 .
(ii) Show that if (λ1 − λ2 )δ1 + λ21 δ2 = 0 (and δ0 > 0, δ12 + δ22 ̸= 0), then the short rate R(t) given by the
system of equations (10.2.4)-(10.2.6) satisfies the one-dimensional stochastic differential equation
e
dR(t) = (a − bR(t))dt + σdB(t). (10.7.17)
e in terms
Define a and b in terms of the parameters in (10.2.4)-(10.2.6) and define the Brownian motion B(t)
f f
of the independent Brownian motions W1 (t) and W2 (t) in (10.2.4) and (10.2.5).
Proof.

dR(t) = δ1 dY1 (t) + δ2 dY2 (t)


= f1 (t) − δ2 λ21 Y1 (t)dt − δ2 λ2 Y2 (t)dt + δ2 dW
−δ1 λ1 Y1 (t)dt + λ1 dW f2 (t)
= f1 (t) + δ2 dW
−Y1 (t)(δ1 λ1 + δ2 λ21 )dt − δ2 λ2 Y2 (t)dt + δ1 dW f2 (t)
= −Y1 (t)λ2 δ1 dt − δ2 λ2 Y2 (t)dt + δ1 dWf1 (t) + δ2 dWf2 (t)
= −λ2 (Y1 (t)δ1 + Y2 (t)δ2 )dt + δ1 dW f1 (t) + δ2 dW
f2 (t)
√ [ ]
δ1 f δ2 f
= −λ2 (Rt − δ0 )dt + δ1 + δ2 √ 2
2 2 dW1 (t) + √ 2 dW2 (t) .
δ1 + δ22 δ1 + δ22

e = √ δ1 W
So a = λ2 δ0 , b = λ2 , σ = δ12 + δ22 and B(t) f1 (t) + √ δ2 W f2 (t).
2 2
δ1 +δ2 2 2
δ1 +δ2

I Exercise 10.7 (Forward measure in the two-factor Vasicek model). Fix a maturity T > 0. In the two-factor Vasicek model of Subsection 10.2.1, consider the T-forward measure \widetilde{P}^T of Definition 9.4.1:

\widetilde{P}^T(A) = \frac{1}{B(0, T)} \int_A D(T)\, d\widetilde{P} \quad \text{for all } A \in \mathcal{F}.

(i) Show that the two-dimensional \widetilde{P}^T-Brownian motion \big(\widetilde{W}_1^T(t), \widetilde{W}_2^T(t)\big) of (9.2.5) is given by

\widetilde{W}_j^T(t) = \int_0^t C_j(T - u)\,du + \widetilde{W}_j(t), \quad j = 1, 2,     (10.7.18)

where C_1(\tau) and C_2(\tau) are given by (10.2.26)-(10.2.28).


Proof. We use the canonical form of the model as in formulas (10.2.4)-(10.2.6). By (10.2.20),

dB(t, T) = df(t, Y_1(t), Y_2(t)) = d\, e^{-Y_1(t)C_1(T-t) - Y_2(t)C_2(T-t) - A(T-t)}
         = (\cdots)\,dt + B(t, T)\left[ -C_1(T-t)\, d\widetilde{W}_1(t) - C_2(T-t)\, d\widetilde{W}_2(t) \right]
         = (\cdots)\,dt + B(t, T)\big( -C_1(T-t), -C_2(T-t) \big) \begin{pmatrix} d\widetilde{W}_1(t) \\ d\widetilde{W}_2(t) \end{pmatrix}.

So the volatility vector of B(t, T) under \widetilde{P} is (-C_1(T-t), -C_2(T-t)). By (9.2.5),

\widetilde{W}_j^T(t) = \int_0^t C_j(T - u)\,du + \widetilde{W}_j(t), \quad j = 1, 2,

form a two-dimensional Brownian motion under \widetilde{P}^T.

(ii) Consider a call option on a bond maturing at time T̄ > T . The call expires at time T and has strike
price K. Show that at time zero the risk-neutral price of this option is
B(0, T)\, \widetilde{E}^T\!\left[ \left( e^{-C_1(\bar{T}-T)Y_1(T) - C_2(\bar{T}-T)Y_2(T) - A(\bar{T}-T)} - K \right)^+ \right].     (10.7.19)

Proof. The option pays (B(T, \bar{T}) - K)^+ at time T, and by (10.2.20), B(T, \bar{T}) = e^{-C_1(\bar{T}-T)Y_1(T) - C_2(\bar{T}-T)Y_2(T) - A(\bar{T}-T)}. Under the T-forward measure \widetilde{P}^T the numéraire is B(t, T), so risk-neutral pricing under this measure gives, for the time-zero price V(0) of the option,

\frac{V(0)}{B(0, T)} = \widetilde{E}^T\!\left[ \frac{\left( B(T, \bar{T}) - K \right)^+}{B(T, T)} \right].

Since B(T, T) = 1, this is exactly (10.7.19).


(iii) Show that, under the T-forward measure \widetilde{P}^T, the term

X = −C1 (T̄ − T )Y1 (T ) − C2 (T̄ − T )Y2 (T ) − A(T̄ − T )

appearing in the exponent in (10.7.19) is normally distributed.


Proof. Using (10.7.18), we can rewrite (10.2.4) and (10.2.5) as

dY_1(t) = -\lambda_1 Y_1(t)\,dt - C_1(T-t)\,dt + d\widetilde{W}_1^T(t),
dY_2(t) = -\lambda_{21} Y_1(t)\,dt - \lambda_2 Y_2(t)\,dt - C_2(T-t)\,dt + d\widetilde{W}_2^T(t).

Solving these linear equations gives

Y_1(t) = Y_1(0)e^{-\lambda_1 t} - \int_0^t C_1(T-s)e^{\lambda_1(s-t)}\,ds + \int_0^t e^{\lambda_1(s-t)}\, d\widetilde{W}_1^T(s),
Y_2(t) = Y_2(0)e^{-\lambda_2 t} - \lambda_{21}\int_0^t Y_1(s)e^{\lambda_2(s-t)}\,ds - \int_0^t C_2(T-s)e^{\lambda_2(s-t)}\,ds + \int_0^t e^{\lambda_2(s-t)}\, d\widetilde{W}_2^T(s).

Since C_1 is deterministic, Y_1 is a Gaussian process. As a consequence, the term \int_0^t Y_1(s)e^{\lambda_2(s-t)}\,ds in the expression for Y_2 is also Gaussian, and it is independent of \int_0^t e^{\lambda_2(s-t)}\, d\widetilde{W}_2^T(s), because \widetilde{W}_1^T and \widetilde{W}_2^T are independent Brownian motions under \widetilde{P}^T. Therefore (Y_1(T), Y_2(T)) is jointly Gaussian under \widetilde{P}^T, and X, being a linear combination of Y_1(T) and Y_2(T) plus a constant, is also Gaussian.
(iv) It is a straightforward but lengthy computation, like the computations in Exercise 10.1, to determine
the mean and variance of the term X. Let us call its variance \sigma^2 and its mean \mu - \frac{1}{2}\sigma^2, so that we can write X as

X = \mu - \frac{1}{2}\sigma^2 - \sigma Z,

where Z is a standard normal random variable under \widetilde{P}^T. Show that the call option price in (10.7.19) is

B(0, T)\left( e^{\mu} N(d_+) - K N(d_-) \right),

where

d_\pm = \frac{1}{\sigma}\left( \mu - \log K \pm \frac{1}{2}\sigma^2 \right).
Proof. The call option price in (10.7.19) is

B(0, T)\, \widetilde{E}^T\!\left[ \left( e^X - K \right)^+ \right]

with \widetilde{E}^T[X] = \mu - \frac{1}{2}\sigma^2 and Var(X) = \sigma^2. Compare with the Black-Scholes formula for call options: if dS_t = rS_t\,dt + \sigma S_t\, d\widetilde{W}_t, then

\widetilde{E}\!\left[ e^{-rT}\left( S_0 e^{\sigma \widetilde{W}_T + (r - \frac{1}{2}\sigma^2)T} - K \right)^+ \right] = S_0 N(d_+) - K e^{-rT} N(d_-),

with d_\pm = \frac{1}{\sigma\sqrt{T}}\left( \log\frac{S_0}{K} + (r \pm \frac{1}{2}\sigma^2)T \right). Since -Z has the same standard normal law as \widetilde{W}_1, setting r = \mu, T = 1, and S_0 = 1 in the Black-Scholes formula gives

B(0, T)\, \widetilde{E}^T\!\left[ \left( e^X - K \right)^+ \right] = B(0, T)\, e^{\mu}\, \widetilde{E}^T\!\left[ e^{-\mu}\left( e^X - K \right)^+ \right] = B(0, T)\left( e^{\mu} N(d_+) - K N(d_-) \right),

where d_\pm = \frac{1}{\sigma}\left( -\log K + \mu \pm \frac{1}{2}\sigma^2 \right).
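As a sanity check of the formula just derived (not of the two-factor model itself), the sketch below compares B(0, T)(e^μ N(d+) − K N(d−)) with a direct Monte Carlo average of B(0, T) Ẽ^T[(e^X − K)^+], where X = μ − σ²/2 − σZ; the numerical values of μ, σ, K and B(0, T), and the use of scipy, are assumptions of the sketch.

# Sketch for Exercise 10.7(iv): closed form versus Monte Carlo (illustrative values).
import numpy as np
from scipy.stats import norm

mu, sigma, K, B0T = -0.05, 0.12, 0.94, 0.97
d_plus  = (mu - np.log(K) + 0.5 * sigma**2) / sigma
d_minus = (mu - np.log(K) - 0.5 * sigma**2) / sigma
closed_form = B0T * (np.exp(mu) * norm.cdf(d_plus) - K * norm.cdf(d_minus))

rng = np.random.default_rng(2)
Z = rng.standard_normal(2_000_000)
X = mu - 0.5 * sigma**2 - sigma * Z
monte_carlo = B0T * np.mean(np.maximum(np.exp(X) - K, 0.0))
print(closed_form, monte_carlo)       # the two numbers agree to Monte Carlo accuracy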

I Exercise 10.8 (Reversal of order of integration in forward rates). The forward rate formula
(10.3.5) with v replacing T states that
f(t, v) = f(0, v) + \int_0^t \alpha(u, v)\,du + \int_0^t \sigma(u, v)\,dW(u).

Therefore

-\int_t^T f(t, v)\,dv = -\int_t^T \left[ f(0, v) + \int_0^t \alpha(u, v)\,du + \int_0^t \sigma(u, v)\,dW(u) \right] dv.     (10.7.20)

(i) Define

\hat{\alpha}(u, t, T) = \int_t^T \alpha(u, v)\,dv, \qquad \hat{\sigma}(u, t, T) = \int_t^T \sigma(u, v)\,dv.

Show that if we reverse the order of integration in (10.7.20), we obtain the equation

-\int_t^T f(t, v)\,dv = -\int_t^T f(0, v)\,dv - \int_0^t \hat{\alpha}(u, t, T)\,du - \int_0^t \hat{\sigma}(u, t, T)\,dW(u).     (10.7.21)

(In one case, this is a reversal of the order of two Riemann integrals, a step that uses only the theory of
ordinary calculus. In the other case, the order of a Riemann and an Itô integral are being reversed. This
step is justified in the appendix of [83]. You may assume without proof that this step is legitimate.)
Proof. Starting from (10.7.20), we have

-\int_t^T f(t, v)\,dv = -\int_t^T f(0, v)\,dv - \iint_{\{t \le v \le T,\, 0 \le u \le t\}} \alpha(u, v)\,du\,dv - \iint_{\{t \le v \le T,\, 0 \le u \le t\}} \sigma(u, v)\,dW(u)\,dv
                      = -\int_t^T f(0, v)\,dv - \int_0^t du \int_t^T \alpha(u, v)\,dv - \int_0^t dW(u) \int_t^T \sigma(u, v)\,dv
                      = -\int_t^T f(0, v)\,dv - \int_0^t \hat{\alpha}(u, t, T)\,du - \int_0^t \hat{\sigma}(u, t, T)\,dW(u).

(ii) Take the differential with respect to t in (10.7.21), remembering to get two terms from each of the integrals \int_0^t \hat{\alpha}(u, t, T)\,du and \int_0^t \hat{\sigma}(u, t, T)\,dW(u) because one must differentiate with respect to each of the two t's appearing in these integrals.

Solution. Differentiating both sides of (10.7.21), we have

d\left( -\int_t^T f(t, v)\,dv \right)
 = f(0, t)\,dt - \hat{\alpha}(t, t, T)\,dt - \int_0^t \frac{\partial}{\partial t}\hat{\alpha}(u, t, T)\,du\,dt - \hat{\sigma}(t, t, T)\,dW(t) - \int_0^t \frac{\partial}{\partial t}\hat{\sigma}(u, t, T)\,dW(u)\,dt
 = f(0, t)\,dt - \int_t^T \alpha(t, v)\,dv\,dt + \int_0^t \alpha(u, t)\,du\,dt - \int_t^T \sigma(t, v)\,dv\,dW(t) + \int_0^t \sigma(u, t)\,dW(u)\,dt.

(iii) Check that your formula in (ii) agrees with (10.3.10).
Proof. We note R(t) = f(t, t) = f(0, t) + \int_0^t \alpha(u, t)\,du + \int_0^t \sigma(u, t)\,dW(u), and recall \alpha^*(t, T) = \int_t^T \alpha(t, v)\,dv, \sigma^*(t, T) = \int_t^T \sigma(t, v)\,dv. From (ii), we therefore have

d\left( -\int_t^T f(t, v)\,dv \right) = R(t)\,dt - \alpha^*(t, T)\,dt - \sigma^*(t, T)\,dW(t),

which is (10.3.10).

I Exercise 10.9 (Multifactor HJM model). Suppose the Heath-Jarrow-Morton model is driven by a
d-dimensional Brownian motion, so that σ(t, T ) is also a d-dimensional vector and the forward rate dynamics
are given by
df(t, T) = \alpha(t, T)\,dt + \sum_{j=1}^d \sigma_j(t, T)\,dW_j(t).

(i) Show that (10.3.16) becomes

\alpha(t, T) = \sum_{j=1}^d \sigma_j(t, T)\left[ \sigma_j^*(t, T) + \Theta_j(t) \right].

Proof. We first derive the SDE for the discounted bond price D(t)B(t, T). We note

d\left( -\int_t^T f(t, v)\,dv \right) = f(t, t)\,dt - \int_t^T df(t, v)\,dv
 = R(t)\,dt - \int_t^T \left[ \alpha(t, v)\,dt + \sum_{j=1}^d \sigma_j(t, v)\,dW_j(t) \right] dv
 = R(t)\,dt - \alpha^*(t, T)\,dt - \sum_{j=1}^d \sigma_j^*(t, T)\,dW_j(t).

Then, applying Itô's formula, we have

dB(t, T) = d\, e^{-\int_t^T f(t, v)\,dv} = B(t, T)\left[ d\left( -\int_t^T f(t, v)\,dv \right) + \frac{1}{2}\sum_{j=1}^d (\sigma_j^*(t, T))^2\,dt \right]
 = B(t, T)\left[ R(t) - \alpha^*(t, T) + \frac{1}{2}\sum_{j=1}^d (\sigma_j^*(t, T))^2 \right] dt - B(t, T)\sum_{j=1}^d \sigma_j^*(t, T)\,dW_j(t).

Using the integration-by-parts formula, we have

d(D(t)B(t, T)) = D(t)\left[ -B(t, T)R(t)\,dt + dB(t, T) \right]
 = D(t)B(t, T)\left\{ \left[ -\alpha^*(t, T) + \frac{1}{2}\sum_{j=1}^d (\sigma_j^*(t, T))^2 \right] dt - \sum_{j=1}^d \sigma_j^*(t, T)\,dW_j(t) \right\}.

If the no-arbitrage condition holds, we can find a risk-neutral measure \widetilde{P} under which

\widetilde{W}(t) = \int_0^t \Theta(u)\,du + W(t)

is a d-dimensional Brownian motion for some d-dimensional adapted process \Theta(t). Writing dW_j(t) = d\widetilde{W}_j(t) - \Theta_j(t)\,dt and requiring the dt term of d(D(t)B(t, T)) to vanish (so that D(t)B(t, T) is a martingale under \widetilde{P}), we obtain

-\alpha^*(t, T) + \frac{1}{2}\sum_{j=1}^d (\sigma_j^*(t, T))^2 + \sum_{j=1}^d \sigma_j^*(t, T)\Theta_j(t) = 0.

Differentiating both sides with respect to T, we obtain

-\alpha(t, T) + \sum_{j=1}^d \sigma_j^*(t, T)\sigma_j(t, T) + \sum_{j=1}^d \sigma_j(t, T)\Theta_j(t) = 0,

or equivalently,

\alpha(t, T) = \sum_{j=1}^d \sigma_j(t, T)\left[ \sigma_j^*(t, T) + \Theta_j(t) \right].

(ii) Suppose there is an adapted, d-dimensional process

Θ(t) = (Θ1 (t), · · · , Θd (t))

satisfying this equation for all 0 ≤ t ≤ T ≤ T̄. Show that if there are maturities T1, · · · , Td such that the
d × d matrix (σj (t, Ti ))i,j is nonsingular, then Θ(t) is unique.
Proof. Let \sigma(t, T) = [\sigma_1(t, T), \cdots, \sigma_d(t, T)]^{tr} and \sigma^*(t, T) = [\sigma_1^*(t, T), \cdots, \sigma_d^*(t, T)]^{tr}. Then the no-arbitrage condition of part (i) becomes

\alpha(t, T) - (\sigma^*(t, T))^{tr}\sigma(t, T) = (\sigma(t, T))^{tr}\Theta(t).

Letting T range over T_1, \cdots, T_d, we obtain the matrix equation

\begin{pmatrix} \alpha(t, T_1) - (\sigma^*(t, T_1))^{tr}\sigma(t, T_1) \\ \vdots \\ \alpha(t, T_d) - (\sigma^*(t, T_d))^{tr}\sigma(t, T_d) \end{pmatrix}
= \begin{pmatrix} \sigma_1(t, T_1) & \cdots & \sigma_d(t, T_1) \\ \vdots & & \vdots \\ \sigma_1(t, T_d) & \cdots & \sigma_d(t, T_d) \end{pmatrix}
\begin{pmatrix} \Theta_1(t) \\ \vdots \\ \Theta_d(t) \end{pmatrix}.

Therefore, if (\sigma_j(t, T_i))_{i,j} is nonsingular, \Theta(t) is uniquely determined by the above matrix equation.
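The uniqueness argument is just linear algebra; the following sketch (with fabricated σ_j(t, T_i), σ_j*(t, T_i) and a made-up Θ) recovers Θ(t) at a fixed time t by solving the matrix equation with numpy.

# Sketch for Exercise 10.9(ii): recover Theta(t) from the d x d linear system (made-up data).
import numpy as np

d = 3
rng = np.random.default_rng(3)
S      = rng.uniform(0.1, 0.5, size=(d, d))    # S[i, j]      = sigma_j(t, T_i)
S_star = rng.uniform(0.0, 1.0, size=(d, d))    # S_star[i, j] = sigma_j*(t, T_i)
theta_true = np.array([0.2, -0.1, 0.3])

# alpha(t, T_i) = sum_j sigma_j(t, T_i) * (sigma_j*(t, T_i) + Theta_j(t))  (the HJM condition)
alpha = np.sum(S * (S_star + theta_true), axis=1)

rhs = alpha - np.sum(S_star * S, axis=1)        # alpha(t, T_i) - sum_j sigma_j* sigma_j
theta = np.linalg.solve(S, rhs)                 # unique whenever S is nonsingular
print(theta)                                    # recovers theta_true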

I Exercise 10.10. (i) Use the ordinary differential equations (6.5.8) and (6.5.9) satisfied by the functions
A(t, T ) and C(t, T ) in the one-factor Hull-White model to show that this model satisfies the HJM no-arbitrage
condition (10.3.27).

Proof. Recall that (6.5.8) and (6.5.9) are

C'(t, T) = b(t)C(t, T) - 1,
A'(t, T) = -a(t)C(t, T) + \frac{1}{2}\sigma^2(t)C^2(t, T),

where ' denotes \partial/\partial t. Then, by \beta(t, r) = a(t) - b(t)r and \gamma(t, r) = \sigma(t) (p. 430), we have

\frac{\partial}{\partial T}C(t, T)\,\beta(t, R(t)) + R(t)\frac{\partial}{\partial T}C'(t, T) + \frac{\partial}{\partial T}A'(t, T)
 = \frac{\partial}{\partial T}C(t, T)\,\beta(t, R(t)) + R(t)b(t)\frac{\partial}{\partial T}C(t, T) + \left[ -a(t)\frac{\partial}{\partial T}C(t, T) + \sigma^2(t)C(t, T)\frac{\partial}{\partial T}C(t, T) \right]
 = \frac{\partial}{\partial T}C(t, T)\left[ \beta(t, R(t)) + R(t)b(t) - a(t) + \sigma^2(t)C(t, T) \right]
 = \left( \frac{\partial}{\partial T}C(t, T) \right) C(t, T)\,\gamma^2(t, R(t)),

where the last equality uses \beta(t, R(t)) + R(t)b(t) - a(t) = 0 and \gamma^2(t, R(t)) = \sigma^2(t).

(ii) Use the ordinary differential equations (6.5.14) and (6.5.15) satisfied by the functions A(t, T ) and C(t, T )
in the one-factor Cox-Ingersoll-Ross model to show that this model satisfies the HJM no-arbitrage condition
(10.3.27).

Proof. Recall that (6.5.14) and (6.5.15) are

C'(t, T) = bC(t, T) + \frac{1}{2}\sigma^2 C^2(t, T) - 1,
A'(t, T) = -aC(t, T),

where ' denotes \partial/\partial t. Then, by \beta(t, r) = a - br and \gamma(t, r) = \sigma\sqrt{r} (p. 430), we have

\frac{\partial}{\partial T}C(t, T)\,\beta(t, R(t)) + R(t)\frac{\partial}{\partial T}C'(t, T) + \frac{\partial}{\partial T}A'(t, T)
 = \frac{\partial}{\partial T}C(t, T)\,\beta(t, R(t)) + R(t)\left[ b\frac{\partial}{\partial T}C(t, T) + \sigma^2 C(t, T)\frac{\partial}{\partial T}C(t, T) \right] - a\frac{\partial}{\partial T}C(t, T)
 = \frac{\partial}{\partial T}C(t, T)\left[ \beta(t, R(t)) + R(t)b + R(t)\sigma^2 C(t, T) - a \right]
 = \left( \frac{\partial}{\partial T}C(t, T) \right) C(t, T)\,\gamma^2(t, R(t)),

where the last equality uses \beta(t, R(t)) + R(t)b - a = 0 and \gamma^2(t, R(t)) = \sigma^2 R(t).
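As an optional cross-check of the computation in part (i), specialized to constant coefficients a, b, σ (the Vasicek case), the sketch below verifies the identity symbolically with sympy, using the closed-form C and A quoted at the beginning of Exercise 10.5; the CIR case of part (ii) can be checked the same way once closed forms for its C and A are inserted.

# Symbolic check of the identity of Exercise 10.10(i) for constant a, b, sigma (Vasicek),
# with C and A as in (6.5.10)-(6.5.11); a prime denotes d/dt.
import sympy as sp

t, T, r, a, b, sigma = sp.symbols('t T r a b sigma', positive=True)
tau = T - t
C = (1 - sp.exp(-b * tau)) / b
A = ((2*a*b - sigma**2) / (2*b**2) * tau
     + (sigma**2 - a*b) / b**3 * (1 - sp.exp(-b * tau))
     - sigma**2 / (4*b**3) * (1 - sp.exp(-2*b * tau)))

beta = a - b * r                    # short-rate drift beta(t, r)
gamma_sq = sigma**2                 # gamma(t, r)^2 for the constant-coefficient model

lhs = sp.diff(C, T) * beta + r * sp.diff(C, t, T) + sp.diff(A, t, T)
rhs = C * sp.diff(C, T) * gamma_sq
print(sp.simplify(lhs - rhs))       # prints 0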
∂T

I Exercise 10.11. Let δ > 0 be given. Consider an interest rate swap paying a fixed interest rate K
and receiving backset LIBOR L(Tj−1 , Tj−1 ) on a principal of 1 at each of the payment dates Tj = δj,
j = 1, 2, · · · , n + 1. Show that the value of the swap is


\delta K \sum_{j=1}^{n+1} B(0, T_j) - \delta \sum_{j=1}^{n+1} B(0, T_j) L(0, T_{j-1}).     (10.7.22)

Remark 10.7.1 The swap rate is defined to be the value of K that makes the initial value of the swap equal to zero. Thus, the swap rate is

K = \frac{\sum_{j=1}^{n+1} B(0, T_j) L(0, T_{j-1})}{\sum_{j=1}^{n+1} B(0, T_j)}.     (10.7.23)

Proof. On each payment date T_j, the payoff of this swap contract is \delta(K - L(T_{j-1}, T_{j-1})). By Theorem 10.4.1, its no-arbitrage price at time 0 is \delta(K B(0, T_j) - B(0, T_j) L(0, T_{j-1})). So the value of the swap is

\sum_{j=1}^{n+1} \delta\left[ K B(0, T_j) - B(0, T_j) L(0, T_{j-1}) \right] = \delta K \sum_{j=1}^{n+1} B(0, T_j) - \delta \sum_{j=1}^{n+1} B(0, T_j) L(0, T_{j-1}).
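A small numerical sketch of (10.7.22) and (10.7.23), using an assumed discount curve B(0, T_j) and the forward LIBOR relation L(0, T_{j-1}) = (B(0, T_{j-1})/B(0, T_j) - 1)/δ; the curve values, δ and K below are illustrative.

# Sketch for Exercise 10.11: swap value (10.7.22) and swap rate (10.7.23) from bond prices.
import numpy as np

delta = 0.5
B = np.array([0.990, 0.975, 0.958, 0.940, 0.920])   # B(0, T_1), ..., B(0, T_{n+1}) (assumed)
B_prev = np.concatenate(([1.0], B[:-1]))            # B(0, T_0) = 1, ..., B(0, T_n)
L = (B_prev / B - 1.0) / delta                       # forward LIBOR L(0, T_{j-1})

K = 0.04
swap_value = delta * K * B.sum() - delta * (B * L).sum()
swap_rate  = (B * L).sum() / B.sum()
print(swap_value, swap_rate)

Note that \delta \sum_j B(0, T_j)L(0, T_{j-1}) telescopes to 1 - B(0, T_{n+1}), the familiar expression for the value of the floating leg.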

I Exercise 10.12. In the proof of Theorem 10.4.1, we showed by an arbitrage argument that the value at
time 0 of a payment of backset LIBOR L(T, T ) at time T + δ is B(0, T + δ)L(0, T ). The risk-neutral price
of this payment, computed at time zero, is

\widetilde{E}[D(T + \delta)L(T, T)].

Use the definitions

L(T, T) = \frac{1 - B(T, T + \delta)}{\delta B(T, T + \delta)}, \qquad B(0, T + \delta) = \widetilde{E}[D(T + \delta)],

and the properties of conditional expectations to show that

\widetilde{E}[D(T + \delta)L(T, T)] = B(0, T + \delta)L(0, T).

Proof. Since L(T, T) = \frac{1 - B(T, T + \delta)}{\delta B(T, T + \delta)} is \mathcal{F}_T-measurable, we have

\widetilde{E}[D(T + \delta)L(T, T)] = \widetilde{E}\left[ \widetilde{E}[D(T + \delta)L(T, T) \mid \mathcal{F}_T] \right]
 = \widetilde{E}\left[ \frac{1 - B(T, T + \delta)}{\delta B(T, T + \delta)}\, \widetilde{E}[D(T + \delta) \mid \mathcal{F}_T] \right]
 = \widetilde{E}\left[ \frac{1 - B(T, T + \delta)}{\delta B(T, T + \delta)}\, D(T)B(T, T + \delta) \right]
 = \widetilde{E}\left[ \frac{D(T) - D(T)B(T, T + \delta)}{\delta} \right]
 = \frac{B(0, T) - B(0, T + \delta)}{\delta}
 = B(0, T + \delta)L(0, T),

where the second-to-last equality uses \widetilde{E}[D(T)] = B(0, T) and \widetilde{E}[D(T)B(T, T + \delta)] = \widetilde{E}\big[\widetilde{E}[D(T + \delta) \mid \mathcal{F}_T]\big] = B(0, T + \delta), and the last equality uses L(0, T) = \frac{1}{\delta}\left( \frac{B(0, T)}{B(0, T + \delta)} - 1 \right).

Remark 7. An alternative proof is to use the (T + \delta)-forward measure \widetilde{P}^{T+\delta}: the time-t no-arbitrage price of a payoff \delta L(T, T) at time (T + \delta) is

B(t, T + \delta)\, \widetilde{E}^{T+\delta}[\delta L(T, T) \mid \mathcal{F}_t] = B(t, T + \delta) \cdot \delta L(t, T),

where we have used the observation that the forward rate process L(t, T) = \frac{1}{\delta}\left( \frac{B(t, T)}{B(t, T + \delta)} - 1 \right) (0 \le t \le T) is a martingale under the (T + \delta)-forward measure.

11 Introduction to Jump Processes


⋆ Comments:
1) A mathematically rigorous presentation of semimartingale theory and stochastic calculus can be found
in He et al. [5] and Kallenberg [7].
2) Girsanov’s theorem. The most general version of Girsanov’s theorem for local martingales can be
found in He et al. [5, page 340] (Theorem 12.13) or Kallenberg [7, page 523] (Theorem 26.9). It has the
following form

Theorem 1 (Girsanov's theorem for local martingales). Let Q = Z_t \cdot P on \mathcal{F}_t for all t \ge 0, and consider a local P-martingale M such that the process [M, Z] has locally integrable variation and P-compensator \langle M, Z \rangle. Then \bar{M} = M - Z_-^{-1} \cdot \langle M, Z \rangle is a local Q-martingale.^{13}
Applying Girsanov's theorem to Lemma 11.6.1: in order to change the intensity of a Poisson process via a change of measure, we need to find a P-martingale L such that the Radon-Nikodým derivative process Z_\cdot = d\widetilde{P}/dP satisfies the SDE

dZ(t) = Z(t-)\,dL(t)

and

N(t) - \lambda t - \langle N(t) - \lambda t, L(t) \rangle = N(t) - \widetilde{\lambda} t

is a martingale under \widetilde{P}. This implies we should find L such that

\langle N(t) - \lambda t, L(t) \rangle = (\widetilde{\lambda} - \lambda)t.
^{13} For the difference between the predictable quadratic variation \langle \cdot \rangle (or angle bracket process) and the quadratic variation (or square bracket process), see He et al. [5, pages 185-187]. Basically, we have [M, N]_t = M_0 N_0 + \langle M^c, N^c \rangle_t + \sum_{s \le t} \Delta M_s \Delta N_s, and \langle M, N \rangle is the dual predictable projection of [M, N] (i.e., [M, N]_t - \langle M, N \rangle_t is a martingale).

Assuming the martingale representation theorem holds, we suppose L(t) = L(0) + \int_0^t H(s)\,d(N(s) - \lambda s); then

[N(t) - \lambda t, L(t)] = \int_0^t H(s)\,d[N(s) - \lambda s, N(s) - \lambda s] = \sum_{0 < s \le t} H(s)(\Delta N(s))^2 = \int_0^t H(s)\,dN(s).

So \langle N(t) - \lambda t, L(t) \rangle = \int_0^t H(s)\lambda\,ds. Solving the equation

\int_0^t H(s)\lambda\,ds = (\widetilde{\lambda} - \lambda)t,

we get H(s) = \frac{\widetilde{\lambda} - \lambda}{\lambda}. Combined, we conclude Z should be determined by the equation (11.6.2):

dZ(t) = \frac{\widetilde{\lambda} - \lambda}{\lambda} Z(t-)\,dM(t).
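For a quick sanity check of this change of intensity, recall (Lemma 11.6.1) that the solution of (11.6.2) is Z(t) = e^{(λ−λ̃)t}(λ̃/λ)^{N(t)}. The Monte Carlo sketch below (with made-up values of λ, λ̃ and t) checks that E[Z(t)] = 1 and that E[Z(t)N(t)] = λ̃t, consistent with N having intensity λ̃ under the new measure.

# Monte Carlo sketch for the comments above (illustrative parameters).
import numpy as np

rng = np.random.default_rng(4)
lam, lam_tilde, t, n_paths = 2.0, 3.5, 1.0, 1_000_000

N = rng.poisson(lam * t, size=n_paths)                        # N(t) under P
Z = np.exp((lam - lam_tilde) * t) * (lam_tilde / lam) ** N    # Radon-Nikodym derivative Z(t)

print(np.mean(Z))        # close to 1
print(np.mean(Z * N))    # close to lam_tilde * t = 3.5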

I Exercise 11.1. Let M (t) be the compensated Poisson process of Theorem 11.2.4.
(i) Show that M 2 (t) is a submartingale.
Proof. First, M²(t) = N²(t) − 2λtN(t) + λ²t², so E[M²(t)] < ∞. Since φ(x) = x² is a convex function, the conditional Jensen's inequality (Shreve [14, page 70], Theorem 2.3.2(v)) gives

E[φ(M(t))|F(s)] ≥ φ(E[M(t)|F(s)]) = φ(M(s)),  ∀s ≤ t.

So M²(t) is a submartingale.
(ii) Show that M 2 (t) − λt is a martingale.
Proof. We note M(t) = N(t) − λt has independent and stationary increments. So, for all s ≤ t,

E[M²(t) − M²(s)|F(s)] = E[(M(t) − M(s))²|F(s)] + E[2M(s)(M(t) − M(s))|F(s)]
                      = E[M²(t − s)] + 2M(s)E[M(t − s)]
                      = Var(N(t − s)) + 0
                      = λ(t − s).

That is, E[M²(t) − λt|F(s)] = M²(s) − λs, so M²(t) − λt is a martingale.
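A one-line Monte Carlo check of this identity (parameter values are arbitrary): since N(t) is Poisson with mean λt, the sample mean of M²(t) should be close to λt.

# Sketch for Exercise 11.1(ii): E[M^2(t)] = lambda * t (illustrative parameters).
import numpy as np

rng = np.random.default_rng(5)
lam, t = 1.7, 2.0
M = rng.poisson(lam * t, size=1_000_000) - lam * t
print(np.mean(M**2), lam * t)    # the two numbers are close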

I Exercise 11.2. Suppose we have observed a Poisson process up to time s, have seen that N (s) = k, and
are interested in the values of N (s + t) for small positive t. Show that
P{N(s + t) = k | N(s) = k} = 1 − λt + O(t²),
P{N(s + t) = k + 1 | N(s) = k} = λt + O(t²),
P{N(s + t) ≥ k + 2 | N(s) = k} = O(t²),

where O(t²) is used to denote terms involving t² and higher powers of t.
Proof. We note

P(N(s + t) = k \mid N(s) = k) = P(N(s + t) - N(s) = 0 \mid N(s) = k) = P(N(t) = 0) = e^{-\lambda t} = 1 - \lambda t + O(t^2).

Similarly, we have

P(N(s + t) = k + 1 \mid N(s) = k) = P(N(t) = 1) = \frac{(\lambda t)^1}{1!} e^{-\lambda t} = \lambda t(1 - \lambda t + O(t^2)) = \lambda t + O(t^2),

and

P(N(s + t) \ge k + 2 \mid N(s) = k) = P(N(t) \ge 2) = \sum_{j=2}^{\infty} \frac{(\lambda t)^j}{j!} e^{-\lambda t} = O(t^2).

I Exercise 11.3 (Geometric Poisson process). Let N (t) be a Poisson process with intensity λ > 0,
and let S(0) > 0 and σ > −1 be given. Using Theorem 11.2.3 rather than the Itô-Doeblin formula for jump
processes, show that
S(t) = exp{N(t) log(σ + 1) − λσt} = (σ + 1)^{N(t)} e^{−λσt}
is a martingale.

Proof. For any t ≤ u, we have

E\!\left[ \frac{S(u)}{S(t)} \,\Big|\, \mathcal{F}(t) \right] = E\!\left[ (\sigma + 1)^{N(u) - N(t)} e^{-\lambda\sigma(u - t)} \,\Big|\, \mathcal{F}(t) \right]
 = e^{-\lambda\sigma(u - t)}\, E\!\left[ (\sigma + 1)^{N(u - t)} \right]
 = e^{-\lambda\sigma(u - t)}\, E\!\left[ e^{N(u - t)\log(\sigma + 1)} \right]
 = e^{-\lambda\sigma(u - t)} \exp\!\left\{ \lambda(u - t)\left( e^{\log(\sigma + 1)} - 1 \right) \right\} \quad \text{(by (11.3.4))}
 = e^{-\lambda\sigma(u - t)}\, e^{\lambda\sigma(u - t)}
 = 1,

where the second equality uses the independence and stationarity of the increments of N. So S(t) = E[S(u)|F(t)] and S is a martingale.
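A Monte Carlo sketch of the martingale property at a single date (so only E[S(t)] = S(0) is checked); the values of λ, σ, S(0) and t are illustrative.

# Sketch for Exercise 11.3: E[S(t)] = S(0) for the geometric Poisson process.
import numpy as np

rng = np.random.default_rng(6)
lam, sigma, S0, t = 1.2, 0.4, 100.0, 3.0
N = rng.poisson(lam * t, size=1_000_000)
S = S0 * (sigma + 1.0) ** N * np.exp(-lam * sigma * t)
print(np.mean(S), S0)    # the sample mean is close to S(0)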

I Exercise 11.4. Suppose N1 (t) and N2 (t) are Poisson processes with intensities λ1 and λ2 , respectively,
both defined on the same probability space (Ω, F, P) and relative to the same filtration F(t), t ≥ 0. Show
that almost surely N1 (t) and N2 (t) can have no simultaneous jump. (Hint: Define the compensated Poisson
processes M1(t) = N1(t) − λ1t and M2(t) = N2(t) − λ2t, which like N1 and N2 are independent. Use the Itô product rule for jump processes to compute M1(t)M2(t) and take expectations.)
Proof. The problem is ambiguous in that the relation between N1 and N2 is not clearly stated. According
to page 524, paragraph 2, we would guess the condition should be that N1 and N2 are independent.
Suppose N1 and N2 are independent. Define M1 (t) = N1 (t) − λ1 t and M2 (t) = N2 (t) − λ2 t. Then
by independence E[M1 (t)M2 (t)] = E[M1 (t)]E[M2 (t)] = 0. Meanwhile, by Itô’s product formula (Corollary
11.5.5),
M_1(t)M_2(t) = \int_0^t M_1(s-)\,dM_2(s) + \int_0^t M_2(s-)\,dM_1(s) + [M_1, M_2](t).

Both \int_0^t M_1(s-)\,dM_2(s) and \int_0^t M_2(s-)\,dM_1(s) are martingales. So, taking expectations on both sides, we get

0 = 0 + E\{[M_1, M_2](t)\} = E\left[ \sum_{0 < s \le t} \Delta N_1(s)\Delta N_2(s) \right].

Since \sum_{0 < s \le t} \Delta N_1(s)\Delta N_2(s) \ge 0 a.s., we conclude \sum_{0 < s \le t} \Delta N_1(s)\Delta N_2(s) = 0 a.s. By letting t = 1, 2, \cdots, we can find a set \Omega_0 of probability 1 such that for every \omega \in \Omega_0, \sum_{0 < s \le t} \Delta N_1(\omega; s)\Delta N_2(\omega; s) = 0 for all t > 0. Therefore N_1 and N_2 can have no simultaneous jump.

I Exercise 11.5. Suppose N1 (t) and N2 (t) are Poisson processes defined on the same probability space
(Ω, F, P) relative to the same filtration F(t), t ≥ 0. Assume that almost surely N1 (t) and N2 (t) have no
simultaneous jump. Show that, for each fixed t, the random variable N1 (t) and N2 (t) are independent.
(Hint: Adapt the proof of Corollary 11.5.3.) (In fact, the whole path of N1 is independent of the whole path
of N2 , although you are not being asked to prove this stronger statement.)

Proof. We shall prove the whole path of N1 is independent of the whole path of N2 , following the scheme
suggested by page 489, paragraph 1.

Fix s ≥ 0 and consider, for t > s,

X_t = u_1(N_1(t) - N_1(s)) + u_2(N_2(t) - N_2(s)) - \lambda_1(e^{u_1} - 1)(t - s) - \lambda_2(e^{u_2} - 1)(t - s).

Then, by Itô's formula for jump processes, we have

e^{X_t} - e^{X_s} = \int_s^t e^{X_u}\,dX_u^c + \frac{1}{2}\int_s^t e^{X_u}\,dX_u^c\,dX_u^c + \sum_{s < u \le t}\left( e^{X_u} - e^{X_{u-}} \right)
 = \int_s^t e^{X_u}\left[ -\lambda_1(e^{u_1} - 1) - \lambda_2(e^{u_2} - 1) \right] du + \sum_{s < u \le t}\left( e^{X_u} - e^{X_{u-}} \right),

where the second equality holds because the continuous part X^c of X is a pure drift, so dX_u^c\,dX_u^c = 0. Since \Delta X_t = u_1\Delta N_1(t) + u_2\Delta N_2(t) and N_1, N_2 have no simultaneous jump,

e^{X_u} - e^{X_{u-}} = e^{X_{u-}}(e^{\Delta X_u} - 1) = e^{X_{u-}}\left[ (e^{u_1} - 1)\Delta N_1(u) + (e^{u_2} - 1)\Delta N_2(u) \right].

Noting that \int_s^t e^{X_u}\,du = \int_s^t e^{X_{u-}}\,du,^{14} we have

e^{X_t} - e^{X_s} = \int_s^t e^{X_{u-}}\left[ -\lambda_1(e^{u_1} - 1) - \lambda_2(e^{u_2} - 1) \right] du + \sum_{s < u \le t} e^{X_{u-}}\left[ (e^{u_1} - 1)\Delta N_1(u) + (e^{u_2} - 1)\Delta N_2(u) \right]
 = \int_s^t e^{X_{u-}}\left[ (e^{u_1} - 1)\,d(N_1(u) - \lambda_1 u) + (e^{u_2} - 1)\,d(N_2(u) - \lambda_2 u) \right].

The right-hand side is a stochastic integral with respect to compensated Poisson processes, hence a martingale in t; since X_s = 0, it follows that E[e^{X_t}] = E[e^{X_s}] = 1, which implies

E\left[ e^{u_1(N_1(t) - N_1(s)) + u_2(N_2(t) - N_2(s))} \right] = e^{\lambda_1(e^{u_1} - 1)(t - s)}\, e^{\lambda_2(e^{u_2} - 1)(t - s)} = E\left[ e^{u_1(N_1(t) - N_1(s))} \right] E\left[ e^{u_2(N_2(t) - N_2(s))} \right].

This shows N_1(t) - N_1(s) is independent of N_2(t) - N_2(s).


Now, suppose we have 0 ≤ t1 < t2 < t3 < · · · < tn , then the vector (N1 (t1 ), · · · , N1 (tn )) is independent
of (N2 (t1 ), · · · , N2 (tn )) if and only if (N1 (t1 ), N1 (t2 ) − N1 (t1 ), · · · , N1 (tn ) − N1 (tn−1 )) is independent of
(N2 (t1 ), N2 (t2 ) − N2 (t1 ), · · · , N2 (tn ) − N2 (tn−1 )). Let t0 = 0, then
E\left[ e^{\sum_{i=1}^n u_i(N_1(t_i) - N_1(t_{i-1})) + \sum_{j=1}^n v_j(N_2(t_j) - N_2(t_{j-1}))} \right]
 = E\left[ e^{\sum_{i=1}^{n-1} u_i(N_1(t_i) - N_1(t_{i-1})) + \sum_{j=1}^{n-1} v_j(N_2(t_j) - N_2(t_{j-1}))}\, E\left[ e^{u_n(N_1(t_n) - N_1(t_{n-1})) + v_n(N_2(t_n) - N_2(t_{n-1}))} \,\Big|\, \mathcal{F}(t_{n-1}) \right] \right]
 = E\left[ e^{\sum_{i=1}^{n-1} u_i(N_1(t_i) - N_1(t_{i-1})) + \sum_{j=1}^{n-1} v_j(N_2(t_j) - N_2(t_{j-1}))} \right] E\left[ e^{u_n(N_1(t_n) - N_1(t_{n-1})) + v_n(N_2(t_n) - N_2(t_{n-1}))} \right]
 = E\left[ e^{\sum_{i=1}^{n-1} u_i(N_1(t_i) - N_1(t_{i-1})) + \sum_{j=1}^{n-1} v_j(N_2(t_j) - N_2(t_{j-1}))} \right] E\left[ e^{u_n(N_1(t_n) - N_1(t_{n-1}))} \right] E\left[ e^{v_n(N_2(t_n) - N_2(t_{n-1}))} \right],

where the second equality comes from the independence of N_i(t_n) - N_i(t_{n-1}) (i = 1, 2) relative to \mathcal{F}(t_{n-1}) and the third equality comes from the result obtained just before. Working by induction, we have

E\left[ e^{\sum_{i=1}^n u_i(N_1(t_i) - N_1(t_{i-1})) + \sum_{j=1}^n v_j(N_2(t_j) - N_2(t_{j-1}))} \right]
 = \prod_{i=1}^n E\left[ e^{u_i(N_1(t_i) - N_1(t_{i-1}))} \right] \cdot \prod_{j=1}^n E\left[ e^{v_j(N_2(t_j) - N_2(t_{j-1}))} \right]
 = E\left[ e^{\sum_{i=1}^n u_i(N_1(t_i) - N_1(t_{i-1}))} \right] \cdot E\left[ e^{\sum_{j=1}^n v_j(N_2(t_j) - N_2(t_{j-1}))} \right].

This shows the whole path of N_1 is independent of the whole path of N_2.


^{14} On the interval [s, t], the sample path X_\cdot(\omega) has only finitely many jumps for each \omega, so the Riemann integrals of X_\cdot and X_{\cdot-} agree with each other.

I Exercise 11.6. Let W (t) be a Brownian motion and let Q(t) be a compound Poisson process, both defined
on the same probability space (Ω, F, P) and relative to the same filtration F(t), t ≥ 0. Show that, for each
t, the random variables W (t) and Q(t) are independent. (In fact, the whole path of W is independent of the
whole path of Q, although you are not being asked to prove this stronger statement.)
Proof. Let X_t = u_1 W(t) - \frac{1}{2}u_1^2 t + u_2 Q(t) - \lambda t(\varphi(u_2) - 1), where \varphi is the moment generating function of the jump size Y. Itô's formula for jump processes yields

e^{X_t} - 1 = \int_0^t e^{X_s}\left( u_1\,dW(s) - \frac{1}{2}u_1^2\,ds - \lambda(\varphi(u_2) - 1)\,ds \right) + \frac{1}{2}\int_0^t e^{X_s} u_1^2\,ds + \sum_{0 < s \le t}\left( e^{X_s} - e^{X_{s-}} \right).

Note \Delta X_t = u_2\Delta Q(t) = u_2 Y_{N(t)}\Delta N(t), where N(t) is the Poisson process associated with Q(t). So

e^{X_t} - e^{X_{t-}} = e^{X_{t-}}(e^{\Delta X_t} - 1) = e^{X_{t-}}\left( e^{u_2 Y_{N(t)}} - 1 \right)\Delta N(t).

Consider the compound Poisson process H_t = \sum_{i=1}^{N(t)}\left( e^{u_2 Y_i} - 1 \right). Then

H_t - \lambda E\left[ e^{u_2 Y_1} - 1 \right] t = H_t - \lambda(\varphi(u_2) - 1)t

is a martingale, e^{X_t} - e^{X_{t-}} = e^{X_{t-}}\Delta H_t, and

e^{X_t} - 1 = \int_0^t e^{X_s}\left( u_1\,dW(s) - \frac{1}{2}u_1^2\,ds - \lambda(\varphi(u_2) - 1)\,ds \right) + \frac{1}{2}\int_0^t e^{X_s} u_1^2\,ds + \int_0^t e^{X_{s-}}\,dH_s
 = \int_0^t e^{X_s} u_1\,dW(s) + \int_0^t e^{X_{s-}}\,d\big( H_s - \lambda(\varphi(u_2) - 1)s \big).

This shows e^{X_t} is a martingale and E[e^{X_t}] \equiv 1. So

E\left[ e^{u_1 W(t) + u_2 Q(t)} \right] = e^{\frac{1}{2}u_1^2 t}\, e^{\lambda t(\varphi(u_2) - 1)} = E\left[ e^{u_1 W(t)} \right] E\left[ e^{u_2 Q(t)} \right].

This shows W(t) and Q(t) are independent.


Remark 8. It is easy to see that, if we follow the steps of the solution to Exercise 11.5 (first proving the independence of W(t) − W(s) and Q(t) − Q(s), then proving the independence of (W(t_1), W(t_2) − W(t_1), \cdots, W(t_n) − W(t_{n-1})) and (Q(t_1), Q(t_2) − Q(t_1), \cdots, Q(t_n) − Q(t_{n-1})) for 0 ≤ t_1 < t_2 < \cdots < t_n), then we can show that the whole path of W is independent of the whole path of Q.
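A Monte Carlo illustration of the factorization E[e^{u_1 W(t) + u_2 Q(t)}] = E[e^{u_1 W(t)}] E[e^{u_2 Q(t)}]: W and Q are simulated independently, with normally distributed jump sizes; the jump-size distribution and all numerical values are assumptions of the sketch.

# Sketch for Exercise 11.6: the joint exponential moment factorizes (illustrative parameters).
import numpy as np

rng = np.random.default_rng(7)
lam, t, u1, u2 = 1.5, 1.0, 0.3, 0.2
m, s, n = 0.1, 0.2, 1_000_000           # jump sizes Y_i ~ N(m, s^2), n sample paths

W = rng.normal(0.0, np.sqrt(t), n)       # W(t)
N = rng.poisson(lam * t, n)              # N(t)
Q = rng.normal(m * N, s * np.sqrt(N))    # Q(t) | N(t) = k is N(m*k, s^2*k) for normal jumps

joint   = np.mean(np.exp(u1 * W + u2 * Q))
product = np.mean(np.exp(u1 * W)) * np.mean(np.exp(u2 * Q))
print(joint, product)                    # the two numbers agree to Monte Carlo accuracy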

I Exercise 11.7. Use Theorem 11.3.2 to prove that a compound Poisson process is Markov. In other
words, show that, whenever we are given two times 0 ≤ t ≤ T and a function h(x), there is another function
g(t, x) such that
E[h(Q(T ))|F(t)] = g(t, Q(t)).
Proof. By the Independence Lemma, since Q(T) − Q(t) is independent of F(t) and has the same distribution as Q(T − t), we have

E[h(Q(T))|F(t)] = E[h(Q(T) − Q(t) + Q(t))|F(t)] = E[h(Q(T − t) + x)]\big|_{x=Q(t)} = g(t, Q(t)),

where g(t, x) = E[h(Q(T − t) + x)].

References
[1] Vladimir I. Arnold and Roger Cooke. Ordinary differential equations, 3rd edition, Springer, 1992. 78
[2] Freddy Delbaen and Walter Schachermayer. The mathematics of arbitrage. Springer, 2006. 72
[3] Tongren Ding and Chengzhi Li. A Course in Ordinary Differential Equations. Beijing: Higher Education Press, 1991. 78

[4] Richard Durrett. Probability: Theory and examples, 4th edition. Cambridge University Press, New York,
2010. 2, 44
[5] Sheng-wu He, Jia-gang Wang, and Jia-an Yan. Semimartingale theory and stochastic calculus. Science
Press & CRC Press Inc, 1992. 94
[6] John Hull. Options, futures, and other derivatives, 4th edition. Prentice-Hall International Inc., New
Jersey, 2000. 44
[7] Olav Kallenberg. Foundations of modern probability, 2nd edition. Springer, 2002. 94

[8] J. David Logan. A first course in differential equations, 2nd edition. Springer, New York, 2010. 27
[9] B. Øksendal. Stochastic differential equations: An introduction with applications, 6th edition. Springer-
Verlag, Berlin, 2003. 54, 65
[10] Minping Qian and Guanglu Gong. Theory of Stochastic Processes, 2nd edition. Beijing: Peking University Press, 1997. 54

[11] Daniel Revuz and Marc Yor. Continous martingales and Brownian motion, 3rd edition. Springer-Verlag,
Berlin, 1998. 49, 54
[12] A. N. Shiryaev. Probability, 2nd edition. Springer, 1995. 23

[13] A. N. Shiryaev. Essentials of stochastic finance: facts, models, theory. World Scientific, Singapore, 1999.
72

[14] Steven Shreve. Stochastic calculus for finance II: Continuous-time models. Springer-Verlag, New York,
2004. 1, 72, 95

[15] Paul Wilmott. The mathematics of financial derivatives: A student introduction. Cambridge University Press, 1995. 27, 38

