
Closed-Form Likelihood Estimation of Jump-Diffusions with an

Application to the Realignment Risk of the Chinese Yuan

Jialin Yu∗
Department of Finance & Economics
Columbia University†

First draft: January 30, 2003


This version: March 8, 2005

Abstract

This paper provides closed-form likelihood approximations for multivariate jump-diffusion


processes widely used in finance. The maximum-likelihood estimator (MLE) computed from this
approximate likelihood achieves the asymptotic efficiency of the true yet uncomputable MLE
estimator. The approximation, based on Kolmogorov equations common to Markov processes,
can be generalized beyond jump-diffusions. This method is then used to uncover the realignment
probability of the Chinese Yuan. Since February 2002, the realignment intensity implicit in the
financial market has increased fivefold. The term structure of the forward realignment rate, which
completely characterizes realignment probabilities in the future, is hump-shaped and peaks at six
months from the end of 2003. The realignment probability responds quickly to news releases on
the Sino-US trade surplus, state-owned enterprise reform, Chinese government tax revenue and,
most importantly, government interventions.


I thank Yacine Aït-Sahalia for helpful discussions throughout this project. I also thank Laurent Calvet, Gregory
Chow, Martin Evans, Rodrigo Guimaraes, Harrison Hong, Bo Honoré, Robert Kimmel, Adam Purzitsky, Hélène Rey,
Jacob Sagi, Ernst Schaumburg, José Scheinkman, Wei Xiong and participants of the Princeton Gregory C. Chow
Econometrics Research seminar and Princeton Microeconometric Workshop for comments and suggestions. All errors
are mine.

421 Uris Hall, 3022 Broadway, New York, NY 10027. Email: jy2167@columbia.edu
1 Introduction
Jump-diffusions are useful tools for modelling economic phenomena such as currency crises,
financial market crashes and defaults. To estimate jump-diffusions, likelihood-based methods such
as maximum-likelihood estimation and Bayesian estimation are preferred because of their optimality,
which is well documented in the statistics literature. However, likelihood-based methods are often
infeasible in practice because the likelihood function does not have a closed form except in rare cases.
Partly due to the lack of explicit likelihood functions, alternative methods of estimation have
been proposed in the literature. Examples include simulation-based estimation (Gouriéroux, Monfort
and Renault (1993), Pedersen (1995), Gallant and Tauchen (1996), Elerian, Chib and Shephard (2001),
Brandt and Santa-Clara (2002)), generalized method of moments estimation (Hansen and Scheinkman
(1995), Kessler and Sørensen (1999)), nonparametric estimation (Aït-Sahalia (1996), Bandi and
Phillips (2003)), and numerical computation or analytical approximation of the Fourier inversion of
the characteristic function of the transition density (Singleton (2001), Aït-Sahalia and Yu (2002)).
Recent papers by Aït-Sahalia provide closed-form approximations to the likelihood functions
of univariate diffusions (Aït-Sahalia (2002)) and multivariate diffusions (Aït-Sahalia (2001)).
Monte Carlo studies show that the approximations are extremely accurate and fast to compute
(see Jensen and Poulsen (2002)). Schaumburg (2001) extends the result in Aït-Sahalia (2002) to
univariate Lévy-driven processes.
Building on this closed-form approximation approach, this paper provides a closed-form
approximation of the likelihood function of multivariate jump-diffusion processes. It extends
Schaumburg (2001) by relaxing the i.i.d. property inherent in Lévy-driven randomness and by
addressing multivariate processes.
The approximation, based on Kolmogorov equations common to Markov processes, can be ex-
tended beyond jump-diffusions.
The maximum-likelihood estimators using the approximate likelihood function provided in this
paper are shown to achieve the asymptotic efficiency of the true yet uncomputable maximum-
likelihood estimators.
Why consider jumps? First, from a statistical point of view, jump-diffusion nests diffusion as a
special case. Being a more general model, jump-diffusion can approximate the true data-generating
process better as measured by, for example, Kullback-Leibler Information Criterion (KLIC) (see
White (1982)). Second, as illustrated by Merton (1992), jumps model “Rare Events” that can have
a big impact over a short period of time. Explicit modelling of jumps improves our understanding
of such phenomena. For example, it is known in the derivative pricing theory that, with jumps,
the arbitrage pricing argument leading to the Black-Scholes option pricing formula breaks down.
Liu, Longstaff and Pan (2003) have shown that the risks brought by jumps and stochastic volatility
dramatically change an investor’s optimal portfolio choice. Finally, some economic phenomena call
for jumps as a modelling device. For instance, in studying credit risks, the structural models, which
assume a firm’s asset value follows a diffusion and default happens if the asset value falls below a
certain level, have the unrealistic implication that short maturity corporate debt has negligible yield
spread, that is, the excess of corporate debt yield over risk free yield (see Duffie and Singleton (2003)).
The jump-diffusion processes considered in this paper nest the widely used affine jump-diffusions
(AJD). Though a subset of jump-diffusions, AJD processes are very useful yet difficult to estimate,
as can be seen from Sundaresan (2000): “The challenge to the econometricians is to present a
framework for estimating such multivariate diffusion processes, which are becoming more and more
common in financial economics in recent times. ... The development of estimation procedures for
multivariate AJD processes is certainly a very important step toward realizing this hope.” Note the
method proposed in this paper applies to both affine and non-affine jump-diffusions.
Applying the proposed approximation method, the second part of the paper uncovers the re-
alignment probability of the Chinese Yuan. The Yuan has been essentially pegged to the US Dollar
for the past seven years. A recent export boom has led to diplomatic pressure on China to allow
the Yuan to appreciate. The realignment risk is important to both China and other parts of the
world. Foreign trade volume accounts for roughly 40 percent of China’s GDP in 2002.1 Any shift in
the terms of trade and any changes in the competitive advantage of exporters brought by currency
fluctuation can have a significant impact on Chinese economy. Between 1992 and 1999, Foreign
Direct Investment (FDI) into China accounted for 8.2 percent of worldwide FDI and 26.3 percent of
FDI going into developing countries, all of which are subject to currency risks.2 Beginning in August
2003, China started to open its domestic financial markets to foreign investors through Qualified
Foreign Institutional Investors and is planning to allow Chinese citizens to invest in foreign financial
markets. Currency risk will be important for these investors of financial markets, too.
This paper uncovers the term structure of the forward realignment rate, which completely
characterizes realignment probabilities in the future.3 The term structure is hump-shaped and peaks
at six months from the end of 2003. This implies the financial market is anticipating an upward
realignment in the next year and, conditioning on no realignments in that period, the chance of a
realignment is perceived to be small in the further future. Since February 2002, the upward realign-
ment intensity for the Yuan implicit in the financial market has increased fivefold. The realignment
probability responds quickly to news releases on Sino-US trade surplus, state-owned enterprise re-
form, Chinese government tax revenue and, most importantly, both domestic and foreign government
interventions.
The paper is organized as follows. Section 2 provides the likelihood approximation. That the
MLE estimators computed from this approximate likelihood achieve the asymptotic efficiency of the
true MLE estimators is proved in Section 3. Section 4 uncovers the realignment probability of the
Chinese Yuan and section 5 concludes. Appendix 1 contains the proofs. Appendix 2 extends the
approximation method. Appendix 3 provides the history of the Yuan’s exchange rate regime. Important
news releases that influence the realignment probability are documented in Appendix 4. Appendix
5 has histograms for the application.
1 Calculated from data provided by Datastream International.
2 The data are from Huang (2003).
3 The forward realignment rate is defined in section 4.2.1.

2 Likelihood Approximation
2.1 Multivariate jump-diffusion process
We consider a multivariate jump-diffusion process X defined on a probability space (Ω, F, P ) with
filtration {Ft } satisfying the usual conditions (see, for example, Protter (1990)).

dXt = µ (Xt , θ) dt + σ (Xt , θ) dWt + Jt dNt (2.1)

where $X_t$ is an $n$-dimensional state vector and $W_t$ is a standard $d$-dimensional Brownian motion.
$\theta \in \mathbb{R}^p$ is a finite dimensional parameter to be estimated. $\mu(\cdot,\theta): \mathbb{R}^n \to \mathbb{R}^n$ is the drift function and
$\sigma(\cdot,\theta): \mathbb{R}^n \to \mathbb{R}^{n \times d}$ is the diffusion function. The pure jump process $N$ has stochastic intensity
$\lambda(X_t,\theta)$ (see footnote 4) and jump size 1. The jump size $J_t$ is independent of $\mathcal{F}_{t-}$ and has probability density
$\nu(\cdot,\theta): \mathbb{R}^n \to \mathbb{R}$ with support $C \subset \mathbb{R}^n$. In this section, we only consider the case where $C$ has a
non-empty interior in $\mathbb{R}^n$ (see footnote 5). Let $V(x,\theta) \equiv \sigma(x,\theta)\,\sigma(x,\theta)^T$ be the variance matrix, where
$\sigma(x,\theta)^T$ denotes the matrix transposition of $\sigma(x,\theta)$.
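
To make the roles of the drift, diffusion, intensity and jump-size density in (2.1) concrete, the following is a minimal Python sketch of an Euler discretization of a univariate version of the process. It is an illustration only and is not part of the estimation method of this paper. The function name `simulate_jump_diffusion`, the step size and all parameter values are assumptions chosen for the example, and the scheme allows at most one jump per step, which is only reasonable when the intensity times the step size is small.

```python
import math
import random


def simulate_jump_diffusion(x0, mu, sigma, lam, jump_sampler, dt, n_steps, rng):
    """Euler scheme for dX = mu(X) dt + sigma(X) dW + J dN (univariate sketch).

    mu, sigma, lam are functions of the state; jump_sampler(rng) draws one jump size.
    """
    path = [x0]
    x = x0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        # Over a short step, a jump arrives with probability close to lam(x) * dt.
        jump = jump_sampler(rng) if rng.random() < lam(x) * dt else 0.0
        x = x + mu(x) * dt + sigma(x) * dw + jump
        path.append(x)
    return path


# Hypothetical parameter values, chosen only for illustration.
rng = random.Random(0)
path = simulate_jump_diffusion(
    x0=0.0,
    mu=lambda x: -0.5 * x,
    sigma=lambda x: 0.2,
    lam=lambda x: 1.0 / 3.0,
    jump_sampler=lambda r: r.gauss(0.0, 0.2),
    dt=1.0 / 252.0,
    n_steps=252,
    rng=rng,
)
```

The returned list contains the starting value followed by one simulated state per step.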
The other way to look at $X$ is that it is a Markov process with infinitesimal generator $A^B$ (see footnote 6),
defined on bounded $C^2$ functions $f$ on $\mathbb{R}^n$ with bounded first and second derivatives, given by

$$A^B f(x) = \sum_{i=1}^{n} \mu_i(x,\theta)\frac{\partial}{\partial x_i} f(x) + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} v_{ij}(x,\theta)\frac{\partial^2}{\partial x_i \partial x_j} f(x) + \lambda(x,\theta)\int_C \left[f(x+c) - f(x)\right]\nu(c,\theta)\,dc$$

where $\mu_i(x,\theta)$, $v_{ij}(x,\theta)$ and $x_i$ are, respectively, elements of $\mu(x,\theta)$, $V(x,\theta)$ and $x$.

Definition 1. The transition probability density p (∆, y|x, θ), when it exists, is the conditional density
of Xt+∆ = y ∈ Rn given Xt = x ∈ Rn

To save notation, the dependence on θ of the functions µ (., θ), σ (., θ), V (., θ), λ (., θ) , ν (., θ)
and p (∆, y|x, θ) will not be made explicit when there is no confusion.

Assumption 1. The variance matrix V (x) is positive definite for all x in the domain of the process
X.

This is a nondegeneracy condition. Given this assumption, we can find, by Cholesky decomposition,
an $n \times n$ positive definite matrix $V^{1/2}(x)$ satisfying $V(x) = V^{1/2}(x)\left[V^{1/2}(x)\right]^T$.
To carry out likelihood-based estimation, we make the following assumption.

Assumption 2. The stochastic differential equation (2.1) has a unique solution. The transition
density p (∆, y|x) exists for all x, y in the domain of X and all ∆ > 0. p (∆, y|x) is continuously
differentiable with respect to ∆, twice continuously differentiable with respect to x and y.
4 See pages 27-28 of Brémaud (1981) for the definition of stochastic intensity.
5 This implies, when N jumps, all the state variables can jump. Cases where some state variables do not jump,
together with some other extensions, are considered in Appendix 2.
6 See, for example, Revuz and Yor (1999) for details on infinitesimal generators.

When the jump intensity is constant, the following lemma provides sufficient conditions for as-
sumption 2 to hold.

Lemma 1. Under the following conditions, the stochastic differential equation (2.1) has a unique
solution. The transition density p (∆, y|x) exists for all x, y in the domain of X and all ∆ > 0.
Further, p (∆, y|x) is continuously differentiable with respect to ∆ and twice continuously
differentiable with respect to x and y:

1. The jump intensity λ (.) is constant;

2. µ (.), σ (.) are infinitely continuously differentiable with bounded derivatives;

3. The eigenvalues of V (.) are bounded below by a positive constant;

4. The jump distribution of Jt has moments of all orders.

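Proposition 1. Under assumption 2, the transition density satisfies the backward and forward
Kolmogorov equations given by

$$\frac{\partial}{\partial \Delta} p(\Delta, y|x) = A^B p(\Delta, y|x) \tag{2.2}$$

$$\frac{\partial}{\partial \Delta} p(\Delta, y|x) = A^F p(\Delta, y|x) \tag{2.3}$$

(2.2) and (2.3) are, respectively, the backward and forward equations. The infinitesimal generators
$A^B$ and $A^F$ are defined as

$$A^B p(\Delta, y|x) = L^B p(\Delta, y|x) + \lambda(x)\int_C \left[p(\Delta, y|x+c) - p(\Delta, y|x)\right]\nu(c)\,dc$$

$$A^F p(\Delta, y|x) = L^F p(\Delta, y|x) + \int_C \left[\lambda(y-c)\,p(\Delta, y-c|x) - \lambda(y)\,p(\Delta, y|x)\right]\nu(c)\,dc$$

The operators $L^B$ and $L^F$ are given by

$$L^B p(\Delta, y|x) = \sum_{i=1}^{n}\mu_i(x)\frac{\partial}{\partial x_i}p(\Delta, y|x) + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} v_{ij}(x)\frac{\partial^2}{\partial x_i \partial x_j}p(\Delta, y|x) \tag{2.4}$$

$$L^F p(\Delta, y|x) = -\sum_{i=1}^{n}\frac{\partial}{\partial y_i}\left[\mu_i(y)\,p(\Delta, y|x)\right] + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\partial^2}{\partial y_i \partial y_j}\left[v_{ij}(y)\,p(\Delta, y|x)\right]$$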

Assumption 3. The boundary of the process X is unattainable.

This assumption implies that the transition density p (∆, y|x) is uniquely determined by the
backward and forward Kolmogorov equations.
Anticipating the use of asymptotic expansion later, we make a smoothness assumption.

Assumption 4. ν (.), µ (.), σ (.) and λ (.) are infinitely differentiable almost everywhere in the
domain of X.

This assumption is stronger than is actually needed. It suffices for ν (.), µ (.), σ (.) and λ (.) to be
continuously differentiable up to the orders that show up in the expansion in Section 2.4.

2.2 Method of approximating the transition density


In this paper, we will find a closed-form approximation to $p(\Delta, y|x)$ using the backward and forward
equations. Specifically, we conjecture

$$p(\Delta, y|x) = \Delta^{-n/2}\exp\left[-\frac{C^{(-1)}(x,y)}{\Delta}\right]\sum_{k=0}^{\infty} C^{(k)}(x,y)\,\Delta^k + \sum_{k=1}^{\infty} D^{(k)}(x,y)\,\Delta^k \tag{2.5}$$

for some functions $C^{(k)}(x,y)$ and $D^{(k)}(x,y)$ to be determined. We then plug (2.5) into the backward
and forward equations, match the terms with the same orders of $\Delta$ or $\Delta\exp\left[-C^{(-1)}(x,y)/\Delta\right]$, set their
coefficients to 0 and solve for $C^{(k)}(x,y)$ and $D^{(k)}(x,y)$. An approximation of order $m > 0$ is obtained

$$p^{(m)}(\Delta, y|x) = \Delta^{-n/2}\exp\left[-\frac{C^{(-1)}(x,y)}{\Delta}\right]\sum_{k=0}^{m} C^{(k)}(x,y)\,\Delta^k + \sum_{k=1}^{m} D^{(k)}(x,y)\,\Delta^k \tag{2.6}$$

by ignoring terms of higher orders.


Even though the true transition density is uniquely determined by either one of the Kolmogorov
equations, to pin down the approximate transition density, we need both the forward and backward
equations, together with the following two conditions, which guarantee that the approximate density
converges to a Dirac delta function at y = x when ∆ → 0, a property of the true transition density.

Condition 1. $C^{(-1)}(x,y) = 0$ if and only if $x = y$.

Condition 2. $C^{(0)}(x,x) = (2\pi)^{-n/2}\left[\det V(x)\right]^{-1/2}$.

When ∆ → 0, condition 1 guarantees the approximate density peaks at x and condition 2 is
derived by requiring the approximate density integrates to one with respect to y as ∆ becomes small
(see Appendix 1 for proof).
To see why both the backward and forward equations are needed to compute the approximate
density, notice the backward equation is a differential equation with respect to x and its solution,
coefficient C (0) (x, y) in particular, is only determined up to a multiplicative function of y. Similarly,
the solution of the forward equation is determined up to a multiplicative function of x. By comparing
the solutions, the two multiplicative factors can be computed up to a multiplicative constant which
is further pinned down by condition 2.
The approximation is a small-∆ approximation in the sense that it is not designed to improve by
adding more and more correction terms for a fixed ∆. Instead, for a fixed number of
terms, the approximation gets better as ∆ gets smaller, much like approximating a function of ∆ by
a Taylor expansion around 0.
As will be shown in Section 2.3, intuitively, $\Delta^{-n/2}\exp\left[-\frac{C^{(-1)}(x,y)}{\Delta}\right]\sum_{k=0}^{m} C^{(k)}(x,y)\,\Delta^k$ captures
the behavior of $p(\Delta, y|x)$ at $y$ near $x$ and $\sum_{k=1}^{m} D^{(k)}(x,y)\,\Delta^k$ captures the tail behavior of $p(\Delta, y|x)$.
$\sum_{k=1}^{m} D^{(k)}(x,y)\,\Delta^k$ starts with $k = 1$ because, with Poisson-driven jumps, $p(\Delta, y|x)$ has a tail of
order $O(\Delta)$.

2.3 Why is the approximate density p(m) (∆, y|x) chosen this way?
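Before calculating $C^{(k)}(x,y)$ and $D^{(k)}(x,y)$, we discuss why (2.5) is the right form of approximation
to consider. Let $A_{t,\Delta}$ denote the event that no jump happens from time $t$ to $t + \Delta$, and let
$A^c_{t,\Delta}$ denote its complement. Then

$$p(\Delta, y|x) = \Pr\left(A_{t,\Delta}\,|\,X_t = x\right)\mathrm{pdf}\left(X_{t+\Delta} = y\,|\,X_t = x, A_{t,\Delta}\right) + \Pr\left(A^c_{t,\Delta}\,|\,X_t = x\right)\mathrm{pdf}\left(X_{t+\Delta} = y\,|\,X_t = x, A^c_{t,\Delta}\right) \tag{2.7}$$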

The Poisson arrival rate implies the second term is $O(\Delta)$ as $\Delta \to 0$. Assumption 4 further
implies this term admits a Taylor expansion at $\Delta = 0$. This gives the second term in (2.5).
Conditioning on no jumps, $\mathrm{pdf}(X_{t+\Delta} = y\,|\,X_t = x, A_{t,\Delta})$ is the transition density for a diffusion.
As shown in Varadhan (1967, Theorem 2.2) (see also Section 5.1 in Aït-Sahalia 2002),

$$\lim_{\Delta \to 0}\left[-2\Delta \log \mathrm{pdf}\left(X_{t+\Delta} = y\,|\,X_t = x, A_{t,\Delta}\right)\right] = d^2(x,y)$$

for some function $d^2(x,y)$ (see footnote 7).
Letting $C^{(-1)}(x,y) = \frac{1}{2}d^2(x,y)$, we obtain the leading term $\exp\left[-\frac{C^{(-1)}(x,y)}{\Delta}\right]$ in (2.5). It is
pre-multiplied by $\Delta^{-n/2}$ because, as $\Delta \to 0$, the density at $y = x$ goes to infinity at the same speed as a
standard $n$-dimensional normal density with variance of order $O(\Delta)$ in light of the driving Brownian
motion. $\sum_{k=0}^{m} C^{(k)}(x,y)\,\Delta^k$ corrects for the fact that $\Delta$ is not 0.

In the fortunate case that pdf(Xt+∆ = y|Xt = x, At,∆ ) is explicitly known, no expansion is needed
for pdf(Xt+∆ = y|Xt = x, At,∆ ) and we can obtain a refined approximation as in Section 2.5.

2.4 Closed-form expression of the approximate transition density


Theorems 1 and 2 in this section give a set of restrictions on $C^{(k)}(x,y)$ and $D^{(k)}(x,y)$ imposed by
the forward and backward equations which, together with conditions 1 and 2, can be used to solve
for the approximate transition density (2.6). Corollaries 1 and 2 give explicit forms of $C^{(k)}(x,y)$ and
$D^{(k)}(x,y)$ in the univariate case. Remark 1 clarifies how theorems 1 and 2 are used to compute the
approximate density.
To simplify the expressions, we introduce some notation. Let $S^n_r$ denote the set of $n$-tuples of
non-negative even integers $(s_1, s_2, ..., s_n)$ with the property that $|(s_1, s_2, ..., s_n)| \equiv s_1 + s_2 + ... + s_n = r$.
For $s \in S^n_r$ and $x \in \mathbb{R}^n$, $x^s \equiv x_1^{s_1} x_2^{s_2} \cdots x_n^{s_n}$ and differentiation of a function $q(\cdot): \mathbb{R}^n \to \mathbb{R}$
is denoted

$$\frac{\partial^s}{\partial x^s} q(x) \equiv \frac{\partial^{|s|}}{\partial x_1^{s_1} \cdots \partial x_n^{s_n}} q(x).$$

Let $M^n_s$ denote the $s$-th moment of the $n$-dimensional standard normal distribution, given by

$$M^n_s \equiv (2\pi)^{-n/2}\int_{\mathbb{R}^n}\exp\left[-\frac{w^T w}{2}\right] w^s\,dw.$$

$\det(\cdot)$ is the determinant of a square matrix. For ease of notation, we sometimes use $C^{(k)}$ and $D^{(k)}$ for
$C^{(k)}(x,y)$ and $D^{(k)}(x,y)$.

7 $d^2(x,y)$ is the square of the shortest distance from $x$ to $y$ measured by the Riemannian metric defined locally as
$ds^2 = dx^T \cdot V^{-1}(x) \cdot dx$, where $V^{-1}(x)$ is the matrix inverse of $V(x)$ and $dx$ is the vector $(dx_1, dx_2, ..., dx_n)^T$.

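Theorem 1. The backward equation imposes the following restrictions,

$$0 = C^{(-1)}(x,y) - \frac{1}{2}\left[\frac{\partial}{\partial x}C^{(-1)}(x,y)\right]^T V(x)\left[\frac{\partial}{\partial x}C^{(-1)}(x,y)\right]$$

$$0 = C^{(0)}\left[L^B C^{(-1)} - \frac{n}{2}\right] + \left[\frac{\partial}{\partial x}C^{(-1)}(x,y)\right]^T V(x)\left[\frac{\partial}{\partial x}C^{(0)}(x,y)\right]$$

$$0 = C^{(k+1)}\left[L^B C^{(-1)} + (k+1) - \frac{n}{2}\right] + \left[\frac{\partial}{\partial x}C^{(-1)}(x,y)\right]^T V(x)\left[\frac{\partial}{\partial x}C^{(k+1)}(x,y)\right] + \left[\lambda(x) - L^B\right]C^{(k)} \quad \text{for nonnegative } k$$

$$0 = D^{(1)} - \lambda(x)\,\nu(y-x)$$

$$0 = D^{(k+1)} - \frac{1}{1+k}\left[A^B D^{(k)} + (2\pi)^{n/2}\lambda(x)\sum_{r=0}^{k}\frac{1}{(2r)!}\sum_{s \in S^n_{2r}} M^n_s\,\frac{\partial^s}{\partial w^s}g_{k-r}(x,y,w)\Big|_{w=0}\right] \quad \text{for } k > 0$$

where

$$g_k(x,y,w) \equiv \frac{C^{(k)}\left(w_B^{-1}(w), y\right)\nu\left(w_B^{-1}(w) - x\right)}{\left|\det \frac{\partial}{\partial x^T}w_B(x,y)\Big|_{x = w_B^{-1}(w)}\right|}, \qquad w_B(x,y) \equiv \left[V^{1/2}(x)\right]^T\frac{\partial}{\partial x}C^{(-1)}(x,y).$$

Fixing $y$, $w_B(\cdot, y)$ is invertible in a neighborhood of $x = y$ and $w_B^{-1}(\cdot)$ is its inverse function in this
neighborhood. (For ease of notation, the dependence of $w_B^{-1}(\cdot)$ on $y$ is not made explicit henceforward.)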

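Theorem 2. The forward equation imposes the following restrictions,

$$0 = C^{(-1)}(x,y) - \frac{1}{2}\left[\frac{\partial}{\partial y}C^{(-1)}(x,y)\right]^T V(y)\left[\frac{\partial}{\partial y}C^{(-1)}(x,y)\right]$$

$$0 = C^{(0)}\left[-\sum_{i=1}^{n}\mu_i(y)\frac{\partial}{\partial y_i}C^{(-1)} + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}H_{ij}(x,y) - \frac{n}{2}\right] + \left[\frac{\partial}{\partial y}C^{(-1)}(x,y)\right]^T V(y)\left[\frac{\partial}{\partial y}C^{(0)}(x,y)\right]$$

$$0 = C^{(k+1)}\left[-\sum_{i=1}^{n}\mu_i(y)\frac{\partial}{\partial y_i}C^{(-1)} + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}H_{ij}(x,y) - \frac{n}{2} + (k+1)\right] + \left[\frac{\partial}{\partial y}C^{(-1)}(x,y)\right]^T V(y)\left[\frac{\partial}{\partial y}C^{(k+1)}(x,y)\right] + \left[\lambda(y) - L^F\right]C^{(k)} \quad \text{for nonnegative } k$$

$$0 = D^{(1)} - \lambda(x)\,\nu(y-x)$$

$$0 = D^{(k+1)} - \frac{1}{1+k}\left[A^F D^{(k)} + (2\pi)^{n/2}\sum_{r=0}^{k}\frac{1}{(2r)!}\sum_{s \in S^n_{2r}} M^n_s\,\frac{\partial^s}{\partial w^s}\left[\lambda\left(w_F^{-1}(w)\right)h_{k-r}(x,y,w)\right]\Big|_{w=0}\right] \quad \text{for } k \text{ positive}$$

where

$$H_{ij}(x,y) \equiv \left[\frac{\partial}{\partial y_i}v_{ij}(y)\right]\left[\frac{\partial}{\partial y_j}C^{(-1)}(x,y)\right] + \left[\frac{\partial}{\partial y_j}v_{ij}(y)\right]\left[\frac{\partial}{\partial y_i}C^{(-1)}(x,y)\right] + v_{ij}(y)\frac{\partial^2}{\partial y_i \partial y_j}C^{(-1)}(x,y),$$

$$h_k(x,y,w) \equiv \frac{C^{(k)}\left(x, w_F^{-1}(w)\right)\nu\left(y - w_F^{-1}(w)\right)}{\left|\det \frac{\partial}{\partial y^T}w_F(x,y)\Big|_{y = w_F^{-1}(w)}\right|}, \qquad w_F(x,y) \equiv \left[V^{1/2}(y)\right]^T\frac{\partial}{\partial y}C^{(-1)}(x,y).$$

Fixing $x$, $w_F(x, \cdot)$ is invertible in a neighborhood of $y = x$ and $w_F^{-1}(\cdot)$ is its inverse function in this
neighborhood. (For ease of notation, the dependence of $w_F^{-1}(\cdot)$ on $x$ is not made explicit henceforward.)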

To simplify the expression, we abused the notation for $L^B$ a little. $L^B C^{(-1)}(x,y)$ is defined the
same way as in (2.4) with $C^{(-1)}(x,y)$ replacing $p(\Delta, y|x)$. Similar use extends to $L^F$, $A^B$ and $A^F$.
The first equation in either theorem characterizes $C^{(-1)}(x,y)$. Knowing $C^{(-1)}(x,y)$, the second
equation can be solved for $C^{(0)}(x,y)$. $C^{(k+1)}(x,y)$ is then solved recursively through the third
equation. The last two equations give $D^{(k)}(x,y)$, which does not require solving differential equations.
In the univariate case, $n = 1$, the functions $C^{(k)}(x,y)$ and $D^{(k)}(x,y)$ in the previous theorems
can be solved explicitly.

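Corollary 1. Univariate case. From the backward equation,

$$C^{(-1)}(x,y) = \frac{1}{2}\left[\int_x^y \sigma(s)^{-1}\,ds\right]^2$$

$$C^{(0)}(x,y) = \frac{1}{\sqrt{2\pi}\,\sigma(y)}\exp\left[\int_x^y\left(\frac{\mu(s)}{\sigma^2(s)} - \frac{\sigma'(s)}{2\sigma(s)}\right)ds\right]$$

$$C^{(k+1)}(x,y) = -\left[\int_x^y \sigma(s)^{-1}\,ds\right]^{-(k+1)}\int_x^y\left\{\exp\left[\int_s^x\left(\frac{\sigma'(u)}{2\sigma(u)} - \frac{\mu(u)}{\sigma^2(u)}\right)du\right]\sigma(s)^{-1}\left[\int_s^y \sigma(u)^{-1}\,du\right]^k\left[\lambda(s) - L^B\right]C^{(k)}(s,y)\right\}ds \quad \text{for } k \geq 0$$

$$D^{(1)}(x,y) = \lambda(x)\,\nu(y-x)$$

$$D^{(k+1)}(x,y) = \frac{1}{1+k}\left[A^B D^{(k)}(x,y) + \sqrt{2\pi}\,\lambda(x)\sum_{r=0}^{k}\frac{M^1_{2r}}{(2r)!}\frac{\partial^{2r}}{\partial w^{2r}}g_{k-r}(x,y,w)\Big|_{w=0}\right] \quad \text{for } k > 0$$

where $g_k(x,y,w) \equiv C^{(k)}\left(w_B^{-1}(w), y\right)\nu\left(w_B^{-1}(w) - x\right)\sigma\left(w_B^{-1}(w)\right)$, $M^1_{2r} \equiv \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\exp\left(-\frac{s^2}{2}\right)s^{2r}\,ds$
and $w_B(x,y) = \int_y^x \sigma(s)^{-1}\,ds$.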
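
As a rough numerical illustration of the first two formulas in Corollary 1, the Python sketch below evaluates $C^{(-1)}(x,y)$ and $C^{(0)}(x,y)$ by quadrature for user-supplied drift and diffusion functions. The helper names `simpson`, `c_minus1` and `c_zero`, the composite Simpson rule and the parameter values are assumptions made for this example; they are not taken from the paper. With constant coefficients the output can be compared with the Brownian-motion-with-drift expressions of Section 2.6.2.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule for the signed integral of f from a to b (n even)."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def c_minus1(x, y, sigma):
    """C^(-1)(x, y) = 0.5 * (integral from x to y of 1/sigma(s) ds)^2."""
    return 0.5 * simpson(lambda s: 1.0 / sigma(s), x, y) ** 2


def c_zero(x, y, mu, sigma, dsigma):
    """C^(0)(x, y) from the backward equation; dsigma is the derivative of sigma."""
    integrand = lambda s: mu(s) / sigma(s) ** 2 - dsigma(s) / (2.0 * sigma(s))
    return math.exp(simpson(integrand, x, y)) / (math.sqrt(2.0 * math.pi) * sigma(y))


# Hypothetical constant coefficients: Brownian motion with drift.
x, y, m, s = 0.0, 0.3, 0.1, 0.2
cm1 = c_minus1(x, y, lambda u: s)
c0 = c_zero(x, y, lambda u: m, lambda u: s, lambda u: 0.0)
cm1_closed = (y - x) ** 2 / (2.0 * s ** 2)
c0_closed = math.exp(m * (y - x) / s ** 2) / math.sqrt(2.0 * math.pi * s ** 2)
```

For state-dependent coefficients the same two functions apply unchanged, as long as `sigma` stays strictly positive on the integration range.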

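Corollary 2. Univariate case. From the forward equation,

$$C^{(-1)}(x,y) = \frac{1}{2}\left[\int_x^y \sigma(s)^{-1}\,ds\right]^2$$

$$C^{(0)}(x,y) = \frac{1}{\sqrt{2\pi}\,\sigma(x)}\exp\left[\int_x^y\left(\frac{\mu(s)}{\sigma^2(s)} - \frac{3\sigma'(s)}{2\sigma(s)}\right)ds\right]$$

$$C^{(k+1)}(x,y) = -\left[\int_x^y \sigma(s)^{-1}\,ds\right]^{-(k+1)}\int_x^y\left\{\exp\left[\int_y^s\left(\frac{3\sigma'(u)}{2\sigma(u)} - \frac{\mu(u)}{\sigma^2(u)}\right)du\right]\sigma(s)^{-1}\left[\int_x^s \sigma(u)^{-1}\,du\right]^k\left[\lambda(s) - L^F\right]C^{(k)}(x,s)\right\}ds \quad \text{for } k \geq 0$$

$$D^{(1)}(x,y) = \lambda(x)\,\nu(y-x)$$

$$D^{(k+1)}(x,y) = \frac{1}{1+k}\left[A^F D^{(k)}(x,y) + \sqrt{2\pi}\sum_{r=0}^{k}\frac{M^1_{2r}}{(2r)!}\frac{\partial^{2r}}{\partial w^{2r}}\left[\lambda\left(w_F^{-1}(w)\right)h_{k-r}(x,y,w)\right]\Big|_{w=0}\right] \quad \text{for } k > 0$$

where $h_k(x,y,w) \equiv C^{(k)}\left(x, w_F^{-1}(w)\right)\nu\left(y - w_F^{-1}(w)\right)\sigma\left(w_F^{-1}(w)\right)$, $M^1_{2r} \equiv \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\exp\left(-\frac{s^2}{2}\right)s^{2r}\,ds$
and $w_F(x,y) = \int_x^y \sigma(s)^{-1}\,ds$.
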
Remark 1. As discussed after condition 2, we need the second equation in both theorem 1 and
theorem 2 to compute $C^{(0)}(x,y)$. Once $C^{(0)}(x,y)$ is known, either one of the theorems can produce
an approximate transition density on its own which coincides with the other because the true transition
density (hence its expansion) is uniquely pinned down by either the backward or the forward equation.
The results in corollary 1 and corollary 2 are solved in this way.

As we have seen, either theorem 1 or theorem 2 can deliver the approximate transition density
once $C^{(0)}(x,y)$ is computed. The choice between them largely depends on computational ease.
For example, it is easier to use the results from the forward equation to verify that, in the
univariate case without jumps, the approximate transition density obtained here coincides with that
in Aït-Sahalia (2002). Aït-Sahalia (2001) gives an approximation for the log-likelihood of multivariate
diffusions. Without jumps, the log of the approximate transition density obtained in this paper takes
the form

$$\log p^{(m)}(\Delta, y|x) = -\frac{n}{2}\log\Delta - \frac{C^{(-1)}(x,y)}{\Delta} + \log\sum_{k=0}^{m} C^{(k)}(x,y)\,\Delta^k$$

which coincides with the approximation in Aït-Sahalia (2001) when the last term is Taylor expanded
around ∆ = 0.

2.5 A case of refined approximation


We give a refined approximation in the special case when the process

$$d\hat{X}_t = \mu\left(\hat{X}_t\right)dt + \sigma\left(\hat{X}_t\right)dW_t$$

admits an explicitly known transition density. The process $\hat{X}$ has the same drift and diffusion
functions as the process $X$ we considered and the only difference is that $\hat{X}$ does not exhibit jumps.
Let $p_0(\Delta, y|x)$ denote the explicitly known transition density of $\hat{X}_{t+\Delta} = y$ given $\hat{X}_t = x$.
Let

$$p^{(m)}(\Delta, y|x) = \Delta^{-n/2}\exp\left[-\frac{C^{(-1)}(x,y)}{\Delta}\right]\sum_{k=0}^{m} C^{(k)}(x,y)\,\Delta^k + \sum_{k=1}^{m} D^{(k)}(x,y)\,\Delta^k$$

be the approximate transition density of $X$ computed in the previous section.

From equation (2.7),

$$p(\Delta, y|x) = p_0(\Delta, y|x)\Pr\left(A_{t,\Delta}\,|\,X_t = x\right) + \mathrm{pdf}\left(X_{t+\Delta} = y\,|\,X_t = x, A^c_{t,\Delta}\right)\Pr\left(A^c_{t,\Delta}\,|\,X_t = x\right)$$

To utilize the explicitly known form of $p_0(\Delta, y|x)$, we consider approximating the first term by

$$p_0(\Delta, y|x)\left(1 + \sum_{k=1}^{\infty} F^{(k)}(x,y)\,\Delta^k\right)$$

with the coefficient functions $F^{(k)}(x,y)$ to be determined. This is a refined approximation over
$p^{(m)}(\Delta, y|x)$ in that we only approximate $\Pr(A_{t,\Delta}\,|\,X_t = x)$ and use an exact form for $p_0(\Delta, y|x)$.
To compute $F^{(k)}(x,y)$, expand $p_0(\Delta, y|x)$ into

$$p_0(\Delta, y|x) = \Delta^{-n/2}\exp\left[-\frac{C^{(-1)}(x,y)}{\Delta}\right]\sum_{k=0}^{\infty} G^{(k)}(x,y)\,\Delta^k$$

for some functions $G^{(k)}(x,y)$ which are known since $p_0(\Delta, y|x)$ is explicitly known. To make
$p_0(\Delta, y|x)\left(1 + \sum_{k=1}^{\infty} F^{(k)}(x,y)\,\Delta^k\right)$ equal $\Delta^{-n/2}\exp\left[-\frac{C^{(-1)}(x,y)}{\Delta}\right]\sum_{k=0}^{m} C^{(k)}(x,y)\,\Delta^k$, it suffices for
$F^{(k)}(x,y)$ to satisfy

$$C^{(k)}(x,y) = \sum_{i=0}^{k} F^{(i)}(x,y)\,G^{(k-i)}(x,y) \quad \text{for } k > 0$$

with $F^{(0)} = 1$. $F^{(k)}(x,y)$ for $k = 1, 2, 3, ...$ can be solved iteratively. An $m$-th order refined
approximation is then given by

$$p_R^{(m)}(\Delta, y|x) = p_0(\Delta, y|x)\sum_{k=0}^{m} F^{(k)}(x,y)\,\Delta^k + \sum_{k=1}^{m} D^{(k)}(x,y)\,\Delta^k$$
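
The iterative solution for the refined coefficients can be sketched in a few lines of Python. The function below works pointwise, taking the values of $C^{(k)}$ and $G^{(k)}$ at one fixed pair $(x, y)$ as plain lists; the name `refined_coefficients` and the numerical inputs are hypothetical and serve only to illustrate the recursion. The inputs mimic a constant-coefficient jump-diffusion, for which the recursion should return a first-order coefficient equal to minus the jump intensity.

```python
def refined_coefficients(C, G):
    """Solve C[k] = sum_{i=0}^{k} F[i] * G[k - i] for F, starting from F[0] = 1.

    C and G hold the values of C^(k)(x, y) and G^(k)(x, y) at one fixed (x, y).
    """
    F = [1.0]
    for k in range(1, len(C)):
        # Contribution of the already known coefficients F[0], ..., F[k-1].
        known = sum(F[i] * G[k - i] for i in range(k))
        F.append((C[k] - known) / G[0])
    return F


# Hypothetical pointwise values (constant drift 0.1, volatility 0.2, intensity 1/3).
lam = 1.0 / 3.0
a = 0.1 ** 2 / (2.0 * 0.2 ** 2)   # mu^2 / (2 sigma^2)
c0 = 1.7                          # arbitrary positive value of C^(0) = G^(0)
C = [c0, -(a + lam) * c0]         # expansion coefficients with jumps
G = [c0, -a * c0]                 # expansion coefficients of the pure diffusion
F = refined_coefficients(C, G)
```

Each new coefficient only needs the earlier ones, so the cost grows quadratically in the order of the approximation.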

2.6 Examples
In this section, we use examples to give an intuition on how p(m) (∆, y|x) approximates the true
transition density p (∆, y|x).

2.6.1 Brownian motion

Consider the one dimensional Brownian motion

$$dX_t = \sigma\,dW_t$$

The true transition density is that of a normal distribution with mean $x$ and variance $\sigma^2\Delta$

$$p(\Delta, y|x) = \frac{1}{\sqrt{2\pi\sigma^2\Delta}}\exp\left[-\frac{(y-x)^2}{2\sigma^2\Delta}\right]$$

Using the result from the previous section, we can calculate

$$C^{(-1)}(x,y) = \frac{(y-x)^2}{2\sigma^2}$$

$$C^{(0)}(x,y) = \frac{1}{\sqrt{2\pi\sigma^2}}$$

$$C^{(k)}(x,y) = D^{(k)}(x,y) = 0 \quad \text{for } k \geq 1$$

Therefore, for any $m > 0$,

$$p^{(m)}(\Delta, y|x) = \frac{1}{\sqrt{2\pi\sigma^2\Delta}}\exp\left[-\frac{(y-x)^2}{2\sigma^2\Delta}\right]$$

which is exact for $p(\Delta, y|x)$.

2.6.2 Brownian motion with drift

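For the process

$$dX_t = \mu\,dt + \sigma\,dW_t$$

the true transition density is that of a normal distribution with mean $x + \mu\Delta$ and variance $\sigma^2\Delta$

$$p(\Delta, y|x) = \frac{1}{\sqrt{2\pi\sigma^2\Delta}}\exp\left[-\frac{(y-x-\mu\Delta)^2}{2\sigma^2\Delta}\right] = \frac{1}{\sqrt{2\pi\sigma^2\Delta}}\exp\left[-\frac{(y-x)^2}{2\sigma^2\Delta} + \frac{\mu}{\sigma^2}(y-x)\right]\exp\left(-\frac{\mu^2}{2\sigma^2}\Delta\right)$$

The functions $C^{(k)}(x,y)$ and $D^{(k)}(x,y)$ are

$$C^{(-1)}(x,y) = \frac{(y-x)^2}{2\sigma^2}$$

$$C^{(0)}(x,y) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[\frac{\mu}{\sigma^2}(y-x)\right]$$

$$C^{(1)}(x,y) = -C^{(0)}(x,y)\,\frac{\mu^2}{2\sigma^2}$$

$$D^{(k)}(x,y) = 0 \quad \text{for all } k$$

The approximate density $p^{(1)}(\Delta, y|x)$ is therefore

$$p^{(1)}(\Delta, y|x) = \frac{1}{\sqrt{2\pi\sigma^2\Delta}}\exp\left[-\frac{(y-x)^2}{2\sigma^2\Delta} + \frac{\mu}{\sigma^2}(y-x)\right]\left(1 - \frac{\mu^2}{2\sigma^2}\Delta\right)$$

It can be seen that $p^{(1)}(\Delta, y|x)$ approximates $p(\Delta, y|x)$ by replacing $\exp\left(-\frac{\mu^2}{2\sigma^2}\Delta\right)$ with its
first-order Taylor expansion $1 - \frac{\mu^2}{2\sigma^2}\Delta$.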
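
The size of the error in this example is easy to check numerically. The short Python sketch below evaluates the true normal density and the first-order approximation for Brownian motion with drift and forms their ratio, which by the formulas above equals $(1 - a\Delta)/\exp(-a\Delta)$ with $a = \mu^2/(2\sigma^2)$ and does not depend on $y$. The function names and the parameter values (weekly sampling, $\mu = 0.1$, $\sigma = 0.2$) are assumptions chosen for illustration.

```python
import math


def true_density(dt, y, x, mu, sigma):
    """Normal density with mean x + mu*dt and variance sigma^2 * dt."""
    var = sigma ** 2 * dt
    return math.exp(-(y - x - mu * dt) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def approx_density_order1(dt, y, x, mu, sigma):
    """First-order approximation p^(1) for Brownian motion with drift."""
    var = sigma ** 2 * dt
    lead = math.exp(-(y - x) ** 2 / (2.0 * var) + mu * (y - x) / sigma ** 2)
    lead /= math.sqrt(2.0 * math.pi * var)
    return lead * (1.0 - mu ** 2 / (2.0 * sigma ** 2) * dt)


# Hypothetical parameter values.
dt, x, mu, sigma = 1.0 / 52.0, 0.0, 0.1, 0.2
ratio_near = approx_density_order1(dt, 0.01, x, mu, sigma) / true_density(dt, 0.01, x, mu, sigma)
ratio_far = approx_density_order1(dt, 0.10, x, mu, sigma) / true_density(dt, 0.10, x, mu, sigma)
```

Both ratios are slightly below one, and the gap shrinks at the rate $\Delta^2$ as the sampling interval decreases.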

2.6.3 Jump-diffusion

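Consider the following univariate jump-diffusion

$$dX_t = \mu\,dt + \sigma\,dW_t + S_t\,dN_t \tag{2.8}$$

where $N_t$ is a Poisson process with constant arrival rate $\lambda$. The jump size $S_t$ is i.i.d. normal with
mean $\mu_S$ and variance $\sigma_S^2$.
The true transition density is

$$p(\Delta, y|x) = \sum_{j=0}^{\infty}\frac{e^{-\lambda\Delta}(\lambda\Delta)^j}{j!}\frac{1}{\sqrt{2\pi\left(\sigma^2\Delta + j\sigma_S^2\right)}}\exp\left[-\frac{(y-x-\mu\Delta-j\mu_S)^2}{2\left(\sigma^2\Delta + j\sigma_S^2\right)}\right]$$

because, conditioning on $j$ jumps happening, the transition density is normal with mean $x + \mu\Delta + j\mu_S$
and variance $\sigma^2\Delta + j\sigma_S^2$.
The functions $C^{(k)}(x,y)$ and $D^{(k)}(x,y)$ are

$$C^{(-1)}(x,y) = \frac{(y-x)^2}{2\sigma^2}$$

$$C^{(0)}(x,y) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[\frac{\mu}{\sigma^2}(y-x)\right]$$

$$C^{(1)}(x,y) = -C^{(0)}(x,y)\left(\frac{\mu^2}{2\sigma^2} + \lambda\right)$$

$$D^{(1)}(x,y) = \frac{\lambda}{\sqrt{2\pi\sigma_S^2}}\exp\left[-\frac{(y-x-\mu_S)^2}{2\sigma_S^2}\right]$$

The approximate density $p^{(1)}(\Delta, y|x)$ is therefore

$$p^{(1)}(\Delta, y|x) = \frac{1}{\sqrt{2\pi\sigma^2\Delta}}\exp\left[-\frac{(y-x)^2}{2\sigma^2\Delta} + \frac{\mu}{\sigma^2}(y-x)\right]\left[1 - \left(\frac{\mu^2}{2\sigma^2} + \lambda\right)\Delta\right] + \frac{\lambda}{\sqrt{2\pi\sigma_S^2}}\exp\left[-\frac{(y-x-\mu_S)^2}{2\sigma_S^2}\right]\Delta$$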

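Rewrite the true density $p(\Delta, y|x)$

$$\begin{aligned}
p(\Delta, y|x) ={}& \frac{1}{\sqrt{2\pi\sigma^2\Delta}}\exp\left[-\frac{(y-x)^2}{2\sigma^2\Delta} + \frac{\mu}{\sigma^2}(y-x)\right]\exp\left[-\left(\frac{\mu^2}{2\sigma^2} + \lambda\right)\Delta\right] \\
&+ \frac{e^{-\lambda\Delta}\lambda}{\sqrt{2\pi\left(\sigma^2\Delta + \sigma_S^2\right)}}\exp\left[-\frac{(y-x-\mu\Delta-\mu_S)^2}{2\left(\sigma^2\Delta + \sigma_S^2\right)}\right]\Delta \\
&+ \frac{e^{-\lambda\Delta}\lambda^2}{2\sqrt{2\pi\left(\sigma^2\Delta + 2\sigma_S^2\right)}}\exp\left[-\frac{(y-x-\mu\Delta-2\mu_S)^2}{2\left(\sigma^2\Delta + 2\sigma_S^2\right)}\right]\Delta^2 \\
&+ \sum_{j=3}^{\infty}\frac{e^{-\lambda\Delta}(\lambda\Delta)^j}{j!}\frac{1}{\sqrt{2\pi\left(\sigma^2\Delta + j\sigma_S^2\right)}}\exp\left[-\frac{(y-x-\mu\Delta-j\mu_S)^2}{2\left(\sigma^2\Delta + j\sigma_S^2\right)}\right]
\end{aligned} \tag{2.9}$$

where the first term corresponds to the event that no jump happened, and the second (third) term
corresponds to the event that exactly one (two) jump(s) happened. It is easy to see that $p^{(1)}(\Delta, y|x)$
approximates $p(\Delta, y|x)$ by approximating $\exp\left[-\left(\frac{\mu^2}{2\sigma^2} + \lambda\right)\Delta\right]$ with its first-order Taylor expansion
at $\Delta = 0$, by approximating $\frac{e^{-\lambda\Delta}\lambda}{\sqrt{2\pi\left(\sigma^2\Delta + \sigma_S^2\right)}}\exp\left[-\frac{(y-x-\mu\Delta-\mu_S)^2}{2\left(\sigma^2\Delta + \sigma_S^2\right)}\right]$ with its limit at $\Delta = 0$ and by
ignoring the terms corresponding to at least two jumps, which are of order $\Delta^2$.
Similarly, the higher order approximation $p^{(2)}(\Delta, y|x)$, whose expression is not detailed here,
approximates $p(\Delta, y|x)$ in (2.9) by Taylor expanding $\exp\left[-\left(\frac{\mu^2}{2\sigma^2} + \lambda\right)\Delta\right]$ to the second order, by
approximating $\frac{e^{-\lambda\Delta}\lambda}{\sqrt{2\pi\left(\sigma^2\Delta + \sigma_S^2\right)}}\exp\left[-\frac{(y-x-\mu\Delta-\mu_S)^2}{2\left(\sigma^2\Delta + \sigma_S^2\right)}\right]$ with its first order Taylor expansion, by
approximating $\frac{e^{-\lambda\Delta}\lambda^2}{2\sqrt{2\pi\left(\sigma^2\Delta + 2\sigma_S^2\right)}}\exp\left[-\frac{(y-x-\mu\Delta-2\mu_S)^2}{2\left(\sigma^2\Delta + 2\sigma_S^2\right)}\right]$ with its limit at $\Delta = 0$ and by ignoring the terms
corresponding to at least three jumps, which are of order $\Delta^3$.
Without the jumps, $X$ becomes a diffusion with constant drift and diffusion coefficient whose
transition density is explicitly known, and therefore the refined approximation discussed in Section
2.5 applies. The refined approximation to the first order is

$$p_R^{(1)}(\Delta, y|x) = \frac{1}{\sqrt{2\pi\sigma^2\Delta}}\exp\left[-\frac{(y-x-\mu\Delta)^2}{2\sigma^2\Delta}\right](1 - \lambda\Delta) + \frac{\lambda}{\sqrt{2\pi\sigma_S^2}}\exp\left[-\frac{(y-x-\mu_S)^2}{2\sigma_S^2}\right]\Delta$$

Compared with $p^{(1)}(\Delta, y|x)$, $p_R^{(1)}(\Delta, y|x)$ improves the approximation by retaining the term
$\exp\left(-\frac{\mu^2}{2\sigma^2}\Delta\right)$ instead of approximating it with its first-order expansion.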
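
The three densities of this example can be compared numerically. The following Python sketch evaluates the series for the true density (truncated after a fixed number of terms), the first-order approximation $p^{(1)}$ and the refined approximation $p_R^{(1)}$ at a single point. The function names, the truncation level and the parameter values are assumptions made for illustration only; they are not the values used elsewhere in the paper.

```python
import math


def normal_pdf(z, mean, var):
    return math.exp(-(z - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def true_density(dt, y, x, mu, sigma, lam, mu_s, sigma_s, n_terms=60):
    """Poisson mixture of normals, truncated after n_terms jump counts."""
    total = 0.0
    weight = math.exp(-lam * dt)  # probability of j = 0 jumps
    for j in range(n_terms):
        total += weight * normal_pdf(y, x + mu * dt + j * mu_s, sigma ** 2 * dt + j * sigma_s ** 2)
        weight *= lam * dt / (j + 1)  # move to the probability of j + 1 jumps
    return total


def approx_order1(dt, y, x, mu, sigma, lam, mu_s, sigma_s):
    """First-order approximation p^(1) of Section 2.6.3."""
    var = sigma ** 2 * dt
    lead = math.exp(-(y - x) ** 2 / (2.0 * var) + mu * (y - x) / sigma ** 2)
    lead /= math.sqrt(2.0 * math.pi * var)
    d1 = lam * normal_pdf(y, x + mu_s, sigma_s ** 2)
    return lead * (1.0 - (mu ** 2 / (2.0 * sigma ** 2) + lam) * dt) + d1 * dt


def refined_order1(dt, y, x, mu, sigma, lam, mu_s, sigma_s):
    """Refined first-order approximation p_R^(1): exact diffusion part times (1 - lam*dt)."""
    p0 = normal_pdf(y, x + mu * dt, sigma ** 2 * dt)
    d1 = lam * normal_pdf(y, x + mu_s, sigma_s ** 2)
    return p0 * (1.0 - lam * dt) + d1 * dt


# Hypothetical parameter values with weekly sampling.
args = dict(dt=1.0 / 52.0, y=0.05, x=0.0, mu=0.1, sigma=0.2, lam=1.0 / 3.0, mu_s=0.0, sigma_s=0.2)
p_true = true_density(**args)
rel_err_1 = abs(approx_order1(**args) - p_true) / p_true
rel_err_r = abs(refined_order1(**args) - p_true) / p_true
```

Re-running the comparison with a smaller `dt` gives a direct view of the small-∆ nature of the approximation.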

2.6.4 A Numerical Example

Consider the Ornstein-Uhlenbeck process with jump

dXt = −kXt dt + σdWt + St dNt

where k, σ > 0, N is a standard Poisson process with constant intensity λ. St is i.i.d. and has double
exponential distribution with mean 0 and standard deviation σ S , which has a fatter tail than normal
distribution.
X is an affine process whose characteristic function is known8 , see Duffie et al (2000). An ap-
proximate transition density, pF F T (∆, X∆ |X0 ), can be obtained via fast Fourier transform. Treating
pF F T (∆, X∆ |X0 ) as the “true” transition density, we now investigate, using numerical examples9 ,
the accuracy of the closed-form likelihood approximation and how the accuracy varies with ∆ and
the order of the approximation.
The first and the second graph in figure 1 show the effect of adding correction terms. The first
graph uses first order approximation while the second uses second order approximation. Both graph
   2k
   (e−2k∆ −1)u2 σ2 1+e−2k∆ u2 σ 2
λ
8
E ei uX∆  X0 = x = exp i uxe−k∆ + 4k 1+u2 σ 2
S
S
9
The numbers used are k = 0.5, σ = 0.2, λ = 1/3, µS = 0, σS = 0.2, X0 = 0.

13
log p(1) − log pF F T , weekly log p(2) − log pF F T , weekly

log p(2) − log pF F T , daily


Figure 1: Accuracy of Likelihood Approximations

correspond to weekly sampling. The accuracy of the approximation increases rapidly with additional
terms. For weekly sampling, the standard deviation of X∆ |X0 = 0 is 0.03. The first two graphs
in fact show the likelihood approximation is good from negative twenty to positive twenty standard
deviations. That the approximation is good in the large deviation area is useful in case a rare event
occurs in the observations.
The second and third graphs in figure 1 show the effect of shrinking the sampling interval. The third graph uses a daily sampling interval as opposed to the weekly sampling in the second graph. Both graphs use the second-order approximation. The approximation improves quickly as a smaller sampling interval is used, in line with our expectation.
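The benchmark density in this example can be sketched in a few lines. The sketch below is an illustration only: it takes the characteristic function exactly as printed in footnote 8 and the parameter values of footnote 9, and it inverts the transform by direct trapezoid quadrature rather than by an actual fast Fourier transform. The function names, the truncation point `u_max`, the grid size `n` and the choice of 1/52 for a weekly interval are all assumptions made here, not details from the paper.

```python
import cmath
import math

# Parameter values from footnote 9.
K, SIGMA, LAM, SIGMA_S = 0.5, 0.2, 1.0 / 3.0, 0.2


def char_fn(u, x0, dt):
    """Characteristic function of X_dt given X_0 = x0, as printed in footnote 8."""
    decay = math.exp(-K * dt)
    diffusion = 1j * u * x0 * decay + (decay**2 - 1.0) * u**2 * SIGMA**2 / (4.0 * K)
    jump = ((1.0 + decay**2 * u**2 * SIGMA_S**2)
            / (1.0 + u**2 * SIGMA_S**2)) ** (LAM / (2.0 * K))
    return cmath.exp(diffusion) * jump


def density(y, x0, dt, u_max=400.0, n=1000):
    """Invert the transform: p(y) = (1/pi) * int_0^inf Re[exp(-i u y) phi(u)] du.

    The integral is truncated at u_max (an assumed cutoff) and evaluated with
    the trapezoid rule on n equal sub-intervals.
    """
    du = u_max / n
    total = 0.0
    for j in range(n + 1):
        u = j * du
        weight = 0.5 if j in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * (cmath.exp(-1j * u * y) * char_fn(u, x0, dt)).real
    return total * du / math.pi
```

For example, `density(0.0, 0.0, 1.0 / 52.0)` evaluates the benchmark density at the starting point for a weekly interval; taking its logarithm gives the quantity that the closed-form approximations are compared against in figure 1.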

2.7 Calculate $C^{(-1)}(x, y)$

The equations characterizing $C^{(-1)}(x, y)$ in Section 2.4 are, in multivariate cases, PDEs which do not always have explicit solutions. In this section, we will discuss the conditions under which they can be solved explicitly. In the case they cannot be explicitly solved, we will discuss how to find an approximate solution with a second expansion.

2.7.1 The reducible case

Consider the equation characterizing $C^{(-1)}(x, y)$ in theorem 1.

$$C^{(-1)}(x, y) = \frac{1}{2}\left[\frac{\partial}{\partial x} C^{(-1)}(x, y)\right]^T V(x) \left[\frac{\partial}{\partial x} C^{(-1)}(x, y)\right]$$

Recall that $w_B(x, y)$ is defined as $w_B(x, y) \equiv \left[V^{1/2}(x)\right]^T \frac{\partial}{\partial x} C^{(-1)}(x, y)$ in theorem 1. We therefore have

$$C^{(-1)}(x, y) = \frac{1}{2} w_B^T(x, y)\, w_B(x, y)$$

$$\frac{\partial}{\partial x} C^{(-1)}(x, y) = \left[\frac{\partial}{\partial x} w_B^T(x, y)\right] w_B(x, y)$$

Therefore,

$$w_B(x, y) \equiv \left[V^{1/2}(x)\right]^T \left[\frac{\partial}{\partial x} w_B^T(x, y)\right] w_B(x, y) \tag{2.10}$$

Let $I_n$ be the $n$-dimensional identity matrix. It is easy to see that $\left[V^{1/2}(x)\right]^T \left[\frac{\partial}{\partial x} w_B^T(x, y)\right] = I_n$ characterizes a solution for $w_B(x, y)$ and hence a solution for $C^{(-1)}(x, y)$. Let $V^{-1/2}(x)$, whose element in the $i$-th row and $j$-th column is denoted $v_{ij}^{-1/2}(x)$, be the inverse matrix of $V^{1/2}(x)$. The condition for the existence of a vector function $w_B(x, y)$ whose first derivative $\frac{\partial}{\partial x^T} w_B(x, y) = V^{-1/2}(x)$ is, intuitively, that the second derivative matrix of each element of $w_B(x, y)$ is symmetric,

$$\frac{\partial}{\partial x_k} v_{ij}^{-1/2}(x) = \frac{\partial}{\partial x_j} v_{ik}^{-1/2}(x) \quad \text{for all } i, j, k = 1, ..., n \tag{2.11}$$

This is the reducibility condition given by Proposition 1 in Aït-Sahalia (2001). All the univariate cases are reducible. Therefore, it is not surprising that $C^{(-1)}(x, y)$ can be solved explicitly in the univariate case as in corollaries 1 and 2.
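As a quick check of the univariate claim, the following worked example assumes a scalar diffusion function $\sigma(x) > 0$ with $V(x) = \sigma^2(x)$ (this notation is introduced here for illustration; the exact statements are in the corollaries). With $n = 1$ the reducibility condition is trivially satisfied, and the solution normalised by $C^{(-1)}(x, x) = 0$ can be written down and verified directly.

```latex
% Scalar case, n = 1: condition (2.11) holds trivially because i = j = k = 1.
\frac{\partial}{\partial x} w_B(x, y) = V^{-1/2}(x) = \frac{1}{\sigma(x)}
\;\Longrightarrow\;
w_B(x, y) = \int_y^x \frac{du}{\sigma(u)},
\qquad
C^{(-1)}(x, y) = \frac{1}{2}\left(\int_y^x \frac{du}{\sigma(u)}\right)^{2}.

% Check against the equation characterizing C^{(-1)}:
\frac{1}{2}\left[\frac{\partial}{\partial x} C^{(-1)}(x, y)\right]^{2} V(x)
= \frac{1}{2}\left[\frac{w_B(x, y)}{\sigma(x)}\right]^{2} \sigma^{2}(x)
= \frac{1}{2}\, w_B^{2}(x, y)
= C^{(-1)}(x, y).
```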

2.7.2 The irreducible case

When the reducibility condition (2.11) does not hold, we can no longer solve (2.10) and $C^{(-1)}(x, y)$ explicitly in general. In this case, we will approximate $C^{(-1)}(x, y)$ instead.
When choosing the method to approximate $C^{(-1)}(x, y)$, we notice that the derivatives of $C^{(-1)}(x, y)$ are needed to compute $C^{(0)}(x, y)$ whose derivatives will in turn be used to compute $C^{(1)}(x, y)$ and so on. Therefore, the method has to approximate not only $C^{(-1)}(x, y)$ but also its derivatives of various orders. A natural method of approximation is therefore Taylor expansion of $C^{(-1)}(x, y)$ as proposed in Aït-Sahalia (2001). The idea is that we will Taylor expand the differential equation for $C^{(-1)}(x, y)$ around $y = x$ with undetermined coefficients, match and set to 0 the coefficients of terms involving the same orders of $y - x$, solve for the undetermined coefficients and obtain a Taylor expansion for $C^{(-1)}(x, y)$. This second expansion is discussed in detail in Aït-Sahalia (2001) (see especially Theorem 2 and the remarks thereafter).

In the rest of this section, we will first give an example to illustrate this method of approximation and then discuss how to determine the order of the Taylor expansion. We will use $C^{(-1,J)}(x, y)$ to denote the expansion of $C^{(-1)}(x, y)$ to order $J$ and use $C^{(k,J)}(x, y)$ for $k \geq 0$ to denote other coefficients computed from using $C^{(-1,J)}(x, y)$. To make the error of $p^{(m)}(\Delta, y|x)$ have the correct order, $J$ will depend on $m$. But for notational convenience, this dependence is not written out explicitly.
As an example, we take the equation characterizing $C^{(-1)}(x, y)$ in Theorem 2 and seek a second order expansion $C^{(-1,2)}(x, y)$.

$$C^{(-1)}(x, y) = \frac{1}{2}\left[\frac{\partial}{\partial y} C^{(-1)}(x, y)\right]^T V(y) \left[\frac{\partial}{\partial y} C^{(-1)}(x, y)\right]$$

Taylor expand the terms around $y = x$ and notice the condition $C^{(-1)}(x, x) = 0$

$$\left.\frac{\partial}{\partial y} C^{(-1)}(x, y)\right|_{y=x} \cdot (y - x) + \frac{1}{2}(y - x)^T \cdot \left.\frac{\partial^2}{\partial y^2} C^{(-1)}(x, y)\right|_{y=x} \cdot (y - x) + R_1$$

$$= \frac{1}{2}\left[\left.\frac{\partial}{\partial y} C^{(-1)}(x, y)\right|_{y=x} + \left.\frac{\partial^2}{\partial y^2} C^{(-1)}(x, y)\right|_{y=x} \cdot (y - x) + R_2\right]^T \cdot \left[V(x) + R_3\right] \cdot \left[\left.\frac{\partial}{\partial y} C^{(-1)}(x, y)\right|_{y=x} + \left.\frac{\partial^2}{\partial y^2} C^{(-1)}(x, y)\right|_{y=x} \cdot (y - x) + R_2\right]$$

where $R_1$, $R_2$ and $R_3$ are remainder terms that will not affect the result.
Matching the terms with the same orders of $(y - x)$ and setting their coefficients to 0, we get

$$\left.\frac{\partial}{\partial y} C^{(-1)}(x, y)\right|_{y=x} = 0$$

$$\left.\frac{\partial^2}{\partial y^2} C^{(-1)}(x, y)\right|_{y=x} = V^{-1}(x)$$

The second order expansion of $C^{(-1)}(x, y)$ around $y = x$ is therefore

$$C^{(-1)}(x, y) = C^{(-1,2)}(x, y) + o\left(|y - x|^2\right) = \frac{1}{2}(y - x)^T \cdot V^{-1}(x) \cdot (y - x) + o\left(|y - x|^2\right) \tag{2.12}$$
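The second order expansion can be checked by substituting it back into the equation it was derived from. The short verification below assumes only that $V$ is symmetric (it is a variance matrix) and differentiable, so that $V(y) = V(x) + O(|y - x|)$.

```latex
% Gradient of the quadratic form (V^{-1}(x) is symmetric):
\frac{\partial}{\partial y} C^{(-1,2)}(x, y) = V^{-1}(x)\,(y - x).

% Right-hand side of the equation characterizing C^{(-1)}:
\frac{1}{2}\,(y - x)^T V^{-1}(x)\, V(y)\, V^{-1}(x)\,(y - x)
= \frac{1}{2}\,(y - x)^T V^{-1}(x)\,\bigl[V(x) + O(|y - x|)\bigr]\, V^{-1}(x)\,(y - x)
= C^{(-1,2)}(x, y) + O\left(|y - x|^{3}\right).

% Hence the equation holds up to o(|y - x|^2), as required of a second order expansion.
```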
Now, let us discuss the choice of $J$. We can achieve an error of $o_p(\Delta^m)$ if each term $C^{(k,J)}(x, y)\Delta^k$ differs from $C^{(k)}(x, y)\Delta^k$ by $o_p(\Delta^m)$. For jump-diffusions, $y - x \sim O_p\left(\sqrt{\Delta}\right)$. It is clear then that, to make $C^{(k,J)}(x, y) - C^{(k)}(x, y) = o_p\left(\Delta^{m-k}\right)$, we need to set

$$J \geq 2(m - k) \quad \text{for all } k \geq -1 \tag{2.13}$$

as shown in equation (5.3) in Aït-Sahalia (2001). Therefore, $J$ should at least be $2(m + 1)$. This choice of $J$ also ensures the relative error incurred by $\sum_{k=1}^{m} D^{(k)}(x, y)\Delta^k$ is $o_p(\Delta^m)$, as can be verified from Theorem 1 and 2.
For $k \geq 0$, $C^{(k,J)}(x, y)$ are exact solutions to the linear PDEs$^{10}$ characterizing them. However, they only approximate $C^{(k)}(x, y)$ because these PDEs involve $C^{(-1)}(x, y)$ for which only an approximation $C^{(-1,J)}(x, y)$ is known. In the case that these linear PDEs are too cumbersome to solve, we can find an approximate solution $C^{(k,J(k))}(x, y)$ in the same way as the expansion for $C^{(-1)}(x, y)$ with the order of expansion being $J(k) = 2(m - k)$. See also Aït-Sahalia (2001).
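The bookkeeping behind the choice of $J$ is easy to tabulate. The helper below is a minimal sketch with hypothetical function names; it simply evaluates $J(k) = 2(m - k)$ for each coefficient and the single order $2(m + 1)$ that satisfies (2.13) for every $k \geq -1$.

```python
def expansion_orders(m):
    """Return {k: J(k)} with J(k) = 2 * (m - k) for k = -1, 0, ..., m."""
    if m < 0:
        raise ValueError("m must be non-negative")
    return {k: 2 * (m - k) for k in range(-1, m + 1)}


def common_order(m):
    """Smallest single J with J >= 2 * (m - k) for every k >= -1 (binding at k = -1)."""
    return 2 * (m + 1)
```

For a second-order likelihood approximation, `expansion_orders(2)` assigns order 6 to $C^{(-1)}$, 4 to $C^{(0)}$, 2 to $C^{(1)}$ and 0 to $C^{(2)}$, so the leading coefficient is the one that needs the longest Taylor expansion.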

3 Estimation
To estimate the unknown parameter $\theta \in \mathbb{R}^p$ in the jump diffusion model, we collect observations in a time span $T$ with sampling interval $\Delta$. Let $X_{T,\Delta} = \{X_0, X_\Delta, X_{2\Delta}, \cdots, X_T\}$ denote the observations.

Assumption 5. The model (2.1) describes the true data-generating process and the true parameter, denoted $\theta_0$, is in a compact set $\Theta$.

Let

$$l_{T,\Delta}(\theta) = \sum_{i=1}^{T/\Delta} \log p\left(\Delta, X_{i\Delta} \mid X_{(i-1)\Delta}, \theta\right)$$

be the log-likelihood function of observing $X_{T,\Delta}$ conditioning on $X_0$. The loss of information contained in the first observation $X_0$ is negligible when the sample size increases. Let $\hat{\theta}_{T,\Delta} \equiv \arg\max_{\theta \in \Theta} l_{T,\Delta}(\theta)$ denote the estimator from maximum-likelihood estimation (MLE).
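Let

$$l^{(m)}_{T,\Delta}(\theta) = \sum_{i=1}^{T/\Delta} \log p^{(m)}\left(\Delta, X_{i\Delta} \mid X_{(i-1)\Delta}, \theta\right)$$

be the $m$-th order approximate likelihood function. Denote $\hat{\theta}^{(m)}_{T,\Delta} \equiv \arg\max_{\theta \in \Theta} l^{(m)}_{T,\Delta}(\theta)$. To simplify notation, the dependence of $\hat{\theta}_{T,\Delta}$ and $\hat{\theta}^{(m)}_{T,\Delta}$ on $T$ and $\Delta$ will not be made explicit when there is no confusion.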
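The structure of these two objective functions can be sketched as follows. The closed-form density $p^{(m)}$ is not reproduced here; as a stand-in, the sketch uses the exact Gaussian transition density of an Ornstein-Uhlenbeck process without jumps, and it maximises over a finite candidate grid instead of using a numerical optimiser. The function names and these two substitutions are assumptions made for illustration.

```python
import math


def log_likelihood(theta, xs, dt, log_density):
    """Sum of log transition densities over consecutive observations, conditioning on xs[0]."""
    return sum(log_density(dt, xs[i], xs[i - 1], theta) for i in range(1, len(xs)))


def gaussian_ou_log_density(dt, y, x, theta):
    """Stand-in density: exact transition of dX = -k X dt + sigma dW (no jumps)."""
    k, sigma = theta
    mean = x * math.exp(-k * dt)
    var = sigma**2 * (1.0 - math.exp(-2.0 * k * dt)) / (2.0 * k)
    return -0.5 * math.log(2.0 * math.pi * var) - (y - mean) ** 2 / (2.0 * var)


def grid_mle(xs, dt, log_density, grid):
    """Crude maximiser: pick the candidate in `grid` with the largest log-likelihood."""
    return max(grid, key=lambda theta: log_likelihood(theta, xs, dt, log_density))
```

Replacing `gaussian_ou_log_density` with any function of the same signature that returns $\log p^{(m)}$ gives the approximate estimator; the summation and the maximisation step are unchanged.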
The rest of the section will show that $\hat{\theta}^{(m)}$ attains the asymptotic efficiency of $\hat{\theta}$ when the sampling interval becomes small.
Let $I_{T,\Delta}(\theta) \equiv \mathrm{diag}\left\{E_\theta\left[\dot{l}_{T,\Delta}(\theta)\,\dot{l}_{T,\Delta}(\theta)^T\right]\right\}$ be the information matrix. To simplify the notation, $\dot{l}_{T,\Delta}(\theta)$ is used for the derivative with respect to the vector $\theta$.

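Assumption 6. $I_{T,\Delta}(\theta)$ is invertible and $l_{T,\Delta}(\theta)$ is thrice continuously differentiable with respect to $\theta \in \Theta$. There exists $\bar{\Delta} > 0$ such that

1. $I^{-1}_{T,\Delta}(\theta) \xrightarrow{p} 0$ as $T \to \infty$, uniformly for $\theta \in \Theta$ and $\Delta \leq \bar{\Delta}$;
2. $\left\| I^{-1/2}_{T,\Delta}(\theta)\, \dddot{l}_{T,\Delta}(\tilde{\theta})\, I^{-1/2}_{T,\Delta}(\theta) \right\|$ is uniformly bounded in probability for $\tilde{\theta} \in \Theta$ and $\Delta \leq \bar{\Delta}$.

10
Linear PDEs can be solved using, for example, the method in chapter 6 of Carrier et al (1988).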

The second condition in this assumption is stronger than the usual one, which only requires $\left\| I^{-1/2}_{T,\Delta}(\theta)\, \dddot{l}_{T,\Delta}(\tilde{\theta})\, I^{-1/2}_{T,\Delta}(\theta) \right\|$ to be uniformly bounded in probability for $\tilde{\theta}$ in a shrinking neighborhood of $\theta$. This stronger version is used to prove the asymptotic property of $\hat{\theta}^{(m)}$. If the process is stationary, this condition will automatically hold because $\left\| I^{-1/2}_{T,\Delta}(\theta)\, \dddot{l}_{T,\Delta}(\tilde{\theta})\, I^{-1/2}_{T,\Delta}(\theta) \right\|$ will converge in probability to a constant $\eta(\Delta, \theta, \tilde{\theta})$ which is bounded for parameters in the compact set $\Theta$ and $\Delta \leq \bar{\Delta}$ (see equation (A.49) in Aït-Sahalia (2002) for the stationary diffusion case), so no cost is incurred from strengthening this condition.
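Given assumption 6, it can be shown$^{11}$ that the true MLE estimator $\hat{\theta}_{T,\Delta}$ satisfies

$$I^{1/2}_{T,\Delta}(\theta_0)\left(\hat{\theta}_{T,\Delta} - \theta_0\right) = -\left[I^{-1/2}_{T,\Delta}(\theta_0)\, \ddot{l}_{T,\Delta}(\theta_0)\, I^{-1/2}_{T,\Delta}(\theta_0)\right]^{-1} \left[I^{-1/2}_{T,\Delta}(\theta_0)\, \dot{l}_{T,\Delta}(\theta_0)\right] + o_p(1) \tag{3.1}$$

as $T \to \infty$, uniformly for all $\Delta \leq \bar{\Delta}$. Depending on the joint distribution of $\left[I^{-1/2}_{T,\Delta}(\theta_0)\, \ddot{l}_{T,\Delta}(\theta_0)\, I^{-1/2}_{T,\Delta}(\theta_0)\right]^{-1}$ and $I^{-1/2}_{T,\Delta}(\theta_0)\, \dot{l}_{T,\Delta}(\theta_0)$, possible limiting distributions of $I^{1/2}_{T,\Delta}(\theta)\left(\hat{\theta}_{T,\Delta} - \theta_0\right)$ include LABF (locally asymptotically Brownian functional) and LAMN (locally asymptotically mixed normal) which has LAN (locally asymptotically normal) as a special case.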

Theorem 3. Let $M_t = \sup_{0 \leq s \leq t} \|X_s\|$. If, for any $t < \infty$, $M_t$ is bounded in probability, there exists a sequence $\Delta_T^* \xrightarrow{T \to \infty} 0$ so that for any $\{\Delta_T\}$ satisfying $\Delta_T \leq \Delta_T^*$,

$$\left\| I^{1/2}_{T,\Delta_T}(\theta_0)\left(\hat{\theta}^{(m)}_{T,\Delta_T} - \hat{\theta}_{T,\Delta_T}\right) \right\| = o_p\left(\Delta_T^m\right)$$

as $T \to \infty$.
Therefore, $\hat{\theta}^{(m)}$ achieves the asymptotic efficiency of $\hat{\theta}$ if the sampling interval becomes small. The speed at which $\hat{\theta}^{(m)}$ approaches the efficiency of $\hat{\theta}$ increases with the order of the approximation.

4 Realignment Risk of the Yuan


The currency of China is the Yuan.12 Since 1994, daily movement of the exchange rate between the
Yuan and the US Dollar (USD) is limited to 0.3% on either side of the exchange rate published by
People’s Bank of China, China’s central bank. Since 2000, the Yuan/USD rate has been in a narrow
range of 8.2763 to 8.2799 and is essentially pegged to USD at 8.277.
A recent export boom has led to diplomatic pressures on China to allow the Yuan to appreciate.
This realignment risk is important to both China and other parts of the world. Foreign trade volume
accounts for roughly 40 percent of China’s GDP in 2002.13 A shift in the terms of trade and changes
in the competitive advantage of exporters brought by currency fluctuation can have a significant
11
See the proof of Theorem 3.
12
A history of its exchange rate regime is provided in Appendix 3.
13
Calculated from data provided by Datastream International.

impact on the Chinese economy. Between 1992 and 1999, Foreign Direct Investment (FDI) into China accounted for 8.2 percent of worldwide FDI and 26.3 percent of FDI going to developing countries, all of which are subject to the currency risk.$^{14}$ Beginning in August 2003, China started to open its domestic financial markets to foreign investors through Qualified Foreign Institutional Investors (QFII) and is planning to allow Chinese citizens to invest in foreign financial markets. Currency risk will be important for these financial-market investors, too.
What is the realignment probability of the Yuan implicit in the financial market? How does this
probability respond to economic fundamentals? These are the questions to be addressed in the rest of
the paper. In particular, we will obtain a time-series estimate of the Yuan’s realignment intensity in
the past and uncover the implied term structure of the forward realignment rate (defined in Section
4.2.1) going into the future.

4.1 Data
“Ironically, what began as a protection against currency devaluation has now become
the chief tool for betting on currency appreciation, particularly in China — where an
export boom has led to diplomatic pressure to allow the yuan to rise in value.”
–– “Feature — NDFs, the secretive side of currency trading”
Forbes, August 28, 2003

To uncover the realignment probability, we obtained the Non-Deliverable Forward (NDF) rates
traded in major off-shore banks. The data, covering February 15, 2002 to December 12, 2003, are
sampled by WM/Reuters and are obtained from Datastream International.15
An NDF contract that a client purchases from an off-shore bank is the same as a forward contract except that, on the expiration day, no physical delivery of the Yuan or the USD takes place. Instead, any profit (loss) incurred by the client, which depends on the difference between the NDF rate and the spot exchange rate at maturity, is converted to USD and paid to the client (bank) by the bank (client).
The daily trading volume in the offshore Chinese Yuan NDF market is around 200 million USD.16
Though not as liquid as the major currencies, it is large enough for the purpose of assuming no
arbitrage and hence the existence of a risk-neutral pricing measure in the analysis later.
Figure 2 plots the spot Yuan/USD exchange rate, the one-month NDF rate and the three-month
NDF rate. A feature of the plot is the downward crossing over spot rate of both forward rates near
the end of year 2002. In early 2002, both forward rates are above the spot rate; however, since late
2002, both forward rates have been lower than the spot rates. Intuitively, this seems to suggest that
14
The data are from Huang (2003).
15
Datastream daily series CHIYUA$, USCNY1F and USCNY3F are obtained for the spot Yuan/USD rate, 1 month
NDF rate and 3 month NDF rate. All the data are sampled by WM/Reuters at 16:00 hrs London time.
16
The number is obtained from the article “Feature — NDFs, the secretive side of currency trading”, Forbes, August
28, 2003.

Figure 2: Spot and NDF rates (Yuan/USD against date, Jan02 to Jan04; series plotted: spot rate, one-month NDF rate, three-month NDF rate, CIP implied rate)

the market is expecting the Yuan to appreciate. Indeed, the estimation result in Section 4.4 shows
that the market-implied realignment probability was much bigger in 2003 than in 2002.
For a currency that can be freely exchanged, the forward price is pinned down by the spot
exchange rate and domestic and foreign interest rate through arbitrage (the Covered Interest Rate
Parity). The “CIP implied rate” in the graph is the three-month forward rate implied by the covered
interest rate parity if the Yuan were freely traded.17 It can be seen that not only the level but also
the trend of the NDF rate differs from the rate implied by the covered interest rate parity. Therefore,
the NDF rate does provide additional information and will be exploited later.
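For reference, the covered interest rate parity benchmark can be written as a one-line function. This is a sketch under a simple-interest convention, which is an assumption made here; the paper does not state the compounding convention used for the CIP implied rate, and the function name is hypothetical. With the spot rate quoted in Yuan per USD, the Yuan deposit rate plays the role of the domestic rate and the eurodollar rate that of the foreign rate.

```python
def cip_forward(spot, r_domestic, r_foreign, tau):
    """Forward rate (domestic currency per foreign unit) implied by covered interest parity.

    spot       -- spot rate, domestic per foreign unit (e.g. Yuan per USD)
    r_domestic -- annualised simple deposit rate in the domestic currency
    r_foreign  -- annualised simple deposit rate in the foreign currency
    tau        -- maturity in years (0.25 for three months)
    """
    return spot * (1.0 + r_domestic * tau) / (1.0 + r_foreign * tau)
```

Under this formula the implied forward lies above the spot rate whenever the domestic rate exceeds the foreign rate, so a gap between the NDF rate and this benchmark reflects information beyond the interest differential.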

4.2 Setup
The exchange rate $E$ in USD/Yuan is assumed to follow a pure jump process

$$\frac{dE_t}{E_{t-}} = J_t^u\,dU_t + J_t^d\,dD_t$$

where $U$ and $D$ are Poisson processes with arrival rates $\lambda_{u,t}$ and $\lambda_{d,t}$. The jump size $J_t^u$ ($J_t^d$) is a positive (negative) i.i.d. random variable. Therefore, $U$ and $D$ are associated with upward (the Yuan
17
Three-month eurodollar deposit rate and three-month time deposit rate for the Yuan are used. They are obtained from series FREDD3M and CHSRW3M in Datastream International.

appreciates) and downward (the Yuan depreciates) realignments, respectively.
Under regularity conditions, the price $F_d$ of a Non-Deliverable Forward with maturity $d$ is given by

$$0 = E^Q\left[\left. e^{-\int_0^d r_s\,ds}\left(E_d - F_d\right) \,\right|\, E_0, \Gamma_0\right]$$

where $r$ is the short-term US interest rate and $Q$ is a risk-neutral equivalent martingale measure. Considering that China is financially a small country relative to the US, we make two simplifying assumptions: that the realignment risk premium is zero and that $e^{-\int_0^d r_s\,ds}$ and $E_d$ are uncorrelated. The zero risk premium assumption enables us to identify the risk-neutral measure $Q$ with the actual probability measure. We will use the one-month NDF price in the estimation, so the effect of interest rate changes is expected to be small in such a short window. Also, this avoids any potential misspecification of interest rate dynamics in light of the recent finding that available term structure models do not fit the time series behavior of interest rates very well.$^{18}$ With these two assumptions, the forward pricing formula simplifies to

$$F_d = E\left[E_d \mid E_0, \Gamma_0\right] \tag{4.1}$$

We complete the setup by parametrizing the time-varying realignment intensities. Let $\lambda_{u,t} = \lambda_u e^{\Gamma_{t-}}$ and $\lambda_{d,t} = \lambda_d e^{-\Gamma_{t-}}$ where the process $\Gamma$ follows

$$d\Gamma_t = -k\Gamma_t\,dt + \sigma\,dW_t + K_t\,dL_t$$

for some $k, \sigma > 0$. $W$ is a Brownian motion, $L$ is a Poisson process with arrival rate $\eta$, and $K_t$ is a random variable assumed to be normally distributed with mean 0 and variance $\sigma_K^2$. To keep the model simple, $W$, $L$ and $K$ are assumed to be independent of other uncertainties in the model.
This specification parsimoniously characterizes how the realignment intensity varies over time. When $\Gamma > 0$, $\lambda_{u,t} > \lambda_u$ and upward realignment is more likely than the long-run average. When $\Gamma < 0$, $\lambda_{d,t} > \lambda_d$ and downward realignment is more likely. The process $\Gamma$ mean reverts to 0, at which point both upward and downward realignment intensities equal their long-run averages. If we can uncover the time series of $\Gamma$, we will be able to trace $\lambda_{u,t}$ and $\lambda_{d,t}$ over time.

4.2.1 Term Structure of the Forward Realignment Rate

Let $q(t)$ denote the probability of no upward realignment before time $t$.$^{19}$ Under the current setup,$^{20}$

$$q(t) = E_0\left[e^{-\int_0^t \lambda_{u,\tau}\,d\tau}\right]$$

For $s \geq t$, let $q(s|t)$ denote the probability of no realignment before $s$ given no realignment
18
See Duffee (2002).
19
The term structure of downward realignment rate can be defined in direct parallel. However, it is suppressed since
the current interest is in the appreciation of the Yuan.
20
This can be shown by using a doubly stochastic argument as in Brémaud (1981).

happened before time $t$. It can be shown that

$$q(s|t) = e^{-\int_t^s f(\tau)\,d\tau}$$

where the function $f$ is given by

$$f(t) = -\frac{q'(t)}{q(t)}$$

The function f is termed the forward realignment rate.21 Knowing the term structure of f gives
complete information regarding the realignment probabilities in the future.
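The two relations between $q$ and $f$ translate directly into code. The sketch below is illustrative only: the function names, the trapezoid rule for the integral and the central finite difference for $q'$ are choices made here, not part of the paper.

```python
import math


def survival_from_forward_rate(f, t, s, steps=1000):
    """q(s|t) = exp(-int_t^s f(tau) dtau), with the integral taken by the trapezoid rule."""
    if s < t:
        raise ValueError("s must not be smaller than t")
    h = (s - t) / steps
    integral = sum(0.5 * h * (f(t + i * h) + f(t + (i + 1) * h)) for i in range(steps))
    return math.exp(-integral)


def forward_rate_from_survival(q, t, h=1e-5):
    """f(t) = -q'(t) / q(t), with q' approximated by a central finite difference."""
    return -(q(t + h) - q(t - h)) / (2.0 * h * q(t))
```

Because the integral is additive over adjacent intervals, the output satisfies $q(s|0) = q(t|0)\,q(s|t)$ up to discretisation error, which is what allows any future realignment probability to be read off the curve $f$.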
This paper uses one factor Γ to model the evolution of the realignment intensity. This simple
setup already affords a rich set of possibilities for the term structure of f . Figure 3 plots three such
possibilities including monotonically decreasing, monotonically increasing and hump-shaped term
structure.
Figure 3: Possible Shapes of the Term Structure (three panels: downward sloping, upward sloping and hump-shaped term structure; forward realignment rate against year, 0 to 5)

With the monotonically decreasing term structure, the financial market is perceiving an immediate realignment. Conditioning on no realignment happening, the chance of a realignment decreases over time. With an increasing term structure, the realignment probability is perceived to be very small and reverts back to the long-run mean as time progresses. When the term structure is hump-shaped, the financial market could be expecting a realignment at a certain time in the future and, conditioning on no realignment before that time, the chance of a realignment in the further future is perceived to be small.
The entire term structure of the forward realignment rate can be calculated once the unknown
parameters are estimated and the unobservable process Γ is filtered out by inverting the pricing
formula (4.1). However, two problems have to be solved. First, the pricing formula (4.1) does not
have a closed form. Second, there are some unknown parameters to be estimated, some of which are
not even identified.
21
Readers familiar with duration models will recognize this is the hazard rate of the Yuan’s realignment. This concept
is also parallel to the forward interest rate in term structure interest rate models and to the forward default rate in
credit risk models.

4.2.2 Approximating the Forward Price

Let $f(E_d, \Gamma_d) = E_d$. We will expand the forward price $E[E_d \mid E_0, \Gamma_0]$ to the third order around $d = 0$ through the infinitesimal generator $A$ of the process $\{E, \Gamma\}$, i.e.,

$$F_d \approx E_0 + \sum_{i=1}^{3} \frac{d^i}{i!} A^i f(E_0, \Gamma_0) \tag{4.2}$$

Since an NDF with short maturity (one month) will be used in the estimation, the approximation is expected to be good. In fact, it will be verified later in Section 4.4.3 that this third order expansion indeed delivers sufficient precision.

4.2.3 Identification

Let $g(E_d) = E_d$. The forward price (4.1) can be rewritten as

$$F_d = E_0 + E\left[\left. \int_0^d Lg(E_s)\,ds \,\right|\, E_0, \Gamma_0\right]$$

where $L$ is the infinitesimal generator of the process $E$. It can be calculated that

$$Lg(E_t) = \left[e^{\Gamma_{t-}} \lambda_u E(J_t^u) + e^{-\Gamma_{t-}} \lambda_d E\left(J_t^d\right)\right] E_t$$

Therefore, $\lambda_u$, $E(J_t^u)$, $\lambda_d$ and $E(J_t^d)$ are not separately identified from the forward rate alone. Intuitively, a higher intensity and a larger magnitude of realignment have the same implication for the forward rate.
In principle, the parameters $\lambda_u$, $E(J_t^u)$, $\lambda_d$ and $E(J_t^d)$ are identified from the time series of exchange rates. However, in the sample period from February 2002 to September 2003, no realignments took place. To overcome this problem, we collected all the realignment events since 1955 when the Yuan was created (see Appendix 3). In the past forty-eight and a half years, there have been two appreciations for the Yuan: 9 percent in 1971 and 11 percent in 1973 with an average of 10 percent. There have been four depreciations: a roughly 28 percent depreciation in 1986 when the foreign exchange swap rate was introduced; a 21 percent and a 10 percent depreciation in 1989 and 1990, respectively; and finally a 33 percent depreciation when the official effective rate was unified with the foreign exchange swap rate in 1994. The average of these depreciations is 23 percent.
Therefore, we will set $\lambda_u = \frac{2}{48.5}$, $E(J_t^u) = 0.1$, $\lambda_d = \frac{4}{48.5}$, $E(J_t^d) = -0.23$. In other words, we will consider realignment risk with an average of 10 percent change in value upside and an average of 23 percent change in value downside and uncover the realignment probability corresponding to realignment risks of these magnitudes. This is a normalization that will not affect the results later.
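With these normalised values, the link between $\Gamma$ and the forward price can be illustrated numerically. The sketch below is a deliberate simplification: it keeps only the first-order term in the maturity $d$, so that $F_d \approx E_0\,[1 + d\,(e^{\Gamma_0}\lambda_u E(J^u) + e^{-\Gamma_0}\lambda_d E(J^d))]$, whereas the paper uses a third order expansion. The function names are hypothetical, and spot and forward are taken in USD per Yuan to match the definition of $E$.

```python
import math

# Normalisation stated in the text.
LAMBDA_U, MEAN_JUMP_U = 2.0 / 48.5, 0.10
LAMBDA_D, MEAN_JUMP_D = 4.0 / 48.5, -0.23


def expected_growth(gamma):
    """L g(E_t) / E_t = e^Gamma * lambda_u * E(J^u) + e^-Gamma * lambda_d * E(J^d)."""
    return (math.exp(gamma) * LAMBDA_U * MEAN_JUMP_U
            + math.exp(-gamma) * LAMBDA_D * MEAN_JUMP_D)


def forward_first_order(spot, gamma, d):
    """First-order-in-d forward price (a truncation of the third order expansion)."""
    return spot * (1.0 + d * expected_growth(gamma))


def implied_gamma(spot, forward, d):
    """Invert the first-order formula for Gamma.

    With z = e^Gamma the equation a*z + b/z = s becomes a*z^2 - s*z + b = 0,
    where a > 0 and b < 0, so exactly one root is positive.
    """
    s = (forward / spot - 1.0) / d
    a = LAMBDA_U * MEAN_JUMP_U
    b = LAMBDA_D * MEAN_JUMP_D
    z = (s + math.sqrt(s * s - 4.0 * a * b)) / (2.0 * a)
    return math.log(z)
```

Since the growth rate is strictly increasing in $\Gamma$, each forward spread maps to a unique $\Gamma$ in this first-order version, which is the property that makes filtering $\Gamma$ from observed forward prices possible.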

4.3 Iterative Maximum-Likelihood Estimation


To uncover the realignment probability, parameters $\theta \equiv \{k, \sigma, \eta, \sigma_K\}$ need to be estimated. The true parameter $\theta_0$ is assumed to be in a compact set $\Theta$. Starting from an initial estimate $\theta^{(m)}$, we can invert the forward pricing formula (4.2) to get $\hat{\Gamma}^{(m)} = \Gamma\left(\theta^{(m)}\right)$. The vector $\hat{\Gamma}^{(m)} = \left\{\hat{\Gamma}_0^{(m)}, \hat{\Gamma}_\Delta^{(m)}, \hat{\Gamma}_{2\Delta}^{(m)}, ..., \hat{\Gamma}_T^{(m)}\right\}$ estimates the process $\Gamma$ at different points in time. (To save notation, $\Gamma$ is used for both the process $\Gamma$ and the function inverting the forward pricing equation. The dependence of $\Gamma\left(\theta^{(m)}\right)$ on observed spot and forward prices is not made explicit.) Knowing $\hat{\Gamma}^{(m)}$, we can apply maximum-likelihood estimation to get another estimate of the parameters, say $\theta^{(m+1)} = H\left(\hat{\Gamma}^{(m)}\right)$. The process $\Gamma$ does not admit a closed-form likelihood. Fortunately, the approximate likelihood introduced in the first part of the paper can be applied to compute $\theta^{(m+1)}$. In particular, the third order approximation $p_R^{(3)}$ will be used. This defines an iterative estimation procedure.

$$\theta^{(m+1)} = H\left(\Gamma\left(\theta^{(m)}\right)\right) \tag{4.3}$$

Let $\theta^*_{T,\Delta}$ denote the uncomputable maximum-likelihood estimator if we can observe the process $\Gamma$, i.e., $\theta^*_{T,\Delta} = H(\Gamma(\theta_0))$. $\theta^*_{T,\Delta}$ is uncomputable because the process $\Gamma$ is not directly observable by the econometrician. The following proposition shows that the iterative maximum-likelihood estimation procedure converges and the estimator so obtained approaches the efficiency of $\theta^*_{T,\Delta}$. As a reminder, $d$ is the maturity of the forward contract (one month here), $\Delta$ is the sampling interval (daily here) and $T$ is the sampling period (February 2002 to December 2003).

Proposition 2. Given $\Delta$ and $T$, there exists a positive function $M(d) \xrightarrow{d \to 0} 0$ so that as $d \to 0$, with probability approaching one, the mapping $H(\Gamma(\cdot))$ is a contraction with modulus $M(d)$ and the iterative maximum-likelihood estimation procedure converges to an estimate $\hat{\theta}_{T,\Delta}$. Further,

$$\left\|\hat{\theta}_{T,\Delta} - \theta_0\right\| \leq \frac{1}{1 - M(d)} \left\|\theta^*_{T,\Delta} - \theta_0\right\|$$

This proposition ensures we can approach the efficiency obtainable when the process $\Gamma$ is observed. Theorem 3 ensures $\theta^*_{T,\Delta}$ approaches the efficiency of the maximum-likelihood estimator using the exact likelihood function. Therefore, the iterative MLE estimator asymptotically achieves the efficiency of the true MLE estimator which is uncomputable both because the process $\Gamma$ is unobservable and because the likelihood function is not known in closed form.
With $\hat{\theta}_{T,\Delta}$ so estimated, $\hat{\Gamma} = \Gamma\left(\hat{\theta}_{T,\Delta}\right)$ estimates the time series of the process $\Gamma$.
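The iteration (4.3) is a fixed-point iteration, and its control flow can be sketched generically. In the sketch below `update` stands in for the composite map that first filters $\Gamma$ from forward prices and then re-estimates the parameters; it is a placeholder supplied by the caller, the parameter is taken to be scalar for simplicity, and the function name, tolerance and iteration cap are assumptions made here.

```python
def iterate_to_fixed_point(update, theta0, tol=1e-10, max_iter=1000):
    """Iterate theta <- update(theta) until successive iterates differ by less than tol.

    Returns the final iterate and the number of iterations used.
    """
    theta = theta0
    for iteration in range(1, max_iter + 1):
        new_theta = update(theta)
        if abs(new_theta - theta) < tol:
            return new_theta, iteration
        theta = new_theta
    raise RuntimeError("fixed-point iteration did not converge")
```

A toy linear map such as `lambda th: 0.3 * th + 0.7` is a contraction with modulus 0.3 and fixed point 1. Writing $T$ for the update map and $\hat{\theta}$ for its fixed point, the inequality in Proposition 2 has the same form as the standard contraction bound $\|\hat{\theta} - \theta_0\| \leq \|T(\theta_0) - \theta_0\| / (1 - M)$, with $T(\theta_0)$ playing the role of $\theta^*_{T,\Delta}$.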

4.3.1 Which Forward Price to Use?

To infer Γ by inverting the forward pricing formula, forward prices of one maturity suffice. The data
contain forward prices with nine different maturities: one day, two days, one week, one month, two
months, three months, six months, nine months and one year. Which forward price should we use?
Proposition 2 demands the forward price with the shortest maturity. However, we did not take
Proposition 2 literally and use the forward price with the shortest maturity (one day), because, when
the duration gets too small, the realignment risk becomes negligible and the prices can be prone to
noises not modelled. To see this, let Fd − E denote the spread of the forward rate with maturity d
over the spot rate. Table 1 gives the sample correlation coefficients between the spread of different

forward rates.

Table 1: Correlation Matrix of Different Forward Spreads

        1d      2d      1w      1m      2m      3m      6m      9m      1y
1d       1   0.990  -0.999  -0.912  -0.908  -0.893  -0.868  -0.850  -0.844
2d               1  -0.989  -0.890  -0.894  -0.885  -0.862  -0.844  -0.838
1w                       1   0.913   0.908   0.893   0.869   0.851   0.845
1m                               1   0.990   0.981   0.963   0.951   0.947
2m                                       1   0.994   0.979   0.968   0.963
3m                                               1   0.990   0.982   0.978
6m                                                       1   0.996   0.994
9m                                                               1   0.999
1y                                                                       1

The forwards with extremely short maturities (one and two days) behave very differently from other forwards. The spreads for maturities of one month or longer are highly correlated and any one of them captures most of the information in the other forwards. Therefore, the one-month forward price will be used, striking a balance between Proposition 2 and the noise in the extremely short-maturity forwards. As far as the realignment risk over a long horizon (say, the next five years) is concerned, this choice is more sensible than the one-day forward price.

4.4 Estimation Result


We applied the iterative MLE procedure on the spot exchange rate and one-month NDF price. The
estimates are in table 2.22

Table 2: Parameter Estimates

       k         σ         η       σ_K
  0.2545    0.5159    31.414    0.2393
(0.4379)  (0.0259)  (5.5737)  (0.0290)

4.4.1 Realignment Intensity in the Past


The 95 percent uniform confidence band for the estimated realignment intensity $\lambda_{u,t} = \lambda_u \exp\left(\hat{\Gamma}_t\right)$ in the sample period is plotted in Figure 4.
22
The numbers in the parentheses are the bootstrap standard errors. I used a parametric bootstrap procedure where the process $\Gamma$ is simulated five hundred times using the estimated parameters, and then the forward price is computed using the pricing formula (4.2). Each simulation is initiated at a $\Gamma$ randomly picked from the estimated $\tilde{\Gamma}$. The iterative MLE procedure is then applied, pretending only the forward price is observed as in the actual estimation. Histograms of the bootstrap estimates are in Appendix 5.

Figure 4: Realignment Intensity (uniform 95% upper and lower confidence bounds for the realignment intensity against date, Jan02 to Jan04)

The confidence band plotted is uniform in that the entire time series of Γ in the sample period
lies inside the band with 95 percent probability. The point estimates are not plotted because the
confidence band is very narrow.
At the beginning of the sample period, the realignment intensity is 0.07, which is quite close to the historical average. At the end of the sample period, December 2003, the realignment intensity had increased roughly fivefold to 0.38, which implies the market is expecting the Yuan to appreciate much more often than the historical average.

4.4.2 Term Structure of the Forward Realignment Rate

The realignment probability going into the future can be assessed from the term structure of the
forward realignment rate. Using the parameter estimates and the filtered Γ, this term structure on
December 12, 2003, the last day in the sample, is recovered23 and plotted in figure 5.
Interestingly, the term structure is hump-shaped and peaks at about six months from December 12, 2003. The market considers the Yuan likely to realign upward in 2004 and, conditioning on no realignment in that period, realignments in the further future are perceived as less likely.
The term structure of forward realignment rates enables us to compute the probability of any
future realignment. Two examples are given in figure 6.
In figure 6, the term structure curve enables us to compute that the probability of a realignment
happening in the next two years, 1 − q (2), is 0.49. Similarly we can compute that, conditioning
23
Simulation with 100,000 sample paths is used to compute the term structure. Each week is discretized with 50
points in between using the Euler Scheme.

Figure 5: Term Structure of Forward Realignment Rate (forward realignment rate against years from December 12, 2003, 0 to 5)

Figure 6: Computed Realignment Probabilities using the Term Structure (two panels, forward realignment rate against years from December 12, 2003; annotations: $1 - q(2) = 0.49$ and $1 - q(4|2) = 0.28$)

on no realignment in the next two years, the probability of a realignment before the fourth year,
1 − q (4|2), is 0.28.
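These two numbers can be combined using the relation between $q$ and $f$ from Section 4.2.1. The arithmetic below uses only the two reported probabilities; the derived figures are rounded and follow from $q(4) = q(2)\,q(4|2)$.

```latex
% Reported: 1 - q(2) = 0.49 and 1 - q(4|2) = 0.28, so q(2) = 0.51 and q(4|2) = 0.72.
q(4) = q(2)\, q(4|2) = 0.51 \times 0.72 = 0.3672,
\qquad
1 - q(4) \approx 0.63.

% Average forward realignment rate over each two-year window, from q(s|t) = e^{-\int_t^s f(\tau) d\tau}:
\frac{1}{2}\int_0^2 f(\tau)\, d\tau = -\frac{\ln 0.51}{2} \approx 0.337,
\qquad
\frac{1}{2}\int_2^4 f(\tau)\, d\tau = -\frac{\ln 0.72}{2} \approx 0.164.
```

The lower average rate in the second window is the numerical counterpart of the hump shape: conditional on no realignment in the first two years, a realignment in the following two is perceived as less likely.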

4.4.3 Accuracy of the Approximation

There are two approximations involved in the estimation procedure. One is the approximation of
the forward pricing formula, the other is the approximation of the true likelihood. This subsection
verifies that the two approximations are very good.
To check the accuracy of the approximate forward price, we simulate the process $\Gamma$ to obtain the forward price and compare it to the price given by (4.2).$^{24}$ The estimated $\hat{\Gamma}$ in the sample period ranges from a minimum of 0.5087 to a maximum of 3.3170. The simulation result shows that the difference between the price given by (4.2) and the simulated forward price is no more than $1 \times 10^{-4}$ for all $\Gamma$ in this range. Therefore, the degree of accuracy is sufficient for the application since the forward price data reported by Datastream International have just four digits after the decimal point.
Now we check the approximation to the log-likelihood. The process $\Gamma$ is affine with a known characteristic function. Using Fast Fourier Transform, we can numerically invert the characteristic function and compute the transition density, hence the log-likelihood denoted by $\log p_{FFT}$. This numerical log-likelihood, treated as the true log-likelihood, is compared to $\log p_R^{(3)}$ to check the approximation accuracy. Each graph in figure 7 plots both $\log p_{FFT}(\Gamma_\Delta|\Gamma_0)$ and $\log p_R^{(3)}(\Gamma_\Delta|\Gamma_0)$. The starting value $\Gamma_0$ is chosen to be, from left to right, 0.5087, 1.09 and 3.317, which are the minimum, mean and maximum of the estimated $\Gamma$ in the sample period. In the sample period, the maximum (minimum) change of $\Gamma$ from one observation to another, i.e., $\hat{\Gamma}_{t+\Delta} - \hat{\Gamma}_t$, is 0.5224 (-0.6402). Therefore, to cover all possible cases in the sample, each plot covers a range of $\Gamma_\Delta \in [\Gamma_0 - 0.6402, \Gamma_0 + 0.5224]$.
Figure 7: Compare Approximate and True Log-Likelihood (three panels, log-likelihood against $\Gamma$)

It can be seen that the two log-likelihoods are virtually on top of each other in every one of the plots. Log-likelihoods from other starting values are checked and the results look the same. In addition, at the estimated $\hat{\Gamma}$, the maximum difference between $\log p_{FFT}\left(\Delta, \hat{\Gamma}_{t+\Delta}|\hat{\Gamma}_t\right)$ and $\log p_R^{(3)}\left(\Delta, \hat{\Gamma}_{t+\Delta}|\hat{\Gamma}_t\right)$ is 0.006 for all $t$. Therefore, the accuracy of the log-likelihood approximation is also very good.
24
The Euler scheme with 100 discretization points is used. One million simulated sample paths are used to compute
each price.

4.4.4 A Robustness Check

The estimation used one-month forward price and left out forward prices with longer maturities. In
this subsection, we will use these forward prices to check the robustness of the model specification.
Let $sp_d = F_d - E$ be the spread of a d-maturity forward rate over the spot rate. Let $\widehat{sp}_d$ be the
spread implied by the model. The following regression

$$sp_d = \alpha_d + \beta_d \cdot \widehat{sp}_d + \varepsilon_d$$

should produce an estimate $\hat\alpha_d$ close to 0 and an estimate $\hat\beta_d$ close to 1 if the model is correctly
specified. The regression results are in table 3.²⁵

d | 2m | 3m | 6m | 9m | 1y
$\hat\alpha_d$ | 0.0000201 | 0.0000651 | 0.0003364 | 0.0008146 | 0.0014124
 | (1.89e−06) | (4.54e−06) | (0.0000157) | (0.0000348) | (0.0000585)
$\hat\beta_d$ | 0.9305391 | 0.9232267 | 0.9728467 | 0.9977598 | 0.9721176
 | (0.01764) | (0.026629) | (0.0362506) | (0.0415448) | (0.0419878)

Table 3: OLS Regression Result

The estimates for forwards with maturities at least six months are in line with our expectation.
The estimated β coefficients for forwards with maturities two and three months are also very close
to one. The joint hypothesis that all five β coefficients equal one is accepted at 99 percent level,
assuming the estimate for each regression is independent of the estimates from other regressions. This
result suggests that the model specification adequately captures variations of the longer maturity
forward prices.
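
The regression check can be sketched as follows. The forward price data and the model-implied spreads are not available here, so the example assumes synthetic spreads generated under a correctly specified model. The helper `ols_with_se` and all numerical values are hypothetical.

```python
import numpy as np

def ols_with_se(y, x):
    """Regress y on a constant and x; return coefficients and their standard errors."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sigma2 = resid @ resid / (len(y) - design.shape[1])
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return coef, np.sqrt(np.diag(cov))

rng = np.random.default_rng(0)
model_spread = rng.uniform(-0.01, 0.0, size=500)                   # synthetic model-implied spread
observed_spread = model_spread + 5e-4 * rng.standard_normal(500)   # synthetic observed spread
(alpha_hat, beta_hat), (alpha_se, beta_se) = ols_with_se(observed_spread, model_spread)
t_beta = (beta_hat - 1.0) / beta_se   # t-statistic for the hypothesis beta = 1
```

Under correct specification `alpha_hat` should be near 0 and `beta_hat` near 1, which is the pattern the table is examined for.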

4.5 What is the Realignment Probability Responding to?


The estimates suggest that the realignment intensity can vary dramatically over a short period of
time. It is useful to infer from the estimates what economic fundamentals it is responding to. Such
information helps identify the factors affecting the likelihood of an exchange rate realignment.
Figure 8 plots the daily estimated process $\hat\Gamma$. To identify when a jump takes place, we computed
the standard deviation over one day of the process Γ should there be no jumps. Days over which Γ
changed by a magnitude larger than five such standard deviations are picked out. They contain at
least one jump with probability close to one by Bayes' rule. Using this rule, there are twenty-four
jumps identified in the sample period. To find out what these jumps are responding to, we checked
news releases on the corresponding dates. Eight of these jumps are found to coincide with economic
news releases. These eight days are marked by dots and the jump magnitudes are in the parentheses.
Noticing that the intensity of realignment is proportional to the exponential of Γ, a 0.01 increase
in Γ roughly translates into a one percent increase of the intensity. Therefore, the jumps in Γ
caused the realignment intensity to swing by as much as 50 percent on some of these days.
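
A minimal sketch of the five-standard-deviation rule is given below. It assumes a simulated path with a constant diffusion coefficient, no drift, and two planted jumps. The parameter values and the function names `flag_jump_days` and `intensity_change` are hypothetical and do not come from the estimation.

```python
import numpy as np

def flag_jump_days(gamma, diffusion_sd, n_sd=5.0):
    """Indices t where |gamma[t+1] - gamma[t]| exceeds n_sd no-jump standard deviations."""
    changes = np.diff(gamma)
    return np.flatnonzero(np.abs(changes) > n_sd * diffusion_sd), changes

def intensity_change(d_gamma):
    """Relative change of an intensity that is proportional to exp(gamma)."""
    return np.expm1(d_gamma)

rng = np.random.default_rng(1)
sigma, dt, n_days = 0.5, 1.0 / 252, 500          # hypothetical values
diffusion_sd = sigma * np.sqrt(dt)               # one-day standard deviation without jumps
increments = diffusion_sd * rng.standard_normal(n_days)
increments[[100, 300]] += [0.52, -0.64]          # two planted jumps
gamma = 1.0 + np.concatenate([[0.0], np.cumsum(increments)])
jump_days, changes = flag_jump_days(gamma, diffusion_sd)
```

Since the intensity is proportional to exp(Γ), `intensity_change(0.01)` is close to 0.01, which is the one percent rule of thumb used in the text.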
25 Standard errors of the estimates are in the parentheses.
[Figure 8 plot: Gamma (vertical axis, 0.5 to 3.5) against Date (horizontal axis, Jan02 to Jan04). Marked jump days and magnitudes: 3/8/02 (−0.21), 1/6/03 (0.35), 6/13/03 (0.18), 8/29/03 (0.52), 9/2/03 (−0.17), 9/23/03 (0.42), 10/7/03 (0.37), 10/8/03 (−0.64).]

Figure 8: Filtered Γ and Eight of its Jumps

Events coinciding with the jumps are listed in table 5 in Appendix 4.


It can be seen that the jumps coincide with news releases on the Sino-US trade surplus, state-owned
enterprise reform, Chinese government tax revenue and, most importantly, both domestic and
foreign government interventions. Given the tightly managed nature of China’s foreign exchange
market, it is not surprising that signals of the Chinese government’s determination to maintain the
current exchange rate play a dominant role. Diplomatic pressures from foreign governments are also
important.
To illustrate how the term structure of the forward realignment rate responds to the news release,
figure 9 plots the term structure on August 28, August 29 and September 2 in 2003.
On August 29, 2003, before the visit of the US Treasury Secretary, the Japanese Finance Minister
publicly urged China to let the Yuan float freely and the state-owned enterprises in China showed
improved profitability. On the same day, the level of the forward realignment rate almost doubled
relative to the previous day. More interestingly, the peak of the term structure moved from ten
months later to six months later, indicating the market is anticipating the Yuan to appreciate sooner.
Four days later, after the diplomatic pressure receded, the peak of the term structure moved back
to roughly eight months later and the level of the term structure dropped.
To gain insights into how the prospect of a realignment is affected by the factors identified in the
news releases, economic theory is needed. Most of the economic theory about currency realignment
is on the depreciation side, unlike the case of the Yuan. Therefore, the results obtained here lend
empirical evidence to modelling currency realignment on the upward side.

[Figure 9 plot: Forward Realignment Rate (vertical axis, 0.16 to 0.32) against Year(s) later (horizontal axis, 0 to 1), with one curve each for 08/28/03, 08/29/03 and 09/02/03.]

Figure 9: Changes of the Term Structure on Jump Days

5 Conclusion
This paper provides closed-form likelihood approximations for multivariate jump-diffusion processes.
The maximum-likelihood estimator computed from this approximate likelihood achieves the asymp-
totic efficiency of the true maximum-likelihood estimator which is uncomputable in practice. It
therefore facilitates the estimation of jump-diffusions which are very useful in analyzing many eco-
nomic phenomena such as currency crises, financial market crashes, defaults, etc.
In some sense, the jump-diffusion processes considered in this paper and Appendix 2 are illustra-
tive examples. What really matters is the method: obtain the Kolmogorov equations, substitute a
postulated expansion for the transition density into the Kolmogorov equations, match terms with the
same order, and solve for the undetermined coefficients. Since Kolmogorov equations are common
to Markov processes, this method can be generalized to a very large class of processes, namely the
Markov processes.
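
As an illustration of the matching step, consider the simplest special case: a univariate diffusion with the jump term dropped. This is an assumption made only to keep the algebra short, and the notation follows the coefficient functions used in Appendix 1.

```latex
% Backward Kolmogorov equation for a univariate diffusion (no jumps)
\frac{\partial p}{\partial \Delta}
  = \mu(x)\frac{\partial p}{\partial x}
  + \frac{1}{2}\sigma^2(x)\frac{\partial^2 p}{\partial x^2}

% Postulated leading term of the expansion
p(\Delta, y|x) \approx \Delta^{-1/2}
  \exp\left[-\frac{C^{(-1)}(x,y)}{\Delta}\right] C^{(0)}(x,y)

% Matching the terms of order \Delta^{-5/2}\exp[-C^{(-1)}(x,y)/\Delta]
C^{(-1)}(x,y) = \frac{1}{2}\sigma^2(x)
  \left[\frac{\partial}{\partial x}C^{(-1)}(x,y)\right]^2

% Solution satisfying C^{(-1)}(y,y) = 0
C^{(-1)}(x,y) = \frac{1}{2}\left(\int_x^y \frac{du}{\sigma(u)}\right)^2
```

For constant σ the solution reduces to $(y-x)^2/(2\sigma^2)$, the exponent of the Gaussian transition density, which is a quick sanity check on the matching.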
The approximation technique is applied to the Chinese Yuan to uncover its realignment proba-
bility. The term structure of the forward realignment rate is hump-shaped and peaks at six months
from the end of 2003. The implication is that the financial market is anticipating an upward realign-
ment in the next year, and, conditioning on no realignments before then, the chance of realignment
is perceived to be small in the further future. Since February 2002, the realignment intensity for the
Yuan has increased fivefold. The realignment probability responds quickly to news releases on the
Sino-US trade surplus, state-owned enterprise reform, Chinese government tax revenue and, most
importantly, both domestic and foreign government interventions.

References
[1] Aït-Sahalia, Yacine (1996): “Nonparametric Pricing of Interest Rate Derivative Securities”,
Econometrica 64, 527-560.
[2] Aït-Sahalia, Yacine (2002): “Maximum Likelihood Estimation of Discretely Sampled Diffusions:
A Closed-Form Approximation Approach”, Econometrica, Vol 70, No. 1.
[3] Aït-Sahalia, Yacine (2001): “Closed-Form Likelihood Expansions for Multivariate Diffusions”,
working paper.
[4] Aït-Sahalia, Yacine and Jialin Yu (2002): “Saddlepoint Approximations for Markov Processes”,
working paper.
[5] Bandi, Federico M. and Peter C. B. Phillips (2003): “Fully Nonparametric Estimation of Scalar
Diffusion Models”, Econometrica Vol 71, No. 1, 241-284.
[6] Bichteler, Klaus, Jean-Bernard Gravereaux and Jean Jacod (1987): “Malliavin Calculus for
Processes with Jumps”, Gordon and Breach Science Publishers.
[7] Brandt, Michael and Pedro Santa-Clara (2002): “Simulated Likelihood Estimation of Diffu-
sions with an Application to Exchange Rate Dynamics in Incomplete Markets”, Journal of
Financial Economics, Vol 63, 161-210.
[8] Brémaud, Pierre (1981): “Point Processes and Queues, Martingale Dynamics”, Springer-Verlag.
[9] Duffee, Gregory (2002): “Term Premia and Interest Rate Forecasts in Affine Models”, Journal
of Finance, Vol 57, 405-443.
[10] Duffie, Darrell, Jun Pan and Kenneth Singleton (2000): “Transform Analysis and Asset Pricing
for Affine Jump Diffusions”, Econometrica, Vol 68, 1343-1376.
[11] Duffie, Darrell and Kenneth Singleton (2003): Credit Risk, Princeton University Press.
[12] Elerian, Ola, Siddhartha Chib and Neil Shephard (2001): “Likelihood Inference for Discretely
Observed Nonlinear Diffusions”, Econometrica, Vol 69, No. 4, 959-993.
[13] Gallant, A. R. and G. Tauchen (1996): “Which Moments to Match?”, Econometric Theory 12,
657-681.
[14] Gouriéroux, C., A. Monfort and E. Renault (1993): “Indirect Inference”, Journal of Applied
Econometrics 8, S85-S118.
[15] Hall, Peter and C. C. Heyde (1980): Martingale Limit Theory and Its Application, Academic Press.
[16] Hansen, L.P. and J.A. Scheinkman (1995): “Back to the Future: Generating Moment Implica-
tions for Continuous-Time Markov Processes”, Econometrica 63, 767-804.
[17] Hoffman, Kenneth (1975): Analysis in Euclidean Space, Prentice Hall.
[18] Huang, Yasheng (2003): Selling China, Cambridge University Press.

[19] Jensen, Bjarke and Rolf Poulsen (2002): “Transition Densities of Diffusion Processes: Numerical
Comparison of Approximation Techniques”, Journal of Derivatives, Vol 9, No. 4, 18-32.
[20] Kessler, M. and M. Sørensen (1999): “Estimating Equations Based on Eigenfunctions for a
Discretely Observed Diffusion Process”, Bernoulli, 5, 299-314.
[21] Liu, Jun, Francis A. Longstaff and Jun Pan (2003): “Dynamic Asset Allocation with Event
Risk”, Journal of Finance, Vol 58, No. 1, 231-259.
[22] Merton, Robert (1992): Continuous-Time Finance, Blackwell Publishers.
[23] Pedersen, A.R. (1995): “A New Approach to Maximum-Likelihood Estimation for Stochastic
Differential Equations Based on Discrete Observations”, Scandinavian Journal of Statistics,
Vol 22, 55-71.
[24] Protter, Philip (1990): “Stochastic Integration and Differential Equations”, Springer-Verlag.
[25] Revuz, D. and Marc Yor (1999): Continuous Martingales and Brownian Motion, Springer Ver-
lag, Third Edition.
[26] Robinson, P. M. (1988): “The Stochastic Difference between Econometric Statistics”, Econo-
metrica, Vol. 56, No. 3, 531-548.
[27] Schaumburg, Ernst (2001): “Maximum Likelihood Estimation of Levy Type SDEs”, Ph.D. dissertation,
Princeton University.
[28] Singleton, Kenneth (2001): “Estimation of Affine Asset Pricing Models Using the Empirical
Characteristic Function”, Journal of Econometrics, Vol. 102, 111-141.
[29] Stokey, Nancy, Robert Lucas with Edward Prescott (1996): Recursive Methods in Economic
Dynamics, Harvard University Press.
[30] Sundaresan, Suresh (2000): “Continuous-Time Methods in Finance: A Review and an Assess-
ment”, Journal of Finance, Vol. 55, No. 4, 1477-1900.
[31] Varadhan, S.R.S. (1967): “On the Behavior of the Fundamental Solution of the Heat Equation
with Variable Coefficients”, Communications on Pure and Applied Mathematics, Vol XX,
431-455.
[32] White, Halbert (1982): “Maximum Likelihood Estimation of Misspecified Models”, Economet-
rica, Vol 50, No. 1.

Appendix 1: Proofs

Proof of Lemma 1: The claim follows from Theorem 2-29 in Bichteler et al (1987).

Proof of Proposition 1: We first prove the backward equation. For simplicity of notation, we prove it
in the univariate case; the multivariate case follows in exactly the same way. For a nonnegative bounded $C^2$ function
f on R that vanishes outside a compact interval, let $P_\Delta f(x) = E[f(X_\Delta)|X_0 = x]$. By proposition VII.1.2 in
Revuz et al (1991), for a bounded $C^2$ function f on R,

$$\frac{\partial}{\partial\Delta}P_\Delta f(x) = A^B P_\Delta f(x) = P_\Delta A^B f(x)$$

By the choice of f and the continuous differentiability of $p(\Delta, y|x)$, a change of differentiation and
integration gives

$$\frac{\partial}{\partial\Delta}P_\Delta f(x) = \int f(\xi)\frac{\partial}{\partial\Delta}p(\Delta,\xi|x)\,d\xi$$

and Fubini’s Theorem gives

$$A^B P_\Delta f(x) = \mu(x)\int f(\xi)\frac{\partial}{\partial x}p(\Delta,\xi|x)\,d\xi + \frac{1}{2}\sigma^2(x)\int f(\xi)\frac{\partial^2}{\partial x^2}p(\Delta,\xi|x)\,d\xi$$
$$\quad + \int\!\!\int p(\Delta,\xi|x+c)\,\nu(c)\,dc\,f(\xi)\,d\xi - \int p(\Delta,\xi|x)\,f(\xi)\,d\xi$$

Now, picking a sequence of functions f that forms a delta-converging sequence, i.e., one that converges to a Dirac
delta generalized function at y, we obtain the backward equation.

The forward equation is obtained similarly, starting from $\frac{\partial}{\partial\Delta}P_\Delta f(x) = P_\Delta A^B f(x)$ and applying
integration by parts twice.

Derivation of Condition 2: It is clear that this initial condition does not affect the result in Theorem
1 which is satisfied by any solution to the backward equation.
Knowing $C^{(-1)}(x,y) = \frac{1}{2}\left[\frac{\partial}{\partial x}C^{(-1)}(x,y)\right]^T V(x)\left[\frac{\partial}{\partial x}C^{(-1)}(x,y)\right]$,

$$\int p(\Delta,y|x)\,dy = \int \Delta^{-n/2}\exp\left[-\frac{C^{(-1)}(x,y)}{\Delta}\right]C^{(0)}(x,y)\,dy + o(\Delta)$$
$$= \int \Delta^{-n/2}\exp\left\{-\frac{\left[\frac{\partial}{\partial x}C^{(-1)}(x,y)\right]^T V(x)\left[\frac{\partial}{\partial x}C^{(-1)}(x,y)\right]}{2\Delta}\right\}C^{(0)}(x,y)\,dy + o(\Delta)$$
$$= (2\pi)^{n/2}\int \frac{(2\pi\Delta)^{-n/2}\exp\left(-\frac{w^Tw}{2\Delta}\right)C^{(0)}\left(x, w_x^{-1}(w)\right)}{\left|\det V(x)^{1/2}\frac{\partial^2}{\partial x\partial y^T}C^{(-1)}\left(x, w_x^{-1}(w)\right)\right|}\,dw + o(\Delta)$$

by a change of variable formula. $w_x^{-1}(\cdot)$ is defined as the inverse function of $V(x)^{1/2}\frac{\partial}{\partial x}C^{(-1)}(x,\cdot)$ and satisfies
$w_x^{-1}(0) = x$. It can be verified as in the proof of Theorem 1 that this change of variable is legitimate.
The fact that $(2\pi\Delta)^{-n/2}\exp\left(-\frac{w^Tw}{2\Delta}\right)$ converges to a Dirac delta function at 0 implies, as ∆ → 0,

$$\int p(\Delta,y|x)\,dy \to (2\pi)^{n/2}\,C^{(0)}(x,x)\left|\det V(x)^{1/2}\left.\frac{\partial^2}{\partial x\partial y^T}C^{(-1)}(x,y)\right|_{y=x}\right|^{-1}$$

Requiring the right-hand side to equal 1 gives condition 2, noticing from equation 2.12

$$\left.\frac{\partial^2}{\partial x\partial y^T}C^{(-1)}(x,y)\right|_{y=x} = -V(x)^{-1}$$

The proofs to Theorem 1 and 2 are very similar. We, therefore, prove theorem 1. Corollary 1 and 2 are
obtained by explicitly solving the equations in Theorem 1 and 2.
Proof of Theorem 1: To get $C^{(k)}(x,y)$ and $D^{(k)}(x,y)$, we replace $p(\Delta,y|x)$ with (2.5) in the backward
Kolmogorov equation (2.2), match the terms with the same orders of $\Delta^k$ and the terms with the same orders
of $\Delta^k\exp\left[-\frac{C^{(-1)}(x,y)}{\Delta}\right]$ and set their coefficients to 0. Most of the steps are careful bookkeeping, only the
following two points need to be addressed.

• How to expand $\int_C p(\Delta,y|x+c)\,\nu(c)\,dc$ in increasing orders of ∆?

• Prove that, for fixed y, $w_B(\cdot,y)$ is invertible in a neighborhood of x = y and its inverse function $w_B^{-1}(\cdot)$
is continuously differentiable.

Notice

$$\int_C p(\Delta,y|x+c)\,\nu(c)\,dc \to \nu(y-x) \quad \text{as } \Delta\to 0$$

implies the expansion of $\int_C p(\Delta,y|x+c)\,\nu(c)\,dc$ does not involve terms in the form of $\Delta^k\exp\left[-\frac{C^{(-1)}(x,y)}{\Delta}\right]$.
Therefore, we first group terms with the same orders of $\Delta^k\exp\left[-\frac{C^{(-1)}(x,y)}{\Delta}\right]$, set their coefficient to 0 and get

$$C^{(-1)}(x,y) = \frac{1}{2}\left[\frac{\partial}{\partial x}C^{(-1)}(x,y)\right]^T V(x)\left[\frac{\partial}{\partial x}C^{(-1)}(x,y)\right] \qquad (5.1)$$

which is the equation for $C^{(-1)}(x,y)$ in theorem 1. Take the derivative of $w_B(x,y)$ and notice $\left.\frac{\partial}{\partial x}C^{(-1)}(x,y)\right|_{x=y} = 0$,

$$\left.\frac{\partial}{\partial x^T}w_B(x,y)\right|_{x=y} = \left[V^{1/2}(y)\right]^T\left.\frac{\partial^2}{\partial x\partial x^T}C^{(-1)}(x,y)\right|_{x=y} = V(y)^{-1/2}$$

by equation 2.12. By the Inverse Mapping Theorem (Section 8.5 in Hoffman (1975)), there is an open neighborhood
of x = y, so that $w_B(x,y)$, for fixed y, maps this neighborhood 1-1 and onto an open neighborhood
of 0 in $R^n$ and its inverse function is continuously differentiable.
Now, knowing $w_B(x,y)$ is invertible, we will expand $\int_C p(\Delta,y|x+c)\,\nu(c)\,dc$ in increasing orders of ∆.
First, for a given k ≥ 0, by (5.1)

$$\int_C \Delta^{-n/2}\exp\left[-\frac{C^{(-1)}(x+c,y)}{\Delta}\right]C^{(k)}(x+c,y)\,\nu(c)\,dc \qquad (5.2)$$
$$= \Delta^{-n/2}\int_C \exp\left[-\frac{w_B^T(x+c,y)\,w_B(x+c,y)}{2\Delta}\right]C^{(k)}(x+c,y)\,\nu(c)\,dc$$
Let $C_1 \subset C$ be an open set containing y − x on which $w_B(x+\cdot,y)$ is invertible for the given x and y. Denote
by $W \subset R^n$ the image of $C_1$ under $w_B(x+\cdot,y)$. As shown previously, W is open and contains 0. As ∆ → 0,
the last integral differs by an exponentially small amount from

$$\Delta^{-n/2}\int_{C_1}\exp\left[-\frac{w_B^T(x+c,y)\,w_B(x+c,y)}{2\Delta}\right]C^{(k)}(x+c,y)\,\nu(c)\,dc$$
$$= \Delta^{-n/2}\int_W \exp\left[-\frac{\omega^T\omega}{2\Delta}\right]g_k(x,y,\omega)\,d\omega$$

by a change of variable. The function $g_k(x,y,\omega)$ is defined in the theorem. Using the multivariate differentiation
notation in Section 2.4, we Taylor expand $g_k(x,y,\omega)$ around ω = 0 to get

$$= \Delta^{-n/2}\int_W \exp\left[-\frac{\omega^T\omega}{2\Delta}\right]\sum_{r=0}^{\infty}\frac{1}{r!}\sum_{|s|=r}\omega^s\left.\frac{\partial^s}{\partial u^s}g_k(x,y,u)\right|_{u=0}d\omega$$
$$= \sum_{r=0}^{\infty}\frac{1}{r!}\sum_{|s|=r}\left.\frac{\partial^s}{\partial\omega^s}g_k(x,y,\omega)\right|_{\omega=0}\Delta^{-n/2}\int_W \exp\left[-\frac{\omega^T\omega}{2\Delta}\right]\omega^s\,d\omega$$
$$\approx \sum_{r=0}^{\infty}\frac{1}{r!}\sum_{|s|=r}\left.\frac{\partial^s}{\partial\omega^s}g_k(x,y,\omega)\right|_{\omega=0}\Delta^{r/2}\int_{R^n}\exp\left[-\frac{\omega^T\omega}{2}\right]\omega^s\,d\omega$$

where the last term follows by first changing the variable to $\omega/\sqrt{\Delta}$ and then ignoring terms that are exponentially
small as ∆ → 0. We note that, to decide which terms are exponentially small, we need not worry about the
change of limit operators of the form $\lim_{\Delta\to 0}\sum_{r=0}^{\infty}$ to $\sum_{r=0}^{\infty}\lim_{\Delta\to 0}$ simply because only a finite number of
terms in the expansion $\sum_{r=0}^{\infty}$ will be used to compute the approximate transition density $p^{(m)}(\Delta,y|x)$ for
fixed m. I.e., the notation $\sum_{r=0}^{\infty}$ here does not mean convergence of the infinite sum, it simply represents a
Taylor expansion from which only a finite number of terms will be used.
Let $M_s^n \equiv (2\pi)^{-n/2}\int_{R^n}\exp\left[-\frac{w^Tw}{2}\right]w^s\,dw$ denote the s-th moment of the n-variate standard normal distribution.
$M_s^n = 0$ if any element of s is odd. The last expression therefore equals

$$= (2\pi)^{n/2}\sum_{r=0}^{\infty}\frac{\Delta^r}{(2r)!}\sum_{s\in S_{2r}^n}\left.\frac{\partial^s}{\partial w^s}g_k(x,y,w)\right|_{w=0}M_s^n \qquad (5.3)$$

with $S_{2r}^n$ defined as in Section 2.4.
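
The moments of the n-variate standard normal distribution that enter (5.3) can be computed exactly, since the components are independent and each univariate moment is 0 for an odd power and (p − 1)!! for an even power p. The sketch below shows this computation. The function names are hypothetical, and the multi-index s is represented as a tuple of nonnegative integers.

```python
from math import prod

def univariate_moment(power):
    """E[Z**power] for Z ~ N(0, 1): zero for odd powers, (power - 1)!! for even ones."""
    if power % 2 == 1:
        return 0
    return prod(range(1, power, 2))  # 1 * 3 * ... * (power - 1); the empty product is 1

def normal_moment(s):
    """Moment E[Z_1**s_1 * ... * Z_n**s_n] of n independent standard normal variables."""
    return prod(univariate_moment(power) for power in s)
```

For example, `normal_moment((2, 4))` multiplies the second moment of one component by the fourth moment of the other, and any multi-index with an odd entry gives zero.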
Note, we have implicitly assumed $y - x \in C$. In the case $y - x \notin C$, assumption 4 implies ν(.) and its
derivatives of all orders vanish at y − x which implies the right hand side of (5.3) is 0. It is also clear that
(5.2) is exponentially small in ∆ if $y - x \notin C$. Therefore, (5.3) is still valid.
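Now, knowing the above expansion,

$$\int_C \Delta^{-n/2}\exp\left[-\frac{C^{(-1)}(x+c,y)}{\Delta}\right]\sum_{k=0}^{\infty}C^{(k)}(x+c,y)\,\Delta^k\,\nu(c)\,dc$$
$$= (2\pi)^{n/2}\sum_{k=0}^{\infty}\Delta^k\sum_{r=0}^{\infty}\frac{\Delta^r}{(2r)!}\sum_{s\in S_{2r}^n}\left.\frac{\partial^s}{\partial w^s}g_k(x,y,w)\right|_{w=0}M_s^n$$
$$= (2\pi)^{n/2}\sum_{k=0}^{\infty}\Delta^k\sum_{r=0}^{k}\frac{1}{(2r)!}\sum_{s\in S_{2r}^n}\left.\frac{\partial^s}{\partial w^s}g_{k-r}(x,y,w)\right|_{w=0}M_s^n$$

This last expression, expanded in increasing orders of $\Delta^k$, can be used to match other terms in orders of
∆.
This completes the proof.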

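Proof of Theorem 3: The true maximum-likelihood estimator sets the score to 0. Therefore,

$$\dot{l}_{T,\Delta}(\theta_0) = -\ddot{l}_{T,\Delta}(\tilde\theta)\left(\hat\theta_{T,\Delta} - \theta_0\right)$$

for some $\tilde\theta$ in between $\theta_0$ and $\hat\theta_{T,\Delta}$.

$$I_{T,\Delta}^{1/2}(\theta_0)\left(\hat\theta_{T,\Delta} - \theta_0\right) = I_{T,\Delta}^{1/2}(\theta_0)\left[-\ddot{l}_{T,\Delta}(\tilde\theta)\right]^{-1}\dot{l}_{T,\Delta}(\theta_0)$$
$$= -\left[I_{T,\Delta}^{-1/2}(\theta_0)\,\ddot{l}_{T,\Delta}(\tilde\theta)\,I_{T,\Delta}^{-1/2}(\theta_0)\right]^{-1}I_{T,\Delta}^{-1/2}(\theta_0)\,\dot{l}_{T,\Delta}(\theta_0)$$
$$= -\left[I_{T,\Delta}^{-1/2}(\theta_0)\,\ddot{l}_{T,\Delta}(\theta_0)\,I_{T,\Delta}^{-1/2}(\theta_0)\right]^{-1}\left[I_{T,\Delta}^{-1/2}(\theta_0)\,\dot{l}_{T,\Delta}(\theta_0)\right] + O_p(1)$$
$$= G_\Delta^{-1}S_\Delta + O_p(1)$$

by assumption 6, where, for fixed ∆, random variables $G_\Delta$ and $S_\Delta$ are the probability limits under $P_{\theta_0}$ of
$I_{T,\Delta}^{-1/2}(\theta_0)\left[-\ddot{l}_{T,\Delta}(\theta_0)\right]I_{T,\Delta}^{-1/2}(\theta_0)$ and $I_{T,\Delta}^{-1/2}(\theta_0)\,\dot{l}_{T,\Delta}(\theta_0)$, respectively, as T → ∞. We have now proved the
consistency of $\hat\theta_{T,\Delta}$.
Knowing that $\hat\theta_{T,\Delta}$ will asymptotically be in an $I_{T,\Delta}^{-1/2}(\theta_0)$-neighborhood of $\theta_0$, by repeating the previous
argument, we have

$$I_{T,\Delta}^{1/2}(\theta_0)\left(\hat\theta_{T,\Delta} - \theta_0\right) = -\left[I_{T,\Delta}^{-1/2}(\theta_0)\,\ddot{l}_{T,\Delta}(\tilde\theta)\,I_{T,\Delta}^{-1/2}(\theta_0)\right]^{-1}I_{T,\Delta}^{-1/2}(\theta_0)\,\dot{l}_{T,\Delta}(\theta_0)$$
$$= -\left[I_{T,\Delta}^{-1/2}(\theta_0)\,\ddot{l}_{T,\Delta}(\theta_0)\,I_{T,\Delta}^{-1/2}(\theta_0)\right]^{-1}\left[I_{T,\Delta}^{-1/2}(\theta_0)\,\dot{l}_{T,\Delta}(\theta_0)\right] + o_p(1)$$
$$= G_\Delta^{-1}S_\Delta + o_p(1)$$

which gives the asymptotic distribution of $\hat\theta_{T,\Delta}$.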
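Now we investigate the stochastic difference (see Robinson (1988)) between $\hat\theta^{(m)}_{T,\Delta_T}$ and $\hat\theta_{T,\Delta_T}$.

$$\dot{l}_{T,\Delta_T}\left(\hat\theta^{(m)}_{T,\Delta_T}\right) - \dot{l}^{(m)}_{T,\Delta_T}\left(\hat\theta^{(m)}_{T,\Delta_T}\right)$$
$$= \dot{l}_{T,\Delta_T}\left(\hat\theta^{(m)}_{T,\Delta_T}\right)$$
$$= \dot{l}_{T,\Delta_T}\left(\hat\theta_{T,\Delta_T}\right) + \ddot{l}_{T,\Delta_T}\left(\bar\theta\right)\left(\hat\theta^{(m)}_{T,\Delta_T} - \hat\theta_{T,\Delta_T}\right)$$
$$= \ddot{l}_{T,\Delta_T}\left(\bar\theta\right)\left(\hat\theta^{(m)}_{T,\Delta_T} - \hat\theta_{T,\Delta_T}\right)$$

for some $\bar\theta$ between $\hat\theta^{(m)}_{T,\Delta_T}$ and $\hat\theta_{T,\Delta_T}$. Therefore,

$$I_{T,\Delta_T}^{1/2}(\theta_0)\left(\hat\theta^{(m)}_{T,\Delta_T} - \hat\theta_{T,\Delta_T}\right) = \left[I_{T,\Delta_T}^{-1/2}(\theta_0)\,\ddot{l}_{T,\Delta_T}\left(\bar\theta\right)I_{T,\Delta_T}^{-1/2}(\theta_0)\right]^{-1}I_{T,\Delta_T}^{-1/2}(\theta_0)\left[\dot{l}_{T,\Delta_T}\left(\hat\theta^{(m)}_{T,\Delta_T}\right) - \dot{l}^{(m)}_{T,\Delta_T}\left(\hat\theta^{(m)}_{T,\Delta_T}\right)\right]$$
$$= \left(G^{-1} + O_p(1)\right)I_{T,\Delta_T}^{-1/2}(\theta_0)\left[\dot{l}_{T,\Delta_T}\left(\hat\theta^{(m)}_{T,\Delta_T}\right) - \dot{l}^{(m)}_{T,\Delta_T}\left(\hat\theta^{(m)}_{T,\Delta_T}\right)\right]$$

where G is the probability limit of $I_{T,\Delta_T}^{-1/2}(\theta_0)\,\ddot{l}_{T,\Delta_T}(\theta_0)\,I_{T,\Delta_T}^{-1/2}(\theta_0)$ under $P_{\theta_0}$ as T → ∞.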
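We will show later that there exists a sequence $\{\Delta_T^*\}$ so that for any sequence $\{\Delta_T\}$ satisfying $\Delta_T \le \Delta_T^*$,
$\dot{l}^{(m)}_{T,\Delta_T}(\cdot)$ incurs a relative error of order $o_p(\Delta_T^m)$ in approximating $\dot{l}_{T,\Delta_T}(\cdot)$, uniformly for the parameters in
the compact set Θ. Assuming this holds now, we have, for some $\tilde\theta$ between $\hat\theta^{(m)}_{T,\Delta_T}$ and $\theta_0$,

$$I_{T,\Delta_T}^{1/2}(\theta_0)\left(\hat\theta^{(m)}_{T,\Delta_T} - \hat\theta_{T,\Delta_T}\right) = o_p(\Delta_T^m)\,I_{T,\Delta_T}^{-1/2}(\theta_0)\,\dot{l}_{T,\Delta_T}\left(\hat\theta^{(m)}_{T,\Delta_T}\right)$$
$$= o_p(\Delta_T^m)\,I_{T,\Delta_T}^{-1/2}(\theta_0)\left[\dot{l}_{T,\Delta_T}(\theta_0) + \ddot{l}_{T,\Delta_T}(\tilde\theta)\left(\hat\theta^{(m)}_{T,\Delta_T} - \theta_0\right)\right]$$
$$= o_p(\Delta_T^m)\left[I_{T,\Delta_T}^{-1/2}(\theta_0)\,\dot{l}_{T,\Delta_T}(\theta_0) + O_p(1)\,I_{T,\Delta_T}^{1/2}(\theta_0)\left(\hat\theta^{(m)}_{T,\Delta_T} - \theta_0\right)\right]$$
$$= o_p(\Delta_T^m)\left[S + O_p(1)\,I_{T,\Delta_T}^{1/2}(\theta_0)\left(\hat\theta^{(m)}_{T,\Delta_T} - \hat\theta_{T,\Delta_T} + \hat\theta_{T,\Delta_T} - \theta_0\right)\right]$$

where S is the probability limit of $I_{T,\Delta_T}^{-1/2}(\theta_0)\,\dot{l}_{T,\Delta_T}(\theta_0)$ under $P_{\theta_0}$ as T → ∞.
Therefore,

$$\left(I - o_p(\Delta_T^m)\right)I_{T,\Delta_T}^{1/2}(\theta_0)\left(\hat\theta^{(m)}_{T,\Delta_T} - \hat\theta_{T,\Delta_T}\right) = o_p(\Delta_T^m)\left[O_p(1) + O_p(1)\,I_{T,\Delta_T}^{1/2}(\theta_0)\left(\hat\theta_{T,\Delta_T} - \theta_0\right)\right]$$
$$= o_p(\Delta_T^m)$$

which proves the theorem.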


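Finally, we show the existence of $\{\Delta_T^*\}$, which was used in proving the theorem.
Uniformly for θ in the compact set Θ, $p^{(m)}(\Delta, X_{i\Delta}|X_{(i-1)\Delta},\theta)$ is an expansion of $p(\Delta, X_{i\Delta}|X_{(i-1)\Delta},\theta)$
with relative error $\Delta^m\varepsilon_1(\tilde\Delta_i, X_{i\Delta}, X_{(i-1)\Delta},\theta)$ for some $\tilde\Delta_i \le \Delta$, i.e., $p^{(m)}(\Delta, X_{i\Delta}|X_{(i-1)\Delta},\theta)$ is equal to
$p(\Delta, X_{i\Delta}|X_{(i-1)\Delta},\theta)\left(1 + \Delta^m\varepsilon_1(\tilde\Delta_i, X_{i\Delta}, X_{(i-1)\Delta},\theta)\right)$. Similarly, $\frac{\partial}{\partial\theta}p^{(m)}(\Delta, X_{i\Delta}|X_{(i-1)\Delta},\theta)$ is an expansion
of $\frac{\partial}{\partial\theta}p(\Delta, X_{i\Delta}|X_{(i-1)\Delta},\theta)$ with relative error $\Delta^m\varepsilon_2(\hat\Delta_i, X_{i\Delta}, X_{(i-1)\Delta},\theta)$ for some $\hat\Delta_i \le \Delta$.
Now we bound the functions $\varepsilon_1$ and $\varepsilon_2$. Let $\{q_t\}$ and $\{\xi_t\}$ be two sequences of positive numbers converging
to 0. Under the assumption that $M_T$ is bounded in probability for any T < ∞, $X_t$ for all t < T is in
a compact set with probability $1 - q_T$ and we can always choose the sequence $\{\Delta_T^*\}$ converging to 0 fast
enough so that for any $\{\Delta_T\}$ satisfying $\Delta_T \le \Delta_T^*$, $\|\varepsilon_1\|$ and $\|\varepsilon_2\|$ for observations inside this compact set
are bounded by $\xi_T$, uniformly for θ ∈ Θ. With this choice of $\{\Delta_T^*\}$, $\dot{l}_{T,\Delta_T}\left(\hat\theta^{(m)}_{T,\Delta_T}\right) - \dot{l}^{(m)}_{T,\Delta_T}\left(\hat\theta^{(m)}_{T,\Delta_T}\right)$ equals
$o_p(\Delta_T^m)\,\dot{l}_{T,\Delta_T}\left(\hat\theta^{(m)}_{T,\Delta_T}\right)$.
This completes the proof.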

Proof of Proposition 2: Let G(.) = H(Γ(.)). First, we show, with probability approaching one as
d → 0, $\|G'\| < M(d)$ for some $M(d) \to 0$ as d → 0.
Using equation (4.1), we can compute the forward pricing as

$$F_d(E,\Gamma) = E + d\,f_1(\Gamma) + \frac{1}{2}d^2 f_2(\Gamma,\theta) + o(d^2)$$

for some function $f_1$ and $f_2$. In particular, $f_1 = \lambda_u e^{\Gamma}E(J^u) + \lambda_d e^{-\Gamma}E(J^d)$. Notice, the parameter θ does
not affect $f_1$. Therefore, the function Γ from inverting this pricing formula satisfies

$$\frac{\partial\Gamma}{\partial\theta} = -\frac{\partial f_2(\Gamma,\theta)/\partial\theta}{\partial f_1(\Gamma,\theta)/\partial\Gamma}\,d + o(d)$$

Given Γ, the next-step iterative estimator satisfies $\frac{\partial}{\partial\theta}L(\Gamma,\theta) = 0$. Therefore,

$$\frac{\partial}{\partial\Gamma}H = -\frac{\partial^2 L/\partial\theta\,\partial\Gamma}{\partial^2 L/\partial\theta^2}$$

Now, θ is in a compact set and with probability approaching one, Γ stays in a compact set for given ∆ and
T. Therefore, $\frac{\partial}{\partial\Gamma}H$ is bounded, $\frac{\partial\Gamma}{\partial\theta}$ is O(d). It is clear then we can find $M(d) \to 0$ as d → 0 so that $\|G'\| < M(d) < 1$
with probability approaching one as d → 0. Therefore, H(Γ(.)) is a contraction. The convergence of the
iterative MLE procedure now follows from the contraction mapping theorem (see Theorem 3.2 in Stokey et al
(1996)).
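
The role of the contraction property can be illustrated with a small fixed-point iteration. The map below is a hypothetical scalar stand-in for H(Γ(.)), chosen only because its derivative is bounded by 0.5 in absolute value. It is not the iterative MLE map of the paper.

```python
import math

def fixed_point(mapping, start, tol=1e-12, max_iter=200):
    """Iterate theta <- mapping(theta) until successive values differ by less than tol."""
    theta = start
    for iteration in range(1, max_iter + 1):
        updated = mapping(theta)
        if abs(updated - theta) < tol:
            return updated, iteration
        theta = updated
    raise RuntimeError("no convergence within max_iter iterations")

def toy_map(theta):
    # Hypothetical stand-in for H(Gamma(.)); |derivative| <= 0.5, so it is a contraction.
    return 0.5 * math.cos(theta)

theta_star, n_iter = fixed_point(toy_map, start=3.0)
```

Because the map is a contraction, the iteration reaches the same fixed point from any starting value, which is the property the convergence argument relies on.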
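Now with probability approaching one as d → 0,

$$\left\|\hat\theta_{T,\Delta} - \theta_0\right\| \le \left\|\hat\theta_{T,\Delta} - \theta^*_{T,\Delta}\right\| + \left\|\theta^*_{T,\Delta} - \theta_0\right\|$$
$$= \left\|H\left(\Gamma\left(\hat\theta_{T,\Delta}\right)\right) - H\left(\Gamma(\theta_0)\right)\right\| + \left\|\theta^*_{T,\Delta} - \theta_0\right\|$$
$$\le M(d)\left\|\hat\theta_{T,\Delta} - \theta_0\right\| + \left\|\theta^*_{T,\Delta} - \theta_0\right\|$$

and

$$\left\|\hat\theta_{T,\Delta} - \theta_0\right\| \le \frac{1}{1 - M(d)}\left\|\theta^*_{T,\Delta} - \theta_0\right\|$$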
Appendix 2: Extensions of the Approximation Method
The closed-form approximation in the paper is based on the Kolmogorov equations common to Markov
processes. It is therefore possible to extend this method to other processes. This appendix provides several
such examples.

1. When some state variables do not jump

The state variables in the main part of the paper jump simultaneously. However, it is sometimes of interest
to consider cases where a subset of the state variables jump while others do not.²⁶ In this case, the solution
to the Kolmogorov equations no longer has an expansion in the form of (2.5). However, the methodology still
applies and all we need is another form of leading term in the expansion. This new leading term, together
with the closed-form approximation, will be provided below.
Let the model setup be the same as Section 2.1 except that the state variable X can be partitioned into
two subsets.

$$X_{n\times 1} = \begin{pmatrix} X^C_{n_1\times 1} \\ X^D_{n_2\times 1} \end{pmatrix}$$

where $X^D$ can jump and $X^C$ cannot.


To simplify the algebra, only the following case is considered where the variance matrix can be partitioned
as

$$V(X) = \begin{pmatrix} V_{11}\left(X^C\right) & 0 \\ 0 & V_{22}\left(X^D\right) \end{pmatrix}$$

This condition simplifies the algebra below without losing any intuition on the new leading term. It can
be relaxed, though details are suppressed.
The transition density in this case has an expansion in the form of

$$p(\Delta,y|x) = \Delta^{-n/2}\exp\left[-\frac{C^{(-1)}(x,y)}{\Delta}\right]\sum_{k=0}^{\infty}C^{(k)}(x,y)\,\Delta^k + \Delta^{-n_1/2}\exp\left[-\frac{D^{(-1)}\left(x^C,y^C\right)}{\Delta}\right]\sum_{k=1}^{\infty}D^{(k)}(x,y)\,\Delta^k$$

Intuitively, when a jump happens, $X^C$ has to diffuse to the new location while $X^D$ can jump to the new
location, hence the new form of the leading term (see also Section 2.3). This conjecture can be substituted into
the Kolmogorov equations to solve for the unknown coefficient functions.
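It turns out the functions $C^{(k)}(x,y)$ are exactly the same as those in Theorem 1 and 2. This is intuitive
since a change in jump behavior will not affect terms for the diffusion part. The other coefficient functions
are characterized below.

$$D^{(-1)}\left(x^C,y^C\right) = \frac{1}{2}\left[\frac{\partial}{\partial x^C}D^{(-1)}\left(x^C,y^C\right)\right]^T V_{11}\left(x^C\right)\left[\frac{\partial}{\partial x^C}D^{(-1)}\left(x^C,y^C\right)\right]$$

$$D^{(1)}(x,y) = \lambda(x)\,\nu(y-x) - D^{(1)}\left[L^B D^{(-1)} - \frac{n_1}{2}\right] - \left[\frac{\partial}{\partial x}D^{(-1)}(x,y)\right]^T V(x)\left[\frac{\partial}{\partial x}D^{(1)}(x,y)\right]$$

$$D^{(k+1)}(x,y) = \frac{1}{1+k}\left\{A^B D^{(k)} + (2\pi)^{n_2/2}\lambda(x)\sum_{r=0}^{k}\frac{1}{(2r)!}\sum_{s\in S_{2r}^{n_2}}M_s^{n_2}\left.\frac{\partial^s}{\partial w^s}g_{k-r}(x,y,w)\right|_{w=0} - D^{(k+1)}\left[L^B D^{(-1)} - \frac{n_1}{2}\right] - \left[\frac{\partial}{\partial x}D^{(-1)}\right]^T V(x)\left[\frac{\partial}{\partial x}D^{(k)}(x,y)\right]\right\} \quad \text{for } k > 0$$

where

$$g_k(x,y,w) \equiv \frac{C^{(k)}\left(\begin{pmatrix} x^C \\ w_B^{-1}(w)\end{pmatrix}, y\right)\nu\left(w_B^{-1}(w) - x^D\right)}{\det\left.\frac{\partial}{\partial\left(x^D\right)^T}w_B\left(x^D,y^D\right)\right|_{x^D = w_B^{-1}(w)}}, \qquad w_B\left(x^D,y^D\right) \equiv \left[V_{22}^{1/2}\left(x^D\right)\right]^T\frac{\partial}{\partial x^D}H\left(x^D,y^D\right).$$

The function H satisfies $H\left(x^D,y^D\right) = \frac{1}{2}\left[\frac{\partial}{\partial x^D}H\left(x^D,y^D\right)\right]^T V_{22}\left(x^D\right)\left[\frac{\partial}{\partial x^D}H\left(x^D,y^D\right)\right]$. Fixing $y^D$, $w_B\left(\cdot,y^D\right)$ is
invertible in a neighborhood of $x^D = y^D$ and $w_B^{-1}(\cdot)$ is its inverse function in this neighborhood. (For the
ease of notation, the dependence of $w_B^{-1}(\cdot)$ on $y^D$ is not made explicit.)
Knowing all the coefficient functions, an approximation of the transition density can be obtained by
truncating the infinite sum at an appropriate order.

26 As an example, one could model an asset price as a jump diffusion with stochastic volatility where the
volatility process follows a diffusion.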

2. State-dependent jump distribution

Now let us consider a process characterized by its infinitesimal generator $A^B$ on a bounded $C^2$ function f
with bounded first and second derivatives

$$A^B f(x) = \sum_{i=1}^{n}\mu_i(x)\frac{\partial}{\partial x_i}f(x) + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}v_{ij}(x)\frac{\partial^2}{\partial x_i\partial x_j}f(x) + \lambda(x)\int_C\left[f(x+c) - f(x)\right]\nu(x,c)\,dc$$

The jump distribution ν (x, .) is now state-dependent. The solution of the Kolmogorov equations admits
an expansion in the form of (2.5). It turns out that the coefficient functions C (k) (x, y) and D(k) (x, y) in this
case are characterized by theorem 1 and corollary 1, with the exception that ν (.) is replaced everywhere by
ν (x, .).

3. Multiple jump types

Consider a process whose infinitesimal generator $A^B$, on a bounded $C^2$ function f with bounded first and
second derivatives, is given by

$$A^B f(x) = \sum_{i=1}^{n}\mu_i(x)\frac{\partial}{\partial x_i}f(x) + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}v_{ij}(x)\frac{\partial^2}{\partial x_i\partial x_j}f(x) + \sum_u\lambda_u(x)\int_C\left[f(x+c) - f(x)\right]\nu_u(x,c)\,dc$$

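This is a process with multiple jump types with different jump intensity $\lambda_u(\cdot)$ and jump distribution
$\nu_u(x,\cdot)$. The solution of the Kolmogorov equations also admits an expansion in the form of (2.5) where the
coefficient functions $C^{(k)}(x,y)$ and $D^{(k)}(x,y)$ are given below.

$$0 = C^{(-1)}(x,y) - \frac{1}{2}\left[\frac{\partial}{\partial x}C^{(-1)}(x,y)\right]^T V(x)\left[\frac{\partial}{\partial x}C^{(-1)}(x,y)\right] \quad \text{and } C^{(-1)}(x,y) = 0 \text{ when } y = x$$

$$0 = C^{(0)}\left[L^B C^{(-1)} - \frac{n}{2}\right] + \left[\frac{\partial}{\partial x}C^{(-1)}(x,y)\right]^T V(x)\left[\frac{\partial}{\partial x}C^{(0)}(x,y)\right]$$

$$0 = C^{(k+1)}\left[L^B C^{(-1)} + (k+1) - \frac{n}{2}\right] + \left[\frac{\partial}{\partial x}C^{(-1)}(x,y)\right]^T V(x)\left[\frac{\partial}{\partial x}C^{(k+1)}(x,y)\right] + \left[\sum_u\lambda_u(x) - L^B\right]C^{(k)} \quad \text{for nonnegative } k$$

$$0 = D^{(1)} - \sum_u\lambda_u(x)\,\nu_u(x, y-x)$$

$$0 = D^{(k+1)} - \frac{1}{1+k}\left[A^B D^{(k)} + (2\pi)^{n/2}\sum_{r=0}^{k}\frac{1}{(2r)!}\sum_u\lambda_u(x)\sum_{s\in S_{2r}^n}M_s^n\left.\frac{\partial^s}{\partial w^s}g_{u,k-r}(x,y,w)\right|_{w=0}\right] \quad \text{for } k > 0$$

where

$$g_{u,k}(x,y,w) \equiv \frac{C^{(k)}\left(w_B^{-1}(w),y\right)\nu_u\left(w_B^{-1}(w) - x\right)}{\det\left.\frac{\partial}{\partial x^T}w_B(x,y)\right|_{x = w_B^{-1}(w)}}, \qquad w_B(x,y) \equiv \left[V^{1/2}(x)\right]^T\frac{\partial}{\partial x}C^{(-1)}(x,y).$$

Fixing y, $w_B(\cdot,y)$ is invertible in a neighborhood of x = y and $w_B^{-1}(\cdot)$ is its inverse function in this neighborhood.
(For the ease of notation, the dependence of $w_B^{-1}(\cdot)$ on y is not made explicit.)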
Appendix 3: History of the Yuan’s Exchange Rate Regime²⁷

Table 4 documents the history of the Yuan’s exchange rate regime since its creation.²⁸

Table 4: History of the Yuan’s Exchange Rate Regime

Date | Exchange Rate Regime | Official rate
March 1, 1955 | The Yuan was created and pegged to USD. | 2.46
August 15, 1971 | A new official rate against the USD was announced. | 2.267
February 20, 1973 | The official rate (the “Effective Rate”) was realigned. | 2.04
August 19, 1974 to December 31, 1985 | The Effective Rate was pegged to a trade-weighted basket of fifteen currencies with undisclosed compositions. The rate was fixed almost daily against that basket. | 1.5 to 2.8
January 1, 1986 | The Effective Rate was placed on a controlled float. |
July 5, 1986 | The Effective Rate was fixed against the USD until December 15, 1989. | 3.72
November 1986 | A second rate, the Foreign Exchange Swap Rate determined by market force, was created. The Foreign Exchange Swap Rate quickly moved to 5.2 in one month. | 3.72
December 15, 1989 | The Effective Rate was realigned. | 4.72
November 17, 1990 | The Effective Rate was realigned. | 5.22
April 9, 1991 | The Effective Rate started to adjust frequently depending on certain economic indicators. |
December 31, 1993 | The Effective Rate was 5.8. The Foreign Exchange Swap Rate was 8.7. | 5.8
January 1, 1994 | The Effective Exchange Rate and the swap market rate were unified at the prevailing swap market rate. Daily movement of the exchange rate of the Yuan against the USD is limited to 0.3% on either side of the reference rate as announced by the PBC. |

27 Courtesy of the Economics Department at the Chinese University of Hong Kong.
28 Official rate is quoted in Yuan/USD.

Time series of the Yuan/USD exchange rate after January 1, 1994 is plotted in figure 10.

[Figure 10 plot: Yuan / USD (vertical axis, 5.5 to 8.5) against Year (horizontal axis, 1994 to 2004).]

Figure 10: Time Series of the Yuan/USD rate after 1/1/94

Appendix 4: News Release on Jump Days

Table 5 documents the news releases that coincide with the jumps in Γ.

Table 5: News Release on the Jump Days


Date | Events | News Source
3-8-2002 | 1. State-owned enterprises, which contribute 60 percent of China’s tax revenue, see a profit drop of 1.4% last year. | ChinaOnline
3-8-2002 | 2. China plans to incur 80 billion yuan, a quarter of its planned 2003 deficit, on restructuring SOEs. | InfoProd
1-6-2003 | 1. China’s foreign trade increased 21 percent over 2002 with surplus expanding to USD 27 billion for the first 11 months of 2002. | Business Daily
1-6-2003 | 2. China’s tax revenue rose 12.1% year-over-year with GDP estimated to grow by 8%. | China Daily
6-13-2003 | Goldman Sachs forecasts an appreciation of the yuan to 8.07 by the year-end based on the view that China was preparing for a change in currency policy, as SARS and deflationary pressures recede and the dollar weakens. | AFX News
8-29-2003 | 1. Japanese Finance Minister urges China to let the yuan float freely before meeting US treasury secretary. | Japan Economic Newswire
8-29-2003 | 2. China’s state-owned enterprises posted a profit of 212 billion yuan from Jan to July, up 47% year-over-year. | AFX News
9-2-2003 | US Treasury Secretary wraps up first day of talks in China, no comments on the yuan. | AFX News
9-23-2003 | G-7 requests China to move rapidly towards more flexible exchange rates. | South China Morning Post
10-7-2003 | The People’s Bank of China is considering revaluing the yuan by about 30 percent over the next five years, subject to a final decision from the State Council led by Premier Wen Jiabao. | Jiji Press Japan
10-8-2003 | China’s Premier Wen Jiabao suggested he would resist foreign pressure to revalue the yuan. | Xinhua News
Appendix 5: Histograms of the Bootstrap Estimates

Histograms of the bootstrap estimates in Section 4.4 are provided in this appendix.
As a reminder, we used a parametric bootstrap procedure where the process Γ is simulated five
hundred times using the estimated parameters and then the forward price is computed using the
pricing formula (4.2). Each simulation is initiated at a Γ randomly picked from the estimated $\hat\Gamma$.
The iterative MLE procedure is then applied pretending only the forward price is observed as in the
actual estimation. The parameters used to simulate the bootstrap sample paths are those in table 2.
Histograms for the bootstrap estimates are plotted below.
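
The structure of a parametric bootstrap can be sketched on a much simpler model. The example below assumes a Gaussian AR(1) process as a stand-in for Γ and uses least squares in place of the iterative MLE. The pricing formula (4.2) is not reproduced, and every name and parameter value is hypothetical.

```python
import numpy as np

def simulate_path(rng, x0, phi, sigma, n_obs):
    """Simulate x[t+1] = phi * x[t] + sigma * eps[t+1], starting from x0."""
    x = np.empty(n_obs + 1)
    x[0] = x0
    shocks = sigma * rng.standard_normal(n_obs)
    for t in range(n_obs):
        x[t + 1] = phi * x[t] + shocks[t]
    return x

def estimate(x):
    """Conditional least-squares estimates of (phi, sigma) for the AR(1) stand-in."""
    lagged, current = x[:-1], x[1:]
    phi_hat = lagged @ current / (lagged @ lagged)
    resid = current - phi_hat * lagged
    return phi_hat, np.sqrt(resid @ resid / len(resid))

rng = np.random.default_rng(0)
observed = simulate_path(rng, 0.0, 0.9, 0.2, 500)     # plays the role of the data
phi_hat, sigma_hat = estimate(observed)

# 500 bootstrap samples, each started at a randomly picked observed value
draws = np.array([
    estimate(simulate_path(rng, rng.choice(observed), phi_hat, sigma_hat, 500))
    for _ in range(500)
])
boot_se = draws.std(axis=0, ddof=1)                   # bootstrap standard errors
```

The columns of `draws` are what a histogram of bootstrap estimates would display, one panel per parameter.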
[Figure 11 plots: four histogram panels titled "Distribution of Bootstrap k", "Distribution of Bootstrap Diffusion Coefficient", "Distribution of Bootstrap Jump Intensity" and "Distribution of Jump Standard Deviation".]

Figure 11: Histogram of Bootstrap Estimates
