MANAGEMENT SCIENCE
Vol. 57, No. 6, June 2011, pp. 1172–1194

ISSN 0025-1909, EISSN 1526-5501, doi 10.1287/mnsc.1110.1330
© 2011 INFORMS

Efficient Risk Estimation via Nested Sequential Simulation
Mark Broadie
Graduate School of Business, Columbia University, New York, New York 10027, mnb2@columbia.edu
Yiping Du
Industrial Engineering and Operations Research, Columbia University, New York, New York 10027,
yd2166@columbia.edu
Ciamac C. Moallemi
Graduate School of Business, Columbia University, New York, New York 10027,
ciamac@gsb.columbia.edu
We analyze the computational problem of estimating financial risk in a nested simulation. In this approach,
an outer simulation is used to generate financial scenarios, and an inner simulation is used to estimate
future portfolio values in each scenario. We focus on one risk measure, the probability of a large loss, and we
propose a new algorithm to estimate this risk. Our algorithm sequentially allocates computational effort in the
inner simulation based on marginal changes in the risk estimator in each scenario. Theoretical results are given
to show that the risk estimator has a faster convergence order compared to the conventional uniform inner
sampling approach. Numerical results consistent with the theory are presented.
Key words: simulation; decision analysis; risk; risk management; sequential analysis
History: Received February 24, 2010; accepted January 9, 2011, by Gérard Cachon, stochastic models and
simulation. Published online in Articles in Advance April 29, 2011.
1. Introduction

The measurement and management of risk is an increasingly important function at financial institutions. A primary goal of risk measurement is to ensure that banks and other financial firms have sufficient capital reserves in relation to their holdings and investment activities. The recent failures of large and small investment and commercial banks highlight the need for better modeling and computation of financial risk measures.

Risk measurement is typically divided into two stages: scenario generation and portfolio revaluation. Scenario generation refers to the sampling of risk factors over a given time horizon. This first (or outer) stage is often performed with Monte Carlo simulation, especially when more realistic models with a large number of correlated risk factors are used. Portfolio revaluation refers to the computation of the portfolio value at the risk time horizon given a particular scenario of risk factors. Often the portfolio contains derivative securities with nonlinear payoffs that, in conjunction with more realistic financial models, require Monte Carlo simulation for this second (or inner) stage. Thus, in realistic applications, the risk measurement calculation involves a two-level nested Monte Carlo simulation. Because nested Monte Carlo simulation can represent a prohibitive computational challenge, various approximation approaches are often employed. The focus of our paper is on algorithmic improvements of the direct nested Monte Carlo simulation approach, so that risk computation can be done on portfolios of derivative securities with more realistic multifactor financial models.

In this paper, we consider what is perhaps the most basic risk measure, the probability that the future portfolio value falls below a prespecified threshold, in other words, the probability of a large loss. When analytical formulas are available for the portfolio revaluation step, a primary challenge of a single-level Monte Carlo simulation is to reduce the variance of the simulation risk estimator. In the nested setting, simulation is also used for the portfolio revaluation step, and additional sources of variability are introduced. The second level of simulation introduces bias into the computation, and hence both bias and variance need to be balanced and reduced to minimize the total error in the simulation risk estimate.

The problem of estimating the probability of a loss via nested simulation was first analyzed by Lee (1998) and Lee and Glynn (2003), and was subsequently considered by Gordy and Juneja (2010). These authors primarily considered and analyzed uniform nested simulation estimators. Such estimators employ
a constant number of inner samples across portfolio revaluation calculations, thus allocating computational effort uniformly across all scenarios. They demonstrate that, asymptotically, the bias of a uniform estimator is a function of the number of inner samples used in each portfolio revaluation, whereas the variance of a uniform estimator is a function of the number of outer scenarios. They characterize the asymptotically optimal uniform estimator. This estimator balances a limited computational budget between using many outer scenarios, to lower variance, and using many inner samples in each scenario, to lower bias, in a way that minimizes the overall mean squared error (MSE) among the class of uniform estimators.

This paper seeks to exploit the fact that accurate portfolio revaluation is not equally important across all scenarios. Nested simulation can be made much more efficient by allocating computational effort nonuniformly across scenarios. Nonuniform estimators have been suggested previously by others in a number of contexts (e.g., Lee and Glynn 2003; Lesnevski et al. 2004, 2007; Gordy and Juneja 2008; Lan et al. 2010). Here, we propose and analyze a novel class of nonuniform estimators based on the idea of allocating additional effort to scenarios with a greater expected marginal change to the risk measure. Specifically, the main contributions of this paper are as follows:

1. We propose a nonuniform nested simulation algorithm for estimating the probability of a loss.
Our algorithm proceeds by allocating the inner stage samples for portfolio revaluation in a sequential fashion. At each time step, it myopically selects the scenario where one additional inner stage sample will have the greatest marginal impact to the estimated loss probability. This algorithm is simple to implement and incurs minimal computational overhead.

2. We provide an analysis that demonstrates the lower asymptotic bias of our approach.
Given $m$ inner stage samples in each scenario, a uniform nested estimator has an asymptotic bias of order $m^{-1}$. We analyze a simplified variation of our nonuniform estimator and demonstrate that with an average of $\bar{m}$ inner stage samples per scenario, the asymptotic bias is of order $\bar{m}^{-2+\epsilon}$ for all positive $\epsilon$. Hence, for the same overall number of samples, the nonuniform estimator reduces bias by an order of magnitude. This theoretical analysis builds on ideas from sequential hypothesis testing and highlights the relationship between our nonuniform estimation algorithm and classical sequential hypothesis testing.

3. We provide an analysis that demonstrates the lower asymptotic MSE of our approach.
Given a computational budget of $k$, the optimal uniform nested estimator results in an asymptotic MSE of order $k^{-2/3}$. Because nonuniform sampling provides a lower bias for the same number of inner stage samples, some of this computational savings can be used for the generation of additional outer scenarios to lower variance. We show that our nonuniform method has an asymptotic MSE of order $k^{-4/5+\epsilon}$ for all positive $\epsilon$. Furthermore, we demonstrate a practical implementation of our nonuniform estimator that adaptively balances bias (inner sampling) and variance (outer scenario generation).

4. We demonstrate the practical benefits of our method via numerical experiments.
Numerical experiments demonstrate that the performance of our nonuniform nested estimation algorithm is up to two orders of magnitude better than competing methods. Hence, we illustrate that the results achievable in practice are consistent with the gains suggested by the theory.

The rest of this paper is organized as follows. Section 1.1 contains a brief literature review. The problem setup and notation are given in §2. Results for uniform inner stage sampling are reviewed in §3. A sequential nonuniform algorithm is motivated and presented in §4 and a theoretical analysis is given in §5. Section 6 gives a practically implementable adaptive version of the sequential algorithm, and numerical results are provided in §7. Concluding remarks are given in §8, and proofs are provided in the appendix.

1.1. Literature Review

Overviews of financial risk measurement and management are given in Crouhy et al. (2000), Jorion (2006), and McNeil et al. (2006). There is a large literature on the properties of alternative risk measures (see, e.g., Artzner et al. 2000, Rockafellar and Uryasev 2002, Föllmer and Schied 2002). Variance reduction techniques to improve first stage sampling are given in Glasserman et al. (2000, 2002).

Most closely related to our work is that of Lee (1998) and Lee and Glynn (2003), who consider the problem of estimating the probability of a large loss and analyze nested simulation estimators and their convergence properties under uniform inner stage sampling. They consider two settings, where the underlying scenario space is either continuous or discrete.¹ They establish that, given a total computational budget of $k$, the optimal uniform nested estimator results in an asymptotic MSE of order $k^{-2/3}$ in the continuous case and $k^{-1} \log k$ in the discrete case. Independently, Gordy and Juneja (2010) also consider estimating the probability of large loss in the continuous case, under a different set of assumptions. They

¹ In this paper, we will consider only continuous scenario spaces. Note that the theory is qualitatively different in the discrete case versus the continuous case.
also consider two additional risk measures (value at risk and expected shortfall). For each of these three risk measures, they derive asymptotic bias and variance results for uniform second stage sampling. This allows them to derive the optimal allocation of effort between first and second stage sampling and derive the optimal asymptotic MSE of order $k^{-2/3}$. They also propose a jackknife procedure for reducing bias.

The idea of nonuniform nested estimation of risk measures dates back to at least the work of Lee and Glynn (2003). In the discrete case, they identify a class of nonuniform nested estimators for the probability of a large loss with asymptotic MSE of order $k^{-1} \log k$. In this setting, the nonuniform estimator achieves the same asymptotic convergence as the uniform estimator, but with a better constant. Lesnevski et al. (2004, 2007) propose a nonuniform nested estimator for a related discrete problem: they estimate the worst case expected loss across a finite set of scenarios. They are able to develop confidence intervals for their estimation procedure. Lan et al. (2007, 2010) and Liu et al. (2010) extend this work to the case of estimating expected shortfall. Contemporaneous with the present work, Gordy and Juneja (2008) suggest a broad class of nonuniform estimators for estimating the probability of a large loss, as in the present setting. Their description is rather general, however, whereas we provide a concrete algorithm.

Note that some of the nonuniform estimators in this prior literature have similarities to the nonuniform estimator that we propose; we discuss these in §4. Critically, however, none of this prior work is able to establish theoretically that a nonuniform estimator converges at a faster asymptotic order than is possible with uniform estimators.

There are some connections between nested simulation to estimate risk and ranking and selection (R&S) procedures that search for the best among a finite number of systems. For an overview of ranking and selection, see Kim and Nelson (2005) and the book by Chen and Lee (2010). Each R&S system corresponds to an outer sample, and sampling a performance measure from a system corresponds to an inner sample. Many R&S procedures rely on myopic rules to determine an allocation of inner samples (e.g., Frazier et al. 2008), and the spirit of our procedure is similar. R&S typically considers a finite and small number of systems, whereas our outer sampling draws from an infinite and often multidimensional domain. The R&S objective of finding the best performing system is also different than estimating a risk measure across a range of first stage outcomes.

Finally, also of interest is the work of Liu and Staum (2010); they explore an alternative approach based on stochastic kriging for estimating a risk measure. Hong and Juneja (2009) consider the benefits of kernel smoothing in risk estimation. Sun et al. (2011) consider nested simulation in the context of estimating conditional variance.

2. Problem Formulation: Nested Simulation

Consider the problem of measuring the risk of a portfolio of securities at some future time $t = \tau$ (the risk horizon), from the perspective of an observer at time $t = 0$. Denote the current portfolio value by $X_0$. The value of the portfolio at time $\tau$, $X_\tau$, is in general a random variable and thus is not known at time 0. We assume, however, that there is a probabilistic model for the uncertainty between times 0 and $\tau$. In particular, suppose that $\Omega$ is a set of possible future scenarios or risk factors. Each scenario $\omega \in \Omega$ incorporates sufficient information so as to determine all asset prices at time $\tau$. Thus, in each scenario $\omega$, the portfolio has value $X_\tau(\omega)$. The mark-to-market loss of the portfolio at time $\tau$ in scenario $\omega$ is given by²

$$L(\omega) \triangleq X_0 - X_\tau(\omega).$$

² Without loss of generality, we assume the portfolio has no intermediate cashflows before time $\tau$, and that the riskless rate is 0.

A risk measure is a functional that quantifies the risk of the random variable $L$ by a scalar $\rho(L) \in \mathbb{R}$. Some common examples of risk measures include value at risk and conditional value at risk. In this paper, we will focus on what is perhaps the most basic risk measure, the probability of a large loss; that is, given a threshold $c \in \mathbb{R}$, we are interested in estimating the probability of the loss $L$ exceeding $c$. Denote the resulting probability by $\alpha \triangleq P(L \ge c)$.

To estimate the loss probability $\alpha$, we face two challenges. First, typically, the space of possible scenarios $\Omega$ is quite large, if not infinite. Thus, one approach is to approximate the distribution of the loss random variable $L$ with an empirical distribution obtained by Monte Carlo sampling. This is referred to as the outer level (or first stage) of the simulation. In particular, if $\omega_1, \ldots, \omega_n$ are $n$ independent and identically distributed (i.i.d.) samples drawn according to the physical (or real-world) distribution of $\omega$, then we can approximate the loss probability $\alpha$ by

$$\alpha \approx \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{\{L(\omega_i) \ge c\}}. \tag{1}$$

However, even in a single scenario $\omega_i$, it may be difficult to exactly compute the loss $L(\omega_i)$. The portfolio may contain a collection of complex, path-dependent securities with random cashflows between times $\tau$ and some final horizon $T$. Then, the loss $L(\omega_i)$ must be estimated via an inner level (or second stage) of Monte Carlo simulation of the expected cashflows of
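Figure 1 Illustration of Uniform Sampling
[Figure: tree diagram plotted against time $t$, with the time axis running from $0$ to $T$; outer scenarios $\omega_1, \ldots, \omega_n$ branch from time 0, and scenario $\omega_i$ branches further into inner samples $\hat{Z}_{i,1}, \ldots, \hat{Z}_{i,m}$.]
Notes. The outer stage generates $n$ financial scenarios $\omega_1, \ldots, \omega_n$. Conditional on scenario $\omega_i$, $m$ inner stage portfolio losses $\hat{Z}_{i,1}, \ldots, \hat{Z}_{i,m}$ are generated.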
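The two-stage procedure depicted in Figure 1 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `uniform_estimator`, the callback signatures, and the toy model (`toy_scenario`, `toy_inner_loss`, a standard normal true loss with Gaussian inner noise) are hypothetical choices introduced here and do not come from the paper.

```python
import random

def uniform_estimator(m, n, c, sample_scenario, sample_inner_loss, rng):
    """Nested estimate of P(L >= c) from n outer scenarios with m inner samples each."""
    exceed = 0
    for _ in range(n):
        omega = sample_scenario(rng)                               # outer stage: one scenario
        inner = [sample_inner_loss(omega, rng) for _ in range(m)]  # inner stage: m loss samples
        loss_hat = sum(inner) / m                                  # average of the inner samples
        exceed += loss_hat >= c                                    # indicator of a large loss
    return exceed / n

# Hypothetical toy model (an assumption made for illustration only).
def toy_scenario(rng):
    # The scenario is identified with its true loss L(omega), drawn from N(0, 1).
    return rng.gauss(0.0, 1.0)

def toy_inner_loss(omega, rng):
    # Unbiased but noisy inner sample of the loss in scenario omega.
    return omega + rng.gauss(0.0, 5.0)

rng = random.Random(0)
alpha_hat = uniform_estimator(m=50, n=2000, c=1.5,
                              sample_scenario=toy_scenario,
                              sample_inner_loss=toy_inner_loss,
                              rng=rng)
print(alpha_hat)
```

Because the true loss distribution of this toy model is known in closed form, rerunning the sketch with different values of `m` and `n` gives a rough feel for how inner noise and the number of scenarios each affect the estimate.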
the portfolio over the interval $[\tau, T]$. The inner simulation occurs under the risk-neutral distribution, conditioned on the scenario $\omega_i$. If $\hat{Z}_{i,1}, \ldots, \hat{Z}_{i,m}$ are $m$ i.i.d. samples of losses generated according to this second stage of simulation, each with mean $L(\omega_i)$, then we can approximate the loss $L(\omega_i)$ in scenario $\omega_i$ by

$$\hat{L}_i \triangleq \frac{1}{m} \sum_{j=1}^{m} \hat{Z}_{i,j}. \tag{2}$$

The Uniform estimator of Algorithm 1 describes a nested simulation procedure that combines the estimates from the outer and inner levels of simulation in the obvious way to produce an overall estimate of the loss probability. The estimator is a function of two parameters: $n$, the number of outer stage samples, and $m$, the number of inner stage samples. We say that this estimator samples uniformly in the sense that a constant number of inner stage samples is used for each outer stage scenario. This procedure is illustrated in Figure 1.

Algorithm 1 (Estimate the probability of a large loss using a uniform nested simulation. The parameter $m$ is the number of inner samples per scenario. The parameter $n$ is the number of outer scenarios.)
1: procedure Uniform($m$, $n$)
2:   for $i \leftarrow 1$ to $n$ do
3:     generate scenario $\omega_i$
4:     conditioned on scenario $\omega_i$, generate i.i.d. samples $\hat{Z}_{i,1}, \ldots, \hat{Z}_{i,m}$ of portfolio losses
5:     compute an estimate of the loss in scenario $\omega_i$, $\hat{L}_i \leftarrow (1/m) \sum_{j=1}^{m} \hat{Z}_{i,j}$
6:   end for
7:   compute an estimate of the probability of a large loss, $\hat{\alpha} \leftarrow (1/n) \sum_{i=1}^{n} \mathbf{1}_{\{\hat{L}_i \ge c\}}$
8:   return $\hat{\alpha}$
9: end procedure

3. Optimal Uniform Sampling

The Uniform estimator is a function of two parameters: $n$, the number of scenarios, and $m$, the number of inner stage samples for each scenario. This raises an obvious question: what are the best choices for the parameters $m$ and $n$? This question has been addressed in the work of Lee (1998) and Gordy and Juneja (2010). We follow the latter approach.

Denote the Uniform estimate of the probability of a large loss by $\hat{\alpha}_{m,n} \triangleq \text{Uniform}(m, n)$. The obvious objective is to choose parameters $(m, n)$ so as to minimize the MSE of the estimate $\hat{\alpha}_{m,n}$, subject to the constraint of a limited budget of computational resources. The Uniform estimator involves outer scenario generation and inner sampling. We will make the assumption that the computational effort of this estimator is dominated by the latter.³

³ This will typically be true because the risk horizon $\tau$ is often short relative to the time horizon $T$ of realized cashflows. In any event, the analysis in this paper can easily be extended to account for the computational effort of scenario generation.

Given parameters $(m, n)$, a total of $mn$ inner samples are generated to compute the estimate $\hat{\alpha}_{m,n}$. Thus, given a computational work budget $k$ on the total number of inner samples, we have the optimization problem

$$
\begin{aligned}
\underset{m,\,n}{\text{minimize}} \quad & E[(\hat{\alpha}_{m,n} - \alpha)^2] \\
\text{subject to} \quad & mn \le k, \\
& m, n \ge 0.
\end{aligned} \tag{3}
$$

The mean squared error objective can be decomposed into variance and bias terms according to

$$E[(\hat{\alpha}_{m,n} - \alpha)^2] = \underbrace{E[(\hat{\alpha}_{m,n} - E[\hat{\alpha}_{m,n}])^2]}_{\text{variance}} + \underbrace{(E[\hat{\alpha}_{m,n}] - \alpha)^2}_{\text{bias}^2}. \tag{4}$$

To analyze the asymptotic behavior of the MSE, first consider the following technical assumption:⁴

⁴ For an alternative set of assumptions, see Lee (1998).

Assumption 1. Denote by $L(\omega)$ the portfolio loss in scenario $\omega$ at time $\tau$, and denote by $\hat{L}$ an estimator of the form (2) for $L(\omega)$, based on the average of $m$ i.i.d. inner
stage samples. Assume the following:
1. The joint probability density function $p_m(l, \hat{l})$ of $(L, \hat{L})$ and its partial derivatives $(\partial/\partial l)\, p_m(l, \hat{l})$ and $(\partial^2/\partial l^2)\, p_m(l, \hat{l})$ exist for each $m$ and $(l, \hat{l})$.
2. For each $m \ge 1$, there exist functions $f_{0,m}(\cdot)$, $f_{1,m}(\cdot)$, and $f_{2,m}(\cdot)$ so that

$$p_m(l, \hat{l}) \le f_{0,m}(\hat{l}), \qquad \left| \frac{\partial}{\partial l} p_m(l, \hat{l}) \right| \le f_{1,m}(\hat{l}), \qquad \left| \frac{\partial^2}{\partial l^2} p_m(l, \hat{l}) \right| \le f_{2,m}(\hat{l})$$

for all $(l, \hat{l})$. Furthermore,

$$\sup_m \int |\hat{l}|^r f_{i,m}(\hat{l}) \, d\hat{l} < \infty$$

for all $i = 0, 1, 2$, and $0 \le r \le 4$.

Gordy and Juneja (2010) establish the following:⁵

Theorem 1. Suppose that Assumption 1 holds, and denote by $f(\cdot)$ the density of the loss variable $L$. As $m \to \infty$, the bias of the Uniform estimator asymptotically satisfies

$$E[\hat{\alpha}_{m,n} - \alpha] = \frac{\theta_c}{m} + O(m^{-3/2}),$$

where

$$\theta_c \triangleq -\Theta'(c), \qquad \Theta(c) \triangleq \tfrac{1}{2} f(c)\, E[\sigma^2(\omega) \mid L(\omega) = c], \tag{5}$$

and $\sigma^2(\omega)$ is the variance of the inner stage samples in scenario $\omega$.

⁵ In what follows, given arbitrary sequences $\{f_N\}$ and $\{g_N\}$, and a positive sequence $\{q_N\}$, as $N \to \infty$, we will say that $f_N = g_N + O(q_N)$ if $\limsup_{N \to \infty} |f_N - g_N|/q_N < \infty$, i.e., if the difference between $f$ and $g$ is asymptotically bounded above by some constant multiple of $q$. Similarly, we will say that $f_N = g_N + o(q_N)$ if $\lim_{N \to \infty} |f_N - g_N|/q_N = 0$, i.e., if the difference between $f$ and $g$ is asymptotically dominated by every constant multiple of $q$. Finally, we will say that $f_N = g_N + \Theta(q_N)$ if $0 < \liminf_{N \to \infty} |f_N - g_N|/q_N \le \limsup_{N \to \infty} |f_N - g_N|/q_N < \infty$, i.e., if the difference between $f$ and $g$ is asymptotically bounded above and below by constant multiples of $q$.

Theorem 1 directly provides an asymptotic analysis of the bias term in the MSE (4). Theorem 1 can immediately be employed to analyze the variance term, as in the following corollary:

Corollary 1. Under the conditions of Theorem 1, as $m \to \infty$, the variance of the Uniform estimator satisfies

$$\mathrm{Var}(\hat{\alpha}_{m,n}) = \frac{\alpha(1 - \alpha)}{n} + O(m^{-1} n^{-1}).$$

Proof. Note that

$$\mathrm{Var}(\hat{\alpha}_{m,n}) = \mathrm{Var}\!\left( \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{\{\hat{L}_i \ge c\}} \right) = \frac{1}{n} \mathrm{Var}\big( \mathbf{1}_{\{\hat{L}_1 \ge c\}} \big) = \frac{E[\hat{\alpha}_{m,n}]\,(1 - E[\hat{\alpha}_{m,n}])}{n},$$

where we have used the fact that the loss estimates $\{\hat{L}_i\}$ are independent and identically distributed. Applying Theorem 1,

$$
\begin{aligned}
\mathrm{Var}(\hat{\alpha}_{m,n}) &= \frac{\alpha(1 - \alpha)}{n} + \frac{E[\hat{\alpha}_{m,n} - \alpha]\,(1 - E[\hat{\alpha}_{m,n}])}{n} + \frac{\alpha\, E[\alpha - \hat{\alpha}_{m,n}]}{n} \\
&= \frac{\alpha(1 - \alpha)}{n} + O(m^{-1} n^{-1}).
\end{aligned}
$$

Theorem 1 and Corollary 1 provide a complete asymptotic characterization of the MSE of the Uniform estimator. The asymptotic variance of the estimator is determined by the number of scenarios $n$ and decays as $n^{-1}$, whereas the asymptotic bias of the estimator is determined by the number of inner stage samples per scenario $m$ and decays as $m^{-1}$.

Given a computational budget of a total of $k$ inner stage samples, a naive choice of parameters $(m, n)$ might be to sample equally in the outer and inner stages, i.e., set $m = n = k^{1/2}$. This would result in an asymptotic bias squared of order $k^{-1}$, an asymptotic variance of order $k^{-1/2}$, and an overall asymptotic MSE of order $k^{-1/2}$. Because the variance is asymptotically dominating the bias squared and determining the MSE, the naive Uniform estimator is clearly not optimal. One could do better by using fewer inner stage samples per scenario and increasing the number of scenarios.

To find the optimal Uniform estimator, using Theorem 1 and Corollary 1, we can approximate the minimum MSE problem (3) by the optimization problem

$$
\begin{aligned}
\underset{m,\,n}{\text{minimize}} \quad & \frac{\alpha(1 - \alpha)}{n} + \frac{\theta_c^2}{m^2} \\
\text{subject to} \quad & mn \le k, \\
& m, n \ge 0.
\end{aligned}
$$

This suggests optimal allocations

$$m^* = \frac{k^{1/3}}{\beta^*}, \qquad n^* = \beta^* k^{2/3}, \qquad \text{where } \beta^* \triangleq \left( \frac{\alpha(1 - \alpha)}{2\theta_c^2} \right)^{1/3}, \tag{6}$$

and the optimal asymptotic mean squared error

$$E[(\hat{\alpha}_{m^*,n^*} - \alpha)^2] = 3(\beta^*)^2 \theta_c^2\, k^{-2/3} + o(k^{-2/3}). \tag{7}$$

The optimal allocations suggested by (6) involve, asymptotically, order $k^{2/3}$ outer stage scenarios and order $k^{1/3}$ inner stage samples per scenario. However, the optimal constant factors depend on the constant $\theta_c$, and it is not clear how to effectively estimate $\theta_c$ a priori. As we will see in §7, the choice of these
constant factors is critical to the practical performance of a uniform estimator.

Finally, it is instructive to compare the rate of convergence of the optimal Uniform estimator in a two-level nested Monte Carlo simulation to that of an estimator of the probability of a large loss in a single-level Monte Carlo simulation. In the latter case, scenarios $\omega_1, \ldots, \omega_n$ are generated. It is assumed that in each scenario $\omega_i$, the loss $L(\omega_i)$ can be exactly computed and the probability is estimated via (1). Note that the estimator (1) is unbiased and has a variance proportional to $n^{-1}$. In a single-level simulation, then, the amount of work is proportional to $n$, whereas the MSE of the estimator decays proportional to $n^{-1}$. In a two-level simulation, however, as shown above, the amount of work is proportional to $k$, whereas the MSE decays at best at a rate of $k^{-2/3}$. This slower rate of decay is due to the bias introduced by the inner level of simulation.

4. Sequential Sampling

The Uniform estimator described in §2 and §3 employs a constant number of inner stage samples for each outer stage sample. It is intuitively clear that this may not be an efficient strategy. As an illustrative example, consider the situation depicted in Figure 2. Here, we wish to estimate the loss probability associated with the shaded region. There are two outer stage scenarios, $\omega_1$ and $\omega_2$, associated with the portfolio losses $L(\omega_1)$ and $L(\omega_2)$, respectively. These true losses are approximated, in each scenario, by the estimated losses $\hat{L}_1$ and $\hat{L}_2$.

Suppose that, under a uniform nested simulation, the portfolio losses estimated in each scenario are distributed according to the dashed probability distributions. Then, it is clear that it would be advantageous to employ fewer inner stage samples at scenario $\omega_1$, and more inner stage samples at scenario $\omega_2$. This is because the loss probability estimate is calculated according to

$$\hat{\alpha} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{\{\hat{L}_i \ge c\}}. \tag{8}$$

Thus, only the ordinal position of the estimates $\hat{L}_1$ and $\hat{L}_2$ relative to the loss threshold $c$ is relevant. Given the uncertainty in the estimate $\hat{L}_1$, it is fairly certain that $L(\omega_1) < c$, and, indeed, this could likely be inferred using fewer inner samples in scenario $\omega_1$. Given the uncertainty in the estimate $\hat{L}_2$, the fact that $L(\omega_2) \ge c$, on the other hand, is much less certain. Without more inner samples in this scenario, there may be significant risk of misclassifying $L(\omega_2)$. These observations suggest that a nonuniform sampling strategy may be superior: the number of inner samples $m_1$ employed at scenario $\omega_1$ should be less than the number of inner samples $m_2$ employed at scenario $\omega_2$.

The discussion above suggests that in a scenario $\omega$ with a loss $L(\omega)$ that is much greater than $c$ or much less than $c$, few inner samples are necessary. If the loss $L(\omega)$ is close to $c$, however, many inner samples are necessary. Unfortunately, a priori, it is not clear how to do this. It is impossible to know the value of $L(\omega)$; this is exactly what we seek to estimate via the inner Monte Carlo simulation.

We propose a procedure that simultaneously maintains estimates of the loss in each scenario while sequentially attempting to allocate additional inner samples across the outer scenarios. We will first motivate our algorithm with an informal justification, and then give a precise description. In particular, suppose that there are $n$ scenarios $\omega_1, \ldots, \omega_n$. For each scenario $\omega_i$, suppose that $m_i$ inner samples $\hat{Z}_{i,1}, \ldots, \hat{Z}_{i,m_i}$ have been made, resulting in the loss estimate $\hat{L}_i \triangleq (1/m_i) \sum_{j=1}^{m_i} \hat{Z}_{i,j}$. This results in an overall probability of a large loss estimate given by (8).

Without loss of generality, assume that $\hat{L}_i \ge c$. Suppose we wish to perform one additional inner stage
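Figure 2 An Illustration of the Benefits of Nonuniform Sampling
[Figure: probability densities of the estimates $\hat{L}_1$ and $\hat{L}_2$ (vertical axis: Probability; horizontal axis: Loss), with $L(\omega_1)$, the threshold $c$, and $L(\omega_2)$ marked on the loss axis; annotations "Choose $m_1$ small" and "Choose $m_2$ large".]
Notes. The uncertainty in the loss $\hat{L}_1$ estimated in scenario $\omega_1$ is unlikely to impact the overall probability of large loss estimate, hence the number of inner samples $m_1$ in this scenario can be chosen to be small. In scenario $\omega_2$, however, a large number of inner samples $m_2$ should be used.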
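The intuition behind Figure 2 can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that the estimate $\hat{L}$ in a scenario is normally distributed around the true loss with standard deviation $\sigma/\sqrt{m}$ (a Gaussian approximation), so that the probability of $\hat{L}$ landing on the wrong side of $c$ is $\Phi(-\sqrt{m}\,|L - c|/\sigma)$. The function names and the numerical values are hypothetical and are not taken from the paper.

```python
from math import erf, sqrt

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def misclassification_prob(loss, c, sigma, m):
    """P(L_hat falls on the wrong side of c), assuming L_hat ~ N(loss, sigma^2 / m)."""
    return normal_cdf(-sqrt(m) * abs(loss - c) / sigma)

def samples_needed(loss, c, sigma, target):
    """Smallest m for which the misclassification probability is at most target."""
    m = 1
    while misclassification_prob(loss, c, sigma, m) > target:
        m += 1
    return m

# Hypothetical values: scenario 1 lies far below the threshold, scenario 2 just above it.
c, sigma, target = 2.0, 4.0, 0.05
m1 = samples_needed(loss=0.0, c=c, sigma=sigma, target=target)
m2 = samples_needed(loss=2.2, c=c, sigma=sigma, target=target)
print(m1, m2)
```

Under this approximation the required number of samples scales like $\sigma^2/(L - c)^2$, which is why the scenario close to the threshold needs far more inner samples than the distant one.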
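Figure 3 An Illustration of the Impact of an Additional Sample
[Figure: probability density of the loss estimate in scenario $\omega_i$ (vertical axis: Probability; horizontal axis: Loss), showing the threshold $c$ and the estimate $\hat{L}_i$ before and after the step labeled "Add 1 sample".]
Note. An additional inner sample in scenario $\omega_i$ will only change the overall probability of loss estimate if $\hat{L}_i$ moves to the opposite side of the loss threshold $c$.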
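Figure 3 motivates the myopic allocation rule (10) and the Sequential estimator of Algorithm 2 developed in this section. A minimal Python sketch of that procedure is given here under several assumptions: the conditional standard deviation of each scenario is taken as known through a hypothetical callback `inner_std`, the priority queue is realized with the standard-library `heapq` module, and the toy model and all function names are illustrative choices rather than the authors' implementation.

```python
import heapq
import random

def sequential_estimator(m0, m_bar, n, c, sample_scenario, sample_inner_loss, inner_std, rng):
    """Sketch of a sequential nested estimate of P(L >= c) with an average of m_bar inner samples."""
    scenarios, sums, counts, heap = [], [], [], []
    for i in range(n):
        omega = sample_scenario(rng)
        total = sum(sample_inner_loss(omega, rng) for _ in range(m0))  # m0 initial inner samples
        scenarios.append(omega)
        sums.append(total)
        counts.append(m0)
        margin = m0 * abs(total / m0 - c) / inner_std(omega)           # error margin of scenario i
        heapq.heappush(heap, (margin, i))
    for _ in range(m_bar * n - m0 * n):                                # remaining inner samples
        _, i = heapq.heappop(heap)                                     # scenario with smallest margin
        sums[i] += sample_inner_loss(scenarios[i], rng)                # one additional inner sample
        counts[i] += 1
        margin = counts[i] * abs(sums[i] / counts[i] - c) / inner_std(scenarios[i])
        heapq.heappush(heap, (margin, i))                              # only this margin has changed
    alpha_hat = sum(s / k >= c for s, k in zip(sums, counts)) / n
    return alpha_hat, counts

# Hypothetical toy model (assumption): true loss N(0, 1), inner noise with standard deviation 5.
rng = random.Random(0)
alpha_hat, counts = sequential_estimator(
    m0=5, m_bar=20, n=200, c=1.5,
    sample_scenario=lambda g: g.gauss(0.0, 1.0),
    sample_inner_loss=lambda omega, g: omega + g.gauss(0.0, 5.0),
    inner_std=lambda omega: 5.0,
    rng=rng)
print(alpha_hat, min(counts), max(counts))
```

Only the scenario that receives a new sample changes its error margin, so a single pop and push per step keeps the heap consistent; each such update costs time logarithmic in the number of scenarios.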
sample. If we were to perform the additional sample in scenario $\omega_i$, this would result in a new loss estimate given by

$$\hat{L}'_i \triangleq \frac{1}{m_i + 1} \sum_{j=1}^{m_i + 1} \hat{Z}_{i,j} = \frac{1}{m_i + 1} \hat{Z}_{i,m_i+1} + \frac{m_i}{m_i + 1} \hat{L}_i.$$

The additional sample will only impact the estimate if $\hat{L}_i$ is on the opposite side of the threshold level $c$ than $\hat{L}'_i$, i.e., if $\hat{L}'_i < c$. This is illustrated in Figure 3. To myopically maximize the impact of the single additional sample, we will seek to choose the scenario $\omega_i$ that maximizes the probability of such a sign change.

Suppose that the additional sample $\hat{Z}_{i,m_i+1}$ has variance $\sigma_i^2 \triangleq \sigma^2(\omega_i)$. Observe that

$$
\begin{aligned}
P(\hat{L}'_i < c) &= P\big( \hat{Z}_{i,m_i+1} - L(\omega_i) < -m_i(\hat{L}_i - c) - (L(\omega_i) - c) \big) \\
&\approx P\big( \hat{Z}_{i,m_i+1} - L(\omega_i) < -m_i |\hat{L}_i - c| \big) \\
&\le \left( 1 + \frac{m_i^2}{\sigma_i^2} |\hat{L}_i - c|^2 \right)^{-1}.
\end{aligned} \tag{9}
$$

Here, the approximation follows from the assumption that $m_i \gg 1$, so that $m_i(\hat{L}_i - c) + (L(\omega_i) - c) \approx m_i |\hat{L}_i - c|$. The inequality follows from the one-sided Chebyshev inequality. By analogous consideration of the symmetric case (where $\hat{L}_i < c$), a myopic allocation rule that seeks to maximize the probability of a sign change estimated via the Chebyshev bound⁶ (9) will choose to add the additional inner sample in scenario $i^*$, where

$$i^* \in \underset{i}{\arg\min}\ \frac{m_i}{\sigma_i} |\hat{L}_i - c|. \tag{10}$$

⁶ We thank an anonymous reviewer for suggesting this motivation.

An alternative justification for the myopic rule (10) arises if the additional sample $\hat{Z}_{i,m_i+1}$ is drawn from a location-scale family of distributions, e.g., if $\hat{Z}_{i,m_i+1}$ is normally distributed. Such a distribution is specified by a mean $L(\omega_i)$ and a variance $\sigma_i^2$ so that

$$P(\hat{Z}_{i,m_i+1} < z) = G\!\left( \frac{z - L(\omega_i)}{\sigma_i} \right),$$

where $G$ is an increasing function. In this case,

$$P(\hat{L}'_i < c) \approx P\big( \hat{Z}_{i,m_i+1} - L(\omega_i) < -m_i |\hat{L}_i - c| \big) = G\!\left( -\frac{m_i}{\sigma_i} |\hat{L}_i - c| \right). \tag{11}$$

Maximizing the probability of a sign change according to (11) also results in the myopic rule (10).

We call the quantity minimized in (10), $(m_i/\sigma_i)|\hat{L}_i - c|$, the error margin associated with the scenario $\omega_i$. The allocation rule (10), which picks a scenario by greedily minimizing the error margin, makes intuitive sense qualitatively. It encourages additional inner samples at scenarios that are close to the loss boundary (i.e., $|\hat{L}_i - c|$ is small), scenarios with few inner samples (i.e., $m_i$ is small), or scenarios with significant variability in the portfolio losses (i.e., $\sigma_i$ is large). The Sequential estimator of Algorithm 2 employs the allocation rule (10). This estimator takes a triple $(m_0, \bar{m}, n)$ of input parameters. Here, $n$ is the desired number of outer stage scenarios, $m_0$ is the initial number of inner stage samples per scenario, and $\bar{m}$ is the desired average number of inner stage samples per scenario at the conclusion of the algorithm. The algorithm proceeds as follows: first, $n$ scenarios are generated, and, for each scenario, $m_0$ inner stage samples are performed. The remaining $\bar{m}n - m_0 n$ inner stage samples are allocated one at a time in a sequential fashion myopically, as in (10).

Algorithm 2 (Estimate the probability of a large loss using a sequential nonuniform nested simulation. The parameter $m_0$ is the initial number of inner samples per scenario. The parameter $\bar{m}$ is the average number of inner samples per scenario at the conclusion of the simulation. The parameter $n$ is the number of scenarios.)
1: procedure Sequential($m_0$, $\bar{m}$, $n$)
2:   for $i \leftarrow 1$ to $n$ do
3:     generate scenario $\omega_i$
4:     conditioned on scenario $\omega_i$, generate i.i.d. samples $\hat{Z}_{i,1}, \ldots, \hat{Z}_{i,m_0}$ of portfolio losses
5:     $m_i \leftarrow m_0$
6:   end for
7:   while $\sum_{i=1}^{n} m_i < \bar{m} n$ do
8:     set $i^* \leftarrow \arg\min_i m_i |\hat{L}_i - c| / \sigma_i$, where, for each $1 \le i \le n$, $\hat{L}_i$ is the current estimate of the loss in scenario $\omega_i$, $\hat{L}_i \leftarrow (1/m_i) \sum_{j=1}^{m_i} \hat{Z}_{i,j}$, and $\sigma_i$ is the standard deviation of the distribution of losses in scenario $\omega_i$
9:     generate one additional portfolio loss sample $\hat{Z}_{i^*, m_{i^*}+1}$ in scenario $\omega_{i^*}$
10:    $m_{i^*} \leftarrow m_{i^*} + 1$
11:  end while
12:  compute an estimate of the probability of a large loss, $\hat{\alpha} \leftarrow (1/n) \sum_{i=1}^{n} \mathbf{1}_{\{\hat{L}_i \ge c\}}$
13:  return $\hat{\alpha}$
14: end procedure

Note that the Sequential estimator requires access to the conditional standard deviation $\sigma_i$ of losses in each scenario $\omega_i$, to compute the error margin. These are not required for the Uniform estimator and, moreover, are typically not known in practice. However, these conditional standard deviations can be estimated in an online fashion over the course of the estimation algorithm; we discuss such variations in §7.4.

Furthermore, the Sequential estimator requires additional computational overhead beyond that of the Uniform estimator. However, this is minimal: the only additional requirement is to track scenarios in order of error margin. This can be accomplished efficiently via a priority queue data structure (see, e.g., Cormen et al. 2002). With a priority queue, determining the scenario with minimum error margin (line 8 in Algorithm 2) can be accomplished in constant time (i.e., in an amount of time independent of $m$ and $n$). Once a new inner sample is generated for a scenario (lines 9 and 10 in Algorithm 2), order $\log n$ time would be required to update the priority queue data structure. In practice, this is not significant.

over a long time horizon, but the memory requirement is minimal because all intermediate computations are discarded, and only the inner sample loss is recorded.⁷

The Sequential estimator has some similarities to nonuniform estimators that have been proposed in the literature. Lee and Glynn (2003) suggest a nonuniform nested estimator in the case where the scenario space is discrete. They choose the number of inner samples $m_i$ in each scenario $\omega_i$ so as to optimize certain large deviation asymptotics. Using a Gaussian approximation as a heuristic, this results in the allocation

$$m_i \propto \frac{\sigma_i^2}{(L(\omega_i) - c)^2}. \tag{12}$$

Because the loss $L(\omega_i)$ in scenario $\omega_i$ is unknown, Lee and Glynn (2003) propose a two-pass algorithm: in the first pass, a small number of inner samples are generated in each scenario and are used to compute inner sample allocations in a second production run.

Our Sequential estimator differs from (12) in several fundamental ways: First, the allocation (12) is loosely analogous to minimizing the square of the error margin, as opposed to the error margin itself. Second, the allocation (12) is accomplished with multiple passes, whereas our estimator is fully sequential. Indeed, in §5, tools from sequential analysis will prove fundamental in the theoretical analysis of our estimators. Finally, and most importantly, in the setting of Lee and Glynn (2003), nonuniform sampling does not provide a qualitatively different rate of convergence than uniform sampling. Given a total computational budget of order $k$, both the uniform and nonuniform methods achieve an asymptotic MSE of order $k^{-1} \log k$, albeit with different constants. As we shall see in §5, we will be able to establish theoretically that a nonuniform estimator converges at a faster asymptotic order than is possible with uniform estimators.

Gordy and Juneja (2008) suggest a general class
The Sequential estimator also requires more mem- of multipass dynamic allocation schemes for nun-
ory than the Uniform estimator. In particular, the uniform nested estimation. Such schemes would, for
Uniform estimator can be implemented in a way example, divide the simulation into a sequence of J
where scenarios are processed one at a time and never phases, where in the jth phase inner samples would
need to be simultaneously stored in memory. Such only be allocated to scenarios i if L i c j . Here,
an implementation would have a constant memory 1 > 2 > > J is a sequence of thresholds. Gordy
requirement (i.e., independent of m and n). For the and Juneja (2008) provide some numerical evidence
Sequential estimator, each of the n outer scenarios that such schemes may provide a significant improve-
must be stored in memory over the course of the ment over uniform estimators, but the choice of spe-
cific parameters of the algorithm (e.g., the number of
algorithm, hence the memory requirement is of order
phases J or the thresholds 8j 9) is left as a direction
n. In practice, even given a very large number of
for future research.
scenarios (e.g., millions), each of very high dimen-
sion (e.g., thousands), this memory requirement is 7
The nonuniform Threshold estimator that will be discussed in
well within the reach of commodity hardware. Each 5.1 does not require any additional computational or memory
inner sample may require simulating multiple steps overhead beyond that of the standard Uniform estimator.
5. Analysis

In §4, we introduced the nonuniform Sequential estimator and motivated this algorithm via an informal discussion. In this section, we will provide an analysis of nonuniform estimation. We begin in §5.1 by introducing a simplified variation of the Sequential estimator. This simplified estimator preserves the myopic and nonuniform behavior of the Sequential estimator, but is more amenable to analysis. Moreover, the simplified estimator is reminiscent of a compound sequential hypothesis test and highlights connections to the classical field of sequential analysis. In §5.2, we provide an asymptotic analysis of the bias and variance of the simplified nonuniform estimator. Finally, in §5.3, we discuss optimal parameter choices for the simplified nonuniform estimator. We demonstrate that this estimator has an asymptotic MSE of order $k^{-4/5+\epsilon}$, for all positive $\epsilon$, as a function of the computational budget $k$. This can be compared to the asymptotic MSE of order $k^{-2/3}$ of the optimal uniform estimator.

5.1. A Simplified Nonuniform Estimator
Analysis of the Sequential estimator described in §4 presents a number of challenges. Foremost among these is the fact that, over the course of the nested simulation of the Sequential estimator, the loss estimates $\hat{L}_1, \ldots, \hat{L}_n$ are dependent random variables. This dependence is induced by the myopic selection rule (10), which, at each point in time, simultaneously depends upon all of the loss estimates. To make the analysis tractable, we will consider a modification of the Sequential estimator that results in independent loss estimates while maintaining the spirit of myopic nonuniform sampling.

In particular, recall that the Sequential estimator takes as input a parameter $\bar{m}$, specifying the desired average number of inner samples in each scenario, and a parameter $n$, specifying the desired number of scenarios. Over the course of the algorithm, $\bar{m} n$ total inner stage samples will be generated. These samples are allocated in a sequential fashion so as to myopically minimize the error margin, $(m_i/\sigma_i)|\hat{L}_i - c|$, uniformly over $1 \le i \le n$.

If we imagine the algorithm to be in a state where a significant number of inner samples have been generated, i.e., $m_i \gg 1$ for each $i$, then one would expect the error margins to be roughly constant; if not, more inner samples would have been generated for the scenarios with lower error margins. One could achieve a similar effect by fixing a threshold $\gamma > 0$ and continuing to add inner stage samples to each scenario $\omega_i$ until the error margin exceeds $\gamma$, i.e.,
$$\frac{m_i}{\sigma_i} |\hat{L}_i - c| \ge \gamma. \tag{13}$$
This is precisely what is done by the Threshold estimator of Algorithm 3.

Algorithm 3 (Estimate the probability of a large loss using a threshold-based nonuniform nested simulation. The parameter $\gamma$ is the error margin threshold. The parameter $n$ is the number of scenarios.)
 1: procedure Threshold($\gamma$, $n$)
 2:   for $i \leftarrow 1$ to $n$ do
 3:     generate scenario $\omega_i$
 4:     set $\sigma_i$ to be the standard deviation of the distribution of the losses in scenario $\omega_i$
 5:     $m_i \leftarrow 0$
 6:     repeat
 7:       generate one additional portfolio loss sample $\hat{Z}_{i,m_i+1}$ in scenario $\omega_i$
 8:       $m_i \leftarrow m_i + 1$
 9:       compute an estimate of the loss in scenario $\omega_i$, $\hat{L}_i \leftarrow (1/m_i) \sum_{j=1}^{m_i} \hat{Z}_{i,j}$
10:     until $(m_i/\sigma_i)|\hat{L}_i - c| \ge \gamma$
11:   end for
12:   compute an estimate of the probability of a large loss, $\hat{\alpha} \leftarrow (1/n) \sum_{i=1}^n \mathbf{1}_{\{\hat{L}_i \ge c\}}$
13:   return $\hat{\alpha}$
14: end procedure

At a high level, the Sequential and Threshold estimators are quite similar. Both seek to nonuniformly allocate inner stage samples based on minimization of the error margin. However, they are parameterized differently. The Sequential estimator takes as an input the parameter $\bar{m}$, which is the mean number of inner stage samples. On the other hand, the Threshold estimator takes as input the parameter $\gamma$, which is the threshold for the error margin. As argued earlier, for large values of $\bar{m}$ and $\gamma$, these two algorithms yield similar results. Furthermore, we will see numerical evidence for this in §7.

From a practical perspective, the Sequential estimator is more natural. In particular, if all other parameters are fixed, it is easy to choose a value for $\bar{m}$. This parameter explicitly specifies the total number of inner stage samples to be generated by $\bar{m} n$, and therefore determines the running time of the algorithm. Thus, we can choose $\bar{m}$ based on the available running time. In the Threshold estimator, the parameter $\gamma$ implicitly specifies the total number of inner stage samples to be generated, and hence indirectly determines the running time. It is not clear, however, how to choose $\gamma$ a priori so as to ensure a certain running time, for example.

From a theoretical perspective, however, the Threshold estimator proves much more amenable to analysis. The main reason is that, at any point during the execution of the algorithm, the loss estimates $\hat{L}_1, \ldots, \hat{L}_n$ are independent and identically distributed random variables. This i.i.d. structure will
Figure 4 An Illustration of the Threshold Estimator
[Figure: a sample path of the partial sum $S_m^{(i)}$ plotted against the number of inner samples $m$, with the exit time $m_i$ marked on the horizontal axis.]
Note. Given a scenario $\omega_i$, the estimator generates inner stage samples until the partial sum $S_m^{(i)}$ crosses barriers at $\gamma$ or $-\gamma$. If the exit occurs through the upper barrier at $\gamma$, as illustrated, the scenario is declared to be a loss exceeding $c$. If the exit occurs through the lower barrier at $-\gamma$, the scenario is declared not to be a loss exceeding $c$.
prove crucial in the analysis of §5.2, because it allows the analysis of the overall algorithm via the analysis of a single outer stage scenario.

Moreover, the Threshold estimator has another interesting interpretation. Given a threshold $\gamma$, consider a scenario $\omega_i$ with inner loss samples $\hat{Z}_{i,1}, \hat{Z}_{i,2}, \ldots$. Examining (13), the algorithm will generate $m_i$ inner stage samples in this scenario, with
$$m_i = \inf\{m > 0 : |S_m^{(i)}| \ge \gamma\}, \tag{14}$$
where, for $m \ge 0$, the partial sum is defined by
$$S_m^{(i)} \triangleq \sum_{j=1}^{m} \frac{\hat{Z}_{i,j} - c}{\sigma_i}. \tag{15}$$
Note that $\{S_m^{(i)}, m \ge 0\}$ is a random walk with unit variance increments. Then, the number of samples $m_i$ is determined by the first exit time of the random walk from the interval $(-\gamma, \gamma)$. This is illustrated in Figure 4. If the exit occurs through the upper barrier at $\gamma$, then $\hat{L}_i > c$, and the scenario is declared to be a loss exceeding $c$. If the exit occurs through the lower barrier at $-\gamma$, then $\hat{L}_i < c$, and the scenario is declared not to be a loss exceeding $c$.

The interpretation of the threshold policy in terms of the first exit of a random walk is reminiscent of sequential hypothesis testing (see, e.g., Siegmund 1985). Indeed, for each scenario $\omega_i$, the threshold estimator is defining a sequential compound hypothesis test of whether the i.i.d. unit variance random variables $\{(\hat{Z}_{i,j} - c)/\sigma_i\}$ have a positive or negative mean. As we show next, techniques from sequential analysis will prove helpful in the theoretical analysis of our algorithm.

5.2. Asymptotic Analysis
Define $\hat{\alpha}_{\gamma,n}$ to be the Threshold estimate, i.e., $\hat{\alpha}_{\gamma,n} \triangleq$ Threshold($\gamma$, $n$). As in §3, we will analyze the accuracy of this estimator by decomposing the mean squared error into bias and variance terms. We begin with an assumption:

Assumption 2. Assume the following:
1. Conditional on an outer stage scenario $\omega_i$, the inner stage samples $\hat{Z}_{i,1}, \hat{Z}_{i,2}, \ldots$ are i.i.d. normal random variables. Denote the standard deviation of these samples by $\sigma(\omega_i)$.
2. Given a scenario $\omega$, define the normalized excess loss $\mu(\omega) \triangleq (L(\omega) - c)/\sigma(\omega)$. Then, the probability density function $p$ of $\mu$,
$$p(u) \triangleq \frac{d}{du} P(\mu \le u),$$
exists and is continuously differentiable in a neighborhood of 0.

The second condition of Assumption 2 is a technical condition that is reminiscent of the first condition of Assumption 1. The first condition is motivated by the random walk interpretation of §5.1. In particular, consider the random walk formed by the partial sums $\{S_m^{(i)}, m \ge 0\}$ from (15). By the functional central limit theorem, under a proper scaling, this process converges to a Brownian motion, i.e., a random walk with normal increments. The first condition makes the assumption that the unscaled random walk also has normal increments.

We are interested in the accuracy of the Threshold estimator in the asymptotic regime where the resulting estimate converges to the true value, i.e., as $n \to \infty$ (many outer stage scenarios) and $\gamma \to \infty$ (many inner stage samples). Our first result is the following theorem, which characterizes the asymptotic bias of this estimator.

Theorem 2. Under Assumption 2, as $\gamma \to \infty$, the asymptotic bias of the Threshold estimator satisfies
$$E[\hat{\alpha}_{\gamma,n} - \alpha] = O(\gamma^{-2}).$$
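The first-exit interpretation of §5.1 can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `threshold_scenario`, `threshold_estimate`, `sample_loss`, `gen_scenario`, `gen_loss`, and `sigma` are hypothetical, the conditional standard deviation is assumed known, and the `max_samples` cap is a safety guard added for this sketch that does not appear in Algorithm 3.

```python
import random


def threshold_scenario(sample_loss, sigma, c, gamma, max_samples=1_000_000):
    """Sample one scenario until the random walk S_m leaves (-gamma, gamma).

    sample_loss() is a hypothetical callable returning one inner loss sample.
    max_samples is a safety cap added for this sketch only.
    Returns (declared_loss, number_of_samples).
    """
    s, m = 0.0, 0
    while m < max_samples:
        s += (sample_loss() - c) / sigma   # increment of the partial sum (15)
        m += 1
        if abs(s) >= gamma:                # stopping rule (13)-(14)
            break
    # S_m >= 0 is equivalent to L_hat >= c, since S_m = m (L_hat - c) / sigma.
    return s >= 0.0, m


def threshold_estimate(n, gamma, c, gen_scenario, gen_loss, sigma):
    """Sketch of Algorithm 3; returns (alpha_hat, average samples per scenario)."""
    hits, total = 0, 0
    for _ in range(n):
        w = gen_scenario()
        hit, m = threshold_scenario(lambda: gen_loss(w), sigma(w), c, gamma)
        hits += hit
        total += m
    return hits / n, total / n


# Toy usage (illustrative parameters only).
rng = random.Random(1)
alpha_hat, avg_m = threshold_estimate(
    n=500, gamma=5.0, c=1.282,
    gen_scenario=lambda: rng.gauss(0.0, 1.0),
    gen_loss=lambda w: w + 5.0 * rng.gauss(0.0, 1.0),
    sigma=lambda w: 5.0,
)
```

Each scenario is processed independently and then discarded, which is why this estimator needs no storage beyond the running counts.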
The proof of Theorem 2 is provided in the appendix. It relies on the random walk interpretation of §5.1 as well as techniques from sequential analysis. Specifically, exponential martingales are used in combination with the optional stopping theorem.

The following is an immediate corollary of Theorem 2 and provides an asymptotic expression for the variance of the simplified sequential estimator.

Corollary 2. Under the conditions of Theorem 2, as $\gamma \to \infty$, the variance of the Threshold estimator satisfies
$$\mathrm{Var}(\hat{\alpha}_{\gamma,n}) = \frac{\alpha(1-\alpha)}{n} + O(\gamma^{-2} n^{-1}).$$

Proof. Note that
$$\mathrm{Var}(\hat{\alpha}_{\gamma,n}) = \mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^n \mathbf{1}_{\{\hat{L}_i \ge c\}}\right) = \frac{1}{n} \mathrm{Var}\left(\mathbf{1}_{\{\hat{L}_1 \ge c\}}\right) = \frac{E[\hat{\alpha}_{\gamma,n}](1 - E[\hat{\alpha}_{\gamma,n}])}{n},$$
where we have used the fact that the loss estimates $\{\hat{L}_i\}$ are independent and identically distributed. Applying Theorem 2,
$$\mathrm{Var}(\hat{\alpha}_{\gamma,n}) = \frac{\alpha(1-\alpha)}{n} + \frac{E[\hat{\alpha}_{\gamma,n} - \alpha](1 - E[\hat{\alpha}_{\gamma,n}])}{n} - \frac{\alpha E[\hat{\alpha}_{\gamma,n} - \alpha]}{n} = \frac{\alpha(1-\alpha)}{n} + O(\gamma^{-2} n^{-1}).$$

The total run time of the Threshold estimator is proportional to the total number of inner stage samples generated. Note, however, by the nature of the algorithm, the number of inner samples is stochastic. Hence, define $\bar{m}(\gamma)$ to be the expected number of inner stage samples at a single outer stage scenario, given parameter $\gamma$; that is,
$$\bar{m}(\gamma) \triangleq E\left[\inf\left\{m > 0 : \frac{m}{\sigma(\omega)} |\hat{L}(\omega) - c| \ge \gamma\right\}\right]. \tag{16}$$
Here, the expectation is over the scenario $\omega$ and the corresponding loss estimate $\hat{L}(\omega)$. Then, given parameters $(\gamma, n)$, the Threshold estimator has expected run time proportional to $\bar{m}(\gamma) n$. The following theorem, whose proof is given in the appendix, characterizes the rate of growth of this run time as a function of $\gamma$.

Theorem 3. Under Assumption 2, as $\gamma \to \infty$, the expected number of inner stage samples in each scenario under the Threshold estimator satisfies
$$\bar{m}(\gamma) = O(\gamma \log \gamma).$$

Note that Theorem 3 is intuitive given the first exit time interpretation of Figure 4. In particular, for large values of $\gamma$, the amount of time required for a random walk starting at the origin with drift $\mu \ne 0$ to exit the interval $(-\gamma, \gamma)$ is approximately $\gamma/|\mu|$. If the random walk has zero drift, the exit time is approximately $\gamma^2$. In our case, the expected number of samples $\bar{m}(\gamma)$ is averaged over various possibilities of drift given by $\mu(\omega) \triangleq (L(\omega) - c)/\sigma(\omega)$. The probability of this drift being exactly zero is zero, by the second condition of Assumption 2. However, arbitrarily small drifts are possible, and thus $\bar{m}(\gamma)$ is slightly larger than $O(\gamma)$.

Although Theorem 3 provides an $O(\gamma \log \gamma)$ bound on the expected number of inner stage samples per scenario, it might be the case that the realized number of inner stage samples per scenario is larger. The following theorem guarantees that, so long as the number of scenarios $n$ is sufficiently large, an $O(\gamma \log \gamma)$ bound continues to hold on the number of realized samples per scenario with high probability. The proof can be found in the appendix.

Theorem 4. Under Assumption 2, suppose that $C_0, \gamma_0 > 0$ are constants so that, for all $\gamma \ge \gamma_0$, $\bar{m}(\gamma) \le C_0 \gamma \log \gamma$. (Such constants are guaranteed to exist by Theorem 3.) Furthermore, suppose the number of scenarios $n \triangleq n(\gamma)$ is chosen as a function of $\gamma$ and that there exist constants $C_1, \gamma_1 > 0$, so that, for all $\gamma \ge \gamma_1$, $n(\gamma) \ge C_1 \gamma$; that is, $n$ asymptotically grows at least linearly in $\gamma$. Then, for any $\epsilon, \delta > 0$, there exists $\gamma_2 > 0$ so that, for all $\gamma \ge \gamma_2$,
$$P\left(\frac{1}{n} \sum_{i=1}^n m_i \ge (C_0 + \epsilon) \gamma \log \gamma\right) < \delta.$$

5.3. Optimal Nonuniform Threshold Estimator
Theorems 2 and 3 and Corollary 2 allow a comparison between the Uniform estimator and the nonuniform Threshold estimator. In particular, suppose $\hat{\alpha}_{m,n}$ is the Uniform estimate with $n$ scenarios and $m$ inner stage samples. As discussed in §3, when $m, n \to \infty$, this has asymptotic bias and variance
$$E[\hat{\alpha}_{m,n} - \alpha] = \frac{\theta_c}{m} + O(m^{-3/2}), \qquad \mathrm{Var}(\hat{\alpha}_{m,n}) = \frac{\alpha(1-\alpha)}{n} + O(m^{-1} n^{-1}). \tag{17}$$
On the other hand, suppose that $\hat{\alpha}_{\gamma,n}$ is the nonuniform Threshold estimator with $n$ scenarios and a threshold of $\gamma$. By Theorem 3, this estimator will employ, on average, $\bar{m} \triangleq \bar{m}(\gamma) = O(\gamma^{1+\epsilon})$ inner stage samples per scenario for any positive $\epsilon$. We can express the asymptotic bias and variance results of Theorem 2 and Corollary 2 as a function of $n$ and $\bar{m}$ by
$$E[\hat{\alpha}_{\gamma,n} - \alpha] = O(\bar{m}^{-2+\epsilon}), \qquad \mathrm{Var}(\hat{\alpha}_{\gamma,n}) = \frac{\alpha(1-\alpha)}{n} + O(\bar{m}^{-2+\epsilon} n^{-1}), \tag{18}$$
for all positive $\epsilon$.
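As a sketch of how the bias rate in the threshold translates into the rate in the average number of inner samples used in (18), the steps can be written out as below. The constant $C$ and the symbol $\epsilon'$ are introduced here only for illustration; the argument assumes $\bar{m} \ge 1$ and uses the fact that the $O(\gamma \log \gamma)$ bound of Theorem 3 implies $\bar{m}(\gamma) \le C \gamma^{1+\epsilon}$ for all sufficiently large $\gamma$.

```latex
\begin{align*}
\bar{m} \le C \gamma^{1+\epsilon}
  &\;\Longrightarrow\; \gamma \ge \left(\bar{m}/C\right)^{1/(1+\epsilon)} \\
  &\;\Longrightarrow\; \gamma^{-2} \le C^{2/(1+\epsilon)} \, \bar{m}^{-2/(1+\epsilon)}
     = C^{2/(1+\epsilon)} \, \bar{m}^{-2+\epsilon'},
     \qquad \epsilon' \triangleq \frac{2\epsilon}{1+\epsilon} \le 2\epsilon .
\end{align*}
% Hence O(\gamma^{-2}) = O(\bar{m}^{-2+\epsilon'}); since \epsilon > 0 is
% arbitrary, \epsilon' can be made arbitrarily small and relabeled \epsilon.
```

The same substitution applied to the $O(\gamma^{-2} n^{-1})$ term of Corollary 2 gives the variance expression.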
Comparing (17) and (18), we see that, up to the dominant term, the two algorithms achieve the same asymptotic variance of order $n^{-1}$. This is consistent with the discussion in §3, which suggests that the asymptotic variance is determined by the randomness in scenario generation. This is exactly the same in the two algorithms. The inner stage sampling is different, however, and this results in a difference in bias for the estimators. Specifically, as a function of the average number of inner stage samples per scenario, the bias of the nonuniform Threshold estimator decays approximately as the square of the bias of the Uniform estimator.

Given a total work budget of $k$ (i.e., $mn \le k$), we saw in §3 that the optimal Uniform estimator (in the sense of minimum MSE) would utilize a number of scenarios $n$ of order $k^{2/3}$, a number of inner stage samples per scenario $m$ of order $k^{1/3}$, and result in an MSE of order $k^{-2/3}$. For the nonuniform Threshold estimator, from the results of §5.2, we can bound the MSE by
$$E[(\hat{\alpha}_{\gamma,n} - \alpha)^2] \le \frac{\alpha(1-\alpha)}{n} + \frac{C}{\gamma^4},$$
for sufficiently large $n$ and $\gamma$ and an appropriate choice of the constant $C$. We can find a nonuniform Threshold estimator with low MSE by minimizing this upper bound over choices of $(\gamma, n)$, subject to an expected total work constraint; that is, we consider the optimization problem
$$\begin{aligned} \underset{\gamma,\, n}{\text{minimize}} \quad & \frac{\alpha(1-\alpha)}{n} + \frac{C}{\gamma^4} \\ \text{subject to} \quad & \bar{m}(\gamma)\, n \le k, \\ & \gamma, n \ge 0. \end{aligned} \tag{19}$$
For any positive $\epsilon$ and given a work budget $k$, suppose we choose $\gamma^* \propto k^{1/5-\epsilon}$ and $n^* \propto k^{4/5}$. Then, we have that $\bar{m}(\gamma^*) n^* = O(k^{1-\epsilon} \log k) = o(k)$. Thus, for sufficiently large $k$, the expected total work will be less than $k$. Indeed, because $(\gamma^*, n^*)$ satisfy the conditions of Theorem 4, for sufficiently large $k$ the realized total work will also be less than $k$ with high probability. This choice will result in an MSE of $O(k^{-4/5+\epsilon})$. Hence, the optimal nonuniform Threshold estimator converges at a faster rate than any uniform estimator. This is accomplished by generating more outer scenarios ($k^{4/5}$ versus $k^{2/3}$) and generating fewer inner stage samples on average in each scenario ($k^{1/5}$ versus $k^{1/3}$) than is optimal in the uniform case.

6. Adaptive Allocation Algorithm
The nonuniform Sequential estimator provides a way to determine the placement of inner stage samples across scenarios. The decision of how to allocate computational effort between generating more scenarios (i.e., the choice of $n$) and generating more inner samples across scenarios (i.e., the choice of $\bar{m}$) is unaddressed, however. The discussion in §5.3 suggests that, given a total work budget of $k$, one should asymptotically approximately choose $n \propto k^{4/5}$ and $\bar{m} \propto k^{1/5}$. However, the constants in these asymptotic expressions are unspecified. The choice of these constants may have an enormous impact on the practical performance of these algorithms. Note that the Uniform estimator faces the same problem: indeed, the optimal allocation (6) suggested by the analysis of §3 requires knowledge of the constant $\theta_c$. It is not clear, in general, how to determine this constant.

In this section we will consider an adaptive allocation approach. This algorithm is a heuristic that estimates the optimal choice of $\bar{m}$ and $n$ at each point in time. It refines these estimates over the course of the simulation. The main idea of this approach is that, based on the results of §5, the variance is determined by the number of scenarios ($n$), and the bias squared is determined by the amount of inner sampling ($\bar{m}$). The adaptive algorithm estimates these quantities and then either increases the number of scenarios or increases the number of inner samples depending on whether the MSE is dominated by the variance or the bias squared.

Specifically, the Adaptive estimator of Algorithm 4 proceeds as follows:
1. The simulation is initialized (lines 2-7) by generating $n_0$ scenarios with $m_0$ inner samples for each scenario.
2. The work budget of the simulation $k$ is divided into $K \triangleq k/\tau_e$ intervals (or epochs) of length $\tau_e$ (note that we assume for simplicity of exposition that $K$ is integral and that the first epoch is only of length $\tau_e - n_0 m_0$ because of the initialization).
3. At the beginning of the $l$th epoch (line 9), estimates are made for the bias squared and variance of the loss probability estimate, given the scenarios and samples that have been generated thus far. Specifically, given the loss probability estimate
$$\hat{\alpha} = \frac{1}{n} \sum_{i=1}^n \mathbf{1}_{\{\hat{L}_i \ge c\}},$$
the bias is approximated according to
$$E[\hat{\alpha} - \alpha] \approx \hat{B}, \tag{20}$$
where
$$\hat{B} \triangleq \frac{1}{n} \sum_{i=1}^n \Phi\left(\frac{\sqrt{m_i}\,(\hat{L}_i - c)}{\sigma_i}\right) - \hat{\alpha}.$$
This approximation is based on a central limit theorem heuristic: in each scenario $\omega_i$, when the number of samples $m_i$ is large, each loss estimate $\hat{L}_i$ can be approximated by a normal distribution with mean
equal to $L(\omega_i)$ and with variance $\sigma_i^2/m_i$. Hence, given a fixed set of scenarios $\omega_1, \ldots, \omega_n$, one might estimate the bias via
$$E[\hat{\alpha} - \alpha] = \frac{1}{n} \sum_{i=1}^n \left\{ P(\hat{L}_i \ge c) - \mathbf{1}_{\{L(\omega_i) \ge c\}} \right\} \approx \frac{1}{n} \sum_{i=1}^n \left[ \Phi\left(\frac{\sqrt{m_i}\,(L(\omega_i) - c)}{\sigma_i}\right) - \mathbf{1}_{\{L(\omega_i) \ge c\}} \right].$$
Because each true loss $L(\omega_i)$ is unknown in practice, we can approximate this with its realized estimate $\hat{L}_i$. This results in (20). By making a similar heuristic approximation for the variance, we arrive at the expression
$$\mathrm{Var}(\hat{\alpha}) \approx \hat{V} \triangleq \frac{\hat{\alpha}(1 - \hat{\alpha})}{n}. \tag{21}$$
Note that the estimators (20) and (21) are meant only as heuristics. Better estimators may be possible, and bias in particular is notoriously difficult to estimate. For our purposes, however, they only need to be accurate within orders of magnitude so as to allocate computational effort between inner samples and outer scenarios. We will see in the numerical results of §7 that, empirically, they suffice for this purpose.
4. Suppose there are $n$ outer scenarios and an average of $\bar{m} \triangleq (1/n) \sum_{i=1}^n m_i$ inner samples per scenario at the beginning of the $l$th epoch. From the results in §5, we expect the bias squared to decrease according to $\bar{m}^{-4+\epsilon}$ and the variance to decrease in proportion to $n^{-1}$. Then, assume that the number of scenarios and samples at the end of the $l$th epoch is given by $n'$ and $\bar{m}'$. We can estimate the bias squared at the end of the $l$th epoch, as a function of the bias estimate $\hat{B}$ at the beginning, by $\hat{B}^2 (\bar{m}/\bar{m}')^4$. Similarly, the variance at the end of the $l$th epoch can be estimated by $\hat{V} (n/n')$. Thus, at the beginning of the $l$th epoch, we consider the following optimization problem:
$$\begin{aligned} \underset{\bar{m}',\, n'}{\text{minimize}} \quad & \hat{B}^2 \left(\frac{\bar{m}}{\bar{m}'}\right)^4 + \hat{V}\, \frac{n}{n'} \\ \text{subject to} \quad & \bar{m}' n' = \bar{m} n + \tau_e, \\ & n \le n' \le n + \tau_e, \\ & \bar{m}' \ge 0. \end{aligned} \tag{22}$$
This problem seeks to make a choice of $(\bar{m}', n')$ that results in a minimal mean squared error at the end of the $l$th epoch. The first constraint ensures that the total number of inner samples in the $l$th epoch will equal the epoch length $\tau_e$. The second constraint ensures that the number of scenarios at the end of the $l$th epoch is at least the number of scenarios at the beginning, and increases by at most the length of the epoch.

The solution to (22) is given by
$$n' = \min\left\{ \max\left\{ \left(\frac{\hat{V} n}{4 \hat{B}^2 \bar{m}^4}\right)^{1/5} (\bar{m} n + \tau_e)^{4/5},\; n \right\},\; n + \tau_e \right\}, \qquad \bar{m}' = \frac{\bar{m} n + \tau_e}{n'}. \tag{23}$$
After obtaining the target number of scenarios $n'$ (line 10), $n' - n$ additional scenarios are generated.
5. Over the course of the $l$th epoch (lines 13-21), $\tau_e$ inner samples are generated. These are distributed to ensure that every scenario has at least $m_0$ inner samples in total (not per epoch). Once that is the case, inner samples are allocated myopically according to minimum error margin as in the Sequential estimator.

Algorithm 4 (Estimate the probability of a large loss using an adaptive nonuniform nested simulation. This estimator employs a sequential algorithm to determine the placement of inner stage samples across scenarios and adaptively decides the number of scenarios and inner samples to add by estimating the bias and variance. The parameters $n_0$ and $m_0$ are the initial number of scenarios and inner samples per scenario, respectively. The parameter $\tau_e$ is the epoch length. The parameter $k$ is the total number of inner samples. Note that each standard deviation $\sigma_i$ can be estimated in an online fashion over the course of the simulation, as is discussed in §7.4.)
 1: procedure Adaptive($m_0$, $n_0$, $\tau_e$, $k$)
 2:   generate scenarios $\omega_1, \omega_2, \ldots, \omega_{n_0}$
 3:   $n \leftarrow n_0$
 4:   for $i \leftarrow 1$ to $n_0$ do
 5:     conditioned on scenario $\omega_i$, generate i.i.d. samples $\hat{Z}_{i,1}, \ldots, \hat{Z}_{i,m_0}$ of portfolio losses
 6:     $m_i \leftarrow m_0$
 7:   end for
 8:   for $l \leftarrow 1$ to $k/\tau_e$ do
 9:     estimate the current bias and variance by $\hat{B}$ and $\hat{V}$ from (20)-(21)
10:     determine a target number of scenarios by $n' \leftarrow \min\{\max\{(\hat{V} n / (4 \hat{B}^2 \bar{m}^4))^{1/5} (\bar{m} n + \tau_e)^{4/5},\, n\},\, n + \tau_e\}$
11:     generate scenarios $\omega_{n+1}, \ldots, \omega_{n'}$, set $m_i \leftarrow 0$ for $i = n+1, \ldots, n'$
12:     $n \leftarrow n'$
13:     while $\sum_{i=1}^n m_i < l \tau_e$ do
14:       if $\min_i m_i < m_0$ then
15:         set $i^* \leftarrow \arg\min_i m_i$
16:       else
17:         set $i^* \leftarrow \arg\min_i \{m_i |\hat{L}_i - c| / \sigma_i\}$
18:       end if
19:       generate one additional portfolio loss sample $\hat{Z}_{i^*, m_{i^*}+1}$ in scenario $\omega_{i^*}$
20:       $m_{i^*} \leftarrow m_{i^*} + 1$
21:     end while
22:   end for
23:   compute an estimate of the probability of a large loss, $\hat{\alpha} \leftarrow (1/n) \sum_{i=1}^n \mathbf{1}_{\{\hat{L}_i \ge c\}}$
24:   return $\hat{\alpha}$
25: end procedure

7. Numerical Results
In this section we present numerical results that illustrate the benefits of nonuniform nested estimation. We begin in §7.1 by describing two settings for our numerical experiments. In §7.2, we compare the bias of the Uniform estimator and the nonuniform Threshold and Sequential estimators. In §7.3, we compare the MSE of a number of both implementable and idealized uniform and nonuniform estimators. Finally, in §7.4, we consider issues arising from the estimation of the variance of inner stage samples.

7.1. Experimental Setting
Our numerical experiments are set in the context of the following two examples: a portfolio with Gaussian cashflows, where both the outer stage scenarios and inner stage samples are generated from normal distributions, and a put option example, where the portfolio consists of a single put option on an underlying asset whose price follows a geometric Brownian motion process. For both examples, we are interested in computing the probability of a loss. We consider loss thresholds corresponding to 10%, 1%, and 0.1% loss probabilities.

In the Gaussian example, we consider a portfolio with normally distributed risk factors and cashflows. This is the simplest setting in which to test any nested simulation procedure. Specifically, we consider a portfolio with value $X_0 = 0$ at time $t = 0$ and value $X_\tau(\omega) = -\omega$ at the risk horizon $\tau$. We assume that the real-valued risk factor $\omega$ is normally distributed with mean zero and standard deviation $\sigma_{\text{outer}} = 1$. Then, the loss $L(\omega) = X_0 - X_\tau(\omega) = \omega$ is a standard normal random variable. Given a scenario $\omega_i$, each inner loss sample takes the form $\hat{Z}_{i,j} = \omega_i + \sigma_{\text{inner}} W_{i,j}$, where $W_{i,j}$ is a standard normal random variable and $\sigma_{\text{inner}} = 5$ is the standard deviation of the inner stage samples.

In this case, given a loss threshold $c$, the probability of a loss exceeding $c$ is given by $\alpha = \Phi(-c)$. We choose the values 1.282, 2.326, and 3.090 for the loss threshold $c$, corresponding to loss probabilities of 10%, 1%, and 0.1%, respectively.

In the put option example, we assume that the portfolio consists of a long position in a single put option. This example is more complex because the portfolio cashflows are nonlinear and follow highly skewed distributions, which vary substantially across outer stage scenarios. Here, the underlying asset follows a geometric Brownian motion with an initial price of $S_0 = 100$. The drift of this process under the real-world distribution used in the outer stage of simulation is $\mu = 8\%$. The annualized volatility is $\sigma = 20\%$. The risk-free rate is $r = 3\%$. The strike of the put option is $K = 95$, and the maturity is $T = 0.25$ years (i.e., three months). The risk horizon is $\tau = 1/52$ years (i.e., one week). With these parameters, the initial value of the put is $X_0 = 1.669$ given by the Black-Scholes formula.

Denote by $S_\tau(\omega)$ the underlying asset price at the risk horizon $\tau$. This random variable is generated according to
$$S_\tau(\omega) \triangleq S_0 e^{(\mu - \sigma^2/2)\tau + \sigma \sqrt{\tau}\, \omega},$$
where the real-valued risk factor $\omega$ is a standard normal random variable. The portfolio loss at the risk horizon is given by
$$L(\omega) = X_0 - E\left[e^{-r(T-\tau)} \max(K - S_T(\omega, W), 0)\right],$$
where the expectation is taken over the random variable $W$, which is an independently distributed standard normal, and $S_T(\omega, W)$ is given by
$$S_T(\omega, W) \triangleq S_\tau(\omega)\, e^{(r - \sigma^2/2)(T-\tau) + \sigma \sqrt{T-\tau}\, W}.$$
Note that, given a fixed value of $\omega$ and a standard normal $W$, the random variable $S_T(\omega, W)$ is distributed according to the risk-neutral distribution of underlying asset price at the option maturity $T$, conditional on asset price $S_\tau(\omega)$ at the risk horizon $\tau$. Given an outer scenario $\omega_i$, each inner loss sample takes the form
$$\hat{Z}_{i,j} = X_0 - e^{-r(T-\tau)} \max(K - S_T(\omega_i, W_{i,j}), 0),$$
where $W_{i,j}$ is an independent standard normal random variable. Notice that outer stage scenarios are generated using the real-world distribution governed by the drift $\mu$, whereas inner stage scenarios used to generate future put option prices are generated using the risk-neutral distribution governed by the drift $r$.

It is not difficult to see that the loss $L(\omega)$ is strictly increasing in the risk factor $\omega$. Hence, the probability of a loss exceeding a threshold $c$ can be computed according to $\alpha = P(L(\omega) \ge c) = P(\omega \ge \omega^*) = \Phi(-\omega^*)$, where $\omega^*$ is the unique solution to $L(\omega^*) = c$. We choose the values 0.859, 1.221, and 1.390 for the loss threshold $c$, corresponding to loss probabilities of 10%, 1%, and 0.1%, respectively.

7.2. Bias Comparison
As established in §5, the advantage of nonuniform inner stage sampling relative to uniform sampling is that, for the same total quantity of inner samples, a
Figure 5 Bias as a Function of the Total Number of Inner Stage Samples
[Figure: two log-log panels, (a) Gaussian, $\alpha = 1\%$, and (b) Put option, $\alpha = 1\%$. Each panel plots the bias (vertical axis) against the total number of inner stage samples $k$ (horizontal axis) for the Threshold, Sequential, and Uniform estimators, with reference slopes $k^{-1}$ and $k^{-2}$.]
Notes. The vertical axis shows the bias in absolute terms, i.e., the absolute value of the difference between the estimated loss probability and the true loss probability $\alpha$, as a function of the total number of inner stage samples. In the case of the Threshold algorithm, the expected total number of samples is shown. A set of $n = 10{,}000$ stratified outer scenarios was used. The bias of the nonuniform Threshold and Sequential estimators is consistent with the predicted theoretical decay of $k^{-2+\epsilon}$, for any positive $\epsilon$. Similarly, the bias of the Uniform estimator is consistent with the predicted theoretical decay of $k^{-1}$.
lower bias is attained. In this section, we numerically compare the Uniform estimator and the nonuniform Threshold and Sequential estimators on the basis of bias.

For this purpose, we generate a fixed sequence $\omega_1, \ldots, \omega_n$ of $n = 10{,}000$ outer stage scenarios. To eliminate any noise in our comparison due to randomness in scenario generation, we choose the scenarios in a deterministic and stratified manner so that $P(\omega \le \omega_i) = i/(n+1)$, for all $1 \le i \le n$. Given the stratified scenarios, we numerically compute the bias of each estimator, measured over 1,000 independent trials, as the total number of inner stage samples (or the work budget) is varied from $k = 20{,}000$ to $k = 4{,}000{,}000$. In the case of the Uniform estimator, this is accomplished by varying the number of inner stage samples per scenario from $m = 2$ to $m = 400$. For the nonuniform Sequential estimator, this is accomplished by using $m_0 = 2$ initial inner samples per scenario and then varying the average number of inner stage samples per scenario from $\bar{m} = 2$ to $\bar{m} = 400$. In the case of the nonuniform Threshold estimator, the threshold parameter $\gamma$ was varied over the interval $(5 \times 10^{-5}, 2 \times 10^{1})$, and the expected total number of inner stage samples was plotted (averaged over the independent trials). This range of $\gamma$ was experimentally chosen so that the range of expected total inner samples for the Threshold algorithm coincided with the range of total inner samples for the other algorithms.

The results for both the Gaussian example and the put option example with $\alpha = 1\%$ are plotted in Figure 5. In both cases, the nonuniform Threshold and Sequential estimators exhibit a lower bias than the
Figure 6 Distribution of Inner Stage Samples
[Figure: two panels, (a) Gaussian, $\alpha = 1\%$ (loss threshold $c = 2.326$), and (b) Put option, $\alpha = 1\%$ (loss threshold $c = 1.221$). Each panel plots the number of inner stage samples $m_i$ (vertical axis, logarithmic scale) against the loss $L(\omega_i)$ (horizontal axis) for the Sequential and Uniform estimators.]
Notes. This figure shows the number of inner stage samples as a function of the loss in each scenario, averaged over 1,000 trials. Here, $k = 4{,}000{,}000$ inner stage samples are distributed across $n = 10{,}000$ stratified scenarios. The Uniform estimator employs $m = 400$ inner samples for each scenario. The nonuniform Sequential estimator varies the number of samples over two orders of magnitude and employs many more samples close to the loss threshold $c$.
Uniform estimator, given the same work budget. Fur- convergence of Threshold estimator in 5 provides a
thermore, for the Uniform estimator, the results are good proxy for the rate of convergence of the Sequen-
consistent with the bias decreasing with order k1 , as tial estimator.
suggested by Theorem 1. For the nonuniform Thresh- Figure 6 gives some qualitative insight into the
old estimator, the results are consistent8 with the bias inner sampling behavior of the nonuniform Sequen-
decreasing according to k2+ for any positive , as tial estimator. Here, we have plotted the number of
suggested by the theory presented in 5. Note that inner samples (averaged across the 11000 indepen-
the performance of the Threshold and Sequential dent trials) in a scenario against the loss in the sce-
estimators is largely indistinguishable. This strongly nario. Here, the amount of inner sampling employed
suggests that our theoretical analysis of the rate of by the Sequential varies over two orders of magni-
tude across scenarios, with much more sampling tak-
ing place close to the loss threshold c than far away
8
The exact rates of decay (i.e., the asymptotic slopes in Figure 5) from it.
are challenging to accurately estimate numerically. This is because
it is not computationally feasible to compute the estimators over
7.3. MSE Comparison
many orders of magnitude of k. The results in Figure 5 are not
intended as numerical proof of a particular rate of convergence, In this section, we will provide an overall comparison
but are rather intended to illustrate that the numerical convergence of the MSE achieved by various uniform and nonuni-
is consistent with our earlier theoretical analyses. form estimators, given a fixed computational budget
of k inner stage samples. We consider each of the following estimators.

Optimal uniform. This is the Uniform estimator with parameters chosen optimally, as in §5. The simulation budget is allocated according to $m = k^{1/3}/\beta^*$ and $n = \beta^* k^{2/3}$, where the constant $\beta^*$, given in (6), is chosen to minimize MSE. Note that, in general, it is not clear how to determine the value $\beta^*$ given the problem parameters. For both the Gaussian and put option examples here, we are able to use closed form expressions for the probability distribution of losses to exactly compute this constant.

1/3:2/3 uniform. This is the Uniform estimator with $m = k^{1/3}$ and $n = k^{2/3}$. Based on the analysis in §3, this estimator has MSE that decays with the same order ($k^{-2/3}$) as the optimal uniform estimator, but with a suboptimal constant. This is meant to illustrate the case where the constant $\beta^*$ of the optimal uniform estimator is unknown, and an arbitrary choice of constant ($\beta = 1$) is made.

Optimal sequential. This is the Sequential nonuniform sampling estimator where n is chosen optimally to minimize MSE. Here, $m_0 = 2$ initial samples were used. The parameters $(\bar m, n)$ controlling the average number of inner samples and the number of scenarios were varied over choices satisfying the simulation work budget, i.e., $\bar m n = k$. The choice that resulted in the minimum MSE was selected. The optimal sequential estimator is an idealized algorithm meant to capture the best possible performance that can be achieved using the Sequential estimator.

Adaptive. This is the Adaptive estimator of §6, which utilizes sequential nonuniform sampling and adaptively allocates computational effort between outer stage scenarios and inner stage samples. Here, $n_0 = 500$ initial scenarios were used, with $m_0 = 2$ initial inner samples per scenario. An epoch length of $\tau_e = 100{,}000$ was used.

Adaptive ($\hat\sigma_i$). This is a variation of the Adaptive estimator in which the variance of inner samples is estimated. This will be discussed shortly in §7.4.

The numerical results for the six test cases (the Gaussian and put option examples, each with thresholds corresponding to three different loss probabilities) using the five estimators are summarized in Table 1. In all cases, a computational budget of k = 4,000,000 inner stage samples was used. The results in each case are computed over 1,000 independent trials.

The numerical results in Table 1 can be interpreted naturally through a series of pairwise comparisons, as follows.

Optimal uniform vs. 1/3:2/3 uniform. These are both asymptotically optimal Uniform estimators; they differ only by the choice of constant $\beta$. The practical performance of these two estimators, however, is dramatically different. This highlights the sensitivity of the Uniform estimator in practice to the choice of constant. Note that computing the constant $\beta^*$, as given in (6), requires knowledge of the constant $\theta_c$, defined in (5). It is not clear how to estimate this constant in practice, and this constant may vary dramatically across different problem instances.

Optimal uniform vs. optimal sequential. These represent the best possible performance that can be achieved by the Uniform and Sequential estimators. Neither of these estimators is implementable in practice: the former because it depends on a parameter that cannot be readily determined from the problem data, the latter because it requires exploration over the choice of parameters. However, by contrasting them we can compare uniform and nonuniform sampling on an equal footing. This comparison clearly illustrates the benefits of nonuniform sampling. In every test case, the optimal sequential estimator has the lowest MSE. The MSE improvement relative to the optimal uniform estimator is between a factor of 4 and 10. This improvement is greatest when estimating loss probabilities that are rare (e.g., the $\alpha = 0.1\%$ case).

Furthermore, note that the optimal sequential estimator employs many fewer inner stage samples and many more outer stage scenarios. This is consistent with the theory developed in §5 and the experiments in §7.2. The optimal sequential estimator is able to achieve a low bias with fewer inner stage samples, hence it can employ more scenarios with the same computational budget.

Optimal sequential vs. adaptive sequential. The optimal sequential estimator relies on a brute force optimization over the parameters choosing the number of inner samples and outer scenarios; this is not feasible in practice. On the other hand, the adaptive sequential estimator makes this choice dynamically over the course of the simulation and thus is implementable in practice. Comparing these two methods illustrates how much of the benefit of the optimal sequential method can be achieved in practice.

Across our experiments, the adaptive sequential estimator achieves an MSE between one and two times that of the optimal sequential estimator. In some cases, the adaptive estimator overestimates the true bias and uses too many inner stage samples compared to the optimal allocation. This suggests that there is modest room for improvement in the Adaptive procedure for allocating computational effort between inner and outer stages.

Table 1 Numerical Results

                                          n   $\bar m$   Variance      Bias^2        MSE           MSE std. err.  MSE norm.
Gaussian
  α = 10%    1/3:2/3 uniform         25,199      159   4.0 × 10^-6   2.8 × 10^-4   2.9 × 10^-4   2.1 × 10^-6     35.4
             Optimal uniform          4,499      889   2.1 × 10^-5   8.6 × 10^-6   3.0 × 10^-5   1.2 × 10^-6      3.7
             Adaptive ($\hat\sigma_i$) 14,968     281   7.0 × 10^-6   2.7 × 10^-6   9.7 × 10^-6   4.7 × 10^-7      1.2
             Adaptive                12,802      321   7.2 × 10^-6   1.5 × 10^-6   8.6 × 10^-6   3.9 × 10^-7      1.0
             Optimal sequential      12,395      323   6.8 × 10^-6   1.4 × 10^-6   8.2 × 10^-6   3.7 × 10^-7      1.0
  α = 1%     1/3:2/3 uniform         25,199      159   6.1 × 10^-7   2.8 × 10^-5   2.8 × 10^-5   2.6 × 10^-7     60.9
             Optimal uniform          5,089      786   2.3 × 10^-6   1.0 × 10^-6   3.3 × 10^-6   1.5 × 10^-7      7.2
             Adaptive ($\hat\sigma_i$) 16,177     250   7.0 × 10^-7   3.7 × 10^-9   7.0 × 10^-7   3.1 × 10^-8      1.5
             Adaptive                16,118      251   7.1 × 10^-7   4.1 × 10^-9   7.2 × 10^-7   3.1 × 10^-8      1.6
  α = 0.1%   1/3:2/3 uniform         25,199      159   8.2 × 10^-8   1.1 × 10^-6   1.2 × 10^-6   1.9 × 10^-8     48.0
             Optimal uniform          7,788      514   1.7 × 10^-7   7.9 × 10^-8   2.5 × 10^-7   1.3 × 10^-8     10.0
             Adaptive ($\hat\sigma_i$) 30,798     132   3.5 × 10^-8   4.7 × 10^-10  3.5 × 10^-8   1.6 × 10^-9      1.4
             Adaptive                30,628      132   3.8 × 10^-8   5.0 × 10^-10  3.8 × 10^-8   3.2 × 10^-9      1.5
Put option
  α = 10%    1/3:2/3 uniform         25,199      159   4.1 × 10^-6   5.1 × 10^-4   5.1 × 10^-4   2.9 × 10^-6     58.6
             Optimal uniform          5,095      785   1.9 × 10^-5   2.4 × 10^-5   4.2 × 10^-5   1.6 × 10^-6      4.8
             Adaptive ($\hat\sigma_i$)  6,671     601   1.5 × 10^-5   4.8 × 10^-6   2.0 × 10^-5   9.2 × 10^-7      2.3
             Adaptive                 7,325      547   1.3 × 10^-5   2.1 × 10^-7   1.4 × 10^-5   6.2 × 10^-7      1.6
  α = 1%     1/3:2/3 uniform         25,199      159   7.8 × 10^-7   9.4 × 10^-5   9.5 × 10^-5   5.4 × 10^-7    141.8
             Optimal uniform          3,143    1,273   3.8 × 10^-6   1.2 × 10^-6   5.0 × 10^-6   2.1 × 10^-7      7.5
             Adaptive ($\hat\sigma_i$) 10,085     401   1.2 × 10^-6   2.0 × 10^-7   1.4 × 10^-6   6.2 × 10^-8      2.1
             Adaptive                 9,992      405   1.1 × 10^-6   1.7 × 10^-8   1.1 × 10^-6   4.8 × 10^-8      1.6
  α = 0.1%   1/3:2/3 uniform         25,199      159   1.5 × 10^-7   8.1 × 10^-6   8.2 × 10^-6   7.2 × 10^-8    174.5
             Optimal uniform          2,570    1,556   4.4 × 10^-7   3.9 × 10^-8   4.8 × 10^-7   2.7 × 10^-8     10.2
             Adaptive ($\hat\sigma_i$) 14,884     274   1.1 × 10^-7   1.8 × 10^-8   1.3 × 10^-7   9.0 × 10^-9      2.8
             Adaptive                14,384      284   9.2 × 10^-8   5.6 × 10^-10  9.2 × 10^-8   1.4 × 10^-8      2.0

Notes. This table shows the numerical results for five estimation algorithms over six test cases (the Gaussian and put option examples, each with thresholds corresponding to three different loss probabilities). The results are computed over 1,000 independent trials, each with a total simulation budget of k = 4,000,000. The results reported include the number of outer stage scenarios (n) and the average number of inner stage samples per scenario ($\bar m$) employed by each estimator, as well as the variance, the bias squared, the MSE, and the standard error of the MSE for each estimator. The last column contains MSE results normalized relative to the optimal sequential estimator.

7.4. Variance Estimation
The Adaptive algorithm requires the value of $\sigma_i$, the standard deviation of the inner stage loss samples $\hat Z_{i,1}, \hat Z_{i,2}, \ldots$ in scenario $\omega_i$. In practice, $\sigma_i$ will not be known. However, one can imagine many variations of the Adaptive algorithm where each $\sigma_i$ is estimated over the course of the estimation algorithm.

One such variation replaces each $\sigma_i$ in the Adaptive algorithm with the estimate
$$\frac{m_i}{m_i + b}\,\hat\sigma_i + \frac{b}{m_i + b}\,\bar\sigma. \quad (24)$$
Here, we define
$$\hat\sigma_i \triangleq \left( \frac{1}{m_i - 1} \sum_{j=1}^{m_i} \bigl(\hat Z_{i,j} - \hat L_i\bigr)^2 \right)^{1/2}$$
to be the sample standard deviation of the inner stage loss samples generated in scenario $\omega_i$, and $\bar\sigma \triangleq (1/n) \sum_{i=1}^{n} \hat\sigma_i$ to be the overall average of all such sample standard deviations. This procedure balances an ensemble estimate with a local estimate so that the estimated standard deviations can be generated more reliably, especially when there are a small number of inner stage samples at a given scenario. For b = 0, the procedure corresponds to the usual sample standard deviation estimator. For large values of b, the ensemble estimate is given a larger weight.

Numerical results for an adaptive estimator using this procedure for estimating $\sigma_i$, with b = 5, are given in Table 1. To avoid a prohibitive computational burden, we only update the average $\bar\sigma$ at the end of each specific epoch.⁹ The results show that there is a modest to no loss in performance when the estimated $\hat\sigma_i$ is used in place of the true $\sigma_i$.

⁹ Numerically stable and efficient algorithms are available for updating sample variance calculations (see, e.g., Chan et al. 1983). These would allow for rapid calculation of each $\hat\sigma_i$ in an online fashion. However, once $\bar\sigma$ is updated, every $\hat\sigma_i$ will change. This will necessitate rebuilding the priority queue data structure for the scenarios, and may require order n time.
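The weighted estimate in (24), together with the online variance updating mentioned in footnote 9, can be sketched as follows. This is only an illustrative sketch under stated assumptions: the names `RunningStd`, `ensemble_std`, and `shrunk_std` are hypothetical and do not come from the paper, and the running update shown is the standard Welford-style recursion, used here as one example of the numerically stable updating schemes the footnote alludes to rather than the authors' own implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStd:
    """Running mean and sample std of one scenario's inner loss samples.

    Hypothetical helper; uses a Welford-style update so that each new
    sample is absorbed in O(1) time without storing past samples.
    """
    count: int = 0      # m_i, number of inner samples seen so far
    mean: float = 0.0   # running sample mean of the inner samples
    m2: float = 0.0     # running sum of squared deviations from the mean

    def add(self, z: float) -> None:
        self.count += 1
        delta = z - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (z - self.mean)

    def std(self) -> float:
        # Sample standard deviation with the (m_i - 1) divisor;
        # returns 0.0 when fewer than two samples are available.
        if self.count < 2:
            return 0.0
        return math.sqrt(self.m2 / (self.count - 1))


def ensemble_std(local_stds: list[float]) -> float:
    """Average of the per-scenario sample standard deviations."""
    return sum(local_stds) / len(local_stds)


def shrunk_std(local_std: float, m_i: int, ens_std: float, b: float) -> float:
    """Weighted combination of local and ensemble estimates, as in (24)."""
    return (m_i * local_std + b * ens_std) / (m_i + b)
```

With b = 0 the function returns the local sample standard deviation unchanged, and as b grows the result moves toward the ensemble average, matching the behavior described in the text. In a full implementation one would presumably recompute `ensemble_std` only at epoch boundaries, as the paper does, since changing it alters every scenario's estimate.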
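8. Conclusion
Two-level nested simulation can provide a more realistic assessment of financial risk, but with a considerable computational cost. In this paper we propose a nested sequential simulation procedure that significantly reduces the computational burden. The savings are achieved by using a nonuniform inner sampling procedure that allocates more resources where the effect on the risk estimation is the greatest, which in turn allows relatively more effort to be devoted to the generation of outer scenarios. The combined effect produces a risk estimator that converges at a faster rate to the true value. In numerical experiments, mean squared error was reduced by factors ranging from 4 to over 100.

The sequential estimation procedure can be combined with previous research on variance reduction for the outer stage scenario generation to achieve further computational savings. The algorithms and results were presented in the context of estimating the probability of a large loss, but it may be possible to apply similar ideas to develop nonuniform algorithms for other risk measures. This remains an open area for future research.

Acknowledgments
This work was supported by National Science Foundation Grant DMS-0914539.

Appendix. Proofs
In this section, we provide proofs for Theorems 2, 3, and 4 of §5.2, which analyze the performance of the Threshold estimator.

A.1. Preliminaries
Consider the Threshold estimator with n scenarios and a threshold parameter $\gamma$. Each scenario $\omega_i$ has inner loss samples $\hat Z_{i,1}, \hat Z_{i,2}, \ldots$, which, by Assumption 2, are i.i.d. normal random variables with mean $L(\omega_i)$ and standard deviation $\sigma(\omega_i)$. The estimator will generate $m_i$ inner stage samples in this scenario, with
$$m_i = \inf\{m > 0 : |S_m^{(i)}| \ge \gamma\}. \quad (25)$$
Here, for $m \ge 0$, the partial sum $S_m^{(i)}$ is defined by $S_m^{(i)} \triangleq \sum_{j=1}^{m} (\hat Z_{i,j} - c)/\sigma(\omega_i)$. Each term in this partial sum has mean $\mu(\omega_i) \triangleq (L(\omega_i) - c)/\sigma(\omega_i)$. By considering these partial sums over all n scenarios, the Threshold estimator can be written as $\hat\alpha_{\gamma,n} = (1/n) \sum_{i=1}^{n} \mathbf{1}\{S^{(i)}_{m_i} \ge \gamma\}$. We are interested in, as $\gamma \to \infty$, the asymptotic behavior of the bias,
$$\bar b(\gamma) \triangleq E[\hat\alpha_{\gamma,n} - \alpha] = E\bigl[\mathbf{1}\{S^{(i)}_{m_i} \ge \gamma\} - \mathbf{1}\{L(\omega_i) \ge c\}\bigr], \quad (26)$$
and the expected number of inner stage samples per scenario,
$$\bar m(\gamma) \triangleq E[m_i]. \quad (27)$$
Now, given $\mu$, define $P_\mu$ to be a probability measure so that, under $P_\mu$, the random variables $Y_1, Y_2, \ldots$ are a collection of i.i.d. normal random variables with mean $\mu$ and unit variance. For each $m \ge 0$, define the partial sum $S_m \triangleq \sum_{j=1}^{m} Y_j$. It follows from Assumption 2 that if $\mu = \mu(\omega_i)$, then $S_m^{(i)}$ has the same distribution as $S_m$. Define $m(\gamma) \triangleq \inf\{m > 0 : |S_m| \ge \gamma\}$. From (25) and (27), we have that $\bar m(\gamma)$, the expected number of inner stage samples for the Threshold estimator, satisfies
$$\bar m(\gamma) = \int E_\mu[m(\gamma)]\, p(\mu)\, d\mu. \quad (28)$$
Here, $E_\mu$ denotes expectation under the distribution $P_\mu$. Similarly, we can define
$$b_+(\gamma) \triangleq \mathbf{1}\{m(\gamma) < \infty \text{ and } S_{m(\gamma)} \ge \gamma\}, \qquad b_-(\gamma) \triangleq \mathbf{1}\{m(\gamma) < \infty \text{ and } S_{m(\gamma)} \le -\gamma\},$$
$$b(\gamma) \triangleq -b_-(\gamma)\,\mathbf{1}\{\mu \ge 0\} + b_+(\gamma)\,\mathbf{1}\{\mu < 0\}.$$
Then, from (26), we have that $\bar b(\gamma)$, the bias of the Threshold estimator, satisfies
$$\bar b(\gamma) = \int E_\mu[b(\gamma)]\, p(\mu)\, d\mu. \quad (29)$$
Finally, by Assumption 2, define $\epsilon \in (0, 1)$ so that p is continuously differentiable over the interval $[-\epsilon, \epsilon]$, and set
$$U_0 \triangleq \max_{\mu \in [-\epsilon, \epsilon]} p(\mu), \qquad U_1 \triangleq \max_{\mu \in [-\epsilon, \epsilon]} |p'(\mu)|. \quad (30)$$

A.2. Asymptotic Bias
The asymptotic bias result of Theorem 2 is that, as $\gamma \to \infty$, $\bar b(\gamma) = O(\gamma^{-2})$. We will establish this via a careful analysis of (29). In particular, consider the decomposition
$$\begin{aligned}
|\bar b(\gamma)| &\le \left| \int_{|\mu| > \epsilon} E_\mu[b(\gamma)]\, p(\mu)\, d\mu \right| + \left| \int_{-\epsilon}^{\epsilon} E_\mu[b(\gamma)]\, p(\mu)\, d\mu \right| \\
&\le \left| \int_{|\mu| > \epsilon} E_\mu[b(\gamma)]\, p(\mu)\, d\mu \right| + \left| \int_{-\epsilon}^{\epsilon} E_\mu[b(\gamma)]\, p(0)\, d\mu \right| + \left| \int_{-\epsilon}^{\epsilon} E_\mu[b(\gamma)]\, \mu\, p'(\eta(\mu))\, d\mu \right|. \quad (31)
\end{aligned}$$
Here, using Assumption 2, we have applied Taylor's theorem, and $\eta$ is a function with $|\eta(\mu)| \le \epsilon$ for all $\mu \in [-\epsilon, \epsilon]$. By symmetry, for any $\mu$, we have that $E_\mu[b(\gamma)] = -E_{-\mu}[b(\gamma)]$, so the middle term in (31) vanishes. Then,
$$\begin{aligned}
|\bar b(\gamma)| &\le \left| \int_{|\mu| > \epsilon} E_\mu[b(\gamma)]\, p(\mu)\, d\mu \right| + \left| \int_{-\epsilon}^{\epsilon} E_\mu[b(\gamma)]\, \mu\, p'(\eta(\mu))\, d\mu \right| \\
&\le \left| \int_{|\mu| > \epsilon} E_\mu[b(\gamma)]\, p(\mu)\, d\mu \right| + U_1 \int_{-\epsilon}^{\epsilon} |\mu|\, \bigl|E_\mu[b(\gamma)]\bigr|\, d\mu. \quad (32)
\end{aligned}$$
Theorem 2 will follow by applying Lemmas 1 and 2, established below, to (32).

We begin with a preliminary proposition:

Proposition 1. If $\mu < 0$, then
$$e^{2\mu\gamma}\, E_{-\mu}\bigl[(1 + 2\mu(S_{m(\gamma)} - \gamma))\, \mathbf{1}\{S_{m(\gamma)} \ge \gamma\}\bigr] \le P_\mu(S_{m(\gamma)} \ge \gamma) \le e^{2\mu\gamma}\, P_{-\mu}(S_{m(\gamma)} \ge \gamma).$$
If $\mu > 0$, then
$$e^{-2\mu\gamma}\, E_{-\mu}\bigl[(1 + 2\mu(S_{m(\gamma)} + \gamma))\, \mathbf{1}\{S_{m(\gamma)} \le -\gamma\}\bigr] \le P_\mu(S_{m(\gamma)} \le -\gamma) \le e^{-2\mu\gamma}\, P_{-\mu}(S_{m(\gamma)} \le -\gamma).$$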
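Proof. Consider the case where $\mu < 0$. Let $F_\mu$ denote the $N(\mu, 1)$ distribution. Note that the Radon-Nikodym derivative between $F_\mu$ and $F_{-\mu}$ is given by $e^{2\mu y}$. Then,
$$P_\mu(S_{m(\gamma)} \ge \gamma) = E_\mu[\mathbf{1}\{S_{m(\gamma)} \ge \gamma\}] = E_{-\mu}\bigl[e^{2\mu S_{m(\gamma)}}\, \mathbf{1}\{S_{m(\gamma)} \ge \gamma\}\bigr] = e^{2\mu\gamma}\, E_{-\mu}\bigl[e^{2\mu(S_{m(\gamma)} - \gamma)}\, \mathbf{1}\{S_{m(\gamma)} \ge \gamma\}\bigr]. \quad (33)$$
For $x > 0$, we have that $1 - x \le e^{-x} \le 1$. Thus,
$$(1 + 2\mu(S_{m(\gamma)} - \gamma))\, \mathbf{1}\{S_{m(\gamma)} \ge \gamma\} \le e^{2\mu(S_{m(\gamma)} - \gamma)}\, \mathbf{1}\{S_{m(\gamma)} \ge \gamma\} \le \mathbf{1}\{S_{m(\gamma)} \ge \gamma\}.$$
The result follows after taking an expectation with respect to $P_{-\mu}$, and applying (33). The case where $\mu > 0$ is handled similarly.

Lemma 1. As $\gamma \to \infty$,
$$\left| \int_{|\mu| > \epsilon} E_\mu[b(\gamma)]\, p(\mu)\, d\mu \right| = o(\gamma^{-2}).$$
Proof. Note that
$$\begin{aligned}
\left| \int_{|\mu| > \epsilon} E_\mu[b(\gamma)]\, p(\mu)\, d\mu \right|
&= \left| -\int_{\mu > \epsilon} E_\mu[b_-(\gamma)]\, p(\mu)\, d\mu + \int_{\mu < -\epsilon} E_\mu[b_+(\gamma)]\, p(\mu)\, d\mu \right| \\
&\le \int_{\mu > \epsilon} E_\mu[b_-(\gamma)]\, p(\mu)\, d\mu + \int_{\mu < -\epsilon} E_\mu[b_+(\gamma)]\, p(\mu)\, d\mu \\
&= \int_{\mu > \epsilon} P_\mu(S_{m(\gamma)} \le -\gamma)\, p(\mu)\, d\mu + \int_{\mu < -\epsilon} P_\mu(S_{m(\gamma)} \ge \gamma)\, p(\mu)\, d\mu.
\end{aligned}$$
By Proposition 1,
$$\begin{aligned}
\left| \int_{|\mu| > \epsilon} E_\mu[b(\gamma)]\, p(\mu)\, d\mu \right|
&\le \int_{\mu > \epsilon} e^{-2\mu\gamma}\, P_{-\mu}(S_{m(\gamma)} \le -\gamma)\, p(\mu)\, d\mu + \int_{\mu < -\epsilon} e^{2\mu\gamma}\, P_{-\mu}(S_{m(\gamma)} \ge \gamma)\, p(\mu)\, d\mu \\
&\le \int_{\mu > \epsilon} e^{-2\mu\gamma}\, p(\mu)\, d\mu + \int_{\mu < -\epsilon} e^{2\mu\gamma}\, p(\mu)\, d\mu \\
&\le e^{-2\epsilon\gamma} \int_{|\mu| > \epsilon} p(\mu)\, d\mu = o(\gamma^{-2}).
\end{aligned}$$

Lemma 2. As $\gamma \to \infty$,
$$\int_{-\epsilon}^{\epsilon} |\mu|\, \bigl|E_\mu[b(\gamma)]\bigr|\, d\mu = O(\gamma^{-2}).$$
Proof. Notice that, using Proposition 1,
$$\begin{aligned}
\int_{-\epsilon}^{\epsilon} |\mu|\, \bigl|E_\mu[b(\gamma)]\bigr|\, d\mu
&\le \int_{-\epsilon}^{0} |\mu|\, E_\mu[b_+(\gamma)]\, d\mu + \int_{0}^{\epsilon} |\mu|\, E_\mu[b_-(\gamma)]\, d\mu \\
&\le \int_{-\epsilon}^{0} |\mu|\, e^{2\mu\gamma}\, P_{-\mu}(S_{m(\gamma)} \ge \gamma)\, d\mu + \int_{0}^{\epsilon} |\mu|\, e^{-2\mu\gamma}\, P_{-\mu}(S_{m(\gamma)} \le -\gamma)\, d\mu \\
&\le \int_{-\epsilon}^{0} |\mu|\, e^{2\mu\gamma}\, d\mu + \int_{0}^{\epsilon} \mu\, e^{-2\mu\gamma}\, d\mu \\
&= \frac{1}{2\gamma^2} - \frac{e^{-2\epsilon\gamma}}{2\gamma^2} - \frac{\epsilon\, e^{-2\epsilon\gamma}}{\gamma}.
\end{aligned}$$
The result follows.

A.3. Expected Number of Inner Samples
The asymptotic characterization of the number of inner samples provided by Theorem 3 is that, as $\gamma \to \infty$, $\bar m(\gamma) = O(\gamma \log \gamma)$. We will establish this via an analysis of (28). In particular, we have that
$$\bar m(\gamma) = \int_{|\mu| > 1/\gamma} E_\mu[m(\gamma)]\, p(\mu)\, d\mu + \int_{|\mu| \le 1/\gamma} E_\mu[m(\gamma)]\, p(\mu)\, d\mu. \quad (34)$$
Theorem 3 will follow by applying Lemmas 4 and 5, established below, to (34). To this end, the following result will be helpful.

Lemma 3. Suppose $Y_1, Y_2, \ldots$ are i.i.d. random variables under the probability measure $P_\mu$, with $E_\mu[Y_1] = \mu$ and $E_\mu[Y_1^2] < \infty$. Define, for $m \ge 0$, the partial sum $S_m \triangleq \sum_{j=1}^{m} Y_j$ and, for $\gamma > 0$, the hitting times
$$m_+(\gamma) \triangleq \inf\{m > 0 : S_m > \gamma\}, \qquad m_*(\gamma) \triangleq \inf\{m > 0 : |S_m| > \gamma\}.$$
(i) Lorden (1970): Suppose that $\mu > 0$. Then, if $x^+ \triangleq \max(x, 0)$,
$$\sup_{\gamma > 0} E_\mu[S_{m_+(\gamma)} - \gamma] \le \frac{E_\mu[(Y_1^+)^2]}{\mu}.$$
(ii) Pruitt (1981): There exist constants $V_1$ and $V_2$ (independent of the distribution of $Y_1$) such that, if $K(\gamma) \triangleq \gamma^{-2} E_\mu[|Y_1|^2\, \mathbf{1}\{|Y_1| \le \gamma\}]$,
$$E_\mu[m_*(\gamma)] \le \frac{V_1}{K(\gamma)}, \qquad P_\mu\Bigl(\max_{1 \le m \le n} |S_m| \le \gamma\Bigr) \le \frac{V_2}{(n K(\gamma))^3}.$$
(iii) Gut (1974):
$$E_\mu[(Y^+_{m_+(\gamma)})^2] \le E_\mu[m_+(\gamma)]\, E_\mu[(Y_1^+)^2].$$

Lemma 4. As $\gamma \to \infty$,
$$\int_{|\mu| > 1/\gamma} E_\mu[m(\gamma)]\, p(\mu)\, d\mu = O(\gamma \log \gamma).$$
Proof. Note that because $Y_1$ has mean $\mu$ and unit variance under the distribution $P_\mu$,
$$E_\mu[(Y_1^+)^2] \le E_\mu[|Y_1|^2] = 1 + \mu^2. \quad (35)$$
Furthermore, define the one-sided hitting time $m_+(\gamma)$ as in Lemma 3 and, symmetrically, $m_-(\gamma) \triangleq \inf\{m > 0 : S_m < -\gamma\}$. By the optional stopping theorem,
$$E_\mu[S_{m_+(\gamma)}] = \mu\, E_\mu[m_+(\gamma)] \text{ if } \mu > 0; \qquad E_\mu[S_{m_-(\gamma)}] = \mu\, E_\mu[m_-(\gamma)] \text{ if } \mu < 0.$$
Then, because $m(\gamma) \le m_+(\gamma)$ and $m(\gamma) \le m_-(\gamma)$, we have that
$$\begin{aligned}
\int_{|\mu| > 1/\gamma} E_\mu[m(\gamma)]\, p(\mu)\, d\mu
&\le \int_{-\infty}^{-1/\gamma} E_\mu[m_-(\gamma)]\, p(\mu)\, d\mu + \int_{1/\gamma}^{\infty} E_\mu[m_+(\gamma)]\, p(\mu)\, d\mu \\
&= \int_{-\infty}^{-1/\gamma} \frac{E_\mu[S_{m_-(\gamma)}]}{\mu}\, p(\mu)\, d\mu + \int_{1/\gamma}^{\infty} \frac{E_\mu[S_{m_+(\gamma)}]}{\mu}\, p(\mu)\, d\mu \\
&\le \int_{-\infty}^{-1/\gamma} \frac{1}{|\mu|} \left( \gamma + \frac{1 + \mu^2}{|\mu|} \right) p(\mu)\, d\mu + \int_{1/\gamma}^{\infty} \frac{1}{\mu} \left( \gamma + \frac{1 + \mu^2}{\mu} \right) p(\mu)\, d\mu \\
&= \int_{|\mu| > 1/\gamma} \left( 1 + \frac{1}{\mu^2} + \frac{\gamma}{|\mu|} \right) p(\mu)\, d\mu. \quad (36)
\end{aligned}$$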
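Here, the final inequality follows from (35) and part (i) of Lemma 3 (applied to the negated increments in the case $\mu < 0$).

Now, without loss of generality, assume that $\gamma > 1/\epsilon$. Recalling $U_0$ from (30), we have that
$$\begin{aligned}
\int_{|\mu| > 1/\gamma} E_\mu[m(\gamma)]\, p(\mu)\, d\mu
&\le \int_{\epsilon \ge |\mu| > 1/\gamma} \left( 1 + \frac{1}{\mu^2} + \frac{\gamma}{|\mu|} \right) p(\mu)\, d\mu + \int_{|\mu| > \epsilon} \left( 1 + \frac{1}{\mu^2} + \frac{\gamma}{|\mu|} \right) p(\mu)\, d\mu \\
&\le 2 U_0 \int_{1/\gamma}^{\epsilon} \left( 1 + \frac{1}{\mu^2} + \frac{\gamma}{\mu} \right) d\mu + \left( 1 + \frac{1}{\epsilon^2} + \frac{\gamma}{\epsilon} \right) \int_{|\mu| > \epsilon} p(\mu)\, d\mu \\
&\le 2 U_0 \left( \epsilon - \gamma^{-1} + \gamma - \epsilon^{-1} + \gamma \log \epsilon + \gamma \log \gamma \right) + 1 + \epsilon^{-2} + \gamma \epsilon^{-1} \\
&= O(\gamma \log \gamma). \quad (37)
\end{aligned}$$

Lemma 5. As $\gamma \to \infty$,
$$\int_{|\mu| \le 1/\gamma} E_\mu[m(\gamma)]\, p(\mu)\, d\mu = O(\gamma).$$
Proof. Here, we will apply part (ii) of Lemma 3. Without loss of generality, assume that $\gamma > 1$. Then, $|\mu| < 1$ in the region of integration, and thus $\gamma - |\mu| > 0$. From Lemma 3, $K(\gamma)$ satisfies the following:
$$\begin{aligned}
K(\gamma) &= \gamma^{-2} E_\mu\bigl[|Y_1|^2\, \mathbf{1}\{|Y_1| \le \gamma\}\bigr] = \gamma^{-2} E_0\bigl[|Y_1 + \mu|^2\, \mathbf{1}\{|Y_1 + \mu| \le \gamma\}\bigr] \\
&\ge \gamma^{-2} E_0\bigl[|Y_1 + \mu|^2\, \mathbf{1}\{|Y_1| \le \gamma - |\mu|\}\bigr] \\
&\ge \gamma^{-2} \bigl( E_0\bigl[|Y_1|^2\, \mathbf{1}\{|Y_1| \le \gamma - |\mu|\}\bigr] + 2\mu\, E_0\bigl[Y_1\, \mathbf{1}\{|Y_1| \le \gamma - |\mu|\}\bigr] \bigr) \\
&= \gamma^{-2} E_0\bigl[|Y_1|^2\, \mathbf{1}\{|Y_1| \le \gamma - |\mu|\}\bigr]. \quad (38)
\end{aligned}$$
Here, we have used the fact that under $P_0$, $Y_1 \sim N(0, 1)$, hence $E_0[Y_1\, \mathbf{1}\{|Y_1| \le \gamma - |\mu|\}] = 0$.

Then, from part (ii) of Lemma 3, because $m(\gamma) \le m_*(\gamma)$,
$$E_\mu[m(\gamma)] \le E_\mu[m_*(\gamma)] \le \frac{V_1 \gamma^2}{E_0\bigl[|Y_1|^2\, \mathbf{1}\{|Y_1| \le \gamma - |\mu|\}\bigr]}.$$
Without loss of generality, assume that $\gamma > 1/\epsilon$, and recall $U_0$ from (30). Then,
$$\begin{aligned}
\int_{|\mu| \le 1/\gamma} E_\mu[m(\gamma)]\, p(\mu)\, d\mu
&\le \int_{|\mu| \le 1/\gamma} \frac{V_1 \gamma^2}{E_0\bigl[|Y_1|^2\, \mathbf{1}\{|Y_1| \le \gamma - |\mu|\}\bigr]}\, p(\mu)\, d\mu \\
&\le \frac{V_1 \gamma^2}{E_0\bigl[|Y_1|^2\, \mathbf{1}\{|Y_1| \le \gamma - 1\}\bigr]} \int_{|\mu| \le 1/\gamma} p(\mu)\, d\mu \\
&\le \frac{2 U_0 V_1 \gamma}{E_0\bigl[|Y_1|^2\, \mathbf{1}\{|Y_1| \le \gamma - 1\}\bigr]}. \quad (39)
\end{aligned}$$
Notice that $\gamma > 1 > |\mu|$ is assumed before. By the monotone convergence theorem,
$$\lim_{\gamma \to \infty} E_0\bigl[|Y_1|^2\, \mathbf{1}\{|Y_1| \le \gamma - 1\}\bigr] = E_0[|Y_1|^2] = 1. \quad (40)$$
The result follows.

A.4. Realized Number of Inner Samples
In this section, we will establish Theorem 4, which provides a probabilistic bound on the realized number of inner stage samples per scenario. Our proof relies on the following lemma, which bounds the second moment of the number of inner stage samples per scenario.

Lemma 6. As $\gamma \to \infty$, $E[m(\gamma)^2] = O(\gamma^3)$.

We will defer the proof of Lemma 6 for the moment, and first employ this lemma to prove Theorem 4.

Proof of Theorem 4. Fix $\delta > 0$, and suppose that $\gamma \ge \gamma_0$. Then, by Chebyshev's inequality,
$$\begin{aligned}
P\left( \frac{1}{n} \sum_{i=1}^{n} m_i \ge (C_0 + \delta)\, \gamma \log \gamma \right)
&\le P\left( \frac{1}{n} \sum_{i=1}^{n} (m_i - \bar m(\gamma)) \ge \delta\, \gamma \log \gamma \right) \\
&\le \frac{\mathrm{Var}(m(\gamma))}{n (\delta\, \gamma \log \gamma)^2} \le \frac{E[m(\gamma)^2]}{n (\delta\, \gamma \log \gamma)^2}.
\end{aligned}$$
By Lemma 6, there exist constants $C_0', \gamma_0' > 0$ so that if $\gamma \ge \gamma_0'$, $E[m(\gamma)^2] \le C_0' \gamma^3$. Then, if $\gamma \ge \max\{\gamma_0, \gamma_0', 1\}$, we have that
$$P\left( \frac{1}{n} \sum_{i=1}^{n} m_i \ge (C_0 + \delta)\, \gamma \log \gamma \right) \le \frac{C_0'}{C_1 (\delta \log \gamma)^2},$$
which can be made arbitrarily small with sufficiently large $\gamma$.

To prove Lemma 6, consider the decomposition
$$E[m(\gamma)^2] = \int_{|\mu| > 1/\gamma} E_\mu[m(\gamma)^2]\, p(\mu)\, d\mu + \int_{|\mu| \le 1/\gamma} E_\mu[m(\gamma)^2]\, p(\mu)\, d\mu. \quad (41)$$
Lemma 6 will follow by applying Lemmas 7 and 8, established below, to (41).

Lemma 7. As $\gamma \to \infty$,
$$\int_{|\mu| > 1/\gamma} E_\mu[m(\gamma)^2]\, p(\mu)\, d\mu = O(\gamma^3).$$
Proof. We proceed as in the proof of Lemma 4. Using the stopping times $m_+(\gamma)$ and $m_-(\gamma)$ defined there, we have
$$\int_{|\mu| > 1/\gamma} E_\mu[m(\gamma)^2]\, p(\mu)\, d\mu \le \int_{-\infty}^{-1/\gamma} E_\mu[m_-(\gamma)^2]\, p(\mu)\, d\mu + \int_{1/\gamma}^{\infty} E_\mu[m_+(\gamma)^2]\, p(\mu)\, d\mu.$$
First, consider the case when $\mu > 0$. By the optional stopping theorem applied to the quadratic martingale $(S_m - m\mu)^2 - m$, we have that $E_\mu[(S_{m_+(\gamma)} - \mu\, m_+(\gamma))^2] = E_\mu[m_+(\gamma)]$. Now, for any real numbers a, b, we have that $(a + b)^2 \le 2(a^2 + b^2)$. Therefore,
$$\begin{aligned}
E_\mu[m_+(\gamma)^2] &\le \frac{2}{\mu^2} \bigl( E_\mu[(S_{m_+(\gamma)} - \mu\, m_+(\gamma))^2] + E_\mu[S^2_{m_+(\gamma)}] \bigr) \\
&= \frac{2}{\mu^2} \bigl( E_\mu[m_+(\gamma)] + E_\mu[S^2_{m_+(\gamma)}] \bigr).
\end{aligned}$$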
Using the fact that $S_{m_+(\gamma)} \le \gamma + Y^+_{m_+(\gamma)}$, part (iii) of Lemma 3, and (35),
$$\begin{aligned}
E_\mu[m_+(\gamma)^2] &\le \frac{2}{\mu^2} \bigl( E_\mu[m_+(\gamma)] + E_\mu[(\gamma + Y^+_{m_+(\gamma)})^2] \bigr) \\
&\le \frac{2}{\mu^2} \bigl( E_\mu[m_+(\gamma)] + 2\gamma^2 + 2 E_\mu[(Y^+_{m_+(\gamma)})^2] \bigr) \\
&\le \frac{2}{\mu^2} \bigl( E_\mu[m_+(\gamma)] + 2\gamma^2 + 2 E_\mu[m_+(\gamma)]\, E_\mu[(Y_1^+)^2] \bigr) \\
&\le \frac{2}{\mu^2} \bigl( E_\mu[m_+(\gamma)] + 2\gamma^2 + 2(\mu^2 + 1)\, E_\mu[m_+(\gamma)] \bigr) \\
&= \left( \frac{6}{\mu^2} + 4 \right) E_\mu[m_+(\gamma)] + \frac{4\gamma^2}{\mu^2}.
\end{aligned}$$
By similar consideration of the symmetric case where $\mu < 0$, we have, repeating the calculation in (36),
$$\int_{|\mu| > 1/\gamma} E_\mu[m(\gamma)^2]\, p(\mu)\, d\mu \le \int_{|\mu| > 1/\gamma} \left[ \frac{4\gamma^2}{\mu^2} + \left( \frac{6}{\mu^2} + 4 \right) \left( 1 + \frac{\gamma}{|\mu|} + \frac{1}{\mu^2} \right) \right] p(\mu)\, d\mu.$$
Without loss of generality, assume that $\gamma > 1/\epsilon$. Then, as in (37),
$$\begin{aligned}
\int_{|\mu| > 1/\gamma} E_\mu[m(\gamma)^2]\, p(\mu)\, d\mu
&\le 2 U_0 \int_{1/\gamma}^{\epsilon} \left[ \frac{4\gamma^2}{\mu^2} + \left( \frac{6}{\mu^2} + 4 \right) \left( 1 + \frac{\gamma}{\mu} + \frac{1}{\mu^2} \right) \right] d\mu \\
&\quad + \frac{4\gamma^2}{\epsilon^2} + \left( \frac{6}{\epsilon^2} + 4 \right) \left( 1 + \frac{\gamma}{\epsilon} + \frac{1}{\epsilon^2} \right) = O(\gamma^3).
\end{aligned}$$

Lemma 8. As $\gamma \to \infty$,
$$\int_{|\mu| \le 1/\gamma} E_\mu[m(\gamma)^2]\, p(\mu)\, d\mu = O(\gamma^3).$$
Proof. We proceed as in Lemma 5. Without loss of generality, assume that $\gamma > 1$. Observe that $m(\gamma) \le m_*(\gamma)$, because the latter is an exit time for a larger set than the former. Then, using summation by parts,
$$\begin{aligned}
E_\mu[m(\gamma)^2] \le E_\mu[m_*(\gamma)^2] &= \sum_{n=1}^{\infty} n^2\, P_\mu(m_*(\gamma) = n) \\
&= 1 + \sum_{n=1}^{\infty} (2n + 1)\, P_\mu(m_*(\gamma) > n) \\
&= 1 + \sum_{n=1}^{\infty} (2n + 1)\, P_\mu\Bigl(\max_{1 \le m \le n} |S_m| \le \gamma\Bigr).
\end{aligned}$$
Using part (ii) of Lemma 3, for any integer $N > 1$,
$$\begin{aligned}
E_\mu[m(\gamma)^2] &\le \sum_{n=0}^{N-1} (2n + 1) + \frac{V_2}{K(\gamma)^3} \sum_{n=N}^{\infty} \frac{2n + 1}{n^3} \\
&\le N^2 + \frac{3 V_2}{K(\gamma)^3} \sum_{n=N}^{\infty} \frac{1}{n^2} \le N^2 + \frac{3 V_2}{K(\gamma)^3 (N - 1)}.
\end{aligned}$$
Because $K(\gamma) \le 1$, we may take $N \triangleq \lfloor 3/K(\gamma) \rfloor$, so that $N - 1 \ge 1/K(\gamma)$. Then, there exists a constant $W_0$ so that
$$E_\mu[m(\gamma)^2] \le \frac{W_0}{K(\gamma)^2} \le \frac{W_0 \gamma^4}{\bigl( E_0\bigl[|Y_1|^2\, \mathbf{1}\{|Y_1| \le \gamma - |\mu|\}\bigr] \bigr)^2},$$
using (38).

Without loss of generality, assume that $\gamma > 1/\epsilon$. Then, as in (39),
$$\int_{|\mu| \le 1/\gamma} E_\mu[m(\gamma)^2]\, p(\mu)\, d\mu \le \frac{W_0 \gamma^4}{\bigl( E_0\bigl[|Y_1|^2\, \mathbf{1}\{|Y_1| \le \gamma - 1\}\bigr] \bigr)^2} \int_{|\mu| \le 1/\gamma} p(\mu)\, d\mu \le \frac{2 U_0 W_0 \gamma^3}{\bigl( E_0\bigl[|Y_1|^2\, \mathbf{1}\{|Y_1| \le \gamma - 1\}\bigr] \bigr)^2}.$$
The result follows from (40).

References
Artzner, P., F. Delbaen, J.-M. Eber, D. Heath. 2000. Coherent measures of risk. Math. Finance 9(3) 203-228.
Chan, T. F., G. H. Golub, R. J. LeVeque. 1983. Algorithms for computing the sample variance: Analysis and recommendations. Amer. Statist. 37(3) 242-247.
Chen, C.-H., L. Lee. 2010. Stochastic Simulation Optimization: An Optimal Computing Budget Allocation. World Scientific, Singapore.
Cormen, T. H., C. E. Leiserson, R. L. Rivest, C. Stein. 2002. Introduction to Algorithms, 2nd ed. MIT Press, Cambridge, MA.
Crouhy, M., D. Galai, R. Mark. 2000. Risk Management. McGraw-Hill, New York.
Föllmer, H., A. Schied. 2002. Robust preferences and convex measures of risk. K. Sandmann, P. J. Schönbucher, eds. Advances in Finance and Stochastics. Springer-Verlag, Berlin, 39-56.
Frazier, P. I., W. B. Powell, S. Dayanik. 2008. A knowledge-gradient policy for sequential information collection. SIAM J. Control Optim. 47(5) 2410-2439.
Glasserman, P., P. Heidelberger, P. Shahabuddin. 2000. Variance reduction techniques for estimating value-at-risk. Management Sci. 46(10) 1349-1364.
Glasserman, P., P. Heidelberger, P. Shahabuddin. 2002. Portfolio value-at-risk with heavy-tailed risk factors. Math. Finance 12(3) 239-270.
Gordy, M. B., S. Juneja. 2008. Nested simulation in portfolio risk measurement. FEDS 2008-21, Federal Reserve Board, Washington, DC.
Gordy, M. B., S. Juneja. 2010. Nested simulation in portfolio risk management. Management Sci. 56(10) 1833-1848.
Gut, A. 1974. On the moments and limit distributions of some first passage times. Ann. Probab. 2(2) 277-308.
Hong, L. J., S. Juneja. 2009. Estimating the mean of a non-linear function of conditional expectation. M. D. Rossetti, R. R. Hill, B. Johansson, A. Dunkin, R. G. Ingalls, eds. Proc. 2009 Winter Simulation Conf., IEEE Press, Piscataway, NJ, 1223-1236.
Jorion, P. 2006. Value at Risk. McGraw-Hill, New York.
Kim, S.-H., B. L. Nelson. 2005. Selecting the best system. S. G. Henderson, B. L. Nelson, eds. Handbooks in Operations Research and Management Science: Simulation. Elsevier, Oxford, UK, 501-534.
Lan, H., B. L. Nelson, J. Staum. 2007. Two-level simulations for risk management. Proc. 2007 INFORMS Simulation Soc. Res. Workshop, Fontainebleau, France, 102-107.
Lan, H., B. L. Nelson, J. Staum. 2010. A confidence interval procedure for expected shortfall risk measurement via two-level simulation. Oper. Res. 58(5) 1481-1490.
Lee, S.-H. 1998. Monte Carlo Computation of Conditional Expectation Quantiles. Ph.D. thesis, Stanford University, Stanford, CA.
Lee, S.-H., P. W. Glynn. 2003. Computing the distribution function of a conditional expectation via Monte Carlo: Discrete conditioning spaces. ACM Trans. Modeling Comput. Simulation 13(3) 238-258.
Lesnevski, V., B. L. Nelson, J. Staum. 2004. Simulation of coherent risk measures. Proc. 2004 Winter Simulation Conf., IEEE Press, Piscataway, NJ, 1579-1585.
Lesnevski, V., B. L. Nelson, J. Staum. 2007. Simulation of coherent risk measures based on generalized scenarios. Management Sci. 53(11) 1756-1769.
Liu, M., J. Staum. 2010. Stochastic kriging for efficient nested simulation of expected shortfall. J. Risk 12(3) 3-27.
Liu, M., B. L. Nelson, J. Staum. 2010. An efficient simulation procedure for point estimation of expected shortfall. Proc. 2010 Winter Simulation Conf., IEEE Press, Piscataway, NJ, 2821-2831.
Lorden, G. 1970. On excess over the boundary. Ann. Math. Statist. 41(2) 520-527.
McNeil, A., R. Frey, P. Embrechts. 2006. Quantitative Risk Management: Concepts, Techniques, and Tools. Princeton University Press, Princeton, NJ.
Pruitt, W. E. 1981. The growth of random walks and Lévy processes. Ann. Probab. 9(6) 948-956.
Rockafellar, R. T., S. Uryasev. 2002. Conditional value-at-risk for general loss distributions. J. Banking Finance 26(7) 1443-1471.
Siegmund, D. 1985. Sequential Analysis: Tests and Confidence Intervals. Springer, New York.
Sun, Y., D. W. Apley, J. Staum. 2011. Efficient nested simulation for estimating the variance of a conditional expectation. Oper. Res. Forthcoming.
