
RiskMetrics Journal
Volume 8, Number 1
Winter 2008

› Volatility Forecasts and At-the-Money Implied Volatility
› Inflation Risk Across the Board
› Extensions of the Merger Arbitrage Risk Model
› Measuring the Quality of Hedge Fund Data
› Capturing Risks of Non-transparent Hedge Funds
a RiskMetrics Group Publication

On the Cover:
Weights wk(∆T) as a function of the forecast horizon ∆T (Figure 1).
Editor’s Note

Christopher C. Finger
RiskMetrics Group
chris.finger@riskmetrics.com

In this, the 2008 issue of the RiskMetrics Journal, we present five papers treating types of risk that are
inadequately served by what we might call the “standard risk model”. By the standard model, we
refer to a model that takes the natural observed market risk factors for each instrument, and models
these factors with some continuous process. Applying this framework to less traditional asset classes
presents a variety of challenges: infrequent (or non-existent) market observations, inconsistencies in
conventions across instrument types, and positions with risks not represented in the natural data
choices. The papers herein represent a number of our efforts to tackle these challenges and to extend
the standard model.

In the first paper, Gilles Zumbach considers volatility risk. Though this is a subject we have treated in
the past, we are still faced with the problem of modeling implied volatility as a risk factor when
consistent observations of implied volatility are not available. Gilles examines a number of processes
for option underlyings, and studies the processes for volatility that these underlying processes induce.
He shows empirically that these induced volatility processes do a nice job of capturing the dynamics
of market implied volatility. Further, he compares the induced volatility to some standard pricing
models and shows that while the induced processes do have a similar structure to these models, they
ultimately describe a different evolution of volatility—a result that may have broader implications for
how we treat volatility as a risk factor in the future.

In the next article, Fabien Couderc presents a framework for assessing inflation risk. Inflation is
another topic we have examined previously, but the development of a variety of inflation products has
meant that what at one point was a natural choice of risk factors is no longer adequate. To achieve a
consistent and well behaved risk factor, Fabien proposes the use of break-even inflation, with
adjustments to filter out both predictable effects (such as seasonality) and specific market conventions.

Our third article, by Stéphane Daul, is an extension of his article on merger arbitrage from last year.
Merger arbitrage positions would seem to be nothing more than equity pairs, but the nature of the
participants in a proposed merger—in particular, that the equities are no longer well represented by
their historical price moves—requires a specific model. Stéphane extends the validation of his model
from last year, and presents an empirical model for forecasting the probability of merger success.
Interestingly, this simple model appears to outperform the market-implied probability, confirming the
opportunities in this area of trading.

The final two articles both analyze hedge fund returns. Of course, when treated as a set of positions,
hedge fund portfolios may be handled with a standard risk model, as long as the model is sophisticated
enough for all of the fund's positions. But in the absence of position information, hedge fund returns
themselves present a new risk modeling challenge. One aspect of the challenge is the quality of the
hedge fund return data. Daniel Straumann examines this issue in our fourth paper, introducing a
scoring mechanism for hedge fund data quality. Daniel then investigates which aspects of hedge funds
most influence the quality of the return data, and quantifies the performance bias induced by funds
with poor data quality.

Finally, in our last paper, Stéphane Daul introduces a model for hedge fund returns. The unique aspect
of this model is to treat hedge fund returns properly as time series, in order to capture the dynamics
that we observe empirically. Through a backtesting exercise, Stéphane demonstrates the efficacy of
the model, in particular the improvements it brings over the typical approaches in the hedge fund
investment literature.
Volatility Forecasts and At-the-Money Implied Volatility
Gilles Zumbach
RiskMetrics Group
gilles.zumbach@riskmetrics.com

This article explores the relationship between realized volatility, implied volatility and several
forecasts for the volatility built from multiscale linear ARCH processes. The forecasts are derived
from the process equations, with the parameters set according to different risk methodologies
(RM1994, RM2006). An empirical analysis across multiple time horizons shows that a forecast
provided by an I-GARCH(1) process (one time scale) does not correctly capture the dynamics of
the realized volatility. An I-GARCH(2) process (two time scales, similar to GARCH(1,1)) is
better, but only a long memory LM-ARCH process (multiple time scales) correctly replicates the
dynamics of the realized volatility. Moreover, the forecasts provided by the LM-ARCH process are
close to the implied volatility. The relationship between market models for the forward variance
and the volatility forecasts provided by ARCH processes is investigated. The structure of the
forecast equations is identical, but with different coefficients. Yet the process equations for the
variance induced by an ARCH model are very different from those postulated for a market model,
and are not of any usual diffusive type.

1 Introduction

The intuition behind volatility is to measure price fluctuations, or equivalently, the typical magnitude
for the price changes. Yet beyond the first intuition, volatility is a fairly complex concept for various
reasons. First, turning this intuition into formulas and numbers is partly arbitrary, and many
meaningful and useful definitions of volatilities can be given. Second, the volatility is not directly
“observed” or traded, but rather computed from time series (although this situation is changing
indirectly through the ever increasing and sophisticated option market and the volatility indexes). For
trading strategies, options and risk evaluations, the valuable quantity is the realized volatility, namely
the volatility that will occur between the current time t and some time in the future t + ∆T . As this
quantity is not available at time t, a forecast needs to be constructed. Clearly, a better forecast of the
realized volatility improves option pricing, volatility trading and portfolio risk management.

At a time t, a forecast for the realized volatility can be constructed from the (underlying) price time
series. In this paper, multiscale ARCH processes are used. On the other hand, a liquid option market
furnishes the implied volatility, corresponding to the “market” forecast for the realized volatility. On
the theoretical side, an “instantaneous”, or effective, volatility σeff is needed to define processes, and
the forward variance. Therefore, at a given time t, we have mainly one theoretical instantaneous
volatility and three notions of “observable” volatility (forecasted, implied and realized). This paper
studies the empirical relationship between these three time series, as a function of the forecast horizon
∆T .1

The main line of this work is to model the underlying time series by multicomponent ARCH
processes, and to derive an implied volatility forecast. This forecast is close to the implied volatility
for the at-the-money option. Such an approach produces a volatility surface based only on the
underlying time series, and therefore a surface can be inferred even when option data is poor or not
available. This article does not address the issue of the full surface, but only the implied volatility for
the at-the-money options, called the backbone.

A vast literature on implied volatility and its dynamic already exists. In this article, we will review
some recent developments on market models for the forward variance. These models focus on the
volatility as a process, and many process equations can be set that are compatible with a martingale
condition for the volatility. On the other side, the volatility forecast as induced by a multicomponent
ARCH process leads also to process equations for the volatility itself. These two approaches leading
to the volatility process are contrasted, showing the formal similarity in the structure of the forecasts,
but the very sharp difference in the processes for the volatility. If the price time series behave
according to some ARCH process, then the implication for volatility modeling is far reaching, as the
usual structure based on a Wiener process cannot be used.

This paper is organized as follows. The required definitions for the volatilities and forward variance
are given in the next section. The various multicomponent ARCH processes are introduced in
Section 3, and the induced volatility forecasts and processes given in Section 4 and 5. The market
models and the associated volatility dynamics are presented in Section 6. The relationship between
market models, options and the ARCH forecasts are discussed in Section 7. Section 8 presents an
empirical investigation of the relationship between the forecasted, implied and realized volatilities.
Section 9 concludes.

1 There exists already an abundant literature on this topic, and (Poon 2005) published a book nicely summarizing the
available publications (approximately 100 articles on volatility forecasting alone!).

2 Definitions and setup of the problem

2.1 General

We assume we are at time t, with the corresponding information set Ω(t). The time increment for the
processes and the granularity of the data is denoted by δt, and is one day in the present work. We
assume that there exists an instantaneous volatility, denoted by σeff (t), which corresponds to the
annualized expected standard deviation of the price in the next time step δt. This is a useful quantity
for the definitions, but this volatility is essentially unobserved. In a process, σeff gives the magnitude
of the returns, as in (9) below.

2.2 Realized volatility

The realized volatility corresponds to the annualized standard deviation of the returns in the interval
between t and t + ∆T
\[
\sigma^2(t, t+\Delta T) = \frac{1\,\text{year}}{n\,\delta t} \sum_{t < t' \le t+\Delta T} r^2(t') \tag{1}
\]

where r(t) are the (unannualized) returns measured over the time interval δt, and the ratio 1 year/δt
annualizes the volatility. The empirical section is done with daily data and the returns are evaluated
over a one day interval. If the returns do not overlap in the sum, then ∆T = n δt. At the time t, the
realized volatility cannot be evaluated from the information set Ω(t). The realized volatility is the
useful quantity we would like to forecast and to relate to the implied volatility.
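As a minimal numerical illustration of (1), the sketch below computes the annualized realized volatility from a window of daily returns; the 252 business days per year and the synthetic returns are assumptions made only for this example.

```python
import numpy as np

def realized_volatility(returns, days_per_year=252):
    """Annualized realized volatility of eq (1) from unannualized daily returns.

    returns : the daily returns r(t') falling in the window (t, t + Delta T].
    With daily data, the annualization ratio (1 year)/(n * dt) becomes days_per_year / n.
    """
    r = np.asarray(returns, dtype=float)
    n = len(r)
    variance = days_per_year / n * np.sum(r ** 2)   # sigma^2(t, t + Delta T)
    return np.sqrt(variance)

# Example: one month (21 days) of returns with a 1% daily standard deviation
# gives an annualized volatility of roughly 16%.
rng = np.random.default_rng(0)
print(realized_volatility(0.01 * rng.standard_normal(21)))
```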

2.3 Forward variance

The expected cumulative variance is defined by


\[
V(t, t+\Delta T) = \int_t^{t+\Delta T} dt' \; E\!\left[ \sigma^2_{\text{eff}}(t') \mid \Omega(t) \right] \tag{2}
\]

and the forward variance by

\[
v(t, t+\Delta T) = \frac{\partial V(t, t+\Delta T)}{\partial \Delta T} = E\!\left[ \sigma^2_{\text{eff}}(t+\Delta T) \mid \Omega(t) \right]. \tag{3}
\]
The cumulative variance is an extensive quantity as it is proportional to ∆T . For empirical
investigation, it is simpler to work with an intensive quantity as this removes a trivial dependency on
the time horizon. For this reason, the cumulative variance is used only in the theoretical part (hence
also the continuum definition with an integral), whereas the forecasted volatility is used in the
empirical part.

The variance enters into the variable leg of a variance swap, and as such, it is tradable. Related
tradable instruments are the volatility indexes like the VIX (but the relation is indirect as the index is
defined through implied volatility of a basket of options). Because volatility is tradable, the forward
variance should be a martingale
 
\[
E\!\left[ v(t', T) \mid \Omega(t) \right] = v(t, T). \tag{4}
\]

For the volatility, this condition is quite weak as it follows also from the chain rule for conditional
expectation
     
\[
E\!\left[ E\!\left[ \sigma^2_{\text{eff}}(T) \mid \Omega(t') \right] \mid \Omega(t) \right]
= E\!\left[ \sigma^2_{\text{eff}}(T) \mid \Omega(t) \right] \qquad \text{for } t < t' < T \tag{5}
\]

and from the definition of the forward variance as a conditional expectation. Therefore, any forecast
built as a conditional expectation produces a martingale for the forward variance.

At this level, there is a formal analogy with interest rates, with the (zero coupon) interest rate and
forward rate being analogous to the cumulative variance and forward variance. Therefore, some ideas
and equations can be borrowed from the interest rate field. For example, on the modeling side, one
can write a process for the cumulative variance or for the variance swap, the latter being more
convenient as the martingale condition gives simpler constraints on the possible equations. In this
paper, the ARCH path is followed using a multiscale process for the underlying. The forward variance
is computed as an expectation, and therefore the martingale property follows. In Section 6, this
ARCH approach is contrasted with a direct model for the forward volatility, where the martingale
condition has to be explicitly enforced.

2.4 The forecasted volatility

The forecasted volatility is defined by

\[
\widetilde{\sigma}^2(t, t+\Delta T) = \frac{1}{n} \sum_{t < t' \le t+\Delta T} E\!\left[ \sigma^2_{\text{eff}}(t') \mid \Omega(t) \right]. \tag{6}
\]

Up to a normalization and the transformation of the integral into a discrete sum, this definition is
similar to the expected cumulative variance.

2.5 The implied volatility

As usual, the implied volatility is defined as the volatility to insert into the Black-Scholes equation so
as to recover the market price for the option. The implied volatility σBS (m, ∆T ) is a function of the
moneyness m and of the time to maturity ∆T. The moneyness can be defined in various ways, with
most definitions similar to m ≃ ln(F/K), where F = S e^{r∆T} is the forward. The (forward)
at-the-money option corresponds to m = 0. The backbone is the implied volatility at the money
σBS (∆T ) = σBS (m = 0, ∆T ), as a function of the time to maturity ∆T . For a given time to maturity
∆T , the implied volatility as function of moneyness is called the smile.

Intuitively, the implied volatility surface can loosely be decomposed into backbone × smile. The
rationale for this decomposition is that the two directions depend on different option features. The
backbone is related to the expected volatility until the option expiry
\[
\widetilde{\sigma}(t, t+\Delta T) = \sigma_{\text{BS}}(m=0, \Delta T)(t) \tag{7}
\]
In the Black-Scholes formula, the volatility appears only through the combination ∆T σ²,
corresponding to the cumulative expected variance. In the other direction, the smile is the fudge factor
to remedy the incomplete modeling of the underlying by a Gaussian random walk. The Black-Scholes
model has the key advantage of being solvable, but does not include many stylized facts like
heteroskedasticity, fat tails, or the leverage effect. These shortcomings translate into various “features” of
the smile.

In principle, (7) should be checked using empirical data. Yet this comparison raises a number of
issues, on both sides of the equation. On the left-hand side, the variance forecast should be computed
using some equations and the time series for the underlying. The forecasting scheme, with its
estimated parameters, is subject to errors. On the right-hand side, the option market has its own
idiosyncrasies, for example related to demand and supply. Such effects can be clearly observed by
computing the implied volatility corresponding to the option bid or ask prices. These points are
discussed in more detail in Section 8. Therefore, (7) should be taken only as a first order approximation.

3 Multicomponent ARCH processes

3.1 The general setup

The basic idea of a multicomponent ARCH process is to measure historical volatilities using
exponential moving average on a set of time horizons, and to compute the effective volatility for the
next time step as a convex combination of the historical volatilities. A first process along these lines
was introduced in (Dacorogna, Müller, Olsen, and Pictet 1998), and this family of processes was
thoroughly developed and explored in (Zumbach and Lynch 2001; Lynch and Zumbach 2003;
Zumbach 2004). A particular simple process with long memory is used to build the RM2006 risk
methodology (Zumbach 2006). One of the key advantages of these multicomponent processes is that
forecasts for the variance can be computed analytically. We will use this property to explore their
relations with the option implied volatility.

In order to build the process, the historical volatilities are measured by exponential moving averages
(EMA) at time scales τk

\[
\sigma^2_k(t) = \mu_k\, \sigma^2_k(t-\delta t) + (1-\mu_k)\, r^2(t), \qquad k = 1, \ldots, n \tag{8}
\]

and with decay coefficients µk = exp(−δt/τk). The process time increment, δt, is one day in this
work. Let us emphasize that the σk are computed from historical data, and there are no hidden
stochastic processes as in a stochastic volatility model.

The “effective” variance σ2eff is a convex combination of the σ2k and of the mean variance σ2∞

\[
\sigma^2_{\text{eff}}(t) = \sum_{k=1}^{n} w_k\, \sigma^2_k(t) + w_\infty\, \sigma^2_\infty
= \sigma^2_\infty + \sum_{k=1}^{n} w_k \left( \sigma^2_k(t) - \sigma^2_\infty \right),
\qquad
1 = \sum_{k=1}^{n} w_k + w_\infty.
\]

Finally, the price follows a random walk with volatility σeff

\[
r(t+\delta t) = \sigma_{\text{eff}}(t)\, \varepsilon(t+\delta t). \tag{9}
\]

Depending on the number of components n, the time horizons τk and weights wk , a number of
interesting processes can be built. The processes we are using to compare with implied volatility are
given in the next subsections.
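A minimal simulation sketch of the recursion (8)–(9) may help fix ideas; the Gaussian innovations, the initial variance level and the function name are assumptions of this illustration, not part of the original methodology.

```python
import numpy as np

def simulate_multicomponent_arch(n_steps, tau, w, w_inf=0.0, sigma2_inf=0.0,
                                 sigma2_init=1e-4, dt=1.0, seed=0):
    """Simulate returns from a multicomponent ARCH process, eqs (8)-(9).

    tau, w : characteristic times (in days) and convex weights w_k.
    w_inf, sigma2_inf : weight and level of the mean variance (w_inf = 0 for a linear process).
    sigma2_init : initial value for the component variances (daily, unannualized).
    """
    rng = np.random.default_rng(seed)
    mu = np.exp(-dt / np.asarray(tau, dtype=float))     # decay coefficients mu_k = exp(-dt/tau_k)
    w = np.asarray(w, dtype=float)
    sigma2_k = np.full(len(mu), sigma2_init)            # historical component variances sigma^2_k
    returns = np.empty(n_steps)
    for t in range(n_steps):
        sigma2_eff = w @ sigma2_k + w_inf * sigma2_inf            # convex combination (effective variance)
        returns[t] = np.sqrt(sigma2_eff) * rng.standard_normal()  # eq (9), Gaussian innovation assumed
        sigma2_k = mu * sigma2_k + (1 - mu) * returns[t] ** 2     # EMA update, eq (8)
    return returns

# Example: I-GARCH(2) parameter set 2 of Section 3.3 (tau = 16 and 512 business days).
r = simulate_multicomponent_arch(1000, tau=[16.0, 512.0], w=[0.804, 0.196])
```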

On general grounds, we make the distinction between affine processes, for which the mean volatility is
fixed by σ∞ and w∞ > 0, and linear processes, for which w∞ = 0. The linear and affine terms qualify
the equations for the variance. The linear processes are very interesting for forecasting volatility as
they have no mean volatility parameter σ∞, which clearly would be time series dependent. However,
their asymptotic properties are singular, and affine processes should be used in Monte Carlo
simulations. This subtle difference between the two classes of processes is discussed in detail in
(Zumbach 2004). As this paper deals with volatility forecasts, only the linear processes are used.

3.2 I-GARCH(1)

The I-GARCH(1) model corresponds to a 1-component linear process

\[
\sigma^2(t) = \mu\, \sigma^2(t-\delta t) + (1-\mu)\, r^2(t), \qquad
\sigma^2_{\text{eff}}(t) = \sigma^2(t).
\]

It has one parameter τ (or equivalently µ). This process is equivalent to the integrated GARCH(1,1)
process (Engle and Bollerslev 1986), and with a given value for µ it is equivalent to the standard
RiskMetrics methodology (RM1994). Its advantage is its simplicity, but it does not capture
mean reversion in the forecast (that is, that long term forecasts should converge to the mean
volatility).

For the empirical evaluation, the characteristic time has been fixed a priori to τ = 16 business days,
corresponding to µ ' 0.94.
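As a quick check, the quoted decay coefficient follows from µ = exp(−δt/τ) with δt = 1 business day:

```python
import math

tau = 16.0                 # characteristic time in business days
mu = math.exp(-1.0 / tau)  # decay coefficient for dt = one business day
print(round(mu, 4))        # 0.9394, i.e. mu of about 0.94
```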

3.3 I-GARCH(2) and GARCH(1,1)

The I-GARCH(2) process corresponds to a two-component linear model

\[
\begin{aligned}
\sigma^2_1(t) &= \mu_1\, \sigma^2_1(t-\delta t) + (1-\mu_1)\, r^2(t), \\
\sigma^2_2(t) &= \mu_2\, \sigma^2_2(t-\delta t) + (1-\mu_2)\, r^2(t), \\
\sigma^2_{\text{eff}}(t) &= w_1\, \sigma^2_1(t) + w_2\, \sigma^2_2(t).
\end{aligned}
\]

It has three parameters τ1 , τ2 and w1 . Even if this process is linear, it has mean reversion for time
scales up to τ2 , with σ2 playing the role of the mean volatility.

The GARCH(1,1) process (Engle and Bollerslev 1986) corresponds to the one-component affine
model

\[
\begin{aligned}
\sigma^2_1(t) &= \mu_1\, \sigma^2_1(t-\delta t) + (1-\mu_1)\, r^2(t), \\
\sigma^2_{\text{eff}}(t) &= (1-w_\infty)\, \sigma^2_1(t) + w_\infty\, \sigma^2_\infty.
\end{aligned}
\]

It has three parameters τ1 , w∞ and σ∞ . In this form, the analogy between the I-GARCH(2) and
GARCH(1,1) processes is clear, with the long term volatility σ2 playing a similar role as the mean
volatility σ∞ .

Given a process, the parameters need to be estimated on a time series. GARCH(1,1) is more
problematic in that respect because σ∞ is clearly time series dependent. A good procedure is to
estimate the parameters on a moving historical sample, say in a window between t − ∆T′ and t for a
fixed span ∆T′. With this setup, the mean variance σ²∞ is essentially the sample variance ∑ r²
computed on the estimation window. This is a rectangular moving average, similar to an EMA but for
the weights given to the past. This argument shows that I-GARCH(2) and (a continuously
re-estimated on a moving window) GARCH(1,1) behave similarly. A detailed analysis of both
processes in (Zumbach 2004) shows that they have similar forecasting power, with an advantage to
I-GARCH(2).

In this work, we use the I-GARCH(2) process with two parameter sets fixed a priori to some
reasonable values. The first set is τ1 = 4 business days, τ2 = 512 business days, w1 = 0.843 and w2 =
0.157. The second set is τ1 = 16 business days, τ2 = 512 business days, w1 = 0.804 and w2 = 0.196.
The values for the weights are obtained according to the long memory ARCH process, but with only
two given τ components.

3.4 Long Memory ARCH

The idea of a long memory process is to use a multicomponent ARCH model with a large number of
components but a simple analytical form for the characteristic times τk and the weights wk. For the long
memory ARCH process, the characteristic times τk increase as a geometric series

\[
\tau_k = \tau_1\, \rho^{\,k-1}, \qquad k = 1, \ldots, n, \tag{10}
\]

while the weights decay logarithmically

\[
w_k = \frac{1}{C}\left( 1 - \ln(\tau_k)/\ln(\tau_0) \right), \qquad
C = \sum_k \left( 1 - \ln(\tau_k)/\ln(\tau_0) \right). \tag{11}
\]

This choice produces lagged correlations for the volatility that decay logarithmically, as observed in
the empirical data (Zumbach 2006). The parameters are taken as for the RM2006 methodology,
namely τ1 = 4 business days, τn = 512 business days, ρ = √2 and the logarithmic decay factor
τ0 = 1560 days = 6 years.
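A short sketch evaluating (10) and (11) with these parameters is given below; the vectorized form is an implementation choice, and the number of components is inferred from τ1, τn and ρ. Restricted to the two time scales 16 and 512 days, the same weight formula reproduces the I-GARCH(2) parameter set 2 weights quoted in Section 3.3.

```python
import numpy as np

def lm_arch_weights(tau1=4.0, tau_n=512.0, rho=np.sqrt(2.0), tau0=1560.0):
    """Characteristic times (10) and logarithmically decaying weights (11) of the LM-ARCH process."""
    n = int(round(np.log(tau_n / tau1) / np.log(rho))) + 1   # number of components
    tau = tau1 * rho ** np.arange(n)                         # geometric series tau_k, eq (10)
    w = 1.0 - np.log(tau) / np.log(tau0)                     # un-normalized weights, eq (11)
    return tau, w / w.sum()                                  # normalization by C

tau, w = lm_arch_weights()
print(len(tau))            # 15 components between 4 and 512 business days

# Two-component special case: recovers the I-GARCH(2) weights of Section 3.3 (about 0.804 and 0.196).
w2 = 1.0 - np.log(np.array([16.0, 512.0])) / np.log(1560.0)
print(w2 / w2.sum())
```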

Figure 1
Weights wk(∆T) as a function of the forecast horizon ∆T
Long memory process with w∞ = 0.1 and τk = 2, 4, 8, 16, ..., 256 days. Weight profiles for increasing
characteristic times τk have decreasing initial values and maximum values going from left to right.
[Plot omitted: weights wk(∆T) between 0 and 0.2 versus the forecast horizon ∆T, from 1 to 1000 days on a logarithmic axis.]

4 Forward variance and multicomponent ARCH processes

For multiscale ARCH processes (I-GARCH, GARCH(1,1), long-memory ARCH, etc.), the forward
variance can be computed analytically (Zumbach 2004; Zumbach 2006). The idea is to compute the
conditional expectation of the process equations, from which iterative relations can be deduced. Then,
some algebra and matrix computations produce the following form for the forward variance
\[
v(t, t+\Delta T) = E\!\left[ \sigma^2_{\text{eff}}(t+\Delta T) \mid \Omega(t) \right]
= \sigma^2_\infty + \sum_{k=1}^{n} w_k(\Delta T) \left( \sigma^2_k(t) - \sigma^2_\infty \right). \tag{12}
\]

The weights wk(∆T) can be computed by a recursion formula depending on the decay coefficients µk,
with initial values given by wk = wk(1). The equation for the forecast of the realized volatility has
the same form, but the weights wk(∆T) are different.

Let us emphasize that this can be done for all processes in this class (linear and affine). Moreover, the
σ2k (t) are computed from the underlying time series, namely there is no hidden stochastic volatility to
estimate. This makes volatility forecasts particularly easy in this framework.

For a multicomponent ARCH process, the intuition for the forecast can be understood from a graph of
the weights wk ( ∆T ) as function of the forecast horizon ∆T as given in Figure 1. For short forecast
horizons, the volatilities with the shorter time horizons dominate. As the forecast horizon gets larger,
the weights of the short term volatilities decay while the weights of the longer time horizons increase.

Figure 2
Sum of the weights ∑k wk(∆T) = 1 − w∞
Same parameters as in Figure 1.
[Plot omitted: sum of the weights between 0 and 0.8 versus the forecast horizon ∆T, from 1 to 1000 days on a logarithmic axis.]

The weight for a particular horizon τk peaks at a forecast horizon similar to τk , for example the
Burgundy curve corresponds to τ = 32 days, and its maximum is around this value. Figure 2 shows
the sum of the volatility coefficients ∑k wk = 1 − w∞ . This shows the increasing weight of the mean
volatility as the forecast horizon gets longer. Notice that this behavior corresponds to our general
intuition about forecasts: short term forecasts depend mainly on the recent past while long term
forecasts need to use more information from the distant past. The nice feature of the multicomponent
ARCH process is that the forecast weights are derived from the process equations, and that they have
a similar content to the process equations (linear or affine, one or multiple time scales).

5 The induced volatility process

The multicomponent ARCH processes are stochastic processes for the return, in which the volatilities
are convenient intermediate quantities. It is important to realize that the volatilities σk and σeff are
useful and intuitive in formulating a model, but they can be completely eliminated from the equations.
An important advantage of this class of process is that the forward variance v(t, t + ∆T ) can be
computed analytically. Going in the opposite direction, we want to eliminate the return, namely to
derive the equivalent process equations for the dynamic of the forward variance induced by a
multicomponent ARCH process. This will allow us to make contact with some models for the
forward variance that are available in the literature and presented in the next section.

Equation (8) for σk can be rewritten as

\[
\begin{aligned}
d\sigma^2_k(t) &= \sigma^2_k(t) - \sigma^2_k(t-\delta t) \\
 &= (1-\mu_k)\left( -\sigma^2_k(t-\delta t) + \varepsilon^2(t)\, \sigma^2_{\text{eff}}(t-\delta t) \right) \\
 &= (1-\mu_k)\left( \sigma^2_{\text{eff}}(t-\delta t) - \sigma^2_k(t-\delta t) + (\varepsilon^2(t) - 1)\, \sigma^2_{\text{eff}}(t-\delta t) \right).
\end{aligned} \tag{13}
\]

The equation can be simplified by introducing the annualized variances vk = (1 year/δt) σ²k,
veff = (1 year/δt) σ²eff and a new random variable χ with
\[
\chi = \varepsilon^2 - 1 \quad \text{such that} \quad E[\chi(t)] = 0, \qquad \chi(t) > -1. \tag{14}
\]

Assuming that the time increment δt is small compared to the time scales τk in the model, the
following approximation can be used
\[
1 - \mu_k = \frac{\delta t}{\tau_k} + O(\delta t^2). \tag{15}
\]
In the present derivation, this expansion is used only to make contact with the usual form for
processes; no terms of higher order are neglected. Exact expressions are obtained by replacing
δt/τk by 1 − µk in the equations below.

These notations and approximations allow the equivalent equations


\[
dv_k = \frac{\delta t}{\tau_k} \left\{ v_{\text{eff}} - v_k + \chi\, v_{\text{eff}} \right\}, \tag{16a}
\]
\[
v_{\text{eff}} = \sum_k w_k\, v_k + w_\infty\, v_\infty. \tag{16b}
\]

The process for the forward variance is given by

\[
dv_{\Delta T} = \sum_k w_k(\Delta T)\, dv_k \tag{17}
\]
with dv∆T(t) = v(t, t+∆T) − v(t−δt, t−δt+∆T).

The content of (16a) is the following. The term (δt/τk){veff − vk} gives a mean reversion toward the
current effective volatility veff at a time scale τk. This structure is fairly standard, except for veff, which
is given by a convex combination of all the variances vk. The random term, however, is unusual. All the
variances share the same random factor (δt/τk) χ, which has a standard deviation of order δt instead of
the usual √δt appearing in Gaussian models.

An interesting property of this equation is to enforce positivity for vk through a somewhat unusual
mechanism. Equation (16a) can be rewritten as
\[
dv_k = \frac{\delta t}{\tau_k} \left\{ -v_k + (\chi + 1)\, v_{\text{eff}} \right\} \tag{18}
\]

Because χ ≥ −1, the term (χ + 1)veff is never negative, and as δt vk(t − δt)/τk is smaller than vk(t − δt),
this implies that vk(t) is always positive (even for a finite δt). Another difference with the usual
random process is that the distribution for χ is not Gaussian. In particular, if ε has a fat-tailed
distribution—as seems required in order to have a data generating process that reproduces the
properties of the empirical time series—the distribution for χ also has fat tails.
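A few lines suffice to simulate the induced dynamics, using the exact form of (18) with 1 − µk in place of δt/τk; Gaussian innovations ε are assumed here, although, as noted above, fat-tailed innovations are more realistic.

```python
import numpy as np

def simulate_induced_variance(v_k, tau, w, n_steps, w_inf=0.0, v_inf=0.0, dt=1.0, seed=0):
    """Simulate the variance dynamics induced by the ARCH structure, eqs (16) and (18).

    All components share the single random factor chi = eps^2 - 1 > -1, so the update
    dv_k = (1 - mu_k) * (-v_k + (chi + 1) * v_eff) keeps every v_k positive.
    """
    rng = np.random.default_rng(seed)
    one_minus_mu = 1.0 - np.exp(-dt / np.asarray(tau, dtype=float))
    v_k = np.asarray(v_k, dtype=float).copy()
    w = np.asarray(w, dtype=float)
    path = np.empty((n_steps, len(v_k)))
    for t in range(n_steps):
        v_eff = w @ v_k + w_inf * v_inf
        chi = rng.standard_normal() ** 2 - 1.0                   # E[chi] = 0 and chi > -1
        v_k = v_k + one_minus_mu * (-v_k + (chi + 1.0) * v_eff)  # eq (18)
        path[t] = v_k
    return path
```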

The continuum limit of the GARCH(1,1) process was already investigated by (Nelson 1990). In this
limit, GARCH(1,1) is equivalent to a stochastic volatility process where the variance has its own
source of randomness. Yet Nelson constructed a different limit as above because he fixes the GARCH
parameters α0 , α1 and β1 . The decay coefficient is given by α1 + β1 = µ and is therefore fixed. With
µ = exp(−δt/τ), fixing µ and taking the limit δt → 0 is equivalent to τ → 0. Because the
characteristic time τ of the EMA goes to zero, the volatility process becomes independent of the
return process, and the model converges toward a stochastic volatility model. A more interesting limit
is to take τ fixed and δt → 0, as in the computation above. Notice that the computation is done with a
finite time increment δt; the existence of a proper continuum limit δt → 0 for a process defined by
(16) and (17) is likely not a simple question.

Let us emphasize that the derivation of the volatility process as induced by the ARCH structure
involves only elementary algebra. Essentially, if the price follows an ARCH process (one or multiple
time scales, with or without mean σ∞ ), then the volatility follows a process according to (16). The
structure of this process involves a random term of order δt and therefore it cannot be reduced to a
Wiener or Lévy process. This is a key difference from the processes used in finance that were
developed to capture the price diffusion.

The implications of (16) are important as they show a key difference between ARCH and stochastic
volatility processes. This clearly has implications for option pricing, but also for risk evaluation. In a
risk context, the implied volatility is a risk factor for any portfolio that contains options, and it is
likely better to model the dynamic of the implied volatility by a process with a similar structure.

6 Market model for the variance

In the literature, the models for the implied volatility are dominated by stochastic volatility processes,
essentially assuming that the implied volatility “has its own life”, independent of the underlying. In
this vast literature, a recent direction is to write processes directly for the forward variance. Recent
work in this direction includes (Buehler 2006; Bergomi 2005; Gatheral 2007). Following this direction,
we present here simple linear processes for the forward variance, and discuss the relation with
multicomponent ARCH in the next section.

The general idea is to write a model for the forward variance

v(t,t + ∆T ) = G(vk (t); ∆T ), (19)

where G is a given function of the (hidden) random factors vk . In principle, the random factors can
appear everywhere in the equation, say for example as a random characteristic time like τk. Yet
Buehler has shown that strong constraints exist on the possible random factors, for example
forbidding random characteristic times. In this paper, only linear models will be discussed, and
therefore the random factors appear as variances vk.

The dynamics of the random factors vk are given by the processes

\[
dv_k = \mu_k(v)\, dt + \sum_{\alpha=1}^{d} \sigma^\alpha_k(v)\, dW^\alpha, \qquad k = 1, \ldots, n. \tag{20}
\]

The processes have d sources of randomness dW α , and the volatility σαk (v) can be any function of the
factors.

As such, the model is essentially unconstrained, but the martingale condition (4) for the forward
variance still has to be enforced. Through standard Itô calculus, the variance curve model together
with the martingale condition leads to a constraint between G(v; ∆T), µ(v) and σ(v)

\[
\partial_{\Delta T} G(v; \Delta T) = \sum_{i=1}^{n} \mu_i\, \partial_{v_i} G(v; \Delta T)
+ \sum_{i,j=1}^{n} \sum_{\alpha=1}^{d} \sigma^\alpha_i \sigma^\alpha_j\, \partial^2_{v_i, v_j} G(v; \Delta T) \tag{21}
\]

A given function G is said to be compatible with a dynamic for the factors if this condition is valid.
The compatibility constraint is fairly weak, and many processes can be written for the forward
variance that are martingales. As already mentioned, we consider only functions G that are linear in
the risk factors. Therefore ∂²_{vi,vj} G = 0, leading to first order differential equations that can be solved
by elementary techniques. For this class of models, the condition does not involve the volatility σ^α_k(v)
of the factors, which therefore can be chosen freely.

6.1 Example: one-factor market model

The forward variance is parameterized by

\[
G(v_1; \Delta T) = v_\infty + w_1\, e^{-\Delta T/\tau_1} (v_1 - v_\infty) \tag{22}
\]



which is compatible with the stochastic volatility dynamic


\[
dv_1 = -(v_1 - v_\infty)\, \frac{dt}{\tau_1} + \gamma\, v_1^{\beta}\, dW \qquad \text{for } \beta \in [1/2, 1]. \tag{23}
\]
The parameter w1 can be chosen freely, and for identification purposes the choice w1 = 1 is often
made. Because G is linear in v1 , there is no constraint on β. The value β = 1/2 corresponds to the
Heston model, β = 1 to the lognormal model. This model is somewhat similar to the GARCH
process, with one characteristic time τ1 , a mean volatility v∞ , and the volatility of the volatility
(vol-of-vol) γ. This model is not rich enough to describe the empirical forward variance dynamic,
which involves multiple time scales.
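As a check of the compatibility condition (21) for this example, the second-derivative term vanishes since G is linear in v1, and a direct computation gives
\[
\partial_{\Delta T} G = -\frac{w_1}{\tau_1}\, e^{-\Delta T/\tau_1} (v_1 - v_\infty)
= \mu_1(v)\, \partial_{v_1} G,
\qquad
\mu_1(v) = -\frac{v_1 - v_\infty}{\tau_1}, \quad
\partial_{v_1} G = w_1\, e^{-\Delta T/\tau_1},
\]
so the drift of (23) is exactly the one required by the martingale condition, while the volatility term γ v1^β is left unconstrained.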

6.2 Example: two-factor market model

The linear model with two factors

\[
G(v; \Delta T) = v_\infty + w_1\, e^{-\Delta T/\tau_1} (v_1 - v_\infty)
+ \frac{1}{1 - \tau_1/\tau_2} \left( -w_1\, e^{-\Delta T/\tau_1} + (w_1 + w_2)\, e^{-\Delta T/\tau_2} \right) (v_2 - v_\infty) \tag{24}
\]
\[
= v_\infty + w_1(\Delta T)\, (v_1 - v_\infty) + w_2(\Delta T)\, (v_2 - v_\infty) \tag{25}
\]

is compatible with the dynamic


\[
dv_1 = -(v_1 - v_2)\, dt/\tau_1 + \gamma\, v_1^{\beta}\, dW_1, \qquad
dv_2 = -(v_2 - v_\infty)\, dt/\tau_2 + \gamma\, v_2^{\beta}\, dW_2. \tag{26}
\]

The parameters w1 and w2 can be chosen freely, and for identification purposes the choice w1 = 1 and
w2 = 0 is often made. Notice the similarity of (25) with the Svensson parameterization for the yield
curve.

The linear model can be solved explicitly for n components, but the ∆T dependency in the coefficients
wk (∆T ) becomes increasingly complex. It is therefore not natural in this approach to create the
equivalent of a long-memory model with multiple time scales.

7 Market models and options

Assuming a liquid option market, the implied volatility surface can be extracted, and from its
backbone, the forward variance v(t,t + ∆T ) is computed. At a given time t, given a market model
G(vk (t); ∆T ), the risk factors vk (t) are estimated by fitting the function G(∆T ) on the forward
variance curve. It is therefore important for the function G(∆T ) to have enough possible shapes to
accommodate the various forward variance curves. This estimation procedure for the risk factors
gives the initial condition vk (t). Then, the postulated dynamics for the risk factors induce a dynamic
for G, and hence of the forward variance.

Notice that in this approach, there is no relation with the underlying and its dynamic. For this reason,
the possible processes are weakly constrained, and the parameters need to be estimated independently
(say for example the characteristic times τk ). Another drawback of this approach is to rely on the
empirical forward variance curve, and therefore a liquid option market is a prerequisite.

Our choice of notations makes clear the formal analogy of the market model with the forecasts
produced by a multicomponent ARCH process. Except for the detailed shapes of the functions
wk (∆T ), the equations (12) and (25) have the same structure. They are however quite different in
spirit: the vk are computed from the underlying time series in the ARCH approach, whereas in a
market model approach, the vk are estimated from the forward variance curve obtained from the
option market. In other words, ARCH leads to a genuine forecast based on the underlying, whereas
the market model provides for a constrained fit of the empirical forward curve. Beyond this formal
analogy, the dynamics of the risk factors are quite different, as the ARCH approach leads to the
unusual (16a) whereas market models use the familiar generic Gaussian process in (20).

8 Comparison of the empirical implied, forecasted and realized volatilities

As explained in Section 4, a multicomponent ARCH process provides us with a forecast for the
realized volatility, and the forecast is directly related to the underlying process and its properties. At a
given time t, there are three volatilities (implied, forecasted and realized) for each forecast horizon
∆T . Essentially, the implied and forecasted volatilities are two forecasts for the realized volatility. In
this section, we investigate the relationship between these three volatilities and the forecast horizon
∆T . When analyzing the empirical statistics and comparing these three volatilities, several factors
should be kept in mind.

1. For short forecast horizons (∆T = 1 day and 5 days), the number of returns in ∆T is small and
therefore the realized volatility estimator (computed with daily data) has a large variance.

2. The forecastability decreases with increasing ∆T .



3. The forecast and implied volatilities are “computed” using the same information set, namely the
history up to t. This is different from the realized volatility, computed using the information in
the interval [t,t + ∆T ]. Therefore, we expect the distance between the forecast and implied to be
the smallest.

4. The implied volatility has some idiosyncrasies related to the option market, for example supply
and demand, or the liquidity of the underlying necessary to implement the replication strategy.
Similarly, an option bears volatility risk, and a related volatility risk premium can be expected.
These particular effects should bias the implied volatility upward.

5. From the raw options and underlying prices, the computations leading to the implied volatility
are complex, and therefore error prone. This includes dependencies on the original data
providers. An example is given by the time series for CAC 40 implied volatility, where during a
given period, the implied volatility above three months jumps randomly between a realistic
value and a much higher value. This is likely created by quotes for the one-year option that are
quite off the “correct” price (see Figure 3). Yet this data quality problem is inherent to the
original data provider and the option market, and reflects the difficulty in computing clean and
reliable implied volatility surfaces.

6. The options are traded for fixed maturity dates, whereas the convenient volatility surface is given
for constant time to maturity. Therefore, some interpolation and extrapolation needs to be done.
In particular, the short times to maturity (one day, five days) most of the time require an
extrapolation, as the options are traded at best with one expiry for each month. This is clearly a
difficult and error prone procedure.

7. The ARCH-based forecasts are dependent on the choice of the process and the associated
parameters.

8. As the forecast horizon increases, the dynamic of the volatility gets slower and the actual
number of independent volatility points decreases (as 1/∆T). Therefore, the statistical
uncertainty of the computed statistics increases with ∆T.

Because of the above points, each volatility has some peculiarities, and therefore we do not have a
firm anchor point on which to base our comparison. Given that we are on floating ground, our goals are fairly
modest. Essentially, we want to show that processes with one or two time scales are not good enough,
and that the long-memory process provides a very good forecast, with an accuracy comparable to
the implied volatility. The processes used in the analysis are I-GARCH(1), I-GARCH(2) with two sets
of parameters, and LM-ARCH.
Figure 3
Volatility time series for USD/EUR (top) and CAC 40 (bottom), six month forecast horizon
[Plots omitted: each panel shows the realized, implied, I-GARCH(1), I-GARCH(2) and LM-ARCH volatilities in %, from 2003 to 2006; the USD/EUR panel ranges over roughly 8–18% and the CAC 40 panel over roughly 0–60%.]

The equations for the processes are given in Section 3, along with the values for the parameters.

The best way to visualize the dynamic of the three volatilities is to use a movie of the σ[∆T ] time
evolution. On a movie, the properties of the various volatilities, their dynamics and relationships are
very clear. Unfortunately, the present analog paper does not allow for such a medium, and we have
to rely on conventional statistics to present their properties.

The statistics are computed for two time series: the USD/EUR foreign exchange rate and the CAC 40
stock index. The ATM implied volatility data originate from JP Morgan Chase for USD/EUR and
Egartech for the CAC 40 index; the underlying prices originate from Reuters. The time series for the
volatilities are shown on Figure 3 for a six-month forecast horizon. The time series are fairly short
(about six years for USD/EUR and four years for CAC 40). This clearly makes statistical inferences
difficult, as the effective sample size is fairly small. On the USD/EUR panel, the lagging behavior of
the forecast and implied volatility with respect to the realized volatility is clearly observed. For the
CAC 40, the data sample contains an abrupt drop in the realized volatility at the beginning of 2003.
This pattern was difficult to capture for the models with long term mean reversion. In 2005 and early
2006, the implied volatility data are also not reliable: first there are two “competing” streams of
implied volatility at ∼12% and ∼18%, before a period at the end of 2005 where there is likely no
update in the data stream. This shows the difficulty of obtaining reliable implied volatility data, even
from a major data supplier.

For the statistics, in order to ease the comparison between the graphs, all the horizontal and vertical
scales are identical, the color is fixed for a given forecast model, and the line type is fixed for a given
volatility pair. The graphs are presented for the mean absolute error (MAE)
\[
\mathrm{MAE}(x, y) = \frac{1}{n} \sum_t \left| x(t) - y(t) \right|, \tag{27}
\]

where n is the number of terms in the sum. Other measures of distance like root mean square error
give very similar figures. The volatility forecast depends on the ARCH process. The parameters for
the processes have been selected a priori to some reasonable values, and no optimization was done.
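For completeness, a direct implementation of the distance (27), together with the root mean square alternative mentioned above, could look as follows (the two series are assumed to be aligned on the same dates):

```python
import numpy as np

def mae(x, y):
    """Mean absolute error of eq (27) between two aligned volatility series (in % annualized)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.mean(np.abs(x - y)))

def rmse(x, y):
    """Root mean square error, the alternative distance mentioned in the text."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.sqrt(np.mean((x - y) ** 2)))
```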

The overall relationship between the three volatilities can be understood from Figure 4. The pair of
volatilities with the closest relationship is the implied and forecasted volatilities, because they are
built upon the same information set. The distance with the realized volatility is larger, with similar
values for implied-realized and forecast-realized. This shows that it is quite difficult to assert which
one of the implied and forecasted volatilities provides a better forecast of the realized volatility. All
the distances have a global U-shape as a function of ∆T. This originates in points 1 and 2
above, and leads to a minimum around one month for the measure of distances. The distance is larger
for shorter ∆T because of the bad estimator for the realized volatility, and larger for longer ∆T
because of the decreasing forecastability.

Figure 5 shows the distances for given volatility pairs, depending on the process used to build the
forecast. The forecast-implied distance shows clear differences between processes (left panels). The
I-GARCH(1) process lacks mean reversion, an important feature of the volatility dynamic. The
I-GARCH(2) process with parameter set 1 is handicapped by the short characteristic time for the first
EMA (4 days); this leads to a noisy volatility estimator and subsequently to a noisy forecast. The
same process with a longer characteristic time for the first EMA (16 days, parameter set 2) shows
much improved performance up to a time horizon comparable to the long EMA (260 days). Finally,
the LM-ARCH produces the best forecast. As the forecast becomes better (1 time scale → 2 time
scales → multiple time scales), the distance between the implied and forecasted volatilities decreases.

Figure 4
MAE distances between volatility pairs for EUR/USD, grouped by forecast method
The vertical axis gives the MAE for the annualized volatility in %, the horizontal axis the forecast time
interval ∆T in days.
[Plots omitted: four panels—I-GARCH(1), I-GARCH(2) parameter set 1, I-GARCH(2) parameter set 2 and LM-ARCH—each showing the fcst-impl, fcst-real and impl-real curves on a common 0–8% scale, with a logarithmic horizontal axis.]

Figure 5
MAE distances between volatility pairs, grouped by pairs
The vertical axis gives the MAE for the annualized volatility in %, the horizontal axis the forecast time
interval ∆T in days.
[Plots omitted: four panels—EUR/USD forecast-implied, EUR/USD forecast-realized, CAC 40 forecast-implied and CAC 40 forecast-realized—each showing the I-GARCH(1), I-GARCH(2) parameter set 1, I-GARCH(2) parameter set 2 and LM-ARCH curves on a common 0–8% scale, with a logarithmic horizontal axis.]

For EUR/USD, the mean volatility is around 10% (the precise value depending on the volatility and
time horizon), and the MAE is in the 1 to 2% range. This shows that in this time to maturity range, we
can build a good estimator of the ATM implied volatility based only on the underlying time series.

The forecast-realized distance is larger than the forecast-implied distance (right panels), with the long
memory process giving the smallest distance. The only exception is the I-GARCH(1) process applied
to the CAC 40 time series, due to the particular abrupt drop in the realized volatility in early 2003.
This shows the limit of our analysis due to the fairly small data sample. Clearly, to gain statistical
power requires longer time series for implied volatility, as well as a cross-sectional study over many
time series.

9 Conclusion

The ménage à trois between the forecasted, implied and realized volatilities is quite a complex affair,
where each participant has its own peculiarities. The salient outcome is that the forecasted and
implied volatilities have the closest relationship, while the realized volatility is more distant as it
incorporates a larger information set. This picture is dependent to some extent on the quality of the
volatility forecast: the multiscale dynamic of the long memory ARCH process is shown to capture
correctly the dynamic of the volatility, while the I-GARCH(1) and I-GARCH(2) processes are not rich
enough in their time scale structures. This conclusion falls in line with the RM2006 risk methodology,
where the same process is shown to capture correctly the lagged correlation for the volatility.

The connection with the market model for the forward variance shows the parallel structure of the
volatility forecasts provided by both approaches. However, their dynamics are very different
(postulated for the forward volatility market models, induced by the ARCH structure for the
multicomponent ARCH processes). Moreover, the volatility process induced by the ARCH equations
is of a different type from the usual price process, because the random term is of order δt instead of
the √δt used in diffusive equations. This emphasizes a fundamental difference between price and
volatility processes. A clear advantage of the ARCH approach is to deliver a forecast based only on
the properties of the underlying time series, with a minimal number of parameters that need to be
estimated (none in our case as all the parameters correspond to the values used in RM1994 and
RM2006). This point brings us to a nice and simple common framework to evaluate risks as well as the
implied volatilities of at-the-money options.

For the implied volatility surface, the problem is still not completely solved, as the volatility smile
needs to be described in order to capture the full implied volatility surface. Any multicomponent
ARCH process will capture some (symmetric) smile, due to the heteroskedasticity. Moreover, fat tail
innovations will make the smile stronger, as the process becomes increasingly distant from a Gaussian
random walk. Yet, adding an asymmetry in the smile, as observed for stocks and stock indexes,
requires enlarging the family of processes to capture asymmetry in the distribution of returns. This is
left for further work.

References
Bergomi, L. (2005). Smile dynamics II. Risk 18, 67–73.
Buehler, H. (2006). Consistent variance curve models. Finance and Stochastics 10, 178–203.
Dacorogna, M. M., U. A. Müller, R. B. Olsen, and O. V. Pictet (1998). Modelling short-term
volatility with GARCH and HARCH models. In C. Dunis and B. Zhou (Eds.), Nonlinear
Modelling of High Frequency Financial Time Series, pp. 161–176. John Wiley.
Engle, R. F. and T. Bollerslev (1986). Modelling the persistence of conditional variances.
Econometric Reviews 5, 1–50.
Gatheral, J. (2007). Developments in volatility derivatives pricing. Presentation at “Global
derivatives”, Paris, May 23.
Lynch, P. and G. Zumbach (2003, July). Market heterogeneities and the causal structure of
volatility. Quantitative Finance 3, 320–331.
Nelson, D. (1990). ARCH models as diffusion approximations. Journal of Econometrics 45, 7–38.
Poon, S.-H. (2005). Forecasting financial market volatility. Wiley Finance.
Zumbach, G. (2004). Volatility processes and volatility forecast with long memory. Quantitative
Finance 4, 70–86.
Zumbach, G. (2006). The RiskMetrics 2006 methodology. Technical report, RiskMetrics Group.
Available at www.riskmetrics.com.
Zumbach, G. and P. Lynch (2001, September). Heterogeneous volatility cascade in financial
markets. Physica A 298(3-4), 521–529.
Inflation Risk Across the Board
Fabien Couderc
RiskMetrics Group
fabien.couderc@riskmetrics.com

Inflation markets have evolved significantly in recent years. In addition to stronger issuance
programs of inflation-linked debt from governments, derivatives have developed, allowing a
broader set of market participants to start trading inflation as a new asset class. These changes call
for modifications of risk management and pricing models. While the real rate framework allowed
us to apply the familiar nominal bond techniques to linkers, it does not provide a view consistent
with inflation derivative markets, and limits our ability to report inflation risk. We thus introduce
in detail the concept of Break-Even Inflation and develop associated pricing models. We describe
various adjustments for taking into account indexation mechanisms and seasonality in realized
inflation. The adjusted break-even framework consolidates views across financial products and
geography. Inflation risk can now be explicitly defined and monitored as any other risk class.

1 Introduction

Even though inflation is an old topic for academics, interest from the financial community only began
recently. This is somewhat due to historical reasons. Inflation was and is a social and macroeconomic
matter, and has consequently been a concern for economists, politicians and policy makers, not for
financial market participants. High inflation (and of course deflation) being perceived as a bad signal
for the health of an economy, efforts concentrated on the understanding of the main inflation drivers
rather than on the risks inflation represents, especially for financial markets. One of the most famous
sentences of Milton Friedman, “Inflation is always and everywhere a monetary phenomenon,”
suggests that monetary fluctuations (thus, money markets) are meaningful indicators of inflation risks
we might face in financial markets. While this claim is supported by most economic theories, the
monetary explanation cannot be transposed in the same way on financial markets. Persistent shocks
on money markets effectively determine a large fraction of long-term inflation moves, but short and
mid-term fluctuations of inflation-linked assets bear another risk which has to be analyzed in its own
right.

Quantifying inflation risk on financial markets is today a major concern. The markets developed
quickly over the last five to ten years and we expect them to continue to evolve. On the one hand, after
several years of low inflation across industrialized countries, signals of rising inflation have appeared.
Rising commodity and energy prices are typical examples. On the other hand, more and more players
are coming to inflation markets both on the supply and demand sides. Two decades ago, people
considered equities as a good hedge against inflation, but equities appear to be little correlated with
inflation, and demand for pure inflation hedges has dramatically increased. Pension funds and insurers
are the most active and really triggered new attention. Today, they not only face pressure from their
rising liabilities but also from regulators.

Even though inflation is not a new topic for us, we thus need to brush the cobwebs off the techniques used
to understand and quantify inflation risk, examining the new perspectives and problems offered by
evolving inflation markets. In this paper, we first explore how inflation is measured. While everybody
can agree on what a rate of interest is just by looking at the evolution of a savings account, inflation
hits different people differently and depends on consumption habits. We highlight seasonality effects
and their impact on the volatility of short-term inflation. We then survey the structure of the
inflation-linked asset class. We are left with the necessity to consider inflation risk in a new way. For
that purpose, we come back to the well-known Irving Fisher paradigm linking real and nominal
economies, and define in a clean way the concept of break-even inflation. We then describe various
adjustments required by the indexation mechanisms of inflation markets so as to make break-even a
suitable quantity for risk management. Adjusted break-evens allow us to consistently consider and
measure inflation risk across assets and markets. Finally, we illustrate the methodology through actual
data over the last years, considering break-evens, nominal and real yields.

2 Measuring economic inflation

The media relay inflation figures on a regular basis. It is important to understand where numbers
come from and their implications. An inflation rate is measured off a so-called Consumer Price Index
(CPI) which varies significantly across countries and through time. The annual inflation rate is
commonly reported. It constitutes the percentage change in the index over the prior year. We
underline hereafter that this is a meaningful way to measure economic inflation. Exploiting economic
inflation would however be a challenging task for financial markets which require higher frequency
data. The monthly change on most CPIs is more suitable but should be considered with care, taking
into account seasonality in the index.

2.1 Consumer Price Indices

A consumer price index is a basket containing household expenditures on marketable goods and
services. The total value of this index is usually scaled to 100 for some reference date. As an
example, consider the Harmonized Index of Consumer Prices ex-tobacco (HICPx). This index applies
to the Euro zone, and is defined as a weighted average of local HICPx. The weights across countries
are given by two-year lagged total consumption, and the composition of a local index is revised on a
yearly basis. While the country composition has been generally stable, we notice significant changes
with the entry of Greece in 2001 and with the growth of Ireland. The contribution of Germany has
decreased from 35% to 28%.

The composition of the HICPx across main categories of expenditures has also significantly evolved
over the last ten years.1 We can observe that expenditures for health and insurances have grown by
6%. We would have also expected a large growth in the role of housing expenditures as real estate and
energy take a greater share of the budget today, but this category stayed nearly constant (from 16.1%
in 1996 to 15.9% in 2007). This is a known issue, but no corrective action has been taken yet for fear
of a jump in short-term inflation and an increase in the volatility of the index.

These considerations point out that inflation as measured through CPI baskets is a dynamic concept,
likely to capture individual exposure to changes in purchasing power with a lag.2 To some extent, the
volatility of market-implied inflation contains the volatility in the definition of the index itself.

2.2 Measure of realized inflation

Beyond the issue of how representative any CPI is, measuring an inflation rate off a CPI raises timing
issues. We define the annualized realized inflation rate i(t, T ) from t to T corresponding to an index
CPI(t) as
$$ i(t,T) = \left( \frac{CPI(T)}{CPI(t)} \right)^{\frac{1}{T-t}} - 1. \qquad (1) $$

Subtleties in the way information on the index is gathered and disclosed are problematic for financial
markets. First, CPI values are published monthly or quarterly only while markets trade on a daily
basis. Second, a CPI is officially published with a lag, because of the time needed for collecting prices
of its constituents. Finally, the data gathering process implies that a CPI value can hardly be assigned
1 Data on the index and its composition can be found on the Eurostat web site epp.eurostat.ec.europa.eu
2 The most famous consequence of this lag is known as the Boskin effect.

to a precise date t. A reference date is nevertheless fixed by convention: it is the first
day of the month for which the collecting process started.

As an illustration, let us consider the HICPx with the information available on 04 October 2007. The
last published value was 104.19 on 17 September 2007. This value corresponds to the reference date
01 August 2007. The previous value, 104.14, published in August, corresponds to 01 July 2007. On
October 4, we could thus compute the annualized realized inflation rate corresponding—by
convention—to 01 July 2007 through 01 August 2007: $i = \left(\frac{104.19}{104.14}\right)^{12} - 1 \simeq 0.58\%$, but the inflation
rate between August 1 and October 4 was still unknown.
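
As a minimal illustration of equation (1) and of the two figures above, the computation can be scripted directly (CPI values as quoted in the text):

```python
# Minimal sketch of equation (1): annualized realized inflation from two CPI
# observations. The HICPx values below are those quoted in the text.
def realized_inflation(cpi_start, cpi_end, years):
    """Annualized inflation between two CPI observations `years` apart."""
    return (cpi_end / cpi_start) ** (1.0 / years) - 1.0

# Month-on-month rate, annualized: 01 July 2007 -> 01 August 2007
monthly = realized_inflation(104.14, 104.19, 1.0 / 12)   # ~0.58%
# Year-on-year rate: 01 August 2006 -> 01 August 2007
annual = realized_inflation(102.46, 104.19, 1.0)          # ~1.69%

print(f"monthly (annualized): {monthly:.2%}, annual: {annual:.2%}")
```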

Such a low inflation figure, 0.58%, sounds unusual. The media commonly report inflation over the
past year. On October 4, we could thus reference an annual inflation of $\frac{104.19}{102.46} - 1 \simeq 1.69\%$ in the Euro zone,
102.46 being the HICPx value for the reference date 01 August 2006. The top graph in Figure 1
shows the evolution of these two inflation measures. The annual inflation rate has been relatively
stable in the Euro zone, certainly as the result of controlled monetary policies. The high volatility in
the monthly inflation rate is striking, with many negative values. Of course, we cannot talk about
deflation in June (−0.31%) and inflation in July (0.58%). The repeating peaks and troughs, noticeably
pronounced over the last years, and the nature of some constituents of the CPI, advocate for a
systematic pattern (seasonality) in monthly data which has to be filtered out.

2.3 Seasonality

Seasonality in demand for commodities and energy is a well established phenomenon. There exist
several methods which extract seasonality from a time series. Typically, the trend is estimated through
autoregressive processes (AR) and seasonal patterns through moving average (MA). We apply here a
proven model developed by the US Bureau of Census for economic time series, known as the X-11
method.3 We set a multiplicative model on the CPI itself such that

$$ CPI(t) = S(t)\,\overline{CPI}(t) \qquad (2) $$

where $\overline{CPI}(t)$ represents the seasonally adjusted CPI, and S(t) the seasonal pattern.
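
The X-11 procedure itself is not reproduced here; as a rough stand-in, the sketch below applies a simple multiplicative decomposition from statsmodels to a synthetic monthly CPI series, only to illustrate how a seasonal factor S(t) and an adjusted series can be separated.

```python
# Rough stand-in for the multiplicative decomposition of equation (2).
# The actual X-11 procedure is more involved; the synthetic series below is
# purely illustrative (trend plus a 12-month multiplicative pattern).
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

idx = pd.date_range("1997-01-01", "2007-12-01", freq="MS")
trend = 100 * 1.02 ** (np.arange(len(idx)) / 12.0)
seasonal = 1 + 0.002 * np.sin(2 * np.pi * idx.month / 12.0)
cpi = pd.Series(trend * seasonal, index=idx)

decomp = seasonal_decompose(cpi, model="multiplicative", period=12)
S = decomp.seasonal            # seasonal pattern S(t)
cpi_adjusted = cpi / S         # seasonally adjusted CPI
print(S.tail(12).round(4))
```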

Returning to monthly estimates of the inflation rate, the top graph in Figure 1 shows that most of the
volatility in monthly inflation measured off the unadjusted CPI time series is due to seasonality. The
seasonally adjusted monthly inflation rate for July is about 1.57% and for June about 1.72%, figures
3
In practice, the time series is decomposed into three components: the trend, the seasonality and a residual. The X-11
method consists in a sequential and iterative estimation of the trend and of the seasonality by operating moving averages in
time series and in cross section for each month. See for instance (Shishkin, Young, and Musgrave 1967) for further details.

Figure 1
Realized inflation implied by HICP ex-tobacco
Annualized inflation reported on monthly and annual CPI variations, with and without seasonality
adjustments. Seasonality component S(t) estimated using the X-11 method.
[Figure: top panel, annualized inflation rate (%), monthly, monthly adjusted and annual series, 1997 to 2007; bottom panel, seasonal component (%), 1997 to 2007.]

in line with the annual estimate. This proves that seasonality has to be an important ingredient when
measuring inflation risk on financial markets, exactly as it is for commodities.

The bottom graph of Figure 1 presents the evolution of the seasonal pattern. The relative impact of
seasonality on the HICP ex-tobacco has doubled over the last ten years. The effect is more
pronounced in winter. Even though the pattern is different, it is interesting to notice that the same
comments apply to the US CPI (and actually for all developed countries). We will show in Section 5
that modeling a stochastic seasonal pattern is unnecessary.

3 Structure of the market

Inflation-linked financial instruments are now established as an asset class, but the growth of this
market4 is a new phenomenon mainly driven by an increased demand. Conventional wisdom used to
be that inflation-linked liabilities could be hedged with equities, producing a low demand for pure
inflation products. The past ten years have demonstrated the disconnection between inflation figures
and the evolution of equity markets, and highlighted the necessity to develop inflation markets.

We examined various measures of correlation between the main equity index and the corresponding
CPI in the US, the UK and in France.5 Our results were between -20% and 40%, concentrated around
0% to 20%. An institution with inflation-indexed liabilities could not be satisfied with even the best
case, a 40% hedging ratio. Demand for inflation products has thus exploded, reinforced by new
regulations (the International Financial Reporting Standards, IFRS, and Solvency II in Europe).
Restructuring activities from banks have also created a liquid derivatives market. This evolution of
inflation markets has shifted the way inflation-linked products and inflation risk have to be assessed.
While in the early stages, the analogy with nominal bonds was convenient for the treatment of
inflation-linked bonds, this is no longer true with derivatives.

3.1 The inflation-linked bond market

Governments structured inflation-linked bonds (ILB) and created these markets as alternative issuance
programs to standard treasury notes. This supply is mainly driven by cost-of-funding concerns. The
risk premium that governments should grant to investors on ILB should be slightly lower than on
nominal bonds. In addition, governments have risk diversification incentives on their debt portfolio.
Both outgoing flows (such as wages) and incoming flows (such as taxes) are directly linked to
inflation. As a consequence, some proportion of ILB should be issued; importantly for the
development of these markets, governments committed to provide consistent and transparent issuance
plans several years ahead.

Inflation-linked bonds are typically bonds with coupons and principal indexed to inflation. The
coupon is fixed in real terms and inflated by the realized inflation from the issuance of the bond (or

4 Bloomberg reports a tenfold increase in the HICPx-linked market, to over €100 million outstanding, since 2001.
5
We considered monthly versus yearly returns and realized inflation, using expanding, rolling and non-overlapping
windows. Rolling windows show a high volatility in the correlation itself, though the statistical significance of these results
is pretty low. Introducing a lag does not modify the outputs.

Figure 2
Cash flow indexation mechanism
Inflation protected period for holding a cash flow from t to T . L stands for the indexation lag, and tl for
the last reference date for which the CPI value has been published.

from a base CPI value).6 The indexation mechanism is designed to overcome the problems associated
with the CPI, namely its publication frequency and its publication lag. This is achieved by introducing
a so-called indexation lag L.7 Typical lag periods range from two to eight months.

Flows are indexed looking at the CPI value corresponding to a reference date L months into the past.
With such a scheme, the inflation protected period and the holding period do not exactly match, as
illustrated in Figure 2. The nominal coupon CN to be paid at time T to an investor who acquired this
right at time t on a real coupon CR is then given by $C_N = C_R\,\frac{CPI(T-L)}{CPI(t-L)}$. The lag is set such that the base
CPI value CPI(t − L) is known at time t, provided that an interpolation method is also specified to
compute this value from the CPI values of the closest reference dates. This mechanism is mandatory
for the calculation of accrued interest.

The approach yields a simple pricing formula—Section 4.2 justifies it—in terms of real discount
rates $R_R(t,\cdot)$. Denoting $t_1,\ldots,t_N$ the coupon payment dates and $CPI_0$ the value of the reference CPI at
the issuance of the bond (typically, $CPI_0 = CPI(t_0 - L)$ where $t_0$ is the issuance date), we get
$$ PV(t) = \frac{CPI(t-L)}{CPI_0} \left( \sum_{i;\, t<t_i} \frac{C_R}{(1+R_R(t,t_i))^{t_i-t}} + \frac{100}{(1+R_R(t,t_N))^{t_N-t}} \right). \qquad (3) $$

This equation implies that in real terms, an ILB can be considered as a plain vanilla nominal bond
with fixed coupon CR, but substituting real rates for nominal rates. (In fact, this equation effectively
defines the real rates.) Risk is represented by the daily variations in real rates—or real yields—since
the real rate curve is the only unknown from today to tomorrow. Surprisingly, we cannot isolate
inflation risk here even though nominal rates, real rates and inflation risks are connected quantities.

6 This structure is now standard, though other structures with capital only or coupons only indexation can still be found.
7 Depending on the context, L should be understood as a number of months or as a fraction of a year.

The standard convention is to quote ILB prices in real terms, that is, to quote the real coupon bond
which then must be inflated by the above index ratio. For instance, on December 3, the French OATe
2.25 7/25/2020 linked to the HICPx is quoted with a real clean price of 103.177. Using the
appropriate index ratio of 1.0894 leads to a clean PV of €112.401.8
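
As an illustration of equation (3), the following Python sketch prices a linker as a real coupon bond scaled by the index ratio. The real curve, coupon schedule and index ratio below are illustrative placeholders rather than the market data behind the OATei quote above.

```python
# Sketch of equation (3): PV of a linker as a real coupon bond scaled by the
# index ratio CPI(t-L)/CPI_0. All inputs are illustrative assumptions.
def ilb_pv(real_coupon, coupon_times, real_rates, index_ratio, notional=100.0):
    """coupon_times: year fractions t_i - t of the remaining coupons;
    real_rates: real zero rates R_R(t, t_i) for the same dates."""
    pv_real = sum(real_coupon / (1 + r) ** ti
                  for ti, r in zip(coupon_times, real_rates))
    pv_real += notional / (1 + real_rates[-1]) ** coupon_times[-1]
    return index_ratio * pv_real

# Illustrative 2.25% real coupon bond, annual coupons, flat 1.9% real curve.
times = [0.6 + k for k in range(13)]
rates = [0.019] * len(times)
print(ilb_pv(2.25, times, rates, index_ratio=1.0894))
```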

3.2 Development of derivatives

Liability hedging sparked off this boom. Pension funds are largely exposed to inflation moves:
indirectly, when retirement pensions are partially linked to inflation through indexation to final wages;
directly, when pensions are explicitly linked to a CPI.9

Inflation swap markets have been developed to answer more precisely the needs of liability-driven
investments. Stripping ILB into zero-coupon bonds, financial intermediaries are now able to propose
customized inflation-protected cashflow schedules. In its simplest form, a zero-coupon inflation swap
is written on the inflation rate from a given CPI. The inflation payer pays at the maturity T of the
contract the increase in the index $\frac{CPI(T-L)}{CPI(t-L)}$ times a predefined notional. The protection buyer pays a
fixed rate on the same notional. The fixed rate is determined at the inception such that the value of the
swap is null.10 With inflation swaps—and similar derivatives—this fixed rate is the quantity at risk,
and expresses mainly views on expected inflation. By convention, inflation swaps are quoted through
this fixed rate for a set of full-year maturities. Thus, the natural risk factors for inflation swaps are the
quoted swap rates, rather than the real rates from before. In illiquid markets, we could accept different
risk factor definitions, but today, with liquid markets and significant intermediation activity, we
require a consistent view. Break-even inflation as defined later in this paper fills the gap while
accounting for the intrinsic features of consumer price indices.

8 This ratio is computed taking into account a three-month lag, a linear interpolation method and a three-day settlement
period. The HICPx values can be downloaded from the Eurostat web site.
9 The tendency in Europe is to the direct linkage of pensions. Previously, most pensions were indirectly linked through

the average wage over the last years before retirement.


10
Such a structure is very effective for liability hedging: the whole capital can still be invested in risky assets while ILB
lock money in low returns. Among the possible swaps, one would prefer year-on-year inflation swaps, which pay inflation
yearly. This strongly limits the amount of cash which has to be paid at maturity.

Figure 3
Inflation Markets

3.3 More players with different incentives

Figure 3 depicts the global structure of inflation markets. Pension funds, insurers, corporates and
retail banks look both for protection and portfolio diversification. Banks stand across the board, as
intermediaries competing with hedge funds on inflation arbitrages, and as buyers looking for money
market investments and diversification. On the supply side, utilities and large corporates issue
inflation swaps or ILB so as to reduce their cost of funding. The inflation swap market has indeed
consistently granted a small premium (0-50bps) to inflation payers. In addition to standard
factors—credit and liquidity risk—this premium contains restructuring fees and a reward for a small
portion of inflation risk which cannot be transferred from bond markets to swap markets because of
indexation rules.

4 The concept of break-even inflation

Ideally, we would like to define and use expected inflation as a consistent risk factor on both markets.
Given that inflation-linked assets are written in terms of CPI ratios, we might believe that expected
inflation can be extracted from inflation swaps quotes, or from ILB prices. Unfortunately, measuring
expected inflation is challenging and cannot be done without an explicit dependency on the nominal
interest rate curve. Even though prices of inflation products can be observed, we show that no
model-independent inflation expectation can be derived.

We previously defined the realized inflation measure i(t, T ) in (1) and can thus define the expected
inflation I(t, T) as
$$ I(t,T) = E_t\!\left[ \left( \frac{CPI(T)}{CPI(t)} \right)^{\frac{1}{T-t}} \right] - 1, \qquad (4) $$
where the expectation is taken under the physical measure. Given that future inflation is uncertain, a
premium is necessarily embedded into the above expectation.11 But making an assumption about this
premium is not sufficient either. Using standard concepts of asset pricing theory, we demonstrate
hereafter that one can extract forward CPI values—or expectations of CPI values under corresponding
nominal forward measures—only. These forward CPI values are the main inputs entering into the
break-even inflation concept.

From now on, we leave aside the annualization, consider perfect indexation (L = 0) and assume that
the realized CPI is observable on a daily basis. We remark that we would be able to observe the
expected inflation if we could observe the expected CPI:
 
$$ E_t\!\left[ \frac{CPI(T)}{CPI(t)} \right] = \frac{1}{CPI(t)}\, E_t\left[ CPI(T) \right]. \qquad (5) $$

4.1 The exchange rate analogy

Let us first call the nominal world our physical “home” world: in this world, any good or amount is
expressed in a monetary unit which is a currency (say, $US). We can consider other worlds that we
would still describe as nominal worlds but in which the monetary unit is another currency (say, C),
the “foreign” worlds. In a complete market with absence of arbitrage opportunities, a unique measure
exists to value all goods and assets in our nominal “home” world, the risk neutral measure. Through a
11
There is no consensus in the academic literature about the sign and the magnitude of this risk premium. In the US,
recent papers evaluated the premium up to 50bp while the European premium might be insignificant. See for instance
(Buraschi and Jiltsov 2005) and (Hördahl and Tristani 2007).

change of numeraire, typically given by the exchange rate dynamics, pricing can be done in any
currency using the risk neutral measure in the “foreign” world.

It is common to consider inflation analogously to this setup by defining a CPI basket as a new
monetary unit. We refer to this as a basket unit as opposed to a dollar unit. The world where all goods
and amounts are expressed in basket units is the real world. Because of the completeness argument,
through a change of numeraire pricing of real assets can be done in the real economy or in the
nominal economy equivalently. The change of numeraire is given by the CPI price itself. As with
exchange rates, CPI(t) is the spot exchange rate to convert one basket unit in the real economy to the
nominal economy, that is, to $.

4.2 Standard pricing of linkers

As we discussed, linkers are traditionally priced in the real world through real coupon bonds. Let us
consider the simplest linker, a perfectly indexed inflation-linked discount bond (ILDB)—with price
P(t, T; t) $—issued at time t, which gives the right to receive a cash flow of $\frac{CPI(T)}{CPI(t)}$ at T. A linker can
obviously be decomposed into a deterministic linear combination of ILDB matching the coupon
payment dates. We further introduce the discount bonds BN,N (t, T ) and BR,R (t, T ) at date t with
maturity date T in respectively the nominal world and the real world and in their own monetary units.
In other words, these bond prices are obtained under the risk neutral measures PN∗ and PR∗ of the
nominal and real worlds respectively:
  
$$ B_{x,x}(t,T) = E_t^{P_x^*}\!\left[ \exp\left( -\int_t^T r_x(s)\,ds \right) \right], \qquad (6) $$

where rx (s) is the short rate in the corresponding world at time s. We can express the price of this real
discount bond in nominal terms BN,R (t, T ) using the spot CPI as BN,R (t, T ) = CPI(t)BR,R(t, T ). This
does not correspond to the price of an investment paying in our “home” world since it settles to one
CPI unit at time T: clearly, $\frac{1}{CPI(t)} B_{N,R}(t,T) \neq P(t,T;t)$, as they do not provide the payoff in the same
units. This is illustrated in Figure 4 through the black and the red arrows.

Though pricing of ILDB can be done using expectations on future CPIs (this is developed in the next
section), a straightforward observation leads to (3) and a pricing model in terms of real rates. At time
$t_0$, paying $P(t_0,T;t_0)$ dollars for one ILDB issued at $t_0$ and maturing at T yields a payoff of $\frac{CPI(T)}{CPI(t_0)}$ dollars at
maturity. The same payoff can be locked by investing $B_{R,R}(t_0,T)$ dollars in the real world—thus paying
$\frac{1}{CPI(t_0)} B_{R,R}(t_0,T)$ real units for $\frac{1}{CPI(t_0)}$ units of the real discount bond maturing at T—and converting the payoff
back into dollars at maturity. This is materialized in Figure 5, where we explicitly mark dollar flows.

Figure 4
ILDB and real discount bond prices in the nominal world

Figure 5
Investment flows for replicating an ILDB

Since this is a self-financing replicating strategy which yields the same payoff in all states of nature
at T, in absence of arbitrage opportunities, the price of the ILDB at time t such that t0 ≤ t ≤ T is equal
to the value of the replicating strategy, and thus given by
$$ P(t,T;t_0) = \frac{CPI(t)}{CPI(t_0)}\, B_{R,R}(t,T) = \frac{1}{CPI(t_0)}\, B_{N,R}(t,T) \ \$. \qquad (7) $$
Do note the following implication: at issuance, the dollar value of the ILDB is exactly equal to the
real discount bond price, as if a CPI unit were worth $1. This justifies (3), and this simple pricing
model advocates for the use of real rates as risk factors.

4.3 Expectations of future CPI values

4.3.1 The forward CPI value

Let us derive the forward CPI value which will be the building block of break-evens. Since we have
ultimately to price instruments in the nominal world—this is our “home” world—we consider the
forward price FN,R (t, T1, T2 ) of a real discount bond delivered at T1 and maturing at T2 expressed in
nominal terms. Using the exchange rate analogy, we define the forward CPI value FCPI (t, T ) at t for
delivery date T as the forward price when T1 = T2 = T : FCPI (t, T ) = FN,R (t, T, T ). In absence of
arbitrage opportunities (AOA), we obtain
$$ F_{N,R}(t,T_1,T_2) = \frac{CPI(t)\,B_{R,R}(t,T_2)}{B_{N,N}(t,T_1)}. \qquad (8) $$
Notice that given a set of nominal discount bond prices and a set of real discount bond prices in their
respective units, the values of forward CPI are known.

4.3.2 Link with CPI expectations

In our complete market, the AOA condition implies that at time t, the following condition should hold:
  
$$ B_{N,N}(t,T) = E_t^{P_R^*}\!\left[ \frac{CPI(t)}{CPI(T)} \exp\left( -\int_t^T r_R(s)\,ds \right) \right], \qquad (9) $$

since investing in the nominal world should be equivalent to investing in the real world with an initial
nominal amount and converting the output back to the nominal world (see Figure 6). The standard
relationship between risk neutral P∗ and forward measures PT applied to the real world leads to
 
$$ B_{N,N}(t,T) = CPI(t)\,B_{R,R}(t,T)\, E_t^{P_R^T}\!\left[ \frac{1}{CPI(T)} \right]. \qquad (10) $$

Figure 6
Non-arbitrage condition between real and nominal worlds

Combining (8) and (10) implies that


 
$$ \frac{1}{F_{CPI}(t,T)} = E_t^{P_R^T}\!\left[ \frac{1}{CPI(T)} \right]. \qquad (11) $$
Considering the AOA condition from the real world leads to a similar equation,
  
$$ B_{R,R}(t,T) = E_t^{P_N^*}\!\left[ \frac{CPI(T)}{CPI(t)} \exp\left( -\int_t^T r_N(s)\,ds \right) \right] \;\Rightarrow\; F_{CPI}(t,T) = E_t^{P_N^T}\left[ CPI(T) \right]. \qquad (12) $$

This shows that the expected CPI value Et [CPI(T )] cannot be directly observed. Of course, assuming
the dynamics for the CPI itself, its interactions with nominal rates and the shape of the inflation risk
premium, we could derive the exact relationship between the forward CPI and the expected CPI. In
particular, if the CPI dynamics and nominal rate dynamics were independent FCPI (t, T ) would be
equal to the expected CPI under the nominal risk neutral measure.

4.4 The break-even inflation

The above equations actually define the quantity that we can extract—free of modeling bias—which
we refer to as the zero-coupon break-even inflation (BEI). In discrete annual compounding, it can be
characterized by
$$ BEI(t,T) = \left( \frac{F_{CPI}(t,T)}{CPI(t)} \right)^{\frac{1}{T-t}} - 1, \qquad (13) $$
or in continuous compounding,
$$ BEI_c(t,T) = \frac{1}{T-t} \log\!\left( \frac{F_{CPI}(t,T)}{CPI(t)} \right). \qquad (14) $$

Notice that because of (8) and (10), the definition (13) is equivalent to

$$ BEI(t,T) = \frac{1 + R_N(t,T)}{1 + R_R(t,T)} - 1, \qquad (15) $$

where RR (t, .) and RN (t, .) are respectively the real and the nominal zero-coupon rates.
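
Both definitions are simple to script; the sketch below computes a break-even either from a forward CPI value, following (13), or from nominal and real zero-coupon rates, following (15). The inputs are illustrative.

```python
# Sketch of definitions (13) and (15): zero-coupon break-even inflation from a
# forward CPI value, or equivalently from nominal and real zero-coupon rates.
def bei_from_forward_cpi(forward_cpi, spot_cpi, tau):
    """Equation (13): tau is the horizon T - t in years."""
    return (forward_cpi / spot_cpi) ** (1.0 / tau) - 1.0

def bei_from_rates(nominal_rate, real_rate):
    """Equation (15)."""
    return (1 + nominal_rate) / (1 + real_rate) - 1.0

print(bei_from_rates(0.042, 0.019))              # ~2.26%
print(bei_from_forward_cpi(116.8, 104.19, 5.0))  # illustrative forward CPI
```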

Equation (15) is the well known Fisher equation, with break-evens substituted for expected inflation.
However, with stochastic inflation (equivalently, future CPI values) the relationship is verified by
break-evens only. Equation (10) provides more insight on the components included in the break-even
inflation: the complete Fisher equation involving the expected inflation I(t, T ) can be written as12

$$ 1 + R_N(t,T) = (1 + R_R(t,T))(1 + BEI(t,T)) = (1 + R_R(t,T))(1 + I(t,T))(1 + \pi(t,T))(1 + v(t,T)) \qquad (16) $$
in which interest rates and CPI dynamics provide expressions for π(t, T )—which contains the
inflation risk premium—and v(t, T )—the correlation correction between interest rates and the CPI
dynamics, with a convexity adjustment, depending on the model.

4.5 Pricing in the nominal world only

Using a zero-coupon break-even curve and a nominal zero-coupon interest rate curve, we can now
rely on a new pricing model for linkers. This new framework enables the explicit modeling of
inflation risk and standard interest rate risk. An inflation-linked bond can indeed be modeled as a
stochastic coupon bond. The price P(t, T ;t0 ,CR ) at time t of a linker maturing at T , issued at t0 with a
real coupon of $C_R$ can indeed be computed under the risk neutral measure $P_N^*$ through
$$ P(t,T;t_0,C_R) = E_t^{P_N^*}\!\left[ \sum_{i;\,t<t_i} C_R\, \frac{CPI(t_i)}{CPI(t_0)} \exp\left(-\int_t^{t_i} r_N(s)\,ds\right) + 100\, \frac{CPI(T)}{CPI(t_0)} \exp\left(-\int_t^{T} r_N(s)\,ds\right) \right] \qquad (17) $$
$$ = \frac{CPI(t)}{CPI(t_0)} \left( \sum_{i;\,t<t_i} C_R\, E_t^{P_N^{t_i}}\!\left[ \frac{CPI(t_i)}{CPI(t)} \right] B_{N,N}(t,t_i) + 100\, E_t^{P_N^{T}}\!\left[ \frac{CPI(T)}{CPI(t)} \right] B_{N,N}(t,T) \right) \qquad (18) $$
$$ = \frac{CPI(t)}{CPI(t_0)} \left( \sum_{i;\,t<t_i} C_R \left( \frac{1+BEI(t,t_i)}{1+R_N(t,t_i)} \right)^{t_i-t} + 100 \left( \frac{1+BEI(t,T)}{1+R_N(t,T)} \right)^{T-t} \right) \qquad (19) $$

where PNs is the forward measure for the time s in the nominal world. Similarly, one can show that the
fixed rate of an inflation swap is equal to the break-even inflation for the same horizon.

12 This can be derived from Itô’s lemma applied to the dynamics of the CPI, and nominal or real rates.
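
The pricing model (19) can be sketched in the same way as the real-rate model, now driven by a nominal zero curve and a zero-coupon break-even curve. The flat curves below are illustrative assumptions, not market data.

```python
# Sketch of equation (19): pricing a linker from a nominal zero curve and a
# zero-coupon break-even curve, with the upfront index ratio CPI(t)/CPI(t0).
def linker_pv(real_coupon, coupon_times, nominal_rates, bei_rates,
              index_ratio, notional=100.0):
    pv = 0.0
    for ti, rn, b in zip(coupon_times, nominal_rates, bei_rates):
        pv += real_coupon * ((1 + b) / (1 + rn)) ** ti
    tN, rN, bN = coupon_times[-1], nominal_rates[-1], bei_rates[-1]
    pv += notional * ((1 + bN) / (1 + rN)) ** tN
    return index_ratio * pv

# Illustrative flat 4.2% nominal curve and flat 2.26% break-even curve.
times = [0.6 + k for k in range(13)]
print(linker_pv(2.25, times, [0.042] * 13, [0.0226] * 13, index_ratio=1.0894))
```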

5 Adjusted break-evens as risk factors

Importantly, the methodology presented above does not depend on perfect indexation. The indexation
lag and the publication lag can be taken into account through slight adjustments in the definition of
the break-even inflation and in pricing formulas. We now detail those adjustments, beginning with
seasonality. We then discuss two types of break-evens that can be used in a risk context. The second
type of BEI is motivated by requirements of homogeneity and portability, two essential characteristics
for a consistent evaluation of inflation risk across both asset classes and countries.

5.1 Including seasonality

We showed in Section 2.3 that the predictable seasonal pattern should be stripped out from observed
values of the CPI. This seasonality component should be included into our modeling of break-evens
by defining seasonally adjusted break-evens. Assuming that the seasonal pattern is deterministic, we
can combine (2), (11) and (13), defining a seasonally adjusted break-even $\overline{BEI}(t,T)$ as

$$ (1 + BEI(t,T))^{T-t} = \frac{S(t,T)}{S(t,t)}\, (1 + \overline{BEI}(t,T))^{T-t} \qquad (20) $$

where S(t, T ) is the seasonality estimated at t projected for T . The projection is done by repeating the
last whole year’s seasonal pattern S(t), as extracted in Section 2.3. Between estimated seasonal
monthly values S(t), a linear interpolation is performed.
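
A sketch of the adjustment (20), assuming the projected and current seasonal factors are available from the repeated last-year pattern; the numbers are illustrative and only meant to show that the correction matters mostly at short horizons.

```python
# Sketch of equation (20): stripping the projected seasonal pattern from an
# unadjusted break-even. s_proj is the seasonal factor projected for the
# maturity date, s_now the factor for today; both values are illustrative.
def seasonally_adjust_bei(bei_unadjusted, tau, s_proj, s_now):
    gross = (1 + bei_unadjusted) ** tau * s_now / s_proj
    return gross ** (1.0 / tau) - 1.0

# A ~0.4% seasonal swing matters for a 1-year BEI, much less at 10 years.
for tau in (1.0, 10.0):
    print(tau, seasonally_adjust_bei(0.023, tau, s_proj=1.002, s_now=0.998))
```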

Since inflation swap quotes constitute (a special type of) break-evens, we explore the impact of
seasonality on the strike rate of an inflation swap. Figure 7 presents results on the EU HICPx and US
CPI inflation. We will refer to these BEI as adjusted and unadjusted standard break-evens. In the
figure, we observe that US inflation was expected to increase, while the European curve was flat. US
inflation swaps traded about 25bp to 50bp above European inflation swaps. Further, the impact of
seasonality vanishes quickly with increasing swap maturity. Since the seasonal pattern changes
slowly,13 any implication on mid- and long-term unadjusted BEI would be insignificant. Short-term
unadjusted BEI are well predicted by the seasonal pattern observed on the previous year. Figure 7
underlines the benefits of using adjusted BEI as risk factors. Because of the swings created by the
seasonality, we would overestimate short-term inflation risk by considering unadjusted BEI. This will
be shown later. Unless otherwise specified, we consider seasonally adjusted BEI in the remainder of this paper.

13 Recall Figure 1.

Figure 7
Impact of seasonality on inflation swap quotes
Inflation swap quotes are interpolated using smoothed splines. Data as of 03 December 2007
[Figure: top panel, US break-even inflation swap (%); bottom panel, EU break-even inflation swap (%); adjusted and unadjusted curves shown against market quotes over maturities of 0 to 30 years.]

5.2 Homogeneity and portability

The validation of a pricing model for risk management purposes imposes some constraints on the
input data, or equivalently on the risk factors entering into equations. When introducing imperfect
indexation, definition (13) has to be adapted for our break-even to satisfy those constraints. First, our
ability to estimate and model the distribution of a risk factor depends on our capacity to observe an
homogeneous sample of this factor through time. Homogeneity comes in two flavors: the observable
should refer to the same theoretical quantity—in particular, to the same horizon—and it should be
forward looking. Second, risk analyses at the portfolio level call for a consistent modeling of the same
risk across markets and assets. As far as inflation is concerned, we would like inflation swaps and
inflation-linked bonds to rely on the same risk factor. Trading on the two asset classes—for instance
through inflation-linked swaps—might even require interchangeability in break-evens derived from
inflation swaps and from linkers.14 This involves disentangling specific market conventions from the
risk factors themselves.
14 In general, an inflation-linked swap consists in swapping a linker against a floating nominal leg.

5.2.1 The standard BEI

We characterize a first type of break-even by coming back to ILDB. Relaxing the perfect indexation
assumption implies that the payoff of an inflation linked discount bond refers to lagged inflation. Its
price is given by
  
$$ P(t,T;t_0,L) = E_t^{P_N^*}\!\left[ \frac{CPI(T-L)}{CPI(t_0-L)} \exp\left( -\int_t^T r_N(s)\,ds \right) \right] \qquad (21) $$
$$ = \frac{CPI(t-L)}{CPI(t_0-L)}\, E_t^{P_N^*}\!\left[ \frac{CPI(T-L)}{CPI(t-L)} \exp\left( -\int_t^T r_N(s)\,ds \right) \right] \qquad (22) $$
$$ = \frac{CPI(t-L)}{CPI(t_0-L)}\, E_t^{P_N^T}\!\left[ \frac{CPI(T-L)}{CPI(t-L)} \right] B_{N,N}(t,T). \qquad (23) $$
Equation (23) shows that we can embed the indexation lag within the definition of the break-even
inflation by setting
$$ (1 + BEI(t,t-L,T-L))^{T-t} = \frac{F_{CPI}(t,T-L)}{CPI(t-L)} = \frac{S(t,T-L)}{S(t,t-L)}\, (1 + \overline{BEI}(t,t-L,T-L))^{T-t}. \qquad (24) $$
This definition conveniently allows us to stick to (19) without additional effort beyond modifying the
upfront index ratio. This is a standard market practice. We thus refer to (24) as the standard BEI.

From a risk perspective, seasonally adjusted standard break-evens do satisfy some of the
aforementioned criteria on risk factors. We can indeed derive a standard zero-coupon BEI curve on a
daily basis from inflation swap quotes as well as from treasuries and inflation-indexed bonds.
However, this simple approach comes at the cost of deriving a quantity which is not purely at risk.
Figure 8 shows how the protected period—and so the BEI—decomposes. We distinguish three parts.
From the base indexation date t − L up to the last observed reference date tl , the inflation rate is
actually known. Between tl and the analysis date t, the inflation rate, if not fully known, is strongly
predictable. The third, forward-looking part is the true period at risk.

Portability of standard BEI is also a concern. Conventions for computing the base index value
CPI(t − L) vary from one market to another, and leaving this component within the break-even
significantly restricts the way it can be used. For instance, French linkers indexed to the HICPx (the
OATe) obtain their base index value by interpolation between two prior observed CPI values, while
HICPx inflation swaps choose the CPI value for a single reference date. Because of these different
conventions, the discrepancy between standard BEI derived from inflation swaps and from linkers
varies in a predictable way through a month, which is unsatisfactory. The HICPx interpolation rules
additionally create spurious jumps when the protected period changes.

Figure 8
Standard and fully adjusted break-evens
Zero coupon break-even inflation defined off a position held from t to T . L stands for the indexation
lag, and tl for the reference date at which the last CPI value was published.

5.2.2 Fully adjusted BEI

Acknowledging these defects in the standard BEI, another common practice is to define “forward”
break-evens BEI f (t, T, L) as the risk factors by stripping the first two parts of the protected period:
$$ (1 + BEI_f(t,T,L))^{T-t} = \frac{(1 + BEI(t,t-L,T))^{T-t+L}}{(1 + BEI(t,t-L,t))^{L}} = \frac{S(t,T)}{S(t,t)}\, (1 + \overline{BEI}_f(t,T,L))^{T-t}. \qquad (25) $$
We underline that the “forward” label is a misuse of language since BEI f (t, T, L) is nothing more than
a spot break-even measure. The forward standard BEI is thus the quantity that we have to use when
applying the Fisher equation (15). More importantly, it is obvious that none of the standard BEI
drawbacks are corrected.15

We can nevertheless define another type of break-even. Coming back to Figure 8, we could partially
adjust for the indexation lag and consider a break-even over the non-deterministic protected period
from tl to T − L. This would break the homogeneity condition since on a daily basis for a quoted
constant time-to-maturity instrument (such as an inflation swap), the interval [tl , T − L] increases
slowly before contracting again when a new CPI value is published for tl+1. Homogeneity of the risk
factor can be satisfied by defining a break-even on the third period only as shown on Figure 8,
$$ (1 + BEI(t,T,L))^{T-t-L} = \frac{F_{CPI}(t,T-L)}{CPI(t)} = \frac{S(t,T-L)}{S(t,t)}\, (1 + \overline{BEI}(t,T,L))^{T-t-L}. \qquad (26) $$
15 The defects could be corrected if we could observe the price of an overnight inflation swap or a one-day linker.

This break-even cannot be observed since the current CPI is unknown at t. Because of the high
predictability of CPI(t), we can define the fully adjusted break-even through a forecast CPI*(t) as
$$ (1 + BEI(t,T,L))^{T-t-L} = \frac{F_{CPI}(t,T-L)}{CPI^*(t)}. \qquad (27) $$
Various strategies can be applied so as to obtain CPI ∗(t). Notice that the forecasting model can be
designed under the physical or the risk neutral probability measure since the risk premium should be
essentially zero over such a short horizon.16 Regression-based strategies on log CPI could be employed.17 A
conservative approach can also be designed, for instance, by assuming that the past year’s inflation is
the best forecast of the increase in the CPI from the last published value. We apply this strategy in the
following figures and data.
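
A sketch of (27) under the conservative forecast described above, growing the last published CPI at the past year's inflation rate; all numerical inputs are illustrative.

```python
# Sketch of equation (27) with the conservative forecast used in the text:
# grow the last published CPI at the past year's inflation rate to proxy the
# unobserved CPI*(t). All numerical inputs below are illustrative.
def forecast_current_cpi(last_cpi, past_year_inflation, months_since_ref):
    return last_cpi * (1 + past_year_inflation) ** (months_since_ref / 12.0)

def fully_adjusted_bei(forward_cpi_TmL, cpi_star, horizon_minus_lag):
    """horizon_minus_lag is T - t - L in years."""
    return (forward_cpi_TmL / cpi_star) ** (1.0 / horizon_minus_lag) - 1.0

cpi_star = forecast_current_cpi(104.19, 0.0169, months_since_ref=2)
print(fully_adjusted_bei(109.5, cpi_star, horizon_minus_lag=1.75))
```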

5.2.3 The adjustments in practice

We highlight the differences in the standard and the fully adjusted break-evens by looking at the
European (HICPx) and the US inflation swap markets. Figure 9 presents the term structure of adjusted
break-evens extracted from market quotes. As expected, differences are significant at the short end.
On the one hand, the standard BEI smooths expectations of future inflation with realized inflation
over the indexation lag period. When realized inflation has been higher than expected, it tends to lower
the true expectations of future inflation, and vice versa. For instance, given that the indexation lag on
the US inflation swaps is three months, the US standard curve is influenced by the realized inflation
over August. August inflation was high (3.36%) and is likely to bias the standard BEI curve, which
displays a short-term value of 2.38%. On the other hand, fully adjusted BEI can suffer from bad
predictions of the current index value CPI*(t). Our conservative forecasting strategy slightly
underestimated the August to November inflation with an estimated rate of 2.5%, while ex post
we observed a 3.03% inflation rate. We nevertheless observe here that the implied short-term
break-even—about 2.92%—is more in line with the last inflation realizations. Put simply, fully
adjusted break-evens react more quickly to realized inflation.18

Figure 10 shows implications of the various adjustments on volatility estimates. Most inflation swap
markets offer a decreasing volatility with the break-even horizon, though the US market is an
16 With the development of inflation markets, liquid futures on inflation could for instance be used.
17
Prices of futures on energy and commodities, money market rates, etc. are potential forecasting variables.
18 Let us point out that even though we do not provide details here, several subtle issues were taken into account. First,

the market conventions for the indexation method differ between the European and US markets. Both markets use a three-
month lag but the US market uses a linear interpolation while the European swaps are indexed between index reference
dates directly. Second, data and break-evens have to be interpolated. This has been done through smoothed splines applied
on seasonally adjusted break-evens and forward break-evens.

Figure 9
Adjusted break-even term structures
Forward standard BEI and fully adjusted BEI term structures from the HICPx and the US CPI inflation
swaps. All curves are seasonally adjusted. Data as of 05 November 2007

[Figure: top panel, EU BEI swap (%); bottom panel, US BEI swap (%); standard seasonally adjusted and fully adjusted break-even term structures over maturities of 5 to 30 years.]

exception. Seasonality effects are presented for the standard break-even curves only. The figure exhibits the
benefits of adjusting break-evens for seasonality by removing meaningless waves. From previous
comments about differences between the two adjusted BEI, we could expect that the standard BEI
methodology tends to underestimate short-term volatility. This is confirmed in Figure 10.

We further check the adequacy of the break-evens with classic assumptions of risk models. It is for
instance standard to assume that risk factors follow a normal—log-normal for a positive
variable—distribution or a t-distribution. Looking at the evolution of standard break-evens especially
casts doubt on such an assumption (see Figure 11). The data displays different regimes with several
jumps. We performed a Jarque-Bera test, a goodness-of-fit test for the normal distribution. For
standard BEI, the null hypothesis of normality is rejected at the 5% significance level for all
maturities up to ten years. On the contrary, for the fully adjusted break-evens,
the same test cannot reject the null hypothesis for all maturities above two years. Given that there is
only one market point below—the one year point—this can be interpreted as a good signal and
advocates for the use of fully adjusted BEI.

Figure 10
Volatility (annualized) across the term structure of BEI from HICPx inflation swaps
Volatility computed using a decay factor of 0.94. “U” indicates a non-seasonally adjusted curve; “SA”
stands for seasonally adjusted. Data as of 05 November 2007
[Figure: EU BEI swap volatility (%) against break-even maturities of 1 to 10 years, comparing the standard seasonally adjusted, standard unadjusted and fully adjusted curves.]

Figure 11
Short- and mid-term BEI from HICPx inflation swaps

[Figure: top panel, 1-year EU BEI swap (%); bottom panel, 5-year EU BEI swap (%); standard and fully adjusted seasonally adjusted series from February 2006 to October 2007.]

Figure 12
Inflation swaps versus ILB standard BEI term structures
Forward standard BEI term structures from the HICPx and the US CPI inflation swaps. All data
seasonally adjusted. European curve derived using French OATe only. Data as of 05 November 2007

[Figure: top panel, EU BEI (%) over maturities of 5 to 40 years; bottom panel, US BEI (%) over maturities of 5 to 30 years; each comparing ILB-implied and swap-implied standard seasonally adjusted break-evens.]

5.3 ILB, inflation swaps and nominal rates

Up to now we presented outputs from inflation swaps only. Deriving break-even data from linkers is
more challenging. Interpolation issues have to be handled with care since linkers are mostly
coupon-paying bonds. Interactions with the nominal treasury curve and the way it is constructed
magnify the problems that a classic bootstrap procedure generates.19 Figure 12 compares the US and
European term structures implied by inflation swaps and inflation-linked bonds. The inflation-linked
bond markets typically offer long maturity instruments—up to 40 years for the European market—as
issuers target potential buyers of long-term liability hedges. Inflation swap markets are more active on
the short-term and mid-term so as to open inflation trading to other market participants. In addition,

19
The BEI curves derived from linkers and treasury bonds presented here have been obtained through an optimization
algorithm. While methodological details are not the purpose of this paper, we underline that the whole methodology is
available in the RiskMetrics Group Technical Notes.

the structures that financial intermediaries need to create for issuing inflation swaps involve
short-term nominal interest rates. As discussed earlier, we can observe that inflation swaps trade
above inflation implied by linkers. On the liquid European market, this premium tightened and is now
contained within a 30bp range. The premium seems to be higher at the short end, but this has to be
balanced against the fact that the linkers carry no information there: the shortest available maturity on
the French OATe market is about five years. The premium in the US is higher, at about 50bp, except at the
short end, which presents an anomaly with a negative premium. Preliminary analysis suggests that
the premium is quite volatile, and that a basis risk could be monitored.

6 Conclusion

For many years, fixed income traders used to think in terms of break-evens by applying the famous
Fisher relationship $BEI = \frac{1+Y_N}{1+Y_R} - 1 \approx Y_N - Y_R$ to the yields-to-maturity of adjacent government

nominal bonds and linkers. Doing so, they relied on the real rate pricing framework which allows us
to treat linkers in the same fashion as classic bonds. Setting theoretical foundations on the concept of
break-even inflation, we can move away from the real rate framework and define pricing models
depending on nominal and break-even rates only. Our investigations suggest that fully adjusted
break-evens are particularly adapted in this context. We can then decompose any portfolio containing
inflation-linked assets into nominal interest risk and inflation risk.

Of course, the Fisher relationship still applies over the whole term structure of interest rates. It tells us
that over the last years, nominal rates have mainly been driven by fluctuations in real rates while
break-evens remained stable: see Figure 13. However, while economists fear rising inflation, we are
now in a much better situation to identify changes in break-evens and hedge our inflation exposures.

References
Buraschi, A. and A. Jiltsov (2005). Inflation risk premia and the expectations hypothesis. Journal
of Financial Economics 75, 429–490.
Hördahl, P. and O. Tristani (2007). Inflation risk premia in the term structure of interest rates. BIS
Working Papers (228).
Shishkin, J., A. H. Young, and J. C. Musgrave (1967). The X-11 variant of the census method II
seasonal adjustment program. Technical Paper No. 15, U.S. Department of Commerce, Bureau
of Economic Analysis.

Figure 13
Decomposition of Euro nominal swaps
Decomposition of the ten-year nominal swap rate through the Fisher equation into a ten-year real rate
and zero-coupon break-even inflation.

[Figure: EU ten-year rates (%) from December 2006 to October 2007: nominal swap rate, standard seasonally adjusted BEI and seasonally adjusted real rate.]
Extensions of the Merger Arbitrage Risk Model
Stéphane Daul
RiskMetrics Group
stephane.daul@riskmetrics.com

A traditional VaR approach is not suitable to assess the risk of merger arbitrage hedge funds. We
recently proposed a simple two- or three-state model that captures the risk characteristics of the
deals in which merger arbitrage funds invest. Here, we refine the model, and demonstrate that it
captures merger and acquisition risk characteristics using over 4000 historical deals. We then
measure the risk of a realistic sample portfolio. The risk measures that we obtain are consistent
with those of actual hedge funds. Finally, we present a statistical model for the probability of
success and show that we beat the market in an out-of-sample study, suggesting that there is a
potential “alpha” for merger arbitrage hedge funds.

1 Introduction

The merger arbitrage strategy consists of capturing the spread between the market and bid prices that
occurs when a merger or acquisition is announced. There are two main types of mergers: cash
mergers and stock mergers. In a cash merger, the acquirer offers to exchange cash for the target
company’s equity. In a stock merger, the acquirer offers its common stock to the target in lieu of cash.

Let us consider a cash merger in more detail. Company A decides to acquire Company B, for example
for a vertical synergy (B is a supplier of A). Company A announces that they offer a given price for
each share of B. The price of stock B will immediately jump to (almost) that level. However, the
transaction typically will not be effective for a number of months, as it is subject to regulatory
clearance, shareholder approval, and other matters. During the interim, the stock price of B actually
trades at a discount with respect to the offer price, since there is a risk that the deal fails. Usually, the
discount decreases as the effective date approaches and vanishes at the effective date.

In a stock merger, company A offers to exchange a fixed number of its shares for each share of B. The
stock price of B trades at a discount with respect to the share price of A (rescaled by the exchange
ratio) as long as the deal is not closed.

With a cash merger, the arbitrageur simply buys the target company’s stock. As mentioned above, the
target’s stock sells at a discount to the payment promised, and profits can be made by buying the

Figure 1
Cash deals. Share price of target (thick line) and bid offer (dotted line)

[Figure: left panel, LabOne Inc. share price; right panel, InfoUSA Inc. share price; each plotted against date from mid-2005 to late 2005, with the bid offer shown as a dotted line.]

target’s stock and holding it until merger consummation. At that time, the arbitrageur sells the target’s
common stock to the acquiring firm for the offer price.

For example, on 8 August 2005, Quest Diagnostic announced that it was offering $43.90 in cash for
each publicly held share of LabOne Inc. The left panel of Figure 1 shows the LabOne share price. It
can be seen that the shares closed at $42.82 on 23 August 2005. This represents a 2.5% discount with
respect to the bid price. The deal closed successfully on 1 November 2005 (just over two months after
the announcement), generating an annualized return of 10.9% for the arbitrageur.

In a stock merger, the arbitrageur sells short the acquiring firm’s stock in addition to buying the
target’s stock. The primary source of profit is the difference between the price obtained from the short
sale of the acquirer’s stock and the price paid for the target’s stock.

For example, on 20 December 2005, Seagate Technology announced that it would acquire Maxtor
Corp. The terms of the acquisition included a fixed share exchange ratio of 0.37 share of Seagate
Technology for every Maxtor share. Figure 2 shows the movement of both the acquirer share price
and the target share price. On December 21, Maxtor shares closed at $6.90 and Seagate at $20.21
yielding a $0.58 merger spread. The deal was completed successfully on 22 May 2006.

More complicated deal structures involving preferred stock, warrants, or collars are common. From
the arbitrageur’s perspective, the important feature of all these deal structures is that returns depend on

Figure 2
Equity deal. Share prices of Maxtor (thick line) and Seagate Technology (dotted line)
Seagate Technology share prices are rescaled by the exchange ratio.

[Figure: Maxtor and rescaled Seagate Technology share prices from October 2005 to May 2006.]

mergers being successfully completed. Thus the primary risk borne by the arbitrageur is that of deal
failure. For example, on 13 June 2005, Vin Gupta & Co LLC announced that it was offering $11.75 in
cash for each share of infoUSA Inc. In the right panel of Figure 1, we see that after the
announcement, the share price of infoUSA jumped to that level. The offer was withdrawn, however,
on 24 August 2005, and the share price fell to a similar pre-announcement level.

A recent survey of 21 merger arbitrageurs (Moore, Lai, and Oppenheimer 2006) found that they invest
mainly in announced transactions with a minimum size of $100 million and use leverage to some
extent. They gain relevant information using outside consultants and get involved in deals within a
couple of days after the transaction is announced. They unwind their positions slowly in cases where
the deal is canceled, minimizing liquidity issues. Their portfolios consist, on average, of 36 positions.

Finally, from Figure 3, we clearly see that the volatility of the share price before and after the
announcement is very different. Measuring the risk with a traditional VaR approach in terms of
historical volatility is surely wrong. Thus arbitrageurs typically control their risk by setting position
limits and by diversifying industry and country exposures.

We recently have developed a risk model suitable for a VaR approach that captures the characteristics
of these merger arbitrage deals (Daul 2007). In this article, we will refine this model to better describe
equity deals and also study in more detail the probability of deal success. The model will then be
tested on 4000+ worldwide deals and also compared to real hedge funds.

Figure 3
Stock price of LabOne Inc. The large dot refers to the announcement date.


2 Risk Model

We consider only pure cash or equity deals, and introduce the following notation (see Figure 4):

St is the stock price of the target firm at time t.

t0 is the announcement date.

Λ is the deal length.

Kt is the bid offer per share at time t.

For cash deals, the bid offer is typically fixed and known at announcement date, Kt = Kt0 . For equity
deals, the bid offer is the acquirer stock price At times the deal conversion ratio ρ, Kt = ρAt . This
difference will not affect our model as the main hypothesis applies when the deal is withdrawn.
Notice further that for cash deals, the bid offer can also change over time, for example if the offer is
sweetened or if a second bidder enters the game (Daul 2007).

The announcement date t0 is evidently fixed. The deal length Λ can fluctuate and is modeled as a
random variable following a distribution

F(t) = P(Λ ≤ t). (1)



Figure 4
Definition of parameters

[Figure: target share price path with the deal length Λ, bid offer K, stock price St and announcement date t0 marked.]

We will consider a model conditioned on Λ, where at time t0 + Λ (the effective date), we know if the
deal is completed (success) or withdrawn (failure).

To model this event, we introduce the binomial indicator C. With probability π, we have C = 1,
indicating deal success, and with probability 1 − π, we have C = 0, indicating deal failure. In case of
success, the stock price of the target reaches its bid offer, while when the deal breaks we have to make
further assumptions. This will consist of our main hypothesis: we model the level to which the stock
price jumps as a virtual stock price S̃t . Hence the stock price at the effective date is
$$ S_{t_0+\Lambda} = \begin{cases} K_{t_0+\Lambda} & \text{if } C = 1, \\ \tilde{S}_{t_0+\Lambda} & \text{if } C = 0. \end{cases} \qquad (2) $$

Since the withdrawal might be considered as negative information, the virtual stock price is subject to
a random shock J at time t0 + Λ. An illustration of this virtual stock price is shown in Figure 5. The
black line is the real stock price for a withdrawn deal, and the dotted blue line is a virtual path that the
stock price could have taken if no deal had been put in place.

The virtual stock price follows a simple jump-diffusion process

$$ d\tilde{S}_t = \mu \tilde{S}_t\, dt + \sigma \tilde{S}_t\, dW_t - J \tilde{S}_t\, dN_t, \qquad (3) $$

where µ is the drift (set to zero afterwards), σ is the volatility of the price before announcement, J is a
positive random shock following an exponential distribution with parameter λcash for cash deals and

Figure 5
Virtual stock price

[Figure: share price versus date for a withdrawn deal, with the virtual price path and the downward jump −J S̃t at the effective date marked.]

λequity for equity deals, and Nt is a point process taking values

$$ N(t) = \begin{cases} 1 & \text{if } t \geq t_0 + \Lambda, \\ 0 & \text{if } t < t_0 + \Lambda. \end{cases} \qquad (4) $$

Finally, the initial condition is

$$ \tilde{S}_{t_0} = S_{t_0}. \qquad (5) $$

We can easily integrate this process and get for t < t0 + Λ,

$$ \tilde{S}_t = S_{t_0} e^{\Delta Z} \qquad (6) $$

where $\Delta Z$ follows a normal distribution with mean $(\mu - \frac{1}{2}\sigma^2)(t - t_0)$ and standard deviation $\sigma\sqrt{t - t_0}$.
For $t = t_0 + \Lambda$ we get

$$ \tilde{S}_{t_0+\Lambda} = S_{t_0} e^{\Delta Z} (1 - J). \qquad (7) $$
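
The model above lends itself to direct simulation. The sketch below draws the terminal target price of a single deal conditional on its length, following equations (2), (6) and (7); the deal inputs are illustrative, and the jump intensities (zero for cash deals, 0.2 for equity deals) anticipate the estimates of Section 3.

```python
# Sketch of the terminal stock price model, equations (2), (6) and (7):
# conditional on the deal length and the completion indicator C, the target
# either converges to the bid or jumps to the shocked virtual price.
import numpy as np

rng = np.random.default_rng(0)

def terminal_price(S0, K, sigma_daily, deal_length_days, prob_success,
                   lam=0.0, cash_deal=True):
    completed = rng.random() < prob_success
    if completed:
        return K, True
    # virtual price: drift mu set to zero, so dZ has mean -sigma^2/2 * Lambda
    dZ = rng.normal(-0.5 * sigma_daily**2 * deal_length_days,
                    sigma_daily * np.sqrt(deal_length_days))
    J = 0.0 if cash_deal else rng.exponential(lam)   # shock with mean lam
    return S0 * np.exp(dZ) * (1.0 - J), False

price, success = terminal_price(S0=42.82, K=43.90, sigma_daily=0.015,
                                deal_length_days=70, prob_success=0.86)
print(price, success)
```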



3 Parameters estimation and model validation

3.1 Virtual stock price

The parameters of the model are estimated using historical information on deals. The transaction
details (such as announcement date, effective date, type of deal, and so forth) are obtained from
Thomson One Banker. We consider pure cash or equity deals between public companies from 1996 to
2006 worldwide where the target company offered value is over $100 million. We consider those
deals for which we can also obtain stock prices from DataMetrics.

The daily drift µ is set to zero, and the ex-ante deal daily volatility is estimated using one year of daily
returns, equally weighted.

The intensity parameters λcash and λequity of the shock are obtained by moment matching. Conditional
on deal failure, the expected value of the stock price is
$$ E\!\left[ S_{t_0+\Lambda} \,|\, C = 0 \right] = S_{t_0}\, e^{(\mu-\sigma^2/2)\Lambda} (1 - \lambda_\cdot). \qquad (8) $$
Assuming $\mu - \sigma^2/2 \approx 0$, we get
$$ E\!\left[ \frac{S_{t_0+\Lambda}}{S_{t_0}} \,\Big|\, C = 0 \right] = 1 - \lambda_\cdot. \qquad (9) $$
Using the 131 withdrawn cash deals in our database, we get λcash = −0.07 ± 0.06; using the 33
withdrawn equity deals, we get λequity = 0.2 ± 0.1. Hence we set
λcash = 0 and λequity = 0.2. (10)

3.2 Deal length

We model the deal length Λ with a Weibull distribution having parameters a and b,
$$ F(t) = 1 - e^{-(t/a)^b}. \qquad (11) $$
This distribution is assumed to be universal. Using 1075 realized deal lengths (measured in days), we
obtain the following boundaries at 95% level of confidence:
143 < â < 154 (12)
1.43 < b̂ < 1.56 (13)
This corresponds to an average deal length of
L̄ = 135 days. (14)
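
As a quick sanity check on these estimates, the mean of a Weibull distribution with scale a and shape b is a Γ(1 + 1/b), which for midpoint values of the intervals above lands close to the quoted 135-day average:

```python
# Check of the fitted Weibull deal-length distribution (11): with midpoint
# estimates a ~ 148 and b ~ 1.5, the implied mean a * Gamma(1 + 1/b) is close
# to the 135-day average quoted above.
from math import gamma
from scipy.stats import weibull_min

a, b = 148.0, 1.5
print(a * gamma(1 + 1 / b))                                  # ~134 days
sample = weibull_min.rvs(c=b, scale=a, size=100_000, random_state=1)
print(sample.mean())                                         # Monte Carlo check
```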

3.3 Test of the main hypothesis

As stated above, the main hypothesis is the “existence” of a virtual stock price that is reached only in
case of withdrawal. For cash deals, λcash = 0, meaning the stock prices after withdrawal should follow
a lognormal distribution, with volatility σi different for each deal i. Hence, the normalized residuals
$$ u_i = \frac{\log\!\left( S^i_{t_0+\Lambda_i} / S^i_{t_0} \right)}{\sigma_i \sqrt{\Lambda_i}} \qquad (15) $$
should follow a standard normal distribution. The p-value of a Kolmogorov-Smirnov test using the
131 withdrawn deals is 93%, implying that we cannot reject the main hypothesis.

For equity deals, λequity = 0.2, and the residuals defined as above do not follow a normal distribution.
Instead we study the residuals
$$ v_i = \frac{S^i_{t_0+\Lambda_i}}{S^i_{t_0}}. \qquad (16) $$
This should be distributed as
$$ e^{\Delta Z_i} (1 - J), \qquad (17) $$

where ∆Zi follows a normal distribution with parameter σi different for each deal. We set the
volatility equal to the average of the σi , and use Monte Carlo to obtain a sample distributed according
to (17). We then compare this sample to our 33 withdrawn deals using a two-sample
Kolmogorov-Smirnov test. The result is a p-value of 53%. Again we cannot reject the
hypothesis, confirming the validity of our model.
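
The two-sample comparison can be sketched as follows; the `observed` array stands in for the 33 withdrawn equity deal ratios, which are not reproduced here, so the test outcome is only illustrative.

```python
# Sketch of the two-sample Kolmogorov-Smirnov check for equity deals: compare
# observed ratios v_i against draws from exp(dZ)(1 - J) with J exponential
# (mean 0.2). `observed` is a placeholder for the 33 withdrawn equity deals.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(7)
sigma, avg_length, lam = 0.02, 135, 0.2        # daily vol, days, E[J]

dZ = rng.normal(-0.5 * sigma**2 * avg_length,
                sigma * np.sqrt(avg_length), size=100_000)
model_sample = np.exp(dZ) * (1 - rng.exponential(lam, size=100_000))

observed = rng.choice(model_sample, size=33, replace=False)  # placeholder data
print(ks_2samp(observed, model_sample))
```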

4 Risk measurement application

We want to measure the risk of a sample portfolio consisting of 30 pure cash deals pending end of
2006. All deals are described by

• the target company,

• the bid offer K,

• the date of announcement t0 ,

• the probability of success π,



Table 1
VaR using the merger arbitrage risk model and the traditional risk model
VaR level Merger Arb Model Traditional Equity Model
95% 1.37% 7.25%
99% 2.21% 10.24%

Table 2
Dispersion of historical VaRs for merger arbitrage hedge funds
VaR level 1st quartile median 3rd quartile
95% 0.81% 1.29% 1.68%
99% 2.17% 2.92% 4.90%

and are assumed independent. We set the probability of success π to the historical value of 86% (see
Section 5).

We forecast the P&L distribution of the portfolio at a risk horizon of one month using Monte-Carlo
simulations. For each deal, one iteration is as follows (a sketch of this simulation appears after the list):

1. Draw an effective date using the Weibull distribution.

2. If the risk horizon is subsequent to the effective date, draw a completion indicator. If the risk
horizon is before the effective date, the deal stays in place.

3. If the completion indicator indicates failure, draw a virtual stock price, and calculate the loss. If
the completion indicator indicates success, calculate the profit.
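The sketch below implements these steps for a single pending cash deal under illustrative parameter values (Weibull parameters in days, π = 86%, λ_cash = 0). It is a simplification that diffuses the virtual stock price from the current price over the time to withdrawal; it is not a full reproduction of the model of (Daul 2007).

```python
import numpy as np

def cash_deal_pnl(S0, K, sigma_daily, days_since_announcement,
                  horizon_days=21, a=148.0, b=1.5, pi=0.86, lam=0.0,
                  n_sims=100_000, seed=0):
    """Monte Carlo P&L of one pending cash deal over the risk horizon.
    S0: current target price, K: bid offer, sigma_daily: daily volatility,
    a, b: Weibull deal-length parameters (days), pi: success probability,
    lam: failure shock (0 for cash deals). All values are illustrative."""
    rng = np.random.default_rng(seed)
    pnl = np.zeros(n_sims)

    # 1. Draw the total deal length (in days from the announcement date).
    length = a * rng.weibull(b, size=n_sims)
    remaining = np.maximum(length - days_since_announcement, 1.0)

    # 2. Deals whose effective date falls before the risk horizon are resolved;
    #    draw their completion indicator.
    resolved = remaining <= horizon_days
    success = rng.random(n_sims) < pi

    # 3a. Success: the target converges to the bid offer.
    pnl[resolved & success] = K - S0

    # 3b. Failure: draw a virtual stock price from the diffusion with mu = 0.
    fail = resolved & ~success
    T = remaining[fail]
    Z = rng.standard_normal(fail.sum())
    S_virtual = S0 * np.exp(-0.5 * sigma_daily**2 * T
                            + sigma_daily * np.sqrt(T) * Z) * (1.0 - lam)
    pnl[fail] = S_virtual - S0
    # Deals still pending at the horizon are left flat in this sketch.
    return pnl

pnl = cash_deal_pnl(S0=98.0, K=100.0, sigma_daily=0.015, days_since_announcement=30)
var_95 = -np.percentile(pnl, 5)
```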

Table 1 reports the VaRs at two different confidence levels obtained from the model, as well as VaRs
obtained from modeling the positions as simple equities following a lognormal distribution. We
notice that our model produces lower risk measures, consistent with our expectation.

For more evidence, we compare these monthly VaRs with the historical monthly VaRs of 41 merger
arbitrage hedge funds obtained from the HFR database. Table 2 shows that our model's results fall
within the dispersion of the hedge fund VaRs. We conclude that our new model consistently captures
the risk of a merger arbitrage hedge fund, and that the traditional model likely overstates risk.

5 Probability of success

In the risk measurement application above, the probability of success was unconditional on the deal,
and set to the historical estimate using all deals worldwide from 1996 to 2006
π_historical = N_success / N_total = 4176/4879 = 86%. (18)
A deal-specific probability of success can be inferred from the observed spread in the market as in
(Daul 2007),
π_implied = π(∆, S_{t0}, K, r_free). (19)

Alternately, we may fit an empirical model. We will use a logistic regression and assume that the
probability of success is a function of observable factors Xi as
π_empirical = 1 / (1 + e^{−∑_i b_i X_i}). (20)
If the factor sensitivity bi is positive, then larger Xi lead to higher probability of success, assuming
other factors are constant.

We consider the following factors:

• Target attitude:
X_i = 1 (Friendly), 0 (Neutral), −1 (Hostile)

• Premium: the relative extra amount the bidder offers. Its magnitude should be an indicator of
the acquirer’s interest.
X_i = (K − S_{t0}) / S_{t0}
• Multiple: the ratio of enterprise value (EV), calculated by adding the target’s debt to the deal
value, to EBITDA, an accounting measure of cash flows.
X_i = EV / EBITDA
• Industrial sector: By acquiring a target in the same industrial sector, the acquirer increases its
market share. This could influence deal success.
X_i = 1 (same sectors), 0 (different sectors)

Table 3
Logistic regression on 1322 deals
Factor bi p-value
Constant -1.09 0.04
Target attitude 1.79 0.00
Premium 0.76 0.17
Multiple 0.44 0.00
Industrial sectors 0.33 0.15
Relative size 0.44 0.00
Deal type 0.34 0.16
Trailing number of deals -0.29 0.19

• Relative size of the acquirer to the target:
X_i = log(Acquirer assets / EV).

• Deal type:
X_i = 1 (cash), 0 (equity)

• Trailing number of deals. The number of deals is cyclical; the position in that cycle should
influence deal completion.
X_i = (N_deals in the last 12 months) / (yearly average of N_deals)

We have 1322 realized deals (completed or withdrawn) with all factors available. Table 3 shows the
results obtained from the logistic regression. We see that attitude, multiple and relative size are very
relevant factors (very small p-values). The premium, having the target and the acquirer in the same
industrial sector, and the deal type are relevant to some extent. The negative sensitivity for the trailing
number of deals is counterintuitive: it appears that a large number of announced deals might catalyze
less convincing ones.
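As an illustration, such a regression could be fitted as follows; the DataFrame deals and its column names are hypothetical placeholders for the factors above and the completion indicator.

```python
import pandas as pd
import statsmodels.api as sm

# 'deals' is a hypothetical DataFrame with one row per realized deal:
# the factor columns below and 'success' equal to 1 (completed) or 0 (withdrawn).
factor_cols = ["attitude", "premium", "multiple", "same_sector",
               "relative_size", "deal_type", "trailing_deals"]

def fit_success_model(deals: pd.DataFrame):
    """Logistic regression (20) of the completion indicator on the factors."""
    X = sm.add_constant(deals[factor_cols])
    model = sm.Logit(deals["success"], X)
    result = model.fit(disp=0)
    return result.params, result.pvalues   # sensitivities b_i and their p-values
```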

To assess the predictive power of our model we perform an out-of-sample test, and compute the
so-called cumulative accuracy profile (CAP) curve. The model parameters are fit using the 873 (66%)
oldest deals. We then infer the probability of success for the remaining 449 (34%) deals. After sorting
the deals by their probability of success obtained with the statistical model (from less probable to
most probable), the CAP curve is calculated as the cumulative ratio of failures as a function of the
cumulative ratio of all deals.
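A minimal sketch of this CAP construction, assuming arrays of model probabilities and withdrawal indicators for the out-of-sample deals (hypothetical inputs):

```python
import numpy as np

def cap_curve(p_success, withdrawn):
    """CAP curve: deals sorted from least to most probable success;
    y is the cumulative share of withdrawn deals captured, x the
    cumulative share of all deals."""
    order = np.argsort(p_success)               # least probable success first
    w = np.asarray(withdrawn, dtype=float)[order]
    x = np.arange(1, len(w) + 1) / len(w)       # cumulative ratio of all deals
    y = np.cumsum(w) / w.sum()                  # cumulative ratio of failures
    return x, y
```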

Figure 6
CAP curve for the out-of-sample test (OOS) and the implied probability of success
[Figure: CAP curves for the out-of-sample statistical model (OOS) and for the market-implied probability of success; both axes run from 0 to 1.]

The 449 out-of-sample deals have an overall failure ratio of 10.2%. If the model were perfect, then
the first 10.2% of deals as sorted by our model would have contained all of the failed deals, and we
would have CAP(x) = 1 for x ≥ 10.2%. If the model were useless, the ordering would be random, and
we would have CAP(x) = x. In Figure 6 we show the result for the out-of-sample test, the
market-implied probability of success and the two limiting cases. We clearly see that our model beats
the market, suggesting that there is a potential “alpha” for merger arbitrage hedge funds. Furthermore,
looking closer at the lower left corner, we notice that the CAP curve for the statistical model follows
the perfect limiting case up to about 5%. This means that our statistical model ranks the first half of
the withdrawn deals perfectly as the worst ones.

6 Conclusion

The specifics of merger arbitrage deals can be captured by introducing a binomial completion indicator
and a virtual stock price modeled as a simple jump-diffusion process. This model has been validated
using a large set of deals. A merger arbitrage hedge fund would benefit from using this model to
measure the risk of its portfolio in a VaR framework and/or to perform stress tests using, for example,
the probability of deal success.

Finally, we have developed a statistical model for the probability of success and showed in an
out-of-sample analysis that its forecasting power is superior to that of the market-implied probability.

References
Daul, S. (2007). Merger arbitrage risk model. RiskMetrics Journal 7(1), 129–141.
Moore, K. M., G. C. Lai, and H. R. Oppenheimer (2006). The behavior of risk arbitrageurs in
mergers and acquisitions. The Journal of Alternative Investments, Summer.
Measuring the Quality of Hedge Fund Data
Daniel Straumann
RiskMetrics Group
daniel.straumann@riskmetrics.com

This paper discusses and investigates the quality of hedge fund databases. The accuracy of hedge
fund return data is taken for granted in most empirical studies. We show, however, that hedge fund
return time series often exhibit peculiar and most likely “man-made” patterns, which are worth
recognizing. We develop a statistical testing methodology which can detect these patterns.
Based on these tests, we devise a data quality score for rating hedge funds and, more generally,
hedge fund databases. In an empirical study we show how this data quality score can be used
when exploring a hedge fund database. Thereby we confirm, by different means, many of the
insights of (Liang 2003) concerning the quality of hedge fund return data. In a last step
we try to estimate the impact of imperfect data on performance measurement by defining a “data
quality bias”. The main goals of this paper are to increase awareness of the practical
limitations of hedge fund data and to suggest a tool for quantifying financial data quality.

1 Introduction

The past years have seen a rapid growth of the hedge fund industry and an enormous increase of the
assets that this investor segment controls. Originally accessible only to institutional investors or very
wealthy individuals, hedge funds are nowadays much more widely available to the broader public. In many
countries, even retail investors can place money into hedge funds.

There are several reasons why hedge funds have been so successful in attracting new money. Hedge
fund risk-return profiles are perceived as superior to those of classical long-only mutual funds. Hedge funds are
flexible and basically unregulated. In contrast to mutual funds, any kind of financial investment is
permitted. Hedge funds may for instance go short, invest the capital into futures, derivative securities,
commodities and other asset classes that are not accessible for mutual funds. Furthermore, they can
borrow money in order to create leverage on their portfolio. Many hedge funds seek to achieve
absolute returns. Therefore, they have no traditional benchmark such as a stock or bond index or a
blend of indices. Mutual funds, however, are required to more or less track a benchmark and are
therefore much more exposed to bearish market conditions. And indeed, a majority of hedge funds
did convincingly well in the aftermath of the burst of the technology bubble and the September 11
terrorist attacks.

Parallel to the success and the maturing of the hedge fund industry, academics were beginning to take
an interest in how hedge fund managers achieve their profits and whether successful track records are
due to skill or just luck. The questions of sources of hedge fund returns and performance persistence
have been addressed in many empirical studies, and the literature on these topics is still growing. Needless
to say, this research relies heavily on hedge fund return data and statistical methods to
analyze them. Concerning the data, several providers offer hedge fund databases. These databases
differ substantially in coverage of funds and in the information beyond the return time series. The
hedge fund database business is rather fragmented. There does not seem to be a “gold standard”
for hedge fund returns. As a matter of fact, no database exists that provides full coverage. The
diversity of hedge fund databases used in articles can explain dissimilar quantitative results.

While increasingly complicated stochastic models are being used for the description of hedge fund
returns, there are certain limitations on the data side. These limitations are only marginally discussed,
if not neglected. The purpose of this paper is not to provide refinements of models or to present yet
another large empirical study. Instead, we are concerned about what forms the backbone of hedge
fund research: hedge fund data. Our focus lies on data quality aspects, a topic that is somewhat
disregarded in the literature. In our experience, hedge fund return data is not always beyond
doubt. To assess the plausibility of hedge fund return data, we propose an objective and
mathematically sound method. We then use this method to analyze a hedge fund database. We also
quantify the impact which imperfect data may have.

1.1 Issues with hedge fund data

We have recently examined hedge fund databases of several providers, and have come to doubt the
quality of the return data. In this paper, we work exclusively with the Barclay1 database, but similar
issues were identified in all the databases we examined.

To illustrate the aforementioned issues, we consider the following example of an active onshore
long-short hedge fund. Its return and assets under management time series are displayed in Table 1.

The following peculiarities are striking:

• The returns of the year 2000 are repeated in 2001. This is obviously a serious data error.
Interestingly, the assets under management do not show any recurring patterns.
1 Barclay Hedge is a company specialized in the field of hedge fund and managed futures performance measurement
and portfolio management; see www.barclayhedge.com. We benefited from the excellent support by Sol Waksman and his
client service team of Barclay Hedge. This is gratefully acknowledged.

Table 1
Monthly returns of an active long-short hedge fund, January 1999 through December 2002
“AUM” stands for assets under management (in $M). The full time series covers January 1991 through
September 2007. The symbol “+” signifies that the return value appears at least twice in the entire time
series. The boxes frame the blocks of recurring returns.

Date AUM Return (%) Date AUM Return (%)


Jan–1999 10.7 0.20 + Jan–2001 22.5 4.40 +
Feb–1999 10.7 6.50 Feb–2001 22.5 0.20 +
Mar–1999 15.5 3.70 + Mar–2001 34.0 0.00 +
Apr–1999 15.5 -6.30 Apr–2001 34.0 5.40 +
May–1999 15.5 -0.90 + May–2001 34.0 6.40 +
Jun–1999 23.8 2.90 + Jun–2001 43.0 0.40 +
Jul–1999 23.8 0.10 + Jul–2001 43.0 1.10 +
Aug–1999 23.8 4.10 + Aug–2001 43.0 -2.60 +
Sep–1999 28.1 -0.80 + Sep–2001 48.0 -8.60 +
Oct–1999 28.1 -2.10 + Oct–2001 48.0 4.00 +
Nov–1999 28.1 -3.00 Nov–2001 48.0 3.50 +
Dec–1999 26.8 5.70 Dec–2001 48.0 3.10 +
Jan–2000 26.8 4.40 + Jan–2002 48.0 0.90 +
Feb–2000 26.8 0.20 + Feb–2002 50.0 -0.90 +
Mar–2000 36.1 0.00 + Mar–2002 50.0 3.40 +
Apr–2000 36.1 5.40 + Apr–2002 50.0 1.80 +
May–2000 36.1 6.40 + May–2002 51.5 1.60 +
Jun–2000 28.0 0.40 + Jun–2002 51.0 -0.90 +
Jul–2000 28.0 1.10 + Jul–2002 50.0 -5.70
Aug–2000 28.0 -2.60 + Aug–2002 51.0 -0.80 +
Sep–2000 25.0 -8.60 + Sep-2002 51.0 -2.30
Oct–2000 25.0 4.00 + Oct–2002 53.0 -0.60 +
Nov–2000 25.0 3.50 + Nov–2002 60.0 3.30 +
Dec–2000 22.5 3.10 + Dec–2002 62.0 1.10 +

• The returns appear to be rounded.

• Many return values appear more than once in the time series (depicted by the symbol “+”).
Note that this is partially caused by the rounding.

• Zero returns appear (for instance in March 2000). It is rather unlikely that a fund has a return
exactly equal to zero.

The recurrence of one year of return data is clearly a serious error. We admit that in the Barclay
database such extreme problems appear for a handful of funds only. Much more frequent is the
recurrence of blocks of length two or three. For instance, the January to March returns of a certain
year would recur in one of the subsequent years. We picked the long-short fund because it provides an
exemplary illustration of all types of problems that we have encountered. Once again, it must be
stressed that such irregularities are by no means restricted to Barclay, but were evident in all databases
we examined.

An important question is why the data quality is so poor for this particular fund. One argument could
be that the fund is exposed to illiquid markets or instruments, which would make an accurate
valuation difficult. In the Barclay database, we find the following description of the fund’s investment
objectives:

Long/Short stocks and other securities and instruments traded in public markets.
Emphasis is to manage the portfolio with near zero beta. Focus on companies with market
capitalization between 200 mm and 2.5 bb. Uses quantitative screening and fundamental
analysis to identify undervalued equities with strong cash flow to purchase. The short
strategy uses proprietary quantitative screens and fundamental analysis to identify short
opportunities with a non-price based catalyst, potential for negative earnings surprise, and
overvaluation. Overvaluation is a necessary but not sufficient condition to be short.

This description indicates that the fund’s positions are probably not particularly illiquid, and that it
should be feasible to supply an exact valuation once per month. It appears that either the fund does have
valuation difficulties, or at a minimum it does not report the exact valuations to Barclay.

Since the reported monthly performance numbers appear unreliable, one might postulate that the
long-short fund in question does not, in general, properly value its assets every month and that
therefore only crude return estimates are reported. Two arguments speak against this. First, the
long-short fund is audited every December, a piece of information provided by the Barclay database that is
entirely plausible in view of the fund size. Therefore, one would expect that in December some sort
of equalization is applied, meaning that the December return is determined in such a way that the
actual one-year return is matched. This return figure would most likely be a number with two digits
after the decimal. However, for the fund in question we do not see any numbers with more than one
digit after the decimal. Second, the long-short fund is open to new investments, with subscription
possible every month. This would imply that the fund is able to quantify the net asset value of its
holdings on a monthly basis.

Finally, the information reported to data providers is not audited by a third party and cannot be
thoroughly reviewed by the data provider due to the mass of funds that report. One also has to be
aware that hedge funds report voluntarily, and therefore the willingness of fund managers to revise
numbers in order to ensure data accuracy might be rather limited.

To conclude our reasoning, the most likely explanation for the questionable return history is a certain
negligence on the part of the fund when reporting to Barclay.

1.2 Goals and organization of the paper

The lesson from the previous example is that hedge fund return data can be problematic. While the
conclusions from empirical hedge fund research might be unaffected in qualitative terms, it is clear
that inaccurate data could have an impact when industry-wide numbers such as performance, Sharpe
ratio or alpha are calculated. In the past, much attention has been paid to data biases such as the
survivorship bias, which occurs when failed hedge funds are excluded from performance studies. The
survivorship bias generally leads to an overstatement of performance. The accuracy of return data
itself is, however, mostly taken for granted, and the impact of data quality on the analysis is rarely
questioned. This is in surprising contrast to the attention paid to the “classical” data biases. This
paper tries to fill that gap and to increase awareness of the practical limitations of hedge fund data.

We regard inaccurate data as another cause of performance misrepresentation. In this context we
would like to introduce a new term: the data quality bias. It is by no means our aim to excoriate
hedge fund data providers, which are reliant on hedge funds reporting accurately. The providers’
hands are tied when it comes to verification of the performance numbers. Perhaps this paper
will help prepare the ground for improving hedge fund data quality in the future.

The only way to assess the accuracy of hedge fund returns in a systematic and objective manner is
via quantification. We first devise tests that detect the kind of problems discussed above. The results
of these tests are then combined into a single number, which serves as a measure or score for how
plausible a hedge fund return time series is. A score of a group of funds is just the average score. We
will also call it the data quality score. Applying the data quality score to the Barclay hedge fund data, we
provide a small study whose results often have an intuitive explanation. It is important to be
aware that our score rates the quality of returns only. It disregards other aspects of data quality.2

A score for data quality is a powerful tool and can serve multiple purposes. It allows for instance
comparison of different groups of funds with respect to data accuracy. We mention that, if properly
adapted, the ideas and principles presented in this paper can be applied to any kind of financial data.
Besides identifying problematic samples, a data quality score helps one to monitor the improvement
of data quality over time, to confirm the effectiveness of data improvement measures, and to
differentiate between competing databases.

The paper is organized as follows. In a section on preliminaries, we provide a brief survey on the
hedge fund literature. In the subsequent section we introduce and discuss our data quality score. This
is then followed by an empirical study using the Barclay database. Finally we conclude.

2 Preliminaries

First we give a classification of the hedge fund literature. Such an overview helps to make the
connection between this paper and the literature. Second, we provide more details on hedge fund
databases and their biases. Lastly, we cite and summarize the literature on hedge fund data quality
that we are aware of.

2.1 Classification of hedge fund literature

As mentioned earlier, virtually any hedge fund research relies on return data. There are basically three
interrelated main streams of academic research, addressing the following matters:

Hedge fund performance persistence

Here the main question is whether the performance achieved by a fund relative to its peers3 is
consistent over time, or in other words whether the outperformers of a certain time period are likely to
remain outperformers for the next time period, and vice versa. Various methods have been applied,
using the hedge fund databases of several providers, to investigate whether hedge fund performance
persists. The answers are mixed, and it seems that the community has not yet reached a consensus. We
refer to (Eling 2007) for a comprehensive overview of the literature on hedge fund performance
persistence.

2 For a readable survey on data quality, we refer to (Karr, Sanil, and Banks 2006).
3 That is, other hedge funds which pursue a similar investment strategy.

Sources of hedge fund returns

We have used the terms “performance” and “track record” without being explicit. Comparisons and
rankings of hedge funds (or any investments) based on raw returns would not be sensible because one
would neglect the risks that have been taken in order to achieve these returns. Therefore risk-adjusted
performance measures should be used when ranking funds. The most classical measure is the Sharpe
ratio. However, since the statistical properties of hedge fund returns are often not in accordance with
the Gaussian law (skewness and fat tails), many researchers resort to a generalized Jensen’s alpha. Alpha
is the regression intercept of a multi-factor linear model, and (together with the mean-zero
idiosyncratic return) represents the skill of the manager, that is, what is unique about the manager’s
investment strategy. Building a factor model that contains all common driving factors (or sources of
hedge fund returns) is not trivial. Capturing hedge funds’ trading strategies through a linear model
requires the use of non-linear factors. There have been many proposals for hedge-fund-specific style
factors; see for instance (Hasanhodzic and Lo 2007).

Hedge fund return anomalies

This topic is probably closest to the main theme of this paper, and for this reason we elaborate a bit
more on it. While the question about economic sources of returns searches for the factors that
determine the performance of hedge funds, the anomalies research stream rather deals with what one
could call the fine structure of hedge fund return processes, or in other words, the peculiarities that
they exhibit.

Hedge fund returns are mostly not available at frequencies higher than monthly. There are several
reasons for this. Unlike mutual funds, hedge funds are privately organized investment vehicles and
often not subject to regulation; therefore there are no binding reporting standards. Moreover, there is
still secrecy around hedge funds. Managers are reluctant to disclose information on a daily basis, even
something as basic as realized returns. This is particularly the case for those that trade in illiquid
markets, because it is feared that disclosed information could be abused by competitors. From an
operational point of view, managers do not want to have the burden of daily subscriptions or
redemptions necessitating the calculation of daily returns because they want to have the freedom to
fully concentrate on investment operations.

A great deal of the anomalies literature is concerned with the serial correlation of hedge fund returns.
The occurrence of pronounced autocorrelations is remarkable because it seems to contradict the
efficient markets hypothesis. However, due to lock-up and redemption periods, it would hardly be
possible to take advantage of these autocorrelations. Getmansky, Lo and Makarov (2004) show that
serial correlations are most likely induced by return smoothing. These authors argue that the exposure
of the fund to illiquid assets or markets leads to return smoothing when a portfolio is valued. This also
explains why funds that invest in very liquid assets (such as CTAs4 and long-only hedge funds) rarely
show significant autocorrelations. Getmansky, Lo and Makarov (2004) propose the use of
autocorrelations as a proxy for a hedge fund’s illiquidity exposure. They moreover stress that the
naive estimator overstates the Sharpe ratio because it ignores the autocorrelation structure of hedge
fund returns.

Bollen and Pool (2006) go a step further and insinuate that some managers might smooth returns
artificially by underreporting gains and diminishing losses, a practice they call “conditional
smoothing”. The authors also devise a screen to detect funds that apply conditional smoothing. Such
screens could be used by investors or regulators as an early warning system. Conditional smoothing
does not necessarily imply purely fraudulent behavior of the managers. It can be partially explained
by the pressure that they face to live up to the widespread myth of hedge funds as
generators of absolute returns. However, history shows that many hedge fund fraud cases came along
with misrepresentations of returns and so it might be worthwhile to have a closer look at those funds
which appear to misrepresent returns.

In a subsequent paper (Bollen and Pool 2007), the same authors look at the distribution of hedge fund
returns and give evidence that it has a discontinuity at zero. Again, the explanation is the tendency to
avoid reporting negative returns. It is tempting for a hedge fund manager to report something like
0.02% for an actual return of, say -0.09%. If such practices are followed by a non-negligible number
of managers, a discontinuity at zero will occur. The authors test for other possible causes, but return
misrepresentation turns out to be most likely.

Similar in nature is a study by (Agarwal, Daniel, and Naik 2006), which shows that average hedge
fund returns are significantly higher during December than during the rest of the year. The analysis
indicates that the “December spike” is most likely related to the compensation schemes of hedge
funds. These tempt managers to inflate December returns in order to achieve a better year-end
performance, which in turn leads to higher compensation. An equalization is then made in the
subsequent year. Another piece of research giving evidence that hedge fund managers are driven by
the incentive structure is (Brown, Goetzmann, and Park 2001). There, it is shown that hedge funds that
perform well in the first six months of the year tend to reduce the volatility in the second half of the
year. It seems very hard to avoid such behavior other than by modifying the incentive systems.

4 CTA stands for commodity trading advisor and actually comes from legal terminology. The CTA strategy is also
referred to as managed futures. A CTA manager follows trends and invests in commodities, currencies, interest rates,
futures and other liquid assets.

2.2 Hedge fund databases and biases

Hedge fund data is marketed by several providers. Some are small vendors focusing on hedge fund
data alone, whereas others operate within a large data provider company that covers a variety of other
financial segments. Currently there are about twenty providers, many of which offer additional
services such as hedge fund index calculation.

The various hedge fund databases differ in coverage and in the information supplied besides pure
returns or assets under management. The differences with respect to coverage are considerable. The
estimated coverage of the largest databases is no more than 60% of all hedge funds.5 One reason for this
is that managers typically report to one or two databases, but hardly ever to all of them. Some
funds prefer not to report at all, particularly if they are not interested in attracting new investors. Some
providers specialize in collecting data on certain hedge fund segments and may even
exclude others.6 For all these reasons, there is no database yet which fully represents all hedge funds.

Indices constructed from a single database again share the problem of inadequate representation of the
hedge fund universe. This issue led EDHEC, a French risk and asset management research institution
supported by universities, to construct a family of indices of hedge fund indices. These indices are
meant to combine the information content of the various data provider indices in an optimal fashion;
see (Edhec-Risk Asset Management Research 2006).

Apart from inadequate representation, which leads to biased estimates of performance, there are other
data biases which play an important role. Numerous articles discuss these biases and provide
estimates of their magnitude.

Survivorship bias. This bias, mostly upward, is created when funds that have been liquidated, or
have stopped reporting, are removed from the database. Survivorship bias has also been
discussed in the mutual fund performance literature, but it is particularly pronounced for hedge
funds because their attrition rate is significantly higher than that of mutual funds. Nowadays,
providers are aware of this issue and make sure that collected data of defunct funds does not get
erased. Most hedge fund providers offer so-called graveyard databases, that is, databases
containing the “dead” funds.

5 See (Lhabitant 2004).
6 As an example, HFR excludes CTAs from their database.

Backfill bias. This bias occurs when hedge funds that join a database are allowed to report some of
their past return history. This again leads to overstatements of the historical performance of all
funds because most likely hedge funds will start reporting during a period of success. A simple
remedy to limit this bias is to record the date when the fund joined the database.

Self-reporting bias. Recall that hedge funds report voluntarily. There might be differences between
reporting and non-reporting funds, and these differences are difficult to quantify. An indirect way to assess them
is to look at funds of hedge funds. The performance of funds of hedge funds can serve as a proxy for the
performance of the “market portfolio” of hedge funds.

Another big difficulty leading to potentially distorted numbers is the style classification of hedge
funds. First, the style classification used by a provider is hardly ever perfect. Second, the style is
self-proclaimed by the manager. Third, the investment style pursued by a fund may change over time,
but the databases we know do not treat style as a time series item. A lot more could be said about
hedge fund databases and their biases; see the excellent survey given in (Lhabitant 2004).

2.3 Hedge fund data quality

Note that the biases presented above are connected solely to the way the providers collect and
manage data, and to the willingness of managers to report returns and other information. Most of
these studies take the accuracy of hedge fund returns for granted. We are aware of two papers raising
questions regarding this assumption. In (Liang 2000), differences between the HFR and TASS
databases are explored. The returns and NAVs of funds that appear in both databases are compared.
The returns coincide in only about 47% of the cases, and the NAVs in about 83% of the cases. The
second article (Liang 2003) finds that the average annual return discrepancy between identical hedge
funds in TASS and the US Offshore Fund Directory is as large as 0.97%. Liang also compares
onshore versus offshore equivalents of TASS and different snapshots of TASS. Furthermore he
identifies factors which influence return discrepancies by means of regressions. He finds that audited
funds have a much lower return discrepancy than non-audited funds. Moreover, funds listed on
exchanges, funds of funds, funds open to the public, funds invested in a single industrial sector and
funds with a low leverage have generally less return discrepancies than other funds. Similarly to us,
Liang questions the accuracy of hedge fund return data. He measures data quality in terms of return
discrepancies across databases. Liang does not, however, ascertain which of the two data sources is
more credible and therefore of higher quality.

Our paper has a similar scope, but we highlight two differences in our approach:

• We rate the quality of a single database and the funds therein in absolute terms. We do not
depend on comparisons across databases. In contrast, Liang quantifies data quality in relative
terms by looking at return discrepancies.

• We can assess the data quality of all funds since we do not rely on matching funds between
different databases. In contrast, Liang’s approach can only determine the data quality for the
intersection of funds in two databases.

3 A data quality score

In this section, we devise a quality score for fund return time series. Inspired by the patterns found in
the long-short hedge fund of the introduction, we first define five statistical tests for time series of
returns. For a fund, the quality score is the number of rejected tests. For a group of funds, the quality
score is the average of the fund scores. For illustrative purposes we finally compare the quality of
stock returns and fund of hedge fund returns.

3.1 Testing for patterns

As announced, we design five tests to detect patterns in return data. A test is rejected if the return data
exhibits the corresponding pattern. We suppose that the monthly returns are expressed as percentages
with two decimal places as in Table 1. We begin by describing the tests informally; a code sketch of the corresponding statistics follows the list:

1. For test T1 , the number z1 of returns exactly equal to zero is evaluated. If z1 is “too large”, T1 is
rejected.

2. Test T2 is based on the inverse z2 of the proportion of unique values in the time series. If z2 is
“too large” (or, equivalently, the proportion of unique values is too small), T2 is rejected.

3. Test T3 looks at runs of the time series. A run is a sequence of consecutive observations that are
identical. To give an example, (2.31, 2.31, 2.31) would be a run of length three. If the length z3
of the longest run is “too large”, T3 is rejected.

4. In test T4 the number z4 of different recurring blocks of length two is evaluated. A block is
considered as recurring if it reappears in the sequence without an overlap. For example,
consider the sequence

(1.25, 4.57, −2.08, 8.21, 8.21, 8.21, −0.55, 1.25, 4.57, −2.08, 6.42, 1.25, 4.57, −2.08).

The sequence contains two different recurring blocks of length two: (1.25, 4.57) and
(4.57, −2.08). Note that (8.21, 8.21) is not a recurring block because of the no overlap rule.
The test T4 is rejected if z4 is “too large”.

5. Test T5 is based on the sample distribution of the second digit after the decimal. If this
distribution is “unlikely”, T5 is rejected.
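The following sketch computes the raw statistics z1 through z4 for a single return series (z5 requires the simulated digit distribution and is treated below); it illustrates the definitions and is not the production implementation.

```python
import numpy as np

def test_statistics(returns):
    """Statistics z1-z4 behind tests T1-T4 for a monthly return series
    given in percent with two decimals (illustrative sketch)."""
    r = np.round(np.asarray(returns, dtype=float), 2)
    n = len(r)
    z1 = int(np.sum(r == 0.0))                     # T1: returns exactly equal to zero
    z2 = n / len(np.unique(r))                     # T2: inverse proportion of unique values
    # T3: length of the longest run of identical consecutive returns
    run = z3 = 1
    for i in range(1, n):
        run = run + 1 if r[i] == r[i - 1] else 1
        z3 = max(z3, run)
    # T4: number of distinct blocks of length two that recur without overlap
    blocks = [tuple(r[i:i + 2]) for i in range(n - 1)]
    z4 = 0
    for blk in set(blocks):
        idx = [i for i, b in enumerate(blocks) if b == blk]
        if any(j - i >= 2 for i in idx for j in idx if j > i):
            z4 += 1
    return z1, z2, z3, z4
```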

It is evident that there are overlaps between the five tests. Tests T2 and T5 check for concentration of
return values and rounding, whereas T3 and T4 are meant to uncover repetitions in the data.

So far we have been unspecific about the thresholds for rejecting the tests. The role of the thresholds
is to discriminate between patterns appearing just by chance and those that are caused by real
problems in the data. Fixed thresholds would not be useful, since the longer the time series, the more
likely it is that certain features such as recurring blocks occur by chance. Moreover, the volatility plays an
important role; funds with a very low volatility will feature a high concentration in certain return
values because the range of the data is limited.

We thus set the thresholds on a per time series basis. To this end, we suppose that monthly fund
returns are independent and identically distributed normal random variables rounded to two digits
after the decimal:
rt i.i.d. ∼ N̄ (µ, σ2 ), t = 1, . . . , n. (1)
Here the notation N̄ highlights that the normal random variables are rounded, and n denotes the length
of the return time series. Under the distributional assumption (1) we next compute for each test Ti the
probability that the corresponding test statistic Zi is equal to or larger than the actually observed zi :

pi = Pµ,σ2 ;n (Zi ≥ zi ). (2)

If this probability is small, it implies that we have observed an unlikely event and so the pattern can be
considered as significant. Note that pi is the p-value of the test Ti under the null hypothesis (1).

Instead of working with thresholds, we can equivalently use levels of significance and reject tests if
the p-values fall below these levels. We chose to take a common level of significance equal to 1%
because this makes all tests comparable. Summarizing, we have:


reject Ti ⇐⇒ pi < α = 1%. (3)
Now there are a couple of practical issues to consider. For computing the p-values, we replace the
unknown parameters µ and σ² in (2) by the sample mean µ̂ and the sample variance σ̂² of the
returns, respectively. The numerical values of pi are then obtained by Monte Carlo simulation.
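A sketch of this Monte Carlo p-value computation under hypothesis (1), reusing a function that returns one of the statistics sketched above (illustrative, with a modest number of simulations):

```python
import numpy as np

def mc_pvalue(stat_fn, z_obs, mu_hat, sigma_hat, n, n_mc=2000, seed=0):
    """Estimate p_i = P(Z_i >= z_i) under hypothesis (1): i.i.d. normal
    returns with the estimated mean and standard deviation, rounded to
    two decimals. stat_fn must return a single scalar statistic."""
    rng = np.random.default_rng(seed)
    exceed = 0
    for _ in range(n_mc):
        sim = np.round(rng.normal(mu_hat, sigma_hat, size=n), 2)
        if stat_fn(sim) >= z_obs:
            exceed += 1
    return exceed / n_mc
```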

We have yet to define the test statistic Z5 . We first determine, via Monte Carlo simulation, the
distribution of the second digit after the decimal under (1) with µ = µ̂ and σ² = σ̂². The probability
that this digit is equal to k is denoted by qk . We have found that for the range of volatilities σ ≥ 0.5
the digit is close to being equidistributed on {0, 1, . . ., 9}. For a sample of n returns, the number of
occurrences of k as the second digit after the decimal is denoted by nk . We define Z5 as the distance
between the sample distribution of the second digit and its distribution under (1). This distance is
measured through the classical χ² goodness-of-fit test statistic:
Z5 = ∑_{k=0}^{9} (n_k − n q_k)² / (n q_k). (4)
Note that we use Monte-Carlo simulation for the calculation of the p-values of T5 ; we do not resort to
a χ2 approximation of the distribution of Z5 .
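A sketch of the statistic (4); the vector q of digit probabilities under (1) is assumed to have been obtained beforehand by Monte Carlo simulation:

```python
import numpy as np

def z5_statistic(returns, q):
    """Chi-square distance (4) between the sample distribution of the second
    digit after the decimal and its distribution q under hypothesis (1)."""
    r = np.round(np.abs(np.asarray(returns, dtype=float)), 2)
    digits = np.round(r * 100).astype(int) % 10     # second digit after the decimal
    n = len(r)
    nk = np.bincount(digits, minlength=10)
    q = np.asarray(q, dtype=float)
    return float(np.sum((nk - n * q) ** 2 / (n * q)))
```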

To conclude the definition of the tests, a couple of remarks are warranted. From visual inspection of
return time series we have developed a sense of imperfect data. We have concentrated on patterns in
sequences of return numbers. This resulted in the tests T1 -T5 . Of course these tests are not necessarily
exhaustive, since there might be other patterns of which we are not aware. Although we believe that
testing for erroneous extreme returns (faulty outliers) would be highly relevant, the only feasible way of
doing that would be via a comparison of the identical fund across various databases, which we did not
pursue.

It is evident that there is a speculative element in our method. We can merely conjecture that a certain
return time series is inaccurate. The only way to validate our approach would be to call up all the
funds with problematic data. It should be clear that this is beyond the scope of this paper.

An assumption which might lead to objections is the hypothesis (1) of i.i.d. normally distributed
percentage returns (rounded to two decimal places). This model is needed to
estimate the p-values, and we are aware that it is crude. It could easily be replaced by a model
allowing for skewness and heavy tails. Note that our tests are of a discrete nature and rather insensitive
to the return distribution. For this reason we conjecture that the approach is to some degree robust
with respect to the chosen model for the return distribution. At least on a qualitative level we expect
that the distributional assumptions have little impact on the results of Section 4.

Another choice we have made is the level of significance α = 1%, which is somewhat arbitrary, as
with any statistical testing problem. We have taken a rather low α because we wish to be prudent
about rejecting funds and want to keep the Type I error7 low. Another reason for taking a low α is the
increase of the Type I error if the five tests are applied jointly. The Type I error of the five tests applied
jointly is smaller than or equal to 5%. This is a consequence of the Bonferroni inequality:
5
P µ,σ2 ;n ( at least one Ti is rejected) ≤ ∑ P µ,σ2;n (Ti rejected) = 5α . (5)
i= 1

3.2 Definition of the quality score

The data quality score of a fund is just the number of rejected tests. Using (3), the score can be
formally written as
s = ∑_{i=1}^{5} I{p_i < α}. (6)
Note that high values of s correspond to low data quality, and vice versa. For a group F of funds,
the score is the average fund score:
S = (1/|F|) ∑_{j∈F} s_j. (7)

Note that S is the average number of rejected tests (per fund) and lies between zero and five. If
hypothesis (1) holds true for each fund, we have that
E(S) = (1/|F|) ∑_{j∈F} E(s_j) ≤ 5α, (8)

since P_{µ,σ²;n}(p_i < α) ≤ α; note that we do not have equality because the Z_i are discrete. Our rationale
for testing a group of funds is to compare its score (7) with the upper bound (8) on the expected
number of rejected tests. If the score far exceeds this bound, we infer that there are issues with
some of the underlying return time series. In the next section we provide an illustration with stocks
and funds of hedge funds.
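Putting the pieces together, the fund score (6) and the group score (7) reduce to a few lines (an illustrative sketch):

```python
def fund_score(p_values, alpha=0.01):
    """Data quality score (6): the number of rejected tests for one fund."""
    return sum(p < alpha for p in p_values)

def group_score(p_values_per_fund, alpha=0.01):
    """Group score (7): the average fund score over a group of funds."""
    scores = [fund_score(p, alpha) for p in p_values_per_fund]
    return sum(scores) / len(scores)
```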

3.3 A reality check

As a first application of the data quality score, we would like to demonstrate the fundamentally
different quality of equity and Barclay funds of hedge funds returns. We chose funds of hedge funds
because this is one of the best categories in the Barclay database with respect to data quality.
7 That is, the likelihood of rejecting a “good” return time series by chance.

Table 2
Data quality score of monthly return data

Stocks Funds of Hedge Funds
Score 0.04 0.23
P(s=0) 96.77% 85.60%
P(s=1) 2.84% 8.65%
P(s=2) 0.39% 3.61%
P(s=3) 0.00% 1.47%
P(s=4) 0.00% 0.63%
P(s=5) 0.00% 0.04%

The equity data was obtained from the Compustat database. We took the monthly returns of the
members of the RiskMetrics US stock universe, which is used to produce equity factors for the
RiskMetrics Equity Factor Model.8 This universe contains the ordinary shares of the largest US
companies and consists of roughly 2000 stocks. The time window was chosen such that the stock
return time series length matches the average length of the funds of hedge funds return time series.

The results in Table 2 speak for themselves and support our hypothesis that there are issues with
hedge fund data. Note that for the stocks the Bonferroni inequality (5) is “on average” respected, since

P(at least one test is rejected) = P(s > 0) = 1 − P(s = 0) = 3.23%. (9)

Here P(s = 0) is the proportion of return time series with a quality score of zero, that is, with no
rejected tests. The quality score for the stocks also obeys the inequality (8). All this indicates that, as
expected, the equity data does not exhibit any of the patterns we are concerned with.

These positive findings are contrasted by the funds of hedge funds return data. Here many more funds
than predicted by the model (1) display patterns; inequality (8) is clearly breached. This leads us to
conclude that the accuracy of the funds of hedge funds data is not always assured.

8 See (Straumann and Garidi 2007).



4 Analyzing a hedge fund database

This section presents an empirical study, from which many conclusions about data quality can be
drawn. The study also demonstrates the power of the previously defined data quality score and the
mechanics for using it. We explore the Barclay database with a particular view towards its quality.

4.1 The Barclay database

In this section, we describe the Barclay database and discuss the filtering rules which we have applied.
We look at the October 2007 snapshot of the complete Barclay database, which contains CTAs, fund
of funds, and hedge funds. We also consider inactive funds. A fund is called inactive if it has stopped
reporting to Barclay; note that this does not necessarily mean that it does not exist anymore. All in all,
the database contains 11,701 funds. To ensure that all funds contain consistent information, we apply
certain filters. These six filters, discussed next, reduce the Barclay database to a
universe of 8574 funds.

First, we remove the 591 funds with no strategy classification. We only admit funds that report returns
net of all fees. This leads to a further exclusion of 204 funds. The next filter deals with duplicates. For
many funds there exist multiple classes, sometimes denominated in different currencies. Also there
are funds coexisting in an onshore and offshore version. We want to restrict ourselves to one class or
version only. Similarly to (Christory, Daul, and Giraud 2006), we devised an algorithm to find fund
duplicates. This algorithm is based on a comparison of Barclay’s manager identifiers, strategy
classifications, the roots of the fund names, and the investment objectives. The latter are stored in a
text field and consist of longer written summaries of the type shown in the introduction of this paper.
All in all, we remove 1184 duplicates. The next filter removes the 702 funds which have not reported
more than one year of monthly returns. Since we consider the assets under management (AUM) as a
very important piece of information, we next remove all 375 funds where the AUM time series is
missing.9 We mention that we have removed the leading and trailing zeros in all return time series,
since we interpret these zeros as missing data. The occurrence of runs of zeros of length three or more
strictly inside the return time series is also interpreted as missing data. We omit the corresponding
71 funds. The remaining funds have no gaps in their return series. We also mention that all funds of
the Barclay database have a monthly return frequency.

The Barclay strategy classification consists of 79 items. We mapped these categories to the following
broad EDHEC categories: CTA, Fund of Hedge Funds, Long/Short Equity, Relative Value, Event
Driven, Emerging Markets and Global Macro.10

9 We admit however gaps of missing data in the AUM time series.

Table 3
Summary of Barclay hedge fund strategies

Active Inactive Total


CTA 769 1633 2402
Funds of Funds 1800 582 2382
Hedge Funds 2325 1465 3790
Long/Short Equity 1183 737 1920
Relative Value 459 404 863
Event Driven 260 167 427
Emerging Markets 321 89 410
Global Macro 102 68 170
Total 4894 3680 8574


Table 3 summarizes the number of funds in each strategy broken down by status. Barclay is known
for providing broad coverage of CTAs, and this can also be seen from the numbers. In the CTA
category there is a high proportion of inactive funds. This seems to be a database legacy artefact.
Among the funds active during the 1980s and 1990s, the CTA category clearly dominates. We
conclude that Barclay mainly focused on CTAs during these times. Moreover the CTAs exhibit a
higher attrition rate than the other categories.11 As we will see below, the CTA class contains many
“micro-funds” with less than one million dollars in AUM. It is not surprising that small funds have a
higher likelihood of disappearing than large funds. This point has also been addressed and confirmed
by (Grecu, Malkiel, and Saha 2007). We mention that many of these tiny funds are, legally speaking, not
CTAs because their managers do not hold an SEC licence; since they invest similarly to CTAs,
Barclay nevertheless categorizes them as CTAs.12

10 We refer to www.edhec-risk.com/indexes for a concise description of these strategies.


11 The average annual attrition rates in the period 1990–2006 are: 12.3% for CTAs, 3.9% for funds of funds and 6.2% for
hedge funds.
12 From personal communication with the Barclay Hedge client services.

4.2 Overview of the data quality

After applying the filters, we determine the score for every fund. Tables 4 and 5 give an overview of
scores and rejection probabilities. In overall data quality, global macro and funds of funds do best and
CTAs worst.

The favorable data quality of funds of funds is not unexpected. Fund of funds managers are not
directly involved in trading activities and their role is to some extent also administrative. It is in their
interest to have precise knowledge of the NAVs of funds they are invested in. All these factors
increase the likelihood that their reporting to the database vendors is accurate. Our result is similar to
that of (Liang 2003). In his comparison of identical funds in TASS and the US Offshore Fund
Directory, he found that among the fund of funds there were no return discrepancies at all.

The satisfactory quality of the global macro fund data also has a plausible explanation. Global
macro funds are active in liquid markets, namely currency and interest rate markets. For this reason, the
valuation of the assets is relatively straightforward for global macro funds, and this in turn should
induce a good data quality.

In contrast, we have not found a convincing explanation for the relatively bad data quality of
long-short funds. Since long-short funds trade in the equity markets, which are rather liquid, we
would have expected a better result for this category. It surprises us that the relative value and
emerging markets hedge funds, which are active on less liquid markets, outperform the long-short
strategy in terms of data quality. We did not gain any insight into the high score of the long-short
funds either by using the more granular strategy categorization by Barclay. In the next section, we
have a closer look at possible factors which affect data quality. But even accounting for these factors,
we have not uncovered the reasons for the poor score of long-short equity funds.

The unfavorable data quality of CTAs is caused by the many tiny funds of this group; see
Section 4.3.2 below.

For all strategies, either test T2 on the proportion of unique values or test T5 on the distribution of the
second digit after the decimal is rejected most often. We have verified that in one third of the cases
in which T2 or T5 is rejected, rounding appears to be the main cause for rejection. The next most frequent
problem concerns the occurrence of zeros (test T1 ). Less frequent are recurring blocks (test T4 ). The
occurrence of runs (test T3 ) is least common in the return data.

Table 4
Data quality scores for Barclay funds

Score P(s=0) P(s=1) P(s=2) P(s=3) P(s=4) P(s=5)


CTA 0.45 75.19% 11.70% 7.79% 3.66% 1.62% 0.04%
Funds of Funds 0.23 85.60% 8.65% 3.61% 1.47% 0.63% 0.04%
Hedge Funds 0.35 78.92% 11.08% 6.62% 2.37% 0.98% 0.03%
Long/Short Equity 0.41 76.46% 11.46% 8.33% 2.55% 1.20% 0.00%
Relative Value 0.29 81.81% 11.01% 4.63% 1.62% 0.81% 0.12%
Event Driven 0.41 76.81% 11.71% 6.09% 4.22% 1.17% 0.00%
Emerging Markets 0.29 81.71% 10.73% 5.37% 1.71% 0.49% 0.00%
Global Macro 0.14 90.59% 6.47% 1.76% 1.18% 0.00% 0.00%

Table 5
Proportion of funds failing individual quality tests

P(T1 rej.) P(T2 rej.) P(T3 rej.) P(T4 rej.) P(T5 rej.)
CTA 6.45% 15.15% 2.33% 2.87% 18.15%
Funds of Funds 2.85% 8.73% 0.71% 1.76% 8.94%
Hedge Funds 3.72% 12.64% 0.82% 2.64% 15.67%
Long/Short Equity 3.85% 13.65% 0.47% 3.07% 19.53%
Relative Value 3.36% 11.47% 1.74% 2.32% 10.08%
Event Driven 5.85% 16.39% 1.17% 3.28% 14.52%
Emerging Markets 2.44% 10.00% 0.00% 1.71% 14.39%
Global Macro 1.76% 4.12% 1.18% 0.00% 6.47%

Figure 1
Data quality score as a function of time series length
[Figure: binned data quality score (vertical axis, 0 to 0.9) versus return time series length in months (horizontal axis, 0 to 200), with separate curves for CTA, Funds of Funds and Hedge Funds.]

4.3 Predictors of data quality

In the previous discussion, we did not make use of any covariates, which would possibly help to
explain the score values. The goal of this section is to find the explanatory factors for data quality.

4.3.1 Time series length

Figure 1 displays the data quality versus the length of the return time series.13 The plot shows nicely
that the longer the time series, the higher the data quality score. This relationship is close to linear. It
is straightforward to give an explanation: the longer the return time series, the higher the
probability that errors have occurred during its recording.

13 The curves have been estimated through a binning approach. In each category, bins containing approximately 200
funds with similar return time series lengths are constructed. For each bin one determines the average time series length
together with the score of all funds therein, and this gives one point in the xy-plane.

Table 6
Assets under management ($M) by time series length
The 33rd and 66th percentiles of AUM in each time series length category are reported.

Length (yrs.) (1,3] (3,6] >6


Percentile 33% 66% 33% 66% 33% 66%
CTA 0.9 5.0 1.5 9.3 5.9 35.6
Funds of Funds 13.8 52.6 26.1 85.3 33.9 127.4
Hedge Funds 11.3 47.5 17.6 72.7 31.1 105.5
Long/Short Equity 9.3 44.1 15.5 66.1 27.2 80.3
Relative Value 10.3 48.0 18.3 75.7 34.8 129.5
Event Driven 12.8 44.1 24.2 105.7 46.5 131.0
Emerging Markets 22.0 66.0 27.5 73.4 32.5 115.4
Global Macro 9.8 34.9 21.5 70.7 36.1 131.3

4.3.2 Assets under management (AUM)

In this section, we consider fund size, defined as the time average of the AUM series. In Table 6, we
consider the fund size in relation to time series length. From comparing the percentiles across the
three categories of time series lengths, we see that the longer the time series, the higher in general the
AUM. The CTA category contains the funds with the lowest AUM. As we alluded to earlier, it is striking
how many tiny CTA funds exist. We utilize the AUM percentiles from Table 6 to divide the funds into
small, medium and large size categories.

We present the quality scores in Table 7. First we remark that bucketing by time series length is
necessary in order to remove the strong effect of this factor on the data quality score. We would
expect funds with low AUM to exhibit a poorer data quality than large funds since they most likely
have fewer resources for accurate valuation and reporting.14 For CTAs, the quality improves (scores
decrease) with greater size in each of the time series length categories; our conjecture holds true. For
funds of funds or hedge funds, there is no such clear relationship. We have investigated whether
auditing could play a role in this result by further subdividing the groups into audited and
non-audited funds, but this revealed that auditing is not the cause.

14 We mention that (Liang 2003) established such a relationship indirectly by giving the argument that fund size and
auditing are strongly related and by establishing a positive effect of auditing on the size of the return discrepancies.

Table 7
Data quality score by time series length and assets under management
Number of funds in each category in parentheses

Length (yrs.) (1,3] (3,6] >6


Fund Size Small Medium Large Small Medium Large Small Medium Large
CTA 0.30 (292) 0.25 (292) 0.11 (292) 0.78 (251) 0.41 (250) 0.22 (251) 0.90 (258) 0.73 (258) 0.44 (258)
Funds of Funds 0.16 (221) 0.05 (221) 0.14 (221) 0.21 (320) 0.18 (319) 0.21 (320) 0.44 (253) 0.25 (254) 0.42 (253)
Hedge Funds 0.15 (397) 0.23 (397) 0.23 (397) 0.30 (434) 0.28 (435) 0.34 (434) 0.44 (432) 0.57 (432) 0.61 (432)
Long/Short Equity 0.16 (193) 0.24 (194) 0.30 (193) 0.32 (224) 0.38 (224) 0.47 (224) 0.41 (223) 0.57 (222) 0.74 (223)
Relative Value 0.15 (95) 0.21 (94) 0.17 (95) 0.24 (107) 0.12 (106) 0.16 (107) 0.53 (86) 0.54 (87) 0.59 (86)
Event Driven 0.24 (37) 0.31 (36) 0.19 (37) 0.27 (48) 0.28 (47) 0.29 (48) 0.83 (58) 0.62 (58) 0.43 (58)
Emerging Markets 0.04 (52) 0.23 (52) 0.21 (52) 0.39 (38) 0.24 (37) 0.34 (38) 0.36 (47) 0.36 (47) 0.45 (47)
Global Macro 0.05 (20) 0.05 (21) 0.15 (20) 0.00 (18) 0.00 (19) 0.28 (18) 0.39 (18) 0.06 (18) 0.28 (18)

4.3.3 Auditing

In this section, we look at the effect of auditing on fund quality. The results are depicted in Table 8.
For the strategies CTA, fund of funds, hedge fund and long-short fund, audited funds clearly
outperform the non-audited funds in terms of data quality. For the CTAs, auditing seems to be a
particularly effective way of decreasing the data quality score. Note, however, that only a small
minority of the CTA funds are audited. We suspect that this fact is to some extent related to the size of the
CTAs, which is generally small. For relative value, event driven, emerging markets and global macro
funds, the effects of auditing seem to be mixed: audited groups do not always have a lower score than
the corresponding non-audited group.

4.3.4 Fund status

In this section, we explore differences between active and inactive funds. We expect inactive funds to
have lower data quality than active funds. Our argument is as follows. Inactive funds are funds that
have stopped reporting but still exist, or funds that were liquidated.15 In the first case, the fund is
apparently no longer interested in reporting, and it would not be surprising if this were reflected in a
high data quality score, at least towards the end of the time series. In the second case of a liquidated fund,
15 We mention that the main reason for liquidation is poor performance at the end of the fund life, as shown by (Grecu,
Malkiel, and Saha 2007).

Table 8
Data quality score by audited/non-audited and time series length
Length categories as in Table 6. Number of funds in each category in parentheses

Audited Non-Audited
Length (yrs.) (1,3] (3,6] >6 (1,3] (3,6] >6
CTA 0.08 (60) 0.25 (20) 0.40 (20) 0.23 (816) 0.48 (732) 0.70 (754)
Funds of Funds 0.10 (416) 0.15 (781) 0.34 (664) 0.15 (247) 0.39 (178) 0.59 (96)
Hedge Funds 0.19 (758) 0.30 (1021) 0.53 (1076) 0.23 (433) 0.32 (282) 0.60 (220)
Long/Short Equity 0.20 (364) 0.37 (530) 0.56 (562) 0.28 (216) 0.46 (142) 0.62 (106)
Relative Value 0.18 (168) 0.18 (239) 0.57 (206) 0.16 (116) 0.15 (81) 0.49 (53)
Event Driven 0.27 (74) 0.26 (118) 0.58 (145) 0.19 (36) 0.36 (25) 0.86 (29)
Emerging Markets 0.16 (113) 0.35 (91) 0.36 (119) 0.16 (43) 0.23 (22) 0.55 (22)
Global Macro 0.03 (39) 0.12 (43) 0.20 (44) 0.18 (22) 0.00 (12) 0.40 (10)

Table 9
Data quality score by status and time series length
Length categories as in Table 6. Number of funds in each category in parentheses

Active Inactive
Length (yrs.) (1,3] (3,6] >6 (1,3] (3,6] >6
CTA 0.11 (257) 0.19 (218) 0.46 (294) 0.26 (619) 0.58 (534) 0.83 (480)
Funds of Funds 0.12 (444) 0.20 (740) 0.37 (616) 0.11 (219) 0.19 (219) 0.38 (144)
Hedge Funds 0.19 (668) 0.32 (783) 0.59 (874) 0.22 (523) 0.29 (520) 0.44 (422)
Long/Short Equity 0.22 (325) 0.44 (410) 0.62 (448) 0.25 (255) 0.32 (262) 0.48 (220)
Relative Value 0.17 (122) 0.13 (176) 0.66 (161) 0.18 (162) 0.23 (144) 0.39 (98)
Event Driven 0.24 (59) 0.28 (76) 0.66 (125) 0.25 (51) 0.28 (67) 0.55 (49)
Emerging Markets 0.15 (129) 0.29 (90) 0.43 (102) 0.22 (27) 0.48 (23) 0.28 (39)
Global Macro 0.06 (33) 0.06 (31) 0.26 (38) 0.11 (28) 0.13 (24) 0.19 (16)

the argument is similar: it is likely that a fund no longer concentrates on accurate reporting shortly
before it liquidates. Table 9 shows that this conjecture holds for CTAs only. For funds of funds, hedge
funds and all subcategories, there is no striking relationship between data quality score and fund status.

4.3.5 Concluding remarks concerning predictors of data quality

Summarizing, we have found that time series length and whether a fund is audited are the most
important predictors of the data quality score. For the other predictors tested, there are no conclusive
results that hold across all fund categories. AUM and fund status have some predictive power for the
class of CTA funds: small AUM has a clearly negative impact on the data quality of CTAs, and
inactive CTAs generally have lower data quality than active CTAs.

We have presented an exploratory analysis of the predictive power of certain factors for data quality.
Additional factors could have been tested. An alternative approach would have been to fit a
generalized linear model to the data quality scores. The advantage of a model-based analysis would be
the straightforward, mechanical assessment of the significance of factors, essentially by looking at
p-values of estimated model parameters; the disadvantage is that we would have to trust a black box.
Since the data quality score is a new concept, and since we wanted to gain some intuition about it, we
preferred an exploratory data analysis consisting of tables and plots.

4.4 Is there an improvement of data quality over time?

This section is concerned with the evolution of data quality through time. To this end, we look at two
equally long time periods, 1997–2001 and 2002–2006. For each period, we consider those funds that
reported returns during the full period and calculate the data quality score for the time series restricted
to that period. Note that this simplifies the analysis, since all funds then have equally long time series
of 60 monthly returns. The division into small, medium and large funds is as discussed in the previous
sections.

The results in Table 10 indicate a clear improvement of quality for CTAs: for all fund size groups, the
score in the second period is lower than in the first. For all other strategies, the relationship is mixed.
For funds of funds and hedge funds, the data quality stays at more or less the same level. Most striking
is the considerable decrease of quality for large long/short equity funds, for which we currently have
no explanation.

Table 10
Evolution of data quality score through time

Fund Size               Small                      Medium                     Large
Time Period    1997-2001   2002-2006      1997-2001   2002-2006      1997-2001   2002-2006
CTA 0.43 (294) 0.33 (375) 0.24 (295) 0.17 (376) 0.28 (294) 0.11 (375)
Funds of Funds 0.21 (191) 0.20 (716) 0.09 (191) 0.14 (717) 0.14 (191) 0.20 (716)
Hedge Funds 0.23 (462) 0.22 (1080) 0.26 (462) 0.27 (1080) 0.30 (462) 0.34 (1080)
Long/Short Equity 0.25 (240) 0.22 (539) 0.26 (239) 0.32 (539) 0.35 (240) 0.44 (539)
Relative Value 0.16 (95) 0.22 (255) 0.27 (94) 0.17 (255) 0.28 (95) 0.19 (255)
Event Driven 0.43 (63) 0.25 (122) 0.37 (62) 0.26 (121) 0.13 (63) 0.34 (122)
Emerging Markets 0.17 (48) 0.27 (116) 0.19 (47) 0.23 (115) 0.15 (48) 0.37 (116)
Global Macro 0.12 (17) 0.10 (49) 0.44 (18) 0.21 (48) 0.00 (17) 0.14 (49)

Summarizing the results, it is fair to say that data quality in the Barclay database has improved.
Possible reasons for this improvement are the general increase in transparency in the hedge fund
world during the past decade and advances in information technology, which have generally
facilitated the collection and management of large amounts of data.

4.5 The data quality bias

We finally estimate the data quality bias announced earlier in this paper. The data quality bias
measures the impact of imperfect data on performance. We adapt the common definition of data
biases to the case of data quality: the data quality bias is defined as the average annual performance of
the entire universe of funds minus that of the group of funds with a data quality score equal to zero.
Following the practice of the hedge fund literature, fund performances are averaged using equal
weights. A positive data quality bias indicates that including funds with imperfect return data leads to
an overstatement of performance, and vice versa. The data quality bias for a given strategy is obtained
by restricting the universe to that strategy. Recall also that the survivorship bias is the average annual
performance of the living funds minus that of the entire universe of funds.
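For concreteness, the two bias definitions above can be computed as in the following sketch. The pandas layout (one row per fund and year, with columns fund, annual_return, score and alive) is hypothetical and only illustrates the equal-weighted averaging described in the text.

    import pandas as pd

    def equal_weighted_performance(df):
        """Equal-weighted average annual return: average per fund, then across funds."""
        return df.groupby("fund")["annual_return"].mean().mean()

    def data_quality_bias(df):
        """Average annual performance of the full universe minus that of the
        zero-score funds (column names are illustrative)."""
        return equal_weighted_performance(df) - equal_weighted_performance(df[df["score"] == 0])

    def survivorship_bias(df):
        """Average annual performance of the living funds minus that of the full universe."""
        return equal_weighted_performance(df[df["alive"]]) - equal_weighted_performance(df)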

The results are presented in Table 11. We note that non-USD currencies have been taken into account
when calculating performances. The numbers indicate that data quality can be a non-negligible source
of performance misrepresentation.

Table 11
Data quality bias and survivorship bias 1997-2006 (annualized)

Data Quality Bias (%)    Survivorship Bias (%)
CTA 0.16 1.75
Funds of Funds -0.14 0.07
Hedge Funds 0.48 0.67
Long/Short Equity 0.64 0.60
Relative Value -0.14 0.73
Event Driven 1.50 -0.17
Emerging Markets -0.01 0.84
Global Macro 0.24 1.93

Both the data quality bias and the survivorship bias are almost negligible for funds of funds. For
CTAs, the data quality bias is small compared to the huge survivorship bias; note that the large
survivorship bias for CTAs is to a large extent due to their high attrition rate. Recall that the data
quality of CTAs is generally low; nevertheless, their data quality bias is not extreme. For hedge funds,
the data quality bias is of the same order of magnitude as the survivorship bias. Concluding this
section, we stress that data biases are of course not additive.

4.6 Regularizing hedge fund return data

As a last piece of the analysis of the Barclay database, we study the use of the quality score for
“cleaning” hedge fund data. We refer to the overview of the hedge fund return anomalies literature
given earlier, where we cited (Bollen and Pool 2007). These authors found a significant discontinuity
at zero in the pooled distribution of hedge fund returns reported to the CISDM database. We first
verify whether a similar observation can be made with the Barclay return data. We apply a kernel
density estimator to percentage returns, using a Gaussian kernel with a bandwidth of 0.025. The
results in Figure 2 are quite illustrative: for the left-hand plot, the estimator is applied to all 8574 funds
in the Barclay database, whereas for the right-hand plot the 1738 funds with data quality issues are
removed.
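As an illustration, the following sketch reproduces this estimation step, assuming scikit-learn's Gaussian kernel density estimator with the absolute bandwidth 0.025 and a common evaluation grid for the raw and cleaned universes; the variable names are ours.

    import numpy as np
    from sklearn.neighbors import KernelDensity

    def pooled_density(returns_pct, grid):
        """Gaussian kernel density of pooled percentage returns, bandwidth 0.025."""
        kde = KernelDensity(kernel="gaussian", bandwidth=0.025)
        kde.fit(np.asarray(returns_pct).reshape(-1, 1))
        return np.exp(kde.score_samples(grid.reshape(-1, 1)))

    grid = np.linspace(-2, 2, 401)                 # identical grid points for both panels
    # density_raw   = pooled_density(all_returns, grid)    # all 8574 funds
    # density_clean = pooled_density(clean_returns, grid)  # funds with quality score zero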

Note that the kernel density estimate for the raw Barclay return data in the left-hand plot is very
wiggly. This wiggling is induced by funds with heavily rounded returns.

Figure 2
Kernel density estimates for pooled distribution of Barclay fund returns
All funds (left) and funds with quality score of zero (right)

[Two kernel density panels, Barclay raw (left) and Barclay cleaned (right): density versus return (%) over the range −2 to 2.]

The next observation is the pronounced jump of the density at zero, which is in line with (Bollen and
Pool 2007). Before we move to the right-hand plot, we stress that for both plots the kernel density
estimates are based on the same bandwidth and evaluated at identical grid points. In the right-hand
plot, the wiggling almost disappears. This does not surprise us, because the tests T1–T5 reject time
series with heavily rounded returns. Note also that the density curve is still very steep at zero;
however, the jump appears slightly less pronounced. This example shows that removing funds with a
nonzero data quality score can, to some extent, regularize hedge fund return data.

5 Conclusions

In this paper, we have provided a comprehensive discussion of quality issues in hedge fund return
data. Hedge fund data quality is a topic that is often avoided, perhaps because it is perceived as not
particularly fruitful or simply boring. The main goal of this paper was to increase awareness of
irregularities and patterns in hedge fund return data, to suggest methods for finding them, and to
quantify their severity.

Using a simple, natural and mathematically sound rationale, we introduced tests and devised a novel
scoring method for quantifying the quality of return time series data. By means of an empirical study
using the Barclay database, we then demonstrated how such a score can be used for exploring
databases. Our findings largely conform with results from other studies, which can be seen as a partial
validation of the scoring approach.

While the score approach is appealing and can be applied in an almost mechanical fashion, it seems to
us that uncritically computing data quality scores could prove harmful. Most hedge fund databases
have grown organically, and every analysis must respect that there are legacy issues. It would be very
wrong to ascribe excessive importance to numerical score values without looking at the underlying
causes.

References
Agarwal, V., N. Daniel, and N. Naik (2006). Why is Santa Claus so kind to hedge funds? The
December return puzzle! Working paper, Georgia State University.

Bollen, N. and V. Pool (2006). Conditional return smoothing in the hedge funds industry.
Forthcoming, Journal of Financial and Quantitative Analysis.

Bollen, N. and V. Pool (2007). Do hedge fund managers misreport returns? Working paper,
Vanderbilt University.

Brown, S., W. Goetzmann, and J. Park (2001). Careers and survival: competition and risk in the
hedge fund and CTA industry. Journal of Finance 56(5), 1869–1886.

Christory, C., S. Daul, and J.-R. Giraud (2006). Quantification of hedge fund default risk. Journal
of Alternative Investments 9(2), 71–86.

Edhec-Risk Asset Management Research (2006). EDHEC investable hedge fund indices. Available
at http://www.edhec-risk.com.

Eling, M. (2007). Does hedge fund performance persist? Overview and new empirical evidence.
Working paper, University of St. Gallen.

Getmansky, M., A. Lo, and I. Makarov (2004). An econometric model of serial correlation and
illiquidity in hedge fund returns. Journal of Financial Economics 74, 529–609.

Grecu, A., B. Malkiel, and A. Saha (2007). Why do hedge funds stop reporting their performance?
Journal of Portfolio Management 34(1), 119–126.

Hasanhodzic, J. and A. Lo (2007). Can hedge-fund returns be replicated? The linear case. Journal
of Investment Management 5(2), 5–45.
Karr, A., A. Sanil, and D. Banks (2006). Data quality: a statistical perspective. Statistical
Methodology 3, 137–173.
Lhabitant, F.-S. (2004). Hedge Funds: Quantitative Insights. Chichester: John Wiley & Sons.
Liang, B. (2000). Hedge funds: the living and the dead. Journal of Financial and Quantitative
Analysis 35, 309–326.
Liang, B. (2003). Hedge fund returns: auditing and accuracy. Journal of Portfolio Management 29 (Spring), 111–122.
Straumann, D. and T. Garidi (Winter 2007). Developing an equity factor model for risk.
RiskMetrics Journal 7(1), 89–128.
Capturing Risks of Non-transparent Hedge Funds

Stéphane Daul∗
RiskMetrics Group
stephane.daul@riskmetrics.com

We present a model that captures the risks of hedge funds using only their historical performance as
input. This statistical model is a multivariate distribution where the marginals derive from an
AR(1)/AGARCH(1,1) process with t5 innovations, and the dependency is a grouped-t copula. The
process captures all relevant static and dynamic characteristics of hedge fund returns, while the
copula enables us to go beyond linear correlation and capture strategy-specific tail dependency.
We show how to estimate parameters and then successfully backtest our model and some peer
models using 600+ hedge funds.

1 Introduction

Investors taking stakes in hedge funds usually do not get full transparency of the funds’ exposures.
Hence, in order to perform their monitoring function, investors would benefit from models based only
on hedge funds’ past performance.

The first type of models consists of linear factor decompositions. These are potentially very powerful,
but no clear results have emerged and intensive research is ongoing. We present here a less ambitious
but successful second approach based on statistical processes: we are able to accurately forecast the
risk taken by one or more hedge funds using only their past track record.

In this article we first describe the static and dynamic characteristics of hedge fund returns that we
wish to capture. Then we introduce an extension of the usual GARCH process and detail its
parametrization. This model encompasses other standard processes, enabling us to backtest all of the
models consistently. Finally we study the dependence of hedge funds, going beyond linear correlation
to introduce tail dependency.

∗ The author would like to thank G. Zumbach for helpful discussion.



2 Characterizing hedge fund returns

We start by presenting descriptive statistics of hedge fund returns. To that end, we use the information
from the HFR database, which consists of (primarily monthly) historical returns for hedge funds. We
assume that what is reported for each hedge fund is the monthly return at time t (measured in months),
defined as

rt = (NAVt − NAVt−1) / NAVt−1,                                    (1)

where NAVt is the net asset value of the hedge fund at time t. This return is considered net of all
hedge fund fees. We consider only the 680 hedge funds with more than ten years of data (that is, at
least 120 observations), which will enable us to perform extensive out-of-sample backtesting afterwards.

We first analyze the shape of the distribution of the monthly returns. The classical tests for normality
are the Jarque-Bera and Lilliefors tests. At a 95% confidence level, both tests reject the normality
hypothesis for the vast majority of hedge funds: out of the 680 hedge funds, the Jarque-Bera test
rejects 598 and the Lilliefors test rejects 498.
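For a single fund's monthly return series, the two tests can be run as sketched below; the sketch assumes scipy's Jarque-Bera implementation and the Lilliefors test available in statsmodels.

    from scipy.stats import jarque_bera
    from statsmodels.stats.diagnostic import lilliefors

    def rejects_normality(monthly_returns, alpha=0.05):
        """Return (Jarque-Bera rejects, Lilliefors rejects) at significance level alpha."""
        _, p_jb = jarque_bera(monthly_returns)
        _, p_lf = lilliefors(monthly_returns, dist="norm")
        return p_jb < alpha, p_lf < alpha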

A common assertion about hedge fund returns is that they are skewed, and looking at the sample
skewness, this is certainly what we would conclude. However, this quantity is sensitive to outliers and
is not a robust statistic. We instead test symmetry using the Wilcoxon signed rank sum,

W = Σ_{i=1}^{N} φi Ri,                                            (2)

where Ri is the rank of the absolute value of sample i, φi is its sign and N is the number of samples.
This test rejects only 26 of the 680 hedge funds at a 95% significance level. We conclude that the bulk
of the hedge funds do not display asymmetric returns, but that tail events and small sample sizes
produce high sample skewness.
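The statistic of Equation (2) can be computed directly, as in the sketch below; centring the returns at their sample mean before ranking is our assumption, since the text does not specify the centring.

    import numpy as np
    from scipy.stats import rankdata, wilcoxon

    def signed_rank_statistic(r):
        """W = sum_i phi_i R_i, with R_i the rank of |x_i| and phi_i its sign."""
        x = np.asarray(r) - np.mean(r)             # centring choice: our assumption
        return np.sum(np.sign(x) * rankdata(np.abs(x)))

    # scipy's two-sided test of symmetry about zero gives an equivalent p-value:
    # stat, pval = wilcoxon(np.asarray(r) - np.mean(r))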

After describing the static behavior of hedge fund returns, we analyze their dynamics by calculating
various one-month lagged correlation coefficients. We consider the following correlation coefficients:

• Return-lagged return ρ(rt , rt−1 ),

• Volatility-lagged volatility ρ(σt , σt−1 ) and

• Volatility-lagged return ρ(σt , rt−1 ).

If the time series have no dynamics (such as white noise), then the autocorrelation coefficients should
follow a normal distribution with variance 1/N.

Figure 1
Distributions of one-month lagged correlation coefficients across hedge funds
Only significantly non-zero values reported.

[Three histograms of the significant coefficients: Return − Lagged Return, Volat − Lagged Volat and Volat − Lagged Return, each on the range −1 to 1.]

Hence a coefficient is significant at the 95% level if it falls outside [−2/√N, 2/√N]. Using the 680
hedge funds, we found 336 significant coefficients for the return-lagged return correlation, 348 for the
volatility-lagged volatility correlation and 192 for the volatility-lagged return correlation.
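A sketch of these diagnostics for one fund follows, covering the three lagged correlations listed above and the ±2/√N significance band. Since the paper does not state its monthly volatility proxy, we use |rt − r̄| as a crude stand-in, which is an assumption of this illustration.

    import numpy as np

    def lagged_corr(x, y):
        """Correlation between x_t and y_{t-1}."""
        return np.corrcoef(x[1:], y[:-1])[0, 1]

    def dynamic_diagnostics(r):
        """Three one-month lagged correlations and their significance at 95%."""
        r = np.asarray(r)
        vol = np.abs(r - r.mean())                 # crude volatility proxy (assumption)
        band = 2.0 / np.sqrt(len(r))               # white-noise band [-2/sqrt(N), 2/sqrt(N)]
        coefs = {
            "return-lagged return": lagged_corr(r, r),
            "volatility-lagged volatility": lagged_corr(vol, vol),
            "volatility-lagged return": lagged_corr(vol, r),
        }
        return {name: (c, abs(c) > band) for name, c in coefs.items()}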

We summarize the significant coefficients in histograms in Figure 1. The first coefficient is the
correlation between the return and the lagged return. The coefficients are essentially positive,
meaning that some of one month’s return is transferred to the next month. This could have different
origins: valuation issues, trading strategies or even smoothing. The second coefficient is the
correlation between the volatility and the lagged volatility. These are essentially positive and imply
heteroscedasticity, or non-constant volatility. Finally, the third coefficient is the correlation between
the volatility and the lagged return. These coefficients are both positive and negative, suggesting that
hedge fund managers adapt their strategies (increasing or decreasing the risk taken) to upwards or
downwards markets.

We also examined the dependency between the three coefficients, but did not observe any structure.
We conclude that the three dynamic characteristics are distinct, and must be captured separately by
our model. To summarize, we have found that hedge fund returns are non-normal but not necessarily
skewed, and that they have at least three distinct dynamic properties.

3 The univariate model

3.1 The process

To describe the univariate process of hedge fund returns, we start with a GARCH(1,1) model to
capture heteroscedasticity. We then extend the model by introducing autocorrelation in the returns and
an asymmetric volatility response. The innovations are also generalized by using an asymmetric
t-distribution.

The process is

rt+1 = r̄ + α (rt − r̄) + σt εt,                                   (3)
σt² = (ω∞ − α²) σ∞² + (1 − ω∞) σ̃t²,                               (4)
σ̃t² = µ σ̃t−1² + (1 − µ) [1 − λ sign(rt)] (rt − r̄)².                (5)

The parameters of the model are thus

{r̄, α, ω∞, σ∞, µ, λ},                                             (6)

and the distribution shape of the innovations εt.

When r̄ = α = λ = 0, the model reduces to the standard GARCH(1,1) model written in a different way
(Zumbach 2004). In this form, the GARCH(1,1) process appears with the elementary forecast σt given
by a convex combination of the long-term volatility σ∞ and the historical volatility σ̃t. The historical
volatility σ̃t is measured by an exponential moving average (EMA) with time horizon τ = −1/log(µ).
The parameter ω∞ ranges from 0 to 1 and can be interpreted as the volatility of the volatility.

The parameter α ∈ [−1, 1] induces autocorrelation in the returns. The parameter λ ∈ [−1, 1] yields
positive (negative) correlation between the lagged return and the volatility when λ is negative (positive).
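To make the dynamics concrete, here is a minimal simulation sketch of Equations (3)-(5), assuming the standardized symmetric t5 innovations adopted in Section 3.2; the parameter values in the usage comment are purely illustrative.

    import numpy as np
    from scipy.stats import t as student_t

    def simulate_ar1_agarch(T, rbar, alpha, w_inf, s_inf, mu, lam, nu=5, seed=0):
        """Simulate Equations (3)-(5) with unit-variance Student-t innovations."""
        rng = np.random.default_rng(seed)
        eps = student_t.rvs(nu, size=T, random_state=rng) * np.sqrt((nu - 2) / nu)
        r = np.empty(T)
        r[0] = rbar
        sig2_tilde = s_inf ** 2                    # EMA variance state, Equation (5)
        for t in range(T - 1):
            sig2_tilde = mu * sig2_tilde + (1 - mu) * (1 - lam * np.sign(r[t])) * (r[t] - rbar) ** 2
            sig2 = (w_inf - alpha ** 2) * s_inf ** 2 + (1 - w_inf) * sig2_tilde   # Equation (4)
            r[t + 1] = rbar + alpha * (r[t] - rbar) + np.sqrt(sig2) * eps[t]      # Equation (3)
        return r

    # Illustrative call using the universal values estimated in Section 3.2:
    # path = simulate_ar1_agarch(T=120, rbar=0.008, alpha=0.2, w_inf=0.55,
    #                            s_inf=0.03, mu=0.85, lam=-0.1)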

The innovations εt are i.i.d. random variables with E[εt] = 0 and E[εt²] = 1. We choose an asymmetric
generalization of the t-distribution introduced by Hansen (Hansen 1994). The parameters are λ′ and ν,
and the density is given by

gλ′,ν(ε) = bc [1 + ((bε + a)/(1 − λ′))² / (ν − 2)]^(−(ν+1)/2)    for ε ≤ −a/b,
gλ′,ν(ε) = bc [1 + ((bε + a)/(1 + λ′))² / (ν − 2)]^(−(ν+1)/2)    for ε > −a/b,      (7)

with

a = 4 λ′ c (ν − 2)/(ν − 1),                                       (8)
b² = 1 + 3λ′² − a²,                                               (9)
c = Γ((ν + 1)/2) / ( √(π(ν − 2)) Γ(ν/2) ).                        (10)

For λ′ = 0 this distribution reduces to the usual t-distribution with ν degrees of freedom; for λ′ > 0 it
is right skewed, and for λ′ < 0 it is left skewed.
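A direct transcription of Equations (7)-(10) is given below; it is a sketch intended for the maximum likelihood step of Section 3.2, not the authors' code.

    import numpy as np
    from scipy.special import gammaln

    def hansen_skewed_t_pdf(eps, lam_p, nu):
        """Density g_{lambda',nu}(eps) of Equations (7)-(10)."""
        c = np.exp(gammaln((nu + 1) / 2) - gammaln(nu / 2)) / np.sqrt(np.pi * (nu - 2))
        a = 4 * lam_p * c * (nu - 2) / (nu - 1)
        b = np.sqrt(1 + 3 * lam_p ** 2 - a ** 2)
        eps = np.asarray(eps, dtype=float)
        skew = np.where(eps <= -a / b, 1 - lam_p, 1 + lam_p)
        z = (b * eps + a) / skew
        return b * c * (1 + z ** 2 / (nu - 2)) ** (-(nu + 1) / 2)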

3.2 Parametrization

The choice of our parametrization enables us to separate the different parameter estimations. First, we
set

α = ρ(rt, rt−1).                                                  (11)

For λ = 0, this implies

E[(rt − r̄)²] = σ∞²,                                               (12)

justifying the estimation of σ∞ by the sample standard deviation. The expected return r̄ is set to the
historical average return.

In order to reduce overfitting, we set some parameters to fixed values. We make the hypothesis that
the volatility dynamics and the tails of the innovations are universal, implying fixed values for ω∞, µ
and ν.

We obtain ω∞ and µ by analyzing the correlation function for the pure GARCH case (that is, λ = 0
and α = 0). We rewrite the process, introducing

β0 = σ∞² (1 − µ) ω∞,                                              (13)
β1 = (1 − ω∞)(1 − µ),                                             (14)
β2 = µ.                                                           (15)

Assuming an average return r̄ = 0, the process becomes

rt = σt εt,                                                       (16)
σt² = β0 + β1 rt² + β2 σt−1².                                      (17)

Figure 2
Tail distribution of the innovations

[Log-log plot of the tail of the aggregated realized residuals compared with the t5 and normal distributions; log(cdf) versus log(−ε).]

The autocorrelation function for rt² decays geometrically1

ρk = ρ(rt², rt−k²) = ρ1 (β1 + β2)^(k−1),                           (18)

with

ρ1 = β1 + β1² β2 / (1 − 2 β1 β2 − β2²).                            (19)

We evaluate the sample autocorrelation function

ρk = ρ(rt², rt−k²),                                                (20)

across the 680 hedge funds. We then fit the cross-sectional average ρ̄k to the decay form (18). Finally,
we transform the estimated parameters back to our parametrization, yielding ω∞ = 0.55 and µ = 0.85.
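The following sketch illustrates this calibration under our reading of the text: the cross-sectional averages ρ̄k are fitted to the decay (18) by least squares in log space, and (ω∞, µ) are recovered from the fitted (ρ1, β1 + β2) via Equations (14), (15) and (19); the function names and the numerical root bracketing are our own choices.

    import numpy as np
    from scipy.optimize import brentq

    def fit_decay(rho_bar):
        """Fit rho_bar[k-1] (lags k = 1, 2, ...) to rho_1 * q**(k-1) in log space."""
        k = np.arange(1, len(rho_bar) + 1)
        slope, intercept = np.polyfit(k - 1, np.log(rho_bar), 1)   # rho_bar must be positive
        return np.exp(intercept), np.exp(slope)                    # (rho_1, q = beta1 + beta2)

    def implied_w_mu(rho1, q):
        """Recover (omega_inf, mu) from (rho_1, q) using Equations (14), (15) and (19)."""
        def excess(b1):
            b2 = q - b1
            return b1 + b1 ** 2 * b2 / (1 - 2 * b1 * b2 - b2 ** 2) - rho1
        beta1 = brentq(excess, 1e-6, q - 1e-6)     # sign change holds for typical values
        mu = q - beta1                             # beta2 = mu, Equation (15)
        w_inf = 1 - beta1 / (1 - mu)               # from beta1 = (1 - w_inf)(1 - mu)
        return w_inf, mu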

To estimate the tail parameter ν of the innovations, we compute the realized innovations, setting
λ = 0, ω∞ = 0.55 and µ = 0.85. Since we hypothesize that the innovation distribution is universal, we
aggregate all realized innovations and plot the tail distribution. We see in Figure 2 that ν = 5 is the
optimal choice.

Finally, the remaining parameters λ and λ′ are obtained for each hedge fund using maximum
likelihood estimation. Table 1 recapitulates all parameters and their estimation.
1 See (Ding and Granger 1996).

Table 1
Parameter estimation

Parameter    Effect captured             Scope         Value
r̄            expected return             individual    mean(r)
α            autocorrelation             individual    ρ(rt, rt−1)
ω∞           volatility of volatility    universal     0.55
σ∞           long-term volatility        individual    std(r)
µ            EMA decay factor            universal     0.85
λ            dynamic asymmetry           individual    MLE
ν            innovation tails            universal     5
λ′           innovation asymmetry        individual    MLE

4 Backtest

We follow the framework set out in (Zumbach 2007). The process introduced in Section 3.1 yields a
forecast at time t for the next month's return,

r̂t+1 = r̄ + α (rt − r̄),                                            (21)

and a volatility forecast σ̂t. At time t + 1, we know the realized return rt+1 and can evaluate the
realized residual

εt = (rt+1 − r̂t+1) / σ̂t.                                           (22)

Next, we calculate the probtile

zt = t5(εt),                                                       (23)

where t5(x) is the cumulative distribution function of the innovations. These probtiles should be
uniformly distributed through time and across hedge funds. To quantify the quality of our model, we
calculate the relative exceedance

δ(z) = cdf_emp(z) − z,                                             (24)

where cdf_emp is the empirical distribution function of the probtiles, and introduce the distance

d = ∫₀¹ |δ(z)| dz.                                                 (25)
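As an illustration, a minimal sketch of the backtest statistic: it computes the probtiles of Equations (22)-(23), assuming the t5 cdf refers to the unit-variance (standardized) Student-t, and approximates the integral (25) on the grid of observed probtiles.

    import numpy as np
    from scipy.stats import t as student_t

    def backtest_distance(realized, forecast_mean, forecast_vol, nu=5):
        """Distance d of Equation (25) from realized returns and forecasts."""
        eps = (realized - forecast_mean) / forecast_vol          # Equation (22)
        z = student_t.cdf(eps * np.sqrt(nu / (nu - 2)), df=nu)   # probtiles, Equation (23)
        z = np.sort(z)
        emp = np.arange(1, len(z) + 1) / len(z)                  # empirical cdf at the probtiles
        return np.trapz(np.abs(emp - z), z)                      # approximate integral of |delta(z)|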

We have calculated this distance across all times and all hedge funds, and report the results in Figure
3. The first result (labeled “AR(0) normal”) is the usual static normal distribution with no dynamics.

Figure 3
Average distance d across all hedge funds

[Average distance d for six models, from top to bottom: AR(0) normal, AR(1) normal, AR(1) GARCH normal, AR(1) GARCH t, AR(1) AGARCH t, AR(1) AGARCH asym. t; horizontal axis from 0 to 0.08.]

Then, from top to bottom, we successively add autocorrelation, heteroscedasticity with normal
innovations, heteroscedasticity with t5 innovations, an asymmetric response in the dynamic volatility,
and finally asymmetry in the innovations. Compared to the usual static normal distribution, the best
model reduces the distance between the realized and modeled residuals by more than a factor of three.
The two major improvements come from introducing heteroscedasticity and fat tails in the
innovations. The last step (adding innovation asymmetry) does not improve the results, as we might
have suspected from the earlier Wilcoxon tests, and furthermore induces over-fitting.

5 The multivariate extension

Let us consider now the case of N hedge funds simultaneously, as in a fund of hedge funds. We have
seen in Section 4 that the appropriate univariate model is the AR(1) plus AGARCH plus t5 -distributed
innovations. We now consider multivariate innovations where the marginals are t5 -distributions and
the dependency is modeled by a copula.

This structure enables us to capture tail dependency, which cannot be captured by linear correlation
alone but is present among hedge funds. Figure 4 presents an illustrative extreme example of two
hedge fund return time series. Most of the time the two hedge funds behave differently, while in one
month they both experience tail events. Such joint tail events are a display of tail dependency.

Figure 4
Tail dependency in hedge fund returns

[Two monthly hedge fund return series from 31-Jan-1985 to 31-Jan-2000; returns range from about −0.5 to 0.2, with a joint extreme loss in a single month.]

The multivariate distribution of the N innovations is given by

F(ε1, . . . , εN) = U(t5(ε1), . . . , t5(εN)),                     (26)

where U(u1, . . . , uN), a multivariate uniform distribution, is the copula to estimate.

5.1 Copula and tail dependency

Consider two random variables X and Y with marginal distributions FX and FY. The upper tail
dependency is

λu = lim_{q→1} P[ X > FX⁻¹(q) | Y > FY⁻¹(q) ],                     (27)

and analogously the lower tail dependency is

λℓ = lim_{q→0} P[ X ≤ FX⁻¹(q) | Y ≤ FY⁻¹(q) ].                     (28)

These coefficients do not depend on the marginal distributions of X and Y, but only on their copula.
See (Nelsen 1999) for more details.

The probtiles (defined in Section 4) of the N hedge funds observed at time t

(zt1 , . . . , ztN ) = UtN (29)



Figure 5
Upper and lower tail dependency as a function of q, fixed income arbitrage hedge funds

[Empirical upper and lower tail dependency coefficients for q between 0 and 0.05; values range roughly from 0 to 0.14.]

constitute a realization of the copula UtN . Since our univariate model has extracted all of the process
dynamics, we are free to reorder and aggregate our observations across time. We make the further
assumption that within a strategy, all pairs of hedge funds have the same dependence structure. Thus,
we may interpret each observation of innovations for a pair of hedge funds in a strategy at a given
time as a realization from a universal (for the strategy) bivariate copula. So from M historical periods
on N hedge funds, we extract MN(N − 1)/2 realizations.

From this bivariate sample, we can infer the upper and lower tail dependency nonparametrically using
Equations (27) and (28). We calculate the coefficients for fixed values of q, as shown in Figure 5, and
extrapolate to the limiting case.
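A minimal sketch of the nonparametric estimate applied to a pooled bivariate sample of probtiles (u, v); the variable names and the grid of q values are illustrative.

    import numpy as np

    def empirical_tail_dependence(u, v, q):
        """Empirical analogues of Equations (27)-(28) at level q (q close to 0)."""
        u, v = np.asarray(u), np.asarray(v)
        lower = np.mean((u <= q) & (v <= q)) / np.mean(v <= q)
        upper = np.mean((u > 1 - q) & (v > 1 - q)) / np.mean(v > 1 - q)
        return lower, upper

    # Evaluate on a grid of small q and extrapolate to the limit, as in Figure 5:
    # qs = np.linspace(0.005, 0.05, 10)
    # curves = [empirical_tail_dependence(u, v, q) for q in qs]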

We can also obtain a parametric estimate of the tail dependency by fitting the realized copula between
two hedge funds to a t-copula. The parameters of this copula are the correlation matrix ρ (which we
estimate using Kendall's τ) and the degrees of freedom νcop (which we estimate using maximum
likelihood). The tail dependency is symmetric and is obtained by2

λℓ = λu = 2 − 2 t_{νcop+1}( √(νcop + 1) √((1 − ρ12)/(1 + ρ12)) ),   (30)

where ρ12 is the correlation coefficient between the two hedge funds.
2 (Embrechts, McNeil, and Straumann 2002)
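Equation (30) is straightforward to evaluate; the sketch below uses scipy's Student-t cdf and is meant only to illustrate the formula.

    import numpy as np
    from scipy.stats import t as student_t

    def t_copula_tail_dependence(rho12, nu_cop):
        """Symmetric tail dependence of a bivariate t-copula, Equation (30)."""
        arg = np.sqrt(nu_cop + 1) * np.sqrt((1 - rho12) / (1 + rho12))
        return 2 - 2 * student_t.cdf(arg, df=nu_cop + 1)

    # For example, t_copula_tail_dependence(0.5, 5) is roughly 0.21.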

Table 2
Estimated tail dependency coefficients

Strategy                        N     Empirical Lower    Empirical Upper    λ ± σλ          νcop
Convertible Arbitrage 16 0.2 0.1 0.18 ± 0.09 6
Distressed Securities 18 0.06 0.05 0.05 ± 0.09 10
Emerging Markets 29 0 0 0.07 ± 0.06 8
Equity Hedge 103 0.05 0 0.04 ± 0.05 10
Equity Market Neutral 16 0.04 0.04 0.02 ± 0.03 9
Equity Non-Hedge 32 0.1 0 0.17 ± 0.06 5
Event-Driven 38 0.17 0 0.11 ± 0.08 7
Fixed Income 28 0.09 0 0.03 ± 0.07 9
Foreign Exchange 14 0 0.1 0.03 ± 0.09 10
Macro 27 0 0.05 0.03 ± 0.08 10
Managed Futures 58 0 0.07 0.05 ± 0.07 9
Merger Arbitrage 10 0 0.15 0.20 ± 0.17 5
Relative Value Arbitrage 20 0.1 0 0.04 ± 0.09 10
Short Selling 7 0 0 0.50 ± 0.22 3

Table 2 shows the results for all strategies. We report the nonparametric lower and upper coefficients,
as well as the results of the parametric estimation. Since in the parametric case, the coefficient
depends on the correlation between each pair of hedge funds, we report the average and its standard
deviation across fund pairs. We also show the estimated degrees of freedom of the copula νcop . We
see that the two estimates of tail dependence are consistent.

5.2 The multivariate model

To capture the different tail dependencies within each strategy, we use a generalization of the t-copula,
namely the grouped-t copula (Daul, DeGiorgi, Lindskog, and McNeil 2003). We first partition the N
hedge funds into m groups (strategies) labeled k, with dimensions sk and parameters νk.

Then let Z be a random vector following a multivariate normal distribution of dimension N with linear
correlation matrix ρ, and let Gν be the distribution function of

√( ν / χ²ν ),                                                      (31)

where χ²ν follows a chi-square distribution with ν degrees of freedom. Introducing U, a uniformly
distributed random variable independent of Z, we define

Rk = Gνk⁻¹(U),                                                     (32)

and

Y = ( R1 Z1, . . . , R1 Zs1, R2 Zs1+1, . . . , R2 Zs1+s2, . . . ).   (33)

As a result, for instance, the group of random variables (Y1, . . . , Ys1) has an s1-dimensional
multivariate t-distribution with ν1 degrees of freedom.

Finally, using the univariate distribution function of the innovations, we get a random vector of
innovations,

[ t5⁻¹(tν1(Y1)), . . . , t5⁻¹(tνm(YN)) ],                            (34)

following a meta grouped-t distribution with linear correlation matrix ρ and with a different tail
dependency in each group (strategy). The tail dependencies are captured by the νk and are in general
different from the ν = 5 degrees of freedom of the innovations.
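The construction (31)-(34) translates into a short sampling routine. The sketch below assumes the strategies occupy contiguous blocks of the correlation matrix and that the t5 marginal is scaled to unit variance as in Section 4; it illustrates the recipe rather than reproducing the authors' implementation.

    import numpy as np
    from scipy.stats import chi2, t as student_t

    def sample_grouped_t_innovations(corr, group_nus, group_sizes, n_samples, nu_inn=5, seed=0):
        """Draw innovation vectors with t5 marginals and a grouped-t copula, Equations (31)-(34)."""
        rng = np.random.default_rng(seed)
        L = np.linalg.cholesky(corr)
        Z = rng.standard_normal((n_samples, corr.shape[0])) @ L.T      # correlated normals
        U = rng.uniform(size=n_samples)                                # one U shared by all groups
        Y = np.empty_like(Z)
        start = 0
        for nu_k, s_k in zip(group_nus, group_sizes):
            R_k = np.sqrt(nu_k / chi2.ppf(1 - U, df=nu_k))             # R_k = G_{nu_k}^{-1}(U)
            block = slice(start, start + s_k)
            Y[:, block] = R_k[:, None] * Z[:, block]                   # multivariate t within the group
            probs = student_t.cdf(Y[:, block], df=nu_k)                # t_{nu_k}(Y), Equation (34)
            Y[:, block] = student_t.ppf(probs, df=nu_inn) * np.sqrt((nu_inn - 2) / nu_inn)
            start += s_k
        return Y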

6 Conclusion

We have presented a model that captures the static, dynamic and dependency characteristics of hedge
fund returns. Individual hedge fund returns are non-normally distributed and show autocorrelation and
heteroscedasticity, and their volatility adapts when the hedge fund manager underperforms or
outperforms. For multiple hedge funds, we have examined joint events and observed that tail
dependency is present.

Our model consists of a univariate process and a copula structure on the innovations of that process.
The univariate process is an asymmetric generalization of a GARCH(1,1) process while the

dependency is captured by a grouped-t copula with different tail dependency for each strategy. This
model shows compelling out-of-sample backtesting results.

This approach can be applied to any hedge fund and in particular to non-transparent ones. Using only
hedge fund historical performance we may forecast the risk of a portfolio of hedge funds. A
straightforward model extension permits analysis of portfolios of hedge funds mixed with other asset
classes.

References
Daul, S., E. DeGiorgi, F. Lindskog, and A. McNeil (2003). The grouped t-copula with an
application to credit risk. Risk 16, 73–76.
Ding, Z. and C. W. J. Granger (1996). Modeling volatility persistence of speculative returns: A
new approach. Journal of Econometrics 73, 185–215.
Embrechts, P., A. McNeil, and D. Straumann (2002). Correlation and dependence in risk
management: Properties and pitfalls. In M. Dempster (Ed.), Risk Management: Value-at-Risk
and Beyond, pp. 176–223. Cambridge University Press.
Hansen, B. E. (1994). Autoregressive conditional density estimation. International Economic
Review 35(3), 705–730.
Nelsen, R. (1999). An Introduction to Copulas. New York: Springer.
Zumbach, G. (2004). Volatility processes and volatility forecast with long memory. Quantitative
Finance 4, 70–86.
Zumbach, G. (2007). Backtesting risk methodologies from one day to one year. RiskMetrics
Journal 7(1), 17–60.