
Biometrika (20xx), xx, xx, pp. 1–11

© 20xx Biometrika Trust

Advance Access publication on xx xxxx 20xx

Printed in Great Britain

Nonlinear Shrinkage Estimation of Large Integrated Covariance Matrix
BY CLIFFORD LAM AND CHARLIE HU
Department of Statistics, London School of Economics and Political Science,
Houghton Street, London WC2A 2AE, U.K.
C.Lam2@lse.ac.uk Q.Hu1@lse.ac.uk
SUMMARY
While the use of intra-day price data increases the sample size substantially for asset allocation, the usual realized covariance matrix still suffers from bias contributed by the extreme eigenvalues when the number of assets is large. We introduce a novel nonlinear shrinkage estimator for the integrated volatility matrix which shrinks the extreme eigenvalues of a realized covariance matrix back to an acceptable level, and enjoys a certain asymptotic efficiency at the same time, all in a high-dimensional setting where the number of assets can have the same order as the number of data points. Compared to a time-variation adjusted realized covariance estimator and the usual realized covariance matrix, our estimator demonstrates favorable performance in both simulations and a real data analysis in portfolio allocation.


Some key words: High dimension; Intra-day volatility; Realized covariance; Extreme eigenvalues; Portfolio allocation.

1. INTRODUCTION
With intra-day trading data now easily obtainable, financial market analysts and academic researchers enjoy more accurate return or volatility matrix estimation through the substantial increase in sample size. Yet, with respect to integrated covariance matrix estimation for asset returns, there are several well-known challenges in using such intra-day price data. For instance, when tick-by-tick price data is used, the contamination by market microstructure noise (Aït-Sahalia et al., 2005; Asparouhova et al., 2013) can hugely bias the realized covariance matrix. Non-synchronous trading times present another challenge when there is more than one asset to consider.
To present a further challenge, it is well documented that with independent and identically distributed random vectors, random matrix theory implies that the corresponding sample covariance matrix has biased extreme eigenvalues when the dimension $p$ of the random vectors has the same order as the sample size $n$, i.e., $p/n \to c$ for some constant $c > 0$. See for instance Bai & Silverstein (2010) for more details. This suggests that the realized covariance matrix, which is essentially a sample covariance matrix when all covolatilities are constants and all log prices have zero drift with equally-spaced observation times (see the diffusion process for the log price defined in (1) for more details), can have biased extreme eigenvalues under the high dimensional setting $p/n \to c > 0$. The resulting detrimental effects on risk estimation or portfolio allocation are thoroughly demonstrated in Bai et al. (2009) when inter-day price data is used.
To rectify this bias problem, many researchers focused on regularized estimation of covariance
or precision matrices with special structures. These go from banded (Bickel & Levina, 2008b)


or sparse covariance matrix (Bickel & Levina, 2008a; Cai & Zhou, 2012; Lam & Fan, 2009;
Rothman et al., 2008), sparse precision matrix (Friedman et al., 2008; Meinshausen & Bühlmann,
2006), sparse modified Cholesky factor (Pourahmadi, 2007), to a spiked covariance matrix from
a factor model (Fan et al., 2008, 2011), or combinations of these (Fan et al., 2013).
Recently, Ledoit & Wolf (2012) proposed a nonlinear shrinkage formula for shrinking the extreme eigenvalues in a sample covariance matrix without assuming a particular structure of the true covariance matrix. The method is generalized in Ledoit & Wolf (2014) for portfolio allocation with remarkable results. However, such a nonlinear shrinkage formula is only applicable to the independent and identically distributed random vector setting. It is not applicable to intra-day price data since the volatility within a trading day is highly variable, so that asset returns at different time periods, albeit independent theoretically, are not identically distributed.
Lam (2016) proves that by splitting the data into two independent portions of certain sizes, one can achieve the same nonlinear shrinkage asymptotically without the need to evaluate a shrinkage formula as in Ledoit & Wolf (2012), which can be computationally expensive. At the same time, such a data splitting approach can be generalized to adapt to different data settings. In this paper, we modify the method proposed in Lam (2016) to achieve nonlinear shrinkage of eigenvalues in the realized covariance matrix using intra-day price data. We use the same assumption as in Zheng & Li (2011) (see Assumption 1 in Section 2 and the details therein) to overcome the difficulty of time-varying volatilities for all underlying stocks. Ultimately, our method produces a positive definite integrated covariance matrix asymptotically almost surely with shrinkage of eigenvalues achieved nonlinearly, while local integrated covolatilities are adapted and estimated accurately. Our method is fast since it involves only eigen-decompositions of matrices of size $p \times p$, which is not computationally expensive when $p$ is of the order of hundreds. This is the typical order for $p$ in the case of portfolio allocation.
The rest of the paper is organized as follows. We first present the framework for the data together with the notations and the main assumptions to be used in Section 2. Our method of estimation is detailed in Section 2·1, while Section 3 presents all related theories. Simulation results are given in Section 4, and a real data example of portfolio allocation is presented in Section 4·2. All proofs are presented in the supplementary materials accompanying this paper.

2. FRAMEWORK AND METHODOLOGY
Let $X_t = (X_t^{(1)}, \ldots, X_t^{(p)})^T$ be a $p$-dimensional log-price process which is modeled by the diffusion process
$$dX_t = \mu_t\,dt + \Theta_t\,dW_t, \quad t \in [0, 1], \tag{1}$$
where $\mu_t$ is the drift, $\Theta_t$ is a $p \times p$ matrix called the (instantaneous) covolatility process, and $W_t = (W_t^{(1)}, \ldots, W_t^{(p)})^T$ is a $p$-dimensional standard Brownian motion. We want to estimate the integrated covariance matrix
$$\Sigma_p = \int_0^1 \Theta_t \Theta_t^T\,dt. \tag{2}$$

It is well-known that the log-price process $X_t$ is contaminated by market microstructure noise; see Zhang (2011) for instance when the tick-by-tick high-frequency trading data is used to calculate an integrated covariance estimator. In this paper, instead of using the tick-by-tick data which has the highest observation frequency possible, we use sparsely sampled data synchronized by refresh times (Andersen et al., 2001; Barndorff-Nielsen et al., 2011), so that the theory in our paper should be readily applicable. Hence in the sequel, we assume that we can observe the price $X_t$ at synchronous time points $\tau_{n,\ell}$, $\ell = 0, 1, \ldots, n$. The realized covariance matrix is then defined as
$$\Sigma_p^{RCV} = \sum_{\ell=1}^{n} \Delta X_\ell \Delta X_\ell^T, \quad \text{where } \Delta X_\ell := X_{\tau_{n,\ell}} - X_{\tau_{n,\ell-1}}. \tag{3}$$
Jacod & Protter (1998) show that as $n$ goes to infinity, the above estimator converges weakly to the true one defined in (2). Hence the realized covariance matrix is one of the most frequently used estimators for the integrated covariance matrix.
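As an illustration of (3), the following minimal sketch computes a realized covariance matrix with NumPy. The function name `realized_covariance` and the toy data (constant covolatility, no drift, equally spaced times) are assumptions made for this example only and are not part of the paper.

```python
import numpy as np

def realized_covariance(log_prices):
    """Realized covariance from an (n + 1) x p array of synchronous log-prices."""
    increments = np.diff(log_prices, axis=0)   # row l holds the increment Delta X_l
    return increments.T @ increments           # sum over l of Delta X_l Delta X_l^T

# Hypothetical toy data: covolatility 0.01 * I_p, zero drift, n = 500, p = 3.
rng = np.random.default_rng(0)
n, p = 500, 3
steps = 0.01 * np.sqrt(1.0 / n) * rng.standard_normal((n, p))
X = np.vstack([np.zeros(p), np.cumsum(steps, axis=0)])
rcv = realized_covariance(X)
```

In this toy setting the integrated covariance matrix is $10^{-4} I_p$, so the diagonal of `rcv` should be close to $10^{-4}$.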
While the intra-day volatility can change hugely within a short time period, it is not unreasonable to assume that the correlation of any two price processes stays constant within such a
period, say within a trading day. Following Zheng & Li (2011), for j = 1, . . . , p, write
$$dX_t^{(j)} = \mu_t^{(j)}\,dt + \sigma_t^{(j)}\,dZ_t^{(j)}, \tag{4}$$
where $\mu_t^{(j)}$, $\sigma_t^{(j)}$ are assumed to be càdlàg over $[0, 1]$, and the $Z_t^{(j)}$'s are one dimensional standard Brownian motions. Both the $\sigma_t^{(j)}$'s and the $Z_t^{(j)}$'s are related to $\Theta_t$ and $W_t$ in (1). We assume further, defining $\langle X, Y \rangle_t$ to be the quadratic covariation between the processes $X$ and $Y$:

Assumption 1. The correlation matrix process of $Z_t = (Z_t^{(1)}, \ldots, Z_t^{(p)})^T$, defined by
$$R_t = \left( \langle Z^{(j)}, Z^{(k)} \rangle_t / t \right)_{1 \le j,k \le p},$$
is constant and non-zero on $(0, 1]$ for each $j, k$. Furthermore, the correlation matrix process of $X_t$, defined by
$$\left( \frac{\int_0^t \sigma_s^{(j)} \sigma_s^{(k)}\,d\langle Z^{(j)}, Z^{(k)} \rangle_s}{\sqrt{\int_0^t (\sigma_s^{(j)})^2\,ds \int_0^t (\sigma_s^{(k)})^2\,ds}} \right)_{1 \le j,k \le p},$$
is constant on $(0, 1]$ for each $j, k$.


The rest of the assumptions in this paper can be found in Section 3. We present this assumption first since, following Proposition 4 in Zheng & Li (2011), the log-price process $X_t$ defined in (1) satisfying Assumption 1 is such that there exist a càdlàg process $(\gamma_t)_{t \in [0,1]}$ and a $p \times p$ matrix $\Lambda$ satisfying $\mathrm{tr}(\Lambda\Lambda^T) = p$ such that
$$\Theta_t = \gamma_t \Lambda. \tag{5}$$

The nonlinear shrinkage estimator described in the next section is based on this property.
2·1. Nonlinear shrinkage estimator
When the dimension $p$ is large relative to the sample size $n$, even for a sample covariance matrix constructed from independent and identically distributed random vectors, its extreme eigenvalues will be severely biased from the true ones (see chapter 5.2 of Bai & Silverstein (2010) for example). While various assumptions have been made on the true integrated covariance matrix like sparsity (Wang & Zou, 2010) or having a factor structure (Tao et al., 2011), in this paper we follow Ledoit & Wolf (2012) and introduce nonlinear shrinkage for regularization, which does not need a particular structural assumption on the true integrated covariance matrix itself.
However, since intra-day covariance can vary hugely within a short time period, the $\Delta X_\ell$'s defined in (3) are not identically distributed, and hence we cannot directly apply the nonlinear shrinkage formula in Ledoit & Wolf (2012) to the realized covariance matrix in (3). Instead, we use the data splitting idea for nonlinear shrinkage of eigenvalues in Lam (2016), and modify that method to accommodate the intra-day volatility change based on (5), which is a condition derived from Assumption 1 as proved in Zheng & Li (2011).
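To this end, observe that by (5), the integrated covariance matrix in (2) can be written as $\Sigma_p = \int_0^1 \gamma_t^2\,dt\,\Lambda\Lambda^T$. Zheng & Li (2011) proposed a so-called time-variation adjusted realized covariance matrix, defined as
$$\check\Sigma_p := \frac{\mathrm{tr}(\Sigma_p^{RCV})}{p}\,\check\Sigma, \quad \text{where } \check\Sigma := \frac{p}{n} \sum_{\ell=1}^{n} \frac{\Delta X_\ell \Delta X_\ell^T}{\|\Delta X_\ell\|^2}, \tag{6}$$
and $\|\cdot\|$ denotes the norm of a vector. They demonstrate that $\check\Sigma_p$ is a good estimator for $\Sigma_p$ by showing that $\mathrm{tr}(\Sigma_p^{RCV})/p$ is a good estimator for $\int_0^1 \gamma_t^2\,dt$, while $\check\Sigma$ is good for $\breve\Sigma = \Lambda\Lambda^T$. Here $\check\Sigma$ plays the role of a sample covariance matrix for estimating $\breve\Sigma$. Hence if $p/n \to c > 0$, then $\check\Sigma$ suffers from bias to the extreme eigenvalues as well.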
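The construction in (6) can be sketched as follows. The function names `sigma_check` and `tvarcv` are hypothetical, and the sketch assumes that no increment is exactly zero, so that every norm is positive.

```python
import numpy as np

def sigma_check(increments):
    """Self-normalised covariance-type matrix in (6); rows are the increments."""
    n, p = increments.shape
    unit = increments / np.linalg.norm(increments, axis=1, keepdims=True)
    return (p / n) * unit.T @ unit            # (p/n) * sum of unit-vector outer products

def tvarcv(increments):
    """Time-variation adjusted realized covariance matrix in (6)."""
    p = increments.shape[1]
    trace_rcv = np.sum(increments ** 2)       # tr(RCV) equals the sum of squared norms
    return (trace_rcv / p) * sigma_check(increments)
```

Because each increment is divided by its own norm, rescaling any single increment leaves `sigma_check` unchanged, which is how the time-varying scalar volatility is removed before the trace factor restores the overall level.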


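Remark 1. An intuition of why $\check\Sigma$ is similar to a sample covariance matrix can be seen as follows. If $\mu_t = 0$ in (1) and the $\tau_{n,\ell}$'s are independent of $W_t$ (see Assumptions 2 and 3 respectively in Section 3), then by model (1), we can write
$$\Delta X_\ell = \int_{\tau_{n,\ell-1}}^{\tau_{n,\ell}} \gamma_t \Lambda\,dW_t \overset{d}{=} \left( \int_{\tau_{n,\ell-1}}^{\tau_{n,\ell}} \gamma_t^2\,dt \right)^{1/2} \breve\Sigma^{1/2} Z_\ell,$$
where $\overset{d}{=}$ stands for equal in distribution, and the $Z_\ell$'s are independent random vectors each with $Z_\ell \sim N(0, I_p)$. Then
$$\check\Sigma \overset{d}{=} \frac{1}{n} \sum_{\ell=1}^{n} \frac{\breve\Sigma^{1/2} Z_\ell Z_\ell^T \breve\Sigma^{1/2}}{Z_\ell^T \breve\Sigma Z_\ell / p} = \breve\Sigma^{1/2} \left( \frac{1}{n} \sum_{\ell=1}^{n} \frac{Z_\ell Z_\ell^T}{Z_\ell^T \breve\Sigma Z_\ell / p} \right) \breve\Sigma^{1/2}.$$
We can actually show that $Z_\ell^T \breve\Sigma Z_\ell / p$ goes to 1 almost surely, leaving the above being the sample covariance matrix constructed from the $Z_\ell$'s sandwiched by $\breve\Sigma^{1/2}$.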

Following Lam (2016), since the $\Delta X_\ell$'s are independent following model (1), we split the data $\Delta\mathbf{X} = (\Delta X_1, \ldots, \Delta X_n)$ into two independent parts, say $\Delta\mathbf{X} = (\Delta\mathbf{X}_1, \Delta\mathbf{X}_2)$, with $\Delta\mathbf{X}_i$ having size $p \times n_i$ for $i = 1, 2$, such that $n = n_1 + n_2$. Define
$$\tilde\Sigma_i = \frac{p}{n_i} \sum_{\ell \in I_i} \frac{\Delta X_\ell \Delta X_\ell^T}{\|\Delta X_\ell\|^2},$$
where $I_i = \{\ell : \Delta X_\ell \in \Delta\mathbf{X}_i\}$. Carrying out an eigen-analysis on $\tilde\Sigma_1$, suppose $\tilde\Sigma_1 = P_1 D_1 P_1^T$. Then we introduce our estimator as
$$\hat\Sigma_p := \frac{\mathrm{tr}(\Sigma_p^{RCV})}{p}\,\hat\Sigma, \quad \text{where } \hat\Sigma := P_1\,\mathrm{diag}(P_1^T \tilde\Sigma_2 P_1)\,P_1^T, \tag{7}$$
with $\mathrm{diag}(\cdot)$ setting all non-diagonal elements of a matrix to 0. The estimator $\hat\Sigma$ above belongs to a class of rotation-equivariant estimators $\Sigma(D) = P_1 D P_1^T$, where $D$ is a diagonal matrix, and $P_1$ is the matrix containing all the eigenvectors of $\tilde\Sigma_1$. The choice of $D = \mathrm{diag}(P_1^T \tilde\Sigma_2 P_1)$ comes from solving
$$\min_D \left\| P_1 D P_1^T - \tilde\Sigma_2 \right\|_F,$$
where $\|A\|_F = \mathrm{tr}^{1/2}(AA^T)$ is the Frobenius norm of a matrix. Similar to Lam (2016), regularization of the eigenvalues in $D = \mathrm{diag}(P_1^T \tilde\Sigma_2 P_1)$ comes from the independence between $P_1$ and $\tilde\Sigma_2$, since $\Delta\mathbf{X}_1$ is independent of $\Delta\mathbf{X}_2$.
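The data-splitting construction in (7) can be sketched in a few lines of NumPy. The names `self_normalised_cov` and `nonlinear_shrinkage` are hypothetical, the split simply takes the first `n1` increments as the first part, and nonzero increments are assumed; this is an illustrative sketch rather than the authors' implementation.

```python
import numpy as np

def self_normalised_cov(increments):
    """(p / n_i) * sum of Delta X Delta X^T / ||Delta X||^2 over the given rows."""
    n_i, p = increments.shape
    unit = increments / np.linalg.norm(increments, axis=1, keepdims=True)
    return (p / n_i) * unit.T @ unit

def nonlinear_shrinkage(increments, n1):
    """Estimator (7): eigenvectors from part 1, eigenvalues regularised by part 2."""
    n, p = increments.shape
    sigma1 = self_normalised_cov(increments[:n1])
    sigma2 = self_normalised_cov(increments[n1:])
    _, P1 = np.linalg.eigh(sigma1)                    # columns are eigenvectors of sigma1
    d = np.sum(P1 * (sigma2 @ P1), axis=0)            # d[i] = p_1i^T sigma2 p_1i
    sigma_hat = (P1 * d) @ P1.T                       # P1 diag(d) P1^T
    return (np.sum(increments ** 2) / p) * sigma_hat  # rescale by tr(RCV) / p
```

Only one symmetric eigen-decomposition of a $p \times p$ matrix is needed per split, which is the source of the computational speed mentioned in Section 1.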


3. ASYMPTOTIC THEORY AND PRACTICAL IMPLEMENTATION
We introduce two more assumptions needed for our results to hold. Assumption 1 is presented in Section 2.

Assumption 2. The drift in (1) satisfies $\mu_t = 0$ for $t \in [0, 1]$. All eigenvalues of $\Theta_t \Theta_t^T$ are bounded uniformly from 0 and infinity in $t \in [0, 1]$.

Assumption 3. The observation times $\tau_{n,\ell}$'s are independent of the log-price $X_t$, and there exists a constant $C > 0$ such that for all positive integers $n$,
$$\max_{1 \le \ell \le n} n(\tau_{n,\ell} - \tau_{n,\ell-1}) \le C.$$

We set $\mu_t = 0$ in Assumption 2 for the ease of proofs and presentation. If $\mu_t$ is slowly varying locally, the results to be presented are still valid at the expense of longer and more complex proofs. The uniform bounds on the eigenvalues of $\Theta_t \Theta_t^T$ are needed so that the individual volatility process for each $X_t^{(i)}$ is bounded uniformly. Also, $\int_0^1 \gamma_t^2\,dt > 0$ uniformly, and finally, $\|\Sigma_p\| = O(1)$ uniformly as a result, which are all needed for our results to hold. These assumptions essentially treat $\gamma_t$ as non-random. Extension to $\gamma_t$ being stochastic can follow the lines of Zheng & Li (2011), but we keep it non-random for the ease of presentation and proofs as well.


LEMMA 1. Let Assumptions 1, 2 and 3 hold for the log-price process $X_t$ in (1). Then for the estimator $\hat\Sigma$ in (7), writing $P_1 = (p_{11}, \ldots, p_{1p})$, if $p/n \to c > 0$ and $\sum_{n_2 \ge 1} p\,n_2^{-5} < \infty$, we have
$$\max_{1 \le i \le p} \left| \frac{p_{1i}^T \tilde\Sigma_2 p_{1i} - p_{1i}^T \breve\Sigma p_{1i}}{p_{1i}^T \breve\Sigma p_{1i}} \right| \xrightarrow{a.s.} 0,$$
where $\xrightarrow{a.s.}$ represents almost sure convergence.

Since the eigenvalues of $\hat\Sigma$ are the $p_{1i}^T \tilde\Sigma_2 p_{1i}$'s, the above Lemma shows that they are regularized to $p_{1i}^T \breve\Sigma p_{1i}$ asymptotically almost surely, which has values bounded by $\lambda_{\min}(\breve\Sigma)$ and $\lambda_{\max}(\breve\Sigma)$, the minimum and maximum eigenvalues of $\breve\Sigma$ respectively. Assumption 2 ensures that these eigenvalues are uniformly bounded away from 0 and infinity, and hence $\hat\Sigma$ is asymptotically almost surely positive definite. This is true even when the constant $c > 1$, i.e., when $p$ is larger than $n$ as they grow together to infinity.
With this result, we can present the following theorem.

THEOREM 1. Let all the assumptions in Lemma 1 hold. Then as $p, n \to \infty$ such that $p/n \to c > 0$, $\hat\Sigma_p$ defined in (7) is almost surely positive definite.

This is an important result since $\Sigma_p$ is always assumed to be positive definite, and we want our estimator to be so too. This is certainly not the case for a sample covariance matrix when $p > n$, and is still not the case for $\check\Sigma_p$ defined in (6) by Zheng & Li (2011), which is demonstrated in our simulation results in Section 4.


Remark 2. Both Lemma 1 and Theorem 1 require $\sum_{n_2 \ge 1} p\,n_2^{-5} < \infty$. Following Lam (2016), we set $n_2 = a n^{1/2}$ where $a$ is a constant, so that when $p/n \to c > 0$, the condition is satisfied. See Section 3·1 for more details on how to find $n_2$ with finite sample.
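The summability claimed in Remark 2 can be checked directly. The short derivation below is only a sketch: it treats $p = cn$ as exact and ignores rounding of $n_2$ to an integer, both of which are simplifying assumptions.

```latex
n_2 = a\,n^{1/2} \;\Longrightarrow\; n = \frac{n_2^2}{a^2}, \qquad
p = c\,n = \frac{c\,n_2^2}{a^2}, \qquad
\sum_{n_2 \ge 1} p\,n_2^{-5} = \frac{c}{a^2} \sum_{n_2 \ge 1} n_2^{-3} < \infty .
```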
To present the rest of the results, we introduce a benchmark estimator for comparisons. This estimator is called the ideal estimator, defined by
$$\hat\Sigma_{Ideal} = \int_0^1 \gamma_t^2\,dt\; P\,\mathrm{diag}(P^T \breve\Sigma P)\,P^T. \tag{8}$$
This is similar to the proposed estimator defined in (7), except that the estimator $\mathrm{tr}(\Sigma_p^{RCV})/p$ is replaced by the population counterpart $\int_0^1 \gamma_t^2\,dt$, while $\tilde\Sigma_2$ is replaced by the population counterpart $\breve\Sigma$. Also, $P_1$ is replaced by $P$, which is the matrix containing all orthonormal eigenvectors for the covariance-type matrix $\check\Sigma$ defined in (6) using all data points. In line with Ledoit & Wolf (2012) and Lam (2016), this estimator utilizes all data points for calculating the eigenmatrix $P$, and it assumes the knowledge of $\breve\Sigma$ and $\int_0^1 \gamma_t^2\,dt$. With this, we define the efficiency loss of any estimator $\hat\Sigma$ as
$$EL(\Sigma_p, \hat\Sigma) := 1 - \frac{L(\Sigma_p, \hat\Sigma_{Ideal})}{L(\Sigma_p, \hat\Sigma)}, \tag{9}$$
where $L(\Sigma_p, \hat\Sigma)$ is a loss function for estimating $\Sigma_p$ by $\hat\Sigma$. We consider the Frobenius loss
$$L(\Sigma_p, \hat\Sigma) = \left\| \hat\Sigma - \Sigma_p \right\|_F^2, \tag{10}$$
and the inverse Stein's loss function in this paper,
$$L(\Sigma_p, \hat\Sigma) = \mathrm{tr}(\Sigma_p \hat\Sigma^{-1}) - \log\det(\Sigma_p \hat\Sigma^{-1}) - p. \tag{11}$$
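The loss functions (10) and (11) and the efficiency loss (9) can be evaluated as in the sketch below. The function names are hypothetical, and the inverse Stein's loss is computed through a linear solve rather than an explicit inverse, using the fact that $\hat\Sigma^{-1}\Sigma_p$ and $\Sigma_p\hat\Sigma^{-1}$ share the same trace and determinant.

```python
import numpy as np

def frobenius_loss(sigma, sigma_hat):
    """Squared Frobenius norm of the estimation error, as in (10)."""
    return np.linalg.norm(sigma_hat - sigma, "fro") ** 2

def inverse_stein_loss(sigma, sigma_hat):
    """Inverse Stein's loss (11); requires sigma_hat to be non-singular."""
    p = sigma.shape[0]
    m = np.linalg.solve(sigma_hat, sigma)     # sigma_hat^{-1} sigma
    _, logdet = np.linalg.slogdet(m)
    return np.trace(m) - logdet - p

def efficiency_loss(loss, sigma, sigma_ideal, sigma_hat):
    """Efficiency loss (9) of sigma_hat relative to the ideal estimator."""
    return 1.0 - loss(sigma, sigma_ideal) / loss(sigma, sigma_hat)
```

For a singular `sigma_hat` the solve step fails, which mirrors the statement in Section 4 that the inverse Stein's loss is not available for singular estimators.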


The class of rotation-equivariant estimators $\Sigma(D) = PDP^T$ minimizes the Frobenius norm exactly at $\hat\Sigma_{Ideal}$, while similar to Proposition 2 in Lam (2016), $\hat\Sigma_{Ideal}$ also minimizes the inverse Stein's loss within such a class of estimators. Hence it is intuitive that our estimator $\hat\Sigma_p$, being also rotation-equivariant but not utilizing all data points in calculating the eigenmatrix, will be less efficient in the sense that $EL(\Sigma_p, \hat\Sigma_p) > 0$. It turns out that asymptotically, $\hat\Sigma_p$ does as well as $\hat\Sigma_{Ideal}$, as shown in the following theorem.

THEOREM 2. Let all the assumptions in Lemma 1 hold. Then as $p, n \to \infty$ such that $p/n \to c > 0$, we have $EL(\Sigma_p, \hat\Sigma_p) \xrightarrow{a.s.} 0$ with respect to both the Frobenius and the inverse Stein's loss functions, as long as $p^{-1} L(\Sigma_p, \hat\Sigma_{Ideal})$ does not tend to 0 almost surely.

The requirement that $p^{-1} L(\Sigma_p, \hat\Sigma_{Ideal})$ does not go to 0 almost surely eliminates the case $\Sigma_p = \int_0^1 \gamma_t^2\,dt\, I_p$, when both loss functions attain 0 for the ideal estimator. Our estimator will still do a good job in such a case since $\mathrm{tr}(\Sigma_p^{RCV})/p$ will still be a good estimator for $\int_0^1 \gamma_t^2\,dt$ by the proof of Theorem 1, while $\hat\Sigma$ can still do a fine job when permutation of the data is allowed, as demonstrated in the simulation results in Lam (2016). Improvement by averaging and permutation will be described in Section 3·1.
3·1. Practical Implementation
Following Assumption 1, $\Delta X_\ell \Delta X_\ell^T / \|\Delta X_\ell\|^2$ is independent of $\gamma_t$ and is similar to a data point in constructing a sample covariance matrix, with these terms being independent of each other for different $\ell$; see Remark 1 in Section 2·1. This observation permits us to permute the data beforehand. Say at the $j$th permutation, we form a data matrix $\Delta\mathbf{X}^{(j)} = (\Delta\mathbf{X}_1^{(j)}, \Delta\mathbf{X}_2^{(j)})$, with $\Delta\mathbf{X}_i^{(j)}$ having size $p \times n_i$ for $i = 1, 2$, such that $n = n_1 + n_2$. Then we construct
$$\tilde\Sigma_i^{(j)} = \frac{p}{n_i} \sum_{\ell \in I_i^{(j)}} \frac{\Delta X_\ell \Delta X_\ell^T}{\|\Delta X_\ell\|^2}, \tag{12}$$
where $I_i^{(j)} = \{\ell : \Delta X_\ell \in \Delta\mathbf{X}_i^{(j)}\}$, and perform eigen-analysis on $\tilde\Sigma_1^{(j)}$, say $\tilde\Sigma_1^{(j)} = P_1^{(j)} D_1^{(j)} P_1^{(j)T}$. Then we can form the $j$th estimator as
$$\hat\Sigma_p^{(j)} := \frac{\mathrm{tr}(\Sigma_p^{RCV})}{p}\,\hat\Sigma^{(j)}, \quad \text{where } \hat\Sigma^{(j)} := P_1^{(j)}\,\mathrm{diag}(P_1^{(j)T} \tilde\Sigma_2^{(j)} P_1^{(j)})\,P_1^{(j)T}. \tag{13}$$
If we perform $M$ permutations and get $M$ estimators as above, we can define the averaged estimator as
$$\hat\Sigma_{p,M} := \frac{1}{M} \sum_{j=1}^{M} \hat\Sigma_p^{(j)}. \tag{14}$$
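The permutation-and-average scheme in (12) to (14) can be sketched as follows. The names `_unit_cov` and `averaged_estimator`, the `seed` argument and the use of NumPy's default random generator for the permutations are all assumptions of this example.

```python
import numpy as np

def _unit_cov(dx):
    """Self-normalised covariance-type matrix (12) for the rows of dx."""
    unit = dx / np.linalg.norm(dx, axis=1, keepdims=True)
    return (dx.shape[1] / dx.shape[0]) * unit.T @ unit

def averaged_estimator(increments, n1, M=50, seed=0):
    """Average of M permuted data-splitting estimators, as in (13) and (14)."""
    rng = np.random.default_rng(seed)
    n, p = increments.shape
    scale = np.sum(increments ** 2) / p            # tr(RCV) / p, common to all permutations
    total = np.zeros((p, p))
    for _ in range(M):
        perm = rng.permutation(n)                  # j-th permutation of the increments
        part1, part2 = increments[perm[:n1]], increments[perm[n1:]]
        _, P1 = np.linalg.eigh(_unit_cov(part1))
        d = np.sum(P1 * (_unit_cov(part2) @ P1), axis=0)   # diag(P1^T sigma2 P1)
        total += (P1 * d) @ P1.T
    return scale * total / M
```

Each permutation costs one eigen-decomposition, so the total cost grows linearly in `M`.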

Note that in all $M$ estimators, we are only using one split location, $n_1$, for the data, instead of using several of them and then averaging the results similar to the grand average estimator in Abadir et al. (2010). To find the best split location empirically, we minimize the following function:
$$g(m) = \left\| \frac{1}{M} \sum_{j=1}^{M} \left( \hat\Sigma_p^{(j)} - \tilde\Sigma_2^{(j)} \right) \right\|_F^2, \tag{15}$$
where $\tilde\Sigma_2^{(j)}$ is defined in (12) and $\hat\Sigma_p^{(j)}$ in (13). A very similar function is also used to determine the split location for nonlinear shrinkage of a covariance matrix in Lam (2016).
We now show that the averaged estimator defined in (14) also enjoys good asymptotic efficiency.
We now show that the averaged estimator defined in (14) also enjoys good asymptotic efficiency.
THEOREM 3. Let all the assumptions in Lemma 1 hold. Suppose the number of permutations $M$ is finite in (14). Then as $p, n \to \infty$ such that $p/n \to c > 0$, we have $EL(\Sigma_p, \hat\Sigma_{p,M}) \to 0$ almost surely with respect to both the Frobenius and the inverse Stein's loss functions, as long as $p^{-1} L(\Sigma_p, \hat\Sigma_{Ideal})$ does not tend to 0 almost surely.

In practice, the estimator $\hat\Sigma_{p,M}$, with a good choice of $m$, performs much better than using just $M = 1$. We use $M = 50$ which provides a good trade-off between computational complexity and estimation accuracy with respect to the Frobenius or the inverse Stein's loss functions. For minimizing $g(m)$ defined in (15), we search the following split locations:
$$m = [2n^{1/2},\ 0.2n,\ 0.4n,\ 0.6n,\ 0.8n,\ n - 2.5n^{1/2},\ n - 1.5n^{1/2}].$$
Except for the case $\Sigma_p = \int_0^1 \gamma_t^2\,dt\, I_p$ which needs $n_1$ to be as small as possible (see the arguments provided in Lam (2016)), the two split locations $[n - 2.5n^{1/2}]$ and $[n - 1.5n^{1/2}]$ are those satisfying the condition $\sum_{n_2 \ge 1} p\,n_2^{-5} < \infty$ needed in all theorems presented when $p/n \to c > 0$. We include $0.2n$ to $0.8n$ for accommodating finite sample performance.
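The search over split locations can be sketched as below. Several choices here are assumptions of the sketch rather than statements from the paper: candidates are rounded to the nearest integer, the same random seed is reused for every candidate, and the criterion compares the trace-normalised matrix $\hat\Sigma^{(j)}$ from (13) with $\tilde\Sigma_2^{(j)}$ from (12) so that both sides are on the same scale. All function names are hypothetical.

```python
import numpy as np

def _unit_cov(dx):
    unit = dx / np.linalg.norm(dx, axis=1, keepdims=True)
    return (dx.shape[1] / dx.shape[0]) * unit.T @ unit

def candidate_splits(n):
    """Candidate split locations from Section 3·1, rounded to integers (assumption)."""
    r = n ** 0.5
    raw = [2 * r, 0.2 * n, 0.4 * n, 0.6 * n, 0.8 * n, n - 2.5 * r, n - 1.5 * r]
    return sorted({round(m) for m in raw if 1 <= round(m) < n})

def split_criterion(increments, m, M=50, seed=0):
    """Criterion in the spirit of (15) for a split with n1 = m."""
    rng = np.random.default_rng(seed)
    n, p = increments.shape
    diff = np.zeros((p, p))
    for _ in range(M):
        perm = rng.permutation(n)
        s2 = _unit_cov(increments[perm[m:]])
        _, P1 = np.linalg.eigh(_unit_cov(increments[perm[:m]]))
        d = np.sum(P1 * (s2 @ P1), axis=0)
        diff += (P1 * d) @ P1.T - s2               # sigma_hat^(j) - sigma_tilde_2^(j)
    return np.linalg.norm(diff / M, "fro") ** 2

def best_split(increments, M=50, seed=0):
    splits = candidate_splits(increments.shape[0])
    scores = [split_criterion(increments, m, M, seed) for m in splits]
    return splits[int(np.argmin(scores))]
```

For $n = 200$, `candidate_splits` returns the seven locations 28, 40, 80, 120, 160, 165 and 179.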


4. EMPIRICAL RESULTS
We carry out simulation studies to compare the performances of our estimator in (7), the time variation-adjusted realized covariance matrix in (6) and the realized covariance matrix in (3) by comparing their Frobenius and inverse Stein's losses defined in (10) and (11) respectively. Then in Section 4·1, we consider a trading exercise using simulated market data and compare the risks associated with the minimum variance portfolios constructed using these three different estimators. Finally, in Section 4·2, we consider real data from the New York Stock Exchange.
Consider two different scenarios for the diffusion process $\{X_t\}$ defined in (1), with $\mu_t = 0$ and $\Theta_t = \gamma_t \Lambda$ as in (5). One has $\gamma_t$ being piecewise constant, the other has $\gamma_t$ being continuous, detailed as follows:
Design I: Piecewise constants. We take $\gamma_t$ to be
$$\gamma_t = \begin{cases} \sqrt{0.0007}, & t \in [0, 1/4) \cup [3/4, 1], \\ \sqrt{0.0001}, & t \in [1/4, 3/4). \end{cases}$$
Design II: Continuous path. We take $\gamma_t$ to be
$$\gamma_t = \sqrt{0.0009 + 0.0008 \cos(2\pi t)}, \quad t \in [0, 1].$$
We assume $\Lambda = I_p$ and the observation times are taken to be equidistant, where $\tau_{n,\ell} = \ell/n$, $\ell = 1, \ldots, n$. We generate $\{X_t\}$ using model (1) and get $n = 200$ discrete observations, and consider $p = 100, 200$. For each design and each $(n, p)$ combination, we repeat the simulations 1000 times, and compare the mean Frobenius and inverse Stein's losses for our proposed estimator, the time variation-adjusted realized covariance matrix and the realized covariance matrix.
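Increments under the two designs can be generated as in the sketch below. With $\mu_t = 0$ and $\Lambda = I_p$, each increment is Gaussian with covariance $\int \gamma_t^2\,dt\; I_p$ over its observation interval. The sketch takes $\gamma_t^2$ to be 0.0007 or 0.0001 in Design I and $0.0009 + 0.0008\cos(2\pi t)$ in Design II, which is an assumption about the scaling of the designs, and approximates the integral by a midpoint rule; the function names and the `sub` parameter are hypothetical.

```python
import numpy as np

def gamma_design(t, design):
    """Scalar volatility gamma_t for Design "I" or "II" (assumed scaling, see lead-in)."""
    t = np.asarray(t, dtype=float)
    if design == "I":
        outer = (t < 0.25) | (t >= 0.75)
        return np.sqrt(np.where(outer, 0.0007, 0.0001))
    return np.sqrt(0.0009 + 0.0008 * np.cos(2 * np.pi * t))

def simulate_increments(n, p, design, rng, sub=20):
    """n x p array of increments on an equidistant grid with Lambda = I_p."""
    # Midpoints of `sub` sub-intervals inside each of the n observation intervals.
    mids = (np.arange(n * sub) + 0.5) / (n * sub)
    var = (gamma_design(mids, design) ** 2).reshape(n, sub).mean(axis=1) / n
    return np.sqrt(var)[:, None] * rng.standard_normal((n, p))
```

Under these assumptions the integrated variance $\int_0^1 \gamma_t^2\,dt$ equals 0.0004 for Design I and 0.0009 for Design II, which gives a quick sanity check on simulated data through $\mathrm{tr}(\Sigma_p^{RCV})/p$.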
Table 1 presents the simulation results. It is clear that overall, our proposed estimator performs
the best. In particular, since the realized covariance or the time variation-adjusted realized covariance matrices are singular when p = 200, their inverses do not exist. In contrast, our proposed
estimator is always non-singular and stable even in this case, which is in line with Theorem 1.
4·1. A market trading exercise
As an application in finance, we simulate market trading data in this section and construct minimum variance portfolios using the three different estimators compared in the previous section. Given an integrated covariance matrix $\Sigma_p$, the minimum variance portfolio solves
$$\min_{w:\, w^T 1_p = 1} w^T \Sigma_p w,$$
where $1_p$ is a vector of $p$ ones. The solution to the above is given by
$$w_{opt} = \frac{\Sigma_p^{-1} 1_p}{1_p^T \Sigma_p^{-1} 1_p}. \tag{16}$$
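The weights in (16) can be computed without forming the inverse explicitly. The function name `min_variance_weights` is hypothetical, and the sketch assumes the supplied covariance matrix is non-singular.

```python
import numpy as np

def min_variance_weights(sigma):
    """Minimum variance weights (16) for a non-singular covariance matrix."""
    ones = np.ones(sigma.shape[0])
    x = np.linalg.solve(sigma, ones)   # Sigma^{-1} 1_p via a linear solve
    return x / (ones @ x)              # normalise so the weights sum to one

# Hypothetical two-asset example with variances 1 and 4 and no correlation.
w = min_variance_weights(np.diag([1.0, 4.0]))
```

In the two-asset example the less volatile asset receives weight 0.8 and the other 0.2, since the weights are proportional to the inverse variances.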

For the price data, following Barndorff-Nielsen et al. (2011) and Fan et al. (2012), we simulate $p = 100$ stock prices for 200 days using $X_t^{o(i)} = X_t^{(i)} + \epsilon_t^{(i)}$, where $X_t^{(i)}$ is the underlying log-price, and $\epsilon_t^{(i)}$ models the market microstructure noise, with $\epsilon_t^{(i)} \sim N(0, 0.0005^2)$ and the noise terms assumed to be independent of each other. The underlying log-price $X_t^{(i)}$ is generated by the stochastic volatility model. For $i = 1, \ldots, 100$,
$$dX_t^{(i)} = \mu^{(i)}\,dt + \rho^{(i)} \sigma_t^{(i)}\,dB_t^{(i)} + \sqrt{1 - (\rho^{(i)})^2}\,\sigma_t^{(i)}\,dW_t + \nu^{(i)}\,dZ_t,$$
where $\{W_t\}$, $\{Z_t\}$ and the $\{B_t^{(i)}\}$'s are all independent standard Brownian motions. The process $\{Z_t\}$ plays the role of a pervasive factor, which is usually the market factor in asset returns. The spot volatility $\sigma_t^{(i)} = \exp(\varrho_t^{(i)})$ follows the independent Ornstein-Uhlenbeck process
$$d\varrho_t^{(i)} = \alpha^{(i)} (\beta_0^{(i)} - \varrho_t^{(i)})\,dt + \beta_1^{(i)}\,dU_t^{(i)},$$
where the $\{U_t^{(i)}\}$'s are independent standard Brownian motions. We use
$$(\mu^{(i)}, \beta_0^{(i)}, \beta_1^{(i)}, \alpha^{(i)}, \rho^{(i)}) = (0.03x_1^{(i)},\ -x_2^{(i)},\ 0.75x_3^{(i)},\ -1/40\,x_4^{(i)},\ -0.7) \quad \text{and} \quad \nu^{(i)} = \exp(\beta_0^{(i)}),$$
where the $x_j^{(i)}$'s are independent and uniformly distributed on the interval $[0.7, 1.3]$. The initial value of each log-price is set at $X_0^{(i)} = 1$ and the starting spot volatility $\varrho_0^{(i)} = 0$.

Design I
                          Proposed      Time variation-adjusted   Realized covariance
  Frobenius loss
    p = 100               .13 (.02)     2.8 (.04)                 3.6 (.06)
    p = 200               .55 (.17)     69 (31)                   1564 (63)
  Inverse Stein's loss
    p = 100               .17 (.014)    5.63 (.058)               7.08 (.08)
    p = 200               88 (15)       -                         -

Design II
                          Proposed      Time variation-adjusted   Realized covariance
  Frobenius loss
    p = 100               .29 (.03)     6.32 (.09)                7.55 (.1)
    p = 200               .38 (.03)     13 (10)                   15 (20)
  Inverse Stein's loss
    p = 100               .54 (.1)      693 (31)                  1232 (53.8)
    p = 200               88 (16)       -                         -

Table 1. Mean and standard deviation (in brackets) of losses for different methods. All values reported in this table are multiplied by 1000. Upper table: results for Design I. Lower table: results for Design II. For $p = 200$, the time variation-adjusted and realized covariance matrices are always singular, and hence their inverse Stein's losses are infinite.

We simulate the trading times independently from the price data assuming the transaction times for each stock follow independent Poisson processes with rates $\lambda_1, \ldots, \lambda_{100}$ respectively, where $\lambda_i = 0.01i \times 23400$. We set this because normal trading time for one day is 23400 seconds.
After simulating the data, we split a trading day into 15-minute intervals, and set the price data
for each stock at the end of each interval as the price observed at the trade right before the end
of the interval. The data is used to calculate various integrated covariance estimators, including
our proposed one.
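The sampling rule described above, taking for each 15-minute interval the price of the last trade before the interval ends, can be sketched as follows. The function name `previous_tick_sample` is hypothetical, trade times are assumed to be sorted and measured in seconds from the start of the trading day, and a trade falling exactly on an interval end is treated as belonging to that interval, which is an assumption of this sketch.

```python
import numpy as np

def previous_tick_sample(trade_times, trade_prices, grid):
    """Price of the last trade at or before each grid time (NaN if none yet)."""
    idx = np.searchsorted(trade_times, grid, side="right") - 1
    out = np.full(len(grid), np.nan)
    ok = idx >= 0                                    # grid points with at least one prior trade
    out[ok] = np.asarray(trade_prices, dtype=float)[idx[ok]]
    return out

# Ends of the fifteen-minute intervals in a 23400-second trading day.
grid = np.arange(900, 23400 + 1, 900)
```

Applying this to every stock on the same grid yields synchronous observations from asynchronous trades.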
At the start, we invest 1 unit of capital using the minimum variance allocations (16) constructed
from using different estimators of the integrated covariance matrix. Each time we use a 60-day
training window (so the first trade starts on day 61, and the last one on day 200) and we reevaluate our portfolio weights every 5 days, using the past 60 days of data as a training set, until
we reach day 195.
In Table 2, we report the mean of three risks. The first one is the theoretical risk $R(w_{opt}) = w_{opt}^T \Sigma_p w_{opt}$, where $w_{opt}$ is calculated as in (16) using the true integrated covariance matrix of the underlying log-return over the past 60-day training period and $\Sigma_p$ is the true integrated covariance matrix over the 5-day investment period. The second one is the actual risk $R(\hat w_{opt}) = \hat w_{opt}^T \Sigma_p \hat w_{opt}$, where $\hat w_{opt}$ is calculated using different integrated covariance matrix estimators. Finally the perceived risk is defined by $\hat R(\hat w_{opt}) = \hat w_{opt}^T \hat\Sigma_p \hat w_{opt}$.

  Theoretical risk: .735

                     Proposed   Time variation-adjusted   Realized covariance
  Actual risk        .922       3.918                     4.115
  Perceived risk     .753       3.869                     4.034

Table 2. Mean of theoretical risk $R(w_{opt})$, actual risk $R(\hat w_{opt})$ and perceived risk $\hat R(\hat w_{opt})$.

We can see from Table 2 that our method has the best performance among all three different methods, and has the risk closest to the theoretical one. In particular, our method has the smallest actual risk, which is the most relevant risk in practice.
4·2. Portfolio allocation on NYSE data
We consider $p = 45$ stocks from the New York Stock Exchange from January 1 of 2013 to December 31 of 2013 (245 trading days). We choose the stocks from mid-cap energy sector stocks. We downloaded all the trades of these stocks from Wharton Research Data Services (WRDS, https://wrds-web.wharton.upenn.edu/). The raw data are of high frequency nature. As mentioned before, the stocks have non-synchronous trading times and all the log-prices are contaminated by market microstructure noise.
Like the market trading exercise in Section 4·1, we consider trades in 15-minute intervals on every trading day from 9:30 to 16:00, with each log-price being the observed one from a trade right before a 15-minute interval ends. This results in a total of 6732 observations over the 245 trading days. Hence on average there are around 27 observations per day.
We consider two settings. In the first, we use 20-day training windows and re-evaluate portfolio weights every 5 days. The other setting uses 5-day training windows and re-evaluates portfolio weights every day. We use the annualized out-of-sample standard deviation $\hat\sigma$, together with the annualized portfolio return $\hat\mu$ and the Sharpe ratio $\hat\mu/\hat\sigma$, to gauge the performance of each method. For 20-day training windows and a 5-day re-evaluation period, $\hat\mu$ and $\hat\sigma$ are defined by
$$\hat\mu = 52 \times \frac{1}{45} \sum_{i=5}^{49} w_i^T r_i, \qquad \hat\sigma = \sqrt{52} \times \left( \frac{1}{45} \sum_{i=5}^{49} (w_i^T r_i - \hat\mu)^2 \right)^{1/2}.$$
We use the annualized out-of-sample standard deviation since we do not know the true underlying integrated covariance matrix, and hence the actual risk cannot be calculated. For 5-day training windows with daily re-evaluation of portfolio weights, $\hat\mu$ and $\hat\sigma$ are defined by
$$\hat\mu = 252 \times \frac{1}{240} \sum_{i=6}^{245} w_i^T r_i, \qquad \hat\sigma = \sqrt{252} \times \left( \frac{1}{240} \sum_{i=6}^{245} (w_i^T r_i - \hat\mu)^2 \right)^{1/2}.$$
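A common way to compute such annualized summaries from a series of realized portfolio returns $w_i^T r_i$ is sketched below. The function name `annualised_performance` is hypothetical, and the sketch makes two assumptions that should be checked against the intended definitions: deviations are taken around the per-period mean return, and the standard deviation is scaled by the square root of the number of periods per year.

```python
import numpy as np

def annualised_performance(period_returns, periods_per_year):
    """Annualised mean, standard deviation and Sharpe ratio of portfolio returns."""
    r = np.asarray(period_returns, dtype=float)
    mu = periods_per_year * r.mean()               # annualised return
    sigma = np.sqrt(periods_per_year) * r.std()    # annualised standard deviation (divisor T)
    return mu, sigma, mu / sigma
```

For weekly re-evaluation one would pass `periods_per_year=52`, and for daily re-evaluation `periods_per_year=252`, matching the two settings described above.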

REFERENCES
ABADIR, K. M., DISTASO, W. & ŽIKEŠ, F. (2010). Model-free estimation of large variance matrices. The Rimini Centre for Economic Analysis, WP 10-17.
AÏT-SAHALIA, Y., MYKLAND, P. A. & ZHANG, L. (2005). How often to sample a continuous-time process in the presence of market microstructure noise. Review of Financial Studies 18, 351–416.
ANDERSEN, T., BOLLERSLEV, T., DIEBOLD, F. & LABYS, P. (2001). The distribution of realized exchange rate volatility. Journal of the American Statistical Association 96, 42–55.
ASPAROUHOVA, E., BESSEMBINDER, H. & KALCHEVA, I. (2013). Noisy prices and inference regarding returns. The Journal of Finance 68, 665–714.
BAI, Z., LIU, H. & WONG, W.-K. (2009). Enhancement of the applicability of Markowitz's portfolio optimization by utilizing random matrix theory. Mathematical Finance 19, 639–667.
BAI, Z. & SILVERSTEIN, J. (2010). Spectral Analysis of Large Dimensional Random Matrices. New York: Springer Series in Statistics, 2nd ed.
BARNDORFF-NIELSEN, O. E., HANSEN, P. R., LUNDE, A. & SHEPHARD, N. (2011). Multivariate realised kernels: Consistent positive semi-definite estimators of the covariation of equity prices with noise and non-synchronous trading. Journal of Econometrics 162, 149–169.
BICKEL, P. J. & LEVINA, E. (2008a). Covariance regularization by thresholding. Ann. Statist. 36, 2577–2604.
BICKEL, P. J. & LEVINA, E. (2008b). Regularized estimation of large covariance matrices. Ann. Statist. 36, 199–227.
CAI, T. T. & ZHOU, H. H. (2012). Optimal rates of convergence for sparse covariance matrix estimation. Ann. Statist. 40, 2389–2420.
FAN, J., FAN, Y. & LV, J. (2008). High dimensional covariance matrix estimation using a factor model. Journal of Econometrics 147, 186–197.
FAN, J., LI, Y. & YU, K. (2012). Vast volatility matrix estimation using high-frequency data for portfolio selection. Journal of the American Statistical Association 107, 412–428.
FAN, J., LIAO, Y. & MINCHEVA, M. (2011). High-dimensional covariance matrix estimation in approximate factor models. Ann. Statist. 39, 3320–3356.
FAN, J., LIAO, Y. & MINCHEVA, M. (2013). Large covariance estimation by thresholding principal orthogonal complements. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 75, 603–680.
FRIEDMAN, J., HASTIE, T. & TIBSHIRANI, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9, 432–441.
JACOD, J. & PROTTER, P. (1998). Asymptotic error distributions for the Euler method for stochastic differential equations. Ann. Probab. 26, 267–307.
LAM, C. (2016). Nonparametric eigenvalue-regularized precision or covariance matrix estimator. Ann. Statist. To appear.
LAM, C. & FAN, J. (2009). Sparsistency and rates of convergence in large covariance matrix estimation. Ann. Statist. 37, 4254–4278.
LEDOIT, O. & WOLF, M. (2012). Nonlinear shrinkage estimation of large-dimensional covariance matrices. Ann. Statist. 40, 1024–1060.
LEDOIT, O. & WOLF, M. (2014). Nonlinear shrinkage of the covariance matrix for portfolio selection: Markowitz meets Goldilocks. ECON - Working Papers 137, Department of Economics - University of Zurich.
MEINSHAUSEN, N. & BÜHLMANN, P. (2006). High-dimensional graphs and variable selection with the lasso. Ann. Statist. 34, 1436–1462.
POURAHMADI, M. (2007). Cholesky decompositions and estimation of a covariance matrix: Orthogonality of variance-correlation parameters. Biometrika 94, 1006–1013.
ROTHMAN, A. J., BICKEL, P. J., LEVINA, E. & ZHU, J. (2008). Sparse permutation invariant covariance estimation. Electron. J. Statist. 2, 494–515.
TAO, M., WANG, Y., YAO, Q. & ZOU, J. (2011). Large volatility matrix inference via combining low-frequency and high-frequency approaches. Journal of the American Statistical Association 106, 1025–1040.
WANG, Y. & ZOU, J. (2010). Vast volatility matrix estimation for high-frequency financial data. Ann. Statist. 38, 943–978.
ZHANG, L. (2011). Estimating covariation: Epps effect, microstructure noise. Journal of Econometrics 160, 33–47.
ZHENG, X. & LI, Y. (2011). On the estimation of integrated covariance matrices of high dimensional diffusion processes. Ann. Statist. 39, 3121–3151.

[Received xxxx 20xx. Revised xxxx 20xx]
