CVX Applications

Convex Optimization Applications
Stephen Boyd and Steven Diamond

EE & CS Departments
Stanford University
MLSS, Kyoto, August 23-24 2015
Outline
Portfolio Optimization
Worst-Case Risk Analysis
Optimal Advertising
Regression Variations
Model Fitting
Portfolio Optimization
Portfolio allocation vector
invest fraction $w_i$ in asset $i$, $i = 1, \ldots, n$
$w \in \mathbf{R}^n$ is portfolio allocation vector
$\mathbf{1}^T w = 1$
$w_i < 0$ means a short position in asset $i$ (borrow shares and sell now; must replace later)
$w \geq 0$ is a long only portfolio
$\|w\|_1 = \mathbf{1}^T w_+ + \mathbf{1}^T w_-$ is leverage (many other definitions used...)
Asset returns
investments held for one period
initial prices $p_i > 0$; end of period prices $p_i^+ > 0$
asset (fractional) returns $r_i = (p_i^+ - p_i)/p_i$
portfolio (fractional) return $R = r^T w$
common model: $r$ is a random variable, with mean $\mathbf{E}\, r = \mu$, covariance $\mathbf{E}(r - \mu)(r - \mu)^T = \Sigma$
so $R$ is a RV with $\mathbf{E} R = \mu^T w$, $\mathbf{var}(R) = w^T \Sigma w$
$\mathbf{E} R$ is (mean) return of portfolio
$\mathbf{var}(R)$ is risk of portfolio (risk also sometimes given as $\mathbf{std}(R) = \sqrt{\mathbf{var}(R)}$)
two objectives: high return, low risk
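As a small numerical illustration of the two formulas above, the sketch below evaluates $\mu^T w$ and $w^T \Sigma w$ with NumPy. The three-asset data and the helper name `portfolio_stats` are hypothetical, chosen only for illustration.

```python
import numpy as np

def portfolio_stats(w, mu, Sigma):
    """Return (mean return, variance, standard deviation) of R = r^T w."""
    w = np.asarray(w, dtype=float)
    mean_return = float(mu @ w)        # E R = mu^T w
    variance = float(w @ Sigma @ w)    # var(R) = w^T Sigma w
    return mean_return, variance, float(np.sqrt(variance))

# Hypothetical data for three assets (illustrative values only).
mu = np.array([0.10, 0.05, 0.02])
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.02, 0.00],
                  [0.00, 0.00, 0.01]])
w = np.array([0.5, 0.3, 0.2])          # long only, entries sum to 1

mean_return, variance, std = portfolio_stats(w, mu, Sigma)
print(mean_return, variance, std)
```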
Classical (Markowitz) portfolio optimization
maximize $\mu^T w - \gamma w^T \Sigma w$
subject to $\mathbf{1}^T w = 1$, $w \in \mathcal{W}$
variable $w \in \mathbf{R}^n$
$\mathcal{W}$ is set of allowed portfolios
common case: $\mathcal{W} = \mathbf{R}^n_+$ (long only portfolio)
$\gamma > 0$ is the risk aversion parameter
$\mu^T w - \gamma w^T \Sigma w$ is risk-adjusted return
varying $\gamma$ gives optimal risk-return trade-off
can also fix return and minimize risk, etc.
Example
optimal risk-return trade-off for 10 assets, long only portfolio
Example
return distributions for two risk aversion values
Portfolio constraints
$\mathcal{W} = \mathbf{R}^n$ (simple analytical solution)
leverage limit: $\|w\|_1 \leq L^{\max}$
market neutral: $m^T \Sigma w = 0$
    $m_i$ is capitalization of asset $i$
    $M = m^T r$ is market return
    $m^T \Sigma w = \mathbf{cov}(M, R)$
i.e., market neutral portfolio return is uncorrelated with market return
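For the unconstrained case $\mathcal{W} = \mathbf{R}^n$, the analytical solution can be sketched from the optimality conditions of the risk-adjusted return problem with only the budget constraint $\mathbf{1}^T w = 1$: stationarity gives $\mu - 2\gamma \Sigma w - \nu \mathbf{1} = 0$, and the budget constraint fixes the multiplier $\nu$. The function name and the data below are hypothetical, and the sketch assumes $\Sigma$ is positive definite.

```python
import numpy as np

def markowitz_unconstrained(mu, Sigma, gamma):
    """Maximize mu^T w - gamma * w^T Sigma w subject to 1^T w = 1 only.

    Assumes Sigma is positive definite. Returns (w, nu), where nu is the
    Lagrange multiplier of the budget constraint.
    """
    ones = np.ones(len(mu))
    Sinv_mu = np.linalg.solve(Sigma, mu)      # Sigma^{-1} mu
    Sinv_1 = np.linalg.solve(Sigma, ones)     # Sigma^{-1} 1
    nu = (ones @ Sinv_mu - 2.0 * gamma) / (ones @ Sinv_1)
    w = (Sinv_mu - nu * Sinv_1) / (2.0 * gamma)
    return w, nu

# Hypothetical data (illustrative values only).
mu = np.array([0.10, 0.05, 0.02])
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.02, 0.00],
                  [0.00, 0.00, 0.01]])
gamma = 2.0

w, nu = markowitz_unconstrained(mu, Sigma, gamma)
# Stationarity residual: should be (numerically) zero at the optimum.
residual = mu - 2.0 * gamma * (Sigma @ w) - nu * np.ones(3)
print(w, w.sum())
```

With no sign constraint on $w$, the result may contain short positions; the long only and leverage-limited cases generally need a numerical solver.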
Example
optimal risk-return trade-off curves for leverage limits 1, 2, 4
Example
three portfolios with $w^T \Sigma w = 2$, leverage limits $L^{\max} = 1, 2, 4$
Variations
require $\mu^T w \geq R^{\min}$, minimize $w^T \Sigma w$ or $\|\Sigma^{1/2} w\|_2$
include (broker) cost of short positions, $s^T (w)_-$, $s \geq 0$
include transaction cost (from previous portfolio $w^{\mathrm{prev}}$), $\kappa^T |w - w^{\mathrm{prev}}|^\eta$, $\kappa \geq 0$
common models: $\eta = 1, 3/2, 2$
Factor covariance model

$\Sigma = F \tilde\Sigma F^T + D$
$F \in \mathbf{R}^{n \times k}$, $k \ll n$ is factor loading matrix
$k$ is number of factors (or sectors), typically 10s
$F_{ij}$ is loading of asset $i$ to factor $j$
$D$ is diagonal matrix; $D_{ii} > 0$ is idiosyncratic risk
$\tilde\Sigma > 0$ is the factor covariance matrix
$F^T w \in \mathbf{R}^k$ gives portfolio factor exposures
portfolio is factor $j$ neutral if $(F^T w)_j = 0$
Portfolio optimization with factor covariance model
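maximize $\mu^T w - \gamma \left( f^T \tilde\Sigma f + w^T D w \right)$
subject to $\mathbf{1}^T w = 1$, $f = F^T w$, $w \in \mathcal{W}$, $f \in \mathcal{F}$
variables $w \in \mathbf{R}^n$ (allocations), $f \in \mathbf{R}^k$ (factor exposures)
$\mathcal{F}$ gives factor exposure constraints
computational advantage: $O(nk^2)$ vs. $O(n^3)$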
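The factor form of the risk can be checked numerically: with $f = F^T w$, the quantity $f^T \tilde\Sigma f + w^T D w$ equals $w^T \Sigma w$ without ever forming the $n \times n$ matrix $\Sigma$. The sketch below uses randomly generated, purely illustrative data; the sizes and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 5                              # hypothetical sizes, k << n

F = rng.standard_normal((n, k))            # factor loading matrix
A = rng.standard_normal((k, k))
Sigma_tilde = A @ A.T + np.eye(k)          # positive definite factor covariance
d = rng.uniform(0.1, 0.5, size=n)          # idiosyncratic risks D_ii > 0
w = rng.uniform(size=n)
w /= w.sum()                               # allocations summing to 1

# Risk via the full n-by-n covariance matrix.
Sigma = F @ Sigma_tilde @ F.T + np.diag(d)
risk_full = w @ Sigma @ w

# Risk via factor exposures f = F^T w; only k-by-k and diagonal work.
f = F.T @ w
risk_factor = f @ Sigma_tilde @ f + np.sum(d * w**2)

print(risk_full, risk_factor)
```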
Example
50 factors, 3000 assets
leverage limit = 2
solve with covariance given as
    single matrix
    factor model
CVXPY/ECOS single thread time

covariance       solve time
single matrix    687.26 sec
factor model     0.58 sec
Worst-Case Risk Analysis
Covariance uncertainty
single period Markowitz portfolio allocation problem
we have fixed portfolio allocation $w \in \mathbf{R}^n$
return covariance $\Sigma$ not known, but we believe $\Sigma \in \mathcal{S}$
$\mathcal{S}$ is convex set of possible covariance matrices
risk is $w^T \Sigma w$, a linear function of $\Sigma$
Worst-case risk analysis

what is the worst (maximum) risk, over all possible covariance matrices?
worst-case risk analysis problem:
maximize $w^T \Sigma w$
subject to $\Sigma \in \mathcal{S}$, $\Sigma \succeq 0$
... a convex problem with variable $\Sigma$
if the worst-case risk is not too bad, you can worry less
if not, you'll confront your worst nightmare
Example
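$w = (0.6, 0.5, 0.25, 0.65, 0)$
optimized for $\Sigma^{\mathrm{nom}}$, return 0.1, leverage limit 2
$\mathcal{S} = \{\Sigma^{\mathrm{nom}} + \Delta : |\Delta_{ii}| = 0,\ |\Delta_{ij}| \leq 0.2\}$,
$$
\Sigma^{\mathrm{nom}} = \begin{bmatrix}
0.86 & 0.34 & 0.14 & 0.15 & 0.55 \\
0.34 & 0.66 & 0.12 & 0.51 & 0.24 \\
0.14 & 0.12 & 0.45 & 0.06 & 0.11 \\
0.15 & 0.51 & 0.06 & 0.55 & 0.14 \\
0.55 & 0.24 & 0.11 & 0.14 & 0.9
\end{bmatrix}
$$
(minus signs may be missing from the entries of $w$ and $\Sigma^{\mathrm{nom}}$)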
Example
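nominal risk = 0.511
worst case risk = 0.917
$$
\text{worst case } \Delta = \begin{bmatrix}
0 & 0.2 & 0.2 & 0.2 & 0.08 \\
0.2 & 0 & 0.2 & 0.2 & 0.02 \\
0.2 & 0.2 & 0 & 0.2 & 0.05 \\
0.2 & 0.2 & 0.2 & 0 & 0.02 \\
0.08 & 0.02 & 0.05 & 0.02 & 0
\end{bmatrix}
$$
(minus signs may be missing from the entries of $\Delta$)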
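For an elementwise uncertainty set of the kind used in this example, $\Delta_{ii} = 0$ and $|\Delta_{ij}| \leq \delta$, there is a simple closed form if the constraint $\Sigma \succeq 0$ is ignored: since $w^T \Delta w = \sum_{i \neq j} \Delta_{ij} w_i w_j$, the maximum is attained at $\Delta_{ij} = \delta\, \mathrm{sign}(w_i w_j)$. The sketch below computes this and checks whether the resulting matrix happens to be positive semidefinite; if it is, the value solves the full problem, otherwise it is only an upper bound and a semidefinite solver is needed. The data and function name are hypothetical and are not the slide's example.

```python
import numpy as np

def box_worst_case(w, Sigma_nom, delta):
    """Maximize w^T (Sigma_nom + Delta) w over Delta_ii = 0, |Delta_ij| <= delta.

    The constraint Sigma >= 0 is NOT enforced; the returned flag reports
    whether the maximizer happens to satisfy it.
    """
    w = np.asarray(w, dtype=float)
    Delta = delta * np.sign(np.outer(w, w))   # push each term w_i w_j Delta_ij up
    np.fill_diagonal(Delta, 0.0)
    Sigma_wc = Sigma_nom + Delta
    risk = float(w @ Sigma_wc @ w)
    is_psd = bool(np.linalg.eigvalsh(Sigma_wc).min() >= -1e-9)
    return risk, Sigma_wc, is_psd

# Hypothetical three-asset data (illustrative values only).
w = np.array([0.5, 0.8, -0.3])
Sigma_nom = np.array([[1.0, 0.2, 0.1],
                      [0.2, 1.0, 0.3],
                      [0.1, 0.3, 1.0]])

nominal_risk = float(w @ Sigma_nom @ w)
worst_risk, Sigma_wc, is_psd = box_worst_case(w, Sigma_nom, delta=0.2)
print(nominal_risk, worst_risk, is_psd)
```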
Optimal Advertising
Ad display
$m$ advertisers/ads, $i = 1, \ldots, m$
$n$ time slots, $t = 1, \ldots, n$
$T_t$ is total traffic in time slot $t$
$D_{it} \geq 0$ is number of ad $i$ displayed in period $t$
$\sum_i D_{it} \leq T_t$
contracted minimum total displays: $\sum_t D_{it} \geq c_i$
goal: choose $D_{it}$
Clicks and revenue
$C_{it}$ is number of clicks on ad $i$ in period $t$
click model: $C_{it} = P_{it} D_{it}$, $P_{it} \in [0, 1]$
payment: $R_i > 0$ per click for ad $i$, up to budget $B_i$
ad revenue $S_i = \min\left\{ R_i \sum_t C_{it},\ B_i \right\}$
... a concave function of $D$
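The revenue formula can be evaluated directly once the displays are given. The sketch below is a minimal illustration with hypothetical data for two ads and three time slots; the function name `ad_revenue` is an assumption, not something from the slides.

```python
import numpy as np

def ad_revenue(D, P, R, B):
    """Per-ad revenue S_i = min(R_i * sum_t P_it * D_it, B_i)."""
    clicks = (P * D).sum(axis=1)       # total clicks per ad, sum_t C_it
    return np.minimum(R * clicks, B)   # payment is capped at the budget

# Hypothetical data: 2 ads, 3 time slots.
D = np.array([[100.0, 200.0, 300.0],
              [50.0, 50.0, 50.0]])     # displays D_it
P = np.array([[0.1, 0.1, 0.1],
              [0.2, 0.2, 0.2]])        # click-through rates P_it
R = np.array([1.0, 2.0])               # payment per click
B = np.array([50.0, 100.0])            # budgets

S = ad_revenue(D, P, R, B)
print(S)   # first ad hits its budget cap, second does not
```

The `min` with the budget is what makes each $S_i$ concave (rather than linear) in $D$.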
Ad optimization
choose displays to maximize revenue:
maximize $\sum_i S_i$
subject to $D \geq 0$, $D^T \mathbf{1} \leq T$, $D \mathbf{1} \geq c$
variable is $D \in \mathbf{R}^{m \times n}$
data are $T$, $c$, $R$, $B$, $P$
Example
24 hourly periods, 5 ads (A-E)
total traffic:
Example
ad data:

Ad      A        B        C        D        E
$c_i$   61000    80000    61000    23000    64000
$R_i$   0.15     1.18     0.57     2.08     2.43
$B_i$   25000    12000    12000    11000    17000
Example
$P_{it}$
Example
optimal $D_{it}$
Example
ad revenue

Ad                  A        B        C         D        E
$c_i$               61000    80000    61000     23000    64000
$R_i$               0.15     1.18     0.57      2.08     2.43
$B_i$               25000    12000    12000     11000    17000
$\sum_t D_{it}$     61000    80000    148116    23000    167323
$S_i$               182      12000    12000     11000    7760
Regression Variations
Standard regression
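given data $(x_i, y_i) \in \mathbf{R}^n \times \mathbf{R}$, $i = 1, \ldots, m$
fit linear (affine) model $\hat y_i = \beta^T x_i - v$, $\beta \in \mathbf{R}^n$, $v \in \mathbf{R}$
residuals are $r_i = \hat y_i - y_i$
least-squares: choose $\beta, v$ to minimize $\|r\|_2^2 = \sum_i r_i^2$
mean of optimal residuals is zero
can add (Tychonov) regularization: with $\lambda > 0$, minimize $\|r\|_2^2 + \lambda \|\beta\|_2^2$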
Robust (Huber) regression

replace square with Huber function
$$
\phi(u) = \begin{cases} u^2 & |u| \leq M \\ 2M|u| - M^2 & |u| > M \end{cases}
$$
$M > 0$ is the Huber threshold
same as least-squares for small residuals, but allows (some) large residuals
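The Huber function is easy to write out directly, which makes the "quadratic for small residuals, linear for large ones" behavior concrete. The sketch below is a plain-Python illustration; the function name `huber` and the default threshold are assumptions.

```python
def huber(u, M=1.0):
    """Huber penalty: u^2 for |u| <= M, and 2*M*|u| - M^2 for |u| > M."""
    a = abs(u)
    if a <= M:
        return a * a                  # least-squares regime
    return 2.0 * M * a - M * M        # linear growth for large residuals

# A residual of 3 costs 9 under the square penalty but only 5 under
# the Huber penalty with M = 1, so outliers pull the fit less.
small = huber(0.5)
large = huber(3.0)
print(small, large)
```

The two pieces agree at $|u| = M$, where both give $M^2$, so the penalty is continuous.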
Example
$m = 450$ measurements, $n = 300$ regressors
choose $\beta^{\mathrm{true}}$; $x_i \sim \mathcal{N}(0, I)$
set $y_i = (\beta^{\mathrm{true}})^T x_i + \epsilon_i$, $\epsilon_i \sim \mathcal{N}(0, 1)$
with probability $p$, replace $y_i$ with $-y_i$
data has fraction $p$ of (non-obvious) wrong measurements
distribution of 'good' and 'bad' $y_i$ are the same
try to recover $\beta^{\mathrm{true}} \in \mathbf{R}^n$ from measurements $y \in \mathbf{R}^m$
'prescient' version: we know which measurements are wrong
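A data set of this kind can be generated in a few lines. The sketch below follows the recipe above; drawing $\beta^{\mathrm{true}}$ from a standard normal distribution is an assumption (the slide only says to choose it), and the function name and seed are hypothetical.

```python
import numpy as np

def make_flipped_data(m=450, n=300, p=0.1, seed=0):
    """Generate y = X beta_true + noise, then flip the sign of y_i with probability p."""
    rng = np.random.default_rng(seed)
    beta_true = rng.standard_normal(n)          # assumed choice of beta_true
    X = rng.standard_normal((m, n))             # rows x_i ~ N(0, I)
    y = X @ beta_true + rng.standard_normal(m)  # epsilon_i ~ N(0, 1)
    flipped = rng.random(m) < p                 # which measurements are wrong
    y[flipped] = -y[flipped]
    return X, y, beta_true, flipped

X, y, beta_true, flipped = make_flipped_data(p=0.1)
print(X.shape, y.shape, int(flipped.sum()))
```

The `flipped` mask is what the 'prescient' version would have access to; a realistic fitting method sees only `X` and `y`.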
Example
50 problem instances, p varying from 0 to 0.15
Quantile regression
tilted $\ell_1$ penalty: for $\tau \in (0, 1)$,
$\phi_\tau(u) = \tau (u)_+ + (1 - \tau)(u)_- = (1/2)|u| + (\tau - 1/2) u$
quantile regression: choose $\beta, v$ to minimize $\sum_i \phi_\tau(r_i)$
$\tau = 0.5$: equal penalty for over- and under-estimating
$\tau = 0.1$: $9\times$ more penalty for under-estimating
$\tau = 0.9$: $9\times$ more penalty for over-estimating
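The two expressions for the tilted penalty, and the 9-to-1 asymmetry at $\tau = 0.1$, can be checked with a short sketch. The function names are hypothetical.

```python
def tilted_l1(u, tau):
    """tau * (u)_+ + (1 - tau) * (u)_-, with (u)_- = max(-u, 0)."""
    return tau * max(u, 0.0) + (1.0 - tau) * max(-u, 0.0)

def tilted_l1_alt(u, tau):
    """Equivalent form (1/2)|u| + (tau - 1/2) u."""
    return 0.5 * abs(u) + (tau - 0.5) * u

# With tau = 0.1, a negative residual costs 9 times as much as a
# positive residual of the same size.
over = tilted_l1(2.0, 0.1)     # positive residual
under = tilted_l1(-2.0, 0.1)   # negative residual
print(over, under, under / over)
```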
Quantile regression
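for $r_i \neq 0$, with $r_i = \beta^T x_i - v - y_i$ (so $\partial r_i / \partial v = -1$),
$$
\frac{\partial}{\partial v} \sum_i \phi_\tau(r_i) = -\tau\, |\{i : r_i > 0\}| + (1 - \tau)\, |\{i : r_i < 0\}|
$$
(roughly speaking) for optimal $v$ we have
$\tau\, |\{i : r_i > 0\}| = (1 - \tau)\, |\{i : r_i < 0\}|$
and so for optimal $v$, $\tau m = |\{i : r_i < 0\}|$
$\tau$-quantile of optimal residuals is zero
hence the name quantile regression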
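The counting argument can be seen on a toy problem. The sketch below makes a simplifying assumption: the predictor is a single constant $c$ (no regressors), with residuals $r_i = c - y_i$. Because the total penalty is convex and piecewise linear in $c$, a minimizer can be found among the data values. The data and names are hypothetical.

```python
def tilted_l1(u, tau):
    """tau * (u)_+ + (1 - tau) * (u)_-."""
    return tau * max(u, 0.0) + (1.0 - tau) * max(-u, 0.0)

def fit_constant(y, tau):
    """Constant c minimizing sum_i phi_tau(c - y_i), searched over the data values."""
    def total_loss(c):
        return sum(tilted_l1(c - yi, tau) for yi in y)
    return min(sorted(y), key=total_loss)

y = [float(v) for v in range(1, 12)]    # hypothetical data 1, 2, ..., 11
tau = 0.3
c = fit_constant(y, tau)

residuals = [c - yi for yi in y]
n_neg = sum(1 for r in residuals if r < 0)
# The number of negative residuals is close to tau * m
# (exact equality is not possible here, since tau * m = 3.3).
print(c, n_neg, tau * len(y))
```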
Example
time series $x_t$, $t = 0, 1, 2, \ldots$
auto-regressive predictor: $\hat x_{t+1} = \beta^T (x_t, \ldots, x_{t-M}) - v$
$M = 10$ is memory of predictor
use quantile regression for $\tau = 0.1, 0.5, 0.9$
at each time $t$, gives three one-step-ahead predictions: $\hat x_{t+1}^{0.1}$, $\hat x_{t+1}^{0.5}$, $\hat x_{t+1}^{0.9}$
Example
time series $x_t$
Example
$x_t$ and predictions $\hat x_{t+1}^{0.1}$, $\hat x_{t+1}^{0.5}$, $\hat x_{t+1}^{0.9}$ (training set, $t = 0, \ldots, 399$)
Example
$x_t$ and predictions $\hat x_{t+1}^{0.1}$, $\hat x_{t+1}^{0.5}$, $\hat x_{t+1}^{0.9}$ (test set, $t = 400, \ldots, 449$)
Example
residual distributions for $\tau = 0.9$, $0.5$, and $0.1$ (training set)
Example
residual distributions for $\tau = 0.9$, $0.5$, and $0.1$ (test set)
Model Fitting
Data model
given data $(x_i, y_i) \in \mathcal{X} \times \mathcal{Y}$, $i = 1, \ldots, m$
for $\mathcal{X} = \mathbf{R}^n$, $x$ is feature vector
for $\mathcal{Y} = \mathbf{R}$, $y$ is (real) outcome or label
for $\mathcal{Y} = \{-1, 1\}$, $y$ is (boolean) outcome
find model or predictor $\psi : \mathcal{X} \to \mathcal{Y}$ so that $\psi(x) \approx y$ for data $(x, y)$ that you haven't seen
for $\mathcal{Y} = \mathbf{R}$, $\psi$ is a regression model
for $\mathcal{Y} = \{-1, 1\}$, $\psi$ is a classifier
we choose $\psi$ based on observed data, prior knowledge
Loss minimization model
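data model parametrized by $\theta \in \mathbf{R}^n$
loss function $L : \mathcal{X} \times \mathcal{Y} \times \mathbf{R}^n \to \mathbf{R}$
$L(x_i, y_i, \theta)$ is loss (misfit) for data point $(x_i, y_i)$, using model parameter $\theta$
choose $\theta$; then model is $\psi(x) = \operatorname{argmin}_y L(x, y, \theta)$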
Model fitting via regularized loss minimization
regularization $r : \mathbf{R}^n \to \mathbf{R} \cup \{\infty\}$
$r(\theta)$ measures model complexity, enforces constraints, or represents prior
choose $\theta$ by minimizing regularized loss $(1/m) \sum_i L(x_i, y_i, \theta) + r(\theta)$
for many useful cases, this is a convex problem
model is $\psi(x) = \operatorname{argmin}_y L(x, y, \theta)$
Examples
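model                  $L(x, y, \theta)$                   $\psi(x)$                    $r(\theta)$
least-squares          $(\theta^T x - y)^2$                $\theta^T x$                 $0$
ridge regression       $(\theta^T x - y)^2$                $\theta^T x$                 $\lambda \|\theta\|_2^2$
lasso                  $(\theta^T x - y)^2$                $\theta^T x$                 $\lambda \|\theta\|_1$
logistic classifier    $\log(1 + \exp(-y \theta^T x))$     $\mathrm{sign}(\theta^T x)$  $0$
SVM                    $(1 - y \theta^T x)_+$              $\mathrm{sign}(\theta^T x)$  $\lambda \|\theta\|_2^2$

$\lambda > 0$ scales regularization
all lead to convex fitting problems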
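Of the rows in the table, ridge regression is the one with a closed-form fit, which makes it a convenient check of the regularized loss formulation. Setting the gradient of $(1/m) \sum_i (\theta^T x_i - y_i)^2 + \lambda \|\theta\|_2^2$ to zero gives $(X^T X + m \lambda I)\theta = X^T y$, where the rows of $X$ are the $x_i^T$. The sketch below uses random, purely illustrative data; the function name is an assumption. The lasso, logistic, and SVM rows have no such formula and would be handed to a convex solver.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Minimize (1/m) * ||X theta - y||_2^2 + lam * ||theta||_2^2 in closed form."""
    m, n = X.shape
    return np.linalg.solve(X.T @ X + m * lam * np.eye(n), X.T @ y)

# Hypothetical data (illustrative only).
rng = np.random.default_rng(0)
m, n, lam = 50, 5, 0.1
X = rng.standard_normal((m, n))
y = rng.standard_normal(m)

theta = ridge_fit(X, y, lam)
# Gradient of the regularized loss at theta; should be numerically zero.
grad = (2.0 / m) * X.T @ (X @ theta - y) + 2.0 * lam * theta
print(theta, np.linalg.norm(grad))
```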
Example
original (boolean) features $z \in \{0, 1\}^{10}$
(boolean) outcome $y \in \{-1, 1\}$
new feature vector $x \in \{0, 1\}^{55}$ contains all products $z_i z_j$ (co-occurrence of pairs of original features)
use logistic loss, $\ell_1$ regularizer
training data has $m = 200$ examples; test on 100 examples
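The dimension 55 is consistent with taking all products $z_i z_j$ with $i \leq j$: 45 distinct pairs plus the 10 products $z_i z_i = z_i$. The sketch below builds such a feature vector under that assumption; the function name and the ordering of the products are hypothetical.

```python
from itertools import combinations_with_replacement

def pair_features(z):
    """All products z_i * z_j with i <= j, for a boolean (0/1) vector z."""
    indices = range(len(z))
    return [z[i] * z[j] for i, j in combinations_with_replacement(indices, 2)]

z = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]   # hypothetical original features
x = pair_features(z)
print(len(x))                         # 10 * 11 / 2 products
```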
Example
selected features $z_i z_j$, $\lambda = 0.01$
