Convex Optimization Applications

Stephen Boyd and Steven Diamond


EE & CS Departments
Stanford University

MLSS, Kyoto, August 23-24 2015

Outline

Portfolio Optimization
Worst-Case Risk Analysis
Optimal Advertising
Regression Variations
Model Fitting

Portfolio Optimization

Portfolio allocation vector

invest fraction $w_i$ in asset $i$, $i = 1, \ldots, n$

$w \in \mathbf{R}^n$ is the portfolio allocation vector

$\mathbf{1}^T w = 1$

$w_i < 0$ means a short position in asset $i$ (borrow shares and sell now; must replace later)

$w \geq 0$ is a long-only portfolio

$\|w\|_1 = \mathbf{1}^T w_+ + \mathbf{1}^T w_-$ is the leverage (many other definitions are used ...)
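
The definitions on this slide can be checked numerically. The following is a minimal sketch in plain Python; the function name `leverage` and the 4-asset allocation are hypothetical and chosen only for illustration.

```python
def leverage(w):
    """Return the leverage ||w||_1 = 1^T w_+ + 1^T w_- of an allocation w."""
    long_part = sum(max(wi, 0.0) for wi in w)    # 1^T w_+ (long positions)
    short_part = sum(max(-wi, 0.0) for wi in w)  # 1^T w_- (short positions)
    return long_part + short_part


# Hypothetical allocation: fractions sum to 1, the second asset is shorted.
w = [0.7, -0.3, 0.4, 0.2]

total = sum(w)                           # budget constraint 1^T w
lev = leverage(w)                        # 0.7 + 0.4 + 0.2 long, 0.3 short
is_long_only = all(wi >= 0 for wi in w)  # False because of the short position

print(total, lev, is_long_only)
```

A long-only portfolio that satisfies the budget constraint always has leverage exactly 1, so leverage above 1 is only possible with short positions.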

Asset returns
investments held for one period

initial prices $p_i > 0$; end-of-period prices $p_i^+ > 0$

asset (fractional) returns $r_i = (p_i^+ - p_i)/p_i$

portfolio (fractional) return $R = r^T w$

common model: $r$ is a random variable, with mean $\mathbf{E}\, r = \mu$ and covariance $\mathbf{E}(r - \mu)(r - \mu)^T = \Sigma$

so $R$ is a random variable with $\mathbf{E}\, R = \mu^T w$, $\mathbf{var}(R) = w^T \Sigma w$

$\mathbf{E}\, R$ is the (mean) return of the portfolio

$\mathbf{var}(R)$ is the risk of the portfolio (risk is also sometimes given as $\mathbf{std}(R) = \sqrt{\mathbf{var}(R)}$)

two objectives: high return, low risk
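
As a small worked instance of the formulas above, the sketch below computes asset returns from prices and the portfolio mean, variance, and standard deviation from $\mu$, $\Sigma$, and $w$. All numbers and function names are hypothetical illustration values, not data from the slides.

```python
def asset_returns(p, p_plus):
    """Fractional returns r_i = (p_i^+ - p_i) / p_i for one holding period."""
    return [(pp - pi) / pi for pi, pp in zip(p, p_plus)]


def portfolio_stats(mu, Sigma, w):
    """Return (E R, var R, std R) with E R = mu^T w and var R = w^T Sigma w."""
    n = len(w)
    mean = sum(mu[i] * w[i] for i in range(n))
    var = sum(w[i] * Sigma[i][j] * w[j] for i in range(n) for j in range(n))
    return mean, var, var ** 0.5


# Hypothetical two-asset example.
r = asset_returns([100.0, 50.0], [110.0, 49.0])   # one realized return vector
mu = [0.10, 0.05]                                 # assumed mean returns
Sigma = [[0.04, 0.01],
         [0.01, 0.02]]                            # assumed return covariance
w = [0.5, 0.5]                                    # equal-weight portfolio

mean_R, var_R, std_R = portfolio_stats(mu, Sigma, w)
print(r, mean_R, var_R, std_R)
```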

Classical (Markowitz) portfolio optimization

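maximize   $\mu^T w - \gamma w^T \Sigma w$
subject to $\mathbf{1}^T w = 1$, $w \in \mathcal{W}$

variable $w \in \mathbf{R}^n$

$\mathcal{W}$ is the set of allowed portfolios

common case: $\mathcal{W} = \mathbf{R}^n_+$ (long-only portfolio)

$\gamma > 0$ is the risk aversion parameter

$\mu^T w - \gamma w^T \Sigma w$ is the risk-adjusted return

varying $\gamma$ gives the optimal risk-return trade-off

can also fix return and minimize risk, etc.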

Example
optimal risk-return trade-off for 10 assets, long only portfolio

Example
return distributions for two risk aversion values

Portfolio constraints

$\mathcal{W} = \mathbf{R}^n$ (simple analytical solution)

leverage limit: $\|w\|_1 \leq L^{\max}$

market neutral: $m^T \Sigma w = 0$

  $m_i$ is the capitalization of asset $i$
  $M = m^T r$ is the market return
  $m^T \Sigma w = \mathbf{cov}(M, R)$

i.e., the market neutral portfolio return is uncorrelated with the market return
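
The slides do not spell out the analytical solution for the unconstrained case $\mathcal{W} = \mathbf{R}^n$. One way to obtain it, assuming $\Sigma$ is positive definite, is from the optimality conditions of maximizing $\mu^T w - \gamma w^T \Sigma w$ subject to $\mathbf{1}^T w = 1$: with multiplier $\nu$ for the budget constraint, $\mu - 2\gamma \Sigma w - \nu \mathbf{1} = 0$, so $w = (1/(2\gamma)) \Sigma^{-1}(\mu - \nu \mathbf{1})$ with $\nu = (\mathbf{1}^T \Sigma^{-1} \mu - 2\gamma)/(\mathbf{1}^T \Sigma^{-1} \mathbf{1})$. The sketch below implements this in plain Python; the helper names and the data are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]   # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def markowitz_unconstrained(mu, Sigma, gamma):
    """Maximize mu^T w - gamma w^T Sigma w subject to 1^T w = 1 (no other constraints)."""
    n = len(mu)
    a = solve_linear(Sigma, mu)           # Sigma^{-1} mu
    b = solve_linear(Sigma, [1.0] * n)    # Sigma^{-1} 1
    nu = (sum(a) - 2.0 * gamma) / sum(b)  # multiplier for the budget constraint
    return [(a[i] - nu * b[i]) / (2.0 * gamma) for i in range(n)]


# Hypothetical data: two assets, risk aversion gamma = 2.
mu = [0.10, 0.05]
Sigma = [[0.04, 0.01],
         [0.01, 0.02]]
w = markowitz_unconstrained(mu, Sigma, gamma=2.0)
print(w, sum(w))
```

Once constraints such as long-only or a leverage limit are added, there is in general no closed form and a convex solver is needed.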

Example
optimal risk-return trade-off curves for leverage limits 1, 2, 4

Example
three portfolios with $w^T \Sigma w = 2$, leverage limits $L = 1, 2, 4$

Variations

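require $\mu^T w \geq R^{\min}$, minimize $w^T \Sigma w$ or $\|\Sigma^{1/2} w\|_2$

include (broker) cost of short positions, $s^T (w)_-$, with $s \geq 0$

include transaction cost (from previous portfolio $w^{\mathrm{prev}}$), $\kappa^T |w - w^{\mathrm{prev}}|^{\eta}$, with $\kappa \geq 0$

common models: $\eta = 1, 3/2, 2$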

Factor covariance model


$\Sigma = F \tilde\Sigma F^T + D$

$F \in \mathbf{R}^{n \times k}$, $k \ll n$, is the factor loading matrix

  $k$ is the number of factors (or sectors), typically 10s
  $F_{ij}$ is the loading of asset $i$ to factor $j$

$D$ is a diagonal matrix; $D_{ii} > 0$ is the idiosyncratic risk

$\tilde\Sigma \succ 0$ is the factor covariance matrix

$F^T w \in \mathbf{R}^k$ gives the portfolio factor exposures

portfolio is factor $j$ neutral if $(F^T w)_j = 0$
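
To illustrate why the factor structure helps, the sketch below evaluates the risk $w^T \Sigma w$ in two ways: from the assembled $n \times n$ matrix $\Sigma = F \tilde\Sigma F^T + D$, and directly from the factor exposures $f = F^T w$ as $f^T \tilde\Sigma f + w^T D w$, which never forms an $n \times n$ matrix. The data are hypothetical (3 assets, 2 factors) and the function names are illustrative; this only shows risk evaluation, not the optimization itself.

```python
def risk_dense(F, Sigma_f, d, w):
    """Form Sigma = F Sigma_f F^T + diag(d) explicitly, then return w^T Sigma w."""
    n, k = len(F), len(Sigma_f)
    Sigma = [[sum(F[i][a] * Sigma_f[a][b] * F[j][b]
                  for a in range(k) for b in range(k))
              + (d[i] if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    return sum(w[i] * Sigma[i][j] * w[j] for i in range(n) for j in range(n))


def risk_factor(F, Sigma_f, d, w):
    """Return f^T Sigma_f f + w^T D w with factor exposures f = F^T w."""
    n, k = len(F), len(Sigma_f)
    f = [sum(F[i][j] * w[i] for i in range(n)) for j in range(k)]  # F^T w
    factor_part = sum(f[a] * Sigma_f[a][b] * f[b]
                      for a in range(k) for b in range(k))
    idio_part = sum(d[i] * w[i] ** 2 for i in range(n))            # w^T D w
    return factor_part + idio_part


# Hypothetical data.
F = [[1.0, 0.2],
     [0.8, 0.5],
     [0.1, 1.1]]                  # factor loadings (n x k)
Sigma_f = [[0.04, 0.01],
           [0.01, 0.09]]          # factor covariance (k x k)
d = [0.02, 0.03, 0.01]            # idiosyncratic variances D_ii
w = [0.5, 0.3, 0.2]

dense = risk_dense(F, Sigma_f, d, w)
factored = risk_factor(F, Sigma_f, d, w)
print(dense, factored)
```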

Portfolio optimization with factor covariance model

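maximize   $\mu^T w - \gamma \left( f^T \tilde\Sigma f + w^T D w \right)$
subject to $\mathbf{1}^T w = 1$, $f = F^T w$
           $w \in \mathcal{W}$, $f \in \mathcal{F}$

variables $w \in \mathbf{R}^n$ (allocations), $f \in \mathbf{R}^k$ (factor exposures)

$\mathcal{F}$ gives the factor exposure constraints

computational advantage: $O(nk^2)$ vs. $O(n^3)$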

Example
50 factors, 3000 assets

leverage limit = 2

solve with covariance given as

  single matrix
  factor model

CVXPY/ECOS single thread time

covariance       solve time
single matrix    687.26 sec
factor model     0.58 sec

Worst-Case Risk Analysis

Covariance uncertainty

single period Markowitz portfolio allocation problem

we have a fixed portfolio allocation $w \in \mathbf{R}^n$

return covariance $\Sigma$ is not known, but we believe $\Sigma \in \mathcal{S}$

$\mathcal{S}$ is a convex set of possible covariance matrices

risk is $w^T \Sigma w$, a linear function of $\Sigma$

Worst-case risk analysis


what is the worst (maximum) risk, over all possible covariance matrices?

worst-case risk analysis problem:

maximize   $w^T \Sigma w$
subject to $\Sigma \in \mathcal{S}$, $\Sigma \succeq 0$

with variable $\Sigma$

... a convex problem with variable $\Sigma$

if the worst-case risk is not too bad, you can worry less

if not, you'll confront your worst nightmare

Example

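$w = (0.6, 0.5, 0.25, 0.65, 0)$

optimized for $\Sigma^{\mathrm{nom}}$, return 0.1, leverage limit 2

$\mathcal{S} = \{\Sigma^{\mathrm{nom}} + \Delta : |\Delta_{ii}| = 0,\ |\Delta_{ij}| \leq 0.2\}$,

$\Sigma^{\mathrm{nom}} = \begin{bmatrix}
0.86 & 0.34 & 0.14 & 0.15 & 0.55 \\
0.34 & 0.66 & 0.12 & 0.51 & 0.24 \\
0.14 & 0.12 & 0.45 & 0.06 & 0.11 \\
0.15 & 0.51 & 0.06 & 0.55 & 0.14 \\
0.55 & 0.24 & 0.11 & 0.14 & 0.9
\end{bmatrix}$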
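
Solving the worst-case problem exactly requires a semidefinite programming solver. As a rough sanity check that needs no solver, one can drop the constraint $\Sigma \succeq 0$: since the risk is linear in $\Delta$, the maximum over the box $|\Delta_{ij}| \leq \delta$, $\Delta_{ii} = 0$ is then $w^T \Sigma^{\mathrm{nom}} w + \delta \sum_{i \neq j} |w_i w_j|$. This is only an upper bound on the true worst-case risk, not the worst case itself. The sketch below computes it for hypothetical data; the function names are illustrative and the numbers are not those of the example above.

```python
def quad_form(Sigma, w):
    """Return w^T Sigma w."""
    n = len(w)
    return sum(w[i] * Sigma[i][j] * w[j] for i in range(n) for j in range(n))


def box_relaxation_bound(Sigma_nom, w, delta):
    """Upper bound on worst-case risk, ignoring the PSD constraint on Sigma.

    Each off-diagonal perturbation is set to delta * sign(w_i * w_j),
    which maximizes the linear term w^T Delta w over the box.
    """
    n = len(w)
    slack = delta * sum(abs(w[i] * w[j])
                        for i in range(n) for j in range(n) if i != j)
    return quad_form(Sigma_nom, w) + slack


# Hypothetical nominal covariance and portfolio (one short position).
Sigma_nom = [[0.50, 0.10, 0.05],
             [0.10, 0.40, 0.08],
             [0.05, 0.08, 0.30]]
w = [0.8, -0.3, 0.5]

nominal = quad_form(Sigma_nom, w)
bound = box_relaxation_bound(Sigma_nom, w, delta=0.2)
print(nominal, bound)
```

If the maximizing perturbation happens to leave $\Sigma^{\mathrm{nom}} + \Delta$ positive semidefinite, the bound is tight; otherwise the true worst-case risk is strictly smaller.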

Example

nominal risk = 0.511

worst case risk = 0.917

worst case $\Delta = \begin{bmatrix}
0 & 0.2 & 0.2 & 0.2 & 0.08 \\
0.2 & 0 & 0.2 & 0.2 & 0.02 \\
0.2 & 0.2 & 0 & 0.2 & 0.05 \\
0.2 & 0.2 & 0.2 & 0 & 0.02 \\
0.08 & 0.02 & 0.05 & 0.02 & 0
\end{bmatrix}$

Optimal Advertising

Ad display

$m$ advertisers/ads, $i = 1, \ldots, m$

$n$ time slots, $t = 1, \ldots, n$

$T_t$ is the total traffic in time slot $t$

$D_{it} \geq 0$ is the number of times ad $i$ is displayed in period $t$

$\sum_i D_{it} \leq T_t$

contracted minimum total displays: $\sum_t D_{it} \geq c_i$

goal: choose $D_{it}$

Clicks and revenue

$C_{it}$ is the number of clicks on ad $i$ in period $t$

click model: $C_{it} = P_{it} D_{it}$, $P_{it} \in [0, 1]$

payment: $R_i > 0$ per click for ad $i$, up to budget $B_i$

ad revenue

$S_i = \min\left\{ R_i \sum_t C_{it},\ B_i \right\}$

... a concave function of $D$
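
The revenue formula can be evaluated directly for a given display schedule. The following is a minimal sketch with hypothetical data (2 ads, 3 time slots); the function name `ad_revenue` is illustrative.

```python
def ad_revenue(R, B, P, D):
    """Return the list of revenues S_i = min(R_i * sum_t P_it * D_it, B_i)."""
    revenue = []
    for i in range(len(R)):
        clicks = sum(P[i][t] * D[i][t] for t in range(len(D[i])))  # sum_t C_it
        revenue.append(min(R[i] * clicks, B[i]))                   # capped at budget
    return revenue


# Hypothetical data.
R = [0.5, 2.0]                 # payment per click
B = [100.0, 50.0]              # budgets
P = [[0.10, 0.20, 0.10],
     [0.05, 0.05, 0.10]]       # click-through rates P_it
D = [[100, 200, 100],
     [200, 200, 300]]          # displays D_it

S = ad_revenue(R, B, P, D)
print(S, sum(S))
```

In this example the first ad earns 30 (60 clicks at 0.5 each, under budget) while the second is capped at its budget of 50. The cap makes each $S_i$ a minimum of a linear function and a constant, hence concave in $D$.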

Ad optimization

choose displays to maximize revenue:


maximize   $\sum_i S_i$
subject to $D \geq 0$, $D^T \mathbf{1} \leq T$, $D \mathbf{1} \geq c$

variable is $D \in \mathbf{R}^{m \times n}$

data are $T$, $c$, $R$, $B$, $P$

Example
24 hourly periods, 5 ads (A-E)

figure: total traffic

Example

ad data:

Ad       A       B       C       D       E
$c_i$    61000   80000   61000   23000   64000
$R_i$    0.15    1.18    0.57    2.08    2.43
$B_i$    25000   12000   12000   11000   17000

Example
figure: $P_{it}$

Example
figure: optimal $D_{it}$

Example

ad revenue

Ad                 A       B       C        D       E
$c_i$              61000   80000   61000    23000   64000
$R_i$              0.15    1.18    0.57     2.08    2.43
$B_i$              25000   12000   12000    11000   17000
$\sum_t D_{it}$    61000   80000   148116   23000   167323
$S_i$              182     12000   12000    11000   7760

Regression Variations

Standard regression

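given data $(x_i, y_i) \in \mathbf{R}^n \times \mathbf{R}$, $i = 1, \ldots, m$

fit linear (affine) model $\hat y_i = \beta^T x_i - v$, $\beta \in \mathbf{R}^n$, $v \in \mathbf{R}$

residuals are $r_i = \hat y_i - y_i$

least-squares: choose $\beta$, $v$ to minimize $\|r\|_2^2 = \sum_i r_i^2$

mean of optimal residuals is zero

can add (Tychonov) regularization: with $\lambda > 0$,

minimize $\|r\|_2^2 + \lambda \|\beta\|_2^2$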

Robust (Huber) regression


replace square with Huber function

$\phi(u) = \begin{cases} u^2 & |u| \leq M \\ 2M|u| - M^2 & |u| > M \end{cases}$

$M > 0$ is the Huber threshold

same as least-squares for small residuals, but allows (some) large residuals
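
A minimal sketch of the Huber function in plain Python, with the threshold `M` as a parameter (the default value 1.0 is an arbitrary choice for illustration):

```python
def huber(u, M=1.0):
    """Huber penalty: quadratic for |u| <= M, linear growth beyond M."""
    if abs(u) <= M:
        return u * u
    return 2.0 * M * abs(u) - M * M


# Compare with the square penalty for a small and a large residual.
for u in (0.5, 1.0, 3.0, -3.0):
    print(u, huber(u), u * u)
```

The two pieces agree in value and slope at $|u| = M$, so the penalty is convex and differentiable; a residual of 3 costs 5 under Huber with $M = 1$ rather than 9 under the square, which is why outliers pull the fit less.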

Example

$m = 450$ measurements, $n = 300$ regressors

choose $\beta^{\mathrm{true}}$; $x_i \sim \mathcal{N}(0, I)$

set $y_i = (\beta^{\mathrm{true}})^T x_i + \epsilon_i$, $\epsilon_i \sim \mathcal{N}(0, 1)$

with probability $p$, replace $y_i$ with $-y_i$

data has fraction $p$ of (non-obvious) wrong measurements

distributions of 'good' and 'bad' $y_i$ are the same

try to recover $\beta^{\mathrm{true}} \in \mathbf{R}^n$ from measurements $y \in \mathbf{R}^m$

'prescient' version: we know which measurements are wrong

Example
50 problem instances, p varying from 0 to 0.15

Quantile regression
tilted $\ell_1$ penalty: for $\tau \in (0, 1)$,

$\phi_\tau(u) = \tau (u)_+ + (1 - \tau)(u)_- = (1/2)|u| + (\tau - 1/2) u$

quantile regression: choose $\beta$, $v$ to minimize $\sum_i \phi_\tau(r_i)$

$\tau = 0.5$: equal penalty for over- and under-estimating

$\tau = 0.1$: 9 times more penalty for under-estimating

$\tau = 0.9$: 9 times more penalty for over-estimating
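
The two expressions for the tilted penalty, and the 9-to-1 asymmetry, can be checked with a short sketch. The function names are illustrative; the residual sign convention assumed here is prediction minus outcome, so a negative residual means under-estimating.

```python
def tilted_l1(u, tau):
    """Tilted l1 penalty: tau * (u)_+ + (1 - tau) * (u)_-."""
    return tau * max(u, 0.0) + (1.0 - tau) * max(-u, 0.0)


def tilted_l1_alt(u, tau):
    """Equivalent form: (1/2)|u| + (tau - 1/2) u."""
    return 0.5 * abs(u) + (tau - 0.5) * u


# The two forms agree for positive, zero, and negative residuals.
for u in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(u, tilted_l1(u, 0.1), tilted_l1_alt(u, 0.1))

# For tau = 0.1, a negative residual costs 9 times as much as a positive one.
ratio = tilted_l1(-1.0, 0.1) / tilted_l1(1.0, 0.1)
print(ratio)
```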

Quantile regression

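for $r_i \neq 0$ (using $r_i = \beta^T x_i - v - y_i$, so $\partial r_i / \partial v = -1$),

$\frac{\partial}{\partial v} \sum_i \phi_\tau(r_i) = -\tau\, |\{i : r_i > 0\}| + (1 - \tau)\, |\{i : r_i < 0\}|$

(roughly speaking) for optimal $v$ we have

$\tau\, |\{i : r_i > 0\}| = (1 - \tau)\, |\{i : r_i < 0\}|$

and so for optimal $v$, $\tau m = |\{i : r_i < 0\}|$

$\tau$-quantile of optimal residuals is zero

hence the name quantile regression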
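
The counting argument can be seen in the simplest possible case: a constant predictor with no regressors. The sketch below picks the constant `c` that minimizes the total tilted penalty of the residuals `c - y` by brute-force search over a grid, then counts the negative residuals. The data, the grid, and the function names are hypothetical; the grid uses half-integers so that no residual is exactly zero.

```python
def tilted_l1(u, tau):
    """Tilted l1 penalty: tau * (u)_+ + (1 - tau) * (u)_-."""
    return tau * max(u, 0.0) + (1.0 - tau) * max(-u, 0.0)


def total_penalty(c, ys, tau):
    """Sum of tilted penalties of the residuals r_i = c - y_i."""
    return sum(tilted_l1(c - y, tau) for y in ys)


ys = list(range(100))                            # outcomes 0, 1, ..., 99 (m = 100)
candidates = [k + 0.5 for k in range(-1, 100)]   # -0.5, 0.5, ..., 99.5
tau = 0.1

best_c = min(candidates, key=lambda c: total_penalty(c, ys, tau))
n_negative = sum(1 for y in ys if best_c - y < 0)

print(best_c, n_negative)
```

With $\tau = 0.1$ and $m = 100$, the minimizer leaves $\tau m = 10$ residuals negative, so the prediction sits above 90% of the outcomes, as expected when under-estimating is penalized 9 times more.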

Example

time series $x_t$, $t = 0, 1, 2, \ldots$

auto-regressive predictor:

$\hat x_{t+1} = \beta^T (x_t, \ldots, x_{t-M}) - v$

$M = 10$ is the memory of the predictor

use quantile regression for $\tau = 0.1, 0.5, 0.9$

at each time $t$, gives three one-step-ahead predictions:

$\hat x_{t+1}^{0.1}$, $\hat x_{t+1}^{0.5}$, $\hat x_{t+1}^{0.9}$

Example
figure: time series $x_t$

Example
$x_t$ and predictions $\hat x_{t+1}^{0.1}$, $\hat x_{t+1}^{0.5}$, $\hat x_{t+1}^{0.9}$ (training set, $t = 0, \ldots, 399$)

Example
$x_t$ and predictions $\hat x_{t+1}^{0.1}$, $\hat x_{t+1}^{0.5}$, $\hat x_{t+1}^{0.9}$ (test set, $t = 400, \ldots, 449$)

Example
residual distributions for $\tau = 0.9$, $0.5$, and $0.1$ (training set)

Example
residual distributions for $\tau = 0.9$, $0.5$, and $0.1$ (test set)

Model Fitting

Data model
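given data $(x_i, y_i) \in \mathcal{X} \times \mathcal{Y}$, $i = 1, \ldots, m$

for $\mathcal{X} = \mathbf{R}^n$, $x$ is the feature vector

for $\mathcal{Y} = \mathbf{R}$, $y$ is the (real) outcome or label

for $\mathcal{Y} = \{-1, 1\}$, $y$ is the (boolean) outcome

find model or predictor $\psi : \mathcal{X} \to \mathcal{Y}$ so that $\psi(x) \approx y$ for data $(x, y)$ that you haven't seen

for $\mathcal{Y} = \mathbf{R}$, $\psi$ is a regression model

for $\mathcal{Y} = \{-1, 1\}$, $\psi$ is a classifier

we choose $\psi$ based on observed data, prior knowledge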

Loss minimization model

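data model parametrized by $\theta \in \mathbf{R}^n$

loss function $L : \mathcal{X} \times \mathcal{Y} \times \mathbf{R}^n \to \mathbf{R}$

$L(x_i, y_i, \theta)$ is the loss (misfit) for data point $(x_i, y_i)$, using model parameter $\theta$

choose $\theta$; then the model is

$\psi(x) = \mathop{\mathrm{argmin}}_y L(x, y, \theta)$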

Model fitting via regularized loss minimization

regularization $r : \mathbf{R}^n \to \mathbf{R} \cup \{\infty\}$

$r(\theta)$ measures model complexity, enforces constraints, or represents a prior

choose $\theta$ by minimizing the regularized loss

$(1/m) \sum_i L(x_i, y_i, \theta) + r(\theta)$

for many useful cases, this is a convex problem

model is $\psi(x) = \mathop{\mathrm{argmin}}_y L(x, y, \theta)$

Examples
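model                  $L(x, y, \theta)$                   $\psi(x)$                        $r(\theta)$
least-squares          $(\theta^T x - y)^2$                $\theta^T x$                     $0$
ridge regression       $(\theta^T x - y)^2$                $\theta^T x$                     $\lambda \|\theta\|_2^2$
lasso                  $(\theta^T x - y)^2$                $\theta^T x$                     $\lambda \|\theta\|_1$
logistic classifier    $\log(1 + \exp(-y \theta^T x))$     $\mathrm{sign}(\theta^T x)$      $0$
SVM                    $(1 - y \theta^T x)_+$              $\mathrm{sign}(\theta^T x)$      $\lambda \|\theta\|_2^2$

$\lambda > 0$ scales regularization

all lead to convex fitting problems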
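
The rows of the table map directly onto small functions. The sketch below writes out the three losses and two regularizers and evaluates the regularized objective $(1/m) \sum_i L(x_i, y_i, \theta) + \lambda r(\theta)$ for a given $\theta$; it only evaluates the objective and does not minimize it. The function names and the tiny data set are hypothetical.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def squared_loss(x, y, theta):
    return (dot(theta, x) - y) ** 2


def logistic_loss(x, y, theta):
    # log(1 + exp(-y * theta^T x)), for labels y in {-1, 1}
    return math.log1p(math.exp(-y * dot(theta, x)))


def hinge_loss(x, y, theta):
    # (1 - y * theta^T x)_+, for labels y in {-1, 1}
    return max(0.0, 1.0 - y * dot(theta, x))


def l1(theta):
    return sum(abs(t) for t in theta)


def l2_squared(theta):
    return sum(t * t for t in theta)


def regularized_loss(loss, reg, lam, data, theta):
    """Average loss over the data plus lam times the regularizer."""
    avg = sum(loss(x, y, theta) for x, y in data) / len(data)
    return avg + lam * reg(theta)


# Hypothetical data: two points with two features each.
data = [([1.0, 0.0], 1.0), ([0.0, 1.0], -1.0)]
theta = [0.5, -0.5]

lasso_obj = regularized_loss(squared_loss, l1, 0.1, data, theta)
svm_obj = regularized_loss(hinge_loss, l2_squared, 0.1, data, theta)
print(lasso_obj, svm_obj)
```

Every combination here is a sum of convex functions of $\theta$, which is why each row of the table leads to a convex fitting problem.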

Example

original (boolean) features $z \in \{0, 1\}^{10}$

(boolean) outcome $y \in \{-1, 1\}$

new feature vector $x \in \{0, 1\}^{55}$ contains all products $z_i z_j$ (co-occurrence of pairs of original features)

use logistic loss, $\ell_1$ regularizer

training data has $m = 200$ examples; test on 100 examples
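
The count of 55 new features is consistent with taking all products $z_i z_j$ with $i \leq j$: 45 products of distinct pairs plus 10 products $z_i z_i = z_i$, which reproduce the original boolean features. The sketch below builds such a feature vector under that assumption; the ordering of the products and the function name are hypothetical choices.

```python
from itertools import combinations_with_replacement


def pairwise_products(z):
    """Return all products z_i * z_j with i <= j, in lexicographic index order."""
    n = len(z)
    return [z[i] * z[j] for i, j in combinations_with_replacement(range(n), 2)]


# Hypothetical boolean feature vector with 10 entries.
z = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
x = pairwise_products(z)

print(len(x), sum(x))
```

Since `z` above has 5 ones, exactly 15 of the 55 products are 1: the 5 diagonal terms and the 10 pairs of active features.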

Example
selected features $z_i z_j$, $\lambda = 0.01$
