
Solving a regression problem by linear programming

A very common and important problem in statistics is linear regression, the problem of fitting a straight line to statistical data. The most commonly employed technique is the method of least squares, but there are other interesting criteria where linear programming can be used to solve for the optimal values of the regression parameters. Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be data points and $a_1$ and $a_0$ be the parameters of the regression line $y = a_1 x + a_0$.

(a) Formulate a linear program whose optimal solution minimizes the sum of the absolute deviations of the data from the line, i.e., formulate

$$\min_{a} \sum_{i=1}^{n} \left| y_i - (a_1 x_i + a_0) \right|$$

as an LP.

(b) Formulate the minimization of the maximum absolute deviation as an LP, i.e., formulate

$$\min_{a} \max_{i} \left| y_i - (a_1 x_i + a_0) \right|$$

as an LP.

(c) Generalize the model to allow fitting to general polynomials $y = a_k x^k + a_{k-1} x^{k-1} + \cdots + a_1 x + a_0$.

Solution

The difficulty here lies in the fact that the optimization problem, as it is stated in the problem set, is not linear: neither the absolute value nor the maximum function is linear. So we need to reformulate these problems using simple tricks that make them linear.

(a) Note that our goal is to find values for $a_1$ and $a_0$ which minimize

$$\sum_{i=1}^{n} \left| y_i - (a_1 x_i + a_0) \right|.$$

Thus, $a_1$ and $a_0$ are variables, and the $x_i$'s and $y_i$'s are given data. However, the above function is not linear. To make it linear, we need to introduce new variables. For $i = 1, \ldots, n$, let $z_i = \left| y_i - (a_1 x_i + a_0) \right|$. Then the new model is:

Minimize $\sum_{i=1}^{n} z_i$

subject to $z_i = \left| y_i - (a_1 x_i + a_0) \right|$, for each $i = 1, \ldots, n$.

However, now we have non-linear functions in the constraints. Suppose that, for each $i = 1, \ldots, n$, we replace $z_i = \left| y_i - (a_1 x_i + a_0) \right|$ by a pair of related constraints:

$$z_i \ge y_i - (a_1 x_i + a_0) \tag{1}$$

and

$$z_i \ge -y_i + (a_1 x_i + a_0). \tag{2}$$

Note that (1) and (2) together ensure that $z_i \ge \left| y_i - (a_1 x_i + a_0) \right|$. But since our model is trying to minimize the $z_i$'s, in the optimal solution the value of each $z_i$ will be taken all the way down to $\left| y_i - (a_1 x_i + a_0) \right|$. Summarizing, the linear program is:

Minimize $\sum_{i=1}^{n} z_i$

subject to

$$z_i \ge y_i - (a_1 x_i + a_0), \quad \text{for each } i = 1, \ldots, n \tag{1}$$

$$z_i \ge -y_i + (a_1 x_i + a_0), \quad \text{for each } i = 1, \ldots, n \tag{2}$$

(b) We want to find

$$\min_{a} \max_{i} \left| y_i - (a_1 x_i + a_0) \right|.$$

Here $a_1$ and $a_0$ are variables, and the $x_i$'s and $y_i$'s are given data. But the maximum of absolute values is not a linear function. To make it linear, we need to introduce a new variable. Let $z = \max_{i} \left| y_i - (a_1 x_i + a_0) \right|$. Then the new model is:

Minimize $z$

subject to $z = \max_{i} \left| y_i - (a_1 x_i + a_0) \right|$.

Now we have a non-linear function in the constraint. However, the following equivalent formulation takes care of that problem.

Minimize $z$

subject to

$$z \ge y_i - (a_1 x_i + a_0), \quad \text{for each } i = 1, \ldots, n \tag{1}$$

$$z \ge -y_i + (a_1 x_i + a_0), \quad \text{for each } i = 1, \ldots, n \tag{2}$$

Note that (1) and (2) together ensure that $z \ge \max_{i} \left| y_i - (a_1 x_i + a_0) \right|$. But since our model is trying to minimize $z$, in the optimal solution the value of $z$ will be taken all the way down to $\max_{i} \left| y_i - (a_1 x_i + a_0) \right|$.

(c) Just replace $(a_1 x_i + a_0)$ above with $(a_k x_i^k + a_{k-1} x_i^{k-1} + \cdots + a_1 x_i + a_0)$.
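Written out for the sum-of-absolute-deviations criterion of part (a), the substitution in part (c) gives the LP below. This is only a sketch of that substitution: the summation index $j$ is introduced here for compactness and is not notation from the original solution. The model stays linear because the powers $x_i^j$ are computed from the given data and therefore act as constant coefficients of the variables $a_0, \ldots, a_k$.

```latex
\begin{align*}
\text{Minimize}\quad & \sum_{i=1}^{n} z_i \\
\text{subject to}\quad
& z_i \ge  y_i - \sum_{j=0}^{k} a_j x_i^{j}, && \text{for each } i = 1, \ldots, n \\
& z_i \ge -y_i + \sum_{j=0}^{k} a_j x_i^{j}, && \text{for each } i = 1, \ldots, n
\end{align*}
```

For the maximum-deviation criterion of part (b), the same two constraint families apply with the single variable $z$ in place of each $z_i$, and the objective becomes "Minimize $z$".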

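AMPL (software for solving linear programs) model for part (a)

set Points;                 # indices of the data points
param x{Points};            # given x-coordinates
param y{Points};            # given y-coordinates

var a1;                     # slope of the regression line
var a0;                     # intercept of the regression line
var z{Points};              # z[i] bounds the absolute deviation at point i

minimize obj: sum{i in Points} z[i];

s.t. c1{i in Points}: z[i] >= y[i] - (a1*x[i] + a0);

s.t. c2{i in Points}: z[i] >= -y[i] + (a1*x[i] + a0);

data;

set Points := 1 2 3 4;

param:  x  y :=
    1   1  4
    2   2  5
    3   3  5
    4   4  6;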
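For a data set this small, the result of the part (a) model can be cross-checked without an LP solver. The Python sketch below is not part of the original solution. It assumes the four sample points (1, 4), (2, 5), (3, 5), (4, 6) from the data section, and it relies on the standard fact that an LP with a finite optimum attains it at a vertex, which for this model means that some optimal line passes through at least two data points. The helper name `sum_abs_dev` is an illustrative choice.

```python
from fractions import Fraction
from itertools import combinations

# Sample data (assumed from the AMPL data section): x = 1..4, y = 4, 5, 5, 6.
points = [(1, 4), (2, 5), (3, 5), (4, 6)]

def sum_abs_dev(a1, a0):
    """Objective of part (a): sum of |y_i - (a1*x_i + a0)| over the data."""
    return sum(abs(y - (a1 * x + a0)) for x, y in points)

# Try every line through two data points and keep the one with the
# smallest sum of absolute deviations.
best = None
for (xp, yp), (xq, yq) in combinations(points, 2):
    a1 = Fraction(yq - yp, xq - xp)   # slope of the line through the two points
    a0 = yp - a1 * xp                 # intercept of that line
    cand = (sum_abs_dev(a1, a0), a1, a0)
    if best is None or cand < best:
        best = cand

total, a1, a0 = best
print(f"a1 = {a1}, a0 = {a0}, sum of absolute deviations = {total}")
```

On the sample data this enumeration selects the line through (1, 4) and (4, 6), that is a1 = 2/3 and a0 = 10/3, with an objective value of 2/3. Exact fractions are used so that the comparison between candidate lines is not affected by rounding.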

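AMPL model for part (b)

set Points;                 # indices of the data points
param x{Points};            # given x-coordinates
param y{Points};            # given y-coordinates

var a1;                     # slope of the regression line
var a0;                     # intercept of the regression line
var z;                      # bounds the largest absolute deviation

minimize obj: z;

s.t. c1{i in Points}: z >= y[i] - (a1*x[i] + a0);

s.t. c2{i in Points}: z >= -y[i] + (a1*x[i] + a0);

data;

set Points := 1 2 3 4;

param:  x  y :=
    1   1  4
    2   2  5
    3   3  5
    4   4  6;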
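The part (b) model can also be checked by hand on the sample data. The Python sketch below is not part of the original solution and does not call a solver. It takes a candidate line a1 = 1/2, a0 = 15/4 (an assumption worked out by hand, to be verified by the code) and pairs it with a lower-bound certificate. The weights `w` are chosen so that they sum to zero, their x-weighted sum is zero, and their absolute values sum to one. For any line, the weighted sum of the residuals then equals the weighted sum of the y-values, so no line can have a maximum absolute deviation smaller than the absolute value of that sum.

```python
from fractions import Fraction as F

# Sample data (assumed from the AMPL data section).
xs = [1, 2, 3, 4]
ys = [4, 5, 5, 6]

def max_abs_dev(a1, a0):
    """Objective of part (b): max over i of |y_i - (a1*x_i + a0)|."""
    return max(abs(y - (a1 * x + a0)) for x, y in zip(xs, ys))

# Candidate line (a hand-derived assumption, not solver output).
a1, a0 = F(1, 2), F(15, 4)
z = max_abs_dev(a1, a0)

# Certificate weights: sum(w) = 0, sum(w*x) = 0 and sum(|w|) = 1.
# For any line, sum(w_i * r_i) = sum(w_i * y_i), hence
# max|r_i| >= |sum(w_i * y_i)|.
w = [F(-1, 4), F(1, 2), F(-1, 4), F(0)]
assert sum(w) == 0
assert sum(wi * xi for wi, xi in zip(w, xs)) == 0
assert sum(abs(wi) for wi in w) == 1
lower_bound = abs(sum(wi * yi for wi, yi in zip(w, ys)))

print(f"max deviation of candidate = {z}, lower bound = {lower_bound}")
```

Both values come out as 1/4, so the candidate line attains the lower bound and is optimal for the sample data. The AMPL model for part (b) should therefore report an objective value of 0.25.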