
Numerical Optimization

Unit 9: Penalty Method and Interior Point Method
Unit 10: Filter Method and the Maratos Effect

Che-Rung Lee

May 1, 2011
Penalty method

The idea is to add penalty terms to the objective function, which turns a constrained optimization problem into an unconstrained one.

Quadratic penalty function

Example (for equality constraints)

    \min x_1 + x_2 \quad \text{subject to} \quad x_1^2 + x_2^2 - 2 = 0 \qquad (x^* = (-1, -1))

Define

    Q(x, \mu) = x_1 + x_2 + \frac{\mu}{2}(x_1^2 + x_2^2 - 2)^2

For \mu = 1,

    \nabla Q(x, 1) = \begin{pmatrix} 1 + 2(x_1^2 + x_2^2 - 2)x_1 \\ 1 + 2(x_1^2 + x_2^2 - 2)x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},
    \qquad \begin{pmatrix} x_1^* \\ x_2^* \end{pmatrix} \approx \begin{pmatrix} -1.1 \\ -1.1 \end{pmatrix}

For \mu = 10,

    \nabla Q(x, 10) = \begin{pmatrix} 1 + 20(x_1^2 + x_2^2 - 2)x_1 \\ 1 + 20(x_1^2 + x_2^2 - 2)x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},
    \qquad \begin{pmatrix} x_1^* \\ x_2^* \end{pmatrix} \approx \begin{pmatrix} -1.01 \\ -1.01 \end{pmatrix}
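The progression above is easy to reproduce. Below is a minimal sketch (my own, not from the slides) that minimizes Q for a growing sequence of \mu with SciPy's BFGS, warm-starting each solve from the previous minimizer; the iterates drift toward the constrained solution (-1, -1).

```python
# Minimize Q(x, mu) = x1 + x2 + (mu/2)(x1^2 + x2^2 - 2)^2 for growing mu.
import numpy as np
from scipy.optimize import minimize

def Q(x, mu):
    c = x[0]**2 + x[1]**2 - 2.0           # equality-constraint residual
    return x[0] + x[1] + 0.5 * mu * c**2  # objective + quadratic penalty

x = np.array([0.0, 0.0])                  # initial guess
for mu in [1.0, 10.0, 100.0, 1000.0]:
    x = minimize(Q, x, args=(mu,), method="BFGS").x  # warm start from last x
    print(f"mu = {mu:7.1f}  x = {x}")
```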
Size of \mu

It seems that the larger \mu is, the better the solution. But when \mu is large, the Hessian \nabla^2 Q \approx \mu \nabla c \nabla c^T is ill-conditioned:

    Q(x, \mu) = f(x) + \frac{\mu}{2}(c(x))^2
    \nabla Q = \nabla f + \mu c \nabla c
    \nabla^2 Q = \nabla^2 f + \mu \nabla c \nabla c^T + \mu c \nabla^2 c

\mu cannot be too small either.

Example

    \min_x -5x_1^2 + x_2^2 \quad \text{s.t.} \quad x_1 = 1.

    Q(x, \mu) = -5x_1^2 + x_2^2 + \frac{\mu}{2}(x_1 - 1)^2.

For \mu < 10, the problem \min_x Q(x, \mu) is unbounded: the coefficient of x_1^2 in Q is \mu/2 - 5 < 0.
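A quick sketch (again mine) that makes the ill-conditioning concrete for the first example (f = x_1 + x_2, c = x_1^2 + x_2^2 - 2): at the penalty minimizer one eigenvalue of \nabla^2 Q grows like \mu while the other stays O(1), so the condition number grows roughly linearly in \mu.

```python
# Condition number of the penalty Hessian at the minimizer; the Hessian
# formula follows the slide with grad^2 f = 0 for f = x1 + x2.
import numpy as np
from scipy.optimize import minimize

def Q(x, mu):
    c = x[0]**2 + x[1]**2 - 2.0
    return x[0] + x[1] + 0.5 * mu * c**2

def hess_Q(x, mu):
    c = x[0]**2 + x[1]**2 - 2.0
    g = 2.0 * x                                   # \nabla c
    # \nabla^2 Q = mu * \nabla c \nabla c^T + mu * c * \nabla^2 c
    return mu * np.outer(g, g) + mu * c * 2.0 * np.eye(2)

x = np.array([-1.5, -1.5])
for mu in [1.0, 10.0, 100.0, 1000.0]:
    x = minimize(Q, x, args=(mu,), method="BFGS").x
    print(f"mu = {mu:7.1f}  cond(hess) = {np.linalg.cond(hess_Q(x, mu)):.1e}")
```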
Quadratic penalty function

Pick a proper initial guess of \mu and gradually increase it.

Algorithm: Quadratic penalty method
1. Given \mu_0 > 0 and x_0.
2. For k = 0, 1, 2, ...
   (a) Solve \min_x Q(x, \mu_k) = f(x) + \frac{\mu_k}{2} \sum_i c_i^2(x).
   (b) If converged, stop.
   (c) Increase \mu_{k+1} > \mu_k and find a new x_{k+1}.

Problem: for any finite \mu, the solution is not exact (the minimizer of Q slightly violates the constraints).
Augmented Lagrangian method

Use the Lagrangian function to rescue the inexactness problem. Let

    \mathcal{L}_A(x, \lambda, \mu) = f(x) - \sum_i \lambda_i c_i(x) + \frac{\mu}{2} \sum_i c_i^2(x)

    \nabla_x \mathcal{L}_A = \nabla f(x) - \sum_i \lambda_i \nabla c_i(x) + \mu \sum_i c_i(x) \nabla c_i(x)
                           = \nabla f(x) - \sum_i \underbrace{(\lambda_i - \mu c_i(x))}_{\approx \lambda_i^*} \nabla c_i(x).

By the Lagrangian theory, \nabla f = \sum_i \lambda_i^* \nabla c_i at the solution, so \lambda_i - \mu c_i(x) plays the role of \lambda_i^*. At the minimizer of \mathcal{L}_A,

    c_i(x) \approx \frac{1}{\mu}(\lambda_i - \lambda_i^*).

If we can make \lambda_i approximate \lambda_i^*, then \mu_k need not be increased indefinitely:

    \lambda_i^{k+1} = \lambda_i^k - \mu_k c_i(x^k)

Algorithm: update \lambda_i at each iteration.
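A minimal sketch of this loop on the earlier toy problem (my assumptions: fixed \mu = 10 and SciPy's BFGS for the inner minimization). The point is that the multiplier update alone drives the constraint violation to zero, without sending \mu to infinity.

```python
# Augmented Lagrangian loop for min x1 + x2 s.t. c(x) = x1^2 + x2^2 - 2 = 0.
import numpy as np
from scipy.optimize import minimize

def c(x):
    return x[0]**2 + x[1]**2 - 2.0

def L_A(x, lam, mu):
    return x[0] + x[1] - lam * c(x) + 0.5 * mu * c(x)**2

x, lam, mu = np.array([0.0, 0.0]), 0.0, 10.0
for k in range(8):
    x = minimize(L_A, x, args=(lam, mu), method="BFGS").x
    lam = lam - mu * c(x)                 # multiplier update from the slide
    print(f"k = {k}  x = {x}  |c(x)| = {abs(c(x)):.2e}  lambda = {lam:.4f}")
```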
Inequality constraints

There are two approaches to handle inequality constraints.
1. Make the objective function nonsmooth (non-differentiable at some points).
2. Add slack variables to turn the inequality constraints into equality constraints:

    c_i(x) \ge 0 \quad \Longrightarrow \quad \begin{cases} c_i(x) - s_i = 0 \\ s_i \ge 0 \end{cases}

But then we have bound constraints on the slack variables.
We will focus on the second approach here.
Inequality constraints

Suppose the augmented Lagrangian method is used and all inequality constraints are converted to bound constraints. For fixed \mu and \lambda,

    \min_x \mathcal{L}_A(x, \lambda, \mu) = f(x) - \sum_{i=1}^m \lambda_i c_i(x) + \frac{\mu}{2} \sum_{i=1}^m c_i^2(x)
    \quad \text{s.t.} \quad \ell \le x \le u

The first-order necessary condition for x to be a solution of the above problem is

    x = P(x - \nabla_x \mathcal{L}_A(x, \lambda, \mu), \ell, u),

where

    P(g, \ell, u)_i = \begin{cases} \ell_i, & \text{if } g_i \le \ell_i; \\ g_i, & \text{if } g_i \in (\ell_i, u_i); \\ u_i, & \text{if } g_i \ge u_i, \end{cases}
    \qquad \text{for all } i = 1, 2, \ldots, n.
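In code, P is just a componentwise clamp onto the box. A sketch (the function name is mine):

```python
# Projection P(g, l, u): clamp each component of g into [l_i, u_i].
import numpy as np

def project(g, l, u):
    return np.minimum(np.maximum(g, l), u)   # componentwise clamp

# First-order condition check: x is a candidate solution when
# x == project(x - grad_L(x), l, u), up to a tolerance.
```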
Nonlinear gradient projection method

Sequential quadratic programming + trust region method to solve

    \min_x f(x) \quad \text{s.t.} \quad \ell \le x \le u

Algorithm: Nonlinear gradient projection method
1. At each iteration, build a quadratic model

       q(x) = \frac{1}{2}(x - x_k)^T B_k (x - x_k) + \nabla f_k^T (x - x_k)

   where B_k is an SPD approximation of \nabla^2 f(x_k).
2. For some trust region radius \Delta_k, use the gradient projection method to solve

       \min_x q(x)
       \text{s.t.} \quad \max(\ell, x_k - \Delta_k) \le x \le \min(u, x_k + \Delta_k).

3. Update \Delta_k and repeat steps 1-3 until convergence.
Interior point method

Consider the problem

    \min_x f(x)
    \text{s.t.} \quad C_E(x) = 0
    \qquad\quad C_I(x) - s = 0
    \qquad\quad s \ge 0

where s are slack variables.

The interior point method starts from a point inside the feasible region and builds walls on the boundary of the feasible region: a barrier function that goes to infinity as its input approaches zero.

    \min_{x,s} f(x) - \mu \sum_{i=1}^m \log(s_i) \quad \text{s.t.} \quad C_E(x) = 0, \quad C_I(x) - s = 0     (1)

The barrier term -\mu \log x \to \infty as x \to 0^+.
\mu: barrier parameter
An example

Example (\min_x -x + 1, s.t. x \le 1)

    \min_x -x + 1 - \mu \ln(1 - x)

    \mu = 1:        x^* = -0.00005
    \mu = 0.1:      x^* = 0.89999
    \mu = 0.01:     x^* = 0.989999
    \mu = 10^{-5}:  x^* = 0.99993

(Analytically the barrier minimizer is x^* = 1 - \mu, which approaches the constrained solution x^* = 1 as \mu \to 0.)

[Figure: the barrier term -\mu \ln(1 - x) plotted on 0 \le x < 1, blowing up as x \to 1^-.]

The Lagrangian of (1) is

    \mathcal{L}(x, s, y, z) = f(x) - \mu \sum_{i=1}^m \log(s_i) - y^T C_E(x) - z^T (C_I(x) - s)

1. Vector y is the Lagrange multiplier of the equality constraints.
2. Vector z is the Lagrange multiplier of the inequality constraints.
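A sketch reproducing the table above (my reconstruction of the example, using SciPy's bounded scalar minimizer):

```python
# Barrier subproblem: minimize -x + 1 - mu*ln(1 - x) over x < 1.
import numpy as np
from scipy.optimize import minimize_scalar

def barrier_obj(x, mu):
    if x >= 1.0:
        return np.inf                     # outside the feasible region
    return -x + 1.0 - mu * np.log(1.0 - x)

for mu in [1.0, 0.1, 0.01, 1e-5]:
    res = minimize_scalar(barrier_obj, args=(mu,),
                          bounds=(-10.0, 1.0 - 1e-12), method="bounded")
    print(f"mu = {mu:8.5f}  x* = {res.x:.6f}")   # x* = 1 - mu, approaching 1
```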
The KKT conditions

The KKT conditions for (1):

    \nabla_x \mathcal{L} = 0: \quad \nabla f - A_E^T y - A_I^T z = 0
    \nabla_s \mathcal{L} = 0: \quad SZ - \mu I = 0
    \nabla_y \mathcal{L} = 0: \quad C_E(x) = 0
    \nabla_z \mathcal{L} = 0: \quad C_I(x) - s = 0        (2)

Matrix S = diag(s) and matrix Z = diag(z).
Matrix A_E is the Jacobian of C_E and matrix A_I is the Jacobian of C_I.
Newton's step

Let

    F = \begin{pmatrix} \nabla f - A_E^T y - A_I^T z \\ SZe - \mu e \\ C_E(x) \\ C_I(x) - s \end{pmatrix}.

The interior point method uses Newton's method to solve F = 0:

    \nabla F = \begin{pmatrix}
        \nabla_{xx} \mathcal{L} & 0 & -A_E^T(x) & -A_I^T(x) \\
        0 & Z & 0 & S \\
        A_E(x) & 0 & 0 & 0 \\
        A_I(x) & -I & 0 & 0
    \end{pmatrix}

Newton's step:

    \nabla F \begin{pmatrix} p_x \\ p_s \\ p_y \\ p_z \end{pmatrix} = -F,
    \qquad
    \begin{aligned}
    x_{k+1} &= x_k + \alpha_x p_x \\
    s_{k+1} &= s_k + \alpha_s p_s \\
    y_{k+1} &= y_k + \alpha_y p_y \\
    z_{k+1} &= z_k + \alpha_z p_z
    \end{aligned}
Algorithm: Interior point method (IPM)

1. Given initial x_0, s_0, y_0, z_0, and \mu_0.
2. For k = 0, 1, 2, ... until convergence:
   (a) Compute p_x, p_s, p_y, p_z and \alpha_x, \alpha_s, \alpha_y, \alpha_z.
   (b) (x_{k+1}, s_{k+1}, y_{k+1}, z_{k+1}) = (x_k, s_k, y_k, z_k) + (\alpha_x p_x, \alpha_s p_s, \alpha_y p_y, \alpha_z p_z).
   (c) Adjust \mu_{k+1} < \mu_k.
Some comments on the interior point method

1. The complementarity slackness condition says s_i z_i = 0 at the optimal solution. Hence the parameter \mu (from SZ = \mu I) needs to decrease to zero as the current iterate approaches the optimal solution.

2. Why can't we set \mu to zero, or very small, from the beginning? Because that would push x_k to the nearest constraint, and the entire process would move along constraint by constraint, which again becomes an exponential algorithm.

3. To keep x_k (or s and z) from getting too close to any constraint, IPM also limits the step sizes of s and z:

    \alpha_s^{\max} = \max\{\alpha \in (0, 1] : s + \alpha p_s \ge (1 - \tau)s\}
    \alpha_z^{\max} = \max\{\alpha \in (0, 1] : z + \alpha p_z \ge (1 - \tau)z\}

   where \tau \in (0, 1) is a parameter close to 1.
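A sketch of this fraction-to-the-boundary rule in code (the helper name and the default \tau = 0.995 are my choices):

```python
import numpy as np

def max_step(v, p, tau=0.995):
    """Largest alpha in (0, 1] with v + alpha*p >= (1 - tau)*v, for v > 0."""
    neg = p < 0                        # only shrinking components limit alpha
    if not neg.any():
        return 1.0
    return min(1.0, float(np.min(-tau * v[neg] / p[neg])))
```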
Interior point method for linear programming

We will use linear programming to illustrate the details of IPM.

The primal:

    \min_x c^T x
    \text{s.t.} \quad Ax = b, \quad x \ge 0.

The dual:

    \max_\lambda b^T \lambda
    \text{s.t.} \quad A^T \lambda + s = c, \quad s \ge 0.

KKT conditions:

    A^T \lambda + s = c
    Ax = b
    x_i s_i = 0
    x \ge 0, \quad s \ge 0
Solve the problem

Let X = diag(x_1, x_2, \ldots, x_n), S = diag(s_1, s_2, \ldots, s_n), and let e be the vector of all ones. Define

    F = \begin{pmatrix} A^T \lambda + s - c \\ Ax - b \\ XSe - \mu e \end{pmatrix}.

The problem is to solve F = 0.
Newton's method

Using Newton's method,

    \nabla F \begin{pmatrix} p_x \\ p_\lambda \\ p_s \end{pmatrix}
    = \begin{pmatrix} 0 & A^T & I \\ A & 0 & 0 \\ S & 0 & X \end{pmatrix}
      \begin{pmatrix} p_x \\ p_\lambda \\ p_s \end{pmatrix} = -F,
    \qquad
    \begin{aligned}
    x_{k+1} &= x_k + \alpha_x p_x \\
    \lambda_{k+1} &= \lambda_k + \alpha_\lambda p_\lambda \\
    s_{k+1} &= s_k + \alpha_s p_s
    \end{aligned}

How to decide \mu_k?

    \mu_k = \frac{1}{n} x_k^T s_k

is called the duality measure.
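The pieces above fit into a few lines of code. Below is a sketch of one damped Newton step; everything here is my own scaffolding: a dense numpy solve, a centering parameter sigma so the right-hand side targets \sigma\mu (as on the algorithm slide that follows), and the fraction-to-the-boundary helper from earlier.

```python
import numpy as np

def max_step(v, p, tau=0.995):
    neg = p < 0
    if not neg.any():
        return 1.0
    return min(1.0, float(np.min(-tau * v[neg] / p[neg])))

def ipm_step(A, b, c, x, lam, s, sigma=0.1):
    """One Newton step on the perturbed KKT system of the LP."""
    m, n = A.shape
    mu = (x @ s) / n                                   # duality measure
    K = np.block([
        [np.zeros((n, n)), A.T,              np.eye(n)       ],
        [A,                np.zeros((m, m)), np.zeros((m, n))],
        [np.diag(s),       np.zeros((n, m)), np.diag(x)      ],
    ])
    rhs = -np.concatenate([
        A.T @ lam + s - c,                             # dual residual
        A @ x - b,                                     # primal residual
        x * s - sigma * mu * np.ones(n),               # perturbed complementarity
    ])
    p = np.linalg.solve(K, rhs)
    px, plam, ps = p[:n], p[n:n + m], p[n + m:]
    a_pri, a_dual = max_step(x, px), max_step(s, ps)   # keep x, s > 0
    return x + a_pri * px, lam + a_dual * plam, s + a_dual * ps
```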
The central path

The central path is a set of points p(\mu) = (x_\mu, \lambda_\mu, s_\mu) defined by the solution of

    A^T \lambda + s = c
    Ax = b
    x_i s_i = \mu, \quad i = 1, 2, \ldots, n
    x, s > 0
Algorithm

Algorithm: The interior point method for solving linear programming

1. Given an interior point x_0 and the initial guess of slack variables s_0.
2. For k = 0, 1, ...
   (a) Solve

       \begin{pmatrix} 0 & A^T & I \\ A & 0 & 0 \\ S_k & 0 & X_k \end{pmatrix}
       \begin{pmatrix} \Delta x_k \\ \Delta \lambda_k \\ \Delta s_k \end{pmatrix}
       = \begin{pmatrix} c - s_k - A^T \lambda_k \\ b - A x_k \\ -X_k S_k e + \sigma_k \mu_k e \end{pmatrix}

       for \sigma_k \in [0, 1].
   (b) Compute (\alpha_x, \alpha_\lambda, \alpha_s) such that

       \begin{pmatrix} x_{k+1} \\ \lambda_{k+1} \\ s_{k+1} \end{pmatrix}
       = \begin{pmatrix} x_k + \alpha_x \Delta x_k \\ \lambda_k + \alpha_\lambda \Delta \lambda_k \\ s_k + \alpha_s \Delta s_k \end{pmatrix}

       is in the neighborhood of the central path

       \mathcal{N}(\theta) = \{(x, \lambda, s) : \|XSe - \mu e\| \le \theta \mu\} \quad \text{for some } \theta \in (0, 1].
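Hypothetical usage of the ipm_step sketch from the Newton's method slide on a tiny LP (the problem data are mine, chosen so the optimum is x = (1, 0)):

```python
# min x1 + 2*x2  s.t.  x1 + x2 = 1,  x >= 0; reuses ipm_step from above.
import numpy as np

A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0])
x, lam, s = np.array([0.5, 0.5]), np.zeros(1), np.ones(2)  # strictly interior
for k in range(20):
    x, lam, s = ipm_step(A, b, c, x, lam, s, sigma=0.1)
    if (x @ s) / len(x) < 1e-9:        # duality measure small: done
        break
print(x)                                # ~ (1, 0)
```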
Filter method

There are two goals in constrained optimization:
1. Minimize the objective function.
2. Satisfy the constraints.

Example

Suppose the problem is

    \min_x f(x)
    \text{s.t.} \quad c_i(x) = 0 \text{ for } i \in \mathcal{E}
    \qquad\quad c_i(x) \ge 0 \text{ for } i \in \mathcal{I}

Define h(x), a penalty function of the constraints:

    h(x) = \sum_{i \in \mathcal{E}} |c_i(x)| + \sum_{i \in \mathcal{I}} [c_i(x)]^-,

in which the notation [z]^- = \max\{0, -z\}.

The goals become

    \min f(x) \quad \text{and} \quad \min h(x)
The filter method

A pair (f_k, h_k) dominates (f_l, h_l) if f_k < f_l and h_k < h_l.
A filter is a list of pairs (f_k, h_k) such that no pair dominates any other.
The filter method only accepts steps that are not dominated by other pairs, as sketched in code after the algorithm.

Algorithm: The filter method
1. Given initial x_0 and an initial trust region radius \Delta_0.
2. For k = 0, 1, 2, ... until convergence:
   (a) Compute a trial x^+ by solving a local quadratic programming model.
   (b) If (f^+, h^+) is accepted to the filter:
       set x_{k+1} = x^+, add (f^+, h^+) to the filter, and remove pairs dominated by it.
       Else: set x_{k+1} = x_k and decrease \Delta_k.
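A minimal sketch of the filter bookkeeping (function names are mine):

```python
def dominates(a, b):
    """Pair a = (f_a, h_a) dominates b = (f_b, h_b): better in both goals."""
    return a[0] < b[0] and a[1] < b[1]

def try_accept(filter_pairs, trial):
    """Accept trial if no stored pair dominates it; prune pairs it dominates."""
    if any(dominates(p, trial) for p in filter_pairs):
        return filter_pairs, False
    kept = [p for p in filter_pairs if not dominates(trial, p)]
    return kept + [trial], True

# Example: a filter containing (1.0, 0.5) rejects the trial (2.0, 0.7)
# but accepts (0.5, 0.6).
```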
The Maratos effect

The Maratos effect shows that the filter method can reject a good step.

Example

    \min_{x_1, x_2} f(x_1, x_2) = 2(x_1^2 + x_2^2 - 1) - x_1
    \quad \text{s.t.} \quad x_1^2 + x_2^2 - 1 = 0

The optimal solution is x^* = (1, 0).

Suppose

    x_k = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix},
    \qquad
    p_k = \begin{pmatrix} \sin^2\theta \\ -\sin\theta\cos\theta \end{pmatrix}.

Then

    x_{k+1} = x_k + p_k = \begin{pmatrix} \cos\theta + \sin^2\theta \\ \sin\theta(1 - \cos\theta) \end{pmatrix}.
Reject a good step

    \|x_k - x^*\| = \left\| \begin{pmatrix} \cos\theta - 1 \\ \sin\theta \end{pmatrix} \right\|
                  = \sqrt{\cos^2\theta - 2\cos\theta + 1 + \sin^2\theta}
                  = \sqrt{2(1 - \cos\theta)}

    \|x_{k+1} - x^*\| = \left\| \begin{pmatrix} \cos\theta + \sin^2\theta - 1 \\ \sin\theta - \sin\theta\cos\theta \end{pmatrix} \right\|
                      = \left\| \begin{pmatrix} \cos\theta(1 - \cos\theta) \\ \sin\theta(1 - \cos\theta) \end{pmatrix} \right\|
                      = \sqrt{\cos^2\theta(1 - \cos\theta)^2 + \sin^2\theta(1 - \cos\theta)^2}
                      = (1 - \cos\theta)

Therefore

    \frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|^2} = \frac{1}{2}.

This step gives quadratic convergence. However, the filter method will reject this step because

    f(x_k) = -\cos\theta \quad \text{and} \quad c(x_k) = 0,
    f(x_{k+1}) = \sin^2\theta - \cos\theta > f(x_k),
    c(x_{k+1}) = \sin^2\theta > 0 = c(x_k).

Both the objective value and the constraint violation increase, so the new pair is dominated by (f(x_k), 0).
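A quick numeric check of these formulas (\theta = 0.5 is an arbitrary choice of mine):

```python
import numpy as np

theta = 0.5
f = lambda x: 2.0 * (x[0]**2 + x[1]**2 - 1.0) - x[0]
c = lambda x: x[0]**2 + x[1]**2 - 1.0

x_star = np.array([1.0, 0.0])
x_k = np.array([np.cos(theta), np.sin(theta)])
p_k = np.array([np.sin(theta)**2, -np.sin(theta) * np.cos(theta)])
x_n = x_k + p_k

print(np.linalg.norm(x_n - x_star) / np.linalg.norm(x_k - x_star)**2)  # 0.5
print(f(x_n) > f(x_k), c(x_n) > c(x_k))  # True True: both goals get worse
```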
The second order correction

The second order correction could help to solve this problem.
Instead of \nabla c(x_k)^T p_k + c(x_k) = 0, use the quadratic approximation

    c(x_k) + \nabla c(x_k)^T d_k + \frac{1}{2} d_k^T \nabla^2_{xx} c(x_k) d_k = 0.    (3)

Suppose d_k - p_k is small. Use a Taylor expansion to approximate the quadratic term:

    c(x_k + p_k) \approx c(x_k) + \nabla c(x_k)^T p_k + \frac{1}{2} p_k^T \nabla^2_{xx} c(x_k) p_k

    \frac{1}{2} d_k^T \nabla^2_{xx} c(x_k) d_k \approx \frac{1}{2} p_k^T \nabla^2_{xx} c(x_k) p_k \approx c(x_k + p_k) - c(x_k) - \nabla c(x_k)^T p_k.

Equation (3) can then be rewritten as

    \nabla c(x_k)^T d_k + c(x_k + p_k) - \nabla c(x_k)^T p_k = 0.

Use the corrected linearized constraint:

    \nabla c(x_k)^T p + c(x_k + p_k) - \nabla c(x_k)^T p_k = 0.

(The original linearized constraint is \nabla c(x_k)^T p + c(x_k) = 0.)
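To see the correction act on the Maratos example, here is a sketch. The minimum-norm step satisfying the corrected constraint relative to p_k is p = p_k - \nabla c \cdot c(x_k + p_k) / (\nabla c^T \nabla c); this closed form is my derivation, not from the slides.

```python
import numpy as np

theta = 0.5
c = lambda x: x[0]**2 + x[1]**2 - 1.0
grad_c = lambda x: 2.0 * x

x_k = np.array([np.cos(theta), np.sin(theta)])
p_k = np.array([np.sin(theta)**2, -np.sin(theta) * np.cos(theta)])

g = grad_c(x_k)
p_corr = p_k - g * c(x_k + p_k) / (g @ g)   # second-order corrected step
print(c(x_k + p_k), c(x_k + p_corr))        # violation drops sharply
```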