(The example problem is $\min_x\; x_1 + x_2$ s.t. $x_1^2 + x_2^2 - 2 = 0$, whose solution is $x^* = (-1, -1)$.)

Define
\[
Q(x, \mu) = x_1 + x_2 + \frac{\mu}{2}\left(x_1^2 + x_2^2 - 2\right)^2.
\]
For $\mu = 1$,
\[
\nabla Q(x, 1) = \begin{bmatrix} 1 + 2(x_1^2 + x_2^2 - 2)x_1 \\ 1 + 2(x_1^2 + x_2^2 - 2)x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\qquad
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \approx \begin{bmatrix} -1.1 \\ -1.1 \end{bmatrix}.
\]
For $\mu = 10$,
\[
\nabla Q(x, 10) = \begin{bmatrix} 1 + 20(x_1^2 + x_2^2 - 2)x_1 \\ 1 + 20(x_1^2 + x_2^2 - 2)x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\qquad
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \approx \begin{bmatrix} -1.01 \\ -1.01 \end{bmatrix}.
\]
(UNIT 9,10) Numerical Optimization May 1, 2011 2 / 24
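The two stationarity solves on this slide can be checked numerically. Below is a minimal sketch (my own code, not from the slides): Newton's method on $\nabla Q = 0$ for the penalty function $Q(x,\mu) = x_1 + x_2 + \frac{\mu}{2}(x_1^2+x_2^2-2)^2$, with the gradient and Hessian worked out by hand; the function names are illustrative.

```python
def grad_hess_Q(x1, x2, mu):
    """Gradient and Hessian of Q(x, mu) = x1 + x2 + (mu/2)(x1^2 + x2^2 - 2)^2."""
    r = x1 * x1 + x2 * x2 - 2.0
    g1 = 1.0 + 2.0 * mu * r * x1
    g2 = 1.0 + 2.0 * mu * r * x2
    h11 = 2.0 * mu * r + 4.0 * mu * x1 * x1
    h22 = 2.0 * mu * r + 4.0 * mu * x2 * x2
    h12 = 4.0 * mu * x1 * x2
    return (g1, g2), (h11, h12, h22)

def newton_min_Q(mu, x1=-1.2, x2=-1.2, iters=50):
    """Newton's method on grad Q = 0, started near the constrained minimizer."""
    for _ in range(iters):
        (g1, g2), (h11, h12, h22) = grad_hess_Q(x1, x2, mu)
        det = h11 * h22 - h12 * h12
        # Solve the 2x2 system H p = -g by Cramer's rule.
        p1 = (-g1 * h22 + g2 * h12) / det
        p2 = (-g2 * h11 + g1 * h12) / det
        x1, x2 = x1 + p1, x2 + p2
    return x1, x2

x_mu1 = newton_min_Q(1.0)    # roughly (-1.11, -1.11)
x_mu10 = newton_min_Q(10.0)  # roughly (-1.01, -1.01), closer to (-1, -1)
```

Larger `mu` moves the unconstrained minimizer of Q toward the constrained solution, matching the slide's point.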
Size of $\mu$

It seems that the larger $\mu$ is, the better the solution. However, when $\mu$ is large, the matrix $\nabla^2 Q \approx \mu \nabla c\, \nabla c^T$ is ill-conditioned.
\[
Q(x, \mu) = f(x) + \frac{\mu}{2}\left(c(x)\right)^2
\]
\[
\nabla Q = \nabla f + \mu c \nabla c
\]
\[
\nabla^2 Q = \nabla^2 f + \mu \nabla c\, \nabla c^T + \mu c \nabla^2 c
\]
$\mu$ cannot be too small either.
Example
\[
\min_x\; -5x_1^2 + x_2^2 \quad \text{s.t. } x_1 = 1.
\]
\[
Q(x, \mu) = -5x_1^2 + x_2^2 + \frac{\mu}{2}(x_1 - 1)^2.
\]
For $\mu < 10$, the problem $\min_x Q(x, \mu)$ is unbounded: the $x_1$ terms are $-5x_1^2 + \frac{\mu}{2}(x_1 - 1)^2$, which tend to $-\infty$ as $|x_1| \to \infty$ whenever $\frac{\mu}{2} < 5$.
Quadratic penalty function

Pick a proper initial guess of $\mu$ and gradually increase it.

Algorithm: Quadratic penalty function
1. Given $\mu_0 > 0$ and $x_0$
2. For $k = 0, 1, 2, \ldots$
   1. Solve $\min_x Q(x; \mu_k) = f(x) + \frac{\mu_k}{2}\sum_i c_i^2(x)$.
   2. If converged, stop.
   3. Increase $\mu_{k+1} > \mu_k$ and find a new $x_{k+1}$.

Problem: the solution is not exact for any finite $\mu$.
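A sketch of this outer loop on the circle example $\min x_1 + x_2$ s.t. $x_1^2 + x_2^2 = 2$ (my own code; it exploits the symmetry $x_1 = x_2 = t$, so each subproblem reduces to a scalar root-find by bisection):

```python
def solve_subproblem(mu):
    """By symmetry the minimizer of Q has x1 = x2 = t; stationarity is
    1 + 2*mu*(2t^2 - 2)*t = 0. g(-2) < 0 < g(-1), so bisect on [-2, -1]."""
    g = lambda t: 1.0 + 2.0 * mu * (2.0 * t * t - 2.0) * t
    lo, hi = -2.0, -1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mu = 1.0
for k in range(8):                    # mu_k = 10^k, increased gradually
    t = solve_subproblem(mu)
    c = 2.0 * t * t - 2.0             # constraint violation at (t, t)
    if abs(c) < 1e-6:                 # converged: nearly feasible
        break
    mu *= 10.0
# t approaches the solution component -1 only as mu grows large
```

The loop illustrates the slide's "Problem" line: for each finite `mu` the subproblem minimizer is slightly infeasible, so `mu` must keep growing.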
Augmented Lagrangian method

Use the Lagrangian function to remedy the inexactness problem.

Let
\[
\mathcal{L}_A(x, \lambda, \mu) = f(x) - \sum_i \lambda_i c_i(x) + \frac{\mu}{2}\sum_i c_i^2(x).
\]
Then
\[
\nabla_x \mathcal{L}_A = \nabla f(x) - \sum_i \lambda_i \nabla c_i(x) + \mu \sum_i c_i(x) \nabla c_i(x)
= \nabla f(x) - \sum_i \left(\lambda_i - \mu c_i(x)\right) \nabla c_i(x).
\]
By the Lagrangian theory, $\nabla f = \sum_i \lambda_i^* \nabla c_i$ at the solution, so $\lambda_i^* \approx \lambda_i - \mu c_i$.

At the optimal solution,
\[
c_i(x^*) \approx -\frac{1}{\mu}\left(\lambda_i^* - \lambda_i\right).
\]
If we can approximate $\lambda_i \approx \lambda_i^*$, then $\mu_k$ need not be increased indefinitely:
\[
\lambda_i^{k+1} = \lambda_i^k - \mu_k c_i(x^k).
\]
Algorithm: update $\lambda_i$ at each iteration.
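The multiplier update can be demonstrated on the same circle example, $\min x_1 + x_2$ s.t. $c(x) = x_1^2 + x_2^2 - 2 = 0$, where the optimal multiplier is $\lambda^* = -0.5$. A sketch (my own code; again using the symmetry $x_1 = x_2 = t$ so the inner minimization is a scalar root-find). Note that $\mu$ stays fixed while $\lambda$ converges:

```python
def solve_inner(lam, mu):
    """By symmetry x1 = x2 = t; stationarity of L_A in t is
    1 + 2*(mu*(2t^2 - 2) - lam)*t = 0. Bisect on [-2, -0.5]."""
    g = lambda t: 1.0 + 2.0 * (mu * (2.0 * t * t - 2.0) - lam) * t
    lo, hi = -2.0, -0.5
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam, mu = 0.0, 10.0                 # mu is held fixed throughout
for k in range(30):
    t = solve_inner(lam, mu)
    c = 2.0 * t * t - 2.0           # constraint value at (t, t)
    lam = lam - mu * c              # update: lam_{k+1} = lam_k - mu_k * c(x_k)
# lam -> -0.5 and t -> -1 without ever increasing mu
```

This is the point of the slide: with the multiplier estimate in place, a moderate fixed $\mu$ already drives the iterates to the exact solution.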
Inequality constraints

There are two approaches to handling inequality constraints.
1. Make the objective function nonsmooth (non-differentiable at some points).
2. Add slack variables to turn the inequality constraints into equality constraints:
\[
c_i(x) \ge 0 \quad\Longleftrightarrow\quad
\begin{cases} c_i(x) - s_i = 0 \\ s_i \ge 0 \end{cases}
\]
But then we have bound constraints on the slack variables.
We will focus on the second approach here.
Inequality constraints

Suppose the augmented Lagrangian method is used and all inequality constraints are converted to bound constraints. For a fixed $\mu$ and $\lambda$,
\[
\min_x \mathcal{L}_A(x, \lambda, \mu) = f(x) - \sum_{i=1}^m \lambda_i c_i(x) + \frac{\mu}{2}\sum_{i=1}^m c_i^2(x)
\quad \text{s.t. } \ell \le x \le u.
\]
The first-order necessary condition for $x$ to be a solution of the above problem is
\[
x = P\left(x - \nabla_x \mathcal{L}_A(x, \lambda, \mu),\, \ell,\, u\right),
\]
where
\[
P(g, \ell, u)_i =
\begin{cases}
\ell_i, & \text{if } g_i \le \ell_i; \\
g_i, & \text{if } g_i \in (\ell_i, u_i); \\
u_i, & \text{if } g_i \ge u_i,
\end{cases}
\qquad \text{for all } i = 1, 2, \ldots, n.
\]
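The projection $P$ is just a componentwise clamp. A minimal sketch (function name mine):

```python
def project(g, lower, upper):
    """Componentwise projection P(g, l, u): clamp each entry into [l_i, u_i]."""
    return [min(max(gi, li), ui) for gi, li, ui in zip(g, lower, upper)]

p = project([-3.0, 0.5, 7.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0])  # [0.0, 0.5, 1.0]
```

With this helper, the first-order condition above is the fixed-point test `x == project([xi - gi for xi, gi in zip(x, grad)], lower, upper)`.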
Nonlinear gradient projection method

Sequential quadratic programming + trust region method to solve
\[
\min_x f(x) \quad \text{s.t. } \ell \le x \le u.
\]
Algorithm: Nonlinear gradient projection method
1. At each iteration, build a quadratic model
\[
q(x) = \frac{1}{2}(x - x_k)^T B_k (x - x_k) + \nabla f_k^T (x - x_k),
\]
where $B_k$ is an SPD approximation of $\nabla^2 f(x_k)$.
2. For some trust-region radius $\Delta_k$, use the gradient projection method to solve
\[
\min_x q(x) \quad \text{s.t. } \max(\ell, x_k - \Delta_k) \le x \le \min(u, x_k + \Delta_k).
\]
3. Update $\Delta_k$ and repeat 1-3 until convergence.
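The inner solver in step 2 can be sketched as plain projected gradient descent on a box-constrained model (my own minimal example, with an illustrative quadratic rather than a $B_k$ built from real second-order information):

```python
def clamp(v, lo, hi):
    return [min(max(vi, li), ui) for vi, li, ui in zip(v, lo, hi)]

def projected_gradient(grad, x, lo, hi, step=0.1, iters=500):
    """Gradient projection: repeat x <- P(x - step * grad q(x), lo, hi)."""
    for _ in range(iters):
        g = grad(x)
        x = clamp([xi - step * gi for xi, gi in zip(x, g)], lo, hi)
    return x

# Toy model subproblem: q(x) = (x1 - 2)^2 + (x2 + 1)^2 on the box [0, 1]^2.
# The unconstrained minimizer (2, -1) is infeasible; the box solution is (1, 0).
grad_q = lambda x: [2.0 * (x[0] - 2.0), 2.0 * (x[1] + 1.0)]
x_star = projected_gradient(grad_q, [0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
```

The fixed step size works here because it is below 1/L for this quadratic; a practical implementation would use a line search along the projected path.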
Interior point method

Consider the problem
\[
\min_x f(x) \quad \text{s.t. } C_E(x) = 0,\; C_I(x) - s = 0,\; s \ge 0,
\]
where $s$ are slack variables.

The interior point method starts from a point inside the feasible region and builds walls on the boundary of the feasible region: a barrier function goes to infinity when its argument approaches zero.
\[
\min_{x,s} f(x) - \mu \sum_{i=1}^m \log(s_i) \quad \text{s.t. }
C_E(x) = 0,\; C_I(x) - s = 0. \tag{1}
\]
The function $f(x) = -\mu \log x \to \infty$ as $x \to 0^+$.
$\mu$: barrier parameter.
An example

Example ($\min\, -x + 1$, s.t. $x \le 1$)
\[
\min_x\; -x + 1 - \mu \ln(1 - x)
\]
$\mu = 1$: $x \approx 0.00005$
$\mu = 0.1$: $x \approx 0.89999$
$\mu = 0.01$: $x \approx 0.989999$
$\mu = 10^{-5}$: $x \approx 0.99993$

(Figure: the barrier objective plotted against $x$; the barrier term blows up as $x \to 1$.)
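These minimizers can be reproduced numerically. A minimal sketch (my own code, assuming the objective is $-x + 1 - \mu\ln(1-x)$, whose stationarity condition $-1 + \mu/(1-x) = 0$ gives $x = 1 - \mu$ exactly; the slide's values match this up to solver tolerance):

```python
def barrier_min(mu, iters=200):
    """Minimize -x + 1 - mu*ln(1 - x) over x < 1 by bisecting on the
    derivative g(x) = -1 + mu/(1 - x), which is increasing in x."""
    g = lambda x: -1.0 + mu / (1.0 - x)
    lo, hi = -10.0, 1.0 - 1e-15        # g(lo) < 0 < g(hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mus = [1.0, 0.1, 0.01, 1e-5]
xs = [barrier_min(m) for m in mus]    # approaches the boundary x = 1 as mu -> 0
```

As $\mu$ shrinks, the barrier minimizer slides toward the constraint boundary, which is exactly the path the interior point method follows.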
The Lagrangian of (1) is
\[
\mathcal{L}(x, s, y, z) = f(x) - \mu \sum_{i=1}^m \log(s_i) - y^T C_E(x) - z^T\left(C_I(x) - s\right).
\]
1. Vector $y$ is the Lagrangian multiplier of the equality constraints.
2. Vector $z$ is the Lagrangian multiplier of the inequality constraints.
The KKT conditions

The KKT conditions for (1):
\[
\begin{aligned}
\nabla_x \mathcal{L} = 0 &\;\Longrightarrow\; \nabla f - A_E^T y - A_I^T z = 0 \\
\nabla_s \mathcal{L} = 0 &\;\Longrightarrow\; SZ - \mu I = 0 \\
\nabla_y \mathcal{L} = 0 &\;\Longrightarrow\; C_E(x) = 0 \\
\nabla_z \mathcal{L} = 0 &\;\Longrightarrow\; C_I(x) - s = 0
\end{aligned}
\tag{2}
\]
Matrix $S = \operatorname{diag}(s)$ and matrix $Z = \operatorname{diag}(z)$.
Matrix $A_E$ is the Jacobian of $C_E$ and matrix $A_I$ is the Jacobian of $C_I$.
Newton's step

Let
\[
F = \begin{bmatrix} \nabla f - A_E^T y - A_I^T z \\ SZ - \mu I \\ C_E(x) \\ C_I(x) - s \end{bmatrix}.
\]
The interior point method uses Newton's method to solve $F = 0$:
\[
\nabla F = \begin{bmatrix}
\nabla_{xx}^2 \mathcal{L} & 0 & -A_E^T(x) & -A_I^T(x) \\
0 & Z & 0 & S \\
A_E(x) & 0 & 0 & 0 \\
A_I(x) & -I & 0 & 0
\end{bmatrix}.
\]
Newton's step:
\[
\nabla F \begin{bmatrix} p_x \\ p_s \\ p_y \\ p_z \end{bmatrix} = -F,
\]
\[
x_{k+1} = x_k + \alpha_x p_x, \quad
s_{k+1} = s_k + \alpha_s p_s, \quad
y_{k+1} = y_k + \alpha_y p_y, \quad
z_{k+1} = z_k + \alpha_z p_z.
\]
Algorithm: Interior point method (IPM)
1. Given initial $x_0$, $s_0$, $y_0$, $z_0$, and $\mu_0$.
2. For $k = 0, 1, 2, \ldots$ until convergence
   (a) Compute $p_x, p_s, p_y, p_z$ and $\alpha_x, \alpha_s, \alpha_y, \alpha_z$.
   (b) $(x_{k+1}, s_{k+1}, y_{k+1}, z_{k+1}) = (x_k, s_k, y_k, z_k) + (\alpha_x p_x,\, \alpha_s p_s,\, \alpha_y p_y,\, \alpha_z p_z)$.
   (c) Adjust $\mu_{k+1} < \mu_k$.
Some comments on the interior point method

1. The complementarity slackness condition says $s_i z_i = 0$ at the optimal solution; hence the parameter $\mu$ (recall $SZ = \mu I$) needs to decrease to zero as the current solution approaches the optimal solution.
2. Why can we not set $\mu$ to zero, or very small, at the beginning? Because that would drive $x_k$ to the nearest constraint, and the entire process would then move along constraint after constraint, which again becomes an exponential algorithm.
3. To keep $x_k$ (or $s$ and $z$) from getting too close to any constraint, IPM also limits the step sizes of $s$ and $z$; for some $\tau \in (0, 1)$,
\[
\alpha_s^{\max} = \max\{\alpha \in (0, 1] : s + \alpha p_s \ge (1 - \tau)s\},
\]
\[
\alpha_z^{\max} = \max\{\alpha \in (0, 1] : z + \alpha p_z \ge (1 - \tau)z\}.
\]
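This fraction-to-boundary rule has a simple closed form: only the components that are shrinking restrict the step. A sketch (function name mine):

```python
def max_step(v, p, tau=0.995):
    """Largest alpha in (0, 1] with v + alpha*p >= (1 - tau)*v,
    i.e. no positive component may lose more than a fraction tau of its value."""
    alpha = 1.0
    for vi, pi in zip(v, p):
        if pi < 0.0:                       # only decreasing components bind
            alpha = min(alpha, -tau * vi / pi)
    return alpha

a = max_step([1.0, 2.0], [-4.0, 1.0])      # limited by the first component
```

Here the first component allows at most $\alpha = 0.995 \cdot 1/4 = 0.24875$, so the full Newton step is damped to keep $v$ strictly positive.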
Interior point method for linear programming

We will use linear programming to illustrate the details of IPM.

The primal:
\[
\min_x\; c^T x \quad \text{s.t. } Ax = b,\; x \ge 0.
\]
The dual:
\[
\max_\lambda\; b^T \lambda \quad \text{s.t. } A^T \lambda + s = c,\; s \ge 0.
\]
KKT conditions:
\[
A^T \lambda + s = c, \qquad
Ax = b, \qquad
x_i s_i = 0, \qquad
x \ge 0,\; s \ge 0.
\]
Solve the problem

Let $X = \operatorname{diag}(x_1, \ldots, x_n)$ and $S = \operatorname{diag}(s_1, \ldots, s_n)$, and let
\[
F = \begin{bmatrix} A^T \lambda + s - c \\ Ax - b \\ XSe - \mu e \end{bmatrix}.
\]
Newton's method for $F = 0$ solves
\[
\begin{bmatrix} 0 & A^T & I \\ A & 0 & 0 \\ S & 0 & X \end{bmatrix}
\begin{bmatrix} p_x \\ p_\lambda \\ p_s \end{bmatrix} = -F,
\]
\[
x_{k+1} = x_k + \alpha_x p_x, \quad
\lambda_{k+1} = \lambda_k + \alpha_\lambda p_\lambda, \quad
s_{k+1} = s_k + \alpha_s p_s.
\]
How to decide $\mu_k$? The quantity $\mu_k = \frac{1}{n} x_k^T s_k$ is called the duality measure.
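The whole primal-dual iteration fits in a short script. Below is a minimal sketch (my own code, not from the slides) of a basic path-following IPM for $\min c^T x$, $Ax = b$, $x \ge 0$, using a fixed centering parameter $\sigma = 0.5$, a fraction-to-boundary rule, and a small dense Gaussian-elimination solver so the example is self-contained:

```python
def solve_linear(M, rhs):
    """Gaussian elimination with partial pivoting, for small dense systems."""
    n = len(M)
    A = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for cc in range(col, n + 1):
                A[r][cc] -= f * A[col][cc]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (A[r][n] - sum(A[r][cc] * x[cc] for cc in range(r + 1, n))) / A[r][r]
    return x

def ipm_lp(A, b, c, x, lam, s, sigma=0.5, tau=0.995, iters=30):
    """Basic primal-dual path-following IPM for min c^T x, Ax = b, x >= 0."""
    m, n = len(A), len(c)
    for _ in range(iters):
        mu = sum(xi * si for xi, si in zip(x, s)) / n      # duality measure
        # KKT residuals (target minus current).
        rd = [c[j] - s[j] - sum(A[i][j] * lam[i] for i in range(m)) for j in range(n)]
        rp = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        rc = [sigma * mu - x[j] * s[j] for j in range(n)]
        # Assemble the Newton matrix [0 A^T I; A 0 0; S 0 X].
        N = n + m + n
        M = [[0.0] * N for _ in range(N)]
        for j in range(n):
            for i in range(m):
                M[j][n + i] = A[i][j]          # A^T block
            M[j][n + m + j] = 1.0              # I block
        for i in range(m):
            for j in range(n):
                M[n + i][j] = A[i][j]          # A block
        for j in range(n):
            M[n + m + j][j] = s[j]             # S block
            M[n + m + j][n + m + j] = x[j]     # X block
        d = solve_linear(M, rd + rp + rc)
        dx, dlam, ds = d[:n], d[n:n + m], d[n + m:]
        # Fraction-to-boundary step lengths keep x and s strictly positive.
        ax = min([1.0] + [-tau * x[j] / dx[j] for j in range(n) if dx[j] < 0])
        az = min([1.0] + [-tau * s[j] / ds[j] for j in range(n) if ds[j] < 0])
        x = [x[j] + ax * dx[j] for j in range(n)]
        lam = [lam[i] + az * dlam[i] for i in range(m)]
        s = [s[j] + az * ds[j] for j in range(n)]
    return x, lam, s

# min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0; solution x = (1, 0), dual lam = 1.
x, lam, s = ipm_lp([[1.0, 1.0]], [1.0], [1.0, 2.0], [0.5, 0.5], [0.0], [1.0, 1.0])
```

A production solver would use sparse factorizations, a predictor-corrector choice of $\sigma$, and infeasibility-aware stopping tests; the sketch keeps only the structure named on the slides.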
The central path

The central path is a set of points $p(\mu) = (x_\mu, \lambda_\mu, s_\mu)$ defined by the solution of
\[
A^T \lambda + s = c, \qquad
Ax = b, \qquad
x_i s_i = \mu, \;\; i = 1, 2, \ldots, n, \qquad
x, s > 0.
\]
Algorithm

Algorithm: The interior point method for solving linear programming
1. Given an interior point $x_0$ and the initial guess of slack variables $s_0$.
2. For $k = 0, 1, \ldots$
   (a) Solve
\[
\begin{bmatrix} 0 & A^T & I \\ A & 0 & 0 \\ S_k & 0 & X_k \end{bmatrix}
\begin{bmatrix} \Delta x_k \\ \Delta \lambda_k \\ \Delta s_k \end{bmatrix}
= \begin{bmatrix} c - s_k - A^T \lambda_k \\ b - A x_k \\ -X_k S_k e + \sigma_k \mu_k e \end{bmatrix}
\]
for $\sigma_k \in [0, 1]$.
   (b) Compute $(\alpha_x, \alpha_\lambda, \alpha_s)$ such that
\[
(x_{k+1}, \lambda_{k+1}, s_{k+1}) = (x_k + \alpha_x \Delta x_k,\; \lambda_k + \alpha_\lambda \Delta \lambda_k,\; s_k + \alpha_s \Delta s_k)
\]
is in a neighborhood of the central path, e.g. $\mathcal{N}(\gamma) = \{(x, \lambda, s) : \|XSe - \mu e\| \le \gamma \mu\}$.
To measure the constraint violation, define
\[
h(x) = \sum_{i \in E} |c_i(x)| + \sum_{i \in I} \left[c_i(x)\right]^-,
\]
in which the notation $[z]^- = \max\{0, -z\}$.
The goals become
\[
\min f(x) \quad \text{and} \quad \min h(x).
\]
The filter method

A pair $(f_k, h_k)$ dominates $(f_l, h_l)$ if $f_k < f_l$ and $h_k < h_l$.
A filter is a list of pairs $(f_k, h_k)$ such that no pair dominates any other.
The filter method only accepts steps that are not dominated by other pairs.

Algorithm: The filter method
1. Given initial $x_0$ and an initial trust region $\Delta_0$.
2. For $k = 0, 1, 2, \ldots$ until convergence
   1. Compute a trial $x^+$ by solving a local quadratic programming model.
   2. If $(f^+, h^+)$ is accepted to the filter:
      set $x_{k+1} = x^+$, add $(f^+, h^+)$ to the filter, and remove pairs dominated by it.
      Else: set $x_{k+1} = x_k$ and decrease $\Delta_k$.
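The filter bookkeeping is easy to express directly from these definitions (my own sketch; practical filters also add small acceptance margins, which are omitted here):

```python
def dominates(a, b):
    """Pair a = (f_a, h_a) dominates b when both of its entries are smaller."""
    return a[0] < b[0] and a[1] < b[1]

def try_accept(filt, pair):
    """Reject pair if some filter entry dominates it; otherwise add it
    and drop every entry that the new pair dominates."""
    if any(dominates(q, pair) for q in filt):
        return filt, False
    filt = [q for q in filt if not dominates(pair, q)] + [pair]
    return filt, True

filt = [(3.0, 0.5), (1.0, 2.0)]
filt, ok = try_accept(filt, (2.0, 1.0))    # accepted: no entry dominates it
filt, ok2 = try_accept(filt, (4.0, 3.0))   # rejected: (2.0, 1.0) dominates it
```

Note that `(2.0, 1.0)` is accepted even though it improves neither entry of the filter in both coordinates; it only has to avoid being dominated.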
The Maratos effect

The Maratos effect shows that the filter method could reject a good step.

Example
\[
\min_{x_1, x_2}\; f(x_1, x_2) = 2(x_1^2 + x_2^2 - 1) - x_1
\quad \text{s.t. } x_1^2 + x_2^2 - 1 = 0.
\]
The optimal solution is $x^* = (1, 0)$.

Suppose
\[
x_k = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}, \qquad
p_k = \begin{bmatrix} \sin^2\theta \\ -\sin\theta\cos\theta \end{bmatrix}.
\]
Then
\[
x_{k+1} = x_k + p_k = \begin{bmatrix} \cos\theta + \sin^2\theta \\ \sin\theta(1 - \cos\theta) \end{bmatrix}.
\]
Reject a good step

\[
\|x_k - x^*\| = \left\| \begin{bmatrix} \cos\theta - 1 \\ \sin\theta \end{bmatrix} \right\|
= \sqrt{\cos^2\theta - 2\cos\theta + 1 + \sin^2\theta}
= \sqrt{2(1 - \cos\theta)},
\]
\[
\|x_{k+1} - x^*\| = \left\| \begin{bmatrix} \cos\theta + \sin^2\theta - 1 \\ \sin\theta - \sin\theta\cos\theta \end{bmatrix} \right\|
= \left\| \begin{bmatrix} \cos\theta(1 - \cos\theta) \\ \sin\theta(1 - \cos\theta) \end{bmatrix} \right\|
= \sqrt{\cos^2\theta(1 - \cos\theta)^2 + \sin^2\theta(1 - \cos\theta)^2}
= 1 - \cos\theta.
\]
Therefore $\dfrac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|^2} = \dfrac{1}{2}$. This step gives quadratic convergence.

However, the filter method will reject this step because
\[
f(x_k) = -\cos\theta, \qquad c(x_k) = 0,
\]
\[
f(x_{k+1}) = 2\sin^2\theta - \cos\theta - \sin^2\theta = \sin^2\theta - \cos\theta > f(x_k),
\]
\[
c(x_{k+1}) = \sin^2\theta > 0 = c(x_k).
\]
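All of these identities can be verified numerically for a concrete angle (my own check, not from the slides):

```python
import math

theta = 0.5
xs = (1.0, 0.0)                                    # optimal point x*
xk = (math.cos(theta), math.sin(theta))
pk = (math.sin(theta) ** 2, -math.sin(theta) * math.cos(theta))
xk1 = (xk[0] + pk[0], xk[1] + pk[1])

f = lambda x: 2.0 * (x[0] ** 2 + x[1] ** 2 - 1.0) - x[0]
c = lambda x: x[0] ** 2 + x[1] ** 2 - 1.0
dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])

ratio = dist(xk1, xs) / dist(xk, xs) ** 2          # the quadratic-rate constant 1/2
worse_f = f(xk1) > f(xk)                           # objective gets worse...
worse_c = abs(c(xk1)) > abs(c(xk))                 # ...and so does feasibility
```

Both indicators are worse at $x_{k+1}$, so the pair $(f, h)$ moves up and to the right and the filter rejects an asymptotically excellent step.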
The second order correction

The second order correction can help to solve this problem.

Instead of $\nabla c(x_k)^T p_k + c(x_k) = 0$, use the quadratic approximation
\[
c(x_k) + \nabla c(x_k)^T d_k + \frac{1}{2} d_k^T \nabla_{xx}^2 c(x_k)\, d_k = 0. \tag{3}
\]
Suppose $d_k - p_k$ is small. Use a Taylor expansion to approximate the quadratic term:
\[
c(x_k + p_k) \approx c(x_k) + \nabla c(x_k)^T p_k + \frac{1}{2} p_k^T \nabla_{xx}^2 c(x_k)\, p_k,
\]
\[
\frac{1}{2} d_k^T \nabla_{xx}^2 c(x_k)\, d_k \approx \frac{1}{2} p_k^T \nabla_{xx}^2 c(x_k)\, p_k \approx c(x_k + p_k) - c(x_k) - \nabla c(x_k)^T p_k.
\]
Equation (3) can then be rewritten as
\[
\nabla c(x_k)^T d_k + c(x_k + p_k) - \nabla c(x_k)^T p_k = 0.
\]
Use the corrected linearized constraint:
\[
\nabla c(x_k)^T p + c(x_k + p_k) - \nabla c(x_k)^T p_k = 0.
\]
(The original linearized constraint is $\nabla c(x_k)^T p + c(x_k) = 0$.)
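On the Maratos example, the effect of the corrected constraint is easy to see numerically. A sketch (my own code; it picks the minimum-change $d$ satisfying $\nabla c^T d = \nabla c^T p_k - c(x_k + p_k)$, a choice the slides do not specify):

```python
import math

theta = 0.5
xk = (math.cos(theta), math.sin(theta))
pk = (math.sin(theta) ** 2, -math.sin(theta) * math.cos(theta))
c = lambda x: x[0] ** 2 + x[1] ** 2 - 1.0
f = lambda x: 2.0 * (x[0] ** 2 + x[1] ** 2 - 1.0) - x[0]

xk1 = (xk[0] + pk[0], xk[1] + pk[1])            # uncorrected step
gc = (2.0 * xk[0], 2.0 * xk[1])                 # grad c(xk)
# Minimum-change correction: d = pk - grad_c * c(xk + pk) / ||grad_c||^2,
# which satisfies the corrected linearized constraint exactly.
coef = c(xk1) / (gc[0] ** 2 + gc[1] ** 2)
d = (pk[0] - coef * gc[0], pk[1] - coef * gc[1])
xc = (xk[0] + d[0], xk[1] + d[1])               # corrected point
# xc nearly restores feasibility and, unlike xk1, also reduces f below f(xk).
```

With the correction, both the constraint violation and the objective improve relative to $x_k$, so the filter accepts the (essentially unchanged) fast step.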