
Augmented Lagrangian algorithms based on the spectral projected gradient method for solving nonlinear programming problems 1

M. A. Diniz-Ehrhardt 2, M. A. Gomes-Ruggiero 3, J. M. Martínez 4, S. A. Santos 5

1 This research has been supported by FAPESP (grants 90/3724-6 and 01/04597-4), PRONEX, CNPq and FAEP-Unicamp.
2 Associate Professor, Departamento de Matemática Aplicada, IMECC-UNICAMP, CP 6065, 13081-970 Campinas SP, Brazil (cheti@ime.unicamp.br).
3 Associate Professor, Departamento de Matemática Aplicada, IMECC-UNICAMP, CP 6065, 13081-970 Campinas SP, Brazil (marcia@ime.unicamp.br).
4 Professor, Departamento de Matemática Aplicada, IMECC-UNICAMP, CP 6065, 13081-970 Campinas SP, Brazil (martinez@ime.unicamp.br).
5 Associate Professor, Departamento de Matemática Aplicada, IMECC-UNICAMP, CP 6065, 13081-970 Campinas SP, Brazil (sandra@ime.unicamp.br).
Abstract. The Spectral Projected Gradient method (SPG) is an algorithm for large-scale bound-constrained optimization introduced recently by Birgin, Martínez and Raydan. It is based on Raydan's unconstrained generalization of the Barzilai-Borwein method for quadratics. The SPG algorithm turned out to be surprisingly effective for solving many large-scale minimization problems with box constraints. Therefore, it is natural to test its performance on the subproblems that appear in nonlinear programming methods based on augmented Lagrangians. In this work, augmented Lagrangian methods that use SPG as the underlying convex-constraint solver are introduced (ALSPG), and the methods are tested on two sets of problems. First, a meaningful subset of large-scale nonlinearly constrained problems of the CUTE collection is solved and the performance is compared with that of LANCELOT. Second, a family of location problems in the minimax formulation is solved against the package FFSQP.

Key Words: Augmented Lagrangian methods, projected gradients, nonmonotone line search, large-scale problems, bound-constrained problems, Barzilai-Borwein method.

1. Introduction

In a recent paper (Ref. 1), Birgin, Martínez and Raydan introduced the Spectral Projected Gradient method (SPG) for continuous optimization with convex constraints. This algorithm is based on Raydan's unconstrained generalization of the Barzilai-Borwein method for quadratics (see Refs. 2, 3, 4). Consider the problem

    Minimize F(x) subject to x ∈ Ω,    (1)

where F has continuous first partial derivatives and Ω is closed and convex. Given the current point x^k ∈ Ω, SPG computes the search direction d^k ∈ IR^n as

    d^k = P(x^k − α_k ∇F(x^k)) − x^k,    (2)

where P(z) is the orthogonal projection of z on Ω and α_k is the spectral scaling parameter. The new point x^{k+1} = x^k + t_k d^k is chosen so as to satisfy a nonmonotone sufficient descent condition (see Ref. 5). The SPG method is especially useful when projections on Ω are easy to compute, for example, when Ω is a box or a ball. In these cases the method is extremely easy to code and its memory requirements are minimal. Surprisingly, its numerical performance is very good when compared with sophisticated trust-region algorithms. SPG algorithms have been successful in many problems, both academic (see Refs. 1, 6, 7) and industrial (see Refs. 8, 9, 10, 11, 12).

The fact that SPG is very easy to code is important in practical situations. In some engineering applications the objective function F has already been coded in unusual computer languages, and it is better to write an SPG code in that language than to rely on not-always-effective interface software. Moreover, the code is very short and suitable for microcomputers. Finally, it is well known that the main obstacle to the popularization of parallel computer architectures is the need to develop specific architecture-oriented software for mathematical problems. Clearly, simple algorithms make this task easier.
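When Ω is a box, the projection in (2) is a componentwise clipping, which is one reason the method is so easy to code. A minimal sketch in Python (illustrative only — the codes compared in this paper are in Fortran 77; the function names are ours):

```python
import numpy as np

def project_box(z, lower, upper):
    """Orthogonal projection of z onto the box {x : lower <= x <= upper}."""
    return np.minimum(np.maximum(z, lower), upper)

def spg_direction(x, grad, alpha, lower, upper):
    """SPG search direction d = P(x - alpha * grad) - x, cf. (2)."""
    return project_box(x - alpha * grad, lower, upper) - x
```

For a feasible x and any alpha > 0, the resulting d is a descent direction unless x is already stationary.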
This state of affairs motivated us to use SPG within the augmented Lagrangian framework for solving

    Minimize f(x) subject to h(x) = 0, x ∈ Ω,    (3)

where f: IR^n → IR and h: IR^n → IR^m have continuous first derivatives and the set Ω is given by

    Ω = {x ∈ IR^n | g_i(x) ≤ 0, i = 1, …, p},

where the functions g_i are continuously differentiable and convex. At each outer iteration of the augmented Lagrangian scheme for solving (3) one finds an approximate solution of

    Minimize_x L(x, λ, ρ) subject to x ∈ Ω,    (4)

where

    L(x, λ, ρ) = f(x) + ⟨h(x), λ⟩ + (ρ/2) ‖h(x)‖²    (5)

is the augmented Lagrangian function, λ ∈ IR^m is an estimate of the vector of Lagrange multipliers, ρ > 0 is the penalty parameter, ⟨·,·⟩ is the Euclidean inner product and ‖·‖ is the Euclidean norm (see Refs. 13, 14, 15, 16).
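The function (5) minimized at each outer iteration is cheap to assemble from f and h. A small illustrative Python sketch (names ours):

```python
import numpy as np

def aug_lagrangian(f, h, x, lam, rho):
    """Augmented Lagrangian L(x, lam, rho) = f(x) + <h(x), lam> + (rho/2)||h(x)||^2, cf. (5)."""
    hx = h(x)
    return f(x) + np.dot(hx, lam) + 0.5 * rho * np.dot(hx, hx)
```

For example, with f(x) = x², h(x) = x − 1, x = 2, λ = 3 and ρ = 10 one gets L = 4 + 3 + 5 = 12.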


In the present research our proposal is to solve (4) using SPG. We want to assess the performance of this combination using two families of problems: a representative set of large-scale nonlinear programming problems from the CUTE collection (Ref. 17) and location problems in the minimax formulation.

This paper is organized as follows. In Section 2 we describe our SPG implementation. In Section 3 we present the augmented Lagrangian algorithm, and in Section 4 its global convergence properties are analysed. The numerical experiments are described and discussed in Section 5, where results on problems from the CUTE collection are compared with LANCELOT (Ref. 15), and location problems are also solved by FFSQP (Ref. 18). Conclusions and perspectives are presented in Section 6.
2. The SPG method

In Ref. 1 two spectral gradient methods are presented, called SPG1 and SPG2 respectively. The performance of SPG2 turned out to be better than that of SPG1, so SPG2 is the algorithm used in our research and it is called SPG here. When applied to (1), SPG generates a sequence of feasible points w^k ∈ Ω, k = 0, 1, 2, …. We denote by {w^k} the sequence generated by SPG to avoid conflict with the one generated by the algorithm ALSPG described in the next section. As in the Introduction, P denotes the orthogonal projection operator on Ω. The algorithm requires the following initial choices and parameters: w⁰ ∈ Ω, the initial approximation; M > 0, an integer parameter used in the nonmonotone line search; γ ∈ (0, 1), the sufficient decrease parameter; ε > 0, the stopping tolerance; nfmax > 0, the maximum number of functional evaluations; and 0 < α_min < α_max < ∞, the safeguarding parameters.


Algorithm SPG

Step 1: Initialization of the spectral scaling parameter:
    Define α₀ = 1 or choose a positive value α₀ ∈ [α_min, α_max] using some heuristic;
    set k = 0 and feval = 0.

Step 2: Test the stopping criterion:
    If ‖P(w^k − ∇F(w^k)) − w^k‖∞ < ε, stop and set w^k as the solution to problem (1).

Step 3: Compute the new direction:
    Compute d^k = P(w^k − α_k ∇F(w^k)) − w^k.

Step 4: Update the maximum value of F among the last M iterations:
    Define f̄_k = max{F(w^k), F(w^{k−1}), …, F(w^j)}, where j = max{0, k − M + 1}.

Step 5: Test the sufficient descent condition:
    Set t = 1.
    Compute f_new = F(w^k + d^k) and feval = feval + 1.
    While (f_new > f̄_k + t γ ⟨∇F(w^k), d^k⟩) and (feval < nfmax),
        compute t_new ∈ [0.1 t, 0.9 t] and set t = t_new;
        compute f_new = F(w^k + t d^k) and feval = feval + 1.

Step 6: Test the stopping criterion:
    If (feval ≥ nfmax) and (f_new > f̄_k + t γ ⟨∇F(w^k), d^k⟩), stop with a failure message.

Step 7: Compute the new approximation:
    Define w^{k+1} = w^k + t d^k.

Step 8: Update the spectral scaling parameter:
    Compute
        s^k = w^{k+1} − w^k,
        y^k = ∇F(w^{k+1}) − ∇F(w^k).
    If ⟨s^k, y^k⟩ ≤ 0, define α_{k+1} = α_max.
    Otherwise, define

        α_{k+1} = max{ min{ ⟨s^k, s^k⟩ / ⟨s^k, y^k⟩, α_max }, α_min }.    (6)

Step 9: Prepare the new iteration:
    Set k = k + 1 and return to Step 2.
When α_{k+1} is not safeguarded (so that α_{k+1} ∈ (α_min, α_max)) we have that

    1/α_{k+1} = ⟨s^k, y^k⟩ / ⟨s^k, s^k⟩ = ⟨s^k, [∫₀¹ ∇²F(w^k + t s^k) dt] s^k⟩ / ⟨s^k, s^k⟩.

So 1/α_{k+1} is a Rayleigh quotient relative to the average Hessian ∫₀¹ ∇²F(w^k + t s^k) dt and, in consequence, lies between the smallest and the largest eigenvalues of this matrix. Thus, (1/α_{k+1}) I can be considered the matrix of the form σ I that best approximates the average Hessian. Therefore, one should develop heuristics for choosing α₀, since, in general, the information needed for such an estimate is not available at the initial point. Our particular choice will be detailed in Section 5.

There are two reasons for stopping in algorithm SPG: the maximum number of functional evaluations has been reached, or ‖P(w^k − ∇F(w^k)) − w^k‖∞ < ε. The first criterion is a safeguard against a too-small choice of the tolerance ε. The second criterion indicates sufficient stationarity of the current approximation w^k, since P(w^k − ∇F(w^k)) − w^k is the continuous projected gradient of F.
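The steps of algorithm SPG can be sketched compactly as follows (an illustrative Python sketch, not the authors' Fortran 77 code; for simplicity the backtracking halves t, which is one admissible choice of t_new ∈ [0.1 t, 0.9 t]):

```python
import numpy as np

def spg(F, gradF, project, w, M=10, gamma=1e-4, eps=1e-6,
        nfmax=10000, alpha_min=1e-30, alpha_max=1e30, alpha0=1.0):
    """Sketch of algorithm SPG: nonmonotone line search and spectral update (6)."""
    alpha, feval = alpha0, 0
    last_F = [F(w)]
    while np.max(np.abs(project(w - gradF(w)) - w)) >= eps:   # Step 2
        d = project(w - alpha * gradF(w)) - w                 # Step 3
        fbar = max(last_F[-M:])                               # Step 4
        g_d = float(np.dot(gradF(w), d))
        t = 1.0
        fnew = F(w + d); feval += 1
        while fnew > fbar + t * gamma * g_d and feval < nfmax:  # Step 5
            t *= 0.5                                          # t_new in [0.1t, 0.9t]
            fnew = F(w + t * d); feval += 1
        if feval >= nfmax and fnew > fbar + t * gamma * g_d:  # Step 6
            raise RuntimeError("SPG: functional evaluation budget exhausted")
        w_new = w + t * d                                     # Step 7
        s, y = w_new - w, gradF(w_new) - gradF(w)             # Step 8
        sy = float(np.dot(s, y))
        alpha = alpha_max if sy <= 0 else max(min(float(np.dot(s, s)) / sy, alpha_max), alpha_min)
        w = w_new
        last_F.append(fnew)
    return w
```

On a strictly convex quadratic over a box this sketch reproduces the expected behavior: full steps are accepted and the iterate converges to the projected minimizer.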

3. The Augmented Lagrangian Algorithm

In this section we address problem (3) by the augmented Lagrangian approach. The algorithm requires the following initial choices to be given: x⁰ ∈ Ω, the initial approximation to the solution of (3); L̄ > 0, a bound for the norm of the multipliers; λ⁰ ∈ IR^m such that ‖λ⁰‖∞ ≤ L̄, the initial value for the multipliers; ρ₀ > 0, the initial penalty parameter; ε₀ > 0, the initial tolerance for optimality; ε̄ > 0, the final tolerance for optimality; δ₀ > 0, the initial tolerance for feasibility; δ̄ > 0, the final tolerance for feasibility; nfmax > 0, the maximum number of functional evaluations.

Algorithm ALSPG

Step 1: Set k = 0.

Step 2: Address the subproblem:
    Solve the subproblem Minimize_x F(x) ≡ L(x, λ^k, ρ_k) using algorithm SPG, with w⁰ = x^k as initial estimate and tolerance ε_k. Let w be the solution obtained.

Step 3: Test stopping criteria:
    If SPG stopped by reaching the maximum number of functional evaluations, stop with a failure message; otherwise define x^{k+1} = w.
    If ‖h(x^{k+1})‖∞ ≤ δ̄ and ε_k ≤ ε̄, stop with a message of success and set x^{k+1} as the solution to problem (3).

Step 4: Update the multipliers:
    Define

        λ^{k+1} = λ^k + ρ_k h(x^{k+1}).    (7)

    If ‖λ^{k+1}‖∞ > L̄, define λ^{k+1} = λ^k.

Step 5: Update the penalty parameter:
    If

        ‖h(x^{k+1})‖∞ ≤ 0.1 ‖h(x^k)‖∞    (8)

    or

        ‖h(x^{k+1})‖∞ ≤ δ_k    (9)

    define

        ρ_{k+1} = ρ_k.    (10)

    Otherwise, define

        ρ_{k+1} = 10 ρ_k.    (11)

Step 6: Update the tolerances ε_k and δ_k:
    Adopt some heuristic strategy for updating ε_k and δ_k such that ε̄ ≤ ε_{k+1} < ε_k and δ̄ ≤ δ_{k+1} < δ_k;
    set k = k + 1 and return to Step 2.

Further details on the parameter choices and the heuristic strategies for updating the sequences {ε_k} and {δ_k} will be given in the description of the numerical experiments.
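The outer loop of ALSPG can be sketched as follows (illustrative Python; `spg_solve` stands for any inner solver that approximately minimizes the augmented Lagrangian with tolerance eps_k, and the halving rule used for the tolerances in Step 6 is merely one possible heuristic — the paper's actual rules are described in Section 5):

```python
import numpy as np

def alspg(f, h, spg_solve, x, lam, rho=10.0, eps0=1e-2, eps_final=1e-4,
          delta0=0.1, delta_final=1e-4, lam_bound=1e6, max_outer=50):
    """Sketch of algorithm ALSPG: safeguarded first-order multiplier update (7)
    and penalty update (8)-(11)."""
    eps_k, delta_k = eps0, delta0
    h_old = np.inf
    for _ in range(max_outer):
        # Step 2: approximately minimize L(., lam, rho) with tolerance eps_k.
        x = spg_solve(lambda z: f(z) + np.dot(h(z), lam)
                      + 0.5 * rho * np.dot(h(z), h(z)), x, eps_k)
        h_norm = float(np.max(np.abs(h(x))))
        # Step 3: stopping test with the final tolerances.
        if h_norm <= delta_final and eps_k <= eps_final:
            return x, lam
        # Step 4: first-order multiplier update, safeguarded by the bound.
        lam_new = lam + rho * h(x)
        lam = lam if float(np.max(np.abs(lam_new))) > lam_bound else lam_new
        # Step 5: increase the penalty unless feasibility improved enough.
        if not (h_norm <= 0.1 * h_old or h_norm <= delta_k):
            rho *= 10.0
        h_old = h_norm
        # Step 6: tighten the tolerances (one possible heuristic).
        eps_k = max(0.5 * eps_k, eps_final)
        delta_k = max(0.5 * delta_k, delta_final)
    return x, lam
```

On the toy problem min x² subject to x = 1, the sketch drives x to 1 and the multiplier estimate to the exact value −2.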

4. Global Convergence of Algorithm ALSPG

In this section we prove two theoretical results concerning the global convergence of algorithm ALSPG, under regularity assumptions on the feasible set. These results actually do not rely on the SPG inner solver, but depend solely on the approximate minimization criterion

    ‖P(x^k − ∇_x L(x^k, λ^k, ρ_k)) − x^k‖ ≤ ε_k    (12)

and on the updates of Steps 4 and 5 of algorithm ALSPG.

The convergence proof of the method is given in Theorems 4.1 and 4.2. The first is related to feasibility and the second to optimality. Theorem 4.1 says that the algorithm converges to a point that is stationary with respect to the squared norm of the infeasibility. Of course, one would like to prove that the limit point is feasible, but this is impossible under the given hypotheses, since the feasible set could even be empty. In Theorem 4.1 we assume the regularity of all the points of Ω. This condition is satisfied when Ω is a simple set, like a box.
Theorem 4.1. Assume that x* is a limit point of a sequence generated by algorithm ALSPG and that {∇g_i(x) | g_i(x) = 0} is linearly independent for all x ∈ Ω. Then, x* is a first-order stationary point of the problem

    Minimize ‖h(x)‖² subject to x ∈ Ω.    (13)

Proof. Let K₁ be an infinite subset of IN such that lim_{k∈K₁} x^k = x*. Since x^k ∈ Ω for k = 1, 2, … and Ω is closed, x* ∈ Ω.

By the way the ρ_k sequence is generated, either the penalty parameters ρ_k take on a common value for k larger than some k̄, or the sequence ρ_k tends to infinity. Suppose, first, that there exists k̄ ∈ IN such that ρ_k = ρ_{k̄} for all k ≥ k̄. By (8) and (9) this implies that lim_{k∈K₁} h(x^k) = 0. Therefore, h(x*) = 0 and the thesis is true.
Now, suppose that lim_{k→∞} ρ_k = ∞. By (12) we have

    lim_{k∈K₁} ‖ P( x^k − ∇f(x^k) − Σ_{i=1}^m λ_i^k ∇h_i(x^k) − ρ_k h′(x^k)^T h(x^k) ) − x^k ‖ = 0.

Without loss of generality, assume that q ≤ p,

    g_i(x*) = 0 for i = 1, …, q,
    g_i(x*) < 0 for i = q + 1, …, p,

and that there exists k₁ ∈ K₁ such that

    g_i(x^k) < 0, i = q + 1, …, p, for all k ∈ K₁, k ≥ k₁.

Define

    y^k = x^k − ∇f(x^k) − Σ_{i=1}^m λ_i^k ∇h_i(x^k) − ρ_k h′(x^k)^T h(x^k).

Then P(y^k) is the solution of the convex problem

    Minimize ‖y − y^k‖ subject to y ∈ Ω.    (14)

Since lim_{k∈K₁} x^k = x* and lim_{k∈K₁} ‖P(y^k) − x^k‖ = 0, we have that lim_{k∈K₁} ‖P(y^k) − x*‖ = 0. Therefore, without loss of generality, we can assume that

    lim_{k∈K₁} g_i(P(y^k)) = 0 for i = 1, …, q, and
    g_i(P(y^k)) < 0 for i = q + 1, …, p, for all k ∈ K₁, k ≥ k₁.

Thus, by the KKT conditions of (14), we have that, if P(y^k) = u^k, then

    2(u^k − y^k) + Σ_{i=1}^q μ_i^k ∇g_i(u^k) = 0    (15)

for all k ∈ K₁, k ≥ k₁, and for some μ_i^k ≥ 0, i = 1, 2, …, q.

Dividing (15) by ρ_k, and since {x^k}, {P(y^k)} and {x^k − ∇f(x^k) − Σ_{i=1}^m λ_i^k ∇h_i(x^k)} are bounded for k ∈ K₁, we obtain

    lim_{k∈K₁} [ 2 h′(x^k)^T h(x^k) + Σ_{i=1}^q (μ_i^k / ρ_k) ∇g_i(u^k) ] = 0.

Therefore,

    lim_{k∈K₁} [ ∇(‖h(x^k)‖²) + Σ_{i=1}^q (μ_i^k / ρ_k) ∇g_i(u^k) ] = 0.    (16)

Let us call β_i^k = μ_i^k / ρ_k and β_k = max{β_1^k, …, β_q^k}. If lim_{k∈K₁} β_k = 0, we have that ∇(‖h(x*)‖²) = 0 and we are done. Otherwise, assume without loss of generality that β_k ≥ β > 0 for all k ∈ K₁. Dividing (16) by β_k we get

    lim_{k∈K₁} [ (1/β_k) ∇(‖h(x^k)‖²) + Σ_{i=1}^q (β_i^k / β_k) ∇g_i(u^k) ] = 0.    (17)

If β_k → ∞ this contradicts the linear independence of {∇g_i(x*)}_{i=1}^q. Therefore, there exists β̄ > 0 such that β_k ≤ β̄ for all k ∈ K₁, after possibly relabelling. Then, taking convergent subsequences of the β_i^k and taking limits in (17), we obtain

    ∇(‖h(x*)‖²) + Σ_{i=1}^q β_i* ∇g_i(x*) = 0

for some β_i* ≥ 0, i = 1, 2, …, q. This implies that x* is a stationary point of (13). □

The proof of Theorem 4.1 depends strongly on the boundedness of the multiplier estimates λ^k. This property is not automatically satisfied by (7), which represents a first-order update of the Lagrange multipliers in nonlinear programming terminology. Higher-order (more accurate) updates are possible, but they are computationally more expensive. Since our main interest is in large-scale problems, we adopted the first-order update in our algorithm. However, formula (7) must be modified to avoid unboundedness of λ^k. In our implementation, when ‖λ^{k+1}‖∞ (given by (7)) is larger than L̄, we set λ^{k+1} = λ^k as a safeguard strategy.

To complete the global convergence analysis we prove Theorem 4.2 below. Since we already know that limit points of algorithm ALSPG are stationary points of ‖h(x)‖², we can classify them into two families. The first is the class of infeasible stationary points of ‖h(x)‖² on Ω, for which there is nothing to do except to regret their existence. Second, we have the class of feasible limit points. In Theorem 4.2 we prove that if a feasible limit point is regular (the gradients of the active constraints are linearly independent), then it is a stationary point of the nonlinear programming problem (3).
Theorem 4.2. Assume that x* is a limit point of the sequence generated by algorithm ALSPG. Suppose that h(x*) = 0 and that x* is regular (the gradients of the active constraints, including those of the set Ω, are linearly independent). Then, x* is a stationary point of (3).

Proof. Assume, without loss of generality, that

    g_i(x*) = 0, i = 1, …, q,    (18)
    g_i(x*) < 0, i = q + 1, …, p.    (19)

By the regularity hypothesis, the columns of A ∈ IR^{n×(m+q)} are linearly independent, where

    A = (∇h_1(x*), …, ∇h_m(x*), ∇g_1(x*), …, ∇g_q(x*)).    (20)

Let K₁ be an infinite subset of IN such that lim_{k∈K₁} x^k = x*. By (19) and the regularity hypothesis, there exists k₁ ∈ IN such that

    g_i(x^k) < 0, i = q + 1, …, p,    (21)

and A_k ∈ IR^{n×(m+q)} is full-rank for all k ∈ K₁, k ≥ k₁, where

    A_k = (∇h_1(x^k), …, ∇h_m(x^k), ∇g_1(x^k), …, ∇g_q(x^k)).    (22)

Moreover, the Moore-Penrose pseudo-inverse A_k† is such that

    lim_{k∈K₁} A_k† = A†.    (23)

For all k ∈ IN let us define

    λ̄^k = λ^k + ρ_k h(x^k).    (24)

By (12), we have that

    ‖ P( x^k − ∇f(x^k) − Σ_{i=1}^m λ̄_i^k ∇h_i(x^k) ) − x^k ‖ ≤ ε_k.    (25)

The rest of the proof consists in showing that {λ̄^k} is bounded, so that we can take limits in (25). For this purpose, define

    y^k = x^k − ∇f(x^k) − Σ_{i=1}^m λ̄_i^k ∇h_i(x^k)

and

    u^k = P(y^k).
Then, by the KKT conditions associated with the projection, we have that, for k ∈ K₁, k ≥ k₁,

    2(u^k − y^k) + Σ_{i=1}^q μ_i^k ∇g_i(u^k) + Σ_{i=q+1}^p μ_i^k ∇g_i(u^k) = 0,
    μ_i^k ≥ 0, i = 1, …, p,
    μ_i^k g_i(u^k) = 0, i = 1, …, p,    (26)
    g_i(u^k) ≤ 0, i = 1, …, p.

But g_i(x^k) < 0 for k ∈ K₁, k ≥ k₁, i = q + 1, …, p. So, since lim_{k∈K₁} (u^k − x^k) = 0, there exists k₂ ≥ k₁ such that

    g_i(u^k) < 0 for k ∈ K₁, k ≥ k₂, i = q + 1, …, p.

Thus

    μ_i^k = 0 for k ∈ K₁, k ≥ k₂, i = q + 1, …, p.

Then, by (26), if k ∈ K₁, k ≥ k₂, we have

    2(u^k − y^k) + Σ_{i=1}^q μ_i^k ∇g_i(u^k) = 0.

Therefore, for k ∈ K₁, k ≥ k₂,

    u^k − x^k + ∇f(x^k) + Σ_{i=1}^m λ̄_i^k ∇h_i(x^k) + Σ_{i=1}^q (μ_i^k / 2) ∇g_i(u^k) = 0.

So,

    u^k − x^k + ∇f(x^k) + Ã_k ( λ̄_1^k, …, λ̄_m^k, μ_1^k/2, …, μ_q^k/2 )^T = 0,    (27)

where Ã_k = ( ∇h_1(x^k), …, ∇h_m(x^k), ∇g_1(u^k), …, ∇g_q(u^k) ).

Since the matrix A_k is full-rank, so is Ã_k, for k ∈ K₁, k ≥ k₂. Therefore, premultiplying (27) by Ã_k† and taking limits, we obtain that λ̄_1^k, …, λ̄_m^k converge to some λ_1*, …, λ_m*. This allows us to take limits on both sides of (25), obtaining

    ‖ P( x* − ∇f(x*) − Σ_{i=1}^m λ_i* ∇h_i(x*) ) − x* ‖ = 0.

This means that x* solves the problem

    Minimize_u ‖ u − x* + ∇f(x*) + Σ_{i=1}^m λ_i* ∇h_i(x*) ‖² subject to u ∈ Ω.    (28)

Writing the KKT conditions of problem (28), we obtain the thesis. □


5. Numerical Experiments

5.1. Problems from the CUTE collection

In order to assess the performance of algorithm ALSPG, thirty-nine nonlinear equality-constrained problems with simple bounds on the variables from the CUTE collection (version May/98) were selected to constitute our first test set. These test problems were divided into small, medium and large ones, with the following features: 60% had both the number of variables (n) and the number of equality constraints (m) between 1 and 500; 30% had n ∈ (500, 3000] or m ∈ (500, 3000]; and 10% had n ∈ (3000, 5000] or m ∈ (3000, 5000]. As a benchmark, these problems were also solved by LANCELOT, an augmented Lagrangian algorithm implemented with a trust-region strategy for addressing the subproblems (see Ref. 15).

The tests were run in Fortran 77 (double precision, -O compiler option) on a Sparc Station Sun Ultra 1. An interface for running ALSPG with the CUTE collection was prepared, assuming that the problems had already been preprocessed, so that nonlinear inequality constraints were turned into equalities by means of additional nonnegative slack variables. Therefore, for choosing our set of test problems, we decided to select problems originally of the form (3).
The initial approximation x⁰ was the default of the CUTE set. The ad hoc algorithmic choices for these tests were: λ⁰ = 0 ∈ IR^m, ρ₀ = 10, ε̄ = 10⁻⁴, δ̄ = 10⁻⁴, δ₀ = 0.1, M = 50, γ = 10⁻⁴, α_min = 10⁻³⁰, α_max = 10³⁰.
The initial tolerance ε₀ was chosen heuristically as a function of the initial approximation, through ‖d⁰‖ = ‖P(x⁰ − ∇_x L(x⁰, λ⁰, ρ₀)) − x⁰‖, and of the final tolerance ε̄, as follows: given a constant p₁ ∈ (0, 1), we set ν₀ = log₁₀ ‖d⁰‖, ν̄ = log₁₀ ε̄, ν = ν₀ − p₁ (ν₀ − ν̄), and defined ε₀ = 10^ν.

The updates of both {ε_k} and {δ_k} were based on heuristics as well. The main idea behind the sequence {ε_k} was to tighten the tolerance ε_k gradually, to avoid oversolving the subproblems: given a constant p₂ ∈ (0, 1), we set ε_{k+1} = max{10^{−p₂} ε_k, ε̄}. In the implementation we used p₁ = 0.25 and p₂ = 0.2.
The sequence of feasibility tolerances {δ_k} was created inspired by the ideas of Conn, Gould and Toint in Ref. 14. It was updated in Step 6 according to δ_{k+1} = max{ δ_k / ρ_{k+1}, δ̄ } if (10) holds, or δ_{k+1} = max{ δ₀ / ρ_{k+1}, δ̄ } if (11) occurs.

The initial spectral step α₀ was set by means of an auxiliary initial computation, as follows: given x⁰, λ⁰, ρ₀, we computed d⁰ = P(x⁰ − ∇_x L(x⁰, λ⁰, ρ₀)) − x⁰, x̂ = x⁰ + (0.5/‖d⁰‖∞) d⁰, ŝ = x̂ − x⁰, ŷ = ∇_x L(x̂, λ⁰, ρ₀) − ∇_x L(x⁰, λ⁰, ρ₀), and set α₀ = ⟨ŝ, ŝ⟩ / ⟨ŝ, ŷ⟩.
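This initialization costs one extra gradient evaluation at an auxiliary point along the projected-gradient direction. A sketch (illustrative Python; `grad_L` stands for the user-supplied gradient of (5), and the ∞-norm scaling of d⁰ is assumed):

```python
import numpy as np

def initial_spectral_step(grad_L, project, x0, alpha_max=1e30):
    """Heuristic alpha_0 = <s,s>/<s,y> along the projected gradient, as in Section 5."""
    d0 = project(x0 - grad_L(x0)) - x0
    xhat = x0 + (0.5 / np.max(np.abs(d0))) * d0
    s = xhat - x0
    y = grad_L(xhat) - grad_L(x0)
    sy = float(np.dot(s, y))
    return alpha_max if sy <= 0 else float(np.dot(s, s)) / sy
```

For a quadratic with Hessian c·I this returns exactly 1/c, the ideal spectral scaling.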
To exploit the robustness of LANCELOT to its full extent, we followed the authors' suggestions (Ref. 15) and our previous experience (Ref. 19) to define parameters and settings compatible with the ALSPG choices:

    exact-second-derivatives-used
    bandsolver-preconditioned-cg-solver-used 5
    exact-Cauchy-point-required
    infinity-norm-trust-region-used
    gradient-accuracy-required 1.0D-4
    constraint-accuracy-required 1.0D-4
    trust-region-radius 1.0D+0
    maximum-number-of-iterations 1000000

Results are shown in Table 1, where the problems are presented in alphabetical order. For each problem, the figures in the first row correspond to ALSPG and those in the second row to LANCELOT. We keep the notation n, m for the number of variables and equality constraints, respectively. The number of outer iterations of the corresponding augmented Lagrangian algorithm is denoted by out; in is the number of inner iterations; feval is the number of functional evaluations; and time is the CPU time in seconds.

Problem      n     m    out       in    feval         fobj        time
ALJAZZAF                  7       98      228   0.7501D+02  0.5010D-02
                          6       22       22   0.7500D+02  0.6000D-01
ARGAUSS           15      6        7       24   0.0000D+00  0.3497D-02
                          1        1        1   0.1179D-07  0.1000D-03
AVION2      49    15      2   675585  1000001   0.1245D+08  0.1547D+03
                          9    14013    14013   0.9468D+08  0.5850D+02
BIGBANK   2230  1112      1   787722  1000002   0.3129D+07  0.1164D+05
                          4       49       49   0.4206D+07  0.1978D+04
BRITGAS    450   360      6   119437   155183   0.0000D+00  0.6118D+03
                          6       99       99   0.0000D+00  0.1494D+02

Table 1: Complete comparative results: ALSPG × LANCELOT.

Problem      n     m    out       in    feval          fobj        time
BROYDN3D  5000  5000     15      504      589    0.0000D+00  0.2155D+02
                          1        5        5    0.3096D-16  0.7600D+00
BROYDNBD  5000  5000      1       51       60    0.0000D+00  0.3930D+01
                          1       31       31    0.1371D-08  0.1224D+02
BT4                       6       70      111    0.4551D+02  0.3706D-02
                          4       22       22    0.4551D+02  0.6000D-01
CATENARY   501   166      8   172723   224903    0.3484D+06  0.3779D+03
                          7      797      797    0.3482D+06  0.3058D+02
CLUSTER                   2       17       19    0.0000D+00  0.2415D-02
                          1        9        9    0.2818D-06  0.1000D-01
DALLASL    906   667      6   410894   539057    0.2026D+06  0.8579D+04
                          6       81       81    0.2026D+06  0.1372D+03
DITTERT    105    70      6      601      778    0.1985D+01  0.4837D+00
                          9      130      130   0.2000D+121  0.2470D+01
DIXCHLNV    10            7      218      252    0.7098D-10  0.4114D-01
                          3       12       12    0.1199D-07  0.5000D-01
DNIEPER     61    24      7   397974   527677    0.1874D+05  0.1185D+03
                          6       68       68    0.1874D+05  0.3800D+00
DTOC5     1999   999      7   603180   794567    0.1536D+01  0.6670D+04
                          9       21       21    0.1515D+01  0.2580D+01
GILBERT   1000            8       41       99    0.4820D+03  0.2873D+00
                          5       27       27    0.4821D+03  0.5500D+01
HAGER1    1001   500      6   761763  1000001    0.1133D+01  0.3536D+04
                          2       10       10    0.8808D+00  0.8300D+00
HEART6                    2   152959   570414    0.0000D+00  0.1891D+02
                          1     1751     1751    0.5663D-05  0.2530D+01
HS111       10            6     2618     3794    0.4776D+02  0.1100D+01
                          4       45       45    0.4776D+02  0.1600D+00
HS41                      6       23       54    0.1926D+01  0.2403D-02
                          4        6        6    0.1926D+01  0.3000D-01
HUESTIS   1000           11     1045     1083    0.3482D+11  0.4055D+01
                         11      151      151    0.3482D+11  0.1128D+03
LCH        600            6     1912     2430    0.4288D+01  0.6507D+01
                          5       36       36    0.4318D+01  0.2570D+01
LEAKNET    156   153      5   761134  1000001    0.7523D+01  0.1232D+04
                         11      117      117    0.7961D+01  0.6360D+01
LINSPANH    97    33      6      140      175    0.7700D+02  0.4224D-01
                          6        9        9    0.7700D+02  0.1700D+00
LOTSCHD     12            6      890     1119    0.2398D+04  0.6871D-01
                          5       21       21    0.2398D+04  0.1000D+00
METHANB8    31    31      2   749366  1000001    0.0000D+00  0.4659D+03
                          1       36       36    0.9367D-05  0.8200D+00
MINC44    1113  1032      6      109      130    0.3158D-03  0.1884D+01
                         10       19       19    0.3173D-03  0.5560D+01
MINPERM    583   520      7      152      181    0.9366D-03  0.1133D+01
                          9       93       93    0.9446D-03  0.1761D+02

Table 1 (cont.): Complete comparative results: ALSPG × LANCELOT.

Problem      n     m    out       in    feval          fobj        time
NCVXQP1    100    50      6     1403     1504    0.7298D+06  0.8356D+00
                          7       43       43    0.7298D+06  0.4200D+00
OPTCNTRL    32    20      6     1310     1469    0.5500D+03  0.1779D+00
                          6       21       21    0.5500D+03  0.1100D+00
ORTHRDM2  4003  2000      5   606440  1000001    0.1570D+03  0.2568D+05
                          4      129      129    0.1555D+03  0.4318D+02
ORTHREGC  1005   500      6   490278   701241    0.1879D+02  0.5227D+04
                          5       43       43    0.1879D+02  0.3580D+01
READING1   202   100      4   772486  1000001    0.1490D+00  0.1325D+04
                          4      741      741    0.1605D+00  0.2621D+02
SPANHYD     97    33      6    29959    41464    0.2397D+03  0.1059D+02
                          4       27       27    0.2397D+03  0.5600D+00
STEENBRC   540   126      4   872300  1000001    0.3678D+05  0.1624D+04
                          8     6446     6446    0.2750D+05  0.1100D+03
TENBARS2    18            3   731412  1000002    0.2318D+04  0.7155D+02
                          4      346      346    0.2278D+04  0.1590D+01
TRAINF    4008  2002      6   760251  1000001    0.2191D+01  0.1752D+05
                          9       58       58    0.2758D+01  0.2103D+03
TRIGGER                   3   999986  1000001    0.0000D+00  0.4484D+02
                          1       20       20    0.1025D-06  0.3000D-01
YORKNET    312   256      6   210592  1000001    0.1968D+05  0.8266D+03
                         12      236      236    0.1423D+05  0.7071D+02

Table 1 (cont.): Complete comparative results: ALSPG × LANCELOT.

Twenty-seven of the thirty-nine test problems were successfully solved by algorithm ALSPG. For the remaining twelve, the maximum allowed number of functional evaluations (1000000) was exceeded. For two of these twelve problems (ORTHRDM2 and READING1) the final approximation was nearly optimal (small constraint violation and objective function value close to the one obtained by LANCELOT). Thus, ALSPG had success in 74.4% of the tests (29 of 39). It is worth noticing that the proportion of failures of ALSPG followed quite closely the size distribution of the problems: 1 of the 4 large problems, 3 of the 12 medium ones and 6 of the 23 small ones could not be solved by ALSPG. LANCELOT performed as follows: thirty-three problems were successfully solved (84.6%), and for five problems it stopped with a too-small step (AVION2, DNIEPER, HEART6, TENBARS2 and YORKNET), which corresponds to 12.8% of the tests. For a single problem (DITTERT) LANCELOT did not converge, that is, no feasible solution could be found. Algorithm ALSPG was successful in solving three (DITTERT, DNIEPER and HEART6) of the six problems on which LANCELOT stopped prematurely.

Summarizing Table 1 by means of geometric means of the number of functional evaluations performed and of the CPU time spent, we obtained the figures of Table 2. The results of Table 2 allow us to estimate the average time of a single iteration of each algorithm: 0.0009 seconds for ALSPG and 0.037 seconds for LANCELOT. Algorithm ALSPG performs approximately 259.6 times the number of functional evaluations of LANCELOT, whereas it takes around 6.4 times the CPU time spent by the algorithm of Conn, Gould and Toint.
              feval    time
  ALSPG     14900.0   13.34
  LANCELOT   57.394   2.095

Table 2: Summary of comparative results (average values) for problems from CUTE.
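The ratios quoted above can be checked directly against the entries of Table 2:

```python
# Geometric-mean summaries from Table 2 (feval, CPU time in seconds).
alspg = {"feval": 14900.0, "time": 13.34}
lancelot = {"feval": 57.394, "time": 2.095}

feval_ratio = alspg["feval"] / lancelot["feval"]   # about 259.6
time_ratio = alspg["time"] / lancelot["time"]      # about 6.4
print(round(feval_ratio, 1), round(time_ratio, 1))
```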

5.2. The location problem

Figure 1: A possible optimal configuration of the location problem.


Given a family of polytopes K_i ⊂ IR², i = 1, …, npol, the optimal location problem consists of finding P ∈ IR² such that its maximum distance to the polytopes is minimal. In other words, P is a solution to

    min_{P, P_i ∈ K_i} max{ ‖P − P_1‖, ‖P − P_2‖, …, ‖P − P_npol‖ },    (29)

where ‖·‖ is the Euclidean norm. A possible optimal configuration with npol = 4 is illustrated in Figure 1.

Problem (29) may be rewritten in the format

    Minimize z subject to ‖P − P_i‖ ≤ z, P_i ∈ K_i, i = 1, …, npol.    (30)

Introducing nonnegative slack variables ξ_i, the inequality constraints ‖P − P_i‖ ≤ z are turned into the equalities ‖P − P_i‖ + ξ_i − z = 0, so that a problem of type (4) is built:

    Minimize z + Σ_{i=1}^{npol} λ_i (‖P − P_i‖ + ξ_i − z) + (ρ/2) Σ_{i=1}^{npol} (‖P − P_i‖ + ξ_i − z)²
    subject to ξ_i ≥ 0, P_i ∈ K_i, i = 1, …, npol.    (31)
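The equivalence between (29) and (30) is easy to check numerically for fixed points P_i ∈ K_i: the smallest feasible z in (30) is exactly the largest distance. An illustrative sketch (the sample points are ours, not the problems of Section 5.2):

```python
import numpy as np

def minimax_objective(P, points):
    """max_i ||P - P_i||: the quantity minimized in (29) once a point P_i in each K_i is fixed."""
    P = np.asarray(P, dtype=float)
    return max(float(np.linalg.norm(P - np.asarray(Pi, dtype=float))) for Pi in points)
```

For fixed P and P_i, the slack of constraint i in (31) is then ξ_i = z − ‖P − P_i‖ ≥ 0, which vanishes for the farthest polytope.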

In our numerical experiments, we generated a family of twenty medium-size problems of type (29) and compared the performance of FFSQP (Ref. 18) on them against that of algorithm ALSPG on the corresponding problems (30). Code FFSQP is a Fortran implementation of a feasible sequential quadratic programming algorithm for solving constrained nonlinear, possibly minimax, optimization problems. For FFSQP the tolerances were set so that its feasibility and optimality criteria were compatible and comparable with the ALSPG choices. The tests were run in Fortran 77 (double precision, -O compiler option) on a Sparc Station Sun Ultra 1.

It is worth noticing that, since the formulation considered by each algorithm is different, ALSPG solves problems with 3 npol + 3 variables (2 npol + 3 original variables and npol slack ones) and npol constraints, whereas FFSQP works with 2 npol + 2 variables and as many constraints as the number of vertices of the problem. The location problems were randomly generated; they were addressed with a different formulation in Ref. 20.

The initial approximation (P, P_1, …, P_npol) was obtained by projecting the origin onto an auxiliary centered polygon created during the generation of the problem (initial P). The initial values of the P_i were set as the projections of this P onto the polygons K_i, i = 1, …, npol. The variables ξ_i, i = 1, …, npol, and z were initially zero. The algorithmic parameter choices for these tests were mostly the same used for the CUTE set of problems, except for the safeguards α_min and α_max, the final optimality tolerance ε̄, and the tolerance update ε_{k+1} = max{0.1 ε_k, ε̄}. We also implemented a stopping test to detect lack of progress, as follows: we computed h̄_i = min{‖h(x¹)‖, …, ‖h(x^i)‖} and stopped whenever h̄_k failed to improve on h̄_{k−1} by more than a small relative-plus-absolute margin during fifty consecutive iterations.

Comparative results are presented in Table 3, where the column problem provides a number for future reference and the pair (number of polygons, number of vertices) of each problem; in the remaining columns, the first row corresponds to ALSPG and the second to FFSQP. We denote by iter the number of outer iterations of each algorithm; feval gives the number of function evaluations; fobj provides the final objective function value (problem (30)); and time gives the CPU time spent in seconds.

Problem          iter   feval      fobj     time
1 (70, 485)         5   18470   30.6309     7.38
                    8     560   30.6215     3.29
2 (77, 479)         4   10346   36.7179     5.08
                   13    1001   36.7179     7.52
3 (104, 709)       10   11013   37.8149     5.78
                    8     835   37.8034    10.75
4 (107, 652)       55   14197   92.1574     8.05
                    6     642   72.0387     9.99
5 (116, 717)        6   12446   49.3796     7.88
                    7     812   49.3786    13.82
6 (136, 1054)      17   13588   49.1609    11.36
                    8    1088   49.1322    21.92
7 (159, 3888)       4    3848   44.7871     8.86
                    9    1431   44.7868    86.73
8 (163, 1061)       4   13993   48.3872    13.50
                    8    1304   48.3805    35.57
9 (189, 3596)       4    3866   50.0040    10.18
                    8    1512   50.0039    92.18
10 (197, 1356)     54   14053   78.9317    14.54
                    6    1182   53.8598    50.13
11 (296, 1985)      4   18992   65.8694    30.84
                    8    2368   65.8666   237.55
12 (323, 4889)     52   13748   90.1371    42.80
                    9    2910   65.7716   436.13
13 (325, 2185)      4   10038   69.8736    25.47
                    8    2600   69.8730   295.66
14 (331, 2177)     52   18529   69.5391    31.33
                    7    2317   69.5289   291.84
15 (361, 2357)      4    8313   69.4936    20.29
                    6    2166   69.7373   336.02
16 (375, 2478)     52   20794   74.8264    40.70
                    6    2250   74.8021   374.63
17 (406, 2639)      5    9451   75.2198    25.79
                    6    2436   75.2184   468.38
18 (436, 2875)      4   10146   80.4443    30.64
                    8    3488   80.4420   786.28
19 (449, 2967)      5   10262   80.2297    31.48
                    7    3143   80.2291   706.32
20 (466, 3042)     54   21711   83.7485    54.43
                    6    2796   83.7463   695.60

Table 3: Comparative results: ALSPG × FFSQP.

For six of the twenty problems of Table 3, algorithm ALSPG stopped due to lack of progress (problems 4, 10, 12, 14, 16 and 20), which amounts to 70% successful exits for these tests. Algorithm FFSQP had two stops with a too-small step (problems 3 and 12), corresponding to 90% success. In all these stops, however, a nearly optimal iterate was achieved. For ALSPG, the values of ‖h(x^k)‖∞ at the final approximations of the six aforementioned problems were 3×10⁻², 2×10⁻², 8×10⁻², 3×10⁻², 4×10⁻² and 4×10⁻². For FFSQP, the norm of the gradient of the Lagrangian at the final iterate was 5×10⁻⁴ and 2×10⁻⁴ for problems 3 and 12, respectively.
The results of Table 3 are summarized in Table 4, as we did with Tables 1 and 2. For
this set of tests, the estimated average time of a single iteration for ea h algorithm is 0.001
and 0.06 se onds, for ALSPG and FFSQP, respe tively. As a result of the low ost of ALSPG,
although our approa h needs around eight times the number of fun tional evaluations
taken by FFSQP, algorithm FFSQP needs more than ve times as mu h CPU time as the
one spent by ALSPG.
           feval     time
ALSPG    11781.0    16.81
FFSQP     1608.1    92.99

Table 4: Summary of comparative results (average values) for location problems.
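The trade-off summarized in Table 4 can be checked with a few lines of arithmetic; the averages below are the ones quoted in the table.

```python
# Average values from Table 4 (feval = functional evaluations, time = CPU seconds).
alspg = {"feval": 11781.0, "time": 16.81}
ffsqp = {"feval": 1608.1, "time": 92.99}

# ALSPG needs roughly seven to eight times as many functional evaluations,
# while FFSQP spends more than five times as much CPU time.
feval_ratio = alspg["feval"] / ffsqp["feval"]
time_ratio = ffsqp["time"] / alspg["time"]
print(round(feval_ratio, 1), round(time_ratio, 1))  # 7.3 5.5
```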

In Table 5 we present the results of ALSPG on eight large-scale location problems, for
which FFSQP failed due to memory requirements. The number of variables and of nonlinear
equality constraints varied within n ∈ [1548, 2817] and m ∈ [515, 938], respectively. The
notation is similar to that of Table 3, except that the outer iterations are given in
column out and we also provide the number of inner iterations in column in.
Problem            out      in    feval       fobj     time
21 (515, 14159)     -     6763     9038    85.8409   100.90
22 (640, 4759)      -    12118    16774    92.7432    88.33
23 (646, 12924)    25    15651    21691   102.3918   189.79
24 (677, 5035)      -    11433    15838    96.3327    84.79
25 (734, 5442)      -     9733    13595   104.5577    88.46
26 (742, 5519)      -     5984     7951    99.7280    56.78
27 (801, 5955)      -    13107    18495   110.3467   111.92
28 (938, 6985)     23    13268    20006   119.0289   134.15

Table 5: Performance of ALSPG for large-scale location problems.


Analysing Table 5, we defined a measure for the efficiency of the SPG step, given by
the arithmetic mean of the values feval_i / in_i, i = 1, 2, ..., 8, which came to 1.4.
Ideally, if the sufficient descent condition is satisfied with t = 1 at step 5 of
algorithm SPG, a single functional evaluation is performed per inner iteration. For this
family of tests, fewer than two functional evaluations were necessary per inner iteration
on average, which indicates a good performance of the SPG step.
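As a sanity check, the 1.4 figure can be recomputed directly from the in and feval columns of Table 5:

```python
# Inner iterations (in) and functional evaluations (feval) for problems 21-28,
# as listed in Table 5.
inner = [6763, 12118, 15651, 11433, 9733, 5984, 13107, 13268]
feval = [9038, 16774, 21691, 15838, 13595, 7951, 18495, 20006]

# Arithmetic mean of feval_i / in_i, the efficiency measure of the SPG step.
ratios = [fe / it for fe, it in zip(feval, inner)]
mean_ratio = sum(ratios) / len(ratios)
print(round(mean_ratio, 1))  # 1.4
```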

6. Final Remarks

We have introduced an augmented Lagrangian algorithm (ALSPG) in which the spectral
projected gradient method is the tool for tackling the underlying subproblems. Our
motivation was the effectiveness of SPG for minimization with simple bounds, so we wanted
to assess its performance within the augmented Lagrangian framework. For the proposed
algorithm we proved global convergence results, in the sense that the generated limit
points are stationary provided they are regular and feasible with respect to the nonlinear
equality constraints.
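The overall scheme can be illustrated by a minimal sketch: a classical first-order multiplier update in the outer loop and, standing in for SPG, a plain projected-gradient inner solver with a fixed steplength (no spectral step and no nonmonotone line search). The toy problem, its bounds and every parameter value below are our own illustration, not taken from the paper.

```python
import numpy as np

# Toy problem (our own illustration):
# minimize (x0 - 1)^2 + (x1 - 2)^2  s.t.  x0 + x1 = 2,  0 <= x <= 3.
def gf(x):   # gradient of the objective
    return np.array([2.0 * (x[0] - 1.0), 2.0 * (x[1] - 2.0)])

def h(x):    # equality-constraint residual
    return np.array([x[0] + x[1] - 2.0])

Jh = np.array([[1.0, 1.0]])   # constant Jacobian of h

def project(x):
    return np.clip(x, 0.0, 3.0)   # projection onto the box

def inner_solve(x, grad, iters=1000, step=0.05):
    # Fixed-step projected gradient, standing in for SPG.
    for _ in range(iters):
        x = project(x - step * grad(x))
    return x

def alspg_sketch(x0, outer=20, rho=10.0):
    x, lam = x0, np.zeros(1)
    for _ in range(outer):
        # Gradient of L_rho(x, lam) = f(x) + lam^T h(x) + (rho/2) ||h(x)||^2.
        grad = lambda z, lam=lam: gf(z) + Jh.T @ (lam + rho * h(z))
        x = inner_solve(x, grad)
        lam = lam + rho * h(x)    # first-order multiplier update
    return x, lam

x, lam = alspg_sketch(np.array([0.0, 0.0]))
# x approaches (0.5, 1.5) and lam approaches 1.0
```

For this quadratic toy problem the multiplier iteration contracts geometrically, so a modest number of outer iterations reaches the solution; the full ALSPG algorithm of course uses the spectral steplength and the nonmonotone search in the inner solver.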
Two families of test problems were addressed. First, thirty-nine nonlinear equality-constrained problems with simple bounds on the variables from the CUTE collection were
solved and compared against LANCELOT. Both augmented Lagrangian algorithms were quite
robust on these problems: 74.4% were successfully solved by ALSPG and 84.6% by
LANCELOT.
The second set of tests consisted of twenty medium-size location problems in minimax
formulation. After being turned into the equivalent nonlinear programming format, with
auxiliary and slack variables, they were solved by ALSPG. For comparative purposes, they
were solved, in the original formulation, by the code FFSQP. Both strategies performed
quite well (70% of reported success for ALSPG and 90% for FFSQP). An additional family
of eight large-scale location problems was solved solely by ALSPG, since FFSQP could not
handle their large number of variables and/or constraints. For this large-scale set, we
observed the efficiency of the SPG inner step, with an average of 1.4 functional
evaluations per inner iteration.
Summing up, the results summarized in Tables 2 and 4 corroborate the low-cost features
of the ALSPG iterations. The first-order nature of algorithm ALSPG may require a large
number of functional evaluations. Its iterations, however, are very cheap. This
easy-to-code algorithm, with minimal memory requirements (12 n-vectors, 2 m-vectors
and a single 50-vector), available upon request from the authors, can be a worthwhile
alternative provided the problem's functional evaluations are not too expensive.
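For a rough sense of scale, the quoted workspace, evaluated at the largest location-problem sizes reported above and assuming double-precision storage (our assumption, not stated in the paper), amounts to well under a megabyte:

```python
# Workspace quoted above: 12 n-vectors, 2 m-vectors and one 50-vector.
# n and m are the largest sizes reported for the location problems;
# 8 bytes per entry assumes double precision (our assumption).
n, m = 2817, 938
entries = 12 * n + 2 * m + 50
kib = entries * 8 / 1024.0
print(entries, round(kib, 1))  # 35730 entries, about 279.1 KiB
```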
Acknowledgements: We are thankful to E. G. Birgin for making his polygon generator
available for our numerical tests with location problems, and to the anonymous referees
for their valuable comments and suggestions for improving our work. We also thank
A. R. Conn, N. I. M. Gould and Ph. L. Toint for the package LANCELOT, and J. L. Zhou,
A. L. Tits and C. T. Lawrence for the code FFSQP.

References

1. E. G. Birgin, J. M. Martínez and M. Raydan, Nonmonotone spectral projected gradient
methods on convex sets, SIAM Journal on Optimization, Vol. 10, pp. 1196-1211, 2000.
2. J. Barzilai and J. M. Borwein, Two-point step size gradient methods, IMA Journal of
Numerical Analysis, Vol. 8, pp. 141-148, 1988.
3. M. Raydan, On the Barzilai and Borwein choice of steplength for the gradient method,
IMA Journal of Numerical Analysis, Vol. 13, pp. 321-326, 1993.
4. M. Raydan, The Barzilai and Borwein gradient method for the large scale unconstrained
minimization problem, SIAM Journal on Optimization, Vol. 7, pp. 26-33, 1997.
5. L. Grippo, F. Lampariello and S. Lucidi, A nonmonotone line search technique for
Newton's method, SIAM Journal on Numerical Analysis, Vol. 23, pp. 707-716, 1986.
6. B. Molina, S. Petiton and M. Raydan, An assessment of the preconditioned gradient
method with retards for parallel computers, Computational and Applied Mathematics,
to appear.
7. F. Luengo, W. Glunt, T. L. Hayden and M. Raydan, Preconditioned spectral gradient
method, submitted to Numerical Algorithms.
8. E. G. Birgin, R. Biloti, M. Tygel and L. T. Santos, Restricted optimization: a clue to
a fast and accurate implementation of the common reflection surface method, Journal
of Applied Geophysics, Vol. 42, pp. 143-155, 1999.
9. E. G. Birgin, I. Chambouleyron and J. M. Martínez, Estimation of optical constants
of thin films using unconstrained optimization, Journal of Computational Physics,
Vol. 151, pp. 862-880, 1999.
10. E. G. Birgin and Y. G. Evtushenko, Automatic differentiation and spectral projected
gradient methods for optimal control problems, Optimization Methods and Software,
Vol. 10, pp. 125-146, 1998.
11. Z. Castillo, D. Cores and M. Raydan, Low cost optimization techniques for solving
the nonlinear seismic reflection tomography problem, Optimization and Engineering,
Vol. 1, pp. 155-169, 2000.
12. M. Mulato, I. Chambouleyron, E. G. Birgin and J. M. Martínez, Determination
of thickness and optical constants of a-Si:H films from transmittance data, Applied
Physics Letters, Vol. 77, pp. 2133-2135, 2000.
13. D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods, Academic Press, New York, 1982.
14. A. R. Conn, N. I. M. Gould and Ph. L. Toint, A globally convergent augmented
Lagrangian algorithm for optimization with general constraints and simple bounds,
SIAM Journal on Numerical Analysis, Vol. 28, pp. 545-572, 1991.
15. A. R. Conn, N. I. M. Gould and Ph. L. Toint, LANCELOT: a Fortran package
for large-scale nonlinear optimization (Release A), Springer Series in Computational
Mathematics 17, Springer-Verlag, New York, Berlin and Heidelberg, 1992.
16. A. Friedlander, J. M. Martínez and S. A. Santos, A new trust region algorithm for
bound constrained minimization, Applied Mathematics and Optimization, Vol. 30,
pp. 235-266, 1994.
17. I. Bongartz, A. R. Conn, N. I. M. Gould and Ph. L. Toint, CUTE: Constrained and
Unconstrained Testing Environment, ACM Transactions on Mathematical Software,
Vol. 21, pp. 123-160, 1995.
18. J. L. Zhou, A. L. Tits and C. T. Lawrence, User's Guide for FFSQP Version 3.7:
a Fortran code for solving optimization programs, possibly minimax, with general
inequality constraints and linear equality constraints, generating feasible iterates,
Technical Report SRC-TR-92-107r5, Institute for Systems Research, University of
Maryland, USA, 1997.
19. M. A. Diniz-Ehrhardt, Z. Dostál, M. A. G. Ruggiero, J. M. Martínez and S. A.
Santos, Nonmonotone strategy for minimization of quadratics with simple constraints,
Technical Report 10/99, Institute of Mathematics, University of Campinas, Brazil,
1999. Applications of Mathematics, to appear.
20. E. G. Birgin, J. M. Martínez and M. Raydan, SPG: software for convex-constrained
optimization, Technical Report, Institute of Mathematics, University of Campinas,
Brazil, 2000. ACM Transactions on Mathematical Software, to appear.
