Maria Mitradjieva-Daneva
Linköping 2007
Linköping Studies in Science and Technology. Dissertations, No. 1095
Typeset with LaTeX 2ε
There are many people who made this work possible.
First of all, my sincere thanks go to Professor Maud Göthe-Lundgren for all her encouragement and advice. My deepest thanks for her support during my time of difficulties.
Many thanks go to Clas Rydergren for all the help and support he has given me over the years. It has been a great pleasure to work with him.
I would like to thank Professor Per Olov Lindberg for giving me the opportunity to work within the Optimization group in Linköping. I have to thank him for his important role in my work, for his enthusiasm and his endless finickiness.
I also gratefully acknowledge the financial support from KFB (the Swedish Transport and Communications Research Board) and later Vinnova under the project "Mathematical Models for Complex Problems within the Road and Traffic Area".
Many heartfelt thanks to the girls in LiTH Doqtor, especially to Linnéa.
Last, but absolutely not least, I would like to express my deepest gratitude to my parents, my sister Rumi and my friends, simply for being there.
Thank you to my lovely sons, Petter, Stefan and Martin, who made hard times seem brighter with their cheerful laughter.
Sammanfattning
Abstract
This thesis concerns the development of novel feasible direction type algo-
rithms for constrained nonlinear optimization. The new algorithms are based
upon enhancements of the search direction determination and the line search
steps.
The new methods are applied to the single-class user traffic equilibrium prob-
lem, the multi-class user traffic equilibrium problem under social marginal
cost pricing, and the stochastic transportation problem. In a limited set
of computational tests the algorithms turn out to be quite efficient. Addi-
tionally, a feasible direction method with multi-dimensional search for the
stochastic transportation problem is developed.
Contents
Sammanfattning vii
Abstract ix
Contents xi
1 Introduction 3
2.1 Prerequisites 5
2.5 Convergence 15
Bibliography 23
1 Introduction 37
Bibliography 63
PAPER II: Multi-Class User Equilibria under Social Marginal Cost Pricing
1 Overview 71
4 A two-link example 74
Bibliography 79
1 Introduction 85
2 Conjugate directions 86
Bibliography 95
1 Introduction 101
2 The stochastic transportation problem 102
6 Conclusions 118
Bibliography 119
1 Introduction 125
3 Preliminaries 130
7.1 Termination criteria 149
8 Conclusions 155
Bibliography 157
PART I
1 Introduction
optimization problems [1]. One can show that the number of iterations that
an interior point algorithm needs in order to achieve a specified accuracy
is bounded by a polynomial function of the size of the problem. For more
details on interior-point methods, see [1, 4, 26]. Other important recent
developments are the increased emphasis on large-scale problems [6, 13, 33, 48, 67, 79], and algorithms that take advantage of problem structure as well as parallel computation [10, 56, 76].
2.1 Prerequisites
Accelerating the steepest descent method while avoiding the evaluation, storage and inversion of the Hessian matrix motivates the existence of quasi-Newton methods as well as conjugate direction methods. In conjugate direction methods (see e.g. [51, Ch. 8]) for unconstrained convex quadratic optimization, one performs line searches consecutively in a set of directions d₁, . . . , dₙ, mutually conjugate with respect to the Hessian ∇²f(x) of the objective (i.e. fulfilling dᵢᵀ∇²f(x)dⱼ = 0 for i ≠ j). In Rⁿ the optimum is then identified after n line searches [51, p. 241, Expanding Subspace Theorem].
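The finite termination property can be illustrated on a small quadratic. The following sketch (with made-up data Q and b, not taken from the text) builds two Q-conjugate directions by Gram-Schmidt and reaches the minimizer of f(x) = ½xᵀQx − bᵀx after exactly two exact line searches:

```python
# Conjugate direction method for f(x) = 0.5 x^T Q x - b^T x in R^2.
# Illustrative sketch only; Q and b are invented data.

def mat_vec(Q, x):
    return [Q[0][0]*x[0] + Q[0][1]*x[1], Q[1][0]*x[0] + Q[1][1]*x[1]]

def dot(u, v):
    return u[0]*v[0] + u[1]*v[1]

Q = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

# Build Q-conjugate directions from e1, e2 by Gram-Schmidt w.r.t. Q.
d1 = [1.0, 0.0]
e2 = [0.0, 1.0]
beta = dot(e2, mat_vec(Q, d1)) / dot(d1, mat_vec(Q, d1))
d2 = [e2[0] - beta*d1[0], e2[1] - beta*d1[1]]
assert abs(dot(d1, mat_vec(Q, d2))) < 1e-12   # conjugacy: d1^T Q d2 = 0

# Two exact line searches reach the minimizer (Expanding Subspace Theorem).
x = [0.0, 0.0]
for d in (d1, d2):
    Qx = mat_vec(Q, x)
    r = [b[0] - Qx[0], b[1] - Qx[1]]          # residual = -grad f(x)
    alpha = dot(r, d) / dot(d, mat_vec(Q, d)) # exact line search step
    x = [x[0] + alpha*d[0], x[1] + alpha*d[1]]

# x now solves Q x = b (up to rounding), i.e. minimizes f
```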
In 1962, Fletcher and Reeves [23] introduced the nonlinear conjugate gradient method, known as the Fletcher-Reeves (FR) method. This method is shown to be globally convergent when all the search directions are descent directions. The method can produce a poor search direction, in the sense that the search direction dₖ is almost orthogonal to −∇f(xₖ), which results in only a small improvement in the objective value. Therefore, whenever this happens, using a steepest descent direction is advisable. The search direction may fail to be a descent direction unless the step length satisfies the strong Wolfe conditions [60]. The FR method may take small steps and thus have poor numerical performance (see [10]).
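As a rough illustration of the method (including the steepest-descent restart mentioned above), here is a minimal Fletcher-Reeves loop on a simple convex test function. The Armijo backtracking used here is a simplification of the Wolfe-type line search the convergence theory requires, and the function and constants are invented for the sketch:

```python
def f(x):
    return x[0]**2 + 5.0*x[1]**2        # made-up convex test function

def grad(x):
    return [2.0*x[0], 10.0*x[1]]

def norm2(g):
    return g[0]*g[0] + g[1]*g[1]

x = [3.0, -2.0]
g = grad(x)
d = [-g[0], -g[1]]                      # start with steepest descent
for _ in range(100):
    slope = g[0]*d[0] + g[1]*d[1]
    if slope >= 0.0:                    # poor (non-descent) direction:
        d = [-g[0], -g[1]]              # restart with steepest descent
        slope = g[0]*d[0] + g[1]*d[1]
    alpha = 1.0                         # Armijo backtracking line search
    while f([x[0]+alpha*d[0], x[1]+alpha*d[1]]) > f(x) + 1e-4*alpha*slope:
        alpha *= 0.5
    x = [x[0]+alpha*d[0], x[1]+alpha*d[1]]
    g_new = grad(x)
    beta = norm2(g_new) / norm2(g)      # the Fletcher-Reeves formula
    d = [-g_new[0] + beta*d[0], -g_new[1] + beta*d[1]]
    g = g_new
    if norm2(g) < 1e-18:
        break
# x approaches the minimizer (0, 0)
```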
where

    f(xₖ) + ∇f(xₖ)ᵀd + ½ dᵀ∇²f(xₖ)d
Line search methods and trust region methods differ in the order in which
they choose the direction and the step length of the move to the next iterate.
Trust region methods first choose the maximum distance and then determine
the new direction. The choice of the trust region size is crucial and it is based
on the ratio between the actual and the predicted reduction of the objective
value. Their robustness and strong convergence characteristics have made trust region methods popular, especially for non-convex optimization [1, 4, 11, 57, 60].
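The reduction ratio that drives the trust-region update can be sketched as follows. The test function, and the 0.25/0.75 thresholds, are common illustrative choices rather than values taken from the text:

```python
# Trust-region radius update driven by the ratio of actual to predicted
# reduction, sketched in one dimension with invented data.

def f(x):
    return x**4 - 2.0*x**2 + x          # some smooth test function

def model_reduction(g, h, d):
    # predicted reduction of m(d) = f + g*d + 0.5*h*d^2 relative to d = 0
    return -(g*d + 0.5*h*d*d)

def update_radius(x, d, g, h, radius):
    actual = f(x) - f(x + d)
    predicted = model_reduction(g, h, d)
    rho = actual / predicted            # actual vs. predicted reduction
    if rho < 0.25:
        radius *= 0.25                  # poor model fit: shrink the region
    elif rho > 0.75 and abs(d) == radius:
        radius *= 2.0                   # good fit at the boundary: expand
    accept = rho > 0.0                  # accept the step only on real descent
    return accept, radius

# At x = 2: f'(2) = 25, f''(2) = 44; try the boundary step d = -0.5.
accept, radius = update_radius(2.0, -0.5, 25.0, 44.0, 0.5)
# the model fits well at the boundary: the step is accepted, the radius grows
```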
The Frank–Wolfe, FW, method [28] was originally suggested for quadratic
programming problems, but in the original paper it was noted that the
method could also be applied to linearly constrained convex programs.
The next step of the method is to perform a line search in the FW direction,
i.e. a one-dimensional minimization of f , along the line segment between
the current iterate xk and the point y k . The point where this minimum is
attained (at least approximately) is chosen as the next iterate, xₖ₊₁. Note that f(xₖ₊₁) is an upper bound on f∗.
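For a concrete (invented) instance, both FW steps take a simple form when f is quadratic and the feasible set is the box [0, 1]²: the LP subproblem separates by coordinate, and the line search has a closed form:

```python
# Frank-Wolfe sketch for f(x) = (x0 - 0.8)^2 + (x1 - 0.3)^2 over [0, 1]^2.
# Problem data are made up for illustration.

def grad(x):
    return [2.0*(x[0] - 0.8), 2.0*(x[1] - 0.3)]

x = [0.0, 0.0]
for _ in range(50):
    g = grad(x)
    # FW (LP) subproblem: minimize the linear function g^T y over the box,
    # which separates by coordinate.
    y = [0.0 if gi > 0.0 else 1.0 for gi in g]
    d = [y[0] - x[0], y[1] - x[1]]
    gTd = g[0]*d[0] + g[1]*d[1]
    if gTd >= -1e-12:                   # FW gap near zero: stop
        break
    # exact line search over [0, 1]: f(x + t*d) is quadratic in t
    t = min(1.0, -gTd / (2.0*(d[0]*d[0] + d[1]*d[1])))
    x = [x[0] + t*d[0], x[1] + t*d[1]]
# each f(x_{k+1}) is an upper bound on f*, and x approaches (0.8, 0.3)
```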
    min  f(xₖ + Σᵢ₌₀ᵏ⁺¹ λᵢ(yⁱ − xₖ))

    s.t. Σᵢ₌₀ᵏ⁺¹ λᵢ ≤ 1,                    (10)

         λᵢ ≥ 0,  i = 0, . . . , k + 1,
When the algorithm throws away every previously generated point, we are back to the Frank-Wolfe algorithm. The number of stored extreme points is crucial for the convergence properties, since if it is too small the behavior can be as bad as that of the Frank-Wolfe algorithm. We refer to [1] for further information about column dropping and simplicial decomposition.
Hearn et al. [37] extend the simplicial decomposition concept to the re-
stricted simplicial decomposition algorithm [38, 69], in which the number of
stored extreme points is bounded by a parameter r. Convergence to an opti-
mal solution is obtained provided that r is greater than the dimension of the
optimal face of the feasible set. Another extension of the simplicial decom-
position strategy, known as disaggregate simplicial decomposition, is made
by Larsson and Patriksson [45], who take advantage of Cartesian product
structures. The simplicial decomposition strategy has been applied mainly
to certain classes of structured linearly constrained convex programs, where
it has been shown to be successful.
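The bookkeeping behind restricting the number of stored extreme points can be sketched in a few lines. Dropping the column with the smallest master-problem weight is one common rule, and the data here are invented; the precise rules in [38, 69] may differ:

```python
# Column (extreme point) bookkeeping in a restricted simplicial
# decomposition scheme: at most r generated extreme points are stored.

def add_column(points, weights, new_point, r):
    if len(points) < r:
        points.append(new_point)
        weights.append(0.0)             # new column enters with zero weight
    else:
        # at capacity: drop the column with the smallest master weight
        i = min(range(len(weights)), key=lambda j: weights[j])
        points[i] = new_point
        weights[i] = 0.0
    return points, weights

# capacity r = 2 is reached, so (0, 1) (weight 0.3) is replaced by (1, 1)
pts, w = add_column([(1, 0), (0, 1)], [0.7, 0.3], (1, 1), 2)
```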
12 Introduction and Overview
The SLP approach originates from Griffith and Stewart [35]. Their method
is called the Method of Approximation Programming, and utilizes an LP
approximation of the type
effects are negligible. For problems which are highly nonlinear, SLP methods
may converge slowly and become unreliable. A variety of numerical methods have been proposed [7, 11, 12, 21, 24, 57, 61, 78] to improve the convergence
properties of SLP algorithms.
One of the milestones in the development of the SLP concept is the work
of Fletcher and Sainz de la Maza [24]. They describe an algorithm that
solves a linear program to identify an active set of constraints, followed
by the solution of an equality constrained quadratic problem (EQP). This
sequential linear programming-EQP (SLP-EQP) method is motivated by
the fact that solving quadratic subproblems with inequality constraints can
be expensive. The cost of solving one linear program followed by an equality
constrained quadratic problem would be much lower.
where ∇2xx L(xk , uk ) denotes the Hessian of the Lagrangian. The SQP algo-
rithm in this form is a local algorithm. If it starts at a point in the vicinity of a local minimum, it exhibits quadratic local convergence.
A line search or a trust region method is used to achieve global convergence
from a distant starting point. In the line search case the new iterate is
obtained by searching along the direction generated by solving (SQPSUB),
until a certain merit function is sufficiently decreased. A variety of merit
functions are described in e.g. [60, Chapter 15]. Another way to find the
next iterate is to use trust regions. SQP methods have proved to be efficient
in practice. They typically require fewer function evaluations than some of
the other methods. For an overview of SQP methods, see [5].
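The equality-constrained core of one SQP iteration amounts to solving a KKT system. A minimal sketch with invented problem data (H approximating the Hessian of the Lagrangian, A the linearized constraint Jacobian) follows:

```python
# One SQP direction step: solve the QP  min 0.5 d^T H d + g^T d  s.t. A d = c
# via its KKT system  [H A^T; A 0] [d; mu] = [-g; c].

def solve_linear(M, rhs):
    # Gaussian elimination with partial pivoting (tiny dense systems only).
    n = len(M)
    M = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

H = [[2.0, 0.0], [0.0, 2.0]]            # Lagrangian Hessian approximation
g = [-2.0, -4.0]                        # gradient of f at the current iterate
A = [[1.0, 1.0]]                        # linearized constraint Jacobian
c = [1.0]                               # linearized constraint right-hand side

KKT = [[H[0][0], H[0][1], A[0][0]],
       [H[1][0], H[1][1], A[0][1]],
       [A[0][0], A[0][1], 0.0]]
sol = solve_linear(KKT, [-g[0], -g[1], c[0]])
d = sol[:2]                             # SQP search direction; sol[2] = multiplier
```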
One of the important recent developments in SLP and SQP methods is the
introduction of the filter concept by Fletcher and Leyffer [20]. The main advantage of the filter concept is that it avoids the need for a merit function. The
filter allows a trial step to be accepted if it reduces either the objective
function or a constraint violation function. The filter is used in trust region
type algorithms as a criterion for accepting or rejecting a trial step. Global
convergence of an SLP-filter algorithm is shown in [12, 21] and the global
convergence properties of an SQP-filter algorithm are discussed in [19, 22,
68].
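The acceptance test of such a filter can be sketched in a few lines. The (f, h) pairs below are invented, and practical filters add small envelopes/margins that are omitted here:

```python
# A trial point (f_trial, h_trial) is acceptable to the filter if, against
# every stored pair (f_j, h_j), it improves either the objective f or the
# constraint violation h, i.e. it is not dominated by any filter entry.

def acceptable(filter_pairs, f_trial, h_trial):
    return all(f_trial < f_j or h_trial < h_j for f_j, h_j in filter_pairs)

filt = [(10.0, 0.5), (12.0, 0.1)]
ok1 = acceptable(filt, 9.0, 0.6)    # improves f against every entry
ok2 = acceptable(filt, 13.0, 0.6)   # dominated by (10.0, 0.5): rejected
```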
If a point (x∗ , u∗ ) solves (SP P ), then, according to the saddle point theorem
([4, p. 427]), x∗ is a local minimum to (N LP ). Furthermore, if the problem
(N LP ) is convex then x∗ is a global optimal solution to the problem (N LP )
(see [3]).
The saddle point theorem gives sufficient conditions for optimality. By intro-
ducing the Lagrangian function for the (N LP ) problem with slack variables
in the constraints, gᵢ(x) + sᵢ² = 0, i = 1, . . . , m, necessary conditions for a local optimum of a general constrained optimization problem can be estab-
lished. A point (x∗ , u∗ , s∗ ) is a stationary point to (SP P ) if it satisfies
∇L(x∗ , u∗ , s∗ ) = 0, and the Hessian with respect to x and s is positive
semidefinite. These requirements can be written as
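A standard form of these conditions, reconstructed under the assumption that the Lagrangian is L(x, u, s) = f(x) + Σᵢ uᵢ(gᵢ(x) + sᵢ²), with the second line giving the familiar complementarity condition, is:

```latex
\begin{aligned}
\nabla_x L(x^*, u^*, s^*) &= \nabla f(x^*) + \sum_{i=1}^{m} u_i^* \nabla g_i(x^*) = 0, \\
\nabla_{s_i} L(x^*, u^*, s^*) &= 2\, u_i^* s_i^* = 0, \qquad i = 1, \dots, m, \\
g_i(x^*) + (s_i^*)^2 &= 0, \qquad i = 1, \dots, m.
\end{aligned}
```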
The methods that solve (N LP ) problems can be divided into methods that
work in primal, dual and primal-dual spaces. The primal algorithms work
with feasible solutions and improve the value of the objective function. Com-
putational difficulties may arise from the necessity to remain within the
feasible region, particularly for problems with nonlinear constraints. For
problems with linear constraints they enjoy fast convergence.
The dual methods attempt to solve the dual problem. In this case a direction
determination step should find an ascent direction for the dual objective
function, which is always concave even when the primal problem is non-convex. This means that a local optimum of (SPP) is also a global one.
The main difficulty of the dual problem is that it may be non-differentiable
and is not explicitly available.
Primal-dual methods [27, 31, 32, 50] are methods that simultaneously work
in the primal and dual spaces. This principle is widely spread in the field of
interior point methods. A nice book that covers the theoretical, practical and computational aspects of primal-dual interior-point methods is the one by Stephen J. Wright [75].
2.5 Convergence
a) if x ∉ Γ, then Z(y) < Z(x) for all points y ∈ A(x)
The first paper, "The Stiff is Moving - Conjugate Direction Frank-Wolfe Methods with Applications to Traffic Assignment", treats the traffic assignment problem [63]. In this problem, travelers between different origin-destination
3 Outline of the thesis and contribution 17
d̃ₖ = dₖ + βₖ d̃ₖ₋₁,
A linear approximation of (SP P ) (see Section 2.4) at the current primal and
dual points gives a column generation problem which reduces and separates
into a primal and a dual column generation problem. These are used to
find better approximations of the inner primal and dual spaces. The line
search problem of a traditional SLP algorithm is replaced by a minimization
problem of the same type as the original one, but with typically fewer vari-
ables and fewer constraints. Because of the smaller number of variables and constraints, it should be computationally less demanding than the original problem.
The theoretical results presented in this paper show the convergence of the
new method to a point that satisfies the KKT conditions, and thus to a global
optimal solution for a convex problem. In the presented algorithm it is not
necessary to introduce rules to control the move limits ∆k , and we may
abandon the merit function as well, while still guaranteeing convergence.
In the paper, the suggested idea of using multi-dimensional search is also
outlined for the case of sequential quadratic programming algorithms.
The papers that have contributed to the contents of the thesis arose in the following order.
[11] T. Y. Chen. Calculation of the move limits for the sequential linear pro-
gramming method. Internat. J. Numer. Methods Engrg., 36(15):2661–
2679, 1993.
[26] A. Forsgren and Ph. E. Gill. Interior methods for nonlinear optimiza-
tion. SIAM Rev., 44:525–597, 2002.
[31] E. M. Gertz and Ph. E. Gill. A primal-dual trust region algorithm for
nonlinear optimization. Math. Program., 100(1):49–94, 2004.
[34] N. Gould, D. Orban, and Ph. Toint. Numerical methods for large-scale
nonlinear optimization. Acta Numerica, pages 299–361, 2005.
[79] Ch. Zillober, K. Schittkowski, and K. Moritzen. Very large scale opti-
mization by sequential convex programming. Optim. Methods Softw.,
19(1):103–120, 2004.