UNCONSTRAINED OPTIMIZATION
2.1 Introduction
Many engineering, economic and planning problems can be posed as optimization problems, i.e. as the problem of determining the points of minimum of a function (possibly in the presence of conditions on the decision variables). Moreover, numerical problems, such as solving systems of equations or inequalities, can also be posed as optimization problems.
We start with the study of optimization problems in which the decision variables are defined in $\mathbb{R}^n$: unconstrained optimization problems. More precisely, we study the problem of determining local minima of differentiable functions. Although these methods are seldom used on their own in applications, since in real problems the decision variables are subject to constraints, the techniques of unconstrained optimization are instrumental in solving more general problems: the knowledge of good methods for local unconstrained minimization is a necessary prerequisite for the solution of constrained and global minimization problems.
The methods that will be studied can be classified from various points of view. The most interesting classification is based on the information available on the function to be optimized, namely:

- methods without derivatives (direct search, finite differences);
- methods based on knowledge of the first derivatives (gradient, conjugate directions, quasi-Newton);
- methods based on knowledge of the first and second derivatives (Newton).
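To make the first class concrete, a derivative-free method may approximate the gradient by finite differences. The sketch below is an illustration, not from the notes: the quadratic $f(x) = x_1^2 + 3x_2^2$ and the step size $h$ are arbitrary choices, and a forward difference is only one of several possible schemes.

```python
import numpy as np

def fd_gradient(f, x, h=1e-6):
    """Forward-difference approximation of the gradient of f at x."""
    g = np.zeros_like(x, dtype=float)
    for i in range(len(x)):
        e = np.zeros_like(x, dtype=float)
        e[i] = h
        g[i] = (f(x + e) - f(x)) / h
    return g

# Illustrative quadratic: f(x) = x1^2 + 3*x2^2, with gradient [2*x1, 6*x2].
f = lambda x: x[0]**2 + 3.0 * x[1]**2
x = np.array([1.0, -2.0])
approx = fd_gradient(f, x)
exact = np.array([2.0 * x[0], 6.0 * x[1]])
print(np.allclose(approx, exact, atol=1e-4))  # True
```

The forward-difference error is of order $h$, so the approximation matches the analytic gradient only up to a tolerance; this trade-off between truncation and rounding error is what makes derivative-based methods preferable when derivatives are available.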
$$f(x) \le f(y)$$
for all $y \in F$.$^1$
A point $x \in F$ is a strict (or isolated) global minimum (or minimiser)$^2$ for Problem 1 if
$$f(x) < f(y)$$
for all $y \in F$ with $y \ne x$.
$^1$ The set $F$ may be specified by equations of the form (1.1) and/or (1.2).
$^2$ Alternatively, the term global minimiser can be used to denote a point at which the function $f$ attains its global minimum.
2.2 Definitions and existence conditions

The following result provides a sufficient, but not necessary, condition for the existence of a global minimum for Problem 1.

Proposition 1 (Weierstrass) Let $F \subset \mathbb{R}^n$ be a non-empty compact set$^3$ and let $f : F \to \mathbb{R}$ be continuous. Then $f$ admits a global minimum in $F$.
In unconstrained optimization problems the set $F$ coincides with $\mathbb{R}^n$, hence the above statement cannot be used to establish the existence of global minima. To address the existence problem it is necessary to consider the structure of the level sets of the function $f$. See also Section 1.2.3.
Definition 3 Let $f : \mathbb{R}^n \to \mathbb{R}$. A level set of $f$ is any non-empty set of the form
$$L(\alpha) = \{x \in \mathbb{R}^n : f(x) \le \alpha\},$$
with $\alpha \in \mathbb{R}$.
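As a concrete illustration (the function below is chosen for this purpose and is not from the notes): for $f(x) = x_1^2 + x_2^2$ the level set $L(\alpha)$, $\alpha \ge 0$, is the closed disk of radius $\sqrt{\alpha}$, so membership reduces to a single inequality check.

```python
import numpy as np

def in_level_set(f, x, alpha):
    """Check whether x belongs to L(alpha) = {x : f(x) <= alpha}."""
    return f(np.asarray(x, dtype=float)) <= alpha

# Level sets of this f are closed disks of radius sqrt(alpha).
f = lambda x: x[0]**2 + x[1]**2
print(in_level_set(f, [1.0, 1.0], 2.0))  # True: f = 2 <= 2 (boundary point)
print(in_level_set(f, [2.0, 0.0], 2.0))  # False: f = 4 > 2
```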
For convenience, if $x_0 \in \mathbb{R}^n$ we denote by $L_0$ the level set $L(f(x_0))$. Using the concept of level sets it is possible to establish a simple sufficient condition for the existence of global solutions of an unconstrained optimization problem.

Proposition 2 Let $f : \mathbb{R}^n \to \mathbb{R}$ be continuous and let $x_0 \in \mathbb{R}^n$. If the level set $L_0$ is compact, then $f$ admits a global minimum in $\mathbb{R}^n$.
Proof. By Proposition 1 there exists a global minimum $x^\star$ of $f$ in $L_0$, i.e. $f(x^\star) \le f(x)$ for all $x \in L_0$. However, if $x \notin L_0$ then $f(x) > f(x_0) \ge f(x^\star)$, hence $x^\star$ is a global minimum of $f$ in $\mathbb{R}^n$. $\square$
It is obvious that the structure of the level sets of the function f plays a fundamental
role in the solution of Problem 1. The following result provides a necessary and sufficient
condition for the compactness of all level sets of f .
$^3$ A compact set is a bounded and closed set.
Proposition 3 Let $f : \mathbb{R}^n \to \mathbb{R}$ be a continuous function. All level sets of $f$ are compact if and only if for any sequence $\{x_k\}$ one has
$$\lim_{k \to \infty} \|x_k\| = +\infty \quad \Longrightarrow \quad \lim_{k \to \infty} f(x_k) = +\infty.$$
A function that satisfies the condition of the above proposition is said to be radially
unbounded.
Proof. We only prove the necessity. Suppose all level sets of $f$ are compact. Then, proceeding by contradiction, suppose there exist a sequence $\{x_k\}$ such that $\lim_{k \to \infty} \|x_k\| = +\infty$ and a number $\gamma > 0$ such that $f(x_k) < \gamma$ for all $k$. As a result
$$\{x_k\} \subset L(\gamma),$$
which is impossible, since $L(\gamma)$ is compact, hence bounded, whereas the sequence $\{x_k\}$ is unbounded. $\square$
Definition 4 Let $f : \mathbb{R}^n \to \mathbb{R}$. A vector $d \in \mathbb{R}^n$ is said to be a descent direction for $f$ at $x^\star$ if there exists $\bar\delta > 0$ such that
$$f(x^\star + \delta d) < f(x^\star)$$
for all $\delta \in (0, \bar\delta]$.
Proposition 4 Let $f : \mathbb{R}^n \to \mathbb{R}$ and assume$^4$ $\nabla f$ exists and is continuous. Let $x^\star$ and $d$ be given. Then, if $\nabla f(x^\star)' d < 0$ the direction $d$ is a descent direction for $f$ at $x^\star$.

Proof. Note that
$$\nabla f(x^\star)' d = \lim_{\delta \to 0^+} \frac{f(x^\star + \delta d) - f(x^\star)}{\delta},$$
$^4$ We denote by $\nabla f$ the gradient of the function $f$, i.e. $\nabla f = \left[\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n}\right]'$. Note that $\nabla f$ is a column vector.
[Figure 2.1: a level surface $\{x : f(x) = f(x^\star)\}$, the direction in which $f$ is increasing, the anti-gradient $-\nabla f(x^\star)$, and a descent direction.]
and this is negative by hypothesis. As a result, for $\delta > 0$ and sufficiently small,
$$f(x^\star + \delta d) - f(x^\star) < 0,$$
i.e. $d$ is a descent direction for $f$ at $x^\star$. $\square$
The proposition establishes that if $\nabla f(x^\star)' d < 0$ then, for sufficiently small positive displacements along $d$ starting at $x^\star$, the function $f$ is decreasing. It is also obvious that if $\nabla f(x^\star)' d > 0$, $d$ is a direction of ascent, i.e. the function $f$ is increasing for sufficiently small positive displacements from $x^\star$ along $d$. If $\nabla f(x^\star)' d = 0$, $d$ is orthogonal to $\nabla f(x^\star)$ and it is not possible to establish, without further knowledge of the function $f$, the nature of the direction $d$.
From a geometrical point of view (see also Figure 2.1), the sign of the directional derivative $\nabla f(x^\star)' d$ gives information on the angle between $d$ and the direction of the gradient at $x^\star$, provided $\nabla f(x^\star) \ne 0$. If $\nabla f(x^\star)' d > 0$ the angle between $\nabla f(x^\star)$ and $d$ is acute. If $\nabla f(x^\star)' d < 0$ the angle between $\nabla f(x^\star)$ and $d$ is obtuse. Finally, if $\nabla f(x^\star)' d = 0$ and $\nabla f(x^\star) \ne 0$, $\nabla f(x^\star)$ and $d$ are orthogonal. Note that the gradient $\nabla f(x^\star)$, if it is nonzero, is a direction orthogonal to the level surface $\{x : f(x) = f(x^\star)\}$ and is a direction of ascent, hence the anti-gradient $-\nabla f(x^\star)$ is a descent direction.
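Proposition 4 can be checked numerically: pick a point and a direction with $\nabla f(x^\star)' d < 0$ and verify that $f$ decreases for a small positive step along $d$. The quadratic below is an illustrative choice, not taken from the notes; the anti-gradient is used as the direction $d$.

```python
import numpy as np

# Illustrative function and its gradient: f(x) = x1^2 + 3*x2^2.
f = lambda x: x[0]**2 + 3.0 * x[1]**2
grad = lambda x: np.array([2.0 * x[0], 6.0 * x[1]])

x_star = np.array([1.0, 1.0])
d = -grad(x_star)            # anti-gradient: a descent direction when grad != 0

slope = grad(x_star) @ d     # directional derivative grad(x*)' d, negative here
delta = 1e-3                 # small positive displacement along d
decrease = f(x_star + delta * d) < f(x_star)
print(slope < 0, decrease)   # True True
```

Here $\nabla f(x^\star) = (2, 6)$ and $d = (-2, -6)$, so the directional derivative is $-40 < 0$, and $f$ indeed decreases for the small step taken.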
Remark. The scalar product $x'y$ between two vectors $x$ and $y$ can be used to define the angle between $x$ and $y$. To this end, define the angle between $x$ and $y$ as the number $\theta \in [0, \pi]$ such that$^5$
$$\cos\theta = \frac{x'y}{\|x\|_E \, \|y\|_E}.$$
If $x'y = 0$ one has $\cos\theta = 0$ and the vectors are orthogonal, whereas if $x$ and $y$ have the same direction, i.e. $x = \lambda y$ with $\lambda > 0$, $\cos\theta = 1$.
$^5$ $\|x\|_E$ denotes the Euclidean norm of the vector $x$, i.e. $\|x\|_E = \sqrt{x'x}$.
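The remark translates directly into code: the angle between two vectors follows from their scalar product and Euclidean norms. A small sketch, with vectors chosen purely for illustration:

```python
import numpy as np

def angle(x, y):
    """Angle theta in [0, pi] between vectors x and y."""
    c = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return np.arccos(np.clip(c, -1.0, 1.0))  # clip guards against rounding error

x = np.array([1.0, 0.0])
print(np.isclose(angle(x, np.array([0.0, 2.0])), np.pi / 2))  # True: orthogonal
print(np.isclose(angle(x, np.array([3.0, 0.0])), 0.0))        # True: same direction
print(np.isclose(angle(x, np.array([-1.0, 0.0])), np.pi))     # True: opposite direction
```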