
Nonlinear Programming
SIAM's Classics in Applied Mathematics series consists of books that were previously
allowed to go out of print. These books are republished by SIAM as a professional
service because they continue to be important resources for mathematical scientists.
Editor-in-Chief
Robert E. O'Malley, Jr., University of Washington
Editorial Board
Richard A. Brualdi, University of Wisconsin-Madison
Herbert B. Keller, California Institute of Technology
Andrzej Z. Manitius, George Mason University
Ingram Olkin, Stanford University
Stanley Richardson, University of Edinburgh
Ferdinand Verhulst, Mathematisch Instituut, University of Utrecht
Classics in Applied Mathematics
C. C. Lin and L. A. Segel, Mathematics Applied to Deterministic Problems in the
Natural Sciences
Johan G. F. Belinfante and Bernard Kolman, A Survey of Lie Groups and Lie
Algebras with Applications and Computational Methods
James M. Ortega, Numerical Analysis: A Second Course
Anthony V. Fiacco and Garth P. McCormick, Nonlinear Programming: Sequential
Unconstrained Minimization Techniques
F. H. Clarke, Optimization and Nonsmooth Analysis
George F. Carrier and Carl E. Pearson, Ordinary Differential Equations
Leo Breiman, Probability

R. Bellman and G. M. Wing, An Introduction to Invariant Imbedding
Abraham Berman and Robert J. Plemmons, Nonnegative Matrices in the Mathematical
Sciences
Olvi L. Mangasarian, Nonlinear Programming
*Carl Friedrich Gauss, Theory of the Combination of Observations Least Subject
to Errors: Part One, Part Two, Supplement. Translated by G. W. Stewart
Richard Bellman, Introduction to Matrix Analysis
U. M. Ascher, R. M. M. Mattheij, and R. D. Russell, Numerical Solution of
Boundary Value Problems for Ordinary Differential Equations
K. E. Brenan, S. L. Campbell, and L. R. Petzold, Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations
Charles L. Lawson and Richard J. Hanson, Solving Least Squares Problems
J. E. Dennis, Jr. and Robert B. Schnabel, Numerical Methods for Unconstrained
Optimization and Nonlinear Equations
Richard E. Barlow and Frank Proschan, Mathematical Theory of Reliability
*First time in print.

Classics in Applied Mathematics (continued)


Cornelius Lanczos, Linear Differential Operators
Richard Bellman, Introduction to Matrix Analysis, Second Edition


Beresford N. Parlett, The Symmetric Eigenvalue Problem
Richard Haberman, Mathematical Models: Mechanical Vibrations, Population
Dynamics, and Traffic Flow
Peter W. M. John, Statistical Design and Analysis of Experiments
Tamer Başar and Geert Jan Olsder, Dynamic Noncooperative Game Theory, Second
Edition
Emanuel Parzen, Stochastic Processes
Petar Kokotovic, Hassan K. Khalil, and John O'Reilly, Singular Perturbation
Methods in Control: Analysis and Design
Jean Dickinson Gibbons, Ingram Olkin, and Milton Sobel, Selecting and Ordering
Populations: A New Statistical Methodology
James A. Murdock, Perturbations: Theory and Methods
Ivar Ekeland and Roger Temam, Convex Analysis and Variational Problems
Ivar Stakgold, Boundary Value Problems of Mathematical Physics, Volumes I and II
J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in
Several Variables
David Kinderlehrer and Guido Stampacchia, An Introduction to Variational
Inequalities and Their Applications
F. Natterer, The Mathematics of Computerized Tomography
Avinash C. Kak and Malcolm Slaney, Principles of Computerized Tomographic Imaging
R. Wong, Asymptotic Approximations of Integrals
O. Axelsson and V. A. Barker, Finite Element Solution of Boundary Value
Problems: Theory and Computation
David R. Brillinger, Time Series: Data Analysis and Theory
Joel N. Franklin, Methods of Mathematical Economics: Linear and Nonlinear
Programming, Fixed-Point Theorems
Philip Hartman, Ordinary Differential Equations, Second Edition
Michael D. Intriligator, Mathematical Optimization and Economic Theory
Philippe G. Ciarlet, The Finite Element Method for Elliptic Problems


Nonlinear Programming

Olvi L. Mangasarian
University of Wisconsin
Madison, Wisconsin

Society for Industrial and Applied Mathematics
Philadelphia

Library of Congress Cataloging-in-Publication Data


Mangasarian, Olvi L., 1934-
Nonlinear programming / Olvi L. Mangasarian.
p. cm. -- (Classics in applied mathematics ; 10)
Originally published: New York : McGraw-Hill, 1969, in series:
McGraw-Hill series in systems science.
Includes bibliographical references and indexes.
ISBN 0-89871-341-2
1. Nonlinear programming.
I. Title. II. Series
T57.8.M34 1994
519.7'6-dc20
94-36844
10 9 8 7 6 5 4 3
All rights reserved. Printed in the United States of America. No part of this
book may be reproduced, stored, or transmitted in any manner without the
written permission of the Publisher. For information, write the Society for
Industrial and Applied Mathematics, 3600 University City Science Center,
Philadelphia, PA 19104-2688.
Copyright 1994 by the Society for Industrial and Applied Mathematics.
This SIAM edition is a corrected republication of the work first published in
1969 by the McGraw-Hill Book Company, New York, New York.
SIAM is a registered trademark.

To
Josephine Mangasarian, my mother,
and to
Claire


Preface to the
Classics Edition

Twenty-five years have passed since the original edition of this book appeared; however, the topics covered are still timely and currently taught at the University of Wisconsin as well as many other major institutions. At Wisconsin these topics are taught in a course jointly listed by the Computer Sciences, Industrial Engineering, and Statistics departments. Students from these and other disciplines regularly take this course. Each year I get a number of requests from the United States and abroad for copies of the book and for permission to reproduce reserve copies for libraries. I was therefore pleased when SIAM approached me with a proposal to reprint the book in its Classics series. I believe that this book is an appropriate choice for this series inasmuch as it is a concise, rigorous, yet accessible account of the fundamentals of constrained optimization theory that is useful to both the beginning student as well as the active researcher.

I am appreciative that SIAM has chosen to publish the book and to make the corrections that I supplied. I am especially grateful to Vickie Kearn and Ed Block for their friendly and professional handling of the publication process. My hope is that the mathematical programming community will benefit from this endeavor.

Olvi L. Mangasarian



Preface

This book is based on a course in nonlinear programming given in the Electrical Engineering and Computer Sciences Department and the Industrial Engineering and Operations Research Department of the University of California at Berkeley and in the Computer Sciences Department of the University of Wisconsin at Madison. The intent of the book is to cover the fundamental theory underlying nonlinear programming for the applied mathematician. The entire book could be used as a text for a one-semester course, or the first eight chapters for a one-quarter course. The course level would probably be advanced undergraduate or first-year graduate. The only prerequisite would be a good course in advanced calculus or real analysis. (Linear programming is not a prerequisite.) All the results needed in the book are given in the Appendixes.

I am indebted to J. Ben Rosen who first introduced me to the fascinating subject of nonlinear programming, to Lotfi A. Zadeh who originally suggested the writing of such a book, to Jean-Paul Jacob, Phillippe Rossi, and James W. Daniel who read the manuscript carefully and made numerous improvements, and to all my students whose questions and observations resulted in many changes.

Olvi L. Mangasarian



To
the
Reader

The following system of numbering and cross-referencing is used in this book. At the top of each page in the outer margin appear chapter and section numbers in boldface type; for example, 3.2 at the top of a page means that the discussion on that page is part of Chapter 3, Section 2. In addition, each item on the page (Definition, Theorem, Example, Comment, Remark, etc.) is given a number that appears in the left-hand margin; such items are numbered consecutively within each section. Item numbers and all cross-references in the text are in italic type. Cross-references are of the form "by Definition 5.4.3"; this means "by the definition which is item 3 of Section 4 in Chapter 5." Since the four appendixes are labeled A, B, C, and D, the reference "C.1.3" is to "item 3 in Section 1, Appendix C." When we refer in a section to an item within the same section, only the item number is given; thus "substituting in 7" means "substituting in Equation 7 of this section."



Contents

Preface to the Classics Edition
Preface
To the Reader

Chapter 1. The Nonlinear Programming Problem, Preliminary Concepts, and Notation
1. The nonlinear programming problem
2. Sets and symbols
3. Vectors
4. Matrices
5. Mappings and functions
6. Notation

Chapter 2. Linear Inequalities and Theorems of the Alternative
1. Introduction
2. The optimality criteria of linear programming: An application of Farkas' theorem
3. Existence theorems for linear systems
4. Theorems of the alternative

Chapter 3. Convex Sets in R^n
1. Convex sets and their properties
2. Separation theorems for convex sets

Chapter 4. Convex and Concave Functions
1. Definitions and basic properties
2. Some fundamental theorems for convex functions

Chapter 5. Saddlepoint Optimality Criteria of Nonlinear Programming Without Differentiability
1. The minimization and saddlepoint problems
2. Some basic results for minimization and local minimization problems
3. Sufficient optimality criteria
4. Necessary optimality criteria

Chapter 6. Differentiable Convex and Concave Functions
1. Differentiable convex and concave functions
2. Differentiable strictly convex and concave functions
3. Twice-differentiable convex and concave functions
4. Twice-differentiable strictly convex and concave functions

Chapter 7. Optimality Criteria in Nonlinear Programming with Differentiability
1. The minimization problems and the Fritz John and Kuhn-Tucker stationary-point problems
2. Sufficient optimality criteria
3. Necessary optimality criteria

Chapter 8. Duality in Nonlinear Programming
1. Duality in nonlinear programming
2. Duality in quadratic programming
3. Duality in linear programming

Chapter 9. Generalizations of Convex Functions: Quasiconvex, Strictly Quasiconvex, and Pseudoconvex Functions
1. Quasiconvex and quasiconcave functions
2. Strictly quasiconvex and strictly quasiconcave functions
3. Pseudoconvex and pseudoconcave functions
4. Summary of properties and relations between quasiconvex, strictly quasiconvex, pseudoconvex, convex, and strictly convex functions
5. Warning
6. Problems

Chapter 10. Optimality and Duality for Generalized Convex and Concave Functions
1. Sufficient optimality criteria
2. Necessary optimality criteria
3. Duality

Chapter 11. Optimality and Duality in the Presence of Nonlinear Equality Constraints
1. Sufficient optimality criteria
2. "Minimum principle" necessary optimality criteria: X not open
3. Fritz John and Kuhn-Tucker stationary-point necessary optimality criteria: X open
4. Duality with nonlinear equality constraints

Appendix A. Vectors and Matrices
1. Vectors
2. Matrices

Appendix B. Resume of Some Topological Properties of R^n
1. Open and closed sets
2. Sequences and bounds
3. Compact sets in R^n

Appendix C. Continuous and Semicontinuous Functions, Minima and Infima
1. Continuous and semicontinuous functions
2. Infimum (supremum) and minimum (maximum) of a set of real numbers
3. Infimum (supremum) and minimum (maximum) of a numerical function
4. Existence of a minimum and a maximum of a numerical function

Appendix D. Differentiable Functions, Mean-value and Implicit Function Theorems
1. Differentiable and twice-differentiable functions
2. Mean-value theorem and Taylor's theorem
3. Implicit function theorem

Bibliography
Name Index
Subject Index


Chapter One
The Nonlinear Programming Problem, Preliminary Concepts, and Notation

1. The nonlinear programming problem†

The nonlinear programming problem that will concern us has three fundamental ingredients: a finite number of real variables, a finite number of constraints which the variables must satisfy, and a function of the variables which must be minimized (or maximized). Mathematically speaking we can state the problem as follows: Find specific values (x̄_1, . . . , x̄_n), if they exist, of the variables (x_1, . . . , x_n) that will satisfy the inequality constraints

1    g_i(x_1, . . . , x_n) ≦ 0    i = 1, . . . , m

the equality constraints

2    h_j(x_1, . . . , x_n) = 0    j = 1, . . . , k

and minimize (or maximize) the objective function

3    θ(x_1, . . . , x_n)

over all values of x_1, . . . , x_n satisfying 1 and 2. Here, g_i, h_j, and θ are numerical functions‡ of the variables x_1, . . . , x_n, which are defined for all finite values of

† In order to introduce the problem in the first section of the book, some undefined terms (function, real variable, constraints, etc.) must be interpreted intuitively for the time being. The problem will be stated rigorously at the end of this chapter (see 1.6.9 to 1.6.12).
‡ The concept of a numerical function will be defined precisely in Sec. 1.5. For the present, by a numerical function of x_1, . . . , x_n we mean a correspondence which assigns a single real number to each n-tuple of real values that the variables x_1, . . . , x_n assume.


the variables. The fundamental difference between this problem and that of the classical constrained minimization problem of the ordinary calculus [Courant 47, Fleming 65]† is the presence of the inequalities 1. As such, inequalities will play a crucial role in nonlinear programming and will be studied in some detail.

As an example of the above problem consider the case shown in Fig. 1.1.1. Here we have n = 2 (two variables x_1, x_2), m = 3 (three inequality constraints), and k = 1 (one equality constraint). Each curve in Fig. 1.1.1 is obtained by setting some numerical function equal to a real number, such as θ(x_1,x_2) = 5 or g_1(x_1,x_2) = 0. The little arrows on the

† This refers to the works by Courant, written in 1947, and by Fleming, written in 1965, as listed in the Bibliography at the back of the book. This system of references will be used throughout the book with one exception: [Gordan 73] refers to Gordan's paper written in 1873.

Fig. 1.1.1 A typical nonlinear programming problem in two variables (x_1,x_2).


curves g_i(x_1,x_2) = 0 indicate the side in the direction of which g_i increases, and hence all (x_1,x_2) must lie on the opposite side of these curves if they are to satisfy 1. All such (x_1,x_2) lie in the shaded area of Fig. 1.1.1. To satisfy 2, (x_1,x_2) must lie on the curve h_1(x_1,x_2) = 0. The solution to the problem is (x̄_1,x̄_2). This is the point on the curve h_1(x_1,x_2) = 0 at which θ assumes its lowest value over the set of all (x_1,x_2) satisfying g_i(x_1,x_2) ≦ 0, i = 1, 2, 3. In more complicated situations where n, m, and k may be large, it will not be easy to solve the above problem. We shall then be concerned with obtaining necessary and/or sufficient conditions that a point (x̄_1, . . . , x̄_n) must satisfy in order for it to solve the nonlinear programming problem 1 to 3. These optimality conditions form the crux of nonlinear programming.

In dealing with problems of the above type we shall confine ourselves to minimization problems only. Maximization problems can be easily converted to minimization problems by employing the identity

maximum θ(x_1, . . . , x_n) = −minimum [−θ(x_1, . . . , x_n)]
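Although this book develops theory rather than algorithms, a problem of the form 1 to 3 can be handed directly to a numerical solver. The following minimal sketch assumes SciPy is available; the particular θ, g, and h are a hypothetical instance, not taken from the text. By the identity above, a maximization problem would be passed to the same routine as −θ.

import numpy as np
from scipy.optimize import minimize

theta = lambda x: (x[0] - 1) ** 2 + (x[1] - 2) ** 2    # objective, as in 3
g = lambda x: np.array([x[0] ** 2 + x[1] ** 2 - 4.0])  # g(x) <= 0, as in 1
h = lambda x: np.array([x[0] + x[1] - 1.0])            # h(x) = 0, as in 2

# SciPy expects inequality constraints in the form c(x) >= 0,
# so g(x) <= 0 is passed as -g(x) >= 0.
constraints = [{"type": "ineq", "fun": lambda x: -g(x)},
               {"type": "eq", "fun": h}]
result = minimize(theta, x0=np.zeros(2), method="SLSQP",
                  constraints=constraints)
print(result.x, theta(result.x))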
Problem

Solve graphically, as indicated in Fig. 1.1.1, the following nonlinear programming problem:

minimize (x_1 − x_2)

subject to

2. Sets and symbols

We shall use some symbols and elementary concepts from set theory [Anderson-Hall 63, Hamilton-Landin 61, Berge 63]. In particular a set Γ is a collection of objects of any kind which are by definition elements or points of Γ. For example if we let R (the reals or the real line) denote the set of all real numbers, then 7 is an element or point of R. We use the symbol ∈ to denote the fact that an element belongs to a set. For example we write 7 ∈ R. For simplicity we also write sometimes 5, 7 ∈ R instead of 5 ∈ R and 7 ∈ R.

If Γ and Δ are two sets, we say that Γ is contained in Δ, Γ is in Δ, Γ is a subset of Δ, or Δ contains Γ, if each element of Γ is also an element of Δ, and we write Γ ⊂ Δ or Δ ⊃ Γ. If Γ ⊂ Δ and Δ ⊂ Γ we write Γ = Δ. A slash across a symbol denotes its negation. Thus x ∉ Γ and Γ ⊄ Δ denote respectively that x is not an element of Γ and that Γ is not a subset


of Δ. The empty set is the set which contains no elements and is denoted by ∅. We denote a set sometimes by {x,y,z}, if the set is formed by the elements x, y, z. Sometimes a set is characterized by a property that its elements must have, in which case we write

{x | x satisfies property P}

For example the set of all nonnegative real numbers can be written as {x | x ∈ R, x ≧ 0}.

The set of elements belonging to either of two sets Γ or Δ is called the union of the sets Γ and Δ and is denoted by Γ ∪ Δ. We have then

Γ ∪ Δ = {x | x ∈ Γ or x ∈ Δ}

The set of elements belonging to at least one of the sets of the (finite or infinite) family of sets (Γ_i)_{i∈I} is called the union of the family and is denoted by ∪_{i∈I} Γ_i. Then

∪_{i∈I} Γ_i = {x | x ∈ Γ_i for at least one i ∈ I}

The set of elements belonging to both sets Γ and Δ is called the intersection of the sets Γ and Δ and is denoted by Γ ∩ Δ. We then have

Γ ∩ Δ = {x | x ∈ Γ and x ∈ Δ}

The set of elements belonging to all the sets of the (finite or infinite) family of sets (Γ_i)_{i∈I} is called the intersection of the family and is denoted by ∩_{i∈I} Γ_i. Then

∩_{i∈I} Γ_i = {x | x ∈ Γ_i for all i ∈ I}

Two sets Γ and Δ are disjoint if they do not intersect, that is, if Γ ∩ Δ = ∅.

The difference of the sets Δ and Γ is the set of those elements of Δ not contained in Γ and is denoted by Δ ∼ Γ. We have then

Δ ∼ Γ = {x | x ∈ Δ, x ∉ Γ}

In the above it is not assumed in general that Γ ⊂ Δ. If however Γ ⊂ Δ, then Δ ∼ Γ is called the complement of Γ relative to Δ.

The product of two sets Γ and Δ, denoted by Γ × Δ, is defined as the set of ordered pairs (x,y) of which x ∈ Γ and y ∈ Δ. We have then

Γ × Δ = {(x,y) | x ∈ Γ, y ∈ Δ}


Fig. 1.2.1 The product Γ × Δ of the sets Γ and Δ.

The product of n sets Γ_1, . . . , Γ_n, denoted by Γ_1 × Γ_2 × ⋯ × Γ_n, is defined as the set of ordered n-tuples (x_1, . . . , x_n) of which x_1 ∈ Γ_1, . . . , x_n ∈ Γ_n. We have then

Γ_1 × ⋯ × Γ_n = {(x_1, . . . , x_n) | x_1 ∈ Γ_1, . . . , x_n ∈ Γ_n}

If Γ_1 = Γ_2 = ⋯ = Γ_n = Γ, then we write Γ^n = Γ × Γ × ⋯ × Γ. Figure 1.2.1 depicts the set Γ × Δ. The set R² = R × R, which can be represented by points on a plane, is called the Euclidean plane.
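The set operations just defined can be mirrored directly with Python's built-in set type; the particular sets below are arbitrary illustrations, not examples from the text.

from itertools import product

Gamma, Delta = {1, 2, 3}, {3, 4}

print(Gamma | Delta)               # union: {1, 2, 3, 4}
print(Gamma & Delta)               # intersection: {3}
print(Gamma - Delta)               # difference: {1, 2}
print(Gamma.isdisjoint(Delta))     # False, since both contain 3
print(set(product(Gamma, Delta)))  # product: the ordered pairs (x, y)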
The following symbols will also be used:
(Vx) reads for each x
(3x) reads there exists an x such that
=> reads implies
<= reads is implied by
= reads is equivalent to
(A slash [/] across any one of the last three symbols denotes their negation.)
For example the statement "for each x there exists a y such that 6(x,y) = 1"
can be written as
The negation of the above statement can be automatically written as
Frequently we shall refer to a certain relationship such as an
equation or an inequality by a number or Roman numeral such as I or II.
5


The notation I ⟹ II means relationship I implies relationship II. An overbar on I or II (Ī or ĪĪ) denotes the negation of the relationship referred to by that numeral. Obviously then the statement I ⟹ II is logically equivalent to its contrapositive Ī ⟸ ĪĪ. Thus

I ⟹ II  ⟺  Ī ⟸ ĪĪ

3. Vectors

n-vector

An n-vector or n-dimensional vector x, for any positive integer n, is an n-tuple (x_1, . . . , x_n) of real numbers. The real number x_i is referred to as the ith component or element of the vector x.

R^n

The n-dimensional (real) Euclidean space R^n, for any positive integer n, is the set of all n-vectors. The notation x ∈ R^n means that x is an element of R^n, and hence x is an n-vector. Frequently we shall also refer to x as a point in R^n. R¹, or simply R, is then the Euclidean line (the set of all real numbers), R² is the Euclidean plane (the set of all ordered pairs of real numbers), and R^n = R × R × ⋯ × R (n times).

Vector addition and multiplication by a real number

Let x, y ∈ R^n and α ∈ R. The sum x + y is defined by

x + y = (x_1 + y_1, . . . , x_n + y_n)

and the multiplication by a real number αx is defined by

αx = (αx_1, . . . , αx_n)

Linear dependence and independence

The vectors x^1, . . . , x^m ∈ R^n are said to be linearly independent if

λ_1x^1 + ⋯ + λ_mx^m = 0, λ_1, . . . , λ_m ∈ R  ⟹  λ_1 = ⋯ = λ_m = 0

otherwise they are linearly dependent. (Here and elsewhere 0 denotes the real number zero or a vector each element of which is zero.)

Linear combination

The vector x ∈ R^n is a linear combination of x^1, . . . , x^m ∈ R^n if

x = λ_1x^1 + ⋯ + λ_mx^m    for some λ_1, . . . , λ_m ∈ R


and it is a nonnegative linear combination of x^1, . . . , x^m if in addition to the above equality λ_1, . . . , λ_m ≧ 0. The numbers λ_1, . . . , λ_m are called weights.

The above concepts involving vector addition and multiplication by a scalar define the vector space structure of R^n. They are not enough however to define the concept of distance. For that purpose we introduce the scalar product of two vectors.

Scalar product

The scalar product xy of two vectors x, y ∈ R^n is defined by

xy = x_1y_1 + ⋯ + x_ny_n

Norm of a vector

The norm ‖x‖ of a vector x ∈ R^n is defined by

‖x‖ = (xx)^½

Cauchy-Schwarz inequality

Let x, y ∈ R^n. Then

|xy| ≦ ‖x‖ ‖y‖

where |xy| is the absolute value of the real number xy.

PROOF Let x, y ∈ R^n be fixed. For any α ∈ R

0 ≦ (αx + y)(αx + y) = xx·α² + 2xy·α + yy

Hence the roots of the quadratic equation in α

xx·α² + 2xy·α + yy = 0

cannot be distinct real numbers, and so

(xy)² ≦ (xx)(yy)

which implies the Cauchy-Schwarz inequality.

Distance between two points

Let x, y ∈ R^n. The nonnegative number δ(x,y) = ‖x − y‖ is called the distance between the two points x and y in R^n.

Problem

Establish the fact that R^n is a metric space by showing that


δ(x,y) satisfies the following conditions:

δ(x,y) ≧ 0, and δ(x,y) = 0 if and only if x = y
δ(x,y) = δ(y,x)
δ(x,z) ≦ δ(x,y) + δ(y,z)    (triangle inequality)

(Hint: Use the Cauchy-Schwarz inequality to establish the triangle inequality.)

Angle between two vectors

Let x and y be two nonzero vectors in R^n. The angle ψ between x and y is defined by the formula

cos ψ = xy / (‖x‖ ‖y‖)    0 ≦ ψ ≦ π

This definition of angle agrees for n = 2, 3 with the one in analytic geometry. The nonzero vectors x and y are orthogonal if xy = 0 (ψ = π/2); they form an acute angle with each other if xy ≧ 0 (0 ≦ ψ ≦ π/2), a strict acute angle if xy > 0 (0 ≦ ψ < π/2), an obtuse angle if xy ≦ 0 (π/2 ≦ ψ ≦ π), and a strict obtuse angle if xy < 0 (π/2 < ψ ≦ π).
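The scalar product, norm, distance, and angle of this section translate directly into NumPy; the following sketch, with two arbitrary vectors, also checks the Cauchy-Schwarz inequality numerically.

import numpy as np

x, y = np.array([1.0, 2.0]), np.array([3.0, -1.0])

xy = x @ y                        # scalar product xy
norm_x = np.sqrt(x @ x)           # ||x|| = (xx)^(1/2)
dist = np.linalg.norm(x - y)      # delta(x, y) = ||x - y||

# Cauchy-Schwarz: |xy| <= ||x|| ||y||
assert abs(xy) <= np.linalg.norm(x) * np.linalg.norm(y) + 1e-12

# angle psi between x and y: cos(psi) = xy / (||x|| ||y||)
psi = np.arccos(xy / (np.linalg.norm(x) * np.linalg.norm(y)))
print(xy, norm_x, dist, psi)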

4. Matrices

Although our main concern is nonlinear problems, linear systems of the following type will be encountered very frequently:

1    A_11x_1 + ⋯ + A_1nx_n = b_1
     ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
     A_m1x_1 + ⋯ + A_mnx_n = b_m

where A_ij and b_i, i = 1, . . . , m, j = 1, . . . , n, are given real numbers. We can abbreviate the above system by using the concepts of the previous section. If we let A_i denote an n-vector whose n components are A_ij, j = 1, . . . , n, and if we let x ∈ R^n, then the above system is equivalent to

2    A_ix = b_i    i = 1, . . . , m

In 2 we interpret A_ix as the scalar product 1.3.6 of A_i and x. If we further let Ax denote an m-vector whose m components are A_ix, i = 1, . . . , m, and b an m-vector whose m components are b_i, then the equivalent systems 1 and 2 can be further simplified to

3    Ax = b


In order to be consistent with ordinary matrix-theory notation, we define the m × n matrix A as the rectangular array whose entry in the ith row and jth column is A_ij, i = 1, . . . , m, j = 1, . . . , n.

The ith row of the matrix A will be denoted by A_i and will be an n-vector. Hence

A_i = (A_i1, . . . , A_in)

The jth column of the matrix A will be denoted by A_{.j} and will be an m-vector. Hence

A_{.j} = (A_1j, . . . , A_mj)

The transpose of the matrix A is denoted by A′ and is the n × m matrix obtained from A by interchanging rows and columns. Obviously the ith row of A is equal to the ith column of A′, and the jth column of A is equal to the jth row of A′. Hence

A_i = A′_{.i}    and    A_{.j} = A′_j

The last equalities are to be taken as the definitions of A′_{.i} and A′_j respectively. Since A_ij is the real number in the ith row and jth column of A, then if we define A′_ji as the real number in the jth row and ith column of A′, we have

A′_ji = A_ij

The equivalent systems 1, 2, and 3 can be written in still another form as follows


11    x_1A_{.1} + ⋯ + x_nA_{.n} = b

Here A_{.j} and b are vectors in R^m and the x_j are real numbers. The representation 2 can be interpreted as a problem in R^n whereas 11 can be interpreted as a problem in R^m. In 2 we are required to find an x ∈ R^n that makes the appropriate scalar products b_i (or angles, see 1.3.11) with the n-vectors A_i, i = 1, . . . , m. In 11, we are given the n + 1 vectors in R^m, A_{.j}, j = 1, . . . , n, and b, and we are required to find n weights x_1, . . . , x_n such that b is a linear combination of the vectors A_{.j}. These two dual representations of the same linear system will be used in interpreting some of the important theorems of the alternative of the next chapter.
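The two dual representations can be checked numerically; in the following NumPy sketch the matrix A and the weights x are arbitrary illustrations.

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])         # a 3 x 2 matrix
x = np.array([1.0, 1.0])
b = A @ x                          # so Ax = b by construction

# row view (system 2): each b_i is the scalar product of row A_i with x
assert all(np.isclose(A[i] @ x, b[i]) for i in range(A.shape[0]))

# column view (system 11): b is the combination x_1 A_.1 + x_2 A_.2
assert np.allclose(x[0] * A[:, 0] + x[1] * A[:, 1], b)

print(A.T)                         # the transpose A', with A'_ji = A_ij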
The m × n matrix A of 4 can generate another linear system yA, defined as follows

12    yA = (yA_{.1}, . . . , yA_{.n})

where y ∈ R^m. Hence, yA is an n-dimensional vector whose jth component is given by

(yA)_j = yA_{.j} = y_1A_1j + ⋯ + y_mA_mj

In general we shall follow the convention of using uppercase Latin letters to denote matrices. If A is an m × n matrix, and if we let I ⊂ {1, . . . , m} and J ⊂ {1, . . . , n}, then we define the following submatrices of A (which are matrices with rows and columns extracted respectively from the rows and columns of A): A_I, whose rows are the rows A_i of A with i ∈ I; A_{.J}, whose columns are the columns A_{.j} of A with j ∈ J; and A_{IJ}, with rows extracted from A_I and columns from A_{.J}. Figure 1.4.1 depicts some of the above submatrices of A.

Fig. 1.4.1 An m × n matrix and its submatrices.

Nonvacuous matrix

A matrix A is said to be nonvacuous if it contains at least one element A_ij. An m × n matrix A with m ≧ 1 and n ≧ 1 is nonvacuous even if all its elements A_ij = 0.

5. Mappings and functions

Mapping

Let X and Y be two sets. A mapping Γ from X into Y is a correspondence which associates with every element x of X a subset Γ(x) of Y. For each x ∈ X, the set Γ(x) is called the image of x. The domain X* of Γ is the subset of points of X for which the image Γ(x) is nonempty, that is,

X* = {x | x ∈ X, Γ(x) ≠ ∅}

The range Γ(X*) of Γ is the union of the images of all the points of X*, that is

Γ(X*) = ∪_{x∈X*} Γ(x)

EXAMPLE Let X = Y = R, Γ(x) = {y | cos y = x}. Then

X* = {x | −1 ≦ x ≦ 1}    Γ(X*) = R

Function

A function f is a single-valued mapping from a set X into a set Y. That is, for each x ∈ X, the image set f(x) consists of a single element of Y. The domain of f is X, and we say that f is defined on X. The range of f is f(X) = ∪_{x∈X} f(x). (For convenience we will write the image of a function not as a set but as the unique element of that set.)


Numerical function

A numerical function θ is a function from a set X into R. In other words a numerical function is a correspondence which associates a real number with each element x of X.

EXAMPLES If X = R, then θ is the familiar real single-valued function of a real variable, such as θ(x) = sin x. If X is the set of positive integers, then θ assigns a real number to each positive integer, for example θ(x) = 1/x!. If X = R^n, then θ is the real single-valued function of n variables.

Vector function

An m-dimensional vector function f is a function from a set X into R^m. In other words a vector function is a correspondence which associates a vector from R^m with each element x of X. The m components of the vector f(x) are denoted by f_1(x), . . . , f_m(x). Each f_i is a numerical function on X. A vector function f has a certain property (for example continuity) whenever each of its components f_i has that property.

EXAMPLE If X = R^n, then f is an m-dimensional vector function of the n variables x_1, . . . , x_n. The m components f_i, i = 1, . . . , m, of f are numerical functions on R^n.
Linear vector functions on R^n

An m-dimensional vector function f defined on R^n is said to be linear if

f(x) = Ax + b

where A is some fixed m × n matrix and b is some fixed vector in R^m. It follows that if f is a linear function on R^n then

f((1 − λ)x^1 + λx^2) = (1 − λ)f(x^1) + λf(x^2)    for all x^1, x^2 ∈ R^n, λ ∈ R

(Conversely, relations of this kind could be used to define a linear vector function on R^n, from which it could be shown that f(x) = Ax + b [Berge 63, p. 159].)

If m = 1 in the above, then we have a numerical linear function θ on R^n and

θ(x) = cx + γ

where c is a fixed vector in R^n and γ is a fixed real number.
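As a small illustration (the data A and b below are arbitrary), a linear vector function in the above sense preserves affine combinations, a property closely related to the converse characterization cited from [Berge 63].

import numpy as np

A = np.array([[1.0, 0.0], [2.0, -1.0], [0.0, 3.0]])  # fixed 3 x 2 matrix
b = np.array([1.0, 1.0, 0.0])                        # fixed vector in R^3

def f(x):
    return A @ x + b

# f((1-l)x1 + l x2) = (1-l)f(x1) + l f(x2) for any real l
x1, x2, lam = np.array([1.0, 2.0]), np.array([-1.0, 0.5]), 0.3
lhs = f((1 - lam) * x1 + lam * x2)
rhs = (1 - lam) * f(x1) + lam * f(x2)
assert np.allclose(lhs, rhs)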


Inequalities or equalities involving linear vector functions (or linear numerical functions) will naturally be called linear inequalities or equalities.

6. Notation

Vectors and real numbers

In general we shall follow the convention that small Latin letters will denote vectors, such as a, b, c, x, y, z, or vector functions, such as f, g, h. Exceptions will be the letters i, j, k, m, n, and sometimes others, which will denote integers. Small Greek letters will denote a real number (a point in R), such as α, β, γ, δ, η, ζ, or a numerical function, such as θ, φ, ψ.

Subscripts

A small Latin letter with an integer subscript or a small Latin letter subscript will denote a component of a vector, in general, and on occasion will denote a vector. For example, if x ∈ R^n, then x_3 and x_i denote respectively the third and ith components of x. On the other hand we will have occasion to let x_1 ∈ R^m, x_2 ∈ R^m, etc., in which case this intent will be made explicit. Small Greek letters with integer or Latin subscripts will occasionally be used to denote real numbers, such as λ_1, λ_i. If x ∈ R^n, K ⊂ N = {1, . . . , n}, and K contains k elements each of which is distinct, then x_K is a vector in R^k with the components {x_i | i ∈ K}. Thus a small Latin letter with a capital Latin letter subscript denotes a vector in a space of smaller or equal dimension to that of the space of the unsubscripted vector.

Superscripts

A small Latin or Greek letter with a superscript or an elevated symbol will denote a fixed vector or real number, for example x^1, x^2, x̄, x̂, x*, etc. Exponentiation on the other hand will be distinguished by parentheses enclosing the quantity raised to a power, for example (x)².

Zero

The number 0 will denote either the real number zero or a vector in R^n all components of which are zero.

Matrices

Matrices will be denoted by capital Latin letters as described in detail in a previous section, Sec. 1.4.


Sets

Sets will always be denoted by capital Greek or Latin letters, such as Γ, Δ, Ω, R, I, X, Y. Capital letters with subscripts, such as Γ_1, Γ_2, and capital letters with elevated symbols, such as Γ*, X̄, will also denote sets. (See also Sec. 1.2.)

Ordering relations

The following convention for equalities and inequalities will be used. If x, y ∈ R^n, then

x = y  ⟺  x_i = y_i    i = 1, . . . , n
x ≧ y  ⟺  x_i ≧ y_i    i = 1, . . . , n
x ≥ y  ⟺  x ≧ y and x ≠ y
x > y  ⟺  x_i > y_i    i = 1, . . . , n

If x ≧ 0, x is said to be nonnegative, if x ≥ 0 then x is said to be semipositive, and if x > 0 then x is said to be positive. The relations =, ≧, ≥, > defined above are called ordering relations (in R^n).
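The three orderings are easy to mistake for one another; the following sketch implements them for NumPy vectors and evaluates each against 0.

import numpy as np

def geq(x, y):        # x >= y componentwise (nonnegative when y = 0)
    return bool(np.all(x >= y))

def semi(x, y):       # x >= y and x != y (semipositive when y = 0)
    return geq(x, y) and not np.array_equal(x, y)

def gt(x, y):         # x > y componentwise (positive when y = 0)
    return bool(np.all(x > y))

zero = np.zeros(3)
v = np.array([1.0, 0.0, 2.0])
print(geq(v, zero))   # True: v is nonnegative
print(semi(v, zero))  # True: v is semipositive
print(gt(v, zero))    # False: v is not positive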
The nonlinear programming problem

By using the notation introduced above, the nonlinear programming problem 1.1.1 to 1.1.3 can be rewritten in a slightly more general form as follows. Let X ⊂ R^n, and let g, h, and θ be respectively an m-dimensional vector function, a k-dimensional vector function, and a numerical function, all defined on X. Then the problem becomes this: Find an x̄, if such exists, such that

9    θ(x̄) = min {θ(x) | x ∈ X̄}    X̄ = {x | x ∈ X, g(x) ≦ 0, h(x) = 0}

The set X̄ is called the feasible region, x̄ the minimum solution, and θ(x̄) the minimum. All points x in the feasible region X̄ are referred to as feasible points or simply as feasible.

Another way of writing the same problem, which is quite common in the literature, is the following:

10    minimize θ(x)

subject to

11    g(x) ≦ 0
12    h(x) = 0, x ∈ X


We favor the more precise and brief designation 9 of the problem instead of 10 to 12. Notice that if we let X = R^n in the above problem, then we obtain the nonlinear programming problem 1.1.1 to 1.1.3.

If X = R^n and θ, g, and h are all linear functions on R^n, then problem 9 becomes a linear programming problem: Find an x̄, if such exists, such that

13    bx̄ = min {bx | Ax ≧ c, Bx = d}

where b, c, and d are given fixed vectors in R^n, R^m, and R^k respectively, and A and B are given fixed m × n and k × n matrices respectively. There exists a vast literature on the subject of linear programming [Dantzig 63, Gass 64, Hadley 62, Simmonard 66]. It should be remarked that problem 13 is equivalent to finding an x̄ such that

14    bx̄ = min {bx | Ax ≧ c, Bx ≧ d, −Bx ≧ −d}

When B and d are absent from this formulation, 14 becomes the standard dual form of the linear programming problem [Simmonard 66, p. 95].
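As an illustration of problem 13, here is a sketch that passes a small arbitrary instance to SciPy's linprog routine; note that linprog works with constraints of the form A_ub x ≦ b_ub, so Ax ≧ c must be negated.

import numpy as np
from scipy.optimize import linprog

b = np.array([1.0, 2.0])                         # objective vector
A, c = np.array([[1.0, 1.0]]), np.array([1.0])   # Ax >= c
B, d = np.array([[1.0, -1.0]]), np.array([0.0])  # Bx = d

# Ax >= c is rewritten as -Ax <= -c; x is left free via bounds.
res = linprog(b, A_ub=-A, b_ub=-c, A_eq=B, b_eq=d,
              bounds=(None, None))
print(res.x, res.fun)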


Chapter Two
Linear Inequalities and Theorems of the Alternative

1. Introduction

It was mentioned in Chap. 1 that the presence of inequality constraints in a minimization problem constitutes the distinguishing feature between the minimization problems of the classical calculus and those of nonlinear programming. Although our main interest lies in nonlinear problems, and hence in nonlinear inequalities, linearization (that is, approximating nonlinear constraints by linear ones) will be frequently resorted to. This will lead us to linear inequalities. It is the purpose of this chapter to establish some fundamental theorems for linear inequalities which will be used throughout this work. (Needless to say, these fundamental theorems also play a crucial role in linear programming. See for example [Gale 60, chap. 2].)

The type of theorem that will concern us in this chapter will involve two systems of linear inequalities and/or equalities, say systems I and II. A typical theorem of the alternative asserts that either system I has a solution, or that system II has a solution, but never both. The most famous theorem of this type is perhaps Farkas' theorem [Farkas 02, Tucker 56, Gale 60].

Farkas' theorem

For each fixed p × n matrix A and each fixed vector b in R^n, either

I    Ax ≦ 0, bx > 0 has a solution x ∈ R^n

or

II    A′y = b, y ≧ 0 has a solution y ∈ R^p

but never both.

Fig. 2.1.1 Geometric interpretation of Farkas' theorem: II′ has a solution, I′ has no solution.


We shall postpone a proof of Farkas' theorem until after we have
given a geometric interpretation of it and applied it to get the necessary
optimality conditions of linear programming.
To give a geometric interpretation of Farkas' theorem, we rewrite
I and II as follows

where A[f denotes the jih column of A' and Aj the jih row of A (see
Sec. 1.4). System II' requires that the vector b be a nonnegative linear
combination of the vectors AI to Ap. System I' requires that we find a
vector x G Rn that makes an obtuse angle (^7r/2) with the vectors AI
to Ap and a strictly acute angle (<ir/2) with b (see 1.3.11). Figure 2.1.1

Fig. 2.1.2 Geometric interpretation of Farkas'


theorem: I' has solution, II' has no
solution.
17


shows a simple case with n = 2, p = 3, in which II′ has a solution, and hence by Farkas' theorem I′ cannot have a solution. Figure 2.1.2 shows a case with n = 2, p = 2, in which II′ has no solution, and hence by Farkas' theorem I′ must have a solution.
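Farkas' theorem can also be explored numerically by treating system II as a linear feasibility problem. In the following sketch (an illustration with an arbitrary A and b, assuming SciPy), linprog serves as the feasibility oracle, and a solution of I is exhibited by hand when II fails.

import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 0.0], [0.0, 1.0]])  # the rows A_1, A_2 span a cone
b = np.array([-1.0, 0.0])               # b lies outside that cone

# II: A'y = b, y >= 0, posed as a feasibility problem (zero objective)
res = linprog(np.zeros(2), A_eq=A.T, b_eq=b, bounds=(0, None))
print("II solvable:", res.success)      # False for this choice of b

# Since II fails, Farkas' theorem guarantees a solution of I; here
# x = (-1, 0) gives Ax <= 0 and bx = 1 > 0.
x = np.array([-1.0, 0.0])
print(bool(np.all(A @ x <= 0)), float(b @ x) > 0)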

2. The optimality criteria of linear programming: An application of Farkas' theorem

As a typical example of the use and power of the theorems of the alternative we shall show how Farkas' theorem can be employed to derive necessary optimality conditions for the following linear programming problem: Find an x̄, if it exists, such that

1    bx̄ = min {bx | x ∈ X}    X = {x | x ∈ R^n, Ax ≧ c}

where b and c are given fixed vectors in R^n and R^m respectively, and A is a given fixed m × n matrix.

Optimality criteria of linear programming†

(Necessity) Let x̄ be a solution of the linear programming problem 1. Then there exists a ū ∈ R^m such that (x̄,ū) satisfy

3    Ax̄ ≧ c    (primal feasibility)
4    A′ū = b    (dual feasibility)
5    ū ≧ 0     (dual feasibility)
6    ūAx̄ = ūc   (complementarity)

(Sufficiency) If some x̄ ∈ R^n and some ū ∈ R^m satisfy 3 to 6, then x̄ solves 1.
PROOF

(Necessity) Let x̄ solve 1. Define the index sets P, Q, and M as follows:

P = {i | A_ix̄ = c_i}    Q = {i | A_ix̄ > c_i}    M = P ∪ Q = {1, . . . , m}

and assume that P and Q contain p and q elements, respectively (see Sec. 1.4 for the submatrix notation A_P).

If P = ∅, then since A_ix̄ > c_i for each i ∈ M, it follows that

Ax̄ ≧ c + δe

for some real number δ > 0, where e is an m-vector of ones. Then for each x ∈ R^n we can find a real number α > 0 such that

A(x̄ + αx) ≧ c

and hence x̄ + αx ∈ X. Since x̄ is the minimum solution of 1, we have that for each x ∈ R^n there exists an α > 0 such that

b(x̄ + αx) ≧ bx̄

Hence

bx ≧ 0    for each x ∈ R^n

which implies that b = 0.‡ By taking ū = 0 ∈ R^m, the relations 3 to 6 are satisfied because b = 0.

If P ≠ ∅, then we assert that the system

9    A_Px ≧ 0, bx < 0

has no solution x ∈ R^n. For if 9 did have a solution x̂, say, then αx̂ would also be a solution of 9 for each α > 0. Now consider the point x̄ + αx̂, where x̂ is a solution of 9 and α > 0. Then

10    A_P(x̄ + αx̂) = c_P + αA_Px̂ ≧ c_P
11    b(x̄ + αx̂) = bx̄ + αbx̂ < bx̄
12    A_Q(x̄ + αx̂) ≧ c_Q + δe + αA_Qx̂ ≧ c_Q

where in 12 the first inequality follows by defining e as a q-vector of ones and δ > 0 as a real number satisfying A_Qx̄ ≧ c_Q + δe, and the second inequality of 12 holds for all sufficiently small α > 0. But relations 10 to 12 imply that x̄ + αx̂ ∈ X and b(x̄ + αx̂) < bx̄, which contradicts the assumption that x̄ is a solution of 1. Hence 9 has no solution x ∈ R^n, and by Farkas' theorem 2.1.1 the system

‡ To see this take, for a fixed i ∈ {1, 2, . . . , n}, x_i = −b_i and x_j = 0 for j ≠ i; then bx ≧ 0 implies that −(b_i)² ≧ 0. Hence b_i = 0. Repeating this process for each i, we get b = 0.


A_P′y = b, y ≧ 0

must have a solution ȳ ∈ R^p. If we let ū_P = ȳ and ū_Q = 0 ∈ R^q, we have then

14    A_P′ū_P = b, ū_P ≧ 0
15    ū_Q = 0

By defining ū ∈ R^m such that ū = (ū_P,ū_Q), condition 14 becomes conditions 4 and 5, and condition 15 becomes condition 6. Condition 3 holds because x̄ ∈ X. This completes the necessity proof.

(Sufficiency) Let x̄ ∈ R^n and ū ∈ R^m satisfy 3 to 6, and let x ∈ X. By 3 we have that x̄ ∈ X. Now

bx = ūAx ≧ ūc = ūAx̄ = bx̄

where the equalities follow from 4 and 6 and the inequality from 5 and Ax ≧ c. Hence x̄ solves 1.

We remark that Farkas' theorem was used only in establishing the necessity of the conditions 3 to 6, whereas the sufficiency of the conditions 3 to 6 required only elementary arguments. This is the typical situation in establishing optimality criteria in mathematical programming in general. Necessity requires some fairly sophisticated mathematical tool, such as a theorem of the alternative or a separation theorem for convex sets (see Chap. 3), whereas sufficiency can be established merely by using elementary manipulation of inequalities.
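The criteria 3 to 6 can be checked numerically on a small instance by solving the primal problem 1 and its dual with a linear programming routine; the sketch below assumes SciPy and uses an arbitrary choice of A, b, and c.

import numpy as np
from scipy.optimize import linprog

b = np.array([1.0, 1.0])
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
c = np.array([1.0, 1.0, 3.0])

primal = linprog(b, A_ub=-A, b_ub=-c, bounds=(None, None))
dual = linprog(-c, A_eq=A.T, b_eq=b, bounds=(0, None))  # max cu
x, u = primal.x, dual.x

print(bool(np.all(A @ x >= c - 1e-9)))                   # 3: primal feasibility
print(np.allclose(A.T @ u, b), bool(np.all(u >= 0)))     # 4, 5: dual feasibility
print(np.isclose(u @ (A @ x), u @ c))                    # 6: complementarity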
The remaining sections of this chapter will be devoted to obtaining
a rather comprehensive set of theorems of the alternative of the Farkas
type. We shall follow Tucker [Tucker 56] in establishing these results.
We begin first in the next section with some existence theorems for linear
inequalities, from which all the theorems of the alternative follow.
Problem
Consider the following general form of the linear programming
30

Linear Inequalities and Theorems of the Alternative

1.3

problem: Find x G Rn> y E R*, if they exist, such that

where 6, a, c, and d are given vectors in Rn, R', Rm, and Rk respectively,
and A, D, B, and E are given m X n, m X t, k X n, and A; X ^ matrices
respectively. Show that a necessary and sufficient condition for (x,y) to
solve the above problem is that (x,y) and some u G #w and v Rk
satisfy the following conditions:

(Hint: Replace the equality constraint


Bx + Ey = d

by
z + y ^ d

and

-5a: - Ey ^ -d
and use Theorem 2.}

3. Existence theorems for linear systems

We establish now some key theorems for the existence of certain types of solutions for linear systems and begin with a lemma due to Tucker [Tucker 56].

Tucker's lemma

For any given p × n matrix A, the systems

I    Ax ≧ 0

and

II    A′y = 0, y ≧ 0

possess solutions x̄ and ȳ satisfying

A_1x̄ + ȳ_1 > 0

PROOF The proof is by induction on p. For p = 1, if A_1 = 0, take ȳ_1 = 1, x̄ = 0; if A_1 ≠ 0, take ȳ_1 = 0, x̄ = A_1.

Now assume that the lemma is true for a matrix A of p rows and proceed to prove it for a matrix Ā of p + 1 rows, obtained from A by adjoining a row A_{p+1}. By applying the lemma to A, we get x̄, ȳ satisfying

4    Ax̄ ≧ 0    A′ȳ = 0    ȳ ≧ 0    A_1x̄ + ȳ_1 > 0

If A_{p+1}x̄ ≧ 0, we take

w = x̄    u = (ȳ,0)

Then

Āw ≧ 0    Ā′u = 0    u ≧ 0    A_1w + u_1 > 0

which extends the lemma to Ā.

However, if A_{p+1}x̄ < 0, we apply the lemma a second time to the p × n matrix B:

5    B_i = A_i + λ_iA_{p+1}    i = 1, . . . , p

where

6    λ_i = −A_ix̄/A_{p+1}x̄ ≧ 0

So

B_ix̄ = A_ix̄ + λ_iA_{p+1}x̄ = 0    or    Bx̄ = 0

This second use of the lemma yields v̄, ū satisfying

7    Bv̄ ≧ 0    B′ū = 0    ū ≧ 0    B_1v̄ + ū_1 > 0

Let

10    w = v̄ − (A_{p+1}v̄/A_{p+1}x̄)x̄

It follows from 5 and 7 that

11    A_{p+1}w = 0

then

A_iw = B_iw − λ_iA_{p+1}w = B_iw = B_iv̄ ≧ 0    i = 1, . . . , p

and

A_1w + ū_1 = (B_1 − λ_1A_{p+1})w + ū_1 = B_1w + ū_1 = B_1v̄ + ū_1 > 0

where the last inequality follows from 7, and the preceding equalities from 10, 11, 6, and 5. Finally from 4, 6, and 7 we have

Ā′(ū, Σ_iλ_iū_i) = B′ū = 0    (ū, Σ_iλ_iū_i) ≧ 0

These relations extend the lemma to Ā.



From Tucker's lemma two important existence theorems follow. These theorems assert the existence of solutions of two linear systems that have a certain positivity property.

First existence theorem [Tucker 56]

For any given p × n matrix A, the systems

I    Ax ≧ 0

and

II    A′y = 0, y ≧ 0

possess solutions x̄ and ȳ satisfying

Ax̄ + ȳ > 0

PROOF In Tucker's lemma the row A_1 played a special role. By renumbering the rows of A, any other row, say A_i, could have played the same role. Hence, by Tucker's lemma 1, there exist x^i ∈ R^n, y^i ∈ R^p, i = 1, . . . , p, such that

15    Ax^i ≧ 0    A′y^i = 0    y^i ≧ 0    A_ix^i + y_i^i > 0

Define

x̄ = x^1 + ⋯ + x^p    ȳ = y^1 + ⋯ + y^p

Hence by 15, we have that

Ax̄ ≧ 0    A′ȳ = 0    ȳ ≧ 0

and for i = 1, . . . , p,

A_ix̄ + ȳ_i ≧ A_ix^i + y_i^i > 0

Fig. 2.3.1 Geometric interpretation of Theorem 14, n = 2, p = 3.

or

Ax̄ + ȳ > 0

A geometric interpretation of the above theorem can be given in the space of the rows A_i ∈ R^n, i = 1, . . . , p. The theorem states that one of the following alternatives must occur for any given p × n matrix A (a computational sketch follows the list):

(a) There exists a vector x̄ which makes a strict acute angle (< π/2) with all the rows A_i, Fig. 2.3.1a, or
(b) There exists a vector x̄ which makes an acute angle (≦ π/2) with all the rows A_i, and the origin 0 can be expressed as a nonnegative linear combination of the rows A_i with positive weights assigned to the rows A_i that are orthogonal to x̄, Fig. 2.3.1b, or
(c) The origin can be expressed as a positive linear combination of the rows A_i, Fig. 2.3.1c.
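The theorem is constructive enough to verify computationally. The sketch below (assuming SciPy; the matrix A is arbitrary) builds x̄ and ȳ row by row, in the spirit of the proof: for each i it looks for a solution of I that is strict in row i, and falls back on a solution of II that is positive in component i.

import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])
p, n = A.shape
xbar, ybar = np.zeros(n), np.zeros(p)

for i in range(p):
    # feasibility of  Ax >= 0, A_i x >= 1  (zero objective)
    lhs = -np.vstack([A, A[i]])
    rhs = -np.concatenate([np.zeros(p), [1.0]])
    res = linprog(np.zeros(n), A_ub=lhs, b_ub=rhs, bounds=(None, None))
    if res.success:
        xbar += res.x
    else:
        # feasibility of  A'y = 0, y >= 0, y_i >= 1
        lb = np.zeros(p); lb[i] = 1.0
        res = linprog(np.zeros(p), A_eq=A.T, b_eq=np.zeros(n),
                      bounds=list(zip(lb, [None] * p)))
        ybar += res.x

print(A @ xbar + ybar)   # strictly positive componentwise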
By letting the matrix A of the previous theorem have a special structure, a second existence theorem can be easily established.

Second existence theorem [Tucker 56]

Let A and B be given p_1 × n and p_2 × n matrices, with A nonvacuous. Then the systems

I    Ax ≧ 0, Bx = 0

and

II    A′y_1 + B′y_2 = 0, y_1 ≧ 0

possess solutions x̄ ∈ R^n, ȳ_1 ∈ R^{p_1}, ȳ_2 ∈ R^{p_2} satisfying

Ax̄ + ȳ_1 > 0


PROOF We remark first that the requirement that A be nonvacuous is merely to ensure that the statement of the theorem is nonvacuous, that is, without a matrix A the theorem says nothing.

We apply Theorem 14 to the systems

Ax ≧ 0, Bx ≧ 0, −Bx ≧ 0

and

A′y_1 + B′z_1 − B′z_2 = 0, y_1 ≧ 0, z_1 ≧ 0, z_2 ≧ 0

and obtain x̄, ȳ_1, z̄_1, z̄_2 satisfying

Ax̄ + ȳ_1 > 0    Bx̄ + z̄_1 > 0    −Bx̄ + z̄_2 > 0

Define now ȳ_2 = z̄_1 − z̄_2. We have then that x̄, ȳ_1, ȳ_2 satisfy

Ax̄ ≧ 0    Bx̄ = 0    A′ȳ_1 + B′ȳ_2 = 0    ȳ_1 ≧ 0    Ax̄ + ȳ_1 > 0

Corollary

Let A, B, C, and D be given p_1 × n, p_2 × n, p_3 × n, and p_4 × n matrices, with A, B, or C nonvacuous. Then the systems

I    Ax ≧ 0, Bx ≧ 0, Cx ≧ 0, Dx = 0

and

II    A′y_1 + B′y_2 + C′y_3 + D′y_4 = 0, y_1 ≧ 0, y_2 ≧ 0, y_3 ≧ 0

possess solutions x̄, ȳ_1, ȳ_2, ȳ_3, ȳ_4 satisfying

Ax̄ + ȳ_1 > 0    Bx̄ + ȳ_2 > 0    Cx̄ + ȳ_3 > 0

This corollary to Theorem 17 will be used to derive the most general types of the theorems of the alternative of the next section.


4. Theorems of the alternative

In this section we shall be concerned with establishing a series of theorems relating to the certain occurrence of one of two mutually exclusive events. The two events, denoted by I and II, will be the existence of solutions of two related systems of linear inequalities and/or equalities.† The prototype of the theorem of the alternative can be stated as follows: Either I or II, but never both. If we let Ī denote the nonoccurrence of I and similarly for ĪĪ, then we can state the following.

† Occasionally we shall also refer to the systems of inequalities and equalities themselves as systems I and II.

Typical theorem of the alternative

I ⟺ ĪĪ    or equivalently    Ī ⟺ II

TYPICAL PROOF

I ⟹ ĪĪ    (or equivalently II ⟹ Ī)

and

Ī ⟹ II    (or equivalently ĪĪ ⟹ I)

The proof that I ⟹ ĪĪ is usually quite elementary, but the proof that Ī ⟹ II utilizes the existence theorems of the previous section.

In the theorems to follow, certain obvious consistency conditions will not be stated explicitly for the sake of brevity. For example, it will be understood that certain matrices must have the same number of rows, that the dimensionality of certain vectors must be the same as the number of columns in certain matrices, etc.

We begin now by establishing a fairly general theorem of the alternative due to Slater [Slater 51].

Slater's theorem of the alternative [Slater 51]

Let A, B, C, and D be given matrices, with A and B being nonvacuous. Then either

I    Ax > 0, Bx ≥ 0, Cx ≧ 0, Dx = 0 has a solution x

or


II    A′y_1 + B′y_2 + C′y_3 + D′y_4 = 0, and
      y_1 ≥ 0, y_2 ≧ 0, y_3 ≧ 0  or  y_1 ≧ 0, y_2 > 0, y_3 ≧ 0
      has a solution y_1, y_2, y_3, y_4

but never both.

PROOF (I ⟹ ĪĪ) By assumption, I holds. We will now show that if II also holds, then a contradiction ensues. If both I and II hold, then we would have x̄, ȳ_1, ȳ_2, ȳ_3, ȳ_4 such that

x̄A′ȳ_1 + x̄B′ȳ_2 + x̄C′ȳ_3 + x̄D′ȳ_4 > 0

because x̄D′ȳ_4 = 0, x̄C′ȳ_3 ≧ 0, and either x̄B′ȳ_2 ≧ 0 and x̄A′ȳ_1 > 0, or x̄B′ȳ_2 > 0 and x̄A′ȳ_1 ≧ 0. But now we have a contradiction to the first equality of II. Hence I and II cannot hold simultaneously. Thus, I ⟹ ĪĪ.

(Ī ⟹ II) If I has no solution, consider the solutions x̄, ȳ_1, ȳ_2, ȳ_3, ȳ_4 of Corollary 2.3.18, which satisfy Ax̄ ≧ 0, Bx̄ ≧ 0, Cx̄ ≧ 0, Dx̄ = 0 and Ax̄ + ȳ_1 > 0, Bx̄ + ȳ_2 > 0. Since I has no solution, either Ax̄ ≯ 0, in which case ȳ_1 ≥ 0, or Ax̄ > 0 and Bx̄ = 0, in which case ȳ_2 > 0. In either case II holds. Thus, Ī ⟹ II.

We remark that in the above proof, the requirement that both A and B be nonvacuous was used essentially in establishing the fact that I ⟹ ĪĪ. Corollary 18, which was used to prove that Ī ⟹ II, can handle systems in which merely A or B is nonvacuous. By slightly modifying the above proof, the cases B vacuous and A vacuous lead respectively to Motzkin's theorem of the alternative (or transposition theorem, as Motzkin called it) [Motzkin 36] and Tucker's theorem of the alternative [Tucker 56].
Motzkin's theorem of the alternative [Motzkin 36]

Let A, C, and D be given matrices, with A being nonvacuous. Then

either

I    Ax > 0, Cx ≧ 0, Dx = 0 has a solution x

or

II    A′y_1 + C′y_3 + D′y_4 = 0, y_1 ≥ 0, y_3 ≧ 0 has a solution y_1, y_3, y_4

but never both.

PROOF (I ⟹ ĪĪ) If both I and II hold, then we would have x̄, ȳ_1, ȳ_3, ȳ_4 such that

x̄A′ȳ_1 + x̄C′ȳ_3 + x̄D′ȳ_4 > 0

because x̄D′ȳ_4 = 0, x̄C′ȳ_3 ≧ 0, and x̄A′ȳ_1 > 0. But now we have a contradiction to the first equality of II. Hence, I and II cannot hold simultaneously. Thus, I ⟹ ĪĪ.

(Ī ⟹ II) If I has no solution, then for the solutions x̄, ȳ_1, ȳ_3, ȳ_4 of Corollary 2.3.18 (with B vacuous), which satisfy Ax̄ ≧ 0, Cx̄ ≧ 0, Dx̄ = 0 and Ax̄ + ȳ_1 > 0, we must have Ax̄ ≯ 0, and hence ȳ_1 ≥ 0. Thus, Ī ⟹ II.

Tucker's theorem of the alternative [Tucker 56]

Let B, C, and D be given matrices, with B being nonvacuous. Then either

I    Bx ≥ 0, Cx ≧ 0, Dx = 0 has a solution x

or

II    B′y_2 + C′y_3 + D′y_4 = 0, y_2 > 0, y_3 ≧ 0 has a solution y_2, y_3, y_4

but never both.

The proof is similar to the proof of Motzkin's theorem 2. (The reader is urged to go through the proof himself here.)

Slater [Slater 51] considered his theorem as the one providing the most general system I possible, because it involved all the ordering relations >, ≥, ≧, =. Similarly we can derive another theorem which involves the most general system II possible, in which y_1 > 0, y_2 > 0, y_3 ≧ 0, and y_4 is unrestricted.


Theorem of the alternative

Let A, B, C, and D be given matrices, with A and B being nonvacuous. Then either

I    Ax > 0, Bx ≧ 0, Cx ≧ 0, Dx = 0
     or
     Ax ≧ 0, Bx > 0, Cx ≧ 0, Dx = 0    has a solution x

or

II    A′y_1 + B′y_2 + C′y_3 + D′y_4 = 0, y_1 > 0, y_2 > 0, y_3 ≧ 0 has a solution y_1, y_2, y_3, y_4

but never both.

PROOF (I ⟹ ĪĪ) If both I and II hold, then we would have x̄, ȳ_1, ȳ_2, ȳ_3, ȳ_4 satisfying

x̄A′ȳ_1 + x̄B′ȳ_2 + x̄C′ȳ_3 + x̄D′ȳ_4 > 0

because x̄D′ȳ_4 = 0, x̄C′ȳ_3 ≧ 0, and either x̄B′ȳ_2 ≧ 0 and x̄A′ȳ_1 > 0, or x̄B′ȳ_2 > 0 and x̄A′ȳ_1 ≧ 0. But now we have a contradiction to the first equality of II. Hence, I and II cannot hold simultaneously. Thus, I ⟹ ĪĪ.

(ĪĪ ⟹ I)  ĪĪ ⟹ {A′y_1 + B′y_2 + C′y_3 + D′y_4 = 0, y_1 > 0, y_2 > 0, y_3 ≧ 0 has no solution} ⟹ I (by Corollary 18).

We remark that if either A or B is vacuous, then we revert to Tucker's theorem 3 or Motzkin's theorem 2.

We remark further that in all of the above theorems of the alternative the systems I are all homogeneous. Hence, by defining z = −x, the system I of, say, Slater's theorem 1 can be replaced by

I′    Az < 0, Bz ≤ 0, Cz ≦ 0, Dz = 0 has a solution z

Similar remarks apply to theorems 2 through 4.


The above theorems of the alternative subsume in essence all other theorems of this type. We derive below some of these theorems directly from the above ones.

Gordan's theorem [Gordan 73]

For each given matrix A, either

I    Ax > 0 has a solution x

or

II    A′y = 0, y ≥ 0 has a solution y

but never both.

PROOF Follows directly from Motzkin's theorem 2, by deleting the matrices C and D.

Geometrically we may interpret Gordan's theorem as follows. Either there exists a vector x which makes a strict acute angle (< π/2) with all the row vectors of A, Fig. 2.4.1a, or the origin can be expressed as a nontrivial, nonnegative linear combination of the rows of A, Fig. 2.4.1b.
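Gordan's alternative lends itself to a numerical check: maximize the margin t in Ax ≧ te. The sketch below is an illustration assuming SciPy; the normalization ey = 1 in the second problem is added here merely to force y ≠ 0.

import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, -1.0], [-1.0, 1.0]])   # rows point oppositely
p, n = A.shape

# variables (x, t): maximize t  s.t.  Ax - t e >= 0,  t <= 1
c = np.zeros(n + 1); c[-1] = -1.0          # linprog minimizes, so use -t
A_ub = np.hstack([-A, np.ones((p, 1))])    # -Ax + t e <= 0
res = linprog(c, A_ub=A_ub, b_ub=np.zeros(p),
              bounds=[(None, None)] * n + [(None, 1.0)])

if res.x[-1] > 1e-9:
    print("I holds, x =", res.x[:n])
else:
    # II: A'y = 0, y >= 0, with ey = 1 so that y is semipositive
    res = linprog(np.zeros(p), A_eq=np.vstack([A.T, np.ones(p)]),
                  b_eq=np.concatenate([np.zeros(n), [1.0]]),
                  bounds=(0, None))
    print("II holds, y =", res.x)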
Farkas' theorem [Farkas 02]

For each given p × n matrix A and each given vector b in R^n, either

I    Ax ≦ 0, bx > 0 has a solution x ∈ R^n

or

II    A′y = b, y ≧ 0 has a solution y ∈ R^p

but never both.

PROOF By Motzkin's theorem 2, either I holds or

II′    bη − A′y_3 = 0, η ≥ 0, y_3 ≧ 0 must have a solution η ∈ R and y_3 ∈ R^p

Fig. 2.4.1 Geometric interpretation of Gordan's theorem.

but not both. Since η ∈ R, η ≥ 0 means η > 0. Dividing through by η and letting y = y_3/η, we have that II′ is equivalent to II.

Stiemke's theorem [Stiemke 15]

For each given matrix B, either

I    Bx ≥ 0 has a solution x

or

II    B′y = 0, y > 0 has a solution y

but never both.

PROOF Follows directly from Tucker's theorem 3, by deleting C and D.

Nonhomogeneous Farkas' theorem [Duffin 56]

For a given p × n matrix A, given vectors b ∈ R^n, c ∈ R^p, and a given scalar β, either

I    bx > β, Ax ≦ c has a solution x ∈ R^n

or

II    A′y = b, cy ≦ β, y ≧ 0
      or
      A′y = 0, cy < 0, y ≧ 0    has a solution y ∈ R^p

but never both.

PROOF The system I is equivalent to

I′    bx − βζ > 0, ζ > 0, cζ − Ax ≧ 0 has a solution x ∈ R^n, ζ ∈ R

By Motzkin's theorem 2, then, either I′ holds or

II′    bη_1 − A′y_3 = 0, −βη_1 + η_2 + cy_3 = 0, (η_1,η_2) ≥ 0, y_3 ≧ 0 has a solution

but not both.


Now we have either η_1 > 0 or η_1 = 0 (and hence η_2 > 0). By defining y = y_3/η_1 in the first case and y = y_3 in the second, we have that II′ is equivalent to

II″    A′y = b, cy ≦ β, y ≧ 0    or    A′y = 0, cy < 0, y ≧ 0

II″ is equivalent to II.

Gale's theorem for linear equalities [Gale 60]

For a given p × n matrix A and given vector c ∈ R^p, either

I    Ax = c has a solution x ∈ R^n

or

II    A′y = 0, cy = 1 has a solution y ∈ R^p

but never both.

PROOF The system I is equivalent to

I′    ζ > 0, −cζ + Ax = 0 has a solution ζ ∈ R, x ∈ R^n

By Motzkin's theorem 2, either I′ holds or

II′    A′y_4 = 0, y_1 − cy_4 = 0, y_1 ≥ 0 has a solution y_1 ∈ R, y_4 ∈ R^p

but not both. Since y_1 ∈ R, y_1 ≥ 0 means y_1 > 0. By defining y = y_4/y_1, II′ is equivalent to II.

Gale's theorem for linear inequalities (≦) [Gale 60]

For a given p × n matrix A and given vector c ∈ R^p, either

I    Ax ≦ c has a solution x ∈ R^n

or

II    A′y = 0, cy = −1, y ≧ 0 has a solution y ∈ R^p

but never both.

PROOF The system I is equivalent to

I′    ζ > 0, cζ − Ax ≧ 0 has a solution ζ ∈ R, x ∈ R^n

By Motzkin's theorem 2, either I′ holds or

II′    −A′y_3 = 0, y_1 + cy_3 = 0, y_1 ≥ 0, y_3 ≧ 0 has a solution y_1 ∈ R, y_3 ∈ R^p

but not both. By defining y = y_3/y_1, II follows from II′.


Table 2.4.1 Theorems of the Alternative†

1 (Slater)    I: Ax > 0, Bx ≥ 0, Cx ≧ 0, Dx = 0 (A and B nonvacuous)
              II: A′y_1 + B′y_2 + C′y_3 + D′y_4 = 0,
                  y_1 ≥ 0, y_2 ≧ 0, y_3 ≧ 0  or  y_1 ≧ 0, y_2 > 0, y_3 ≧ 0

2 (Motzkin)   I: Ax > 0, Cx ≧ 0, Dx = 0 (A nonvacuous)
              II: A′y_1 + C′y_3 + D′y_4 = 0, y_1 ≥ 0, y_3 ≧ 0

3 (Tucker)    I: Bx ≥ 0, Cx ≧ 0, Dx = 0 (B nonvacuous)
              II: B′y_2 + C′y_3 + D′y_4 = 0, y_2 > 0, y_3 ≧ 0

4             I: Ax > 0, Bx ≧ 0, Cx ≧ 0, Dx = 0
                 or  Ax ≧ 0, Bx > 0, Cx ≧ 0, Dx = 0 (A and B nonvacuous)
              II: A′y_1 + B′y_2 + C′y_3 + D′y_4 = 0, y_1 > 0, y_2 > 0, y_3 ≧ 0

5 (Gordan)    I: Ax > 0
              II: A′y = 0, y ≥ 0

6 (Farkas)    I: bx > 0, Ax ≦ 0
              II: A′y = b, y ≧ 0

7 (Stiemke)   I: Bx ≥ 0
              II: B′y = 0, y > 0

8 (Nonhomogeneous Farkas)
              I: bx > β, Ax ≦ c
              II: A′y = b, cy ≦ β, y ≧ 0  or  A′y = 0, cy < 0, y ≧ 0

9 (Gale)      I: Ax = c
              II: A′y = 0, cy = 1

10 (Gale)     I: Ax ≦ c
              II: A′y = 0, cy = −1, y ≧ 0

11            I: Ax < c
              II: A′y = 0, cy = −1, y ≧ 0  or  A′y = 0, cy ≦ 0, y ≥ 0

† No "or" appearing in the above table and in Problems 2.4.12 to 2.4.17 is an exclusive "or."



Theorem for linear inequalities (<)

For a given p × n matrix A and a given vector c ∈ R^p, either

I    Ax < c has a solution x ∈ R^n

or

II    A′y = 0, cy = −1, y ≧ 0
      or
      A′y = 0, cy ≦ 0, y ≥ 0    has a solution y ∈ R^p

but never both.

PROOF The system I is equivalent to

I′    ζ > 0, cζ − Ax > 0 has a solution ζ ∈ R, x ∈ R^n

By Slater's theorem 1, either I′ holds or

II′    −A′y_1 = 0, cy_1 + y_2 = 0, with y_1 ≥ 0, y_2 ≧ 0 or y_1 ≧ 0, y_2 > 0, has a solution y_1 ∈ R^p, y_2 ∈ R

but not both. For the case when y_1 ≥ 0, y_2 ≧ 0, we set y = y_1, and for the case when y_1 ≧ 0, y_2 > 0, we set y = y_1/y_2; then II is equivalent to II′.

In Table 2.4.1 above we give a convenient summary of all the above theorems of the alternative.
Problems

By using any of the above theorems 1 to 11, establish the validity of six further theorems of the alternative (12 to 17) of the same form: Either I holds, or II holds, but never both.

Mnemonic hint

In all the theorems of the alternative 1 to 17 above, which involve homogeneous inequalities and/or homogeneous equalities, the following correspondence between the ordering relations >, ≥, ≧, = occurs:

Orderings appearing in I    Orderings appearing in II

The asterisks indicate ordering relations which must be present in order for the correspondence to hold. The arrows indicate the direction in which the correspondence is valid; for example, an arrow indicates that, starting with the relations at the unpointed end of the arrow, the corresponding relations are those at the pointed end of the arrow.
Problem

Establish Motzkin's theorem 2 by starting with Farkas' theorem 6. (Hint: Note that Ax > 0, Cx ≧ 0, Dx = 0 has a solution if and only if Ax ≧ e, Cx ≧ 0, Dx = 0 has a solution, where e is a vector of ones, and use Farkas' theorem.)

Chapter Three
Convex Sets in R^n

The purpose of this chapter is to introduce the fundamental concept of convex sets, to describe some properties of these sets, and to derive the basic separation theorems for convex sets. These separation theorems are the foundations on which many optimality conditions of nonlinear programming rest.

1. Convex sets and their properties

In order to define the concept of a convex set, we begin by defining lines and line segments through two points in R^n.

Line

Let x^1, x^2 ∈ R^n. The line through x^1 and x^2 is defined as the set

{x | x = (1 − λ)x^1 + λx^2, λ ∈ R}

or equivalently

{x | x = p_1x^1 + p_2x^2, p_1, p_2 ∈ R, p_1 + p_2 = 1}

If we rewrite the first definition in the equivalent form

{x | x = x^1 + λ(x^2 − x^1), λ ∈ R}

and consider the case when x ∈ R², it becomes obvious that the vector equation x = x^1 + λ(x^2 − x^1) is the parametric equation of elementary analytic geometry of the line through x^1 and x^2, Fig. 3.1.1.


Line segments

Let x^1, x^2 ∈ R^n. We define the following line segments joining x^1 and x^2:

(i) Closed line segment [x^1,x^2] = {x | x = (1 − λ)x^1 + λx^2, 0 ≦ λ ≦ 1}
(ii) Open line segment (x^1,x^2) = {x | x = (1 − λ)x^1 + λx^2, 0 < λ < 1}
(iii) Closed-open line segment [x^1,x^2) = {x | x = (1 − λ)x^1 + λx^2, 0 ≦ λ < 1}
(iv) Open-closed line segment (x^1,x^2] = {x | x = (1 − λ)x^1 + λx^2, 0 < λ ≦ 1}

Obviously [x^1,x^2] is the portion of the straight line through x^1 and x^2 which lies between and includes the points x^1 and x^2, Fig. 3.1.1. (x^1,x^2) does not include x^1 or x^2, [x^1,x^2) does not include x^2, and (x^1,x^2] does not include x^1.

Convex set

A set Γ ⊂ R^n is a convex set if the closed line segment† joining every two points of Γ is in Γ. Equivalently we have that a set Γ ⊂ R^n is convex if

3    x^1, x^2 ∈ Γ, 0 ≦ λ ≦ 1  ⟹  (1 − λ)x^1 + λx^2 ∈ Γ

Figure 3.1.2 depicts some convex sets in R², and Fig. 3.1.3 some nonconvex sets in R². It follows from 3 that R^n itself is convex, that the empty set is convex, and that all sets consisting each of one point are convex.

The subsets of R^n defined below in 4, 5, and 6 are all convex sets in R^n. This can be easily established by a direct verification of the definition 3 of a convex set.

† It is obvious that the definition of a convex set would be unchanged if any of the other line segments defined in 2 were used here instead of the closed line segment.

Fig. 3.1.1 Line and line segment through x^1 and x^2.

Fig. 3.1.2 Convex sets.

Halfspace
Let c ∈ Rⁿ, c ≠ 0, and α ∈ R. Then the set {x | x ∈ Rⁿ, cx < α}
is an open halfspace in Rⁿ, and the set {x | x ∈ Rⁿ, cx ≤ α} is a closed
halfspace in Rⁿ. (Both halfspaces are convex sets.)
Plane
Let c ∈ Rⁿ, c ≠ 0, and α ∈ R. Then the set {x | x ∈ Rⁿ, cx = α}
is called a plane in Rⁿ. (Each plane in Rⁿ is a convex set.)
Subspace
A set Γ ⊂ Rⁿ is a subspace if

    x¹, x² ∈ Γ, λ₁, λ₂ ∈ R  ⇒  λ₁x¹ + λ₂x² ∈ Γ

Each subspace of Rⁿ contains the origin and is a convex set. The
subspaces of R³ consist of ∅, R³, the origin, and all straight lines and
planes passing through the origin.
Problem
(i) Show that each open or closed ball

    Bε(x̄) = {x | x ∈ Rⁿ, ‖x − x̄‖ < ε}    B̄ε(x̄) = {x | x ∈ Rⁿ, ‖x − x̄‖ ≤ ε}

around a point x̄ ∈ Rⁿ is a convex set. (Hint: Use the triangle inequality
1.3.10 in the form ‖x + y‖ ≤ ‖x‖ + ‖y‖.)
(ii) Show that the interior of a convex set is convex.

Fig. 3.1.3 Nonconvex sets.
Vertex
Let Γ be a convex set in Rⁿ. Each x ∈ Γ for which there exist no
two distinct x¹, x² ∈ Γ different from x such that x ∈ [x¹,x²] is called a
vertex of Γ (or an extreme point of Γ).
A convex set Γ ⊂ Rⁿ may have no vertices (for example the plane
{x | x ∈ Rⁿ, cx = α} and the open ball Bλ(x̄) have no vertices), a finite
number of vertices (for example the set {x | x ∈ Rⁿ, x ≥ 0, ex = 1},
where e is an n-vector of ones, has the n vertices eⁱ, i = 1, . . . , n,
where eⁱ is an n-vector with eᵢⁱ = 1 and eⱼⁱ = 0, j ≠ i), or an infinite
number of vertices (for example the closed ball B̄λ(x̄) ⊂ Rⁿ has an
infinite number of vertices given by {x | x ∈ Rⁿ, ‖x − x̄‖ = λ}).
Theorem
If (Γᵢ)ᵢ∈I is a family (finite or infinite) of convex sets in Rⁿ, then their
intersection ∩ᵢ∈I Γᵢ is a convex set.

PROOF Let x¹, x² ∈ ∩ᵢ∈I Γᵢ, and let 0 ≤ λ ≤ 1. Then for each i ∈ I,
x¹, x² ∈ Γᵢ, and since Γᵢ is convex, (1 − λ)x¹ + λx² ∈ Γᵢ.


Polytope and polyhedron
A set in Rⁿ which is the intersection of a finite number of closed
halfspaces in Rⁿ is called a polytope. If a polytope is bounded (that is,
for each x in the polytope ‖x‖ ≤ α for some fixed α ∈ R), it is called a
polyhedron.
It follows from the convexity of the halfspaces 4 and Theorem 9
that polytopes and polyhedra are convex sets.
Convex combination
A point b ∈ Rⁿ is said to be a convex combination of the vectors
a¹, . . . , aᵐ ∈ Rⁿ if there exist m real numbers p₁, . . . , pₘ such that

    b = p₁a¹ + ··· + pₘaᵐ,   p₁, . . . , pₘ ≥ 0,   p₁ + ··· + pₘ = 1

Equivalently, if we define an m × n matrix A whose ith row is Aᵢ = aⁱ,
and if we let p = (p₁, . . . ,pₘ) ∈ Rᵐ and e be an m-vector of ones, then
we have that b is a convex combination of the rows of A if

    (b = A′p, p ≥ 0, ep = 1)  has a solution p ∈ Rᵐ

Note that if b is a convex combination of two points a¹, a² ∈ Rⁿ,
then this is equivalent to saying that b ∈ [a¹,a²] (see 2).
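The equivalent form above is a linear feasibility problem, so whether a
given b is a convex combination of the rows of A can be decided by a
linear program with zero objective. A minimal Python sketch of this
test, assuming scipy is available (the helper name is ours):

    import numpy as np
    from scipy.optimize import linprog

    def is_convex_combination(b, A):
        # Feasibility of A'p = b, ep = 1, p >= 0, posed as an LP with
        # zero objective; res.success reports whether a solution exists.
        m, _ = A.shape
        A_eq = np.vstack([A.T, np.ones((1, m))])   # stack A' and the row e
        b_eq = np.append(b, 1.0)
        res = linprog(c=np.zeros(m), A_eq=A_eq, b_eq=b_eq,
                      bounds=[(0, None)] * m)
        return res.success

    A = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # rows a1, a2, a3
    print(is_convex_combination(np.array([0.25, 0.25]), A))  # True
    print(is_convex_combination(np.array([1.0, 1.0]), A))    # False

The same feasibility test reappears below: it decides membership in the
convex hull of finitely many points.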
Simplex
Let x, x¹, . . . , xᵐ be m + 1 distinct points in Rⁿ, with m ≤ n.
If the vectors x¹ − x, . . . , xᵐ − x are linearly independent, then the
set of all convex combinations of x, x¹, . . . , xᵐ
is called an m-simplex in Rⁿ with vertices x, x¹, . . . , xᵐ. (A 0-simplex
is a point, a 1-simplex is a closed line segment, a 2-simplex is a triangle,
and a 3-simplex is a tetrahedron.)
Theorem
A set Γ ⊂ Rⁿ is convex if and only if for each integer m ≥ 1, every
convex combination of any m points of Γ is in Γ. Equivalently, a necessary
and sufficient condition for the set Γ to be convex is that for each integer m ≥ 1

    x¹, . . . , xᵐ ∈ Γ, p₁, . . . , pₘ ≥ 0, p₁ + ··· + pₘ = 1
        ⇒  p₁x¹ + ··· + pₘxᵐ ∈ Γ

PROOF The sufficiency of 14 is trivial; take m = 2, then Γ is convex
by 3.
The necessity of 14 will be shown by induction. For m = 1, 14
holds trivially. For m = 2, 14 holds as a consequence of 3. Assume
now that 14 holds for m; we will show that it also holds for m + 1.
Let

    x¹, x², . . . , xᵐ⁺¹ ∈ Γ,   p₁, . . . , pₘ₊₁ ≥ 0,   p₁ + ··· + pₘ₊₁ = 1

If pₘ₊₁ = 0, then p₁x¹ + ··· + pₘxᵐ ∈ Γ, since 14 holds for m. If
pₘ₊₁ = 1, then p₁x¹ + ··· + pₘ₊₁xᵐ⁺¹ = xᵐ⁺¹ ∈ Γ. If 0 < pₘ₊₁ < 1,
then we can write

    p₁x¹ + ··· + pₘ₊₁xᵐ⁺¹
        = (1 − pₘ₊₁)[p₁/(1 − pₘ₊₁) x¹ + ··· + pₘ/(1 − pₘ₊₁) xᵐ] + pₘ₊₁xᵐ⁺¹ ∈ Γ

where the bracketed point is in Γ because 14 holds for m, and the final
combination of two points of Γ is in Γ by 3.

Caratheodory's theorem [Caratheodory 07]
Let Γ ⊂ Rⁿ. If x is a convex combination of points of Γ, then x is a
convex combination of n + 1 or fewer points of Γ.
PROOF Let

    x = p₁x¹ + ··· + pₘxᵐ,   xⁱ ∈ Γ,   pᵢ ≥ 0,   p₁ + ··· + pₘ = 1

We will show now that if m > n + 1, then x can be written as a convex
combination of m − 1 points in Γ. (This would establish the theorem,
for we could repeatedly apply the result until x is a convex combination
of n + 1 or fewer points of Γ.) If any pᵢ in the above expression is zero,
then x is a convex combination of m − 1 or fewer points of Γ. So let
each pᵢ > 0. Since m > n + 1, there exist r₁, . . . , rₘ ∈ R, not all
zero, such that

    r₁x¹ + ··· + rₘxᵐ = 0   and   r₁ + ··· + rₘ = 0

Define

    qᵢ = pᵢ − αrᵢ   for i = 1, . . . , m

where α is some positive number chosen such that qᵢ ≥ 0 for all i, and
at least one qᵢ, say qₖ, is equal to 0. In particular we choose α such that

    α = min {pᵢ/rᵢ | rᵢ > 0} = pₖ/rₖ

Then

    q₁ + ··· + qₘ = (p₁ + ··· + pₘ) − α(r₁ + ··· + rₘ) = 1

and

    q₁x¹ + ··· + qₘxᵐ = p₁x¹ + ··· + pₘxᵐ − α(r₁x¹ + ··· + rₘxᵐ) = x

Hence x is a convex combination of m − 1 points in Γ.

Fig. 3.1.4 A set Γ and its convex hull [Γ].
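The reduction step of this proof is directly implementable: a null vector of
the stacked system supplies the rᵢ, and the weights are shifted along it until
one vanishes. A Python sketch of this idea, our own illustration rather than
anything from the text:

    import numpy as np

    def caratheodory_reduce(points, weights, tol=1e-9):
        # While more than n + 1 points carry positive weight, find r with
        # sum r_i x^i = 0 and sum r_i = 0, then shift the weights along r
        # until at least one weight reaches zero, as in the proof above.
        X = np.asarray(points, dtype=float)   # m x n array; rows are the x^i
        p = np.asarray(weights, dtype=float)
        while len(p) > X.shape[1] + 1:
            M = np.vstack([X.T, np.ones((1, len(p)))])
            r = np.linalg.svd(M)[2][-1]       # null vector: M @ r ~ 0
            if r.max() <= tol:                # make sure some r_i is positive
                r = -r
            pos = r > tol
            alpha = np.min(p[pos] / r[pos])   # largest step keeping weights >= 0
            p = np.maximum(p - alpha * r, 0.0)
            X, p = X[p > 0], p[p > 0]
        return X, p / p.sum()

    pts = np.random.default_rng(0).random((10, 2))
    w = np.full(10, 0.1)
    P, q = caratheodory_reduce(pts, w)
    print(len(q) <= 3, np.allclose(q @ P, w @ pts))   # True True in R^2

Each pass of the loop removes at least one point while preserving both the
weight sum and the represented point, exactly as in the proof.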


Convex hull
Let Γ ⊂ Rⁿ. The convex hull of Γ, denoted by [Γ], is the intersection
of all convex sets in Rⁿ containing Γ. (By Theorem 9, the convex
hull of any set Γ ⊂ Rⁿ is convex. Figure 3.1.4 shows a hatched nonconvex
set in R² and its shaded convex hull.)
Obviously if Γ is convex, then Γ = [Γ].
Theorem
The convex hull [Γ] of a set Γ ⊂ Rⁿ is equal to the set of all convex
combinations of points of Γ.
PROOF Let Λ denote the latter set, that is, the set of all convex
combinations of finitely many points of Γ. If x¹, x² ∈ Λ, then each is a
convex combination of finitely many points of Γ, and hence for 0 ≤ λ ≤ 1
the point λx¹ + (1 − λ)x² is again a convex combination of those same
points of Γ. Thus λx¹ + (1 − λ)x² ∈ Λ, and Λ is convex. It is also clear
that Γ ⊂ Λ.
Since Λ is convex, then [Γ] ⊂ Λ. We also have by Theorem 13 that the
convex set [Γ] containing Γ must also contain all convex combinations of
points of Γ. Hence Λ ⊂ [Γ], and Λ = [Γ].
Sum of two sets
Let Γ, Δ ⊂ Rⁿ. Their sum Γ + Δ is defined by

    Γ + Δ = {z | z = x + y, x ∈ Γ, y ∈ Δ}

Product of a set with a real number
Let Γ ⊂ Rⁿ, and let λ ∈ R. The product λΓ is defined by

    λΓ = {z | z = λx, x ∈ Γ}

Note that if λ = −1 and Γ, Δ ⊂ Rⁿ, then Δ + λΓ = Δ − Γ. Note that
this is not the complement of Γ relative to Δ as defined in 1.2 and written
as Δ ~ Γ.
Theorem
The sum Γ + Δ of two convex sets Γ and Δ in Rⁿ is a convex set.
PROOF Let z¹, z² ∈ Γ + Δ; then z¹ = x¹ + y¹ and z² = x² + y², where
x¹, x² ∈ Γ and y¹, y² ∈ Δ. For 0 ≤ λ ≤ 1

    (1 − λ)z¹ + λz² = [(1 − λ)x¹ + λx²] + [(1 − λ)y¹ + λy²] ∈ Γ + Δ

Hence Γ + Δ is convex.
Theorem
The product λΓ of a convex set Γ in Rⁿ and the real number λ is a
convex set.
PROOF Let z¹, z² ∈ λΓ; then z¹ = λx¹ and z² = λx², where x¹, x² ∈ Γ.
For 0 ≤ μ ≤ 1

    (1 − μ)z¹ + μz² = λ[(1 − μ)x¹ + μx²] ∈ λΓ

Corollary
If Γ and Δ are two convex sets in Rⁿ, then Γ − Δ is a convex set.

2. Separation theorems for convex sets


It is intuitively plausible that if we had two disjoint convex sets in Rⁿ,
then we could construct a plane such that one set would lie on one
side of the plane and the other set on the other side. Despite its
simplicity, this is a rather deep result and is not easy to prove. One
version of this result, the Hahn-Banach theorem, can be established by
using only the vector space properties 1.3.3 of Rⁿ and not the topological
properties induced by the norm ‖x‖ [Berge 63, Valentine 64]. We shall,
however, use these topological properties of Rⁿ (all summarized in
Appendix B) in deriving the separation theorems for convex sets. In
particular our method of proof will make use of Gordan's theorem of the
alternative 2.4.5 and the finite intersection theorem of compact sets
B.3.2(iii). (Knowledge of the contents of Appendix B is assumed from
here on.)

Separating plane
The plane {x | x ∈ Rⁿ, cx = α}, c ≠ 0, is said to separate (strictly
separate) two nonempty sets Γ and Δ in Rⁿ if

    x ∈ Γ ⇒ cx ≤ α (cx < α)   and   x ∈ Δ ⇒ cx ≥ α (cx > α)

If such a plane exists, the sets Γ and Δ are said to be separable (strictly
separable).
Figure 3.2.1 gives a simple illustration in R² of two sets
which are separable, but which are neither disjoint nor convex. It should
be remarked that in general separability does not imply that the sets are
disjoint (Fig. 3.2.1), nor is it true in general that two disjoint sets are
separable (Fig. 3.2.2). However, if the sets are nonempty, convex, and


Fig. 3.2.1 Separable but not disjoint sets.

disjoint, then they are separable, and in fact this is a separation theorem
we intend to prove.
Lemma
Let Ω be a nonempty convex set in Rⁿ, not containing the origin 0.
Then there exists a plane {x | x ∈ Rⁿ, cx = 0}, c ≠ 0, separating Ω and 0,
that is,

    x ∈ Ω  ⇒  cx ≥ 0

PROOF With every x ∈ Ω we associate the nonempty closed set

    Λₓ = {y | y ∈ Rⁿ, yy = 1, xy ≥ 0}

Let x¹, . . . , xᵐ be any finite set of points in Ω. It follows from the
convexity of Ω, Theorem 3.1.13, and from the fact that 0 ∉ Ω, that

    p₁x¹ + ··· + pₘxᵐ = 0,  p ≥ 0,  ep = 1   has no solution p ∈ Rᵐ

or equivalently, with B the m × n matrix whose ith row is xⁱ,

    B′p = 0,  p > 0   has no solution p ∈ Rᵐ

Hence by Gordan's theorem 2.4.5

    xⁱy > 0,  i = 1, . . . , m   has a solution y ∈ Rⁿ

Obviously y ≠ 0, and we can take y such that yy = 1. Then

    y ∈ Λₓᵢ   for i = 1, . . . , m

and hence

    Λₓ¹ ∩ ··· ∩ Λₓᵐ ≠ ∅

The sets (Λₓ)ₓ∈Ω are closed sets relative to the compact set {y | y ∈ Rⁿ,
yy = 1} [see B.1.8 and B.3.2(i)]; hence by the finite intersection theorem
B.3.2(iii) we have that ∩ₓ∈Ω Λₓ ≠ ∅. Let c be any point in this
intersection. Then cc = 1 and cx ≥ 0 for all x ∈ Ω. Hence
{x | x ∈ Rⁿ, cx = 0} is the required separating plane.

Fig. 3.2.2 Disjoint but not separable sets.
It should be remarked that in the above lemma we did not impose
any conditions on Ω other than convexity. The following example shows
that the above lemma cannot be strengthened to x ∈ Ω ⇒ cx > 0 without
some extra assumptions. The set shown in Fig. 3.2.3
is convex and does not contain the origin, but there exists no plane
{x | x ∈ R², cx = 0} such that x ∈ Ω ⇒ cx > 0 (Fig. 3.2.3).

Fig. 3.2.3

If on the other hand we do assume that Ω is closed (or even if we


assume less, namely that the origin is not a point of closure of Ω), then we
can establish a stronger result, that is, there exists a plane which strictly
separates the origin from Ω (see Corollary 4 and Lemma 5 below).
However, before doing this, we need to establish the following fundamental
separation theorem.
Separation theorem
Let Γ and Δ be two nonempty disjoint convex sets in Rⁿ. Then there
exists a plane {x | x ∈ Rⁿ, cx = α}, c ≠ 0, which separates them, that is,

    x ∈ Γ ⇒ cx ≥ α   and   x ∈ Δ ⇒ cx ≤ α

PROOF The set

    Ω = Γ − Δ = {z | z = x − y, x ∈ Γ, y ∈ Δ}

is convex by Corollary 3.1.22, and it does not contain the origin 0 because
Γ ∩ Δ = ∅. By Lemma 2 above there exists a plane {x | x ∈ Rⁿ, cx = 0},
c ≠ 0, such that

    z ∈ Ω  ⇒  cz ≥ 0

or

    cx ≥ cy   for all x ∈ Γ, y ∈ Δ

Hence

    inf {cx | x ∈ Γ} ≥ sup {cy | y ∈ Δ}

Define

    α = inf {cx | x ∈ Γ}

Then

    x ∈ Γ ⇒ cx ≥ α   and   y ∈ Δ ⇒ cy ≤ α
We derive now from the above fundamental separation theorem a


corollary, and from the corollary a lemma, Lemma 5. Lemma 5 will be
used in establishing a strict separation theorem, Theorem 6, below.
Corollary
Let Ω be a nonempty convex set in Rⁿ. If the origin 0 is not a point
of closure of Ω (or equivalently if the origin is not in the closure Ω̄ of Ω), then
there exists a plane {x | x ∈ Rⁿ, cx = α}, c ≠ 0, α > 0, strictly separating
Ω and 0, and conversely. In other words

    0 ∉ Ω̄  ⇔  (x ∈ Ω ⇒ cx > α)  for some c ≠ 0, α > 0

PROOF (⇐) Assume that there exist c ≠ 0, α > 0 such that cx > α
for all x ∈ Ω. If 0 ∈ Ω̄, then (see B.1.3 and B.1.6) there exists an x ∈ Ω
such that ‖x‖ < α/2‖c‖, and hence

    cx ≤ ‖c‖ ‖x‖ < α/2

which is a contradiction. Hence 0 ∉ Ω̄.
(⇒) Since 0 is not a point of closure of Ω, there exists an open
ball Bε(0) = {x | x ∈ Rⁿ, ‖x‖ < ε} around 0 such that Bε(0) ∩ Ω = ∅
(see B.1.3). Since the ball Bε(0) is convex (see 3.1.7), it follows by
Theorem 3 that there exists a plane {x | x ∈ Rⁿ, cx = γ}, c ≠ 0, such that

    x ∈ Ω ⇒ cx ≥ γ   and   x ∈ Bε(0) ⇒ cx ≤ γ

Since Bε(0) is an open ball, it must contain the nonzero vector δc for
some positive δ. Hence γ ≥ δcc > 0. Let α = ½δcc > 0. Then

    x ∈ Ω  ⇒  cx ≥ γ ≥ δcc > α > 0

Lemma
Let Ω be a nonempty closed convex set in Rⁿ. If Ω does not contain
the origin, then there exists a plane {x | x ∈ Rⁿ, cx = α}, c ≠ 0, α > 0,
strictly separating Ω and 0, and conversely. In other words

    0 ∉ Ω  ⇔  (x ∈ Ω ⇒ cx > α)  for some c ≠ 0, α > 0

PROOF This lemma follows from Corollary 4 above by observing that
the requirement that Ω be closed and not contain the origin 0 implies
that 0 is not a point of closure of Ω, that is, 0 ∉ Ω̄ (see B.1.3, B.1.5, and
B.1.6).
Strict separation theorem
Let Γ and Δ be two nonempty convex sets in Rⁿ, with Γ compact and
Δ closed. If Γ and Δ are disjoint, then there exists a plane {x | x ∈ Rⁿ,
cx = α}, c ≠ 0, which strictly separates them, and conversely. In other
words

    Γ ∩ Δ = ∅  ⇔  (x ∈ Γ ⇒ cx < α, x ∈ Δ ⇒ cx > α)  for some c ≠ 0

PROOF (⇐) If x ∈ Γ ∩ Δ, then cx < α < cx, a contradiction.
(⇒) The set

    Ω = Δ − Γ = {z | z = y − x, y ∈ Δ, x ∈ Γ}

is convex by Corollary 3.1.22 and closed by Corollary B.3.8, and it does
not contain the origin because Γ ∩ Δ = ∅. Hence by Lemma 5 above
there exists a plane {x | x ∈ Rⁿ, cx = μ}, c ≠ 0, μ > 0, such that

    z ∈ Ω  ⇒  cz > μ

or

    cy > cx + μ   for all x ∈ Γ, y ∈ Δ

Hence

    inf {cy | y ∈ Δ} ≥ sup {cx | x ∈ Γ} + μ

Define

    α = sup {cx | x ∈ Γ} + μ/2

Then

    x ∈ Γ ⇒ cx < α   and   y ∈ Δ ⇒ cy > α
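For bounded polyhedral Γ and Δ the strict separation of Theorem 6 can be
computed by linear programming: by scaling (c,α), a strictly separating
plane with gap 2 exists whenever the convex hulls of the two point sets are
disjoint. The following Python sketch is our own illustration under that
assumption, using scipy's linprog:

    import numpy as np
    from scipy.optimize import linprog

    def strict_separating_plane(G, D):
        # Seek c, alpha with c.x <= alpha - 1 on the rows of G and
        # c.y >= alpha + 1 on the rows of D, as a feasibility LP in
        # the variables z = (c, alpha).
        mG, n = G.shape
        mD, _ = D.shape
        A_ub = np.vstack([np.hstack([G, -np.ones((mG, 1))]),   #  c.x - alpha <= -1
                          np.hstack([-D, np.ones((mD, 1))])])  # -c.y + alpha <= -1
        res = linprog(np.zeros(n + 1), A_ub=A_ub, b_ub=-np.ones(mG + mD),
                      bounds=[(None, None)] * (n + 1))
        return (res.x[:n], res.x[n]) if res.success else None

    G = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # a triangle (compact)
    D = np.array([[3.0, 3.0], [4.0, 3.0]])               # a segment away from it
    print(strict_separating_plane(G, D))                 # some (c, alpha)

If the LP is infeasible the hulls intersect and, consistent with the theorem,
no strictly separating plane exists.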

The above separation theorems will be used to derive some fundamental theorems for convex functions in the next chapter, which in turn
will be used in obtaining the fundamental Kuhn-Tucker saddlepoint
optimality criteria of convex nonlinear programming in Chap. 5 and also
the minimum principle necessary optimality condition of Chap. 11.
We remark here that a theorem of the alternative, Gordan's
theorem 2.4.5, was fundamental in deriving the above separation
theorems. We can reverse the process and use the above separation theorems
to derive theorems of the alternative. Thus to derive Gordan's theorem
2.4.5, namely that either A′y = 0, y > 0 has a solution y ∈ Rᵐ or Ax > 0

Fig. 3.2.4 Geometric reinterpretation of Gordan's theorem by using Lemma 5.
(a) A′y = 0, y > 0 has a solution; Ax > 0 has no solution.
(b) Ax > 0 has a solution; A′y = 0, y > 0 has no solution.

has a solution x ∈ Rⁿ, we observe that if e ∈ Rᵐ is a vector of ones, then

    pAx ≥ 0 for all x ∈ Rⁿ, p > 0  ⇔  pA = 0, p > 0  ⇔  A′p = 0, p > 0

where the last equivalence follows by taking x = eⁱ and x = −eⁱ,
i = 1, . . . , n, where eⁱ ∈ Rⁿ has zeros for all elements except 1 for the
ith element.

Fig. 3.2.5 Geometric interpretation of Lemma 5.

Using the framework of Lemma 5 we can give a geometric
reinterpretation of Gordan's theorem as follows: Either the origin 0 ∈ Rⁿ
is in the convex hull of the row vectors A₁, . . . , Aₘ of the matrix A
(A′y = 0, y > 0 has a solution, Fig. 3.2.4a), or it is not (in which case
by Lemma 5 Ax > 0 has a solution x = c, Fig. 3.2.4b). More generally,
if Ω is any nonempty closed convex set in Rⁿ, then either it contains the
origin, Fig. 3.2.5a, or it does not (in which case by Lemma 5 there exists a
vector c ∈ Rⁿ which makes a strict acute angle with each x ∈ Ω, Fig.
3.2.5b).
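The two branches of Gordan's alternative can also be checked numerically.
By homogeneity, Ax > 0 is solvable exactly when Ax ≥ e is, and A′y = 0,
y > 0 exactly when A′y = 0, y ≥ e is, so each branch becomes an LP
feasibility problem. A Python sketch of this check, our own illustration
assuming scipy (exactly one of the two should succeed):

    import numpy as np
    from scipy.optimize import linprog

    def gordan_alternative(A):
        m, n = A.shape
        # System II: A'y = 0, y >= e (zero objective, feasibility only).
        res2 = linprog(np.zeros(m), A_eq=A.T, b_eq=np.zeros(n),
                       bounds=[(1, None)] * m)
        # System I: Ax >= e, written as -Ax <= -e with x free.
        res1 = linprog(np.zeros(n), A_ub=-A, b_ub=-np.ones(m),
                       bounds=[(None, None)] * n)
        return res1.success, res2.success

    A = np.eye(2)
    print(gordan_alternative(A))   # (True, False): here Ax > 0 has a solution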
Problem
Establish Farkas' theorem 2.4.6 by using Theorem 6 above.
(Hint: Observe that A′y = b, y ≥ 0 has no solution if and only if the sets
{b} and {z | z = A′y, y ≥ 0} are disjoint. Then use Theorem 6.)


Chapter Four
Convex and Concave Functions

In this chapter we introduce convex, concave, strictly convex, and


strictly concave functions defined
on subsets of Rn. Convex and concave functions are extremely important in nonlinear programming
because they are among the few
functions for which sufficient optimality criteria can be given (Chaps.
5 and 7), and they are the only functions for which necessary optimality conditions can be given without
linearization (Kuhn-Tucker saddlepoint condition in Chap. 5). We
give in this chapter some of the
basic properties of convex and concave functions and obtain some
fundamental theorems involving
these functions. These theorems,
derived by using the separation
theorems for convex sets of Chap.
3, are akin to the theorems of the
alternative derived in Chap. 2 for
linear systems. In this sense convex and concave functions inherit
some of the important properties
of linear functions. These fundamental theorems will be used to
derive the important saddlepoint
necessary optimality condition of
Chap. 5 and the minimum principle
necessary optimality condition of
Chap. 11. Finally it should be
mentioned that no differentiability
or explicit continuity requirements
are made on the functions introduced in this chapter. A subsequent chapter, Chap. 6, will be
devoted to differentiable convex
and concave functions.


1. Definitions and basic properties


Convex function
A numerical function θ defined on a set Γ ⊂ Rⁿ is said to be
convex at x̄ ∈ Γ (with respect to Γ) if

    x ∈ Γ, 0 ≤ λ ≤ 1, (1 − λ)x̄ + λx ∈ Γ
        ⇒  θ[(1 − λ)x̄ + λx] ≤ (1 − λ)θ(x̄) + λθ(x)

θ is said to be convex on Γ if it is convex at each x̄ ∈ Γ.
Note that this definition of a convex function is slightly more
general than the customary definition in the literature [Fenchel 53,
Valentine 64, Berge-Ghouila Houri 65] in that (i) we define convexity at a point
first and then convexity on a set, and (ii) we do not require Γ to be a
convex set. This generalization will allow us to handle a somewhat wider
class of problems later. It follows immediately from the above definition
that a numerical function θ defined on a convex set Γ is convex on Γ if
and only if

    x¹, x² ∈ Γ, 0 ≤ λ ≤ 1  ⇒  θ[(1 − λ)x¹ + λx²] ≤ (1 − λ)θ(x¹) + λθ(x²)

Figure 4.1.1 depicts two convex functions on convex subsets of Rⁿ = R.

Fig. 4.1.1 Convex functions on subsets of Rⁿ = R. (a) A convex function θ on R;
(b) a convex function θ on Γ = [1, ∞).


Concave function
A numerical function θ defined on a set Γ ⊂ Rⁿ is said to be concave
at x̄ ∈ Γ (with respect to Γ) if

    x ∈ Γ, 0 ≤ λ ≤ 1, (1 − λ)x̄ + λx ∈ Γ
        ⇒  θ[(1 − λ)x̄ + λx] ≥ (1 − λ)θ(x̄) + λθ(x)

θ is said to be concave on Γ if it is concave at each x̄ ∈ Γ.
Obviously θ is concave at x̄ ∈ Γ (concave on Γ) if and only if −θ
is convex at x̄ (convex on Γ). Results obtained for convex functions can
be changed into results for concave functions by the appropriate
multiplication by −1, and vice versa.
It follows immediately from the above definition that a numerical
function θ defined on a convex set Γ is concave on Γ if and only if

    x¹, x² ∈ Γ, 0 ≤ λ ≤ 1  ⇒  θ[(1 − λ)x¹ + λx²] ≥ (1 − λ)θ(x¹) + λθ(x²)

Figure 4.1.2 depicts two concave functions on convex subsets of Rⁿ = R.
Problem
Show that a linear function θ(x) = cx + α, x ∈ Rⁿ, is both convex
and concave on Rⁿ, and conversely.
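The defining inequality of 1 can be probed numerically on sampled pairs
of points; a failed pair certifies nonconvexity, while passing all pairs is
only evidence. The following Python sketch is our own illustration (it also
exercises the Problem above: a linear function passes the test both for
itself and for its negative):

    import numpy as np

    def convex_on_samples(theta, points, n_lambda=11, tol=1e-9):
        # Check theta((1-lam)x1 + lam x2) <= (1-lam)theta(x1) + lam theta(x2)
        # over sampled pairs from a convex domain.
        for x1 in points:
            for x2 in points:
                for lam in np.linspace(0.0, 1.0, n_lambda):
                    lhs = theta((1 - lam) * x1 + lam * x2)
                    if lhs > (1 - lam) * theta(x1) + lam * theta(x2) + tol:
                        return False
        return True

    pts = [np.array([t]) for t in np.linspace(-2.0, 2.0, 9)]
    linear = lambda x: 3.0 * x[0] + 1.0
    print(convex_on_samples(lambda x: x[0] ** 2, pts))    # True
    print(convex_on_samples(lambda x: x[0] ** 3, pts))    # False
    print(convex_on_samples(linear, pts),
          convex_on_samples(lambda x: -linear(x), pts))   # True True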
Fig. 4.1.2 Concave functions on subsets of Rⁿ = R. (a) A concave function θ on
R; (b) a concave function θ on Γ = [0,1].

Strictly convex function
A numerical function θ defined on a set Γ ⊂ Rⁿ is said to be strictly
convex at x̄ ∈ Γ (with respect to Γ) if

    x ∈ Γ, x ≠ x̄, 0 < λ < 1, (1 − λ)x̄ + λx ∈ Γ
        ⇒  θ[(1 − λ)x̄ + λx] < (1 − λ)θ(x̄) + λθ(x)

θ is said to be strictly convex on Γ if it is strictly convex at each x̄ ∈ Γ.
Strictly concave function
A numerical function θ defined on a set Γ ⊂ Rⁿ is said to be strictly
concave at x̄ ∈ Γ (with respect to Γ) if

    x ∈ Γ, x ≠ x̄, 0 < λ < 1, (1 − λ)x̄ + λx ∈ Γ
        ⇒  θ[(1 − λ)x̄ + λx] > (1 − λ)θ(x̄) + λθ(x)

θ is said to be strictly concave on Γ if it is strictly concave at each x̄ ∈ Γ.


Obviously a strictly convex (strictly concave) function on a set
r C Rn is convex (concave) on F, but not conversely. For example a
constant function on Rn is both convex and concave on Rn, but neither
strictly convex nor strictly concave on Rn. In fact, it can be easily shown
that all linear functions 0(z) = ex a on Rn are neither strictly convex
nor strictly concave on Rn. Hence, because of the linear portion, the
function depicted in Fig. ^.l.la is not strictly convex on R, but the function of Fig. 4-1 -lb is strictly convex on [ 1, ). Both functions of Fig.
4-1.2 are strictly concave on their domains of definition.
An n-dimensional vector function / defined on a set F in Rn is
convex at x G F, convex on F, etc., if each of its components /,, i 1,
. . . , m, is convex at x G F, convex on F, etc.
Theorem
Let f = (f₁, . . . ,fₘ) be an m-dimensional vector function defined on
Γ ⊂ Rⁿ. If f is convex at x̄ ∈ Γ (convex on Γ), then each nonnegative
linear combination of its components fᵢ

    θ(x) = pf(x),   p ≥ 0

is convex at x̄ (convex on Γ).


PROOF Let x ∈ Γ, 0 ≤ λ ≤ 1, and let (1 − λ)x̄ + λx ∈ Γ. Then

    θ[(1 − λ)x̄ + λx] = pf[(1 − λ)x̄ + λx]
        ≤ p[(1 − λ)f(x̄) + λf(x)] = (1 − λ)θ(x̄) + λθ(x)

Problem
Let θ be a numerical function defined on a convex set Γ ⊂ Rⁿ.
Show that θ is respectively convex, concave, strictly convex, or strictly
concave on Γ if and only if for each x¹, x² ∈ Γ the numerical function ψ
defined on the line segment [0,1] by

    ψ(λ) = θ[(1 − λ)x¹ + λx²]

is respectively convex, concave, strictly convex, or strictly concave on
[0,1].
Theorem
For a numerical function θ defined on a convex set Γ ⊂ Rⁿ to be
convex on Γ it is necessary and sufficient that its epigraph

    Gθ = {(x,ζ) | x ∈ Γ, ζ ∈ R, ζ ≥ θ(x)}

be a convex set in Rⁿ⁺¹.
PROOF
(Sufficiency) Assume that Gθ is convex. Let x¹, x² ∈ Γ; then [x¹,θ(x¹)]
∈ Gθ and [x²,θ(x²)] ∈ Gθ. By the convexity of Gθ we have for 0 ≤ λ ≤ 1 that

    [(1 − λ)x¹ + λx², (1 − λ)θ(x¹) + λθ(x²)] ∈ Gθ

or

    θ[(1 − λ)x¹ + λx²] ≤ (1 − λ)θ(x¹) + λθ(x²)

and hence θ is convex on Γ.
(Necessity) Assume that θ is convex on Γ. Let (x¹,ζ₁) ∈ Gθ and
(x²,ζ₂) ∈ Gθ. By the convexity of θ on Γ we have that for 0 ≤ λ ≤ 1

    θ[(1 − λ)x¹ + λx²] ≤ (1 − λ)θ(x¹) + λθ(x²) ≤ (1 − λ)ζ₁ + λζ₂

Hence

    (1 − λ)(x¹,ζ₁) + λ(x²,ζ₂) ∈ Gθ

and Gθ is a convex set in Rⁿ⁺¹.
Corollary
For a numerical function θ defined on a convex set Γ ⊂ Rⁿ to be
concave on Γ it is necessary and sufficient that its hypograph

    Hθ = {(x,ζ) | x ∈ Γ, ζ ∈ R, ζ ≤ θ(x)}

be a convex set in Rⁿ⁺¹.
Figure 4.1.3a depicts a convex function on Γ and its convex
epigraph Gθ. Figure 4.1.3b depicts a concave function on Γ and its convex
hypograph Hθ.
Theorem
Let θ be a numerical function defined on a convex set Γ ⊂ Rⁿ. A
necessary but not sufficient condition for θ to be convex on Γ is that the set

    Λα = {x | x ∈ Γ, θ(x) ≤ α}

be convex for each real number α.
PROOF Let θ be convex on Γ and let x¹, x² ∈ Λα. Then for 0 ≤ λ ≤ 1

    θ[(1 − λ)x¹ + λx²] ≤ (1 − λ)θ(x¹) + λθ(x²) ≤ (1 − λ)α + λα = α

Hence (1 − λ)x¹ + λx² ∈ Λα, and Λα is convex.

Fig. 4.1.3 The convex epigraph Gθ of a convex function and the
convex hypograph Hθ of a concave function.


We now show that if Λα is convex for each α, it does not follow
that θ is a convex function on Γ. Consider the function θ on R defined
by θ(x) = (x)³. θ is not convex on R. However, the set

    Λα = {x | x ∈ R, (x)³ ≤ α}

is obviously convex for any α (see 3.1.4).
Corollary
Let θ be a numerical function defined on the convex set Γ ⊂ Rⁿ. A
necessary but not sufficient condition for θ to be concave on Γ is that the set

    Ωα = {x | x ∈ Γ, θ(x) ≥ α}

be convex for each real number α.
Figure 4.1.4a depicts a convex function θ on a convex set
Γ ⊂ Rⁿ = R and the associated convex set Λα. Figure 4.1.4b depicts a
nonconvex function θ and the associated convex set Λα. Figure 4.1.4c
depicts a concave function θ on a convex set Γ ⊂ Rⁿ = R and the
associated convex set Ωα.

Problem
Let θ be a numerical function defined on the convex set Γ ⊂ Rⁿ.
Show that a necessary and sufficient condition for θ to be convex on Γ
is that for each integer m ≥ 1

    x¹, . . . , xᵐ ∈ Γ, p₁, . . . , pₘ ≥ 0, p₁ + ··· + pₘ = 1
        ⇒  θ(p₁x¹ + ··· + pₘxᵐ) ≤ p₁θ(x¹) + ··· + pₘθ(xᵐ)

(Hint: Use Theorems 8 and 3.1.13. The above inequality is Jensen's
inequality [Jensen 06].)

Fig. 4.1.4 The convex sets Λα and Ωα of 10 and 11 associated with a function θ.

Theorem
If (θᵢ)ᵢ∈I is a family (finite or infinite) of numerical functions
which are convex and bounded from above on a convex set Γ ⊂ Rⁿ, then the
numerical function

    θ(x) = sup θᵢ(x)
           i∈I

is a convex function on Γ.
PROOF Since each θᵢ is a convex function on Γ, their epigraphs

    Gθᵢ = {(x,ζ) | x ∈ Γ, ζ ∈ R, ζ ≥ θᵢ(x)}

are convex sets in Rⁿ⁺¹ by Theorem 8, and hence their intersection
∩ᵢ∈I Gθᵢ is also a convex set in Rⁿ⁺¹ by Theorem 3.1.9. But this convex
intersection is the epigraph of θ. Hence θ is a convex function on Γ by
Theorem 8.
Corollary
If (θᵢ)ᵢ∈I is a family (finite or infinite) of numerical functions which
are concave and bounded from below on a convex set Γ ⊂ Rⁿ, then the
numerical function

    θ(x) = inf θᵢ(x)
           i∈I

is a concave function on Γ.
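The theorem is easy to see in action: the pointwise maximum of finitely
many affine functions (each convex) passes a sampled convexity check.
A small Python illustration of our own, not from the text:

    import numpy as np

    # theta(x) = max_i (a_i . x + b_i), the supremum of finitely many
    # affine (hence convex) functions, is convex by the theorem above.
    rng = np.random.default_rng(0)
    A, b = rng.normal(size=(5, 2)), rng.normal(size=5)
    theta = lambda x: float(np.max(A @ x + b))

    # Sampled check of the defining inequality of convexity.
    ok = True
    pts = rng.normal(size=(8, 2))
    for x1 in pts:
        for x2 in pts:
            for lam in np.linspace(0.0, 1.0, 11):
                mid = theta((1 - lam) * x1 + lam * x2)
                if mid > (1 - lam) * theta(x1) + lam * theta(x2) + 1e-9:
                    ok = False
    print(ok)  # True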
We end this section by remarking that a function θ which is convex
on a convex set Γ ⊂ Rⁿ is not necessarily a continuous function. For
example, on the halfline Γ = {x | x ∈ R, x ≥ 1}, the numerical function
of Fig. 4.1.1b is a convex function on Γ, but is obviously not continuous
at x = 1. However, if Γ is an open convex set, then a convex function
θ on Γ is indeed continuous. This fact is established in the following
theorem.


Theorem
Let Γ be an open convex set in Rⁿ. If θ is a convex numerical
function on Γ, then θ is continuous on Γ.
PROOF [Fleming 65]† Let x̄ ∈ Γ, and let α be the distance (see 1.3.9)
from x̄ to the closest point in Rⁿ not in Γ (α = +∞ if Γ = Rⁿ). Let C
be an n-cube with center x̄ and side length 2δ, that is

    C = {x | x ∈ Rⁿ, |xᵢ − x̄ᵢ| ≤ δ, i = 1, . . . , n}

By letting (n)^½ δ < α, we have that C ⊂ Γ. Let V denote the set of 2ⁿ
vertices of C. Let

    β = max θ(x)
        x∈V

By Theorem 10 the set Λβ = {x | x ∈ Γ, θ(x) ≤ β} is convex. Since C is
the convex hull of V (this can be easily shown by induction on n) and
V ⊂ Λβ, it follows that C ⊂ Λβ by Theorem 3.1.13 (Fig. 4.1.5).
Let x be any point such that 0 < ‖x − x̄‖ < δ, and define x̄ + u,
x̄ − u on the line through x̄ and x as in Fig. 4.1.5. Write x now as a
convex combination of x̄ and x̄ + u, and x̄ as a convex combination of
x and x̄ − u. If λ = ‖x − x̄‖/δ, then

    x = (1 − λ)x̄ + λ(x̄ + u)   and   x̄ = [1/(1 + λ)]x + [λ/(1 + λ)](x̄ − u)

† Fleming attributes this proof to F. J. Almgren.

Fig. 4.1.5


Since θ is convex on Γ

    θ(x) ≤ (1 − λ)θ(x̄) + λθ(x̄ + u)   and   θ(x̄) ≤ [1/(1 + λ)]θ(x) + [λ/(1 + λ)]θ(x̄ − u)

These inequalities give

    |θ(x) − θ(x̄)| ≤ λ[β − θ(x̄)] = [β − θ(x̄)] ‖x − x̄‖/δ

Thus for any given ε > 0 it follows that |θ(x) − θ(x̄)| < ε for all x satisfying
[β − θ(x̄)] ‖x − x̄‖ < εδ, and hence θ(x) is continuous at x̄.
Since the interior of each set Γ ⊂ Rⁿ is open, it follows that if θ
is a convex function on a convex set Γ ⊂ Rⁿ, it is continuous on its
interior.

2. Some fundamental theorems for convex functions

We saw in Chap. 2 that Farkas' theorem of the alternative played a


crucial role in deriving the necessary optimality conditions of linear programming. In this section we shall derive what may be considered
extensions of theorems of the alternative of Chap. 2 to convex and concave functions. These theorems in turn will play a similar crucial role
in deriving the necessary optimality conditions of nonlinear programming
in Chaps. 5 and 11. (In the remainder of this chapter various properties
of continuous and semicontinuous functions will be used. For convenience, these results are summarized in Appendix C.)
We begin by establishing a fundamental theorem for convex functions, the essence of which is given in [Fan-Glicksburg-Hoffman 57].
Theorem
Let Γ be a nonempty convex set in Rⁿ, let f be an m-dimensional
convex vector function on Γ, and let h be a k-dimensional linear vector
function on Rⁿ. If

    f(x) < 0, h(x) = 0   has no solution x ∈ Γ

then there exist p ∈ Rᵐ and q ∈ Rᵏ such that

    p ≥ 0,  (p,q) ≠ 0,   and   pf(x) + qh(x) ≥ 0  for all x ∈ Γ
REMARK p ≥ 0 and (p,q) ≠ 0 do not imply p ≠ 0 and q ≠ 0, but
they do imply p ≠ 0 or q ≠ 0 or both. However if we delete the linear
equalities h(x) = 0, then p ≠ 0.
PROOF Define the set

    Λ = {(y,z) | y ∈ Rᵐ, z ∈ Rᵏ, y > f(x), z = h(x) for some x ∈ Γ}

By hypothesis Λ does not contain the origin 0 ∈ Rᵐ⁺ᵏ. Also, Λ is convex,
for if (y¹,z¹) and (y²,z²) are in Λ, then for 0 ≤ λ ≤ 1

    (1 − λ)y¹ + λy² > (1 − λ)f(x¹) + λf(x²) ≥ f[(1 − λ)x¹ + λx²]

and

    (1 − λ)z¹ + λz² = (1 − λ)h(x¹) + λh(x²) = h[(1 − λ)x¹ + λx²]

Because Λ is a nonempty convex set not containing the origin, it follows
by Lemma 3.2.2 that there exist p ∈ Rᵐ, q ∈ Rᵏ, (p,q) ≠ 0 such that

    py + qz ≥ 0   for all (y,z) ∈ Λ

Since each yᵢ can be made as large as desired, p ≥ 0.
Let ε > 0, u = f(x) + εe, v = h(x), x ∈ Γ, where e is a vector of
ones in Rᵐ. Hence (u,v) ∈ Λ, and

    pf(x) + εpe + qh(x) ≥ 0   for all x ∈ Γ

or

    pf(x) + qh(x) ≥ −εpe   for all x ∈ Γ

Now, if

    pf(x̃) + qh(x̃) = −δ < 0   for some x̃ ∈ Γ

we get, by picking ε such that εpe < δ, that

    pf(x̃) + qh(x̃) < −εpe

which is a contradiction to the fact that pf(x) + qh(x) ≥ −εpe for all
x ∈ Γ. Hence

    pf(x) + qh(x) ≥ 0   for all x ∈ Γ

If we observe that for an m-dimensional vector function f defined
on Γ ⊂ Rⁿ we have that

and

then the following corollary is a direct consequence of Theorem 1.
Corollary
Let Γ be a nonempty convex set in Rⁿ, let f₁, f₂, f₃ be m₁-, m₂-, and
m₃-dimensional convex vector functions on Γ, and h a k-dimensional linear
vector function on Rⁿ. If

then there exist p₁ ∈ Rᵐ¹, p₂ ∈ Rᵐ², p₃ ∈ Rᵐ³, and q ∈ Rᵏ such that
We give now a generalization of Gordan's theorem of the
alternative 2.4.5 to convex functions over an arbitrary convex set in Rⁿ.
Generalized Gordan theorem [Fan-Glicksburg-Hoffman 57]
Let f be an m-dimensional convex vector function on the convex set
Γ ⊂ Rⁿ. Then either

I    f(x) < 0 has a solution x ∈ Γ

or

II   pf(x) ≥ 0 for all x ∈ Γ, for some p > 0, p ∈ Rᵐ

but never both.
PROOF (I ⇒ not II) Let x̃ ∈ Γ be a solution of f(x) < 0. Then for any
p > 0 in Rᵐ, pf(x̃) < 0, and hence II cannot hold.
(not I ⇒ II) This follows directly from Theorem 1 above by deleting
h(x) = 0 from the theorem.
To see that 3 is indeed a generalization of Gordan's theorem 2.4.5,
we let f(x) = Ax, where A is an m × n matrix. Then

    pAx ≥ 0 for all x ∈ Rⁿ, p > 0  ⇔  pA = 0, p > 0  ⇔  A′p = 0, p > 0

where the last equivalence follows by taking x = eⁱ and x = −eⁱ,
i = 1, . . . , n, where eⁱ ∈ Rⁿ has zeros for all its elements except 1 for
the ith element.
In the same spirit, Theorem 1 above can be considered a partial
generalization of Motzkin's theorem 2.4.2 to convex functions. The
generalization is partial (unlike 3, which is a complete generalization of
Gordan's theorem), because the statement of Theorem 1 does not exclude
the possibility of both systems having a solution, that is, there may exist
an x̃ ∈ Γ and p ≥ 0, (p,q) ≠ 0, such that f(x̃) < 0, h(x̃) = 0, and
pf(x) + qh(x) ≥ 0 for all x ∈ Γ. Similarly, Corollary 2 is a partial
generalization of Slater's theorem 2.4.1. However, it is possible to
sharpen Theorem 1 and make it a theorem of the alternative if we let
Γ = Rⁿ, h(x) = Bx − d, and require that the rows of B be linearly
independent. We obtain then the following result.
Theorem
Let f be a given m-dimensional convex function on Rⁿ, let B be a given
k × n matrix with linearly independent rows, and let d be a given
k-dimensional vector. Then either

I    f(x) < 0, Bx = d has a solution x ∈ Rⁿ

or

II   pf(x) + q(Bx − d) ≥ 0 for all x ∈ Rⁿ, for some p > 0, p ∈ Rᵐ,
     q ∈ Rᵏ

but never both.

PROOF (I ⇒ not II) Let x̃ ∈ Rⁿ be a solution of f(x) < 0 and Bx = d.
Then for any p > 0 and q in Rᵐ and Rᵏ respectively,

    pf(x̃) + q(Bx̃ − d) = pf(x̃) < 0

Hence II cannot hold.
(not I ⇒ II) If I has no solution, then by Theorem 1 there exist
p ≥ 0, (p,q) ≠ 0 such that

    pf(x) + q(Bx − d) ≥ 0   for all x ∈ Rⁿ

If p > 0, the theorem is proved. We assume the contrary, that p = 0,
and exhibit a contradiction. If p = 0, then

    q(Bx − d) ≥ 0   for all x ∈ Rⁿ

We will show now that B′q = 0. For, if B′q ≠ 0, then by picking
x = −qB for the case when qd ≥ 0, and x = 2(qd)qB/qBB′q for the case
when qd < 0, we obtain that q(Bx − d) < 0. Hence B′q = 0 for some
q ≠ 0, which contradicts the assumption that the rows of B are linearly
independent.
We close this chapter by obtaining another fundamental theorem
for a (possibly infinite) family of convex and linear functions.
Theorem [Bohnenblust-Karlin-Shapley 50]
Let Γ be a nonempty compact convex set in Rⁿ and let (fᵢ)ᵢ∈M be a
family (finite or infinite) of numerical functions which are convex and lower
semicontinuous on Γ, and let (hᵢ)ᵢ∈K be a family (finite or infinite) of
linear numerical functions on Rⁿ. If

    fᵢ(x) ≤ 0 for all i ∈ M, hᵢ(x) = 0 for all i ∈ K   has no solution x ∈ Γ

then for some finite subfamily (fᵢ₁, . . . ,fᵢₘ) of (fᵢ)ᵢ∈M and some finite
subfamily (hᵢ₁, . . . ,hᵢₖ) of (hᵢ)ᵢ∈K there exist p ∈ Rᵐ and q ∈ Rᵏ
such that

    p ≥ 0, (p,q) ≠ 0,  and  p₁fᵢ₁(x) + ··· + pₘfᵢₘ(x) + q₁hᵢ₁(x) + ··· + qₖhᵢₖ(x) ≥ 0
        for all x ∈ Γ

If K is empty, that is if all equalities hᵢ(x) = 0 are deleted, then the last
inequality above (≥ 0) becomes a strict inequality (> 0).


PROOF [Berge-Ghouila Houri 65] The system

    fᵢ(x) < ε for all ε > 0 and all i ∈ M,   hᵢ(x) = 0 for all i ∈ K

has no solution x in Γ. [For if it did have a solution x̃, then fᵢ(x̃) < ε
for all ε > 0 and all i ∈ M, and hᵢ(x̃) = 0 for all i ∈ K. This in turn
implies that fᵢ(x̃) ≤ 0 for all i ∈ M and hᵢ(x̃) = 0 for all i ∈ K (for
otherwise, if fᵢ(x̃) > 0 for some i ∈ M, then picking ε = ½fᵢ(x̃) > 0
would lead to a contradiction). This however contradicts the hypothesis
of the theorem.] The sets

    {x | x ∈ Γ, fᵢ(x) ≤ ε},  i ∈ M, ε > 0   and   {x | x ∈ Γ, hᵢ(x) = 0},  i ∈ K

are closed sets (because of the lower semicontinuity of fᵢ, the linearity of
hᵢ, and the compactness of Γ, see Appendix C) contained in the compact
set Γ, and their intersection is empty. Hence by the finite intersection
theorem B.3.2(iii) there exist a finite number of such sets whose
intersection is empty. Thus we obtain indices (i₁,i₂, . . . ,iₘ) ∈ M,
(i₁,i₂, . . . ,iₖ) ∈ K, and real numbers ε₁, ε₂, . . . , εₘ > 0, such that
the system

    fᵢⱼ(x) < εⱼ,  j = 1, . . . , m,   hᵢⱼ(x) = 0,  j = 1, . . . , k

has no solution x ∈ Γ. Hence by Corollary 2 there exist p ∈ Rᵐ, q ∈ Rᵏ
such that

    p ≥ 0,  (p,q) ≠ 0

and

    p₁[fᵢ₁(x) − ε₁] + ··· + pₘ[fᵢₘ(x) − εₘ] + q₁hᵢ₁(x) + ··· + qₖhᵢₖ(x) ≥ 0
        for all x ∈ Γ

from which the conclusion of the theorem follows if we observe that
p₁ε₁ + ··· + pₘεₘ ≥ 0, with strict inequality holding whenever p ≠ 0.

Chapter Five
Saddlepoint Optimality Criteria of Nonlinear Programming Without Differentiability

The purpose of this chapter is to derive optimality criteria of the
saddlepoint type for nonlinear programming problems. This type of
optimality criterion is perhaps best illustrated by a simple example.
Consider the problem of minimizing the function θ on the set
X = {x | x ∈ R, −x + 2 ≤ 0}, where θ(x) = (x)². Obviously the
solution is x̄ = 2, and the minimum is θ(x̄) = 4. The saddlepoint
optimality criterion for this problem is this: A necessary and sufficient
condition that x̄ be a solution of the minimization problem is that there
exist a real number ū (here ū = 4) such that for all x ∈ R and all
u ∈ R, u ≥ 0

    θ(x̄) + u(−x̄ + 2) ≤ θ(x̄) + ū(−x̄ + 2) ≤ θ(x) + ū(−x + 2)

It is easy to verify that the above inequalities are satisfied for x̄ = 2,
ū = 4. Hence the function ψ defined on R² by

    ψ(x,u) = θ(x) + u(−x + 2)

has a saddlepoint at x̄ = 2, ū = 4, because it has a minimum at (x̄,ū)
with respect to x for all real x, and a maximum with respect to u for
all real nonnegative u.
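The claimed saddlepoint at x̄ = 2, ū = 4 is easy to verify numerically.
A Python check of our own (the grids and tolerances are arbitrary
illustrative choices):

    import numpy as np

    # psi(x, u) = theta(x) + u*(-x + 2) for theta(x) = x**2; the saddle
    # inequalities are psi(2, u) <= psi(2, 4) <= psi(x, 4) for all real x
    # and all u >= 0.
    psi = lambda x, u: x**2 + u * (-x + 2)
    xbar, ubar = 2.0, 4.0
    xs = np.linspace(-10.0, 10.0, 2001)
    us = np.linspace(0.0, 100.0, 2001)
    print(np.all(psi(xbar, us) <= psi(xbar, ubar) + 1e-12))  # max over u >= 0
    print(np.all(psi(xbar, ubar) <= psi(xs, ubar) + 1e-12))  # min over x

Indeed psi(2, u) = 4 for every u, and psi(x, 4) = (x − 2)² + 4 ≥ 4, so both
inequalities hold.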
For the above simple problem, the saddlepoint criterion happens to be both a necessary and a
sufficient optimality criterion for x
to be a solution of the minimization
problem. This is not always the
case. We shall show in this chapter that the above saddlepoint condition is a sufficient optimality
condition without any convexity


requirements. However to establish the necessity of the above saddlepoint condition, we need not only convexity but also some sort of a regularity condition, a constraint qualification. This confirms earlier statements made to the effect that necessary optimality conditions are more
complex and harder to establish.
We shall develop the optimality criteria of this chapter without
any differentiability assumptions on the functions involved. Subsequent
chapters, Chaps. 7 and 11, will establish optimality criteria that involve
differentiable functions.

1. The minimization and saddlepoint problems


The optimality criteria of this chapter relate the solutions of a minimization problem, a local minimization problem, and two saddlepoint problems to each other. We define these problems below now.
Let X⁰ be a subset of Rⁿ, and let θ and g be respectively a numerical
function and an m-dimensional vector function defined on X⁰.
The minimization problem (MP)
Find an x̄, if it exists, such that

    θ(x̄) = min {θ(x) | x ∈ X},   X = {x | x ∈ X⁰, g(x) ≤ 0}

The set X is called the feasible region or the constraint set, x̄ the
minimum solution or solution, and θ(x̄) the minimum. All points x in
the feasible region X are called feasible points.
If X is a convex set, and if θ is convex on X, the minimization
problem MP is often called a convex programming problem or convex
program.
(We observe that the above minimization problem is a special case
of the general minimization problem 1.6.9, where the additional
k-dimensional vector equality constraint h(x) = 0 was also present. The reason
for this is that in the absence of differentiability there are no significant
optimality criteria for problems with nonlinear equality constraints.
Some results for linear equality constraints will be obtained however.
See 5.3.2, 5.4.2, and 5.4.8.)
The local minimization problem (LMP)
Find an x̄ in X, if it exists, such that for some open ball Bδ(x̄)
around x̄ with radius δ > 0

    x ∈ X ∩ Bδ(x̄)  ⇒  θ(x̄) ≤ θ(x)

The Fritz John saddlepoint problem (FJSP)
Find x̄ ∈ X⁰, r̄₀ ∈ R, r̄ ∈ Rᵐ, (r̄₀,r̄) ≥ 0, (r̄₀,r̄) ≠ 0, if they exist,
such that

    φ(x̄,r̄₀,r) ≤ φ(x̄,r̄₀,r̄) ≤ φ(x,r̄₀,r̄)   for all x ∈ X⁰ and all r ≥ 0, r ∈ Rᵐ

where φ(x,r₀,r) = r₀θ(x) + rg(x).
The Kuhn-Tucker saddlepoint problem (KTSP)
Find x̄ ∈ X⁰ and ū ∈ Rᵐ, ū ≥ 0, if they exist, such that

    ψ(x̄,u) ≤ ψ(x̄,ū) ≤ ψ(x,ū)   for all x ∈ X⁰ and all u ≥ 0, u ∈ Rᵐ

where ψ(x,u) = θ(x) + ug(x).

Remark
If (x̄,r̄₀,r̄) is a solution of FJSP and r̄₀ > 0, then (x̄,r̄/r̄₀) is a
solution of KTSP. Conversely, if (x̄,ū) is a solution of KTSP, then (x̄,1,ū)
is a solution of FJSP.
Remark
The numerical functions φ(x,r₀,r) and ψ(x,u) defined above are
often called Lagrangian functions or simply Lagrangians, and the
m-dimensional vectors r̄ and ū Lagrange multipliers or dual variables. These
multipliers play a role in linear and nonlinear programming which is very
similar to the role played by the Lagrange multipliers of the classical
calculus, where a function of several variables is to be minimized subject
to equality constraints (see for example [Fleming 65]). Here, because
we have inequality constraints, the Lagrange multipliers turn out to be
nonnegative. When we consider equality constraints in 5.3.2,
5.4.2, and 5.4.8, the multipliers associated with these equalities will not
be required to be nonnegative.
Remark
The right inequality of both saddlepoint problems, FJSP 3 and
KTSP 4,

    φ(x̄,r̄₀,r̄) ≤ φ(x,r̄₀,r̄)   for all x ∈ X⁰

and

    ψ(x̄,ū) ≤ ψ(x,ū)   for all x ∈ X⁰


can be interpreted as a minimum principle, akin to Pontryagin's maximum principle† [Pontryagin et al. 62]. Pontryagin's principle in its
original form is a necessary optimality condition for the optimal control
of systems described by ordinary differential equations. As such, it is a
necessary optimality condition for a programming problem, not in Rn,
but in some other space. More recently [Halkin 66, Canon et al. 66,
Mangasarian-Fromovitz 67] a minimum principle has also been established for optimal control problems described by ordinary difference
equations. This is a programming problem in Rn, which unfortunately
is not convex in general, and hence the results of this chapter do not apply.
However the optimality conditions of Chaps. 7 and 11, which are based
mainly on linearization and not on convexity, do apply to optimal control
problems described by nonlinear difference equations.
† Pontryagin gets a maximum principle instead of a minimum principle because his
Lagrangian is the negative of the Lagrangian of nonlinear programming.

2. Some basic results for minimization and local minimization problems

We establish now some basic results concerning the set of solutions of the
minimization problem and relate the solutions of the minimization and
local minimization problems to each other.
Theorem
Let X be a convex set, and let θ be a convex function on X. The set
of solutions of MP 5.1.1 is convex.
REMARK A sufficient but not necessary condition for the convexity
of X is that X⁰ be a convex set and that g be convex on X⁰. This follows
from 4.1.10 and 3.1.9.
PROOF Let x¹ and x² be solutions of MP. That is,

    θ(x¹) = θ(x²) = min {θ(x) | x ∈ X}

It follows by the convexity of X and θ that for 0 ≤ λ ≤ 1,
(1 − λ)x¹ + λx² ∈ X, and

    θ[(1 − λ)x¹ + λx²] ≤ (1 − λ)θ(x¹) + λθ(x²) = min {θ(x) | x ∈ X}

Hence (1 − λ)x¹ + λx² is also a solution of MP, and the set of solutions
is convex.


Uniqueness theorem
Let X be convex and x̄ be a solution of MP 5.1.1. If θ is strictly
convex at x̄, then x̄ is the unique solution of MP.
PROOF Let x̃ ≠ x̄ be another solution of MP, that is, x̃ ∈ X and
θ(x̃) = θ(x̄). Since X is convex, (1 − λ)x̄ + λx̃ ∈ X whenever
0 < λ < 1, and by the strict convexity of θ at x̄

    θ[(1 − λ)x̄ + λx̃] < (1 − λ)θ(x̄) + λθ(x̃) = θ(x̄)

This contradicts the assumption that θ(x̄) is a minimum, and hence x̃
cannot be another solution.
Theorem
Let X be convex, and let θ be a nonconstant concave function on X.
Then no interior point of X is a solution of MP 5.1.1, or equivalently any
solution x̄ of MP, if it exists, must be a boundary point of X.
PROOF If MP 5.1.1 has no solution, the theorem is trivially true. Let
x̄ be a solution of MP. Since θ is not constant on X, there exists a point
x ∈ X such that θ(x) > θ(x̄). If x̄ is an interior point of X, there exists
a point y ∈ X such that for some λ, 0 ≤ λ < 1

    x̄ = (1 − λ)x + λy

(see Fig. 5.2.1). Hence

    θ(x̄) ≥ (1 − λ)θ(x) + λθ(y) ≥ (1 − λ)θ(x) + λθ(x̄) > θ(x̄)

and θ(x) does not attain its minimum at an interior point x̄.
Figure 5.2.2 shows a simple example of Theorem 3 in R.
Theorem
If x̄ is a solution of MP 5.1.1, then it is also a solution of LMP 5.1.2.
The converse is true if X is convex and θ is convex at x̄.

Fig. 5.2.1


Fig. 5.2.2 A simple example of Theorem 3 in R.

PROOF If x̄ solves MP, then x̄ solves LMP for any δ > 0. To prove
the converse now, assume that x̄ solves LMP for some δ > 0, and let X
be convex and θ be convex at x̄. Let y be any point in X distinct from x̄.
Since X is convex, (1 − λ)x̄ + λy ∈ X for 0 ≤ λ ≤ 1. By choosing λ
small enough, that is, 0 < λ < δ/‖y − x̄‖ and λ ≤ 1, we have that

    (1 − λ)x̄ + λy ∈ X ∩ Bδ(x̄)

Hence

    θ(x̄) ≤ θ[(1 − λ)x̄ + λy] ≤ (1 − λ)θ(x̄) + λθ(y)

from which it follows that

    θ(x̄) ≤ θ(y)
3. Sufficient optimality criteria


The main sufficient optimality criteria developed here (1 and 2 below)
require no convexity assumptions on the minimization problem MP 5.1.1.
These criteria are quite straightforward to obtain and need no complicated machinery to derive. First results of this type were obtained in
[Uzawa 58].
Sufficient optimality theorem
If (x̄,ū) is a solution of KTSP 5.1.4, then x̄ is a solution of MP 5.1.1.
If (x̄,r̄₀,r̄) is a solution of FJSP 5.1.3, and r̄₀ > 0, then x̄ is a solution of
MP 5.1.1.
PROOF The second statement of the theorem follows trivially from the
first statement by Remark 5.1.5.
Let (x̄,ū) be a solution of KTSP 5.1.4. Then for all u ≥ 0 in Rᵐ


and all x in X⁰

    θ(x̄) + ug(x̄) ≤ θ(x̄) + ūg(x̄) ≤ θ(x) + ūg(x)

From the first inequality we have that

    (u − ū)g(x̄) ≤ 0   for all u ≥ 0

For any j, 1 ≤ j ≤ m, let

    uᵢ = ūᵢ for i ≠ j,   uⱼ = ūⱼ + 1

It follows then that gⱼ(x̄) ≤ 0. Repeating this for all j, we get that
g(x̄) ≤ 0, and hence x̄ is a feasible point, that is, x̄ ∈ X.
Now since ū ≥ 0 and g(x̄) ≤ 0, we have that ūg(x̄) ≤ 0. But
again from the first inequality of the saddlepoint problem we have, by
setting u = 0, that ūg(x̄) ≥ 0. Hence ūg(x̄) = 0.
Let x be any point in X; then from the second inequality of the
saddlepoint problem we get

    θ(x̄) = θ(x̄) + ūg(x̄) ≤ θ(x) + ūg(x) ≤ θ(x)

Hence x̄ is a solution of MP.
It should be remarked here that because no convexity assumptions
were made in the above theorem, equality constraints can be handled by
replacing them by two inequality constraints. That is, replace h(x) = 0
by h(x) ≤ 0 and h(x) ≥ 0.
Problem
Consider the minimization problem

    θ(x̄) = min {θ(x) | x ∈ X},   X = {x | x ∈ X⁰, g(x) ≤ 0, h(x) = 0}

where h is a k-dimensional vector function on X⁰ and all else is defined as
in MP 5.1.1. Let

    ψ(x,u,v) = θ(x) + ug(x) + vh(x)

and

    φ(x,r₀,r,s) = r₀θ(x) + rg(x) + sh(x)

Show that if there exist x̄ ∈ X⁰, ū ∈ Rᵐ, ū ≥ 0, v̄ ∈ Rᵏ such that

    ψ(x̄,u,v) ≤ ψ(x̄,ū,v̄) ≤ ψ(x,ū,v̄)   for all x ∈ X⁰, u ≥ 0, v ∈ Rᵏ

or if there exist x̄ ∈ X⁰, r̄₀ ∈ R, r̄₀ > 0, r̄ ∈ Rᵐ, r̄ ≥ 0, s̄ ∈ Rᵏ such that

    φ(x̄,r̄₀,r,s) ≤ φ(x̄,r̄₀,r̄,s̄) ≤ φ(x,r̄₀,r̄,s̄)   for all x ∈ X⁰, r ≥ 0, s ∈ Rᵏ

then x̄ is a solution of the minimization problem. (Notice that v̄ and s̄
are not restricted in sign.)
The question may be raised as to what sort of point is the point x̄
if (x̄,r̄₀,r̄) is a solution of FJSP 5.1.3 and we do not require that r̄₀ > 0.
An answer to this question is given by the following result.
Corollary
If (x̄,r̄₀,r̄) is a solution of FJSP 5.1.3, then either x̄ solves MP 5.1.1
or X has no interior relative to g(x) ≤ 0, that is, {x | x ∈ X⁰, g(x) < 0} = ∅.
PROOF By the same argument as in the proof of Theorem 1 above we
show that g(x̄) ≤ 0 and r̄g(x̄) = 0. Now, if r̄₀ > 0, then x̄ solves MP
by Theorem 1. If r̄₀ = 0, then r̄ > 0, and we have from the second
inequality of FJSP 5.1.3 that

    r̄g(x) ≥ r̄g(x̄) = 0   for all x ∈ X⁰

Now, if the set {x | x ∈ X⁰, g(x) < 0} is nonempty, then for any element
x in it r̄g(x) < 0, which contradicts the fact established above that
r̄g(x) ≥ 0 for all x ∈ X⁰. Hence {x | x ∈ X⁰, g(x) < 0} = ∅.

4. Necessary optimality criteria

The situation with respect to necessary criteria is considerably more
complicated than the situation with respect to sufficient optimality
criteria. The two situations are compared in the following table:

        Necessary criteria                      Sufficient criteria

    (a) Convexity needed                        No convexity needed
    (b) Consequence of separation theorem       Separation theorem of convex
        of convex sets                          sets not needed
    (c) Regularity condition (constraint        No constraint qualification
        qualification) needed in the more       needed
        important necessary criterion
        (7 below)

We begin by establishing a necessary optimality criterion which does not
require any regularity conditions. This necessary optimality criterion is
similar in spirit to the necessary optimality criterion of Fritz John [John
48] (see also Chap. 7), which was derived for the case where the functions
θ and g were differentiable but not convex. We use no differentiability
here, but instead we use convexity. The present criterion is a saddlepoint
criterion, whereas Fritz John's is a gradient criterion. The main
point of similarity is the presence of the multiplier r̄₀ in both criteria.
Fritz John saddlepoint necessary optimality theorem
[Uzawa 58, Karlin 59]
Let X⁰ be a convex set in Rⁿ, and let θ and g be convex on X⁰. If x̄
is a solution of MP 5.1.1, then x̄ and some r̄₀ ∈ R, r̄ ∈ Rᵐ, (r̄₀,r̄) ≥ 0,
(r̄₀,r̄) ≠ 0 solve FJSP 5.1.3, and r̄g(x̄) = 0.
PROOF Because x̄ solves MP

    θ(x) − θ(x̄) < 0, g(x) ≤ 0   has no solution x ∈ X⁰

By Corollary 4.2.2 there exist r̄₀ ∈ R, r̄ ∈ Rᵐ, (r̄₀,r̄) ≥ 0, (r̄₀,r̄) ≠ 0
such that

    r̄₀[θ(x) − θ(x̄)] + r̄g(x) ≥ 0   for all x ∈ X⁰

By letting x = x̄ in the above, we get that r̄g(x̄) ≥ 0. But since r̄ ≥ 0
and g(x̄) ≤ 0, we also have r̄g(x̄) ≤ 0. Hence

    r̄g(x̄) = 0

and

    r̄₀θ(x̄) + r̄g(x̄) ≤ r̄₀θ(x) + r̄g(x)   for all x ∈ X⁰

which is the second inequality of FJSP 5.1.3.
We also have, because g(x̄) ≤ 0, that

    rg(x̄) ≤ 0   for all r ≥ 0, r ∈ Rᵐ

and hence, since r̄g(x̄) = 0,

    r̄₀θ(x̄) + rg(x̄) ≤ r̄₀θ(x̄) + r̄g(x̄)   for all r ≥ 0

which is the first inequality of FJSP 5.1.3.
Problem
Consider the minimization problem

    θ(x̄) = min {θ(x) | x ∈ X},   X = {x | x ∈ X⁰, g(x) ≤ 0, h(x) = 0}

where h is a k-dimensional linear vector function on Rⁿ, θ and g are convex
on X⁰, and all else is defined as in MP 5.1.1. Show that if x̄ is a solution
of the above problem, then x̄ and some r̄₀ ∈ R, r̄ ∈ Rᵐ, s̄ ∈ Rᵏ, (r̄₀,r̄) ≥ 0,
(r̄₀,r̄,s̄) ≠ 0 satisfy r̄g(x̄) = 0 and solve the corresponding saddlepoint
problem.
(Hint: Again use Corollary 4.2.2.)


It should be remarked here that in the above necessary optimality
criteria there is no guarantee that f 0 > 0. In cases where f 0 = 0 it is
intuitively obvious that the necessary optimality criterion FJSP 5.1.3
does not say much about the minimization problem MP 5.1.1, because
the function 8 has disappeared from 5.1.3 and any other function could
have played its role. In order to exclude such cases, we have to introduce
some regularity conditions. These regularity conditions are referred to
in the literature as constraint qualifications. We shall have occasion to
use a number of these constraint qualifications throughout this book.
Some of these constraint qualifications (like the three introduced below)
make use only of the convexity properties of the functions defining the
feasible region X. Other constraint qualifications, to be introduced later,
in Chap. 7 for example, make use mostly of the differentiability properties of the functions defining the feasible region X.
Slater's constraint qualification [Slater 50]
Let X⁰ be a convex set in Rⁿ. The m-dimensional convex vector
function g on X⁰ which defines the convex feasible region

    X = {x | x ∈ X⁰, g(x) ≤ 0}

is said to satisfy Slater's constraint qualification (on X⁰) if there exists an
x̃ ∈ X⁰ such that g(x̃) < 0.
Karlin's constraint qualification [Karlin 59]
Let X⁰ be a convex set in Rⁿ. The m-dimensional convex vector
function g on X⁰ which defines the convex feasible region

    X = {x | x ∈ X⁰, g(x) ≤ 0}

is said to satisfy Karlin's constraint qualification (on X⁰) if there exists
no p ∈ Rᵐ, p > 0, such that

    pg(x) ≥ 0   for all x ∈ X⁰


The strict constraint qualification
Let X⁰ be a convex set in Rⁿ. The m-dimensional convex vector
function g on X⁰ which defines the convex feasible region

    X = {x | x ∈ X⁰, g(x) ≤ 0}

is said to satisfy the strict constraint qualification (on X⁰) if X contains
at least two distinct points x¹ and x² such that g is strictly convex at x¹.
Lemma
Slater's constraint qualification 3 and Karlin's constraint qualification
4 are equivalent. The strict constraint qualification 5 implies Slater's
and Karlin's constraint qualifications 3 and 4.
PROOF (3 ⇔ 4) By the generalized Gordan theorem 4.2.3, 3 and 4 are
equivalent.
(5 ⇒ 3) Because g is strictly convex at x¹, it follows from 4.1.4 that
for 0 < λ < 1

    g[(1 − λ)x¹ + λx²] < (1 − λ)g(x¹) + λg(x²) ≤ 0

where the last inequality follows from the fact that g(x¹) ≤ 0 and
g(x²) ≤ 0. Thus g satisfies Slater's constraint qualification 3 and hence
also Karlin's.
We are ready now to derive the most important necessary
optimality criterion without the use of differentiability. The theorem is
widely known under the name Kuhn-Tucker [Kuhn-Tucker 51], even
though Kuhn and Tucker required both convexity and differentiability
in its derivation. The theorem in its present form, without any
differentiability requirements, is attributed to Uzawa [Uzawa 58] and Karlin
[Karlin 59].
Kuhn-Tucker saddlepoint necessary optimality theorem
[Kuhn-Tucker 51, Uzawa 58, Karlin 59]
Let X⁰ be a convex set in Rⁿ, let θ and g be convex on X⁰, and let g
satisfy Slater's constraint qualification 3, Karlin's constraint qualification 4,
or the strict constraint qualification 5 on X⁰. If x̄ is a solution of MP 5.1.1,
then x̄ and some ū ∈ Rᵐ, ū ≥ 0, solve KTSP 5.1.4 and ūg(x̄) = 0.
PROOF We first observe that by Lemma 6 above we need only establish
the theorem under Karlin's constraint qualification. By Theorem 1, x̄ and


some r̄₀ ∈ R, r̄ ∈ Rᵐ, (r̄₀,r̄) ≥ 0, solve FJSP 5.1.3 and r̄g(x̄) = 0. If
r̄₀ > 0, then by Remark 5.1.5 we are done. If r̄₀ = 0, then r̄ > 0, and
from the second inequality of FJSP 5.1.3

    r̄g(x) ≥ r̄g(x̄) = 0   for all x ∈ X⁰

which contradicts Karlin's constraint qualification 4. Hence r̄₀ > 0.
We summarize in Fig. 5.4.1 the relationships between the solutions
of the various problems of this chapter.
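The theorem can be illustrated on a small convex program of our own
devising (not one of the book's examples): minimize θ(x) = x₁² + x₂²
subject to g(x) = 1 − x₁ − x₂ ≤ 0, which satisfies Slater's constraint
qualification since g(0,2) = −1 < 0. The solution is x̄ = (½,½) with
multiplier ū = 1. A Python sketch checking the KTSP inequalities and the
complementarity ūg(x̄) = 0:

    import numpy as np

    theta = lambda x: x[0]**2 + x[1]**2
    g = lambda x: 1.0 - x[0] - x[1]
    psi = lambda x, u: theta(x) + u * g(x)       # the Lagrangian of KTSP

    xbar, ubar = np.array([0.5, 0.5]), 1.0
    print(abs(ubar * g(xbar)) < 1e-12)           # complementarity holds

    # Sampled check of psi(xbar, u) <= psi(xbar, ubar) <= psi(x, ubar).
    rng = np.random.default_rng(1)
    ok_u = all(psi(xbar, u) <= psi(xbar, ubar) + 1e-12
               for u in np.linspace(0.0, 50.0, 501))
    ok_x = all(psi(xbar, ubar) <= psi(x, ubar) + 1e-12
               for x in rng.normal(size=(500, 2)))
    print(ok_u and ok_x)                         # True

Here psi(x, 1) = (x₁ − ½)² + (x₂ − ½)² + ½, so the minimum over all x is
attained at x̄, exactly as the theorem asserts.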
Fig. 5.4.1 Relationships between the solutions of the local minimization
problem (LMP) 5.1.2, the minimization problem (MP) 5.1.1, the Fritz John
saddlepoint problem (FJSP) 5.1.3, and the Kuhn-Tucker saddlepoint problem
(KTSP) 5.1.4.

We end this section by deriving a Kuhn-Tucker saddlepoint necessary
optimality criterion in the presence of linear equality constraints.
In order to do this, we have to let the set X⁰ of MP 5.1.1 be the entire
space Rⁿ.
Kuhn-Tucker saddlepoint necessary optimality theorem in the
presence of linear equality constraints [Uzawa 58]
Let θ, g be respectively a numerical function and an m-dimensional
vector function which are both convex on Rⁿ. Let h be a k-dimensional
linear vector function on Rⁿ, that is, h(x) = Bx − d, where B is a k × n
matrix and d is a k-vector. Let x̄ be a solution of the minimization problem

    θ(x̄) = min {θ(x) | g(x) ≤ 0, Bx = d}

and let g and h satisfy any of the constraint qualifications:
(i) (Generalized Slater 3) g(x) < 0, Bx = d has a solution x̃ ∈ Rⁿ


(ii) (Generalized Karlin 4) There exists no p > 0, p ∈ Rᵐ, q ∈ Rᵏ
such that

    pg(x) + q(Bx − d) ≥ 0   for all x ∈ Rⁿ

(iii) (Generalized strict 5) X contains at least two distinct points x¹ and x²
such that g is strictly convex at x¹

Then x̄ and some ū ∈ Rᵐ, ū ≥ 0, v̄ ∈ Rᵏ satisfy ūg(x̄) = 0 and

    θ(x̄) + ug(x̄) + v(Bx̄ − d) ≤ θ(x̄) + ūg(x̄) + v̄(Bx̄ − d)
        ≤ θ(x) + ūg(x) + v̄(Bx − d)
    for all x ∈ Rⁿ, u ≥ 0, v ∈ Rᵏ
PROOF We shall first establish the fact that

    (iii) ⇒ (i) ⇒ (ii)

and then prove the theorem under (ii).
[(iii) ⇒ (i)] Since g(x¹) ≤ 0, g(x²) ≤ 0, Bx¹ = d, Bx² = d, we
have for 0 < λ < 1 that B[(1 − λ)x¹ + λx²] = d and

    g[(1 − λ)x¹ + λx²] < (1 − λ)g(x¹) + λg(x²) ≤ 0

Hence (i) holds.
[(i) ⇒ (ii)] If g(x̃) < 0 and Bx̃ = d, then for any p > 0, p ∈ Rᵐ,
and any q ∈ Rᵏ,

    pg(x̃) + q(Bx̃ − d) = pg(x̃) < 0

Hence (ii) holds.
We establish now the theorem under (ii). There will be no loss
of generality if we assume that the rows B₁, . . . , Bₖ of B are linearly
independent, for suppose that some row, Bₖ say, is linearly dependent
on B₁, . . . , Bₖ₋₁, that is, Bₖ = Σᵢ₌₁ᵏ⁻¹ sᵢBᵢ, where s₁, . . . , sₖ₋₁ are fixed
real numbers. Then

    Bₖx − dₖ = Σᵢ₌₁ᵏ⁻¹ sᵢ(Bᵢx − dᵢ) + Σᵢ₌₁ᵏ⁻¹ sᵢdᵢ − dₖ

for any x satisfying Bᵢx = dᵢ, i = 1, . . . , k − 1. But since x̄ ∈ X,
and Bᵢx̄ = dᵢ, i = 1, . . . , k, it follows that Σᵢ₌₁ᵏ⁻¹ sᵢdᵢ − dₖ = 0 and
Bₖx − dₖ = 0 for any x satisfying Bᵢx = dᵢ, i = 1, . . . , k − 1.


Hence the equality constraint Bₖx = dₖ is redundant and can be dropped
from the minimization problem without changing the solution x̄. Then,
once we have established the theorem for the linearly independent rows
of B, we can reintroduce the linearly dependent row Bₖ (without changing
the minimization problem) and set v̄ₖ = 0 in the saddlepoint problem.
By 2 above, there exist r̄₀ ∈ R, r̄ ∈ Rᵐ, s̄ ∈ Rᵏ, (r̄₀,r̄) ≥ 0,
(r̄₀,r̄,s̄) ≠ 0, which satisfy r̄g(x̄) = 0 and solve the saddlepoint problem
of 2. If r̄₀ > 0, then ū = r̄/r̄₀, v̄ = s̄/r̄₀ solve the saddlepoint problem
of the present theorem, and we are done. Suppose r̄₀ = 0. Then since
r̄g(x̄) = 0 and Bx̄ − d = 0, we have by the second inequality of the
saddlepoint problem of 2 that

    0 ≤ r̄g(x) + s̄(Bx − d)   for all x ∈ Rⁿ

which contradicts (ii) above if r̄ > 0. Now suppose that r̄ = 0; then
s̄ ≠ 0 and s̄(Bx − d) ≥ 0 for all x in Rⁿ. Hence (see the last part of the
proof of 4.2.4) B′s̄ = 0, which contradicts the assumption that the rows
of B are linearly independent. Thus r̄₀ > 0.

Chapter Six
Differentiable Convex and Concave Functions

In this chapter we give some of the


properties of differentiable and
twice-differentiable convex and
concave functions. Appendix D
summarizes the results of differentiable and twice-differentiable functions which are needed in this
chapter.

1. Differentiable convex and concave functions

Let θ be a numerical function defined on an open set Γ in Rⁿ.
We recall from Appendix D that if θ is differentiable at x̄ ∈ Γ, then

    θ(x) = θ(x̄) + ∇θ(x̄)(x − x̄) + α(x̄, x − x̄)‖x − x̄‖

where ∇θ(x̄) is the n-dimensional gradient vector of θ at x̄ whose n
components are the partial derivatives of θ with respect to x₁, . . . , xₙ
evaluated at x̄, and α is a numerical function which tends to zero as x
approaches x̄.
Theorem
Let θ be a numerical function defined on an open set Γ ⊂ Rⁿ, and
let θ be differentiable at x̄ ∈ Γ. If θ is convex at x̄ ∈ Γ, then

    θ(x) − θ(x̄) ≥ ∇θ(x̄)(x − x̄)   for each x ∈ Γ

If θ is concave at x̄ ∈ Γ, then

    θ(x) − θ(x̄) ≤ ∇θ(x̄)(x − x̄)   for each x ∈ Γ

PROOF Let θ be convex at x̄. Since Γ is open, there exists an open
ball Bδ(x̄) around x̄ which is contained in Γ. Let x ∈ Γ, and let
x ≠ x̄. Then for some μ such


that 0 < μ < 1 and μ < δ/‖x − x̄‖, we have that

    x̂ = x̄ + μ(x − x̄) = (1 − μ)x̄ + μx ∈ Bδ(x̄) ⊂ Γ

Since θ is convex at x̄, it follows from 4.1.1, the convexity of Bδ(x̄) (see
3.1.7), and the fact that x̂ ∈ Bδ(x̄), that for 0 < λ ≤ 1

    (1 − λ)θ(x̄) + λθ(x̂) ≥ θ[(1 − λ)x̄ + λx̂]

or

    θ(x̂) − θ(x̄) ≥ {θ[x̄ + λ(x̂ − x̄)] − θ(x̄)}/λ

Since

    θ[x̄ + λ(x̂ − x̄)] = θ(x̄) + λ∇θ(x̄)(x̂ − x̄) + α[x̄, λ(x̂ − x̄)] λ‖x̂ − x̄‖

taking the limit of the previous expression as λ approaches zero gives

    θ(x̂) − θ(x̄) ≥ ∇θ(x̄)(x̂ − x̄)

Since θ is convex at x̄, since x ∈ Γ, and since x̂ = (1 − μ)x̄ + μx,
we have by 4.1.1 that

    μθ(x) + (1 − μ)θ(x̄) ≥ θ(x̂)

But since

    x̂ − x̄ = μ(x − x̄)

and μ > 0, the last three relations give

    θ(x) − θ(x̄) ≥ ∇θ(x̄)(x − x̄)

The proof for the concave case follows in a similar way to the above
by using 4.1.2 instead of 4.1.1.
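The gradient inequality is easy to observe numerically. A Python sketch of
our own, for the convex function θ(x) = x·x on R², whose gradient is 2x:

    import numpy as np

    # Check theta(x) - theta(xbar) >= grad(xbar).(x - xbar) on random samples;
    # for theta(x) = x.x the inequality reduces to ||x - xbar||^2 >= 0.
    theta = lambda x: float(x @ x)
    grad = lambda x: 2.0 * x

    rng = np.random.default_rng(2)
    xbar = rng.normal(size=2)
    ok = all(theta(x) - theta(xbar) >= grad(xbar) @ (x - xbar) - 1e-9
             for x in rng.normal(size=(1000, 2)))
    print(ok)  # True: the linearization at xbar never overestimates theta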
Theorem
Let θ be a numerical differentiable function on an open convex set
Γ ⊂ Rⁿ. θ is convex on Γ if and only if

    x¹, x² ∈ Γ  ⇒  θ(x²) − θ(x¹) ≥ ∇θ(x¹)(x² − x¹)

θ is concave on Γ if and only if

    x¹, x² ∈ Γ  ⇒  θ(x²) − θ(x¹) ≤ ∇θ(x¹)(x² − x¹)


PROOF
(Necessity) Since θ is convex (concave) at each x¹ ∈ Γ, this part of the
proof follows from Theorem 1 above.
(Sufficiency) We shall establish the result for the convex case. The
concave case follows in a similar way. Let x¹, x² ∈ Γ, and let 0 ≤ λ ≤ 1.
Since Γ is convex, (1 − λ)x¹ + λx² ∈ Γ. We have then

    θ(x¹) − θ[(1 − λ)x¹ + λx²] ≥ λ∇θ[(1 − λ)x¹ + λx²](x¹ − x²)
    θ(x²) − θ[(1 − λ)x¹ + λx²] ≥ (1 − λ)∇θ[(1 − λ)x¹ + λx²](x² − x¹)

Multiplying the first inequality by (1 − λ), the second one by λ, and
adding gives

    (1 − λ)θ(x¹) + λθ(x²) ≥ θ[(1 − λ)x¹ + λx²]

A geometric interpretation of the above results can be given as
follows. For a differentiable convex function θ on Γ, the linearization
θ(x̄) + ∇θ(x̄)(x − x̄) at x̄ never overestimates θ(x) for any x in Γ; see
Fig. 6.1.1. For a differentiable concave function θ on Γ, the linearization
θ(x̄) + ∇θ(x̄)(x − x̄) at x̄ never underestimates θ(x) for any x in Γ; see
Fig. 6.1.2.
Theorem
Let θ be a numerical differentiable function on an open convex set
Γ ⊂ Rⁿ. A necessary and sufficient condition that θ be convex (concave)
on Γ is that for each x¹, x² ∈ Γ

    [∇θ(x²) − ∇θ(x¹)](x² − x¹) ≥ 0   (≤ 0)

Fig. 6.1.1 Linearization of a convex function θ never
overestimates the function.


Fig. 6.1.2 Linearization of a concave function θ never
underestimates the function.

PROOF We shall establish the theorem for the convex case. The
concave case is similar.
(Necessity) Let θ be convex on Γ and let x¹, x² ∈ Γ. By Theorem 2
we have that

    θ(x²) − θ(x¹) ≥ ∇θ(x¹)(x² − x¹)

and

    θ(x¹) − θ(x²) ≥ ∇θ(x²)(x¹ − x²)

Adding these two inequalities gives

    [∇θ(x²) − ∇θ(x¹)](x² − x¹) ≥ 0

(Sufficiency) Let x¹, x² ∈ Γ. Then for 0 ≤ λ ≤ 1, (1 − λ)x¹ + λx² ∈ Γ.
Now by the mean-value theorem D.2.1 we have for some λ̄, 0 < λ̄ < 1

    θ(x²) − θ(x¹) = ∇θ[x¹ + λ̄(x² − x¹)](x² − x¹)

But by assumption

    {∇θ[x¹ + λ̄(x² − x¹)] − ∇θ(x¹)}[λ̄(x² − x¹)] ≥ 0

or

    ∇θ[x¹ + λ̄(x² − x¹)](x² − x¹) ≥ ∇θ(x¹)(x² − x¹)

Hence

    θ(x²) − θ(x¹) ≥ ∇θ(x¹)(x² − x¹)

and by Theorem 2 above, θ is convex on Γ.
If f is an n-dimensional function on Γ ⊂ Rⁿ, and [f(x²) − f(x¹)]
(x² − x¹) ≥ 0 for all x¹, x² ∈ Γ, then f is said to be monotone on Γ. It is
seen from the above theorem that a differentiable numerical function θ on
the open convex set Γ ⊂ Rⁿ is convex if and only if ∇θ is monotone on Γ.
There exists an extensive literature on monotone functions [Zarantonello
60, Browder 66, Minty 64, Opial 67], which is concerned mainly with the
solution of nonlinear equations in more general spaces than Rⁿ. Some
use of monotone functions in nonlinear programming has been made
[Karamardian 66, Rockafellar 67b], but the full power of the theory of
monotone functions has not, it seems, been exploited in nonlinear
programming.
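The monotonicity criterion of Theorem 3 is also easy to probe numerically.
A Python sketch of our own, for the smooth convex function
θ(x) = log(1 + exp(x₁ + x₂)):

    import numpy as np

    # Gradient of log(1 + exp(x1 + x2)): sigmoid(x1 + x2) times (1, 1).
    def grad(x):
        s = 1.0 / (1.0 + np.exp(-(x[0] + x[1])))
        return s * np.ones(2)

    # Sampled check of [grad(x2) - grad(x1)].(x2 - x1) >= 0.
    rng = np.random.default_rng(3)
    pts = rng.normal(size=(200, 2))
    ok = all((grad(x2) - grad(x1)) @ (x2 - x1) >= -1e-9
             for x1 in pts for x2 in pts)
    print(ok)  # True: the gradient is monotone, so theta is convex

Here the check reduces to the fact that the sigmoid is an increasing function
of x₁ + x₂.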

2. Differentiable strictly convex and concave functions

All the results of the previous section extend directly to strictly convex
and strictly concave functions by changing the inequalities ≥ and ≤ to
strict inequalities > and <. In particular we have the following results.
Theorem
Let θ be a numerical function defined on an open set Γ ⊂ Rⁿ, and
let θ be differentiable at x̄ ∈ Γ. If θ is strictly convex at x̄ ∈ Γ, then

    θ(x) − θ(x̄) > ∇θ(x̄)(x − x̄)   for each x ∈ Γ, x ≠ x̄

If θ is strictly concave at x̄ ∈ Γ, then

    θ(x) − θ(x̄) < ∇θ(x̄)(x − x̄)   for each x ∈ Γ, x ≠ x̄

Theorem
Let θ be a numerical differentiable function on an open convex set
Γ ⊂ Rⁿ. θ is strictly convex on Γ if and only if

    x¹, x² ∈ Γ, x¹ ≠ x²  ⇒  θ(x²) − θ(x¹) > ∇θ(x¹)(x² − x¹)

θ is strictly concave on Γ if and only if

    x¹, x² ∈ Γ, x¹ ≠ x²  ⇒  θ(x²) − θ(x¹) < ∇θ(x¹)(x² − x¹)

Theorem
Let θ be a numerical differentiable function on an open convex set
Γ ⊂ Rⁿ. A necessary and sufficient condition that θ be strictly convex
(concave) on Γ is that for each x¹, x² ∈ Γ, x¹ ≠ x²

    [∇θ(x²) − ∇θ(x¹)](x² − x¹) > 0   (< 0)

The proofs of Theorems 2 and 3 above are essentially identical to
the proofs of Theorems 6.1.2 and 6.1.3 respectively and are left to the


reader. The proof of Theorem 1 is not only different from the proof of
Theorem 6.1.1 but also makes use of Theorem 6.1.1. We give this proof
below.
PROOF OF THEOREM 1 Let θ be strictly convex at x̄. Then

    x ∈ Γ, x ≠ x̄, 0 < λ < 1, (1 − λ)x̄ + λx ∈ Γ
        ⇒ θ[(1 − λ)x̄ + λx] < (1 − λ)θ(x̄) + λθ(x)                    (4)

Since θ is convex at x̄, it follows from Theorem 6.1.1 that

    θ(x) − θ(x̄) ≥ ∇θ(x̄)(x − x̄)   for each x ∈ Γ                    (5)

We will now show that if equality holds in 5 for some x in Γ which is
distinct from x̄, then a contradiction ensues. Let equality hold in 5 for
x = x̂, x̂ ∈ Γ and x̂ ≠ x̄. Then

    θ(x̂) − θ(x̄) = ∇θ(x̄)(x̂ − x̄)                                     (6)

From 4 and 6 we have that for 0 < λ < 1

    θ[(1 − λ)x̄ + λx̂] < (1 − λ)θ(x̄) + λθ(x̂) = θ(x̄) + λ∇θ(x̄)(x̂ − x̄)

or

    θ[x̄ + λ(x̂ − x̄)] − θ(x̄) < λ∇θ(x̄)(x̂ − x̄)                         (7)

But by applying Theorem 6.1.1 again to the points x(λ) = (1 − λ)x̄ + λx̂,
which are in Γ, and x̄, we obtain that

    θ[x̄ + λ(x̂ − x̄)] − θ(x̄) ≥ λ∇θ(x̄)(x̂ − x̄)

which contradicts 7, since for small λ, (1 − λ)x̄ + λx̂ ∈ Γ because Γ is
open. Hence equality cannot hold in 5 for any x in Γ which is distinct
from x̄, and the theorem is established for the convex case. The concave
case is similarly proved.

3. Twice-differentiable convex and concave functions

Let θ be a numerical function defined on some open set Γ in Rⁿ. We
recall from Appendix D that if θ is twice differentiable at x̄ ∈ Γ, then

    θ(x) = θ(x̄) + ∇θ(x̄)(x − x̄) + ½(x − x̄)∇²θ(x̄)(x − x̄)
               + β(x̄, x − x̄)‖x − x̄‖²

where ∇θ(x̄) is the n-dimensional gradient vector of θ at x̄, and ∇²θ(x̄) is
the n × n Hessian matrix of θ at x̄ whose ijth element is ∂²θ(x̄)/∂xᵢ∂xⱼ,
i, j = 1, . . . , n.
Theorem
Let 6 be a numerical function defined on an open set T C Rn, and
let 6 be twice differentiate at x G T. If 0 is convex at x, then V28(x) is
positive semidefinite, that is,
If 6 is concave at x, then V28(x) is negative semidefinite, that is,

PROOF

Let y G Rn-

Because F is open, there exists a X > 0 such that

By Theorem 6.1.1 it follows that


But since 8 is twice differentiable at x

Hence

Taking the limit as X approaches zero, and recalling that lim 0(x,\y) = 0,
we get that
The concave case is established in a similar way.
Theorem
Let 8 be a numerical twice-differentiable function on an open convex
set T C Rn- 8 is convex on T if and only if V20(z) is positive semidefinite
on T, that is, for each x G T
8 is concave on T if and only if V28(x) is negative semidefinite on T, that is,
for each x G T

89

8.4

Nonlinear Programming

PROOF
(Necessity) Since 6 is convex (concave) at each x (E T, this part of the
proof follows from Theorem 1 above.
(Sufficiency) By Taylor's theorem D.2.2 we have that for any x1^ T

for some 8, 0 < 5 < 1. But the right-hand side of the above equality is
nonnegative (nonpositive), because V 2 0(z) is positive (negative) semidefinite on T, and xl + &(xz xl) G r. Hence the left-hand side is
nonnegative (nonpositive), and by Theorem 6.1.2 6 is convex (concave)
on T.

4. Twice-differentiable strictly convex and


concave functions
Not all the results of the previous section extend to strictly convex and
strictly concave functions by replacing inequalities by strict inequalities.
In fact we begin by establishing the following partially negative result.
Theorem
Let 6 be a numerical function defined on an open set T C Rn, and
let 8 be twice differentiate atx^T. If 6 is strictly convex at x, then Vz6(x)
is positive semidefinite, but not necessarily positive definite; that is, it is not
necessarily true that

// 6 is strictly concave at x, then V29(x) is negative semidefinite, but not


necessarily negative definite; that is, it is not necessarily true that

PROOF If 6 is strictly convex at x, then 0 is convex at x, and by Theorem


6.3.1 V 2 0(z) is positive semidefinite. That V 2 0(z) is not necessarily positive definite can be seen from the counterexample 6(x) = (x)4, x R.
6 is strictly convex on R but V20(z) = 12(x)2 is not positive definite since
V20(0) = 0. The concave case is established similarly.
90

Diflerentiable Convex and Concave Functions

6.4

Theorem
Let 6 be a numerical twice-differentiable function on an open convex
set F C Rn- -4 nonnecessary but sufficient condition that 8 be strictly convex on T is that V 2 0(z) be positive definite on T; that is, for each x G T
A nonnecessary but sufficient condition that 6 be strictly concave on T is
that V 2 0(z) be negative definite on T; that is, for each x G F

PROOF The nonnecessity follows from Theorem 1 above. The sufficiency proof is essentially identical to the sufficiency proof of Theorem
6.3.8.

91

Chapter
Seven
Optimality
Criteria
in Nonlinear
Programming
With
Differentiability

In Chap. 5 we developed optimality


criteria without the use of any
differentiability assumptions on
the functions entering into the
nonlinear programming problem.
Many problems (for example linear
programming and quadratic programming) involve differentiate
functions. It is important then
to develop optimality criteria that
take advantage of this property.
These criteria are merely extensions
of the well-known and often abused
optimality criterion of the classical
calculus of "setting the derivatives
equal to zero."
As we did in Chap. 5, we
shall develop necessary and sufficient optimality criteria. For the
sufficient optimality criteria we
shall need differentiability and convexity (in 5.3.1 no convexity was
needed). For the necessary optimality criteria we shall need differentiability, and, depending on
which criterion we are talking
about, we shall or we shall not
need a constraint qualification.
Note that we do not need any convexity requirements in order to
establish necessary optimality criteria here. This is unlike the necessary optimality criteria of Sec.
5.4, where convexity was crucially
needed.
Again, as is the case of
Chap. 5, the main sufficient optimality criteria here are straightforward to establish. Only simple
inequalities relating convex functions and their gradients are needed.
The necessary optimality criteria,
on the other hand, need more
sophisticated arguments that involve theorems of the alternative,
and also, some of them require some
sort of constraint qualification.

Nonlinear equality constraints will be treated cursorily in this


chapter. A subsequent chapter, Chap. 11, will be devoted to programming problems with nonlinear equality constraints.

1. The minimization problems, and the Fritz John and


Kuhn-Tucker stationary-point problems
The optimality criteria of this chapter relate the solutions of a minimization problem, a local minimization problem, and two stationary-point
problems (the Fritz John and Kuhn-Tucker problems) to each other.
The minimization and the local minimization problems considered here
will be the same as the corresponding problems of Chap. 5, that is, problems 5.1.1 and 5.1.2, with the added differentiability assumption. The
Fritz John and Kuhn-Tucker problems of this chapter (3 and 4 below)
follow from the Fritz John and Kuhn-Tucker saddlepoint problems 5.1.3
and 5.1.4 if differentiability is assumed, and conversely the Fritz John
and Kuhn-Tucker saddlepoint problems 5.1.3 and 5.1.4 follow from the
Fritz John and Kuhn-Tucker problems (3 and 4 below) if convexity is
assumed (see 7.3.8 below).
Let X be an open set in Rn, and let 6 and g be respectively a
numerical function and an ??i-dimensional vector function both defined
on X. (In many nonlinear programming problems X is Rn.)
The minimization problem (MP)
Find an x, if it exists, such that

The local minimization problem (LMP)


Find an x in X, if it exists, such that for some open ball Bs(x)
around x with radius 5 > 0

The Fritz John stationary-point problem (FJP)


Find x -X"0, f0^R,f^Rm

if they exist, such that

93

Nonlinear Programming

7.2

(It is implicit in the above statement that 6 and g are differentiable at x.)
The Kuhn-Tucker stationary-point problem (KTP)
Find x G X, u E Rm if they exist, such that

(Again, it is implicit in the above statement that 8 and g are differentiable


at x.)
Remark
If (x,f 0 ,f) is a solution of FJP 8, and f 0 > 0, then (x,f/f0) is a
solution of KTP 4- Conversely, if (x,u) is a solution of KTP 4, then
(z,l,w) is a solution of FJP 3.
Remark
The Lagrangian functions <t>(z,rQ,r) and $(x,u) defined above are
precisely the same Lagrangian functions defined in Chap. 5 (see 5.1.3
and 5.1.4).

2.

Sufficient optimality criteria

The sufficient optimality criteria developed here (1 and 2 below), unlike


the sufficient optimality criteria 5.3.1, depend heavily on convexity.
Their derivation, however, is quite straightforward.
Sufficient optimality theorem [Kuhn-Tucker 51]
Let x G -X"0, let X be open, and let 6 and g be differentiable and convex at x. If (x,u) is a solution of KTP 7.1.4, then x is a solution of MP
7.1.1. If (x,f0,f) is a solution of FJP 7.1.3, and f0 > 0, then x is a solution
of MP 7.1.1.
PROOF The second statement of the theorem follows trivially from
the first statement by Remark 7.1.5.
94

7.2

Optlmality Criteria with Diflerentlability

Let (x,u) be a solution of KTP.

We have for any x in X that

Hence
Since <7(x) ^ 0, x is in .X", and hence

It should be remarked here that because of the convexity requirements on g, nonlinear equality constraints of the type h(x) = 0 cannot
be handled by the above theorem by replacing them by two inequalities
h(x) ^ 0 and h(x) 5 0. However, linear equality constraints can be
handled by this procedure.
Problem
Let x G X, let X be open, let 6 and g be differentiable and convex
at x, let B be a given k X n matrix, and let d be a given fc-dimensional
vector. Show that if (f,w,y), x G X, u G Rm, v G Rk is a solution of
the following Kuhn-Tucker problem

then

An interesting case not covered by Theorem 1 above is the case


when (x,fo,f) is a solution of FJP 7.1.3 but the requirement that f0 > 0
95

7.2

Nonlinear Programming

is not made, in order to guarantee that x be a solution of MP 7.1.1. This


is given by the following theorem, where instead of requiring that f 0 > 0,
we require that g be strictly convex at x.
Sufficient optimality theorem
Let x G X, let, X be open, let 6 be differentidble and convex at x,
and let g be differentiate and strictly convex at x. If (x,f0,f) is a solution of
FJP 7.1.3, then x solves MP 7.1.1.
PROOF

Let (x,fQ,f) solve FJP.

/ = {i\Qi(x) = 0 }

Let

J = [i\gi(x) <0}

I\JJ = {1,2, . . . ,m}t

Since f ^ 0, g(x) ^ 0, and fg(x) 0, we have that

?igi(x) = 0

for i 1, . . . , m

and hence
f, = 0

for i E J

Since r0V6(x) -f fVg(x) = 0 and (f 0 ,f) > 0, we have that

It follows then by Gordan's theorem 2.4-5 that

has a solution
Consequently

has a solution
for if it did have a solution x E X, then x 5^ x, and
0 > tf(f) - 0(z) ^ V5(x)(f - x)

(by convexity of 0 at x and ff.^.jf)

0 ^ (7/(^) ^/(x) > Vgi(x)(x x)

(by strict convexity of g at and


6.2.1)

f Sometimes the constraints gi(x) ^ 0 are said to be active constraints at x, and pj(x) ^ 0
are said to be inactive constraints at x.
96

Optimality Criteria with Differentiability

which contradicts 4 if we set z = x x.


have from 5 that

7.3

Recalling that gi(x) = 0, we

Since g(x) ^ 0, x is in X, and hence x solves MP 7.1.1.


Remark
In 1 and 2 above, since u ^ 0, g(x) ^ 0, and ug(x) = 0, we have
that iiigi(x) 0 for i = 1, . . . , m, and hence iii = 0 for

Similarly in S we have already shown that fi 0 for i G J It is obvious


then from the proofs above that the convexity of gi at x is all that is
needed in 1 and 2 instead of the convexity of g at x (as was assumed in
1 and 2), and that the strict convexity of gi at x is all that is needed in
3 instead of the strict convexity of g at x (as was assumed in $). The
more restrictive assumptions were made in 1, 2, and 8 to make the
statements of the theorems simpler.

3. Necessary optimality criteria


In the necessary optimality conditions to be derived now, convexity does
not play any crucial role. The differentiability property of the functions
is used to linearize the nonlinear programming problem, and then theorems of the alternative are used to obtain the necessary optimality conditions. Once again, to derive the more important necessary optimality
conditions (7 below), constraint qualifications are needed.
We begin with the following slight generalization of Abadie's
linearization lemma, which establishes the nonexistence of a solution of a
system of linear inequalities whenever the local minimization problem
LMP 7.1.2 has a solution.
Linearization lemma [Abadie 67]
Let x be a solution of LMP 7.1.2, let X be open, let d and g be differentiable at x, let
concave
97

7.8

Nonlinear Programming

and let

W {i') ffi(x) 0, and ^, is not concave at x}


Then the system

has no solution z in Rn.


PROOF

Let

Hence
Let x be a solution of LMP 7.1.2 with 5 = 5. We shall show that
if z satisfies V6(x)z < 0, Vgw(x)z < 0, and Vgry(z)2 ^ 0 then a contradiction ensues. Let z satisfy these inequalities. Then, since X is open,
there exists a 5 > 0 such that
Since 6 and g are differentiate at x (see D.I.3), we have that for 0 < 8 < I

where
(i) If 6 is small enough (say 0 < 5 < 50), then
and hence

then
98

(ii)

Similarly, for i G W and 5 small enough (say 0 < 8 < 5,),

Optimallty Criteria with Differentiability

7.3

and hence

that

(iii)

For i E V, we have, since <?< is concave at x and V0v(;r)z ^ 0,

(iv) For i E /, 0i(z) < 0. Hence for small enough 5 (say


0 < 5 < 5,), we have
and hence
Let us call 5 the minimum of all the positive numbers 5, 5, 50, fi,
(i = 1, . . . ,m) defined above. Then for any 5 in the interval 0 < 5 < 5
we have that

Hence, for 0 < 5 < 6, we have that x + 626 BI(X) r\Xand6(x + dz) < 6(x),
which contradicts the assumption that x is a solution of LMP 7.1.2 with 5
= 5. Hence there exists no z in R" satisfying V0()z < 0, Vgr,,-(f )z < 0, and
V0v()z^0.
We are now ready to derive a series of necessary optimality criteria
based on the above linearization lemma. We begin by a generalization of a
result of Abadie which, for the case of a finite number of constraints, includes
the classical result of Fritz John.
Fritz John stationary-point necessary optimality theorem
[John 48, Abadie 67]
Let xbea solution of LMP 7.1.2 or of MP 7.1.1, let X be open and
let 8 and g be differentiable at x. Then there exists an f0 E R and anfRm
such that (x,r0,f) solves FJP 7.1.3, and

99

7.S

Nonlinear Programming

where

PROOF

If x solves MP 7.1.1, then x solves LMP 7.1.2.

Let

and

By Lemma 1 above we have that

Hence by Motzkin's theorem 2.4-2 there exist f0, fw, fv such that

Since gw(x) = 0 and <?K(Z) = 0, it follows that if we define


then

and

Since x is in JT, ^(x) 5j 0.


(f0,r > 0

Hence (x,r 0 ,f) solves FJP 7.1.3, and

REMARK The above theorem becomes that of Abadie [Abadie 67] if we


replace the concavity of gv at x and the nonconcavity of gw at x by the
requirements that gv be linear on Rn and that gw be nonlinear on Rn.
It might not be immediately obvious why the above result is more general
than Abadie's. One way to see this is the following: Besides (f 0 , f) > 0,
we try to establish in addition the semipositivity of the smallest subset of
the components of (f 0 , f). For example if we can establish the semipositivity of fo (and hence its positivity), then this is the sharpest necessary
100

ttttttH

optimality condition we can get (see 7 below). In the above theorem


we establish the semipositivity of (f 0 , f^), where W is the set of indices of
the active constraints which are not concave at x. Abadie establishes
the semipositivity of a larger-subset of components of (f 0 , f), namely that
of (f 0 ,ftf), where N is the set of indices of the active constraints which are
nonlinear on Rn. On the other hand, the above theorem becomes that
of Fritz John [John 48] if we drop the result that ( r 0 , f w ) > 0.
REMARK If X is convex and gv is concave on X, then the above
theorem certainly holds. However, the concavity of j/v does not (unless
gv is linear) make the set {x \ x E -X"0, gv(x) ^ 0} convex (see J^.1.10
and 4.1.11).
Again, just as in the case for the saddlepoint necessary optimality
criteria (see Section 5.4), there is no guarantee that f 0 > 0 in Theorem 2
above. This can occur, for example, when the solution x of LMP 7.1.2 or
of MP 7.1.1 is at a cusp, Fig. 7.3.la, or when the feasible region x is degenerate and is made up of one point only, Fig. 7.3.1b. In cases when f0 0,
the function 6 disappears from the Fritz John problem 7.1.3, and we have a
degenerate case. As was done in Sec. 5.4, it is possible to exclude such
cases by introducing restrictions on the constraints g(x) ^ 0. These
restrictions are again called, as in Sec. 5.4, constraint qualifications. In
addition to Slater's constraint qualification 5.4-3, Karlin's constraint
qualification 5.4-4, and the strict constraint qualification 5.4-5, we shall
introduce the Kuhn-Tucker constraint qualification 3, the Arrow-Hurwicz-

Fig. 7.3.1 Examples of minimization problems in which f 0 of the Fritz John


problem 7.1.3 is zero.

101

7.3

Nonlinear Programming

Uzawa constraint qualification 4, and the reverse convex constraint


qualification 5.
The Kuhn-Tucker constraint qualification [Kuhn-Tucker 51]
Let X be an open set in Rn, let g be an w-dimensional vector
function defined on X, and let

g is said to satisfy the Kuhn-Tucker constraint qualification at x G: X if


g is differentiate at x and if

where

The Arrow-Hurwicz-Uzawa constraint qualification


[Arrow et al. 61]
Let X be an open set in Rn, let g be an w-dimensional vector function defined on X, and let

0 is said to satisfy the Arrow-Hurwicz-Uzawa constraint qualification at


x E X if g is differentiate at x and if

where
and

108

Optimality Criteria with Differentiability

7.3

The reverse convex constraint qualification [Arrow et al. 61]


Let X be an open set in Rn, let g be an w-dimensional vector function defined on X, and let
g is said to satisfy the reverse convex constraint qualification at x G X
if g is differentiate at x, and if for each i G / either p, is concave at x or
gi is linear on Rn, where
(The reason for the adjective reverse convex is that if A" 0 is a convex set and gij is concave on X, then the set {x | x X, giej(x) ^ 0} is
not convex, but its complement relative to X, that is, {x \ x X, g^i(x)
> 0}, is a convex set. For example if X = R" and g(x) = xx + 1, then g
satisfies the reverse constraint qualification at each x {x \ x Rn,
xx ^ 1}.)
Before deriving the fundamental Kuhn-Tucker necessary optimality conditions (7 below), we relate the three constraint qualifications
just introduced to each other and to the constraint qualifications of
Chap. 5.
Lemma
Let X be an open set in Rn, let g be an m-diniensional vector function
defined X, and let

(i)
(ii)
(iii)

// g satisfies the reverse convex constraint qualification 5 at x, then g


satisfies the Arrow-Hurwicz-Uzawa constraint qualification 4 at x// g satisfies the reverse convex constraint qualification 5 at x, then g
satisfies the Kuhn-Tucker constraint qualification 3 at x.
Let X be convex, let g be convex on X{\ and let g be differentiable at x.
If g satisfies Slater's constraint qualification 5.4-3 on X, Karlin's
constraint qualification 5.4-4 on A', or the strict constraint qualification 5.4-5 on X, then g satisjies the Arrow-Hurwicz-Lzawa constraint
qualification 4 at x.

PROOF (i) Let g satisfy the reverse constraint qualification 5 at x.


Then the set W defined in 4 is empty, and 2 = 0 satisfies
Hence g satisfies the Arrow-Hurwicz-Uzawa constraint qualification 4 at x.
103

7.3

Nonlinear Programming

(ii)

Let g satisfy the reverse constraint qualification 5 at x,

Define
Let y be any vector in Rn satisfying
Define
for some

Obviously conditions (a) and (c) of the Kuhn-Tucker constraint qualification 3 are satisfied. We will now show that condition (6) is also
satisfied. Since X is open, and since gi is concave and differentiate
at x, we have that

and
for 0 ^ T ^ 1 and 0 ^ X < X. Hence for 0 ^ X < X and 0 ^ r ^ 1, we
have that
and for i G /, we have that

where the last inequality holds because gj(x) < 0 and lim ati(x,\Ty) = 0.
Hence gj[e(r)\ < 0, for 0 ^ r ^ 1. Since gi[e(r)] ^ 0 for 0 ^ r ^ 1,
we have that

and condition (6) of the Kuhn-Tucker constraint qualification 3 is satisfied,


(iii) By Lemma 5.4-6 we have that Slater's constraint qualification
5.4-3 and Karlin's constraint qualification 5.4-4 are equivalent and that
the strict constraint qualification 5.4-5 implies both Slater's and Karlin's
constraint qualification. Hence we need only establish the present lemma
under Slater's constraint qualification. If g satisfies Slater's constraint
104

Optimality Criteria with Differentiability

qualification on X, there exists an x G X such that g(x) < 0.


is differentiable at x, we have by 6.1.1 that

7.3

Since g

where
Hence by taking z = x x, we have that Vgf/(z)z > 0, and the ArrowHurwicz-Uzawa constraint qualification 4 is satisfied at x.
The results of the above lemma are summarized in Fig. 7.3.2.
We are now ready to derive the fundamental necessary optimality
criterion of nonlinear programming, the Kuhn-Tucker necessary optimality criterion. We shall establish the result under all the constraint qualifications introduced. In view of Lemma 6 above, we need only establish
the result under the Kuhn-Tucker constraint qualification 3 and the
Arrow-Hurwicz-Uzawa constraint qualification 4- (A somewhat involved
proof [Abadie 67] shows that a special case of the Arrow-Hurwicz-Uzawa
constraint qualification, where gv are the linear active constraints and
gw are the nonlinear active constraints, implies the Kuhn-Tucker constraint qualification. We take here a different and somewhat simpler
approach and show that either the Arrow-Hurwicz-Uzawa or the KuhnTucker constraint qualification is adequate for establishing the necessary
optimality conditions we are after.)
Kuhn-Tucker stationary-point necessary optimality theorem
[Kuhn-Tucker 51]
Let X be an open subset of Rn, let 6 and g be defined on X, let x
solve LMP 7.1.2 or MP 7.1.1, let 6 and g be differentiable at x, and let g

Fig. 7.3.2 Relations between the various constraint qualifications (CQ).


IDS

7.3

Nonlinear Programming

satisfy
(i) the Kuhn-Tucker constraint qualification 3 at x, or
(ii) the Arrow-Hurwicz-Uzawa constraint qualification 4 at x, or
(iii) the reverse convex constraint qualification 5 at x, or
(iv) Slater's constraint qualification 5.4-3 on X, or
(v) Karlin's constraint qualification 5.4-4 on X, or
(vi) the strict constraint qualification 5-4-5 on X.
Then there exists a u (E Rm such that (x,u) solves KTP 7.1-4PROOF In view of Lemma 6 above we need only establish the theorem
under assumptions (i) or (ii).
(i) Let x solve LMP 7.1.8 with 5 = 5. Let
We have to consider two cases: the case when 7 is empty and the case
when 7 is nonempty.
(7 = 0) Let y be any vector in Rn such that yy = 1. Then
Since Qi(x) < 0 and lim ai(x,8y) = 0, it follows that for small enough 5,
-o
say 0 < 8 < 8 < 5, gt(x + fy) < 0 and x + Sy E ^. But since x
solves LMP 7.1.2, we have that

for 0 < 5 < 5. Hence

Since lim a(x,dy) = 0, we have on taking the limit of the above expression
*-o
as 5 approaches zero that

Since y is an arbitrary vector in 72n satisfying yy 1, we conclude from


this last inequality, by taking y = e', where ei G Rn is a vector with
one in the ith position and zero elsewhere, that

Hence x and u = 0 satisfy KTP 7.1.4.


106

7.3

Optimality Criteria with Differentiability

(/ 7^ 0) Let g satisfy the Kuhn-Tucker constraint qualification


3 at x, and let y G Rn satisfy Vgi(x)y ^ 0. Hence by 3, there exists an
n-dimensional vector function e defined on [0,1] such that e(0) = x,
e(r) G X for 0 ^ T ^ 1, e is differentiate at T = 0, and de(Q)/dr = \y
for some X > 0. Hence for 0 f T ^ 1

where lim Ti(0,r) = 0. Hence by taking T small enough, say 0 < T <
f < 1, we have that e(r) G Bi(x).
x solves LMP 7.1.2, we have that

Since e(r) G X for 0 ^ T ^ 1 and

Hence b}^ the chain rule D.I.6 and the differentiability of 6 at x and of e
at 0, we have for 0 < T < f that

where lim /3(0,r) = 0.

Hence

Taking the limit as T approaches zero gives

Since e(0) = x and de(0)/dr = Ay for some X > 0, we have that

Hence we have shown that


or that

Hence by Motzkin's theorem 2.4-2 there exists an fo and an f/ such that

107

7.8

Nonlinear Programming

Since f 0 is a real number, f 0 > 0 means f0 > 0. By defining

we have that

and since x X, we have that

Hence (x,u) solves KTP 7J.4(ii) Let solve MP 7.1.1 or LMP 7J.0. Then by the Fritz John
theorem 2 there exists an f0 R and an f G # m such that (x, r0, f) solves
FJP 7.1.3, and
where
Define

and
By Remark 7.1.5 we only have to show that f 0 > 0. Since (f0,fw) > 0,
we have that f 0 > 0 if W is empty. Assume now that W is nonempty.
We will now show by contradiction that f0 > 0. Suppose that f 0 = 0;
then since fj = 0, we have that

Since g satisfies the Arrow-Hurwicz-Uzawa constraint qualification 4 at


x, there exists a z #" such that

108

Optimality Criteria with Differentiability

7.S

Fig. 7.3.3 A geometric interpretation of the Kuhn-Tucker conditions


7.1.4.

Premultiplying these inequalities by fw and TV respectively gives


which contradicts the fact that
Hence f 0 > 0.

Fig. 7.3.4 Relationship between


the solutions of the
local minimization
problem (LMP) 7.1.2,
the
minimization
problem (MP) 7.1.1,
the Fritz John problem (FJP) 7.1.S, and
the
Kuhn-Tucker
problem
(KTSP)
7.1.4.
109

7.S

Nonlinear Programming

Fig. 7.3.5 Relationships between the solutions of the local minimization problem (LMP) 7.1.2, the minimization problem (MP) 7.1.1, the Fritz John
problem (FJP) 7.1.3, the Kuhn-Tucker problem (KTP) 7.14, the Fritz
John saddlepoint problem (FJSP) 5.1.3, and the Kuhn-Tucker saddlepoint problem (KTSP) S.I.4.

A geometric interpretation of the Kuhn-Tucker conditions 7.1.4


can be given as follows. At x, there exists a nonnegative linear combination of the gradient of the objective function V6(x) (with positive weight)
and the gradients of the active constraints ^gi(x), i G I, which is equal
to zero. Figure 7.8.3 depicts such a situation. If, however, none of the
constraint qualifications are satisfied, there may not exist such a nonnegative linear combination of V6(x) and V0/(x). In such cases V6(x) may
have a zero weight, as for example in Fig. 7.3.1.
The various optimality criteria of this chapter are related to each
other in Fig. 7.3.4, and to the optimality criteria of Chap. 5 in Fig. 7.3.5.
Problem
Let X be an open set in Rn, and let 6 and g be defined on X.
Find the conditions under which
110

Optimality Criteria with Differentiability

(i)
(ii)

7.3

A solution (x, r 0 , f) of the Fritz John saddlepoint problem FJSP 5.1.3


is a solution of the Fritz John problem FJP 7.1.3, and conversely,
A solution (x,u) of the Kuhn-Tucker saddlepoint problem KTSP
5.1.4 is a solution of the Kuhn-Tucker problem KTP 7.1.4, and
conversely.
(The above relationships are indicated by dotted lines in Pig. 7.3.5.}
Problem

Let X be an open set in Rn, let 0 be defined on X, let x E X,


and let 6 be differentiable at x. Show that if

then

Under what conditions is the converse true? Give a geometric interpretation of the above conditions for the case when X = R.
Problem
Let X be an open set in Rn, let 8 be defined on X, let x X,
and let 6 be differentiable at x. Show that if

then

Under what conditions is the converse true? Give a geometric interpretation of the above conditions for the case when X = R.
Problem (method of feasible directions [Zoutendijk 60])
Let one of the assumptions (ii) to (vi) of Theorem 7 hold, and let 6
and g be differentiable on X. Suppose that we have a point x X at
which one of the last five constraint qualifications of 7 holds but at which the
Kuhn-Tucker conditions 7.1.4 are not satisfied. Show that there exists a feasible direction z in Rn such that x + dz X and 6(x + dz) < 6(x) for some
111

7.8

5 > 0.

Nonlinear Programming:

(Hint: Use the Farkas' theorem 2.4-6, the fact that

has no solution UT, and the differentiability property of Q and g.)


a linear programming problem the solution of which will yield z.

State

Problem (general Kuhn-Tucker stationary-point necessary


optimality criterion)
Let X be an open set in Rn* X R"*, let d, g, and h be respectively
a numerical function, an w-dimensional vector function, and a /c-dimensional vector function defined on X, let (x,y) G -X"0 solve the following
general minimization problem

let 6, g, and h be differentiable at (x,y), and let the composite m + 2k +


n 2 -dimensional vector function / defined on X by f(x,y) = [g(x,y), h ( x , y ) ,
h ( x , y ) , y] satisfy one of the constraint qualifications 3, 4> or 5 at
(x,y), or one of the constraint qualifications 5.4-3, 5.4-4* or 5.4-5 on X.
Show that there exist u G Rm, v G Rk such that (x,u,v} satisfies the following general Kuhn-Tucker necessary optimality conditions

(Hint: Redefine X as
X = {(x,y) | (x,v) G X, g(x,y) ^ 0, /i^y) g 0, -h(x,y} ^ 0, -y ^ 0}
and then use Theorem 7.)

113

Chapter
Eight
Duality
in
Nonlinear
Programming

Duality plays a crucial role in the


theory and computational algorithms of linear programming
[Dantzig 63, Simmonard 66]. The
inception of duality theory in linear
programming may be traced to the
classical minmax theorem of von
Neumann [von Neumann 28] and
was first explicitly given by Gale,
Kuhn, and Tucker [Gale et al. 51].
Duality in nonlinear programming
is of a somewhat later vintage,
beginning with the duality results
of quadratic programming [Dennis
59]. In spirit, however, duality
theory in nonlinear programming
is related to the reciprocal principles of the calculus of variations,
which have been known since as far
back as 1927 [Trefftz 27, 28,
Friedrichs 29, Courant-Hilbert 53,
Lax 55, Mond-Hanson 67].
In recent years there has
been an extensive interest in the
duality theory of nonlinear programming as evidenced by the
publication of some 50 or more
papers on the subject. To cover
the subject of all these papers
would take a book by itself. Instead we shall concentrate on a few
of the more basic results which are
simpler to prove and which subsume in an obvious way the duality
results of linear programming.
The plan of this chapter is
as follows. We first introduce the
minimization problem and its dual
and develop the duality results of
nonlinear programming. We then
apply these results to quadratic
and linear programming problems.

8.1

Nonlinear Programming

1. Duality in nonlinear programming


Let X be an open set in Rn, and let 6 and g be respectively a numerical
function and an ?n-dimensional vector function, both defined on X. We
define now the same minimization problem that we have been dealing
with in the previous chapters and its dual.
The (primal) minimization problem (MP)
Find an x, if it exists, such that

The dual (maximization) problem (DP)


Let 8 and g be differentiate on X.
they exist, such that

Find an x and a u G Rm, if

The duality results we are about to establish relate solutions x of


MP and (x,u) of DP to each other. They also relate the objective functions 0 and \j/ to each other. We begin by giving a weak and easy-toderive duality theorem.
Weak duality theorem [Wolfe 61]
Let X be open, and let 6 and g be differentiate

where X and Y are defined in 1 and 2.


114

on X.

Then

Duality In Nonlinear Programming1

8.1

PROOF

We derive now one of the more important duality theorems of


nonlinear programming.
Wolfe's duality theorem [Wolfe 61]
Let X be an open set in Rn, let 8 and g be differentiable and convex
on X, let x solve MP 1, and let g satisfy any one of the six constraint qualifications of Theorem 7.3.7. Then there exists a u G Rm such that (x,u]
solves DP 2 and

PROOF [Huard 62] By Theorem 7.3.7, there exists a u Rm such that


(x,u) satisfies the Kuhn-Tucker conditions

Hence
(x.u) G Y = \(x,u)

x Xn,u Rm, V8(x) + uVg(x) = 0, u ^ 0}

Now let (x,u) be an arbitrary element of the set Y.

Then

115

1.1

Nonlinear Programming

Hence

Since

ALTERNATE PROOF Again as above, by Theorem 7.3.7, there exists


a u (E Rm such that (x,u) satisfies the Kuhn-Tucker conditions. Hence
(x,u) E Y. Now
where the last inequality follows from Theorem 3. Hence (x,u) solves
DP 2.
REMARK The constraint qualification in the above theorem is merely
a sufficient condition for the theorem to hold. It may hold without
satisfying such a constraint qualification, as evidenced by the following
primal problem. Find x (E R2 such that

where a is some nonnegative number. The dual to this problem is: Find
x G R2 and u G R* such that

When a = 1, the primal problem has one feasible (and hence minimal)
point, Xi = X2 = 0. At this point none of the constraint qualifications
are satisfied. However it can be verified (after a little calculation) that
x = 0, u 0 also solves the dual problem with a maximum of zero.
Another important duality theorem is the converse of Theorem 4
above. In order to obtain such a theorem, we have to modify the
hypotheses of Theorem 4- There are a number of such converse theo11*

Duality in Nonlinear Programming

8.1

rems [Hanson 61, Huard 62, Mangasarian 62, Mangasarian-Ponstein 65,


Karamardian 67]. To cover all such theorems would be lengthy. Instead
we shall confine ourselves to two converse theorems which hold under
somewhat different assumptions and the new proofs of which are completely different from each other.
Strict converse duality theoremf [Mangasarian 62]
Let X be an open set in Rn, let 9 and g be differentiable and convex
on X, let MP 1 have a solution x, and let g satisfy any one of the six constraint qualifications of Theorem 7.3.7. // (x,u) is a solution of DP 2, and
if if/(x,u) is strictly convex at x, then A x, that is, x solves MP 1, and

REMARK $(x,u) is strictly convex at x if either 0 is strictly convex at


x or if for some i, wt > 0 and 0, is strictly convex at A.
PROOF We shall assume that 9* x and exhibit a contradiction.
Since x is a solution of MP 1, it follows from Theorem 4 above that there
exists a u G Rm such that (x,u) solves DP 2. Hence

and
Because (,u) G Y, we have that Vxif/(,u) = 0. Hence by the strict
convexity of \l>(x,u) at x and 6.2.1 we have that
It follows then that

or that
But from Theorem 7.3.7 we have that ug(x) = 0, hence
f We have used the adjective strict to distinguish the above converse duality theorem from
other theorems, such as Dorn's converse duality theorem [Dorn 60] (see 8.2.6 below) and
theorem 5.6 of [Mangasarian-Ponstein 65], in which the solution of the primal problem
x need not equal x, where (x,u) is the solution of the dual problem.
117

8.1

Nonlinear Programming

which contradicts the facts that u ^ 0 and g(x) ^ 0. Hence x = x.


We also have that

It should be remarked that the above theorem can be strengthened


(see theorem 5.7 in [Mangasarian-Ponstein 65]) by dropping the assumptions that MP 1 has a solution and that g satisfies a constraint qualification, but retaining the strict convexity assumption on \j/(x,u}. The proof
of this strengthened theorem is rather lengthy, and hence we content
ourselves here with the above weaker version.
We derive now another strict converse duality theorem under
different hypotheses from those of Theorem 5 above. We drop the
assumptions that MP 1 has a solution, that g satisfies a constraint qualification, and that t(x,u) is strictly convex at x. We add the assumptions
that ^(x,u) is twice continuously differentiate at x and that the Hessian
matrix Vx2\fs(x,u) is nonsingular at x.
The Hanson-Huard strict converse duality theorem
[Hanson 61, Huard 62]
Let X be an open set in Rn, and let 6 and g be differentiate on X.
Let (x,u) be a solution of DP 2, and let 6 and g be convex at x. If either (i)
[Huard 62] *(/(x,u) is twice continuously differentiate at x and the n X n
Hessian matrix Vx^(x,u) is nonsingular, or (ii) [Hanson 61] there exists an
open set A C Rm containing u and an n-dimensional differentiate vector
function e(u) on A such that
and

then x solves MP 1, and


B(x) = t(x,u)
PROOF Since Vx^(x,u) = 0, we have that assumption (i) above implies,
by the implicit function theorem D.3.1, assumption (ii) above. (Assumption (ii) does not necessarily imply assumption (i).) We establish the
theorem now under assumption (ii). From (ii) we have that
(x,u) = [e(u),u] G {[e(u),u] \u G A, u ^ 0} C Y
118

Duality in Nonlinear Programming

8.1

and since

we have then that

Since A is open and the constraint u ^ 0 is linear, we have by Theorem


7.3.7 (or more conveniently by 7.3.10) and the chain rule D.I.6 that

Since u G A. since V^Oc.w)

x = e(u)

= 0 for u G A, and since

the last two relations give

But since (x,u) G Y, we also have that

The last four relations and Theorem 7.2.1 imply that x is a solution of
UP 1. We also have that
We give now a theorem which shows when the dual problem has
an objective function, unbounded from above, on the set of dual feasible
points Y. This is a generalization of a theorem of Wolfe [Wolfe 61,
theorem 3].
Unbounded dual theorem
Let X be an open set in Rn, and let 6 and g be differentiable on X.
If there exists a dual feasible point (xl,u1} such that
then the dual problem has an unbounded objective function (from above) on
the set of dual feasible points Y.
119

Nonlinear Programming

PROOF By Theorem 24.10 of the alternative there exists a w 2 Rm


such that

Since (a;1,!*1) is dual feasible, then

Consider now the point (x1, w1 -f- ru2) for any r > 0. We have that

and hence (x1, w1 + rw 2 ) is dual feasible for any T > 0. Since u2g(xl) = 1,
we have that the dual objective function

tends to as r tends to .
REMARK The above theorem does not apply if g is convex at xl and
the primal problem has a feasible point, x*, say. For if we define
z x* xl, then we have that
and hence ^(x1) + Vg^z1)* ^ 0 has a solution 2 = x* x1.
EXAMPLE Consider the example in the Remark following Theorem 4
above, and let a = y. Obviously then the primal problem has no
feasible point. Consider now the dual feasible point Xi1 = x 2 x = 0,
Ui1 = W2 1 = 1. We have then that

has no solution z G Rn, and hence by Theorem 7 above the dual objective
function is unbounded from above on the dual feasible region Y. This
can be easily seen if we let Xi = x 2 = 0 and let u\ w 2 tend to . Then
^(x,w) tends to w.
no

Duality in Nonlinear Programming

8.1

Corollary (unbounded dual)


Let X = Rn, and let 6 and g be differentiable on Rn. Let the dual
problem DP 2 have a feasible point (x1^1), let g be concave^ at xl or linear
on Rn, and let the primal problem MP 1 have no feasible point. Then the dual
problem has an unbounded objective function $(x,u) (from above) on the set
of dual feasible points Y.

PROOF

We claim that

has no solution
For if it did have a solution z Rn, then x* = xl + z would satisfy
where the next-to-last inequality follows from 6.1.1. Hence
which contradicts the assumption that X is empty. So,
has no solution z in Rn, and by Theorem 7 the dual problem has an
unbounded objective function (from above) on the set Y of dual feasible
points.
The case when g is linear in the above corollary is theorem 3 of
[Wolfe 61].
We finally give a theorem which tells us when the primal problem
has no local or global minimum.
Theorem (no primal minimum)
Let X be an open set in Rn, let 0 and g be differentiable and concave^
on X, and let X ?* 0. // the dual problem DP 2 has no feasible point, then
neither the primal problem MP 1 nor the local minimization problem LMP
7.1.2 has a solution.

PROOF

Because the dual problem has no feasible point, we have that

has noi solution


t Note that if g were concave on Rn, the set X would still not be convex in general, unless
g were linear.
% Note that the concavity of 8 and g does not, in general, produce a primal convex programming problem (unless 8 and g are linear).

iai

8.1

Nonlinear Programming

Hence by Farkas' theorem 2.4.6

has a solution

Let x (E X.

Then by 6.1.1 and the last two inequalities we have that

Hence by choosing 5 small enough, x + z G BI(X) C\ X for any 5 > 0


and x cannot be a solution of LMP 7.1.2 and, hence, not a solution of
MP 1.
We mentioned earlier that there is an extensive literature on the
duality theory of nonlinear programming. The presentation we have
followed here resembles that given in [Dorn 60, Wolfe 61, Hanson 61,
Huard 62, Mangasarian 62], There are approaches to duality which use
the conjugate function concept, which we have not touched upon here
[Fenchel 53, Berge-Ghouila Houri 65, Dieter 65a, Rockafellar 63, 64, 67a,
676, 67c, 69, Whinston 65, 67]. There are also approaches that use a
minmax theorem [Stoer 63, 64, Mangasarian-Ponstein 65, Karamardian
67]. There is also the theory of symmetric dual nonlinear programs [Dorn
61, Cottle 63, Dantzig et al. 65, Mond 65, Mond-Cottle 66, Stoer 64].
Duality relations relaxing convexity conditions have appeared in [Mangasarian 65, Karamardian 67, Rissanen 67, Jacob-Rossi 69] (see also
Chap. 10). Duality in spaces more general than Rn has also appeared in
[Hurwicz 58, Kretschmar 61, Rubinstein 63, Rockafellar 63, Brondsted
64, Moreau 65, Tyndall 65, 67, Dieter 65b, 66, Levinson 66, LarsenPolak 66, Gol'stein 67, Ritter 67, Varaiya 67, Hanson-Mond 67, Van
Slyke-Wets 67, Hanson 68].
Problem
Let X be an open set in Rn> X R"1, let 6 and g be respectively a
differentiate numerical function and a differentiable w-dimensional
vector function on X. Show that the dual to the following primal
minimization problem

122

Duality in Nonlinear Programming

8.8

is the following maximization problem

State and prove the duality relations that hold for these dual problems
and the conditions under which they hold.

2. Duality in quadratic programming


We consider in this section programming problems with a quadratic objective function and linear constraints. Let b be an n-vector, c an m-vector,
C a symmetricf n X n matrix, and A an m X ft matrix. We consider
the following primal problem.
The (primal) quadratic minimization problem (QMP)
Find an x, if it exists, such that

According to 8.1.2 the dual to the above problem is given by

The constraint relation Cx b + A'u = 0 implies that

Using this equation in the objective function, the dual problem becomes
the following.
t If C is not symmetric, we replace C by (C + C")/2 in QMP 1 because xCx - x[(C + C')/2]x.
M

8.2

Nonlinear Programming

The quadratic dual (maximization) problem (QDP)


Find an x G R" and a u G Rm, if they exist, such that

A whole group of duality theorems now follows from the theorems


of the previous section.
Weak duality theorem
Let C be positive semidefinite (that is, xCx ^ 0 for all x in Rn).

Then

PROOF This theorem follows from Theorem 8.1.3 by observing that


Y^xCx bx is convex on Rn if the matrix C is positive semidefinite (see
Theorem 6.3.8).
Dorn's duality theorem [Dorn 60]
Let C be positive semidefinite. If x solves QMP 1, then x and some
u G Rm solve QDP 2, and the two extrema are equal.
PROOF This theorem follows from Theorem 8.1.4 if we make the same
observation as in the proof of Theorem 3 above.
REMARK The dual quadratic programs / and 2 possess a nice feature
not shared by the dual nonlinear programs 8.1.1 and 8.1.2. If C is positive semidefinite, then the objective function of QMP 1 is convex on Rn,
and the objective function of QDP 2 is concave on Rn X Rm.
Strict converse duality theorem
Let C be positive definite (that is, xCx > 0 for all nonzero x in R").
If (x,u) solves QDP 2, then x solves QMP 1, and the two extrema are equal.
PROOF This theorem follows from Theorem 8.1.6 by observing that if
C is positive definite, then %xCx bx is strictly convex (and hence
convex) on Rn (see Theorem 6.4-2), and the Hessian matrix Vx*[%xCx
bx + u(Ax c)], which is equal to C, is nonsingular (for if it were singu134

Duality in Nonlinear Programming

8.9

lar, then Cx = 0 for some x j* 0, and xCx = 0 for some x j* 0, which


contradicts the positive definite assumption).
Dorn's converse duality theorem [Dorn 60]
Let C be positive semidefinite. If (x,u) solves QDP 2, then some
x G Rn (not necessarily equal to x), satisfying C(x x) = 0, solves QMP 1,
and the two extrema are equal.
REMARK We have not established here the counterpart of this
theorem for the general nonlinear case. Such a theorem exists [Mangasarian-Ponstein 65, theorem 5.6]. Note the difference between 5 and 6
above. In 5 the same x appears in the solutions of 1 and 2. In 6 we
merely have that C(x x) = 0.
PROOF By Theorem 7.3.7 (or more conveniently by 7.8.12) and the
linearity of the constraints Cx + A'u = b, u ^ 0, there exists a v (E Rn
such that (,u,v) satisfies the following Kuhn-Tucker conditions

Substitution of the first relation into the second one gives

The last four relations, the assumption that C is positive semidefinite,


and Theorems 6.3.2 and 7.2.3 imply that v solves QMP 1. Hence x = v
solves QMP 1, and Cx = Cv = Cx. Now we show that the two extrema
are equal.

115

Nonlinear Programming

8.8

Unbounded dual theorem


Let Y 9* 0. Then
(X = 0) = (QDP # /ias an unbounded objective function from above on Y)
PROOF

Follows from Corollary 8.1.8.

Theorem (no primal minimum)


Let C be negative semidefinite, and let X 9* 0.

PROOF

Then

Follows from Theorem 8.1.9.

Problem
Let b, a, c, and d be given vectors in Rn, Re, Rm, and Rk, respectively, and let A, D, B, E, C, and F be given ra X n, m X A /c X n, k X A
n X n (symmetric), and / X t (symmetric) matrices, respectively. Show
that the following are dual quadratic problems.

3. Duality in linear programming


We outline briefly here the main duality results of linear programming
and show how they follow from the results we have established.
By deleting the matrix C from the dual quadratic problems 8.2.1
and 8.2.2, we obtain the following dual linear programs.
196

Duality in Nonlinear Programming

8.3

The (primal) linear minimization problem (LMP)


Find an x, if it exists, such that

The dual linear (maximization) problem (LDP)


Find a u, if it exists, such that

Note that the dual problem contains only the variable u. The
nonlinear and quadratic dual problems 8.1.2 and 8.2.2 contain both the
variables x and u. The dual linear program has the unique feature that
it does not contain the primal variable x.
We combine now all the fundamental duality results of linear
programming in the following theorem.
Duality theorem of linear programming [Gale et al. 51]

(ii)
(\\\)

(x solves LMP 1) <=> (u solves LDP 2} => bx = cu


Let Y 7* 0. Then
(X = 0) <= ( cu is unbounded from above on Y}

(iv)

Let X 7* 0.

Then

(Y = 0) *=$ ( bx is unbounded from below on X)

REMARK Both of the sets X and Y may be empty, in which case none
of the above results hold. For example take A = Q, c = 1,6= 1.
(i) Follows from 8.8.3.
(ii) Follows from 8.24 and 8.2.6.
(ill) The forward implication follows from 8.2.7.
ward implication is equivalent logically to

PROOF

The back-

(X j* 0) =* {cu is bounded from above on Y)

8.3

Nonlinear Programming

which follows from part (i) of this theorem,


(iv) (=>)
hs no solution

(<=)

The backward implication is equivalent to

<F j 0) =* ( bx is bounded from below on X)


which follows from part (i) of this theorem.
(v) The backward implication is trivial because x X and
u Y. We prove now the forward implication. Because of (ii) we
need only prove that
(X 7* 0 and F ^ 0) => (Bx that solves LMP 1}
Since X ^ 0, we have that Ax ^ c has a solution x, which implies that
A'v = 0, v 2i 0, cv < 0 has no solution v (for if it did have a solution v,
then 0 = v-Az ^ cv, which contradicts cv < 0). Hence

(A'v = 0, v ^ 0) = cv ^ 0
Similarly, since F 5^ 0, A'w = 6, u ^ 0 has a solution w, which implies
that Ay ^ 0, 6y > 0 has no solution y (for if it did have a solution y,
then 0 ^ wAy = by, which contradicts by > 0). Hence
Ay ^ 0 => by ^ 0

We will now show that if LMP 1 has no solution, then a contradiction


ensues.
{LMP 1 has no solution)

has a solution

128

Duality In Nonlinear Programming

8.3

has a solution

has a solution

has a solution

has a solution

has a solution

Now, either er = 0 or a > 0. If a = 0, then Az ^ 0, A't = 0, t ^ 0,


which implies (by making use of 4 and 5 above) that bz ^ 0, ct ^ 0, and
bz ct ^ 0. This contradicts bz ct > 0 above. If <r > 0, we have,
upon normalization, that A't = b, Az :g c, bz-\-ct<Q,t^Q, which
is a contradiction since A't = b, Az ^ c, and ^ 0 imply that bz -fct ^ 0. So in either case of a 0 or <r > 0 we arrive at a contradiction.
Hence LMP I has a solution.
139

Nonlinear Programming

8.3

Problem
Consider the primal linear minimization problem LMP 1. Show
that if X T* 0, and if, for all x in X, bx^a for some real number a,
then LMP 1 has a solution x. (Hint: Use the facts that -X" 5^ 0 and that
bx < a has no solution x in X to show that there exists a dual feasible
point, that is, a w 6E Y; then use Theorem 3, part (v).)
Remark
It is not true in general that if a linear function is bounded from
below on a closed convex set, then it achieves its infimum on that set.
For example, x% ^ 0 on the closed convex set {x \ x G R2, (2)~x> xz ^ 0};
however x2 does not attain its infimum of zero on the set.f Problem 7
above shows that if a linear function is bounded from below on a poly tope,
then it achieves its infimum on the polytope.
Problem
Show that the following are dual linear programs:

where 6, a, c, and d are given vectors in Rn, Rf, Rm, and Rh, respectively
and A, D, B, and E are given m X n, m X , k X n, and k X I matrices,
respectively.
11 am indebted to my colleague James W. Daniel for this example.

ISO

Chapter
Nine
Generalizations
of
Convex
Functions:
Quasiconvex,
Strictly
Quasiconvex
and
Pseudoconvex
Functions

Beginning with Chap. 4, we have


continually used the concepts of
convex and concave functions in
deriving optimality conditions and
duality relations. Since not all
properties of convex and concave
functions are needed in establishing some of the previous results, a
more general type of function will
also work. For example, some
results need only that the set
Aa = {x | x T, 6(x) ^ a} be convex, where T is a convex set in Rn,
6 is a numerical function defined on
T, and a is any real number. Now
if 6 is a convex function on T, the
convexity of A is assured by Theorem 4.I.W. However, 0 need not
be convex in order that Aa be convex. A function 0 which is quasiconvex on F has this property
[Nikaido 54]. Another property
of differentiable convex functions
that was used in previous results
was this: If V6(x)(x x) ^ 0, then
0(x) ^ 6(x). This property follows from 6.1.1, and has the obvious consequence that if V0(z) = 0,
then B(x) ^ B(x). Not only convex functions have this property.
Pseudoconvex functions [Tuy 64,
Mangasarian 65], to be introduced
in this chapter, also have this property. By using only the properties
of convex functions that are needed
in establishing some of the previous
results, these results are extended
to a larger class of functions.

1. Quasiconvex and quasiconcave functions


Quasiconvex function
A numerical function 6 de-

Nonlinear Programming

fined on a set F C Rn is said to be quasiconvex at x r (with respect to


F) if for each x G r such that Q(x) ^ 6(x), the function 6 assumes a value
no larger than 0(x) on each point in the intersection of the closed line
segment [x,x] and F, or equivalently

0 is said to be quasiconvex on F if it is quasiconvex at each x G r.


Note that this definition of a quasiconvex function is slightly more
general than the customary definition in the literature [Nikaido 54, BergeGhouila Houri 65, Martos 65] in that (i) we define quasiconvexity at a
point first and then on a set, and (ii) we do not require F to be convex.
This generalization will allow us to handle a somewhat wider class of
problems. It follows from the above definition that a numerical function
0 defined on a convex set F is quasiconvex on F if and only if

Quasiconcave function
A numerical function 6 defined
concave at x G F (with respect to
B(x) ^ Q(x), the function 6 assumes
each point on the intersection of the
equivalently

on a set F C Rn is said to be quasiF) if for each x G F such that


a value no smaller than 8(x) on
closed line segment [x,x] and F, or

6 is said to be quasiconcave on F if it is quasiconcave at each x G F.


Obviously B is quasiconcave at x (on F) if and only if 6 is quasiconvex at x (on F). Results obtained for quasiconvex functions can be
changed into results for quasiconcave functions by the appropriate multiplication by 1, and vice versa.
188

Generalizations of Convex Functions

9.1

It follows from 2 that a numerical function 6 denned on a convex


set F is quasiconcave on F if and only if

Figure 9.1.1 depicts a quasiconvex and a quasiconcave function

on R.

3Theorem
Let 6 be a numerical function defined on a convex set T C Rn, let
and let

Then
(B quasiconvex on F) <=> (Aa is convex for each a G R)

and
(9 quasiconcave on F) => (fta is convex for each a G R)
PROOF We shall establish the theorem for the quasiconvex case.
quasiconcave case follows from it.

The

Fig. 9.1.1 A quasiconvex and a quasiconcave function on R. (a)


Quasiconvex 6 on Rn R; (b) quasiconcave 6 on
Rn = R.
133

9.1

Nonlinear Programming

(<=) Let xl,x* E T, e(x*) 8(xl), and 0 ^ X ^ 1. If we let


a = 8(xl), then by the convexity of Aa we have that

and hence 6 is quasiconvex on F.


(=>) Let 6 be quasiconvex on F, a be any real number, and let x1
2
and x be any two points in Aa (if Aa is empty, then it is convex). Without loss of generality let 0(x2) ^ 6(xl). Since xl,x^ Aa, we have that
Since 6 is quasiconvex, and since F is convex, we have that for 0 fS X ^ 1
Hence (1 X)^ 1 + Xz2 A a , and Aa is convex.

4Theorem (differentiate quasiconvex and


quasiconcave functions)
Let T be an open set in Rn, and let 8 be a numerical function defined on
F Then

PROOF We shall prove the quasiconvex case only. The other case is
similar.
(=>) If x1 xz, the implication is trivial. Assume then that
1
x T* x*. Since F is open, there exists an open ball B&(x1} around x1 which
is contained in F. Then, for some ft such that 0 < ft < 1 and # < d/
134

Generalizations of Convex Functions

9.1

we have that
Then

and let
We establish now the quasiconvexity of 0 on T by showing that ft is empty.
We assume that there is an x G ft and show that a contradiction ensues.
Since 0(z2) ^ Q(xl) < B(x) for x G 0, we have from the hypothesis of the
theorem that

and

Since 0 < X < 1, it follows then that

1SS

9.2

Nonlinear Programming

Since 6(xl) < 0(x), and since 6 is continuous on F, the set S) is open
tive to (xl,x*), it contains x, and there exists an x3 = (1 ju)z +
0 < M ^ 1, such that x3 is a point of closure of ft, and such that 0(z3) =
(see C.I.I and B.I.3}. By the mean-value theorem D.2.1 we have
for some x G ft

relanxl,
6(x1)
that

and since
then
Since x G fy the last relation above contradicts the equality established
earlier, V0(x)(x 2 - x1} = 0 for all x E fi.

2. Strictly quasiconvex and strictly


quasiconcave functions
We begin this section by recalling the concepts of a, local minimum and
maximum (see 5.1.2 or 7.1.2). Let 6 be a numerical function defined
on the set F C Rn- A point x F is said to be a local minimum (maximum) of 6 if there exists an open ball BI(X) around x with radius 5 > 0
such that
(local minimum)
(local maximum)
We remark that for a numerical function 6 which is quasiconvex
(quasiconcave) on a convex set F, a local minimum (maximum) need not
be a (global) minimum (maximum) over F. This fact is easily demonstrated by the numerical function 6 defined on R as follows:

It is easy to see that 0 is both quasiconvex and quasiconcave on R by


verifying definitions 9.1.1 and 9.1.2. It is also easy to see that x = %
is both a local minimum and a local maximum, but certainly not a global
minimum or maximum over R.
136

Generalizations of Convex Functions

9.3

We introduce now a type of function which is essentially a restriction of quasiconvex (quasiconcave) functions to functions with the property that a local minimum (maximum) is also a global minimum (maximum). Such functions have been independently introduced in [Hanson
64, Martos 65, Karamardian 67]. (These functions were called functionally convex by Hanson and explicitly quasiconvex by Martos.)
Strictly quasiconvex function
A numerical function 6 defined on a set F C Rn is said to be strictly
quasiconvex at x G F (with respect to F) if for each x G F such that
6(x) < B(x) the function 8 assumes a lower value than 6(x) on each point
in the intersection of the open line segment (x,x) and F, or equivalently

0 is said to be strictly quasiconvex on F if it is strictly quasiconvex at


each x G FIt follows from the above definition that a numerical function 6
defined on a convex set F is strictly quasiconvex on F if and only if

Strictly quasiconcave function


A numerical function 6 defined on a set F C Rn is said to be strictly
quasiconcave at x (E F (with respect to F) if for each x G F such that
B(x) < 6(x) the function 6 assumes a higher value than B(x) on each point
in the intersection of the open line segment (x,x) and F, or equivalently

6 is said to be strictly quasiconcave on F if it is strictly quasiconcave at


each x G F.
137

9.2

Nonlinear Programming

It follows from the above definition that a numerical function 6


defined on a convex set F is strictly quasiconcave on F if and only if

Obviously 6 is strictly quasiconcave at x (on F) if and only if 0


is strictly quasiconvex at x (on F).
Figure 9.2.1 depicts a strictly quasiconvex and a strictly quasiconcave function on R. The quasiconvex function of Fig. 9.1.la is not
strictly quasiconvex, and the quasiconcave function of Fig. 9.Lib is not
strictly quasiconcave. (Why?)
We observe that a strictly quasiconvex function need not be quasiconvex. Consider the numerical function 8 defined on R as follows

This function is strictly quasiconvex on R but is not quasiconvex on R.


For by taking x1 = -1, x* = 1, X = >^, we see that 0(z2) = B(x1}, but
0[(1 - X)^ 1 + Xz2] > 6(xl).
If we require that 6 be lower semicontinuous, the above counterexample will be eliminated and every strictly quasiconvex function will
also be quasiconvex. In fact we have the following result.

Fig. 9.2.1 Strictly quasiconvex and strictly quasiconcave functions. (a) Strictly quasiconvex function θ on Rⁿ = R; (b) strictly quasiconcave function θ on Rⁿ = R.
Theorem [Karamardian 67]
Let θ be a lower (upper) semicontinuous numerical function defined on the convex set Γ in Rⁿ. If θ is strictly quasiconvex (strictly quasiconcave) on Γ, then θ is quasiconvex (quasiconcave) on Γ, but not conversely.
PROOF We prove only the quasiconvex case. Let θ be strictly quasiconvex on Γ, and let x¹ and x² be in Γ. By 1 we have that

θ(x²) < θ(x¹) ⇒ θ[λx¹ + (1 − λ)x²] < θ(x¹) for 0 < λ < 1

Hence if θ(x²) < θ(x¹), we are done by 9.1.1. Assume now that θ(x²) = θ(x¹). We will show (by contradiction) that there exists no x̄ ∈ (x¹,x²) such that θ(x¹) < θ(x̄). This will then establish the quasiconvexity of θ by 9.1.1. Let x̄ ∈ (x¹,x²) be such that θ(x¹) < θ(x̄), and let

Ω = {x | x ∈ (x¹,x²), θ(x¹) < θ(x)}

Since θ is lower semicontinuous on Γ, Ω is open relative to (x¹,x²) by C.1.2(iv). Hence there exists an x̂ ∈ (x¹,x̄) ∩ Ω. By the strict quasiconvexity of θ and 1 we have that θ(x̂) < θ(x̄) (since θ(x¹) < θ(x̄) and x̂ ∈ (x¹,x̄)) and that θ(x̄) < θ(x̂) (since θ(x²) = θ(x¹) < θ(x̂) and x̄ ∈ (x²,x̂)), which is a contradiction. Hence no such x̄ exists, and θ is quasiconvex on Γ.

That the converse is not true follows from the example given at the beginning of this section, which is quasiconvex on R but not strictly quasiconvex on R. (For take x¹ = 3/2, x² = ½, λ = 1/10; then θ(x²) < θ(x¹), but θ[(1 − λ)x¹ + λx²] = θ(x¹), which contradicts 1.)
Theorem
Let θ be a numerical function defined on the convex set Γ in Rⁿ, and let x̄ ∈ Γ be a local minimum (maximum). If θ is strictly quasiconvex (strictly quasiconcave) at x̄, then θ(x̄) is a global minimum (maximum) of θ on Γ.

PROOF We prove the strictly quasiconvex case. If x̄ is a local minimum, then there exists a ball B_δ(x̄) such that

θ(x̄) ≤ θ(x) for all x ∈ B_δ(x̄) ∩ Γ

Assume now that there exists an x̃ in Γ but not in B_δ(x̄) such that θ(x̃) < θ(x̄). By the strict quasiconvexity of θ at x̄ and the convexity of Γ, we have that

θ[λx̃ + (1 − λ)x̄] < θ(x̄) for 0 < λ < 1

and hence θ(x) < θ(x̄) for x ∈ (x̃,x̄) ∩ B_δ(x̄), which contradicts a previous inequality.


In closing this section on strictly quasiconvex (strictly quasiconcave) functions, we remark that there does not seem to be a simple characterization of a differentiable strictly quasiconvex (strictly quasiconcave)
function in terms of the gradient of the function [such as 9.1.4 for a
quasiconvex (quasiconcave) function].

3. Pseudoconvex and pseudoconcave functions


Let θ be a differentiable function on Rⁿ. If ∇θ(x̄) = 0, then it follows from 6.1.1 that θ(x) ≥ θ(x̄) for all x in Rⁿ if θ is convex at x̄. If θ is a differentiable strictly quasiconvex function (and hence by 9.2.3 also a quasiconvex function) on Rⁿ, and if ∇θ(x̄) = 0, it does not necessarily follow that θ(x) ≥ θ(x̄) for all x in Rⁿ. For example take the strictly quasiconvex function θ defined on R by θ(x) = x³. For this function ∇θ(0) = 0, but certainly θ(0) is not a minimum of θ on R. In order to exclude such strictly quasiconvex functions that have inflection points with horizontal tangents, we introduce the concept of a pseudoconvex function. (The concept of a pseudoconvex function was introduced independently in [Mangasarian 65] and in a slightly different form under the name of semiconvex function in [Tuy 64]. Most of the results for pseudoconvex functions derived in this book are based on [Mangasarian 65].)
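The role of the inflection point is easy to see numerically (a minimal sketch; the helper names are ours):

```python
def theta(x):
    return x ** 3

def grad_theta(x):
    return 3 * x ** 2

xbar = 0.0
print(grad_theta(xbar))            # 0.0: horizontal tangent at the inflection point
print(theta(-1.0) < theta(xbar))   # True: theta(0) is not a minimum of theta on R
# Pseudoconvexity would require: grad_theta(xbar)*(x - xbar) >= 0 => theta(x) >= theta(xbar).
# With xbar = 0 the premise holds for every x, but theta(-1) = -1 < 0 = theta(0),
# so theta(x) = x**3 is not pseudoconvex at 0.
```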
Pseudoconvex function
Let θ be a numerical function defined on some open set in Rⁿ containing the set Γ. θ is said to be pseudoconvex at x̄ ∈ Γ (with respect to Γ) if it is differentiable at x̄ and

x ∈ Γ, ∇θ(x̄)(x − x̄) ≥ 0 ⇒ θ(x) ≥ θ(x̄)

θ is said to be pseudoconvex on Γ if it is pseudoconvex at each x̄ ∈ Γ.

Pseudoconcave function
Let θ be a numerical function defined on some open set in Rⁿ containing the set Γ. θ is said to be pseudoconcave at x̄ ∈ Γ (with respect to Γ) if it is differentiable at x̄ and

x ∈ Γ, ∇θ(x̄)(x − x̄) ≤ 0 ⇒ θ(x) ≤ θ(x̄)

θ is said to be pseudoconcave on Γ if it is pseudoconcave at each x̄ ∈ Γ.

Obviously θ is pseudoconcave at x̄ (on Γ) if and only if −θ is pseudoconvex at x̄ (on Γ).
Figure 9.3.1 depicts a pseudoconvex function and a pseudoconcave function on R.
We give now some results involving pseudoconvex and pseudoconcave functions.
Fig. 9.3.1 Pseudoconvex and pseudoconcave functions. (a) Pseudoconvex function θ on Rⁿ = R; (b) pseudoconcave function θ on Rⁿ = R.

Theorem (minimum principle)
Let θ be a numerical function defined on some open set in Rⁿ containing the set Γ, and let x̄ ∈ Γ.

(i) Let Γ be convex, and let θ be differentiable at x̄. Then

θ(x̄) = min_{x∈Γ} θ(x) ⇒ ∇θ(x̄)(x − x̄) ≥ 0 for all x ∈ Γ
θ(x̄) = max_{x∈Γ} θ(x) ⇒ ∇θ(x̄)(x − x̄) ≤ 0 for all x ∈ Γ

(ii) Let θ be pseudoconvex (pseudoconcave) at x̄. Then

∇θ(x̄)(x − x̄) ≥ 0 for all x ∈ Γ ⇒ θ(x̄) = min_{x∈Γ} θ(x)
(∇θ(x̄)(x − x̄) ≤ 0 for all x ∈ Γ ⇒ θ(x̄) = max_{x∈Γ} θ(x))

PROOF (i) Let x be any point in Γ. Since Γ is convex,

x̄ + λ(x − x̄) ∈ Γ for 0 ≤ λ ≤ 1

Since θ is differentiable at x̄, and since θ(x̄) = min_{x∈Γ} θ(x), we have that

0 ≤ θ[x̄ + λ(x − x̄)] − θ(x̄) = λ∇θ(x̄)(x − x̄) + λ‖x − x̄‖α[x̄, λ(x − x̄)]

where α[x̄, λ(x − x̄)] tends to zero as λ tends to zero. Hence

∇θ(x̄)(x − x̄) + ‖x − x̄‖α[x̄, λ(x − x̄)] ≥ 0 for 0 < λ ≤ 1

By taking the limit as λ approaches zero, we get ∇θ(x̄)(x − x̄) ≥ 0. The second implication of (i) follows similarly.

(ii) For any x in Γ we have that ∇θ(x̄)(x − x̄) ≥ 0; hence by 1, θ(x) ≥ θ(x̄), and θ(x̄) = min_{x∈Γ} θ(x). The second implication follows from 2.
Corollary
Let θ be a numerical function defined on the open convex set Γ in Rⁿ. Let x̄ ∈ Γ, and let θ be differentiable at x̄. Then

(i) θ(x̄) = min_{x∈Γ} θ(x) ⇒ ∇θ(x̄) = 0   (θ(x̄) = max_{x∈Γ} θ(x) ⇒ ∇θ(x̄) = 0)

(ii) Let θ be pseudoconvex (pseudoconcave) at x̄. Then

∇θ(x̄) = 0 ⇒ θ(x̄) = min_{x∈Γ} θ(x)   (∇θ(x̄) = 0 ⇒ θ(x̄) = max_{x∈Γ} θ(x))

PROOF (i) We consider only the first implication. By 3(i) we have that ∇θ(x̄)(x − x̄) ≥ 0 for all x ∈ Γ. Since Γ is open, x = x̄ − δ∇θ(x̄) ∈ Γ for some δ > 0. Hence −δ∇θ(x̄)∇θ(x̄) ≥ 0, which implies that ∇θ(x̄) = 0.

(ii) Follows trivially from 3(ii).

We relate now pseudoconvex functions to strictly quasiconvex functions, quasiconvex functions, and convex functions.

Theorem
Let Γ be a convex set in Rⁿ, and let θ be a numerical function defined on some open set containing Γ. If θ is pseudoconvex (pseudoconcave) on Γ, then θ is strictly quasiconvex (strictly quasiconcave) on Γ and hence also quasiconvex (quasiconcave) on Γ. The converse is not true.

PROOF Let θ be pseudoconvex on Γ. We shall assume that θ is not strictly quasiconvex on Γ and exhibit a contradiction. If θ is not strictly quasiconvex on Γ, then it follows from 9.2.1 that there exist x¹, x² ∈ Γ such that

θ(x¹) < θ(x²)

and

θ(x̄) ≥ θ(x²)

for some x̄ such that x̄ = (1 − λ̄)x¹ + λ̄x², 0 < λ̄ < 1. Hence there exists an x̂ ∈ (x¹,x²) such that

θ(x̂) = max_{0≤λ≤1} θ[(1 − λ)x¹ + λx²]

where x̂ = (1 − λ̂)x¹ + λ̂x², 0 < λ̂ < 1. By Theorem 3(i) above we have then that

∇θ(x̂)(x¹ − x̂) ≤ 0 and ∇θ(x̂)(x² − x̂) ≤ 0

Since

x¹ − x̂ = −λ̂(x² − x¹) and x² − x̂ = (1 − λ̂)(x² − x¹)

then

−λ̂∇θ(x̂)(x² − x¹) ≤ 0 and (1 − λ̂)∇θ(x̂)(x² − x¹) ≤ 0

and hence

∇θ(x̂)(x² − x¹) = 0

and

∇θ(x̂)(x¹ − x̂) = 0

But by the pseudoconvexity of θ on Γ, it follows that

θ(x¹) ≥ θ(x̂)

and hence

θ(x¹) ≥ θ(x̂) ≥ θ(x²)

This last inequality contradicts the earlier statement that θ(x¹) < θ(x²). Hence θ is strictly quasiconvex on Γ and, by 9.2.3, is also quasiconvex on Γ. That the converse is not necessarily true can be seen from the example θ(x) = x³, x ∈ R, which is strictly quasiconvex on R but is not pseudoconvex on R.
Theorem
Let θ be a numerical function defined on some open set Γ in Rⁿ, let x̄ ∈ Γ, and let θ be differentiable at x̄. If θ is convex (concave) at x̄, then θ is pseudoconvex (pseudoconcave) at x̄, but not conversely.

PROOF Let θ be convex at x̄. By 6.1.1 we have that

θ(x) − θ(x̄) ≥ ∇θ(x̄)(x − x̄)

Hence

∇θ(x̄)(x − x̄) ≥ 0 ⇒ θ(x) ≥ θ(x̄)

and θ is pseudoconvex at x̄. That the converse is not necessarily true can be seen from the example

θ(x) = x + x³, x ∈ R

θ is not convex on R because ∇²θ(x) = 6x < 0 for x < 0. However, θ is pseudoconvex on R because ∇θ(x) = 1 + 3x² > 0, and hence ∇θ(x̄)(x − x̄) ≥ 0 implies x ≥ x̄ and θ(x) ≥ θ(x̄).
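A numerical check of this example (a small sketch; the grid and tolerances are ours):

```python
import numpy as np

theta = lambda x: x + x ** 3
grad = lambda x: 1 + 3 * x ** 2

# Not convex: the gradient inequality theta(x) >= theta(xbar) + grad(xbar)*(x - xbar)
# fails where the second derivative 6*xbar is negative.
xbar, x = -1.0, -2.0
print(theta(x) >= theta(xbar) + grad(xbar) * (x - xbar))   # False

# Pseudoconvex: grad(xbar)*(x - xbar) >= 0  =>  theta(x) >= theta(xbar).
# Since grad > 0 everywhere, the premise means x >= xbar and theta is increasing.
xs = np.linspace(-2.0, 2.0, 41)
ok = all(theta(b) >= theta(a) for a in xs for b in xs if grad(a) * (b - a) >= 0)
print(ok)   # True
```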

Theorem
Let Γ be a convex set in Rⁿ, and let θ be a numerical function defined on some open set containing Γ. If θ is pseudoconvex (pseudoconcave) on Γ, then each local minimum (maximum) of θ on Γ is also a global minimum (maximum) of θ on Γ.

PROOF By Theorem 5, θ is strictly quasiconvex (strictly quasiconcave) on Γ. By Theorem 9.2.4 each local minimum (maximum) of θ on Γ is also a global minimum (maximum) on Γ.

4. Summary of properties and relations between quasiconvex, strictly quasiconvex, pseudoconvex, convex, and strictly convex functions

In Fig. 9.4.1 we summarize the relations between differentiable functions which are quasiconvex, strictly quasiconvex, pseudoconvex, convex, and strictly convex, all defined on an open convex set Γ in Rⁿ. For convenience, we include statements which are logical equivalents of each other. For example, we include the two following equivalent definitions of a pseudoconvex function

∇θ(x¹)(x² − x¹) ≥ 0 ⇒ θ(x²) ≥ θ(x¹)

or

θ(x²) < θ(x¹) ⇒ ∇θ(x¹)(x² − x¹) < 0

In Fig. 9.4.1 certain inequalities hold only if x¹ ≠ x² and λ ≠ 0 or λ ≠ 1. All implications are represented by single arrows. Implications not so represented are not true in general. Figure 9.4.1 immediately shows that the class of quasiconvex functions is the largest class considered and the strictly convex class is the smallest. Figure 9.4.2 shows similar results for differentiable quasiconcave functions, etc.

Fig. 9.4.1 Properties and relations between differentiable quasiconvex, strictly quasiconvex, pseudoconvex, convex, and strictly convex functions defined on an open convex set Γ ⊂ Rⁿ.

Fig. 9.4.2 Properties and relations between differentiable quasiconcave, strictly quasiconcave, pseudoconcave, concave, and strictly concave functions defined on an open convex set Γ ⊂ Rⁿ.

5. Warning

It was shown in 4.1.6 that each nonnegative linear combination of convex functions is also convex. This is not true in general for quasiconvex, strictly quasiconvex, or pseudoconvex functions. We indicate the reason for the pseudoconvex functions θ and φ on Rⁿ. If we define

ψ(x) = θ(x) + φ(x)

then the requirement that

∇ψ(x̄)(x − x̄) = ∇θ(x̄)(x − x̄) + ∇φ(x̄)(x − x̄) ≥ 0

does not ensure that

∇θ(x̄)(x − x̄) ≥ 0 and ∇φ(x̄)(x − x̄) ≥ 0

and hence we cannot conclude that

θ(x) + φ(x) ≥ θ(x̄) + φ(x̄)

and so ψ may not be pseudoconvex on Rⁿ. For example θ(x) = −x and φ(x) = x + x³ are both pseudoconvex on R, but their sum ψ(x) = x³ is not pseudoconvex on R.
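The counterexample is easy to reproduce (a minimal sketch):

```python
# theta(x) = -x and phi(x) = x + x**3 are each pseudoconvex on R
# (each has a nonvanishing, one-signed derivative), but their sum
# psi(x) = x**3 is not pseudoconvex: grad_psi(0) = 0 while psi(-1) < psi(0).
psi = lambda x: x ** 3
grad_psi = lambda x: 3 * x ** 2

xbar = 0.0
premise = grad_psi(xbar) * (-1.0 - xbar) >= 0   # True, since grad_psi(0) = 0
conclusion = psi(-1.0) >= psi(xbar)             # False: -1 < 0
print(premise, conclusion)
```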

6. Problems

Nonlinear fractional functions
Let φ and ψ be numerical functions defined on a set Γ ⊂ Rⁿ, and let ψ ≠ 0 on Γ.

(i) Let Γ be open, let x̄ ∈ Γ, and let φ and ψ be differentiable at x̄. Show that θ = φ/ψ is pseudoconvex at x̄ if

(φ is convex at x̄, φ(x̄) ≥ 0, ψ is concave at x̄, ψ > 0, all on Γ)

or

(φ is concave at x̄, φ(x̄) ≤ 0, ψ is convex at x̄, ψ < 0, all on Γ)

(ii) Let Γ be convex. Show that θ = φ/ψ is quasiconvex on Γ if

(φ is convex, φ ≥ 0, ψ is concave, ψ > 0, all on Γ)

or

(φ is concave, φ ≤ 0, ψ is convex, ψ < 0, all on Γ)

(Hint: Consider, for the case ψ > 0, the set {x | x ∈ Γ, φ(x) − αψ(x) ≤ 0} for each real α.)

Nonlinear fractional functions
Show that the words convex and concave can be interchanged throughout the above problem 1, except that the sentence "Let Γ be convex . . ." remains unchanged.
Linear fractional functions
Let a ∈ Rⁿ, b ∈ Rⁿ, α ∈ R, β ∈ R. Show that the function θ defined by

θ(x) = (ax + α)/(bx + β)

is both pseudoconvex and pseudoconcave (and hence also both quasiconvex and quasiconcave) on each convex set Γ ⊂ Rⁿ on which bx + β ≠ 0.
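For a concrete instance of this problem, the pseudoconvexity condition can be spot-checked on a grid (a sketch; the particular a, b, α, β and the sampling are ours, and numpy is assumed):

```python
import numpy as np

a, alpha = np.array([1.0, -2.0]), 3.0
b, beta = np.array([0.5, 1.0]), 2.0

theta = lambda x: (a @ x + alpha) / (b @ x + beta)

def grad(x):
    d = b @ x + beta
    return (a * d - (a @ x + alpha) * b) / d ** 2   # quotient rule

# Sample points from a convex region on which b @ x + beta > 0.
rng = np.random.default_rng(0)
pts = [x for x in rng.uniform(-1.0, 1.0, size=(300, 2)) if b @ x + beta > 0]

ok = True
for xb in pts[:40]:
    for x in pts:
        if grad(xb) @ (x - xb) >= 0 and theta(x) < theta(xb) - 1e-10:
            ok = False   # would contradict pseudoconvexity at xb
print(ok)   # expected True on this region
```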
Bi-nonlinear functions
Let φ and σ be numerical functions defined on a convex set Γ ⊂ Rⁿ, and let σ ≠ 0 on Γ.

(i) Let Γ be open, and let φ and σ be differentiable on Γ. Show that θ = φσ is pseudoconvex on Γ if

(φ is convex, φ ≤ 0, σ is concave, σ > 0, all on Γ)

or

(φ is concave, φ ≥ 0, σ is convex, σ < 0, all on Γ)

Show that θ is pseudoconcave on Γ if

(φ is convex, φ ≤ 0, σ is convex, σ < 0, all on Γ)

or

(φ is concave, φ ≥ 0, σ is concave, σ > 0, all on Γ)

(ii) If we drop the assumptions that Γ is open and that φ and σ are differentiable on Γ, show that either of the first two conditions of (i) makes θ quasiconvex on Γ, and either of the last two conditions of (i) makes θ quasiconcave on Γ. (Hint: Show first that the reciprocal of a positive concave function is a positive convex function, and that the reciprocal of a negative convex function is a negative concave function; then use the results of Prob. 1 above.)
Let θ be a positive numerical function defined on the open convex set Γ in Rⁿ. Show that if log θ is convex (concave) on Γ, then θ is convex (pseudoconcave) on Γ.

Chapter Ten
Optimality and Duality for Generalized Convex and Concave Functions

In this chapter we shall use the concepts of quasiconvexity (quasiconcavity) and pseudoconvexity (pseudoconcavity), introduced in the previous chapter, to derive sufficient optimality criteria, necessary optimality criteria, and duality relations for a class of nonlinear programming problems that involve differentiable functions. The pioneering work in this direction was that of [Arrow-Enthoven 61]. Subsequent work appeared in [Tuy 64, Martos 65, Mangasarian 65].

1. Sufficient optimality criteria

We establish in this section two sufficient optimality criteria which are slight generalizations of the sufficiency result of [Mangasarian 65]. The present results also subsume the sufficiency results of [Arrow-Enthoven 61] and Theorem 7.2.1 of this book.

Sufficient optimality theorem
Let X be an open set in Rⁿ, and let θ and g be respectively a numerical function and an m-dimensional vector function, both defined on X. Let x̄ ∈ X, let

I = {i | gᵢ(x̄) = 0}

let θ be pseudoconvex at x̄, and let g_I be differentiable and quasiconvex at x̄. If there exists a ū ∈ Rᵐ such that (x̄,ū) satisfies the conditions

[∇θ(x̄) + ū∇g(x̄)](x − x̄) ≥ 0 for all x ∈ X
g(x̄) ≤ 0, ūg(x̄) = 0, ū ≥ 0

then x̄ is a solution of the minimization problem

min θ(x) subject to g(x) ≤ 0, x ∈ X
PROOF Let x̃ be any point in X such that g(x̃) ≤ 0. Since g_I(x̃) ≤ 0 = g_I(x̄), it follows by the quasiconvexity of g_I at x̄ and Theorem 9.1.4 that

∇g_I(x̄)(x̃ − x̄) ≤ 0 and hence ū_I∇g_I(x̄)(x̃ − x̄) ≤ 0

But since ū_J = 0, where J = {i | gᵢ(x̄) < 0}, we also have that

ū_J∇g_J(x̄)(x̃ − x̄) = 0

Addition of the last two relations gives

ū∇g(x̄)(x̃ − x̄) ≤ 0

But since [∇θ(x̄) + ū∇g(x̄)](x − x̄) ≥ 0 for all x ∈ X, the last inequality gives

∇θ(x̄)(x̃ − x̄) ≥ 0

which, by the pseudoconvexity of θ at x̄, implies that

θ(x̃) ≥ θ(x̄)

Since g(x̄) ≤ 0, the point x̄ is feasible, and hence x̄ solves the minimization problem.

The following theorem, which is a generalization of the Kuhn-Tucker sufficient optimality criterion 7.2.1, follows trivially from the above theorem by observing that ∇θ(x̄) + ū∇g(x̄) = 0 implies that [∇θ(x̄) + ū∇g(x̄)](x − x̄) = 0 for all x ∈ X.

Kuhn-Tucker sufficient optimality theorem
Theorem 1 above holds with the condition

∇θ(x̄) + ū∇g(x̄) = 0

replacing the condition

[∇θ(x̄) + ū∇g(x̄)](x − x̄) ≥ 0 for all x ∈ X
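As an illustration of how these conditions certify optimality, consider a small problem and check the Kuhn-Tucker conditions numerically (a sketch; the problem data and the brute-force feasibility scan are ours, and numpy is assumed):

```python
import numpy as np

# min theta(x) = x1 + x2   subject to   g(x) = x1**2 + x2**2 - 2 <= 0
# theta is linear (hence pseudoconvex); g is convex (hence quasiconvex).
theta = lambda x: x[0] + x[1]
g = lambda x: x[0] ** 2 + x[1] ** 2 - 2.0
grad_theta = lambda x: np.array([1.0, 1.0])
grad_g = lambda x: 2.0 * x

xbar = np.array([-1.0, -1.0])
ubar = 0.5

# Kuhn-Tucker conditions of Theorem 2:
print(np.allclose(grad_theta(xbar) + ubar * grad_g(xbar), 0))  # stationarity
print(np.isclose(ubar * g(xbar), 0), ubar >= 0, g(xbar) <= 0)  # complementarity etc.

# Brute-force corroboration on a grid of feasible points:
grid = np.linspace(-2, 2, 201)
vals = [theta((x1, x2)) for x1 in grid for x2 in grid
        if g(np.array([x1, x2])) <= 0]
print(min(vals) >= theta(xbar) - 1e-9)   # True: xbar is a global minimum
```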

2. Necessary optimality criteria

The purpose of this section is to generalize the necessary optimality criteria of Sec. 7.3 by employing the concepts introduced in Chap. 9. We begin by extending Abadie's linearization lemma 7.3.1.
Linearization lemma
Let X be an open set in Rⁿ, and let θ and g be respectively a numerical function and an m-dimensional vector function, both defined on X. Let x̄ be a solution of LMP 7.1.2, let θ and g be differentiable at x̄, let

P = {i | gᵢ(x̄) = 0 and gᵢ is pseudoconcave at x̄}

and let

Q = {i | gᵢ(x̄) = 0 and gᵢ is not pseudoconcave at x̄}

Then the system

∇θ(x̄)z < 0, ∇g_P(x̄)z ≤ 0, ∇g_Q(x̄)z < 0

has no solution z in Rⁿ.


PROOF The proof is identical to that of Lemma 7.3.1 except that V and W are replaced by P and Q, and step (iii) of the proof of 7.3.1 is modified as follows:

(iii′) For i ∈ P we have by 9.3.2, since gᵢ is pseudoconcave at x̄ and ∇gᵢ(x̄)(δz) ≤ 0, that gᵢ(x̄ + δz) ≤ gᵢ(x̄) = 0.

We generalize now the Fritz John stationary-point necessary optimality theorem 7.3.2.
Fritz John stationary-point necessary optimality theorem
Let x̄ be a solution of LMP 7.1.2 or of MP 7.1.1, let X be open, and let θ and g be differentiable at x̄. Then there exist an r̄₀ ∈ R and an r̄ ∈ Rᵐ such that (x̄,r̄₀,r̄) solves FJP 7.1.3 and

(r̄₀,r̄_Q) ≠ 0

where

Q = {i | gᵢ(x̄) = 0 and gᵢ is not pseudoconcave at x̄}

PROOF The proof follows from Lemma 1 above in exactly the same way as Theorem 7.3.2 follows from Lemma 7.3.1.

Again, as in Chap. 7, there is no guarantee that r̄₀ > 0 in the above theorem. To ensure this, we impose constraint qualifications. We give below less restrictive versions of constraint qualifications that were introduced earlier.
The weak Arrow-Hurwicz-Uzawa constraint qualification (see 7.3.4)
Let X be an open set in Rⁿ, let g be an m-dimensional vector function defined on X, and let

X̄ = {x | x ∈ X, g(x) ≤ 0}

g is said to satisfy the weak Arrow-Hurwicz-Uzawa constraint qualification at x̄ ∈ X̄ if g is differentiable at x̄ and

∇g_P(x̄)z ≥ 0, ∇g_Q(x̄)z > 0

has a solution z ∈ Rⁿ, where

P = {i | gᵢ(x̄) = 0 and gᵢ is pseudoconcave at x̄}

and

Q = {i | gᵢ(x̄) = 0 and gᵢ is not pseudoconcave at x̄}
The weak reverse convex constraint qualification (see 7.3.5)
Let X be an open set in Rⁿ, let g be an m-dimensional vector function defined on X, and let

X̄ = {x | x ∈ X, g(x) ≤ 0}

g is said to satisfy the weak reverse convex constraint qualification at x̄ ∈ X̄ if g is differentiable at x̄ and, for each i ∈ I, either gᵢ is pseudoconcave at x̄ or gᵢ is linear on Rⁿ, where

I = {i | gᵢ(x̄) = 0}
Slater's weak constraint qualification (see 5.4.3)
Let X be an open set in Rⁿ, let g be an m-dimensional vector function defined on X, and let

X̄ = {x | x ∈ X, g(x) ≤ 0}

g is said to satisfy Slater's weak constraint qualification at x̄ ∈ X̄ if g is differentiable at x̄, g_I is pseudoconvex at x̄, and there exists an x̂ ∈ X such that g_I(x̂) < 0, where

I = {i | gᵢ(x̄) = 0}
We observe here that each of the weak constraint qualifications 3, 4, and 5 above is essentially implied by the corresponding constraint qualification introduced earlier, that is, by 7.3.4, 7.3.5, and 5.4.3, respectively.

Before deriving the Kuhn-Tucker necessary optimality conditions under the weakened constraint qualifications above, we relate the weakened constraint qualifications to each other and to the Kuhn-Tucker constraint qualification 7.3.3.
Lemma
Let X be an open set in Rⁿ, let g be an m-dimensional vector function defined on X, and let

X̄ = {x | x ∈ X, g(x) ≤ 0}

(i) If g satisfies the weak reverse convex constraint qualification 4 at x̄, then g satisfies the weak Arrow-Hurwicz-Uzawa constraint qualification 3 at x̄.
(ii) If g satisfies the weak reverse convex constraint qualification 4 at x̄, then g satisfies the Kuhn-Tucker constraint qualification 7.3.3 at x̄.
(iii) If g satisfies Slater's weak constraint qualification 5 at x̄, then g satisfies the weak Arrow-Hurwicz-Uzawa constraint qualification 3 at x̄.

PROOF (i) We have here that Q is empty, and hence by taking z = 0 we immediately satisfy ∇g_P(x̄)z ≥ 0, and the weak Arrow-Hurwicz-Uzawa constraint qualification 3 is satisfied at x̄.

(ii) The proof is essentially identical to the proof of Lemma 7.3.6(ii), except that the corresponding inequality in the proof of 7.3.6(ii) is now obtained by the pseudoconcavity of gᵢ at x̄.

(iii) Let g satisfy Slater's weak constraint qualification 5 at x̄. Then there exists an x̂ ∈ X such that

g_I(x̂) < 0 = g_I(x̄)

where I = {i | gᵢ(x̄) = 0}. But since g_I is pseudoconvex at x̄, it follows from (the logical equivalent of) 9.3.1 that

∇g_I(x̄)(x̂ − x̄) < 0

Hence by taking z = x̄ − x̂, we have that ∇g_I(x̄)z > 0, and the weak Arrow-Hurwicz-Uzawa constraint qualification 3 is satisfied at x̄.
We summarize the relations obtained above between the various constraint qualifications in Fig. 10.2.1.

Kuhn-Tucker stationary-point necessary optimality theorem
Let X be an open set in Rⁿ, let θ and g be defined on X, let x̄ solve LMP 7.1.2 or MP 7.1.1, let θ and g be differentiable at x̄, and let g satisfy

(i) the Kuhn-Tucker constraint qualification 7.3.3 at x̄, or
(ii) the weak Arrow-Hurwicz-Uzawa constraint qualification 3 at x̄, or
(iii) the weak reverse convex constraint qualification 4 at x̄, or
(iv) Slater's weak constraint qualification 5 at x̄.

Then there exists a ū ∈ Rᵐ such that (x̄,ū) solves KTP 7.1.4.

Fig. 10.2.1 Relations between various constraint qualifications (CQ).

PROOF In view of Lemma 6 we need only establish the theorem under assumptions (i) and (ii) above.

(i) This is the same as 7.3.7(i).

(ii) This proof is identical with the proof of Theorem 7.3.7(ii) with the following modifications: V and W are replaced by P and Q, and Theorem 2 is used in the proof instead of Theorem 7.3.2.

3. Duality

In this section we will extend the Hanson-Huard strict converse duality theorem 8.1.6 in two directions. In the first direction we will show that (x̄,ū) need only be a local solution of the dual problem rather than a global solution. In the second direction we will relax the convexity of the objective function θ to pseudoconvexity, and the convexity of the constraint function g to quasiconvexity. We will also show that neither the weak duality theorem 8.1.3 nor Wolfe's duality theorem 8.1.4 holds for a pseudoconvex θ.
Strict converse duality theorem [Mangasarian 65]
Let X be an open set in Rⁿ, and let θ and g be differentiable on X. Let (x̄,ū) be a local solution of DP 8.1.2, that is,

ψ(x̄,ū) = max {ψ(x,u) | (x,u) ∈ Y ∩ B_δ(x̄,ū)}

where B_δ(x̄,ū) is an open ball in Rⁿ × Rᵐ with radius δ > 0 around (x̄,ū). Let θ be pseudoconvex at x̄, and let g be quasiconvex at x̄. If either

(i) ψ(x,ū) is twice continuously differentiable at x̄, and the n × n Hessian matrix ∇²ₓψ(x̄,ū) is nonsingular, or
(ii) there exists an open set Λ in Rᵐ containing ū and an n-dimensional differentiable vector function e on Λ such that

e(ū) = x̄ and ∇ₓψ[e(u),u] = 0 for u ∈ Λ

then x̄ solves MP 8.1.1, and

θ(x̄) = ψ(x̄,ū)

If x̃ solves MP 8.1.1, it does not follow that x̃ and some ũ solve DP 8.1.2 (even if g is linear), nor does it follow that θ(x̃) ≥ ψ(x,u) for any dual feasible point (x,u), that is, (x,u) ∈ Y.
PROOF The proof of the first part of the theorem is essentially identical to the proof of Theorem 8.1.6, except that we invoke the sufficient optimality theorem 10.1.2 here, whereas we invoked the sufficient optimality theorem 7.2.1 in the proof of 8.1.6. (The assumption that (x̄,ū) need only be a local solution of DP 8.1.2 does not change the proof of 8.1.6 and in fact could have been made in Theorem 8.1.6 also. It would have added nothing to that theorem, however, since under the assumptions of Theorem 8.1.6 each local maximum is also a global maximum. This, however, is not the case under the present assumptions, as can be seen from the example following this proof, in which a local maximum of the dual problem is not a global maximum.)
We establish now the second part of the theorem by means of the
following counterexample:

The solution of MP1 is obviously x̄ = 1, whereas DP1 has no maximum but has a supremum, which is equal to zero. (To see this, rewrite the dual problem in an equivalent single-variable form and note that the quadratic equation 1 − 2x + 2x² = 0 has only complex roots.) We also have dual feasible points (x̃,ũ), such as x̃ = 10, ũ = 20e⁻¹⁰⁰, such that θ(x̄) < ψ(x̃,ũ).
We give now an example where the first part of the above theorem applies and where the dual problem has no global maximum but a local maximum which is also a solution of the primal problem. Consider the primal minimization problem

min 2x subject to −x³ − 1 ≤ 0, x ∈ R

and its dual

max ψ(x,u) = 2x + u(−x³ − 1) subject to 2 − 3ux² = 0, u ≥ 0

Substituting u = 2/3x², x ≠ 0, from the dual constraint gives

φ(x) = (4/3)x − (2/3)(x)⁻²

Hence ∇φ(x) = 0 implies that x = −1, and since ∇²φ(x) < 0 for x < 0, φ is strictly concave on {x | x ∈ R, x < 0}, and x̄ = −1 is a local maximum of φ. However, x̄ = −1 is not a global maximum of φ on {x | x ∈ R, x > 0 or x < 0}, since φ approaches +∞ as x approaches +∞. However, (x̄,ū) = (−1,⅔) is a local solution of the dual problem, and ψ(x̄,ū) = −2.
It is obvious also that x̄ = −1 solves the primal problem and that θ(x̄) = −2. Hence the primal and dual solutions coincide, and θ(x̄) = ψ(x̄,ū).

We end this chapter with a corollary that follows directly from the proof of Theorem 8.1.6 or from the proof of Theorem 1 above.

Corollary
Let all the assumptions of Theorem 1 above hold, except that θ need not be pseudoconvex at x̄ nor g be quasiconvex at x̄. If (x̄,ū) is a local solution of DP 8.1.2, then (x̄,ū) is a Kuhn-Tucker point, that is, (x̄,ū) solves KTP 7.1.4.

Chapter Eleven
Optimality and Duality in the Presence of Nonlinear Equality Constraints

Up until now we have dealt almost exclusively with minimization problems in which the feasible region X̄ is determined by inequality constraints only, that is,

X̄ = {x | x ∈ X, g(x) ≤ 0}

where X is a subset of Rⁿ, and g is an m-dimensional vector function defined on X. (For exceptions see 5.3.3, 5.4.8, 7.2.2, and 7.3.13, where equality constraints are also considered.) In this chapter we shall examine in some detail minimization problems in which the feasible region X̄ is determined by inequality and nonlinear equality constraints, that is,

X̄ = {x | x ∈ X, g(x) ≤ 0, h(x) = 0}

where h is a k-dimensional vector function defined on X. In particular we shall obtain sufficient optimality criteria, necessary optimality criteria, and some duality results for the minimization problem

min θ(x) subject to g(x) ≤ 0, h(x) = 0, x ∈ X

We remark here that all the necessary optimality conditions of Chap. 7 that involved differentiable functions required that the set X be open. We derive in this chapter an important necessary optimality condition of the minimum-principle type which does not require that X be open.

1. Sufficient optimality criteria

The sufficient optimality criteria given here follow directly from Theorem 10.1.1 and Theorem 10.1.2 by observing that the equality constraint h(x) = 0 can be written as h(x) ≤ 0 and −h(x) ≤ 0, and that the negative of a quasiconcave function is quasiconvex.

Sufficient optimality theorem
Let X be an open set in Rⁿ, and let θ, g, and h be respectively a numerical function, an m-dimensional vector function, and a k-dimensional vector function, all defined on X. Let x̄ ∈ X, let

I = {i | gᵢ(x̄) = 0}

let θ be pseudoconvex at x̄, let g_I be differentiable and quasiconvex at x̄, and let h be differentiable, quasiconvex, and quasiconcave at x̄. If there exist ū ∈ Rᵐ and v̄ ∈ Rᵏ such that (x̄,ū,v̄) satisfies the conditions

[∇θ(x̄) + ū∇g(x̄) + v̄∇h(x̄)](x − x̄) ≥ 0 for all x ∈ X
g(x̄) ≤ 0, h(x̄) = 0, ūg(x̄) = 0, ū ≥ 0

then x̄ is a solution of the minimization problem

min θ(x) subject to g(x) ≤ 0, h(x) = 0, x ∈ X

Kuhn-Tucker sufficient optimality theorem
Theorem 1 above holds with the condition

∇θ(x̄) + ū∇g(x̄) + v̄∇h(x̄) = 0

replacing the condition

[∇θ(x̄) + ū∇g(x̄) + v̄∇h(x̄)](x − x̄) ≥ 0 for all x ∈ X
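The device of writing h(x) = 0 as the pair h(x) ≤ 0 and −h(x) ≤ 0 also translates directly into computation: the multiplier v of the equality constraint corresponds to the difference of the multipliers of the two inequalities, so it is free in sign. A minimal sketch (the problem data are ours; numpy assumed):

```python
import numpy as np

# min x1 + x2   s.t.  g(x) = x1**2 - 1 <= 0,  h(x) = x1 - x2 = 0
grad_theta = np.array([1.0, 1.0])

xbar = np.array([-1.0, -1.0])
grad_g = np.array([2 * xbar[0], 0.0])   # gradient of g at xbar
grad_h = np.array([1.0, -1.0])          # gradient of h at xbar

# With multipliers u_plus, u_minus >= 0 for the split pair, v = u_plus - u_minus.
# Stationarity grad_theta + u*grad_g + v*grad_h = 0 gives v = 1 and u = 1 >= 0.
u, v = 1.0, 1.0
print(np.allclose(grad_theta + u * grad_g + v * grad_h, 0))   # True
# g(xbar) = 0 (active) and h(xbar) = 0, so u*g(xbar) = 0 as required; since
# theta is linear, g convex, and h linear, Theorem 2 certifies xbar as global.
```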

2. "Minimum-principle" necessary optimality criteria: X not open

We consider in this section the minimization problem

min θ(x) subject to g(x) ≤ 0, h(x) = 0, x ∈ X
We do not assume here that X is an open subset of Rⁿ, that g is convex on X, or that h is linear on Rⁿ.† Without the assumption that X is open, we cannot derive necessary optimality criteria of either the Fritz John type (7.3.2) or of the Kuhn-Tucker type (7.3.7). We can, however, derive necessary optimality criteria of the minimum-principle type, which, when X is open, become necessary optimality criteria of the Fritz John or Kuhn-Tucker type. (The minimum-principle necessary optimality criterion we are about to derive has been referred to as a maximum principle in [Halkin 66, Halkin-Neustadt 66, Canon et al. 66, Mangasarian-Fromovitz 67] because of its close relation to Pontryagin's maximum principle [Pontryagin et al. 62], which is the fundamental optimality condition of optimal control. We have a minimum principle here instead of a maximum principle because the adjoint variables of Pontryagin's maximum principle, which play the role of Lagrange multipliers, are the negative of our Lagrange multipliers r̄₀, r̄, s̄, to be introduced below. Halkin [Halkin 66] has a maximum principle because he has a maximization problem instead of a minimization problem.)

We point out now the main difference between a minimum-principle necessary optimality criterion and the main condition of the Fritz John necessary optimality criterion. The minimum principle requires that

[r̄₀∇θ(x̄) + r̄∇g(x̄) + s̄∇h(x̄)](x − x̄) ≥ 0 for all x ∈ X̄

where x̄ is a solution of the minimization problem, r̄₀, r̄, and s̄ are Lagrange multipliers, and X̄ is the closure of X. The Fritz John criterion requires that

r̄₀∇θ(x̄) + r̄∇g(x̄) + s̄∇h(x̄) = 0

Obviously, the latter condition follows from the first one if X is open. If X is not open, the latter condition does not hold in general.
We begin by establishing a generalization of a linearization lemma
of [Mangasarian-Fromovitz 67a].
Linearization lemma
Let X be a convex set in Rⁿ with a nonempty interior int(X), and let Λ be an open set in Rⁿ. Let f be an l-dimensional vector function, and let h be a k-dimensional vector function, both defined on some open set containing X. Let

x̄ ∈ X ∩ Λ, f(x̄) = 0, h(x̄) = 0

Let f be differentiable at x̄, let h have continuous first partial derivatives at x̄, and let ∇hⱼ(x̄), j = 1, . . . , k, be linearly independent, that is,

q ∈ Rᵏ, q∇h(x̄) = 0 ⇒ q = 0

Then

f(x) < 0, h(x) = 0 has no solution x ∈ X ∩ Λ
⇒ ∇f(x̄)(x − x̄) < 0, ∇h(x̄)(x − x̄) = 0 has no solution x ∈ int(X)

† For the case when X is convex, g is convex on X, and h is linear on Rⁿ, we have the necessary optimality conditions 5.4.8.
PROOF k = n: The linear independence of ∇hⱼ(x̄), j = 1, . . . , k, is equivalent to the nonsingularity of ∇h(x̄). Hence ∇f(x̄)(x − x̄) < 0 cannot hold, because ∇h(x̄)(x − x̄) = 0 implies that x − x̄ = 0.

k > n: This case is excluded, because the assumption that ∇hⱼ(x̄), j = 1, . . . , k, are linearly independent implies that k ≤ n.

0 < k < n: We shall establish the lemma by proving the following backward implication

∇f(x̄)(x̃ − x̄) < 0, ∇h(x̄)(x̃ − x̄) = 0 for some x̃ ∈ int(X)
⇒ f(x) < 0, h(x) = 0 for some x ∈ X ∩ Λ

which is the logical equivalent of the implication of the lemma.

Since h(x̄) = 0, and since ∇hⱼ(x̄), j = 1, . . . , k, are linearly independent, we have, by the implicit function theorem D.3.1, that there exist: (i) a partition (x_I,x_K) of x such that x̄_I ∈ Rⁿ⁻ᵏ, x̄_K ∈ Rᵏ, (ii) an open set Ω in Rⁿ⁻ᵏ containing x̄_I, and (iii) a k-dimensional differentiable vector function e on Ω such that

x̄_K = e(x̄_I) and h[x_I,e(x_I)] = 0 for all x_I ∈ Ω
and

∇_{x_K}h(x̄) is nonsingular

Since h and e are differentiable, and since h[x_I,e(x_I)] = 0 for all x_I ∈ Ω, we have, by the chain rule D.1.6, that

∇_{x_I}h(x̄) + ∇_{x_K}h(x̄)∇e(x̄_I) = 0

Postmultiplication by (x̃_I − x̄_I) gives

∇_{x_I}h(x̄)(x̃_I − x̄_I) + ∇_{x_K}h(x̄)∇e(x̄_I)(x̃_I − x̄_I) = 0

But by assumption ∇h(x̄)(x̃ − x̄) = 0, or equivalently

∇_{x_I}h(x̄)(x̃_I − x̄_I) + ∇_{x_K}h(x̄)(x̃_K − x̄_K) = 0

Hence the last two distinct equalities and the nonsingularity of ∇_{x_K}h(x̄) imply that

∇e(x̄_I)(x̃_I − x̄_I) = x̃_K − x̄_K
Now, since Ω is open, since x̄_I ∈ Ω, and since e is differentiable at x̄_I, there exists a δ₀ > 0 such that for δ < δ₀

x̄_I + δ(x̃_I − x̄_I) ∈ Ω

and

e[x̄_I + δ(x̃_I − x̄_I)] = e(x̄_I) + δ∇e(x̄_I)(x̃_I − x̄_I) + δc

where c is a k-dimensional vector function of δ that tends to zero as δ tends to zero. By combining the last two equalities and the fact that x̄_K = e(x̄_I), we have that for δ < δ₀

e[x̄_I + δ(x̃_I − x̄_I)] = x̄_K + δ(x̃_K − x̄_K) + δc

But this relation and the fact that f is differentiable at x̄ imply that there exists a δ₁ such that 0 < δ₁ < δ₀ and such that for δ < δ₁

f[x̄_I + δ(x̃_I − x̄_I), x̄_K + δ(x̃_K − x̄_K) + δc]
= f(x̄) + δ{∇f(x̄)(x̃ − x̄) + ‖x̃_I − x̄_I, x̃_K − x̄_K + c‖b}

where ‖x_I,x_K‖ denotes (x_I x_I + x_K x_K)^½, and where b is an l-dimensional vector function which tends to zero as δ tends to zero. Since c tends to zero as δ tends to zero, and since (by assumption) ∇f(x̄)(x̃ − x̄) < 0, it follows that there exists a δ₂ such that 0 < δ₂ < δ₁, and such that for 0 < δ < δ₂ the quantity in the curly brackets { } in the expression above is strictly negative. Hence (since f(x̄) = 0) we have that

f[x̄_I + δ(x̃_I − x̄_I), e(x̄_I + δ(x̃_I − x̄_I))] < 0 for 0 < δ < δ₂

and, since h[x_I,e(x_I)] = 0 for x_I ∈ Ω, we also have that

h[x̄_I + δ(x̃_I − x̄_I), e(x̄_I + δ(x̃_I − x̄_I))] = 0 for 0 < δ < δ₀
Now, we have already shown that for δ < δ₀

e[x̄_I + δ(x̃_I − x̄_I)] = x̄_K + δ(x̃_K − x̄_K + c)

Because x̃ ∈ int(X), and because c tends to zero as δ tends to zero, there exists a δ₃ such that 0 < δ₃ < δ₂ and δ₃ < 1, and such that

(x̃_I, x̃_K + c) ∈ int(X) for δ < δ₃

Since x̄ ∈ X, it follows from the last two relations and the convexity of X that

[x̄_I + δ(x̃_I − x̄_I), e(x̄_I + δ(x̃_I − x̄_I))] ∈ X for 0 < δ < δ₃

Since x̄ ∈ Λ, and since Λ is open, it follows from the expression established above for e[x̄_I + δ(x̃_I − x̄_I)] that there exists a δ₄ such that 0 < δ₄ < δ₃ and such that

[x̄_I + δ(x̃_I − x̄_I), e(x̄_I + δ(x̃_I − x̄_I))] ∈ Λ for 0 < δ < δ₄

Hence, combining the above results, we have that any x given by

x = [x̄_I + δ(x̃_I − x̄_I), e(x̄_I + δ(x̃_I − x̄_I))], 0 < δ < δ₄

satisfies f(x) < 0, h(x) = 0 and is in X ∩ Λ. We have thus established the lemma for the case when n > k > 0.
k = 0: This is the case where there is no constraint h(x) = 0. We prove here that

∇f(x̄)(x̃ − x̄) < 0 for some x̃ ∈ int(X) ⇒ f(x) < 0 for some x ∈ X ∩ Λ

We have now that for 0 < δ < δ₀

f[x̄ + δ(x̃ − x̄)] = f(x̄) + δ{∇f(x̄)(x̃ − x̄) + ‖x̃ − x̄‖b}

Since b tends to zero as δ tends to zero, and since ∇f(x̄)(x̃ − x̄) < 0, there exists a δ̂ > 0 such that

f[x̄ + δ(x̃ − x̄)] < 0 for 0 < δ < δ̂

Since x̄ ∈ X ∩ Λ, and since X is convex and Λ is open, there exists a δ̄ > 0 such that δ̄ < δ̂ and δ̄ < 1, and such that

x̄ + δ(x̃ − x̄) ∈ X ∩ Λ for 0 < δ < δ̄

Hence any x = x̄ + δ(x̃ − x̄), 0 < δ < δ̄, satisfies f(x) < 0 and is in X ∩ Λ.

We derive now another lemma from the above one by using the fundamental theorem for convex functions, Theorem 4.2.1 (which is a consequence of the separation theorem for convex sets, Theorem 3.2.8). This will be the key lemma in deriving the minimum-principle necessary optimality criterion.

Lemma
Let X be a convex set in Rⁿ with a nonempty interior int(X), and let Λ be an open set in Rⁿ. Let f be an l-dimensional vector function, and let h be a k-dimensional vector function, both defined on some open set containing X. Let

x̄ ∈ X ∩ Λ, f(x̄) = 0, h(x̄) = 0, and let f(x) < 0, h(x) = 0 have no solution x in X ∩ Λ

Let f be differentiable at x̄, and let h have continuous first partial derivatives at x̄. Then there exist p ∈ Rˡ and q ∈ Rᵏ, p ≥ 0, (p,q) ≠ 0, such that

[p∇f(x̄) + q∇h(x̄)](x − x̄) ≥ 0 for all x ∈ X̄
PROOF If ∇hⱼ(x̄), j = 1, . . . , k, are linearly dependent, then there exists a q ≠ 0 such that q∇h(x̄) = 0, and the lemma is trivially satisfied by p = 0, q ≠ 0, and q∇h(x̄)(x − x̄) = 0 for all x ∈ X̄.

If ∇hⱼ(x̄), j = 1, . . . , k, are linearly independent, then by Lemma 1 above we have that the system

∇f(x̄)(x − x̄) < 0, ∇h(x̄)(x − x̄) = 0

has no solution x in int(X). Since the interior int(X) of a convex set is convex (see 3.1.7), it follows by Theorem 4.2.1 that there exist p ∈ Rˡ, q ∈ Rᵏ, p ≥ 0, (p,q) ≠ 0 such that

[p∇f(x̄) + q∇h(x̄)](x − x̄) ≥ 0 for all x ∈ int(X)

and since the above expression is continuous in x, and in fact linear, the above inequality holds also on the closure X̄ of X.
We are now ready to derive the fundamental necessary optimality criterion of this chapter.

Minimum-principle necessary optimality theorem
Let X be a convex set in Rⁿ with a nonempty interior int(X). Let θ be a numerical function, let g be an m-dimensional vector function, and let h be a k-dimensional vector function, all defined on some open set containing X. Let x̄ be a solution of

min θ(x) subject to g(x) ≤ 0, h(x) = 0, x ∈ X

Let θ and g be differentiable at x̄, and let h have continuous first partial derivatives at x̄. Then there exist r̄₀ ∈ R, r̄ ∈ Rᵐ, s̄ ∈ Rᵏ such that the following conditions are satisfied

[r̄₀∇θ(x̄) + r̄∇g(x̄) + s̄∇h(x̄)](x − x̄) ≥ 0 for all x ∈ X̄
r̄g(x̄) = 0
(r̄₀,r̄) ≥ 0, (r̄₀,r̄,s̄) ≠ 0
g(x̄) ≤ 0, h(x̄) = 0
REMARK The first of the above four relations is called a minimum principle because it is a necessary optimality condition for r̄₀θ(x) + r̄g(x) + s̄h(x) to have a minimum at x̄ (see Theorem 9.3.3). If in addition we have that r̄₀θ + r̄g + s̄h is pseudoconvex or convex at x̄ (this, in general, is not the case under the assumptions of the above theorem), then the minimum-principle condition implies that

r̄₀θ(x̄) + r̄g(x̄) + s̄h(x̄) = min_{x∈X̄} [r̄₀θ(x) + r̄g(x) + s̄h(x)]

The above condition becomes then equivalent to the saddlepoint condition of Problem 6.4.2.
PROOF Let

I = {i | gᵢ(x̄) = 0} and J = {i | gᵢ(x̄) < 0}

Let m_I and m_J denote the number of elements in the sets I and J respectively, so m_I + m_J = m.

Since g is defined on some open set containing X, and since g is differentiable at x̄, there exists a δ > 0 such that for i ∈ J and ‖x − x̄‖ < δ

gᵢ(x) < 0

Hence the set

Λ = {x | ‖x − x̄‖ < δ, g_J(x) < 0}

is an open set in Rⁿ. Now, we have that the system

θ(x) − θ(x̄) < 0, g_I(x) < 0, h(x) = 0

has no solution x in X ∩ Λ, for if it did have a solution, then x̄ would not be a solution of the minimization problem. But we also have that

θ(x̄) − θ(x̄) = 0, g_I(x̄) = 0, h(x̄) = 0

Hence by Lemma 2 above, there exist r̄₀ ∈ R, r̄_I ∈ R^{m_I}, s̄ ∈ Rᵏ, (r̄₀,r̄_I) ≥ 0, (r̄₀,r̄_I,s̄) ≠ 0 such that

[r̄₀∇θ(x̄) + r̄_I∇g_I(x̄) + s̄∇h(x̄)](x − x̄) ≥ 0 for all x ∈ X̄

By defining r̄_J = 0 ∈ R^{m_J} and observing that

r̄g(x̄) = r̄_I g_I(x̄) + r̄_J g_J(x̄) = 0

and

r̄∇g(x̄) = r̄_I∇g_I(x̄) + r̄_J∇g_J(x̄) = r̄_I∇g_I(x̄)

the theorem is established.


It should be remarked that the restriction that the convex set X
have a nonempty interior is a mild restriction, since in general a convex
set without an interior is equivalent to the intersection of a convex set
with a nonempty interior and a linear manifold \x \ x Rn, h(x] = 0}
(h linear on Rn). Since we allow equalities h(x) = 0 in Theorem 3 above,
169

11,8

Nonlinear Programming

convex sets with empty interiors can, in effect, be handled by the above
results. The convexity requirement on X is of course a restriction which
cannot be dispensed with easily. (See however [Halkin 66, Canon et al.
66].) If we replace the convexity requirement on X by the requirement
that XQ be open, a stronger necessary optimality condition than the above
one can be obtained. In effect this will be an extension of the Fritz John
stationary-point necessary optimality theorem 7.3.2 to the case of nonlinear equalities. We shall give this result in the next section of this
chapter.

3. Fritz John and Kuhn-Tucker stationary-point necessary


optimality criteria: X open
We derive in this section necessary optimality criteria of the Fritz John and Kuhn-Tucker types from the minimum principle of the previous section.

Fritz John stationary-point necessary optimality theorem [Mangasarian-Fromovitz 67a]
Let X be an open set in Rⁿ. Let θ be a numerical function on X, let g be an m-dimensional vector function on X, and let h be a k-dimensional vector function on X. Let x̄ be a (global) solution of the minimization problem

min θ(x) subject to g(x) ≤ 0, h(x) = 0, x ∈ X

or a local solution thereof, that is,

θ(x̄) = min {θ(x) | x ∈ B_δ(x̄), g(x) ≤ 0, h(x) = 0}

where B_δ(x̄) is an open ball around x̄ with radius δ. Let θ and g be differentiable at x̄, and let h have continuous first partial derivatives at x̄. Then there exist r̄₀ ∈ R, r̄ ∈ Rᵐ, s̄ ∈ Rᵏ such that the following conditions are satisfied

r̄₀∇θ(x̄) + r̄∇g(x̄) + s̄∇h(x̄) = 0
r̄g(x̄) = 0
(r̄₀,r̄) ≥ 0, (r̄₀,r̄,s̄) ≠ 0
g(x̄) ≤ 0, h(x̄) = 0

PROOF Let x̄ be a global or local solution of the minimization problem. In either case (since X is open) there exists an open ball B_ρ(x̄) around x̄ with radius ρ such that B_ρ(x̄) ⊂ B_δ(x̄) ⊂ X and

θ(x̄) = min {θ(x) | x ∈ B̄_ρ(x̄), g(x) ≤ 0, h(x) = 0}

Since B̄_ρ(x̄) is a convex set with a nonempty interior, we have by the minimum principle 11.2.3 that there exist r̄₀ ∈ R, r̄ ∈ Rᵐ, s̄ ∈ Rᵏ satisfying the above conditions, with the first condition in the form

[r̄₀∇θ(x̄) + r̄∇g(x̄) + s̄∇h(x̄)](x − x̄) ≥ 0 for all x ∈ B̄_ρ(x̄)

Since for some small positive ε

x = x̄ − ε[r̄₀∇θ(x̄) + r̄∇g(x̄) + s̄∇h(x̄)] ∈ B̄_ρ(x̄)

we have from the first inequality above that

r̄₀∇θ(x̄) + r̄∇g(x̄) + s̄∇h(x̄) = 0
To derive Kuhn-Tucker conditions from the above, we need to
impose constraint qualifications on the problem.
The Kuhn-Tucker constraint qualification (see 7.3.3)
Let X be an open set in Rⁿ, let g and h be m-dimensional and k-dimensional vector functions on X, and let

X̄ = {x | x ∈ X, g(x) ≤ 0, h(x) = 0}

g and h are said to satisfy the Kuhn-Tucker constraint qualification at x̄ ∈ X̄ if g and h are differentiable at x̄ and

∇g_I(x̄)z ≤ 0, ∇h(x̄)z = 0 ⇒ there exists an n-dimensional vector function e on the interval [0,1] such that e(0) = x̄, e(τ) ∈ X̄ for 0 ≤ τ ≤ 1, and e is differentiable at τ = 0 with de(0)/dτ = λz for some λ > 0

where

I = {i | gᵢ(x̄) = 0}
The weak Arrow-Hurwicz-Uzawa constraint qualification (see 10.2.3)
Let X be an open set in Rⁿ, let g and h be m-dimensional and k-dimensional vector functions on X, and let

X̄ = {x | x ∈ X, g(x) ≤ 0, h(x) = 0}

g and h are said to satisfy the weak Arrow-Hurwicz-Uzawa constraint qualification at x̄ ∈ X̄ if g and h are differentiable at x̄, h is both pseudoconvex and pseudoconcave at x̄, and

∇g_P(x̄)z ≥ 0, ∇g_Q(x̄)z > 0, ∇h(x̄)z = 0

has a solution z ∈ Rⁿ, where

P = {i | gᵢ(x̄) = 0 and gᵢ is pseudoconcave at x̄}
Q = {i | gᵢ(x̄) = 0 and gᵢ is not pseudoconcave at x̄}
The weak reverse convex constraint qualification (see 10.2.4)
Let X be an open set in Rⁿ, let g and h be m-dimensional and k-dimensional vector functions on X, and let

X̄ = {x | x ∈ X, g(x) ≤ 0, h(x) = 0}

g and h are said to satisfy the weak reverse convex constraint qualification at x̄ ∈ X̄ if g and h are differentiable at x̄, if for each i ∈ I = {i | gᵢ(x̄) = 0} either gᵢ is pseudoconcave at x̄ or gᵢ is linear on Rⁿ, and if h is both pseudoconvex and pseudoconcave at x̄.

The modified Arrow-Hurwicz-Uzawa constraint qualification [Mangasarian-Fromovitz 67a]
Let X be an open set in Rⁿ, let g and h be m-dimensional and k-dimensional vector functions on X, and let

X̄ = {x | x ∈ X, g(x) ≤ 0, h(x) = 0}

g and h are said to satisfy the modified Arrow-Hurwicz-Uzawa constraint qualification at x̄ ∈ X̄ if g is differentiable at x̄, h is continuously differentiable at x̄, ∇hᵢ(x̄), i = 1, . . . , k, are linearly independent, and

∇g_I(x̄)z < 0, ∇h(x̄)z = 0

has a solution z ∈ Rⁿ, where

I = {i | gᵢ(x̄) = 0}
Kuhn-Tucker stationary-point necessary optimality theorem
Let X be an open set in Rⁿ, and let θ, g, and h be respectively a numerical function, an m-dimensional vector function, and a k-dimensional vector function, all defined on X. Let x̄ be a (global) solution of the minimization problem

min θ(x) subject to g(x) ≤ 0, h(x) = 0, x ∈ X

or a local solution thereof, that is,

θ(x̄) = min {θ(x) | x ∈ B_δ(x̄), g(x) ≤ 0, h(x) = 0}

where B_δ(x̄) is some open ball around x̄ with radius δ. Let θ, g, and h be differentiable at x̄, and let g and h satisfy

(i) the Kuhn-Tucker constraint qualification 2 at x̄, or
(ii) the weak Arrow-Hurwicz-Uzawa constraint qualification 3 at x̄, or
(iii) the weak reverse convex constraint qualification 4 at x̄, or
(iv) the modified Arrow-Hurwicz-Uzawa constraint qualification 5 at x̄.

Then there exist a ū ∈ Rᵐ and a v̄ ∈ Rᵏ such that

∇θ(x̄) + ū∇g(x̄) + v̄∇h(x̄) = 0
ūg(x̄) = 0, ū ≥ 0
g(x̄) ≤ 0, h(x̄) = 0
PROOF (i)-(ii)-(iii) These parts of the theorem follow from Theorem 10.2.7, parts (i), (ii), and (iii), by replacing h(x) = 0 in the above theorem by h(x) ≤ 0 and −h(x) ≤ 0.

(iv) All we have to show here is that r̄₀ > 0 in the Fritz John theorem 1 above, for then we have that x̄, ū = r̄/r̄₀, and v̄ = s̄/r̄₀ satisfy the Kuhn-Tucker conditions above. We assume now that r̄₀ = 0 and exhibit a contradiction.

If I = {i | gᵢ(x̄) = 0} is empty, that is, there are no active constraints, then r̄ = 0 (since r̄g(x̄) = 0, r̄ ≥ 0, and g(x̄) < 0). Hence by Theorem 1

s̄∇h(x̄) = 0 with s̄ ≠ 0

which contradicts the assumption of 5 that ∇hᵢ(x̄), i = 1, . . . , k, are linearly independent. (If there are no equality constraints h(x) = 0, then (r̄₀,r̄) = 0 contradicts the condition (r̄₀,r̄) ≠ 0 of Theorem 1, since there is no s̄.)

If I is not empty, then by Theorem 1 we have that

r̄_I∇g_I(x̄) + s̄∇h(x̄) = 0

If r̄_I = 0, then s̄ ≠ 0, and we have a contradiction to the assumption of 5 that ∇hᵢ(x̄), i = 1, . . . , k, are linearly independent (if there is no s̄, then r̄_I = 0 implies (r̄₀,r̄) = 0, which contradicts the condition (r̄₀,r̄) ≠ 0 of Theorem 1). If r̄_I ≠ 0, then r̄_I ≥ 0 with at least one positive component. But by 5 there exists a z̄ such that

∇g_I(x̄)z̄ < 0, ∇h(x̄)z̄ = 0

Hence

[r̄_I∇g_I(x̄) + s̄∇h(x̄)]z̄ = r̄_I∇g_I(x̄)z̄ < 0

which contradicts the equality above that r̄_I∇g_I(x̄) + s̄∇h(x̄) = 0.

4. Duality with nonlinear equality constraints

The following strict converse duality results follow respectively from Theorem 10.3.1 and Corollary 10.3.2 by replacing the equality constraints h(x) = 0 by h(x) ≤ 0 and −h(x) ≤ 0.
Strict converse duality theorem
Let X be an open set in Rⁿ, and let θ, g, and h be respectively a numerical function, an m-dimensional vector function, and a k-dimensional vector function, all defined and differentiable on X. Let (x̄,ū,v̄) be a (global) solution of the dual problem

max ψ(x,u,v) = θ(x) + ug(x) + vh(x) subject to ∇ₓψ(x,u,v) = 0, u ≥ 0   (2)

or a local solution of the dual problem, that is,

ψ(x̄,ū,v̄) = max {ψ(x,u,v) | (x,u,v) ∈ Y ∩ B_δ(x̄,ū,v̄)}

where B_δ(x̄,ū,v̄) is an open ball in Rⁿ × Rᵐ × Rᵏ with radius δ > 0 around (x̄,ū,v̄). Let θ be pseudoconvex at x̄, let g be quasiconvex at x̄, and let h be both quasiconvex and quasiconcave at x̄. If either

(i) ψ(x,ū,v̄) is twice continuously differentiable at x̄ and the n × n Hessian matrix ∇²ₓψ(x̄,ū,v̄) is nonsingular, or
(ii) there exists an open set Λ in Rᵐ × Rᵏ containing (ū,v̄) and an n-dimensional differentiable vector function e on Λ such that

e(ū,v̄) = x̄ and ∇ₓψ[e(u,v),u,v] = 0 for (u,v) ∈ Λ

then x̄ solves the minimization problem

min θ(x) subject to g(x) ≤ 0, h(x) = 0, x ∈ X   (3)

and

θ(x̄) = ψ(x̄,ū,v̄)

If x̃ solves the minimization problem 3, it does not follow that x̃ and some ũ and ṽ solve the dual problem 2 (unless X is convex, θ and g are convex on X, and h is linear on Rⁿ), nor does it follow that θ(x̃) ≥ ψ(x,u,v) for any dual feasible point (x,u,v), that is, (x,u,v) ∈ Y.
Corollary
Let all the assumptions of Theorem 1 above hold, except that θ need not be pseudoconvex at x̄, nor g quasiconvex at x̄, nor h both quasiconvex and quasiconcave at x̄. If (x̄,ū,v̄) is a local solution of the dual problem 2, then (x̄,ū,v̄) is a Kuhn-Tucker point, that is,

∇θ(x̄) + ū∇g(x̄) + v̄∇h(x̄) = 0, ūg(x̄) = 0, ū ≥ 0, g(x̄) ≤ 0, h(x̄) = 0

Appendix A
Vectors and Matrices

We collect in this appendix some of the bare essentials of real linear algebra which are needed in this book. For more detail the reader is referred to [Gale 60, Halmos 58]. We assume here that the reader is familiar with the notation and definitions of Chap. 1.

1. Vectors

Fundamental theorem of vector spaces
Let each of the vectors y, y¹, . . . , yᵐ in Rⁿ be a linear combination of the vectors x¹, x², . . . , xᵐ in Rⁿ; then y, y¹, . . . , yᵐ are linearly dependent.

PROOF The proof will be by induction on m. If m = 1, then

y = p₀x¹ and y¹ = p₁x¹

If p₀ = p₁ = 0, then y = y¹ = 0, and y and y¹ are linearly dependent. If not, then p₀ ≠ 0, say. Then

p₁y − p₀y¹ = 0

and y and y¹ are linearly dependent because p₀ ≠ 0.

Now assume that the theorem holds for m = k − 1, and we will show that it also holds for m = k. By hypothesis we have that

y = Σⱼ₌₁ᵏ pⱼxʲ and yⁱ = Σⱼ₌₁ᵏ pⱼⁱxʲ, i = 1, . . . , k

If all the pⱼⁱ are zero, then all the yⁱ are zero and hence linearly dependent. Assume now that at least one p₁ⁱ is not zero, say p₁¹. Define

z = y − (p₁/p₁¹)y¹ and zⁱ = yⁱ − (p₁ⁱ/p₁¹)y¹, i = 2, . . . , k

Then each of the k vectors z, z², . . . , zᵏ is a linear combination of the k − 1 vectors x², . . . , xᵏ, and hence by the induction hypothesis they are linearly dependent; that is, there exist numbers q₁, . . . , qₖ, not all zero, such that

q₁z + q₂z² + ··· + qₖzᵏ = 0

which shows that the yⁱ are linearly dependent.


Corollary
Any n + 1 vectors in Rⁿ are linearly dependent.

PROOF Let eⁱ be the vector in Rⁿ with zero for each element except the ith, which is 1. Then any vector x in Rⁿ is a linear combination of the eⁱ, for x = Σᵢ xᵢeⁱ. The corollary then follows from Theorem 1 above.

Corollary
Any m vectors in Rⁿ are linearly dependent if m > n.

PROOF By Corollary 2 any n + 1 vectors of the m vectors are linearly dependent; hence the m vectors are linearly dependent.
Corollary
Each system of n homogeneous linear equations in m unknowns, m > n, has a nonzero solution.

PROOF Consider the matrix equation Ax = 0, where A is any n × m matrix. The m columns A.ⱼ of A are vectors in Rⁿ and hence by Corollary 3 above are linearly dependent. Hence there exists an x ≠ 0 such that Ax = 0.
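Numerically, such a nonzero solution can be extracted from the singular value decomposition (a small sketch; numpy is assumed, and the particular matrix is ours):

```python
import numpy as np

# n = 2 homogeneous equations in m = 3 unknowns: Ax = 0 must have x != 0.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Right-singular vectors for zero singular values span the null space;
# with m > n at least one such direction always exists.
_, s, Vt = np.linalg.svd(A)
x = Vt[-1]                      # direction of the smallest singular value
print(np.allclose(A @ x, 0))    # True
print(np.linalg.norm(x))        # 1.0, so x is certainly nonzero
```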
Basis of a subset of Rⁿ
Let S be a subset of Rⁿ, and let r be the maximum number of linearly independent vectors which can be chosen from S. Any set of r linearly independent vectors of S is called a basis for S.
Basis theorem
The linearly independent vectors x¹, . . . , xʳ are a basis for a subset S of Rⁿ if and only if every vector y in S is a linear combination of the xⁱ.

PROOF [Gale 60] Suppose every y in S is a linear combination of the xⁱ. Then S contains no larger set of linearly independent vectors, for any set of more than r vectors must be dependent, since they are combinations of the xⁱ. Therefore r is the maximum number of linearly independent vectors which can be chosen from S, and hence the xⁱ are a basis for S.

Conversely, suppose that the xⁱ are a basis for S. Then by definition, r is the number of vectors in the largest set of linearly independent vectors that can be found in S. So if y is in S, then x¹, . . . , xʳ, y are linearly dependent, that is,

p₀y + p₁x¹ + ··· + pᵣxʳ = 0, (p₀,p₁, . . . ,pᵣ) ≠ 0

and p₀ ≠ 0, for otherwise the xⁱ would be linearly dependent. Hence

y = −(1/p₀)(p₁x¹ + ··· + pᵣxʳ)

which is the desired linear combination.

2. Matrices

Row and column rank of a matrix
The maximum number of linearly independent rows of a matrix is called the row rank of the matrix, and the maximum number of linearly independent columns of a matrix is called the column rank of the matrix.

Rank theorem
For any matrix A, the row rank and column rank are equal.

PROOF [Gale 60] Let r be the row rank of A and s its column rank, and suppose that r < s. Choose a row basis for A, which we may assume consists of the rows A₁., . . . , Aᵣ. (renumbering rows and columns of A clearly does not affect its row or column rank), and a column basis, which we assume consists of the columns A.₁, . . . , A.ₛ. Note that the equations

Σⱼ₌₁ˢ Aᵢⱼyⱼ = 0, i = 1, . . . , r

are r equations in s (> r) unknowns and hence have a nonzero solution y by Corollary A.1.4. Also, since A₁., . . . , Aᵣ. are a row basis, it follows from the basis theorem A.1.6 that for all k, Aₖ. = Σᵢ₌₁ʳ pᵢₖAᵢ. for some numbers pᵢₖ. Hence

Σⱼ₌₁ˢ Aₖⱼyⱼ = Σᵢ₌₁ʳ pᵢₖ Σⱼ₌₁ˢ Aᵢⱼyⱼ = 0 for all k

which is equivalent to

y₁A.₁ + ··· + yₛA.ₛ = 0

This shows that the columns A.₁, . . . , A.ₛ of A are linearly dependent, which contradicts the assumption that they were a basis. This contradiction shows that r ≥ s. The same argument applied to the transpose of A shows that s ≥ r. Hence r = s.
Rank of a matrix
The rank of a matrix is the column or row rank of the matrix
(which are equal by the above theorem).
Corollary
Let A be an r × n matrix with rank r (r ≤ n). Then for any b in Rʳ, the system Ax = b has a solution x in Rⁿ.

PROOF By Theorem 2 above, the column rank of A is r, and thus if A.₁, . . . , A.ᵣ, say, are a column basis, we have r linearly independent vectors in Rʳ, and the vector b ∈ Rʳ can be expressed as a linear combination of them, that is,

b = x₁A.₁ + ··· + xᵣA.ᵣ

By defining xⱼ = 0 for j = r + 1, . . . , n, we have that b = Ax.
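The constructive content of this corollary is what least-squares routines exploit (a sketch; numpy assumed, data ours): when A has full row rank, the minimum-norm solution of Ax = b is exact.

```python
import numpy as np

# r = 2 equations, n = 3 unknowns, rank(A) = 2 = r.
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0]])
b = np.array([3.0, 4.0])

x = np.linalg.lstsq(A, b, rcond=None)[0]   # minimum-norm solution
print(np.allclose(A @ x, b))               # True: the system is solvable
print(np.linalg.matrix_rank(A))            # 2
```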


Nonsingular matrix
An n × n matrix is said to be nonsingular if it has rank n.

Semidefinite matrix
An n × n matrix A is said to be positive semidefinite if xAx ≥ 0 for all x in Rⁿ, and negative semidefinite if xAx ≤ 0 for all x in Rⁿ.

Definite matrix
An n × n matrix A is said to be positive definite if xAx > 0 for all nonzero x in Rⁿ, and negative definite if xAx < 0 for all nonzero x in Rⁿ.

Obviously the negative of a positive semidefinite (definite) matrix is a negative semidefinite (definite) matrix, and conversely. Also, each positive (negative) definite matrix is also positive (negative) semidefinite.
Proposition
Each positive or negative definite matrix is nonsingular.

PROOF Let A be an n × n positive or negative definite matrix. If A is singular, then its rank is less than n, and there exists an x ∈ Rⁿ, x ≠ 0, such that Ax = 0. Hence xAx = 0 for some x ≠ 0, which contradicts the assumption that A is positive or negative definite. Hence A is nonsingular.

Appendix B
Résumé of Some Topological Properties of Rⁿ

We collect here some well-known topological properties of Rⁿ which can be found in greater detail in any modern book on real analysis or general topology [Buck 65, Fleming 65, Rudin 64, Berge 63, Simmons 63].

1. Open and closed sets

Open ball
Given a point x̄ ∈ Rⁿ and a real number λ > 0, the set

B_λ(x̄) = {x | x ∈ Rⁿ, ‖x − x̄‖ < λ}

is called an open ball around x̄ with radius λ. (Sometimes the subscript λ of B_λ(x̄) will be dropped for convenience.)

Interior point
A point x̄ is said to be an interior point of the set Γ ⊂ Rⁿ if there exists a number ε > 0 such that B_ε(x̄) ⊂ Γ.

Point of closure
A point x̄ is said to be a point of closure of the set Γ ⊂ Rⁿ if for each number ε > 0

B_ε(x̄) ∩ Γ ≠ ∅

Note that a point of closure of Γ need not be in Γ. For example 0 ∈ R is a point of closure of the infinite set Γ = {1, ½, ⅓, . . .}, but 0 ∉ Γ. On the other hand, every point in the set Γ is also a point of closure of Γ. In other words, a point of closure x̄ of a set Γ is a point such that there exist points in Γ that are arbitrarily close to it (closeness being measured by the distance δ(x,x̄) = ‖x − x̄‖, see 1.3.9).

Open set
A set Γ ⊂ Rⁿ such that every point of Γ is an interior point is said to be open.

Closed set
A set Γ ⊂ Rⁿ such that every point of closure of Γ is in Γ is said to be closed.

Closure of a set
The closure Γ̄ of a set Γ ⊂ Rⁿ is the set of points of closure of Γ. Obviously Γ ⊂ Γ̄, and for a closed set Γ = Γ̄.

Interior of a set
The interior int(Γ) of a set Γ ⊂ Rⁿ is the set of interior points of Γ. Obviously int(Γ) ⊂ Γ, and for an open set Γ = int(Γ).

Relatively open (closed) sets
Let Γ and Δ be two sets such that Γ ⊂ Δ ⊂ Rⁿ. Γ is said to be open (closed) relative to Δ if Γ = Δ ∩ Ω, where Ω is some open (closed) set in Rⁿ.

Obviously an open (closed) set Γ in Rⁿ is open (closed) relative to Rⁿ. If Γ ⊂ Δ, and if Γ is open (closed), then Γ is open (closed) relative to Δ, for Γ = Δ ∩ Γ.
Problem
Show that:
(i) Every open ball B_λ(ā) in Rⁿ is an open set.
(ii) The closure of an open ball B_λ(ā) in Rⁿ is {x | ‖x − ā‖ ≤ λ} and is closed. (The closure of an open ball is called a closed ball and is denoted by B̄_λ(ā).)
(iii) The interior of a closed ball B̄_λ(ā) is the open ball B_λ(ā).
Theorem
The family of open sets in Rⁿ has the following properties:
(i) Every union of open sets is open.
(ii) Every finite intersection of open sets is open.
(iii) The empty set ∅ and Rⁿ are open.


PROOF (i) Let (Γᵢ)ᵢ∈I be a family, finite or infinite, of open sets in Rⁿ. If x̄ ∈ Γ = ∪ᵢ Γᵢ, then x̄ ∈ Γᵢ for some i ∈ I, and there exists an ε > 0 such that

B_ε(x̄) ⊂ Γᵢ ⊂ Γ

Hence x̄ is an interior point of Γ, and Γ is open.

(ii) Let (Γᵢ)ᵢ∈I be a finite family of open sets in Rⁿ. If x̄ ∈ Γ = ∩ᵢ Γᵢ, then x̄ ∈ Γᵢ for each i ∈ I. Because each Γᵢ is open, there exist εᵢ > 0 such that

B_{εᵢ}(x̄) ⊂ Γᵢ for each i ∈ I

Take ε = min εᵢ > 0 (this is where the finiteness of I is used); then

B_ε(x̄) ⊂ Γ

and Γ is open.

(iii) Since the empty set ∅ contains no points, we need not find an open ball surrounding any point, and hence ∅ is open. The set Rⁿ is open because for each x̄ ∈ Rⁿ, B_ε(x̄) ⊂ Rⁿ for all ε > 0.
The family of open sets in Rⁿ, as defined in 4, is called a topology in Rⁿ. (In fact any family of sets, called open, which has properties (i), (ii), (iii) above is also called a topology in Rⁿ. For example the sets ∅, Rⁿ also form a topology in Rⁿ. We shall, however, be concerned here only with open sets as defined by 4.)
Theorem
Let Γ ⊂ Rⁿ. Then

∼Γ̄ = int(∼Γ)

PROOF Let x̄ ∈ Rⁿ. Then

x̄ ∈ ∼Γ̄ ⇔ B_ε(x̄) ∩ Γ = ∅ for some ε > 0
⇔ B_ε(x̄) ⊂ ∼Γ for some ε > 0
⇔ x̄ ∈ int(∼Γ)

Hence ∼Γ̄ = int(∼Γ).

Corollary
The complement (relative to Rⁿ) of an open set in Rⁿ is closed, and vice versa.

PROOF Γ is closed ⇔ Γ = Γ̄ ⇔ ∼Γ = ∼Γ̄ = int(∼Γ) ⇔ ∼Γ is open.

By using Corollary 12, the following theorem is a direct consequence of Theorem 10.

Theorem
The family of closed sets in Rⁿ has the following properties:
(i) Every intersection of closed sets is closed.
(ii) Every finite union of closed sets is closed.
(iii) The empty set ∅ and Rⁿ are closed.
Problem
Show that the interior of each set Γ ⊂ Rⁿ is an open set.

Problem
Show that the closure of each set Γ ⊂ Rⁿ is a closed set.

Problem
Let A be a fixed m × n matrix and b a fixed m-vector.
(i) Show that the set {x | x ∈ Rⁿ, Ax ≤ b}, and hence also the set {x | x ∈ Rⁿ, Ax = b}, are closed sets in Rⁿ.
(ii) Show that the set {x | x ∈ Rⁿ, Ax < b} is an open set in Rⁿ.
(iii) Show that a set defined by a mixture of the above, {x | x ∈ Rⁿ, A¹x ≤ b¹, A²x < b²}, is in general neither a closed nor an open set in Rⁿ.

2. Sequences and bounds

Sequence
A sequence in a set X is a function f from the set I of all positive integers into the set X. If f(n) = xⁿ ∈ X for n ∈ I, it is customary to denote the sequence f by the symbol {xⁿ} or by x¹, x², . . . . For any sequence of positive integers n₁, n₂, . . . , such that n₁ < n₂ < ···, the sequence x^{n₁}, x^{n₂}, . . . is called a subsequence of x¹, x², . . . . If X = Rⁿ, then x¹, x², . . . is a sequence of points in Rⁿ, and if X = R, then x¹, x², . . . is the familiar sequence of real numbers.

Limit point
Let x¹, x², . . . be a sequence of points in Rⁿ. A point x̄ ∈ Rⁿ is said to be a limit point of the sequence if for each ε > 0 and each positive integer n there exists an integer m ≥ n such that ‖xᵐ − x̄‖ < ε. (In the literature, a limit point is sometimes called an accumulation point or a cluster point.)

Limit
Let x¹, x², . . . be a sequence of points in Rⁿ. A point x̄ ∈ Rⁿ is said to be a limit of the sequence, or we say that the sequence converges or tends to a limit x̄, if for each ε > 0 there exists an integer n̄ such that ‖xⁿ − x̄‖ < ε for all n ≥ n̄.

Remark
Obviously a limit of a sequence is also a limit point of the sequence, but the converse is not necessarily true. (For example the sequence 1, −1, 1, −1, . . . has the limit points 1 and −1, but no limit. The sequence 1, ½, ⅓, . . . has a limit, and hence a limit point, zero.) Also note that if x̄ is a limit of x¹, x², . . . , then it is also a limit of each subsequence of x¹, x², . . . .
Theorem
(i) If x̄ is a limit point of the sequence x¹, x², . . . , then there exists a subsequence x^{n₁}, x^{n₂}, . . . which has x̄ as a limit, and conversely.
(ii) If a sequence x¹, x², . . . tends to a limit x̄, then there can be no limit point (and hence no limit) other than x̄.

PROOF (i) Let x̄ be a limit point of x¹, x², . . . . Then

for ε = 1 there exists an n₁ such that ‖x^{n₁} − x̄‖ < 1
for ε = ½ there exists an n₂ > n₁ such that ‖x^{n₂} − x̄‖ < ½

and so on. The subsequence x^{n₁}, x^{n₂}, . . . converges to x̄.

Conversely, if the subsequence x^{n₁}, x^{n₂}, . . . converges to x̄, then for given ε > 0 and n there exists an nᵢ ≥ n (in fact all sufficiently large nᵢ ≥ n) such that ‖x^{nᵢ} − x̄‖ < ε, and hence x̄ is a limit point of x¹, x², . . . .

(ii) Let the sequence x¹, x², . . . tend to the limit x̄, and let x̃ ≠ x̄. Then for 0 < ε < ½‖x̃ − x̄‖ there exists an integer n̄ such that

‖xⁿ − x̄‖ < ε for n ≥ n̄, and hence ‖xⁿ − x̃‖ ≥ ‖x̃ − x̄‖ − ‖xⁿ − x̄‖ > ε for n ≥ n̄ (by the triangle inequality, see 1.3.10)

and hence x̃ is not a limit point of the sequence.
Remark
Note that a limit point (and hence a limit) of a sequence x¹, x², . . . in Rⁿ is a point of closure of any set Γ in Rⁿ containing {x¹, x², . . .}. Conversely, if x̄ is a point of closure of the set Γ ⊂ Rⁿ, then there exists a sequence x¹, x², . . . in Γ (and hence also a subsequence x^{n₁}, x^{n₂}, . . .) such that x̄ is a limit point of x¹, x², . . . (and hence a limit of x^{n₁}, x^{n₂}, . . .).
Lower and upper bounds
Let Γ be a nonempty set of real numbers. Then

α is a lower bound of Γ ⇔ (x ∈ Γ ⇒ x ≥ α)
β is an upper bound of Γ ⇔ (x ∈ Γ ⇒ x ≤ β)

Greatest lower and least upper bounds
Let Γ be a nonempty set of real numbers. A lower bound α of Γ is a greatest lower bound of Γ (infimum of Γ, or inf Γ) if no bigger number is a lower bound of Γ. An upper bound β of Γ is a least upper bound of Γ (supremum of Γ, or sup Γ) if no smaller number is an upper bound of Γ. Equivalently, α = inf Γ if α is a lower bound of Γ and for each ε > 0 there exists an x ∈ Γ such that x < α + ε; similarly for sup Γ.

We shall take the following axiom as one of the axioms of the real number system [Birkhoff-MacLane 53].
Axiom
Any nonempty set Γ of real numbers which has a lower (upper) bound has a greatest lower bound (least upper bound).

If the set Γ has no infimum (or equivalently, by the above axiom, if it has no lower bound), we say that Γ is unbounded from below, and we write inf Γ = −∞. Similarly, if Γ has no supremum (or equivalently, by the above axiom, if it has no upper bound), we say Γ is unbounded from above, and we write sup Γ = +∞. Hence by augmenting the Euclidean line R by the two points +∞ and −∞, any nonempty set Γ will have an infimum, which may be −∞, and a supremum, which may be +∞. We shall follow the convention of writing inf ∅ = +∞ and sup ∅ = −∞.

We observe that neither inf Γ nor sup Γ need be in Γ. For example inf {1, ½, ⅓, . . .} is 0, but 0 is not in the set {1, ½, ⅓, . . .}.
Theorem
Every bounded nondecreasing (nonincreasing) sequence of real numbers has a limit.

PROOF Let x¹, x², . . . be a bounded nondecreasing sequence of real numbers. By the above Axiom 9, the sequence has a least upper bound β. Hence for each ε > 0 there exists an n̄ such that

β − ε < x^{n̄} ≤ β   (by 8)
β − ε < xⁿ ≤ β for all n ≥ n̄   (because the sequence is nondecreasing and β ≥ xⁿ for all n)

Hence β is the limit of the sequence. The proof for a bounded nonincreasing sequence is similar.

Cauchy convergence criterion
A sequence x¹, x², . . . in Rⁿ converges to a limit x̄ if and only if it is a Cauchy sequence, that is, for each ε > 0 there exists an n* such that ‖xᵐ − xⁿ‖ < ε for each m, n ≥ n*.

For a proof see [Buck 65, Fleming 65, Rudin 64].

3. Compact sets in Rⁿ

Bounded set
A set Γ ⊂ Rⁿ is bounded if there exists a real number α such that for each x ∈ Γ, ‖x‖ ≤ α.

A set Γ ⊂ Rⁿ which is both closed and bounded has some very interesting properties. In fact such sets constitute one of the most important classes of sets in analysis, the class of compact sets. We give in the theorem below a number of equivalent characterizations of a compact set in Rⁿ. (These characterizations are not necessarily equivalent in spaces more general than Rⁿ.)

Compact sets (theorem-definition)
A set Γ ⊂ Rⁿ is said to be compact if it satisfies any of the following equivalent conditions:

(i) Γ is closed and bounded.
(ii) (Bolzano-Weierstrass) Every sequence of points in Γ has a limit point in Γ.
(iii) (Finite intersection property) For any family (Γᵢ)ᵢ∈I of sets, closed relative to Γ, whose intersection ∩ᵢ∈I Γᵢ is empty, there exists a finite subfamily Γᵢ₁, . . . , Γᵢₘ whose intersection Γᵢ₁ ∩ ··· ∩ Γᵢₘ is empty; or equivalently, if every finite subfamily has a nonempty intersection, then ∩ᵢ∈I Γᵢ ≠ ∅.
(iv) (Heine-Borel) From every family (Γᵢ)ᵢ∈I of open sets whose union ∪ᵢ∈I Γᵢ contains Γ we can extract a finite subfamily Γᵢ₁, . . . , Γᵢₘ whose union Γᵢ₁ ∪ Γᵢ₂ ∪ ··· ∪ Γᵢₘ contains Γ (or equivalently stated: every open covering of Γ has a finite subcovering).

We shall not give here a proof of the equivalence of the above four
conditions; such a proof can be found in any of the references given at the
beginning of this Appendix. An especially lucid proof is also given in
chap. 2 of [Berge-Ghouila Houri 65].
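As an added illustration of the Bolzano-Weierstrass property (our example, not from the original), the bounded sequence x^n = (−1)^n (1 − 1/n) lies in the compact set [−1, 1]; its even-indexed subsequence converges to the limit point 1, which again lies in [−1, 1].

# Illustrative sketch: a bounded sequence in the compact set [-1, 1]
# and a subsequence converging to a limit point inside the set.
def x(n: int) -> float:
    return (-1) ** n * (1.0 - 1.0 / n)

for n in [2, 4, 16, 256, 65536]:   # even indices: x(n) -> +1
    print(n, x(n))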
Corollary
Let Γ and Δ be sets in R^n which are respectively compact and closed. Then the sum Ω = Γ + Δ is closed.
PROOF Let x̄ belong to the closure of Ω. Then there exists a sequence x^1, x^2, . . . in Ω which converges to x̄ (see B.2.6). Then we can find a sequence y^1, y^2, . . . in Γ and a sequence z^1, z^2, . . . in Δ such that

    x^n = y^n + z^n    for n = 1, 2, . . .

Since Γ is compact, there exists a subsequence y^{n_1}, y^{n_2}, . . . which converges to ȳ ∈ Γ. Then

    lim_{i→∞} z^{n_i} = lim_{i→∞} (x^{n_i} − y^{n_i}) = x̄ − ȳ ∈ Δ    (because Δ is closed)

Hence

    x̄ = ȳ + (x̄ − ȳ) ∈ Γ + Δ = Ω

and Ω is closed.
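The compactness of Γ cannot be weakened to closedness here. A small added Python sketch (sets chosen by us): with the closed, noncompact sets Γ = {2, 3, 4, . . .} and Δ = {−n + 1/n : n = 2, 3, . . .}, the sums n + (−n + 1/n) = 1/n belong to Γ + Δ and converge to 0, yet 0 is not such a sum, so Γ + Δ is not closed.

# Illustrative counterexample: the sum of two closed, noncompact sets need not be closed.
# G = {2, 3, ...} and D = {-n + 1/n : n >= 2}; G + D contains every 1/n but not 0.
sums = [n + (-n + 1.0 / n) for n in range(2, 1000)]   # equals [1/2, 1/3, ..., 1/999]
print(min(sums))   # shrinks toward 0 as the range grows, but 0 is never attained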

Appendix C
Continuous and Semicontinuous Functions, Minima and Infima

We collect in this appendix some basic definitions and properties of continuous and semicontinuous functions defined on a subset of R^n. We also state some facts about the minima and maxima of such functions. More detailed discussions can be found in [Berge 63, Berge-Ghouila Houri 65].

1. Continuous and semicontinuous functions

Continuous function
A numerical function θ defined on a set Γ ⊂ R^n is said to be continuous at x̄ ∈ Γ (with respect to Γ) if either of the two following equivalent conditions is satisfied:
(i) For each ε > 0 there exists a δ > 0 such that

    x ∈ Γ and ||x − x̄|| < δ ⇒ |θ(x) − θ(x̄)| < ε

(ii) For each sequence x^1, x^2, . . . in Γ converging to x̄,

    lim_{n→∞} θ(x^n) = θ(lim_{n→∞} x^n) = θ(x̄)

θ is said to be continuous on Γ (with respect to Γ) if it is continuous (with respect to Γ) at each point x̄ ∈ Γ, or equivalently if any of the following equivalent conditions holds:
(iii) The sets

    {x : x ∈ Γ, θ(x) ≦ α}  and  {x : x ∈ Γ, θ(x) ≧ α}

are closed relative to Γ for each real α.
(iv) The sets

    {x : x ∈ Γ, θ(x) < α}  and  {x : x ∈ Γ, θ(x) > α}

are open relative to Γ for each real α.
(v) The epigraph of θ

    {(x, ζ) : x ∈ Γ, ζ ∈ R, ζ ≧ θ(x)}

and the hypograph of θ

    {(x, ζ) : x ∈ Γ, ζ ∈ R, ζ ≦ θ(x)}

are closed relative to Γ × R.

In each of the above conditions a pair of symmetric requirements can be found. By dropping one requirement from each pair, we arrive at the concept of semicontinuous functions.
Lower semicontinuous function
A numerical function θ defined on a set Γ ⊂ R^n is said to be lower semicontinuous at x̄ ∈ Γ (with respect to Γ) if either of the two following equivalent conditions is satisfied:
(i) For each ε > 0 there exists a δ > 0 such that

    x ∈ Γ and ||x − x̄|| < δ ⇒ θ(x) > θ(x̄) − ε

(ii) For each sequence x^1, x^2, . . . in Γ converging to x̄,

    lim inf_{n→∞} θ(x^n) ≧ θ(x̄)

where lim inf_{n→∞} θ(x^n) denotes the infimum of the limit points of the sequence of real numbers θ(x^1), θ(x^2), . . . .
θ is said to be lower semicontinuous on Γ (with respect to Γ) if it is lower semicontinuous (with respect to Γ) at each point x̄ ∈ Γ, or equivalently if any of the following equivalent conditions holds:
(iii) The set

    {x : x ∈ Γ, θ(x) ≦ α}

is closed relative to Γ for each real α.
(iv) The set

    {x : x ∈ Γ, θ(x) > α}

is open relative to Γ for each real α.
(v) The epigraph of θ

    {(x, ζ) : x ∈ Γ, ζ ∈ R, ζ ≧ θ(x)}

is closed relative to Γ × R.

Upper semicontinuous function
A numerical function θ defined on a set Γ ⊂ R^n is said to be upper semicontinuous at x̄ ∈ Γ (with respect to Γ) if either of the two following equivalent conditions is satisfied:
(i) For each ε > 0 there exists a δ > 0 such that

    x ∈ Γ and ||x − x̄|| < δ ⇒ θ(x) < θ(x̄) + ε

(ii) For each sequence x^1, x^2, . . . in Γ converging to x̄,

    lim sup_{n→∞} θ(x^n) ≦ θ(x̄)

where lim sup_{n→∞} θ(x^n) denotes the supremum of the limit points of the sequence of real numbers θ(x^1), θ(x^2), . . . .
θ is said to be upper semicontinuous on Γ (with respect to Γ) if it is upper semicontinuous (with respect to Γ) at each point x̄ ∈ Γ, or equivalently if any of the following equivalent conditions holds:
(iii) The set

    {x : x ∈ Γ, θ(x) ≧ α}

is closed relative to Γ for each real α.
(iv) The set

    {x : x ∈ Γ, θ(x) < α}

is open relative to Γ for each real α.
(v) The hypograph of θ

    {(x, ζ) : x ∈ Γ, ζ ∈ R, ζ ≦ θ(x)}

is closed relative to Γ × R.
Remark
θ is lower semicontinuous at x̄ ∈ Γ (with respect to Γ) if and only if −θ is upper semicontinuous at x̄ ∈ Γ (with respect to Γ). θ is continuous at x̄ ∈ Γ (with respect to Γ) if and only if it is both lower semicontinuous and upper semicontinuous at x̄ ∈ Γ (with respect to Γ).
Examples
The function graphed in Fig. C.1.1 is lower semicontinuous on R, and the function graphed in Fig. C.1.2 is upper semicontinuous on R.

Fig. C.1.1 A lower semicontinuous function on R.
Fig. C.1.2 An upper semicontinuous function on R.
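As an added numerical illustration (our own step function, not the one graphed above): a function which takes the lower of its two one-sided limiting values at a jump is lower semicontinuous there.

# Illustrative sketch: theta(x) = 0 for x <= 0, 1 for x > 0 is lower semicontinuous at 0,
# since the smallest limiting value along sequences x^n -> 0 is 0 = theta(0).
def theta(x: float) -> float:
    return 0.0 if x <= 0.0 else 1.0

values = [theta((-1) ** n / n) for n in range(1, 2000)]   # approach 0 from both sides
print(min(values), theta(0.0))   # liminf estimate 0.0 equals theta(0)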

Theorem
Let (θ_i)_{i∈I} be a (finite or infinite) family of lower semicontinuous functions on Γ ⊂ R^n. Its least upper bound

    θ(x) = sup_{i∈I} θ_i(x)

is lower semicontinuous on Γ. If I is finite, then its greatest lower bound

    φ(x) = inf_{i∈I} θ_i(x)

is also lower semicontinuous on Γ.

PROOF The first part follows directly from condition (iii) above and Theorem B.1.13(i) if we observe that for any real λ the set

    {x : x ∈ Γ, θ(x) ≦ λ} = ∩_{i∈I} {x : x ∈ Γ, θ_i(x) ≦ λ}

is closed (relative to Γ). The second part follows from (iii) and Theorem B.1.13(ii) if we observe that for any real λ the set

    {x : x ∈ Γ, φ(x) ≦ λ} = ∪_{i∈I} {x : x ∈ Γ, θ_i(x) ≦ λ}

is closed (relative to Γ), being a finite union of closed sets.


Corollary
Let (θ_i)_{i∈I} be a (finite or infinite) family of upper semicontinuous functions on Γ ⊂ R^n. Its greatest lower bound

    φ(x) = inf_{i∈I} θ_i(x)

is upper semicontinuous on Γ. If I is finite, then its least upper bound

    θ(x) = sup_{i∈I} θ_i(x)

is also upper semicontinuous on Γ.
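A quick added illustration of the first part of the theorem (family chosen by us): each θ_i(x) = min(1, max(0, ix)) is continuous, hence lower semicontinuous, and the supremum over i = 1, 2, . . . is the step function equal to 0 for x ≦ 0 and 1 for x > 0, which is lower semicontinuous but not continuous.

# Illustrative sketch: a pointwise supremum of continuous functions is lsc but
# need not be continuous: sup_i min(1, max(0, i*x)) = 0 for x <= 0 and 1 for x > 0.
def theta_i(i: int, x: float) -> float:
    return min(1.0, max(0.0, i * x))

def sup_theta(x: float, i_max: int = 10**6) -> float:
    return theta_i(i_max, x)      # a large index approximates the supremum

for x in [-1.0, 0.0, 1e-3, 0.5]:
    print(x, sup_theta(x))        # 0, 0, 1, 1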

2. Infimum (supremum) and minimum (maximum) of a set of real numbers

We recall that in Appendix B (B.2.8) we defined the infimum and supremum of a set of real numbers Γ as follows:

    α = inf Γ ⇔ (x ∈ Γ ⇒ x ≧ α) and (for each ε > 0 there exists an x ∈ Γ such that x < α + ε)

and

    β = sup Γ ⇔ (x ∈ Γ ⇒ x ≦ β) and (for each ε > 0 there exists an x ∈ Γ such that x > β − ε)

We also noted that neither α nor β need be in Γ. However, when α and β are in Γ, they are called respectively min Γ and max Γ.

Minimum (maximum)
Let Γ be a set of real numbers. If (inf Γ) ∈ Γ, then inf Γ is called the minimum of Γ and is denoted by min Γ. If (sup Γ) ∈ Γ, then sup Γ is called the maximum of Γ and is denoted by max Γ. Equivalently

    α = min Γ ⇔ α ∈ Γ and (x ∈ Γ ⇒ x ≧ α)
    β = max Γ ⇔ β ∈ Γ and (x ∈ Γ ⇒ x ≦ β)

3. Infimum (supremum) and minimum (maximum) of a numerical function

Bounded functions
A numerical function θ defined on the set Γ is said to be bounded from below on Γ if there exists a number α such that

    x ∈ Γ ⇒ θ(x) ≧ α

The number α is a lower bound of θ on Γ. θ is said to be bounded from above on Γ if there exists a number β such that

    x ∈ Γ ⇒ θ(x) ≦ β

The number β is an upper bound of θ on Γ.

Infimum of a numerical function
Let θ be a numerical function defined on the set Γ. If there is a number ᾱ such that

    x ∈ Γ ⇒ θ(x) ≧ ᾱ

and

    for each ε > 0 there exists an x ∈ Γ such that θ(x) < ᾱ + ε

then ᾱ is called the infimum of θ on Γ, and we write

    ᾱ = inf_{x∈Γ} θ(x)

Supremum of a numerical function
Let θ be a numerical function defined on the set Γ. If there is a number β̄ such that

    x ∈ Γ ⇒ θ(x) ≦ β̄

and

    for each ε > 0 there exists an x ∈ Γ such that θ(x) > β̄ − ε

then β̄ is called the supremum of θ on Γ, and we write

    β̄ = sup_{x∈Γ} θ(x)

If we admit the points ±∞, then every numerical function θ has a supremum and an infimum on the set Γ on which it is defined.
Examples

Minimum of a numerical function
Let θ be a numerical function defined on the set Γ. If there exists an x̄ ∈ Γ such that

    x ∈ Γ ⇒ θ(x) ≧ θ(x̄)

then θ(x̄) is called the minimum of θ on Γ, and we write

    θ(x̄) = min_{x∈Γ} θ(x)

We make the following remarks:
(i) Not every numerical function has a minimum; for example, e^{−x} and x have no minima on R.
(ii) The minimum of a numerical function, if it exists, must be finite.
(iii) The minimum of a numerical function, if it exists, is an attained infimum, that is, min_{x∈Γ} θ(x) = inf_{x∈Γ} θ(x) = θ(x̄) for some x̄ ∈ Γ.

Maximum of a numerical function
Let θ be a numerical function defined on the set Γ. If there exists an x̄ ∈ Γ such that

    x ∈ Γ ⇒ θ(x) ≦ θ(x̄)

then θ(x̄) is called the maximum of θ on Γ, and we write

    θ(x̄) = max_{x∈Γ} θ(x)

Remarks similar to (i), (ii), and (iii) above, which applied to the minimum of a function, also apply here to the maximum of a function.
4. Existence of a minimum and a maximum of a numerical function

We give now sufficient conditions for a numerical function defined on a subset of R^n to have a minimum and a maximum on that subset.

Theorem
A lower (upper) semicontinuous function θ defined on a compact (that is, closed and bounded) set Γ in R^n is bounded from below (above) and attains in Γ the value

    ᾱ = inf_{x∈Γ} θ(x)    (β̄ = sup_{x∈Γ} θ(x))

In other words, there exists an x̄ ∈ Γ such that

    θ(x̄) = min_{x∈Γ} θ(x)    (θ(x̄) = max_{x∈Γ} θ(x))

PROOF We prove the lower semicontinuous case. Let ᾱ = inf_{x∈Γ} θ(x), and let γ > ᾱ. Then the set

    Γ_γ = {x : x ∈ Γ, θ(x) ≦ γ}

is not empty and is closed relative to Γ by C.1.2(iii). Any finite intersection of such sets is again a set Γ_γ and hence is nonempty, so by the finite intersection theorem B.3.2(iii)

    ∩_{γ>ᾱ} Γ_γ ≠ ∅

Choose x̄ in this intersection; then θ(x̄) = ᾱ, and hence ᾱ is finite.
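A minimal added sketch of the theorem in action (example chosen by us): the continuous, hence lower semicontinuous, function θ(x) = (x − 1)² + 1/2 on the compact interval [0, 2] attains its infimum 1/2 at x̄ = 1; a fine grid search locates the minimizer.

# Illustrative sketch: a continuous function on a compact interval attains its minimum.
def theta(x: float) -> float:
    return (x - 1.0) ** 2 + 0.5

grid = [2.0 * k / 10**6 for k in range(10**6 + 1)]   # fine grid on [0, 2]
x_bar = min(grid, key=theta)
print(x_bar, theta(x_bar))   # approximately 1.0 and 0.5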


It should be remarked here that the above theorem cannot be strengthened by dropping the semicontinuity or the compactness assumptions. In other words, in order to ensure that

    inf_{x∈Γ} θ(x) = min_{x∈Γ} θ(x)

we cannot weaken the conditions that
(i) θ is lower semicontinuous on Γ,
(ii) Γ is closed, and
(iii) Γ is bounded.
We give examples below where the infimum is not attained whenever any one of the above conditions is violated.
(i) inf_{0≦x≦1} θ(x) = 0, but no minimum exists on the compact set {x : 0 ≦ x ≦ 1} for a function which is not lower semicontinuous there, such as θ(x) = x for 0 < x ≦ 1, θ(0) = 1.
(ii) inf_{0<x<1} θ(x) = 0, but no minimum exists of the continuous function θ(x) = x on the open set {x : 0 < x < 1}.
(iii) inf_{x∈R} θ(x) = 0, but no minimum exists of the continuous function θ(x) = e^{−x} on the closed unbounded set R.
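The following added Python sketch (using the functions chosen in our reconstruction of the three examples above) samples each case on a fine grid: the sampled values approach 0, but no sampled point ever attains it.

# Illustrative sketch: the three ways the existence theorem can fail.
import math

def theta_i(x):    # (i) not lower semicontinuous at 0 on the compact set [0, 1]
    return x if x > 0.0 else 1.0

def theta_ii(x):   # (ii) continuous, but the set (0, 1) is not closed
    return x

def theta_iii(x):  # (iii) continuous, but the closed set R is unbounded
    return math.exp(-x)

grid_i = [k / 10**6 for k in range(10**6 + 1)]         # [0, 1]
grid_ii = [k / 10**6 for k in range(1, 10**6)]         # (0, 1)
grid_iii = [k / 100.0 for k in range(-1000, 10001)]    # a window of R
for name, f, g in [("(i)", theta_i, grid_i), ("(ii)", theta_ii, grid_ii),
                   ("(iii)", theta_iii, grid_iii)]:
    print(name, min(f(x) for x in g))   # small but strictly positive in each case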


Appendix D
Differentiable Functions, Mean-value and Implicit Function Theorems

We summarize here the pertinent definitions and properties of differentiable functions and give the mean-value and implicit function theorems. For details the reader is referred to [Fleming 65, Rudin 64, Hestenes 66, Bartle 64].

1. Differentiable and twice-differentiable functions

Differentiable numerical function
Let θ be a numerical function defined on an open set Γ in R^n, and let x̄ be in Γ. θ is said to be differentiable at x̄ if for all x ∈ R^n such that x̄ + x ∈ Γ we have that

    θ(x̄ + x) = θ(x̄) + e(x̄)x + α(x̄, x) ||x||

where e(x̄) is an n-dimensional bounded vector, and α is a numerical function of x̄ and x such that

    lim_{x→0} α(x̄, x) = 0

θ is said to be differentiable on Γ if it is differentiable at each x̄ in Γ. [Obviously if θ is differentiable on the open set Γ, it is also differentiable on any subset Λ (open or not) of Γ. Hence when we say that θ is differentiable on some set Λ (open or not), we shall mean that θ is differentiable on some open set containing Λ.]
Partial derivatives and gradient of a numerical function
Let θ be a numerical function defined on an open set Γ in R^n, and let x̄ be in Γ. θ is said to have a partial derivative at x̄ with respect to x_i, i = 1, . . . , n, if

    [θ(x̄_1, . . . , x̄_i + δ, . . . , x̄_n) − θ(x̄)] / δ

tends to a finite limit when δ tends to zero. This limit is called the partial derivative of θ with respect to x_i at x̄ and is denoted by ∂θ(x̄)/∂x_i. The n-dimensional vector of the partial derivatives of θ with respect to x_1, . . . , x_n at x̄ is called the gradient of θ at x̄ and is denoted by ∇θ(x̄), that is,

    ∇θ(x̄) = (∂θ(x̄)/∂x_1, . . . , ∂θ(x̄)/∂x_n)
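An added Python sketch (example function chosen by us) approximates each partial derivative by the difference quotient above with a small δ and assembles the results into the gradient vector.

# Illustrative sketch: forward-difference approximation of the gradient.
def theta(x):                      # example on R^2: theta(x) = x1**2 + 3*x1*x2
    return x[0] ** 2 + 3.0 * x[0] * x[1]

def grad_fd(f, x, delta=1e-6):
    g = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += delta             # perturb only the i-th coordinate
        g.append((f(xp) - f(x)) / delta)
    return g

print(grad_fd(theta, [1.0, 2.0]))  # approximately [8.0, 3.0] = (2*1 + 3*2, 3*1)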

Theorem
Let θ be a numerical function defined on an open set Γ in R^n, and let x̄ be in Γ.
(i) If θ is differentiable at x̄, then θ is continuous at x̄, and ∇θ(x̄) exists (but not conversely), and

    θ(x̄ + x) = θ(x̄) + ∇θ(x̄)x + α(x̄, x) ||x||    where lim_{x→0} α(x̄, x) = 0

(ii) If θ has continuous partial derivatives at x̄ with respect to x_1, . . . , x_n, that is, ∇θ(x̄) exists and ∇θ is continuous at x̄, then θ is differentiable at x̄.
We can summarize most of the above result as follows:

    ∇θ continuous at x̄ ⇒ θ differentiable at x̄ ⇒ θ continuous at x̄ and ∇θ(x̄) exists

Differentiable vector function
Let f be an m-dimensional vector function defined on an open set Γ in R^n, and let x̄ be in Γ. f is said to be differentiable at x̄ (respectively on Γ) if each of its components f_1, . . . , f_m is differentiable at x̄ (respectively on Γ).
Partial derivatives and Jacobian of a vector function
Let f be an m-dimensional vector function defined on an open set Γ in R^n, and let x̄ be in Γ. f is said to have partial derivatives at x̄ with respect to x_1, . . . , x_n if each of its components f_1, . . . , f_m has partial derivatives at x̄ with respect to x_1, . . . , x_n. We write

    ∇f(x̄) = [∂f_i(x̄)/∂x_j]    i = 1, . . . , m,  j = 1, . . . , n

The m × n matrix ∇f(x̄) is called the Jacobian (matrix) of f at x̄.


Chain rule theorem
Let f be an m-dimensional vector function defined on an open set Γ in R^n, and let φ be a numerical function which is defined on R^m. The numerical function θ defined on Γ by

    θ(x) = φ[f(x)]

is differentiable at x̄ ∈ Γ if f is differentiable at x̄ and if φ is differentiable at ȳ = f(x̄), and

    ∇θ(x̄) = ∇φ(ȳ)∇f(x̄)
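An added numerical check of the chain rule (example functions chosen by us): for f(x) = (x_1 x_2, x_1 + x_2) and φ(y) = y_1 + y_2², the product ∇φ(ȳ)∇f(x̄) agrees with a finite-difference gradient of θ = φ ∘ f.

# Illustrative sketch: verifying grad(phi o f)(x) = grad(phi)(f(x)) * Jacobian(f)(x).
def f(x):                         # f: R^2 -> R^2
    return [x[0] * x[1], x[0] + x[1]]

def phi(y):                       # phi: R^2 -> R
    return y[0] + y[1] ** 2

def theta(x):                     # theta = phi o f
    return phi(f(x))

x_bar = [1.0, 2.0]
y_bar = f(x_bar)                  # (2, 3)
grad_phi = [1.0, 2.0 * y_bar[1]]  # grad of phi at y_bar: (1, 6)
jac_f = [[x_bar[1], x_bar[0]],    # Jacobian of f at x_bar, one row per component
         [1.0, 1.0]]
chain = [sum(grad_phi[i] * jac_f[i][j] for i in range(2)) for j in range(2)]
d = 1e-6
fd = [(theta([x_bar[0] + d, x_bar[1]]) - theta(x_bar)) / d,
      (theta([x_bar[0], x_bar[1] + d]) - theta(x_bar)) / d]
print(chain, fd)                  # both approximately [8.0, 7.0]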
Twice-differentiable numerical function and its Hessian
Let θ be a numerical function defined on an open set Γ in R^n, and let x̄ be in Γ. θ is said to be twice differentiable at x̄ if for all x ∈ R^n such that x̄ + x ∈ Γ we have that

    θ(x̄ + x) = θ(x̄) + ∇θ(x̄)x + ½x∇²θ(x̄)x + β(x̄, x) ||x||²

where ∇²θ(x̄) is an n × n matrix of bounded elements, and β is a numerical function of x̄ and x such that

    lim_{x→0} β(x̄, x) = 0

The n × n matrix ∇²θ(x̄) is called the Hessian (matrix) of θ at x̄, and its ijth element is written as ∂²θ(x̄)/∂x_i ∂x_j. Obviously if θ is twice differentiable at x̄, it must also be differentiable at x̄.
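An added sketch (example chosen by us) approximates the entries of the Hessian by second differences.

# Illustrative sketch: forward second-difference approximation of Hessian entries.
def theta(x):                     # example: theta(x) = x1**2 * x2
    return x[0] ** 2 * x[1]

def hess_entry(f, x, i, j, d=1e-4):
    def shift(v, k, h):
        w = list(v); w[k] += h; return w
    # forward second difference for d2f / dx_i dx_j
    return (f(shift(shift(x, i, d), j, d)) - f(shift(x, i, d))
            - f(shift(x, j, d)) + f(x)) / (d * d)

x_bar = [1.0, 2.0]
print([[hess_entry(theta, x_bar, i, j) for j in range(2)] for i in range(2)])
# approximately [[2*x2, 2*x1], [2*x1, 0]] = [[4, 2], [2, 0]]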


Theorem
Let θ be a numerical function defined on an open set Γ in R^n, and let x̄ be in Γ. Then
(i) ∇θ differentiable at x̄ ⇒ θ twice differentiable at x̄
(ii) ∇θ has continuous partial derivatives at x̄ ⇒ θ twice differentiable at x̄

Remark
The numbers ∂θ(x̄)/∂x_i, i = 1, . . . , n, are also called the first partial derivatives of θ at x̄, and ∂²θ(x̄)/∂x_i ∂x_j, i, j = 1, . . . , n, are also called the second partial derivatives of θ at x̄. In an analogous way we can define kth partial derivatives of θ at x̄.
Remark
Let θ be a numerical function defined on an open set Γ ⊂ R^n × R^k which is differentiable at (x̄, ȳ) ∈ Γ. We define then

    ∇_x θ(x̄, ȳ) = (∂θ(x̄, ȳ)/∂x_1, . . . , ∂θ(x̄, ȳ)/∂x_n)
    ∇_y θ(x̄, ȳ) = (∂θ(x̄, ȳ)/∂y_1, . . . , ∂θ(x̄, ȳ)/∂y_k)

Let f be an m-dimensional vector function defined on an open set Γ ⊂ R^n × R^k which is differentiable at (x̄, ȳ) ∈ Γ. We define then

    ∇_x f(x̄, ȳ) = [∂f_i(x̄, ȳ)/∂x_j]    i = 1, . . . , m,  j = 1, . . . , n

and

    ∇_y f(x̄, ȳ) = [∂f_i(x̄, ȳ)/∂y_j]    i = 1, . . . , m,  j = 1, . . . , k


2. Mean-value theorem and Taylor's theorem

Mean-value theorem
Let θ be a differentiable numerical function on the open convex set Γ ⊂ R^n, and let x^1, x^2 ∈ Γ. Then

    θ(x^2) = θ(x^1) + ∇θ[x^1 + γ(x^2 − x^1)](x^2 − x^1)

for some real number γ, 0 < γ < 1.

Taylor's theorem (second order)
Let θ be a twice-differentiable numerical function on the open convex set Γ ⊂ R^n, and let x^1, x^2 ∈ Γ. Then

    θ(x^2) = θ(x^1) + ∇θ(x^1)(x^2 − x^1) + ½(x^2 − x^1)∇²θ[x^1 + γ(x^2 − x^1)](x^2 − x^1)

for some real number γ, 0 < γ < 1.
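An added one-dimensional illustration (example chosen by us): for θ(x) = x³ between x^1 = 0 and x^2 = 1, the mean-value theorem guarantees a γ ∈ (0, 1) with 3γ² = θ(1) − θ(0) = 1, namely γ = 1/√3.

# Illustrative sketch: locating the mean-value point for theta(x) = x**3 on [0, 1].
theta = lambda x: x ** 3
dtheta = lambda x: 3.0 * x ** 2         # derivative of theta

x1, x2 = 0.0, 1.0
slope = (theta(x2) - theta(x1)) / (x2 - x1)
gamma = (slope / 3.0) ** 0.5            # solves dtheta(gamma) = slope for this theta
print(gamma, dtheta(gamma), slope)      # gamma ~ 0.577..., both slopes equal 1.0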

3. Implicit function theorem

Implicit function theorem
Let f be an m-dimensional vector function defined on an open set Λ ⊂ R^n × R^m, let f have continuous first partial derivatives at (x̄, ȳ) ∈ Λ, let f(x̄, ȳ) = 0, and let ∇_y f(x̄, ȳ) be nonsingular.
Then there exist a ball B_λ(x̄, ȳ) with radius λ > 0 in R^{n+m}, an open set Γ ⊂ R^n containing x̄, and an m-dimensional vector function e with continuous first partial derivatives on Γ such that

    ȳ = e(x̄)
    f[x, e(x)] = 0    for all x ∈ Γ

and

    ∇_y f(x, y) is nonsingular for all (x, y) ∈ B_λ(x̄, ȳ)
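As an added illustration (example chosen by us), take f(x, y) = x² + y² − 1 with f(x̄, ȳ) = 0 and ∇_y f(x̄, ȳ) = 2ȳ ≠ 0 at (x̄, ȳ) = (0, 1). The theorem then provides y = e(x) near x̄ (here e(x) = √(1 − x²)), which a few Newton steps in the y variable recover numerically.

# Illustrative sketch: computing the implicit function y = e(x) by Newton's method
# in y for f(x, y) = x**2 + y**2 - 1 near (0, 1), where df/dy = 2y is nonsingular.
def f(x, y):
    return x * x + y * y - 1.0

def f_y(x, y):
    return 2.0 * y

def e(x, y0=1.0, steps=20):
    y = y0
    for _ in range(steps):              # Newton iteration in y only
        y = y - f(x, y) / f_y(x, y)
    return y

for x in [0.0, 0.3, 0.6]:
    print(x, e(x), (1.0 - x * x) ** 0.5)   # matches the explicit solution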

References
Abadie, J.: On the Kuhn-Tucker Theorem, in J. Abadie (ed.), "Nonlinear Programming," pp. 21-36, North Holland Publishing Company, Amsterdam, 1967.
Anderson, K. W., and D. W. Hall: "Sets, Sequences and Mappings, the Basic
Concepts of Analysis," John Wiley & Sons, Inc., New York, 1963.
Arrow, K. J., and A. C. Enthoven: Quasiconcave Programming, Econometrica
29:779-800 (1961).
Arrow, K. J., L. Hurwicz, and H. Uzawa (eds.): "Studies in Linear and Nonlinear Programming," Stanford University Press, Stanford, Calif., 1958.
Arrow, K. J., L. Hurwicz, and H. Uzawa: Constraint Qualifications in Maximization Problems, Naval Research Logistics Quarterly 8: 175-191 (1961).
Bartle, R. G.: "The Elements of Real Analysis," John Wiley & Sons, Inc., New
York, 1964.
Berge, C.: "Topological Spaces," The MacMillan Company, New York, 1963.
Berge, C., and A. Ghouila Houri: "Programming, Games and Transportation
Networks," John Wiley & Sons, Inc., New York, 1965.
Birkhoff, G. and S. Maclane: "A Survey of Modern Algebra," The MacMillan
Company, New York, 1953.
Bohnenblust, H. F., S. Karlin, and L. S. Shapley: "Solutions of Discrete, Two-person Games," Contributions to the Theory of Games, vol. I, Annals of Mathematics Studies Number 24, Princeton, 1950, pp. 51-72.
Bracken, J., and G. P. McCormick: "Selected Applications of Nonlinear Programming," John Wiley & Sons, Inc., New York, 1968.
Brondsted, A.: Conjugate Convex Functions in Topological Vector Spaces, Matematisk-fysiske Meddelelser udgivet af Det Kongelige Danske Videnskabernes Selskab, 34(2): 1-27 (1964).
Browder, F. E.: On the Unification of the Calculus of Variations and the Theory of Monotone Nonlinear Operators in Banach Spaces, Proc. Nat. Acad. Sci. U.S., 56: 419-425 (1966).
Buck, R. C.: "Advanced Calculus," McGraw-Hill Book Company, New York,
1965.
Canon, M., C. Cullum, and E. Polak: Constrained Minimization Problems in
Finite-dimensional Spaces, Society for Industrial and Applied Mathematics Journal
on Control, 4: 528-547 (1966).
Carathéodory, C.: Über den Variabilitätsbereich der Koeffizienten von Potenzreihen, die gegebene Werte nicht annehmen, Mathematische Annalen, 64: 95-115 (1907).

Charnes, A., and W. W. Cooper: "Management Models and Industrial Applications of Linear Programming," vols. I, II, John Wiley & Sons, Inc., New York,
1961.
Cottle, R. W.: Symmetric Dual Quadratic Programs, Quarterly of Applied Mathematics 21: 237-243 (1963).
Courant, R.: "Differential and Integral Calculus," vol. II, 2d ed., rev., Interscience Publishers, New York, 1947.
Courant, R., and D. Hilbert: "Methods of Mathematical Physics," pp. 231-242,
Interscience Publishers, New York, 1953.
Dantzig, G. B.: "Linear Programming and Extensions," Princeton University
Press, Princeton, N.J., 1963.
Dantzig, G. B., E. Eisenberg, and R. W. Cottle: Symmetric Dual Nonlinear
Programs, Pacific Journal of Mathematics, 15: 809-812 (1965).
Dennis, J. B.: "Mathematical Programming and Electrical Networks," John
Wiley & Sons, Inc., New York, 1959.
Dieter, U.: Dualität bei konvexen Optimierungs- (Programmierungs-) Aufgaben, Unternehmensforschung, 9: 91-111 (1965a).
Dieter, U.: Dual Extremal Problems in Locally Convex Linear Spaces, Proceedings of the Colloquium on Convexity, Copenhagen, 52-57 (1965b).
Dieter, U.: Optimierungsaufgaben in topologischen Vektorräumen I: Dualitätstheorie, Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 5: 89-117 (1966).
Dorn, W. S.: Duality in Quadratic Programming, Quarterly of Applied Mathematics, 18: 155-162 (1960).
Dorn, W. S.: Self-dual Quadratic Programs, Society for Industrial and Applied
Mathematics Journal on Applied Mathematics, 9: 51-54 (1961).
Duffin, R. J.: Infinite Programs, in [Kuhn-Tucker 56], pp. 157-170.
Fan, K., I. Glicksburg, and A. J. Hoffman: Systems of Inequalities Involving
Convex Functions, American Mathematical Society Proceedings, 8: 617-622 (1957).
Farkas, J.: Über die Theorie der einfachen Ungleichungen, Journal für die Reine und Angewandte Mathematik, 124: 1-24 (1902).
Fenchel, W.: "Convex Cones, Sets and Functions," Lecture notes, Princeton
University, 1953, Armed Services Technical Information Agency, AD Number
22695.
Fiacco, A. V.: Second Order Sufficient Conditions for Weak and Strict Constrained Minima, Society for Industrial and Applied Mathematics Journal on
Applied Mathematics, 16: 105-108 (1968).
Fiacco, A. V., and G. P. McCormick: "Nonlinear Programming: Sequential


Unconstrained Minimization Techniques," John Wiley & Sons, Inc., New York,
1968.
Fleming, W. H.: "Functions of Several Variables," McGraw-Hill Book Company, New York, 1965.
Friedrichs, K. O.: Ein Verfahren der Variationsrechnung, das Minimum eines Integrals als das Maximum eines anderen Ausdruckes darzustellen, Nachrichten von der Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-Physikalische Klasse, 13-20 (1929).
Gale, D.: "The Theory of Linear Economic Models," McGraw-Hill Book Company, New York, 1960.
Gale, D., H. W. Kuhn, and A. W. Tucker: Linear Programming and the Theory
of Games, in [Koopmans 51], pp. 317-329.
Gass, S.: "Linear Programming," 2d ed., McGraw-Hill Book Company, New
York, 1964.
Gol'stein, E. G.: Dual Problems of Convex and Fractionally-convex Programming in Functional Spaces, Soviet Mathematics-Doklady (English translation), 8:
212-216 (1967).
Gordan, P.: Über die Auflösungen linearer Gleichungen mit reellen Coefficienten, Mathematische Annalen, 6: 23-28 (1873).
Graves, R. L., and P. Wolfe (ed.): "Recent Advances in Mathematical Programming," McGraw-Hill Book Company, New York, 1963.
Hadley, G. F.: "Linear Programming," Addison-Wesley Publishing Company, Inc., Reading, Mass., 1962.
Hadley, G. F.: "Nonlinear and Dynamic Programming," Addison-Wesley
Publishing Company, Inc., Reading, Mass., 1964.
Halkin, H.: A Maximum Principle of the Pontryagin Type for Systems Described
by Nonlinear Difference Equations, Society for Industrial and Applied Mathematics Journal on Control, 4: 90-111 (1966).
Halkin, H., and L. W. Neustadt: General Necessary Conditions for Optimization
Problems, Proc. Nat. Acad. Sci. U.S., 56: 1066-1071 (1966).
Halmos, P. R.: "Finite-dimensional Vector Spaces," D. Van Nostrand Company,
Inc., Princeton, N.J., 1958.
Hamilton, N. T., and J. Landin: "Set Theory: The Structure of Arithmetic,"
Allyn and Bacon, Inc., Boston, 1961.
Hanson, M. A.: A Duality Theorem in Nonlinear Programming with Nonlinear
Constraints, Australian Journal of Statistics 3: 64-71 (1961).
Hanson, M. A.: Bounds for Functionally Convex Optimal Control Problems,


Journal of Mathematical Analysis and Applications, 8: 84-89 (1964).
Hanson, M. A.: Duality for a Class of Infinite Programming Problems, Society
for Industrial and Applied Mathematics Journal on Applied Mathematics, 16:
318-323 (1968).
Hanson, M. A., and B. Mond: Quadratic Programming in Complex Space,
Journal of Mathematical Analysis and Applications, 23: 507-514 (1967).
Hestenes, M. R.: "Calculus of Variations and Optimal Control Theory," John
Wiley & Sons, Inc., New York, 1966.
Hu, T. C.: "Integer Programming and Network Flows," Addison-Wesley Publishing Company, Inc., Reading, Mass., 1969.
Huard, P.: Dual Programs, IBM J. Res. Develop., 6: 137-139 (1962).
Hurwicz, L.: Programming in Linear Spaces, in [Arrow et al. 58], pp. 38-102.
Jacob, J.-P., and P. Rossi: "General Duality in Mathematical Programming,"
IBM Research Report, 1969.
Jensen, J. L. W. V.: Sur les fonctions convexes et les inégalités entre les valeurs moyennes, Acta Mathematica, 30: 175-193 (1906).
John, F.: Extremum Problems with Inequalities as Subsidiary Conditions, in
K. O. Friedrichs, O. E. Neugebauer, and J. J. Stoker (eds.), "Studies and Essays:
Courant Anniversary Volume," pp. 187-204, Interscience Publishers, New
York, 1948.
Karamardian, S.: "Duality in Mathematical Programming," Operations Research
Center, University of California, Berkeley, Report Number 66-2, 1966; Journal
of Mathematical Analysis and Applications, 20: 344-358 (1967).
Karlin, S.: "Mathematical Methods and Theory in Games, Programming, and
Economics," vols. I, II, Addison-Wesley Publishing Company, Inc., Reading,
Mass., 1959.
Koopmans, T. C. (ed.): "Activity Analysis of Production and Allocation," John
Wiley & Sons, Inc., New York, 1951.
Kowalik, J., and M. R. Osborne: "Methods for Unconstrained Optimization
Problems," American Elsevier Publishing Company, Inc., New York, 1968.
Kretschmar, K. S.: Programmes in Paired Spaces, Canadian Journal of Mathematics, 13: 221-238 (1961).
Kuhn, H. W., and A. W. Tucker: Nonlinear Programming, in J. Neyman (ed.),
"Proceedings of the Second Berkeley Symposium on Mathematical Statistics
and Probability," pp. 481-492, University of California Press, Berkeley, Calif.,
1951.
Kuhn, H. W., and A. W. Tucker: "Linear Inequalities and Related Systems,"


Annals of Mathematics Studies Number 38, Princeton University Press, Princeton, N.J., 1956.
Künzi, H. P., W. Krelle, and W. Oettli: "Nonlinear Programming," Blaisdell
Publishing Company, Waltham, Mass., 1966.
Larsen, A., and E. Polak: Some Sufficient Conditions for Continuous-linear-programming Problems, International Journal of Engineering Science, 4: 583-604 (1966).
Lax, P. D.: Reciprocal Extremal Problems in Function Theory, Communications
on Pure and Applied Mathematics, 8: 437-453 (1955).
Levinson, N.: Linear Programming in Complex Space, Journal of Mathematical
Analysis and Applications, 14: 44-62 (1966).
Levitin, E. S., and B. T. Polyak: Constrained Minimization Methods, USSR
Computational Mathematics and Mathematical Physics (English translation),
6(5):1-50 (1966).
McCormick, G. P.: Second Order Conditions for Constrained Minima, Society
for Industrial and Applied Mathematics Journal on Applied Mathematics, 16:
641-652 (1967).
Mangasarian, O. L.: Duality in Nonlinear Programming, Quarterly of Applied Mathematics, 20: 300-302 (1962).
Mangasarian, O. L.: Pseudo-convex Functions, Society for Industrial and Applied Mathematics Journal on Control, 3: 281-290 (1965).
Mangasarian, O. L., and S. Fromovitz: A Maximum Principle in Mathematical Programming, in A. V. Balakrishnan and L. W. Neustadt (eds.), "Mathematical Theory of Control," pp. 85-95, Academic Press Inc., New York, 1967.
Mangasarian, O. L., and S. Fromovitz: The Fritz John Necessary Optimality Conditions in the Presence of Equality and Inequality Constraints, Journal of Mathematical Analysis and Applications, 17: 37-47 (1967a).
Mangasarian, O. L., and J. Ponstein: Minmax and Duality in Nonlinear Programming, Journal of Mathematical Analysis and Applications, 11: 504-518 (1965).
Martos, B.: The Direct Power of Adjacent Vertex Programming Methods,
Management Science, 12: 241-252 (1965).
Minty, G. J.: On the Monotonicity of the Gradient of a Convex Function,
Pacific Journal of Mathematics, 14: 243-247 (1964).
Mond, B.: A Symmetric Dual Theorem for Nonlinear Programs, Quarterly of
Applied Mathematics, 23: 265-269 (1965).
Mond, B., and R. W. Cottle: Self-duality in Mathematical Programming, Society


for Industrial and Applied Mathematics Journal on Applied Mathematics, 14:
420-423 (1966).
Mond, B., and M. A. Hanson: Duality for Variational Problems, Journal of
Mathematical Analysis and Applications, 18: 355-364 (1967).
Moreau, J.-J.: Proximité et dualité dans un espace hilbertien, Bulletin de la Société Mathématique de France, 93: 273-299 (1965).
Motzkin, T. S.: "Beiträge zur Theorie der linearen Ungleichungen," Inaugural Dissertation, Basel, Jerusalem, 1936.
Nikaido, H.: On von Neumann's Minimax Theorem, Pacific Journal of Mathematics, 4: 65-72 (1954).
Opial, Z.: "Nonexpansive and Monotone Mappings in Banach Spaces," Lecture
Notes 67-1, Division of Applied Mathematics, Brown University, Providence,
Rhode Island, 1967.
Ponstein, J.: Seven Types of Convexity, Society for Industrial and Applied
Mathematics Review, 9: 115-119 (1967).
Pontryagin, L. S., V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko:
"The Mathematical Theory of Optimal Processes," John Wiley & Sons, Inc.,
New York, 1962.
Rissanen, J.: On Duality Without Convexity, Journal of Mathematical Analysis
and Applications, 18: 269-275 (1967).
Ritter, K.: Duality for Nonlinear Programming in a Banach Space, Society for
Industrial and Applied Mathematics Journal on Applied Mathematics, 15: 294-302
(1967).
Rockafellar, R. T.: "Convex Functions and Dual Extremum Problems," Doctoral dissertation, Harvard University, 1963.
Rockafellar, R. T.: Duality Theorems for Convex Functions, Bulletin of the
American Mathematical Society, 70: 189-192 (1964).
Rockafellar, R. T.: Conjugates and Legendre Transforms of Convex Functions,
Canadian Journal of Mathematics, 19: 200-205 (1967a).
Rockafellar, R. T.: Convex Programming and Systems of Elementary Monotonic
Relations, Journal of Mathematical Analysis and Applications, 19: 543-564
(1967b).
Rockafellar, R. T.: Duality and Stability in Extremum Problems Involving
Convex Functions, Pacific Journal of Mathematics, 21: 167-187 (1967c).
Rockafellar, R. T.: "Convex Analysis," Princeton University Press, Princeton,
N.J., 1969.

Rosen, J. B.: The Gradient Projection Method for Nonlinear Programming,


Society for Industrial and Applied Mathematics Journal on Applied Mathematics,
8: 181-217 (1960) and 9: 514-553 (1961).
Rubinstein, G. S.: Dual Extremum Problems, Soviet Mathematics-Doklady
(English translation), 4: 1309-1312 (1963).
Rudin, W.: "Principles of Mathematical Analysis," 2d ed., McGraw-Hill Book
Company, New York, 1964.
Saaty, T. L., and J. Bram: "Nonlinear Mathematics," McGraw-Hill Book Company, New York, 1964.
Simmonard, M.: "Linear Programming," Prentice-Hall, Inc., Englewood Cliffs,
N.J., 1966.
Simmons, G. F.: "Introduction to Topology and Modern Analysis," McGraw-Hill Book Company, New York, 1963.
Slater, M.: "Lagrange Multipliers Revisited: A Contribution to Nonlinear Programming," Cowles Commission Discussion Paper, Mathematics 403, November, 1950.
Slater, M. L.: A Note on Motzkin's Transposition Theorem, Econometrica, 19:
185-186 (1951).
Stiemke, E.: Über positive Lösungen homogener linearer Gleichungen, Mathematische Annalen, 76: 340-342 (1915).
Stoer, J.: Duality in Nonlinear Programming and the Minimax Theorem,
Numerische Mathematik, 5: 371-379 (1963).
Stoer, J.: Über einen Dualitätssatz der nichtlinearen Programmierung, Numerische Mathematik, 6: 55-58 (1964).
Trefftz, E.: "Ein Gegenstück zum Ritzschen Verfahren," Verhandlungen des Zweiten Internationalen Kongresses für Technische Mechanik, p. 131, Zürich, 1927.
Trefftz, E.: Konvergenz und Fehlerschätzung beim Ritzschen Verfahren, Mathematische Annalen, 100: 503-521 (1928).
Tucker, A. W.: Dual Systems of Homogeneous Linear Relations, in [Kuhn-Tucker 56], pp. 3-18.
Tuy, Hoang: Sur les inégalités linéaires, Colloquium Mathematicum, 13: 107-123 (1964).
Tyndall, W. F.: A Duality Theorem for a Class of Continuous Linear Programming Problems, Society for Industrial and Applied Mathematics Journal on Applied
Mathematics, 13: 644-666 (1965).
Tyndall, W. F.: An Extended Duality Theorem for Continuous Linear Programming Problems, Notices of the American Mathematical Society, 14: 152-153 (1967).
Uzawa, H.: The Kuhn-Tucker Theorem in Concave Programming, in [Arrow
et al. 58], pp. 32-37.
Vajda, S.: "Mathematical Programming," Addison-Wesley Publishing Company,
Inc., Reading, Mass., 1961.
Valentine, F. A.: "Convex Sets," McGraw-Hill Book Company, New York, 1964.
Van Slyke, R. M., and R. J. B. Wets: "A Duality Theory for Abstract Mathematical Programs with Applications to Optimal Control Theory," Mathematical
Note Number 538, Mathematics Research Laboratory, Boeing Scientific Research
Laboratories, October, 1967.
Varaiya, P. P.: Nonlinear Programming in Banach Space, Society for Industrial
and Applied Mathematics Journal on Applied Mathematics, 16: 284-293 (1967).
von Neumann, J.: Zur Theorie der Gesellschaftsspiele, Mathematische Annalen, 100: 295-320 (1928).
Whinston, A.: Conjugate Functions and Dual Programs, Naval Research
Logistics Quarterly, 12: 315-322 (1965).
Whinston, A.: Some Applications of the Conjugate Function Theory to Duality,
in [Abadie 67], pp. 75-96.
Wolfe, P.: A Duality Theorem for Nonlinear Programming, Quarterly of Applied
Mathematics, 19: 239-244 (1961).
Zangwill, W. I.: "Nonlinear Programming: A Unified Approach," Prentice-Hall,
Inc., Englewood Cliffs, N.J., 1969.
Zarantonello, E. H.: "Solving Functional Equations by Contractive Averaging,"
Mathematics Research Center, University of Wisconsin, Technical Summary
Report Number 160, 1960.
Zoutendijk, G.: "Methods of Feasible Directions," Elsevier Publishing Company,
Amsterdam, 1960.
Zukhovitskiy, S. I., and L. I. Avdeyeva: "Linear and Convex Programming,"
W. B. Saunders Company, Philadelphia, 1966.


Indexes


Name Index
Abadie, J., 97, 99, 100, 205
Almgren, F. J., 62
Anderson, K. W., 3, 205
Arrow, K. J., 102, 103, 151, 205
Avdeyeva, L. I., 212

Fiacco, A. V., 206, 207


Fleming, W. H., 2, 62, 71, 182, 188,
200, 207
Friedrichs, K. O., 113, 207
Fromovitz, S., 72, 163, 170, 172, 209

Bartle, R. G., 200, 205


Berge, C., 3, 46, 55, 68, 122, 132, 182
189, 191, 205
Birkhoff, G., 187, 205
Bohnenblust, H. F., 67, 205
Boltyanskii, V. G., 72, 163, 210
Bracken, J., 205
Bram, J., 211
Brondsted, A., 122, 205
Browder, F. E., 87, 205
Buck, R. C., 182, 188, 205

Gale, D., 16, 33, 35, 113, 127, 177, 179,


207
Gamkrelidze, R. V., 72, 163, 210
Gass, S., 15, 207
Ghouila Houri, A., 55, 68, 122, 132,
189, 191, 205
Glicksburg, I., 63, 65, 206
Gol'stein, E. G., 122, 207
Gordan, P., 31, 207
Graves, R. L., 207

Canon, M., 72, 163, 170, 205


Carathéodory, C., 43, 205
Charnes, A., 206
Cooper, W. W., 206
Cottle, R. W., 122, 206, 210
Courant, R., 2, 113, 206
Cullum, C., 72, 163, 170, 205
Daniel, J. W., 130
Dantzig, G. B., 15, 18, 113, 122, 206
Dennis, J. B., 113, 206
Dieter, U., 122, 206
Dorn, W. S., 117, 122, 124, 125, 206
Duffin, R. J., 32, 206
Eisenberg, E., 122, 206
Enthoven, A. C., 151, 205

Fan, K., 63, 65, 206


Farkas, J., 16, 31, 206
Fenchel, W., 55, 122, 206

Hadley, G. F., 15, 207


Halkin, H., 72, 163, 170, 207
Hall, D. W., 3, 205
Halmos, P. R., 177, 207
Hamilton, N. T., 3, 207
Hanson, M. A., 113, 117, 118, 122, 137,
207, 208, 210
Hestenes, M. R., 200, 208
Hilbert, D., 113, 206
Hoffman, A. J., 63, 65, 206
Hu, T. C., 208
Huard, P., 115, 117, 118, 122, 208
Hurwicz, L., 102, 103, 122, 205, 208
Jacob, J.-P., 122, 208
Jensen, J. L. W. V., 61, 208
John, F., 77, 99, 208
Karamardian, S., 87, 117, 122, 137,
139, 208
Karlin, S., 67, 77-79, 205, 208
Koopmans, T. C., 208
Kowalik, J., 208
Krelle, W., 209


Kretschmar, K. S., 122, 208


Kuhn, H. W., 79, 94, 102, 105, 113,
127, 207-209
Künzi, H. P., 209

Rosen, J. B., 211


Rossi, P., 122, 208
Rubinstein, G. S., 122, 211
Rudin, W., 182, 188, 200, 211

Landin, J., 3, 207


Larsen, A., 122, 209
Lax, P. D., 113, 209
Levinson, N., 122, 209
Levitin, E. S., 209

Saaty, T. L., 211


Shapley, L. S., 67, 205
Simmonard, M., 15, 113, 211
Simmons, G. F., 182, 211
Slater, M. L., 27, 78, 211
Stiemke, E., 32, 211
Stoer, J., 122, 211

McCormick, G. P., 205, 207, 209


MacLane, S., 187, 205
Mangasarian, O. L., 72, 117, 118, 122,
131, 140, 151, 157, 163, 170, 172, 209
Martos, B., 132, 137, 151, 209
Minty, G. J., 87, 209
Mishchenko, E. F., 72, 163, 210
Mond, B., 113, 122, 208-210
Moreau, J.-J., 122, 210
Motzkin, T. S., 28, 210
Neustadt, L. W., 163, 207
Nikaido, H., 131, 132, 210
Oettli, W., 209
Opial, Z., 87, 210
Osborne, M. R., 208
Polak, E., 72, 163, 170, 205, 209
Polyak, B. T., 209
Ponstein, J., 117, 118, 122, 209, 210
Pontryagin, L. S., 72, 163, 210
Rissanen, J., 122, 210
Ritter, K., 122, 210
Rockafellar, R. T., 87, 122, 210


Trefftz, E., 113, 211


Tucker, A. W., 16, 20, 21, 24, 25, 28, 29, 79, 94, 102, 105, 113, 127, 207-209, 211
Tuy, Hoang, 131, 140, 151, 211
Tyndall, W. F., 122, 211, 212
Uzawa, H., 77, 79, 80, 102, 103, 205,
212
Vajda, S., 212
Valentine, F. A., 46, 55, 212
Van Slyke, R. M., 122, 212
Varaiya, P. P., 122, 212
von Neumann, J., 113, 212
Wets, R. J. B., 122, 212
Whinston, A., 122, 212
Wolfe, P., 114, 115, 121, 122, 207, 212
Zangwill, W. I., 212
Zarantonello, E. H., 87, 212
Zoutendijk, G., 111, 212
Zukhovitskiy, S. I., 212

Subject Index
Alternative:
table of theorems of, 34
theorems of, 27-37
Angle between two vectors, 8
Axiom of real number system, 188
Ball:
closed, 183
open, 182
Basis, 178
Bi-nonlinear function, 149
Bolzano-Weierstrass theorem, 189
Bounded function, 196
Bounded set, 188
Bounds:
greatest lower and least upper,
187
lower and upper, 187
Caratheodory's theorem, 43
Cauchy convergence, 188
Cauchy-Schwarz inequality, 7
Closed set, 183
Closure of a set, 183
Compact set, 189
Concave function, 56
differentiable, 83-91
minimum of, 73
strictly, 57
strictly concave: and differentiable,
87-88
and twice differentiable, 90-91
twice differentiable, 88-90
Concave functions, infimum of, 61
Constraint qualification:
Arrow-Hurwicz-Uzawa, 102
modified, 172
weak, 154
with convexity, 78-81
with differentiability, 102-105,
154-156, 171-172
Karlin's, 78
Kuhn-Tucker, 102, 171

Constraint qualification:
reverse convex, 103
weak, 155, 172
Slater's, 78
weak, 155
strict, 79
Constraint qualifications, relationships
between, 79, 103, 155
Continuous function, 191-192
Convex:
and concave functions, 54-68
generalized, 131-150
and pseudoconvex functions, relation
between, 144, 146
Convex combinations, 41
Convex function, 55
continuity of, 62
differentiable, 83-91
at a point, 83
on a set, 84
strictly, 56
strictly convex: and differentiable, 87-88

and twice differentiable, 90-91


twice differentiable, 88-90
Convex functions:
fundamental theorems for, 63-68
supremum of, 61
Convex hull, 44
Convex set, 39
Convex sets, 38-53
intersection of, 41
Differentiable function:
numerical, 200
vector, 201
Distance between two points, 7
Dual linear problem (LDP), 127,
130
Dual (maximization) problem (DP),
114, 123
Dual problem, unbounded, 119, 121,
126, 127
Dual variables, 71

Nonlinear Programming

Duality, 113-130, 157-160, 174-176


in linear programming, 126-130
with nonlinear equality constraints,
174-176
in nonlinear programming, 114-123
in quadratic programming, 123-126
Duality theorem:
Dorn's, 124
Dorn's strict converse, 125
Hanson-Huard strict converse, 118
linear programming, 127
strict converse, 117, 118, 124, 157,
160, 174, 176
weak, 114, 124
Wolfe's, 115
Equality constraints:
linear, 80, 95
nonlinear, 75, 112, 161-176
Euclidean space Rn , 6
Farkas' theorem, 16, 31, 37, 53
nonhomogeneous, 32
Feasible directions, method of, 111
Finite-intersection property, 189
Fractional function:
linear, 149
nonlinear, 148
Fritz John saddlepoint necessary
optimality theorem, 77
Fritz John saddlepoint problem
(FJSP), 71
Fritz John stationary-point necessary
optimality theorem, 99, 170
Fritz John stationary-point problem
(FJP), 93
Function, 11
linear vector, 12
numerical, 12
vector, 12
Gale's theorems, 33
Gordan's theorem, 31


Gordan's theorem:
generalized to convex functions, 65
Gradient of a function, 201
Halfspace, 40
Heine-Borel theorem, 189
Hessian matrix, 202
Implicit function theorem, 204
Inequalities, linear, 16-37
Infimum, 187, 195
of a numerical function, 196
Interior of a set, 183
Interior point, 182
Jacobian matrix, 202
Kuhn-Tucker saddlepoint necessary
optimality theorem, 79
in the presence of equality constraints, 80
Kuhn-Tucker saddlepoint problem
(KTSP), 71
Kuhn-Tucker stationary-point
necessary optimality criteria, 105,
111, 112, 156, 173
Kuhn-Tucker stationary-point
problem (KTP), 94
Kuhn-Tucker sufficient optimality
theorem, 94, 153, 162
Lagrange multipliers, 71
Lagrangian function, 71
Limit, 186
Limit point, 186
Line, 38
segments, 39
Linear combination, 6
nonnegative, 7
Linear dependence, 6


Linear inequalities, 16-37


Linear minimization problem (LMP),
127, 130
Linear programming:
duality, 126-130
optimality, 18-21
Linear systems, existence theorems,
21-26
Linearization lemma, 97, 153, 163
Local minimization problem (LMP),
70, 93
Mapping, 11
Matrices, 8-11
Matrix:
nonvacuous, 11
submatrices of, 10
Maximum, 195
of a numerical function, 197
existence of, 198
Maximum principle (see Minimum
principle)
Mean-value theorem, 204
Metric space, 7
Minimization problem (MP), 70, 93
local, 70, 93
solution set, 72
uniqueness, 73
Minimum, 195
of a numerical function, 197
existence of, 198
Minimum principle, 72, 141-142, 162-170
Motzkin's theorem, 28
Negative definite matrix, 91, 181
Negative semidefinite matrix, 89, 90,
181
Nonlinear programming problem, 1-3,
14-15
Nonsingular matrix, 181
Norm of a vector, 7
Notation, 13-15
Numerical function, 12

Open set, 183


Optimality criteria:
with differentiability, 92-112,
151-176
necessary, 97-112, 153-157,
162-174
sufficient, 94-97, 151-153, 162
necessary saddlepoint, 76-82
saddlepoint, 69-82
sufficient saddlepoint, 74-76
Partial derivatives:
of a numerical function, 200-201
of a vector function, 201-202
Plane, 40
Point of closure, 182
Polyhedron, 41
Polytope, 41
Positive definite matrix, 91, 181
Positive semidefinite matrix, 89, 90,
181
Primal minimization problem (MP),
114, 122
Primal problem, no minimum, 121,126,
127
Product of a convex set with a real
number, 45
Pseudoconcave function, 141
Pseudoconvex and strictly quasiconvex functions, relation
between, 143, 146
Pseudoconvex function, 140-141
Quadratic dual problem (QDP), 124
Quadratic minimization problem
(QMP), 123
Quadratic programming, 123-126
Quasiconcave function, 132
differentiable, 134
strictly, 137
Quasiconvex and strictly quasiconvex
functions, relation between, 139,
146
Quasiconvex function, 131


differentiable, 134
strictly, 137
Rank of a matrix:
column, 179
row, 179
Relatively closed set, 183
Relatively open set, 183
Rn, 6
Saddlepoint problem:
Fritz John (FJSP), 71
Kuhn-Tucker (KTSP), 71
Scalar product, 7
Semicontinuous function:
lower, 192-193
upper, 193
Separating plane, 46
Separation theorem, 49
strict, 50
Separation theorems, 46-53
Sequence, 185
Set, 3
complement of, 4
element of, 3
empty, 4
product with a real number, 45
subset of, 3
Set theoretic symbols, 5
Sets:
difference of, 4
disjoint, 4


Sets:
intersection of, 4
product of, 4-5
sum of, 45
union of, 4
Simplex, 42
Slater's theorem, 27
Stiemke's theorem, 32
Subspace, 40
Sum of two convex sets, 45
Supremum, 187, 195
of a numerical function, 196
Symbols, 5
Taylor's theorem, 204
Triangle inequality, 8, 41
Tucker's theorem, 29
Twice differentiable numerical
function, 202
Uniqueness of minimum solution, 73
Vector, 6
addition, 6
multiplication by a scalar, 6
norm of, 7
Vector function, 12
Vector space, fundamental theorem of,
177
Vectors:
angle between two, 8
scalar product of, 7
Vertex, 41
