Alexander J. Zaslavski
Numerical
Optimization with
Computational
Errors
Springer Optimization and Its Applications
VOLUME 108
Managing Editor
Panos M. Pardalos (University of Florida)
Editor–Combinatorial Optimization
Ding-Zhu Du (University of Texas at Dallas)
Advisory Board
J. Birge (University of Chicago)
C.A. Floudas (Princeton University)
F. Giannessi (University of Pisa)
H.D. Sherali (Virginia Polytechnic and State University)
T. Terlaky (McMaster University)
Y. Ye (Stanford University)
Numerical Optimization
with Computational Errors
Alexander J. Zaslavski
Department of Mathematics
The Technion – Israel Institute
of Technology
Haifa, Israel
Contents

1 Introduction
1.1 Subgradient Projection Method
1.2 The Mirror Descent Method
1.3 Proximal Point Method
1.4 Variational Inequalities
2 Subgradient Projection Algorithm
2.1 Preliminaries
2.2 A Convex Minimization Problem
2.3 The Main Lemma
2.4 Proof of Theorem 2.4
2.5 Subgradient Algorithm on Unbounded Sets
2.6 Proof of Theorem 2.8
2.7 Zero-Sum Games with Two Players
2.8 Proof of Proposition 2.9
2.9 Subgradient Algorithm for Zero-Sum Games
2.10 Proof of Theorem 2.11
3 The Mirror Descent Algorithm
3.1 Optimization on Bounded Sets
3.2 The Main Lemma
3.3 Proof of Theorem 3.1
3.4 Optimization on Unbounded Sets
3.5 Proof of Theorem 3.3
3.6 Zero-Sum Games
4 Gradient Algorithm with a Smooth Objective Function
4.1 Optimization on Bounded Sets
4.2 An Auxiliary Result and the Proof of Proposition 4.1
4.3 The Main Lemma
4.4 Proof of Theorem 4.2
4.5 Optimization on Unbounded Sets
References
Index
Chapter 1
Introduction
For each $x\in X$ and each $r>0$ set

$B_X(x,r)=\{y\in X:\ \|x-y\|\le r\}.$

Suppose that

$C\subset B_X(0,M_0),$

$|f(x)-f(y)|\le L\|x-y\|$ for all $x,y\in U.$

For every nonempty closed convex set $D\subset X$ and every $x\in X$ there is a unique point $P_D(x)\in D$ satisfying

$\|x-P_D(x)\|=\inf\{\|x-y\|:\ y\in D\}.$

We consider the minimization problem

$f(z)\to\min,\quad z\in C.$

Given a current iterate $x_t$ and a step size $a_t>0$, the inexact subgradient projection step computes $\xi_t$ and $x_{t+1}$ satisfying

$\xi_t\in\partial f(x_t)+B_X(0,\delta),$

$\|x_{t+1}-P_C(x_t-a_t\xi_t)\|\le\delta.$

Suppose that a solution $x^*\in C$ exists and that $x_0$ satisfies

$\|x_0\|\le M_0+1,$

while for all integers $t\ge 0$,

$\xi_t\in\partial f(x_t)+B_X(0,\delta)$

and

$\|x_{t+1}-P_C(x_t-a_t\xi_t)\|\le\delta.$

The convergence analysis estimates the weighted sum

$\sum_{t=0}^{T}a_t\,(f(x_t)-f(x^*)).$

Now we can think about the best choice of $T$. It is not difficult to see that it should be of the same order as $\lfloor\delta^{-1}\rfloor$.
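The inexact iteration above can be sketched in finite dimensions. This is a minimal illustration, not the book's formal setting: the feasible set is the Euclidean unit ball (so the projection is explicit), the objective is an l1 distance, and the helper names (`project_ball`, `subgradient_projection`) are my own.

```python
import numpy as np

def project_ball(x, radius=1.0):
    """Exact projection onto the ball C = B(0, radius)."""
    n = np.linalg.norm(x)
    return x if n <= radius else x * (radius / n)

def subgradient_projection(subgrad, x0, steps, delta=0.0, rng=None):
    """Inexact subgradient projection: the subgradient is perturbed by an
    error of norm at most delta, modeling xi_t in df(x_t) + B(0, delta)."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    iterates = [x]
    for a in steps:
        noise = rng.standard_normal(x.shape)
        noise *= delta / max(np.linalg.norm(noise), 1e-12)
        xi = subgrad(x) + noise              # delta-perturbed subgradient
        x = project_ball(x - a * xi)         # projection (computed exactly here)
        iterates.append(x)
    return iterates

# Example: f(x) = ||x - p||_1 minimized over the unit ball (p lies inside it).
p = np.array([0.3, -0.2])
sg = lambda x: np.sign(x - p)                # a subgradient of the l1 distance
xs = subgradient_projection(sg, np.array([1.0, 1.0]), [0.05] * 200, delta=1e-3)
best = min(np.abs(x - p).sum() for x in xs)  # best objective value seen
```

With a constant step the best value along the trajectory stalls at roughly the step size plus the error level, consistent with the discussion above.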
Let $X$ be a Hilbert space equipped with an inner product $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$. We use the notation introduced in the previous section.

Let $C$ be a nonempty closed convex subset of $X$, let $U$ be an open convex subset of $X$ such that $C\subset U$, and let $f:U\to\mathbb{R}$ be a convex function. Suppose that there exist $L>0$, $M_0>0$ such that

$C\subset B_X(0,M_0),$

$|f(x)-f(y)|\le L\|x-y\|$ for all $x,y\in U.$

For each function $h$ and each nonempty set $D$ put

$\inf(h;D)=\inf\{h(y):\ y\in D\}.$

In Chap. 3 we study the convergence of the mirror descent algorithm in the presence of computational errors. This method was introduced by Nemirovsky and Yudin for solving convex optimization problems [90]. Here we use a derivation of this algorithm proposed by Beck and Teboulle [19].

We consider the minimization problem

$f(z)\to\min,\quad z\in C.$

Suppose that a solution $x^*\in C$ exists, that $x_0$ satisfies

$\|x_0\|\le M_0+1,$

and that

$\xi_t\in\partial f(x_t)+B_X(0,\delta)$

for all integers $t\ge 0$. Then

$\sum_{t=0}^{T}a_t(f(x_t)-f(x^*)) \le 2^{-1}(2M_0+1)^2+\delta(2M_0+L+2)\sum_{t=0}^{T}a_t+\delta(T+1)(8M_0+8)+2^{-1}(L+1)^2\sum_{t=0}^{T}a_t^2$

and, after division by $\sum_{t=0}^{T}a_t$,

$\min\{f(x_t):\ t=0,\dots,T\}-f(x^*) \le 2^{-1}(2M_0+1)^2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}+\delta(2M_0+L+2)+\delta(T+1)(8M_0+8)\Big(\sum_{t=0}^{T}a_t\Big)^{-1}+2^{-1}(L+1)^2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sum_{t=0}^{T}a_t^2.$

If we think about the best choice of $T$, it is clear that it should be of the same order as $\lfloor\delta^{-1}\rfloor$.
In Chap. 9 we analyze the behavior of the proximal point method in a Hilbert space, which is an important tool in optimization theory. See, for example, [9, 15, 16, 29, 31, 34, 36, 53, 55, 69, 70, 77, 81, 87, 103, 104, 106, 107, 111, 113] and the references mentioned therein.

Let $X$ be a Hilbert space equipped with an inner product $\langle\cdot,\cdot\rangle$ which induces the norm $\|\cdot\|$. For each function $g:X\to\mathbb{R}\cup\{\infty\}$ set

$\inf(g)=\inf\{g(x):\ x\in X\},\quad \operatorname{Argmin}(g)=\{x\in X:\ g(x)=\inf(g)\},$

and suppose that

$\lim_{\|x\|\to\infty}f(x)=\infty.$

Let a point

$x^*\in\operatorname{Argmin}(f)$

and let $M>\inf(f)+4$. Evidently,

$\|x^*\|<M_0-1.$

Assume that

$\lambda_k\in[\lambda_1,\lambda_2],\quad k=0,1,\dots,$

and that the computational error $\delta$ is small enough that

$\delta^{1/2}(L+1)(2\lambda_1^{-1}+8M_0\lambda_1^{-1})\le 1$ and $(L+1)\delta^{1/2}\le\epsilon/4.$

Suppose that

$f(x_0)\le M.$

Then the method generates an iterate $x_k$ with

$f(x_k)\le\inf(f)+\epsilon.$

In other words, we obtain a point $\xi$ satisfying

$f(\xi)\le\inf(f)+\epsilon$

after $\lfloor c_1\epsilon^{-1}\rfloor$ iterations with the computational error $\delta=c_2\epsilon^2$, where the constant $c_1>0$ depends only on $M_0,\lambda_2$ and the constant $c_2>0$ depends only on $M_0,L,\lambda_1,\lambda_2$.
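The proximal point iteration replaces each step by an (inexactly computed) minimizer of the regularized subproblem $\min_y\,\{f(y)+(2\lambda_k)^{-1}\|y-x_k\|^2\}$. A minimal sketch under illustrative assumptions: the objective is a quadratic so the proximal map has a closed form, and the perturbation models the computational error.

```python
import numpy as np

def prox_step(f_prox, x, lam, delta=0.0, rng=None):
    """One inexact proximal point step: the returned point lies within
    delta of argmin_y { f(y) + (1/(2*lam)) * ||y - x||^2 }."""
    rng = rng or np.random.default_rng(1)
    y = f_prox(x, lam)
    err = rng.standard_normal(y.shape)
    err *= delta / max(np.linalg.norm(err), 1e-12)
    return y + err

# For f(x) = 0.5 * x^T A x with A positive definite, the proximal map is
# argmin_y { 0.5 y^T A y + (1/(2 lam)) ||y - x||^2 } = (I + lam*A)^{-1} x.
A = np.diag([1.0, 10.0])
f_prox = lambda x, lam: np.linalg.solve(np.eye(2) + lam * A, x)

x = np.array([5.0, 5.0])
for _ in range(50):
    x = prox_step(f_prox, x, lam=0.5, delta=1e-6)
```

Each step contracts toward the minimizer (the origin here), and the computational error only limits the final accuracy, mirroring the error/accuracy trade-off described above.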
$B(x,r)=\{y\in X:\ \|x-y\|\le r\}.$

We suppose that

$S\ne\emptyset.$

In Chap. 12, we present examples which provide simple and clear estimates for the sets $\tilde S_\epsilon$ in some important cases. These examples show that elements of $\tilde S_\epsilon$ can be considered as $\epsilon$-approximate solutions of the variational inequality.

In Chap. 12, in order to solve the variational inequality (to find $x\in S$), we use the algorithm known in the literature as the extragradient method [75]. In each iteration of this algorithm, in order to get the next iterate $x_{k+1}$, two orthogonal projections onto $C$ are calculated, according to the following iterative step: given the current iterate $x_k$, calculate

$y_k=P_C(x_k-\lambda_k f(x_k))$

and then

$x_{k+1}=P_C(x_k-\lambda_k f(y_k)),$

where both projections are computed only up to a constant $\delta>0$ which depends only on our computer system. Surely, in this situation one cannot expect the sequence $\{x_k\}_{k=0}^{\infty}$ to converge to the set $S$. The goal is to understand what subset of $C$ attracts all sequences $\{x_k\}_{k=0}^{\infty}$ generated by the algorithm. The main result of Chap. 12 (Theorem 12.2) shows that this subset of $C$ is the set $\tilde S_\epsilon$ with some $\epsilon>0$ depending on $\delta$. The examples considered in Chap. 12 show that one cannot expect to find an attracting set smaller than $\tilde S_\epsilon$, whose elements can be considered as approximate solutions of the variational inequality.
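The two-projection step above can be sketched directly. The example is illustrative and not from the text: the operator is the rotation field of the saddle function $f(u,v)=uv$ on $\mathbb{R}^2$ (a monotone operator whose unique zero is the origin), where plain forward steps spiral outward but the extragradient step converges.

```python
import numpy as np

def extragradient(F, proj, x0, lam, T, delta=0.0, rng=None):
    """Inexact extragradient method: both projections per iteration are
    computed only to within delta, as in the discussion above."""
    rng = rng or np.random.default_rng(2)
    def noisy(v):
        e = rng.standard_normal(v.shape)
        return v + e * (delta / max(np.linalg.norm(e), 1e-12))
    x = np.asarray(x0, dtype=float)
    for _ in range(T):
        y = noisy(proj(x - lam * F(x)))   # first (approximate) projection
        x = noisy(proj(x - lam * F(y)))   # second (approximate) projection
    return x

F = lambda z: np.array([z[1], -z[0]])     # monotone rotation operator
proj = lambda z: z                        # C = R^2, projection is trivial
z = extragradient(F, proj, np.array([1.0, 1.0]), lam=0.1, T=1000, delta=1e-5)
```

With projection errors of size delta the iterates do not reach the solution set exactly; they settle in a neighborhood whose radius shrinks with delta, which is exactly the attracting-set phenomenon described above.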
Chapter 2
Subgradient Projection Algorithm
2.1 Preliminaries
The subgradient projection algorithm is one of the most important tools in optimization theory and its applications. See, for example, [1–3, 12, 30, 44, 51, 79, 89, 92, 95, 96, 105, 108, 109, 112] and the references mentioned therein.

In this chapter we use this method for constrained minimization problems in Hilbert spaces equipped with an inner product denoted by $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$. For every $z\in\mathbb{R}$ denote by $\lfloor z\rfloor$ the largest integer which does not exceed $z$: $\lfloor z\rfloor=\max\{i\in\mathbb{R}:\ i\ \text{is an integer and}\ i\le z\}$.

Let $X$ be a Hilbert space. For each $x\in X$ and each $r>0$ set

$B_X(x,r)=\{y\in X:\ \|x-y\|\le r\}.$

Suppose that

$C\subset B_X(0,M_0),$ (2.2)

$|f(x)-f(y)|\le L\|x-y\|$ for all $x,y\in U.$ (2.3)

For all $z,y_0,y_1\in X$,

$\|z-y_0\|^2-\|z-y_1\|^2-\|y_0-y_1\|^2=2\langle z-y_1,\ y_1-y_0\rangle.$

Moreover, for every nonempty closed convex set $D\subset X$, every $x\in X$, and every $z\in D$,

$\langle z-P_D(x),\ x-P_D(x)\rangle\le 0,$

$\|z-P_D(x)\|^2+\|x-P_D(x)\|^2\le\|z-x\|^2.$
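The two projection inequalities just stated can be checked numerically. A small sketch under illustrative assumptions: the convex set is a box (so the projection is a componentwise clip), and the points are random; the helper name `proj_box` is mine.

```python
import numpy as np

def proj_box(x, lo, hi):
    """Projection onto the closed convex box D = [lo, hi]^n."""
    return np.clip(x, lo, hi)

rng = np.random.default_rng(3)
checks = []
for _ in range(1000):
    x = rng.uniform(-5, 5, 3)
    z = rng.uniform(-1, 1, 3)          # z lies in D = [-1, 1]^3
    p = proj_box(x, -1.0, 1.0)
    # <z - P_D(x), x - P_D(x)> <= 0
    ineq1 = np.dot(z - p, x - p) <= 1e-12
    # ||z - P_D(x)||^2 + ||x - P_D(x)||^2 <= ||z - x||^2
    ineq2 = np.dot(z - p, z - p) + np.dot(x - p, x - p) <= np.dot(z - x, z - x) + 1e-12
    checks.append(ineq1 and ineq2)
all_hold = all(checks)
```

The second inequality follows from the first by expanding $\|z-x\|^2$ around $P_D(x)$, which is exactly the identity displayed above.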
Lemma 2.3. Let $A>0$ and let $n\ge 2$ be an integer. Then the minimization problem

$\sum_{i=1}^{n}a_i^2\to\min$

subject to

$a=(a_1,\dots,a_n)\in\mathbb{R}^n$ and $\sum_{i=1}^{n}a_i=A$

has a unique solution. Eliminating

$a_n=A-\sum_{i=1}^{n-1}a_i$

and setting the partial derivatives of the resulting function of $(a_1,\dots,a_{n-1})$ to zero, we find that $a_i=a_n$ for all $i=1,\dots,n-1$ and $a_i=n^{-1}A$ for all $i=1,\dots,n$. Lemma 2.3 is proved.
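Lemma 2.3 can be verified numerically: among all feasible step-size vectors with a fixed sum, the uniform vector minimizes the sum of squares. The sampling scheme below is only an illustration.

```python
import numpy as np

def sum_of_squares(a):
    return float(np.sum(np.asarray(a) ** 2))

rng = np.random.default_rng(4)
T, A_T = 9, 3.0                                  # n = T + 1 entries, fixed sum A_T
uniform = np.full(T + 1, A_T / (T + 1))          # the minimizer of Lemma 2.3
best_random = min(
    sum_of_squares(A_T * w / w.sum())            # random feasible point
    for w in (rng.uniform(0.01, 1.0, T + 1) for _ in range(500))
)
uniform_value = sum_of_squares(uniform)          # equals A_T^2 / (T + 1)
```

The uniform choice attains the value $A_T^2/(n)$ with $n=T+1$, and every sampled feasible point does at least as badly, as the lemma asserts.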
Given a current iterate $x_t$, the algorithm computes $\xi_t$ and $x_{t+1}$ with

$\xi_t\in\partial f(x_t)+B_X(0,\delta),$

$\|x_{t+1}-P_C(x_t-a_t\xi_t)\|\le\delta.$

Suppose that

$x^*\in C$ (2.5)

is a solution of the problem and that $x_0$ satisfies

$\|x_0\|\le M_0+1$ (2.7)

and the relations above hold for all integers $t\ge 0$. Then, with $A_T=\sum_{t=0}^{T}a_t$,

$A_T^{-1}\sum_{t=0}^{T}a_t(f(x_t)-f(x^*)) \le [2^{-1}(2M_0+1)^2+\delta(T+1)(4M_0+1)]A_T^{-1}+\delta(2M_0+1)+2^{-1}A_T^{-1}(L+1)^2\sum_{t=0}^{T}a_t^2.$

For a fixed $A_T$ the estimate above leads us to minimize $\sum_{t=0}^{T}a_t^2$ on the set

$\Big\{a=(a_0,\dots,a_T)\in\mathbb{R}^{T+1}:\ a_i\ge 0,\ i=0,\dots,T,\ \sum_{i=0}^{T}a_i=A_T\Big\}.$

By Lemma 2.3, this function has a unique minimizer $a=(a_0,\dots,a_T)$, where $a_i=(T+1)^{-1}A_T$, $i=0,\dots,T$. This is the best choice of $a_t$, $t=0,1,\dots,T$.

Theorem 2.4 implies the following result.
Theorem 2.5. Let $\delta\in(0,1]$, $a>0$, and let $x^*\in C$ be a solution of the problem. Suppose that

$\|x_0\|\le M_0+1$

and that for all integers $t\ge 0$,

$\xi_t\in\partial f(x_t)+B_X(0,\delta)$

and

$\|x_{t+1}-P_C(x_t-a\xi_t)\|\le\delta.$

Now we will find the best $a>0$. Since $T$ can be arbitrarily large, we need to find a minimizer of the function obtained from the bound of Theorem 2.4 with $a_t=a$. Now we can think about the best choice of $T$: it is clear that it should be of the same order as $\lfloor\delta^{-1}\rfloor$. Putting $T=\lfloor\delta^{-1}\rfloor$, we obtain that

$f\Big((T+1)^{-1}\sum_{t=0}^{T}x_t\Big)-f(x^*),\quad \min\{f(x_t):\ t=0,\dots,T\}-f(x^*)$

are bounded by the right-hand side of the estimate above.

Note that in the theorems above $\delta$ is the computational error produced by our computer system. In view of the inequality above, whose right-hand side is bounded by $c_1\delta^{1/2}$ with a constant $c_1>0$, we conclude that after $T=\lfloor\delta^{-1}\rfloor$ iterations we obtain a point $\xi\in U$ such that

$B_X(\xi,\delta)\cap C\ne\emptyset$

and $f(\xi)$ exceeds the minimal value by at most a constant multiple of $\delta^{1/2}$.
$z\in C.$ (2.13)

Assume that $x\in U$, $a>0$, $\xi\in\partial f(x)+B_X(0,\delta)$, and that

$u\in U$ (2.16)

satisfies

$\|u-P_C(x-a\xi)\|\le\delta.$ (2.17)

Then the conclusion of the lemma holds. There exists

$l\in\partial f(x)$ (2.18)

such that

$\|l-\xi\|\le\delta.$ (2.19)

It follows from (2.4), (2.18), (2.19), (2.22), (2.23), and (2.24) that the asserted estimate holds.

It is clear that

$\|x_t\|\le M_0+1,\quad t=0,1,\dots.$

Applying the main lemma with

$z=x^*,\ a=a_t,\ x=x_t,\ \xi=\xi_t,\ u=x_{t+1},$

we obtain that

$\sum_{t=0}^{T}a_t(f(x_t)-f(x^*)) \le \sum_{t=0}^{T}\big(2^{-1}\|x^*-x_t\|^2-2^{-1}\|x^*-x_{t+1}\|^2\big)+\dots$

Thus (2.10) is true. Evidently, (2.10) implies (2.11). Theorem 2.4 is proved.
We use the notation and definitions introduced in Sect. 2.1. Let $X$ be a Hilbert space with an inner product $\langle\cdot,\cdot\rangle$, let $D$ be a nonempty closed convex subset of $X$, and let $V$ be an open convex subset of $X$ such that

$D\subset V.$ (2.27)

We suppose that

$D_{\min}\ne\emptyset.$ (2.29)

Let $L>0$ satisfy the Lipschitz condition on $V$, and let

$\{a_t\}_{t=0}^{\infty}\subset[\lambda_0,\lambda_1],$ (2.36)

$\|x_0\|\le M,$ (2.37)

and the iterates be generated by the inexact subgradient projection algorithm. Then

$\|x_i\|\le 3M+2,\quad i=0,\dots,q,$

and $f(x_q)$ is close to the infimum.

Note that in the theorem above $\delta$ is the computational error produced by our computer system. In view of the inequality above, in order to obtain a good approximate solution we need $\lfloor c_1\delta^{-1}\rfloor+1$ iterations, at the end of which we obtain a point $\xi$ with

$B_X(\xi,\delta)\cap D\ne\emptyset$

and value close to the infimum.

In the proof,

$\|x_1-z\|\le 2M+2,$ (2.44)

$\|x_1\|\le 3M+2.$ (2.45)

Set

$C=D\cap B_X(0,M_0).$ (2.47)

Assume that

$\|x_t-z\|\le 2M+2.$ (2.48)

Then

$f(x_t)-f(z) \le (2\lambda_0)^{-1}(\|z-x_t\|^2-\|z-x_{t+1}\|^2)+\lambda_0^{-1}\delta(4M_0+1)+\delta(2M_0+1)+2^{-1}\lambda_1(L+1)^2,$ (2.49)

$z\in C\subset B_X(0,M_0),$ (2.50)

$x_t\in U\cap B_X(0,M_0+1).$ (2.51)

It follows from (2.33), (2.36), (2.40), (2.48), (2.53), and Lemma 2.2 that

$\|z-P_D(x_t-a_t\xi_t)\| \le \|z-x_t+a_t\xi_t\| \le \|z-x_t\|+\|\xi_t\|a_t \le 2M+3,$

$\|P_D(x_t-a_t\xi_t)\|\le 3M+3,$ (2.54)

and hence

$P_D(x_t-a_t\xi_t)\in C.$ (2.55)

By (2.32), (2.38), (2.39), (2.46), (2.47), (2.50), (2.51), (2.55), (2.56), (2.57), and Lemma 2.7, which holds with

$x=x_t,\ a=a_t,\ \xi=\xi_t,\ u=x_{t+1},$

we have

$\|z-x_t\|^2-\|z-x_{t+1}\|^2\ge 0,$

$\|z-x_{t+1}\|\le\|z-x_t\|\le 2M+2.$ (2.59)

Therefore we assumed that (2.48) is true and showed that (2.58) and (2.59) hold. Hence by induction we showed that (2.49) holds for all $t=1,\dots,T$ and (2.48) holds for all $t=1,\dots,T+1$.

It follows from (2.49), which holds for all $t=1,\dots,T$, (2.41), and (2.44) that

$\sum_{t=1}^{T}(2\lambda_0)^{-1}(\|z-x_t\|^2-\|z-x_{t+1}\|^2)$

is bounded,

$\|z-x_t\|\le 2M+2,\quad t=1,\dots,T+1,$

$\|x_t\|\le 3M+2,\quad t=0,\dots,T+1.$

Consequently, there exists $q$ such that

$\|x_t\|\le 3M+2,\quad t=0,\dots,q,$

and

$f(x_q)-f(z)\le\epsilon_0.$
Suppose that

$C\subset U,\quad D\subset V.$ (2.60)

Let

$x^*\in C$ and $y^*\in D$ (2.64)

satisfy the saddle point condition

$f(x^*,y)\le f(x^*,y^*)\le f(x,y^*)$ for all $x\in C$ and all $y\in D,$

and let $t\in\{0,\dots,T+1\}$. Let

$\hat x_T=\Big(\sum_{i=0}^{T}a_i\Big)^{-1}\sum_{t=0}^{T}a_tx_t,\quad \hat y_T=\Big(\sum_{i=0}^{T}a_i\Big)^{-1}\sum_{t=0}^{T}a_ty_t.$ (2.69)

Then

$\Big|\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sum_{t=0}^{T}a_tf(x_t,y_t)-f(x^*,y^*)\Big| \le \Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sum_{t=0}^{T}b_t+\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sup\{\phi(u):\ u\in[0,2M_0+1]\},$ (2.71)

$\Big|f(\hat x_T,\hat y_T)-\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sum_{t=0}^{T}a_tf(x_t,y_t)\Big| \le \Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sum_{t=0}^{T}b_t+L\delta+\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sup\{\phi(u):\ u\in[0,2M_0+1]\},$ (2.72)

for each $z\in C$,

$f(z,\hat y_T)\ge f(\hat x_T,\hat y_T)-2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sup\{\phi(s):\ s\in[0,2M_0+1]\}-2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sum_{t=0}^{T}b_t-L\delta,$ (2.73)

and for each $v\in D$,

$f(\hat x_T,v)\le f(\hat x_T,\hat y_T)+2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sup\{\phi(s):\ s\in[0,2M_0+1]\}+2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sum_{t=0}^{T}b_t+L\delta.$ (2.74)
Corollary 2.10. Suppose that all the assumptions of Proposition 2.9 hold and that

$\tilde x\in C,\quad \tilde y\in D$

satisfy

$\|\tilde x-\hat x_T\|\le\delta,\quad \|\tilde y-\hat y_T\|\le\delta.$

Then for each $z\in C$,

$f(z,\tilde y)\ge f(\tilde x,\tilde y)-2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sup\{\phi(s):\ s\in[0,2M_0+1]\}-2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sum_{t=0}^{T}b_t-4L\delta,$

and for each $v\in D$,

$f(\tilde x,v)\le f(\tilde x,\tilde y)+2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sup\{\phi(s):\ s\in[0,2M_0+1]\}+2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sum_{t=0}^{T}b_t+4L\delta.$
By the Lipschitz property,

$|f(\tilde x,\tilde y)-f(\hat x_T,\hat y_T)| \le |f(\tilde x,\tilde y)-f(\tilde x,\hat y_T)|+|f(\tilde x,\hat y_T)-f(\hat x_T,\hat y_T)| \le L\|\tilde y-\hat y_T\|+L\|\tilde x-\hat x_T\| \le 2L\delta.$

For each $z\in C$,

$f(z,\tilde y)\ge f(z,\hat y_T)-L\delta \ge f(\hat x_T,\hat y_T)-2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sup\{\phi(s):\ s\in[0,2M_0+1]\}-2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sum_{t=0}^{T}b_t-2L\delta \ge f(\tilde x,\tilde y)-2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sup\{\phi(s):\ s\in[0,2M_0+1]\}-2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sum_{t=0}^{T}b_t-4L\delta,$

and for each $v\in D$,

$f(\tilde x,v)\le f(\hat x_T,v)+L\delta \le f(\hat x_T,\hat y_T)+2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sup\{\phi(s):\ s\in[0,2M_0+1]\}+2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sum_{t=0}^{T}b_t+2L\delta \le f(\tilde x,\tilde y)+2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sup\{\phi(s):\ s\in[0,2M_0+1]\}+2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sum_{t=0}^{T}b_t+4L\delta.$
It is clear that (2.70) is true. Let $t\in\{0,\dots,T\}$. By (2.65), (2.67), and (2.68),

$a_t(f(x_t,y_t)-f(x^*,y^*)) \le a_t(f(x_t,y_t)-f(x^*,y_t)) \le \phi(\|x^*-x_t\|)-\phi(\|x^*-x_{t+1}\|)+b_t,$ (2.77)

$a_t(f(x^*,y^*)-f(x_t,y_t)) \le a_t(f(x_t,y^*)-f(x_t,y_t)) \le \phi(\|y^*-y_t\|)-\phi(\|y^*-y_{t+1}\|)+b_t.$ (2.78)

Summing over $t$,

$\sum_{t=0}^{T}a_tf(x_t,y_t)-\sum_{t=0}^{T}a_tf(x^*,y^*) \le \sum_{t=0}^{T}(\phi(\|x^*-x_t\|)-\phi(\|x^*-x_{t+1}\|))+\sum_{t=0}^{T}b_t \le \phi(\|x^*-x_0\|)+\sum_{t=0}^{T}b_t,$ (2.79)

$\sum_{t=0}^{T}a_tf(x^*,y^*)-\sum_{t=0}^{T}a_tf(x_t,y_t) \le \sum_{t=0}^{T}(\phi(\|y^*-y_t\|)-\phi(\|y^*-y_{t+1}\|))+\sum_{t=0}^{T}b_t \le \phi(\|y^*-y_0\|)+\sum_{t=0}^{T}b_t.$ (2.80)
There exists

$z_T\in C$ (2.82)

such that

$\|z_T-\hat x_T\|\le\delta.$ (2.83)

In view of (2.82), we apply (2.67) with $z=z_T$ and obtain that for all $t=0,\dots,T$,

$a_t(f(x_t,y_t)-f(z_T,y_t)) \le \phi(\|z_T-x_t\|)-\phi(\|z_T-x_{t+1}\|)+b_t.$ (2.84)

By (2.83) and (2.84), for all $t=0,\dots,T$,

$a_t(f(x_t,y_t)-f(\hat x_T,y_t)) \le a_t(f(x_t,y_t)-f(z_T,y_t))+a_tL\delta \le \phi(\|z_T-x_t\|)-\phi(\|z_T-x_{t+1}\|)+b_t+a_tL\delta.$ (2.86)

Hence

$\sum_{t=0}^{T}a_tf(x_t,y_t)-\sum_{t=0}^{T}a_tf(\hat x_T,y_t) \le \sum_{t=0}^{T}(\phi(\|z_T-x_t\|)-\phi(\|z_T-x_{t+1}\|))+\sum_{t=0}^{T}b_t+\sum_{t=0}^{T}a_tL\delta \le \phi(\|z_T-x_0\|)+\sum_{t=0}^{T}b_t+\sum_{t=0}^{T}a_tL\delta \le \sup\{\phi(s):\ s\in[0,2M_0+1]\}+\sum_{t=0}^{T}b_t+\sum_{t=0}^{T}a_tL\delta.$ (2.87)

Since $f(\hat x_T,\cdot)$ is concave,

$\sum_{t=0}^{T}a_tf(\hat x_T,y_t)\le\Big(\sum_{t=0}^{T}a_t\Big)f(\hat x_T,\hat y_T),$

and therefore

$\sum_{t=0}^{T}a_tf(x_t,y_t)-\sum_{t=0}^{T}a_tf(\hat x_T,\hat y_T) \le \sum_{t=0}^{T}a_tf(x_t,y_t)-\sum_{t=0}^{T}a_tf(\hat x_T,y_t) \le \sup\{\phi(s):\ s\in[0,2M_0+1]\}+\sum_{t=0}^{T}b_t+\sum_{t=0}^{T}a_tL\delta.$ (2.89)
There exists

$h_T\in D$ (2.90)

such that

$\|h_T-\hat y_T\|\le\delta.$ (2.91)

In view of (2.90), we apply (2.68) with $v=h_T$ and obtain that for all $t=0,\dots,T$,

$a_t(f(x_t,h_T)-f(x_t,y_t)) \le \phi(\|h_T-y_t\|)-\phi(\|h_T-y_{t+1}\|)+b_t.$ (2.92)

By (2.91) and (2.92),

$a_t(f(x_t,\hat y_T)-f(x_t,y_t)) \le a_t(f(x_t,h_T)-f(x_t,y_t))+a_tL\delta \le \phi(\|h_T-y_t\|)-\phi(\|h_T-y_{t+1}\|)+b_t+a_tL\delta.$ (2.94)

In view of (2.94),

$\sum_{t=0}^{T}a_tf(x_t,\hat y_T)-\sum_{t=0}^{T}a_tf(x_t,y_t) \le \sum_{t=0}^{T}(\phi(\|h_T-y_t\|)-\phi(\|h_T-y_{t+1}\|))+\sum_{t=0}^{T}b_t+\sum_{t=0}^{T}a_tL\delta.$ (2.95)

Since $f(\cdot,\hat y_T)$ is convex,

$\sum_{t=0}^{T}a_tf(x_t,\hat y_T)\ge\Big(\sum_{t=0}^{T}a_t\Big)f(\hat x_T,\hat y_T).$ (2.96)

Hence

$\sum_{t=0}^{T}a_tf(\hat x_T,\hat y_T)-\sum_{t=0}^{T}a_tf(x_t,y_t) \le \sum_{t=0}^{T}a_tf(x_t,\hat y_T)-\sum_{t=0}^{T}a_tf(x_t,y_t) \le \phi(\|h_T-y_0\|)+\sum_{t=0}^{T}b_t+\sum_{t=0}^{T}a_tL\delta \le \sup\{\phi(s):\ s\in[0,2M_0+1]\}+\sum_{t=0}^{T}b_t+\sum_{t=0}^{T}a_tL\delta.$ (2.97)

Together with (2.89) this gives

$\Big|\sum_{t=0}^{T}a_tf(\hat x_T,\hat y_T)-\sum_{t=0}^{T}a_tf(x_t,y_t)\Big| \le \sup\{\phi(s):\ s\in[0,2M_0+1]\}+\sum_{t=0}^{T}b_t+\sum_{t=0}^{T}a_tL\delta.$

Let $z\in C$. By (2.67),

$\sum_{t=0}^{T}a_t(f(x_t,y_t)-f(z,y_t)) \le \sum_{t=0}^{T}[\phi(\|z-x_t\|)-\phi(\|z-x_{t+1}\|)]+\sum_{t=0}^{T}b_t.$ (2.98)

Since $f(z,\cdot)$ is concave,

$\sum_{t=0}^{T}a_tf(x_t,y_t)-\sum_{t=0}^{T}a_tf(z,\hat y_T) \le \sum_{t=0}^{T}a_t(f(x_t,y_t)-f(z,y_t)) \le \sum_{t=0}^{T}[\phi(\|z-x_t\|)-\phi(\|z-x_{t+1}\|)]+\sum_{t=0}^{T}b_t \le \phi(\|z-x_0\|)+\sum_{t=0}^{T}b_t.$ (2.100)

Let $v\in D$. By (2.68),

$\sum_{t=0}^{T}a_t(f(x_t,v)-f(x_t,y_t)) \le \sum_{t=0}^{T}[\phi(\|v-y_t\|)-\phi(\|v-y_{t+1}\|)]+\sum_{t=0}^{T}b_t.$ (2.101)

Since $f(\cdot,v)$ is convex,

$\sum_{t=0}^{T}a_tf(x_t,v)\ge\Big(\sum_{t=0}^{T}a_t\Big)f(\hat x_T,v),$ (2.102)

and hence

$\sum_{t=0}^{T}a_tf(\hat x_T,v)-\sum_{t=0}^{T}a_tf(x_t,y_t) \le \phi(\|v-y_0\|)+\sum_{t=0}^{T}b_t.$
Suppose that

$C\subset U,\quad D\subset V.$ (2.103)

For each $\xi\in U$ and each $\eta\in V$ define the partial subdifferentials

$\partial_xf(\xi,\eta)=\{l\in X:\ f(y,\eta)-f(\xi,\eta)\ge\langle l,y-\xi\rangle\ \text{for all}\ y\in U\},$ (2.109)

$\partial_yf(\xi,\eta)=\{l\in Y:\ \langle l,y-\eta\rangle\ge f(\xi,y)-f(\xi,\eta)\ \text{for all}\ y\in V\}.$ (2.110)

In view of properties (i) and (ii) and (2.107)–(2.110), for each $\xi\in U$ and each $\eta\in V$,

$\emptyset\ne\partial_xf(\xi,\eta)\subset B_X(0,L),$ (2.111)

$\emptyset\ne\partial_yf(\xi,\eta)\subset B_Y(0,L).$ (2.112)

Let

$x^*\in C$ and $y^*\in D$

satisfy the saddle point condition, and let the algorithm compute the next pair of iteration vectors $x_{t+1}\in U$, $y_{t+1}\in V$ such that

$\|x_{t+1}-P_C(x_t-a_t\xi_t)\|\le\delta,$

$\|y_{t+1}-P_D(y_t+a_t\eta_t)\|\le\delta.$ (2.118)
Theorem 2.11 provides the estimates

$\Big|\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sum_{t=0}^{T}a_tf(x_t,y_t)-f(x^*,y^*)\Big| \le [2^{-1}(2M_0+1)^2+\delta(T+1)(4M_0+1)]\Big(\sum_{t=0}^{T}a_t\Big)^{-1}+\delta(2M_0+1)+2^{-1}\Big(\sum_{t=0}^{T}a_t\Big)^{-1}(L+1)^2\sum_{t=0}^{T}a_t^2,$ (2.120)

$\Big|f(\hat x_T,\hat y_T)-\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\sum_{t=0}^{T}a_tf(x_t,y_t)\Big| \le [2^{-1}(2M_0+1)^2+\delta(T+1)(4M_0+1)]\Big(\sum_{t=0}^{T}a_t\Big)^{-1}+\delta(2M_0+1)+2^{-1}\Big(\sum_{t=0}^{T}a_t\Big)^{-1}(L+1)^2\sum_{t=0}^{T}a_t^2+L\delta,$ (2.121)

for each $z\in C$,

$f(z,\hat y_T)\ge f(\hat x_T,\hat y_T)-(2M_0+1)^2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}-2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}(T+1)\delta(4M_0+1)-2\delta(2M_0+1)-\Big(\sum_{t=0}^{T}a_t\Big)^{-1}(L+1)^2\sum_{t=0}^{T}a_t^2-L\delta,$

and for each $v\in D$,

$f(\hat x_T,v)\le f(\hat x_T,\hat y_T)+(2M_0+1)^2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}+2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}\delta(T+1)(4M_0+1)+2\delta(2M_0+1)+\Big(\sum_{t=0}^{T}a_t\Big)^{-1}(L+1)^2\sum_{t=0}^{T}a_t^2+L\delta.$
By Lemma 2.3, this function has a unique minimizer $a=(a_0,\dots,a_T)$, where $a_i=(T+1)^{-1}A_T$, $i=0,\dots,T$, which is the best choice of $a_t$, $t=0,1,\dots,T$.

Now we will find the best $a>0$. Let $T$ be a natural number and $a_t=a$ for all $t=0,\dots,T$. We need to choose $a$ which is a minimizer of the function obtained from the estimates above.

Now our goal is to find the best integer $T>0$ which gives an appropriate value of the bound. Since, in view of the inequalities above, this value is bounded from below by $c_0\delta^{1/2}$ with the constant $c_0$ depending on $L,M_0$, it is clear that the best choice of $T$ should be of the same order as $\lfloor\delta^{-1}\rfloor$; for example, $T=\lfloor\delta^{-1}\rfloor$.

Note that in the theorem above $\delta$ is the computational error produced by our computer system. We obtain a good approximate solution after $T=\lfloor\delta^{-1}\rfloor$ iterations. Namely, we obtain a pair of points $\hat x\in U$, $\hat y\in V$ such that

$B_X(\hat x,\delta)\cap C\ne\emptyset,\quad B_Y(\hat y,\delta)\cap D\ne\emptyset.$

In the proof we apply the main lemma with

$a=a_t,\ x=x_t,\ f=f(\cdot,y_t),\ \xi=\xi_t,\ u=x_{t+1}$

and with

$a=a_t,\ x=y_t,\ f=-f(x_t,\cdot),\ \xi=-\eta_t,\ u=y_{t+1},$

and define

$\phi(s)=2^{-1}s^2,\quad s\in\mathbb{R}.$

It is easy to see that all the assumptions of Proposition 2.9 hold, and Proposition 2.9 implies Theorem 2.11.
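The descent-in-x / ascent-in-y scheme with averaged iterates can be sketched on a toy convex-concave game. Everything below is illustrative: the function, intervals, and helper names are my own, not from the text.

```python
import numpy as np

def saddle_subgradient(gx, gy, proj_c, proj_d, x0, y0, steps, delta=0.0, rng=None):
    """Inexact subgradient algorithm for a convex-concave function:
    projected descent in x, projected ascent in y, with delta-perturbed
    subgradients; returns the averaged iterates used in the analysis."""
    rng = rng or np.random.default_rng(5)
    x, y = float(x0), float(y0)
    xs, ys = [], []
    for a in steps:
        xi = gx(x, y) + rng.uniform(-delta, delta)
        eta = gy(x, y) + rng.uniform(-delta, delta)
        x, y = proj_c(x - a * xi), proj_d(y + a * eta)
        xs.append(x); ys.append(y)
    return np.mean(xs), np.mean(ys)

# f(x, y) = (x - 0.2)^2 - (y - 0.1)^2 on C = D = [-1, 1];
# the saddle point is (0.2, 0.1).
gx = lambda x, y: 2.0 * (x - 0.2)        # df/dx
gy = lambda x, y: -2.0 * (y - 0.1)       # df/dy
clip = lambda t: min(1.0, max(-1.0, t))  # projection onto [-1, 1]
xa, ya = saddle_subgradient(gx, gy, clip, clip, -1.0, 1.0, [0.05] * 400, delta=1e-3)
```

The averaged pair approaches the saddle point up to a term controlled by the step size and the error level, as in the estimates above.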
Chapter 3
The Mirror Descent Algorithm
In this chapter we analyze the convergence of the mirror descent algorithm in the presence of computational errors. We show that the algorithm generates a good approximate solution if the computational errors are bounded from above by a small positive constant. Moreover, for a known computational error, we determine what approximate solution can be obtained and how many iterates one needs for this.

Let $X$ be a Hilbert space equipped with an inner product $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$. For each $x\in X$ and each $r>0$ set

$B_X(x,r)=\{y\in X:\ \|x-y\|\le r\}.$

Suppose that

$C\subset B_X(0,M_0),$ (3.2)

$|f(x)-f(y)|\le L\|x-y\|$ for all $x,y\in U.$ (3.3)

For each function $h$ and each nonempty set $D$ put

$\inf(h;D)=\inf\{h(y):\ y\in D\}.$
We study the convergence of the mirror descent algorithm in the presence of computational errors. This method was introduced by Nemirovsky and Yudin for solving convex optimization problems [90]. Here we use a derivation of this algorithm proposed by Beck and Teboulle [19].

Let $\delta\in(0,1]$ and $\{a_k\}_{k=0}^{\infty}\subset(0,\infty)$. We describe the inexact version of the mirror descent algorithm.

Mirror Descent Algorithm
Initialization: select an arbitrary $x_0\in U$.
Iterative step: given a current iteration vector $x_t\in U$ calculate $\xi_t\in\partial f(x_t)+B_X(0,\delta)$ and define the next iterate $x_{t+1}$.

Suppose that

$x^*\in C$ (3.5)

is a solution of the problem and that $x_0$ satisfies

$\|x_0\|\le M_0+1$ (3.7)

and the iterates are generated as above. Then

$\sum_{t=0}^{T}a_t(f(x_t)-f(x^*)) \le 2^{-1}(2M_0+1)^2+\delta(2M_0+L+2)\sum_{t=0}^{T}a_t+\delta(T+1)(8M_0+8)+2^{-1}(L+1)^2\sum_{t=0}^{T}a_t^2.$ (3.10)

By Lemma 2.3, for a fixed $A_T=\sum_{t=0}^{T}a_t$ the function $\sum_{t=0}^{T}a_t^2$ has a unique minimizer $a=(a_0,\dots,a_T)$, where $a_i=(T+1)^{-1}A_T$, $i=0,\dots,T$. This is the best choice of $a_t$, $t=0,1,\dots,T$.

Let $T$ be a natural number and $a_t=a$, $t=0,\dots,T$. Now we will find the best $a>0$. By Theorem 3.1, we need to choose $a$ which is a minimizer of the function obtained from (3.10). Now we can think about the best choice of $T$: it is clear that it should be of the same order as $\lfloor\delta^{-1}\rfloor$.
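The scaling behind this choice can be checked numerically: with constant steps $a=\delta^{1/2}$ and $T=\lfloor\delta^{-1}\rfloor$, the right-hand side of the estimate divided by $\sum a_t$ decreases like $\delta^{1/2}$. The function below is a sketch of that bound with the same constants; specific values of $M_0$ and $L$ are illustrative.

```python
import math

def mirror_descent_bound(M0, L, delta, a, T):
    """Right-hand side of the estimate above divided by sum a_t,
    specialized to the constant step a_t = a."""
    s = a * (T + 1)                       # sum of the steps
    return (0.5 * (2 * M0 + 1) ** 2 / s
            + delta * (2 * M0 + L + 2)
            + delta * (T + 1) * (8 * M0 + 8) / s
            + 0.5 * (L + 1) ** 2 * a)

M0, L = 1.0, 1.0
vals = []
for delta in (1e-2, 1e-4, 1e-6):
    T = int(1.0 / delta)                  # T of the same order as floor(1/delta)
    a = math.sqrt(delta)                  # constant step proportional to sqrt(delta)
    vals.append(mirror_descent_bound(M0, L, delta, a, T))
```

Each hundredfold decrease of delta shrinks the bound by roughly a factor of ten, consistent with the $c\,\delta^{1/2}$ behavior discussed in the text.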
Note that in the theorem above $\delta$ is the computational error produced by our computer system. In order to obtain a good approximate solution we need $T$ iterations, with $T$ of the same order as $\lfloor\delta^{-1}\rfloor$. As a result, we obtain a point $\xi\in U$ such that

$B_X(\xi,\delta)\cap C\ne\emptyset$

and value close to the minimum. Let

$z\in C.$ (3.11)

Assume that $x\in U$, $a>0$, $\xi\in\partial f(x)+B_X(0,\delta)$, and that

$u\in U$ (3.15)

satisfies the approximate iterative relation. Then the conclusion of the lemma holds. There exists

$l\in\partial f(x)$ (3.17)

such that

$\|l-\xi\|\le\delta.$ (3.18)

There exists

$\hat u\in B_X(u,\delta)\cap C$ (3.20)

such that

$\langle z-\hat u,\ x-\hat u-a\xi\rangle\le 0.$ (3.24)

Moreover,

$|\|z-\hat u\|^2-\|z-u\|^2| \le |\|z-\hat u\|-\|z-u\||\,(\|z-\hat u\|+\|z-u\|) \le \|u-\hat u\|(4M_0+1) \le (4M_0+1)\delta.$ (3.29)
It is clear that

$\|x_t\|\le M_0+1,\quad t=0,1,\dots.$

Applying Lemma 3.2 with

$z=x^*,\ a=a_t,\ x=x_t,\ \xi=\xi_t,\ u=x_{t+1},$

we obtain that

$\sum_{t=0}^{T}a_t(f(x_t)-f(x^*)) \le \sum_{t=0}^{T}(2^{-1}\|x^*-x_t\|^2-2^{-1}\|x^*-x_{t+1}\|^2)+\delta(2M_0+L+2)\sum_{t=0}^{T}a_t+(T+1)\delta(8M_0+8)+2^{-1}(L+1)^2\sum_{t=0}^{T}a_t^2 \le 2^{-1}(2M_0+1)^2+\delta(2M_0+L+2)\sum_{t=0}^{T}a_t+(T+1)\delta(8M_0+8)+2^{-1}(L+1)^2\sum_{t=0}^{T}a_t^2.$

Thus (3.10) is true. Evidently, (3.10) implies the last relation of the statement of Theorem 3.1. This completes the proof of Theorem 3.1.
We use the notation and definitions introduced in Sect. 3.1. Let $X$ be a Hilbert space with an inner product $\langle\cdot,\cdot\rangle$, let $D$ be a nonempty closed convex subset of $X$, and let $V$ be an open convex subset of $X$ such that

$D\subset V.$ (3.31)

We suppose that

$D_{\min}\ne\emptyset.$

Let $L>1$ satisfy the Lipschitz condition on $V$, and let

$\{a_t\}_{t=0}^{\infty}\subset[\lambda_0,\lambda_1],$

$\|x_0\|\le M,$ (3.39)

and the iterates be generated by the inexact mirror descent algorithm. Then there exists $q$ such that

$f(x_q)\le\inf(f;D)+\epsilon_0,$

$\|x_q\|\le 15M+1.$

Thus $\epsilon_0$ is of the same order as $\delta^{1/2}$. By (3.38) and the inequalities above, $n_0$ is of the same order as $\lfloor\delta^{-1}\rfloor$.

In the proof, there exists

$\xi\in B_X(x_1,\delta)\cap\operatorname{argmin}\{\langle\xi_0,v\rangle+(2a_0)^{-1}\|v-x_0\|^2:\ v\in D\}.$ (3.44)

Hence

$\langle\xi_0,\xi\rangle+(2a_0)^{-1}\|\xi-x_0\|^2 \le \langle\xi_0,z\rangle+(2a_0)^{-1}\|z-x_0\|^2,$ (3.45)

$\|\xi_0\|\le L+1.$ (3.46)

In view of (3.36),

$a_0^{-1}\ge\lambda_1^{-1}\ge 4(L+1).$ (3.47)

Combining the relations above, we obtain

$M+(2M+1)^2\ge\|\xi-x_0\|^2-2^{-1}\|\xi-x_0\|,$

$(\|\xi-x_0\|-4^{-1})^2\le(4M+1)^2,$

$\|\xi-x_0\|\le 8M,$

and hence

$\|\xi\|\le 9M,\quad \|x_1\|\le 9M+1,\quad \|\xi-z\|\le 10M,$

$\|x_1-z\|\le 10M+1.$ (3.48)

Set

$C=D\cap B_X(0,M_0).$ (3.52)

Then

$z\in C\subset B_X(0,M_0).$ (3.53)

In view of (3.57), it follows from (3.34), (3.36), (3.42), (3.49), and (3.55) that

$h\in C.$ (3.59)

It follows from (3.31), (3.35), (3.51), (3.52)–(3.55), (3.60), and Lemma 3.2, which holds with

$x=x_t,\ a=a_t,\ \xi=\xi_t,\ u=h,$

that if

$\|z-x_t\|^2-\|z-x_{t+1}\|^2\ge 0,$

then

$\|z-x_{t+1}\|\le\|z-x_t\|\le 14M+1.$

Hence by induction we showed that (3.50) holds for all $t=1,\dots,T$ and (3.49) holds for all $t=1,\dots,T+1$. Summing,

$\sum_{t=1}^{T}(2\lambda_0)^{-1}(\|z-x_t\|^2-\|z-x_{t+1}\|^2)$

is bounded, and if the approximate solution has not been obtained by step $T$, then $T<n_0$ and (3.49) holds for all $t=1,\dots,T+1$. This implies that there exists an integer $t\in\{1,\dots,n_0\}$ such that

$f(x_t)-f(z)\le\epsilon_0,$

$\|x_t\|\le 15M+1.$
Suppose that

$C\subset U,\quad D\subset V.$ (3.62)

For each $\xi\in U$, each $\eta\in V$, and each $v\in V$ define

$\partial_xf(\xi,\eta)=\{l\in X:\ f(y,\eta)-f(\xi,\eta)\ge\langle l,y-\xi\rangle\ \text{for all}\ y\in U\},$

$\partial_yf(\xi,\eta)=\{l\in Y:\ \langle l,y-\eta\rangle\ge f(\xi,y)-f(\xi,\eta)\ \text{for all}\ y\in V\}.$

In view of properties (i) and (ii) and (3.63)–(3.65), for each $\xi\in U$ and each $\eta\in V$,

$\emptyset\ne\partial_xf(\xi,\eta)\subset B_X(0,L),\quad \emptyset\ne\partial_yf(\xi,\eta)\subset B_Y(0,L).$

Let

$x^*\in C$ and $y^*\in D$ (3.66)

satisfy the saddle point condition, and let the algorithm compute the next pair of iteration vectors $x_{t+1}\in U$, $y_{t+1}\in V$ such that

$B_X(x_0,\delta)\cap C\ne\emptyset,\quad B_Y(y_0,\delta)\cap D\ne\emptyset.$

Theorem 3.4 provides the bound

$[2^{-1}(2M_0+1)^2+8\delta(T+1)(M_0+1)]\Big(\sum_{t=0}^{T}a_t\Big)^{-1}+\delta(2M_0+2L+2)+2^{-1}\Big(\sum_{t=0}^{T}a_t\Big)^{-1}(L+1)^2\sum_{t=0}^{T}a_t^2$

together with, for each $v\in D$,

$f(\hat x_T,v)\le f(\hat x_T,\hat y_T)+(2M_0+1)^2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}+16\delta(T+1)(M_0+1)\Big(\sum_{t=0}^{T}a_t\Big)^{-1}+2\delta(2M_0+2L+2)+\Big(\sum_{t=0}^{T}a_t\Big)^{-1}(L+1)^2\sum_{t=0}^{T}a_t^2,$

and, for each $z\in C$,

$f(z,\hat y_T)\ge f(\hat x_T,\hat y_T)-(2M_0+1)^2\Big(\sum_{t=0}^{T}a_t\Big)^{-1}-16\delta(T+1)(M_0+1)\Big(\sum_{t=0}^{T}a_t\Big)^{-1}-2\delta(2M_0+2L+2)-\Big(\sum_{t=0}^{T}a_t\Big)^{-1}(L+1)^2\sum_{t=0}^{T}a_t^2.$

In the proof we note that

$\|x_t\|\le M_0+1,\quad \|y_t\|\le M_0+1,\quad t=0,1,\dots,$

apply Lemma 3.2 with

$a=a_t,\ x=x_t,\ f=f(\cdot,y_t),\ \xi=\xi_t,\ u=x_{t+1}$

and with

$a=a_t,\ x=y_t,\ f=-f(x_t,\cdot),\ \xi=-\eta_t,\ u=y_{t+1},$

and define

$\phi(s)=2^{-1}s^2,\quad s\in\mathbb{R}.$

It is easy to see that all the assumptions of Proposition 2.9 hold, and Proposition 2.9 implies Theorem 3.4.
We are interested in the optimal choice of $a_t$, $t=0,1,\dots$. Let $T$ be a natural number and let $A_T=\sum_{t=0}^{T}a_t$ be given. By Theorem 3.4, in order to make the best choice of $a_t$, $t=0,\dots,T$, we need to minimize the function $\sum_{t=0}^{T}a_t^2$ on the set

$\Big\{a=(a_0,\dots,a_T)\in\mathbb{R}^{T+1}:\ a_i\ge 0,\ i=0,\dots,T,\ \sum_{i=0}^{T}a_i=A_T\Big\}.$

By Lemma 2.3, this function has a unique minimizer $a=(a_0,\dots,a_T)$, where $a_i=(T+1)^{-1}A_T$, $i=0,\dots,T$, which is the best choice of $a_t$, $t=0,1,\dots,T$.

Now we will find the best $a>0$. Let $T$ be a natural number and $a_t=a$ for all $t=0,\dots,T$. We need to choose $a$ which is a minimizer of the function defined by the right-hand side of the estimates above.

Now our goal is to find the best $T>0$ which gives an appropriate value of the bound. Since, in view of the inequalities above, this value is bounded from below by $c_0\delta^{1/2}$ with the constant $c_0$ depending on $L,M_0$, it is clear that the best choice of $T$ should be of the same order as $\lfloor\delta^{-1}\rfloor$; for example, $T=\lfloor\delta^{-1}\rfloor$.

Note that in the theorem above $\delta$ is the computational error produced by our computer system. We obtain a good approximate solution after $T=\lfloor\delta^{-1}\rfloor$ iterations. Namely, we obtain a pair of points $\hat x\in U$, $\hat y\in V$ such that

$B_X(\hat x,\delta)\cap C\ne\emptyset,\quad B_Y(\hat y,\delta)\cap D\ne\emptyset.$
Chapter 4
Gradient Algorithm with a Smooth Objective Function

Let $X$ be a Hilbert space equipped with an inner product $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$. For each $x\in X$ and each $r>0$ set

$B_X(x,r)=\{y\in X:\ \|x-y\|\le r\}.$

Then

$C\subset B_X(0,M_0),$ (4.4)

$|f(v_1)-f(v_2)|\le L\|v_1-v_2\|$ for all $v_1,v_2\in U,$ (4.5)

$\|f'(v_1)-f'(v_2)\|\le L\|v_1-v_2\|$ for all $v_1,v_2\in U.$ (4.6)

Iterative step: given a current iteration vector $x_t$, calculate

$\xi_t\in f'(x_t)+B_X(0,\delta)$

and the next iterate, where the initial point satisfies

$x_0\in U\cap B_X(0,M_0).$ (4.7)

Then

$f(x_{T+1})-\inf(f;C) \le (2T)^{-1}L(2M_0+1)^2+L\delta(8M_0+8)(T+1)$ (4.10)

and

$\min\{f(x_t):\ t=2,\dots,T+1\}-\inf(f;C),\quad f\Big(T^{-1}\sum_{t=2}^{T+1}x_t\Big)-\inf(f;C)$

are bounded by the same quantity.
Suppose that $x\in X$, $y\in D$, and that for each $z\in D$,

$\langle z-y,\ x-y\rangle\le 0.$ (4.12)

Then $y=P_D(x)$. Indeed, for each $z\in D$,

$\langle z-x,z-x\rangle=\langle z-y+(y-x),\ z-y+(y-x)\rangle = \langle y-x,y-x\rangle+2\langle z-y,y-x\rangle+\langle z-y,z-y\rangle \ge \langle y-x,y-x\rangle+\langle z-y,z-y\rangle = \|y-x\|^2+\|z-y\|^2.$
Let

$u\in B_X(0,M_0+1)\cap U,$ (4.15)

let $\xi\in X$ satisfy

$\|\xi-f'(u)\|\le\delta,$ (4.16)

and let

$B(x,\delta)\cap C\ne\emptyset.$ (4.18)

Then

$f(x)-f(v)\ge 2^{-1}L\|x-v\|^2-2^{-1}L\|x-u\|^2-\delta L(8M_0+8),$ (4.19)

$f(x)-f(v)\ge 2^{-1}L\|u-v\|^2+L\langle v-u,\ u-x\rangle-\delta L(8M_0+12).$ (4.20)
There exists

$x'\in C$ (4.24)

such that

$\|v-x'\| \le \|v-P_C(u-L^{-1}\xi)\|+\|P_C(u-L^{-1}\xi)-P_C(u-L^{-1}f'(u))\| \le \delta+L^{-1}\|\xi-f'(u)\| \le \delta(1+L^{-1}),$ (4.29)

$\|x'\|\le M_0,$ (4.30)

$\|v\|\le M_0+1.$ (4.31)

Let

$x\in U$ (4.33)

satisfy

$B(x,\delta)\cap C\ne\emptyset.$ (4.34)

There exists

$x_1\in C$ (4.38)

such that

$\|x_1-x\|\le\delta.$ (4.39)

Then

$|\|x-x'\|^2-\|x-v\|^2| = |\|x-x'\|-\|x-v\||\,(\|x-x'\|+\|x-v\|)\le\delta(8M_0+8)$ (4.48)

and

$|\|u-x'\|^2-\|u-v\|^2| = |\|u-x'\|-\|u-v\||\,(\|u-x'\|+\|u-v\|)\le\delta(8M_0+8).$ (4.49)

In view of (4.45), the required estimate follows and (4.19) holds. It follows from (4.15), (4.29), (4.34), (4.35), (4.49), and (4.50) that (4.20) holds as well. The iterates satisfy

$\|x_t\|\le M_0+1,\quad t=0,1,\dots.$ (4.53)
Applying Lemma 4.4 with

$u=x_t,\ \xi=\xi_t,\ v=x_{t+1},\ x=z,$

we obtain that

$Tf(z)-\sum_{t=1}^{T}f(x_{t+1}) \ge \sum_{t=1}^{T}(2^{-1}L\|z-x_{t+1}\|^2-2^{-1}L\|z-x_t\|^2)-\delta LT(8M_0+8)$

and

$-Tf(x_{T+1})+\sum_{t=0}^{T-1}f(x_{t+2}) = \sum_{t=0}^{T-1}[tf(x_{t+1})-(t+1)f(x_{t+2})+f(x_{t+2})] \ge -\sum_{t=0}^{T-1}2^{-1}Lt\|x_{t+2}-x_{t+1}\|^2-\delta L(8M_0+8)\sum_{t=0}^{T-1}t.$ (4.56)

Combining these relations, we conclude that

$f(x_{T+1})-f(z) \le (2T)^{-1}L(2M_0+1)^2+L\delta(8M_0+8)(T+1).$

In view of (4.54),

$T(\min\{f(x_t):\ t=2,\dots,T+1\}-f(z)),\quad T\Big(f\Big(T^{-1}\sum_{t=2}^{T+1}x_t\Big)-f(z)\Big) \le \sum_{t=1}^{T}f(x_{t+1})-Tf(z),$

and the assertion follows.
We use the notation and definitions introduced in Sect. 4.1. Let $X$ be a Hilbert space with an inner product $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$.

Let $D$ be a nonempty closed convex subset of $X$ and let $V$ be an open convex subset of $X$ such that

$D\subset V.$

We suppose that

$D_{\min}\ne\emptyset.$ (4.58)

Let $M_0\ge 4M+8$ and $L\ge 1$ satisfy the assumptions above, let

$\|x_0\|\le M,$ (4.64)

and let the iterates be generated by the inexact gradient algorithm. Then there exists $q$ such that

$f(x_q)\le\inf(f;D)+\epsilon_0,$

$\|x_i\|\le 3M+3,\quad i=0,\dots,q.$

In the proof,

$\|x_1\|\le 3M+3.$ (4.69)

Set

$C=D\cap B_X(0,M_0).$ (4.72)

Assume that

$\|x_t-z\|\le 2M+3.$ (4.73)

(In view of (4.64), (4.67), and (4.68), our assumption is true for $t=0$.) By (4.67) and (4.72),

$z\in C\subset B_X(0,M_0).$ (4.74)

By (4.65), (4.66), (4.79), (4.80), (4.81), and Lemma 4.4 applied with

$u=x_t,\ \xi=\xi_t,\ v=x_{t+1},\ x=z,$

we obtain that

$f(z)-f(x_{t+1}) \ge 2^{-1}L\|z-x_{t+1}\|^2-2^{-1}L\|z-x_t\|^2-L\delta(8M_0+8).$ (4.82)

Hence

$\|z-x_{t+1}\|\le\|z-x_t\|\le 2M+3,\quad \|x_t\|\le 3M+3.$

Summing,

$(1+T)(\min\{f(x_t):\ t=1,\dots,T+1\}-f(z)) \le \sum_{t=0}^{T}(f(x_{t+1})-f(z)) \le 2^{-1}L\sum_{t=0}^{T}(\|z-x_t\|^2-\|z-x_{t+1}\|^2)+(T+1)L\delta(8M_0+8),$

and

$\|x_t\|\le 3M+3,\quad t=0,\dots,T+1.$

As a result, we obtain a point $x$ such that

$B_X(x,\delta)\cap D\ne\emptyset$

and value close to the infimum.
Chapter 5
An Extension of the Gradient Algorithm

Let $X$ be a Hilbert space equipped with an inner product $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$. For each $x\in X$ and each $r>0$ set

$B_X(x,r)=\{y\in X:\ \|x-y\|\le r\}.$

We suppose that

$\operatorname{argmin}(F)\ne\emptyset$ (5.1)

and for each $u,\xi\in X$ define

$G_{u,\xi}^{(L)}(w)=f(u)+\langle\xi,w-u\rangle+2^{-1}L\|w-u\|^2+g(w),\quad w\in X.$ (5.3)

Let $L>1$, $M_1\ge 3M$, and $L_0>1$ satisfy the assumptions of the chapter, and let

$\|x_0\|\le M$ (5.12)

and

$B_X(x_{t+1},\delta)\cap\operatorname{argmin}(G_{x_t,\xi_t}^{(L)})\ne\emptyset.$ (5.14)

Then

$\|x_i\|\le M_2,\quad i=0,\dots,q,$

and

$F(x_q)\le\inf(F)+\epsilon_0.$

Note that in the theorem above $\delta$ is the computational error produced by our computer system. We obtain a good approximate solution after $\lfloor c_1\delta^{-1}\rfloor$ iterations [see (5.10) and (5.11)], where $c_1>0$ is a constant which depends only on $L,L_0,M,M_2$. As a result we obtain a point $x\in X$ such that

$F(x)\le\inf(F)+c_2\delta.$

Lemma 5.2 ([20]). Let $u,\xi\in X$ and $L>0$. Then the function $G_{u,\xi}^{(L)}$ has a point of minimum, and $z\in X$ is a minimizer of $G_{u,\xi}^{(L)}$ if and only if

$0\in\xi+L(z-u)+\partial g(z).$

Clearly,

$\lim_{\|w\|\to\infty}G_{u,\xi}^{(L)}(w)=\infty.$

This implies that the function $G_{u,\xi}^{(L)}$ has a minimizer. Clearly, $z$ is a minimizer of $G_{u,\xi}^{(L)}$ if and only if

$0\in\partial G_{u,\xi}^{(L)}(z)=\xi+L(z-u)+\partial g(z).$
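By the optimality condition of Lemma 5.2, the minimizer of $G_{u,\xi}^{(L)}$ is the proximal point $z=\operatorname{prox}_{g/L}(u-\xi/L)$. A sketch for one concrete choice of $g$ that is my own illustration, not the chapter's setting: $g=\lambda\|\cdot\|_1$, whose proximal map is componentwise soft thresholding.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal map of tau * ||.||_1: componentwise shrinkage."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def minimize_G(u, xi, L, lam):
    """Minimizer of G(w) = f(u) + <xi, w-u> + (L/2)||w-u||^2 + lam*||w||_1.
    By the condition 0 in xi + L(z-u) + d g(z), this is
    z = prox_{g/L}(u - xi/L) with g = lam * ||.||_1."""
    return soft_threshold(u - xi / L, lam / L)

u = np.array([1.0, -0.3, 0.02])
xi = np.array([0.5, 0.1, 0.0])
L, lam = 2.0, 0.2
z = minimize_G(u, xi, L, lam)
# Optimality check: r must be a valid subgradient of ||.||_1 at z,
# i.e. r_i = sign(z_i) when z_i != 0 and r_i in [-1, 1] when z_i = 0.
r = -(xi + L * (z - u)) / lam
```

The recovered vector `r` satisfying $0=\xi+L(z-u)+\lambda r$ confirms the inclusion of Lemma 5.2 for this choice of $g$.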
Let $M_1\ge M_0$ satisfy the assumptions above. Assume that $\xi\in X$ satisfies

$\|\xi-f'(u)\|\le 1$ (5.18)

and

$B_X(v,1)\cap\{z\in X:\ G_{u,\xi}^{(L)}(z)\le\inf(G_{u,\xi}^{(L)})+1\}\ne\emptyset.$ (5.19)

Then there exists

$\hat v\in B_X(v,1)$ (5.20)

such that

$G_{u,\xi}^{(L)}(\hat v)\le\inf(G_{u,\xi}^{(L)})+1.$ (5.21)

Since the function $f$ is Lipschitz on $B_X(0,M_0+2)$, relations (5.17) and (5.18) imply that the values above are bounded. By (5.23)–(5.25),

$\|v\|\le\|\hat v\|+1\le\|\hat v-u\|+\|u\|+1 \le (4(M_1+1+|c^*|)+8(L+1)^2)^{1/2}+M_0+2.$

Let $M_1\ge M_0$ satisfy the assumptions above. Assume that $\xi\in X$ satisfies

$\|\xi-f'(u)\|\le\delta$ (5.32)

and

$B_X(v,\delta)\cap\operatorname{argmin}(G_{u,\xi}^{(L)})\ne\emptyset.$ (5.33)

There exists

$\hat v\in\operatorname{argmin}(G_{u,\xi}^{(L)})$ (5.34)

such that

$\|v-\hat v\|\le\delta,$ (5.35)

$\|v\|,\|\hat v\|\le M_2.$ (5.36)

Clearly,

$|\langle\xi,v-\hat v\rangle|\le(L+1)\delta,$ (5.43)

$|g(v)-g(\hat v)|\le L_0\|v-\hat v\|\le L_0\delta.$ (5.45)

Hence

$F(x)-F(v) \ge f(x)+g(x)-G_{u,\xi}^{(L)}(v)-\delta(M_2+M_0+1) \ge f(x)+g(x)-G_{u,\xi}^{(L)}(\hat v)-\delta(L_0+L+1+(L+1)M_2+M_0+1).$ (5.47)

There exists

$l\in\partial g(\hat v)$ (5.49)

such that

$\xi+L(\hat v-u)+l=0.$ (5.50)

By convexity of $g$,

$g(x)\ge g(\hat v)+\langle l,x-\hat v\rangle.$ (5.51)

Hence

$f(x)+g(x)\ge f(u)+\langle\xi,x-u\rangle-\delta(2M_0+2)+g(\hat v)+\langle l,x-\hat v\rangle.$ (5.52)

In view of (5.3),

$G_{u,\xi}^{(L)}(\hat v)=f(u)+\langle\xi,\hat v-u\rangle+2^{-1}L\|\hat v-u\|^2+g(\hat v).$ (5.53)

Combining the relations above, and in view of (5.35)–(5.37), we estimate $f(x)+g(x)-G_{u,\xi}^{(L)}(\hat v)$ from below and conclude that

$F(x)-F(v) \ge 2^{-1}L\|v-x\|^2-2^{-1}L\|u-x\|^2-\delta(L(M_2+M_0+1)+2M_0+2+L_0+(L+1)(M_2+M_0+2)).$
If f .x1 / f .z/ C 0 , then in view of Lemma 5.3, kx1 k M2 and the assertion of the
theorem holds. Let
kxt zk 2M (5.61)
and
F.z/ F.xtC1 /
21 Lkz xtC1 k2 21 Lkz xt k2 ı..M2 C M0 C 2/.2L C 3/ C L0 /:
(5.62)
Assume that t 2 f0; : : : ; Tg and (5.61) holds. Relations (5.58) and (5.61)
imply that
Set

$M_0 = 3M$ (5.64)

and

By (5.5)–(5.9), (5.13), (5.14), (5.58), (5.63), (5.64), and Lemma 5.4 applied with $x = z$, $u = x_t$, $\xi = \xi_t$, $v = x_{t+1}$, we have

$\|z - x_{t+1}\| \le \|z - x_t\| \le 2M.$
Thus we have shown by induction that (5.62) holds for all $t = 0,\dots,T$ and that (5.61) holds for all $t = 0,\dots,T+1$.
By (5.60), (5.62) and (5.65),

$T\epsilon_0 < \sum_{t=0}^{T} (F(x_{t+1}) - F(z)) \le \sum_{t=0}^{T} \left[2^{-1}L\|z - x_t\|^2 - 2^{-1}L\|z - x_{t+1}\|^2\right] + T\delta M_3$
Thus we have shown that if $T \ge 0$ is an integer and (5.60) holds for all $t = 0,\dots,T$, then (5.61) holds for all $t = 0,\dots,T+1$ and $T < n_0 + 1$. This implies that there exists an integer $q \in \{1,\dots,n_0+2\}$ such that

$\|x_i\| \le 3M, \quad i = 0,\dots,q,$

$F(x_q) \le F(z) + \epsilon_0.$
In this chapter we analyze the behavior of Weiszfeld's method for solving the Fermat–Weber location problem. We show that the algorithm generates a good approximate solution if computational errors are bounded from above by a small positive constant. Moreover, for a known computational error, we determine what approximate solution can be obtained and how many iterations are needed to obtain it.
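To fix ideas before the formal development, the iteration studied below ($x_{k+1} = T(x_k)$, where $T(y)$ minimizes a weighted quadratic model of $f$ at $y$) can be sketched as follows. This is a standard-form sketch under our own naming, with an optional perturbation modeling the bounded computational error:

```python
import numpy as np

def weiszfeld_step(y, anchors, weights):
    """One exact step T(y) for f(x) = sum_i w_i ||x - a_i||: the minimizer
    of the quadratic model h(., y), i.e. a weighted average of the anchors
    with coefficients w_i / ||y - a_i||.  Assumes y is not an anchor."""
    d = np.linalg.norm(anchors - y, axis=1)
    c = weights / d
    return (c[:, None] * anchors).sum(axis=0) / c.sum()

def weiszfeld(y0, anchors, weights, iters=100, err=0.0, seed=0):
    """Weiszfeld's method with a perturbation of norm <= err added to each
    iterate, modeling a computational error bounded by a small constant."""
    y = np.asarray(y0, float)
    rng = np.random.default_rng(seed)
    for _ in range(iters):
        y = weiszfeld_step(y, anchors, weights)
        if err > 0.0:
            e = rng.standard_normal(y.shape)
            y = y + err * e / np.linalg.norm(e)
    return y

# Unit weights, three anchors; start inside the triangle.
A = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
w = np.ones(3)
x = weiszfeld(np.array([0.3, 0.3]), A, w)
```

With `err = 0` the iteration is monotone in $f$, which is the exact-method property proved below; the chapter quantifies what survives when `err > 0`.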
Let $X$ be a Hilbert space equipped with an inner product $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$.
If $x \in X$ and $h$ is a real-valued function defined in a neighborhood of $x$ which is Fréchet differentiable at $x$, then its Fréchet derivative at $x$ is denoted by $h'(x) \in X$.
For each $x \in X$ and each $r > 0$ set

$B_X(x,r) = \{y \in X : \|x - y\| \le r\}.$
Suppose that $X = \mathbb{R}^n$ with the inner product

$\langle x, y\rangle = \sum_{i=1}^{n} x_i y_i,$

$m$ is a natural number,

$\omega_i > 0, \quad i = 1,\dots,m$

and that

$A = \{a_i \in \mathbb{R}^n : i = 1,\dots,m\},$

$f(x) = \sum_{i=1}^{m} \omega_i \|x - a_i\| \ \text{ for all } x \in \mathbb{R}^n$ (6.1)

and consider the problem

$f(x) \to \min, \quad x \in \mathbb{R}^n$
by using Weiszfeld's method [110], which was recently revisited in [18]. This problem is often called the Fermat–Torricelli problem, named after the mathematicians who originally formulated (Fermat) and solved (Torricelli) it in the case of three points; Weber (as well as Steiner) considered its extension to finitely many points. For a full treatment of this problem, with a modified proof of the Weiszfeld algorithm using the subdifferential theory of convex analysis (as well as generalized versions of the Fermat–Torricelli and related problems), in the absence of computational errors, see [86].
Since the function $f$ is continuous and satisfies a growth condition, this problem has a solution, which is denoted by $x_* \in \mathbb{R}^n$. Thus $f(x_*) \le f(x)$ for all $x \in \mathbb{R}^n$.
In view of Theorem 2.1 of [18] this solution is unique, but in our study we do not use this fact.
If $x_* \notin A$, then

$f'(x_*) = \sum_{i=1}^{m} \omega_i \|x_* - a_i\|^{-1}(x_* - a_i) = 0.$ (6.3)
6.2 Preliminaries
Let $y \in \mathbb{R}^n \setminus A$ satisfy

$T(y) = y.$

Then

$\sum_{i=1}^{m} \omega_i \|y - a_i\|^{-1}(y - a_i) = 0,$

that is,

$f'(y) = 0.$

Set

$h(x,y) = \sum_{i=1}^{m} \omega_i \|y - a_i\|^{-1}\|x - a_i\|^2.$ (6.7)

Consider the function

$s = h(\cdot, y) : \mathbb{R}^n \to \mathbb{R}^1,$

which is strictly convex and possesses a unique minimizer $x$ satisfying the relation

$0 = s'(x) = 2\sum_{i=1}^{m} \omega_i \|y - a_i\|^{-1}(x - a_i),$

and set

$T(y) = x.$

Clearly,

$h(y,y) = f(y).$
Proof. Assertion (i) was already proved [see (6.8)]. Assertion (ii) is evident. Let us
prove assertion (iii).
Let $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^n \setminus A$. Clearly, for each $a \in \mathbb{R}^1$ and each $b > 0$,

$a^2 b^{-1} \ge 2a - b.$
$\|x - a_i\|^2 \|y - a_i\|^{-1} \ge 2\|x - a_i\| - \|y - a_i\|.$

Therefore

$h(x,y) = \sum_{i=1}^{m} \omega_i \|y - a_i\|^{-1}\|x - a_i\|^2 \ge 2\sum_{i=1}^{m} \omega_i \|x - a_i\| - \sum_{i=1}^{m} \omega_i \|y - a_i\| = 2f(x) - f(y).$

In particular,

$f(T(y)) \le f(y).$
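The elementary inequality invoked above has a one-line proof, which we add for completeness:

```latex
% For a \in \mathbb{R} and b > 0:
a^{2}b^{-1} - (2a - b) = b^{-1}\left(a^{2} - 2ab + b^{2}\right)
                       = b^{-1}(a - b)^{2} \ge 0 .
```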
$L(x) = \sum_{i=1}^{m} \omega_i \|x - a_i\|^{-1}.$ (6.9)
6 Weiszfeld's Method
For $j = 1,\dots,m$ set

$L(a_j) = \sum_{i=1, i\ne j}^{m} \omega_i \|a_j - a_i\|^{-1}.$ (6.10)
Proof. Clearly, the function $x \to h(x,y)$, $x \in \mathbb{R}^n$, is quadratic. Therefore its second-order Taylor expansion around $y$ is exact and can be written as
and
Set

$\tilde{M} = \max\{\|a_i\| : i = 1,\dots,m\}.$ (6.12)

Then

$f(0) \le \sum_{i=1}^{m} \omega_i \tilde{M}$ (6.13)

and

$\sum_{i=1}^{m} \omega_i \tilde{M} \ge f(0) \ge f(x_*) = \sum_{i=1}^{m} \omega_i \|x_* - a_i\|.$ (6.14)

Hence there exists $j \in \{1,\dots,m\}$ such that

$\sum_{i=1}^{m} \omega_i \tilde{M} \ge \sum_{i=1}^{m} \omega_i \|x_* - a_j\|,$

so that

$\|x_* - a_j\| \le \tilde{M}$

and

$\|x_*\| \le 2\tilde{M}.$ (6.16)
Lemma 6.4. Let $M \ge \tilde{M}$ and $y \in \mathbb{R}^n \setminus A$ satisfy

$\|y\| \le M.$ (6.17)

Then

$\|T(y)\| \le 3M.$
$f(y) = \sum_{i=1}^{m} \omega_i \|y - a_i\| \le 2M \sum_{i=1}^{m} \omega_i.$ (6.19)

$\sum_{i=1}^{m} \omega_i \|T(y) - a_i\| = f(T(y)) \le f(y) \le 2M \sum_{i=1}^{m} \omega_i.$ (6.20)

Hence there exists $j \in \{1,\dots,m\}$ such that

$\sum_{i=1}^{m} \omega_i \|T(y) - a_j\| \le 2M \sum_{i=1}^{m} \omega_i,$

so that

$\|T(y) - a_j\| \le 2M.$

Together with (6.12) this implies that $\|T(y)\| \le 3M$. Lemma 6.4 is proved.
Lemma 6.5 (The Basic Lemma). Let $M \ge \tilde{M}$, $\delta \in (0,1]$, $y \in \mathbb{R}^n \setminus A$ satisfy

$\|y\| \le M,$ (6.22)

$x \in \mathbb{R}^n$ satisfy

$\|T(y) - x\| \le \delta$ (6.23)

and let $z \in \mathbb{R}^n$ satisfy

$\|z\| \le M.$ (6.24)

Then

$f(x) \le f(z) + 2^{-1}L(y)(\|z - y\|^2 - \|z - x\|^2) + \delta\left(8M + 1 + \sum_{i=1}^{m} \omega_i\right).$

$|f(x) - f(T(y))| \le \|x - T(y)\| \sum_{i=1}^{m} \omega_i \le \delta \sum_{i=1}^{m} \omega_i.$ (6.25)
6.3 The Basic Lemma
$f(x) \le f(z) + \langle f'(y), T(y) - z\rangle + 2^{-1}L(y)\|T(y) - y\|^2 + \delta \sum_{i=1}^{m} \omega_i.$ (6.29)

In view of (6.11),

$f(x) \le f(z) + 2^{-1}L(y)(\|z - y\|^2 - \|z - T(y)\|^2) + \delta \sum_{i=1}^{m} \omega_i.$ (6.33)
$r_j = \sum_{i=1, i\ne j}^{m} \omega_i \|a_i - a_j\|^{-1}(a_j - a_i).$ (6.37)

In particular,

$r_p = \sum_{i=1, i\ne p}^{m} \omega_i \|a_i - a_p\|^{-1}(a_p - a_i).$

Since our computer system produces computational errors, we can obtain only a vector $\hat{r}_p \in \mathbb{R}^n$ such that $\|\hat{r}_p - r_p\| \le \epsilon_0$.
Proposition 6.6. Assume that $\hat{r}_p \in \mathbb{R}^n$ satisfies

$\|\hat{r}_p - r_p\| \le \epsilon_0$ (6.38)

and

Then

$f(a_p) \le \inf(f) + 9\tilde{M}\epsilon_0.$

$l \in \partial f(a_p)$ (6.41)

such that

and

$f(a_p) \le f(x_*) + 9\tilde{M}\epsilon_0.$
$d_p \in \mathbb{R}^n$ satisfies

$t_p \ge 0$ satisfies

$\|x_0 - a_p - t_p d_p\| \le \delta.$ (6.46)

Then

$\|d_p\| \le 1 + \delta,$ (6.47)

$t_p \le L(a_p)^{-1}(\|r_p\| - \omega_p) + \delta,$ (6.48)

$\|x_0\| \le \tilde{M} + 2(2\delta + L(a_p)^{-1}(\|r_p\| - \omega_p)),$ (6.49)

$\|x_0 - a_p + (\|r_p\| - \omega_p)L(a_p)^{-1}\|r_p\|^{-1} r_p\| \le \delta(3 + (\|r_p\| - \omega_p)L(a_p)^{-1}),$ (6.50)

$|f(x_0) - f(a_p - (\|r_p\| - \omega_p)L(a_p)^{-1}\|r_p\|^{-1} r_p)| \le \delta(3 + (\|r_p\| - \omega_p)L(a_p)^{-1}) \sum_{i=1}^{m} \omega_i$ (6.51)

and

$f(a_p) - f(x_0) \ge (\|r_p\| - \omega_p)^2 (2L(a_p))^{-1} - \delta(3 + (\|r_p\| - \omega_p)L(a_p)^{-1}) \sum_{i=1}^{m} \omega_i.$ (6.52)
Proof. In view of (6.44), (6.47) is true. Inequality (6.45) implies (6.48). By (6.12)
and (6.46)–(6.48),
and (6.50) holds. Relations (6.1) and (6.50) imply (6.51). Relation (6.52) follows
from (6.51) and Lemma 6.8. Proposition 6.9 is proved.
The next theorem, which is proved in Sect. 6.5, is our main result.
Theorem 6.10. Let

and

$2\delta\left(8M_0 + 1 + \sum_{i=1}^{m} \omega_i\right) < (\|r_p\| - \omega_p)^2 (16L(a_p))^{-1},$ (6.56)

$\epsilon_0 = 4\delta\left(16M_0 + 1 + \sum_{i=1}^{m} \omega_i\right)\left[144L(a_p)^2(\|r_p\| - \omega_p)^{-4} M_0^2 + 1\right]\left(\left(\sum_{i=1}^{m} \omega_i\right)^2 + 1\right)$ (6.57)

and

$n_0 = \left\lfloor \delta^{-1}\left(8M_0 + 1 + \sum_{i=1}^{m} \omega_i\right)^{-1} (\|r_p\| - \omega_p)^2 (8L(a_p))^{-1}\right\rfloor + 1.$ (6.58)

Then

$x_0 \notin A$

$x_i \notin A, \quad i \in \{0,\dots,j\} \setminus \{j\},$

$f(x_j) \le \inf(f) + \epsilon_0.$
Note that in the theorem above $\delta$ is the computational error produced by our computer system. In order to obtain a good approximate solution we need $\lfloor c_1 \delta^{-1}\rfloor$ iterations [see (6.58)], where $c_1 > 0$ is a constant depending only on $M_0$, $\sum_{i=1}^{m}\omega_i$, $\|r_p\| - \omega_p$ and $L(a_p)$. As a result, we obtain a point $x \in \mathbb{R}^n$ such that

$f(x) \le \inf(f) + c_2\delta$

[see (6.57)], where the constant $c_2 > 0$ depends only on $M_0$, $\sum_{i=1}^{m}\omega_i$, $\|r_p\| - \omega_p$ and $L(a_p)$.
$f(x_0) \le f(a_p) - (\|r_p\| - \omega_p)^2(2L(a_p))^{-1} + \delta(3 + (\|r_p\| - \omega_p)L(a_p)^{-1}) \sum_{i=1}^{m} \omega_i$

$x_0 \notin A.$ (6.61)
If

$x_i \notin A, \quad i = 0,\dots,k$ (6.63)

and

(Note that in view of (6.61) and (6.62), relations (6.63) and (6.64) hold for $k = 0$.)
For all integers $i \ge 0$, set

$\gamma_i = i\delta\left(8M_0 + 1 + \sum_{j=1}^{m} \omega_j\right).$ (6.65)
$\gamma_i \le n_0\delta\left(8M_0 + 1 + \sum_{j=1}^{m} \omega_j\right) \le \delta\left(8M_0 + 1 + \sum_{j=1}^{m} \omega_j\right) + (\|r_p\| - \omega_p)^2(8L(a_p))^{-1}$
In view of (6.16),

$\|x_*\| \le 2\tilde{M}.$ (6.70)

$\|x_0 - x_*\| \le M_0$

and

$x_j \notin A.$ (6.72)
$v_i \in \partial f(a_i).$ (6.73)

In view of (6.1),

$\partial f(a_i) = \sum_{q=1, q\ne i}^{m} \omega_q \|a_i - a_q\|^{-1}(a_i - a_q) + \omega_i B_{\mathbb{R}^n}(0,1).$ (6.74)

$f(a_i) - f(x_0) \le f(a_i) - f(x_j) + \gamma_j \le \langle v_i, a_i - x_j\rangle + \gamma_j \le \|v_i\|\,\|a_i - x_j\| + \gamma_j \le \|a_i - x_j\| \sum_{q=1}^{m} \omega_q + \gamma_j.$ (6.76)

$f(a_p) - f(x_0) \le \|a_i - x_j\| \sum_{q=1}^{m} \omega_q + \gamma_j, \quad i = 1,\dots,m.$ (6.77)
6.5 Proof of Theorem 6.10
and

$\|a_i - x_j\|^{-1} \le 4(f(a_p) - f(x_0))^{-1} \sum_{q=1}^{m} \omega_q.$ (6.78)
Lemma 6.2, (6.1), (6.59), (6.64), (6.68), and (6.72) imply that

$f(x_{j+1}) \le f(T(x_j)) + \|x_{j+1} - T(x_j)\| \sum_{i=1}^{m} \omega_i \le f(x_j) + \delta \sum_{i=1}^{m} \omega_i \le f(x_0) + \gamma_{j+1}.$ (6.80)
It follows from (6.54), (6.59), (6.64), (6.67), (6.69), (6.70), (6.72), (6.79), and Lemma 6.5 applied with $M = 2M_0$, $z = x_*$, $y = x_j$, and $x = x_{j+1}$ that

Therefore, in view of the relation above, (6.80) and (6.81), we have shown by induction that (6.68) and (6.69) hold for $j = 0,\dots,k+1$ and that (6.81) holds for $j = 0,\dots,k$.
It follows from (6.58), (6.60), (6.64), (6.68) and the relation $k \le n_0$ that

and

$x_{k+1} \notin A.$ (6.82)
$(k+1)\epsilon_0 < \sum_{j=0}^{k} (f(x_{j+1}) - f(x_*))$

$\le 2\left(\sum_{i=1}^{m} \omega_i\right)^2 (f(a_p) - f(x_0))^{-1} \sum_{j=0}^{k} (\|x_* - x_j\|^2 - \|x_* - x_{j+1}\|^2)$

$+ (k+1)\delta\left(16M_0 + 1 + \sum_{i=1}^{m} \omega_i\right).$

This implies that

$k + 1 \le 36^{-1}L(a_p)^{-1}(\|r_p\| - \omega_p)^2 \delta^{-1}\left(16M_0 + 1 + \sum_{i=1}^{m} \omega_i\right)^{-1} \le 2^{-1}n_0.$
Thus we assumed that an integer $k \in [0, n_0]$ satisfies (6.63) and (6.64) and showed that

$x_{k+1} \notin A$

[see (6.82)] and that $k + 1 \le 2^{-1}n_0$. (Note that in view of (6.56) and (6.58), $n_0 \ge 5$.) This implies that there exists an integer $k \in [0, n_0/2]$ such that (6.63), (6.64) hold and

$f(x_{k+1}) \le \inf(f) + \epsilon_0.$
Let $(X, \langle\cdot,\cdot\rangle)$ be a Hilbert space with an inner product $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$.
For each $x \in X$ and each nonempty set $A \subset X$ put

$d(x, A) = \inf\{\|x - y\| : y \in A\}, \quad B(x,r) = \{y \in X : \|x - y\| \le r\}.$
$f(x) \to \min, \quad x \in C.$

Denote

We study the minimization problem with the objective function $f$, over the set $C$, using the extragradient method introduced in Korpelevich [75]. By Lemma 2.2, for each nonempty closed convex set $D \subset X$ and for each $x \in X$, there is a unique point $P_D(x) \in D$ satisfying

$\|x - P_D(x)\| = \inf\{\|x - y\| : y \in D\}$

and

$\langle z - P_D(x), x - P_D(x)\rangle \le 0 \ \text{ for all } z \in D.$
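The two projected steps of the extragradient method — a predictor step through $f'(x_k)$ followed by a corrector step through $f'(y_k)$ — can be sketched as follows. The box constraint, the toy objective, and all names are our illustrative assumptions, since the chapter works with a general closed convex set $C$:

```python
import numpy as np

def project_box(x, lo=-1.0, hi=1.0):
    """P_C for the box C = [lo, hi]^n, standing in for a general
    closed convex set with a computable projection."""
    return np.clip(x, lo, hi)

def extragradient(x0, grad, alpha, iters=200):
    """Korpelevich's extragradient method:
    y_k     = P_C(x_k - alpha * f'(x_k))   (predictor)
    x_{k+1} = P_C(x_k - alpha * f'(y_k))   (corrector)."""
    x = np.asarray(x0, float)
    for _ in range(iters):
        y = project_box(x - alpha * grad(x))
        x = project_box(x - alpha * grad(y))
    return x

# Toy problem: f(x) = ||x - p||^2 / 2 with p outside the box,
# so the constrained minimizer is the projection of p onto the box.
p = np.array([2.0, -3.0])
sol = extragradient(np.zeros(2), lambda x: x - p, alpha=0.5)
```

In the chapter's inexact setting, both projections are computed only up to an error $\delta$, which is what Lemma 7.6 below accounts for.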
Assume that an integer $k$ and sequences

$\{\alpha_i\}_{i=0}^{\infty} \subset [\alpha_*, \alpha^*], \quad \{x_i\}_{i=0}^{\infty} \subset X, \quad \{y_i\}_{i=0}^{\infty} \subset X$ (7.10)

satisfy

$\|x_0\| \le M_0$ (7.11)

$\|x_i\| \le 3M_0, \quad i = 0,\dots,j,$

$f(P_C(x_j - \alpha_j f'(x_j))) \le \inf(f; C) + \epsilon.$
$f(\cdot) \le \inf(f; C) + \epsilon,$

where $\epsilon > 0$ is given. In order to meet this goal, the computational errors produced by our computer system should not exceed $c_1\epsilon$, where $c_1 > 0$ is a constant depending only on $M_0$, $\alpha_*$ [see (7.9)]. The number of iterations is $\lfloor c_2\epsilon^{-1}\rfloor$, where $c_2 > 0$ is a constant depending only on $M_0$, $\alpha_*$.
It is easy to see that the following proposition holds.
Proposition 7.2. If $\lim_{x\in C, \|x\|\to\infty} f(x) = \infty$ and the space $X$ is finite-dimensional, then for each $\epsilon > 0$ there exists $\delta > 0$ such that if $x \in C$ satisfies $f(x) \le \inf(f; C) + \delta$, then $d(x, C_{\min}) \le \epsilon$.
The following theorem is our second main result of this chapter.
7 The Extragradient Method for Convex Optimization
Assume that an integer $k$ and sequences

$\{\alpha_i\}_{i=0}^{\infty} \subset [\alpha_*, \alpha^*], \quad \{x_i\}_{i=0}^{\infty} \subset X, \quad \{y_i\}_{i=0}^{\infty} \subset X$ (7.22)

satisfy

$\|x_0\| \le M_0$ (7.23)
Then

$\|\bar{u} - u_*\|^2 \le \|u - u_*\|^2 - \|u - v\|^2 - \|v - \bar{u}\|^2 + 2\alpha[f(u_*) - f(v)] + 2\alpha\|\bar{u} - v\|\,\|f'(u) - f'(v)\|.$

In view of (7.28),

Set

$z = u - \alpha f'(v).$ (7.32)

$\|\bar{u} - u_*\|^2 \le \|z - u_*\|^2 - \|z - P_C(z)\|^2$
$= \|u - \alpha f'(v) - u_*\|^2 - \|u - \alpha f'(v) - \bar{u}\|^2$
$= \|u - u_*\|^2 - \|u - \bar{u}\|^2 + 2\alpha\langle u_* - \bar{u}, f'(v)\rangle$
$\le \|u - u_*\|^2 - \|u - \bar{u}\|^2 + 2\alpha\langle v - \bar{u}, f'(v)\rangle + 2\alpha[f(u_*) - f(v)].$ (7.35)
$\|\bar{u} - u_*\|^2 \le \|u - u_*\|^2 + 2\alpha\langle v - \bar{u}, f'(v)\rangle + 2\alpha[f(u_*) - f(v)] - \langle u - v + v - \bar{u}, u - v + v - \bar{u}\rangle$
$= \|u - u_*\|^2 + 2\alpha\langle v - \bar{u}, f'(v)\rangle + 2\alpha[f(u_*) - f(v)] - \|u - v\|^2 - \|v - \bar{u}\|^2 - 2\langle u - v, v - \bar{u}\rangle$
$= \|u - u_*\|^2 - \|u - v\|^2 - \|v - \bar{u}\|^2 + 2\alpha[f(u_*) - f(v)] + 2\langle v - \bar{u}, \alpha f'(v) - u + v\rangle$
$\le \|u - u_*\|^2 - \|u - v\|^2 - \|v - \bar{u}\|^2 + 2\alpha[f(u_*) - f(v)] + 2\alpha\langle \bar{u} - v, f'(u) - f'(v)\rangle$
$\le \|u - u_*\|^2 - \|u - v\|^2 - \|v - \bar{u}\|^2 + 2\alpha[f(u_*) - f(v)] + 2\alpha\|\bar{u} - v\|\,\|f'(u) - f'(v)\|.$
$\alpha L \le 1$ (7.39)

and let

Then

$\|\bar{u} - u_*\|^2 \le \|u - u_*\|^2 + 2\alpha[f(u_*) - f(v)] - \|u - v\|^2(1 - \alpha^2 L^2).$
$\|\bar{u} - u_*\|^2 \le \|u - u_*\|^2 - \|u - v\|^2 - \|v - \bar{u}\|^2 + 2\alpha[f(u_*) - f(v)] + 2\alpha\|\bar{u} - v\|\,\|f'(u) - f'(v)\|$ (7.41)

and

$\|f'(u)\| \le M_1.$ (7.42)

Hence

$\|\bar{u} - u_*\|^2 \le \|u - u_*\|^2 - \|u - v\|^2 - \|v - \bar{u}\|^2 + 2\alpha[f(u_*) - f(v)] + 2\alpha\|\bar{u} - v\|\,\|u - v\|L$
$\le \|u - u_*\|^2 - \|u - v\|^2 - \|v - \bar{u}\|^2 + 2\alpha[f(u_*) - f(v)] + \alpha^2 L^2\|u - v\|^2 + \|\bar{u} - v\|^2$
$\le \|u - u_*\|^2 + 2\alpha[f(u_*) - f(v)] - \|u - v\|^2(1 - \alpha^2 L^2).$ (7.45)

By (7.45),

$\alpha L \le 1.$ (7.49)
Assume that

$x \in B(u_*, M_0), \quad y \in X,$ (7.50)

$\|y - P_C(x - \alpha f'(x))\| \le \delta,$ (7.51)

$\tilde{x} \in X, \quad \|\tilde{x} - P_C(x - \alpha f'(y))\| \le \delta.$ (7.52)

Then

$\|\tilde{x} - u_*\|^2 \le 4\delta(M_0 + 1) + \|x - u_*\|^2 + 2\alpha[f(u_*) - f(P_C(x - \alpha f'(x)))] - \|x - P_C(x - \alpha f'(x))\|^2(1 - \alpha^2 L^2).$
Proof. Put
Lemma 7.5, (7.46), (7.47), (7.48), (7.49), (7.50), and (7.53) imply that
It is clear that

$\|\tilde{x} - u_*\|^2 = \|\tilde{x} - z + z - u_*\|^2 = \|\tilde{x} - z\|^2 + 2\langle \tilde{x} - z, z - u_*\rangle + \|z - u_*\|^2 \le \|\tilde{x} - z\|^2 + 2\|\tilde{x} - z\|\,\|z - u_*\| + \|z - u_*\|^2.$ (7.55)

$\|v - y\| \le \delta.$ (7.56)

Put

$\tilde{z} = P_C(x - \alpha f'(y)).$ (7.57)
It follows from (7.52), (7.57), (7.53), Lemma 2.2, (7.58), (7.59), (7.46), (7.48), (7.56),
and (7.49) that
(It is clear that in view of (7.62), inclusion (7.63) is valid for $i = 0$.) It follows from (7.61), (7.63), (7.7), (7.11), (7.5), (7.6), (7.12), (7.13), and Lemma 7.6 applied with $x = x_i$, $y = y_i$, $\tilde{x} = x_{i+1}$, $\alpha = \alpha_i$ that

$\|x_{i+1} - u_*\|^2 \le 4\delta(2M_0 + 1) + \|x_i - u_*\|^2 + 2\alpha_i[f(u_*) - f(P_C(x_i - \alpha_i f'(x_i)))].$ (7.64)
Assume the contrary. Then relations (7.9) and (7.10) imply that for each integer $i \in [0,k]$,

It follows from (7.62), (7.66), and property (P1) that for each integer $i \in [0,k]$,

and

$k \le 4M_0^2 \alpha_*^{-1}\epsilon^{-1}.$
This contradicts (7.8). The contradiction we have reached proves that there exists an integer $j \in [0,k]$ such that
We may assume without loss of generality that for each integer $i \ge 0$ satisfying $i < j$,

It follows from (7.68), (7.9), and (7.10) that for any integer $i \ge 0$ satisfying $i < j$, inequality (7.66) is valid. Combined with (7.62), property (P1), and (7.61) this implies that for each integer $i$ satisfying $0 \le i \le j$, we have

$\|x_i - u_*\| \le 2M_0$
Let

$\|u_*\| \le M_0 - 1.$ (7.70)
(It is clear that in view of (7.71) inclusion (7.72) is valid for $i = 0$.) It follows from (7.69), (7.9), (7.17)–(7.19), (7.72), (7.70), (7.24), and Lemma 7.6 applied with $x = x_i$, $y = y_i$, $\tilde{x} = x_{i+1}$, $\alpha = \alpha_i$ that

$\|x_{i+1} - u_*\|^2 \le 4\delta(2M_0 + 1) + \|x_i - u_*\|^2 + 2\alpha_i[f(u_*) - f(P_C(x_i - \alpha_i f'(x_i)))] - \|x_i - P_C(x_i - \alpha_i f'(x_i))\|^2(1 - \alpha_i^2 L^2)$

and by (7.22),

$\|x_i - u_*\|^2 - \|x_{i+1} - u_*\|^2 \ge 2\alpha_*[f(P_C(x_i - \alpha_i f'(x_i))) - f(u_*)] + \|x_i - P_C(x_i - \alpha_i f'(x_i))\|^2(1 - \alpha_i^2 L^2) - 4\delta(2M_0 + 1).$ (7.73)
$\max\{f(P_C(x_i - \alpha_i f'(x_i))) - f(u_*), \|x_i - P_C(x_i - \alpha_i f'(x_i))\|^2\} \le \epsilon$ (7.74)

and

$\max\{f(P_C(x_i - \alpha_i f'(x_i))) - f(u_*), \|x_i - P_C(x_i - \alpha_i f'(x_i))\|^2\} > \epsilon.$ (7.75)

It follows from (7.72), property (P2), (7.73), (7.75), (7.19), (7.21), and (7.22) that

$\|x_i - u_*\|^2 - \|x_{i+1} - u_*\|^2 \ge \epsilon\min\{\alpha_*, 1 - (\alpha^*)^2 L^2\} - 4\delta(2M_0 + 1) \ge 2^{-1}\epsilon\min\{\alpha_*, 1 - (\alpha^*)^2 L^2\}.$
We claim that there exists an integer $i \in [0,k]$ such that (7.74) is valid.

7.4 Proof of Theorem 7.3

Assume the contrary. Then (7.75) holds for each integer $i \in [0,k]$. Combined with (7.71) and property (P4) this implies that

$k(\epsilon/2)\min\{\alpha_*, 1 - (\alpha^*)^2 L^2\}$

and

This contradicts (7.20). The contradiction we have reached proves that there exists an integer $j \in [0,k]$ such that (7.74) is valid with $i = j$.
We may assume that for all integers $i \ge 0$ satisfying $i < j$ Eq. (7.75) holds. It follows from property (P4) and (7.71) that
There are two cases: (7.74) is valid; (7.75) is valid. Assume that (7.74) is true.
In view of property (P3), (7.78), and (7.16),
Since $u_*$ is an arbitrary point of the set $C_{\min}$, we may assume without loss of generality that
It follows from (7.79), (7.15), property (P2), (7.73), (7.19), and (7.21) that
and
Assume that (7.75) holds. Property (P4), (7.78), and (7.16) imply that
and
In this chapter we study the convergence of the projected subgradient method for a class of constrained optimization problems in a Hilbert space. For this class of problems, the objective function is assumed to be convex, but the set of admissible points is not necessarily convex. Our goal is to obtain an $\epsilon$-approximate solution in the presence of computational errors, where $\epsilon$ is a given positive number.
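A minimal sketch of the scheme studied here — a projected subgradient iteration with normalized directions and bounded perturbations — might look as follows. The concrete objective, the ball constraint, and the best-point tracking are illustrative assumptions of ours:

```python
import numpy as np

def f(x):                       # toy nonsmooth objective |x_1| + |x_2|
    return np.abs(x).sum()

def subgrad(x):                 # a subgradient of f at x
    return np.sign(x)

def project(x):                 # P_C for the ball C = {||x|| <= 2}
    n = np.linalg.norm(x)
    return x if n <= 2.0 else 2.0 * x / n

def projected_subgradient(x0, steps, err=0.0, seed=0):
    """x_{k+1} = P_C(x_k - a_k (v_k/||v_k|| + xi_k)) + eta_k, with
    perturbations xi_k, eta_k of magnitude err modeling computational
    errors; returns the best point found so far."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, float)
    best = x.copy()
    for a in steps:
        v = subgrad(x)
        nv = np.linalg.norm(v)
        if nv == 0.0:           # stationary point reached
            break
        xi = err * rng.uniform(-1, 1, x.shape)
        eta = err * rng.uniform(-1, 1, x.shape)
        x = project(x - a * (v / nv + xi)) + eta
        if f(x) < f(best):
            best = x.copy()
    return best

steps = [1.0 / (k + 1) for k in range(300)]
x_best = projected_subgradient(np.array([1.5, -1.0]), steps)
```

With diminishing step sizes and `err = 0` this drives $f$ toward its constrained minimum; the chapter quantifies how small `err` must be, relative to a target accuracy $\epsilon$, for the perturbed iterates to stay $\epsilon$-close to $C_{\min}$.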
Let $(X, \langle\cdot,\cdot\rangle)$ be a Hilbert space with an inner product $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$.
For each $x \in X$ and each nonempty set $A \subset X$ put

$B(x,r) = \{y \in X : \|x - y\| \le r\}.$

Let $\partial_\epsilon f(x)$ be the $\epsilon$-subdifferential of $f$ at $x$.
Let C be a closed nonempty subset of the space X.
Assume that

It means that for each $M_0 > 0$ there exists $M_1 > 0$ such that if a point $x \in X$ satisfies the inequality $\|x\| \ge M_1$, then $f(x) > M_0$.
Define

Since the function $f$ is Lipschitz on all bounded subsets of the space $X$, it follows from (8.4) that $\inf(f; C)$ is finite.
Set

It is well known that if the set $C$ is convex, then the set $C_{\min}$ is nonempty. Clearly, $C_{\min} \ne \emptyset$ if the space $X$ is finite-dimensional.
In this chapter we assume that

$C_{\min} \ne \emptyset.$ (8.6)
$\lim_{i\to\infty} \alpha_i = 0, \quad \sum_{i=0}^{\infty} \alpha_i = \infty$

and let $M, \epsilon > 0$. Then there exist a natural number $n_0$ and $\delta > 0$ such that the following assertion holds.
Assume that an integer $n \ge n_0$,

Then the inequality $d(x_k, C_{\min}) \le \epsilon$ holds for all integers $k$ satisfying $n_0 \le k \le n$.
Theorem 8.2. Let $M, \epsilon > 0$. Then there exists $\beta_0 \in (0,1)$ such that for each $\beta_1 \in (0, \beta_0)$ there exist a natural number $n_0$ and $\delta > 0$ such that the following assertion holds.
Assume that an integer $n \ge n_0$,

$\{\xi_k\}_{k=0}^{n-1}, \{\eta_k\}_{k=0}^{n-1} \subset B(0, \delta)$

Then the inequality $d(x_k, C_{\min}) \le \epsilon$ holds for all integers $k$ satisfying $n_0 \le k \le n$.
In this chapter we use the following definitions and notation.
Define

$X_0 \subset B(0, \bar{K}).$ (8.9)

Since the function $f$ is Lipschitz on all bounded subsets of the space $X$, there exists a number $\bar{L} > 1$ such that
We use the notation and definitions introduced in Sect. 8.1 and suppose that all the
assumptions posed in Sect. 8.1 hold.
Proposition 8.3. Let $\epsilon \in (0,1]$. Then for each $x \in X$ satisfying

$x \in B(0, \bar{K}), \quad y \in B(0, \bar{K}+1),$ (8.14)

$|f(x) - f(y)| \le \bar{L}\|x - y\| < (\epsilon/2)2^{-1}.$ (8.15)

It follows from the choice of the point $y$, (8.11), and (8.15) that

$y \in C$

8.2 Auxiliary Results

and

$\langle v, y - x\rangle \le -\epsilon/2.$
$\bar{x} \in C_{\min},$ (8.19)

$\delta(K_0 + \bar{K} + 1) \le \epsilon(8\bar{L})^{-1},$ (8.20)

8 A Projected Subgradient Method for Nonsmooth Problems

$\xi, \eta \in B(0,\delta), \quad v \in \partial_{\epsilon/4} f(x) \setminus \{0\}$ (8.22)

and let

$y = P_C(x - \alpha\|v\|^{-1}v - \alpha\xi) + \eta.$ (8.23)

Then

$\|y - \bar{x}\|^2 \le \|x - \bar{x}\|^2 - \alpha\epsilon(4\bar{L})^{-1} + 2\alpha^2 + \|\eta\|^2 + 2\|\eta\|(K_0 + \bar{K} + 2).$
Proof. In view of (8.8)–(8.10) and (8.19), for every point $z \in B(\bar{x}, 4^{-1}\bar{L}^{-1}\epsilon)$, we have

Lemma 8.4, (8.21), (8.22), and (8.24) imply that for every point

$z \in B(\bar{x}, 4^{-1}\bar{L}^{-1}\epsilon),$

we have

$\langle v, z - x\rangle \le -\epsilon/2.$

Hence

$\langle \|v\|^{-1}v, z - x\rangle < 0 \ \text{ for all } z \in B(\bar{x}, (4\bar{L})^{-1}\epsilon).$ (8.25)
Put

By (8.28),

Set

It follows from (8.30), (8.22), (8.21), (8.19), (8.8), (8.9), (8.29), and (8.20) that

$\|y_0 - \bar{x}\|^2 = \|x - \alpha\|v\|^{-1}v - \alpha\xi - \bar{x}\|^2$
$= \|x - \alpha\|v\|^{-1}v - \bar{x}\|^2 + \alpha^2\|\xi\|^2 - 2\alpha\langle \xi, x - \alpha\|v\|^{-1}v - \bar{x}\rangle$
$\le \|x - \alpha\|v\|^{-1}v - \bar{x}\|^2 + \alpha^2\delta^2 + 2\alpha\delta(K_0 + \bar{K} + 1)$
$\le \|x - \bar{x}\|^2 - 2\langle x - \bar{x}, \alpha\|v\|^{-1}v\rangle + \alpha^2 + \alpha^2\delta^2 + 2\alpha\delta(K_0 + \bar{K} + 1)$
$< \|x - \bar{x}\|^2 - 2\alpha(4^{-1}\bar{L}^{-1}\epsilon) + \alpha^2(1 + \delta^2) + 2\alpha\delta(K_0 + \bar{K} + 1)$
$\le \|x - \bar{x}\|^2 - \alpha\epsilon(4\bar{L})^{-1} + 2\alpha^2.$ (8.31)

Hence

$\|y_0 - \bar{x}\|^2 \le (K_0 + \bar{K})^2 + 2$

and

$\|y_0 - \bar{x}\| \le K_0 + \bar{K} + 2.$ (8.32)
$\|y - \bar{x}\|^2 = \|P_C(y_0) + \eta - \bar{x}\|^2$
$\le \|P_C(y_0) - \bar{x}\|^2 + \|\eta\|^2 + 2\|\eta\|\,\|P_C(y_0) - \bar{x}\|$
$\le \|y_0 - \bar{x}\|^2 + \|\eta\|^2 + 2\|\eta\|\,\|y_0 - \bar{x}\|$
$\le \|x - \bar{x}\|^2 - \alpha\epsilon(4\bar{L})^{-1} + 2\alpha^2 + \|\eta\|^2 + 2\|\eta\|(K_0 + \bar{K} + 2).$
Lemma 8.6. Let $K_0 > 0$, $\epsilon \in (0,1]$, $\alpha \in (0,1]$, a positive number $\delta$ satisfy

$\delta(K_0 + \bar{K} + 1) \le \epsilon(8\bar{L})^{-1},$

let $x \in X$ satisfy

let

$\xi, \eta \in B(0,\delta), \quad v \in \partial_{\epsilon/4} f(x) \setminus \{0\}$

and let

$y = P_C(x - \alpha\|v\|^{-1}v - \alpha\xi) + \eta.$

Then

$d(y, C_{\min})^2 \le d(x, C_{\min})^2 - \alpha\epsilon(4\bar{L})^{-1} + 2\alpha^2 + \|\eta\|^2 + 2\|\eta\|(K_0 + \bar{K} + 2).$
We may assume without loss of generality that $\epsilon < 1$. In view of Proposition 8.3, there exists a number

such that

Fix

$\bar{x} \in C_{\min}.$ (8.35)

Fix

$\bar{K} + 4 < p_0$ (8.37)

$\alpha_p < (32\bar{L})^{-1}\epsilon_0.$ (8.38)

Since $\sum_{i=0}^{\infty} \alpha_i = \infty$ there exists a natural number $n_0 > p_0 + 4$ such that

$\sum_{i=p_0}^{n_0-1} \alpha_i > (4p_0 + M + \|\bar{x}\|)^2 \epsilon_0^{-1} 16\bar{L}.$ (8.39)

Fix

$6\delta(K_* + 1) < (16\bar{L})^{-1}\epsilon_0.$ (8.41)
$\{\xi_k\}_{k=0}^{n-1}, \{\eta_k\}_{k=0}^{n-1} \subset B(0, \delta),$ (8.43)
In view of (8.35), (8.41), (8.48), (8.43), (8.44), and (8.45), the conditions of Lemma 8.5 hold with $K_0 = K_*$, $\epsilon = \epsilon_0$, $\alpha = \alpha_k$, $x = x_k$, $\xi = \xi_k$, $v = v_k$, $y = x_{k+1}$, $\eta = \alpha_k\eta_k$, and combined with (8.43), (8.47), (8.38), (8.41), and (8.40) this lemma implies that

$\|x_{k+1} - \bar{x}\|^2 \le \|x_k - \bar{x}\|^2 - \alpha_k(4\bar{L})^{-1}\epsilon_0 + 2\alpha_k^2 + \alpha_k^2\|\eta_k\|^2 + 2\|\eta_k\|\alpha_k(K_* + \bar{K} + 2)$
$\le \|x_k - \bar{x}\|^2 - \alpha_k(4\bar{L})^{-1}\epsilon_0 + 2\alpha_k^2 + \alpha_k^2\delta^2 + 2\delta\alpha_k(K_* + \bar{K} + 2)$
$\le \|x_k - \bar{x}\|^2 - \alpha_k(8\bar{L})^{-1}\epsilon_0 + 2\delta\alpha_k(K_* + \bar{K} + 3)$
$\le \|x_k - \bar{x}\|^2 - \alpha_k(16\bar{L})^{-1}\epsilon_0.$
$f(x_j) \le \inf(f; C) + \epsilon_0$

It follows from (8.45), (8.43), (8.41), (8.35), and (A2) that for all integers $i = 0,\dots,n-1$, we have

Let

$\|x_{i+1} - \bar{x}\|^2 \le \|x_i - \bar{x}\|^2 - (16\bar{L})^{-1}\alpha_i\epsilon_0.$ (8.53)
8.3 Proof of Theorem 8.1
and

$\sum_{i=p_0}^{n_0-1} \alpha_i \le 16\bar{L}\epsilon_0^{-1}(M + 3p_0 + \|\bar{x}\|)^2.$

This contradicts (8.39). The contradiction we have reached proves that there exists an integer

$j \in \{p_0,\dots,n_0\}$

such that

$d(x_i, C_{\min}) \le \epsilon.$
Assume the contrary. Then there exists an integer $k \in [j,n]$ for which

$k > j \ge p_0.$ (8.58)

Thus

Assume that (8.61) is valid. It follows from (8.61), (8.36), (8.33), (8.8), and (8.9) that

$x_{k-1} \in X_0 \subset B(0, \bar{K}).$ (8.63)

$\|x_{k-1} - z\| \le \delta.$ (8.64)

Combined with (8.41), (8.58), and (8.38) the relation above implies that

Together with (8.10) and (8.67) the inclusion above implies that

$|f(x_{k-1}) - f(x_k)| \le \bar{L}\|x_{k-1} - x_k\| \le \bar{L}(4\delta + \alpha_{k-1}).$ (8.69)
8.4 Proof of Theorem 8.2
$d(x_k, C_{\min}) \le \epsilon.$

This inequality contradicts (8.57). The contradiction we have reached proves (8.62).
By (8.60), (8.8), and (8.9), we have

$\|x_{k-1}\| \le \bar{K} + 1.$ (8.71)

It follows from (8.40), (8.41), (8.43), (8.44), (8.71), and (8.62) that Lemma 8.6 holds with

Combined with (8.38), (8.58), (8.43), (8.41), and (8.60) this implies that

$d(x_k, C_{\min})^2 \le d(x_{k-1}, C_{\min})^2 - \alpha_{k-1}(4\bar{L})^{-1}\epsilon_0 + 2\alpha_{k-1}^2 + 2\alpha_{k-1}^2\|\eta_{k-1}\|^2 + 2\alpha_{k-1}\|\eta_{k-1}\|(2\bar{K} + 3)$
$\le d(x_{k-1}, C_{\min})^2 - (8\bar{L})^{-1}\alpha_{k-1}\epsilon_0 + 2\delta^2\alpha_{k-1}^2 + 2\alpha_{k-1}\delta(2\bar{K} + 3)$
$\le d(x_{k-1}, C_{\min})^2 - (8\bar{L})^{-1}\alpha_{k-1}\epsilon_0 + 2\delta\alpha_{k-1}(2\bar{K} + 4)$
$\le d(x_{k-1}, C_{\min})^2 - (16\bar{L})^{-1}\alpha_{k-1}\epsilon_0 \le d(x_{k-1}, C_{\min})^2 \le \epsilon^2.$
such that

Put

$\beta_0 = (64\bar{L})^{-1}\bar{\epsilon}.$ (8.75)

Let

$\beta_1 \in (0, \beta_0).$ (8.76)

$\beta_1 n_0 > 16^2(3 + 2M)^2\bar{\epsilon}^{-1}\bar{L}.$ (8.77)

Fix

$6\delta K_* < (64\bar{L})^{-1}\bar{\epsilon}\beta_1.$ (8.79)

Fix a point

$\bar{x} \in C_{\min}.$ (8.80)
$\{x_k\}_{k=0}^{n} \subset X, \quad \{\xi_k\}_{k=0}^{n-1} \subset X, \quad \{\eta_k\}_{k=0}^{n-1} \subset X, \quad \{\alpha_k\}_{k=0}^{n-1} \subset [\beta_1, \beta_0],$ (8.81)

$\|x_0\| \le M, \quad \|\xi_k\| \le \delta, \quad \|\eta_k\| \le \delta, \quad k = 0,\dots,n-1,$ (8.82)

$k \in [0, n-1],$

$\|x_k\| \le K_*, \quad f(x_k) > \inf(f; C) + \bar{\epsilon}/4.$ (8.85)
It follows from (8.75), (8.78)–(8.81), (8.83), (8.85), (8.82), and (8.74) that Lemma 8.5 holds with $\epsilon = \bar{\epsilon}/4$, $K_0 = K_*$, $\alpha = \alpha_k$, $x = x_k$, $\xi = \xi_k$, $\eta = \eta_k$, $v = v_k$, $y = x_{k+1}$, and combining with (8.79) this implies that

$\|x_{k+1} - \bar{x}\|^2 \le \|x_k - \bar{x}\|^2 - \alpha_k(16\bar{L})^{-1}\bar{\epsilon} + 2\alpha_k^2 + \delta^2 + 2\delta(K_* + \bar{K} + 2)$
$\le \|x_k - \bar{x}\|^2 - \alpha_k(16\bar{L})^{-1}\bar{\epsilon} + 2\alpha_k^2 + 2\delta(K_* + \bar{K} + 3).$

Together with (8.81), (8.75), (8.78), and (8.79) this implies that

$\|x_{k+1} - \bar{x}\|^2 \le \|x_k - \bar{x}\|^2 - \alpha_k(32\bar{L})^{-1}\bar{\epsilon} + 2\delta(\bar{K} + 3 + K_*)$
$\le \|x_k - \bar{x}\|^2 - (32\bar{L})^{-1}\bar{\epsilon}\beta_1 + 2\delta(\bar{K} + 3 + K_*)$
$\le \|x_k - \bar{x}\|^2 - \beta_1(64\bar{L})^{-1}\bar{\epsilon}.$

Hence

$\|x_{k+1} - \bar{x}\|^2 \le \|x_k - \bar{x}\|^2 - (64\bar{L})^{-1}\beta_1\bar{\epsilon}.$
It follows from (8.84), (8.82), (8.79), (A2), (8.80), (8.81), and (8.75) that for all integers $i = 0,\dots,n-1$, we have

Let $k \in \{1,\dots,n_0 - 1\}$. It follows from (8.89), (8.86), and property (P2) that

$\|x_{k+1} - \bar{x}\|^2 \le \|x_k - \bar{x}\|^2 - (64\bar{L})^{-1}\beta_1\bar{\epsilon}.$ (8.90)

$(n_0 - 1)(64\bar{L})^{-1}\bar{\epsilon}\beta_1 \ge \beta_1 n_0 2^{-1}(64\bar{L})^{-1}\bar{\epsilon},$

$(n_0/2)(64\bar{L})^{-1}\bar{\epsilon}\beta_1 \le (M + \|\bar{x}\| + 3)^2 \le (2M + 3)^2.$
This contradicts (8.77). The contradiction we have reached proves that there exists an integer

$j \in \{1,\dots,n_0\}$

for which

$d(x_j, C) \le \delta$ (8.92)

$d(x_i, C_{\min}) \le \epsilon.$

Assume the contrary. Then there exists an integer $k \in [j,n]$ for which

$k > j.$ (8.95)
Then

Assume that (8.98) is valid. In view of (8.98), (8.73), (8.8), and (8.9),

$x_{k-1} \in X_0 \subset B(0, \bar{K}).$ (8.100)

$\|x_{k-1} - z\| \le \delta.$ (8.101)

Then

$\|x_k - z\| \le \delta + \|z - P_C(x_{k-1} - \alpha_{k-1}\|v_{k-1}\|^{-1}v_{k-1} - \alpha_{k-1}\xi_{k-1})\| \le \delta + \|z - x_{k-1}\| + \alpha_{k-1} + \delta < 3\delta + \alpha_{k-1}.$ (8.102)

This inequality contradicts (8.94). The contradiction we have reached proves (8.99).
In view of (8.97), (8.8), and (8.9),

$\|x_{k-1}\| \le \bar{K} + 1.$ (8.105)
It follows from (8.78), (8.79), (8.105), (8.99), and (8.82)–(8.84) that Lemma 8.6 holds with

$x = x_{k-1}, \quad y = x_k, \quad \xi = \xi_{k-1}, \quad \eta = \eta_{k-1},$
$v = v_{k-1}, \quad \alpha = \alpha_{k-1}, \quad K_0 = \bar{K} + 1,$

$\epsilon = 4^{-1}\bar{\epsilon}$, and combining with (8.81), (8.75), (8.79), and (8.97) this implies that

$d(x_k, C_{\min})^2 \le d(x_{k-1}, C_{\min})^2 - \alpha_{k-1}(16\bar{L})^{-1}\bar{\epsilon} + 2\alpha_{k-1}^2 + \delta^2 + 2\delta(\bar{K} + 4)$
$\le d(x_{k-1}, C_{\min})^2 - (16\bar{L})^{-1}\alpha_{k-1}\bar{\epsilon} + 2\alpha_{k-1}^2 + 2\delta(\bar{K} + 5)$
$\le d(x_{k-1}, C_{\min})^2 - (32\bar{L})^{-1}\alpha_{k-1}\bar{\epsilon} + 2\delta(\bar{K} + 5)$
$\le d(x_{k-1}, C_{\min})^2 - (32\bar{L})^{-1}\beta_1\bar{\epsilon} + 2\delta(2\bar{K} + 5)$
$< d(x_{k-1}, C_{\min})^2 \le \epsilon^2.$
$d(x_i, C_{\min}) \le \epsilon$
In this chapter we study the convergence of a proximal point method in the presence of computational errors. Most results known in the literature establish the convergence of proximal point methods when computational errors are summable. In this chapter the convergence of the method is established for nonsummable computational errors. We show that the proximal point method generates a good approximate solution if the sequence of computational errors is bounded from above by some constant.
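The iteration in question chooses $x_{k+1}$ as an approximate minimizer of $f(x) + (\lambda_k/2)\|x - x_k\|^2$. A minimal sketch — our own, with the subproblem solved inexactly by a few gradient steps, which plays the role of the computational error:

```python
import numpy as np

def prox_point(x0, grad_f, lam, iters=50, inner=5):
    """Inexact proximal point method: each outer step approximately
    minimizes F_k(x) = f(x) + (lam/2)||x - x_k||^2 by `inner` gradient
    steps on F_k; truncating the inner loop is the computational error."""
    x = np.asarray(x0, float)
    step = 1.0 / (lam + 1.0)           # assumes grad_f is 1-Lipschitz
    for _ in range(iters):
        y = x.copy()                   # warm-start the subproblem
        for _ in range(inner):
            y = y - step * (grad_f(y) + lam * (y - x))
        x = y
    return x

# Toy objective f(x) = ||x - c||^2 / 2, grad_f(x) = x - c.
c = np.array([1.0, -2.0])
x_min = prox_point(np.zeros(2), lambda x: x - c, lam=1.0)
```

The point of this chapter is that the inner errors need not shrink (be summable) across outer iterations: a uniform bound on them already yields a correspondingly good approximate solution.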
We analyze the behavior of the proximal point method in a Hilbert space; this method is an important tool in optimization theory. See, for example, [15, 16, 31, 34, 36, 53, 55, 69, 77, 81, 87, 103, 104, 106, 107, 111, 113] and the references mentioned therein.
Let $X$ be a Hilbert space equipped with an inner product $\langle\cdot,\cdot\rangle$ which induces the norm $\|\cdot\|$.
For each function $g : X \to \mathbb{R}^1 \cup \{\infty\}$ set
and that
Let a point

$x_* \in \operatorname{argmin}(f).$ (9.4)

Clearly,

Assume that

$\lambda_k \in [\lambda_1, \lambda_2], \quad k = 0,1,\dots,$ (9.9)

$(L+1)(2\lambda_1^{-1}\gamma + 8M_0(\gamma\lambda_1^{-1})^{1/2}) \le 1 \ \text{ and } \ \gamma(L+1) \le \epsilon/4.$ (9.11)
$f(x_0) \le M$ (9.12)

9.1 Preliminaries and the Main Results

and

$f(x_k) \le \inf(f) + \epsilon.$

$f(\cdot) \le \inf(f) + \epsilon$

doing $\lfloor c_1\epsilon^{-1}\rfloor$ iterations [see (9.10)] with the computational error $\gamma = c_2\epsilon^2$ [see (9.11)], where the constant $c_1 > 0$ depends only on $M_0, \lambda_2$ and the constant $c_2 > 0$ depends only on $M_0, L, \lambda_1, \lambda_2$.
Theorem 9.1 implies the following result.
Theorem 9.2. Let

$\lambda_k \in [\lambda_1, \lambda_2], \quad k = 0,1,\dots,$

$(L+1)(2\lambda_1^{-1}\bar{\gamma} + 8M_0(\bar{\gamma}\lambda_1^{-1})^{1/2}) \le 1 \ \text{ and } \ \bar{\gamma}(L+1) \le 1/4.$ (9.15)

Assume that

$\{\gamma_i\}_{i=0}^{\infty} \subset (0, \bar{\gamma}), \quad \lim_{i\to\infty} \gamma_i = 0$ (9.16)

and that $\epsilon > 0$. Then there exists a natural number $T_0$ such that for each sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfying

$f(x_0) \le M$ (9.17)

and

$f(x_{k+1}) + 2^{-1}\lambda_k\|x_{k+1} - x_k\|^2 \le \inf(f + 2^{-1}\lambda_k\|\cdot - x_k\|^2) + \gamma_k$ (9.18)

for all integers $k \ge 0$, the inequality

$f(x_k) \le \inf(f) + \epsilon$

holds for all integers $k > T_0$.
9 Proximal Point Method in Hilbert Spaces
Since the function $f$ is convex and lower semicontinuous and satisfies (9.2), Theorem 9.2 easily implies the following result.
Corollary 9.3. Suppose that all the assumptions of Theorem 9.2 hold and that the sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfies (9.17) and (9.18) for all integers $k \ge 0$. Then $\lim_{k\to\infty} f(x_k) = \inf(f)$ and the sequence $\{x_k\}_{k=0}^{\infty}$ is bounded. Moreover, it possesses a weakly convergent subsequence and the limit of any weakly convergent subsequence of $\{x_k\}_{k=0}^{\infty}$ is a minimizer of $f$.
Problem (P) is called well posed if the function f possesses a unique minimizer
which is a limit in the norm topology of any minimizing sequence of f (see [60, 121]
and the references mentioned therein).
Corollary 9.3 easily implies the following result.
Corollary 9.4. Suppose that problem (P) is well posed, all the assumptions of Theorem 9.2 hold and that the sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfies (9.17) and (9.18) for all integers $k \ge 0$. Then $\{x_k\}_{k=0}^{\infty}$ converges in the norm topology to the unique minimizer of $f$.
Note that in [60] it was shown that most problems of type (P) (in the sense of
Baire category) are well posed.
The results of the chapter were obtained in [120]. The chapter is organized as
follows. Section 9.2 contains auxiliary results. Theorem 9.1 is proved in Sect. 9.3
and Theorem 9.2 is proved in Sect. 9.4.
We use the notation and definitions introduced in Sect. 9.1 and suppose that all the assumptions made in the introduction hold.
Lemma 9.5. Assume that

$\lambda_k \in [\lambda_1, \lambda_2], \quad k = 0,1,\dots,$ (9.19)

$f(x_0) \le M,$ (9.20)

$\|x_k\| \le M_0.$ (9.22)
It follows from (9.21), (9.19), (9.7), (9.8), (9.3), (9.4), (9.5), and (9.22) that

Together with (9.6) the inequality above implies that $\|x_{k+1}\| \le M_0$. Thus we showed by induction that (9.22) holds for all integers $k \ge 0$. This completes the proof of Lemma 9.5.
Lemma 9.6. Assume that

$\lambda_k \in [\lambda_1, \lambda_2], \quad k = 0,1,\dots,$ (9.23)

$f(x_0) \le M$ (9.24)

$\sum_{i=n}^{m} 2\lambda_2^{-1}(f(x_i) - f(x_*)) + \sum_{i=n}^{m} \|x_{i-1} - x_i\|^2 \le 4M_0^2 + \sum_{i=n-1}^{m-1} \left[2\lambda_1^{-1}\gamma_i + 8M_0(\gamma_i\lambda_1^{-1})^{1/2}\right].$
$f(z) + 2^{-1}\lambda_k\|z - x_k\|^2$
$\le 2^{-1}f(x_{k+1}) + 2^{-1}f(y_{k+1}) + 2^{-1}\lambda_k(2^{-1}\|y_{k+1} - x_k\|^2 + 2^{-1}\|x_{k+1} - x_k\|^2 - \|2^{-1}(y_{k+1} - x_{k+1})\|^2)$
$\le \inf\{f(x) + 2^{-1}\lambda_k\|x - x_k\|^2 : x \in X\} + 2^{-1}\gamma_k + 2^{-1}\gamma_k - 2^{-1}\lambda_k\|2^{-1}(y_{k+1} - x_{k+1})\|^2.$
and that

$0 \in \partial f(y_{k+1}) + \lambda_k(y_{k+1} - x_k).$

9.2 Auxiliary Results

By (9.32), we have

and

$f(x_*) - f(x_{k+1}) = f(x_*) - f(y_{k+1}) + f(y_{k+1}) - f(x_{k+1})$
$\ge \lambda_k\langle x_k - y_{k+1}, x_* - y_{k+1}\rangle + f(y_{k+1}) - f(x_{k+1})$
$= 2^{-1}\lambda_k[\|y_{k+1} - x_*\|^2 - \|x_k - x_*\|^2 + \|x_k - y_{k+1}\|^2] + f(y_{k+1}) - f(x_{k+1})$
$\ge 2^{-1}\lambda_k[\|y_{k+1} - x_*\|^2 - \|x_k - x_*\|^2 + \|x_k - x_{k+1}\|^2] - \gamma_k.$ (9.34)
It follows from (9.28), (9.23), (9.26), (9.7), (9.8), and (9.5) that for all integers $q \ge 1$,

$\|y_q\| \le M_0, \quad q = 1,2,\dots$ (9.35)

Now we use (9.34) and (9.35) to obtain an estimate of $f(x_*) - f(x_{k+1})$ without terms which contain $y_{k+1}$.
In view of (9.26) and (9.31),

$f(x_*) - f(x_{k+1}) \ge -\gamma_k + 2^{-1}\lambda_k[\|x_{k+1} - x_*\|^2 - \|x_* - x_k\|^2 + \|x_k - x_{k+1}\|^2] - 2^{-1}\lambda_k 8M_0(\gamma_k\lambda_1^{-1})^{1/2},$

$f(x_{k+1}) - f(x_*) + 2^{-1}\lambda_k\|x_k - x_{k+1}\|^2 \le \gamma_k + 2^{-1}\lambda_k[\|x_k - x_*\|^2 - \|x_{k+1} - x_*\|^2] + 2^{-1}\lambda_k 8M_0(\gamma_k\lambda_1^{-1})^{1/2}.$
and by (9.23),

$\sum_{i=n}^{m} (2/\lambda_2)(f(x_i) - f(x_*)) + \sum_{i=n}^{m} \|x_{i-1} - x_i\|^2$
$\le \|x_{n-1} - x_*\|^2 + \sum_{i=n-1}^{m-1} \left[2\gamma_i\lambda_1^{-1} + 8M_0(\gamma_i\lambda_1^{-1})^{1/2}\right]$
$\le 4M_0^2 + \sum_{i=n-1}^{m-1} \left[2\lambda_1^{-1}\gamma_i + 8M_0(\gamma_i\lambda_1^{-1})^{1/2}\right].$
It follows from (9.9), (9.10), (9.11), (9.12), (9.13), and Lemma 9.6 applied for a natural number $n$ and $m = n + L$ that

9.3 Proof of Theorem 9.1

$\sum_{i=n}^{n+L} (2/\lambda_2)(f(x_i) - f(x_*)) \le 4M_0^2 + \sum_{i=n-1}^{n-1+L} \left[2\lambda_1^{-1}\gamma + 8M_0(\gamma\lambda_1^{-1})^{1/2}\right]$
$\le 4M_0^2 + (L+1)\left[2\lambda_1^{-1}\gamma + 8M_0(\gamma\lambda_1^{-1})^{1/2}\right] \le 4M_0^2 + 1.$ (9.38)

$(L+1)2\lambda_2^{-1}\min\{f(x_i) - f(x_*) : i = n,\dots,n+L\} \le 4M_0^2 + 1$

and by (9.10),
Since (9.39) holds for any natural number $n$ there exists a strictly increasing sequence of natural numbers $\{S_i\}_{i=1}^{\infty}$ such that

$S_i \le j < S_{i+1}$

and

$j - S_i \le L + 1.$ (9.42)

$f(x_{k+1}) \le f(x_k) + \gamma.$

Combined with (9.42), (9.41) and (9.11) the inequality above implies that
$f(x_0) \le M,$

$f(x_{k+1}) + 2^{-1}\lambda_k\|x_{k+1} - x_k\|^2 \le \inf(f + 2^{-1}\lambda_k\|\cdot - x_k\|^2) + \bar{\gamma}, \quad k = 0,1,\dots.$

Then

By Theorem 9.1 there exist $\delta \in (0, \bar{\gamma})$ and an integer $L_0 \ge 1$ such that the following property holds:
(P2) For every sequence $\{y_i\}_{i=0}^{\infty} \subset X$ which satisfies

$f(y_0) \le \inf(f) + 1,$

$f(y_{k+1}) + 2^{-1}\lambda_k\|y_{k+1} - y_k\|^2 \le \inf(f + 2^{-1}\lambda_k\|\cdot - y_k\|^2) + \delta$

$T_0 > L_0 + L_1 + L.$ (9.45)

Assume that a sequence $\{x_i\}_{i=0}^{\infty} \subset X$ satisfies (9.17) and (9.18). It follows from property (P1), (9.17), (9.18), and (9.16) that

$y_k = x_{k+L+L_1}.$ (9.47)
9.4 Proof of Theorem 9.2
Combined with (9.47) and (9.45) the inequality above implies that
In this chapter we study the local convergence of a proximal point method in a metric space in the presence of computational errors. We show that the proximal point method generates a good approximate solution if the sequence of computational errors is bounded from above by some constant. The principal assumption is a local error bound condition, introduced by Hager and Zhang [55], which relates the growth of the objective function to the distance to the set of minimizers. The related literature includes [15, 16, 31, 34, 35, 42, 55, 56, 66, 67, 69, 77, 81, 87, 103, 104, 111, 113]. In the proximal point method, iterates $x_k$, $k \ge 0$, are generated by the following rule:
Set

$\bar{x} \in \Omega_0 = \Omega \cap B(\bar{x}, \rho)$ (10.3)

Let

$\theta < 1.$ (10.7)

10.1 Preliminaries and the Main Results

$\epsilon \in (0,1), \quad \epsilon < (1 + (1-\theta)^{-1})^{-1}, \quad \epsilon < (\rho_0 - \rho)/3,$ (10.8)

$2(k_0(\epsilon))^2(2\gamma(\epsilon)\beta_1^{-1})^{1/2} + 4(k_0(\epsilon))^2\gamma(\epsilon)\epsilon^{-1}(2\alpha\beta_0)^{-1} < \rho_0,$ (10.10)
The following theorem, obtained in [124], is the main result of this chapter.
Theorem 10.1. Let a number $\epsilon$ satisfy (10.8), $k_0 = k_0(\epsilon)$ and $\gamma = \gamma(\epsilon)$. Assume that

$\{\lambda_k\}_{k=0}^{\infty} \subset [\beta_1, \beta_0],$ (10.13)

a sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfies

Then

$x_j \in B(\bar{x}, \rho_0)$ for all integers $j \ge 0$,

$D(x_j, \Omega_0) \le \epsilon$ for all integers $j \ge k_0$.
Theorem 10.2. There exists $\bar{\gamma} > 0$ such that for each sequence $\{\gamma_i\}_{i=0}^{\infty} \subset (0, \bar{\gamma}]$ satisfying

$\lim_{i\to\infty} \gamma_i = 0$ (10.16)

$\lim_{\|x\|\to\infty} f(x) = \infty$

and
where $f''(x)$ is the second-order Fréchet derivative of $f$ at a point $x$. It is not difficult to see that the function $f$ possesses a unique minimizer $\bar{x}$ and (10.4) holds with $\alpha = 1$, $\Omega = \Omega_0 = \{\bar{x}\}$, $\rho_0 > 1$ and
Example 10.4. Let $(X, \langle\cdot,\cdot\rangle)$ be a Hilbert space with an inner product $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$. Assume that $f : X \to \mathbb{R}^1 \cup \{\infty\}$ is a lower semicontinuous function, $0 < a < b$, the restriction of $f$ to the set $\{z \in X : \|z\| < b\}$ is a convex function which possesses a continuous second-order Fréchet derivative $f''(\cdot)$,

and that

It is easy to see that there exists a unique minimizer $\bar{x}$ of $f$, $\|\bar{x}\| \le a$, and that (10.4) holds with $\Omega = \Omega_0 = \{\bar{x}\}$, $\rho_0 = (b-a)/2$, any positive constant $\alpha < \rho_0$ and

$c_2 c_1^{-1} < \rho_0.$ (10.18)
Set

Let

Thus

$x(0) = x(2\pi) = 0, \quad |x'(t)| \le 1, \ t \in [0, 2\pi] \ \text{almost everywhere (a.e.)}.$ (10.23)

Clearly, this problem possesses a unique solution $\bar{x}(t) = \sin(t)$, $t \in [0, 2\pi]$. Let us show that this constrained problem is a particular case of the problem considered in this section.
Denote by $X$ the set of all a.c. functions $x : [0, 2\pi] \to \mathbb{R}^1$ such that (10.23) holds. For all $x_1, x_2 \in X$ set

$f(x) = \int_0^{2\pi} |x(t) - \sin(t)|\,dt.$ (10.24)
By (10.23),

By (10.23) and (10.27), for each $t \in [0, 2\pi]$ satisfying $|t_0 - t| \le d(x, \bar{x})/4$,

and

and let

$x \in \Omega.$ (10.29)

Then

$d(z_1, z_0)^2 \le d(x, z_0)^2 + 2\gamma\lambda^{-1}$

and
Then

$\lambda \in [\beta_1, \beta_0]$ (10.30)

10.2 Auxiliary Results

Then

Proof. Let

$x \in \Omega_0.$ (10.33)

By (10.32),

$f(z_1) \le f(\bar{x}) + 2^{-1}\lambda D(z_1, \Omega_0)(2d(z_1, z_0) + D(z_1, \Omega_0)).$ (10.34)
$d(y_j, \bar{x}) \le \rho_0, \quad j = 0,\dots,k,$ (10.39)

$0 \le k < k_0$ (10.42)

For all integers $j = 0,\dots,k$, it follows from (10.13), (10.38), and Lemma 10.8 applied with $z_0 = y_j$, $z_1 = y_{j+1}$, $\gamma = \gamma_j$ that

$d(y_{k+1}, y_0) \le \sum_{j=0}^{k} d(y_{j+1}, y_j) \le \sum_{j=0}^{k} \left[D(y_j, \Omega_0) + (2\gamma_j\lambda_j^{-1})^{1/2}\right]$
$\le \sum_{j=0}^{k} \left[\theta^j D(y_0, \Omega_0) + \gamma_j + (2\gamma_j\lambda_j^{-1})^{1/2}\right]$
$\le (1-\theta)^{-1} D(y_0, \Omega_0) + \sum_{j=0}^{k} \gamma_j + (k+1)(2\bar{\gamma}\beta_1^{-1})^{1/2}.$ (10.46)
By (10.13), (10.47), (10.44), (10.38), (10.43), (10.7), (10.41), and Lemma 10.9 applied with $z_0 = y_k$, $z_1 = y_{k+1}$, $\gamma = \gamma_k$,

In view of (10.47) and (10.48) we conclude that (10.43) holds for all $j = 0,\dots,k+1$.
Thus by induction we have shown that at least one of the following cases holds:
(i) there is an integer $k \in [0, k_0]$ such that (10.39) and (10.40) hold;
(ii) (10.43) holds for all $j = 0,\dots,k_0$.
In case (i) the assertion of the lemma holds.
Assume that case (ii) holds. By (10.43) with $j = k_0$, (10.3), (10.37), (10.41), (10.9), (10.11), and (10.12),
d(x_j, x̄) ≤ ε₀, j = 0, …, k, (10.49)
D(x_j, Ω₀) ≤ ε. (10.51)
D(x_{j+1}, Ω₀) ≤ ε.
Let
{λ_k}_{k=0}^∞ ⊂ [β₁, β₀], (10.58)
a sequence {x_k}_{k=0}^∞ ⊂ X satisfies
Then
D(x_j, Ω₀) ≤ ε. (10.63)
D(x_{j+1}, Ω₀) ≤ ε.
Therefore, in order to prove the proposition, it is sufficient to show that (10.63) holds with some integer j ∈ [0, k₀].
Assume the contrary. Thus
D(x_j, Ω₀) > ε for all integers j ∈ [0, k₀]. (10.65)
10.6 Proof of Theorem 10.2
0 ≤ k < k₀, (10.66)
Thus (10.67) holds for j D k C 1. By induction we have shown that (10.67) holds for
all j D 0; : : : ; k0 . Together with (10.59), (10.55), (10.57), and (10.62) this implies
that
By Theorem 10.1 there exists γ̄ > 0 such that the following property holds:
(P1) For each sequence
{λ_k}_{k=0}^∞ ⊂ [β₁, β₀] (10.68)
we have
Assume that
{δ_i}_{i=0}^∞ ⊂ (0, γ̄] and lim_{i→∞} δ_i = 0 (10.71)
and ε > 0. By Proposition 10.11 there are δ̂ > 0 and a natural number q₁ such that the following property holds:
(P2) Assume that (10.68) holds, a sequence {x_k}_{k=0}^∞ ⊂ X satisfies (10.70) and that for all integers k ≥ 0,
Set
k1 D q1 C q2 : (10.73)
10.7 Well-Posed Minimization Problems

We use the notation and definitions from Sect. 10.1. Suppose that
Ω = {x̄}
and
lim_{i→∞} d(z_i, x̄) = 0.
In other words, the problem f(x) → min, x ∈ X, is well posed in the sense of [121].
Fix M > 1 C 0 .
Proposition 10.12. There exist γ, δ̄ > 0 such that for each λ ∈ (0, δ̄], each z₀ ∈ B(x̄, M) and each z₁ ∈ X satisfying
the inequality
holds.
Proof. Since the problem f(z) → min, z ∈ X, is well posed, there is δ > 0 such that
Let
Together with (10.77) this implies (10.76). Proposition 10.12 is proved.
Let γ, δ̄ > 0 be as guaranteed by Proposition 10.12. We suppose that
β₀ ≤ δ̄. (10.80)
10 Proximal Point Methods in Metric Spaces
{λ_k}_{k=0}^∞ ⊂ [β₁, β₀], (10.82)
a sequence {x_k}_{k=0}^∞ ⊂ X satisfies
d(x₀, x̄) ≤ M (10.83)
Then
Since ε can be an arbitrarily small positive number, the assertion of the theorem now follows from Theorem 10.1.
Theorem 10.14. Let γ̄ > 0 be as guaranteed by Theorem 10.2,
δ̂ = min{γ̄, δ̄}, (10.85)
{δ_i}_{i=0}^∞ ⊂ (0, δ̂], lim_{i→∞} δ_i = 0, (10.86)
ε > 0 and let a natural number k₁ be as guaranteed by Theorem 10.2 with the sequence {δ_{i+1}}_{i=0}^∞. Assume that
{λ_k}_{k=0}^∞ ⊂ [β₁, β₀], (10.87)
a sequence {x_k}_{k=0}^∞ ⊂ X satisfies
d(x₀, x̄) ≤ M (10.88)
Then
10.8 An Example
Let X = Rⁿ be equipped with the Euclidean norm ‖·‖ which induces the metric d(x, y) = ‖x − y‖, x, y ∈ Rⁿ.
Set
Ω = B(0, 1),
Clearly, all the assumptions made in Sect. 10.1 hold with x̄ = 0, Ω = Ω₀ = B(0, 1), γ = 1, α = 1 and any positive constant ε₀ > 1. Thus Theorems 10.1 and 10.2 hold for the function f.
We prove the following result.
Proposition 10.15. Assume that ε > 0 and that a sequence {δ_i}_{i=0}^∞ ⊂ (0, 1] satisfies
Σ_{i=0}^∞ δ_i^{1/2} = ∞. (10.90)
x_k = t_k y₁ + (1 − t_k) y₀. (10.94)
Clearly,
{x_k}_{k=0}^q ⊂ F ∩ B(0, 1). (10.95)
Let Rⁿ be the n-dimensional Euclidean space equipped with an inner product ⟨·,·⟩ which induces the norm ‖·‖.
A multifunction T : Rⁿ → 2^{Rⁿ} is called a monotone operator if
⟨z − z′, w − w′⟩ ≥ 0 for all z, z′ ∈ Rⁿ, all w ∈ T(z) and all w′ ∈ T(z′).
It is maximal monotone if, in addition, its graph
{(z, w) ∈ Rⁿ × Rⁿ : w ∈ T(z)}
is not properly contained in the graph of any other monotone operator. For each c > 0 and each z ∈ Rⁿ there exists a unique u ∈ Rⁿ such that
z ∈ (I + cT)(u),
where I is the identity mapping. Thus
P_c := (I + cT)^{−1} (11.2)
is a single-valued mapping, and P_c(z) = z if and only if 0 ∈ T(z). Following the terminology of Moreau [87], P_c is called the proximal mapping associated with cT.
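As a hedged illustration (the operator A below is a hypothetical example, not one from the text), the proximal mapping has a closed form when T(z) = Az with A symmetric positive semidefinite, since then P_c(z) = (I + cA)^{−1}z:

```python
import numpy as np

# Hypothetical monotone operator T(z) = A z; 0 ∈ T(z) exactly when z = 0.
A = np.array([[2.0, 0.0], [0.0, 1.0]])

def prox(z, c):
    # P_c(z) = (I + cA)^{-1} z, the proximal mapping associated with cT
    return np.linalg.solve(np.eye(2) + c * A, z)

# Proximal point iteration z_{k+1} = P_{c_k}(z_k) with constant c_k = 1.
z = np.array([4.0, -3.0])
for k in range(200):
    z = prox(z, c=1.0)

print(np.linalg.norm(z) < 1e-6)  # iterates approach the zero set F = {0}
```

The same iteration applies to any maximal monotone T; only the computation of P_c changes.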
The proximal point algorithm generates, for any given sequence {c_k}_{k=0}^∞ of positive real numbers and any starting point z₀ ∈ Rⁿ, a sequence {z_k}_{k=0}^∞ ⊂ Rⁿ, where
z_{k+1} := P_{c_k}(z_k), k = 0, 1, … .
We assume that the graph of T
is closed.
Assume that
F := {z ∈ Rⁿ : 0 ∈ T(z)} ≠ ∅.
Fix
λ̄ > 0. (11.5)
For each x ∈ Rⁿ and each r > 0 set
B(x, r) = {y ∈ Rⁿ : ‖x − y‖ ≤ r}.
We prove the following result, which establishes the convergence of the proximal
point algorithm without computational errors.
Theorem 11.1. Let M, ε > 0. Then there exists a natural number n₀ such that, for each sequence {λ_k}_{k=0}^∞ ⊂ [λ̄, ∞) and each sequence {x_k}_{k=0}^∞ ⊂ Rⁿ such that
‖x₀‖ ≤ M,
we have
d(x_{n₀}, F) ≤ ε.
Theorem 11.2 easily follows from the following result, which is proved in
Sect. 11.4.
11 Maximal Monotone Operators and the Proximal Point Algorithm

Theorem 11.3. Let M, ε₀ > 0, let a natural number n₀ be as guaranteed by Theorem 11.1 with ε = ε₀/2, and let δ = ε₀(2n₀)^{−1}.
Then, for each sequence {λ_k}_{k=0}^{n₀−1} ⊂ [λ̄, ∞) and each sequence {x_k}_{k=0}^{n₀} ⊂ Rⁿ such that
‖x₀‖ ≤ M,
we have
d(x_{n₀}, F) ≤ ε₀.
Theorem 11.5. Let M > 0, let {δ_k}_{k=0}^∞ be a sequence of positive numbers such that lim_{k→∞} δ_k = 0 and let ε > 0. Then there exists a natural number n such that, for each sequence {λ_k}_{k=0}^∞ ⊂ [λ̄, ∞) and each sequence {x_k}_{k=0}^∞ ⊂ B(0, M) satisfying the inequality
d(x_k, F) ≤ ε
Theorem 11.7. Suppose that the set F is bounded and let M > 0. Then there exists δ > 0 such that the following assertion holds.
Assume that {δ_k}_{k=0}^∞ ⊂ (0, δ] satisfies
lim_{k→∞} δ_k = 0
and that ε > 0. Then there exists a natural number n such that, for each sequence {λ_k}_{k=0}^∞ ⊂ [λ̄, ∞) and each sequence {x_k}_{k=0}^∞ ⊂ Rⁿ satisfying
‖x₀‖ ≤ M,
0 ∈ T(z). (11.8)
2^{−1}‖z − x_k‖² − 2^{−1}‖z − x_{k+1}‖² − 2^{−1}‖x_k − x_{k+1}‖² = ⟨x_k − x_{k+1}, x_{k+1} − z⟩. (11.9)
By (11.7),
x_k − x_{k+1} ∈ λ_k T(x_{k+1}).
Together with (11.9) and (11.8) this completes the proof of Lemma 11.8.
Using (11.3) we can easily deduce the following lemma.
Lemma 11.10. Assume that z ∈ Rⁿ satisfies (11.8), M > 0,
{λ_k}_{k=0}^∞ ⊂ (0, ∞), {x_k}_{k=0}^∞ ⊂ Rⁿ,
‖x₀ − z‖ ≤ M and that (11.7) holds for all integers k ≥ 0. Then ‖x_k − z‖ ≤ M for all integers k ≥ 0.
Lemma 11.11. Let M, ε > 0. Then there exists δ > 0 such that, for each x ∈ B(0, M), each λ ≥ λ̄ and each z ∈ B(0, δ) satisfying z ∈ λT(x), the inequality d(x, F) ≤ ε holds.
Proof. Assume the contrary. Then for each natural number k there exist x_k ∈ B(0, M), λ_k ≥ λ̄ and z_k ∈ B(0, 1/k) such that
λ_k^{−1} z_k ∈ T(x_k) (11.12)
and
λ_k^{−1}‖z_k‖ ≤ λ̄^{−1}‖z_k‖ ≤ λ̄^{−1}k^{−1} → 0 as k → ∞. (11.13)
Extracting a subsequence and re-indexing, we may assume that there exists
x := lim_{k→∞} x_k. (11.14)
Since the graph of T is closed, relations (11.12), (11.13), and (11.14) imply that 0 ∈ T(x) and that x ∈ F. Together with (11.14) this implies that d(x_k, F) ≤ ε/2 for all sufficiently large natural numbers k. This contradicts (11.11) and proves Lemma 11.11.
Lemma 11.12. Assume that the integers p, q, with 0 ≤ p < q, are such that
{λ_k}_{k=p}^{q−1} ⊂ (0, ∞), {δ_k}_{k=p}^{q−1} ⊂ (0, ∞), (11.15)
{x_k}_{k=p}^{q} ⊂ Rⁿ, {y_k}_{k=p}^{q} ⊂ Rⁿ, y_p = x_p,
‖y_k − x_k‖ ≤ Σ_{i=p}^{k−1} δ_i. (11.17)
Proof. We prove the lemma by induction. In view of (11.16) and (11.15), equation
(11.17) holds for k D p C 1.
Assume that an integer j satisfies p C 1 j q, (11.17) holds for all k D
p C 1; : : : ; j and that j < q.
By (11.16), (11.3), and (11.17) with k = j,
‖y_{j+1} − x_{j+1}‖ ≤ ‖y_j − x_j‖ + δ_j ≤ Σ_{i=p}^{j−1} δ_i + δ_j = Σ_{i=p}^{j} δ_i
and (11.17) holds for all k D pC1; : : : ; jC1. Therefore we showed by induction that
(11.17) holds for all k D p C 1; : : : ; q. This completes the proof of Lemma 11.12.
11.3 Proof of Theorem 11.1

Fix
z ∈ F. (11.18)
By Lemma 11.11, there exists δ ∈ (0, 1) such that the following property holds:
(P1) For each x ∈ B(0, M + 2‖z‖), each λ ≥ λ̄ and each v ∈ B(0, δ) satisfying v ∈ λT(x), we have d(x, F) ≤ ε/2.
Choose a natural number n₀ such that
Assume that
{λ_k}_{k=0}^∞ ⊂ [λ̄, ∞), {x_k}_{k=0}^∞ ⊂ Rⁿ, ‖x₀‖ ≤ M. (11.20)
Σ_{k=0}^{n₀−1} ‖x_k − x_{k+1}‖² ≤ ‖z − x₀‖² − ‖z − x_{n₀}‖² ≤ ‖z − x₀‖² ≤ (‖z‖ + M)².
In view of (11.20),
{λ_k}_{k=0}^{n₀−1} ⊂ [λ̄, ∞), {x_k}_{k=0}^{n₀} ⊂ Rⁿ. (11.25)
11.4 Proofs of Theorems 11.3, 11.5, 11.6, and 11.7
Put
d(x_{n₀}, F) ≤ ε₀.
Put
n = n₀ + p. (11.30)
Assume that
{λ_k}_{k=0}^∞ ⊂ [λ̄, ∞), {x_k}_{k=0}^∞ ⊂ B(0, M). (11.31)
By Theorem 11.2 there exist δ̄ > 0 and a natural number n₀ such that the following property holds:
(P2) For each sequence {λ_k}_{k=0}^{n₀−1} ⊂ [λ̄, ∞) and each sequence {x_k}_{k=0}^{n₀} ⊂ Rⁿ which satisfies
‖x₀‖ ≤ M,
we have
d(x_{n₀}, F) ≤ ε/4.
Put
δ = min{δ̄, (ε/4)n₀^{−1}}. (11.34)
Assume that
fk g1 N 1
kD0 Œ; 1/; fxk gkD0 R
n
(11.35)
and
kx0 k M;
kxn0 k M: (11.38)
Assume that j is a natural number and (11.39) holds. By (11.39), (11.35), (11.36), (11.34), and (P2),
d(x_{(j+1)n₀}, F) ≤ ε/4.
Together with (11.33) this implies that ‖x_{(j+1)n₀}‖ ≤ M. Thus (11.39) holds for all natural numbers j.
Let j be a natural number. Put
By Lemma 11.12, (11.35), (11.36), (11.40), and (11.34), for all k = jn₀ + 1, …, 2(j + 1)n₀,
By Theorem 11.6 there are δ > 0 and a natural number n₀ such that the following property holds:
(P3) for each {λ_k}_{k=0}^∞ ⊂ [λ̄, ∞) and each {x_k}_{k=0}^∞ ⊂ Rⁿ satisfying
‖x₀‖ ≤ M,
{δ_k}_{k=0}^∞ ⊂ (0, δ], lim_{k→∞} δ_k = 0, ε > 0, (11.46)
ε < 1. (11.47)
By Theorem 11.6 there are δ* ∈ (0, δ) and a natural number n* such that the following property holds:
(P4) for each {λ_k}_{k=0}^∞ ⊂ [λ̄, ∞) and each {x_k}_{k=0}^∞ ⊂ Rⁿ satisfying ‖x₀‖ ≤ M,
Put
n = n₀ + p + n*. (11.49)
Assume that
{λ_k}_{k=0}^∞ ⊂ [λ̄, ∞), {x_k}_{k=0}^∞ ⊂ Rⁿ, (11.50)
‖x₀‖ ≤ M,
It follows from (11.50), (11.53), (11.54), (11.49), and property (P4) applied to the sequence {x_k}_{k=n₀+p}^∞ that
B(x, r) = {y ∈ X : ‖x − y‖ ≤ r}.
Moreover,
‖P_C(x) − P_C(y)‖ ≤ ‖x − y‖ for all x, y ∈ X
and
⟨z − P_C(x), x − P_C(x)⟩ ≤ 0.
We suppose that
S ≠ ∅. (12.2)
In the sequel, we present examples which provide simple and clear estimates for the sets S_ε in some important cases. These examples show that elements of S_ε can be considered as ε-approximate solutions of the variational inequality.
In this chapter, in order to solve the variational inequality (to find x ∈ S), we use the algorithm known in the literature as the extragradient method [75]. In each iteration of this algorithm, in order to get the next iterate x_{k+1}, two orthogonal projections onto C are calculated, according to the following iterative step. Given the current iterate x_k, calculate y_k = P_C(x_k − τ_k f(x_k)) and then
x_{k+1} = P_C(x_k − τ_k f(y_k)).
In practice, both projections are computed only up to a computational error
with a constant δ > 0 which depends only on our computer system. Surely, in this situation one cannot expect that the sequence {x_k}_{k=0}^∞ converges to the set S. The goal is to understand what subset of C attracts all sequences {x_k}_{k=0}^∞ generated by the algorithm. The main result of this chapter (Theorem 12.2) shows that this subset of C is the set S_ε with some ε > 0 depending on δ [see (12.9) and (12.10)]. The examples considered in this section show that one cannot expect to find an attracting set smaller than S_ε, whose elements can be considered as approximate solutions of the variational inequality.
The results of this chapter were obtained in [127].
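The exact iterative step described above can be sketched as follows; the monotone map f, the set C, and the step size below are hypothetical stand-ins chosen only for illustration:

```python
import numpy as np

# Hypothetical monotone, 1-Lipschitz map f(x) = Bx (a rotation) on R^2,
# with C the closed ball of radius 2; the unique solution is x* = 0.
B = np.array([[0.0, 1.0], [-1.0, 0.0]])
f = lambda x: B @ x

def proj_C(x, radius=2.0):
    # orthogonal projection onto the ball B(0, radius)
    n = np.linalg.norm(x)
    return x if n <= radius else x * (radius / n)

tau = 0.4          # step size, chosen below 1/L with L = 1
x = np.array([1.5, -1.0])
for k in range(300):
    y = proj_C(x - tau * f(x))        # prediction step
    x = proj_C(x - tau * f(y))        # correction step

print(np.linalg.norm(x) < 1e-3)
```

With inexact projections (each computed up to the error δ of the text), the iterates can only be expected to reach the larger attracting set S_ε.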
We suppose that the mapping f is Lipschitz on all bounded subsets of X and that
B(0, M₀) ∩ S ≠ ∅, (12.5)
ε₀ > 0 satisfy
Assume that
{x_i}_{i=0}^∞ ⊂ X, {y_i}_{i=0}^∞ ⊂ X, {τ_i}_{i=0}^∞ ⊂ [τ̃, τ], (12.12)
‖x₀‖ ≤ M₀, (12.13)
‖x_j − y_j‖ ≤ 2ε₀,
S ⊂ {y ∈ X : ‖y‖ ≤ 2}
and the assertion of Theorem 12.2 holds with M₁ = 400 [see (12.6)], L = 1 [see (12.7)], τ̃ = 2^{−1}, τ = 3/4 [see (12.8)],
ε₀ = 5^{−1}·10^{−3}·ε
[see (12.9)],
δ = 2^{−1}·10^{−11}·ε²
[see (12.10)] and with k, which is the smallest integer larger than 10^{12}·16·ε^{−2} [see (12.11)].
The following example demonstrates that the set S_ε can be easily calculated if the mapping f is strongly monotone.
Example 12.4. Let r̄ ∈ (0, 1). Set
C_{r̄} = {x ∈ X : ‖x − P_C(x)‖ ≤ r̄}.
Remark 12.5. Note that inequality (12.19) holds if f is strongly monotone with a constant ᾱ on C_{r̄}.
Let ε > 0 and x ∈ S_ε. Then for all y ∈ C,
⟨f(x), y − x⟩ ≥ −ε‖y − x‖
and in particular
‖x − u*‖ ≤ (2ᾱ^{−1}ε)^{1/2}
and
S_ε ⊂ {x ∈ X : ‖x − u*‖ ≤ (2ᾱ^{−1}ε)^{1/2}}.
12 The Extragradient Method for Solving Variational Inequalities
Note that the constant ᾱ can be obtained by analyzing an explicit form of the mapping f.
In the next example we show what the set S_ε is when C = X.
Example 12.6. Assume that C = X. It is easy to see that
S = {x ∈ X : f(x) = 0}
and
S_ε ⊂ {x ∈ X : ‖f(x)‖ ≤ 2ε}.
v 2 B.0; ı/
yi D xi f .xi / D .1 /xi ;
X
n1
2 n
xn D .1 C / x0 C .1 C 2 /i v ! . 2 /1 v as n ! 1:
iD0
12.2 Auxiliary Results
Thus any
Assume that a constant M̃ > ‖x*‖ is given. (Note that M̃ can be known a priori or obtained by analyzing an explicit form of the function F.) Let ε ∈ (0, 1) and
x ∈ S_ε ∩ B(0, M̃).
and, in particular,
F.x / F.x/
Q :
kx x k 2 M
Thus
We use the assumptions, definitions, and the notation introduced in Sect. 12.1.
Let
C ⊂ D (12.26)
and
ũ = P_D(u − τf(v)). (12.27)
Then
‖f(u)‖ ≤ M₁.
Together with (12.21), (12.24), and Lemma 2.2, this implies that
⟨ũ − v, (u − τf(u)) − v⟩ ≤ 0.
Set
z = u − τf(v). (12.32)
Together with (12.27), (12.30), (12.32), and (12.33) this implies that
‖ũ − u*‖² ≤ ‖z − u*‖² − ‖z − P_D(z)‖²
= ‖u − τf(v) − u*‖² − ‖u − τf(v) − ũ‖²
= ‖u − u*‖² − ‖u − ũ‖² + 2τ⟨u* − ũ, f(v)⟩
≤ ‖u − u*‖² − ‖u − ũ‖² + 2τ⟨v − ũ, f(v)⟩.
Hence
‖ũ − u*‖² ≤ ‖u − u*‖² + 2τ⟨v − ũ, f(v)⟩ − ⟨u − v + v − ũ, u − v + v − ũ⟩
= ‖u − u*‖² − ‖u − v‖² − ‖v − ũ‖² + 2⟨ũ − v, u − v − τf(v)⟩
≤ ‖u − u*‖² − ‖u − v‖² − ‖v − ũ‖² + 2τ⟨ũ − v, f(u) − f(v)⟩. (12.34)
Assume that
Then
Proof. Set
Clearly,
‖x̃ − u*‖² = ‖x̃ − z + z − u*‖²
= ‖x̃ − z‖² + 2⟨x̃ − z, z − u*⟩ + ‖z − u*‖²
≤ ‖x̃ − z‖² + 2‖x̃ − z‖‖z − u*‖ + ‖z − u*‖². (12.43)
‖v − y‖ ≤ δ. (12.44)
It follows from (12.41), (12.35), Lemma 2.2, (12.39), and (12.36) that
By (12.46), (12.41), Lemma 2.2, (12.45), (12.40), (12.37), (12.38), and (12.44),
By (12.5) there is
u* ∈ S ∩ B(0, M₀). (12.48)
Assume that (12.53) holds. Then by (12.53), (12.14), (12.10), and the inequality ε₀ < 1,
‖x_{i+1} − u*‖²
≤ 4δ(1 + 2M₀) + ‖x_i − u*‖² − ε₀²(1 − τ²L²)
≤ ‖x_i − u*‖² − ε₀²(1 − τ²L²)2^{−1}. (12.55)
and
This contradicts (12.11). The contradiction we have reached proves that case (a)
does not hold. Then case (b) holds and there is an integer j 2 f0; : : : ; kg guaranteed
by (b). Then (12.16) and (12.17) hold.
Assume that an integer j ∈ [0, k] satisfies (12.17). (Clearly, in view of (b), such an integer j is unique.) Then
By (12.64), (12.57), (12.48), (12.6), (12.9), (12.61), (12.65), and (12.4), for each ξ ∈ C,
⟨f(ȳ), ξ − ȳ⟩
≥ ⟨f(ȳ), ξ − x_j⟩ − ‖f(ȳ)‖‖x_j − ȳ‖
≥ ⟨f(ȳ), ξ − x_j⟩ − 3M₁ε₀
≥ ⟨f(x_j), ξ − x_j⟩ − ‖f(ȳ) − f(x_j)‖‖ξ − x_j‖ − 3M₁ε₀
≥ −3ε₀τ̃^{−1}(1 + M₁) − 3ε₀τ̃^{−1}‖ξ − x_j‖ − 3ε₀L‖ξ − x_j‖ − 3M₁ε₀
≥ −3ε₀(M₁ + τ̃^{−1}(1 + M₁)) − 3ε₀(L + τ̃^{−1})(‖ξ − ȳ‖ + ‖ȳ − x_j‖)
≥ −3ε₀(L + τ̃^{−1})‖ξ − ȳ‖ − 3ε₀(M₁ + τ̃^{−1}(1 + M₁) + (L + τ̃^{−1})).
12.4 The Finite-Dimensional Case

We use the assumptions, definitions, and notation introduced in Sect. 12.1 and we prove the following result.
Theorem 12.11. Let X = Rⁿ, τ ∈ (0, 1), M₀ > 0 be such that
B(0, M₀) ∩ S ≠ ∅,
ε₀ < 1, τL < 1.
Then there exist δ ∈ (0, ε) and an integer k ≥ 1 such that for each {x_i}_{i=0}^∞ ⊂ Rⁿ and each {y_i}_{i=0}^∞ ⊂ Rⁿ which satisfy ‖x₀‖ ≤ M₀ and, for each integer i ≥ 0,
Theorem 12.11 follows immediately from Theorem 12.2 and the following result.
Lemma 12.12. Let M₀ > 0, ε > 0. Then there exists γ ∈ (0, ε) such that for each
z ∈ S_γ ∩ B(0, M₀),
inf{‖z − u‖ : u ∈ S} ≤ ε.
z = lim_{k→∞} z^{(k)}.
This contradicts (12.66). The contradiction we have reached proves Lemma 12.12.
We use the assumptions, definitions, and notation introduced in Sect. 12.1. Let X = Rⁿ. For each x ∈ Rⁿ and each A ⊂ Rⁿ set
S ⊂ B(0, M̄₀ − 2), (12.67)
f(B(0, 3M̄₀ + 1)) ⊂ B(0, M̄₁), f(B(0, 3M̄₀ + 3M̄₁ + 1)) ⊂ B(0, M̄₂). (12.68)
Assume that
M₀ > M̄₀ + M̄₁ + M̄₂, M₁ > 0, L > 0, (12.69)
{δ_i}_{i=0}^∞ ⊂ (0, ε], lim_{i→∞} δ_i = 0 (12.75)
and let γ ∈ (0, 1). Then there exists a natural number k₀ such that for each pair of sequences {x_i}_{i=0}^∞ ⊂ Rⁿ and {y_i}_{i=0}^∞ ⊂ Rⁿ which satisfies
‖x₀‖ ≤ M₀ (12.76)
the inequality
d(x_i, S) ≤ ε
‖u₀‖ ≤ M₀. (12.79)
Set
k₀ = 2 + k̄ + k₁ + k₂. (12.83)
for each integer i ≥ 0 equation (12.77) holds. Assume that an integer j ≥ 0 satisfies
‖x_j‖ ≤ M₀. (12.84)
kxp k M0 :
N kxi k M0 :
i 2 Œj C 1; j C k; (12.86)
N 0 2 C 1=4:
kxj k M (12.87)
By Lemma 2.2, (12.73), (12.85), (12.87), (12.68), (12.72), (12.77), and (12.75),
N 2:
kf .yj /k M (12.89)
0 j0 k2 : (12.92)
j0 C 1 C kN k2 : (12.93)
N and kxj1 k M0 :
j1 2 Œj0 C 1; j0 C 1 C k (12.94)
N
j1 > k2 ; j1 k2 j1 j0 1 C k: (12.95)
kxj k M0 : (12.96)
‖x_{j+1}‖ ≤ M₀. (12.98)
By (P3), (12.98), (12.77), (12.95), and (12.82) there is an integer i ∈ [j + 1, j + k₁ + 1] for which d(x_i, S) ≤ ε/8. Thus we have shown that the following property holds:
(P5) If an integer j ≥ j₁ and ‖x_j‖ ≤ M₀, then there is an integer i ∈ [j + 1, j + 1 + k₁] such that d(x_i, S) ≤ ε/8.
Properties (P5), (12.94), (12.67), and (12.69) imply that there exists a sequence of natural numbers {j_p}_{p=1}^∞ such that for each integer p ≥ 1,
1 ≤ j_{p+1} − j_p ≤ 1 + k₁. (12.99)
We show that
Set
Let p 2 be an integer. We show that for each integer l satisfying 0 l < jpC1 jp ,
that
kxjp ClC1 u k
kxjp Cl u k C .4ıjp Cl .1 C M0 //1=2 =8 C l0
N 2 .1 C M0 //1=2 =8 C l0 C k1 =4 D =8 C .l C 1/0 :
C.4ık1 1
12.5 A Convergence Result
Thus by induction we have shown that relation (12.102) holds for all l = 0, …, j_{p+1} − j_p, and it follows from (12.102), (12.99), and (12.101) that for all integers l = 0, …, j_{p+1} − j_p − 1,
Since the inequality above holds for all integers p ≥ 2, we conclude that
j₂ ≤ k₁ + j₁ + 1 ≤ k₁ + 2 + k̄ + k₂ = k₀,
d(x_i, S) ≤ ε/2
Let (X, ⟨·,·⟩) be a Hilbert space equipped with an inner product ⟨·,·⟩ which induces a complete norm ‖·‖. We denote by Card(A) the cardinality of the set A. For every point x ∈ X and every nonempty set A ⊂ X define
B(x, r) = {y ∈ X : ‖x − y‖ ≤ r}.
Moreover,
‖P_C(x) − P_C(y)‖ ≤ ‖x − y‖
and
⟨z − P_C(x), x − P_C(x)⟩ ≤ 0.
Note that this set was introduced in Chap. 12 and that every point x ∈ S_ε(f, C) is an ε-approximate solution of the variational inequality associated with the pair (f, C) ∈ L₁. The examples considered in Chap. 12 show that elements of S_ε(f, C) can be considered as ε-approximate solutions of the corresponding variational inequality.
We suppose that for every mapping T ∈ L₂ the set of its fixed points is nonempty,
l ≥ Card(L₁ ∪ L₂),
x_{k+1} := [A(k)](x_k), k = 0, 1, … .
According to the results known in the literature, this sequence should converge
weakly to an element of S. In this chapter, we study the behavior of the sequences
generated by A 2 R taking into account computational errors which are always
present in practice. Namely, in practice the algorithm associated with A 2 R
generates a sequence fxk g1
kD0 such that for each integer k 0,
and if
with a constant ı > 0 which depends only on our computer system. Surely, in this
situation one cannot expect that the sequence fxk g1kD0 converges to the set S. Our
goal is to understand what subset of X attracts all sequences fxk g1
kD0 generated by
algorithms associated with A ∈ R. In Chap. 12 we showed that in the case when L₂ = ∅ and the set L₁ is a singleton, this subset of X is the set of ε-approximate solutions of the corresponding variational inequality with some ε > 0 depending on δ. In this chapter we generalize the main result of Chap. 12 and show that in the general case (see Theorem 13.1 stated below) this subset of X is the set
B.0; M0 / \ S 6D ; (13.11)
and
L < 1: (13.14)
an integer
Assume that
A 2 R; fxk g1
kD0 X; kx0 k M0 (13.18)
and if
kxi xj k
Note that Theorem 13.1 provides estimates for the constants δ and n₀ which follow from relations (13.15)–(13.17). Namely, δ = c₁ε² and n₀ = c₂ε^{−2}, where c₁ and c₂ are positive constants depending only on M₀.
13 A Common Solution of a Family of Variational Inequalities

Let ε ∈ (0, 1], a positive number δ be defined by relations (13.15) and (13.17), and let an integer n₀ ≥ 1 satisfy inequality (13.16). Assume that we apply an algorithm associated with a mapping A ∈ R under the presence of computational errors bounded from above by a positive constant δ and that our goal is to find an ε-approximate solution x ∈ S_ε. It is not difficult to see that Theorem 13.1 also answers an important question: how can we find an iteration number i such that x_i ∈ S_ε? According to Theorem 13.1, we should find the smallest integer q ∈ [0, n₀ − 1] such that for every integer i ∈ [ql, (q + 1)l − 1] properties (P3) and (P4) hold and the relation ‖x_i‖ ≤ 3M₀ + 1 is true. Then the inclusion x_i ∈ S_ε is valid for all integers i ∈ [ql, (q + 1)l].
Consider the following convex feasibility problem. Suppose that C₁, …, C_m are nonempty closed convex subsets of X, where m is a natural number, such that the set C = ∩_{i=1}^m C_i is also nonempty. We are interested in finding a solution of the feasibility problem, i.e., a point x ∈ C.
For every point x ∈ X and every integer i = 1, …, m there exists a unique element P_i(x) ∈ C_i such that
‖x − P_i(x)‖ = inf{‖x − y‖ : y ∈ C_i}.
The feasibility problem is a particular case of the problem discussed above with L₁ = ∅ and L₂ = {P_i : i = 1, …, m}.
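A minimal sketch of this feasibility setting (with two hypothetical sets in R², not taken from the text) cycles through the projections P₁, …, P_m:

```python
import numpy as np

# Hypothetical sets: the halfplane C1 = {x : x[0] >= 1} and the ball C2 = B(0, 3).
def P1(x):
    # projection onto C1
    return np.array([max(x[0], 1.0), x[1]])

def P2(x):
    # projection onto C2
    n = np.linalg.norm(x)
    return x if n <= 3.0 else x * (3.0 / n)

x = np.array([-5.0, 4.0])
for k in range(100):
    x = P2(P1(x))   # one sweep of the cyclic algorithm A(k)

print(x[0] >= 1.0 - 1e-6 and np.linalg.norm(x) <= 3.0 + 1e-6)
```

Computational errors in each projection would, as in the theorems above, limit the iterates to a neighborhood of C₁ ∩ C₂ rather than the intersection itself.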
and let
Assume that
x 2 B.u ; m0 /; y 2 X; ky PC .x f .x//k ı;
xQ 2 X; kQx PC .x f .y//k ı:
Then
kQx u k2 4ı.1 C m0 / C kx u k2 .1 2 L2 /kx PC .x f .x//k2 :
13.2 Auxiliary Results
B(0, M₀) ∩ S ≠ ∅, (13.23)
and let
Let
Then
u* ∈ B(0, M₀) ∩ S. (13.30)
Combined with (13.24), (13.27), and (13.30) the relation above implies that
Combined with (13.26) and (13.27) the relation above implies that
Proof. It is not difficult to see that inequality (13.33) follows from (13.14), (13.17) and the relations l ≥ 1, c̄ < 1 and ε₁ < 1/2. Relation (13.34) follows from (13.17), (13.14) and the inequality l ≥ 1. Relation (13.35) follows from (13.17) and the inequalities l ≥ 1 and c̄ < 1.
u* ∈ B(0, M₀) ∩ S. (13.36)
A(i) = T ∈ L₂; (13.40)
Assume that (13.40) is valid. Then it follows from (13.19), (13.36), (13.40),
(13.5), (13.39), Lemma 13.4, and (13.33) that
In view of (13.39), (13.42), the inclusions i 2 Œpl; .p C 1/l 1 and 2 .0; 1/ and
(13.15),
‖x_{i+1} − u*‖² − ‖x_i − u*‖²
≤ δ(‖x_{i+1} − u*‖ + ‖x_i − u*‖) ≤ 2δ(2M₀ + 1). (13.43)
‖x_i − u*‖ ≤ 2M₀ + 1
and
(Note that the first inequality of (13.45) follows from (13.44), the second inequality
follows from (13.17) and the inequalities cN < 1 and l 1, and the third inequality
follows from (13.39).)
Thus by induction we proved that for all integers i D pl; : : : ; .p C 1/l the inequality
holds and that for all integers i D pl; : : : ; .p C 1/l 1 the inequality
is valid.
By (13.15), we have shown that the following property holds:
(P5) If a nonnegative integer p satisfies the inequality
then we have
and
Assume that an integer qQ 2 Œ0; n0 1 and that for every integer p 2 Œ0; qQ the
following property holds:
(P6) there exists i 2 fpl; : : : ; .p C 1/l 1g such that (P3) and (P4) do not hold.
Assume now that an integer p 2 Œ0; qQ satisfies
In view of property (P5) and relation (13.48), inequalities (13.46) and (13.47) are
valid.
Property (P6) implies that there exists an integer j 2 fpl; : : : ; .p C 1/l 1g such
that properties (P3) and (P4) do not hold with i D j. Evidently, one of the following
cases holds:
A(j) = T ∈ L₂; (13.49)
Assume that relation (13.49) holds. Since property (P3) does not hold with i D j
we have
Hence
Assume that relation (13.50) is valid. Then relations (13.50), (13.36), (13.46),
(13.12), (13.13), (13.20), and (13.21) imply that all the assumptions of Lemma 13.2
hold with
x = x_j, y = v_j, x̃ = x_{j+1}, m₀ = 2M₀ + 1,
‖u* − x_{j+1}‖²
≤ 4δ(2 + 2M₀) + ‖u* − x_j‖² − (1 − (τL)²)‖x_j − P_C(x_j − τf(x_j))‖². (13.55)
‖x_j − u*‖² − ‖x_{j+1} − u*‖²
≥ (1 − (τL)²)(9/16)ε₁² − 4δ(2 + 2M₀)
≥ (1 − (τL)²)ε₁²/2. (13.58)
and
4M₀² ≥ ‖u* − x₀‖²
≥ ‖u* − x₀‖² − ‖u* − x_{(q̃+1)l}‖²
= Σ_{p=0}^{q̃} [‖u* − x_{pl}‖² − ‖u* − x_{(p+1)l}‖²]
and
We assumed that an integer q̃ ∈ [0, n₀ − 1] and that for every integer p ∈ [0, q̃] property (P6) holds, and proved that q̃ + 1 < n₀.
This implies that there exists an integer q 2 Œ0; n0 1 such that for every integer
p satisfying 0 p < q, property (P6) holds and that the following property holds:
(P8) For every integer i 2 fql; : : : ; .q C 1/l 1g properties (P3) and (P4) hold.
Property (P7) (with q̃ = q − 1) implies that
and that for all integers i D pl; : : : ; .p C 1/l 1 properties (P3) and (P4) hold.
Let
A(i) = T ∈ L₂; (13.68)
Assume that relation (13.68) is valid. Then in view of (13.68) and property (P3),
‖x_{i+1} − x_i‖ ≤ ε₁. (13.70)
It follows from (13.11), (13.69), (13.12), (13.13), (13.14), (13.67), (13.66), (13.72),
(13.73), and (13.33) that all the assumptions of Lemma 13.3 hold with
x = x_i, y = v_i, x̃ = x_{i+1}
(and with the constants M0 ; M1 ; L as in Theorem 13.1), and this implies that
‖x_{i+1} − x_i‖
≤ 2δ + (1 + τL)‖x_i − P_C(x_i − τf(x_i))‖
≤ 2δ + (5/4)(1 + τL)ε₁
= (5/4)(1 + τL)ε₁ + 2δ < 5ε₁. (13.74)
(Note that the second inequality in (13.74) follows from (13.73), the third one
follows from (13.69) and the last inequality follows from (13.14) and (13.33).)
13.3 Proof of Theorem 13.1

In view of Lemma 2.2, (13.73), (13.69), (13.66), and (13.12), for every point ξ ∈ C,
and
hf .xi /; xi i
21 1 k xi k 41 12 21 1 M1
for each 2 C: (13.75)
Set
(Note that the inclusion in (13.78) follows from (13.76), the inequality 1 < 1=2
and (13.66), and the last inequality in (13.78) follows from (13.12).)
In view of (13.75), (13.77), (13.78), and (13.15), for every point 2 C,
hf .Ny/; yN i
hf .Ny/; xi i kf .Ny/kkxi yN k
hf .Ny/; xi i 2M1 1
hf .xi /; xi i kf .Ny/ f .xi /kk xi k 2M1 1
21 1 k xi k 41 1
2M1 1 1 2L1 k xi k 2M1 1
ȳ ∈ S_ε(f, C).
‖x_i − x_{i+1}‖ ≤ 5ε₁
hold.
In view of properties (P9) and (P10), for every integer i ∈ {pl, …, (p + 1)l − 1},
‖x_i − x_{i+1}‖ ≤ 5ε₁.
This implies that for every pair of integers i; j 2 fpl; : : : ; .p C 1/lg, we have
[see (13.15)].
Let j 2 fpl; : : : ; .p C 1/lg. Assume that T 2 L2 . In view of (13.18) and property
(P1) there exists an integer i 2 fpl; : : : ; .p C 1/l 1g such that
A.i/ D T:
x_i ∈ Fix_{2^{−1}ε₁}(T)
and
‖x_i − T(x_i)‖ ≤ 2^{−1}ε₁.
d(x_i, S_ε(f, C)) ≤ 2^{−1}ε₁.
Combined with relations (13.79) and (13.15) (the choice of 1 ) the inequality above
implies that
13.4 Examples
In this section we present examples for which Theorem 13.1 can be used.
Example 13.5. Let p ≥ 1 be an integer, let C_i, i = 1, …, p, be nonempty closed convex subsets of the Hilbert space X and let, for every integer i ∈ {1, …, p}, g_i : X → R¹ be a convex Fréchet differentiable function and g′_i(x) ∈ X be its Fréchet derivative at a point x ∈ X. We assume that for all integers i = 1, …, p the mapping g′_i : X → X is Lipschitzian on all bounded subsets of X.
Consider the following multi-objective minimization problem:
Find x ∈ ∩_{i=1}^p C_i such that g_i(x) = inf{g_i(z) : z ∈ C_i} for all i = 1, …, p.
This problem is associated with the following problem:
Find x ∈ ∩_{i=1}^p C_i such that for all i = 1, …, p,
⟨g′_i(x), y − x⟩ ≥ 0 for all y ∈ C_i.
As shown in Example 13.5, we can apply Theorem 13.1 to this problem with f_i = g′_i, i = 1, 2.
Now we define the constants which appear in Theorem 13.1. Set l = 2. Clearly,
C₁ ∩ C₂ ⊂ {x = (x₁, x₂, x₃, x₄) ∈ R⁴ : |x₁| ≤ 10, |x₂| ≤ 10, x₃ = 2, x₄ = 2} ⊂ B(0, 16).
Thus we can set M₀ = 16. It is easy to see that the set of solutions of our problem is nonempty (thus M₁ = 1530) and that the functions f₁, f₂ are Lipschitzian on R⁴ with the Lipschitz constant L = 12. Put τ = 16^{−1}, c̄ = 1/2.
We apply Theorem 13.1 with these constants and with ε = 10^{−3}. Then (13.15) implies that we can set ε₁ = 4^{−1}·10^{−7}. By (13.16), we have
Note that this example can also be considered as an example of a convex feasibility
problem
or equivalently
Now we describe how the subgradient algorithm is applied for our example.
First of all, note that for any y D .y1 ; y2 ; y3 ; y4 / 2 R4 ,
We apply Theorem 13.1 with x₀ = (0, 0, 0, 0) and A ∈ R such that for each integer i ≥ 0,
and
‖v_{2i+1} − P_{C₂}(x_{2i+1} − 16^{−1}f₂(x_{2i+1}))‖ ≤ δ,
‖x_{2i+2} − P_{C₂}(x_{2i+1} − 16^{−1}f₂(v_{2i+1}))‖ ≤ δ.
For every nonnegative integer p we calculate
It is easy to show that this problem is equivalent to the following problem, which is a particular case of the problem discussed in this section and for which Theorem 13.1 was stated:
Find x ∈ ∩_{j=1}^q D_j such that for all i = 1, …, p,
x ∈ C_i and ⟨g′_i(x), y − x⟩ ≥ 0 for all y ∈ C_i.
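One way to picture the alternating algorithm A of this kind of example is the following sketch with simplified hypothetical data (identity maps and two balls instead of the f₁, f₂, C₁, C₂ above):

```python
import numpy as np

# Hypothetical data: f1 = f2 = identity (gradients of x -> ||x||^2 / 2),
# C1, C2 two balls whose intersection contains the common solution 0.
f1 = f2 = lambda x: x
P1 = lambda x: x * min(1.0, 2.0 / np.linalg.norm(x))   # projection onto B(0, 2)
P2 = lambda x: x * min(1.0, 1.5 / np.linalg.norm(x))   # projection onto B(0, 1.5)

tau = 1.0 / 16.0
x = np.array([3.0, -2.0])
for i in range(400):
    f, P = (f1, P1) if i % 2 == 0 else (f2, P2)  # A alternates the two pairs
    v = P(x - tau * f(x))                        # prediction step
    x = P(x - tau * f(v))                        # correction step

print(np.linalg.norm(x) < 1e-3)                  # x approaches the common solution
```

In the setting of the chapter each projection would additionally carry an error of size δ, and the iterates would only reach the attracting set of approximate common solutions.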
Let (Y, ‖·‖) be a Banach space and −∞ < a < b < ∞. A function x : [a, b] → Y is strongly measurable on [a, b] if there exists a sequence of functions x_n : [a, b] → Y, n = 1, 2, …, such that for any integer n ≥ 1 the set x_n([a, b]) is countable and the set {t ∈ [a, b] : x_n(t) = y} is Lebesgue measurable for any y ∈ Y, and x_n(t) → x(t) as n → ∞ in (Y, ‖·‖) for almost every t ∈ [a, b].
The function x : [a, b] → Y is Bochner integrable if it is strongly measurable and the integral ∫_a^b ‖x(t)‖ dt is finite.
If x : [a, b] → Y is a Bochner integrable function, then for almost every (a.e.) t ∈ [a, b],
lim_{Δt→0} (Δt)^{−1} ∫_t^{t+Δt} ‖x(τ) − x(t)‖ dτ = 0.
Let −∞ < τ₁ < τ₂ < ∞. Denote by W^{1,1}(τ₁, τ₂; Y) the set of all functions x : [τ₁, τ₂] → Y for which there exists a Bochner integrable function u : [τ₁, τ₂] → Y such that
x(t) = x(τ₁) + ∫_{τ₁}^t u(s) ds, t ∈ (τ₁, τ₂]
(see, e.g., [11, 27]). It is known that if x ∈ W^{1,1}(τ₁, τ₂; Y), then this equation defines a unique Bochner integrable function u which is called the derivative of x and is denoted by x′.
B(x, r) = {y ∈ X : ‖x − y‖ ≤ r},
‖x − P_D(x)‖ = inf{‖x − y‖ : y ∈ D}.
Set
and
ε₀ = 2δ(2M + 1)γ₁^{−1}, T₀ > 4M²(γ₁ε₀)^{−1} (14.1)
and
Then
Then x : [0, T] → X is differentiable and x′(t) = u(t) for almost every t ∈ [0, T].
Assume that
We claim that the restriction of g to the set {x(t) : t ∈ [0, T]} is Lipschitzian. Indeed, since the set {x(t) : t ∈ [0, T]} is compact, the closure of its convex hull C is both compact and convex, and so the restriction of g to C is Lipschitzian. Hence the function (g ∘ x)(t) := g(x(t)), t ∈ [0, T], is absolutely continuous. It follows that for almost every t ∈ [0, T], both the derivatives x′(t) and (g ∘ x)′(t) exist:
Proof. There exist a neighborhood U of x.t/ in X and a constant L > 0 such that
Let ε > 0 be given. In view of (14.6), there exists δ > 0 such that
x(t + h), x(t) + hx′(t) ∈ U for each h ∈ [−δ, δ] ∩ [−t, T − t], (14.10)
14.4 Proof of Theorem 14.1
Let
|g(x(t + h)) − g(x(t) + hx′(t))| ≤ L‖x(t + h) − x(t) − hx′(t)‖ < Lε|h|. (14.13)
Clearly,
Assume that the theorem does not hold. Then there exists
z ∈ B(0, M) (14.16)
such that
f(x(t)) > f(z) + ε₀ for all t ∈ [0, T₀]. (14.17)
Set
φ(t) = ‖z − x(t)‖², t ∈ [0, T₀]. (14.18)
14 Continuous Subgradient Method

In view of Corollary 14.3, for a.e. t ∈ [0, T₀], there exist the derivatives x′(t) and φ′(t) such that
It follows from (14.2), (14.22), (14.23), and (14.24) that for almost every t ∈ [0, T₀],
φ(0) ≤ 4M². (14.26)
‖z − x(t)‖ = φ(t)^{1/2} ≤ 2M + 1.
τ ∈ (0, T₀]
such that
‖z − x(τ)‖ = 2M + 1. (14.28)
T₀ ≤ 4M²(γ₁ε₀)^{−1}.
This contradicts the choice of T₀ (see (14.1)). The contradiction we have reached proves Theorem 14.1.
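A discrete-time sketch of the continuous subgradient trajectory x′(t) ∈ −∂f(x(t)) can be obtained by explicit Euler steps; the function f(x) = |x| and the step size below are hypothetical illustrations, not from the theorem:

```python
# Explicit Euler discretization of x'(t) ∈ -∂f(x(t)) for f(x) = |x|.
def subgrad_abs(x):
    # a measurable selection from the subdifferential ∂|x|
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0

h = 0.01            # time step of the discretization
x = 2.0             # initial condition x(0)
for k in range(400):        # integrate up to time T = 4
    x -= h * subgrad_abs(x)

print(abs(x) < 2 * h)  # the trajectory settles near argmin f = {0}
```

Shrinking the step h mimics the continuous-time limit; an added perturbation of size δ in each step would play the role of the computational error of the theorem.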
14.5 Continuous Subgradient Projection Method

C ⊂ U,
∂f(x; ·) ≠ ∅.
ε₀ = 2δ(10M + γ₂(L + 1))γ₁^{−1}, (14.36)
and let
γ₁ ≤ γ₂. (14.38)
d(x(0), C) ≤ δ (14.39)
and that for almost every t ∈ [0, T₀], there exists ℓ(t) ∈ X such that
Then
Proof. Assume that (14.42) does not hold. Then there exists
z2C (14.43)
such that
B(x(t), δ) ∩ C ≠ ∅. (14.46)
Define
B(x, δ) ∩ C ≠ ∅ (14.48)
It follows from (14.48) and (14.53) that for every t ∈ [0, T₀], there exists
x̂(t) ∈ C (14.54)
such that
In view of (14.32) and (14.40), for almost every t ∈ [0, T₀] there exists
ℓ̂(t) ∈ ∂f(x(t); x′(t)) (14.59)
such that
f⁰(x(t), x′(t)) = ⟨ℓ̂(t), x′(t)⟩, (14.60)
‖ℓ̂(t) − ℓ(t)‖ ≤ δ. (14.61)
and
f(x(t)) − f(z) ≤ ⟨ℓ̂(t), x(t) − z + x′(t)⟩ − ⟨ℓ̂(t), x′(t)⟩. (14.62)
Relations (14.45) and (14.63) imply that for almost every t 2 Œ0; T0 ,
‖x(t)‖ ≤ M. (14.65)
It follows from (14.33), (14.41), (14.45), and (14.65) that for almost every t 2
Œ0; T0 ,
By (14.33), (14.34), (14.38), (14.45), (14.53), (14.58), (14.59), (14.61), (14.63), and
(14.65),
It follows from (14.45), (14.61), (14.62), and (14.66) that for almost every t ∈ [0, T₀],
f(x(t)) − f(z)
≤ ⟨ℓ(t), x(t) − z + x′(t)⟩ + ⟨ℓ̂(t) − ℓ(t), x(t) − z + x′(t)⟩ − ⟨ℓ̂(t), x′(t)⟩ + ⟨ℓ̂(t) − ℓ(t), x′(t)⟩
≤ ⟨ℓ(t), x(t) − z + x′(t)⟩ − ⟨ℓ(t), x′(t)⟩ + 4Mδ. (14.71)
Relations (14.70) and (14.71) imply that for almost every t ∈ [0, T₀],
f(x(t)) − f(z)
≤ −γ₁^{−1}⟨x′(t), x(t) + x′(t) − z⟩ + γ₁^{−1}(4M + γ₂(L + 1))δ − ⟨ℓ(t), x′(t)⟩ + 4Mδ. (14.72)
and
It follows from (14.60), (14.75), and Corollary 14.3 that for almost every t ∈ [0, T₀],
and, integrating the inequality above over the interval [0, t], we obtain that for all t ∈ [0, T₀],
Relations (14.33), (14.34), (14.48), and (14.53) imply that for all t ∈ [0, T₀],
2^{−1}ε₀t = δt(10M + γ₂(L + 1))γ₁^{−1}.
This contradicts (14.37). The contradiction we have reached completes the proof of Theorem 14.4.
In Theorem 14.4, δ is the computational error. According to this result, we obtain a point ξ ∈ C_δ (see (14.47), (14.53)) such that
f(ξ) ≤ inf(f; C) + c₁δ
[see (14.36), (14.42)] during a period of time c₂δ^{−1} [see (14.37)], where c₁, c₂ > 0 are constants depending only on γ₁, γ₂, L, M.
Chapter 15
Penalty Methods
In this chapter we use the penalty approach in order to study constrained mini-
mization problems in infinite dimensional spaces. A penalty function is said to have
the exact penalty property if there is a penalty coefficient for which a solution of
an unconstrained penalized problem is a solution of the corresponding constrained
problem. Since we consider optimization problems in general Banach spaces, not
necessarily finite-dimensional, the existence of solutions of original constrained
problems and corresponding penalized unconstrained problems is not guaranteed.
For this reason we deal with approximate solutions and with an approximate exact
penalty property which contains the classical exact penalty property as a particular
case. In our recent research we established the approximate exact penalty property
for a large class of inequality-constrained minimization problems. In this chapter
we improve this result and obtain an estimate of the exact penalty.
Penalty methods are an important and useful tool in constrained optimization. See,
for example, [25, 33, 43, 45, 49, 57, 80, 85, 117, 121] and the references mentioned
there. In this chapter we use the penalty approach in order to study constrained
minimization problems in infinite dimensional spaces. A penalty function is said
to have the exact penalty property if there is a penalty coefficient for which a
solution of an unconstrained penalized problem is a solution of the corresponding
constrained problem.
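The exact penalty property can already be seen in a one-dimensional convex example. The following sketch is our own toy instance, not taken from the text: it minimizes a penalized objective on a grid and shows that once the penalty coefficient exceeds the relevant threshold, the unconstrained penalized problem recovers the constrained minimizer exactly.

```python
# minimize f(x) = (x - 2)^2 subject to g(x) = x <= 1; minimizer x* = 1.
# For a penalty coefficient Lambda > |f'(1)| = 2, the penalized problem
#   psi(x) = f(x) + Lambda * max(x - 1, 0)
# has the same minimizer, i.e. the penalty is "exact".

def psi(x, lam):
    return (x - 2.0) ** 2 + lam * max(x - 1.0, 0.0)

def argmin_on_grid(fn, lo=-1.0, hi=4.0, steps=50001):
    # Brute-force minimizer on a uniform grid (step 1e-4).
    best_x, best_v = lo, fn(lo)
    for k in range(1, steps):
        x = lo + (hi - lo) * k / (steps - 1)
        v = fn(x)
        if v < best_v:
            best_x, best_v = x, v
    return best_x

x_small = argmin_on_grid(lambda x: psi(x, 0.5))  # coefficient too small
x_exact = argmin_on_grid(lambda x: psi(x, 4.0))  # large enough coefficient
```

With the small coefficient the penalized minimizer (about 1.75) violates the constraint; with the large one it coincides with the constrained minimizer $x^* = 1$.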
The notion of exact penalization was introduced by Eremin [48] and Zangwill
[114] for use in the development of algorithms for nonlinear constrained optimiza-
tion. Since that time, exact penalty functions have continued to play an important
role in the theory of mathematical programming.
$B(x, r) = \{y \in X : \|x - y\| \le r\}.$
and
Let $n \ge 1$ be an integer. For every $\kappa \in (0, 1)$ denote by $\Omega_\kappa$ the set of all vectors $\lambda = (\lambda_1, \dots, \lambda_n) \in R^n$ such that

It is clear that for every vector $\lambda \in (0, \infty)^n$ the function $\psi_\lambda : X \to R^1 \cup \{\infty\}$ is bounded from below and lower semicontinuous and satisfies $\inf(\psi_\lambda) < \infty$. We associate with problem (P) the corresponding family of unconstrained minimization problems

$\psi_\lambda(z) \to \min, \quad z \in X.$  (P$_\lambda$)
(A4) For every positive number $M$ there is $M_1 > 0$ such that for every point $y \in X$ satisfying $f(y) \le M$ there exists a neighborhood $V$ of $y$ in $X$ such that if $z \in V$, then

Remark 15.1. Note that if the function $f$ is convex, then assumptions (A1)–(A4) hold with $h(z, y) = f(z) - f(y)$, $z \in X$, $y \in \operatorname{dom}(f)$. In this case $M_1 = 1$ for all $M > 0$. If the function $f$ is finite-valued and Lipschitzian on all bounded subsets of $X$, then assumptions (A1)–(A4) hold with $h(z, y) = \|z - y\|$ for all $z, y \in X$.
Let $\kappa \in (0, 1)$. The main result of [118] (Theorem 15.2, stated below) implies that if $\Lambda$ is sufficiently large, then any solution of problem (P$_{\Lambda\lambda}$) with $\lambda \in \Omega_\kappa$ is a solution of problem (P). Note that if the space $X$ is infinite-dimensional, then the existence of solutions of problems (P$_{\Lambda\lambda}$) and (P) is not guaranteed. In this case Theorem 15.2 implies that for each $\epsilon > 0$ there exists $\delta(\epsilon) > 0$, depending only on $\epsilon$, such that the following property holds:

If $\Lambda \ge \Lambda_0$, $\lambda \in \Omega_\kappa$ and if $x$ is a $\delta$-approximate solution of (P$_{\Lambda\lambda}$), then there exists an $\epsilon$-approximate solution $y$ of (P) such that $\|y - x\| \le \epsilon$.

Here $\Lambda_0$ is a positive constant which does not depend on $\epsilon$.
It should be mentioned that we deal with penalty functions whose penalty parameters for the constraints $g_1, \dots, g_n$ are $\Lambda\lambda_1, \dots, \Lambda\lambda_n$, respectively, where $\Lambda > 0$ and $(\lambda_1, \dots, \lambda_n) \in \Omega_\kappa$ for a given $\kappa \in (0, 1)$. Note that the vector $(1, 1, \dots, 1) \in \Omega_\kappa$ for any $\kappa \in (0, 1)$. Therefore our results also include the case $\lambda_1 = \dots = \lambda_n = 1$, in which a single parameter is used for all constraints. Note that it is sometimes advantageous, for numerical reasons, to use penalty coefficients $\Lambda\lambda_1, \dots, \Lambda\lambda_n$ with different parameters $\lambda_i$, $i = 1, \dots, n$; for example, when some of the constraint functions are very "small" and some of them are very "large."
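The benefit of per-constraint parameters can be sketched on a toy problem of our own devising (the data below are illustrative assumptions, not the text's): two constraints whose functions differ in scale by six orders of magnitude.

```python
# minimize f(x1, x2) = (x1 - 2)^2 + (x2 - 2)^2
# subject to g1(x) = 1000*(x1 - 1) <= 0   (a "large" constraint function)
#            g2(x) = 0.001*(x2 - 1) <= 0  (a "small" one); minimizer (1, 1).

def grid_argmin(fn, lo=0.0, hi=3.0, steps=30001):
    # Brute-force minimizer on a uniform grid (step 1e-4).
    xs = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return min(xs, key=fn)

Lam = 10.0

# One shared coefficient (lambda_1 = lambda_2 = 1): the "small" constraint
# is penalized far too weakly and its coordinate stays infeasible.
x2_shared = grid_argmin(
    lambda t: (t - 2.0) ** 2 + Lam * 1.0 * max(0.001 * (t - 1.0), 0.0))

# Rescaled coefficients lambda = (0.001, 1000) equalize the effective slopes,
# and both coordinates land on the constrained minimizer.
x1_scaled = grid_argmin(
    lambda t: (t - 2.0) ** 2 + Lam * 0.001 * max(1000.0 * (t - 1.0), 0.0))
x2_scaled = grid_argmin(
    lambda t: (t - 2.0) ** 2 + Lam * 1000.0 * max(0.001 * (t - 1.0), 0.0))
```

With the shared coefficient, the second coordinate settles near 1.995 (infeasible); with the rescaled coefficients, both coordinates equal 1.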
The next theorem is the main result of [118].
Theorem 15.2. Let $\kappa \in (0, 1)$. Then there exists a positive number $\Lambda_0$ such that for each $\epsilon > 0$ there exists $\delta \in (0, \epsilon)$ such that the following assertion holds:

If $\lambda \in \Omega_\kappa$, $\Lambda \ge \Lambda_0$ and if $x \in X$ satisfies

$\psi_{\Lambda\lambda}(x) \le \inf(\psi_{\Lambda\lambda}) + \delta,$

Note that Theorem 15.2 is just an existence result; it does not provide any estimate of the constant $\Lambda_0$. In this chapter we prove the main result of [119], which improves Theorem 15.2 and provides an estimate of the exact penalty $\Lambda_0$.
In view of (15.4) and (15.5), there exists a positive number M such that
By (15.7), we have
In view of (A4), there exists a positive number M1 such that the following property
holds:
(P1) for every point $y \in X$ satisfying $f(y) \le |f(\tilde x)| + 1$ there exists a neighborhood $V$ of $y$ in $X$ such that $f(z) - f(y) \le M_1 h(z, y)$ for all $z \in V$.
15.2 Proof of Theorem 15.4
By (15.4), (15.5), and assumption (A3), there exists a positive number $M_2$ such that

Remark 15.3. If the function $f$ is convex, then by Remark 15.1 we choose $h(z, y) = f(z) - f(y)$ for all $z \in X$ and all $y \in \operatorname{dom}(f)$, with $M_1 = 1$ for all $M > 0$, and then

Thus in this case $M_2$ can be any positive number such that $M_2 \ge f(\tilde x) - \inf(f)$. If the function $f$ is finite-valued and Lipschitzian on bounded subsets of $X$, then by Remark 15.1 we choose $h(z, y) = \|z - y\|$ for all $z, y \in X$, and $M_1$ is a Lipschitz constant of the restriction of $f$ to $B(0, M)$. In this case

and $M_2 = M$.
Let $\kappa \in (0, 1)$. Fix a number $\Lambda_0 > 1$ such that

$\kappa\sum_{i=1}^{n} (c_i - g_i(\tilde x)) > \max\{2\Lambda_0^{-1} M_1 M_2,\ 8\Lambda_0^{-2} M^2\}.$  (15.9)
Assume that the theorem does not hold. Then there exist
such that
$\psi_{\Lambda\lambda}(\bar x) \le \inf(\psi_{\Lambda\lambda}) + 2^{-1}\Lambda_0^{-1}\epsilon$  (15.12)

and

$\{y \in B(\bar x, \epsilon) \cap A : \psi_{\Lambda\lambda}(y) \le \psi_{\Lambda\lambda}(\bar x)\} = \emptyset.$  (15.13)
By (15.12) and Ekeland's variational principle [50] (see Theorem 15.19), there exists a point $\bar y \in X$ such that

$\psi_{\Lambda\lambda}(\bar y) \le \psi_{\Lambda\lambda}(\bar x),$  (15.14)

$\|\bar y - \bar x\| \le 2^{-1}\epsilon$  (15.15)

and

$\psi_{\Lambda\lambda}(\bar y) \le \psi_{\Lambda\lambda}(z) + \Lambda_0^{-1}\|z - \bar y\|$ for all $z \in X$.  (15.16)

In view of (15.13)–(15.15),

$\bar y \notin A.$  (15.17)
Define

$I_1 \ne \emptyset.$  (15.19)

Relations (15.3), (15.6), (15.10), (15.11), (15.12), (15.14), (15.18), and (15.19) imply that

$\psi_{\Lambda\lambda}(\bar x) + 1 \ge \psi_{\Lambda\lambda}(\bar y) + 1 = f(\bar y) + \Lambda\sum_{i \in I_1} \lambda_i (g_i(\bar y) - c_i) + 1.$  (15.20)
Property (P1), (15.21), and (15.22) imply that there exists an open neighborhood $V$ of the point $\bar y$ in $X$ such that

By (15.21), (15.5), (15.14), (15.12), (15.25), and (15.16), for every point $z \in B(\bar y, r) \cap \operatorname{dom}(f)$, we have
$\Lambda\sum_{i \in I_1} \lambda_i (g_i(z) - c_i) + \Lambda\sum_{i \in I_2 \cup I_3} \lambda_i \max\{g_i(z) - c_i, 0\}$
$\quad - \Lambda\sum_{i \in I_1} \lambda_i (g_i(\bar y) - c_i) - \Lambda\sum_{i \in I_2 \cup I_3} \lambda_i \max\{g_i(\bar y) - c_i, 0\}$
$= \psi_{\Lambda\lambda}(z) - \psi_{\Lambda\lambda}(\bar y) - f(z) + f(\bar y)$
$\ge -\Lambda_0^{-1}\|\bar y - z\| - f(z) + f(\bar y).$
Combined with (15.11) this relation implies that for every point $z \in B(\bar y, r)$, the function

$\Lambda\sum_{i \in I_1} \lambda_i g_i(z) + \Lambda\sum_{i \in I_2 \cup I_3} \lambda_i \max\{g_i(z) - c_i, 0\}$
$\quad - \Lambda\sum_{i \in I_1} \lambda_i g_i(\bar y) - \Lambda\sum_{i \in I_2 \cup I_3} \lambda_i \max\{g_i(\bar y) - c_i, 0\}$

is convex. Combined with the equality $h(\bar y, \bar y) = 0$ [see (A1)] this implies that (15.26) holds true for every point $z \in X$.
Since relation (15.26) is valid for $z = \tilde x$, relations (15.5), (15.10), (15.2), (15.11), (15.18), and (15.19) imply that

$\Lambda\sum_{i \in I_1} \lambda_i g_i(\tilde x) + \Lambda_0^{-1} M_1 h(\tilde x, \bar y) + 2\Lambda_0^{-1}\|\tilde x - \bar y\|$
$\ge \Lambda\sum_{i \in I_1} \lambda_i g_i(\bar y) > \Lambda\sum_{i \in I_1} \lambda_i c_i.$
Combined with (15.21), (15.5), (15.8), (15.22), (15.10), (15.2), assumption (A1), and the choice of $M_2$ (see Sect. 15.1) this implies that

$4\Lambda_0^{-2} M^2 + \Lambda_0^{-1} M_1 \sup\{h(\tilde x, z) : z \in X \text{ and } f(z) \le f(\tilde x) + 1\}$
$\ge 4\Lambda_0^{-2} M^2 + \Lambda_0^{-1} M_1 h(\tilde x, \bar y) \ge \sum_{i \in I_1} \lambda_i (c_i - g_i(\tilde x)) \ge \kappa\sum_{i=1}^{n} (c_i - g_i(\tilde x))$

and

$\kappa\sum_{i=1}^{n} (c_i - g_i(\tilde x)) \le 4\Lambda_0^{-2} M^2 + \Lambda_0^{-1} M_1 M_2.$
This contradicts (15.9). The contradiction we have reached proves Theorem 15.4.
obtain an estimate of the exact penalty. Using this exact penalty property we obtain
necessary and sufficient optimality conditions for the constrained minimization
problems.
Let $X$ be a vector space, let $X'$ be the set of all linear functionals on $X$, and let $Y$ be a vector space ordered by a convex cone $Y_+$ such that

for all $y_1, y_2 \in Y$

and

Set

$\phi(\infty) = \infty.$

The functional $\phi$ was used in [115, 116, 121] for the study of minimization problems with increasing objective functions. Here we use it in order to construct a penalty function.

The following auxiliary result is proved in Sect. 15.4.

Lemma 15.7. Let $y \in Y \setminus (-Y_+)$ and $l \in \partial\phi(y)$. Then

$B_X(x, r) = \{z \in X : \|x - z\| \le r\},$
$B_Y(y, r) = \{z \in Y : \|y - z\| \le r\}.$
We suppose that there exist $\tilde M > 0$, $\tilde r > 0$ and a nonempty set $\Omega \subset X$ such that

$\sup\{\|x\| : x \in \Omega\} < \infty.$

Remark 15.9. Clearly, $G$ possesses (P1) and (P2) if $G(X) \subset Y$ and $G$ is continuous. It is easy to see that $G$ possesses (P4) and (P5) if $Y = R^n$, $Y_+ = \{y \in R^n : y_i \ge 0,\ i = 1, \dots, n\}$, $G = (g_1, \dots, g_n)$ and the functions $g_i : X \to R^1 \cup \{\infty\}$, $i = 1, \dots, n$, are lower semicontinuous. In general, properties (P4) and (P5) are an infinite-dimensional version of the lower semicontinuity property.
In view of (15.36),

Remark 15.11. Note that assumption (A1) is a form of a local Lipschitz property for $f$ on the sublevel set $f^{-1}((-\infty, \tilde M + 1])$.
The following theorem is the first main result of this section.
Theorem 15.12. Assume that (A1) holds, $M_0 > 0$ is as guaranteed by (A1), and let $\Lambda_0 > 1$ satisfy

$\tilde r > 4(2\Lambda_0^{-1} + M_0\Lambda_0^{-1})(\sup\{\|z\| : z \in \Omega\} + M_1).$  (15.40)

Then for each $\kappa \in (0, 1)$, each $\Lambda \ge \Lambda_0$ and each $x \in X$ which satisfies
Corollary 15.13. Assume that (A1) holds, $M_0 > 0$ is as guaranteed by (A1) and let $\Lambda_0 > 1$ satisfy (15.40). Then for each $\Lambda \ge \Lambda_0$ and each sequence $\{x_i\}_{i=1}^{\infty} \subset X$ satisfying

$\inf(f; A) = \inf(\psi_\Lambda).$

Corollary 15.14. Assume that (A1) holds, $M_0 > 0$ is as guaranteed by (A1) and let $\Lambda_0 > 1$ satisfy (15.40). Then if $\Lambda \ge \Lambda_0$ and if $x \in X$ satisfies

then $x \in A$ and
Theorem 15.12 is proved in Sect. 15.5. In our second main result of this section we do not assume (A1). Instead, we assume that the function $f$ is convex and that the mapping $G$ is finite-valued.

Theorem 15.15. Assume that $G(X) \subset Y$, the function $f$ is convex, and let $\Lambda_0 > 1$ satisfy

$\tilde r > 4\Lambda_0^{-2}(\sup\{\|x\| : x \in \Omega\} + M_1) + 4\Lambda_0^{-1}(\tilde M - \inf(f)).$  (15.45)

Corollary 15.16. Let the assumptions of Theorem 15.15 hold. Then for each $\Lambda \ge \Lambda_0$ and each sequence $\{x_i\}_{i=1}^{\infty} \subset X$ satisfying (15.41) there exists a sequence $\{y_i\}_{i=1}^{\infty} \subset A$ such that (15.42) holds. Moreover, for each $\Lambda \ge \Lambda_0$,

$\inf(f; A) = \inf(\psi_\Lambda).$
Corollary 15.17. Let the assumptions of Theorem 15.15 hold. Then if $\Lambda \ge \Lambda_0$ and if $x \in X$ satisfies (15.43), then $x \in A$ and (15.44) holds.

Theorem 15.15 is proved in Sect. 15.6.

Using our exact penalty results we obtain necessary and sufficient optimality conditions for constrained minimization problems (P) with a convex function $f$.

Theorem 15.18. Assume that $G(X) \subset Y$, the function $f$ is convex, $\Lambda_0 > 1$ satisfies (15.45), $\Lambda \ge \Lambda_0$ and $\bar x \in X$. Then the following assertions are equivalent:
1. $\bar x \in A$ and $f(\bar x) = \inf(f; A)$.
2. $\psi_\Lambda(\bar x) = \inf(\psi_\Lambda)$.
3. There exist

$\partial\psi_\Lambda(\bar x) = \partial f(\bar x) + \partial(\Lambda\phi \circ (G(\cdot) - c))(\bar x)$
$= \partial f(\bar x) + \cup\{\partial(l \circ (G(\cdot) - c))(\bar x) : l \in \partial(\Lambda\phi)(G(\bar x) - c)\}.$

Now in order to complete the proof it is sufficient to note that assertion 2 holds if and only if

$0 \in \partial\psi_\Lambda(\bar x).$

$l_0(G(\bar x) - c) = \Lambda\phi(G(\bar x) - c) = 0$
Let

$\epsilon \in (0, \phi(y)/4)$

and

$l(z) \le \|z\|.$

$\|l\| \le 1.$

Combined with (15.46) this implies that $\|l\| = 1$. Lemma 15.7 is proved.
We may assume without loss of generality that $\phi(G(y_i) - c)$ is finite for all integers $i \ge 1$.

Let $\epsilon > 0$. By (15.28), for any integer $i \ge 1$ there exists $z_i \in Y$ such that

In view of (15.52) and (15.53), the sequence $\{\|z_i\|\}_{i=1}^{\infty}$ is bounded. Together with (15.53) this implies that the sequence $\{G(y_i) - c\}_{i=1}^{\infty}$ is bounded from above [see (P2)]. It follows from (P4) and (15.53) that

By (15.51), (15.54), and (P5) there exists a natural number $i_0$ such that for each integer $i \ge i_0$ there is $u_i \in Y$ which satisfies

$G(y) - c = (G(y) - u_i) + u_i - c$
$\le (G(y) - u_i) + G(y_i) - c \le (G(y) - u_i) + z_i.$  (15.56)
and

(It is easy to see that (P6) implies the validity of Theorem 15.12.)

Assume the contrary. Then there exist

such that

$\psi_\Lambda(\bar x) \le \inf(\psi_\Lambda) + 2^{-1}\Lambda_0^{-1}\epsilon,$  (15.59)

$\{y \in B_X(\bar x, \epsilon) \cap A : \psi_\Lambda(y) \le \psi_\Lambda(\bar x)\} = \emptyset.$  (15.60)
It follows from (15.59), Lemma 15.10, and Theorem 15.19 that there is $\bar y \in X$ such that

$\psi_\Lambda(\bar y) \le \psi_\Lambda(\bar x),$  (15.61)

$\psi_\Lambda(\bar y) \le \psi_\Lambda(z) + \Lambda_0^{-1}\|z - \bar y\|$ for all $z \in X$.  (15.63)

$\bar y \notin A.$  (15.64)
In view of (A1) and (15.65), there exists a neighborhood $V$ of $\bar y$ in $(X, \|\cdot\|)$ such that for each $z \in V$,

It follows from (15.61), (15.59), (15.67), (15.37), (15.63), and (15.58) that for each $z \in V$

and

$\Lambda\phi(G(z) - c) + (2\Lambda_0^{-1} + M_0\Lambda_0^{-1})\|z - \bar y\| \ge \Lambda\phi(G(\bar y) - c).$  (15.68)

$\tilde\psi(z) = \Lambda\phi(G(z) - c) + (2\Lambda_0^{-1} + M_0\Lambda_0^{-1})\|z - \bar y\|, \quad z \in X$  (15.69)

$0 \in \partial\tilde\psi(\bar y).$  (15.70)

such that

$\|l_0\| \le 2\Lambda_0^{-1} + M_0\Lambda_0^{-1}.$  (15.72)
$l_1 \in \partial(\Lambda\phi)(G(\bar y) - c)$  (15.73)

such that

Let

$x_h \in \Omega$  (15.78)

such that

$G(x_h) + h \le c.$  (15.79)

It follows from (15.78), (15.66), (15.72), (15.75), (15.76), and (15.79) that

$(2\Lambda_0^{-1} + M_0\Lambda_0^{-1})(M_1 + \sup\{\|x\| : x \in \Omega\}) \ge (2\Lambda_0^{-1} + M_0\Lambda_0^{-1})(\|x_h\| + \|\bar y\|)$
$\ge \|l_0\|\,(\|x_h\| + \|\bar y\|) \ge l_0(x_h - \bar y)$
$\ge l_1(G(x_h) - c) - l_1(G(\bar y) - c) \ge l_1(G(x_h) - c) - \Lambda\phi(G(\bar y) - c)$
$\ge l_1(G(x_h) - c) \ge -l_1(h).$  (15.80)

Since (15.80) holds for all $h$ satisfying (15.77), we conclude, using (15.76), that

$(2\Lambda_0^{-1} + M_0\Lambda_0^{-1})(M_1 + \sup\{\|x\| : x \in \Omega\})$

This contradicts (15.40). The contradiction we have reached proves (P6) and Theorem 15.12 itself.
We show that property (P6) (see Sect. 15.5) holds. (Note that (P6) implies the validity of Theorem 15.15.)

Assume the contrary. Then there exist

such that (15.59) and (15.60) hold. It follows from (15.59), Lemma 15.10, and Ekeland's variational principle [50] that there is $\bar y \in X$ such that (15.61)–(15.63) hold. By (15.60), (15.61), and (15.62),

$\bar y \notin A.$  (15.82)

Arguing as in the proof of Theorem 15.12 we show that (15.37), (15.35), (15.59), (15.61), (15.36), and (15.38) imply that
Put

$\tilde\psi(z) = \phi(G(z) - c) + \Lambda^{-1} f(z) + \Lambda^{-1}\Lambda_0^{-1}\|z - \bar y\|, \quad z \in X$  (15.85)

$0 \in \partial\tilde\psi(\bar y).$  (15.86)

such that

It follows from (15.87), (15.30), (15.31), and Proposition 15.6 that there exists

$l_0 \in \partial\phi(G(\bar y) - c)$  (15.89)

such that

Let

$x_h \in \Omega$  (15.94)

such that

$G(x_h) + h \le c.$  (15.95)
It follows from (15.94), (15.83), (15.88), (15.91), (15.87), (15.92), (15.36), and (15.95) that

$\Lambda_0^{-2}(M_1 + \sup\{\|x\| : x \in \Omega\}) \ge \Lambda_0^{-2}(\|x_h\| + \|\bar y\|)$
$\ge \|l_1 + \Lambda^{-1} l_2\|\,(\|x_h\| + \|\bar y\|) \ge (l_1 + \Lambda^{-1} l_2)(x_h - \bar y)$
$= l_1(x_h - \bar y) + \Lambda^{-1} l_2(x_h - \bar y)$
$\ge l_0(G(x_h) - c) - l_0(G(\bar y) - c) + \Lambda^{-1}(f(x_h) - f(\bar y))$
$\ge l_0(G(x_h) - c) - \phi(G(\bar y) - c) - \Lambda^{-1}(\tilde M - \inf(f))$
$\ge -l_0(h) - \Lambda^{-1}(\tilde M - \inf(f)),$

so that

$-l_0(h) \le \Lambda^{-1}(\tilde M - \inf(f)) + \Lambda_0^{-2}(M_1 + \sup\{\|x\| : x \in \Omega\}).$  (15.96)

Since the inequality above holds for all $h$ satisfying (15.93), it follows from (15.92) and (15.96) that

This contradicts (15.45). The contradiction we have reached proves (P6) and Theorem 15.15 itself.
15.7 An Application
Let $X$ be a Hilbert space equipped with an inner product $\langle\cdot,\cdot\rangle$ which induces the complete norm $\|\cdot\|$. We use the notation and definitions introduced in Sect. 15.1. Let $n$ be a natural number, let $g_i : X \to R^1 \cup \{\infty\}$, $i = 1, \dots, n$, be convex lower semicontinuous functions and let $c = (c_1, \dots, c_n) \in R^n$. Set

$\psi_\lambda(x) = f(x) + \sum_{i=1}^{n} \lambda_i \max\{g_i(x) - c_i, 0\}, \quad x \in X.$

Clearly, for each $\lambda \in (0, \infty)^n$ the function $\psi_\lambda : X \to R^1 \cup \{\infty\}$ is bounded from below and lower semicontinuous and satisfies $\inf(\psi_\lambda) < \infty$.

We suppose that the function $f$ is convex. By Remark 15.1, (A1)–(A4) hold with $h(z, y) = f(z) - f(y)$, $z \in X$, $y \in \operatorname{dom}(f)$. In this case $M_1 = 1$ for all $M > 0$. There is $M > 0$ such that [see (15.7)]

Clearly, (P1) holds with $M_1 = 1$. In view of Remark 15.3, the constant $M_2$ can be any positive number such that

$M_2 \ge f(\tilde x) - \inf(f).$

$\kappa\sum_{i=1}^{n} (c_i - g_i(\tilde x)) > \max\{2\Lambda_0^{-1} M_2,\ 8\Lambda_0^{-2} M^2\}.$
$x \in A$  (15.101)

such that

$B(0, M + 1) \subset U,$  (15.105)

$|f(x) - f(y)| \le L\|x - y\|, \quad |g_i(x) - g_i(y)| \le L\|x - y\|.$  (15.106)

In view of (15.98) and (15.106), the function $\psi_0$ is Lipschitzian on $U$ and for all $x, y \in U$,

$|\psi_0(x) - \psi_0(y)|$

In this section we use the projection onto the set $B(0, M)$, denoted by $P_{B(0,M)}$ and defined for each $z \in X$ by

$P_{B(0,M)}(z) = z$ if $\|z\| \le M$ and $P_{B(0,M)}(z) = Mz/\|z\|$ if $\|z\| > M$.

We apply the subgradient projection method, studied in Chap. 2, to the minimization of the function $\psi_0$ on the set $B(0, M)$. For each $\delta > 0$ set
Set

and

$\ell_0 \in \partial f(x_t) + B(0, \delta),$
$\ell_i \in \partial g_i(x_t) + B(0, \delta).$

Set

$\xi_t = \ell_0 + \Lambda\sum_{i=1}^{n} \lambda_i \ell_i.$

Now we can consider the best choice of $T$. It was explained in Chap. 2 that it should be of the same order as
It follows from (15.109), (15.114), (15.115), and property (P7) that there exist $y_0, y_1 \in A$ such that

$\Big\|y_0 - (T + 1)^{-1}\sum_{t=0}^{T} x_t\Big\| \le 2\alpha(\delta)\Lambda_0, \qquad \|y_1 - x^*\| \le 2\alpha(\delta)\Lambda_0,$

$f(y_0) \le \psi_0\Big((T + 1)^{-1}\sum_{t=0}^{T} x_t\Big) \le \inf(f; A) + \alpha(\delta),$
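The averaged-iterate scheme described in this section, subgradient projection applied to a penalized function over a ball followed by averaging, can be sketched on a toy instance. The problem data, the penalty coefficient, and the step size below are our own illustrative assumptions, not the text's.

```python
import numpy as np

# f(x) = ||x - (2, 0)||^2, constraint g(x) = x_1 <= 1, so the constrained
# minimizer is (1, 0) with inf(f; A) = 1. We run the subgradient projection
# method on psi0(x) = f(x) + Lam * max(g(x) - 1, 0) over B(0, M) and
# average the iterates.

M, Lam = 3.0, 5.0
target = np.array([2.0, 0.0])

def proj_ball(z):
    n = np.linalg.norm(z)
    return z if n <= M else z * (M / n)

def subgrad_psi0(x):
    g = 2.0 * (x - target)                   # gradient of f
    if x[0] > 1.0:
        g = g + Lam * np.array([1.0, 0.0])   # subgradient of the penalty term
    return g

T, alpha = 5000, 0.01
x = np.zeros(2)
avg = np.zeros(2)
for t in range(T + 1):
    avg += x                                 # accumulate x_0, ..., x_T
    x = proj_ball(x - alpha * subgrad_psi0(x))
avg /= (T + 1)

f_avg = float(np.sum((avg - target) ** 2))   # objective at the averaged iterate
```

Since $\mathrm{Lam} = 5$ exceeds the local Lipschitz constant of $f$ at the minimizer, the penalty is exact, and the averaged iterate lands near $(1, 0)$ with objective value close to $\inf(f; A) = 1$.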
An analogous analysis can also be carried out for the mirror descent method.
Chapter 16
Newton’s Method
$B_X(x, r) = \{u \in X : \|u - x\| \le r\},$
$B_Y(y, r) = \{v \in Y : \|v - y\| \le r\}.$

Let $I_X(x) = x$ for all $x \in X$ and let $I_Y(y) = y$ for all $y \in Y$. Denote by $L(X, Y)$ the set of all linear continuous operators $A : X \to Y$. For each $A \in L(X, Y)$ set

$A \in \partial_\epsilon F(x + t_0(y - x))$

or

$\liminf_{t \to t_0^+}\,(\phi(t) - \phi(t_0))(t - t_0)^{-1} \ge 0, \qquad \limsup_{t \to t_0^-}\,(\phi(t) - \phi(t_0))(t - t_0)^{-1} \le 0.$  (16.5)

Assume that

$|\phi(t) - \phi(t_0)|$
$= \big|\,\|F(x) - F(x + t(y - x))\| - \|F(x) - F(x + t_0(y - x))\|\,\big|$
$\le \|F(x + t(y - x)) - F(x + t_0(y - x))\|$
$\le \|F(x + t(y - x)) - F(x + t_0(y - x)) - (t - t_0)A(y - x)\| + |t - t_0|\,\|A(y - x)\|.$  (16.8)
268 16 Newton’s Method
Set

In view of (16.11) and (16.12), inequality (16.10) holds. By (16.8), (16.10), and (16.12),

Assume that case (b) holds. Then (16.2), (16.3), (16.6), and (16.13) imply that

$0 \le \liminf_{t \to t_0}\,(\psi(t) - \psi(t_0))(t - t_0)^{-1}$
$= \liminf_{t \to t_0}\,(\phi(t) - t(\phi(1) - \phi(0)) - (\phi(t_0) - t_0(\phi(1) - \phi(0))))(t - t_0)^{-1}$
$= \liminf_{t \to t_0}\,(\phi(t) - \phi(t_0))(t - t_0)^{-1} - (\phi(1) - \phi(0))$
$\le \liminf_{t \to t_0}\,|\phi(t) - \phi(t_0)|\,|t - t_0|^{-1} - \|F(x) - F(y)\|$

and
We use the notation and definitions of Sect. 16.1. Suppose that the normed space $X$ is a Banach space.

Let $\epsilon > 0$, $r > 0$, $\bar x \in X$, and let $U$ be a nonempty open subset of $X$ such that

$B_X(\bar x, r) \subset U$  (16.14)

$M > 0.$
Set

$h = MLK.$  (16.21)

$ht^2 - (1 - \gamma - \epsilon M)t + 1 = 0$  (16.22)

and

Set

there exists a unique point $x_* \in B_X(\bar x, Kt_0)$ such that $A(F(x_*)) = 0$, and for each $x \in B_X(\bar x, Kt_0)$,

$\|x_n - x_*\| \le 5\delta.$

2. Let $\delta > 0$, let a natural number $n_0 > 4$ satisfy (16.26), and let sequences

$\|x_n - x_*\| \le 10\delta.$

$x_i \in B_X(\bar x, Kt_0),$

then

Then

$\{x_i\}_{i=0}^{n_0} \subset B_X(\bar x, Kt_0),$

$\|x_{n_0 - 1} - x_*\| \le \epsilon$ and $\|x_{n_0 - 1} - x_{n_0}\| < \epsilon/4.$
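The iteration behind the theorem, applying the map $T(x) = x - A(F(x))$ with $A$ a fixed approximate inverse of the derivative near the starting point, while every step carries a computational error of size at most $\delta$, can be illustrated on a scalar equation. The instance below ($F$, the starting point, the error pattern) is our own illustrative assumption, not the text's.

```python
# Solve F(x) = 0 for F(x) = x^3 - 2 (root 2**(1/3)) with the modified
# Newton map T(x) = x - A*F(x), where A approximates 1/F'(x_bar) at the
# starting point, and each step is perturbed by an error of size <= delta.

F = lambda x: x ** 3 - 2.0
x_bar = 1.3
A = 1.0 / (3.0 * x_bar ** 2)    # fixed approximate inverse of F'(x_bar)

delta = 1e-3
x = x_bar
for i in range(50):
    err = delta if i % 2 == 0 else -delta   # any error with |err| <= delta
    x = x - A * F(x) + err

root = 2.0 ** (1.0 / 3.0)
```

Because $|1 - A\,F'(x)|$ is small near the root, the perturbed iteration contracts into a neighborhood of the root whose radius is of order $\delta$, mirroring the $5\delta$ and $10\delta$ estimates in the theorem.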
We use the notation, assumptions, and definitions introduced in Sects. 16.1 and 16.2.

Lemma 16.4. The mapping $T : U \to X$ is $(\epsilon M)$-pre-differentiable at every point of $U$ and for every $x \in U$,

$I_X - A \circ A(x) \in \partial_{\epsilon M} T(x).$

Proof. Let $x \in U$ and let $\Delta > 0$. In view of (16.1) and (16.16), there exists $\delta > 0$ such that

$B_X(x, \delta) \subset U$  (16.31)

Since $\Delta$ is an arbitrary positive number, this completes the proof of Lemma 16.4.
Lemma 16.5. Let $r_0 \in (0, r]$ and $x \in B_X(\bar x, r_0)$. Then

$\|I_X - A \circ A(x)\|$
$= \|I_X - A \circ A(\bar x) + A \circ A(\bar x) - A \circ A(x)\|$
$\le \|I_X - A \circ A(\bar x)\| + \|A\|\,\|A(\bar x) - A(x)\|$
$\le \gamma + ML\|x - \bar x\| \le \gamma + MLr_0.$

$x, y \in B_X(\bar x, r_0).$

Then

Proof. By (16.14), Theorem 16.2, and Lemmas 16.4 and 16.5, there exists $t_0 \in (0, 1)$ such that

$\|T(x) - T(y)\|$
$\le \|(I_X - A \circ A(x + t_0(y - x)))(y - x)\| + \epsilon M\|y - x\|$
$\le (\gamma + MLr_0 + \epsilon M)\|y - x\|.$
$\|T(x) - \bar x\| \le K + \gamma r_0 + M(\epsilon r_0 + Lr_0^2).$

Proof. Let

$x \in B_X(\bar x, r_0).$  (16.33)

$\|T(x) - \bar x\| = \|x - A(F(x)) - \bar x\|$
$= \|A[(A(\bar x))(x - \bar x) - F(x) + F(\bar x)] - A(F(\bar x)) + (I_X - A \circ A(\bar x))(x - \bar x)\|$
$\le \|A\|\,\|F(x) - F(\bar x) - (A(\bar x))(x - \bar x)\| + \|A(F(\bar x))\| + \|I_X - A \circ A(\bar x)\|\,\|x - \bar x\|$
$\le M\|F(x) - F(\bar x) - (A(\bar x))(x - \bar x)\| + K + \gamma\|x - \bar x\|.$  (16.34)

Together with (16.1) and (16.16) this implies that the mapping is $(\epsilon)$-pre-differentiable at every point of $U$ and that for all $z \in U$,

It follows from (16.35), (16.36), and Theorem 16.2 that for every $z \in B_X(\bar x, r_0)$,

$\|T(x) - \bar x\| \le K + \gamma r_0 + M(\epsilon r_0 + Lr_0^2).$
Set

$r_0 = Kt_0.$  (16.39)

We show that

$r_0 \le r.$  (16.40)

$r_0 = Kt_0 = 2K[(1 - \gamma - \epsilon M) + ((1 - \gamma - \epsilon M)^2 - 4h)^{1/2}]^{-1}$
$\le 2(1 - \gamma - \epsilon M)^{-1}K \le 4K \le r.$

By (16.21), (16.39), (16.40), and Lemma 16.6, for each $x, y \in B_X(\bar x, Kt_0)$,

$\gamma + ht_0 + \epsilon M \le \gamma + \epsilon M + 2(1 - \gamma - \epsilon M)^{-1}(1 - \gamma - \epsilon M)^2/4$
$= \gamma + \epsilon M + 2^{-1}(1 - \gamma - \epsilon M)$
$= 2^{-1} + 2^{-1}(\gamma + \epsilon M) \le 3/4.$  (16.43)
Let

$x \in B_X(\bar x, r_0).$  (16.45)

It follows from (16.21), (16.22), (16.24), (16.39), (16.40), (16.45), and Lemma 16.7 that

and

$x_* \in B_X(\bar x, r_0)$

such that

$T(x_*) = x_*.$

In order to complete the proof of Theorem 16.3 it is sufficient to show that assertions 1, 2, and 3 hold.

Let us prove assertion 1. For each integer $i \ge 0$,

$\|x_p - x_*\| \le (3 \cdot 4^{-1})^p\|x_0 - x_*\| + \delta\sum_{i=0}^{p-1} (3 \cdot 4^{-1})^i.$  (16.48)

$\|x_{p+1} - x_*\| \le \delta + (3/4)\|x_p - x_*\|$
$\le (3 \cdot 4^{-1})^{p+1}\|x_0 - x_*\| + \delta\sum_{i=0}^{p} (3 \cdot 4^{-1})^i.$

Thus we have shown by induction that (16.48) holds for all integers $p \ge 1$. By (16.48) and the choice of $n_0$ [see (16.26)], for all integers $n \ge n_0$,

Assertion 1 is proved.
Let us prove assertion 2. In view of (16.27), for each integer $i \ge 0$,

and

$\|x_n - x_*\| \le 10\delta.$

Assertion 2 is proved.

Let us prove assertion 3. In view of (16.24),

and

$2 \le p < n_0,$

$x_i \in B_X(\bar x, Kt_0)$  (16.54)

$\|x_q - x_{q-1}\| \le (3 \cdot 4^{-1})^{q-1}\|x_1 - x_0\| + 2\delta\sum_{i=0}^{q-2} (3 \cdot 4^{-1})^i.$  (16.55)

(In view of (16.51), (16.52), and (16.53), our assumption holds for $p = 2$.) By (16.30), (16.44), (16.54), and (16.55),

$= (3 \cdot 4^{-1})^p\|x_1 - x_0\| + 2\delta\sum_{i=0}^{p-1} (3 \cdot 4^{-1})^i$

$\|x_{p+1} - x_0\| \le \sum_{q=1}^{p+1} \|x_q - x_{q-1}\|$
$\le \sum_{q=1}^{p+1} (3 \cdot 4^{-1})^{q-1}\|x_1 - x_0\| + 8\delta p$

Set

$\tilde x_0 = x_{n_0 - 1}.$  (16.57)

Clearly,

$x_* = \lim_{i \to \infty} \tilde x_i.$
16.5 Set-Valued Mappings
Let $(X, \rho)$ be a complete metric space. For each $z \in X$ and each $r > 0$ set

In Sect. 16.7 we prove the following result, which is important in our study of Newton's method for nonlinear inclusions.
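The flavor of the result below, inexact iterations of a contractive mapping with summable computational errors still converge to a fixed point, can be sketched in a few lines. The instance is our own toy example under simplifying assumptions: the mapping is single-valued (a special case of a set-valued map), with contraction ratio 1/2.

```python
# Lambda(x) = x / 2 is a 1/2-contraction with fixed point 0.
# Each iterate is computed only up to an error eps_i, where sum(eps_i) < inf.

def iterate(x0, steps=60):
    x = x0
    for i in range(steps):
        eps_i = 0.5 ** i        # summable error sequence
        x = x / 2.0 + eps_i     # x_{i+1} lies within eps_i of Lambda(x_i)
    return x

x_star = iterate(10.0)
```

Despite the per-step errors, the iterates converge to the exact fixed point, because the geometric contraction absorbs the summable error tail; this is the mechanism formalized by the theorem.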
Theorem 16.8. Suppose that $\Lambda : X \to 2^X$, $a > 0$, $q \in (0, 1)$, $\bar x \in X$,

$2(1 - q)^{-1}\sum_{i=0}^{\infty} \epsilon_i + (1 - q)^{-1}\big(\max\{\epsilon_i : i = 0, 1, \dots\} + \rho(\bar x, \Lambda(\bar x))\big) \le a$  (16.64)

and that a sequence $\{x_i\}_{i=0}^{\infty} \subset X$ satisfies

$x_0 = \bar x$  (16.65)

hold. Then

$\rho(x_k, x_{k+1}) \le q^k\big(\rho(\bar x, \Lambda(\bar x)) + \epsilon_0\big) + \sum_{j=0}^{k-1} q^j \epsilon_{k-1-j} + \sum_{j=1}^{k} q^j \epsilon_{k-j},$  (16.69)

$x_* = \lim_{n \to \infty} x_n \in B(\bar x, a)$

satisfying $x_* \in \Lambda(x_*)$.

2. Let $\epsilon \in (0, 1)$, let a natural number $n_0 > 2$ satisfy

and

$x_0 = \bar x$

and for each integer $i \in [0, n_0 - 1]$ satisfying $x_i \in B(\bar x, a)$, the inequalities

hold. Then
such that

for each integer $s$ satisfying $0 \le s < n$ and each integer $k$ satisfying $s < k \le n$,

$\rho(x_k, x_{k+1}) \le q^{k-s}\rho(x_s, x_{s+1}) + \sum_{i=s}^{k-1} q^{i-s}\big(\epsilon_{k-i-1+s} + \epsilon_{k-i+s}\big)$  (16.84)

and

$\sum_{p=0}^{n} \rho(x_p, x_{p+1}) \le \sum_{p=0}^{n} q^p \rho(x_0, x_1) + \sum_{i=0}^{n-1} \epsilon_i \Big(\sum_{j=0}^{n-1-i} q^j\Big) + \sum_{i=1}^{n} \epsilon_i \Big(\sum_{j=0}^{n-i} q^j\Big).$  (16.85)

Assertion 1 is proved.

Let us prove assertion 2. Assume that an integer

We show that

such that

Thus (16.81) and (16.82) hold. It follows from (16.63), (16.88), (16.91), and (16.93) that

$0 \le s < n.$

We show by induction that for each integer $k$ satisfying $s < k \le n$, (16.84) holds. In view of (16.94),

$= q^{k+1-s}\rho(x_s, x_{s+1}) + \sum_{i=s}^{k} q^{i-s}\big(\epsilon_{k-i+s} + \epsilon_{k+1-i+s}\big).$

Thus by induction we showed that (16.84) holds for all integers $k$ satisfying $s < k \le n$.

$\rho(x_p, x_{p+1}) \le q^p \rho(x_0, x_1) + \sum_{i=0}^{p-1} q^i\big(\epsilon_{p-i-1} + \epsilon_{p-i}\big).$  (16.95)
$x_0 = \bar x.$

$\rho(\bar x, x_{n+1}) = \rho(x_0, x_{n+1}) \le \sum_{p=0}^{n} \rho(x_p, x_{p+1})$
$\le (1 - q)^{-1}\rho(x_0, x_1) + (1 - q)^{-1}\sum_{i=0}^{n-1} \epsilon_i + (1 - q)^{-1}\sum_{i=1}^{n} \epsilon_i$
$\le (1 - q)^{-1}\big(\rho(\bar x, \Lambda(\bar x)) + \epsilon_0\big) + 2(1 - q)^{-1}\sum_{i=0}^{n} \epsilon_i - (1 - q)^{-1}\epsilon_n$
$< a - (1 - q)^{-1}\epsilon_n$

and

$\rho(x_{n+1}, \bar x) < a - \epsilon_n.$
and

Lemma 16.9, (16.81), (16.82), and (16.97) imply that for each integer $p \ge 0$ there exists

such that

$x_* \in \Lambda(x_*).$

Lemma 16.9, (16.67), (16.84), and (16.97) imply that for each integer $k > 0$,

$\rho(x_k, x_{k+1}) \le q^k\rho(x_0, x_1) + \sum_{j=0}^{k-1} q^j \epsilon_{k-1-j} + \sum_{j=1}^{k} q^j \epsilon_{k-j}$
$\le q^k\big(\rho(\bar x, \Lambda(\bar x)) + \epsilon_0\big) + \sum_{j=0}^{k-1} q^j \epsilon_{k-1-j} + \sum_{j=1}^{k} q^j \epsilon_{k-j}.$  (16.102)

$\sum_{p=0}^{\infty} \rho(x_{k+p}, x_{k+p+1}) \le (1 - q)^{-1}\rho(x_k, x_{k+1}) + 2(1 - q)^{-1}\sum_{p=k}^{\infty} \epsilon_p.$
Assertion 1 is proved.
Let us prove assertion 2. For $i = 0, \dots, n_0 - 1$ set

$\epsilon_i = \delta.$  (16.103)

By (16.73) and (16.103), for every integer $i \ge n_0$ there exists $\epsilon_i > 0$ such that (16.64) holds,

$2(1 - q)^{-1}\sum_{i=n_0}^{\infty} \epsilon_i < \epsilon/4, \qquad \epsilon_i \le \delta \text{ for all integers } i \ge 0.$  (16.104)

Clearly, for every integer $i \ge n_0 + 1$, there exists $x_i \in X$ such that the following property holds:

if an integer $i \ge 0$ satisfies $x_i \in B(\bar x, a)$, then (16.66) and (16.67) hold.

It follows from (16.62), (16.64), (16.66)–(16.69), (16.71), (16.72), (16.103), (16.104), and assertion 1 that

and
Let $(Z, \|\cdot\|)$ be a normed space. For each $x \in Z$ and each nonempty set $C \subset Z$ define

$B_Z(z, r) = \{y \in Z : \|z - y\| \le r\}.$

Let $(X, \|\cdot\|)$ and $(Y, \|\cdot\|)$ be normed spaces. Denote by $L(X, Y)$ the set of all linear continuous operators $A : X \to Y$. For each $A \in L(X, Y)$ set

$B_X(x, \delta(\epsilon)) \subset U$

$B_X(x, \delta(\epsilon)) \subset U$

$G(x + h) + g(x + h)$
$\subset G(x) + (G'(x))(h) + 2^{-1}\epsilon_1\|h\|B_Y(0, 1) + g(x + h)$
$\subset G(x) + (G'(x))(h) + 2^{-1}\epsilon_1\|h\|B_Y(0, 1) + g(x) + (\epsilon_2 + 4^{-1}\epsilon_1)\|h\|B_Y(0, 1)$
$\subset G(x) + g(x) + (G'(x))(h) + (\epsilon_1 + \epsilon_2)\|h\|B_Y(0, 1)$
and

and that

$\tilde x \in F(x).$  (16.114)

$A \in \partial_\epsilon F(x + t_0(y - x))$

or

$\liminf_{t \to t_0^+}\,(\phi(t) - \phi(t_0))(t - t_0)^{-1} \ge 0, \qquad \limsup_{t \to t_0^-}\,(\phi(t) - \phi(t_0))(t - t_0)^{-1} \le 0.$  (16.117)

$\phi(t) - \phi(t_0) = d(\tilde x, F(x + t(y - x))) - d(\tilde x, F(x + t_0(y - x)))$
$\le H(F(x + t(y - x)), F(x + t_0(y - x))),$

$\phi(t_0) - \phi(t) = d(\tilde x, F(x + t_0(y - x))) - d(\tilde x, F(x + t(y - x)))$
$\le H(F(x + t(y - x)), F(x + t_0(y - x)))$

and

we have

Set

In view of (16.124) and (16.125), relations (16.122) and (16.123) hold. By (16.123) and (16.125),

$|\phi(t) - \phi(t_0)|$
$\le H(F(x + t_0(y - x)), F(x + t_0(y - x)) + (t - t_0)A(y - x))$
$\quad + H(F(x + t_0(y - x)) + (t - t_0)A(y - x), F(x + t(y - x)))$
$\le |t - t_0|\,\|A(y - x)\| + (\epsilon + 4^{-1}\gamma)|t - t_0|\,\|y - x\|.$  (16.126)
Assume that case (a) holds. Then (16.115)–(16.117), (16.124), and (16.126) imply that

Assume that case (b) holds. Then (16.114)–(16.116), (16.118), and (16.126) imply that

$0 \le \liminf_{t \to t_0}\,(\psi(t) - \psi(t_0))(t - t_0)^{-1}$
$= \liminf_{t \to t_0}\,(\phi(t) - t(\phi(1) - \phi(0)) - (\phi(t_0) - t_0(\phi(1) - \phi(0))))(t - t_0)^{-1}$
$= \liminf_{t \to t_0}\,(\phi(t) - \phi(t_0))(t - t_0)^{-1} - (\phi(1) - \phi(0))$
$\le \liminf_{t \to t_0}\,|\phi(t) - \phi(t_0)|\,|t - t_0|^{-1} - d(\tilde x, F(y))$

Thus (16.127) holds in both cases. This completes the proof of Theorem 16.11.
We use the notation and definitions of Sect. 16.8. Suppose that the normed space $X$ is a Banach space.

Let $\epsilon > 0$, $r > 0$, $\bar x \in X$, and let $U$ be a nonempty open subset of $X$ such that

$B_X(\bar x, r) \subset U.$

In view of (16.130),

$M > 0.$

and for every $x \in X \setminus U$ set $T(x) = \emptyset$. The following result is proved in Sect. 16.11.
Theorem 16.12. Let

and there exists $x_* \in B_X(\bar x, r_0)$ such that $x_* \in T(x_*)$ and $0 \in F(x_*)$. Moreover, the following assertions hold.

1. Assume that a sequence $\{\epsilon_i\}_{i=0}^{\infty} \subset (0, \infty)$ satisfies $\sum_{i=0}^{\infty} \epsilon_i < \infty$,

$4\sum_{i=0}^{\infty} \epsilon_i + 2\max\{\epsilon_i : i = 0, 1, \dots\} \le 2^{-1} r_0$

$x_0 = \bar x$

$d(x_{i+1}, T(x_i)) \le \epsilon_i,$
$\|x_i - x_{i+1}\| \le d(x_i, T(x_i)) + \epsilon_i$

hold. Then

$\lim_{n \to \infty} x_n \in B_X(\bar x, r_0)$

and

$x_0 = \bar x$

$d(x_{i+1}, T(x_i)) \le \delta,$
$\|x_i - x_{i+1}\| \le d(x_i, T(x_i)) + \delta$

hold. Then
$I_X - A \circ A(x) \in \partial_{\epsilon M} T(x).$

Proof. Let $x \in U$ and let $\Delta > 0$. In view of (16.128), there exists $\delta > 0$ such that

$B_X(x, \delta) \subset U$

Since $\Delta$ is an arbitrary positive number, this completes the proof of Lemma 16.13.

Lemma 16.14. Let $r_0 \in (0, r]$ and $x \in B_X(\bar x, r_0)$. Then

$\|I_X - A \circ A(x)\|$
$= \|I_X - A \circ A(\bar x) + A \circ A(\bar x) - A \circ A(x)\|$
$\le \|I_X - A \circ A(\bar x)\| + \|A\|\,\|A(\bar x) - A(x)\|$
$\le \gamma + ML\|x - \bar x\| \le \gamma + MLr_0.$

$x, y \in B_X(\bar x, r_0).$

Then

$I_X - A \circ A(x) \in \partial_{\epsilon M} T(x).$

$\tilde x \in T(x).$

By Theorem 16.11 and Lemmas 16.13 and 16.14, there exists $t_0 \in (0, 1)$ such that

Analogously,

Therefore

By (16.131)–(16.133),

Let

$x, y \in B_X(\bar x, r_0).$

Lemma 16.15 implies that

$H(T(x), T(y)) \le \|y - x\|(\gamma + MLr_0 + \epsilon M) \le 2^{-1}\|x - y\|.$

is closed. It is not difficult to see that Theorem 16.12 follows from Theorem 16.8 applied to the mapping $T$.
References
18. Beck A, Sabach S (2015) Weiszfeld’s method: old and new results. J Optim Theory Appl
164:1–40
19. Beck A, Teboulle M (2003) Mirror descent and nonlinear projected subgradient methods for
convex optimization. Oper Res Lett 31:167–175
20. Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear
inverse problems. SIAM J Imaging Sci 2:183–202
21. Ben-Israel A (1966) A Newton-Raphson method for the solution of equations. J Math Anal
Appl 15:243–253
22. Ben-Israel A, Greville TNE (1974) Generalized inverses: theory and applications. Wiley, New
York
23. Bolte J (2003) Continuous gradient projection method in Hilbert spaces. J Optim Theory Appl
119:235–259
24. Bonnans JF (1994) Local analysis of Newton-type methods for variational inequalities and
nonlinear programming. Appl Math Optim 29:161–186
25. Boukari D, Fiacco AV (1995) Survey of penalty, exact-penalty and multiplier methods from
1968 to 1993. Optimization 32:301–334
26. Bregman LM (1967) A relaxation method of finding a common point of convex sets and
its application to the solution of problems in convex programming. Z Vycisl Mat Mat Fiz
7:620–631
27. Brezis H (1973) Opérateurs maximaux monotones. North Holland, Amsterdam
28. Bruck RE (1974) Asymptotic convergence of nonlinear contraction semigroups in a Hilbert
space. J Funct Anal 18:15–26
29. Burachik RS, Iusem AN (1998) A generalized proximal point algorithm for the variational
inequality problem in a Hilbert space. SIAM J Optim 8:197–216
30. Burachik RS, Grana Drummond LM, Iusem AN, Svaiter BF (1995) Full convergence of the
steepest descent method with inexact line searches. Optimization 32:137–146
31. Burachik RS, Lopes JO, Da Silva GJP (2009) An inexact interior point proximal method for
the variational inequality problem. Comput Appl Math 28:15–36
32. Burachik RS, Kaya CY, Sabach S (2012) A generalized univariate Newton method motivated
by proximal regularization. J Optim Theory Appl 155:923–940
33. Burke JV (1991) An exact penalization viewpoint of constrained optimization. SIAM
J Control Optim 29:968–998
34. Butnariu D, Kassay G (2008) A proximal-projection method for finding zeros of set-valued
operators. SIAM J Control Optim 47:2096–2136
35. Ceng LC, Mordukhovich BS, Yao JC (2010) Hybrid approximate proximal method with
auxiliary variational inequality for vector optimization. J Optim Theory Appl 146:267–303
36. Censor Y, Zenios SA (1992) The proximal minimization algorithm with D-functions.
J Optim Theory Appl 73:451–464
37. Censor Y, Gibali A, Reich S (2011) The subgradient extragradient method for solving
variational inequalities in Hilbert space. J Optim Theory Appl 148:318–335
38. Censor Y, Gibali A, Reich S (2012) A von Neumann alternating method for finding common
solutions to variational inequalities. Nonlinear Anal 75:4596–4603
39. Censor Y, Gibali A, Reich S, Sabach S (2012) Common solutions to variational inequalities.
Set-Valued Var Anal 20:229–247
40. Chen Z, Zhao K (2009) A proximal-type method for convex vector optimization problem in
Banach spaces. Numer Funct Anal Optim 30:70–81
41. Chen X, Nashed Z, Qi L (1997) Convergence of Newton's method for singular smooth and
nonsmooth equations using adaptive outer inverses. SIAM J Optim 7:445–462
42. Chuong TD, Mordukhovich BS, Yao JC (2011) Hybrid approximate proximal algorithms for
efficient solutions in vector optimization. J Nonlinear Convex Anal 12:861–864
43. Clarke FH (1983) Optimization and nonsmooth analysis. Wiley-Interscience, New York
44. Demyanov VF, Vasilyev LV (1985) Nondifferentiable optimization. Optimization Software,
New York
45. Di Pillo G, Grippo L (1989) Exact penalty functions in constrained optimization. SIAM
J Control Optim 27:1333–1360
46. Dontchev AL, Rockafellar RT (2010) Newton’s method for generalized equations: a sequen-
tial implicit function theorem. Math Program 123:139–159
47. Dontchev AL, Rockafellar RT (2013) Convergence of inexact Newton methods for general-
ized equations. Math Program Ser B 139:115–137
48. Eremin II (1966) The penalty method in convex programming. Sov Math Dokl 8:459–462
49. Eremin II (1971) The penalty method in convex programming. Cybernetics 3:53–56
50. Ekeland I (1974) On the variational principle. J Math Anal Appl 47:324–353
51. Ermoliev YM (1966) Methods for solving nonlinear extremal problems. Cybernetics 2:1–17
52. Facchinei F, Pang J-S (2003) Finite-dimensional variational inequalities and complementarity
problems, volume I and volume II. Springer, New York
53. Guler O (1991) On the convergence of the proximal point algorithm for convex minimization.
SIAM J Control Optim 29:403–419
54. Gwinner J, Raciti F (2009) On monotone variational inequalities with random data. J Math
Inequal 3:443–453
55. Hager WW, Zhang H (2007) Asymptotic convergence analysis of a new class of proximal
point methods. SIAM J Control Optim 46:1683–1704
56. Hager WW, Zhang H (2008) Self-adaptive inexact proximal point methods. Comput Optim
Appl 39:161–181
57. Han S-P, Mangasarian OL (1979) Exact penalty function in nonlinear programming. Math
Program 17:251–269
58. Hiriart-Urruty J-B, Lemarechal C (1993) Convex analysis and minimization algorithms.
Springer, Berlin
59. Iiduka H, Takahashi W, Toyoda M (2004) Approximation of solutions of variational inequal-
ities for monotone mappings. Pan Am Math J 14:49–61
60. Ioffe AD, Zaslavski AJ (2000) Variational principles and well-posedness in optimization and
calculus of variations. SIAM J Control Optim 38:566–581
61. Iusem A, Nasri M (2007) Inexact proximal point methods for equilibrium problems in Banach
spaces. Numer Funct Anal Optim 28:1279–1308
62. Iusem A, Resmerita E (2010) A proximal point method in nonreflexive Banach spaces. Set-
Valued Var Anal 18:109–120
63. Izmailov AF, Solodov MV (2014) Newton-type methods for optimization and variational
problems. Springer International Publishing, Cham
64. Kantorovich LV (1948) Functional analysis and applied mathematics. Usp Mat Nauk
3:89–185
65. Kantorovich LV, Akilov GP (1982) Functional analysis. Pergamon Press, Oxford, New York
66. Kaplan A, Tichatschke R (1994) Stable methods for ill-posed variational problems. Akademie
Verlag, Berlin
67. Kaplan A, Tichatschke R (1998) Proximal point methods and nonconvex optimization.
J Global Optim 13:389–406
68. Kaplan A, Tichatschke R (2007) Bregman-like functions and proximal methods for varia-
tional problems with nonlinear constraints. Optimization 56:253–265
69. Kassay G (1985) The proximal points algorithm for reflexive Banach spaces. Stud Univ
Babes-Bolyai Math 30:9–17
70. Kiwiel KC (1996) Restricted step and Levenberg–Marquardt techniques in proximal bundle
methods for nonconvex nondifferentiable optimization. SIAM J Optim 6:227–249
71. Konnov IV (1997) On systems of variational inequalities. Russ Math (Iz VUZ) 41:79–88
72. Konnov IV (2001) Combined relaxation methods for variational inequalities. Springer, Berlin,
Heidelberg
73. Konnov IV (2008) Nonlinear extended variational inequalities without differentiability:
applications and solution methods. Nonlinear Anal 69:1–13
74. Konnov IV (2009) A descent method with inexact linear search for mixed variational
inequalities. Russ Math (Iz VUZ) 53:29–35
75. Korpelevich GM (1976) The extragradient method for finding saddle points and other
problems. Ekon Matem Metody 12:747–756
76. Kutateladze SS (1979) Convex operators. Usp Math Nauk 34:167–196
77. Lemaire B (1989) The proximal algorithm. In: Penot JP (ed) International series of numerical
mathematics, vol 87. Birkhäuser-Verlag, Basel, pp 73–87
78. Lotito PA, Parente LA, Solodov MV (2009) A class of variable metric decomposition methods
for monotone variational inclusions. J Convex Anal 16:857–880
79. Mainge P-E (2008) Strong convergence of projected subgradient methods for nonsmooth and
nonstrictly convex minimization. Set-Valued Anal 16:899–912
80. Mangasarian OL, Pang J-S (1997) Exact penalty functions for mathematical programs with
linear complementary constraints. Optimization 42:1–8
81. Martinet B (1978) Perturbation des méthodes d'optimisation: application. RAIRO Anal Numer
12:153–171
82. Minty GJ (1962) Monotone (nonlinear) operators in Hilbert space. Duke Math J 29:341–346
83. Minty GJ (1964) On the monotonicity of the gradient of a convex function. Pac J Math
14:243–247
84. Mordukhovich BS (2006) Variational analysis and generalized differentiation, I: basic theory.
Springer, Berlin
85. Mordukhovich BS (2006) Variational analysis and generalized differentiation, II: applica-
tions. Springer, Berlin
86. Mordukhovich BS, Nam NM (2014) An easy path to convex analysis and applications.
Morgan & Claypool Publishers, San Rafael, CA
87. Moreau JJ (1965) Proximité et dualité dans un espace hilbertien. Bull Soc Math Fr
93:273–299
88. Nashed MZ, Chen X (1993) Convergence of Newton-like methods for singular operator
equations using outer inverses. Numer Math 66:235–257
89. Nedic A, Ozdaglar A (2009) Subgradient methods for saddle-point problems. J Optim Theory
Appl 142:205–228
90. Nemirovski A, Yudin D (1983) Problem complexity and method efficiency in optimization.
Wiley, New York
91. Nesterov Yu (1983) A method for solving the convex programming problem with convergence
rate O(1/k^2). Dokl Akad Nauk 269:543–547
92. Nesterov Yu (2004) Introductory lectures on convex optimization. Kluwer, Boston
93. Pang J-S (1985) Asymmetric variational inequality problems over product sets: applications
and iterative methods. Math Program 31:206–219
94. Pang J-S (1990) Newton’s method for B-differentiable equations. Math Oper Res 15:311–341
95. Polyak BT (1967) A general method of solving extremum problems. Dokl Akad Nauk
8:593–597
96. Polyak BT (1987) Introduction to optimization. Optimization Software, New York
97. Polyak BT (2007) Newton's method and its use in optimization. Eur J Oper Res
181:1086–1096
98. Polyak RA (2015) Projected gradient method for non-negative least squares. Contemp Math
636:167–179
99. Qi L, Sun J (1993) A nonsmooth version of Newton’s method. Math Program 58:353–367
100. Reich S, Sabach S (2010) Two strong convergence theorems for Bregman strongly nonexpan-
sive operators in reflexive Banach spaces. Nonlinear Anal 73:122–135
101. Reich S, Zaslavski AJ (2014) Genericity in nonlinear analysis. Springer, New York
102. Robinson SM (1994) Newton's method for a class of nonsmooth functions. Set-Valued Anal
2:291–305
103. Rockafellar RT (1976) Augmented Lagrangians and applications of the proximal point
algorithm in convex programming. Math Oper Res 1:97–116
104. Rockafellar RT (1976) Monotone operators and the proximal point algorithm. SIAM J Control
Optim 14:877–898
105. Shor NZ (1985) Minimization methods for non-differentiable functions. Springer, Berlin
106. Solodov MV, Svaiter BF (2000) Error bounds for proximal point subproblems and associated
inexact proximal point algorithms. Math Program 88:371–389
107. Solodov MV, Svaiter BF (2001) A unified framework for some inexact proximal point
algorithms. Numer Funct Anal Optim 22:1013–1035
108. Solodov MV, Zavriev SK (1998) Error stability properties of generalized gradient-type
algorithms. J Optim Theory Appl 98:663–680
109. Su M, Xu H-K (2010) Remarks on the gradient-projection algorithm. J Nonlinear Anal Optim
1:35–43
110. Weiszfeld EV (1937) Sur le point pour lequel la somme des distances de n points donnés est
minimum. Tohoku Math J 43:355–386
111. Xu H-K (2006) A regularization method for the proximal point algorithm. J Global Optim
36:115–125
112. Xu H-K (2011) Averaged mappings and the gradient-projection algorithm. J Optim Theory
Appl 150:360–378
113. Yamashita N, Kanzow C, Morimoto T, Fukushima M (2001) An infeasible interior proximal
method for convex programming problems with linear constraints. J Nonlinear Convex Anal
2:139–156
114. Zangwill WI (1967) Nonlinear programming via penalty functions. Manage Sci 13:344–358
115. Zaslavski AJ (2003) Existence of solutions of minimization problems with an increasing cost
function and porosity. Abstr Appl Anal 2003:651–670
116. Zaslavski AJ (2003) Generic existence of solutions of minimization problems with an
increasing cost function. J Nonlinear Funct Anal Appl 8:181–213
117. Zaslavski AJ (2005) A sufficient condition for exact penalty in constrained optimization.
SIAM J Optim 16:250–262
118. Zaslavski AJ (2007) Existence of approximate exact penalty in constrained optimization.
Math Oper Res 32:484–495
119. Zaslavski AJ (2010) An estimation of exact penalty in constrained optimization. J Nonlinear
Convex Anal 11:381–389
120. Zaslavski AJ (2010) Convergence of a proximal method in the presence of computational
errors in Hilbert spaces. SIAM J Optim 20:2413–2421
121. Zaslavski AJ (2010) Optimization on metric and normed spaces. Springer, New York
122. Zaslavski AJ (2010) The projected subgradient method for nonsmooth convex optimization
in the presence of computational errors. Numer Funct Anal Optim 31:616–633
123. Zaslavski AJ (2011) An estimation of exact penalty for infinite-dimensional inequality-
constrained minimization problems. Set-Valued Var Anal 19:385–398
124. Zaslavski AJ (2011) Inexact proximal point methods in metric spaces. Set-Valued Var Anal
19:589–608
125. Zaslavski AJ (2011) Maximal monotone operators and the proximal point algorithm in the
presence of computational errors. J Optim Theory Appl 150:20–32
126. Zaslavski AJ (2012) The extragradient method for convex optimization in the presence of
computational errors. Numer Funct Anal Optim 33:1399–1412
127. Zaslavski AJ (2012) The extragradient method for solving variational inequalities in the
presence of computational errors. J Optim Theory Appl 153:602–618
128. Zaslavski AJ (2013) The extragradient method for finding a common solution of a finite
family of variational inequalities and a finite family of fixed point problems in the presence
of computational errors. J Math Anal Appl 400:651–663
129. Zeng LC, Yao JC (2006) Strong convergence theorem by an extragradient method for fixed
point problems and variational inequality problems. Taiwan J Math 10:1293–1303
Index

C
Cardinality of a set, 137
Collinear vectors, 86
Compact set, 228
Concave function, 26, 36
Continuous subgradient algorithm, 225
Convex–concave function, 11
Convex cone, 247
Convex function, 1, 4, 6, 11, 20, 26, 35
Convex hull, 228
Convex minimization problem, 105
Convex set, 11, 12

E
Ekeland's variational principle, 244, 252
Euclidean norm, 167
Euclidean space, 86, 169
Exact penalty, 239
Extragradient method, 183, 205

F
Fermat–Weber location problem, 85, 86

H
Hilbert space, 1, 4, 6, 11, 20

I
Increasing function, 247
Inner product, 1, 4, 6, 20

K
Karush–Kuhn–Tucker theorem, 252

L
Lebesgue measurable function, 225, 227
Linear functional, 246
Linear inverse problem, 74
Lower semicontinuous function, 6, 137

M
Maximal monotone operator, 169
Metric space, 149
Minimization problem, 2
Minimizer, 15, 16, 22, 42

P
Penalty function, 239
Pre-derivative, 268
Pre-differentiable mapping, 268
Projected gradient algorithm, 59
Projected subgradient method, 119
Proximal mapping, 170
Proximal point method, 6, 137
Pseudo-monotone mapping, 8, 184

Q
Quadratic function, 90

V
Variational inequality, 8, 183
Vector space, 247

W
Weiszfeld's method, 85
Well-posed problem, 140, 165

Z
Zero-sum game, 25, 35