
Springer Optimization and Its Applications  108

Alexander J. Zaslavski

Numerical
Optimization with
Computational
Errors
Springer Optimization and Its Applications
VOLUME 108

Managing Editor
Panos M. Pardalos (University of Florida)

Editor–Combinatorial Optimization
Ding-Zhu Du (University of Texas at Dallas)

Advisory Board
J. Birge (University of Chicago)
C.A. Floudas (Princeton University)
F. Giannessi (University of Pisa)
H.D. Sherali (Virginia Polytechnic Institute and State University)
T. Terlaky (McMaster University)
Y. Ye (Stanford University)

Aims and Scope


Optimization has been expanding in all directions at an astonishing rate
during the last few decades. New algorithmic and theoretical techniques
have been developed, the diffusion into other disciplines has proceeded at a
rapid pace, and our knowledge of all aspects of the field has grown even more
profound. At the same time, one of the most striking trends in optimization
is the constantly increasing emphasis on the interdisciplinary nature of the
field. Optimization has been a basic tool in all areas of applied mathematics,
engineering, medicine, economics, and other sciences.
The series Springer Optimization and Its Applications publishes undergraduate and graduate textbooks, monographs and state-of-the-art expository work that focus on algorithms for solving optimization problems and also study applications involving such problems. Some of the topics covered include nonlinear optimization (convex and nonconvex), network flow problems, stochastic optimization, optimal control, discrete optimization, multi-objective programming, description of software packages, approximation techniques and heuristic approaches.

More information about this series at http://www.springer.com/series/7393


Alexander J. Zaslavski

Numerical Optimization
with Computational Errors

Alexander J. Zaslavski
Department of Mathematics
The Technion – Israel Institute of Technology
Haifa, Israel

ISSN 1931-6828 ISSN 1931-6836 (electronic)


Springer Optimization and Its Applications
ISBN 978-3-319-30920-0 ISBN 978-3-319-30921-7 (eBook)
DOI 10.1007/978-3-319-30921-7

Library of Congress Control Number: 2016934410

Mathematics Subject Classification (2010): 47H09, 49M30, 65K10

© Springer International Publishing Switzerland 2016


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, express or implied, with respect to the material contained herein or for any
errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature


The registered company is Springer International Publishing AG Switzerland
Preface

The book is devoted to the study of approximate solutions of optimization problems in the presence of computational errors. We present a number of results on the convergence behavior of algorithms in a Hilbert space, which are known as important tools for solving optimization problems and variational inequalities. According to the results known in the literature, these algorithms should converge to a solution. In this book, we study these algorithms taking into account computational errors which are always present in practice. In this case the convergence to a solution does not take place. We show that our algorithms generate a good approximate solution if computational errors are bounded from above by a small positive constant. In practice it is sufficient to find a good approximate solution instead of constructing a minimizing sequence. On the other hand, in practice computations can induce numerical errors, and if one uses optimization methods to solve minimization problems these methods usually provide only approximate solutions of the problems. Our main goal is, for a known computational error, to find out what approximate solution can be obtained and how many iterates one needs for this.
This monograph contains 16 chapters. Chapter 1 is an introduction. In Chap. 2,
we study the subgradient projection algorithm for minimization of convex and
nonsmooth functions. The mirror descent algorithm is considered in Chap. 3. The
gradient projection algorithm for minimization of convex and smooth functions
is analyzed in Chap. 4. In Chap. 5, we consider its extension which is used for
solving linear inverse problems arising in signal/image processing. The convergence
of Weiszfeld’s method in the presence of computational errors is discussed in
Chap. 6. In Chap. 7, we solve constrained convex minimization problems using the
extragradient method. Chapter 8 is devoted to a generalized projected subgradient
method for minimization of a convex function over a set which is not necessarily
convex. In Chap. 9, we study the convergence of a proximal point method in a
Hilbert space in the presence of computational errors. Chapter 10 is devoted
to the local convergence of a proximal point method in a metric space in
the presence of computational errors. In Chap. 11, we study the convergence of
a proximal point method to a solution of the inclusion induced by a maximal
monotone operator, in the presence of computational errors. In Chap. 12, the
convergence of the subgradient method for solving variational inequalities is proved
in the presence of computational errors. The convergence of the subgradient
method to a common solution of a finite family of variational inequalities and of a
finite family of fixed point problems, in the presence of computational errors, is
shown in Chap. 13. In Chap. 14, we study the continuous subgradient method. Penalty
methods are studied in Chap. 15. Chapter 16 is devoted to Newton's method. The
results of Chaps. 2–6, 14, and 16 are new. The results of the other chapters were obtained
and published during the last 5 years.
The author believes that this book will be useful for researchers interested in
optimization theory and its applications.

Rishon LeZion, Israel
October 19, 2015

Alexander J. Zaslavski
Contents

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Subgradient Projection Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 The Mirror Descent Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Proximal Point Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Variational Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2 Subgradient Projection Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 A Convex Minimization Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3 The Main Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4 Proof of Theorem 2.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.5 Subgradient Algorithm on Unbounded Sets . . . . . . . . . . . . . . . . . . . . . . . 20
2.6 Proof of Theorem 2.8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.7 Zero-Sum Games with Two-Players . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.8 Proof of Proposition 2.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.9 Subgradient Algorithm for Zero-Sum Games . . . . . . . . . . . . . . . . . . . . . 35
2.10 Proof of Theorem 2.11. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3 The Mirror Descent Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.1 Optimization on Bounded Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.2 The Main Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.3 Proof of Theorem 3.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.4 Optimization on Unbounded Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.5 Proof of Theorem 3.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.6 Zero-Sum Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4 Gradient Algorithm with a Smooth Objective Function . . . . . . . . . . . . . . . 59
4.1 Optimization on Bounded Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.2 An Auxiliary Result and the Proof of Proposition 4.1 . . . . . . . . . . . . 61
4.3 The Main Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.4 Proof of Theorem 4.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.5 Optimization on Unbounded Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68


5 An Extension of the Gradient Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73


5.1 Preliminaries and the Main Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.2 Auxiliary Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.3 The Main Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.4 Proof of Theorem 5.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6 Weiszfeld’s Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.1 The Description of the Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.3 The Basic Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.4 The Main Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.5 Proof of Theorem 6.10. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
7 The Extragradient Method for Convex Optimization . . . . . . . . . . . . . . . . . . 105
7.1 Preliminaries and the Main Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.2 Auxiliary Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
7.3 Proof of Theorem 7.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
7.4 Proof of Theorem 7.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
8 A Projected Subgradient Method for Nonsmooth Problems . . . . . . . . . . 119
8.1 Preliminaries and Main Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
8.2 Auxiliary Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
8.3 Proof of Theorem 8.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
8.4 Proof of Theorem 8.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
9 Proximal Point Method in Hilbert Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
9.1 Preliminaries and the Main Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
9.2 Auxiliary Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
9.3 Proof of Theorem 9.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
9.4 Proof of Theorem 9.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
10 Proximal Point Methods in Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
10.1 Preliminaries and the Main Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
10.2 Auxiliary Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
10.3 The Main Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
10.4 Proof of Theorem 10.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
10.5 An Auxiliary Result for Theorem 10.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
10.6 Proof of Theorem 10.2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
10.7 Well-Posed Minimization Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
10.8 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
11 Maximal Monotone Operators and the Proximal Point Algorithm . . . 169
11.1 Preliminaries and the Main Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
11.2 Auxiliary Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
11.3 Proof of Theorem 11.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
11.4 Proofs of Theorems 11.3, 11.5, 11.6, and 11.7 . . . . . . . . . . . . . . . . . . . . 176

12 The Extragradient Method for Solving Variational Inequalities . . . . . . 183


12.1 Preliminaries and the Main Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
12.2 Auxiliary Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
12.3 Proof of Theorem 12.2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
12.4 The Finite-Dimensional Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
12.5 A Convergence Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
13 A Common Solution of a Family of Variational Inequalities . . . . . . . . . . 205
13.1 Preliminaries and the Main Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
13.2 Auxiliary Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
13.3 Proof of Theorem 13.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
13.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
14 Continuous Subgradient Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
14.1 Bochner Integrable Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
14.2 Convergence Analysis for Continuous Subgradient Method . . . . . 226
14.3 An Auxiliary Result. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
14.4 Proof of Theorem 14.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
14.5 Continuous Subgradient Projection Method . . . . . . . . . . . . . . . . . . . . . . . 231
15 Penalty Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
15.1 An Estimation of Exact Penalty in Constrained Optimization . . . . 239
15.2 Proof of Theorem 15.4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
15.3 Infinite-Dimensional Inequality-Constrained Minimization Problems . . . . . . . . . . . . 246
15.4 Proofs of Auxiliary Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
15.5 Proof of Theorem 15.12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
15.6 Proof of Theorem 15.15 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
15.7 An Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
16 Newton’s Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
16.1 Pre-differentiable Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
16.2 Convergence of Newton’s Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
16.3 Auxiliary Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
16.4 Proof of Theorem 16.3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
16.5 Set-Valued Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
16.6 An Auxiliary Result. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
16.7 Proof of Theorem 16.8. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
16.8 Pre-differentiable Set-Valued Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
16.9 Newton’s Method for Solving Inclusions . . . . . . . . . . . . . . . . . . . . . . . . . . 292
16.10 Auxiliary Results for Theorem 16.12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
16.11 Proof of Theorem 16.12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
Chapter 1
Introduction

In this book we study the behavior of algorithms for constrained convex minimization problems in a Hilbert space. Our goal is to obtain a good approximate solution of the problem in the presence of computational errors. We show that the algorithm generates a good approximate solution if the sequence of computational errors is bounded from above by a constant. In this section we discuss several algorithms which are studied in the book.

1.1 Subgradient Projection Method

In Chap. 2 we study the subgradient projection algorithm for minimization of convex and nonsmooth functions and for computing the saddle points of convex–concave functions, in the presence of computational errors. It should be mentioned that the subgradient projection algorithm is one of the most important tools in optimization theory and its applications. See, for example, [1–3, 12, 30, 44, 51, 79, 89, 92, 95, 96, 105, 108, 109, 112] and the references mentioned therein.
We use this method for constrained minimization problems in Hilbert spaces equipped with an inner product denoted by $\langle \cdot,\cdot \rangle$ which induces a complete norm $\|\cdot\|$. For every $z \in R^1$ denote by $\lfloor z \rfloor$ the largest integer which does not exceed $z$:
$$\lfloor z \rfloor = \max\{i \in R^1 : i \text{ is an integer and } i \le z\}.$$
Let $X$ be a Hilbert space. For each $x \in X$ and each $r > 0$ set
$$B_X(x,r) = \{y \in X : \|x - y\| \le r\}.$$
For each $x \in X$ and each nonempty set $E \subset X$ set
$$d(x,E) = \inf\{\|x - y\| : y \in E\}.$$

© Springer International Publishing Switzerland 2016 1


A.J. Zaslavski, Numerical Optimization with Computational Errors, Springer
Optimization and Its Applications 108, DOI 10.1007/978-3-319-30921-7_1

Let $C$ be a nonempty closed convex subset of $X$, let $U$ be an open convex subset of $X$ such that $C \subset U$ and let $f: U \to R^1$ be a convex function. For each $x \in U$ set
$$\partial f(x) = \{l \in X : f(y) - f(x) \ge \langle l, y - x\rangle \text{ for all } y \in U\},$$
which is called the subdifferential of the function $f$ at the point $x$ [84].
Suppose that there exist $L > 0$, $M_0 > 0$ such that
$$C \subset B_X(0, M_0),$$
$$|f(x) - f(y)| \le L\|x - y\| \text{ for all } x, y \in U.$$
It is not difficult to see that for each $x \in U$,
$$\emptyset \ne \partial f(x) \subset B_X(0, L).$$
For every nonempty closed convex set $D \subset X$ and every $x \in X$ there is a unique point $P_D(x) \in D$ satisfying
$$\|x - P_D(x)\| = \inf\{\|x - y\| : y \in D\}.$$
We consider the minimization problem
$$f(z) \to \min, \quad z \in C.$$

Suppose that $\delta \in (0,1]$ is a computational error produced by our computer system and that $\{a_k\}_{k=0}^{\infty} \subset (0,\infty)$.
Let us describe our algorithm.
Subgradient Projection Algorithm
Initialization: select an arbitrary $x_0 \in U$.
Iterative step: given a current iteration vector $x_t \in U$ calculate
$$\xi_t \in \partial f(x_t) + B_X(0, \delta)$$
and the next iteration vector $x_{t+1} \in U$ such that
$$\|x_{t+1} - P_C(x_t - a_t \xi_t)\| \le \delta.$$
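To make the iterative step concrete, here is a minimal one-dimensional sketch (not taken from the book) of the inexact subgradient projection method for $f(x) = |x - 1|$ over $C = [-2, 2]$, whose minimizer is $x_* = 1$; the error bound `delta`, the constant step size `a`, and the objective are illustrative assumptions.

```python
import random

random.seed(0)

def proj_C(x, lo=-2.0, hi=2.0):
    # P_C: projection onto the closed interval [lo, hi]
    return min(max(x, lo), hi)

def subgrad(x):
    # an element of the subdifferential of f(x) = |x - 1|
    return 1.0 if x > 1.0 else (-1.0 if x < 1.0 else 0.0)

delta = 1e-3   # bound on the computational error
a = 0.05       # constant step size a_t = a
x = 0.0        # x_0
best = abs(x - 1.0)
for t in range(200):
    # xi_t lies in the subdifferential only up to an error of magnitude delta
    xi = subgrad(x) + random.uniform(-delta, delta)
    # the projection itself is also computed only up to delta
    x = proj_C(x - a * xi) + random.uniform(-delta, delta)
    best = min(best, abs(x - 1.0))

# the iterates reach a good approximate solution, not an exact one
assert best < 0.1
```

With errors present the iterates oscillate near $x_*$ rather than converge, which is exactly the phenomenon the results of this section quantify.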

In Chap. 2 we prove the following result (see Theorem 2.4).

Theorem 1.1. Let $\delta \in (0,1]$, $\{a_k\}_{k=0}^{\infty} \subset (0,\infty)$ and let $x_* \in C$ satisfy
$$f(x_*) \le f(x) \text{ for all } x \in C.$$
Assume that $\{x_t\}_{t=0}^{\infty} \subset U$, $\{\xi_t\}_{t=0}^{\infty} \subset X$,
$$\|x_0\| \le M_0 + 1$$
and that for each integer $t \ge 0$,
$$\xi_t \in \partial f(x_t) + B_X(0, \delta)$$
and
$$\|x_{t+1} - P_C(x_t - a_t \xi_t)\| \le \delta.$$
Then for each natural number $T$,
$$\sum_{t=0}^{T} a_t (f(x_t) - f(x_*)) \le 2^{-1}\|x_* - x_0\|^2 + \delta(T+1)(4M_0+1) + \delta(2M_0+1)\sum_{t=0}^{T} a_t + 2^{-1}(L+1)^2 \sum_{t=0}^{T} a_t^2.$$
Moreover, for each natural number $T$,
$$\max\Big\{ f\Big(\Big(\sum_{t=0}^{T} a_t\Big)^{-1} \sum_{t=0}^{T} a_t x_t\Big) - f(x_*),\ \min\{f(x_t) : t = 0, \dots, T\} - f(x_*) \Big\}$$
$$\le \Big(\sum_{t=0}^{T} a_t\Big)^{-1} 2^{-1}\|x_* - x_0\|^2 + \Big(\sum_{t=0}^{T} a_t\Big)^{-1} \delta(T+1)(4M_0+1) + \delta(2M_0+1) + 2^{-1}\Big(\sum_{t=0}^{T} a_t\Big)^{-1}(L+1)^2 \sum_{t=0}^{T} a_t^2.$$

We are interested in an optimal choice of $a_t$, $t = 0, 1, \dots$. Let $T$ be a natural number and let $A_T = \sum_{t=0}^{T} a_t$ be given. It is shown in Chap. 2 that the best choice is $a_i = (T+1)^{-1} A_T$, $i = 0, \dots, T$.
Let $T$ be a natural number and $a_t = a$, $t = 0, \dots, T$. It is shown in Chap. 2 that the best choice of $a > 0$ is
$$a = (2\delta(4M_0+1))^{1/2}(L+1)^{-1}.$$
Now we can think about the best choice of $T$. It is not difficult to see that it should be of the same order as $\lfloor \delta^{-1} \rfloor$.
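As a quick arithmetic illustration of these recipes (the sample values $\delta = 10^{-4}$, $M_0 = 1$, $L = 1$ are mine, not the book's):

```python
import math

delta, M0, L = 1e-4, 1.0, 1.0   # illustrative sample values

# constant step size a = (2*delta*(4*M0 + 1))**(1/2) * (L + 1)**(-1)
a = math.sqrt(2 * delta * (4 * M0 + 1)) / (L + 1)

# number of iterations of the same order as floor(1/delta)
T = math.floor(1.0 / delta)

assert abs(a - 0.5 * math.sqrt(1e-3)) < 1e-12
assert T == 10000
```

The step size shrinks like $\delta^{1/2}$ while the iteration budget grows like $\delta^{-1}$, which is the trade-off the discussion above describes.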

1.2 The Mirror Descent Method

Let $X$ be a Hilbert space equipped with an inner product $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$. We use the notation introduced in the previous section.
Let $C$ be a nonempty closed convex subset of $X$, let $U$ be an open convex subset of $X$ such that $C \subset U$ and let $f: U \to R^1$ be a convex function. Suppose that there exist $L > 0$, $M_0 > 0$ such that
$$C \subset B_X(0, M_0),$$
$$|f(x) - f(y)| \le L\|x - y\| \text{ for all } x, y \in U.$$
It is not difficult to see that for each $x \in U$,
$$\emptyset \ne \partial f(x) \subset B_X(0, L).$$
For each nonempty set $D \subset X$ and each function $h: D \to R^1$ put
$$\inf(h; D) = \inf\{h(y) : y \in D\}$$
and
$$\operatorname{argmin}\{h(y) : y \in D\} = \{y \in D : h(y) = \inf(h; D)\}.$$

In Chap. 3 we study the convergence of the mirror descent algorithm in the presence of computational errors. This method was introduced by Nemirovsky and Yudin for solving convex optimization problems [90]. Here we use a derivation of this algorithm proposed by Beck and Teboulle [19].
We consider the minimization problem
$$f(z) \to \min, \quad z \in C.$$

Suppose that $\delta \in (0,1]$ is a computational error produced by our computer system and that $\{a_k\}_{k=0}^{\infty} \subset (0,\infty)$. We describe the inexact version of the mirror descent algorithm.
Mirror Descent Algorithm
Initialization: select an arbitrary $x_0 \in U$.
Iterative step: given a current iteration vector $x_t \in U$ calculate
$$\xi_t \in \partial f(x_t) + B_X(0, \delta),$$
define
$$g_t(x) = \langle \xi_t, x\rangle + (2a_t)^{-1}\|x - x_t\|^2, \quad x \in X,$$
and calculate the next iteration vector $x_{t+1} \in U$ such that
$$B_X(x_{t+1}, \delta) \cap \operatorname{argmin}\{g_t(y) : y \in C\} \ne \emptyset.$$
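Since the norm here is the Hilbert-space norm, the subproblem $\min\{g_t(y) : y \in C\}$ has the closed-form solution $P_C(x_t - a_t\xi_t)$, so each mirror descent step reduces to a projection. A one-dimensional sanity check (all numbers are illustrative choices of mine, not the book's):

```python
lo, hi = -1.0, 1.0           # C = [lo, hi]
xt, xi, a = 0.4, 6.0, 0.3    # current iterate, subgradient estimate, step size

def g(x):
    # g_t(x) = <xi_t, x> + (2 a_t)^{-1} ||x - x_t||^2 in one dimension
    return xi * x + (x - xt) ** 2 / (2 * a)

# closed form: unconstrained minimizer xt - a*xi, clipped (projected) onto C
closed = min(max(xt - a * xi, lo), hi)

# brute-force minimizer of g over a fine grid of C
grid = [lo + i * (hi - lo) / 10**5 for i in range(10**5 + 1)]
brute = min(grid, key=g)

assert abs(closed - brute) < 1e-4
```

Here the unconstrained minimizer $x_t - a\xi = -1.4$ lies outside $C$, so the subproblem solution is the projected boundary point.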

Note that $g_t$ is a convex function on $X$ which is bounded from below and possesses a minimizer on $C$.
In Chap. 3 we prove the following result (see Theorem 3.1).
Theorem 1.2. Let $\delta \in (0,1]$, $\{a_k\}_{k=0}^{\infty} \subset (0,\infty)$ and let $x_* \in C$ satisfy
$$f(x_*) \le f(x) \text{ for all } x \in C.$$
Assume that $\{x_t\}_{t=0}^{\infty} \subset U$, $\{\xi_t\}_{t=0}^{\infty} \subset X$,
$$\|x_0\| \le M_0 + 1$$
and that for each integer $t \ge 0$,
$$\xi_t \in \partial f(x_t) + B_X(0, \delta)$$
and
$$B_X(x_{t+1}, \delta) \cap \operatorname{argmin}\{\langle \xi_t, v\rangle + (2a_t)^{-1}\|v - x_t\|^2 : v \in C\} \ne \emptyset.$$
Then for each natural number $T$,
$$\sum_{t=0}^{T} a_t (f(x_t) - f(x_*)) \le 2^{-1}(2M_0+1)^2 + \delta(2M_0+L+2)\sum_{t=0}^{T} a_t + \delta(T+1)(8M_0+8) + 2^{-1}(L+1)^2 \sum_{t=0}^{T} a_t^2.$$
Moreover, for each natural number $T$,
$$f\Big(\Big(\sum_{t=0}^{T} a_t\Big)^{-1} \sum_{t=0}^{T} a_t x_t\Big) - f(x_*),\ \min\{f(x_t) : t = 0, \dots, T\} - f(x_*)$$
$$\le 2^{-1}(2M_0+1)^2\Big(\sum_{t=0}^{T} a_t\Big)^{-1} + \delta(2M_0+L+2) + \delta(T+1)(8M_0+8)\Big(\sum_{t=0}^{T} a_t\Big)^{-1} + 2^{-1}(L+1)^2\Big(\sum_{t=0}^{T} a_t^2\Big)\Big(\sum_{t=0}^{T} a_t\Big)^{-1}.$$

We are interested in an optimal choice of $a_t$, $t = 0, 1, \dots$. Let $T$ be a natural number and let $A_T = \sum_{t=0}^{T} a_t$ be given. It is shown in Chap. 3 that the best choice is $a_i = (T+1)^{-1} A_T$, $i = 0, \dots, T$.
Let $T$ be a natural number and $a_t = a$, $t = 0, \dots, T$. It is shown in Chap. 3 that the best choice of $a > 0$ is
$$a = (16\delta(M_0+1))^{1/2}(L+1)^{-1}.$$
If we think about the best choice of $T$, it is clear that it should be of the same order as $\lfloor \delta^{-1} \rfloor$.

1.3 Proximal Point Method

In Chap. 9 we analyze the behavior of the proximal point method in a Hilbert space, which is an important tool in optimization theory. See, for example, [9, 15, 16, 29, 31, 34, 36, 53, 55, 69, 70, 77, 81, 87, 103, 104, 106, 107, 111, 113] and the references mentioned therein.
Let $X$ be a Hilbert space equipped with an inner product $\langle\cdot,\cdot\rangle$ which induces the norm $\|\cdot\|$.
For each function $g: X \to R^1 \cup \{\infty\}$ set
$$\inf(g) = \inf\{g(y) : y \in X\}.$$
Suppose that $f: X \to R^1 \cup \{\infty\}$ is a convex lower semicontinuous function and $a$ is a positive constant such that
$$\operatorname{dom}(f) := \{x \in X : f(x) < \infty\} \ne \emptyset,$$
$$f(x) \ge -a \text{ for all } x \in X$$
and that
$$\lim_{\|x\| \to \infty} f(x) = \infty.$$
It is not difficult to see that the set
$$\operatorname{Argmin}(f) := \{z \in X : f(z) = \inf(f)\} \ne \emptyset.$$



Let a point
$$x_* \in \operatorname{Argmin}(f)$$
and let $M$ be any positive number such that
$$M > \inf(f) + 4.$$
It is clear that there exists a number $M_0 > 1$ such that
$$f(z) > M + 4 \text{ for all } z \in X \text{ satisfying } \|z\| \ge M_0 - 1.$$
Evidently,
$$\|x_*\| < M_0 - 1.$$
Assume that
$$0 < \lambda_1 < \lambda_2 \le M_0^2/2.$$

The following theorem is the main result of Chap. 9.

Theorem 1.3. Let
$$\lambda_k \in [\lambda_1, \lambda_2], \quad k = 0, 1, \dots,$$
let $\epsilon \in (0,1]$, let a natural number $L$ satisfy
$$L > 2(4M_0^2 + 1)\lambda_2\epsilon^{-1}$$
and let a positive number $\delta$ satisfy
$$\delta^{1/2}(L+1)\big(2\lambda_1^{-1} + 8M_0\lambda_1^{-1/2}\big) \le 1 \quad \text{and} \quad \delta(L+1) \le \epsilon/4.$$
Assume that a sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfies
$$f(x_0) \le M$$
and
$$f(x_{k+1}) + 2^{-1}\lambda_k\|x_{k+1} - x_k\|^2 \le \inf\big(f + 2^{-1}\lambda_k\|\cdot - x_k\|^2\big) + \delta$$
for all integers $k \ge 0$. Then for all integers $k > L$,
$$f(x_k) \le \inf(f) + \epsilon.$$

By this theorem, for a given $\epsilon > 0$, we obtain $\xi \in X$ satisfying
$$f(\xi) \le \inf(f) + \epsilon$$
doing $\lfloor c_1\epsilon^{-1} \rfloor$ iterations with the computational error $\delta = c_2\epsilon^2$, where the constant $c_1 > 0$ depends only on $M_0, \lambda_2$ and the constant $c_2 > 0$ depends only on $M_0, L, \lambda_1, \lambda_2$.
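A one-dimensional sketch of such an inexact proximal point iteration (the quadratic objective, the fixed $\lambda$, and the error model are illustrative assumptions, not the book's): for $f(x) = x^2 + 1$ the proximal subproblem has a closed-form solution, and perturbing it by a small shift models the computational error.

```python
lam = 1.0         # lambda_k = lam for all k
err = 1e-4        # size of the computational error per step

def f(x):
    return x * x + 1.0   # inf(f) = 1, attained at x = 0

def prox(y):
    # exact minimizer of f(x) + (lam/2) * (x - y)^2:  x = lam * y / (2 + lam)
    return lam * y / (2.0 + lam)

x = 5.0                 # x_0 with f(x_0) <= M
for k in range(60):
    x = prox(x) + err   # inexact proximal step

# after enough iterations, f(x_k) is within a small tolerance of inf(f)
assert f(x) - 1.0 < 1e-6
```

The iterates settle near a small neighborhood of the minimizer whose radius is controlled by the per-step error, in line with the theorem's trade-off between the error size and the achievable accuracy.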

1.4 Variational Inequalities

In Chap. 12 we are interested in solving variational inequalities. The studies of gradient-type methods and variational inequalities are important topics in optimization theory. See, for example, [3, 5, 12, 30, 31, 37–39, 44, 52, 54, 59, 68, 71–74, 93, 129] and the references mentioned therein.
Let $(X, \langle\cdot,\cdot\rangle)$ be a Hilbert space with an inner product $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$. For each $x \in X$ and each $r > 0$ set
$$B(x, r) = \{y \in X : \|x - y\| \le r\}.$$
Let $C$ be a nonempty closed convex subset of $X$.
Consider a mapping $f: X \to X$. We say that the mapping $f$ is monotone on $C$ if
$$\langle f(x) - f(y), x - y\rangle \ge 0 \text{ for all } x, y \in C.$$
We say that $f$ is pseudo-monotone on $C$ if for each $x, y \in C$ the inequality
$$\langle f(y), x - y\rangle \ge 0 \text{ implies that } \langle f(x), x - y\rangle \ge 0.$$
Clearly, if $f$ is monotone on $C$, then $f$ is pseudo-monotone on $C$. Denote by $S$ the set of all $x \in C$ such that
$$\langle f(x), y - x\rangle \ge 0 \text{ for all } y \in C.$$
We suppose that
$$S \ne \emptyset.$$
For each $\epsilon > 0$ denote by $S_\epsilon$ the set of all $x \in C$ such that
$$\langle f(x), y - x\rangle \ge -\epsilon\|y - x\| - \epsilon \text{ for all } y \in C.$$



In Chap. 12, we present examples which provide simple and clear estimations for the sets $S_\epsilon$ in some important cases. These examples show that elements of $S_\epsilon$ can be considered as $\epsilon$-approximate solutions of the variational inequality.
In Chap. 12, in order to solve the variational inequality (to find $x \in S$), we use the algorithm known in the literature as the extragradient method [75]. In each iteration of this algorithm, in order to get the next iterate $x_{k+1}$, two orthogonal projections onto $C$ are calculated, according to the following iterative step. Given the current iterate $x_k$, calculate $y_k = P_C(x_k - \tau_k f(x_k))$ and then
$$x_{k+1} = P_C(x_k - \tau_k f(y_k)),$$
where $\tau_k$ is some positive number. It is known that this algorithm generates sequences which converge to an element of $S$. In Chap. 12, we study the behavior of the sequences generated by the algorithm taking into account computational errors which are always present in practice. Namely, in practice the algorithm generates sequences $\{x_k\}_{k=0}^{\infty}$ and $\{y_k\}_{k=0}^{\infty}$ such that for each integer $k \ge 0$,
$$\|y_k - P_C(x_k - \tau_k f(x_k))\| \le \delta$$
and
$$\|x_{k+1} - P_C(x_k - \tau_k f(y_k))\| \le \delta,$$
with a constant $\delta > 0$ which depends only on our computer system. Surely, in this situation one cannot expect that the sequence $\{x_k\}_{k=0}^{\infty}$ converges to the set $S$. The goal is to understand what subset of $C$ attracts all sequences $\{x_k\}_{k=0}^{\infty}$ generated by the algorithm. The main result of Chap. 12 (Theorem 12.2) shows that this subset of $C$ is the set $S_\epsilon$ with some $\epsilon > 0$ depending on $\delta$. The examples considered in Chap. 12 show that one cannot expect to find an attracting set smaller than $S_\epsilon$, whose elements can be considered as approximate solutions of the variational inequality.
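The two-projection step with errors can be sketched numerically. The mapping $f(x_1, x_2) = (x_2, -x_1)$ below is monotone on $C = [-2, 2]^2$, and the unique solution of the variational inequality is $x_* = (0, 0)$; the error bound `delta`, the step size `tau`, and all numbers are my illustrative choices, not the book's.

```python
import random

random.seed(1)
delta, tau = 1e-3, 0.2

def proj_C(v):
    # projection onto the box C = [-2, 2]^2
    return [min(max(c, -2.0), 2.0) for c in v]

def F(v):
    # a monotone (skew) mapping whose variational-inequality solution is (0, 0)
    return [v[1], -v[0]]

def noisy(v):
    # model a computation accurate to within delta in each coordinate
    return [c + random.uniform(-delta, delta) for c in v]

x = [1.5, -1.0]
for k in range(300):
    fx = F(x)
    y = noisy(proj_C([x[i] - tau * fx[i] for i in range(2)]))
    fy = F(y)
    x = noisy(proj_C([x[i] - tau * fy[i] for i in range(2)]))

# the iterates are attracted to a small neighborhood of x_*, not to x_* itself
assert (x[0] ** 2 + x[1] ** 2) ** 0.5 < 0.1
```

The residual neighborhood around $x_*$ shrinks with `delta`, which mirrors the attracting set $S_\epsilon$ with $\epsilon$ depending on $\delta$ described above.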
Chapter 2
Subgradient Projection Algorithm

In this chapter we study the subgradient projection algorithm for minimization of convex and nonsmooth functions and for computing the saddle points of convex–concave functions, in the presence of computational errors. We show that our algorithms generate a good approximate solution if computational errors are bounded from above by a small positive constant. Moreover, for a known computational error, we find out what approximate solution can be obtained and how many iterates one needs for this.

2.1 Preliminaries

The subgradient projection algorithm is one of the most important tools in optimization theory and its applications. See, for example, [1–3, 12, 30, 44, 51, 79, 89, 92, 95, 96, 105, 108, 109, 112] and the references mentioned therein.
In this chapter we use this method for constrained minimization problems in Hilbert spaces equipped with an inner product denoted by $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$. For every $z \in R^1$ denote by $\lfloor z \rfloor$ the largest integer which does not exceed $z$:
$$\lfloor z \rfloor = \max\{i \in R^1 : i \text{ is an integer and } i \le z\}.$$
Let $X$ be a Hilbert space. For each $x \in X$ and each $r > 0$ set
$$B_X(x, r) = \{y \in X : \|x - y\| \le r\}.$$
For each $x \in X$ and each nonempty set $E \subset X$ set
$$d(x, E) = \inf\{\|x - y\| : y \in E\}.$$

Let C be a nonempty closed convex subset of X, U be an open convex subset of X


such that C  U and let f W U ! R1 be a convex function. Recall that for each
x 2 U,

© Springer International Publishing Switzerland 2016 11


A.J. Zaslavski, Numerical Optimization with Computational Errors, Springer
Optimization and Its Applications 108, DOI 10.1007/978-3-319-30921-7_2
12 2 Subgradient Projection Algorithm

$\partial f(x) = \{ l \in X : f(y) - f(x) \ge \langle l, y - x \rangle \text{ for all } y \in U \}.$  (2.1)

Suppose that there exist $L > 0$, $M_0 > 0$ such that

$C \subset B_X(0, M_0),$  (2.2)

$|f(x) - f(y)| \le L \|x - y\|$ for all $x, y \in U.$  (2.3)

In view of (2.1) and (2.3), for each $x \in U$,

$\emptyset \ne \partial f(x) \subset B_X(0, L).$  (2.4)

It is easy to see that the following result is true.


Lemma 2.1. Let $z, y_0, y_1 \in X$. Then

$\|z - y_0\|^2 - \|z - y_1\|^2 - \|y_0 - y_1\|^2 = 2 \langle z - y_1, y_1 - y_0 \rangle.$

The next result is given in [13, 14].

Lemma 2.2. Let $D$ be a nonempty closed convex subset of $X$. Then for each $x \in X$
there is a unique point $P_D(x) \in D$ satisfying

$\|x - P_D(x)\| = \inf\{ \|x - y\| : y \in D \}.$

Moreover,

$\|P_D(x) - P_D(y)\| \le \|x - y\|$ for all $x, y \in X$

and for each $x \in X$ and each $z \in D$,

$\langle z - P_D(x), x - P_D(x) \rangle \le 0,$
$\|z - P_D(x)\|^2 + \|x - P_D(x)\|^2 \le \|z - x\|^2.$

Lemma 2.3. Let $A > 0$ and $n \ge 2$ be an integer. Then the minimization problem

$\sum_{i=1}^n a_i^2 \to \min,$
$a = (a_1, \ldots, a_n) \in R^n$ and $\sum_{i=1}^n a_i = A$

has a unique solution $\bar a = (\bar a_1, \ldots, \bar a_n)$ where $\bar a_i = n^{-1} A$, $i = 1, \ldots, n$.


Proof. Clearly, the minimization problem has a solution $\bar a = (\bar a_1, \ldots, \bar a_n) \in R^n$.
Then

$\bar a_n = A - \sum_{i=1}^{n-1} \bar a_i$

and $(\bar a_1, \ldots, \bar a_{n-1})$ is a minimizer of the function

$\phi(a_1, \ldots, a_{n-1}) := \sum_{i=1}^{n-1} a_i^2 + \left( A - \sum_{i=1}^{n-1} a_i \right)^2, \quad (a_1, \ldots, a_{n-1}) \in R^{n-1}.$

It is clear that for all $i = 1, \ldots, n-1$,

$0 = (\partial \phi / \partial a_i)(\bar a_1, \ldots, \bar a_{n-1}) = 2 \bar a_i - 2 \left( A - \sum_{j=1}^{n-1} \bar a_j \right) = 2 \bar a_i - 2 \bar a_n.$

Thus $\bar a_i = \bar a_n$ for all $i = 1, \ldots, n-1$ and $\bar a_i = n^{-1} A$ for all $i = 1, \ldots, n$. Lemma 2.3
is proved.
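Lemma 2.3 is easy to probe numerically. The sketch below is my illustration, not part of the book; the values of $n$ and $A$ are arbitrary sample choices. It compares the equal-weight vector against random feasible vectors.

```python
import random

# Lemma 2.3: over all (a_1, ..., a_n) with sum A, the sum of squares
# is minimized by the constant vector a_i = A / n, with value A^2 / n.
random.seed(0)
n, A = 5, 2.0
best = sum((A / n) ** 2 for _ in range(n))  # value at the claimed minimizer

for _ in range(1000):
    # random feasible point: n - 1 free coordinates, last fixed by the constraint
    a = [random.uniform(-2.0, 2.0) for _ in range(n - 1)]
    a.append(A - sum(a))
    assert sum(x * x for x in a) >= best - 1e-12

print("equal weights give the smallest sum of squares:", best)
```

Every randomly drawn feasible vector has a strictly larger (or equal) sum of squares, as the lemma asserts.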

2.2 A Convex Minimization Problem

Let $\delta \in (0, 1]$ and $\{a_k\}_{k=0}^\infty \subset (0, \infty)$.
Let us describe our algorithm.

Subgradient Projection Algorithm

Initialization: select an arbitrary $x_0 \in U$.

Iterative step: given a current iteration vector $x_t \in U$ calculate

$\xi_t \in \partial f(x_t) + B_X(0, \delta)$

and the next iteration vector $x_{t+1} \in U$ such that

$\|x_{t+1} - P_C(x_t - a_t \xi_t)\| \le \delta.$
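The iterative step admits a direct translation into code. The sketch below is my illustration, not from the book: it runs the method in $R^2$ with $C$ the closed unit ball, $f(x) = \|x\|_1$ (whose minimizer over $C$ is the origin), a constant step size, and computational errors modeled as random perturbations of norm at most $\delta$; all numerical values are arbitrary choices.

```python
import math
import random

def proj_unit_ball(x):
    # P_C for C = {x : ||x|| <= 1} in the Euclidean norm
    n = math.sqrt(sum(v * v for v in x))
    return x if n <= 1.0 else [v / n for v in x]

def noisy(x, delta):
    # add an error vector of norm at most delta (models inexact computation)
    e = [random.uniform(-1.0, 1.0) for _ in x]
    n = math.sqrt(sum(v * v for v in e)) or 1.0
    r = random.uniform(0.0, delta)
    return [xi + r * ei / n for xi, ei in zip(x, e)]

def f(x):            # f(x) = ||x||_1, convex, Lipschitz on R^2
    return sum(abs(v) for v in x)

def subgrad(x):      # one element of the subdifferential of ||.||_1
    return [1.0 if v > 0 else (-1.0 if v < 0 else 0.0) for v in x]

random.seed(0)
delta, a, T = 1e-4, 0.05, 2000
x = [0.9, -0.7]      # x_0 with ||x_0|| <= M_0 + 1
best = f(x)
for _ in range(T):
    xi = noisy(subgrad(x), delta)                 # xi_t in df(x_t) + B(0, delta)
    x = noisy(proj_unit_ball([x[i] - a * xi[i] for i in range(2)]), delta)
    best = min(best, f(x))

print("min_t f(x_t) =", best)   # small, of the order of the step size or below
```

The best function value along the trajectory settles near the optimal value $0$, up to an error governed by the step size and $\delta$, matching the flavor of the estimates proved below.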

In this chapter we prove the following result.

Theorem 2.4. Let $\delta \in (0, 1]$, $\{a_k\}_{k=0}^\infty \subset (0, \infty)$ and let

$x_* \in C$  (2.5)

satisfy

$f(x_*) \le f(x)$ for all $x \in C.$  (2.6)

Assume that $\{x_t\}_{t=0}^\infty \subset U$, $\{\xi_t\}_{t=0}^\infty \subset X$,

$\|x_0\| \le M_0 + 1$  (2.7)

and that for each integer $t \ge 0$,

$\xi_t \in \partial f(x_t) + B_X(0, \delta)$  (2.8)

and

$\|x_{t+1} - P_C(x_t - a_t \xi_t)\| \le \delta.$  (2.9)

Then for each natural number $T$,

$\sum_{t=0}^T a_t (f(x_t) - f(x_*)) \le 2^{-1} \|x_* - x_0\|^2 + \delta (T+1)(4 M_0 + 1)$
$+ \delta (2 M_0 + 1) \sum_{t=0}^T a_t + 2^{-1} (L+1)^2 \sum_{t=0}^T a_t^2.$  (2.10)

Moreover, for each natural number $T$,

$f\left( \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T a_t x_t \right) - f(x_*),\ \min\{ f(x_t) : t = 0, \ldots, T \} - f(x_*)$
$\le 2^{-1} \left( \sum_{t=0}^T a_t \right)^{-1} \|x_* - x_0\|^2 + \left( \sum_{t=0}^T a_t \right)^{-1} \delta (T+1)(4 M_0 + 1)$
$+ \delta (2 M_0 + 1) + 2^{-1} \left( \sum_{t=0}^T a_t \right)^{-1} (L+1)^2 \sum_{t=0}^T a_t^2.$  (2.11)

Theorem 2.4 is proved in Sect. 2.4.


We are interested in an optimal choice of $a_t$, $t = 0, 1, \ldots$. Let $T$ be a natural
number and $A_T = \sum_{t=0}^T a_t$ be given. By Theorem 2.4, in order to make the best
choice of $a_t$, $t = 0, \ldots, T$, we need to minimize the function

$\phi(a_0, \ldots, a_T) := 2^{-1} A_T^{-1} \|x_* - x_0\|^2 + A_T^{-1} \delta (T+1)(4 M_0 + 1)$
$+ \delta (2 M_0 + 1) + 2^{-1} A_T^{-1} (L+1)^2 \sum_{t=0}^T a_t^2$

on the set

$\left\{ a = (a_0, \ldots, a_T) \in R^{T+1} : a_i \ge 0,\ i = 0, \ldots, T,\ \sum_{i=0}^T a_i = A_T \right\}.$

By Lemma 2.3, this function has a unique minimizer $\bar a = (\bar a_0, \ldots, \bar a_T)$ where
$\bar a_i = (T+1)^{-1} A_T$, $i = 0, \ldots, T$. This is the best choice of $a_t$, $t = 0, 1, \ldots, T$.
Theorem 2.4 implies the following result.
Theorem 2.5. Let $\delta \in (0, 1]$, $a > 0$ and let $x_* \in C$ satisfy

$f(x_*) \le f(x)$ for all $x \in C.$

Assume that $\{x_t\}_{t=0}^\infty \subset U$, $\{\xi_t\}_{t=0}^\infty \subset X$,

$\|x_0\| \le M_0 + 1$

and that for each integer $t \ge 0$,

$\xi_t \in \partial f(x_t) + B_X(0, \delta)$

and

$\|x_{t+1} - P_C(x_t - a \xi_t)\| \le \delta.$

Then for each natural number $T$,

$f\left( (T+1)^{-1} \sum_{t=0}^T x_t \right) - f(x_*),\ \min\{ f(x_t) : t = 0, \ldots, T \} - f(x_*)$
$\le 2^{-1} (T+1)^{-1} a^{-1} (2 M_0 + 1)^2 + a^{-1} \delta (4 M_0 + 1)$
$+ \delta (2 M_0 + 1) + 2^{-1} (L+1)^2 a.$

Now we will find the best $a > 0$. Since $T$ can be arbitrarily large, we need to find
a minimizer of the function

$\psi(a) := a^{-1} \delta (4 M_0 + 1) + 2^{-1} (L+1)^2 a, \quad a \in (0, \infty).$

Clearly, the minimizer $\bar a$ satisfies

$\bar a^{-1} \delta (4 M_0 + 1) = 2^{-1} (L+1)^2 \bar a$

and

$\bar a = (2 \delta (4 M_0 + 1))^{1/2} (L+1)^{-1}$

and the minimal value of $\psi$ is

$(2 \delta (4 M_0 + 1))^{1/2} (L+1).$  (2.12)
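The minimizer and the minimal value in (2.12) can be confirmed numerically; the values of $\delta$, $M_0$ and $L$ below are arbitrary samples (my illustration, not part of the book).

```python
import math

delta, M0, L = 1e-3, 2.0, 3.0

def psi(a):
    # psi(a) = a^{-1} delta (4 M0 + 1) + 2^{-1} (L + 1)^2 a
    return delta * (4 * M0 + 1) / a + 0.5 * (L + 1) ** 2 * a

a_bar = math.sqrt(2 * delta * (4 * M0 + 1)) / (L + 1)      # claimed minimizer
min_val = math.sqrt(2 * delta * (4 * M0 + 1)) * (L + 1)    # claimed minimum (2.12)

assert abs(psi(a_bar) - min_val) < 1e-9
# psi is larger at nearby points, as expected at the minimizer
assert psi(0.5 * a_bar) > min_val and psi(2.0 * a_bar) > min_val
print("a_bar =", a_bar, "psi(a_bar) =", psi(a_bar))
```

At the minimizer the two terms of $\psi$ are equal, which is exactly the balance condition stated above.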

Theorem 2.5 implies the following result.


Theorem 2.6. Let $\delta \in (0, 1]$,

$a = (2 \delta (4 M_0 + 1))^{1/2} (L+1)^{-1},$

$x_* \in C$ satisfy

$f(x_*) \le f(x)$ for all $x \in C.$

Assume that $\{x_t\}_{t=0}^\infty \subset U$, $\{\xi_t\}_{t=0}^\infty \subset X$,

$\|x_0\| \le M_0 + 1$

and that for each integer $t \ge 0$,

$\xi_t \in \partial f(x_t) + B_X(0, \delta)$

and

$\|x_{t+1} - P_C(x_t - a \xi_t)\| \le \delta.$

Then for each natural number $T$,

$f\left( (T+1)^{-1} \sum_{t=0}^T x_t \right) - f(x_*),\ \min\{ f(x_t) : t = 0, \ldots, T \} - f(x_*)$
$\le 2^{-1} (T+1)^{-1} (2 M_0 + 1)^2 (L+1)(2 \delta (4 M_0 + 1))^{-1/2} + \delta (2 M_0 + 1)$
$+ 2^{-1} (2 \delta (4 M_0 + 1))^{1/2} (L+1) + \delta (4 M_0 + 1)(L+1)(2 \delta (4 M_0 + 1))^{-1/2}.$

Now we can think about the best choice of $T$. It is clear that it should be of the
same order as $\lfloor \delta^{-1} \rfloor$. Putting $T = \lfloor \delta^{-1} \rfloor$, we obtain that

$f\left( (T+1)^{-1} \sum_{t=0}^T x_t \right) - f(x_*),\ \min\{ f(x_t) : t = 0, \ldots, T \} - f(x_*)$
$\le 2^{-1} (2 M_0 + 1)^2 (L+1)(8 M_0 + 2)^{-1/2} \delta^{1/2} + \delta (2 M_0 + 1)$
$+ 2^{-1} (8 M_0 + 2)^{1/2} (L+1) \delta^{1/2} + (4 M_0 + 1)(L+1)(8 M_0 + 2)^{-1/2} \delta^{1/2}.$

Note that in the theorems above $\delta$ is the computational error produced by our
computer system.
In view of the inequality above, which has the right-hand side bounded by $c_1 \delta^{1/2}$
with a constant $c_1 > 0$, we conclude that after $T = \lfloor \delta^{-1} \rfloor$ iterations we obtain a
point $\xi \in U$ such that

$B_X(\xi, \delta) \cap C \ne \emptyset$

and

$f(\xi) \le f(x_*) + c_1 \delta^{1/2},$

where the constant $c_1 > 0$ depends only on $L$ and $M_0$.
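The $\delta^{1/2}$ dependence of the achievable accuracy can be observed experimentally. The sketch below is my illustration, not from the book: it runs the method on the one-dimensional instance $f(x) = |x|$, $C = [-1, 1]$ (all constants arbitrary), with the step size of Theorem 2.6 and $T = \lfloor \delta^{-1} \rfloor$, for two error levels.

```python
import math
import random

def run(delta):
    # projected subgradient method with error level delta on f(x) = |x|, C = [-1, 1]
    random.seed(1)
    M0, L = 1.0, 1.0
    a = math.sqrt(2 * delta * (4 * M0 + 1)) / (L + 1)   # step of Theorem 2.6
    T = int(1 / delta)                                  # T = floor(delta^{-1})
    x = 0.9
    best = abs(x)
    for _ in range(T + 1):
        xi = (1.0 if x > 0 else -1.0) + random.uniform(-delta, delta)
        p = min(1.0, max(-1.0, x - a * xi))             # P_C on C = [-1, 1]
        x = p + random.uniform(-delta, delta)           # inexact update
        best = min(best, abs(x))
    return best

e1, e2 = run(1e-2), run(1e-4)
print("accuracy at delta = 1e-2:", e1)
print("accuracy at delta = 1e-4:", e2)
```

The achieved accuracy typically shrinks as $\delta$ decreases, at a rate consistent with the $c_1 \delta^{1/2}$ bound above.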

2.3 The Main Lemma

We use the notation and definitions introduced in Sect. 2.1.

Lemma 2.7. Let $\delta \in (0, 1]$, $a > 0$ and let

$z \in C.$  (2.13)

Assume that

$x \in U \cap B_X(0, M_0 + 1),$  (2.14)

$\xi \in \partial f(x) + B_X(0, \delta)$  (2.15)

and that

$u \in U$  (2.16)

satisfies

$\|u - P_C(x - a \xi)\| \le \delta.$  (2.17)

Then

$a (f(x) - f(z)) \le 2^{-1} \|z - x\|^2 - 2^{-1} \|z - u\|^2$
$+ \delta (4 M_0 + 1 + a (2 M_0 + 1)) + 2^{-1} a^2 (L+1)^2.$

Proof. In view of (2.15), there exists

$l \in \partial f(x)$  (2.18)

such that

$\|l - \xi\| \le \delta.$  (2.19)

By Lemmas 2.1 and 2.2 and (2.13),

$0 \le \langle z - P_C(x - a \xi), P_C(x - a \xi) - (x - a \xi) \rangle$
$= \langle z - P_C(x - a \xi), P_C(x - a \xi) - x \rangle + \langle a \xi, z - P_C(x - a \xi) \rangle$
$= 2^{-1} [ \|z - x\|^2 - \|z - P_C(x - a \xi)\|^2 - \|x - P_C(x - a \xi)\|^2 ]$
$+ \langle a \xi, z - x \rangle + \langle a \xi, x - P_C(x - a \xi) \rangle.$  (2.20)

Clearly,

$|\langle a \xi, x - P_C(x - a \xi) \rangle| \le 2^{-1} ( \|a \xi\|^2 + \|x - P_C(x - a \xi)\|^2 ).$  (2.21)

It follows from (2.20) and (2.21) that

$0 \le 2^{-1} [ \|z - x\|^2 - \|z - P_C(x - a \xi)\|^2 - \|x - P_C(x - a \xi)\|^2 ]$
$+ \langle a \xi, z - x \rangle + 2^{-1} a^2 \|\xi\|^2 + 2^{-1} \|x - P_C(x - a \xi)\|^2$
$\le 2^{-1} \|z - x\|^2 - 2^{-1} \|z - P_C(x - a \xi)\|^2 + 2^{-1} a^2 \|\xi\|^2 + \langle a \xi, z - x \rangle.$  (2.22)

Relations (2.2), (2.13), and (2.17) imply that

$| \|z - P_C(x - a \xi)\|^2 - \|z - u\|^2 |$
$= | \|z - P_C(x - a \xi)\| - \|z - u\| | \, ( \|z - P_C(x - a \xi)\| + \|z - u\| )$
$\le \|u - P_C(x - a \xi)\| (4 M_0 + 1) \le (4 M_0 + 1) \delta.$  (2.23)

By (2.2), (2.13), (2.14), and (2.19),

$\langle a \xi, z - x \rangle = \langle a l, z - x \rangle + \langle a (\xi - l), z - x \rangle$
$\le \langle a l, z - x \rangle + a \|\xi - l\| \|z - x\|$
$\le \langle a l, z - x \rangle + a \delta (2 M_0 + 1).$  (2.24)

It follows from (2.4), (2.18), (2.19), (2.22), (2.23), and (2.24) that

$0 \le 2^{-1} \|z - x\|^2 - 2^{-1} \|z - P_C(x - a \xi)\|^2 + 2^{-1} a^2 \|\xi\|^2 + \langle a \xi, z - x \rangle$
$\le 2^{-1} \|z - x\|^2 - 2^{-1} \|z - u\|^2 + \delta (4 M_0 + 1) + 2^{-1} a^2 (L+1)^2$
$+ \langle a l, z - x \rangle + a \delta (2 M_0 + 1).$  (2.25)

By (2.1), (2.18), and (2.25),

$a (f(z) - f(x)) \ge \langle a l, z - x \rangle$

and

$a (f(x) - f(z)) \le \langle a l, x - z \rangle$
$\le 2^{-1} \|z - x\|^2 - 2^{-1} \|z - u\|^2 + \delta (4 M_0 + 1) + 2^{-1} a^2 (L+1)^2$
$+ a \delta (2 M_0 + 1).$

This completes the proof of Lemma 2.7.

2.4 Proof of Theorem 2.4

It is clear that

$\|x_t\| \le M_0 + 1, \quad t = 0, 1, \ldots.$

Let $t \ge 0$ be an integer. Applying Lemma 2.7 with

$z = x_*, \ a = a_t, \ x = x_t, \ \xi = \xi_t, \ u = x_{t+1}$

we obtain that

$a_t (f(x_t) - f(x_*)) \le 2^{-1} \|x_* - x_t\|^2 - 2^{-1} \|x_* - x_{t+1}\|^2$
$+ \delta (4 M_0 + 1 + a_t (2 M_0 + 1)) + 2^{-1} a_t^2 (L+1)^2.$  (2.26)

By (2.26), for each natural number $T$,

$\sum_{t=0}^T a_t (f(x_t) - f(x_*))$
$\le \sum_{t=0}^T ( 2^{-1} \|x_* - x_t\|^2 - 2^{-1} \|x_* - x_{t+1}\|^2 + \delta (4 M_0 + 1) + a_t (2 M_0 + 1) \delta + 2^{-1} a_t^2 (L+1)^2 )$
$\le 2^{-1} \|x_* - x_0\|^2 + \delta (T+1)(4 M_0 + 1)$
$+ \delta (2 M_0 + 1) \sum_{t=0}^T a_t + 2^{-1} (L+1)^2 \sum_{t=0}^T a_t^2.$

Thus (2.10) is true. Evidently, (2.10) implies (2.11). Theorem 2.4 is proved.

2.5 Subgradient Algorithm on Unbounded Sets

We use the notation and definitions introduced in Sect. 2.1. Let $X$ be a Hilbert space
with an inner product $\langle \cdot, \cdot \rangle$, $D$ be a nonempty closed convex subset of $X$, $V$ be an
open convex subset of $X$ such that

$D \subset V,$  (2.27)

and $f : V \to R^1$ be a convex function which is Lipschitz on all bounded subsets of
$V$. Set

$D_{\min} = \{ x \in D : f(x) \le f(y) \text{ for all } y \in D \}.$  (2.28)

We suppose that

$D_{\min} \ne \emptyset.$  (2.29)

We will prove the following result.

Theorem 2.8. Let $\delta \in (0, 1]$, $M > 0$ satisfy

$D_{\min} \cap B_X(0, M) \ne \emptyset,$  (2.30)

$M_0 \ge 4 M + 4,$  (2.31)

$L > 0$ satisfy

$|f(v_1) - f(v_2)| \le L \|v_1 - v_2\|$ for all $v_1, v_2 \in V \cap B_X(0, M_0 + 2),$  (2.32)

$0 < \alpha_0 \le \alpha_1 \le (L+1)^{-1},$  (2.33)

$\epsilon_0 = 2 \alpha_0^{-1} \delta (4 M_0 + 1) + 2 \delta (2 M_0 + 1) + 2 \alpha_1 (L+1)^2$  (2.34)

and let

$n_0 = \lfloor \epsilon_0^{-1} (2 M + 2)^2 \alpha_0^{-1} \rfloor.$  (2.35)

Assume that $\{x_t\}_{t=0}^\infty \subset V$, $\{\xi_t\}_{t=0}^\infty \subset X$,

$\{a_t\}_{t=0}^\infty \subset [\alpha_0, \alpha_1],$  (2.36)
$\|x_0\| \le M$  (2.37)

and that for each integer $t \ge 0$,

$\xi_t \in \partial f(x_t) + B_X(0, \delta)$  (2.38)

and

$\|x_{t+1} - P_D(x_t - a_t \xi_t)\| \le \delta.$  (2.39)

Then there exists an integer $q \in [1, n_0 + 1]$ such that

$\|x_i\| \le 3 M + 2, \quad i = 0, \ldots, q$

and

$f(x_q) \le f(x) + \epsilon_0$ for all $x \in D.$

We are interested in the best choice of $a_t$, $t = 0, 1, \ldots$. Assume for simplicity
that $\alpha_1 = \alpha_0 = \gamma$. In order to meet our goal we need to minimize the function

$2 \gamma^{-1} \delta (4 M_0 + 1) + 2 \gamma (L+1)^2, \quad \gamma \in (0, \infty).$

This function has a minimizer

$\bar\gamma = (\delta (4 M_0 + 1))^{1/2} (L+1)^{-1},$

the minimal value of $\epsilon_0$ is

$2 \delta (2 M_0 + 1) + 4 (\delta (4 M_0 + 1))^{1/2} (L+1)$

and $n_0 = \lfloor \beta \rfloor$ where

$\beta = \epsilon_0^{-1} (2 M + 2)^2 \bar\gamma^{-1}$
$\le ( 4 (\delta (4 M_0 + 1))^{1/2} (L+1) )^{-1} (2 M + 2)^2 (L+1) (\delta (4 M_0 + 1))^{-1/2}$
$= \delta^{-1} (4 M_0 + 1)^{-1} (2 M + 2)^2 \, 4^{-1}.$

Note that in the theorem above $\delta$ is the computational error produced by our
computer system. In view of the inequality above, in order to obtain a good
approximate solution we need $\lfloor c_1 \delta^{-1} \rfloor + 1$ iterations, where

$c_1 = 4^{-1} (4 M_0 + 1)^{-1} (2 M + 2)^2.$

As a result, we obtain a point $\xi \in V$ such that

$B_X(\xi, \delta) \cap D \ne \emptyset$

and

$f(\xi) \le \inf\{ f(x) : x \in D \} + c_2 \delta^{1/2},$

where the constant $c_2 > 0$ depends only on $L$ and $M_0$.

2.6 Proof of Theorem 2.8

By (2.30) there exists

$z \in D_{\min} \cap B_X(0, M).$  (2.40)

Assume that $T$ is a natural number and that

$f(x_t) - f(z) > \epsilon_0, \quad t = 1, \ldots, T.$  (2.41)

Lemma 2.2, (2.36), (2.37), (2.39), and (2.40) imply that

$\|x_1 - z\| \le \|x_1 - P_D(x_0 - a_0 \xi_0)\| + \|P_D(x_0 - a_0 \xi_0) - z\|$
$\le \delta + \|x_0 - z\| + a_0 \|\xi_0\| \le 1 + 2 M + \alpha_1 \|\xi_0\|.$  (2.42)

In view of (2.32), (2.37), and (2.38),

$\xi_0 \in \partial f(x_0) + B_X(0, 1) \subset B_X(0, L) + B_X(0, 1),$
$\|\xi_0\| \le L + 1.$  (2.43)

It follows from (2.33), (2.40), (2.42), and (2.43) that

$\|x_1 - z\| \le 2 M + 2,$  (2.44)
$\|x_1\| \le 3 M + 2.$  (2.45)

Set

$U = V \cap \{ v \in X : \|v\| < M_0 + 2 \}$  (2.46)

and

$C = D \cap B_X(0, M_0).$  (2.47)

By induction we show that for every integer $t \in [1, T]$,

$\|x_t - z\| \le 2 M + 2,$  (2.48)
$f(x_t) - f(z) \le (2 \alpha_0)^{-1} ( \|z - x_t\|^2 - \|z - x_{t+1}\|^2 )$
$+ \alpha_0^{-1} \delta (4 M_0 + 1) + \delta (2 M_0 + 1) + 2^{-1} \alpha_1 (L+1)^2.$  (2.49)

In view of (2.44), (2.48) holds for $t = 1$.
Assume that an integer $t \in [1, T]$ and that (2.48) holds. It follows
from (2.31), (2.40), (2.46), (2.47), and (2.48) that

$z \in C \subset B_X(0, M_0),$  (2.50)
$x_t \in U \cap B_X(0, M_0 + 1).$  (2.51)

Relation (2.39) implies that $x_{t+1} \in V$ satisfies

$\|x_{t+1} - P_D(x_t - a_t \xi_t)\| \le 1.$  (2.52)

By (2.32), (2.38), and (2.51),

$\xi_t \in \partial f(x_t) + B_X(0, 1) \subset B_X(0, L + 1).$  (2.53)

It follows from (2.33), (2.36), (2.40), (2.48), (2.53), and Lemma 2.2 that

$\|z - P_D(x_t - a_t \xi_t)\| \le \|z - x_t + a_t \xi_t\|$
$\le \|z - x_t\| + \|\xi_t\| a_t \le 2 M + 3,$
$\|P_D(x_t - a_t \xi_t)\| \le 3 M + 3.$  (2.54)

In view of (2.47) and (2.54),

$P_D(x_t - a_t \xi_t) \in C$  (2.55)

and

$P_D(x_t - a_t \xi_t) = P_C(x_t - a_t \xi_t).$  (2.56)

Relations (2.44), (2.52), and (2.54) imply that

$\|x_{t+1}\| \le 3 M + 4, \quad x_{t+1} \in U.$  (2.57)

By (2.32), (2.38), (2.39), (2.46), (2.47), (2.50), (2.51), (2.55), (2.56), (2.57), and
Lemma 2.7 which holds with

$x = x_t, \ a = a_t, \ \xi = \xi_t, \ u = x_{t+1},$

we have

$a_t (f(x_t) - f(z)) \le 2^{-1} \|z - x_t\|^2 - 2^{-1} \|z - x_{t+1}\|^2$
$+ \delta (4 M_0 + 1 + a_t (2 M_0 + 1)) + 2^{-1} a_t^2 (L+1)^2.$

The relation above, (2.34) and (2.36) imply that

$f(x_t) - f(z) \le (2 \alpha_0)^{-1} \|z - x_t\|^2 - (2 \alpha_0)^{-1} \|z - x_{t+1}\|^2$
$+ \alpha_0^{-1} \delta (4 M_0 + 1) + (2 M_0 + 1) \delta + 2^{-1} \alpha_1 (L+1)^2.$  (2.58)

In view of (2.41), (2.58) and the inclusion $t \in [1, T]$,

$\|z - x_t\|^2 - \|z - x_{t+1}\|^2 \ge 0,$
$\|z - x_{t+1}\| \le \|z - x_t\| \le 2 M + 2.$  (2.59)

Therefore we assumed that (2.48) is true and showed that (2.58) and (2.59) hold.
Hence by induction we showed that (2.49) holds for all $t = 1, \ldots, T$ and (2.48)
holds for all $t = 1, \ldots, T + 1$.
It follows from (2.49), which holds for all $t = 1, \ldots, T$, (2.41) and (2.44) that

$T \epsilon_0 < T ( \min\{ f(x_t) : t = 1, \ldots, T \} - f(z) )$
$\le \sum_{t=1}^T (f(x_t) - f(z))$
$\le \sum_{t=1}^T (2 \alpha_0)^{-1} ( \|z - x_t\|^2 - \|z - x_{t+1}\|^2 )$
$+ T \alpha_0^{-1} \delta (4 M_0 + 1) + T (2 M_0 + 1) \delta + 2^{-1} T \alpha_1 (L+1)^2$
$\le (2 \alpha_0)^{-1} (2 M + 2)^2 + T \alpha_0^{-1} \delta (4 M_0 + 1)$
$+ T (2 M_0 + 1) \delta + 2^{-1} T \alpha_1 (L+1)^2.$

Together with (2.34) and (2.35) this implies that

$\epsilon_0 < (2 \alpha_0 T)^{-1} (2 M + 2)^2 + \alpha_0^{-1} \delta (4 M_0 + 1)$
$+ (2 M_0 + 1) \delta + 2^{-1} \alpha_1 (L+1)^2,$
$2^{-1} \epsilon_0 < (2 \alpha_0 T)^{-1} (2 M + 2)^2,$
$T < \epsilon_0^{-1} (2 M + 2)^2 \alpha_0^{-1} \le n_0 + 1.$

Thus we have shown that if an integer $T$ satisfies (2.41), then $T \le n_0$ and

$\|z - x_t\| \le 2 M + 2, \quad t = 1, \ldots, T + 1,$
$\|x_t\| \le 3 M + 2, \quad t = 0, \ldots, T + 1.$

This implies that there exists an integer $q \in [1, n_0 + 1]$ such that

$\|x_t\| \le 3 M + 2, \quad t = 0, \ldots, q$

and

$f(x_q) - f(z) \le \epsilon_0.$

Theorem 2.8 is proved.

2.7 Zero-Sum Games with Two Players

We use the notation and definitions introduced in Sect. 2.1.
Let $X, Y$ be Hilbert spaces, $C$ be a nonempty closed convex subset of $X$, $D$ be a
nonempty closed convex subset of $Y$, $U$ be an open convex subset of $X$, and $V$ be an
open convex subset of $Y$ such that

$C \subset U, \quad D \subset V$  (2.60)

and let a function $f : U \times V \to R^1$ possess the following properties:

(i) for each $v \in V$, the function $f(\cdot, v) : U \to R^1$ is convex;
(ii) for each $u \in U$, the function $f(u, \cdot) : V \to R^1$ is concave.

Assume that a function $\psi : R^1 \to [0, \infty)$ is bounded on all bounded sets and
positive numbers $M_0, L$ satisfy

$C \subset B_X(0, M_0), \quad D \subset B_Y(0, M_0),$  (2.61)

$|f(u, v_1) - f(u, v_2)| \le L \|v_1 - v_2\|$ for all $u \in U$ and all $v_1, v_2 \in V,$  (2.62)
$|f(u_1, v) - f(u_2, v)| \le L \|u_1 - u_2\|$ for all $v \in V$ and all $u_1, u_2 \in U.$  (2.63)

Let

$x_* \in C$ and $y_* \in D$  (2.64)

satisfy

$f(x_*, y) \le f(x_*, y_*) \le f(x, y_*)$  (2.65)

for each $x \in C$ and each $y \in D$.
In the next section we prove the following result.

Proposition 2.9. Let $T$ be a natural number, $\delta \in (0, 1]$, $\{a_t\}_{t=0}^T \subset (0, \infty)$
and let $\{b_t\}_{t=0}^T \subset (0, \infty)$. Assume that $\{x_t\}_{t=0}^{T+1} \subset U$, $\{y_t\}_{t=0}^{T+1} \subset V$, for each
$t \in \{0, \ldots, T + 1\}$,

$B(x_t, \delta) \cap C \ne \emptyset, \quad B(y_t, \delta) \cap D \ne \emptyset,$  (2.66)

for each $z \in C$ and each $t \in \{0, \ldots, T\}$,

$a_t (f(x_t, y_t) - f(z, y_t)) \le \psi(\|z - x_t\|) - \psi(\|z - x_{t+1}\|) + b_t$  (2.67)

and that for each $v \in D$ and each $t \in \{0, \ldots, T\}$,

$a_t (f(x_t, v) - f(x_t, y_t)) \le \psi(\|v - y_t\|) - \psi(\|v - y_{t+1}\|) + b_t.$  (2.68)

Let

$\hat x_T = \left( \sum_{i=0}^T a_i \right)^{-1} \sum_{t=0}^T a_t x_t, \quad \hat y_T = \left( \sum_{i=0}^T a_i \right)^{-1} \sum_{t=0}^T a_t y_t.$  (2.69)
iD0 tD0
Then

$B(\hat x_T, \delta) \cap C \ne \emptyset, \quad B(\hat y_T, \delta) \cap D \ne \emptyset,$  (2.70)

$\left| \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T a_t f(x_t, y_t) - f(x_*, y_*) \right|$
$\le \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T b_t + \left( \sum_{t=0}^T a_t \right)^{-1} \sup\{ \psi(u) : u \in [0, 2 M_0 + 1] \},$  (2.71)

$\left| f(\hat x_T, \hat y_T) - \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T a_t f(x_t, y_t) \right|$
$\le \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T b_t + L \delta$
$+ \left( \sum_{t=0}^T a_t \right)^{-1} \sup\{ \psi(u) : u \in [0, 2 M_0 + 1] \},$  (2.72)

and for each $z \in C$ and each $v \in D$,

$f(z, \hat y_T) \ge f(\hat x_T, \hat y_T)$
$- 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \}$
$- 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T b_t - L \delta,$  (2.73)

$f(\hat x_T, v) \le f(\hat x_T, \hat y_T)$
$+ 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \}$
$+ 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T b_t + L \delta.$  (2.74)

Corollary 2.10. Suppose that all the assumptions of Proposition 2.9 hold and that

$\tilde x \in C, \quad \tilde y \in D$

satisfy

$\|\hat x_T - \tilde x\| \le \delta, \quad \|\hat y_T - \tilde y\| \le \delta.$  (2.75)

Then

$|f(\tilde x, \tilde y) - f(\hat x_T, \hat y_T)| \le 2 L \delta$  (2.76)

and for each $z \in C$ and each $v \in D$,

$f(z, \tilde y) \ge f(\tilde x, \tilde y)$
$- 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \}$
$- 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T b_t - 4 L \delta,$

$f(\tilde x, v) \le f(\tilde x, \tilde y)$
$+ 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \}$
$+ 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T b_t + 4 L \delta.$

Proof. In view of (2.62), (2.63), and (2.75),

$|f(\tilde x, \tilde y) - f(\hat x_T, \hat y_T)|$
$\le |f(\tilde x, \tilde y) - f(\tilde x, \hat y_T)| + |f(\tilde x, \hat y_T) - f(\hat x_T, \hat y_T)|$
$\le L \|\tilde y - \hat y_T\| + L \|\tilde x - \hat x_T\| \le 2 L \delta$

and (2.76) holds.
Let $z \in C$ and $v \in D$. Relations (2.62), (2.63), and (2.75) imply that

$|f(z, \tilde y) - f(z, \hat y_T)| \le L \delta,$
$|f(\tilde x, v) - f(\hat x_T, v)| \le L \delta.$

By the relation above, (2.73), (2.74), and (2.75),

$f(z, \tilde y) \ge f(z, \hat y_T) - L \delta$
$\ge f(\hat x_T, \hat y_T) - 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \}$
$- 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T b_t - 2 L \delta$
$\ge f(\tilde x, \tilde y) - 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \}$
$- 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T b_t - 4 L \delta,$

$f(\tilde x, v) \le f(\hat x_T, v) + L \delta$
$\le f(\hat x_T, \hat y_T) + 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \}$
$+ 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T b_t + 2 L \delta$
$\le f(\tilde x, \tilde y) + 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \}$
$+ 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T b_t + 4 L \delta.$

This completes the proof of Corollary 2.10.

2.8 Proof of Proposition 2.9

It is clear that (2.70) is true. Let $t \in \{0, \ldots, T\}$. By (2.65), (2.67), and (2.68),

$a_t (f(x_t, y_t) - f(x_*, y_*))$
$\le a_t (f(x_t, y_t) - f(x_*, y_t))$
$\le \psi(\|x_* - x_t\|) - \psi(\|x_* - x_{t+1}\|) + b_t,$  (2.77)

$a_t (f(x_*, y_*) - f(x_t, y_t))$
$\le a_t (f(x_t, y_*) - f(x_t, y_t))$
$\le \psi(\|y_* - y_t\|) - \psi(\|y_* - y_{t+1}\|) + b_t.$  (2.78)

In view of (2.77) and (2.78),

$\sum_{t=0}^T a_t f(x_t, y_t) - \sum_{t=0}^T a_t f(x_*, y_*)$
$\le \sum_{t=0}^T ( \psi(\|x_* - x_t\|) - \psi(\|x_* - x_{t+1}\|) ) + \sum_{t=0}^T b_t$
$\le \psi(\|x_* - x_0\|) + \sum_{t=0}^T b_t,$  (2.79)

$\sum_{t=0}^T a_t f(x_*, y_*) - \sum_{t=0}^T a_t f(x_t, y_t)$
$\le \sum_{t=0}^T ( \psi(\|y_* - y_t\|) - \psi(\|y_* - y_{t+1}\|) ) + \sum_{t=0}^T b_t$
$\le \psi(\|y_* - y_0\|) + \sum_{t=0}^T b_t.$  (2.80)

Relations (2.61), (2.64), (2.66), (2.79), and (2.80) imply that

$\left| \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T a_t f(x_t, y_t) - f(x_*, y_*) \right|$
$\le \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T b_t + \left( \sum_{t=0}^T a_t \right)^{-1} \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \}.$  (2.81)

By (2.70), there exists

$z_T \in C$  (2.82)

such that

$\|z_T - \hat x_T\| \le \delta.$  (2.83)

In view of (2.82), we apply (2.67) with $z = z_T$ and obtain that for all $t = 0, \ldots, T$,

$a_t (f(x_t, y_t) - f(z_T, y_t))$
$\le \psi(\|z_T - x_t\|) - \psi(\|z_T - x_{t+1}\|) + b_t.$  (2.84)

It follows from (2.63) and (2.83) that for all $t = 0, \ldots, T$,

$|f(z_T, y_t) - f(\hat x_T, y_t)| \le L \|z_T - \hat x_T\| \le L \delta.$  (2.85)

By (2.84) and (2.85), for all $t = 0, \ldots, T$,

$a_t (f(x_t, y_t) - f(\hat x_T, y_t))$
$\le a_t (f(x_t, y_t) - f(z_T, y_t)) + a_t L \delta$
$\le \psi(\|z_T - x_t\|) - \psi(\|z_T - x_{t+1}\|) + b_t + a_t L \delta.$  (2.86)

Combined with (2.61), (2.66), and (2.82) this implies that

$\sum_{t=0}^T a_t f(x_t, y_t) - \sum_{t=0}^T a_t f(\hat x_T, y_t)$
$\le \sum_{t=0}^T ( \psi(\|z_T - x_t\|) - \psi(\|z_T - x_{t+1}\|) ) + \sum_{t=0}^T b_t + \sum_{t=0}^T a_t L \delta$
$\le \psi(\|z_T - x_0\|) + \sum_{t=0}^T b_t + \sum_{t=0}^T a_t L \delta$
$\le \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \} + \sum_{t=0}^T b_t + \sum_{t=0}^T a_t L \delta.$  (2.87)

Property (ii) and (2.69) imply that

$\sum_{t=0}^T a_t f(\hat x_T, y_t) = \left( \sum_{i=0}^T a_i \right) \sum_{t=0}^T \left( a_t \left( \sum_{i=0}^T a_i \right)^{-1} f(\hat x_T, y_t) \right)$
$\le \left( \sum_{t=0}^T a_t \right) f(\hat x_T, \hat y_T).$  (2.88)

By (2.87) and (2.88),

$\sum_{t=0}^T a_t f(x_t, y_t) - \left( \sum_{t=0}^T a_t \right) f(\hat x_T, \hat y_T)$
$\le \sum_{t=0}^T a_t f(x_t, y_t) - \sum_{t=0}^T a_t f(\hat x_T, y_t)$
$\le \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \} + \sum_{t=0}^T b_t + \sum_{t=0}^T a_t L \delta.$  (2.89)
By (2.70), there exists

$h_T \in D$  (2.90)

such that

$\|h_T - \hat y_T\| \le \delta.$  (2.91)

In view of (2.90), we apply (2.68) with $v = h_T$ and obtain that for all $t = 0, \ldots, T$,

$a_t (f(x_t, h_T) - f(x_t, y_t))$
$\le \psi(\|h_T - y_t\|) - \psi(\|h_T - y_{t+1}\|) + b_t.$  (2.92)

It follows from (2.62) and (2.91) that for all $t = 0, \ldots, T$,

$|f(x_t, h_T) - f(x_t, \hat y_T)| \le L \|h_T - \hat y_T\| \le L \delta.$  (2.93)

By (2.92) and (2.93), for all $t = 0, \ldots, T$,

$a_t (f(x_t, \hat y_T) - f(x_t, y_t))$
$\le a_t (f(x_t, h_T) - f(x_t, y_t)) + a_t L \delta$
$\le \psi(\|h_T - y_t\|) - \psi(\|h_T - y_{t+1}\|) + b_t + a_t L \delta.$  (2.94)

In view of (2.94),

$\sum_{t=0}^T a_t f(x_t, \hat y_T) - \sum_{t=0}^T a_t f(x_t, y_t)$
$\le \sum_{t=0}^T ( \psi(\|h_T - y_t\|) - \psi(\|h_T - y_{t+1}\|) ) + \sum_{t=0}^T b_t + \sum_{t=0}^T a_t L \delta.$  (2.95)

Property (i) and (2.69) imply that

$\sum_{t=0}^T a_t f(x_t, \hat y_T) = \left( \sum_{i=0}^T a_i \right) \sum_{t=0}^T \left( a_t \left( \sum_{i=0}^T a_i \right)^{-1} f(x_t, \hat y_T) \right)$
$\ge \left( \sum_{t=0}^T a_t \right) f(\hat x_T, \hat y_T).$  (2.96)

By (2.61), (2.66), (2.90), (2.95), and (2.96),

$\left( \sum_{t=0}^T a_t \right) f(\hat x_T, \hat y_T) - \sum_{t=0}^T a_t f(x_t, y_t)$
$\le \sum_{t=0}^T a_t f(x_t, \hat y_T) - \sum_{t=0}^T a_t f(x_t, y_t)$
$\le \psi(\|h_T - y_0\|) + \sum_{t=0}^T b_t + \sum_{t=0}^T a_t L \delta$
$\le \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \} + \sum_{t=0}^T b_t + \sum_{t=0}^T a_t L \delta.$  (2.97)

It follows from (2.89) and (2.97) that

$\left| \left( \sum_{t=0}^T a_t \right) f(\hat x_T, \hat y_T) - \sum_{t=0}^T a_t f(x_t, y_t) \right|$
$\le \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \} + \sum_{t=0}^T b_t + \sum_{t=0}^T a_t L \delta.$

This implies (2.72).
Let $z \in C$. By (2.67),

$\sum_{t=0}^T a_t (f(x_t, y_t) - f(z, y_t))$
$\le \sum_{t=0}^T [ \psi(\|z - x_t\|) - \psi(\|z - x_{t+1}\|) ] + \sum_{t=0}^T b_t.$  (2.98)

By property (ii) and (2.69),

$\sum_{t=0}^T a_t f(z, y_t) = \left( \sum_{i=0}^T a_i \right) \sum_{t=0}^T \left( a_t \left( \sum_{i=0}^T a_i \right)^{-1} f(z, y_t) \right)$
$\le \left( \sum_{t=0}^T a_t \right) f(z, \hat y_T).$  (2.99)

In view of (2.98) and (2.99),

$\sum_{t=0}^T a_t f(x_t, y_t) - \left( \sum_{t=0}^T a_t \right) f(z, \hat y_T)$
$\le \sum_{t=0}^T a_t (f(x_t, y_t) - f(z, y_t))$
$\le \sum_{t=0}^T [ \psi(\|z - x_t\|) - \psi(\|z - x_{t+1}\|) ] + \sum_{t=0}^T b_t$
$\le \psi(\|z - x_0\|) + \sum_{t=0}^T b_t.$  (2.100)

It follows from (2.61), (2.70), and (2.72) that

$f(z, \hat y_T) \ge \left( \sum_{i=0}^T a_i \right)^{-1} \sum_{t=0}^T a_t f(x_t, y_t)$
$- \left( \sum_{i=0}^T a_i \right)^{-1} \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \} - \left( \sum_{i=0}^T a_i \right)^{-1} \sum_{t=0}^T b_t$
$\ge f(\hat x_T, \hat y_T) - 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \}$
$- 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T b_t - L \delta$

and (2.73) holds.
Let $v \in D$. By (2.68),

$\sum_{t=0}^T a_t (f(x_t, v) - f(x_t, y_t))$
$\le \sum_{t=0}^T [ \psi(\|v - y_t\|) - \psi(\|v - y_{t+1}\|) ] + \sum_{t=0}^T b_t.$  (2.101)

By property (i) and (2.69),

$\sum_{t=0}^T a_t f(x_t, v) = \left( \sum_{i=0}^T a_i \right) \sum_{t=0}^T \left( a_t \left( \sum_{i=0}^T a_i \right)^{-1} f(x_t, v) \right)$
$\ge \left( \sum_{t=0}^T a_t \right) f(\hat x_T, v).$  (2.102)

In view of (2.101) and (2.102),

$\left( \sum_{t=0}^T a_t \right) f(\hat x_T, v) - \sum_{t=0}^T a_t f(x_t, y_t)$
$\le \psi(\|v - y_0\|) + \sum_{t=0}^T b_t.$

Together with (2.61), (2.66), and (2.72) this implies that

$f(\hat x_T, v) \le \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T a_t f(x_t, y_t)$
$+ \left( \sum_{t=0}^T a_t \right)^{-1} \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \} + \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T b_t$
$\le f(\hat x_T, \hat y_T) + 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sup\{ \psi(s) : s \in [0, 2 M_0 + 1] \}$
$+ 2 \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T b_t + L \delta.$

Therefore (2.74) holds. This completes the proof of Proposition 2.9.

2.9 Subgradient Algorithm for Zero-Sum Games

We use the notation and definitions introduced in Sect. 2.1.
Let $X, Y$ be Hilbert spaces, $C$ be a nonempty closed convex subset of $X$, $D$ be a
nonempty closed convex subset of $Y$, $U$ be an open convex subset of $X$, and $V$ be an
open convex subset of $Y$ such that

$C \subset U, \quad D \subset V.$  (2.103)

For each concave function $g : V \to R^1$ and each $x \in V$ set

$\partial g(x) = \{ l \in Y : \langle l, y - x \rangle \ge g(y) - g(x) \text{ for all } y \in V \}.$  (2.104)

Clearly, for each $x \in V$,

$\partial g(x) = -(\partial (-g)(x)).$  (2.105)

Suppose that there exist $L > 0$, $M_0 > 0$ such that

$C \subset B_X(0, M_0), \quad D \subset B_Y(0, M_0),$  (2.106)

a function $f : U \times V \to R^1$ possesses the following properties:

(i) for each $v \in V$, the function $f(\cdot, v) : U \to R^1$ is convex;
(ii) for each $u \in U$, the function $f(u, \cdot) : V \to R^1$ is concave,

for each $v \in V$,

$|f(u_1, v) - f(u_2, v)| \le L \|u_1 - u_2\|$ for all $u_1, u_2 \in U$  (2.107)

and that for each $u \in U$,

$|f(u, v_1) - f(u, v_2)| \le L \|v_1 - v_2\|$ for all $v_1, v_2 \in V.$  (2.108)

For each $(\xi, \eta) \in U \times V$, set

$\partial_x f(\xi, \eta) = \{ l \in X : f(y, \eta) - f(\xi, \eta) \ge \langle l, y - \xi \rangle \text{ for all } y \in U \},$  (2.109)
$\partial_y f(\xi, \eta) = \{ l \in Y : \langle l, y - \eta \rangle \ge f(\xi, y) - f(\xi, \eta) \text{ for all } y \in V \}.$  (2.110)

In view of properties (i) and (ii) and (2.107)–(2.110), for each $\xi \in U$ and each
$\eta \in V$,

$\emptyset \ne \partial_x f(\xi, \eta) \subset B_X(0, L),$  (2.111)
$\emptyset \ne \partial_y f(\xi, \eta) \subset B_Y(0, L).$  (2.112)

Let

$x_* \in C$ and $y_* \in D$

satisfy

$f(x_*, y) \le f(x_*, y_*) \le f(x, y_*)$  (2.113)

for each $x \in C$ and each $y \in D$.
Let $\delta \in (0, 1]$ and $\{a_k\}_{k=0}^\infty \subset (0, \infty)$.
Let us describe our algorithm.

Subgradient Projection Algorithm for Zero-Sum Games

Initialization: select arbitrary $x_0 \in U$ and $y_0 \in V$.

Iterative step: given current iteration vectors $x_t \in U$ and $y_t \in V$ calculate

$\xi_t \in \partial_x f(x_t, y_t) + B_X(0, \delta),$
$\eta_t \in \partial_y f(x_t, y_t) + B_Y(0, \delta)$

and the next pair of iteration vectors $x_{t+1} \in U$, $y_{t+1} \in V$ such that

$\|x_{t+1} - P_C(x_t - a_t \xi_t)\| \le \delta,$
$\|y_{t+1} - P_D(y_t + a_t \eta_t)\| \le \delta.$
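The iterative step above can be sketched in code. The example below is mine, not from the book: it uses the bilinear convex–concave function $f(x, y) = x y$ on $C = D = [-1, 1]$, whose saddle point is $(0, 0)$, runs the descent step in $x$ and the ascent step in $y$ with constant steps and error level $\delta$, and tracks the averaged iterates as in (2.119); all numerical values are arbitrary.

```python
import random

def proj(v):                      # P_C = P_D on the interval [-1, 1]
    return min(1.0, max(-1.0, v))

random.seed(0)
delta, a, T = 1e-3, 0.05, 4000
x, y = 0.8, -0.6
sx = sy = 0.0                     # running sums for the averaged iterates
for _ in range(T + 1):
    xi  = y + random.uniform(-delta, delta)    # xi_t  near d/dx f(x, y) = y
    eta = x + random.uniform(-delta, delta)    # eta_t near d/dy f(x, y) = x
    x = proj(x - a * xi)  + random.uniform(-delta, delta)   # descent in x
    y = proj(y + a * eta) + random.uniform(-delta, delta)   # ascent in y
    sx += x
    sy += y

x_hat, y_hat = sx / (T + 1), sy / (T + 1)
print("averaged iterates:", x_hat, y_hat)
```

The raw iterates circulate around the saddle point, but the averaged pair $(\hat x_T, \hat y_T)$ lands close to $(0, 0)$, which is the behavior quantified by Theorem 2.11 below.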

In this chapter we prove the following result.

Theorem 2.11. Let $\delta \in (0, 1]$ and $\{a_k\}_{k=0}^\infty \subset (0, \infty)$. Assume that $\{x_t\}_{t=0}^\infty \subset U$,
$\{y_t\}_{t=0}^\infty \subset V$, $\{\xi_t\}_{t=0}^\infty \subset X$, $\{\eta_t\}_{t=0}^\infty \subset Y$,

$B_X(x_0, \delta) \cap C \ne \emptyset, \quad B_Y(y_0, \delta) \cap D \ne \emptyset$  (2.114)

and that for each integer $t \ge 0$,

$\xi_t \in \partial_x f(x_t, y_t) + B_X(0, \delta),$  (2.115)
$\eta_t \in \partial_y f(x_t, y_t) + B_Y(0, \delta),$  (2.116)
$\|x_{t+1} - P_C(x_t - a_t \xi_t)\| \le \delta$  (2.117)

and that

$\|y_{t+1} - P_D(y_t + a_t \eta_t)\| \le \delta.$  (2.118)

For each natural number $T$ let

$\hat x_T = \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T a_t x_t, \quad \hat y_T = \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T a_t y_t.$  (2.119)

Then for each natural number $T$,

$\left| \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T a_t f(x_t, y_t) - f(x_*, y_*) \right|$
$\le [ 2^{-1} (2 M_0 + 1)^2 + \delta (T+1)(4 M_0 + 1) ] \left( \sum_{t=0}^T a_t \right)^{-1}$
$+ \delta (2 M_0 + 1) + 2^{-1} \left( \sum_{t=0}^T a_t \right)^{-1} (L+1)^2 \sum_{t=0}^T a_t^2,$  (2.120)

$\left| f(\hat x_T, \hat y_T) - \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T a_t f(x_t, y_t) \right|$
$\le [ 2^{-1} (2 M_0 + 1)^2 + \delta (T+1)(4 M_0 + 1) ] \left( \sum_{t=0}^T a_t \right)^{-1}$
$+ \delta (2 M_0 + 1) + 2^{-1} \left( \sum_{t=0}^T a_t \right)^{-1} (L+1)^2 \sum_{t=0}^T a_t^2 + L \delta,$  (2.121)

and for each natural number $T$, each $z \in C$, and each $v \in D$,

$f(z, \hat y_T) \ge f(\hat x_T, \hat y_T)$
$- (2 M_0 + 1)^2 \left( \sum_{t=0}^T a_t \right)^{-1} - 2 \left( \sum_{t=0}^T a_t \right)^{-1} (T+1) \delta (4 M_0 + 1)$
$- 2 \delta (2 M_0 + 1) - \left( \sum_{t=0}^T a_t \right)^{-1} (L+1)^2 \sum_{t=0}^T a_t^2 - L \delta,$

$f(\hat x_T, v) \le f(\hat x_T, \hat y_T)$
$+ (2 M_0 + 1)^2 \left( \sum_{t=0}^T a_t \right)^{-1} + 2 \left( \sum_{t=0}^T a_t \right)^{-1} \delta (T+1)(4 M_0 + 1)$
$+ 2 \delta (2 M_0 + 1) + \left( \sum_{t=0}^T a_t \right)^{-1} (L+1)^2 \sum_{t=0}^T a_t^2 + L \delta.$

We are interested in the optimal choice of $a_t$, $t = 0, 1, \ldots, T$. Let $T$ be a natural
number and $A_T = \sum_{t=0}^T a_t$ be given. By Theorem 2.11, in order to make the best
choice of $a_t$, $t = 0, \ldots, T$, we need to minimize the function $\sum_{t=0}^T a_t^2$ on the set

$\left\{ a = (a_0, \ldots, a_T) \in R^{T+1} : a_i \ge 0,\ i = 0, \ldots, T,\ \sum_{i=0}^T a_i = A_T \right\}.$

By Lemma 2.3, this function has a unique minimizer $\bar a = (\bar a_0, \ldots, \bar a_T)$ where
$\bar a_i = (T+1)^{-1} A_T$, $i = 0, \ldots, T$, which is the best choice of $a_t$, $t = 0, 1, \ldots, T$.
Now we will find the best $a > 0$. Let $T$ be a natural number and $a_t = a$ for all
$t = 0, \ldots, T$. We need to choose $a$ which is a minimizer of the function

$\Lambda_T(a) = ((T+1) a)^{-1} [ (2 M_0 + 1)^2 + 2 \delta (T+1)(4 M_0 + 1) ]$
$+ 2 \delta (2 M_0 + 1) + a (L+1)^2$
$= (2 M_0 + 1)^2 ((T+1) a)^{-1} + 2 \delta (4 M_0 + 1) a^{-1} + 2 \delta (2 M_0 + 1) + (L+1)^2 a.$

Since $T$ can be arbitrarily large, we need to find a minimizer of the function

$\psi(a) := 2 a^{-1} \delta (4 M_0 + 1) + (L+1)^2 a, \quad a \in (0, \infty).$

In Sect. 2.2 we have already shown that the minimizer is

$\bar a = (2 \delta (4 M_0 + 1))^{1/2} (L+1)^{-1}$

and the minimal value of $\psi$ is

$(8 \delta (4 M_0 + 1))^{1/2} (L+1).$

Now our goal is to find the best integer $T > 0$ which gives us an appropriate value of
$\Lambda_T(\bar a)$. Since, in view of the inequalities above, this value is bounded from below by
$c_0 \delta^{1/2}$ with the constant $c_0$ depending on $L, M_0$, it is clear that in order to make the
best choice of $T$, it should be of the same order as $\lfloor \delta^{-1} \rfloor$. For example, $T = \lfloor \delta^{-1} \rfloor$.
Note that in the theorem above $\delta$ is the computational error produced by
our computer system. We obtain a good approximate solution after $T = \lfloor \delta^{-1} \rfloor$
iterations. Namely, we obtain a pair of points $\hat x \in U$, $\hat y \in V$ such that

$B_X(\hat x, \delta) \cap C \ne \emptyset, \quad B_Y(\hat y, \delta) \cap D \ne \emptyset$

and for each $z \in C$ and each $v \in D$,

$f(z, \hat y) \ge f(\hat x, \hat y) - c \delta^{1/2}, \quad f(\hat x, v) \le f(\hat x, \hat y) + c \delta^{1/2},$

where the constant $c > 0$ depends only on $L$ and $M_0$.

2.10 Proof of Theorem 2.11

By (2.106), (2.114), (2.117), and (2.118), for all integers $t \ge 0$,

$\|x_t\| \le M_0 + 1, \quad \|y_t\| \le M_0 + 1.$  (2.122)

Let $t \ge 0$ be an integer. Applying Lemma 2.7 with

$a = a_t, \ x = x_t, \ f = f(\cdot, y_t), \ \xi = \xi_t, \ u = x_{t+1}$

we obtain that for each $z \in C$,

$a_t (f(x_t, y_t) - f(z, y_t)) \le 2^{-1} \|z - x_t\|^2 - 2^{-1} \|z - x_{t+1}\|^2$
$+ \delta (4 M_0 + 1 + a_t (2 M_0 + 1)) + 2^{-1} a_t^2 (L+1)^2.$  (2.123)

Applying Lemma 2.7 with

$a = a_t, \ x = y_t, \ f = -f(x_t, \cdot), \ \xi = -\eta_t, \ u = y_{t+1}$

we obtain that for each $v \in D$,

$a_t (f(x_t, v) - f(x_t, y_t)) \le 2^{-1} \|v - y_t\|^2 - 2^{-1} \|v - y_{t+1}\|^2$
$+ \delta (4 M_0 + 1 + a_t (2 M_0 + 1)) + 2^{-1} a_t^2 (L+1)^2.$  (2.124)

For all integers $t \ge 0$ set

$b_t = \delta (4 M_0 + 1 + a_t (2 M_0 + 1)) + 2^{-1} a_t^2 (L+1)^2$

and define

$\psi(s) = 2^{-1} s^2, \quad s \in R^1.$

It is easy to see that all the assumptions of Proposition 2.9 hold and it implies
Theorem 2.11.
Chapter 3
The Mirror Descent Algorithm

In this chapter we analyze the convergence of the mirror descent algorithm under
the presence of computational errors. We show that the algorithms generate a good
approximate solution, if computational errors are bounded from above by a small
positive constant. Moreover, for a known computational error, we find out what
approximate solution can be obtained and how many iterates one needs for this.

3.1 Optimization on Bounded Sets

Let $X$ be a Hilbert space equipped with an inner product $\langle \cdot, \cdot \rangle$ which induces a
complete norm $\| \cdot \|$. For each $x \in X$ and each $r > 0$ set

$B_X(x, r) = \{ y \in X : \|x - y\| \le r \}.$

For each $x \in X$ and each nonempty set $E \subset X$ set

$d(x, E) = \inf\{ \|x - y\| : y \in E \}.$

Let $C$ be a nonempty closed convex subset of $X$, $U$ be an open convex subset of $X$
such that $C \subset U$ and let $f : U \to R^1$ be a convex function. Recall that for each
$x \in U$,

$\partial f(x) = \{ l \in X : f(y) - f(x) \ge \langle l, y - x \rangle \text{ for all } y \in U \}.$  (3.1)


Suppose that there exist $L > 0$, $M_0 > 0$ such that

$C \subset B_X(0, M_0),$  (3.2)
$|f(x) - f(y)| \le L \|x - y\|$ for all $x, y \in U.$  (3.3)

In view of (3.1) and (3.3), for each $x \in U$,

$\emptyset \ne \partial f(x) \subset B_X(0, L).$  (3.4)

For each nonempty set $D \subset X$ and each function $h : D \to R^1$ put

$\inf(h; D) = \inf\{ h(y) : y \in D \}$

and

$\operatorname{argmin}\{ h(y) : y \in D \} = \{ y \in D : h(y) = \inf(h; D) \}.$

We study the convergence of the mirror descent algorithm under the presence
of computational errors. This method was introduced by Nemirovsky and Yudin
for solving convex optimization problems [90]. Here we use a derivation of this
algorithm proposed by Beck and Teboulle [19].
Let $\delta \in (0, 1]$ and $\{a_k\}_{k=0}^\infty \subset (0, \infty)$.
We describe the inexact version of the mirror descent algorithm.

Mirror Descent Algorithm

Initialization: select an arbitrary $x_0 \in U$.

Iterative step: given a current iteration vector $x_t \in U$ calculate

$\xi_t \in \partial f(x_t) + B_X(0, \delta),$

define

$g_t(x) = \langle \xi_t, x \rangle + (2 a_t)^{-1} \|x - x_t\|^2, \quad x \in X$

and calculate the next iteration vector $x_{t+1} \in U$ such that

$B_X(x_{t+1}, \delta) \cap \operatorname{argmin}\{ g_t(y) : y \in C \} \ne \emptyset.$

Note that $g_t$ is a convex function, bounded from below on $X$, which possesses a
minimizer on $C$.
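In this Hilbert-space setting the subproblem $\min_{v \in C} g_t(v)$ has a closed form: completing the square gives $g_t(v) = (2 a_t)^{-1} \|v - (x_t - a_t \xi_t)\|^2$ plus a constant, so the exact minimizer over $C$ is $P_C(x_t - a_t \xi_t)$. The sketch below (my illustration; the instance in $R^2$ with $C$ the unit ball and all values are arbitrary) checks this claim against random feasible points.

```python
import math
import random

def proj_ball(x, r=1.0):
    # P_C for C = {x : ||x|| <= r} in the Euclidean norm
    n = math.sqrt(sum(v * v for v in x))
    return x if n <= r else [r * v / n for v in x]

def g(v, xi, x, a):
    # g_t(v) = <xi, v> + (2 a)^{-1} ||v - x||^2
    return (sum(xi[i] * v[i] for i in range(len(v)))
            + sum((v[i] - x[i]) ** 2 for i in range(len(v))) / (2 * a))

random.seed(0)
x, xi, a = [0.6, -0.3], [1.0, 0.5], 0.2
v_star = proj_ball([x[i] - a * xi[i] for i in range(2)])  # claimed argmin over C

# v_star should be at least as good as any random feasible point
for _ in range(1000):
    w = proj_ball([random.uniform(-1, 1), random.uniform(-1, 1)])
    assert g(v_star, xi, x, a) <= g(w, xi, x, a) + 1e-12
print("argmin of g_t over C:", v_star)
```

With the Euclidean prox term used here the mirror descent step therefore coincides with the subgradient projection step of Chap. 2; other prox functions give genuinely different methods.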
In this chapter we prove the following result.

Theorem 3.1. Let $\delta \in (0, 1]$, $\{a_k\}_{k=0}^\infty \subset (0, \infty)$ and let

$x_* \in C$  (3.5)

satisfy

$f(x_*) \le f(x)$ for all $x \in C.$  (3.6)

Assume that $\{x_t\}_{t=0}^\infty \subset U$, $\{\xi_t\}_{t=0}^\infty \subset X$,

$\|x_0\| \le M_0 + 1$  (3.7)

and that for each integer $t \ge 0$,

$\xi_t \in \partial f(x_t) + B_X(0, \delta)$  (3.8)

and

$B_X(x_{t+1}, \delta) \cap \operatorname{argmin}\{ \langle \xi_t, v \rangle + (2 a_t)^{-1} \|v - x_t\|^2 : v \in C \} \ne \emptyset.$  (3.9)

Then for each natural number $T$,

$\sum_{t=0}^T a_t (f(x_t) - f(x_*))$
$\le 2^{-1} (2 M_0 + 1)^2 + \delta (2 M_0 + L + 2) \sum_{t=0}^T a_t$
$+ \delta (T+1)(8 M_0 + 8) + 2^{-1} (L+1)^2 \sum_{t=0}^T a_t^2.$  (3.10)

Moreover, for each natural number $T$,

$f\left( \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T a_t x_t \right) - f(x_*),\ \min\{ f(x_t) : t = 0, \ldots, T \} - f(x_*)$
$\le 2^{-1} (2 M_0 + 1)^2 \left( \sum_{t=0}^T a_t \right)^{-1} + \delta (2 M_0 + L + 2)$
$+ \delta (T+1)(8 M_0 + 8) \left( \sum_{t=0}^T a_t \right)^{-1} + 2^{-1} (L+1)^2 \left( \sum_{t=0}^T a_t \right)^{-1} \sum_{t=0}^T a_t^2.$

Theorem 3.1 is proved in Sect. 3.3.
We are interested in an optimal choice of $a_t$, $t = 0, 1, \ldots$. Let $T$ be a natural
number and $A_T = \sum_{t=0}^T a_t$ be given. By Theorem 3.1, in order to make the best
choice of $a_t$, $t = 0, \ldots, T$, we need to minimize the function $\sum_{t=0}^T a_t^2$ on the set

$\left\{ a = (a_0, \ldots, a_T) \in R^{T+1} : a_i \ge 0,\ i = 0, \ldots, T,\ \sum_{i=0}^T a_i = A_T \right\}.$
44 3 The Mirror Descent Algorithm

By Lemma 2.3, this function has a unique minimizer a D .a0 ; : : : ; aT / where
ai D .T C 1/1 AT , i D 0; : : : ; T. This is the best choice of at , t D 0; 1; : : : ; T.
Let T be a natural number and a_t = a, t = 0, …, T. Now we will find the best a > 0. By Theorem 3.1, we need to choose a which minimizes the function

2^{-1}((T + 1)a)^{-1}(2M_0 + 1)² + δ(2M_0 + L + 2) + a^{-1} δ(8M_0 + 8) + 2^{-1}(L + 1)² a.

Since T can be arbitrarily large, we need to find a minimizer of the function

φ(a) := a^{-1} δ(16M_0 + 16) + (L + 1)² a, a ∈ (0, ∞).

Clearly, the minimizer is

a = (16δ(M_0 + 1))^{1/2} (L + 1)^{-1}

and the minimal value of φ is

2(δ(16M_0 + 16))^{1/2} (L + 1).

Now we can think about the best choice of T. It is clear that it should be of the same order as ⌊δ^{-1}⌋.

Note that in the theorem above δ is the computational error produced by our computer system. In order to obtain a good approximate solution we need T iterations, where T is of the same order as ⌊δ^{-1}⌋. As a result, we obtain a point ξ ∈ U such that

B_X(ξ, δ) ∩ C ≠ ∅

and

f(ξ) ≤ f(x_*) + c_1 δ^{1/2},

where the constant c_1 > 0 depends only on L and M_0.
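As a quick numerical sanity check of this step-size computation, the sketch below evaluates φ on a small grid around the closed-form minimizer and compares the value there with the closed-form minimum; the concrete values of δ, L, and M_0 are hypothetical.

```python
import math

def phi(a, delta, L, M0):
    # phi(a) = a^{-1} * delta * (16 M0 + 16) + (L + 1)^2 * a
    return delta * (16 * M0 + 16) / a + (L + 1) ** 2 * a

# Illustrative (hypothetical) problem constants.
delta, L, M0 = 1e-6, 2.0, 1.0

a_star = math.sqrt(16 * delta * (M0 + 1)) / (L + 1)
phi_min = 2 * math.sqrt(delta * (16 * M0 + 16)) * (L + 1)

# The closed-form minimizer beats nearby grid points and attains phi_min.
grid = [a_star * s for s in (0.5, 0.9, 1.0, 1.1, 2.0)]
best = min(grid, key=lambda a: phi(a, delta, L, M0))
assert best == a_star
assert abs(phi(a_star, delta, L, M0) - phi_min) < 1e-12
```

Since φ(a) = c/a + d·a with c = 16δ(M_0 + 1) and d = (L + 1)², the minimizer √(c)/√(d) and minimal value 2√(cd) follow from the arithmetic–geometric mean inequality.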

3.2 The Main Lemma

We use the notation and definitions introduced in Sect. 3.1.

Lemma 3.2. Let δ ∈ (0, 1], a > 0 and let

z ∈ C.    (3.11)

Assume that

x ∈ U ∩ B_X(0, M_0 + 1),    (3.12)

ξ ∈ ∂f(x) + B_X(0, δ),    (3.13)

g(v) = ⟨ξ, v⟩ + (2a)^{-1} ‖v − x‖², v ∈ X,    (3.14)

and that

u ∈ U    (3.15)

satisfies

B_X(u, δ) ∩ {v ∈ C : g(v) = inf(g; C)} ≠ ∅.    (3.16)

Then

a(f(x) − f(z)) ≤ δa(2M_0 + L + 2) + 8δ(M_0 + 1) + 2^{-1} a²(L + 1)² + 2^{-1} ‖z − x‖² − 2^{-1} ‖z − u‖².

Proof. In view of (3.13), there exists

l ∈ ∂f(x)    (3.17)

such that

‖l − ξ‖ ≤ δ.    (3.18)

Clearly, the function g is Fréchet differentiable on X. We denote by g′(v) its Fréchet derivative at v ∈ X. It is easy to see that

g′(v) = ξ + a^{-1}(v − x), v ∈ X.    (3.19)

By (3.16), there exists

û ∈ B_X(u, δ) ∩ C    (3.20)

such that

g(û) = inf(g; C).    (3.21)

It follows from (3.19) and (3.21) that for all v ∈ C,

0 ≤ ⟨g′(û), v − û⟩ = ⟨ξ + a^{-1}(û − x), v − û⟩.    (3.22)

By (3.2), (3.11), (3.12), (3.17), and (3.18),

a(f(x) − f(z)) ≤ a⟨x − z, l⟩
  = a⟨x − z, ξ⟩ + a⟨x − z, l − ξ⟩
  ≤ a⟨x − z, ξ⟩ + a‖x − z‖ ‖l − ξ‖
  ≤ δa(2M_0 + 1) + a⟨x − z, ξ⟩
  = δa(2M_0 + 1) + a⟨ξ, x − û⟩ + a⟨ξ, û − z⟩
  = δa(2M_0 + 1) + ⟨x − û, aξ⟩ + ⟨z − û, x − û − aξ⟩ + ⟨z − û, û − x⟩.    (3.23)

Relations (3.11) and (3.22) imply that

⟨z − û, x − û − aξ⟩ ≤ 0.    (3.24)

In view of Lemma 2.1,

⟨z − û, û − x⟩ = 2^{-1}[‖z − x‖² − ‖z − û‖² − ‖û − x‖²].    (3.25)

It follows from (3.4), (3.13), and (3.20) that

⟨x − û, aξ⟩ = ⟨x − u, aξ⟩ + ⟨u − û, aξ⟩
  ≤ aδ(L + 1) + ⟨x − u, aξ⟩
  ≤ aδ(L + 1) + 2^{-1} ‖x − u‖² + 2^{-1} a² ‖ξ‖².    (3.26)

By (3.4), (3.13), and (3.23)–(3.26),

a(f(x) − f(z)) ≤ δa(2M_0 + 1) + aδ(L + 1)
  + 2^{-1} ‖x − u‖² + 2^{-1} a² ‖ξ‖²
  + 2^{-1} ‖z − x‖² − 2^{-1} ‖z − û‖² − 2^{-1} ‖û − x‖²
≤ δa(2M_0 + L + 2) + 2^{-1} a²(L + 1)²
  + 2^{-1} ‖z − x‖² − 2^{-1} ‖z − û‖²
  + 2^{-1} ‖x − u‖² − 2^{-1} ‖û − x‖².    (3.27)

In view of (3.2), (3.12), (3.16), and (3.20),

|‖x − u‖² − ‖û − x‖²|
  ≤ |‖x − u‖ − ‖û − x‖| (‖x − u‖ + ‖û − x‖)
  ≤ 4‖u − û‖(M_0 + 1) ≤ 4(M_0 + 1)δ.    (3.28)

Relations (3.2), (3.11), (3.12), and (3.20) imply that

|‖z − û‖² − ‖z − u‖²|
  ≤ |‖z − û‖ − ‖z − u‖| (‖z − û‖ + ‖z − u‖)
  ≤ ‖u − û‖(4M_0 + 1) ≤ (4M_0 + 1)δ.    (3.29)

By (3.27), (3.28), and (3.29),

a(f(x) − f(z)) ≤ δa(2M_0 + L + 2) + 2^{-1} a²(L + 1)²
  + 2^{-1} ‖z − x‖² − 2^{-1} ‖z − u‖² + 8(M_0 + 1)δ.

This completes the proof of Lemma 3.2.

3.3 Proof of Theorem 3.1

It is clear that

‖x_t‖ ≤ M_0 + 1, t = 0, 1, ….

Let t ≥ 0 be an integer. Applying Lemma 3.2 with

z = x_*, a = a_t, x = x_t, ξ = ξ_t, u = x_{t+1}

we obtain that

a_t(f(x_t) − f(x_*)) ≤ 2^{-1} ‖x_* − x_t‖² − 2^{-1} ‖x_* − x_{t+1}‖²
  + δ(8(M_0 + 1) + a_t(2M_0 + L + 2)) + 2^{-1} a_t²(L + 1)².    (3.30)

By (3.2), (3.5), (3.7), and (3.30), for each natural number T,

∑_{t=0}^{T} a_t(f(x_t) − f(x_*))
  ≤ ∑_{t=0}^{T} (2^{-1} ‖x_* − x_t‖² − 2^{-1} ‖x_* − x_{t+1}‖²)
    + δ(2M_0 + L + 2) ∑_{t=0}^{T} a_t + (T + 1)δ(8M_0 + 8) + 2^{-1}(L + 1)² ∑_{t=0}^{T} a_t²
  ≤ 2^{-1}(2M_0 + 1)² + δ(2M_0 + L + 2) ∑_{t=0}^{T} a_t
    + (T + 1)δ(8M_0 + 8) + 2^{-1}(L + 1)² ∑_{t=0}^{T} a_t².

Thus (3.10) is true. Evidently, (3.10) implies the last relation of the statement of Theorem 3.1. This completes the proof of Theorem 3.1.

3.4 Optimization on Unbounded Sets

We use the notation and definitions introduced in Sect. 3.1. Let X be a Hilbert space with an inner product ⟨·, ·⟩, D be a nonempty closed convex subset of X, V be an open convex subset of X such that

D ⊂ V,    (3.31)

and f: V → R^1 be a convex function which is Lipschitz on all bounded subsets of V. Set

D_min = {x ∈ D : f(x) ≤ f(y) for all y ∈ D}.    (3.32)

We suppose that

D_min ≠ ∅.

We will prove the following result.

Theorem 3.3. Let δ ∈ (0, 1], M > 1 satisfy

D_min ∩ B_X(0, M) ≠ ∅,    (3.33)

M_0 > 80M + 6,    (3.34)

L > 1 satisfy

|f(v_1) − f(v_2)| ≤ L‖v_1 − v_2‖ for all v_1, v_2 ∈ V ∩ B_X(0, M_0 + 2),    (3.35)

0 < α_0 ≤ α_1 ≤ (4L + 4)^{-1},    (3.36)

ε_0 = 16α_0^{-1} δ(M_0 + 1) + 4δ(2M_0 + L + 2) + α_1(L + 1)²    (3.37)

and let

n_0 = ⌊α_0^{-1}(2M + 1)² ε_0^{-1}⌋ + 1.    (3.38)

Assume that {x_t}_{t=0}^∞ ⊂ V, {ξ_t}_{t=0}^∞ ⊂ X,

{a_t}_{t=0}^∞ ⊂ [α_0, α_1],

‖x_0‖ ≤ M    (3.39)

and that for each integer t ≥ 0,

ξ_t ∈ ∂f(x_t) + B_X(0, δ)    (3.40)

and

B_X(x_{t+1}, δ) ∩ argmin{⟨ξ_t, v⟩ + (2a_t)^{-1} ‖v − x_t‖² : v ∈ D} ≠ ∅.    (3.41)

Then there exists an integer q ∈ [1, n_0] such that

f(x_q) ≤ inf(f; D) + ε_0,
‖x_q‖ ≤ 15M + 1.

We are interested in the best choice of a_t, t = 0, 1, …. Assume for simplicity that α_1 = α_0. In order to meet our goal we need to minimize ε_0, which attains its minimal value when

α_0 = (16δ(M_0 + 1))^{1/2} (L + 1)^{-1},

and this minimal value of ε_0 is

4δ(2M_0 + L + 2) + 2(16δ(M_0 + 1))^{1/2} (L + 1).

Thus ε_0 is of the same order as δ^{1/2}. By (3.38) and the inequalities above, n_0 is of the same order as ⌊δ^{-1}⌋.

3.5 Proof of Theorem 3.3

By (3.33) there exists

z ∈ D_min ∩ B_X(0, M).    (3.42)

Assume that T is a natural number and that

f(x_t) − f(z) > ε_0, t = 1, …, T.    (3.43)

In view of (3.41), there exists

h ∈ B_X(x_1, δ) ∩ argmin{⟨ξ_0, v⟩ + (2a_0)^{-1} ‖v − x_0‖² : v ∈ D}.    (3.44)

Relations (3.42) and (3.44) imply that

⟨ξ_0, h⟩ + (2a_0)^{-1} ‖h − x_0‖² ≤ ⟨ξ_0, z⟩ + (2a_0)^{-1} ‖z − x_0‖².    (3.45)

It follows from (3.34), (3.35), (3.39), and (3.40) that

‖ξ_0‖ ≤ L + 1.    (3.46)

In view of (3.36),

a_0^{-1} ≥ α_1^{-1} ≥ 4(L + 1).    (3.47)

By (3.35), (3.39), (3.40), (3.42), and (3.45)–(3.47),

(L + 1)M + (2a_0)^{-1}(2M + 1)²
  ≥ ⟨ξ_0, z⟩ + (2a_0)^{-1} ‖z − x_0‖²
  ≥ (2a_0)^{-1} ‖h − x_0‖² + ⟨ξ_0, h − x_0⟩ + ⟨ξ_0, x_0⟩
  ≥ (2a_0)^{-1} ‖h − x_0‖² − (L + 1)‖h − x_0‖ − (L + 1)M.

Together with (3.36) this implies that

M + (2M + 1)² ≥ ‖h − x_0‖² − 2^{-1} ‖h − x_0‖,
(‖h − x_0‖ − 4^{-1})² ≤ (4M + 1)²,
‖h − x_0‖ ≤ 8M.

Together with (3.39) and (3.44) this implies that

‖h‖ ≤ 9M, ‖x_1‖ ≤ 9M + 1,
‖h − z‖ ≤ 10M,
‖x_1 − z‖ ≤ 10M + 1.    (3.48)

By induction we show that for every integer t ∈ [1, T],

‖x_t − z‖ ≤ 14M + 1,    (3.49)

f(x_t) − f(z) ≤ δ(2M_0 + L + 2) + 8α_0^{-1} δ(M_0 + 1) + 2^{-1} α_1(L + 1)²
  + (2α_0)^{-1}(‖z − x_t‖² − ‖z − x_{t+1}‖²).    (3.50)

Set

U = V ∩ {v ∈ X : ‖v‖ < M_0 + 2}    (3.51)

and

C = D ∩ B_X(0, M_0).    (3.52)

In view of (3.48), (3.49) holds for t = 1.

Assume that an integer t ∈ [1, T] and that (3.49) holds. It follows from (3.34), (3.42), and (3.52) that

z ∈ C ⊂ B_X(0, M_0).    (3.53)

In view of (3.34), (3.42), and (3.49),

x_t ∈ U ∩ B_X(0, M_0 + 1).    (3.54)

Relations (3.35), (3.40), and (3.54) imply that

ξ_t ∈ ∂f(x_t) + B_X(0, δ) ⊂ B_X(0, L + 1).    (3.55)

In view of (3.41), there exists

h ∈ B_X(x_{t+1}, δ) ∩ argmin{⟨ξ_t, v⟩ + (2a_t)^{-1} ‖v − x_t‖² : v ∈ D}.    (3.56)

By (3.42) and (3.56),

⟨ξ_t, h⟩ + (2a_t)^{-1} ‖h − x_t‖² ≤ ⟨ξ_t, z⟩ + (2a_t)^{-1} ‖z − x_t‖².    (3.57)

In view of (3.57),

⟨ξ_t, z⟩ + (2a_t)^{-1} ‖z − x_t‖² ≥ (2a_t)^{-1} ‖h − x_t‖² + ⟨ξ_t, h − x_t⟩ + ⟨ξ_t, x_t⟩.

It follows from the inequality above, (3.34), (3.36), (3.42), (3.49), and (3.55) that

(L + 1)M + (2a_t)^{-1}(14M + 1)²
  ≥ (2a_t)^{-1} ‖h − x_t‖² − (L + 1)‖h − x_t‖ − (L + 1)(15M + 1)
  ≥ (2a_t)^{-1}(‖h − x_t‖² − ‖h − x_t‖) − (L + 1)(15M + 1)
  ≥ (2a_t)^{-1}(‖h − x_t‖ − 1)² − (2a_t)^{-1} − (L + 1)(15M + 1),

(2a_t)^{-1} + (L + 1)(16M + 1) + (2a_t)^{-1}(14M + 1)² ≥ (2a_t)^{-1}(‖h − x_t‖ − 1)²,

4(14M + 1)² ≥ 2 + 16M + (14M + 1)² ≥ (‖h − x_t‖ − 1)²,
‖h − x_t‖ ≤ 28M + 4,
‖h‖ ≤ 44M + 5 < M_0.    (3.58)

By (3.52), (3.56), and (3.58),

h ∈ C.    (3.59)

Relations (3.52), (3.56), and (3.59) imply that

h ∈ argmin{⟨ξ_t, v⟩ + (2a_t)^{-1} ‖v − x_t‖² : v ∈ C}

and

h ∈ B_X(x_{t+1}, δ) ∩ argmin{⟨ξ_t, v⟩ + (2a_t)^{-1} ‖v − x_t‖² : v ∈ C}.    (3.60)

It follows from (3.31), (3.35), (3.51), (3.52)–(3.55), (3.60), and Lemma 3.2, applied with

x = x_t, a = a_t, ξ = ξ_t, u = x_{t+1},

that

a_t(f(x_t) − f(z)) ≤ δa_t(2M_0 + L + 2)
  + 8δ(M_0 + 1) + 2^{-1} a_t²(L + 1)²
  + 2^{-1} ‖z − x_t‖² − 2^{-1} ‖z − x_{t+1}‖².

Together with the inclusion a_t ∈ [α_0, α_1] this implies that

f(x_t) − f(z) ≤ 8α_0^{-1} δ(M_0 + 1)
  + (2M_0 + L + 2)δ + 2^{-1} α_1(L + 1)²
  + (2α_0)^{-1} ‖z − x_t‖² − (2α_0)^{-1} ‖z − x_{t+1}‖²    (3.61)

and (3.50) holds.

In view of (3.37), (3.43), (3.49), and (3.61),

‖z − x_t‖² − ‖z − x_{t+1}‖² ≥ 0,
‖z − x_{t+1}‖ ≤ ‖z − x_t‖ ≤ 14M + 1.

Hence by induction we showed that (3.50) holds for all t = 1, …, T and (3.49) holds for all t = 1, …, T + 1.

It follows from (3.37)–(3.39), (3.42), (3.43), and (3.50) that

Tε_0 < T(min{f(x_t) : t = 1, …, T} − f(z))
  ≤ ∑_{t=1}^{T} (f(x_t) − f(z))
  ≤ (2α_0)^{-1} ∑_{t=1}^{T} (‖z − x_t‖² − ‖z − x_{t+1}‖²)
    + 8Tα_0^{-1} δ(M_0 + 1) + T(2M_0 + L + 2)δ + 2^{-1} Tα_1(L + 1)²
  ≤ (2α_0)^{-1}(2M + 1)² + 8Tα_0^{-1} δ(M_0 + 1)
    + T(2M_0 + L + 2)δ + 2^{-1} Tα_1(L + 1)²,

ε_0 < (2α_0 T)^{-1}(2M + 1)² + 8α_0^{-1} δ(M_0 + 1)
  + (2M_0 + L + 2)δ + 2^{-1} α_1(L + 1)²,

2^{-1} ε_0 < (2α_0 T)^{-1}(2M + 1)²

and

T < α_0^{-1}(2M + 1)² ε_0^{-1} < n_0.

Thus we have shown that if an integer T ≥ 1 satisfies

f(x_t) − f(z) > ε_0, t = 1, …, T,

then T < n_0 and (3.49) holds for all t = 1, …, T + 1. This implies that there exists an integer t ∈ {1, …, n_0} such that

f(x_t) − f(z) ≤ ε_0,
‖x_t‖ ≤ 15M + 1.

Theorem 3.3 is proved.

3.6 Zero-Sum Games

We use the notation and definitions introduced in Sect. 3.1.

Let X, Y be Hilbert spaces, C be a nonempty closed convex subset of X, D be a nonempty closed convex subset of Y, U be an open convex subset of X, and V be an open convex subset of Y such that

C ⊂ U, D ⊂ V.    (3.62)

Suppose that there exist L > 0, M_0 > 0 such that

C ⊂ B_X(0, M_0), D ⊂ B_Y(0, M_0),    (3.63)

that a function f: U × V → R^1 possesses the following properties:

(i) for each v ∈ V, the function f(·, v): U → R^1 is convex;
(ii) for each u ∈ U, the function f(u, ·): V → R^1 is concave,

and that for each u ∈ U,

|f(u, v_1) − f(u, v_2)| ≤ L‖v_1 − v_2‖ for all v_1, v_2 ∈ V,    (3.64)

and for each v ∈ V,

|f(u_1, v) − f(u_2, v)| ≤ L‖u_1 − u_2‖ for all u_1, u_2 ∈ U.    (3.65)

Recall that for each (ξ, η) ∈ U × V,

∂_x f(ξ, η) = {l ∈ X : f(y, η) − f(ξ, η) ≥ ⟨l, y − ξ⟩ for all y ∈ U},
∂_y f(ξ, η) = {l ∈ Y : ⟨l, y − η⟩ ≥ f(ξ, y) − f(ξ, η) for all y ∈ V}.

In view of properties (i) and (ii) and (3.63)–(3.65), for each ξ ∈ U and each η ∈ V,

∅ ≠ ∂_x f(ξ, η) ⊂ B_X(0, L),
∅ ≠ ∂_y f(ξ, η) ⊂ B_Y(0, L).

Let

x_* ∈ C and y_* ∈ D    (3.66)

satisfy

f(x_*, y) ≤ f(x_*, y_*) ≤ f(x, y_*)    (3.67)

for each x ∈ C and each y ∈ D.

Let δ ∈ (0, 1] and {a_k}_{k=0}^∞ ⊂ (0, ∞).

Let us describe our algorithm.

Mirror Descent Algorithm for Zero-Sum Games
Initialization: select arbitrary x_0 ∈ U and y_0 ∈ V.
Iterative step: given current iteration vectors x_t ∈ U and y_t ∈ V calculate

ξ_t ∈ ∂_x f(x_t, y_t) + B_X(0, δ),
η_t ∈ ∂_y f(x_t, y_t) + B_Y(0, δ)

and the next pair of iteration vectors x_{t+1} ∈ U, y_{t+1} ∈ V such that

B_X(x_{t+1}, δ) ∩ argmin{⟨ξ_t, v⟩ + (2a_t)^{-1} ‖v − x_t‖² : v ∈ C} ≠ ∅,
B_Y(y_{t+1}, δ) ∩ argmin{⟨−η_t, u⟩ + (2a_t)^{-1} ‖u − y_t‖² : u ∈ D} ≠ ∅.

In this chapter we prove the following result.

Theorem 3.4. Let δ ∈ (0, 1] and {a_k}_{k=0}^∞ ⊂ (0, ∞). Assume that {x_t}_{t=0}^∞ ⊂ U, {y_t}_{t=0}^∞ ⊂ V, {ξ_t}_{t=0}^∞ ⊂ X, {η_t}_{t=0}^∞ ⊂ Y,

B_X(x_0, δ) ∩ C ≠ ∅, B_Y(y_0, δ) ∩ D ≠ ∅

and that for each integer t ≥ 0,

ξ_t ∈ ∂_x f(x_t, y_t) + B_X(0, δ),
η_t ∈ ∂_y f(x_t, y_t) + B_Y(0, δ),
B_X(x_{t+1}, δ) ∩ argmin{⟨ξ_t, v⟩ + (2a_t)^{-1} ‖v − x_t‖² : v ∈ C} ≠ ∅,
B_Y(y_{t+1}, δ) ∩ argmin{⟨−η_t, u⟩ + (2a_t)^{-1} ‖u − y_t‖² : u ∈ D} ≠ ∅.

For each natural number T set

x̂_T = (∑_{t=0}^{T} a_t)^{-1} ∑_{t=0}^{T} a_t x_t,  ŷ_T = (∑_{t=0}^{T} a_t)^{-1} ∑_{t=0}^{T} a_t y_t.

Then for each natural number T,

B_X(x̂_T, δ) ∩ C ≠ ∅, B_Y(ŷ_T, δ) ∩ D ≠ ∅,    (3.68)

|(∑_{t=0}^{T} a_t)^{-1} ∑_{t=0}^{T} a_t f(x_t, y_t) − f(x_*, y_*)|
  ≤ [2^{-1}(2M_0 + 1)² + 8δ(T + 1)(M_0 + 1)] (∑_{t=0}^{T} a_t)^{-1}
    + δ(2M_0 + L + 2) + 2^{-1}(L + 1)² (∑_{t=0}^{T} a_t²)(∑_{t=0}^{T} a_t)^{-1},

|f(x̂_T, ŷ_T) − (∑_{t=0}^{T} a_t)^{-1} ∑_{t=0}^{T} a_t f(x_t, y_t)|
  ≤ [2^{-1}(2M_0 + 1)² + 8δ(T + 1)(M_0 + 1)] (∑_{t=0}^{T} a_t)^{-1}
    + δ(2M_0 + 2L + 2) + 2^{-1}(L + 1)² (∑_{t=0}^{T} a_t²)(∑_{t=0}^{T} a_t)^{-1},

and for each natural number T, each z ∈ C, and each v ∈ D,

f(x̂_T, v) ≤ f(x̂_T, ŷ_T)
  + (2M_0 + 1)² (∑_{t=0}^{T} a_t)^{-1} + 16δ(T + 1)(M_0 + 1)(∑_{t=0}^{T} a_t)^{-1}
  + 2δ(2M_0 + 2L + 2) + (L + 1)² (∑_{t=0}^{T} a_t²)(∑_{t=0}^{T} a_t)^{-1},

f(z, ŷ_T) ≥ f(x̂_T, ŷ_T)
  − (2M_0 + 1)² (∑_{t=0}^{T} a_t)^{-1} − 16(∑_{t=0}^{T} a_t)^{-1}(T + 1)δ(M_0 + 1)
  − 2δ(2M_0 + 2L + 2) − (L + 1)² (∑_{t=0}^{T} a_t²)(∑_{t=0}^{T} a_t)^{-1}.

Proof. Evidently, (3.68) holds. It is not difficult to see that

‖x_t‖ ≤ M_0 + 1, ‖y_t‖ ≤ M_0 + 1, t = 0, 1, ….

Let t ≥ 0 be an integer. Applying Lemma 3.2 with

a = a_t, x = x_t, f = f(·, y_t), ξ = ξ_t, u = x_{t+1}

we obtain that for each z ∈ C,

a_t(f(x_t, y_t) − f(z, y_t)) ≤ 2^{-1} ‖z − x_t‖² − 2^{-1} ‖z − x_{t+1}‖²
  + δ(8M_0 + 8 + a_t(2M_0 + L + 2)) + 2^{-1} a_t²(L + 1)².

Applying Lemma 3.2 with

a = a_t, x = y_t, f = −f(x_t, ·), ξ = −η_t, u = y_{t+1}

we obtain that for each v ∈ D,

a_t(f(x_t, v) − f(x_t, y_t)) ≤ 2^{-1} ‖v − y_t‖² − 2^{-1} ‖v − y_{t+1}‖²
  + δ(8M_0 + 8 + a_t(2M_0 + L + 2)) + 2^{-1} a_t²(L + 1)².

For all integers t ≥ 0 set

b_t = δ(8M_0 + 8 + a_t(2M_0 + L + 2)) + 2^{-1} a_t²(L + 1)²

and define

σ(s) = 2^{-1} s², s ∈ R^1.

It is easy to see that all the assumptions of Proposition 2.9 hold, and it implies Theorem 3.4.

We are interested in the optimal choice of a_t, t = 0, 1, …. Let T be a natural number and let A_T = ∑_{t=0}^{T} a_t be given. By Theorem 3.4, in order to make the best choice of a_t, t = 0, …, T, we need to minimize the function ∑_{t=0}^{T} a_t² on the set

{a = (a_0, …, a_T) ∈ R^{T+1} : a_i ≥ 0, i = 0, …, T, ∑_{i=0}^{T} a_i = A_T}.

By Lemma 2.3, this function has a unique minimizer a = (a_0, …, a_T), where a_i = (T + 1)^{-1} A_T, i = 0, …, T, which is the best choice of a_t, t = 0, 1, …, T.

Now we will find the best a > 0. Let T be a natural number and a_t = a for all t = 0, …, T. We need to choose a which minimizes the function

φ_T(a) = ((T + 1)a)^{-1}(2M_0 + 1)² + 2δ(2M_0 + L + 2)
  + 16δ(T + 1)(M_0 + 1)(a(T + 1))^{-1} + a(L + 1)²
= (2M_0 + 1)²((T + 1)a)^{-1} + 2δ(2M_0 + L + 2) + 16δ(M_0 + 1)a^{-1} + (L + 1)² a.

Since T can be arbitrarily large, we need to find a minimizer of the function

φ(a) := 16a^{-1} δ(M_0 + 1) + (L + 1)² a, a ∈ (0, ∞).

This function has the minimizer

a = 4(δ(M_0 + 1))^{1/2} (L + 1)^{-1}

and the minimal value of φ is

8(δ(M_0 + 1))^{1/2} (L + 1).

Now our goal is to find the best T > 0 which gives an appropriate value of φ_T(a). Since, in view of the inequalities above, this value is bounded from below by c_0 δ^{1/2} with a constant c_0 depending on L, M_0, it is clear that the best choice of T is of the same order as ⌊δ^{-1}⌋. For example, T = ⌊δ^{-1}⌋.

Note that in the theorem above δ is the computational error produced by our computer system. We obtain a good approximate solution after T = ⌊δ^{-1}⌋ iterations. Namely, we obtain a pair of points x̂ ∈ U, ŷ ∈ V such that

B_X(x̂, δ) ∩ C ≠ ∅, B_Y(ŷ, δ) ∩ D ≠ ∅

and for each z ∈ C and each v ∈ D,

f(z, ŷ) ≥ f(x̂, ŷ) − cδ^{1/2},  f(x̂, v) ≤ f(x̂, ŷ) + cδ^{1/2},

where the constant c > 0 depends only on L and M_0.


Chapter 4
Gradient Algorithm with a Smooth Objective
Function

In this chapter we analyze the convergence of a projected gradient algorithm with a smooth objective function in the presence of computational errors. We show that the algorithm generates a good approximate solution if the computational errors are bounded from above by a small positive constant. Moreover, for a known computational error, we determine what kind of approximate solution can be obtained and how many iterates one needs for this.

4.1 Optimization on Bounded Sets

Let X be a Hilbert space equipped with an inner product ⟨·, ·⟩ which induces a complete norm ‖·‖. For each x ∈ X and each r > 0 set

B_X(x, r) = {y ∈ X : ‖x − y‖ ≤ r}.

For each x ∈ X and each nonempty set E ⊂ X put

d(x, E) = inf{‖x − y‖ : y ∈ E}.

Let C be a nonempty closed convex subset of X, U be an open convex subset of X such that C ⊂ U, and let f: U → R^1 be a convex continuous function.

We suppose that the function f is Fréchet differentiable at every point x ∈ U, and for every x ∈ U we denote by f′(x) ∈ X the Fréchet derivative of f at x. It is clear that for any x ∈ U and any h ∈ X,

⟨f′(x), h⟩ = lim_{t→0} t^{-1}(f(x + th) − f(x)).    (4.1)
© Springer International Publishing Switzerland 2016
A.J. Zaslavski, Numerical Optimization with Computational Errors, Springer Optimization and Its Applications 108, DOI 10.1007/978-3-319-30921-7_4
For each nonempty set D and each function g: D → R^1 set

inf(g; D) = inf{g(y) : y ∈ D},    (4.2)

argmin{g(z) : z ∈ D} = {z ∈ D : g(z) = inf(g; D)}.    (4.3)

We suppose that the mapping f′: U → X is Lipschitz on all bounded subsets of U.

It is well known (see Lemma 2.2) that for each nonempty closed convex set D ⊂ X and each x ∈ X there exists a unique point P_D(x) ∈ D such that

‖x − P_D(x)‖ = inf{‖x − y‖ : y ∈ D}.

In this chapter we study the behavior of a projected gradient algorithm with a smooth objective function which is used for solving convex constrained minimization problems [91, 92, 98].

In the sequel we use the following proposition [91, 92, 98], which is proved in Sect. 4.2.

Proposition 4.1. Assume that x, u ∈ U, L > 0 and that for each v_1, v_2 ∈ {tx + (1 − t)u : t ∈ [0, 1]},

‖f′(v_1) − f′(v_2)‖ ≤ L‖v_1 − v_2‖.

Then

f(u) ≤ f(x) + ⟨f′(x), u − x⟩ + 2^{-1} L‖u − x‖².
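Proposition 4.1 is the standard descent lemma, and it can be checked numerically on a concrete function. The sketch below uses the illustrative choice f(x) = ‖x‖², whose gradient f′(x) = 2x is Lipschitz with constant L = 2; for this particular f the inequality in fact holds with equality.

```python
import numpy as np

# Sanity check of the descent lemma for f(x) = ||x||^2,
# whose gradient f'(x) = 2x is Lipschitz with constant L = 2.
f = lambda x: float(np.dot(x, x))
grad = lambda x: 2.0 * x
L = 2.0

rng = np.random.default_rng(1)
for _ in range(100):
    x, u = rng.standard_normal(5), rng.standard_normal(5)
    upper = f(x) + np.dot(grad(x), u - x) + 0.5 * L * np.dot(u - x, u - x)
    # f(u) <= f(x) + <f'(x), u - x> + (L/2) ||u - x||^2
    assert f(u) <= upper + 1e-9
```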

Suppose that there exist L > 1, M_0 > 0 such that

C ⊂ B_X(0, M_0),    (4.4)

|f(v_1) − f(v_2)| ≤ L‖v_1 − v_2‖ for all v_1, v_2 ∈ U,    (4.5)

‖f′(v_1) − f′(v_2)‖ ≤ L‖v_1 − v_2‖ for all v_1, v_2 ∈ U.    (4.6)

Let δ ∈ (0, 1]. We describe below our algorithm.

Gradient Algorithm
Initialization: select an arbitrary x_0 ∈ U ∩ B_X(0, M_0).
Iterative step: given a current iteration vector x_t ∈ U calculate

ξ_t ∈ f′(x_t) + B_X(0, δ)

and calculate the next iteration vector x_{t+1} ∈ U such that

‖x_{t+1} − P_C(x_t − L^{-1} ξ_t)‖ ≤ δ.
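The following Python sketch runs this algorithm with simulated δ-errors in both the gradient evaluation and the projection step. It assumes, purely for illustration, that C is a box so that P_C is a componentwise clip; none of these concrete choices appear in the text.

```python
import numpy as np

def gradient_algorithm(grad, x0, L, T, project, delta=0.0, seed=0):
    # Inexact iterative step: x_{t+1} is within delta of
    # P_C(x_t - L^{-1} xi_t), where xi_t is a delta-accurate gradient.
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)

    def err():  # an arbitrary vector in B_X(0, delta)
        e = rng.standard_normal(x.shape)
        return delta * e / max(np.linalg.norm(e), 1.0)

    for _ in range(T):
        xi = grad(x) + err()
        x = project(x - xi / L) + err()
    return x

# Illustration: C = [0, 1]^2 and f(x) = ||x - b||^2 with b outside C;
# the constrained minimizer is the clipped point (1, 0).
b = np.array([2.0, -1.0])
x = gradient_algorithm(lambda x: 2.0 * (x - b), np.array([0.5, 0.5]),
                       L=2.0, T=100,
                       project=lambda v: np.clip(v, 0.0, 1.0),
                       delta=1e-4)
```

For this quadratic the gradient is 2-Lipschitz, so L = 2 is a valid choice, and the iterates land within O(δ) of the constrained minimizer, consistent with Theorem 4.2 below.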

In this chapter we prove the following result.

Theorem 4.2. Let δ ∈ (0, 1] and let

x_0 ∈ U ∩ B_X(0, M_0).    (4.7)

Assume that {x_t}_{t=1}^∞ ⊂ U, {ξ_t}_{t=0}^∞ ⊂ X and that for each integer t ≥ 0,

‖ξ_t − f′(x_t)‖ ≤ δ    (4.8)

and

‖x_{t+1} − P_C(x_t − L^{-1} ξ_t)‖ ≤ δ.    (4.9)

Then for each natural number T,

f(x_{T+1}) − inf(f; C) ≤ (2T)^{-1} L(2M_0 + 1)² + Lδ(8M_0 + 8)(T + 1)    (4.10)

and

min{f(x_t) : t = 2, …, T + 1} − inf(f; C),  f(T^{-1} ∑_{t=2}^{T+1} x_t) − inf(f; C)
  ≤ (2T)^{-1} L(2M_0 + 1)² + Lδ(8M_0 + 8).    (4.11)

We are interested in an optimal choice of T. If we choose T in order to minimize the right-hand side of (4.11), we obtain that T should be of the same order as δ^{-1}. In this case the right-hand side of (4.11) is of the same order as δ. For example, if T = ⌊δ^{-1}⌋ + 1 ≥ δ^{-1}, then the right-hand side of (4.11) does not exceed 2Lδ(4M_0 + 4 + (2M_0 + 1)²).

4.2 An Auxiliary Result and the Proof of Proposition 4.1

Proposition 4.3. Let D be a nonempty closed convex subset of X, x ∈ X and y ∈ D. Assume that for each z ∈ D,

⟨z − y, x − y⟩ ≤ 0.    (4.12)

Then y = P_D(x).

Proof. Let z ∈ D. By (4.12),

⟨z − x, z − x⟩ = ⟨z − y + (y − x), z − y + (y − x)⟩
  = ⟨y − x, y − x⟩ + 2⟨z − y, y − x⟩ + ⟨z − y, z − y⟩
  ≥ ⟨y − x, y − x⟩ + ⟨z − y, z − y⟩
  ≥ ‖y − x‖² + ‖z − y‖².

Thus y = P_D(x). Proposition 4.3 is proved.

Proof of Proposition 4.1. For each t ∈ [0, 1] set

ψ(t) = f(x + t(u − x)).    (4.13)

Clearly, ψ is a differentiable function and for each t ∈ [0, 1],

ψ′(t) = ⟨f′(x + t(u − x)), u − x⟩.    (4.14)

By (4.13), (4.14), and the proposition assumptions,

f(u) − f(x) = ψ(1) − ψ(0)
  = ∫_0^1 ψ′(t) dt = ∫_0^1 ⟨f′(x + t(u − x)), u − x⟩ dt
  = ∫_0^1 ⟨f′(x), u − x⟩ dt + ∫_0^1 ⟨f′(x + t(u − x)) − f′(x), u − x⟩ dt
  ≤ ⟨f′(x), u − x⟩ + ∫_0^1 Lt‖u − x‖² dt
  = ⟨f′(x), u − x⟩ + L‖u − x‖² ∫_0^1 t dt
  = ⟨f′(x), u − x⟩ + L‖u − x‖²/2.

Proposition 4.1 is proved.

4.3 The Main Lemma

We use the notation, definitions, and assumptions introduced in Sect. 4.1.

Lemma 4.4. Let δ ∈ (0, 1],

u ∈ B_X(0, M_0 + 1) ∩ U,    (4.15)

ξ ∈ X satisfy

‖ξ − f′(u)‖ ≤ δ    (4.16)

and let v ∈ U satisfy

‖v − P_C(u − L^{-1} ξ)‖ ≤ δ.    (4.17)

Then for each x ∈ U satisfying

B(x, δ) ∩ C ≠ ∅    (4.18)

the following inequalities hold:

f(x) − f(v) ≥ 2^{-1} L‖x − v‖² − 2^{-1} L‖x − u‖² − δL(8M_0 + 8),    (4.19)

f(x) − f(v) ≥ 2^{-1} L‖u − v‖² + L⟨v − u, u − x⟩ − δL(8M_0 + 12).    (4.20)

Proof. For each x ∈ U define

g(x) = f(u) + ⟨f′(u), x − u⟩ + 2^{-1} L‖x − u‖².    (4.21)

Clearly, g: U → R^1 is a convex Fréchet differentiable function, for each x ∈ U,

g′(x) = f′(u) + L(x − u),    (4.22)

lim_{‖x‖→∞} g(x) = ∞    (4.23)

and there exists

x_0 ∈ C    (4.24)

such that

g(x_0) ≤ g(x) for all x ∈ C.    (4.25)

By (4.24) and (4.25), for all z ∈ C,

⟨g′(x_0), z − x_0⟩ ≥ 0.    (4.26)

In view of (4.22) and (4.26),

⟨L^{-1} f′(u) + x_0 − u, z − x_0⟩ ≥ 0 for all z ∈ C.    (4.27)

Proposition 4.3, (4.24), and (4.27) imply that

x_0 = P_C(u − L^{-1} f′(u)).    (4.28)

It follows from (4.16), (4.28), and Lemma 2.2 that

‖v − x_0‖
  ≤ ‖v − P_C(u − L^{-1} ξ)‖ + ‖P_C(u − L^{-1} ξ) − P_C(u − L^{-1} f′(u))‖
  ≤ δ + L^{-1} ‖ξ − f′(u)‖ ≤ δ(1 + L^{-1}).    (4.29)

In view of (4.4) and (4.24),

‖x_0‖ ≤ M_0.    (4.30)

Relations (4.4) and (4.17) imply that

‖v‖ ≤ M_0 + 1.    (4.31)

By (4.6), (4.15), (4.21), and Proposition 4.1, for all x ∈ U,

f(x) ≤ f(u) + ⟨f′(u), x − u⟩ + 2^{-1} L‖u − x‖² = g(x).    (4.32)

Let

x ∈ U    (4.33)

satisfy

B(x, δ) ∩ C ≠ ∅.    (4.34)

It follows from (4.5) and (4.29) that

|f(x_0) − f(v)| ≤ L‖v − x_0‖ ≤ δ(L + 1).    (4.35)

In view of (4.24) and (4.32),

g(x_0) ≥ f(x_0).    (4.36)

By (4.21), (4.36), and the convexity of f,

f(x) − f(x_0) ≥ f(x) − g(x_0)
  = f(x) − f(u) − ⟨f′(u), x_0 − u⟩ − 2^{-1} L‖u − x_0‖²
  ≥ f(u) + ⟨f′(u), x − u⟩ − f(u) − ⟨f′(u), x_0 − u⟩ − 2^{-1} L‖u − x_0‖²
  = ⟨f′(u), x − x_0⟩ − 2^{-1} L‖u − x_0‖².    (4.37)

Relation (4.34) implies that there exists

x_1 ∈ C    (4.38)

such that

‖x_1 − x‖ ≤ δ.    (4.39)

By (4.5), (4.33), (4.38), and (4.39),

|f(x_1) − f(x)| ≤ Lδ.    (4.40)

It follows from (4.26) (with z = x_1) and (4.38) that

0 ≤ ⟨g′(x_0), x_1 − x_0⟩ = ⟨g′(x_0), x_1 − x⟩ + ⟨g′(x_0), x − x_0⟩.    (4.41)

By (4.4), (4.5), (4.15), (4.22), (4.24), (4.38), and (4.39),

⟨g′(x_0), x_1 − x⟩ = ⟨f′(u), x_1 − x⟩ + L⟨x_0 − u, x_1 − x⟩
  ≤ Lδ + Lδ(2M_0 + 1).    (4.42)

In view of (4.22) and (4.24),

⟨g′(x_0), x − x_0⟩ = ⟨f′(u) + L(x_0 − u), x − x_0⟩.    (4.43)

Relations (4.41) and (4.43) imply that

⟨f′(u), x − x_0⟩ = ⟨g′(x_0), x − x_0⟩ − L⟨x_0 − u, x − x_0⟩
  ≥ −⟨g′(x_0), x_1 − x⟩ − L⟨x_0 − u, x − x_0⟩
  ≥ −L⟨x_0 − u, x − x_0⟩ − Lδ(2M_0 + 2).    (4.44)

It follows from (4.37) and (4.44) that

f(x) − f(x_0) ≥ ⟨f′(u), x − x_0⟩ − 2^{-1} L‖x_0 − u‖²
  ≥ −Lδ(2M_0 + 2) − L⟨x_0 − u, x − x_0⟩ − 2^{-1} L‖x_0 − u‖².    (4.45)

In view of (4.45) and Lemma 2.1,

f(x) − f(x_0) ≥ −Lδ(2M_0 + 2) − 2^{-1} L‖x_0 − u‖²
  − 2^{-1} L[‖x − u‖² − ‖x − x_0‖² − ‖u − x_0‖²]
= 2^{-1} L‖x − x_0‖² − 2^{-1} L‖x − u‖² − Lδ(2M_0 + 2).    (4.46)

By (4.4), (4.24), (4.34), and (4.35),

f(x) − f(v) ≥ f(x) − f(x_0) − δ(L + 1).    (4.47)

It follows from (4.15), (4.17), (4.24), and (4.29) that

|‖x − x_0‖² − ‖x − v‖²|
  = |‖x − x_0‖ − ‖x − v‖| (‖x − x_0‖ + ‖x − v‖) ≤ δ(8M_0 + 8)    (4.48)

and

|‖u − x_0‖² − ‖u − v‖²|
  = |‖u − x_0‖ − ‖u − v‖| (‖u − x_0‖ + ‖u − v‖) ≤ δ(8M_0 + 8).    (4.49)

In view of (4.45),

f(x) − f(x_0) ≥ −Lδ(2M_0 + 2) − 2^{-1} L‖x_0 − u‖²
  + L⟨x_0 − u, x_0 − u⟩ − L⟨x_0 − u, x − u⟩
= −Lδ(2M_0 + 2) + 2^{-1} L‖x_0 − u‖² − L⟨x_0 − u, x − u⟩.    (4.50)

By (4.35), (4.46), and (4.48),

f(x) − f(v) ≥ f(x) − f(x_0) − δ(L + 1)
  ≥ 2^{-1} L‖x − x_0‖² − 2^{-1} L‖x − u‖² − Lδ(2M_0 + 4)
  ≥ 2^{-1} L‖x − v‖² − 2^{-1} L‖x − u‖² − 2Lδ(4M_0 + 4)

and (4.19) holds. It follows from (4.15), (4.29), (4.34), (4.35), (4.49), and (4.50) that

f(x) − f(v) ≥ f(x) − f(x_0) − δ(L + 1)
  ≥ −Lδ(2M_0 + 4) + 2^{-1} L‖x_0 − u‖² − L⟨x_0 − u, x − u⟩
  ≥ −Lδ(2M_0 + 4) + 2^{-1} L‖u − v‖² − 4Lδ(M_0 + 1)
    − L⟨v − u, x − u⟩ − Lδ(2M_0 + 4)
  ≥ 2^{-1} L‖u − v‖² − L⟨v − u, x − u⟩ − Lδ(8M_0 + 12)

and (4.20) holds. Lemma 4.4 is proved.

4.4 Proof of Theorem 4.2

Clearly, the function f has a minimizer on the set C. Fix

z ∈ C    (4.51)

such that

f(z) = inf(f; C).    (4.52)

It is easy to see that

‖x_t‖ ≤ M_0 + 1, t = 0, 1, ….    (4.53)

Let T be a natural number and t ≥ 0 be an integer. Applying Lemma 4.4 with

u = x_t, ξ = ξ_t, v = x_{t+1}, x = z

we obtain that

f(z) − f(x_{t+1}) ≥ 2^{-1} L‖z − x_{t+1}‖² − 2^{-1} L‖z − x_t‖² − δL(8M_0 + 8).

This implies that

Tf(z) − ∑_{t=1}^{T} f(x_{t+1})
  ≥ ∑_{t=1}^{T} (2^{-1} L‖z − x_{t+1}‖² − 2^{-1} L‖z − x_t‖²) − δLT(8M_0 + 8)
  = 2^{-1} L(‖z − x_{T+1}‖² − ‖z − x_1‖²) − δLT(8M_0 + 8).    (4.54)

Let t ≥ 0 be an integer. Applying Lemma 4.4 with

x = x_{t+1}, u = x_{t+1}, ξ = ξ_{t+1}, v = x_{t+2}

we obtain that

f(x_{t+1}) − f(x_{t+2}) ≥ 2^{-1} L‖x_{t+2} − x_{t+1}‖² − δL(8M_0 + 8)

and

t(f(x_{t+1}) − f(x_{t+2})) ≥ 2^{-1} Lt‖x_{t+2} − x_{t+1}‖² − δLt(8M_0 + 8).

We can write the relation above as

tf(x_{t+1}) − (t + 1)f(x_{t+2}) + f(x_{t+2})
  ≥ 2^{-1} Lt‖x_{t+2} − x_{t+1}‖² − δLt(8M_0 + 8).    (4.55)

Summing (4.55) with t = 0, …, T − 1 we obtain that

−Tf(x_{T+1}) + ∑_{t=0}^{T−1} f(x_{t+2})
  = ∑_{t=0}^{T−1} [tf(x_{t+1}) − (t + 1)f(x_{t+2}) + f(x_{t+2})]
  ≥ ∑_{t=0}^{T−1} 2^{-1} Lt‖x_{t+2} − x_{t+1}‖² − δL(8M_0 + 8) ∑_{t=0}^{T−1} t.    (4.56)

By (4.54) and (4.56),

T(f(z) − f(x_{T+1}))
  ≥ −2^{-1} L‖z − x_1‖² − LδT(8M_0 + 8) − Lδ(4M_0 + 4)T(T − 1)

and in view of (4.51) and (4.53),

f(x_{T+1}) − f(z) ≤ (2T)^{-1} L(2M_0 + 1)² + Lδ(8M_0 + 8)(T + 1).

In view of (4.54),

T(min{f(x_t) : t = 2, …, T + 1} − f(z)),  T(f(T^{-1} ∑_{t=2}^{T+1} x_t) − f(z))
  ≤ ∑_{t=1}^{T} f(x_{t+1}) − Tf(z)
  ≤ 2^{-1} L(2M_0 + 1)² + LδT(8M_0 + 8).

This completes the proof of Theorem 4.2.

4.5 Optimization on Unbounded Sets

We use the notation and definitions introduced in Sect. 4.1. Let X be a Hilbert space with an inner product ⟨·, ·⟩ which induces a complete norm ‖·‖. Let D be a nonempty closed convex subset of X, V be an open convex subset of X such that

D ⊂ V

and f: V → R^1 be a convex Fréchet differentiable function which is Lipschitz on all bounded subsets of V. Set

D_min = {x ∈ D : f(x) ≤ f(y) for all y ∈ D}.    (4.57)

We suppose that

D_min ≠ ∅.    (4.58)

We will prove the following result.

Theorem 4.5. Let δ ∈ (0, 1], M > 0 satisfy

D_min ∩ B_X(0, M) ≠ ∅,    (4.59)

M_0 ≥ 4M + 8, L ≥ 1 satisfy

|f(v_1) − f(v_2)| ≤ L‖v_1 − v_2‖ for all v_1, v_2 ∈ V ∩ B_X(0, M_0 + 2),    (4.60)

‖f′(v_1) − f′(v_2)‖ ≤ L‖v_1 − v_2‖ for all v_1, v_2 ∈ V ∩ B_X(0, M_0 + 2),    (4.61)

ε_0 = 4Lδ(2M_0 + 3)    (4.62)

and let

n_0 = ⌊(4δ)^{-1}(2M + 1)²⌋ + 1.    (4.63)

Assume that {x_t}_{t=0}^∞ ⊂ V, {ξ_t}_{t=0}^∞ ⊂ X,

‖x_0‖ ≤ M    (4.64)

and that for each integer t ≥ 0,

‖ξ_t − f′(x_t)‖ ≤ δ    (4.65)

and

‖x_{t+1} − P_D(x_t − L^{-1} ξ_t)‖ ≤ δ.    (4.66)

Then there exists an integer q ∈ [1, n_0 + 1] such that

f(x_q) ≤ inf(f; D) + ε_0,
‖x_i‖ ≤ 3M + 3, i = 0, …, q.

Proof. By (4.59) there exists

z ∈ D_min ∩ B_X(0, M).    (4.67)

By (4.60), (4.64)–(4.67), and Lemma 2.2,

‖x_1 − z‖ ≤ ‖x_1 − P_D(x_0 − L^{-1} ξ_0)‖ + ‖P_D(x_0 − L^{-1} ξ_0) − z‖
  ≤ δ + ‖x_0 − z‖ + L^{-1} ‖ξ_0‖
  ≤ 1 + 2M + L^{-1}(L + 1) ≤ 2M + 3.    (4.68)

In view of (4.67) and (4.68),

‖x_1‖ ≤ 3M + 3.    (4.69)

Assume that an integer T ≥ 0 and that for all t = 1, …, T + 1,

f(x_t) − f(z) > ε_0.    (4.70)

Set

U = V ∩ {v ∈ X : ‖v‖ < M_0 + 2}    (4.71)

and

C = D ∩ B_X(0, M_0).    (4.72)

Assume that an integer t ∈ [0, T] and that

‖x_t − z‖ ≤ 2M + 3.    (4.73)

(In view of (4.64), (4.67), and (4.68), our assumption is true for t = 0.) By (4.67) and (4.72),

z ∈ C ⊂ B_X(0, M_0).    (4.74)

Relations (4.67), (4.71), and (4.73) imply that

x_t ∈ U ∩ B_X(0, M_0 + 1).    (4.75)

It follows from (4.60), (4.65), and (4.75) that

ξ_t ∈ f′(x_t) + B_X(0, 1) ⊂ B_X(0, L + 1).    (4.76)

By (4.67), (4.73), (4.76), and Lemma 2.2,

‖z − P_D(x_t − L^{-1} ξ_t)‖ ≤ ‖z − x_t + L^{-1} ξ_t‖
  ≤ ‖z − x_t‖ + L^{-1} ‖ξ_t‖ ≤ 2M + 5.    (4.77)

In view of (4.67) and (4.77),

‖P_D(x_t − L^{-1} ξ_t)‖ ≤ 3M + 5.    (4.78)

Relations (4.72) and (4.78) imply that

P_D(x_t − L^{-1} ξ_t) ∈ C,    (4.79)

P_D(x_t − L^{-1} ξ_t) = P_C(x_t − L^{-1} ξ_t).    (4.80)

It follows from (4.66), (4.71), and (4.78) that

‖x_{t+1}‖ ≤ 3M + 6, x_{t+1} ∈ U.    (4.81)

By (4.65), (4.66), (4.79), (4.80), (4.81), and Lemma 4.4 applied with

u = x_t, ξ = ξ_t, v = x_{t+1}, x = z

we obtain that

f(z) − f(x_{t+1}) ≥ 2^{-1} L‖z − x_{t+1}‖² − 2^{-1} L‖z − x_t‖² − Lδ(8M_0 + 8).    (4.82)

By (4.62), (4.70), and (4.82),

4Lδ(2M_0 + 3) = ε_0 < f(x_{t+1}) − f(z)
  ≤ 2^{-1} L(‖z − x_t‖² − ‖z − x_{t+1}‖²) + Lδ(8M_0 + 8).    (4.83)

In view of (4.73) and (4.83),

‖z − x_{t+1}‖ ≤ ‖z − x_t‖ ≤ 2M + 3.

Thus by induction we showed that for all t = 0, …, T + 1

‖z − x_t‖ ≤ 2M + 3, ‖x_t‖ ≤ 3M + 3

and that (4.83) holds for all t = 0, …, T.

It follows from (4.63), (4.64), (4.67), and (4.83) that

4(1 + T)Lδ(2M_0 + 3)
  ≤ (1 + T)(min{f(x_t) : t = 1, …, T + 1} − f(z))
  ≤ ∑_{t=0}^{T} (f(x_{t+1}) − f(z))
  ≤ 2^{-1} L ∑_{t=0}^{T} (‖z − x_t‖² − ‖z − x_{t+1}‖²) + (T + 1)Lδ(8M_0 + 8),

2Lδ(1 + T) ≤ 2^{-1} L‖z − x_0‖² ≤ 2^{-1} L(2M + 1)²

and

T < (2M + 1)²(4δ)^{-1} ≤ n_0.

Thus we assumed that an integer T ≥ 0 satisfies

f(x_t) − f(z) > ε_0, t = 1, …, T + 1

and showed that T ≤ n_0 − 1 and

‖x_t‖ ≤ 3M + 3, t = 0, …, T + 1.

This implies that there exists a natural number q ≤ n_0 + 1 such that

f(x_q) − f(z) ≤ ε_0 = 4Lδ(2M_0 + 3),
‖x_t‖ ≤ 3M + 3, t = 0, …, q.

Theorem 4.5 is proved.

Note that in the theorem above δ is the computational error produced by our computer system. We obtain a good approximate solution after ⌊(4δ)^{-1}(2M + 1)²⌋ + 2 iterations. Namely, we obtain a point x ∈ X such that

B_X(x, δ) ∩ D ≠ ∅

and

f(x) ≤ inf(f; D) + 4Lδ(2M_0 + 3).


Chapter 5
An Extension of the Gradient Algorithm

In this chapter we analyze the convergence, in the presence of computational errors, of a gradient type algorithm which was introduced by Beck and Teboulle [20] for solving linear inverse problems arising in signal/image processing. We show that the algorithm generates a good approximate solution if the computational errors are bounded from above by a small positive constant. Moreover, for a known computational error, we determine what kind of approximate solution can be obtained and how many iterates one needs for this.

5.1 Preliminaries and the Main Result

Let X be a Hilbert space equipped with an inner product ⟨·, ·⟩ which induces a complete norm ‖·‖. For each x ∈ X and each r > 0 set

B_X(x, r) = {y ∈ X : ‖x − y‖ ≤ r}.

Suppose that f: X → R^1 is a convex Fréchet differentiable function on X, and for every x ∈ X denote by f′(x) ∈ X the Fréchet derivative of f at x. It is clear that for any x ∈ X and any h ∈ X,

⟨f′(x), h⟩ = lim_{t→0} t^{-1}(f(x + th) − f(x)).

For each function ψ: X → R^1 set

inf(ψ) = inf{ψ(y) : y ∈ X},

argmin(ψ) = argmin{ψ(z) : z ∈ X} := {z ∈ X : ψ(z) = inf(ψ)}.

We suppose that the mapping f′: X → X is Lipschitz on all bounded subsets of X.


Let g W X ! R1 be a convex continuous function which is Lipschitz on all


bounded subsets of X. Define

F.x/ D f .x/ C g.x/; x 2 X:

We suppose that

argmin.F/ 6D ; (5.1)

and that there exists c 2 R1 such that

g.x/  c for all x 2 X: (5.2)

For each u 2 X, each  2 X and each L > 0 define a convex function

.L/
Gu; .w/ D f .u/ C h; w  ui C 21 Lkw  uk2 C g.w/; w 2 X (5.3)

which has a minimizer.
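Completing the square shows that, up to an additive constant independent of w, G_{u,ξ}^{(L)}(w) = 2^{-1} L‖w − (u − L^{-1} ξ)‖² + g(w), so its minimizer is the proximal point of u − L^{-1} ξ with respect to L^{-1} g. For the illustrative choice g(w) = λ‖w‖₁ (not fixed anywhere in the text), this proximal point is computed by componentwise soft thresholding:

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1: componentwise shrinkage.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def minimize_G(u, xi, L, lam):
    # argmin_w  f(u) + <xi, w - u> + (L/2)||w - u||^2 + lam * ||w||_1
    #   = prox_{(lam/L) ||.||_1}(u - xi / L),  by completing the square.
    return soft_threshold(u - xi / L, lam / L)

u = np.array([1.0, -0.2, 0.0])
xi = np.array([0.4, 0.0, 0.0])
w = minimize_G(u, xi, L=2.0, lam=0.5)
```

One can verify the optimality condition of Lemma 5.2 directly: at the first coordinate, ξ₁ + L(w₁ − u₁) + λ·sign(w₁) = 0.4 − 0.9 + 0.5 = 0.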


In this chapter we analyze the gradient type algorithm introduced by Beck and
Teboulle in [20] for solving linear inverse problems, and prove the following result.

Theorem 5.1. Let $\delta\in(0,1]$ and $M\ge 1$ satisfy

$$\operatorname{argmin}(F)\cap B_X(0,M)\ne\emptyset, \tag{5.4}$$

let $L>1$ satisfy

$$|f(w_1)-f(w_2)|\le L\|w_1-w_2\|\ \text{ for all }w_1,w_2\in B_X(0,3M+2) \tag{5.5}$$

and

$$\|f'(w_1)-f'(w_2)\|\le L\|w_1-w_2\|\ \text{ for all }w_1,w_2\in X, \tag{5.6}$$

let $M_1\ge 3M$ satisfy

$$|f(w)|,\ F(w)\le M_1\ \text{ for all }w\in B_X(0,3M+2), \tag{5.7}$$

let

$$M_2=(8(M_1+|c|+1+(L+1)^2))^{1/2}+3M+2, \tag{5.8}$$

let $L_0>1$ satisfy

$$|g(w_1)-g(w_2)|\le L_0\|w_1-w_2\|\ \text{ for all }w_1,w_2\in B_X(0,M_2), \tag{5.9}$$

let

$$\epsilon_0=2\delta((M_2+3M+2)(2L+3)+L_0) \tag{5.10}$$

and let

$$n_0=\lfloor 4LM^2\epsilon_0^{-1}\rfloor. \tag{5.11}$$

Assume that $\{x_t\}_{t=0}^\infty\subset X$, $\{\xi_t\}_{t=0}^\infty\subset X$,

$$\|x_0\|\le M \tag{5.12}$$

and that for each integer $t\ge 0$,

$$\|\xi_t-f'(x_t)\|\le\delta \tag{5.13}$$

and

$$B_X(x_{t+1},\delta)\cap\operatorname{argmin}(G^{(L)}_{x_t,\xi_t})\ne\emptyset. \tag{5.14}$$

Then there exists an integer $q\in[0,n_0+2]$ such that

$$\|x_i\|\le M_2,\quad i=0,\dots,q,$$

and

$$F(x_q)\le\inf(F)+\epsilon_0.$$

Note that in the theorem above $\delta$ is the computational error produced by our
computer system. We obtain a good approximate solution after $\lfloor c_1\delta^{-1}\rfloor$ iterations
[see (5.10) and (5.11)], where $c_1>0$ is a constant which depends only on
$L,L_0,M,M_2$. As a result we obtain a point $x\in X$ such that

$$F(x)\le\inf(F)+c_2\delta,$$

where $c_2>0$ is a constant which depends only on $L,L_0,M,M_2$.
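The iteration of Theorem 5.1 can be made concrete. A minimal sketch, assuming the classical instance of [20] with $f(x)=\frac{1}{2}\|Ax-b\|^2$ and $g(x)=\lambda\|x\|_1$ (the data and all parameter values below are illustrative, not from the book): for this $g$, the exact minimizer of $G^{(L)}_{u,\xi}$ is a soft-thresholding step, and `delta` models the bounded gradient error of (5.13).

```python
import numpy as np

def prox_l1(v, t):
    # exact minimizer of t*||.||_1 + 0.5*||. - v||^2 (soft-thresholding)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def inexact_proximal_gradient(A, b, lam, L, n_iter, delta=0.0, rng=None):
    """Iterate x_{t+1} = argmin G^(L)_{x_t, xi_t}, where xi_t is the true
    gradient of f at x_t corrupted by an error of norm at most delta."""
    rng = rng or np.random.default_rng(0)
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        xi = A.T @ (A @ x - b)               # f'(x_t)
        if delta > 0.0:                      # ||xi_t - f'(x_t)|| <= delta, cf. (5.13)
            e = rng.standard_normal(x.shape)
            xi = xi + delta * e / np.linalg.norm(e)
        # minimizer of f(u) + <xi, w-u> + (L/2)||w-u||^2 + lam*||w||_1
        x = prox_l1(x - xi / L, lam / L)
    return x
```

With `delta = 0` this is the exact algorithm; with `delta > 0` the computed point still attains an objective value within $O(\delta)$ of the minimum, in line with the bound $F(x_q)\le\inf(F)+\epsilon_0$, where $\epsilon_0$ is proportional to $\delta$ by (5.10).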

5.2 Auxiliary Results

Lemma 5.2 ([20]). Let $u,\xi\in X$ and $L>0$. Then the function $G^{(L)}_{u,\xi}$ has a point of
minimum, and $z\in X$ is a minimizer of $G^{(L)}_{u,\xi}$ if and only if

$$0\in\xi+L(z-u)+\partial g(z).$$

Proof. By (5.2) and (5.3),

$$\lim_{\|w\|\to\infty}G^{(L)}_{u,\xi}(w)=\infty.$$

This implies that the function $G^{(L)}_{u,\xi}$ has a minimizer. Clearly, $z$ is a minimizer of
$G^{(L)}_{u,\xi}$ if and only if

$$0\in\partial G^{(L)}_{u,\xi}(z)=\xi+L(z-u)+\partial g(z).$$

Lemma 5.2 is proved.


Lemma 5.3. Let $M_0\ge 1$ and $L>1$ satisfy

$$|f(w_1)-f(w_2)|\le L\|w_1-w_2\|\ \text{ for all }w_1,w_2\in B_X(0,M_0+2)$$

and

$$\|f'(w_1)-f'(w_2)\|\le L\|w_1-w_2\|\ \text{ for all }w_1,w_2\in X, \tag{5.15}$$

and let $M_1\ge M_0$ satisfy

$$|f(w)|,\ F(w)\le M_1\ \text{ for all }w\in B_X(0,M_0+2). \tag{5.16}$$

Assume that

$$u\in B_X(0,M_0+1), \tag{5.17}$$

$\xi\in X$ satisfies

$$\|\xi-f'(u)\|\le 1 \tag{5.18}$$

and that $v\in X$ satisfies

$$B_X(v,1)\cap\{z\in X:\ G^{(L)}_{u,\xi}(z)\le\inf(G^{(L)}_{u,\xi})+1\}\ne\emptyset. \tag{5.19}$$

Then

$$\|v\|\le(8(M_1+|c|+(L+1)^2+1))^{1/2}+M_0+2.$$

Proof. In view of (5.19), there exists

$$\hat v\in B_X(v,1) \tag{5.20}$$

such that

$$G^{(L)}_{u,\xi}(\hat v)\le\inf(G^{(L)}_{u,\xi})+1. \tag{5.21}$$

By (5.3), (5.16), (5.17), and (5.21),

$$f(u)+\langle\xi,\hat v-u\rangle+2^{-1}L\|\hat v-u\|^2+g(\hat v)=G^{(L)}_{u,\xi}(\hat v)\le G^{(L)}_{u,\xi}(u)+1=F(u)+1\le M_1+1. \tag{5.22}$$

It follows from (5.2), (5.16), (5.17), and (5.22) that

$$\langle\xi,\hat v-u\rangle+2^{-1}L\|\hat v-u\|^2\le 2M_1+1+|c|. \tag{5.23}$$

It is clear that

$$2L^{-1}|\langle\xi,\hat v-u\rangle|\le L^{-1}(4^{-1}\|\hat v-u\|^2+4\|\xi\|^2). \tag{5.24}$$

Since the function $f$ is Lipschitz on $B_X(0,M_0+2)$, relations (5.17) and (5.18) imply
that

$$\|\xi\|\le\|f'(u)\|+1\le L+1. \tag{5.25}$$

By (5.23)–(5.25),

$$2L^{-1}(2M_1+1+|c|)\ge\|\hat v-u\|^2-2L^{-1}|\langle\xi,\hat v-u\rangle|$$
$$\ge\|\hat v-u\|^2-4^{-1}\|\hat v-u\|^2-4\|\xi\|^2\ge 2^{-1}\|\hat v-u\|^2-4(L+1)^2.$$

This implies that

$$\|\hat v-u\|^2\le 4(2M_1+1+|c|)+8(L+1)^2$$

and

$$\|\hat v-u\|\le(4(2M_1+1+|c|)+8(L+1)^2)^{1/2}.$$

Together with (5.17) and (5.20) this implies that

$$\|v\|\le\|\hat v\|+1\le\|\hat v-u\|+\|u\|+1$$
$$\le(4(2M_1+1+|c|)+8(L+1)^2)^{1/2}+M_0+2\le(8(M_1+|c|+(L+1)^2+1))^{1/2}+M_0+2.$$

Lemma 5.3 is proved.



5.3 The Main Lemma

Lemma 5.4. Let $\delta\in(0,1]$, $M_0\ge 1$ and $L>1$ satisfy

$$|f(w_1)-f(w_2)|\le L\|w_1-w_2\|\ \text{ for all }w_1,w_2\in B_X(0,M_0+2) \tag{5.26}$$

and

$$\|f'(w_1)-f'(w_2)\|\le L\|w_1-w_2\|\ \text{ for all }w_1,w_2\in X, \tag{5.27}$$

let $M_1\ge M_0$ satisfy

$$|f(w)|,\ F(w)\le M_1\ \text{ for all }w\in B_X(0,M_0+2), \tag{5.28}$$

let

$$M_2=(8(M_1+|c|+(L+1)^2+1))^{1/2}+M_0+2 \tag{5.29}$$

and let $L_0>1$ satisfy

$$|g(w_1)-g(w_2)|\le L_0\|w_1-w_2\|\ \text{ for all }w_1,w_2\in B_X(0,M_2). \tag{5.30}$$

Assume that

$$u\in B_X(0,M_0+1), \tag{5.31}$$

$\xi\in X$ satisfies

$$\|\xi-f'(u)\|\le\delta \tag{5.32}$$

and that $v\in X$ satisfies

$$B_X(v,\delta)\cap\operatorname{argmin}(G^{(L)}_{u,\xi})\ne\emptyset. \tag{5.33}$$

Then for each $x\in B_X(0,M_0+1)$,

$$F(x)-F(v)\ge 2^{-1}L\|v-x\|^2-2^{-1}L\|u-x\|^2-\delta((M_2+M_0+2)(2L+3)+L_0).$$

Proof. By (5.33) there exists $\hat v\in X$ such that

$$\hat v\in\operatorname{argmin}(G^{(L)}_{u,\xi}) \tag{5.34}$$

and

$$\|v-\hat v\|\le\delta. \tag{5.35}$$

In view of the assumptions of the lemma, Lemma 5.3 and (5.34),

$$\|v\|,\ \|\hat v\|\le M_2. \tag{5.36}$$

Let

$$x\in B_X(0,M_0+1). \tag{5.37}$$

Clearly,

$$F(x)=f(x)+g(x),\quad F(v)=f(v)+g(v). \tag{5.38}$$

Proposition 4.1 and (5.31) imply that

$$g(v)+f(v)\le f(u)+\langle f'(u),v-u\rangle+2^{-1}L\|v-u\|^2+g(v). \tag{5.39}$$

By (5.3), (5.31), (5.32), (5.36), (5.38), and (5.39),

$$F(x)-F(v)=[f(x)+g(x)]-[f(v)+g(v)]$$
$$\ge f(x)+g(x)-[f(u)+\langle f'(u),v-u\rangle+2^{-1}L\|v-u\|^2+g(v)]$$
$$=[f(x)+g(x)]-[f(u)+\langle\xi,v-u\rangle+2^{-1}L\|v-u\|^2+g(v)]+\langle\xi-f'(u),v-u\rangle$$
$$\ge[f(x)+g(x)]-G^{(L)}_{u,\xi}(v)-\|\xi-f'(u)\|\,\|v-u\|$$
$$\ge[f(x)+g(x)]-G^{(L)}_{u,\xi}(v)-\delta(M_2+M_0+1). \tag{5.40}$$

It follows from (5.3) that

$$G^{(L)}_{u,\xi}(v)-G^{(L)}_{u,\xi}(\hat v)=\langle\xi,v-u\rangle+2^{-1}L\|v-u\|^2+g(v)-[\langle\xi,\hat v-u\rangle+2^{-1}L\|\hat v-u\|^2+g(\hat v)]$$
$$=\langle\xi,v-\hat v\rangle+2^{-1}L[\|v-u\|^2-\|\hat v-u\|^2]+g(v)-g(\hat v). \tag{5.41}$$

Relations (5.27) and (5.32) imply that

$$\|\xi\|\le\|f'(u)\|+1\le L+1. \tag{5.42}$$

In view of (5.35) and (5.42),

$$|\langle\xi,v-\hat v\rangle|\le(L+1)\delta. \tag{5.43}$$

By (5.31), (5.35), and (5.36),

$$|\|v-u\|^2-\|\hat v-u\|^2|\le|\|v-u\|-\|\hat v-u\||\,(\|v-u\|+\|\hat v-u\|)$$
$$\le\|v-\hat v\|(2M_2+2M_0+2)\le\delta(2M_2+2M_0+2). \tag{5.44}$$

In view of (5.30), (5.35), and (5.36),

$$|g(v)-g(\hat v)|\le L_0\|v-\hat v\|\le L_0\delta. \tag{5.45}$$

It follows from (5.41) and (5.43)–(5.45) that

$$|G^{(L)}_{u,\xi}(v)-G^{(L)}_{u,\xi}(\hat v)|\le(L+1)\delta+2^{-1}L\delta(2M_0+2M_2+2)+L_0\delta$$
$$\le\delta(L+1+L_0+L(M_0+M_2+1)). \tag{5.46}$$

Relations (5.40) and (5.46) imply that

$$F(x)-F(v)\ge f(x)+g(x)-G^{(L)}_{u,\xi}(v)-\delta(M_2+M_0+1)$$
$$\ge f(x)+g(x)-G^{(L)}_{u,\xi}(\hat v)-\delta(L_0+L+1+(L+1)(M_2+M_0+1)). \tag{5.47}$$

By the convexity of $f$, (5.31), (5.32), and (5.37),

$$f(x)\ge f(u)+\langle f'(u),x-u\rangle\ge f(u)+\langle\xi,x-u\rangle-|\langle f'(u)-\xi,x-u\rangle|$$
$$\ge f(u)+\langle\xi,x-u\rangle-\|f'(u)-\xi\|\,\|x-u\|\ge f(u)+\langle\xi,x-u\rangle-\delta(2M_0+2). \tag{5.48}$$

Lemma 5.2 and (5.34) imply that there exists

$$l\in\partial g(\hat v) \tag{5.49}$$

such that

$$\xi+L(\hat v-u)+l=0. \tag{5.50}$$

In view of (5.49) and the convexity of $g$,

$$g(x)\ge g(\hat v)+\langle l,x-\hat v\rangle. \tag{5.51}$$

It follows from (5.48) and (5.51) that

$$f(x)+g(x)\ge f(u)+\langle\xi,x-u\rangle-\delta(2M_0+2)+g(\hat v)+\langle l,x-\hat v\rangle. \tag{5.52}$$

In view of (5.3),

$$G^{(L)}_{u,\xi}(\hat v)=f(u)+\langle\xi,\hat v-u\rangle+2^{-1}L\|\hat v-u\|^2+g(\hat v). \tag{5.53}$$

By (5.50), (5.52), and (5.53),

$$f(x)+g(x)-G^{(L)}_{u,\xi}(\hat v)\ge\langle\xi,x-\hat v\rangle-2^{-1}L\|\hat v-u\|^2-\delta(2M_0+2)+\langle l,x-\hat v\rangle$$
$$=-\delta(2M_0+2)+\langle\xi+l,x-\hat v\rangle-2^{-1}L\|\hat v-u\|^2$$
$$=-2^{-1}L\|\hat v-u\|^2-L\langle\hat v-u,x-\hat v\rangle-\delta(2M_0+2)$$
$$=2^{-1}L\|\hat v-u\|^2+L\langle\hat v-u,u-x\rangle-\delta(2M_0+2). \tag{5.54}$$

In view of (5.35)–(5.37),

$$|\|\hat v-x\|^2-\|v-x\|^2|\le|\|\hat v-x\|-\|v-x\||\,(\|\hat v-x\|+\|v-x\|)$$
$$\le\|\hat v-v\|(2M_2+2M_0+2)\le\delta(2M_2+2M_0+2). \tag{5.55}$$

Lemma 2.1 implies that

$$\langle\hat v-u,u-x\rangle=2^{-1}[\|\hat v-x\|^2-\|\hat v-u\|^2-\|u-x\|^2]. \tag{5.56}$$

By (5.54) and (5.56),

$$f(x)+g(x)-G^{(L)}_{u,\xi}(\hat v)\ge 2^{-1}L\|\hat v-u\|^2+2^{-1}L\|\hat v-x\|^2-2^{-1}L\|\hat v-u\|^2-2^{-1}L\|u-x\|^2-\delta(2M_0+2)$$
$$=2^{-1}L\|\hat v-x\|^2-2^{-1}L\|u-x\|^2-\delta(2M_0+2)$$
$$\ge 2^{-1}L\|v-x\|^2-2^{-1}L\|u-x\|^2-2^{-1}L\delta(2M_2+2M_0+2)-\delta(2M_0+2). \tag{5.57}$$

It follows from (5.47) and (5.57) that

$$F(x)-F(v)\ge 2^{-1}L\|v-x\|^2-2^{-1}L\|u-x\|^2-\delta(L(M_2+M_0+1)+2M_0+2+L_0+(L+1)(M_2+M_0+2)).$$

Lemma 5.4 is proved.

5.4 Proof of Theorem 5.1

By (5.4), there exists

$$z\in\operatorname{argmin}(F)\cap B_X(0,M). \tag{5.58}$$

In view of (5.12) and (5.58),

$$\|x_0-z\|\le 2M. \tag{5.59}$$

If $F(x_0)\le F(z)+\epsilon_0$, then the assertion of the theorem holds. Let

$$F(x_0)>F(z)+\epsilon_0.$$

If $F(x_1)\le F(z)+\epsilon_0$, then in view of Lemma 5.3, $\|x_1\|\le M_2$ and the assertion of the
theorem holds. Let

$$F(x_1)>F(z)+\epsilon_0.$$

Assume that $T\ge 0$ is an integer and that for all integers $t=0,\dots,T$,

$$F(x_{t+1})-F(z)>\epsilon_0. \tag{5.60}$$

We show that for all $t\in\{0,\dots,T\}$,

$$\|x_t-z\|\le 2M \tag{5.61}$$

and

$$F(z)-F(x_{t+1})\ge 2^{-1}L\|z-x_{t+1}\|^2-2^{-1}L\|z-x_t\|^2-\delta((M_2+M_0+2)(2L+3)+L_0). \tag{5.62}$$

In view of (5.59), (5.61) is true for $t=0$.

Assume that $t\in\{0,\dots,T\}$ and (5.61) holds. Relations (5.58) and (5.61)
imply that

$$\|x_t\|\le 3M. \tag{5.63}$$

Set

$$M_0=3M \tag{5.64}$$

and

$$M_3=(M_2+M_0+2)(2L+3)+L_0. \tag{5.65}$$

By (5.5)–(5.9), (5.13), (5.14), (5.58), (5.63), (5.64), and Lemma 5.4 applied with

$$x=z,\quad u=x_t,\quad \xi=\xi_t,\quad v=x_{t+1},$$

we have

$$F(z)-F(x_{t+1})\ge 2^{-1}L\|z-x_{t+1}\|^2-2^{-1}L\|z-x_t\|^2-\delta M_3. \tag{5.66}$$

It follows from (5.60) and (5.66) that

$$\epsilon_0<F(x_{t+1})-F(z)\le 2^{-1}L\|z-x_t\|^2-2^{-1}L\|z-x_{t+1}\|^2+\delta M_3. \tag{5.67}$$

In view of (5.10), (5.65) and (5.67),

$$2M\ge\|z-x_t\|\ge\|z-x_{t+1}\|.$$

Thus we have shown by induction that (5.62) holds for all $t=0,\dots,T$ and
that (5.61) holds for all $t=0,\dots,T+1$.

By (5.60), (5.62) and (5.65),

$$T\epsilon_0<\sum_{t=0}^T(F(x_{t+1})-F(z))\le\sum_{t=0}^T[2^{-1}L\|z-x_t\|^2-2^{-1}L\|z-x_{t+1}\|^2]+T\delta M_3$$
$$\le 2^{-1}L\|z-x_0\|^2+T\delta M_3. \tag{5.68}$$

It follows from (5.10), (5.59), (5.65), and (5.68) that

$$T\epsilon_0/2\le 2^{-1}L(4M^2),$$

$$T\le 4LM^2\epsilon_0^{-1}<n_0+1.$$

Thus we have shown that if $T\ge 0$ is an integer and (5.60) holds for all $t=0,\dots,T$,
then (5.61) holds for all $t=0,\dots,T+1$ and $T<n_0+1$. This implies that there
exists an integer $q\in\{1,\dots,n_0+2\}$ such that

$$\|x_i\|\le 3M,\quad i=0,\dots,q,$$
$$F(x_q)\le F(z)+\epsilon_0.$$

Theorem 5.1 is proved.


Chapter 6
Weiszfeld’s Method

In this chapter we analyze the behavior of Weiszfeld’s method for solving the
Fermat–Weber location problem. We show that the algorithm generates a good
approximate solution, if computational errors are bounded from above by a small
positive constant. Moreover, for a known computational error, we find out what an
approximate solution can be obtained and how many iterates one needs for this.

6.1 The Description of the Problem

Let $X$ be a Hilbert space equipped with an inner product $\langle\cdot,\cdot\rangle$ which induces a
complete norm $\|\cdot\|$.

If $x\in X$ and $h$ is a real-valued function defined in a neighborhood of $x$ which is
Fréchet differentiable at $x$, then its Fréchet derivative at $x$ is denoted by $h'(x)\in X$.
For each $x\in X$ and each $r>0$ set

$$B_X(x,r)=\{y\in X:\ \|x-y\|\le r\}.$$

Let $a\in X$. The function $g(x)=\|x-a\|$, $x\in X$, is convex. For every $x\in X\setminus\{a\}$,
$g$ is Fréchet differentiable at $x$ and

$$g'(x)=\|x-a\|^{-1}(x-a).$$

It is easy to see that

$$\partial g(a)=B_X(0,1).$$

Recall that the definition of the subdifferential is given in Sect. 2.1.


In this chapter we assume that $X$ is the finite-dimensional Euclidean space $R^n$;
for each pair $x=(x_1,\dots,x_n)$, $y=(y_1,\dots,y_n)\in R^n$,

$$\langle x,y\rangle=\sum_{i=1}^nx_iy_i,$$

$m$ is a natural number,

$$\omega_i>0,\quad i=1,\dots,m,$$

and

$$A=\{a_i\in R^n:\ i=1,\dots,m\},$$

where $a_{i_1}\ne a_{i_2}$ for all $i_1,i_2\in\{1,\dots,m\}$ satisfying $i_1\ne i_2$.

Set

$$f(x)=\sum_{i=1}^m\omega_i\|x-a_i\|\ \text{ for all }x\in R^n \tag{6.1}$$

and

$$\inf(f)=\inf\{f(z):\ z\in R^n\}.$$

We say that the vectors $a_1,\dots,a_m$ are collinear if there exist $y,b\in R^n$ and
$t_1,\dots,t_m\in R^1$ such that $a_i=y+t_ib$, $i=1,\dots,m$. We suppose that the vectors
$a_1,\dots,a_m$ are not collinear.

In this chapter we study the Fermat–Weber location problem

$$f(x)\to\min,\quad x\in R^n,$$

by using Weiszfeld's method [110], which was recently revisited in [18]. This
problem is often called the Fermat–Torricelli problem, named after the mathematicians
who originally formulated (Fermat) and solved (Torricelli) it in the case of three points;
Weber (as well as Steiner) considered its extension to finitely many points. For a
full treatment of this problem, with a modified proof of the Weiszfeld algorithm
using the subdifferential theory of convex analysis (as well as generalized versions
of the Fermat–Torricelli and related problems) with no presence of computational
errors, see [86].

Since the function $f$ is continuous and satisfies a growth condition, this problem
has a solution, which is denoted by $x^*\in R^n$. Thus

$$f(x^*)=\inf(f). \tag{6.2}$$


In view of Theorem 2.1 of [18] this solution is unique, but in our study we do not
use this fact.

If $x^*\notin A$, then

$$f'(x^*)=\sum_{i=1}^m\omega_i\|x^*-a_i\|^{-1}(x^*-a_i)=0. \tag{6.3}$$

If $x^*=a_i$ for some $i\in\{1,\dots,m\}$, then

$$\Big\|\sum_{j=1,\,j\ne i}^m\omega_j\|x^*-a_j\|^{-1}(x^*-a_j)\Big\|\le\omega_i. \tag{6.4}$$

6.2 Preliminaries

For each $x\in R^n\setminus A$ set

$$T(x)=\Big(\sum_{i=1}^m\omega_i\|x-a_i\|^{-1}\Big)^{-1}\sum_{i=1}^m\omega_i\|x-a_i\|^{-1}a_i. \tag{6.5}$$

Let $y\in R^n\setminus A$ satisfy

$$T(y)=y.$$

This equality is equivalent to the relation

$$y=\Big(\sum_{i=1}^m\omega_i\|y-a_i\|^{-1}\Big)^{-1}\sum_{i=1}^m\omega_i\|y-a_i\|^{-1}a_i,$$

which in its turn is equivalent to the equality

$$\sum_{i=1}^m\omega_i\|y-a_i\|^{-1}(y-a_i)=0.$$

It is easy to see that the last equality is equivalent to the relation

$$f'(y)=0.$$

Thus for every $y\in R^n\setminus A$,

$$T(y)=y\ \text{ if and only if }\ f'(y)=0. \tag{6.6}$$
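By (6.6) the minimizers of $f$ outside the set $A$ are exactly the fixed points of $T$, which suggests the iteration $x_{i+1}=T(x_i)$ analyzed in this chapter. A minimal exact-arithmetic sketch (so the error $\delta$ of (6.59) below is zero); the anchor points and weights used to check it are illustrative, not from the book:

```python
import numpy as np

def weiszfeld_step(y, anchors, weights):
    """The operator T of (6.5): a weighted average of the anchors a_i with
    coefficients w_i / ||y - a_i|| (y must not coincide with any anchor)."""
    inv_dist = weights / np.linalg.norm(anchors - y, axis=1)
    return (inv_dist[:, None] * anchors).sum(axis=0) / inv_dist.sum()

def fermat_weber(anchors, weights, y0, n_iter=300):
    # Weiszfeld's method: repeated application of T starting from y0
    y = y0
    for _ in range(n_iter):
        y = weiszfeld_step(y, anchors, weights)
    return y
```

For an equilateral triangle with equal weights the Fermat–Torricelli point coincides with the centroid, which gives a convenient correctness check.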


88 6 Weiszfeld’s Method

For each x 2 Rn and every y 2 Rn n A set

X
m
h.x; y/ D !i ky  ai k1 kx  ai k2 : (6.7)
iD1

Let y 2 Rn n A and consider the function

s D h.; y/ W Rn ! R1

which is strictly convex and possesses a unique minimizer x satisfying the relation

X
m
0 D s0 .x/ D 2 !i ky  ai k1 .x  ai /
iD1

which is equivalent to the equality

T.y/ D x:

This implies that

h.T.y/; y/  h.x; y/ for all x 2 Rn : (6.8)

Lemma 6.1 ([18]).


(i) For every y 2 Rn n A,

h.T.y/; y/  h.x; y/ for all x 2 Rn :

(ii) For every y 2 Rn n A,

h.y; y/ D f .y/:

(iii) For every x 2 Rn and every y 2 Rn n A,

h.x; y/  2f .x/  f .y/:

Proof. Assertion (i) was already proved [see (6.8)]. Assertion (ii) is evident. Let us
prove assertion (iii).
Let x 2 Rn and y 2 Rn n A. Clearly, for each a 2 R1 and each b > 0,

a2 b1  2a  b:
6.2 Preliminaries 89

Therefore for all i D 1; : : : ; m,

kx  ai k2 ky  ai k1  2kx  ai k  ky  ai k:

This implies that

X
m
h.x; y/ D !i ky  ai k1 kx  ai k2
iD1

X
m X
m
2 !i kx  ai k  !i ky  ai k D 2f .x/  f .y/:
iD1 iD1

Lemma 6.1 is proved.


Lemma 6.2 ([18]). For every y 2 Rn n A,

f .T.y//  f .y/

and the equality holds if and only if T.y/ D y.


Proof. Let y 2 Rn n A. In view of Lemma 6.1 (i), (6.8) holds. By the strict convexity
of the function x ! h.x; y/, x 2 Rn , T.y/ is its unique minimizer, by Lemma 6.2 (ii),

h.T.y/; y/  h.y; y/ D f .y/

and if T.y/ 6D y, then this implies that

h.T.y/; y/ < h.y; y/ D f .y/:

Together with Lemma 6.1 (iii) this implies that

2f .T.y//  f .y/  h.T.y/; y/  f .y/

and if T.y/ 6D y, then

2f .T.y//  f .y/  h.T.y/; y/ < h.y; y/ D f .y/:

This completes the proof of Lemma 6.2.


For every x 2 Rn n A set

X
m
L.x/ D !i kx  ai k1 : (6.9)
iD1
90 6 Weiszfeld’s Method

For j D 1; : : : ; m set

X
m
L.aj / D !i kaj  ai k1 : (6.10)
iD1;i6Dj

Clearly, for each x 2 Rn n A,

T.x/ D x  L.x/1 f 0 .x/: (6.11)

Lemma 6.3. Let y 2 Rn n A. Then

f .T.y//  f .y/ C hf 0 .y/; T.y/  yi C 21 L.y/kT.y/  yk2 :

Proof. Clearly, the function x ! h.x; y/, x 2 Rn is quadratic. Therefore its second-
order Taylor expansion around y is exact and can be written as

h.x; y/ D h.y; y/ C h@x h.y; y/; x  yi C L.y/kx  yk2 :

Combined with the relations

h.y; y/ D f .y/ and @x h.y; y/ D 2f 0 .y/

this implies that

h.x; y/ D f .y/ C 2hf 0 .y/; x  yi C L.y/kx  yk2 :

For x D T.y/ the relation above implies that

h.T.y/; y/ D f .y/ C 2hf 0 .y/; T.y/  yi C L.y/kT.y/  yk2 :

Together with Lemma 6.1 (iii) this implies that

2f .T.y//  f .y/  h.T.y/; y/


D f .y/ C 2hf 0 .y/; T.y/  yi C L.y/kT.y/  yk2 ;
2f .T.y//  2f .y/ C 2hf 0 .y/; T.y/  yi C L.y/kT.y/  yk2

and

f .T.y//  f .y/ C hf 0 .y/; T.y/  yi C 21 L.y/kT.y/  yk2 :

Lemma 6.3 is proved.



6.3 The Basic Lemma

Set

$$\tilde M=\max\{\|a_i\|:\ i=1,\dots,m\}. \tag{6.12}$$

By (6.1) and (6.12),

$$f(0)\le\sum_{i=1}^m\omega_i\tilde M. \tag{6.13}$$

We show that $\|x^*\|\le 2\tilde M$. Since $x^*$ is the minimizer of $f$, it follows from (6.1)
and (6.13) that

$$\sum_{i=1}^m\omega_i\tilde M\ge f(0)\ge f(x^*)=\sum_{i=1}^m\omega_i\|x^*-a_i\|. \tag{6.14}$$

There exists $j_0\in\{1,\dots,m\}$ such that

$$\|x^*-a_{j_0}\|\le\|x^*-a_i\|,\quad i=1,\dots,m. \tag{6.15}$$

By (6.12), (6.14), and (6.15),

$$\sum_{i=1}^m\omega_i\tilde M\ge\sum_{i=1}^m\omega_i\|x^*-a_{j_0}\|,$$
$$\|x^*-a_{j_0}\|\le\tilde M$$

and

$$\|x^*\|\le 2\tilde M. \tag{6.16}$$

Lemma 6.4. Let $M\ge\tilde M$ and let $y\in R^n\setminus A$ satisfy

$$\|y\|\le M. \tag{6.17}$$

Then

$$\|T(y)\|\le 3M.$$

Proof. In view of Lemma 6.2,

$$f(T(y))\le f(y). \tag{6.18}$$

It follows from (6.1), (6.12), and (6.17) that

$$f(y)=\sum_{i=1}^m\omega_i\|y-a_i\|\le 2M\sum_{i=1}^m\omega_i. \tag{6.19}$$

By (6.1), (6.18), and (6.19),

$$\sum_{i=1}^m\omega_i\|T(y)-a_i\|=f(T(y))\le f(y)\le 2M\sum_{i=1}^m\omega_i. \tag{6.20}$$

There exists $j\in\{1,\dots,m\}$ such that

$$\|T(y)-a_j\|\le\|T(y)-a_i\|,\quad i=1,\dots,m. \tag{6.21}$$

Relations (6.20) and (6.21) imply that

$$\sum_{i=1}^m\omega_i\|T(y)-a_j\|\le 2M\sum_{i=1}^m\omega_i,$$
$$\|T(y)-a_j\|\le 2M.$$

Together with (6.12) this implies that $\|T(y)\|\le 3M$. Lemma 6.4 is proved.
Lemma 6.5 (The Basic Lemma). Let $M\ge\tilde M$, $\delta\in(0,1]$, let $y\in R^n\setminus A$ satisfy

$$\|y\|\le M, \tag{6.22}$$

let $x\in R^n$ satisfy

$$\|T(y)-x\|\le\delta \tag{6.23}$$

and let $z\in R^n$ satisfy

$$\|z\|\le M. \tag{6.24}$$

Then

$$f(x)\le f(z)+2^{-1}L(y)(\|z-y\|^2-\|z-x\|^2)+\delta\Big(8M+1+\sum_{i=1}^m\omega_i\Big).$$

Proof. Relations (6.1) and (6.23) imply that

$$|f(x)-f(T(y))|\le\|x-T(y)\|\sum_{i=1}^m\omega_i\le\delta\sum_{i=1}^m\omega_i. \tag{6.25}$$

In view of Lemma 6.3,

$$f(T(y))\le f(y)+\langle f'(y),T(y)-y\rangle+2^{-1}L(y)\|T(y)-y\|^2. \tag{6.26}$$

Since the function $f$ is convex we have

$$f(y)\le f(z)+\langle f'(y),y-z\rangle. \tag{6.27}$$

By (6.26) and (6.27),

$$f(T(y))\le f(z)+\langle f'(y),y-z\rangle+\langle f'(y),T(y)-y\rangle+2^{-1}L(y)\|T(y)-y\|^2$$
$$=f(z)+\langle f'(y),T(y)-z\rangle+2^{-1}L(y)\|T(y)-y\|^2. \tag{6.28}$$

It follows from (6.25) and (6.28) that

$$f(x)\le f(z)+\langle f'(y),T(y)-z\rangle+2^{-1}L(y)\|T(y)-y\|^2+\delta\sum_{i=1}^m\omega_i. \tag{6.29}$$

In view of (6.11),

$$f'(y)=L(y)(y-T(y)). \tag{6.30}$$

By (6.29) and (6.30),

$$f(x)\le f(z)+L(y)\langle y-T(y),T(y)-z\rangle+2^{-1}L(y)\|T(y)-y\|^2+\delta\sum_{i=1}^m\omega_i. \tag{6.31}$$

Lemma 2.1 implies that

$$\langle y-T(y),T(y)-z\rangle=2^{-1}[\|z-y\|^2-\|z-T(y)\|^2-\|y-T(y)\|^2]. \tag{6.32}$$

It follows from (6.31) and (6.32) that

$$f(x)\le f(z)+2^{-1}L(y)(\|z-y\|^2-\|z-T(y)\|^2)+\delta\sum_{i=1}^m\omega_i. \tag{6.33}$$

Lemma 6.4 and (6.22) imply that

$$\|T(y)\|\le 3M. \tag{6.34}$$

By (6.23), (6.24), and (6.34),

$$|\|z-T(y)\|^2-\|z-x\|^2|=|\|z-T(y)\|-\|z-x\||\,(\|z-T(y)\|+\|z-x\|)$$
$$\le\|x-T(y)\|(8M+1)\le\delta(8M+1). \tag{6.35}$$

It follows from (6.33) and (6.35) that

$$f(x)\le f(z)+2^{-1}L(y)(\|z-y\|^2-\|z-x\|^2)+\delta\Big(8M+1+\sum_{i=1}^m\omega_i\Big).$$

Lemma 6.5 is proved.

6.4 The Main Result

Let $\delta\in(0,1]$ and let $\delta_0\le\delta$ be a positive number. Choose $p\in\{1,\dots,m\}$ such that

$$f(a_p)\le f(a_i),\quad i=1,\dots,m. \tag{6.36}$$

For each $j=1,\dots,m$ set

$$r_j=\sum_{i=1,\,i\ne j}^m\omega_i\|a_i-a_j\|^{-1}(a_j-a_i). \tag{6.37}$$

In order to solve our minimization problem we need to calculate

$$r_p=\sum_{i=1,\,i\ne p}^m\omega_i\|a_i-a_p\|^{-1}(a_p-a_i).$$

Since our computer system produces computational errors we can obtain only a
vector $\hat r_p\in R^n$ such that $\|\hat r_p-r_p\|\le\delta_0$.

Proposition 6.6. Assume that $\hat r_p\in R^n$ satisfies

$$\|\hat r_p-r_p\|\le\delta_0 \tag{6.38}$$

and

$$\|\hat r_p\|\le\omega_p+2\delta_0. \tag{6.39}$$

Then

$$f(a_p)\le\inf(f)+9\tilde M\delta_0.$$

(Note that $\tilde M$ was defined by (6.12).)

Proof. By (6.38) and (6.39),

$$\|r_p\|\le\|\hat r_p\|+\delta_0\le\omega_p+3\delta_0. \tag{6.40}$$

It follows from (6.1), (6.37), and (6.40) that there exists

$$l\in\partial f(a_p) \tag{6.41}$$

such that

$$\|l\|\le 3\delta_0. \tag{6.42}$$

By the convexity of $f$, (6.2), (6.12), (6.16), (6.41), and (6.42),

$$f(x^*)\ge f(a_p)+\langle l,x^*-a_p\rangle\ge f(a_p)-\|l\|(\|x^*\|+\|a_p\|)\ge f(a_p)-9\tilde M\delta_0$$

and

$$f(a_p)\le f(x^*)+9\tilde M\delta_0.$$

Proposition 6.6 is proved.

In view of Proposition 6.6, if $\hat r_p$ satisfies (6.38) and (6.39), then $a_p$ is an
approximate solution of our minimization problem. It is easy to see that the
following proposition holds.

Proposition 6.7. Assume that $\hat r_p\in R^n$ satisfies $\|\hat r_p-r_p\|\le\delta_0$ and $\|\hat r_p\|>\omega_p+2\delta_0$.
Then $\|r_p\|>\omega_p+\delta_0$ and $a_p$ is not a minimizer of $f$.

Lemma 6.8 ([18]). Let $\|r_p\|>\omega_p$. Then

$$f(a_p)-f(a_p-(\|r_p\|-\omega_p)L(a_p)^{-1}\|r_p\|^{-1}r_p)\ge(\|r_p\|-\omega_p)^2(2L(a_p))^{-1}.$$

Proposition 6.9. Assume that

$$\|r_p\|>\omega_p, \tag{6.43}$$

$d_p\in R^n$ satisfies

$$\|d_p+\|r_p\|^{-1}r_p\|\le\delta, \tag{6.44}$$

$t_p\ge 0$ satisfies

$$|t_p-L(a_p)^{-1}(\|r_p\|-\omega_p)|\le\delta \tag{6.45}$$

and that $x_0\in R^n$ satisfies

$$\|x_0-a_p-t_pd_p\|\le\delta. \tag{6.46}$$

Then

$$\|d_p\|\le 1+\delta, \tag{6.47}$$
$$t_p\le L(a_p)^{-1}(\|r_p\|-\omega_p)+\delta, \tag{6.48}$$
$$\|x_0\|\le\tilde M+2(2\delta+L(a_p)^{-1}(\|r_p\|-\omega_p)), \tag{6.49}$$
$$\|x_0-a_p+(\|r_p\|-\omega_p)L(a_p)^{-1}\|r_p\|^{-1}r_p\|\le\delta(3+(\|r_p\|-\omega_p)L(a_p)^{-1}), \tag{6.50}$$
$$|f(x_0)-f(a_p-(\|r_p\|-\omega_p)L(a_p)^{-1}\|r_p\|^{-1}r_p)|\le\delta(3+(\|r_p\|-\omega_p)L(a_p)^{-1})\sum_{i=1}^m\omega_i \tag{6.51}$$

and

$$f(a_p)-f(x_0)\ge(\|r_p\|-\omega_p)^2(2L(a_p))^{-1}-\delta(3+(\|r_p\|-\omega_p)L(a_p)^{-1})\sum_{i=1}^m\omega_i. \tag{6.52}$$

Proof. In view of (6.44), (6.47) is true. Inequality (6.45) implies (6.48). By (6.12)
and (6.46)–(6.48),

$$\|x_0\|\le\delta+\|a_p\|+t_p\|d_p\|\le\delta+\tilde M+2t_p\le\tilde M+2(2\delta+L(a_p)^{-1}(\|r_p\|-\omega_p))$$

and (6.49) is true. It follows from (6.44)–(6.47) that

$$\|x_0-a_p+(\|r_p\|-\omega_p)L(a_p)^{-1}\|r_p\|^{-1}r_p\|\le\delta+\|t_pd_p+(\|r_p\|-\omega_p)L(a_p)^{-1}\|r_p\|^{-1}r_p\|$$
$$\le\delta+\|t_pd_p-(\|r_p\|-\omega_p)L(a_p)^{-1}d_p\|+\|(\|r_p\|-\omega_p)L(a_p)^{-1}(d_p+\|r_p\|^{-1}r_p)\|$$
$$\le\delta+\delta(1+\delta)+(\|r_p\|-\omega_p)L(a_p)^{-1}\delta\le\delta(3+(\|r_p\|-\omega_p)L(a_p)^{-1})$$

and (6.50) holds. Relations (6.1) and (6.50) imply (6.51). Relation (6.52) follows
from (6.51) and Lemma 6.8. Proposition 6.9 is proved.

The next theorem, which is proved in Sect. 6.5, is our main result.

Theorem 6.10. Let

$$\|r_p\|>\omega_p, \tag{6.53}$$

$$M_0=3\tilde M+4+2(\|r_p\|-\omega_p)L(a_p)^{-1}, \tag{6.54}$$

let a positive number $\delta$ satisfy

$$\delta<12^{-1}(\|r_p\|-\omega_p)\Big(\sum_{i=1}^m\omega_i\Big)^{-1} \tag{6.55}$$

and

$$2\delta\Big(8M_0+1+\sum_{i=1}^m\omega_i\Big)<(\|r_p\|-\omega_p)^2(16L(a_p))^{-1}, \tag{6.56}$$

let

$$\epsilon_0=4\delta\Big(16M_0+1+\sum_{i=1}^m\omega_i\Big)\Big[144L(a_p)^2(\|r_p\|-\omega_p)^{-4}M_0^2+1\Big]\Big(\Big(\sum_{i=1}^m\omega_i\Big)^2+1\Big) \tag{6.57}$$

and

$$n_0=\Big\lfloor\delta^{-1}\Big(8M_0+1+\sum_{i=1}^m\omega_i\Big)^{-1}(\|r_p\|-\omega_p)^2(8L(a_p))^{-1}\Big\rfloor+1. \tag{6.58}$$

Assume that $t_p\ge 0$, $d_p\in R^n$ and $x_0\in R^n$ satisfy (6.44)–(6.46), that $\{x_i\}_{i=1}^\infty\subset R^n$ and
that for each integer $i\ge 0$ satisfying $x_i\notin A$,

$$\|T(x_i)-x_{i+1}\|\le\delta. \tag{6.59}$$

Then

$$x_0\notin A$$

and there exists $j\in\{0,\dots,n_0\}$ such that

$$x_i\notin A,\quad i\in\{0,\dots,j\}\setminus\{j\},$$
$$f(x_j)\le\inf(f)+\epsilon_0.$$

Note that in the theorem above $\delta$ is the computational error produced by our
computer system. In order to obtain a good approximate solution we need $\lfloor c_1\delta^{-1}\rfloor$
iterations [see (6.58)], where $c_1>0$ is a constant depending only on $M_0$, $\sum_{i=1}^m\omega_i$,
$\|r_p\|-\omega_p$ and $L(a_p)$. As a result, we obtain a point $x\in R^n$ such that

$$f(x)\le\inf(f)+c_2\delta$$

[see (6.57)], where the constant $c_2>0$ depends only on $M_0$, $\sum_{i=1}^m\omega_i$, $\|r_p\|-\omega_p$ and
$L(a_p)$.

6.5 Proof of Theorem 6.10

Proposition 6.9, (6.44)–(6.46), (6.55), and (6.56) imply that

$$f(x_0)\le f(a_p)-(\|r_p\|-\omega_p)^2(2L(a_p))^{-1}+\delta(3+(\|r_p\|-\omega_p)L(a_p)^{-1})\sum_{i=1}^m\omega_i$$
$$\le f(a_p)-(\|r_p\|-\omega_p)^2(4L(a_p))^{-1}. \tag{6.60}$$

By (6.36) and (6.60),

$$x_0\notin A. \tag{6.61}$$

If

$$f(x_0)\le\inf(f)+\epsilon_0\ \text{ or }\ f(x_1)\le\inf(f)+\epsilon_0,$$

then in view of (6.61) the assertion of the theorem holds with $j=0$ or $j=1$,
respectively. Consider the case with

$$f(x_0)>\inf(f)+\epsilon_0\ \text{ and }\ f(x_1)>\inf(f)+\epsilon_0. \tag{6.62}$$

Assume that $k\in[0,n_0]$ is an integer,

$$x_i\notin A,\quad i=0,\dots,k \tag{6.63}$$

and

$$f(x_i)>\inf(f)+\epsilon_0,\quad i=0,\dots,k+1. \tag{6.64}$$

(Note that in view of (6.61) and (6.62), relations (6.63) and (6.64) hold for $k=0$.)

For all integers $i\ge 0$, set

$$\gamma_i=i\delta\Big(8M_0+1+\sum_{j=1}^m\omega_j\Big). \tag{6.65}$$

By (6.56), (6.58), (6.60), and (6.64), for all $i=0,\dots,n_0$,

$$\gamma_i\le n_0\delta\Big(8M_0+1+\sum_{j=1}^m\omega_j\Big)\le\delta\Big(8M_0+1+\sum_{j=1}^m\omega_j\Big)+(\|r_p\|-\omega_p)^2(8L(a_p))^{-1}$$
$$\le(\|r_p\|-\omega_p)^2(8L(a_p))^{-1}+(\|r_p\|-\omega_p)^2(16L(a_p))^{-1}\le 3\cdot 4^{-1}(f(a_p)-f(x_0)). \tag{6.66}$$

Recall (see Sect. 6.1) that $x^*\in R^n$ satisfies

$$f(x^*)=\inf(f). \tag{6.67}$$

We show that for all $j=0,\dots,k+1$,

$$f(x_j)\le f(x_0)+\gamma_j, \tag{6.68}$$
$$\|x_j-x^*\|\le M_0. \tag{6.69}$$

In view of (6.16),

$$\|x^*\|\le 2\tilde M. \tag{6.70}$$

Proposition 6.9, (6.44)–(6.46), and (6.49) imply that

$$\|x_0\|\le\tilde M+2(2+L(a_p)^{-1}(\|r_p\|-\omega_p)). \tag{6.71}$$

By (6.54), (6.70), and (6.71),

$$\|x_0-x^*\|\le M_0.$$

Thus (6.68) and (6.69) hold for $j=0$.

Assume that an integer $j\in\{0,\dots,k\}$ and (6.68) and (6.69) hold.
By (6.36), (6.60), (6.66), (6.68), and the relation $k\le n_0$,

$$f(x_j)\le f(x_0)+\gamma_{n_0}\le f(x_0)+3\cdot 4^{-1}(f(a_p)-f(x_0))<f(a_p)$$

and

$$x_j\notin A. \tag{6.72}$$

Let $i\in\{1,\dots,m\}$ and

$$v_i\in\partial f(a_i). \tag{6.73}$$

In view of (6.1),

$$\partial f(a_i)=\sum_{q=1,\,q\ne i}^m\omega_q\|a_i-a_q\|^{-1}(a_i-a_q)+\omega_iB_{R^n}(0,1). \tag{6.74}$$

Relations (6.73) and (6.74) imply that

$$\|v_i\|\le\sum_{q=1,\,q\ne i}^m\omega_q+\omega_i=\sum_{q=1}^m\omega_q. \tag{6.75}$$

It follows from (6.68), (6.73), and (6.75) that

$$f(a_i)-f(x_0)\le f(a_i)-f(x_j)+\gamma_j\le\langle v_i,a_i-x_j\rangle+\gamma_j$$
$$\le\|v_i\|\,\|a_i-x_j\|+\gamma_j\le\|a_i-x_j\|\sum_{q=1}^m\omega_q+\gamma_j. \tag{6.76}$$

By (6.36) and (6.76),

$$f(a_p)-f(x_0)\le\|a_i-x_j\|\sum_{q=1}^m\omega_q+\gamma_j,\quad i=1,\dots,m. \tag{6.77}$$

In view of (6.66) and (6.77), for all $i=1,\dots,m$,

$$\|a_i-x_j\|\ge 4^{-1}(f(a_p)-f(x_0))\Big(\sum_{q=1}^m\omega_q\Big)^{-1}$$

and

$$\|a_i-x_j\|^{-1}\le 4(f(a_p)-f(x_0))^{-1}\sum_{q=1}^m\omega_q. \tag{6.78}$$

It follows from (6.9), (6.72), and (6.78) that

$$L(x_j)=\sum_{i=1}^m\omega_i\|x_j-a_i\|^{-1}\le 4\Big(\sum_{i=1}^m\omega_i\Big)^2(f(a_p)-f(x_0))^{-1}. \tag{6.79}$$

Lemma 6.2, (6.1), (6.59), (6.64), (6.68), and (6.72) imply that

$$f(x_{j+1})\le f(T(x_j))+\|x_{j+1}-T(x_j)\|\sum_{i=1}^m\omega_i\le f(x_j)+\delta\sum_{i=1}^m\omega_i\le f(x_0)+\gamma_{j+1}. \tag{6.80}$$

It follows from (6.54), (6.59), (6.64), (6.67), (6.69), (6.70), (6.72), (6.79), and
Lemma 6.5 applied with $M=2M_0$, $z=x^*$, $y=x_j$, and $x=x_{j+1}$ that

$$0<f(x_{j+1})-f(x^*)\le 2\Big(\sum_{i=1}^m\omega_i\Big)^2(f(a_p)-f(x_0))^{-1}(\|x^*-x_j\|^2-\|x^*-x_{j+1}\|^2)$$
$$+\delta\Big(16M_0+1+\sum_{i=1}^m\omega_i\Big). \tag{6.81}$$

By (6.57), (6.60), and (6.81),

$$\|x^*-x_j\|\ge\|x^*-x_{j+1}\|.$$

Therefore, in view of the relation above, (6.80) and (6.81), we have shown by induction
that (6.68) and (6.69) hold for $j=0,\dots,k+1$ and that (6.81) holds for $j=0,\dots,k$.
It follows from (6.58), (6.60), (6.64), (6.68) and the relation $k\le n_0$ that

$$f(x_{k+1})\le f(x_0)+\gamma_{k+1}\le f(x_0)+(n_0+1)\delta\Big(8M_0+1+\sum_{i=1}^m\omega_i\Big)$$
$$<f(x_0)+(\|r_p\|-\omega_p)^2(8L(a_p))^{-1}+2\delta\Big(8M_0+1+\sum_{i=1}^m\omega_i\Big)<f(a_p)$$

and

$$x_{k+1}\notin A. \tag{6.82}$$

By (6.81), which holds for all $j=0,\dots,k$,

$$(k+1)\epsilon_0<\sum_{j=0}^k(f(x_{j+1})-f(x^*))$$
$$\le 2\Big(\sum_{i=1}^m\omega_i\Big)^2(f(a_p)-f(x_0))^{-1}\sum_{j=0}^k(\|x^*-x_j\|^2-\|x^*-x_{j+1}\|^2)+(k+1)\delta\Big(16M_0+1+\sum_{i=1}^m\omega_i\Big).$$

Together with (6.57), (6.60), and (6.69) this implies that

$$2^{-1}(k+1)\epsilon_0\le 2\Big(\sum_{i=1}^m\omega_i\Big)^2\,4L(a_p)(\|r_p\|-\omega_p)^{-2}\|x_0-x^*\|^2$$
$$\le 8\Big(\sum_{i=1}^m\omega_i\Big)^2L(a_p)(\|r_p\|-\omega_p)^{-2}M_0^2.$$

Combined with (6.56) and (6.57) this implies that

$$k+1\le 16\Big(\sum_{i=1}^m\omega_i\Big)^2L(a_p)(\|r_p\|-\omega_p)^{-2}M_0^2\epsilon_0^{-1}$$
$$\le 16\Big(\sum_{i=1}^m\omega_i\Big)^2L(a_p)(\|r_p\|-\omega_p)^{-2}M_0^2\,(4\delta)^{-1}\Big(16M_0+1+\sum_{i=1}^m\omega_i\Big)^{-1}L(a_p)^{-2}(\|r_p\|-\omega_p)^4M_0^{-2}\,144^{-1}\Big(\sum_{i=1}^m\omega_i\Big)^{-2}$$
$$=36^{-1}L(a_p)^{-1}(\|r_p\|-\omega_p)^2\delta^{-1}\Big(16M_0+1+\sum_{i=1}^m\omega_i\Big)^{-1}\le 2^{-1}n_0.$$

Thus we assumed that an integer $k\in[0,n_0]$ satisfies (6.63) and (6.64) and showed
that

$$x_{k+1}\notin A$$

[see (6.82)] and that $k+1\le 2^{-1}n_0$. (Note that in view of (6.56) and (6.58), $n_0\ge 5$.)
This implies that there exists an integer $k\in[0,n_0/2]$ such that (6.63), (6.64) hold
and

$$f(x_{k+1})\le\inf(f)+\epsilon_0.$$

Theorem 6.10 is proved.


Chapter 7
The Extragradient Method for Convex
Optimization

In this chapter we study the convergence of the extragradient method for constrained
convex minimization problems in a Hilbert space. Our goal is to obtain an
$\epsilon$-approximate solution of the problem in the presence of computational errors,
where $\epsilon$ is a given positive number. We show that the extragradient method generates
a good approximate solution, if the sequence of computational errors is bounded
from above by a constant.

7.1 Preliminaries and the Main Results

Let $(X,\langle\cdot,\cdot\rangle)$ be a Hilbert space with an inner product $\langle\cdot,\cdot\rangle$ which induces a complete
norm $\|\cdot\|$.

For each $x\in X$ and each nonempty set $A\subset X$ put

$$d(x,A)=\inf\{\|x-y\|:\ y\in A\}.$$

For each $x\in X$ and each $r>0$ set

$$B(x,r)=\{y\in X:\ \|x-y\|\le r\}.$$

Let $C$ be a nonempty closed convex subset of $X$.

Assume that $f:X\to R^1$ is a convex continuous function which is bounded from
below on $C$. Recall that for each $x\in X$,

$$\partial f(x)=\{u\in X:\ f(y)-f(x)\ge\langle u,y-x\rangle\ \text{ for all }y\in X\}.$$


We consider the minimization problem

$$f(x)\to\min,\quad x\in C.$$

Denote

$$C_{\min}=\{x\in C:\ f(x)\le f(y)\ \text{ for all }y\in C\}. \tag{7.1}$$

We assume that $C_{\min}\ne\emptyset$.

We suppose that $f$ is Gâteaux differentiable at any point $x\in X$ and for $x\in X$ denote
by $f'(x)\in X$ the Gâteaux derivative of $f$ at $x$. This implies that for any $x\in X$ and
any $h\in X$

$$\langle f'(x),h\rangle=\lim_{t\to 0}t^{-1}[f(x+th)-f(x)]. \tag{7.2}$$

We suppose that the mapping $f':X\to X$ is Lipschitz on all the bounded subsets
of $X$.

Set

$$\inf(f;C)=\inf\{f(z):\ z\in C\}. \tag{7.3}$$

We study the minimization problem with the objective function $f$, over the set $C$,
using the extragradient method introduced in Korpelevich [75]. By Lemma 2.2, for
each nonempty closed convex set $D\subset X$ and for each $x\in X$, there is a unique point
$P_D(x)\in D$ satisfying

$$\|x-P_D(x)\|=\inf\{\|x-y\|:\ y\in D\},$$
$$\|P_D(x)-P_D(y)\|\le\|x-y\|\ \text{ for all }x,y\in X$$

and

$$\langle z-P_D(x),x-P_D(x)\rangle\le 0$$

for each $x\in X$ and each $z\in D$.

The following theorem is our first main result of this chapter.

Theorem 7.1. Let $M_0>0$, $M_1>0$, $L>0$, $\epsilon\in(0,1)$,

$$B(0,M_0)\cap C_{\min}\ne\emptyset, \tag{7.4}$$

$$f'(B(0,3M_0))\subset B(0,M_1), \tag{7.5}$$

$$\|f'(z_1)-f'(z_2)\|\le L\|z_1-z_2\|\ \text{ for all }z_1,z_2\in B(0,3M_0+M_1+1), \tag{7.6}$$

$$0<\alpha_*<\alpha^*\le 1,\quad \alpha^*L\le 1, \tag{7.7}$$

let an integer $k$ satisfy

$$k>4M_0^2\epsilon^{-1}\alpha_*^{-1}-1 \tag{7.8}$$

and let a positive number $\delta$ satisfy

$$\delta<4^{-1}(2M_0+1)^{-1}\alpha_*\epsilon. \tag{7.9}$$

Assume that

$$\{\alpha_i\}_{i=0}^\infty\subset[\alpha_*,\alpha^*],\quad \{x_i\}_{i=0}^\infty\subset X,\quad \{y_i\}_{i=0}^\infty\subset X, \tag{7.10}$$

$$\|x_0\|\le M_0 \tag{7.11}$$

and that for each integer $i\ge 0$,

$$\|y_i-P_C(x_i-\alpha_if'(x_i))\|\le\delta, \tag{7.12}$$

$$\|x_{i+1}-P_C(x_i-\alpha_if'(y_i))\|\le\delta. \tag{7.13}$$

Then there is an integer $j\in[0,k]$ such that

$$\|x_i\|\le 3M_0,\quad i=0,\dots,j,$$
$$f(P_C(x_j-\alpha_jf'(x_j)))\le\inf(f;C)+\epsilon.$$

In Theorem 7.1 our goal is to obtain a point $\xi\in C$ such that

$$f(\xi)\le\inf(f;C)+\epsilon,$$

where $\epsilon>0$ is given. In order to meet this goal, the computational errors produced
by our computer system should not exceed $c_1\epsilon$, where $c_1>0$ is a constant
depending only on $M_0$, $\alpha_*$ [see (7.9)]. The number of iterations is $\lfloor c_2\epsilon^{-1}\rfloor$, where
$c_2>0$ is a constant depending only on $M_0$, $\alpha_*$.

It is easy to see that the following proposition holds.

Proposition 7.2. If $\lim_{x\in C,\,\|x\|\to\infty}f(x)=\infty$ and the space $X$ is finite-dimensional,
then for each $\epsilon>0$ there exists $\gamma>0$ such that if $x\in C$ satisfies
$f(x)\le\inf(f;C)+\gamma$, then $d(x,C_{\min})\le\epsilon$.
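One extragradient pass (7.12)–(7.13) projects a predictor $y_i$ using the gradient at $x_i$ and then a corrector $x_{i+1}$ using the gradient evaluated at the predictor. A minimal exact-arithmetic sketch ($\delta=0$); the quadratic objective, box constraint, and step size used to check it are illustrative assumptions, not from the book:

```python
import numpy as np

def extragradient(grad, proj, x0, alpha, n_iter):
    """Korpelevich's extragradient method with a constant step size:
    y_i = P_C(x_i - alpha * f'(x_i)), x_{i+1} = P_C(x_i - alpha * f'(y_i))."""
    x = x0
    for _ in range(n_iter):
        y = proj(x - alpha * grad(x))   # predictor step, cf. (7.12)
        x = proj(x - alpha * grad(y))   # corrector step, cf. (7.13)
    return x
```

Note that the corrector step starts again from $x_i$, not from $y_i$; only the gradient is taken at the predictor point.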
The following theorem is our second main result of this chapter.
The following theorem is our second main result of this chapter.

Theorem 7.3. Let

$$\lim_{x\in C,\,\|x\|\to\infty}f(x)=\infty \tag{7.14}$$

and let the following property hold:

(C) for each $\epsilon>0$ there exists $\gamma>0$ such that if $x\in C$ satisfies $f(x)\le\inf(f;C)+\gamma$,
then $d(x,C_{\min})\le\epsilon/2$.

Let $\epsilon\in(0,1)$, let

$$\gamma\in(0,(\epsilon/4)^2) \tag{7.15}$$

be as guaranteed by property (C), let $M_0>1$, $M_1>0$, $L>0$,

$$C_{\min}\subset B(0,M_0-1), \tag{7.16}$$

$$f'(B(0,3M_0))\subset B(0,M_1), \tag{7.17}$$

$$\|f'(z_1)-f'(z_2)\|\le L\|z_1-z_2\|\ \text{ for all }z_1,z_2\in B(0,3M_0+M_1+1), \tag{7.18}$$

$$0<\alpha_*<\alpha^*\le 1,\quad \alpha^*L<1, \tag{7.19}$$

let an integer $k$ satisfy

$$k>8M_0^2\gamma^{-1}\min\{\alpha_*,\,1-(\alpha^*)^2L^2\}^{-1} \tag{7.20}$$

and let a positive number $\delta$ satisfy

$$16\delta(8M_0+8)\le\gamma\min\{\alpha_*,\,1-(\alpha^*)^2L^2\}. \tag{7.21}$$

Assume that

$$\{\alpha_i\}_{i=0}^\infty\subset[\alpha_*,\alpha^*],\quad \{x_i\}_{i=0}^\infty\subset X,\quad \{y_i\}_{i=0}^\infty\subset X, \tag{7.22}$$

$$\|x_0\|\le M_0 \tag{7.23}$$

and that for each integer $i\ge 0$,

$$\|y_i-P_C(x_i-\alpha_if'(x_i))\|\le\delta, \tag{7.24}$$

$$\|x_{i+1}-P_C(x_i-\alpha_if'(y_i))\|\le\delta. \tag{7.25}$$

Then $d(x_i,C_{\min})<\epsilon$ for all integers $i\ge k$.

The chapter is organized as follows. Section 7.2 contains auxiliary results.
Theorem 7.1 is proved in Sect. 7.3 while Theorem 7.3 is proved in Sect. 7.4.
The results of this chapter were obtained in [126].
7.2 Auxiliary Results 109

7.2 Auxiliary Results

We use the assumptions, notation, and definitions introduced in Sect. 7.1.


Lemma 7.4. Let

u 2 Cmin ; u 2 X; ˛ > 0; (7.26)


0 0
v D PC .u  ˛f .u//; uN D PC .u  ˛f .v//: (7.27)

Then

kNu  u k2  ku  u k2  ku  vk2  kv  uN k2
C2˛Œf .u /  f .v/ C 2˛kNu  vkkf 0 .u/  f 0 .v/k:

Proof. It is easy to see that

$$\langle f'(v), x - v \rangle \le f(x) - f(v) \quad \text{for all } x \in X. \tag{7.28}$$

In view of (7.28),

$$\langle u_* - \bar{u}, f'(v) \rangle - \langle v - \bar{u}, f'(v) \rangle \le f(u_*) - f(v). \tag{7.29}$$

It follows from (7.27) and Lemma 2.2 that

$$\langle \bar{u} - v, (u - \alpha f'(u)) - v \rangle \le 0. \tag{7.30}$$

Relation (7.30) implies that

$$\langle \bar{u} - v, (u - \alpha f'(v)) - v \rangle \le \alpha \langle \bar{u} - v, f'(u) - f'(v) \rangle. \tag{7.31}$$

Set

$$z = u - \alpha f'(v). \tag{7.32}$$

It follows from (7.27) and (7.32) that

$$\|\bar{u} - u_*\|^2 = \|z - u_* + P_C(z) - z\|^2 = \|z - u_*\|^2 + \|z - P_C(z)\|^2 + 2\langle P_C(z) - z, z - u_* \rangle. \tag{7.33}$$

Relation (7.26) and Lemma 2.2 imply that

$$2\|z - P_C(z)\|^2 + 2\langle P_C(z) - z, z - u_* \rangle = 2\langle z - P_C(z), u_* - P_C(z) \rangle \le 0. \tag{7.34}$$

It follows from (7.33), (7.34), (7.32), (7.27), and (7.28) that

$$\|\bar{u} - u_*\|^2 \le \|z - u_*\|^2 - \|z - P_C(z)\|^2$$
$$= \|u - \alpha f'(v) - u_*\|^2 - \|u - \alpha f'(v) - \bar{u}\|^2$$
$$= \|u - u_*\|^2 - \|u - \bar{u}\|^2 + 2\alpha \langle u_* - \bar{u}, f'(v) \rangle$$
$$\le \|u - u_*\|^2 - \|u - \bar{u}\|^2 + 2\alpha \langle v - \bar{u}, f'(v) \rangle + 2\alpha [f(u_*) - f(v)]. \tag{7.35}$$

In view of (7.31) and (7.35),

$$\|\bar{u} - u_*\|^2 \le \|u - u_*\|^2 + 2\alpha \langle v - \bar{u}, f'(v) \rangle + 2\alpha [f(u_*) - f(v)] - \langle u - v + v - \bar{u}, u - v + v - \bar{u} \rangle$$
$$= \|u - u_*\|^2 + 2\alpha \langle v - \bar{u}, f'(v) \rangle + 2\alpha [f(u_*) - f(v)] - \|u - v\|^2 - \|v - \bar{u}\|^2 - 2\langle u - v, v - \bar{u} \rangle$$
$$= \|u - u_*\|^2 - \|u - v\|^2 - \|v - \bar{u}\|^2 + 2\alpha [f(u_*) - f(v)] + 2\langle v - \bar{u}, \alpha f'(v) - u + v \rangle$$
$$\le \|u - u_*\|^2 - \|u - v\|^2 - \|v - \bar{u}\|^2 + 2\alpha [f(u_*) - f(v)] + 2\alpha \langle \bar{u} - v, f'(u) - f'(v) \rangle$$
$$\le \|u - u_*\|^2 - \|u - v\|^2 - \|v - \bar{u}\|^2 + 2\alpha [f(u_*) - f(v)] + 2\alpha \|\bar{u} - v\| \, \|f'(u) - f'(v)\|.$$

This completes the proof of Lemma 7.4.


Lemma 7.5. Let

$$u_* \in C_{\min}, \quad M_0 > 0, \; M_1 > 0, \; L > 0, \; \alpha \in (0, 1], \tag{7.36}$$

$$f'(B(u_*, M_0)) \subset B(0, M_1), \tag{7.37}$$

$$\|f'(z_1) - f'(z_2)\| \le L \|z_1 - z_2\| \quad \text{for all } z_1, z_2 \in B(u_*, M_0 + M_1), \tag{7.38}$$

$$\alpha L \le 1 \tag{7.39}$$

and let

$$u \in B(u_*, M_0), \quad v = P_C(u - \alpha f'(u)), \quad \bar{u} = P_C(u - \alpha f'(v)). \tag{7.40}$$

Then

$$\|\bar{u} - u_*\|^2 \le \|u - u_*\|^2 + 2\alpha [f(u_*) - f(v)] - \|u - v\|^2 (1 - \alpha^2 L^2).$$

Proof. Lemma 7.4 and (7.36) imply that

$$\|\bar{u} - u_*\|^2 \le \|u - u_*\|^2 - \|u - v\|^2 - \|v - \bar{u}\|^2 + 2\alpha [f(u_*) - f(v)] + 2\alpha \|\bar{u} - v\| \, \|f'(u) - f'(v)\|. \tag{7.41}$$

In view of (7.40) and (7.37),

$$\|f'(u)\| \le M_1. \tag{7.42}$$

It follows from (7.40), (7.36), Lemma 2.2, and (7.42) that

$$\|v - u_*\| \le \|u - \alpha f'(u) - u_*\| \le \|u - u_*\| + \alpha \|f'(u)\| \le M_0 + \alpha M_1. \tag{7.43}$$

In view of (7.40), (7.43), (7.36), and (7.39),

$$\|f'(u) - f'(v)\| \le L \|u - v\|. \tag{7.44}$$

By (7.41) and (7.44),

$$\|\bar{u} - u_*\|^2 \le \|u - u_*\|^2 - \|u - v\|^2 - \|v - \bar{u}\|^2 + 2\alpha [f(u_*) - f(v)] + 2\alpha \|\bar{u} - v\| \, \|u - v\| L$$
$$\le \|u - u_*\|^2 - \|u - v\|^2 - \|v - \bar{u}\|^2 + 2\alpha [f(u_*) - f(v)] + \alpha^2 L^2 \|u - v\|^2 + \|\bar{u} - v\|^2$$
$$\le \|u - u_*\|^2 + 2\alpha [f(u_*) - f(v)] - \|u - v\|^2 (1 - \alpha^2 L^2). \tag{7.45}$$

By (7.45),

$$\|\bar{u} - u_*\|^2 \le \|u - u_*\|^2 + 2\alpha [f(u_*) - f(v)] - \|u - v\|^2 (1 - \alpha^2 L^2).$$

This completes the proof of Lemma 7.5.


Lemma 7.6. Let

$$u_* \in C_{\min}, \quad M_0 > 0, \; M_1 > 0, \; L > 0, \; \alpha \in (0, 1], \; \delta \in (0, 1), \tag{7.46}$$

$$f'(B(u_*, M_0)) \subset B(0, M_1), \tag{7.47}$$

$$\|f'(z_1) - f'(z_2)\| \le L \|z_1 - z_2\| \quad \text{for all } z_1, z_2 \in B(u_*, M_0 + M_1 + 1), \tag{7.48}$$

$$\alpha L \le 1. \tag{7.49}$$

Assume that

$$x \in B(u_*, M_0), \quad y \in X, \tag{7.50}$$

$$\|y - P_C(x - \alpha f'(x))\| \le \delta, \tag{7.51}$$

$$\tilde{x} \in X, \quad \|\tilde{x} - P_C(x - \alpha f'(y))\| \le \delta. \tag{7.52}$$

Then

$$\|\tilde{x} - u_*\|^2 \le 4\delta (M_0 + 1) + \|x - u_*\|^2 + 2\alpha [f(u_*) - f(P_C(x - \alpha f'(x)))] - \|x - P_C(x - \alpha f'(x))\|^2 (1 - \alpha^2 L^2).$$

Proof. Put

$$v = P_C(x - \alpha f'(x)), \quad z = P_C(x - \alpha f'(v)). \tag{7.53}$$

Lemma 7.5, (7.46), (7.47), (7.48), (7.49), (7.50), and (7.53) imply that

$$\|z - u_*\|^2 \le \|x - u_*\|^2 + 2\alpha [f(u_*) - f(v)] - \|x - v\|^2 (1 - \alpha^2 L^2). \tag{7.54}$$

It is clear that

$$\|\tilde{x} - u_*\|^2 = \|\tilde{x} - z + z - u_*\|^2 = \|\tilde{x} - z\|^2 + 2\langle \tilde{x} - z, z - u_* \rangle + \|z - u_*\|^2 \le \|\tilde{x} - z\|^2 + 2\|\tilde{x} - z\| \, \|z - u_*\| + \|z - u_*\|^2. \tag{7.55}$$

In view of (7.51) and (7.53),

$$\|v - y\| \le \delta. \tag{7.56}$$

Put

$$\tilde{z} = P_C(x - \alpha f'(y)). \tag{7.57}$$

By (7.53), (7.46), Lemma 2.2, (7.50), and (7.47),

$$\|u_* - v\| \le \|u_* - x\| + \alpha \|f'(x)\| \le M_0 + \alpha M_1. \tag{7.58}$$

Relations (7.58), (7.56), and (7.46) imply that

$$\|u_* - y\| \le \|u_* - v\| + \|v - y\| \le M_0 + \alpha M_1 + 1. \tag{7.59}$$

It follows from (7.52), (7.57), (7.53), Lemma 2.2, (7.58), (7.59), (7.46), (7.48), (7.56), and (7.49) that

$$\|\tilde{x} - z\| \le \|\tilde{x} - \tilde{z}\| + \|\tilde{z} - z\| \le \delta + \|\tilde{z} - z\| \le \delta + \alpha \|f'(y) - f'(v)\| \le \delta + \alpha L \|v - y\| \le \delta + \alpha L \delta = \delta (1 + \alpha L) \le 2\delta. \tag{7.60}$$

In view of (7.55), (7.60), (7.54), (7.46), (7.50), and (7.53),

$$\|\tilde{x} - u_*\|^2 \le 4\delta^2 + \|x - u_*\|^2 + 2\alpha [f(u_*) - f(v)] - \|x - v\|^2 (1 - \alpha^2 L^2) + 4\delta \|x - u_*\|$$
$$\le 4\delta (M_0 + 1) + \|x - u_*\|^2 + 2\alpha [f(u_*) - f(P_C(x - \alpha f'(x)))] - \|x - v\|^2 (1 - \alpha^2 L^2)$$
$$= 4\delta (M_0 + 1) + \|x - u_*\|^2 + 2\alpha [f(u_*) - f(P_C(x - \alpha f'(x)))] - \|x - P_C(x - \alpha f'(x))\|^2 (1 - \alpha^2 L^2).$$

This completes the proof of Lemma 7.6.

7.3 Proof of Theorem 7.1

In view of (7.4), there exists a point

u 2 Cmin \ B.0; M0 /: (7.61)

It follows from (7.11) and (7.61) that

kx0  u k  2M0 : (7.62)

Assume that i  0 is an integer and that

xi 2 B.u ; 2M0 /: (7.63)

(It is clear that in view of (7.62), inclusion (7.63) is valid for i D 0.) It follows
from (7.61), (7.63), (7.7), (7.11), (7.5), (7.6), (7.12), (7.13), and Lemma 7.6 applied

with x D xi , y D yi xQ D xiC1 , ˛ D ˛i that

kxiC1  u k2  4ı.2M0 C 1/
Ckxi  u k2 C 2˛i Œf .u /  f .PC .xi  ˛i f 0 .xi ///: (7.64)

Thus we have shown that the following property holds:


(P1) If an integer i  0 satisfies (7.63), then inequality (7.64) is valid.
We claim that there exists an integer i 2 Œ0; k for which

f .PC .xi  ˛i f 0 .xi ///  f .u / C : (7.65)

Assume the contrary. Then relations (7.9) and (7.10) imply that for each integer
i 2 Œ0; k,

2˛i Œf .u /  f .PC .xi  ˛i f 0 .xi /// C 4ı.2M0 C 1/


 2˛i ./ C 4ı.2M0 C 1/
 2˛  C 4ı.2M0 C 1/  ˛ : (7.66)

It follows from (7.62), (7.66), and property (P1) that for each integer i 2 Œ0; k,

kxiC1  u k2  kxi  u k2  ˛ : (7.67)

In view of (7.63) and (7.67),

4M02  kx0  u k2  kxk  u k2


X
k1
D Œkxi  u k2  kxiC1  u k2   k˛ 
iD0

and

k  4M02 ˛1  1 :

This contradicts (7.8). The contradiction we have reached proves that there exists an
integer j 2 Œ0; k such that

f .PC .xj  ˛j f 0 .xj ///  f .u / C :

We may assume without loss of generality that for each integer i  0 satisfying
i < j,

f .PC .xi  ˛i f 0 .xi /// > f .u / C : (7.68)



It follows from (7.68), (7.9), and (7.10) that for any integer i  0 satisfying
i < j, inequality (7.66) is valid. Combined with (7.62), property (P1), and (7.61)
this implies that for each integer i satisfying 0  i  j, we have

kxi  u k  2M0

and kxi k  3M0 . This completes the proof of Theorem 7.1.

7.4 Proof of Theorem 7.3

Let

u be an arbitrary element of Cmin : (7.69)

In view of (7.69) and (7.16),

ku k  M0  1: (7.70)

Relations (7.70) and (7.23) imply that

kxi  u k  2M0  1: (7.71)

Assume that i  0 is an integer such that

xi 2 B.u ; 2M0 /: (7.72)

(It is clear that in view of (7.71) inclusion (7.72) is valid for i D 0.) It follows
from (7.69), (7.9), (7.17)–(7.19), (7.72), (7.70), (7.24), and Lemma 7.6 applied with
x D xi , y D yi , xQ D xiC1 , ˛ D ˛i that

kxiC1  u k2  4ı.2M0 C 1/
Ckxi  u k2 C 2˛i Œf .u /  f .PC .xi  ˛i f 0 .xi ///
kxi  PC .xi  ˛i f 0 .xi //k2 .1  ˛i2 L2 /

and by (7.22),

kxi  u k2  kxiC1  u k2
 2˛ Œf .PC .xi  ˛i f 0 .xi ///  f .u /
Ckxi  PC .xi  ˛i f 0 .xi //k2 .1  ˛i2 L2 /  4ı.2M0 C 1/: (7.73)

Thus we have shown that the following property holds:


(P2) If an integer i  0 satisfies (7.72), then inequality (7.73) is valid.
Assume that an integer i  0 satisfies (7.72) and that

maxff .PC .xi  ˛i f 0 .xi ///  f .u /; kxi  PC .xi  ˛i f 0 .xi //k2 g  : (7.74)

It follows from (7.74), property (C), and (7.15) that

d.PC .xi  ˛i f 0 .xi /; Cmin /  =2;


kxi  PC .xi  ˛i f 0 .xi //k  1=2 < =4

and

d.xi ; Cmin /  3=4:

Thus we have shown that the following property holds:


(P3) If an integer i  0 satisfies (7.72) and (7.74), then

d.xi ; Cmin /  3=4:

Assume that an integer i  0 satisfies inclusion (7.72) and that

maxff .PC .xi  ˛i f 0 .xi ///  f .u /; kxi  PC .xi  ˛i f 0 .xi //k2 g > : (7.75)

It follows from (7.72), property (P2), (7.73), (7.75), (7.19), (7.21), and (7.22) that

kxi  u k2  kxiC1  u k2
 minf˛ ; 1  .˛  /2 L2 g  4ı.2M0 C 1/
 21 minf˛ ; 1  .˛  /2 L2 g:

Thus we have shown that the following property holds:


(P4) If an integer i  0 satisfies (7.72) and (7.75), then

kxi  u k2  kxiC1  u k2
 21 minf˛ ; 1  .˛  /2 L2 g

and since u is an arbitrary point of Cmin , we have

d.xiC1 ; Cmin /2  d.xi ; Cmin /2  . =2/ minf˛ ; 1  .˛  /2 L2 /g:

We claim that there exists an integer i 2 Œ0; k such that (7.74) is valid.

Assume the contrary. Then (7.75) holds for each integer i 2 Œ0; k. Combined
with (7.71) and property (P4) this implies that

4M02  kx0  u k2  kxk  u k2


X
k1
D Œkxi  u k2  kxiC1  u k2 
iD0

 k. =2/ minf˛ ; 1  .˛  /2 L2 g

and

k  8M02 1 minf˛ ; 1  .˛  /2 L2 g1 :

This contradicts (7.20). The contradiction we have reached proves that there
exists an integer j 2 Œ0; k such that (7.74) is valid with i D j.
We may assume that for all integers i  0 satisfying i < j Eq. (7.75) holds. It
follows from property (P4) and (7.71) that

xj 2 B.u ; 2M0 /: (7.76)

Property (P3), (7.76), and (7.74) with i D j imply that

d.xj ; Cmin /  3=4: (7.77)

Assume that an integer i  j and that

d.xi ; Cmin / < : (7.78)

There are two cases: (7.74) is valid; (7.75) is valid. Assume that (7.74) is true.
In view of property (P3), (7.78), and (7.16),

d.xi ; Cmin / < 3=4:

Since u is an arbitrary point of the set Cmin we may assume without loss of
generality that

kxi  u k < .4=5/: (7.79)

It follows from (7.79), (7.15), property (P2), (7.73), (7.19), and (7.21) that

kxiC1  u k  kxi  u k C 2.8ı.M0 C 1//1=2 < .4=5/ C =5

and

d.xiC1 ; Cmin / < :



Assume that (7.75) holds. Property (P4), (7.78), and (7.16) imply that

d.xiC1 ; Cmin /2  d.xi ; Cmin /2  . =2/ minf˛ ; 1  .˛  /2 L2 g

and

d.xiC1 ; Cmin / < : (7.80)

Thus (7.80) holds in both cases.


We have shown that if an integer i  j and (7.78) holds, then (7.80) is true.
Therefore,

d.xi ; Cmin / <  for all integers i  k:

This completes the proof of Theorem 7.3.


Chapter 8
A Projected Subgradient Method for Nonsmooth
Problems

In this chapter we study the convergence of the projected subgradient method for
a class of constrained optimization problems in a Hilbert space. For this class of
problems, an objective function is assumed to be convex but a set of admissible
points is not necessarily convex. Our goal is to obtain an $\epsilon$-approximate solution in
the presence of computational errors, where $\epsilon$ is a given positive number.

8.1 Preliminaries and Main Results

Let $(X, \langle \cdot, \cdot \rangle)$ be a Hilbert space with an inner product $\langle \cdot, \cdot \rangle$ which induces a complete norm $\| \cdot \|$.
For each $x \in X$ and each nonempty set $A \subset X$ put

$$d(x, A) = \inf\{\|x - y\| : y \in A\}.$$

For each $x \in X$ and each $r > 0$ set

$$B(x, r) = \{y \in X : \|x - y\| \le r\}.$$

Assume that $f : X \to R^1$ is a convex continuous function which is Lipschitz on all bounded subsets of $X$.
For each point $x \in X$ and each positive number $\epsilon$ let

$$\partial f(x) = \{l \in X : f(y) - f(x) \ge \langle l, y - x \rangle \text{ for all } y \in X\} \tag{8.1}$$

© Springer International Publishing Switzerland 2016
A.J. Zaslavski, Numerical Optimization with Computational Errors, Springer Optimization and Its Applications 108, DOI 10.1007/978-3-319-30921-7_8

be the subdifferential of $f$ at $x$ and let

$$\partial_\epsilon f(x) = \{l \in X : f(y) - f(x) \ge \langle l, y - x \rangle - \epsilon \text{ for all } y \in X\} \tag{8.2}$$

be the $\epsilon$-subdifferential of $f$ at $x$.
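As a quick illustration of the inequality defining the $\epsilon$-subdifferential (not from the book): for $f(y) = y^2$ on the real line, $l$ belongs to the $\epsilon$-subdifferential at $x = 1$ iff $(y - 1)^2 \ge (l - 2)(y - 1) - \epsilon$ for all $y$, which holds iff $|l - 2| \le 2\sqrt{\epsilon}$. The sketch below tests the defining inequality on a grid of sample points (a necessary check only); the helper name is mine.

```python
import numpy as np

def is_eps_subgradient(f, x, l, eps, ys):
    """Test the defining inequality f(y) - f(x) >= l*(y - x) - eps
    at the sample points ys (a necessary condition only)."""
    return all(f(y) - f(x) >= l * (y - x) - eps for y in ys)

f = lambda y: y * y
ys = np.linspace(-10.0, 10.0, 4001)
eps = 0.04
# analytically, the eps-subdifferential of y^2 at x = 1 is [1.6, 2.4]
inside = is_eps_subgradient(f, 1.0, 2.39, eps, ys)   # True: 2.39 lies in [1.6, 2.4]
outside = is_eps_subgradient(f, 1.0, 2.5, eps, ys)   # False: 2.5 lies outside
```

With $\epsilon = 0$ the same test reduces to the ordinary subgradient inequality (8.1).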
Let $C$ be a closed nonempty subset of the space $X$.
Assume that

$$\lim_{\|x\| \to \infty} f(x) = \infty. \tag{8.3}$$

It means that for each $M_0 > 0$ there exists $M_1 > 0$ such that if a point $x \in X$ satisfies the inequality $\|x\| \ge M_1$, then $f(x) > M_0$.
Define

$$\inf(f; C) = \inf\{f(z) : z \in C\}. \tag{8.4}$$

Since the function $f$ is Lipschitz on all bounded subsets of the space $X$, it follows from (8.4) that $\inf(f; C)$ is finite.
Set

$$C_{\min} = \{x \in C : f(x) = \inf(f; C)\}. \tag{8.5}$$

It is well known that if the set $C$ is convex, then the set $C_{\min}$ is nonempty. Clearly, the set $C_{\min} \ne \emptyset$ if the space $X$ is finite-dimensional.
In this chapter we assume that

$$C_{\min} \ne \emptyset. \tag{8.6}$$

It is clear that $C_{\min}$ is a closed subset of $X$.
We suppose that the following assumption holds.

(A1) For every positive number $\epsilon$ there exists $\delta > 0$ such that if a point $x \in C$ satisfies the inequality $f(x) \le \inf(f; C) + \delta$, then $d(x, C_{\min}) \le \epsilon$.

(It is clear that (A1) holds if the space $X$ is finite-dimensional.)
We also suppose that the following assumption holds.

(A2) There exists a continuous mapping $P_C : X \to X$ such that $P_C(X) = C$, $P_C(x) = x$ for all $x \in C$ and

$$\|x - P_C(y)\| \le \|x - y\| \quad \text{for all } x \in C \text{ and all } y \in X.$$

For every number $\epsilon \in (0, 1)$ let

$$\phi(\epsilon) = \sup\{\delta \in (0, 1] : \text{if } x \in C \text{ satisfies } f(x) \le \inf(f; C) + \delta, \text{ then } d(x, C_{\min}) \le \min\{1, \epsilon\}\}. \tag{8.7}$$

In view of (A1), $\phi(\epsilon)$ is well defined for every positive number $\epsilon$.

In this chapter we will prove the following two results obtained in [122].
Theorem 8.1. Let $\{\alpha_i\}_{i=0}^{\infty} \subset (0, 1]$ satisfy

$$\lim_{i \to \infty} \alpha_i = 0, \quad \sum_{i=1}^{\infty} \alpha_i = \infty$$

and let $M, \epsilon > 0$. Then there exist a natural number $n_0$ and $\delta > 0$ such that the following assertion holds.
Assume that an integer $n \ge n_0$,

$$\{x_k\}_{k=0}^{n} \subset X, \quad \|x_0\| \le M,$$

$$v_k \in \partial_\delta f(x_k) \setminus \{0\}, \quad k = 0, 1, \dots, n - 1,$$

$$\{\eta_k\}_{k=0}^{n-1}, \{\xi_k\}_{k=0}^{n-1} \subset B(0, \delta),$$

and that for $k = 0, \dots, n - 1$,

$$x_{k+1} = P_C(x_k - \alpha_k \|v_k\|^{-1} v_k - \alpha_k \eta_k) - \alpha_k \xi_k.$$

Then the inequality $d(x_k, C_{\min}) \le \epsilon$ holds for all integers $k$ satisfying $n_0 \le k \le n$.
Theorem 8.2. Let $M, \epsilon > 0$. Then there exists $\beta_0 \in (0, 1)$ such that for each $\beta_1 \in (0, \beta_0)$ there exist a natural number $n_0$ and $\delta > 0$ such that the following assertion holds.
Assume that an integer $n \ge n_0$,

$$\{x_k\}_{k=0}^{n} \subset X, \quad \|x_0\| \le M,$$

$$v_k \in \partial_\delta f(x_k) \setminus \{0\}, \quad k = 0, 1, \dots, n - 1,$$

$$\{\alpha_k\}_{k=0}^{n-1} \subset [\beta_1, \beta_0],$$

$$\{\eta_k\}_{k=0}^{n-1}, \{\xi_k\}_{k=0}^{n-1} \subset B(0, \delta)$$

and that for $k = 0, \dots, n - 1$,

$$x_{k+1} = P_C(x_k - \alpha_k \|v_k\|^{-1} v_k - \alpha_k \eta_k) - \xi_k.$$

Then the inequality $d(x_k, C_{\min}) \le \epsilon$ holds for all integers $k$ satisfying $n_0 \le k \le n$.
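With all computational errors set to zero, the iteration of Theorems 8.1 and 8.2 becomes a plain normalized projected subgradient step. The following minimal sketch is not from the book: the toy objective, the affine set $C$, and the function names are mine, and `retract` stands for the mapping $P_C$ of assumption (A2) (here the metric projection, since $C$ is convex).

```python
import numpy as np

def projected_subgradient(subgrad, retract, x0, steps):
    """Normalized projected subgradient method with zero computational
    errors; `retract` plays the role of the mapping P_C of (A2)."""
    x = np.asarray(x0, dtype=float)
    for a in steps:
        v = subgrad(x)
        norm_v = float(np.linalg.norm(v))
        if norm_v == 0.0:                 # x already minimizes f on all of X
            return x
        x = retract(x - a * v / norm_v)   # step along the normalized subgradient
    return x

# toy problem: f(x) = |x_1| + |x_2|, C = {x : x_1 + x_2 = 2}
def subgrad_l1(x):
    s = np.sign(x)
    s[s == 0.0] = 1.0        # at 0, pick the subgradient +1 from [-1, 1]
    return s

def retract(z):              # metric projection onto the hyperplane C
    return z - (np.sum(z) - 2.0) / 2.0

steps = [1.0 / np.sqrt(k + 1) for k in range(200)]
x = projected_subgradient(subgrad_l1, retract, np.array([5.0, -3.0]), steps)
```

Here $\inf(f; C) = 2$ and $C_{\min}$ is the segment $\{(t, 2 - t) : 0 \le t \le 2\}$; with the diminishing steps above the iterates land on this segment after a few steps.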
In this chapter we use the following definitions and notation.
Define

$$X_0 = \{x \in X : f(x) \le \inf(f; C) + 1\}. \tag{8.8}$$

In view of (8.3), there exists a number $\bar{K} > 0$ such that

$$X_0 \subset B(0, \bar{K}). \tag{8.9}$$

Since the function $f$ is Lipschitz on all bounded subsets of the space $X$ there exists a number $\bar{L} > 1$ such that

$$|f(z_1) - f(z_2)| \le \bar{L} \|z_1 - z_2\| \quad \text{for all } z_1, z_2 \in B(0, \bar{K} + 4). \tag{8.10}$$

8.2 Auxiliary Results

We use the notation and definitions introduced in Sect. 8.1 and suppose that all the
assumptions posed in Sect. 8.1 hold.
Proposition 8.3. Let $\epsilon \in (0, 1]$. Then for each $x \in X$ satisfying

$$d(x, C) < \min\{\bar{L}^{-1} 2^{-1} \phi(\epsilon/2), \epsilon/2\},$$

$$f(x) \le \inf(f; C) + \min\{2^{-1} \phi(\epsilon/2), \epsilon/2\}, \tag{8.11}$$

the inequality $d(x, C_{\min}) \le \epsilon$ holds.


Proof. In view of the definition of $\phi$ [see (8.7)], $\phi(\epsilon/2) \in (0, 1]$ and

if $x \in C$ satisfies $f(x) < \inf(f; C) + \phi(\epsilon/2)$, then $d(x, C_{\min}) \le \min\{1, \epsilon/2\}$. (8.12)

Assume that a point $x \in X$ satisfies (8.11). There exists a point $y \in C$ which satisfies

$$\|x - y\| < 2^{-1} \bar{L}^{-1} \phi(\epsilon/2) \quad \text{and} \quad \|x - y\| < \epsilon/2. \tag{8.13}$$

Relations (8.11), (8.8), (8.9), and (8.13) imply that

$$x \in B(0, \bar{K}), \quad y \in B(0, \bar{K} + 1). \tag{8.14}$$

By (8.13), (8.14), and the definition of $\bar{L}$ [see (8.10)],

$$|f(x) - f(y)| \le \bar{L} \|x - y\| < \phi(\epsilon/2) 2^{-1}. \tag{8.15}$$

It follows from the choice of the point $y$, (8.11), and (8.15) that $y \in C$ and

$$f(y) < f(x) + \phi(\epsilon/2) 2^{-1} \le \inf(f; C) + \phi(\epsilon/2).$$

Combined with (8.12) this implies that

$$d(y, C_{\min}) \le \epsilon/2.$$

Together with (8.13) this implies that

$$d(x, C_{\min}) \le \|x - y\| + d(y, C_{\min}) \le \epsilon.$$

This completes the proof of Proposition 8.3.


Lemma 8.4. Assume that $\epsilon > 0$, $x \in X$, $y \in X$,

$$f(x) > \inf(f; C) + \epsilon, \quad f(y) \le \inf(f; C) + \epsilon/4, \tag{8.16}$$

$$v \in \partial_{\epsilon/4} f(x). \tag{8.17}$$

Then $\langle v, y - x \rangle \le -\epsilon/2$.

Proof. In view of (8.2) and (8.17),

$$f(u) - f(x) \ge \langle v, u - x \rangle - \epsilon/4 \quad \text{for all } u \in X. \tag{8.18}$$

By (8.16) and (8.18),

$$-(3/4)\epsilon \ge f(y) - f(x) \ge \langle v, y - x \rangle - \epsilon/4.$$

The inequality above implies that

$$\langle v, y - x \rangle \le -\epsilon/2.$$

This completes the proof of Lemma 8.4.


Lemma 8.5. Let

$$\bar{x} \in C_{\min}, \tag{8.19}$$

$K_0 > 0$, $\epsilon \in (0, 1]$, $\alpha \in (0, 1]$, let a positive number $\delta$ satisfy

$$\delta (K_0 + \bar{K} + 1) \le \epsilon (8\bar{L})^{-1}, \tag{8.20}$$

let a point $x \in X$ satisfy

$$\|x\| \le K_0, \quad f(x) > \inf(f; C) + \epsilon, \tag{8.21}$$

$$\eta, \xi \in B(0, \delta), \quad v \in \partial_{\epsilon/4} f(x) \setminus \{0\} \tag{8.22}$$

and let

$$y = P_C(x - \alpha \|v\|^{-1} v - \alpha \eta) - \xi. \tag{8.23}$$

Then

$$\|y - \bar{x}\|^2 \le \|x - \bar{x}\|^2 - \alpha (4\bar{L})^{-1} \epsilon + 2\alpha^2 + \|\xi\|^2 + 2\|\xi\| (K_0 + \bar{K} + 2).$$

Proof. In view of (8.8)–(8.10) and (8.19), for every point $z \in B(\bar{x}, 4^{-1} \epsilon \bar{L}^{-1})$, we have

$$f(z) \le f(\bar{x}) + \bar{L} \|z - \bar{x}\| \le f(\bar{x}) + \epsilon/4 = \inf(f; C) + \epsilon/4. \tag{8.24}$$

Lemma 8.4, (8.21), (8.22), and (8.24) imply that for every point $z \in B(\bar{x}, 4^{-1} \epsilon \bar{L}^{-1})$, we have

$$\langle v, z - x \rangle \le -\epsilon/2.$$

Combined with (8.22) the inequality above implies that

$$\langle \|v\|^{-1} v, z - x \rangle < 0 \quad \text{for all } z \in B(\bar{x}, (4\bar{L})^{-1} \epsilon). \tag{8.25}$$

Put

$$\tilde{z} = \bar{x} + 4^{-1} \epsilon \bar{L}^{-1} \|v\|^{-1} v. \tag{8.26}$$

It is easy to see that

$$\tilde{z} \in B(\bar{x}, 4^{-1} \epsilon \bar{L}^{-1}). \tag{8.27}$$

Relations (8.25), (8.26), and (8.27) imply that

$$0 > \langle \|v\|^{-1} v, \tilde{z} - x \rangle = \langle \|v\|^{-1} v, \bar{x} + 4^{-1} \epsilon \bar{L}^{-1} \|v\|^{-1} v - x \rangle. \tag{8.28}$$

By (8.28),

$$\langle \|v\|^{-1} v, \bar{x} - x \rangle < -4^{-1} \epsilon \bar{L}^{-1}. \tag{8.29}$$

Set

$$y_0 = x - \alpha \|v\|^{-1} v - \alpha \eta. \tag{8.30}$$

It follows from (8.30), (8.22), (8.21), (8.19), (8.8), (8.9), (8.29), and (8.20) that

$$\|y_0 - \bar{x}\|^2 = \|x - \alpha \|v\|^{-1} v - \alpha \eta - \bar{x}\|^2$$
$$= \|x - \alpha \|v\|^{-1} v - \bar{x}\|^2 + \alpha^2 \|\eta\|^2 - 2\alpha \langle \eta, x - \alpha \|v\|^{-1} v - \bar{x} \rangle$$
$$\le \|x - \alpha \|v\|^{-1} v - \bar{x}\|^2 + \alpha^2 \delta^2 + 2\alpha \delta (K_0 + \bar{K} + 1)$$
$$\le \|x - \bar{x}\|^2 - 2\langle x - \bar{x}, \alpha \|v\|^{-1} v \rangle + \alpha^2 + \alpha^2 \delta^2 + 2\alpha \delta (K_0 + \bar{K} + 1)$$
$$< \|x - \bar{x}\|^2 - 2\alpha (4^{-1} \bar{L}^{-1} \epsilon) + \alpha^2 (1 + \delta^2) + 2\alpha \delta (K_0 + \bar{K} + 1)$$
$$\le \|x - \bar{x}\|^2 - \alpha (4\bar{L})^{-1} \epsilon + 2\alpha^2. \tag{8.31}$$

In view of (8.8), (8.9), (8.19), (8.21), and (8.31),

$$\|y_0 - \bar{x}\|^2 \le (K_0 + \bar{K})^2 + 2$$

and

$$\|y_0 - \bar{x}\| \le K_0 + \bar{K} + 2. \tag{8.32}$$

By (8.23), (8.30), (8.19), (A2), (8.31), and (8.32),

$$\|y - \bar{x}\|^2 = \|P_C(y_0) - \xi - \bar{x}\|^2$$
$$\le \|P_C(y_0) - \bar{x}\|^2 + \|\xi\|^2 + 2\|\xi\| \, \|P_C(y_0) - \bar{x}\|$$
$$\le \|y_0 - \bar{x}\|^2 + \|\xi\|^2 + 2\|\xi\| \, \|y_0 - \bar{x}\|$$
$$\le \|x - \bar{x}\|^2 - \alpha (4\bar{L})^{-1} \epsilon + 2\alpha^2 + \|\xi\|^2 + 2\|\xi\| (K_0 + \bar{K} + 2).$$

This completes the proof of Lemma 8.5.


Lemma 8.5 implies the following result.

Lemma 8.6. Let $K_0 > 0$, $\epsilon \in (0, 1]$, $\alpha \in (0, 1]$, a positive number $\delta$ satisfy

$$\delta (K_0 + \bar{K} + 1) \le \epsilon (8\bar{L})^{-1},$$

let $x \in X$ satisfy

$$\|x\| \le K_0, \quad f(x) > \inf(f; C) + \epsilon,$$

let

$$\eta, \xi \in B(0, \delta), \quad v \in \partial_{\epsilon/4} f(x) \setminus \{0\}$$

and let

$$y = P_C(x - \alpha \|v\|^{-1} v - \alpha \eta) - \xi.$$

Then

$$d(y, C_{\min})^2 \le d(x, C_{\min})^2 - \alpha (4\bar{L})^{-1} \epsilon + 2\alpha^2 + \|\xi\|^2 + 2\|\xi\| (K_0 + \bar{K} + 2).$$

8.3 Proof of Theorem 8.1

We may assume without loss of generality that  < 1. In view of Proposition 8.3,
there exists a number

N 2 .0; =8/ (8.33)

such that

if x 2 X; d.x; C/  2N and f .x/  inf.f I C/ C 2N ;


then d.x; Cmin /  : (8.34)

Fix

xN 2 Cmin : (8.35)

Fix

0 2 .0; 41 N /: (8.36)



Since limi!1 ˛i D 0 there is an integer p0 > 0 such that

KN C 4 < p0 (8.37)

and that for all integers p  p0 , we have

N 1 0 :
˛p < .32L/ (8.38)
P1
Since iD0 ˛i D 1 there exists a natural number n0 > p0 C 4 such that

0 1
nX
N
˛i > .4p0 C M C kNxk/2 01 16L: (8.39)
iDp0

Fix

K > KN C 4 C M C 4n0 C 4kNxk (8.40)

and a positive number ı such that

N 1 0 :
6ı.K C 1/ < .16L/ (8.41)

Assume that an integer n  n0 and that

fxk gnkD0  X; kx0 k  M; (8.42)

f
k gn1
kD0 ; fk gkD0  B.0; ı/;
n1
(8.43)

vk 2 @ı f .xk / n f0g; k D 0; 1; : : : ; n  1 (8.44)

and that for all integers k D 0; : : : ; n  1, we have

xkC1 D PC .xk  ˛k kvk k1 vk  ˛k k /  ˛k


k : (8.45)

In order to prove the theorem it is sufficient to show that

d.xk ; Cmin /   for all integers k satisfying n0  k  n: (8.46)

Assume that an integer

k 2 Œp0 ; n  1; (8.47)

kxk k  K ; f .xk / > inf.f I C/ C 0 : (8.48)



In view of (8.35), (8.41), (8.48), (8.43), (8.44), and (8.45), the conditions of
Lemma 8.5 hold with K0 D K ,  D 0 , ˛ D ˛k , x D xk ,  D k , v D vk ,
y D xkC1 .
D ˛k
k and combined with (8.43), (847), (8.38), (8.41), and (8.40) this
lemma implies that
N 1 0
kxkC1  xN k2  kxk  xN k2  ˛k .4L/
C2˛k2 C ˛k2 k
k k2 C 2k
k k˛k .K C KN C 2/
N 1 0 C 2˛k2 C ˛k2 ı 2 C 2ı˛k .K C KN C 2/
 kxk  xN k2  ˛k .4L/
N 1 0 C 2ı˛k .K C KN C 3/
 kxk  xN k2  ˛k .8L/
N 1 0 :
 kxk  xN k2  ˛k .16L/

Thus we have shown that the following property holds:


(P1) If an integer k 2 Œp0 ; n  1 and (8.48) is valid, then we have
N 1 ˛k 0 :
kxkC1  xN k2  kxk  xN k2  .16L/

We claim that there exists an integer j 2 fp0 ; : : : ; n0 g such that

f .xj /  inf.f I C/ C 0

Assume the contrary. Then

f .xj / > inf.f I C/ C 0 ; i D p0 ; : : : ; n0 : (8.49)

It follows from (8.45), (8.43), (8.41), (8.35), and (A2) that for all integers
i D 0; : : : ; n  1, we have

kxiC1  xN k  1 C kPC .xi  ˛i kvi k1 vi  ˛i /  xN k (8.50)


1
 1 C kxi  ˛i kvi k vi  ˛i   xN k
 1 C kxi  xN k C 2 D kxi  xN k C 3:

By (8.40), (8.42), and (8.50), for all integers i D 0; : : : ; n0 ,

kxi k  kx0  xN k C 3i C kNxk  M C 3i C 2kNxk  M C 3n0 C 2kNxk < K : (8.51)

Let

i 2 fp0 ; : : : ; n0  1g: (8.52)

It follows from (8.52), (8.51), (8.49), and property (P1) that

N 1 ˛i 0 :
kxiC1  xN k2  kxi  xN k2  .16L/ (8.53)

Relations (8.42), (8.50), (8.52), and (8.53) imply that

.M C 3p0 C kNxk/2  kxn0  xN k2  kxp0  xN k2


0 1
nX 0 1
nX
D N 1 0
ŒkxiC1  xN k2  kxi  xN k2   .16L/ ˛i
iDp0 iDp0

and
0 1
nX
N 01 .M C 3p0 C kNxk//2 :
˛i  16L
iDp0

This contradicts (8.39). The contradiction we have reached proves that there exists
an integer

j 2 fp0 ; : : : ; n0 g

such that

f .xj /  inf.f I C/ C 0 : (8.54)

By (8.45), (A2), (8.43), (8.41), and (8.36), we have

d.xj ; C/  ˛j1 ı < N : (8.55)

In view of (8.54), (8.55), (8.36), and (8.34),

d.xj ; Cmin /  : (8.56)

We claim that for all integers i satisfying j  i  n,

d.xi ; Cmin /  :

Assume the contrary. Then there exists an integer k 2 Œj; n for which

d.xk ; Cmin / > : (8.57)

By (8.56) and (8.57), we have

k > j  p0 : (8.58)

By (8.56) we may assume without loss of generality that

d.xi ; Cmin /   for all integers i satisfying j  i < k: (8.59)



Thus

d.xk1 ; Cmin /  : (8.60)

There are two cases:

f .xk1 /  inf.f I C/ C 0 I (8.61)

f .xk1 / > inf.f I C/ C 0 : (8.62)

Assume that (8.61) is valid. It follows from (8.61), (8.36), (8.33), (8.8), and (8.9)
that

N
xk1 2 X0  B.0; K/: (8.63)

By (8.45) and (8.43) there exists a point z 2 C such that

kxk1  zk  ı: (8.64)

By (8.45), (8.43), (8.64), and (A2),

kxk  zk  ˛k1 ı C kz  PC .xk1  ˛k1 kvk1 k1 vk1  ˛k1 k1 /k


 ı C kz  xk1 k C ˛k1 C ı D 3ı C ˛k1 : (8.65)

Combined with (8.41), (8.58), and (8.38) the relation above implies that

d.xk ; C/  3ı C ˛k1 < 0 : (8.66)

In view of (8.64) and (8.65),

kxk  xk1 k  4ı C ˛k1 : (8.67)

It follows from (8.60), (8.67), (8.41), (8.38), and (8.58) that

d.xk ; Cmin /  2: (8.68)

Relations (8.63), (8.68), (8.8), and (8.9) imply that

xk1 ; xk 2 B.0; KN C 2/:

Together with (8.10) and (8.67) the inclusion above implies that

N k1  xk k  L.4ı
jf .xk1 /  f .xk /j  Lkx N C ˛k1 /: (8.69)

In view of (8.69), (8.51), (8.41), (8.38), and (8.58), we have


N
f .xk /  f .xk1 / C L.4ı C ˛k1 /
N
 inf.f I C/ C 0 C L.4ı C ˛k1 /  inf.f I C/ C 20 : (8.70)

It follows from (8.70), (8.66), (8.36), and (8.34) that

d.xk ; Cmin /  :

This inequality contradicts (8.57). The contradiction we have reached proves (8.62).
By (8.60), (8.8), and (8.9), we have

kxk1 k  KN C 1: (8.71)

It follows from (8.40), (8.41), (8.43), (8.44), (8.71), and (8.62) that Lemma 8.6
holds with

x D xk1 ; y D xk ;  D k1 ; v D vk1 ; ˛ D ˛k1 ; K0 D KN C 1;


 D 0 ;
D ˛k1
k1 :

Combined with (8.38), (8.58), (8.43), (8.41), and (8.60) this implies that
N 1 0 C 2˛k1
d.xk ; Cmin /2  d.xk1 ; Cmin /2  ˛k1 .4L/ 2 2
C 2˛k1 k
k1 k2
C2˛k1 k
k1 k.2KN C 3/
N 1 ˛k1 0 C 2ı 2 ˛k1
 d.xk1 ; Cmin /2  .8L/ 2
C 2˛k1 ı.2KN C 3/
N 1 ˛k1 0 C 2ı˛k1 .2KN C 4/
 d.xk1 ; Cmin /2  .8L/
N 1 ˛k1 0  d.xk1 ; Cmin /2   2 :
 d.xk1 ; Cmin /2  .16L/

This contradicts (8.57).


The contradiction we have reached proves that d.xi ; Cmin /   for all integers i
satisfying j  i  n. Since j  n0 this completes the proof of Theorem 8.1.

8.4 Proof of Theorem 8.2

We may assume without loss of generality that

$$\epsilon < 1, \quad M > \bar{K} + 4. \tag{8.72}$$

Proposition 8.3 implies that there exists

N 2 .0; =8/ (8.73)



such that

if x 2 X; d.x; C/  2N and f .x/  inf.f I C/ C 2N ;


then d.x; Cmin /  =4: (8.74)

Put

N 1 N :
ˇ0 D .64L/ (8.75)

Let

ˇ1 2 .0; ˇ0 /: (8.76)

There exists an integer n0  4 such that

N
ˇ1 n0 > 162 .3 C 2M/2 N 1 L: (8.77)

Fix

K > 2M C 4 C 4n0 C 2KN C 2M (8.78)

and a positive number ı such that

N 1 N ˇ1 :
6ıK < .64L/ (8.79)

Fix a point

xN 2 Cmin : (8.80)

Assume that an integer n  n0 ,

fxk gnkD0  X; f
k gn1
kD0  X; fk gkD0  X; f˛k gkD0  Œˇ1 ; ˇ0 ;
n1 n1
(8.81)

kx0 k  M; k
k k  ı; kk k  ı; k D 0; : : : ; n  1; (8.82)

vk 2 @ı f .xk / n f0g; k D 0; 1; : : : ; n  1 (8.83)

and that for all integers k D 0; : : : ; n  1,

xkC1 D PC .xk  ˛k kvk k1 vk  ˛k k / 


k : (8.84)

We claim that d.xk ; Cmin /   for all integers k satisfying n0  k  n.



Assume that an integer

k 2 Œ0; n  1;
kxk k  K f .xk / > inf.f I C/ C N =4: (8.85)

It follows from (8.75), (8.78)–(8.81), (8.83), (8.85), (8.82), and (8.74) that Lemma
8.5 holds with  D N =4, K0 D K , ˛ D ˛k , x D xk ,
D
k ,  D k , v D vk ,
y D xkC1 and combining with (8.79) this implies that
N 1 N
kxkC1  xN k2  kxk  xN k2  ˛k .16L/
C2˛k2 C ı 2 C 2ı.K C KN C 2/
N 1 N C 2˛k2 C 2ı.K C KN C 3/:
 kxk  xN k2  ˛k .16L/

Together with (8.81), (8.75), (8.78), and (8.79) this implies that

N 1 N C 2ı.KN C 3 C K /
kxkC1  xN k2  kxk  xN k2  ˛k .32L/
N 1 N ˇ1 C 2ı.KN C 3 C K /
 kxk  xN k2  .32L/
N 1 N :
 kxk  xN k2  ˇ1 .64L/

Thus we have shown that the following property holds:


(P2) if an integer k 2 Œ0; n  1 and (8.85) is valid, then we have

N 1 ˇ1 N :
kxkC1  xN k2  kxk  xN k2  .64L/

We claim that there exists an integer j 2 f1; : : : ; n0 g for which

f .xj /  inf.f I C/ C N =4:

Assume the contrary. Then we have

f .xj / > inf.f I C/ C N =4; j D 1; : : : ; n0 : (8.86)

It follows from (8.84), (8.82), (8.79), (A2), (8.80), (8.81), and (8.75) that for all
integers i D 0; : : : ; n  1, we have

kxiC1  xN k  1 C kxi  ˛i kvi k1 vi  ˛i i  xN k


 kxi  xN k C 3: (8.87)

By (8.80)–(8.82), (8.72), (8.87), and (8.78) for i D 0; : : : ; n0 ,

kxi  xN k  kx0  xN k C 3i; (8.88)


kxi k  2kNxk C M C 3n0 < K : (8.89)

Let k 2 f1; : : : ; n0  1g. It follows from (8.89), (8.86), and property (P2) that

N 1 ˇ1 N :
kxkC1  xN k2  kxk  xN k2  .64L/ (8.90)

Relations (8.72), (8.80), (8.88), (8.82), and (8.90) imply that

.M C kNxk C 3/2  kxn0  xN k2  kx1  xN k2


0 1
nX
D ŒkxiC1  xN k2  kxi  xN k2 
iD1

N 1 N ˇ1  ˇ1 n0 =2.64L/
 .n0  1/.64L/ N 1 N ;
N 1 N ˇ1  .M C kNxk C 3/2  .2M C 3/2 :
.n0 =2/.64L/

This contradicts (8.77). The contradiction we have reached proves that there
exists an integer

j 2 f1; : : : ; n0 g

for which

f .xj /  inf.f I C/ C N =4: (8.91)

By (8.84), (A2), and (8.82), we have

d.xj ; C/  ı: (8.92)

Relations (8.91), (8.92), (8.79), and (8.74) imply that

d.xj ; Cmin /  : (8.93)

We claim that for all integers i satisfying j  i  n, we have

d.xi ; Cmin /  :

Assume the contrary. Then there exists an integer k 2 Œj; n for which

d.xk ; Cmin / > : (8.94)

It is easy to see that

k > j: (8.95)

We may assume without loss of generality that

d.xi ; Cmin /   for all integers i satisfying j  i < k: (8.96)

Then

d.xk1 ; Cmin /  : (8.97)

There are two cases:

f .xk1 /  inf.f I C/ C N =4I (8.98)


f .xk1 / > inf.f I C/ C N =4: (8.99)

Assume that (8.98) is valid. In view of (8.98), (8.73), (8.8), and (8.9),

N
xk1 2 X0  B.0; K/: (8.100)

By (8.82), (8.84), and (A2), there exists a point z 2 C such that

kxk1  zk  ı: (8.101)

It follows from (8.82), (8.84), (8.101), and (A2) that

kxk  zk
 ı C kz  PC .xk1  ˛k1 kvk1 k1 vk1  ˛k1 k1 /k
 ı C kz  xk1 k C ˛k1 C ı < 3ı C ˛k1 : (8.102)

Relations (8.101), (8.98), (8.79), and (8.74) imply that

d.xk1 ; Cmin /  =4: (8.103)

By (8.101), (8.102), (8.79), (8.81), (8.75), and (8.73),

kxk  xk1 k  4ı C ˛k1 < N < =8: (8.104)

In view of (8.103) and (8.104),

d.xk ; Cmin / < :

This inequality contradicts (8.94). The contradiction we have reached proves (8.99).
In view of (8.97), (8.8), and (8.9),

kxk1 k  KN C 1: (8.105)

It follows from (8.78), (8.79), (8.105), (8.99), and (8.82)–(8.84) that Lemma 8.6
holds with

x D xk1 ; y D xk ;  D k1 ;
D
k1 ;
v D vk1 ; ˛ D ˛k1 ; K0 D KN C 1

 D 41 N and combining with (8.81), (8.75), (8.79), and (8.97) this implies that

d.xk ; Cmin /2
N 1 N C 2˛k1
 d.xk1 ; Cmin /2  ˛k1 .16L/ 2
C ı 2 C 2ı.KN C 4/
N 1 ˛k1 N C 2˛k1
 d.xk1 ; Cmin /2  .16L/ 2
C 2ı.KN C 5/
N 1 ˛k1 N C 2ı.KN C 5/
 d.xk1 ; Cmin /2  .32L/
N 1 ˇ1 N  2ı.2KN C 5/
 d.xk1 ; Cmin /2  .32L/
< d.xk1 ; Cmin /2   2 :

This contradicts (8.94). The contradiction we have reached proves that

d.xi ; Cmin /  

for all integers i satisfying j  i  n. In view of inequality n0  j, Theorem 8.2 is


proved.
Chapter 9
Proximal Point Method in Hilbert Spaces

In this chapter we study the convergence of a proximal point method under the
presence of computational errors. Most results known in the literature show the
convergence of proximal point methods when computational errors are summable.
In this chapter the convergence of the method is established for nonsummable
computational errors. We show that the proximal point method generates a good
approximate solution if the sequence of computational errors is bounded from above
by some constant.

9.1 Preliminaries and the Main Results

We analyze the behavior of the proximal point method in a Hilbert space, which is an important tool in optimization theory. See, for example, [15, 16, 31, 34, 36, 53, 55, 69, 77, 81, 87, 103, 104, 106, 107, 111, 113] and the references mentioned therein.
Let $X$ be a Hilbert space equipped with an inner product $\langle \cdot, \cdot \rangle$ which induces the norm $\| \cdot \|$.
For each function $g : X \to R^1 \cup \{\infty\}$ set

$$\inf(g) = \inf\{g(y) : y \in X\}.$$

Suppose that $f : X \to R^1 \cup \{\infty\}$ is a convex lower semicontinuous function and $a$ is a positive constant such that

$$\operatorname{dom}(f) := \{x \in X : f(x) < \infty\} \ne \emptyset,$$

$$f(x) \ge -a \quad \text{for all } x \in X \tag{9.1}$$


and that

$$\lim_{\|x\| \to \infty} f(x) = \infty. \tag{9.2}$$

In view of (9.1) and (9.2), the set

$$\operatorname{argmin}(f) := \{z \in X : f(z) = \inf(f)\} \ne \emptyset. \tag{9.3}$$

Let a point

$$x_* \in \operatorname{argmin}(f) \tag{9.4}$$

and let $M$ be any positive number such that

$$M > \inf(f) + 4. \tag{9.5}$$

In view of (9.2), there exists a number $M_0 > 1$ such that

$$f(z) > M + 4 \quad \text{for all } z \in X \text{ satisfying } \|z\| \ge M_0 - 1. \tag{9.6}$$

Clearly,

$$\|x_*\| < M_0 - 1. \tag{9.7}$$

Assume that

$$0 < \lambda_1 < \lambda_2 \le M_0^{-2}/2. \tag{9.8}$$

The following theorem is the main result of this chapter.


Theorem 9.1. Let

$$\lambda_k \in [\lambda_1, \lambda_2], \quad k = 0, 1, \dots, \tag{9.9}$$

$\epsilon \in (0, 1]$, a natural number $L$ satisfy

$$L > 2(4M_0^2 + 1)\lambda_2 \epsilon^{-1} \tag{9.10}$$

and let a positive number $\delta$ satisfy

$$\delta^{1/2}(L + 1)(2\lambda_1^{-1} + 8M_0 \lambda_1^{-1/2}) \le 1 \quad \text{and} \quad \delta(L + 1) \le \epsilon/4. \tag{9.11}$$

Assume that a sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfies

$$f(x_0) \le M \tag{9.12}$$

and

$$f(x_{k+1}) + 2^{-1} \lambda_k \|x_{k+1} - x_k\|^2 \le \inf(f + 2^{-1} \lambda_k \| \cdot - x_k\|^2) + \delta \tag{9.13}$$

for all integers $k \ge 0$. Then for all integers $k > L$,

$$f(x_k) \le \inf(f) + \epsilon.$$

By Theorem 9.1, for a given $\epsilon > 0$, we obtain $\xi \in X$ satisfying

$$f(\xi) \le \inf(f) + \epsilon$$

doing $\lfloor c_1 \epsilon^{-1} \rfloor$ iterations [see (9.10)] with the computational error $\delta = c_2 \epsilon^2$ [see (9.11)], where the constant $c_1 > 0$ depends only on $M_0, \lambda_2$ and the constant $c_2 > 0$ depends only on $M_0, L, \lambda_1, \lambda_2$.
Theorem 9.1 implies the following result.
Theorem 9.2. Let

$$\lambda_k \in [\lambda_1, \lambda_2], \quad k = 0, 1, \dots,$$

a natural number $L$ satisfy

$$L > 2(4M_0^2 + 1)\lambda_2 \tag{9.14}$$

and let a positive number $\bar{\epsilon}$ satisfy

$$\bar{\epsilon}^{1/2}(L + 1)(2\lambda_1^{-1} + 8M_0 \lambda_1^{-1/2}) \le 1 \quad \text{and} \quad \bar{\epsilon}(L + 1) \le 1/4. \tag{9.15}$$

Assume that

$$\{\epsilon_i\}_{i=0}^{\infty} \subset (0, \bar{\epsilon}), \quad \lim_{i \to \infty} \epsilon_i = 0 \tag{9.16}$$

and that $\gamma > 0$. Then there exists a natural number $T_0$ such that for each sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfying

$$f(x_0) \le M \tag{9.17}$$

and

$$f(x_{k+1}) + 2^{-1} \lambda_k \|x_{k+1} - x_k\|^2 \le \inf(f + 2^{-1} \lambda_k \| \cdot - x_k\|^2) + \epsilon_k \tag{9.18}$$

for all integers $k \ge 0$, the inequality

$$f(x_k) \le \inf(f) + \gamma$$

holds for all integers $k > T_0$.

Since the function $f$ is convex and lower semicontinuous and satisfies (9.2), Theorem 9.2 easily implies the following result.

Corollary 9.3. Suppose that all the assumptions of Theorem 9.2 hold and that the sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfies (9.17) and (9.18) for all integers $k \ge 0$. Then $\lim_{k \to \infty} f(x_k) = \inf(f)$ and the sequence $\{x_k\}_{k=0}^{\infty}$ is bounded. Moreover, it possesses a weakly convergent subsequence and the limit of any weakly convergent subsequence of $\{x_k\}_{k=0}^{\infty}$ is a minimizer of $f$.

Problem (P) is called well posed if the function $f$ possesses a unique minimizer which is a limit in the norm topology of any minimizing sequence of $f$ (see [60, 121] and the references mentioned therein).

Corollary 9.3 easily implies the following result.

Corollary 9.4. Suppose that problem (P) is well posed, all the assumptions of Theorem 9.2 hold and that the sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfies (9.17) and (9.18) for all integers $k \ge 0$. Then $\{x_k\}_{k=0}^{\infty}$ converges in the norm topology to a unique minimizer of $f$.

Note that in [60] it was shown that most problems of type (P) (in the sense of Baire category) are well posed.

The results of the chapter were obtained in [120]. The chapter is organized as follows. Section 9.2 contains auxiliary results. Theorem 9.1 is proved in Sect. 9.3 and Theorem 9.2 is proved in Sect. 9.4.

9.2 Auxiliary Results

We use the notation and definitions introduced in Sect. 9.1 and suppose that all the assumptions made in the introduction hold.
Lemma 9.5. Assume that

$$\lambda_k \in [\lambda_1, \lambda_2], \quad k = 0, 1, \dots \tag{9.19}$$

and that a sequence $\{x_k\}_{k=0}^{\infty}$ satisfies

$$f(x_0) \le M, \tag{9.20}$$

$$f(x_{k+1}) + 2^{-1}\lambda_k\|x_{k+1} - x_k\|^2 \le \inf\left(f + 2^{-1}\lambda_k\|\cdot - x_k\|^2\right) + 1 \tag{9.21}$$

for all integers $k \ge 0$. Then $\|x_k\| \le M_0$ for all integers $k \ge 0$.

Proof. Relations (9.20) and (9.6) imply that

$$\|x_0\| \le M_0.$$

Assume that an integer $k \ge 0$ and that

$$\|x_k\| \le M_0. \tag{9.22}$$

It follows from (9.21), (9.19), (9.7), (9.8), (9.3), (9.4), (9.5), and (9.22) that

$$f(x_{k+1}) \le f(x_*) + 2^{-1}\lambda_k\|x_* - x_k\|^2 + 1 \le f(x_*) + 2^{-1}\lambda_2(2M_0)^2 + 1 = \inf(f) + 2\lambda_2 M_0^2 + 1 < M.$$

Together with (9.6) the inequality above implies that $\|x_{k+1}\| \le M_0$. Thus we showed by induction that (9.22) holds for all integers $k \ge 0$. This completes the proof of Lemma 9.5.
Lemma 9.6. Assume that

$$\lambda_k \in [\lambda_1, \lambda_2], \quad k = 0, 1, \dots, \tag{9.23}$$

$\delta_k \in (0,1]$, $k = 0, 1, \dots$, a sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfies

$$f(x_0) \le M \tag{9.24}$$

and that for all integers $k \ge 0$,

$$f(x_{k+1}) + 2^{-1}\lambda_k\|x_{k+1} - x_k\|^2 \le \inf\left(f + 2^{-1}\lambda_k\|\cdot - x_k\|^2\right) + \delta_k. \tag{9.25}$$

Then the following assertions hold.

1. For every integer $k \ge 0$,

$$(2/\lambda_k)(f(x_{k+1}) - f(x_*)) + \|x_{k+1} - x_k\|^2 \le 2\delta_k\lambda_1^{-1} + \|x_k - x_*\|^2 - \|x_{k+1} - x_*\|^2 + 8M_0(\delta_k\lambda_1^{-1})^{1/2}.$$

2. For every pair of natural numbers $m > n$,

$$\sum_{i=n}^{m} 2\lambda_2^{-1}(f(x_i) - f(x_*)) + \sum_{i=n}^{m} \|x_{i-1} - x_i\|^2 \le 4M_0^2 + \sum_{i=n-1}^{m-1}\left[2\lambda_1^{-1}\delta_i + 8M_0(\delta_i\lambda_1^{-1})^{1/2}\right].$$

Proof. It follows from (9.24), (9.25), and Lemma 9.5 that

$$\|x_k\| \le M_0 \text{ for all integers } k \ge 0. \tag{9.26}$$

In view of (9.25), for every integer $k \ge 0$,

$$f(x_{k+1}) \le f(x_k) + \delta_k. \tag{9.27}$$

We will prove Assertion 1. Let $k \ge 0$ be an integer. There exists a point $y_{k+1} \in X$ such that

$$f(y_{k+1}) + 2^{-1}\lambda_k\|y_{k+1} - x_k\|^2 \le f(x) + 2^{-1}\lambda_k\|x - x_k\|^2 \text{ for all } x \in X. \tag{9.28}$$

We estimate $\|x_{k+1} - y_{k+1}\|$. Set

$$z = 2^{-1}(x_{k+1} + y_{k+1}). \tag{9.29}$$

It is easy to see that

$$2^{-1}\|y_{k+1} - x_k\|^2 + 2^{-1}\|x_{k+1} - x_k\|^2 - \|2^{-1}(x_{k+1} + y_{k+1}) - x_k\|^2$$
$$= 2^{-1}\|y_{k+1}\|^2 + 2^{-1}\|x_k\|^2 - \langle y_{k+1}, x_k\rangle + 2^{-1}\|x_{k+1}\|^2 + 2^{-1}\|x_k\|^2 - \langle x_{k+1}, x_k\rangle - \|x_k\|^2 + \langle x_k, x_{k+1} + y_{k+1}\rangle - \|2^{-1}(y_{k+1} + x_{k+1})\|^2$$
$$= 2^{-1}\|y_{k+1}\|^2 + 2^{-1}\|x_{k+1}\|^2 - \|2^{-1}(y_{k+1} + x_{k+1})\|^2 = \|2^{-1}(y_{k+1} - x_{k+1})\|^2. \tag{9.30}$$

In view of (9.29), convexity of the function $f$, (9.30), (9.28), and (9.25),

$$f(z) + 2^{-1}\lambda_k\|z - x_k\|^2 \le 2^{-1}f(x_{k+1}) + 2^{-1}f(y_{k+1}) + 2^{-1}\lambda_k\left(2^{-1}\|y_{k+1} - x_k\|^2 + 2^{-1}\|x_{k+1} - x_k\|^2 - \|2^{-1}(y_{k+1} - x_{k+1})\|^2\right)$$
$$\le \inf\{f(x) + 2^{-1}\lambda_k\|x - x_k\|^2 : x \in X\} + 2^{-1}\delta_k - 2^{-1}\lambda_k\|2^{-1}(y_{k+1} - x_{k+1})\|^2.$$

Combined with (9.23) the inequality above implies that

$$\|2^{-1}(y_{k+1} - x_{k+1})\|^2 \le \delta_k\lambda_1^{-1}$$

and that

$$\|y_{k+1} - x_{k+1}\| \le 2(\delta_k\lambda_1^{-1})^{1/2}. \tag{9.31}$$

Now we estimate $f(x_*) - f(x_{k+1})$. In view of (9.28),

$$0 \in \partial f(y_{k+1}) + \lambda_k(y_{k+1} - x_k)$$

and for every point $u \in X$,

$$f(u) - f(y_{k+1}) \ge \lambda_k\langle x_k - y_{k+1}, u - y_{k+1}\rangle. \tag{9.32}$$

By (9.32), we have

$$f(x_*) - f(y_{k+1}) \ge \lambda_k\langle x_k - y_{k+1}, x_* - y_{k+1}\rangle. \tag{9.33}$$

Relation (9.25) implies that

$$f(x_{k+1}) + 2^{-1}\lambda_k\|x_{k+1} - x_k\|^2 \le f(y_{k+1}) + 2^{-1}\lambda_k\|y_{k+1} - x_k\|^2 + \delta_k$$

and

$$f(y_{k+1}) - f(x_{k+1}) \ge 2^{-1}\lambda_k\left(\|x_{k+1} - x_k\|^2 - \|y_{k+1} - x_k\|^2\right) - \delta_k.$$

Together with (9.33) the relation above implies that

$$f(x_*) - f(x_{k+1}) = f(x_*) - f(y_{k+1}) + f(y_{k+1}) - f(x_{k+1})$$
$$\ge \lambda_k\langle x_k - y_{k+1}, x_* - y_{k+1}\rangle + f(y_{k+1}) - f(x_{k+1})$$
$$= 2^{-1}\lambda_k\left[\|y_{k+1} - x_*\|^2 - \|x_k - x_*\|^2 + \|x_k - y_{k+1}\|^2\right] + f(y_{k+1}) - f(x_{k+1})$$
$$\ge 2^{-1}\lambda_k\left[\|y_{k+1} - x_*\|^2 - \|x_k - x_*\|^2 + \|x_k - x_{k+1}\|^2\right] - \delta_k. \tag{9.34}$$

It follows from (9.28), (9.23), (9.26), (9.7), (9.8), and (9.5) that for all integers $q \ge 1$,

$$f(y_q) \le f(x_*) + 2^{-1}\lambda_2\|x_* - x_{q-1}\|^2 \le f(x_*) + 2\lambda_2 M_0^2 < M.$$

Combined with (9.6) the inequality above implies that

$$\|y_q\| \le M_0, \quad q = 1, 2, \dots \tag{9.35}$$

Now we use (9.34) and (9.35) to obtain an estimate of $f(x_*) - f(x_{k+1})$ without terms which contain $y_{k+1}$. In view of (9.26) and (9.31),

$$\|x_k - y_{k+1}\|^2 = \|(x_k - x_{k+1}) - (y_{k+1} - x_{k+1})\|^2$$
$$= \|x_k - x_{k+1}\|^2 + \|y_{k+1} - x_{k+1}\|^2 - 2\langle x_k - x_{k+1}, y_{k+1} - x_{k+1}\rangle$$
$$\ge \|x_k - x_{k+1}\|^2 - 2\|x_k - x_{k+1}\|\|y_{k+1} - x_{k+1}\| \ge \|x_k - x_{k+1}\|^2 - 8M_0(\delta_k\lambda_1^{-1})^{1/2}. \tag{9.36}$$

By (9.26), (9.7), and (9.31), we have

$$\|y_{k+1} - x_*\|^2 = \|(x_{k+1} - x_*) + (y_{k+1} - x_{k+1})\|^2$$
$$= \|x_{k+1} - x_*\|^2 + \|y_{k+1} - x_{k+1}\|^2 + 2\langle x_{k+1} - x_*, y_{k+1} - x_{k+1}\rangle$$
$$\ge \|x_{k+1} - x_*\|^2 - 8M_0(\delta_k\lambda_1^{-1})^{1/2}. \tag{9.37}$$

Relations (9.34) and (9.37) imply that

$$f(x_*) - f(x_{k+1}) \ge -\delta_k + 2^{-1}\lambda_k\left[\|x_{k+1} - x_*\|^2 - \|x_* - x_k\|^2 + \|x_k - x_{k+1}\|^2 - 8M_0(\delta_k\lambda_1^{-1})^{1/2}\right],$$

$$f(x_{k+1}) - f(x_*) + 2^{-1}\lambda_k\|x_k - x_{k+1}\|^2 \le \delta_k + 2^{-1}\lambda_k\left[\|x_k - x_*\|^2 - \|x_{k+1} - x_*\|^2\right] + 2^{-1}\lambda_k \cdot 8M_0(\delta_k\lambda_1^{-1})^{1/2}$$

and by (9.23),

$$(2/\lambda_k)(f(x_{k+1}) - f(x_*)) + \|x_k - x_{k+1}\|^2 \le 2\delta_k\lambda_1^{-1} + \|x_k - x_*\|^2 - \|x_{k+1} - x_*\|^2 + 8M_0(\delta_k\lambda_1^{-1})^{1/2}.$$

Thus Assertion 1 is proved.


Let us prove Assertion 2. It follows from Assertion 1, (9.23), (9.7), and (9.26) that for all pairs of natural numbers $m > n$,

$$\sum_{i=n}^{m}(2/\lambda_2)(f(x_i) - f(x_*)) + \sum_{i=n}^{m}\|x_{i-1} - x_i\|^2 \le \|x_{n-1} - x_*\|^2 + \sum_{i=n-1}^{m-1}\left[2\delta_i\lambda_1^{-1} + 8M_0(\delta_i\lambda_1^{-1})^{1/2}\right]$$
$$\le 4M_0^2 + \sum_{i=n-1}^{m-1}\left[2\lambda_1^{-1}\delta_i + 8M_0(\delta_i\lambda_1^{-1})^{1/2}\right].$$

Assertion 2 is proved. This completes the proof of Lemma 9.6.

9.3 Proof of Theorem 9.1

It follows from (9.9), (9.10), (9.11), (9.12), (9.13), and Lemma 9.6, applied with a natural number $n$ and $m = n + L$, that

$$\sum_{i=n}^{n+L}(2/\lambda_2)(f(x_i) - f(x_*)) \le 4M_0^2 + \sum_{i=n-1}^{n-1+L}\left[2\lambda_1^{-1}\delta + 8M_0(\delta\lambda_1^{-1})^{1/2}\right]$$
$$\le 4M_0^2 + (L+1)\delta^{1/2}\left[2\lambda_1^{-1} + 8M_0\lambda_1^{-1/2}\right] \le 4M_0^2 + 1. \tag{9.38}$$

Let $n \ge 1$ be an integer. In view of (9.38),

$$(L+1) \cdot 2\lambda_2^{-1}\min\{f(x_i) - f(x_*) : i = n, \dots, n+L\} \le 4M_0^2 + 1$$

and by (9.10),

$$\min\{f(x_i) - f(x_*) : i = n, \dots, n+L\} \le (4M_0^2 + 1)(L+1)^{-1}2^{-1}\lambda_2 < \epsilon/4. \tag{9.39}$$

Since (9.39) holds for any natural number $n$, there exists a strictly increasing sequence of natural numbers $\{S_i\}_{i=1}^{\infty}$ such that

$$S_1 \in \{1, \dots, 1+L\}, \quad S_{i+1} - S_i \in [1, 1+L], \quad i = 1, 2, \dots, \tag{9.40}$$

$$f(x_{S_i}) \le f(x_*) + \epsilon/4, \quad i = 1, 2, \dots. \tag{9.41}$$

Let an integer $j \ge L+1$. In view of (9.40), there is an integer $i \ge 1$ such that

$$S_i \le j < S_{i+1}$$

and

$$j - S_i \le L+1. \tag{9.42}$$

By (9.13), for every integer $k \ge 0$, we have

$$f(x_{k+1}) \le f(x_k) + \delta.$$

Combined with (9.42), (9.41), and (9.11) the inequality above implies that

$$f(x_j) \le f(x_{S_i}) + (L+1)\delta \le f(x_*) + \epsilon/4 + \epsilon/4 \le f(x_*) + \epsilon.$$

Theorem 9.1 is proved.



9.4 Proof of Theorem 9.2

By Theorem 9.1 the following property holds:

(P1) Let a sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfy

$$f(x_0) \le M,$$
$$f(x_{k+1}) + 2^{-1}\lambda_k\|x_{k+1} - x_k\|^2 \le \inf\left(f + 2^{-1}\lambda_k\|\cdot - x_k\|^2\right) + \bar\epsilon, \quad k = 0, 1, \dots.$$

Then

$$f(x_k) \le \inf(f) + 1 \text{ for all integers } k > L.$$

By Theorem 9.1 there exist $\delta \in (0, \bar\epsilon)$ and an integer $L_0 \ge 1$ such that the following property holds:

(P2) For every sequence $\{y_i\}_{i=0}^{\infty} \subset X$ which satisfies

$$f(y_0) \le \inf(f) + 1,$$
$$f(y_{k+1}) + 2^{-1}\lambda_k\|y_{k+1} - y_k\|^2 \le \inf\left(f + 2^{-1}\lambda_k\|\cdot - y_k\|^2\right) + \delta$$

for all integers $k \ge 0$ we have

$$f(y_k) \le \inf(f) + \gamma \text{ for all integers } k \ge L_0. \tag{9.43}$$

(Here $\gamma$ is as in the statement of the theorem.)

In view of (9.16), there exists an integer $L_1 \ge 1$ such that

$$\epsilon_k < \delta \text{ for all natural numbers } k \ge L_1. \tag{9.44}$$

Fix a natural number

$$T_0 > L_0 + L_1 + L. \tag{9.45}$$

Assume that a sequence $\{x_i\}_{i=0}^{\infty} \subset X$ satisfies (9.17) and (9.18). It follows from property (P1), (9.17), (9.18), and (9.16) that

$$f(x_k) \le \inf(f) + 1 \text{ for all integers } k > L. \tag{9.46}$$

For every nonnegative integer $k$ set

$$y_k = x_{k+L+L_1}. \tag{9.47}$$

It follows from (9.47) and (9.46) that

$$f(y_0) \le \inf(f) + 1. \tag{9.48}$$

By (9.18), (9.47), and (9.44), for all nonnegative integers $k$,

$$f(y_{k+1}) + 2^{-1}\lambda_k\|y_{k+1} - y_k\|^2 \le \inf\left(f + 2^{-1}\lambda_k\|\cdot - y_k\|^2\right) + \delta. \tag{9.49}$$

In view of (9.48), (9.49), and property (P2),

$$f(y_k) \le \inf(f) + \gamma \text{ for all integers } k \ge L_0. \tag{9.50}$$

Combined with (9.47) and (9.45) the inequality above implies that

$$f(x_k) \le \inf(f) + \gamma \text{ for all integers } k > T_0.$$

This completes the proof of Theorem 9.2.


Chapter 10
Proximal Point Methods in Metric Spaces

In this chapter we study the local convergence of a proximal point method in a metric space in the presence of computational errors. We show that the proximal point method generates a good approximate solution if the sequence of computational errors is bounded from above by some constant. The principal assumption is a local error bound condition, introduced by Hager and Zhang [55], which relates the growth of an objective function to the distance to the set of minimizers.

10.1 Preliminaries and the Main Results

Let $X$ be a metric space equipped with a metric $d$. For each $x \in X$ and each $r > 0$ set

$$B(x, r) = \{y \in X : d(x, y) \le r\}.$$

For each $x \in X$ and each nonempty set $A \subset X$ set

$$D(x, A) = \inf\{d(x, y) : y \in A\}.$$

For each $g : X \to R^1 \cup \{\infty\}$ put

$$\inf(g) = \inf\{g(x) : x \in X\}.$$

Let $f : X \to R^1 \cup \{\infty\}$ be a lower semicontinuous function which is bounded from below and not identically infinity.
In this chapter we continue our study of proximal point methods which began in the previous chapter. The literature connected with the analysis and development of proximal point methods and based on tools and methods of convex and variational analysis includes [15, 16, 31, 34, 35, 42, 55, 56, 66, 67, 69, 77, 81, 87, 103, 104, 111, 113]. In the proximal point method the iterates $x_k$, $k \ge 0$, are generated by the rule

$$x_{k+1} \in \operatorname{argmin}\{f(x) + 2^{-1}\lambda_k d(x, x_k)^2\},$$

where $\{\lambda_k\}_{k=0}^{\infty}$ is a sequence of positive numbers and $x_0 \in X$ is an initial point. Most results known in the literature establish the convergence of proximal point methods when $X$ is a Hilbert space and the function $f$ is convex. Convergence of proximal point methods for a nonconvex objective function $f$ was established in [55, 56]. In [56] convergence results were obtained for finite-dimensional optimization problems. In a Hilbert space setting, convergence results for a nonconvex objective function were established in [55]. The principal assumption of [55, 56] is a local error bound condition [see (10.4)] which relates the growth of an objective function to the distance to the set of minimizers. Convergence results in Banach spaces were obtained in [7, 40, 61, 62, 100]. Variable-metric methods were used in order to obtain convergence results in [4, 26, 78].

© Springer International Publishing Switzerland 2016
A.J. Zaslavski, Numerical Optimization with Computational Errors, Springer Optimization and Its Applications 108, DOI 10.1007/978-3-319-30921-7_10
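The update rule above can be sketched in code. In the following toy example (ours, not from the text), each proximal step is computed by brute-force minimization over a fixed grid, which automatically yields a $\delta$-approximate minimizer of $z \mapsto f(z) + 2^{-1}\lambda d(z, x_k)^2$ with $\delta$ controlled by the grid spacing; the objective is nonconvex with minimizer set $\{-1, 1\}$ and satisfies a local error bound of the form studied in this chapter.

```python
# Sketch (ours): delta-approximate proximal steps computed by grid search.
# f(x) = min(|x - 1|, |x + 1|)^2 is nonconvex, its minimizer set is {-1, 1},
# and it satisfies a local error bound near each minimizer.

import numpy as np

def prox_step_grid(f, x, lam, grid):
    """Return an (approximate) minimizer of f(z) + (lam/2) * (z - x)**2."""
    vals = f(grid) + 0.5 * lam * (grid - x) ** 2
    return grid[np.argmin(vals)]

f = lambda x: np.minimum(np.abs(x - 1.0), np.abs(x + 1.0)) ** 2
grid = np.linspace(-3.0, 3.0, 60001)    # spacing 1e-4 controls the error delta
x = 0.9                                  # start close to the minimizer +1
for _ in range(50):
    x = prox_step_grid(f, x, lam=0.5, grid=grid)
print(x)   # remains near the minimizer set, as the local analysis predicts
```

Starting close enough to a minimizer, the iterates stay in its neighborhood and the distance to the minimizer set shrinks to the resolution allowed by the per-step error, mirroring the local character of the convergence results below.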
Let $\bar x \in X$ satisfy

$$f(\bar x) = \inf(f). \tag{10.1}$$

Set

$$\Omega = \{x \in X : f(x) = \inf(f)\} \tag{10.2}$$

and let a set $\Omega_0 \subset X$ satisfy

$$\bar x \in \Omega_0 \subset \Omega \cap B(\bar x, \tau) \tag{10.3}$$

with a positive constant $\tau$. Suppose that $\alpha > 0$, $\rho_0 > \tau$ and that

$$f(x) - f(\bar x) \ge \alpha D(x, \Omega_0)^2 \text{ for all } x \in B(\bar x, \rho_0). \tag{10.4}$$

Let

$$\tau \in (0, \rho_0), \quad \beta_0 \in (0, 2\alpha/3), \quad \beta_1 \in (0, \beta_0), \tag{10.5}$$

$$\theta = 2\beta_0(2\alpha - \beta_0)^{-1}. \tag{10.6}$$

By (10.5) and (10.6),

$$\theta < 1. \tag{10.7}$$

For each number $\epsilon$ which satisfies

$$\epsilon \in (0, 1), \quad \epsilon < \tau(1 + (1-\theta)^{-1})^{-1}, \quad \epsilon < (\rho_0 - \tau)/3, \tag{10.8}$$

choose a natural number $k_0(\epsilon)$ which satisfies

$$\theta^{k_0(\epsilon)}\tau < \epsilon/8 \tag{10.9}$$

and a positive number $\delta(\epsilon)$ such that

$$2(k_0(\epsilon))^2(2\delta(\epsilon)\beta_1^{-1})^{1/2} + 4(k_0(\epsilon))^2\delta(\epsilon)\epsilon^{-1}(2\alpha - \beta_0)^{-1} < \rho_0 - \tau, \tag{10.10}$$

$$(2\delta(\epsilon)\beta_1^{-1})^{1/2} < 16^{-1}(k_0(\epsilon))^{-1}(1-\theta)\epsilon, \tag{10.11}$$

$$\delta(\epsilon) < (1-\theta)32^{-1}(k_0(\epsilon))^{-1}\epsilon^2(2\alpha - \beta_0). \tag{10.12}$$
The following theorem, obtained in [124], is the main result of this chapter.

Theorem 10.1. Let a number $\epsilon$ satisfy (10.8), $k_0 = k_0(\epsilon)$ and $\delta = \delta(\epsilon)$. Assume that

$$\{\lambda_k\}_{k=0}^{\infty} \subset [\beta_1, \beta_0], \tag{10.13}$$

a sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfies

$$d(x_0, \bar x)(1 + (1-\theta)^{-1}) < \tau \tag{10.14}$$

and that for all integers $k \ge 0$,

$$f(x_{k+1}) + 2^{-1}\lambda_k d(x_{k+1}, x_k)^2 \le f(z) + 2^{-1}\lambda_k d(z, x_k)^2 + \delta \text{ for all } z \in X. \tag{10.15}$$

Then

$$x_j \in B(\bar x, \rho_0) \text{ for all integers } j \ge 0,$$
$$D(x_j, \Omega_0) \le \epsilon \text{ for all integers } j \ge k_0.$$

It is easy to see that $k_0(\epsilon)$ in (10.9) and $\delta(\epsilon)$ in (10.10)–(10.12) depend on $\epsilon$ monotonically. Namely, $k_0(\epsilon)$ ($\delta(\epsilon)$, respectively) is a decreasing (increasing, respectively) function of $\epsilon$. Clearly, $k_0(\epsilon)$ and $\delta(\epsilon)$ also depend on the parameters $\tau$ and $\beta_0$. The influence of the choice of these parameters on the convergence is not simple. For example, according to (10.14) it is desirable to choose $\tau$ sufficiently close to $\rho_0$. But in view of (10.10), if $\tau$ is close to $\rho_0$, the value $\delta(\epsilon)$ becomes very small. We can choose a small parameter $\beta_0$ [see (10.5)]; in this case, in view of (10.5) and (10.6), $\theta$ and $\beta_1$ are small, so by (10.9) the number of iterations $k_0(\epsilon)$ is not large, but in view of (10.11) $\delta(\epsilon)$ is small.
The following theorem is the second main result of this chapter.

Theorem 10.2. There exists $\bar\delta > 0$ such that for each sequence $\{\delta_i\}_{i=0}^{\infty} \subset (0, \bar\delta]$ satisfying

$$\lim_{i\to\infty}\delta_i = 0 \tag{10.16}$$

the following assertion holds.

Let $\epsilon > 0$. Then there exists a natural number $k_1$ such that for each sequence $\{\lambda_k\}_{k=0}^{\infty} \subset [\beta_1, \beta_0]$ and for each sequence $\{x_k\}_{k=0}^{\infty} \subset X$ which satisfies

$$d(x_0, \bar x)(1 + (1-\theta)^{-1}) < \tau$$

and for all integers $k \ge 0$,

$$f(x_{k+1}) + 2^{-1}\lambda_k d(x_{k+1}, x_k)^2 \le f(z) + 2^{-1}\lambda_k d(z, x_k)^2 + \delta_k \text{ for all } z \in X,$$

the following relations hold:

$$x_j \in B(\bar x, \rho_0) \text{ for all integers } j \ge 0,$$
$$D(x_j, \Omega_0) \le \epsilon \text{ for all integers } j \ge k_1.$$

Theorem 10.2 establishes the convergence of the proximal algorithm in the presence of computational errors $\{\delta_i\}_{i=0}^{\infty}$ such that $\lim_{i\to\infty}\delta_i = 0$, without assuming their summability.
The local error bound condition (10.4) with parameters $\alpha, \rho_0, \tau$, introduced in [55, 56], holds for many functions and can be verified in principle. In the three examples below we show that the parameters $\alpha, \rho_0, \tau$ can be calculated by investigating the function $f$ in some cases. On the other hand, these parameters can be obtained as a result of numerical experiments.
Example 10.3. Let $(X, \langle\cdot,\cdot\rangle)$ be a Hilbert space with an inner product $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$. Assume that $f \in C^2$ is a real-valued convex function on $X$ such that

$$\lim_{\|x\|\to\infty} f(x) = \infty$$

and

$$\inf\{\langle f''(x)h, h\rangle\|h\|^{-2} : x \in X \text{ and } h \in X \setminus \{0\}\} > 0,$$

where $f''(x)$ is the second order Frechet derivative of $f$ at a point $x$. It is not difficult to see that the function $f$ possesses a unique minimizer $\bar x$ and (10.4) holds with $\tau = 1$, $\Omega = \Omega_0 = \{\bar x\}$, any $\rho_0 > 1$ and

$$\alpha = 2^{-1}\inf\{\langle f''(x)h, h\rangle\|h\|^{-2} : x \in X \text{ and } h \in X \setminus \{0\}\}.$$
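For a concrete finite-dimensional instance of Example 10.3 (this check is ours), take the strongly convex quadratic $f(x) = 2^{-1}\langle Qx, x\rangle$ on $R^3$ with a positive definite matrix $Q$: then $f''(x) = Q$ everywhere, the unique minimizer is $\bar x = 0$, and $\alpha$ equals half the smallest eigenvalue of $Q$, so the error bound can be verified numerically at random points.

```python
# Numerical check (ours) of Example 10.3 for f(x) = <Qx, x>/2 on R^3:
# f''(x) = Q everywhere, x_bar = 0, alpha = (min eigenvalue of Q)/2,
# so the error bound f(x) - f(x_bar) >= alpha * ||x - x_bar||^2 must hold.

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
Q = A @ A.T + np.eye(3)                  # symmetric positive definite
alpha = 0.5 * np.linalg.eigvalsh(Q).min()

f = lambda x: 0.5 * x @ Q @ x
for _ in range(1000):
    x = rng.standard_normal(3)
    assert f(x) >= alpha * (x @ x)       # (10.4) with Omega_0 = {0}
print(alpha)
```

Here the bound holds with equality along the eigenvector of the smallest eigenvalue, so the computed $\alpha$ is the best possible constant for this $f$.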

Example 10.4. Let $(X, \langle\cdot,\cdot\rangle)$ be a Hilbert space with an inner product $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$. Assume that $f : X \to R^1 \cup \{\infty\}$ is a lower semicontinuous function, $0 < a < b$, the restriction of $f$ to the set $\{z \in X : \|z\| < b\}$ is a convex function which possesses a continuous second order Frechet derivative $f''(\cdot)$,

$$\inf\{\langle f''(x)h, h\rangle\|h\|^{-2} : x \in X, \|x\| < b \text{ and } h \in X \setminus \{0\}\} > 0$$

and that

$$f(z) > \inf\{f(x) : x \in X\} \text{ for all } z \in X \setminus B(0, a).$$

It is easy to see that there exists a unique minimizer $\bar x$ of $f$, $\|\bar x\| \le a$, and that (10.4) holds with $\Omega = \Omega_0 = \{\bar x\}$, $\rho_0 = (b-a)/2$, any positive constant $\tau < \rho_0$ and

$$\alpha = 2^{-1}\inf\{\langle f''(x)h, h\rangle\|h\|^{-2} : x \in X, \|x\| < b \text{ and } h \in X \setminus \{0\}\}.$$

Example 10.5. Assume that $(X, d)$ is a metric space, $f : X \to R^1 \cup \{\infty\}$ is a lower semicontinuous function which is not identically infinity, $\bar x \in X$ satisfies (10.1), equations (10.2) and (10.3) hold with a positive constant $\tau$, and (10.4) holds with constants $\alpha > 0$ and $\rho_0 > \tau$.

Assume that $T : X \to X$ satisfies

$$c_1 d(x, y) \le d(T(x), T(y)) \le c_2 d(x, y) \text{ for all } x, y \in X, \tag{10.17}$$

where $c_1, c_2$ are positive constants such that

$$c_2 c_1^{-1}\tau < \rho_0. \tag{10.18}$$

Set

$$g(x) = f(T(x)), \quad x \in X. \tag{10.19}$$

It is easy to see that

$$g(T^{-1}(\bar x)) = \inf(g) = \inf(f), \quad T^{-1}(\Omega) = \{z \in X : g(z) = \inf(g)\}. \tag{10.20}$$

Let

$$x \in B(T^{-1}(\bar x), c_2^{-1}\rho_0). \tag{10.21}$$

By (10.17) and (10.21),

$$d(T(x), \bar x) \le c_2 d(x, T^{-1}(\bar x)) \le \rho_0. \tag{10.22}$$

It follows from (10.4), (10.17), (10.19), and (10.22) that

$$g(x) - g(T^{-1}(\bar x)) = f(T(x)) - f(\bar x) \ge \alpha D(T(x), \Omega_0)^2 = \alpha \inf\{d(T(x), z) : z \in \Omega_0\}^2$$
$$\ge \alpha \inf\{c_1 d(x, T^{-1}(z)) : z \in \Omega_0\}^2 \ge \alpha c_1^2 \inf\{d(x, v) : v \in T^{-1}(\Omega_0)\}^2.$$

Thus

$$g(x) - g(T^{-1}(\bar x)) \ge \alpha c_1^2 D(x, T^{-1}(\Omega_0))^2 \text{ for all } x \in B(T^{-1}(\bar x), c_2^{-1}\rho_0).$$

By (10.3), (10.17), and (10.18), for each $z \in T^{-1}(\Omega_0)$,

$$d(z, T^{-1}(\bar x)) \le c_1^{-1}d(T(z), \bar x) \le c_1^{-1}\tau,$$

$$T^{-1}(\Omega_0) \subset T^{-1}(\Omega) \cap B(T^{-1}(\bar x), c_1^{-1}\tau).$$

Note that by (10.18), $c_1^{-1}\tau < c_2^{-1}\rho_0$. Thus the local error bound condition also holds for the function $g$.
Example 10.6. Consider the following constrained minimization problem:

$$\int_0^{2\pi} |x(t) - \sin(t)|\,dt \to \min,$$

where $x : [0, 2\pi] \to R^1$ is an absolutely continuous (a.c.) function such that

$$x(0) = x(2\pi) = 0, \quad |x'(t)| \le 1, \ t \in [0, 2\pi] \text{ almost everywhere (a.e.)}. \tag{10.23}$$

Clearly, this problem possesses a unique solution $\bar x(t) = \sin(t)$, $t \in [0, 2\pi]$. Let us show that this constrained problem is a particular case of the problem considered in this section.

Denote by $X$ the set of all a.c. functions $x : [0, 2\pi] \to R^1$ such that (10.23) holds. For all $x_1, x_2 \in X$ set

$$d(x_1, x_2) = \max\{|x_1(t) - x_2(t)| : t \in [0, 2\pi]\}.$$

Clearly, $(X, d)$ is a metric space. For $x \in X$ put

$$f(x) = \int_0^{2\pi} |x(t) - \sin(t)|\,dt. \tag{10.24}$$

Clearly, the functional $f : X \to R^1$ is continuous. Let

$$x \in X \setminus \{\bar x\}, \quad d(x, \bar x) \le 1. \tag{10.25}$$

By (10.23),

$$|x(t)| \le 2\pi, \quad t \in [0, 2\pi]. \tag{10.26}$$

Clearly, there is $t_0 \in [0, 2\pi]$ such that

$$|x(t_0) - \sin(t_0)| = d(x, \bar x). \tag{10.27}$$

By (10.23) and (10.27), for each $t \in [0, 2\pi]$ satisfying $|t_0 - t| \le d(x, \bar x)/4$,

$$|x(t) - x(t_0)| \le |t - t_0| \le d(x, \bar x)/4,$$
$$|\bar x(t) - \bar x(t_0)| \le |t - t_0| \le d(x, \bar x)/4$$

and

$$|x(t) - \bar x(t)| \ge |x(t_0) - \bar x(t_0)| - |x(t) - x(t_0)| - |\bar x(t) - \bar x(t_0)| \ge d(x, \bar x) - d(x, \bar x)/2.$$

Together with (10.24) and (10.25) this implies that

$$f(x) = \int_0^{2\pi} |x(t) - \bar x(t)|\,dt \ge (d(x, \bar x)/2)\,d(x, \bar x)/4,$$

$$f(x) - f(\bar x) \ge 8^{-1}d(x, \bar x)^2$$

and (10.4) holds with $\rho_0 = 1$, $\alpha = 8^{-1}$, $\tau = 1/2$, $\Omega = \Omega_0 = \{\bar x\}$.
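A quick numeric spot check of the bound just derived (this check is ours): the admissible competitor $x(t) \equiv 0$ satisfies (10.23), has $d(x, \bar x) = \max|\sin t| = 1$ and $f(x) = \int_0^{2\pi}|\sin t|\,dt = 4$, comfortably above $\alpha\, d(x, \bar x)^2 = 1/8$.

```python
# Spot check (ours) of Example 10.6 for the admissible function x(t) = 0:
# d(x, x_bar) = max |sin t| = 1 and f(x) = integral of |sin t| over [0, 2*pi] = 4,
# so f(x) - f(x_bar) >= (1/8) * d(x, x_bar)^2 holds with a large margin.

import numpy as np

t = np.linspace(0.0, 2.0 * np.pi, 200001)
x = np.zeros_like(t)                    # admissible: x(0) = x(2*pi) = 0, |x'| <= 1
xbar = np.sin(t)
d = np.max(np.abs(x - xbar))            # the metric d(x, x_bar)
f_val = np.sum(np.abs(x - xbar)[:-1]) * (t[1] - t[0])   # Riemann sum for f(x)
print(d, f_val)
assert f_val >= (1.0 / 8.0) * d ** 2    # the error bound (10.4)
```

The same check can be repeated for any other function satisfying (10.23); the derivation above guarantees the inequality for all of them, not just this one.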


The results of this chapter were obtained in [124]. The chapter is organized as follows. Section 10.2 contains auxiliary results. Section 10.3 contains the main lemma. Theorem 10.1 is proved in Sect. 10.4. Section 10.5 contains an auxiliary result for Theorem 10.2, which is proved in Sect. 10.6. In Sect. 10.7 we obtain extensions of the main results of the chapter for well-posed minimization problems. In Sect. 10.8 we construct an example of a function $f$ for which, under the conditions of Theorem 10.2, the sequence $\{x_k\}_{k=0}^{\infty}$ does not converge to a point.

10.2 Auxiliary Results

Lemma 10.7. Let $\delta > 0$, $z_0, z_1 \in X$ and $\lambda \in [\beta_1, \beta_0]$ satisfy

$$f(z_1) + 2^{-1}\lambda d(z_1, z_0)^2 \le f(z) + 2^{-1}\lambda d(z, z_0)^2 + \delta \text{ for all } z \in X \tag{10.28}$$

and let

$$x \in \Omega. \tag{10.29}$$

Then

$$d(z_1, z_0) \le d(x, z_0) + (2\delta\lambda^{-1})^{1/2}.$$

Proof. By (10.28), (10.29), and (10.2),

$$f(z_1) + 2^{-1}\lambda d(z_1, z_0)^2 \le f(x) + 2^{-1}\lambda d(x, z_0)^2 + \delta \le f(z_1) + 2^{-1}\lambda d(x, z_0)^2 + \delta.$$

This implies that

$$d(z_1, z_0)^2 \le d(x, z_0)^2 + 2\delta\lambda^{-1}$$

and

$$d(z_0, z_1) \le d(x, z_0) + (2\delta\lambda^{-1})^{1/2}.$$

Lemma 10.7 is proved.


Lemma 10.7 implies the following result.

Lemma 10.8. Let $\delta > 0$, $z_0, z_1 \in X$ and $\lambda \in [\beta_1, \beta_0]$ satisfy

$$f(z_1) + 2^{-1}\lambda d(z_1, z_0)^2 \le f(z) + 2^{-1}\lambda d(z, z_0)^2 + \delta \text{ for all } z \in X.$$

Then

$$d(z_1, z_0) \le D(z_0, \Omega_0) + (2\delta\lambda^{-1})^{1/2}.$$

Lemma 10.9. Let

$$\lambda \in [\beta_1, \beta_0] \tag{10.30}$$

and let $\delta > 0$, $z_0, z_1 \in X$ satisfy

$$z_1 \in B(\bar x, \rho_0), \quad D(z_1, \Omega_0) > 0, \tag{10.31}$$

$$f(z_1) + 2^{-1}\lambda d(z_1, z_0)^2 \le f(z) + 2^{-1}\lambda d(z, z_0)^2 + \delta \text{ for all } z \in X. \tag{10.32}$$

Then

$$D(z_1, \Omega_0) \le 2\delta(D(z_1, \Omega_0))^{-1}(2\alpha - \beta_0)^{-1} + \theta D(z_0, \Omega_0) + (2\delta\lambda^{-1})^{1/2}.$$

Proof. Let

$$x \in \Omega_0. \tag{10.33}$$

By (10.32),

$$f(z_1) + 2^{-1}\lambda d(z_1, z_0)^2 \le f(x) + 2^{-1}\lambda d(x, z_0)^2 + \delta.$$

Together with (10.33), (10.3), and (10.2) this implies that

$$f(z_1) - f(\bar x) \le 2^{-1}\lambda(d(x, z_0)^2 - d(z_1, z_0)^2) + \delta \le \delta + 2^{-1}\lambda d(x, z_1)(2d(z_1, z_0) + d(x, z_1)).$$

Since the inequality above holds for any $x \in \Omega_0$, we obtain that

$$f(z_1) - f(\bar x) \le \delta + 2^{-1}\lambda D(z_1, \Omega_0)(2d(z_1, z_0) + D(z_1, \Omega_0)). \tag{10.34}$$

By (10.31) and (10.4),

$$f(z_1) - f(\bar x) \ge \alpha D(z_1, \Omega_0)^2.$$

Together with (10.34) this implies that

$$\alpha D(z_1, \Omega_0)^2 \le \delta + 2^{-1}\lambda D(z_1, \Omega_0)(2d(z_1, z_0) + D(z_1, \Omega_0)).$$

The inequality above implies that

$$(\alpha - 2^{-1}\lambda)(D(z_1, \Omega_0))^2 \le \delta + \lambda D(z_1, \Omega_0)d(z_1, z_0).$$

Combined with (10.30) and (10.31) this implies that

$$(\alpha - 2^{-1}\beta_0)D(z_1, \Omega_0) \le \beta_0 d(z_1, z_0) + \delta(D(z_1, \Omega_0))^{-1}.$$

By the inequality above and (10.5),

$$D(z_1, \Omega_0) \le \beta_0(\alpha - 2^{-1}\beta_0)^{-1}d(z_1, z_0) + \delta(D(z_1, \Omega_0))^{-1}(\alpha - 2^{-1}\beta_0)^{-1}. \tag{10.35}$$

By Lemma 10.8, (10.30), and (10.32),

$$d(z_1, z_0) \le D(z_0, \Omega_0) + (2\delta\lambda^{-1})^{1/2}. \tag{10.36}$$

In view of (10.35), (10.36), (10.6), and (10.7),

$$D(z_1, \Omega_0) \le 2\delta(D(z_1, \Omega_0))^{-1}(2\alpha - \beta_0)^{-1} + \theta D(z_0, \Omega_0) + \theta(2\delta\lambda^{-1})^{1/2}$$
$$\le 2\delta(D(z_1, \Omega_0))^{-1}(2\alpha - \beta_0)^{-1} + \theta D(z_0, \Omega_0) + (2\delta\lambda^{-1})^{1/2}.$$

Lemma 10.9 is proved.

10.3 The Main Lemma

Lemma 10.10. Let a number $\epsilon$ satisfy (10.8), $k_0 = k_0(\epsilon)$, $\delta = \delta(\epsilon)$, a sequence $\{\lambda_k\}_{k=0}^{\infty}$ satisfy (10.13) and let a sequence $\{y_k\}_{k=0}^{\infty} \subset X$ satisfy

$$d(y_0, \bar x)(1 + (1-\theta)^{-1}) < \tau \tag{10.37}$$

and for all integers $k \ge 0$,

$$f(y_{k+1}) + 2^{-1}\lambda_k d(y_{k+1}, y_k)^2 \le f(z) + 2^{-1}\lambda_k d(z, y_k)^2 + \delta \text{ for all } z \in X. \tag{10.38}$$

Then there exists an integer $k \in [0, k_0]$ such that

$$d(y_j, \bar x) \le \rho_0, \quad j = 0, \dots, k, \tag{10.39}$$

$$D(y_k, \Omega_0) \le \epsilon/2. \tag{10.40}$$

Proof. For all integers $j \ge 0$ set

$$\gamma_j = 4j\delta\epsilon^{-1}(2\alpha - \beta_0)^{-1} + j(2\delta\beta_1^{-1})^{1/2}. \tag{10.41}$$

Assume that an integer $k$ satisfies

$$0 \le k < k_0 \tag{10.42}$$

and that for all integers $j = 0, \dots, k$,

$$y_j \in B(\bar x, \rho_0), \quad D(y_j, \Omega_0) \le \theta^j D(y_0, \Omega_0) + \gamma_j. \tag{10.43}$$

(Clearly, for $k = 0$ this assumption holds.)

If there is an integer $k_1 \in [0, k+1]$ such that $D(y_{k_1}, \Omega_0) \le \epsilon/2$, then the assertion of the lemma holds. Therefore we may assume without loss of generality that

$$D(y_j, \Omega_0) > \epsilon/2 \text{ for all integers } j \in [0, k+1]. \tag{10.44}$$

For all integers $j = 0, \dots, k$, it follows from (10.13), (10.38), and Lemma 10.8, applied with $z_0 = y_j$, $z_1 = y_{j+1}$, $\lambda = \lambda_j$, that

$$d(y_j, y_{j+1}) \le D(y_j, \Omega_0) + (2\delta\lambda_j^{-1})^{1/2}. \tag{10.45}$$

By (10.45) and (10.43),

$$d(y_{k+1}, y_0) \le \sum_{j=0}^{k} d(y_{j+1}, y_j) \le \sum_{j=0}^{k}\left[D(y_j, \Omega_0) + (2\delta\lambda_j^{-1})^{1/2}\right]$$
$$\le \sum_{j=0}^{k}\left[\theta^j D(y_0, \Omega_0) + \gamma_j + (2\delta\lambda_j^{-1})^{1/2}\right]$$
$$\le (1-\theta)^{-1}D(y_0, \Omega_0) + \sum_{j=0}^{k}\gamma_j + (k+1)(2\delta\beta_1^{-1})^{1/2}. \tag{10.46}$$

By (10.46), (10.37), (10.3), (10.41), (10.42), and (10.10),

$$d(y_{k+1}, \bar x) \le d(y_{k+1}, y_0) + d(y_0, \bar x) \le d(y_0, \bar x)(1 + (1-\theta)^{-1}) + \sum_{j=0}^{k}\gamma_j + (k+1)(2\delta\beta_1^{-1})^{1/2}$$
$$\le \tau + (k+1)\gamma_k + (k+1)(2\delta\beta_1^{-1})^{1/2}$$
$$\le \tau + 4k_0^2\delta\epsilon^{-1}(2\alpha - \beta_0)^{-1} + k_0^2(2\delta\beta_1^{-1})^{1/2} + k_0(2\delta\beta_1^{-1})^{1/2} < \rho_0. \tag{10.47}$$

By (10.13), (10.47), (10.44), (10.38), (10.43), (10.7), (10.41), and Lemma 10.9, applied with $z_0 = y_k$, $z_1 = y_{k+1}$, $\lambda = \lambda_k$,

$$D(y_{k+1}, \Omega_0) \le 2\delta(D(y_{k+1}, \Omega_0))^{-1}(2\alpha - \beta_0)^{-1} + \theta D(y_k, \Omega_0) + (2\delta\lambda_k^{-1})^{1/2}$$
$$\le 4\delta\epsilon^{-1}(2\alpha - \beta_0)^{-1} + \theta D(y_k, \Omega_0) + (2\delta\beta_1^{-1})^{1/2}$$
$$\le 4\delta\epsilon^{-1}(2\alpha - \beta_0)^{-1} + \theta(\theta^k D(y_0, \Omega_0) + \gamma_k) + (2\delta\beta_1^{-1})^{1/2} \le \theta^{k+1}D(y_0, \Omega_0) + \gamma_{k+1}. \tag{10.48}$$

In view of (10.47) and (10.48) we conclude that (10.43) holds for all $j = 0, \dots, k+1$. Thus by induction we have shown that at least one of the following cases holds:

(i) there is an integer $k \in [0, k_0]$ such that (10.39) and (10.40) hold;
(ii) (10.43) holds for all $j = 0, \dots, k_0$.

In the case (i) the assertion of the lemma holds. Assume that the case (ii) holds. By (10.43) with $j = k_0$, (10.3), (10.37), (10.41), (10.9), (10.11), and (10.12),

$$D(y_{k_0}, \Omega_0) \le \theta^{k_0}D(y_0, \Omega_0) + \gamma_{k_0} \le \theta^{k_0}\tau + 4k_0\delta\epsilon^{-1}(2\alpha - \beta_0)^{-1} + k_0(2\delta\beta_1^{-1})^{1/2} < \epsilon/2$$

and the assertion of the lemma holds. This completes the proof of Lemma 10.10.

10.4 Proof of Theorem 10.1

By Lemma 10.10 there is an integer $k \in [0, k_0]$ such that

$$d(x_j, \bar x) \le \rho_0, \quad j = 0, \dots, k, \tag{10.49}$$

$$D(x_k, \Omega_0) \le \epsilon/2. \tag{10.50}$$

We show that for all integers $j \ge k$,

$$D(x_j, \Omega_0) \le \epsilon. \tag{10.51}$$

This will complete the proof of the theorem.

Assume that an integer $j \ge k$ satisfies (10.51). In view of (10.3) and (10.8), in order to complete the proof it is sufficient to show that

$$D(x_{j+1}, \Omega_0) \le \epsilon.$$

We may assume without loss of generality that

$$D(x_{j+1}, \Omega_0) > \epsilon/2. \tag{10.52}$$

By Lemma 10.8 applied with $z_0 = x_j$, $z_1 = x_{j+1}$, $\lambda = \lambda_j$, (10.13), (10.15), and (10.51),

$$d(x_j, x_{j+1}) \le D(x_j, \Omega_0) + (2\delta\lambda_j^{-1})^{1/2} \le \epsilon + (2\delta\beta_1^{-1})^{1/2}.$$

Together with (10.51) and (10.11) this implies that

$$D(x_{j+1}, \Omega_0) \le d(x_{j+1}, x_j) + D(x_j, \Omega_0) \le 2\epsilon + (2\delta\beta_1^{-1})^{1/2} < 3\epsilon.$$

Together with (10.3) and (10.8) this implies that

$$d(x_{j+1}, \bar x) < \rho_0. \tag{10.53}$$

By (10.13), (10.53), (10.31), (10.15), (10.52), (10.51), (10.11), (10.12), and Lemma 10.9 applied to $z_0 = x_j$, $z_1 = x_{j+1}$, $\lambda = \lambda_j$,

$$D(x_{j+1}, \Omega_0) \le 2\delta(D(x_{j+1}, \Omega_0))^{-1}(2\alpha - \beta_0)^{-1} + \theta D(x_j, \Omega_0) + (2\delta\lambda_j^{-1})^{1/2}$$
$$\le 4\delta\epsilon^{-1}(2\alpha - \beta_0)^{-1} + \theta\epsilon + (2\delta\beta_1^{-1})^{1/2} < \epsilon.$$

This completes the proof of Theorem 10.1.

10.5 An Auxiliary Result for Theorem 10.2

Let

$$\epsilon \in (0, 1), \tag{10.54}$$

a natural number $k_0$ satisfy

$$\theta^{k_0}\rho_0 < \epsilon/8 \tag{10.55}$$

and let a positive number $\delta$ satisfy

$$4k_0\delta\epsilon^{-1}(2\alpha - \beta_0)^{-1} < (1-\theta)\epsilon/8, \tag{10.56}$$

$$k_0(2\delta\beta_1^{-1})^{1/2} < (1-\theta)\epsilon/8. \tag{10.57}$$

Proposition 10.11. Assume that

$$\{\lambda_k\}_{k=0}^{\infty} \subset [\beta_1, \beta_0], \tag{10.58}$$

a sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfies

$$x_k \in B(\bar x, \rho_0) \text{ for all integers } k \ge 0 \tag{10.59}$$

and that for all integers $k \ge 0$,

$$f(x_{k+1}) + 2^{-1}\lambda_k d(x_{k+1}, x_k)^2 \le f(z) + 2^{-1}\lambda_k d(z, x_k)^2 + \delta \text{ for all } z \in X. \tag{10.60}$$

Then

$$D(x_j, \Omega_0) \le \epsilon \text{ for all integers } j \ge k_0. \tag{10.61}$$

Proof. For all integers $j = 0, 1, \dots$ put

$$\gamma_j = j\left(2\delta\epsilon^{-1}(2\alpha - \beta_0)^{-1} + (2\delta\beta_1^{-1})^{1/2}\right). \tag{10.62}$$

Assume that an integer $j \ge 0$ and that

$$D(x_j, \Omega_0) \le \epsilon. \tag{10.63}$$

We show that $D(x_{j+1}, \Omega_0) \le \epsilon$. We may assume without loss of generality that

$$D(x_{j+1}, \Omega_0) > \epsilon/2. \tag{10.64}$$

By (10.58), (10.60), (10.64), (10.59), Lemma 10.9 applied with $z_0 = x_j$, $z_1 = x_{j+1}$, $\lambda = \lambda_j$, (10.63), (10.56), and (10.57),

$$D(x_{j+1}, \Omega_0) \le 2\delta(D(x_{j+1}, \Omega_0))^{-1}(2\alpha - \beta_0)^{-1} + \theta D(x_j, \Omega_0) + (2\delta\lambda_j^{-1})^{1/2}$$
$$\le 4\delta\epsilon^{-1}(2\alpha - \beta_0)^{-1} + \theta\epsilon + (2\delta\beta_1^{-1})^{1/2} < \epsilon.$$

Thus we have shown that if an integer $j \ge 0$ satisfies (10.63), then

$$D(x_{j+1}, \Omega_0) \le \epsilon.$$

Therefore in order to prove the proposition it is sufficient to show that (10.63) holds with some integer $j \in [0, k_0]$. Assume the contrary. Thus

$$D(x_j, \Omega_0) > \epsilon \text{ for all integers } j \in [0, k_0]. \tag{10.65}$$

Assume that an integer $k$ satisfies

$$0 \le k < k_0, \tag{10.66}$$

$$D(x_j, \Omega_0) \le \theta^j D(x_0, \Omega_0) + \gamma_j, \quad j = 0, \dots, k. \tag{10.67}$$

(Clearly, for $k = 0$ this assumption holds.)

By (10.58), (10.59), (10.65), (10.60), (10.67), (10.7), (10.62), (10.66), and Lemma 10.9 applied with $z_0 = x_k$, $z_1 = x_{k+1}$, $\lambda = \lambda_k$,

$$D(x_{k+1}, \Omega_0) \le 2\delta(D(x_{k+1}, \Omega_0))^{-1}(2\alpha - \beta_0)^{-1} + \theta D(x_k, \Omega_0) + (2\delta\lambda_k^{-1})^{1/2}$$
$$\le 2\delta\epsilon^{-1}(2\alpha - \beta_0)^{-1} + \theta^{k+1}D(x_0, \Omega_0) + \theta\gamma_k + (2\delta\beta_1^{-1})^{1/2} \le \theta^{k+1}D(x_0, \Omega_0) + \gamma_{k+1}.$$

Thus (10.67) holds for $j = k+1$. By induction we have shown that (10.67) holds for all $j = 0, \dots, k_0$. Together with (10.59), (10.55), (10.56), (10.57), and (10.62) this implies that

$$D(x_{k_0}, \Omega_0) \le \theta^{k_0}D(x_0, \Omega_0) + \gamma_{k_0} \le \theta^{k_0}\rho_0 + k_0\left(2\delta\epsilon^{-1}(2\alpha - \beta_0)^{-1} + (2\delta\beta_1^{-1})^{1/2}\right) < \epsilon.$$

This contradicts (10.65). The contradiction we have reached proves Proposition 10.11.

10.6 Proof of Theorem 10.2

By Theorem 10.1 there exists $\bar\delta > 0$ such that the following property holds:

(P1) For each sequence

$$\{\lambda_k\}_{k=0}^{\infty} \subset [\beta_1, \beta_0] \tag{10.68}$$

and each sequence $\{x_k\}_{k=0}^{\infty} \subset X$ which satisfies

$$d(x_0, \bar x)(1 + (1-\theta)^{-1}) < \tau \tag{10.69}$$

and for all integers $k = 0, 1, \dots$,

$$f(x_{k+1}) + 2^{-1}\lambda_k d(x_{k+1}, x_k)^2 \le f(z) + 2^{-1}\lambda_k d(z, x_k)^2 + \bar\delta \text{ for all } z \in X,$$

we have

$$x_j \in B(\bar x, \rho_0) \text{ for all integers } j \ge 0. \tag{10.70}$$

Assume that

$$\{\delta_i\}_{i=0}^{\infty} \subset (0, \bar\delta] \text{ and } \lim_{i\to\infty}\delta_i = 0 \tag{10.71}$$

and $\epsilon > 0$. By Proposition 10.11 there are $\hat\delta > 0$ and a natural number $q_1$ such that the following property holds:

(P2) Assume that (10.68) holds, a sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfies (10.70) and that for all integers $k \ge 0$,

$$f(x_{k+1}) + 2^{-1}\lambda_k d(x_{k+1}, x_k)^2 \le f(z) + 2^{-1}\lambda_k d(z, x_k)^2 + \hat\delta \text{ for all } z \in X.$$

Then $D(x_j, \Omega_0) \le \epsilon$ for all integers $j \ge q_1$.

By (10.71) there exists a natural number $q_2$ such that

$$\delta_j < \hat\delta \text{ for all integers } j \ge q_2. \tag{10.72}$$

Set

$$k_1 = q_1 + q_2. \tag{10.73}$$

Assume that (10.68) holds, a sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfies (10.69) and that for all integers $k \ge 0$,

$$f(x_{k+1}) + 2^{-1}\lambda_k d(x_k, x_{k+1})^2 \le f(z) + 2^{-1}\lambda_k d(z, x_k)^2 + \delta_k \text{ for all } z \in X. \tag{10.74}$$

Equations (10.68), (10.69), (10.74), (10.71), and (P1) imply (10.70).

By (10.68), (10.70), (10.72), and (10.74) we can apply (P2) to the sequence $\{x_{q_2+j}\}_{j=0}^{\infty}$ and obtain that $D(x_j, \Omega_0) \le \epsilon$ for all integers $j \ge q_1 + q_2 = k_1$. Theorem 10.2 is proved.

10.7 Well-Posed Minimization Problems

We use the notation and definitions from Sect. 10.1. Suppose that

$$\Omega = \{\bar x\}$$

and that for each sequence $\{z_i\}_{i=0}^{\infty} \subset X$ satisfying $\lim_{i\to\infty} f(z_i) = \inf(f)$ we have

$$\lim_{i\to\infty} d(z_i, \bar x) = 0.$$

In other words, the problem $f(x) \to \min$, $x \in X$, is well posed in the sense of [121]. Fix $M > 1 + \rho_0$.
Proposition 10.12. There exist $\lambda_*, \delta_* > 0$ such that for each $\lambda \in (0, \lambda_*]$, each $z_0 \in B(\bar x, M)$ and each $z_1 \in X$ satisfying

$$f(z_1) + 2^{-1}\lambda d(z_1, z_0)^2 \le f(z) + 2^{-1}\lambda d(z, z_0)^2 + \delta_* \text{ for all } z \in X \tag{10.75}$$

the inequality

$$d(z_1, \bar x) \le 2^{-1}\tau(1 + (1-\theta)^{-1})^{-1} \tag{10.76}$$

holds.

Proof. Since the problem $f(z) \to \min$, $z \in X$, is well posed there is $\delta > 0$ such that

$$\text{if } z \in X \text{ satisfies } f(z) \le \inf(f) + \delta, \text{ then } d(z, \bar x) \le 2^{-1}\tau(1 + (1-\theta)^{-1})^{-1}. \tag{10.77}$$

Choose positive numbers

$$\lambda_* < (M^2 + 1)^{-1}\delta, \quad \delta_* \in (0, \delta/2). \tag{10.78}$$

Let

$$\lambda \in (0, \lambda_*], \quad z_0 \in B(\bar x, M) \tag{10.79}$$

and let $z_1 \in X$ satisfy (10.75). By (10.79), (10.78), and (10.75),

$$f(z_1) \le f(z_1) + 2^{-1}\lambda d(z_1, z_0)^2 \le f(\bar x) + 2^{-1}\lambda d(\bar x, z_0)^2 + \delta_* \le \inf(f) + 2^{-1}\lambda_* M^2 + \delta_* \le \inf(f) + \delta.$$

Together with (10.77) this implies (10.76). Proposition 10.12 is proved.
Let $\lambda_*, \delta_* > 0$ be as guaranteed by Proposition 10.12. We suppose that

$$\beta_0 \le \lambda_*. \tag{10.80}$$

We may assume without loss of generality that

$$\delta(\epsilon) \in (0, \delta_*] \text{ for all } \epsilon \in (0, 1). \tag{10.81}$$

Theorem 10.13. Let a number $\epsilon$ satisfy

$$\epsilon \in (0, 1), \quad \epsilon < 2^{-1}\tau(1 + (1-\theta)^{-1})^{-1}, \quad \epsilon < (\rho_0 - \tau)/3,$$

$k_0 = k_0(\epsilon)$ and let a positive number $\delta = \delta(\epsilon)$. Assume that

$$\{\lambda_k\}_{k=0}^{\infty} \subset [\beta_1, \beta_0], \tag{10.82}$$

a sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfies

$$d(x_0, \bar x) \le M \tag{10.83}$$

and that for all integers $k \ge 0$,

$$f(x_{k+1}) + 2^{-1}\lambda_k d(x_{k+1}, x_k)^2 \le f(z) + 2^{-1}\lambda_k d(z, x_k)^2 + \delta \text{ for all } z \in X. \tag{10.84}$$

Then

$$x_j \in B(\bar x, \rho_0) \text{ for all integers } j \ge 1,$$
$$x_j \in B(\bar x, \epsilon) \text{ for all integers } j \ge k_0 + 1.$$

Proof. By the choice of $\lambda_*$ and $\delta_*$, Proposition 10.12, (10.84), (10.82), (10.80), (10.83), and (10.81),

$$d(x_1, \bar x) \le 2^{-1}\tau(1 + (1-\theta)^{-1})^{-1}.$$

Since $\epsilon$ can be an arbitrarily small positive number, the assertion of the theorem now follows from Theorem 10.1.
Theorem 10.14. Let $\bar\delta > 0$ be as guaranteed by Theorem 10.2,

$$\hat\delta = \min\{\bar\delta, \delta_*\}, \tag{10.85}$$

$$\{\delta_i\}_{i=0}^{\infty} \subset (0, \hat\delta], \quad \lim_{i\to\infty}\delta_i = 0, \tag{10.86}$$

$\epsilon > 0$ and let a natural number $k_1$ be as guaranteed by Theorem 10.2 with the sequence $\{\delta_{i+1}\}_{i=0}^{\infty}$. Assume that

$$\{\lambda_k\}_{k=0}^{\infty} \subset [\beta_1, \beta_0], \tag{10.87}$$

a sequence $\{x_k\}_{k=0}^{\infty} \subset X$ satisfies

$$d(x_0, \bar x) \le M \tag{10.88}$$

and that for all integers $k \ge 0$,

$$f(x_{k+1}) + 2^{-1}\lambda_k d(x_{k+1}, x_k)^2 \le f(z) + 2^{-1}\lambda_k d(z, x_k)^2 + \delta_k \text{ for all } z \in X. \tag{10.89}$$

Then

$$x_j \in B(\bar x, \rho_0) \text{ for all integers } j \ge 1,$$
$$x_j \in B(\bar x, \epsilon) \text{ for all integers } j \ge k_1 + 1.$$

Proof. By the choice of $\lambda_*$ and $\delta_*$, Proposition 10.12, (10.88), (10.86), (10.85), (10.87), and (10.80),

$$d(x_1, \bar x) \le 2^{-1}\tau(1 + (1-\theta)^{-1})^{-1}.$$

The assertion of the theorem now follows from Theorem 10.2.

10.8 An Example

Let $X = R^n$ be equipped with the Euclidean norm $\|\cdot\|$ which induces the metric $d(x, y) = \|x - y\|$, $x, y \in R^n$. Set

$$\Omega = B(0, 1),$$

$$f(x) = D(x, B(0, 1))^2, \quad x \in R^n.$$

Clearly, all the assumptions made in Sect. 10.1 hold with $\bar x = 0$, $\Omega = \Omega_0 = B(0, 1)$, $\tau = 1$, $\alpha = 1$ and any positive constant $\rho_0 > 1$. Thus Theorems 10.1 and 10.2 hold for the function $f$.
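For this $f$ the proximal step can be computed in closed form, which makes the example easy to experiment with. In the sketch below (ours, not from the text), for $\|x\| > 1$ the minimizer of $z \mapsto f(z) + 2^{-1}\lambda\|z - x\|^2$ lies on the ray through $x$ and has norm $(2 + \lambda\|x\|)/(2 + \lambda)$, so iterating drives the point toward the minimizer set $B(0, 1)$.

```python
# Sketch (ours) for Sect. 10.8: exact proximal steps for f(x) = D(x, B(0,1))^2.
# For ||x|| > 1 the minimizer of f(z) + (lam/2)||z - x||^2 is radial with
# norm r = (2 + lam * ||x||) / (2 + lam); points of B(0,1) are fixed points.

import numpy as np

def prox(x, lam):
    nx = np.linalg.norm(x)
    if nx <= 1.0:
        return x.copy()                  # x already minimizes f
    r = (2.0 + lam * nx) / (2.0 + lam)
    return x * (r / nx)

x = np.array([3.0, 0.0])
for _ in range(40):
    x = prox(x, lam=0.5)
print(np.linalg.norm(x))   # approaches 1: the iterates approach Omega = B(0, 1)
```

With exact steps the iterates converge to a single boundary point; the proposition below shows that with nonsummable errors the iterates can instead wander over the whole minimizer set.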
We prove the following result.
Proposition 10.15. Assume that $\lambda > 0$, a sequence $\{\delta_i\}_{i=0}^{\infty} \subset (0, 1]$ satisfies

$$\sum_{i=0}^{\infty}\delta_i^{1/2} = \infty \tag{10.90}$$

and that $x_0 \in B(0, 1)$. Then there exists a sequence $\{x_k\}_{k=0}^{\infty} \subset B(0, 1)$ such that for all integers $k \ge 0$,

$$f(x_{k+1}) + 2^{-1}\lambda\|x_{k+1} - x_k\|^2 \le f(z) + 2^{-1}\lambda\|z - x_k\|^2 + \delta_k \text{ for all } z \in R^n \tag{10.91}$$

and that for all $z \in B(0, 1)$,

$$\liminf_{k\to\infty}\|x_k - z\| = 0.$$

Proposition 10.15 easily follows from the following auxiliary result.

Lemma 10.16. Assume that $\lambda > 0$, a sequence $\{\varepsilon_i\}_{i=0}^{\infty} \subset (0, 1]$ satisfies (10.90) and that $y_0, y_1 \in B(0, 1)$. Then there exist a natural number $q$ and a sequence $\{x_k\}_{k=0}^{q} \subset B(0, 1)$ such that $x_0 = y_0$, $x_q = y_1$ and that for all integers $k \in [0, q-1]$, Eq. (10.91) holds.

Proof. Set

$F = \{t y_1 + (1 - t) y_0 : t \in [0, 1]\}.$

Set $t_0 = 0$ and for all integers $i \ge 0$

$t_{i+1} = \min\{t_i + 2^{-1}(2\varepsilon_i\lambda^{-1})^{1/2}, 1\}.$   (10.92)

By (10.90) and (10.92) there exists a natural number $q$ such that

$t_q = 1$, $\quad t_i < 1$ for all nonnegative integers $i < q.$   (10.93)

For any integer $k \in [0, q]$ set

$x_k = t_k y_1 + (1 - t_k) y_0.$   (10.94)

Clearly,

$\{x_k\}_{k=0}^{q} \subset F \subset B(0, 1).$   (10.95)

Let an integer $k$ satisfy $0 \le k < q$. By (10.92), (10.94) and (10.95), since $f$ vanishes on $B(0, 1)$,

$f(x_{k+1}) + 2^{-1}\lambda\|x_{k+1} - x_k\|^2 = 2^{-1}\lambda\|t_{k+1}y_1 + (1 - t_{k+1})y_0 - (t_k y_1 + (1 - t_k)y_0)\|^2$
$= 2^{-1}\lambda\|(t_{k+1} - t_k)(y_1 - y_0)\|^2 \le 2\lambda(t_{k+1} - t_k)^2 \le \varepsilon_k.$

This implies (10.91) and completes the proof of Lemma 10.16.
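As a sanity check, the construction in the proof of Lemma 10.16 can be run numerically. The sketch below is an illustration only; the endpoints, $\lambda$, and the tolerances $\varepsilon_i$ are chosen arbitrarily. It builds the partition (10.92) for $f(x) = d(x, B(0,1))^2$ in the plane and verifies that every step satisfies the inexact proximal inequality (10.91).

```python
import math

# f(x) = d(x, B(0,1))^2 vanishes on the unit ball, so along a segment joining
# two points of the ball the inexact proximal inequality (10.91) reduces to
# (lambda/2)*||x_{k+1} - x_k||^2 <= eps_k   (take z = x_k on the right-hand side).

lam = 0.5
eps = [0.01 / (i + 1) for i in range(100000)]   # sum of eps_i^(1/2) diverges

y0, y1 = (-1.0, 0.0), (1.0, 0.0)                # endpoints in B(0,1)

# Partition of [0,1] from (10.92): t_{i+1} = min{t_i + (1/2)(2 eps_i / lam)^(1/2), 1}
ts = [0.0]
while ts[-1] < 1.0:
    i = len(ts) - 1
    ts.append(min(ts[-1] + 0.5 * math.sqrt(2.0 * eps[i] / lam), 1.0))

# Points (10.94): x_k = t_k*y1 + (1 - t_k)*y0 stay on the segment, inside B(0,1)
xs = [tuple(t * a + (1.0 - t) * b for a, b in zip(y1, y0)) for t in ts]

def step_ok(k):
    dx = math.dist(xs[k + 1], xs[k])
    return 0.5 * lam * dx * dx <= eps[k] + 1e-12    # (10.91) with z = x_k

all_steps_ok = all(step_ok(k) for k in range(len(xs) - 1))
print(len(xs) - 1, all_steps_ok)   # a finite q, with every step admissible
```

Because the step lengths are square roots of the tolerances, the divergence of $\sum\varepsilon_i^{1/2}$ is exactly what lets the partition reach $t_q = 1$ in finitely many steps.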


Chapter 11
Maximal Monotone Operators and the Proximal
Point Algorithm

In a finite-dimensional Euclidean space, we study the convergence of a proximal point method to a solution of the inclusion induced by a maximal monotone operator, in the presence of computational errors. The convergence of the method is established for nonsummable computational errors. We show that the proximal point method generates a good approximate solution if the sequence of computational errors is bounded from above by a constant.

11.1 Preliminaries and the Main Results

Let $\mathbb{R}^n$ be the $n$-dimensional Euclidean space equipped with an inner product $\langle\cdot,\cdot\rangle$ which induces the norm $\|\cdot\|$.
A multifunction $T: \mathbb{R}^n \to 2^{\mathbb{R}^n}$ is called a monotone operator if

$\langle z - z', w - w'\rangle \ge 0 \quad \forall z, z', w, w' \in \mathbb{R}^n$ such that $w \in T(z)$ and $w' \in T(z').$   (11.1)

It is called maximal monotone if, in addition, the graph

$\{(z, w) \in \mathbb{R}^n \times \mathbb{R}^n : w \in T(z)\}$

is not strictly contained in the graph of any other monotone operator $T': \mathbb{R}^n \to 2^{\mathbb{R}^n}$.
A fundamental problem consists in determining an element $z$ such that $0 \in T(z)$. The proximal point algorithm is an important tool for solving this problem. This algorithm has been studied extensively because of its role in convex analysis and optimization. See, for example, [15–17, 31, 34, 36, 53, 55, 69, 81–83, 87, 103, 104, 106, 107, 111, 113] and the references mentioned therein.

© Springer International Publishing Switzerland 2016
A.J. Zaslavski, Numerical Optimization with Computational Errors, Springer Optimization and Its Applications 108, DOI 10.1007/978-3-319-30921-7_11

Let $T: \mathbb{R}^n \to 2^{\mathbb{R}^n}$ be a maximal monotone operator. The algorithm for solving the inclusion $0 \in T(z)$ is based on the fact established by Minty [82], who showed that, for each $z \in \mathbb{R}^n$ and each $c > 0$, there is a unique $u \in \mathbb{R}^n$ such that

$z \in (I + cT)(u),$

where $I: \mathbb{R}^n \to \mathbb{R}^n$ is the identity operator ($Ix = x$ for all $x \in \mathbb{R}^n$).
The operator

$P_c := (I + cT)^{-1}$   (11.2)

is therefore single-valued from all of $\mathbb{R}^n$ onto $\mathbb{R}^n$ (where $c$ is any positive number). It is also nonexpansive:

$\|P_c(z) - P_c(z')\| \le \|z - z'\|$ for all $z, z' \in \mathbb{R}^n$   (11.3)

and $P_c(z) = z$ if and only if $0 \in T(z)$. Following the terminology of Moreau [87], $P_c$ is called the proximal mapping associated with $cT$.
The proximal point algorithm generates, for any given sequence $\{c_k\}_{k=0}^{\infty}$ of positive real numbers and any starting point $z_0 \in \mathbb{R}^n$, a sequence $\{z_k\}_{k=0}^{\infty} \subset \mathbb{R}^n$, where

$z_{k+1} := P_{c_k}(z_k), \quad k = 0, 1, \dots$   (11.4)
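When the resolvent has a closed form, the iteration (11.4) can be run directly. Here is a minimal sketch (an illustration, not from the book) for the maximal monotone operator $T(z) = z$ on the real line, where $P_c(z) = z/(1+c)$ and $F = \{0\}$:

```python
def resolvent(z, c):
    # P_c = (I + cT)^(-1) for T(z) = z reduces to z / (1 + c).
    return z / (1.0 + c)

z = 10.0                           # starting point z_0
for k in range(50):
    z_next = resolvent(z, 1.0)     # c_k = 1 for every k (any positive choice works)
    assert abs(z_next) <= abs(z)   # iterates approach the zero of T monotonically
    z = z_next

print(z)   # z_50 = 10 / 2**50, numerically indistinguishable from the solution 0
```

Each application of $P_c$ here halves the distance to the unique zero of $T$, which matches the fixed-point characterization $P_c(z) = z \iff 0 \in T(z)$.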

We study the convergence of a proximal point method to the set of solutions of the inclusion $0 \in T(x)$ in the presence of computational errors. We show that the proximal point method generates a good approximate solution if the sequence of computational errors is bounded from above by some constant.
More precisely, we show (Theorem 11.2) that, for given positive numbers $M, \varepsilon$, there exist a natural number $n_0$ and $\delta > 0$ such that, if the computational errors do not exceed $\delta$ at any iteration and if $\|x_0\| \le M$, then the algorithm generates a sequence $\{x_k\}_{k=0}^{n_0}$ such that $\|x_{n_0} - \bar x\| \le \varepsilon$, where $\bar x \in \mathbb{R}^n$ satisfies $0 \in T(\bar x)$.
The results of this chapter were obtained in [125].
Let $T: \mathbb{R}^n \to 2^{\mathbb{R}^n}$ be a maximal monotone operator. It is not difficult to see that its graph

$\operatorname{graph} T := \{(x, w) \in \mathbb{R}^n \times \mathbb{R}^n : w \in T(x)\}$

is closed.
Assume that

$F := \{z \in \mathbb{R}^n : 0 \in T(z)\} \ne \emptyset.$

For each $x \in \mathbb{R}^n$ and each nonempty set $A \subset \mathbb{R}^n$ put

$d(x, A) = \inf\{\|x - y\| : y \in A\}.$

Fix

$\bar\lambda > 0.$   (11.5)

For each $x \in \mathbb{R}^n$ and each $r > 0$ set

$B(x, r) = \{y \in \mathbb{R}^n : \|x - y\| \le r\}.$

We prove the following result, which establishes the convergence of the proximal point algorithm without computational errors.

Theorem 11.1. Let $M, \varepsilon > 0$. Then there exists a natural number $n_0$ such that, for each sequence $\{\lambda_k\}_{k=0}^{\infty} \subset [\bar\lambda, \infty)$ and each sequence $\{x_k\}_{k=0}^{\infty} \subset \mathbb{R}^n$ such that

$\|x_0\| \le M,$

$x_{k+1} = P_{\lambda_k}(x_k)$ for all integers $k \ge 0,$   (11.6)

the inequality $d(x_k, F) \le \varepsilon$ holds for all integers $k \ge n_0$.

Since $n_0$ depends only on $M$ and $\varepsilon$, we can say that Theorem 11.1 establishes the uniform convergence of the proximal point algorithm without computational errors on bounded sets.
Theorem 11.1 is proved in Sect. 11.3. The following theorem is one of the main results of this chapter.

Theorem 11.2. Let $M, \varepsilon > 0$. Then there exist a natural number $n_0$ and a positive number $\delta$ such that, for each sequence $\{\lambda_k\}_{k=0}^{n_0-1} \subset [\bar\lambda, \infty)$ and each sequence $\{x_k\}_{k=0}^{n_0} \subset \mathbb{R}^n$ such that

$\|x_0\| \le M,$

$\|x_{k+1} - P_{\lambda_k}(x_k)\| \le \delta, \quad k = 0, 1, \dots, n_0 - 1,$

the following inequality holds:

$d(x_{n_0}, F) \le \varepsilon.$

Theorem 11.2 easily follows from the following result, which is proved in Sect. 11.4.

Theorem 11.3. Let $M, \varepsilon_0 > 0$, let a natural number $n_0$ be as guaranteed by Theorem 11.1 with $\varepsilon = \varepsilon_0/2$, and let $\delta = \varepsilon_0(2n_0)^{-1}$. Then, for each sequence $\{\lambda_k\}_{k=0}^{n_0-1} \subset [\bar\lambda, \infty)$ and each sequence $\{x_k\}_{k=0}^{n_0} \subset \mathbb{R}^n$ such that
$\|x_0\| \le M,$

$\|x_{k+1} - P_{\lambda_k}(x_k)\| \le \delta, \quad k = 0, 1, \dots, n_0 - 1,$

the following inequality holds:

$d(x_{n_0}, F) \le \varepsilon_0.$

Theorem 11.2 easily implies the following result.

Theorem 11.4. Let $M, \varepsilon > 0$ and let a natural number $n_0$ and $\delta > 0$ be as guaranteed by Theorem 11.2. Assume that $\{\lambda_k\}_{k=0}^{\infty} \subset [\bar\lambda, \infty)$ and that a sequence $\{x_k\}_{k=0}^{\infty} \subset B(0, M)$ satisfies

$\|x_{k+1} - P_{\lambda_k}(x_k)\| \le \delta, \quad k = 0, 1, \dots$

Then $d(x_k, F) \le \varepsilon$ for all integers $k \ge n_0$.


Next result is proved in Sect. 11.4. It establishes a convergence of the proximal
point algorithm with computational errors, which converge to zero under an
assumption that all the iterates are bounded by the same prescribed bound. This
convergence is uniform since n depends only on M, , and fık g1 kD0 .

Theorem 11.5. Let M > 0, fık g1 kD0 be a sequence of positive numbers such that
limk!1 ık D 0 and let  > 0. Then there exists a natural number n such that, for
each sequence fk g1 N 1
kD0  Œ; 1/ and each sequence fxk gkD0  B.0; M/ satisfying

kxkC1  Pk .xk /k  ık for all integers k  0;

the inequality

d.xk ; F/  

holds for all integers k  n .


In the last two theorems, which are proved in Sect. 11.4, we consider the case when the set $F$ is bounded. In Theorem 11.6 it is assumed that the computational errors do not exceed a certain positive constant, while in Theorem 11.7 the computational errors tend to zero.

Theorem 11.6. Suppose that the set $F$ is bounded and let $M, \varepsilon > 0$. Then there exist $\delta_* > 0$ and a natural number $n_0$ such that, for each $\{\lambda_k\}_{k=0}^{\infty} \subset [\bar\lambda, \infty)$ and each sequence $\{x_k\}_{k=0}^{\infty} \subset \mathbb{R}^n$ satisfying

$\|x_0\| \le M,$

$\|x_{k+1} - P_{\lambda_k}(x_k)\| \le \delta_*, \quad k = 0, 1, \dots,$

the inequality $d(x_k, F) \le \varepsilon$ holds for all integers $k \ge n_0$.

Theorem 11.7. Suppose that the set $F$ is bounded and let $M > 0$. Then there exists $\delta > 0$ such that the following assertion holds.
Assume that $\{\delta_k\}_{k=0}^{\infty} \subset (0, \delta]$ satisfies

$\lim_{k\to\infty} \delta_k = 0$

and that $\varepsilon > 0$. Then there exists a natural number $n_\varepsilon$ such that, for each sequence $\{\lambda_k\}_{k=0}^{\infty} \subset [\bar\lambda, \infty)$ and each sequence $\{x_k\}_{k=0}^{\infty} \subset \mathbb{R}^n$ satisfying

$\|x_0\| \le M,$

$\|x_{k+1} - P_{\lambda_k}(x_k)\| \le \delta_k, \quad k = 0, 1, \dots,$

the inequality $d(x_k, F) \le \varepsilon$ holds for all integers $k \ge n_\varepsilon$.

Note that in Theorem 11.6 $\delta_*$ depends on $\varepsilon$, while in Theorem 11.7 $\delta$ does not depend on $\varepsilon$.

11.2 Auxiliary Results

It is easy to see that the following lemma holds.

Lemma 11.8. Let $z, x_0, x_1 \in \mathbb{R}^n$. Then

$2^{-1}\|z - x_0\|^2 - 2^{-1}\|z - x_1\|^2 - 2^{-1}\|x_0 - x_1\|^2 = \langle x_0 - x_1, x_1 - z\rangle.$

Lemma 11.9. Let $\{\lambda_k\}_{k=0}^{\infty} \subset (0, \infty)$ and $\{x_k\}_{k=0}^{\infty} \subset \mathbb{R}^n$ satisfy, for all integers $k \ge 0$,

$x_{k+1} = P_{\lambda_k}(x_k) = (I + \lambda_k T)^{-1}(x_k)$   (11.7)

and let $z \in \mathbb{R}^n$ satisfy

$0 \in T(z).$   (11.8)

Then, for all integers $k \ge 0$,

$\|z - x_k\|^2 - \|z - x_{k+1}\|^2 \ge \|x_k - x_{k+1}\|^2 \ge 0.$

Proof. Let $k \ge 0$ be an integer. By Lemma 11.8,

$2^{-1}\|z - x_k\|^2 - 2^{-1}\|z - x_{k+1}\|^2 - 2^{-1}\|x_k - x_{k+1}\|^2 = \langle x_k - x_{k+1}, x_{k+1} - z\rangle.$   (11.9)

By (11.7),

$x_k - x_{k+1} \in \lambda_k T(x_{k+1}).$

Since $0 \in T(z)$, the monotonicity of $T$ implies that the right-hand side of (11.9) is nonnegative. Together with (11.9) and (11.8) this completes the proof of Lemma 11.9.

Using (11.3) we can easily deduce the following lemma.

Lemma 11.10. Assume that $z \in \mathbb{R}^n$ satisfies (11.8), $M > 0$,

$\{\lambda_k\}_{k=0}^{\infty} \subset (0, \infty), \quad \{x_k\}_{k=0}^{\infty} \subset \mathbb{R}^n,$

$\|x_0 - z\| \le M$ and that (11.7) holds for all integers $k \ge 0$. Then $\|x_k - z\| \le M$ for all integers $k \ge 0$.
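Lemmas 11.9 and 11.10 are easy to check numerically for a concrete operator. The sketch below is an illustration; the matrix, starting point, and $\lambda_k$ are arbitrary choices. It uses $T(x) = Ax$ with a symmetric positive semidefinite $A$, for which the resolvent is a $2 \times 2$ linear solve:

```python
# T(x) = A x with A symmetric positive semidefinite is maximal monotone, and
# z = (0, 0) is the unique zero of T, i.e. z lies in F.  The resolvent
# P_lam(x) solves (I + lam*A) u = x, here via an explicit 2x2 inverse.

A = ((2.0, 1.0), (1.0, 2.0))

def resolvent(x, lam):
    a = 1.0 + lam * A[0][0]; b = lam * A[0][1]
    c = lam * A[1][0];       d = 1.0 + lam * A[1][1]
    det = a * d - b * c
    return ((d * x[0] - b * x[1]) / det, (-c * x[0] + a * x[1]) / det)

def sq(u):
    return u[0] * u[0] + u[1] * u[1]          # squared Euclidean norm

x0 = (3.0, -4.0)
x = x0
lemma_11_9_holds = True
for k in range(40):
    x_next = resolvent(x, 0.7)                # lam_k = 0.7 for every k
    step = (x[0] - x_next[0], x[1] - x_next[1])
    # Lemma 11.9: ||z - x_k||^2 - ||z - x_{k+1}||^2 >= ||x_k - x_{k+1}||^2 (z = 0)
    lemma_11_9_holds = lemma_11_9_holds and sq(x) - sq(x_next) - sq(step) >= -1e-9
    x = x_next

# Lemma 11.10: no iterate leaves the ball B(z, ||x_0 - z||)
print(lemma_11_9_holds, sq(x) <= sq(x0))
```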
Lemma 11.11. Let $M, \varepsilon > 0$. Then there exists $\delta > 0$ such that, for each $x \in B(0, M)$, each $\lambda \ge \bar\lambda$ and each $z \in B(0, \delta)$ satisfying $z \in \lambda T(x)$, the inequality $d(x, F) \le \varepsilon$ holds.

Proof. Assume the contrary. Then, for each natural number $k$, there exist

$x_k \in B(0, M), \quad z_k \in B(0, k^{-1}), \quad \lambda_k \ge \bar\lambda$   (11.10)

such that

$d(x_k, F) > \varepsilon, \quad z_k \in \lambda_k T(x_k).$   (11.11)

By (11.11) and (11.10), for all integers $k \ge 1$,

$\lambda_k^{-1} z_k \in T(x_k)$   (11.12)

and

$\|\lambda_k^{-1} z_k\| \le \bar\lambda^{-1}\|z_k\| \le \bar\lambda^{-1} k^{-1} \to 0$ as $k \to \infty.$   (11.13)

By (11.10), extracting a subsequence and re-indexing, we may assume that there exists

$x := \lim_{k\to\infty} x_k.$   (11.14)

Since $\operatorname{graph} T$ is closed, (11.12), (11.13) and (11.14) imply that $0 \in T(x)$ and that $x \in F$. Together with (11.14) this implies that $d(x_k, F) \le \varepsilon/2$ for all sufficiently large natural numbers $k$. This contradicts (11.11) and proves Lemma 11.11.
Lemma 11.12. Assume that the integers $p, q$, with $0 \le p < q$, are such that

$\{\lambda_k\}_{k=p}^{q-1} \subset (0, \infty), \quad \{\delta_k\}_{k=p}^{q-1} \subset (0, \infty),$   (11.15)

$\{x_k\}_{k=p}^{q} \subset \mathbb{R}^n, \quad \{y_k\}_{k=p}^{q} \subset \mathbb{R}^n, \quad y_p = x_p,$

and that for all integers $k \in \{p, \dots, q-1\}$,

$y_{k+1} = P_{\lambda_k}(y_k), \quad \|x_{k+1} - P_{\lambda_k}(x_k)\| \le \delta_k.$   (11.16)

Then, for any integer $k \in \{p+1, \dots, q\}$,

$\|y_k - x_k\| \le \sum_{i=p}^{k-1} \delta_i.$   (11.17)

Proof. We prove the lemma by induction. In view of (11.16) and (11.15), inequality (11.17) holds for $k = p+1$.
Assume that an integer $j$ satisfies $p+1 \le j \le q$, that (11.17) holds for $k = p+1, \dots, j$ and that $j < q$. By (11.16), (11.3) and (11.17) with $k = j$,

$\|y_{j+1} - x_{j+1}\| = \|P_{\lambda_j} y_j - x_{j+1}\| \le \|P_{\lambda_j} y_j - P_{\lambda_j} x_j\| + \|P_{\lambda_j} x_j - x_{j+1}\|$
$\le \|y_j - x_j\| + \delta_j \le \sum_{i=p}^{j-1} \delta_i + \delta_j = \sum_{i=p}^{j} \delta_i$

and (11.17) holds for all $k = p+1, \dots, j+1$. Therefore we have shown by induction that (11.17) holds for all $k = p+1, \dots, q$. This completes the proof of Lemma 11.12.
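The bound (11.17) of Lemma 11.12 says that an inexact trajectory drifts from the exact one by at most the accumulated errors. A quick numerical check (an illustration using the scalar resolvent $P_\lambda(x) = x/(1+\lambda)$ for $T(x) = x$, with artificial worst-case perturbations $\delta_k$):

```python
# Exact iterates y_{k+1} = P(y_k); inexact iterates x_{k+1} = P(x_k) + delta_k,
# i.e. each step commits the maximal allowed error.  Lemma 11.12 promises
# ||y_k - x_k|| <= delta_p + ... + delta_{k-1}.

lam = 1.0
P = lambda t: t / (1.0 + lam)       # resolvent of T(x) = x

deltas = [0.05 / (k + 1) for k in range(30)]
x = y = 2.0
err_sum = 0.0
bound_holds = True
for dk in deltas:
    y = P(y)
    x = P(x) + dk                   # worst case: the full error pushes x away
    err_sum += dk
    bound_holds = bound_holds and abs(y - x) <= err_sum + 1e-15

print(bound_holds)
```

In fact the drift here is strictly smaller than the sum, because each resolvent application also contracts the previously accumulated gap; (11.17) only uses nonexpansiveness, so it holds for every maximal monotone $T$.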

11.3 Proof of Theorem 11.1

Fix

$z \in F.$   (11.18)

By Lemma 11.11, there exists $\delta \in (0, 1)$ such that the following property holds:

(P1) For each $x \in B(0, M + 2\|z\|)$, each $\lambda \ge \bar\lambda$ and each $\xi \in B(0, \delta)$ satisfying $\xi \in \lambda T(x)$ we have $d(x, F) \le \varepsilon/2$.

Choose a natural number $n_0$ such that

$(\|z\| + M)^2 n_0^{-1} < \delta^2.$   (11.19)

Assume that

$\{\lambda_k\}_{k=0}^{\infty} \subset [\bar\lambda, \infty), \quad \{x_k\}_{k=0}^{\infty} \subset \mathbb{R}^n, \quad \|x_0\| \le M,$
$x_{k+1} = P_{\lambda_k}(x_k)$ for all integers $k \ge 0.$   (11.20)

By Lemma 11.9, (11.20) and (11.18), for all integers $k \ge 0$,

$\|x_k\| \le \|z\| + \|x_k - z\| \le \|z\| + \|x_0 - z\| \le 2\|z\| + \|x_0\| \le 2\|z\| + M.$   (11.21)

By Lemma 11.9, (11.18) and (11.20), for any integer $k \ge 0$,

$\|x_k - x_{k+1}\|^2 \le \|z - x_k\|^2 - \|z - x_{k+1}\|^2$

and this implies that

$\sum_{k=0}^{n_0-1} \|x_k - x_{k+1}\|^2 \le \|z - x_0\|^2 - \|z - x_{n_0}\|^2 \le \|z - x_0\|^2 \le (\|z\| + M)^2.$

Together with (11.19) this implies that

$\min\{\|x_k - x_{k+1}\|^2 : k = 0, \dots, n_0 - 1\} \le (\|z\| + M)^2 n_0^{-1} < \delta^2.$

Therefore there is an integer $j$ such that

$0 \le j \le n_0 - 1, \quad \|x_j - x_{j+1}\| \le \delta.$   (11.22)

In view of (11.20),

$x_j - x_{j+1} \in \lambda_j T(x_{j+1}).$   (11.23)

It follows from (11.22), (11.23), (11.21), (11.20), and (P1) that

$d(x_{j+1}, F) \le \varepsilon/2.$   (11.24)

By (11.24), (11.20), and Lemma 11.9,

$d(x_i, F) \le \varepsilon/2$ for all integers $i \ge j + 1$

and, in particular, for all integers $i \ge n_0.$

This completes the proof of Theorem 11.1.

11.4 Proofs of Theorems 11.3, 11.5, 11.6, and 11.7

Proof of Theorem 11.3. Assume that

$\{\lambda_k\}_{k=0}^{n_0-1} \subset [\bar\lambda, \infty), \quad \{x_k\}_{k=0}^{n_0} \subset \mathbb{R}^n,$   (11.25)

$\|x_0\| \le M, \quad \|x_{k+1} - P_{\lambda_k}(x_k)\| \le \delta = \varepsilon_0(2n_0)^{-1}, \quad k = 0, \dots, n_0 - 1.$   (11.26)

Put

$y_0 = x_0, \quad y_{k+1} = P_{\lambda_k}(y_k), \quad k = 0, \dots, n_0 - 1.$   (11.27)

By the choice of $n_0$, (11.27), (11.26) and (11.25),

$d(y_{n_0}, F) \le \varepsilon_0/2.$   (11.28)

By Lemma 11.12 and (11.25)–(11.27),

$\|y_{n_0} - x_{n_0}\| \le n_0\delta \le \varepsilon_0/2.$

Combined with (11.28) this implies that

$d(x_{n_0}, F) \le \varepsilon_0.$

Theorem 11.3 is proved.


Proof of Theorem 11.5. Let $\delta > 0$ and a natural number $n_0$ be as guaranteed by Theorem 11.2. There is a natural number $p$ such that

$\delta_k \le \delta$ for all integers $k \ge p.$   (11.29)

Put

$n_\varepsilon = n_0 + p.$   (11.30)

Assume that

$\{\lambda_k\}_{k=0}^{\infty} \subset [\bar\lambda, \infty), \quad \{x_k\}_{k=0}^{\infty} \subset B(0, M),$   (11.31)

$\|x_{k+1} - P_{\lambda_k}(x_k)\| \le \delta_k$ for all integers $k \ge 0.$

In view of (11.29) and (11.31),

$\|x_{k+1} - P_{\lambda_k}(x_k)\| \le \delta$ for all integers $k \ge p.$   (11.32)

By (11.32), (11.31), the choice of $\delta$ and $n_0$, Theorem 11.4 applied to the sequence $\{x_{p+k}\}_{k=0}^{\infty}$, and (11.30),

$d(x_k, F) \le \varepsilon$ for all integers $k \ge p + n_0 = n_\varepsilon.$

Theorem 11.5 is proved.


178 11 Maximal Monotone Operators and the Proximal Point Algorithm

Proof of Theorem 11.6. We may assume without loss of generality that

$M > 1 + \sup\{\|z\| : z \in F\}, \quad \varepsilon < 1.$   (11.33)

By Theorem 11.2 there exist $\delta > 0$ and a natural number $n_0$ such that the following property holds:

(P2) For each sequence $\{\lambda_k\}_{k=0}^{n_0-1} \subset [\bar\lambda, \infty)$ and each sequence $\{x_k\}_{k=0}^{n_0} \subset \mathbb{R}^n$ which satisfies

$\|x_0\| \le M,$

$\|x_{k+1} - P_{\lambda_k}(x_k)\| \le \delta, \quad k = 0, \dots, n_0 - 1,$

we have

$d(x_{n_0}, F) \le \varepsilon/4.$

Put

$\delta_* = \min\{\delta, (\varepsilon/4)n_0^{-1}\}.$   (11.34)

Assume that

$\{\lambda_k\}_{k=0}^{\infty} \subset [\bar\lambda, \infty), \quad \{x_k\}_{k=0}^{\infty} \subset \mathbb{R}^n$   (11.35)

and

$\|x_0\| \le M,$

$\|x_{k+1} - P_{\lambda_k}(x_k)\| \le \delta_*, \quad k = 0, 1, \dots$   (11.36)

By (11.35), (11.36), (11.34) and (P2),

$d(x_{n_0}, F) \le \varepsilon/4.$   (11.37)

In view of (11.37) and (11.33),

$\|x_{n_0}\| \le M.$   (11.38)

We show by induction that for any natural number $j$,

$d(x_{jn_0}, F) \le \varepsilon/4, \quad \|x_{jn_0}\| \le M.$   (11.39)

Equations (11.37) and (11.38) imply that (11.39) is valid for $j = 1$.
Assume that $j$ is a natural number and (11.39) holds. By (11.39), (11.35), (11.36), (11.34), and (P2) applied to the sequence $\{x_{jn_0+k}\}_{k=0}^{n_0}$,

$d(x_{(j+1)n_0}, F) \le \varepsilon/4.$

Together with (11.33) this implies that $\|x_{(j+1)n_0}\| \le M$. Thus (11.39) holds for all natural numbers $j$.
Let $j$ be a natural number. Put

$y_{jn_0} = x_{jn_0}, \quad y_{k+1} = P_{\lambda_k}(y_k), \quad k = jn_0, \dots, (j+1)n_0 - 1.$   (11.40)

By Lemma 11.12, (11.35), (11.36), (11.40), and (11.34), for all $k = jn_0 + 1, \dots, (j+1)n_0$,

$\|y_k - x_k\| \le n_0\delta_* \le \varepsilon/4.$   (11.41)

Since the set $F$ is closed and bounded, there is $z \in F$ such that

$d(x_{jn_0}, F) = \|x_{jn_0} - z\|.$   (11.42)

It follows from (11.39) and (11.42) that

$\|x_{jn_0} - z\| \le \varepsilon/4.$   (11.43)

By (11.35), (11.40), the inclusion $z \in F$, Lemma 11.9 and (11.43),

$\|y_k - z\| \le \|y_{jn_0} - z\| = \|x_{jn_0} - z\| \le \varepsilon/4$

for all integers $k = jn_0 + 1, \dots, (j+1)n_0.$   (11.44)

By (11.41), (11.44) and the inclusion $z \in F$, for all integers $k = jn_0 + 1, \dots, (j+1)n_0$,

$d(x_k, F) \le \|x_k - z\| \le \|x_k - y_k\| + \|y_k - z\| \le \varepsilon/4 + \varepsilon/4.$

Since $j$ is an arbitrary natural number, we conclude that

$d(x_k, F) \le \varepsilon/2$ for all integers $k \ge n_0.$

Theorem 11.6 is proved.


Proof of Theorem 11.7. We may assume without loss of generality that

$M > 2 + \sup\{\|z\| : z \in F\}.$   (11.45)

By Theorem 11.6 there are $\delta > 0$ and a natural number $n_0$ such that the following property holds:

(P3) For each $\{\lambda_k\}_{k=0}^{\infty} \subset [\bar\lambda, \infty)$ and each $\{x_k\}_{k=0}^{\infty} \subset \mathbb{R}^n$ satisfying

$\|x_0\| \le M,$

$\|x_{k+1} - P_{\lambda_k}(x_k)\| \le \delta, \quad k = 0, 1, \dots,$

the inequality $d(x_k, F) \le 1$ holds for all integers $k \ge n_0$.

Assume that

$\{\delta_k\}_{k=0}^{\infty} \subset (0, \delta], \quad \lim_{k\to\infty}\delta_k = 0, \quad \varepsilon > 0.$   (11.46)

We may assume without loss of generality that

$\varepsilon < 1.$   (11.47)

By Theorem 11.6 there are $\delta_* \in (0, \delta)$ and a natural number $n_*$ such that the following property holds:

(P4) For each $\{\lambda_k\}_{k=0}^{\infty} \subset [\bar\lambda, \infty)$ and each $\{x_k\}_{k=0}^{\infty} \subset \mathbb{R}^n$ satisfying $\|x_0\| \le M$,

$\|x_{k+1} - P_{\lambda_k}(x_k)\| \le \delta_*, \quad k = 0, 1, \dots,$

we have $d(x_k, F) \le \varepsilon$ for all integers $k \ge n_*$.

By (11.46) there is a natural number $p$ such that

$\delta_k < \delta_*$ for all integers $k \ge p.$   (11.48)

Put

$n_\varepsilon = n_0 + p + n_*.$   (11.49)

Assume that

$\{\lambda_k\}_{k=0}^{\infty} \subset [\bar\lambda, \infty), \quad \{x_k\}_{k=0}^{\infty} \subset \mathbb{R}^n,$   (11.50)

$\|x_0\| \le M,$

$\|x_{k+1} - P_{\lambda_k}(x_k)\| \le \delta_k, \quad k = 0, 1, \dots$   (11.51)

By (11.50), (11.51), (11.46) and (P3),

$d(x_k, F) \le 1$ for all integers $k \ge n_0.$   (11.52)

It follows from (11.52) and (11.45) that

$\|x_k\| \le M$ for all integers $k \ge n_0.$   (11.53)

By (11.48) and (11.51), for all integers $k \ge n_0 + p$,

$\|x_{k+1} - P_{\lambda_k}(x_k)\| \le \delta_*.$   (11.54)

It follows from (11.50), (11.53), (11.54), (11.49), and property (P4) applied to the sequence $\{x_k\}_{k=n_0+p}^{\infty}$ that

$d(x_k, F) \le \varepsilon$ for all integers $k \ge n_0 + p + n_* = n_\varepsilon.$

This completes the proof of Theorem 11.7.


Chapter 12
The Extragradient Method for Solving
Variational Inequalities

In a Hilbert space, we study the convergence of the subgradient method, known in the literature as the extragradient method, to a solution of a variational inequality in the presence of computational errors. The convergence of the method for solving variational inequalities is established for nonsummable computational errors. We show that the method generates a good approximate solution if the sequence of computational errors is bounded from above by a constant.

12.1 Preliminaries and the Main Results

The study of gradient-type methods and variational inequalities is an important topic in optimization theory. See, for example, [3, 12, 30, 31, 37, 44, 52, 54, 68, 71–74] and the references mentioned therein. In this chapter we study the convergence of the subgradient method, introduced in [75] and known in the literature as the extragradient method, to a solution of a variational inequality in a Hilbert space, in the presence of computational errors.
Let $(X, \langle\cdot,\cdot\rangle)$ be a Hilbert space with an inner product $\langle\cdot,\cdot\rangle$ which induces a complete norm $\|\cdot\|$. For each $x \in X$ and each $r > 0$ set

$B(x, r) = \{y \in X : \|x - y\| \le r\}.$

Let $C$ be a nonempty closed convex subset of $X$. By Lemma 2.2, for each $x \in X$ there is a unique point $P_C(x) \in C$ satisfying

$\|x - P_C(x)\| = \inf\{\|x - y\| : y \in C\}.$

Moreover,

$\|P_C(x) - P_C(y)\| \le \|x - y\|$ for all $x, y \in X$

and

$\langle z - P_C(x), x - P_C(x)\rangle \le 0$

for each $x \in X$ and each $z \in C$.
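Both properties of $P_C$ quoted above (nonexpansiveness and the obtuse-angle inequality) can be observed numerically for a concrete set. Here is a sketch for $C = B(0, 1)$ in the plane, where the projection has the closed form $P_C(x) = x/\max\{1, \|x\|\}$ (an illustration, not the book's Lemma 2.2):

```python
import math
import random

def proj_ball(x):
    # Closed form of P_C for C = B(0,1): rescale x when it lies outside the ball.
    n = math.hypot(x[0], x[1])
    return x if n <= 1.0 else (x[0] / n, x[1] / n)

random.seed(0)
ok = True
for _ in range(1000):
    x = (random.uniform(-3, 3), random.uniform(-3, 3))
    y = (random.uniform(-3, 3), random.uniform(-3, 3))
    px, py = proj_ball(x), proj_ball(y)
    # nonexpansiveness: ||P_C(x) - P_C(y)|| <= ||x - y||
    ok = ok and math.dist(px, py) <= math.dist(x, y) + 1e-12
    # obtuse-angle property: <z - P_C(x), x - P_C(x)> <= 0 for every z in C
    z = proj_ball((random.uniform(-3, 3), random.uniform(-3, 3)))
    inner = (z[0] - px[0]) * (x[0] - px[0]) + (z[1] - px[1]) * (x[1] - px[1])
    ok = ok and inner <= 1e-12

print(ok)
```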


Consider a mapping $f: X \to X$. We say that the mapping $f$ is monotone on $C$ if

$\langle f(x) - f(y), x - y\rangle \ge 0$ for all $x, y \in C.$

We say that $f$ is pseudo-monotone on $C$ if for each $x, y \in C$ the inequality

$\langle f(y), x - y\rangle \ge 0$ implies that $\langle f(x), x - y\rangle \ge 0.$

Clearly, if $f$ is monotone on $C$, then $f$ is pseudo-monotone on $C$. Denote by $S$ the set of all $x \in C$ such that

$\langle f(x), y - x\rangle \ge 0$ for all $y \in C.$   (12.1)

We suppose that

$S \ne \emptyset.$   (12.2)

For each $\varepsilon > 0$ denote by $S_\varepsilon$ the set of all $x \in C$ such that

$\langle f(x), y - x\rangle \ge -\varepsilon\|y - x\| - \varepsilon$ for all $y \in C.$   (12.3)

In the sequel, we present examples which provide simple and clear estimations for the sets $S_\varepsilon$ in some important cases. These examples show that elements of $S_\varepsilon$ can be considered as $\varepsilon$-approximate solutions of the variational inequality.
In this chapter, in order to solve the variational inequality (to find $x \in S$), we use the algorithm known in the literature as the extragradient method [75]. In each iteration of this algorithm, in order to get the next iterate $x_{k+1}$, two orthogonal projections onto $C$ are calculated, according to the following iterative step. Given the current iterate $x_k$, calculate $y_k = P_C(x_k - \lambda_k f(x_k))$ and then

$x_{k+1} = P_C(x_k - \lambda_k f(y_k)),$

where $\lambda_k$ is some positive number. It is known that this algorithm generates sequences which converge to an element of $S$. In this chapter, we study the behavior of the sequences generated by the algorithm taking into account computational errors, which are always present in practice. Namely, in practice the algorithm generates sequences $\{x_k\}_{k=0}^{\infty}$ and $\{y_k\}_{k=0}^{\infty}$ such that for each integer $k \ge 0$,

$\|y_k - P_C(x_k - \lambda_k f(x_k))\| \le \delta$

and

$\|x_{k+1} - P_C(x_k - \lambda_k f(y_k))\| \le \delta,$

with a constant $\delta > 0$ which depends only on our computer system. Surely, in this situation one cannot expect that the sequence $\{x_k\}_{k=0}^{\infty}$ converges to the set $S$. The goal is to understand what subset of $C$ attracts all sequences $\{x_k\}_{k=0}^{\infty}$ generated by the algorithm. The main result of this chapter (Theorem 12.2) shows that this subset of $C$ is the set $S_\varepsilon$ with some $\varepsilon > 0$ depending on $\delta$ [see (12.9) and (12.10)]. The examples considered in this section show that one cannot expect to find an attracting set smaller than $S_\varepsilon$, whose elements can be considered as approximate solutions of the variational inequality.
The results of this chapter were obtained in [127].
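The iterative step described above is short to state in code. The sketch below (an illustration with arbitrarily chosen data, not taken from [127]) runs the extragradient step for the monotone rotation operator $f(x_1, x_2) = (-x_2, x_1)$ on $C = X = \mathbb{R}^2$, whose solution set is $S = \{0\}$; for contrast it also runs the plain forward step $x_{k+1} = x_k - \lambda f(x_k)$, which diverges for this $f$:

```python
import math

def f(x):
    # A monotone (skew) operator: rotation by 90 degrees; its solution set is S = {0}.
    return (-x[1], x[0])

def extragradient_step(x, lam):
    # y_k = P_C(x_k - lam f(x_k)),  x_{k+1} = P_C(x_k - lam f(y_k));
    # here C = R^2, so both projections are the identity.
    fx = f(x)
    y = (x[0] - lam * fx[0], x[1] - lam * fx[1])
    fy = f(y)
    return (x[0] - lam * fy[0], x[1] - lam * fy[1])

lam = 0.5
x_extra = x_plain = (1.0, 0.0)
for _ in range(100):
    x_extra = extragradient_step(x_extra, lam)
    fp = f(x_plain)                                  # plain forward step, for contrast
    x_plain = (x_plain[0] - lam * fp[0], x_plain[1] - lam * fp[1])

print(math.hypot(*x_extra), math.hypot(*x_plain))    # extragradient shrinks, plain step grows
```

For this $f$ the extragradient norm contracts by the factor $(1 - \lambda^2 + \lambda^4)^{1/2}$ per step while the forward step expands by $(1 + \lambda^2)^{1/2}$, which is precisely why two projections per iteration are needed for merely monotone mappings.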
We suppose that the mapping $f$ is Lipschitz on all bounded subsets of $X$ and that

$\langle f(y), y - x\rangle \ge 0$ for all $y \in C$ and all $x \in S.$   (12.4)

Remark 12.1. Note that (12.4) holds if $f$ is pseudo-monotone on $C$.

Usually, algorithms studied in the literature generate sequences which converge weakly to an element of $S$. In this chapter, for a given $\varepsilon > 0$, we are interested in finding a point $x \in X$ such that $\inf\{\|x - y\| : y \in S_\varepsilon\} \le \varepsilon$. This point $x$ is considered as an $\varepsilon$-approximate solution of the variational inequality. We will prove the following result, which shows that an $\varepsilon$-approximate solution can be obtained after $k$ iterations of the extragradient method in the presence of computational errors bounded from above by a constant $\delta$, where $\delta$ and $k$ are constants depending on $\varepsilon$.
Theorem 12.2. Let $\varepsilon \in (0, 1)$ and $M_0 > 0$, $M_1 > 0$, $L > 0$ be such that

$B(0, M_0) \cap S \ne \emptyset,$   (12.5)

$f(B(0, 3M_0 + 1)) \subset B(0, M_1),$   (12.6)

$\|f(z_1) - f(z_2)\| \le L\|z_1 - z_2\|$ for all $z_1, z_2 \in B(0, 3M_0 + M_1 + 1),$   (12.7)

$0 < \tilde\lambda < \lambda \le 1, \quad \lambda L < 1,$   (12.8)

let $\varepsilon_0 > 0$ satisfy

$3\varepsilon_0(M_1 + \tilde\lambda^{-1}(1 + \lambda M_0 + \lambda M_1) + (L + \tilde\lambda^{-1})) < \varepsilon,$   (12.9)

let $\delta \in (0, 1)$ satisfy

$4\delta(1 + 2M_0) < \varepsilon_0^2(1 - \lambda^2 L^2)/2$   (12.10)

and let an integer

$k > 8M_0^2\varepsilon_0^{-2}(1 - \lambda^2 L^2)^{-1}.$   (12.11)

Assume that

$\{x_i\}_{i=0}^{\infty} \subset X, \quad \{y_i\}_{i=0}^{\infty} \subset X, \quad \{\lambda_i\}_{i=0}^{\infty} \subset [\tilde\lambda, \lambda],$   (12.12)

$\|x_0\| \le M_0,$   (12.13)

and that for each integer $i \ge 0$,

$\|y_i - P_C(x_i - \lambda_i f(x_i))\| \le \delta,$   (12.14)

$\|x_{i+1} - P_C(x_i - \lambda_i f(y_i))\| \le \delta.$   (12.15)

Then there is an integer $j \in [0, k]$ such that

$x_i \in B(0, 3M_0), \quad i = 0, \dots, j,$   (12.16)

$\|x_j - y_j\| \le 2\varepsilon_0,$

$\|x_i - y_i\| > 2\varepsilon_0$ for all integers $i$ satisfying $0 \le i < j.$   (12.17)

Moreover, if an integer $j \in [0, k]$ satisfies (12.17), then

$\langle f(x_j), \xi - x_j\rangle \ge -\varepsilon - \varepsilon\|\xi - x_j\|$ for all $\xi \in C$   (12.18)

and there is $y \in S_\varepsilon$ such that $\|x_j - y\| \le \varepsilon$.

Note that Theorem 12.2 provides the estimations for the constants $\delta$ and $k$, which follow from (12.9)–(12.11). Namely, $\delta = c_1\varepsilon^2$ and $k = c_2\varepsilon^{-2}$, where $c_1$ and $c_2$ are positive constants depending on $M_0$.
Let us consider the following particular example.
Example 12.3. Assume that $f(x) = x$ for all $x \in X$ and $C = X$. Then $S = \{0\}$. Let $\varepsilon \in (0, 1/2)$ and $M_0 = 10^2$. Clearly, in this case

$S_\varepsilon \subset \{y \in X : \|y\| \le 2\varepsilon\}$

and the assertion of Theorem 12.2 holds with $M_1 = 400$ [see (12.6)], $L = 1$ [see (12.7)], $\tilde\lambda = 2^{-1}$, $\lambda = 3/4$ [see (12.8)],

$\varepsilon_0 = 5^{-1} \cdot 10^{-3}\varepsilon$

[see (12.9)],

$\delta = 2^{-1} \cdot 10^{-11}\varepsilon^2$

[see (12.10)] and with $k$, which is the smallest integer larger than $16 \cdot 10^{12}\varepsilon^{-2}$ [see (12.11)].
The following example demonstrates that the set $S_\varepsilon$ can be easily calculated if the mapping $f$ is strongly monotone.

Example 12.4. Let $\bar r \in (0, 1)$. Set

$C_{\bar r} = \{x \in X : \|x - P_C(x)\| \le \bar r\}.$

We say that $f$ is strongly monotone with a constant $\bar\alpha > 0$ on $C_{\bar r}$ if

$\langle f(x) - f(y), x - y\rangle \ge \bar\alpha\|x - y\|^2$ for all $x, y \in C_{\bar r}.$

Fix $u_* \in S$. We suppose that there is $\bar\alpha \in (0, 1)$ such that

$\langle f(x), x - u_*\rangle \ge \bar\alpha\|x - u_*\|^2$ for all $x \in C_{\bar r}.$   (12.19)

Remark 12.5. Note that inequality (12.19) holds if $f$ is strongly monotone with a constant $\bar\alpha$ on $C_{\bar r}$.

Let $\varepsilon > 0$ and $x \in S_\varepsilon$. Then for all $y \in C$,

$\langle f(x), y - x\rangle \ge -\varepsilon\|y - x\| - \varepsilon$

and in particular

$-\varepsilon\|u_* - x\| - \varepsilon \le \langle f(x), u_* - x\rangle \le -\bar\alpha\|x - u_*\|^2.$

This implies that

$\bar\alpha\|x - u_*\|^2 \le 2\varepsilon\max\{1, \|x - u_*\|\},$

$\|x - u_*\| \le \max\{2\varepsilon\bar\alpha^{-1}, (2\varepsilon\bar\alpha^{-1})^{1/2}\}$

and, if $\varepsilon \le 2^{-1}\bar\alpha$, then

$\|x - u_*\| \le (2\varepsilon\bar\alpha^{-1})^{1/2}$

and

$S_\varepsilon \subset \{x \in X : \|x - u_*\| \le (2\varepsilon\bar\alpha^{-1})^{1/2}\}.$

According to Theorem 12.2, under its assumptions, there is an integer $j \in [0, k]$ such that

$\|x_j - u_*\| \le \varepsilon + (2\varepsilon\bar\alpha^{-1})^{1/2}.$

Note that the constant $\bar\alpha$ can be obtained by analyzing an explicit form of the mapping $f$.
In the next example we show what the set $S_\varepsilon$ is when $C = X$.

Example 12.6. Assume that $C = X$. It is easy to see that

$S = \{x \in X : f(x) = 0\}.$

Let $\varepsilon > 0$ and $x \in S_\varepsilon$. Then for all $z \in B(0, 1)$,

$\langle f(x), z\rangle \le \varepsilon\|z\| + \varepsilon \le 2\varepsilon$

and $\|f(x)\| \le 2\varepsilon$. Thus

$S_\varepsilon \subset \{x \in X : \|f(x)\| \le 2\varepsilon\}.$

In the following example, we demonstrate that, if the computational errors made by our computer system are $\delta > 0$, then in principle any element of the set $S_\varepsilon$, where $\varepsilon$ is a positive constant depending on $\delta$, can be a limit point of the sequence $\{x_i\}_{i=0}^{\infty}$ generated by the extragradient method. This means that Theorem 12.2 cannot be improved.

Example 12.7. Assume that $f(x) = x$ for all $x \in X$ and $C = X$. Clearly, $f$ is strongly monotone on $X$ with a constant $1$ and $S = \{0\}$. According to Example 12.6, for any $\varepsilon \in (0, 2^{-1})$, $S_\varepsilon \subset \{y \in X : \|y\| \le 2\varepsilon\}$.
Let $\lambda \in (0, 1)$, $\delta \in (0, 1)$,

$v \in B(0, \delta)$

and let sequences $\{x_i\}_{i=0}^{\infty}, \{y_i\}_{i=0}^{\infty} \subset X$ satisfy for all integers $i \ge 0$,

$y_i = x_i - \lambda f(x_i) = (1 - \lambda)x_i,$

$x_{i+1} = x_i - \lambda f(y_i) + v = (1 - \lambda + \lambda^2)x_i + v.$   (12.20)

By induction it follows from (12.20) that for all integers $n \ge 1$,

$x_n = (1 - \lambda + \lambda^2)^n x_0 + \sum_{i=0}^{n-1}(1 - \lambda + \lambda^2)^i v \to (\lambda - \lambda^2)^{-1}v$ as $n \to \infty.$

Thus any

$\xi \in B(0, \delta(\lambda - \lambda^2)^{-1})$

can be a limit of a sequence $\{x_n\}_{n=0}^{\infty}$ generated by the extragradient method in the presence of computational errors $\delta$.
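The limit computed in Example 12.7 can be observed numerically. The sketch below is an illustration; $\lambda$, $\delta$, and the starting point are arbitrary, and $X = \mathbb{R}$. It iterates (12.20) with a fixed error $v$ and compares $x_n$ with $(\lambda - \lambda^2)^{-1}v$:

```python
lam, delta = 0.5, 0.01
v = delta                       # a fixed error of maximal size, v in B(0, delta)

x = 0.7                         # arbitrary starting point
for _ in range(200):
    y = x - lam * x             # y_i = (1 - lam) x_i
    x = x - lam * y + v         # x_{i+1} = (1 - lam + lam^2) x_i + v, as in (12.20)

limit = v / (lam - lam ** 2)    # the limit (lam - lam^2)^{-1} v from Example 12.7
print(x, limit)                 # x_200 agrees with the limit 0.04 to machine precision
```

Since $|1 - \lambda + \lambda^2| < 1$ for $\lambda \in (0, 1)$, the influence of $x_0$ decays geometrically and only the accumulated error term survives, which is exactly the obstruction that Theorem 12.2 quantifies.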
In the next example we obtain an estimation for the sets $S_\varepsilon$ when $f(\cdot)$ is the Gâteaux derivative of a convex function.

Example 12.8. Assume that $F: X \to \mathbb{R}^1$ is a convex Gâteaux differentiable function and $f(x) = F'(x)$ for all $x \in X$, where $F'(x)$ is the Gâteaux derivative of $F$ at the point $x \in X$. We suppose that $F' = f$ is Lipschitz on all bounded subsets of $X$ and that $x_* \in C$ satisfies

$F(x_*) \le F(z)$ for all $z \in C.$

Assume that a constant $\tilde M > \|x_*\|$ is given. (Note that $\tilde M$ can be known a priori or obtained by analyzing an explicit form of the function $F$.) Let $\varepsilon \in (0, 1)$ and

$x \in S_\varepsilon \cap B(0, \tilde M).$

Then for each $y \in C$,

$F(y) - F(x) \ge \langle F'(x), y - x\rangle = \langle f(x), y - x\rangle \ge -\varepsilon\|y - x\| - \varepsilon$

and, in particular,

$F(x_*) - F(x) \ge -\varepsilon\|x_* - x\| - \varepsilon \ge -2\varepsilon\tilde M - \varepsilon.$

Thus

$S_\varepsilon \cap B(0, \tilde M) \subset \{x \in C : F(x) \le F(x_*) + \varepsilon(2\tilde M + 1)\}.$

The chapter is organized as follows. Section 12.2 contains auxiliary results. Theorem 12.2 is proved in Sect. 12.3. Convergence results for the finite-dimensional space $X$ are obtained in Sects. 12.4 and 12.5.

12.2 Auxiliary Results

We use the assumptions, definitions, and the notation introduced in Sect. 12.1.

Lemma 12.9. Assume that

$\lambda > 0, \quad u_* \in S, \quad M_0 > 0, \quad M_1 > 0, \quad L > 0,$   (12.21)

$f(B(u_*, M_0)) \subset B(0, M_1),$   (12.22)

$\|f(z_1) - f(z_2)\| \le L\|z_1 - z_2\|$ for all $z_1, z_2 \in B(u_*, M_0 + \lambda M_1).$   (12.23)

Let

$u \in B(u_*, M_0), \quad v = P_C(u - \lambda f(u)),$   (12.24)

$T := \{w \in X : \langle u - \lambda f(u) - v, w - v\rangle \le 0\},$   (12.25)

let $D$ be a convex and closed subset of $X$ such that

$C \subset D \subset T$   (12.26)

(by Lemma 2.2, $C \subset T$) and let

$\tilde u = P_D(u - \lambda f(v)).$   (12.27)

Then

$\|\tilde u - u_*\|^2 \le \|u - u_*\|^2 - (1 - \lambda^2 L^2)\|u - v\|^2.$   (12.28)

Proof. By (12.22) and (12.24),

$\|f(u)\| \le M_1.$

Together with (12.21), (12.24), and Lemma 2.2, this implies that

$\|u_* - v\| \le \|u_* - (u - \lambda f(u))\| \le M_0 + \lambda M_1.$   (12.29)

By (12.4), (12.21), and (12.24),

$\langle f(v), \tilde u - u_*\rangle \ge \langle f(v), \tilde u - v\rangle.$   (12.30)

In view of (12.25), (12.26), and (12.27),

$\langle \tilde u - v, (u - \lambda f(u)) - v\rangle \le 0.$

This implies that

$\langle \tilde u - v, (u - \lambda f(v)) - v\rangle \le \lambda\langle \tilde u - v, f(u) - f(v)\rangle.$   (12.31)

Set

$z = u - \lambda f(v).$   (12.32)

By (12.27) and (12.32),

$\|\tilde u - u_*\|^2 = \|z - u_*\|^2 + \|z - P_D(z)\|^2 + 2\langle P_D(z) - z, z - u_*\rangle.$   (12.33)

By (12.21), (12.26) and Lemma 2.2,

$2\|z - P_D(z)\|^2 + 2\langle P_D(z) - z, z - u_*\rangle = 2\langle z - P_D(z), u_* - P_D(z)\rangle \le 0.$

Together with (12.27), (12.30), (12.32), and (12.33) this implies that

$\|\tilde u - u_*\|^2 \le \|z - u_*\|^2 - \|z - P_D(z)\|^2$
$= \|u - \lambda f(v) - u_*\|^2 - \|u - \lambda f(v) - \tilde u\|^2$
$= \|u - u_*\|^2 - \|u - \tilde u\|^2 + 2\lambda\langle u_* - \tilde u, f(v)\rangle$
$\le \|u - u_*\|^2 - \|u - \tilde u\|^2 + 2\lambda\langle v - \tilde u, f(v)\rangle.$

Together with (12.31) this implies that

$\|\tilde u - u_*\|^2 \le \|u - u_*\|^2 + 2\lambda\langle v - \tilde u, f(v)\rangle - \langle u - v + v - \tilde u, u - v + v - \tilde u\rangle$
$= \|u - u_*\|^2 - \|u - v\|^2 - \|v - \tilde u\|^2 + 2\langle \tilde u - v, u - v - \lambda f(v)\rangle$
$\le \|u - u_*\|^2 - \|u - v\|^2 - \|v - \tilde u\|^2 + 2\lambda\langle \tilde u - v, f(u) - f(v)\rangle.$   (12.34)

By the Cauchy-Schwarz inequality, (12.23), (12.24), (12.29), and (12.34),

$\|\tilde u - u_*\|^2 \le \|u - u_*\|^2 - \|v - u\|^2 - \|v - \tilde u\|^2 + 2\lambda L\|\tilde u - v\|\|u - v\|$
$\le \|u - u_*\|^2 - \|u - v\|^2 - \|v - \tilde u\|^2 + \lambda^2 L^2\|u - v\|^2 + \|\tilde u - v\|^2$
$\le \|u - u_*\|^2 - (1 - \lambda^2 L^2)\|u - v\|^2.$

Lemma 12.9 is proved.


Lemma 12.10. Let

$u_* \in S, \quad M_0 > 0, \quad M_1 > 0, \quad L > 0, \quad \delta \in (0, 1),$   (12.35)

$f(B(u_*, M_0)) \subset B(0, M_1),$   (12.36)

$\|f(z_1) - f(z_2)\| \le L\|z_1 - z_2\|$ for all $z_1, z_2 \in B(u_*, M_0 + M_1 + 1),$   (12.37)

$\lambda \in (0, 1], \quad \lambda L < 1.$   (12.38)

Assume that

$x \in B(u_*, M_0), \quad y \in X, \quad \|y - P_C(x - \lambda f(x))\| \le \delta,$   (12.39)

$\tilde x \in X, \quad \|\tilde x - P_C(x - \lambda f(y))\| \le \delta.$   (12.40)

Then

$\|\tilde x - u_*\|^2 \le 4\delta(1 + M_0) + \|x - u_*\|^2 - (1 - \lambda^2 L^2)\|x - P_C(x - \lambda f(x))\|^2.$

Proof. Set

$v = P_C(x - \lambda f(x)), \quad z = P_C(x - \lambda f(v)), \quad \tilde z = P_C(x - \lambda f(y)).$   (12.41)

By Lemma 12.9 applied with $u = x$ and $D = C$, (12.41), (12.35), (12.36), (12.37), (12.38), and (12.39),

$\|z - u_*\|^2 \le \|x - u_*\|^2 - (1 - \lambda^2 L^2)\|x - v\|^2.$   (12.42)

Clearly,

$\|\tilde x - u_*\|^2 = \|\tilde x - z + z - u_*\|^2 = \|\tilde x - z\|^2 + 2\langle \tilde x - z, z - u_*\rangle + \|z - u_*\|^2$
$\le \|\tilde x - z\|^2 + 2\|\tilde x - z\|\|z - u_*\| + \|z - u_*\|^2.$   (12.43)

By (12.39) and (12.41),

$\|v - y\| \le \delta.$   (12.44)

It follows from (12.41), (12.35), Lemma 2.2, (12.39), and (12.36) that

$\|u_* - v\| \le \|u_* - x\| + \lambda\|f(x)\| \le M_0 + \lambda M_1.$   (12.45)

By (12.45), (12.44), and (12.35),

$\|u_* - y\| \le \|u_* - v\| + \|v - y\| \le M_0 + \lambda M_1 + 1.$   (12.46)

By (12.46), (12.41), Lemma 2.2, (12.45), (12.40), (12.37), (12.38), and (12.44),

$\|\tilde x - z\| \le \|\tilde x - \tilde z\| + \|\tilde z - z\| \le \delta + \lambda\|f(y) - f(v)\| \le \delta + \lambda L\delta = \delta(1 + \lambda L).$   (12.47)

By (12.43), (12.47), (12.42), (12.41), (12.38), and (12.39),

$\|\tilde x - u_*\|^2 \le (\delta(1 + \lambda L))^2 + 2\delta(1 + \lambda L)\|z - u_*\| + \|z - u_*\|^2$
$\le (\delta(1 + \lambda L))^2 + 2\delta(1 + \lambda L)\|x - u_*\| + \|z - u_*\|^2$
$\le \delta^2(1 + \lambda L)^2 + 2\delta(1 + \lambda L)\|x - u_*\| + \|x - u_*\|^2 - (1 - \lambda^2 L^2)\|x - P_C(x - \lambda f(x))\|^2$
$\le 4\delta^2 + 4\delta M_0 + \|x - u_*\|^2 - (1 - \lambda^2 L^2)\|x - P_C(x - \lambda f(x))\|^2.$

This completes the proof of Lemma 12.10.

12.3 Proof of Theorem 12.2

By (12.5) there is

u 2 S \ B.0; M0 /: (12.48)

By (12.13) and (12.48),

kx0  u k  2M0 : (12.49)

Assume that i  0 is an integer and that

xi 2 B.u ; 2M0 /: (12.50)

(Note that for i D 0 (12.50) holds.) By Lemma 12.10 applied with x D xi , y D yi ,


xQ D xiC1 and  D i , (12.48), (12.50), (12.6), (12.7), (12.12), (12.14), and (12.15),

kxiC1  u k2  4ı.1 C 2M0 / C kxi  u k2

 .1  i2 L2 /kxi  PC .xi  i f .xi //k2 : (12.51)


194 12 The Extragradient Method for Solving Variational Inequalities

There are two cases:

kxi  yi k  20 I (12.52)

kxi  yi k > 20 : (12.53)

Assume that (12.53) holds. Then by (12.53), (12.14), (12.10), and the inequality
0 < 1,

kxi  PC .xi  i f .xi //k


 kxi  yi k  k  yi C PC .xi  i f .xi //k
> 20  ı > 0 > 02 : (12.54)

Then in view of (12.54), (12.51), (12.12), and (12.10),

kxiC1  u k2
 4ı.1 C 2M0 / C kxi  u k2  02 .1  2 L2 /
 kxi  u k2  02 .1  2 L2 /21 : (12.55)

Thus we have shown that the following property holds:


(P) if an integer i  0 satisfies (12.50) and (12.53), then (12.55) holds.
Property (P), (12.52), (12.53), and (12.49) imply that at least one of the following
cases holds:
(a) for all integers i D 0; : : : ; k the relations (12.50), (12.53), and (12.55) are true;
(b) there is an integer j 2 f0; : : : ; kg such that for all integers i D 0; : : : ; j, (12.50) is
valid, for all integers i satisfying 0  i < j (12.53) holds and that

kxj  yj k  20 : (12.56)

Assume that case (a) holds. Then by (12.49) and (12.55),

4M02  ku  x0 k2  ku  xk k2


X
k1
D Œku  xi k2  ku  xiC1 k2   21 k02 .1  2 L2 /
iD0

and

k  8M02 02 .1  2 L2 /1 :


12.3 Proof of Theorem 12.2 195

This contradicts (12.11). The contradiction we have reached proves that case (a) does not hold. Then case (b) holds and there is an integer $j \in \{0, \dots, k\}$ guaranteed by (b). Then (12.16) and (12.17) hold.

Assume that an integer $j \in [0, k]$ satisfies (12.17). (Clearly, in view of (b) such an integer $j$ is unique.) Then
$$\|x_j - u^*\| \le 2M_0, \quad \|x_j - y_j\| \le 2\gamma_0. \quad (12.57)$$

By (12.57), (12.10), and (12.14),
$$\|x_j - P_C(x_j - \alpha_j f(x_j))\| \le \|x_j - y_j\| + \|y_j - P_C(x_j - \alpha_j f(x_j))\| \le 2\gamma_0 + \delta \le 3\gamma_0. \quad (12.58)$$

By Lemma 2.2, for each $\xi \in C$,
$$0 \ge \langle x_j - \alpha_j f(x_j) - P_C(x_j - \alpha_j f(x_j)), \ \xi - P_C(x_j - \alpha_j f(x_j)) \rangle. \quad (12.59)$$

By (12.58) and (12.59), for each $\xi \in C$,
$$0 \ge \langle x_j - P_C(x_j - \alpha_j f(x_j)), \ \xi - P_C(x_j - \alpha_j f(x_j)) \rangle - \alpha_j \langle f(x_j), \ \xi - P_C(x_j - \alpha_j f(x_j)) \rangle$$
$$\ge -\|x_j - P_C(x_j - \alpha_j f(x_j))\|\left(\|\xi - x_j\| + \|x_j - P_C(x_j - \alpha_j f(x_j))\|\right) - \alpha_j \langle f(x_j), \xi - x_j \rangle - \alpha_j \langle f(x_j), \ x_j - P_C(x_j - \alpha_j f(x_j)) \rangle$$
$$\ge -3\gamma_0 \|\xi - x_j\| - 9\gamma_0^2 - \alpha_j \langle f(x_j), \xi - x_j \rangle - 3\alpha_j \|f(x_j)\| \gamma_0. \quad (12.60)$$

By (12.60), (12.12), (12.6), (12.9), (12.57), (12.48), for each $\xi \in C$,
$$\alpha_j \langle f(x_j), \xi - x_j \rangle \ge -3\gamma_0(1 + \bar{\alpha} M_1) - 3\gamma_0 \|\xi - x_j\|$$
and in view of (12.8) and (12.12),
$$\langle f(x_j), \xi - x_j \rangle \ge -3\gamma_0 \alpha_j^{-1}(1 + \bar{\alpha} M_1) - 3\gamma_0 \alpha_j^{-1}\|\xi - x_j\| \ge -3\gamma_0 \tilde{\alpha}^{-1}(1 + M_1) - 3\gamma_0 \tilde{\alpha}^{-1}\|\xi - x_j\|. \quad (12.61)$$

By (12.9) and (12.61),
$$\langle f(x_j), \xi - x_j \rangle \ge -\epsilon - \epsilon\|\xi - x_j\| \ \text{ for all } \xi \in C. \quad (12.62)$$

Clearly, (12.62) is the claimed (12.18). Set
$$\bar{y} = P_C(x_j - \alpha_j f(x_j)). \quad (12.63)$$

By (12.63) and (12.58),
$$\bar{y} \in C, \quad \|x_j - \bar{y}\| \le 3\gamma_0 \le \epsilon < 1. \quad (12.64)$$

By (12.57), (12.48), (12.64), and (12.7),
$$\|f(x_j) - f(\bar{y})\| \le L\|x_j - \bar{y}\| \le 3\gamma_0 L. \quad (12.65)$$

By (12.64), (12.57), (12.48), (12.6), (12.9), (12.61), (12.65), and (12.4), for each $\xi \in C$,
$$\langle f(\bar{y}), \xi - \bar{y} \rangle \ge \langle f(\bar{y}), \xi - x_j \rangle - \|f(\bar{y})\|\,\|x_j - \bar{y}\| \ge \langle f(\bar{y}), \xi - x_j \rangle - 3M_1\gamma_0$$
$$\ge \langle f(x_j), \xi - x_j \rangle - \|f(\bar{y}) - f(x_j)\|\,\|\xi - x_j\| - 3M_1\gamma_0$$
$$\ge -3\gamma_0\tilde{\alpha}^{-1}(1 + M_1) - 3\gamma_0\tilde{\alpha}^{-1}\|\xi - x_j\| - 3\gamma_0 L\|\xi - x_j\| - 3M_1\gamma_0$$
$$\ge -3\gamma_0\left(M_1 + \tilde{\alpha}^{-1}(1 + M_1)\right) - 3\gamma_0(L + \tilde{\alpha}^{-1})\left(\|\xi - \bar{y}\| + \|\bar{y} - x_j\|\right)$$
$$\ge -3\gamma_0(L + \tilde{\alpha}^{-1})\|\xi - \bar{y}\| - 3\gamma_0\left(M_1 + \tilde{\alpha}^{-1}(1 + M_1) + (L + \tilde{\alpha}^{-1})\right) \ge -\epsilon\|\xi - \bar{y}\| - \epsilon$$
for all $\xi \in C$. Thus $\bar{y} \in S_\epsilon$. This completes the proof of Theorem 12.2.

12.4 The Finite-Dimensional Case

We use the assumptions, definitions, and notation introduced in Sect. 12.1 and we
prove the following result.
Theorem 12.11. Let $X = R^n$, $\epsilon \in (0, 1)$, $M_0 > 0$ be such that
$$B(0, M_0) \cap S \ne \emptyset,$$
$M_1 > 0$ be such that
$$f(B(0, 3M_0 + 1)) \subset B(0, M_1),$$
$L > 0$ be such that
$$\|f(z_1) - f(z_2)\| \le L\|z_1 - z_2\|$$
for all $z_1, z_2 \in B(0, 3M_0 + M_1 + 1)$, and let
$$0 < \alpha \le 1, \quad \alpha L < 1.$$

Then there exist $\delta \in (0, \epsilon)$ and an integer $k \ge 1$ such that for each $\{x_i\}_{i=0}^\infty \subset R^n$ and each $\{y_i\}_{i=0}^\infty \subset R^n$ which satisfy $\|x_0\| \le M_0$ and, for each integer $i \ge 0$,
$$\|y_i - P_C(x_i - \alpha f(x_i))\| \le \delta, \quad \|x_{i+1} - P_C(x_i - \alpha f(y_i))\| \le \delta,$$
there is an integer $j \in [0, k]$ such that
$$\|x_j\| \le 3M_0 \ \text{ and } \ \inf\{\|x_j - z\| : z \in S\} \le \epsilon.$$

Theorem 12.11 follows immediately from Theorem 12.2 and the following result.
Lemma 12.12. Let $M_0 > 0$, $\epsilon > 0$. Then there exists $\gamma \in (0, \epsilon)$ such that for each
$$z \in S_\gamma \cap B(0, M_0)$$
the following relation holds:
$$\inf\{\|z - u\| : u \in S\} \le \epsilon.$$

Proof. Assume the contrary. Then there exist a sequence $\{\gamma_k\}_{k=1}^\infty \subset (0, \epsilon)$ which converges to zero and a sequence
$$z^{(k)} \in B(0, M_0) \cap S_{\gamma_k}, \quad k = 1, 2, \dots$$
such that for each integer $k \ge 1$,
$$\inf\{\|z^{(k)} - u\| : u \in S\} > \epsilon. \quad (12.66)$$

We may assume without loss of generality that there is
$$z = \lim_{k \to \infty} z^{(k)}.$$

By definition, for each integer $k \ge 1$ and each $\xi \in C$,
$$\langle f(z^{(k)}), \xi - z^{(k)} \rangle \ge -\gamma_k\|\xi - z^{(k)}\| - \gamma_k.$$

This implies that for each $\xi \in C$,
$$\langle f(z), \xi - z \rangle = \lim_{k \to \infty} \langle f(z^{(k)}), \xi - z^{(k)} \rangle \ge \lim_{k \to \infty}\left(-\gamma_k\|\xi - z^{(k)}\| - \gamma_k\right) = 0.$$

Thus $z \in S$ and for all natural numbers $k$ large enough
$$\|z - z^{(k)}\| < \epsilon/2.$$
This contradicts (12.66). The contradiction we have reached proves Lemma 12.12.
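In $R^n$ the iteration covered by Theorem 12.11 is easy to experiment with. Below is a minimal numerical sketch (not from the book): the data are an assumed affine monotone map $f(x) = Ax - b$ with positive definite $A$, a box $C$ (so $P_C$ is a coordinatewise clip), and computational errors bounded by a small constant $\delta$, modeled as random perturbations:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem data (assumption, for illustration only):
# C is the box [-1, 2]^2 and f(x) = A x - b with A positive definite,
# so f is Lipschitz and monotone and S consists of a single point.
lo, hi = -1.0, 2.0
A = np.array([[2.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])

def f(x):
    return A @ x - b

def proj_C(x):
    # Projection onto a box is a coordinatewise clip.
    return np.clip(x, lo, hi)

tau = 0.2     # step size with tau * L < 1 (here L = ||A|| = 3)
delta = 1e-6  # bound on the computational error

x = np.array([2.0, 2.0])
for _ in range(300):
    # y_i approximates P_C(x_i - tau f(x_i)) up to an error of size delta
    y = proj_C(x - tau * f(x)) + delta * rng.uniform(-1, 1, 2)
    # x_{i+1} approximates P_C(x_i - tau f(y_i)) up to an error of size delta
    x = proj_C(x - tau * f(y)) + delta * rng.uniform(-1, 1, 2)

print(x)  # close to the solution of the variational inequality
```

For this data the exact solution is $A^{-1}b = (1, -1)$, which lies in the box, and the iterates approach it up to the error level $\delta$, in line with the theorem.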

12.5 A Convergence Result

We use the assumptions, definitions, and notation introduced in Sect. 12.1. Let $X = R^n$. For each $x \in R^n$ and each nonempty set $A \subset R^n$ set
$$d(x, A) = \inf\{\|x - y\| : y \in A\}.$$

Suppose that the set $S$ is bounded and choose $\bar{M}_0 > 2$ and $\bar{M}_1, \bar{M}_2 > 0$ such that
$$S \subset B(0, \bar{M}_0 - 2), \quad (12.67)$$
$$f(B(0, 3\bar{M}_0 + 1)) \subset B(0, \bar{M}_1), \quad f(B(0, 3\bar{M}_0 + 3\bar{M}_1 + 1)) \subset B(0, \bar{M}_2). \quad (12.68)$$

Assume that
$$M_0 > \bar{M}_0 + \bar{M}_1 + \bar{M}_2, \quad M_1 > 0, \quad L > 0, \quad (12.69)$$
$$f(B(0, 3M_0 + 1)) \subset B(0, M_1), \quad (12.70)$$
$$\|f(z_1) - f(z_2)\| \le L\|z_1 - z_2\| \ \text{ for all } z_1, z_2 \in B(0, 3M_0 + M_1 + 1), \quad (12.71)$$
$$0 < \alpha \le 1 \ \text{ and } \ \alpha L < 1. \quad (12.72)$$

By Theorem 12.11 there exist
$$\gamma \in (0, 4^{-1}) \quad (12.73)$$
and a natural number $\bar{k}$ such that the following property holds:

(P2) for each pair of sequences $\{x_i\}_{i=0}^\infty \subset R^n$ and $\{y_i\}_{i=0}^\infty \subset R^n$ with $\|x_0\| \le M_0$ and such that for each integer $i \ge 0$,
$$\|y_i - P_C(x_i - \alpha f(x_i))\| \le \gamma, \quad \|x_{i+1} - P_C(x_i - \alpha f(y_i))\| \le \gamma, \quad (12.74)$$
there is an integer $j \in [0, \bar{k}]$ such that $d(x_j, S) \le 1/4$.

We prove the following result.
We prove the following result.

Theorem 12.13. Let
$$\{\delta_i\}_{i=0}^\infty \subset (0, \gamma], \quad \lim_{i \to \infty} \delta_i = 0 \quad (12.75)$$
and let $\epsilon \in (0, 1)$. Then there exists a natural number $k_0$ such that for each pair of sequences $\{x_i\}_{i=0}^\infty \subset R^n$ and $\{y_i\}_{i=0}^\infty \subset R^n$ which satisfies
$$\|x_0\| \le M_0 \quad (12.76)$$
and, for each integer $i \ge 0$,
$$\|y_i - P_C(x_i - \alpha f(x_i))\| \le \delta_i, \quad \|x_{i+1} - P_C(x_i - \alpha f(y_i))\| \le \delta_i, \quad (12.77)$$
the inequality
$$d(x_i, S) \le \epsilon$$
holds for all integers $i \ge k_0$.


Proof. By Theorem 12.11, there exist a positive number
$$\bar{\delta} < \epsilon^2(1 + M_0)^{-1}64^{-1} \quad (12.78)$$
and an integer $k_1 \ge 1$ such that the following property holds:

(P3) for each pair of sequences $\{u_i\}_{i=0}^\infty \subset R^n$ and $\{v_i\}_{i=0}^\infty \subset R^n$ such that
$$\|u_0\| \le M_0 \quad (12.79)$$
and, for each integer $i \ge 0$,
$$\|v_i - P_C(u_i - \alpha f(u_i))\| \le \bar{\delta}, \quad \|u_{i+1} - P_C(u_i - \alpha f(v_i))\| \le \bar{\delta}, \quad (12.80)$$
there is an integer $j \in [0, k_1]$ such that
$$d(u_j, S) \le \epsilon/8. \quad (12.81)$$

By (12.75) there is an integer $k_2 \ge 1$ such that
$$\delta_i < k_1^{-2}\bar{\delta} \ \text{ for all integers } i \ge k_2. \quad (12.82)$$

Set
$$k_0 = 2 + \bar{k} + k_1 + k_2. \quad (12.83)$$

Assume that sequences $\{x_i\}_{i=0}^\infty \subset R^n$ and $\{y_i\}_{i=0}^\infty \subset R^n$ satisfy (12.76) and that for each integer $i \ge 0$ relation (12.77) holds. Assume that an integer $j \ge 0$ satisfies
$$\|x_j\| \le M_0. \quad (12.84)$$

We show that there exists an integer $i \in [1 + j, 1 + j + \bar{k}]$ such that $\|x_i\| \le M_0$.
By (12.75), (12.77), (12.84), and (P2) there is an integer $p \in [j, j + \bar{k}]$ such that
$$d(x_p, S) \le 1/4. \quad (12.85)$$

In view of (12.67), (12.69), and (12.85),
$$\|x_p\| \le M_0.$$

If $p > j$ we put $i = p$ and obtain
$$i \in [j + 1, j + \bar{k}], \quad \|x_i\| \le M_0. \quad (12.86)$$

Assume that $p = j$. Then in view of (12.67) and (12.85),
$$\|x_j\| \le \bar{M}_0 - 2 + 1/4. \quad (12.87)$$

By Lemma 2.2, (12.73), (12.85), (12.87), (12.68), (12.72), (12.77), and (12.75),
$$\|y_j\| \le \gamma + \|P_C(x_j - \alpha f(x_j))\| \le \gamma + \|P_C(x_j)\| + \alpha\|f(x_j)\| \le \|x_j\| + 1/2 + \alpha\|f(x_j)\| \le \bar{M}_0 - 2 + 3/4 + \bar{M}_1. \quad (12.88)$$

By (12.88) and (12.68),
$$\|f(y_j)\| \le \bar{M}_2. \quad (12.89)$$

By (12.69), (12.77), (12.75), (12.73), (12.85), (12.72), (12.87), (12.89), and Lemma 2.2,
$$\|x_{j+1}\| \le 1/4 + \|P_C(x_j - \alpha f(y_j))\| \le \|P_C(x_j)\| + \alpha\|f(y_j)\| + 1/4 \le \|x_j\| + 1/2 + \|f(y_j)\| \le \bar{M}_0 + \bar{M}_2 < M_0. \quad (12.90)$$

By (12.90), (12.77), (12.75), and (P2) there exists an integer $i \in [j + 1, j + 1 + \bar{k}]$ such that $d(x_i, S) \le 1/4$, and together with (12.67) and (12.69) this implies that $\|x_i\| < M_0$. Thus we have shown that the following property holds:

(P4) if an integer $j \ge 0$ satisfies $\|x_j\| \le M_0$, then there is an integer $i \in [j + 1, j + 1 + \bar{k}]$ such that $\|x_i\| \le M_0$.

Set
$$j_0 = \sup\{i : i \text{ is an integer}, \ i \le k_2 \text{ and } \|x_i\| \le M_0\}. \quad (12.91)$$

By (12.76) the number $j_0$ is well defined and satisfies
$$0 \le j_0 \le k_2. \quad (12.92)$$

In view of (P4) and (12.91),
$$j_0 + 1 + \bar{k} \ge k_2. \quad (12.93)$$

By (P4) and (12.91) there is an integer $j_1$ such that
$$j_1 \in [j_0 + 1, j_0 + 1 + \bar{k}] \ \text{ and } \ \|x_{j_1}\| \le M_0. \quad (12.94)$$

By (12.91) and (12.94),
$$j_1 > k_2, \quad j_1 - k_2 \le j_1 - j_0 \le 1 + \bar{k}. \quad (12.95)$$

Assume that an integer $j \ge j_1$ satisfies
$$\|x_j\| \le M_0. \quad (12.96)$$

We show that there is an integer $i \in [j + 1, j + 1 + k_1]$ such that $d(x_i, S) \le \epsilon/8$.
By (P3), (12.96), (12.77), (12.95), and (12.82) there is an integer $p \in [j, j + k_1]$ such that
$$d(x_p, S) \le \epsilon/8. \quad (12.97)$$

If $p > j$, then we set $i = p$. Assume that $p = j$. Clearly, (12.87)–(12.90) hold and
$$\|x_{j+1}\| \le M_0. \quad (12.98)$$

By (P3), (12.98), (12.77), (12.95), and (12.82) there is an integer $i \in [j + 1, j + k_1 + 1]$ for which $d(x_i, S) \le \epsilon/8$. Thus we have shown that the following property holds:

(P5) if an integer $j \ge j_1$ and $\|x_j\| \le M_0$, then there is an integer $i \in [j + 1, j + 1 + k_1]$ such that $d(x_i, S) \le \epsilon/8$.

(P5), (12.94), (12.67), and (12.69) imply that there exists a sequence of natural numbers $\{j_p\}_{p=1}^\infty$ such that for each integer $p \ge 1$,
$$1 \le j_{p+1} - j_p \le 1 + k_1 \quad (12.99)$$
and for each integer $p \ge 2$,
$$d(x_{j_p}, S) \le \epsilon/8. \quad (12.100)$$

We show that
$$d(x_i, S) \le \epsilon \ \text{ for all integers } i \ge j_2.$$

Set
$$\epsilon_0 = 4^{-1}\epsilon k_1^{-1}. \quad (12.101)$$

Let $p \ge 2$ be an integer. We show that for each integer $l$ satisfying $0 \le l < j_{p+1} - j_p$,
$$d(x_{j_p + l}, S) \le (\epsilon/8) + l\epsilon_0. \quad (12.102)$$

By (12.100), estimate (12.102) holds for $l = 0$. Assume that an integer $l$ satisfies
$$0 \le l < j_{p+1} - j_p \quad (12.103)$$
and that (12.102) holds. By (12.102), (12.103), (12.99), and (12.101),
$$d(x_{j_p + l}, S) \le (\epsilon/8) + k_1\epsilon_0 < \epsilon/2. \quad (12.104)$$

By (12.102) there is $u^* \in S$ such that
$$d(x_{j_p + l}, S) = \|x_{j_p + l} - u^*\| \le (\epsilon/8) + l\epsilon_0. \quad (12.105)$$

It follows from (12.67), (12.69)–(12.72), (12.105), (12.99), (12.101)–(12.103), (12.82), (12.78), (12.77), (12.95), and Lemma 12.10 applied with $u^*$, $M_0$, $M_1$, $L$,
$$\delta = \delta_{j_p + l}, \quad x = x_{j_p + l}, \quad y = y_{j_p + l}, \quad \tilde{x} = x_{j_p + l + 1}$$
that
$$\|x_{j_p + l + 1} - u^*\| \le \|x_{j_p + l} - u^*\| + \left(4\delta_{j_p + l}(1 + M_0)\right)^{1/2} \le \epsilon/8 + l\epsilon_0 + \left(4\bar{\delta}k_1^{-2}(1 + M_0)\right)^{1/2} \le \epsilon/8 + l\epsilon_0 + \epsilon k_1^{-1}/4 = \epsilon/8 + (l + 1)\epsilon_0.$$

This implies that
$$d(x_{j_p + l + 1}, S) \le (\epsilon/8) + (l + 1)\epsilon_0.$$

Thus by induction we have shown that relation (12.102) holds for all $l = 0, \dots, j_{p+1} - j_p$, and it follows from (12.102), (12.99), and (12.101) that for all integers $l = 0, \dots, j_{p+1} - j_p - 1$,
$$d(x_{j_p + l}, S) \le (\epsilon/8) + l\epsilon_0 \le \epsilon/8 + k_1\epsilon_0 \le \epsilon/2.$$

Since the inequality above holds for all integers $p \ge 2$, we conclude that
$$d(x_i, S) \le \epsilon/2 \ \text{ for all integers } i \ge j_2. \quad (12.106)$$

By (12.99), (12.95), and (12.83),
$$j_2 \le k_1 + j_1 + 1 \le k_1 + 2 + \bar{k} + k_2 = k_0.$$

Together with (12.106) this implies that
$$d(x_i, S) \le \epsilon/2$$
for all integers $i \ge k_0$. Theorem 12.13 is proved.
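Theorem 12.13 can also be checked numerically. A minimal sketch under assumed data (not from the book): $f(x) = x - b$ on a box $C$ containing $b$, so that $S = \{b\}$, with computational errors $\delta_i \to 0$ as the theorem requires:

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed illustrative data: f(x) = x - b is 1-Lipschitz and strongly
# monotone, C is the box [-1, 1]^2, and b lies in C, so S = {b}.
b = np.array([0.5, -0.25])
f = lambda x: x - b
proj_C = lambda x: np.clip(x, -1.0, 1.0)  # projection onto the box C
tau = 0.5                                 # tau * L = 0.5 < 1

x = np.array([1.0, 1.0])
for i in range(500):
    d_i = 0.25 / (i + 1) ** 2  # error bounds delta_i decaying to zero
    y = proj_C(x - tau * f(x)) + d_i * rng.uniform(-1, 1, 2)
    x = proj_C(x - tau * f(y)) + d_i * rng.uniform(-1, 1, 2)

print(np.linalg.norm(x - b))  # d(x_i, S) becomes and stays small
```

Because the errors tend to zero, the distance $d(x_i, S)$ eventually drops below any prescribed $\epsilon$ and stays there, which is exactly the conclusion of the theorem.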


Chapter 13
A Common Solution of a Family of Variational
Inequalities

In a Hilbert space, we study the convergence of the subgradient method to a common


solution of a finite family of variational inequalities and of a finite family of fixed
point problems under the presence of computational errors. The convergence of the
subgradient method is established for nonsummable computational errors. We show
that the subgradient method generates a good approximate solution, if the sequence
of computational errors is bounded from above by a constant.

13.1 Preliminaries and the Main Result

Let $(X, \langle \cdot, \cdot \rangle)$ be a Hilbert space equipped with an inner product $\langle \cdot, \cdot \rangle$ which induces a complete norm $\|\cdot\|$. We denote by $\mathrm{Card}(A)$ the cardinality of the set $A$. For every point $x \in X$ and every nonempty set $A \subset X$ define
$$d(x, A) := \inf\{\|x - y\| : y \in A\}.$$

For every point $x \in X$ and every positive number $r$ put
$$B(x, r) = \{y \in X : \|x - y\| \le r\}.$$

Let $\bar{c} \in (0, 1)$ and $0 < \tilde{\lambda} < \bar{\lambda} \le 1$.
Let $C$ be a nonempty closed convex subset of $X$. In view of Lemma 2.2, for every point $x \in X$ there is a unique point $P_C(x) \in C$ satisfying
$$\|x - P_C(x)\| = \inf\{\|x - y\| : y \in C\}.$$

Moreover,
$$\|P_C(x) - P_C(y)\| \le \|x - y\|$$

© Springer International Publishing Switzerland 2016 205


A.J. Zaslavski, Numerical Optimization with Computational Errors, Springer
Optimization and Its Applications 108, DOI 10.1007/978-3-319-30921-7_13
206 13 A Common Solution of a Family of Variational Inequalities

for all $x, y \in X$, and for each $x \in X$ and each $z \in C$,
$$\langle z - P_C(x), \ x - P_C(x) \rangle \le 0.$$

Let $\mathcal{L}_1$ be a finite set of pairs $(f, C)$, where $C$ is a nonempty closed convex subset of $X$ and $f : X \to X$, and let $\mathcal{L}_2$ be a finite set of mappings $T : X \to X$. We suppose that the set $\mathcal{L}_1 \cup \mathcal{L}_2$ is nonempty. (Note that one of the sets $\mathcal{L}_1$ or $\mathcal{L}_2$ may be empty.)
We suppose that for each $(f, C) \in \mathcal{L}_1$ the mapping $f$ is Lipschitz on all bounded subsets of $X$, the set
$$S(f, C) := \{x \in C : \langle f(x), y - x \rangle \ge 0 \text{ for all } y \in C\} \quad (13.1)$$
is nonempty, and that
$$\langle f(y), y - x \rangle \ge 0 \ \text{ for all } y \in C \text{ and all } x \in S(f, C). \quad (13.2)$$

Evidently, every point $x \in S(f, C)$ is considered as a solution of the variational inequality associated with the pair $(f, C) \in \mathcal{L}_1$:

Find $x \in C$ such that $\langle f(x), y - x \rangle \ge 0$ for all $y \in C$.

For every pair $(f, C) \in \mathcal{L}_1$ and every positive number $\epsilon$ define
$$S_\epsilon(f, C) = \{x \in C : \langle f(x), y - x \rangle \ge -\epsilon\|y - x\| - \epsilon \text{ for all } y \in C\}. \quad (13.3)$$

Note that this set was introduced in Chap. 12 and that every point $x \in S_\epsilon(f, C)$ is an $\epsilon$-approximate solution of the variational inequality associated with the pair $(f, C) \in \mathcal{L}_1$. The examples considered in Chap. 12 show that elements of $S_\epsilon(f, C)$ can be considered as $\epsilon$-approximate solutions of the corresponding variational inequality.
We suppose that for every mapping $T \in \mathcal{L}_2$,
$$\mathrm{Fix}(T) := \{z \in X : T(z) = z\} \ne \emptyset, \quad (13.4)$$
$$\|T(z_1) - T(z_2)\| \le \|z_1 - z_2\| \ \text{ for all } z_1, z_2 \in X, \quad (13.5)$$
$$\|z - x\|^2 \ge \|z - T(x)\|^2 + \bar{c}\|x - T(x)\|^2 \ \text{ for all } x \in X \text{ and all } z \in \mathrm{Fix}(T). \quad (13.6)$$

For every mapping $T \in \mathcal{L}_2$ and every positive number $\epsilon$ define
$$\mathrm{Fix}_\epsilon(T) := \{z \in X : \|T(z) - z\| \le \epsilon\}. \quad (13.7)$$

Suppose that the set
$$S := \left[\cap_{(f,C) \in \mathcal{L}_1} S(f, C)\right] \cap \left[\cap_{T \in \mathcal{L}_2} \mathrm{Fix}(T)\right] \ne \emptyset. \quad (13.8)$$



Let $\lambda > 0$ and $(f, C) \in \mathcal{L}_1$. For every point $x \in X$ define
$$Q_{\lambda, f, C}(x) = P_C(x - \lambda f(x)), \quad (13.9)$$
$$P_{\lambda, f, C}(x) = P_C\left(x - \lambda f(Q_{\lambda, f, C}(x))\right). \quad (13.10)$$

Let a natural number
$$l \ge \mathrm{Card}(\mathcal{L}_1 \cup \mathcal{L}_2).$$

Denote by $\mathcal{R}$ the set of all mappings
$$\mathcal{A} : \{0, 1, 2, \dots\} \to \mathcal{L}_2 \cup \{P_{\lambda, f, C} : (f, C) \in \mathcal{L}_1, \ \lambda \in [\tilde{\lambda}, \bar{\lambda}]\}$$
such that the following properties hold:

(P1) for every nonnegative integer $p$ and every mapping $T \in \mathcal{L}_2$ there exists an integer $i \in \{p, \dots, p + l - 1\}$ such that $\mathcal{A}(i) = T$;
(P2) for each integer $p \ge 0$ and every pair $(f, C) \in \mathcal{L}_1$ there exist an integer $i \in \{p, \dots, p + l - 1\}$ and $\lambda \in [\tilde{\lambda}, \bar{\lambda}]$ such that $\mathcal{A}(i) = P_{\lambda, f, C}$.

We are interested in finding solutions of the inclusion $x \in S$. In order to meet this goal we apply algorithms generated by $\mathcal{A} \in \mathcal{R}$. More precisely, we associate with any $\mathcal{A} \in \mathcal{R}$ the algorithm which generates, for any starting point $x_0 \in X$, a sequence $\{x_k\}_{k=0}^\infty \subset X$, where
$$x_{k+1} := [\mathcal{A}(k)](x_k), \quad k = 0, 1, \dots.$$

According to the results known in the literature, this sequence should converge weakly to an element of $S$. In this chapter, we study the behavior of the sequences generated by $\mathcal{A} \in \mathcal{R}$ taking into account computational errors which are always present in practice. Namely, in practice the algorithm associated with $\mathcal{A} \in \mathcal{R}$ generates a sequence $\{x_k\}_{k=0}^\infty$ such that for each integer $k \ge 0$,

if $\mathcal{A}(k) = T \in \mathcal{L}_2$, then $\|x_{k+1} - \mathcal{A}(k)(x_k)\| \le \delta$,

and if
$$\mathcal{A}(k) = P_{\lambda, f, C} \ \text{ with } \lambda \in [\tilde{\lambda}, \bar{\lambda}], \ (f, C) \in \mathcal{L}_1,$$
then there is $v_k \in X$ such that
$$\|v_k - Q_{\lambda, f, C}(x_k)\| \le \delta, \quad \|x_{k+1} - P_C(x_k - \lambda f(v_k))\| \le \delta$$

with a constant $\delta > 0$ which depends only on our computer system. Surely, in this situation one cannot expect that the sequence $\{x_k\}_{k=0}^\infty$ converges to the set $S$. Our goal is to understand what subset of $X$ attracts all sequences $\{x_k\}_{k=0}^\infty$ generated by algorithms associated with $\mathcal{A} \in \mathcal{R}$. In Chap. 12 we showed that in the case when $\mathcal{L}_2 = \emptyset$ and the set $\mathcal{L}_1$ is a singleton, this subset of $X$ is the set of $\epsilon$-approximate solutions of the corresponding variational inequality with some $\epsilon > 0$ depending on $\delta$. In this chapter we generalize the main result of Chap. 12 and show that in the general case (see Theorem 13.1 stated below) this subset of $X$ is the set
$$S_\epsilon := \{x \in X : x \in \mathrm{Fix}_\epsilon(T) \text{ for each } T \in \mathcal{L}_2 \ \text{ and } \ d(x, S_\epsilon(f, C)) \le \epsilon \text{ for all } (f, C) \in \mathcal{L}_1\}$$
with some $\epsilon > 0$ depending on $\delta$ [see (13.15) and (13.17)].
Our goal is also, for a given $\epsilon > 0$, to find a point $x \in S_\epsilon$. This point $x$ is considered as an $\epsilon$-approximate common solution of the problems associated with the family of operators $\mathcal{L}_1 \cup \mathcal{L}_2$. We will prove the following result (Theorem 13.1), which shows that an $\epsilon$-approximate common solution can be obtained after $l(n_0 - 1)$ iterations of the algorithm associated with $\mathcal{A} \in \mathcal{R}$ in the presence of computational errors bounded from above by a constant $\delta$, where $\delta$ and $n_0$ are constants depending on $\epsilon$ [see (13.15)–(13.17)]. This result was obtained in [128].
Theorem 13.1. Let $\epsilon \in (0, 1]$, $M_0 > 0$ be such that
$$B(0, M_0) \cap S \ne \emptyset \quad (13.11)$$
and let $M_1 > 0$, $L > 0$ be such that for each $(f, C) \in \mathcal{L}_1$,
$$f(B(0, 3M_0 + 2)) \subset B(0, M_1), \quad (13.12)$$
$$\|f(z_1) - f(z_2)\| \le L\|z_1 - z_2\| \ \text{ for all } z_1, z_2 \in B(0, 5M_0 + M_1 + 2) \quad (13.13)$$
and
$$\bar{\lambda} L < 1. \quad (13.14)$$

Let $\epsilon_1 \in (0, 2^{-1})$ satisfy
$$4\epsilon_1\left(5l\tilde{\lambda}^{-1} + 2\tilde{\lambda}^{-1}M_1 + 4L + 2M_1\right) \le \epsilon, \quad (13.15)$$
let an integer
$$n_0 > 16M_0^2\,\bar{c}^{-1}\left(1 - (\bar{\lambda}L)^2\right)^{-1}\epsilon_1^{-2} \quad (13.16)$$

and a number $\delta \in (0, 1)$ satisfy
$$4\delta(2 + 2M_0)l < 16^{-1}\bar{c}\,\epsilon_1^2\left(1 - (\bar{\lambda}L)^2\right). \quad (13.17)$$

Assume that
$$\mathcal{A} \in \mathcal{R}, \quad \{x_k\}_{k=0}^\infty \subset X, \quad \|x_0\| \le M_0 \quad (13.18)$$
and that for each integer $k \ge 0$,
$$\text{if } \mathcal{A}(k) = T \in \mathcal{L}_2, \text{ then } \|x_{k+1} - \mathcal{A}(k)(x_k)\| \le \delta, \quad (13.19)$$
and if
$$\mathcal{A}(k) = P_{\lambda, f, C} \ \text{ with } \lambda \in [\tilde{\lambda}, \bar{\lambda}], \ (f, C) \in \mathcal{L}_1, \quad (13.20)$$
then there is $v_k \in X$ such that
$$\|v_k - Q_{\lambda, f, C}(x_k)\| \le \delta, \quad \|x_{k+1} - P_C(x_k - \lambda f(v_k))\| \le \delta. \quad (13.21)$$

Then there is an integer $p \in [0, n_0 - 1]$ such that
$$\|x_i\| \le 3M_0 + 1, \quad i = 0, \dots, (p + 1)l \quad (13.22)$$
and for each integer $i \in \{pl, \dots, (p + 1)l - 1\}$:

(P3) if $\mathcal{A}(i) = T \in \mathcal{L}_2$, then $\|x_{i+1} - x_i\| \le \epsilon_1$;
(P4) if $\mathcal{A}(i) = P_{\lambda, f, C}$ with $\lambda \in [\tilde{\lambda}, \bar{\lambda}]$, $(f, C) \in \mathcal{L}_1$, then $\|x_i - v_i\| \le \epsilon_1$.

Moreover, if an integer $p \in [0, n_0 - 1]$ is such that for each integer $i \in [pl, (p + 1)l - 1]$ properties (P3) and (P4) hold and $\|x_i\| \le 3M_0 + 1$, then for each pair $i, j \in \{pl, \dots, (p + 1)l\}$,
$$\|x_i - x_j\| \le \epsilon$$
and for each $i \in \{pl, \dots, (p + 1)l\}$,
$$x_i \in \mathrm{Fix}_\epsilon(T) \ \text{ for all } T \in \mathcal{L}_2,$$
$$d(x_i, S_\epsilon(f, C)) \le \epsilon \ \text{ for all } (f, C) \in \mathcal{L}_1.$$

Note that Theorem 13.1 provides estimates for the constants $\delta$ and $n_0$, which follow from relations (13.15)–(13.17): namely, $\delta = c_1\epsilon^2$ and $n_0 = c_2\epsilon^{-2}$, where $c_1$ and $c_2$ are positive constants depending only on $M_0$.
Let $\epsilon \in (0, 1]$, let a positive number $\delta$ be defined by relations (13.15) and (13.17), and let an integer $n_0 \ge 1$ satisfy inequality (13.16). Assume that we apply an algorithm associated with a mapping $\mathcal{A} \in \mathcal{R}$ in the presence of computational

errors bounded from above by a positive constant $\delta$, and that our goal is to find an $\epsilon$-approximate solution $x \in S_\epsilon$. It is not difficult to see that Theorem 13.1 also answers the important question of how we can find an iteration number $i$ such that $x_i \in S_\epsilon$. According to Theorem 13.1, we should find the smallest integer $q \in [0, n_0 - 1]$ such that for every integer $i \in [ql, (q + 1)l - 1]$ properties (P3) and (P4) hold and the relation $\|x_i\| \le 3M_0 + 1$ is true. Then the inclusion $x_i \in S_\epsilon$ is valid for all integers $i \in [ql, (q + 1)l]$.
Consider the following convex feasibility problem. Suppose that $C_1, \dots, C_m$ are nonempty closed convex subsets of $X$, where $m$ is a natural number, such that the set $C = \cap_{i=1}^m C_i$ is also nonempty. We are interested in finding a solution of the feasibility problem $x \in C$.
For every point $x \in X$ and every integer $i = 1, \dots, m$ there exists a unique element $P_i(x) \in C_i$ such that
$$\|x - P_i(x)\| = \inf\{\|x - y\| : y \in C_i\}.$$

The feasibility problem is a particular case of the problem discussed above with $\mathcal{L}_1 = \emptyset$ and $\mathcal{L}_2 = \{P_i : i = 1, \dots, m\}$.
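In this special case the algorithm reduces to cyclic projections onto the sets $C_i$. A minimal error-free sketch under assumed data (the unit disk and a line in $R^2$, chosen only for illustration):

```python
import numpy as np

# Feasibility problem with C1 = unit disk, C2 = {x : x[0] + x[1] = 1}.
def P1(x):
    # projection onto the unit disk: radial shrink if outside
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

def P2(x):
    # orthogonal projection onto the hyperplane a.x = 1 with a = (1, 1)
    a = np.array([1.0, 1.0])
    return x - (a @ x - 1.0) / (a @ a) * a

x = np.array([4.0, -2.0])
for _ in range(100):
    x = P2(P1(x))  # one sweep of the cyclic algorithm, l = 2

print(x)  # a point of C1 ∩ C2
```

For this data the sweeps converge linearly to a point of the intersection (here the boundary point $(1, 0)$), and adding bounded errors, as in Theorem 13.1, perturbs the limit by a quantity controlled by $\delta$.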

13.2 Auxiliary Results

The next result follows from Lemma 12.10.

Lemma 13.2. Let $(f, C) \in \mathcal{L}_1$,
$$u^* \in S, \quad m_0 > 0, \quad M_1 > 0, \quad L > 0, \quad \delta \in (0, 1),$$
$$f(B(u^*, m_0)) \subset B(0, M_1),$$
$$\|f(z_1) - f(z_2)\| \le L\|z_1 - z_2\| \ \text{ for all } z_1, z_2 \in B(u^*, m_0 + M_1 + 1)$$
and let
$$\lambda \in (0, 1], \quad \lambda L < 1.$$

Assume that
$$x \in B(u^*, m_0), \quad y \in X, \quad \|y - P_C(x - \lambda f(x))\| \le \delta,$$
$$\tilde{x} \in X, \quad \|\tilde{x} - P_C(x - \lambda f(y))\| \le \delta.$$

Then
$$\|\tilde{x} - u^*\|^2 \le 4\delta(1 + m_0) + \|x - u^*\|^2 - (1 - \lambda^2 L^2)\|x - P_C(x - \lambda f(x))\|^2.$$

Lemma 13.3. Let $(f, C) \in \mathcal{L}_1$,
$$M_0 > 0, \quad M_1 > 0, \quad L > 0, \quad \delta \in (0, 1),$$
$$B(0, M_0) \cap S \ne \emptyset, \quad (13.23)$$
$$f(B(0, 3M_0 + 1)) \subset B(0, M_1), \quad (13.24)$$
$$\|f(z_1) - f(z_2)\| \le L\|z_1 - z_2\| \ \text{ for all } z_1, z_2 \in B(0, 5M_0 + M_1 + 2) \quad (13.25)$$
and let
$$\lambda \in (0, 1], \quad \lambda L < 1. \quad (13.26)$$

Let
$$x \in B(0, 3M_0 + 1), \quad y \in X, \quad \|y - P_C(x - \lambda f(x))\| \le \delta, \quad (13.27)$$
$$\tilde{x} \in X, \quad \|\tilde{x} - P_C(x - \lambda f(y))\| \le \delta. \quad (13.28)$$

Then
$$\|\tilde{x} - x\| \le 2\delta + (1 + \lambda L)\|x - P_C(x - \lambda f(x))\|.$$

Proof. In view of (13.28) and Lemma 2.2,
$$\|\tilde{x} - P_C(x - \lambda f(x))\| \le \delta + \|P_C(x - \lambda f(y)) - P_C(x - \lambda f(x))\| \le \delta + \lambda\|f(y) - f(x)\|. \quad (13.29)$$

It follows from (13.23) that there exists a point
$$u^* \in B(0, M_0) \cap S. \quad (13.30)$$

Lemma 2.2, (13.27), and (13.30) imply that
$$\|y - u^*\| \le \delta + \|P_C(x - \lambda f(x)) - u^*\| \le \delta + \|x - \lambda f(x) - u^*\| \le \delta + \|x\| + \|u^*\| + \lambda\|f(x)\|.$$

Combined with (13.24), (13.27), and (13.30) the relation above implies that
$$\|y\| \le 1 + 3M_0 + 1 + 2M_0 + \lambda M_1. \quad (13.31)$$

In view of (13.25), (13.26), (13.27), (13.29), and (13.31),
$$\|\tilde{x} - P_C(x - \lambda f(x))\| \le \delta + \lambda L\|y - x\|. \quad (13.32)$$

It follows from (13.32) that
$$\|\tilde{x} - x\| \le \|\tilde{x} - P_C(x - \lambda f(x))\| + \|x - P_C(x - \lambda f(x))\| \le \delta + \lambda L\|y - x\| + \|x - P_C(x - \lambda f(x))\|.$$

Combined with (13.26) and (13.27) the relation above implies that
$$\|\tilde{x} - x\| \le 2\delta + (1 + \lambda L)\|x - P_C(x - \lambda f(x))\|.$$

Lemma 13.3 is proved.

Lemma 13.4. Suppose that all the assumptions of Theorem 13.1 hold. Then
$$\delta < \epsilon_1/4, \quad (13.33)$$
$$2\delta(2M_0 + 1) < 16^{-1}\bar{c}\,\epsilon_1^2, \quad (13.34)$$
$$4\delta(2 + 2M_0) \le 16^{-1}\epsilon_1^2\left(1 - (\bar{\lambda}L)^2\right). \quad (13.35)$$

Proof. It is not difficult to see that inequality (13.33) follows from (13.14), (13.17) and the relations $l \ge 1$, $\bar{c} < 1$ and $\epsilon_1 < 1/2$. Relation (13.34) follows from (13.17), (13.14) and the inequality $l \ge 1$. Relation (13.35) follows from (13.17) and the inequalities $l \ge 1$ and $\bar{c} < 1$.

13.3 Proof of Theorem 13.1

In view of (13.11), there exists
$$u^* \in B(0, M_0) \cap S. \quad (13.36)$$

It follows from (13.18) and (13.36) that
$$\|u^* - x_0\| \le 2M_0. \quad (13.37)$$

Assume that a nonnegative integer $p$ satisfies
$$\|u^* - x_{pl}\| \le 2M_0. \quad (13.38)$$

(Clearly, in view of (13.37), inequality (13.38) holds with $p = 0$.)
Assume that an integer $i \in \{pl, \dots, (p + 1)l - 1\}$ satisfies
$$\|x_i - u^*\| \le 2M_0 + \epsilon_1(i - pl). \quad (13.39)$$



Then one of the following two cases holds:
$$\mathcal{A}(i) = T \in \mathcal{L}_2; \quad (13.40)$$
$$\mathcal{A}(i) = P_{\lambda, f, C} \ \text{ with } \lambda \in [\tilde{\lambda}, \bar{\lambda}], \ (f, C) \in \mathcal{L}_1. \quad (13.41)$$

Assume that (13.40) is valid. Then it follows from (13.19), (13.36), (13.40), (13.5), (13.39), Lemma 13.4, and (13.33) that
$$\|x_{i+1} - u^*\| \le \|x_{i+1} - \mathcal{A}(i)(x_i)\| + \|\mathcal{A}(i)(x_i) - u^*\| \le \delta + \|x_i - u^*\| \le 2M_0 + (i + 1 - pl)\epsilon_1. \quad (13.42)$$

In view of (13.39), (13.42), the inclusion $i \in [pl, (p + 1)l - 1]$, the inequality $\epsilon_1 \in (0, 1)$, and (13.15),
$$\|x_i - u^*\| \le 2M_0 + l\epsilon_1 \le 2M_0 + 1, \quad \|x_{i+1} - u^*\| \le 2M_0 + l\epsilon_1 \le 2M_0 + 1.$$

Combined with (13.42) these inequalities imply that
$$\|x_{i+1} - u^*\|^2 - \|x_i - u^*\|^2 \le \delta\left(\|x_{i+1} - u^*\| + \|x_i - u^*\|\right) \le 2\delta(2M_0 + 1). \quad (13.43)$$

Assume that (13.41) is valid. In view of (13.36), (13.39), (13.15), (13.12), (13.13), (13.17), (13.41),
$$\|x_i - u^*\| \le 2M_0 + 1$$
and all the assumptions of Lemma 13.2 hold with $x = x_i$, $y = v_i$, $\tilde{x} = x_{i+1}$, $m_0 = 2M_0 + 1$, and this implies that
$$\|x_{i+1} - u^*\|^2 \le 4\delta(2 + 2M_0) + \|x_i - u^*\|^2 \quad (13.44)$$
and
$$\|x_{i+1} - u^*\| \le \|x_i - u^*\| + 2\left(\delta(2 + 2M_0)\right)^{1/2} \le \|x_i - u^*\| + \epsilon_1 \le 2M_0 + (i + 1 - pl)\epsilon_1. \quad (13.45)$$

(Note that the first inequality of (13.45) follows from (13.44), the second inequality follows from (13.17) and the inequalities $\bar{c} < 1$ and $l \ge 1$, and the third inequality follows from (13.39).)

It follows from (13.42)–(13.45) that in both cases
$$\|x_{i+1} - u^*\| \le 2M_0 + (i + 1 - pl)\epsilon_1, \quad \|x_{i+1} - u^*\|^2 \le \|x_i - u^*\|^2 + 4\delta(2 + 2M_0).$$

Thus by induction we have proved that for all integers $i = pl, \dots, (p + 1)l$ the inequality
$$\|x_i - u^*\| \le 2M_0 + (i - pl)\epsilon_1$$
holds and that for all integers $i = pl, \dots, (p + 1)l - 1$ the inequality
$$\|x_{i+1} - u^*\|^2 \le \|x_i - u^*\|^2 + 4\delta(2 + 2M_0)$$
is valid.
By (13.15), we have shown that the following property holds:

(P5) if a nonnegative integer $p$ satisfies the inequality
$$\|u^* - x_{pl}\| \le 2M_0,$$
then we have
$$\|x_i - u^*\| < 2M_0 + 1 \ \text{ for all } i = pl, \dots, (p + 1)l \quad (13.46)$$
and
$$\|x_i - u^*\|^2 \ge \|x_{i+1} - u^*\|^2 - 4\delta(2 + 2M_0) \ \text{ for all } i = pl, \dots, (p + 1)l - 1. \quad (13.47)$$

Assume that an integer $\tilde{q} \in [0, n_0 - 1]$ and that for every integer $p \in [0, \tilde{q}]$ the following property holds:

(P6) there exists $i \in \{pl, \dots, (p + 1)l - 1\}$ such that (P3) and (P4) do not hold.

Assume now that an integer $p \in [0, \tilde{q}]$ satisfies
$$\|u^* - x_{pl}\| \le 2M_0. \quad (13.48)$$

In view of property (P5) and relation (13.48), inequalities (13.46) and (13.47) are valid.
Property (P6) implies that there exists an integer $j \in \{pl, \dots, (p + 1)l - 1\}$ such that properties (P3) and (P4) do not hold with $i = j$. Evidently, one of the following cases holds:
$$\mathcal{A}(j) = T \in \mathcal{L}_2; \quad (13.49)$$
$$\mathcal{A}(j) = P_{\lambda, f, C} \ \text{ with } \lambda \in [\tilde{\lambda}, \bar{\lambda}], \ (f, C) \in \mathcal{L}_1. \quad (13.50)$$

Assume that relation (13.49) holds. Since property (P3) does not hold with $i = j$ we have
$$\|x_{j+1} - x_j\| > \epsilon_1. \quad (13.51)$$

It follows from (13.49), (13.51), (13.19), and Lemma 13.4 that
$$\|x_j - T(x_j)\| \ge \|x_{j+1} - x_j\| - \|T(x_j) - x_{j+1}\| \ge \epsilon_1 - \delta \ge (3/4)\epsilon_1. \quad (13.52)$$

Relations (13.49), (13.6), (13.36), (13.8), and (13.52) imply that
$$\|u^* - x_j\|^2 \ge \|u^* - T(x_j)\|^2 + \bar{c}\|x_j - T(x_j)\|^2 \ge \|u^* - T(x_j)\|^2 + \bar{c}(9/16)\epsilon_1^2$$
$$= \|u^* - x_{j+1}\|^2 + \|x_{j+1} - T(x_j)\|^2 + 2\langle u^* - x_{j+1}, \ x_{j+1} - T(x_j)\rangle + \bar{c}(9/16)\epsilon_1^2$$
$$\ge \|u^* - x_{j+1}\|^2 - 2\|u^* - x_{j+1}\|\,\|x_{j+1} - T(x_j)\| + \bar{c}(9/16)\epsilon_1^2. \quad (13.53)$$

In view of (13.53), (13.19), (13.46), and (13.34),
$$\|u^* - x_j\|^2 - \|u^* - x_{j+1}\|^2 \ge \bar{c}(9/16)\epsilon_1^2 - 2\delta(2M_0 + 1) \ge (\bar{c}/2)\epsilon_1^2.$$

Hence
$$\text{if (13.49) holds, then } \|u^* - x_j\|^2 - \|u^* - x_{j+1}\|^2 \ge (\bar{c}/2)\epsilon_1^2. \quad (13.54)$$

Assume that relation (13.50) is valid. Then relations (13.50), (13.36), (13.46), (13.12), (13.13), (13.20), and (13.21) imply that all the assumptions of Lemma 13.2 hold with
$$x = x_j, \quad y = v_j, \quad \tilde{x} = x_{j+1}, \quad m_0 = 2M_0 + 1,$$
and this implies that
$$\|u^* - x_{j+1}\|^2 \le 4\delta(2 + 2M_0) + \|u^* - x_j\|^2 - \left(1 - (\lambda L)^2\right)\|x_j - P_C(x_j - \lambda f(x_j))\|^2. \quad (13.55)$$

Since property (P4) does not hold with $i = j$ we conclude that
$$\|x_j - v_j\| > \epsilon_1. \quad (13.56)$$

In view of (13.46), (13.21), and (13.56),
$$\|x_j - P_C(x_j - \lambda f(x_j))\| \ge \|x_j - v_j\| - \|P_C(x_j - \lambda f(x_j)) - v_j\| > \epsilon_1 - \delta > (3/4)\epsilon_1. \quad (13.57)$$

It follows from (13.35), (13.55), and (13.57) that
$$\|x_j - u^*\|^2 - \|x_{j+1} - u^*\|^2 \ge \left(1 - (\bar{\lambda}L)^2\right)(9/16)\epsilon_1^2 - 4\delta(2 + 2M_0) \ge \left(1 - (\bar{\lambda}L)^2\right)\epsilon_1^2/2. \quad (13.58)$$

By relations (13.54) and (13.58), in both cases we have
$$\|x_j - u^*\|^2 - \|x_{j+1} - u^*\|^2 \ge \bar{c}\left(1 - (\bar{\lambda}L)^2\right)\epsilon_1^2/2. \quad (13.59)$$

It follows from (13.17), (13.47), and (13.59) that
$$\|u^* - x_{pl}\|^2 - \|u^* - x_{(p+1)l}\|^2 = \sum_{i=pl}^{(p+1)l-1}\left[\|u^* - x_i\|^2 - \|u^* - x_{i+1}\|^2\right] \ge \bar{c}\left(1 - (\bar{\lambda}L)^2\right)\epsilon_1^2/2 - 4\delta(2 + 2M_0)l > \bar{c}\left(1 - (\bar{\lambda}L)^2\right)\epsilon_1^2/4. \quad (13.60)$$

By (13.48) and (13.60),
$$\|u^* - x_{(p+1)l}\| \le \|u^* - x_{pl}\| \le 2M_0. \quad (13.61)$$

Thus we have shown that the following property holds:

(P7) if an integer $p \in [0, \tilde{q}]$ satisfies inequality (13.48), then [see relations (13.61), (13.46), (13.60)] we have
$$\|u^* - x_{(p+1)l}\| \le 2M_0, \quad \|u^* - x_i\| \le 2M_0 + 1, \quad i = pl, \dots, (p + 1)l$$
and
$$\|u^* - x_{pl}\|^2 - \|u^* - x_{(p+1)l}\|^2 > \bar{c}\left(1 - (\bar{\lambda}L)^2\right)\epsilon_1^2/4. \quad (13.62)$$

In view of (13.37), property (P7), (13.62), and (13.16),
$$4M_0^2 \ge \|u^* - x_0\|^2 \ge \|u^* - x_0\|^2 - \|u^* - x_{(\tilde{q}+1)l}\|^2 = \sum_{p=0}^{\tilde{q}}\left[\|u^* - x_{pl}\|^2 - \|u^* - x_{(p+1)l}\|^2\right] \ge (\tilde{q} + 1)\bar{c}\left(1 - (\bar{\lambda}L)^2\right)\epsilon_1^2/4$$
and
$$\tilde{q} + 1 \le 16M_0^2\,\bar{c}^{-1}\left(1 - (\bar{\lambda}L)^2\right)^{-1}\epsilon_1^{-2} < n_0. \quad (13.63)$$

We assumed that an integer $\tilde{q} \in [0, n_0 - 1]$ and that for every integer $p \in [0, \tilde{q}]$ property (P6) holds, and proved that $\tilde{q} + 1 < n_0$.
This implies that there exists an integer $q \in [0, n_0 - 1]$ such that for every integer $p$ satisfying $0 \le p < q$ property (P6) holds and that the following property holds:

(P8) for every integer $i \in \{ql, \dots, (q + 1)l - 1\}$ properties (P3) and (P4) hold.

Property (P7) (with $\tilde{q} = q - 1$) implies that
$$\|u^* - x_{jl}\| \le 2M_0, \quad j = 0, \dots, q. \quad (13.64)$$

In view of (13.64) and property (P5),
$$\|u^* - x_i\| \le 2M_0 + 1, \quad i = 0, \dots, (q + 1)l. \quad (13.65)$$

It follows from (13.64), (13.65), and (13.36) that
$$\|x_i\| \le 3M_0 + 1, \quad i = 0, \dots, (q + 1)l.$$

Assume that $p$ is a nonnegative integer,
$$\|x_i\| \le 3M_0 + 1, \quad i = pl, \dots, (p + 1)l - 1 \quad (13.66)$$
and that for all integers $i = pl, \dots, (p + 1)l - 1$ properties (P3) and (P4) hold.
Let
$$i \in \{pl, \dots, (p + 1)l - 1\}. \quad (13.67)$$

There are two cases:
$$\mathcal{A}(i) = T \in \mathcal{L}_2; \quad (13.68)$$
$$\mathcal{A}(i) = P_{\lambda, f, C} \ \text{ with } \lambda \in [\tilde{\lambda}, \bar{\lambda}], \ (f, C) \in \mathcal{L}_1. \quad (13.69)$$

Assume that relation (13.68) is valid. Then in view of (13.68) and property (P3),
$$\|x_{i+1} - x_i\| \le \epsilon_1. \quad (13.70)$$

It follows from (13.68), (13.19), (13.70), and (13.17) that
$$\|T(x_i) - x_i\| \le \|T(x_i) - x_{i+1}\| + \|x_{i+1} - x_i\| \le \delta + \epsilon_1 \le (5/4)\epsilon_1,$$
$$x_i \in \mathrm{Fix}_{(5/4)\epsilon_1}(T). \quad (13.71)$$

Thus we have shown that the following property holds:

(P9) if (13.68) is true, then relations (13.70) and (13.71) hold.

Assume that (13.69) holds. In view of (13.69), (13.67), property (P4), (13.20), and (13.21),
$$\|x_{i+1} - P_C(x_i - \lambda f(v_i))\| \le \delta, \quad \|v_i - P_C(x_i - \lambda f(x_i))\| \le \delta, \quad \|x_i - v_i\| \le \epsilon_1. \quad (13.72)$$

Relations (13.17), (13.21), and (13.72) imply that
$$\|x_i - P_C(x_i - \lambda f(x_i))\| \le \|x_i - v_i\| + \|v_i - P_C(x_i - \lambda f(x_i))\| \le \delta + \epsilon_1 \le (5/4)\epsilon_1. \quad (13.73)$$

It follows from (13.11), (13.69), (13.12), (13.13), (13.14), (13.67), (13.66), (13.72), (13.73), and (13.33) that all the assumptions of Lemma 13.3 hold with
$$x = x_i, \quad y = v_i, \quad \tilde{x} = x_{i+1}$$
(and with the constants $M_0$, $M_1$, $L$ as in Theorem 13.1), and this implies that
$$\|x_{i+1} - x_i\| \le 2\delta + (1 + \lambda L)\|x_i - P_C(x_i - \lambda f(x_i))\| \le 2\delta + (5/4)(1 + \lambda L)\epsilon_1 \le (5/4)\left(1 + \bar{\lambda}L\right)\epsilon_1 + 2\delta < 5\epsilon_1. \quad (13.74)$$

(Note that the second inequality in (13.74) follows from (13.73), the third one follows from (13.69), and the last inequality follows from (13.14) and (13.33).)
In view of Lemma 2.2, (13.73), (13.69), (13.66), and (13.12), for every point $\xi \in C$,

$$0 \ge \langle x_i - \lambda f(x_i) - P_C(x_i - \lambda f(x_i)), \ \xi - P_C(x_i - \lambda f(x_i))\rangle$$
$$= \langle x_i - P_C(x_i - \lambda f(x_i)), \ \xi - P_C(x_i - \lambda f(x_i))\rangle - \lambda\langle f(x_i), \ \xi - P_C(x_i - \lambda f(x_i))\rangle$$
$$\ge -\|x_i - P_C(x_i - \lambda f(x_i))\|\left(\|\xi - x_i\| + \|x_i - P_C(x_i - \lambda f(x_i))\|\right) - \lambda\langle f(x_i), \xi - x_i\rangle - \lambda\langle f(x_i), \ x_i - P_C(x_i - \lambda f(x_i))\rangle$$
$$\ge -2\epsilon_1\left(\|\xi - x_i\| + 2\epsilon_1\right) - \lambda\langle f(x_i), \xi - x_i\rangle - 2\lambda\|f(x_i)\|\epsilon_1$$
and
$$\langle f(x_i), \xi - x_i\rangle \ge -2\epsilon_1\tilde{\lambda}^{-1}\|\xi - x_i\| - 4\tilde{\lambda}^{-1}\epsilon_1^2 - 2\tilde{\lambda}^{-1}M_1\epsilon_1 \ \text{ for each } \xi \in C. \quad (13.75)$$

Set
$$\bar{y} = P_C(x_i - \lambda f(x_i)). \quad (13.76)$$

Relations (13.76) and (13.73) imply that
$$\|x_i - \bar{y}\| \le (5/4)\epsilon_1. \quad (13.77)$$

It follows from (13.77), (13.66), (13.13), (13.12), and (13.10) that
$$\|f(x_i) - f(\bar{y})\| \le L\|x_i - \bar{y}\| \le (5/4)L\epsilon_1, \quad \bar{y} \in B(x_i, 1) \subset B(0, 3M_0 + 2), \quad \|f(\bar{y})\| \le M_1. \quad (13.78)$$

(Note that the inclusion in (13.78) follows from (13.76), the inequality $\epsilon_1 < 1/2$ and (13.66), and the last inequality in (13.78) follows from (13.12).)
In view of (13.75), (13.77), (13.78), and (13.15), for every point $\xi \in C$,
$$\langle f(\bar{y}), \xi - \bar{y}\rangle \ge \langle f(\bar{y}), \xi - x_i\rangle - \|f(\bar{y})\|\,\|x_i - \bar{y}\| \ge \langle f(\bar{y}), \xi - x_i\rangle - 2M_1\epsilon_1$$
$$\ge \langle f(x_i), \xi - x_i\rangle - \|f(\bar{y}) - f(x_i)\|\,\|\xi - x_i\| - 2M_1\epsilon_1$$
$$\ge -2\epsilon_1\tilde{\lambda}^{-1}\|\xi - x_i\| - 4\tilde{\lambda}^{-1}\epsilon_1^2 - 2\tilde{\lambda}^{-1}M_1\epsilon_1 - 2L\epsilon_1\|\xi - x_i\| - 2M_1\epsilon_1$$
$$\ge -2\epsilon_1\tilde{\lambda}^{-1}\|\xi - \bar{y}\| - 2\epsilon_1\tilde{\lambda}^{-1}\|\bar{y} - x_i\| - 4\tilde{\lambda}^{-1}\epsilon_1^2 - 2\tilde{\lambda}^{-1}M_1\epsilon_1 - 2L\epsilon_1\|\xi - \bar{y}\| - 2L\epsilon_1\|x_i - \bar{y}\| - 2M_1\epsilon_1$$
$$\ge -\left(2\tilde{\lambda}^{-1}\epsilon_1 + 2L\epsilon_1\right)\|\xi - \bar{y}\| - 6\tilde{\lambda}^{-1}\epsilon_1 - 2\tilde{\lambda}^{-1}M_1\epsilon_1 - 4L\epsilon_1 - 2M_1\epsilon_1 \ge -\epsilon\|\bar{y} - \xi\| - \epsilon.$$

Combined with relation (13.76) the relation above implies that
$$\bar{y} \in S_\epsilon(f, C).$$

It follows from the inclusion above and (13.77) that
$$d(x_i, S_\epsilon(f, C)) \le (5/4)\epsilon_1.$$

Thus we have shown that the following property holds:

(P10) if (13.69) is valid, then the inequalities
$$\|x_i - x_{i+1}\| \le 5\epsilon_1$$
[see (13.74)] and
$$d(x_i, S_\epsilon(f, C)) \le (5/4)\epsilon_1$$
hold.

In view of properties (P9) and (P10), for every integer $i \in \{pl, \dots, (p + 1)l - 1\}$,
$$\|x_i - x_{i+1}\| \le 5\epsilon_1.$$

This implies that for every pair of integers $i, j \in \{pl, \dots, (p + 1)l\}$, we have
$$\|x_i - x_j\| \le 5l\epsilon_1 < \epsilon/4 \quad (13.79)$$
[see (13.15)].
Let $j \in \{pl, \dots, (p + 1)l\}$. Assume that $T \in \mathcal{L}_2$. In view of (13.18) and property (P1) there exists an integer $i \in \{pl, \dots, (p + 1)l - 1\}$ such that
$$\mathcal{A}(i) = T.$$

It follows from (13.68) and property (P9) that
$$x_i \in \mathrm{Fix}_{2\epsilon_1}(T)$$
13.4 Examples 221

and
$$\|x_i - T(x_i)\| \le 2\epsilon_1.$$

Combined with relations (13.79) and (13.15) this implies that
$$\|x_j - T(x_j)\| \le \|x_j - x_i\| + \|x_i - T(x_i)\| + \|T(x_j) - T(x_i)\| \le \epsilon/2 + 2\epsilon_1 < \epsilon,$$
$$x_j \in \mathrm{Fix}_\epsilon(T) \ \text{ for all } T \in \mathcal{L}_2. \quad (13.80)$$

Assume that $(f, C) \in \mathcal{L}_1$. In view of (13.18) and property (P2), there exists an integer $i \in \{pl, \dots, (p + 1)l - 1\}$ such that
$$\mathcal{A}(i) = P_{\lambda, f, C} \ \text{ with } \lambda \in [\tilde{\lambda}, \bar{\lambda}].$$

It follows from (13.69) and property (P10) that
$$d(x_i, S_\epsilon(f, C)) \le 2\epsilon_1.$$

Combined with relations (13.79) and (13.15) (the choice of $\epsilon_1$) the inequality above implies that
$$d(x_j, S_\epsilon(f, C)) \le d(x_i, S_\epsilon(f, C)) + \|x_i - x_j\| \le 2\epsilon_1 + \epsilon/4 < \epsilon$$
for all $(f, C) \in \mathcal{L}_1$. Theorem 13.1 is proved.

13.4 Examples

In this section we present examples for which Theorem 13.1 can be used.
Example 13.5. Let p  1 be an integer, Ci , i D 1; : : : ; p be nonempty closed convex
subsets of the Hilbert space X and let for every integer i 2 f1; : : : ; pg, gi W X ! R1
be a convex Fréchet differentiable function and g0i .x/ 2 X be its Fréchet derivative at
a point x 2 X. We assume that for all integers i D 1; : : : ; p the mapping g0i W X ! X
is Lipschitzian on all bounded subsets of X.
Consider the following multi-objective minimization problem:

Find x ∈ ∩ᵢ₌₁ᵖ Cᵢ such that
gᵢ(x) = inf{gᵢ(z) : z ∈ Cᵢ} for all i = 1, …, p.

It is clear that this problem is equivalent to the following problem which is a


particular case of the problem discussed in this section with fi D g0i , i D 1; : : : ; p,
and for which Theorem 13.1 was stated:

Find x ∈ ∩ᵢ₌₁ᵖ Cᵢ such that for all i = 1, …, p,
⟨g′ᵢ(x), y − x⟩ ≥ 0 for all y ∈ Cᵢ.

Let S be the set of solutions of these two problems. We assume that S 6D ;.


Now it is not difficult to see that all the assumptions needed for Theorem 13.1
hold with fi D g0i , i D 1; : : : ; p, L1 D f.g0i ; Ci / W i D 1; : : : ; pg and L2 D ;.
The constants M0 ; M1 ; L can be found, in principle, using the analytic description
of the functions gi and the sets Ci , i D 1; : : : ; p which is usually given. In many cases
p
the set \iD1 Ci is contained in a ball or one of the sets Ci , i D 1; : : : ; p is contained
in a ball and the radius of the ball can be found. Then we can find the constants
M₁, L, choose a positive constant γ* < L⁻¹, and apply Theorem 13.1 with A ∈ R
such that for each integer i ≥ 0 and each integer j ∈ [0, p − 1],

A(ip + j) = P_{γ*, f_{j+1}, C_{j+1}}.

Our next example is a particular case of Example 13.5.


Example 13.6. Let X = ℝ⁴, p = 2,

C₁ = {x = (x₁, x₂, x₃, x₄) ∈ ℝ⁴ : |x₁| ≤ 10, x₃ = 2},

C₂ = {x = (x₁, x₂, x₃, x₄) ∈ ℝ⁴ : |x₂| ≤ 10, x₄ = 2},
g₁(x) = (2x₁ + x₂ + x₃ − x₄ − 3)², x = (x₁, x₂, x₃, x₄) ∈ ℝ⁴,
g₂(x) = (x₁ + 2x₂ + x₃ − x₄ − 3)², x = (x₁, x₂, x₃, x₄) ∈ ℝ⁴.

Evidently, the functions g₁, g₂ are convex and g₁, g₂ ∈ C².


Consider the problem

Find x ∈ C₁ ∩ C₂ such that

gᵢ(x) = inf{gᵢ(z) : z ∈ Cᵢ} for i = 1, 2.

As it was shown in Example 13.5, we can apply Theorem 13.1 for this problem with
fi D g0i , i D 1; 2.
Now we define the constants which appear in Theorem 13.1. Set l = 2. Clearly,

C₁ ∩ C₂ ⊂ {x = (x₁, x₂, x₃, x₄) ∈ ℝ⁴ :
|x₁| ≤ 10, |x₂| ≤ 10, x₃ = 2, x₄ = 2} ⊂ B(0, 16).

Thus we can set M₀ = 16. It is easy to see that the set of solutions of our problem is

S = {x = (x₁, x₂, 2, 2) : |x₁| ≤ 10, |x₂| ≤ 10, g₁(x) = 0, g₂(x) = 0}

= {x = (x₁, x₂, 2, 2) : |x₁| ≤ 10, |x₂| ≤ 10, 2x₁ + x₂ − 3 = 0, x₁ + 2x₂ − 3 = 0}
= {(1, 1, 2, 2)}.

For all points x = (x₁, x₂, x₃, x₄) ∈ ℝ⁴ we have

f₁(x) = g′₁(x) = 2(2x₁ + x₂ + x₃ − x₄ − 3)(2, 1, 1, −1),

f₂(x) = g′₂(x) = 2(x₁ + 2x₂ + x₃ − x₄ − 3)(1, 2, 1, −1).

These equalities imply that for i = 1, 2,

fᵢ(B(0, 50)) ⊂ B(0, 1530)

(thus M₁ = 1530) and that the functions f₁, f₂ are Lipschitzian on ℝ⁴ with the
Lipschitz constant L = 12. Put γ* = 16⁻¹, c̄ = 1/2.
We apply Theorem 13.1 with these constants and with ε = 10⁻³. Then (13.15)
implies that we can set ε₁ = 4⁻¹ · 10⁻⁷. By (13.16), we have

n₀ > 2 · 16³ · 10¹⁴

and in view of (13.17) the following inequality holds:

δ < (16 · 34)⁻¹ · 16⁻³ · 10⁻¹⁴.

Note that this example can also be considered as an example of a convex feasibility
problem

Find x ∈ C₁ ∩ C₂ ∩ {z ∈ ℝ⁴ : g₁(z) ≤ 0} ∩ {z ∈ ℝ⁴ : g₂(z) ≤ 0},

or equivalently

Find x ∈ {z ∈ C₁ : g₁(z) ≤ 0} ∩ {z ∈ C₂ : g₂(z) ≤ 0}.

Now we describe how the subgradient algorithm is applied for our example.
First of all, note that for any y = (y₁, y₂, y₃, y₄) ∈ ℝ⁴,

P_{C₁}(y) = (min{max{y₁, −10}, 10}, y₂, 2, y₄),

P_{C₂}(y) = (y₁, min{max{y₂, −10}, 10}, y₃, 2).

We apply Theorem 13.1 with x₀ = (0, 0, 0, 0) and A ∈ R such that for each
integer i ≥ 0,

A(2i) = P_{16⁻¹, f₁, C₁},  A(2i + 1) = P_{16⁻¹, f₂, C₂}.

Then our algorithm generates two sequences {xᵢ}ᵢ₌₀^∞, {vᵢ}ᵢ₌₀^∞ ⊂ ℝ⁴ such that for
every nonnegative integer i,

‖v₂ᵢ − P_{C₁}(x₂ᵢ − 16⁻¹ f₁(x₂ᵢ))‖ ≤ δ,

‖x₂ᵢ₊₁ − P_{C₁}(x₂ᵢ − 16⁻¹ f₁(v₂ᵢ))‖ ≤ δ

and

‖v₂ᵢ₊₁ − P_{C₂}(x₂ᵢ₊₁ − 16⁻¹ f₂(x₂ᵢ₊₁))‖ ≤ δ,
‖x₂ᵢ₊₂ − P_{C₂}(x₂ᵢ₊₁ − 16⁻¹ f₂(v₂ᵢ₊₁))‖ ≤ δ.
For every nonnegative integer p we calculate

βₚ = max{‖v₂ₚ − x₂ₚ‖, ‖v₂ₚ₊₁ − x₂ₚ₊₁‖}

and find the smallest (first) integer p ≥ 0 such that βₚ ≤ ε₁. In view of
Theorem 13.1, this nonnegative integer p exists and satisfies p < n₀. Then it follows
from Theorem 13.1 that x2p ; x2pC1 ; x2pC2 2 S .
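The approximate iteration above can be run numerically. Below is a minimal pure-Python sketch of these extragradient-type steps, assuming exact computations (δ = 0); the helper names grad_g1, proj_C1, and step are illustrative and not from the text.

```python
# Extragradient-type subgradient projection steps for Example 13.6 (delta = 0).

def grad_g1(x):
    r = 2*x[0] + x[1] + x[2] - x[3] - 3.0
    return [4*r, 2*r, 2*r, -2*r]            # f1 = g1' = 2r * (2, 1, 1, -1)

def grad_g2(x):
    r = x[0] + 2*x[1] + x[2] - x[3] - 3.0
    return [2*r, 4*r, 2*r, -2*r]            # f2 = g2' = 2r * (1, 2, 1, -1)

def proj_C1(y):                             # clip y1 to [-10, 10], force y3 = 2
    return [min(max(y[0], -10.0), 10.0), y[1], 2.0, y[3]]

def proj_C2(y):                             # clip y2 to [-10, 10], force y4 = 2
    return [y[0], min(max(y[1], -10.0), 10.0), y[2], 2.0]

def step(x, grad, proj, gamma=1.0/16.0):
    # v ~ P_C(x - gamma f(x)); next iterate ~ P_C(x - gamma f(v))
    v = proj([xi - gamma*gi for xi, gi in zip(x, grad(x))])
    return proj([xi - gamma*gi for xi, gi in zip(x, grad(v))])

x = [0.0, 0.0, 0.0, 0.0]                    # x0 = (0, 0, 0, 0)
for _ in range(5000):                       # alternate the A(2i) and A(2i+1) steps
    x = step(x, grad_g1, proj_C1)
    x = step(x, grad_g2, proj_C2)
```

With these (assumed) iteration counts the iterates approach the unique solution (1, 1, 2, 2).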
Example 13.7. Let p; q  1 be integers, Ci , i D 1; : : : ; p, Di , i D 1; : : : ; q be
nonempty closed convex subsets of the Hilbert space X and let for every integer
i 2 f1; : : : ; pg, gi W X ! R1 be a convex Fréchet differentiable function and
g0i .x/ 2 X be its Fréchet derivative at a point x 2 X. We assume that for every integer
i D 1; : : : ; p,

gi .z/  0 for all z 2 Ci

and that the mapping g0i W X ! X is Lipschitzian on all bounded subsets of X.


Consider the following convex feasibility problem
Find a point belonging to (∩ᵢ₌₁ᵖ {z ∈ Cᵢ : gᵢ(z) ≤ 0}) ∩ (∩ⱼ₌₁^q Dⱼ).

It is easy to show that this problem is equivalent to the following problem which is a
particular case of the problem discussed in this section, and for which Theorem 13.1
was stated:
Find x ∈ ∩ⱼ₌₁^q Dⱼ such that for all i = 1, …, p,
x ∈ Cᵢ and ⟨g′ᵢ(x), y − x⟩ ≥ 0 for all y ∈ Cᵢ.

Let S be the set of solutions of these two problems. We assume that S 6D ;.


Now it is not difficult to see that all the assumptions needed for Theorem 13.1
hold with fi D g0i , i D 1; : : : ; p, L1 D f.g0i ; Ci / W i D 1; : : : ; pg and L2 D fPDi W
i D 1; : : : ; qg, l D p C q.
The constants M0 ; M1 ; L can be found as it was explained in Example 13.5, using
the analytic description of the functions gi and the sets Ci ; Dj , i D 1; : : : ; p, j D
1; : : : ; q which is usually given. Then we choose a positive constant   < L1 and
apply Theorem 13.1 with A 2 R such that

A(i + (p + q)) = A(i) for all integers i ≥ 0,

A(i) = P_{γ*, f_{i+1}, C_{i+1}}, i = 0, …, p − 1,  A(p − 1 + j) = P_{Dⱼ}, j = 1, …, q.
Chapter 14
Continuous Subgradient Method

In this chapter we study the continuous subgradient algorithm for minimization


of convex functions, under the presence of computational errors. We show that
our algorithms generate a good approximate solution, if computational errors
are bounded from above by a small positive constant. Moreover, for a known
computational error, we determine what approximate solution can be obtained and
how much time one needs for this.

14.1 Bochner Integrable Functions

Let .Y; k  k/ be a Banach space and 1 < a < b < 1. A function x W Œa; b ! Y is
strongly measurable on Œa; b if there exists a sequence of functions xn W Œa; b ! Y,
n D 1; 2; : : : such that for any integer n  1 the set xn .Œa; b/ is countable and the
set ft 2 Œa; b W xn .t/ D yg is Lebesgue measurable for any y 2 Y, and xn .t/ ! x.t/
as n ! 1 in .Y; k  k/ for almost every t 2 Œa; b.
The function x: [a, b] → Y is Bochner integrable if it is strongly measurable and
the integral ∫ₐᵇ ‖x(t)‖ dt is finite.
If x: [a, b] → Y is a Bochner integrable function, then for almost every (a. e.)
t ∈ [a, b],

lim_{Δt→0} (Δt)⁻¹ ∫ₜ^{t+Δt} ‖x(τ) − x(t)‖ dτ = 0

and the function

y(t) = ∫ₐᵗ x(s) ds, t ∈ [a, b]

is continuous and a. e. differentiable on [a, b].

© Springer International Publishing Switzerland 2016 225


A.J. Zaslavski, Numerical Optimization with Computational Errors, Springer
Optimization and Its Applications 108, DOI 10.1007/978-3-319-30921-7_14

Let −∞ < τ₁ < τ₂ < ∞. Denote by W^{1,1}(τ₁, τ₂; Y) the set of all functions
x: [τ₁, τ₂] → Y for which there exists a Bochner integrable function u: [τ₁, τ₂] → Y
such that

x(t) = x(τ₁) + ∫_{τ₁}^{t} u(s) ds, t ∈ (τ₁, τ₂]

(see, e.g., [11, 27]). It is known that if x ∈ W^{1,1}(τ₁, τ₂; Y), then this equation defines
a unique Bochner integrable function u which is called the derivative of x and is
denoted by x′.

14.2 Convergence Analysis for Continuous Subgradient


Method

The study of continuous subgradient algorithms is an important topic in optimiza-


tion theory. See, for example, [6, 10, 23, 27, 28] and the references mentioned
therein. In this chapter we analyze its convergence under the presence of compu-
tational errors.
We suppose that X is a Hilbert space equipped with an inner product denoted by
h; i which induces a complete norm k  k. For each x 2 X and each r > 0 set

B.x; r/ D fy 2 X W kx  yk  rg:

For each x 2 X and each nonempty set E  X set

d.x; E/ D inffkx  yk W y 2 Eg:

Let D be a nonempty closed convex subset of X. Then for each x 2 X there is a


unique point PD .x/ 2 D satisfying

kx  PD .x/k D inffkx  yk W y 2 Dg

(see Lemma 2.2).


Suppose that f W X ! R1 [ f1g is a convex, lower semicontinuous and bounded
from below function such that

dom.f / WD fx 2 X W f .x/ < 1g 6D ;:

Set

inf.f / D infff .x/ W x 2 Xg



and

argmin.f / D fx 2 X W f .x/ D inf.f /g:

For each set D  X put

inf.f I D/ D infff .z/ W z 2 Dg;


sup.f I D/ D supff .z/ W z 2 Dg:

Recall that for each x 2 dom.f /,

@f .x/ D fl 2 X W hl; y  xi  f .y/  f .x/ for all y 2 Xg:

In Sect. 14.4 we will prove the following result.


Theorem 14.1. Let ı 2 .0; 1, 0 < 1 < 2 , M > 0,

0 D 2ı.2M C 1/ 1 2 1 1
1 ; T0 > 4M 1 0 (14.1)

and let W Œ0; T0  ! R1 be a Lebesgue measurable function such that

1  .t/  2 for all t 2 Œ0; T0 : (14.2)

Assume that x 2 W 1;1 .0; T0 I X/,

x.0/ 2 dom.f / \ B.0; M/ (14.3)

and that for almost every t 2 Œ0; T0 ,

x.t/ 2 dom.f / (14.4)

and

B.x0 .t/; ı/ \ . .t/@f .x.t/// 6D ;: (14.5)

Then

minff .x.t// W t 2 Œ0; T0 g  inf.f I B.0; M// C 0 :

In Theorem 14.1 δ is the computational error. According to this result we can
find a point ξ ∈ X such that

f(ξ) ≤ inf(f; B(0, M)) + c₁δ

during a period of time c₂δ⁻¹, where c₁, c₂ > 0 are constants depending only
on φ₁, M.
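To illustrate the flow studied in Theorem 14.1, here is a hypothetical explicit Euler discretization of x′(t) = −φ(t)g(t), g(t) ∈ ∂f(x(t)), for a smooth convex f; the objective, the step size h, and the horizon are assumed for the sketch and are not taken from the theorem.

```python
# Euler discretization of the continuous subgradient flow x'(t) = -phi(t) g(t).

def f(x):                                  # assumed smooth convex objective
    return (x[0] - 1.0)**2 + (x[1] + 2.0)**2

def grad_f(x):
    return [2*(x[0] - 1.0), 2*(x[1] + 2.0)]

h, phi = 0.01, 1.0                         # constant phi(t) = 1 satisfies (14.2)
x = [5.0, 5.0]
best = f(x)                                # tracks min{ f(x(t)) : t in [0, T0] }
for _ in range(3000):
    g = grad_f(x)                          # here the subdifferential is {grad_f(x)}
    x = [xi - h*phi*gi for xi, gi in zip(x, g)]
    best = min(best, f(x))
```

As the theorem suggests, the minimum of f along the trajectory approaches inf(f); here the discrete trajectory converges to the minimizer (1, −2).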

14.3 An Auxiliary Result

Let V  X be an open convex set and g W V ! R1 be a convex locally Lipschitzian


function.
Let T > 0, x0 2 X and let u W Œ0; T ! X be a Bochner integrable function. Set
Z t
x.t/ D x0 C u.s/ds; t 2 Œ0; T:
0

Then x W Œ0; T ! X is differentiable and x0 .t/ D u.t/ for almost every t 2 Œ0; T.
Assume that

x.t/ 2 V for all t 2 Œ0; T:

We claim that the restriction of g to the set fx.t/ W t 2 Œ0; Tg is Lipschitzian.
Indeed, since the set fx.t/ W t 2 Œ0; Tg is compact, the closure of its convex hull C
is both compact and convex, and so the restriction of g to C is Lipschitzian. Hence
the function .g  x/.t/ WD g.x.t//, t 2 Œ0; T, is absolutely continuous. It follows that
for almost every t 2 Œ0; T, both the derivatives x0 .t/ and .g  x/0 .t/ exist:

x0 .t/ D lim h1 Œx.t C h/  x.t/; (14.6)


h!0

.g  x/0 .t/ D lim h1 Œg.x.t C h//  g.x.t//: (14.7)


h!0

We continue with the following fact (see Proposition 8.3 of [101]).


Proposition 14.2. Assume that t 2 Œ0; T and that both the derivatives x0 .t/ and
.g  x/0 .t/ exist. Then

.g  x/0 .t/ D lim h1 Œg.x.t/ C hx0 .t//  g.x.t//: (14.8)


h!0

Proof. There exist a neighborhood U of x.t/ in X and a constant L > 0 such that

jg.z1 /  g.z2 /j  Ljjz1  z2 jj for all z1 ; z2 2 U : (14.9)

Let  > 0 be given. In view of (14.6), there exists ı > 0 such that

x.t C h/; x.t/ C hx0 .t/ 2 U for each h 2 Œı; ı \ Œt; T  t; (14.10)

and such that for each h 2 Œ.ı; ı/ n f0g \ Œt; T  t,

kx.t C h/  x.t/  hx0 .t/k < jhj: (14.11)

Let

h 2 Œ.ı; ı/ n f0g \ Œt; T  t: (14.12)

It follows from (14.10), (14.9), (14.11), and (14.12) that

jg.x.t C h//  g.x.t/ C hx0 .t//j  Lkx.t C h/  x.t/  hx0 .t/k < Ljhj: (14.13)

Clearly,

Œg.x.t C h//  g.x.t//h1 D Œg.x.t C h//  g.x.t/ C hx0 .t//h1


CŒg.x.t/ C hx0 .t//  g.x.t//h1 : (14.14)

Relations (14.13) and (14.14) imply that

jŒg.x.t C h//  g.x.t//h1  Œg.x.t/ C hx0 .t//  g.x.t//h1 j


 jg.x.t C h//  g.x.t/ C hx0 .t//jjh1 j  L: (14.15)

Since  is an arbitrary positive number, (14.7) and (14.15) imply (14.8).


Corollary 14.3. Let z 2 X and g.y/ D kz  yk2 for all y 2 X. Then for almost every
t 2 Œ0; T, the derivative .g  x/0 .t/ exists and

.g  x/0 .t/ D 2hx0 .t/; x.t/  zi:

14.4 Proof of Theorem 14.1

Assume that the theorem does not hold. Then there exists

z 2 B.0; M/ (14.16)

such that
f .x.t// > f .z/ C 0 for all t 2 Œ0; T0 : (14.17)
Set
.t/ D kz  x.t/k2 ; t 2 Œ0; T0 : (14.18)
In view of Corollary 14.3, for a. e. t 2 Œ0; T0 , there exist derivatives x0 .t/, 0 .t/ and

0 .t/ D 2hx0 .t/; x.t/  zi: (14.19)

By (14.5), for a. e. t 2 Œ0; T0 , there exist

.t/ 2 @f .x.t// (14.20)

such that

kx0 .t/ C .t/.t/k  ı: (14.21)

It follows from (14.19) that for almost every t 2 Œ0; T0 ,

0 .t/ D 2hx.t/  z; x0 .t/i


D 2hx.t/  z;  .t/.t/i C 2hx.t/  z; x0 .t/ C .t/.t/i: (14.22)

In view of (14.21), for almost every t 2 Œ0; T0 ,

jhz  x.t/; x0 .t/ C .t/.t/ij  ıkz  x.t/k: (14.23)

By (14.17) and (14.20), for almost every t 2 Œ0; T0 ,

hz  x.t/; .t/i  f .x.t// C f .z/  0 : (14.24)

It follows from (14.2), (14.22), (14.23), and (14.24) that for almost every t 2 Œ0; T0 ,

0 .t/  20 .t/ C 2ıkz  x.t/k


 20 1 C 2ıkz  x.t/k: (14.25)

Relations (14.3), (14.16), and (14.18) imply that

.0/  4M 2 : (14.26)

We show that for all t 2 Œ0; T0 ,

kz  x.t/k D .t/1=2  2M C 1:

Assume the contrary. Then there exists

 2 .0; T

such that

kz  x.t/k < 2M C 1; t 2 Œ0; /; (14.27)

kz  x. /k D 2M C 1: (14.28)

By (14.1) and (14.25)–(14.28),

.2M C 1/2  4M 2  kz  x. /k2  kz  x.0/k2


Z 
D ./  .0/ D 0 .t/dt
0
Z 
 .20 1 C 2ıkz  x.t/k/dt
0

 .20 1 C 2ı.2M C 1//   0 1 ;

a contradiction. The contradiction we have reached proves that

kz  x.t/k  2M C 1; t 2 Œ0; T0 : (14.29)

By (14.1), (14.25), and (14.29), for almost every t 2 Œ0; T0 ,

0 .t/  20 1 C 2ı.2M C 1/  0 1 : (14.30)

It follows from (14.26) and (14.30) that


Z T0
2
4M  .0/  .T0 / D  0 .t/dt  T0 0 1 ;
0

T0  4M 2 . 1 0 /1 :

This contradicts the choice of T0 (see (14.1)). The contradiction we have reached
proves Theorem 14.1.

14.5 Continuous Subgradient Projection Method

We use the notation and definitions introduced in Sect. 14.2.


Let C be a nonempty, convex, and closed set in the Hilbert space X, U be an open
and convex subset of X such that

CU

and f W U ! R1 be a convex locally Lipschitzian function.



Let x 2 U and  2 X. Set

f 0 .x; / D lim t1 Œf .x C t/  f .x/; (14.31)


t!0C

@f .xI / D fl 2 @f .x/ W hl; i D f 0 .x; /g: (14.32)

It is a well-known fact of the convex analysis that

@f .xI / 6D ;:

Let M > 1; L > 0 and assume that

C  B.0; M  1/; fy 2 X W d.y; C/  1g  U; (14.33)


jf .v1 /  f .v2 /j  Lkv1  v2 k
for all v1 ; v2 2 B.0; M C 1/ \ U: (14.34)

We will prove the following result.


Theorem 14.4. Let ı 2 .0; 1,

0 < 1 < 2 ; 1  1; (14.35)

0 D 2ı.10M C 2 .L C 1// 1
1 ; (14.36)

T0 > ı 1 .10M C 2 .L C 1//1 Œ.2 1 /1 4M 2 C L.2M C 2/ 1 (14.37)

and let

1   2 : (14.38)

Assume that x 2 W 1;1 .0; T0 I X/,

d.x.0/; C/  ı (14.39)

and that for almost every t 2 Œ0; T0 , there exists .t/ 2 X such that

B..t/; ı/ \ @f .x.t/I x0 .t// 6D ;; (14.40)


PC .x.t/  .t// 2 B.x.t/ C x0 .t/; ı/: (14.41)

Then

minff .x.t// W t 2 Œ0; T0 g  inf.f I C/ C 0 : (14.42)



Proof. Assume that (14.42) does not hold. Then there exists

z2C (14.43)

such that

f .x.t// > f .z/ C 0 ; t 2 Œ0; T0 : (14.44)

For almost every t ∈ [0, T₀] set

ξ(t) = x(t) + x′(t).   (14.45)

It is clear that ξ: [0, T₀] → X is a Bochner integrable function. In view of (14.41) and
(14.45), for almost every t ∈ [0, T₀],

B(ξ(t), δ) ∩ C ≠ ∅.   (14.46)

Define

Cı D fx 2 X W d.x; C/  ıg: (14.47)

Clearly, Cı is a convex closed set, for each x 2 Cı ,

B.x; ı/ \ C 6D ; (14.48)

and in view of (14.46),

.t/ 2 Cı for almost every t 2 Œ0; T0 : (14.49)

Evidently, the function eˢξ(s), s ∈ [0, T₀], is Bochner integrable.

We claim that for all t ∈ [0, T₀],

x(t) = e⁻ᵗ x(0) + e⁻ᵗ ∫₀ᵗ eˢ ξ(s) ds.   (14.50)

Clearly, (14.50) holds for t = 0. For every t ∈ (0, T₀] we have

∫₀ᵗ eˢ ξ(s) ds = ∫₀ᵗ eˢ (x(s) + x′(s)) ds = ∫₀ᵗ (eˢ x(s))′ ds = eᵗ x(t) − x(0).

This implies (14.50) for all t ∈ [0, T₀].



By (14.50), for all t ∈ [0, T₀],

x(t) = e⁻ᵗ x(0) + (1 − e⁻ᵗ)(1 − e⁻ᵗ)⁻¹ e⁻ᵗ ∫₀ᵗ eˢ ξ(s) ds
= e⁻ᵗ x(0) + (1 − e⁻ᵗ) ∫₀ᵗ eˢ (eᵗ − 1)⁻¹ ξ(s) ds.   (14.51)

In view of (14.49), for all t ∈ [0, T₀],

∫₀ᵗ eˢ (eᵗ − 1)⁻¹ ξ(s) ds ∈ C_δ.   (14.52)

Relations (14.39), (14.51), and (14.52) imply that

x(t) ∈ C_δ for all t ∈ [0, T₀].   (14.53)

It follows from (14.48) and (14.53) that for every t 2 Œ0; T0 , there exists

xO .t/ 2 C (14.54)

such that

kx.t/  xO .t/k  ı: (14.55)

By (14.55) and Lemma 2.2, for almost every t 2 Œ0; T0 ,

hOx.t/  PC .x.t/  .t//; x.t/  .t/  PC .x.t/  .t//i  0: (14.56)

Inequality (14.56) implies that for almost every t 2 Œ0; T0 ;

hx.t/  PC .x.t/  .t//; x.t/  .t/  PC .x.t/  .t//i


 hx.t/  xO .t/; x.t/  .t/  PC .x.t/  .t//i: (14.57)

It follows from (14.43) and Lemma 2.2 that

hz  PC .x.t/  .t//; x.t/  .t/  PC .x.t/  .t//i  0: (14.58)

In view of (14.32) and (14.40), for almost every t 2 Œ0; T0  there exists
O 2 @f .x.t/I x0 .t//
.t/ (14.59)

such that

O
f 0 .x.t/; x0 .t// D h.t/; x0 .t/i; (14.60)

O
k.t/  .t/k  ı: (14.61)

In view of (14.32) and (14.59), for almost every t 2 Œ0; T0 ,


O
f .z/  f .x.t// C h.t/; z  x.t/i  0

and
O
f .x.t//  f .z/  h.t/; O
x.t/  z C x0 .t/i  h.t/; x0 .t/i: (14.62)

By (14.41), for almost every t 2 Œ0; T0 ,

k.x.t/ C x0 .t//  PC .x.t/  .t//k  ı: (14.63)

Relations (14.45) and (14.63) imply that for almost every t 2 Œ0; T0 ,

hz  x.t/  x0 .t/; x.t/  .t/  x.t/  x0 .t/i


D hz  .t/; x.t/  .t/  .t/i
D hz  .t/; x.t/  .t/  PC .x.t/  .t//i
Chz  .t/; PC .x.t/  .t//  .t/i
 hz  .t/; x.t/  .t/  PC .x.t/  .t//i
Cıkz  .t/k: (14.64)

In view of (14.33) and (14.53), for all t 2 Œ0; T0 ,

kx.t/k  M: (14.65)

It follows from (14.33), (14.41), (14.45), and (14.65) that for almost every t 2
Œ0; T0 ,

k .t/k D kx.t/ C x0 .t/k  M; (14.66)

kx0 .t/k  2M: (14.67)

By (14.33), (14.43), (14.64), and (14.66),

hz  x.t/  x0 .t/;  .t/  x0 .t/i


 hz  .t/; x.t/  .t/  PC .x.t/  .t//i C 2Mı: (14.68)

By (14.33), (14.34), (14.38), (14.45), (14.53), (14.58), (14.59), (14.61), (14.63), and
(14.65),

hz  .t/; x.t/  .t/  PC .x.t/  .t//i


 hz  PC .x.t/  .t//; x.t/  .t/  PC .x.t/  .t//i
Cıkx.t/  .t/  PC .x.t/  .t//k  ı.2M C 2 .L C 1//:
(14.69)

In view of (14.68) and (14.69), for almost every t 2 Œ0; T0 ,

hx.t/ C x0 .t/  z; .t/ C x0 .t/i  ı.4M C 2 .L C 1//: (14.70)

It follows from (14.45), (14.61), (14.62), and (14.66) that for almost every
t 2 Œ0; T0 ,

f .x.t//  f .z/
O  .t/; x.t/  z C x0 .t/i
 h.t/; x.t/  z C x0 .t/i C h.t/
O
h.t/; x0 .t/i C h.t/  .t/; x0 .t/i
 h.t/; x.t/  z C x0 .t/i  h.t/; x0 .t/i C 4Mı: (14.71)

Relations (14.70) and (14.71) imply that for almost every t 2 Œ0; T0 ,

f .x.t//  f .z/
 1 hx0 .t/; x.t/ C x0 .t/  zi C 1 .4M C 2 .L C 1//ı
h.t/; x0 .t/i C 4Mı: (14.72)

By (14.44) and (14.72), for almost every t 2 Œ0; T0 ,

0 < f .x.t//  f .z/


  1 kx0 .t/k2  1 hx0 .t/; x.t/  zi
h.t/; x0 .t/i C ı.8M C 2 .L C 1// 1
1 (14.73)

and

1 kx0 .t/k2 C 1 hx0 .t/; x.t/  zi


Ch.t/; x0 .t/i C f .x.t//  f .z/
 ı.8M C 2 .L C 1// 1
1 : (14.74)

In view of (14.61), (14.67), and (14.74),

1 kx0 .t/k2 C 1 hx0 .t/; x.t/  zi


O
Ch.t/; x0 .t/i C f .x.t//  f .z/
 ı.10M C 2 .L C 1// 1
1 : (14.75)

It follows from (14.60), (14.75), and Corollary 14.3 that for almost every t 2 Œ0; T0 ,

.2 /1 .d=dt/.kx.t/  zk2 / C f 0 .x.t/; x0 .t//


Cf .x.t//  f .z/ C 1 kx0 .t/k2  ı.10M C 2 .L C 1// 1
1 :

Using Proposition 14.2, the equality

f 0 .x.t/; x0 .t// D .f ı x/0 .t/

and integrating the inequality above over the interval Œ0; t, we obtain that for all
t 2 Œ0; T0 ,

.2 /1 kx.t/  zk2  .2 /1 kx.0/  zk2


Z t
Cf .x.t//  f .x.0// C .f .x.s//  f .z//ds  ıt.10M C 2 .L C 1// 1
1 :
0
(14.76)

Relations (14.33), (14.34), (14.48), and (14.53) imply that for all t 2 Œ0; T0 ,

f .x.t//  inf.f I C/  ıL: (14.77)

By (14.36), (14.38), (14.73), (14.76), and (14.77), for all t 2 Œ0; T0 ,

.2 2 /1 kx.t/  zk2  .2 1 /1 kx.0/  zk2


C inf.f I C/  f .x.0//  ıL
Z t
 ıt.10M C 2 .L C 1// 1
1  .f .x.s//  f .z//ds
0

 ıt.10M C 2 .L C 1// 1 1
1  0 t D ıt.10M C 2 .L C 1// 1 :

The relation above with t D T0 implies that

.2 1 /1 kx.0/  zk2 C inf.f I C/  f .x.0//  ıL


 ıT0 .10M C 2 .L C 1// 1
1 : (14.78)

In view of (14.33), (14.34), (14.48), and (14.53),

f .x.0//  sup.f I C/ C ıL: (14.79)

Relations (14.33), (14.34), and (14.79) imply that

inf.f I C/  f .x.0//  inf.f I C/  sup.f I C/  ıL  L.2M C 1/: (14.80)

It follows from (14.33), (14.43), (14.65), and (14.78) that

.2 1 /1 4M 2  L.2M C 2/  ıT0 .10M C 2 .L C 1// 1


1 ;

ıT0  ..2 1 /1 4M 2 C L.2M C 2//.10M C 2 .L C 1//1 1 :



This contradicts (14.37). The contradiction we have reached completes the proof of
Theorem 14.4.
In Theorem 14.4 δ is the computational error. According to this result we obtain
a point ξ ∈ C_δ (see (14.47), (14.53)) such that

f(ξ) ≤ inf(f; C) + c₁δ

[see (14.36), (14.42)], during a period of time c₂δ⁻¹ [see (14.37)], where c₁, c₂ > 0
are constants depending only on φ₁, φ₂, L, M.
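Conditions (14.41) and (14.45) suggest the projected dynamics x′(t) ≈ P_C(x(t) − ℓ(t)) − x(t) with ℓ(t) a scaled gradient. The following is a hypothetical Euler discretization of these dynamics for a smooth convex f over a box; the objective and the parameters h, phi are assumptions of the sketch, not values from the theorem.

```python
# Euler discretization of x'(t) = P_C(x(t) - phi * f'(x(t))) - x(t), C = [0,1]^2.

def grad_f(x):                              # assumed objective (x1-3)^2 + (x2-3)^2
    return [2*(x[0] - 3.0), 2*(x[1] - 3.0)]

def proj_C(y):                              # projection onto the box [0, 1]^2
    return [min(max(yi, 0.0), 1.0) for yi in y]

h, phi = 0.1, 0.1                           # assumed step size and scaling
x = [-2.0, 0.5]
for _ in range(500):
    target = proj_C([xi - phi*gi for xi, gi in zip(x, grad_f(x))])
    x = [xi + h*(ti - xi) for xi, ti in zip(x, target)]
```

The fixed points of this discretization satisfy x = P_C(x − φ∇f(x)), i.e. they minimize f over C; here the iterates approach the constrained minimizer (1, 1).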
Chapter 15
Penalty Methods

In this chapter we use the penalty approach in order to study constrained mini-
mization problems in infinite dimensional spaces. A penalty function is said to have
the exact penalty property if there is a penalty coefficient for which a solution of
an unconstrained penalized problem is a solution of the corresponding constrained
problem. Since we consider optimization problems in general Banach spaces, not
necessarily finite-dimensional, the existence of solutions of original constrained
problems and corresponding penalized unconstrained problems is not guaranteed.
For this reason we deal with approximate solutions and with an approximate exact
penalty property which contains the classical exact penalty property as a particular
case. In our recent research we established the approximate exact penalty property
for a large class of inequality-constrained minimization problems. In this chapter
we improve this result and obtain an estimation of the exact penalty.

15.1 An Estimation of Exact Penalty in Constrained


Optimization

Penalty methods are an important and useful tool in constrained optimization. See,
for example, [25, 33, 43, 45, 49, 57, 80, 85, 117, 121] and the references mentioned
there. In this chapter we use the penalty approach in order to study constrained
minimization problems in infinite dimensional spaces. A penalty function is said
to have the exact penalty property if there is a penalty coefficient for which a
solution of an unconstrained penalized problem is a solution of the corresponding
constrained problem.
The notion of exact penalization was introduced by Eremin [48] and Zangwill
[114] for use in the development of algorithms for nonlinear constrained optimiza-
tion. Since that time, exact penalty functions have continued to play an important
role in the theory of mathematical programming.


In our recent research, which was summarized in [121] we established the


approximate exact penalty property for a large class of inequality-constrained
minimization problems. This approximate exact penalty property can be used
for approximate solutions and contains the classical exact penalty property as a
particular case. In this chapter we obtain an estimation of the exact penalty.
We use the convention that   1 D 1 for all  2 .0; 1/,  C 1 D 1 and
maxf; 1g D 1 for every real number  and that supremum over empty set is
1. For every real number  put C D maxf; 0g.
We use the following notation and definitions.
Let .X; k  k/ be a Banach space. For every point x 2 X and every positive number
r > 0 put

B.x; r/ D fy 2 X W kx  yk  rg:

For every function f W X ! R1 [ f1g and every nonempty set A  X define

dom.f / D fx 2 X W f .x/ < 1g;


inf.f / D infff .z/ W z 2 Xg

and

inf.f I A/ D infff .z/ W z 2 Ag:

For every point x 2 X and every nonempty set B  X define

d.x; B/ D inffkx  yk W y 2 Bg: (15.1)

Let n ≥ 1 be an integer. For every κ ∈ (0, 1) denote by Ω_κ the set of all vectors
λ = (λ₁, …, λₙ) ∈ ℝⁿ such that

κ ≤ min{λᵢ : i = 1, …, n} and max{λᵢ : i = 1, …, n} = 1.   (15.2)

Let gi W X ! R1 [ f1g, i D 1; : : : ; n be convex lower semicontinuous functions


and c D .c1 ; : : : ; cn / 2 Rn . Define

A D fx 2 X W gi .x/  ci for all i D 1; : : : ; ng: (15.3)

Let f W X ! R1 [ f1g be a bounded from below lower semicontinuous function


which satisfies the following growth condition

lim f .x/ D 1: (15.4)


kxk!1

We suppose that there exists a point xQ 2 X such that

gj .Qx/ < cj for all j D 1; : : : ; n and f .Qx/ < 1: (15.5)



We consider the following constrained minimization problem

f .x/ ! min subject to x 2 A: (P)

By (15.5), A 6D ; and inf.f I A/ < 1.


For every λ = (λ₁, …, λₙ) ∈ (0, ∞)ⁿ define

ψ_λ(z) = f(z) + Σᵢ₌₁ⁿ λᵢ max{gᵢ(z) − cᵢ, 0}, z ∈ X.   (15.6)

It is clear that for every vector λ ∈ (0, ∞)ⁿ the function ψ_λ: X → ℝ¹ ∪ {∞}
is bounded from below and lower semicontinuous and satisfies inf(ψ_λ) < ∞. We
associate with problem (P) the corresponding family of unconstrained minimization
problems

ψ_λ(z) → min, z ∈ X   (P_λ)

where λ ∈ (0, ∞)ⁿ.
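The exact penalty phenomenon behind the results of this chapter can be seen on a one-dimensional instance: minimize f(x) = x² subject to g(x) = 1 − x ≤ 0. This sketch (with n = 1, λ₁ = 1, c₁ = 0, and a ternary-search minimizer as an illustrative helper) is an assumed example, not from the text: below a threshold penalty parameter the penalized minimizer is infeasible, and above it the penalized and constrained problems share the solution x = 1.

```python
# Penalized objective psi_gamma(x) = f(x) + gamma * max{g(x) - c, 0}, as in (15.6).

def psi(x, gamma):
    return x*x + gamma*max(1.0 - x, 0.0)    # f(x) = x^2, g(x) = 1 - x, c = 0

def argmin_ternary(obj, lo=-5.0, hi=5.0, iters=200):
    # ternary search is valid here because psi(., gamma) is convex
    for _ in range(iters):
        m1 = lo + (hi - lo)/3.0
        m2 = hi - (hi - lo)/3.0
        if obj(m1) <= obj(m2):
            hi = m2
        else:
            lo = m1
    return 0.5*(lo + hi)

x_small = argmin_ternary(lambda x: psi(x, 1.0))  # below the exact-penalty threshold
x_large = argmin_ternary(lambda x: psi(x, 4.0))  # above the threshold (here 2)
```

For gamma = 1 the penalized minimizer is x = 0.5, which violates the constraint, while for gamma = 4 it is exactly the constrained solution x = 1.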


Assume that there exists a function h W X  dom.f / ! R1 [ f1g such that the
following assumptions hold:
(A1) h.z; y/ is finite for every pair of points y; z 2 dom.f / and h.y; y/ D 0 for
every point y 2 dom.f /.
(A2) For every point y 2 dom.f / the function h.; y/ ! R1 [ f1g is convex.
(A3) For every point z 2 dom.f / and every positive number r

supfh.z; y/ W y 2 dom.f / \ B.0; r/g < 1:

(A4) For every positive number M there is M1 > 0 such that for every point y 2 X
satisfying f .y/  M there exists a neighborhood V of y in X such that if
z 2 V, then

f .z/  f .y/  M1 h.z; y/:

Remark 15.1. Note that if the function f is convex, then assumptions (A1)–(A4)
hold with h.z; y/ D f .z/  f .y/, z 2 X, y 2 dom.f /. In this case M1 D 1 for all
M > 0. If the function f is finite-valued and Lipschitzian on all bounded subsets of
X, then assumptions (A1)–(A4) hold with h.z; y/ D kz  yk for all z; y 2 X.
Let  2 .0; 1/. The main result of [118] (Theorem 15.2 stated below) imply that if
 is sufficiently large, then any solution of problem .P / with 2 ˝ is a solution
of problem .P/. Note that if the space X is infinite-dimensional, then the existence

of solutions of problems .P / and .P/ is not guaranteed. In this case Theorem 15.2
implies that for each  > 0 there exists ı./ > 0 which depends only on  such that
the following property holds:
If   0 , 2 ˝ and if x is a ı-approximate solution of .P /, then there exists
an -approximate solution y of .P/ such that ky  xk  .
Here 0 is a positive constant which does not depend on .
It should be mentioned that we deal with penalty functions whose penalty
parameters for the constraints g₁, …, gₙ are γλ₁, …, γλₙ respectively, where γ > 0
and (λ₁, …, λₙ) ∈ Ω_κ for a given κ ∈ (0, 1). Note that the vector (1, 1, …, 1) ∈ Ω_κ
for any κ ∈ (0, 1). Therefore our results also include the case λ₁ = ⋯ = λₙ = 1,
where one single parameter γ is used for all constraints. Note that it is sometimes
advantageous, from numerical considerations, to use penalty coefficients γλ₁, …, γλₙ
with different parameters λᵢ, i = 1, …, n, for example in the case when some of
the constraint functions are very "small" and some are very "large."
The next theorem is the main result of [118].
Theorem 15.2. Let κ ∈ (0, 1). Then there exists a positive number γ₀ such that for
each ε > 0 there exists δ ∈ (0, ε) such that the following assertion holds:
If λ ∈ Ω_κ, γ ≥ γ₀ and if x ∈ X satisfies

ψ_{γλ}(x) ≤ inf(ψ_{γλ}) + δ,

then there exists y ∈ A such that

‖y − x‖ ≤ ε and f(y) ≤ inf(f; A) + ε.

Note that Theorem 15.2 is just an existence result and it does not provide any
estimation of the constant 0 . In this chapter we prove the main result of [119]
which improves Theorem 15.2 and provides an estimation of the exact penalty 0 .
In view of (15.4) and (15.5), there exists a positive number M such that

if y 2 X satisfies f .y/  jf .Qx/j C 1; then kyk < M: (15.7)

By (15.7), we have

kQxk < M: (15.8)

In view of (A4), there exists a positive number M1 such that the following property
holds:
(P1) for every point y 2 X satisfying f .y/  jf .Qx/jC1 there exists a neighborhood
V of y in X such that f .z/  f .y/  M1 h.z; y/ for all z 2 V.

By (15.4), (15.5), and assumption (A3), there exists a positive number M2 such
that

supfh.Qx; z/ W z 2 X and f .z/  f .Qx/ C 1g  M2 :

Remark 15.3. If the function f is convex, then by Remark 15.1, we choose h.z; y/ D
f .z/  f .y/ for all z 2 X and all y 2 dom.f / with M1 D 1 for all M > 0 and then

supfh.Qx; z/ W z 2 X and f .z/  f .Qx/ C 1g


 supff .Qx/  f .z/ W z 2 X and f .z/  f .Qx/ C 1g D f .Qx/  inf.f /:

Thus in this case M2 can be any positive number such that M2  f .Qx/  inf.f /.
If the function f is finite-valued and Lipschitzian on bounded subsets of X, then
by Remark 15.1, we choose h.z; y/ D kz  yk for all z; y 2 X and M1 is a Lipschitz
constant of the restriction of f to B.0; M/. In this case

supfh.Qx; z/ W z 2 X and f .z/  f .Qx/ C 1g


 supfkQx  zk W z 2 B.0; M/g  2M

and M2 D M.
Let  2 .0; 1/. Fix a number 0 > 1 such that

X
n
 .ci  gi .Qx// > maxf21 2 2
0 M1 M2 ; 80 M g: (15.9)
iD1

We will prove the following result obtained in [119].


Theorem 15.4. For each ε ∈ (0, 1), each λ ∈ Ω_κ, each γ ≥ γ₀ and each x ∈ X
which satisfies

ψ_{γλ}(x) ≤ inf(ψ_{γλ}) + (2γ₀)⁻¹ ε

there exists y ∈ A such that

‖y − x‖ ≤ ε and f(y) ≤ ψ_{γλ}(x) ≤ inf(f; A) + ε.

15.2 Proof of Theorem 15.4

Assume that the theorem does not hold. Then there exist

 2 .0; 1/; D . 1 ; : : : ; n / 2 ˝ ; (15.10)


  0 and xN 2 X (15.11)

such that

 .N
x/  inf.  / C 21 1
0 (15.12)

and

fy 2 B.Nx; / \ A W  .y/   .N
x/g D ;: (15.13)

By (15.12) and Ekeland’s variational principle [50] (see Theorem 15.19), there
exists a point yN 2 X such that

 .N
y/   .N
x/; (15.14)
1
kNy  xN k  2  (15.15)

and

 .N
y/   .z/ C 1
0 kz  y
N k for all z 2 X: (15.16)

In view of (15.13)–(15.15),

yN 62 A: (15.17)

Define

I1 D fi 2 f1; : : : ; ng W gi .Ny/ > ci g; (15.18)


I2 D fi 2 f1; : : : ; ng W gi .Ny/ D ci g;
I3 D fi 2 f1; : : : ; ng W gi .Ny/ < ci g:

By (15.17) and (15.18), we have

I1 6D ;: (15.19)

Relations from (15.3), (15.6), (15.10), (15.11), (15.12), (15.14), (15.18), and (15.19)
imply that

infff .z/ W z 2 Ag D inff  .z/ W z 2 Ag  inf.  /

  .N
x/ 1  .N
y/  1 D f .Ny/
X
C  i .gi .Ny/  ci /  1: (15.20)
i2I1

By (15.20), (15.18), and (15.5),

f .Ny/  infff .z/ W z 2 Ag C 1  f .Qx/ C 1: (15.21)



In view of (15.21) and (15.7),

kNyk < M: (15.22)

Property (P1), (15.21), and (15.22) imply that there exists an open neighborhood V
of the point yN in X such that

V  B.0; M/; (15.23)

f .z/  f .Ny/  M1 h.z; yN / for each z 2 V: (15.24)

Since the functions gi ; i D 1; : : : ; n are lower semicontinuous it follows from


(15.18) that there exists a positive number r < 1 such that for every point y 2 B.Ny; r/,

gi .y/ > ci for each i 2 I1 : (15.25)

By (15.21), (15.5), (15.14), (15.12), (15.25), and (15.16), for every point $z \in B(\bar y, r) \cap \mathrm{dom}(f)$, we have

$\Lambda\sum_{i \in I_1}\lambda_i(g_i(z) - c_i) + \Lambda\sum_{i \in I_2 \cup I_3}\lambda_i\max\{g_i(z) - c_i, 0\}$
$- \Lambda\sum_{i \in I_1}\lambda_i(g_i(\bar y) - c_i) - \Lambda\sum_{i \in I_2 \cup I_3}\lambda_i\max\{g_i(\bar y) - c_i, 0\}$
$= \psi_{\Lambda\lambda}(z) - \psi_{\Lambda\lambda}(\bar y) - f(z) + f(\bar y) \ge -\varepsilon\Lambda_0^{-1}\|\bar y - z\| - f(z) + f(\bar y).$

Combined with (15.11) this relation implies that for every point $z \in B(\bar y, r)$,

$\sum_{i \in I_1}\lambda_i g_i(z) + \sum_{i \in I_2 \cup I_3}\lambda_i\max\{g_i(z) - c_i, 0\}$
$\ge \sum_{i \in I_1}\lambda_i g_i(\bar y) + \sum_{i \in I_2 \cup I_3}\lambda_i\max\{g_i(\bar y) - c_i, 0\}$
$+ \Lambda^{-1}(f(\bar y) - f(z)) - \varepsilon\Lambda_0^{-2}\|\bar y - z\|.$

By this inequality, (15.23) and (15.24), for every point $z \in B(\bar y, r) \cap V$,

$\sum_{i \in I_1}\lambda_i g_i(z) + \sum_{i \in I_2 \cup I_3}\lambda_i\max\{g_i(z) - c_i, 0\}$
$+ \Lambda^{-1}M_1 h(z, \bar y) + \varepsilon\Lambda_0^{-2}\|z - \bar y\|$
$\ge \sum_{i \in I_1}\lambda_i g_i(\bar y) + \sum_{i \in I_2 \cup I_3}\lambda_i\max\{g_i(\bar y) - c_i, 0\}.$   (15.26)
In view of (A2), the function

$z \mapsto \sum_{i \in I_1}\lambda_i g_i(z) + \sum_{i \in I_2 \cup I_3}\lambda_i\max\{g_i(z) - c_i, 0\} + \Lambda^{-1}M_1 h(z, \bar y) + \varepsilon\Lambda_0^{-2}\|z - \bar y\|, \quad z \in X,$

is convex. Combined with the equality $h(\bar y, \bar y) = 0$ [see (A1)] this implies that
(15.26) holds true for every point $z \in X$.

Since relation (15.26) is valid for $z = \tilde x$, relations (15.5), (15.10), (15.2), (15.11),
(15.18), and (15.19) imply that

$\sum_{i \in I_1}\lambda_i g_i(\tilde x) + \Lambda^{-1}M_1 h(\tilde x, \bar y) + \varepsilon\Lambda_0^{-2}\|\tilde x - \bar y\| \ge \sum_{i \in I_1}\lambda_i g_i(\bar y) > \sum_{i \in I_1}\lambda_i c_i.$

Combined with (15.21), (15.5), (15.8), (15.22), (15.10), (15.2), assumption (A1)
and the choice of $M_2$ (see Sect. 15.1) this implies that

$4\varepsilon\Lambda_0^{-2}M + \Lambda_0^{-1}M_1\sup\{h(\tilde x, z) : z \in X \text{ and } f(z) \le f(\tilde x) + 1\}$
$\ge 4\varepsilon\Lambda_0^{-2}M + \Lambda_0^{-1}M_1(h(\tilde x, \bar y))_+ \ge \sum_{i \in I_1}\lambda_i(c_i - g_i(\tilde x)) \ge \gamma\sum_{i=1}^{n}(c_i - g_i(\tilde x))$

and

$\gamma\sum_{i=1}^{n}(c_i - g_i(\tilde x)) \le 4\varepsilon\Lambda_0^{-2}M + \Lambda_0^{-1}M_1 M_2.$

This contradicts (15.9). The contradiction we have reached proves Theorem 15.4.

15.3 Infinite-Dimensional Inequality-Constrained Minimization Problems

In this section we use the penalty approach in order to study inequality-constrained
minimization problems in infinite-dimensional spaces. For these problems, a
constraint is a mapping with values in a normed ordered space. For this class of
problems we introduce penalty functions, prove the exact penalty property, and
obtain an estimate of the exact penalty. Using this exact penalty property we obtain
necessary and sufficient optimality conditions for the constrained minimization
problems.
Let $X$ be a vector space, $X'$ be the set of all linear functionals on $X$ and let $Y$ be a
vector space ordered by a convex cone $Y_+$ such that

$Y_+ \cap (-Y_+) = \{0\}, \quad \alpha Y_+ \subset Y_+ \text{ for all } \alpha \ge 0, \quad Y_+ + Y_+ \subset Y_+.$

We say that $y_1, y_2 \in Y$ satisfy $y_1 \le y_2$ if and only if $y_2 - y_1 \in Y_+$.
We add to the space $Y$ the largest element $\infty$ and suppose that $y + \infty = \infty$ for
all $y \in Y \cup \{\infty\}$ and that $\alpha \cdot \infty = \infty$ for all $\alpha > 0$.
For a mapping $F : X \to Y \cup \{\infty\}$ we set

$\mathrm{dom}(F) = \{x \in X : F(x) < \infty\}.$

A mapping $F : X \to Y \cup \{\infty\}$ is called convex if for all $x_1, x_2 \in X$ and all
$\alpha \in (0,1)$,

$F(\alpha x_1 + (1-\alpha)x_2) \le \alpha F(x_1) + (1-\alpha)F(x_2).$

A function $G : Y \to R^1$ is called increasing if $G(y_1) \le G(y_2)$ for all $y_1, y_2 \in Y$
satisfying $y_1 \le y_2$.
Assume that $p : X \to R^1 \cup \{\infty\}$ is a convex function.
Recall that for $\bar x \in \mathrm{dom}(p)$,

$\partial p(\bar x) = \{l \in X' : l(x - \bar x) \le p(x) - p(\bar x) \text{ for all } x \in X\}.$   (15.27)

The set $\partial p(\bar x)$ is the subdifferential of $p$ at the point $\bar x$.
Since we consider convex minimization problems, we need the following
two important facts of convex analysis (see Theorems 3.6.1 and 3.6.4 of [76],
respectively).

Proposition 15.5. Let $p_1 : X \to R^1 \cup \{\infty\}$ and $p_2 : X \to R^1$ be convex functions.
Then for any $\bar x \in \mathrm{dom}(p_1)$,

$\partial(p_1 + p_2)(\bar x) = \partial p_1(\bar x) + \partial p_2(\bar x).$

Proposition 15.6. Let $F : X \to Y \cup \{\infty\}$ be a convex mapping, $G : Y \to R^1$ be an
increasing convex function, $G(\infty) = \infty$ and let $\bar x \in \mathrm{dom}(F)$. Then

$\partial(G \circ F)(\bar x) = \cup\{\partial(l \circ F)(\bar x) : l \in \partial G(F(\bar x))\}.$

In this chapter we suppose that $(X, \|\cdot\|)$ is a Banach space, $(Y, \|\cdot\|)$ is a normed
space and that $(X^*, \|\cdot\|_*)$ and $(Y^*, \|\cdot\|_*)$ are their dual spaces.

We also suppose that $Y$ is ordered by a convex cone $Y_+$ which is a closed subset
of $(Y, \|\cdot\|)$.
For each function $f : Z \to R^1 \cup \{\infty\}$, where the set $Z$ is nonempty, put $\inf(f) =
\inf\{f(z) : z \in Z\}$ and for each nonempty subset $A \subset Z$ put

$\inf(f; A) = \inf\{f(z) : z \in A\}.$

For all $y \in Y$ put

$\rho(y) = \inf\{\|z\| : z \in Y \text{ and } z \ge y\}.$   (15.28)

It is not difficult to see that

$\rho(y) \ge 0 \text{ for all } y \in Y,$
$\rho(y) = 0 \text{ if and only if } y \le 0,$
$\rho(y) \le \|y\| \text{ for all } y \in Y,$   (15.29)
$\rho(\alpha y) = \alpha\rho(y) \text{ for all } \alpha \in [0, \infty) \text{ and all } y \in Y,$   (15.30)

for all $y_1, y_2 \in Y$

$\rho(y_1 + y_2) \le \rho(y_1) + \rho(y_2)$   (15.31)

and

if $y_1 \le y_2$, then $\rho(y_1) \le \rho(y_2)$.   (15.32)

Set

$\rho(\infty) = \infty.$
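In the finite-dimensional case $Y = R^n$ ordered by the nonnegative orthant $Y_+$, with the Euclidean norm, the smallest-norm majorant of $y$ is its componentwise positive part, so $\rho$ has a closed form. A minimal sketch (the function name `rho` is ours, not from the text):

```python
import math

def rho(y):
    """rho(y) = inf{||z|| : z in R^n, z >= y componentwise}, cf. (15.28).

    For y_i > 0 any majorant must satisfy z_i >= y_i, while for y_i <= 0
    the cheapest choice is z_i = 0, so the infimum is attained at
    z = max(y, 0) componentwise.
    """
    return math.sqrt(sum(max(t, 0.0) ** 2 for t in y))

# rho vanishes exactly on the vectors y <= 0, cf. the second line of (15.29):
print(rho([-1.0, -2.0]))   # 0.0
print(rho([3.0, -4.0]))    # 3.0 (only the positive component contributes)
```

The printed values illustrate (15.29) and (15.30): $\rho$ ignores the "feasible" negative components and is positively homogeneous in the violating ones.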

The functional $\rho$ was used in [115, 116, 121] for the study of minimization
problems with increasing objective functions. Here we use it in order to construct a
penalty function.
The following auxiliary result is proved in Sect. 15.4.

Lemma 15.7. Let $y \in Y \setminus (-Y_+)$ and $l \in \partial\rho(y)$. Then

$l(z) \ge 0 \text{ for all } z \in Y_+, \quad l(y) = \rho(y) \quad \text{and} \quad \|l\|_* = 1.$   (15.33)

For each $x \in X$, each $y \in Y$, and each $r > 0$ set

$B_X(x, r) = \{z \in X : \|x - z\| \le r\}, \quad B_Y(y, r) = \{z \in Y : \|y - z\| \le r\}.$

A set $E \subset Y$ is called $(\rho)$-bounded from above if the following property holds:

(P2) there exists $M_E > 0$ such that for each $y \in E$ there is $z \in B_Y(0, M_E)$ for
which $z \ge y$.
Let $f : X \to R^1 \cup \{\infty\}$ be a bounded from below lower semicontinuous function
which satisfies the growth condition

$\lim_{\|x\|\to\infty} f(x) = \infty.$   (15.34)

Assume that $G : X \to Y \cup \{\infty\}$ is a convex mapping and $c \in Y$. Set

$A = \{x \in X : G(x) \le c\}.$   (15.35)

We suppose that there exist $\tilde M > 0$, $\tilde r > 0$ and a nonempty set $\Omega \subset X$ such that

$G(x) \le c \text{ and } f(x) \le \tilde M \text{ for all } x \in \Omega$   (15.36)

and that the following property holds:

(P3) for each $h \in Y$ satisfying $\|h\| = \tilde r$ there is $x_h \in \Omega$ such that $G(x_h) + h \le c$.

By (15.36) and (15.34),

$\sup\{\|x\| : x \in \Omega\} < \infty.$

Remark 15.8. Property (P3) is an infinite-dimensional version of the Slater condition
[58]. In particular, it holds if there exists $\tilde x \in A$ such that $f(\tilde x) < \infty$ and $c - G(\tilde x)$ is
an interior point of $Y_+$ in the normed space $Y$. In this case $\tilde M$ is any positive constant
satisfying $\tilde M \ge f(\tilde x)$ and $\Omega = \{\tilde x\}$.
We assume that $G$ possesses the following two properties.

(P4) If $\{y_i\}_{i=1}^\infty \subset X$, $y \in X$, $\lim_{i\to\infty} y_i = y$ in $(X, \|\cdot\|)$ and $G(y) = \infty$, then the
sequence $\{G(y_i)\}_{i=1}^\infty$ is not $(\rho)$-bounded from above.

(P5) If $\{y_i\}_{i=1}^\infty \subset X$, $y \in X$, $\lim_{i\to\infty} y_i = y$ in $(X, \|\cdot\|)$ and $G(y) < \infty$, then
for each given $\varepsilon > 0$ and all sufficiently large natural numbers $i$ there exists
$u_i \in Y$ such that

$u_i \le G(y_i)$ and $\|u_i - G(y)\| \le \varepsilon.$

Remark 15.9. Clearly, $G$ possesses (P4) and (P5) if $G(X) \subset Y$ and $G$ is continuous.
It is easy to see that $G$ possesses (P4) and (P5) if $Y = R^n$, $Y_+ = \{y \in R^n :
y_i \ge 0,\ i = 1,\dots,n\}$, $G = (g_1,\dots,g_n)$ and the functions $g_i : X \to R^1 \cup \{\infty\}$,
$i = 1,\dots,n$, are lower semicontinuous. In general, properties (P4) and (P5) are an
infinite-dimensional version of the lower semicontinuity property.

We consider the following constrained minimization problem

$f(x) \to \min$ subject to $x \in A.$   (P)

In view of (15.36),

$A \ne \emptyset$ and $\inf(f; A) < \infty.$

For each $\Lambda > 0$ define

$\psi_\Lambda(z) = f(z) + \Lambda\rho(G(z) - c), \quad z \in X.$   (15.37)

The set $\{\psi_\Lambda : \Lambda > 0\}$ is our family of penalty functions.
The following auxiliary result is proved in Sect. 15.4.

Lemma 15.10. For each $\Lambda > 0$ the function $\psi_\Lambda : X \to R^1 \cup \{\infty\}$ is lower
semicontinuous.

By (15.34) there is $M_1 > 0$ such that

$\|z\| < M_1$ for each $z \in X$ satisfying $f(z) \le \tilde M + 1.$   (15.38)
We use the following assumption.

Assumption (A1). There is $M_0 > 0$ such that for each $x \in X$ satisfying $f(x) \le
\tilde M + 1$ there is a neighborhood $V$ of $x$ in $(X, \|\cdot\|)$ such that for each $z \in V$, $f(z)$ is
finite and

$|f(z) - f(x)| \le M_0\|x - z\|.$   (15.39)

Remark 15.11. Note that assumption (A1) is a form of a local Lipschitz property
for $f$ on the sublevel set $f^{-1}((-\infty, \tilde M + 1])$.
The following theorem is the first main result of this section.

Theorem 15.12. Assume that (A1) holds, $M_0 > 0$ is as guaranteed by (A1) and let
$\Lambda_0 > 1$ satisfy

$\tilde r > 4(2\Lambda_0^{-1} + M_0\Lambda_0^{-1})(\sup\{\|z\| : z \in \Omega\} + M_1).$   (15.40)

Then for each $\varepsilon \in (0,1)$, each $\Lambda \ge \Lambda_0$ and each $x \in X$ which satisfies

$\psi_\Lambda(x) \le \inf(\psi_\Lambda) + (2\Lambda_0)^{-1}\varepsilon$

there exists $y \in A$ such that

$\|y - x\| \le \varepsilon$ and $f(y) \le \inf(\psi_\Lambda) + \varepsilon \le \inf(f; A) + \varepsilon.$
Corollary 15.13. Assume that (A1) holds, $M_0 > 0$ is as guaranteed by (A1) and
let $\Lambda_0 > 1$ satisfy (15.40). Then for each $\Lambda \ge \Lambda_0$ and each sequence $\{x_i\}_{i=1}^\infty \subset X$
satisfying

$\lim_{i\to\infty} \psi_\Lambda(x_i) = \inf(\psi_\Lambda)$   (15.41)

there exists a sequence $\{y_i\}_{i=1}^\infty \subset A$ such that

$\lim_{i\to\infty} \|y_i - x_i\| = 0$ and $\lim_{i\to\infty} f(y_i) = \inf(f; A).$   (15.42)

Moreover, for each $\Lambda \ge \Lambda_0$,

$\inf(f; A) = \inf(\psi_\Lambda).$

Corollary 15.14. Assume that (A1) holds, $M_0 > 0$ is as guaranteed by (A1) and let
$\Lambda_0 > 1$ satisfy (15.40). Then if $\Lambda \ge \Lambda_0$ and if $x \in X$ satisfies

$\psi_\Lambda(x) = \inf(\psi_\Lambda),$   (15.43)

then $x \in A$ and

$f(x) = \psi_\Lambda(x) = \inf(\psi_\Lambda) = \inf(f; A).$   (15.44)

Theorem 15.12 is proved in Sect. 15.5. In our second main result of this section
we do not assume (A1). Instead we assume that the function $f$ is convex and
that the mapping $G$ is finite-valued.

Theorem 15.15. Assume that $G(X) \subset Y$, the function $f$ is convex and let $\Lambda_0 > 1$
satisfy

$\tilde r > 4\Lambda_0^{-2}(\sup\{\|x\| : x \in \Omega\} + M_1) + 4\Lambda_0^{-1}(\tilde M - \inf(f)).$   (15.45)

Then for each $\varepsilon > 0$, each $\Lambda \ge \Lambda_0$ and each $x \in X$ which satisfies

$\psi_\Lambda(x) \le \inf(\psi_\Lambda) + (2\Lambda_0)^{-1}\varepsilon$

there exists $y \in A$ such that

$\|y - x\| \le \varepsilon$ and $f(y) \le \inf(\psi_\Lambda) + \varepsilon \le \inf(f; A) + \varepsilon.$

Corollary 15.16. Let the assumptions of Theorem 15.15 hold. Then for each $\Lambda \ge
\Lambda_0$ and each sequence $\{x_i\}_{i=1}^\infty \subset X$ satisfying (15.41) there exists a sequence
$\{y_i\}_{i=1}^\infty \subset A$ such that (15.42) holds. Moreover, for each $\Lambda \ge \Lambda_0$,

$\inf(f; A) = \inf(\psi_\Lambda).$

Corollary 15.17. Let the assumptions of Theorem 15.15 hold. Then if $\Lambda \ge \Lambda_0$ and
if $x \in X$ satisfies (15.43), then $x \in A$ and (15.44) holds.
Theorem 15.15 is proved in Sect. 15.6.
Using our exact penalty results we obtain necessary and sufficient optimality
conditions for the constrained minimization problem (P) with a convex objective
function $f$.

Theorem 15.18. Assume that $G(X) \subset Y$, the function $f$ is convex, $\Lambda_0 > 1$ satisfies
(15.45), $\Lambda \ge \Lambda_0$ and $\bar x \in X$. Then the following assertions are equivalent.

1. $\bar x \in A$ and $f(\bar x) = \inf(f; A)$.
2. $\psi_\Lambda(\bar x) = \inf(\psi_\Lambda)$.
3. There exist

$l_0 \in \partial\rho(G(\bar x) - c), \quad l_1 \in \partial(l_0 \circ (G(\cdot) - c))(\bar x)$

and $l_2 \in \partial f(\bar x)$ such that $\Lambda l_1 + l_2 = 0$.

Proof. The equivalence of assertions 1 and 2 follows from Corollaries 15.16
and 15.17. Therefore it is sufficient to show that assertions 2 and 3 are equivalent.
It is clear that if at least one of assertions 2 and 3 holds, then $\bar x \in \mathrm{dom}(f)$.
Therefore we may assume that $\bar x \in \mathrm{dom}(f)$. Clearly, the function $\psi_\Lambda$ is convex.
By Propositions 15.5 and 15.6,

$\partial\psi_\Lambda(\bar x) = \partial f(\bar x) + \Lambda\,\partial(\rho \circ (G(\cdot) - c))(\bar x)$
$= \partial f(\bar x) + \Lambda \cup \{\partial(l \circ (G(\cdot) - c))(\bar x) : l \in \partial\rho(G(\bar x) - c)\}.$

Now in order to complete the proof it is sufficient to note that assertion 2 holds if
and only if

$0 \in \partial\psi_\Lambda(\bar x).$

It should be mentioned that Theorem 15.18 is an infinite-dimensional version of the
classical Karush-Kuhn-Tucker theorem [58].
Assume now that the assumptions of Theorem 15.18 hold and assertions 1, 2,
and 3 hold. Let $l_0$, $l_1$, and $l_2$ be as guaranteed by assertion 3. Then

$l_0(G(\bar x) - c) = \rho(G(\bar x) - c) = 0$

because $G(\bar x) \le c$. This is an infinite-dimensional version of the complementary
slackness condition [58].
The results of this section were obtained in [123].
In the proof of Theorem 15.12 we use the following fundamental variational
principle of Ekeland [50].

Theorem 15.19. Assume that $(Z, d)$ is a complete metric space and that $\phi : Z \to
R^1 \cup \{\infty\}$ is a lower semicontinuous bounded from below function which is not
identically $\infty$. Let $\varepsilon > 0$ and $x_0 \in Z$ be given such that

$\phi(x_0) \le \phi(x) + \varepsilon$ for all $x \in Z.$

Then for any $\lambda > 0$ there is $\bar x \in Z$ such that

$\phi(\bar x) \le \phi(x_0), \quad d(\bar x, x_0) \le \lambda,$

$\phi(x) + (\varepsilon/\lambda)d(x, \bar x) > \phi(\bar x)$ for all $x \in Z \setminus \{\bar x\}.$

15.4 Proofs of Auxiliary Results

Proof of Lemma 15.7. It is not difficult to see that

$\|l\|_* \le 1, \quad l(y) = \rho(y) > 0, \quad l(z) \ge 0 \text{ for all } z \in Y_+.$   (15.46)

We will show that $\|l\|_* = 1$. By (15.46) and (15.28),

$l(y) = \rho(y) = \inf\{\|z\| : z \in Y \text{ and } z \ge y\}.$   (15.47)

Let

$\varepsilon \in (0, \rho(y)/4)$

[see (15.46)]. By (15.47) there exists $z \in Y$ such that

$z \ge y$ and $\|z\| \le \rho(y) + \varepsilon.$   (15.48)

By (15.46) and (15.28),

$\|z\| > 0$   (15.49)

and

$\rho(y) = l(y) \le l(z).$   (15.50)

It follows from (15.50) and (15.48) that

$l(z) \ge \|z\| - \varepsilon.$

Together with (15.49), (15.48), and (15.47) this implies that

$\|l\|_* \ge l(z)\|z\|^{-1} \ge (\|z\| - \varepsilon)\|z\|^{-1} = 1 - \varepsilon\|z\|^{-1} \ge 1 - \varepsilon\rho(y)^{-1}.$

Since $\varepsilon$ is an arbitrary positive number satisfying $\varepsilon < \rho(y)/4$, we conclude that

$\|l\|_* \ge 1.$

Combined with (15.46) this implies that $\|l\|_* = 1$. Lemma 15.7 is proved.
Proof of Lemma 15.10. It is sufficient to show that the function $\rho(G(\cdot) - c) : X \to
R^1 \cup \{\infty\}$ is lower semicontinuous.
Assume that $y \in X$, $\{y_i\}_{i=1}^\infty \subset X$ and

$\lim_{i\to\infty} \|y_i - y\| = 0.$   (15.51)

It is sufficient to show that

$\liminf_{i\to\infty} \rho(G(y_i) - c) \ge \rho(G(y) - c).$

Extracting a subsequence and re-indexing if necessary, we may assume without loss
of generality that there exists

$\lim_{i\to\infty} \rho(G(y_i) - c) < \infty.$   (15.52)

We may assume without loss of generality that $\rho(G(y_i) - c)$ is finite for all integers
$i \ge 1$.
Let $\varepsilon > 0$. By (15.28), for any integer $i \ge 1$ there exists $z_i \in Y$ such that

$z_i \ge G(y_i) - c, \quad \|z_i\| \le \rho(G(y_i) - c) + \varepsilon/4.$   (15.53)

In view of (15.52) and (15.53), the sequence $\{\|z_i\|\}_{i=1}^\infty$ is bounded. Together with
(15.53) this implies that the sequence $\{G(y_i) - c\}_{i=1}^\infty$ is $(\rho)$-bounded from above
[see (P2)]. It follows from (P4) and (15.53) that

$G(y) < \infty.$   (15.54)

By (15.51), (15.54), and (P5) there exists a natural number $i_0$ such that for each
integer $i \ge i_0$ there is $u_i \in Y$ which satisfies

$u_i \le G(y_i)$ and $\|u_i - G(y)\| \le \varepsilon/4.$   (15.55)

In view of (15.55) and (15.53), for all integers $i \ge i_0$,

$G(y) - c = (G(y) - u_i) + u_i - c \le (G(y) - u_i) + G(y_i) - c \le (G(y) - u_i) + z_i.$   (15.56)

It follows from (15.55) and (15.53) that for all integers $i \ge i_0$,

$\|G(y) - u_i + z_i\| \le \|G(y) - u_i\| + \|z_i\| \le \varepsilon/4 + \rho(G(y_i) - c) + \varepsilon/4.$   (15.57)

By (15.56) and (15.57), for all integers $i \ge i_0$,

$\rho(G(y) - c) \le \|G(y) - u_i + z_i\| \le \rho(G(y_i) - c) + \varepsilon/2$

and

$\rho(G(y) - c) \le \lim_{i\to\infty} \rho(G(y_i) - c) + \varepsilon/2.$

Since $\varepsilon$ is an arbitrary positive number, we conclude that

$\rho(G(y) - c) \le \lim_{i\to\infty} \rho(G(y_i) - c).$

Lemma 15.10 is proved.

15.5 Proof of Theorem 15.12

We show that the following property holds:

(P6) For each $\varepsilon \in (0,1)$, each $\Lambda \ge \Lambda_0$ and each $x \in X$ which satisfies

$\psi_\Lambda(x) \le \inf(\psi_\Lambda) + (2\Lambda_0)^{-1}\varepsilon$

there exists $y \in A$ for which

$\|y - x\| \le \varepsilon$ and $\psi_\Lambda(y) \le \psi_\Lambda(x).$

(It is easy to see that (P6) implies the validity of Theorem 15.12.)
Assume the contrary. Then there exist

$\varepsilon \in (0,1), \quad \Lambda \ge \Lambda_0, \quad \bar x \in X$   (15.58)

such that

$\psi_\Lambda(\bar x) \le \inf(\psi_\Lambda) + 2^{-1}\varepsilon\Lambda_0^{-1},$   (15.59)

$\{y \in B_X(\bar x, \varepsilon) \cap A : \psi_\Lambda(y) \le \psi_\Lambda(\bar x)\} = \emptyset.$   (15.60)

It follows from (15.59), Lemma 15.10, and Theorem 15.19 that there is $\bar y \in X$ such
that

$\psi_\Lambda(\bar y) \le \psi_\Lambda(\bar x),$   (15.61)

$\|\bar y - \bar x\| \le 2^{-1}\varepsilon,$   (15.62)

$\psi_\Lambda(\bar y) \le \psi_\Lambda(z) + \varepsilon\Lambda_0^{-1}\|z - \bar y\|$ for all $z \in X.$   (15.63)

By (15.60), (15.61), and (15.62),

$\bar y \notin A.$   (15.64)
It follows from (15.37), (15.35), (15.59), and (15.61) that

$\inf\{f(z) : z \in A\} = \inf\{\psi_\Lambda(z) : z \in A\} \ge \inf(\psi_\Lambda) \ge \psi_\Lambda(\bar x) - 1$
$\ge \psi_\Lambda(\bar y) - 1 \ge f(\bar y) - 1$

and in view of (15.36),

$f(\bar y) \le \inf\{f(z) : z \in A\} + 1 \le \tilde M + 1.$   (15.65)

By (15.65) and (15.38),

$\|\bar y\| < M_1.$   (15.66)

In view of (A1) and (15.65) there exists a neighborhood $V$ of $\bar y$ in $(X, \|\cdot\|)$ such that
for each $z \in V$,

$f(z)$ is finite and $|f(z) - f(\bar y)| \le M_0\|z - \bar y\|.$   (15.67)

It follows from (15.61), (15.59), (15.67), (15.37), (15.63), and (15.58) that for each
$z \in V$,

$\Lambda\rho(G(z) - c) - \Lambda\rho(G(\bar y) - c) = \psi_\Lambda(z) - \psi_\Lambda(\bar y) - f(z) + f(\bar y)$
$\ge -\varepsilon\Lambda_0^{-1}\|z - \bar y\| - f(z) + f(\bar y) \ge -\varepsilon\Lambda_0^{-1}\|z - \bar y\| - M_0\|z - \bar y\|$

and

$\rho(G(z) - c) - \rho(G(\bar y) - c) \ge -\|z - \bar y\|(2\Lambda_0^{-1} + M_0\Lambda_0^{-1}).$

This implies that for each $z \in V$,

$\rho(G(z) - c) + (2\Lambda_0^{-1} + M_0\Lambda_0^{-1})\|z - \bar y\| \ge \rho(G(\bar y) - c).$   (15.68)

Clearly the function

$\tilde\phi(z) = \rho(G(z) - c) + (2\Lambda_0^{-1} + M_0\Lambda_0^{-1})\|z - \bar y\|, \quad z \in X,$   (15.69)

is convex. By (15.68) and (15.69),

$0 \in \partial\tilde\phi(\bar y).$   (15.70)

By (15.70), (15.69), and Proposition 15.5 there is

$l_0 \in \partial(\rho \circ (G(\cdot) - c))(\bar y)$   (15.71)

such that

$\|l_0\|_* \le 2\Lambda_0^{-1} + M_0\Lambda_0^{-1}.$   (15.72)
It follows from (15.71) and Proposition 15.6 that there exists

$l_1 \in \partial\rho(G(\bar y) - c)$   (15.73)

such that

$l_0 \in \partial(l_1 \circ (G(\cdot) - c))(\bar y).$   (15.74)

In view of (15.74), for each $z \in X$,

$l_0(z - \bar y) \le l_1(G(z) - c) - l_1(G(\bar y) - c).$   (15.75)

By (15.73), (15.64), and Lemma 15.7,

$\|l_1\|_* = 1, \quad l_1(z) \ge 0 \text{ for all } z \in Y_+, \quad l_1(G(\bar y) - c) = \rho(G(\bar y) - c).$   (15.76)

Let

$h \in Y$ and $\|h\| = \tilde r.$   (15.77)

By (15.77) and property (P3) there is

$x_h \in \Omega$   (15.78)

such that

$G(x_h) + h \le c.$   (15.79)

It follows from (15.78), (15.66), (15.72), (15.75), (15.76), and (15.79) that

$(2\Lambda_0^{-1} + M_0\Lambda_0^{-1})(M_1 + \sup\{\|x\| : x \in \Omega\}) \ge (2\Lambda_0^{-1} + M_0\Lambda_0^{-1})(\|x_h\| + \|\bar y\|)$
$\ge \|l_0\|_*(\|x_h\| + \|\bar y\|) \ge -l_0(x_h - \bar y)$
$\ge l_1(G(\bar y) - c) - l_1(G(x_h) - c) = \rho(G(\bar y) - c) - l_1(G(x_h) - c)$
$\ge -l_1(G(x_h) - c) \ge l_1(h).$   (15.80)

Since (15.80) holds for all $h$ satisfying (15.77), we conclude using (15.76) that

$(2\Lambda_0^{-1} + M_0\Lambda_0^{-1})(M_1 + \sup\{\|x\| : x \in \Omega\})$
$\ge \sup\{l_1(h) : h \in Y \text{ and } \|h\| = \tilde r\} = \tilde r\|l_1\|_* = \tilde r.$

This contradicts (15.40). The contradiction we have reached proves (P6) and
Theorem 15.12 itself.

15.6 Proof of Theorem 15.15

We show that property (P6) (see Sect. 15.5) holds. (Note that (P6) implies the
validity of Theorem 15.15.)
Assume the contrary. Then there exist

$\varepsilon \in (0,1), \quad \Lambda \ge \Lambda_0, \quad \bar x \in X$   (15.81)

such that (15.59) and (15.60) hold. It follows from (15.59), Lemma 15.10 and
Ekeland's variational principle [50] that there is $\bar y \in X$ such that (15.61)-(15.63)
hold. By (15.60), (15.61) and (15.62),

$\bar y \notin A.$   (15.82)

Arguing as in the proof of Theorem 15.12, we show that (15.37), (15.35), (15.59),
(15.61), (15.36), and (15.38) imply that

$f(\bar y) \le \tilde M + 1, \quad \|\bar y\| < M_1.$   (15.83)

It follows from (15.37) and (15.63) that for each $z \in X$,

$f(z) + \Lambda\rho(G(z) - c) - f(\bar y) - \Lambda\rho(G(\bar y) - c)$
$= \psi_\Lambda(z) - \psi_\Lambda(\bar y) \ge -\varepsilon\Lambda_0^{-1}\|z - \bar y\|$

and

$\Lambda^{-1}(f(z) - f(\bar y)) + \rho(G(z) - c) - \rho(G(\bar y) - c) \ge -\varepsilon\Lambda^{-1}\Lambda_0^{-1}\|z - \bar y\|.$

This implies that for all $z \in X$,

$\Lambda^{-1}f(z) + \rho(G(z) - c) + \varepsilon\Lambda^{-1}\Lambda_0^{-1}\|z - \bar y\| \ge \Lambda^{-1}f(\bar y) + \rho(G(\bar y) - c).$   (15.84)

Put

$\tilde\phi(z) = \rho(G(z) - c) + \Lambda^{-1}f(z) + \varepsilon\Lambda^{-1}\Lambda_0^{-1}\|z - \bar y\|, \quad z \in X.$   (15.85)

In view of (15.85), the function $\tilde\phi$ is convex. By (15.84) and (15.85),

$0 \in \partial\tilde\phi(\bar y).$   (15.86)

By (15.85), (15.86), (15.81), and Proposition 15.5 there exist

$l_1 \in \partial(\rho \circ (G(\cdot) - c))(\bar y), \quad l_2 \in \partial f(\bar y)$   (15.87)


such that

$\|l_1 + \Lambda^{-1}l_2\|_* \le \Lambda_0^{-2}.$   (15.88)

It follows from (15.87), (15.30), (15.31), and Proposition 15.6 that there exists

$l_0 \in \partial\rho(G(\bar y) - c)$   (15.89)

such that

$l_1 \in \partial(l_0 \circ (G(\cdot) - c))(\bar y).$   (15.90)

In view of (15.90), for each $z \in X$,

$l_1(z - \bar y) \le l_0(G(z) - c) - l_0(G(\bar y) - c).$   (15.91)

By (15.89), (15.82), and Lemma 15.7,

$\|l_0\|_* = 1, \quad l_0(z) \ge 0 \text{ for all } z \in Y_+, \quad l_0(G(\bar y) - c) = \rho(G(\bar y) - c).$   (15.92)

Let

$h \in Y$ and $\|h\| = \tilde r.$   (15.93)

By (15.93) and property (P3) there is

$x_h \in \Omega$   (15.94)

such that

$G(x_h) + h \le c.$   (15.95)

It follows from (15.94), (15.83), (15.88), (15.91), (15.87), (15.92), (15.36), and
(15.95) that

$\Lambda_0^{-2}(M_1 + \sup\{\|x\| : x \in \Omega\}) \ge \Lambda_0^{-2}(\|x_h\| + \|\bar y\|)$
$\ge \|l_1 + \Lambda^{-1}l_2\|_*(\|x_h\| + \|\bar y\|) \ge -(l_1 + \Lambda^{-1}l_2)(x_h - \bar y)$
$= -l_1(x_h - \bar y) - \Lambda^{-1}l_2(x_h - \bar y)$
$\ge l_0(G(\bar y) - c) - l_0(G(x_h) - c) + \Lambda^{-1}(f(\bar y) - f(x_h))$
$\ge -l_0(G(x_h) - c) - \Lambda^{-1}(\tilde M - \inf(f)) \ge l_0(h) - \Lambda^{-1}(\tilde M - \inf(f))$

and in view of (15.81),

$l_0(h) \le \Lambda_0^{-1}(\tilde M - \inf(f)) + \Lambda_0^{-2}(M_1 + \sup\{\|x\| : x \in \Omega\}).$   (15.96)

Since the inequality above holds for all $h$ satisfying (15.93), it follows from (15.92)
and (15.96) that

$\tilde r = \tilde r\|l_0\|_* = \sup\{l_0(h) : h \in Y \text{ and } \|h\| = \tilde r\}$
$\le \Lambda_0^{-1}(\tilde M - \inf(f)) + \Lambda_0^{-2}(M_1 + \sup\{\|x\| : x \in \Omega\}).$

This contradicts (15.45). The contradiction we have reached proves (P6) and
Theorem 15.15 itself.

15.7 An Application

Let $X$ be a Hilbert space equipped with an inner product $\langle\cdot,\cdot\rangle$ which induces the
complete norm $\|\cdot\|$. We use the notation and definitions introduced in Sect. 15.1.
Let $n$ be a natural number, $g_i : X \to R^1 \cup \{\infty\}$, $i = 1,\dots,n$, be convex lower
semicontinuous functions and $c = (c_1,\dots,c_n) \in R^n$. Set

$A = \{x \in X : g_i(x) \le c_i \text{ for all } i = 1,\dots,n\}.$

Let $f : X \to R^1 \cup \{\infty\}$ be a bounded from below lower semicontinuous function
which satisfies the growth condition

$\lim_{\|x\|\to\infty} f(x) = \infty.$

We suppose that there is $\tilde x \in X$ such that

$g_j(\tilde x) < c_j$ for all $j = 1,\dots,n$ and $f(\tilde x) < \infty.$   (15.97)

We consider the following constrained minimization problem

$f(x) \to \min$ subject to $x \in A.$   (P)

For each vector $\lambda = (\lambda_1,\dots,\lambda_n) \in (0,\infty)^n$ define

$\psi_\lambda(z) = f(z) + \sum_{i=1}^{n}\lambda_i\max\{g_i(z) - c_i, 0\}, \quad z \in X.$   (15.98)

Clearly, for each $\lambda \in (0,\infty)^n$ the function $\psi_\lambda : X \to R^1 \cup \{\infty\}$ is bounded from
below and lower semicontinuous and satisfies $\inf(\psi_\lambda) < \infty$.
We suppose that the function $f$ is convex. By Remark 15.1, (A1)-(A4) hold with
$h(z, y) = f(z) - f(y)$, $z \in X$, $y \in \mathrm{dom}(f)$. In this case $M_1 = 1$ for all $M > 0$.
There is $M > 0$ such that [see (15.7)]

if $y \in X$ satisfies $f(y) \le |f(\tilde x)| + 1$, then $\|y\| < M.$   (15.99)

Clearly, (P1) holds with $M_1 = 1$. In view of Remark 15.3, the constant $M_2$ can be
any positive number such that

$M_2 \ge f(\tilde x) - \inf(f).$

We suppose that $M_2$ is given.
Let $\gamma \in (0,1)$. Choose $\Lambda_0 > 1$ [see (15.9)] such that

$\gamma\sum_{i=1}^{n}(c_i - g_i(\tilde x)) > \max\{2\Lambda_0^{-1}M_2,\ 8\Lambda_0^{-2}M\}.$

By Theorem 15.4, the following property holds:

(P7) for each $\varepsilon \in (0,1)$, each $\lambda \in \Omega_\gamma$, each $\Lambda \ge \Lambda_0$ and each $x \in X$ which satisfies

$\psi_{\Lambda\lambda}(x) \le \inf(\psi_{\Lambda\lambda}) + 2^{-1}\varepsilon\Lambda_0^{-1}$

there exists $y \in A$ such that

$\|y - x\| \le \varepsilon$ and $f(y) \le \psi_{\Lambda\lambda}(x) \le \inf(f; A) + \varepsilon.$

Property (P7) implies that for each $\lambda \in \Omega_\gamma$ and each $\Lambda \ge \Lambda_0$,

$\inf(\psi_{\Lambda\lambda}) = \inf(f; A).$   (15.100)

In order to obtain an approximate solution of problem (P) we apply the
subgradient projection method, studied in Chap. 2, to the minimization of the
function $\psi_{\Lambda_0\lambda}$, where $\lambda$ is a fixed element of the set $\Omega_\gamma$.
We suppose that problem (P) has a solution

$x_* \in A$   (15.101)

such that

$f(x_*) \le f(x)$ for all $x \in A.$   (15.102)

By (15.100), (15.101), and (15.102),

$f(x_*) = \psi_{\Lambda_0\lambda}(x_*) = \inf(\psi_{\Lambda_0\lambda}).$   (15.103)
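The exact penalty phenomenon behind (15.100) is easy to see in one dimension. In the sketch below (all numbers are ours, chosen for illustration), $f(x) = (x-2)^2$ with the single constraint $x \le 1$: below the penalty threshold the unconstrained minimizer of the penalized function violates the constraint, while above it the penalized and constrained minimizers coincide and the minimal values agree, as in (15.100):

```python
# f(x) = (x - 2)^2, constraint g(x) = x <= c = 1, so inf(f; A) = f(1) = 1.
f = lambda x: (x - 2.0) ** 2
penalty = lambda x, lam: f(x) + lam * max(x - 1.0, 0.0)

xs = [i / 1000.0 for i in range(-2000, 4001)]   # grid on [-2, 4]
for lam in (0.5, 4.0):
    best = min(xs, key=lambda x: penalty(x, lam))
    print(lam, round(best, 4), round(penalty(best, lam), 4))
# lam = 0.5: minimizer 1.75 (infeasible), value 0.4375 < inf(f; A)
# lam = 4.0: minimizer 1.0 (feasible), value 1.0 = inf(f; A)
```

Here the threshold is $|f'(1)| = 2$; it plays the role that $\Lambda_0$ plays in the general theory.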


It follows from (15.97), (15.99), and (15.102) that

$\|x_*\| < M.$   (15.104)

In view of (15.103) and (15.104), we consider the minimization of $\psi_{\Lambda_0\lambda}$ on $B(0, M)$.
We suppose that there exist an open convex set $U \subset X$ and a number $L > 0$ such
that

$B(0, M + 1) \subset U,$   (15.105)

the functions $f$ and $g_i$, $i = 1,\dots,n$, are finite-valued on $U$ and that for all $x, y \in U$
and all $i = 1,\dots,n$,

$|f(x) - f(y)| \le L\|x - y\|, \quad |g_i(x) - g_i(y)| \le L\|x - y\|.$   (15.106)

In view of (15.98) and (15.106), the function $\psi_{\Lambda_0\lambda}$ is Lipschitzian on $U$ and for all
$x, y \in U$,

$|\psi_{\Lambda_0\lambda}(x) - \psi_{\Lambda_0\lambda}(y)| \le |f(x) - f(y)| + n\Lambda_0 L\|x - y\| \le L\|x - y\|(1 + \Lambda_0 n).$   (15.107)

In this section we use the projection on the set $B(0, M)$, denoted by $P_{B(0,M)}$ and
defined for each $z \in X$ by

$P_{B(0,M)}(z) = z$ if $\|z\| \le M$, $\quad P_{B(0,M)}(z) = M\|z\|^{-1}z$ if $\|z\| > M.$

We apply the subgradient projection method, studied in Chap. 2, to the minimization
of the function $\psi_{\Lambda_0\lambda}$ on the set $B(0, M)$. For each $\delta > 0$ set

$\alpha(\delta) = 2^{-1}(2M+1)^2(1 + L(\Lambda_0 n + 1))(8M+2)^{-1/2}\delta^{1/2}(1 + \Lambda_0 n)^{1/2}$
$+ \delta(1 + \Lambda_0 n)(2M+1) + (8M+2)^{1/2}(1 + L(\Lambda_0 n + 1))\delta^{1/2}(1 + \Lambda_0 n)^{1/2}$
$+ (4M+1)(1 + L(\Lambda_0 n + 1))(8M+2)^{-1/2}\delta^{1/2}(1 + \Lambda_0 n)^{1/2}.$   (15.108)

Let $\delta \in (0, 1]$ be our computational error, which satisfies

$\delta(1 + n\Lambda_0) < 1, \quad 2\alpha(\delta)\Lambda_0 < 1.$   (15.109)

Set

$\delta_0 = \delta(1 + n\Lambda_0)$   (15.110)

and

$a = (2\delta_0(4M+1))^{1/2}(L(1 + \Lambda_0 n) + 1)^{-1}.$

Let us describe our algorithm.


Subgradient Projection Algorithm

Initialization: select an arbitrary $x_0 \in B(0, M)$.

Iterative step: given a current iteration vector $x_t \in U$, calculate

$\xi_t \in \partial\psi_{\Lambda_0\lambda}(x_t) + B(0, \delta_0)$   (15.111)

and the next iteration vector $x_{t+1} \in X$ such that

$\|x_{t+1} - P_{B(0,M)}(x_t - a\xi_t)\| \le \delta_0.$   (15.112)

Let $t \ge 0$ be an integer. Let us explain how one can calculate $\xi_t$ satisfying
(15.111). We find $\eta_0 \in X$ satisfying

$\eta_0 \in \partial f(x_t) + B(0, \delta).$

For every $i = 1,\dots,n$, if $g_i(x_t) \le c_i$, then set $\eta_i = 0$, and if $g_i(x_t) > c_i$, then we
calculate

$\eta_i \in \partial g_i(x_t) + B(0, \delta).$

Set

$\xi_t = \eta_0 + \Lambda_0\sum_{i=1}^{n}\lambda_i\eta_i.$

It follows from the equality above, the choice of $\eta_i$, $i = 0,\dots,n$, the subdifferential
calculus in [84], (15.98) and (15.110) that

$B(\xi_t, \delta_0) \cap \partial\psi_{\Lambda_0\lambda}(x_t) = B(\xi_t, \delta(1 + n\Lambda_0)) \cap \partial\psi_{\Lambda_0\lambda}(x_t) \ne \emptyset$

and (15.111) is true.
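The iterative step (15.111)-(15.112) can be sketched concretely. Below is a minimal, exact-subgradient instance (the case $\delta = 0$) on a toy problem of our own choosing: $X = R^2$, $f(x) = \|x - (2,0)\|^2$, one constraint $g_1(x) = x_1 \le c_1 = 1$, $\lambda = (1)$, $\Lambda_0 = 4$, ball radius $M = 3$, constant step $a$:

```python
import math

# Assumed toy data (ours): f(x) = ||x - (2,0)||^2, g1(x) = x1 <= 1.
LAM0, M, a = 4.0, 3.0, 0.01

def proj_ball(x):                        # P_{B(0,M)}: radial projection
    n = math.hypot(*x)
    return x if n <= M else (M * x[0] / n, M * x[1] / n)

def subgrad(x):                          # an element of the subdifferential
    g = [2.0 * (x[0] - 2.0), 2.0 * x[1]]         # gradient of f
    if x[0] > 1.0:                               # penalty term active:
        g[0] += LAM0                             # add LAM0 * lambda_1 * dg1
    return g

x = (3.0, 0.0)
for _ in range(5000):                    # iterative step, cf. (15.112)
    xi = subgrad(x)
    x = proj_ball((x[0] - a * xi[0], x[1] - a * xi[1]))
print(x)  # oscillates within O(a) of the constrained minimizer (1, 0)
```

With a constant step the method only reaches an $O(a)$ neighborhood of the minimizer, which is why the text couples the step size $a$ and horizon $T$ to the error $\delta_0$.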


By Theorem 2.6, applied to the function $\psi_{\Lambda_0\lambda}$, for each natural number $T$, both

$\psi_{\Lambda_0\lambda}\Big((T+1)^{-1}\sum_{t=0}^{T} x_t\Big) - \psi_{\Lambda_0\lambda}(x_*)$ and $\min\{\psi_{\Lambda_0\lambda}(x_t) : t = 0,\dots,T\} - \psi_{\Lambda_0\lambda}(x_*)$

do not exceed

$2^{-1}(T+1)^{-1}(2M+1)^2(L(1 + \Lambda_0 n) + 1)(2\delta_0(4M+1))^{-1/2}$
$+ \delta_0(2M+1) + 2^{-1}(2\delta_0(4M+1))^{1/2}(L(1 + \Lambda_0 n) + 1)$
$+ \delta_0(4M+1)(L(1 + \Lambda_0 n) + 1)(2\delta_0(4M+1))^{-1/2}.$

Now we can think about the best choice of $T$. It was explained in Chap. 2 that it
should be of the same order as

$\lfloor\delta_0^{-1}\rfloor = \lfloor\delta^{-1}(1 + n\Lambda_0)^{-1}\rfloor.$

Put $T = \lfloor\delta_0^{-1}\rfloor$ and obtain from (15.108) and (15.110) that

$\psi_{\Lambda_0\lambda}\Big((T+1)^{-1}\sum_{t=0}^{T} x_t\Big) - \psi_{\Lambda_0\lambda}(x_*) \le \alpha(\delta),$
$\min\{\psi_{\Lambda_0\lambda}(x_t) : t = 0,\dots,T\} - \psi_{\Lambda_0\lambda}(x_*) \le \alpha(\delta).$   (15.113)

By (15.100), (15.101), (15.103), and (15.113),

$\psi_{\Lambda_0\lambda}\Big((T+1)^{-1}\sum_{t=0}^{T} x_t\Big) \le \inf(\psi_{\Lambda_0\lambda}) + \alpha(\delta).$   (15.114)

Let $\tau \in \{0,\dots,T\}$ satisfy

$\psi_{\Lambda_0\lambda}(x_\tau) = \min\{\psi_{\Lambda_0\lambda}(x_t) : t = 0,\dots,T\}.$

In view of (15.101), (15.103), (15.106), and (15.113),

$\psi_{\Lambda_0\lambda}(x_\tau) \le \inf(\psi_{\Lambda_0\lambda}) + \alpha(\delta).$   (15.115)

It follows from (15.109), (15.114), (15.115), and property (P7) that there exist
$y_0, y_1 \in A$ such that

$\Big\|y_0 - (T+1)^{-1}\sum_{t=0}^{T} x_t\Big\| \le 2\alpha(\delta)\Lambda_0, \quad \|y_1 - x_\tau\| \le 2\alpha(\delta)\Lambda_0,$

$f(y_0) \le \psi_{\Lambda_0\lambda}\Big((T+1)^{-1}\sum_{t=0}^{T} x_t\Big) \le \inf(f; A) + \alpha(\delta),$

$f(y_1) \le \psi_{\Lambda_0\lambda}(x_\tau) \le \inf(f; A) + \alpha(\delta).$

The analogous analysis can also be done for the mirror descent method.
Chapter 16
Newton’s Method

In this chapter we study the convergence of Newton’s method for nonlinear


equations and nonlinear inclusions in a Banach space. Nonlinear mappings, which
appear in the right-hand side of the equations, are not necessarily differentiable. Our
goal is to obtain an approximate solution in the presence of computational errors. In
order to meet this goal, in the case of inclusions, we study the behavior of iterates
of nonexpansive set-valued mappings in the presence of computational errors.

16.1 Pre-differentiable Mappings

Newton’s method is an important and useful tool in optimization and numerical


analysis. See, for example, [8, 21, 22, 24, 32, 41, 46, 47, 63–65, 88, 94, 97, 99, 102]
and the references mentioned therein. We study equations with nonlinear mappings
which are not necessarily differentiable. In this section we consider this class of
mappings.
Let .X; k  k/ and .Y; k  k/ be normed spaces. For each x 2 X, each y 2 Y, and
each r > 0 set

BX .x; r/ D fu 2 X W ku  xk  rg;
BY .y; r/ D fv 2 Y W kv  yk  rg:

Let IX .x/ D x for all x 2 X and let IY .y/ D y for all y 2 Y. Denote by L.X; Y/ the
set of all linear continuous operators A W X ! Y. For each A 2 L.X; Y/ set

kAk D supfkA.x/k W x 2 BX .0; 1/g:

Let U  X be a nonempty open set, F W U ! Y, x 2 U and > 0. We say that


the mapping F is . /-pre-differentiable at x if there exists A 2 L.X; Y/ such that

© Springer International Publishing Switzerland 2016 265


A.J. Zaslavski, Numerical Optimization with Computational Errors, Springer
Optimization and Its Applications 108, DOI 10.1007/978-3-319-30921-7_16
266 16 Newton’s Method

$\kappa \ge \limsup_{h \to 0} \|F(x + h) - F(x) - A(h)\|\,\|h\|^{-1}$
$= \lim_{r \to 0^+}\sup\{\|F(x + h) - F(x) - A(h)\|\,\|h\|^{-1} : h \in B_X(0, r) \setminus \{0\}\}.$   (16.1)

If $A \in L(X, Y)$ satisfies (16.1), then $A$ is called a $(\kappa)$-pre-derivative of $F$ at $x$.


We denote by @ F.x/ the set of all . /-pre-derivatives of F at x. Note that the
set @ F.x/ can be empty. We say that the mapping F is . /-pre-differentiable if it is
. /-pre-differentiable at every x 2 U.
If G W U ! Y, x 2 U and G is Frechet differentiable at x, then we denote by
G0 .x/ the Frechet derivative of G at x.
Proposition 16.1. Let G W U ! Y, g W U ! Y, x 2 U, > 0, G be Frechet
differentiable at x and let

kg.z2 /  g.z1 /k  kz2  z1 k for all z1 ; z2 2 U:

Then G0 .x/ is the . /-pre-derivative of G C g at x 2 X.


Proof. For every h 2 X n f0g such that khk is sufficiently small,

khk1 k.G C g/.x C h/  .G C g/.x/  .G0 .x//.h/k


 khk1 kG.x C h/  G.x/  .G0 .x//.h/k C khk1 kg.x C h/  g.x/k
 khk1 kG.x C h/  G.x/  .G0 .x//.h/k C :

This implies that

lim sup khk1 k.G C g/.x C h/  .G C g/.x/  .G0 .x//.h/k  :


h!0

Proposition 16.1 is proved.
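Proposition 16.1 can be checked numerically in one dimension. In the sketch below (the example is ours), $F = G + g$ with $G(x) = x^2$ and the Lipschitz perturbation $g(x) = 0.1\,|x|$, so $A = G'(1) = 2$ should be a $(0.1)$-pre-derivative of $F$ at $x = 1$: the supremum in (16.1) over small balls approaches $\kappa = 0.1$ rather than $0$.

```python
# Assumed toy example (ours): G(x) = x**2 (Frechet differentiable),
# g(x) = 0.1*abs(x) (Lipschitz with kappa = 0.1), F = G + g, x = 1.
F = lambda x: x ** 2 + 0.1 * abs(x)
A = 2.0  # G'(1)

for r in (1e-2, 1e-4, 1e-6):
    # sample the sup in (16.1) over a few h in B(0, r) \ {0}
    ratio = max(abs(F(1 + h) - F(1) - A * h) / abs(h)
                for h in (r, -r, r / 2, -r / 2))
    print(r, ratio)   # tends to kappa = 0.1 as r -> 0
```

So $A$ fails to be a Frechet derivative of $F$ at $1$, but it is a pre-derivative with the defect $\kappa$ inherited from the Lipschitz constant of $g$, exactly as the proposition states.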


In our analysis of Newton’s method we need the following mean-valued theorem.
Theorem 16.2. Assume that U  X is a nonempty open set, > 0, a mapping
F W U ! Y is . /-pre-differentiable at every point of U and that x; y 2 U satisfy
x 6D y and

ftx C .1  t/y W t 2 Œ0; 1g  U:

Then there exists t0 2 .0; 1/ such that for every

A 2 @ F.x C t0 .y  x//

the following inequality holds:

kF.x/  F.y/k  kA.y  x/k C ky  xk:


Proof. For each $t \in [0, 1]$ set

$\phi(t) = \|F(x) - F(x + t(y - x))\|.$   (16.2)

Since the mapping $F$ is $(\kappa)$-pre-differentiable, the function $\phi$ is continuous on $[0, 1]$.
For every $t \in [0, 1]$ set

$\psi(t) = \phi(t) - t(\phi(1) - \phi(0)).$   (16.3)

Clearly, the function $\psi$ is continuous on $[0, 1]$ and

$\psi(0) = \phi(0) = 0, \quad \psi(1) = 0.$   (16.4)

By (16.4), there exists $t_0 \in (0, 1)$ such that either

(a) $t_0$ is a point of minimum of $\psi$ on $[0, 1]$

or

(b) $t_0$ is a point of maximum of $\psi$ on $[0, 1]$.

It is easy to see that in the case (a)

$\liminf_{t \to t_0^+}(\psi(t) - \psi(t_0))(t - t_0)^{-1} \ge 0, \quad \limsup_{t \to t_0^-}(\psi(t) - \psi(t_0))(t - t_0)^{-1} \le 0$   (16.5)

and in the case (b)

$\limsup_{t \to t_0^+}(\psi(t) - \psi(t_0))(t - t_0)^{-1} \le 0, \quad \liminf_{t \to t_0^-}(\psi(t) - \psi(t_0))(t - t_0)^{-1} \ge 0.$   (16.6)

Assume that

$A \in \partial_\kappa F(x + t_0(y - x)).$   (16.7)

Let $t \in (0, 1)$. By (16.2),

$|\phi(t) - \phi(t_0)| = |\,\|F(x) - F(x + t(y - x))\| - \|F(x) - F(x + t_0(y - x))\|\,|$
$\le \|F(x + t(y - x)) - F(x + t_0(y - x))\|$
$\le \|F(x + t(y - x)) - F(x + t_0(y - x)) - (t - t_0)A(y - x)\|$
$+ |t - t_0|\,\|A(y - x)\|.$   (16.8)
268 16 Newton’s Method

Let  > 0. Since the mapping F is . /-pre-differentiable on U, it follows from (16.1)


that there exists ı./ 2 .0; / such that

BX .x C t0 .y  x/; ı.//  U (16.9)

and that for each h 2 BX .0; ı.//,

kF.x C t0 .y  x/ C h/  F.x C t0 .y  x//  A.h/k  . C /khk: (16.10)

Assume that t 2 .0; 1/ satisfies

jt  t0 j  ı./.kxk C kyk C 1/1 : (16.11)

Set

h D .t  t0 /.y  x/: (16.12)

In view of (16.11) and (16.12), inequality (16.10) holds. By (6.8), (6.10), and (6.12),

j .t/  .t0 /j  jt  t0 jkA.y  x/k C . C /jt  t0 jky  xk (16.13)

for every t 2 .0; 1/ satisfying (16.11).


Assume that the case (a) holds. Then (16.3), (16.5), and (16.13) imply that

0  lim inf. .t/  .t0 //.t  t0 /1


t!t0C

D lim inf. .t/  t. .1/  .0//  . .t0 /  t0 . .1/  .0////.t  t0 /1


t!t0C

D lim inf. .t/  .t0 //.t  t0 /1  .1/


t!t0C

 kA.y  x/k C . C /ky  xk  kF.x/  F.y/k:

Since  is any positive number we conclude that

kF.x/  F.y/k  kA.y  x/k C ky  xk:

Assume that the case (b) holds. Then (16.2), (16.3), (16.6), and (16.13) imply
that

0  lim inf

. .t/  .t0 //.t  t0 /1
t!t0

D lim inf

. .t/  t. .1/  .0//  . .t0 /  t0 . .1/  .0////.t  t0 /1
t!t0
16.2 Convergence of Newton’s Method 269

D lim inf

. .t/  .t0 //.t  t0 /1  .1/
t!t0

D lim inf

j .t/  .t0 /jjt  t0 j1  kF.x/  F.y/k
t!t0

 kA.y  x/k C . C /ky  xk  kF.x/  F.y/k:

Since  is any positive number we conclude that

kF.x/  F.y/k  kA.y  x/k C ky  xk:

This completes the proof of Theorem 16.2.


Let x 2 X and r > 0. Define Px;r .z/ 2 X for every z 2 X by

Px;r .z/ D z if z 2 BX .x; r/

and

Px;r .z/ D x C rkz  xk1 .z  x/ if z 2 X n BX .x; r/:

Clearly, for each z 2 X,

kz  Px;r .z/k D inffkz  yk W y 2 BX .x; r/g:
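The radial retraction $P_{x,r}$ can be sketched directly; since $P_{x,r}(z)$ lies on the segment from $x$ to $z$, one has $\|z - P_{x,r}(z)\| = \|z - x\| - r$ for $z$ outside the ball, which is the distance to $B_X(x, r)$ in any norm. A minimal sketch in the Euclidean plane (the function name `proj` is ours):

```python
import math

def proj(x, r, z):
    """P_{x,r}(z): identity on B(x, r), radial retraction outside it.
    For z outside the ball, the image lies on the segment [x, z] at
    distance r from x, so ||z - proj(x, r, z)|| = ||z - x|| - r."""
    d = math.dist(z, x)
    if d <= r:
        return z
    return tuple(xi + r * (zi - xi) / d for xi, zi in zip(x, z))

print(proj((0.0, 0.0), 1.0, (3.0, 4.0)))   # a point on the unit circle
print(proj((0.0, 0.0), 1.0, (0.25, 0.25))) # inside the ball: unchanged
```

This mapping is used in assertion 2 of Theorem 16.3 below to push inexact Newton iterates back into the ball $B_X(\bar x, Kt_0)$.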

16.2 Convergence of Newton’s Method

We use the notation and definitions of Sect. 16.1. Suppose that the normed space $X$
is Banach.
Let $\kappa > 0$, $r > 0$, $\bar x \in X$, let $U$ be a nonempty open subset of $X$ such that

$B_X(\bar x, r) \subset U$   (16.14)

and let $F : U \to Y$ be a $(\kappa)$-pre-differentiable mapping. Let

$L > 0, \quad \gamma_1 \in (0, 1).$   (16.15)

Suppose that for each $z \in U$ there exists

$A(z) \in \partial_\kappa F(z)$   (16.16)

such that for each $z_1, z_2 \in B_X(\bar x, r)$,

$\|A(z_1) - A(z_2)\| \le L\|z_1 - z_2\|.$   (16.17)

Let $A \in L(Y, X)$ satisfy

$\|I_X - A \circ A(\bar x)\| \le \gamma_1,$   (16.18)

$M = \|A\|$   (16.19)

and let a positive number $K$ satisfy

$\|A(F(\bar x))\| \le K \le 4^{-1}r.$   (16.20)

In view of (16.15), (16.18), and (16.19),

$M > 0.$

Set

$h = MLK.$   (16.21)

Let us consider the equation

$ht^2 - (1 - \gamma_1 - \kappa M)t + 1 = 0$   (16.22)

with respect to the variable $t \in R^1$. We suppose that

$\gamma_1 + \kappa M \le 2^{-1}, \quad MLK = h < 4^{-1}(1 - \gamma_1 - \kappa M)^2.$   (16.23)

Equation (16.22) has the solutions

$(2h)^{-1}(1 - \gamma_1 - \kappa M + ((1 - \gamma_1 - \kappa M)^2 - 4h)^{1/2})$

and

$(2h)^{-1}(1 - \gamma_1 - \kappa M - ((1 - \gamma_1 - \kappa M)^2 - 4h)^{1/2}).$

Set

$t_0 = (2h)^{-1}(1 - \gamma_1 - \kappa M - ((1 - \gamma_1 - \kappa M)^2 - 4h)^{1/2}).$   (16.24)

For every $x \in U$ define

$T(x) = x - A(F(x)).$   (16.25)
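The iteration of $T$ is a Newton-type method with a frozen linear operator $A$: unlike classical Newton, the same $A$ (an approximate inverse of a pre-derivative at $\bar x$) is used at every step, which tolerates nonsmoothness at the cost of a linear rate. A one-dimensional sketch under our own assumed data:

```python
# Assumed 1-D illustration (ours): F(x) = x**2 - 2 + 0.01*abs(x - 1.5),
# a nonsmooth perturbation of x**2 - 2.  With xbar = 1.5 we freeze
# A = 1/3, an approximate inverse of the pre-derivative 2*xbar = 3,
# and iterate T(x) = x - A*F(x) as in (16.25).
F = lambda x: x ** 2 - 2.0 + 0.01 * abs(x - 1.5)
A = 1.0 / 3.0

x = 1.5
for _ in range(60):          # Theorem 16.3 below gives a linear rate
    x = x - A * F(x)
print(x)   # near the fixed point of T, where F(x) = 0
```

At the fixed point $A(F(x_*)) = 0$, and since this scalar $A$ is injective, $F(x_*) = 0$, matching the conclusion of Theorem 16.3.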

The following result is proved in Sect. 16.4.


Theorem 16.3. For each $x, y \in B_X(\bar x, K t_0)$,

$$\|T(x) - T(y)\| \le (3/4) \|x - y\|,$$

$$T(B_X(\bar x, K t_0)) \subset B_X(\bar x, K t_0),$$

there exists a unique point $x_* \in B_X(\bar x, K t_0)$ such that $A(F(x_*)) = 0$, and for each $x \in B_X(\bar x, K t_0)$,

$$\|T^i(x) - x_*\| \le 2 (3/4)^i K t_0, \quad i = 0, 1, \dots.$$

If the operator $A$ is injective, then $F(x_*) = 0$. Moreover, the following assertions hold.

1. Let $\delta > 0$ and a natural number $n_0$ satisfy

$$(3 \cdot 4^{-1})^{n_0} 2 K t_0 \le \delta \qquad (16.26)$$

and let a sequence $\{x_i\}_{i=0}^{\infty} \subset B_X(\bar x, K t_0)$ satisfy

$$\|T(x_i) - x_{i+1}\| \le \delta \ \text{ for all integers } i \ge 0.$$

Then for all integers $n \ge n_0$,

$$\|x_n - x_*\| \le 5 \delta.$$

2. Let $\delta > 0$, a natural number $n_0 > 4$ satisfy (16.26) and let sequences

$$\{x_i\}_{i=0}^{\infty} \subset B_X(\bar x, K t_0), \quad \{y_i\}_{i=0}^{\infty} \subset X$$

satisfy, for all integers $i \ge 0$,

$$\|y_{i+1} - T(x_i)\| \le \delta, \quad x_{i+1} = P_{\bar x, K t_0}(y_{i+1}). \qquad (16.27)$$

Then for all integers $n \ge n_0$,

$$\|x_n - x_*\| \le 10 \delta.$$

3. Let $\epsilon > 0$ and an integer $n_0 > 2$ satisfy

$$(3 \cdot 4^{-1})^{n_0} 16 K t_0 \le \epsilon \qquad (16.28)$$

and let a positive number $\delta$ satisfy

$$\delta < 128^{-1} \epsilon, \quad 24 (n_0 + 1) \delta < K, \quad K \ge 6 \|A(F(\bar x))\|. \qquad (16.29)$$

Assume that $\{x_i\}_{i=0}^{n_0} \subset X$, $x_0 = \bar x$ and that, if an integer $i \in [0, n_0 - 1]$ satisfies $x_i \in B_X(\bar x, K t_0)$, then

$$\|T(x_i) - x_{i+1}\| \le \delta. \qquad (16.30)$$

Then

$$\{x_i\}_{i=0}^{n_0} \subset B_X(\bar x, K t_0),$$

$$\|x_{n_0 - 1} - x_*\| \le \epsilon \ \text{ and } \ \|x_{n_0 - 1} - x_{n_0}\| < \epsilon / 4.$$
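Assertion 1 can be seen in action on a toy scalar problem. The sketch below is an assumed example, not from the text: $F(x) = x^2 - 2$ on the real line, $A$ a fixed approximate inverse of $F'(\bar x)$, and every evaluation of $T$ perturbed by a computational error of size at most $\delta$. The perturbed iterates settle in a $5\delta$-neighborhood of the solution, as the theorem predicts.

```python
# Assumed toy example (not from the text): F(x) = x^2 - 2, root sqrt(2).
def F(x):
    return x * x - 2.0

xbar = 1.5
A = 1.0 / (2.0 * xbar)              # approximate inverse of F'(xbar) = 2*xbar
delta = 1e-3                        # per-step computational error bound

def T(x):
    return x - A * F(x)             # the map (16.25) with frozen A

x = xbar
for i in range(50):
    x = T(x) + delta * (-1) ** i    # inexact step: |x_{i+1} - T(x_i)| <= delta

# The iterates stay within the 5*delta bound of assertion 1.
assert abs(x - 2.0 ** 0.5) <= 5 * delta
```

Because $T$ is a strong contraction near the root here, the steady-state error is in fact close to $\delta$ itself; $5\delta$ is the uniform guarantee.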

16.3 Auxiliary Results

We use the notation, assumptions, and definitions introduced in Sects. 16.1 and 16.2.
Lemma 16.4. The mapping $T : U \to X$ is $(\kappa M)$-pre-differentiable at every point of $U$ and for every $x \in U$,

$$I_X - A \circ A(x) \in \partial_{\kappa M} T(x).$$

Proof. Let $x \in U$ and $\epsilon > 0$. In view of (16.1) and (16.16), there exists $\delta > 0$ such that

$$B_X(x, \delta) \subset U \qquad (16.31)$$

and for each $h \in B_X(0, \delta)$,

$$\|F(x + h) - F(x) - (A(x))(h)\| \le (\kappa + \epsilon (\|A\| + 1)^{-1}) \|h\|. \qquad (16.32)$$

By (16.19), (16.25), (16.31), and (16.32), for each $h \in B_X(0, \delta)$,

$$\|T(x + h) - T(x) - (I_X - A \circ A(x))(h)\|$$
$$= \|x + h - A(F(x + h)) - x + A(F(x)) - h + A((A(x))(h))\|$$
$$= \|(-A)(F(x + h) - F(x) - (A(x))(h))\|$$
$$\le \|A\| \|F(x + h) - F(x) - (A(x))(h)\|$$
$$\le \|A\| (\kappa + \epsilon (\|A\| + 1)^{-1}) \|h\|$$
$$\le (\kappa \|A\| + \epsilon) \|h\| = (\kappa M + \epsilon) \|h\|.$$

Since $\epsilon$ is any positive number, this completes the proof of Lemma 16.4.

Lemma 16.5. Let $r_0 \in (0, r]$ and $x \in B_X(\bar x, r_0)$. Then

$$\|I_X - A \circ A(x)\| \le \gamma_1 + M L r_0.$$

Proof. By (16.17), (16.18), and (16.19),

$$\|I_X - A \circ A(x)\|$$
$$= \|I_X - A \circ A(\bar x) + A \circ A(\bar x) - A \circ A(x)\|$$
$$\le \|I_X - A \circ A(\bar x)\| + \|A\| \|A(\bar x) - A(x)\|$$
$$\le \gamma_1 + M L \|x - \bar x\| \le \gamma_1 + M L r_0.$$

Lemma 16.5 is proved.



Lemma 16.6. Let $r_0 \in (0, r]$ and

$$x, y \in B_X(\bar x, r_0).$$

Then

$$\|T(x) - T(y)\| \le (\gamma_1 + M L r_0 + \kappa M) \|y - x\|.$$

Proof. By (16.14), Theorem 16.2, and Lemmas 16.4 and 16.5, there exists $t_0 \in (0, 1)$ such that

$$\|T(x) - T(y)\|$$
$$\le \|(I_X - A \circ A(x + t_0 (y - x)))(y - x)\| + \kappa M \|y - x\|$$
$$\le (\gamma_1 + M L r_0 + \kappa M) \|y - x\|.$$

Lemma 16.6 is proved.

Lemma 16.7. Let $r_0 \in (0, r]$. Then for each $x \in B_X(\bar x, r_0)$,

$$\|T(x) - \bar x\| \le K + \gamma_1 r_0 + M (\kappa r_0 + L r_0^2).$$

Proof. Let

$$x \in B_X(\bar x, r_0). \qquad (16.33)$$

By (16.19), (16.20), and (16.25),

$$\|T(x) - \bar x\| = \|x - A(F(x)) - \bar x\|$$
$$= \|A[(A(\bar x))(x - \bar x) - F(x) + F(\bar x)] - A(F(\bar x)) + (I_X - A \circ A(\bar x))(x - \bar x)\|$$
$$\le \|A\| \|F(x) - F(\bar x) - (A(\bar x))(x - \bar x)\| + \|A(F(\bar x))\| + \|I_X - A \circ A(\bar x)\| \|x - \bar x\|$$
$$\le M \|F(x) - F(\bar x) - (A(\bar x))(x - \bar x)\| + K + \gamma_1 \|x - \bar x\|. \qquad (16.34)$$

For every $x \in U$ define

$$\phi(x) = F(x) - F(\bar x) - (A(\bar x))(x - \bar x). \qquad (16.35)$$

We show that the mapping $\phi$ is $(\kappa)$-pre-differentiable on $U$. By (16.35), for each $x \in U$ and each $h \in X$ satisfying $x + h \in U$,

$$\phi(x + h) - \phi(x) - (A(x) - A(\bar x))(h)$$
$$= F(x + h) - F(\bar x) - (A(\bar x))(x + h - \bar x) - F(x) + F(\bar x) + (A(\bar x))(x - \bar x) - (A(x) - A(\bar x))(h)$$
$$= F(x + h) - F(x) - (A(\bar x))(h) - (A(x) - A(\bar x))(h)$$
$$= F(x + h) - F(x) - (A(x))(h).$$

Together with (16.1) and (16.16) this implies that the mapping $\phi$ is $(\kappa)$-pre-differentiable at every point of $U$ and that for all $z \in U$,

$$A(z) - A(\bar x) \in \partial_{\kappa} \phi(z). \qquad (16.36)$$

In view of (16.17), for all $z \in B_X(\bar x, r_0)$,

$$\|A(z) - A(\bar x)\| \le L \|z - \bar x\|. \qquad (16.37)$$

It follows from (16.35), (16.36), and Theorem 16.2 that for every $z \in B_X(\bar x, r_0)$,

$$\|\phi(z)\| = \|\phi(z) - \phi(\bar x)\| \le \kappa \|z - \bar x\| + L \|z - \bar x\|^2. \qquad (16.38)$$

By (16.33)–(16.35) and (16.38), for every $x \in B_X(\bar x, r_0)$,

$$\|T(x) - \bar x\| \le K + \gamma_1 r_0 + M (\kappa r_0 + L r_0^2).$$

Lemma 16.7 is proved.

16.4 Proof of Theorem 16.3

Set

$$r_0 = K t_0. \qquad (16.39)$$

We show that

$$r_0 \le r. \qquad (16.40)$$

Indeed, in view of (16.20), (16.23), (16.24), and (16.39),

$$r_0 = K t_0 = 2 K \big[ (1 - \gamma_1 - \kappa M) + ((1 - \gamma_1 - \kappa M)^2 - 4h)^{1/2} \big]^{-1}$$
$$\le 2 (1 - \gamma_1 - \kappa M)^{-1} K \le 4 K \le r.$$

By (16.21), (16.39), (16.40), and Lemma 16.6, for each $x, y \in B_X(\bar x, K t_0)$,

$$\|T(x) - T(y)\| \le (\gamma_1 + M L K t_0 + \kappa M) \|x - y\|$$
$$\le (\gamma_1 + h t_0 + \kappa M) \|x - y\|. \qquad (16.41)$$

It follows from (16.24) that

$$t_0 \le 2 (1 - \gamma_1 - \kappa M)^{-1}. \qquad (16.42)$$

Relations (16.23) and (16.42) imply that

$$\gamma_1 + h t_0 + \kappa M \le \gamma_1 + \kappa M + 2 (1 - \gamma_1 - \kappa M)^{-1} (1 - \gamma_1 - \kappa M)^2 / 4$$
$$= \gamma_1 + \kappa M + 2^{-1} (1 - \gamma_1 - \kappa M)$$
$$= 2^{-1} + 2^{-1} (\gamma_1 + \kappa M) \le 3/4. \qquad (16.43)$$

By (16.41) and (16.43), for each $x, y \in B_X(\bar x, K t_0)$,

$$\|T(x) - T(y)\| \le (3 \cdot 4^{-1}) \|x - y\|. \qquad (16.44)$$

Let

$$x \in B_X(\bar x, r_0). \qquad (16.45)$$

It follows from (16.21), (16.22), (16.24), (16.39), (16.40), (16.45), and Lemma 16.7 that

$$\|T(x) - \bar x\| \le K + \gamma_1 K t_0 + M (\kappa K t_0 + L K^2 t_0^2)$$
$$\le K (1 + \gamma_1 t_0 + \kappa M t_0 + M L K t_0^2)$$
$$= K (1 + \gamma_1 t_0 + \kappa M t_0 + h t_0^2) = K t_0,$$

so that

$$T(x) \in B_X(\bar x, r_0)$$

and

$$T(B_X(\bar x, r_0)) \subset B_X(\bar x, r_0). \qquad (16.46)$$

Relations (16.44) and (16.46) imply that there exists a unique point

$$x_* \in B_X(\bar x, r_0)$$

such that

$$T(x_*) = x_*.$$

In order to complete the proof of Theorem 16.3 it is sufficient to show that assertions
1, 2, and 3 hold.
Let us prove assertion 1. For each integer $i \ge 0$,

$$\|x_{i+1} - x_*\| \le \|x_{i+1} - T(x_i)\| + \|T(x_i) - x_*\| \le \delta + (3 \cdot 4^{-1}) \|x_i - x_*\|. \qquad (16.47)$$

By induction we show that for all integers $p \ge 1$,

$$\|x_p - x_*\| \le (3 \cdot 4^{-1})^p \|x_0 - x_*\| + \delta \sum_{i=0}^{p-1} (3 \cdot 4^{-1})^i. \qquad (16.48)$$

In view of (16.47), inequality (16.48) holds for $p = 1$.

Assume that an integer $p \ge 1$ and that (16.48) holds. It follows from (16.47) and (16.48) that

$$\|x_{p+1} - x_*\| \le \delta + (3/4) \|x_p - x_*\|$$
$$\le (3 \cdot 4^{-1})^{p+1} \|x_0 - x_*\| + \delta \sum_{i=0}^{p} (3 \cdot 4^{-1})^i.$$

Thus we showed by induction that (16.48) holds for all integers $p \ge 1$. By (16.48) and the choice of $n_0$ [see (16.26)], for all integers $n \ge n_0$,

$$\|x_n - x_*\| \le (3 \cdot 4^{-1})^{n_0} 2 K t_0 + 4 \delta \le 5 \delta.$$

Assertion 1 is proved.
Let us prove assertion 2. In view of (16.27), (16.46), and the definition of $P_{\bar x, K t_0}$ (the point $T(x_i)$ lies in $B_X(\bar x, K t_0)$, while $x_{i+1}$ is a nearest point of this ball to $y_{i+1}$), for each integer $i \ge 0$,

$$\delta \ge \|y_{i+1} - T(x_i)\| \ge \|y_{i+1} - x_{i+1}\|$$

and

$$\|x_{i+1} - T(x_i)\| \le 2 \delta.$$

By the relations above, (16.26), and assertion 1, for all integers $n \ge n_0$,

$$\|x_n - x_*\| \le 10 \delta.$$

Assertion 2 is proved.
Let us prove assertion 3. In view of (16.24),

$$t_0 = 2 \big( (1 - \gamma_1 - \kappa M) + ((1 - \gamma_1 - \kappa M)^2 - 4h)^{1/2} \big)^{-1} \ge (1 - \gamma_1 - \kappa M)^{-1}. \qquad (16.49)$$

By (16.23), (16.25), (16.29), (16.30), and (16.49),

$$\|x_1 - x_0\| = \|x_1 - \bar x\| \le \|x_1 - T(x_0)\| + \|T(x_0) - x_0\|$$
$$\le \delta + \|A(F(\bar x))\| \le \delta + 6^{-1} K$$
$$< K < K (1 - \gamma_1 - \kappa M)^{-1} \le K t_0 \qquad (16.50)$$

and

$$x_1 \in B_X(\bar x, K t_0). \qquad (16.51)$$

Relations (16.30), (16.44), (16.50), and (16.51) imply that

$$\|x_2 - x_1\| \le \|x_2 - T(x_1)\| + \|T(x_1) - x_1\|$$
$$\le \delta + \|T(x_1) - T(x_0)\| + \|T(x_0) - x_1\|$$
$$\le 2 \delta + (3 \cdot 4^{-1}) \|x_1 - x_0\|$$
$$\le 2 \delta + (3 \cdot 4^{-1}) \delta + 6^{-1} (3/4) K. \qquad (16.52)$$

It follows from (16.23), (16.29), (16.49), (16.50), and (16.52) that

$$\|x_2 - \bar x\| \le \|x_2 - x_1\| + \|x_1 - x_0\|$$
$$\le 2 \delta + (3 \cdot 4^{-1})(\delta + 6^{-1} K) + (\delta + 6^{-1} K)$$
$$\le 2 \delta + 2 (\delta + 6^{-1} K) \le 4 \delta + K/3 < K \le K t_0. \qquad (16.53)$$

Assume that an integer $p$ satisfies

$$2 \le p < n_0, \quad x_i \in B_X(\bar x, K t_0) \qquad (16.54)$$

for all $i = 1, \dots, p$ and that for all $q = 2, \dots, p$,

$$\|x_q - x_{q-1}\| \le (3 \cdot 4^{-1})^{q-1} \|x_1 - x_0\| + 2 \delta \sum_{i=0}^{q-2} (3 \cdot 4^{-1})^i. \qquad (16.55)$$

(In view of (16.51), (16.52), and (16.53), our assumption holds for $p = 2$.) By (16.30), (16.44), (16.54), and (16.55),

$$\|x_{p+1} - x_p\| \le \|x_{p+1} - T(x_p)\| + \|T(x_p) - T(x_{p-1})\| + \|T(x_{p-1}) - x_p\|$$
$$\le 2 \delta + (3 \cdot 4^{-1}) \|x_p - x_{p-1}\|$$
$$\le (3 \cdot 4^{-1})^p \|x_1 - x_0\| + (3 \cdot 2^{-1}) \delta \sum_{i=0}^{p-2} (3 \cdot 4^{-1})^i + 2 \delta$$
$$= (3 \cdot 4^{-1})^p \|x_1 - x_0\| + 2 \delta \sum_{i=0}^{p-1} (3 \cdot 4^{-1})^i$$

and (16.55) holds for $q = p + 1$. By (16.55), which holds for all $q = 2, \dots, p + 1$, (16.29), and (16.50),

$$\|x_{p+1} - x_0\| \le \sum_{q=1}^{p+1} \|x_q - x_{q-1}\|$$
$$\le \sum_{q=1}^{p+1} (3 \cdot 4^{-1})^{q-1} \|x_1 - x_0\| + 8 \delta p$$
$$\le 4 \|x_1 - x_0\| + 8 n_0 \delta \le 4 \delta + 2 \cdot 3^{-1} K + 8 n_0 \delta$$
$$\le 8 (n_0 + 1) \delta + 2 \cdot 3^{-1} K \le K \le K t_0$$

and (16.54) holds for $i = p + 1$.

Thus we showed by induction that our assumption holds for $p = n_0$. Together with (16.54) and (16.55) holding for $p = n_0$, (16.28) and (16.29), this implies that

$$x_{n_0} \in B_X(\bar x, K t_0),$$
$$\|x_{n_0} - x_{n_0 - 1}\| \le 8 \delta + (3 \cdot 4^{-1})^{n_0 - 1} 2 K t_0 \le \epsilon/8 + \epsilon/16. \qquad (16.56)$$

Set

$$\tilde x_0 = x_{n_0 - 1} \qquad (16.57)$$

and for all integers $i \ge 0$ set

$$\tilde x_{i+1} = T(\tilde x_i). \qquad (16.58)$$

By (16.28), (16.30), (16.44), and (16.56)–(16.58),

$$\|\tilde x_0 - \tilde x_1\| = \|x_{n_0 - 1} - T(x_{n_0 - 1})\| \le \|x_{n_0 - 1} - x_{n_0}\| + \delta \le \epsilon/4 \qquad (16.59)$$

and for all integers $i \ge 0$,

$$\|\tilde x_{i+2} - \tilde x_{i+1}\| = \|T(\tilde x_{i+1}) - T(\tilde x_i)\| \le (3 \cdot 4^{-1}) \|\tilde x_{i+1} - \tilde x_i\|.$$

Together with (16.59) this implies that for all integers $i \ge 0$,

$$\|\tilde x_{i+1} - \tilde x_i\| \le (3 \cdot 4^{-1})^i \epsilon/4. \qquad (16.60)$$

Clearly,

$$x_* = \lim_{i \to \infty} \tilde x_i.$$

By (16.57) and (16.60),

$$\|x_* - x_{n_0 - 1}\| = \|x_* - \tilde x_0\| = \lim_{q \to \infty} \|\tilde x_q - \tilde x_0\| \le \sum_{q=0}^{\infty} \|\tilde x_q - \tilde x_{q+1}\| \le \epsilon.$$

Assertion 3 is proved. This completes the proof of Theorem 16.3.

16.5 Set-Valued Mappings

Let $(X, \rho)$ be a complete metric space. For each $z \in X$ and each $r > 0$ set

$$B(z, r) = \{ y \in X : \rho(z, y) \le r \}.$$

For each $x \in X$ and each nonempty set $C \subset X$ define

$$\rho(x, C) = \inf \{ \rho(x, y) : y \in C \}.$$

In Sect. 16.7 we prove the following result which is important in our study of
Newton’s method for nonlinear inclusions.
Theorem 16.8. Suppose that $\Phi : X \to 2^X$, $a > 0$, $\gamma \in (0, 1)$, $\bar x \in X$,

$$\Phi(x) \ne \emptyset \ \text{ for all } x \in B(\bar x, a), \qquad (16.61)$$

$$\rho(\bar x, \Phi(\bar x)) < a (1 - \gamma), \qquad (16.62)$$

for all $u, v \in B(\bar x, a)$,

$$\sup \{ \rho(z, \Phi(v)) : z \in \Phi(u) \cap B(\bar x, a) \} \le \gamma \rho(u, v) \qquad (16.63)$$

and that the set

$$\mathrm{graph}(\Phi; B(\bar x, a)) := \{ (x, y) \in B(\bar x, a) \times B(\bar x, a) : y \in \Phi(x) \}$$

is closed. Then the following assertions hold.

1. Assume that a sequence $\{\delta_i\}_{i=0}^{\infty} \subset (0, \infty)$ satisfies $\sum_{i=0}^{\infty} \delta_i < \infty$,

$$2 (1 - \gamma)^{-1} \sum_{i=0}^{\infty} \delta_i + (1 - \gamma)^{-1} \big( \max \{ \delta_i : i = 0, 1, \dots \} + \rho(\bar x, \Phi(\bar x)) \big) \le a \qquad (16.64)$$

and that a sequence $\{x_i\}_{i=0}^{\infty} \subset X$ satisfies

$$x_0 = \bar x \qquad (16.65)$$

and, for each integer $i \ge 0$ satisfying $x_i \in B(\bar x, a)$, the inequalities

$$\rho(x_{i+1}, \Phi(x_i)) \le \delta_i, \qquad (16.66)$$

$$\rho(x_i, x_{i+1}) \le \rho(x_i, \Phi(x_i)) + \delta_i \qquad (16.67)$$

hold. Then

$$\rho(x_i, \bar x) < a - \delta_{i-1} \ \text{ for all integers } i \ge 1, \qquad (16.68)$$

for each integer $k \ge 1$,

$$\rho(x_k, x_{k+1}) \le \gamma^k (\rho(\bar x, \Phi(\bar x)) + \delta_0) + \sum_{j=0}^{k-1} \delta_j \gamma^{k-1-j} + \sum_{j=1}^{k} \delta_j \gamma^{k-j}, \qquad (16.69)$$

for each integer $k \ge 0$,

$$\sum_{i=k}^{\infty} \rho(x_i, x_{i+1}) \le (1 - \gamma)^{-1} \rho(x_k, x_{k+1}) + 2 (1 - \gamma)^{-1} \sum_{i=k}^{\infty} \delta_i \qquad (16.70)$$

and there exists

$$x_* = \lim_{n \to \infty} x_n \in B(\bar x, a)$$

satisfying $x_* \in \Phi(x_*)$.

2. Let $\epsilon \in (0, 1)$ and a natural number $n_0 > 2$ satisfy

$$4 \gamma^{n_0} (a (1 - \gamma) + 1) < \epsilon (1 - \gamma) \qquad (16.71)$$

and let a number $\delta \in (0, \epsilon)$ satisfy

$$\delta < (8 n_0)^{-1} \epsilon (1 - \gamma) \qquad (16.72)$$

and

$$\delta < [a - (1 - \gamma)^{-1} \rho(\bar x, \Phi(\bar x))] (1 - \gamma) (2 n_0 + 1)^{-1}. \qquad (16.73)$$

Assume that a sequence $\{x_i\}_{i=0}^{n_0} \subset X$ satisfies

$$x_0 = \bar x$$

and, for each integer $i \in [0, n_0 - 1]$ satisfying $x_i \in B(\bar x, a)$, the inequalities

$$\rho(x_{i+1}, \Phi(x_i)) \le \delta, \qquad (16.74)$$

$$\rho(x_i, x_{i+1}) \le \rho(x_i, \Phi(x_i)) + \delta \qquad (16.75)$$

hold. Then

$$\rho(x_i, \bar x) < a - \delta \ \text{ for all } i = 1, \dots, n_0$$

and there exists $x_* \in B(\bar x, a)$ such that

$$x_* \in \Phi(x_*) \ \text{ and } \ \rho(x_{n_0}, x_*) < \epsilon.$$

A prototype of Theorem 16.8, for which $\Phi : X \to 2^X \setminus \{\emptyset\}$ is a strict contraction, was proved in Chap. 9 of [121].
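A concrete instance on the real line may help fix ideas. The sketch below is an assumed toy example, not from the text: $\Phi$ maps a point to a closed interval and satisfies (16.63) with $\gamma = 1/2$; each step lands within $\delta_i$ of $\Phi(x_i)$ while moving no farther than $\rho(x_i, \Phi(x_i)) + \delta_i$, as in (16.66)–(16.67), and the iterates converge to a fixed point $x_* \in \Phi(x_*)$.

```python
# Assumed toy instance (not from the text) of the inexact iteration in
# Theorem 16.8: Phi(x) is the interval [x/2, x/2 + 0.1], a contraction
# with gamma = 1/2 in the sense of (16.63).
def phi(x):
    return (0.5 * x, 0.5 * x + 0.1)

def dist_to_interval(x, iv):
    lo, hi = iv
    return max(lo - x, x - hi, 0.0)

x = 1.0                                      # the starting point xbar
for i in range(60):
    lo, hi = phi(x)
    nearest = min(max(x, lo), hi)            # realizes rho(x_i, Phi(x_i))
    x = nearest + 0.01 * 0.5 ** i            # summable errors delta_i = 0.01*2^{-i}

# The limit x_* = 0.2 is a fixed point: 0.2 lies in Phi(0.2) = [0.1, 0.2].
assert dist_to_interval(x, phi(x)) < 1e-9
assert abs(x - 0.2) < 1e-6
```

The summability of the errors is essential here, exactly as in condition (16.64); with a constant error the iterates would only reach a neighborhood of the fixed-point set, which is the content of assertion 2.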

16.6 An Auxiliary Result

Lemma 16.9. Suppose that $\Phi : X \to 2^X$, $a > 0$, $\gamma \in (0, 1)$, $\bar x \in X$, (16.61)–(16.63) hold and let a sequence $\{\delta_i\}_{i=0}^{\infty} \subset (0, \infty)$ satisfy $\sum_{i=0}^{\infty} \delta_i < \infty$ and (16.64). Then the following assertions hold.

1. Let $x_0 = \bar x$ and $x_1 \in X$ satisfy

$$\rho(x_1, \Phi(x_0)) \le \delta_0, \quad \rho(x_0, x_1) \le \rho(x_0, \Phi(x_0)) + \delta_0. \qquad (16.76)$$

Then $\rho(x_1, \bar x) < a - \delta_0$.

2. Assume that $n \ge 1$ is an integer, $\{x_i\}_{i=0}^{n+1} \subset X$,

$$x_0 \in B(\bar x, a), \qquad (16.77)$$

$$\rho(x_i, \bar x) < a - \delta_{i-1}, \quad i = 1, \dots, n \qquad (16.78)$$

and that for each integer $i \in \{0, \dots, n\}$ the inequalities

$$\rho(x_{i+1}, \Phi(x_i)) \le \delta_i, \qquad (16.79)$$

$$\rho(x_i, x_{i+1}) \le \rho(x_i, \Phi(x_i)) + \delta_i \qquad (16.80)$$

hold. Then for each integer $k \in [0, n-1]$ there exists

$$y_{k+1} \in \Phi(x_k) \cap B(\bar x, a) \qquad (16.81)$$

such that

$$\rho(x_{k+1}, y_{k+1}) < 2 \delta_k, \qquad (16.82)$$

for all integers $k = 0, \dots, n-1$,

$$\rho(x_{k+2}, x_{k+1}) \le \gamma \rho(x_k, x_{k+1}) + \delta_k + \delta_{k+1}, \qquad (16.83)$$

for each integer $s$ satisfying $0 \le s < n$ and each integer $k$ satisfying $s < k \le n$,

$$\rho(x_k, x_{k+1}) \le \gamma^{k-s} \rho(x_s, x_{s+1}) + \sum_{i=s}^{k-1} \gamma^{i-s} (\delta_{k-1-i+s} + \delta_{k-i+s}) \qquad (16.84)$$

and

$$\sum_{p=0}^{n} \rho(x_p, x_{p+1}) \le \Big( \sum_{p=0}^{n} \gamma^p \Big) \rho(x_0, x_1) + \sum_{i=0}^{n-1} \delta_i \Big( \sum_{j=0}^{n-1-i} \gamma^j \Big) + \sum_{i=1}^{n} \delta_i \Big( \sum_{j=0}^{n-i} \gamma^j \Big). \qquad (16.85)$$

Moreover, if $x_0 = \bar x$, then $\rho(x_{n+1}, \bar x) < a - \delta_n$.


Proof. Let us prove assertion 1. By (16.64) and (16.76),

$$\rho(x_1, \bar x) \le \rho(\bar x, \Phi(\bar x)) + \delta_0 < a - \delta_0.$$

Assertion 1 is proved.

Let us prove assertion 2. Assume that an integer

$$k \in [0, n-1]. \qquad (16.86)$$

By (16.77)–(16.80) and (16.86),

$$x_k, x_{k+1} \in B(\bar x, a), \quad x_{k+2} \in X,$$

$$\rho(x_{k+2}, \Phi(x_{k+1})) \le \delta_{k+1}, \quad \rho(x_{k+1}, \Phi(x_k)) \le \delta_k, \qquad (16.87)$$

$$\rho(x_{k+2}, x_{k+1}) \le \rho(x_{k+1}, \Phi(x_{k+1})) + \delta_{k+1}. \qquad (16.88)$$

We show that

$$\rho(x_{k+2}, x_{k+1}) \le \gamma \rho(x_k, x_{k+1}) + \delta_k + \delta_{k+1}.$$

Let a positive number $\Delta$ satisfy

$$\Delta < \min \{ \delta_k, \, a - \delta_k - \rho(x_{k+1}, \bar x) \} \qquad (16.89)$$

[see (16.78)]. In view of (16.87), there exists

$$y_{k+1} \in \Phi(x_k) \qquad (16.90)$$

such that

$$\rho(x_{k+1}, y_{k+1}) < \delta_k + \Delta. \qquad (16.91)$$

By (16.89) and (16.91),

$$\rho(y_{k+1}, \bar x) < \delta_k + \Delta + \rho(x_{k+1}, \bar x) < a. \qquad (16.92)$$

Relations (16.89)–(16.92) imply that

$$y_{k+1} \in \Phi(x_k) \cap B(\bar x, a), \quad \rho(x_{k+1}, y_{k+1}) < 2 \delta_k. \qquad (16.93)$$

Thus (16.81) and (16.82) hold. It follows from (16.63), (16.88), (16.91), and (16.93) that

$$\rho(x_{k+2}, x_{k+1}) \le \delta_{k+1} + \rho(x_{k+1}, \Phi(x_{k+1}))$$
$$\le \delta_{k+1} + \rho(x_{k+1}, y_{k+1}) + \rho(y_{k+1}, \Phi(x_{k+1}))$$
$$\le \delta_{k+1} + \delta_k + \Delta + \sup \{ \rho(z, \Phi(x_{k+1})) : z \in \Phi(x_k) \cap B(\bar x, a) \}$$
$$\le \delta_{k+1} + \delta_k + \Delta + \gamma \rho(x_k, x_{k+1}).$$

Since $\Delta$ is an arbitrary positive number satisfying (16.89), we conclude that

$$\rho(x_{k+2}, x_{k+1}) \le \gamma \rho(x_k, x_{k+1}) + \delta_k + \delta_{k+1} \qquad (16.94)$$

for all integers $k = 0, \dots, n-1$.

Let an integer $s$ satisfy

$$0 \le s < n.$$

We show by induction that (16.84) holds for each integer $k$ satisfying $s < k \le n$. In view of (16.94),

$$\rho(x_{s+2}, x_{s+1}) \le \gamma \rho(x_s, x_{s+1}) + \delta_s + \delta_{s+1}$$

and (16.84) holds for $k = s + 1$.

Assume that an integer $k$ satisfies

$$s < k < n$$

and that (16.84) holds. By (16.84) and (16.94),

$$\rho(x_{k+2}, x_{k+1}) \le \gamma \rho(x_k, x_{k+1}) + \delta_k + \delta_{k+1}$$
$$\le \gamma^{k+1-s} \rho(x_s, x_{s+1}) + \sum_{i=s}^{k-1} \gamma^{i-s+1} (\delta_{k-1-i+s} + \delta_{k-i+s}) + \delta_k + \delta_{k+1}$$
$$= \gamma^{k+1-s} \rho(x_s, x_{s+1}) + \sum_{i=s}^{k} \gamma^{i-s} (\delta_{k-i+s} + \delta_{k+1-i+s}).$$

Thus by induction we showed that (16.84) holds for all integers $k$ satisfying $s < k \le n$.

In particular, for all integers $p = 1, \dots, n$,

$$\rho(x_p, x_{p+1}) \le \gamma^p \rho(x_0, x_1) + \sum_{i=0}^{p-1} \gamma^i (\delta_{p-i-1} + \delta_{p-i}). \qquad (16.95)$$

It follows from (16.95) that

$$\sum_{p=0}^{n} \rho(x_p, x_{p+1}) \le \Big( \sum_{p=0}^{n} \gamma^p \Big) \rho(x_0, x_1) + \sum_{p=1}^{n} \sum_{i=0}^{p-1} \gamma^i (\delta_{p-i-1} + \delta_{p-i})$$
$$= \Big( \sum_{p=0}^{n} \gamma^p \Big) \rho(x_0, x_1) + \sum_{p=1}^{n} \sum_{i=0}^{p-1} \delta_i \gamma^{p-i-1} + \sum_{p=1}^{n} \sum_{i=1}^{p} \delta_i \gamma^{p-i}$$
$$= \Big( \sum_{p=0}^{n} \gamma^p \Big) \rho(x_0, x_1) + \sum_{i=0}^{n-1} \delta_i \Big( \sum_{j=0}^{n-1-i} \gamma^j \Big) + \sum_{i=1}^{n} \delta_i \Big( \sum_{j=0}^{n-i} \gamma^j \Big).$$

Thus (16.85) holds.

Assume that

$$x_0 = \bar x.$$

By (16.64), (16.80), and (16.85),

$$\rho(\bar x, x_{n+1}) = \rho(x_0, x_{n+1}) \le \sum_{p=0}^{n} \rho(x_p, x_{p+1})$$
$$\le (1 - \gamma)^{-1} \rho(x_0, x_1) + (1 - \gamma)^{-1} \sum_{i=0}^{n-1} \delta_i + (1 - \gamma)^{-1} \sum_{i=1}^{n} \delta_i$$
$$\le (1 - \gamma)^{-1} (\rho(\bar x, \Phi(\bar x)) + \delta_0) + 2 (1 - \gamma)^{-1} \sum_{i=0}^{n} \delta_i - (1 - \gamma)^{-1} \delta_n$$
$$< a - (1 - \gamma)^{-1} \delta_n$$

and

$$\rho(\bar x, x_{n+1}) < a - \delta_n.$$

This completes the proof of Lemma 16.9.



16.7 Proof of Theorem 16.8

Let us prove assertion 1. By (16.66), (16.67), and assertion 1 of Lemma 16.9,

$$\rho(x_1, \bar x) < a - \delta_0. \qquad (16.96)$$

We show that for all integers $p \ge 1$,

$$\rho(x_p, \bar x) < a - \delta_{p-1}. \qquad (16.97)$$

In view of (16.96), inequality (16.97) holds for $p = 1$.

Assume that $n \ge 1$ is an integer and (16.97) holds for all integers $p = 1, \dots, n$. By (16.66), (16.67), (16.97), and assertion 2 of Lemma 16.9,

$$\rho(x_{n+1}, \bar x) < a - \delta_n.$$

Thus (16.97) holds for all integers $p \ge 1$.

It follows from (16.66), (16.67), (16.85), (16.97), and Lemma 16.9 that

$$\sum_{p=0}^{\infty} \rho(x_p, x_{p+1}) \le (1 - \gamma)^{-1} \rho(\bar x, x_1) + 2 (1 - \gamma)^{-1} \sum_{i=0}^{\infty} \delta_i < \infty.$$

This implies that there exists

$$x_* = \lim_{p \to \infty} x_p \in B(\bar x, a) \qquad (16.98)$$

and

$$\lim_{p \to \infty} \rho(x_p, x_{p+1}) = 0. \qquad (16.99)$$

Lemma 16.9, (16.81), (16.82), and (16.97) imply that for each integer $p \ge 0$ there exists

$$y_{p+1} \in \Phi(x_p) \cap B(\bar x, a) \qquad (16.100)$$

such that

$$\rho(x_{p+1}, y_{p+1}) < 2 \delta_p. \qquad (16.101)$$

By (16.97), (16.98), (16.100), and (16.101),

$$\lim_{p \to \infty} \rho(x_p, y_p) = 0, \quad \lim_{p \to \infty} \rho(y_p, x_*) = 0,$$

and since $\mathrm{graph}(\Phi; B(\bar x, a))$ is closed we conclude that

$$x_* \in \Phi(x_*).$$

Lemma 16.9, (16.67), (16.84), and (16.97) imply that for each integer $k > 0$,

$$\rho(x_k, x_{k+1}) \le \gamma^k \rho(x_0, x_1) + \sum_{j=0}^{k-1} \delta_j \gamma^{k-1-j} + \sum_{j=1}^{k} \delta_j \gamma^{k-j}$$
$$\le \gamma^k (\rho(\bar x, \Phi(\bar x)) + \delta_0) + \sum_{j=0}^{k-1} \delta_j \gamma^{k-1-j} + \sum_{j=1}^{k} \delta_j \gamma^{k-j}. \qquad (16.102)$$

Let $k \ge 0$ be an integer. In view of (16.64), (16.66), (16.67), and (16.97), we can apply Lemma 16.9 to the sequences $\{x_{k+i}\}_{i=0}^{\infty}$, $\{\delta_{k+i}\}_{i=0}^{\infty}$ and obtain from (16.85) that

$$\sum_{p=0}^{\infty} \rho(x_{k+p}, x_{k+p+1}) \le (1 - \gamma)^{-1} \rho(x_k, x_{k+1}) + 2 (1 - \gamma)^{-1} \sum_{p=k}^{\infty} \delta_p.$$

Assertion 1 is proved.
Let us prove assertion 2. For $i = 0, \dots, n_0 - 1$ set

$$\delta_i = \delta. \qquad (16.103)$$

By (16.73) and (16.103), for every integer $i \ge n_0$ there exists $\delta_i > 0$ such that (16.64) holds and

$$2 (1 - \gamma)^{-1} \sum_{i=n_0}^{\infty} \delta_i < \epsilon / 4, \quad \delta_i \le \delta \ \text{ for all integers } i \ge 0. \qquad (16.104)$$

Clearly, for every integer $i \ge n_0 + 1$ there exists $x_i \in X$ such that the following property holds:

if an integer $i \ge 0$ satisfies $x_i \in B(\bar x, a)$, then (16.66) and (16.67) hold.

It follows from (16.62), (16.64), (16.66)–(16.69), (16.71), (16.72), (16.103), (16.104), and assertion 1 that

$$\rho(x_i, \bar x) < a - \delta_{i-1} \ \text{ for all integers } i \ge 1,$$

$$\rho(x_i, \bar x) < a - \delta \ \text{ for all integers } i = 1, \dots, n_0$$

and

$$\rho(x_{n_0}, x_{n_0+1}) \le \gamma^{n_0} (\rho(\bar x, \Phi(\bar x)) + \delta) + 2 n_0 \delta$$
$$< 4^{-1} \epsilon (1 - \gamma) + 4^{-1} \epsilon (1 - \gamma). \qquad (16.105)$$

By assertion 1, (16.64), (16.66), (16.67), (16.70), (16.104), and (16.105),

$$\sum_{i=n_0}^{\infty} \rho(x_i, x_{i+1}) \le (1 - \gamma)^{-1} \rho(x_{n_0}, x_{n_0+1}) + 2 (1 - \gamma)^{-1} \sum_{i=n_0}^{\infty} \delta_i < \epsilon/2 + \epsilon/4 \qquad (16.106)$$

and there exists

$$x_* = \lim_{n \to \infty} x_n \in B(\bar x, a) \qquad (16.107)$$

satisfying $x_* \in \Phi(x_*)$. It follows from (16.106) and (16.107) that

$$\rho(x_{n_0}, x_*) = \lim_{n \to \infty} \rho(x_{n_0}, x_n) \le \sum_{i=n_0}^{\infty} \rho(x_i, x_{i+1}) < \epsilon.$$

Assertion 2 is proved. This completes the proof of Theorem 16.8.

16.8 Pre-differentiable Set-Valued Mappings

Let $(Z, \|\cdot\|)$ be a normed space. For each $x \in Z$ and each nonempty set $C \subset Z$ define

$$d(x, C) = \inf \{ \|x - z\| : z \in C \}.$$

For each $z \in Z$ and each $r > 0$ set

$$B_Z(z, r) = \{ y \in Z : \|z - y\| \le r \}.$$

Let $I_Z(z) = z$ for all $z \in Z$.

For each pair of nonempty sets $C_1, C_2 \subset Z$ define

$$H(C_1, C_2) = \max \big\{ \sup \{ d(x, C_2) : x \in C_1 \}, \ \sup \{ d(y, C_1) : y \in C_2 \} \big\}.$$
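The quantity $H$ (the Hausdorff distance) can be computed directly for finite sets; below is a minimal sketch in an assumed toy setting on the real line.

```python
def hausdorff(C1, C2):
    """H(C1, C2) for finite, nonempty point sets on the real line:
    the larger of the two one-sided excesses sup_{x in C1} d(x, C2)
    and sup_{y in C2} d(y, C1)."""
    d = lambda x, C: min(abs(x - y) for y in C)
    return max(max(d(x, C2) for x in C1), max(d(y, C1) for y in C2))

assert hausdorff([0.0, 1.0], [0.0, 1.0]) == 0.0
# Only one of the two excesses may be large; H takes the maximum:
assert hausdorff([0.0], [0.0, 3.0]) == 3.0
assert hausdorff([0.0, 1.0], [0.5]) == 0.5
```

The second assertion shows why both suprema are needed: every point of the first set is close to the second set, yet the sets are far apart in the $H$ sense.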

Let $(X, \|\cdot\|)$ and $(Y, \|\cdot\|)$ be normed spaces. Denote by $L(X, Y)$ the set of all linear continuous operators $A : X \to Y$. For each $A \in L(X, Y)$ set

$$\|A\| = \sup \{ \|A(x)\| : x \in B_X(0, 1) \}.$$

Let $U \subset X$ be a nonempty open set, $F : U \to 2^Y \setminus \{\emptyset\}$, $x \in U$ and $\kappa > 0$. We say that the mapping $F$ is $(\kappa)$-pre-differentiable at $x$ if there exists $A \in L(X, Y)$ such that the following property holds:

(P1) for each $\epsilon > 0$ there exists $\delta(\epsilon) > 0$ such that

$$B_X(x, \delta(\epsilon)) \subset U \qquad (16.108)$$

and that for each $h \in B_X(0, \delta(\epsilon))$,

$$F(x) + A(h) \subset F(x + h) + (\kappa + \epsilon) \|h\| B_Y(0, 1), \qquad (16.109)$$

$$F(x + h) \subset F(x) + A(h) + (\kappa + \epsilon) \|h\| B_Y(0, 1). \qquad (16.110)$$

If $A \in L(X, Y)$ and (P1) holds, then $A$ is called a $(\kappa)$-pre-derivative of $F$ at $x$.

We denote by $\partial_{\kappa} F(x)$ the set of all $(\kappa)$-pre-derivatives of $F$ at $x$. Note that the set $\partial_{\kappa} F(x)$ can be empty.

Clearly, $A \in L(X, Y)$ satisfies $A \in \partial_{\kappa} F(x)$ if and only if for each $\epsilon > 0$ there exists $\delta(\epsilon) > 0$ such that

$$B_X(x, \delta(\epsilon)) \subset U$$

and that for each $h \in B_X(0, \delta(\epsilon))$,

$$H(F(x) + A(h), F(x + h)) \le (\kappa + \epsilon) \|h\|. \qquad (16.111)$$

We say that the mapping $F$ is $(\kappa)$-pre-differentiable if it is $(\kappa)$-pre-differentiable at every $x \in U$.

Recall that if $G : U \to Y$, $x \in U$ and $G$ is Fréchet differentiable at $x$, then we denote by $G'(x)$ the Fréchet derivative of $G$ at $x$.
Proposition 16.10. Let $G : U \to Y$, $g : U \to 2^Y \setminus \{\emptyset\}$, $x \in U$, $\kappa > 0$, let $G$ be Fréchet differentiable at $x$ and let

$$H(g(z_2), g(z_1)) \le \kappa \|z_2 - z_1\| \ \text{ for all } z_1, z_2 \in U. \qquad (16.112)$$

Then $G'(x)$ is a $(\kappa)$-pre-derivative of $G + g$ at $x$.

Proof. Let $\epsilon > 0$. There exists $\delta(\epsilon) > 0$ such that

$$B_X(x, \delta(\epsilon)) \subset U$$

and that for each $h \in B_X(0, \delta(\epsilon)) \setminus \{0\}$,

$$\|h\|^{-1} \|G(x + h) - G(x) - (G'(x))(h)\| < \epsilon / 2. \qquad (16.113)$$

By (16.112) and (16.113), for each $h \in B_X(0, \delta(\epsilon)) \setminus \{0\}$,

$$G(x + h) + g(x + h)$$
$$\subset G(x) + (G'(x))(h) + 2^{-1} \epsilon \|h\| B_Y(0, 1) + g(x + h)$$
$$\subset G(x) + (G'(x))(h) + 2^{-1} \epsilon \|h\| B_Y(0, 1) + g(x) + (\kappa + 4^{-1} \epsilon) \|h\| B_Y(0, 1)$$
$$\subset G(x) + g(x) + (G'(x))(h) + (\kappa + \epsilon) \|h\| B_Y(0, 1)$$

and

$$G(x) + g(x) + (G'(x))(h)$$
$$\subset G(x + h) + 2^{-1} \epsilon \|h\| B_Y(0, 1) + g(x)$$
$$\subset G(x + h) + 2^{-1} \epsilon \|h\| B_Y(0, 1) + g(x + h) + (\kappa + 4^{-1} \epsilon) \|h\| B_Y(0, 1)$$
$$\subset G(x + h) + g(x + h) + (\kappa + \epsilon) \|h\| B_Y(0, 1).$$

Proposition 16.10 is proved.


The next result is a mean-value theorem for pre-differentiable set-valued mappings.

Theorem 16.11. Assume that $U \subset X$ is a nonempty open set, $\kappa > 0$, a mapping $F : U \to 2^Y \setminus \{\emptyset\}$ is $(\kappa)$-pre-differentiable at every point of $U$, $x, y \in U$ satisfy $x \ne y$ and

$$\{ t x + (1 - t) y : t \in [0, 1] \} \subset U$$

and that

$$\tilde x \in F(x). \qquad (16.114)$$

Then there exists $t_0 \in (0, 1)$ such that for every

$$A \in \partial_{\kappa} F(x + t_0 (y - x))$$

the following inequality holds:

$$d(\tilde x, F(y)) \le \|A(y - x)\| + \kappa \|y - x\|.$$

Proof. For each $t \in [0, 1]$ set

$$\psi(t) = d(\tilde x, F(x + t (y - x))). \qquad (16.115)$$

Since the mapping $F$ is $(\kappa)$-pre-differentiable, the function $\psi$ is continuous on $[0, 1]$.

For every $t \in [0, 1]$ set

$$\phi(t) = \psi(t) - t (\psi(1) - \psi(0)). \qquad (16.116)$$

Clearly, the function $\phi$ is continuous on $[0, 1]$ and, in view of (16.114),

$$\phi(0) = \psi(0) = 0, \quad \phi(1) = 0.$$

Therefore there exists $t_0 \in (0, 1)$ such that either

(a) $t_0$ is a point of minimum of $\phi$ on $[0, 1]$

or

(b) $t_0$ is a point of maximum of $\phi$ on $[0, 1]$.

It is easy to see that in the case (a)

$$\liminf_{t \to t_0^+} (\phi(t) - \phi(t_0))(t - t_0)^{-1} \ge 0, \quad \limsup_{t \to t_0^-} (\phi(t) - \phi(t_0))(t - t_0)^{-1} \le 0 \qquad (16.117)$$

and in the case (b)

$$\limsup_{t \to t_0^+} (\phi(t) - \phi(t_0))(t - t_0)^{-1} \le 0, \quad \liminf_{t \to t_0^-} (\phi(t) - \phi(t_0))(t - t_0)^{-1} \ge 0. \qquad (16.118)$$

Assume that

$$A \in \partial_{\kappa} F(x + t_0 (y - x)). \qquad (16.119)$$

By (16.115), for every $t \in [0, 1]$,

$$\psi(t) - \psi(t_0) = d(\tilde x, F(x + t (y - x))) - d(\tilde x, F(x + t_0 (y - x))) \le H(F(x + t (y - x)), F(x + t_0 (y - x))),$$

$$\psi(t_0) - \psi(t) = d(\tilde x, F(x + t_0 (y - x))) - d(\tilde x, F(x + t (y - x))) \le H(F(x + t (y - x)), F(x + t_0 (y - x)))$$

and

$$|\psi(t) - \psi(t_0)| \le H(F(x + t (y - x)), F(x + t_0 (y - x))). \qquad (16.120)$$

Let $\epsilon > 0$. Since the mapping $F$ is $(\kappa)$-pre-differentiable on $U$, it follows from (16.119) that there exists $\delta(\epsilon) \in (0, \epsilon)$ such that

$$B_X(x + t_0 (y - x), \delta(\epsilon)) \subset U \qquad (16.121)$$

and that for each

$$h \in B_X(0, \delta(\epsilon)) \qquad (16.122)$$

we have

$$H(F(x + t_0 (y - x)) + A(h), F(x + t_0 (y - x) + h)) \le (\kappa + \epsilon/4) \|h\|. \qquad (16.123)$$

Assume that $t \in [0, 1]$ satisfies

$$|t - t_0| \le \delta(\epsilon) (\|x\| + \|y\| + 1)^{-1}. \qquad (16.124)$$

Set

$$h = (t - t_0)(y - x). \qquad (16.125)$$

In view of (16.124) and (16.125), relations (16.122) and (16.123) hold. By (16.123) and (16.125),

$$H(F(x + t_0 (y - x)) + (t - t_0) A(y - x), F(x + t (y - x))) \le (\kappa + 4^{-1} \epsilon) |t - t_0| \|y - x\|.$$

It follows from the relation above and (16.120) that

$$|\psi(t) - \psi(t_0)|$$
$$\le H(F(x + t_0 (y - x)), F(x + t_0 (y - x)) + (t - t_0) A(y - x))$$
$$\quad + H(F(x + t_0 (y - x)) + (t - t_0) A(y - x), F(x + t (y - x)))$$
$$\le |t - t_0| \|A(y - x)\| + (\kappa + 4^{-1} \epsilon) |t - t_0| \|y - x\|. \qquad (16.126)$$

Assume that the case (a) holds. Then (16.115)–(16.117), (16.124), and (16.126) imply that

$$0 \le \liminf_{t \to t_0^+} (\phi(t) - \phi(t_0))(t - t_0)^{-1}$$
$$= \liminf_{t \to t_0^+} \big( \psi(t) - t (\psi(1) - \psi(0)) - (\psi(t_0) - t_0 (\psi(1) - \psi(0))) \big)(t - t_0)^{-1}$$
$$= \liminf_{t \to t_0^+} (\psi(t) - \psi(t_0))(t - t_0)^{-1} - (\psi(1) - \psi(0))$$
$$\le \|A(y - x)\| + (\kappa + 4^{-1} \epsilon) \|y - x\| - d(\tilde x, F(y)).$$

Since $\epsilon$ is any positive number we conclude that

$$d(\tilde x, F(y)) \le \|A(y - x)\| + \kappa \|y - x\|. \qquad (16.127)$$

Assume that the case (b) holds. Then (16.114)–(16.116), (16.118), and (16.126) imply that

$$0 \le \liminf_{t \to t_0^-} (\phi(t) - \phi(t_0))(t - t_0)^{-1}$$
$$= \liminf_{t \to t_0^-} \big( \psi(t) - t (\psi(1) - \psi(0)) - (\psi(t_0) - t_0 (\psi(1) - \psi(0))) \big)(t - t_0)^{-1}$$
$$= \liminf_{t \to t_0^-} (\psi(t) - \psi(t_0))(t - t_0)^{-1} - (\psi(1) - \psi(0))$$
$$\le \liminf_{t \to t_0^-} |\psi(t) - \psi(t_0)| |t - t_0|^{-1} - d(\tilde x, F(y))$$
$$\le \|A(y - x)\| + (\kappa + 4^{-1} \epsilon) \|y - x\| - d(\tilde x, F(y)).$$

Since $\epsilon$ is any positive number we conclude that

$$d(\tilde x, F(y)) \le \|A(y - x)\| + \kappa \|y - x\|.$$

Thus (16.127) holds in both cases. This completes the proof of Theorem 16.11.

16.9 Newton’s Method for Solving Inclusions

We use the notation and definitions of Sect. 16.8. Suppose that the normed space $X$ is Banach.

Let $\kappa > 0$, $r > 0$, $\bar x \in X$, let $U$ be a nonempty open subset of $X$ such that

$$B_X(\bar x, r) \subset U$$

and let $F : U \to 2^Y \setminus \{\emptyset\}$ be a mapping which is $(\kappa)$-pre-differentiable at all points of $U$ and such that $F(x)$ is a closed set for all $x \in U$.

Let

$$L > 0, \quad \gamma_1 \in (0, 1).$$

Suppose that for each $z \in U$ there exists

$$A(z) \in \partial_{\kappa} F(z) \qquad (16.128)$$

such that for each $z_1, z_2 \in B_X(\bar x, r)$,

$$\|A(z_1) - A(z_2)\| \le L \|z_1 - z_2\|. \qquad (16.129)$$

Let $A \in L(Y, X)$ satisfy

$$\|I_X - A \circ A(\bar x)\| \le \gamma_1, \qquad (16.130)$$

suppose that there exists a continuous operator $A^{-1} : X \to Y$ and let a positive number $K$ satisfy

$$\inf \{ \|A(z)\| : z \in F(\bar x) \} \le K \le 4^{-1} r \ \text{ and } \ M := \|A\|. \qquad (16.131)$$

In view of (16.130),

$$M > 0.$$

For every $x \in U$ define

$$T(x) = x - A(F(x)) \qquad (16.132)$$

and for every $x \in X \setminus U$ set $T(x) = \emptyset$. The following result is proved in Sect. 16.11.
Theorem 16.12. Let

$$\gamma_1 + \kappa M \le 4^{-1}, \quad K \le (16 M L)^{-1}, \quad r_0 := \min \{ (4 M L)^{-1}, r \}. \qquad (16.133)$$

Then for each $x, y \in B_X(\bar x, r_0)$,

$$H(T(x), T(y)) \le 2^{-1} \|x - y\|,$$

$$d(\bar x, T(\bar x)) \le 4^{-1} r_0$$

and there exists $x_* \in B_X(\bar x, r_0)$ such that $x_* \in T(x_*)$ and $0 \in F(x_*)$. Moreover, the following assertions hold.

1. Assume that a sequence $\{\delta_i\}_{i=0}^{\infty} \subset (0, \infty)$ satisfies $\sum_{i=0}^{\infty} \delta_i < \infty$,

$$4 \sum_{i=0}^{\infty} \delta_i + 2 \max \{ \delta_i : i = 0, 1, \dots \} \le 2^{-1} r_0$$

and that a sequence $\{x_i\}_{i=0}^{\infty} \subset X$ satisfies

$$x_0 = \bar x$$

and, for each integer $i \ge 0$ satisfying $x_i \in B_X(\bar x, r_0)$, the inequalities

$$d(x_{i+1}, T(x_i)) \le \delta_i,$$
$$\|x_i - x_{i+1}\| \le d(x_i, T(x_i)) + \delta_i$$

hold. Then

$$\|x_i - \bar x\| < r_0 - \delta_{i-1} \ \text{ for all integers } i \ge 1,$$

for each integer $k \ge 1$,

$$\|x_k - x_{k+1}\| \le 2^{-k} (4^{-1} r_0 + \delta_0) + \sum_{j=0}^{k-1} 2^{-k+1+j} \delta_j + \sum_{j=1}^{k} 2^{-k+j} \delta_j,$$

for each integer $k \ge 0$,

$$\sum_{i=k}^{\infty} \|x_i - x_{i+1}\| \le 2 \|x_k - x_{k+1}\| + 4 \sum_{i=k}^{\infty} \delta_i$$

and there exists

$$\lim_{n \to \infty} x_n \in B_X(\bar x, r_0)$$

satisfying $\lim_{n \to \infty} x_n \in T(\lim_{n \to \infty} x_n)$ and $0 \in F(\lim_{n \to \infty} x_n)$.

2. Let $\epsilon \in (0, 1)$ and a natural number $n_0 > 2$ satisfy

$$2^{-n_0} (2^{-1} r_0 + 1) < 8^{-1} \epsilon$$

and let a number $\delta \in (0, \epsilon)$ satisfy

$$\delta < (16 n_0)^{-1} \epsilon$$

and

$$\delta < 4^{-1} (2 n_0 + 1)^{-1} r_0.$$

Assume that a sequence $\{x_i\}_{i=0}^{n_0} \subset X$ satisfies

$$x_0 = \bar x$$

and, for each integer $i \in [0, n_0 - 1]$ satisfying $x_i \in B_X(\bar x, r_0)$, the inequalities

$$d(x_{i+1}, T(x_i)) \le \delta,$$
$$\|x_i - x_{i+1}\| \le d(x_i, T(x_i)) + \delta$$

hold. Then

$$\|x_i - \bar x\| < r_0 - \delta \ \text{ for all } i = 1, \dots, n_0$$

and there exists $x_* \in B_X(\bar x, r_0)$ such that

$$0 \in F(x_*) \ \text{ and } \ \|x_{n_0} - x_*\| < \epsilon.$$
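A scalar illustration may help here as well. The sketch below is an assumed toy example, not from the text: $F(x)$ is the interval $\{x^2 - 2 + e : |e| \le c\}$, so the inclusion $0 \in F(x)$ means $|x^2 - 2| \le c$; $A$ is a fixed approximate inverse derivative, $T(x) = x - A F(x)$ is an interval, and each step moves to the nearest point of $T(x_i)$, realizing $d(x_i, T(x_i))$ exactly.

```python
# Assumed toy sketch (not from the text) of Newton's method for an inclusion:
# F(x) = {x^2 - 2 + e : |e| <= c} is an interval-valued residual.
A, c = 1.0 / 3.0, 1e-3            # fixed approximate inverse of F'(1.5); tolerance c

def T(x):
    """The interval x - A*F(x)."""
    mid = x - A * (x * x - 2.0)
    return (mid - A * c, mid + A * c)

x = 1.5                           # the starting point xbar
for _ in range(100):
    lo, hi = T(x)
    x = min(max(x, lo), hi)       # step to the nearest point of T(x_i)

# The limit solves the inclusion: 0 lies in F(x_*), i.e. |x_*^2 - 2| <= c.
assert abs(x * x - 2.0) <= c + 1e-12
```

Once the iterate lands inside $T(x)$, it is a fixed point of the set-valued map and hence a solution of the inclusion, mirroring the equivalence $x_* \in T(x_*) \Leftrightarrow 0 \in A(F(x_*))$ used in the theorem.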

16.10 Auxiliary Results for Theorem 16.12

Lemma 16.13. The mapping $T : U \to 2^X \setminus \{\emptyset\}$ is $(\kappa M)$-pre-differentiable at every point of $U$ and for every $x \in U$,

$$I_X - A \circ A(x) \in \partial_{\kappa M} T(x).$$

Proof. Let $x \in U$ and $\epsilon > 0$. In view of (16.128), there exists $\delta > 0$ such that

$$B_X(x, \delta) \subset U$$

and for each $h \in B_X(0, \delta)$,

$$H(F(x) + (A(x))(h), F(x + h)) \le (\kappa + \epsilon (\|A\| + 1)^{-1}) \|h\|.$$

By the inequality above, (16.131), and (16.132), for each $h \in B_X(0, \delta)$,

$$H(T(x) + (I_X - A \circ A(x))(h), T(x + h))$$
$$= H(x - A(F(x)) + h - A((A(x))(h)), \ x + h - A(F(x + h)))$$
$$= H(A(F(x) + (A(x))(h)), \ A(F(x + h)))$$
$$\le \|A\| (\kappa + \epsilon (\|A\| + 1)^{-1}) \|h\| \le (\kappa M + \epsilon) \|h\|.$$

Since $\epsilon$ is any positive number, this completes the proof of Lemma 16.13.

Lemma 16.14. Let $r_0 \in (0, r]$ and $x \in B_X(\bar x, r_0)$. Then

$$\|I_X - A \circ A(x)\| \le \gamma_1 + M L r_0.$$

Proof. By (16.129) and (16.130),

$$\|I_X - A \circ A(x)\|$$
$$= \|I_X - A \circ A(\bar x) + A \circ A(\bar x) - A \circ A(x)\|$$
$$\le \|I_X - A \circ A(\bar x)\| + \|A\| \|A(\bar x) - A(x)\|$$
$$\le \gamma_1 + M L \|x - \bar x\| \le \gamma_1 + M L r_0.$$

Lemma 16.14 is proved.

Lemma 16.15. Let $r_0 \in (0, r]$ and

$$x, y \in B_X(\bar x, r_0).$$

Then

$$H(T(x), T(y)) \le (\gamma_1 + M L r_0 + \kappa M) \|y - x\|.$$

Proof. By Lemma 16.13, the mapping $T$ is $(\kappa M)$-pre-differentiable and for every $x \in U$,

$$I_X - A \circ A(x) \in \partial_{\kappa M} T(x).$$

We may assume that $x \ne y$.

Let

$$\tilde x \in T(x).$$

By Theorem 16.11 and Lemmas 16.13 and 16.14, there exists $t_0 \in (0, 1)$ such that

$$d(\tilde x, T(y)) \le \|(I_X - A \circ A(x + t_0 (y - x)))(y - x)\| + \kappa M \|y - x\|$$
$$\le (\gamma_1 + M L r_0 + \kappa M) \|y - x\|.$$

Since $\tilde x$ is an arbitrary element of $T(x)$ this implies that

$$\sup \{ d(\tilde x, T(y)) : \tilde x \in T(x) \} \le (\gamma_1 + M L r_0 + \kappa M) \|y - x\|.$$

Analogously,

$$\sup \{ d(\tilde y, T(x)) : \tilde y \in T(y) \} \le (\gamma_1 + M L r_0 + \kappa M) \|y - x\|.$$

Therefore

$$H(T(x), T(y)) \le (\gamma_1 + M L r_0 + \kappa M) \|y - x\|.$$

Lemma 16.15 is proved.

16.11 Proof of Theorem 16.12

By (16.131)–(16.133),

d.Nx; T.Nx// D inffkA.z/k W z 2 F.Nx/k  K  41 r0 :

Let
x; y 2 BX .Nx; r0 /:
Lemma 16.15 imply that
H.T.x/; T.y//  ky  xk. 1 C MLr0 C M /  21 kx  yk:

Clearly, the set

f.x; y/ 2 BX .Nx; r0 /  BX .Nx; r0 / W y 2 T.x/g

is closed. It is not difficult to see that Theorem 16.12 follows from Theorem 16.8
applied to the mapping T.
© Springer International Publishing Switzerland 2016
A.J. Zaslavski, Numerical Optimization with Computational Errors, Springer Optimization and Its Applications 108, DOI 10.1007/978-3-319-30921-7
298 References

18. Beck A, Sabach S (2015) Weiszfeld’s method: old and new results. J Optim Theory Appl
164:1–40
19. Beck A, Teboulle M (2003) Mirror descent and nonlinear projected subgradient methods for
convex optimization. Oper Res Lett 31:167–175
20. Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear
inverse problems. SIAM J Imag Sci 2:183–202
21. Ben-Israel A (1966) A Newton-Raphson method for the solution of equations. J Math Anal
Appl 15:243–253
22. Ben-Israel A, Greville TNE (1974) Generalized inverses: theory and applications. Wiley, New
York
23. Bolte J (2003) Continuous gradient projection method in Hilbert spaces. J Optim Theory Appl
119:235–259
24. Bonnans JF (1994) Local analysis of Newton-type methods for variational inequalities and
nonlinear programming. Appl Math Optim 29:161–186
25. Boukari D, Fiacco AV (1995) Survey of penalty, exact-penalty and multiplier methods from
1968 to 1993. Optimization 32:301–334
26. Bregman LM (1967) A relaxation method of finding a common point of convex sets and
its application to the solution of problems in convex programming. Z Vycisl Mat Mat Fiz
7:620–631
27. Brezis H (1973) Opérateurs maximaux monotones. North Holland, Amsterdam
28. Bruck RE (1974) Asymptotic convergence of nonlinear contraction semigroups in a Hilbert
space. J Funct Anal 18:15–26
29. Burachik RS, Iusem AN (1998) A generalized proximal point algorithm for the variational
inequality problem in a Hilbert space. SIAM J Optim 8:197–216
30. Burachik RS, Grana Drummond LM, Iusem AN, Svaiter BF (1995) Full convergence of the
steepest descent method with inexact line searches. Optimization 32:137–146
31. Burachik RS, Lopes JO, Da Silva GJP (2009) An inexact interior point proximal method for
the variational inequality problem. Comput Appl Math 28:15–36
32. Burachik RS, Kaya CY, Sabach S (2012) A generalized univariate Newton method motivated
by proximal regularization. J Optim Theory Appl 155:923–940
33. Burke JV (1991) An exact penalization viewpoint of constrained optimization. SIAM
J Control Optim 29:968–998
34. Butnariu D, Kassay G (2008) A proximal-projection method for finding zeros of set-valued
operators. SIAM J Control Optim 47:2096–2136
35. Ceng LC, Mordukhovich BS, Yao JC (2010) Hybrid approximate proximal method with
auxiliary variational inequality for vector optimization. J Optim Theory Appl 146:267–303
36. Censor Y, Zenios SA (1992) The proximal minimization algorithm with D-functions.
J. Optim. Theory Appl. 73:451–464
37. Censor Y, Gibali A, Reich S (2011) The subgradient extragradient method for solving
variational inequalities in Hilbert space. J Optim Theory Appl 148:318–335
38. Censor Y, Gibali A, Reich S (2012) A von Neumann alternating method for finding common
solutions to variational inequalities. Nonlinear Anal 75:4596–4603
39. Censor Y, Gibali A, Reich S, Sabach S (2012) Common solutions to variational inequalities.
Set-Valued Var Anal 20:229–247
40. Chen Z, Zhao K (2009) A proximal-type method for convex vector optimization problem in
Banach spaces. Numer Funct Anal Optim 30:70–81
41. Chen X, Nashed Z, Qi L (1997) Convergence of Newton's method for singular smooth and
nonsmooth equations using adaptive outer inverses. SIAM J Optim 7:445–462
42. Chuong TD, Mordukhovich BS, Yao JC (2011) Hybrid approximate proximal algorithms for
efficient solutions in vector optimization. J Nonlinear Convex Anal 12:861–864
43. Clarke FH (1983) Optimization and nonsmooth analysis. Wiley-Interscience, New York
44. Demyanov VF, Vasilyev LV (1985) Nondifferentiable optimization. Optimization Software,
New York
45. Di Pillo G, Grippo L (1989) Exact penalty functions in constrained optimization. SIAM
J Control Optim 27:1333–1360
46. Dontchev AL, Rockafellar RT (2010) Newton’s method for generalized equations: a sequen-
tial implicit function theorem. Math Program 123:139–159
47. Dontchev AL, Rockafellar RT (2013) Convergence of inexact Newton methods for general-
ized equations. Math Program Ser B 139:115–137
48. Eremin II (1966) The penalty method in convex programming. Sov Math Dokl 8:459–462
49. Eremin II (1971) The penalty method in convex programming. Cybernetics 3:53–56
50. Ekeland I (1974) On the variational principle. J Math Anal Appl 47:324–353
51. Ermoliev YM (1966) Methods for solving nonlinear extremal problems. Cybernetics 2:1–17
52. Facchinei F, Pang J-S (2003) Finite-dimensional variational inequalities and complementarity
problems, volume I and volume II. Springer, New York
53. Güler O (1991) On the convergence of the proximal point algorithm for convex minimization.
SIAM J Control Optim 29:403–419
54. Gwinner J, Raciti F (2009) On monotone variational inequalities with random data. J Math
Inequal 3:443–453
55. Hager WW, Zhang H (2007) Asymptotic convergence analysis of a new class of proximal
point methods. SIAM J Control Optim 46:1683–1704
56. Hager WW, Zhang H (2008) Self-adaptive inexact proximal point methods. Comput Optim
Appl 39:161–181
57. Han S-P, Mangasarian OL (1979) Exact penalty function in nonlinear programming. Math
Program 17:251–269
58. Hiriart-Urruty J-B, Lemarechal C (1993) Convex analysis and minimization algorithms.
Springer, Berlin
59. Iiduka H, Takahashi W, Toyoda M (2004) Approximation of solutions of variational inequal-
ities for monotone mappings. Pan Am Math J 14:49–61
60. Ioffe AD, Zaslavski AJ (2000) Variational principles and well-posedness in optimization and
calculus of variations. SIAM J Control Optim 38:566–581
61. Iusem A, Nasri M (2007) Inexact proximal point methods for equilibrium problems in Banach
spaces. Numer Funct Anal Optim 28:1279–1308
62. Iusem A, Resmerita E (2010) A proximal point method in nonreflexive Banach spaces. Set-
Valued Var Anal 18:109–120
63. Izmailov AF, Solodov MV (2014) Newton-type methods for optimization and variational
problems. Springer International Publishing, Cham
64. Kantorovich LV (1948) Functional analysis and applied mathematics. Usp Mat Nauk
3:89–185
65. Kantorovich LV, Akilov GP (1982) Functional analysis. Pergamon Press, Oxford, New York
66. Kaplan A, Tichatschke R (1994) Stable methods for ill-posed variational problems. Akademie
Verlag, Berlin
67. Kaplan A, Tichatschke R (1998) Proximal point methods and nonconvex optimization.
J Global Optim 13:389–406
68. Kaplan A, Tichatschke R (2007) Bregman-like functions and proximal methods for varia-
tional problems with nonlinear constraints. Optimization 56:253–265
69. Kassay G (1985) The proximal points algorithm for reflexive Banach spaces. Stud Univ
Babes-Bolyai Math 30:9–17
70. Kiwiel KC (1996) Restricted step and Levenberg–Marquardt techniques in proximal bundle
methods for nonconvex nondifferentiable optimization. SIAM J Optim 6:227–249
71. Konnov IV (1997) On systems of variational inequalities. Russ Math (Iz VUZ) 41:79–88
72. Konnov IV (2001) Combined relaxation methods for variational inequalities. Springer, Berlin,
Heidelberg
73. Konnov IV (2008) Nonlinear extended variational inequalities without differentiability:
applications and solution methods. Nonlinear Anal 69:1–13
74. Konnov IV (2009) A descent method with inexact linear search for mixed variational
inequalities. Russ Math (Iz VUZ) 53:29–35
75. Korpelevich GM (1976) The extragradient method for finding saddle points and other
problems. Ekon Matem Metody 12:747–756
76. Kutateladze SS (1979) Convex operators. Usp Mat Nauk 34:167–196
77. Lemaire B (1989) The proximal algorithm. In: Penot JP (ed) International series of numerical
mathematics, vol 87. Birkhäuser-Verlag, Basel, pp 73–87
78. Lotito PA, Parente LA, Solodov MV (2009) A class of variable metric decomposition methods
for monotone variational inclusions. J Convex Anal 16:857–880
79. Mainge P-E (2008) Strong convergence of projected subgradient methods for nonsmooth and
nonstrictly convex minimization. Set-Valued Anal 16:899–912
80. Mangasarian OL, Pang J-S (1997) Exact penalty functions for mathematical programs with
linear complementary constraints. Optimization 42:1–8
81. Martinet B (1978) Perturbation des méthodes d'optimisation: application. RAIRO Anal Numer
12:153–171
82. Minty GJ (1962) Monotone (nonlinear) operators in Hilbert space. Duke Math J 29:341–346
83. Minty GJ (1964) On the monotonicity of the gradient of a convex function. Pac J Math
14:243–247
84. Mordukhovich BS (2006) Variational analysis and generalized differentiation, I: basic theory.
Springer, Berlin
85. Mordukhovich BS (2006) Variational analysis and generalized differentiation, II: applica-
tions. Springer, Berlin
86. Mordukhovich BS, Nam NM (2014) An easy path to convex analysis and applications.
Morgan & Claypool Publishers, San Rafael, CA
87. Moreau JJ (1965) Proximité et dualité dans un espace hilbertien. Bull Soc Math Fr
93:273–299
88. Nashed MZ, Chen X (1993) Convergence of Newton-like methods for singular operator
equations using outer inverses. Numer Math 66:235–257
89. Nedic A, Ozdaglar A (2009) Subgradient methods for saddle-point problems. J Optim Theory
Appl 142:205–228
90. Nemirovski A, Yudin D (1983) Problem complexity and method efficiency in optimization.
Wiley, New York
91. Nesterov Yu (1983) A method for solving the convex programming problem with convergence
rate O(1/k²). Dokl Akad Nauk 269:543–547
92. Nesterov Yu (2004) Introductory lectures on convex optimization. Kluwer, Boston
93. Pang J-S (1985) Asymmetric variational inequality problems over product sets: applications
and iterative methods. Math Program 31:206–219
94. Pang J-S (1990) Newton’s method for B-differentiable equations. Math Oper Res 15:311–341
95. Polyak BT (1967) A general method of solving extremum problems. Dokl Akad Nauk
8:593–597
96. Polyak BT (1987) Introduction to optimization. Optimization Software, New York
97. Polyak BT (2007) Newton's method and its use in optimization. Eur J Oper Res
181:1086–1096
98. Polyak RA (2015) Projected gradient method for non-negative least squares. Contemp Math
636:167–179
99. Qi L, Sun J (1993) A nonsmooth version of Newton’s method. Math Program 58:353–367
100. Reich S, Sabach S (2010) Two strong convergence theorems for Bregman strongly nonexpan-
sive operators in reflexive Banach spaces. Nonlinear Anal 73:122–135
101. Reich S, Zaslavski AJ (2014) Genericity in nonlinear analysis. Springer, New York
102. Robinson SM (1994) Newton's method for a class of nonsmooth functions. Set-Valued Anal
2:291–305
103. Rockafellar RT (1976) Augmented Lagrangians and applications of the proximal point
algorithm in convex programming. Math Oper Res 1:97–116
104. Rockafellar RT (1976) Monotone operators and the proximal point algorithm. SIAM J Control
Optim 14:877–898
105. Shor NZ (1985) Minimization methods for non-differentiable functions. Springer, Berlin
106. Solodov MV, Svaiter BF (2000) Error bounds for proximal point subproblems and associated
inexact proximal point algorithms. Math Program 88:371–389
107. Solodov MV, Svaiter BF (2001) A unified framework for some inexact proximal point
algorithms. Numer Funct Anal Optim 22:1013–1035
108. Solodov MV, Zavriev SK (1998) Error stability properties of generalized gradient-type
algorithms. J Optim Theory Appl 98:663–680
109. Su M, Xu H-K (2010) Remarks on the gradient-projection algorithm. J Nonlinear Anal Optim
1:35–43
110. Weiszfeld EV (1937) Sur le point pour lequel la somme des distances de n points donnés est
minimum. Tohoku Math J 43:355–386
111. Xu H-K (2006) A regularization method for the proximal point algorithm. J Global Optim
36:115–125
112. Xu H-K (2011) Averaged mappings and the gradient-projection algorithm. J Optim Theory
Appl 150:360–378
113. Yamashita N, Kanzow C, Morimoto T, Fukushima M (2001) An infeasible interior proximal
method for convex programming problems with linear constraints. J Nonlinear Convex Anal
2:139–156
114. Zangwill WI (1967) Nonlinear programming via penalty functions. Manage Sci 13:344–358
115. Zaslavski AJ (2003) Existence of solutions of minimization problems with an increasing cost
function and porosity. Abstr Appl Anal 2003:651–670
116. Zaslavski AJ (2003) Generic existence of solutions of minimization problems with an
increasing cost function. J Nonlinear Funct Anal Appl 8:181–213
117. Zaslavski AJ (2005) A sufficient condition for exact penalty in constrained optimization.
SIAM J Optim 16:250–262
118. Zaslavski AJ (2007) Existence of approximate exact penalty in constrained optimization.
Math Oper Res 32:484–495
119. Zaslavski AJ (2010) An estimation of exact penalty in constrained optimization. J Nonlinear
Convex Anal 11:381–389
120. Zaslavski AJ (2010) Convergence of a proximal method in the presence of computational
errors in Hilbert spaces. SIAM J Optim 20:2413–2421
121. Zaslavski AJ (2010) Optimization on metric and normed spaces. Springer, New York
122. Zaslavski AJ (2010) The projected subgradient method for nonsmooth convex optimization
in the presence of computational errors. Numer Funct Anal Optim 31:616–633
123. Zaslavski AJ (2011) An estimation of exact penalty for infinite-dimensional inequality-
constrained minimization problems. Set-Valued Var Anal 19:385–398
124. Zaslavski AJ (2011) Inexact proximal point methods in metric spaces. Set-Valued Var Anal
19:589–608
125. Zaslavski AJ (2011) Maximal monotone operators and the proximal point algorithm in the
presence of computational errors. J Optim Theory Appl 150:20–32
126. Zaslavski AJ (2012) The extragradient method for convex optimization in the presence of
computational errors. Numer Funct Anal Optim 33:1399–1412
127. Zaslavski AJ (2012) The extragradient method for solving variational inequalities in the
presence of computational errors. J Optim Theory Appl 153:602–618
128. Zaslavski AJ (2013) The extragradient method for finding a common solution of a finite
family of variational inequalities and a finite family of fixed point problems in the presence
of computational errors. J Math Anal Appl 400:651–663
129. Zeng LC, Yao JC (2006) Strong convergence theorem by an extragradient method for fixed
point problems and variational inequality problems. Taiwan J Math 10:1293–1303
Index

A
Absolutely continuous function, 154
Algorithm, 11, 13, 14
Approximate solution, 1, 44

B
Banach space, 225
Bochner integrable function, 225

C
Cardinality of a set, 137
Collinear vectors, 86
Compact set, 228
Concave function, 26, 36
Continuous subgradient algorithm, 225
Convex–concave function, 11
Convex cone, 247
Convex function, 1, 4, 6, 11, 20, 26, 35
Convex hull, 228
Convex minimization problem, 105
Convex set, 11, 12

E
Ekeland's variational principle, 244, 252
Euclidean norm, 167
Euclidean space, 86, 169
Exact penalty, 239
Extragradient method, 183, 205

F
Fermat–Weber location problem, 85, 86
Fréchet derivative, 45, 59
Fréchet differentiable function, 45, 59, 63

G
Gâteaux derivative, 106
Gâteaux differentiable function, 106
Gradient-type method, 8, 73

H
Hilbert space, 1, 4, 6, 11, 20

I
Increasing function, 247
Inner product, 1, 4, 6, 20

K
Karush–Kuhn–Tucker theorem, 252

L
Lebesgue measurable function, 225, 227
Linear functional, 246
Linear inverse problem, 74
Lower semicontinuous function, 6, 137

M
Maximal monotone operator, 169
Metric space, 149
Minimization problem, 2
Minimizer, 15, 16, 22, 42
Mirror descent method, 4, 5, 41
Monotone mapping, 8, 184
Monotone operator, 170

N
Newton's method, 267
Nonexpansive mapping, 170
Norm, 1, 4, 6

P
Penalty function, 239
Pre-derivative, 268
Pre-differentiable mapping, 268
Projected gradient algorithm, 59
Projected subgradient method, 119
Proximal mapping, 170
Proximal point method, 6, 137
Pseudo-monotone mapping, 8, 184

Q
Quadratic function, 90

S
Saddle point, 11
Strongly measurable function, 225
Subdifferential, 85, 120
Subgradient projection algorithm, 1, 11

T
Taylor expansion, 90

V
Variational inequality, 8, 183
Vector space, 247

W
Weiszfeld's method, 85
Well-posed problem, 140, 165

Z
Zero-sum game, 25, 35

© Springer International Publishing Switzerland 2016
A.J. Zaslavski, Numerical Optimization with Computational Errors, Springer
Optimization and Its Applications 108, DOI 10.1007/978-3-319-30921-7