
Optimization Techniques

Topic: One-Dimensional Unconstrained Optimization

Dr. Nasir M Mirza


Email: nasirmm@yahoo.com

Classification of Optimization Problems


If f(x) and the constraints are linear, we
have linear programming.
e.g.: Maximize x + y subject to
3x + 4y ≤ 2
y ≤ 5
If f(x) is quadratic and the constraints are
linear, we have quadratic programming.
If f(x) is not linear or quadratic and/or the
constraints are nonlinear, we have
nonlinear programming.

Classification of Optimization Problems


When constraints (equations marked with *)
are included, we have a constrained
optimization problem.
Otherwise, we have an unconstrained
optimization problem.

Optimization Methods
One-Dimensional Unconstrained Optimization
Golden-Section Search
Quadratic Interpolation
Newton's Method

Multi-Dimensional Unconstrained Optimization


Non-gradient or direct methods
Gradient methods

Linear Programming (Constrained)


Graphical Solution
Simplex Method

Global and Local Optima


A function is said to be multimodal on a given
interval if there is more than one
minimum/maximum point in the interval.

Characteristics of Optima

To find the optima, we can find the zeroes of f'(x).

Mathematical Background
Objective: Maximize or Minimize f(x)
subject to the constraints

di(x) ≤ ai,   i = 1, 2, ..., m   (*)
ei(x) = bi,   i = 1, 2, ..., p   (*)

where x = {x1, x2, ..., xn}

f(x): objective function
di(x): inequality constraints
ei(x): equality constraints
ai and bi are constants

Note that Maximize f(x) is equivalent to Minimize -f(x).


Line search methods


Line search techniques are in essence optimization algorithms
for one-dimensional minimization problems.
They are often regarded as the backbones of nonlinear
optimization algorithms.
Typically, these techniques search a bracketed interval.
Often, unimodality is assumed.

Exhaustive search requires N = (b - a)/ε + 1 calculations to
search an interval [a, b] containing x*, where ε is the resolution.

Bracketing Method
[Figure: a unimodal function f(x) on the interval [xl, xu]]

Suppose f(x) is unimodal on the interval [xl, xu], that is,
there is only one local maximum in [xl, xu].
Objective: gradually narrow down the interval by
eliminating the sub-interval that does not contain the
maximum.

Bracketing Method

[Figure: interior points xa and xb inside [xl, xu], before and after eliminating the sub-interval [xb, xu]]

Let xa and xb be two points in (xl, xu) where xa < xb.


If f(xa) > f(xb), then the maximum point cannot reside in the
interval [xb, xu], so we can eliminate the portion
to the right of xb.
In other words, in the next iteration we can make xb the new xu.

Basic bracketing algorithm


Two-point search (dichotomous search) for finding the solution to
minimizing φ(x):
0) Assume an interval [a, b].
1) Find x1 = a + (b - a)/2 - ε/2 and x2 = a + (b - a)/2 + ε/2,
   where ε is the resolution.
2) Compare φ(x1) and φ(x2).
3) If φ(x1) < φ(x2) then eliminate x > x2 and set b = x2.
   If φ(x1) > φ(x2) then eliminate x < x1 and set a = x1.
   If φ(x1) = φ(x2) then pick another pair of points.
4) Continue placing point pairs until the interval length is < 2ε.
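The steps above can be written as a short routine. Below is a minimal Python sketch (my own illustration, not part of the original slides); the function name, test objective, and tolerance are only examples.

def dichotomous_search(phi, a, b, eps=1e-5):
    """Minimize a unimodal function phi on [a, b] by dichotomous search.

    eps is the resolution: the two probe points straddle the midpoint,
    eps apart, and the search stops once the interval is shorter than 2*eps.
    """
    while (b - a) >= 2 * eps:
        mid = a + (b - a) / 2
        x1 = mid - eps / 2
        x2 = mid + eps / 2
        if phi(x1) < phi(x2):
            b = x2   # minimum cannot lie to the right of x2
        else:
            a = x1   # minimum cannot lie to the left of x1
    return (a + b) / 2

# Example: the minimum of (x - 2)^2 on [0, 5] is at x = 2
x_min = dichotomous_search(lambda x: (x - 2) ** 2, 0.0, 5.0)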

Generic Bracketing Method (Pseudocode)


// xl, xu: Lower and upper bounds of the interval
// es: Acceptable relative error
function BracketingMax(xl, xu, es) {
  optimal = -infinity;
  do {
    prev_optimal = optimal;
    Select xa and xb s.t. xl <= xa < xb <= xu;
    if (f(xa) < f(xb))
      xl = xa;   // maximum cannot lie in [xl, xa]
    else
      xu = xb;   // maximum cannot lie in [xb, xu]
    optimal = max(f(xa), f(xb));
    ea = abs((optimal - prev_optimal) / optimal);
  } while (ea > es);
  return optimal;
}

Bracketing Method
How would you suggest we select xa and xb (with
the objective to minimize computation)?
Eliminate as much interval as possible in each iteration
Set xa and xb close to the center so that we can
halve the interval in each iteration
Drawbacks: function evaluation is usually a costly
operation.
Minimize the number of function evaluations
Select xa and xb such that one of them can be
reused in the next iteration (so that we only need to
evaluate f(x) once in each iteration).
How should we select such points?

Current iteration: the interval [xl, xu] has length l0; the interior points
xa and xb are each a distance l1 from one end (xb is l1 from xl, and xa is
l1 from xu).

Next iteration: the new interval [x'l, x'u] has length l'0, with interior
points x'a and x'b, each a distance l'1 from one end.

Objective: x'b = xa or x'a = xb, with

l1 / l0 = l'1 / l'0 = R

If we can calculate xa and xb


based on the ratio R w.r.t.
the current interval length in
each iteration, then we can
reuse one of xa and xb in the
next iteration.
In this example, xa is reused
as x'b in the next iteration so
in the next iteration we only
need to evaluate f(x'a).

Since l'0 = l1 and l'1 = l0 - l1, the requirement

l'1 / l'0 = R

becomes

(l0 - l1) / l1 = R

Substituting l1 = R·l0:

(l0 - R·l0) / (R·l0) = R
R²·l0 + R·l0 - l0 = 0
R² + R - 1 = 0

Taking the positive root of this quadratic:

R = (-1 + √(1 + 4)) / 2 = (√5 - 1) / 2 ≈ 0.61803

This value of R is the Golden Ratio.

Golden-Section Search
Starts with two initial guesses, xl and xu.
Two interior points xa and xb are calculated based on the
golden ratio as

xa = xu - d  and  xb = xl + d,  where  d = ((√5 - 1)/2)(xu - xl)

In the first iteration, both xa and xb need to be
calculated.
In subsequent iterations, xl and xu are updated
accordingly and only one of the two interior points
needs to be calculated. (The other one is inherited from
the previous iteration.)

Golden-Section Search
In each iteration the interval is reduced to about 61.8%
(the golden ratio) of its previous length.
After 10 iterations, the interval is shrunk to about
(0.618)^10, or 0.8%, of its initial length.
After 20 iterations, the interval is shrunk to about
(0.618)^20, or 0.0066%.

Bracketing a Minimum using Golden Section


Initialize:
  x1 = a + (b-a)*0.382
  x2 = a + (b-a)*0.618
  f1 = f(x1)
  f2 = f(x2)
Loop (until b - a is small enough):
  if f1 > f2 then
    a = x1; x1 = x2; f1 = f2
    x2 = a + (b-a)*0.618
    f2 = f(x2)
  else
    b = x2; x2 = x1; f2 = f1
    x1 = a + (b-a)*0.382
    f1 = f(x1)
  endif

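The loop above can be completed into a runnable routine. Here is a minimal Python sketch for minimization (my own completion, not from the slides), assuming phi is unimodal on [a, b]; the tolerance and test function are only illustrative.

def golden_section_min(phi, a, b, tol=1e-6):
    """Minimize a unimodal function phi on [a, b] by golden-section search."""
    R = (5 ** 0.5 - 1) / 2          # golden ratio, about 0.618
    x1 = b - R * (b - a)            # = a + 0.382*(b - a)
    x2 = a + R * (b - a)            # = a + 0.618*(b - a)
    f1, f2 = phi(x1), phi(x2)
    while (b - a) > tol:
        if f1 > f2:                 # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = a + R * (b - a)
            f2 = phi(x2)
        else:                       # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = b - R * (b - a)
            f1 = phi(x1)
    return (a + b) / 2

# Example: the minimum of x^2 - 4x on [0, 10] is at x = 2
x_min = golden_section_min(lambda x: x * x - 4 * x, 0.0, 10.0)

Only one new function evaluation is needed per iteration because the reused interior point already sits at the golden-ratio position of the reduced interval.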

Fibonacci Search
Fibonacci numbers are:
1, 1, 2, 3, 5, 8, 13, 21, 34, ...
that is, each number is the sum of the previous two:
Fn = Fn-1 + Fn-2

The successive interval lengths satisfy L1 = L2 + L3.

It can be derived that the final interval length after n evaluations is
Ln = (L1 + ε·Fn-2) / Fn
where ε is the resolution.
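As a rough illustration (my own sketch, not from the slides), the Fibonacci relation tells us how many function evaluations n are needed to reach a target final interval length, ignoring the resolution term:

def fibonacci_evaluations_needed(L1, Ln_target):
    """Smallest n with F_n >= L1 / Ln_target (resolution term ignored)."""
    F_prev, F_curr, n = 1, 1, 2      # F_1 = F_2 = 1
    while F_curr < L1 / Ln_target:
        F_prev, F_curr = F_curr, F_prev + F_curr
        n += 1
    return n

# Example: to shrink an interval of length 10 down to about 0.01,
# we need F_n >= 1000; this returns n = 17 (F17 = 1597).
n = fibonacci_evaluations_needed(10.0, 0.01)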

Quadratic Interpolation
[Figure: a quadratic g(x) fitted through three points x0, x1, x2 bracketing the optimum of f(x); the optimum x3 of g(x) approximates the optimum of f(x).]

Idea:
(i) Approximate f(x) using a quadratic function g(x) = ax² + bx + c.
(ii) Optimum of f(x) ≈ optimum of g(x).

Quadratic Interpolation
Shape near an optimum typically appears like a
parabola. We can approximate the original
function f(x) using a quadratic function:
g(x) = ax² + bx + c.
At the optimum point of g(x), g'(x) = 2ax + b = 0.
Let x3 be the optimum point; then x3 = -b/(2a).
How do we compute a and b?

2 points => unique straight line (1st-order polynomial)
3 points => unique parabola (2nd-order polynomial)
So we need to pick three points that surround the optimum.
Let these points be x0, x1, x2 such that x0 < x1 < x2.

Quadratic Interpolation
a, b, and c can be obtained by solving the system of linear
equations

a·x0² + b·x0 + c = f(x0)
a·x1² + b·x1 + c = f(x1)
a·x2² + b·x2 + c = f(x2)

Substituting a and b into x3 = -b/(2a) yields

x3 = [ f(x0)(x1² - x2²) + f(x1)(x2² - x0²) + f(x2)(x0² - x1²) ]
     / [ 2 f(x0)(x1 - x2) + 2 f(x1)(x2 - x0) + 2 f(x2)(x0 - x1) ]

Quadratic Interpolation
The process can be repeated to improve the
approximation.
Next step, decide which sub-interval to discard
Since f(x3) > f(x1)
if x3 > x1, discard the interval toward the left of x1
i.e., Set x0 = x1 and x1 = x3
if x3 < x1, discard the interval toward the right of x1
i.e., Set x2 = x1 and x1 = x3

Calculate x3 based on the new x0 , x1 , x2
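Below is a minimal Python sketch of successive quadratic interpolation for a maximum, following the vertex formula and the discard rule above (my own illustration; the test function and iteration count are arbitrary). As on the slide, it assumes the new point x3 improves on x1 in every pass.

import math

def quadratic_interpolation_max(f, x0, x1, x2, n_iter=20):
    """Locate a maximum of f by successive quadratic (parabolic) interpolation.

    Requires x0 < x1 < x2 bracketing the maximum with f(x1) > f(x0), f(x2).
    """
    f0, f1, f2 = f(x0), f(x1), f(x2)
    for _ in range(n_iter):
        # Vertex of the parabola through (x0, f0), (x1, f1), (x2, f2)
        num = f0*(x1**2 - x2**2) + f1*(x2**2 - x0**2) + f2*(x0**2 - x1**2)
        den = 2*(f0*(x1 - x2) + f1*(x2 - x0) + f2*(x0 - x1))
        x3 = num / den
        f3 = f(x3)
        if x3 > x1:                 # discard the sub-interval left of x1
            x0, f0 = x1, f1
        else:                       # discard the sub-interval right of x1
            x2, f2 = x1, f1
        x1, f1 = x3, f3
    return x1

# Example: f(x) = 2 sin(x) - x^2/10 has a maximum near x = 1.4276
x_max = quadratic_interpolation_max(lambda x: 2*math.sin(x) - x*x/10,
                                    0.0, 1.0, 4.0)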

Gradient method: Newton's Method


If your function is differentiable, then you do not
need to evaluate two points to determine the
region to be discarded. Get the slope, and its sign
indicates which region to discard.
Basic premise in Newton-Raphson method:
Root finding of first derivative is equivalent to finding optimum
(if function is differentiable).
Method is sometimes referred to as a line search by curve fit
because it approximates the real (unknown) objective
function to be minimized.

Newton's Method
Let g(x) = f'(x).
Thus the zeroes of g(x) are the optima of f(x).
Substituting g(x) into the updating formula of the
Newton-Raphson method, we have

xi+1 = xi - g(xi) / g'(xi) = xi - f'(xi) / f''(xi)
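As a concrete illustration (my own sketch, not from the slides), the update can be coded by supplying f' and f'' directly; the test function and starting point below are arbitrary.

import math

def newton_optimize(df, d2f, x0, tol=1e-8, max_iter=50):
    """Find a stationary point of f by applying Newton-Raphson to g(x) = f'(x)."""
    x = x0
    for _ in range(max_iter):
        step = df(x) / d2f(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Example: f(x) = 2 sin(x) - x^2/10, so f'(x) = 2 cos(x) - x/5
# and f''(x) = -2 sin(x) - 1/5; the maximum is near x = 1.4276.
x_star = newton_optimize(lambda x: 2*math.cos(x) - x/5,
                         lambda x: -2*math.sin(x) - 0.2,
                         x0=2.5)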

Newton's Method
Shortcomings:
Need to derive f'(x) and f''(x).
May diverge.
May "jump" to another solution far away.

Advantages:
Fast convergence rate near the solution.
Hybrid approach: use a bracketing method to find an
approximation near the solution, then switch to
Newton's method.

False Position Method or Secant Method
Second-order information is expensive to calculate (especially for multivariable problems).
Thus, try to approximate the second-order derivative.
Replace y''(xk) in Newton-Raphson with

y''(xk) ≈ ( y'(xk) - y'(xk-1) ) / ( xk - xk-1 )

Hence, Newton-Raphson becomes

xk+1 = xk - y'(xk) · ( xk - xk-1 ) / ( y'(xk) - y'(xk-1) )

Main advantage: no second derivative is required.


Question: Why is this an advantage?
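A minimal Python sketch of this secant update applied to f' (my own illustration; the names and the test derivative are arbitrary), which needs only first-derivative evaluations:

import math

def secant_optimize(df, x0, x1, tol=1e-8, max_iter=50):
    """Find a stationary point of f using only its first derivative df.

    The second derivative in Newton's update is replaced by a finite
    difference of df between the last two iterates.
    """
    g0, g1 = df(x0), df(x1)
    for _ in range(max_iter):
        x2 = x1 - g1 * (x1 - x0) / (g1 - g0)
        if abs(x2 - x1) < tol:
            return x2
        x0, g0 = x1, g1
        x1, g1 = x2, df(x2)
    return x1

# Example: f(x) = 2 sin(x) - x^2/10, df(x) = 2 cos(x) - x/5; optimum near 1.4276
x_star = secant_optimize(lambda x: 2*math.cos(x) - x/5, 0.5, 2.0)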
