Amirkabir University of Technology
Dr. Madadi
Indirect Search (Descent) Methods
Gradient of a Function
The gradient of a function $f(\mathbf{X})$ of $n$ variables is the vector of its first partial derivatives:
$$\nabla f = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right)^{\mathsf{T}}$$
If we move along the gradient direction from any point in n-dimensional space, the function value increases at the fastest rate. Hence the gradient direction is called the direction of steepest ascent.
The direction of steepest ascent is a local property, not a global one.
The negative of the gradient vector denotes the direction of steepest descent.
Because of this property, methods that make use of gradient information can be expected to be faster than direct search methods.
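As a brief illustration (my own example, not from the slides), consider $f(x_1, x_2) = x_1^2 + 2x_2^2$ at the point $(1, 1)$:
$$\nabla f = \begin{pmatrix} 2x_1 \\ 4x_2 \end{pmatrix}, \qquad \nabla f(1,1) = \begin{pmatrix} 2 \\ 4 \end{pmatrix}, \qquad -\nabla f(1,1) = \begin{pmatrix} -2 \\ -4 \end{pmatrix}$$
Moving from $(1, 1)$ along $(-2, -4)$ decreases $f$ at the fastest local rate.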
Evaluation of the Gradient
The evaluation of the gradient requires the computation of the partial derivatives $\partial f / \partial x_i$, $i = 1, 2, \ldots, n$. There are three situations where the evaluation of the gradient poses certain problems:
1. The function is differentiable at all the points, but the calculation of the components of the gradient, $\partial f / \partial x_i$, is either impractical or impossible.
2. The expressions for the partial derivatives can be derived, but they require large computational time for evaluation.
3. The gradient is not defined at all the points.
In the first case, we can use the forward finite-difference formula
$$\left. \frac{\partial f}{\partial x_i} \right|_{\mathbf{X}_m} \simeq \frac{f(\mathbf{X}_m + \Delta x_i \, \mathbf{u}_i) - f(\mathbf{X}_m)}{\Delta x_i}, \quad i = 1, 2, \ldots, n \tag{1}$$
to approximate the partial derivative of $f$ at the base point $\mathbf{X}_m$, where $\Delta x_i$ is a small scalar step and $\mathbf{u}_i$ is the unit vector along the $x_i$ axis. If the function value $f(\mathbf{X}_m)$ at the base point is known, this formula requires one additional function evaluation to find $f(\mathbf{X}_m + \Delta x_i \, \mathbf{u}_i)$. Thus it requires $n$ additional function evaluations to evaluate the approximate gradient $\nabla f(\mathbf{X}_m)$. For better results we can use the central finite-difference formula to find the approximate partial derivative:
$$\left. \frac{\partial f}{\partial x_i} \right|_{\mathbf{X}_m} \simeq \frac{f(\mathbf{X}_m + \Delta x_i \, \mathbf{u}_i) - f(\mathbf{X}_m - \Delta x_i \, \mathbf{u}_i)}{2 \, \Delta x_i}, \quad i = 1, 2, \ldots, n \tag{2}$$
The selection of the step size $\Delta x_i$ is important: too large a value introduces truncation error, while too small a value magnifies round-off error.
In the third case, when $f(\mathbf{X})$ is not differentiable at all the points, direct (gradient-free) methods should be used.
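A minimal Python sketch of Eqs. (1) and (2); the function names and the test function are my own illustration, not from the slides:

```python
import numpy as np

def forward_diff_gradient(f, x, dx=1e-6):
    """Approximate the gradient of f at x with Eq. (1); n extra f-evaluations."""
    x = np.asarray(x, dtype=float)
    f0 = f(x)                      # known value at the base point
    grad = np.empty_like(x)
    for i in range(x.size):
        xp = x.copy()
        xp[i] += dx                # step along the i-th unit vector
        grad[i] = (f(xp) - f0) / dx
    return grad

def central_diff_gradient(f, x, dx=1e-5):
    """Approximate the gradient of f at x with Eq. (2); 2n f-evaluations."""
    x = np.asarray(x, dtype=float)
    grad = np.empty_like(x)
    for i in range(x.size):
        xp, xm = x.copy(), x.copy()
        xp[i] += dx
        xm[i] -= dx
        grad[i] = (f(xp) - f(xm)) / (2.0 * dx)
    return grad

# Example: f(x1, x2) = x1^2 + 2*x2^2, exact gradient at (1, 1) is (2, 4)
f = lambda x: x[0]**2 + 2.0 * x[1]**2
print(forward_diff_gradient(f, [1.0, 1.0]))   # ~[2., 4.]
print(central_diff_gradient(f, [1.0, 1.0]))   # ~[2., 4.] (more accurate)
```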
Steepest descent (Cauchy) method
The method uses the negative of the gradient vector as the search direction for minimization.
Start from an initial trial point $\mathbf{X}_1$ and iteratively move along the steepest descent direction:
$$\mathbf{X}_{i+1} = \mathbf{X}_i + \lambda_i^* \mathbf{S}_i = \mathbf{X}_i - \lambda_i^* \nabla f(\mathbf{X}_i) \tag{8}$$
where $\mathbf{S}_i = -\nabla f(\mathbf{X}_i)$ and $\lambda_i^*$ is the optimal step length along $\mathbf{S}_i$. The iterations can be stopped by any of the following criteria:
1. When the change in function value in two consecutive iterations is small:
$$\frac{\left| f(\mathbf{X}_{i+1}) - f(\mathbf{X}_i) \right|}{\left| f(\mathbf{X}_i) \right|} \leq \varepsilon_1 \tag{9}$$
2. When the partial derivatives (components of the gradient) of $f$ are small:
$$\left| \frac{\partial f}{\partial x_i} \right| \leq \varepsilon_2, \quad i = 1, 2, \ldots, n \tag{10}$$
3. When the change in the design vector in two consecutive iterations is small:
$$\left\| \mathbf{X}_{i+1} - \mathbf{X}_i \right\| \leq \varepsilon_3 \tag{11}$$
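A minimal steepest descent sketch in Python; this is illustrative only, and it substitutes a simple backtracking line search for the exact minimization of the step length (the test function is my own):

```python
import numpy as np

def steepest_descent(f, grad, x0, eps=1e-6, max_iter=1000):
    """Cauchy's method: move along -grad(f) with a backtracking line search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:        # gradient-based stopping criterion
            break
        s = -g                              # steepest descent direction, Eq. (8)
        lam, fx = 1.0, f(x)
        # Backtracking: halve the step until the function value decreases
        while f(x + lam * s) >= fx and lam > 1e-12:
            lam *= 0.5
        x = x + lam * s
    return x

# Example: minimize f(x1, x2) = x1^2 + 2*x2^2 (minimum at the origin)
f = lambda x: x[0]**2 + 2.0 * x[1]**2
grad = lambda x: np.array([2.0 * x[0], 4.0 * x[1]])
print(steepest_descent(f, grad, [1.0, 1.0]))   # ~[0., 0.]
```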
Newton’s method
Newton’s method is a second-order iterative method: it uses both the gradient and the Hessian matrix $[J_i] = [J]|_{\mathbf{X}_i}$ of the objective function, taking the iterates
$$\mathbf{X}_{i+1} = \mathbf{X}_i - [J_i]^{-1} \nabla f_i$$
The Hessian $[J_i]$ must be nonsingular for the step to be defined.
The method sometimes may not converge, and it may converge to saddle points and relative maxima.
The method has the following drawbacks:
1. It requires the storing of the $n \times n$ matrix $[J_i]$.
2. It becomes very difficult and sometimes impossible to compute the elements of the matrix $[J_i]$.
3. It requires the inversion of the matrix $[J_i]$ at each step.
4. It requires the evaluation of the quantity $[J_i]^{-1} \nabla f_i$ at each step.
These features make the method impractical for problems involving a complicated objective function with a large number of variables.
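A minimal Newton iteration sketch in Python; as an illustration it solves the linear system rather than forming the inverse explicitly, and the test function is my own:

```python
import numpy as np

def newton_method(grad, hess, x0, eps=1e-8, max_iter=100):
    """Newton's method: X_{i+1} = X_i - [J_i]^{-1} grad(f_i).

    Solving [J_i] d = grad(f_i) avoids an explicit matrix inverse;
    [J_i] must be nonsingular at each iterate.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            break
        d = np.linalg.solve(hess(x), g)   # Newton step direction
        x = x - d
    return x

# Example: f(x1, x2) = x1^2 + 2*x2^2; Newton converges in one step for quadratics
grad = lambda x: np.array([2.0 * x[0], 4.0 * x[1]])
hess = lambda x: np.array([[2.0, 0.0], [0.0, 4.0]])
print(newton_method(grad, hess, [1.0, 1.0]))   # ~[0., 0.]
```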
Marquardt method
The steepest descent method reduces the function value efficiently when the design vector is far from the optimum point $\mathbf{X}^*$.
The Newton method, on the other hand, converges fast when the design vector is close to the optimum point $\mathbf{X}^*$.
The Marquardt method combines the advantages of both the steepest descent and Newton methods.
It does so by modifying the diagonal elements of the Hessian matrix as
$$[\tilde{J}_i] = [J_i] + \alpha_i [I]$$
where $[I]$ is the identity matrix and $\alpha_i$ is a positive constant that is taken large initially (so the step resembles steepest descent) and is reduced as the iterations proceed (so the step approaches the Newton step).
The iterative process can be described as follows:
1. Start with an arbitrary initial point $\mathbf{X}_1$ and constants $\alpha_1$ (on the order of $10^4$), $c_1$ ($0 < c_1 < 1$), $c_2$ ($c_2 > 1$), and $\varepsilon$ (on the order of $10^{-2}$). Set the iteration number as $i = 1$.
2. Compute the gradient of the function, $\nabla f_i = \nabla f(\mathbf{X}_i)$.
3. Test for optimality of the point $\mathbf{X}_i$: if $\|\nabla f_i\| \leq \varepsilon$, $\mathbf{X}_i$ is optimum and hence stop the process. Otherwise, go to step 4.
4. Find the new vector $\mathbf{X}_{i+1}$ as
$$\mathbf{X}_{i+1} = \mathbf{X}_i - \left[ [J_i] + \alpha_i [I] \right]^{-1} \nabla f_i \tag{12}$$
5. Compare the values of $f_{i+1} = f(\mathbf{X}_{i+1})$ and $f_i = f(\mathbf{X}_i)$. If $f_{i+1} < f_i$, go to step 6. If $f_{i+1} \geq f_i$, go to step 7.
6. Set $\alpha_{i+1} = c_1 \alpha_i$, $i = i + 1$, and go to step 2.
7. Set $\alpha_i = c_2 \alpha_i$ and go to step 4.
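A minimal Python sketch of the algorithm above; the constant values follow the "on the order of" guidance in step 1, and the test function is my own illustration:

```python
import numpy as np

def marquardt(f, grad, hess, x0, alpha=1e4, c1=0.25, c2=2.0,
              eps=1e-2, max_iter=1000):
    """Marquardt method: damped Newton step with an adaptive diagonal shift."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:        # step 3: optimality test
            break
        while True:
            # Step 4, Eq. (12): solve ([J_i] + alpha_i [I]) d = grad(f_i)
            d = np.linalg.solve(hess(x) + alpha * np.eye(n), g)
            x_new = x - d
            if f(x_new) < f(x):             # step 5: did the value improve?
                alpha *= c1                  # step 6: reduce damping, accept
                x = x_new
                break
            alpha *= c2                      # step 7: increase damping, retry
    return x

# Example: minimize f(x1, x2) = x1^2 + 2*x2^2
f = lambda x: x[0]**2 + 2.0 * x[1]**2
grad = lambda x: np.array([2.0 * x[0], 4.0 * x[1]])
hess = lambda x: np.array([[2.0, 0.0], [0.0, 4.0]])
print(marquardt(f, grad, hess, [1.0, 1.0]))   # near [0., 0.]
```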