Global Optimisation
Optimisation methods aim to find the values of a set of related variables in the objective
function that produce the minimum or maximum value, as required. There are two types of
objective function: deterministic and stochastic. When the objective function is a calculated
value in the model (deterministic), we simply find the combination of parameter values that
optimises this calculated value. When the objective function is a simulated random variable, we
need to decide on some statistical measure associated with that variable that should be
optimised; the optimisation algorithm must then run a simulation for each set of
decision-variable values and record the statistic. Many optimisation methods are available in
the literature and implemented in commercial software.
The history of global optimisation begins with the simulation-based optimisation research of
the 1970s, prompted by the invention of the genetic algorithm by John Holland [1]. A genetic
algorithm is a class of population-based adaptive stochastic optimisation procedures,
characterised by randomness in the optimisation process. The randomness may be present as
noise in measurements, as Monte Carlo randomness in the search procedure, or both. The basic
idea behind the genetic algorithm is to mimic a simple picture of Darwinian natural selection in
order to find a good solution, repeatedly applying operations such as mutation, selection, and
evaluation of fitness.
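These repeated operations can be sketched in a few lines. The following minimal real-coded genetic algorithm is only illustrative: the truncation selection, blend crossover, population size and mutation rate are assumptions for the sketch, not details taken from [1].

```python
import random

def genetic_algorithm(fitness, bounds, pop_size=30, generations=100,
                      mutation_rate=0.1, seed=0):
    """Minimise `fitness` over a box via selection, crossover and mutation."""
    rng = random.Random(seed)
    lo, hi = bounds
    dim = len(lo)
    # Initial random population.
    pop = [[rng.uniform(lo[j], hi[j]) for j in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)                      # evaluation of fitness
        parents = pop[:pop_size // 2]              # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            p1, p2 = rng.sample(parents, 2)
            a = rng.random()
            child = [a * x + (1 - a) * y for x, y in zip(p1, p2)]  # crossover
            if rng.random() < mutation_rate:       # mutation: random reset of one gene
                j = rng.randrange(dim)
                child[j] = rng.uniform(lo[j], hi[j])
            children.append(child)
        pop = parents + children
    return min(pop, key=fitness)

# Example: minimise the sphere function on [-5, 5]^2.
best = genetic_algorithm(lambda x: sum(v * v for v in x),
                         ([-5.0, -5.0], [5.0, 5.0]))
```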
A little later, in 1978, Aimo Törn introduced his Clustering algorithm of global
optimisation [2]. The method improves upon earlier local search algorithms that needed
multiple starts from several points distributed over the whole optimisation region. The
Clustering algorithm avoids the drawback of Multi-start (in which many starting points
converge to the same minimum) by avoiding this repeated determination of local minima.
This is realised in three steps, which may be used iteratively: (1) sample points in the region of
interest, (2) transform the sample to obtain points grouped around the local minima, and (3)
use a clustering technique to group these points. Starting a single local optimisation from each
cluster determines the local minima and thus also the global minimum.
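The three steps can be sketched as follows. The sample-reduction rule (keep the best fraction of the sample), the greedy distance-based clustering and the simple pattern-search local optimiser are simplifying assumptions for illustration, not the specific choices of [2].

```python
import random

def local_descent(f, x, step=0.1, iters=200):
    """Very simple pattern-search descent (stands in for a local optimiser)."""
    for _ in range(iters):
        improved = False
        for j in range(len(x)):
            for d in (step, -step):
                y = list(x)
                y[j] += d
                if f(y) < f(x):
                    x, improved = y, True
        if not improved:
            step /= 2.0
    return x

def clustering_global_min(f, bounds, n_samples=200, keep=0.2, radius=1.0, seed=1):
    rng = random.Random(seed)
    lo, hi = bounds
    # (1) Sample points in the region of interest.
    pts = [[rng.uniform(lo[j], hi[j]) for j in range(len(lo))]
           for _ in range(n_samples)]
    # (2) Concentrate the sample around the local minima: keep the best fraction.
    pts.sort(key=f)
    pts = pts[:max(2, int(keep * n_samples))]
    # (3) Greedy distance-based clustering; one local search per cluster.
    clusters = []
    for p in pts:
        for c in clusters:
            if sum((a - b) ** 2 for a, b in zip(p, c[0])) ** 0.5 < radius:
                c.append(p)
                break
        else:
            clusters.append([p])
    minima = [local_descent(f, c[0]) for c in clusters]
    return min(minima, key=f)

# Two local minima; the global one lies near x = -3.1.
best = clustering_global_min(lambda x: (x[0] ** 2 - 9) ** 2 / 10 + x[0],
                             ([-5.0], [5.0]))
```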
A little later, in 1983, another global optimisation algorithm, the Simulated annealing
method, was proposed by Kirkpatrick et al. to mimic the annealing process in metallurgy [3,4].
In the annealing process a metal in the molten state (at very high temperature) is slowly cooled so
that the material settles into a low-energy, near-crystalline state; the algorithm likewise accepts
occasional uphill moves with a probability that decreases as an artificial temperature is lowered.
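A minimal sketch of simulated annealing, assuming a Metropolis acceptance rule and a geometric cooling schedule; the schedule parameters and the Gaussian neighbourhood move are illustrative choices, not taken from [3,4].

```python
import math
import random

def simulated_annealing(f, x0, t0=10.0, cooling=0.95, steps_per_t=50,
                        t_min=1e-3, step=0.5, seed=0):
    """Minimise f, accepting uphill moves with probability exp(-delta/T)."""
    rng = random.Random(seed)
    x, fx = x0[:], f(x0)
    best, fbest = x[:], fx
    t = t0
    while t > t_min:
        for _ in range(steps_per_t):
            y = [v + rng.gauss(0.0, step) for v in x]   # random neighbour
            fy = f(y)
            delta = fy - fx
            # Metropolis criterion: always accept downhill, sometimes uphill.
            if delta < 0 or rng.random() < math.exp(-delta / t):
                x, fx = y, fy
                if fx < fbest:
                    best, fbest = x[:], fx
        t *= cooling                                    # slow cooling
    return best

# Example: a multimodal 1-D function with global minimum at x = 0.
best = simulated_annealing(lambda x: x[0] ** 2 + 10 * (1 - math.cos(x[0])),
                           [4.0])
```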
The method of differential evolution (DE), another global optimisation method, grew out of
Kenneth Price's attempts to solve a Chebyshev polynomial fitting problem in 1996 [8]. The
crucial idea behind DE is a scheme for generating trial parameter vectors. Initially, a population
of points (p points in d-dimensional space) is generated and evaluated for fitness. Then, for each
point pi, three different points pa, pb and pc are randomly chosen from the population and
combined into a mutant vector pz (typically pz = pa + F·(pb − pc) for a scale factor F). This
vector pz is subjected to crossover with the current point pi with a crossover probability cr,
yielding a candidate point pu, which is evaluated and, if found better than pi, replaces it in the
next generation.
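The scheme described above (often labelled DE/rand/1/bin) can be sketched as follows; the scale factor F = 0.8 and crossover probability cr = 0.9 are assumed illustrative values.

```python
import random

def differential_evolution(f, bounds, pop_size=20, generations=150,
                           F=0.8, cr=0.9, seed=0):
    """DE/rand/1/bin: mutate with pa + F*(pb - pc), then binomial crossover."""
    rng = random.Random(seed)
    lo, hi = bounds
    d = len(lo)
    pop = [[rng.uniform(lo[j], hi[j]) for j in range(d)] for _ in range(pop_size)]
    cost = [f(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Choose three distinct points pa, pb, pc, all different from pi.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            pz = [pop[a][j] + F * (pop[b][j] - pop[c][j]) for j in range(d)]
            # Binomial crossover between pz and the current point pi.
            jrand = rng.randrange(d)   # guarantees at least one mutant component
            pu = [pz[j] if (rng.random() < cr or j == jrand) else pop[i][j]
                  for j in range(d)]
            fu = f(pu)
            if fu < cost[i]:           # greedy selection
                pop[i], cost[i] = pu, fu
    ibest = min(range(pop_size), key=lambda i: cost[i])
    return pop[ibest], cost[ibest]

# Example: minimise the 2-D sphere function.
best_x, best_f = differential_evolution(lambda x: sum(v * v for v in x),
                                        ([-5.0, -5.0], [5.0, 5.0]))
```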
All population-based methods of global optimisation have a probabilistic nature inherent in
them. As a result, one cannot obtain certainty in their results unless they are permitted an
indefinitely large number of search attempts: the larger the number of attempts, the greater the
probability that they will find the global optimum, but even then certainty is not reached.
Secondly, all of them adapt themselves to the surface on which they seek the global optimum.
Each of these methods operates with a number of parameters that may be changed at will to
make it more effective. This choice is often problem-oriented, and for obvious reasons: a
particular choice may be extremely effective in a few cases but ineffective (or
counterproductive) in certain others. Additionally, there is a trade-off among those parameters.
These features make all these methods a subject of trial and error.
For i = 0, ..., k, the best-so-far point can be updated at each step k as follows:

x_best^(k) = x^(k)            if f(x^(k)) < f(x_best^(k-1))
x_best^(k) = x_best^(k-1)     otherwise
The PSO was developed by Kennedy and Eberhart [7]. It mimics the natural behaviour of a
swarm, e.g. a swarm of birds, searching for a food source. In this interpretation the search for the
best available food source (i.e. the optimum) is navigated based on the memory of each
particle of the swarm (a bird) as well as the knowledge of the swarm as a whole. PSO is
naturally applicable to continuous design spaces. In the past several years, PSO has been
successfully applied in many research and application areas, and the explosive number of
journal publications on the topic suggests that PSO can obtain good results faster and more
cheaply than other methods. Each particle represents a possible solution to the optimisation
task at hand. During each iteration, the acceleration direction of a particle is determined by its
own best solution found so far and the global best position discovered so far by any particle in
the swarm. This means that if a particle discovers a promising new solution, all the other
particles will move closer to it, exploring the region more thoroughly in the process. The basic
elements of standard PSO are briefly stated and defined as follows.
Particle Xi(t), i = 1, ..., n: a potential solution represented by an n-dimensional vector,
where n is the number of optimisation variables.
Swarm: a disorganised population of moving particles that tend to cluster together, while
each particle seems to be moving in a random direction.
Individual best position Pi(t), i = 1, ..., n: as a particle moves through the search space, it
compares the fitness value at its current position with the best fitness value it has attained
previously.
Global best position Pg(t): the best position among all individual best positions achieved
so far.
Particle velocity Vi(t), i = 1, ..., n: the velocity of a moving particle, represented by an
n-dimensional vector. The particle's velocity is updated according to the individual best and
global best positions; after the velocity update, each particle's position is advanced to the next
generation.
In a PSO of swarm size M, each individual is treated as a volumeless particle in n-dimensional
space, with the position vector Xi(t) and velocity vector Vi(t) of particle i at time t represented as

X_i(t) = (X_i1(t), X_i2(t), ..., X_in(t))                                  (4.1)

V_ij(t+1) = w·V_ij(t) + c1·r1·(P_ij − X_ij(t)) + c2·r2·(P_gj − X_ij(t))    (4.2)

where the first term, weighted by the inertia factor w, carries the particle's current velocity, the
second (self-confidence) term is the particle-memory influence, and the third
(swarm-confidence) term is the swarm influence. The position is then updated as

X_ij(t+1) = X_ij(t) + V_ij(t+1)
for i = 1, 2, ..., M and j = 1, 2, ..., n. Parameters c1 and c2 are called acceleration coefficients and
satisfy c1 + c2 ≤ 4 to guarantee the convergence of the particles. Parameter w is the
inertia weight, introduced to accelerate the convergence speed of the PSO. Vector
Pi = (Pi1, Pi2, ..., PiD) is the best previous position (the position giving the best fitness value)
experienced by particle i and is denoted pbest. Vector Pg = (Pg1, Pg2, ..., PgD) is the position of
the best particle (with the best fitness value) among all particles in the swarm and is denoted
gbest. r1 and r2 are two different random numbers uniformly distributed within (0, 1). Empirical
studies show that the PSO performs well when w varies linearly from 0.9 to 0.4 over the run.
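A minimal sketch of the standard PSO described above, with w decreasing linearly from 0.9 to 0.4 over the run; the choice c1 = c2 = 2 (satisfying c1 + c2 ≤ 4) and the test function are assumptions for the sketch.

```python
import random

def pso(f, bounds, M=30, n_iter=200, c1=2.0, c2=2.0, seed=0):
    """Standard PSO using the velocity update of equation (4.2)."""
    rng = random.Random(seed)
    lo, hi = bounds
    n = len(lo)
    X = [[rng.uniform(lo[j], hi[j]) for j in range(n)] for _ in range(M)]
    V = [[0.0] * n for _ in range(M)]
    P = [x[:] for x in X]                       # pbest positions
    Pval = [f(x) for x in X]
    g = min(range(M), key=lambda i: Pval[i])    # index of gbest
    for t in range(n_iter):
        w = 0.9 - 0.5 * t / (n_iter - 1)        # inertia weight 0.9 -> 0.4
        for i in range(M):
            for j in range(n):
                r1, r2 = rng.random(), rng.random()
                V[i][j] = (w * V[i][j]
                           + c1 * r1 * (P[i][j] - X[i][j])      # particle memory
                           + c2 * r2 * (P[g][j] - X[i][j]))     # swarm influence
                X[i][j] += V[i][j]
            fx = f(X[i])
            if fx < Pval[i]:                    # update pbest
                P[i], Pval[i] = X[i][:], fx
                if fx < Pval[g]:                # update gbest
                    g = i
    return P[g], Pval[g]

# Example: minimise the 2-D sphere function on [-10, 10]^2.
gbest, gval = pso(lambda x: sum(v * v for v in x),
                  ([-10.0, -10.0], [10.0, 10.0]))
```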
There are many other improved versions of PSO in the literature, such as the co-evolutionary
particle swarm [12], hybrid particle swarm optimisation [13] and quantum-behaved PSO.
Quantum-behaved particle swarm optimisation (QPSO), which can be theoretically guaranteed
to find the optimal solution in the search space, has few control parameters [14-17].
In the quantum model of PSO, the state of a particle is represented by a wave function
ψ(x, t) obeying the Schrödinger equation [15]. The dynamic behaviour of the particle is widely
divergent from that of the classical PSO particle. To guarantee convergence of the QPSO
algorithm, each particle must converge to its local attractor pi = (pi,1, pi,2, ..., pi,n), whose
coordinates are defined as

p_i,j = (c1·P_i,j + c2·P_g,j) / (c1 + c2),   j = 1, 2, ..., n              (4.3)

or

p_i,j = r1·P_i,j + (1 − r1)·P_g,j,   where r1 is a uniform random number.
We can obtain the position of the particle using the following equation:

X_i,j = p_i,j ± (L_i,j / 2)·ln(1/r2)

where r2 is a uniform random number and p_i,j are the coordinates of the local attractor. To
evaluate L_i,j(t), a global point called the mean best position of the population is introduced
into PSO:

m(t) = (m_1(t), m_2(t), ..., m_n(t))
     = ( (1/M)·Σ_{i=1..M} P_i,1(t), (1/M)·Σ_{i=1..M} P_i,2(t), ..., (1/M)·Σ_{i=1..M} P_i,n(t) )   (4.4)
where M is the population size and Pi is the pbest position of particle i. The value of
L_i,j(t) is determined by

L_i,j(t) = 2β·| m_j(t) − X_i,j(t) |                                        (4.5)

and the position of the particle X_i,j is calculated as

X_i,j(t+1) = p_i,j(t) ± β·| m_j(t) − X_i,j(t) |·ln(1/r3),   r3 a uniform random number   (4.6)

The parameter β is called the contraction-expansion coefficient, which can be tuned to control
the convergence speed of the algorithm. The value of β is changed linearly from 1 to 0.5 while
the algorithm is running.
The QPSO algorithm is given below.
1. Initialise the population of particles with random positions inside the problem domain.
2. Determine the mean best position m(t) among the particles.
3. Evaluate the objective function (e.g. minimisation) for each particle and compare with its
previous best value. If the current value is better than the previous best value, then set the best
value to the current value, i.e. if f(Xi) < f(Pi), then Pi = Xi.
4. Determine the current global best among the particles' best positions, i.e.
gbest = arg min_{1 ≤ i ≤ M} f(Pi), where M is the population size.
5. Compare the current global position with the previous global best position. If the current
global position is better than the previous global position, set the global best position to the
current global position.
6. For each dimension of the particle, obtain a stochastic point between Pij and Pgj:
p_j = φ·P_ij + (1 − φ)·P_gj,   φ ~ U(0, 1)
where U(0, 1) is a random number between 0 and 1.
7. Update the position by the stochastic equation
X_ij = p_j ± β·| mbest_j − X_ij |·ln(1/u),   where u ~ U(0, 1)
and the parameter β is called the contraction-expansion coefficient.
8. Repeat steps 2-7 until a stopping criterion is satisfied or a pre-specified number of iterations
is completed.
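Steps 1-8 can be sketched as follows. The sphere test function, and choosing the ± sign in step 7 with probability 1/2, are the usual (assumed) conventions for the sketch.

```python
import math
import random

def qpso(f, bounds, M=30, n_iter=200, seed=0):
    """Quantum-behaved PSO following steps 1-8 (beta decreases 1.0 -> 0.5)."""
    rng = random.Random(seed)
    lo, hi = bounds
    n = len(lo)
    X = [[rng.uniform(lo[j], hi[j]) for j in range(n)] for _ in range(M)]
    P = [x[:] for x in X]                        # pbest positions
    Pval = [f(x) for x in X]
    g = min(range(M), key=lambda i: Pval[i])     # gbest index
    for t in range(n_iter):
        beta = 1.0 - 0.5 * t / (n_iter - 1)      # contraction-expansion coeff.
        # Step 2: mean best position of the population.
        mbest = [sum(P[i][j] for i in range(M)) / M for j in range(n)]
        for i in range(M):
            for j in range(n):
                phi = rng.random()
                p = phi * P[i][j] + (1 - phi) * P[g][j]   # step 6: local attractor
                u = 1.0 - rng.random()                    # uniform in (0, 1]
                step = beta * abs(mbest[j] - X[i][j]) * math.log(1.0 / u)
                X[i][j] = p + step if rng.random() < 0.5 else p - step
            fx = f(X[i])
            if fx < Pval[i]:                     # steps 3-5: update bests
                P[i], Pval[i] = X[i][:], fx
                if fx < Pval[g]:
                    g = i
    return P[g], Pval[g]

# Example: minimise the 2-D sphere function.
gbest, gval = qpso(lambda x: sum(v * v for v in x),
                   ([-10.0, -10.0], [10.0, 10.0]))
```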
In the Gaussian-distributed variant of QPSO, the update

x_i(t+1) = p ± β·| mbest_i − x_i(t) |·ln(1/u)

is replaced by

x_i(t+1) = p ± β·| mbest_i − x_i(t) |·ln(1/G),   where G = |N(0, 1)|

and the local attractor

p_i,j = (c1·P_i,j + c2·P_g,j) / (c1 + c2)

is replaced by

p_i,j = (G·P_i,j + g·P_g,j) / (G + g),   where G, g = |N(0, 1)|
The Zaslavskii map generates a chaotic sequence via

z(t) = cos(2π·y(t−1)) + e^{−r}·z(t−1)

where the companion variable y(t) is taken modulus 1 ("mod" being the modulus after
division). The Zaslavskii map has a strange attractor with its largest Lyapunov exponent for
v = 400, r = 3 and a = 12.6695; in this case the values of z(t) lie in [−1.0512, 1.0512]. A
QPSO approach using chaotic sequences based on the Zaslavskii map can be more capable of
escaping from local optima than random-number sequences [17]. In this approach, the
parameter u of the QPSO term

x_i(t+1) = p ± β·| mbest_i − x_i(t) |·ln(1/u)

is replaced as follows:

if r < p_m(t) then
    x_i(t+1) = p ± β·| mbest_i − x_i(t) |·ln(1/u)
else
    x_i(t+1) = p ± β·| mbest_i − x_i(t) |·ln(1/z_j(t))

where z_j(t) is obtained by scaling the chaotic value z_{2,j}(t) by the factor α and mapping it
into the interval [l_j, u_j]. Here p_m(t) is the mutation rate, r is a value generated according to
a uniform probability distribution in the range [0, 1], z_j(t) is the value generated for each
design variable j (j = 1, 2, ..., n) of the n-dimensional optimisation problem, α is a scale
factor equal to 1.0512, u_j and l_j are the upper and lower values of the j-th design variable,
and t is the current generation.
Selection of two parents x^i = (x^i_1, x^i_2, x^i_3, ..., x^i_n) and x^j = (x^j_1, x^j_2, x^j_3, ..., x^j_n)
for crossover is performed by a nonlinear ranking selection procedure [19]. In this procedure,
the chromosomes are ranked by fitness and the selection probability of the chromosome with
rank i is

p_i = q'·(1 − q)^{i−1},   where q' = q / (1 − (1 − q)^N)

N being the population size and q the selection probability of the best chromosome. After the
selection probability of each chromosome is determined, roulette-wheel selection is adopted to
select the better chromosomes. This kind of selection procedure uses neither the individual
chromosomes' raw fitness nor fitness scaling, which can prevent premature convergence. After
the selection, the offspring x̃^i, x̃^j are created using the following scheme [20-22]:

x̃^i = a·x^i + (1 − a)·x^j
x̃^j = a·x^j + (1 − a)·x^i

where a is a random number between -0.5 and 1.5.
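The ranking selection and the crossover scheme above can be sketched as follows; the value q = 0.2 and the sphere-based ranking of the toy population are illustrative assumptions.

```python
import random

def ranking_probabilities(N, q=0.2):
    """Nonlinear ranking: p_i = q' * (1 - q)**(i - 1) for ranks i = 1..N (best first)."""
    qp = q / (1.0 - (1.0 - q) ** N)   # q' normalises the probabilities to sum to 1
    return [qp * (1.0 - q) ** (i - 1) for i in range(1, N + 1)]

def roulette_select(rng, ranked_pop, probs):
    """Roulette-wheel selection over the ranked population."""
    r, acc = rng.random(), 0.0
    for individual, p in zip(ranked_pop, probs):
        acc += p
        if r <= acc:
            return individual
    return ranked_pop[-1]

def arithmetic_crossover(rng, xi, xj):
    """Offspring a*xi + (1-a)*xj and a*xj + (1-a)*xi with a in [-0.5, 1.5]."""
    a = rng.uniform(-0.5, 1.5)
    child1 = [a * u + (1 - a) * v for u, v in zip(xi, xj)]
    child2 = [a * v + (1 - a) * u for u, v in zip(xi, xj)]
    return child1, child2

rng = random.Random(0)
# Toy population of 10 three-dimensional chromosomes, ranked best-first.
pop = sorted(([rng.uniform(-5, 5) for _ in range(3)] for _ in range(10)),
             key=lambda x: sum(v * v for v in x))
probs = ranking_probabilities(len(pop))
p1 = roulette_select(rng, pop, probs)
p2 = roulette_select(rng, pop, probs)
c1, c2 = arithmetic_crossover(rng, p1, p2)
```

Note that the two offspring always satisfy c1 + c2 = p1 + p2 componentwise, so the crossover redistributes, rather than creates, mass along the line through the parents.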
Mutation
A widely used mutation operator in real-coded genetic algorithms is non-uniform mutation
[23]. The mutation scheme is as follows. From a chromosome x^i = (x^i_1, x^i_2, x^i_3, ..., x^i_n),
the mutated chromosome x^{i+1} = (x^{i+1}_1, x^{i+1}_2, x^{i+1}_3, ..., x^{i+1}_n) is created as
follows:

x^{i+1}_j = x^i_j + Δ(i, x^u_j − x^i_j)    if r < 0.5
x^{i+1}_j = x^i_j − Δ(i, x^i_j − x^l_j)    otherwise

where i is the current generation number and r is a uniformly distributed random number in
[0, 1]. x^u_j and x^l_j are the upper and lower bounds of the j-th component of the mutated
chromosome, respectively. The function Δ(i, y), given below, takes values in the interval [0, y]:

Δ(i, y) = y·(1 − u^{(1 − i/MaxIter)^b})

where u is a uniformly distributed random number in the interval [0, 1], MaxIter is the
maximum number of iterations and b is a parameter determining the strength of the mutation
operator. In Romara we set b = 5.
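A sketch of the non-uniform mutation operator with b = 5; the bounds, test chromosome and MaxIter value are illustrative. Early in the run the operator makes large moves, and near MaxIter its moves shrink toward zero.

```python
import random

def delta(gen, y, max_iter, b=5.0, rng=random):
    """Delta(i, y) = y * (1 - u**((1 - i/MaxIter)**b)), a value in [0, y]."""
    u = rng.random()
    return y * (1.0 - u ** ((1.0 - gen / max_iter) ** b))

def non_uniform_mutation(x, lower, upper, gen, max_iter, b=5.0, seed=0):
    """Mutate each component up or down with probability 1/2, staying in bounds."""
    rng = random.Random(seed)
    mutated = []
    for j, xj in enumerate(x):
        if rng.random() < 0.5:
            xj = xj + delta(gen, upper[j] - xj, max_iter, b, rng)
        else:
            xj = xj - delta(gen, xj - lower[j], max_iter, b, rng)
        mutated.append(xj)
    return mutated

x = [0.0, 2.0, -1.0]
lo, hi = [-5.0, -5.0, -5.0], [5.0, 5.0, 5.0]
early = non_uniform_mutation(x, lo, hi, gen=1, max_iter=100)    # large moves
late = non_uniform_mutation(x, lo, hi, gen=99, max_iter=100)    # tiny moves
```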
Local Technique
This technique helps to concentrate the points in the region S around the global minimum
[21]. The procedure of the local technique is as follows.
(1) Select a random number λ_j in [-0.5, 1.5] for each j.
(2) Generate a trial point x with components

x_j = (1 + λ_j)·x^best_j − λ_j·y_j,   j = 1, 2, ..., n

where x^best_j is the j-th component of the best chromosome x^best and y_j is the j-th
component of a second point y in S.
(3) Replace the worst point x^worst in S with x, if f(x) < f(x^worst).
Laplace Crossover
The Laplace density function is

f(x) = (1/(2b))·exp(−|x − a|/b)

and the corresponding distribution function is

F(x) = (1/2)·exp((x − a)/b)          if x ≤ a
F(x) = 1 − (1/2)·exp(−(x − a)/b)     if x > a

where a is called the location parameter and b > 0 is the scale parameter. A
Laplace-distributed number β is generated from a uniform random number u as

β = a − b·log(u),   u ≤ 1/2
β = a + b·log(u),   u > 1/2

and offspring are generated from a pair of parents using β. For smaller values of b, offspring
are likely to be produced near the parents, and for larger values of b, offspring are expected to
be produced far from the parents. In this way the Laplace crossover operator exhibits
self-adaptive behaviour []. Another crossover operator in the literature is the heuristic
crossover operator [].
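A sketch of the Laplace crossover. Two assumptions are made explicit here: the side of the Laplace tail is chosen with a separate uniform draw (the standard way to realise the two-case rule above), and the offspring rule y = x + β·|x1 − x2|, placing both offspring at a Laplace-distributed distance from the parents, follows the usual form of this operator, which the text appears to reference.

```python
import math
import random

def laplace_beta(rng, a=0.0, b=0.5):
    """Laplace(a, b) sample: an exponential tail placed on a random side of a."""
    u = rng.random() or 1e-12          # avoid log(0)
    r = rng.random()                   # separate draw chooses the side
    return a - b * math.log(u) if r <= 0.5 else a + b * math.log(u)

def laplace_crossover(rng, x1, x2, a=0.0, b=0.5):
    """Offspring at distance beta*|x1 - x2| from each parent (assumed rule)."""
    beta = laplace_beta(rng, a, b)
    y1 = [p + beta * abs(p - q) for p, q in zip(x1, x2)]
    y2 = [q + beta * abs(p - q) for p, q in zip(x1, x2)]
    return y1, y2

rng = random.Random(0)
x1, x2 = [1.0, 2.0], [1.5, 2.5]
# Small b concentrates offspring near the parents; large b spreads them out.
near = [laplace_crossover(rng, x1, x2, b=0.1) for _ in range(200)]
far = [laplace_crossover(rng, x1, x2, b=2.0) for _ in range(200)]
mean_near = sum(abs(y1[0] - x1[0]) for y1, _ in near) / 200
mean_far = sum(abs(y1[0] - x1[0]) for y1, _ in far) / 200
```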
Constrained Optimisation
The general constrained optimisation problem is

min f(x),   x ∈ R^n
subject to  g_i(x) ≤ 0,   i = 1, 2, ..., m
            h_j(x) = 0,   j = 1, 2, ..., k
            a_i ≤ x_i ≤ b_i,   1 ≤ i ≤ n                                   (*)
x = (x_1, x_2, ..., x_n)

where f(x) is the objective function, g_i(x) and h_j(x) are the inequality and equality
constraints respectively, and a_i and b_i are the search-space lower and upper bounds
respectively for x_i. This formulation of the constraints is not restrictive, since an inequality
constraint of the form g_i(x) ≥ 0 can also be represented as −g_i(x) ≤ 0, and the equality
constraint h_j(x) = 0 is equivalent to the two inequality constraints h_j(x) ≤ 0 and
h_j(x) ≥ 0. The most common approach to solving constrained optimisation problems is the
use of a penalty function. The purpose of the penalty function is to transform the constrained
non-linear programming (CNLP) problem into an unconstrained NLP (UNLP) problem by
building a single objective function that penalises constraint violations; the new single
objective function can then be minimised using an unconstrained optimisation algorithm. This
is the main reason for the popularity of the penalty-function approach. Its drawback is the
difficulty of selecting suitable penalty values: if the penalty values are high, the minimisation
algorithms are usually trapped in local minima, and if the penalty values are low, they can
barely detect feasible optimal solutions.
Penalty Function Method
Various works in the literature have addressed this issue. One modification, recommended by
[14], is to change the penalty values dynamically as the iterations progress. The penalty
function is generally defined as follows [14]:

F(x) = f(x) + h(t)·H(x),   x ∈ R^n

where f(x) is the original objective function of the CNLP problem, h(t) is a dynamically
modified penalty value, t is the algorithm's current iteration number, and H(x) is a penalty
factor defined as

H(x) = Σ_{i=1..m} θ(q_i(x)) · q_i(x)^{γ(q_i(x))}

where q_i(x) = max{0, g_i(x)} is the violation of the i-th constraint described in (*). The
functions h(.), θ(.) and γ(.) are problem dependent. According to [14], γ(q_i(x)) = 1 if
q_i(x) < 1, otherwise γ(q_i(x)) = 2. Additionally, if q_i(x) < 0.001 then θ(q_i(x)) = 10, else
if q_i(x) < 0.1 then θ(q_i(x)) = 20; and the function h(.) is set as

h(t) = t·√t
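The dynamic penalty can be sketched as follows. Note one assumption: the text lists only the first two θ cases from [14], so the final case (θ = 100 for larger violations) is an assumed extension and is marked as such in the code.

```python
import math

def theta(q):
    # Multiplier from [14]; the final case is an assumed extension of the text,
    # which lists only the q < 0.001 and q < 0.1 cases.
    if q < 0.001:
        return 10.0
    if q < 0.1:
        return 20.0
    return 100.0

def gamma(q):
    # Exponent from [14]: 1 for small violations, 2 otherwise.
    return 1.0 if q < 1.0 else 2.0

def penalised(f, gs, x, t):
    """F(x) = f(x) + h(t) * H(x) with h(t) = t * sqrt(t)."""
    H = 0.0
    for g in gs:
        q = max(0.0, g(x))           # violation q_i(x) = max{0, g_i(x)}
        if q > 0.0:
            H += theta(q) * q ** gamma(q)
    return f(x) + t * math.sqrt(t) * H

# Example: minimise x^2 subject to x >= 1, written as g(x) = 1 - x <= 0.
f = lambda x: x[0] ** 2
g = lambda x: 1.0 - x[0]
```

At a feasible point the penalty vanishes, so F(x) = f(x); at an infeasible point the penalty grows with the iteration number t, tightening the constraint as the run proceeds.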
Gradient Repair Method
The gradient repair method proposed by Chootinan and Chen [25] utilises gradient information
obtained from the constraint set to systematically repair infeasible solutions by directing them
toward the feasible region. The steps of the gradient repair method are summarised below.
First, determine the degree of constraint violation V by

V = [ min(0, g(x)) ; h(x) ]

where V consists of the vectors of the m inequality constraints g and the k equality constraints
h for the problem. Next, compute ∇_x V, the gradient of the constraint vector, whose
Moore-Penrose inverse is defined as

∇_x V^+ = ∇_x V^T (∇_x V ∇_x V^T)^{−1}

and repair the solution by the update

x_{t+1} = x_t + ∇_x V^+ · ΔV_t
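A sketch of the repair step, under three stated assumptions: the g(x) ≤ 0 convention of (*) (so the violation of an inequality is max{0, g(x)} and the desired change is its negative), a finite-difference Jacobian in place of analytic gradients, and numpy.linalg.pinv as the Moore-Penrose inverse.

```python
import numpy as np

def num_jac(funcs, x, eps=1e-6):
    """Finite-difference Jacobian of the constraint vector at x."""
    fx = np.array([f(x) for f in funcs])
    J = np.zeros((len(funcs), len(x)))
    for j in range(len(x)):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (np.array([f(xp) for f in funcs]) - fx) / eps
    return J

def gradient_repair(x, ineq, eq=(), max_steps=20, tol=1e-8):
    """Move x toward feasibility: x <- x + pinv(J) @ dV (Moore-Penrose step)."""
    x = np.asarray(x, dtype=float)
    funcs = list(ineq) + list(eq)
    for _ in range(max_steps):
        g = np.array([f(x) for f in ineq])
        h = np.array([f(x) for f in eq])
        V = np.concatenate([np.maximum(0.0, g), h])   # degree of violation
        if np.all(np.abs(V) < tol):
            break                                     # already feasible
        J = num_jac(funcs, x)
        x = x + np.linalg.pinv(J) @ (-V)              # repair step dV = -V
    return x

# Example: repair the infeasible point (3, 1) for g(x) = x0 + x1 - 2 <= 0
# and the equality h(x) = x0 - x1 = 0; the repaired point is near (1, 1).
x_rep = gradient_repair([3.0, 1.0],
                        ineq=[lambda x: x[0] + x[1] - 2.0],
                        eq=[lambda x: x[0] - x[1]])
```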
For the inequality constraint g_j(x) ≤ 0, the fitness level of a point x is

C_j(x) = 1                         if g_j(x) ≤ 0
C_j(x) = 1 − g_j(x)/g_max(x)       if g_j(x) > 0

where g_max(x) = max{g_j(x), j = 1, 2, 3, ..., m} and C_j(x) is the fitness level of point x for
constraint condition (j).
For the equality constraint h_j(x) = 0,

C_j(x) = 1                           if h_j(x) = 0
C_j(x) = 1 − |h_j(x)|/h_max(x)       if h_j(x) ≠ 0

The overall constraint fitness is

Cf(x) = Σ_{j=1..m+k} w_j·C_j(x),   Σ_{j=1..m+k} w_j = 1,   0 < w_j < 1,   j = 1, ..., m+k

where w_j is a randomly generated weight for constraint j. The sum signifies the fitness level
of point x as related to the feasible domain Q. If Cf(x) = 1, it is an indication that x ∈ Q; on
the other hand, if 0 < Cf(x) < 1, a smaller Cf(x) indicates that the solution x lies in the
infeasible domain, further away from Q.
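The constraint-fitness computation can be sketched as follows; equal weights w_j = 1/(m+k) are used as a simplifying assumption in place of the randomly generated weights, and the toy constraints are illustrative.

```python
def constraint_fitness(x, ineq, eq, weights=None):
    """Cf(x) = sum_j w_j * C_j(x); Cf equals 1 exactly when x is feasible."""
    gvals = [g(x) for g in ineq]
    hvals = [h(x) for h in eq]
    gmax = max([gv for gv in gvals if gv > 0], default=0.0)
    hmax = max([abs(hv) for hv in hvals if hv != 0], default=0.0)
    C = []
    for gv in gvals:                          # inequality fitness levels
        C.append(1.0 if gv <= 0 else 1.0 - gv / gmax)
    for hv in hvals:                          # equality fitness levels
        C.append(1.0 if hv == 0 else 1.0 - abs(hv) / hmax)
    m = len(C)
    if weights is None:
        weights = [1.0 / m] * m               # equal weights summing to 1 (assumption)
    return sum(w * c for w, c in zip(weights, C))

# Box 0 <= x0 <= 1 and the equality x1 = 0.
ineq = [lambda x: x[0] - 1.0, lambda x: -x[0]]
eq = [lambda x: x[1]]
cf_feas = constraint_fitness([0.5, 0.0], ineq, eq)     # feasible: Cf = 1
cf_infeas = constraint_fitness([2.0, 1.0], ineq, eq)   # infeasible: Cf < 1
```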
Infeasible degree selection
This method is currently applied in the particle swarm optimisation method. In [12] the author
introduced the following definition of the infeasible degree (IFD):

IFD(x_i) = Σ_{j=1..m} | min(0, g_j(x_i)) | + Σ_{j=1..k} | h_j(x_i) |

The infeasible degree can be considered as the distance between solution x_i and the feasible
region: when x_i is an infeasible solution, the infeasible-degree value grows with the distance
between x_i and the feasible region. To increase the selection pressure as the evolutionary
process advances, the threshold on the infeasible degree is defined as the product of a linearly
decreasing gain and the average infeasible-degree value of the population (IFD_P):
IFD_P = (1/M)·Σ_{i=1..M} IFD(x_i)

T(t) = 0.8·( 1 − t / MaximumGeneration )·IFD_P

where M is the population size and t is the current generation.
The fitness used for selection is then

F(x) = f(x)                               if x is accepted
F(x) = f_max(x) + Σ_{j=1..m} g_j(x)       otherwise

where f_max(x) is the objective-function value of the worst feasible solution in the
population. This technique helps the particles reach the feasible region of the search space,
and properly keeps the infeasible solutions attracting the particles toward the constraint
boundary.
15. Leandro dos Santos Coelho, A Quantum Particle Swarm Optimizer with Chaotic Mutation
Operator, Chaos, Solitons and Fractals, 37, 2008, 1409-1418.
16. Maolong Xi et al., Quantum-behaved Particle Swarm Optimization with Elitist Mean Best
Position, Complex Systems and Applications - Modelling, Control and Simulations, 14(S2),
2007, 1643-1647.
17. Leandro dos Santos Coelho, Gaussian Quantum-behaved Particle Swarm