

Optimizing Particle Swarm Optimization Algorithm

Iraj Koohi
School of Information Technology and Engineering
University of Ottawa
Ottawa, Canada
Email: ikooh012@uottawa.ca

Voicu Z. Groza
School of Information Technology and Engineering
University of Ottawa
Ottawa, Canada
Email: groza@site.uottawa.ca

Abstract—The Particle Swarm Optimization (PSO) algorithm has become increasingly popular in recent years and has been shown to be an effective optimization tool in many applications. In this paper, we apply the PSO algorithm to a sample Artificial Neural Network (ANN) application, measure the improvement, and tune the PSO parameters to improve the results as much as possible. The application is character recognition of English numerals. Two indicators are taken into account: accuracy of the results and processing time. The objective of this paper is to show that the PSO parameters can be adjusted empirically to obtain the best results. Through several iterations of extracting improvements and adjusting the PSO parameters, we record the optimized PSO parameters and their respective variances for similar applications. The method can also be extended to alphabetic characters simply by providing the input training patterns of each character. The details of the proposed approach and the simulation results are reported in this paper.
I. INTRODUCTION

Particle Swarm Optimization (PSO) was first introduced by James Kennedy and Russell Eberhart in 1995 [1]. Recently, PSO has gained attention for its efficiency in solving complicated and hard optimization problems. It is used in several applications such as power allocation in cooperative communication networks [2], evolving artificial neural networks [3], [4], machine learning [5], character recognition [11], and many others. Furthermore, PSO has been found to be robust and fast in solving non-linear, non-differentiable, and multi-modal problems [6].

In the PSO method, a swarm of particles, inspired by the flocking behavior of birds, moves (flies) in an N-dimensional search space toward a better solution. Each particle in the population has two main properties: its current position, called the particle position, and its velocity. Using these parameters, the particles move around the search space according to mathematical formulas and finally converge to the desired position. In order to modify its position, each particle keeps a memory of its best position, called the particle best position (pbest). The particle swarm optimizer also keeps track of the overall best value, and its location, obtained so far by any particle in the population; this is called gbest. While moving around, each particle adjusts its position in the search space according to the best position it has obtained and the position of the known best-fit particle in the entire population. Particles are displaced from their current position by applying a velocity vector to them. The magnitude and direction of the velocity vector at each step is a function of the personal best (pbest) and the group best (gbest) values.

Fig. 1. MODIFICATION OF A SEARCHING POINT BY PSO

Figure 1 shows the concept of the modification of a searching point by the PSO algorithm. The current search point and the current velocity vector are called xk and vk, respectively. Likewise, the modified position and the modified velocity vector are called xk+1 and vk+1, respectively. The modified values of a particle's position and velocity vectors are estimated based on the interaction with other particles and the last best position of the particle itself.

There are two approaches to swarm optimization. The first approach is inspired by the flocking behavior of birds or the sociological behavior of a group of people, while the second approach is inspired by the behavior of social insects such as ant colonies [7]. Both approaches have common properties such as flexibility, robustness, and self-organization. Generally, the number of particles used in experimentation is chosen to be between 10 and 50 [1]. It has been shown in previous studies, such as [8] and [9], that fewer particles may lose the chance of finding the best particle position, while more than 50 particles may disturb the effectiveness of the interaction among the particles. In ANN-based applications, it is recommended to use a number of particles equal to about 10 percent of the total number of connection weights, including the biases of each node, and not more than 50 particles [8], [9].

In this paper, we apply the PSO algorithm to an Artificial Neural Network (ANN) application. We take samples of the English numerals 1 to 5, apply the PSO optimizer to the application, and record the degree of improvement. Next, the PSO parameters are optimized empirically through several iterations of simulation until the improvement is maximized.

The outcome of this paper is the optimized PSO parameters and their respective variances for similar applications. Our improvement over previous studies is to optimize the PSO parameters empirically, in order to show the importance of the parameters involved in the convergence of the PSO algorithm. Computational intelligence is used to optimize the parameters: the software, developed in C++, can predict the optimized value of each parameter automatically within the allowable variances known for each parameter. Without computational intelligence it would be much more difficult, or even impossible, to optimize the parameters. The software is called intelligent because it can detect the best run according to the fitness values of different combinations of PSO parameters.

The rest of this paper is organized as follows. In Section II, we introduce the basic concepts of the PSO algorithm and Artificial Neural Networks (ANN), and we also apply PSO to an ANN application. The implementation of the proposed method is presented in Section III, followed by an analysis of the results in Section IV. Finally, in the last section we conclude the paper and give directions for future work.

II. BACKGROUND

In this section, we present the structure of the PSO algorithm and provide background on Artificial Neural Networks. The aim of this section is to introduce the basic concepts, which are later used to build up the rest of the work in this paper.

A. Description of the PSO Algorithm

The PSO algorithm maintains a set of candidate solutions in the search space. The algorithm iteratively evaluates the fitness of each solution with the objective function being optimized. Each particle in the swarm represents a candidate solution to the optimization problem. In the first stage of the algorithm, candidate solutions are selected randomly within the search space, which is composed of all possible solutions. The PSO algorithm has no prior knowledge of the objective function, and thus the search is to determine which solution is near the local or the global optimum. The PSO algorithm simply uses the objective function to evaluate its candidate solutions, and operates on the resulting fitness values. At iteration k the position of particle i is denoted by x_i^k. Particle i moves in the space according to its velocity v_i^k. The position of each particle is updated by:

x_i^(k+1) = x_i^k + v_i^(k+1).    (1)

The particle position x_i^(k+1) is the updated version of the previous position plus the particle's velocity (acceleration). It is called velocity because it is estimated from three acceleration terms: the previously saved velocity, c1 r1, and c2 r2. The particle velocity drives the optimization process and is updated using the following equation:

v_i^(k+1) = χ · v_i^k + c1 r1 (p_i^k − x_i^k) + c2 r2 (g_i^k − x_i^k),

where:

χ = 2η / |2 − φ − √(φ(φ − 4))|,  χ ∈ [0,1],  η ∈ [0,1],
φ1 = c1 r1,  φ2 = c2 r2,  φ = φ1 + φ2.    (2)

Vectors p_i^k and g_i^k are, respectively, the pbest and gbest of particle i at iteration k. The acceleration constants c1 and c2 control the relative impact of the pbest and gbest locations on the velocity of a particle. Small values of c1 and c2 allow each particle to explore far away from already discovered good points, while large values encourage a more intensive search of regions close to these points. The random factors r1 and r2 ensure that the algorithm is stochastic. The inertial weight factor is denoted by χ, and it is used to control the balance between global and local search: larger values increase the ability of global search, and smaller values increase the ability of searching local minimums. It controls the impact of a particle's previous velocity on its current velocity. Typically, this value is optimized empirically, as described in the results section. In other words, it helps the swarm converge to the global minimum point. The coefficient η controls the trade-off between exploration and exploitation. With a value of 0, the global minimum point is explored faster, but with only local exploration, while a value of 1 gives a high degree of exploration at a lower speed. As mentioned earlier, each particle has a memory which helps it select its next, better position in the search space. The memory component of the particle is given by its personal best value and is called the cognitive component. In order to move to its best region of the space, the particle also uses its social component, the group best.

Algorithm 1 describes the steps of the PSO optimization search. The algorithm first initializes the particle positions and then runs the application as many times as necessary to find the gbest weight vector. In each iteration, the pbest and gbest vectors are updated in order to calculate the velocity vector of each particle. Once the velocity of each particle is calculated, each particle's position is updated by applying the new velocity to the particle's previous position. This process is repeated until the stopping condition is met.

Algorithm 1: PSO Algorithm
—————————————————–
Initialize number of particles
Initialize each particle position and velocity randomly
Initialize c1, c2, r1, r2, k, kmax, and stop condition
While (not stop condition)
    For (all particle positions)
        For (all input training patterns)
            apply input training pattern
            feed forward
            update fitness value epsum
        End for
        If (epsum ≤ pbest)
            update pbest value
            update pbest vector
        End if
    End for
    For (all particle positions)
        update gbest to the minimum pbest value
        update gbest vector
        update particle velocity vector v_i^k
        update particle position vector x_i^k
    End for
End while
—————————————————–
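For concreteness, the update of equations (1) and (2) can be sketched in C++ as follows. This is only a minimal illustration, not the authors' software: the sphere fitness function stands in for the ANN training error (epsum), the loop structure is condensed relative to Algorithm 1, the swarm size, kmax, c1, c2, and η are the values reported in Section IV, and the fallback to η when φ ≤ 4 (where the constriction square root is not real) is our own assumption.

// Minimal sketch of the PSO velocity/position update from equations (1)-(2).
// Not the authors' implementation; fitness() is a hypothetical stand-in
// for the ANN training error epsum (lower is better).
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

struct Particle {
    std::vector<double> x, v, pbest;   // position, velocity, personal best
    double pbestFit = 1e30;            // fitness of pbest
};

static double fitness(const std::vector<double>& x) {
    double s = 0.0;
    for (double xi : x) s += xi * xi;  // sphere function as a placeholder objective
    return s;
}

int main() {
    const int dim = 355, nParticles = 30, kMax = 90;   // values from Section IV
    const double c1 = 3.0, c2 = 3.0, eta = 0.6;        // optimized values reported in the paper

    std::mt19937 rng(42);
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    std::uniform_real_distribution<double> init(-1.0, 1.0);

    std::vector<Particle> swarm(nParticles);
    std::vector<double> gbest(dim);
    double gbestFit = 1e30;

    for (auto& p : swarm) {                             // random initialization
        p.x.resize(dim); p.v.resize(dim);
        for (int d = 0; d < dim; ++d) { p.x[d] = init(rng); p.v[d] = 0.1 * init(rng); }
        p.pbest = p.x; p.pbestFit = fitness(p.x);
        if (p.pbestFit < gbestFit) { gbestFit = p.pbestFit; gbest = p.pbest; }
    }

    for (int k = 0; k < kMax; ++k) {
        for (auto& p : swarm) {
            double r1 = uni(rng), r2 = uni(rng);
            double phi = c1 * r1 + c2 * r2;             // phi = phi1 + phi2
            // Constriction factor of eq. (2); assumed fallback when the root is not real.
            double chi = (phi > 4.0)
                ? 2.0 * eta / std::fabs(2.0 - phi - std::sqrt(phi * (phi - 4.0)))
                : eta;
            for (int d = 0; d < dim; ++d) {
                p.v[d] = chi * p.v[d]
                       + c1 * r1 * (p.pbest[d] - p.x[d])
                       + c2 * r2 * (gbest[d]   - p.x[d]);   // eq. (2)
                p.x[d] += p.v[d];                            // eq. (1)
            }
            double f = fitness(p.x);
            if (f < p.pbestFit) { p.pbestFit = f; p.pbest = p.x; }
            if (f < gbestFit)   { gbestFit = f;  gbest = p.x; }
        }
    }
    std::printf("best fitness after %d iterations: %g\n", kMax, gbestFit);
    return 0;
}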

B. Artificial Neural Networks

ANNs are composed of neural nodes. Each node is a simulated version of a biological neuron: it has inputs, an integration section, a threshold activation function, and an output. An ANN is composed of many nodes connected to each other. Nodes are structured in input, hidden, and output layers. The number of hidden layers depends on the application. If the output behaves linearly with respect to the inputs, hidden nodes are not needed. In case of non-linearity, one or more hidden layers need to be employed. If fewer hidden units are adopted than necessary, an under-fitting problem occurs, which means that after training there may be some input patterns that cannot be detected by the network. In contrast, if there are more than necessary, we may get an over-fitting or generalization problem, which means the network is over-trained, works like a memory, and can only detect exactly the same training input patterns instead of similar ones [5]. Depending on the initial weights, which are generated randomly, the fitness value takes a value and is updated after each Backward Error Propagation (BEP) training loop. This gives us the training curve of the neural network.

Figure 2 shows the training curve versus the number of iterations. There are two types of critical points: local and global minimum points. The global minimum point is the minimum of all local minimums. Local minimums are not the ideal stopping points; the global minimum is.

Fig. 2. TRAINING CURVE OF AN ANN

C. Applying the PSO Algorithm to the Application

In order to improve the local search ability of the BEP algorithm, studies such as [8] use the PSO algorithm to train the neural network. The PSO algorithm not only has global search ability, but its computation is also simple. Because of its feasible local search ability, there are also some adaptive techniques to improve the original PSO algorithm, such as the inertia weight method and the constriction factor method.

The Particle Swarm Optimization algorithm can help us find the aforementioned global minimum point. That is, when PSO is used prior to the BEP algorithm, the network is initialized with the best weight vector (gbest). Each possible solution afterwards can be regarded as one of the different local minimums. The objective of PSO is to find the best weight vector capable of converging to the global minimum point. The BEP algorithm guarantees to stop at the global minimum point, while PSO itself cannot find the global minimum point but can converge toward it and enable the BEP training algorithm to find and stop at the right point. In other words, PSO proposes the best initialization weight vector (gbest) for the ANN, and the BEP algorithm finds and stops at the ideal global minimum point, at which the network is best trained.

In the next section, we describe our approach of applying the BEP algorithm to find the global minimum point and stop there [2], at which point we can say the network is best trained, well generalized, and can be used to predict correct outputs for all similar input patterns.

III. PROPOSED APPROACH

In order to show the influence of PSO on an ANN application, we followed a two-phase approach. First, we trained a supervised Multilayer Perceptron (MLP) Artificial Neural Network, initialized with uniformly random weights, to recognize English numerals, and recorded the results. Next, we included the PSO algorithm to provide the optimized weights for the network, ran it, and recorded the results again. Finally, the PSO-BEP results were improved by optimizing the PSO parameters.

In our approach, each training pattern is composed of an array of 64 bits (0/1) representing the bitmap image of a character. To train the network, we created 24 input patterns of different numerals with different offset shapes. Since we have only five numerals, five outputs are considered enough to represent them. At each BEP training session, all input training patterns are applied and the calculated fitness value (epsum) is compared with the acceptable fitness value. The training sessions continue until the stop condition is satisfied. At this point the BEP results are recorded. The structure of the ANN consists of five hidden and five output nodes. Each of the 64 inputs is connected to the inputs of all hidden nodes, and each node of the hidden layer is connected to the inputs of all output nodes. Including the biases of all nodes, we can calculate the total number of weights as follows:

Total weights = (64 x 5) + (5 x 5) + 10 = 355

After the training is completed, we tested the ability of the network to recognize noisy test patterns. We created five input test patterns to test the network's recognition capability. These test patterns were noisy and skewed versions of the numerals 1 to 5. A well-trained network should be able to detect all test numbers with high accuracy. The BEP result is extremely dependent on the initial weights, and since the initial weights are generated randomly, each time we run the software we get different results. In order to be able to compare the results with the PSO-BEP run, we ran the software 10 times and recorded the best result. Table I shows the result of the best run of the BEP algorithm. Although it is the best run, some numbers, such as 3 and 5, cannot be recognized by the software. For example, test number 3 is detected as either 3 or 5: it is detected as number 3 with 99 percent confidence and as number 5 with 61 percent confidence. Also, the number of iterations, 58331, is too large, which is undesirable, although not as important as the accuracy of the results. Fewer iterations are better and will certainly improve the processing time. The developed software is able to record the accumulated pattern error of all training patterns (epsum) at each iteration. The training curve also shows the slope of convergence to the local or global minimum point. Sharper declining slopes indicate faster convergence of the algorithm to the desired minimum points. Figure 3 shows the training curve of the BEP run.
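As an illustration of how the fitness value epsum described above can be accumulated over the training set, a sketch of the 64-5-5 forward pass and error sum is given below. The layer sizes, the 24 bitmap patterns, and the squared output errors are taken from the paper; the function and variable names are hypothetical, and the sigmoid activation is an assumption since the paper does not state which activation function is used.

// Sketch of the fitness evaluation used as the PSO/BEP objective: feed each
// training pattern through a 64-5-5 MLP and accumulate the squared output
// error (epsum). Names are hypothetical; sigmoid activation is assumed.
#include <array>
#include <cmath>
#include <cstdio>
#include <vector>

constexpr int kInputs = 64, kHidden = 5, kOutputs = 5;

struct Weights {
    // 64x5 input-to-hidden, 5x5 hidden-to-output, plus 10 biases = 355 values
    std::array<std::array<double, kInputs>, kHidden> wIH{};
    std::array<std::array<double, kHidden>, kOutputs> wHO{};
    std::array<double, kHidden> bH{};
    std::array<double, kOutputs> bO{};
};

static double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// Forward pass for one 64-bit pattern.
static std::array<double, kOutputs> feedForward(const Weights& w,
                                                const std::array<double, kInputs>& in) {
    std::array<double, kHidden> h{};
    for (int j = 0; j < kHidden; ++j) {
        double z = w.bH[j];
        for (int i = 0; i < kInputs; ++i) z += w.wIH[j][i] * in[i];
        h[j] = sigmoid(z);
    }
    std::array<double, kOutputs> out{};
    for (int o = 0; o < kOutputs; ++o) {
        double z = w.bO[o];
        for (int j = 0; j < kHidden; ++j) z += w.wHO[o][j] * h[j];
        out[o] = sigmoid(z);
    }
    return out;
}

// Accumulated squared error over all training patterns (epsum); lower is better.
double epsum(const Weights& w,
             const std::vector<std::array<double, kInputs>>& patterns,
             const std::vector<std::array<double, kOutputs>>& targets) {
    double e = 0.0;
    for (std::size_t p = 0; p < patterns.size(); ++p) {
        auto out = feedForward(w, patterns[p]);
        for (int o = 0; o < kOutputs; ++o) {
            double d = targets[p][o] - out[o];
            e += d * d;                      // the Output Error columns are squared values
        }
    }
    return e;
}

int main() {
    Weights w{};                                            // all-zero weights, illustration only
    std::vector<std::array<double, kInputs>> patterns(1);   // one blank pattern
    std::vector<std::array<double, kOutputs>> targets(1);
    targets[0][0] = 1.0;                                    // pretend it is numeral "1"
    std::printf("epsum = %f\n", epsum(w, patterns, targets));
    return 0;
}

In the PSO phase, this epsum value plays the role of the particle fitness: each particle's 355-element position vector is loaded into a Weights structure, epsum is evaluated over the 24 training patterns, and pbest/gbest are updated accordingly.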

For the next step, we implemented the PSO algorithm to find the best initialization weight vector (gbest), initialized the network with gbest prior to BEP, ran the PSO-BEP combination 10 times, and recorded the best result for the PSO-BEP run as well.

TABLE I. BEP RESULT OF THE BEST RUN

Test No.  Output No.  Actual Value  Desired Value  Output Error
1         1           0.997109      1              0.8358E-6
          2           0             0              0
          3           0.003946      0              1.557E-05
          4           0.000527      0              2.777E-07
          5           0             0              0
2         1           0             0              0
          2           0.985471      1              0.0002111
          3           0             0              0
          4           0.001598      0              2.554E-06
          5           0.000213      0              4.537E-08
3         1           0.00034       0              1.156E-07
          2           0             0              0
          3           0.999063      1              8.78E-07
          4           0             0              0
          5           0.611014      0              0.3733381
4         1           0             0              0
          2           0.006772      0              4.586E-05
          3           0             0              0
          4           0.999736      1              6.97E-08
          5           0             0              0
5         1           0             0              0
          2           0.89285       0              0.7971811
          3           0.000001      0              1E-12
          4           0             0              0
          5           0.992445      1              5.708E-05
Iteration: 58331                    Total Error: 1.1708611
Accuracy: 0.7835873

Fig. 3. TRAINING CURVE OF BEP RUN

Finally, the developed software was adjusted to run the PSO-BEP program with different rational combinations of PSO parameters and to find the optimized values. With the optimized parameters available, we ran the software 10 times again and this time recorded the worst result. Table II shows the result of the worst PSO-BEP run with the optimized PSO parameters. Based on the printed results, the accuracy of the training is increased considerably. Furthermore, there was no conflict in the recognition of the test numbers; for example, test number 3 is recognized with 0.98 accuracy. The training curve of the PSO-BEP run is shown in Figure 4.

TABLE II. PSO-BEP RESULT OF THE WORST RUN

Test No.  Output No.  Actual Value  Desired Value  Output Error
1         1           0.9877905     1              0.0001491
          2           0.000037      0              1.369E-09
          3           0.000549      0              3.014E-07
          4           0.000125      0              1.563E-08
          5           0.66836       0              0.0044671
2         1           0             0              0
          2           0.987347      1              0.0001601
          3           0.005894      0              3.474E-05
          4           0.008257      0              6.818E-05
          5           0.000001      0              1E-12
3         1           0.007801      0              6.086E-05
          2           0.010941      0              0.0001325
          3           0.988489      1              0
          4           0             0              4.001E-05
          5           0.006325      0              0
4         1           0             0              0
          2           0.012152      0              0.0001477
          3           0             0              0
          4           0.997561      1              5.949E-06
          5           0.001656      0              2.742E-06
5         1           0.002533      0              6.416E-06
          2           0.000002      0              4E-12
          3           0.012872      0              0.0001657
          4           0.000083      0              6.889E-09
          5           0.989021      1              0.0001205
Iteration: 3830                     Total Error: 0.0056815
Accuracy: 0.9849248

Fig. 4. TRAINING CURVE OF PSO-BEP RUN

IV. RESULTS AND DISCUSSION

All of the optimized values and the respective variances were obtained empirically and are listed as follows:
1- Kmax, which is defined as part of the PSO stopping condition, is optimized to 90, but the variance is from 50 to 200 for similar applications.
2- The number of particles is optimized to 30, but the variance is from 10 to 40 for similar applications.
3- The acceleration constants c1, c2 in equation (2) are optimized to c1 = c2 = 3; the variance is from 1 to 4 for each constant, with c1 + c2 ≥ 4, for similar applications.
4- The inertial weight factor χ in equation (2) is optimized to 0.6, but the variance is from 0 to 1 for similar applications.

It can be seen that the accuracy of the training is improved by 25.69432532 percent. Moreover, the processing time is improved by 93.4340231 percent. This means that the optimized parameters can improve accuracy in less processing time, which is the outcome of this paper. With the 5x5 = 25 output values, each between (0,1), the accuracy can range from 0 to 100 percent. Note that the Output Error columns in both tables are squared values. The improved figures are as follows:

BEP Accuracy = 1 − √(1.17086113/25) = 0.78358733
PSO-BEP Accuracy = 1 − √(0.00568154/25) = 0.9849248
Improved Accuracy = 25.69432532 percent
Improved Processing time = 93.4340231 percent
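For clarity, the accuracy and improvement figures above can be reproduced directly from the table totals. The short sketch below uses the accuracy formula stated above, 1 − √(total error / 25); it assumes, as the reported numbers suggest, that the improvement percentages are relative to the BEP run and that the iteration counts serve as the processing-time measure.

// Reproduces the Section IV figures from the totals reported in Tables I and II.
#include <cmath>
#include <cstdio>

int main() {
    const double outputs = 25.0;             // 5 test patterns x 5 outputs
    const double bepError = 1.17086113;      // total squared error, Table I
    const double psoBepError = 0.00568154;   // total squared error, Table II
    const long bepIters = 58331, psoBepIters = 3830;

    double bepAcc    = 1.0 - std::sqrt(bepError / outputs);      // 0.78358733
    double psoBepAcc = 1.0 - std::sqrt(psoBepError / outputs);   // 0.9849248

    double accGain  = (psoBepAcc - bepAcc) / bepAcc * 100.0;                  // ~25.69 %
    double timeGain = (double)(bepIters - psoBepIters) / bepIters * 100.0;    // ~93.43 %

    std::printf("BEP accuracy      %.8f\n", bepAcc);
    std::printf("PSO-BEP accuracy  %.7f\n", psoBepAcc);
    std::printf("accuracy gain     %.2f %%\n", accGain);
    std::printf("processing gain   %.2f %%\n", timeGain);
    return 0;
}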
V. CONCLUSION

PSO is used to optimize the parameters of the sample ANN application. It generates the best initialization weights for the network. Without this procedure, the ANN may become trapped in one of the local minimums, which is not desired and would spoil the training of the ANN. In this paper, we have extracted the PSO improvements and tried to optimize the PSO parameters accordingly. The parameters are optimized empirically with a trial-and-error method. We have shown, in Tables I and II, that PSO-BEP can be optimized to improve the training process, avoiding under- and over-fitting and meeting the generalization capability of the ANN as well. The optimized parameters are valid for this application and can be used for similar applications if the variances of the parameters are taken into account. There is no rule to define the variances, but the parameter boundaries and equations can help to estimate and optimize the parameters for similar applications. Future work is to explore rules for optimizing the PSO parameters mathematically according to the application.

ACKNOWLEDGMENT
The authors would like to thank those who kindly participated in the subjective assessments, and also thank the members of the Biomedical and Robotic Labs for their helpful comments.

REFERENCES
[1] Eberhart, R.; Kennedy, J., "A New Optimizer Using Particle Swarm Theory", Proceedings of the 1995 IEEE 6th International Symposium on Micro Machine and Human Science, pp. 39-43, 1995.
[2] Zhiquan Bai; Yinlong Xu; Peihao Dong; Peng Gong; Kyungsup Kwak, "Particle swarm optimization based power allocation in DF cooperative communications", 2013 Fifth International Conference on Ubiquitous and Future Networks (ICUFN), pp. 316-320, 2-5 July 2013.
[3] Gaxiola, F.; Melin, P.; Valdez, F.; Castillo, O., "Optimization of type-2 fuzzy weight for neural network using genetic algorithm and particle swarm optimization", 2013 World Congress on Nature and Biologically Inspired Computing (NaBIC), pp. 22-28, 12-14 Aug. 2013.
[4] Wei-Chang Yeh, "New Parameter-Free Simplified Swarm Optimization for Artificial Neural Network Training and its Application in the Prediction of Time Series", IEEE Transactions on Neural Networks and Learning Systems, vol. 24, no. 4, pp. 661-665, April 2013.
[5] Guoshao Su, "Accelerating Particle Swarm Optimization Algorithms Using Gaussian Process Machine Learning", 2009 International Conference on Computational Intelligence and Natural Computing (CINC '09), vol. 2, pp. 174-177, 6-7 June 2009.
[6] Duan, H.B.; Liu, S.Q., "Non-linear dual-mode receding horizon control for multiple unmanned air vehicles formation flight based on chaotic particle swarm optimisation", Control Theory and Applications, vol. 4, no. 11, pp. 2565-2578, November 2010.
[7] Russell C. Eberhart; Yuhui Shi, "Particle Swarm Optimization: Developments, Applications and Resources", Proceedings of the 2001 IEEE Congress on Evolutionary Computation, vol. 1, pp. 81-86, 2001.
[8] Jun Liu; Xiaohong Qiu, "A Novel Hybrid PSO-BP Algorithm for Neural Network Training", Proceedings of the 2009 IEEE International Joint Conference on Computational Sciences and Optimization (CSO 2009), pp. 300-303, 2009.
[9] Jiansheng Wu; Mingzhe Liu, "Improving Generalization Performance of Artificial Neural Networks with Genetic Algorithms", Proceedings of the 2005 IEEE International Conference on Granular Computing, pp. 288-291, 2005.
[10] Van Wyk, A.B.; Engelbrecht, A.P., "Overfitting by PSO Trained Feedforward Neural Networks", Proceedings of the 2010 IEEE Congress on Evolutionary Computation (CEC), pp. 1-8, 2010.
[11] Guo Zhitao; Yuan Jinli; Dong Yongfeng; Gu Junhua, "Handwritten Chinese Characters Recognition Based on PSO Neural Networks", Proceedings of the 2009 IEEE International Conference, pp. 350-353, 2009.
