All content following this page was uploaded by Mohammad Manthouri on 21 January 2017.
Abstract— In this study, a new method of training the adaptive network-based fuzzy inference system (ANFIS) is presented by applying different branches of Differential Evolution. The TSK-type consequent part is a linear model of exogenous inputs. The consequent part parameters are learned by a gradient descent algorithm. The antecedent fuzzy sets are learned by basic differential evolution (DE/rand/1/bin) and then by several modifications of it. The method is applied to identification of a nonlinear dynamic system, prediction of a chaotic signal under both noise-free and noisy conditions, and simulation of a two-dimensional function. Instead of DE/rand/1/bin, this paper suggests the composite strategy (DE/current-to-best/1+1/bin & DE/rand/1/bin) for predicting the Mackey-Glass time series and identifying a nonlinear dynamic system, revealing the efficiency of the proposed structure. Finally, the method is compared with pure ANFIS to show its efficiency.

I. INTRODUCTION

It has been proved that, with a suitable number of rules, a TSK system can approximate any plant [6]. The TSK recurrent fuzzy network (TRFN) [7] uses a global feedback structure, where the firing strengths of each rule are summed and fed back as internal network inputs. TSK systems are widely used in the form of a neural-fuzzy system called the Adaptive Network-based Fuzzy Inference System (ANFIS) [8]. ANFIS is a class of adaptive networks that are functionally equivalent to fuzzy inference systems. The ANFIS architecture stands for adaptive network-based fuzzy inference system or, semantically equivalently, adaptive neural fuzzy inference system [9]. This adaptive network shows good performance in system identification, prediction, and control, and has been applied to many different systems. ANFIS has the advantage of good applicability, as it can be interpreted as local linearization modeling, so conventional linear techniques for state estimation and control are directly applicable.
an expert who is familiar with the target system to be modelled. In our simulation, however, no expert is available and the number of MFs assigned to each input variable is chosen empirically, that is, by plotting the data sets and examining them visually, or simply by trial and error. For data sets with more than three inputs, visualization techniques are not very effective and most of the time we have to rely on trial and error. This situation is similar to that of neural networks; there is just no simple way to determine in advance the minimal number of hidden units needed to achieve a desired performance level.

A. Gradient Descent

Gradient-based algorithms are the most common and important nonlinear local optimization techniques [13]. Back propagation is a gradient-based technique that applies to neural network systems [17]. It is possible to decrease the difference between the actual output of the ANFIS structure and its desired output using gradient-based methods. Consider an error function as follows:

$E = \frac{1}{2}\,(y_d - y)^2$   (9)

where $y$ is the output of the ANFIS structure and $y_d$ is the desired output. We can optimize $E$ by using the partial derivatives in the differentiation chain rule [18]. After the partial derivatives are computed, linear equations can be used to update the consequent parameters $\theta$ from the $k$-th iteration to the $(k+1)$-th iteration as follows:

$\theta(k+1) = \theta(k) - \eta\,\frac{\partial E}{\partial \theta}$   (10)

$\frac{\partial E}{\partial \theta} = -(y_d - y)\,\frac{\partial y}{\partial \theta}$   (11)

The ANFIS used here contains $4^2 = 16$ rules, with four membership functions assigned to each input variable. The total number of fitting parameters is 64, including 16 premise (nonlinear) parameters (8 centres of MFs and 8 standard deviations) and 48 consequent (linear) parameters. (We also tried an ANFIS model with 64 rules, because the first model is too simple to describe the highly nonlinear function.) To train the premise parameters we used DE and proposed two methods for constructing the initial population.

Method 1: A good uniform random initialization method is used to construct the population.

Method 2: For each input variable we have four MFs and, consequently, four centres of MF that are equally distributed along the range of the input. The slip denotes the distance between the two centres of MF belonging to two neighbouring individuals of the population, and in this paper the slip is calculated as below:

(13)

Then

(14)

Therefore, the initial values of the centres of MFs in the population are distributed in an interval around them. The initial values for the standard deviations are calculated as below:

(15)
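The gradient-descent update of the consequent parameters can be sketched as follows. This is a minimal illustration assuming the standard squared-error cost of equations (9)-(11); the function and variable names are ours, not the paper's:

```python
import numpy as np

def gd_consequent_step(theta, x, firing, y_d, eta=0.01):
    """One gradient-descent step on the TSK consequent parameters.

    theta  : (n_rules, n_inputs + 1) linear consequent coefficients
    x      : (n_inputs,) current input vector
    firing : (n_rules,) normalized rule firing strengths
    y_d    : desired output
    eta    : learning rate
    """
    xe = np.append(x, 1.0)            # exogenous inputs plus bias term
    y = firing @ (theta @ xe)         # TSK output: sum_i w_i * (theta_i . xe)
    e = y_d - y                       # error; eq. (9) is E = 0.5 * e**2
    # chain rule (eqs. 10-11): dE/dtheta_ij = -e * w_i * xe_j
    grad = -e * np.outer(firing, xe)
    return theta - eta * grad, 0.5 * e**2
```

Repeated application of this step on the training patterns drives the squared error of the linear consequent model toward a local minimum.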
E. General Notation

A general notation was adopted in the DE literature, namely DE/x/y/z [10]. Using this notation, x refers to the method of selecting the target vector, y indicates the number of difference vectors used, and z indicates the crossover method used.

IV. SIMULATION RESULTS

In this section, the way DE is employed to update the ANFIS antecedent part parameters is shown. The antecedent part of the ANFIS has two kinds of parameters which need training: the means and the standard deviations (STDEV). The membership functions are assumed Gaussian, as in equations (1, 2), where NMF represents the number of MFs. The consequent part parameters are also trained during the optimization algorithm.

We used a number of variations of the basic DE in our simulation. Different numbers of membership functions with different numbers of epochs are used for the different DE strategies.

The size of the population has a direct influence on the exploration ability of DE algorithms. The more individuals there are in the population, the more differential vectors are available and the more directions can be explored. However, it should be kept in mind that the computational complexity per generation increases with the size of the population. Empirical studies provide the guideline $NP \approx 10\,D$, where $NP$ is the population size and $D$ is the number of genes.

Example 2: Identification of a nonlinear dynamic system. In this example, the nonlinear system model with multiple time delays is described as [16]

Table 1
Results of simulating Mackey-Glass series prediction

| Antecedent part train type | Consequent part train type | MFs (each input) / Epochs | Test Error | Train Error |
|---|---|---|---|---|
| GD | GD | 2 / 500 | 5.2051e-005 | 7.2074e-005 |
| | | 4 / 500 | 5.2158e-005 | 7.9852e-005 |
| | | 4 / 250 | 1.1140e-004 | 1.4670e-004 |
| | | 6 / 167 | 1.0586e-004 | 1.5202e-004 |
| | | 8 / 250 | 5.8830e-005 | 8.4107e-005 |
| | | 8 / 125 | 1.2090e-004 | 1.5564e-004 |
| — | — | 2 / 30 | 4.4669e-004 | 4.1855e-004 |
| | | 4 / 20 | 1.8841e-005 | 4.6117e-006 |
| | | 4 / 10 | 1.1910e-005 | 1.0931e-005 |
| | | 4 / 250 | 6.2660e-006 | 2.6258e-006 |
| DE/best/1/bin | GD | 2 / 30 | 7.8053e-004 | 7.2889e-004 |
| | | 3 / 30 | 5.7960e-005 | 1.6248e-005 |
| | | 4 / 30 | 1.5628e-005 | 3.7171e-006 |
| | | 4 / 20 | 3.8646e-005 | 6.2848e-006 |
| | | 4 / 10 | 4.9271e-005 | 2.9141e-005 |
| | | 4 / 250 | 1.2390e-005 | 2.6935e-006 |
| DE/best/1/expo | GD | 2 / 30 | 3.3349e-004 | 6.5970e-005 |
| | | 3 / 30 | 3.9061e-005 | 1.4706e-005 |
| | | 4 / 30 | 4.1868e-005 | 1.5389e-005 |
| | | 4 / 20 | 1.0251e-005 | 1.4105e-005 |
| | | 4 / 10 | 3.3630e-005 | 2.2836e-005 |
| | | 4 / 250 | 5.5436e-006 | 5.4491e-006 |
| DE/rand/3/bin | GD | 2 / 30 | 7.2959e-005 | 5.6971e-005 |
| | | 4 / 10 | 2.4233e-005 | 8.7561e-007 |
| | | 4 / 100 | 8.9657e-006 | 1.8663e-006 |
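The DE/x/y/z notation can be made concrete with a minimal sketch of the basic strategy, DE/rand/1/bin. This is a generic illustration, not the paper's implementation; the population size, F, and CR values are illustrative defaults:

```python
import numpy as np

def de_rand_1_bin(cost, bounds, np_size=40, F=0.5, CR=0.9, gens=200, seed=0):
    """Minimal DE/rand/1/bin: x = randomly chosen target vector,
    y = one difference vector, z = binomial (uniform) crossover."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    D = lo.size
    pop = rng.uniform(lo, hi, size=(np_size, D))
    fit = np.apply_along_axis(cost, 1, pop)
    for _ in range(gens):
        for i in range(np_size):
            # three distinct individuals, all different from i ("rand/1")
            r1, r2, r3 = rng.choice([j for j in range(np_size) if j != i],
                                    size=3, replace=False)
            mutant = pop[r1] + F * (pop[r2] - pop[r3])
            # binomial crossover ("bin"): at least one gene from the mutant
            cross = rng.random(D) < CR
            cross[rng.integers(D)] = True
            trial = np.clip(np.where(cross, mutant, pop[i]), lo, hi)
            f = cost(trial)
            if f <= fit[i]:              # greedy one-to-one selection
                pop[i], fit[i] = trial, f
    best = int(fit.argmin())
    return pop[best], fit[best]
```

For example, minimizing the three-dimensional sphere function `sum(v**2)` over [-5, 5]^3 drives the cost close to zero within a few hundred generations; in the paper's setting the genes would instead encode the Gaussian MF centres and standard deviations of the antecedent part.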
$y(k+1) = f\big(y(k),\, y(k-1),\, y(k-2),\, u(k),\, u(k-1)\big)$   (17)

where

$f(x_1, x_2, x_3, x_4, x_5) = \dfrac{x_1\, x_2\, x_3\, x_5\,(x_3 - 1) + x_4}{1 + x_2^2 + x_3^2}$   (18)

Here, the current output of the plant depends on three previous outputs and two previous inputs. An ANFIS structure with five input nodes, feeding the appropriate past values of y and u, was used. The system input signal u(k) is given by the following equation [16]:

$u(k) = \begin{cases} \sin(2\pi k/250), & k \le 500 \\ 0.8\,\sin(2\pi k/250) + 0.2\,\sin(2\pi k/25), & k > 500 \end{cases}$   (19)

The trigonometric network used here contains two hidden layers with four and two neurons (sine and cosine), with or without frequency and phase in the hidden layers, five inputs, and one output.

Fig. 2: Mackey-Glass prediction. (a) Using DE to train the antecedent part parameters in the ANFIS structure. (b) Using GD to train the antecedent and the consequent part parameters in the ANFIS structure.

Table 2
Results of simulating nonlinear dynamic system prediction

| Antecedent part train type | Consequent part train type | MFs (each input) / Epochs | Test Error | Train Error |
|---|---|---|---|---|
| GD | GD | 2 / 500 | 0.0036 | 8.5324e-005 |
| | | 4 / 250 | 0.0017 | 7.7851e-005 |
| | | 4 / 30 | 0.0019 | 2.5545e-004 |
| | | 8 / 125 | 7.4678e-004 | 2.8731e-005 |
| | | 8 / 10 | 0.0025 | 2.3893e-004 |
| DE/rand/1/bin | GD | 4 / 10 | 0.0057 | 0.0283 |
| | | 4 / 30 | 0.0050 | 0.0139 |
| | | 8 / 10 | 0.0048 | 6.6945e-004 |
| DE/rand/1/expo | GD | 4 / 10 | 0.0163 | 0.0365 |
| | | 4 / 30 | 0.0224 | 0.0291 |
| | | 8 / 10 | 0.0084 | 2.9706e-004 |
| DE/best/1/bin | GD | 4 / 10 | 0.0125 | 0.0207 |
| | | 4 / 30 | 0.0067 | 0.0129 |
| | | 8 / 10 | 0.0034 | 0.0036 |
| DE/best/1/expo | GD | 4 / 10 | 0.0010 | 1.8155e-004 |
| | | 4 / 30 | 0.0090 | 0.0133 |
| | | 8 / 10 | 0.0012 | 0.0071 |
| DE/rand/3/bin | GD | 4 / 10 | 0.005 | 0.0290 |
| | | 4 / 30 | 0.0041 | 0.0119 |
| | | 8 / 10 | 0.0072 | 3.1786e-004 |
| DE/rand/3/expo | GD | 4 / 10 | 0.0066 | 0.0276 |
| | | 4 / 30 | 0.0054 | 0.0144 |
| | | 8 / 10 | 0.0061 | 3.5513e-004 |
| DE/best/3/bin | GD | 4 / 10 | 0.0049 | 0.0196 |
| | | 4 / 30 | 0.0061 | 0.0126 |
| | | 8 / 10 | 0.0100 | 3.3399e-004 |
| DE/best/3/expo | GD | 4 / 10 | 0.0064 | 0.0213 |
| | | 4 / 30 | 0.0050 | 0.0119 |
| | | 8 / 10 | 0.0123 | 5.5242e-004 |
| DE/current-to-best/1+1/bin | GD | 4 / 10 | 0.0062 | 0.0307 |
| | | 4 / 30 | 0.0053 | 0.0178 |
| | | 8 / 10 | 0.0052 | 2.8258e-004 |
| DE/current-to-best/1+1/bin & DE/rand/1/bin | GD | 4 / 10 | 0.0350 | 0.0527 |
| | | 4 / 30 | 0.0302 | 0.0384 |
| | | 8 / 10 | 0.0027 | 2.6240e-004 |
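Example 2's data set can be generated with a short sketch, assuming the plant takes the standard five-input benchmark form common in the identification literature (three delayed outputs, two delayed inputs); the function name and the exact form of the input signal are our assumptions:

```python
import numpy as np

def generate_plant_data(n=1000):
    """Simulate the assumed five-input benchmark plant:
    y(k+1) = f(y(k), y(k-1), y(k-2), u(k), u(k-1)), with
    f = (x1*x2*x3*x5*(x3 - 1) + x4) / (1 + x2**2 + x3**2)."""
    y = np.zeros(n + 1)
    # assumed input signal: sinusoid that changes shape after k = 500
    u = np.array([np.sin(2 * np.pi * k / 250) if k < 500
                  else 0.8 * np.sin(2 * np.pi * k / 250)
                       + 0.2 * np.sin(2 * np.pi * k / 25)
                  for k in range(n)])
    for k in range(2, n):
        x1, x2, x3, x4, x5 = y[k], y[k - 1], y[k - 2], u[k], u[k - 1]
        y[k + 1] = (x1 * x2 * x3 * x5 * (x3 - 1) + x4) / (1 + x2**2 + x3**2)
    # each pattern feeds the five past values to the five ANFIS inputs
    X = np.column_stack([y[2:n - 1], y[1:n - 2], y[0:n - 3],
                         u[2:n - 1], u[1:n - 2]])
    t = y[3:n]          # target: the next plant output
    return X, t
```

With n = 1000 this yields 997 input-target patterns, from which a 597/400 training/test split like the one described in the paper can be taken.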
The ANFIS structure applied here contains five inputs and different numbers of membership functions per input. We use 597/1000 data points for training/testing. The results are illustrated in Table 2, where the different methods are shown together so that they can be compared.

V. CONCLUSION

Some new algorithms, preferably those that have roots in nature, may also be employed in the ANFIS structure, instead of training the parameters just with GD, to help it reach the globally optimal solution. Since these algorithms are derivative-free, and the derivatives needed to train the antecedent part parameters are very difficult to calculate, the complexity of these approaches is lower than that of other training algorithms such as GD. On the other hand, the number of computations required by each algorithm shows that DE requires fewer to achieve the same error goal as back propagation. Also, training with DE avoids the local-minimum problem of the GD algorithm. The effectiveness of the proposed DE method was demonstrated by applying it to the identification of a nonlinear system.
VI. REFERENCES
[16] C. J. Lin and Y. J. Xu, "A self-adaptive neural fuzzy network with group-based symbiotic evolution and its prediction applications," Fuzzy Sets and Systems, vol. 157, pp. 1036-1056, 2006.
[17] M. M. Gupta, L. Jin, and N. Homma, Static and Dynamic Neural Networks: From Fundamentals to Advanced Theory, John Wiley & Sons, Inc., 2003.
[18] R. Alcalá, J. Casillas, O. Cordón, F. Herrera, and S. J. I. Zwir, Techniques for Learning and Tuning Fuzzy Rule-Based Systems for Linguistic Modeling and their Application, E.T.S. de Ingeniería Informática, University of Granada, 18071 Granada, Spain, 1999.