[Fig. 3. Neural network architecture for the patterns of Fig. 1 and the activation functions used; neuron equations: 3x + y - 3 > 0 AND x + 3y - 3 > 0, x - 2 > 0, x - 2y + 2 > 0.]

III. NEURAL NETWORK LEARNING

Several methods for training a single neuron (that is, a single layer of a feedforward neural network) were developed, such as:

* correlation learning rule: $\Delta w_{ij} = c\,x_i d_j$
* instar learning rule: $\Delta w_i = c\,(x_i - w_i)$
* outstar learning rule: $\Delta w_{ij} = c\,(d_j - w_{ij})$
* perceptron fixed rule: $\Delta w_i = c\,x_i (d_j - o_j)$
* perceptron adjustable rule: $\Delta w_{ij} = c\,\mathrm{net}_j\, x_i (d_j - o_j)$
* LMS (Widrow-Hoff) rule: $\Delta w_{ij} = c\,x_i (d_j - \mathrm{net}_j)$
* delta rule: $\Delta w_{ij} = c\,x_i f'_j (d_j - o_j)$
* linear regression (pseudoinverse) rule: $w = (X^T X)^{-1} X^T d$
where:
$w_{ij}$ = weight from the $i$th input to the $j$th neuron
$\Delta w_{ij}$ = weight change for $w_{ij}$
$c$ = learning constant
$x_i$ = signal on the $i$th input

$\mathrm{net}_j = \sum_{i=1}^{n} x_i w_{ij} + w_{(n+1)j}$   (1)

$o_j = f(\mathrm{net}_j)$ = output signal
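To make (1) concrete, here is a minimal Python sketch of a single neuron with n inputs and a bias weight; the tanh activation with gain k is an assumed choice, not specified by the text above:

```python
import numpy as np

def neuron_output(x, w, bias_weight, k=1.0):
    """Compute o = f(net) for one neuron, with net as in Eq. (1)."""
    net = np.dot(x, w) + bias_weight   # net_j = sum_i x_i w_ij + w_(n+1)j
    return np.tanh(k * net)            # assumed bipolar activation with gain k

# example with three inputs
x = np.array([1.0, -1.0, 0.5])
w = np.array([0.2, -0.4, 0.1])
print(neuron_output(x, w, bias_weight=0.05, k=5.0))
```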
In order to understand neural network learning, let us consider a neuron with eight inputs with binary signals and weights, such as

x = [+1 -1 -1 +1 -1 +1 +1 -1]

If the weights are equal to the input pattern, w = x,

w = [+1 -1 -1 +1 -1 +1 +1 -1]

then

$\mathrm{net} = \sum_{i=1}^{n} w_i x_i = 8$   (2)

and this is the maximum value net can have; for any other combination net would be smaller. For example, for the same weights and a slightly different input pattern:
x = [+1 +1 -1 +1 -1 +1 +1 -1]

$\mathrm{net} = \sum_{i=1}^{n} w_i x_i = 6$   (3)

[Fig. 4. Surfaces at different nodes of the neural network of Fig. 3 for neurons with activation function gain k = 5.]
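A quick numeric check of (2) and (3), using the eight-element bipolar vectors given above:

```python
import numpy as np

w = np.array([+1, -1, -1, +1, -1, +1, +1, -1])
x_match = w.copy()                                      # pattern identical to the weights
x_close = np.array([+1, +1, -1, +1, -1, +1, +1, -1])    # second element flipped

print(int(np.dot(w, x_match)))   # 8, the maximum possible value (Eq. 2)
print(int(np.dot(w, x_close)))   # 6, smaller for the perturbed pattern (Eq. 3)
```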
One may notice that the maximum net value is obtained when the weights are the same as the pattern (assuming that both weights and patterns are normalized). Therefore, one can draw the conclusion that during the learning process weights should be changed in the same way as the input signals to these weights:

$\Delta w_{ij} = c\,\delta_j\,x_i$   (4)

In the case of the delta rule for one neuron,

$\delta_j = f'_j\,(d_j - o_j)$   (5)

[Fig. 5. Nonlinear mapping of the neural network of Fig. 3 for different values of the neuron gain: (a) k = 20, (b) k = 2.]

Neural networks may separate patterns (perform classifications) as shown in Fig. 5(a), but they can also produce very complex nonlinear shapes (see Fig. 5(b)). The question is how to design neural networks for arbitrary nonlinear mapping. Can we just select randomly chosen neural network architectures and train them?

A. Error Back Propagation (EBP)

This rule can be extended to multiple layers, as shown in Fig. 6. Instead of the gain f' through a single neuron, the signal gain through the entire network can be used, and this leads to the error backpropagation (EBP) algorithm.
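As a concrete illustration of the delta rule (4)-(5), below is a minimal training loop for a single neuron; the tanh activation, learning constant, and toy data are assumptions made for the sketch:

```python
import numpy as np

def f(net, k=1.0):
    return np.tanh(k * net)

def f_prime(net, k=1.0):
    return k * (1.0 - np.tanh(k * net) ** 2)

def train_delta(X, d, c=0.1, k=1.0, epochs=100):
    """Delta rule: dw_i = c * x_i * f'(net) * (d - o), as in Eqs. (4)-(5)."""
    n_samples, n_inputs = X.shape
    w = np.zeros(n_inputs + 1)                        # last entry is the bias weight
    Xb = np.hstack([X, np.ones((n_samples, 1))])      # append constant bias input
    for _ in range(epochs):
        for x, target in zip(Xb, d):
            net = np.dot(w, x)
            o = f(net, k)
            w += c * f_prime(net, k) * (target - o) * x
    return w

# toy linearly separable problem (illustrative data)
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
d = np.array([1.0, 1.0, -1.0, -1.0])                  # depends only on the first input
print(train_delta(X, d))
```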
The gradient components can be smoothed with an exponentially weighted average:

$S_{ij}(t) = (1 - \alpha)\,D_{ij}(t) + \alpha\,S_{ij}(t-1)$   (12)
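A minimal sketch of the averaging in (12), assuming D is the current derivative and S its smoothed value (the parameter value below is arbitrary):

```python
def smooth_gradient(S_prev, D_now, alpha=0.7):
    """S_ij(t) = (1 - alpha) * D_ij(t) + alpha * S_ij(t - 1), as in Eq. (12)."""
    return (1.0 - alpha) * D_now + alpha * S_prev

# running the recurrence over a short sequence of derivatives (made-up values)
S = 0.0
for D in [0.5, 0.4, -0.1, 0.3]:
    S = smooth_gradient(S, D)
    print(round(S, 4))
```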
In the Newton method the weight updates are computed using the matrix A of second derivatives of the error function:

$A = \begin{bmatrix} \dfrac{\partial^2 E}{\partial w_1^2} & \dfrac{\partial^2 E}{\partial w_1 \partial w_2} & \cdots & \dfrac{\partial^2 E}{\partial w_1 \partial w_n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial^2 E}{\partial w_1 \partial w_n} & \dfrac{\partial^2 E}{\partial w_2 \partial w_n} & \cdots & \dfrac{\partial^2 E}{\partial w_n^2} \end{bmatrix}$   (14)

The Newton algorithm works very well only for quasi-linear systems. In nonlinear systems it often fails to converge. Also, computation of the Hessian is relatively expensive. Levenberg and Marquardt made several improvements to the Newton method. Instead of computing the Hessian, they compute the Jacobian

$J = \begin{bmatrix} \dfrac{\partial e_{11}}{\partial w_1} & \dfrac{\partial e_{11}}{\partial w_2} & \cdots & \dfrac{\partial e_{11}}{\partial w_N} \\ \dfrac{\partial e_{21}}{\partial w_1} & \dfrac{\partial e_{21}}{\partial w_2} & \cdots & \dfrac{\partial e_{21}}{\partial w_N} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial e_{M1}}{\partial w_1} & \dfrac{\partial e_{M1}}{\partial w_2} & \cdots & \dfrac{\partial e_{M1}}{\partial w_N} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial e_{1P}}{\partial w_1} & \dfrac{\partial e_{1P}}{\partial w_2} & \cdots & \dfrac{\partial e_{1P}}{\partial w_N} \\ \dfrac{\partial e_{2P}}{\partial w_1} & \dfrac{\partial e_{2P}}{\partial w_2} & \cdots & \dfrac{\partial e_{2P}}{\partial w_N} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial e_{MP}}{\partial w_1} & \dfrac{\partial e_{MP}}{\partial w_2} & \cdots & \dfrac{\partial e_{MP}}{\partial w_N} \end{bmatrix}$   (15)

so that

$A = 2 J^T J$   (16)

and

$g = 2 J^T e$   (17)

resulting in the modified Newton method [9][10]:

$w_{k+1} = w_k - \left(J_k^T J_k + \mu I\right)^{-1} J_k^T e_k$   (18)

In the LM algorithm an N by N matrix must be inverted in every iteration. Therefore the LM algorithm is not very efficient for training large neural networks.

IV. SPECIAL NEURAL NETWORK ARCHITECTURES FOR EASY LEARNING

Because of difficulties with neural network training, several special neural network architectures were developed which are easy to train. These special neural network architectures will be reviewed in the following sections.

A. Functional Link Networks

One-layer neural networks are relatively easy to train, but these networks can solve only linearly separable problems. One possible solution for nonlinear problems, presented by Nilsson [11] and then elaborated by Pao [12] using the functional link network, is shown in Fig. 9. Using nonlinear terms with initially determined functions, the actual number of inputs supplied to the one-layer neural network is increased.

[Fig. 9. Functional link network for the XOR problem (unipolar and bipolar neurons).]
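A sketch of the functional-link idea for the XOR problem: the bipolar inputs x1 and x2 are extended with the nonlinear product term x1*x2, after which a single linear-threshold neuron (an assumed, illustrative choice) separates the classes:

```python
import numpy as np

# bipolar XOR: output is +1 when exactly one input is +1
X = np.array([[+1, +1], [+1, -1], [-1, +1], [-1, -1]])
d = np.array([-1, +1, +1, -1])

# functional link expansion: append the product term x1 * x2
X_ext = np.hstack([X, (X[:, 0] * X[:, 1]).reshape(-1, 1)])

# with the extra input the problem becomes linearly separable;
# weights [0, 0, -1] and zero bias solve it directly
w = np.array([0.0, 0.0, -1.0])
o = np.sign(X_ext @ w)
print(o)   # [-1.  1.  1. -1.] matches the desired outputs
```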
C. Counterpropagation networks

The counterpropagation network was originally proposed by Hecht-Nielsen [13]. This network, which is shown in Fig. 10, requires a number of hidden neurons equal to the number of input patterns or, more exactly, to the number of input clusters. The first layer is known as the Kohonen layer with unipolar neurons. In this layer only one neuron, the winner, can be active. The second is the Grossberg outstar layer.

[Fig. 10. Counterpropagation network (Kohonen layer of unipolar neurons followed by summing circuits).]

A modification [14] of the counterpropagation network is shown in Fig. 11. This modified network can be easily designed: weights in the input layer are equal to the input patterns, and weights in the output layer are equal to the output patterns.

[Fig. 11. Modified counterpropagation network: input-layer weights equal the input patterns and output-layer weights equal the output patterns.]

In radial basis function (RBF) networks each hidden neuron responds to the distance between the input vector and a stored cluster center:

$h_i = \exp\!\left(-\dfrac{\|x - s_i\|^2}{\sigma_i^2}\right)$   (20)

where:
$x$ = input vector
$s_i$ = stored pattern representing the center of the $i$th cluster
$\sigma_i$ = radius of the cluster

The radial basis networks can be designed or trained. Training is usually carried out in two steps. In the first step, the hidden layer is usually trained in the unsupervised mode by choosing the best patterns for cluster representation. An approach similar to that used in the WTA architecture can be used. Also in this step, the radii $\sigma_i$ must be found for a proper overlapping of clusters. The second step of training is the error backpropagation algorithm carried out only for the output layer. Since this is a supervised algorithm for one layer only, the training is very rapid.
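A minimal RBF sketch along the lines of (20) and the two-step procedure above; the centers are simply chosen input patterns, and the output layer is solved by least squares as one possible stand-in for the single-layer supervised step (data and radius are illustrative assumptions):

```python
import numpy as np

def rbf_hidden(X, centers, radius):
    """Hidden-layer outputs h_i = exp(-||x - s_i||^2 / sigma_i^2), Eq. (20)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / radius ** 2)

# toy data (assumed): approximate y = sin(x) on [0, 3]
X = np.linspace(0.0, 3.0, 30).reshape(-1, 1)
y = np.sin(X).ravel()

# step 1 (unsupervised, simplified): pick a few stored patterns as cluster centers
centers = X[::6]
radius = 0.8                          # cluster radius sigma, chosen by hand

# step 2 (supervised, output layer only): least-squares output weights
H = rbf_hidden(X, centers, radius)
w, *_ = np.linalg.lstsq(H, y, rcond=None)

print(np.max(np.abs(H @ w - y)))      # approximation error over the training points
```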
[Fig. 14. Architecture of the Mamdani fuzzy controller: analog inputs are fuzzified, processed by MIN and MAX operators, and defuzzified into an analog output.]

A. Mamdani fuzzy controller

The block diagram of the Mamdani approach [3] is shown in Fig. 14. The fuzzy logic is similar to Boolean logic, but instead of AND operators, MIN operators are used, and in place of OR operators, MAX operators are implemented. The Mamdani concept follows the rule of ROM and PLA digital structures, where AND operators select specified addresses and then OR operators are used to find the output bits from the information stored at these addresses.

In some implementations, MIN operators are replaced by product operators (signals are multiplied). The rightmost block of the diagram represents defuzzification, where the output analog variable is retrieved from a set of output fuzzy variables. The most common is the centroid type of defuzzification.

Let us assume that we have to design a Mamdani fuzzy controller having the required function shown in Fig. 15. A contour plot of the same surface is shown in Fig. 16. In order to design a fuzzy controller, at first characteristic break points should be selected for the input variables. In our example we selected 8, 13, 17, and 22 for the X variable and 10, 18, and 24 for the Y variable. For higher accuracy, more membership functions should be used. A similar process has to be done for the output variable, where the values 0.8, 2, 4.5, 7.5, and 9.5 were selected. The last step is to assign to every area one specific membership function of the output variable. This is marked in Fig. 16 by letters from A to F. The process described above can be redefined by assigning different values for both inputs and outputs.

[Fig. 15. Required control surface.]

[Fig. 16. Design process for the Mamdani type fuzzy controller.]

[Fig. 17. Control surface obtained with the Mamdani fuzzy controller.]
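A compact sketch of a Mamdani-type controller in the spirit of the design above: triangular membership functions at assumed break points, MIN for the rule premises, MAX to aggregate, and centroid defuzzification. The break points and rule table below are illustrative assumptions, not the values of Figs. 15-17:

```python
import numpy as np

def tri(v, left, center, right):
    """Triangular membership function (non-degenerate triangles assumed)."""
    return np.maximum(0.0, np.minimum((v - left) / (center - left),
                                      (right - v) / (right - center)))

# membership functions for the two inputs (assumed break points)
x_sets = [(0, 5, 15), (5, 15, 25), (15, 25, 30)]
y_sets = [(0, 5, 15), (5, 15, 25), (15, 25, 30)]

# output membership function center for each (x_set, y_set) rule -- illustrative
rule_out = np.array([[1.0, 2.0, 4.0],
                     [2.0, 5.0, 7.0],
                     [4.0, 7.0, 9.0]])

def mamdani(x, y, resolution=200):
    out_axis = np.linspace(0.0, 10.0, resolution)
    aggregated = np.zeros_like(out_axis)
    for i, xs in enumerate(x_sets):
        for j, ys in enumerate(y_sets):
            fire = min(tri(x, *xs), tri(y, *ys))              # MIN of the two premises
            out_mf = tri(out_axis, rule_out[i, j] - 1.5,      # clipped output membership
                         rule_out[i, j], rule_out[i, j] + 1.5)
            aggregated = np.maximum(aggregated, np.minimum(fire, out_mf))  # MAX aggregation
    if aggregated.sum() == 0.0:
        return 0.0
    return float((aggregated * out_axis).sum() / aggregated.sum())  # centroid defuzzification

print(mamdani(12.0, 20.0))
```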
B. TSK fuzzy controller

More recently the Mamdani architecture was replaced by the TSK [4][5] (Takagi, Sugeno, Kang) architecture, where the defuzzification block was replaced with normalization and a weighted average.
The block diagram of the TSK approach is shown in Fig. 18. The TSK architecture does not require MAX operators; instead, a weighted average is applied directly to the regions selected by the MIN operators. What makes the TSK system really simple is that the output weights are proportional to the average function values at the regions selected by the MIN operators. The TSK fuzzy system works as a lookup table.
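A sketch of the TSK evaluation described above: MIN operators determine how strongly each region fires, and the crisp output is the normalized weighted average of the per-region output values (the membership functions and region values below are illustrative assumptions):

```python
import numpy as np

def tri(v, left, center, right):
    """Triangular membership function (non-degenerate triangles assumed)."""
    return max(0.0, min((v - left) / (center - left),
                        (right - v) / (right - center)))

x_sets = [(0, 5, 15), (5, 15, 25), (15, 25, 30)]
y_sets = [(0, 5, 15), (5, 15, 25), (15, 25, 30)]

# average function value assigned to each region selected by the MIN operators
region_value = np.array([[1.0, 2.0, 4.0],
                         [2.0, 5.0, 7.0],
                         [4.0, 7.0, 9.0]])

def tsk(x, y):
    weights, values = [], []
    for i, xs in enumerate(x_sets):
        for j, ys in enumerate(y_sets):
            weights.append(min(tri(x, *xs), tri(y, *ys)))   # MIN operator per region
            values.append(region_value[i, j])
    weights = np.array(weights)
    return float(np.dot(weights, values) / weights.sum())   # normalization + weighted average

print(tsk(12.0, 20.0))
```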
[Fig. 18. Architecture of the TSK fuzzy controller.]

[Fig. 19. Design process for the TSK type fuzzy controller.]

[Fig. 20. Control surface obtained with the TSK fuzzy controller.]

VI. CONCLUSION

Various learning methods of neural networks, including supervised and unsupervised methods, are presented and illustrated with examples. The general learning rule as a function of the incoming signals is discussed. Other learning rules such as Hebbian learning, perceptron learning, LMS (Least Mean Square) learning, delta learning, WTA (Winner Take All) learning, and PCA (Principal Component Analysis) are presented as derivations of the general learning rule. The presentation focuses on various practical methods such as Quickprop, RPROP, Back Percolation, Delta-bar-Delta, and others. More advanced gradient-based methods, including pseudoinversion learning, conjugate gradient, Newton, and the LM (Levenberg-Marquardt) algorithm, are illustrated with examples.

Special neural architectures for easy learning, such as cascade correlation networks, functional link networks, polynomial networks, counterpropagation networks, and RBF (Radial Basis Function) networks, are described.

REFERENCES

[1] J. M. Zurada, Artificial Neural Systems, PWS Publishing Company, St. Paul, MN, 1995.
[2] B. M. Wilamowski, "Neural Networks and Fuzzy Systems," chapter 32 in Mechatronics Handbook, edited by Robert R. Bishop, CRC Press, pp. 32-1 to 32-26, 2002.
[3] E. H. Mamdani, "Application of Fuzzy Algorithms for Control of Simple Dynamic Plant," IEE Proceedings, Vol. 121, No. 12, pp. 1585-1588, 1974.
[4] M. Sugeno and G. T. Kang, "Structure Identification of Fuzzy Model," Fuzzy Sets and Systems, Vol. 28, No. 1, pp. 15-33, 1988.
[5] T. Takagi and M. Sugeno, "Fuzzy Identification of Systems and Its Application to Modeling and Control," IEEE Transactions on Systems, Man, and Cybernetics, Vol. 15, No. 1, pp. 116-132, 1985.
[6] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, pp. 533-536, 1986.
[7] S. E. Fahlman, "Faster-learning variations on back-propagation: An empirical study," in T. J. Sejnowski, G. E. Hinton, and D. S. Touretzky, editors, 1988 Connectionist Models Summer School, Morgan Kaufmann, San Mateo, CA, 1988.
[8] M. Riedmiller and H. Braun, "A direct adaptive method for faster backpropagation learning: The RPROP algorithm," in Proceedings of the IEEE International Conference on Neural Networks (ICNN 93), 1993.
[9] M. T. Hagan and M. B. Menhaj, "Training feedforward networks with the Marquardt algorithm," IEEE Transactions on Neural Networks, vol. 5, pp. 989-993, 1994.
[10] B. M. Wilamowski, S. Iplikci, O. Kaynak, and M. O. Efe, "An Algorithm for Fast Convergence in Training Neural Networks," International Joint Conference on Neural Networks (IJCNN'01), Washington, DC, July 15-19, 2001, pp. 1778-1782.
[11] N. J. Nilsson, Learning Machines: Foundations of Trainable Pattern Classifiers, New York: McGraw-Hill, 1965.
[12] Y. H. Pao, Adaptive Pattern Recognition and Neural Networks, Reading, MA: Addison-Wesley Publishing Co., 1989.
[13] R. Hecht-Nielsen, "Counterpropagation networks," Applied Optics, 26(23):4979-4984, 1987.
[14] B. M. Wilamowski and R. C. Jaeger, "Implementation of RBF Type Networks by MLP Networks," IEEE International Conference on Neural Networks, Washington, DC, June 3-6, 1996, pp. 1670-1675.
[15] S. E. Fahlman and C. Lebiere, "The cascade-correlation learning architecture," in D. S. Touretzky, Ed., Advances in Neural Information Processing Systems 2, Morgan Kaufmann, San Mateo, CA, 1990, pp. 524-532.