Figure: architecture of a multilayer perceptron, with an input layer, one or more hidden layers, and an output layer.
XOR example. With inputs and outputs coded as ±1, the truth table is:

x1    x2    x1 XOR x2
-1    -1       -1
-1    +1       +1
+1    -1       +1
+1    +1       -1

Each neuron uses the sign activation function:

$$\varphi(v) = \begin{cases} +1 & \text{if } v > 0 \\ -1 & \text{if } v \le 0 \end{cases}$$

Figure: a two-layer network of sign neurons that solves the XOR problem.
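To make the XOR construction concrete, here is a minimal sketch in Python: the two hidden units play the roles of OR and AND, and the output unit fires only when OR is true and AND is false. The specific weights are one standard choice, not necessarily those shown in the original figure.

```python
def sign(v):
    """Sign activation from the slides: +1 if v > 0, else -1."""
    return 1 if v > 0 else -1

def xor_net(x1, x2):
    # Hidden unit 1 acts as OR, hidden unit 2 as AND (inputs coded as +/-1).
    h_or  = sign(1.0 * x1 + 1.0 * x2 + 1.0)   # bias weight +1
    h_and = sign(1.0 * x1 + 1.0 * x2 - 1.0)   # bias weight -1
    # Output fires only when OR is true and AND is false: x1 XOR x2.
    return sign(1.0 * h_or - 1.0 * h_and - 1.0)

for x1 in (-1, 1):
    for x2 in (-1, 1):
        print(x1, x2, xor_net(x1, x2))
```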
NEURON MODEL
Sigmoidal function:

$$\varphi(v_j) = \frac{1}{1 + e^{-a v_j}}$$

where the induced local field of neuron j is

$$v_j = \sum_{i=0,\dots,m} w_{ji}\, y_i$$

Figure: sigmoid curves over $v_j \in [-10, 10]$ for increasing values of the slope parameter a.
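A minimal sketch of this neuron model, assuming NumPy and a weight vector whose index 0 carries the bias (with y_0 = +1); the function names are illustrative, not from the slides:

```python
import numpy as np

def sigmoid(v, a=1.0):
    """Logistic activation with slope parameter a."""
    return 1.0 / (1.0 + np.exp(-a * v))

def neuron_output(w, y, a=1.0):
    """w and y include the bias term at index 0 (y[0] = +1)."""
    v = np.dot(w, y)          # induced local field v_j = sum_i w_ji * y_i
    return sigmoid(v, a)
```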
LEARNING ALGORITHM
Back-propagation algorithm
Figure: function signals propagate forward through the network; error signals propagate backward.
It adjusts the weights of the NN in order to minimize the average squared error.
The instantaneous error energy and the average squared error over the N training examples are:

$$E(n) = \frac{1}{2} \sum_{j \in C} e_j^2(n)$$

$$E_{av} = \frac{1}{N} \sum_{n=1}^{N} E(n)$$
Notation
- $e_j$: error at the output of neuron j
- $y_j$: output of neuron j
- $v_j = \sum_{i=0,\dots,m} w_{ji}\, y_i$: induced local field of neuron j
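In code, the two error measures look like this; the function names are assumptions for illustration:

```python
import numpy as np

def instantaneous_error(d, y):
    """E(n) = 1/2 * sum_j e_j(n)^2 over the output neurons."""
    e = d - y                     # e_j = d_j - y_j
    return 0.5 * np.sum(e ** 2)

def average_error(D, Y):
    """E_av = 1/N * sum_n E(n) over all N training examples."""
    return np.mean([instantaneous_error(d, y) for d, y in zip(D, Y)])
```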
The weights are adjusted by gradient descent on the error:

$$\Delta w_{ji} = -\eta \frac{\partial E}{\partial w_{ji}}$$
Local Gradient

$$\delta_j = e_j \varphi'(v_j)$$

Update Rule

We obtain

$$\Delta w_{ji} = \eta \, \delta_j \, y_i$$

because

$$\frac{\partial E}{\partial w_{ji}} = \frac{\partial E}{\partial v_j} \frac{\partial v_j}{\partial w_{ji}}$$

with

$$\frac{\partial E}{\partial v_j} = -\delta_j \qquad\text{and}\qquad \frac{\partial v_j}{\partial w_{ji}} = y_i$$
When neuron j is an output node, the error is available directly:

$$e_j = d_j - y_j$$

Then

$$\delta_j = (d_j - y_j)\,\varphi'(v_j)$$
When neuron j is a hidden node, its error must be propagated back from the output neurons k that it feeds:

$$E(n) = \frac{1}{2}\sum_{k \in C} e_k^2(n)$$

$$\frac{\partial E}{\partial y_j} = \sum_{k \in C} e_k \frac{\partial e_k}{\partial y_j} = \sum_{k \in C} e_k \frac{\partial e_k}{\partial v_k}\frac{\partial v_k}{\partial y_j}, \qquad \frac{\partial v_k}{\partial y_j} = w_{kj}$$

From

$$\frac{\partial e_k}{\partial v_k} = -\varphi'(v_k)$$

we obtain

$$\frac{\partial E}{\partial y_j} = -\sum_{k \in C} \delta_k \, w_{kj}$$
$$\delta_j = \varphi'(v_j) \sum_{k \in C} \delta_k \, w_{kj}$$
Figure: signal-flow graph of the back-propagation error signals reaching neuron j; the errors $e_1, \dots, e_k, \dots, e_m$ yield local gradients $\delta_1, \dots, \delta_k, \dots, \delta_m$, which are weighted by $w_{kj}$ and combined through $\varphi'(v_j)$ to give $\delta_j$.
Delta Rule

$$\Delta w_{ji} = \eta \, \delta_j \, y_i$$

where

$$\delta_j = \begin{cases} \varphi'(v_j)\,(d_j - y_j) & \text{if } j \text{ is an output node} \\ \varphi'(v_j) \sum_{k \in C} \delta_k w_{kj} & \text{if } j \text{ is a hidden node} \end{cases}$$
Backpropagation algorithm
Two phases of computation:
- Forward pass: run the NN and compute the error for each neuron of the output layer.
- Backward pass: start at the output layer and pass the errors backwards through the network, layer by layer, recursively computing the local gradient of each neuron (see the sketch below).
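The two passes can be sketched for a single-hidden-layer MLP with logistic activations. The names and shapes (W1, W2, bias in column 0) are assumptions for illustration, not from the slides:

```python
import numpy as np

def sigmoid(v, a=1.0):
    return 1.0 / (1.0 + np.exp(-a * v))

def train_pattern(x, d, W1, W2, eta=0.1, a=1.0):
    """One forward/backward pass plus a delta-rule update.

    W1: (hidden, inputs+1) and W2: (outputs, hidden+1); column 0 is the bias.
    """
    # Forward pass: propagate function signals.
    y0 = np.concatenate(([1.0], x))              # y_0 = +1 bias input
    v1 = W1 @ y0                                 # hidden local fields
    z = sigmoid(v1, a)
    y1 = np.concatenate(([1.0], z))
    v2 = W2 @ y1                                 # output local fields
    y2 = sigmoid(v2, a)

    # Backward pass: propagate error signals, layer by layer.
    e = d - y2                                   # e_j = d_j - y_j
    delta2 = e * (a * y2 * (1.0 - y2))           # output: delta = e * phi'(v)
    delta1 = (a * z * (1.0 - z)) * (W2[:, 1:].T @ delta2)  # hidden: phi'(v) * sum_k delta_k * w_kj

    # Delta rule: dw_ji = eta * delta_j * y_i.
    W2 += eta * np.outer(delta2, y1)
    W1 += eta * np.outer(delta1, y0)
    return W1, W2, 0.5 * np.sum(e ** 2)          # instantaneous error E(n)
```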
Training
Sequential mode (on-line, pattern, or stochastic mode):
- (x(1), d(1)) is presented, a sequence of forward and backward computations is performed, and the weights are updated using the delta rule.
- The same is done for (x(2), d(2)), ..., (x(N), d(N)).
The learning process continues on an epoch-by-epoch basis until the stopping condition is satisfied. From one epoch to the next, choose a randomized ordering for selecting the examples in the training set.
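A sketch of sequential-mode training with a fresh random presentation order each epoch, reusing the hypothetical train_pattern sketched earlier:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_sequential(X, D, W1, W2, n_epochs=100, eta=0.1):
    """Sequential (on-line) training with a new random order each epoch."""
    for epoch in range(n_epochs):
        order = rng.permutation(len(X))     # randomized presentation order
        total = 0.0
        for n in order:
            W1, W2, E_n = train_pattern(X[n], D[n], W1, W2, eta)
            total += E_n
        E_av = total / len(X)               # average squared error this epoch
    return W1, W2, E_av
```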
Stopping criteria

Sensible stopping criteria:
- Average squared error change: back-prop is considered to have converged when the absolute rate of change in the average squared error per epoch is sufficiently small (in the range [0.01, 0.1]); a helper for this check is sketched below.
- Generalization-based criterion: after each epoch the NN is tested for generalization. If the generalization performance is adequate, then stop.
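The first criterion can be checked with a helper like the following; the tolerance value is an assumption within the range quoted above:

```python
def has_converged(E_history, tol=0.01):
    """Stop when the absolute change in E_av per epoch is small enough.

    tol is an assumed threshold; the slides suggest values in [0.01, 0.1].
    """
    if len(E_history) < 2:
        return False
    return abs(E_history[-1] - E_history[-2]) < tol
```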
Early stopping

Stop training at the point where the error on an independent validation set starts to increase, even if the training error is still decreasing.
Generalization
Generalization: the NN generalizes well if the I/O mapping computed by the network is nearly correct for new data (the test set).

Factors that influence generalization:
- the size of the training set,
- the architecture of the NN,
- the complexity of the problem at hand.

Overfitting (overtraining): when the NN learns too many I/O examples it may end up memorizing the training data.
Expressive capabilities of NN
- Boolean functions: every Boolean function can be represented by a network with a single hidden layer, but it might require an exponential number of hidden units.
- Continuous functions: every bounded continuous function can be approximated with arbitrarily small error by a network with one hidden layer.
- Arbitrary functions: any function can be approximated with arbitrary accuracy by a network with two hidden layers.
Generalized delta rule with momentum:

$$\Delta w_{ji}(n) = \alpha\, \Delta w_{ji}(n-1) + \eta\, \delta_j(n)\, y_i(n)$$

where $\alpha$ is the momentum constant.
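A sketch of the momentum update, keeping the previous weight change around; alpha = 0.9 is an assumed typical value, not from the slides:

```python
import numpy as np

def momentum_update(dW_prev, delta, y, eta=0.1, alpha=0.9):
    """Generalized delta rule: dW(n) = alpha * dW(n-1) + eta * delta(n) y(n)^T."""
    dW = alpha * dW_prev + eta * np.outer(delta, y)
    return dW          # apply with W += dW, then keep dW for the next step
```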
η adaptation

Heuristics for accelerating the convergence of the back-prop algorithm through adaptation of the learning rate η (see the sketch after this list):
- Heuristic 1: every weight should have its own η.
- Heuristic 2: every η should be allowed to vary from one iteration to the next.
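One way to realize both heuristics is a delta-bar-delta-style scheme: every weight keeps its own learning rate, which grows while successive gradients agree in sign and shrinks when they oscillate. The update factors below are assumptions, not values from the slides:

```python
import numpy as np

def adapt_rates(etas, grad, grad_prev, up=1.05, down=0.7):
    """Per-weight eta adaptation (Heuristics 1 and 2).

    Increase eta where the gradient keeps its sign (stable descent direction),
    decrease it where the sign flips (oscillation). up/down are assumed values.
    """
    same_sign = np.sign(grad) == np.sign(grad_prev)
    return np.where(same_sign, etas * up, etas * down)
```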
NN DESIGN

- Data representation
- Network topology
- Network parameters
- Training
- Validation
Two common choices of activation function:

$$\varphi(v) = \frac{1}{1 + e^{-a v}} \qquad \text{(logistic)}$$

$$\varphi(v) = a \tanh(b v) \qquad \text{(hyperbolic tangent)}$$
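Both activations and their derivatives, as needed by the backward pass; the tanh constants a = 1.7159 and b = 2/3 are commonly recommended values (e.g., by Haykin), assumed here rather than taken from the slides:

```python
import numpy as np

def logistic(v, a=1.0):
    return 1.0 / (1.0 + np.exp(-a * v))

def logistic_prime(v, a=1.0):
    y = logistic(v, a)
    return a * y * (1.0 - y)            # phi'(v) = a * phi(v) * (1 - phi(v))

def tanh_act(v, a=1.7159, b=2.0/3.0):
    return a * np.tanh(b * v)

def tanh_prime(v, a=1.7159, b=2.0/3.0):
    return a * b * (1.0 - np.tanh(b * v) ** 2)
```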
Target values should be chosen within the range of the activation function, offset from its limiting value a by an amount ε:

$$d_j = a - \varepsilon$$
Weight initialization: the weights should be initialized so that the standard deviation of the induced local field v is

$$\sigma_v = 1$$

which is achieved by drawing weights with standard deviation

$$\sigma_w = m^{-1/2}$$

where m is the number of weights (synaptic connections) feeding the neuron.
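A sketch of this initialization; the Gaussian distribution is an assumption (the slides fix only the standard deviation):

```python
import numpy as np

rng = np.random.default_rng(0)

def init_weights(n_out, n_in):
    """Initialize weights with zero mean and sigma_w = m^(-1/2), m = fan-in."""
    sigma_w = n_in ** -0.5
    return rng.normal(0.0, sigma_w, size=(n_out, n_in))
```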
Data representation
For a classification problem with K classes, use one output per class (1-of-K coding):

$$d_{k,j} = \begin{cases} 1, & x_j \in C_k \\ 0, & x_j \notin C_k \end{cases}$$

i.e., the target vector for an example of class k is $(0, \dots, 0, 1, 0, \dots, 0)^T$ with the 1 in the k-th element.
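A sketch of this 1-of-K target coding:

```python
import numpy as np

def one_hot(k, K):
    """Target vector with 1 in the k-th element (0-indexed), 0 elsewhere."""
    d = np.zeros(K)
    d[k] = 1.0
    return d

print(one_hot(2, 5))   # [0. 0. 1. 0. 0.]
```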