
Problems with Backpropagation

1. The presence of local minima in the error surface. Solutions get trapped in local minima and may never reach the lowest point if the multidimensional error function has many local minima.

2. Backpropagation is an extremely slow process. Tens of thousands of learning trials are common even for simple problems.

3. The assumption behind backpropagation is that minimizing the error for a training set is the right thing to do. In fact, this depends very much on the definition of error, on the choice of training set, and on the function of the network.

4. If learning goes on too long, generalization often suffers.

5. Backpropagation is unbiological: there is no evidence of weight-error information running backwards in the brain.

6. Some backpropagation networks can exhibit what is known as catastrophic unlearning. After the network has learned a set of patterns, if a new pattern has to be learned, it may sometimes be necessary to undo all the old connections and change everything in order to accommodate the new information (which, of course, takes a very long time).

7. The biggest problem is neural network mysticism. A neural network may solve a practical problem, but it can be difficult to understand how it solved it. For many problems the hidden layer is not doing an obvious analysis. If you don't know what was done, it can be hard to improve it.

There is a strong tendency to say, "Who cares? The network works." This approach is rarely the road to either progress or wisdom.
XOR Problem
(Haykin, page 176; Touretzky and Pomerleau, 1989)

[Figure: a two-layer network solving XOR. Both hidden neurons receive the two inputs with weights +1; the top hidden neuron has bias -1.5 and the bottom hidden neuron has bias -0.5. The output neuron receives weight -2 from the top hidden neuron and weight +1 from the bottom hidden neuron, and has bias -0.5.]
When the top hidden neuron is off and the bottom hidden neuron is on, which occurs when the input pattern is (0,1) or (1,0), the output neuron is switched on due to the excitatory effect of the positive weight connected to the bottom hidden neuron.
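As a quick check, a minimal MATLAB sketch (the weights and biases are the ones read from the figure above; step is an assumed helper standing in for the hard-limit activation) reproduces the XOR truth table:

step = @(v) double(v >= 0);            % hard-limit activation
X = [0 0; 0 1; 1 0; 1 1];              % the four binary input patterns
for n = 1:4
    x  = X(n,:)';
    h1 = step([1 1]*x - 1.5);          % top hidden neuron (bias -1.5)
    h2 = step([1 1]*x - 0.5);          % bottom hidden neuron (bias -0.5)
    o  = step(-2*h1 + h2 - 0.5);       % output neuron (bias -0.5)
    fprintf('%d XOR %d = %d\n', x(1), x(2), o);
end

Running it should print 0, 1, 1, 0 for the four input patterns.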
Feed-forward network mappings

Feed-forward neural networks provide a general framework for representing non-linear functional mappings between a set of input variables and a set of output variables. This is achieved by representing the non-linear function of many variables in terms of compositions of non-linear functions of a single variable, called activation functions.
The units which are not treated as output units are called hidden units. In this network there are d inputs, M hidden units and c output units.

The output of the j-th hidden unit is obtained by first forming a weighted linear combination of the d input values, and adding a bias, to give
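Writing a_j for this weighted sum, with the weight notation defined just below, this can be written as

    a_j = Σ_{i=1}^{d} w_ji x_i + w_j0        for j = 1, ..., M.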

Here w_ji denotes a weight in the first layer, going from input i to hidden unit j, and w_j0 denotes the bias for hidden unit j.

Treating bias as a weight
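This is the usual device of introducing an extra input x_0 whose value is permanently fixed at x_0 = 1, so that the bias w_j0 becomes an ordinary weight and the sum becomes

    a_j = Σ_{i=0}^{d} w_ji x_i.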

The activation of hidden unit j is then obtained by transforming the linear sum above using an activation function g(.) to give
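That is, writing z_j for the output of hidden unit j,

    z_j = g(a_j).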

The outputs of the network are obtained by transforming the activations of the hidden units using a second layer of processing elements. Thus, for each output unit k, we construct a linear combination of the outputs of the hidden units of the form
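Writing a_k for this second-layer sum, where w_kj denotes the weight from hidden unit j to output unit k and w_k0 the bias for output unit k, this is

    a_k = Σ_{j=1}^{M} w_kj z_j + w_k0.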

Treating bias as a weight
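As before, an extra hidden-unit output z_0 fixed at z_0 = 1 absorbs the bias, giving

    a_k = Σ_{j=0}^{M} w_kj z_j.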

The activation of the k-th output unit is then obtained by transforming this linear combination using a non-linear activation function, to give
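Denoting this output-layer activation function by g~ (it need not be the same as the hidden-layer function g), the k-th network output is

    y_k = g~(a_k).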

An explicit expression for the complete function represented by the network diagram:
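Writing the biases out explicitly, the two layers compose to

    y_k = g~( Σ_{j=1}^{M} w_kj g( Σ_{i=1}^{d} w_ji x_i + w_j0 ) + w_k0 ).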

When the inputs are binary:

We can easily show that a two-layer network of the form shown in the earlier figure can generate any Boolean function, provided the number M of hidden units is sufficiently large (McCulloch and Pitts, 1943). Roughly, each hidden unit can be made to respond to exactly one of the input patterns that should give output 1, and the output unit then simply ORs the hidden units together.

When the inputs are continuous:

A two-layer network can separate regions whose boundaries are built from the hyperplanes defined by its hidden units, as the following example illustrates. (Lippmann, 1987; in the file section of the group - Reading Assignment)
Design a single-output, two-layer network which classifies the shaded region in the figure from the other region.

[Figure: a shaded region in the (x1, x2) plane; the marked points (1,1), (1,3) and (3,2) are the pairwise intersections of the three decision lines used below.]
The equations of the decision boundaries are:

h1:  x1 − 1 = 0
h2:  0.5 x1 − x2 + 0.5 = 0
h3:  −0.5 x1 − x2 + 3.5 = 0

So the hidden-layer weights are:
For the first neuron, the weights are 1 and 0, with bias −1.
For the second neuron, the weights are 0.5 and −1, with bias 0.5.
For the third neuron, the weights are −0.5 and −1, with bias 3.5.

Let the outputs of the hidden neurons be h1, h2 and h3. h1 = 1 means the neuron is 'on', which happens on the side of its decision line labelled h1 > 0 in the figure; h1 = 0 means it is 'off' (the side where h1 < 0), and similarly for h2 and h3. Now take the region numbered 1: there h1 = 0, h2 = 0 and h3 = 0. Similarly we can write down (h1, h2, h3) for the other regions (see the table). We want a network for which regions 1, 3 and 5 produce output (o) 1 and the other regions produce 0.
[Figure: the three decision lines, with their positive sides labelled h1 > 0, h2 > 0 and h3 > 0, divide the plane into seven numbered regions; the points (0,0), (2,0), (0,2), (2,2), (4,2), (2,3) and (0,4) are marked.]
For each region, the required output o gives one constraint on the output-neuron weights w1, w2, w3 and threshold θ:

Region   h1  h2  h3   o    Constraint
  1       0   0   0   1    −θ > 0
  2       0   0   1   0    w3 − θ < 0
  3       0   1   1   1    w2 + w3 − θ > 0
  4       1   1   1   0    w1 + w2 + w3 − θ < 0
  5       1   1   0   1    w1 + w2 − θ > 0
  6       1   0   0   0    w1 − θ < 0
  7       1   0   1   0    w1 + w3 − θ < 0

From the first constraint we know that θ is negative. From the second, w3 must be more negative than θ, and so on. I took θ = −1, w3 = −2, w2 = 3 and w1 = −3.
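A minimal MATLAB sketch to check that these values satisfy all seven constraints at once (H and o simply restate the table above):

H = [0 0 0; 0 0 1; 0 1 1; 1 1 1; 1 1 0; 1 0 0; 1 0 1];   % h1 h2 h3 for regions 1-7
o = [1; 0; 1; 0; 1; 0; 0];                                % desired output for each region
w = [-3 3 -2]; theta = -1;                                % the values chosen above
all((H*w' - theta > 0) == o)                              % returns 1 (true) if every row agrees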

p=[2;0];                                        % test point (x1, x2) = (2, 0)
net = newff([0 10;0 10],[3 1],{'hardlim' 'hardlim'});   % 2 inputs, 3 hidden units, 1 output, hard-limit units
net.b{1}=[-1;.5;3.5];                           % hidden biases
net.b{2}=[1];                                   % output bias (equals -theta)
net.lw{2,1}=[-3 3 -2];                          % output weights [w1 w2 w3]
net.iw{1,1}=[1 0;.5 -1;-.5 -1];                 % hidden weights, one row per neuron
y=sim(net,p)
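For p = [2;0] all three hidden units switch on (the row h1 = h2 = h3 = 1 in the table above), so the simulated output y should come out as 0.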
About the lab class

Find a neural network to fit the data generated by the humps function on the interval [0, 2]:

x=0:.05:2;
y=humps(x);
plot(x,y)

[Figure: plot of y = humps(x) for x between 0 and 2.]
p=x; t=y;                                      % training inputs and targets (humps data)
net=newff(p,t,[2,1],{'logsig','purelin'});
net=train(net,p,t);
y=sim(net,p);                                  % network output on the training inputs

Plot the output (a sketch is given below).
Change the number of hidden neurons.
Change the learning rate.
Change the number of epochs.

net.trainParam.lr = 0.05;
net.trainParam.epochs = 100;
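One way to plot the result, reusing p, t and y from the script above (a minimal sketch):

plot(p, t, 'o', p, y, '-')              % circles: humps data, line: network output
legend('target data', 'network output')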

load iris        % Classification
load housing     % Regression

Read about the housing problem in the following link:

http://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html

UC Irvine Machine Learning Repository:

http://archive.ics.uci.edu/ml/
