
Artificial Neural Networks
Part one

Artificial Intelligence for Control and Identification
Dr. Wilbert G. Aguilar

Ph.D. in Automatic Control, Robotics and Computer Vision

Master AR

2007 Dr. X. Parra & Dr. C. Angulo

Outline

1. Why Artificial Neural Networks?

2. Model of a neuron

1. Why Artificial Neural Networks?

Von Neumann's Computer        Human Brain
----------------------        -----------
determinism                   fuzzy behaviour
sequence of instructions      parallelism
high speed                    slow speed
repetitive tasks              adaptation to situations
programming                   learning
uniqueness of solutions       different solutions
ex. matrix product            ex. face recognition

1. Why Artificial Neural Networks?


[Human] Brain Operation
When humans recognize a face or grasp an object, they do not solve equations

The brain works in an associative way

Each sensory state evokes a brain state (an electro-chemical activity), which is memorized as needed

1. Why Artificial Neural Networks?


Playing tennis
The ball trajectory depends on many different factors: shot strength, initial angle, racket trajectory, ball spin, wind speed, ...

A desired trajectory requires:
accurate measurement of all the variables
simultaneous solution of many complex equations, which must be solved for each data acquisition (fast dynamics)

How does a player manage all that?

1. Why Artificial Neural Networks?


Playing tennis
In a learning phase, a human player tries and experiments with different actions and memorizes the good ones:

if the racket is on the right side and the ball comes from right to left, then move a step backward and cross the racket to the left side
if the racket is on the right side and the ball comes from left to right, then move a step forward
if the racket is ...

1. Why Artificial Neural Networks?


Playing tennis

In an operative phase, the brain controls the actions without thinking, on the basis of the learned associations

A similar mechanism is used for speech recognition or motion control

2. Model of a neuron
Biological inspiration

The basic scheme of a biological neuron is:

[Figure: biological neuron with labelled parts: soma, nucleus, axon, synapse, dendrites]

2. Model of a neuron
Biological inspiration

Incoming signals from other neurons determine whether the neuron fires
The contribution of each signal to the output depends on the strength of the synaptic connection, that is, on the attenuation or amplification in the synapse
Dendrites receive the amplified signals through the synapses and send them to the cell body, where they are added
When the sum exceeds a threshold, the neuron output is active (firing)

2. Model of a neuron
Biological inspiration

Some of the brain properties are:

Neuron speed              milliseconds
Number of neurons         10^11 - 10^12
Number of connections     10^3 - 10^4 per neuron
Number of synapses        10^14 - 10^16 in the nervous system
Distributed control       no CPU
Fault tolerant            graceful degradation
Low power consumption     no batteries (but food)

2. Model of a neuron
McCulloch & Pitts Model (1943)

The McCulloch-Pitts neuron is a binary neuron with only two states: active (fired/excited) and inactive (not fired/not excited)
Each input takes the value 1 or 0 (binary inputs), where 1 represents true and 0 false. Likewise, the threshold is given a real value (θ), say 1, and the output is 1 if the threshold is met or exceeded and 0 otherwise (binary output)

2. Model of a neuron
McCulloch & Pitts Model (1943)

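W. S. McCulloch and W. Pitts. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5:115-133, 1943.

[Figure: neuron model. The inputs x1, x2, ..., xm are multiplied by the synaptic weights w1, w2, ..., wm, added in the summing junction and passed through the activation function to give the output y. Inset: plot of the sign function.]

$x_i \in \{-1, 1\}$, $w_i \in \{-1, 1\}$, $y \in \{-1, 1\}$

$$y = \begin{cases} 1 & \text{if } \sum_{i=1}^{m} w_i x_i \ge \theta \\ -1 & \text{otherwise} \end{cases}$$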

2. Model of a neuron

McCulloch & Pitts Model (1943)

[Figure: neuron with two inputs x1 and x2, synaptic weights w1 and w2, summing junction, activation function and output]

Determine the parameters (w1, w2 and θ) so that the neuron represents a logical OR function

2. Model of a neuron
McCulloch & Pitts Model (1943)
Some interesting points:

it is the relative magnitudes of the weights and the threshold that are important, not their absolute magnitudes
in the previous example we could scale the weights and the threshold by a factor of 2 and the output would be the same
a realization of the AND function may be achieved by simply changing the threshold to -1
we can specify the weights and then determine the binary logic output

The next important step is the ability to specify the desired output and adjust the weights to achieve that output
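The points above can be checked numerically. The following is a minimal sketch, assuming bipolar inputs in {-1, +1} and a neuron that fires when the weighted sum meets or exceeds the threshold. The function names `mp_neuron` and `truth_table` and the specific parameter values are illustrative choices, not taken from the slides.

```python
from itertools import product

def mp_neuron(x, w, theta):
    """McCulloch-Pitts unit: +1 if sum(w_i * x_i) >= theta, else -1."""
    s = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s >= theta else -1

def truth_table(w, theta):
    """Outputs for the four bipolar input pairs, in a fixed order."""
    inputs = list(product([-1, 1], repeat=2))  # (-1,-1), (-1,1), (1,-1), (1,1)
    return [mp_neuron(x, w, theta) for x in inputs]

# Assumed parameters (hypothetical choice): unit weights, only theta changes.
or_table = truth_table(w=(1, 1), theta=-1)   # fires unless both inputs are -1
and_table = truth_table(w=(1, 1), theta=1)   # fires only when both inputs are +1

# Scaling weights and threshold by the same positive factor changes nothing.
or_scaled = truth_table(w=(2, 2), theta=-2)

print(or_table)               # [-1, 1, 1, 1]
print(and_table)              # [-1, -1, -1, 1]
print(or_scaled == or_table)  # True
```

The sign of the threshold depends on the convention. Under the alternative convention where θ is added to the sum as a bias (fire when Σ w_i x_i + θ ≥ 0), the signs swap: θ = +1 gives OR and θ = -1 gives AND with the same unit weights.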

2. Model of a neuron
Hebb's Rule (1949)
Hebb, D. O. The Organization of Behavior: A Neuropsychological Theory. Wiley-Interscience, New York, 1949.

Biological rule: the resistance of a synapse to the incoming signal can change during a learning process
If an input of a neuron repeatedly and persistently causes the neuron to fire, a metabolic change happens in the synapse of that particular input that reduces its resistance

Learning is not an inherent property of the neuron; it is due to the modification of the synapses

2. Model of a neuron
Hebb's Rule (1949)

Hebbian learning rule: a change in the strength of a connection is a function of the pre- and postsynaptic neural activities
It is a method of determining how to alter the weights between model neurons
If $x_i$ is the output of the presynaptic neuron, $y_j$ the output of the postsynaptic neuron, $w_{ji}$ the strength of the connection between them, and $\eta$ the learning rate, the learning rule takes the form:

$$\Delta w_{ji} = \eta \, x_i \, y_j$$
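As an illustration of the Hebbian rule, here is a minimal sketch of a single update step. The helper name `hebbian_update`, the nested-list weight layout `w[j][i]` and the numeric values are assumptions made for this example.

```python
def hebbian_update(w, x, y, eta=0.1):
    """Return the weights after one Hebbian step: w[j][i] + eta * x[i] * y[j]."""
    return [[w[j][i] + eta * x[i] * y[j] for i in range(len(x))]
            for j in range(len(y))]

# Two presynaptic neurons, one postsynaptic neuron (illustrative values).
w = [[0.0, 0.0]]
x = [1.0, -1.0]   # presynaptic outputs x_i
y = [1.0]         # postsynaptic output y_j

w = hebbian_update(w, x, y, eta=0.5)
print(w)  # [[0.5, -0.5]]
```

The connection from the input that agrees in sign with the output is strengthened, and the other one is weakened.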

2. Model of a neuron
The Perceptron (1958)
Rosenblatt, Frank (1958). The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain. Cornell Aeronautical Laboratory, Psychological Review, 65(6), pp. 386-408.

Combines the McCulloch-Pitts model of an artificial neuron with the Hebbian learning rule for adjusting weights
In addition to the variable weight values, the perceptron model adds an extra input that represents the bias

[Figure: perceptron. The inputs x1, x2, ..., xm with synaptic weights w1, w2, ..., wm, plus a bias input, feed the summing junction; the activation function produces the output y.]

2. Model of a neuron
The Perceptron (1958)
It produces associations between inputs and outputs:

$$\mathbb{R}^m \longrightarrow \{0, 1\}, \qquad P_k \longmapsto T_k, \qquad k = 1 \dots p$$

where $P_k$ are the inputs or patterns and $T_k$ are the outputs or targets
The set of patterns and targets is the learning set or training set:

$$\{P_k, T_k\}_{k = 1 \dots p}$$

2. Model of a neuron
The Perceptron (1958)

Perceptron learning rule: change the weight by an amount proportional to the difference between the desired output and the actual output

$$\Delta w_{ji} = \eta \, (T_{kj} - y_j) \, P_{ki}$$

where $\eta$ is the learning rate, $T_{kj}$ the desired output and $y_j$ the actual output

$$w_{ji}(t+1) = w_{ji}(t) + \eta \, \bigl(T_{kj} - y_j(t)\bigr) \, P_{ki}, \qquad i = 1 \dots m, \quad j = 1 \dots n, \quad k = 1 \dots p$$
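The learning rule can be sketched as a short training loop. This is a minimal example under stated assumptions: a single output neuron (so the index j is dropped), a hard-limit activation, the bias handled as a weight on a constant input of 1, and the OR function as training set. The names `hardlim` and `train_perceptron` and the `max_epochs` safeguard are choices made for this sketch.

```python
def hardlim(s):
    """Hard-limit activation: 1 if s >= 0, else 0."""
    return 1 if s >= 0 else 0

def train_perceptron(patterns, targets, eta=1.0, max_epochs=100):
    """Cycle through the training set applying
    w_i(t+1) = w_i(t) + eta * (T_k - y(t)) * P_ki until no errors remain."""
    m = len(patterns[0])
    w = [0.0] * (m + 1)                      # w[0] is the bias weight
    for epoch in range(max_epochs):
        errors = 0
        for p, t in zip(patterns, targets):
            x = [1.0] + list(p)              # constant input 1 for the bias
            y = hardlim(sum(wi * xi for wi, xi in zip(w, x)))
            if y != t:
                errors += 1
                w = [wi + eta * (t - y) * xi for wi, xi in zip(w, x)]
        if errors == 0:                      # a full pass without mistakes
            return w, epoch + 1
    return w, max_epochs

P = [(0, 0), (0, 1), (1, 0), (1, 1)]
T = [0, 1, 1, 1]                             # logical OR
w, epochs = train_perceptron(P, T)
outputs = [hardlim(w[0] + w[1] * a + w[2] * b) for a, b in P]
print(w, epochs, outputs)
```

When the output is already correct, the factor (T - y) is zero and the weights stay unchanged, so only misclassified patterns move the weights.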

2. Model of a neuron
The Perceptron (1958)
Novikoff, A. B. (1962). On convergence proofs on perceptrons. Symposium on the
Mathematical Theory of Automata, 12, 615-622. Polytechnic Institute of Brooklyn.

Perceptron theorem: if weights that solve the problem exist and this process is repeated cyclically, it converges to them in finite time

$$\text{for all inputs } P_k, \quad \exists\, t < \infty \;\mid\; y_j(t) = T_{kj}$$

2. Model of a neuron
The Perceptron (1958)

Vector notation:

[Figure: perceptron with inputs x0, x1, x2, ..., xm and weights w0, w1, w2, ..., wm]

$$x = [x_0 \; x_1 \; x_2 \; \dots \; x_m]^T, \qquad w = [w_0 \; w_1 \; w_2 \; \dots \; w_m]^T$$

$$y = F(w^T x) = F(x^T w)$$
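The vector form maps directly onto a dot product. The sketch below assumes that x0 is a constant input equal to 1, so that w0 plays the role of the bias, and uses a hard-limit function for F. The helper names and the weight values are illustrative.

```python
def hardlim(s):
    """Hard-limit activation: 1 if s >= 0, else 0."""
    return 1 if s >= 0 else 0

def dot(a, b):
    """Inner product of two equally long vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def perceptron_output(w, inputs, F=hardlim):
    """y = F(w^T x) with x = [x0, x1, ..., xm] and x0 = 1 (assumed bias input)."""
    x = [1.0] + list(inputs)
    return F(dot(w, x))

w = [-1.0, 1.0, 1.0]                  # [w0, w1, w2], illustrative values
y_10 = perceptron_output(w, (1, 0))   # -1 + 1 + 0 = 0  -> 1
y_00 = perceptron_output(w, (0, 0))   # -1 + 0 + 0 = -1 -> 0
print(y_10, y_00)  # 1 0
```

Since the inner product is symmetric, `dot(w, x)` and `dot(x, w)` give the same value, which is the identity F(w^T x) = F(x^T w).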

2. Model of a neuron

The Perceptron (1958)

The equation $w^T x = 0$ can be viewed as the equation of a line. Depending on the values of the weights, this line will separate the four possible inputs into two categories

With $x_0 = 1$, the equation of the line becomes

$$x_2 = -\frac{w_1}{w_2} x_1 - \frac{w_0}{w_2}$$

[Figure: the (x1, x2) plane with example separating lines (labels A, B, C) for the weight sets (w0, w1, w2) = (-1, -1, 1), (1, -1, 1), (1, 1, 1) and (-1, 1, 1)]
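To make the geometric reading concrete, the following sketch computes the slope and intercept of the separating line for one weight set and checks on which side each binary input falls. It assumes x0 = 1 and a non-zero w2; the function names `decision_line` and `side` are hypothetical.

```python
def decision_line(w0, w1, w2):
    """Slope and intercept of w0 + w1*x1 + w2*x2 = 0, solved for x2."""
    if w2 == 0:
        raise ValueError("w2 must be non-zero to solve for x2")
    return -w1 / w2, -w0 / w2

def side(w0, w1, w2, x1, x2):
    """1 if the point lies on the non-negative side of the line, else 0."""
    return 1 if w0 + w1 * x1 + w2 * x2 >= 0 else 0

# Weight set (w0, w1, w2) = (-1, 1, 1): the line x2 = -x1 + 1.
slope, intercept = decision_line(-1, 1, 1)
corners = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [side(-1, 1, 1, a, b) for a, b in corners]
print(slope, intercept, labels)  # -1.0 1.0 [0, 1, 1, 1]
```

With this weight set only (0, 0) falls on the negative side, so the line separates the four inputs as the OR function does.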

2. Model of a neuron
The Perceptron (1958)
Example

$$\mathbb{R}^2 \longrightarrow \{0, 1\}$$

[Figure: the six input points (-1,1), (-1,0), (0,-1), (1,0), (0,1) and (1,1) plotted in the plane, and the same points listed with their classes in {0,1}]

2. Model of a neuron
The Perceptron (1958)
Minsky M L and Papert S A 1969 Perceptrons (Cambridge, MA: MIT Press)

The XOR problem Minsky & Papert (1969)

$$\mathbb{R}^2 \longrightarrow \{0, 1\}$$

[Figure: the four XOR inputs (0,0), (0,1), (1,0) and (1,1) plotted in the plane, and the same points listed with their classes in {0,1}]

2. Model of a neuron
The Perceptron (1958)
The XOR problem Minsky & Papert (1969)

[Figure: the four XOR inputs (0,0), (0,1), (1,0) and (1,1) plotted in the plane, and the same points listed with their classes in {0,1}]

Functions such as XOR require two lines to separate the points into the appropriate classes. Hence we would require multiple layers of neural units to represent this function
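The two-line argument can be illustrated with a small hand-built network. This is a sketch with hypothetical, hand-picked weights (not learned, and not from the slides): two hard-limit units each draw one line, and a third unit combines them. A brute-force search over a small weight grid also finds no single unit that reproduces XOR, which is consistent with XOR not being linearly separable.

```python
from itertools import product

def hardlim(s):
    """Hard-limit activation: 1 if s >= 0, else 0."""
    return 1 if s >= 0 else 0

def unit(w0, w1, w2, x1, x2):
    """Single perceptron unit with bias weight w0."""
    return hardlim(w0 + w1 * x1 + w2 * x2)

def xor_net(x1, x2):
    h_or = unit(-0.5, 1, 1, x1, x2)          # line 1: at least one input is 1
    h_and = unit(-1.5, 1, 1, x1, x2)         # line 2: both inputs are 1
    return unit(-0.5, 1, -1, h_or, h_and)    # fires for "OR but not AND"

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_outputs = [xor_net(a, b) for a, b in points]

# Search a coarse grid of (w0, w1, w2) for a single unit that computes XOR.
grid = [-2, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, 2]
single_unit_solves_xor = any(
    [unit(w0, w1, w2, a, b) for a, b in points] == [0, 1, 1, 0]
    for w0, w1, w2 in product(grid, repeat=3)
)

print(xor_outputs)             # [0, 1, 1, 0]
print(single_unit_solves_xor)  # False
```

The grid search is only an illustration on a finite set of weights; the general impossibility follows from the geometric argument above.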

2. Model of a neuron
The Perceptron (1958)

Synthesis:

Architecture         single-layer feedforward
Transfer function    hardlim
Associations         $\mathbb{R}^m \to \{0, 1\}$
Learning rule        $w_{ji}(t+1) = w_{ji}(t) + \eta \, \bigl(T_{kj} - y_j(t)\bigr) \, P_{ki}$, with $i = 1 \dots m$, $j = 1 \dots n$, $k = 1 \dots p$; the term $\bigl(T_{kj} - y_j(t)\bigr)$ is the error

2. Model of a neuron
Learning
The aim of the learning is to find an association between the
patterns and the targets of the training set

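$$P = \begin{bmatrix} p(1,1) & \cdots & p(1,m) \\ \vdots & \ddots & \vdots \\ p(p,1) & \cdots & p(p,m) \end{bmatrix}$$

Actual output:

$$Y = \begin{bmatrix} y(1,1) & \cdots & y(1,n) \\ \vdots & \ddots & \vdots \\ y(p,1) & \cdots & y(p,n) \end{bmatrix}$$

Desired output:

$$T = \begin{bmatrix} t(1,1) & \cdots & t(1,n) \\ \vdots & \ddots & \vdots \\ t(p,1) & \cdots & t(p,n) \end{bmatrix}$$

$$E = T - Y = \begin{bmatrix} t(1,1) - y(1,1) & \cdots & t(1,n) - y(1,n) \\ \vdots & \ddots & \vdots \\ t(p,1) - y(p,1) & \cdots & t(p,n) - y(p,n) \end{bmatrix} = [0]$$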

2. Model of a neuron
Learning
so the aim of the learning is to find the weights that minimize the error, but how can we measure the error E = [0]?

Objective function (least-squares error):

$$e = e(w) = \frac{1}{2} \sum_{k=1}^{p} \sum_{j=1}^{n} E^2(k,j) = \frac{1}{2} \sum_{k=1}^{p} \bigl[\operatorname{diag}(E E^T)\bigr]_k$$
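The objective function reduces the error matrix to a single number. The sketch below computes it for small, made-up target and output matrices stored as nested lists; the function name `lse` and the data are assumptions for illustration.

```python
def lse(T, Y):
    """e = 1/2 * sum over k and j of (T[k][j] - Y[k][j])**2."""
    return 0.5 * sum((t - y) ** 2
                     for t_row, y_row in zip(T, Y)
                     for t, y in zip(t_row, y_row))

T = [[0], [1], [1], [1]]   # desired outputs (p = 4 patterns, n = 1 output)
Y = [[1], [1], [0], [1]]   # actual outputs, illustrative values

e = lse(T, Y)              # two wrong patterns: 0.5 * (1 + 1)
e_perfect = lse(T, T)      # E = [0] gives zero error
print(e, e_perfect)  # 1.0 0.0
```

The value is zero exactly when every entry of E is zero, so minimizing e drives the outputs toward the targets.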

2. Model of a neuron
Learning

It is possible to define the problem as a function approximation

$$\min e = e(w) = \frac{1}{2} \sum_{k=1}^{p} \sum_{j=1}^{n} E^2(k,j) = \frac{1}{2} \sum_{k=1}^{p} \sum_{j=1}^{n} \bigl(T(k,j) - Y(k,j)\bigr)^2$$

where

$$Y(k,j) = F\Bigl(\sum_{i=0}^{m} w(j,i) \, P(k,i)\Bigr)$$

Classification: classifier function
Regression: approximation function

[Figure: error surface]
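Viewing e as a function of the weights can be sketched by evaluating it at different weight matrices. This example assumes a hard-limit F, a constant input P(k,0) = 1 prepended to every pattern for the bias term, and the OR function as training set; `forward`, `objective` and the two weight matrices are hypothetical choices.

```python
def hardlim(s):
    """Hard-limit activation: 1 if s >= 0, else 0."""
    return 1 if s >= 0 else 0

def forward(W, P, F=hardlim):
    """Y[k][j] = F(sum_i W[j][i] * P[k][i]), with P[k][0] = 1 prepended."""
    return [[F(sum(wji * pki for wji, pki in zip(w_j, [1.0] + list(p_k))))
             for w_j in W]
            for p_k in P]

def objective(W, P, T):
    """e(w) = 1/2 * sum over k and j of (T[k][j] - Y[k][j])**2."""
    Y = forward(W, P)
    return 0.5 * sum((t - y) ** 2
                     for t_row, y_row in zip(T, Y)
                     for t, y in zip(t_row, y_row))

P = [(0, 0), (0, 1), (1, 0), (1, 1)]
T = [[0], [1], [1], [1]]                       # OR targets

e_zero = objective([[0.0, 0.0, 0.0]], P, T)    # all-zero weights: one error
e_or = objective([[-1.0, 1.0, 1.0]], P, T)     # weights that realize OR
print(e_zero, e_or)  # 0.5 0.0
```

Each weight matrix is one point on the error surface, and learning amounts to moving toward points where e is smaller.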