HOPFIELD NETWORKS
Kinds of Neural Networks
The perceptron: a single unit, limited in what it can learn.
Multi-layer feed-forward network: can learn many different
patterns. In particular, this is applied to supervised learning problems
where we have a set of data which has been put into classes.
Competitive learning and Kohonen self-organizing maps:
designed for unsupervised learning, that is, extracting patterns from
datasets without pre-classification.
Content-addressable (aka associative) networks, such as Hopfield
networks, provide a kind of memory which can retrieve wholes from
parts.
Recurrent networks allow neural networks to learn sequences of
actions in time.
Overall Structure of Neural Networks
Among all competing nodes, only one will win and all others will lose.
We mainly deal with single-winner winner-take-all (WTA), but multiple-winner WTA is possible
(and useful in some applications).
The easiest way to realize WTA is to have an external, central arbitrator (a program)
decide the winner by comparing the current outputs of the competitors (breaking
ties arbitrarily).
This is biologically unsound (no such external arbitrator exists in the biological nervous
system).
Architectures (1): Feed-Forward Networks
Architectures (3): Associative Networks
There is no hierarchical arrangement
The connections can be bidirectional
Hopfield Networks
Introduction to Hopfield Networks:
The Hopfield network was developed by John Hopfield in 1982 and had a great impact on
the field of neural networks. The network was mathematically sophisticated and based on a
coherent theoretical picture.
In 1982, J.J. Hopfield brought together several earlier ideas concerning these
networks and presented a complete mathematical analysis (based on Ising spin
models). It is for this reason that this network is generally referred to as the
Hopfield network.
There are two main approaches for solving combinatorial optimization
problems using ANNs: Hopfield Networks and Kohonen's Self-Organizing Feature
Maps.
While the latter are mainly used in Euclidean problems, Hopfield networks
have been widely applied to different classes of combinatorial optimization
problems.
Model of the Hopfield Network:
The Hopfield model is used as an auto-associative memory to store and recall
a set of bitmap images. It performs associative recall of images: given an incomplete or
corrupted version of a stored image, the network can recall the original.
A Hopfield net is a form of recurrent artificial neural network invented by
John Hopfield. Hopfield nets serve as content-addressable memory systems
with binary threshold units. They are guaranteed to converge to a local
minimum, but convergence to one of the stored patterns is not guaranteed.
The Hopfield network demonstrates how the mathematical simplification of a
neuron can allow the analysis of the behaviour of large-scale NNs.
Hopfield provided the important link between local interactions & global
behaviour.
NNs are complex and often contain nonlinear components, and hence the
behavior of a NN is difficult to analyze.
Hopfield applied ideas from an important and developing area of mathematics called
nonlinear systems theory.
Almost all the networks discussed in the previous units were non-recurrent,
i.e., there is no feedback from the outputs of the network to their inputs.
Non-recurrent networks have a repertoire of behavior that is limited compared
to their recurrent counterparts.
As recurrent networks have feedback paths from their outputs to their inputs,
the response of such networks is dynamic, i.e., after applying a new input,
the output is calculated and fed back to modify the input.
The output is then recalculated, and this process is repeated again and again.
If the network is stable, successive iterations produce smaller and smaller
output changes until eventually the output becomes constant.
For many networks, the process never ends; such networks are said to be
unstable. Unstable networks possess interesting properties and are usually
studied as examples of chaotic systems.
Stability problems stymied early researchers. No one was able to predict
which networks would be stable and which would change.
Moreover, the problem appeared so difficult that many researchers were
pessimistic about finding a solution.
Fortunately, a powerful theorem that defines a subset of recurrent
networks whose outputs eventually reach a stable state was devised by
Cohen & Grossberg.
This opened the door to further research. Many scientists today are exploiting
the complicated behaviour and capabilities of these systems.
Si: 1 to 0 transition
If si initially equals 1 and Σj wij sj < θi,
then si goes from 1 to 0 with all other sj held constant.
The energy gap, or change in E, is given, for symmetric wij, by:
∆E = Σj wij sj − θi < 0
On every update we have ∆E ≤ 0.
Minimizing Energy
On every update we have ∆E ≤ 0.
Hence the dynamics of the net tend to move E toward a minimum.
We stress that there may be several such states: they are local minima, and global
minimization is not guaranteed.
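A minimal numerical sketch of this energy descent (illustrative only: the random symmetric weights, {0, 1} states, and zero thresholds below are assumptions, not taken from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
a = rng.normal(size=(n, n))
w = (a + a.T) / 2             # symmetric weights: w_ij = w_ji ...
np.fill_diagonal(w, 0.0)      # ... with zeros on the diagonal (no self-feedback)
theta = np.zeros(n)           # thresholds

def energy(s):
    # E = -1/2 * sum_ij w_ij s_i s_j + sum_i theta_i s_i
    return -0.5 * s @ w @ s + theta @ s

s = rng.integers(0, 2, size=n).astype(float)    # random initial {0, 1} state
e0 = energy(s)
e_prev = e0
for _ in range(100):
    i = rng.integers(n)                          # asynchronous: one unit at a time
    s[i] = 1.0 if w[i] @ s > theta[i] else 0.0   # threshold update of unit i
    e = energy(s)
    assert e <= e_prev + 1e-12                   # Delta E <= 0 on every update
    e_prev = e
```

Each update either lowers the energy or leaves it unchanged, so the state settles into a local minimum, as described above.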
➢ Recurrent (feedback) network: no self-feedback loops
• Learning process
• Let ξ1, ξ2, ξ3, ..., ξM denote a known set of N-dimensional memories. The weight matrix is
W = (1/N) ( Σ_{µ=1}^{M} ξµ ξµᵀ − M I )
– Update asynchronously (i.e., randomly and one at a time)
according to the rule
xj(n+1) = sgn( Σi wji xi(n) )
– Once a fixed point is reached, the output is y = x_fixed.
• Associative memory energy:
E = −(1/2) Σj Σi ωji xj xi
∆Ej = Ej(n+1) − Ej(n) = −∆xj Σ_{i≠j} ωji xi
– N = 3 example (figure)
Limitations of the Hopfield model:
1) The stored memories are not always stable.
➢ The signal-to-noise ratio ≅ N/M for large M.
2) There may be stable states that were not the stored memories.
(Spurious states)
3) The stable state reached may not be the state that is most similar to the
input state.
Hopfield pattern recognition
Stored P different patterns: ξiµ ( µ = 1, 2, ..., P )
Input pattern: a 10% reversal of ξi¹ (overlap m = 0.8)
Output pattern Si, with overlap Ψ = (1/N) Σi Si ξi¹
Discrete Hopfield NN:
Input vector values are in {−1, 1} (or {0, 1}).
The number of neurons is equal to the input dimension.
Every neuron has a link from every other neuron (recurrent architecture)
except itself (no self-feedback).
The neuron state at time n is its output value.
The network state at time n is the vector of neurons states.
The activation function used to update a neuron state is the sign
function, except that if the input of the activation function is 0, the new
output (state) of the neuron equals the old one.
Weights are symmetric: Wij = Wji
The fundamental memories fµ are used to compute the weights;
fµ,i denotes the i-th component of the µ-th fundamental memory.
Training Hopfield NN
1. Storage. Let f1, f2, … , fM denote a known set of N-dimensional
fundamental memories. The weights of the network are:
w_ji = (1/N) Σ_{µ=1}^{M} f_{µ,i} f_{µ,j}   for j ≠ i
w_ji = 0                                    for j = i
0 j =i
where wji is the weight from neuron i to neuron j. The elements of
the vector fμ are in {-1,+1}. Once they are computed, the synaptic
weights are kept fixed.
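The storage step can be sketched as follows (a minimal illustration; the two example memories are invented for the demo):

```python
import numpy as np

def store(memories):
    # Hebbian storage: w_ji = (1/N) * sum_mu f_mu,i * f_mu,j for j != i,
    # and w_jj = 0 (no self-feedback).
    _, n = memories.shape
    w = memories.T @ memories / n
    np.fill_diagonal(w, 0.0)
    return w

f = np.array([[1, -1, 1, -1],
              [1, 1, -1, -1]], dtype=float)   # two fundamental memories
w = store(f)

# Each stored memory should be a fixed point of the sign-function update.
for mem in f:
    assert np.array_equal(np.sign(w @ mem), mem)
```

Once computed, `w` stays fixed; retrieval only updates the neuron states.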
4. Outputting. Let x_fixed denote the fixed point, or stable state
(such that x(n+1) = x(n)), computed at the end of step 3.
The resulting output y of the network is:
y = x_fixed
RECURRENT NETWORKS & BINARY SYSTEMS:
This network consists of two layers.
Although this network is somewhat different from the format found in the work of
Hopfield and others, it is still functionally equivalent to the model.
Layer 0, as in the networks discussed in previous units, serves no
computational function. It simply distributes the network outputs back
to the inputs.
Each layer-1 neuron, on the other hand, computes the weighted sum of
its inputs, producing a NET signal that is then operated on by the
nonlinear function F to yield the OUT signal.
In the earlier works of Hopfield, F was a simple threshold:
the output of such a neuron is one if the weighted sum of the outputs of
the other neurons is greater than a threshold Tj, and zero otherwise.
NETj = Σ_{i≠j} Wij OUTi + INj
OUTj = 1 if NETj > Tj
OUTj = 0 if NETj < Tj
OUTj remains unchanged if NETj = Tj.
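As a rough sketch, the update rules above can be written in plain Python (the three-neuron weights, inputs, and thresholds below are invented for illustration):

```python
def update_neuron(out, w, ext_in, t, j):
    """One asynchronous update of neuron j following the rules above:
    OUT_j -> 1 if NET_j > T_j, 0 if NET_j < T_j, unchanged at the threshold."""
    net = sum(w[i][j] * out[i] for i in range(len(out)) if i != j) + ext_in[j]
    if net > t[j]:
        return 1
    if net < t[j]:
        return 0
    return out[j]

# Hypothetical 3-neuron example:
w = [[0, 1, -1],
     [1, 0, 1],
     [-1, 1, 0]]
out = [1, 0, 1]
ext_in = [0, 0, 0]
t = [0, 0, 0]
out[1] = update_neuron(out, w, ext_in, t, 1)   # NET_1 = 1*1 + 1*1 = 2 > 0 -> 1
```

Note the tie rule: at NET_j = T_j the neuron simply keeps its previous state.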
STABILITY
The weights between layers in this network may be considered to form a
matrix W.
Cohen & Grossberg proved that such recurrent networks are stable if
the matrix is symmetric with zeros on its main diagonal, i.e., a matrix with
the property Wij = Wji for i ≠ j
and Wii = 0 for all i.
The stability of such a network can be proved by an elegant mathematical
technique.
Suppose that a function can be found which always decreases each time
the network changes state.
Eventually this function must reach a minimum and stop, thereby
ensuring that the network is stable.
Such a function is the network energy
E = −(1/2) Σj Σ_{i≠j} Wij OUTi OUTj − Σj Ij OUTj + Σj Tj OUTj
where E is the artificial network energy, Wij is the weight from the output of
neuron i to the input of neuron j,
OUTj is the output of neuron j
Ij is the external input to neuron j
Tj is the threshold of neuron j.
The change in energy E, due to a change in the state of neuron j, is given
by:
δE = −[ Σ_{i≠j} Wij OUTi + Ij − Tj ] δOUTj
   = −[ NETj − Tj ] δOUTj
Here δOUTj is the change in the output of neuron j.
Case 1: the NET value of neuron j is greater than the threshold
value. This makes the term in brackets positive, and by the update rule
NETj = Σ_{i≠j} Wij OUTi + INj
the output of neuron j must change in the positive direction or else
remain constant. This implies that δOUTj can only be positive or zero,
so δE must be negative or zero. Therefore the network energy must either
decrease or stay constant.
Case 2: NETj is less than the threshold value. Then δOUTj can be
only negative or zero. Hence the energy is again restricted to either
decrease or stay constant.
Case 3: if NETj equals the threshold, then δOUTj is zero and the energy
remains unchanged.
This means that any change in the state of a neuron will either reduce
the energy or maintain its current value.
Since the energy shows a continuous downward trend, it must eventually
reach a minimum and stop at that point. (By definition such networks
are said to be stable.)
The network (weight) symmetry criterion is sufficient, but not necessary,
to define a stable system.
There are many stable systems which do not satisfy this criterion, e.g., all
feed-forward networks.
Even though a deviation from symmetry can in principle produce continuous oscillations,
approximate symmetry is usually adequate to produce stable
systems.
Bidirectional Recurrent Neural Networks
One of the methods used to try to overcome these limitations consists of
using bidirectional recurrent neural networks (BRNNs).
An RNN is a neural network that allows "backward" connections. This
means that it can re-process its output.
Our brain, as you can surely guess, is a recurrent network.
Sadly, the issues of RNN training and architecture are too complicated
for this lecture. An intuitive explanation will be presented, however.
The Network structure is:
The input vector It encodes the external input at time t. In the
simplest case, it encodes one amino acid, using orthogonal encoding. The output is
Ot = η(Ft, Bt, It)
Ft and Bt store information about the “past” and the “future” of the
sequence. They make the whole difference, because now we can utilize
global information.
The functions satisfy the recurrent bidirectional equations:
Ft = φ(Ft-1, It)
Bt = β(Bt+1, It)
where φ() and β() are learnable nonlinear state transition
functions, implemented by two NNs (left and right subnetworks in the
picture).
Intuitively, we can think of Ft and Bt as “wheels” that can be rolled
along the protein.
To predict the class at position t, we roll the wheels in opposite
directions from the N- and C-terminus up to position t and then combine
what is read on the wheels with It to calculate the proper output using η.
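A minimal sketch of the two recursions (illustrative only: random tanh layers stand in for the learned transition functions φ and β, and the combination step η is replaced by simple concatenation):

```python
import numpy as np

rng = np.random.default_rng(1)
T, d_in, d_state = 6, 4, 3
I = rng.normal(size=(T + 1, d_in))       # I[1..T]; index 0 unused

Wf = 0.1 * rng.normal(size=(d_state, d_state + d_in))  # stand-in for phi
Wb = 0.1 * rng.normal(size=(d_state, d_state + d_in))  # stand-in for beta

F = np.zeros((T + 2, d_state))           # F_t = phi(F_{t-1}, I_t): rolled left to right
for t in range(1, T + 1):
    F[t] = np.tanh(Wf @ np.concatenate([F[t - 1], I[t]]))

B = np.zeros((T + 2, d_state))           # B_t = beta(B_{t+1}, I_t): rolled right to left
for t in range(T, 0, -1):
    B[t] = np.tanh(Wb @ np.concatenate([B[t + 1], I[t]]))

# The output at position t combines past, future, and present information:
t = 3
o_t = np.concatenate([F[t], B[t], I[t]])  # would be fed to eta in a real BRNN
```

The two "wheels" F and B are computed by sweeping the sequence in opposite directions, exactly as described in the rolling analogy above.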
How a Boltzmann Machine Models Data
It is not a causal generative model (like a sigmoid belief net) in which we
first generate the hidden states and then generate the visible states
given the hidden ones.
To generate a sample from the model, we just keep stochastically
updating the binary states of all the units.
Restricted Boltzmann Machines
We restrict the connectivity to make learning easier.
Boltzmann Machine
The Boltzmann machine is a stochastic version of the Hopfield model.
It is used for optimization problems such as the classic traveling salesman
problem.
Those are only a few of the more common network structures.
Advanced users can build networks designed for a particular problem in
many software packages readily available on the market today.
The Boltzmann machine is similar in function and operation to the
Hopfield network with the addition of using a simulated annealing
technique when determining the original pattern.
The Boltzmann machine incorporates the concept of simulated
annealing to search the pattern layer's state space for a global
minimum. Because of this, the machine will gravitate to an improved
set of values over time as data iterates through the system.
Ackley, Hinton, and Sejnowski developed the Boltzmann learning
rule in 1985. Like the Hopfield network, the Boltzmann machine has an
associated state space energy based upon the connection weights in the
pattern layer.
The processes of learning a training set full of patterns involves the
minimization of this state space energy. Because of this, the machine
will gravitate to an improved set of values for the connection weights
while data iterates through the system.
The Boltzmann machine requires a simulated annealing schedule, which
is added to the learning process of the network. Just as in physical
annealing, temperatures start at higher values and decrease over time.
The increased temperature adds an increased noise factor into each
processing element in the pattern layer. Typically, the final temperature
is zero. If the network fails to settle properly, adding more iterations at
lower temperatures may help it reach an optimum solution.
A Boltzmann machine learning at high temperature behaves much like a
random model and at low temperatures it behaves like a deterministic model.
Because of the random component in annealed learning, a processing element
can sometimes assume a new state value that increases rather than
decreases the overall energy of the system. This mimics physical annealing
and is helpful in escaping local minima and moving toward a global minimum.
As with the Hopfield network, once a set of patterns has been learned, a partial
pattern can be presented to the network and it will complete the missing
information. The limitation on the number of classes (less than fifteen
percent of the total processing elements in the pattern layer) still applies.
For higher-order Boltzmann machines, the quadratic energy function can
be replaced by an energy function whose typical term is si sj sk wijk.
Non-binary units
The binary stochastic units used in Boltzmann machines can be generalized to
"softmax" units that have more than 2 discrete values, Gaussian units whose
output is simply their total input plus Gaussian noise, binomial units, Poisson
units, and any other type of unit that falls in the exponential family (Welling
et al., 2005). This family is characterized by the fact that the adjustable
parameters have linear effects on the log probabilities. The general form of
the gradient required for learning is simply the change in the sufficient
statistics caused by clamping data on the visible units.
In the quadratic energy function E = −Σ_{i<j} wij xi xj, wij is the weight of the
connection and xi, xj are the states of the units Xi and Xj.
Simulated annealing
Simulated annealing is a general method for making likely the escape
from local minima by allowing jumps to higher energy states.
A unit Xi does not necessarily change its state; the probability of the net
accepting a change in state for Xi is
P = 1 / (1 + exp(−∆E/T))
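The acceptance probability can be sketched directly (the temperature values in the demo are invented):

```python
import math

def accept_probability(delta_e, temperature):
    # P = 1 / (1 + exp(-delta_e / T)): the state-change acceptance rule.
    return 1.0 / (1.0 + math.exp(-delta_e / temperature))

# High temperature: near-random behaviour (probability close to 1/2).
assert abs(accept_probability(1.0, 1e6) - 0.5) < 1e-3
# Low temperature: near-deterministic behaviour (a hard 0/1 decision).
assert accept_probability(1.0, 0.01) > 0.999
assert accept_probability(-1.0, 0.01) < 0.001
```

The assertions illustrate the annealing behaviour described later in this section: random at high temperature, deterministic at low temperature.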
Boltzmann Machine Learning Rule
The learning rule was proposed by Ackley, Hinton, and Sejnowski in 1985.
Boltzmann Machine Structure
The Boltzmann Machine is a Hopfield network in which some of the neurons
are hidden. Learning alternates between two phases: the clamped phase, in
which the visible neurons are held fixed to a training pattern, and
the free-running phase, in which only the inputs are held fixed and the
other neurons are allowed to vary.
These phases iterate until learning has created a Boltzmann Machine
which can be said to have learned the input patterns and will converge
to the learned patterns when a noisy or incomplete pattern is presented.
Generalized Networks: Clamped Phase
Generally the initial weights of the net are randomly set to values in a small
range e.g. -0.5 to +0.5.
Then an input pattern is presented to the net and clamped to the visible
neurons.
P(sj → −sj) = 1 / (1 + exp(−∆E/T))
The activation passing can continue till the net reaches a low energy state.
But the net will reach a state of thermal equilibrium in which individual
neurons still change state, and the probability of any single state can be
calculated.
For a system in any state α, with associated energy Eα, at temperature
T the probability is given by the Boltzmann distribution:
P(α) = exp(−Eα/T) / Σβ exp(−Eβ/T)
As the temperature is gradually dropped, the net settles into as low an energy
state as it can at each temperature. The correlations
between the firing of pairs of neurons at the final temperature are then measured:
ρ⁺ij = ‹sj si›⁺
where '+' indicates that the correlations are measured while the visible
neurons are clamped.
After presentation of the input patterns all neurons can update their
states and the annealing schedule is performed (as before).
And again, the correlations between the firing of pairs of neurons at the
final temperature:
ρ⁻ij = ‹sj si›⁻
where '−' indicates that the correlations are measured when the visible
neurons are not in a clamped state.
Learning Phase
Here we use the Boltzmann Machine's learning rule to update the
weights:
∆wij = η ( ρ⁺ij − ρ⁻ij )
where η is a learning-rate constant.
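A sketch of this weight update (the learning rate and the correlation values below are invented for illustration):

```python
import numpy as np

eta = 0.1                                  # learning rate (assumed value)
# Correlations of paired firing measured at the final temperature:
rho_plus = np.array([[0.0, 0.8],           # clamped phase
                     [0.8, 0.0]])
rho_minus = np.array([[0.0, 0.2],          # free-running phase
                      [0.2, 0.0]])

# Boltzmann learning rule: delta_w_ij = eta * (rho+_ij - rho-_ij)
delta_w = eta * (rho_plus - rho_minus)
```

Weights grow where the clamped correlations exceed the free-running ones, and shrink in the opposite case.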
Applications
The weighted matching problem:
A set of N points with a known "distance" between each pair.
Graph bipartitioning:
A set of points is to be split into two disjoint sets with as low
an associated cost as possible.
Associative Memories
Motivation
In early memory models, capacity was limited to the length of the
memory and allowed for negligible input distortion (old CAMs).
Morphological Memories
Formulated using Mathematical Morphology Techniques
Image Dilation
Image Erosion
Training Constructs Two Memories: M and W
M used for recalling dilated patterns
W used for recalling eroded patterns
M and W are not sufficient…Why?
General distorted patterns are both dilated and eroded
solution: hybrid approach
Incorporate a kernel matrix, Z, into M and W
General distorted pattern recall is now possible!
Input → MZ → WZ → Output
Improving Limitations
Experiment
Construct a binary morphological auto-associative memory to
recall bitmap images of capital alphabetic letters
Use Hopfield Model for baseline
Construct letters using the Microsoft Sans Serif font (block
letters) and the Math5 font (cursive letters)
Attempt recall 5 times for each pattern for each image
distortion at 0%, 2%, 4%, 8%, 10%, 15%, 20%, and 25%
Use different memory sizes: 5 images, 10, 26, and 52
Use Average Recall Rate per memory size as a performance
measure, where recall is correct if and only if it is perfect
Results
Morphological Model and Hopfield Model:
Both degraded in performance as memory size increased
Both recalled letters in the Microsoft Sans Serif font better than the Math5
font
Morphological Model:
Always perfect recall with 0% image distortion
Performance smoothly degraded as memory size and distortion
increased
Hopfield Model:
Never correctly recalled images when memory contained more than 5
images
HOPFIELD NETS AND OPTIMIZATION: Traveling Salesman Problem
To design Hopfield nets to solve optimization problems: given a
problem, choose weights for the network so that E is a measure of the
overall constraint violation. A famous example is the traveling salesman
problem.
Hopfield and Tank (1986) constructed VLSI chips for such networks
which do indeed settle incredibly quickly to a local minimum of E.
Unfortunately, there is no guarantee that this minimum is an optimal
solution to the traveling salesman problem.
Experience shows it will be "a pretty good approximation," but
conventional algorithms exist which yield better performance.
The salesman wishes to find a way to visit the cities that is optimal
in two ways: each city is visited only once, and the total route is as
short as possible.
Exponential Complexity
Why is exponential complexity a problem?
exp(1) = 2.72
exp(250,000) ≈ 10^108,573
Solution representation:
Construct a Hopfield network with N² nodes, where node nia is "on" when
city i occupies position a in the tour.
Tour length:
L = (1/2) Σ_{i,j,a} dij nia ( nj,a+1 + nj,a−1 )
Energy function that enforces the constraints:
Σa nia = 1 for all i,   Σi nia = 1 for all a
Connection weights:
Nodes within each row are connected with weight −γ.
Each node is connected to the nodes in the columns to its left and right
with weight −dij.
Continuous activation
The complete energy function combines the tour length with the constraint
penalties:
H = (1/2) Σ_{i,j,a} dij nia ( nj,a+1 + nj,a−1 ) + (γ/2) Σa ( 1 − Σi nia )² + (γ/2) Σi ( 1 − Σa nia )²
TSP Network Connections
(Figure: an N × N grid of units for the combinatorial optimization of the TSP
(Travelling Salesman Problem), with rows labelled by city (A–J) and columns
by tour position (1–10).)
The TSP problem

xij = 1 if arc i–j is in the tour, 0 otherwise.

Minimize Σ_{i=1}^{n} Σ_{j=1}^{n} cij xij
subject to Σ_{i=1}^{n} xij = bj = 1   (j = 1, ..., n)
           Σ_{j=1}^{n} xij = ai = 1   (i = 1, ..., n)
           X = (xij) ∈ S
           xij = 0 or 1   (i, j = 1, ..., n)

cij is the cost of moving, or the distance, from node i to node j.
Hopfield and Tank (1985) showed how this problem can be solved by a
recurrent network.
The first step is to map the problem onto the network so that solutions
correspond to states of the network.
The problem for N cities is coded into an N by N network. The next
step is to construct an energy function that can eventually be rewritten
in the form
E = −(1/2) Σ_{i,j} wij xi xj
and has minima associated with states that are valid solutions.
Let nodes be indexed according to their row and column, so that yxi is
the output of the node for city x in tour position i, and consider the sum
Σx Σi Σ_{j≠i} yxi yxj
Each term is the product of a pair of outputs for a single city at different
tour positions. This term will tend to encourage rows to contain at most
a single "on" unit. Similar terms may be constructed to encourage single
units being "on" in columns, the existence of exactly ten units "on" in
the net, and to foster the shortest tour.
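The row term above can be evaluated directly; a minimal sketch (the 4-city matrices below are invented examples):

```python
import numpy as np

def row_constraint(y):
    # sum_x sum_i sum_{j != i} y_xi * y_xj: zero exactly when each row (city)
    # has at most one unit "on"; uses (sum y)^2 - sum y^2 = cross terms only.
    return float(sum(row.sum() ** 2 - (row ** 2).sum() for row in y))

valid = np.eye(4)               # a permutation matrix: one city per position
invalid = np.ones((4, 4))       # every unit on: heavily penalised
assert row_constraint(valid) == 0.0
assert row_constraint(invalid) == 4 * (16 - 4)
```

Weighting such terms negatively in the energy function makes the network's stable states favour valid tours.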
These states can be seen as 'dips' in energy space. When the network is cued
with a noisy or incomplete test pattern, it will recover the incorrect or missing
data by iterating to a stable state which is in some sense 'near' to the cued
pattern.
Associative Memory
Human memory operates in an associative manner; that is, a portion of a
recollection can produce a larger related memory.
A recurrent network forms an associative memory. Like human memory, a
portion of the desired data is supplied and the full data “memory” is returned.
To make an associative memory using a recurrent network, the weights must
be selected to produce energy minima at desired vertexes of the unit
hypercube.
Hopfield (1984) has developed an associative memory in which the outputs
are continuous, ranging from +1 to -1, corresponding to the binary values 0
and 1, respectively. The memories are encoded as binary vectors and stored
in the weights according to the formula that follows:
Wij = Σ_{d=1}^{m} OUTi,d OUTj,d
where m = the number of desired memories (output vectors),
d = the index of a desired memory (output vector),
OUTi,d = the i-th component of the d-th desired output vector.
This expression may be clarified by noting that the weight array W can be
found by calculating the outer product of each desired vector with itself (if the
desired vector has n components, this operation forms an n-by-n matrix) and
summing all of the matrices thus formed:
W = Σi Diᵀ Di, where Di is the i-th desired row vector.
Once the weights are determined, the network may be used to produce the
desired output vector, even given an input vector that may be partially
incorrect or incomplete. To do so, the outputs of the network are first forced to
the values of this input vector.
The input vector is then removed, and the network is allowed to "relax" toward the
closest deep minimum.
Note that the network follows the local slope of the energy function, and it
may become trapped in a local minimum and not find the best solution in a
global sense.
The Hopfield network implements a so-called associative (also called content
addressable) memory.
A collection of patterns called fundamental memories is stored in the NN by
means of weights.
Each neuron represents an attribute (dimension) of the input.
The weight of the link between two neurons measures the correlation between
the two corresponding attributes over the fundamental memories. If the
weight is high then the corresponding attributes are often equal in the
fundamental memories.
Bi-directional Associative Memory
This network model was developed by Bart Kosko and again generalizes the
Hopfield model. A set of paired patterns are learned with the patterns
represented as bipolar vectors. Like the Hopfield, when a noisy version of one
pattern is presented, the closest pattern associated with it is determined.
It has as many input as output processing nodes. The two hidden layers
are made up of two separate associative memories and represent the
sizes of the two input vectors.
The two lengths need not be the same, although this example shows
identical input vector lengths of four each. The middle layers are fully
connected to each other.
The input and output layers are, for implementation purposes, the means
to enter and retrieve information from the network. Kosko's original work
targeted the bi-directional associative memory layers for optical
processing, which would not need formal input and output structures.
This state, provided the network is not overtrained, corresponds to the
closest learned association and will generate the original training
pattern on the output.
Characteristics of BAM
BAMs have the capability to generalize. For example, consider an incomplete or
partially incorrect vector applied at A. The network then tends to produce the
closest memory at B, which in turn tends to correct the errors in A. Though it might
take several passes, the network converges to the nearest stored memory.
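This back-and-forth recall can be sketched as follows (the paired bipolar patterns are invented; W is formed as the sum of outer products of the pairs, as described later in this section):

```python
import numpy as np

# Two invented bipolar pattern pairs (A_k, B_k).
A = [np.array([1, -1, 1, -1]), np.array([1, 1, -1, -1])]
B = [np.array([1, 1, -1]), np.array([-1, 1, 1])]

# W is the sum of the outer products of the paired patterns.
w = sum(np.outer(a, b) for a, b in zip(A, B))

# One forward/backward pass: A -> B multiplies by W from one side,
# B -> A from the other (the transpose relationship).
b = np.sign(A[0] @ w)
a = np.sign(w @ b)
assert np.array_equal(b, B[0]) and np.array_equal(a, A[0])
```

Repeating the two passes implements the multi-pass convergence described above; here a single round trip already reproduces the stored pair.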
Feedback systems rarely stabilize: they are prone to oscillations, that is, they wander
from state to state, never reaching stability. But Kosko has proven that all BAMs are
unconditionally stable for any weight matrix.
This is also a very important characteristic, and it arises from the transpose
relationship used between the two weight matrices. It ensures that any set
of associations may be learned without the risk of instability.
Also, there is a close relationship between the BAM and the Hopfield network: if the
weight matrix W is made square and symmetric, then W = Wᵀ.
Like the Hopfield network, the BAM has restrictions on the maximum number of
associations it can accurately recall. If this limit is exceeded, the network may produce
incorrect outputs.
In general, the maximum number of stored associations cannot exceed the number
of neurons in the smaller layer. But by choosing an appropriate threshold for each
neuron, the number of stable states can be made anything from 1 to 2ⁿ, where n is
the number of neurons in the smaller layer.
TYPES OF BAM
Though BAMs have many problems, they remain a subject of research
because of their simplicity and the property that they can be implemented
using large integrated circuits (either analog or digital).
Continuous BAM
Adaptive BAM
Competitive BAM
Continuous BAM
The neurons in layers 1 and 2 are considered to be synchronous; that is, all the
neurons contain memory, and all of them change state simultaneously
upon the occurrence of a pulse from a central clock. In an asynchronous
system, by contrast, any neuron is free to change state at any time, whenever its input
indicates that it should do so.
In the BAMs discussed so far, a simple threshold has been used as the neuron's
activation function, producing a discontinuity in the neuron's transfer function. Both
synchronous operation and discontinuous functions are biologically
implausible and quite unnecessary.
Continuous, asynchronous BAMs overcome both of these limitations and function
in much the same way as the discrete version. It might appear that such BAMs
would suffer from instability, but fortunately this is not the case: continuous
BAMs are stable.
Continuous BAMs use the sigmoid function with values of λ near 1, producing neurons
that respond smoothly and continuously, much like their biological prototypes. The continuous BAM
lends itself to analog implementations constructed of resistors and amplifiers, and very-large-scale
integration (VLSI) of such networks appears feasible and economically attractive.
Adaptive BAM
All the versions of the BAM discussed so far have their weight matrix calculated as
the sum of the outer products of the input-vector pairs. This calculation is
useful in that it demonstrates the functions that a BAM can represent, but it
is certainly not the way that weights are determined in the brain.
The adaptive BAM adjusts its weights during operation; that is, application of the
training vector set causes it to respond, and slowly the short-term memory
converts into long-term memory, modifying the network as a function of its
experience.
The network is trained by applying vectors to layer A and associated vectors
to layer B. Either of the vectors can be a noisy version of the ideal; within
limits, the network learns the idealized vectors free of noise.
As the continuous BAM is proved to be stable regardless of the weights, slow
changes in the weights do not upset the stability.
Competitive BAM
Some sort of competition between the neurons is observed in many biological
neural systems. For example, in the neurons that process signals from the
retina, lateral inhibition tends to increase the output of the most highly
activated neuron at the expense of its neighbors.
This rich-get-richer system increases contrast by raising the activation level of
the neurons connected to bright areas of the retina, while reducing the
outputs of those "viewing" the darker areas.
Competition in a BAM is implemented by interconnecting the neurons within each
layer by means of additional weights. These form another weight matrix, with
positive weights on the main diagonal and negative weights at other positions.
From the Cohen-Grossberg theorem we can infer that such a system is unconditionally stable if the
weight arrays are symmetric; in practice, the networks are stable even without symmetry.