Curve Prediction
Lecture Course: Ausgewählte Optimierungsverfahren für Ingenieure (Selected Optimisation Methods for Engineers)
Abstract
We use artificial neural networks to perform curve prediction. For that, we have
created a class of neural networks (feed forward multilayer perceptron networks
with backpropagation) that have a topology which is determined by their genetic
makeup. Using a simple evolutionary strategy on their genes, we optimise the
networks’ topologies to solve the problems at hand. Using this approach, we could
generate networks that are able to predict simple functions such as sin(x) or linear
combinations thereof, with moderate computational overhead. However, it was not
possible to generate networks that predict more complex functions such as sinc(x)
or the NASDAQ composite index satisfactorily, within the allowed sizes for the
networks. In general though, it appears to be a useful approach to generate neural
networks using this form of evolutionary strategy as it substitutes for experience in
neural network design.
Introduction
Curve prediction is one of the most popular applications for artificial neural
networks. However, the success of using a neural network to solve a certain
problem is inherently linked to the designer’s ability to apply an appropriate
network to the task. Even relatively simple artificial neural networks such as the
multi-layer-perceptron or variants thereof have several degrees of freedom, e.g. the
number of neurons, the number of hidden layers, the type of transfer functions
employed, which the network is very sensitive to. For most tasks there is no
methodology for designing a neural network which guarantees success. Instead, we
try to evolve a neural network topology that is suitable for any curve prediction
task.
Aim
• To develop an evolvable artificial neural network representation
Neural Networks
In order to perform the prediction tasks described above, we use multi-layer
perceptron networks with a simple backpropagation learning rule. We then use an
evolutionary strategy to change the following parameters of the network: the
number of hidden layers, the number of neurons in each hidden layer, and the
transfer function used in each layer.
In order to do this, we define a genetic code for the class of neural networks
comprising an N-digit binary bit string. In order to limit the optimisation search
space, we arbitrarily limit the number of hidden layers to 10.
The allowed transfer functions are linear, linear with bounds, and the hyperbolic
tangent.
Thus, the number of neurons in each hidden layer can be represented by a four-bit
number, and the transfer function for the neurons in each layer by a two-bit
number, for a total of six bits per layer. As there are up to 10 hidden layers, the
total bit string is 60 bits long, laid out as follows:
Bit 0 … Bit 59:
[ Layer 0 | Layer 1 | Layer 2 | Layer 3 | Layer 4 | Layer 5 | Layer 6 | Layer 7 | Layer 8 | Layer 9 ]  (6 bits per layer)
Each layer can have up to 15 neurons, as given by the binary number [Neuron bit 3 :
Neuron bit 0]. If the bit string encodes zero neurons for a layer, that layer is
interpreted as non-existent. The bit string can also encode four transfer functions
per layer, as given by the binary number [Transfer bit 1 : Transfer bit 0]; since only
three are employed, a bias is given towards the linear y = x transfer function by
assigning it two of the four possible states. Thus, changing the genes of a network
changes its topology, making some networks more suitable than others for the
tasks at hand.
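The decoding described above can be sketched as follows. This is a minimal illustration, not the authors' implementation; in particular, which two of the four transfer codes map to the linear y = x function is an assumption.

```python
# Assumed code assignment: two of the four states map to "linear" to bias the
# search towards the linear y = x transfer function (per the text above).
TRANSFER = {0: "linear", 1: "linear", 2: "linear_bounded", 3: "tanh"}

def decode_genome(bits):
    """Decode a 60-character '0'/'1' string into a list of hidden layers.

    Each layer takes 6 bits: 4 for the neuron count (0..15, where 0 means
    the layer does not exist) and 2 for the transfer function.
    """
    assert len(bits) == 60
    layers = []
    for i in range(10):                      # up to 10 hidden layers
        chunk = bits[6 * i: 6 * i + 6]
        neurons = int(chunk[:4], 2)          # 4-bit neuron count
        transfer = TRANSFER[int(chunk[4:], 2)]
        if neurons > 0:                      # zero neurons => non-existent layer
            layers.append((neurons, transfer))
    return layers
```

For example, an all-zero genome decodes to an empty topology, since every layer encodes zero neurons.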
Optimisation
In order to find an optimal network topology for the given tasks, we use an
evolutionary strategy to evolve the genetic makeup of the networks, which could
also be regarded as a genetic algorithm without crossover.
The strategy iterates as follows: each network in the current generation is built
from its bit string, trained using backpropagation, and then tested.
1. A fitness function evaluates the fitness of each network; the fittest
network is kept for the next iteration, while the others are discarded.
2. The fittest network is cloned six times to refill the generation, and each
clone is mutated by inverting one of its 60 bits at random.
A table containing all the bit strings that have already been evaluated is kept, so
as to avoid computing the same network topology multiple times.
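The loop above can be sketched as a simple (1 + 6)-style strategy with a memo table. The `fitness` callable is an assumed interface standing in for the build/train/test pipeline; the function names are illustrative, not from the original code.

```python
import random

def evolve(fitness, genome_len=60, clones=6, iterations=100, seed=0):
    """Keep the fittest genome, clone it six times, flip one random bit per clone.

    `fitness` maps a bit string to a fitness value (assumed interface).
    """
    rng = random.Random(seed)
    seen = {}                                  # memo table: bit string -> fitness
    def evaluate(g):
        if g not in seen:                      # skip topologies already computed
            seen[g] = fitness(g)
        return seen[g]
    best = "".join(rng.choice("01") for _ in range(genome_len))
    for _ in range(iterations):
        pool = [best]
        for _ in range(clones):                # refill the generation with mutants
            i = rng.randrange(genome_len)
            pool.append(best[:i] + ("1" if best[i] == "0" else "0") + best[i + 1:])
        best = max(pool, key=evaluate)         # keep the fittest, discard the rest
    return best, seen[best]
```

Because there is no crossover, each generation explores only single-bit neighbours of the current best topology, which matches the mutation step described above.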
Here, we evaluate the fitness of the networks in two ways, depending on the task at
hand.
• For short-term prediction, part of the test signal is used as input to
the network, and the first value the network predicts is compared with the
actual value of the series at that point. The cumulative error is found by
summing the absolute difference between the predicted and actual values
over all shifted versions of the actual signal used as network input. The
fitness of the network is the reciprocal of the cumulative error. This method
evaluates the network’s ability to make short-term predictions for a given
pattern and number of inputs. (see below)
(Figure: successive shifted windows of the actual signal, samples 1…N, are applied
to the network inputs; the single network output is the predicted next value.)
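The short-term fitness evaluation can be sketched as follows, under the assumption that `network(window)` returns a single predicted value (an assumed interface). The `eps` floor anticipates the error minimum mentioned later, which prevents infinite fitness.

```python
def short_term_fitness(network, signal, n_inputs, eps=1e-6):
    """Slide a window over the signal; for each shift, compare the network's
    one-step prediction with the next actual sample. Fitness is the
    reciprocal of the cumulative absolute error (floored by eps)."""
    error = 0.0
    for start in range(len(signal) - n_inputs):
        window = signal[start: start + n_inputs]
        predicted = network(window)            # assumed: returns one value
        actual = signal[start + n_inputs]
        error += abs(predicted - actual)
    return 1.0 / (error + eps)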
• For long-term prediction, part of the signal is used as input to the
network; the first predicted value is fed back and used as the next input,
and this is repeated for a given number of points to be predicted. The
fitness is then the reciprocal of the cumulative absolute difference
between the actual signal and the recurrently predicted signal. This
method evaluates the network’s ability to make long-term predictions
(forecasts) for a given starting pattern and number of points to predict.
(see below)
(Figure: the network’s output is fed back into its inputs at each step, so the
forecast is extended recurrently and compared against the actual signal, samples
1…N.)
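The long-term (recurrent) evaluation differs only in that each prediction replaces the oldest input. A sketch, again assuming a `network(window)` interface:

```python
def long_term_fitness(network, signal, n_inputs, horizon, eps=1e-6):
    """Feed each prediction back as the next input for `horizon` steps, then
    score the forecast against the actual continuation of the signal."""
    window = list(signal[:n_inputs])
    error = 0.0
    for k in range(horizon):
        predicted = network(window)            # assumed: returns one value
        error += abs(predicted - signal[n_inputs + k])
        window = window[1:] + [predicted]      # feedback: prediction becomes input
    return 1.0 / (error + eps)
```

Note that early prediction errors compound, since later inputs are themselves predictions; this is why long-term prediction is the harder of the two tasks.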
In both cases, a minimum error (and thus a maximum fitness) is imposed on each
network, in order to avoid division-by-zero errors and infinite fitness values.
Furthermore, smaller networks, i.e. networks with fewer hidden layers, are preferred
over larger ones, as are networks with fewer neurons.
The 60-bit encoding gives a search space of up to 2^60 possible topologies. In
fact, the search space is slightly smaller than this, because some topologies in
which one or more layers have zero neurons are equivalent. Still, it can be
appreciated that the search space is large enough to justify this optimisation
approach.
Testing Strategy
There are two tasks, short- and long-term (function) prediction, that our neural
networks have to perform. Here, we qualitatively assess the ability of the
evolved networks to perform each task, using the following representative test
signals: sinusoidal signals (sin(x) and linear combinations thereof), sinc(x), and
the NASDAQ composite index.
Apart from running both tasks on the four different functions, the following
question shall be addressed:
• How fit are the networks? As mentioned above, there exists a maximum
fitness that a network may achieve. How fit, relative to the maximum
achievable fitness, are the evolved networks?
Tabular Summary
The section below describes important aspects of the individual tests in detail. In
addition, the table below summarises the results.
Test Details
Sinusoidal
The neural networks evolved are able to predict the sinusoidal signals with
acceptable accuracy, provided they receive enough inputs. Figure 3 below
illustrates the evolution process over several trials. Each point on the graph
represents an improved network topology over the previous one.
Figure 3: Fitness evolution for different trials for Sin(x) - Clear fitness improvement
As with the fitness evolution, the error performance of the evolved networks
improves. Figure 4 below illustrates how the %-error in the long term prediction
decreases in general with each generation.
Figure 4: Error evolution for different trials for Sin(x) - Clear performance improvement
Comparing the size of the network (number of neurons in the hidden layers) to the
network’s performance, it appears that there exists a certain range of ‘right’ sizes
which allows the network to achieve high fitness. Another way of looking at this is
that the network needs a certain minimum complexity (in terms of number of
neurons) adequate to the task at hand. Below that critical size, it is unlikely that a
network can achieve high fitness.
Figure 5: Size does matter – a network needs a certain minimum size or complexity to
achieve high fitness
Sinc(x)
Unlike in the previous case with sin(x), the evolutionary approach does not generate
sufficiently fit networks to perform long-term prediction on a sinc(x) function. Figure
6 and Figure 7 outline the evolutionary performance over several trials. They show
clearly that the evolutionary approach works in principle, i.e. networks are evolving
and improving; however, predicting sinc(x) seems to be too ‘difficult’ a task for the
simple feed-forward perceptrons employed here. The network evolution appears to
hit a performance limit at a fitness of about 15% and a %-error of about 90%.
Figure 6: Fitness evolution for different trials for Sinc(x) - No clear fitness improvement
Figure 7: Error evolution for different trials for Sinc(x) – The error decreases, but is still
unacceptably high
NASDAQ
Attempting to perform long-term prediction on the NASDAQ is ambitious. Here, the
evolutionary approach again works to some extent, as it is able to generate networks
with improving performance over several generations; however, the network model
and its complexity are again not able to cope with the challenge posed by the
NASDAQ. Figure 8 and Figure 9 summarise the network evolution over several trials.
Figure 8: Fitness evolution for different trials for the NASDAQ - Too difficult for the
networks
Figure 9: Error evolution for different trials for the NASDAQ - Error performance improves,
but it is still too high
General Comments
• The networks’ performance is closely tied to randomness in the
initialisation and to the success of the training. The training method
employed, backpropagation, is not guaranteed to achieve a satisfactory
level of training and does not necessarily find the globally optimal
parameters for the network. To mitigate this, we repeated the training of
each network several times, to increase the likelihood of obtaining a
‘well-trained’ network. This, however, increased the computational load
many times over, to an impractical degree, and is therefore not an
adequate remedy for the approach’s sensitivity to randomness and initial
conditions.
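The repeated-training mitigation described in this bullet amounts to a best-of-k restart scheme. A sketch, where all four callables are assumed interfaces rather than real APIs:

```python
def train_best_of(build_network, train, evaluate, repeats=3):
    """Train from several fresh random initialisations and keep the best
    result, trading extra computation for reduced sensitivity to initial
    conditions. `build_network`, `train`, `evaluate` are assumed interfaces."""
    best_net, best_fit = None, float("-inf")
    for _ in range(repeats):
        net = build_network()       # fresh random weight initialisation
        train(net)                  # one backpropagation training run
        fit = evaluate(net)
        if fit > best_fit:
            best_net, best_fit = net, fit
    return best_net, best_fit
```

As the text notes, the cost grows linearly with the number of repeats, which is what made this remedy impractical within the evolutionary loop.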
Conclusion
• To develop an evolvable artificial neural network representation
Using a simple evolutionary strategy, we tried to optimise the network topology for
the tasks at hand by optimising the genetic makeup of a generation of networks. In
principle, this approach has proven valid, and we have demonstrated the evolution
of networks that predict a sine function. For more complex functions, such as
sinc(x) or the NASDAQ, the evolutionary approach still worked, although it was
limited by the network model’s inherent ability to predict complex patterns.
We are optimistic about the approach to evolve neural network topologies for given
tasks, and there are several aspects that could be improved or further investigated.
In particular, we suggest the following:
• Within our simulations, we had to limit the search space by limiting the
allowed size of the networks and their transfer functions. A more extensive
investigation of the evolutionary approach could include other non-linear
transfer functions, larger networks, and more interconnected or feedback
networks.