Neural Nets
Leonard Aye
1994
FTSE Trend Forecasting Using Neural Networks

Contents
Project Summary
    Project Aim
Introduction
    Report layout
    Software package
Neural Networks
    Brief introduction
Input Data Manipulations
    Introduction
    Input data manipulations
        General manipulations
        Indexes manipulations
        Interest rates
        Exchange rates
        Economic data
        Futures
    Complete list of input data sets
Output Data Selection
    Number of predictive items in the neural net
        Short term prediction
        Long term prediction
    Selection of predictive item
        Selection of FTSE difference
Network Tuning
    Hidden nodes
    Learning rate (0.0–1.0)
    Momentum (0.0–0.9)
    Learning Threshold (0.0–3.0)
    Number of presentations
    Presentation type
    Minimum and maximum values
Results
    Results evaluation methods
        Trading — Short term prediction
        Forecasting — Long term prediction
    Preliminary results
        Result sheets
        Test 1 — FTSE +1D MOM prediction
        Test 2 — FTSE +2D MOM (residual) prediction
        Test 3 — FTSE +65D MOM prediction
    Reduction of input data sets
        Numerical Analysis
        Direct experimentation
Conclusions
    Highlights
    Conclusions
        Input data manipulation
        Output data selection
        Network Tuning
        Analysis methods
Appendices
    Appendix A — Back-propagation algorithm
    Appendix B — FTSE Analyses
Appendix A
    Back-propagation algorithm
Appendix B
    Momentum (Close to Close Difference)
    Returns
    Percentage Change of Momentum (PCM)
    Moving Average (MAV)
Copyright © Len Aye 1994
PROJECT SUMMARY
Project Aim
This report is the result of three months' work at a stockbroker firm based in London. The
aim of the project is to add value to existing business areas by making predictions about
future levels of Index values using Neural Networks. The FTSE index was chosen as the
initial target for the estimation.
The assumption is that, except in the case of unexpected shocks, e.g. the invasion of
Kuwait, the likely future levels for the market are largely contained in the data available
to participants in the market today.
So vast is the amount of that data that turning it into usable information is a difficult task.
The function of the neural network is to help discriminate between the data, identifying
what is significant, and to discover patterns in the data which enable it to make estimates
about the future. The intention is not that the neural network should stand alone but that
it will be used to complement the existing methods.
From the Technical Analysis perspective the required time scale for the FTSE estimation
is 3 months with a predictive accuracy of ±1.5%. For trading purposes, a 1 or 2 day
estimate is required with an accuracy for large moves (greater than ±0.75%) of 0.5%, but
with an overriding requirement of getting the direction of movement correct.
The task of performing financial predictions, or any other analysis, using neural nets
involves 4 major steps: input data selection, output data selection, network tuning and
analysing results from the network.
The purpose of the report is to describe our initial findings in these four areas, namely to
establish:
• the most promising data sets that could be used as indicators for FTSE prediction;
• the appropriate output parameters that could be predicted most accurately by
NeuroShell;
• the parameters in NeuroShell that are most likely to affect the overall accuracy of the
results, and the methods used in tuning these parameters; and
• the appropriate methods for analysing the results.
INTRODUCTION
The task of accurately predicting the future value of the FTSE 100 Share Index, either one
day or a few months ahead, is by no means easy. In the past, and even now, statistical
tools have been used and have proved successful, up to a point, in predicting such
financial indicators.
However, a different class of computerised tools is now becoming available which can be
used alongside the statistical methods for predicting data consisting of non-linear
patterns. This new class of tools is called Neural Networks (or neural nets); they
originated in the fields of psychology and cognitive science and later crossed over to
computing.
The idea of neural nets was first investigated in the 1940s, and only recently have
practical, off-the-shelf tools become available. Neural nets have been applied to such
diverse fields as classification: speech, image and hand-written character recognition,
medical screening, geo-demographic analysis; control of complex non-linear plants such
as engines and chemical processes; data fusion: medical diagnosis, sales forecasting,
credit/loan risk analysis; and of course, prediction: financial systems and exchange rate
forecasting.
Report layout
This report is a summary of work carried out during the first 6 months of the project. In
order to understand the results of our experiments it is necessary that the reader has
some basic understanding of neural nets. Hence, the section ‘Neural Networks’ briefly
explains the idea behind them; readers already familiar with the subject may skip that
section.
As stated earlier, the task of performing financial predictions, or any other analysis, using
neural nets involves 4 major steps: input data selection, output data selection, network
tuning and analysing the results from the network, hence the main body of the report is
broken down into 4 sections to reflect these 4 steps1.
The section ‘Input Data Manipulations’ shows the data sets that were acquired and how
they were manipulated so that they can be used as inputs to the neural network.
The next step is to decide what we want the network to produce as outputs, i.e. the items
to be predicted. This is not as obvious as one might expect. The section ‘Output Data
Selection’ details the various parameters that were tested for their suitability as
predictive items for the FTSE index.
Once we had established the items to be used as inputs and outputs we then trained the
network. The section ‘Network Tuning’ describes the parameters involved in training a
network (within the confines of the NeuroShell package) and how they were tuned.
After the tests were carried out the results were analysed, and the ‘Results’ section
highlights the observations from the tests.
The last section, ‘Conclusions and future plans’, presents our findings and observations
from each of the previous sections and our plans for the next 6 months of the project.
Readers who are not technically inclined may skip to this last section for a condensed
summary of the report.
1 The reader should be aware that the optimal data sets or parameters required for each step are not
obtained in isolation from the other steps but in parallel, through iterative experimentation.
Software package
The package that we have used for all the experiments is called NeuroShell®2 and in this
report we use the term ‘neural net’ when the context applies to neural networks in general
and ‘NeuroShell’ when the context refers to the particulars of the package.
2 NeuroShell™ is a trademark of Ward Systems Group, Inc., 245 West Patrick Street, Frederick,
Maryland 21701, USA. Tel: (+1) 301 662-7950.
NEURAL NETWORKS
Brief introduction
Neural networks are typically composed of interconnected “units”, and each connection
is associated with a modifiable weight3. Each unit converts the pattern of incoming
activities that it receives into a single outgoing activity that it broadcasts to other units. It
performs this conversion in two stages. First, it multiplies each incoming activity by the
weight on the connection and adds together all these weighted inputs to get a quantity
called the total input. Second, a unit uses an input-output function that transforms the
total input into the outgoing activity (see Figure 2.1 below).
Figure 2.1 — A single unit: each incoming activity is multiplied by the weight on its
connection, the weighted inputs are summed, and an output function transforms the total
input into the outgoing activity.
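The two-stage conversion described above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of NeuroShell; the sigmoid is assumed here as the output function, one common choice for back-propagation networks.

```python
import math

def unit_output(activities, weights):
    """One unit: sum each incoming activity times its connection
    weight (the total input), then apply a sigmoid input-output
    function to produce the outgoing activity."""
    total_input = sum(a * w for a, w in zip(activities, weights))
    return 1.0 / (1.0 + math.exp(-total_input))

# three incoming activities and their connection weights
print(unit_output([2.0, 0.5, 1.5], [0.1, 0.4, -0.2]))  # ≈ 0.525
```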
To make a neural network that performs some specific task, the weights on the
connections and how the units are connected to each other must be set appropriately. The
connections determine whether it is possible for one unit to influence another. The
weights specify the strengths of the influence.
The most common type of neural network consists of three layers of units: a layer of input
units is connected to a layer of “hidden” units, which is in turn connected to a layer of
output units. The activity of the input units represents the raw information that is fed into
the network. The activity of each hidden unit is determined by the activities of the input
units and the weights on the connections between the input and hidden units. Similarly,
the behaviour of the output units depends on the activity of the hidden units and the
weights between the hidden and output units (see Figure 2.2). The number of hidden
layers in a network depends very much on the problem to be solved using the network.
3 Hinton, G. E. (1992), How Neural Networks Learn from Experience, Scientific American, September
1992, pp 105-109.
Figure 2.2 — A common three layer neural network: input units (I1–I3) feed hidden units
(H1–H5), which in turn feed output units (O1, O2).
To train a network, the input patterns are presented to it and the actual activity of the
output units is compared with the desired activity. The error, defined as the square of the
difference between the actual and desired activities, is calculated, and the weight of each
connection is then changed so as to reduce the error. This process is repeated until the
network classifies, or recognises, every input pattern correctly.
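The training cycle just described (present a pattern, measure the squared error, adjust the weights to reduce it) can be sketched for a single sigmoid unit. This is an illustrative sketch under our own assumptions, not the NeuroShell implementation; the full back-propagation algorithm repeats the same idea layer by layer (see Appendix A).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_unit(patterns, targets, lr=0.5, epochs=2000):
    """Present each pattern, compare actual and desired activity,
    and nudge each weight downhill on the squared error."""
    weights = [0.0] * len(patterns[0])
    bias = 0.0
    for _ in range(epochs):
        for x, desired in zip(patterns, targets):
            actual = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            # derivative of (actual - desired)^2 through the sigmoid
            delta = (actual - desired) * actual * (1.0 - actual)
            weights = [w - lr * delta * xi for w, xi in zip(weights, x)]
            bias -= lr * delta
    return weights, bias

# learn a simple OR-like classification of input patterns
xs = [[0, 0], [0, 1], [1, 0], [1, 1]]
ys = [0, 1, 1, 1]
w, b = train_unit(xs, ys)
preds = [round(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)) for x in xs]
print(preds)
```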
INPUT DATA MANIPULATIONS
Introduction
Of the numerous financial data sets at our disposal we have chosen the following as
suitable indicators for the prediction of the FTSE. These data sets are classified into their
relative groups, as follows:
Indexes
FTSE 100
FTSE Eurotrack 100
Dow Jones
DAX
NIKKEI
CAC 40
Interest rates
Exchange rates
US $ – £ Sterling
French Franc – £ Sterling
Japanese Yen ¥ – £ Sterling
German Marks DM – £ Sterling
Economic data
Futures trading
The list above shows our initial list of financial and economic indicators that we have
decided to use as predictive variables. The data sets as they stand in their raw form
contain historical information that is not directly apparent in the data, and by calculating
their derivatives (e.g. moving averages, etc.) this hidden information or patterns can be
brought to the surface and made more explicit, and consequently be recognised by the
neural network.
The sections below describe how the raw data were analysed and the types of derivatives
calculated.
General manipulations
The following adjustments were applied to all the data sets.
1–Spikes in data
When calculating the derivatives of a data set—index values in particular—we need a
way of handling sudden rises or falls of large magnitude, e.g. when the stock market
crashed the FTSE dropped by over 250 points. Because of this crash all the derivatives
that were calculated (i.e. moving averages, differences, rates of change, etc.) have large
spikes in them. Since our interest is in the direction of movements, and less so in the
absolute size of movements above a given level, we can reduce the size of these spikes
without losing information.
Another reason for dealing with spikes is that the precision of the NeuroShell output is
determined by the range of the minimum and maximum values set for a particular data
series. The NeuroShell manual suggests that when dealing with spikes the minimum and
maximum values should be set tightly around the majority of the data set.
Hence, from the graphical analysis of the data series in Excel, we decided that 4 standard
deviations of the data series would be suitable for use as the minimum and maximum
values.
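As a sketch of this spike treatment (our own illustration, not NeuroShell code), clipping a derived series at 4 standard deviations of its mean might be expressed as:

```python
import statistics

def clip_spikes(series, n_sd=4.0):
    """Limit a series to mean +/- n_sd standard deviations so that
    isolated spikes (e.g. the 1987 crash) do not dominate the
    min/max range given to NeuroShell."""
    mean = statistics.mean(series)
    sd = statistics.pstdev(series)
    lo, hi = mean - n_sd * sd, mean + n_sd * sd
    return [min(max(x, lo), hi) for x in series]

# a mostly quiet series with one large spike at the end
spiky = [0.0] * 50 + [1.0, -1.0] * 25 + [250.0]
clipped = clip_spikes(spiky)
```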
2–Historical data
The back-propagation algorithm is suitable for the majority of problems, where the
training data are discrete or independent of each other. However, the algorithm does not
handle temporal or historical data well4. To overcome this limitation, we used
momentums (differences) of the indexes between today and some periods in the past as
representatives of the ‘historical’ information in the data.
The following table shows the various index differences that we wish to calculate and
use as inputs to the neural network.
4 There are other algorithms such as recurrent algorithms, which can handle time-series data.
However, the current version of NeuroShell does not provide this feature.
Multiples of 5 are chosen to avoid week-day effects, which may be particularly great in
the UK because of its settlement accounting system.
These will provide some history which the neural network would not otherwise get.
However, it is unlikely that all of these are significant and part of the neural net’s job is to
discriminate between them.
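A minimal sketch of such a momentum (difference) series, assuming a simple list of daily closing values:

```python
def momentum(series, lag):
    """n-period momentum: today's value minus the value `lag`
    trading days ago; undefined (None) for the first `lag` entries.
    Lags in multiples of 5 avoid week-day effects, as noted above."""
    return [None] * lag + [series[i] - series[i - lag]
                           for i in range(lag, len(series))]

ftse = [2500, 2510, 2495, 2520, 2530, 2540]
print(momentum(ftse, 5))  # 1-week momentum → [None, None, None, None, None, 40]
```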
3–Levels
From our early experience of NeuroShell we have found that the network cannot predict
values that are outside the range of its learning set. This particular problem is not limited
to NeuroShell alone but is a limitation with the neural nets in general. For this reason any
data that has levels (or trends) must be transformed into one that does not contain levels.
More importantly, we cannot use NeuroShell to predict the real value of FTSE. The two
methods described below can be used to eliminate the trend in the data.
Differences
This is simply a normalisation of the raw data and can be done in many ways; the
simplest method of removing the levels is to calculate the difference between the current
value and the value some periods ago. In predicting future values, it is obviously
sufficient to calculate the difference from today.
As daily differences are a function of the market level, this series too will have widening
bounds. However, this is a second order effect and unlikely to be significant over the
periods of 2 days or 3 months currently being considered for prediction.
Trend removals
This method instead approximates the underlying trend using linear regression, removes
the trend from the raw data series, and uses only the residual series as an input to the
network. This is difficult because of the number of data series, each with its own trend,
which will not be independent. At present, it may be safer not to adjust for trend.
Significant figures
All data series are calculated from their raw values and chopped at 4 standard deviations,
individually for each series. After this, the effect of rounding is examined:
• up to the next multiple of 0.2, so that values are xxx.2, xxx.4, xxx.6, ...;
• to the nearest 0.5, so that values are xxx.0, xxx.5, xxx.0, ....
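The two rounding schemes can be sketched as follows (an illustration; step values other than 0.2 and 0.5 are equally possible):

```python
import math

def round_up_to(x, step=0.2):
    """Round up to the next multiple of `step` (e.g. 0.2).
    The inner round() guards against float noise when x is
    already an exact multiple of step."""
    return math.ceil(round(x / step, 9)) * step

def round_nearest(x, step=0.5):
    """Round to the nearest multiple of `step` (e.g. 0.5)."""
    return round(x / step) * step

print(round_up_to(123.31))    # next multiple of 0.2
print(round_nearest(123.31))  # nearest multiple of 0.5
```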
Indexes manipulations
The following applies to index data sets only. When dealing with index data we should be
aware of the following points:
• Raw Index data will not be used as input because of level problems; at the same
time we must never lose sight of the actual Index values.
• To calculate the Index value it is sufficient to calculate the expected difference from
today’s Index value.
• No inputs should be used that are expected to have a trend, because the neural
network does not predict at all well outside its learning experience (although
differences are acceptable).
• It is acceptable to underestimate very large changes, as these are generally exceptions
that are not expected to be within the normal patterns previously seen — i.e. the
neural network should not be expected to anticipate a large ‘shock’ to the market but
might be expected to predict reasonably the aftermath of a shock given it has seen a
few before.
• All derived data series, mainly differences, should be limited to 4 standard deviations
of the original data set, and rounded to the same accuracy as the original data.
• All data should be rounded to an acceptable degree of accuracy. Nothing need be
more accurate than 0.01%, e.g. 0.01 × 2500/100 = 0.25 in FTSE terms. For the FTSE,
clearly 0.02% (±0.5) is acceptable.
• History information about the data must somehow be made available to the network.
• The use of raw data depends very much on the required output type, i.e. short- or
long-term predictions. For short-term predictions, e.g. 1 day (and 2 days, as a check
on the 1-day prediction), we could use the raw FTSE data for calculating the
derivatives. In contrast, average values of the Index, which reduce the noise and
daily fluctuations in the Index, could be used for the long-term prediction, e.g. 65
days or 3 months.
• All predicted Index values, either short- or long-term, will be differences from
today’s value only.
These particular indicators were chosen on the basis of Robin Griffiths’s experience.
They are widely used in the market (which to some extent must make them self-fulfilling)
and he has found them the most valuable of the huge range available (e.g. from the
Reuters RT handbook).
Trend removal
While it is expected that any trend in the FTSE data is exponential rather than linear
(because the rise should be related to the growth of money values, with re-investment),
we should nevertheless test this assumption.
Linear
Assume there is a trend,
FTSE = m (time) + const + error.
Exponential
Assume there is a trend
log(FTSE) = m (time) + const + error.
Inverse
Assume there is a trend,
1/FTSE = m (time) + const + error.
We should pick the best solution not on the basis of the individual Σ(error)² terms above,
but on the equivalent calculation for the series converted back into FTSE values, i.e. it is
always calculated as Σ(FTSE − est(FTSE))².
We then just take the best of these three, subtract it from the original data set and use the
resulting values (FTSE residual) as the items to be estimated by the neural network5. The
following graph shows the trends in the raw FTSE.
It can be observed from the graph that the trends of the FTSE are slightly offset by the
large peak in 1987, i.e. the trend lines sit above what one would consider an optimum
trend. It can also be observed that in both of the graphs the linear trend fits the data better
than the exponential, and we use this linear trend to calculate the FTSE residual values,
which can then be used as the items to be estimated.
[Graph: Underlying trends of raw FTSE, January 1985 to January 1993. The FTSE 100
(approximately 1000 to 3000 over the period) is plotted with the fitted linear (y = mx + c)
and exponential (y = c*m^x) trend lines.]
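The three candidate trends and the selection rule above (minimum Σ(FTSE − est(FTSE))² in FTSE terms) can be sketched in Python. This is an illustrative sketch only; the Excel LINEST/LOGEST fits mentioned in the footnote are replaced here by a small ordinary-least-squares helper.

```python
import math

def fit_line(ts, ys):
    """Ordinary least squares for y = m*t + c."""
    n = len(ts)
    mt, my = sum(ts) / n, sum(ys) / n
    m = (sum((t - mt) * (y - my) for t, y in zip(ts, ys))
         / sum((t - mt) ** 2 for t in ts))
    return m, my - m * mt

def best_trend(ftse):
    """Fit linear, exponential and inverse trends, convert each
    estimate back into FTSE terms, pick the winner by sum of
    squared errors, and return it with the residual series."""
    ts = list(range(len(ftse)))
    fits = {}
    m, c = fit_line(ts, ftse)                         # FTSE = m*t + c
    fits["linear"] = [m * t + c for t in ts]
    m, c = fit_line(ts, [math.log(y) for y in ftse])  # log(FTSE) = m*t + c
    fits["exponential"] = [math.exp(m * t + c) for t in ts]
    m, c = fit_line(ts, [1.0 / y for y in ftse])      # 1/FTSE = m*t + c
    fits["inverse"] = [1.0 / (m * t + c) for t in ts]
    sse = {name: sum((y - e) ** 2 for y, e in zip(ftse, est))
           for name, est in fits.items()}
    best = min(sse, key=sse.get)
    residual = [y - e for y, e in zip(ftse, fits[best])]
    return best, residual

best, residual = best_trend([1000.0 + 10.0 * t for t in range(20)])
print(best)
```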
Seasonality
This should be tackled only after the trend has been removed. We should remove long
term seasonality (1 year) first; only then should we see whether there are any remaining
cycles that might be removed, hopefully by looking at the graphs.
It is difficult to decide on the best method for calculating seasonals without knowing the
nature of the trends described above and looking at the resulting graphs to see whether
the seasonal variations are likely to remain constant or rise with the increasing trend, and
to what extent. However, it would probably be reasonable to start with the assumption
that seasonals are a constant ratio to the trend.
5 Microsoft Excel provides built-in functions for calculating the straight line and exponential curves that
best fit the given series of values.
For the linear trend, the gradient m and constant c can be obtained from the function LINEST(values)
which returns an array that describes the line.
With these values a straight line is then constructed using arbitrary x values ranging from 0 to n,
the number of data points in the series.
For the exponential trend, the gradient m and constant c can be obtained from the function
LOGEST(values) which returns an array that describes the curve, and the gradient and constant are
obtained as described above.
An exponentially weighted moving average (EMA) of the FTSE residual could also be
applied here.
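A sketch of such an EMA, using the standard recursion in which recent values receive exponentially greater weight (the smoothing constant `alpha` here is illustrative, not a value chosen in the project):

```python
def ema(series, alpha=0.1):
    """Exponentially weighted moving average:
        ema[t] = alpha * x[t] + (1 - alpha) * ema[t-1],
    seeded with the first value. Weights decay exponentially
    into the past, smoothing the residual series."""
    out = [series[0]]
    for x in series[1:]:
        out.append(alpha * x + (1.0 - alpha) * out[-1])
    return out
```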
It is well known in the market that the FTSE behaves in a seasonal pattern, i.e. one that
repeats over a certain time period. For example, the value of the FTSE rises around the
beginning of each year (see figure below). The question is how to incorporate this
information as an input to the neural network.
Our first attempt was to use another input which simply consists of a series of numbers
representing the days in a year. For example, day 1 is always the first Monday in the
second week of a new year. We use this input together with the FTSE derivatives to
indicate the seasonal change of the FTSE. In order that this new information is of use to
the network, the data have to be presented on a rotational instead of a random basis. So
far, we have not removed any seasonal information from the data but have simply
provided the neural network with an additional indicator that seasonal variations exist in
the data.
[Graph: FTSE values for 1985–1992 overlaid by calendar month (approximately 1000 to
2800), illustrating the seasonal rise around the beginning of each year.]
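The day-number indicator described above might be generated as follows. This is a hypothetical sketch: for simplicity it numbers the supplied trading dates within each calendar year, rather than counting from the first Monday of the second week as in the report.

```python
import datetime

def day_numbers(dates):
    """Number the supplied trading dates within each calendar year,
    restarting at 1 every January, so the network receives an
    explicit cue for where in the seasonal cycle each pattern sits."""
    numbers, year, n = [], None, 0
    for d in dates:
        if d.year != year:
            year, n = d.year, 0
        n += 1
        numbers.append(n)
    return numbers

dates = [datetime.date(1993, 12, 30), datetime.date(1993, 12, 31),
         datetime.date(1994, 1, 3), datetime.date(1994, 1, 4)]
print(day_numbers(dates))  # restarts at the new year
```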
Other Indexes
The two manipulation methods below were applied to the following indexes: Dow Jones,
DAX, Nikkei, and CAC 40.
Trend replacements
The following two differences are used as the indicators of the Index without the trend:
• Index − FTSE
• FTSE − (Index / £ exchange rate).
Historical data
This is done by calculating the difference between today’s index and the index n periods
ago; we used the 1 day, 1 month, 3 month and 12 month differences of the following
derivatives:
• Index
• Index − FTSE
• FTSE − (Index / £ exchange rate)
• RSI 14 days
• RSI 9 days.
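The two trend-replacement differences can be sketched as below, applied element-wise to a foreign index quoted against sterling (the values and the `usd_per_gbp` series are illustrative, not actual market data):

```python
def index_derivatives(index, ftse, ex_rate):
    """The two trend-replacement series described above:
        diff1 = Index - FTSE
        diff2 = FTSE - Index / exchange_rate
    computed element-wise over aligned daily series."""
    diff1 = [i - f for i, f in zip(index, ftse)]
    diff2 = [f - i / r for i, f, r in zip(index, ftse, ex_rate)]
    return diff1, diff2

# illustrative Dow Jones vs FTSE values and a $/£ rate
dow = [3700.0, 3720.0]
ftse = [3100.0, 3120.0]
usd_per_gbp = [1.48, 1.50]
d1, d2 = index_derivatives(dow, ftse, usd_per_gbp)
```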
Interest rates
The following shows the data manipulation carried out for UK interest rates, but it is
equally applicable to other nations’ rates.
Historical data
The following table shows the historical data that are expected to be important and
obtained using differences.
3 M Interbank rates
This is an additional factor used for Interbank rates and the following table shows the
differences that we wish to calculate.
Exchange rates
For the exchange rates, 1 day and 1 month (20 day) differences were calculated as
derivatives. These derivatives are used together with the raw values of the exchange
rates because, although exchange rates vary, they are normally bounded within certain
ranges.
                                   as is   1 day   1 month
US $/£ exchange rate                 ✓       ✓       ✓
French Franc/£ exchange rate         ✓       ✓       ✓
Japanese ¥/£ exchange rate           ✓       ✓       ✓
German Mark DM/£ exchange rate       ✓       ✓       ✓
Economic data
Here, only the 12 month percentage change is calculated, for use as a replacement for the
actual values, for the following economic data:
Futures
Currently, we have not yet used the Futures data extensively.
Acronym      Meaning
UK           United Kingdom
US           United States
GE           Germany
FR           France
JP           Japan
DJ           Dow Jones
NIK          Nikkei
BR           Base rate
3MIB         3 month interbank
INF          Inflation
>7Y BY       Bond yield, more than 7 years
20YR. GILT   20 year Gilt yield
30Y BY       30 year Bond yield
1day MOM | FTSE-(DAX/EX.RATE) | CAC 1D MOM | US 30Y BY 12M MOM | FR >7YBY-INF
2 day MOM | DAX 1D MOM | CAC 20D MOM | US BR-3MIB 1D MOM | FR 3MIB 1D MOM
1 week MOM | DAX 20D MOM | CAC 65D MOM | US BR-3MIB 20D MOM | FR 3MIB 20D MOM
25day MOM | DAX 65D MOM | CAC-FTSE 1D MOM | US 3MIB-30YBY 1D MOM | FR >7Y BY 1D MOM
50day MOM | DAX 12M MOM | CAC-FTSE 20D MOM | US 3MIB-30YBY 20D MOM | FR >7Y BY 20D MOM
65 days MOM | FTSE-DAX 1D MOM | CAC-FTSE 65D MOM | US 3MIB-30YBY 12M MOM | FR >7Y BY 12M MOM
1 year MOM | FTSE-DAX 20D MOM | FTSE-(CAC/EX.RATE) 1D MOM | US 30Y-INF 1D MOM | FR 3MIB->7YBY 1D MOM
% change of MOM over 10 days | FTSE-DAX 3M MOM | FTSE-(CAC/EX.RATE) 20D MOM | US 30Y-INF 20D MOM | FR 3MIB->7YBY 20D MOM
% change of MOM over 25 days | FTSE-DAX 12M MOM | FTSE-(CAC/EX.RATE) 65D MOM | US BR-INF 1D MOM | FR >7YBY-INF 1D MOM
% change of MOM over 50 days | Day numbers | CAC RSI 9D 1D MOM | US BR-INF 20D MOM | FR >7YBY-INF 20D MOM
Close-2day MAV | FT-DAXEX 1D MOM | CAC RSI 9D 20D MOM | JP BR-3MIB | UK-US 3MIB
Close-5 day MAV | FT-DAXEX 20D MOM | CAC RSI 9D 65D MOM | JP 3MIB-10Y BY | UK-GE 3MIB
Close-25 day MAV | FT-DAXEX 65D MOM | CAC RSI 14D 1D MOM | JP 10Y BY-INF | UK-FR 3MIB
Close-50 day MAV | FT-DAXEX 12M MOM | CAC RSI 14D 20D MOM | JP BR-INF | UK-JP 3MIB
3 day ROC | DAX RSI 9D 1D MOM | CAC RSI 14D 65D MOM | JP BR 1D MOM | US-GE 3MIB
5 day ROC | DAX RSI 9D 20D MOM | UK BR-3M IB | JP BR 20D MOM | GE-FR 3MIB
25 day ROC | DAX RSI 9D 65D MOM | UK 3MIB-20YR.GILT | JP 3MIB 1D MOM | GE-JP 3MIB
50 day ROC | DAX RSI 9D 12M MOM | UK 20YR.GILT-INFLATION | JP 3MIB 20D MOM | US$/£ EX. RATE
MACD | DAX RSI 14D 1D MOM | UK BR-INFLATION | JP 10Y BY 1D MOM | US$/£ 1D MOM
RSI 9 day | DAX RSI 14D 20D MOM | UK BR 1D MOM | JP 10Y BY 20D MOM | US$/£ 20D MOM
RSI 14 days | DAX RSI 14D 65D MOM | UK BR 20D MOM | JP 10Y BY 12M MOM | FRANCS/£ EX. RATE
Zero cl-cl vol | DAX RSI 14D 12M MOM | UK 3MIB 1D MOM | JP BR-3MIB 1D MOM | FR/£ 1D MOM
DJ-FTSE | NIKKEI-FT | UK 3MIB 20D MOM | JP BR-3MIB 20D MOM | FR/£ 20D MOM
FTSE-(DJ/EX.RATE) | FT-(NIKKEI/EX.RATE) | UK 20YR.GILT 1D MOM | JP 3MIB-10YBY 1D MOM | MARKS/£ EX. RATE
DJ 1D MOM | NIK 1D MOM | UK 20YR.GILT 20D MOM | JP 3MIB-10YBY 20D MOM | MARKS/£ 1D MOM
DJ 20D MOM | NIK 20D MOM | UK 20YR.GILT 12M MOM | JP 3MIB-10YBY 12M MOM | MARKS/£ 20D MOM
DJ 65D MOM | NIK 65D MOM | UK BR-3M IB 1D MOM | JP 10YBY-INF 1D MOM | YEN/£ EX. RATE
DJ 12M MOM | NIK 12M MOM | UK BR-3M IB 20D MOM | JP 10YBY-INF 20D MOM | YEN/£ 1D MOM
DJ-FTSE 1D MOM | NIK-FT 1D MOM | UK 3MIB-20YR.GILT 1D MOM | JP BR-INF 1D MOM | YEN/£ 20D MOM
DJ-FTSE 20D MOM | NIK-FT 20D MOM | UK 3MIB-20YR.GILT 20D MOM | JP BR-INF 20D MOM | UK GDP 12M % CHANGE
DJ-FTSE 65D MOM | NIK-FT 65D MOM | UK 3MIB-20YR.GILT 12M MOM | GE BR-3M IB | UK M. SUPPLY 12M % CHANGE
DJ-FTSE 12M MOM | NIK-FT 12M MOM | UK 20YR.GILT-INF 1D MOM | GE 3MIB-10YR BY | UK INF 12M % CHANGE
FT-DJEX 1D MOM | FT-NIKEX 1D MOM | UK 20YR.GILT-INF 20D MOM | GE BR 1D MOM | US GDP 12M % CHANGE
FT-DJEX 20D MOM | FT-NIKEX 20D MOM | UK BR-INF 1D MOM | GE BR 20D MOM | US M. SUPPLY 12M % CHANGE
FT-DJEX 65D MOM | FT-NIKEX 65D MOM | UK BR-INF 20D MOM | GE 3M IB 1D MOM | US INF 12M % CHANGE
FT-DJEX 12M MOM | FT-NIKEX 12M MOM | US BR-3M IB | GE 3M IB 20D MOM | FR GDP 12M % CHANGE
DJ RSI9D 1D MOM | NIK RSI 9D 1D MOM | US 3MIB-30Y BY | GE 10YR BY 1D MOM | FR M. SUPPLY 12M % CHANGE
DJ RSI9D 20D MOM | NIK RSI 9D 20D MOM | US 30Y BY-INF | GE 10YR BY 20D MOM | FR INF 12M % CHANGE
DJ RSI9D 65D MOM | NIK RSI 9D 65D MOM | US BR-INF | GE 10YR BY 12M MOM | GE GDP 12M % CHANGE
DJ RSI9D 12M MOM | NIK RSI 9D 12M MOM | US BR 1D MOM | GE BR-3M IB 1D MOM | GE M. SUPPLY 12M % CHANGE
DJ RSI14D 1D MOM | NIK RSI14D 1D MOM | US BR 20D MOM | GE BR-3M IB 20D MOM | JP GDP 12M % CHANGE
DJ RSI14D 20D MOM | NIK RSI14D 20D MOM | US 3M IB 1D MOM | GE 3MIB-10YR BY 1D MOM | JP M. SUPPLY 12M % CHANGE
DJ RSI14D 65D MOM | NIK RSI14D 65D MOM | US 3M IB 20D MOM | GE 3MIB-10YR BY 20D MOM | JP INF 12M % CHANGE
DJ RSI14D 12M MOM | CAC-FTSE | US 30Y BY 1D MOM | GE 3MIB-10YR BY 12M MOM |
FTSE-DAX | FTSE-(CAC/EX.RATE) | US 30Y BY 20D MOM | FR 3MIB->7YBY |
Note: The first 22 data sets (1 day MOM to Zero cl-cl vol.) are the derivatives of the FTSE Index.
However, it is generally accepted that a neural network which has more than one output performs less well than separate networks each having a single output. This is particularly true of NeuroShell, which uses a least squares minimisation technique to decide how to apportion its weight adjustments amongst several outputs. This means that the accuracy of each output is sacrificed in order to minimise the total error of all the outputs. With this in mind, we will have to build two networks to predict the short term and long term FTSE indexes separately. The question that then arises is what type of data will be suitable for each of the two networks.
Hence, we believe that the types of input data for both network will be similar in many
aspects, but the weightings will be different.
Initially we expected the returns (percentage change) of FTSE to prove useful as the predictive item. However, the tests carried out by Grashoff showed that the use of returns was not as successful as expected. This may be because price moves always come in discrete units (1p, 2p, etc.), so using absolute differences provides the neural net with more repetition. When returns are used, a change of 10p in price is an input of different value depending on the underlying price level at the start of the period. Returns therefore provide a more continuous set of values to present to the neural net, and also compensate for level. However, any benefit appears to be offset by the vastly increased number of distinct values. Some work was done on rounding the returns to a small number of significant figures, and while this gave some improvement, the end result was less accurate prediction than that achieved by straight differences.
The same observation was made for the use of differences of logarithms of FTSE. Hence, all our tests now use the FTSE difference (the value of FTSE n days ahead relative to today’s value) as the predictive item from the neural net.
In earlier tests we used the 2 day moving average (MAV), defined as (P_{t−1} + P_t) / 2, as the factor to be predicted, since the daily differences contain too much noise for the neural net. It is in the nature of the market that daily movements are generally overdone and that some correction occurs the following day. However, the 2 day moving average suffers from the problem that its average point in time is around midday rather than end of day: the prices of today and yesterday are recorded at close of business, so the average of the two days sits around midday.
A better average might then be what we have called the 3 day weighted average, that is

(P_{t−1} + 2P_t + P_{t+1}) / 4
This is centred at the close of business as required, but has the problem that it requires tomorrow’s value. Using it would therefore involve estimating at least two days forward.
As the graph below shows the 2 day moving average differs from the actual FTSE value
by less than 0.5% normally.
[Figure: FTSE/MAV(FTSE, 2D) × 100 − 100, plotted March 1991 – October 1992; values lie between about −2.5% and +3%]
However, with 3 day weighted average the difference in relation to the real FTSE value is
around 0.25%, half that of the 2 day moving average (see graph below).
It is unlikely that any wider moving average would be of value because of the incidence of ‘special’ events. Such an average would have to be treated with care because it includes future information.
Perhaps, if (P_{t−1} + 2P_t + P_{t+1}) / 4 is assigned to day t+1, this problem disappears. If it were assigned as a difference to P_t, the value would be

P_t − (P_{t−1} + 2P_t + P_{t+1}) / 4 = (2P_t − P_{t−1} − P_{t+1}) / 4

which is perhaps a measure of how much yesterday’s value was an over or under estimate of some ‘true’ underlying value for the index.
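The two averages discussed above, and the residual derived from them, can be sketched as follows (function names are our own):

```python
def mav2(p, t):
    # 2 day moving average: (P[t-1] + P[t]) / 2, centred around midday
    return (p[t - 1] + p[t]) / 2

def wav3(p, t):
    # 3 day weighted average: (P[t-1] + 2*P[t] + P[t+1]) / 4,
    # centred at close of business but requiring tomorrow's value
    return (p[t - 1] + 2 * p[t] + p[t + 1]) / 4

def residual(p, t):
    # P[t] - wav3 = (2*P[t] - P[t-1] - P[t+1]) / 4: how far the day's
    # close sits above or below the smoothed 'true' level
    return p[t] - wav3(p, t)
```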
Initial results from tests using the 3 day MAV showed that the overall percentage error is
2.8% (or 1.78% and 4% for the first and second half of the test set) and therefore use of
this measure was discontinued for the moment.
NETWORK TUNING
After 6 months intensive use of NeuroShell we have made the following observations
with regards to the package.
Hidden nodes
The number of hidden nodes suitable for a particular application is still an inexact
science. NeuroShell provides a simple tool, which in itself is a network, called
HIDNODES which can be used to determine the number of hidden nodes required for a
particular problem. HIDNODES expects three inputs (number of input nodes, number of
output nodes and a figure representing the complexity of the patterns in the sample data
set) and produces as output the number of hidden nodes to use.
This tool, though useful, does not guarantee that the number of hidden nodes it suggests will work for the problem, since it requires the user to provide the network with a subjective figure (from 0 to 10, where 0 means not very complex and 10 very complex). Depending on this figure, the suggested number of hidden nodes can vary by tens (as shown in the following table).
Table 6.1 — Suggested number of hidden nodes in relation to the complexity of the problem
As an alternative, a good rule of thumb for deciding the number of hidden nodes required is that the total number of weights in a network should be much less than the total number of patterns in the sample set and the number of output nodes. This avoids the problem of overfitting, i.e. the network memorising instead of generalising from the given input sample data, which results in the network producing very good results on the sample data set but doing very poorly on other data sets. Using this rule of thumb, we decided to use 25 hidden nodes. We arrived at this figure as follows:
In a 3-layer, fully connected network, each input node is connected to all the hidden nodes and similarly each output node is connected to all the hidden nodes, as shown below.
O1 O2
H1 H2 H3 H4 H5
I = input node
H = hidden node
O = output node
I1 I2 I3
Total connections, Tc = NH (NI + NO)
where NH = number of hidden nodes
NI = number of input nodes
NO = number of output nodes.
In the data sets that we used in the experiment, the total number of patterns, or cases in NeuroShell terminology, is around 1500 (approximately 5½ years’ worth of data), the total number of input nodes is around 40, and there is usually only one output in the networks.
Note that this is by no means a definitive or strict rule. We used this formula to give us an initial value and found it to be of value.
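The rule of thumb above can be checked with a one-line calculation, using the figures quoted in the text (around 40 inputs, 1 output, roughly 1500 patterns):

```python
def total_connections(n_hidden, n_input, n_output):
    # Tc = NH * (NI + NO) for a 3-layer, fully connected network
    return n_hidden * (n_input + n_output)

# With 25 hidden nodes, 40 inputs and 1 output the network carries
# 25 * 41 = 1025 weights, comfortably below the ~1500 training patterns.
tc = total_connections(25, 40, 1)
```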
Threshold
The value of 0.4 was found to give good predictions (together with the value of Momentum, see below) and is occasionally reduced to 0.2 in some tests once the network has learned for some time and the accumulated errors have stopped reducing.
Momentum (0.0–0.9)
The term momentum in NeuroShell is different from that used in financial analysis where
it is used to mean the difference between the index value today and the value some
periods ago. In NeuroShell, momentum (µ) is a factor which determines the proportion of
the last weight change which is added to the new weight change.
In tests, it was found that the value of 0.6 (together with the value of 0.4 for the Learning
rate) produced the best results.
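The weight-change rule described above can be sketched as follows; the defaults mirror the 0.4 and 0.6 values reported in the text, while the function name and signature are our own illustration, not NeuroShell’s API:

```python
def weight_update(grad, prev_delta, learning_rate=0.4, momentum=0.6):
    """New weight change: a gradient-descent step plus a fraction
    (the momentum) of the previous weight change."""
    return -learning_rate * grad + momentum * prev_delta
```

The momentum term smooths successive updates: when consecutive gradients point the same way the steps accumulate, and when they alternate sign the oscillation is damped.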
We have found that this value has some indirect effect on the learning accuracy of the
network, and the value of 0.0001 was found to produce networks that have learned
accurately.
Number of presentations
This is the number of times the data sets are presented to the network on a case by case
basis, where a case consists of all the financial indicators, e.g. FTSE, Dow Jones, etc., on
a particular day. Once the input case is seen by the network it produces an output and
compares it with the expected value and makes any necessary adjustments to the weights
in the nodes.
The performance of a network relates closely to the number of times a particular case is seen by the network: the more times a case is seen by the neural net, the better the output for that case will be. However, care should be taken not to over-present (i.e. over-train) the network by presenting the cases in the learning set more often than necessary. It is true that the network’s predictions get better as the number of presentations increases, but this is only true for the learning set.
A network that has been trained too well on the learning set is normally useless for predicting any events or values from data it has not seen before. This is the classic case of the network ‘memorising’ the learning set and hence being unable to generalise to any data that lies outside it.
Again, deciding how many presentations will produce an adequately learned, well-generalising network is still an inexact science.
Presentation type
There are two ways in which the input data can be presented to the network: random and rotation.
Random
In this method, the patterns, or cases, from the sample set are presented to the network in random order. The advantage of this method is that learning is usually quicker than with the rotation method. However, if the number of cases in the sample set is sufficiently large, ensuring that every case is presented to the network at least once takes a great deal longer; otherwise, learning takes place only on the randomly chosen patterns, and some cases may never be seen by the network.
Rotate
As the name suggests the network learns by reading the data from the sample set, one day
at a time, in sequential and rotational order (from top to bottom of the files, and back to
top again). This method is useful for learning and predicting events which contain
historical information, and also ensures that all of the patterns in the sample set are seen
by the network. As the FTSE prediction involves the use of historical data this method of
data presentation was used most often and produced better results than the random
presentation.
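The difference between the two presentation orders can be made concrete with a short sketch (the helper names are our own):

```python
import random

def rotate_order(n_cases, n_presentations):
    # Sequential, wrap-around order: every case is guaranteed to be
    # seen once per full pass through the sample set.
    return [i % n_cases for i in range(n_presentations)]

def random_order(n_cases, n_presentations, seed=0):
    # Random order: if n_presentations is small relative to n_cases,
    # some cases may never be drawn at all.
    rnd = random.Random(seed)
    return [rnd.randrange(n_cases) for _ in range(n_presentations)]
```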
The effect of the latter condition can be seen in one of the tests (FT10ST01), where the values of the test set fall outside the range of the sample set (+100 to −300). The graph below shows that for the values below −300 in the latter part of the test set the neural net is hopeless at predicting outside the known range: the percentage error was close to 8% in that region, compared with 1.3% for the first half of the test set.
[Figure: actual FTSE +2 days vs predicted, October 1991 – October 1992; values from +50 down to −500. NOTE: File = FT10ST01, Presentations = 3.4M, Threshold = 0.4, Momentum = 0.6]
Figure 6.2 — Limitation of neural net to predict values outside the known range
RESULTS
Results evaluation methods
Analysis methods
Trading requires accurate prediction of price movements over a one/two day period. Any
longer and although the prediction may come right, intermediate adverse values could
break position limits. Even with one day predictions, intra-day values could hurt but this
is less likely.
It is required to predict the one day, two day and three day closing values. Accurate one
day predictions would be the ideal. Experience suggests that there are many
circumstances where a move in one direction is reversed, at least partially, the following
day and hence our expectation is that a prediction of the two day value will be more
reliable. If the prediction for the day following our target day (+1 or +2) is in the same
direction as the prediction for the target day then our expectation that it will be fulfilled
should be greater. Hence the need for three day prediction, to add confidence to the two
day estimate.
It is important in trading to accept that on some days we make profits and on some days losses; we do not expect to be right all the time, but we have to restrict the losses and expect the cumulative profit to grow steadily. The maximum cumulative downside must also be acceptable.
In trading using the neural network, we therefore need to make decisions to be long,
neutral or short as this is a system for aiding positioning. We don’t want to make
mistakes, but we don’t want to miss opportunities.
All predictions are for differences from today’s level; not for absolute values with which
the neural network cannot cope.
Direction only
Numeric and graphical analysis should be used.
Straight count
Compares the actual and predicted direction of the FTSE for each case in the test set, with output as follows:
Assign numeric values, e.g. success=1; failure=0, to each test case, and the overall result
is computed by finding the average number of successful cases, e.g. 125 from 250 implies
a 50% success rate.
In addition, the result is subdivided into two halves to give a feel for the effect of distance
from the learning set. (This is because the training set and the test set are derived from the
same source data file, chronologically ordered. This means that the first case in the test
set follows immediately from that of the last case in the learning set.)
The overall total should be split to indicate whether success is better on –ve or +ve
predictions and –ve or +ve moves.
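The straight count described above might be sketched as follows, treating each move as a signed difference (the helper names are our own):

```python
def straight_count(actual, predicted):
    """Directional accuracy: success=1 when the actual and predicted
    moves share a sign, failure=0 otherwise; returns the success rate."""
    scores = [1 if a * p > 0 else 0 for a, p in zip(actual, predicted)]
    return sum(scores) / len(scores)

def split_halves(actual, predicted):
    # Subdivide the test set to gauge the effect of distance
    # from the learning set.
    mid = len(actual) // 2
    return (straight_count(actual[:mid], predicted[:mid]),
            straight_count(actual[mid:], predicted[mid:]))
```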
Magnitude
Look at the magnitude of predicted changes compared with the last known value of the
index at that date and separate out moves of less than x% absolute where x is probably
0.5% or 1%.
success if |PC| > |xI| AND (PC/AC > 0), where PC = predicted change, AC = actual change, I = Index value
This could be amended to include as failures small predicted moves which turned out to
be large.
Again, this analysis should be split into first half, second half and all of test set. Also, the
total should be split to show if success is better for +ve or –ve predictions and for +ve or
–ve moves.
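A sketch of this success criterion, with PC and AC taken as the predicted and actual changes and x defaulting to the 0.5% mentioned above (the function name is our own):

```python
def magnitude_success(pc, ac, index_value, x=0.005):
    """Success when the predicted change PC exceeds x of the index level
    in magnitude AND agrees in direction with the actual change AC."""
    return abs(pc) > abs(x * index_value) and pc * ac > 0
```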
Quantitative analysis
Initial quantitative analysis carried out on the output results is on the basis of percentage
errors. The percentage error (PE) is calculated as follows:
PE = (PI − AI) / AI × 100

where PI is the predicted index value and AI the actual index value. The following statistics are then computed:
i) Average PE, (1/n) Σ PE(t)
ii) Std. Dev. of PE
iii) Average absolute PE
iv) Std. Dev. of absolute PE
v) Max. PE
vi) Min. PE.
The above analyses were carried out for first half, second half, and whole of both the test
set and the sample set.
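The percentage-error statistics above can be sketched with the standard library (the function name and the dictionary keys are our own):

```python
from statistics import mean, stdev

def pe_stats(predicted, actual):
    """Percentage error PE = (PI - AI) / AI * 100 per case, plus the
    summary statistics listed in the text."""
    pe = [(p - a) / a * 100 for p, a in zip(predicted, actual)]
    abs_pe = [abs(e) for e in pe]
    return {
        "avg_pe": mean(pe),
        "std_pe": stdev(pe),
        "avg_abs_pe": mean(abs_pe),
        "std_abs_pe": stdev(abs_pe),
        "max_pe": max(pe),
        "min_pe": min(pe),
    }
```

Running it on the first half, second half and whole of a test set reproduces the split analysis described above.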
Accurate 3 month predictions would be the ideal. However, as these Index values are being predicted as differences, which can be subject to significant variation in the specific daily values at the start and end, some averaging is desirable.
Estimation of the two day average as a proxy for the FTSE was considered a good
compromise. Tests showed that it generally varied by less than 0.5% from the FTSE
value itself, and except on exceptional days the variation was within 1% bounds. As a
measure of noise, the volatility of the 2 day MAV (moving average) was 11.7%
compared to 16% for the FTSE.
The tests for short term values were applied to these longer term estimates as well. The results needed to be presented initially as success in predicting the 2 day MAV, but also in terms of predicting the FTSE itself.
Preliminary results
In our earlier experiments with NeuroShell, the following tests produced the most promising results. Of these three types of test, the latter two gave the best results and are shown on the following pages.
Result sheets
The result sheets on the following pages are of three varieties:
Test Record Sheet—contains the summary of the conditions, input data sets and their
contributions, accuracy of prediction and any other information that is relevant to the
test
Line Graph—showing the actual and the predicted outputs plotted over time (usually
from October 1991 to October 1992). A 100% accurate prediction means that the
actual and the predicted graphs will be identical.
Scattered Graph—compares the values of the actual and the predicted outputs. Again, a 100% accurate prediction means that the scattered values will lie along the y=x line.
Although the overall direction accuracy is 80%, the average percentage error is 1.38%, above the limit of acceptance. However, if we look only at the first half of the test, i.e. the first 6 months from the last available data used in training the network, we can see that the average error is 0.33%. This shows that the neural net produced better predictions for events in the near future than for those more than 6 months away.
Figure 7.1 shows the actual and predicted values of FTSE 2 day momentum plotted over
October 1991- October 1992, and figure 7.2 shows the comparison of the actual and
predicted values.
Figure 7.3 shows the actual and predicted values of FTSE 65 day momentum plotted over
October 1991- October 1992, and Figure 7.4 shows the comparison of the actual and
predicted values.
Numerical Analysis
To minimise the number of input parameters required, correlation analysis was first carried out on the derivatives of the FTSE index. Any data set which has a high correlation with other data sets is of little value as an input to the neural net, because it contains no information that is not already found in the correlated data sets. The table below shows the results of the correlation analysis. From these results we can remove those data sets which are highly correlated (> 85%) with more than one other.
FTSE 100 | 1 day Returns | 1day MOM | 2 day MOM | 1 week MOM | 25day MOM | 50day MOM | 65 day MOM | 1 year MOM | % change of MOM over 10 days | % change of MOM over 25 days | % change of MOM over 50 days
FTSE 100 1
1 day Returns 1
1day MOM 0.98 1
2 day MOM 1
1 week MOM 1
25day MOM 1
50day MOM 1
65 days MOM 0.87 1
1 year MOM 1
% change of MOM over 10 days 1
% change of MOM over 25 days 0.98 1
% change of MOM over 50 days 0.98 0.86 1
Close-2day MAV | Close-5 day MAV | Close-25 day MAV | Close-50 day MAV | 3 day ROC | 5 day ROC | 25 day ROC | 50 day ROC | MACD | RSI 9 day | RSI 14 days | Zero cl-cl- vol.
Close-2day MAV 1
Close-5 day MAV 0.85 1
Close-25 day MAV 1
Close-50 day MAV 0.88 1
3 day ROC 0.92 1
5 day ROC 0.9 1
25 day ROC 0.88 0.92 1
50 day ROC 1
MACD 0.85 1
RSI 9 day 1
RSI 14 days 0.86 0.97 1
Zero cl-cl- vol. 1
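The pruning step described above might be sketched as follows, with a hand-rolled Pearson correlation; the 0.85 threshold and the “more than one other” rule come from the text, while the function names and the greedy keep-order are our own:

```python
from math import sqrt

def pearson(x, y):
    # Pearson correlation coefficient between two equal-length series
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def prune_correlated(datasets, threshold=0.85):
    """Drop any series that is highly correlated (> threshold) with more
    than one of the series kept so far; returns the kept names."""
    kept = {}
    for name, series in datasets.items():
        hits = sum(1 for s in kept.values()
                   if abs(pearson(series, s)) > threshold)
        if hits <= 1:
            kept[name] = series
    return list(kept)
```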
Direct experimentation
Due to the limitations of NeuroShell as well as the large number of input data sets that we
have generated a total of 3 tests had to be devised to test for the suitability of the inputs,
namely:
In all of the tests the +2 day momentum of the FTSE 2 day MAV was used as the
predictive item.
From the graphs we have extracted the extreme cases, i.e. the inputs which made the most and least significant contributions.
Observations
The majority of the Nikkei index derivatives fall in the medium to low significance region, whereas the DAX, Dow Jones and CAC index derivatives made significant contributions. This is not surprising, and at the same time confirms that the Japanese market plays a less influential role in the movements of FTSE. It is safe to state that we can remove most Nikkei index derivatives from future tests, since they contribute little.
The 1D MOMs for many items are of low significance and could be excluded, perhaps in favour of 2D MOMs.
The Day Numbers, representing seasonality, are of higher significance than might have been anticipated.
Table 7.3 — Most and least significant contributions of interest rates derivatives
Observations
Again, the 1D MOMs, the Japanese indicators and inflation seem the least significant. The 2D MOMs seem to have significant value, as do the 12 month MOMs.
The continued significance of both the 9 and 14 day FTSE RSIs suggests that we should try this derivative for other items of data.
Table 7.4 — Most and least significant contributions of exchange rate and GDP derivatives
Observations
Surprisingly, the Yen/£ exchange rate comes out as significant. Perhaps something from Japan has to be!
FTSE, Exchange rate and economic data derivatives (65 days prediction)
When the previous test was rerun to predict the 65 day, instead of the 2 day, momentum, we obtained slightly different results, as shown in graphs FT14TD04.CFT and FT14TD04.CFO. The table below shows the most and least significant contributors.
Observations
It can be seen that the 12 month % changes of the inflation and GDP indicators are the prominent factors in the longer term prediction, compared with the short-term (2 day) prediction.
CONCLUSIONS
Highlights
At this stage of the project we are, to some extent, still trying to understand the major factors involved in training a network as much as concentrating our efforts on producing networks which can predict with high accuracy. However, in the process of trying to understand these factors we have also produced some networks which gave high accuracy in their predictions, namely:
• Predicting the 2 day MAV of FTSE 65 days (3 months) ahead gave good results, e.g. an overall direction accuracy of 79%.
• Predicting the residual of FTSE (i.e. with an approximated linear trend removed from the raw values) 2 days ahead also gave good results, with an overall directional accuracy of 80% and an average percentage error of 1.4%. In particular, the average error for the first part of the test set was as low as 0.33%.
• The predictions on the first half of the test sets are better than those on the latter part of the test sets.
Conclusions
• It also appeared that the Japanese market indicators did not play a major role in the tests. We will have to carry out further tests to see whether all of the Japanese inputs can be removed without loss of predictive accuracy.
• Predicting FTSE 2 day momentum, without trend, produced acceptable results; we need to carry out further tests to see if this is also true for long-term predictions.
Network Tuning
In terms of using the NeuroShell package, we have made the following observations:
• A network with more than one output consistently failed to converge (minimise errors) on the training set, and hence produced poor predictions.
• A total of 25 hidden nodes was found to be satisfactory in most of the tests, when the number of input nodes is between 14 and 45. We have not done extensive tests on networks with a larger number of input nodes.
• The values for Threshold and Momentum which consistently gave good results were found to be 0.4 and 0.6 respectively, again when the number of input nodes is between 14 and 45.
• Reducing the values of Threshold and Momentum by 50% after the network has been trained for some time (around 2M presentations) did not improve the overall predicted results.
• The back-propagation algorithm used for learning the past experience of the market cannot handle time-series data. The ability to handle time-series data is of great importance for financial prediction, since the network needs to learn the market’s past behaviour.
• The current method of preparing input data sets, especially the calculation of derivatives, in a spreadsheet is time consuming, laborious and, most importantly, error prone. A typical test normally takes from half to a full working day for preparation and validation of the data.
• NeuroShell allows networks with only a single hidden layer. With more hidden layers a network is able to recognise a larger number of market scenarios and predict with greater accuracy.
• There are no security measures to protect the network from inexperienced users: the network parameters can easily be altered. This is dangerous because a network can only predict accurately as long as its parameters remain unchanged; a novice user tinkering with the parameters could render a well-trained network next to useless.
Analysis methods
The results of directional analysis are somewhat misleading. The figure below shows a result from one of the tests (FT10ST01). It can be seen from the graph that the accuracy of direction appears better in the first half than in the second half of the test set.
[Figure: actual vs predicted values for test FT10ST01, October 1991 – October 1992; values from +50 down to −500]
However, the directional analysis showed that the directional accuracy is better in the second half than in the first. The reason for this is as follows. During the first half of the test set the actual values fluctuate within a small range around the 0 line, so any small error in the prediction pushes the predicted value below or above the 0 line. Since directional accuracy only credits those predicted values which are in the same region (above or below the 0 line) as the actual values, the overall directional accuracy in the first half is lower than in the second half of the set, where, even though the differences between the actual and predicted values are large, both happened to be in the same region. The implication is that the directional accuracy of the predicted results is dependent upon the relative position of the 0 line.
A better assessment of the results was needed, so for each case we use the percentage
error and the standard deviation of the errors between the actual and predicted
outputs. This is done for both the learning set and the test set for comparison: a
network that does very well on the learning set (over 90% accuracy) but poorly on the
test set may simply be memorising instead of generalising. By analysing both sets we
hope to determine the right amount of learning (i.e. the number of presentations of
the cases) required to give good results on the test set and on any data the network
has not seen before.
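The two assessment measures described above can be sketched as follows. This is an illustrative Python sketch, not the software actually used in the project; the function names, the sample values, and the convention that values >= 0 count as "above" the 0 line are our assumptions.

```python
def directional_accuracy(actual, predicted):
    # Fraction of cases where the prediction falls on the same side of the
    # 0 line as the actual value (>= 0 counted as "above", an assumption).
    same = sum((a >= 0) == (p >= 0) for a, p in zip(actual, predicted))
    return same / len(actual)

def error_stats(actual, predicted):
    # Mean absolute percentage error and standard deviation of the errors,
    # the two measures used to compare the learning and test sets.
    errors = [p - a for a, p in zip(actual, predicted)]
    mean = sum(errors) / len(errors)
    std = (sum((e - mean) ** 2 for e in errors) / (len(errors) - 1)) ** 0.5
    pct = [100 * abs(e) / abs(a) for e, a in zip(errors, actual) if a != 0]
    return sum(pct) / len(pct), std
```

Note how `directional_accuracy` ignores the size of the error entirely, which is exactly why it flatters the second half of the test set.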
APPENDICES
APPENDIX A
Back-propagation algorithm
The following is an extract from a paper by Camp7. A network learns by successive
repetitions of a problem, making smaller errors with each iteration. The most commonly
used function for the error is the sum of the squared errors of the output units:
E = (1/2) ∑i (yi − di)²
The value di is the desired output of unit i, and yi is its actual output, where yi is the
sigmoid function 1/(1 + e−x). To minimise the error, take the derivative of the error
with respect to wij, the weight between units i and j:
∂E/∂wij = yi yj (1 − yj) βj
The error can then be calculated directly from the links going into the output units. For
hidden units, however, the derivative depends on values calculated at all the layers that
come after it. That is, the value β must be back-propagated through the network to
calculate the derivatives.
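A minimal numerical sketch of this gradient for a single weight into an output unit follows; the activation values, the learning rate, and the variable names are assumed for illustration and are not from the paper. For an output unit the back-propagated error term reduces to (yi − di).

```python
import math

def sigmoid(x):
    # Sigmoid activation from the text: 1 / (1 + e^-x)
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical single connection: hidden activation y_j feeding output unit i.
y_j, w_ij, d_i = 0.6, 0.4, 1.0

y_i = sigmoid(w_ij * y_j)          # actual output of unit i

# For E = 1/2 * (y_i - d_i)^2 with a sigmoid output, the chain rule gives
# dE/dw_ij = (y_i - d_i) * y_i * (1 - y_i) * y_j, where (y_i - d_i) plays
# the role of the back-propagated error term.
grad = (y_i - d_i) * y_i * (1.0 - y_i) * y_j

# One gradient-descent step with an assumed learning rate of 0.5.
w_new = w_ij - 0.5 * grad
```

Since y_i is below the target d_i here, the gradient is negative and the step increases the weight, as expected.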
7 Drew van Camp, "Neurons for Computers", Scientific American, September 1992, pp. 125-127.
APPENDIX B
FTSE Index analyses
The following is the complete set of analysis that we have performed on the FTSE Index.
The analyses that were performed are as follows:
• Momentum (close to close difference)
• Returns
• Percentage Change of momentum
• Moving Averages
• Rate of Change
• Moving Average Convergence-Divergence
• Relative Strength Index
• Zero close-close volatility
The formulae below, described using Microsoft Excel notation, assume that the
worksheet is set up as follows (the FTSE Index in column A, the derived measure in
column B):

Row   FTSE INDEX   Derivative
 1    1234.5
 2    1345.6
 3    1456.7
 4    1567.8
 5    1678.9
 6    1789.0
 7    1890.1
 8    1901.2
 9    2012.3
10    2123.4
11    2234.5
Momentum
Description
This is a measure of the difference between today's index and that of a previous day,
usually over 1, 2, 5, 25 and 50 days. For use in NeuroShell this measure is preferred
to the absolute value of the index.
Formula
Momx = v - v[n]
where n= 1 to 260 days.
Excel formula
Momx=Ax-A(x-n)
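As a sketch of the same calculation in Python (the function name is ours), mirroring the Excel formula above:

```python
def momentum(v, n):
    # Mom_x = v_x - v_{x-n}, mirroring the Excel formula Ax - A(x-n);
    # output starts at the first index x with n days of history behind it.
    return [v[x] - v[x - n] for x in range(n, len(v))]
```
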
Returns
Description
This is the logarithmic (continuously compounded) return of the index over a given
number of days, expressed as a percentage.
Formula
Returns = ln(v / v[n]) ∗ 100
Excel formula
Returnsx= LN(Ax/A(x-n))*100
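The same measure as a Python sketch (function name assumed):

```python
import math

def log_returns(v, n=1):
    # Returns_x = ln(v_x / v_{x-n}) * 100, as in the Excel formula above.
    return [math.log(v[x] / v[x - n]) * 100 for x in range(n, len(v))]
```
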
Percentage Change of Momentum
Description
This is a measure of the percentage change of momentum between today's index and
that of a previous day, usually over 10, 25 and 50 days.
Formula
PCM = ((v − v[n]) / v) ∗ 100
where n = 10, 25 and 50 days.
Excel formula
PCMx = (Ax-A(x-n))/Ax*100
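As a Python sketch of the formula (function name assumed):

```python
def pcm(v, n):
    # PCM_x = (v_x - v_{x-n}) / v_x * 100; note the denominator is
    # today's value, matching the Excel formula above.
    return [(v[x] - v[x - n]) / v[x] * 100 for x in range(n, len(v))]
```
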
Moving Averages
Description
This is a measure based on the arithmetic mean of the index. Of the various moving
average measures this is the simplest, often known as the Simple Moving Average
(SMA). A moving average smooths out fluctuations in values and may help to indicate
trends in the market. A shorter moving average (i.e. when n is small) is more sensitive
to changes and results in less smoothing than a longer moving average.
Normal usage is to compare the value of the ROC with the raw index data. A
divergence between the ROC and the price, followed by a break in the trend, indicates
a signal to buy or sell.
Formula
MAV = v[0] − (1/n) ∗ ∑(i=0..n−1) v[i]
Excel formula
MAVx = Ax - SUM(A(x-n+1):Ax)/n
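The same measure (today's value minus its n-day simple moving average) as a Python sketch, with the function name assumed:

```python
def mav(v, n):
    # MAV_x = v_x - (1/n) * sum of the n most recent values ending at x,
    # i.e. the deviation of today's value from its n-day SMA.
    return [v[x] - sum(v[x - n + 1 : x + 1]) / n for x in range(n - 1, len(v))]
```
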
Rate of Change
Description
The rate of change measures how fast the momentum of the index is changing.
Formula
ROC = (v / v[n]) × 100
Excel formula
ROCx= (Ax/A(x-n))*100
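As a Python sketch (function name assumed):

```python
def roc(v, n):
    # ROC_x = v_x / v_{x-n} * 100; a value above 100 means the index
    # is higher than it was n days ago.
    return [v[x] / v[x - n] * 100 for x in range(n, len(v))]
```
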
Moving Average Convergence-Divergence
Description
This is an indicator of overbought and oversold signals in the market. This measure is
obtained by working out the difference of the two exponential moving averages of short
and long periods. When the difference in value is greater than the exponential moving
average of the difference, it can be a signal to buy. Conversely, when the difference in
value is less than the exponential moving average of the difference, it can be a signal to
buy.
In addition, if the MACD lines are too far above or below the zero line, they could
indicate an overbought or oversold situation respectively.
Formula
w= EMA(v,sf2) - EMA(v,sf1)
MACD = EMA(w,sf3)
where EMA(v,sf) is defined by:
EMA(v,sf)x = sf ∗ vx + (1 − sf) ∗ EMA(v,sf)x−1, with EMA(v,sf)1 = v1
where sf, sf1, sf2, sf3 = smoothing factor (0.0-1.0), and sf2 > sf1
Excel formula
Here, the EMA for long and short periods are calculated first, in two different columns,
e.g.
EMA1(sf1) = v1 (for the first value)
EMAx(sf1) = 0.02∗vx + 0.98∗EMAx−1(sf1)
EMA1(sf2) = v1 (for the first value)
EMAx(sf2) = 0.05∗vx + 0.95∗EMAx−1(sf2)
MACDx = EMAx(sf3) = 0.1 × [EMAx(sf2) − EMAx(sf1)] + 0.9 × MACDx−1
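The EMA recursion and the MACD construction can be sketched in Python as follows; the default smoothing factors follow the worked Excel example (0.02, 0.05, 0.1), and the function names are ours:

```python
def ema(values, sf):
    # EMA recursion from the text: the first output equals the first value,
    # then EMA_x = sf * v_x + (1 - sf) * EMA_{x-1}.
    out = [values[0]]
    for v in values[1:]:
        out.append(sf * v + (1 - sf) * out[-1])
    return out

def macd(values, sf1=0.02, sf2=0.05, sf3=0.1):
    # MACD = EMA(EMA(v, sf2) - EMA(v, sf1), sf3), with sf2 > sf1 so the
    # first term is the shorter-period (more responsive) average.
    diff = [s - l for s, l in zip(ema(values, sf2), ema(values, sf1))]
    return ema(diff, sf3)
```
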
Relative Strength Index
Description
This is an indicator of trend reversals in the market, and is preferred over the momentum
indicators.
Formula
If MEMA(u,n) = MEMA(d,n) = 0
then
RSI(v,n) = 50
else
RSI(v,n) = 100 × MEMA(u,n) / [MEMA(u,n) + MEMA(d,n)]
where v = close
u = max(v − v[1], 0)
d = max(v[1] − v, 0).
MEMA(v,n), the modified exponential moving average with smoothing factor 1/n, is
given by:
MEMA(v,n)x = [vx + (n − 1) × MEMA(v,n)x−1] / n
Excel formula
Again, MEMA for the two cases, MEMA(u,n) and MEMA(d,n), are calculated first.
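The whole RSI calculation can be sketched in Python; the function names are ours, and the MEMA form (smoothing factor 1/n, the standard Wilder-style modified EMA) is an assumption, as the report's own MEMA definition is not reproduced here.

```python
def mema(values, n):
    # Modified EMA with smoothing factor 1/n (assumed Wilder-style form):
    # MEMA_x = (v_x + (n - 1) * MEMA_{x-1}) / n, seeded with the first value.
    out = [values[0]]
    for v in values[1:]:
        out.append((v + (n - 1) * out[-1]) / n)
    return out

def rsi(v, n):
    # u = upward moves, d = downward moves, both floored at zero.
    u = [max(v[x] - v[x - 1], 0) for x in range(1, len(v))]
    d = [max(v[x - 1] - v[x], 0) for x in range(1, len(v))]
    up, down = mema(u, n)[-1], mema(d, n)[-1]
    if up + down == 0:
        return 50.0          # no net movement either way
    return 100.0 * up / (up + down)
```

A steadily rising series gives an RSI of 100, a steadily falling one 0, and a flat one 50, matching the three branches of the formula.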
Zero close-close volatility
Description
This is an estimate of volatility in the market and the major assumption here is that the
underlying distribution has a zero trend.
Formula
y = ln(close/close[1])
t = time (in years) until end of period
ZCCV = 100 ∗ √[(1/n) ∗ ∑(i=0..n−1) y[i]² / (t[i] − t[i+1])]
     = 100 ∗ √[(1/(n−1)) ∗ ∑(i=1..n) yi²] ∗ √256
Excel formula
ZCCVx = 100*STDEV(Ax:Ax-n)*SQRT(256)
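A Python sketch of the annualised form (function name assumed):

```python
import math

def zccv(v, n):
    # Zero close-close volatility over the last n log returns, annualised
    # with 256 trading days, under the stated zero-trend assumption.
    y = [math.log(v[x] / v[x - 1]) for x in range(1, len(v))][-n:]
    var = sum(r * r for r in y) / (n - 1)   # zero-mean "variance"
    return 100.0 * math.sqrt(var) * math.sqrt(256)
```

Note that Excel's STDEV subtracts the sample mean, whereas this sketch uses the zero-mean form of the formula; under the zero-trend assumption the two agree closely.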