
Paterson, Grant & Watson Limited

Predictive Targeting with Neural Networks


Version 2.0
for Oasis Montaj 7.1 and Later

USER MANUAL
R. Jason Hearst
June 2008

Upgrade history:

Allowing users to browse for the folder containing the input training/simulation/target definition grids; Karl Kwan, October 5, 2008

Table of Contents

1.0 The Neural Network Concept

2.0 Installation
2.1 Installation Notes

3.0 Usage
3.1 Loading the Menu
3.2 Interface

4.0 Inputs
4.1 Training Grids
4.2 Target Grid
4.3 Simulation Grids
4.4 Weights
4.5 Neural Network Algorithms

5.0 Data Smoothing
5.1 Target Smoothing

6.0 The Fast Classification Neural Network
6.1 FCNN Training
6.2 FCNN Simulation
6.3 FCNN Tuning

7.0 The Levenberg-Marquardt Neural Network
7.1 LMNN Training
7.2 LMNN Simulation
7.3 LMNN Tuning
7.4 FCNN vs. LMNN

8.0 Output
8.1 The Output Database
8.2 Gridding the Output
8.3 Clipping the Grid

9.0 Transfer Functions
9.1 Transfer Function Definitions
9.2 Effects of Differing Transfer Functions

10.0 References

1.0 The Neural Network Concept

The basic premise behind a neural network is that it is software that can learn
and apply what it has learnt to new data.

In the case of this application, the neural network learns by being shown several
grids modeling different traits of a geological area; these grids are called the
training grids. It is then shown a grid composed entirely of 1s and 0s that
identifies specific regions where desired anomalies or known deposits exist; this
grid is called the target grid. This constitutes the training of the neural network.

The neural network is then provided grids of a different area modeling the same
traits in the same order as the first training grids; these new grids are the
simulation grids. The neural network now applies what it has learned from the
first set of grids to the second set of grids.

The output is a database, which can be gridded by minimum curvature gridding.

The results show the regions of the second set of grids that are most similar to
the regions identified by the training. The output values are in decimal
percentage form, where 1 means identical to the training, 0 means not at all
similar, and values in between represent varying degrees of similarity.

2.0 Installation

To install the Predictive Targeting with Neural Networks (PTNN):


1) Copy nn_predtar.dll into the Geosoft Oasis Montaj bin directory. If
Oasis Montaj was installed in the default directory, this will be:
C:\Program Files\Geosoft\Oasis montaj\bin
2) Copy nn_predtar.omn into the Geosoft Oasis Montaj menu directory. If
Oasis Montaj was installed in the default directory, this will be:
C:\Program Files\Geosoft\Oasis montaj\omn

2.1 Installation Notes

To use PTNN, you must install the GX nnprep.gx, or you must have some other
means of producing target grids (see 4.2 Target Grid). Once installed,
nnprep.gx is accessible from the menu.

PTNN was developed on Oasis Montaj 7.0 and later with Microsoft .NET
Framework 2.0; although backward compatibility may be possible, it has not
been tested and is not supported.

3.0 Usage

3.1 Loading the Menu

To use PTNN you must first load the menu. To do this:


1) Open a workspace in Geosoft Oasis Montaj.
2) Click on the GX menu, and select Load Menu.
3) Select nn_predtar.omn, and click Open.

3.2 Interface

To access the PTS2 interface, click on the PGW Predictive Targeting menu and
select Neural Network Simulation. The interface will appear as shown in Figure
1.

Figure 1 PTNN Interface

The interface consists of six regions, each of which is discussed briefly below.

3.2.1 Neural Network Inputs

The Neural Network Inputs region is where the training and simulation grids are
entered. Combo boxes are filled with any grids in the working directory, and
browse buttons are provided for grids outside the working directory. At least one
of each type of grid must be entered, and the number of training grids must
match the number of simulation grids. For more information on the input grids
see sections 4.1 Training Grids, 4.2 Target Grid, and 4.3 Simulation Grids.

Input weights are also featured in this section of the interface. They allow for a
bias to be put on any of the input grids. For more information on the input
weights see section 4.4 Weights.

3.2.2 Target Grid Input

The target grid, which defines the areas for NN training, must be selected here.

3.2.3 Neural Network Output

The Neural Network Output region is where the type of neural network algorithm
is chosen and where the output database name is entered. Three different
neural network (NN) algorithm options are provided. For more information on the
algorithms see section 4.5 Neural Network Algorithms.

The output database name can be either an existing database or a new
database. If a new database is desired, simply enter the name of the new
database. If you want the results to be put into an existing database, first make
sure that the database is not open, and then enter its name.
Note that if an existing database is used, PTNN will overwrite all the data that is
currently in it, and replace it with the neural network simulation output.

3.2.4 Data Smoothing

Some basic smoothing is available to the user; smoothing here means giving a
cell the average value of all cells in a 1x1, 3x3, or 5x5 square centred on it,
depending on the user selection. For more information on data smoothing see
section 5.0 Data Smoothing.

3.2.5 FC Neural Network Tuning Controls

The Fast Classification Neural Network (FCNN) can be tuned by adjusting the
output transfer function. For more information on this see section 6.3 FCNN
Tuning.

3.2.6 LM Neural Network Tuning Controls

The Levenberg-Marquardt Neural Network (LMNN) can be tuned by adjusting any of
four controls:

Input Transfer Function
Output Transfer Function
Maximum Number of Training Epochs
Acceptable Error Range

For more information on these controls, see section 7.3 LMNN Tuning.

Note that PTS2 only takes into account the necessary inputs, so if the selected
NN algorithm is LMNN, then the FCNN Tuning Control settings will be
disregarded; the opposite is also true. If the Combined FC & LM algorithm is
selected, then both sets of tuning controls will be taken into account.

3.2.7 Simulating

Once all of the Neural Network settings are set as desired, clicking the “OK”
button starts the simulation. Clicking “Cancel” will close the window, and any
information that has not finished computing will be lost.

3.2.8 Default Settings

Both the FCNN and LMNN Tuning Controls have default settings. The interface
automatically resets to the defaults for the tuning controls every time it is
restarted. In contrast, entries in the NN Inputs, NN Outputs, and Data Smoothing
regions are saved and automatically filled with the values used the last time a
simulation was run. If you are unsure whether a value is set to its default,
you can enter the word “default” in the control box and the default setting or value
will be used by the neural network.

4.0 Inputs

The neural network requires 5 different sets of inputs to run.

The inputs are the following:


Training Grids (max 10)
Target Grid (1)
Simulation Grids (max 10, must match number of training grids)
Weights (10)
Neural Network Algorithm Selection

4.1 Training Grids

The training grids are the base grids in which you highlight a trend that you want
to find in other data (the simulation grids). The number of training grids must be
between a minimum of 1 and a maximum of 10. The training grids can be
models of different types of data (e.g. K, Th, EM, etc.), but must all have the
same grid properties (listed below); essentially, all grids must be of the same
geological area. All grids must be in Geosoft grid format (.GRD).

Please note that you do not need to worry about modifying the data ahead of
time, as the neural network allows for some basic data smoothing (see 5.0 Data
Smoothing) and adjusts the data so that it is on comparable scales.

It is also possible to emphasize the importance of one training grid over another
by using the weighting tool (see 4.4 Weights).

Properties that must be common for all training grids (a quick compatibility check is sketched after this list):


Number of elements in X direction
Number of elements in Y direction
Separation in X direction
Separation in Y direction
Initial X position
Initial Y position
Orientation (KX)
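As a quick compatibility check outside Oasis Montaj, the small Python sketch below compares these properties between grids. The GridProperties structure and its field names are hypothetical illustrations; reading the actual values from a .GRD file (normally done through the Geosoft API) is not shown.

from dataclasses import dataclass

@dataclass(frozen=True)
class GridProperties:
    # Hypothetical container mirroring the property list above.
    nx: int          # number of elements in X direction
    ny: int          # number of elements in Y direction
    dx: float        # separation in X direction
    dy: float        # separation in Y direction
    x0: float        # initial X position
    y0: float        # initial Y position
    kx: float        # orientation (KX)

def grids_are_compatible(grids):
    # Every grid must share every property with the first one.
    return all(g == grids[0] for g in grids[1:])

# Example: two grids with identical properties pass the check.
a = GridProperties(100, 120, 50.0, 50.0, 500000.0, 4800000.0, 0.0)
b = GridProperties(100, 120, 50.0, 50.0, 500000.0, 4800000.0, 0.0)
assert grids_are_compatible([a, b])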

4.2 Target Grid

The target grid is the grid that highlights the area of interest. The area of
interest is the area that shows the trend in the training grids that you wish to find
in the simulation grids. Only one target grid per simulation may be provided to
the neural network. The target grid must be composed entirely of 1s and 0s.
NNPREP.GX can be used to generate such a grid if you do not have another
means.

The target grid must have the same properties (listed below) as the training grids,
and must be in Geosoft grid format (.GRD).

Properties that must be common for the target grid and the training grids are:
Number of elements in X direction
Number of elements in Y direction
Separation in X direction
Separation in Y direction
Initial X position
Initial Y position
Orientation (KX)

4.3 Simulation Grids

The simulation grids are the grids that represent the area in which you wish
to find the trends exhibited in your training set and targeted by the target grid. A
maximum of 10 simulation grids is allowed, and their number must match the
number of training grids. The simulation grids must be the same types of data
(e.g. K, Th, EM, etc.) and in the same order as the training grids. In addition, all
simulation grids must have the same properties (listed below); essentially, all
simulation grids must be of the same geological area (this area can differ from the
area of the training grids). All simulation grids must be in Geosoft grid
format (.GRD).

It is important to note that the simulation grids must be entered in the same order
as the training grids for meaningful results. While the order in which the training
grids are entered does not matter, the simulation grid order must match the
training grid order because the neural network compares Training Grid #1 to
Simulation Grid #1, and Training Grid #2 to Simulation Grid #2 and so on. Thus if
Training Grid #1 and Simulation Grid #1 do not represent the same type of
geophysical data, their comparison will be meaningless.

Properties that must be common for all simulation grids:


Number of elements in X direction
Number of elements in Y direction
Separation in X direction
Separation in Y direction
Initial X position
Initial Y position
Orientation (KX)

Note that these properties do not need to match those of the training and target
grid data.

4.4 Weights

The weighting tool is provided in case you wish to give emphasis to one data
type over another. For instance, if your training set consists of only a K grid and a
Th grid, but you want to put more emphasis on the K trends, you can set the
weight of K to 2 and leave Th set to 1.

The weighting options provided are:

1: No emphasis
2: Double emphasis
3: Triple emphasis

Note that after every simulation the weights are set back to 1 by default.

You can set as many grids as you like to any weight. For instance, you may
enter 10 grids and set all of their weights to 3, but the results would be the same
as setting them all to 1, while the processing time would be significantly higher.

4.5 Neural Network Algorithms

Three options are available for selecting a neural network algorithm:


Fast Classification (FC)
Levenberg-Marquardt (LM)
Combined FC & LM

These are three different algorithms for performing the neural network training
and simulation. The first uses an instantaneously trained neural network
called a Fast Classification Neural Network or FCNN (see 6.0 The Fast
Classification Neural Network). The second option uses Levenberg-Marquardt
optimization to train the neural network and is a classic feed-forward neural
network; we call it a Levenberg-Marquardt Neural Network or LMNN (see 7.0
The Levenberg-Marquardt Neural Network). The third option is provided in case
the user wants an output that is somewhere between the two aforementioned
options. The Combined FC & LM option simply simulates the neural network
with both algorithms and then averages the results.
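For illustration, if fc_result and lm_result were the two simulation outputs for the same cells (hypothetical values below), the combined output would simply be their cell-by-cell average:

import numpy as np

fc_result = np.array([0.92, 0.40, 0.05])   # hypothetical FCNN output values
lm_result = np.array([0.88, 0.10, 0.02])   # hypothetical LMNN output values
combined = (fc_result + lm_result) / 2.0   # Combined FC & LM result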

By default the selected algorithm is FC, but after the first time PTS2 is used in a
project, the interface will remember the last algorithm used and display it in the
input box.

5.0 Data Smoothing

The neural network allows for some basic grid data smoothing of three different
parameters with the following options:

Training Smoothing: 1x1, 3x3, 5x5
Target Smoothing: 1x1, 3x3
Results Smoothing: 1x1, 3x3

The effects of smoothing can usually be seen both numerically (in the database)
and graphically (on the grid). Applying different amounts of smoothing to different
parameters will change the results by varying degrees depending on the input
data. In the PTS2 context, smoothing refers to preconditioning the data by
averaging adjacent cells together to make up the value of a single cell.

Note that smoothing the target grid (1s & 0s) creates a slightly different effect;
see 5.1 Target Smoothing.

The meaning of the different smoothing terms is as follows:

1x1: No averaging is applied at all; the data is fed into the neural
network as the user has provided it.

3x3: The value in each cell that the neural network receives is actually
the average of a 9-cell, or 3x3, grid surrounding each cell. This is
accounted for at the edges and corners (e.g. each corner is
really only an average of a 2x2 grid).

5x5: The value in each cell that the neural network receives is actually
the average of a 25-cell, or 5x5, grid surrounding each cell. This is
accounted for at the edges and corners (e.g. each corner is
really only an average of a 3x3 grid).
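The following NumPy sketch (not PTNN's internal code) reproduces the averaging described above, including the reduced window at edges and corners, where only the cells that actually exist contribute to the average.

import numpy as np
from scipy.signal import convolve2d

def smooth(grid, window):
    # window is 1, 3 or 5; 1 means no smoothing at all.
    if window == 1:
        return grid.copy()
    kernel = np.ones((window, window))
    # Sum of the neighbourhood values and the count of real (in-bounds) cells.
    sums = convolve2d(grid, kernel, mode="same")
    counts = convolve2d(np.ones_like(grid), kernel, mode="same")
    return sums / counts

# Example: the corner cell of a 3x3 smooth is the average of a 2x2 block.
g = np.arange(16, dtype=float).reshape(4, 4)
assert np.isclose(smooth(g, 3)[0, 0], g[:2, :2].mean())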

5.1 Target Smoothing

Two options are available for target smoothing: 1x1, or 3x3.

Selecting 1x1 smoothing is equivalent to applying no smoothing at all, so the
target is passed into the neural network as a vector of only 1s. Note that no
zeroes are included, because any information that is not in the target area is not
regarded as relevant by the neural network.

Selecting 3x3 smoothing applies the 3x3 smoothing discussed in 5.0 Data
Smoothing. This effectively makes the target vector a gradient representing the
target areas. The target values near the edges of the target are averaged with 0
values, which makes them lower than the target values found near the centre of
the target. Thus the neural network will look for everything in the target area, but
put a higher value on what is found near the centre of the target area than on
what is found near the edges.

6.0 The Fast Classification Neural Network

Fast (or sometimes referred to as Fuzzy) Classification Neural Networks (FCNN)
belong to the class of instantaneously trained neural networks. This means that
the training process occurs almost immediately. What a FCNN gains in training
time, it gives up slightly in simulation result data range and processing time.
While training of a FCNN is always faster than that of a Levenberg-Marquardt NN
(LMNN), the simulation time may be slightly longer; also the output data of a
FCNN tends to span a wider range than that of a LMNN, although this can be
compensated for by changing the output transfer function; see 9.2.1 Transfer
Function Effects on FCNN.

The results of the FCNN tend to be more granulated than those of the LMNN. This
is because FCNNs do no manipulation of the target data at all, whereas an LMNN
adjusts weights. An FCNN tends to give more points a partial membership rating,
meaning that it will find more points that partially match the target than an LMNN,
whose results are more clear-cut but may miss anomalies.

The FCNN belongs to a newer generation of neural networks than the LMNN.

6.1 FCNN Training

The user is allowed no control over the training of the FCNN because it is an
intricate process with very few variables, and the variables it does have have
little effect on the results of the simulation when applied to geophysical data.

Figure 2 Fast Classification Neural Network Diagram. Image sourced from [1].

The FCNN training creates one node for each target cell and gives it weights
directly from the training data. The radius of generalisation, a measure of how
close the nodes are to each other mathematically, is then computed, and the
output weights are set equal to the target values. Figure 2 shows the model of a
FCNN.

This process is referred to as instantaneous because all the weight values were
just set without computation. The only computed values were the radii of
generalization, which can be computed relatively quickly by a computer.
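The Python sketch below is only a loose illustration of the structures just described (one node per target cell, input weights copied straight from the training data, output weights equal to the target values, and a radius of generalisation measuring node spacing); it is not the FCNN implementation of reference [1], and the nearest-neighbour distance used for the radius is an assumption.

import numpy as np

def train_fcnn(training_vectors, target_values):
    # training_vectors: (n_cells, n_grids) rows of training-grid values.
    # target_values:    (n_cells,) corresponding target-grid values.
    nodes = training_vectors.copy()        # input weights assigned, not computed
    out_weights = target_values.copy()     # output weights set to the target values
    # The only real computation: how close the nodes are to one another.
    dists = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    radii = dists.min(axis=1)              # radius of generalisation per node
    return nodes, out_weights, radii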

6.2 FCNN Simulation

The FCNN simulation consists of sending the simulation data through the system
of weights and functions generated by the training. The purpose of the output
transfer function (the only user tuneable quantity in this NN) is to put the output
on a range [0,1] so that it can be read as a decimal percentage. The next section
discusses this in greater detail.

6.3 FCNN Tuning

Only one tuning control is provided to the user for the Fast Classification Neural
Network (FCNN). This tuning control is the output transfer function.

The purpose of the output transfer function is to smooth and filter the data onto
the range [0,1] to represent it as a decimal percentage. Changing the transfer
function does not change the trends shown in the output, but does change the
range of the simulation results.

The transfer function options are:


Pure Linear
Sigmoid
Hyperbolic Tangent
Elliot’s Function

For greater discussion on the transfer functions, their definitions and effects on
the data see section 9.0 Transfer Functions.

The default transfer function is “Pure Linear.” The word “default” may also be
typed into this control and the neural network will use the Pure Linear function.

7.0 The Levenberg-Marquardt Neural Network

“Levenberg-Marquardt Neural Network” or LMNN is actually a misnomer: the
network is a feed-forward neural network that uses Levenberg-Marquardt
optimization for the training process.

The LMNN is an older neural network design and provides more rounded results
than the Fast Classification NN (FCNN). The original version of the Predictive
Targeting Suite used this NN algorithm.

7.1 LMNN Training

The LMNN is again modelled on a nodal structure, except this time the
network consists of only two layers of nodes and no other intricacies. The first
layer is the input nodes, which filter and manipulate the data, and the second
layer is the output nodes, which do further filtering and manipulation on the
data.

Each node consists of input weights (one for each input to the node) and an
output bias (one for each node). The goal of the weights is to adjust the value so
that it is near the target value, and the goal of the bias is to adjust for any remaining error.

The weights are created by first filling all weights with random values and then
adjusting the random values according to the Levenberg-Marquardt optimization
algorithm (1), until the training ending criteria are met. Some of these criteria are
fixed within the neural network and some are user accessible. The maximum
number of epochs and the error goal are both user modifiable and are discussed in
7.3.3 Maximum Number of Training Epochs and 7.3.4 Acceptable Error Range
respectively.

x k 1  x k  [ J J   I ] 1 J e
T T
(1)

The Levenberg-Marquardt optimization algorithm computes the next suggested
value based on the current value. In (1), J is the Jacobian matrix of f(x), where x
is in our case a vector of the current weights, e is a vector of the error between
the target and the current value, and μ is a training parameter that is made
smaller as the training converges on the target.

An important thing to note about this method of training is that the first step in the
process is assigning the weights random values. This means that every time the
simulation is run it will have a different result, even if run on the exact same data.
Variations have been observed to be on the order of ±0.05. Note that although
the numerical values do vary slightly, the trends remain exactly the same.
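As an illustration, equation (1) can be transcribed directly into NumPy. This is only a sketch of a single update step; the random initialisation, the training loop, and the adjustment of μ are internal to PTNN.

import numpy as np

def lm_step(x, jacobian, error, mu):
    # One Levenberg-Marquardt update of the weight vector x, as in (1):
    #   x_{k+1} = x_k - [J^T J + mu*I]^{-1} J^T e
    JtJ = jacobian.T @ jacobian
    step = np.linalg.solve(JtJ + mu * np.eye(JtJ.shape[0]), jacobian.T @ error)
    return x - step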

7.2 LMNN Simulation

Once the training has set all the weights and biases, the simulation simply passes
the simulation data through this system of weights and biases. The general
equation that the data sees at each node is modelled by (2).

o_i = f(x_i w_i + b_i)    (2)

In (2), o_i represents the nodal output, x_i the nodal input, w_i the nodal weight
for that particular input, b_i the nodal bias, and f the transfer function.
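A minimal sketch of (2), with the node's inputs gathered into a dot product; the transfer function shown (tanh) is only an example of f.

import numpy as np

def node_output(x, w, b, f=np.tanh):
    # Evaluate (2) for one node: o = f(x . w + b).
    return float(f(np.dot(x, w) + b))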

7.3 LMNN Tuning

Four tuning options are made available to the user for the LMNN. These options
are:
Input Transfer Function
Output Transfer Function
Maximum Number of Training Epochs
Acceptable Error Range

7.3.1 Input Transfer Function

The purpose of the input transfer function is to smooth and filter the input data
onto the range [0,1]; this way data from different types of measurements can be
compared. Changing the transfer function does not change the trends shown in
the output, but does change the data range and values.

The transfer function options are:

Pure Linear
Sigmoid
Hyperbolic Tangent
Elliot’s Function

For greater discussion on the transfer functions, their definitions and effects on
the data, see 9.0 Transfer Functions.

The default transfer function is “Pure Linear.” The word “default” may also be
typed into this control and the neural network will use the Pure Linear function.

7.3.2 Output Transfer Function

The purpose of the output transfer function is to smooth and filter the output data
onto the range [0,1]. Changing the transfer function does not change the trends
shown in the output, but does change the data range and values.

The transfer function options are:


Sigmoid
Hyperbolic Tangent
Elliot’s Function

The default transfer function is “Hyperbolic Tangent.” The word “default” may
also be typed into this control and the neural network will use the Hyperbolic
Tangent function.

Note that the default setting is:

Input TF: Pure Linear
Output TF: Hyperbolic Tangent

This setting is designed to portray the simulation data over the widest possible
range. If this is not the desired goal, i.e. you want a more limited range of
simulation results, you should change these settings; see 9.2.2 Transfer Function
Effects on LMNN for further discussion on varying transfer functions.

7.3.3 Maximum Number of Training Epochs

The user is allowed to adjust the maximum number of training epochs. A training
epoch is one training cycle. If the training is not converging, or is not converging at
a reasonable rate, the best training result found by the time the maximum number
of epochs is reached is taken as the final training result. The reason for limiting the
number of epochs is that sometimes the training does not converge, or
convergence takes too many iterations for the processing to be efficient.

The default maximum number of epochs is 500; typing “default” in this selection
will send 500 to the NN.

7.3.4 Acceptable Error Range

The acceptable error range is the error goal of the training. The LMNN is trained
on a basis where it adjusts weighting parameters that multiply the training data in
an attempt to make it match the target. The error range is the range around the
target value that is acceptable as “matching” the target value.

Increasing this value will decrease processing time and accuracy; decreasing it
will do the opposite.

The default error range is 0.001; typing “default” in this selection will send 0.001
to the NN.
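A schematic training loop (not PTNN's internal code) showing how the two user-visible stopping criteria interact; step_fn stands in for one Levenberg-Marquardt epoch that returns updated weights and the current error.

def train(step_fn, initial_weights, max_epochs=500, error_goal=0.001):
    weights = initial_weights
    best_weights, best_error = weights, float("inf")
    for epoch in range(max_epochs):
        weights, error = step_fn(weights)   # one training epoch
        if error < best_error:              # keep the best result seen so far
            best_weights, best_error = weights, error
        if error <= error_goal:             # within the acceptable error range
            break
    return best_weights, best_error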

7.4 FCNN vs. LMNN

The overall results of the two different neural networks are generally very similar.
The highest-matching areas in simulations of the same data are the same with
either network, but the mid-valued locations are where we see the predominant
difference.

We find that the FCNN gives a lot more semi-membership weighting to points;
this means that the FCNN finds many more points that match the training area
well, but few that match it almost exactly. In contrast, the LMNN has a very
straightforward approach and the resulting values almost always form a
near-perfect Gaussian distribution.

Figure 3 shows the histograms of the tutorial data simulated with default settings
using both FCNN and LMNN. Figure 4 shows the same histograms auto-scaled
so we can observe their specifics better.

Figure 3 Histograms of the tutorial data gridded at default settings. From left to right:
FCNN, LMNN

Figure 4 Histograms of the tutorial data gridded at default settings and auto-scaled. From
left to right: FCNN, LMNN

From observing the curves we see that the FCNN tends to give more points a
higher ranking of matching the training than the LMNN, and that the range over
which these values spread is not very large, whereas the LMNN gives a very
smooth, very broad Gaussian distribution.

In terms of performance, the FCNN trains instantaneously, while the LMNN
takes time, and the larger the target area the longer an LMNN will take to train.
In terms of simulation, the FCNN takes longer than the LMNN because there is
more actual calculation in an FCNN’s simulation than in that of an LMNN. So for
small target areas (fewer than 100 target cells) the overall processing time of the
LMNN will be shorter, while for larger target areas (more than 100 target cells) the
FCNN will be faster.

8.0 Output

8.1 The Output Database

The output of the neural network is a Geosoft database file (.GDB). This file is
created in the working directory of the current Oasis Montaj workspace but will
not appear in the current workspace until added.

If you wish to write over an existing database, make sure it is closed before you
start the simulation. Note that entering the name of an existing database will
overwrite that database, so all information held in it prior to running the
simulation will be lost.

Once added to the workspace, the output database can be opened and should
appear in the form of Figure 5.

Figure 5 Geosoft database file representing the FCNN simulation of the tutorial data with
training grid smoothing set to 3x3, target grid smoothing set to 1x1, result
smoothing set to 3x3, and FCNN output transfer function set to “Pure Linear.”

In the output database, the X channel represents the X coordinate, the Y channel
represents the Y coordinate, and the NNsimCH channel represents the
simulation value for the corresponding X and Y channel values. Note that all
three X, Y, and NNsimCH values depend on the neural network inputs, so
they will vary from simulation to simulation.

The values of the NNsimCH channel represent the percentage match, in decimal
form, to the target area. Thus a value of 1 means an exact match, while a value of 0
means no match at all; values between 0 and 1 represent varying degrees of
matching to the target area.

8.1.1 Default Names

If no output database filename is specified, the neural network will assign a
default filename to the output. Note that if this output filename already has a
simulation database stored in it, the neural network will overwrite this previous
simulation result.

The default filenames of simulations vary with the selected neural network
algorithm. The default filenames are as follows:

Fast Classification (FC): “FCNN_out.GDB”
Levenberg-Marquardt (LM): “LMNN_out.GDB”
Combined FC & LM: “DUONN_out.GDB”

8.2 Gridding the Output

The simulation results stored in the database can be gridded to provide a visual
representation of the simulation.

To grid the database results, make sure that you are currently in the database,
then go to the Grid and Image menu, select the Gridding sub-menu, and then the
Minimum Curvature option. The minimum curvature gridding prompt will appear.
Select the NNsimCH channel, and make sure that for “Grid cell size” you enter
the separation size of your simulation grids (this should be the same for all of
them). Name the grid and press “OK”.

Figure 6 shows the gridded output of the simulation of the tutorial data for all
three neural network algorithm choices.

Figure 6 Gridded simulation outputs from all three neural network algorithms. From left to
right: Fast Classification (FC), Levenberg-Marquardt (LM), Combined FC & LM.
The default settings are used to generate all three databases and grids.

You can see that the main high points on all three grids are the same, but that
different neural networks offer different interpretations of the data, just as the same
curve can be fitted using different criteria, e.g. the L1-norm, L2-norm, etc.

Determining which neural network provides the most accurate results comes
from research and testing. This package provides the tools that allow this to
be done.

8.3 Clipping the Grid

Grid clipping can be used to further emphasise the main regions of interest, i.e.
the regions where the neural network simulation has yielded the highest values.

This can be done by gridding only the top quartile, decile, etc. of the data. The
following description is for finding the top decile (top 10%) of the data; a similar
process can be used to find the top quartile or any other segment.
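If you prefer to compute the threshold numerically rather than reading it from the histogram described next, the sketch below gives the same top-decile cutoff to use as "Z MIN"; the nn_values array is a placeholder standing in for the exported NNsimCH channel.

import numpy as np

rng = np.random.default_rng(0)
nn_values = rng.random(10_000)              # placeholder for the NNsimCH channel
z_min = np.percentile(nn_values, 90)        # 90% of values fall below this value
top_decile = nn_values[nn_values >= z_min]  # cells kept when clipping to the top decile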

To find the top decile, view the histogram of the gridded database. To do this,
right click on the grid name in the side bar, and select “Properties.” Then click
the “Stats” button, and subsequently the “Histogram” button.

Figure 7 shows the histograms of the three grids in Figure 6 adjusted to their top
decile.

Figure 7 Histograms highlighting the top decile of the data represented by the grids in
Figure 6. From top left to right, and then bottom, the histograms represent:
FCNN, LMNN, DUONN.

You can adjust to the top decile by moving the cursor (the red line) to different
points on the histogram until the value in the “Cursor” box next to “%” shows ~90.
Then record the value next to the “X” in the same box. Thus the values
that will constitute our minimum Z values for FCNN, LMNN, and DUONN
respectively are: 0.801, 0.6, and 0.7026.

Now, in the Grid and Image menu, under the Utilities sub-menu, select Window a
Grid. A prompt will appear. Make sure that the correct grid is selected and enter
an output name. The only other parameter that must be entered is the minimum
value of the top decile, which was found from the histogram. This is entered in
the “Z MIN” selection. Press the “OK” button and the clipped grids will be
created.

Figure 8 shows the top decile clipped grids for the grids presented in Figure 6.

Figure 8 Grids clipped to the top decile, representing the simulation data of the grids in
Figure 6.

9.0 Transfer Functions

The transfer functions are the single greatest tuning tool offered to the user for
modifying the output of the neural network. They allow the user to adjust the
shape and range of the simulation output values. It is very
important to note that even though the transfer functions can squeeze or flatten
the simulation results, they do not change the trends of the results; thus,
regardless of what transfer function is used, the gridded images will almost
always look exactly the same. To change the appearance of a gridded image,
changing the data smoothing options will have a greater effect.

9.1 Transfer Function Definitions

There are four transfer functions utilised by PTS2. They all normalise the data
onto the range [0,1]. Some of the functions themselves normalise the data onto
the range [-1, 1] and are then adjusted to [0, 1]. The definitions of the transfer
functions are listed below.

Pure Linear:
f(x) = x    (3)

Sigmoid (Logistic Function):
f(x) = 1 / (1 + e^{-x})    (4)

Hyperbolic Tangent:
f(x) = tanh(x) = (e^x - e^{-x}) / (e^x + e^{-x})    (5)

Elliot’s Function:
f(x) = x / (1 + |x|)    (6)

Note that the Pure Linear function is always passed through a normalisation
algorithm that brings the value onto [0, 1] after it goes through the transfer
function.
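For reference, the four functions of (3) to (6) can be written out directly in Python. The rescaling helper is only an illustration of the adjustment onto [0, 1]; PTNN's own normalisation step is internal to the software.

import numpy as np

def pure_linear(x):
    return np.asarray(x, dtype=float)            # (3)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-np.asarray(x)))  # (4)

def hyperbolic_tangent(x):
    return np.tanh(x)                            # (5)

def elliot(x):
    x = np.asarray(x, dtype=float)
    return x / (1.0 + np.abs(x))                 # (6)

def rescale_to_unit(y):
    # Shift and scale values onto [0, 1] (illustrative only).
    y = np.asarray(y, dtype=float)
    return (y - y.min()) / (y.max() - y.min())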

9.2 Effects of Differing Transfer Functions

Changing the transfer function will change the results of the simulation. Note that,
as previously mentioned, these changes affect only the numerical values; the
overall trends observed remain generally or exactly the same regardless of
transfer function. For some applications it is better to have the data on a small,
tight range, and for others we want the simulation results to span the widest
possible range. The following examples are intended to show what effects the
different transfer functions have on the data.

9.2.1 Transfer Function Effects on FCNN

Figures 9 through 12 show the histograms of the default simulation settings for the
FCNN but with varying transfer functions (TFs). All histograms are plotted on the
range [0.1, 0.9].

Figure 9 FCNN default simulation with TF: Pure Linear (default TF)

Figure 10 FCNN default simulation with TF: Sigmoid

Figure 11 FCNN default simulation with TF: Hyperbolic Tangent

Figure 12 FCNN default simulation with TF: Elliot’s Function

Observing the above four figures we can see how the different transfer functions
shift and squeeze the simulation results. We will take Figure 9 as our reference
because it uses the default settings and the default transfer function, Pure Linear.

If we observe Figure 10, we see that the data has been significantly compressed
to a much smaller range, and it has been shifted to lower values. This is the
effect of the Sigmoid function.

Comparing Figure 11 to Figure 9, we notice that the overall shape is the same but the
plateau regions in the areas of higher concentration seem to be more elongated.
Also, observing the maximum and minimum values on the histogram, the data
has been shifted up a very slight amount, with both the minimum value and the
maximum value increasing relative to those of the default settings in Figure 9. These are
the effects of the Hyperbolic Tangent function: increasing the data range upward
and creating higher concentrations of already highly concentrated points.

Observing the last histogram in Figure 12 we see that the range has actually
been compressed and shifted upward by a slight amount. This is the effect of
Elliot’s function.

So to summarise, the transfer functions have the following effects:

Sigmoid: Compresses the data and shifts it to lower values.

Hyperbolic Tan.: Keeps the data at the same general frequency but shifts it
upward slightly.

Elliot’s function: Compresses the data and shifts it to slightly higher values.

9.2.2 Transfer Function Effects on LMNN

There are a total of 12 different possible combinations of transfer functions for
the LMNN, but we will consider only three; this sample will show the effects of all
transfer functions (TFs) on the LMNN. It is important to note that there are four
TFs available as the input TF, but only three as the output TF. Pure Linear is not
allowed as an output TF because it would allow the results of the LMNN to exceed
the bounds of [0, 1] due to internal data manipulation.

Figures 13 to 15 show the results of the same LMNN simulation when different
transfer functions are applied.

Figure 13 LMNN default simulation with Input TF: Pure Linear, and Output TF: Hyperbolic
Tangent (default TFs)

Figure 14 LMNN default simulation with Input TF: Pure Linear, and Output TF: Sigmoid

Figure 15 LMNN default simulation with Input TF: Pure Linear, and Output TF: Elliot’s
Function

It must be noted that the default settings are input transfer function: Pure Linear,
and output transfer function: Hyperbolic Tangent. Out of all the possibilities, this
provides the widest simulation data range, and is set as the default for this
purpose. This is shown in Figure 13.

Observing Figure 14 we see that the Sigmoid function has again compressed the
data into a much smaller segment, and the frequency with which values repeat is
significantly higher. Also, all the data has been compressed into the upper half of
the range of the default setting.

Figure 15 shows the effect of Elliot’s function, which again compresses the data
into a smaller area, but not into as small an area as the Sigmoid does. Opposite to
what happens with the FCNN, this time Elliot’s function pushes the values into the
lower half of the default range instead of the upper half.

To summarise the effects of the various transfer functions:

Sigmoid: Compresses the data into a smaller area and shifts it to
higher values.

Hyperbolic Tan.: Provides the widest possible range of values.

Elliot’s Function: Compresses the data into a smaller area and shifts it to
lower values.

10.0 References

[1] K.W. Tang and S. Kak, "Fast classification networks for signal processing,"
Circuits, Systems Signal Processing 21 (2002) 207-224.

[2] K.W. Tang, "Instantaneous Learning Neural Networks." Ph.D. Dissertation,
Louisiana State University, 1999.
[3] S. Kak, "A Class of Instantaneously Trained Neural Networks," Baton
Rouge, LA; May 7, 2002.

[4] A. Ahl, W. Seiberl, and E. Winkler, “Interpretation of airborne
electromagnetic data with neural networks,” Exploration Geophysics 29
(1998) 152-156.

[5] F. Au, T-C. Chen, D-J. Han, and L. Tham, “Acceleration of Levenberg-
Marquardt Training of Neural Networks with Variable Decay Rate,” IEEE,
0-7803-7898-9/03, 2003, 1873-1878.

[6] A. Neto, F. Soeiro, and H. Velho, “A combination of artificial neural
networks and the Levenberg-Marquardt method for the solution of inverse
heat conduction problems,” research paper, Rio de Janeiro State
University, 2005.

[7] “Predictive Targeting with Neural Networks for Oasis montaj v6 – Tutorial
and User Guide,” Software Tutorial; Paterson, Grant and Watson Limited,
2004.

