
CS 473

SUBMITTED TO Dr. Shailendra Singh

Submitted By -
Name : Akshit Singla
SID : 10103006
Date : 15-10-2013

SC Assignment 2 10103006


Question 1.
List at least 10 applications of feed-forward multi-layer neural networks with a
short note on their technical details.

Answer 1.
1. Injection Moulding & Casting Process
ANNs can be employed in casting and injection moulding (IM) processes from the
moulding stage through to final inspection. Multiple regression equations obtained
from design of experiments (DOE) satisfy the need for the large training data sets
an ANN requires. ANNs address the limitations of simulation software by reducing
repetitive analysis, computational cost and time; they replace the need for experts
to interpret results and help in implementing online process control. Supervised
learning, MLP networks with one or two hidden layers, the BP and LM training
algorithms, off-line training and Matlab simulations are most common in casting
and IM work. Using activation constants in the transfer function and bias values in
MLP networks reduces the MSE to small values and improves prediction accuracy.
An ANN is best used to model the complex interactions among several input and
output parameters, and it predicts with good accuracy; the only limitations are the
need for enough training data and a willingness to accept the computational time
and cost. The empirical equations proposed for selecting the optimum number of
hidden layers/neurons, however, are not generalized and fail in other applications,
so optimum selection of hidden layers and hidden neurons is still under intensive
study. A single hidden layer finds the most applications in MLP networks; the
reason may be that adding hidden layers increases computation time, or, as shown
mathematically by Simon Haykin, that a single hidden layer with enough neurons
yields good prediction accuracy.


2. Acoustic detection of buried objects
The idea of in-line holography is used to increase the signal-to-noise ratio,
countering the effect of the concealing media, which decreases the strength of the
received signal. The performance is enhanced by using a multilayer neural network
(MLNN) for noise reduction; the aim of the network is to extract the essential
knowledge from noisy training data. Theoretical and experimental results have
shown that preprocessing the noisy data with a multilayer neural network decreases
the effect of noise as much as possible, and applying the enhanced data to spectral
estimation methods improves the performance of the model.
Research results show that, during the pre-processing stage, the MLNN was able to
enhance the tested recorded signal and produce an output signal that follows the
desired model with very good performance. It is demonstrated that the Burg method
can be used to detect and image a concealed object consisting of closely separated
points. Experimental results show that pre-processing the noisy data with the MLNN
to suppress noise as far as possible, and then applying the enhanced data to a
spectral estimation method, improves performance. The use of holography also
improves the signal-to-noise ratio by coherently accumulating the acoustic field on
the ultrasonic transducers while scanning the field.
3. Vision Problems
A class of feed-forward Artificial Neural Networks (ANNs) is successful in solving
two vision problems: recognition and pose estimation of 3D objects from a single
2D perspective view; and handwritten digit recognition. In both cases, a multi-
MLP classification scheme is developed that combines the decisions of several
classifiers. These classifiers operate on the same feature set for the 3D
recognition problem whereas different feature types are used for the handwritten
digit recognition. The back-propagation learning rule is used to train the MLPs.
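One way to combine the decisions of several classifiers is majority voting over the individual MLP predictions. The exact combination rule of the multi-MLP scheme is not detailed above, so the following is only an illustrative sketch with made-up predictions:

```python
import numpy as np

def majority_vote(predictions):
    """Combine per-classifier label predictions by majority vote.

    predictions: shape (n_classifiers, n_samples), integer class labels.
    Ties are broken toward the smaller label.
    """
    predictions = np.asarray(predictions)
    n_classes = predictions.max() + 1
    # Count, per sample, how many classifiers voted for each class.
    votes = np.apply_along_axis(
        lambda col: np.bincount(col, minlength=n_classes), 0, predictions)
    return votes.argmax(axis=0)

# Three hypothetical MLPs classifying four samples:
preds = [[0, 1, 2, 1],
         [0, 1, 1, 1],
         [2, 1, 2, 0]]
print(majority_vote(preds))  # -> [0 1 2 1]
```

Weighted voting or averaging of output activations are common alternatives when the classifiers' confidences differ.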


4. Pattern Recognition
Many pattern recognition problems, especially character or other symbol
recognition and vowel recognition, have been implemented using a multilayer
neural network. Note, however, that these networks are not directly applicable
for situations where the patterns are deformed or modified due to
transformations such as translation, rotation and scale change, although some of
them may work well even with large additive uncorrelated noise in the data.
Direct applications are successful if the data can be presented directly to the
classification network.
The neural network literature is full of pattern recognition applications. Typically
one takes pixelated image values as the network input, which map via layers of
hidden units to a set of outputs corresponding to the possible classifications of the
image.
5. Image Compression
Neural networks offer the potential for a novel solution to the problem of data
compression through their ability to generate an internal data representation. Such
a network, an application of the back-propagation network, accepts a large amount
of image data, compresses it for storage or transmission, and subsequently restores
it when desired.
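The bottleneck idea behind such a compression network can be sketched with the classic 8-3-8 encoder: eight-valued patterns are squeezed through three hidden units and reconstructed by back-propagation. The layer sizes, learning rate and iteration count below are illustrative choices, not the values of any particular published system:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.eye(8)                       # eight one-hot patterns to compress
W1 = rng.normal(0.0, 0.5, (8, 3))   # encoder weights: 8 inputs -> 3 hidden
W2 = rng.normal(0.0, 0.5, (3, 8))   # decoder weights: 3 hidden -> 8 outputs

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss():
    return np.mean((sigmoid(sigmoid(X @ W1) @ W2) - X) ** 2)

initial = loss()
for _ in range(5000):               # plain batch back-propagation
    H = sigmoid(X @ W1)             # 3-value compressed code per pattern
    Y = sigmoid(H @ W2)             # reconstruction of the 8-value input
    dY = (Y - X) * Y * (1 - Y)      # output-layer delta (MSE + sigmoid)
    dH = (dY @ W2.T) * H * (1 - H)  # hidden-layer delta
    W2 -= 1.0 * H.T @ dY
    W1 -= 1.0 * X.T @ dH
print(initial, loss())              # reconstruction error drops sharply
```

The hidden activations H are the compressed code that would be stored or transmitted; running only the decoder half restores the data.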
6. Speech recognition
The principal aim of an artificial neural network hybrid (HMM/ANN) system is to
combine the efficient discriminative learning capabilities of neural networks and
the superior time warping and decoding techniques associated with the HMM
approach. The ANN is trained to estimate HMM emission probabilities which are
then used by a decoder based on the well-known Viterbi algorithm. Among the
advantages in using such an approach is that no assumption about the statistical
distribution of the input features is necessary. Due to its classification procedure,
an MLP has the ability to de-correlate the input features. Moreover, while in
classical HMM based system, the parameters are trained according to a likelihood
criterion, an MLP also penalizes the incorrect classes. At every time n, the acoustic
vector x_n is presented to the network. This generates local posterior probabilities
that are used, after division by the class priors, as scaled likelihoods in a Viterbi
dynamic programming algorithm. An MLP trained for posterior probability
estimation may thus be used to estimate the emission probabilities.
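The division of posteriors by priors mentioned above can be sketched as follows (the probability values are made up for illustration):

```python
import numpy as np

# Hypothetical MLP outputs: posteriors P(state | x_n) for 3 HMM states.
posteriors = np.array([0.7, 0.2, 0.1])
# Class priors P(state), estimated from the training alignment.
priors = np.array([0.5, 0.3, 0.2])

# Scaled likelihoods P(state | x_n) / P(state); these replace the
# emission probabilities in the Viterbi recursion.
scaled_likelihoods = posteriors / priors
print(scaled_likelihoods)  # 0.7/0.5, 0.2/0.3, 0.1/0.2
```

In practice the division is done in the log domain to avoid underflow over long utterances.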
7. Universal Approximators
Standard multi-layer feedforward networks are capable of approximating any
measurable function to any desired degree of accuracy, in a very specific and
satisfying sense. This implies that any lack of success in applications must arise
from inadequate learning, insufficient numbers of hidden units or the lack of a
deterministic relationship between input and target.
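As a minimal illustration of this approximation capability, a single hidden layer of tanh units can represent sin(x) closely. The sketch below freezes random input weights and fits only the linear output layer by least squares, a shortcut chosen for brevity rather than the usual back-propagation training; the unit count and ranges are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-np.pi, np.pi, 200)[:, None]
target = np.sin(x).ravel()

# One hidden layer of 50 tanh units. Input-to-hidden weights are frozen
# at random values; only the linear output layer is fitted, which is
# enough to show the representational power of the hidden layer.
W = rng.normal(0.0, 1.0, (1, 50))
b = rng.uniform(-np.pi, np.pi, 50)
H = np.tanh(x @ W + b)              # hidden-unit activations, (200, 50)
coef, *_ = np.linalg.lstsq(H, target, rcond=None)

mse = np.mean((H @ coef - target) ** 2)
print(mse)  # tiny: 50 hidden units comfortably represent sin on [-pi, pi]
```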
8. Predictions
Neural networks have been applied to numerous situations where time-series
prediction is required: predicting weather, climate, stock and share prices,
currency exchange rates, airline passenger numbers, etc. We can turn the temporal
problem into a simple input-output mapping by taking the time-series values x(t),
x(t-1), x(t-2), ..., x(t-k+1) at k time slices as the inputs; the output is the
prediction for x(t+1).
Such networks can be extended in many ways, e.g. additional inputs for
information other than the series x(t), outputs for further time steps into the
future, feeding the outputs back through the network to predict further into the
future (Weigend & Gershenfeld, 1994).
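The input-output mapping described above can be sketched as a sliding-window transformation of the series (the window length k and the toy data are illustrative):

```python
import numpy as np

def make_windows(series, k):
    """Turn a 1-D time series into (inputs, targets) for one-step-ahead
    prediction: each input row holds k consecutive values, and the
    target is the value that follows them."""
    series = np.asarray(series)
    X = np.array([series[i:i + k] for i in range(len(series) - k)])
    y = series[k:]
    return X, y

X, y = make_windows([1, 2, 3, 4, 5, 6], k=3)
print(X)  # [[1 2 3], [2 3 4], [3 4 5]]
print(y)  # [4 5 6]
```

The (X, y) pairs can then be fed to any feed-forward network as an ordinary supervised-learning problem.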
9. Driving
Pomerleau(1989) constructed a neural network controller ALVINN for driving a
car on a winding road. The inputs were a 30 × 32 pixel image from a video
camera, and an 8 × 32 image from a range finder. These were fed into a hidden
layer of 29 units, and from there to a line of 45 output units corresponding to the
direction to drive.

The network was originally trained using back-propagation on 1200 simulated
road images. After about 40 epochs the network could drive at about 5 mph, the
speed being limited by the speed of the computer that the neural network was
running on.
In a later study the network learnt by watching how a human steered, and by
using additional views of what the road would look like at positions slightly off
course. After about three minutes of training, ALVINN was able to take over and
continue to drive. ALVINN has successfully driven at speeds of up to 70 mph and
for distances of over 90 miles on a public highway north of Pittsburgh. (Apparently,
actually being inside the vehicle during the test drive was a big incentive for the
researchers to develop a good neural network!)
10. Management Formation
The empirical research in strategic planning systems has focused on two areas:
the impact of strategic planning on firm performance, and the role of strategic
planning in strategic decision making. Neural networks have been utilized as an
efficient tool for determining and clarifying the relationship between strategic
planning and performance, and for assessing the decision making.
We can say that neural network approaches differ from traditional statistical
techniques in many ways and the differences can be exploited by the
application developer. They are powerful alternative tools and a complement to
statistical techniques when data are multivariate with a high degree of
interdependence between factors, when the data are noisy or incomplete, or
when many hypotheses are to be pursued and high computational rates are
required. With their unique features, both methods together can lead to a
powerful decision-making tool. Studies and investigations are being made to
enhance the applications of ANNs and to achieve the benefits of this new
technology [123]. Most frequently quoted advantages of the application of neural
networks are:
o Neural network models can provide highly accurate results in comparison
with regression models.

o Neural network models are able to learn any complex non-linear mapping /
approximate any continuous function, and can handle nonlinearities
implicitly and directly.
o The significance and accuracy of neural network models can be assessed
using the traditional statistical measures of mean squared error and R2.
o Neural network models automatically handle variable interactions if they
exist [2].
o Neural networks, as non-parametric methods, do not make a priori
assumptions about the distribution of the data / input-output mapping.
o Neural networks are very flexible with respect to incomplete, missing and
noisy data; NNs are fault tolerant.
o Neural network models can be easily updated, which means they are
suitable for dynamic environments.


Question 2.
Write a short note on the different variations of the Back Propagation
learning method.

Answer 2.
a) Standard Back-propagation (BP)
o The LMS algorithm is guaranteed to converge to a solution that
minimizes the mean squared error, so long as the learning rate is not
too large.
o A multilayer nonlinear net has many local minimum points, and the
curvature can vary widely in different regions of the parameter space.
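The LMS update underlying standard BP can be sketched for a single linear neuron; the toy data and the learning rate below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 2))
true_w = np.array([1.5, -0.5])
t = X @ true_w                      # targets from a known linear rule

w = np.zeros(2)
alpha = 0.05                        # learning rate: too large and LMS diverges
for _ in range(5):                  # a few passes over the example set
    for x_i, t_i in zip(X, t):
        e = t_i - x_i @ w           # error for this single example
        w += alpha * e * x_i        # LMS (delta rule) update
print(w)                            # approaches [1.5, -0.5]
```

In a multilayer net the same per-example update is applied layer by layer, with the errors back-propagated through the nonlinearities.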
b) Momentum
o The batching form of MOBP updates the parameters only after the
entire example set has been presented.
o With the same initial condition and learning rate, the algorithm is now
stable, and it tends to accelerate convergence when the trajectory is
moving in a consistent direction.
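A minimal sketch of the momentum update, assuming the common form in which a velocity term accumulates past gradients (the constants and the test function are illustrative):

```python
import numpy as np

def momentum_descent(grad, w0, alpha=0.1, gamma=0.9, steps=200):
    """Batch gradient descent with a momentum term: the velocity v
    accumulates past gradients, so steps accelerate while the
    trajectory keeps moving in a consistent direction."""
    w = np.asarray(w0, dtype=float)
    v = np.zeros_like(w)
    for _ in range(steps):
        v = gamma * v - alpha * grad(w)   # momentum + fresh gradient
        w = w + v
    return w

# Minimize f(w) = w1^2 + 10*w2^2, a narrow valley where plain steepest
# descent zig-zags but momentum keeps progressing.
grad = lambda w: np.array([2.0 * w[0], 20.0 * w[1]])
print(momentum_descent(grad, [3.0, 1.0]))  # close to the minimum [0, 0]
```

Setting gamma = 0 recovers plain steepest descent, which makes the stabilizing effect of the velocity easy to compare.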
c) Delta-bar-delta method
o Each weight wjk has its own learning rate αjk.
o If Δwjk keeps the same direction, increase αjk (F has a smooth curve in
the vicinity of the current W).
o If Δwjk changes direction, decrease αjk (F has a rough curve in the
vicinity of the current W).
o Delta-bar-delta also involves a momentum term.
o Quickprop algorithm of Fahlman (1988): it assumes that the error surface is
parabolic and concave upward around the minimum point, and that the
effect of each weight can be considered independently.
o SuperSAB algorithm of Tollenaere (1990): it has more complex rules for
adjusting the learning rates.

o In SDBP we have only one parameter to select, but in the heuristic
modifications we sometimes have six parameters to select.
o The modifications sometimes fail to converge, while SDBP will
eventually find a solution.
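The delta-bar-delta rate rules above can be sketched as follows. The constants (additive increase kappa, multiplicative decrease phi, averaging factor theta) and the gradient sequence are illustrative:

```python
import numpy as np

def delta_bar_delta_rates(grads, kappa=0.05, phi=0.5, theta=0.7):
    """Track per-weight learning rates over a sequence of gradients.
    delta_bar is an exponential average of past gradients; agreement
    with the current gradient grows a weight's rate additively, a sign
    flip shrinks it multiplicatively."""
    rates = np.full_like(grads[0], 0.1, dtype=float)
    delta_bar = np.zeros_like(rates)
    for g in grads:
        agree = delta_bar * g
        rates = np.where(agree > 0, rates + kappa,     # smooth region
                np.where(agree < 0, rates * phi,       # sign flip
                         rates))
        delta_bar = (1 - theta) * g + theta * delta_bar
    return rates

# Weight 0 sees a consistent gradient sign; weight 1 keeps flipping:
grads = [np.array([1.0, 1.0]), np.array([0.8, -1.0]), np.array([0.9, 1.0])]
print(delta_bar_delta_rates(grads))  # rate 0 grows, rate 1 shrinks
```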
d) Conjugate Gradient
Steepest descent (SD) is the simplest optimization method but is often slow
to converge.
Newton's method is much faster, but requires that the Hessian matrix and
its inverse be calculated.
The conjugate gradient method is a compromise: it does not require the
calculation of second derivatives, yet it still has the quadratic convergence
property.
o It is necessary to choose an initial value for every n.
o The d parameter is user defined (empirically).
o The weight update is still a function of the gradient of the error.
e) Levenberg-Marquardt
A variation of Newton's method for a non-linear error function.
Drawback: it requires a matrix inversion.
Efficient in the number of epochs, but slow within each epoch.
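The Levenberg-Marquardt step, delta_w = -(J^T J + mu I)^(-1) J^T e, can be sketched on a one-parameter curve fit (the data and damping value are illustrative):

```python
import numpy as np

# Fit y = exp(a*x) to data generated with a = 0.7; one parameter keeps
# the Jacobian small, but the same update applies to full weight vectors.
x = np.linspace(0.0, 1.0, 20)
y = np.exp(0.7 * x)

a, mu = 0.0, 0.01                      # initial guess and damping factor
for _ in range(20):
    r = np.exp(a * x) - y              # residual vector e
    J = (x * np.exp(a * x))[:, None]   # Jacobian of r w.r.t. a, shape (20, 1)
    # LM step: Newton-like speed, while the damping term mu*I keeps the
    # matrix to invert well conditioned (that inversion is the drawback).
    step = np.linalg.solve(J.T @ J + mu * np.eye(1), J.T @ r)
    a -= step.item()
print(a)  # converges to ~0.7
```

In full LM implementations mu is adapted each epoch: decreased when a step reduces the error, increased (toward a steepest-descent-like step) when it does not.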


Question 3.
Answer the following questions
a) How many hidden layers are optimal for a BPN?
b) How many training pairs are minimum required to train a BPN?
c) For how long a BPN should be trained?
d) How can we represent data to a BPN?

Answer 3.
Problems that require two hidden layers are rarely encountered. However,
neural networks with two hidden layers can represent functions with any kind
of shape. There is currently no theoretical reason to use neural networks with
any more than two hidden layers; in fact, for many practical problems, there is
no reason to use more than one hidden layer.

Number of hidden layers and what each can represent:
o None - only capable of representing linearly separable functions or
decisions.
o One - can approximate any function that contains a continuous mapping
from one finite space to another.
o Two - can represent an arbitrary decision boundary to arbitrary accuracy
with rational activation functions, and can approximate any smooth
mapping to any accuracy.

When you back propagate, all it means is that you have changed your weights
in such a manner that your neural network will get better at recognizing that
particular input (that is, the error keeps decreasing).
So if you present pair-1 and then pair-2, it is possible that pair-2 may negate
the changes to a certain degree. However, in the long run the neural network's
weights will tend towards recognizing all inputs properly. The thing is, you
cannot look at the result of a particular training attempt for a particular set of
inputs/outputs and be concerned that the changes will be negated. As

mentioned before, when you're training a neural network you are traversing
an error surface to find a location where the error is the lowest. Think of it as
walking along a landscape that has a bunch of hills and valleys. Imagine that
you don't have a map and that you have a special compass that tells you in
what direction you need to move, and by what distance. The compass is
basically trying to direct you to the lowest point in this landscape. Now this
compass doesn't know the landscape well either, and so in trying to send you
to the lowest point it may go in a slightly wrong direction (i.e., send you some
way up a hill), but it will try to correct itself after that. In the long run, you
will eventually end up at the lowest point in the landscape (unless you land in
a local minimum, i.e., a low point that is not the lowest point).

We have a time series, i.e., a variable x(t) changing in time (t = 1, 2, ...), and
we would like to predict the value of x at time t + h. The prediction of a time
series using a neural network consists of teaching the net the history of the
variable over a selected, limited time and applying the taught information to
the future.
Data from the past are provided to the inputs of the neural network, and we
expect data from the future at the outputs of the network. As we can see,
learning with a teacher (supervised learning) is involved. For more exact
prediction, additional information can be added for teaching and prediction,
for example in the form of interventional variables (intervention indicators).
However, more information does not always mean better prediction;
sometimes it can make the process of teaching and predicting worse. It is
always necessary to select the genuinely relevant information, if it is available.

Fig.: Teaching of time series without interventional variables. The points in the
graph represent a time series obtained by sampling continuous data.

Fig.: Teaching of time series with an intervention indicator.

For reasons of speed and complexity, there is often pressure to minimize the
number of variables (input and output) that a neural network has to deal with.
The pressure also has a bearing on the resolution at which the data is
represented; the finer the data representation, the greater the complexity and
consequently the amount of training time and data required. This brings us to the
second principle of training data representation: explicitness. Given the limit on
the number of input variables a network may see, there is pressure to ensure that
the variables which are used contain the information required to carry out the
task to be learned in a form which is as explicit as possible.
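The explicitness principle can be sketched with two standard representation choices: scaling continuous inputs into a fixed range, and one-hot encoding categorical ones so each category gets its own input unit (the variables below are hypothetical):

```python
import numpy as np

def min_max_scale(col):
    """Map a continuous column into [0, 1] so no input dominates."""
    lo, hi = col.min(), col.max()
    return (col - lo) / (hi - lo)

def one_hot(labels):
    """Encode a categorical column explicitly: one unit per category."""
    cats = sorted(set(labels))
    return np.array([[1.0 if v == c else 0.0 for c in cats] for v in labels])

# Hypothetical records: (age, category)
age = np.array([20.0, 35.0, 50.0])
category = ["low", "high", "low"]
X = np.hstack([min_max_scale(age)[:, None], one_hot(category)])
print(X)  # columns: scaled age, is_high, is_low
```

Each added category column raises the input dimension, which is exactly the speed/complexity pressure mentioned above; the gain is that the network no longer has to discover the category boundaries implicitly.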