
Soft Comput (2006) 10: 264–271
DOI 10.1007/s00500-005-0481-0

ORIGINAL PAPER

B. Samanta · K. R. Al-Balushi · S. A. Al-Araimi

Artificial neural networks and genetic algorithm for bearing fault detection

Published online: 27 April 2005
© Springer-Verlag 2005

Abstract A study is presented to compare the performance of three types of artificial neural network (ANN), namely, the multi layer perceptron (MLP), the radial basis function (RBF) network and the probabilistic neural network (PNN), for bearing fault detection. Features are extracted from time domain vibration signals, without and with preprocessing, of a rotating machine with normal and defective bearings. The extracted features are used as inputs to all three ANN classifiers: MLP, RBF and PNN for two-class (normal or fault) recognition. Genetic algorithms (GAs) have been used to select the characteristic parameters of the classifiers and the input features. For each trial, the ANNs are trained with a subset of the experimental data for known machine conditions. The ANNs are tested using the remaining set of data. The procedure is illustrated using the experimental vibration data of a rotating machine. The roles of different vibration signals and preprocessing techniques are investigated. The results show the effectiveness of the features and the classifiers in detection of machine condition.
Keywords Condition monitoring · Feature selection · Genetic algorithm · Bearing faults · Neural network · Probabilistic neural network · Radial basis function · Rotating machines · Signal processing
B. Samanta (✉) · K. R. Al-Balushi · S. A. Al-Araimi
Department of Mechanical and Industrial Engineering,
College of Engineering, Sultan Qaboos University,
PO Box 33, PC 123, Muscat, Sultanate of Oman
E-mail: samantab@squ.edu.om
Fax: +968 24413315

1 Introduction

Machine condition monitoring is gaining importance in industry because of the need to increase reliability and to decrease the possibility of production loss due to machine breakdown. The use of vibration and acoustic emission (AE) signals is quite common in the field of condition monitoring of rotating machinery. By comparing the signals of a machine running in normal and faulty conditions, detection of faults like mass unbalance, rotor rub, shaft misalignment, gear failures and bearing defects is possible. These signals can also be used to detect incipient failures of the machine components, through an on-line monitoring system, reducing the possibility of catastrophic damage and down time. Some of the recent works in the area are listed in [1–8]. Although visual inspection of the frequency domain features of the measured signals is often adequate to identify the faults, there is a need for a reliable, fast and automated procedure of diagnostics.
Artificial neural networks (ANNs) have potential applications in automated detection and diagnosis of machine conditions [3,4,7–10]. Multi layer perceptrons (MLPs) and radial basis function (RBF) networks are the most commonly used ANNs [11–15], though interest in probabilistic neural networks (PNNs) has also been increasing recently [16,17]. The main difference among these methods lies in the ways of partitioning the data into different classes. The applications of ANNs are mainly in the areas of machine learning, computer vision and pattern recognition because of their high accuracy and good generalization capability [11–18]. Though MLPs have been used in the area of machine condition monitoring for quite some time, the applications of RBFs and PNNs are relatively recent [3,19–21]. In [21], a procedure was presented for condition monitoring of rolling element bearings comparing the performance of these ANNs, with all calculated signal features and fixed parameters for the classifiers. In this, vibration signals were acquired under different operating speeds and bearing conditions. The statistical features of the signals, both original and with some preprocessing like differentiation and integration, low- and high-pass filtering, and spectral data of the signals were used for classification of bearing conditions.
However, there is a need to make the classification process faster and more accurate using the minimum number of features which primarily characterize the system conditions with optimized structure or parameters of ANNs [3,22]. Genetic algorithms (GAs) were used for automatic feature selection in machine condition monitoring [3,21–23]. In [22], a GA based approach was introduced for selection of input features and


number of neurons in the hidden layer. The features were


extracted from the entire signal under each condition and
operating speed [19,21]. In [23], some preliminary results of
MLPs and GAs were presented for fault detection of gears
using only the time domain features of vibration signals. In
this approach, the features were extracted from finite segments of two signals: one with normal condition and the
other with defective gears.
In the present work, the procedure of [23] is extended to the diagnosis of bearing condition using vibration signals through three types of ANN classifiers. Comparisons are made between the performance of the ANNs, both without and with automatic selection of features and classifier parameters using a GA based approach. Figure 1 shows the flow diagram of the proposed procedure. The features, namely, mean, root mean square (RMS), variance, skewness, kurtosis and normalized higher order (up to ninth) central moments are used to distinguish between normal and defective bearings. Moments of order higher than nine are not considered in the present work to keep the input vector within a reasonable size without sacrificing the accuracy of diagnosis. The roles of different vibration signals are investigated. The results show the effectiveness of the features extracted from the acquired and preprocessed signals in diagnosis of the machine condition. The procedure is illustrated using the vibration data of an experimental setup with normal and defective bearings.
2 Vibration data
Figure 2 shows the schematic diagram of the experimental
test rig. The rotor is supported on two bearings. The rotor
was driven by an electrical AC motor through a flexible coupling. Two accelerometers were mounted on the right hand
side (RHS) bearing support, at an angle of 90°, to measure vibrations in the vertical and horizontal directions (x and y).
Separate measurements were obtained for two conditions,
one with normal bearings and the other with a fault on the
outer race of the RHS bearing. The accelerometers were connected through charge amplifiers to two channels of a PC
based data acquisition system. The one pulse per revolution
of the shaft was sensed by a proximity sensor and the signal
was used as a trigger to start the sampling process. Measurements were obtained simultaneously at a sampling rate of
49152 samples/sec per channel. The accelerometer signals
were processed through charge amplifiers with lower and
higher cut-off frequencies of 2 Hz and 100 kHz respectively.
The number of samples collected for each channel was 49152.
In the present work, these time domain data were preprocessed to extract the features for use as inputs to the ANNs.

Fig. 1 Flow chart of diagnostic procedure

3 Feature extraction

3.1 Signal statistical characteristics

One set of experimental data each with normal and defective bearings was presented. For each set, two vibration signals consisting of 49152 samples (qi) were obtained using accelerometers in the vertical and horizontal directions to monitor the machine condition. The magnitude of the vibration was constructed from the two component signals, z = √(x² + y²). In the present work, these samples were divided into 48 segments (bins) of 1024 (n) samples each. Each of these bins was further processed to extract the following features (1–9): mean (μ), RMS, variance (σ²), skewness (normalized third central moment, γ3), kurtosis (normalized fourth central moment, γ4), and normalized fifth to ninth central moments (γ5–γ9), as follows:

γn = E{[qi − μ]^n} / σ^n,   n = 3, …, 9,      (1)

where E{·} represents the expected value of the function. Figure 3 shows plots of some of these features extracted from the vibration signals (qi): x, y and z, with each row representing the features for one signal. Only a few of the features are shown as representatives of the full feature set.
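For concreteness, a minimal sketch of this per-bin feature computation is given below. It is a Python/NumPy illustration written for this rewrite, not the authors' Matlab code; the bin size of 1024, the 48 bins per signal and the ordering of the nine features follow the description above.

```python
import numpy as np

def bin_features(q):
    """Features 1-9 for one bin: mean, RMS, variance (sigma^2), and the
    normalized 3rd-9th central moments of equation (1)."""
    mu = q.mean()
    rms = np.sqrt(np.mean(q ** 2))
    var = q.var()
    sigma = np.sqrt(var)
    gammas = [np.mean((q - mu) ** n) / sigma ** n for n in range(3, 10)]
    return np.array([mu, rms, var] + gammas)

def signal_features(x, y, n=1024):
    """Split x, y and the magnitude z into bins of n samples and
    extract the nine features per bin (a 9 x 48 block per signal here)."""
    z = np.sqrt(x ** 2 + y ** 2)
    feats = {}
    for name, s in (("x", x), ("y", y), ("z", z)):
        bins = s[: len(s) // n * n].reshape(-1, n)   # 48 bins of 1024 samples
        feats[name] = np.array([bin_features(b) for b in bins]).T
    return feats
```

For the 49152-sample records used here this yields 48 bins per signal, i.e. one 9 × 48 feature block per signal and machine condition.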


Fig. 2 Experimental test rig

Fig. 3 Time-domain features of acquired signals: —— normal; - - - - defective

3.2 Time derivative and integral of signals


The high and low frequency content of the raw signals can
be obtained from the corresponding time derivatives and the
integrals. In this work, the first time derivative (dq) and the
integral (iq) have been defined, using sampling time as a factor, as follows:
dq(k) = q(k) − q(k−1),      (2)
iq(k) = q(k) + q(k−1).      (3)

The derivative and the integral of each signal were processed to extract an additional set of 18 features (10–27).
3.3 High- and low-pass filtering
The raw signals were also processed through low- and high-pass filters with a cut-off frequency of one-tenth (f/10) of the sampling rate (f = 49152 Hz). These filtered signals were processed to obtain another set of 18 features (28–45).
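A short sketch of the preprocessing of Sections 3.2 and 3.3 follows, again as an illustration rather than the original code. Equations (2) and (3) are implemented as plain first differences and pairwise sums (the constant sampling-time factor is dropped, since the features are normalized later); the paper specifies only the cut-off frequency f/10, so the Butterworth design and the fourth order used here are assumptions.

```python
import numpy as np
from scipy.signal import butter, lfilter

FS = 49152.0       # sampling rate f (Hz)
FC = FS / 10.0     # cut-off frequency f/10

def derivative(q):
    """dq(k) = q(k) - q(k-1), equation (2); first sample kept as-is."""
    dq = np.empty_like(q)
    dq[0] = q[0]
    dq[1:] = q[1:] - q[:-1]
    return dq

def integral(q):
    """iq(k) = q(k) + q(k-1), equation (3); first sample kept as-is."""
    iq = np.empty_like(q)
    iq[0] = q[0]
    iq[1:] = q[1:] + q[:-1]
    return iq

def low_high_pass(q, order=4):
    """Low- and high-pass filtered copies of q (filter type/order assumed)."""
    bl, al = butter(order, FC / (FS / 2.0), btype="low")
    bh, ah = butter(order, FC / (FS / 2.0), btype="high")
    return lfilter(bl, al, q), lfilter(bh, ah, q)
```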

Artificial neural networks and genetic algorithm for bearing fault detection

3.4 Normalization
The feature set was normalized by dividing each row by its absolute maximum value, keeping the features within ±1, for better speed and success of the network training. The total set of normalized features is a 45 × 144 × 2 array, where each row represents a feature and the columns represent the total number of bins (48) per signal times the number of signals (3), for each of the two machine conditions (2).
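In code, this normalization is a one-line, row-wise scaling; a sketch, assuming the features are stacked into a 45 × 288 array (45 features, 144 columns per condition):

```python
import numpy as np

def normalize_rows(F):
    """Divide each feature row by its absolute maximum so all values lie in [-1, 1]."""
    return F / np.abs(F).max(axis=1, keepdims=True)
```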

4 Artificial neural networks


Artificial neural networks have been developed in the form of parallel distributed network models based on the biological learning process of the human brain. There are numerous applications of ANNs in data analysis, pattern recognition and control [13,17,24]. In this section, the three types of ANNs, namely, MLP, RBF networks and PNN, are briefly discussed with reference to their structures, parameters and main differences. Readers are referred to [13,17,24] for further details.

4.1 Multi layer perceptron


Multi layer perceptrons consist of an input layer of source nodes, one or more hidden layers of computation nodes or neurons, and an output layer. The numbers of nodes in the input and the output layers depend on the numbers of input and output variables respectively. The number of hidden layers and the number of nodes in each hidden layer affect the generalization capability of the network. With too few hidden layers and neurons, the performance may not be adequate, whereas too many hidden nodes carry the risk of over-fitting the training data and poor generalization on new data. There are various methods, both heuristic and systematic, to select the number of hidden layers and nodes [24].
In the present work, only one hidden layer was used. The
input layer has nodes representing the normalized features
extracted from the measured vibration signals. The number
of input nodes was varied from 3 to 45 and that of the output
nodes was 2. The number of neurons in the hidden layer was selected using the GA based procedure for optimum classification performance. The target values of the two output nodes can have only binary levels representing normal (N) and failed (F) bearings. In the MLPs, sigmoid activation functions were used in the hidden and the output layers. The
ANN was created, trained and implemented using Matlab
neural network toolbox with backpropagation (BPN) and the
training algorithm of Levenberg-Marquardt. The ANN was
trained iteratively to minimize the performance function of
mean square error (MSE) between the network outputs and
the corresponding target values. At each iteration, the gradient of the performance function (MSE) was used to adjust the
network weights and biases. In this work, a mean square error of 10⁻⁶, a minimum gradient of 10⁻¹⁰ and a maximum iteration number (epoch) of 500 were used. The training process would stop if any of these conditions was met. The initial weights and biases of the network were generated automatically by the program.
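The authors built this classifier in the Matlab Neural Network Toolbox with Levenberg-Marquardt backpropagation. As a rough analogue only (scikit-learn has no Levenberg-Marquardt solver, so "adam" stands in), the configuration described above might look as follows; the layer size, activations, error goal and epoch limit are taken from the text.

```python
from sklearn.neural_network import MLPClassifier

# One hidden layer, as in the paper; its size was GA-selected (24 for "straight" MLPs).
mlp = MLPClassifier(
    hidden_layer_sizes=(24,),
    activation="logistic",  # sigmoid activations, as described above
    solver="adam",          # stand-in: Levenberg-Marquardt is not available here
    max_iter=500,           # maximum number of epochs used in the paper
    tol=1e-6,               # stop near the stated MSE goal of 1e-6
)

# X_train: (n_samples, n_features) normalized features; y: 0 = normal, 1 = fault
# mlp.fit(X_train, y_train)
# test_success = 100.0 * mlp.score(X_test, y_test)
```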
4.2 Radial basis function networks
The RBF network is similar in structure to an MLP with only one hidden layer, whereas MLPs may have more than one. The activation function of the hidden layer is the Gaussian spheroid function:

y(x) = exp(−‖x − c‖² / (2σ²)).      (4)

The output (y) of a hidden neuron gives a measure of the distance of the input vector (x) from the centroid (c) of the data cluster, and it is used at the output layer to classify the input vector. The parameter σ represents the radius of the hypersphere enclosing the data clusters. The proper choice of the number of neurons, the location of the centres and the spread (σ) is very important for classification success. In general, the number of neurons in the hidden layer is increased iteratively, with corresponding assignment of centres for a given spread, till the performance goal is achieved. The parameter (σ) is generally determined using an iterative process, selecting an optimum width on the basis of the full datasets. However, in the present work the width (σ) is selected along with the relevant input features using the GA based approach. The main advantage of an RBF network is faster training compared to an MLP of similar structure. In the present work, the RBFs were created, trained and tested using Matlab, through a simple iterative algorithm of adding more neurons in the hidden layer, up to a maximum of 144 (the number of training vectors), till the performance goal (MSE) of 0.01 is reached.
4.3 Probabilistic neural networks
The structure of a PNN is similar to that of an RBF, both having a Gaussian spheroid activation function in the first of the two layers. The linear output layer of the RBF is replaced with a competitive layer in the PNN, which allows only one neuron to fire, with all others in the layer returning zero. The major drawback of using PNNs was the computational cost of the potentially large hidden layer, which could be equal to the size of the input vector. The PNN can be a Bayesian classifier, approximating the probability density function (PDF) of a class using Parzen windows [17]. The generalized expression for calculating the value of the Parzen-approximated PDF at a given point x in feature space is given as follows:

fA(x) = [1 / ((2π)^(p/2) σ^p NA)] Σ_{i=1}^{NA} exp(−‖x − ci‖² / (2σ²)),      (5)

where p is the dimensionality of the feature vector and NA is the number of examples of class A used for training the network. The parameter σ represents the spread of the Gaussian function and has significant effects on the generalization of a PNN. The probability that a given sample belongs to a given class A can be calculated in the PNN as follows:

P(A|x) = fA(x) hA,      (6)

where hA represents the relative frequency of class A within the whole training data set. The expressions (5) and (6) are evaluated for each class. The class returning the highest probability is taken as the correct classification. The main advantages of PNNs are faster training and probabilistic outputs. The width parameter (σ) in equation (5) is generally determined using an iterative process, selecting an optimum width on the basis of the full datasets. However, in the present work the width is selected along with the relevant input features using the GA based approach, as in the case of RBFs. In the present work, the PNNs were created, trained and tested using Matlab.
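Equations (5) and (6) translate directly into a small classifier. The sketch below is a plain NumPy illustration, not the authors' Matlab implementation; it evaluates the Parzen-window PDF of each class and returns the class with the highest probability, with the training examples themselves used as the centres ci.

```python
import numpy as np

def pnn_classify(x, classes, sigma):
    """classes: dict mapping label -> (N_A x p) array of training examples.
    Implements fA(x) of eq. (5) and P(A|x) = fA(x) hA of eq. (6)."""
    total = sum(len(C) for C in classes.values())
    best_label, best_prob = None, -1.0
    for label, C in classes.items():
        N_A, p = C.shape
        norm = (2.0 * np.pi) ** (p / 2.0) * sigma ** p * N_A
        f_A = np.exp(-((C - x) ** 2).sum(axis=1) / (2.0 * sigma ** 2)).sum() / norm
        h_A = N_A / total  # relative frequency of class A in the training set
        if f_A * h_A > best_prob:
            best_label, best_prob = label, f_A * h_A
    return best_label
```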

5 Genetic algorithms

Genetic algorithms (GAs) have been considered with increasing interest in a wide variety of applications [25–27]. These algorithms are used to search the solution space through simulated evolution of survival of the fittest. They are used to solve linear and nonlinear problems by exploring all regions of the state space and exploiting potential areas through mutation, crossover and selection operations applied to individuals in the population [26]. The use of a genetic algorithm needs consideration of six basic issues: chromosome (genome) representation, selection function, genetic operators like mutation and crossover for the reproduction function, creation of the initial population, termination criteria, and the evaluation function. Though the traditional genome representation has been in binary form, interest in real-coded or floating-point genomes for multi-dimensional parameter optimization problems is on the rise because of the closeness of this type of representation to the problem space, better average performance and more efficient numerical implementation [26]. The type of genome representation depends on the particular problem under consideration. In this paper, real-coded genomes and the corresponding genetic operators were used for the selection of features and the classifier parameters. In the GAs, a population size of ten individuals was used, starting with randomly generated genomes. This size of population was chosen to ensure relatively high interchange among different genomes within the population and to reduce the likelihood of premature convergence within the population.

5.1 Genome representation

In the present work, the GA is used to select the most suitable features and one variable parameter related to the particular classifier: the number of neurons in the hidden layer for MLPs and the width (σ) for RBFs and PNNs. Different mutation, crossover and selection routines have been proposed for optimization [25]. In this work, a GA based optimization routine [28] was used.

5.1.1 MLP training

For MLPs, the genome (X) contains the row numbers of the selected features from the total set and the number of hidden neurons. For a training run needing N different inputs to be selected from a set of Q possible inputs, the genome string would consist of N + 1 real numbers. The first N numbers (xi, i = 1, …, N) in the genome are constrained to be in the range 1 ≤ xi ≤ Q, whereas the last number xN+1 has to be within the range Smin ≤ xN+1 ≤ Smax. The parameters Smin and Smax represent respectively the lower and the upper bounds on the number of neurons in the hidden layer of the MLP:

X = {x1, x2, …, xN, xN+1}^T.      (7)

5.1.2 RBF and PNN training

For RBFs and PNNs, the first N entries of the (N+1)-element genome represent the row numbers of the selected features, as in the case of MLPs. However, the last element, xN+1, represents the spread (σ) of the Gaussian function of equations (4) and (5) for RBFs and PNNs respectively. For the present work, this was taken between 0.1 and 1.0 with a step size of 0.1.

5.2 Selection function


In a GA, the selection of individuals to produce successive generations plays a vital role. A probabilistic selection is used based on the individual's fitness, such that the better individuals have a higher chance of being selected. There are various schemes for the selection process [25,26]. In this work, the normalized geometric ranking method was used because of its better performance [26,28]. In this method, the probability (Pi) of the ith individual being selected is given as:

Pi = q (1 − q)^(r−1) / [1 − (1 − q)^P],      (8)

where q represents the probability of selecting the best individual, r is the rank of the individual, and P denotes the population size. The parameter q is to be provided by the user. The best individual is represented by a rank of 1 and the worst has a rank of P. In the present work, a value of 0.08 was used for q.
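Equation (8) is straightforward to implement; a sketch of one selection step under the stated parameters (q = 0.08, population ranked from best, r = 1, to worst, r = P) follows.

```python
import numpy as np

def geometric_ranking_probs(P, q=0.08):
    """Normalized geometric ranking, equation (8), for ranks r = 1..P."""
    r = np.arange(1, P + 1)
    return q * (1 - q) ** (r - 1) / (1 - (1 - q) ** P)

# Example: pick two parents from a population of 10 ranked individuals
probs = geometric_ranking_probs(10)          # sums to 1 by construction
parents = np.random.choice(10, size=2, p=probs)
```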


5.3 Genetic operators

Genetic operators are the basic search mechanisms of the GA for creating new solutions based on the existing population. The operators are of two basic types: mutation and crossover. Mutation alters one individual to produce a single new solution, whereas crossover produces two new individuals (off-springs) from two existing individuals (parents). Let X and Y denote two individuals (parents) from the population and X′ and Y′ denote the new individuals (off-springs).
5.3.1 Mutation

There are different types of mutation functions used in GAs. In this work, the non-uniform mutation function was used because of its fine-tuning capabilities, better performance and faster convergence [26]. It randomly selects one element xi of the parent X and modifies it as X′ = {x1, x2, …, xi′, …, xN, xN+1}^T after setting the element (xi′) equal to a non-uniform random number in the following manner:

xi′ = xi + (bi − xi) f(G)   if r1 < 0.5,
xi′ = xi − (xi − ai) f(G)   if r1 ≥ 0.5,      (9)
xi′ = xi                    otherwise,

f(G) = [r2 (1 − G/Gmax)]^s,      (10)

where r1 and r2 denote uniformly distributed random numbers in the range [0,1], G is the current generation number and Gmax denotes the maximum number of generations. The function f(G) returns a value in the range [0,1] such that the probability of f(G) being close to 0 increases with the generation number. This property enables the operator to search uniformly initially and very locally at later stages (higher G). The superscript s is a parameter determining the degree of non-uniformity; ai and bi represent respectively the lower and the upper bound for the variable xi.

5.3.2 Crossover

Among different crossover operators, heuristic crossover [26] was used in this work because of its main characteristic of utilizing the fitness function to determine the search direction for better performance. This operator produces a linear extrapolation of two individuals using the fitness information. A new individual, X′, is created as per equation (11), with r being a random number following the uniform distribution U(0,1) and X being better than Y in terms of fitness. If X′ is infeasible, i.e. the feasibility flag of equation (13) is 0, then a new random number r is generated and a new solution is created using equation (11):

X′ = X + r(X − Y),      (11)
Y′ = X,      (12)
feasibility = 1 if ai ≤ xi′ ≤ bi for all i, and 0 otherwise.      (13)

5.4 Initialization, termination and evaluation functions

To start the solution process, the GA has to be provided with an initial population. The most commonly used method is the random generation of initial solutions for the population. The solution process continues from one generation to another, selecting and reproducing parents, until a termination criterion is satisfied. The most commonly used terminating criterion is the maximum number of generations.

The creation of an evaluation function to rank the performance of a particular genome is very important for the success of the training process. The GA will rate its own performance around that of the evaluation (fitness) function. The fitness function used in the present work returns the number of correct classifications of the test data, so better classification results give rise to a higher fitness index.

The optimality of the variables selected by the GA was based on the value of the evaluation function. In the present work, the selection of GA parameters was done on the basis of initial trials for satisfactory classification success. However, the proper selection of GA parameters is quite complex and has a rather empirical background, needing further research [26].
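For concreteness, a NumPy sketch of the two operators of equations (9)–(13) is given below. It is an illustrative rendering, not the routine of [28]; the non-uniformity degree s and the bounded retry on infeasible offspring are assumptions, since the paper does not report these details.

```python
import numpy as np

def nonuniform_mutation(X, G, G_max, a, b, s=2.0):
    """Equations (9)-(10): perturb one randomly chosen element of X;
    all other elements are left unchanged (the 'otherwise' branch)."""
    Xp = X.copy()
    i = np.random.randint(len(X))
    r1, r2 = np.random.rand(2)
    f = (r2 * (1.0 - G / G_max)) ** s   # shrinks towards 0 as G -> G_max
    if r1 < 0.5:
        Xp[i] = X[i] + (b[i] - X[i]) * f
    else:
        Xp[i] = X[i] - (X[i] - a[i]) * f
    return Xp

def heuristic_crossover(X, Y, a, b, max_tries=10):
    """Equations (11)-(13): extrapolate beyond the fitter parent X;
    regenerate r while the offspring is infeasible."""
    for _ in range(max_tries):
        Xp = X + np.random.rand() * (X - Y)        # eq. (11)
        if np.all((Xp >= a) & (Xp <= b)):          # feasibility, eq. (13)
            return Xp, X.copy()                    # eq. (12): Y' = X
    return X.copy(), X.copy()                      # bounded retry (assumption)
```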
6 Simulation results

The dataset (45 × 144 × 2), consisting of forty-five (45) normalized features for each of the three signals (3), split into 48 segments of 1024 samples each, for the two (2) bearing conditions, was divided into two subsets. The first 24 bins of each signal were used for training the ANNs, giving a training set of 45 × 72 × 2, and the rest (45 × 72 × 2) was used for testing. For each of the MLPs and RBFs, the target value of the first output node was set to 1 and 0 for normal and failed bearings respectively, and the values were interchanged (0 and 1) for the second output node. For PNNs, the target values were specified as 1 and 2, respectively representing normal and faulty conditions. Results are presented to show the effects of sensor location and signal preprocessing on the diagnosis of machine condition using ANNs, without and with GA based feature selection. The training success in each case was 100%. The training of RBFs was not successful with the partial feature sets corresponding to the individual signals (x–z). However, the results of RBFs for the whole feature set (1–45), consisting of all the signals, are presented.

6.1 Performance comparison of ANNs without feature selection

In this section, classification results are presented for straight ANNs without feature selection. For each case of straight MLP, the number of neurons in the hidden layer was kept at 24, and for straight PNNs, the width (σ) was kept constant at 0.10. These values were found on the basis of several trials of training the ANNs.

6.1.1 Effect of sensor location

Table 1 shows the classification results for each of the signals x, y and the resultant z using all input features (1–45). For both classifiers, test success was unsatisfactory in most cases. The test success was in the range of 81.25–93.75% for MLPs, and 75.00–95.83% for PNNs.


Table 1 Performance comparison of classifiers without feature selection for different sensor locations

                           Test success (%)
Dataset     Input features   MLP (N = 24)   PNN (σ = 0.10)
Signal x    1–45             93.75          87.50
Signal y    1–45             81.25          75.00
Signal z    1–45             85.42          95.83

6.1.2 Effect of signal pre-processing

Table 2 shows the effects of signal preprocessing on the classification results for straight ANNs with all three signals. In each case the corresponding features from the signals, without and with signal preprocessing, were used. The test success was in the range of 84.03–98.61% for MLPs and 77.08–95.83% for PNNs.

Table 2 Performance comparison of classifiers without feature selection for different signal preprocessing

                                            Test success (%)
Dataset                    Input features     MLP (N = 24)   PNN (σ = 0.50)
Signals x–z                1–9                84.03          77.08
Derivative/integral        10–27              98.61          85.42
High-/low-pass filtering   28–45              96.88          95.83

6.2 Performance comparison of ANNs with feature selection

In this section, classification results are presented for ANNs with feature selection based on GA. In each case, only three features were selected from the corresponding ranges. In the case of MLPs, the number of neurons in the hidden layer was selected in the range of 10–30, whereas for PNNs, the Gaussian spread was selected in the range of 0.1–1.0 with a step size of 0.1.

6.2.1 Effect of sensor location

Table 3 shows the classification results, along with the selected parameters, for each of the signals x, y and the resultant z. In all cases, the input features were selected by the GA from the entire range (1–45). The test success improved substantially in each case with feature selection, compared with the results of Table 1. The test success was 93.75–100% for MLPs and 83.33–100% for PNNs. Features selected for the different schemes are also shown for comparison. Though some of the features were selected by both schemes, there was no apparent fixed combination of features.

6.2.2 Effect of signal pre-processing

Table 4 shows the effects of signal preprocessing on the classification results for the signals (x, y and z) with the GA. In all cases, only three features from the signals, without and with signal preprocessing, were selected from each of these ranges. The test success was 82.64–99.31% for MLPs, whereas it was in the range of 78.47–97.92% for PNNs.

Table 3 Performance comparison of classifiers with feature selection for different sensor locations

            GA with MLP                                      GA with PNN (3 features)
Dataset     Input features   No. of hidden   Test success    Input features   Width (σ)   Test success
                             neurons         (%)                                          (%)
Signal x    13, 19, 42       10              97.92           11, 40, 41       0.40        97.92
Signal y    27, 33, 41       26              93.75           1, 11, 33        0.10        83.33
Signal z    21, 40, 41       18              100             18, 23, 42       0.50        100
Table 4 Performance comparison of classifiers with feature selection for different signal preprocessing

                           GA with MLP                                      GA with PNN (3 features)
Dataset                    Input features   No. of hidden   Test success    Input features   Width (σ)   Test success
                                            neurons         (%)                                          (%)
Signals x–z                4, 5, 6          27              82.64           1, 3, 4          0.30        78.47
Derivative/integral        14, 15, 22       25              98.61           11, 14, 25       0.10        95.14
High-/low-pass filtering   39, 41, 42       10              99.31           33, 37, 39       0.30        97.92
Table 5 Performance comparison of classifiers with different number of features

Classifier      Number of features   Features                         Classifier parameter (N/σ)   Test success (%)
Straight MLP    45                   1–45                             24                           85.06
GA with MLP     3                    4, 14, 18                        21                           99.31
GA with MLP     6                    5, 13, 23, 30, 32, 39            25                           100
Straight RBF    45                   1–45                             1.00                         83.33
GA with RBF     3                    12, 23, 38                       0.10                         87.50
GA with RBF     6                    2, 4, 14, 18, 21, 30             0.10                         95.14
GA with RBF     8                    2, 3, 11, 17, 29, 31, 37, 38     0.90                         99.31
Straight PNN    45                   1–45                             0.10                         95.83
GA with PNN     3                    1, 14, 21                        0.10                         96.53
GA with PNN     6                    1, 10, 13, 23, 37, 38            0.30                         100

6.3 Performance comparison with number of features

Table 5 shows the results for the ANNs with different numbers of features, without and with feature selection. With all the 45 features taken together, the performance of the RBF was the worst (83.33%), the PNN was the best (95.83%) and the MLP (85.06%) was closer to the RBF. However, with features selected using the GA, the classification performance improved for all three classifiers. For three features selected by the GA, the performance of the RBF was the worst (87.50%), the MLP was the best (99.31%) and the PNN (96.53%) was closer to the MLP. With six features selected, both the MLPs and the PNNs gave 100% test success, whereas it was 95.14% in the case of the RBF. With eight features selected, the RBF gave 99.31% test success. The computation time (on a PC with a Pentium III processor of 533 MHz and 64 MB RAM) for training the PNNs is noted for comparison. PNNs converged quite fast. The time needed with three features (50.212 s) was not much different from that with six features (51.975 s), but both were much higher than for the straight PNNs without feature selection (0.260 s). These values were substantially lower than the corresponding training times of the RBFs and MLPs. However, a direct comparison of training times was not made among the ANNs due to differences in code efficiency. It should also be mentioned that the difference in computation time would be very important if the training were done online.

7 Conclusions
A procedure is presented for diagnosis of bearing condition using three classifiers, namely, MLPs, RBFs and PNNs
with GA based feature selection from time domain vibration
signals. The selection of input features and the appropriate
classifier parameters have been optimized using a GA based
approach. The roles of different vibration signals have been
investigated. The use of GAs with only three features gave
almost 100% classification with MLPs and PNNs for most of
the test cases. The use of six selected features with MLPs and
PNNs gave 100% test success whereas with RBF, test success
was 99.31% for eight features. The training time with feature
selection is quite reasonable for PNNs compared to the other
two schemes. The results show the potential application of
GAs for selection of input features and classifier parameters
in ANN based condition monitoring systems.
Acknowledgements The financial support from Sultan Qaboos University grant IG/ENG/MIED/01/01 to carry out the research is gratefully
acknowledged.

References
1. Shiroishi J, Li Y, Liang S, Kurfess T, Danyluk S (1997) Bearing condition diagnostics via vibration and acoustic emission measurements. Mech Syst Signal Process 11:693–705
2. McFadden PD (2000) Detection of gear faults by decomposition of matched differences of vibration signals. Mech Syst Signal Process 14:805–817
3. Nandi AK (2000) Advanced digital vibration signal processing for condition monitoring. In: Proceedings of COMADEM 2000, Houston, pp 129–143
4. Mechanical Systems and Signal Processing (2001) 15(5), Special issue on gear and bearing diagnostics (Randall RB, guest editor)
5. Al-Balushi KR, Samanta B (2002) Gear fault diagnosis using energy-based features of acoustic emission signals. Proc IMechE, Part I: J Syst Control Eng 216(I3):249–263
6. Antoni J, Randall RB (2002) Differential diagnosis of gear and bearing faults. Trans ASME J Vib Acoust 124:165–171
7. McCormick AC, Nandi AK (1997) Classification of the rotating machine condition using artificial neural networks. Proc IMechE, Part C: J Mech Eng Sci 211(C6):439–450
8. Dellomo MR (1999) Helicopter gearbox fault detection: a neural network based approach. Trans ASME J Vib Acoust 121:265–272
9. Samanta B, Al-Balushi KR (2001) Use of time domain features for the neural network based fault diagnosis of a machine tool coolant system. Proc IMechE, Part I: J Syst Control Eng 215(I3):199–207
10. Samanta B, Al-Balushi KR (2003) Artificial neural network based fault diagnostics of rolling element bearings using time-domain features. Mech Syst Signal Process 17:317–328
11. IEEE Transactions on Neural Networks (1997) 8(1), Special issue on artificial neural networks and statistical pattern recognition (Jain AK, Mao J, guest editors)
12. Baraldi A, Borghese NA (1998) Learning from data: general issues and special applications of radial basis function networks. Technical report TR-98-028, International Computer Science Institute, Berkeley
13. Bishop CM (1995) Neural networks for pattern recognition. Oxford University Press, Oxford
14. Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2:359–366
15. Park J, Sandberg IW (1993) Universal approximation using radial-basis-function networks. Neural Comput 5:305–316
16. Specht DF (1990) Probabilistic neural networks. Neural Netw 3:109–118
17. Wasserman PD (1995) Advanced methods in neural computing. Van Nostrand Reinhold, New York, pp 35–55
18. Yao X (1999) Evolving artificial neural networks. Proc IEEE 87:1423–1447
19. Jack LB, Nandi AK, McCormick AC (1999) Diagnosis of rolling element bearing faults using radial basis functions. Appl Signal Process 6:25–32
20. Jack LB, Nandi AK (2000) Comparison of neural networks and support vector machines in condition monitoring applications. In: Proceedings of COMADEM 2000, Houston, pp 721–730
21. Jack LB (2000) Applications of artificial intelligence in machine condition monitoring. PhD thesis, Department of Electrical Engineering and Electronics, The University of Liverpool
22. Jack LB, Nandi AK (2000) Genetic algorithms for feature extraction in machine condition monitoring with vibration signals. IEE Proc Vis Image Signal Process 147:205–212
23. Samanta B, Al-Balushi KR, Al-Araimi SA (2001) Use of genetic algorithm and artificial neural network for gear condition diagnostics. In: Proceedings of COMADEM 2001, University of Manchester, UK, pp 449–456
24. Haykin S (1999) Neural networks: a comprehensive foundation, 2nd edn. Prentice Hall, New Jersey
25. Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning. Addison-Wesley, New York
26. Michalewicz Z (1999) Genetic algorithms + data structures = evolution programs, 3rd edn. Springer, Berlin Heidelberg New York
27. Tang KS, Man KF, Kwong S, He Q (1996) Genetic algorithms and their applications. IEEE Signal Process Mag 13:22–37
28. Houck CR, Joines JA, Kay MG (1995) A genetic algorithm for function optimization: a Matlab implementation. North Carolina State University, Report no. NCSU-IE-TR-95-09
