
J Intell Manuf
DOI 10.1007/s10845-014-0933-4
Data-driven prognostic method based on Bayesian approaches for direct remaining useful life prediction

A. Mosallam · K. Medjaher · N. Zerhouni

Received: 26 February 2014 / Accepted: 31 May 2014

© Springer Science+Business Media New York 2014
Abstract Reliability of prognostics and health management systems relies upon an accurate understanding of the degradation process of critical components in order to predict the remaining useful life (RUL). Traditionally, the degradation process is represented in the form of physical or expert models. Such models require extensive experimentation and verification that are not always feasible. Another approach, which builds up knowledge about the system degradation over time from component sensor data, is known as data driven. Data-driven models, however, require that sufficient historical data have been collected. In this paper, a two-phase data-driven method for RUL prediction is presented. In the offline phase, the proposed method finds variables that contain information about the degradation behavior using an unsupervised variable selection method. Different health indicators (HIs), which represent the degradation as a function of time, are constructed from the selected variables and saved in the offline database as reference models. In the online phase, the method finds the offline HI most similar to the online HI using a k-nearest neighbors classifier and uses it as a RUL predictor. The method finally estimates the degradation state using a discrete Bayesian filter. The method is verified using battery and turbofan engine degradation simulation data acquired from the NASA data repository. The results show the effectiveness of the method in predicting the RUL for both applications.

Keywords Degradation modeling · Online estimation · Discrete Bayes filter · Uncertainty representation · Data-driven PHM

A. Mosallam · K. Medjaher (B) · N. Zerhouni
FEMTO-ST Institute, AS2M Department, University of Franche-Comté/CNRS/ENSMM/UTBM, 24 rue Alain Savary, 25000 Besançon, France
e-mail: kamal.medjaher@ens2m.fr
A. Mosallam
e-mail: ahmed.mosallam@femto-st.fr
N. Zerhouni
e-mail: noureddine.zerhouni@ens2m.fr

Introduction

The large volume of data gathered continuously from different systems has created challenges in interpreting such data in order to anticipate breakdowns. Most large industries have specialized engineers who are skilled in the use of high technology maintenance equipment and have earned special certification in the field of maintenance. Nevertheless, it is still hard to take immediate decisions and predict the system failure. The need for computer systems that constantly record and analyze data to predict the RUL of critical components is particularly important for facilitating maintenance decisions.

In general, maintenance involves performing routine actions to obtain optimal availability of industrial systems (Montgomery and Banjevic 2012). Maintenance routines can be broadly categorized into two main types, namely, corrective and preventive maintenance (Kothamasu et al. 2006). In corrective maintenance, the interventions are performed only when the critical component is fully worn out and failure has occurred. Preventive maintenance can be further divided into two main approaches, namely, time-based maintenance and condition based maintenance (CBM). In time-based maintenance, the interventions are placed according to periodic intervals regardless of the asset's health condition and thus the service life of the critical components is not fully utilized (Soh et al. 2012). Condition based maintenance uses machine run-time data to assess the critical component's state and schedule required maintenance actions prior to breakdown (Peng et al. 2010). Furthermore, predictive maintenance utilizes the current health status of a given critical component
to predict its future condition and plan maintenance actions. Prognostics and health management (PHM) (Jardine et al. 2006) is a process which links degradation modeling research to predictive maintenance policies.

Prognostics and health management consists of four main modules: fault detection, fault diagnostics, fault prognostics and decision making (Medjaher et al. 2012). Fault detection can be defined as the process of recognizing that a problem has occurred regardless of the root cause (Dong et al. 2012). Fault diagnostics is the process of identifying the faults and their causes (Choi et al. 2009). Fault prognostics can be defined as the prediction of when a failure might take place (Tobon-Mejia et al. 2012). Finally, the decision making step uses all the information gathered about the monitored system status to choose the optimal maintenance actions (Iyer et al. 2006). Among these routines, prognostics has recently attracted significant research interest due to the need for models that predict the RUL accurately for different applications.

RUL prediction of critical components is a non-trivial task for many reasons. Sensor signals, for instance, are usually obscured by noise and thus it is very challenging to process them and to extract an informative representation of the RUL (Kamran et al. 2013). Another problem is the prediction uncertainty due to the variation of the end of life time, which can differ for two components made by the same manufacturer and operating under the same conditions. Therefore, proposed models should include such uncertainties and represent them in a probabilistic form (Saha and Goebel 2008; Pal et al. 2011).

RUL prediction models can be realized using two different methods, namely, physics based and data-driven methods (Heng et al. 2009). Physics based methods build physical models of the desired critical components by means of state-space models (Isermann 2006) and dynamic ordinary or partial differential equations (Vachtsevanos et al. 2006). These models require extensive experimentation and model verification (Luo et al. 2003). However, these models are very reliable, at least until the system is upgraded or changed (Chaari et al. 2009). Data-driven methods can be used when the first principles of the system operation are complex, such that developing an accurate physics of failure model is not feasible (Zhang et al. 2013; He et al. 2012). Such methods employ pattern recognition and machine learning techniques to characterize the degradation behavior of the desired critical component (Schwabacher 2005). One way to do data-driven RUL prediction is by first estimating the current health status of the desired component; when the degradation exceeds the alarm threshold, the algorithms start predicting the RUL (Zhang 2003; Gorjian et al. 2009; Benkedjouh et al. 2013). Different regression models have been proposed in the literature to deal with the data-driven RUL prediction problem, such as the auto regressive model and the multivariate adaptive regression splines (Box and Jenkins 1976; Lewis 1992; Tsay 2000; Wu et al. 2007; Yan et al. 2004; Lee et al. 2006). A drawback of using regression methods is that when the available component degradation history is incomplete, the extrapolation may lead to large errors (Wang et al. 2008). There has been growing interest lately in various types of neural networks and neural-fuzzy systems (Gebraeel et al. 2004; Satish and Sarma 2005; Huang et al. 2007; Lei et al. 2007; Vassilopoulos et al. 2007; Tian 2012; Kamran et al. 2013; Brezak et al. 2012; Gajate et al. 2012; Purushothaman 2010; Yeo et al. 2000). However, these methods generate black box models and it is difficult to select the structure of the network (Ramasso et al. 2013). Similarity-based methods are shown to be very effective in performing RUL prediction. A similarity-based method based on linear regression to construct offline degradation models is proposed in Wang et al. (2008). The method measures the similarity between the test instance and the offline models, and the selected offline instance is used for RUL prediction. The RUL probability density of the test instance is estimated from the multiple local predictions using the kernel density estimation method. The main problem with this method is the manual selection of the informative sensor data. Another similarity-based method, which utilizes k-nearest neighbors (k-NN) and belief function theory to estimate the health and from that deduce the RUL of turbofan engines, is proposed in Ramasso et al. (2013). The authors manually annotate the health status of the offline data sets and then the method predicts the RUL when the degradation level reaches a predefined alarm threshold.

Alternatively, instead of learning the degradation from the data and then predicting the RUL, direct RUL prediction algorithms learn the relation between the sensor data and the end of life to predict the RUL. To do this, health indicators are extracted from the raw monitoring signals, which may have originated from a single sensor or from a number of sensors aggregated to represent the degradation evolution over time. Although this type of RUL prediction is relatively easy to implement, there are few published examples in the literature (Sikorska et al. 2011).

In this work, a direct RUL prediction method is presented. The aim of this work is to model the relation between sensor data and end of life to predict the RUL without the need for a predefined alarm threshold. The method builds on extracting health indicators from the training data, which are used as reference models. For new data, the method finds in the database the most similar signal to be used as a RUL predictor. The method then estimates the new signal's health status using a Bayesian approach.

The assumptions taken in this work can be summarized as follows:

1. The method can only be applied to critical components, which are already identified by the system expert.
2. Historical data should contain degradation evolution of the critical component over time.
3. Historical data should contain a sufficient number of training instances to build representative models of the desired critical component's behavior.
4. The predicted RUL values will span between the values available in the offline data sets.

This paper is structured as follows. Section 2 presents the proposed method. The experimental set-up and the simulation results are depicted in Sect. 3. Finally, Sect. 4 concludes the paper.

Data-driven prognostic method based on Bayesian approaches for direct remaining useful life prediction

Measurements observed from monitored components are usually noisy multidimensional time series signals. Thus, it is essential in the offline phase to first extract information that represents the degradation evolution over time. The relation between the extracted information and the end of life should be modeled to predict the RUL. To do this, the proposed method selects interesting sensor signals and builds health indicators that are used as offline models. In the online phase, the method estimates the current status from the unseen online data, using only the sensors selected in the offline phase, and predicts the RUL by measuring the similarity to the offline data. The method is summarized in Fig. 1 and explained hereafter.

Fig. 1 The method's general scheme. Offline phase: offline signals S1, S2, ..., Sn; variable selection (SX, SY); health indicator construction; database. Online phase: online signals (SX, SY); health indicator construction; k-NN; RUL; online estimation

Offline phase

In order to build offline reference models, representative features should be extracted from the training data. Those features are later labeled with the end of life time and saved in the database. To do that, a trend construction method is applied (Mosallam et al. 2013). The method builds on two main steps, namely, variable selection and health indicator construction.

Variable selection

Not all signals from the monitored component are informative. Signals that have non-random relationships contain information about the system degradation. To select such signals, an unsupervised variable selection algorithm based on information theory is applied (Mosallam et al. 2011). The algorithm first calculates the pairwise symmetrical uncertainty (SU), as depicted in Fig. 2a, for all the input signals:

$$SU(X, Y) = 2\,\frac{I(X, Y)}{H(X) + H(Y)} \tag{1}$$

where $I(X, Y)$ is the mutual information between two random variables $X$ and $Y$; $H(X)$ and $H(Y)$ are the information entropy values of the random variables $X$ and $Y$, respectively. Then, the algorithm groups the variables based on the SU distance using hierarchical clustering, as shown in Fig. 2b. The algorithm finally ranks the resulting clusters according to the quality of the included signals in representing interesting relationships, using a normalized self-organizing map distortion measure. A cluster gets a low rank if it contains random signals. On the other hand, a cluster gets a high rank if it contains signals that exhibit a non-random relationship, and those signals will be used for later processing.

Health indicator construction

The following task, after selecting the interesting variables from the initial raw monitoring signals, is to extract smooth monotonic signals, which are correlated with the component's end of life. These monotonic signals are later processed
to extract representative features over time, which can be used as health indicators and are saved in the database as reference models. To do this, three processing steps, namely variable compression, trend extraction and feature extraction, are applied to the selected variables.

Fig. 2 Variable selection step for 11 sensor signals from NASA battery B0005. a SU similarity matrix (input variables vs. input variables). b Tree representation of variables relations (symmetrical uncertainty distance vs. input variables)

Variable compression: The goal of this step is to compress the n signals selected in the previous step onto a one-dimensional space. From each cycle, the selected variables are compressed using the standard principal component analysis (PCA) method. The first principal component retains the maximum variance while reducing the dimensionality to one dimension. Therefore, only the first principal component is used to represent the health status evolution with respect to time.

Trend extraction: The compressed variables are then further processed at each cycle to get monotonic trends that can represent the variation of end of life, using the empirical mode decomposition (EMD) algorithm (Huang et al. 1998). EMD is a method employed to decompose a signal into successive intrinsic mode functions (IMF) and a residual signal $r_n(t)$, which should be a constant or monotonic signal that can be represented as:

$$r_n(t) = X(t) - \sum_{i=1}^{n} imf_i(t) \tag{2}$$

where $X(t)$ is the input signal, $imf_i$ is the $i$-th IMF and $n$ is the maximum number of IMFs. The generated residual can represent the relation between the generated trends and end of life time. For example, Fig. 3a shows an acceleration signal acquired from a degraded bearing that was worn out after around 9 h, and Fig. 3b shows a non-degraded bearing where the experiment was stopped at the same time as for the degraded bearing (Nectoux et al. 2012). EMD was applied to both signals and the resulting residuals are shown in Fig. 3c. The experiments show that the residual of the degraded component was a monotonic signal while the non-degraded component generated an almost constant residual.

Feature extraction: So far, trends are extracted from the compressed variables. These trends should be used to build an offline model, which can be used to classify new online data. In order to make the classification task more efficient, discriminant features should be extracted from the acquired trends. Different approaches have been proposed for extracting features, such as mean, variance, multi-exponential function, curve fitting, discrete wavelet transform and discrete Fourier transform (Marco et al. 2009). However, selecting appropriate features is mainly problem specific. Recalling Fig. 3c, the slope of the trend can be a discriminant characteristic of the trend. A trend with more RUL tends to have a smaller slope and vice versa. The y-intercept of the curve fit shows the beginning value of the extracted trend, which can also be a discriminant feature. Another discriminant feature for this problem is the mean of the extracted trend. Every data value in the trend contributes to the mean value, and the change of the data over time will affect the mean value. Finally, the variance of the extracted trend describes the spread of a trend with respect to end of life time, which is also an important feature to extract.

In this work, a feature vector $F = [a, b, \bar{x}, s^2]$ is extracted from each trend at each time, where $a$ and $b$ are the slope and the y-intercept of a linear curve fit of the input trend, respectively, and $\bar{x}$ and $s^2$ are the mean and the variance of the input trend, respectively. The feature vector is extracted from each trend starting from time 0 until the current time $t$. Figure 4 shows an example of the feature extraction process from three different trends extracted at three cycles, namely, cycle 40, cycle 100 and cycle 167. The method extracts the feature
vector from each trend built in the previous step, labels the vector with the cycle number and end of life value and saves it in the offline database. This process is repeated recursively, until the method reaches the end of life, to generate a representation of the degradation as a function of time. The resulting function or health indicator, as depicted in Fig. 5, is then used to represent the corresponding critical component according to its end of life time. Each group of health indicators with similar end of life time is considered as a class and saved in the offline database.

Fig. 3 Residual variation according to the health status (acceleration in m/s² and residual vs. time in hours). a Degraded bearing. b Non-degraded bearing. c Residual of both bearings

Fig. 4 Feature extraction from input residual at cycles 40, 100 and 167 (residual vs. time in cycles). a Slope and y-intercept values. b Mean and variance values
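Returning to the variable selection step, Eq. (1) can be illustrated with a minimal sketch that estimates the symmetrical uncertainty of two sampled signals from histogram counts. The function names `entropy`, `symmetrical_uncertainty` and `discretize` are hypothetical, and the equal-width binning is an assumption made here for illustration; the text does not state which entropy estimator the cited selection algorithm uses.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete symbols."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X, Y) / (H(X) + H(Y)), with I = H(X) + H(Y) - H(X, Y)."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both signals are constant, so there is nothing to share
    hxy = entropy(list(zip(x, y)))  # joint entropy from paired symbols
    return 2.0 * (hx + hy - hxy) / (hx + hy)


def discretize(signal, bins=4):
    """Equal-width binning of a real-valued signal (assumption, see lead-in)."""
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant signal
    return [min(int((v - lo) / width), bins - 1) for v in signal]
```

A value close to 1 indicates that one discretized signal almost determines the other, while a value close to 0 indicates no measurable shared information; a distance such as 1 - SU could then feed a hierarchical clustering step.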
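The feature extraction step can be sketched in the same spirit. The snippet below computes the slope, y-intercept, mean and sample variance of a trend and repeats this over growing windows to obtain a health indicator. It assumes that the monotonic trend (the EMD residual of the first principal component) is already available as a list of numbers; the names `trend_features` and `health_indicator` are hypothetical. Note that this is a simplification: it re-uses prefixes of one final trend rather than re-extracting the trend at each cycle.

```python
def trend_features(trend):
    """Return [a, b, mean, variance] for a trend sampled at cycles 0..len-1."""
    n = len(trend)
    if n < 2:
        raise ValueError("need at least two samples to fit a line")
    t_mean = (n - 1) / 2.0
    x_mean = sum(trend) / n
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    s_tx = sum((t - t_mean) * (x - x_mean) for t, x in enumerate(trend))
    a = s_tx / s_tt          # slope of the least-squares line
    b = x_mean - a * t_mean  # y-intercept of the least-squares line
    var = sum((x - x_mean) ** 2 for x in trend) / (n - 1)  # sample variance
    return [a, b, x_mean, var]


def health_indicator(trend):
    """Feature vectors over growing windows trend[0:j], for j = 2..len(trend)."""
    return [(j, trend_features(trend[:j])) for j in range(2, len(trend) + 1)]
```

Each returned pair holds the window length in cycles and the corresponding feature vector, which would then be labeled with the end of life value before being stored.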
Fig. 5 Health indicators constructed from the NASA battery B0005 (health indicator vs. time in cycles). a Slope features over time. b Variance features over time

Fig. 6 Finding the end of life for the online signal at time = 50 cycles using k-NN classifier. a Selecting the most similar group. b End of life for the online signal

Online phase

In this phase, new sensor data are collected online from the critical component(s), using only the sensors that were selected in the offline phase. The processes applied in the offline phase, such as extracting monotonic trends and the feature vector F, are applied to the online signals. The generated vector F is then fed to a k-NN classifier to find the most similar offline signal (or case). The end of life value of the offline signal is then considered to be the RUL of the test signal. The online estimator recursively estimates the trend's value until it stops at the end of life time.

Classification using k-nearest neighbours

In order to build the predictive model, a k-NN classifier is applied in this work. The extracted feature vector F at time t is passed to the k-NN to find the most similar offline group in the database at the same time, as shown in Fig. 6a. The classification decision is based on the largest posterior probability of the tested sample at time t; therefore, a probability value will be assigned to the prediction output:

$$p(EOL_k \mid F_t) = \frac{p(F_t \mid EOL_k)\, p(EOL_k)}{p(F_t)} \tag{3}$$

where $F_t$ is the online feature vector, $EOL_k$ is the class or end of life value for group $k$, $p(F_t \mid EOL_k)$ is the probability of observing $F_t$ given $EOL_k$, also known as the likelihood, $p(EOL_k)$ is the class prior and $p(F_t)$ is the marginal likelihood. The end of life with the highest posterior probability will be used as the end of life for the new signal, as depicted in Fig. 6b.

Online estimation

To estimate the actual value of the online health indicator at the predicted end of life value, a recursive discrete Bayesian filter is applied to the online trends. This filter decomposes
the state space into many regions and represents the cumulative posterior for each region by probability values, see Algorithm 1.

Algorithm 1: Discrete Bayesian filter.
    Input: $\{p_{k,t-1}\}$, $z_t$
    Output: $\{p_{k,t}\}$
    for all $k$ do
        $\bar{p}_{k,t} = \sum_i p(X_t = x_k \mid X_{t-1} = x_i)\, p_{i,t-1}$
        $p_{k,t} = p(z_t \mid X_t = x_k)\, \bar{p}_{k,t}$
    end

The input to the algorithm is a discrete probability distribution $\{p_{k,t-1}\}$ along with the recent measurement $z_t$. The first line of Algorithm 1, $\bar{p}_{k,t} = \sum_i p(X_t = x_k \mid X_{t-1} = x_i)\, p_{i,t-1}$, calculates the prediction for the new state based on the previous state uncertainty and the state transition model. The prediction is then updated in the second line, $p_{k,t} = p(z_t \mid X_t = x_k)\, \bar{p}_{k,t}$, so as to incorporate the measurement. Discrete Bayesian filters apply to problems with a finite state space, where the random variable $X_t = x_{1,t} \cup x_{2,t} \cup \dots \cup x_{k,t}$. A straightforward decomposition of $X_t$ is a multidimensional grid, where each $x_{k,t}$ is a bin or region. The size of each bin is $dx = \frac{x_{max} - x_{min}}{n}$, where $x_{max}$ is the maximum state value, $x_{min}$ is the minimum state value and $n$ is the number of bins. Each bin can then be represented as a Gaussian function with a mean value at each state and a common variance:

$$p(X_t \mid X_{t-1}) = dx\; N(X_{k,t}, \sigma^2) \tag{4}$$

where $p(X_t = x_k \mid X_{t-1})$ is the state transition model, $dx$ is the bin size and $N(X_{k,t}, \sigma^2)$ is the normal distribution at state $X_{k,t}$. Moreover, Eq. (4) is normalized to turn this quantity into a probability distribution. Similarly, the measurement probability model can be calculated in the same manner as the transition model. Figure 7 shows the final result of the proposed method.

Fig. 7 Estimation of the health indicator using Bayesian filter

The estimation algorithm stops once it reaches the predicted end of life. The uncertainty about the prediction and the current status are represented in probabilistic forms. The overall method is summarized in Algorithm 2.

Algorithm 2: The general algorithm of the proposed method
    Data: {trainingData, testData}
    Result: {Dl, RUL}
    for trainingData do
        selectedVariables = FindBestGroup(trainingData);
    end
    Offline phase
    for i = 1 : numberOf(trainingData) do
        EOL = lengthOf(trainingData(i));
        for j = 2 : EOL do
            selectedVariables = getSelectedVariables(trainingData(i));
            ip = selectedVariables(1 : j);
            firstComponent = GetFirstComponent(ip);
            residual = GetEMDResidual(firstComponent);
            features = GetFeatures(residual);
            HI = append([features, i]);
        end
        Dl = append([HI, EOL]);
    end
    Online phase
    selectedVariables = getSelectedVariables(testData);
    firstComponent = GetFirstComponent(selectedVariables);
    residual = GetEMDResidual(firstComponent);
    testFeatures = GetFeatures(residual);
    EOL = kNN(testFeatures, Dl);
    rulEstimation = discreteBayesianFilter(testFeatures, EOL);
    RUL = (EOL, rulEstimation);

The machine degradation information, i.e. the predicted RUL, the estimated health status and the corresponding uncertainties, produced by this method can be used as an input for a maintenance decision-making routine. The decision-making routine considers both machine degradation information and system structure to assist the plant manager in making a dynamic maintenance plan based not only on the optimization of a single component/subsystem plan, but also on the global scheduling of the whole system for optimized maintenance prioritization (Xia et al. 2012). Maintenance prioritization is crucial to reduce unnecessary maintenance activities, especially when the availability of maintenance resources is limited (Li and Ni 2009).

Applications and results

Two real life data sets are used in this work to verify the proposed method: turbofan engine and lithium-ion battery aging data sets.
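Before turning to the data sets, the classification step of Eq. (3) can be illustrated with a minimal sketch. It approximates the posterior of each end of life class by its share of votes among the k nearest offline feature vectors, which is a common way to attach a probability to a k-NN decision but is an assumption here, since the text does not spell out how the posterior is computed. The names `knn_posterior` and `predict_eol` are hypothetical.

```python
import math
from collections import Counter


def knn_posterior(query, database, k=3):
    """Approximate p(EOL_k | F_t) by the vote share among the k nearest vectors.

    database is a list of (feature_vector, eol_label) pairs that were
    recorded offline at the same cycle t as the online query vector.
    """
    nearest = sorted(database, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return {label: count / len(nearest) for label, count in votes.items()}


def predict_eol(query, database, k=3):
    """Return the most probable end of life and its estimated posterior."""
    posterior = knn_posterior(query, database, k)
    eol = max(posterior, key=posterior.get)
    return eol, posterior[eol]
```

In practice the four features have different scales, so some normalization before computing Euclidean distances would likely be needed; that choice is left open in this sketch.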
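Similarly, one prediction and update cycle of the discrete Bayesian filter in Algorithm 1 can be sketched as follows. The sketch assumes a one-dimensional grid of bin centres, a Gaussian transition model centred on the previous bin with no drift, and a Gaussian measurement model; both models are normalized explicitly. The function names and the two standard deviations are hypothetical illustration parameters, not values taken from the paper.

```python
import math


def gaussian(x, mean, sigma):
    """Normal density N(mean, sigma^2) evaluated at x."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def discrete_bayes_step(prior, z, centers, sigma_motion, sigma_meas):
    """One prediction/update cycle over bins with the given centre values."""
    n = len(centers)
    # Prediction: p_bar[k] = sum_i p(x_k | x_i) * prior[i], rows normalized.
    predicted = [0.0] * n
    for i, p_i in enumerate(prior):
        row = normalize([gaussian(xk, centers[i], sigma_motion) for xk in centers])
        for k in range(n):
            predicted[k] += row[k] * p_i
    # Update: weight each bin by the measurement likelihood, then renormalize.
    return normalize([gaussian(z, xk, sigma_meas) * pb for xk, pb in zip(centers, predicted)])
```

Calling the function repeatedly, with each posterior fed back as the next prior, gives the recursive estimate; the bin with the largest probability is the most likely health indicator value at that cycle.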
Fig. 8 Results of variable selection and health indicator construction for the NASA turbofan engine 61. a The selected pair of sensors (physical fan speed and corrected fan speed in rpm vs. time in cycles). b The slope health indicator (slope of the curve fit vs. time in cycles)

Fig. 9 Results of predicting the RUL at all cycles for 4 engines (predicted and real RUL vs. cycle). a RUL of engine 34. b RUL of engine 41. c RUL of engine 42. d RUL of engine 81

Turbofan engine data

The turbofan engine data sets are generated using the commercial modular aero-propulsion system simulation (C-MAPSS) (Saxena and Goebel 2008). They consist of four training files, four testing files and four RUL values files. The training files contain run to failure sensor records of a fleet of engines generated under different combinations of operational conditions and fault modes. Each engine is operating normally and it develops a fault at some point during the operation until
finally it reaches the system failure and the engine stops. The test files are generated in the same way; however, the sensor readings are omitted prior to system failure. The RUL files contain a vector of true RUL values for the test data. Each training and test file contains 26 columns that represent different variables. The first two columns represent the engine number and the time in cycles, respectively. The next three columns represent the operational settings. The last 21 columns, or variables, represent different time series sensor data such as total temperature at fan inlet, pressure at fan inlet, physical fan speed, etc. Each row represents a data snapshot taken during a single cycle. In this work, the data file train_FD001.txt is used for offline training and test_FD001.txt is used for online testing. Each file contains data for 100 engines and the objective is to predict the number of remaining operational cycles before failure in the test set. The true RUL values for the test data are presented in the data file RUL_FD001.txt.

Table 1 Training data sets with three folds
Fold #1   Fold #2   Fold #3   EOL
B0006     B0005     B0005     168
B0007     B0007     B0006     168
B0026     B0025     B0025     28
B0027     B0026     B0026     28
B0028     B0027     B0028     28
B0030     B0029     B0029     40
B0031     B0031     B0030     40
B0032     B0032     B0031     40
B0034     B0033     B0033     197
B0036     B0036     B0034     197
B0039     B0038     B0038     47
B0040     B0040     B0039     47
B0043     B0042     B0042     112
B0044     B0044     B0043     112
B0045     B0045     B0045     72
B0047     B0046     B0046     72
B0048     B0048     B0047     72
B0050     B0049     B0049     25
B0051     B0050     B0050     25
B0052     B0051     B0052     25
B0055     B0054     B0054     102
B0056     B0056     B0055     102

Table 2 Testing data sets with three folds
Fold #1   Fold #2   Fold #3   EOL
B0005     B0006     B0007     168
B0025     B0028     B0027     28
B0029     B0030     B0032     40
B0033     B0034     B0036     197
B0038     B0039     B0040     47
B0042     B0043     B0044     112
B0046     B0047     B0048     72
B0049     B0052     B0051     25
B0054     B0055     B0056     102

Fig. 10 Selected pair of variables from the NASA battery B0005 (capacity in Ahr and discharge voltage in volt vs. time in cycles)

Variable selection: One of the results of the selection algorithm is the pair of sensors number {8, 13}, i.e. physical fan speed and corrected fan speed, respectively (Fig. 8a). The selected group is interesting as the two variables are correlated and both are related to the fan speed. Then, the method starts constructing the monotonic trends iteratively from each pair at each time.

Health indicator: As mentioned before, four features are extracted from each trend at each time and labeled with the end of life time to be saved in the offline database. The features represent the relation between the extracted trends and the engine's end of life. Figure 8b shows one of the four health indicators for the NASA training engine number 61. The indicator is monotonic and shows how the relation between the end of life and the extracted trend changes over time. Each health indicator is then saved in the offline database, labeled with the end of life time, and will be used for predicting the RUL of new sensor data.

Prediction results: Figure 9 shows the predicted RUL for 4 engines at all cycles. It can be noticed that the accuracy of the predictions increases with time. The prediction error at the last cycles is less than the errors at the beginning. To assess the performance of the proposed method, the mean absolute percentage error (MAPE) is calculated for all 100 online predictions:

$$MAPE(\%) = \frac{100\,\%}{n} \sum_{i=1}^{n} \frac{\left| RUL_i - \widehat{RUL}_i \right|}{RUL_i} \tag{5}$$

where $RUL$ and $\widehat{RUL}$ are the actual and predicted RUL values, respectively, and $n$ is the total number of predictions. The
Fig. 11 Results of predicting the RUL at all cycles for 2 batteries (predicted and real RUL vs. cycle). a RUL of battery B0005. b RUL of battery B0025
error is calculated only for the last cycles of all 100 test sig- Table 3 Mean absolute percentage error for the NASA battery data sets
nals. The MAPE over the 100 test data equals to 12.19 %. And Fold #1 Fold #2 Fold #3 Average
for comparison, the MAPE over the first 15 test engines is
8.7691 %, which outperforms the method presented in Kam- 28.0493 % 26.3089 % 28.3536 % 27.5706 %
ran et al. (2013) in which the MAPE value is 15.5 % for the
15 test engines.
Lithium-ion battery data

These data were collected on 34 lithium-ion batteries run through different operational profiles (e.g. charge, discharge and impedance) at different temperatures (Saha and Goebel 2007). In this work, only charge and discharge data are used. Each data set, corresponding to one experiment, consists of 11 variables such as charging voltage, charging current, temperature, discharging current, discharging voltage and capacity. The aging of the batteries was accelerated, and the experiments continued until the batteries reached their end of life. Each cycle is represented by its mean value to reduce the processing time. To validate the proposed method, a threefold cross-validation is applied, i.e. the available data sets are partitioned into three groups of equal size. Each group is then divided into training and testing data sets as depicted in Tables 1 and 2, respectively. Only 31 battery data sets are used in this experiment, as three batteries, namely B0018, B0041 and B0053, do not have any similar data sets with the same end of life.

Variable selection: One of the results of the selection algorithm is the pair of variables {6, 11}, i.e. the voltage measured at discharge and the capacity of the battery (Fig. 10). The selected pair is interesting because the two variables are correlated. Indeed, the capacity is related to the battery health, as a decrease in capacity indicates health degradation.

Health indicator: Four features are extracted from each trend at each time and labeled with the end of life time to be saved in the offline database. Figure 5 shows two of the four health indicators for the battery B0005. The indicators are monotonic and show how the relation between the end of life and the extracted trend changes over time.

Prediction results: To assess the performance of the proposed method, MAPE is calculated for all cycles of each battery (Fig. 11). The average MAPE per fold is calculated as follows:

$$\overline{\mathrm{MAPE}}_f = \frac{1}{n} \sum_{i=1}^{n} \mathrm{MAPE}_{i,f} \qquad (6)$$

where $\overline{\mathrm{MAPE}}_f$ is the average MAPE for a complete fold and $\mathrm{MAPE}_{i,f}$ is the MAPE for test battery $i$ in fold $f$. The final results are summarized in Table 3.

Figure 11a shows a plot of the predicted RUL for all cycles of battery B0005. It can be seen that the prediction accuracy increases with time, i.e. the longer the test trend, the higher the prediction accuracy. Figure 11b shows a plot of the predicted RUL for battery B0025. Only 10 cycles were considered as late prediction. However, the error decreased at the later cycles.

Conclusion

In this paper, a data-driven method for RUL prediction based on a Bayesian approach is proposed. The method builds on unsupervised selection of interesting variables from the input offline signals. It constructs representative features that can be used as health indicators. The method represents the
current status of the online signals as well as the uncertainty about the predictions in a probabilistic form.

The performance of the proposed method is evaluated using two data sets, namely turbofan engine and lithium-ion battery data downloaded from the NASA Prognostics Center of Excellence website. The prediction results show low MAPE for both applications.

For future work, the proposed method should consider data sets with no training samples in the database, as is the case with the battery data sets. Also, it should be tested using data sets with variable operating conditions and after introducing maintenance interventions. Different classification/regression models should be tested in the proposed framework.

References

Benkedjouh, T., Medjaher, K., Zerhouni, N., & Rechak, S. (2013). Health assessment and life prediction of cutting tools based on support vector regression. Journal of Intelligent Manufacturing, article published online 19 April 2013. doi:10.1007/s10845-013-0774-6.

Box, G. E. P., & Jenkins, G. M. (1976). Time series analysis: Forecasting and control. San Francisco: Holden-Day.

Brezak, D., Majetic, D., Udiljak, T., & Kasac, J. (2012). Tool wear estimation using an analytic fuzzy classifier and support vector machines. Journal of Intelligent Manufacturing, 23, 797-809.

Chaari, Fakher, Fakhfakh, Tahar, & Haddar, Mohamed. (2009). Analytical modelling of spur gear tooth crack and influence on gearmesh stiffness. European Journal of Mechanics-A/Solids, 28(3), 461-468. doi:10.1016/j.euromechsol.2008.07.007.

Choi, Kihoon, Singh, Satnam, Kodali, Anuradha, Pattipati, Krishna R., Sheppard, John W., Namburu, Setu Madhavi, et al. (2009). Novel classifier fusion approaches for fault diagnosis in automotive systems. IEEE Transactions on Instrumentation and Measurement, 58(3), 602-611. doi:10.1109/TIM.2008.2004340.

Dong, Jianfei, Verhaegen, Michel, & Gustafsson, Fredrik. (2012). Robust fault detection with statistical uncertainty in identified parameters. IEEE Transactions on Signal Processing, 60(10), 5064-5076. doi:10.1109/TSP.2012.2208638.

Gajate, A., Haber, R., Del Toro, R., Vega, P., & Bustillo, A. (2012). Tool wear monitoring using neuro-fuzzy techniques: A comparative study in a turning process. Journal of Intelligent Manufacturing, 23, 869-882.

Gebraeel, N., Lawley, M., Liu, R., & Parmeshwaran, V. (2004). Residual life predictions from vibration-based degradation signals: A neural network approach. IEEE Transactions on Industrial Electronics, 51(3), 694-700.

Gorjian, N., Ma, L., Mittinty, M., Yarlagadda, P., & Sun, Y. (2009). Review on degradation models in reliability analysis. In Proceedings of the 4th world congress on engineering asset management, 28-30 Sept, Athens, Greece.

He, D., Li, R., & Bechhoefer, E. (2012). Stochastic modeling of damage physics for mechanical component prognostics using condition indicators. Journal of Intelligent Manufacturing, 23, 221-226.

Heng, Aiwina, Zhang, Sheng, Tan, Andy C. C., & Mathew, Joseph. (2009). Rotating machinery prognostics: State of the art, challenges and opportunities. Mechanical Systems and Signal Processing, 23(3), 724-739. doi:10.1016/j.ymssp.2008.06.009.

Huang, N. E., Shen, Z., Long, S. R., Wu, M. C., Shih, H. H., Zheng, Q., et al. (1998). The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. In Proceedings of the Royal Society of London Series A: Mathematical, Physical and Engineering Sciences (pp. 903-995).

Huang, R., Xi, L., Li, X., Qiu, H., & Lee, J. (2007). Residual life predictions for ball bearings based on self-organizing map and back propagation neural network methods. Mechanical Systems and Signal Processing, 21(1), 193-207.

Isermann, R. (2006). Fault-diagnosis systems: An introduction from fault detection to fault tolerance. Heidelberg: Springer.

Iyer, N., Goebel, K., & Bonissone, P. (2006). Framework for post-prognostic decision support. IEEE Aerospace Conference, 9(1), 3962-3971. doi:10.1109/AERO.2006.1656108.

Jardine, Andrew K. S., Lin, Daming, & Banjevic, Dragan. (2006). A review on machinery diagnostics and prognostics implementing condition-based maintenance. Mechanical Systems and Signal Processing, 20(7), 1483-1510. doi:10.1016/j.ymssp.2005.09.012.

Javed, K., Gouriveau, R., & Zerhouni, N. (2013). Novel failure prognostics approach with dynamic thresholds for machine degradation. In 39th annual conference of the IEEE industrial electronics society (IECON) (pp. 4404-4409), 10-13 November 2013. doi:10.1109/IECON.2013.6699844.

Javed, K., Gouriveau, R., Zerhouni, N., & Nectoux, P. (2013). A feature extraction procedure based on trigonometric functions and cumulative descriptors to enhance prognostics modeling. In IEEE prognostics and health management (PHM) conference (Vol. 1(7), pp. 24-27). doi:10.1109/ICPHM.2013.6621413.

Kothamasu, Ranganath, Huang, Samuel H., & VerDuin, William H. (2006). System health monitoring and prognostics: A review of current paradigms and practices. The International Journal of Advanced Manufacturing Technology, 28(9-10), 1012-1024. doi:10.1007/s00170-004-2131-6.

Lee, J., Ni, J., Djurdjanovic, D., Qiu, H., & Liao, H. (2006). Intelligent prognostics tools and e-maintenance. Computers in Industry, 57(6), 476-489.

Lei, Z., Xingshan, L., Jinsong, Y., & ZhanBao, G. (2007). A genetic training algorithm of wavelet neural networks for fault prognostics in condition based maintenance. In Proceedings of the eighth international conference on electronic measurement and instruments (pp. 584-589). IEEE.

Lewis, F. (1992). Applied optimal control and estimation: Digital design and implementation. Englewood Cliffs, NJ: Prentice-Hall.

Li, Lin, & Ni, Jun. (2009). Short-term decision support system for maintenance task prioritization. International Journal of Production Economics, 121(1), 195-202.

Luo, J., Namburu, M., Pattipati, K., Qiao, L., Kawamoto, M., & Chigusa, S. (2003). Model-based prognostic techniques, Anaheim, CA, United States: 2003 (pp. 330-340). Piscataway, NJ, United States: Institute of Electrical and Electronics Engineers Inc.

Medjaher, Kamal, Tobon-Mejia, Diego A., & Zerhouni, Noureddine. (2012). Remaining useful life estimation of critical components with application to bearings. IEEE Transactions on Reliability, 61(2), 292-302. doi:10.1109/TR.2012.2194175.

Montgomery, N., Banjevic, D., & Jardine, A. K. S. (2012). Minor maintenance actions and their impact on diagnostic and prognostic CBM models. Journal of Intelligent Manufacturing, 23(2), 303-311. doi:10.1007/s10845-009-0352-0.

Mosallam, A., Byttner, S., & Svensson, M. T. R. (2011). Nonlinear relation mining for maintenance prediction. In IEEE Aerospace Conference (pp. 1-9), March 2011. doi:10.1109/AERO.2011.5747581.

Mosallam, A., Medjaher, K., & Zerhouni, N. (2013). Nonparametric time series modelling for industrial prognostics and health management. The International Journal of Advanced Manufacturing Technology, 69(5), 1685-1699. doi:10.1007/s00170-013-5065-z.

Nectoux, P., Gouriveau, R., Medjaher, K., Ramasso, E., Chebel-Morello, B., Zerhouni, N., & Varnier, C. (2012). Pronostia: An experimental platform for bearings accelerated degradation tests. In IEEE international conference on prognostics and health management, Denver, Colorado, USA.

Pal, S., Heyns, P. S., Freyer, B. H., Theron, N. J., & Pal, S. K. (2011). Tool wear monitoring and selection of optimum cutting conditions with progressive tool wear effect and input uncertainties. Journal of Intelligent Manufacturing, 22, 491-504.

Peng, Ying, Dong, Ming, & Zuo, Ming Jian. (2010). Current status of machine prognostics in condition-based maintenance: A review. The International Journal of Advanced Manufacturing Technology, 50(1-4), 297-313. doi:10.1007/s00170-009-2482-0.

Purushothaman, S. (2010). Tool wear monitoring using artificial neural network based on extended Kalman filter weight updation with transformed input patterns. Journal of Intelligent Manufacturing, 21, 717-730.

Ramasso, Emmanuel, Rombaut, Michèle, & Zerhouni, Noureddine. (2013). Joint prediction of continuous and discrete states in time-series based on belief functions. IEEE Transactions on Cybernetics, 43(1), 37-50. doi:10.1109/TSMCB.2012.2198882.

Saha, B., & Goebel, K. (2007). Battery Data Set, NASA Ames Prognostics Data Repository. [http://ti.arc.nasa.gov/project/prognostic-data-repository]. NASA Ames, Moffett Field, CA.

Saha, Bhaskar, & Goebel, Kai. (2008). Uncertainty management for diagnostics and prognostics of batteries using Bayesian techniques. IEEE Aerospace Conference, 1(8), 1-8. doi:10.1109/AERO.2008.4526631.

Sarah S. S., Radzi, N. H. M., & Haron, H. (2012). Review on scheduling techniques of preventive maintenance activities of railway. In Fourth international conference on computational intelligence, modelling and simulation (CIMSiM) (pp. 310-315), 25-27 Sept. 2012, Kuantan, Malaysia. doi:10.1109/CIMSim.2012.56.

Satish, B., & Sarma, N. D. R. (2005). A fuzzy BP approach for diagnosis and prognosis of bearing faults in induction motors. In IEEE power engineering society general meeting (pp. 2291-2294). IEEE.

Saxena, A., & Goebel, K. (2008). C-MAPSS Data Set, NASA Ames Prognostics Data Repository. [http://ti.arc.nasa.gov/project/prognostic-data-repository]. NASA Ames, Moffett Field, CA.

Schwabacher, M. A. (2005). A survey of data-driven prognostic. In Infotech@Aerospace (pp. 26-29). Arlington, Virginia.

Sikorska, J. Z., Hodkiewicz, M., & Ma, L. (2011). Prognostic modelling options for remaining useful life estimation by industry. Mechanical Systems and Signal Processing, 25(5), 1803-1836. doi:10.1016/j.ymssp.2005.09.012.

Tian, Zhigang. (2012). An artificial neural network method for remaining useful life prediction of equipment subject to condition monitoring. Journal of Intelligent Manufacturing, 23(2), 227-237. doi:10.1007/s10845-009-0356-9.

Tobon-Mejia, Diego A., Medjaher, Kamal, Zerhouni, Noureddine, & Tripot, Gerard. (2012). A data-driven failure prognostics method based on mixture of Gaussians hidden Markov models. IEEE Transactions on Reliability, 61(2), 491-503. doi:10.1109/TR.2012.2194177.

Trincavelli, M., Coradeschi, S., & Loutfi, A. (2009). Odour classification system for continuous monitoring applications. Sensors and Actuators B: Chemical, 139(2), 265-273, 4 June 2009, ISSN: 0925-4005. doi:10.1016/j.snb.2009.03.018.

Tsay, R. S. (2000). Time series and forecasting: Brief history and future research. Journal of the American Statistical Association, 95(450), 638-643.

Vachtsevanos, G., Lewis, F., Roemer, M., Hess, A., & Wu, B. (2006). Intelligent fault diagnosis and prognosis for engineering systems. Hoboken, New Jersey: Wiley.

Vassilopoulos, A. P., Georgopoulos, E. F., & Dionysopoulos, V. (2007). Artificial neural networks in spectrum fatigue life prediction of composite materials. International Journal of Fatigue, 29(1), 20-29.

Wang, Tianyi, Jianbo, Yu., Siegel, D., & Lee, J. (2008). A similarity-based prognostics approach for remaining useful life estimation of engineered systems. IEEE International Conference on Prognostics and Health Management, 1(6), 6-9. doi:10.1109/PHM.2008.4711421.

Wu, W., Hu, J., & Zhang, J. (2007). Prognostics of machine health condition using an improved ARIMA-based prediction method (pp. 1062-1067). Harbin, China: IEEE.

Xia, Tangbin, Xi, Lifeng, Zhou, Xiaojun, & Lee, Jay. (2012). Dynamic maintenance decision-making for series-parallel hybrid multi-unit manufacturing system based on MAM-MTW methodology. European Journal of Operational Research, 221, 231-240.

Yan, J., Koc, M., & Lee, J. (2004). A prognostic algorithm for machine performance assessment and its application. Production Planning and Control, 76, 796-801.

Yeo, S. H., Khoo, L. P., & Neo, S. S. (2000). Tool condition monitoring using reflectance of chip surface and neural network. Journal of Intelligent Manufacturing, 11, 507-514.

Zhang, G. P. (2003). Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, 50, 159-175.

Zhang, Zhenyou, Wang, Yi, & Wang, Kesheng. (2013). Fault diagnosis and prognosis using wavelet packet decomposition, Fourier transform and artificial neural network. Journal of Intelligent Manufacturing, 24(6), 1213-1227. doi:10.1007/s10845-012-0657-2.