
ISA Transactions 52 (2013) 19-29


Development and comparison of neural network based soft sensors for online estimation of cement clinker quality

Ajaya Kumar Pani*, Vamsi Krishna Vadlamudi, Hare Krishna Mohanta
Department of Chemical Engineering, Birla Institute of Technology and Science, Pilani 333031, India

* Corresponding author. Tel.: +91 9929832108. E-mail addresses: akpani@bits-pilani.ac.in (A.K. Pani), vadlamudiy5ch845@gmail.com (V.K. Vadlamudi), harekrishna.mohanta@gmail.com (H.K. Mohanta; Tel.: +91 9829434948).

Article history: Received 5 March 2012; Received in revised form 8 June 2012; Accepted 11 July 2012; Available online 31 August 2012. This paper was recommended for publication by Ricky Dubay.

Keywords: Cement kiln modeling; Back propagation neural network; Radial basis function neural network; Regression neural network; Soft sensor

Abstract

The online estimation of process outputs mostly related to quality, as opposed to their belated measurement by means of hardware measuring devices and laboratory analysis, represents the most valuable feature of soft sensors. As of now there have been very few attempts at soft sensing of cement clinker quality, which is mostly determined by offline laboratory analysis, resulting at times in low quality clinker. In the present work three different neural network based soft sensors have been developed for online estimation of cement clinker properties. Different input and output data for a rotary cement kiln were collected from a cement plant producing 10,000 tons of clinker per day. The raw data were pre-processed to remove the outliers and the resulting missing data were imputed. The processed data were then used to develop a back propagation neural network model, a radial basis function network model and a regression network model to estimate the clinker quality online. A comparison of the estimation capabilities of the three models was made by simulation of the developed models. It was observed that the radial basis function network model provided better estimation capability than the back propagation and regression network models.

© 2012 ISA. Published by Elsevier Ltd. All rights reserved.

1. Introduction

A major problem in product quality control in process industries is the difficulty faced in continuous online measurement of certain output variables, especially those related to composition. Although in some cases online analyzers are available, the significant time delays associated with most such instruments make timely control difficult and sometimes impossible. Moreover, such instruments have low reliability. Soft sensing is a modeling approach to estimate hard-to-measure process variables (primary variables) from easy-to-measure, online process variables (secondary variables). This plant model may be a first principles model, a black box model or a gray box model, substituting some physical sensors and using data acquired from other available ones. Though modeling a process from first principles is often desirable, in most cases a first principles model is not feasible because of the enormous complexity and/or intensive computation involved. On the other hand, modern measurement techniques enable a large amount of operating data to be collected, stored and analyzed, thereby rendering data-driven modeling methods (black box or gray box) more preferable to first principles models for complex processes.

Artificial neural networks (ANN) have been a popular empirical modeling technique over the years. The processes for which neural network modeling has been reported are listed under each type of neural network developed in this work. Other than pollutant emission studies, there have been very few attempts to develop an ANN based soft sensor for online estimation of product quality in a cement plant. The quality of clinker plays the most important role in determining the quality of cement. Unfortunately, there is no hardware sensor available for online sensing of the composition of clinker coming out of a rotary cement kiln. The clinker quality is determined by measuring its content of free lime and other important components by offline laboratory analysis. Therefore any reduction in clinker quality, detected by offline laboratory analysis hours after production, leads to rejection or recycling of the clinker formed. Any method for online estimation of clinker quality will largely help in reducing the amount of rejection, thereby resulting in lower revenue loss or more profit. A few works on kiln modeling have been done based on a statistical approach [1,2]. However, these works are aimed only at estimating the free lime content of the clinker. While free lime content is the most important clinker quality parameter, there are also other important quality parameters (mentioned later in the paper) which require online estimation for better product quality.

Nomenclature

c_i              ith RBF center
D_i^2            Euclidean distance between two input vectors
I_j              Input to the jth pattern neuron of the GRNN
n                Number of training samples
Q_0.25, Q_0.75   1st and 3rd quartile values of a data set
var(y)           Variance of a data set y
w_i              Weight associated with the ith RBF center
w_ij             Weight associated with the ith input neuron and jth pattern neuron
x_i              ith observation value of an input variable
x̄                Mean value of a data set
x_0.5            Median value of a data set
x_max            Maximum value of a variable in a data set
x_min            Minimum value of a variable in a data set
x_norm           Normalized value
y                Actual output value (normalized)
ŷ                Simulated output value of the neural network model
σ                Standard deviation; scaling parameter of the RBF network; spread coefficient in the GRNN
γ                Skewness of a data set
κ                Kurtosis of a data set
φ_i              RBF model basis function

In the present work, neural network based soft sensors have been developed for estimation of eight clinker quality parameters. Three types of network models, a back propagation network model, a radial basis function network model and a regression network model, have been developed and a comparison of their performances has been made. The models receive four raw mix quality parameters and five physical variables pertaining to kiln operation, i.e. rpm, current, fuel flow rate, temperature and kiln feed rate, as inputs and produce values of eight clinker quality parameters as outputs.

The article describes the following topics in order: brief description of the cement making process with focus on the rotary cement kiln, data preprocessing, neural network development, simulation results, discussion and conclusion.

2. Brief description of cement manufacturing process

The raw materials for cement are limestone as a source of lime, clay as a source of silica, laterite as a source of iron and red ochre as a source of aluminum. These raw materials are first mixed in the required proportion, which is referred to as raw meal. This raw meal is then ground to the required size in a vertical roller mill. The sized raw meal then enters a multistage cyclone preheater, where it is preheated by the hot flue gas coming from the cement rotary kiln. The bulk of the calcination (CaCO3 → CaO + CO2) of the raw meal takes place in the multistage cyclone preheater. The preheated raw meal then enters the cement kiln. The rotary kiln is the heart of the cement plant, where the different components present in the raw meal react with each other at high temperature. The lime (CaO) reacts with other components like silica, alumina and iron oxide present in the raw meal to form complexes of dicalcium silicate (C2S), tricalcium silicate (C3S), tricalcium aluminate (C3A) and tetracalcium aluminoferrite (C4AF). All these components must be maintained in the required proportion in the clinker to maintain its quality. The unreacted lime appears as free lime in the clinker, which should be limited to a minimum. The clinker is then ground with a small amount of gypsum in the cement mill, producing the final product, i.e. cement.

In the cement plant the quality of the clinker to a large extent affects the quality of the cement. However, this clinker quality is mostly determined by offline laboratory analysis, by taking clinker samples at the kiln outlet at regular intervals and then analyzing them for the different components. Failure to meet the quality criteria therefore leads to rejection/recycling of the kiln product, resulting in significant revenue loss. A soft sensor capable of continuous online estimation of clinker composition can minimize the occurrence of such problems.

In order to develop the data driven soft sensor, data were collected over a period of one month for various quality parameters of the raw meal as well as the corresponding output clinker and kiln operating variables from a cement plant having a kiln capacity of 10,000 t clinker per day, as shown in Table 1.

3. Data preprocessing

An industrial database provides data of all the variables that are recorded. In the present work, as shown in Table 1, there are 12 variables pertaining to raw meal quality (kiln inlet), 5 kiln operating variables and 17 clinker quality variables. Out of all the quality parameters of clinker, eight important parameters were chosen to be predicted by the model. For prediction of the chosen quality parameters, not all of the available input information is required.

Table 1
All associated variables for the cement kiln.

Raw meal quality:          SiO2, Al2O3, Fe2O3, CaO, MgO, K2O, Na2O, SO3, Cl, Lime Saturation Factor, Silica Modulus, Alumina Modulus
Clinker quality:           SiO2, Al2O3, Fe2O3, CaO, MgO, K2O, Na2O, SO3, Cl, Free Lime, Lime Saturation Factor, Silica Modulus, Alumina Modulus, C3S, C2S, C3A, C4AF
Kiln operating variables:  Kiln feed rate, kiln current, kiln RPM, feed inlet temperature, coal feed rate

Table 2
Final selected variables for the cement kiln.

Raw meal quality:          SiO2, Al2O3, Fe2O3, CaO
Clinker quality:           Free Lime, Lime Saturation Factor, Silica Modulus, Alumina Modulus, C3S, C2S, C3A, C4AF
Kiln operating variables:  Kiln feed rate, kiln current, kiln RPM, feed inlet temperature, coal feed rate

The presence of irrelevant (or less relevant) variable data in the input data set introduces noise, which may result in deterioration of the model. For neural network modeling, a reduction in the input data dimension leads to a simplified neural architecture and reduced training time [3]. One approach to reducing the input variable dimension is to apply statistical methods such as principal component analysis (PCA) or partial least squares (PLS), which result in a low dimension input space. However, the problem associated with such methods is that the new variables, being combinations of the original variables, are difficult to interpret in terms of actual process variables [4]. Therefore, based on prior process knowledge and in consultation with plant operators, the variables shown in Table 2 were retained for subsequent model development. The raw meal quality parameters along with the kiln operating variables are the inputs to the kiln model, and the clinker quality parameters are the outputs of the model.

The next task is the handling of outliers and missing data. These problems are more likely to occur in online measurement than in laboratory measurement. Outliers are sensor values which deviate from the typical, or sometimes also meaningful, ranges of the measured values. The raw data extracted from the plant database are often contaminated with outliers, which may result from one or more of hardware failure, process disturbances or changes in operating conditions, instrument degradation, transmission problems and/or human error [1]. Outliers may lead to model misspecification, biased parameter estimation and incorrect analysis results [5]. In this work also, the collected industrial data suffer from the problem of outliers, as shown in Fig. 1. Three popular outlier detection techniques [6,7], the 3σ method, Hampel's method and the box plot method, were applied to each of the online operating variable data sets. The 3σ method flags as outliers the data values satisfying the following inequality:

\[ \frac{|x_i - \bar{x}|}{\sigma} > 3 \tag{1} \]

The corresponding criterion for Hampel's method is presented in Eq. (2):

\[ \frac{|x_i - x_{0.5}|}{1.4826 \times \mathrm{median}\,|x_i - x_{0.5}|} > 3 \tag{2} \]

In the box plot method, the different regions in the plot are defined as

\[
\begin{aligned}
\text{Lower inner fence} &: Q_{0.25} - 1.5\,(Q_{0.75} - Q_{0.25}) \\
\text{Upper inner fence} &: Q_{0.75} + 1.5\,(Q_{0.75} - Q_{0.25}) \\
\text{Lower outer fence} &: Q_{0.25} - 3\,(Q_{0.75} - Q_{0.25}) \\
\text{Upper outer fence} &: Q_{0.75} + 3\,(Q_{0.75} - Q_{0.25})
\end{aligned} \tag{3}
\]

where Q_0.25 and Q_0.75 are the 25% and 75% quantiles respectively. A mild outlier is a point beyond an inner fence on either side, while an extreme outlier is a point beyond an outer fence.

The performance of the three techniques was evaluated by determining the skewness and kurtosis values of the data obtained after removal of the outliers by a particular technique. The skewness of a data set is defined as

\[ \gamma = \frac{\sum (x_i - \bar{x})^3}{n\,\sigma^3} \tag{4} \]
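For illustration, the three outlier detection rules of Eqs. (1)-(3) can be implemented in a few lines of Python. The sketch below is not taken from the study; the function names and the data vector are hypothetical, and it simply applies each rule to a single operating variable record:

```python
import numpy as np

def three_sigma_outliers(x):
    """Flag points more than 3 standard deviations from the mean, Eq. (1)."""
    x = np.asarray(x, dtype=float)
    return np.abs(x - x.mean()) / x.std() > 3

def hampel_outliers(x):
    """Flag points using the median and the median absolute deviation (MAD), Eq. (2)."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = 1.4826 * np.median(np.abs(x - med))
    return np.abs(x - med) / mad > 3

def box_plot_outliers(x, extreme=False):
    """Flag points outside the inner (mild) or outer (extreme) fences, Eq. (3)."""
    x = np.asarray(x, dtype=float)
    q25, q75 = np.percentile(x, [25, 75])
    k = 3.0 if extreme else 1.5
    lower, upper = q25 - k * (q75 - q25), q75 + k * (q75 - q25)
    return (x < lower) | (x > upper)

# Hypothetical kiln current readings, used only to exercise the three detectors
kiln_current = np.array([410.0, 412.5, 409.8, 650.0, 411.2, 408.9, 150.0, 410.7])
for name, mask in [("3-sigma", three_sigma_outliers(kiln_current)),
                   ("Hampel", hampel_outliers(kiln_current)),
                   ("Box plot", box_plot_outliers(kiln_current))]:
    print(name, "outliers:", kiln_current[mask])
```

Because the Hampel rule replaces the mean and standard deviation with the median and MAD, it is far less influenced by the outliers it is trying to detect, which is consistent with the comparison reported in Table 3.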

Fig. 1. Actual online data of cement kiln operating parameters.

Table 3
Skewness (γ) and kurtosis (κ) values for raw and treated data.

Kiln operating parameter      Raw data            3-sigma method      Hampel's method     Box plot rule
                              γ        κ          γ        κ          γ        κ          γ        κ
Kiln feed rate                -3.842   18.442     -3.79    19.119     -1.181   4.12       -1.669   5.658
Kiln current                  32.946   1105.4     -7.318   75.024     -0.056   2.676      0.068    4.716
Kiln RPM                      -3.141   13.673     -2.50    9.497      -1.317   3.9231     -1.892   6.2741
Kiln feed inlet temperature   9.954    110.73     14.341   227.488    -0.624   3.51       -1.038   4.70
Coal feed rate                34.577   1196.7     -3.862   21.554     -0.671   4.321      -1.429   6.47

Fig. 2. Comparison of the three outlier detection techniques.

The kurtosis is defined as

\[ \kappa = \frac{\sum (x_i - \bar{x})^4}{n\,\sigma^4} \tag{5} \]

Skewness is a measure of the symmetry of a data set, and for normally distributed data the skewness value is zero. Similarly, kurtosis represents the extent of peakedness or flatness of a data set, and for normally distributed data it has a value of 3. The skewness and kurtosis values for the raw and treated data are shown in Table 3. As evident from Table 3 and Fig. 2, the Hampel identifier detects the outliers present in the data set most efficiently.
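The two statistics of Eqs. (4) and (5) can be checked with a short Python snippet; this is a simple illustration on a made-up data vector (not plant data) and uses the biased 1/n moment estimators of the text rather than library defaults:

```python
import numpy as np

def skewness(x):
    """Skewness per Eq. (4): third central moment over n*sigma^3."""
    x = np.asarray(x, dtype=float)
    s = x.std()                      # population (1/n) standard deviation
    return np.sum((x - x.mean()) ** 3) / (len(x) * s**3)

def kurtosis(x):
    """Kurtosis per Eq. (5): fourth central moment over n*sigma^4 (equals 3 for a normal distribution)."""
    x = np.asarray(x, dtype=float)
    s = x.std()
    return np.sum((x - x.mean()) ** 4) / (len(x) * s**4)

# Made-up record standing in for a treated kiln operating variable
x = np.array([410.2, 411.0, 409.7, 410.5, 412.1, 408.9, 410.8])
print("skewness:", round(skewness(x), 3), "kurtosis:", round(kurtosis(x), 3))
```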
The removal of the detected outliers resulted in missing values in the data set. Missing data were imputed by linear interpolation [8] between the data preceding and following them. While laboratory data for the quality parameters were available at an interval of every 2-3 h, online data are recorded in the database history almost every 20 min. Therefore, for the same time period for which only a few hundred laboratory quality data were available, process variable data were available in excess of a thousand. In order to bring all the neural network input variables to the same dimension, linear interpolation was used to estimate the online process variables at the time instants when the laboratory data were available.

After performing the data preprocessing operations of variable selection, outlier detection and removal, and missing value imputation, data normalization was finally carried out so as to ensure equal relevance for all variables. Sola and Sevilla [9] have shown how the choice of normalization technique affects the predictive capability of the network. In this case, since logistic sigmoidal functions [1/(1 + e^{-x})], having a range from 0 to 1, are used for the hidden layer neurons of the BP network, the data normalization was done in the range 0-1 using the following formula:

\[ x_{\mathrm{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{6} \]

The final normalized input data are shown in Fig. 3.

Fig. 3. Final processed data to be used for model development.
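The two preprocessing steps just described, linear interpolation of the online records to the laboratory sampling instants and min-max scaling per Eq. (6), can be sketched in Python roughly as follows. This is a minimal illustration under our own assumptions, with hypothetical variable names and synthetic readings, not the plant's actual code:

```python
import numpy as np

def interpolate_to_lab_times(online_times, online_values, lab_times):
    """Estimate an online process variable at the laboratory sampling instants
    by linear interpolation between the preceding and following records."""
    return np.interp(lab_times, online_times, online_values)

def min_max_normalize(x):
    """Scale a variable to the 0-1 range, Eq. (6)."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

# Hypothetical example: kiln current logged every 20 min, lab samples every 2 h
online_times = np.arange(0, 480, 20)                  # minutes
online_current = 410 + 5 * np.sin(online_times / 60)  # made-up readings
lab_times = np.arange(0, 480, 120)

current_at_lab_times = interpolate_to_lab_times(online_times, online_current, lab_times)
current_normalized = min_max_normalize(current_at_lab_times)
print(current_normalized)
```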
4. Neural network design

A neural network model requires a complete input-output data set. In the present study, the raw meal quality parameters and kiln operating variables are the inputs to the network and the clinker quality parameters are the outputs. A total of 223 input-output data sets were prepared, out of which 156 were used for training the neural network models and 67 for model validation. The three neural networks described below were developed based on this collected and processed data (Fig. 3).

4.1. Back propagation neural network

The back propagation algorithm needs no detailed explanation here since, to date, it has been the most popular neural network design algorithm. Interested readers can refer to Samarasinghe [10] for the calculation details of the back propagation algorithm. Neural network models using the back propagation algorithm have been developed for estimation of PET viscosity in a polymerization process [3], pollutant emission in a cement plant [11], process output variables of a water desalination plant [12], water content of natural gases [13], crude oil viscosity [14], river flow [15], polymer properties [16], enzyme activity and biomass concentration [17], and molten steel temperature [18].

The number of neurons in the input layer and the output layer are the same as the number of input variables and output variables of the process. Choosing the number of hidden layers and the number of neurons in each hidden layer is the most critical decision for a successful design of a BPNN. Unfortunately, there is no universal method to determine the optimum network topology, and these are mostly decided by a trial and error procedure so as to produce the least error [3,13,19]. Usually a single hidden layer is used to solve function approximation problems, and if the performance goal is not attained with a single hidden layer, the number of hidden layers can be gradually increased. The more hidden layers used, the greater the associated complexity and the larger the training time required. Too few neurons in the hidden layer lead to poor model accuracy, whereas too many neurons result in model over fitting and poor generalization. In the present work, however, the use of two hidden layers produced significantly less error as compared to the use of only one hidden layer. The number of neurons was determined by conducting model training for different numbers of neurons ranging from 3 to 20 and choosing the one producing the least error. The final optimum architecture, based on a large number of simulations, is given in Table 4 and shown in Fig. 4.

Table 4
BPNN details.

No. of input nodes                               9
No. of output nodes                              8
No. of hidden layers                             2
No. of neurons in 1st hidden layer               9
No. of neurons in 2nd hidden layer               12
Activation function for the two hidden layers    Sigmoidal
Activation function for output layer             Linear
Training algorithm used                          Scaled conjugate gradient

Fig. 4. Back propagation neural network architecture.
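As a rough illustration of the architecture summarized in Table 4, the following Python sketch builds a comparable multilayer perceptron with scikit-learn. The tooling is our assumption (the authors do not state their software), and scikit-learn offers no scaled conjugate gradient solver, so L-BFGS is used as a stand-in; the random arrays merely stand in for the normalized plant data:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# 9 normalized inputs (4 raw meal + 5 kiln operating variables),
# 8 normalized clinker quality outputs; random data stand in for plant records.
rng = np.random.default_rng(0)
X_train, Y_train = rng.random((156, 9)), rng.random((156, 8))
X_val, Y_val = rng.random((67, 9)), rng.random((67, 8))

bpnn = MLPRegressor(
    hidden_layer_sizes=(9, 12),   # two hidden layers as in Table 4
    activation="logistic",        # sigmoidal hidden neurons; the output layer is linear
    solver="lbfgs",               # stand-in for scaled conjugate gradient
    max_iter=2000,
    random_state=0,
)
bpnn.fit(X_train, Y_train)
print("Validation MSE:", np.mean((bpnn.predict(X_val) - Y_val) ** 2))
```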
4.2. RBF neural network

Multilayer feed forward networks with sigmoidal activation functions have been proven to be universal approximators and are mostly trained by the back propagation method using a gradient descent algorithm. However, the disadvantages of BP neural networks are the following [20-22]: excessive computational or training time due to the use of non-linear optimization techniques, and the possibility of getting trapped in local minima resulting in a sub-optimal solution. Although the use of a genetic algorithm may lead to the global minimum, the procedure is computation intensive.

Radial basis function networks are a class of feed forward supervised networks. An RBF network is a two layer network consisting of an input layer, a hidden layer and an output layer with linear parameters. Non-linear basis functions are used at the hidden layer neurons. A center is associated with every hidden layer node. Hidden layer nodes calculate the Euclidean distance between the center and the input vector, which is sent as an input to the basis function. The different types of basis functions used are as follows [20]:

\[ \text{Thin plate spline function}: \; \varphi(x) = x^2 \log(x) \tag{7} \]

\[ \text{Gaussian function}: \; \varphi(x) = \exp\!\left(-\frac{x^2}{\sigma^2}\right) \tag{8} \]




\[ \text{Normalized Gaussian function}: \; \varphi_i(x) = \frac{\exp\!\left(-\lVert x - c_i \rVert^2 / \sigma_i^2\right)}{\sum_{j=1}^{n} \exp\!\left(-\lVert x - c_j \rVert^2 / \sigma_j^2\right)} \tag{9} \]

\[ \text{Multiquadratic function}: \; \varphi(x) = \left(x^2 + \sigma^2\right)^{1/2} \tag{10} \]

\[ \text{Inverse multiquadratic function}: \; \varphi(x) = \left(x^2 + \sigma^2\right)^{-1/2} \tag{11} \]

σ is the scaling parameter or width, which controls the spread of the function around the center. Out of the above functions, the Gaussian type is mostly used as the activation function for the hidden layer nodes. So for an input vector x, the network output is given as

\[ \hat{y} = f(x) = \sum_{i=1}^{n} w_i \, \varphi\!\left(\lVert x - c_i \rVert\right) \tag{12} \]

where w_i is the weight associated with the ith RBF center and \lVert x - c_i \rVert is the Euclidean distance between center c_i and the input vector x.

The linear parameters used in RBF networks result in faster training and fewer convergence problems in comparison to BP neural networks. RBF networks also have better approximation ability with a simpler network architecture as compared to MLPs [23]. Selection of an appropriate radial basis network requires careful selection of the basis function and its associated parameters (centers and widths). The performance of an RBF network largely depends on the centers chosen. As a strict interpolator the network must have as many RBF centers as the training data. However, this results in a large structure when the data are plentiful, and leads to over fitting of the data and poor generalization capability of the network. On the other hand, the use of very few centers results in under fitting of the data [24].

The centers and widths can be obtained using the k-means clustering algorithm or density estimation methods. This involves classifying the input data into k clusters. The cluster centers are determined by minimizing the total squared error incurred in representing the data set by k cluster centers. However, the drawbacks of this standard algorithm are that many passes over all the training data are required for determining the hidden nodes, resulting in a large computational time for a large data set [23]. Moreover, though this method gives faster training, it results in a local optimum, yielding suboptimal models [25,26]. The second category makes use of algorithms to determine the network structure as well as the parameters. Some of the proposed algorithms are the orthogonal least squares algorithm [20,22], genetic algorithms [27], individual training of each hidden unit based on functional analysis, and fuzzy partition of the input space followed by linear regression [23].

In the present work, a two layered feed forward neural network was constructed. The first layer has radial basis neurons with the Gaussian activation function given in Eq. (8) to perform the non-linear transformation of the input signal. The second layer has linear neurons which produce linear outputs. The following iteration is performed until the network's mean squared error falls below the goal or the maximum number of neurons is reached:

(1) The network is simulated with no neurons in the first layer.
(2) The input vector with the greatest error is determined.
(3) A radial basis neuron is added with weights equal to that vector.
(4) The output layer weights are redesigned to minimize the error.

In the present case the goal was set to zero and the maximum number of neurons to 70. A larger spread smoothens the function approximation, while a too small spread value leads to the use of more neurons to fit a smooth function. After adequate trial and error, an optimum spread value of 0.4 was used in the present case.
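The constructive RBF training procedure described above (Gaussian basis functions, one neuron added at a time at the worst-fit input, linear output weights re-solved at each step) can be sketched in Python as follows. This is a simplified illustration under our own assumptions, not the exact routine used in the study; the stand-in data and function names are hypothetical:

```python
import numpy as np

def train_rbf(X, Y, spread=0.4, max_neurons=70, goal=0.0):
    """Greedy RBF construction: add a Gaussian neuron at the input with the
    largest error, then re-solve the linear output weights by least squares."""
    centers, sigma = [], spread
    W = None  # output weights, solved by least squares once neurons exist

    def design_matrix(X):
        # Gaussian basis of Eq. (8) evaluated at every center, plus a bias column
        d = np.array([[np.exp(-np.sum((x - c) ** 2) / sigma**2) for c in centers] for x in X])
        return np.hstack([d, np.ones((len(X), 1))])

    for _ in range(max_neurons):
        pred = design_matrix(X) @ W if centers else np.tile(Y.mean(axis=0), (len(X), 1))
        err = np.sum((Y - pred) ** 2, axis=1)
        if err.mean() <= goal:
            break
        centers.append(X[np.argmax(err)])   # neuron placed at the worst-fit input
        H = design_matrix(X)
        W, *_ = np.linalg.lstsq(H, Y, rcond=None)
    return centers, W, sigma

def predict_rbf(X, centers, W, sigma):
    H = np.array([[np.exp(-np.sum((x - c) ** 2) / sigma**2) for c in centers] for x in X])
    return np.hstack([H, np.ones((len(X), 1))]) @ W

# Hypothetical stand-in data with the dimensions used in the paper (9 inputs, 8 outputs)
rng = np.random.default_rng(1)
X_train, Y_train = rng.random((156, 9)), rng.random((156, 8))
centers, W, sigma = train_rbf(X_train, Y_train)
print("Neurons used:", len(centers))
```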
4.3. Generalized regression neural network

The generalized regression neural network (GRNN), first proposed by Specht [28], is a powerful tool for non-linear function approximation. In the general regression algorithm, the form of the input-output dependence is expressed as a probability density function determined from the observed data.

The algorithm has the form [28]:

\[ \hat{y} = \frac{\sum_{i=1}^{n} y_i \exp\!\left(-\frac{D_i^2}{2\sigma^2}\right)}{\sum_{i=1}^{n} \exp\!\left(-\frac{D_i^2}{2\sigma^2}\right)} \tag{13} \]

where D_i^2, the Euclidean distance between two input vectors, is given as

\[ D_i^2 = \left(x_i - x_j\right)^{T} \left(x_i - x_j\right) \tag{14} \]

The regression equations can be implemented in a neural network like structure, which is then known as a GRNN. Fig. 5 shows the typical structure of a GRNN.

Fig. 5. Generalized regression neural network structure.

A typical GRNN has four layers: an input layer, a pattern layer, a summation layer and the output layer. The input layer has the same number of neurons as the number of input variables, and the pattern layer has the same number of neurons as the number of training cases. Pattern neurons compute a distance, which is the square of the differences across all weights as described in Eq. (14). For the jth pattern neuron, the net input is

\[ I_j = \sum_{i=1}^{n} \left(w_{ij} - x_i\right)^2 \tag{15} \]

The activation function associated with the pattern neuron is exponential and can be written as exp(-I_j/2σ^2). The choice of the smoothing function or spread parameter σ is critical to the successful design of a GRNN. A large value of σ results in more generalization and smoother fitting, whereas a low value results in more accurate fitting and poor generalization. The method suggested for optimum selection of σ is the hold out or leave one out method [28,29]. The summation neurons calculate the sum of the weighted inputs from the pattern layer. There have been applications of GRNN modeling for estimation of crude oil viscosity [14], river flow [15], polymer properties [16], soil quality [29], river sediments [30], coal grindability [31], plasma process parameters [32], water quality [33], compressive strength and elasticity modulus [34], and NOx emission [35].
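A compact way to see Eqs. (13)-(15) in action is the following Python sketch of a GRNN predictor, together with a leave-one-out style search over the spread (the selection method cited above). It is our own minimal illustration; the toy data, candidate spreads and variable names are assumptions, not taken from the study:

```python
import numpy as np

def grnn_predict(X_train, Y_train, X_query, sigma):
    """Generalized regression per Eq. (13): a Gaussian-kernel weighted average
    of the training outputs, with squared Euclidean distances as in Eq. (14)."""
    preds = []
    for x in X_query:
        d2 = np.sum((X_train - x) ** 2, axis=1)   # squared distances to all training cases
        w = np.exp(-d2 / (2.0 * sigma**2))        # pattern layer activations
        preds.append(w @ Y_train / w.sum())       # summation / output layers
    return np.array(preds)

# Hypothetical stand-in data with the paper's dimensions (9 inputs, 8 outputs)
rng = np.random.default_rng(2)
X_train, Y_train = rng.random((156, 9)), rng.random((156, 8))
X_val = rng.random((67, 9))
Y_hat = grnn_predict(X_train, Y_train, X_val, sigma=4.9)  # spread value reported in Section 6
print(Y_hat.shape)

# Leave-one-out style search for the spread (assumed candidate values)
candidate_sigmas = [0.5, 1.0, 2.0, 4.9, 8.0]
loo_mse = []
for s in candidate_sigmas:
    errs = [np.mean((grnn_predict(np.delete(X_train, k, 0), np.delete(Y_train, k, 0),
                                  X_train[k:k + 1], s) - Y_train[k]) ** 2)
            for k in range(len(X_train))]
    loo_mse.append(np.mean(errs))
print("Best spread:", candidate_sigmas[int(np.argmin(loo_mse))])
```

Because the GRNN stores the training cases directly in its pattern layer, the only quantity to tune is the spread, which is why its selection dominates the model's generalization behavior.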
5. Results

Hampel's identifier, which uses the outlier resistant median and median absolute deviation (MAD) values instead of the outlier sensitive mean and standard deviation, is the more effective identifier. It can therefore be concluded that if the number of outliers in a very large data set is small, so that the values of the mean and standard deviation are affected insignificantly, the three sigma method can be used; otherwise it is better to use Hampel's identifier.

The total collected data (online process variable data from the database history and quality data from the laboratory), after outlier removal, missing value imputation and data normalization, were split into two parts. Two thirds of the data (156 data values) were used for model development (training data) and one third (67 data values) were used for model validation. While selecting the data for neural network training, it was ensured that the highest and the lowest values of each variable were retained in the training set, so that the developed model can be used over a wider operating range. Since empirical models do not extrapolate well, one should take care while using the model that the input data actually fall within the range used for model development.
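A splitting strategy of this kind, where the rows carrying each variable's minimum and maximum are forced into the training set before the remainder is assigned at random, can be sketched as follows. This is an illustration under our own assumptions (the paper does not give its exact selection procedure), with synthetic data of the same dimensions:

```python
import numpy as np

def split_keep_extremes(X, Y, train_fraction=0.7, seed=0):
    """Random train/validation split that forces the rows containing each
    variable's minimum and maximum into the training set, so the model is
    not asked to extrapolate during validation."""
    n = len(X)
    must_train = sorted(set(np.argmin(X, axis=0)) | set(np.argmax(X, axis=0)))
    rng = np.random.default_rng(seed)
    remaining = np.array([i for i in range(n) if i not in must_train])
    rng.shuffle(remaining)
    n_train = int(round(train_fraction * n))
    extra = remaining[: max(0, n_train - len(must_train))]
    train_idx = np.concatenate([np.array(must_train, dtype=int), extra])
    val_idx = np.setdiff1d(np.arange(n), train_idx)
    return X[train_idx], Y[train_idx], X[val_idx], Y[val_idx]

# Hypothetical example with the paper's data dimensions (223 samples, 9 inputs, 8 outputs)
rng = np.random.default_rng(3)
X, Y = rng.random((223, 9)), rng.random((223, 8))
X_tr, Y_tr, X_va, Y_va = split_keep_extremes(X, Y, train_fraction=156 / 223)
print(len(X_tr), "training samples,", len(X_va), "validation samples")
```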
Figs. 6 and 7 compare the estimation capability of the three neural network models with respect to the trained and untrained data respectively. The performance of the models is determined by evaluating the mean squared error (MSE) produced by each model for the trained and untrained data. The MSE value is given as

\[ \mathrm{MSE} = \frac{\sum_{i=1}^{n} \left(y - \hat{y}\right)^2}{n} \tag{16} \]

Table 5 shows the minimum mean squared error achieved by the three models. Actual variable values are used in the graphs, whereas the mean squared error values are based on the normalized values.

Table 5
Mean squared error comparison of the neural network soft sensors.

Type of neural network     MSE for training data    MSE for untrained data
Back propagation           0.0068                   0.0693
Radial basis               0.0086                   0.03
Generalized regression     0.0038                   0.039

Further analysis of the estimation capabilities of the developed models was done by computing the variance accounted for (VAF) values of the models for the unknown data. The VAF values for the different estimated parameters are presented in Table 6.

6. Discussion

As stated earlier, the skewness and kurtosis values for normally distributed data should be 0 and 3 respectively. Therefore it can be concluded that the outlier detection technique producing skewness and kurtosis results closest to these values is the most effective. From Table 3 the superiority of Hampel's method over the other two methods is verified.

Training of the neural network can be viewed as a parameter estimation technique to obtain the best model. But before choosing the model the following must be determined: Does the model perform well with the untrained data? Is the best developed model suitable enough for the given process application?




Fig. 6. Actual and estimated values of clinker quality parameters for training data.

The answer to the first question is given by the model validation process, which was carried out by simulating the models with unknown data and determining the error or residual value between the actual and model estimated values. The mean squared error values produced by the different models for the validation data are given in Table 5.

One interesting observation is that, as far as simulation of the networks for the training data is concerned, all three models produce almost comparable results (Fig. 6 and 2nd column of Table 5). But the important aspect where the BP network model lags behind the radial basis network and the regression network models is the generalization capability of the models (Fig. 7 and 3rd column of Table 5), i.e. how well the models perform when they are supplied with data not used for the training. As stated earlier, the optimum selection of the spread coefficient is crucial to the success of the GRNN model. Theoretically, a decrease in the value of sigma (σ) will increase the error value for untrained data (poor generalization) and decrease the error value for trained data (over fitting). In this case, a decrease in σ below the optimum value of 4.9 resulted in a decrease in MSE for the trained data but an increase in MSE for the unknown data. However, a value beyond the optimum did not result in better generalization but only in a significant increase in MSE for the trained data.

It is quite obvious from Table 5 that the RBFN and GRNN clearly outperform the network model trained by the back propagation method.

Fig. 7. Actual and estimated values of clinker quality parameters for validation data.

The performance of a model is assessed by its ability to generalize. However, there is only a marginal error difference between the RBFN and GRNN models in the validation process. To have a clear answer to whether the best designed model is good enough for the purpose, further analysis of the model performances was done by computing the variance accounted for (VAF) [36] of the different models in predicting the clinker quality parameters. The VAF is defined as

\[ \mathrm{VAF} = \left(1 - \frac{\mathrm{var}\!\left(y - \hat{y}\right)}{\mathrm{var}\!\left(y\right)}\right) \times 100 \tag{17} \]

The closer the value of VAF is to 100, the better is the model. Detailed analysis of the results in Table 6 reveals that the BPNN model has very low generalization capability. The highly negative VAF values for the BPNN model are due to the fact that the model exhibits highly erratic behavior in estimating outputs from unknown inputs, resulting in a much higher variance for the residuals (y - ŷ) than for the actual output. Except for the free lime content, all other clinker quality parameters are better predicted by the RBFN than by the GRNN, as evident from the higher VAF values for the RBFN model.
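For completeness, the two performance metrics of Eqs. (16) and (17) can be computed with a few lines of Python; this is a simple illustration on hypothetical arrays, not code from the study:

```python
import numpy as np

def mse(y, y_hat):
    """Mean squared error, Eq. (16)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean((y - y_hat) ** 2)

def vaf(y, y_hat):
    """Variance accounted for, in percent, Eq. (17)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return (1.0 - np.var(y - y_hat) / np.var(y)) * 100.0

# Hypothetical example: actual vs. estimated free lime values
y_actual = np.array([1.2, 1.5, 1.1, 1.8, 1.4])
y_model = np.array([1.3, 1.4, 1.2, 1.7, 1.5])
print("MSE:", mse(y_actual, y_model), "VAF:", vaf(y_actual, y_model))
```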

Table 6
VAF values of the model simulated outputs for unknown input data.

Quality parameter              VAF values for different neural network models
                               BPNN         RBFN      GRNN
Free lime (FCaO)               -981.634     54.612    67.457
Lime Saturation Factor (LSF)   -1206.493    61.33     39.14
Alumina Modulus (ALM)          -1648.77     66.134    46.457
Silica Modulus (SiM)           -3321.1      39.7      29.06
C3S                            -3689.38     42.6      6.94
C2S                            -3281.34     43.14     14.06
C3A                            -2949.07     32.53     23.37
C4AF                           -3316.57     63.38     49.92

The final issue regarding online implementation of soft sensors in the process has been discussed by several researchers [3,37]. A properly trained and validated soft sensor should be able to make real time estimates of the clinker quality when supplied with the values of the kiln operating parameters, as measured by the physical hardware sensors, and the raw mix quality values. The estimated clinker quality values can then be used by the control system to manipulate the kiln operating parameters so as to maintain the desired clinker quality. Fig. 8 describes the online implementation of the soft sensor.

Fig. 8. Online implementation of soft sensor.

7. Conclusion

Online estimation of clinker quality will greatly help in reducing the production of poor quality clinker. Unfortunately, online estimators for the same are not available. In the present study, neural network based soft sensors have been developed for online prediction of clinker quality. Three types of neural network were developed based on the actual input-output data of a rotary cement kiln taken from a cement plant producing 10,000 t of clinker per day. It was observed that all three networks perform satisfactorily for the known data. However, the widely used back propagation neural network model fails miserably in the model validation step for accurate online estimation of clinker quality, as compared to the radial basis function neural network and the regression neural network models. The RBF model performance is better than that of the regression network model. The developed RBF model can provide the plant operators with approximate clinker quality values so as to enable proper maintenance of clinker quality.

Acknowledgment

The authors thank the management of Ultratech Cements, Kotputli Cement Works, Rajasthan, India, for providing the online and laboratory data related to the cement kiln for the research work.

References

[1] Lin B, Recke B, Knudsen J, Jorgensen SB. A systematic approach for soft sensor development. Computers and Chemical Engineering 2007;31:419-25.
[2] Qiao J, Fang Z, Chai T. LS-SVR based soft sensor model for cement clinker calcination process. International Conference on Measuring Technology and Mechatronics Automation (ICMTMA), vol. 2; 2010. p. 591-4.
[3] Gonzaga JCB, Meleiro LAC, Kiang C, Filho RM. ANN-based soft-sensor for real-time process monitoring and control of an industrial polymerization process. Computers and Chemical Engineering 2009;33:43-9.
[4] Delgado MR, Nagai EY, Arruda LVR. A neuro-coevolutionary genetic fuzzy system to design soft sensors. Soft Computing 2009;13:481-95.
[5] Liu H, Shah S, Jiang W. Online outlier detection and data cleaning. Computers and Chemical Engineering 2004;28:1635-47.
[6] Pearson RK. Exploring process data. Journal of Process Control 2001;11:179-94.
[7] Pearson RK. Outliers in process modeling and identification. IEEE Transactions on Control Systems Technology 2002;10:55-63.
[8] Wang D, Liu J, Srinivasan R. Data driven soft sensor approach for quality prediction in a refining process. IEEE Transactions on Industrial Informatics 2010;6:11-7.
[9] Sola J, Sevilla J. Importance of input data normalization for the application of neural networks to complex industrial problems. IEEE Transactions on Nuclear Science 1997:1464-8.
[10] Samarasinghe S. Neural networks for applied sciences and engineering: from fundamentals to complex pattern recognition. Auerbach Publications, Taylor & Francis Group; 2007.
[11] Marengo E, Bobba M, Robotti E, Liparota MC. Modeling of the polluting emissions from a cement production plant by partial least-squares, principal component regression, and artificial neural networks. Environmental Science and Technology 2006;40:272-80.
[12] Al-Shayji KA, Liu YA. Predictive modeling of large-scale commercial water desalination plants: data-based neural network and model-based process simulation. Industrial and Engineering Chemistry Research 2002;41:6460-74.
[13] Mohammadi AH, Richon D. Use of artificial neural networks for estimating water content of natural gases. Industrial and Engineering Chemistry Research 2007;46:1431-8.
[14] Elsharkwy AM, Gharbi RBC. Comparing classical and neural regression techniques in modeling crude oil viscosity. Advances in Engineering Software 2001;32:215-24.
[15] Kisi O, Cigizoglu HK. Comparison of different ANN techniques in river flow prediction. Civil Engineering and Environmental Systems 2007;24:211-31.
[16] Roy NK, Potter WD, Landau DP. Polymer property prediction and optimization using neural networks. IEEE Transactions on Neural Networks 2006;17:1001-14.
[17] Linko S, Luopa J, Zhu YH. Neural networks as software sensors in enzyme production. Journal of Biotechnology 1997;52:257-66.
[18] Tian H, Mao Z, Wang S, Li K. Application of genetic algorithm combined with BP neural network in soft sensor of molten steel temperature. The Sixth World Congress on Intelligent Control and Automation (WCICA) 2006;2:7742-5.
[19] Bahar A, Ozgen C. Artificial neural network estimator design for the inferential model predictive control of an industrial distillation column. Industrial and Engineering Chemistry Research 2004:6102-11.
[20] Chen S, Cowan CFN, Grant PM. Orthogonal least squares learning algorithm for radial basis function networks. IEEE Transactions on Neural Networks 1991;2:302-9.
[21] Gurumoorthy A, Kosanovich KA. Improving the prediction capability of radial basis function networks. Industrial and Engineering Chemistry Research 1998;37:3956-70.
[22] Samanta B. Radial basis function network for ore grade estimation. Natural Resources Research 2010;19:91-102.
[23] Sarimveis H, Alexandridis A, Tsekouras G, Bafas G. A fast and efficient algorithm for training radial basis function neural networks based on a fuzzy partition of the input space. Industrial & Engineering Chemistry Research 2002;41:751-9.
[24] Ghodsi A, Schuurmans D. Automatic basis selection techniques for RBF networks. Neural Networks 2003;16:809-16.
[25] Marinaro M, Scarpetta S. On-line learning in RBF neural networks: a stochastic approach. Neural Networks 2000;13:719-29.
[26] Li C, Ye H, Wang G. Nonlinear time series modeling and prediction using RBF network with improved clustering algorithm. IEEE International Conference on Systems, Man and Cybernetics, vol. 4; 2004. p. 3513-8.
[27] Billings SA, Zheng GL. Radial basis function network configuration using genetic algorithms. Neural Networks 1995;8:877-90.
[28] Specht DF. A general regression neural network. IEEE Transactions on Neural Networks 1991;2:568-76.
[29] Goh ATC. Soil laboratory data interpretation using generalized regression neural network. Civil Engineering and Environmental Systems 1999;16:175-95.
[30] Cigizoglu HK, Alp M. Generalized regression neural network in modeling river sediment yield. Advances in Engineering Software 2006;37:63-8.
[31] Peisheng L, Youhui X, Dunxi Y, Xuexin S. Prediction of grindability with multivariable regression and neural network in Chinese coal. Fuel 2005;84:2384-8.

[32] Kim B, Kwon M, Kwon SH. Modeling of plasma process data using a multi-parameterized generalized regression neural network. Microelectronic Engineering 2009;86:63-7.
[33] Palani S, Liong SY, Tkalich P. An ANN application for water quality forecasting. Marine Pollution Bulletin 2008;56:1586-97.
[34] Dehghan S, Sattari Gh, Chelgani CS, Aliabadi MA. Prediction of uniaxial compressive strength and modulus of elasticity for Travertine samples using regression and artificial neural networks. Mining Science and Technology 2010;20:41-6.
[35] Zheng L, Yu S, Yu M. Monitoring NOx emissions from coal fired boilers using generalized regression neural network. The 2nd International Conference on Bioinformatics and Biomedical Engineering, ICBBE; 2008. p. 1916-9.
[36] Erzin Y, Hanumantha Rao B, Singh DN. Artificial neural network models for predicting soil thermal resistivity. International Journal of Thermal Sciences 2008;47:1347-58.
[37] Rallo R, Ferre-Gine J, Arenas A, Giralt F. Neural virtual sensor for the inferential prediction of product quality from process variables. Computers and Chemical Engineering 2002;26:1735-54.
