Article history: Received 20 August 2014; Received in revised form 20 October 2014; Accepted 27 November 2014; Available online 23 December 2014

Abstract: This paper describes the selection of the training function of an artificial neural network (ANN) for modeling the heat transfer prediction of a horizontal tube immersed in a gas–solid fluidized bed of large particles. The ANN model was developed to study the effect of fluidizing gas velocity on the average heat transfer coefficient between the fluidized bed and the horizontal tube surface. A feed-forward network with a back-propagation structure was implemented using the Levenberg–Marquardt learning rule in the neural network approach. The objective of this work is to compare the performance of five training functions (TRAINSCG, TRAINBFG, TRAINOSS, TRAINLM and TRAINBR) used in training the neural network for predicting the heat transfer coefficient. The comparison is made on the basis of percentage relative error, coefficient of determination, root mean square error and sum of the square error. The predictions by the training function TRAINBR were found to be in good agreement with the experimental values.

Keywords: Artificial neural network; Fluidized bed; Heat transfer coefficient; Training function

http://dx.doi.org/10.1016/j.ijheatmasstransfer.2014.11.085
0017-9310/© 2014 Elsevier Ltd. All rights reserved.
338 L.V. Kamble et al. / International Journal of Heat and Mass Transfer 83 (2015) 337–344
perceptron (MLP) developed by Rosenblatt is the popular network in many heat transfer applications [8–10]. The MLP consists of several artificial neurons arranged in two or more layers. The neurons are information-processing elements that are fundamental to MLP operation. The inputs of each neuron are added, and the result is transformed by an activation function that serves to limit the threshold of the neuron output. The output of each neuron is multiplied by the weight of the neuron concerned before being input to every neuron in the following layer. The network adapts by changing each weight by an amount proportional to the difference between the desired output and the actual output. The process of weight updating is called learning or training. The training process is achieved by applying a back-propagation (BP) procedure. There are several training algorithms using the BP procedure [11–18], with individual advantages such as calculation rate and requirements of computation and storage. Boniecki et al. [19] developed a neural model to predict ammonia emission from composted sewage sludge. For all of the selected models, the correlation coefficient reached high values of 0.972–0.981. The neural models developed by Boniecki et al. [20] to forecast the cow's milk yield proved to be the best predictive tool and were optimized with the conjugate gradients algorithm. The sensitivity analysis performed for the input variables of a network identifies the dominant input variable in the developed network. Krzywanski and Nowak [21] developed a model to predict the local heat transfer coefficient in the combustion chamber of a circulating fluidized bed boiler by an ANN approach. It was shown that neural networks give quick and accurate results for the input patterns provided, compared with the numerical models developed previously. A neural-network-based heat convection algorithm was successfully implemented by Zhang and Haghighat [22] to predict local average Nusselt numbers along the duct surfaces. This algorithm was also integrated with a transient three-dimensional heat transfer model, based on finite element analysis of heat conduction, to develop a new thermal modeling method for heat exchangers.

It is noted that no single algorithm suits all problems best. The performance of each algorithm depends on the process to be modeled, the learning sample and the training mode. The success of NN modeling depends on selecting the training function. The aim of this work is to study the selected training algorithms that use the BP procedure to optimize an ANN model. First the experimental setup is described, and then the ANN model and the training algorithms implemented in the study are explained. In this work, the authors compare the performance of five training functions, TRAINSCG, TRAINBFG, TRAINOSS, TRAINLM and TRAINBR, based on percentage relative error, root mean square error (RMSE), coefficient of determination (R²) and sum of square error (SSE).

2. Materials and procedure

2.1. Experimental set up

The schematic diagram of the experimental set up, shown in Fig. 1, consists of a rectangular fluidization column that is 0.1 m × 0.15 m in cross section and 0.4 m in height, with a horizontal brass tube installed at a height of 100 mm from the distributor plate. Air was used as the fluidizing gas at atmospheric pressure. The quality of fluidization was improved by providing a tapered diffuser and plenum section, thus minimizing the acceleration effects caused by the high flow rate. The setup was instrumented for measuring the bed temperature, the surface temperature of the heat transfer tube, the air flow rate and the electrical energy supplied to the tube. The particles chosen were mustard, raagi and bajara, of diameter 1.8 mm, 1.4 mm and 2.0 mm respectively. Drying these food grains using a fluidized bed is one of the growing areas where much investigation can be carried out using ANN modeling. The static bed height was 150 mm, and the duct was supported by a perforated distributor plate 4 mm thick at the bottom, which consisted of many small holes. A stainless steel screen with a mesh was placed above the distributor to obtain a more homogeneous distribution of the gas flow. The heat transfer tube was 110 mm in length and its outer diameter was 27.5 mm. A cartridge heater was inserted inside the bare single tube, and the heat input to the tube was controlled by a variable direct current power supply. The heat input was determined by measuring voltage (V) and current (I). The temperature of the bed was measured at three different heights in the bed, while two thermocouples were mounted on the tube surface at equal distances to measure the tube surface temperature. The pressure difference across the bed was measured using a water tube manometer. A constant heat input of 52.2 W was maintained throughout the experiment. The plenum chamber was made of 1.5 mm thick mild steel plate and was fixed to a flange at its top end to accommodate the distributor plate. A centrifugal blower of 0.75 kW capacity provided the air for fluidization.

2.2. Experimental heat transfer coefficient

The experimental heat transfer coefficient h (W/m² K) was calculated from the simple relation for the heat energy supplied:

Q = h A_t (T_t − T_b)    (1)

where Q is the measured tube heat input (W), A_t is the surface area of the tube (m²), T_t is the tube surface temperature (K) and T_b is the bed temperature (K).
2.3. Architecture of neural network

In this study a multilayer feed-forward ANN model [12] has been developed. The schematic diagram of the NN model selected for the current study is shown in Fig. 2. The network consists of an input layer with three neurons (particle diameter d_p (mm), fluidizing velocity u (m/s) and temperature difference between bed and tube surface ΔT (K)), an output layer of two neurons (heat transfer coefficient h (W/m² K) and Nusselt number Nu), and a hidden layer of five neurons. The activation function is tansig in the hidden layer, whereas purelin is used in the output layer with the TRAINLM training algorithm.

Fig. 2. Artificial neural network structure in the study.

2.4. Training algorithms

There are different BP training algorithms in the MATLAB ANN toolbox [23]. In this study, training and validation steps were carried out for each of the five different training functions (Table 1). The category of fast algorithms uses standard numerical optimization techniques such as conjugate gradient (TRAINSCG) [24,25], quasi-Newton (TRAINBFG, TRAINOSS) [26] and Levenberg–Marquardt (TRAINLM) [27,28]. The other conjugate gradient algorithms require a line search at each iteration, which is computationally expensive. Hence the scaled conjugate gradient algorithm (TRAINSCG), developed by Moller, was designed to avoid the time-consuming line search. The basic calculation step of the quasi-Newton methods, Eq. (2), converges faster than the conjugate gradient method, but it requires the calculation of the Hessian matrix A_k of the performance index at the current values of the weights and biases. This calculation is complex and expensive to perform.

X_{k+1} = X_k − A_k^{−1} g_k    (2)

The Broyden, Fletcher, Goldfarb, and Shanno (TRAINBFG) updating method and the Levenberg–Marquardt (TRAINLM) algorithm avoid this difficulty because they update an approximate Hessian matrix at each iteration of the algorithm. The Levenberg–Marquardt algorithm was designed to approach second-order training speed without having to compute the Hessian matrix. If the performance function has the form of a sum of squares, then the Hessian matrix can be approximated as

H = J^T J    (3)

and the gradient can be computed as

g = J^T e    (4)

so that the weight update becomes

X_{k+1} = X_k − [J^T J + μI]^{−1} J^T e    (5)

When the scalar μ is zero, this is just Newton's method, using the approximate Hessian matrix. When μ is large, this becomes gradient descent with a small step size. Newton's method is faster and more accurate near an error minimum, so the aim is to shift towards Newton's method as quickly as possible. Thus, μ is decreased after each successful step, that is, a reduction in
the performance function, and is increased only when a tentative step would increase the performance function. In this way, the performance function is always reduced at each iteration of the algorithm. The one-step secant (TRAINOSS) method is an improvement on the TRAINBFG method because it decreases the storage and computation in each iteration. The TRAINOSS method does not store the complete Hessian matrix; it assumes that the previous Hessian matrix was the identity matrix. This function has the additional advantage that the new search direction can be calculated without computing a matrix inverse. The Bayesian regularization (TRAINBR) method [29–31] improves the ANN generalization. In this method the weights and biases of the network are assumed to be random variables with specified distributions. The regularization parameters are related to the unknown variances associated with these distributions. This method involves a modification of the performance index. The performance index for the Bayesian regularization method is given by Eq. (6):

F = β E_d + α E_w    (6)

where α and β are parameters to be optimized, E_d is the mean sum of the squared network errors, and E_w is the sum of the squares of the network weights. One feature of this algorithm is that it provides a measure of how many network parameters (weights and biases) are being effectively utilized by the network.

Table 1. The description of the selected training algorithms for the study.

2.5. Methodology

The supervised training, in which a network is trained for a particular set of inputs to produce the desired outputs, was implemented in this study. Initially, the weights of the input vectors and the biases were chosen randomly; subsequently, the weights were adjusted to minimize the network performance function, i.e., the mean square error (MSE), with a performance level of 1 × 10⁻⁵. The training was considered complete when the NN reached the user-defined performance level. The network weights were updated using the BP algorithm as implemented by Sahoo et al. [14]. The published literature [13–15] on the implementation of NN modeling implies that the BP learning technique is popular and provides results with high accuracy. This technique uses a gradient descent algorithm that updates the network weights and biases in the direction in which the performance function decreases most rapidly (i.e. along the negative of the gradient):

X_{k+1} = X_k − a_k g_k    (7)

where X_k is the vector of current weights and biases, a_k is the learning rate and g_k is the current gradient. The learning rate in training the network was set at 0.5. Initially the TRAINLM algorithm was chosen for the network training [13]. The training data sets were fed for a maximum of 2000 epochs or until the MSE fell below the set performance goal. The weights and biases were updated only after the entire training set had been applied to the network. The readings of input and output obtained during experimentation were utilized for the training and testing of the network. The network was trained with 75 data sets (70% of the data) and tested with 30 data sets (30% of the data); hence in all 105 data sets were used in the NN modeling. The post-training analysis was performed with a regression analysis between the network response and the corresponding target. The resulting correlation coefficient (R-value) between the ANN outputs and the targets gives a measure of the performance of the network. In order to obtain the optimum number of neurons in the hidden layer, the ANN model was trained with a varying number of neurons, with the tansig transfer function and the TRAINLM algorithm. The maximum number of neurons checked was 15, starting with a minimum of one neuron and then increasing the network size in steps by adding one neuron each time. Based on the results, the minimum error was found with five neurons, which were therefore selected. This is the optimum topology obtained in the authors' previous work [7]. The experimental and predicted values of the heat transfer coefficient and Nusselt number match with a high level of accuracy. The MSE is defined as

MSE = (1/N) Σ_k (t_k − a_k)²,  1 ≤ k ≤ N    (8)

where a is the network output, t is the target output and N is the number of data points. The experimental conditions are listed in Table 2.

Table 2. Experimental conditions of training and testing data.

3. Results and discussion

Initially the network was trained using the TRAINLM training function. The optimum number of hidden neurons was found to be five; hence this topology was finalized for the study. Then the network was trained with the other training functions, TRAINSCG, TRAINBFG, TRAINOSS and TRAINBR, keeping all other network parameters the same as for TRAINLM for comparison purposes. The training was continued until the least value of the MSE was attained at a definite number of epochs, as represented in Figs. 3–7. These figures are the performance plots (learning curves) of the mean square error value versus the number of iterations (epochs). The curves start from a large value of the MSE, which decreases as the network learns.
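As a rough illustration (not the authors' code), the two weight-update rules used above can be sketched in Python/NumPy: the plain gradient-descent update of Eq. (7) with the paper's learning rate of 0.5, and the Levenberg–Marquardt update of Eq. (5) built from the Jacobian via Eqs. (3) and (4). The single-parameter toy data set is invented for the demonstration.

```python
import numpy as np

def mse(t, a):
    """Mean square error of Eq. (8): (1/N) * sum_k (t_k - a_k)^2."""
    return float(np.mean((np.asarray(t) - np.asarray(a)) ** 2))

# Invented toy problem: fit a = w * x to targets t (true w = 2).
x = np.array([0.5, 1.0, 1.5])
t = np.array([1.0, 2.0, 3.0])

# --- Gradient descent, Eq. (7): X_{k+1} = X_k - a_k * g_k, with a_k = 0.5 ---
w_gd = 0.0
for _ in range(50):
    g = np.mean(2 * (w_gd * x - t) * x)   # gradient of the MSE w.r.t. w
    w_gd = w_gd - 0.5 * g

# --- Levenberg-Marquardt, Eq. (5): X_{k+1} = X_k - [J^T J + mu*I]^{-1} J^T e ---
w_lm = np.array([0.0])
mu = 1e-3                                 # damping scalar of Eq. (5)
for _ in range(10):
    e = w_lm[0] * x - t                   # error vector
    J = x.reshape(-1, 1)                  # Jacobian of e w.r.t. w
    H = J.T @ J                           # Hessian approximation, Eq. (3)
    g = J.T @ e                           # gradient, Eq. (4)
    w_lm = w_lm - np.linalg.solve(H + mu * np.eye(1), g)

print(round(float(w_gd), 4), round(float(w_lm[0]), 4))  # both approach 2.0
```

On this sum-of-squares problem the Levenberg–Marquardt step lands essentially on the solution in one iteration (the Newton limit of small μ), while gradient descent needs many steps — the behaviour the section describes.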
Table 3. Comparative % relative error of the heat transfer coefficient for the five ANN training functions against the experimental values.

h (W/m² K) Exp.   TRAINBFG   TRAINBR   TRAINSCG   TRAINLM   TRAINOSS
69.44             0.0012     0.0006    0.0317     0.0168    0.0324
72.98             0.0711     0.0071    0.1000     0.0034    0.1150
74.13             0.0169     0.0134    0.0113     0.0179    0.0357
77.24             0.0676     0.0243    0.0496     0.0721    0.0548
80.02             0.0266     0.0191    0.0211     0.0017    0.0126
82.20             0.1601     0.1178    0.1432     0.1265    0.0517
80.98             0.2670     0.0293    0.0606     0.1345    0.9190
59.13             0.0957     0.0029    0.0754     0.0391    0.0472
62.82             0.1192     0.0060    0.0855     0.0178    0.0508
64.56             0.0074     0.0015    0.0127     0.0384    0.0002
66.27             0.0548     0.0047    0.0398     0.0189    0.0361
68.23             0.0186     0.0084    0.0355     0.0438    0.0214
70.98             0.0962     0.1267    0.1817     0.0183    0.4527
68.87             2.8033     0.0713    2.8128     0.1737    1.4204
49.79             0.0135     0.0112    0.0044     0.0135    0.0024
53.86             0.0106     0.0267    0.0015     0.0063    0.0212
54.67             0.0461     0.0132    0.0649     0.0739    0.1057
56.46             0.1498     0.0096    0.1196     0.0466    0.1018
57.00             0.0423     0.0139    0.0560     0.0165    0.0502
57.81             0.0751     0.0737    0.0765     0.2572    0.3261
56.75             0.1080     0.1225    0.1570     0.0818    1.2268
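The entries of Table 3 follow from the definition of percentage relative error. The definition and the sample predicted value below are assumptions for illustration; only the experimental value 69.44 W/m² K comes from the table.

```python
def pct_relative_error(h_exp, h_pred):
    """Percentage relative error (assumed definition): 100 * |h_exp - h_pred| / h_exp."""
    return 100.0 * abs(h_exp - h_pred) / h_exp

# Assumed example: experimental 69.44 W/m^2 K vs a hypothetical prediction of 69.4404.
print(round(pct_relative_error(69.44, 69.4404), 4))  # → 0.0006
```

With this definition, a prediction within about 0.0004 W/m² K of the measured 69.44 reproduces the 0.0006% figure reported for TRAINBR in the first row.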
Fig. 8. Regression analysis of TRAINBR for training data.

Fig. 9. Regression analysis of TRAINBR for testing data.

Fig. 10. Regression analysis of TRAINBFG for testing data.

Fig. 11. Regression analysis of TRAINSCG for testing data.
The validation of the training functions is performed by executing a regression analysis (post-training analysis) between the network response and the corresponding targets. All the training functions achieved an R² value of unity for the training data, whereas the function TRAINBR achieved an R² value of unity for the testing data as well (Figs. 8 and 9). The coefficient of determination values for the testing data for the functions TRAINBFG, TRAINSCG, TRAINLM and TRAINOSS were 0.995, 0.995, 0.999 and 0.998 respectively, an inferior performance compared with TRAINBR, as shown in Figs. 10–13. The values of RMSE, SSE and R² are presented in Table 4. After analyzing all the results, the TRAINBR function has shown better performance than the other training functions.
Fig. 12. Regression analysis of TRAINLM for testing data.

Conflict of interest

None declared.
References

[22] J. Zhang, F. Haghighat, Development of artificial neural network based heat convection for thermal simulation of large rectangular cross-sectional area earth-to-earth heat exchanges, Energy Build. 42 (4) (2010) 435–440.
[23] H. Demuth, M. Beale, M.T. Hagan, Neural Network Toolbox for Use with MATLAB, Neural Network Toolbox User's Guide, The MathWorks Inc., Natick, MA, 2005.
[24] M.T. Hagan, H.B. Demuth, M.H. Beale, Neural Network Design, PWS Publishing, Boston, MA, 1996.
[25] M.F. Moller, A scaled conjugate gradient algorithm for fast supervised learning, Neural Networks 6 (1993) 525–533.
[26] J.E. Dennis, R.B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall, Englewood Cliffs, NJ, 1983.
[27] F.D. Foresee, M.T. Hagan, Gauss–Newton approximation to Bayesian regularization, in: Proceedings of the International Joint Conference on Neural Networks, IEEE Press, Piscataway, NJ, 1997, pp. 1930–1935.
[28] M.T. Hagan, M. Menhaj, Training feedforward networks with the Marquardt algorithm, IEEE Trans. Neural Networks 5 (1994) 989–993.
[29] D.J.C. MacKay, Bayesian interpolation, Neural Comput. 4 (1992) 415–447.
[30] J.S. Torrecilla, J.M. Aragon, M.C. Palancar, Optimization of an artificial neural network by selecting the training function: application to olive oil mills waste, Ind. Eng. Chem. Res. 47 (2008) 7072–7080.
[31] D.V. Singh, G. Maheshwari, R. Shrivastav, D.K. Mishra, Neural network comparing the performances of the training functions for predicting the value of specific heat of refrigerant in vapor absorption refrigeration system, Int. J. Comput. Appl. 18 (4) (2011) 1–5.