
Feasibility Study of the Use of Artificial Neural Networks for Predicting the Efficiency of Alcohol Fermentation

Abstract
The growing worldwide interest in renewable energy drives the development of new technologies that improve production processes and reduce inherent losses, allowing greater production without necessarily expanding sugarcane into areas previously reserved for food production. Modern computer systems, in the form of artificial neural networks (ANN), are a strong ally in the pursuit of such improvement. This study uses artificial neural networks to build a mathematical model that predicts the efficiency of fermentation. The model inputs were selected by linear correlation from a series of parameters monitored in ethanol and sugar production plants. A theoretical analysis based on the literature was necessary to confirm the statistical results. Finally, the chosen parameters were tested in three different combinations within a multilayer perceptron (MLP) neural network, with promising results for the use of this technology in processes involving biological systems as complex as alcoholic fermentation.

Keywords: Alcoholic Fermentation, Fermentation Efficiency, Multilayer Perceptron Neural Model

INTRODUCTION
Ethanol has been used in Brazil as a gasoline additive since 1931, when its use was officially regulated. In 1975, with the PROÁLCOOL program, Brazil became the first country in the world to have part of its vehicle fleet powered by this biofuel. The demand for renewable fuels and the fact that ethanol is one of the most viable renewable energy sources, together with flex-fuel engine technology, which allows any ethanol/gasoline ratio, have significantly increased the production and consumption of ethanol in Brazil (Mantovaneli 2005). On the other hand, the expansion of sugarcane cultivation has raised concern, because some fear that it might advance over land previously reserved for food production. The low ethanol yield per hectare of cultivated land has therefore led to many studies on new agricultural methods and practices, new yeast strains and the technical improvement of production processes.

In this context, modern computer systems, together with predictive tools for optimization and control, are powerful allies in increasing the efficiency of ethanol production and extraction. Computational techniques capable of describing complex biological systems include phenomenological models, neural networks and hybrid neural networks (Mantovaneli 2005).

Artificial neural networks (ANN) are information processing structures consisting of a number of connected simple processing elements, called units or nodes, whose functionality is based on the operation of a brain neuron. The processing ability of the network is stored in the strengths of the connections between these units, called weights, which are acquired by a process of learning from a set of training patterns (Gurney 1997).
The strategy for solving a problem with a neural network is to choose the network architecture best suited to the problem, together with a learning process in which representative examples of the knowledge to be acquired are presented to the network. The network self-organizes to integrate the information into its structure by adjusting the synaptic weights of its connections. Learning often requires presenting the training examples to the network many thousands of times (Thibault et al. 1990). The great advantage of a neural network is its ability to adjust itself so that it approximates the outcomes observed on site in real fermentation processes and can, to some extent, predict those outcomes even in the presence of multiple random variables.

The MLP (multilayer perceptron) network can be defined as a set of sensory units that constitute an input layer, one or two hidden layers of computing nodes and an output layer. It has been very successful in many applications that require computing power to solve difficult problems. This success derives from its supervised training with an algorithm known as error backpropagation, which is based on the error-correction learning rule (Haykin 2001). Figure 1 shows an example of a fully connected "feedforward" neural network, in which each neuron in a given layer is connected to every neuron of the following layer.
Figure 1 - Architecture of an artificial neural network (Eyng 2006).
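
As an illustration of the fully connected feedforward structure described above, the following minimal sketch propagates one input vector through an MLP. The layer sizes, the random initial weights and the function names are assumptions made for the example; they are not the networks used in this study.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(x, layers):
    """Propagate the input vector x through all layers.

    Hidden layers use tanh and the last layer is linear
    (an assumption of this sketch).
    """
    activation = list(x)
    for index, (weights, biases) in enumerate(layers):
        # Each neuron sums every output of the previous layer (full connection).
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        activation = z if is_output else [math.tanh(v) for v in z]
    return activation

rng = random.Random(0)
sizes = [3, 4, 1]  # hypothetical: 3 inputs, 4 hidden neurons, 1 output
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
output = forward([0.2, -0.1, 0.7], layers)
```

Training by backpropagation would adjust the `weights` and `biases` of each layer; the sketch covers only the forward computation.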

An early work on the use of nonlinear systems in alcoholic fermentation processes was conducted by Meleiro (2002), who proposed the development of controllers based on neural systems. According to the author, the results demonstrated that neural network techniques were superior to linear controllers in the design of controllers for biological processes. Since then, several studies have demonstrated the feasibility of using neural networks to predict biological parameters (Pramanik 2004; Eyng 2006; Meleiro et al. 2009; Rivera et al. 2009). Despite the growing understanding of artificial neural networks in fermentation processes for ethanol production, little has been applied in practice in the country's ethanol plants. For these reasons, the aim of this work is to develop artificial neural networks using a different approach. We propose to take the data already collected by the quality control systems of the plants and feed them into an artificial neural network, in order to predict the conditions associated with optimum performance and thereby generate data that help adjust the production parameters.

MATERIALS AND METHODS

NATURE OF THE DATA USED


The data used in the study are the historical series of weekly ethanol production from sugarcane, collected from the weekly data reports of two different plants, referred to here as plant A and plant B for reasons of confidentiality. The data cover the harvest periods of 2008, 2009 and 2010, for a total of 3217 observations.

TYPE OF FERMENTATION PROCESS


The fermentation process in the analyzed plants is fed-batch with yeast recirculation, which is widely used in Brazil. The average fermentation time was 8 hours for both plants.

CHOICE OF PARAMETERS FOR CONSTRUCTING THE NEURAL NETWORK


To ensure a sound choice of the parameters that would serve as inputs to the neural networks, we evaluated them by three methods: statistical correlation, theoretical analysis and sensitivity analysis.

Statistical Correlation Analysis

Using the STATISTICA 7.0 software (Statsoft), we computed the statistical correlation between each monitored parameter and the fermentation efficiency. We performed two correlation analyses. The first used the data from plant A (A 2008) to determine what degree of correlation could be obtained from the weekly average report format used by this plant. The second was performed on the aggregated data from the three data sets of plant B (B 2008, B 2009, B 2010). A statistical analysis alone, however, would not be sufficient to identify the input parameters for the neural network, so we conducted a theoretical analysis of the preselected parameters to confirm and validate the selection.

Theoretical Analysis

This analysis consisted of a literature review on each parameter and its influence on fermentation efficiency. Separate analyses were performed for the parameters used by plant A and plant B. We thus also evaluated whether there is a significant difference between the parameters the two plants chose to monitor their processes. Such a difference can be identified either from the correlation coefficients or from the results obtained with the neural network.

Sensitivity Analysis

The sensitivity analysis identifies which variables are most important for a particular neural network. It is usually used to identify input variables that can safely be removed from the network. To define the sensitivity of each variable in a particular neural network, we use the concept that every variable is "dispensable" (Hunter et al. 2000).
Thus, the software that performs this test (STATISTICA v. 7) first computes the error of the network and then runs the network again with a null value in place of the variable being tested. The influence of the variable is measured by comparing the errors before and after the substitution, which yields its significance (RANK) and sensitivity (RATIO). This analysis was performed after the neural networks were assembled, to correct the results of the analyses described above for possible errors in the choice of parameters caused by the nonlinear behavior of some parameters monitored by the plants.

CHOOSING AND ASSEMBLING THE NEURAL NETWORK

Using Multilayer Perceptron Neural Networks for Predicting the Efficiency of Fermentation

After the correlation and theoretical analyses, the selected parameters were grouped into spreadsheets identified by plant and by year of collection. For plant B, all years were combined into a single data set. Several neural network configurations were built to determine the best predictor:

i) MLP-1, built from the selected parameters of plant A, to determine how well the parameters monitored by this plant predict fermentation efficiency;
ii) MLP-2, built from the selected parameters of plant B, to determine how well the parameters monitored by this plant predict fermentation efficiency and how the prediction varies with the year (2008, 2009 and 2010);
iii) MLP-3, built from the parameters of plant B, with the alcohol content (°GL) as the network output.

The neural networks had the following features:

Network: multilayer perceptron (MLP)
Standardization: mapminmax (input and output)
Random sampling: 80% training, 10% validation and 10% test
Activation functions: hyperbolic tangent (hidden layer), linear (output layer)
Best network choice: correlation coefficient
Error function: MSE (mean squared error)

Since the goal of the network is to predict fermentation efficiency, which is a regression problem, we used the MSE and the correlation coefficient r as criteria for choosing the best ANN. Network performance was assessed by the testing error rate, which is obtained at the end of the test phase and indicates the reliability of the predictions made by the network.
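
The standardization and sampling steps listed above can be sketched as follows. This is an illustration only, not the MATLAB code used in the study: `mapminmax` here is a plain re-implementation of the linear rescaling to [-1, 1] that MATLAB's function of the same name applies by default, and `split_indices`, the seed and the rounding rule are assumptions.

```python
import random

def mapminmax(values, lo=-1.0, hi=1.0):
    """Linearly rescale a list of numbers so that min -> lo and max -> hi."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Constant column: map everything to the midpoint (a choice of this sketch).
        return [(lo + hi) / 2.0 for _ in values]
    scale = (hi - lo) / (vmax - vmin)
    return [lo + scale * (v - vmin) for v in values]

def split_indices(n, train=0.8, val=0.1, seed=0):
    """Randomly split range(n) into training, validation and test index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(round(train * n))
    n_val = int(round(val * n))
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])

scaled = mapminmax([1.0, 2.0, 3.0])
train_idx, val_idx, test_idx = split_indices(583)  # 583 = number of plant A records
```

Each input column and the output would be scaled separately, and a prediction is mapped back to physical units by inverting the same linear transformation.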

COMPUTER RESOURCES
The fermentation process data provided by the plants were recorded manually by laboratory analysts. Analyses were performed in triplicate and the average of the measurements was recorded in Excel spreadsheets (xlsx format). For the statistical analyses we used Statsoft's STATISTICA software, version 7.0, licensed to the Federal Technological University of Paraná (UTFPR). The assembly and simulation of the neural networks were performed with MATLAB version R2007b and its Neural Network toolbox, also licensed to UTFPR. Other analyses in this study were made with the Minitab statistical software, version 16 (student version), licensed to the author. The computer used in the simulations was an HP ProBook with a Core i5 processor, 4 gigabytes of RAM and a 380 gigabyte hard disk drive, running the Windows 7 Professional operating system.

RESULTS AND DISCUSSION


Table 1 shows the parameters of plant A (A 2008) that correlated best with fermentation efficiency (absolute correlation greater than 0.70) and were therefore selected for assembling the MLP-1 neural network.
Table 1 - Correlation analysis of the data set of plant A (A 2008).

Parameter   Description                         Correlation with fermentation efficiency
A4          Sugarcane acidity                    0.79
A9          % TRS for sugar                     -0.88
A13         Must acidity                         0.89
A17         Maximum temperature of wine          0.83
A11         Must temperature                     0.77
A15         Ammoniacal nitrogen in the must      0.77
A22         Flocculation                        -0.71

The parameters selected above show a strong relationship with fermentation efficiency. However, a data set of 583 records representing a single crop can be considered small and statistically inconclusive, and may lead to an incorrect interpretation of whether these really are the key parameters for a neural network model able to predict fermentation efficiency. These seven parameters were selected for the theoretical analysis. After mathematical and theoretical verification, the parameters sugarcane acidity, % TRS (Total Reducing Sugar) for sugar, must acidity, maximum wine temperature, must temperature, ammoniacal nitrogen and flocculation were chosen as inputs for the MLP-1 neural network.
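
The correlation screening described above can be sketched as follows: compute the Pearson coefficient between each monitored parameter and the fermentation efficiency, and keep the parameters whose absolute correlation exceeds 0.70. The column names and the numbers are made-up toy data, not plant records, and the function names are hypothetical; the study itself used STATISTICA for this step.

```python
import math

def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def screen_parameters(columns, target, threshold=0.70):
    """Keep the parameters whose |r| with the target exceeds the threshold."""
    scores = {name: pearson_r(values, target) for name, values in columns.items()}
    return {name: r for name, r in scores.items() if abs(r) > threshold}

# Hypothetical toy data for illustration only.
efficiency = [88.0, 89.5, 90.1, 91.0, 92.3]
columns = {
    "must_acidity": [1.0, 1.4, 1.5, 1.9, 2.2],
    "trs_for_sugar": [60.0, 55.0, 54.0, 50.0, 45.0],
    "unrelated": [3.0, 1.0, 4.0, 1.0, 5.0],
}
selected = screen_parameters(columns, efficiency)
```

Using the absolute value keeps strongly negative correlations, such as the one reported for % TRS for sugar, alongside the positive ones. Because the coefficient measures only linear association, a parameter with a strong nonlinear effect can be discarded by this filter.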

To select the parameters for plant B (B 2008, B 2009 and B 2010), the data from the three crops were consolidated into a single spreadsheet containing 2634 records. This was done to obtain a temporal component associated with fermentation efficiency. This component was identified by the crop parameter (B22), and its correlation was also evaluated in a first correlation analysis. The results of this first analysis showed much lower correlations for plant B than for plant A, suggesting that plant B may be less able to monitor the fermentation process because it tracks parameters that are only weakly related to fermentation efficiency. Another aspect to consider is that the parameter with the highest positive correlation with fermentation efficiency was the crop itself, which could lead to the mistaken conclusion that the factors tracked by the other parameters have no significant influence as long as the crop year is more recent. The crop parameter was therefore disregarded in a second correlation analysis (Table 2).
Table 2 - Correlation analysis disregarding the crop from the data set of plant B.

Parameter   Description                                   Correlation with fermentation efficiency
B3          Sugarcane purity                               0.27
B5          Sugarcane dextran (ppm/Brix)                  -0.20
B7          Must impurity (%)                              0.25
B10         Ammoniacal nitrogen in the must (ppm)         -0.47
B11         Wine RRS (residual reducing sugar) (%)        -0.28
B14         Glycerol in wine                               0.25
B19         Viability (%)                                  0.21

The parameters above are those that had the highest linear correlation with efficiency. They were therefore chosen for the theoretical analysis, in which their behavior and their relation to fermentation efficiency were studied more closely.

Neural Networks Results


The results of the neural networks are presented through graphs and tables that show network performance: the mean squared error (MSE), the correlation coefficient (r) and the testing error rate, together with comments on how each network performed against the desired target and against the other networks tested.

Results of the MLP-1 Neural Network

After selecting the parameters for testing (% TRS sugar, sugarcane acidity, must acidity, maximum wine temperature, must temperature, flocculation and ammoniacal nitrogen), these were entered into the artificial neural networks module of STATISTICA v. 7. Simulations were run with different numbers of iterations to find the best configuration, i.e., the highest correlation and the lowest MSE. After testing numerous configurations, the network with the best performance had the following structure: 6 neurons in the input layer, two hidden layers with 6 neurons each, and one neuron in the output layer. All neurons have a sigmoidal activation function. The network was trained by backpropagation with momentum, with a learning rate of 0.01 and a momentum rate of 0.01 (Fig. 2).
Figure 2 - Structure of the MLP-1 network
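
For reference, the weight update of backpropagation with momentum can be sketched as below, using the learning rate and momentum rate reported for MLP-1 (0.01 each). This is the textbook form of the rule, applied to a toy one-weight error function; the exact update implemented in STATISTICA may differ, and the function name is hypothetical.

```python
def momentum_step(weights, gradients, velocity, learning_rate=0.01, momentum=0.01):
    """One gradient-descent update with momentum.

    new_velocity = momentum * velocity - learning_rate * gradient
    new_weight   = weight + new_velocity
    """
    new_velocity = [momentum * v - learning_rate * g
                    for v, g in zip(velocity, gradients)]
    new_weights = [w + dv for w, dv in zip(weights, new_velocity)]
    return new_weights, new_velocity

# Toy problem (an assumption for illustration): minimise E(w) = w^2, gradient 2w.
w, v = [1.0], [0.0]
for _ in range(100):
    w, v = momentum_step(w, [2.0 * w[0]], v)
```

In a real network the gradients come from backpropagating the output error through the layers; the momentum term reuses part of the previous update, which damps oscillations between iterations.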

The best network was selected according to two criteria, the mean squared error (MSE) and the correlation coefficient (r), as shown in Table 3.

Table 3 - Choice of the MLP-1 neural network

Statistic                      MLP-1
Mean error                     -0.20823
MSE                             1.27653
Mean absolute error             0.78698
Standard deviation ratio        0.65891
Correlation                     0.79175
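
The statistics reported for MLP-1 can be computed from the actual and predicted series as in the sketch below. Two points are assumptions: the error is taken as predicted minus actual, and the standard deviation ratio is taken as the standard deviation of the errors divided by the standard deviation of the actual values. The function name and the toy series are hypothetical.

```python
import math

def regression_stats(actual, predicted):
    """Mean error, MSE, mean absolute error, S.D. ratio and correlation."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]  # assumed sign convention
    mean_error = sum(errors) / n
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    sd_error = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / n)
    sd_actual = math.sqrt(sum((a - mean_a) ** 2 for a in actual) / n)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return {
        "mean_error": mean_error,
        "mse": mse,
        "mae": mae,
        "sd_ratio": sd_error / sd_actual,
        "r": cov / math.sqrt(var_a * var_p),
    }

# Toy series: every prediction is exactly 0.5 too high.
stats = regression_stats([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

The toy series shows why r alone is not sufficient: a constant offset leaves the correlation at 1 while the MSE is clearly non-zero, which is why the two criteria are used together.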

The graph in Figure 3 shows that the values predicted by the neural network approximate the actual values. However, as the same graph indicates, there are points where the network was not able to follow the behavior of the series accurately, such as points 1, 6, 14 and 20, whose distance from the actual values contributed to an MSE of 1.27.
Figure 3 - Graphical representation of the original series (blue) and the series predicted by the neural network (green) of fermentation efficiency for the MLP-1 network.

A sensitivity analysis may point the way to future performance improvements, since it indicates that at least one of the chosen parameters (% TRS sugar) was not the most appropriate. Table 4 ranks each parameter according to the sensitivity analysis (Rank 1) and the correlation analysis (Rank 2). The % TRS sugar parameter was not needed to assemble the network, and the parameters with the highest significance for the network are not necessarily those with the highest linear correlation.
Table 4 - Ranks for the parameters according to the coefficients of sensitivity and linear correlation.

Parameter                          Rank 1   Sensitivity   Rank 2   Correlation
Ammoniacal nitrogen in the must    1        1.41630       5        0.77
Wine max. temperature              2        1.06248       3        0.83
Flocculation                       3        1.05420       7        0.71
Must temperature                   4        1.00964       6        0.77
Must acidity (g H2SO4/L)           5        1.00272       1        0.89
Sugarcane acidity                  6        0.95106       4        0.79
% TRS sugar                        N/A      N/A           2        0.88
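
The sensitivity ranking can be illustrated with the sketch below. It assumes that the sensitivity ratio is the network error with one input replaced by a null value divided by the baseline error, so that a ratio near or below 1 marks an input the network can do without. The stand-in `model`, the input names and the data are hypothetical and replace a trained network only for the purpose of the example.

```python
def mse(model, rows, targets):
    """Mean squared error of the model over a data set."""
    return sum((model(row) - t) ** 2 for row, t in zip(rows, targets)) / len(rows)

def sensitivity_ranking(model, rows, targets, names, null_value=0.0):
    """Rank inputs by the error ratio obtained when each one is nulled."""
    baseline = mse(model, rows, targets)
    ratios = {}
    for j, name in enumerate(names):
        # Replace column j by the null value and leave the other inputs unchanged.
        masked = [row[:j] + [null_value] + row[j + 1:] for row in rows]
        ratios[name] = mse(model, masked, targets) / baseline
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)

def model(row):
    """Stand-in for a trained network: depends strongly on x0, weakly on x1."""
    return 2.0 * row[0] + 0.1 * row[1]

rows = [[-1.0, 0.5], [-0.5, -1.0], [0.0, 1.0], [0.5, -0.5], [1.0, 0.0]]
noise = [0.1, -0.1, 0.1, -0.1, 0.1]
targets = [model(row) + e for row, e in zip(rows, noise)]
ranking = sensitivity_ranking(model, rows, targets, ["strong_input", "weak_input"])
```

Unlike the linear correlation, this measure depends on the trained network itself, which is consistent with the two rankings in Table 4 ordering the parameters differently.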

Results of the MLP-2 Neural Network

The procedure used to assemble the MLP-1 network was repeated for the MLP-2 network. However, because the plants monitor different variables, the selected parameters were different (ammoniacal nitrogen in the must, sugarcane purity, dextran, glycerol in wine, wine RRS %, viability and must impurity). The best configuration for MLP-2 with the selected parameters was reached after 150 iterations (r = 0.755 and MSE = 0.855). Its structure consists of an input layer of 7 neurons, a first hidden layer of 11 neurons, a second hidden layer of 8 neurons and one neuron in the output layer. The seven input neurons correspond to the seven parameters selected in the analyses described earlier in this paper.
Figure 4 - Structure of the MLP-2 network (Plant B) (MLP 7:7-11-8-1:1)

The parameters selected by the MLP-2 neural network are described in order of importance in Table 5.

Table 5 - Parameters used in the MLP-2 neural network

Parameter                                 Rank   Ratio
Ammoniacal nitrogen in the must (ppm)     1      1.443728
Sugarcane purity (PCTS) (%)               2      1.376896
Sugarcane dextran (ppm/Brix)              3      1.286733
Must impurity (%)                         4      1.234084
Glycerol in wine (%)                      5      1.203130
Wine RRS (%)                              6      1.123687
Viability (%)                             7      1.094810

A comparison between MLP-1 and MLP-2 shows that, although both present similar correlations (MLP-1 r = 0.79 and MLP-2 r = 0.75), the MSE of the plant B network (MSE = 0.85) was considerably lower than that of the plant A network (MSE = 1.276). These values alone are not enough to state with certainty that the parameters monitored by plant B are better for predicting the fermentation yield, because the MSE of each network was affected by the sampling from each plant. Given these results, we assembled a new artificial neural network, MLP-2 (complete), which skips the pre-selection of parameters and therefore allowed us to assess whether parameters need to be preselected before being submitted to a neural network. This network uses all 20 parameters monitored by plant B, and its results are shown in Table 6.
Table 6 - Comparison between the MLP-2 and MLP-2 (Complete) networks

Network             Iterations   r       MSE
MLP-2               150          0.75    0.85
MLP-2 (Complete)    300          0.824   0.745
Figure 5 shows the best-performing network structure found after testing various configurations (r = 0.824 and MSE = 0.745). All monitored parameters were used as inputs to this ANN structure. The testing error rate of the network was 0.12, above the target of 0.05 set for this work.

This experiment, in which all parameters were used to compose the ANN, indicates that correlation analysis is not an efficient method for selecting the parameters of the ANN in question. The explanation is perhaps that some of the parameters monitored by the plants in this alcoholic fermentation process behave nonlinearly with respect to fermentation efficiency. Since, while assembling these neural networks, we had the idea of predicting the alcohol content of the wine as a fundamental parameter for fermentation efficiency, we developed a third artificial neural network, MLP-3.

Results of MLP-3 Neural Network

An ANN was assembled from the 20 parameters monitored by plant B (2008, 2009 and 2010), but with the alcohol content of the wine (°GL) as the output parameter. In short, the new network was developed to predict the alcohol content rather than the fermentation efficiency. This choice was made because of the contribution of the alcohol content of the wine to fermentation efficiency. A diagram of the neurons and layers of the MLP-3 network is shown in Figure 6. The network has 11 neurons in its input layer (not all parameters were used), 12 neurons in the first hidden layer, 4 neurons in the second hidden layer and one neuron in the output layer (MLP 11:11-12-4-1:1).
Figure 6 - Structure of the MLP-3 network (MLP 11:11-12-4-1:1)

The parameters selected by the sensitivity analysis generated by STATISTICA v. 7 are the inputs used in the MLP-3 neural network and are shown in Table 7.

Table 7 - Sensitivity analysis and input parameters of the MLP-3 network

Parameter                                 Rank   Sensitivity
Must ART (%)                              1      4.03029
Glycerol in wine (%)                      2      1.128956
Sugarcane dextran (ppm/Brix)              3      1.099715
Ammoniacal nitrogen in the must (ppm)     4      1.061504
Must acidity (g H2SO4/L)                  5      1.039101
Must impurity (%)                         6      1.034029
Sugarcane ATR (%)                         7      1.023598
Wine RRS (%)                              8      1.023366
Viability (%)                             9      1.013163
Burning time (h)                          10     1.00258
ART (T/day)                               11     0.99909

Given the high correlation coefficient (r = 0.974) and the low mean squared error (MSE = 0.176), the prediction capacity of this network was considered excellent. The results obtained by the MLP-3 ANN show that the parameters currently monitored by the plants can be used to predict the alcohol content of the fermented wine with a high degree of accuracy. The alcohol content can itself be used as an indicator of a good fermentation, because the ultimate goal of these companies, which employ biotechnological processes, is to produce alcohol (ethanol). Based on these results, we suggest that plant A should also monitor the alcohol content of the wine in its processes. The generalization ability of the MLP-3 network on the data reserved for testing is also good: an MSE of 0.22, an r of 0.972 and an error rate of 0.075, close to the goal set for this work (0.05). In view of these results, an ANN intended to predict fermentation efficiency would perform better if, instead of a single network dedicated to predicting efficiency, a hybrid system were implemented, similar to the idea proposed by Mantovaneli (2005): the equation of fermentation efficiency by subproducts, with its terms supplied by secondary artificial neural networks.

CONCLUSIONS
The artificial neural networks developed in this work were able to predict, with some degree of reliability, the fermentation efficiency of both plants A and B within the ranges (minimum and maximum) of the parameters they monitor. In particular, we highlight the MLP-2 (complete) network, which achieved a testing error rate of the order of 0.12, and the MLP-3 network, with an error rate of 0.075. These are notable values for an early study performed with parameters from industrial environments, where measurements are less accurate than in a research laboratory. We have thus demonstrated the efficiency of ANN in handling nonlinear patterns, which are characteristic of fermentation processes, and its application to the task of prediction. We also highlight that the sensitivity analysis of the multilayer perceptron network was better than the linear correlation method at identifying the best parameters to use as ANN inputs. To improve the results further, the ANN topology and training can be changed, for example by varying the number of neurons, increasing the training time and testing other training parameters, and secondary neural networks can be used in the calculation of efficiency.

ACKNOWLEDGEMENTS

REFERENCES
Eyng E. Controle Feedforward baseado em redes neurais aplicado a coluna de absorção do processo de produção de etanol [Dissertação de Mestrado]. Campinas, Brasil: Universidade Estadual de Campinas; 2006.
Gurney K. Uma Introdução à Redes Neurais. London: Routledge; 1997.
Haykin S. Redes Neurais: princípios e prática. 2nd ed. Porto Alegre: Bookman; 2001.
Meleiro LAC. Projeto e Aplicação de Controladores Baseados em Modelos Lineares, Neurais e Nebulosos [Tese de Doutorado]. Campinas, Brasil: Universidade Estadual de Campinas; 2002.
Meleiro LAC, Zuben FV, Filho RM. Constructive learning neural network applied to identification and control of a fuel-ethanol fermentation process. Eng Appl Artif Intel. 2009; 22(2): 201-15.
Mantovaneli ICC. Modelagem híbrido neuronal de um processo de fermentação alcoólica [Dissertação de Mestrado]. Campinas, Brasil: Universidade Estadual de Campinas; 2005.
Pramanik K. Use of Artificial Neural Networks for Prediction of Cell Mass and Ethanol Concentration in Batch Fermentation using Saccharomyces cerevisiae Yeast. J. Inst. Eng. (India) Chem. Eng. Div. 2004; 85: 31-5.
Rivera EAC, Farias FJ, Atala DIP, Ramos RDA, Carvalho ADA, Filho RM. Development and implementation of an automated monitoring system for improved bioethanol production. Chem. Eng. Trans. 2009; 18: 451-6.
Thibault J, Breusegem VV, Cheruy A. On-line prediction of fermentation variables using neural networks. Biotechnol. Bioeng. 1990; 36(12): 1041-8.
