works have applied DNN to forecasting tasks such as predicting crude oil price [9], household electricity demand [10], and the stock market [11]. To the best of our knowledge, this paper is the first to employ DNN to forecast sugarcane yield and CCS.

The paper is organized as follows: material and methods are described in Section 2. Experiments and results are presented in Section 3. Discussion and analysis are provided in Section 4. Finally, the conclusion is summarized in Section 5.

2. MATERIAL AND METHODS
2.1 Collecting Data
We collected data from the farmers (residing in 24 provinces of Thailand) who supplied 8 – 14 tons of sugarcane to the mills during the years 2010 – 2014. We preprocessed the data and excluded incomplete and noisy data points from the raw data. After the data cleansing, the numbers of data points for the years 2010 – 2014 are 314, 284, 288, 312 and 220, respectively.

The data consist of nine attributes as follows: 1) areas of new sprout plantation (rais), 2) areas of first-ratoon plantation (rais), 3) areas of second-ratoon plantation (rais), 4) areas of third-ratoon plantation (rais), 5) quota agreements with the mills (tons), 6) species (i.e. K84-200, LK11, K88-50, K90-77, K88-92, K88-50, OoTong3, and OoTong1), 7) average rainfall of plantation (millimeters), 8) number of rainy days throughout the year of each plantation (days), and 9) maximum rainfall of each plantation (millimeters). In this work, we feed these data attributes as input to the training processes to construct the models for forecasting both sugarcane yield and CCS.

2.2 Forecasting Models
This paper presents the models for forecasting the sugarcane yield and CCS. We chose three methods based on the backpropagation neural network (BPNN), the (μ+λ) adaptive evolution strategies (A-ES), and the deep neural network (DNN).

2.2.1 BPNN
A neural network is a widely used model that is inspired by human neural networks. In general, there are three kinds of layers in neural networks, namely 1) input layer, 2) hidden layers and 3) output layer. The hidden layers transform input features to a problem-specific feature space. In this paper, we train the neural network to predict the sugarcane yield and CCS by fitting the model to the training data. During the training process, the algorithm tries to minimize the error by updating the weights of the network based on a backpropagation technique. These weight-updating steps are repeated until a given number of iterations is reached.

In this paper, we construct the BPNN by using Weka version 3.6.12. Since the network structure directly affects the performance of the BPNN, we vary the number of hidden nodes (from 5 to 20 nodes) and the number of iterations (in {1500, 2000, 2500, 3000}). We set the learning rate and momentum to 0.1. From the previous work [2], we found that the set of parameters that provides the lowest training error for sugarcane yield forecasting is 16 hidden nodes with 3000 iterations. In this work, the BPNN-based model for forecasting CCS also uses the same parameters.

2.2.2 A-ES
[...] evolution strategies (ES). We select the A-ES for evolving a prediction function instead of using traditional genetic programming (GP) because GP has a limitation on ephemeral random constants [13]. The A-ES takes advantage of both GA and ES in terms of their capability in searching for the structure of an equation and adjusting its coefficients. The A-ES simultaneously evolves an equation form and its coefficients. It was proposed to determine the equation for predicting sugarcane yield in [2]. In this work, we adopt A-ES to find a new mathematical equation for forecasting CCS. We set the A-ES parameter values similar to the ones in [2], except for the maximum number of equation terms, which is set to 3 (obtained from experiments where the number of terms is varied from 2 to 15). The A-ES algorithm is shown in the following steps:

1. Generate parents, each of which contains 2 chromosomes (i.e., coefficients and function forms).
2. Generate new offspring using the following steps:
   2.1 Update step-size
       sigma = sigma Exp(N(0, (sigma2))),
       where (sigma2) denotes the learning rate, and N(x, y) the normal distribution with the mean of x, and the variance of (sigma2).
   2.2 Mutate real-value coefficients
       chromosome1 = chromosome1 + N(0, (sigma2))
   2.3 Crossover and mutate the functions (chromosome2) as shown in the following equation.
       chromosome2 = mutate(crossover(chromosome2))
3. Evaluate chromosomes and select the fittest individuals to be parents for the next generation.
4. Repeat steps 2 and 3 until a stopping criterion is met.

2.2.3 Deep Neural Network (DNN)
Deep Neural Network (DNN) [14] is one of the widely accepted machine learning techniques developed since 2006. It has been applied to solve various problems in computer vision, speech recognition, handwriting recognition, information retrieval, etc. Recently, research on DNN has been remarkably active. In this paper, we attempt to apply the DNN to construct models for forecasting sugarcane yield and CCS. We also study its behavior related to network structures, initial weights, and training-data overfitting.

DNN differs from a normal ANN in that it has a larger number of hidden layers. Increasing the number of layers helps a DNN cope with more complex problems and solve real-world problems more efficiently. This paper varies DNN parameters to find appropriate DNN structures for the sugarcane forecasting model. In particular, we vary the number of hidden layers from 2 to 5 and the number of nodes in each layer in {32, 64, 128}. In all experiments, every layer has an equal number of nodes. Each network structure is trained to determine the forecasting model. The model trained on the data of the i-th year is used to predict the yield of the (i+1)-th year. Each structure's preliminary error for yield forecasting (averaged over the years) is shown in Figure 1, and the errors for forecasting CCS are shown in Figure 2.
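The A-ES steps listed in Section 2.2 (parent generation, step-size update, coefficient mutation, crossover and mutation of the function form, and (μ+λ) selection) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the equation form is assumed to be a constant plus a sum of coefficient-times-attribute terms, the function-form chromosome is assumed to be a list of attribute indices, uniform crossover and single-gene mutation are assumed, and all names (`a_es`, `make_offspring`, `tau`, and so on) are hypothetical. MAPE is used as the fitness measure.

```python
import math
import random


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def predict(ind, row):
    # ASSUMED equation form: c0 + sum_k c_k * row[feature_k]
    coeffs, feats = ind["coeffs"], ind["feats"]
    return coeffs[0] + sum(c * row[f] for c, f in zip(coeffs[1:], feats))


def fitness(ind, X, y):
    return mape(y, [predict(ind, row) for row in X])


def random_individual(n_features, n_terms, rng):
    # Step 1: each individual carries two chromosomes plus its own step size.
    return {
        "coeffs": [rng.uniform(-1.0, 1.0) for _ in range(n_terms + 1)],  # chromosome 1
        "feats": [rng.randrange(n_features) for _ in range(n_terms)],    # chromosome 2
        "sigma": 1.0,
    }


def make_offspring(p1, p2, n_features, tau, rng):
    # Step 2.1: log-normal step-size update (tau plays the role of the learning rate;
    # note that random.gauss takes a standard deviation, not a variance).
    sigma = p1["sigma"] * math.exp(rng.gauss(0.0, tau))
    # Step 2.2: Gaussian mutation of the real-valued coefficients.
    coeffs = [c + rng.gauss(0.0, sigma) for c in p1["coeffs"]]
    # Step 2.3: ASSUMED uniform crossover, then mutation of one randomly chosen gene.
    feats = [rng.choice(pair) for pair in zip(p1["feats"], p2["feats"])]
    feats[rng.randrange(len(feats))] = rng.randrange(n_features)
    return {"coeffs": coeffs, "feats": feats, "sigma": sigma}


def a_es(X, y, mu=5, lam=20, n_terms=3, generations=200, tau=0.3, seed=0):
    rng = random.Random(seed)
    n_features = len(X[0])
    parents = [random_individual(n_features, n_terms, rng) for _ in range(mu)]
    for _ in range(generations):  # Step 4: fixed generation budget as stopping criterion
        offspring = [
            make_offspring(rng.choice(parents), rng.choice(parents), n_features, tau, rng)
            for _ in range(lam)
        ]
        # Step 3: (mu + lambda) selection keeps the best of parents and offspring.
        pool = sorted(parents + offspring, key=lambda ind: fitness(ind, X, y))
        parents = pool[:mu]
    return parents[0]
```

Because parents compete with their offspring in the (μ+λ) scheme, the best training error in this sketch never increases from one generation to the next.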
Figure 1. MAPE of the DNN-based yield forecasting models
Figure 2. MAPE of the DNN-based CCS forecasting models

3.1 Sugarcane Yield Forecasting
In these experiments, we compare the three algorithms' forecasting accuracy. Table 1 shows the accuracies in terms of mean absolute percentage error (MAPE). Note that the MAPEs of BPNN and A-ES are obtained from [2]. The accuracies of DNN are obtained from the best training model after 50 runs.

Table 1. Training and testing MAPEs in sugarcane yield prediction
Training  Testing  Training MAPE         Testing MAPE
Data      Data     BPNN   A-ES   DNN     BPNN   A-ES   DNN
Avg.               37.90  10.91  9.25    56.35  13.21  13.83

2011      2012     3.64   3.49   3.48    5.50   5.60   6.09
2013      2014     4.14   4.13   4.21    5.98   5.64   5.71
Avg.               4.28   4.10   4.16    8.02   5.99   5.93

Although the DNN-based model is slightly better than the A-ES, its models are described as a set of weights, while the A-ES-based model provides human-understandable prediction equations. The prediction equations for the CCS are presented in Table 3. The factors influencing CCS are cropping areas (i.e., Ratoon1,
Ratoon3, NewPlanted in the equations), maximum rainfall, the number of rainy days, quota and sugarcane varieties.

Table 3. Equations obtained by the A-ES for CCS prediction
Training  Testing  Equation
Data      Data
2010      2011     11.8621 + 0.511844Ratoon1 / - 2.61096MaxRainfall - 1.27713TypeK88_92
2011      2012     12.3818 + 2.79451TypeK84_200 - 4.21822DaysOfRain / 16.9181MaxRainfall
2012      2013     11.6497 + 0.275961TypeLK11 - 0.724346Ratoon3 / 4.7858Ratoon3
2013      2014     12.5836 + -3.16223TypeLK11 / 0.0515265Quota / 0.0336951NewPlanted

[...] the best testing results for 2011, 2013 and 2014, while the DNN-based model provides the best for year 2012. On average over the years, the DNN-based model has the lowest testing MAPE.

In summary, the DNN-based model generates both the lowest training and testing errors when forecasting sugarcane yield. The results in this section also show that if the DNN-based model is initialized appropriately, it could outperform the others (as shown in Table 4, where the average test MAPE is 11.94%). On the other hand, the model might be trained to overfit the training data and produce a higher MAPE than the other two models do (as shown in Table 1, where the average test MAPE is 13.83%).

Table 4. In yield prediction, MAPEs of the BPNN-based and of A-ES-based models (producing the lowest training error) vs. MAPEs of the DNN-based model (yielding the lowest testing error)
Training  Testing  Training MAPE         Testing MAPE
Data      Data     BPNN   A-ES   DNN     BPNN   A-ES   DNN
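
The tables in this paper report MAPE and contrast two ways of picking a DNN out of many training runs: the run with the lowest training error and the run with the lowest testing error. The sketch below illustrates both ideas. The function names (`mape`, `select_runs`) and the numbers in the usage example are hypothetical and are not taken from the experiments; MAPE is written in its usual form, the mean of |actual - predicted| / |actual| expressed in percent.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def select_runs(runs):
    """Given (train_mape, test_mape) pairs, one per trained model,
    return the run with the lowest training error and the run with
    the lowest testing error. The two may differ when a model overfits."""
    by_train = min(runs, key=lambda r: r[0])
    by_test = min(runs, key=lambda r: r[1])
    return by_train, by_test


# Made-up values, for illustration only.
error = mape([10.0, 12.5], [11.0, 12.0])   # errors of 10% and 4% per point
runs = [(5.0, 9.0), (6.0, 7.0)]            # (train MAPE, test MAPE) per run
by_train, by_test = select_runs(runs)
```

In this made-up example the run with the lowest training MAPE is not the run with the lowest testing MAPE, which is the overfitting pattern the text describes.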
Table 6. In CCS prediction, MAPEs of the BPNN-based and of A-ES-based models (producing the lowest training error) vs. MAPEs of the DNN-based model (yielding the lowest testing error)
Training  Testing  Training MAPE         Testing MAPE
Data      Data     BPNN   A-ES   DNN     BPNN   A-ES   DNN
2010      2011     3.97   4.04   4.32    8.62   4.58   3.66
2011      2012     3.64   3.49   3.72    5.50   5.60   5.47
2012      2013     5.35   4.74   4.97    11.98  8.14   7.09
2013      2014     4.14   4.13   4.32    5.98   5.64   5.01
Avg.               4.28   4.10   4.33    8.02   5.99   5.31

4.2.2 Analyzing each model's average training and testing errors
Similar to Table 5, Table 7 shows the training/testing errors of the BPNN-based and the A-ES-based models (each of which yields the lowest training error) to be compared with the training/testing errors of the DNN (averaged over 50 runs). During the training process, the BPNN-based model produces the best result for year 2010 and the A-ES-based model produces the best results for the other three years. For the testing process, the BPNN-based and the A-ES-based models produce the best results in years 2012 and 2014, respectively, while the DNN-based model produces the best results in years 2011 and 2013.

Table 7. In CCS prediction, each model's average MAPE
Training  Testing  Training MAPE         Testing MAPE
Data      Data     BPNN   A-ES   DNN     BPNN   A-ES   DNN
2010      2011     3.97   4.04   4.17    8.62   4.58   4.18
2011      2012     3.64   3.49   3.56    5.50   5.60   6.22
2012      2013     5.35   4.74   4.88    11.98  8.14   8.00
2013      2014     4.14   4.13   4.24    5.98   5.64   5.70
Avg.               4.28   4.10   4.21    8.02   5.99   6.03

On average over 4 years, the DNN-based model provides the best forecasting results compared to the others for the yield prediction (as shown in Table 5). However, Table 7 shows that A-ES is the best model for forecasting CCS. The reason may be that CCS has a narrow range of values (approximately between 10 and 13), so it is not difficult for the A-ES to find a fitting model. In contrast, the possible yield values are quite diverse, so a more complicated model is required. In this case, the deep learning model is superior. Another advantage of the model obtained by A-ES is that it yields mathematical equations, which are easy for humans to interpret, while the results of the DNN-based model are a set of weights, which are less meaningful.

5. CONCLUSION
In this paper, we proposed the BPNN-, A-ES-, and DNN-based models for forecasting sugarcane yield and quality levels (CCS). The data were provided by sugarcane farmers residing in 24 provinces of Thailand during 2010–2014. The models' prediction performances were then compared and analyzed. In our experiments, we studied how DNN parameters would affect its forecasting accuracy and whether it could outperform A-ES. We found that the DNN-based model outperforms the others in forecasting sugarcane yields. However, it is slightly worse than the A-ES-based model when it comes to predicting sugarcane CCS. As DNN's accuracy depends on the algorithm's parameters, the models produced by DNN sometimes perform relatively poorly. Lastly, the DNN-based model tends to overfit the training data: the model that yields the lowest training error does not necessarily produce the lowest testing error.

In future work, we plan to reduce the forecasting inaccuracies for sugarcane yield by applying time-series analysis algorithms, together with more useful information (e.g. pest outbreaks, more thorough and accurate weather information) obtained by widely deployed sensor networks. Moreover, the deep-learning parameters could be fine-tuned by applying evolutionary algorithms.

6. ACKNOWLEDGMENTS
This work was financially supported by the Research Grant of Burapha University through National Research Council of Thailand (Grant no. 33/2560).

7. REFERENCES
[1] Sugarcane cultivated area report, Office of the Cane and Sugar Board, Thailand, 2017.
[2] Srikamdee, S., Rimcharoen, S. and Leelathakul, N. 2016. Forecasting sugarcane yield using (mu+lambda) adaptive evolution strategies. In: Park J., Pan Y., Yi G., Loia V. (eds) Advances in Computer Science and Ubiquitous Computing. CSA 2016, CUTE 2016, UCAWSN 2016. Lecture Notes in Electrical Engineering. 421. Springer, Singapore.
[3] Hossain, Md. M. and Abdulla, F. 2015. Forecasting the sugarcane production in Bangladesh by ARIMA model. J. Stat. Appl. Pro. 4, 2, 297-303.
[4] Li, D. and Qiu, M. 2012. The application of joint model for sugarcane production forecast of Guangxi province. Procedia Engineer. 31, 1083-1088.
[5] Saithanu, K., Sittisorn, P. and Mekparyup, J. 2017. Estimation of sugar cane yield in the northeast of Thailand with MLR model. Burapha Science Journal. 22, 2, 197-202.
[6] Obe, O.O. and Shangodoyin, D.K. 2010. Artificial neural network based model for forecasting sugar cane production. Journal of Computer Science. 6, 4, 439-445.
[7] Bugate, O. and Seresangtakul, P. 2013. Sugarcane production forecasting model of the northeastern by artificial neural network. KKU Sci. J. 41, 1, 213-225.
[8] Liu, W., Wang, Z., Liu, X., Zeng, N., Liu, Y. and Alsaadi, F. E. 2017. A survey of deep neural network architectures and their applications. Neurocomputing. 234, 11-26.
[9] Zhao, Y., Li, J. and Yu, L. 2017. A deep learning ensemble approach for crude oil price forecasting. Energ Econ. 66, 9-16.
[10] Coelho, V. N., Coelho, I. M., Rios, E., Filho, A. S. T., Reis, A. J. R., Coelho, B. N., Alves, A., Netto, G. G., Souza, M. J. F. and Guimaraes, F. G. 2016. A hybrid deep learning forecasting model using GPU disaggregated function evaluations applied for household electricity demand forecasting. Enrgy Proced. 103, 280-285.
[11] Chong, E., Han, C. and Park, F. C. 2017. Deep learning networks for stock market analysis and prediction: Methodology, data representations, and case studies. Expert Syst Appl. 83, 187-205.
[12] Rimcharoen, S., Sutivong, D. and Chongstitvatana, P. 2005. Curve fitting using adaptive evolution strategies for forecasting the exchange rate. In Proceedings of the 2nd ECTI Annual International Conference.
[13] Koza, J. R. 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, USA.
[14] Schmidhuber, J. 2015. Deep learning in neural networks: An overview. Neural Networks. 61, 85-117.