A novel hybridization of artificial neural networks and ARIMA models for time series forecasting
Mehdi Khashei, Mehdi Bijari
Department of Industrial Engineering, Isfahan University of Technology, Isfahan, Iran
Article info
Article history:
Received 18 June 2008
Received in revised form 8 August 2010
Accepted 28 October 2010
Available online 4 November 2010
Keywords:
Artificial neural networks (ANNs)
Auto-regressive integrated moving average (ARIMA)
Time series forecasting
Hybrid models
Abstract
Improving forecasting accuracy, especially in time series forecasting, is an important yet often difficult task facing decision makers in many areas. Both theoretical and empirical findings have indicated that integrating different models can be an effective way of improving upon their predictive performance, especially when the models being combined are quite different. Artificial neural networks (ANNs) are flexible computing frameworks and universal approximators that can be applied to a wide range of forecasting problems with a high degree of accuracy. However, using ANNs to model linear problems has yielded mixed results, and hence it is not wise to apply ANNs blindly to any type of data. Auto-regressive integrated moving average (ARIMA) models are among the most popular linear models in time series forecasting and have been widely applied to construct more accurate hybrid models during the past decade. Although hybrid techniques that decompose a time series into its linear and nonlinear components have recently been shown to be more successful than single models, these models have some disadvantages. In this paper, a novel hybridization of artificial neural networks and ARIMA models is proposed in order to overcome the mentioned limitation of ANNs and to yield a more general and more accurate forecasting model than traditional hybrid ARIMA-ANN models. In our proposed model, the unique advantages of ARIMA models in linear modeling are used to identify and magnify the existing linear structure in the data, and then a neural network is used to determine a model that captures the underlying data-generating process and to predict, using the preprocessed data. Empirical results with three well-known real data sets indicate that the proposed model can be an effective way to improve upon the forecasting accuracy achieved by traditional hybrid models and also by either of the component models used separately.
© 2010 Elsevier B.V. All rights reserved.
1. Introduction
Time series forecasting is an active research area that has drawn considerable attention for applications in a variety of fields. With the time series approach to forecasting, historical observations of the same variable are analyzed to develop a model describing the underlying relationship. The established model is then used to extrapolate the time series into the future. This modeling approach is particularly useful when little knowledge is available about the underlying data-generating process, or when there is no satisfactory explanatory model relating the prediction variable to other explanatory variables. Over the past several decades, much effort has been devoted to the development and improvement of time series forecasting models [1].
Artificial neural networks (ANNs) are one of the most important types of nonparametric nonlinear time series models that have been proposed and examined for time series forecasting. The basic
Corresponding author. Tel.: +98 311 3912550/1; fax: +98 311 3915526.
E-mail address: khashei@in.iut.ac.ir (M. Khashei).
1568-4946/$ - see front matter © 2010 Elsevier B.V. All rights reserved.
doi:10.1016/j.asoc.2010.10.015
[25]. In ARIMA analysis, an identified underlying process is generated based on observations of a time series, with the aim of constructing a model that represents the process-generating mechanism precisely.
Box and Jenkins [26] provided a step-by-step procedure for ARMA analysis, in which AR coefficients multiply past values of the time series and MA coefficients multiply past random shocks. The popularity of the ARIMA model is due to its statistical properties as well as to the well-known Box-Jenkins methodology used in the model-building process. In addition, ARIMA models [22] can implement various exponential smoothing models. Although ARIMA models are quite flexible in that they can represent several different types of time series, their major limitation is the pre-assumed linear form of the model. ARIMA models assume that future values of a time series have a linear relationship with current and past values as well as with white noise, so approximations by ARIMA models may not be adequate for complex nonlinear real-world problems. Real-world systems are often nonlinear [1]; thus, it is unreasonable to assume that a particular realization of a given time series is generated by a linear process.
Both ANNs and ARIMA models have achieved success in their own linear or nonlinear domains, but neither is a universal model suitable for all circumstances. Applying ARIMA models to complex nonlinear problems, or ANNs to linear problems, may be totally inappropriate, as may applying either alone to problems that contain both linear and nonlinear correlation structures. Using hybrid models, or combining several models, has become common practice in order to overcome the limitations of the component models and improve forecasting accuracy. In addition, since it is difficult to completely know the characteristics of the data in a real problem, a hybrid methodology that has both linear and nonlinear modeling capabilities can be a good strategy for practical use.
Hybrid techniques that decompose a time series into its linear and nonlinear components form one of the most popular categories of hybrid models and have been shown to be more successful than single models. Zhang [27] presented a hybrid ARIMA-ANN approach to time series forecasting using this technique. Zhang's hybrid model jointly uses the linear ARIMA model and the nonlinear multilayer perceptron in order to capture different forms of relationship in the time series data. The motivation for Zhang's hybrid model comes from the following perspectives. First, it is often difficult in practice to determine whether a time series under study is generated by a linear or a nonlinear underlying process; thus, the problem of model selection can be eased by combining linear ARIMA and nonlinear ANN models. Second, real-world time series are rarely purely linear or nonlinear and often contain both linear and nonlinear patterns, for which neither ARIMA nor ANN models alone are adequate; hence, the problem of modeling combined linear and nonlinear autocorrelation structures in time series can be addressed by combining linear ARIMA and nonlinear ANN models. Third, it is almost universally agreed in the forecasting literature that no single model is best in every situation, because a real-world problem is often complex in nature and any single model may not be able to capture different patterns equally well. Therefore, the chance of capturing different patterns in the data can be increased by combining different models. Despite all their advantages, these hybrid models rest on two assumptions [21] that will degrade their performance if the opposite situation occurs; therefore, they may be inadequate in some specific situations.
In this paper, ARIMA models are applied to construct a new hybrid model in order to overcome the above-mentioned limitation of artificial neural networks and to yield a more general and more accurate model than traditional hybrid ARIMA-ANN models. In our proposed model, a time series is
Artificial neural networks (ANNs) are flexible computing frameworks for modeling a broad range of nonlinear problems. One significant advantage of ANN models over other classes of nonlinear models is that ANNs are universal approximators that can approximate a large class of functions with a high degree of accuracy. Their power comes from the parallel processing of the information in the data. No prior assumption about the model form is required in the model-building process; instead, the network model is largely determined by the characteristics of the data. Multilayer perceptrons, especially those with a single hidden layer, are among the most widely used forms of artificial neural networks for time series modeling and forecasting. The model is characterized by a network of three layers of simple processing units connected by acyclic links.
The relationship between the output (y_t) and the inputs (y_{t-1}, . . ., y_{t-P}) has the mathematical representation given in Eq. (2) below. The general ARIMA model, by contrast, is written as

φ(B) ∇^d y_t = θ(B) a_t,   (1)

where y_t and a_t are the actual value and the random error at time t, respectively; φ(B) = 1 - Σ_{i=1}^{p} φ_i B^i and θ(B) = 1 - Σ_{j=1}^{q} θ_j B^j are polynomials in B of degree p and q; φ_i (i = 1, 2, . . ., p) and θ_j (j = 1, 2, . . ., q) are model parameters; ∇ = (1 - B), where B is the backward shift operator; p and q are integers often referred to as the orders of the model; and d is an integer often referred to as the order of differencing. The random errors a_t are assumed to be independently and identically distributed with mean zero and constant variance σ².
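As a numerical illustration (not code from the paper, and assuming only NumPy), the differencing operator ∇ = (1 - B) can be applied to a synthetic trended series; applying it d times removes a polynomial trend of degree d:

```python
import numpy as np

# The differencing operator (1 - B) maps y_t to y_t - y_{t-1}.
t = np.arange(10)
y = 3.0 + 2.0 * t            # synthetic series with a linear trend

z1 = np.diff(y, n=1)         # z_t = (1 - B) y_t  -> constant 2.0, trend removed
z2 = np.diff(y, n=2)         # (1 - B)^2 y_t      -> all zeros

print(z1)   # [2. 2. 2. 2. 2. 2. 2. 2. 2.]
print(z2)   # [0. 0. 0. 0. 0. 0. 0. 0.]
```

Each application of the operator shortens the series by one observation, which is why z1 has nine entries and z2 has eight.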
The Box-Jenkins [26] methodology comprises three iterative steps: model identification, parameter estimation, and diagnostic checking. The basic idea of model identification is that if a time series is generated by an ARIMA process, it should have certain theoretical autocorrelation properties. By matching the empirical autocorrelation patterns with the theoretical ones, it is often possible to identify one or several potential models for the given time series. Box and Jenkins [26] proposed using the autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the sample data as the basic tools for identifying the order of the ARIMA model. Other order-selection methods have been proposed based on validity criteria and information-theoretic approaches, such as Akaike's information criterion (AIC) [49] and the minimum description length (MDL) [50,51]. In addition, in recent years different approaches based on intelligent paradigms, such as neural networks [52], genetic algorithms [53,54], or fuzzy systems [55], have been proposed to improve the accuracy of order selection for ARIMA models.
In the identification step, data transformation is often required to make the time series stationary. Stationarity is a necessary condition for building an ARIMA model used for forecasting. A stationary time series is characterized by statistical properties, such as the mean and the autocorrelation structure, that are constant over time. When the observed time series presents trend and heteroscedasticity, differencing and power transformations are applied to the data to remove the trend and stabilize the variance before an ARIMA model can be fitted. Once a tentative model is identified, estimation of the model parameters is straightforward. The parameters are estimated such that an overall measure of error is minimized, which can be accomplished with a nonlinear optimization procedure. The last step in model building is the diagnostic checking of model adequacy; this basically checks whether the model assumptions about the errors, a_t, are satisfied. Several diagnostic statistics and plots of the residuals can be used to examine the goodness of fit of the tentatively entertained model to the historical data. If the model is not adequate, a new tentative model should be identified, followed again by parameter estimation and model verification. Diagnostic information may help suggest alternative model(s). This three-step model-building process is typically repeated several times until a satisfactory model is finally selected. The final selected model can then be used for prediction.
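As a small sketch of the identification step (an illustration, not code from the paper; only NumPy is assumed), the sample ACF can be computed directly and compared with the theoretical decay of a simulated AR(1) process:

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation r_k = c_k / c_0, where
    c_k = (1/N) * sum_t (x_t - mean)(x_{t+k} - mean)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    c0 = np.dot(xc, xc) / n
    return np.array([np.dot(xc[:n - k], xc[k:]) / (n * c0)
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
# AR(1) process x_t = 0.8 x_{t-1} + a_t; its theoretical ACF decays as 0.8^k.
x = np.zeros(5000)
for t in range(1, 5000):
    x[t] = 0.8 * x[t - 1] + rng.standard_normal()

r = sample_acf(x, 3)
print(r)  # roughly [1.0, 0.8, 0.64, 0.51]
```

Matching such an empirically estimated decay against the theoretical ACF/PACF shapes of candidate (p, q) orders is exactly the matching exercise described above.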
y_t = w_0 + Σ_{j=1}^{Q} w_j g( w_{0j} + Σ_{i=1}^{P} w_{i,j} y_{t-i} ) + e_t,   (2)

where the w's are connection weights, P is the number of input nodes, Q is the number of hidden nodes, and g(·) is the hidden-layer transfer function, commonly the logistic sigmoid

g(x) = 1 / (1 + exp(-x)).   (3)
Hence, the ANN model of (2) in fact performs a nonlinear functional mapping from past observations to the future value y_t, i.e.,

y_t = f(y_{t-1}, . . ., y_{t-P}, W) + e_t,   (4)

where W is a vector of all parameters and f(·) is a function determined by the network structure and connection weights. Thus, the neural network is equivalent to a nonlinear autoregressive model. Note that expression (2) implies one node in the output layer, which is typically used for one-step-ahead forecasting. The simple network given by (2) is surprisingly powerful in that it is able to approximate an arbitrary function when the number of hidden nodes Q is sufficiently large [1]. In practice, a simple network structure with a small number of hidden nodes often works well in out-of-sample forecasting. This may be due to the over-fitting effect typically found in the neural network modeling process: over-fitting occurs when the network has too many free parameters, which allow it to fit the training data well but typically lead to poor generalization. In addition, it has been shown experimentally that generalization ability begins to deteriorate when the network has been trained more than necessary, that is, when it begins to fit the noise in the training data [56].
The choice of Q is data dependent, and there is no systematic rule for deciding this parameter. In addition to choosing an appropriate number of hidden nodes, another important task in ANN modeling of a time series is the selection of the number of lagged observations, P, which sets the dimension of the input vector [57]. This is perhaps the most important parameter to be estimated in an ANN model, because it plays a major role in determining the (nonlinear) autocorrelation structure of the time series. However, there is no theory that can be used to guide the selection of P; hence, experiments are often conducted to select appropriate values of P as well as Q.
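The lagged-input structure of Eq. (4) can be made concrete with a small helper (an illustrative sketch, not from the paper; only NumPy is assumed) that turns a series into the input matrix a network with P input nodes would be trained on:

```python
import numpy as np

def make_lagged(y, P):
    """Design matrix for inputs y_{t-1}, ..., y_{t-P}: row t of X holds
    the P most recent lagged values, and targets[t] = y_t."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([y[P - i - 1 : len(y) - i - 1] for i in range(P)])
    targets = y[P:]
    return X, targets

X, targets = make_lagged(np.arange(6), P=2)
print(X.tolist())        # [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0], [4.0, 3.0]]
print(targets.tolist())  # [2.0, 3.0, 4.0, 5.0]
```

Candidate values of P (and Q) can then be compared by fitting a network to each lagged representation and measuring out-of-sample error, mirroring the experimental selection described above.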
4. Formulation of the proposed model
Despite the numerous time series models available, the accuracy of time series forecasting remains fundamental to many decision processes, and hence research into ways of improving the effectiveness of forecasting models has never been given up. Many studies in time series forecasting have argued that predictive performance improves in combined models [21]. In hybrid models, the aim is to reduce the risk of using an inappropriate model by combining several models, thereby reducing the risk of failure and obtaining more accurate results. Typically, this is done because the underlying process cannot easily be determined [58]. The motivation for combining models comes from the assumption that either one cannot identify the true data-generating process [59] or that a single model may not be sufficient to identify all the characteristics of the time series [27].
Hybrid techniques that decompose a time series into its linear and nonlinear components are among the most popular hybrid models and have recently been shown to be more successful than single models [27]. However, the danger in using these hybrid models is the assumption that the relationship between the linear and nonlinear components is additive; this may misrepresent the relationship between the components and degrade performance if the association between the linear and nonlinear elements is not additive but, for example, multiplicative. In addition, there is no guarantee that the residuals of the linear component comprise valid nonlinear patterns. Such assumptions are therefore likely to lead to unwanted degradation of performance if the opposite situation occurs [21].
Terui and van Dijk [59] indicate that these architectures do not always lead to better estimates than single models, and that combined forecasts do not necessarily dominate for all series; sometimes a linear model still produces better results. Hibon and Evgeniou [58] present experiments, using the 3003 series of the M3 competition, that challenge the belief that combining forecasts improves accuracy relative to individual forecasts. Their results indicate that the advantage of combining forecasts is not that the best possible combinations perform better than the best possible individual forecasts, but that it is less risky in practice to combine forecasts than to select an individual forecasting method. Taskaya and Casey [21] try to answer the question of whether the performance of hybrid models shows consistent improvement over single models. Their results show that hybrid models are not always better; hence, model selection remains an important step despite the popularity of hybrid models. They believe that, despite the popularity of hybrid models, which rely upon the success of their components, single models themselves can be sufficient.
In this paper, a novel hybridization of artificial neural networks and ARIMA models is proposed in order to overcome the above-mentioned limitations of traditional hybrid models, as well as the limitations of linear and nonlinear models, by using the unique advantages of ARIMA and ANN models in linear and nonlinear modeling, respectively. Our proposed model makes none of the above-mentioned assumptions of the traditional hybrid ARIMA-ANN models. In addition, in contrast to the traditional hybrid ARIMA-ANN models, we can guarantee that the performance of the proposed model will not be worse than that of either ARIMA or artificial neural networks used alone.
Based on previous work in the linear and nonlinear hybrid models literature, a time series can be considered to be composed of a linear autocorrelation structure and a nonlinear component. In this paper, a time series is therefore considered a function of a linear and a nonlinear component. Thus,

y_t = f(L_t, N_t),   (5)

where L_t denotes the linear component and N_t denotes the nonlinear component. These two components have to be estimated from
the data. In the first stage, the main aim is linear modeling; therefore,

L_t = Σ_{i=1}^{p} φ_i z_{t-i} - Σ_{j=1}^{q} θ_j a_{t-j} + e_t = L̂_t + e_t,   (6)
where L̂_t is the forecast value for time t from the estimated relationship (1), z_t = (1 - B)^d y_t, and e_t is the residual at time t from the linear model. Residuals are important in diagnosing the sufficiency of linear models: a linear model is not sufficient if linear correlation structures remain in the residuals. However, residual analysis is not able to detect nonlinear patterns in the data. In fact, there is currently no general diagnostic statistic for nonlinear autocorrelation relationships. Therefore, even if a model has passed diagnostic checking, it may still be inadequate in that nonlinear relationships have not been appropriately modeled. Any significant nonlinear pattern in the residuals indicates a limitation of the ARIMA model. The forecast values and residuals of the linear modeling are the results of the first stage and are used in the next stage. In addition, the linear patterns are magnified by the ARIMA model for use in the second stage.
In the second stage, the main aim is nonlinear modeling; therefore, a multilayer perceptron is used to model the nonlinear, and possibly linear, relationships existing in the residuals of the linear modeling and in the original data. Thus,
N_t^1 = f^1(e_{t-1}, . . ., e_{t-n}),   (7)

N_t^2 = f^2(z_{t-1}, . . ., z_{t-m}),   (8)

N_t = f(L̂_t, N_t^1, N_t^2),   (9)

i.e., N_t = f(L̂_t, f^1(e_{t-1}, . . ., e_{t-n}), f^2(z_{t-1}, . . ., z_{t-m})),   (10)
where f, f^1, and f^2 are nonlinear functions determined by the neural network, and n1 ≤ n and m1 ≤ m are integers determined in the design process of the final neural network. It must be noted that any of the above-mentioned variables e_i (i = t - 1, . . ., t - n), L̂_t, and z_j (j = t - 1, . . ., t - m), or the sets {e_i (i = t - 1, . . ., t - n)} or {z_i (i = t - 1, . . ., t - m)}, may be deleted in the design process of the final neural network. This may be related to the underlying data-generating process and to the linear and nonlinear structures existing in the data. For example, if the data consist of a purely linear structure, then the variables {e_i (i = t - 1, . . ., t - n)} will probably be deleted in favor of the other variables. In contrast, if the data consist of purely nonlinear structures, then the variable L̂_t will probably be deleted in favor of the other variables.
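To make the two-stage construction of Eqs. (6)-(10) concrete, the sketch below runs it on a synthetic series. It is an illustrative sketch under stated assumptions, not the paper's implementation: only NumPy is assumed, an AR(2) fitted by ordinary least squares stands in for full ARIMA estimation (with d = 0, so z_t = y_t), and it stops at assembling the stage-two input matrix that a multilayer perceptron would then be trained on:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic series with a linear (AR) part and a mild nonlinearity;
# it stands in for a real data set, which is not reproduced here.
N = 400
y = np.zeros(N)
for t in range(2, N):
    y[t] = (0.6 * y[t - 1] - 0.2 * y[t - 2]
            + 0.3 * np.sin(y[t - 1]) + 0.1 * rng.standard_normal())

# ---- Stage I: linear modeling (AR(2) via least squares; d = 0, z_t = y_t).
p = 2
z = y.copy()
Z = np.column_stack([z[p - 1:-1], z[p - 2:-2]])   # z_{t-1}, z_{t-2}
phi, *_ = np.linalg.lstsq(Z, z[p:], rcond=None)
L_hat = Z @ phi                                   # linear forecasts L̂_t
e = z[p:] - L_hat                                 # residuals e_t of Eq. (6)

# ---- Stage II: assemble the network inputs of Eqs. (7)-(10):
# L̂_t together with lagged residuals e_{t-1..t-n} and lagged data z_{t-1..t-m}.
n, m = 2, 2
zz = z[p:]                        # z aligned with L_hat and e
start = max(n, m)
rows = [np.concatenate(([L_hat[t]],
                        e[t - n:t][::-1],         # e_{t-1}, ..., e_{t-n}
                        zz[t - m:t][::-1]))       # z_{t-1}, ..., z_{t-m}
        for t in range(start, len(e))]
X = np.array(rows)
target = zz[start:]
print(X.shape, target.shape)      # (396, 5) (396,)
# A multilayer perceptron trained with X as inputs and `target` as outputs
# would estimate the function f of Eq. (9).
```

Dropping columns of X (the residual lags, the data lags, or L̂_t itself) during network design corresponds exactly to the variable-deletion possibilities discussed above.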
As previously mentioned, building ARIMA as well as ANN models often requires subjective judgment of the model order as well as of the model adequacy, so it is possible that suboptimal models will be used in the hybrid model. For example, current practice of the Box-Jenkins methodology focuses on low-order autocorrelation: a model is considered adequate if low-order autocorrelations are not significant, even though significant autocorrelations of higher order may still exist. This suboptimality may not affect the usefulness of the hybrid model. Granger [60] has pointed out that for a hybrid model to produce superior forecasts, the component models should be suboptimal; in general, it has been observed that it is more effective to combine individual forecasts that are based on different information sets [60].
Although the proposed model, like traditional hybrid ANN-ARIMA models, exploits the unique features and strengths of ARIMA
model, the sunspot data set is divided into two samples for training and testing. The training data set of 221 observations (1700-1920) is used exclusively to formulate the model, and the test sample of the last 67 observations (1921-1987) is then used to evaluate the performance of the established model.
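The chronological split described above can be expressed directly (a sketch; the placeholder array stands in for the actual sunspot values, which are not reproduced here):

```python
import numpy as np

# 288 annual observations cover 1700-1987 inclusive.
years = np.arange(1700, 1988)
sunspots = np.zeros(288)               # placeholder values, correct length

train, test = sunspots[:221], sunspots[221:]
train_years, test_years = years[:221], years[221:]
print(len(train), len(test))           # 221 67
print(train_years[-1], test_years[0])  # 1920 1921
```

Note that the split is by time order, never random: the test block must lie strictly after the training block for out-of-sample evaluation to be meaningful.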
Stage I. Linear modeling: Using the Eviews package software, the best-fitted model is an autoregressive model of order 9, AR(9), which has also been used by many researchers [27,61,63].
Stage II. Nonlinear modeling: In order to obtain the optimum network architecture, based on the concepts of artificial neural network design and using pruning algorithms in the MATLAB 7 package software, different network architectures are evaluated to compare the ANNs' performance. The best-fitted network, i.e. the architecture that presented the best forecasting accuracy on the test data, is composed of seven input, three hidden, and one output neurons (in abbreviated form, N(7-3-1)). The structure of the best-fitted network is shown in Fig. 2. The performance measures of the proposed model for the sunspot data are given in Table 1. The estimated values of the proposed model for the sunspot data set are plotted in Fig. 3. In addition, the estimated values of the ARIMA, ANN, and proposed models for the test data are plotted in Figs. 4-6, respectively.
5.2. The lynx series forecasts
The lynx series considered in this investigation contains the number of lynx trapped per year in the Mackenzie River district of Northern Canada. The data set is plotted in Fig. 7, which shows a periodicity of approximately 10 years [64]. The data set has 114 observations, corresponding to the period 1821-1934. It has
Table 1
Performance measures of the proposed model for sunspot data.

Train:  MSE 146.131   MAE 9.144
Test:   MSE 218.642   SSE 14649.024   RMSE 14.786   ME 3.743   MAPE 27.21   VAF 91.87   MAE 11.446
Fig. 3. Results obtained from the proposed model for sunspot data set.
Table 2
Performance measures of the proposed model for lynx data.

Train:  MSE 0.005   MAE 0.063
Test:   MSE 0.009   SSE 0.139   RMSE 0.099   ME 0.017   MAPE 2.827   VAF 94.17   MAE 0.085
Stage I. Linear modeling: The best-fitted model is an autoregressive model of order 12, AR(12), which has also been used by many researchers [27,63].
Stage II. Nonlinear modeling: As in the previous section, using pruning algorithms in the MATLAB 7 package software, the best-fitted network is composed of eight input, three hidden, and one output neurons (in abbreviated form, N(8-3-1)). The structure of the best-fitted network is shown in Fig. 8. The performance measures of the proposed model for the Canadian lynx data are given in Table 2. The estimated values of the proposed model for the Canadian lynx data set are plotted in Fig. 9. In addition, the estimated values of the ARIMA, ANN, and proposed models for the test data are plotted in Figs. 10-12, respectively.
Fig. 9. Results obtained from the proposed model for Canadian lynx data set.
Fig. 14. Structure of the best-fitted network (exchange rate), N(12-4-1).
Fig. 13. Weekly British pound/US dollar exchange rate series (1980-1993).
Fig. 15. Results obtained from the proposed model for exchange rate data set.
Fig. 16. ARIMA model prediction of exchange rate data set (test sample).
Fig. 17. ANN model prediction of exchange rate data set (test sample).
Fig. 18. Proposed model prediction of exchange rate data set (test sample).
MAE = (1/N) Σ_{i=1}^{N} |e_i|,   (11)

MSE = (1/N) Σ_{i=1}^{N} (e_i)².   (12)
Table 3
Performance measures of the proposed model for exchange rate data.

Train:  MSE 3.22 × 10⁻⁵   MAE 0.004
Test:   MSE 3.64 × 10⁻⁵   SSE 0.001   RMSE 0.006   ME 4.59 × 10⁻⁵   MAPE 0.283   VAF 64.19   MAE 0.004
Table 4
Comparison of the performance of the proposed model with those of other forecasting models (sunspot data set).

                        35 points ahead           67 points ahead
Model                   MAE       MSE             MAE        MSE
ARIMA                   11.319    216.965         13.033739  306.08217
ANN                     10.243    205.302         13.544365  351.19366
Zhang's hybrid model    10.831    186.827         12.780186  280.15956
Proposed model          8.847     129.425         11.446981  218.642153
Table 5
Percentage improvement of the proposed model in comparison with those of other forecasting models (sunspot data set).

                        35 points ahead           67 points ahead
Model                   MAE (%)   MSE (%)         MAE (%)    MSE (%)
ARIMA                   21.84     40.35           12.17      28.57
ANN                     13.63     36.96           15.49      37.74
Zhang's hybrid model    18.32     30.72           10.43      21.96
Table 6
Comparison of the performance of the proposed model with those of other forecasting models (Canadian lynx data).

Model                   MAE        MSE
ARIMA                   0.112255   0.020486
ANN                     0.112109   0.020466
Zhang's hybrid model    0.103972   0.017233
Proposed model          0.085055   0.00999
Table 7
Percentage improvement of the proposed model in comparison with those of other forecasting models (Canadian lynx data).

Model                   MAE (%)   MSE (%)
ARIMA                   24.23     51.23
ANN                     24.13     51.19
Zhang's hybrid model    18.19     42.03
Table 8
Comparison of the performance of the proposed model with those of other forecasting models (exchange rate data).ᵃ

                        1 month                  6 months                 12 months
Model                   MAE        MSE           MAE        MSE           MAE        MSE
ARIMA                   0.005016   3.68493       0.0060447  5.65747       0.0053579  4.52977
ANN                     0.004218   2.76375       0.0059458  5.71096       0.0052513  4.52657
Zhang's hybrid model    0.004146   2.67259       0.0058823  5.65507       0.0051212  4.35907
Proposed model          0.003972   2.39915       0.0053361  4.27822       0.0049691  3.64774

ᵃ All MSE values are to be multiplied by 10⁻⁵ (cf. Table 3).
Table 9
Percentage improvement of the proposed model in comparison with those of other forecasting models (exchange rate data).

                        1 month                  6 months                 12 months
Model                   MAE (%)   MSE (%)        MAE (%)   MSE (%)        MAE (%)   MSE (%)
ARIMA                   20.81     34.89          11.72     24.38          7.26      19.47
ANN                     5.83      13.19          10.25     25.09          5.37      19.41
Zhang's hybrid model    4.20      10.23          9.29      24.35          2.97      16.32
both ARIMA and ANN models for the longer time horizons (6 and 12 months). However, our proposed model significantly outperforms the ARIMA, ANN, and Zhang's hybrid models across all three time horizons and under both error measures.
6. Conclusions
Improving forecasting accuracy, especially in time series forecasting, is an important yet often difficult task facing decision makers in many areas. Despite the numerous time series models available, research into improving the effectiveness of forecasting models has never stopped. Several large-scale forecasting competitions involving a large number of commonly used time series forecasting models conclude that combining forecasts from more than one model often leads to improved performance, especially when the models in the ensemble are quite different. Artificial neural networks (ANNs) have been shown to be an effective, general-purpose approach to pattern recognition, classification, clustering, and especially prediction, with a high degree of accuracy. Nevertheless, using ANNs to model linear problems has yielded mixed results, and hence it is unreasonable to apply ANNs blindly to any type of data.
Hybrid techniques that decompose a time series into its linear and nonlinear components are among the most popular categories of hybrid models and have been shown to be more successful than single models. These models jointly use linear and nonlinear models in order to capture different forms of relationship in the time series data. Although numerous studies have shown that hybrid ARIMA-ANN models are able to outperform each of their components used in isolation, others have reported inconsistent results. Some researchers believe that the assumptions underlying the construction of these hybrid models can degrade their performance if the opposite situation occurs.
In this paper, a novel hybridization of artificial neural networks and ARIMA models is proposed in order to overcome the above-mentioned limitation of ANNs and to yield a more general and more accurate forecasting model than traditional hybrid ARIMA-ANN models. Our proposed model uses the unique capability of ARIMA models in linear modeling to identify and magnify the existing linear structure in the data; a multilayer perceptron is then used to determine a model that captures the underlying data-generating process and predicts future values using the preprocessed data. Empirical results with three well-known real data sets indicate that the proposed model can be an effective way to obtain a more general and more accurate model than Zhang's hybrid model and than either of the component models used separately, and it can thus serve as an alternative tool for time series forecasting.
Acknowledgements
The authors wish to express their gratitude to anonymous
referees and Seyed Reza Hejazi, assistant professor of industrial
engineering, Isfahan University of Technology, for their insightful
and constructive comments, which helped to improve the paper
greatly.
References
[1] G. Zhang, B.E. Patuwo, M.Y. Hu, Forecasting with artificial neural networks: the state of the art, International Journal of Forecasting 14 (1998) 35-62.
[2] D. Rumelhart, J. McClelland, Parallel Distributed Processing, MIT Press, Cambridge, MA, 1986.
[3] G. Cybenko, Approximations by superpositions of a sigmoidal function, Mathematics of Control, Signals, and Systems 2 (1989) 303-314.
[4] K. Hornik, M. Stinchcombe, H. White, Multi-layer feedforward networks are universal approximators, Neural Networks 2 (1989) 359-366.
[56] N. Morgan, H. Bourlard, Generalization and parameter estimation in feedforward nets: some experiments, in: D.S. Touretzky (Ed.), Advances in Neural Information Processing Systems, vol. 2, 1990, pp. 630-637.
[57] S. Thawornwong, D. Enke, The adaptive selection of financial and economic variables for use with artificial neural networks, Neurocomputing 31 (2000) 1-13.
[58] M. Hibon, T. Evgeniou, To combine or not to combine: selecting among forecasts and their combinations, International Journal of Forecasting 21 (2005) 15-24.
[59] N. Terui, H. van Dijk, Combined forecasts from linear and nonlinear time series models, International Journal of Forecasting 18 (2002) 421-438.
[60] C.W.J. Granger, Combining forecasts: twenty years later, Journal of Forecasting 8 (1989) 167-173.
[61] K.W. Hipel, A.I. McLeod, Time Series Modelling of Water Resources and Environmental Systems, Elsevier, Amsterdam, 1994.
[62] M. Ghiassi, H. Saidane, A dynamic architecture for artificial neural networks, Neurocomputing 63 (2005) 397-413.
[63] T. Subba Rao, M.M. Gabr, An Introduction to Bispectral Analysis and Bilinear Time Series Models, Lecture Notes in Statistics, vol. 24, Springer-Verlag, New York, 1984.
[64] L. Stone, D. He, Chaotic oscillations and cycles in multi-trophic ecological systems, Journal of Theoretical Biology 248 (2007) 382-390.
[65] Y. Tang, S. Ghosal, A consistent nonparametric Bayesian procedure for estimating autoregressive conditional densities, Computational Statistics & Data Analysis 51 (2007) 4424-4437.
[66] T. Lin, M. Pourahmadi, Nonparametric and non-linear models and data mining in time series: a case study in the Canadian lynx data, Applied Statistics 47 (1998) 187-201.
[67] P. Cornillon, W. Imam, E. Matzner, Forecasting time series using principal component analysis with respect to instrumental variables, Computational Statistics & Data Analysis 52 (2008) 1269-1280.
[68] M.J. Campbell, A.M. Walker, A survey of statistical work on the MacKenzie River series of annual Canadian lynx trappings for the years 1821-1934, and a new analysis, Journal of the Royal Statistical Society Series A 140 (1977) 411-431.
[69] C.S. Wong, W.K. Li, On a mixture autoregressive model, Journal of the Royal Statistical Society Series B 62 (1) (2000) 91-115.
[70] R.A. Meese, K. Rogoff, Empirical exchange rate models of the seventies: do they fit out of sample? Journal of International Economics 14 (1983) 3-24.
[71] A. Timmermann, C.W.J. Granger, Efficient market hypothesis and forecasting, International Journal of Forecasting 20 (2004) 15-27.