A novel hybridization of artificial neural networks and ARIMA models for time series forecasting
Mehdi Khashei, Mehdi Bijari
Department of Industrial Engineering, Isfahan University of Technology, Isfahan, Iran
Article info
Article history:
Received 18 June 2008
Received in revised form 8 August 2010
Accepted 28 October 2010
Available online 4 November 2010
Keywords:
Artificial neural networks (ANNs)
Auto-regressive integrated moving average (ARIMA)
Time series forecasting
Hybrid models
Abstract
Improving forecasting accuracy, especially in time series forecasting, is an important yet often difficult task facing decision makers in many areas. Both theoretical and empirical findings have indicated that integrating different models can be an effective way of improving upon their predictive performance, especially when the models being combined are quite different. Artificial neural networks (ANNs) are flexible computing frameworks and universal approximators that can be applied to a wide range of forecasting problems with a high degree of accuracy. However, using ANNs to model linear problems has yielded mixed results, and hence it is not wise to apply ANNs blindly to any type of data. Auto-regressive integrated moving average (ARIMA) models are among the most popular linear models in time series forecasting and have been widely applied to construct more accurate hybrid models during the past decade. Although hybrid techniques that decompose a time series into its linear and nonlinear components have recently been shown to be more successful than single models, these models have some disadvantages. In this paper, a novel hybridization of artificial neural networks and ARIMA models is proposed in order to overcome the mentioned limitation of ANNs and to yield a more general and more accurate forecasting model than traditional hybrid ARIMA-ANN models. In our proposed model, the unique advantages of ARIMA models in linear modeling are used to identify and magnify the existing linear structure in the data, and then a neural network is used to determine a model that captures the underlying data-generating process and to predict, using the preprocessed data. Empirical results with three well-known real data sets indicate that the proposed model can be an effective way to improve upon the forecasting accuracy achieved by traditional hybrid models and also by either of the component models used separately.
© 2010 Elsevier B.V. All rights reserved.
1. Introduction
Time series forecasting is an active research area that has drawn considerable attention for applications in a variety of fields. With the time series approach to forecasting, historical observations of the same variable are analyzed to develop a model describing the underlying relationship. The established model is then used to extrapolate the time series into the future. This modeling approach is particularly useful when little knowledge is available about the underlying data-generating process, or when there is no satisfactory explanatory model relating the prediction variable to other explanatory variables. Over the past several decades, much effort has been devoted to the development and improvement of time series forecasting models [1].
Artificial neural networks (ANNs) are one of the most important types of nonparametric nonlinear time series models that have been proposed and examined for time series forecasting. The basic
Corresponding author. Tel.: +98 311 3912550/1; fax: +98 311 3915526.
E-mail address: khashei@in.iut.ac.ir (M. Khashei).
1568-4946/$ - see front matter © 2010 Elsevier B.V. All rights reserved.
doi:10.1016/j.asoc.2010.10.015
[25]. In ARIMA analysis, an identified underlying process is generated based on observations of a time series, with the aim of constructing a model that represents the process-generating mechanism precisely.
Box and Jenkins [26] provided a step-by-step procedure for ARMA analysis, in which AR coefficients multiply past values of the time series and MA coefficients multiply past random shocks. The popularity of the ARIMA model is due to its statistical properties as well as to the well-known Box-Jenkins methodology used in the model-building process. In addition, ARIMA models [22] can implement various exponential smoothing models. Although ARIMA models are quite flexible in that they can represent several different types of time series, their major limitation is the pre-assumed linear form of the model. ARIMA models assume that future values of a time series have a linear relationship with current and past values as well as with white noise, so approximations by ARIMA models may not be adequate for complex nonlinear real-world problems. Real-world systems are often nonlinear [1]; thus, it is unreasonable to assume that a particular realization of a given time series is generated by a linear process.
Both ANNs and ARIMA models have achieved success in their own linear or nonlinear domains, but neither is a universal model suitable for all circumstances. Applying ARIMA models to complex nonlinear problems, or ANNs to linear problems, may be totally inappropriate, as may applying either alone to problems that contain both linear and nonlinear correlation structures. Using hybrid models, or combining several models, has become common practice in order to overcome the limitations of the component models and improve forecasting accuracy. In addition, since it is difficult to completely know the characteristics of the data in a real problem, a hybrid methodology that has both linear and nonlinear modeling capabilities can be a good strategy for practical use.
Hybrid techniques that decompose a time series into its linear and nonlinear components form one of the most popular categories of hybrid models and have been shown to be more successful than single models. Zhang [27] presented a hybrid ARIMA-ANN approach to time series forecasting using this technique. Zhang's hybrid model jointly uses the linear ARIMA model and the nonlinear multilayer perceptron in order to capture different forms of relationship in the time series data. The motivation for Zhang's hybrid model comes from the following perspectives. First, it is often difficult in practice to determine whether a time series under study is generated by a linear or a nonlinear underlying process; thus, the problem of model selection can be eased by combining linear ARIMA and nonlinear ANN models. Second, real-world time series are rarely purely linear or nonlinear and often contain both linear and nonlinear patterns, for which neither ARIMA nor ANN models alone are adequate; hence, the problem of modeling combined linear and nonlinear autocorrelation structures in time series can be addressed by combining linear ARIMA and nonlinear ANN models. Third, it is almost universally agreed in the forecasting literature that no single model is best in every situation, because a real-world problem is often complex in nature and any single model may not be able to capture different patterns equally well. Therefore, the chance of capturing different patterns in the data can be increased by combining different models. Despite all their advantages, these hybrid models rest on two assumptions [21] that will degrade their performance if the opposite situation occurs; therefore, they may be inadequate in some specific situations.
In this paper, ARIMA models are applied to construct a new hybrid model in order to overcome the above-mentioned limitation of artificial neural networks and to yield a more general and more accurate model than traditional hybrid ARIMA-ANN models. In our proposed model, a time series is
Artificial neural networks (ANNs) are flexible computing frameworks for modeling a broad range of nonlinear problems. One significant advantage of ANN models over other classes of nonlinear models is that ANNs are universal approximators that can approximate a large class of functions with a high degree of accuracy. Their power comes from the parallel processing of the information in the data. No prior assumption about the model form is required in the model-building process; instead, the network model is largely determined by the characteristics of the data. Multilayer perceptrons, especially those with a single hidden layer, are among the most widely used forms of artificial neural networks for time series modeling and forecasting. The model is characterized by a network of three layers of simple processing units connected by acyclic links.
The relationship between the output (y_t) and the inputs (y_{t-1}, . . ., y_{t-P}) has the mathematical representation given in Eq. (2) below. The general ARIMA model, by contrast, is written as

φ(B) ∇^d y_t = θ(B) a_t,   (1)

where y_t and a_t are the actual value and the random error at time t, respectively; φ(B) = 1 - Σ_{i=1}^{p} φ_i B^i and θ(B) = 1 - Σ_{j=1}^{q} θ_j B^j are polynomials in B of degree p and q; φ_i (i = 1, 2, . . ., p) and θ_j (j = 1, 2, . . ., q) are model parameters; ∇ = (1 - B), where B is the backward shift operator; p and q are integers often referred to as the orders of the model; and d is an integer often referred to as the order of differencing. The random errors a_t are assumed to be independently and identically distributed with mean zero and constant variance σ².
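As a numerical illustration (not code from the paper, and assuming only NumPy), the differencing operator ∇ = (1 - B) can be applied to a synthetic trended series; applying it d times removes a polynomial trend of degree d:

```python
import numpy as np

# The differencing operator (1 - B) maps y_t to y_t - y_{t-1}.
t = np.arange(10)
y = 3.0 + 2.0 * t            # synthetic series with a linear trend

z1 = np.diff(y, n=1)         # z_t = (1 - B) y_t  -> constant 2.0, trend removed
z2 = np.diff(y, n=2)         # (1 - B)^2 y_t      -> all zeros

print(z1)   # [2. 2. 2. 2. 2. 2. 2. 2. 2.]
print(z2)   # [0. 0. 0. 0. 0. 0. 0. 0.]
```

Each application of the operator shortens the series by one observation, which is why z1 has nine entries and z2 has eight.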
The Box-Jenkins [26] methodology comprises three iterative steps: model identification, parameter estimation, and diagnostic checking. The basic idea of model identification is that if a time series is generated by an ARIMA process, it should have certain theoretical autocorrelation properties. By matching the empirical autocorrelation patterns with the theoretical ones, it is often possible to identify one or several potential models for the given time series. Box and Jenkins [26] proposed using the autocorrelation function (ACF) and the partial autocorrelation function (PACF) of the sample data as the basic tools for identifying the order of the ARIMA model. Other order-selection methods have been proposed based on validity criteria and information-theoretic approaches, such as Akaike's information criterion (AIC) [49] and the minimum description length (MDL) [50,51]. In addition, in recent years different approaches based on intelligent paradigms, such as neural networks [52], genetic algorithms [53,54], or fuzzy systems [55], have been proposed to improve the accuracy of order selection for ARIMA models.
In the identification step, data transformation is often required to make the time series stationary. Stationarity is a necessary condition for building an ARIMA model used for forecasting. A stationary time series is characterized by statistical properties, such as the mean and the autocorrelation structure, that are constant over time. When the observed time series presents trend and heteroscedasticity, differencing and power transformations are applied to the data to remove the trend and stabilize the variance before an ARIMA model can be fitted. Once a tentative model is identified, estimation of the model parameters is straightforward. The parameters are estimated such that an overall measure of error is minimized, which can be accomplished with a nonlinear optimization procedure. The last step in model building is the diagnostic checking of model adequacy; this basically checks whether the model assumptions about the errors, a_t, are satisfied. Several diagnostic statistics and plots of the residuals can be used to examine the goodness of fit of the tentatively entertained model to the historical data. If the model is not adequate, a new tentative model should be identified, followed again by parameter estimation and model verification. Diagnostic information may help suggest alternative model(s). This three-step model-building process is typically repeated several times until a satisfactory model is finally selected. The final selected model can then be used for prediction.
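As a small sketch of the identification step (an illustration, not code from the paper; only NumPy is assumed), the sample ACF can be computed directly and compared with the theoretical decay of a simulated AR(1) process:

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation r_k = c_k / c_0, where
    c_k = (1/N) * sum_t (x_t - mean)(x_{t+k} - mean)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    c0 = np.dot(xc, xc) / n
    return np.array([np.dot(xc[:n - k], xc[k:]) / (n * c0)
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
# AR(1) process x_t = 0.8 x_{t-1} + a_t; its theoretical ACF decays as 0.8^k.
x = np.zeros(5000)
for t in range(1, 5000):
    x[t] = 0.8 * x[t - 1] + rng.standard_normal()

r = sample_acf(x, 3)
print(r)  # roughly [1.0, 0.8, 0.64, 0.51]
```

Matching such an empirically estimated decay against the theoretical ACF/PACF shapes of candidate (p, q) orders is exactly the matching exercise described above.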
y_t = w_0 + Σ_{j=1}^{Q} w_j g( w_{0j} + Σ_{i=1}^{P} w_{i,j} y_{t-i} ) + e_t,   (2)

where the w's are connection weights, P is the number of input nodes, Q is the number of hidden nodes, and g(·) is the hidden-layer transfer function, commonly the logistic sigmoid

g(x) = 1 / (1 + exp(-x)).   (3)
Hence, the ANN model of (2) in fact performs a nonlinear functional mapping from past observations to the future value y_t, i.e.,

y_t = f(y_{t-1}, . . ., y_{t-P}, W) + e_t,   (4)

where W is a vector of all parameters and f(·) is a function determined by the network structure and connection weights. Thus, the neural network is equivalent to a nonlinear autoregressive model. Note that expression (2) implies one node in the output layer, which is typically used for one-step-ahead forecasting. The simple network given by (2) is surprisingly powerful in that it is able to approximate an arbitrary function when the number of hidden nodes Q is sufficiently large [1]. In practice, a simple network structure with a small number of hidden nodes often works well in out-of-sample forecasting. This may be due to the over-fitting effect typically found in the neural network modeling process: over-fitting occurs when the network has too many free parameters, which allow it to fit the training data well but typically lead to poor generalization. In addition, it has been shown experimentally that generalization ability begins to deteriorate when the network has been trained more than necessary, that is, when it begins to fit the noise in the training data [56].
The choice of Q is data dependent, and there is no systematic rule for deciding this parameter. In addition to choosing an appropriate number of hidden nodes, another important task in ANN modeling of a time series is the selection of the number of lagged observations, P, which sets the dimension of the input vector [57]. This is perhaps the most important parameter to be estimated in an ANN model, because it plays a major role in determining the (nonlinear) autocorrelation structure of the time series. However, there is no theory that can be used to guide the selection of P; hence, experiments are often conducted to select appropriate values of P as well as Q.
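The lagged-input structure of Eq. (4) can be made concrete with a small helper (an illustrative sketch, not from the paper; only NumPy is assumed) that turns a series into the input matrix a network with P input nodes would be trained on:

```python
import numpy as np

def make_lagged(y, P):
    """Design matrix for inputs y_{t-1}, ..., y_{t-P}: row t of X holds
    the P most recent lagged values, and targets[t] = y_t."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([y[P - i - 1 : len(y) - i - 1] for i in range(P)])
    targets = y[P:]
    return X, targets

X, targets = make_lagged(np.arange(6), P=2)
print(X.tolist())        # [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0], [4.0, 3.0]]
print(targets.tolist())  # [2.0, 3.0, 4.0, 5.0]
```

Candidate values of P (and Q) can then be compared by fitting a network to each lagged representation and measuring out-of-sample error, mirroring the experimental selection described above.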
4. Formulation of the proposed model
Despite the numerous time series models available, the accuracy of time series forecasting remains fundamental to many decision processes, and hence research into ways of improving the effectiveness of forecasting models has never been given up. Many studies in time series forecasting have argued that predictive performance improves in combined models [21]. In hybrid models, the aim is to reduce the risk of using an inappropriate model by combining several models, thereby reducing the risk of failure and obtaining more accurate results. Typically, this is done because the underlying process cannot easily be determined [58]. The motivation for combining models comes from the assumption that either one cannot identify the true data-generating process [59] or that a single model may not be sufficient to identify all the characteristics of the time series [27].
Hybrid techniques that decompose a time series into its linear and nonlinear components are among the most popular hybrid models and have recently been shown to be more successful than single models [27]. However, the danger in using these hybrid models is the assumption that the relationship between the linear and nonlinear components is additive; this may misrepresent the relationship between the components and degrade performance if the association between the linear and nonlinear elements is not additive but, for example, multiplicative. In addition, there is no guarantee that the residuals of the linear component comprise valid nonlinear patterns. Such assumptions are therefore likely to lead to unwanted degradation of performance if the opposite situation occurs [21].
Terui and van Dijk [59] indicate that these architectures do not always lead to better estimates than single models, and that combined forecasts do not necessarily dominate for all series; sometimes a linear model still produces better results. Hibon and Evgeniou [58] present experiments, using the 3003 series of the M3 competition, that challenge the belief that combining forecasts improves accuracy relative to individual forecasts. Their results indicate that the advantage of combining forecasts is not that the best possible combinations perform better than the best possible individual forecasts, but that it is less risky in practice to combine forecasts than to select an individual forecasting method. Taskaya and Casey [21] try to answer the question of whether the performance of hybrid models shows consistent improvement over single models. Their results show that hybrid models are not always better; hence, model selection remains an important step despite the popularity of hybrid models. They believe that, despite the popularity of hybrid models, which rely upon the success of their components, single models themselves can be sufficient.
In this paper, a novel hybridization of artificial neural networks and ARIMA models is proposed in order to overcome the above-mentioned limitations of traditional hybrid models, as well as the limitations of linear and nonlinear models, by using the unique advantages of ARIMA and ANN models in linear and nonlinear modeling, respectively. Our proposed model makes none of the above-mentioned assumptions of the traditional hybrid ARIMA-ANN models. In addition, in contrast to the traditional hybrid ARIMA-ANN models, we can guarantee that the performance of the proposed model will not be worse than that of either ARIMA or artificial neural networks used alone.
Based on previous work in the linear and nonlinear hybrid models literature, a time series can be considered to be composed of a linear autocorrelation structure and a nonlinear component. In this paper, a time series is therefore considered a function of a linear and a nonlinear component. Thus,

y_t = f(L_t, N_t),   (5)

where L_t denotes the linear component and N_t denotes the nonlinear component. These two components have to be estimated from
the data. In the first stage, the main aim is linear modeling; therefore,

L_t = Σ_{i=1}^{p} φ_i z_{t-i} - Σ_{j=1}^{q} θ_j a_{t-j} + e_t = L̂_t + e_t,   (6)
where L̂_t is the forecast value for time t from the estimated relationship (1), z_t = (1 - B)^d y_t, and e_t is the residual at time t from the linear model. Residuals are important in diagnosing the sufficiency of linear models: a linear model is not sufficient if linear correlation structures remain in the residuals. However, residual analysis is not able to detect nonlinear patterns in the data. In fact, there is currently no general diagnostic statistic for nonlinear autocorrelation relationships. Therefore, even if a model has passed diagnostic checking, it may still be inadequate in that nonlinear relationships have not been appropriately modeled. Any significant nonlinear pattern in the residuals indicates a limitation of the ARIMA model. The forecast values and residuals of the linear modeling are the results of the first stage and are used in the next stage. In addition, the linear patterns are magnified by the ARIMA model for use in the second stage.
In the second stage, the main aim is nonlinear modeling; therefore, a multilayer perceptron is used to model the nonlinear, and possibly linear, relationships existing in the residuals of the linear modeling and in the original data. Thus,
N_t^1 = f^1(e_{t-1}, . . ., e_{t-n}),   (7)

N_t^2 = f^2(z_{t-1}, . . ., z_{t-m}),   (8)

N_t = f(L̂_t, N_t^1, N_t^2),   (9)

i.e., N_t = f(L̂_t, f^1(e_{t-1}, . . ., e_{t-n}), f^2(z_{t-1}, . . ., z_{t-m})),   (10)
where f, f^1, and f^2 are nonlinear functions determined by the neural network, and n1 ≤ n and m1 ≤ m are integers determined in the design process of the final neural network. It must be noted that any of the above-mentioned variables e_i (i = t - 1, . . ., t - n), L̂_t, and z_j (j = t - 1, . . ., t - m), or the sets {e_i (i = t - 1, . . ., t - n)} or {z_i (i = t - 1, . . ., t - m)}, may be deleted in the design process of the final neural network. This may be related to the underlying data-generating process and to the linear and nonlinear structures existing in the data. For example, if the data consist of a purely linear structure, then the variables {e_i (i = t - 1, . . ., t - n)} will probably be deleted in favor of the other variables. In contrast, if the data consist of purely nonlinear structures, then the variable L̂_t will probably be deleted in favor of the other variables.
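To make the two-stage construction of Eqs. (6)-(10) concrete, the sketch below runs it on a synthetic series. It is an illustrative sketch under stated assumptions, not the paper's implementation: only NumPy is assumed, an AR(2) fitted by ordinary least squares stands in for full ARIMA estimation (with d = 0, so z_t = y_t), and it stops at assembling the stage-two input matrix that a multilayer perceptron would then be trained on:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic series with a linear (AR) part and a mild nonlinearity;
# it stands in for a real data set, which is not reproduced here.
N = 400
y = np.zeros(N)
for t in range(2, N):
    y[t] = (0.6 * y[t - 1] - 0.2 * y[t - 2]
            + 0.3 * np.sin(y[t - 1]) + 0.1 * rng.standard_normal())

# ---- Stage I: linear modeling (AR(2) via least squares; d = 0, z_t = y_t).
p = 2
z = y.copy()
Z = np.column_stack([z[p - 1:-1], z[p - 2:-2]])   # z_{t-1}, z_{t-2}
phi, *_ = np.linalg.lstsq(Z, z[p:], rcond=None)
L_hat = Z @ phi                                   # linear forecasts L̂_t
e = z[p:] - L_hat                                 # residuals e_t of Eq. (6)

# ---- Stage II: assemble the network inputs of Eqs. (7)-(10):
# L̂_t together with lagged residuals e_{t-1..t-n} and lagged data z_{t-1..t-m}.
n, m = 2, 2
zz = z[p:]                        # z aligned with L_hat and e
start = max(n, m)
rows = [np.concatenate(([L_hat[t]],
                        e[t - n:t][::-1],         # e_{t-1}, ..., e_{t-n}
                        zz[t - m:t][::-1]))       # z_{t-1}, ..., z_{t-m}
        for t in range(start, len(e))]
X = np.array(rows)
target = zz[start:]
print(X.shape, target.shape)      # (396, 5) (396,)
# A multilayer perceptron trained with X as inputs and `target` as outputs
# would estimate the function f of Eq. (9).
```

Dropping columns of X (the residual lags, the data lags, or L̂_t itself) during network design corresponds exactly to the variable-deletion possibilities discussed above.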
As previously mentioned, building ARIMA as well as ANN models often requires subjective judgment of the model order as well as of the model adequacy, so it is possible that suboptimal models will be used in the hybrid model. For example, current practice of the Box-Jenkins methodology focuses on low-order autocorrelation: a model is considered adequate if low-order autocorrelations are not significant, even though significant autocorrelations of higher order may still exist. This suboptimality may not affect the usefulness of the hybrid model. Granger [60] has pointed out that for a hybrid model to produce superior forecasts, the component models should be suboptimal; in general, it has been observed that it is more effective to combine individual forecasts that are based on different information sets [60].
Although the proposed model, like traditional hybrid ANN-ARIMA models, exploits the unique features and strengths of ARIMA
model, the sunspot data set is divided into two samples for training and testing. The training data set of 221 observations (1700-1920) is used exclusively to formulate the model, and the test sample of the last 67 observations (1921-1987) is then used to evaluate the performance of the established model.
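The chronological split described above can be expressed directly (a sketch; the placeholder array stands in for the actual sunspot values, which are not reproduced here):

```python
import numpy as np

# 288 annual observations cover 1700-1987 inclusive.
years = np.arange(1700, 1988)
sunspots = np.zeros(288)               # placeholder values, correct length

train, test = sunspots[:221], sunspots[221:]
train_years, test_years = years[:221], years[221:]
print(len(train), len(test))           # 221 67
print(train_years[-1], test_years[0])  # 1920 1921
```

Note that the split is by time order, never random: the test block must lie strictly after the training block for out-of-sample evaluation to be meaningful.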
Stage I. Linear modeling: Using the Eviews package software, the best-fitted model is an autoregressive model of order 9, AR(9), which has also been used by many researchers [27,61,63].
Stage II. Nonlinear modeling: In order to obtain the optimum network architecture, based on the concepts of artificial neural network design and using pruning algorithms in the MATLAB 7 package software, different network architectures are evaluated to compare the ANNs' performance. The best-fitted network, i.e. the architecture that presented the best forecasting accuracy on the test data, is composed of seven input, three hidden, and one output neurons (in abbreviated form, N(7-3-1)). The structure of the best-fitted network is shown in Fig. 2. The performance measures of the proposed model for the sunspot data are given in Table 1. The estimated values of the proposed model for the sunspot data set are plotted in Fig. 3. In addition, the estimated values of the ARIMA, ANN, and proposed models for the test data are plotted in Figs. 4-6, respectively.
5.2. The lynx series forecasts
The lynx series considered in this investigation contains the number of lynx trapped per year in the Mackenzie River district of Northern Canada. The data set is plotted in Fig. 7, which shows a periodicity of approximately 10 years [64]. The data set has 114 observations, corresponding to the period 1821-1934. It has
Table 1
Performance measures of the proposed model for sunspot data.

Train:  MSE 146.131   MAE 9.144
Test:   MSE 218.642   SSE 14649.024   RMSE 14.786   ME 3.743   MAPE 27.21   VAF 91.87   MAE 11.446
Fig. 3. Results obtained from the proposed model for sunspot data set.
Table 2
Performance measures of the proposed model for lynx data.

Train:  MSE 0.005   MAE 0.063
Test:   MSE 0.009   SSE 0.139   RMSE 0.099   ME 0.017   MAPE 2.827   VAF 94.17   MAE 0.085
Stage I. Linear modeling: The best-fitted model is an autoregressive model of order 12, AR(12), which has also been used by many researchers [27,63].
Stage II. Nonlinear modeling: As in the previous section, using pruning algorithms in the MATLAB 7 package software, the best-fitted network is composed of eight input, three hidden, and one output neurons (in abbreviated form, N(8-3-1)). The structure of the best-fitted network is shown in Fig. 8. The performance measures of the proposed model for the Canadian lynx data are given in Table 2. The estimated values of the proposed model for the Canadian lynx data set are plotted in Fig. 9. In addition, the estimated values of the ARIMA, ANN, and proposed models for the test data are plotted in Figs. 10-12, respectively.
Fig. 9. Results obtained from the proposed model for Canadian lynx data set.
Fig. 14. Structure of the best-fitted network (exchange rate), N(12-4-1).
Fig. 13. Weekly British pound/US dollar exchange rate series (1980-1993).
Fig. 15. Results obtained from the proposed model for exchange rate data set.
Fig. 16. ARIMA model prediction of exchange rate data set (test sample).
Fig. 17. ANN model prediction of exchange rate data set (test sample).
Fig. 18. Proposed model prediction of exchange rate data set (test sample).
MAE = (1/N) Σ_{i=1}^{N} |e_i|,   (11)

MSE = (1/N) Σ_{i=1}^{N} (e_i)².   (12)
Table 3
Performance measures of the proposed model for exchange rate data.

Train:  MSE 3.22 × 10⁻⁵   MAE 0.004
Test:   MSE 3.64 × 10⁻⁵   SSE 0.001   RMSE 0.006   ME 4.59 × 10⁻⁵   MAPE 0.283   VAF 64.19   MAE 0.004
Table 4
Comparison of the performance of the proposed model with those of other forecasting models (sunspot data set).

                        35 points ahead           67 points ahead
Model                   MAE       MSE             MAE        MSE
ARIMA                   11.319    216.965         13.033739  306.08217
ANN                     10.243    205.302         13.544365  351.19366
Zhang's hybrid model    10.831    186.827         12.780186  280.15956
Proposed model          8.847     129.425         11.446981  218.642153
Table 5
Percentage improvement of the proposed model in comparison with those of other forecasting models (sunspot data set).

                        35 points ahead           67 points ahead
Model                   MAE (%)   MSE (%)         MAE (%)    MSE (%)
ARIMA                   21.84     40.35           12.17      28.57
ANN                     13.63     36.96           15.49      37.74
Zhang's hybrid model    18.32     30.72           10.43      21.96
Table 6
Comparison of the performance of the proposed model with those of other forecasting models (Canadian lynx data).

Model                   MAE        MSE
ARIMA                   0.112255   0.020486
ANN                     0.112109   0.020466
Zhang's hybrid model    0.103972   0.017233
Proposed model          0.085055   0.00999
Table 7
Percentage improvement of the proposed model in comparison with those of other forecasting models (Canadian lynx data).

Model                   MAE (%)   MSE (%)
ARIMA                   24.23     51.23
ANN                     24.13     51.19
Zhang's hybrid model    18.19     42.03
Table 8
Comparison of the performance of the proposed model with those of other forecasting models (exchange rate data).ᵃ

                        1 month                  6 months                 12 months
Model                   MAE        MSE           MAE        MSE           MAE        MSE
ARIMA                   0.005016   3.68493       0.0060447  5.65747       0.0053579  4.52977
ANN                     0.004218   2.76375       0.0059458  5.71096       0.0052513  4.52657
Zhang's hybrid model    0.004146   2.67259       0.0058823  5.65507       0.0051212  4.35907
Proposed model          0.003972   2.39915       0.0053361  4.27822       0.0049691  3.64774

ᵃ All MSE values are to be multiplied by 10⁻⁵ (cf. Table 3).
Table 9
Percentage improvement of the proposed model in comparison with those of other forecasting models (exchange rate data).

                        1 month                  6 months                 12 months
Model                   MAE (%)   MSE (%)        MAE (%)   MSE (%)        MAE (%)   MSE (%)
ARIMA                   20.81     34.89          11.72     24.38          7.26      19.47
ANN                     5.83      13.19          10.25     25.09          5.37      19.41
Zhang's hybrid model    4.20      10.23          9.29      24.35          2.97      16.32
both ARIMA and ANN models for the longer time horizons (6 and 12 months). However, our proposed model significantly outperforms the ARIMA, ANN, and Zhang's hybrid models across all three time horizons and under both error measures.
6. Conclusions
Improving forecasting accuracy, especially in time series forecasting, is an important yet often difficult task facing decision makers in many areas. Despite the numerous time series models available, research into improving the effectiveness of forecasting models has never stopped. Several large-scale forecasting competitions involving a large number of commonly used time series forecasting models conclude that combining forecasts from more than one model often leads to improved performance, especially when the models in the ensemble are quite different. Artificial neural networks (ANNs) have been shown to be an effective, general-purpose approach to pattern recognition, classification, clustering, and especially prediction, with a high degree of accuracy. Nevertheless, using ANNs to model linear problems has yielded mixed results, and hence it is unreasonable to apply ANNs blindly to any type of data.
Hybrid techniques that decompose a time series into its linear and nonlinear components are among the most popular categories of hybrid models and have been shown to be more successful than single models. These models jointly use linear and nonlinear models in order to capture different forms of relationship in the time series data. Although numerous studies have shown that hybrid ARIMA-ANN models are able to outperform each of their components used in isolation, others have reported inconsistent results. Some researchers believe that the assumptions underlying the construction of these hybrid models can degrade their performance if the opposite situation occurs.
In this paper, a novel hybridization of artificial neural networks and ARIMA models is proposed in order to overcome the above-mentioned limitation of ANNs and to yield a more general and more accurate forecasting model than traditional hybrid ARIMA-ANN models. Our proposed model uses the unique capability of ARIMA models in linear modeling to identify and magnify the existing linear structure in the data; a multilayer perceptron is then used to determine a model that captures the underlying data-generating process and predicts future values using the preprocessed data. Empirical results with three well-known real data sets indicate that the proposed model can be an effective way to obtain a more general and more accurate model than Zhang's hybrid model and than either of the component models used separately, and it can thus serve as an alternative tool for time series forecasting.
Acknowledgements
The authors wish to express their gratitude to anonymous
referees and Seyed Reza Hejazi, assistant professor of industrial
engineering, Isfahan University of Technology, for their insightful
and constructive comments, which helped to improve the paper
greatly.
References
[1] G. Zhang, B.E. Patuwo, M.Y. Hu, Forecasting with artificial neural networks: the state of the art, International Journal of Forecasting 14 (1998) 35-62.
[2] D. Rumelhart, J. McClelland, Parallel Distributed Processing, MIT Press, Cambridge, MA, 1986.
[3] G. Cybenko, Approximations by superpositions of a sigmoidal function, Mathematics of Control, Signals, and Systems 2 (1989) 303-314.
[4] K. Hornik, M. Stinchcombe, H. White, Multi-layer feedforward networks are universal approximators, Neural Networks 2 (1989) 359-366.
[56] N. Morgan, H. Bourlard, Generalization and parameter estimation in feedforward nets: some experiments, in: D.S. Touretzky (Ed.), Advances in Neural Information Processing Systems, vol. 2, 1990, pp. 630-637.
[57] S. Thawornwong, D. Enke, The adaptive selection of financial and economic variables for use with artificial neural networks, Neurocomputing 31 (2000) 1-13.
[58] M. Hibon, T. Evgeniou, To combine or not to combine: selecting among forecasts and their combinations, International Journal of Forecasting 21 (2005) 15-24.
[59] N. Terui, H. van Dijk, Combined forecasts from linear and nonlinear time series models, International Journal of Forecasting 18 (2002) 421-438.
[60] C.W.J. Granger, Combining forecasts: twenty years later, Journal of Forecasting 8 (1989) 167-173.
[61] K.W. Hipel, A.I. McLeod, Time Series Modelling of Water Resources and Environmental Systems, Elsevier, Amsterdam, 1994.
[62] M. Ghiassi, H. Saidane, A dynamic architecture for artificial neural networks, Neurocomputing 63 (2005) 397-413.
[63] T. Subba Rao, M.M. Gabr, An Introduction to Bispectral Analysis and Bilinear Time Series Models, Lecture Notes in Statistics, vol. 24, Springer-Verlag, New York, 1984.
[64] L. Stone, D. He, Chaotic oscillations and cycles in multi-trophic ecological systems, Journal of Theoretical Biology 248 (2007) 382-390.
[65] Y. Tang, S. Ghosal, A consistent nonparametric Bayesian procedure for estimating autoregressive conditional densities, Computational Statistics & Data Analysis 51 (2007) 4424-4437.
[66] T. Lin, M. Pourahmadi, Nonparametric and non-linear models and data mining in time series: a case study in the Canadian lynx data, Applied Statistics 47 (1998) 187-201.
[67] P. Cornillon, W. Imam, E. Matzner, Forecasting time series using principal component analysis with respect to instrumental variables, Computational Statistics & Data Analysis 52 (2008) 1269-1280.
[68] M.J. Campbell, A.M. Walker, A survey of statistical work on the MacKenzie River series of annual Canadian lynx trappings for the years 1821-1934, and a new analysis, Journal of the Royal Statistical Society Series A 140 (1977) 411-431.
[69] C.S. Wong, W.K. Li, On a mixture autoregressive model, Journal of the Royal Statistical Society Series B 62 (1) (2000) 91-115.
[70] R.A. Meese, K. Rogoff, Empirical exchange rate models of the seventies: do they fit out of sample? Journal of International Economics 14 (1983) 3-24.
[71] A. Timmermann, C.W.J. Granger, Efficient market hypothesis and forecasting, International Journal of Forecasting 20 (2004) 15-27.