
Time Series Models 1

TIME SERIES MODELS


Time Series
a time series is a sequence of observations
example: daily returns on a stock
a multivariate time series is a sequence of vectors of observations
example: returns from a set of stocks
statistical models for univariate time series are widely used
in finance to model asset prices
in OR to model the output of simulations
in business for forecasting
Stationary Processes
often a time series has the same type of random behavior from one time period to the next
outside temperature: each summer is similar to past summers
interest rates and returns on equities
stationary stochastic processes are probability models for such series
a process is stationary if its behavior is unchanged by shifts in time
a process is weakly stationary if its mean, variance, and covariance are unchanged by time shifts
thus X_1, X_2, ... is a weakly stationary process if
E(X_i) = μ (a constant) for all i
Var(X_i) = σ² (a constant) for all i
Corr(X_i, X_j) = ρ(|i − j|) for all i and j, for some function ρ
the correlation between two observations depends only on the time distance between them (called the lag)
example: the correlation between X_2 and X_5 equals the correlation between X_7 and X_10
ρ is the correlation function
note that ρ(h) = ρ(−h)
the covariance between X_t and X_{t+h} is denoted by γ(h)
γ(·) is called the autocovariance function
note that γ(h) = σ² ρ(h) and that γ(0) = σ² since ρ(0) = 1
many financial time series are not stationary
but the changes in these time series may be stationary
Weak White Noise
simplest example of stationary process
no correlation
X_1, X_2, ... is WN(μ, σ²) if
E(X_i) = μ for all i
Var(X_i) = σ² (a constant) for all i
Corr(X_i, X_j) = 0 for all i ≠ j
if X_1, X_2, ... are also IID normal, then the process is a Gaussian white noise process
a weak white noise process is weakly stationary with
ρ(0) = 1
ρ(t) = 0 if t ≠ 0
so that
γ(0) = σ²
γ(t) = 0 if t ≠ 0
White noise
WN is uninteresting in itself
but it is the building block of important models
it is interesting to know whether a financial time series, e.g., of net returns, is WN
Estimating parameters of a stationary process
observe y_1, ..., y_n
estimate μ and σ² with ȳ and s²
estimate the autocovariance with
γ̂(h) = n^{-1} Σ_{j=1}^{n-h} (y_{j+h} − ȳ)(y_j − ȳ)
estimate ρ(·) with
ρ̂(h) = γ̂(h)/γ̂(0), h = 1, 2, ...
an infinite number of parameters (bad)
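The estimators above are easy to sketch in code. The following Python/NumPy function (an illustrative sketch, not the MINITAB or SAS routines the slides use; the function name is mine) implements γ̂(h) and ρ̂(h) exactly as defined.

```python
import numpy as np

def sample_acf(y, max_lag):
    # gamma_hat(h) = (1/n) * sum_{j=1}^{n-h} (y_{j+h} - ybar)(y_j - ybar)
    # rho_hat(h)   = gamma_hat(h) / gamma_hat(0)
    y = np.asarray(y, dtype=float)
    n = len(y)
    d = y - y.mean()
    gamma = np.array([np.dot(d[h:], d[:n - h]) / n for h in range(max_lag + 1)])
    return gamma, gamma / gamma[0]
```

Note that the sum is divided by n, not n − h, matching the slide's formula; γ̂(0) then equals the (biased) sample variance.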
AR(1) processes
time series models with correlation built from WN
in AR processes, y_t is modeled as a weighted average of past observations plus a white noise error
AR(1) is the simplest AR process
ε_1, ε_2, ... are WN(0, σ_ε²)
y_1, y_2, ... is an AR(1) process if
y_t − μ = φ(y_{t-1} − μ) + ε_t   (1)
for all t
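Equation (1) translates directly into a simulation. This Python sketch (the function name and defaults are mine, not from the slides) generates an AR(1) path driven by Gaussian white noise.

```python
import numpy as np

def simulate_ar1(n, mu, phi, sigma_eps, seed=0):
    # y_t - mu = phi * (y_{t-1} - mu) + eps_t, with eps_t ~ WN(0, sigma_eps^2)
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma_eps, n)
    y = np.empty(n)
    prev = mu                      # start at the stationary mean
    for t in range(n):
        y[t] = mu + phi * (prev - mu) + eps[t]
        prev = y[t]
    return y
```

For |φ| < 1 a long simulated path should have sample mean near μ and sample variance near σ_ε²/(1 − φ²), the stationary values derived on the following slides.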
From the previous page:
y_t − μ = φ(y_{t-1} − μ) + ε_t
Only three parameters:
μ, the mean
σ_ε², the variance of the one-step-ahead prediction errors
φ, a correlation parameter
If |φ| < 1, then y_1, ... is a weakly stationary process
its mean is μ
y_t = (1 − φ)μ + φ y_{t-1} + ε_t   (2)
compare with the linear regression model
y_t = β_0 + β_1 x_t + ε_t
β_0 = (1 − φ)μ is called the "constant" in computer output
μ is called the "mean" in the output
When |φ| < 1,
y_t = μ + ε_t + φ ε_{t-1} + φ² ε_{t-2} + ··· = μ + Σ_{h=0}^{∞} φ^h ε_{t-h}
this is the infinite moving average [MA(∞)] representation
since |φ| < 1, φ^h → 0 as the lag h → ∞
Properties of a stationary AR(1) process
When |φ| < 1 (stationarity), then
E(y_t) = μ
γ(0) = Var(y_t) = σ_ε²/(1 − φ²)
γ(h) = Cov(y_t, y_{t+h}) = σ_ε² φ^{|h|}/(1 − φ²)
ρ(h) = Corr(y_t, y_{t+h}) = φ^{|h|}
Only if |φ| < 1, and only for AR(1) processes
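These formulas can be checked numerically. The sketch below (seed and path length are arbitrary choices of mine) simulates a long AR(1) path with μ = 0, φ = 0.6, σ_ε = 1 and compares sample moments with γ(0) = σ_ε²/(1 − φ²) and ρ(1) = φ.

```python
import numpy as np

rng = np.random.default_rng(42)
phi, n = 0.6, 200_000
eps = rng.normal(size=n)            # sigma_eps = 1
y = np.empty(n)
y[0] = eps[0]
for t in range(1, n):
    y[t] = phi * y[t - 1] + eps[t]

d = y - y.mean()
gamma0 = np.mean(d * d)                        # theory: 1/(1 - phi**2) = 1.5625
rho1 = np.dot(d[1:], d[:-1]) / np.dot(d, d)    # theory: phi = 0.6
```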
if |φ| ≥ 1, then the AR(1) process is nonstationary, and the mean, variance, and correlation are not constant
The formulas above can be proved using
y_t = μ + ε_t + φ ε_{t-1} + φ² ε_{t-2} + ··· = μ + Σ_{h=0}^{∞} φ^h ε_{t-h}
For example,
Var(y_t) = Var( Σ_{h=0}^{∞} φ^h ε_{t-h} ) = σ_ε² Σ_{h=0}^{∞} φ^{2h} = σ_ε²/(1 − φ²)
Also, for h > 0,
γ(h) = Cov( Σ_{i=0}^{∞} φ^i ε_{t-i}, Σ_{j=0}^{∞} φ^j ε_{t+h-j} ) = σ_ε² φ^{|h|}/(1 − φ²)
distinguish between σ_ε² = the variance of ε_1, ε_2, ... and
γ(0) = the variance of y_1, y_2, ...
Nonstationary AR(1) processes
Random Walk
if φ = 1, then
y_t = y_{t-1} + ε_t
not stationary
a random walk process
y_t = y_{t-1} + ε_t = (y_{t-2} + ε_{t-1}) + ε_t = ··· = y_0 + ε_1 + ··· + ε_t
start the process at an arbitrary point y_0
then E(y_t | y_0) = y_0 for all t
Var(y_t | y_0) = t σ_ε²
the increasing variance makes the random walk "wander"
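The formula Var(y_t | y_0) = tσ_ε² can be illustrated by simulating many independent walks; this is a sketch with arbitrary constants of my own, not an example from the slides.

```python
import numpy as np

rng = np.random.default_rng(7)
sigma_eps, t, n_paths = 2.0, 50, 20_000
# each row is one random walk started at y_0 = 0; y_t is the sum of t WN steps
steps = rng.normal(0.0, sigma_eps, size=(n_paths, t))
y_t = steps.sum(axis=1)
```

Across the simulated paths, the sample variance of y_t should be close to t σ_ε² = 200.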
AR(1) processes when |φ| > 1
when |φ| > 1, an AR(1) process has explosive behavior
[Figure: simulated AR(1) paths with φ = 0.9, 0.6, 0.2, −0.9, 1, and 1.02; n = 200]
[Figure: simulated AR(1) paths with φ = 0.9, 0.6, 0.2, −0.9, 1, and 1.02; n = 30]
[Figure: simulated AR(1) paths with φ = 0.9, 0.6, 0.2, −0.9, 1, and 1.02; n = 1000]
Suppose an explosive AR(1) process starts at y_0 and has μ = 0. Then
y_t = φ y_{t-1} + ε_t
= φ(φ y_{t-2} + ε_{t-1}) + ε_t
= φ² y_{t-2} + φ ε_{t-1} + ε_t
= ···
= ε_t + φ ε_{t-1} + φ² ε_{t-2} + ··· + φ^{t-1} ε_1 + φ^t y_0
Therefore, E(y_t) = φ^t y_0 and
Var(y_t) = σ_ε² (1 + φ² + φ⁴ + ··· + φ^{2(t-1)}) = σ_ε² (φ^{2t} − 1)/(φ² − 1).
Since |φ| > 1, the variance increases geometrically fast as t → ∞.
Explosive AR processes are not widely used in econometrics, since economic growth usually is not explosive.
Estimation
Can fit an AR(1) to either
the raw data, or
a variable constructed from the raw data
To create the log returns:
take logs of prices
difference
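In Python the same two steps, logs then differences, are a one-liner; the price vector here is made up purely for illustration.

```python
import numpy as np

prices = np.array([100.0, 101.0, 99.5, 102.0])   # hypothetical prices
log_returns = np.diff(np.log(prices))            # r_t = log(P_t) - log(P_{t-1})
```

Note that differencing shortens the series by one observation.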
In MINITAB, to difference:
Stat menu
Time Series menu
select Differences
select log prices as the variable
the AR(1) model is a linear regression model
it can be analyzed using linear regression software
one creates a lagged version of y_t and uses this as the x-variable
MINITAB and SAS both support lagging
to lag in MINITAB:
Stat menu
Time Series menu
then select Lag
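The lag-and-regress recipe is easy to sketch outside MINITAB or SAS. This Python function (names are mine) regresses y_t on y_{t-1} and recovers μ̂ from the fitted constant via β_0 = (1 − φ)μ.

```python
import numpy as np

def fit_ar1_ols(y):
    # regress y_t on (1, y_{t-1}); slope = phi_hat, constant = (1 - phi_hat)*mu_hat
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    b0, phi_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]
    return phi_hat, b0 / (1.0 - phi_hat)
```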
The least-squares estimates of μ and φ minimize
Σ_{t=2}^{n} [ (y_t − μ) − φ(y_{t-1} − μ) ]²
if the errors are Gaussian white noise, then the LSE is the MLE
both MINITAB and SAS have special procedures for fitting AR models
In MINITAB:
Stat menu
Time Series
then choose ARIMA
use
1 autoregressive parameter
0 differencing if using log returns (or 1 if using log prices)
0 moving average parameters
In SAS, use the AUTOREG or the ARIMA procedure
Residuals
ε̂_t = (y_t − μ̂) − φ̂(y_{t-1} − μ̂)
the residuals estimate ε_1, ε_2, ..., ε_n, since ε_t = (y_t − μ) − φ(y_{t-1} − μ)
they are used to check that y_1, y_2, ..., y_n is an AR(1) process
autocorrelation in the residuals is evidence against the AR(1) assumption
to test for residual autocorrelation, use the test bounds provided by MINITAB's or SAS's autocorrelation plots
one can also use the Ljung-Box test
the null hypothesis is that the autocorrelations up to a specified lag are zero
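The statistic itself is simple to compute. The following Python sketch implements the usual form Q = n(n+2) Σ_{k=1}^{m} ρ̂(k)²/(n−k), which under the null is approximately χ² with m degrees of freedom; statistical software may adjust the degrees of freedom for fitted parameters, as the MINITAB output later does.

```python
import numpy as np

def ljung_box(x, m):
    # Q = n*(n+2) * sum_{k=1}^{m} rho_hat(k)^2 / (n - k)
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    denom = np.dot(d, d)
    rho = np.array([np.dot(d[k:], d[:n - k]) / denom for k in range(1, m + 1)])
    return n * (n + 2) * np.sum(rho**2 / (n - np.arange(1, m + 1)))
```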
To appreciate why residual autocorrelation indicates a possible problem, suppose that
we are fitting an AR(1) model but
the true model is an AR(2) process given by
(y_t − μ) = φ_1(y_{t-1} − μ) + φ_2(y_{t-2} − μ) + ε_t.
there is no hope of estimating φ_2
φ̂ does not necessarily estimate φ_1, because of bias
Let φ* be the expected value of φ̂.
For the purpose of illustration, assume that μ̂ = μ and φ̂ = φ*.
Then
ε̂_t = (y_t − μ) − φ*(y_{t-1} − μ)
= φ_1(y_{t-1} − μ) + φ_2(y_{t-2} − μ) + ε_t − φ*(y_{t-1} − μ)
= (φ_1 − φ*)(y_{t-1} − μ) + φ_2(y_{t-2} − μ) + ε_t.
From the previous page:
ε̂_t = (y_t − μ) − φ*(y_{t-1} − μ)
= (φ_1 − φ*)(y_{t-1} − μ) + φ_2(y_{t-2} − μ) + ε_t.
Thus, the residuals do not estimate the white noise process.
If there is no bias in the estimation of φ, then φ_1 = φ* and the term (φ_1 − φ*)(y_{t-1} − μ) drops out.
But the presence of φ_2(y_{t-2} − μ) still causes the residuals to be autocorrelated.
Example: GE daily returns
The MINITAB output was obtained by running
MINITAB interactively.
Here is the MINITAB output. The variable logR is the
time series of log returns.
Results for: GE_DAILY.MT
ARIMA Model: logR
ARIMA model for logR
Estimates at each iteration
Iteration SSE Parameters
0 2.11832 0.100 0.090
1 0.12912 0.228 0.015
2 0.07377 0.233 0.001
3 0.07360 0.230 0.000
4 0.07360 0.230 -0.000
5 0.07360 0.230 -0.000
Relative change in each estimate less than 0.0010
Final Estimates of Parameters
Type Coef SE Coef T P
AR 1 0.2299 0.0621 3.70 0.000
Constant -0.000031 0.001081 -0.03 0.977
Mean -0.000040 0.001403
Number of observations: 252
Residuals: SS = 0.0735911 (backforecasts excluded)
MS = 0.0002944 DF = 250
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag 12 24 36 48
Chi-Square 23.0 33.6 47.1 78.6
DF 10 22 34 46
P-Value 0.011 0.054 0.066 0.002
The SAS output comes from running the following
program.
options linesize = 72 ;
data ge ;
infile 'c:\courses\or473\data\ge.dat' ;
input close ;
logP = log(close) ;
logR = dif(logP) ;
run ;
title 'GE - Daily prices, Dec 17, 1999 to Dec 15, 2000' ;
title2 'AR(1)' ;
proc autoreg ;
model logR =/nlag = 1 ;
run ;
Here is the SAS output.
The AUTOREG Procedure
Dependent Variable logR
Ordinary Least Squares Estimates
SSE 0.07762133 DFE 251
MSE 0.0003092 Root MSE 0.01759
SBC -1316.8318 AIC -1320.3612
Regress R-Square 0.0000 Total R-Square 0.0000
Durbin-Watson 1.5299
Standard Approx
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 -0.000011 0.001108 -0.01 0.9917
Estimates of Autocorrelations
Lag Covariance Correlation
0 0.000308 1.000000
1 0.000069 0.225457
Estimates of Autocorrelations
Lag -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
0 | |********************|
1 | |***** |
Preliminary MSE 0.000292
Estimates of Autoregressive Parameters
Standard
Lag Coefficient Error t Value
1 -0.225457 0.061617 -3.66
GE - Daily prices, Dec 17, 1999 to Dec 15, 2000 2
The AUTOREG Procedure
Yule-Walker Estimates
SSE 0.07359998 DFE 250
MSE 0.0002944 Root MSE 0.01716
SBC -1324.6559 AIC -1331.7148
Regress R-Square 0.0000 Total R-Square 0.0518
Durbin-Watson 1.9326
Standard Approx
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 -0.000040 0.001394 -0.03 0.9773
φ̂ = .2299
the standard deviation of φ̂ is 0.0621
the t-value for testing H_0: φ = 0 versus H_1: φ ≠ 0 is .2299/.0621 = 3.70
the p-value is .000 (to three decimals)
null hypothesis: the log returns are white noise
the alternative is that they are correlated
the small p-value is evidence against the geometric random walk hypothesis
however, φ̂ = .2299 is not large
ρ̂(h) = φ̂^h, so the estimated correlation between successive log returns is .2299
the squared correlation is only .0528
only about five percent of the variation in a log return can be predicted by the previous day's return
In summary, the AR(1) process fits the GE log returns better than white noise
this is not proof that the AR(1) fits these data
only that it fits better than a white noise model
to check that the AR(1) fits well, look at the sample autocorrelation function (SACF) of the residuals
a plot of the residual SACF is available from MINITAB or SAS
the SACF of the residuals from the GE daily log returns shows high negative autocorrelation at lag 6
ρ̂(6) is outside the test limits, so it is significant at α = .05
this is disturbing
[Figure: SACF of residuals from an AR(1) fit to GE daily log returns]
the more conservative Ljung-Box simultaneous test that ρ(1) = ··· = ρ(12) = 0 has p = .011
since the AR(1) model does not fit well, one might consider more complex models
these will be discussed in the following sections
The SAS estimate of φ is .2254
SAS parameterizes the model with the sign of the autoregressive coefficient reversed
so SAS's φ is the negative of φ as we, and MINITAB, define it
The difference between MINITAB and SAS, .2299 versus .2254, is slight
it is due to the estimation algorithm
one can also estimate μ and test that μ is zero
from the MINITAB output, μ̂ is nearly zero
the t-value for testing that μ is zero is very small
the p-value is near one
small p-values are significant
since the p-value is large, we accept the null hypothesis that μ is zero
Cree Daily Log Returns
options linesize=72;
data cree ;
set Sasuser.cree_daily ; /* The data have already been imported via the wizard */
log_return = dif(log(AdjClose)) ; /* compute log returns */
lag_log_return = lag(log_return) ; /* previous day's return */
run ;
proc arima ;
identify var=log_return ;
estimate p=1 ;
run ;
The ARIMA Procedure
Name of Variable = log_return
Mean of Working Series 0.000953
Standard Deviation 0.051749
Number of Observations 3244
Autocorrelations
Lag Covariance Correlation -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
0 0.0026779 1.00000 | |********************|
1 -0.0000656 -.02450 | .|. |
2 -0.0000694 -.02593 | *|. |
3 -0.0000339 -.01266 | .|. |
4 -0.0000221 -.00826 | .|. |
5 -0.0000175 -.00653 | .|. |
6 0.00005227 0.01952 | .|. |
7 0.00003363 0.01256 | .|. |
8 -0.0000703 -.02624 | *|. |
9 0.00008622 0.03220 | .|* |
10 -6.6465E-6 -.00248 | .|. |
.
.
.
Autocorrelation Check for White Noise
To Chi- Pr >
Lag Square DF ChiSq -------------Autocorrelations------------
6 6.25 6 0.3955 -0.025 -0.026 -0.013 -0.008 -0.007 0.020
12 12.69 12 0.3919 0.013 -0.026 0.032 -0.002 -0.002 -0.009
18 15.40 18 0.6343 0.013 -0.003 0.020 0.008 0.010 -0.009
24 17.57 24 0.8235 0.013 -0.004 -0.011 0.012 0.012 0.009
Conditional Least Squares Estimation
Standard Approx
Parameter Estimate Error t Value Pr > |t| Lag
MU 0.0009521 0.0008869 1.07 0.2831 0
AR1,1 -0.02450 0.01756 -1.40 0.1630 1
Constant Estimate 0.000975
Variance Estimate 0.002678
Std Error Estimate 0.051749
AIC -10005.2
SBC -9993
Number of Residuals 3244
* AIC and SBC do not include log determinant.
Correlations of Parameter
Estimates
Parameter MU AR1,1
MU 1.000 -0.000
AR1,1 -0.000 1.000
The ARIMA Procedure
Autocorrelation Check of Residuals
To Chi- Pr >
Lag Square DF ChiSq -------------Autocorrelations------------
6 4.58 5 0.4697 -0.001 -0.027 -0.014 -0.009 -0.006 0.020
12 10.67 11 0.4716 0.012 -0.025 0.032 -0.002 -0.003 -0.009
18 13.36 17 0.7115 0.013 -0.002 0.020 0.008 0.010 -0.009
24 15.62 23 0.8711 0.012 -0.004 -0.011 0.012 0.012 0.011
30 34.93 29 0.2067 0.042 0.043 -0.005 -0.006 0.014 0.045
36 48.74 35 0.0614 -0.020 -0.025 -0.036 -0.029 0.019 -0.027
42 52.02 41 0.1162 0.007 -0.025 -0.011 0.001 -0.000 -0.015
48 57.08 47 0.1488 -0.006 -0.016 0.015 0.012 -0.006 -0.029
AR(p) models
y_t is an AR(p) process if
(y_t − μ) = φ_1(y_{t-1} − μ) + φ_2(y_{t-2} − μ) + ··· + φ_p(y_{t-p} − μ) + ε_t
where ε_1, ..., ε_n is WN(0, σ_ε²)
this is a multiple linear regression model with lagged values of the time series as the x-variables
the model can be reexpressed as
y_t = β_0 + φ_1 y_{t-1} + ··· + φ_p y_{t-p} + ε_t,
where β_0 = μ{1 − (φ_1 + ··· + φ_p)}
the least-squares estimator minimizes
Σ_{t=p+1}^{n} { y_t − (β_0 + φ_1 y_{t-1} + ··· + φ_p y_{t-p}) }²
the least-squares estimator can be calculated using a multiple linear regression program
one must create x-variables by lagging the time series with lags 1 through p
it is easier to use the ARIMA command in MINITAB or SAS, or SAS's AUTOREG procedure
these do the lagging automatically
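The "create lagged x-variables, then run multiple regression" recipe looks like this in Python (a sketch; the helper name is mine).

```python
import numpy as np

def fit_arp_ols(y, p):
    # regress y_t on (1, y_{t-1}, ..., y_{t-p}); returns (beta0_hat, phi_hats)
    y = np.asarray(y, dtype=float)
    n = len(y)
    X = np.column_stack([np.ones(n - p)] +
                        [y[p - k:n - k] for k in range(1, p + 1)])
    coef = np.linalg.lstsq(X, y[p:], rcond=None)[0]
    return coef[0], coef[1:]
```

Column k of the design matrix holds the series lagged by k; the first p observations are lost to lagging, matching the sum starting at t = p + 1.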
Example: GE daily returns
the SAS program was rerun with
model logR =/nlag = 1
replaced by
model logR =/nlag = 6
φ̂_i is significant at lags 1 and 6
but not at lags 2 through 5
"significant" means at α = .05, which corresponds to an absolute t-value bigger than 2
MINITAB will not allow p > 5
but SAS does not have such a constraint
Moving Average (MA) Processes
MA(1) processes
the moving average process of order 1 [MA(1)] is
y_t = μ + ε_t − θ ε_{t-1},
where, as before, the ε_t's are WN(0, σ_ε²)
can show that
E(y_t) = μ,
Var(y_t) = σ_ε² (1 + θ²),
γ(1) = −θ σ_ε²,
γ(h) = 0 if |h| > 1,
ρ(1) = −θ/(1 + θ²),
and
ρ(h) = 0 if |h| > 1
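As a quick numerical check (a sketch with arbitrary constants of mine), simulate an MA(1) and compare the lag-1 sample autocorrelation with ρ(1) = −θ/(1 + θ²), and the lag-2 autocorrelation with 0.

```python
import numpy as np

rng = np.random.default_rng(11)
theta, n = 0.5, 200_000
eps = rng.normal(size=n + 1)
y = eps[1:] - theta * eps[:-1]      # mu = 0

d = y - y.mean()
rho1 = np.dot(d[1:], d[:-1]) / np.dot(d, d)   # theory: -0.5/1.25 = -0.4
rho2 = np.dot(d[2:], d[:-2]) / np.dot(d, d)   # theory: 0
```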
General MA processes
the MA(q) process is
y_t = μ + ε_t − θ_1 ε_{t-1} − ··· − θ_q ε_{t-q}
can show that γ(h) = 0 and ρ(h) = 0 if |h| > q
formulas for γ(h) and ρ(h) when |h| ≤ q are given in time series textbooks
they are complicated but will not be needed by us
ARIMA Processes
ARMA (autoregressive and moving average): stationary time series with complex autocorrelation behavior are better modeled by mixed autoregressive and moving average processes
ARIMA (autoregressive, integrated, moving average): based on stationary ARMA processes, but nonstationary
ARIMA processes are easily described with the backwards operator, B
The backwards operator
the backwards operator B is defined by
B y_t = y_{t-1}
more generally,
B^k y_t = y_{t-k}
B c = c for any constant c, since a constant does not change with time
ARMA Processes
an ARMA(p, q) process satisfies the equation
(1 − φ_1 B − ··· − φ_p B^p)(y_t − μ) = (1 − θ_1 B − ··· − θ_q B^q) ε_t   (3)
a white noise process is ARMA(0, 0), since if p = q = 0, then (3) reduces to
(y_t − μ) = ε_t
The dierencing operator
the differencing operator is Δ = 1 − B, so that
Δ y_t = y_t − B y_t = y_t − y_{t-1}
differencing a time series produces a new time series consisting of the changes in the original series
for example, if p_t = log(P_t) is the log price, then the log return is
r_t = Δ p_t
differencing can be iterated
for example,
Δ² y_t = Δ(Δ y_t) = Δ(y_t − y_{t-1})
= (y_t − y_{t-1}) − (y_{t-1} − y_{t-2})
= y_t − 2y_{t-1} + y_{t-2}
From ARMA processes to ARIMA process
often the first or second differences of nonstationary time series are stationary
for example, the first differences of a random walk (nonstationary) are white noise (stationary)
a time series y_t is said to be ARIMA(p, d, q) if Δ^d y_t is ARMA(p, q)
for example, if the log returns r_t on an asset are ARMA(p, q), then the log prices p_t are ARIMA(p, 1, q)
ARIMA procedures in MINITAB and SAS allow one
to specify p, d, and q
an ARIMA(p, 0, q) model is the same as an
ARMA(p, q) model
ARIMA(p, 0, 0), ARMA(p, 0), and AR(p) models are
the same
Also, ARIMA(0, 0, q), ARMA(0, q), and MA(q)
models are the same
a random walk is an ARIMA(0, 1, 0) model
The inverse of differencing is "integrating"
the integral of a process y_t is
w_t = w_{t_0} + y_{t_0} + y_{t_0+1} + ··· + y_t
t_0 is an arbitrary starting time point
w_{t_0} is the starting value of the w_t process
The figure shows an AR(1), its integral, and its second integral, meaning the integral of its integral
[Figure: simulated ARIMA(1,0,0) with μ = 0 and φ = 0.4, its integral ARIMA(1,1,0), and its second integral ARIMA(1,2,0)]
Model Selection
once the parameters p, d, and q are selected, the coefficients can be estimated by maximum likelihood
but how do we choose p, d, and q?
generally, d is either 0, 1, or 2
it is chosen by looking at the SACF of y_t, Δ y_t, and Δ² y_t
a sign that a process is nonstationary is that its SACF decays to zero very slowly
if this is true of y_t, then the original series is nonstationary
and should be differenced at least once
if the SACF of Δ y_t looks stationary, then we use d = 1
otherwise, we look at the SACF of Δ² y_t
if this looks stationary, we use d = 2
real time series where Δ² y_t does not look stationary are rare
but if one were encountered, then d > 2 would be used
once d has been chosen, we fit ARMA(p, q) processes to Δ^d y_t
but we still need p and q
we compare various choices of p and q by some criterion that measures how well a model fits
AIC and SBC
AIC and SBC are model selection criteria based on the log-likelihood
Akaike's information criterion (AIC) is defined as
−2 log(L) + 2(p + q),
where L is the likelihood evaluated at the MLE
Schwarz's Bayesian Criterion (SBC) is defined as
−2 log(L) + log(n)(p + q),
where n is the length of the time series
SBC is also called the Bayesian Information Criterion (BIC)
the best model by either criterion is the model that minimizes that criterion
either criterion will tend to select models with a large likelihood value
this makes perfect sense, since a large L means the observed data are likely under that model
the term 2(p + q) in AIC, or log(n)(p + q) in SBC, is a penalty on having too many parameters
therefore, AIC and SBC try to trade off
a good fit to the data, measured by L
against the desire to use few parameters
which penalizes the most?
log(n) > 2 if n ≥ 8
most time series are much longer than 8
so SBC penalizes p + q more than AIC does
therefore, AIC will tend to choose models with more parameters than SBC
compared to SBC, with AIC the trade-off is more in favor of a large value of L than a small value of p + q
Unfortunately, MINITAB does not compute AIC and SBC, but SAS does.
Here's how you can calculate approximate AIC and SBC values using MINITAB.
It can be shown that log(L) ≈ −(n/2) log(σ̂²) + K,
where K is a constant that does not depend on the model or on the parameters.
Since we only want to minimize AIC and SBC, the exact value of K is irrelevant, and we will drop K.
Thus, you can use the approximations
AIC ≈ n log(σ̂²) + 2(p + q)
and
SBC ≈ n log(σ̂²) + log(n)(p + q).
σ̂² is called MSE (mean squared error) in the MINITAB output.
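These approximations are easy to compute by hand from any software's reported MSE; a Python sketch (the function name is mine):

```python
import math

def approx_aic_sbc(mse, n, p, q):
    # AIC ~ n*log(sigma_hat^2) + 2*(p+q);  SBC ~ n*log(sigma_hat^2) + log(n)*(p+q)
    base = n * math.log(mse)
    return base + 2 * (p + q), base + math.log(n) * (p + q)
```

For the GE AR(1) fit (MSE = 0.0002944, n = 252, p + q = 1), the two criteria share the same n·log(MSE) term and differ only through the penalty, where SBC's log(252) ≈ 5.5 exceeds AIC's 2.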
the difference between AIC and SBC is due to the way they were designed
AIC is designed to select the model that will predict best, and is less concerned with having a few too many parameters
SBC is designed to select the true values of p and q exactly
in practice, the best AIC model is usually close to the best SBC model
often they are the same model
models can be compared by likelihood ratio testing when one model is bigger than the other
in this sense, AIC and SBC are basically LR tests
Stepwise regression applied to AR processes
stepwise regression looks at many regression models
and sees which ones fit the data well
it will be discussed later
backwards regression:
starts with all possible x-variables
eliminates them one at a time
stops when all remaining variables are significant
this can be applied to AR models
SAS's AUTOREG procedure allows "backstepping" as an option
The following SAS program starts with an AR(6) model
and backsteps
options linesize = 72 ;
data ge ;
infile 'c:\courses\or473\data\ge_quart.dat' ;
input close ;
D_p = dif(close);
logP = log(close) ;
logR = dif(logP) ;
run ;
title 'GE - Quarterly closing prices, Dec 1900 to Dec 2000' ;
title2 'AR(6) with backstepping' ;
proc autoreg ;
model logR =/nlag = 6 backstep ;
run ;
Here is the SAS output:
GE - Quarterly closing prices, Dec 1900 to Dec 2000 1
AR(6) with backstepping
The AUTOREG Procedure
Dependent Variable logR
Ordinary Least Squares Estimates
SSE 0.15125546 DFE 38
MSE 0.00398 Root MSE 0.06309
SBC -102.20076 AIC -103.86432
Regress R-Square 0.0000 Total R-Square 0.0000
Durbin-Watson 2.0710
Standard Approx
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 0.0627 0.0101 6.21 <.0001
Estimates of Autocorrelations
Lag -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
0 | |********************|
1 | *| |
2 | *| |
3 | |******** |
4 | *| |
5 | ****| |
6 | |** |
GE - Quarterly closing prices, Dec 1900 to Dec 2000 2
AR(6) with backstepping
The AUTOREG Procedure
Backward Elimination of
Autoregressive Terms
Lag Estimate t Value Pr > |t|
4 0.020648 0.12 0.9058
2 -0.023292 -0.14 0.8921
1 0.035577 0.23 0.8226
6 0.082465 0.50 0.6215
5 0.170641 1.13 0.2655
Preliminary MSE 0.00328
Estimates of Autoregressive Parameters
Standard
Lag Coefficient Error t Value
3 -0.392878 0.151180 -2.60
Expected
Autocorrelations
Lag Autocorr
0 1.0000
1 0.0000
2 0.0000
3 0.3929
Yule-Walker Estimates
SSE 0.12476731 DFE 37
MSE 0.00337 Root MSE 0.05807
SBC -105.5425 AIC -108.86962
Regress R-Square 0.0000 Total R-Square 0.1751
Durbin-Watson 1.9820
GE - Quarterly closing prices, Dec 1900 to Dec 2000 3
AR(6) with backstepping
The AUTOREG Procedure
Standard Approx
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 0.0632 0.0146 4.33 0.0001
Expected
Autocorrelations
Lag Autocorr
0 1.0000
1 0.0000
2 0.0000
3 0.3929
Using the SACF to Choose d
[Figure: 3-month T-bill interest rates (months since Jan 1950), their first differences, the SACF of the rates, and the SACF of the differences]
The SACFs show that one should use d = 1
Using ARIMA in SAS: Cree data
in this example, we illustrate fitting an ARMA model in SAS
we will use daily log returns on Cree from December 1999 to December 2000
[Figure: CREE daily prices 12/17/99 to 12/15/00, returns, log returns, a normal plot of the log returns, and volatility]
The SAS program is:
options linesize = 72 ;
data cree ;
infile 'U:\courses\473\data\cree_daily.dat' ;
input month day year volume high low close ;
logP = log(close) ;
logR = dif(logP) ;
run ;
title 'Cree daily log returns' ;
title2 'ARMA(1,1)' ;
proc arima ;
identify var=logR ;
estimate p=1 q=1 ;
run ;
Here is the SAS output.
The Cree log returns appear to be white noise, since each of
φ̂_1 (denoted by AR1,1 in SAS)
θ̂_1 (denoted by MA1,1)
μ̂
is not significantly different from zero.
Autocorrelations
Lag Covariance Correlation -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
0 0.0045526 1.00000 | |********************|
1 0.00031398 0.06897 | . |* . |
2 -0.0000160 -.00351 | . | . |
3 -5.5958E-6 -.00123 | . | . |
4 -0.0002213 -.04862 | . *| . |
5 0.00002748 0.00604 | . | . |
6 -0.0000779 -.01712 | . | . |
7 -0.0000207 -.00454 | . | . |
8 -0.0003281 -.07207 | . *| . |
9 0.00015664 0.03441 | . |* .
10 0.00057077 0.12537 | . |*** |
Autocorrelation Check for White Noise
To Chi- Pr >
Lag Square DF ChiSq -------------Autocorrelations------------
6 1.91 6 0.9276 0.069 -0.004 -0.001 -0.049 0.006 -0.017
12 10.02 12 0.6143 -0.005 -0.072 0.034 0.125 0.052 -0.076
18 21.95 18 0.2344 -0.030 -0.123 0.051 -0.022 -0.013 -0.157
24 23.37 24 0.4978 0.014 -0.010 -0.037 -0.032 -0.047 0.01
Conditional Least Squares Estimation
Standard Approx
Parameter Estimate Error t Value Pr > |t| Lag
MU -0.0006814 0.0045317 -0.15 0.8806 0
MA1,1 -0.18767 0.88710 -0.21 0.8326 1
AR1,1 -0.11768 0.89670 -0.13 0.8957 1
Constant Estimate -0.00076
Variance Estimate 0.004585
Std Error Estimate 0.067712
AIC -638.889
SBC -628.301
Number of Residuals 252
* AIC and SBC do not include log determinant.
Autocorrelation Check of Residuals
To Chi- Pr >
Lag Square DF ChiSq -------------Autocorrelations------------
6 0.75 4 0.9444 0.000 0.004 0.001 -0.049 0.010 -0.019
12 8.54 10 0.5761 0.003 -0.075 0.032 0.118 0.050 -0.079
18 21.12 16 0.1741 -0.014 -0.127 0.062 -0.029 0.001 -0.159
24 22.48 22 0.4314 0.025 -0.011 -0.035 -0.026 -0.045 0.016
Three possible scenarios:
1. log returns are white noise
then log returns should pass the white noise test
2. log returns are not white noise but fit the time series model
then the log returns should fail the white noise test but
the residuals should pass the white noise test
3. log returns are not white noise and do not t the time
series model
then log returns and residuals will both fail the
white noise test
Warning
Don't rely too much on the residual tests for autocorrelation.
If n is large:
the autocorrelation might be small but statistically significant
in my opinion, SBC is a better guide to model choice than the residual tests for autocorrelation
Example: Three-month Treasury bill rates
our empirical results: log returns have little autocorrelation
but they are not exactly white noise
other financial time series do have substantial autocorrelation
example: monthly interest rates on three-month US Treasury bills from December 1950 until February 1996
the data come from Example 16.1 of Pindyck and Rubinfeld (1998), Econometric Models and Economic Forecasts
the rates are plotted in the next figure
the first differences look somewhat stationary
we will fit ARMA models to the first differences
[Figure: 3-month T-bill interest rates (months since Jan 1950), their first differences, the SACF of the rates, and the SACF of the differences]
first: an AR(10) model, fit with ARIMA
here is the SAS program.
options linesize = 72 ;
data rate1 ;
infile 'c:\courses\or473\data\fygn.dat' ;
input date $ z;
title 'Three month treasury bills' ;
title2 'ARIMA model - to first differences' ;
proc arima ;
identify var=z(1) ;
estimate p=10 plot;
run ;
Here is the SAS output.
Three month treasury bills 1
ARIMA model - to first differences
The ARIMA Procedure
Name of Variable = z
Period(s) of Differencing 1
Mean of Working Series 0.006986
Standard Deviation 0.494103
Number of Observations 554
Observation(s) eliminated by differencing 1
Autocorrelations
Lag Covariance Correlation -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
0 0.244138 1.00000 | |********************|
1 0.067690 0.27726 | . |****** |
2 -0.026212 -.10736 | **| . |
3 -0.022360 -.09159 | **| . |
4 -0.0091143 -.03733 | .*| . |
5 0.011399 0.04669 | . |*. |
6 -0.045339 -.18571 | ****| . |
7 -0.047987 -.19656 | ****| . |
8 0.022734 0.09312 | . |** |
9 0.047441 0.19432 | . |****
10 0.014282 0.05850 | . |*. |
Autocorrelation Check for White Noise
To Chi- Pr >
Lag Square DF ChiSq -------------Autocorrelations------------
6 75.33 6 <.0001 0.277 -0.107 -0.092 -0.037 0.047 -0.186
12 130.15 12 <.0001 -0.197 0.093 0.194 0.059 -0.007 -0.093
18 158.33 18 <.0001 0.036 0.157 -0.102 0.005 0.082 0.078
24 205.42 24 <.0001 -0.033 -0.232 -0.160 -0.015 -0.008 -0.030
Conditional Least Squares Estimation
Standard Approx
Parameter Estimate Error t Value Pr > |t| Lag
MU 0.0071463 0.02056 0.35 0.7283 0
AR1,1 0.33494 0.04287 7.81 <.0001 1
AR1,2 -0.16456 0.04501 -3.66 0.0003 2
AR1,3 0.01712 0.04535 0.38 0.7060 3
AR1,4 -0.10901 0.04522 -2.41 0.0163 4
AR1,5 0.14252 0.04451 3.20 0.0014 5
AR1,6 -0.21560 0.04451 -4.84 <.0001 6
AR1,7 -0.08347 0.04522 -1.85 0.0655 7
AR1,8 0.10382 0.04536 2.29 0.0225 8
AR1,9 0.10007 0.04502 2.22 0.0267 9
AR1,10 -0.04723 0.04290 -1.10 0.2714 10
Constant Estimate 0.006585
Variance Estimate 0.198648
Std Error Estimate 0.445699
Three month treasury bills 4
ARIMA model - to first differences
The ARIMA Procedure
AIC 687.6855
SBC 735.1743
Number of Residuals 554
* AIC and SBC do not include log determinant.
Three month treasury bills 5
ARIMA model - to first differences
The ARIMA Procedure
Autocorrelation Check of Residuals
To Chi- Pr >
Lag Square DF ChiSq -------------Autocorrelations------------
6 0.00 0 <.0001 0.003 -0.011 0.003 0.021 -0.015 -0.031
12 9.56 2 0.0084 0.036 -0.001 -0.031 0.018 0.105 -0.040
18 42.72 8 <.0001 -0.076 0.177 -0.115 0.081 0.019 0.025
24 62.06 14 <.0001 -0.062 -0.149 -0.078 -0.025 -0.024 -0.013
30 65.76 20 <.0001 0.002 0.008 0.045 0.048 -0.043 -0.007
36 73.52 26 <.0001 -0.070 -0.004 -0.051 -0.003 -0.053 -0.052
42 74.14 32 <.0001 -0.007 0.028 -0.007 -0.005 0.010 0.006
48 82.20 38 <.0001 -0.011 -0.000 -0.006 0.001 -0.103 0.050
Time Series Models 106
Autocorrelation Plot of Residuals
Lag  Covariance   Correlation
 0   0.198648     1.00000
 1   0.00057812   0.00291
 2   -0.0020959   -.01055
 3   0.00068451   0.00345
 4   0.0041792    0.02104
 5   -0.0030362   -.01528
 6   -0.0061377   -.03090
 7   0.0071315    0.03590
 8   -0.0001693   -.00085
 9   -0.0061781   -.03110
10   0.0036055    0.01815
11   0.020788     0.10465
12   -0.0078818   -.03968
13   -0.015171    -.07637
14   0.035240     0.17740
(SAS text plot of the residual autocorrelations omitted)
Time Series Models 107
AR(10) model does not fit well
try an AR(24) model with backfitting
here is the SAS program
options linesize = 72 ;
data rate1 ;
infile 'c:\courses\or473\data\fygn.dat' ;  /* quotes restored around the path */
input date $ z ;
zdif = dif(z) ;                            /* first differences */
title 'Three month treasury bills' ;
title2 'AR(24) model to first differences with backfitting' ;
proc autoreg ;
model zdif = / nlag=24 backstep ;
run ;
Time Series Models 108
Here is the output.
Three month treasury bills 1
AR(24) model to first differences with backfitting
The AUTOREG Procedure
Dependent Variable zdif
Ordinary Least Squares Estimates
SSE 135.25253 DFE 553
MSE 0.24458 Root MSE 0.49455
SBC 797.34939 AIC 793.032225
Regress R-Square 0.0000 Total R-Square 0.0000
Durbin-Watson 1.4454
Time Series Models 109
AR(24) model to first differences with backfitting
The AUTOREG Procedure
Backward Elimination of
Autoregressive Terms
Lag Estimate t Value Pr > |t|
10 0.007567 0.16 0.8721
23 0.010212 0.22 0.8241
17 0.008951 0.19 0.8492
3 -0.014390 -0.32 0.7496
24 0.015798 0.40 0.6907
13 0.041434 0.92 0.3605
7 0.038880 0.85 0.3964
18 -0.037456 -0.90 0.3702
22 0.042555 1.02 0.3090
20 0.058230 1.31 0.1912
4 0.059903 1.48 0.1389
9 -0.058141 -1.42 0.1562
Preliminary MSE 0.1765
Time Series Models 110
Estimates of Autoregressive Parameters
Standard
Lag Coefficient Error t Value
1 -0.388246 0.040419 -9.61
2 0.200242 0.040438 4.95
5 -0.108069 0.040513 -2.67
6 0.249095 0.039719 6.27
8 -0.103462 0.039668 -2.61
11 -0.102896 0.040278 -2.55
12 0.119950 0.040704 2.95
14 -0.204702 0.040427 -5.06
15 0.223381 0.042441 5.26
16 -0.151917 0.040811 -3.72
19 0.103356 0.038847 2.66
21 0.108074 0.039511 2.74
Time Series Models 111
My analysis of these results:
We are probably overfitting
Probably should look carefully at the SBC values of these
models
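As an illustrative aside (not from the slides), the SBC that SAS reports for a Gaussian model can be sketched, up to an additive constant, as n·log(SSE/n) plus a log(n) penalty per parameter:

```python
import math

def sbc(sse, n, k):
    """Schwarz Bayesian Criterion, up to an additive constant:
    n*log(SSE/n) + k*log(n); smaller values indicate a better
    trade-off between fit and the number of parameters k."""
    return n * math.log(sse / n) + k * math.log(n)

# each extra parameter must reduce SSE enough to offset a log(n) penalty
```

Comparing an AR(10) and an AR(24) fit this way, the larger model must lower the SSE substantially to overcome its extra 14·log(n) penalty.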
Time Series Models 112
Forecasting
ARIMA models can forecast future values
consider forecasting using an AR(1) process
have data y_1, . . . , y_n and estimates μ̂ and φ̂
Time Series Models 113
remember y_{n+1} = μ + φ(y_n − μ) + ε_{n+1}
and E(ε_{n+1} | y_1, . . . , y_n) = 0
so we estimate y_{n+1} by
ŷ_{n+1} := μ̂ + φ̂ (y_n − μ̂)
and y_{n+2} by
ŷ_{n+2} := μ̂ + φ̂ (ŷ_{n+1} − μ̂) = μ̂ + φ̂ {φ̂ (y_n − μ̂)},
etc.
Time Series Models 114
in general, ŷ_{n+k} = μ̂ + φ̂^k (y_n − μ̂).
if |φ̂| < 1 then as k increases the forecasts decay
exponentially fast to μ̂
forecasting general AR(p) processes is similar
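A minimal sketch of this closed form (illustrative Python, not the slides' SAS or MINITAB):

```python
def ar1_forecast(y_n, mu_hat, phi_hat, k):
    """k-step-ahead forecast from a fitted AR(1):
    y_hat[n+k] = mu_hat + phi_hat**k * (y_n - mu_hat)."""
    return mu_hat + phi_hat ** k * (y_n - mu_hat)

# with |phi_hat| < 1 the forecasts decay geometrically toward mu_hat
```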
Time Series Models 115
example: for an AR(2) process
y_{n+1} = μ + φ_1 (y_n − μ) + φ_2 (y_{n−1} − μ) + ε_{n+1}
therefore
ŷ_{n+1} := μ̂ + φ̂_1 (y_n − μ̂) + φ̂_2 (y_{n−1} − μ̂)
also
ŷ_{n+2} := μ̂ + φ̂_1 (ŷ_{n+1} − μ̂) + φ̂_2 (y_n − μ̂),
etc.
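The same recursion in illustrative Python (a sketch, not the slides' software): each new forecast is appended to the series so it can stand in for data at later steps.

```python
def ar2_forecasts(y, mu, phi1, phi2, k):
    """k successive forecasts from a fitted AR(2); y holds the
    observed series, and each forecast feeds the following step."""
    ext = list(y)
    for _ in range(k):
        ext.append(mu + phi1 * (ext[-1] - mu) + phi2 * (ext[-2] - mu))
    return ext[len(y):]
```

With phi2 = 0 this reduces to the AR(1) recursion on the previous slides.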
Time Series Models 116
forecasting ARMA and ARIMA processes is slightly
more complicated
this is discussed in time series courses such as ORIE 563
the forecasts can be generated automatically by
statistical software such as MINITAB and SAS
Time Series Models 117
GE daily returns
fitting an ARIMA(1,0,0) model to log returns is
equivalent to fitting an ARIMA(1,1,0) model to the
log prices
we will fit both models to the GE daily price data
next figure shows the forecasts of the log returns up
to 24 days ahead.
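The equivalence rests on first differences of log prices being log returns; a quick illustrative check (Python, not the slides' MINITAB):

```python
import math

def log_returns(prices):
    """First differences of the log prices, i.e. log(p[t]/p[t-1]);
    differencing the log-price series once yields the log returns."""
    return [math.log(p2) - math.log(p1) for p1, p2 in zip(prices, prices[1:])]
```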
Time Series Models 118
[Figure: GE log returns vs. time, showing the data, the forecasts, and the upper and lower forecast limits]
Time Series Models 119
[Figure: GE log prices vs. time, showing the data, the forecasts, and the upper and lower forecast limits]
Time Series Models 120
MINITAB always forecasts the input series
the two figures show that forecasts of a stationary
process behave very differently from forecasts of a
nonstationary process
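One way to quantify the difference, using standard formulas (an illustrative sketch, not from the slides): the k-step forecast-error variance of a stationary AR(1) stays bounded, while that of a random walk grows without bound.

```python
def ar1_forecast_error_var(sigma2, phi, k):
    """Error variance of the k-step AR(1) forecast:
    sigma2 * (1 - phi**(2k)) / (1 - phi**2),
    bounded above by sigma2 / (1 - phi**2)."""
    return sigma2 * (1 - phi ** (2 * k)) / (1 - phi ** 2)

def random_walk_forecast_error_var(sigma2, k):
    """For the nonstationary random walk the error variance is k * sigma2."""
    return k * sigma2
```

This is why the forecast limits in the log-return figure level off while those in the log-price figure keep widening.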
Time Series Models 121
MATLAB now has a GARCH Toolbox
Can be used to fit ARIMA models as well as GARCH
models
Time Series Models 122
MATLAB Program
load cree_daily_adjclose.txt ;
x = cree_daily_adjclose ;
n = length(x) ;
year = 1993 + (1:n)*(2006-1993)/n ;      % evenly spaced dates for plotting
net_return = price2ret(x) ;              % convert prices to returns
spec = garchset('R',2,'M',0) ;           % AR(2) mean model, no MA terms
[coeff,errors,LLF,innovations,sigmas,summary] = garchfit(spec,net_return) ;
garchdisp(coeff,errors)
garchplot(innovations,sigmas,net_return)
[h pvalue Qstat] = lbqtest(innovations,[6 12])   % Ljung-Box test at lags 6, 12
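lbqtest above reports Ljung-Box Q statistics; a minimal Python sketch of the statistic itself, assuming the standard formula Q = n(n+2) · Σ_{h=1..m} ρ̂_h² / (n − h):

```python
def ljung_box_q(x, m):
    """Ljung-Box Q statistic over lags 1..m, where rho_h is the
    lag-h sample autocorrelation of x; under the white-noise null,
    Q is compared against a chi-square distribution."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n          # lag-0 autocovariance
    total = 0.0
    for h in range(1, m + 1):
        ch = sum((x[t] - mean) * (x[t - h] - mean) for t in range(h, n)) / n
        total += (ch / c0) ** 2 / (n - h)
    return n * (n + 2) * total
```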
Time Series Models 123
MATLAB Output
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Diagnostic Information
Number of variables: 4
Functions
Objective: garchllfn
Gradient: finite-differencing
Hessian: finite-differencing (or Quasi-Newton)
Nonlinear constraints: armanlc
Gradient of nonlinear constraints: finite-differencing
Constraints
Number of nonlinear inequality constraints: 2
Number of nonlinear equality constraints: 0
Number of linear inequality constraints: 0
Number of linear equality constraints: 0
Number of lower bound constraints: 4
Number of upper bound constraints: 4
Algorithm selected
medium-scale
Time Series Models 124
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
End diagnostic information
max Directional First-order
Iter F-count f(x) constraint Step-size derivative optimality Procedure
0 5 -5005.73 -0.002674
1 31 -5005.73 -0.002674 -9.54e-007 -0.00417 0.581
2 57 -5005.73 -0.002674 -9.54e-007 -0.00825 1.81
3 78 -5005.73 -0.002674 3.05e-005 -0.000112 1.44
4 97 -5005.73 -0.002674 0.000122 -8.6e-007 1.26
Optimization terminated: magnitude of directional derivative in search
direction less than 2*options.TolFun and maximum constraint violation
is less than options.TolCon.
No active inequalities
Time Series Models 125
Mean: ARMAX(2,0,0); Variance: GARCH(0,0)
Conditional Probability Distribution: Gaussian
Number of Model Parameters Estimated: 4
Standard T
Parameter Value Error Statistic
----------- ----------- ------------ -----------
C 0.0010025 0.00090898 1.1029
AR(1) -0.025152 0.013994 -1.7973
AR(2) -0.026544 0.014982 -1.7717
K 0.0026744 4.2403e-005 63.0711
h = 0 0
pvalue = 0.8990 0.7574
Qstat = 2.2139 8.3481
Time Series Models 126
[Figure: three panels vs. time — the innovations, the conditional standard deviations, and the returns]