
Time Series Models 1

TIME SERIES MODELS


Time Series
a time series is a sequence of observations
example: daily returns on a stock
a multivariate time series is a sequence of vectors of observations
example: returns from a set of stocks
statistical models for univariate time series are widely used
in finance to model asset prices
in OR to model the output of simulations
in business for forecasting
Stationary Processes
often a time series has the same type of random behavior from one time period to the next
outside temperature: each summer is similar to past summers
interest rates and returns on equities
stationary stochastic processes are probability models for such series
a process is stationary if its behavior is unchanged by shifts in time
a process is weakly stationary if its mean, variance, and covariance are unchanged by time shifts
thus X_1, X_2, ... is a weakly stationary process if
E(X_i) = μ (a constant) for all i
Var(X_i) = σ² (a constant) for all i
Corr(X_i, X_j) = ρ(|i − j|) for all i and j, for some function ρ
the correlation between two observations depends only on the time distance between them (called the lag)
example: the correlation between X_2 and X_5 equals the correlation between X_7 and X_10
ρ is the correlation function
note that ρ(h) = ρ(−h)
the covariance between X_t and X_{t+h} is denoted by γ(h)
γ(·) is called the autocovariance function
note that γ(h) = σ² ρ(h) and that γ(0) = σ² since ρ(0) = 1
many financial time series are not stationary
but the changes in these time series may be stationary
Weak White Noise
simplest example of stationary process
no correlation
X_1, X_2, ... is WN(μ, σ²) if
E(X_i) = μ for all i
Var(X_i) = σ² (a constant) for all i
Corr(X_i, X_j) = 0 for all i ≠ j
if X_1, X_2, ... are also IID normal, then the process is a Gaussian white noise process
a weak white noise process is weakly stationary with
ρ(0) = 1
ρ(t) = 0 if t ≠ 0
so that
γ(0) = σ²
γ(t) = 0 if t ≠ 0
White noise
WN is uninteresting in itself
but it is the building block of important models
it is interesting to know whether a financial time series, e.g., of net returns, is WN
Estimating parameters of a stationary process
observe y_1, ..., y_n
estimate μ and σ² with ȳ and s²
estimate the autocovariance with
γ̂(h) = n^{-1} Σ_{j=1}^{n-h} (y_{j+h} − ȳ)(y_j − ȳ)
estimate ρ(·) with
ρ̂(h) = γ̂(h)/γ̂(0), h = 1, 2, ...
an infinite number of parameters (bad)
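The estimators above are easy to sketch in code. The following Python/NumPy function (an illustrative sketch, not the MINITAB or SAS routines the slides use; the function name is mine) implements γ̂(h) and ρ̂(h) exactly as defined.

```python
import numpy as np

def sample_acf(y, max_lag):
    # gamma_hat(h) = (1/n) * sum_{j=1}^{n-h} (y_{j+h} - ybar)(y_j - ybar)
    # rho_hat(h)   = gamma_hat(h) / gamma_hat(0)
    y = np.asarray(y, dtype=float)
    n = len(y)
    d = y - y.mean()
    gamma = np.array([np.dot(d[h:], d[:n - h]) / n for h in range(max_lag + 1)])
    return gamma, gamma / gamma[0]
```

Note that the sum is divided by n, not n − h, matching the slide's formula; γ̂(0) then equals the (biased) sample variance.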
AR(1) processes
time series models with correlation built from WN
in AR processes, y_t is modeled as a weighted average of past observations plus a white noise error
AR(1) is the simplest AR process
ε_1, ε_2, ... are WN(0, σ_ε²)
y_1, y_2, ... is an AR(1) process if
y_t − μ = φ(y_{t-1} − μ) + ε_t   (1)
for all t
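Equation (1) translates directly into a simulation. This Python sketch (the function name and defaults are mine, not from the slides) generates an AR(1) path driven by Gaussian white noise.

```python
import numpy as np

def simulate_ar1(n, mu, phi, sigma_eps, seed=0):
    # y_t - mu = phi * (y_{t-1} - mu) + eps_t, with eps_t ~ WN(0, sigma_eps^2)
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma_eps, n)
    y = np.empty(n)
    prev = mu                      # start at the stationary mean
    for t in range(n):
        y[t] = mu + phi * (prev - mu) + eps[t]
        prev = y[t]
    return y
```

For |φ| < 1 a long simulated path should have sample mean near μ and sample variance near σ_ε²/(1 − φ²), the stationary values derived on the following slides.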
From the previous page:
y_t − μ = φ(y_{t-1} − μ) + ε_t
Only three parameters:
μ, the mean
σ_ε², the variance of the one-step-ahead prediction errors
φ, a correlation parameter
If |φ| < 1, then y_1, ... is a weakly stationary process
its mean is μ
y_t = (1 − φ)μ + φ y_{t-1} + ε_t   (2)
compare with the linear regression model
y_t = β_0 + β_1 x_t + ε_t
β_0 = (1 − φ)μ is called the "constant" in computer output
μ is called the "mean" in the output
When |φ| < 1,
y_t = μ + ε_t + φ ε_{t-1} + φ² ε_{t-2} + ··· = μ + Σ_{h=0}^{∞} φ^h ε_{t-h}
this is the infinite moving average [MA(∞)] representation
since |φ| < 1, φ^h → 0 as the lag h → ∞
Properties of a stationary AR(1) process
When |φ| < 1 (stationarity), then
E(y_t) = μ
γ(0) = Var(y_t) = σ_ε²/(1 − φ²)
γ(h) = Cov(y_t, y_{t+h}) = σ_ε² φ^{|h|}/(1 − φ²)
ρ(h) = Corr(y_t, y_{t+h}) = φ^{|h|}
Only if |φ| < 1, and only for AR(1) processes
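These formulas can be checked numerically. The sketch below (seed and path length are arbitrary choices of mine) simulates a long AR(1) path with μ = 0, φ = 0.6, σ_ε = 1 and compares sample moments with γ(0) = σ_ε²/(1 − φ²) and ρ(1) = φ.

```python
import numpy as np

rng = np.random.default_rng(42)
phi, n = 0.6, 200_000
eps = rng.normal(size=n)            # sigma_eps = 1
y = np.empty(n)
y[0] = eps[0]
for t in range(1, n):
    y[t] = phi * y[t - 1] + eps[t]

d = y - y.mean()
gamma0 = np.mean(d * d)                        # theory: 1/(1 - phi**2) = 1.5625
rho1 = np.dot(d[1:], d[:-1]) / np.dot(d, d)    # theory: phi = 0.6
```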
if |φ| ≥ 1, then the AR(1) process is nonstationary, and the mean, variance, and correlation are not constant
The formulas above can be proved using
y_t = μ + ε_t + φ ε_{t-1} + φ² ε_{t-2} + ··· = μ + Σ_{h=0}^{∞} φ^h ε_{t-h}
For example,
Var(y_t) = Var( Σ_{h=0}^{∞} φ^h ε_{t-h} ) = σ_ε² Σ_{h=0}^{∞} φ^{2h} = σ_ε²/(1 − φ²)
Also, for h > 0,
γ(h) = Cov( Σ_{i=0}^{∞} φ^i ε_{t-i}, Σ_{j=0}^{∞} φ^j ε_{t+h-j} ) = σ_ε² φ^{|h|}/(1 − φ²)
distinguish between σ_ε² = the variance of ε_1, ε_2, ... and
γ(0) = the variance of y_1, y_2, ...
Nonstationary AR(1) processes
Random Walk
if φ = 1, then
y_t = y_{t-1} + ε_t
not stationary
a random walk process
y_t = y_{t-1} + ε_t = (y_{t-2} + ε_{t-1}) + ε_t = ··· = y_0 + ε_1 + ··· + ε_t
start the process at an arbitrary point y_0
then E(y_t | y_0) = y_0 for all t
Var(y_t | y_0) = t σ_ε²
the increasing variance makes the random walk "wander"
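The formula Var(y_t | y_0) = tσ_ε² can be illustrated by simulating many independent walks; this is a sketch with arbitrary constants of my own, not an example from the slides.

```python
import numpy as np

rng = np.random.default_rng(7)
sigma_eps, t, n_paths = 2.0, 50, 20_000
# each row is one random walk started at y_0 = 0; y_t is the sum of t WN steps
steps = rng.normal(0.0, sigma_eps, size=(n_paths, t))
y_t = steps.sum(axis=1)
```

Across the simulated paths, the sample variance of y_t should be close to t σ_ε² = 200.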
AR(1) processes when |φ| > 1
when |φ| > 1, an AR(1) process has explosive behavior
[Figure: simulated AR(1) paths with φ = 0.9, 0.6, 0.2, −0.9, 1, and 1.02; n = 200]
[Figure: simulated AR(1) paths with φ = 0.9, 0.6, 0.2, −0.9, 1, and 1.02; n = 30]
[Figure: simulated AR(1) paths with φ = 0.9, 0.6, 0.2, −0.9, 1, and 1.02; n = 1000]
Suppose an explosive AR(1) process starts at y_0 and has μ = 0. Then
y_t = φ y_{t-1} + ε_t
= φ(φ y_{t-2} + ε_{t-1}) + ε_t
= φ² y_{t-2} + φ ε_{t-1} + ε_t
= ···
= ε_t + φ ε_{t-1} + φ² ε_{t-2} + ··· + φ^{t-1} ε_1 + φ^t y_0
Therefore, E(y_t) = φ^t y_0 and
Var(y_t) = σ_ε² (1 + φ² + φ⁴ + ··· + φ^{2(t-1)}) = σ_ε² (φ^{2t} − 1)/(φ² − 1).
Since |φ| > 1, the variance increases geometrically fast as t → ∞.
Explosive AR processes are not widely used in econometrics, since economic growth usually is not explosive.
Estimation
Can fit an AR(1) to either
the raw data, or
a variable constructed from the raw data
To create the log returns:
take logs of prices
difference
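In Python the same two steps, logs then differences, are a one-liner; the price vector here is made up purely for illustration.

```python
import numpy as np

prices = np.array([100.0, 101.0, 99.5, 102.0])   # hypothetical prices
log_returns = np.diff(np.log(prices))            # r_t = log(P_t) - log(P_{t-1})
```

Note that differencing shortens the series by one observation.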
In MINITAB, to difference:
Stat menu
Time Series menu
select Differences
select log prices as the variable
the AR(1) model is a linear regression model
it can be analyzed using linear regression software
one creates a lagged version of y_t and uses this as the x-variable
MINITAB and SAS both support lagging
to lag in MINITAB:
Stat menu
Time Series menu
then select Lag
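The lag-and-regress recipe is easy to sketch outside MINITAB or SAS. This Python function (names are mine) regresses y_t on y_{t-1} and recovers μ̂ from the fitted constant via β_0 = (1 − φ)μ.

```python
import numpy as np

def fit_ar1_ols(y):
    # regress y_t on (1, y_{t-1}); slope = phi_hat, constant = (1 - phi_hat)*mu_hat
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    b0, phi_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]
    return phi_hat, b0 / (1.0 - phi_hat)
```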
The least-squares estimates of μ and φ minimize
Σ_{t=2}^{n} [ (y_t − μ) − φ(y_{t-1} − μ) ]²
if the errors are Gaussian white noise, then the LSE is the MLE
both MINITAB and SAS have special procedures for fitting AR models
In MINITAB:
Stat menu
Time Series
then choose ARIMA
use
1 autoregressive parameter
0 differencing if using log returns (or 1 if using log prices)
0 moving average parameters
In SAS, use the AUTOREG or the ARIMA procedure
Residuals
ε̂_t = (y_t − μ̂) − φ̂(y_{t-1} − μ̂)
the residuals estimate ε_1, ε_2, ..., ε_n, since ε_t = (y_t − μ) − φ(y_{t-1} − μ)
they are used to check that y_1, y_2, ..., y_n is an AR(1) process
autocorrelation in the residuals is evidence against the AR(1) assumption
to test for residual autocorrelation, use the test bounds provided by MINITAB's or SAS's autocorrelation plots
one can also use the Ljung-Box test
the null hypothesis is that the autocorrelations up to a specified lag are zero
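The statistic itself is simple to compute. The following Python sketch implements the usual form Q = n(n+2) Σ_{k=1}^{m} ρ̂(k)²/(n−k), which under the null is approximately χ² with m degrees of freedom; statistical software may adjust the degrees of freedom for fitted parameters, as the MINITAB output later does.

```python
import numpy as np

def ljung_box(x, m):
    # Q = n*(n+2) * sum_{k=1}^{m} rho_hat(k)^2 / (n - k)
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    denom = np.dot(d, d)
    rho = np.array([np.dot(d[k:], d[:n - k]) / denom for k in range(1, m + 1)])
    return n * (n + 2) * np.sum(rho**2 / (n - np.arange(1, m + 1)))
```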
To appreciate why residual autocorrelation indicates a possible problem, suppose that
we are fitting an AR(1) model but
the true model is an AR(2) process given by
(y_t − μ) = φ_1(y_{t-1} − μ) + φ_2(y_{t-2} − μ) + ε_t.
there is no hope of estimating φ_2
φ̂ does not necessarily estimate φ_1, because of bias
Let φ* be the expected value of φ̂.
For the purpose of illustration, assume that μ̂ = μ and φ̂ = φ*.
Then
ε̂_t = (y_t − μ) − φ*(y_{t-1} − μ)
= φ_1(y_{t-1} − μ) + φ_2(y_{t-2} − μ) + ε_t − φ*(y_{t-1} − μ)
= (φ_1 − φ*)(y_{t-1} − μ) + φ_2(y_{t-2} − μ) + ε_t.
From the previous page:
ε̂_t = (y_t − μ) − φ*(y_{t-1} − μ)
= (φ_1 − φ*)(y_{t-1} − μ) + φ_2(y_{t-2} − μ) + ε_t.
Thus, the residuals do not estimate the white noise process.
If there is no bias in the estimation of φ, then φ_1 = φ* and the term (φ_1 − φ*)(y_{t-1} − μ) drops out.
But the presence of φ_2(y_{t-2} − μ) still causes the residuals to be autocorrelated.
Example: GE daily returns
The MINITAB output was obtained by running
MINITAB interactively.
Here is the MINITAB output. The variable logR is the
time series of log returns.
Results for: GE_DAILY.MT
ARIMA Model: logR
ARIMA model for logR
Estimates at each iteration
Iteration SSE Parameters
0 2.11832 0.100 0.090
1 0.12912 0.228 0.015
2 0.07377 0.233 0.001
3 0.07360 0.230 0.000
4 0.07360 0.230 -0.000
5 0.07360 0.230 -0.000
Relative change in each estimate less than 0.0010
Final Estimates of Parameters
Type Coef SE Coef T P
AR 1 0.2299 0.0621 3.70 0.000
Constant -0.000031 0.001081 -0.03 0.977
Mean -0.000040 0.001403
Number of observations: 252
Residuals: SS = 0.0735911 (backforecasts excluded)
MS = 0.0002944 DF = 250
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag 12 24 36 48
Chi-Square 23.0 33.6 47.1 78.6
DF 10 22 34 46
P-Value 0.011 0.054 0.066 0.002
The SAS output comes from running the following
program.
options linesize = 72 ;
data ge ;
infile 'c:\courses\or473\data\ge.dat' ;
input close ;
logP = log(close) ;
logR = dif(logP) ;
run ;
title 'GE - Daily prices, Dec 17, 1999 to Dec 15, 2000' ;
title2 'AR(1)' ;
proc autoreg ;
model logR =/nlag = 1 ;
run ;
Here is the SAS output.
The AUTOREG Procedure
Dependent Variable logR
Ordinary Least Squares Estimates
SSE 0.07762133 DFE 251
MSE 0.0003092 Root MSE 0.01759
SBC -1316.8318 AIC -1320.3612
Regress R-Square 0.0000 Total R-Square 0.0000
Durbin-Watson 1.5299
Standard Approx
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 -0.000011 0.001108 -0.01 0.9917
Estimates of Autocorrelations
Lag Covariance Correlation
0 0.000308 1.000000
1 0.000069 0.225457
Estimates of Autocorrelations
Lag -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
0 | |********************|
1 | |***** |
Preliminary MSE 0.000292
Estimates of Autoregressive Parameters
Standard
Lag Coefficient Error t Value
1 -0.225457 0.061617 -3.66
GE - Daily prices, Dec 17, 1999 to Dec 15, 2000 2
The AUTOREG Procedure
Yule-Walker Estimates
SSE 0.07359998 DFE 250
MSE 0.0002944 Root MSE 0.01716
SBC -1324.6559 AIC -1331.7148
Regress R-Square 0.0000 Total R-Square 0.0518
Durbin-Watson 1.9326
Standard Approx
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 -0.000040 0.001394 -0.03 0.9773
φ̂ = .2299
the standard deviation of φ̂ is 0.0621
the t-value for testing H_0: φ = 0 versus H_1: φ ≠ 0 is .2299/.0621 = 3.70
the p-value is .000 (to three decimals)
null hypothesis: the log returns are white noise
the alternative is that they are correlated
the small p-value is evidence against the geometric random walk hypothesis
however, φ̂ = .2299 is not large
ρ̂(h) = φ̂^h, so the estimated correlation between successive log returns is .2299
the squared correlation is only .0528
only about five percent of the variation in a log return can be predicted by the previous day's return
In summary, the AR(1) process fits the GE log returns better than white noise
this is not proof that the AR(1) fits these data
only that it fits better than a white noise model
to check that the AR(1) fits well, look at the sample autocorrelation function (SACF) of the residuals
a plot of the residual SACF is available from MINITAB or SAS
the SACF of the residuals from the GE daily log returns shows high negative autocorrelation at lag 6
ρ̂(6) is outside the test limits, so it is significant at α = .05
this is disturbing
[Figure: SACF of residuals from an AR(1) fit to GE daily log returns]
the more conservative Ljung-Box simultaneous test that ρ(1) = ··· = ρ(12) = 0 has p = .011
since the AR(1) model does not fit well, one might consider more complex models
these will be discussed in the following sections
The SAS estimate of φ is .2254
SAS parameterizes the model with the sign of the autoregressive coefficient reversed
so SAS's φ is the negative of φ as we, and MINITAB, define it
The difference between MINITAB and SAS, .2299 versus .2254, is slight
it is due to the estimation algorithm
one can also estimate μ and test that μ is zero
from the MINITAB output, μ̂ is nearly zero
the t-value for testing that μ is zero is very small
the p-value is near one
small p-values are significant
since the p-value is large, we accept the null hypothesis that μ is zero
Cree Daily Log Returns
options linesize=72;
data cree ;
set Sasuser.cree_daily ; /* The data have already been imported via the wizard */
log_return = dif(log(AdjClose)) ; /* compute log returns */
lag_log_return = lag(log_return) ; /* previous day's return */
run ;
proc arima ;
identify var=log_return ;
estimate p=1 ;
run ;
The ARIMA Procedure
Name of Variable = log_return
Mean of Working Series 0.000953
Standard Deviation 0.051749
Number of Observations 3244
Autocorrelations
Lag Covariance Correlation -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
0 0.0026779 1.00000 | |********************|
1 -0.0000656 -.02450 | .|. |
2 -0.0000694 -.02593 | *|. |
3 -0.0000339 -.01266 | .|. |
4 -0.0000221 -.00826 | .|. |
5 -0.0000175 -.00653 | .|. |
6 0.00005227 0.01952 | .|. |
7 0.00003363 0.01256 | .|. |
8 -0.0000703 -.02624 | *|. |
9 0.00008622 0.03220 | .|* |
10 -6.6465E-6 -.00248 | .|. |
.
.
.
Autocorrelation Check for White Noise
To Chi- Pr >
Lag Square DF ChiSq -------------Autocorrelations------------
6 6.25 6 0.3955 -0.025 -0.026 -0.013 -0.008 -0.007 0.020
12 12.69 12 0.3919 0.013 -0.026 0.032 -0.002 -0.002 -0.009
18 15.40 18 0.6343 0.013 -0.003 0.020 0.008 0.010 -0.009
24 17.57 24 0.8235 0.013 -0.004 -0.011 0.012 0.012 0.009
Conditional Least Squares Estimation
Standard Approx
Parameter Estimate Error t Value Pr > |t| Lag
MU 0.0009521 0.0008869 1.07 0.2831 0
AR1,1 -0.02450 0.01756 -1.40 0.1630 1
Constant Estimate 0.000975
Variance Estimate 0.002678
Std Error Estimate 0.051749
AIC -10005.2
SBC -9993
Number of Residuals 3244
* AIC and SBC do not include log determinant.
Correlations of Parameter
Estimates
Parameter MU AR1,1
MU 1.000 -0.000
AR1,1 -0.000 1.000
The ARIMA Procedure
Autocorrelation Check of Residuals
To Chi- Pr >
Lag Square DF ChiSq -------------Autocorrelations------------
6 4.58 5 0.4697 -0.001 -0.027 -0.014 -0.009 -0.006 0.020
12 10.67 11 0.4716 0.012 -0.025 0.032 -0.002 -0.003 -0.009
18 13.36 17 0.7115 0.013 -0.002 0.020 0.008 0.010 -0.009
24 15.62 23 0.8711 0.012 -0.004 -0.011 0.012 0.012 0.011
30 34.93 29 0.2067 0.042 0.043 -0.005 -0.006 0.014 0.045
36 48.74 35 0.0614 -0.020 -0.025 -0.036 -0.029 0.019 -0.027
42 52.02 41 0.1162 0.007 -0.025 -0.011 0.001 -0.000 -0.015
48 57.08 47 0.1488 -0.006 -0.016 0.015 0.012 -0.006 -0.029
AR(p) models
y_t is an AR(p) process if
(y_t − μ) = φ_1(y_{t-1} − μ) + φ_2(y_{t-2} − μ) + ··· + φ_p(y_{t-p} − μ) + ε_t
where ε_1, ..., ε_n is WN(0, σ_ε²)
this is a multiple linear regression model with lagged values of the time series as the x-variables
the model can be reexpressed as
y_t = β_0 + φ_1 y_{t-1} + ··· + φ_p y_{t-p} + ε_t,
where β_0 = μ{1 − (φ_1 + ··· + φ_p)}
the least-squares estimator minimizes
Σ_{t=p+1}^{n} { y_t − (β_0 + φ_1 y_{t-1} + ··· + φ_p y_{t-p}) }²
the least-squares estimator can be calculated using a multiple linear regression program
one must create x-variables by lagging the time series with lags 1 through p
it is easier to use the ARIMA command in MINITAB or SAS, or SAS's AUTOREG procedure
these do the lagging automatically
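The "create lagged x-variables, then run multiple regression" recipe looks like this in Python (a sketch; the helper name is mine).

```python
import numpy as np

def fit_arp_ols(y, p):
    # regress y_t on (1, y_{t-1}, ..., y_{t-p}); returns (beta0_hat, phi_hats)
    y = np.asarray(y, dtype=float)
    n = len(y)
    X = np.column_stack([np.ones(n - p)] +
                        [y[p - k:n - k] for k in range(1, p + 1)])
    coef = np.linalg.lstsq(X, y[p:], rcond=None)[0]
    return coef[0], coef[1:]
```

Column k of the design matrix holds the series lagged by k; the first p observations are lost to lagging, matching the sum starting at t = p + 1.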
Example: GE daily returns
the SAS program was rerun with
model logR =/nlag = 1
replaced by
model logR =/nlag = 6
φ̂_i is significant at lags 1 and 6
but not at lags 2 through 5
"significant" means at α = .05, which corresponds to an absolute t-value bigger than 2
MINITAB will not allow p > 5
but SAS does not have such a constraint
Moving Average (MA) Processes
MA(1) processes
the moving average process of order 1 [MA(1)] is
y_t = μ + ε_t − θ ε_{t-1},
where, as before, the ε_t's are WN(0, σ_ε²)
can show that
E(y_t) = μ,
Var(y_t) = σ_ε² (1 + θ²),
γ(1) = −θ σ_ε²,
γ(h) = 0 if |h| > 1,
ρ(1) = −θ/(1 + θ²),
and
ρ(h) = 0 if |h| > 1
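As a quick numerical check (a sketch with arbitrary constants of mine), simulate an MA(1) and compare the lag-1 sample autocorrelation with ρ(1) = −θ/(1 + θ²), and the lag-2 autocorrelation with 0.

```python
import numpy as np

rng = np.random.default_rng(11)
theta, n = 0.5, 200_000
eps = rng.normal(size=n + 1)
y = eps[1:] - theta * eps[:-1]      # mu = 0

d = y - y.mean()
rho1 = np.dot(d[1:], d[:-1]) / np.dot(d, d)   # theory: -0.5/1.25 = -0.4
rho2 = np.dot(d[2:], d[:-2]) / np.dot(d, d)   # theory: 0
```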
General MA processes
the MA(q) process is
y_t = μ + ε_t − θ_1 ε_{t-1} − ··· − θ_q ε_{t-q}
can show that γ(h) = 0 and ρ(h) = 0 if |h| > q
formulas for γ(h) and ρ(h) when |h| ≤ q are given in time series textbooks
they are complicated but will not be needed by us
ARIMA Processes
ARMA (autoregressive and moving average): stationary time series with complex autocorrelation behavior are better modeled by mixed autoregressive and moving average processes
ARIMA (autoregressive, integrated, moving average): based on stationary ARMA processes, but nonstationary
ARIMA processes are easily described with the backwards operator, B
The backwards operator
the backwards operator B is defined by
B y_t = y_{t-1}
more generally,
B^k y_t = y_{t-k}
B c = c for any constant c, since a constant does not change with time
ARMA Processes
an ARMA(p, q) process satisfies the equation
(1 − φ_1 B − ··· − φ_p B^p)(y_t − μ) = (1 − θ_1 B − ··· − θ_q B^q) ε_t   (3)
a white noise process is ARMA(0, 0), since if p = q = 0, then (3) reduces to
(y_t − μ) = ε_t
The dierencing operator
the differencing operator is Δ = 1 − B, so that
Δ y_t = y_t − B y_t = y_t − y_{t-1}
differencing a time series produces a new time series consisting of the changes in the original series
for example, if p_t = log(P_t) is the log price, then the log return is
r_t = Δ p_t
differencing can be iterated
for example,
Δ² y_t = Δ(Δ y_t) = Δ(y_t − y_{t-1})
= (y_t − y_{t-1}) − (y_{t-1} − y_{t-2})
= y_t − 2y_{t-1} + y_{t-2}
From ARMA processes to ARIMA process
often the first or second differences of nonstationary time series are stationary
for example, the first differences of a random walk (nonstationary) are white noise (stationary)
a time series y_t is said to be ARIMA(p, d, q) if Δ^d y_t is ARMA(p, q)
for example, if the log returns r_t on an asset are ARMA(p, q), then the log prices p_t are ARIMA(p, 1, q)
ARIMA procedures in MINITAB and SAS allow one
to specify p, d, and q
an ARIMA(p, 0, q) model is the same as an
ARMA(p, q) model
ARIMA(p, 0, 0), ARMA(p, 0), and AR(p) models are
the same
Also, ARIMA(0, 0, q), ARMA(0, q), and MA(q)
models are the same
a random walk is an ARIMA(0, 1, 0) model
The inverse of differencing is "integrating"
the integral of a process y_t is
w_t = w_{t_0} + y_{t_0} + y_{t_0+1} + ··· + y_t
t_0 is an arbitrary starting time point
w_{t_0} is the starting value of the w_t process
The figure shows an AR(1), its integral, and its second integral, meaning the integral of its integral
[Figure: simulated ARIMA(1,0,0) with μ = 0 and φ = 0.4, its integral ARIMA(1,1,0), and its second integral ARIMA(1,2,0)]
Model Selection
once the parameters p, d, and q are selected, the coefficients can be estimated by maximum likelihood
but how do we choose p, d, and q?
generally, d is either 0, 1, or 2
it is chosen by looking at the SACF of y_t, Δ y_t, and Δ² y_t
a sign that a process is nonstationary is that its SACF decays to zero very slowly
if this is true of y_t, then the original series is nonstationary
and should be differenced at least once
if the SACF of Δ y_t looks stationary, then we use d = 1
otherwise, we look at the SACF of Δ² y_t
if this looks stationary, we use d = 2
real time series where Δ² y_t does not look stationary are rare
but if one were encountered, then d > 2 would be used
once d has been chosen, we fit ARMA(p, q) processes to Δ^d y_t
but we still need p and q
we compare various choices of p and q by some criterion that measures how well a model fits
AIC and SBC
AIC and SBC are model selection criteria based on the log-likelihood
Akaike's information criterion (AIC) is defined as
−2 log(L) + 2(p + q),
where L is the likelihood evaluated at the MLE
Schwarz's Bayesian Criterion (SBC) is defined as
−2 log(L) + log(n)(p + q),
where n is the length of the time series
SBC is also called the Bayesian Information Criterion (BIC)
the best model by either criterion is the model that minimizes that criterion
either criterion will tend to select models with a large likelihood value
this makes perfect sense, since a large L means the observed data are likely under that model
the term 2(p + q) in AIC, or log(n)(p + q) in SBC, is a penalty on having too many parameters
therefore, AIC and SBC try to trade off
a good fit to the data, measured by L
against the desire to use few parameters
which penalizes the most?
log(n) > 2 if n ≥ 8
most time series are much longer than 8
so SBC penalizes p + q more than AIC does
therefore, AIC will tend to choose models with more parameters than SBC
compared to SBC, with AIC the trade-off is more in favor of a large value of L than a small value of p + q
Unfortunately, MINITAB does not compute AIC and SBC, but SAS does.
Here's how you can calculate approximate AIC and SBC values using MINITAB.
It can be shown that log(L) ≈ −(n/2) log(σ̂²) + K,
where K is a constant that does not depend on the model or on the parameters.
Since we only want to minimize AIC and SBC, the exact value of K is irrelevant, and we will drop K.
Thus, you can use the approximations
AIC ≈ n log(σ̂²) + 2(p + q)
and
SBC ≈ n log(σ̂²) + log(n)(p + q).
σ̂² is called MSE (mean squared error) in the MINITAB output.
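These approximations are easy to compute by hand from any software's reported MSE; a Python sketch (the function name is mine):

```python
import math

def approx_aic_sbc(mse, n, p, q):
    # AIC ~ n*log(sigma_hat^2) + 2*(p+q);  SBC ~ n*log(sigma_hat^2) + log(n)*(p+q)
    base = n * math.log(mse)
    return base + 2 * (p + q), base + math.log(n) * (p + q)
```

For the GE AR(1) fit (MSE = 0.0002944, n = 252, p + q = 1), the two criteria share the same n·log(MSE) term and differ only through the penalty, where SBC's log(252) ≈ 5.5 exceeds AIC's 2.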
the difference between AIC and SBC is due to the way they were designed
AIC is designed to select the model that will predict best, and is less concerned with having a few too many parameters
SBC is designed to select the true values of p and q exactly
in practice, the best AIC model is usually close to the best SBC model
often they are the same model
models can be compared by likelihood ratio testing when one model is bigger than the other
in this sense, AIC and SBC are basically LR tests
Stepwise regression applied to AR processes
stepwise regression looks at many regression models
and sees which ones fit the data well
it will be discussed later
backwards regression:
starts with all possible x-variables
eliminates them one at a time
stops when all remaining variables are significant
this can be applied to AR models
SAS's AUTOREG procedure allows "backstepping" as an option
The following SAS program starts with an AR(6) model
and backsteps
options linesize = 72 ;
data ge ;
infile 'c:\courses\or473\data\ge_quart.dat' ;
input close ;
D_p = dif(close);
logP = log(close) ;
logR = dif(logP) ;
run ;
title 'GE - Quarterly closing prices, Dec 1900 to Dec 2000' ;
title2 'AR(6) with backstepping' ;
proc autoreg ;
model logR =/nlag = 6 backstep ;
run ;
Here is the SAS output:
GE - Quarterly closing prices, Dec 1900 to Dec 2000 1
AR(6) with backstepping
The AUTOREG Procedure
Dependent Variable logR
Ordinary Least Squares Estimates
SSE 0.15125546 DFE 38
MSE 0.00398 Root MSE 0.06309
SBC -102.20076 AIC -103.86432
Regress R-Square 0.0000 Total R-Square 0.0000
Durbin-Watson 2.0710
Standard Approx
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 0.0627 0.0101 6.21 <.0001
Estimates of Autocorrelations
Lag -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
0 | |********************|
1 | *| |
2 | *| |
3 | |******** |
4 | *| |
5 | ****| |
6 | |** |
GE - Quarterly closing prices, Dec 1900 to Dec 2000 2
AR(6) with backstepping
The AUTOREG Procedure
Backward Elimination of
Autoregressive Terms
Lag Estimate t Value Pr > |t|
4 0.020648 0.12 0.9058
2 -0.023292 -0.14 0.8921
1 0.035577 0.23 0.8226
6 0.082465 0.50 0.6215
5 0.170641 1.13 0.2655
Preliminary MSE 0.00328
Estimates of Autoregressive Parameters
Standard
Lag Coefficient Error t Value
3 -0.392878 0.151180 -2.60
Expected
Autocorrelations
Lag Autocorr
0 1.0000
1 0.0000
2 0.0000
3 0.3929
Yule-Walker Estimates
SSE 0.12476731 DFE 37
MSE 0.00337 Root MSE 0.05807
SBC -105.5425 AIC -108.86962
Regress R-Square 0.0000 Total R-Square 0.1751
Durbin-Watson 1.9820
GE - Quarterly closing prices, Dec 1900 to Dec 2000 3
AR(6) with backstepping
The AUTOREG Procedure
Standard Approx
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 0.0632 0.0146 4.33 0.0001
Expected
Autocorrelations
Lag Autocorr
0 1.0000
1 0.0000
2 0.0000
3 0.3929
Using the SACF to Choose d
[Figure: 3-month T-bill interest rates (months since Jan 1950), their first differences, the SACF of the rates, and the SACF of the differences]
The SACFs show that one should use d = 1
Using ARIMA in SAS: Cree data
in this example, we illustrate fitting an ARMA model in SAS
we will use daily log returns on Cree from December 1999 to December 2000
[Figure: CREE daily prices 12/17/99 to 12/15/00, returns, log returns, a normal plot of the log returns, and volatility]
The SAS program is:
options linesize = 72 ;
data cree ;
infile 'U:\courses\473\data\cree_daily.dat' ;
input month day year volume high low close ;
logP = log(close) ;
logR = dif(logP) ;
run ;
title 'Cree daily log returns' ;
title2 'ARMA(1,1)' ;
proc arima ;
identify var=logR ;
estimate p=1 q=1 ;
run ;
Here is the SAS output.
The Cree log returns appear to be white noise, since each of
φ̂_1 (denoted by AR1,1 in SAS)
θ̂_1 (denoted by MA1,1)
μ̂
is not significantly different from zero.
Autocorrelations
Lag Covariance Correlation -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
0 0.0045526 1.00000 | |********************|
1 0.00031398 0.06897 | . |* . |
2 -0.0000160 -.00351 | . | . |
3 -5.5958E-6 -.00123 | . | . |
4 -0.0002213 -.04862 | . *| . |
5 0.00002748 0.00604 | . | . |
6 -0.0000779 -.01712 | . | . |
7 -0.0000207 -.00454 | . | . |
8 -0.0003281 -.07207 | . *| . |
9 0.00015664 0.03441 | . |* .
10 0.00057077 0.12537 | . |*** |
Autocorrelation Check for White Noise
To Chi- Pr >
Lag Square DF ChiSq -------------Autocorrelations------------
6 1.91 6 0.9276 0.069 -0.004 -0.001 -0.049 0.006 -0.017
12 10.02 12 0.6143 -0.005 -0.072 0.034 0.125 0.052 -0.076
18 21.95 18 0.2344 -0.030 -0.123 0.051 -0.022 -0.013 -0.157
24 23.37 24 0.4978 0.014 -0.010 -0.037 -0.032 -0.047 0.01
Conditional Least Squares Estimation
Standard Approx
Parameter Estimate Error t Value Pr > |t| Lag
MU -0.0006814 0.0045317 -0.15 0.8806 0
MA1,1 -0.18767 0.88710 -0.21 0.8326 1
AR1,1 -0.11768 0.89670 -0.13 0.8957 1
Constant Estimate -0.00076
Variance Estimate 0.004585
Std Error Estimate 0.067712
AIC -638.889
SBC -628.301
Number of Residuals 252
* AIC and SBC do not include log determinant.
Autocorrelation Check of Residuals
To Chi- Pr >
Lag Square DF ChiSq -------------Autocorrelations------------
6 0.75 4 0.9444 0.000 0.004 0.001 -0.049 0.010 -0.019
12 8.54 10 0.5761 0.003 -0.075 0.032 0.118 0.050 -0.079
18 21.12 16 0.1741 -0.014 -0.127 0.062 -0.029 0.001 -0.159
24 22.48 22 0.4314 0.025 -0.011 -0.035 -0.026 -0.045 0.016
Three possible scenarios:
1. log returns are white noise
then log returns should pass the white noise test
2. log returns are not white noise but fit the time series model
then the log returns should fail the white noise test but
the residuals should pass the white noise test
3. log returns are not white noise and do not t the time
series model
then log returns and residuals will both fail the
white noise test
Warning
Don't rely too much on the residual tests for autocorrelation.
If n is large:
the autocorrelation might be small but statistically significant
in my opinion, SBC is a better guide to model choice than the residual tests for autocorrelation
Example: Three-month Treasury bill rates
our empirical results: log returns have little autocorrelation
but they are not exactly white noise
other financial time series do have substantial autocorrelation
example: monthly interest rates on three-month US Treasury bills from December 1950 until February 1996
the data come from Example 16.1 of Pindyck and Rubinfeld (1998), Econometric Models and Economic Forecasts
the rates are plotted in the next figure
the first differences look somewhat stationary
we will fit ARMA models to the first differences
[Figure: 3-month T-bill interest rates (months since Jan 1950), their first differences, the SACF of the rates, and the SACF of the differences]
first: an AR(10) model, fit with ARIMA
here is the SAS program.
options linesize = 72 ;
data rate1 ;
infile 'c:\courses\or473\data\fygn.dat' ;
input date $ z;
title 'Three month treasury bills' ;
title2 'ARIMA model - to first differences' ;
proc arima ;
identify var=z(1) ;
estimate p=10 plot;
run ;
Here is the SAS output.
Three month treasury bills 1
ARIMA model - to first differences
The ARIMA Procedure
Name of Variable = z
Period(s) of Differencing 1
Mean of Working Series 0.006986
Standard Deviation 0.494103
Number of Observations 554
Observation(s) eliminated by differencing 1
Autocorrelations
Lag Covariance Correlation -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
0 0.244138 1.00000 | |********************|
1 0.067690 0.27726 | . |****** |
2 -0.026212 -.10736 | **| . |
3 -0.022360 -.09159 | **| . |
4 -0.0091143 -.03733 | .*| . |
5 0.011399 0.04669 | . |*. |
6 -0.045339 -.18571 | ****| . |
7 -0.047987 -.19656 | ****| . |
8 0.022734 0.09312 | . |** |
9 0.047441 0.19432 | . |****
10 0.014282 0.05850 | . |*. |
Autocorrelation Check for White Noise
To Chi- Pr >
Lag Square DF ChiSq -------------Autocorrelations------------
6 75.33 6 <.0001 0.277 -0.107 -0.092 -0.037 0.047 -0.186
12 130.15 12 <.0001 -0.197 0.093 0.194 0.059 -0.007 -0.093
18 158.33 18 <.0001 0.036 0.157 -0.102 0.005 0.082 0.078
24 205.42 24 <.0001 -0.033 -0.232 -0.160 -0.015 -0.008 -0.030
Conditional Least Squares Estimation
Standard Approx
Parameter Estimate Error t Value Pr > |t| Lag
MU 0.0071463 0.02056 0.35 0.7283 0
AR1,1 0.33494 0.04287 7.81 <.0001 1
AR1,2 -0.16456 0.04501 -3.66 0.0003 2
AR1,3 0.01712 0.04535 0.38 0.7060 3
AR1,4 -0.10901 0.04522 -2.41 0.0163 4
AR1,5 0.14252 0.04451 3.20 0.0014 5
AR1,6 -0.21560 0.04451 -4.84 <.0001 6
AR1,7 -0.08347 0.04522 -1.85 0.0655 7
AR1,8 0.10382 0.04536 2.29 0.0225 8
AR1,9 0.10007 0.04502 2.22 0.0267 9
AR1,10 -0.04723 0.04290 -1.10 0.2714 10
Constant Estimate 0.006585
Variance Estimate 0.198648
Std Error Estimate 0.445699
Three month treasury bills 4
ARIMA model - to first differences
The ARIMA Procedure
AIC 687.6855
SBC 735.1743
Number of Residuals 554
* AIC and SBC do not include log determinant.
Three month treasury bills 5
ARIMA model - to first differences
The ARIMA Procedure
Autocorrelation Check of Residuals
To Chi- Pr >
Lag Square DF ChiSq -------------Autocorrelations------------
6 0.00 0 <.0001 0.003 -0.011 0.003 0.021 -0.015 -0.031
12 9.56 2 0.0084 0.036 -0.001 -0.031 0.018 0.105 -0.040
18 42.72 8 <.0001 -0.076 0.177 -0.115 0.081 0.019 0.025
24 62.06 14 <.0001 -0.062 -0.149 -0.078 -0.025 -0.024 -0.013
30 65.76 20 <.0001 0.002 0.008 0.045 0.048 -0.043 -0.007
36 73.52 26 <.0001 -0.070 -0.004 -0.051 -0.003 -0.053 -0.052
42 74.14 32 <.0001 -0.007 0.028 -0.007 -0.005 0.010 0.006
48 82.20 38 <.0001 -0.011 -0.000 -0.006 0.001 -0.103 0.050
Time Series Models 106
Autocorrelation Plot of Residuals
Lag  Covariance   Correlation
 0   0.198648     1.00000
 1   0.00057812   0.00291
 2   -0.0020959   -.01055
 3   0.00068451   0.00345
 4   0.0041792    0.02104
 5   -0.0030362   -.01528
 6   -0.0061377   -.03090
 7   0.0071315    0.03590
 8   -0.0001693   -.00085
 9   -0.0061781   -.03110
10   0.0036055    0.01815
11   0.020788     0.10465
12   -0.0078818   -.03968
13   -0.015171    -.07637
14   0.035240     0.17740
(SAS text plot of the residual autocorrelations omitted)
Time Series Models 107
AR(10) model does not fit well
try an AR(24) model with backfitting
here is the SAS program
options linesize = 72 ;
data rate1 ;
infile 'c:\courses\or473\data\fygn.dat' ;  /* quotes restored around the path */
input date $ z ;
zdif = dif(z) ;                            /* first differences */
title 'Three month treasury bills' ;
title2 'AR(24) model to first differences with backfitting' ;
proc autoreg ;
model zdif = / nlag=24 backstep ;
run ;
Time Series Models 108
Here is the output.
Three month treasury bills 1
AR(24) model to first differences with backfitting
The AUTOREG Procedure
Dependent Variable zdif
Ordinary Least Squares Estimates
SSE 135.25253 DFE 553
MSE 0.24458 Root MSE 0.49455
SBC 797.34939 AIC 793.032225
Regress R-Square 0.0000 Total R-Square 0.0000
Durbin-Watson 1.4454
Time Series Models 109
AR(24) model to first differences with backfitting
The AUTOREG Procedure
Backward Elimination of
Autoregressive Terms
Lag Estimate t Value Pr > |t|
10 0.007567 0.16 0.8721
23 0.010212 0.22 0.8241
17 0.008951 0.19 0.8492
3 -0.014390 -0.32 0.7496
24 0.015798 0.40 0.6907
13 0.041434 0.92 0.3605
7 0.038880 0.85 0.3964
18 -0.037456 -0.90 0.3702
22 0.042555 1.02 0.3090
20 0.058230 1.31 0.1912
4 0.059903 1.48 0.1389
9 -0.058141 -1.42 0.1562
Preliminary MSE 0.1765
Time Series Models 110
Estimates of Autoregressive Parameters
Standard
Lag Coefficient Error t Value
1 -0.388246 0.040419 -9.61
2 0.200242 0.040438 4.95
5 -0.108069 0.040513 -2.67
6 0.249095 0.039719 6.27
8 -0.103462 0.039668 -2.61
11 -0.102896 0.040278 -2.55
12 0.119950 0.040704 2.95
14 -0.204702 0.040427 -5.06
15 0.223381 0.042441 5.26
16 -0.151917 0.040811 -3.72
19 0.103356 0.038847 2.66
21 0.108074 0.039511 2.74
Time Series Models 111
My analysis of these results:
We are probably overfitting
Probably should look carefully at the SBC values of these
models
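As an illustrative aside (not from the slides), the SBC that SAS reports for a Gaussian model can be sketched, up to an additive constant, as n·log(SSE/n) plus a log(n) penalty per parameter:

```python
import math

def sbc(sse, n, k):
    """Schwarz Bayesian Criterion, up to an additive constant:
    n*log(SSE/n) + k*log(n); smaller values indicate a better
    trade-off between fit and the number of parameters k."""
    return n * math.log(sse / n) + k * math.log(n)

# each extra parameter must reduce SSE enough to offset a log(n) penalty
```

Comparing an AR(10) and an AR(24) fit this way, the larger model must lower the SSE substantially to overcome its extra 14·log(n) penalty.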
Time Series Models 112
Forecasting
ARIMA models can forecast future values
consider forecasting using an AR(1) process
have data y_1, . . . , y_n and estimates μ̂ and φ̂
Time Series Models 113
remember y_{n+1} = μ + φ(y_n − μ) + ε_{n+1}
and E(ε_{n+1} | y_1, . . . , y_n) = 0
so we estimate y_{n+1} by
ŷ_{n+1} := μ̂ + φ̂ (y_n − μ̂)
and y_{n+2} by
ŷ_{n+2} := μ̂ + φ̂ (ŷ_{n+1} − μ̂) = μ̂ + φ̂ {φ̂ (y_n − μ̂)},
etc.
Time Series Models 114
in general, ŷ_{n+k} = μ̂ + φ̂^k (y_n − μ̂).
if |φ̂| < 1 then as k increases the forecasts decay
exponentially fast to μ̂
forecasting general AR(p) processes is similar
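A minimal sketch of this closed form (illustrative Python, not the slides' SAS or MINITAB):

```python
def ar1_forecast(y_n, mu_hat, phi_hat, k):
    """k-step-ahead forecast from a fitted AR(1):
    y_hat[n+k] = mu_hat + phi_hat**k * (y_n - mu_hat)."""
    return mu_hat + phi_hat ** k * (y_n - mu_hat)

# with |phi_hat| < 1 the forecasts decay geometrically toward mu_hat
```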
Time Series Models 115
example: for an AR(2) process
y_{n+1} = μ + φ_1 (y_n − μ) + φ_2 (y_{n−1} − μ) + ε_{n+1}
therefore
ŷ_{n+1} := μ̂ + φ̂_1 (y_n − μ̂) + φ̂_2 (y_{n−1} − μ̂)
also
ŷ_{n+2} := μ̂ + φ̂_1 (ŷ_{n+1} − μ̂) + φ̂_2 (y_n − μ̂),
etc.
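The same recursion in illustrative Python (a sketch, not the slides' software): each new forecast is appended to the series so it can stand in for data at later steps.

```python
def ar2_forecasts(y, mu, phi1, phi2, k):
    """k successive forecasts from a fitted AR(2); y holds the
    observed series, and each forecast feeds the following step."""
    ext = list(y)
    for _ in range(k):
        ext.append(mu + phi1 * (ext[-1] - mu) + phi2 * (ext[-2] - mu))
    return ext[len(y):]
```

With phi2 = 0 this reduces to the AR(1) recursion on the previous slides.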
Time Series Models 116
forecasting ARMA and ARIMA processes is slightly
more complicated
this is discussed in time series courses such as ORIE 563
the forecasts can be generated automatically by
statistical software such as MINITAB and SAS
Time Series Models 117
GE daily returns
fitting an ARIMA(1,0,0) model to log returns is
equivalent to fitting an ARIMA(1,1,0) model to the
log prices
we will fit both models to the GE daily price data
next figure shows the forecasts of the log returns up
to 24 days ahead.
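The equivalence rests on first differences of log prices being log returns; a quick illustrative check (Python, not the slides' MINITAB):

```python
import math

def log_returns(prices):
    """First differences of the log prices, i.e. log(p[t]/p[t-1]);
    differencing the log-price series once yields the log returns."""
    return [math.log(p2) - math.log(p1) for p1, p2 in zip(prices, prices[1:])]
```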
Time Series Models 118
[Figure: GE log returns vs. time, showing the data, the forecasts, and the upper and lower forecast limits]
Time Series Models 119
[Figure: GE log prices vs. time, showing the data, the forecasts, and the upper and lower forecast limits]
Time Series Models 120
MINITAB always forecasts the input series
the two figures show that forecasts of a stationary
process behave very differently from forecasts of a
nonstationary process
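One way to quantify the difference, using standard formulas (an illustrative sketch, not from the slides): the k-step forecast-error variance of a stationary AR(1) stays bounded, while that of a random walk grows without bound.

```python
def ar1_forecast_error_var(sigma2, phi, k):
    """Error variance of the k-step AR(1) forecast:
    sigma2 * (1 - phi**(2k)) / (1 - phi**2),
    bounded above by sigma2 / (1 - phi**2)."""
    return sigma2 * (1 - phi ** (2 * k)) / (1 - phi ** 2)

def random_walk_forecast_error_var(sigma2, k):
    """For the nonstationary random walk the error variance is k * sigma2."""
    return k * sigma2
```

This is why the forecast limits in the log-return figure level off while those in the log-price figure keep widening.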
Time Series Models 121
MATLAB now has a GARCH Toolbox
Can be used to fit ARIMA models as well as GARCH
models
Time Series Models 122
MATLAB Program
load cree_daily_adjclose.txt ;
x = cree_daily_adjclose ;
n = length(x) ;
year = 1993 + (1:n)*(2006-1993)/n ;      % evenly spaced dates for plotting
net_return = price2ret(x) ;              % convert prices to returns
spec = garchset('R',2,'M',0) ;           % AR(2) mean model, no MA terms
[coeff,errors,LLF,innovations,sigmas,summary] = garchfit(spec,net_return) ;
garchdisp(coeff,errors)
garchplot(innovations,sigmas,net_return)
[h pvalue Qstat] = lbqtest(innovations,[6 12])   % Ljung-Box test at lags 6, 12
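lbqtest above reports Ljung-Box Q statistics; a minimal Python sketch of the statistic itself, assuming the standard formula Q = n(n+2) · Σ_{h=1..m} ρ̂_h² / (n − h):

```python
def ljung_box_q(x, m):
    """Ljung-Box Q statistic over lags 1..m, where rho_h is the
    lag-h sample autocorrelation of x; under the white-noise null,
    Q is compared against a chi-square distribution."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n          # lag-0 autocovariance
    total = 0.0
    for h in range(1, m + 1):
        ch = sum((x[t] - mean) * (x[t - h] - mean) for t in range(h, n)) / n
        total += (ch / c0) ** 2 / (n - h)
    return n * (n + 2) * total
```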
Time Series Models 123
MATLAB Output
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Diagnostic Information
Number of variables: 4
Functions
Objective: garchllfn
Gradient: finite-differencing
Hessian: finite-differencing (or Quasi-Newton)
Nonlinear constraints: armanlc
Gradient of nonlinear constraints: finite-differencing
Constraints
Number of nonlinear inequality constraints: 2
Number of nonlinear equality constraints: 0
Number of linear inequality constraints: 0
Number of linear equality constraints: 0
Number of lower bound constraints: 4
Number of upper bound constraints: 4
Algorithm selected
medium-scale
Time Series Models 124
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
End diagnostic information
max Directional First-order
Iter F-count f(x) constraint Step-size derivative optimality Procedure
0 5 -5005.73 -0.002674
1 31 -5005.73 -0.002674 -9.54e-007 -0.00417 0.581
2 57 -5005.73 -0.002674 -9.54e-007 -0.00825 1.81
3 78 -5005.73 -0.002674 3.05e-005 -0.000112 1.44
4 97 -5005.73 -0.002674 0.000122 -8.6e-007 1.26
Optimization terminated: magnitude of directional derivative in search
direction less than 2*options.TolFun and maximum constraint violation
is less than options.TolCon.
No active inequalities
Time Series Models 125
Mean: ARMAX(2,0,0); Variance: GARCH(0,0)
Conditional Probability Distribution: Gaussian
Number of Model Parameters Estimated: 4
Standard T
Parameter Value Error Statistic
----------- ----------- ------------ -----------
C 0.0010025 0.00090898 1.1029
AR(1) -0.025152 0.013994 -1.7973
AR(2) -0.026544 0.014982 -1.7717
K 0.0026744 4.2403e-005 63.0711
h = 0 0
pvalue = 0.8990 0.7574
Qstat = 2.2139 8.3481
Time Series Models 126
[Figure: three panels vs. time — the innovations, the conditional standard deviations, and the returns]