Beruflich Dokumente
Kultur Dokumente
mm 40 60 80 100 120
40
Eric Zivot
80
University of Washington
Guy Yollin
University of Washington
Zivot & Yollin (Copyright
2012)
R/Finance 2012
1 / 90
Outline
mm
1
40
60
80
100
120
Introduction to state space models and the dlm package DLM estimation and forecasting examples
40
Structural time series models and StructTS Exponential smoothing models and the forecast package
60
2012)
R/Finance 2012
2 / 90
Lecture references
G. mm Petris, S. Petrone, and P.60 Campagnoli80 40 Dynamic Linear Models with R. Springer, 2009. J. Durbin and S. J. Koopman 40 Series Analysis by State Space Methods. Time Oxford University Press, 2001. J. J. F. Commandeur and S. J. Koopman An Introduction to State Space Time Series Analysis. 60 Oxford University Press, 2007. Rob Hyndman Forecasting with Exponential Smoothing: The State Space Approach . 80 Springer, 2008
Zivot & Yollin (Copyright
100
120
2012)
R/Finance 2012
3 / 90
Outline
mm
1
40
60
80
100
120
Introduction to state space models and the dlm package DLM estimation and forecasting examples
40
Structural time series models and StructTS Exponential smoothing models and the forecast package
60
2012)
R/Finance 2012
4 / 90
t (mp )(p 1)
+ vt ,
m1
vt iid N (0, Vt )
The transition equation for the state vector t is the rst order Markov process t p 1 = Gt t 1 + wt
80 (p p )(p 1)
(p 1)
wt iid N (0, Wt )
2012)
R/Finance 2012
5 / 90
The matrices Ft , Vt , Gt ,and Wt are called the system matrices, and contain non-random elements.
40
If these matrices do not depend deterministically on t the state space system is called time invariant. Note: 60 If yt is covariance stationary, then the state space system will be time invariant.
80
2012)
R/Finance 2012
6 / 90
If some or all of the elements of t are covariance stationary, then we can typically solve for the corresponding elements of m0 and C0 analytically from the elements of the system matrices
60
For deterministic elements of (e.g., mean of a series), the corresponding element of C0 is dened to be zero. For non-stationary elements of , it is customary to set the corresponding element of C0 to a very large positive number, say 106 .
Zivot & Yollin (Copyright
80
2012)
R/Finance 2012
7 / 90
ME : yt = ct
2 TE : ct = 1 ct 1 + 2 ct 2 + t , t N (0, )
40
1 1 2 0
ct 1 2 ct 2
t 0
, wt =
t 0
2012)
R/Finance 2012
8 / 90
0 = E [t ] = GE [t 1 ] + E [wt ] = GE [t ] E [t ](I2 G) = 0
80
m0 = E [t ] = 0
Zivot & Yollin (Copyright
2012)
R/Finance 2012
9 / 90
For the state variance, stationarity of t = Gt 1 + wt implies that for all t var(t ) = Gvar(t )G + var(wt )
40
C0 = GC0 G + W
Stacking columns via the vec() operator then gives vec(60 C0 ) = (G G) vec(C0 ) + vec(W) vec(C0 ) = (I4 G G)1 vec(W)
80
2012)
R/Finance 2012
10 / 90
2 ME : ct = ct 1 + t + t 1 , t N (0, )
yt = ( 1 0 )t ct t 60 = 1 0 0 ct 1 t 1 + t t
0 0
1 2
R/Finance 2012 11 / 90
40
Dene t = (t , t )
60
yt = ( 1 xt )t + vt = 1 0 0 1 t 1 t 1 + w,t w,t
t t
80
2012)
R/Finance 2012
12 / 90
60
80
100
120
40 G= I2 , W =
2 0 2 0
Notice that Ft is time varying. Because the t is non-stationary the initial distribution is
60
0 N (m0 , C0 ) m0 = 0, C0 = k I2 , k = 107
80
2012)
R/Finance 2012
13 / 90
rt = t ut , ut N (0, 1)
2 ln t = + ln t 1 + t , t N (0, ), || < 1
ln |rt | = ln t + ln |ut |
2 E [|u60 t |] = 0.63518, var (|ut |) = /8
2012)
R/Finance 2012
14 / 90
+ vt 0 0 0.63518 0 + 0 1 0 1 ln t 1 t
0 0 0 0 0 1 0 , W = 0 0 0 2 0 0 1
2012)
R/Finance 2012
15 / 90
80
100
120
12
Note: in 80 dlm, you cant have elements of C0 exactly zero; use very small number 1e-7 instead.
Zivot & Yollin (Copyright
2012)
R/Finance 2012
16 / 90
List Name FF v GG W C0 m0
60
F V G W C0 m0 data
yt =80 Ft t + vt t = Gt t 1 + wt
Zivot & Yollin (Copyright
2012)
40
60
Model generic DLM ARMA process nth order polynomial DLM Linear regression Periodic Seasonal factors Periodic Trigonometric form
80
2012)
R/Finance 2012
18 / 90
[1] "dlm"
60
"C0" "FF" "V" "GG" "W" "JFF" "JV" "JGG" "JW"
80
2012)
R/Finance 2012
19 / 90
mm
40
60
80
100
120
40
60
80
2012)
R/Finance 2012
20 / 90
80
2012)
R/Finance 2012
21 / 90
mm
40
60
80
100
120
40
[,1] [,2] [1,] 0.01 0.00 [2,] 0.00 0.01 $m0 [1] 0 0 $C0
60
80
2012)
R/Finance 2012
22 / 90
60
80
100
120
40
60
80
2012)
R/Finance 2012
23 / 90
mm 40 60 80 ######################################## # EX 3. state space for SV model ######################################## # ln|r(t)| = -0.63518 + lns(t) + v(t), v(t) ~ (0,pi^2/8) # lns(t) = w + phi*lns(t-1) + w(t), w(t) ~ N(0, s2w) # theta = (-0.63518, w, lns(t))' 40 # m0 = (-0.63518, w, w/(1-phi))' # C0 = I*1e-7, C0[3,3] = sw2/(1-phi^2) phi = 0.9 sig2n = 1 omega = 0.1 F.mat = matrix(c(1,0,1),1,3) V.val = 60 pi^2/8 G.mat = matrix(c(1,0,0,0,1,0,0,1,phi),3,3) W.mat = diag(0,3) W.mat[3,3] = sig2n m0.vec = c(-0.63518, omega, omega/(1-phi)) C0.mat = diag(1,3)*1e-7 80 = sig2n/(1-phi^2) C0.mat[3,3] SV.dlm = dlm(FF=F.mat, V=V.val, GG=G.mat, W=W.mat, m0=m0.vec, C0=C0.mat)
100
120
2012)
R/Finance 2012
24 / 90
mm
40
60
80
100
120
40
[,1] [,2] [,3] [1,] 1 0 0.0 [2,] 0 1 1.0 [3,] 0 0 0.9 $W [1,] [2,] [3,] [,1] [,2] [,3] 0 0 0 0 0 0 0 0 1
60
0.10000
1.00000
[,1] [,2] [,3] [1,] 1e-07 0e+00 0.000000 [2,] 0e+00 1e-07 0.000000 [3,] 0e+00 0e+00 5.263158 Zivot & Yollin (Copyright
80
2012)
R/Finance 2012
25 / 90
In a state space model, the unobserved state vector t is the signal and the measurement 40 error wt is the noise. Given observed data y1 , . . . , yT the goals of state space estimation are:
1
Optimal signal extraction Optimal 60 hstep ahead prediction of states and data
80
2012)
R/Finance 2012
26 / 90
Filtering: Optimal estimates of t given information available at time t , It 40 = {y1 , . . . , yt } : E [t |It ] = ltered estimate of
Smoothing: Optimal estimates of t given information available at time60 T , IT = {y1 , . . . , yT } E [t |IT ] = smoothed estimate of
80
2012)
R/Finance 2012
27 / 90
The Kalman lter is a set of recursion equations for determining the optimal estimates of the state vector t given information available at time t , It . The lter consists of two sets of equations:
1 2
To describe the lter, let mt = E [t |It ] = optimal estimator of t based on It Ct = E [(t mt )(t mt ) |It ] = MSE matrix of mt
80 60
2012)
R/Finance 2012
28 / 90
Prediction Equations
Given mt 1 and Ct 1 at time t 1, the optimal predictor of t and its associated MSE matrix mm 40 are 60 80 100 120 mt |t 1 = E [t |It 1 ] = Gt mt 1 Ct |t 1 = E [(t mt 1 )(t mt 1 ) |It 1 ]
40
= Gt Ct 1 Gt + Wt
The corresponding optimal predictor of yt give information at t 1 is yt |t 1 = E [yt |It 1 ] = Ft mt |t 1 The prediction error and its MSE matrix are et = yt yt |t 1 = yt Ft mt |t 1
80 = Ft (t mt |t 1 ) + vt 60
E [et et ] = Qt = Ft Ct |t 1 Ft + Vt
Zivot & Yollin (Copyright
2012)
R/Finance 2012
29 / 90
Updating Equations
mm 40 60 80 100 120
When new observations yt become available, the optimal predictor mt |t 1 and its MSE matrix are updated using
1 mt = 40mt |t 1 + Ct |t 1 Ft Qt (yt Ft mt |t 1 ) 1 = mt |t 1 + Ct |t 1 Ft Q t vt 1 Ct = Ct |t 1 Ct |t 1 Ft Q t Ft Ct |t 1
60
1 Note: Kt = Ct |t 1 Ft Q t = Kalman gain matrix. It gives the weight on new information et = yt Ft mt |t 1 in the updating equation for mt .
80
2012)
R/Finance 2012
30 / 90
Kalman Smoother
mm 40 60 80 100 120
Once all data IT is observed, the optimal estimates E [t |IT ] can be computed using the backwards Kalman smoothing recursions
40
E [t |IT ] = mt |T = mt + C t mt +1|T Gt +1 mt
60
The algorithm starts by setting mT |T = mT and CT |T = CT and then proceeds backwards for t = T 1, . . . , 1.
80
2012)
R/Finance 2012
31 / 90
Let denote the parameters of the state space model, which are embedded in the system matrices Ft , Gt , Wt and Vt . These parameters are typically unknown and must be estimated from the data y = {y1 , . . . , yT }.
40
In the linear-Gaussian state space model, the parameter vector can be estimated be estimated by maximum likelihood using the prediction error decomposition of the log-likelihood
60
T
ln f (yt |It 1 ; )
2012)
R/Finance 2012
32 / 90
1 1 et ( ) Q t et ( ) 2
The prediction error decomposition of the Gaussian log-likelihood function follows immediately:
60
ln L( |y) =
NT 1 ln(2 ) 2 2
T
ln |Qt ( )|
t =1
80
1 2
1 et ( )Q t ( )et ( ) t =1
Time Series with State Space Models R/Finance 2012 33 / 90
2012)
Forecasting
mm 40 60 80 100 120
The Kalman lter prediction equations produces in-sample 1-step ahead forecasts and MSE matrices Out-of-sample hstep ahead predictions and MSE matrices can be computed from the prediction equations by extending the data set y1 , . . . , yT with a set of h missing values
When y is missing the Kalman lter reduces to the prediction step so 60 a sequence of h missing values at the end of the sample will produce a set of hstep ahead forecasts for j = 1, . . . , h 40
80
2012)
R/Finance 2012
34 / 90
40
60
80
2012)
R/Finance 2012
35 / 90
Outline
mm
1
40
60
80
100
120
Introduction to state space models and the dlm package DLM estimation and forecasting examples
40
Structural time series models and StructTS Exponential smoothing models and the forecast package
60
2012)
R/Finance 2012
36 / 90
mm 40 > # ols fit - constant equity beta > ols.fit = lm(HAM1 ~ sp500) > summary(ols.fit)
Call: lm(formula = HAM1 ~ sp500)
60
80
100
120
40
Residuals: Min 1Q Median -5.1782 -1.3871 -0.2147 3Q 1.2626 Max 5.7441
Coefficients: 60Estimate Std. Error t value Pr(>|t|) (Intercept) 0.57747 0.16971 3.403 0.000887 *** sp500 0.39007 0.03908 9.981 < 2e-16 *** --Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1
' '
Residual standard error: 1.934 on 130 degrees of freedom 80 Multiple R-squared: 0.4339, Adjusted R-squared: 0.4295 F-statistic: 99.63 on 1 and 130 DF, p-value: < 2.2e-16
Zivot & Yollin (Copyright
2012)
R/Finance 2012
37 / 90
60
80
100
120
> # function to build TVP ss model > buildTVP <- function(parm, x.mat){ parm <- exp(parm) return( dlmModReg(X=x.mat, dV=parm[1], dW=c(parm[2], parm[3])) ) } 40 > # maximize over log-variances > start.vals = c(0,0,0) > names(start.vals) = c("lns2v", "lns2a", "lns2b") > TVP.mle = dlmMLE(y=HAM1, parm=start.vals, x.mat=sp500, build=buildTVP, hessian=T) 60 > class(TVP.mle) [1] "list" > names(TVP.mle) [1] "par" 80 [6] "hessian" "value" "counts" "convergence" "message"
2012)
R/Finance 2012
38 / 90
60
80
100
120
40
60
$message [1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH" $hessian lns2v lns2a lns2b lns2v 5.943783e+01 2.871303e-05 1.853667e+00 lns2a 2.871303e-05 1.566605e-04 3.391776e-05 lns2b 1.853667e+00 3.391776e-05 1.860590e+00
80
2012)
R/Finance 2012
39 / 90
mm
40
60
80
100
120
sv sa sb 1.32902368 0.03094178 0.23528498 > > > > > # fitted ss model TVP.dlm <- buildTVP(TVP.mle$par, sp500) # filtering TVP.f <- dlmFilter(HAM1, TVP.dlm) class(TVP.f)
40
60
"U.C" "D.C" "a" "U.R" "D.R" "f"
"mod" "m"
> # smoothing > TVP.s <- dlmSmooth(TVP.f) > class(TVP.s) [1] "list"
80
"U.S" "D.S"
2012)
R/Finance 2012
40 / 90
> # extract smoothed states - intercept and slope coefs > alpha.s = xts(TVP.s$s[-1,1,drop=FALSE], as.Date(rownames(TVP.s$s[-1,]))) > beta.s = xts(TVP.s$s[-1,2,drop=FALSE], 40 as.Date(rownames(TVP.s$s[-1,]))) > colnames(alpha.s) = "alpha" > colnames(beta.s) = "beta" > # extract std errors - dlmSvd2var gives list of MSE matrices > mse.list = dlmSvd2var(TVP.s$U.S, TVP.s$D.S) > se.mat = t(sapply(mse.list, FUN=function(x) sqrt(diag(x)))) > se.xts =60 xts(se.mat[-1, ], index(beta.s)) > colnames(se.xts) = c("alpha", "beta") > a.u = alpha.s + 1.96*se.xts[,"alpha"] > a.l = alpha.s - 1.96*se.xts[, "alpha"] > b.u = beta.s + 1.96*se.xts[,"beta"] > b.l = beta.s - 1.96*se.xts[, "beta"]
80
2012)
R/Finance 2012
41 / 90
mm 40 60 a.u), main="Smoothed 80 100 120 > chart.TimeSeries(cbind(alpha.s, a.l, estimates of alpha", ylim=c(0,1), colorset=c(1,2,2), lty=c(1,2,2),ylab=expression(alpha),xlab="")
Smoothed estimates of alpha
1.0 0.0 Jan 96 0.2 0.4 0.6 0.8
40
60
80
Jul 97 Jul 98 Jul 99 Jul 00 Jul 01 Jul 02 Jul 03 Jul 04 Jul 05 Jul 06
2012)
R/Finance 2012
42 / 90
mm 40 60 80 100 of beta", 120 > chart.TimeSeries(cbind(beta.s, b.l, b.u), main="Smoothed estimates colorset=c(1,2,2), lty=c(1,2,2),ylab=expression(beta),xlab="")
Smoothed estimates of beta
40
1.0
60
80
0.0 Jan 96
0.2
0.4
0.6
0.8
Jul 97 Jul 98 Jul 99 Jul 00 Jul 01 Jul 02 Jul 03 Jul 04 Jul 05 Jul 06
2012)
R/Finance 2012
43 / 90
mm
> # forecasting using dlmFilter > # add 10 missing values to end of sample > new.xts = xts(rep(NA, 10), seq.Date(from=end(HAM1), by="months", length.out=11)[-1]) > HAM1.ext = merge(HAM1, new.xts)[,1] > TVP.ext.f 40= dlmFilter(HAM1.ext, TVP.dlm) > # extract h-step ahead forecasts of state vector > TVP.ext.f$m[as.character(index(new.xts)),] [,1] 2007-01-30 0.5333932 2007-03-02 0.5333932 2007-03-3060 0.5333932 2007-04-30 0.5333932 2007-05-30 0.5333932 2007-06-30 0.5333932 2007-07-30 0.5333932 2007-08-30 0.5333932 2007-09-3080 0.5333932 2007-10-30 0.5333932 [,2] 0.6872751 0.6872751 0.6872751 0.6872751 0.6872751 0.6872751 0.6872751 0.6872751 0.6872751 0.6872751
40
60
80
100
120
2012)
R/Finance 2012
44 / 90
80
2012)
R/Finance 2012
45 / 90
# create state space # ln(r(t)) = -0.63518 + ln(s(t-1)) + v(t) # ln(s(t)) = w + phi*ln(s(t-1)) + n(t) buildSV = function(parm) { # parm[1]=phi, parm[2]=omega, parm[3]=lnsig2n parm[3] = exp(parm[3]) 40 F.mat = matrix(c(1,0,1),1,3) V.val = pi^2/8 G.mat = matrix(c(1,0,0,0,1,0,0,1,parm[1]),3,3, byrow=TRUE) W.mat = diag(0,3) W.mat[3,3] = parm[3] m0.vec = c(-0.63518, parm[2], parm[2]/(1-parm[1])) 60 C0.mat = diag(1,3)*1e7 C0.mat[1,1] = 1e-7 C0.mat[2,2] = 1e-7 C0.mat[3,3] = parm[3]/(1-parm[1]^2) SV.dlm = dlm(FF=F.mat, V=V.val, GG=G.mat, W=W.mat, m0=m0.vec, C0=C0.mat) 80 return(SV.dlm)
mm
40
60
80
100
120
}
Zivot & Yollin (Copyright
2012)
R/Finance 2012
46 / 90
120
> SV.s <- dlmSmooth(SV.f) 60 > names(SV.s) [1] "s" > > > > > "U.S" "D.S"
# extract smoothed estimate of logvol logvol.s = xts(SV.s$s[-1,3,drop=FALSE], as.Date(rownames(SV.s$s[-1,]))) colnames(logvol.s) = "Volatility" 80 # plot absolute returns with smoothed volatility plot.zoo(cbind(abs(GSPC.ret), exp(logvol.s)), main="Absolute Returns and Volatility",col=c(4,1),lwd=1:2,xlab="")
2012)
R/Finance 2012
47 / 90
4 0 2 4 6 8 1 2 3
GSPC
40
Volatility
60
80
2000 2002 2004 2006 2008 2010 2012
2012)
R/Finance 2012
48 / 90
40
60
80
100
120
> SV.fcst = dlmForecast(SV.f, nAhead=100) > logvol.fcst = SV.fcst$a[,3] > se.mat = t(sapply(SV.fcst$R, FUN=function(x) sqrt(diag(x)))) > se.logvol 40= se.mat[,3] > lvol.l = logvol.fcst - 2*se.logvol > lvol.u = logvol.fcst + 2*se.logvol > n.obs = length(logvol.s) > n.hist = n.obs - 100 > new.xts = xts(cbind(logvol.fcst, lvol.l, lvol.u), 60 seq.Date(from=end(logvol.s), by="days", length.out=100)) > chart.TimeSeries(merge(logvol.s[n.hist:n.obs],new.xts), main="log volatility forecasts", ylab="log volatility",xlab="", lty=c(1,1,2,2), colorset=c("black","blue", "red", "red"), event.lines=list(as.character(end(logvol.s)))) 80
2012)
R/Finance 2012
49 / 90
0.5
40
log volatility
60
80
20111108 20120103 20120301 20120501 20120701
R/Finance 2012 50 / 90
2012)
Outline
mm
1
40
60
80
100
120
Introduction to state space models and the dlm package DLM estimation and forecasting examples
40
Structural time series models and StructTS Exponential smoothing models and the forecast package
60
2012)
R/Finance 2012
51 / 90
40
60
80
100
120
Main arguments: x type init xed univariate time series (numeric vector or time series) species local level, linear trend, or basic structural model initial parameter values (optional) specied values for xed variables (optional)
60
80 Return value:
2012)
R/Finance 2012
52 / 90
40
60 N (0 , 2 ),
120
t +1 = t + t ,
2 t N (0, ),
state equation
level
80
2012)
R/Finance 2012
53 / 90
40
N (0, 2 ), 80 observation 60 100 equation 120 state equation, level state equation, slope
t +1 = t + t + t , t +1 = t + t ,
40
R Code: The StructTS function
> StructTS(GAS,type="trend")
2 t N (0, ), 2 t N (0, ),
Call: StructTS(x60 = GAS, type = "trend") Variances: level slope 0.006082 0.008082
epsilon 0.000000
80
slope
2012)
R/Finance 2012
54 / 90
60
80
100
120
xt = t + t +
N (0, 2 ),
t +140 = t + t + t , t +1 = t + t , t +1 = (t + t 1 + . . .
60
2 t N (0, ), 2 t N (0, ),
+ t S +2 ) + t ,
2 t N (0, ),
80
2012)
R/Finance 2012
55 / 90
60
80
100
120
seas 2.496e-06
epsilon 0.000e+00
60
2012)
R/Finance 2012
56 / 90
Outline
mm
1
40
60
80
100
120
Introduction to state space models and the dlm package DLM estimation and forecasting examples
40
Structural time series models and StructTS Exponential smoothing models and the forecast package
60
2012)
R/Finance 2012
57 / 90
2 t N (0, ),
Hyndman, 2008
2012)
R/Finance 2012
58 / 90
y =T +S +E Trend (T) long-term direction Seasonal 40 (S) periodic pattern Error (E) random error component Trend components:
60 None Additive Additive Damped Multiplicative 80 Multiplicative Damped
see Hyndman 2008
Zivot & Yollin (Copyright
Th Th Th Th Th
=l = l + bh = l + ( + 2 + . . . + h )b = lbh = lb( + 2 + . . . + h )
2012)
R/Finance 2012
59 / 90
Seasonal Component 80A 100M 120 N None Additive Multiplicative N,N A,N Ad ,N M,N Md ,N N,A A,A Ad ,A M,A Md ,A N,M A,M Ad ,M M,M Md ,M
N,N simple exponential smoothing A,N Holt linear method A,A additive Holt-Winters A,M multiplicative Holt-Winters 80 Ad ,N damped trend (additive errors) Ad ,M damped trend (multiplicative errors)
Zivot & Yollin (Copyright
2012)
R/Finance 2012
60 / 90
yt = 1 1 xt 1 + t xt =80 1 1 x + 0 1 t 1 t
2012)
R/Finance 2012
61 / 90
Journal of 40Statistical Software technical paper Automatic Time Series Forecasting: The forecast Package for R http://www.jstatsoft.org/v27/i03
60 Key functions
ets - exponential smoothing state space models forecast - forecast n-steps ahead with condence levels plot.forecast - create color-coded forecast probability cone
80
2012)
R/Finance 2012
62 / 90
40 model = "ZZZ", damped = NULL, alpha = NULL, beta = NULL, function (y, gamma = NULL, phi = NULL, additive.only = FALSE, lambda = NULL, lower = c(rep(1e-04, 3), 0.8), upper = c(rep(0.9999, 3), 0.98), opt.crit = c("lik", "amse", "mse", "sigma", "mae"), nmse = 3, bounds = c("both", "usual", "admissible"), ic = c("aic", "aicc", "bic"), restrict = TRUE) 60 NULL
Main arguments: y univariate time series (numeric vector or time series) model 80 3-letter model identifying for the error, trend, season components damped TRUE for damped trend models
Zivot & Yollin (Copyright
2012)
R/Finance 2012
63 / 90
matrix of upper condence bounds for prediction intervals condence values associated with the prediction intervals time series of tted values original time series of data name of the method used to t the model
80
2012)
R/Finance 2012
64 / 90
Outline
mm
1
40
60
80
100
120
Introduction to state space models and the dlm package DLM estimation and forecasting examples
40
Structural time series models and StructTS Exponential smoothing models and the forecast package
60
2012)
R/Finance 2012
65 / 90
40
60
2012)
R/Finance 2012
66 / 90
80
100
120
80
2012)
R/Finance 2012
67 / 90
40
60
80
100
120
function (x, h, lambda = NULL, ...) { require(forecast) fit <-40 ets(x, lambda = lambda, ...) forecast(fit, h = h, lambda = lambda, fan = TRUE) } <environment: namespace:tsfssm> > stsForecast
60 h, lambda = NULL, ...) function (x, { require(forecast) if (!is.null(lambda)) x <- BoxCox(x = x, lambda = lambda) fit <- StructTS(x, ...) 80 forecast(fit, h = h, lambda = lambda, fan = TRUE) } <environment: namespace:tsfssm>
Zivot & Yollin (Copyright
2012)
R/Finance 2012
68 / 90
mm > cv.ts <function(x, FUN, 40 tsControl, 60 ...) { 80 100 120 stepSize <- tsControl$stepSize maxHorizon <- tsControl$maxHorizon minObs <- tsControl$minObs fixedWindow <- tsControl$fixedWindow freq <- frequency(x) n <- length(x) 40 st <- tsp(x)[1]+(minObs-2)/freq steps <- seq(1,(n-minObs),by=stepSize) cl <- makeCluster( detectCores()-1 ) registerDoParallel(cl) # register foreach backend forcasts <- foreach(i=steps, .multicombine=FALSE) %dopar% { if (fixedWindow) { 60 training.window <- window(x, start=st+(i-minObs+1)/freq, end=st+i/freq) } else { training.window <- window(x, end=st + i/freq) } return(FUN(training.window, h=maxHorizon, ...)) } 80 stopCluster(cl) return( forcasts ) }
Zivot & Yollin (Copyright
2012)
R/Finance 2012
69 / 90
mm > head(energyPrices,3)
1990-08-01 1990-09-01 1990-10-01
40
60
80
100
120
OILPRICE GASREGCOVM MHOILNYH 27.174 1.218 0.753 33.687 1.258 0.888 35.922 1.335 0.942
40 > GAS <- ts(coredata(energyPrices[,"GASREGCOVM"]), start=as.yearmon(time(energyPrices)[1]), frequency=12) > plot(as.xts(GAS),main="Price of regular gasoline")
Price of regular gasoline
4.0 1.0 Aug 1990 Feb 1994 2.0 3.0
60
80
Feb 1998
Feb 2002
Feb 2006
Feb 2010
2012)
R/Finance 2012
70 / 90
80
100
120
> zzz.list <- cv.ts(x=GAS, FUN=etsForecast, tsControl=myControl, lambda=0) > class(zzz.list) [1] "list"
40
> class(zzz.list[[length(zzz.list)]]) [1] "forecast" > names(zzz.list[[length(zzz.list)]]) [1] "model" 60 [7] "fitted" "mean" "method" "level" "x" "residuals" "upper" "lower"
> names(zzz.list[[length(zzz.list)]]$model) [1] [6] [11] [16] "loglik" "amse" "par"80 "initstate" "aic" "fit" "m" "sigma2" "bic" "residuals" "method" "x" "aicc" "mse" "fitted" "states" "components" "call" "lambda"
> plot(zzz.list[[length(zzz.list)]],include=3*12)
Zivot & Yollin (Copyright
2012)
R/Finance 2012
71 / 90
40
5 4
60
3 2
80
2010 2011
Time Series with State Space Models
2012
2013
R/Finance 2012 72 / 90
2012)
mm <- sapply(zzz.list,function(x) > ets.models paste(x$model$components,collapse="-")) 40 60 80 100 120 > barplot(sort(table(ets.models)/length(ets.models),dec=TRUE), las=1,col=4,cex.names=0.6) > title("Frequency of Auto-Selected Models") 40
0.5 0.4 0.3 0.2 0.1 0.0
60
80
ANNFALSE
ANAFALSE
AANTRUE
AAATRUE
AAAFALSE
2012)
R/Finance 2012
73 / 90
mm 40 60 80 100 > unpackForecasts <- function(model.list) { ts.list <- lapply(model.list, function(x) {x$mean}) num.models <- length(model.list) ts.st <- tsp(ts.list[[1]])[1] ts.end <- tsp(ts.list[[num.models]])[2] 40<- seq(ts.st,ts.end,by=1/12) date.seq forecast.ts <- ts(matrix(NA,nrow=length(date.seq),ncol=12, dimnames=list(NULL,1:12)),start=ts.st,frequency=12) for(l in 1:num.models) { f <- ts.list[[l]] for(j60 in 1:12) { time.stamp <- tsp(f)[1]+(j-1)/12 window(forecast.ts[,j],start=time.stamp, end=time.stamp) <- window(f,start=time.stamp,end=time.stamp) } 80 } return( forecast.ts ) }
Zivot & Yollin (Copyright
120
2012)
R/Finance 2012
74 / 90
mm
100
120
6 3.85 3.63 3.61 3.61 3.57 3.54 3.86 3.99 3.33 4.04 NA NA NA NA NA NA
7 3.75 3.85 3.63 3.61 3.61 3.57 3.66 3.89 4.01 3.33 3.95 NA NA NA NA NA
8 3.51 3.75 3.85 3.63 3.61 3.61 3.57 3.70 3.88 4.00 3.33 3.71 NA NA NA NA
9 3.62 3.51 3.75 3.85 3.63 3.61 3.61 3.57 3.70 3.85 3.89 3.33 3.49 NA NA NA
10 3.06 3.64 3.51 3.75 3.85 3.63 3.61 3.61 3.57 3.67 3.79 3.58 3.33 3.36 NA NA
11 2.95 3.06 3.65 3.51 3.75 3.85 3.63 3.61 3.61 3.57 3.62 3.52 3.32 3.33 3.44 NA
12 3.06 2.95 3.06 3.66 3.51 3.75 3.85 3.63 3.61 3.61 3.57 3.40 3.33 3.22 3.33 3.52
R/Finance 2012 75 / 90
> modelAccuracy <- function(forecast.ts,actual.ts) { cnames <- c("RMSE","MAE","MAPE","MdAPE","Min-Err","Max-Err","Max-APE") stat.tab <- matrix(data=NA,nrow=length(cnames),ncol=12, 40 dimnames=list(cnames,1:12)) res <- forecast.ts-actual.ts pe <- log(forecast.ts)-log(actual.ts) stat.tab["RMSE",] <- round(sqrt(apply(res^2,2,mean,na.rm=T)),2) stat.tab["MAE",] <- round(apply(abs(res),2,mean,na.rm=T),2) stat.tab["MAPE",] <- round(100*apply(abs(pe),2,mean,na.rm=T), 2) 60 stat.tab["MdAPE",] <- round(100*apply(abs(pe),2,median,na.rm=T), 2) stat.tab["Min-Err",] <- round(apply(res,2,min,na.rm=T),2) stat.tab["Max-Err",] <- round(apply(res,2,max,na.rm=T),2) stat.tab["Max-APE",] <- round(100*apply(abs(pe),2,max,na.rm=T),1) return( stat.tab ) }
80
2012)
R/Finance 2012
76 / 90
80
100
120
2012)
R/Finance 2012
77 / 90
80
100
120
plot(0,type="n",xlim=c(1,12),ylim=c(3,19),xlab="",ylab="MdAPE") grid() lines(modelAccuracy(unpackForecasts(zzz.list),GAS)["MdAPE",],lwd=2,col=4) 60 lines(modelAccuracy(unpackForecasts(ann.list),GAS)["MdAPE",],lwd=2,col=3) lines(modelAccuracy(unpackForecasts(ana.list),GAS)["MdAPE",],lwd=2,col=2) lines(modelAccuracy(unpackForecasts(aan.list),GAS)["MdAPE",],lwd=2,col=6) lines(modelAccuracy(unpackForecasts(aaa.list),GAS)["MdAPE",],lwd=2,col=5) legend(x="bottomright",legend=c("Z-Z-Z","A-N-N","A-N-A","A-Ad-N","A-A-A"), lty=1,lwd=2,col=c(4,3,2,6,5),bty="n") 80 > title("MdAPE versus forecast horizon for ETS models")
2012)
R/Finance 2012
78 / 90
MdAPE
10
15
40
60
ZZZ ANN ANA AAdN AAA 2 4
2012)
80
6 8 10
12
R/Finance 2012 79 / 90
STS local linear trend model STS Basic Structural Model (BSM)
R Code: Run 40 cross validations and plot results
> stsLevel.list <- cv.ts(x=GAS, FUN=stsForecast, tsControl=myControl, lambda=0, type="level") > stsTrend.list <- cv.ts(x=GAS, FUN=stsForecast, tsControl=myControl, lambda=0, type="trend") > stsBSM.list <- cv.ts(x=GAS, FUN=stsForecast, tsControl=myControl, 60 type="BSM") lambda=0, > > > > > plot(0,type="n",xlim=c(1,12),ylim=c(0,22),xlab="",ylab="MdAPE") lines(modelAccuracy(unpackForecasts(stsLevel.list),GAS)["MdAPE",],lwd=2,col=4) lines(modelAccuracy(unpackForecasts(stsTrend.list),GAS)["MdAPE",],lwd=2,col=3) lines(modelAccuracy(unpackForecasts(stsBSM.list),GAS)["MdAPE",],lwd=2,col=2) 80 legend(x="topleft",legend=c("sts-level","sts-trend","sts-BSM"),lty=1, lwd=2,col=c(4,3,2),bty="n") > title("MdAPE versus forecast horizon for STS models")
Zivot & Yollin (Copyright
2012)
R/Finance 2012
80 / 90
120
20
40
MdAPE 15 10
60
5
80
2 4
2012)
10
12
R/Finance 2012 81 / 90
120
80
2012)
R/Finance 2012
82 / 90
mm
0e+00 6e06 0.000 0.0060.000 0.006 1 2 3 4
40
60
GAS
80
100
120
stsBSM.coef.level
40
stsBSM.coef.slope
60
stsBSM.coef.seas
stsBSM.coef.epsilon
0.0e+00
80
2000
2012)
2005
Time Series with State Space Models
2010
R/Finance 2012 83 / 90
mm 40 60 80 { 100 120 > dlmSTSForecast <- function(x, h, lambda=NULL, ...) require(forecast) require(dlm) if( !is.null(lambda) ) x <- BoxCox(x=x, lambda=lambda) fit <- dlmStructTS(x, ...) 40 forecast(object=fit, data=x, h=h, lambda=lambda) } > dlmStructTS <- function(x, type = c("level", "trend", "BSM"), init = NULL, fixed = NULL, optim.control = NULL) { type <- if (missing(type)) "level" else match.arg(type) FUN <- switch(type, level = dlmLLM, trend = dlmLTM, BSM = dlmBSM) 60 do.call(FUN,args=list(x,init,fixed,optim.control)) } > dlmLLM <- function(x,init=NULL,fixed=NULL,optim.control=NULL) { if( is.null(init) ) init <- rep(log(0.1),2) buildFun <- function(theta) { dlmModPoly(1, dV = exp(theta[2]), dW = exp(theta[1] fit <- 80 dlmMLE(x, parm = init, build = buildFun) dlm.mod <- buildFun(fit$par) }
Zivot & Yollin (Copyright
2012)
R/Finance 2012
84 / 90
mm <- function(object,data,h=12,lambda=NULL) > forecast.dlm 40 60 80 100 { dlm.sm <- dlmSmooth(data, object) dlm.filt <- dlmFilter(data, object) dlm.for <- dlmForecast(dlm.filt, nAhead = h) hwidth <- qnorm(0.25, lower = FALSE) * sqrt(unlist(dlm.for$Q)) if( is.null(lambda ) ) { 40 fore <- dlm.for$f lower <- dlm.for$f - hwidth upper <- dlm.for$f + hwidth } else { fore <- InvBoxCox(dlm.for$f,lambda) lower60 <- InvBoxCox(dlm.for$f - hwidth,lambda) upper <- InvBoxCox(dlm.for$f + hwidth,lambda) data <- InvBoxCox(data,lambda) } ans <- structure(list(method = "dlm",model = object,level = 50, mean = fore,lower = as.matrix(lower),upper = as.matrix(upper), x = data,fitted = dlm.sm$s,residuals = residuals(dlm.filt, 80 sd = FALSE)), class = "forecast") return( ans ) }
Zivot & Yollin (Copyright
120
2012)
R/Finance 2012
85 / 90
Compare model results: DLM local level model, STS local level model, ETS damped trend model
40
R Code: Run cross validations and plot results
> dlmLL.list <- cv.ts(x=GAS, FUN=dlmSTSForecast, tsControl=myControl, lambda=0, type="level") > dlmLT.list <- cv.ts(x=GAS, FUN=dlmSTSForecast, tsControl=myControl, lambda=0, type="trend", init=rep(log(1e-4),3)) 60 > > > > > > plot(0,type="n",xlim=c(1,12),ylim=c(3,19),xlab="",ylab="MdAPE") grid() lines(modelAccuracy(unpackForecasts(dlmLL.list),GAS)["MdAPE",],lwd=2,col=4) lines(modelAccuracy(unpackForecasts(stsLevel.list),GAS)["MdAPE",],lwd=2,col=2) lines(modelAccuracy(unpackForecasts(aan.list),GAS)["MdAPE",],lwd=2,col=6) 80 legend(x="topleft",legend=c("DLM LL model","STS LL model","ETS A-Ad-N model"), lty=1,lwd=2,col=c(4,2,6),bty="n") > title("MdAPE versus forecast horizon for DLM models")
Zivot & Yollin (Copyright
2012)
R/Finance 2012
86 / 90
120
MdAPE
10 5
15
40
60
80
2
Zivot & Yollin (Copyright
4
2012)
10
12
R/Finance 2012 87 / 90
Outline
mm
1
40
60
80
100
120
Introduction to state space models and the dlm package DLM estimation and forecasting examples
40
Structural time series models and StructTS Exponential smoothing models and the forecast package
60
2012)
R/Finance 2012
88 / 90
Summary
mm 40 60 80 100 120
Time series cross validation can be used for model selection and out-of-sample forecast analysis
ets models 60 StructTS models dlm models
80
2012)
R/Finance 2012
89 / 90
40
http://depts.washington.edu/compfin
60
80
2012)
R/Finance 2012
90 / 90