Hidden Markov Independent Component Analysis in Financial Time Series

HIDDEN MARKOV INDEPENDENT COMPONENT ANALYSIS AS A
MEASURE OF COUPLING IN MULTIVARIATE FINANCIAL TIME SERIES
Nauman Shah, Stephen J. Roberts

Pattern Analysis and Machine Learning Group, Department of Engineering Science
Oxford-Man Institute of Quantitative Finance
University of Oxford
nauman@robots.ox.ac.uk, sjrob@robots.ox.ac.uk
ABSTRACT

Modelling the dynamics of financial markets has been an area of active research in recent years. This paper presents a time series analysis model which can be used to infer patterns within financial data, in order to better understand the dynamics of financial markets. The focus of the paper is on finding causal and time-scale relationships between financial time series. Wavelets are used to extract useful time-scale information from financial data at different frequencies, and mutual information between time series is used as the canonical measure of coupling. A Hidden Markov Independent Component Analysis (HMICA) model is used to infer a series of hidden states, and it is shown that these hidden states are indicative of changes in mutual information between time series at various time scales.

Keywords: ICA, Wavelets, Hidden Markov ICA, Financial Time Series Analysis

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page.

© 2008 The University of Liverpool

1 INTRODUCTION

One of the most prominent outcomes of research carried out in the field of financial engineering has been the development and rapid growth of algorithmic trading strategies. Due to the vast scale of the global financial markets and their constant evolution, research in this sector presents real challenges and opportunities. Algorithmic trading, also known as black-box or technical trading, is an automated trading platform which relies on complex mathematical and statistical algorithms to make online trading decisions. Since the introduction of electronic trading in 1971, the proportion of trades that can be attributed to algorithmic trading has steadily increased. Algorithmic trading now accounts for over 60% of all trades taking place on the London Stock Exchange [1]. Many of the algorithmic trading engines currently in use act as a black box to harness market inefficiencies in order to generate consistent positive returns. Until recently, algorithmic trading was usually done on an hourly basis. However, due to the easy and relatively cheap availability of high frequency market data, some of the latest algorithmic trading engines trade on a second by second or even tick by tick basis. The time series analysis model presented in this paper can potentially be used in the development of an algorithmic trading strategy.

2 TIME-FREQUENCY REPRESENTATION OF FINANCIAL TIME SERIES

There are a variety of methods currently in use to determine the time-frequency representation of a financial time series. The Fourier Transform, Empirical Mode Decomposition and Wavelet Analysis are all popular time-frequency analysis techniques. The method used in this paper is the Continuous Wavelet Transform, primarily because it allows for the inclusion of a prior distribution as its basis. Thus, prior knowledge can easily be incorporated into a trading model. Studies focusing on analysis of financial time series using a time-scale approach show some very promising results [5]. Therefore, there is potential for significant new developments in this field, especially considering the fact that almost all asset classes, including Foreign Exchange (FX) and Equities, exhibit time-scale behaviour.

2.1 Continuous Wavelet Transform

Wavelets are functions which can be used to represent a signal in a form which can be more easily analysed and comprehended. Wavelets allow for the localised analysis of signal components, which makes them especially interesting for use in dealing with FX data. The Continuous Wavelet Transform (CWT) is a powerful data analysis method which can be used to analyse the properties of a financial time series at different frequencies. The knowledge gained about the trends of a time series at different time scales can be used to develop trading models which take advantage of recurring patterns at various frequencies. The CWT of a function x(t) is given by:
$$T(u, b) = \frac{1}{\sqrt{u}} \int_{-\infty}^{\infty} x(t)\, \psi\!\left(\frac{t - b}{u}\right) dt \qquad (1)$$

where $u$ is the dilation parameter, also known as the scale, and $b$ is the localisation parameter. The function $\psi(t)$, from which different dilated and translated versions are derived, is called the mother wavelet.

The Morlet wavelet is used in all the analysis presented later in this paper. Morlet is a non-orthogonal wavelet which has both a real and an imaginary part. Such wavelets are also referred to as analytic wavelets. Because it is complex-valued, the Morlet wavelet can be used to separate the phase and amplitude parts of a signal. A Morlet wavelet is represented by:

$$\psi(t) = \pi^{-1/4} \exp(i 2 \pi f_0 t) \exp\!\left(-\frac{t^2}{2}\right) \qquad (2)$$

where $f_0$ is the central frequency of the wavelet.

3 MUTUAL INFORMATION

The Mutual Information (MI) of two signals is the canonical measure of coupling between the signals. For uncoupled signals MI is zero, whereas for coupled signals it has a positive value. The MI between two time varying signals, $x_1 = x_1(t)$ and $x_2 = x_2(t)$, is given by the Kullback-Leibler (KL) divergence [2]:

$$I[x_1, x_2] = KL\big(p(x_1, x_2) \,\|\, p(x_1)\, p(x_2)\big) \qquad (3)$$

The KL divergence between two dependent probability density functions, $p(x_1)$ and $p(x_2)$, is given by:

$$KL[p(x_1) \,\|\, p(x_2)] = \int_t p(x_1) \log \frac{p(x_1)}{p(x_2)}\, dt \qquad (4)$$

Therefore, the MI is represented by:

$$I[x_1, x_2] = -\iint p(x_1, x_2) \log \frac{p(x_1)\, p(x_2)}{p(x_1, x_2)}\, dx_1\, dx_2 \qquad (5)$$

4 THE WAVELET-MI ALGORITHM

This section presents the Wavelet-MI model, which uses the wavelet coefficients to calculate the mutual information between two financial time series at any given time scale. For $N$ time series, $x_t = [x_1(t), x_2(t), ..., x_N(t)]$, analysed using the CWT at scales of 1 to $u$, the wavelet coefficients can be combined into a single matrix, $C_t = [c_1(t), c_2(t), ..., c_u(t)]$. Thus, the set of signals, $x_t$, can be represented in terms of the wavelet coefficients, $C_t$, and the mother wavelet $\psi_t$:

$$x_t = C_t\, \psi_t \qquad (6)$$

The CWT of a signal is given by equation 1. Equation 5 presents the MI equation. These two equations can be combined as shown in equation 7. The Wavelet-MI algorithm computes the mutual information, $I[c_1, c_2]$, between the recovered wavelet coefficients, $c_1 = c_1(t)$, $c_2 = c_2(t)$, of a multivariate time series, $x_1 = x_1(t)$, $x_2 = x_2(t)$:

$$I[c_1, c_2] = -\iint p(c_1, c_2) \log \frac{p(c_1)\, p(c_2)}{p(c_1, c_2)}\, dc_1\, dc_2 \qquad (7)$$

Figure 1 shows the variation in MI across different time scales for various currency pairs. It is evident that MI, and thus coupling between two currency pairs, generally increases with the time scale.

[Figure 1 plot: Mutual Information (vertical axis, 0 to 0.35) against scale in seconds (horizontal axis, 0 to 30) for EURUSD-GBPUSD, USDCHF-EURCHF and USDJPY-EURJPY.]

Figure 1: Mutual Information of three currency pairs, EURUSD-GBPUSD, USDCHF-EURCHF and USDJPY-EURJPY, at various time scales

In this paper, the method used for computing MI is based on a data-interpolation technique known as Parzen-window density estimation, as discussed in [4].

5 HIDDEN MARKOV INDEPENDENT COMPONENT ANALYSIS

5.1 Markov Model

A Markov process is a statistical process in which future probabilities are determined by only its most recent values. Using the product rule, the joint probability of a variable $x$ can be written as [2]:

$$p(x_1, ..., x_N) = \prod_{n=1}^{N} p(x_n \mid x_1, ..., x_{n-1}) \qquad (8)$$

A first-order Markov model assumes that all the conditional probabilities of equation 8 are dependent on only the most recent observation and independent of all others. Thus, a first-order Markov model can be represented by:

$$p(x_1, ..., x_N) = p(x_1) \prod_{n=2}^{N} p(x_n \mid x_{n-1}) \qquad (9)$$

which is equivalent to requiring that each conditional distribution satisfies:

$$p(x_n \mid x_1, ..., x_{n-1}) = p(x_n \mid x_{n-1}) \qquad (10)$$

which is a compact statement of the first-order Markov assumption.
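As an illustration of equation 9, the following minimal sketch samples a two-state first-order Markov chain and evaluates the log of the factorised joint probability. The initial distribution and transition matrix are illustrative values chosen here, not taken from the paper, and the function names `sample_chain` and `log_joint` are hypothetical.

```python
import math
import random


def sample_chain(initial, transition, length, rng):
    """Draw a state sequence of the given length (>= 1) from a first-order Markov chain."""
    states = list(range(len(initial)))
    # The first state is drawn from the initial distribution p(x_1).
    sequence = rng.choices(states, weights=initial)
    while len(sequence) < length:
        previous = sequence[-1]
        # Each later state depends only on the state immediately before it.
        sequence += rng.choices(states, weights=transition[previous])
    return sequence


def log_joint(sequence, initial, transition):
    """Log of equation 9: log p(x_1) plus the sum of log p(x_n | x_{n-1}) for n >= 2."""
    total = math.log(initial[sequence[0]])
    for previous, current in zip(sequence, sequence[1:]):
        total += math.log(transition[previous][current])
    return total


# Illustrative parameters (assumed, not from the paper).
initial = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.2, 0.8]]

rng = random.Random(0)
chain = sample_chain(initial, transition, 20, rng)
print(chain, log_joint(chain, initial, transition))
```

Working in the log domain avoids numerical underflow when the product in equation 9 runs over long sequences.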
5.2 Hidden Markov Model

A Hidden Markov Model (HMM) is a statistical model consisting of a set of observations which are produced by an unobservable set of latent Markov model states. It is widely used within the speech recognition sector. Due to its numerous advantages in inferring the hidden states of a dynamic system, it is increasingly being used in the financial sector as well. The aim of using an HMM is to infer the hidden states from a set of observations. Mathematically, the model can be represented by [2]:

$$p(X, Z \mid \theta) = p(z_1 \mid \pi) \left[ \prod_{n=2}^{N} p(z_n \mid z_{n-1}, A) \right] \prod_{m=1}^{N} p(x_m \mid z_m, B) \qquad (11)$$

where $X = \{x_1, ..., x_N\}$ is the observation set, $Z = \{z_1, ..., z_N\}$ is the set of latent variables, and $\theta = \{\pi, A, B\}$ represents the set of parameters governing the model. The HMM is represented using Markov chains in Figure 2, showing the hidden layer with states $z_t$ and the observed layer with observation set $x_t$. Also shown in the figure are the state transition probability $P(z(t+1) \mid z(t))$ and the emission model probability $P(x(t) \mid z(t))$.

[Figure 2 diagram: a hidden layer with states z(t) and z(t+1) linked by the state transition probability P(z(t+1)|z(t)), and an observed layer with x(t) and x(t+1), each linked to its hidden state by the emission model probability P(x(t)|z(t)).]

Figure 2: Hidden Markov Model graphical representation

5.3 Hidden Markov ICA Model

Independent Component Analysis (ICA) is a form of blind source separation model. The Hidden Markov ICA (HMICA) model is a Hidden Markov model with an ICA observation model. This section presents an overview of the HMICA model, using the set of equations detailed in [6].

Let $\theta = \{\pi, A, B\}$ represent the parameters of an HMM, where $B$ represents the parameters of the ICA observation model, $A$ is the state transition matrix with entries $a_{ij}$, and $\pi$ represents the initial state probability vector. An HMM parameterised by some vector $\hat{\theta} = \{\hat{\pi}, \hat{A}, \hat{B}\}$ can be trained using an Expectation-Maximisation (EM) algorithm as shown in [3].

Using the analysis shown in [6], it can be proved that for an observation sequence $x_t$ and hidden state sequence $z_t$, the observation model parameters, $B$, can be written in terms of an auxiliary function $Q$, given by:

$$Q(B, \hat{B}) = \sum_{k} \sum_{t} \gamma_k[t] \log p_{\hat{\theta}}(x_t \mid z_t) \qquad (12)$$

where $\gamma_k[t]$ is the probability of being in state $k$ at time $t$. The log likelihood of the ICA observation model, with unmixing matrix $W$ and $M$ sources, can be written as [6]:

$$\log p(x_t) = \log |\det(W)| + \sum_{i=1}^{M} \log p(a_i[t]) \qquad (13)$$

where $a_i[t]$ denotes the $i$-th recovered source at time $t$. Substituting the ICA log likelihood, equation 13, into the HMM auxiliary function, equation 12, gives:

$$Q_k = \log |\det(W_k)| + \frac{1}{\gamma_k} \sum_{t} \sum_{i} \gamma_k[t] \log p(a_i[t]) \qquad (14)$$

The auxiliary function summed over all states $k$ becomes:

$$Q = \sum_{k} Q_k \qquad (15)$$

The HMICA model finds the unmixing matrix $W_k$ for state $k$ by minimizing the cost function given by equation 15 over all underlying parameters.

6 RESULTS

This section presents the results obtained when the Wavelet-MI model presented in section 4 and the HMICA model presented in section 5 are simulated in Matlab. Figure 3 presents the Viterbi diagrams and the mutual information plots obtained by using FX data at various time scales.

From the plots it is evident that there are significantly long periods of state stability. There is also some evidence of recurring patterns, which can prove extremely useful in building a trading strategy. The HMICA code also gives the state transition matrix as an output. The state transition matrix gives the probability of a change of state from state $i$ to state $j$, i.e.:

$$P_{i,j} = P(z(t+1) = j \mid z(t) = i) \qquad (16)$$

The state transition probability matrix, $P_{i,j}$, can also be written as:

$$P_{i,j} = \begin{pmatrix} p(0 \mid 0) & p(1 \mid 0) \\ p(0 \mid 1) & p(1 \mid 1) \end{pmatrix} \qquad (17)$$

where $p_{j,i} = p(j \mid i)$ is the transition probability from state $i$ to state $j$.

The state transition probability matrices, $P_{scale}$, for the USDJPY-EURJPY pair at various time scales are given below. The Mutual Information diagrams and Viterbi plots for the USDJPY-EURJPY currency pair at various scales are presented in Figure 3.

$$P_{5} = \begin{pmatrix} 0.9756 & 0.0244 \\ 0.0207 & 0.9793 \end{pmatrix} \qquad (18)$$

$$P_{7.5} = \begin{pmatrix} 0.9761 & 0.0239 \\ 0.0162 & 0.9838 \end{pmatrix} \qquad (19)$$

$$P_{10} = \begin{pmatrix} 0.9860 & 0.0140 \\ 0.0479 & 0.9521 \end{pmatrix} \qquad (20)$$
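For a time-homogeneous Markov chain, the number of consecutive samples spent in state i is geometrically distributed with mean 1 / (1 - p(i | i)), so the diagonal entries of a transition matrix give a rough guide to how long a state persists on average. The sketch below applies this to the values of equation 18, assuming rows index the current state as in equation 17. The function name `expected_dwell` is hypothetical, and the calculation is offered as a reading aid rather than as part of the paper's analysis.

```python
def expected_dwell(transition):
    """Mean number of consecutive samples spent in each state.

    Assumes transition[i][j] = P(z(t+1) = j | z(t) = i), so the dwell time in
    state i is geometric with mean 1 / (1 - transition[i][i]).
    """
    return [1.0 / (1.0 - row[i]) for i, row in enumerate(transition)]


# Values from equation 18 (USDJPY-EURJPY, scale = 5 seconds).
P5 = [[0.9756, 0.0244],
      [0.0207, 0.9793]]

for state, samples in enumerate(expected_dwell(P5)):
    print(f"state {state}: mean dwell of about {samples:.1f} samples")
```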
[Figure 3 plots: four panels for USDJPY-EURJPY, (a) scale = 5 seconds, (b) scale = 7.5 seconds, (c) scale = 10 seconds, (d) scale = 12.5 seconds. Each panel shows the Viterbi-decoded state (0 or 1) and the MI against time, with the time axis running from 0 to 4000.]

Figure 3: Viterbi diagrams showing state transitions in the Hidden layer for USDJPY-EURJPY at different time scales. Also shown are the Mutual Information (MI) plots of the currency pairs.
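The state sequences in Figure 3 are obtained by Viterbi decoding, which finds the most probable hidden state path given the observations. The following is a minimal log-domain sketch for a generic discrete-state HMM. It takes precomputed per-sample emission log-likelihoods as input, so the ICA observation model of section 5.3 is not implemented here, and the toy numbers are illustrative assumptions rather than values from the paper.

```python
import math


def viterbi(log_initial, log_transition, log_emission):
    """Most probable state path for an HMM, computed in the log domain.

    log_initial[k]       : log p(z_1 = k)
    log_transition[i][j] : log p(z_t = j | z_{t-1} = i)
    log_emission[t][k]   : log p(x_t | z_t = k), supplied by the caller
    """
    n_states = len(log_initial)
    # Best log score of any path ending in each state at the first sample.
    score = [log_initial[k] + log_emission[0][k] for k in range(n_states)]
    backpointers = []
    for emission in log_emission[1:]:
        new_score, pointers = [], []
        for j in range(n_states):
            # Best predecessor state for a path ending in state j.
            best = max(range(n_states),
                       key=lambda i: score[i] + log_transition[i][j])
            pointers.append(best)
            new_score.append(score[best] + log_transition[best][j] + emission[j])
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Toy two-state example (illustrative values, not from the paper).
log_initial = [math.log(0.5), math.log(0.5)]
log_transition = [[math.log(0.98), math.log(0.02)],
                  [math.log(0.02), math.log(0.98)]]
# Emission likelihoods favour state 0 for three samples, then state 1.
likelihoods = [[0.9, 0.1]] * 3 + [[0.1, 0.9]] * 3
log_emission = [[math.log(p) for p in row] for row in likelihoods]

print(viterbi(log_initial, log_transition, log_emission))
```

With a strongly diagonal transition matrix, as in the matrices reported in this section, the decoder only switches state when the emission evidence outweighs the small transition probability, which is consistent with the long stable runs visible in the figure.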
At a scale of 12.5 seconds, the state transition probability matrix for the USDJPY-EURJPY pair is:

$$P_{12.5} = \begin{pmatrix} 0.9915 & 0.0085 \\ 0.0273 & 0.9727 \end{pmatrix} \qquad (21)$$

It is interesting to note that, for significant portions of time, the state stays constant for over 100 samples (50 seconds). These periods of state stability are hence well-suited for placing a trade order. The state transition probability matrix, $P_{i,j}$, can be used to make predictions about future states. Simulations conducted with Equities data using the models presented in this paper also give encouraging results.

7 CONCLUSIONS

This paper presents a statistical model for analysing the dynamics of multivariate financial time series. The CWT is presented as a useful tool for the analysis of financial data sets at various frequencies. HMICA is used to extract the hidden states from multivariate financial time series. The hidden states stay constant for significant periods of time, which is potentially useful for building efficient trading models. It is also shown that the hidden states are indicative of changes in mutual information between two FX returns time series.

ACKNOWLEDGEMENTS

The authors are grateful to the Oxford-Man Institute of Quantitative Finance for their support. The first author would also like to thank Exeter College (Oxford) for funding this research.

References

[1] International Banking Systems Journal/Supplements/Trading Platforms Supplement. International Banking Systems Journal, June 2007.
[2] C.M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
[3] A.P. Dempster, N.M. Laird, and D.B. Rubin. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1):1–38, 1977.
[4] H. Peng, F. Long, and C. Ding. Feature Selection Based on Mutual Information: Criteria of Max-Dependency, Max-Relevance, and Min-Redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(8):1226–1238, 2005.
[5] P. Oswiecimka, J. Kwapien, S. Drozdz, and R. Rak. Investigating Multifractality of Stock Market Fluctuations Using Wavelet and Detrending Fluctuation Methods. Acta Physica Polonica B, 36(8):2447, 2005.
[6] W. Penny, R. Everson, and S.J. Roberts. Hidden Markov Independent Components Analysis. Advances in Independent Component Analysis. Springer, pages 3–22, 2000.
