
JOURNAL OF COMPUTING, VOLUME 2, ISSUE 10, OCTOBER 2010, ISSN 2151-9617
WWW.JOURNALOFCOMPUTING.ORG, HTTPS://SITES.GOOGLE.COM/SITE/JOURNALOFCOMPUTING/

Algorithms of Hidden Markov Model and a Prediction Method on Product Preferences

Ersoy ÖZ

Abstract—Markov chains are stochastic processes in which knowledge of the present state uniquely determines the future stochastic behaviour, and this behaviour does not depend on the past of the process. Markov chains are widely used in areas such as finance, education, production, marketing and brand loyalty. A Hidden Markov Model is a stochastic process which is formed by adding some properties to a Markov chain. Applications are developed by using the solution algorithms of the three main problems of the Hidden Markov Model. This study is an application about product preferences and the reasons for these preferences, using a Hidden Markov Model based on Markov chains. The aim of this study is to develop an estimation method for product preferences. The toolbox within the Matlab software is used for the numerical solutions of the model handled in the application.

Index Terms—Marketing, Markov processes, Probabilistic algorithms, Stochastic processes.

——————————  ——————————

1 INTRODUCTION

A Hidden Markov Model (HMM) is a stochastic process generated by two interrelated probabilistic mechanisms. At discrete instants of time, the process is assumed to be in some state, and an observation is generated by the random function corresponding to the current state. The underlying Markov chain then changes states according to its probability matrix. The observer sees only the output of the random functions associated with each state and cannot directly observe the states of the underlying Markov chain. This makes the Markov chain hidden [24].

In an HMM, the states of the Markov chain are said to be hidden and to emit observation symbols. There are three classical problems that need to be solved for the HMM to be useful in applications [4].

HMMs were first studied in the 1940s, but they could not be widely used in applications. The HMM theory was first developed by Baum and Petrie [13], Baum and Eagon [12], Petrie [23] and Baum [11]. In the last 20 years, the application area of HMMs has spread widely because of the noticeable developments in computer systems. The basic focus points of these areas are the estimation of the state space, the algorithms that make the models usable, and methods based on similarity. In recent years, statistical inference studies have been carried out by Leroux [3], Bickel and Ritov [17], Bickel, Ritov and Ryden [18] and Fuh [6].
Today, due to the rapid development of information technology, notebook computers are developing in many ways in parallel with this process. Therefore, the preferred notebook computer brand of consumers changes over time. People will be more conscious and have more options about technological innovations compared to the previous period. At this point, knowing which brand the consumers will prefer and what could be the reason behind this preference is very important.

The Markov chain theory is widely used in marketing and in determining the short or long term preference probabilities of brands. In the Markov chain theory, while talking about the probability that a client who is currently using brand A will choose brand A again, the reason why the client prefers brand A again cannot be known. Markov chain theory can be used to find the optimal solution if the preference reason is not important to the researcher [22]. However, the Markov chain structure is insufficient when the preference reasons are questioned. Therefore, HMM theory can be used to find the preference reasons. Since Markov chain theory falls short when the preference reasons are taken into consideration, the HMM, which scrutinizes the preference reasons through its inner dynamics, constitutes the main frame of this study.
————————————————
• Ersoy ÖZ, Department of Technical Programs, Yildiz Technical University, Turkey.
2 HIDDEN MARKOV MODELS

2.1 Markov Chain Models
Consider a system that may be described at any time as being in one of a set of N distinct states indexed by 1, 2, …, N. At regularly spaced, discrete times, the system undergoes a change of state (possibly back to the same state) according to a set of probabilities associated with the state. We denote the time instants associated with state changes as t = 1, 2, …, and we denote the actual state at time t as q_t. A full probabilistic description of the above system would, in general, require specification of the current state (at time t) as well as all the predecessor states. For the special case of a discrete-time, first-order Markov chain, the probabilistic dependence is truncated to just the preceding state:

P[q_{t+1} = j \mid q_t = i, q_{t-1} = k, \ldots] = P[q_{t+1} = j \mid q_t = i].  (1)

Furthermore, we consider only those processes in which the right-hand side of (1) is independent of time, thereby leading to the set of state transition probabilities a_ij of the form

a_{ij} = P[q_{t+1} = j \mid q_t = i], \quad 1 \le i, j \le N,  (2)

with the following properties,

a_{ij} \ge 0, \quad \forall\, i, j,  (3)

\sum_{j=1}^{N} a_{ij} = 1, \quad \forall\, i,  (4)

since they obey standard stochastic constraints.

The above stochastic process could be called an observable Markov model because the output of the process is the set of states at each instant of time, where each state corresponds to an observable event [15].
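As a brief illustration of (1)-(4) (not part of the original paper), the following minimal Python sketch simulates an observable Markov chain; the three states and the values in the transition matrix A are invented for illustration and only need to satisfy the row-stochastic constraints (3) and (4).

import numpy as np

# Hypothetical 3-state Markov chain; each row of A satisfies
# a_ij >= 0 and sum_j a_ij = 1, as required by (3) and (4).
A = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.3, 0.3, 0.4]])

rng = np.random.default_rng(0)

def simulate_chain(A, start_state, T):
    """Generate a state sequence q_1, ..., q_T in which the next state
    depends only on the current state, as in (1)."""
    states = [start_state]
    for _ in range(T - 1):
        states.append(rng.choice(len(A), p=A[states[-1]]))
    return states

print(simulate_chain(A, start_state=0, T=10))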
2.2 Definition of the Hidden Markov Models
The HMM is a very powerful statistical method of characterizing the observed data samples of a discrete time series [25].

In the Markov chain, each state corresponds to a deterministically observable event; i.e., the output of such sources in any given state is not random. A natural extension to the Markov chain introduces a non-deterministic process that generates output observation symbols in any given state. Thus, the observation is a probabilistic function of the state. This new model is known as an HMM, which can be viewed as a doubly embedded stochastic process with an underlying stochastic process (the state sequence) that is not directly observable. This underlying process can only be probabilistically associated with another observable stochastic process producing the sequence of features we can observe.

An HMM is defined by the following five elements:

i. N, the number of states in the model. Although the states are hidden, for many practical applications there is often some physical significance attached to the states or to sets of states of the model. Generally the states are interconnected in such a way that any state can be reached from any other state (e.g., an ergodic model). We denote the individual states as S = {S_1, S_2, …, S_N}, and the state at time t as q_t.

ii. M, the number of distinct observation symbols per state, i.e., the discrete alphabet size. The observation symbols correspond to the physical output of the system being modeled. We denote the individual symbols as V = {v_1, v_2, …, v_M}. Observations depend only on the state of the system at the time of the observation; therefore they are independent of the previous observation [2].

iii. A = {a_ij}, the state transition probability distribution, whose dimension is N × N. For

a_{ij} = P[q_{t+1} = S_j \mid q_t = S_i], \quad 1 \le i, j \le N,  (5)

the two conditions given below are valid for a_ij:

a_{ij} \ge 0, \quad i, j = 1, 2, \ldots, N,  (6)

\sum_{j=1}^{N} a_{ij} = 1, \quad i = 1, 2, \ldots, N.  (7)

State transition probabilities do not change over time, and these probabilities are independent of the observations [19].

iv. The observation symbol probability distribution in state j, B = {b_j(k)}, where

b_j(k) = P[v_k \text{ at } t \mid q_t = S_j], \quad 1 \le j \le N, \ 1 \le k \le M.  (8)

The dimension of B is N × M and it satisfies the two conditions below:

b_j(k) \ge 0, \quad 1 \le j \le N, \ 1 \le k \le M,  (9)

\sum_{k=1}^{M} b_j(k) = 1, \quad 1 \le j \le N.  (10)

v. The initial state distribution π = {π_i}, where

\pi_i = P[q_1 = S_i], \quad 1 \le i \le N.  (11)

Given appropriate values of N, M, A, B and π, the HMM can be used as a generator to give an observation sequence O = O_1 O_2 … O_T, where each observation O_t is one of the symbols from {v_1, v_2, …, v_M} and T is the number of observations in the sequence [14].

It can be seen from the above discussion that a complete specification of an HMM requires specification of two model parameters (N and M), specification of the observation symbols, and specification of the three probability measures A, B and π. For convenience, we use the compact notation λ = (A, B, π) to indicate the complete parameter set of the model.

HMMs are usually restricted to models having states and measurements in a discrete set and in discrete time [20]. HMMs may be classified as discrete or continuous depending on the observations [10]. When the observations in the HMM are discrete, a discrete HMM is formed, but this model is usually referred to simply as an HMM. In situations where the observation series is not discrete, an HMM with continuous observation density, or shortly a continuous HMM, is formed.
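The following sketch (illustrative only; N, M and all probability values are assumed, not taken from the paper) builds a small discrete HMM λ = (A, B, π) and uses it as a generator of an observation sequence O = O_1 … O_T, mirroring the five elements listed above.

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical discrete HMM with N = 2 hidden states and M = 3 symbols.
pi = np.array([0.5, 0.5])             # initial state distribution, (11)
A  = np.array([[0.7, 0.3],            # transition matrix, N x N, (5)-(7)
               [0.4, 0.6]])
B  = np.array([[0.5, 0.4, 0.1],       # observation probabilities, N x M, (8)-(10)
               [0.1, 0.3, 0.6]])

def generate(pi, A, B, T):
    """Use the HMM as a generator: draw a hidden state path q and an
    observation sequence O = O_1 ... O_T."""
    q = [rng.choice(len(pi), p=pi)]
    O = [rng.choice(B.shape[1], p=B[q[0]])]
    for _ in range(T - 1):
        q.append(rng.choice(len(pi), p=A[q[-1]]))
        O.append(rng.choice(B.shape[1], p=B[q[-1]]))
    return q, O

states, obs = generate(pi, A, B, T=8)
print("hidden states:", states)
print("observations :", obs)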
3 ALGORITHMS OF HIDDEN MARKOV MODELS

Given the form of the HMM of Section 2, there are three basic problems of interest that must be solved for the model to be useful in real-world applications. These problems are the following [14]:

Problem 1: Given the observation sequence O = O_1 O_2 … O_T and a model λ = (A, B, π), how do we efficiently compute P(O|λ), the probability of the observation sequence, given the model?

Problem 1 can be seen as one of scoring how well a given model matches a given observation sequence, i.e., the solution to this problem would give us a tool to choose between competing models.

Problem 2: Given the observation sequence O = O_1 O_2 … O_T and the model λ, how do we choose a corresponding state sequence Q = q_1 q_2 … q_T which is optimal in some meaningful sense (i.e., best “explains” the observations)?
The second problem is about unveiling the hidden part of the model, that is, finding the correct state sequence. State sequences are determined according to the number of states in the model. The Viterbi algorithm is used to solve the second problem. The Viterbi algorithm is an application of dynamic programming; it finds the most likely state transition sequence in a state diagram, given a sequence of symbols [9].

Problem 3: How do we adjust the model parameters λ = (A, B, π) to maximize P(O|λ)?

Problem 3 is the one in which we try to optimise the model parameters so as to best describe how a given observation sequence comes about. The observation sequence used to adjust the model parameters is called a training sequence [16].
3.1 Solution to Problem 1
The forward-backward algorithms are used to calculate P(O|λ). The forward variable α_t(i) is defined as

\alpha_t(i) = P(O_1 O_2 \cdots O_t,\; q_t = S_i \mid \lambda),  (12)

i.e., the probability of the partial observation sequence O_1 O_2 … O_t (until time t) and state S_i at time t, given the model λ. We can solve for α_t(i) inductively, as follows [7]:

1) Initialization:
\alpha_1(i) = P(O_1, q_1 = S_i \mid \lambda) = \pi_i b_i(O_1), \quad 1 \le i \le N.  (13)

2) Induction:
\alpha_{t+1}(j) = \Big[ \sum_{i=1}^{N} \alpha_t(i)\, a_{ij} \Big] b_j(O_{t+1}), \quad 1 \le t \le T-1, \ 1 \le j \le N.  (14)

3) Termination:
P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i).  (15)

The backward variable β_t(i) is defined as

\beta_t(i) = P(O_{t+1} O_{t+2} \cdots O_T \mid q_t = S_i, \lambda),  (16)

the probability of the partial observation sequence from t+1 to the end, given state S_i at time t and the model λ. We can solve for β_t(i) inductively, as follows [5]:

1) Initialization:
\beta_T(i) = 1, \quad 1 \le i \le N.  (17)

2) Induction:
\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j), \quad t = T-1, T-2, \ldots, 1, \ 1 \le i \le N.  (18)
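A minimal NumPy sketch of the forward recursion (13)-(15) and the backward recursion (16)-(18) is given below; it is not the Matlab toolbox routine used later in the paper, and the toy parameters at the bottom are assumed for illustration.

import numpy as np

def forward(pi, A, B, O):
    """Forward pass, equations (13)-(15): returns alpha (T x N) and P(O|lambda)."""
    T, N = len(O), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, O[0]]                           # (13)
    for t in range(T - 1):
        alpha[t + 1] = (alpha[t] @ A) * B[:, O[t + 1]]   # (14)
    return alpha, alpha[-1].sum()                        # (15)

def backward(A, B, O):
    """Backward pass, equations (16)-(18): returns beta (T x N)."""
    T, N = len(O), A.shape[0]
    beta = np.zeros((T, N))
    beta[-1] = 1.0                                       # (17)
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])     # (18)
    return beta

# Toy example with the hypothetical parameters from the earlier sketch.
pi = np.array([0.5, 0.5])
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
O  = [0, 2, 1, 0]

alpha, prob = forward(pi, A, B, O)
beta = backward(A, B, O)
print("P(O|lambda) =", prob)

For longer observation sequences the products underflow numerically, so in practice a scaled or log-space version of these recursions would be used.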
3.2 Solution to Problem 2
A formal technique for finding this single best state sequence exists, based on dynamic programming methods, and is called the Viterbi algorithm.

Viterbi Algorithm [1], [8]: To find the single best state sequence Q = {q_1 q_2 … q_T} for the given observation sequence O = {O_1 O_2 … O_T}, we need to define the quantity

\delta_t(i) = \max_{q_1, q_2, \ldots, q_{t-1}} P[q_1 q_2 \cdots q_t = i,\; O_1 O_2 \cdots O_t \mid \lambda].  (19)

δ_t(i) is the best score (highest probability) along a single path, at time t, which accounts for the first t observations and ends in state S_i. By induction we have

\delta_{t+1}(j) = \big[ \max_i \delta_t(i)\, a_{ij} \big]\, b_j(O_{t+1}).  (20)

To actually retrieve the state sequence, we need to keep track of the argument which maximized (20), for each t and j. We do this via the array ψ_t(j). The complete procedure for finding the best state sequence can now be stated as follows:

1) Initialization:
\delta_1(i) = \pi_i b_i(O_1), \quad 1 \le i \le N,  (21a)
\psi_1(i) = 0.  (21b)

2) Recursion:
\delta_t(j) = \max_{1 \le i \le N} [\delta_{t-1}(i)\, a_{ij}]\, b_j(O_t), \quad 2 \le t \le T, \ 1 \le j \le N,  (22a)
\psi_t(j) = \arg\max_{1 \le i \le N} [\delta_{t-1}(i)\, a_{ij}], \quad 2 \le t \le T, \ 1 \le j \le N.  (22b)

3) Termination:
P^* = \max_{1 \le i \le N} [\delta_T(i)],  (23a)
q_T^* = \arg\max_{1 \le i \le N} [\delta_T(i)].  (23b)

4) Path (state sequence) backtracking:
q_t^* = \psi_{t+1}(q_{t+1}^*), \quad t = T-1, T-2, \ldots, 1.  (24)
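The steps (21a)-(24) can be sketched in a few lines of NumPy as follows; this is an illustrative implementation, not the code used in the paper, and the toy parameters are again assumed.

import numpy as np

def viterbi(pi, A, B, O):
    """Viterbi algorithm, equations (21)-(24): returns the best state path
    and its probability P* for a discrete HMM lambda = (A, B, pi)."""
    T, N = len(O), len(pi)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, O[0]]                       # (21a); psi_1(i) = 0, (21b)
    for t in range(1, T):
        trans = delta[t - 1][:, None] * A            # delta_{t-1}(i) * a_ij
        psi[t] = trans.argmax(axis=0)                # (22b)
        delta[t] = trans.max(axis=0) * B[:, O[t]]    # (22a)
    path = [int(delta[-1].argmax())]                 # q_T*, (23b)
    for t in range(T - 1, 0, -1):                    # backtracking, (24)
        path.append(int(psi[t][path[-1]]))
    return path[::-1], delta[-1].max()               # P*, (23a)

# Toy example (hypothetical parameters, as before).
pi = np.array([0.5, 0.5])
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
print(viterbi(pi, A, B, [0, 2, 1, 0]))

For numerical stability the same recursion is usually run with logarithms, replacing products by sums.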
3.3 Solution to Problem 3
We can, however, choose λ = (A, B, π) such that P(O|λ) is locally maximized using an iterative procedure such as the Baum-Welch method. The Baum-Welch algorithm, which is said to be the training of the HMM, is an iterative likelihood maximization method based on the probabilities belonging to the forward and backward variables [21].

Baum-Welch Algorithm [14]: In order to describe the procedure for reestimation (iterative update and improvement) of HMM parameters, we first define ξ_t(i, j), the probability of being in state S_i at time t and state S_j at time t+1, given the model and the observation sequence, i.e.,

\xi_t(i, j) = P(q_t = S_i, q_{t+1} = S_j \mid O, \lambda).  (25)

It should be clear, from the definitions of the forward and backward variables, that we can write ξ_t(i, j) in the form

\xi_t(i, j) = \frac{\alpha_t(i)\, a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)}{P(O \mid \lambda)}
            = \frac{\alpha_t(i)\, a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)}{\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_t(i)\, a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)},  (26)

where the numerator term is just P(q_t = S_i, q_{t+1} = S_j, O | λ) and the division by P(O|λ) gives the desired probability measure.

We define γ_t(i) as the probability of being in state S_i at time t, given the observation sequence and the model; hence we can relate γ_t(i) to ξ_t(i, j) by summing over j, giving

\gamma_t(i) = \sum_{j=1}^{N} \xi_t(i, j).  (27)

If we sum γ_t(i) over the time index t, we get a quantity which can be interpreted as the expected (over time) number of times that state S_i is visited or, equivalently, the expected number of transitions made from state S_i (if we exclude the time slot t = T from the summation). Similarly, summation of ξ_t(i, j) over t (from t = 1 to t = T-1) can be interpreted as the expected number of transitions from state S_i to state S_j. That is,

\sum_{t=1}^{T-1} \gamma_t(i) = \text{expected number of transitions from } S_i,  (28a)

\sum_{t=1}^{T-1} \xi_t(i, j) = \text{expected number of transitions from } S_i \text{ to } S_j.  (28b)

Using the above formulas (and the concept of counting event occurrences) we can give a method for reestimation of the parameters of an HMM. A set of reasonable reestimation formulas for π, A and B is

\bar{\pi}_i = \text{expected frequency (number of times) in state } S_i \text{ at time } t = 1 = \gamma_1(i),  (29a)

\bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)},  (29b)

\bar{b}_j(k) = \frac{\sum_{t=1,\, O_t = v_k}^{T} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}.  (29c)

If we define the current model as λ = (A, B, π) and use it to compute the right-hand sides of (29a)-(29c), we obtain the reestimated model λ̄ = (Ā, B̄, π̄). Based on the above procedure, if we iteratively use λ̄ in place of λ and repeat the reestimation calculation, we can improve the probability of O being observed from the model until some limiting point is reached. The final result of this reestimation procedure is called a maximum likelihood estimate of the HMM. It should be pointed out that the forward-backward algorithm leads to local maxima only, and that in most problems of interest the optimization surface is very complex and has many local maxima.

An important aspect of the reestimation procedure is that the stochastic constraints of the HMM parameters, namely

\sum_{i=1}^{N} \bar{\pi}_i = 1,  (30a)

\sum_{j=1}^{N} \bar{a}_{ij} = 1, \quad 1 \le i \le N,  (30b)

\sum_{k=1}^{M} \bar{b}_j(k) = 1, \quad 1 \le j \le N,  (30c)

are automatically satisfied at each iteration.
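The following sketch shows one Baum-Welch reestimation step built from (26), (27) and (29a)-(29c); it reuses the forward and backward functions from the earlier sketch and is only an illustration of the update, not the training routine used in the paper.

import numpy as np

def baum_welch_step(pi, A, B, O):
    """One Baum-Welch reestimation step, equations (26)-(29c)."""
    T, N, M = len(O), len(pi), B.shape[1]
    alpha, prob = forward(pi, A, B, O)
    beta = backward(A, B, O)

    # xi_t(i, j), equation (26), for t = 1 .. T-1
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * B[:, O[t + 1]] * beta[t + 1]
        xi[t] /= prob
    gamma = xi.sum(axis=2)                     # gamma_t(i) = sum_j xi_t(i, j), (27)
    gamma_T = alpha[-1] * beta[-1] / prob      # gamma at t = T, needed for (29c)
    gamma_full = np.vstack([gamma, gamma_T])

    new_pi = gamma[0]                                        # (29a)
    new_A = xi.sum(axis=0) / gamma.sum(axis=0)[:, None]      # (29b)
    new_B = np.zeros_like(B)
    for k in range(M):                                       # (29c)
        mask = (np.array(O) == k)
        new_B[:, k] = gamma_full[mask].sum(axis=0) / gamma_full.sum(axis=0)
    return new_pi, new_A, new_B

# Example: one update starting from the toy parameters of the earlier sketches.
pi = np.array([0.5, 0.5])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
O = [0, 2, 1, 0]
print(baum_welch_step(pi, A, B, O))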
4 APPLICATION

In this study, the goal is to estimate which brand the notebook computer users in Turkey will prefer and the reason behind the preference of any brand, thus developing a new application area for HMM. The data used in determining the research parameters were gathered with the survey technique. In preparing the survey used as the research tool, appropriate questions were selected from the brand loyalty and brand preference surveys that are used frequently in the literature.

In the preparation of the survey questions, the information gathered from the literature review and the opinions and suggestions of the authorities in the leading companies of the industry were taken into consideration. There are four questions in the survey form. The first question asks for the currently used notebook computer brand, and the second asks for the reason behind the preference of that brand. The third question asks for the notebook computer brand that will be used next, and the fourth asks for the reason behind the preference of the brand that will be used next.

Estimation of the Model Parameters
At a given time {t}, the reason behind preferring any notebook computer brand is taken as the “state” of the HMM {q_t = D_i}. When any notebook computer brand is preferred, the brand of the computer {V} is known only by the person who uses the notebook computer or by the buyer, but the reason behind preferring that brand is not known by others. Therefore, the reasons of preference are “hidden”, since they are not known by anyone other than the buyer. Estimation of the unknown reasons of preference is very critical for the firms that produce notebook computers and for the leading companies in the notebook computer market.
The transition probabilities matrix is formed by using the reasons behind the current and the next notebook computer brand preferences since, in the Markov chain theory, the state of the system {q_{t+1}} in the next step {t+1} depends only on the current state {q_t} and is independent of the past. The transition probabilities matrix contains the transition probabilities of the hidden states, whose rows and columns are represented as S = {S_1, S_2, …, S_6}; that is to say, the transition probabilities {A = {a_ij}} from the reason behind the preference of the currently used notebook computer to the reason behind the preference of the notebook computer that will be used next.

TABLE 1
REASONS OF PREFERENCES THAT ARE THE HIDDEN STATES FOR HMM

The answers given to the questions about the currently used notebook computer preference {v_k} and the preference of the notebook computer that will be used next {v_{k+1}} are taken as the “observations” of the HMM. The brands obtained from the answers given to the survey questions are given in Table 2.

TABLE 2
OBSERVATIONS REPRESENTING BRANDS

When the notebook computer preferences are combined with the preference reasons, the observation probabilities matrix {B = {b_j(k)}} is formed. The preference probabilities of the sixteen notebook computer brands in any given state are given in this matrix. The observation probabilities matrix, which is described by (9) and (10), is formed according to the answers given to the survey questions. The answers given to the survey are in the form of a brand preference for the currently used notebook computer and its reason, and a brand preference for the notebook computer that will be used next and its reason.

In the light of these data, the transition probabilities and observation probabilities matrices are formed. Since none of the preference reasons for notebook computers was of higher priority than any other, their initial state probabilities were initially taken to be equal, as {π_i = P[q_1 = D_i], 1 ≤ i ≤ 6}.

In terms of the First Problem of the HMM, P(O|λ) refers to the probability that a notebook computer brand will be preferred next, given the currently used brand, under the model λ = {π_i, a_ij, b_j(k)}. The observations v_1, v_2, …, v_16 were used in turn as the O value here. In other words, the O value is used to represent each notebook computer brand individually {O = v_1 (Acer), O = v_2 (Apple), etc.}. The forward algorithm given in (13), (14) and (15) is used in calculating the P(O|λ) probability.

Finding out what the reason behind the preference of the notebook computer brand that will be used next will be is done by using the Second Problem of the HMM. The reason that lies beneath a preference actually refers to a hidden state in the model λ. Finding the hidden states for the observations v_1, v_2, …, v_16 is done by using the Viterbi algorithm. The steps of this algorithm are given in (19) to (24).

The parameters of the model may be rearranged in order to maximize the preference probability of any observation (brand). The model parameters to be rearranged include the transition probabilities matrix, the observation probabilities matrix and the initial probabilities. Formulas (29a), (29b) and (29c) are given in order to calculate these values.
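To show how the pieces above would be combined in this application, the sketch below scores each brand observation with the forward algorithm and recovers the most likely hidden preference reason with the Viterbi algorithm (both functions from the earlier sketches). The 6 x 6 transition matrix, the 6 x 16 observation matrix, and the exact form of the observation sequences come from the survey data and are not reproduced in this extraction, so random stochastic placeholders and single-symbol observations are assumed here.

import numpy as np

# Hypothetical placeholders: the paper's survey-derived matrices (Tables 1
# and 2) are not available here, so random stochastic matrices stand in.
rng = np.random.default_rng(2)
N, M = 6, 16                                  # 6 preference reasons, 16 brands
pi = np.full(N, 1.0 / N)                      # equal initial probabilities
A = rng.random((N, N)); A /= A.sum(axis=1, keepdims=True)
B = rng.random((N, M)); B /= B.sum(axis=1, keepdims=True)

brands = [f"brand_{k}" for k in range(M)]     # stands in for Acer, Apple, ...

# For each single-brand observation O = v_k, compute P(O|lambda) with the
# forward algorithm and the most likely hidden preference reason with Viterbi.
for k, name in enumerate(brands):
    O = [k]
    _, prob = forward(pi, A, B, O)
    reason_path, _ = viterbi(pi, A, B, O)
    print(name, round(float(prob), 4), "reason state S_%d" % (reason_path[0] + 1))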
5 CONCLUSION

For the model λ, the P(O|λ) probabilities of the notebook computer brands that will be preferred next, found by using the First Problem of the HMM, and the results obtained with the Viterbi algorithm about the hidden states that lie beneath these preferences, are given in Table 3.

TABLE 3
BRAND PREFERENCE PROBABILITIES AND HIDDEN STATES
According to the estimations in Table 3, the three notebook computer brands with the highest preference probabilities were HP with 20.18%, Toshiba with 17.76% and Sony with 16.81%. The preference reason, which is the hidden state, was estimated as S_2 (Advanced System Properties) for HP, as S_5 (Robustness and Durability) for Toshiba and as S_4 (Trust in the Brand) for Sony.

The model parameters were rearranged to maximize the observation probabilities of the brands. However, the preferences need to be observed again in the future in order to see the consistency of the obtained parameters.

This study presents results about brand preference and the estimation of preference reasons with an HMM. It is clear that the obtained results depend on the current data and on the data that will be formed in the next step, due to the Markov chain structure. Since the data used concern the preferences of notebook computer users in Turkey, the results of the research are valid for the Turkish market. However, the same approach may well be applied outside of Turkey.

When a literature review of the subject is made, it can be observed that HMM is generally used in the areas of face, vision, audio and character recognition and in biology. For this reason, it is very important to apply HMM to different areas, especially for researchers who will work on this subject.
REFERENCES

[1] A.J. Viterbi, "Error bounds for convolutional codes and an asymptotically optimal decoding algorithm," IEEE Trans. Information Theory, vol. IT-13, pp. 260-269, Apr. 1967.
[2] A. Schliep, B. Georgi, W. Rungsarityotin, I.G. Costa, and A. Schönhuth, "The General Hidden Markov Model Library: Analyzing Systems with Unobservable States," Forschung und wissenschaftliches Rechnen: Beitrage zum Heinz-Billing Preis, Series GWDG-Bericht, pp. 121-135, 2004.
[3] B.G. Leroux, "Maximum likelihood estimation for hidden Markov models," Stochastic Process. Appl., vol. 40, pp. 127-143, 1992.
[4] B. Haubold, T. Wiehe, Introduction to Computational Biology: An Evolutionary Approach, Basel, Switzerland: Birkhauser Verlag, pp. 114, 2006.
[5] C. Becchetti, L.P. Ricatti, Speech Recognition: Theory and C++ Implementation, England: John Wiley & Sons Inc., pp. 179, 2004.
[6] C.D. Fuh, "SPRT and CUSUM in Hidden Markov Models," The Annals of Statistics, vol. 31, no. 3, pp. 942-977, 2003.
[7] E. Alpaydın, Introduction to Machine Learning, United States of America: The MIT Press, pp. 313, 2004.
[8] G.D. Forney, "The Viterbi algorithm," Proc. IEEE, vol. 61, pp. 268-278, Mar. 1973.
[9] H.L. Lou, "Implementing the Viterbi Algorithm," IEEE Signal Processing Magazine, pp. 42-52, September 1995.
[10] H. Xue and V. Govindaraju, "Hidden Markov Models Combining Discrete Symbols and Continuous Attributes in Handwriting Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 3, pp. 458-462, March 2006.
[11] L.E. Baum, "An inequality and associated maximization technique in statistical estimation for probabilistic functions of Markov processes," Inequalities, vol. 3, pp. 1-8, 1972.
[12] L.E. Baum and J.A. Eagon, "An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology," Bull. Amer. Math. Soc., vol. 73, pp. 360-363, 1967.
[13] L.E. Baum and T. Petrie, "Statistical inference for probabilistic functions of finite state Markov chains," Ann. Math. Statist., vol. 37, pp. 1554-1563, 1966.
[14] L. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, February 1989.
[15] L. Rabiner, B.H. Juang, Fundamentals of Speech Recognition, New Jersey, United States of America: Prentice Hall International Inc., Prentice Hall Signal Processing Series, pp. 322-323, 1993.
[16] M. Karlsson, "Hidden Markov Models," http://www.math.chalmers.se/~olleh/Markov_Karlsson.pdf, 2004 (accessed 07 Sept. 2010).
[17] P. Bickel and Y. Ritov, "Inference in hidden Markov models I: Local asymptotic normality in the stationary case," Bernoulli, vol. 2, no. 3, pp. 199-228, 1996.
[18] P. Bickel, Y. Ritov and T. Ryden, "Asymptotic normality of the maximum likelihood estimator for general hidden Markov models," Ann. Statist., vol. 26, no. 4, pp. 1614-1635, 1998.
[19] R. Bhar, S. Hamori, Hidden Markov Models: Applications to Financial Economics, The Netherlands: Kluwer Academic Publishers, pp. 17, 2004.
[20] R.J. Elliott, L. Aggoun, J.B. Moore, Hidden Markov Models: Estimation and Control, Second Printing, United States of America: Springer-Verlag, pp. 3, 1997.
[21] S.V. Vaseghi, Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications, England: John Wiley & Sons Ltd., pp. 364, 2007.
[22] T. Can, "Kuadratik Programlama Yöntemiyle Markov Geçiş Matris Değerlerinin Belirlenmesi," Atatürk Üniversitesi İktisadi ve İdari Bilimler Dergisi, vol. 21, no. 1, pp. 89-101, Jan. 2007.
[23] T. Petrie, "Probabilistic functions of finite state Markov chains," Ann. Math. Statist., vol. 40, pp. 97-115, 1969.
[24] V.M. Mantyla, Discrete Hidden Markov Models with Application to Isolated User-Dependent Hand Gesture Recognition, Finland: Valtion Teknillinen Tutkimuskeskus Publications, pp. 35, 2001.
[25] X. Huang, A. Acero, and H.W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development, United States of America: Prentice Hall, Inc., pp. 375, 2001.
Ersoy ÖZ obtained his Ph.D. degree in Operations Research at Marmara University in Turkey in 2009. His work is about Markov chains, hidden Markov models and geometric programming. He is a lecturer in the Computer Programming department at Yildiz Technical University.