STATISTICS IN MEDICINE Statist. Med. (in press) Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/sim.2108
Application of hidden Markov models to multiple sclerosis lesion count data
Rachel MacKay Altman^{1,∗,†} and A. John Petkau^{2}
^{1} Department of Statistics and Actuarial Science, Simon Fraser University, Canada
^{2} Department of Statistics, University of British Columbia, Canada
SUMMARY
This paper is motivated by the work of Albert et al., who consider lesion count data observed on multiple sclerosis patients and develop models for each patient’s data individually. From a medical perspective, adequate models for such data are important both for describing the behaviour of lesions over time and for designing efficient clinical trials. In this paper, we discuss some issues surrounding the hidden Markov model proposed by these authors. We describe an efficient estimation method and propose some extensions to the original model. Our examples illustrate the need for models which describe all patients’ data simultaneously, while allowing for interpatient heterogeneity. Copyright © 2005 John Wiley & Sons, Ltd.
KEY WORDS:
Hidden Markov model; multiple sclerosis; time series; count data
1. INTRODUCTION
Multiple sclerosis (MS) is a debilitating disease of the central nervous system. Patients with this disease may have problems with vision, coordination, sensation, gait, endurance, and bowel, bladder, cognitive and sexual functions. It is now believed that such symptoms are related to the development of lesions (areas of demyelination) in the brain and spinal cord. Lesions may persist indefinitely, or may disappear temporarily only to reappear at a later time, or may disappear altogether. Relapsing-remitting MS is a particular type of this disease where symptoms tend to worsen and then improve in alternating periods of relapse and remission. It has been shown that relapse rate is positively associated with numbers of T2 lesions [1]. Magnetic resonance imaging (MRI) is one method for detecting and measuring MS lesions. The development of a model for MS/MRI lesion count data may lead to medical insight into
∗ Correspondence to: R. M. Altman, Department of Statistics and Actuarial Science, Simon Fraser University, 8888 University Drive, Burnaby, BC V5A 1S6, Canada.
† E-mail: raltman@stat.sfu.ca
Contract/grant sponsor: Natural Sciences and Engineering Research Council of Canada
Received May 2004 Accepted September 2004
the behaviour of the disease. Furthermore, an adequate model is necessary for designing efficient clinical trials for new MS therapies. Albert et al. [2] (henceforth called AMSF) provide a starting point for determining such a model. These authors discuss a study of relapsing-remitting MS patients who received monthly MRI scans for a period of approximately 30 months. Each month, the numbers of lesions observed on these scans were recorded. AMSF report the data for three particular patients (see Figure 1). The observed lesion counts range from 0 to 19, with a mean of 4.5 and a median of four lesions per scan.

AMSF propose three different models for these data, and fit each to the three patients’ data individually. The first model assumes that the monthly lesion counts are independent and Poisson distributed with a common mean. The second assumes that, conditional on past observations, each count is Poisson distributed with mean depending on these past observations. The third model is a hidden Markov model (HMM), where the lesion count at time $t$ is assumed to be Poisson distributed with mean depending on the patient’s unobserved disease state at that time. As AMSF point out, MS is known to be a very heterogeneous disease. Not surprisingly, their analysis suggests that none of the models is appropriate for all three patients.

In this paper, we discuss some issues surrounding AMSF’s HMM. Of the three models, we focus on the HMM because we feel that it best reflects the nature of the disease. Specifically, given the results of Reference [1], we expect the number of lesions to depend on the patient’s underlying disease state (relapse or remission).

AMSF’s HMM can be expressed as follows. Let $Y_t$ be the lesion count at month $t$. Let $\{Z_t\}$ be an unobserved, stationary Markov chain taking on values in the set $\{-1, 1\}$ and with transition probabilities given by $P(Z_t = j \mid Z_{t-1} = i) = \gamma$, $i \neq j$. The stationary distribution of $\{Z_t\}$ is thus $\pi_j \equiv P(Z_t = j) = \frac{1}{2}$, $j = -1, 1$. Let $Y_t \mid Z_1, \ldots, Z_t$ be distributed as Poisson($\mu_t$), where

\[
\mu_t =
\begin{cases}
\theta \mu_{t-1}, & Z_t = 1 \\
(1/\theta) \mu_{t-1}, & Z_t = -1
\end{cases}
\]

For identifiability, we assume that $\theta > 1$. Letting $S_t = \sum_{j=1}^{t} Z_j$ and letting $\mu_0$ be the mean count at baseline, we can rewrite this assumption as $\mu_t = \mu_0 \theta^{S_t}$. It is also assumed that, given $Z_1, \ldots, Z_t$, $Y_t$ is independent of $Y_1, \ldots, Y_{t-1}, Y_{t+1}, \ldots, Y_n$. AMSF use the EM algorithm to estimate the parameters $\gamma$, $\mu_0$, and $\theta$ for each patient separately.

The purpose of our manuscript is twofold. The first is to discuss a more efficient means of estimating the parameters of the HMM than the EM algorithm. The second is to provide some insight into the underlying assumptions of AMSF’s model, and to propose some possible extensions. Our work will suggest that, in order to make substantive statements about the course of lesion activity, we require a much larger collection of patients and a flexible model which describes all patients’ data simultaneously.

Our paper is organized as follows. In Section 2, we write AMSF’s model in a more standard form. This form allows us to evaluate the likelihood, and hence the maximum likelihood estimates (MLEs), very efficiently. We outline the details of our estimation method in Section 3. In Section 4, we describe some features of the original HMM in more detail, and suggest some extensions to the model. We conclude with Section 5, where we discuss the need for a model which borrows strength across patients.
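As an illustration of the two-state model defined in this section, the following minimal sketch simulates one patient’s series. It assumes the form $\mu_t = \mu_0 \theta^{S_t}$ with $S_t$ the running sum of the hidden states; the function name `simulate_amsf` and the parameter values in the usage note are hypothetical and are not taken from the paper.

```python
import numpy as np


def simulate_amsf(n, mu0, theta, gamma, seed=None):
    """Simulate n monthly lesion counts from the two-state model (a sketch)."""
    rng = np.random.default_rng(seed)
    z = np.empty(n, dtype=int)
    # Stationary start: each state has probability 1/2.
    z[0] = rng.choice([-1, 1])
    for t in range(1, n):
        # The chain changes state with probability gamma.
        z[t] = -z[t - 1] if rng.random() < gamma else z[t - 1]
    s = np.cumsum(z)                          # S_t = Z_1 + ... + Z_t
    mu = mu0 * theta ** s.astype(float)       # mu_t = mu0 * theta^{S_t}
    y = rng.poisson(mu)                       # Y_t | Z_1..Z_t ~ Poisson(mu_t)
    return z, s, mu, y
```

For example, `simulate_amsf(30, mu0=3.0, theta=1.5, gamma=0.2, seed=0)` returns the hidden path, its cumulative sum, the conditional means and a simulated 30-month series; these inputs are made up for illustration only.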
Figure 1. MS/MRI data reported by AMSF. [Three panels (Patient 1, Patient 2, Patient 3), each plotting lesion count (vertical axis, 0 to 15) against month (horizontal axis, 0 to 30).]
2. WRITING THE AMSF MODEL IN STANDARD FORM
AMSF do not express their HMM in the usual way. In particular, standard HMMs satisfy the property

\[
P(Y_t \mid Z_1, \ldots, Z_t) = P(Y_t \mid Z_t).
\]

In AMSF’s model, $\mu_t$ depends not just on $Z_t$, but on $Z_1, \ldots, Z_{t-1}$ as well. However, it is possible to write AMSF’s model as a standard HMM. Specifically, define $S_0 = 0$ and $U_t = (S_{t-1}, S_t)$, $t = 1, \ldots, n$. We will then think of $\{U_t\}$, rather than $\{Z_t\}$, as the (two-dimensional) hidden process. The process $\{U_t\}$ is a Markov chain, and, as in the standard model, $P(Y_t \mid U_1, \ldots, U_t) = P(Y_t \mid U_t)$. Let $A_t = \{-t, -t+2, \ldots, t-2, t\}$. The transition probabilities for $\{U_t\}$ are given by

\[
P_{(x_2,x_1),(x_1,x_0)}(t) \equiv P\{U_t = (x_1, x_0) \mid U_{t-1} = (x_2, x_1)\} =
\begin{cases}
1 - \gamma, & x_0 \in A_t,\ x_0 = x_1 + 1,\ x_1 = x_2 + 1 \\
1 - \gamma, & x_0 \in A_t,\ x_0 = x_1 - 1,\ x_1 = x_2 - 1 \\
\gamma, & x_0 \in A_t,\ x_0 = x_1 + 1,\ x_1 = x_2 - 1 \\
\gamma, & x_0 \in A_t,\ x_0 = x_1 - 1,\ x_1 = x_2 + 1 \\
0, & \text{otherwise}
\end{cases}
\]
This formulation of the model elucidates one of its key features: the state space of the hidden process, and hence the transition probabilities, vary with time. In other words, this model is a nonhomogeneous HMM, and, although $\{Z_t\}$ is stationary, $\{Y_t\}$ is not.

In the case of a stationary HMM where the hidden process takes on only a finite number of values, the MLEs are consistent and asymptotically normal under quite general conditions [3, 4]. Similar results hold in the case when $\{Z_t\}$ belongs to a compact set and is possibly nonstationary [5]. In these cases, the observed information converges in probability to the Fisher information matrix [4, 5]. In addition, in the comparison of nested stationary HMMs with a common, known number of hidden states, the likelihood ratio test (LRT) statistic has the usual asymptotic $\chi^2$ distribution [6]. However, we are not aware of any results in the literature regarding the asymptotic properties of nonhomogeneous HMMs. (AMSF did not provide standard errors or carry out formal inference in their analysis.)

To make inferences about AMSF’s model and its extensions, we will rely on the above results for homogeneous HMMs. Because these results have not been shown to hold for nonhomogeneous HMMs, our conclusions should be considered only informal. If more formal conclusions were desired, one could fit a stationary Poisson HMM with an appropriate value of $K$ to the data. Specifically, in AMSF’s model, the mean lesion count at time $t$ is restricted to a discrete set of values (evenly spaced on the log scale). If we assume that the observed process is stationary, it is reasonable to use a finite approximation to these mean values, i.e. to assume that the mean at time $t$ is one of $K$ values, $K < \infty$. This new model is simply a stationary Poisson HMM with $K$ hidden states and with some restrictions on the transition probabilities; hence, standard inference results would apply.
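The time-varying state space described in this section can be made concrete with a short sketch. It encodes the set $A_t$ and the transition probabilities of $\{U_t\}$ exactly as written above; the function names `support` and `transition_prob` are hypothetical helpers introduced only for illustration.

```python
def support(t):
    """A_t = {-t, -t+2, ..., t-2, t}: values S_t can take after t steps."""
    return list(range(-t, t + 1, 2))


def transition_prob(u_prev, u_next, t, gamma):
    """P{U_t = u_next | U_{t-1} = u_prev}, with U_t = (S_{t-1}, S_t), t >= 2."""
    x2, x1 = u_prev
    x1_again, x0 = u_next
    # The shared coordinate must match and S_t must lie in A_t.
    if x1 != x1_again or x0 not in support(t):
        return 0.0
    z_prev, z_next = x1 - x2, x0 - x1        # implied Z_{t-1} and Z_t
    if abs(z_prev) != 1 or abs(z_next) != 1:
        return 0.0
    # Same direction twice: 1 - gamma; change of direction: gamma.
    return 1.0 - gamma if z_prev == z_next else gamma
```

Because `support(t)` has `t + 1` elements, the number of values available to the hidden process grows with `t`, which is the sense in which the model is nonhomogeneous.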
3. COMPUTATIONAL ISSUES
AMSF use the EM algorithm to obtain the MLEs of the parameters in their model. This algorithm can be a useful means of computing the MLEs when there are ‘missing’ data (e.g. the hidden states), and is very popular in the HMM literature. However, as pointed out in Reference [7], HMM likelihoods can be computed very simply as a product of matrices. Hence, direct maximum likelihood estimation is typically much more efficient than the EM algorithm. In particular, using the formulation of Section 2 and defining $f(Y_t \mid S_t)$ as the Poisson($\mu_t$) distribution, we can express the likelihood associated with AMSF’s model as

\[
\begin{aligned}
L &= \sum_{S_1 \in A_1} \cdots \sum_{S_n \in A_n} \pi_{S_1} f(Y_1 \mid S_1) \prod_{t=2}^{n} P_{(S_{t-2},S_{t-1}),(S_{t-1},S_t)} f(Y_t \mid S_t) \\
&= \sum_{S_1 \in A_1} \pi_{S_1} f(Y_1 \mid S_1) \sum_{S_2 \in A_2} P_{(S_0,S_1),(S_1,S_2)} f(Y_2 \mid S_2) \cdots \sum_{S_n \in A_n} P_{(S_{n-2},S_{n-1}),(S_{n-1},S_n)} f(Y_n \mid S_n)
\end{aligned}
\]
By defining $n$ appropriate matrices (one for each of the summations), we can compute the likelihood as a product of these $n$ matrices. This formulation allows the efficient evaluation of the likelihood, even for relatively large numbers of hidden states. (See Reference [8] for an example of fitting a similar model with up to five hidden states.)

After evaluating the likelihood, we can then obtain the MLEs numerically. We use a quasi-Newton routine [9]. For the cases we have considered, maximizing the likelihood directly produces the MLEs far more quickly than the EM algorithm.

Finally, HMM likelihoods often have many local maxima, so good starting values can be of critical importance. The speed of the direct MLE computations allows us to try a variety of starting values (or to do a grid search over a set of reasonable values) within a reasonable time frame. The same search using the EM algorithm would be computationally intensive. For these reasons, we much prefer the use of direct likelihood maximization for fitting HMMs.
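The nested sums in the likelihood above can be evaluated with a forward recursion, which is one way to organize the matrix product. The sketch below is not the authors’ code: it tracks the pair ($S_t$, $Z_t$), which carries the same information as $U_t$, rescales at each step to avoid underflow, and hands the result to SciPy’s BFGS routine as a stand-in for the quasi-Newton method of Reference [9]. The names `unpack`, `neg_log_lik` and `fit_direct` are hypothetical, and the working parameters follow the log and logit transformations described in Section 4.1.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson


def unpack(params):
    """Map working parameters (mu0*, theta*, gamma*) to (mu0, theta, gamma)."""
    mu0_star, theta_star, gamma_star = params
    mu0 = np.exp(mu0_star)
    theta = 1.0 + np.exp(theta_star)
    gamma = 1.0 / (1.0 + np.exp(-gamma_star))
    return mu0, theta, gamma


def neg_log_lik(params, y):
    """Negative log-likelihood of the two-state model via a scaled forward pass."""
    mu0, theta, gamma = unpack(params)
    log_lik = 0.0
    alpha = None  # alpha[k, d]: S_t = -t + 2k, with Z_t = -1 (d = 0) or +1 (d = 1)
    for t, y_t in enumerate(y, start=1):
        s = np.arange(-t, t + 1, 2, dtype=float)          # the set A_t
        new = np.zeros((t + 1, 2))
        if t == 1:
            new[0, 0] = 0.5                               # S_1 = -1
            new[1, 1] = 0.5                               # S_1 = +1
        else:
            # Step down (Z_t = -1): stay down w.p. 1 - gamma, switch w.p. gamma.
            new[:-1, 0] = (1 - gamma) * alpha[:, 0] + gamma * alpha[:, 1]
            # Step up (Z_t = +1).
            new[1:, 1] = gamma * alpha[:, 0] + (1 - gamma) * alpha[:, 1]
        new *= poisson.pmf(y_t, mu0 * theta ** s)[:, None]
        c = new.sum()
        if not np.isfinite(c) or c <= 0.0:
            return 1e10                                   # guard against overflow/underflow
        log_lik += np.log(c)
        alpha = new / c
    return -log_lik


def fit_direct(y, starts):
    """Run BFGS from several starting values and keep the best fit."""
    y = np.asarray(y)
    best = None
    for x0 in starts:
        res = minimize(neg_log_lik, x0, args=(y,), method="BFGS")
        if best is None or res.fun < best.fun:
            best = res
    return best
```

With a fitted result `best`, `np.sqrt(np.diag(best.hess_inv))` gives rough standard errors on the working scale. The BFGS inverse-Hessian approximation can be crude, so this should be read as an illustration of the idea rather than a reproduction of the reported values.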
4. HMMS FOR MS/MRI DATA
AMSF’s model has two unusual features: the constraint that $P(Z_t = j \mid Z_{t-1} = i) = \gamma$, $i \neq j$, and the structure of $\mu_t$. The former implies that patients spend half the time in a state of deterioration and the other half in a state of improvement. The latter implies that the conditional mean lesion count either increases or decreases each month; it cannot remain stable. These assumptions may not be realistic for all patients.

In this section, we suggest some models with more flexible structures. We fit each model to AMSF’s data using the method of direct maximum likelihood estimation described in Section 3. The quasi-Newton routine provides an estimate of the inverse Hessian of the log likelihood, from which we obtain approximate standard errors.
4.1. Original model
We first fit AMSF’s model. We can facilitate the maximization of the likelihood by transforming the parameters so that their range is the entire real line, i.e. $\mu_0^* = \log \mu_0$, $\theta^* = \log(\theta - 1)$, and $\gamma^* = \log(\gamma/(1 - \gamma))$.
Table I. Parameter estimates and standard errors for AMSF’s model.

Parameter     Transformation                  Patient 1            Patient 2            Patient 3
                                              Estimate   SE        Estimate   SE        Estimate   SE
$\mu_0^*$     $\log \mu_0$                      1.070    0.091       1.362    0.178       2.223    0.244
$\theta^*$    $\log(\theta - 1)$              −11.083    NA         −0.282    0.245      −1.128    0.560
$\gamma^*$    $\log(\gamma/(1 - \gamma))$       NA       NA          1.241    0.575       1.336    0.847
$\log L$                                      −66.814              −72.029              −65.363
Table I gives parameter estimates and approximate standard errors. Our results, with the addition of the standard errors, are similar to AMSF’s. One difference is that we have achieved a higher value of the likelihood in the case of Patients 2 and 3. An explanation for this difference might be that we considered a larger collection of starting values. A second difference is that we have omitted the estimate of $\gamma^*$ for Patient 1. For this patient, $\theta$ is estimated as 1.000, and the model reduces to that for independent Poisson counts. So, the estimates for $\gamma$ given by the quasi-Newton routine (and that given by AMSF) are, in fact, arbitrary. Since $\theta = 1$ is on the boundary of the parameter space, there is no guarantee that the usual standard error for the estimate of $\theta^*$ is even approximately correct. Hence, we omit this value.

One question raised by AMSF is whether the complexity of the HMM is warranted, or whether the independent Poisson count model is sufficient to describe the variability in the data. In principle, we should be cautious about making such inferences, since the test of $\theta = 1$ is a boundary problem. Informally, though, in the case of Patient 1, there is no evidence to suggest that the simpler model is inadequate. In the case of Patients 2 and 3, the 95 per cent confidence intervals for $\theta$ ([1.47, 2.22] and [1.11, 1.97], respectively) suggest evidence against the hypothesis that $\theta = 1$. Thus, the HMM structure seems to be more appropriate for these patients than the simpler model.
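The confidence intervals for $\theta$ quoted above are numerically consistent with Wald intervals formed on the transformed scale, $\theta^* \pm 1.96\,\mathrm{SE}$, and then mapped back through $\theta = 1 + \exp(\theta^*)$. The text does not state how the intervals were constructed, so the following sketch is an inference from the values in Table I; the function name `theta_interval` is hypothetical.

```python
import math


def theta_interval(theta_star, se, z=1.96):
    """Back-transform a Wald interval for theta* = log(theta - 1) to theta."""
    lower = 1.0 + math.exp(theta_star - z * se)
    upper = 1.0 + math.exp(theta_star + z * se)
    return lower, upper


# Estimates and standard errors of theta* taken from Table I.
for label, est, se in [("Patient 2", -0.282, 0.245), ("Patient 3", -1.128, 0.560)]:
    lower, upper = theta_interval(est, se)
    print(f"{label}: [{lower:.2f}, {upper:.2f}]")
```

Working on the transformed scale keeps both endpoints above 1, so the interval respects the constraint $\theta > 1$.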
4.2. Generalization of the transition probabilities
The first extension we consider is the use of general transition probabilities. In particular, we assume that $P(Z_t = 1 \mid Z_{t-1} = -1) = \gamma$ and $P(Z_t = -1 \mid Z_{t-1} = 1) = \beta$. This generalization allows patients to spend differing portions of time in states of deterioration and improvement.

Based on the analysis in Section 4.1, we do not apply this new model to the data from Patient 1. The parameter estimates for the other patients are given in Table II. In the case of Patient 3, $\beta$ is estimated as 1.000, which is on the boundary of the parameter space. Hence, we do not include a standard error.

To test the validity of the assumption that $\gamma = \beta$, we note that AMSF’s model is nested within the more general model. We then use the LRT to compare the two models, assuming that the LRT statistic has an asymptotic $\chi^2_1$ distribution. Surprisingly, the more general model does not fit substantially better for either of the two patients (p-value = 0.64 and 0.13 for Patients 2 and 3, respectively).

We have two possible explanations for these results. First, making inferences about the hidden process is usually a difficult problem. The standard errors of the estimates of $\gamma^*$ given in Table I are quite large relative to the estimates themselves, and relative to the standard
Table II. Parameter estimates and standard errors for the model with general transition probabilities.

Parameter     Transformation                  Patient 2            Patient 3
                                              Estimate   SE        Estimate   SE
$\mu_0^*$     $\log \mu_0$                      1.360    0.173       2.186    0.143
$\theta^*$    $\log(\theta - 1)$               −0.281    0.244      −1.416    0.273
$\gamma^*$    $\log(\gamma/(1 - \gamma))$       1.489    0.801       0.975    0.515
$\beta^*$     $\log(\beta/(1 - \beta))$         1.039    0.690      18.570    NA
$\log L$                                      −71.921              −64.222
Table III. Parameter estimates and standard errors for the model with a general conditional mean structure.

Parameter      Transformation                 Patient 1            Patient 2            Patient 3
                                              Estimate   SE        Estimate   SE        Estimate   SE
$\mu_0^*$      $\log \mu_0$                     1.168    0.200       0.987    0.256       2.446    0.247
$\theta_0^*$   $\log \theta_0$                  0.006    0.089       0.484    0.621       0.466    0.166
$\theta_1^*$   $\log \theta_1$                 −0.006    0.089       0.621    0.120       0.398    0.158
$\gamma^*$     $\log(\gamma/(1 - \gamma))$      NA       NA          0.974    0.708       2.384    0.813
$\log L$                                      −66.652              −70.401              −63.904
errors of the estimates of $\mu_0^*$ and $\theta^*$. The same is true of the standard errors of the estimates of $\gamma^*$ and $\beta^*$ in Table II.

A second explanation may lie in the structure of $\mu_t$: the proportional increase in the mean when $Z_t = 1$ is assumed equal to the proportional decrease in the mean when $Z_t = -1$. In the case where there is no overall trend in the data (as is true for these particular patients, as well as for relapsing-remitting patients in general when observed over a short time period), the number of transitions from decreasing to increasing mean is forced to equal approximately the number of transitions from increasing to decreasing mean. This statement is equivalent to AMSF’s assumption that $\pi_{-1} = \pi_1 = 0.5$. Since $\pi_{-1} = \beta/(\gamma + \beta)$ and $\pi_1 = \gamma/(\gamma + \beta)$, we have that $\gamma = \beta$.
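The likelihood ratio comparison reported in this section can be reproduced from the maximized log-likelihoods in Tables I and II. The sketch below assumes, as the text does, a chi-squared reference distribution with one degree of freedom; the helper name `lrt_pvalue` is hypothetical.

```python
from scipy.stats import chi2


def lrt_pvalue(loglik_null, loglik_alt, df=1):
    """Likelihood ratio statistic and its chi-squared tail probability."""
    stat = 2.0 * (loglik_alt - loglik_null)
    return stat, chi2.sf(stat, df)


# Log-likelihoods from Table I (gamma = beta) and Table II (general gamma, beta).
for label, l_null, l_alt in [("Patient 2", -72.029, -71.921),
                             ("Patient 3", -65.363, -64.222)]:
    stat, p = lrt_pvalue(l_null, l_alt)
    print(f"{label}: LRT = {stat:.3f}, p-value = {p:.2f}")
```

The same helper applies to the other nested comparisons in Section 4, provided the null and alternative log-likelihoods come from the appropriate tables and `df` equals the number of added parameters.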
4.3. Generalization of the conditional mean structure and of the transition probabilities
In light of the discussion in Section 4.2, we might consider modelling $\mu_t$ more generally while leaving the transition probabilities as in Section 4.1, e.g.

\[
\mu_t =
\begin{cases}
\theta_0 \mu_{t-1}, & \text{if patient is deteriorating at time } t \\
(1/\theta_1) \mu_{t-1}, & \text{if patient is improving at time } t
\end{cases}
\]

The parameter estimates and standard errors are given in Table III. When $\theta_0 = 1/\theta_1 \equiv \theta$, the model implies that $\{Y_t\}$ are independent with $Y_t$ distributed as Poisson($\mu_0 \theta^t$), so that $\gamma$ is arbitrary. Thus, for Patient 1, we omit the estimate of $\gamma^*$. This modification does not
Table IV. Parameter estimates and standard errors for model with general transition probabilities and conditional mean structure.

Parameter      Transformation                 Patient 2            Patient 3
                                              Estimate   SE        Estimate   SE
$\mu_0^*$      $\log \mu_0$                     1.889    0.235       2.445    0.233
$\theta_0^*$   $\log \theta_0$                  0.811    0.169       0.466    0.154
$\theta_1^*$   $\log \theta_1$                  0.383    0.071       0.398    0.149
$\gamma^*$     $\log(\gamma/(1 - \gamma))$      2.022    1.134       2.372    1.045
$\beta^*$      $\log(\beta/(1 - \beta))$       −0.399    0.505       2.396    1.052
$\log L$                                      −68.695              −63.904
significantly improve the fit for Patient 1 (p-value = 0.57), but we observe some evidence of an improved fit for Patients 2 and 3 (p-value = 0.071 and 0.088, respectively).

It was anticipated that, for Patients 2 and 3, the fit might be further improved by using general transition probabilities (as in Section 4.2). The estimates of the transformed parameters and standard errors are given in Table IV. The LRTs comparing this model to the model with $\gamma = \beta$ yield the p-values 0.065 and 1.00 for Patients 2 and 3, respectively. Thus, for Patient 2, there is some support for the expanded model.
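The remark in this section that $\theta_0 = 1/\theta_1$ leads to independent counts can be checked by unrolling the recursion for $\mu_t$. In the sketch below, $D_t$ denotes the number of months among $1, \ldots, t$ in which the patient is deteriorating; this symbol is introduced here for illustration and does not appear in the original text.

```latex
% Unrolling the recursion: D_t deteriorating months, t - D_t improving months.
\mu_t = \mu_0 \, \theta_0^{D_t} \, \theta_1^{-(t - D_t)}, \qquad D_t \in \{0, 1, \ldots, t\}.

% Case 1: \theta_0 = 1/\theta_1 \equiv \theta. The hidden path drops out.
\mu_t = \mu_0 \, \theta^{D_t} \, \theta^{\,t - D_t} = \mu_0 \, \theta^{t}.

% Case 2: \theta_0 = \theta_1 \equiv \theta. The original model is recovered,
% with S_t = D_t - (t - D_t) = 2 D_t - t.
\mu_t = \mu_0 \, \theta^{D_t} \, \theta^{-(t - D_t)} = \mu_0 \, \theta^{2 D_t - t} = \mu_0 \, \theta^{S_t}.
```

In the first case the mean no longer depends on the hidden states, so the likelihood carries no information about $\gamma$, which is why its estimate is omitted for Patient 1.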
4.4. Addition of a third hidden state
Our final question of interest regarding AMSF’s model involves the number of hidden states. We consider the addition of a third hidden state, state 0, where the patient’s condition is stable. This modification can be expressed as

\[
\mu_t =
\begin{cases}
\theta \mu_{t-1}, & \text{if patient is deteriorating at time } t \\
\mu_{t-1}, & \text{if patient is stable at time } t \\
(1/\theta) \mu_{t-1}, & \text{if patient is improving at time } t
\end{cases}
\;=\; \mu_0 \theta^{S_t}
\]

We represent the transition probabilities as $P_{i,j} \equiv P(Z_t = j \mid Z_{t-1} = i)$, $i, j \in \{-1, 0, 1\}$. Of course, we have the constraint $\sum_j P_{i,j} = 1$ for all $i$.

One disadvantage of such an extension is the introduction of the problem of computing the stationary probabilities, which are used as initial probabilities for the hidden Markov chain. In the two-dimensional case, we have the simple closed form for the stationary distribution given in Section 4.2. In order to compute the stationary distribution in the three-dimensional case, however, a system of three linear equations must be solved at each iteration of the quasi-Newton algorithm.

Another disadvantage of this model is the large number of unknown parameters. However, we could reduce this number if we were willing to place restrictions on the transition probabilities (as in AMSF’s model), for example by assuming that the transition probability matrix is symmetric.
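The linear system mentioned above can be solved directly. The sketch below is one standard way to do it, not necessarily the authors’ implementation: it replaces one of the balance equations with the normalization constraint and assumes the chain is irreducible, so that the stationary distribution is unique. The function name and the example matrix in the usage note are hypothetical.

```python
import numpy as np


def stationary_distribution(P):
    """Solve pi P = pi with sum(pi) = 1 for an irreducible transition matrix P."""
    P = np.asarray(P, dtype=float)
    k = P.shape[0]
    # Rows of (P^T - I) are the balance equations; drop the last one
    # (it is redundant) and add the normalization constraint instead.
    A = np.vstack([(P.T - np.eye(k))[:-1], np.ones(k)])
    b = np.zeros(k)
    b[-1] = 1.0
    return np.linalg.solve(A, b)
```

For a two-state chain with rows ordered as states $(-1, 1)$, `stationary_distribution([[1 - g, g], [b, 1 - b]])` reproduces the closed form $\pi_{-1} = \beta/(\gamma + \beta)$ from Section 4.2; for three states, any row-stochastic matrix with positive entries can be passed in the same way.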
Table V. Parameter estimates and standard errors for 3-state model.

Parameter       Transformation                  Patient 1            Patient 2            Patient 3
                                                Estimate   SE        Estimate   SE        Estimate   SE
$\mu_0^*$       $\log \mu_0$                      1.061    0.139       1.372    0.155       2.119    0.110
$\theta^*$      $\log(\theta - 1)$                0.003    0.028       0.671    0.130       0.395    0.068
$P^*_{-1,-1}$   $\log(p_1/(1 - p_1 - p_2))$       0.417    0.476      −3.139    0.229       0.184    0.167
$P^*_{-1,0}$    $\log(p_2/(1 - p_1 - p_2))$      −0.536    0.816       4.316    0.315       0.126    0.195
$P^*_{0,-1}$    $\log(p_3/(1 - p_3 - p_4))$       0.173    0.537      −0.142    0.643      −6.825    6.846
$P^*_{0,0}$     $\log(p_4/(1 - p_3 - p_4))$      −0.370    0.250      −6.376    1.247       2.210    0.349
$P^*_{1,-1}$    $\log(p_5/(1 - p_5 - p_6))$       0.547    0.856      −1.757    1.002       7.741    6.168
$P^*_{1,0}$     $\log(p_6/(1 - p_5 - p_6))$      −1.525    0.136       7.079    1.051      −1.170    1.401
$\log L$                                        −65.931              −69.448              −63.514
The parameter estimates and standard errors are given in Table V. In this case, the likelihood functions are quite flat (likely due to the large number of parameters and relatively small sample sizes) and hence difficult to maximize. The parameter estimates are not entirely reliable, and may correspond to a local maximum.

When we compare the likelihoods in Table V to those in Table I, we see that substantial increases occur for Patient 2 in particular. Thus, we might surmise that this 3-state model is more appropriate than AMSF’s model. It would be a mistake, however, to use the $\chi^2$ distribution to gauge the extremity of the LRT statistic. The test comparing models with differing numbers of hidden states amounts to the hypothesis that some of the transition probabilities are zero. Thus, this test is a boundary problem and does not satisfy the conditions required for the usual results for LRTs. Moreover, we cannot assume that methods such as the Akaike information criterion or the Bayesian information criterion provide consistent estimates of the number of hidden states. The question of estimating the number of hidden states in stationary HMMs is addressed in Reference [8], but the nonhomogeneous case has not been considered in the literature.
5. DISCUSSION
Our analyses in Section 4 illustrate a number of difficulties in applying AMSF’s model in particular, and in modelling MS/MRI data in general. Formal inference about nonhomogeneous HMMs is challenging. It is difficult to have confidence in applying these models in the absence of tools for assessing their fit. Further research in this field is warranted. In particular, additional theory is required in order to formally examine the adequacy of AMSF’s model and its extensions for MS/MRI data. A technique for studying the fit of a stationary HMM is proposed in Reference [10]; this work may be a starting point for treating the nonhomogeneous case.

The issue that is, perhaps, of greatest importance in the context of models for MS/MRI data is the planning of clinical trials (see e.g. Reference [11]). The selection of a reasonable model is key to achieving this goal. For example, in order to design clinical trials for new MS therapies, we need a model which allows the incorporation of a treatment effect.

Furthermore, we require a model which describes all patients’ data simultaneously. In particular, in our analyses, the standard errors associated with the estimates of the parameters of the hidden process are relatively large. Assuming the same model for each patient is one means of reducing this uncertainty. However, the behaviour of lesion counts in MS patients is often highly variable. It thus seems more reasonable to allow at least some model parameters to vary across patients.

Random effects are a useful means of capturing between-patient differences while still borrowing strength across patients. Such models are discussed in detail, in a very general setting, in Altman (manuscript under revision), where a new class of HMMs for multiple time series, called mixed hidden Markov models (MHMMs), is developed. This class is based on the generalized linear mixed model framework. For MS/MRI data, we might assume, for example, that, conditional on a patient-specific random effect, $u$, and the hidden disease state, $Z$, the lesion count at that time is Poisson distributed with mean depending on $u$ and $Z$. Including random effects in the transition probabilities is an option as well. In this case, the model would allow the percentage of time spent in each disease state to vary among patients. MHMMs also readily allow the incorporation of covariates, including a treatment effect. The flexibility of MHMMs makes them a promising possibility for modelling MS/MRI data and planning clinical trials. Good choices of models from this class for such data are currently under investigation.
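To make the random-effects idea concrete, the sketch below simulates counts for several patients under one possible specification. Everything specific in it is an assumption made for illustration and is not taken from the paper or from the MHMM manuscript it cites: a log link, a normally distributed random intercept $u$, a homogeneous hidden chain shared by all patients, and a uniform initial state. The function name `simulate_mhmm` is hypothetical.

```python
import numpy as np


def simulate_mhmm(n_patients, n_months, state_log_means, trans, sigma_u, seed=None):
    """Simulate counts with Poisson mean exp(state effect + patient effect)."""
    rng = np.random.default_rng(seed)
    state_log_means = np.asarray(state_log_means, dtype=float)
    trans = np.asarray(trans, dtype=float)
    k = len(state_log_means)
    # Patient-specific random intercepts (assumed normal).
    u = rng.normal(0.0, sigma_u, size=n_patients)
    z = np.empty((n_patients, n_months), dtype=int)
    for i in range(n_patients):
        z[i, 0] = rng.integers(k)            # uniform initial state (assumption)
        for t in range(1, n_months):
            z[i, t] = rng.choice(k, p=trans[z[i, t - 1]])
    # Conditional on u and Z, the count is Poisson with mean depending on both.
    mean = np.exp(state_log_means[z] + u[:, None])
    y = rng.poisson(mean)
    return u, z, y
```

Setting `sigma_u = 0` removes the between-patient variation and gives every patient the same model, which corresponds to the "same model for each patient" option discussed above.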
ACKNOWLEDGEMENTS
This manuscript includes work from the first author’s Ph.D. thesis, and was partially supported by a research grant and postdoctoral fellowship from the Natural Sciences and Engineering Research Council of Canada. We would like to express our appreciation to Paul Albert, Henry McFarland, and the Joseph Frank Experimental Neuroimaging Section, Laboratory of Diagnostic Radiology Research, Clinical Center, NIH, for providing the MS/MRI data.
REFERENCES
1. Sormani MP, Bruzzi P, Beckmann K, Wagner K, Miller DH, Kappos L, Filippi M. MRI metrics as surrogate endpoints for EDSS progression in SPMS patients treated with IFN beta-1b. Neurology 2003; 60(9):1462–1466.
2. Albert PS, McFarland HF, Smith ME, Frank JA. Time series for modelling counts from a relapsing-remitting disease: application to modelling disease activity in multiple sclerosis. Statistics in Medicine 1994; 13(5–7):453–466.
3. Leroux BG. Maximum-likelihood estimation for hidden Markov models. Stochastic Processes and their Applications 1992; 40(1):127–143.
4. Bickel PJ, Ritov Y, Rydén T. Asymptotic normality of the maximum-likelihood estimator for general hidden Markov models. Annals of Statistics 1998; 26(4):1614–1635.
5. Douc R, Matias C. Asymptotics of the maximum likelihood estimator for general hidden Markov models. Bernoulli 2001; 7(3):381–420.
6. Giudici P, Rydén T, Vandekerkhove P. Likelihood-ratio tests for hidden Markov models. Biometrics 2000; 56(3):742–747.
7. MacDonald IL, Zucchini W. Hidden Markov Models and Other Models for Discrete-valued Time Series. Chapman & Hall: London, 1997.
8. MacKay RJ. Estimating the order of a hidden Markov model. The Canadian Journal of Statistics 2002; 30(4):573–589.
9. Nash JC. Compact Numerical Methods for Computers: Linear Algebra and Function Minimisation. Wiley: New York, 1979.
10. Altman RM. Assessing the goodness-of-fit of hidden Markov models. Biometrics 2004; 60(2):444–450.
11. McFarland HF, Frank JA, Albert PS, Smith ME, Martin R, Harris JO, Patronas N, Maloni H, McFarlin DE. Using gadolinium-enhanced magnetic resonance imaging lesions to monitor disease activity in multiple sclerosis. Annals of Neurology 1992; 32(6):758–766.