Beruflich Dokumente
Kultur Dokumente
Systems biology
© The Author 2009. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org 1923
where ykl D denotes d data-points for each observable k, measured at 2.3 Identifiability
time-points tl . σklD are the corresponding measurement errors and A parameter θi is identifiable, if the confidence interval [σi− , σi+ ] of
yk (θ,tl ) the k-th observable as predicted by parameters θ for time- its estimate θ̂i is finite. Two phenomena accounting for parameters
point tl . The parameters can be estimated numerically by to be non-identifiable will be discussed here. Structural non-
identifiability is related to the model structure independent of
θ̂ = argmin χ 2 (θ ) . (5) experimental data which is extensively discussed, e.g. Cobelli and
DiStefano III (1980). In contrast, practical non-identifiability also
For normally distributed observational noise ∼ N(0,σ 2 ), this takes into account the amount and quality of measured data, that
corresponds to the maximum likelihood estimate (MLE) of θ and was used for parameter calibration. Practical non-identifiability is
less clearly defined in literature, therefore an independent definition
χ 2 (θ ) = const −2·log(L(θ)) (6) will be given.
where L(θ ) is the likelihood. In the following, χ 2 will be used as
placeholder for the likelihood. Furthermore, an appropriate model 2.3.1 Structural non-identifiability A structural non-
M that sufficiently describes the available experimental data is identifiability arises from a redundant parameterization in the
assumed. formal solution of y(t), due to an insufficient mapping g of
1924
A 10 B 10 C 10
θ2
2
θ
θ
0 0 0
0 10 0 10 0 10
θ1 θ1 θ1
Fig. 1. Contour plots of χ 2 (θ ) for a two-dimensional parameter space, shown in non-logarithmic scale for illustrative reasons. Shades from black to white
correspond to low and high values of χ 2 , respectively. Thick white lines display likelihood-based confidence regions and white stars the optimal parameter
estimates θ̂ . Left panel: a structural non-identifiability along the functional relation h(θ ) = θ1 ·θ2 −10 = 0 (dashed line). The likelihood-based confidence region
is infinitely extended. Middle panel: a practical non-identifiability. The likelihood-based confidence region is infinitely extended for θ1 → +∞ and θ2 → +∞,
lower confidence bounds can be derived. Right panel: both parameters are identifiable.
internal model states x to observables y in Equation (2). The set of extended in increasing and/or decreasing direction of θi , although
ambiguous parameters θsub ⊂ θ may be varied without changing the likelihood has a unique minimum for this parameter.
the observables y(t), hence keeping χ 2 (θ) on a constant value.
This means that the increase in χ 2 stays below the threshold α for
The redundant parameterization manifests as functional relations h
a desired confidence level α in direction of θi . Similar to a structural
between the parameters θsub , representing a manifold with constant
non-identifiability, the flattening out of the likelihood can continue
χ 2 in parameter space
along a functional relation. The confidence interval of a practically
χ 2 (θ ) = χ 2 (θ̂ ) ⇔ sub ) = 0.
h(θ (9) non-identifiable parameter is not necessarily extended infinitely to
both sides. There can be a finite upper or lower bound of the
Consequently, the parameter estimates θ̂sub and, respectively,
confidence interval [σi− , σi+ ] where either σi− = −∞ or σi+ = +∞.
the internal model states x(t) affected by these parameters are
In a two-dimensional parameter space, a practical non-
not uniquely identified. Confidence intervals of a structurally
identifiability can be visualized as a relatively flat valley, which is
non-identifiable parameter θi ∈ θsub are infinite [−∞, +∞] in
infinitely extended. The height distance of the valley bottom to the
logarithmic parameter space considered here. Hence, its value
lowest point θ̂ never excesses α , as illustrated in Figure 1, middle
cannot be estimated at all. A direct detection of a redundant
panel.
parameterization in the analytic form of y(t) is hampered, because
Along a practical non-identifiability, the observable y change
Equation (1) can only be solved analytically in special cases.
only negligibly remaining compliant with the given measurement
χ 2 (θ ) in a two-dimensional parameter space can be visualized
accuracy. Nevertheless, model behavior in terms of internal states x
as a landscape. A structural non-identifiability results in a perfect
might vary strongly. Improving the detection of typical dynamical
flat valley, infinitely extended along the corresponding functional
behavior by increasing the amount and quality of measured data
relation, as illustrated in Figure 1, left panel.
and/or the choice of measurement time-points tij will ultimately
Since a structural non-identifiability is independent of the
resolve a practical non-identifiability, yielding finite likelihood-
accuracy of available experimental data, it cannot be resolved
based confidence intervals (Fig. 1, right panel). Inferring how to
by a refinement of existing measurements, e.g. by reducing the
decrease confidence intervals most efficiently is the subject of
measurement noise (t). The only remedy is a qualitatively new
experimental planning, which will be discussed later on.
measurement which alters the mapping g, e.g. by increasing the
number of observed species. A parameter is structural identifiable,
if a unique minimum of χ 2 (θ ) with respect to θi exists. 3 EXISTING METHODS
Various methods exist to detect structural non-identifiability by a
2.3.2 Practical non-identifiability A parameter that is structurally
priori analyzing the system equations (1) and (2), such as the Power
identifiable may still be practically non-identifiable. This arises
Series Expansion (Pohjanpalo, 1978), the Volterra and Generating
frequently if amount and quality of experimental data is insufficient
Power Series Approach (Lecourtier et al., 1987), the Similarity
and manifests in a confidence interval that is infinite. Please note
Transform Approach (Vajda et al., 1989b) or differential algebraic
that the asymptotic confidence interval of a structural identifiable
methods (e.g. Ljung and Glad, 1994). Unfortunately, these methods
parameter estimate may be large, but is always finite because Cii > 0
become rapidly infeasible with increasing model size (Margaria
[see Equation (7)]. Therefore, it is not possible to infer practical non-
et al., 2001; White et al., 2001). Practical non-identifiability cannot
identifiability using asymptotic confidence intervals. We propose a
be detected, since experimental data are disregarded.
definition that is inspired by likelihood-based confidence intervals
Another class of methods aims to detect non-identifiability by
[see Equation (8)]:
flatness of likelihood, using simulated or experimental data. Here,
Definition 1. A parameter estimate θ̂i is practically non- measures of curvature are computed, commonly using a quadratic
identifiable, if the likelihood-based confidence region is infinitely approximation of χ 2 at the estimated optimum θ̂ , e.g. the Hessian
1925
1926
Fig. 3. Black lines display profile likelihood versus parameter. The results for the original dataset are shown in the upper panel. The lower panel shows the
alteration of the profile likelihood, after taking into account the additional data. Calibrated parameter values θ̂ are displayed by gray stars, thresholds for
simultaneous and pointwise 1-σ confidence intervals by upper, respectively, lower dashed lines. Gray parabolas indicate the quadratic approximation used for
asymptotic confidence intervals. They are very flat for the structurally non-identifiable parameters. Discontinuities in the profile likelihood of p3 and p4 in the
upper panel stem from local minima that govern the profile likelihood in remote regions. Parameter values are given in orders of magnitude.
1927
4
2
Table 1. Likelihood-based confidence intervals σ ±,PL derived from the
x1 / nM
x / nM
profile likelihood are compared with asymptotic confidence intervals σ ±,Hess
2
1 derived from the Hessian matrix [see Equation (7)]
2
0 0
0 20 40 60 0 20 40 60
time / min time / min Name θ̂i Non- σ −,PL σ +,PL σ −,Hess σ +,Hess
identifiability
0.6
0.2
x3 / nM
x4 / nM
1928
available, e.g. partial differential equations (PDE) or stochastic Hengl,S. et al. (2007) Data-based identifiability analysis of non-linear dynamical
differential equations (SDE). models. Bioinformatics, 23, 2612.
Jacquez,J. and Greif,P. (1985) Numerical parameter identifiability and estimability:
The approach results in easily interpretable plots of profile
integrating identifiability, estimability, and optimal sampling design. Math. Biosci.,
likelihood versus parameter. It can be automated, but an explicit 77, 201–228.
advantage is that the output might be evaluated visually. This Joshi,M. et al. (2006) Exploiting the bootstrap method for quantifying parameter
gives insight into a complex and high-dimensional parameter confidence intervals in dynamical systems. Metab. Eng., 8, 447–455.
space. Structural non-identifiabilities originating from incomplete Kitano,H. (2005) International alliances for quantitative modeling in systems biology.
Mol. Syst. Biol., 1, 1–2.
observation of the internal model states can be detected. Arising Lecourtier,Y. et al. (1987) Volterra and generating power series approaches to
from limited amount and quality of experimental data, also practical identifiability testing. In Identifiability of Parametric Models. Pergamon Press,
non-identifiabilities can be inferred. Bridging the gap between Oxford.
identifiability and confidence intervals, the profile likelihood allows Lehmann,E. and Leo,E. (1983) Theory of point estimation. Wiley, Hoboken, New Jersey.
Ljung,L. and Glad,T. (1994) On global identifiability for arbitrary model
to derive likelihood-based confidence intervals for each parameter.
parametrizations. Automatica, 30, 265–265.
Functional relations between parameters occurring due to non- MacDonald,N. (1976) Time delay in simple chemostat models. Biotechnol. Bioeng., 18,
identifiabilities can be recovered. The results of the approach can 805–812.
be used on the one hand to design new experiments that efficiently Maiwald,T. and Timmer,J. (2008) Dynamical modeling and multi-experiment fitting
resolve non-identifiability and narrow confidence intervals and on with PottersWheel. Bioinformatics, 24, 2037–2043.
Margaria,G. et al. (2001) Differential algebra methods for the study of the structural
the other hand to facilitate model reduction. Thus, identifiability identifiability of rational function state-space models in the biosciences. Math.
analysis ensures that the model complexity is tailored to the Biosci., 174, 1–26.
information content given by the experimental data. Whether a Meeker,W. and Escobar,L. (1995) Teaching about approximate confidence regions based
model that is not well determined by the experimental data, should on maximum likelihood estimation. Am. Stat., 49, 48–53.
Murphy,S. and van der Vaart,A. (2000) On profile likelihood. J. Am. Stat. Assoc., 95,
be reduced or additional data should be measured depends on the
449–485.
biological issue to be addressed. Neale,M. and Miller,M. (1997) The use of likelihood-based confidence intervals in
The approach was applied to a model of the JAK-STAT signaling genetic models. Behav. Genet., 27, 113–120.
pathway. Non-identifiable parameters were detected, revealing Pohjanpalo,H. (1978) System identifiability based on the power series expansion of the
limitations in the experimental setup. Additional measurements that solution. Math. Biosci., 41, 21–33.
Press,W. et al. (1990) Numerical Recipes: FORTRAN. Cambridge University Press,
efficiently improve parameter identification were suggested and Cambridge.
validated. Royston,P. (2007) Profile likelihood for estimation and confidence intervals. Stata J.,
7, 376.
Schmidt,H. and Jirstrand,M. (2006) Systems biology toolbox for matlab: a
ACKNOWLEDGEMENTS computational platform for research in systems biology. Bioinformatics, 22,
514–515.
The authors thank Seong-Hwan Rho for testing the approach and Swameye,I. et al. (2003) Identification of nucleocytoplasmic cycling as a remote sensor
Andrea Pfeifer for supplying data of the compartment volume. in cellular signaling by databased modeling. Proc. Natl Acad. Sci. USA, 100,
1028–1033.
Funding: German Federal Ministry of Education and Research Timmer,J. et al. (2004) Modeling the nonlinear dynamics of cellular signal transduction.
(HepatoSys 0313074D, LungSys 0315415E, FRISYS 0313921); the Int. J. Bifurcat. Chaos, 14, 2069–2079.
European Union (CancerSys EU-FP7 HEALTH-F4-2008-223188); Vajda,S. et al. (1989a) Qualitative and quantitative identifiability analysis of nonlinear
chemical kinetic models. Chem. Eng. Commun., 83, 191–219.
the Helmholtz Alliance on Systems Biology (SBCancer DKFZ
Vajda,S. et al. (1989b) Similarity transformation approach to identifiability analysis of
I.2); the Excellence Initiative of the German Federal and State nonlinear compartmental models. Math. Biosci., 93, 217–248.
Governments. Venzon,D. and Moolgavkar,S. (1988) A method for computing profile-likelihood-based
confidence intervals. Appl. Stat., 37, 87–94.
Conflict of Interest: none declared. White,L. et al. (2001) The structural identifiability and parameter estimation of a
multispecies model for the transmission of mastitis in dairy cows. Math. Biosci.,
174, 77–90.
REFERENCES Yao,K. et al. (2003) Modeling ethylene/butene copolymerization with multi-site
catalysts: parameter estimability and experimental design. Polym. React. Eng., 11,
Cobelli,C. and DiStefano III,J. (1980) Parameter and structural identifiability concepts 563–588.
and ambiguities: a critical review and analysis. Am. J. Physiol. Regul. Integr. Comp.
Physiol., 239, 7–24.
Ghosh,J. and Samanta,T. (2001) Model selection–an overview. Curr. Sci., 80,
1135–1144.
1929