
CHAPTER TWENTY-FOUR

A Practical Guide to Genome-Scale Metabolic Models and Their Analysis


Filipe Santos,* Joost Boele,* and Bas Teusink*

Contents
1. Introduction
2. Genome-Scale Metabolic Models: Their Place in the Spectrum of Modeling Options
2.1. Bottom-up (kinetic) models versus genome-scale metabolic models
2.2. Top-down (biostatistical) models versus genome-scale metabolic models
3. The Art of Making Genome-Scale Metabolic Models
3.1. Generating a draft reconstruction based on the genome sequence
3.2. Identify and resolve errors, gaps, and inconsistencies in the network
3.3. Define external metabolites and the biomass equation
3.4. Validate the model with additional experiments
4. Applications of Genome-Scale Metabolic Models
4.1. Flux balance analysis: The work horse of constraint-based modeling
4.2. FBA predicts rates only through yield maximization
4.3. Using constraint-based modeling for discovery and interpretation: Sensitivity analysis
4.4. Final remarks
References

This chapter and Chapter 4 in Section 7 are focused on large-scale metabolic model reconstruction. The approaches and viewpoints of the authors differ, and the editors would advise readers to consider both chapters together for a broader understanding of the methodologies and issues within this field.

* Amsterdam Institute for Molecules, Medicines and Systems/NISB, VU University Amsterdam, De Boelelaan 1085, Amsterdam, The Netherlands
† Kluyver Centre for Genomics of Industrial Fermentations/Netherlands Consortium Systems Biology, VU University Amsterdam, De Boelelaan 1085, Amsterdam, The Netherlands

Methods in Enzymology, Volume 500, ISSN 0076-6879, DOI: 10.1016/B978-0-12-385118-5.00024-4
© 2011 Elsevier Inc. All rights reserved.


Abstract
Genome-scale metabolic reconstructions and their analysis with constraint-based modeling techniques have gained enormous momentum. Such reconstruction is a natural next step after the sequencing of a genome, as a technique that links top-down systems biology analyses at genome scale with bottom-up systems biology modeling scrutiny. This chapter is aimed at (systems) biologists who have an interest in, but no extensive knowledge of, applying genome-scale metabolic reconstruction and modeling to their organism. Rather than being comprehensive (excellent and extensive reviews exist on every aspect of this field), we give a rather personal account of our experience with the process of reconstruction and modeling. First, we place genome-scale metabolic models in the spectrum of modeling approaches, and rather extensively discuss, for nonexperts, the central concept in constraint-based modeling: the solution space that is bounded through constraints on fluxes. We subsequently provide an overview of the different steps involved in metabolic reconstruction and modeling, pointing to aspects that we found difficult, important, not well enough addressed in the current reviews, or any combination thereof. In this way, we hope that this chapter serves as a practical guide through the field.

1. Introduction
Today's interest in systems biology is largely fuelled by high-throughput techniques that generate large amounts of data. There is a general consensus that functional genomics has enormous potential in the life sciences, in particular, in biotechnology and medicine. How to use these technologies most efficiently, for fundamental understanding, biomarker discovery, or concrete biotech applications, is an area of active research. It is clear that the volume and complexity of the data are becoming too large for biologists to cope with alone, especially when the latter are poorly trained in advanced mathematics and computation (which is still largely the case). So there is an understandable need from the biologist's perspective for help in mining, interpreting, and using the datasets that they collect. Such activities require modeling of one form or another (Ideker and Lauffenburger, 2003). Biostatistics and bioinformatics offer help in the analysis of genome-scale data sets, but they often rely on purely mathematical and statistical analysis. Although extremely useful, such analysis ignores what is often referred to as legacy data, that is, the large body of biological knowledge that is often scattered in the literature and therefore poorly accessible. Moreover, many of the techniques were not designed to incorporate a priori knowledge, even if it is available (Liao et al., 2003). Bottom-up systems biologists, however, construct detailed mechanistic models that aim at a fundamental understanding of systems behavior (Bruggeman and Westerhoff, 2007). To achieve this, they start from the molecular properties of biological components and their interactions in


BOX 24.1 Beginner's Kit for Genome-Scale Metabolic Model Reconstruction and Analysis

The first step of model development is generally obtaining a draft model. Multiple tools can be used for this, with different input demands. These range from only an assembled genome (The SEED) to the combination of multiple genome sequences and their respective curated models (AUTOGRAPH). Depending on the algorithms employed in the initial reconstruction, the list of gene-reaction associations needs to be checked for consistency, gaps, and errors, and must also be combined with the list of external metabolites and the biomass equation. The latter two are generally obtained either from literature or from experimental data. This first (almost) growing model then enters the refinement stage, in which the outputs of a succession of constraint-based analysis techniques available within various software packages (see Table 24.1) are compared with both literature and experimental data. For more detail, please see the main text and/or references to external resources (Fig. 24.1).

[Figure 24.1 diagram labels: Genome; Available models; Literature and experimental data; Model reconstruction; (Draft) model; Model simulations; Assessment of model performance; Manual curation]

Figure 24.1 Overview of draft genome-scale metabolic model generation and iterative refinement cycle.

biological networks, be it metabolic, signaling, or gene regulatory, amongst others. As will be briefly discussed below, both approaches have their limitations. Using genome-scale reconstructions, and their corresponding models, may be considered as a middle-out approach, as they combine -omics data with more traditional modeling strategies. This approach is the focus of this chapter, in particular, its application to metabolic networks. All aspects of genome-scale metabolic models have been extensively reviewed in recent years (Feist and Palsson, 2010; Feist et al., 2009; Hyduke


Table 24.1 Selected list of external resources useful for the generation of genome-scale metabolic models and their analysis

Tool/data source | URL/reference | Description
AUTOGRAPH | www.biomedcentral.com/1471-2105/7/296 | AUTOGRAPH is a semi-automatic approach to accelerate the process of genome-scale metabolic network reconstruction by taking full advantage of already manually curated networks
BiGG | bigg.ucsd.edu | BiGG is a knowledge base of biochemically, genetically, and genomically structured genome-scale metabolic network reconstructions
BioCyc | biocyc.org | BioCyc is a collection of > 1000 pathway/genome databases. Each database in the BioCyc collection describes the genome and metabolic pathways of a single organism
BioMet Toolbox | 129.16.106.142 | The BioMet ToolBox is a web-based resource for analysis of high-throughput data, together with methods for flux analysis (fluxomics) and integration of transcriptome data exploiting the capabilities of metabolic networks described in genome-scale models
BRENDA | www.brenda-enzymes.org | BRENDA is the main collection of enzyme functional data available to the scientific community
COBRA | gcrg.ucsd.edu/Downloads/Cobra_Toolbox | The COBRA Toolbox for Matlab includes implementations of many of the commonly used forms of constraint-based analysis such as FBA, gene deletions, flux variability analysis, sampling, and batch simulations together with tools to read in and manipulate constraint-based models
KEGG | www.genome.jp/kegg/ | KEGG (Kyoto Encyclopedia of Genes and Genomes) is a bioinformatics resource, popular for its visualization capabilities, that links genomes to pathways
OptFlux | www.optflux.org | OptFlux incorporates strain optimization tasks and also allows the use of stoichiometric metabolic models for (i) phenotype simulation of both wild-type and mutant organisms, (ii) metabolic flux analysis, and (iii) pathway analysis through the calculation of elementary flux modes
Pathway Tools | bioinformatics.ai.sri.com/ptools | Pathway Tools is a comprehensive symbolic systems biology software system that supports several use cases in bioinformatics and systems biology
PubMed | www.ncbi.nlm.nih.gov/pubmed | PubMed is a service of the U.S. National Library of Medicine that includes over 19 million citations from MEDLINE and other life science journals for biomedical articles back to the 1950s
The SEED | www.theseed.org | The Model SEED automates as much as possible the development of genome-scale metabolic models, requiring from the user only an assembled genome sequence
YANAsquare | yana.bioapps.biozentrum.uni-wuerzburg.de | YANA is a user-friendly software package for the analysis of metabolic networks
INSD | http://www.insdc.org/ | The International Nucleotide Sequence Databases (INSD) consist of an international collaboration between GenBank, DDBJ, and ENA, resulting in the largest sequence repository

and Palsson, 2010; Liu et al., 2010; Oberhardt et al., 2009; Orth et al., 2010; Teusink and Smid, 2006). In this chapter, therefore, we will try to guide the reader, whom we envisage as a biologist relatively new to systems biology in general, and to genome-scale metabolic models in particular, through the various steps of the modeling process. The account of the many steps


involved is necessarily not comprehensive, and rather personal, based on our experience of the process. It will hopefully help the reader to understand the basic concepts and rationales, while providing many references to other resources where more detailed information can be pursued. We also provide a sort of beginner's kit of software that we found useful at different stages in the reconstruction and modeling process (see Box 24.1).

2. Genome-Scale Metabolic Models: Their Place in the Spectrum of Modeling Options


As seen above, two opposing strategies have been generally employed to model biological systems. The so-called bottom-up approach focuses mostly on the very detailed description of the different components of the system, while the top-down approach looks for (uneducated) correlations between different variables of the system. Genome-scale metabolic models can be seen as a middle-out strategy because they first delimit the genomic pool of the system, mainly through bioinformatic approaches, and from there model the potential interactions of these components. It should be clear that all modeling strategies are good for specific tasks but have limited capabilities for others, and so it is important to understand the possibilities, limitations, and underlying assumptions before embarking on any of them. The direct comparison of genome-scale metabolic models with other modeling approaches places them in the spectrum of modeling options and helps understand their value.

2.1. Bottom-up (kinetic) models versus genome-scale metabolic models


The structure of a kinetic model is actually quite simple. Suppose we have a simple system with two branches (Fig. 24.2). We often assume an environment in which S is some infinite source with a constant concentration (often referred to as an external metabolite; Heinrich and Schuster, 1996; Schuster et al., 2000), for example, glucose in the bloodstream or a nutrient in a chemostat. Similarly, P1 and P2, the products of this pathway, have constant concentrations and are also external metabolites. X is an internal metabolite, the concentration of which depends on the rates of production and consumption. In mathematical terms, this can be described as:

\[ \frac{dX}{dt} = v_1 - v_2 - v_3 \tag{24.1} \]


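[Figure 24.2 diagram: S is converted to X by reaction v1; X branches through reactions v2 and v3 to the products P1 and P2.]

Figure 24.2 Simple metabolic network consisting of three reactions, three external metabolites (S, P1, and P2) that act as source or sink, and one internal metabolite (X).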

The rates of the enzymes, vi, are a function (f) of the kinetic parameters p (e.g., Michaelis–Menten constants, Vmax and Km values) and concentrations of the metabolites, in this case:

\[ v_i = f(p, X, S, P_1, P_2) \tag{24.2} \]

Thus, Eq. (24.1) is a differential equation, the solution of which is a function that describes the time course of the concentration of X, that is,

\[ X(t) = g(X_0, p, S, P_1, P_2) \tag{24.3} \]

The X0 indicates that the actual behavior is dependent on the initial concentration of X. As the rates depend on p, S, P1, and P2, so does X. If X is computed at each time point, Eq. (24.2) can be evaluated to find the rates of all the enzymes in time, that is, vi(t). Thus, such kinetic models can be constructed and validated through time-series of metabolites and flux measurements. Detailed kinetic models can be used to explore the dynamics of a system, as well as the control structure of a pathway (Fell, 1997). Once a model is available, parameters and conditions (such as the concentration of S in our example) can be altered to study biologically relevant properties, such as bistabilities (Veening et al., 2008), oscillations (Goldbeter, 1997), and robustness or homeostasis (Kitano, 2002, 2007), just to mention a few. In a practical sense, this understanding makes it possible to predict which steps to enhance, or which feedbacks to remove, for a particular modification of the pathway's behavior (e.g., enhanced production of a valuable product or improving a diseased state). Even though there are examples of successful applications of these types of kinetic models (Bakker et al., 2010; Hoefnagel et al., 2002), they face some serious limitations. First, kinetic parameters p of all enzymes in a pathway are very rarely available, although databases such as BRENDA (Barthelmes et al., 2007) and Sabio-RK (Rojas et al., 2007) are a tremendous help. However, these kinetic parameters are often measured at different or nonphysiological


conditions, such as at the optimal pH of the enzymes, not the physiological pH. Also, it remains quite controversial to what extent the in vitro kinetics reflects the in vivo kinetics (Teusink et al., 2000). An alternative to the in vitro kinetic parameter determinations is to estimate in vivo kinetic parameters from time courses of metabolites after short timescale perturbations of the metabolic network (Mashego et al., 2007; Theobald et al., 1993; Visser et al., 2004). Parameter estimation is a field by itself and not an easy task. We refer to a recent review on the topic (Banga and Balsa-Canto, 2008). Second, these models necessarily represent relatively small, isolated metabolic pathways. As these pathways are embedded in a large metabolic network, the boundary conditions, that is, the exchanges of information with the rest of the system, become critically important in the success of the model predictions. It was shown that including the boundaries explicitly, even with uncertain parameter values, can significantly improve the predictive power of the model (Liebermeister et al., 2005). Several approaches, in fact, have been described that take the uncertainty in kinetic parameters explicitly into account, generating an ensemble of model outcomes that can be inspected for robust and more uncertain model predictions (Liebermeister and Klipp, 2005, 2006; Steuer et al., 2006; Wang and Hatzimanikatis, 2006). Eventually such new approaches, together with further developments in metabolomics research and computing power, may substantially increase the size of systems for which kinetic models can be constructed by reverse engineering (Jamshidi and Palsson, 2010; Resendis-Antonio, 2009). However, at the moment these approaches scale poorly, and kinetic models of genome size are still a long way off. Genome-scale metabolic models, however, do cover the total metabolic potential that is encoded in the genome of an organism, but this comes at a cost. 
They share with kinetic models the structure, that is, the stoichiometry of all reactions, but they leave out (almost) all of the kinetic details. This is possible because many pathways, such as the one depicted in Fig. 24.2, when analyzed over a sufficiently long time will reach a steady state, that is, a state in which all internal metabolites will be balanced by the producing and consuming reactions. In our example, this means that, in steady state:

\[ \frac{dX}{dt} = v_1 - v_2 - v_3 = 0 \tag{24.4} \]

Equation (24.4) basically states that the concentration of X is constant in time, as it is balanced by its production and consumption rate. When the kinetics of the reactions are known, we can compute the steady state concentration of X, and the steady-state rates through the enzymes, which we then, in steady state, call fluxes:


\[ X(t \to \infty) = X_{ss}(p, S, P_1, P_2) \tag{24.5a} \]

\[ v_{i,ss} = f(X_{ss}, p, S, P_1, P_2) \tag{24.5b} \]
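To make the step from the differential equation (24.1) to the steady state of Eq. (24.4) concrete, the following is a minimal sketch in Python. It assumes irreversible Michaelis-Menten rate laws and uses made-up parameter values; neither the rate laws nor the numbers come from this chapter, and a simple forward Euler scheme stands in for a proper ODE solver.

```python
def mm_rate(vmax, km, conc):
    """Irreversible Michaelis-Menten rate law (an assumed form of f in Eq. 24.2)."""
    return vmax * conc / (km + conc)

# Hypothetical kinetic parameters p and fixed external concentration S.
PARAMS = {
    "vmax1": 1.0, "km1": 0.5,   # reaction 1: S -> X
    "vmax2": 0.8, "km2": 1.0,   # reaction 2: X -> product
    "vmax3": 0.6, "km3": 2.0,   # reaction 3: X -> other product
}
S = 5.0

def rates(x, p=PARAMS, s=S):
    """Return (v1, v2, v3) for the internal concentration x."""
    v1 = mm_rate(p["vmax1"], p["km1"], s)
    v2 = mm_rate(p["vmax2"], p["km2"], x)
    v3 = mm_rate(p["vmax3"], p["km3"], x)
    return v1, v2, v3

def simulate(x0, dt=0.01, t_end=200.0):
    """Integrate dX/dt = v1 - v2 - v3 (Eq. 24.1) with forward Euler steps."""
    x = x0
    for _ in range(int(t_end / dt)):
        v1, v2, v3 = rates(x)
        x += dt * (v1 - v2 - v3)
    return x

x_ss = simulate(x0=0.0)
v1, v2, v3 = rates(x_ss)
print(f"X_ss = {x_ss:.4f}; fluxes v1, v2, v3 = {v1:.4f}, {v2:.4f}, {v3:.4f}")
print(f"mass-balance residual v1 - v2 - v3 = {v1 - v2 - v3:.2e}")
```

With these assumed parameters the simulation settles to the same steady state from different starting concentrations, and the resulting fluxes satisfy the mass balance; the specific flux values, however, depend entirely on the kinetic parameters.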

Equation (24.4) leads to a set of linear equations that link the rates vi with each other, which does not require the parameters p, S, P1, and P2. However, to truly predict the rates vi or the steady-state metabolite concentration from the conditions defined by S, P1, and P2, one does need this kinetic information, as is specified in Eqs. (24.5a) and (24.5b). Via Eq. (24.4), we only look at steady-state stoichiometric interrelationships between (steady-state) fluxes, nothing more. This difference with kinetic modeling is absolutely crucial for understanding the limitations of the analyses of genome-scale metabolic models discussed later on. Imposing steady-state relationships between rates (referred to as mass-balance constraints (Price et al., 2004), for obvious reasons) constrains the possible states of the system, but only with respect to the fluxes through the network; the resulting set of possible flux states is referred to as the solution space (illustrated in Fig. 24.3). The other type of constraint used in genome-scale metabolic modeling is called the capacity constraint. Such constraints also have a link with kinetics, as they represent any limitation that can be imposed on individual rates. These can be measured Vmax values that form an upper limit on the flux through

[Figure 24.3: two panels, (A) and (B), each showing the solution space on axes v1, v2, and v3.]

Figure 24.3 Illustration of the concept of solution space, and how mass-balance and capacity constraints result in a bounded space of possible flux states. (A) The steady-state solution space is two-dimensional (a plane), because there are three unknowns (three fluxes) and one equation: the mass balance of X, that is, Eq. (24.4). This plane stretches out in all directions but is depicted only in the positive quadrant for comparison with Fig. 24.3B. The arrows represent extreme pathways, basis vectors that span the solution space; see Papin et al. (2004) for a review on the topic. (B) Specific capacity constraints on the fluxes v1, v2, and v3 turn the plane into a bounded space within which all feasible flux states should lie. For each flux, we have set capacity constraints, 0 ≤ vi ≤ vi,max.


the reaction catalyzed by the enzyme. More often, a capacity constraint specifies the directionality of a reaction, that is, the assessment of whether a reaction can work in both directions under the physiological range of metabolite concentrations that is assumed (or sometimes measured) in the organism under study (Hoppe et al., 2007; Kummel et al., 2006a,b). Mass-balance and capacity constraints together limit the solution space of all possible flux distributions (Fig. 24.3). The analysis of genome-scale metabolic models, applying mass-balance and capacity constraints, is collectively named constraint-based modeling. Note that a detailed kinetic model will predict a specific steady state with fluxes that lie inside the solution space of the corresponding stoichiometric network (Teusink and Smid, 2006). Leaving out the kinetic details in constraint-based modeling comes at a cost: we can predict the space of feasible states but not the specific state (unless specific assumptions are made that will be outlined later). Also, information about the sensitivity of the steady state to parameter changes, as well as the trajectory by which this state is reached, is not available through stoichiometry alone.
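The two types of constraint can be illustrated with a small Python sketch that tests whether a candidate flux vector lies inside the solution space of the network of Fig. 24.2. The stoichiometric row follows Eq. (24.4); the upper bounds and the candidate flux vectors are hypothetical values chosen only for illustration.

```python
# Stoichiometric matrix for Fig. 24.2: one row per internal metabolite (X),
# one column per reaction (v1, v2, v3), cf. Eq. (24.4).
STOICH = [[1.0, -1.0, -1.0]]

# Hypothetical capacity constraints 0 <= v_i <= v_i,max.
LOWER = [0.0, 0.0, 0.0]
UPPER = [10.0, 6.0, 8.0]

def is_feasible(flux, stoich=STOICH, lower=LOWER, upper=UPPER, tol=1e-9):
    """True if flux satisfies both the mass-balance and the capacity constraints."""
    mass_balanced = all(
        abs(sum(coef * v for coef, v in zip(row, flux))) <= tol
        for row in stoich
    )
    within_capacity = all(
        lo - tol <= v <= hi + tol for lo, v, hi in zip(lower, flux, upper)
    )
    return mass_balanced and within_capacity

candidates = {
    "all flux through v2": [5.0, 5.0, 0.0],
    "flux split over v2 and v3": [5.0, 2.0, 3.0],
    "violates mass balance": [5.0, 2.0, 2.0],
    "violates capacity of v2": [9.0, 7.0, 2.0],
}
for name, flux in candidates.items():
    print(f"{name}: {is_feasible(flux)}")
```

The first two candidates are both feasible, which is the point made above: the constraints delimit a space of possible flux states but do not single out one of them.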

2.2. Top-down (biostatistical) models versus genome-scale metabolic models


Despite the -omics data explosion, it is clear that for true systems-level understanding, the abundance of relevant data, unfortunately, is matched by an abundance of less relevant data. This has to do with the inductive and open-ended nature of the typical -omics experiment. There is nothing wrong with this (well, a little perhaps; Lipton, 2005), and especially when done in a systematic way with good experimental design, there are good methods to turn data into knowledge (Kell, 2004; van der Werf, 2005). Statistics-based data analysis often gives qualitative interaction networks, or candidate components (genes, transcription factors, metabolites) that are scored to be related to the phenomenon under study. For many applications, this may be sufficient because these leads can be followed up by validation experiments. Also, the -omics data can simply be used for diagnostic purposes both in red and in white biotechnology, for example, to indicate a particular limitation or stress during a fermentation process (Knijnenburg et al., 2007; Tai et al., 2005), or to predict likely outcomes of cancer treatment (Kim and Paik, 2010), just to mention a few of the many, many applications of recent years. Attempts to enhance the purely statistical analyses by integrating different data sets and a priori knowledge lie within the realm of integrative bioinformatics. We will not make an attempt to give a full account on that field but focus on where metabolic information was, or can be, used for integration. The construction of genome-scale models starts from the genome: using bioinformatic approaches, the sequenced genome is scanned


for regions that encode proteins with enzymatic activity (Francke et al., 2005; Reed et al., 2006; Thiele and Palsson, 2010). Based on the presence of such genes, and supplemented with known biochemical and physiological information, putative metabolic pathways can be inferred. Thus, genome-scale metabolic models do not only contain reaction information (as was used above in comparison with kinetic models) but also the often many-to-many relationships between genes, proteins, and reactions (Reed et al., 2006). The mapping of genes to proteins and to reactions allows for integration of these levels. Transcriptome data (gene level) can be mapped on metabolic maps in this way, providing a visual, metabolic context for the interpretation of the transcriptome data (Gehlenborg et al., 2010; Kono et al., 2009). By visual inspection, this may give leads to parts of the metabolic network that are regulated. For instance, the need for CO2 in fermentations of Lactobacillus plantarum was identified in this way (Stevens et al., 2008). A computational counterpart of visual inspection was developed as reporter metabolites for proteome or transcriptome data (Patil and Nielsen, 2005). In these analyses, the differential expression of reactions that produce or consume a particular metabolite is scored for each metabolite and compared to an expected score (based on chance alone). Those metabolites whose surrounding reactions change significantly are then reporters for metabolic effects. CO2 would be such a reporter metabolite in the L. plantarum example above. Reporter reactions are computed in a similar vein from metabolome data (Cakir et al., 2006). Recently, a toolbox was developed that allows these types of analyses, called the BioMet Toolbox (Cvijovic et al., 2010). Also, the metabolic association of genes, via their mapping to the reaction network, has been used for functional association tests. 
Thus, genes close to each other on the metabolic map were found to have a stronger correlation in gene expression. Using constraint-based modeling, via the so-called flux coupling analysis (Burgard et al., 2004), this relationship was resolved with a much higher functional resolution (Notebaart et al., 2008, 2009). The key to this higher resolution is the appreciation that two reactions in series in a linear pathway are stoichiometrically fully coupled, whereas two reactions diverging from the same metabolite (such as reactions 2 and 3 in Fig. 24.2) are uncoupled, that is, they can carry flux in steady state independently from each other (Notebaart et al., 2008). Thus, genome-scale metabolic models provide a context for content (Palsson, 2004) and are very useful in -omics data integration. Reconstructions of other networks, such as transcription regulation networks (Herrgard et al., 2004), signal transduction networks (Hyduke and Palsson, 2010), or the translation machinery (Thiele et al., 2009), allow for further extension of top-down analyses in which biological data are integrated in a systematic way.
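The difference between coupled and uncoupled reactions can be illustrated numerically. The sketch below uses a hypothetical four-reaction network (S to A via v1, A to X via v2, and X branching to two products via v3 and v4), so that v1 and v2 are in series while v3 and v4 diverge from the same metabolite. It samples steady-state flux vectors and compares flux ratios; this is only an illustration of the idea, not the optimization-based flux coupling analysis of Burgard et al. (2004).

```python
import random

def sample_steady_state(rng, vmax=10.0):
    """Draw one steady-state flux vector (v1, v2, v3, v4) for the toy network.

    v3 and v4 are chosen freely; the mass balances on A and X then fix
    v2 = v3 + v4 and v1 = v2.
    """
    v3 = rng.uniform(0.0, vmax)
    v4 = rng.uniform(0.0, vmax)
    v2 = v3 + v4
    v1 = v2
    return (v1, v2, v3, v4)

def ratio_range(samples, i, j):
    """Smallest and largest ratio of flux i to flux j over all samples."""
    ratios = [s[i] / s[j] for s in samples if s[j] > 0]
    return min(ratios), max(ratios)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [sample_steady_state(rng) for _ in range(1000)]

lo12, hi12 = ratio_range(samples, 0, 1)
lo34, hi34 = ratio_range(samples, 2, 3)
print(f"v1/v2 ranges from {lo12:.3f} to {hi12:.3f} (fully coupled)")
print(f"v3/v4 ranges from {lo34:.3f} to {hi34:.3f} (uncoupled)")
```

A constant ratio across all sampled steady states is the signature of full coupling, whereas the ratio of the two diverging reactions varies over orders of magnitude.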


3. The Art of Making Genome-Scale Metabolic Models


The step-by-step instructions for making a genome-scale reconstruction have been published recently (Feist et al., 2009; Thiele and Palsson, 2010), and so we will forego a detailed description of these steps in favor of placing emphasis on specific issues that we have experienced to be critical, difficult, easily overlooked, or a combination of these. The steps toward a reconstruction are (see Feist et al., 2009; Thiele and Palsson, 2010, with many illustrations, or a flow chart in Francke et al., 2005; Henry et al., 2010):

1. generate a draft reconstruction based on the genome sequence
2. identify and resolve errors, gaps, and inconsistencies in the network
3. define external metabolites and the biomass equation
4. validate the model with additional experiments

3.1. Generating a draft reconstruction based on the genome sequence


Tools and methods exist that take either a sequence or an annotated genome and produce a first draft of a metabolic reconstruction automatically. One such system is Pathway Tools (Karp et al., 2010), which underlies the BioCyc collections including the famous EcoCyc database (Keseler et al., 2009). It takes annotated genomes and identifies the metabolic genes based on EC code and name matching. Pathway Tools also includes algorithms for automatic gap filling. In our experience, the information in this resource is wonderful, but the automatically generated output of Pathway Tools is a rather fragmented set of pathways that needs extensive curation (Teusink et al., 2005b). Also, the system is not explicitly designed for modeling but rather for making an inventory (encyclopedia) of metabolic pathways and their regulation. The SEED, a pipeline system for metabolic reconstruction, has recently been made publicly available (Henry et al., 2010). This system uses as input a sequenced genome and produces an SBML (Hucka et al., 2003) model that is able to produce biomass (Feist and Palsson, 2010), one of the hallmarks of a completed reconstruction. However, we agree with Feist et al. (2009), and have stressed ourselves (Francke et al., 2005), that extensive manual curation cannot be fully replaced by automated methods. This is simply because there are too many important organism-specific choices that confer their own idiosyncrasies and set organisms apart from one another. Expert domain knowledge about the organism under study is required to make these choices. With this in mind, we ourselves have developed an automated method that uses orthology prediction between genes of a query genome and genes


from already existing genome-scale metabolic models, to copy the gene–protein–reaction associations (Notebaart et al., 2006). The philosophy of this AUTOGRAPH method is that information from the manual curation process of the existing genome-scale metabolic models is reused as much as possible. Issues resolved during manual curation relate to (i) reaction stoichiometries of specific reactions not found in the databases (or with wrong stoichiometries, still not uncommon); (ii) choices in cofactor usage; and (iii) annotations of genes on the basis of gap filling, extensive bioinformatic analyses, or experimental results that have not found their way into the databases (yet). It does not remove the need for manual curation, but it clearly accelerates the curation process, in our experience. We expect the same to be true for the SEED pipeline, and the two methods are to a large extent complementary.
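The idea of reusing curated associations through orthology can be sketched in a few lines of Python. The gene identifiers, the reaction names, and the simplification that any one associated gene suffices for a reaction are all hypothetical; this is not the AUTOGRAPH implementation, only an illustration of the principle.

```python
# Hypothetical curated reference model: reaction -> set of reference genes.
REFERENCE_GPR = {
    "PGI": {"ref_g1"},
    "PFK": {"ref_g2", "ref_g3"},
    "LDH": {"ref_g4"},
}

# Hypothetical ortholog predictions: query gene -> reference gene.
ORTHOLOGS = {
    "qry_0010": "ref_g1",
    "qry_0042": "ref_g3",
    "qry_0100": "ref_g9",  # ortholog that is not part of the reference model
}

def transfer_associations(reference_gpr, orthologs):
    """Copy reaction associations to query genes via predicted orthologs.

    Returns the draft associations and the reference reactions for which
    no query ortholog was found (candidates for manual curation).
    """
    ref_to_query = {}
    for query_gene, ref_gene in orthologs.items():
        ref_to_query.setdefault(ref_gene, set()).add(query_gene)

    draft, unsupported = {}, []
    for reaction, ref_genes in reference_gpr.items():
        query_genes = set()
        for ref_gene in ref_genes:
            query_genes |= ref_to_query.get(ref_gene, set())
        if query_genes:
            draft[reaction] = query_genes
        else:
            unsupported.append(reaction)
    return draft, unsupported

draft_gpr, needs_curation = transfer_associations(REFERENCE_GPR, ORTHOLOGS)
print("draft associations:", draft_gpr)
print("no ortholog found for:", needs_curation)
```

Reactions that end up in the second list are not necessarily absent from the query organism; they mark places where manual curation has to decide.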

3.2. Identify and resolve errors, gaps, and inconsistencies in the network
After the initial draft of the model, manual curation involves going through the draft gene–protein–reaction associations to check for omissions, wrong assignments, gaps, and inconsistencies (for a review on gap filling, see Breitling et al., 2008). One insightful inconsistency that we have found (Francke et al., 2005) is the annotation of metA in L. plantarum as homoserine succinyl CoA transferase, while this organism lacks a TCA cycle and hence cannot make succinyl CoA. All databases nevertheless agree on this function for metA, presumably because it is a member of a distinct orthologous group that includes the Escherichia coli enzyme, which does take succinyl CoA. However, the protein from B. subtilis, also in this orthologous cluster, has been experimentally shown to take acetyl CoA, but somehow this result did not find its way into BRENDA or any other database. So, sequence similarity, even one-to-one orthology, is not a guarantee for function transfer. Interestingly, the Model SEED has kept the succinyl CoA-dependent annotation and hence the gap, but allows biomass production through methionine transport, even though domain experts will know that L. plantarum grows without methionine (Teusink et al., 2005a). This teaches us that it is both wise and necessary not to trust the databases blindly (when tested on students, it was striking how much they trust information from the internet).
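Gaps of the metA type can be flagged automatically by listing internal metabolites that are consumed but never produced, or produced but never consumed. The Python sketch below does this for a hypothetical, heavily simplified network fragment (cofactors omitted, reaction names invented); it only points to candidate gaps, and deciding what they mean remains a curation task.

```python
# Hypothetical fragment of a draft network: reaction -> (substrates, products).
# Cofactors are omitted to keep the example small.
REACTIONS = {
    "homoserine_supply": ({"aspartate"}, {"homoserine"}),
    "homoserine_succinyltransferase": (
        {"homoserine", "succinyl_CoA"}, {"O_succinylhomoserine"}),
    "downstream_methionine_steps": ({"O_succinylhomoserine"}, {"methionine"}),
}

# External metabolites act as sources or sinks and need not be balanced.
EXTERNAL = {"aspartate", "methionine"}

def find_gaps(reactions, external):
    """Return internal metabolites that are never produced or never consumed."""
    produced, consumed = set(), set()
    for substrates, products in reactions.values():
        consumed |= substrates
        produced |= products
    internal = (produced | consumed) - external
    never_produced = sorted(m for m in internal if m not in produced)
    never_consumed = sorted(m for m in internal if m not in consumed)
    return never_produced, never_consumed

never_produced, never_consumed = find_gaps(REACTIONS, EXTERNAL)
print("consumed but never produced:", never_produced)
print("produced but never consumed:", never_consumed)
```

In this fragment, succinyl CoA is reported as consumed but never produced, mirroring the inconsistency described above.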

3.3. Define external metabolites and the biomass equation


For a reconstruction to become a model, there are a number of additional steps that involve a minimum of experimental data without which the model is of poor predictive value. They come in three flavors:


(i) Biomass composition: biomass composition must be defined, ideally for the different experimental conditions under study. The Supplementary Material in Oliveira et al. (2005), on the biomass of L. lactis, gives an extensive and useful account of the data and computation that is required. Protein, RNA, and lipid content change with growth rate and may affect the model simulations.

(ii) Data on the bioenergetics: this pertains to information on stoichiometry of pumps and the respiratory chain (P/O ratio), and on maintenance energy and ATP requirement for growth, often referred to as YATP (Tempest and Neijssel, 1984). These data can be estimated by carefully monitoring product formation and biomass yield under a sufficient number of substrate/growth rate regimes; for excellent reviews on the techniques involved, see van Gulik and Heijnen (1995), Vanrolleghem et al. (1996), and Vanrolleghem and Heijnen (1998). Surprisingly, only very recently were these parameters carefully estimated for E. coli (Taymaz-Nikerel et al., 2010). But there is an important assumption here: measurement and inclusion of energetic parameters are based on full coupling between ATP production by catabolism and ATP utilization through anabolic processes. Without this assumption, we cannot do the calculations, and the predictions will be imprecise (Teusink et al., 2006).

(iii) Essential and possible nutrients: growth requirement data (e.g., on specific auxotrophies for vitamins and amino acids, or possible carbon, nitrogen, and sulfur sources) are used to decide on the biological relevance of a gap; for example, if a vitamin is not essential, then a gap in the vitamin biosynthesis pathways may be benign (Teusink et al., 2005b). These data can also be used to determine the potential external metabolites, that is, sources for the network, and the necessary transport systems to take them up. 
Although it is not often pointed out, sinks are as important as sources, as they also require transport reactions, even for simple diffusion across the membrane. Easily overlooked sinks include by-products of biosynthesis pathways (e.g., folate biosynthesis produces glycolaldehyde) and diffusible compounds such as water and CO2. Because genome-scale models are not exclusively bottom-up but largely based on genome content, they tend to contain more (futile) cycles, parallel routes, and dead-end pathways, which a biochemist or biochemical engineer would never put into his or her bottom-up model. These cycles are nevertheless potentially present in the network (Pinchuk et al., 2010; Teusink et al., 2006) and are either thermodynamically infeasible (substrate cycles; Beard et al., 2002) or probably, and hopefully, regulated in such a way that they tend not to run in circles too much (in the case of futile cycles; if they do, however, we have uncoupling of ATP production and consumption by growth, and our bioenergetic parameters go down the drain). The dead-end pathways point to missing biochemical knowledge (or to reductive evolution, see e.g., Hols et al., 2005) and can therefore be valuable leads to new discoveries, especially in combination with the data on substrate use and product formation. Metabolic reconstruction and modeling therefore not only rely on gene annotation but can also help in functional annotation by identifying functions that have to be there.
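Dead ends of the simplest kind can be found directly from the stoichiometric matrix: a metabolite that can be produced but never consumed (or the reverse) cannot carry flux at steady state. The following is a minimal sketch of such a check; the function name and the three-metabolite network are hypothetical and serve only as illustration.

```python
import numpy as np

def find_dead_end_metabolites(S, lower, upper, names):
    """Return metabolites that can only be produced or only be consumed.

    S     : stoichiometric matrix (metabolites x reactions)
    lower : lower flux bound per reaction (negative means it can run backward)
    upper : upper flux bound per reaction (positive means it can run forward)
    names : metabolite names, one per row of S
    """
    S = np.asarray(S, dtype=float)
    forward = np.asarray(upper, dtype=float) > 0
    backward = np.asarray(lower, dtype=float) < 0
    dead_ends = []
    for i, name in enumerate(names):
        row = S[i]
        produced = np.any((row > 0) & forward) or np.any((row < 0) & backward)
        consumed = np.any((row < 0) & forward) or np.any((row > 0) & backward)
        if not (produced and consumed):
            dead_ends.append(name)
    return dead_ends

# Hypothetical network: uptake (-> X), r1 (X -> P), r2 (X -> D), export (P ->).
# D is produced by r2 but has no consuming or exporting reaction.
S = [[1, -1, -1,  0],   # X
     [0,  1,  0, -1],   # P
     [0,  0,  1,  0]]   # D
lower = [0, 0, 0, 0]
upper = [10, np.inf, np.inf, np.inf]

dead_ends = find_dead_end_metabolites(S, lower, upper, ["X", "P", "D"])
print(dead_ends)
```

This check flags only the metabolite at the end of a dead-end pathway. Reactions further upstream that are blocked as a consequence require a flux-based test, for example minimizing and maximizing each flux.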

3.4. Validate the model with additional experiments


Once the model can form biomass, that is, once it can provide all the necessary biomass components, validation of the model requires comparison with additional experimental data. One such data set consists of the phenotypes of deletion strains; unfortunately, comprehensive sets are usually available only for model organisms such as E. coli and yeast. Their metabolic reconstructions have been extensively tested against the deletion strain phenotypes, with relatively high success rates on the order of 80–90% (Covert et al., 2004; Kuepfer et al., 2005). A more accessible type of data is physiological data on growth rates, yields, substrate utilization, and product formation rates. Chemostat data are especially useful, as the chemostat is the best experimental setup for establishing growth in steady state. The model should be compatible with the measured fluxes or, at a higher level of validation, should be able to predict some of the fluxes as a function of some input fluxes (note that we cannot predict all fluxes de novo, only the steady-state dependencies between fluxes, Eq. (24.4)). Oberhardt et al. (2009) give a good overview of the different models that exist and the extent to which they have been validated through experimental data.
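The comparison with deletion phenotypes can be sketched as follows. The example blocks one reaction at a time in a small hypothetical network, asks whether biomass can still be formed, and scores the agreement with "observed" phenotypes. The network, the made-up phenotypes, and the function names are all assumptions for illustration; a real analysis would also map genes to reactions, which is omitted here. We use SciPy's `linprog` as the LP solver.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not from this chapter): metabolites X and P.
# Columns: uptake (-> X), routeA (X -> P), routeB (X -> P), biomass (P ->).
REACTIONS = ["uptake", "routeA", "routeB", "biomass"]
S = np.array([[1, -1, -1,  0],    # balance of X
              [0,  1,  1, -1]])   # balance of P
BOUNDS = [(0, 10), (0, None), (0, None), (0, None)]

def max_biomass(bounds):
    """Maximal steady-state biomass flux (linprog minimizes, so negate)."""
    c = np.zeros(len(REACTIONS))
    c[REACTIONS.index("biomass")] = -1.0
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    return -res.fun if res.success else 0.0

def predict_deletion(reaction, threshold=1e-6):
    """Predict 'viable' or 'lethal' after blocking a single reaction."""
    bounds = list(BOUNDS)
    bounds[REACTIONS.index(reaction)] = (0, 0)
    return "viable" if max_biomass(bounds) > threshold else "lethal"

# Made-up 'observed' phenotypes, for illustration only.
observed = {"uptake": "lethal", "routeA": "viable", "routeB": "lethal"}
predicted = {r: predict_deletion(r) for r in observed}
agreement = sum(predicted[r] == observed[r] for r in observed) / len(observed)
print(predicted, f"agreement = {agreement:.0%}")
```

In this toy case the model calls the routeB deletion viable because routeA can take over, whereas the made-up observation says lethal. Mismatches of this kind are the informative ones, as they point to regulation or capacity limits that the stoichiometric model does not contain.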

4. Applications of Genome-Scale Metabolic Models


For an increasing number of microorganisms, a genome-scale metabolic model is available at the quality level that we have discussed above (Oberhardt et al., 2009; Reed et al., 2006). The group of Palsson developed a resource for such metabolic models (see Schellenberger et al., 2010). These models can be used for a number of purposes; for a recent account of their applications, see Oberhardt et al. (2009). A large set of methods, collectively called constraint-based modeling techniques, has been developed in recent years to serve these purposes. An excellent overview of these techniques is given in Price et al. (2004). In brief, successful uses of genome-scale metabolic models include:

(i) exploration of gene lethality (Blank et al., 2005; Covert et al., 2004) and/or synthetic lethality (Costanzo et al., 2010)
(ii) definition of metabolic context for integrative bioinformatics (Kharchenko et al., 2004; Notebaart et al., 2008; Patil and Nielsen, 2005)
(iii) the study of pathway evolution (Pal et al., 2006; Papp et al., 2004)
(iv) prediction of metabolic engineering strategies (Burgard et al., 2003; Bro and Nielsen, 2004; Park et al., 2010)
(v) prediction of adaptive evolution outcomes (Fong et al., 2003, 2005; Ibarra et al., 2002; Teusink et al., 2009)
(vi) interpretation of fermentation data (Goffin et al., 2010; Teusink et al., 2006)

4.1. Flux balance analysis: The workhorse of constraint-based modeling
In many of the applications of genome-scale models, flux balance analysis (FBA) is used; it is by far the most popular constraint-based modeling technique (Gianchandani et al., 2010). For a more extensive primer on FBA, see Orth et al. (2010). FBA uses optimization of a certain objective function to find a subset of optimal states in the large solution space of possible states that is shaped by the mass balance and capacity constraints (see above; Orth et al., 2010; Price et al., 2004). We purposely say "subset of optimal states" because the flux value of the objective function is unique, but the pathway distribution that gives this optimal flux value may not be: many roads may lead to Rome. If one is interested in the flux distributions, and not only in the optimal value of the objective function, one needs to do a flux variability analysis (FVA) to check the uniqueness of the flux distributions (Mahadevan and Schilling, 2003). FVA maximizes and minimizes each reaction rate in the network at the optimal flux value: the resulting range of fluxes indicates how flexible the network is in reaching the optimum.
FBA basically tries to find a state (or set of states) that maximizes (or minimizes) some desired flux or linear combination of fluxes. The approach is an optimization technique, and hence it implies that the modeled system (e.g., an organism) behaves optimally (as a result of evolution). This may limit the understanding of how microorganisms behave in experiments, as they may not have had the time to adapt to the environment. There are three ways to deal with the optimality issue.
First, we can allow cells to adapt to their (constant) environment by laboratory evolution experiments (see application (v)). On the positive side, FBA offers an opportunity to predict evolutionary engineering outcomes, which may be guided by in silico analysis of specific deletions to enhance productivity (Burgard et al., 2003; Cvijovic et al., 2010). It also provides optimal yields against which experimentally obtained yields can be benchmarked.
Second, alternatives to FBA have been proposed to circumvent the need for evolution, notably MOMA (minimization of metabolic adjustments; Segre et al., 2002). MOMA requires a well-defined reference state and then tries to predict the response of the network to a perturbation (a change in external conditions, deletion of a reaction) by assuming that the network is robust. Technically, it minimizes the (Euclidean) distance between the reference flux distribution and the new solution space (which has been changed by the perturbation). In our experience, the major problem with MOMA is that it weighs all fluxes equally with respect to biological relevance, which is not very likely to be correct. Moreover, the response is dominated by high fluxes. Thus, in our hands MOMA does not produce accurate predictions of specific fluxes, but it has been useful for judging whether a certain perturbation is likely to cause severe growth problems even though the optimal FBA solution seems fine (e.g., see Teusink et al., 2009).
Third, the optimality issue can just be ignored, simply because one does not care about a specific, quantitative prediction. For example, one may want to know whether the model can produce biomass, or some compound, and not necessarily how much. This goes for (synthetic) lethality predictions and for predictions of the evolution of metabolic networks.
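The difference between a unique optimum and non-unique flux distributions, and the FVA procedure that exposes it, can be sketched on a small example. The network below is hypothetical (two equivalent routes from X to P) and is not taken from this chapter; SciPy's `linprog` stands in for whatever LP solver one prefers, and the tolerance is an arbitrary choice.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network with two equivalent routes.
# Columns: uptake (-> X), routeA (X -> P), routeB (X -> P), export (P ->).
reactions = ["uptake", "routeA", "routeB", "export"]
S = np.array([[1, -1, -1,  0],    # balance of X
              [0,  1,  1, -1]])   # balance of P
bounds = [(0, 10), (0, None), (0, None), (0, None)]
objective = np.array([0.0, 0.0, 0.0, 1.0])   # maximize the export flux
b_eq = np.zeros(S.shape[0])

# Step 1: FBA. linprog minimizes, so the objective is negated.
fba = linprog(-objective, A_eq=S, b_eq=b_eq, bounds=bounds, method="highs")
optimum = -fba.fun

# Step 2: FVA. Keep the objective at its optimum (within a small tolerance)
# and minimize and maximize every flux in turn.
tol = 1e-9
A_ub = -objective.reshape(1, -1)            # -objective.v <= -(optimum - tol)
b_ub = np.array([-(optimum - tol)])

def flux_range(i):
    """Minimal and maximal value of flux i at the optimal objective value."""
    c = np.zeros(len(reactions))
    c[i] = 1.0
    lo = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=S, b_eq=b_eq,
                 bounds=bounds, method="highs")
    hi = linprog(-c, A_ub=A_ub, b_ub=b_ub, A_eq=S, b_eq=b_eq,
                 bounds=bounds, method="highs")
    return lo.fun, -hi.fun

ranges = {name: flux_range(i) for i, name in enumerate(reactions)}
for name, (lo, hi) in ranges.items():
    print(f"{name:8s} min = {lo:6.2f}  max = {hi:6.2f}")
```

The objective value is fixed, and so are the uptake and export fluxes, but each of the two parallel routes can carry anything between zero and the full flux. Any single FBA solution reports just one of these equivalent distributions.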

4.2. FBA predicts rates only through yield maximization


If one does want to optimize to obtain specific flux distributions, however, one requires a sensible objective function, and here some confusion and controversy have arisen in the literature. In a recent study, different objective functions were tested for the extent to which they could predict actual flux states under different conditions (Schuetz et al., 2007). This study demonstrated that different objective functions were needed to describe the flux states under different conditions. Notably, under energy (or carbon) limitation, optimization of biomass yield appeared to be the best objective function. This is in line with earlier studies in which the biomass formation function was taken as the objective to predict functional states (Edwards et al., 2001). However, many microorganisms display overflow metabolism, which is a wasteful lifestyle in terms of ATP generation and, consequently, biomass yield. Yet it is observed even in glucose-limited chemostats above a certain critical dilution rate; examples are ethanol fermentation in Saccharomyces cerevisiae (van Dijken et al., 1993), acetate formation in E. coli (Vemuri et al., 2006), and lactate formation in lactic acid bacteria (Thomas et al., 1979). This behavior cannot be predicted by FBA, because FBA predicts rates through optimal yields. It can be described only by including additional capacity constraints on the oxidative phosphorylation pathways in the corresponding metabolic networks (Famili et al., 2003; Varma and Palsson, 1994).
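The effect of such a capacity constraint can be illustrated with a deliberately small model. In the sketch below, glucose is converted to ATP either by respiration or by fermentation; the ATP yields, the respiratory capacity, and the function name are assumed values chosen for illustration, not parameters from any of the cited models.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative ATP yields per glucose (assumed values, not from this chapter).
Y_RESP, Y_FERM = 16.0, 2.0

def max_atp(glucose_max, resp_capacity=None):
    """Maximize ATP production for fluxes [uptake, respiration, fermentation].

    resp_capacity=None leaves respiration unbounded (plain FBA);
    a number imposes the additional capacity constraint.
    """
    c = -np.array([0.0, Y_RESP, Y_FERM])       # linprog minimizes
    A_eq = np.array([[1.0, -1.0, -1.0]])       # uptake = respiration + fermentation
    bounds = [(0, glucose_max), (0, resp_capacity), (0, None)]
    res = linprog(c, A_eq=A_eq, b_eq=[0.0], bounds=bounds, method="highs")
    return dict(zip(["uptake", "respiration", "fermentation"], res.x))

for glucose_max in (3.0, 8.0):
    plain = max_atp(glucose_max)
    capped = max_atp(glucose_max, resp_capacity=5.0)
    print(f"uptake <= {glucose_max}: fermentation without cap = "
          f"{plain['fermentation']:.1f}, with cap = {capped['fermentation']:.1f}")
```

Without the cap, the optimum is purely respiratory at any uptake rate. With the cap, fermentation appears only once glucose uptake exceeds the respiratory capacity, which mimics the onset of overflow metabolism above a critical rate.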

It is important to fully appreciate the point that FBA predicts rates through optimal yields, and not rates directly. If we take our example from Fig. 24.2, and suppose we want to maximize the production rate of P1 from S, we may formulate the FBA problem as:

$$
\max v_2 \quad \text{given:} \quad
\begin{cases}
\text{mass balance constraint:} & v_1 - v_2 - v_3 = 0 \\
\text{capacity constraints:} & 0 \le v_1 \le 10 \\
 & -\infty \le v_2 \le \infty \\
 & 0 \le v_3 \le \infty
\end{cases}
\tag{24.6}
$$

The solution is easy in this case: v1 should be 10, v2 should be 10, and v3 should be 0. But note that the rate of v2 is fully dictated by the constraint on v1. We could write the rate of v2 as:

$$
v_2 = v_1 \cdot Y_{P1,S} \tag{24.7}
$$

where $Y_{P1,S}$ is the yield of P1 on S (1 in this case). As v1 is fixed by the capacity constraint, the only way to maximize v2 is to maximize the yield. In this simple case the yield is also fixed, but in larger systems it is not, whereas a capacity constraint is always required to bound the problem. For larger, real genome-scale models, this argument therefore remains valid (Schuster et al., 2007): as the solution space is bounded by an input flux, FBA finds an optimal objective flux by optimizing the yield on the incoming substrate. Hence, if in a model there are two options to make ATP from glucose, fermentation (low ATP yield) and respiration (high ATP yield), optimization of the ATP production rate will necessarily be achieved by respiration. Therefore, when applying biomass optimization in FBA, the underlying biological assumption is that biomass yield maximization was the strategy through which the organism has reached its fitness. It is clear that there are also other strategies that lead to fitness, and hence the validity of FBA with biomass yield as objective function is organism and condition specific (Fong et al., 2003; Ibarra et al., 2002; Schuetz et al., 2007). Understanding the basic assumptions of FBA, we can turn the problem around and change the conditions in such a way that organisms are likely to grow efficiently, thereby increasing the predictive power of FBA. Growth yield maximization on poor substrates must have been the growth strategy under the conditions where FBA predictions were successful (Fong et al., 2003; Teusink et al., 2009). Growth yield maximization is not the best strategy under high glucose concentrations, which explains why adaptation on glucose led the cells away from the line of optimality predicted by FBA (Ibarra et al., 2002).
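The small FBA problem of Eq. (24.6) can be solved numerically in a few lines. The sketch below reads the capacity constraints as 0 ≤ v1 ≤ 10, v2 unbounded, and v3 ≥ 0, which is our reading of the equation and is consistent with the solution stated in the text; SciPy's `linprog` is used as a generic LP solver, and the helper name is ours.

```python
import numpy as np
from scipy.optimize import linprog

def solve_toy_fba(v1_max):
    """Maximize v2 subject to v1 - v2 - v3 = 0; fluxes are ordered [v1, v2, v3]."""
    c = np.array([0.0, -1.0, 0.0])             # linprog minimizes, so use -v2
    A_eq = np.array([[1.0, -1.0, -1.0]])       # mass balance
    b_eq = np.array([0.0])
    bounds = [(0, v1_max), (None, None), (0, None)]   # None means unbounded
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    return res.x

results = {}
for v1_max in (10.0, 20.0):
    v1, v2, v3 = solve_toy_fba(v1_max)
    results[v1_max] = (v1, v2, v3, v2 / v1)    # last entry is the yield of P1 on S
    print(f"v1_max = {v1_max}: v1 = {v1:.1f}, v2 = {v2:.1f}, yield = {v2 / v1:.1f}")
```

Doubling the capacity constraint on v1 doubles the optimal v2 while the yield stays at 1: the optimization selects the yield, and the rate follows from the input bound, as expressed by Eq. (24.7).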

4.3. Using constraint-based modeling for discovery and interpretation: Sensitivity analysis
Even under conditions where FBA is not predictive, such as high glucose concentrations, FBA can be very useful when combined with experimental data on input and output fluxes (which could not be predicted by FBA). Using those measured fluxes as constraints, one can do a number of interesting things. We will discuss two of them.
First, we can impose the measured flux data and then perform FVA to assess which parts of the network flux distribution are resolved by the measured fluxes and which are not. One basically maximizes and minimizes each reaction in the network, given the measured values as constraints (note that we used FVA before to test the uniqueness of an FBA solution: in that case, fluxes were minimized and maximized with the optimal objective value as constraint). This gives a range of flux values for each reaction: the larger the range, the less well the measured fluxes determine this flux. This analysis could be considered a genome-scale analogue of the well-known metabolic flux analysis (Maertens and Vanrolleghem, 2010), but applicable to large, underdetermined systems. In addition, the ranges of fluxes obtained this way may include products of fermentation that were not measured but may be required to obtain the proper mass (or redox) balances. In this way, we were pointed to degradation products of amino acids in a recent retentostat study (Goffin et al., 2010).
Second, growth rate optimization under measured flux constraints will indicate which medium compounds could further contribute to growth. This can be done through sensitivity analysis, which comes for free during the optimization procedure. The sensitivity coefficients are called reduced costs, and they quantify to what extent the objective flux could be improved by changing a capacity constraint. For example, the reduced cost of reaction v1 in the FBA problem of Eq. (24.6) is 1. We demonstrated the use of reduced costs by analyzing growth limitations of L. plantarum on a medium containing 3 carbon sources and 18 amino acids (Teusink et al., 2006). It turned out that some amino acids contributed to growth via ATP production, even though the mechanism for this ATP production was not immediately obvious. In this way, we discovered a previously unknown transhydrogenase activity in the degradation of branched-chain amino acids. This is a good example of how genome-scale metabolic models can lead to the discovery of new metabolic capacities. In a later study, this approach was used to understand the puzzling observation that some amino acids were being produced under conditions of extremely slow growth (Goffin et al., 2010). We can safely say that without the model, we would not have been able to solve these puzzles.
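The statement that the reduced cost of v1 in Eq. (24.6) equals 1 can be checked numerically. The sketch below takes the bounds of Eq. (24.6) to be 0 ≤ v1 ≤ 10, v2 unbounded, and v3 ≥ 0 (our reading of the equation), reads the sensitivity from the dual information that SciPy's HiGHS-based `linprog` returns, and cross-checks it by relaxing the bound and re-solving. Sign conventions for reduced costs differ between solvers, so the sign flip below is specific to this setup, and the `marginals` field assumes a SciPy version that exposes HiGHS dual values.

```python
from scipy.optimize import linprog

def solve(v1_max):
    """FBA problem of Eq. (24.6) with fluxes [v1, v2, v3]; maximize v2."""
    return linprog(c=[0.0, -1.0, 0.0],                 # minimize -v2
                   A_eq=[[1.0, -1.0, -1.0]], b_eq=[0.0],
                   bounds=[(0, v1_max), (None, None), (0, None)],
                   method="highs")

res = solve(10.0)

# linprog reports d(objective)/d(upper bound) for the minimized objective (-v2);
# flipping the sign gives the sensitivity of the maximized flux v2.
reduced_cost_v1 = -res.upper.marginals[0]

# Cross-check by brute force: relax the bound by one unit and re-solve.
finite_difference = (-solve(11.0).fun) - (-res.fun)

print(f"reduced cost of v1 = {reduced_cost_v1:.2f}, "
      f"finite difference = {finite_difference:.2f}")
```

In a genome-scale model the same dual values are available for every exchange reaction after a single optimization, which is what makes it practical to screen all medium compounds at once.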

4.4. Final remarks


There is a general consensus that functional genomics has enormous potential in metabolic engineering and biotechnology. Systems biology, as a new interdisciplinary branch of science, is rapidly gaining momentum. The functional genomics toolbox has allowed a global view and thereby forced many (molecular) biologists to focus more on the behavior of the system than on the behavior of its individual components. Systems biology aims at understanding and ultimately predicting such system-level behavior in terms of the underlying molecular components and their interactions. It has model building at its center: models to integrate data, models to interpret the data, and models to make predictions. It is our view that such models will play a central role in the advancement of biology in this century, simply because we cannot grasp the complexity of biological systems by intuition alone. Genome-scale metabolic models are a first approach to combine bottom-up modeling with top-down, genomics data-driven modeling. They can be used for exploration, for testing scenarios, for scanning conditions, and for eliminating impossibilities. They can also be used for interpretation and integration of high-throughput data. But these models are a middle-out approach, and probably only a temporary acceptance of our limitations. The challenge for the future will be to bring these models to life, that is, to make them dynamic, to put them under physicochemical constraints, and ultimately to make them subject to our manipulation by understanding (exactly) how the control over flux and metabolite levels is distributed. Only then can we truly claim to be able to rationally engineer strains via computer-aided design.

REFERENCES
Bakker, B. M., et al. (2010). Systems biology from micro-organisms to human metabolic diseases: The role of detailed kinetic models. Biochem. Soc. Trans. 38, 1294–1301.
Banga, J. R., and Balsa-Canto, E. (2008). Parameter estimation and optimal experimental design. Essays Biochem. 45, 195–209.
Barthelmes, J., et al. (2007). BRENDA, AMENDA and FRENDA: The enzyme information system in 2007. Nucleic Acids Res. 35, D511–D514.
Beard, D. A., et al. (2002). Energy balance for analysis of complex metabolic networks. Biophys. J. 83, 79–86.
Blank, L. M., et al. (2005). Large-scale 13C-flux analysis reveals mechanistic principles of metabolic network robustness to null mutations in yeast. Genome Biol. 6, R49.
Breitling, R., et al. (2008). New surveyor tools for charting microbial metabolic maps. Nat. Rev. Microbiol. 6, 156–161.
Bro, C., and Nielsen, J. (2004). Impact of 'ome' analyses on inverse metabolic engineering. Metab. Eng. 6, 204–211.
Bruggeman, F. J., and Westerhoff, H. V. (2007). The nature of systems biology. Trends Microbiol. 15, 45–50.
Burgard, A. P., et al. (2003). Optknock: A bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol. Bioeng. 84, 647–657.
Burgard, A. P., et al. (2004). Flux coupling analysis of genome-scale metabolic network reconstructions. Genome Res. 14, 301–312.
Cakir, T., et al. (2006). Integration of metabolome data with metabolic networks reveals reporter reactions. Mol. Syst. Biol. 2, 50.
Costanzo, M., et al. (2010). The genetic landscape of a cell. Science 327, 425–431.
Covert, M. W., et al. (2004). Integrating high-throughput and computational data elucidates bacterial networks. Nature 429, 92–96.
Cvijovic, M., et al. (2010). BioMet Toolbox: Genome-wide analysis of metabolism. Nucleic Acids Res. 38(Suppl.), W144–W149.
Edwards, J. S., et al. (2001). In silico predictions of Escherichia coli metabolic capabilities are consistent with experimental data. Nat. Biotechnol. 19, 125–130.
Famili, I., et al. (2003). Saccharomyces cerevisiae phenotypes can be predicted by using constraint-based analysis of a genome-scale reconstructed metabolic network. Proc. Natl. Acad. Sci. USA 100, 13134–13139.
Feist, A. M., and Palsson, B. O. (2010). The biomass objective function. Curr. Opin. Microbiol. 13, 344–349.
Feist, A. M., et al. (2009). Reconstruction of biochemical networks in microorganisms. Nat. Rev. Microbiol. 7, 129–143.
Fell, D. (1997). Understanding the Control of Metabolism. Portland Press, London.
Fong, S. S., et al. (2003). Description and interpretation of adaptive evolution of Escherichia coli K-12 MG1655 by using a genome-scale in silico metabolic model. J. Bacteriol. 185, 6400–6408.
Fong, S. S., et al. (2005). Parallel adaptive evolution cultures of Escherichia coli lead to convergent growth phenotypes with different gene expression states. Genome Res. 15, 1365–1372.
Francke, C., et al. (2005). Reconstructing the metabolic network of a bacterium from its genome. Trends Microbiol. 13, 550–558.
Gehlenborg, N., et al. (2010). Visualization of omics data for systems biology. Nat. Methods 7, S56–S68.
Gianchandani, E. P., et al. (2010). The application of flux balance analysis in systems biology. Wiley Interdiscip. Rev. Syst. Biol. Med. 2, 372–382.
Goffin, P., et al. (2010). Understanding the physiology of Lactobacillus plantarum at zero growth. Mol. Syst. Biol. 6, 413.
Goldbeter, A. (1997). Biochemical Oscillations and Cellular Rhythms: The Molecular Bases of Periodic and Chaotic Behaviour. Cambridge University Press, Cambridge, United Kingdom.
Heinrich, R., and Schuster, S. (1996). The Regulation of Cellular Systems. Chapman & Hall, New York.
Henry, C. S., et al. (2010). High-throughput generation, optimization and analysis of genome-scale metabolic models. Nat. Biotechnol. 28, 977–982.
Herrgard, M. J., et al. (2004). Reconstruction of microbial transcriptional regulatory networks. Curr. Opin. Biotechnol. 15, 70–77.
Hoefnagel, M. H., et al. (2002). Metabolic engineering of lactic acid bacteria, the combined approach: Kinetic modelling, metabolic control and experimental analysis. Microbiology 148, 1003–1013.
Hols, P., et al. (2005). New insights in the molecular biology and physiology of Streptococcus thermophilus revealed by comparative genomics. FEMS Microbiol. Rev. 29, 435–463.
Hoppe, A., et al. (2007). Including metabolite concentrations into flux balance analysis: Thermodynamic realizability as a constraint on flux distributions in metabolic networks. BMC Syst. Biol. 1, 23.
Hucka, M., et al. (2003). The systems biology markup language (SBML): A medium for representation and exchange of biochemical network models. Bioinformatics 19, 524–531.
Hyduke, D. R., and Palsson, B. O. (2010). Towards genome-scale signalling-network reconstructions. Nat. Rev. Genet. 11, 297–307.
Ibarra, R. U., et al. (2002). Escherichia coli K-12 undergoes adaptive evolution to achieve in silico predicted optimal growth. Nature 420, 186–189.
Ideker, T., and Lauffenburger, D. (2003). Building with a scaffold: Emerging strategies for high- to low-level cellular modelling. Trends Biotechnol. 21, 255–262.
Jamshidi, N., and Palsson, B. O. (2010). Mass action stoichiometric simulation models: Incorporating kinetics and regulation into stoichiometric models. Biophys. J. 98, 175–185.
Karp, P. D., et al. (2010). Pathway Tools version 13.0: Integrated software for pathway/genome informatics and systems biology. Brief. Bioinform. 11, 40–79.
Kell, D. B. (2004). Metabolomics and systems biology: Making sense of the soup. Curr. Opin. Microbiol. 7, 296–307.
Keseler, I. M., et al. (2009). EcoCyc: A comprehensive view of Escherichia coli biology. Nucleic Acids Res. 37, D464–D470.
Kharchenko, P., et al. (2004). Filling gaps in a metabolic network using expression information. Bioinformatics 20(Suppl. 1), I178–I185.
Kim, C., and Paik, S. (2010). Gene-expression-based prognostic assays for breast cancer. Nat. Rev. Clin. Oncol. 7, 340–347.
Kitano, H. (2002). Systems biology: A brief overview. Science 295, 1662–1664.
Kitano, H. (2007). Towards a theory of biological robustness. Mol. Syst. Biol. 3, 137.
Knijnenburg, T. A., et al. (2007). Exploiting combinatorial cultivation conditions to infer transcriptional regulation. BMC Genomics 8, 25.
Kono, N., et al. (2009). Pathway projector: Web-based zoomable pathway browser using KEGG atlas and Google Maps API. PLoS One 4, e7710.
Kuepfer, L., et al. (2005). Metabolic functions of duplicate genes in Saccharomyces cerevisiae. Genome Res. 15, 1421–1430.
Kummel, A., et al. (2006a). Putative regulatory sites unraveled by network-embedded thermodynamic analysis of metabolome data. Mol. Syst. Biol. 2, 0034.
Kummel, A., et al. (2006b). Systematic assignment of thermodynamic constraints in metabolic network models. BMC Bioinformatics 7, 512.
Liao, J. C., et al. (2003). Network component analysis: Reconstruction of regulatory signals in biological systems. Proc. Natl. Acad. Sci. USA 100, 15522–15527.
Liebermeister, W., and Klipp, E. (2005). Biochemical networks with uncertain parameters. IEE Proc. Syst. Biol. 152, 97–107.
Liebermeister, W., and Klipp, E. (2006). Bringing metabolic networks to life: Convenience rate law and thermodynamic constraints. Theor. Biol. Med. Model. 3, 41.
Liebermeister, W., et al. (2005). Biochemical network models simplified by balanced truncation. FEBS J. 272, 4034–4043.
Lipton, P. (2005). Testing hypotheses: Prediction and prejudice. Science 307, 219–221.
Liu, L., et al. (2010). Use of genome-scale metabolic models for understanding microbial physiology. FEBS Lett. 584, 2556–2564.
Maertens, J., and Vanrolleghem, P. A. (2010). Modelling with a view to target identification in metabolic engineering: A critical evaluation of the available tools. Biotechnol. Prog. 26, 313–331.
Mahadevan, R., and Schilling, C. H. (2003). The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metab. Eng. 5, 264–276.
Mashego, M. R., et al. (2007). Metabolome dynamic responses of Saccharomyces cerevisiae to simultaneous rapid perturbations in external electron acceptor and electron donor. FEMS Yeast Res. 7, 48–66.
Notebaart, R. A., et al. (2006). Accelerating the reconstruction of genome-scale metabolic networks. BMC Bioinformatics 7, 296.
Notebaart, R. A., et al. (2008). Co-regulation of metabolic genes is better explained by flux coupling than network distance. PLoS Comput. Biol. 4, e26.
Notebaart, R. A., et al. (2009). Asymmetric relationships between proteins shape genome evolution. Genome Biol. 10, R19.
Oberhardt, M. A., et al. (2009). Applications of genome-scale metabolic reconstructions. Mol. Syst. Biol. 5, 320.
Oliveira, A. P., et al. (2005). Modelling Lactococcus lactis using a genome-scale flux model. BMC Microbiol. 5, 39.
Orth, J. D., et al. (2010). What is flux balance analysis? Nat. Biotechnol. 28, 245–248.
Pal, C., et al. (2006). Chance and necessity in the evolution of minimal metabolic networks. Nature 440, 667–670.
Palsson, B. (2004). Two-dimensional annotation of genomes. Nat. Biotechnol. 22, 1218–1219.
Papin, J. A., et al. (2004). Comparison of network-based pathway analysis methods. Trends Biotechnol. 22, 400–405.
Papp, B., et al. (2004). Metabolic network analysis of the causes and evolution of enzyme dispensability in yeast. Nature 429, 661–664.
Park, J. H., et al. (2010). Fed-batch culture of Escherichia coli for L-valine production based on in silico flux response analysis. Biotechnol. Bioeng. 108, 934–946.
Patil, K. R., and Nielsen, J. (2005). Uncovering transcriptional regulation of metabolism by using metabolic network topology. Proc. Natl. Acad. Sci. USA 102, 2685–2689.
Pinchuk, G. E., et al. (2010). Constraint-based model of Shewanella oneidensis MR-1 metabolism: A tool for data analysis and hypothesis generation. PLoS Comput. Biol. 6, e1000822.
Price, N. D., et al. (2004). Genome-scale models of microbial cells: Evaluating the consequences of constraints. Nat. Rev. Microbiol. 2, 886–897.
Reed, J. L., et al. (2006). Towards multidimensional genome annotation. Nat. Rev. Genet. 7, 130–141.
Resendis-Antonio, O. (2009). Filling kinetic gaps: Dynamic modelling of metabolism where detailed kinetic information is lacking. PLoS One 4, e4967.
Rojas, I., et al. (2007). Storing and annotating of kinetic data. In Silico Biol. 7, S37–S44.
Schellenberger, J., et al. (2010). BiGG: A Biochemical Genetic and Genomic knowledgebase of large scale metabolic reconstructions. BMC Bioinformatics 11, 213.
Schuetz, R., et al. (2007). Systematic evaluation of objective functions for predicting intracellular fluxes in Escherichia coli. Mol. Syst. Biol. 3, 119.
Schuster, S., et al. (2000). A general definition of metabolic pathways useful for systematic organization and analysis of complex metabolic networks. Nat. Biotechnol. 18, 326–332.
Schuster, S., et al. (2007). Is maximization of molar yield in metabolic networks favoured by evolution? J. Theor. Biol. 252, 497–504.
Segre, D., et al. (2002). Analysis of optimality in natural and perturbed metabolic networks. Proc. Natl. Acad. Sci. USA 99, 15112–15117.
Steuer, R., et al. (2006). Structural kinetic modelling of metabolic networks. Proc. Natl. Acad. Sci. USA 103, 11868–11873.
Stevens, M. J., et al. (2008). Improvement of Lactobacillus plantarum aerobic growth as directed by comprehensive transcriptome analysis. Appl. Environ. Microbiol. 74, 4776–4778.
Tai, S. L., et al. (2005). Two-dimensional transcriptome analysis in chemostat cultures. Combinatorial effects of oxygen availability and macronutrient limitation in Saccharomyces cerevisiae. J. Biol. Chem. 280, 437–447.
Taymaz-Nikerel, H., et al. (2010). Genome-derived minimal metabolic models for Escherichia coli MG1655 with estimated in vivo respiratory ATP stoichiometry. Biotechnol. Bioeng. 107, 369–381.
Tempest, D. W., and Neijssel, O. M. (1984). The status of YATP and maintenance energy as biologically interpretable phenomena. Annu. Rev. Microbiol. 38, 459–486.
Teusink, B., and Smid, E. J. (2006). Modelling strategies for the industrial exploitation of lactic acid bacteria. Nat. Rev. Microbiol. 4, 46–56.
Teusink, B., et al. (2000). Can yeast glycolysis be understood in terms of in vitro kinetics of the constituent enzymes? Testing biochemistry. Eur. J. Biochem. 267, 5313–5329.
Teusink, B., et al. (2005). In silico reconstruction of the metabolic pathways of Lactobacillus plantarum: Comparing predictions of nutrient requirements with those from growth experiments. Appl. Environ. Microbiol. 71, 7253–7262.
Teusink, B., et al. (2006). Analysis of growth of Lactobacillus plantarum WCFS1 on a complex medium using a genome-scale metabolic model. J. Biol. Chem. 281, 40041–40048.
Teusink, B., et al. (2009). Understanding the adaptive growth strategy of L. plantarum by in silico optimisation. PLoS Comput. Biol. 5, e1000410.
Theobald, U., et al. (1993). In vivo analysis of glucose-induced fast changes in yeast adenine nucleotide pool applying a rapid sampling technique. Anal. Biochem. 214, 31–37.
Thiele, I., and Palsson, B. O. (2010). A protocol for generating a high-quality genome-scale metabolic reconstruction. Nat. Protoc. 5, 93–121.
Thiele, I., et al. (2009). Genome-scale reconstruction of Escherichia coli's transcriptional and translational machinery: A knowledge base, its mathematical formulation, and its functional characterization. PLoS Comput. Biol. 5, e1000312.
Thomas, T. D., et al. (1979). Change from homo- to heterolactic fermentation by Streptococcus lactis resulting from glucose limitation in anaerobic chemostat cultures. J. Bacteriol. 138, 109–117.
van der Werf, M. J. (2005). Towards replacing closed with open target selection strategies. Trends Biotechnol. 23, 11–16.
van Dijken, J. P., et al. (1993). Kinetics of growth and sugar consumption in yeasts. Antonie Van Leeuwenhoek 63, 343–352.
van Gulik, W. M., and Heijnen, J. J. (1995). A metabolic network stoichiometry analysis of microbial growth and product formation. Biotechnol. Bioeng. 48, 681–698.
Vanrolleghem, P. A., and Heijnen, J. J. (1998). A structured approach for selection among candidate metabolic network models and estimation of unknown stoichiometric coefficients. Biotechnol. Bioeng. 58, 133–138.
Vanrolleghem, P. A., et al. (1996). Validation of a metabolic network for Saccharomyces cerevisiae using mixed substrate studies. Biotechnol. Prog. 12, 434–448.
Varma, A., and Palsson, B. O. (1994). Stoichiometric flux balance models quantitatively predict growth and metabolic by-product secretion in wild-type Escherichia coli W3110. Appl. Environ. Microbiol. 60, 3724–3731.
Veening, J. W., et al. (2008). Bistability, epigenetics, and bet-hedging in bacteria. Annu. Rev. Microbiol. 62, 193–210.
Vemuri, G. N., et al. (2006). Overflow metabolism in Escherichia coli during steady-state growth: Transcriptional regulation and effect of the redox ratio. Appl. Environ. Microbiol. 72, 3653–3661.
Visser, D., et al. (2004). Analysis of in vivo kinetics of glycolysis in aerobic Saccharomyces cerevisiae by application of glucose and ethanol pulses. Biotechnol. Bioeng. 88, 157–167.
Wang, L., and Hatzimanikatis, V. (2006). Metabolic engineering under uncertainty. I: Framework development. Metab. Eng. 8, 133–141.
