HFSP Journal
1 Institut Pasteur, 25 rue du Docteur Roux, 75015 Paris, and Université Paris-Sud, CNRS, UMR 8621, 91405 Orsay Cedex, France
2 Institut Pasteur, 25 rue du Docteur Roux, 75015 Paris, France
(Received 22 June 2007; accepted 22 June 2007; published online 25 July 2007; corrected 11 March 2008)
The study of the origin of life covers many areas of expertise and requires the input of various scientific communities. In recent years, this research field has often been viewed as part of a broader agenda under the name of exobiology or astrobiology. In this review, we have somewhat narrowed this agenda, focusing on the origin of modern terrestrial life. The adjective "modern" here means that we do not speculate on different forms of life that could possibly have appeared on our planet, but instead focus on the existing forms (cells and viruses). We briefly present the state of the art on alternative hypotheses, discussing not only the origin of life per se, but also how life evolved to produce the modern biosphere through a succession of steps that we would like to characterize as much as possible. [DOI: 10.2976/1.2759103]
CORRESPONDENCE
P. Forterre: forterre@pasteur.fr
S. Gribaldo: simo@pasteur.fr
that remains to be solved is the origin of RNA, since this is where the bottom-up and top-down approaches meet. We definitely know, from the resolution of the ribosome structure, that modern proteins were invented by
Plausible mechanisms for the formation of the solar system have now been
PERSPECTIVE
Figure 1. Schematic of bottom-up and top-down approaches. Major events discussed in the text are highlighted.
micrometeorites (cosmic dust), could have started accumulating on the surface. For some authors, the conditions for the emergence of life (liquid water, continental crust, atmosphere) were already in place at 4.4-4.3 Ga. However, the habitability of the early Earth was seriously compromised by multiple giant impacts. In particular, around 3.9 Ga the Earth was subjected to an impressive episode of bombardment, called the late heavy bombardment (LHB) (Cohen et al., 2000).
The Late Heavy Bombardment
(more than 100 km and up to 5000 km) that hit the Earth's surface during the LHB [for a recent review, see (Claeys and Morbidelli, 2006)]. This dramatic event could have been triggered by the migration of giant planets that took place after the dissipation of the gaseous circumsolar nebula (Gomes et al., 2005). The LHB may have lasted from 20 to 200 million years, with a frequency of impact that is highly debated (from one every 10,000 years to one every 20 years). Models predict that such impacts would have almost completely resurfaced our planet, leading to evaporation of the oceans, melting of the crust down to at least 1000 m, and loss of the atmosphere. It might be significant that the oldest terrestrial continental crust (Isua,
(Kump, 2005). Oxygen and silicon isotope data from Archaean cherts indicate that ancient oceans may have been warmer than today, with temperatures as high as 70 °C around 3.3 Ga (Knauth, 1998; Robert and Chaussidon, 2006). However, the interpretation of isotopic data remains controversial, since this would imply that hot and acidic Archaean rainwater would have produced intense weathering that is not observed in the paleoweathering record. Furthermore, a hot ocean is difficult to reconcile with a first global glaciation that could have occurred at 2.9 and 2.4 Ga [for a critical review of these data, see (Kasting and Howard, 2006)].
However, it is very difficult to extract kerogens from Archaean rocks, and not all lipids are equally resistant. For example, lipids from archaea are very fragile and have not been found in rocks older than 1.8 Ga (Summons et al., 1988). The oldest biomarker record is the presence of hopanes, lipids that today are distinctive of cyanobacteria, in 2.7 Ga old rocks from Australia (Brocks et al., 1999). The presence of eukaryotic-type steranes in the same ancient rocks (Brocks et al., 1999) is more controversial, since some bacteria can produce sterols as well (Pearson et al., 2003; Tippelt et al., 1998), although not of the complexity of those found by Brocks et al. (Summons et al., 2006).
In conclusion, the fact that the oldest uncontroversial traces of life are only those from 2.6 Ga (Schopf, 2006) leaves open a wide window for the origin of modern life between 3.9 (end of the LHB) and 2.7 Ga. The quest for traces of life in this time interval is a rapidly expanding research field. New drilling projects have now started in order to obtain novel samples of Archaean rocks. Isotopic and chemical techniques are being improved to detect the presence of organic matter with less ambiguity, and new in situ techniques are starting to be applied to the analysis of putative microfossils. Novel and more efficient techniques of lipid extraction will hopefully push back the limit of detection of biomarkers to the early Archaean. In parallel, theoretical models for the early Earth will surely benefit from a better description of known metabolisms (see below) and metabolic consortia, and their current distribution in a wide range of environmental settings.
THE ORIGIN AND EARLY EVOLUTION OF LIFE
Heterotrophic versus autotrophic theories
In the traditional "prebiotic soup" scenario, organic molecules would have first accumulated in the ocean or in smaller water bodies on the early Earth, delivered by extraterrestrial sources (micrometeorites, dust) and/or produced by Miller-type reactions (especially if the early atmosphere was hydrogen rich, see above) (Bada and Lazcano, 2003). The first living systems would have then emerged from the gradual complexification of the prebiotic broth. The authors supporting this heterotrophic theory often argue that prebiotic chemistry is the prolongation on our planet of cosmic chemistry, whose products (e.g., amino acids) indeed overlap with the building blocks of life. For them, the possibility of easily producing, in prebiotic conditions, simple amino acids, purines, sugars, fatty acids, and other small organic molecules essential to modern life is too striking to be fortuitous (de Duve, 2003). Proponents of the prebiotic soup scenario (especially the Bada and Miller school) have in general argued in favor of a slow (gradual accumulation) and cold origin of life (essential to the long-term stability of organic matter).
protometabolism, especially if the RNA world itself originated in the framework of Darwinian evolution between
competing protocells.
Figure 2. Competition between vesicles in the early RNA world (adapted from Chen, 2006). Lipid vesicles containing mineral catalysts (hexagons) and able to incorporate ribose (R) and polyphosphate (PP) grow by capturing lipids from vesicles containing amino acids (AA) only. The growth of vesicles induces a proton gradient (H+) that is used to facilitate the transport of various compounds, followed by the synthesis of small RNA oligomers (crosses). After division, vesicles containing RNA replicators (red crosses) grow at the expense of those containing RNA without self-replicating activity (blue crosses). These grow further using additional RNA (green barrel) to facilitate the transport of small polar molecules.
borate occupies the 2' and 3' positions of the ribose, thus leaving the 5' position available for reactions such as phosphorylation (Li et al., 2005). Borate minerals were probably present in interstellar space and on the early Earth. It was also suggested that ribose, together with purine bases, could have been synthesized in hydrothermal environments on the sea floor (favoring the formose reaction) that could be enriched in borate (Holm et al., 2006). Another recent finding that could be of great importance is that ribose permeates both fatty acid and phospholipid membranes more rapidly than other aldopentoses (Sacerdote and Szostak, 2005).
The formation of nucleosides (ribose+base) is also very difficult to achieve in any prebiotic condition. Interestingly, the use of phosphorylated ribose instead of ribose facilitates the association between the base and the sugar, suggesting that phosphoribose might have been a major prebiotic intermediate [(Orgel, 2004) and references therein]. Future effort should thus be concentrated on the search for catalysts (including
Origin of ribozymes
The polymerization of ribonucleotides in prebiotic conditions has only been achieved using nucleotide monophosphates activated by various amine compounds and using RNA primers. It has been shown that clays (montmorillonite) catalyze the condensation of such activated substrates to form RNA oligomers up to 40-50 nucleotides long [for recent reviews see (Muller, 2006; Ferris, 2006; Huang and Ferris, 2003)]. Importantly, the mineral catalysts increase the ratio of 3'-5' over 2'-5' phosphodiester bonds. A major problem for the establishment of a robust RNA world is the instability of RNA due to the reactive oxygen at the 2' position of the ribose (Forterre et al., 1995; Lazcano and Miller, 1996). RNA can be stabilized by a high concentration of monovalent salts (Hethke et al., 1999; Tehei et al., 2002), but most ribozymes absolutely require millimolar concentrations of divalent salts (Woodson, 2005) which, in contrast, strongly increase RNA degradation at high temperatures (Ginoza et al., 1964). To solve this problem, Vlassov and co-workers have suggested that RNA occurred first in cold environments, where synthesis would have been favored over degradation, an "RNA world on ice" hypothesis (Vlassov et al., 2005). They reported that polymerization of nucleotides, ligation of small RNAs, and other critical prebiotic chemical reactions are indeed stimulated by freezing [(Vlassov et al., 2004) and references therein]. Interestingly, a 3'-5' linkage between nucleotides is the major or even the only product formed under freezing conditions. Freezing probably accelerates some chemical reactions in aqueous solution because of the organization of frozen water and the concentration of reactants. In the "RNA world on ice" scenario, early ribozymes might have survived transport to warmer and wetter environments by virtue of their synthetic power outpacing degradation (Vlassov et al., 2004).
The next problem is the production of polymers of sufficient length to harbor catalytic activity (minimal ribozymes). The smallest known ribozyme is a 7-mer oligonucleotide that can cleave itself at 37 °C [for reviews, see (Muller, 2006; Vlassov et al., 2005)]. A mini-RNA ligase of 29 nucleotides has also been obtained by in vitro selection (see below) (Landweber and Pokrovskaya, 1999). This shows that small ribozymes may support simple reactions of cleavage and ligation of other small RNAs. The production of large RNAs by successive ligation of small RNAs would have opened the way to the emergence of true ribozymes. The repertoire of catalytic activities accessible to RNA has been systematically explored in
At some point, one has to assume that an efficient polymerase was not only able to replicate itself, but also to replicate templates producing catalysts (either ribozymes or peptides) useful for the metabolism of the RNA cell
assumed that the primitive genetic code was simpler (for instance with a two-nucleotide codon and fewer amino acids) and expanded in the course of evolution. Two main theories have been proposed, suggesting either that codon choice was initiated by specific interaction between amino acids and anticodons (stereochemical theories) or that codon choice was set up in parallel with the evolution of the amino acid biosynthetic pathways (historical theories) [for reviews see (Di Giulio, 2005; Ellington et al., 2000; Knight and Landweber, 2000; Wong, 2005; Yarus et al., 2005)]. In any case, the modern genetic code is probably not a "frozen accident," but seems to be optimized to minimize the deleterious consequences of mutations (Vogel, 1998) [for review see (Freeland et al., 2003)]. This indicates that the tendency to increase faithful translation was the major selection pressure that directed the evolution of the genetic code, as suggested early on by Woese (1965). Goldenfeld and co-workers have recently shown from in silico simulation that an optimal code might have become universal in the frame of a communal evolution pervaded by intense horizontal gene transfer of coding sequences and coding system components among coevolving communities with different codes (Vetsigian et al., 2006). If correct, this suggests that mechanisms of gene transfer were operational very early, allowing genetic exchange between RNA-protein cells. Theories about the origin of the genetic code should now also accommodate structural data obtained for modern amino-acyl tRNA synthetases and ribosomes. For instance, from comparative structural analysis, it has been suggested that all modern amino-acyl tRNA synthetases evolved from two proteins whose initial role was to chaperone the tRNA (Ribas de Pouplana and Schimmel, 2001).
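Two of the quantitative points made above about the code can be illustrated with a short sketch: the number of codons available to a doublet versus a triplet code, and the share of single-nucleotide substitutions that leave the encoded amino acid unchanged in the modern standard code. This is only a crude proxy for error minimization, written under our own assumptions: the function names are illustrative, all substitutions are weighted equally, and the code is not compared with alternative codes, which is what a test of optimization would require.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), one letter per codon,
# with codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_count(length):
    """Number of distinct codons of the given length over four bases."""
    return len(BASES) ** length


def synonymous_fraction(position):
    """Fraction of single-base changes at one codon position (0, 1, or 2)
    that leave the encoded amino acid (or stop signal) unchanged."""
    same = total = 0
    for codon, aa in CODON_TABLE.items():
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            same += CODON_TABLE[mutant] == aa
    return same / total


if __name__ == "__main__":
    print("doublet codons:", codon_count(2))
    print("triplet codons:", codon_count(3))
    for pos in range(3):
        fraction = synonymous_fraction(pos)
        print(f"codon position {pos + 1}: {fraction:.2f} of substitutions are synonymous")
```

Running the script prints the two codon counts and one synonymous fraction per codon position; in the standard code, silent substitutions are concentrated at the third position.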
The first proteins were indeed probably short chaperone-like proteins that stabilized ribozymes and increased their catalytic activities. They would also have facilitated the transport of molecules (including nucleic acids) through the membranes of the RNA cells (Jay and Gilbert, 1987). Longer genes and proteins may have originated by RNA recombination producing proteins of increasing size via a multistep combinatorial mechanism under the control of natural selection (de Duve, 2003). Starting from a small number of proteins of small size (corresponding to modern folds), this mechanism would have allowed the extensive exploration of sequence space at each size level. This period ended with the establishment of all modern protein superfamilies by the various combinations of protein folds. Recent advances in comparative and structural genomics have provided fascinating insights on this process [see for instance many recent papers by the group of Koonin (Iyer et al., 2003; Iyer et al., 2004)]. Complex protein enzymes, such as large RNA polymerases, ribonucleotide reductases, and
protect the viral genetic material against defense mechanisms of the infected cell (a direct selection pressure) (Forterre, 2002). Cellular RNA genomes would have then been transformed later on into DNA genomes following the recruitment by RNA cells of viral enzymes to produce and replicate DNA, or by the takeover of RNA cells by DNA viruses living in a carrier state (Forterre, 2005).
The introduction of viruses in the early evolutionary scenario implies that viruses themselves originated at an early stage in the evolution of life. The concept of an ancient viral world was indeed first proposed by scientists who suggested that RNA viruses are relics of the RNA world [see, for instance (Maizels and Weiner, 1994)], and that retroviruses, with their RNA-DNA cycles, could give evidence for the transition from the RNA to the DNA world. This concept is now supported by the existence of viruses harboring homologous capsid proteins that infect cells from different domains (Archaea, Bacteria, Eukarya) (Akita et al., 2007; Bamford et al., 2005), suggesting that capsid proteins originated prior to the last universal common ancestor (LUCA). Several models have thus been recently proposed to explain the origin of viruses in the RNA world (Forterre, 2006). Interestingly, the concept of an ancient viral world implies that both modern RNA and DNA viruses might have preserved ancient molecular features from the pre-LUCA era. The study of viruses (especially the extensive exploration of their diversity) should thus be a major area for research on early life evolution in the next decade.
A major goal of the top-down approaches in the origin-of-life field is to reconstruct the common ancestor of all extant organisms to reach an intermediary stage between the origin of life and the present biosphere. The basic principle of cell division and membrane heredity (Cavalier-Smith, 2001) implies that all modern cells derive from a single cell. This historical entity was called the cenancestor (for common ancestor in Greek), the progenote, or the LUCA. This last term has the advantage of being both neutral (unlike the term progenote, which suggests a very primitive organism) and precise. It clearly states that LUCA should not be confused with the first cell, but was the product of a long period of evolution. Being the "last" means that LUCA was preceded by a long succession of older ancestors.
In this framework, a plethora of cellular lineages that have left no descendants today may have existed before LUCA. It is important to consider that many of these were probably still present at the time of LUCA, and some have probably even coexisted for some time with its descendants, possibly contributing via horizontal gene transfer to some traits present in modern lineages (Fig. 3).
A consensus on the nature of LUCA is far from reached.
For some authors LUCA was a very simple organism, even
possibly acellular (Woese, 1998) (Russell and Martin,
2004),
Figure 3. LUCA was the last bottleneck in a long series of ancestors to the three present-day cellular domains: Archaea,
Bacteria, and Eukarya. Extinct lineages may have coexisted for
some time with the descendants of LUCA, and transferred some
features to them (yellow arrows). The emergence of a universal
code in an earlier bottleneck organism may have been favored by
Another controversial idea is that modern hyperthermophiles (i.e., organisms having an optimal growth temperature above 80 °C) could be the direct descendants of a heat-loving LUCA. Hyperthermophiles indeed appear as early diverging lineages in the rRNA universal tree and have relatively short branches (Stetter, 2006). However, this position might be due to the high guanine-cytosine content of their rRNAs, which could have reduced their rate of evolution (leading to shorter branches and artifactual grouping) (Forterre, 1996). Several attempts have been made to determine putative compositional biases in the rRNA, tRNA, or proteins from LUCA in order to determine the temperature at which these molecules were functional [see, for instance (Galtier et al., 1999; Di Giulio, 2003)]. However, these approaches led to contradictory results and are hampered by the difficulty of reconstructing ancient phylogenies and uncertainties concerning the root of the tree of life (see below). In our opinion, a mesophilic LUCA fits better with the observation that hyperthermophiles are sophisticated organisms that have evolved specific mechanisms to thrive at very high temperatures [for a review see (Forterre and Philippe, 1999a; Xu and Glansdorff, 2002)]. In particular, phylogenomic analyses suggest that reverse gyrase, an atypical DNA topoisomerase present in all hyperthermophiles, was absent in LUCA (Brochier-Armanet and Forterre, 2006; Forterre et al., 2000), whereas lipids adapted to high temperatures are not homologous in Archaea and Bacteria, suggesting a secondary adaptation that occurred independently in each of these domains (Forterre and Philippe, 1999a; Xu and Glansdorff, 2002).
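The compositional quantity invoked above, the guanine-cytosine content of an rRNA, is simple to compute. The following minimal sketch assumes sequences given as plain strings; the function name and the two fragments are made up for illustration and are not real rRNA sequences.

```python
def gc_content(sequence):
    """Return the fraction of G and C among the A, C, G, T, and U bases
    of a DNA or RNA sequence (other characters, such as gaps, are ignored)."""
    counted = [base for base in sequence.upper() if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no A, C, G, T, or U bases")
    return sum(base in "GC" for base in counted) / len(counted)


# Made-up fragments for illustration only; not real rRNA sequences.
gc_rich_fragment = "GGCGCCCGGGCUGCGGCCGC"
at_rich_fragment = "AUUAAGCAUUAUAAGUUAAU"

if __name__ == "__main__":
    for name, seq in [("GC-rich", gc_rich_fragment), ("AU-rich", at_rich_fragment)]:
        print(f"{name} fragment: GC content = {gc_content(seq):.2f}")
```

Applied to aligned rRNA sequences from organisms with different optimal growth temperatures, a measure of this kind is the starting point for the compositional analyses cited above; the sketch itself says nothing about evolutionary rates.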
The minimal set of universal proteins includes a surprisingly small number of proteins that function in DNA replication, lacking in particular a DNA replicase, a primase, and a helicase. This is not due to unrecognized homology, since the proteins performing these functions in Bacteria on one side, and Archaea-Eukaryotes on the other, belong to different protein superfamilies (Bailey et al., 2006; Leipe et al., 1999). To explain this observation, Koonin and colleagues have suggested that LUCA had an RNA genome, but used DNA as a replication intermediate (much like a retrovirus) (Leipe et al., 1999). Alternatively, if LUCA had a DNA genome, the ancestral system might have been replaced in one lineage (probably in Bacteria) by a new system of viral origin (Forterre, 1999). Finally, if LUCA still had a bona fide RNA genome, Forterre suggested that the few universal proteins involved in DNA metabolism were independently introduced by DNA viruses in the three cellular domains (Forterre, 2006). The idea that LUCA still had an RNA genome has been recently boosted by the discovery of mechanisms for the repair of RNA damage and for enhancing the fidelity of RNA transcription and replication. These findings have suggested that RNA-protein cells may have reached a level of sophistication much greater than previously thought (Forterre, 2005; Poole and Logan, 2005).
Most authors assume that LUCA was identical to the last common ancestor of Archaea and Bacteria, either because it is commonly believed that the tree of life is rooted between the Archaea-Eukaryotes on one side and Bacteria on the other, or because of models where Eukaryotes originated from some kind of association between Archaea and Bacteria (Lopez-Garcia and Moreira, 1999; Martin and Muller, 1998; Rivera and Lake, 2004; Wachtershauser, 2006). However, the root of the bacterial tree and the origin of Eukaryotes remain highly controversial (Forterre and Philippe, 1999b; Gribaldo and Philippe, 2002; Poole and Penny, 2007). If the root turned out to be in the eukaryotic branch (Philippe and Forterre, 1999), several features now exclusively present in Eukaryotes could already have been present in LUCA, whereas features common to Archaea and Bacteria might have originated in a lineage common to these two domains. At the moment, there is no definitive argument to conclude whether the archaeal-eukaryal or even the unique eukaryotic features (e.g., the spliceosome and spliceosomal introns) are ancestral or derived. The same can be said for the features that are common to Bacteria and Archaea, such as the superoperons encoding ribosomal proteins. In any case, many puzzling observations that are difficult to fit in a single coherent scenario remain to be explained. The
PERSPECTIVES
ACKNOWLEDGMENTS
REFERENCES
33, 457-465.
Forterre, P (2002). The origin of DNA genomes and DNA replication proteins. Curr. Opin. Microbiol. 5, 525-532.
Forterre, P (2005). The two ages of the RNA world, and the transition to the DNA world: a story of viruses and cells. Biochimie 87, 793-803.
Forterre, P (2006). Three RNA cells for ribosomal lineages and three DNA viruses to replicate their genomes: a hypothesis for the origin of cellular domain. Proc. Natl. Acad. Sci. U.S.A. 103, 3669-3674.
Forterre, P, Bouthier De La Tour, C, Philippe, H, and Duguet, M (2000). Reverse gyrase from hyperthermophiles: probable transfer of a thermoadaptation trait from archaea to bacteria. Trends Genet. 16, 152-154.
Forterre, P, Confalonieri, F, Charbonnier, F, and Duguet, M (1995). Speculations on the origin of life and thermophily: review of available information on reverse gyrase suggests that hyperthermophilic procaryotes are not so primitive. Orig. Life Evol. Biosph. 25, 235-249.
Forterre, P, and Philippe, H (1999a). The last universal common ancestor (LUCA), simple or complex? Biol. Bull. 196, 373-375; discussion 375-377.
Forterre, P, and Philippe, H (1999b). Where is the root of the universal tree of life? BioEssays 21, 871-879.
Freeland, SJ, Knight, RD, and Landweber, LF (1999). Do proteins predate DNA? Science 286, 690-692.
Freeland, SJ, Wu, T, and Keulmann, N (2003). The case for an error minimizing standard genetic code. Orig. Life Evol. Biosph. 33, 457-477.
Fuerst, JA (2005). Intracellular compartmentation in planctomycetes. Annu. Rev. Microbiol. 59, 299-328.
Galtier, N, Tourasse, N, and Gouy, M (1999). A nonhyperthermophilic common ancestor to extant life forms. Science 283, 220-221.