
Nitrosopumilus maritimus genome reveals unique mechanisms for nitrification and autotrophy in globally distributed marine crenarchaea
C. B. Walker (a,b), J. R. de la Torre (a), M. G. Klotz (c), H. Urakawa (a), N. Pinel (a), D. J. Arp (d), C. Brochier-Armanet (e), P. S. G. Chain (f,g,h), P. P. Chan (i), A. Gollabgir (j), J. Hemp (k), M. Hügler (l,m), E. A. Karr (n), M. Könneke (o), M. Shin (f,g), T. J. Lawton (p), T. Lowe (i), W. Martens-Habbena (a), L. A. Sayavedra-Soto (d), D. Lang (f,g), S. M. Sievert (q), A. C. Rosenzweig (p), G. Manning (j), and D. A. Stahl (a,1)
(a) Department of Civil and Environmental Engineering, University of Washington, Seattle, WA 98195
(b) Geosyntec Consultants, Seattle, WA 98101
(c) Department of Biology, University of Louisville, Louisville, KY 40292
(d) Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331
(e) Université de Provence Aix-Marseille I, Laboratoire de Chimie Bactérienne, Centre National de la Recherche Scientifique Unité Propre de Recherche, Marseille, 13402 France
(f) Biosciences Division, Lawrence Livermore National Laboratory, Livermore, CA 94550
(g) Microbial Program, Joint Genome Institute, Walnut Creek, CA 94598
(h) Center for Microbial Ecology, Michigan State University, East Lansing, MI 48824
(i) Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064
(j) Razavi Newman Center for Bioinformatics, Salk Institute for Biological Studies, La Jolla, CA 92037
(k) School of Chemical Sciences, University of Illinois, Urbana, IL 61801
(l) Leibniz-Institut für Meereswissenschaften, Kiel, 24105 Germany
(m) Water Technology Center, Karlsruhe, 76139 Germany
(n) Department of Botany and Microbiology, University of Oklahoma, Norman, OK 73019
(o) Institut für Chemie und Biologie des Meeres, Universität Oldenburg, Oldenburg, 26129 Germany
(p) Departments of Biochemistry, Molecular Biology and Cell Biology, and Chemistry, Northwestern University, Evanston, IL 60208
(q) Biology Department, Woods Hole Oceanographic Institution, Woods Hole, MA 02543

Ammonia-oxidizing archaea are ubiquitous in marine and terrestrial environments and are now thought to be significant contributors to carbon and nitrogen cycling. The isolation of Candidatus Nitrosopumilus maritimus strain SCM1 provided the opportunity for linking its chemolithotrophic physiology with a genomic inventory of the globally distributed archaea. Here we report the 1,645,259-bp closed genome of strain SCM1, revealing highly copper-dependent systems for ammonia oxidation and electron transport that are distinctly different from those of known ammonia-oxidizing bacteria. Consistent with in situ isotopic studies of marine archaea, the genome sequence indicates N. maritimus grows autotrophically using a variant of the 3-hydroxypropionate/4-hydroxybutyrate pathway for carbon assimilation, while maintaining limited capacity for assimilation of organic carbon. This unique instance of archaeal biosynthesis of the osmoprotectant ectoine and an unprecedented enrichment of multicopper oxidases, thioredoxin-like proteins, and transcriptional regulators point to an organism responsive to environmental cues and adapted to handling reactive copper and nitrogen species that likely derive from its distinctive biochemistry. The conservation of N. maritimus gene content and organization within marine metagenomes indicates that the unique physiology of these specialized oligophiles may play a significant role in the biogeochemical cycles of carbon and nitrogen.
ammonia oxidation | marine microbiology | archaea | nitroxyl

Marine Group I archaea are among the most abundant microorganisms in the global oceans (1–3). These organisms were originally discovered through ribosomal RNA gene sequencing (3, 4); recent metagenomic, biogeochemical, and microbiological studies established their capacity to oxidize ammonia, thus linking this abundant microbial clade to one of the key steps of the global nitrogen cycle (5–9). For a century following the discovery of autotrophic ammonia oxidizers, only Bacteria were thought to catalyze this generally rate-limiting transformation in the two-step process of nitrification (10). Despite recent enrichment of mesophilic as well as thermophilic ammonia-oxidizing archaea (AOA) (6, 11, 12), only a single Group I-related strain, isolated from a gravel inoculum from a tropical marine aquarium, has thus far been successfully obtained in pure culture (7).
The isolation of Nitrosopumilus maritimus strain SCM1 ultimately confirmed an archaeal capacity for chemoautotrophic growth on ammonia. More detailed characterization of this strain revealed cytological and physiological adaptations critical for life in an oligotrophic open ocean environment, most notably one of the highest substrate affinities yet observed (13). Among characterized ammonia oxidizers, only N. maritimus is capable of growing at the extremely low concentrations of ammonia generally found in the open ocean (7, 13). This strain therefore provided an excellent opportunity to investigate the core genetic inventory for ammonia-based chemoautotrophy by Group I crenarchaea.

www.pnas.org/cgi/doi/10.1073/pnas.0913533107

The gene content and gene order of N. maritimus are highly similar to those of environmental populations represented in marine bacterioplankton metagenomes, confirming on a genomic level its close relationship to many oceanic crenarchaea. Thus, an evaluation of the genomic inventory of N. maritimus should offer a framework to identify features shared among ammonia-oxidizing Group I crenarchaea, resolve physiological diversity among AOA, and refine understanding of their ecology in relationship to the larger assemblage of marine archaea, not all of which are ammonia oxidizers. In support of this expectation, the physiological and genomic profiles together show that many of the nonextreme archaea identified in metagenomic studies, and currently assigned to the Crenarchaeota kingdom, are AOA that contribute to global carbon and nitrogen cycling, possibly determining rates of nitrification in a variety of environments (6, 8, 9, 13).
Results and Discussion
Primary Sequence Characteristics. N. maritimus strain SCM1 contains a single chromosome of 1,645,259 bp encoding 1,997 predicted genes and no extrachromosomal elements or complete prophage sequences (Table 1). No unambiguous origin of replication could be determined on the basis of local gene content or GC skew, as commonly observed for other archaeal genomes (14). Approximately 61% of the N. maritimus open-reading

Author contributions: C.B.W., J.R.d.l.T., P.S.G.C., and D.A.S. designed research; C.B.W.,
J.R.d.l.T., M.G.K., H.U., N.P., C.B-A., P.P.C., A.G., M.H., E.A.K., M.K., M.S., T.L., W.M-H.,
M.S., D.L., S.M.S., A.C.R., G.M., and D.A.S. performed research; C.B.W. and J.R.d.l.T. contributed new reagents/analytic tools; C.B.W., J.R.d.l.T., M.G.K., H.U., N.P., D.J.A., C.B.-A.,
P.P.C., A.G., J.H., M.H., E.A.K., M.K., T.J.L., T.L., W.M.-H., L.A.S.-S., S.M.S., A.C.R., G.M., and
D.A.S. analyzed data; and C.B.W., J.R.d.l.T., M.G.K., H.U., N.P., C.B.-A., J.H., M.H., E.A.K.,
M.K., T.L., S.M.S., A.C.R., G.M., and D.A.S. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Data deposition: The sequence reported in this paper has been deposited in the NCBI
database (accession no. NC_010085).
1 To whom correspondence should be addressed. E-mail: dastahl@u.washington.edu.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.0913533107/-/DCSupplemental.

PNAS Early Edition | 1 of 6

MICROBIOLOGY

Edited by David Karl, University of Hawaii, Honolulu, HI, and approved April 2, 2010 (received for review December 6, 2009)

Table 1. Genome features of N. maritimus SCM1, C. symbiosum, sequenced AOB, and crenarchaeal genome fragments

Genome                                  Size (bp)   Percent coding   GC content   ORFs    ORF density (ORF/kb)   Avg. ORF length (bp)   Standard tRNAs   rRNAs   Plasmids
Nitrosopumilus maritimus SCM1           1,645,259   91.90%           34.20%       1,997   1.19                   757                    44               1       0
Cenarchaeum symbiosum                   2,045,086   91.20%           57.70%       2,066   0.986                  924                    45               1       0
Nitrosococcus oceani ATCC 19707         3,481,691   86.80%           50.30%       3,186   0.889                  964                    45               2       1
Nitrosomonas europaea ATCC 19718        2,812,094   88.40%           50.70%       2,628   0.876                  1,009                  41               1       0
Nitrosomonas eutropha C91               2,661,057   85.60%           48.50%       2,578   0.952                  890                    41               1       2
Nitrosospira multiformis ATCC 25196     3,184,243   85.60%           53.90%       2,827   0.86                   980                    43               1       3
Genome fragment 4B7                     39,297      89.10%           34.40%       41      0.992                  898                    0                1       NA
Genome fragment DeepAnt-EC39            33,347      86.10%           34.10%       41      1.17                   737                    0                1       NA
Genome fragment 74A4                    43,902      84.00%           32.60%       51      1.12                   753                    0                1       NA

NA, not analyzed.

frames (ORFs) could be assigned to clusters of orthologous groups of proteins (COGs), a lower percentage than for genomes of ammonia-oxidizing bacteria (AOB) (Table S1) but similar to Cenarchaeum symbiosum (15). The genome possesses a relatively high coding density (91.9%), with a larger fraction dedicated to energy production/conservation, coenzyme transport/metabolism, and translation genes than other characterized Crenarchaeota, but similar to two common genera of photoautotrophic marine Bacteria, Prochlorococcus and Synechococcus.
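The GC-skew criterion mentioned above for locating an origin of replication can be illustrated with a minimal sketch. It assumes the common definition of skew, (G − C)/(G + C), computed over fixed, non-overlapping windows; the window size and the function names are illustrative choices, not the parameters used for the SCM1 analysis. In many bacterial genomes the cumulative skew shows a clear minimum near the origin; the text reports that no such unambiguous signal was found for N. maritimus.

```python
def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def gc_skew(seq, window=1000):
    """Return (per-window skews, cumulative skews) for non-overlapping windows.

    Skew for one window is (G - C) / (G + C); windows without G or C score 0.
    """
    seq = seq.upper()
    skews, cumulative, running = [], [], 0.0
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        skews.append(skew)
        running += skew
        cumulative.append(running)
    return skews, cumulative
```

Plotting the cumulative list against window position gives the usual skew diagram; a flat or noisy curve corresponds to the "no unambiguous origin" outcome described in the text.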
Energy Metabolism. The stoichiometry of ammonia oxidation to nitrite is similar to that of characterized aerobic, obligate chemolithoautotrophic AOB (13), yet the contributing biochemistry is distinct. All AOB share a common pathway where hydroxylamine, produced by an ammonia monooxygenase (AMO), is
oxidized to nitrite by a heme-rich hydroxylamine oxidoreductase
(HAO) complex; the oxidation of hydroxylamine supplies electrons to both the AMO and a typical electron transport chain
composed of cytochrome c proteins. N. maritimus lacks genes
encoding a recognizable AOB-like HAO complex and pertinent
cytochrome c proteins, indicating an alternative archaeal pathway.
The numerous copper-containing proteins, including multicopper
oxidases and small blue copper-containing proteins (similar to
plasto-, halo-, and sulfocyanins), suggest an alternative electron
transfer mechanism (Table S2). These predicted periplasmic
proteins likely serve functionally similar roles to soluble cytochrome c proteins in other Archaea, including Natronomonas
pharaonis and thermoacidophiles such as Sulfolobus (16). This
apparent reliance on copper for redox reactions is a major divergence from the iron-based electron transfer system of AOB.
The N. maritimus genome contains genes coding for six soluble
periplasmic multicopper oxidase (MCO) proteins: two nearly
identical NO-forming nitrite reductase proteins (NirK;
Nmar_1259 and -1667), each with three cupredoxin domains;
two NcgA-like (nirK cluster gene A) MCOs (Nmar_1131 and
-1663) with two cupredoxin domains; one MCO (Nmar_1136)
with three cupredoxin domains; and one MCO (Nmar_1354)
with two domains fused to a blue copper-containing protein
(Table S2). Two (Nmar_1131 and Nmar_1663) of the three
genes that are classified as belonging to the emerging class of
two-domain MCOs (2dMCOs) resemble the general architecture
of the 2dMCO NcgA present in Nitrosomonas europaea. Although the overall sequence identity is low, clustering of NcgA
with a nitrite reductase suggests it may play a supporting role in
nitrite reduction (17). Genes Nmar_1131 and Nmar_1663 are also
colocated with a member of the DtxR family of metal regulators and
a member of the ZIP metal transport family, suggesting a role in
metal homeostasis (see below). The third 2dMCO (Nmar_1354),
possessing a fused blue copper-containing domain, has not been
found in AOB and appears to be unique to AOA. Redox interactions with the MCOs (and other predicted redox proteins) are
likely mediated by eight soluble and nine membrane-anchored
copper-binding proteins containing plastocyanin-like domains
(Table S2). The corresponding genes appear to be the result of
a series of duplications within the N. maritimus lineage (Fig. S1).
A second family of predicted redox active periplasmic proteins, composed of 11 thiol-disulfide oxidoreductases from the thioredoxin family (Nmar_0639, _0655, _0829, _0881, _1140, _1143, _1148, _1150, _1181, _1658, and Nmar_1670), shows low (but recognizable) identity with the better characterized disulfide bond oxidases/isomerases found in Bacillus subtilis (BdbD) and Escherichia coli (DsbA, DsbC, and DsbG). The mean percentage of sequence identity between the N. maritimus proteins and BdbD is 21 ± 3%. The significantly lower mean percentage of sequence identity to DsbA, DsbC, and DsbG (9.2, 10.7, and 10.4%, respectively) is comparable to that shared between the E. coli proteins (10–11%). Although functional equivalency cannot be established, all but Nmar_0881 preserve the conserved thioredoxin-like active site FX4CXXC sequence (18–20). In E. coli, both DsbA and DsbC rectify nonspecific disulfide bonds catalyzed by copper (21), whereas up-regulation of dsbA by the Cpx regulon occurs during copper stress (22, 23). Eukaryotic protein disulfide isomerase (PDI) homologs sequester and/or reduce oxidized Cu(II), perhaps serving as copper acceptors/donors for copper-containing proteins (24). Another described function of PDIs is the capture and transport of nitric oxide (25, 26), a possible intermediate or by-product of ammonia oxidation. The related protein family in N. maritimus may function in part to alleviate copper and nitric oxide toxicity.
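As an illustration of how the thioredoxin-like active-site pattern FX4CXXC named above can be screened for in protein sequences, the following is a minimal sketch. It assumes that X stands for any single residue, so the pattern reads as F, four arbitrary residues, C, two arbitrary residues, C. The function name and the test peptide are hypothetical and are not taken from the N. maritimus annotation.

```python
import re

# F, any four residues, C, any two residues, C  (FX4CXXC, with X = any residue)
TRX_MOTIF = re.compile(r"F.{4}C..C")


def find_trx_motifs(protein_seq):
    """Return (0-based start, matched segment) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in TRX_MOTIF.finditer(protein_seq.upper())]


# Hypothetical peptide used only to exercise the pattern.
example_hits = find_trx_motifs("MKFYAPWCGHCKA")
```

A hit only shows that the motif is present; as the text notes, it does not establish functional equivalence with the bacterial Dsb or BdbD proteins.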
Pathways for Ammonia Oxidation and Electron Transfer. The three
genes (Nmar_1500, _1503, and _1502) annotated as amoA,
amoB, and amoC and coding for a putative ammonia monooxygenase complex are the only recognizable genetic hallmarks
of ammonia oxidation in the genome sequence. However, the
N. maritimus sequences are no more similar (in either content or
organizational structure) to bacterial amo genes than they are to
the genes encoding bacterial particulate methane monooxygenases (pMMO), suggestive of functional differences between the archaeal and bacterial versions of AMO (7, 27).
Notably, mapping the sequence encoded by amoB onto the
pmoB crystal structure of Methylococcus capsulatus (Bath) (28)
reveals the conservation of the ligands to the pMMO metal
centers and the complete absence of both a transmembrane helix
and a C-terminal cupredoxin domain predicted to be present in
bacterial AMO (Fig. S2).
The structural differences in the archaeal AMO, the lack of genes encoding the hydroxylamine–ubiquinone redox module (29), and a periplasm enriched in redox active proteins together suggest significant divergence from the bacterial pathway of
Walker et al.

Carbon Fixation and Mixotrophy. N. maritimus, like all known AOB, grows chemolithoautotrophically by using inorganic carbon as the sole carbon source (7, 32). However, whereas AOB use the Calvin–Bassham–Benson cycle with the CO2-fixing enzyme ribulose bisphosphate carboxylase/oxygenase (RubisCO) as the key enzyme, the absence of genes in N. maritimus coding for RubisCO and other enzymes of this cycle points to an alternative pathway for carbon fixation. The most likely mechanism supported by the genome sequence is the 3-hydroxypropionate/4-hydroxybutyrate pathway elucidated in the thermophilic crenarchaeote Metallosphaera sedula and suggested as a potential pathway of carbon fixation in C. symbiosum (33). The pathway has two parts: a sequence including two carboxylation reactions transforming acetyl-CoA to succinyl-CoA and a multistep sequence converting succinyl-CoA into two molecules of acetyl-CoA. Genes identified in the N. maritimus genome coding for key enzymes of the pathway (Fig. S3) include a biotin-dependent acetyl-CoA/propionyl-CoA carboxylase (Nmar_0272–0274), methylmalonyl-CoA epimerase and mutase (Nmar_0953, _0954, and _0958), and 4-hydroxybutyrate dehydratase (Nmar_0207). With the exception of one gene (Nmar_1608), all of the genes implicated in the 3-hydroxypropionate/4-hydroxybutyrate pathway for carbon assimilation in N. maritimus are present and show highest similarity to the genes of C. symbiosum (34, 35). Although N. maritimus and M. sedula most likely use the same CO2-fixation reaction sequences, not all individual reactions appear to be catalyzed by identical enzymes. In one instance, the stepwise reductive transformation of malonyl-CoA to propionyl-CoA involves five enzymes in M. sedula (33, 36, 37). Although the N. maritimus genome lacks any close homologs of the M. sedula genes, it contains alternative alcohol dehydrogenases, aldehyde dehydrogenases, acyl-CoA synthetases, and enoyl-CoA hydratases possibly fulfilling the same functions. Similarly, M. sedula catalyzes the activation of 3-hydroxypropionate to 3-hydroxypropionyl-CoA, using an AMP-forming 3-hydroxypropionyl-CoA synthetase (37). The N. maritimus genome lacks an obvious homolog, although it does code for an ADP-forming acyl-CoA synthetase (Nmar_1309) that suggests a more energy-efficient alternative.
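The carbon bookkeeping of the two-part cycle described above (two carboxylations taking acetyl-CoA to succinyl-CoA, then succinyl-CoA split into two acetyl-CoA) can be written out as a small sketch. It counts only the carbons of the acyl groups (2 for acetyl, 4 for succinyl) and treats each carboxylation as adding one inorganic carbon; the function name and the starting pool of one acetyl-CoA are illustrative assumptions.

```python
# Carbons in the acyl group only (the CoA moiety is ignored).
CARBONS = {"acetyl-CoA": 2, "succinyl-CoA": 4, "inorganic C": 1}


def run_cycle(turns=1, acetyl_coa=1):
    """Return (acetyl-CoA pool, inorganic carbons fixed) after whole turns."""
    fixed = 0
    for _ in range(turns):
        # Part 1: acetyl-CoA + 2 inorganic C -> succinyl-CoA
        acetyl_coa -= 1
        fixed += 2
        assert CARBONS["acetyl-CoA"] + 2 * CARBONS["inorganic C"] == CARBONS["succinyl-CoA"]
        # Part 2: succinyl-CoA -> 2 acetyl-CoA
        acetyl_coa += 2
    return acetyl_coa, fixed
```

Each turn therefore regenerates the starting acetyl-CoA and yields one additional acetyl-CoA built from two fixed inorganic carbons, which is the net gain available for biosynthesis.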
In addition to the genes coding for the 3-hydroxypropionate/4-hydroxybutyrate pathway, the genome of N. maritimus contains a number of genes encoding enzymes of the tricarboxylic acid (TCA) cycle. No homologs for genes coding for a citrate-cleaving enzyme (ATP citrate lyase or citryl-CoA lyase) (38) were identified, permitting exclusion of the reductive TCA cycle as a pathway for carbon fixation. The lack of these genes suggests that N. maritimus utilizes either an incomplete (or horseshoe-type) TCA cycle for strictly biosynthetic purposes or possibly a complete oxidative TCA cycle.
N. maritimus grows on a completely inorganic medium, indicating that its genes encode the essential biosynthetic capacity (SI Materials and Methods), yet its genomic inventory also suggests some flexibility in the utilization of organic sources of phosphorus and carbon. Two systems for phosphorus acquisition are suggested: the high-affinity, high-activity phosphate transport system (pstSCAB, Nmar_0479, Nmar_0481–0483) and a phosphonate transporter (Nmar_0873–0875). However, because the genome lacks genes encoding known C-P lyases and hydrolases (39), and phosphate limitation is not alleviated by supplementation with phosphonates common in the marine environment

[Figure 1. Diagram labels recoverable from the source: panels "Ammonia Metabolism" and "Nitrite Reduction"; membrane orientation "Out" and "In"; components AMO, CuHAO, NXOR, CuNIR, pcy, QRED, Q/QH2, NDH I (Complex I), Complex III, aa3 (Complex IV), ATPase (Complex V); species NH3 + O2, H2O + NH2OH, H2O + HNO, 4 H+ + HNO2 (4 e-), 2 H+ + HNO2 (2 e-), NO + H2O, NAD+ + H+, NADH, 0.5 O2, H2O, ADP + Pi, ATP.]

Fig. 1. Proposed AOA respiratory pathway. Text indicates the described possible hydroxylamine (blue text and arrows) and nitroxyl (green) pathways. Red arrows indicate electron flow not involved in ammonia oxidation. Blue shading denotes blue copper-containing proteins. Pink box indicates possible alternative respiratory electron sink. Hexagons containing Q and QH2 represent the oxidized and reduced quinone pools, respectively.


ammonia oxidation. There are two hypothetical mechanistic alternatives (Fig. 1, Table S2): either a unique biochemistry exists for the oxidation of hydroxylamine or the divergent AMO does not actually produce hydroxylamine. If the former is true, hydroxylamine oxidation may occur via one of the periplasmic MCOs (CuHAO). Given the lack of cytochrome c proteins, the four electrons would then be transferred to a quinone reductase (QRED) via small blue copper-containing plastocyanin-like electron carriers. The protein encoded by Nmar_1226, which contains four transmembrane-spanning regions and two plastocyanin-like domains, may serve as an analog of the membrane-bound cytochrome cM552 quinone reductase present in AOB (29) and is a good candidate for the QRED.
In an alternative scenario, the archaeal AMO produces not hydroxylamine, but the reactive intermediate nitroxyl (nitroxyl hydride, HNO). Nitroxyl is a highly toxic and reactive compound recently recognized as having biological significance in a number of systems (30, 31). During archaeal ammonia oxidation, nitroxyl might be formed by a unique monooxygenase function of archaeal AMO. Alternatively, the archaeal AMO may act as a dioxygenase and insert two oxygen atoms into ammonia, producing nitroxyl from the spontaneous decay of HNOHOH. Both reaction sequences eliminate the requirement for reductant recycling during the initial oxygenase reaction, a simplification offering significant ecological advantage (when compared with AOB) in nutrient-poor environments. In this pathway, one of the MCO-like proteins may act as a nitroxyl oxidoreductase (NXOR) and facilitate the oxidation of nitroxyl to nitrite with the extraction of two protons and two electrons in the presence of water. The proposed NXOR would relay the two extracted electrons into the quinone pool via the QRED pathway described above.
In this proposed model, the electrons extracted by either a CuHAO or a NXOR (and transferred into the quinone pool) would generate a proton motive force (PMF) through complexes III (plastocyanin-like subunit, Nmar_1542; Rieske-type subunit, Nmar_1544; transmembrane subunit, Nmar_1543) and IV (Nmar_0182-5), driving the generation of ATP by an F0F1-type ATP synthase (Nmar_1688–1693). The production of reductant (i.e., NADH) would require the reverse operation of complex I (NuoABCDHIJKMLN, Nmar_0276–0286) as a quinol oxidase driven by a PMF. The proposed biochemistry involving nitroxyl produces the same net gain as bacterial ammonia oxidation, providing two electrons for reduction of the quinone pool and subsequent linear electron flow and the generation of a PMF. The presence of a copper-containing (versus heme) complex III and the unique evolutionary placement of terminal oxidase (complex IV) between two of the heme–copper oxygen reductase families further distinguish this proposed ammonia oxidation pathway from that in AOB.
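The electron accounting behind the statement that the nitroxyl route gives "the same net gain" as the hydroxylamine route can be checked with a short sketch. The half-reactions below are written under stated assumptions: the hydroxylamine route uses the standard bacterial stoichiometry in which the AMO step consumes two electrons, while the nitroxyl route assumes an AMO step that needs no reductant, as the text proposes. The species table, function names, and route lists are illustrative, not part of the paper.

```python
from collections import Counter

# Elemental composition and charge of each species.
SPECIES = {
    "NH3":   ({"N": 1, "H": 3}, 0),
    "O2":    ({"O": 2}, 0),
    "H2O":   ({"H": 2, "O": 1}, 0),
    "NH2OH": ({"N": 1, "H": 3, "O": 1}, 0),
    "HNO":   ({"H": 1, "N": 1, "O": 1}, 0),
    "HNO2":  ({"H": 1, "N": 1, "O": 2}, 0),
    "H+":    ({"H": 1}, +1),
    "e-":    ({}, -1),
}


def totals(side):
    """Sum atoms and charge over one side of a reaction ({species: count})."""
    atoms, charge = Counter(), 0
    for name, count in side.items():
        formula, q = SPECIES[name]
        for element, n in formula.items():
            atoms[element] += count * n
        charge += count * q
    return atoms, charge


def is_balanced(reactants, products):
    """True when both sides carry the same atoms and the same net charge."""
    return totals(reactants) == totals(products)


def net_electrons(steps):
    """Electrons released minus electrons consumed over a list of steps."""
    return sum(p.get("e-", 0) - r.get("e-", 0) for r, p in steps)


# Hydroxylamine route: AMO step consumes 2 e-, oxidation step releases 4 e-.
HYDROXYLAMINE_ROUTE = [
    ({"NH3": 1, "O2": 1, "H+": 2, "e-": 2}, {"NH2OH": 1, "H2O": 1}),
    ({"NH2OH": 1, "H2O": 1}, {"HNO2": 1, "H+": 4, "e-": 4}),
]

# Nitroxyl route: AMO step needs no reductant, oxidation step releases 2 e-.
NITROXYL_ROUTE = [
    ({"NH3": 1, "O2": 1}, {"HNO": 1, "H2O": 1}),
    ({"HNO": 1, "H2O": 1}, {"HNO2": 1, "H+": 2, "e-": 2}),
]
```

Under these assumptions both routes are mass and charge balanced, and `net_electrons` returns 2 for each, matching the two electrons the text assigns to reduction of the quinone pool.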

(e.g., aminoethylphosphonate), there is as yet no support for a functional phosphonate utilization pathway. Numerous organic transport functions are also evident, broadly encompassing transporters for different amino acids, dipeptides/oligopeptides, sulfonates/taurine, and glycerol. Additional physiological characterization will likely demonstrate some capacity for mixotrophic growth, as suggested by isotopic studies of natural populations (40–42). No genomic evidence exists for growth on urea, as N. maritimus lacks the homologs of the putative urease and urea transporter genes identified in C. symbiosum (35).
Noncoding RNA Genes. The N. maritimus genome contains a full complement of essential noncoding RNA (ncRNA) genes, including one copy each of 5S/16S/23S ribosomal RNAs, RNase P, SRP RNA, and 44 transfer RNAs (Table S3). In addition to six normally placed canonical tRNA introns, noncanonical introns were found at different positions in six of the tRNAs [Val(CAC), Met, Trp, Arg(CCT), Leu(TAA), and Glu(TTC)], a phenomenon previously observed only in thermophiles and C. symbiosum (34, 43). All other sequenced crenarchaea (including C. symbiosum) contain at least 46 tRNAs. N. maritimus lacks tRNA sequences coding for Pro(CGG) or Arg(CCG), perhaps resulting from (or related to) the low G + C content of the genome and preference for protein codons ending in A/T. Other archaeal genomes with relatively low G + C content, such as the euryarchaeon Haloquadratum walsbyi, also lack these tRNAs, while possessing the exact complement of 44 tRNAs found in N. maritimus (43). This occurrence likely reflects a difference in posttranscriptional modification of the wobble base of tRNAs Pro(TGG) and Arg(TCG), allowing more efficient decoding of the rare codons CCG and CGG, respectively (44).
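The relationship between the missing tRNAs and the rare codons named above follows from antiparallel base pairing: a codon is the reverse complement of the anticodon that reads it by strict Watson-Crick pairing. The sketch below assumes anticodons written 5' to 3' in the DNA alphabet, as in the text, and deliberately ignores wobble pairing and base modification; the function name is illustrative.

```python
# Watson-Crick complements in the DNA alphabet used for the anticodons in the text.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def codon_for_anticodon(anticodon):
    """Return the codon (5'->3') read by an anticodon (5'->3') via strict base pairing."""
    return anticodon.upper().translate(COMPLEMENT)[::-1]
```

With this rule the absent anticodons CGG (Pro) and CCG (Arg) correspond to the codons CCG and CGG, which is why decoding of those two codons must fall to the remaining Pro and Arg tRNAs through wobble-position flexibility.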
Six candidates for C/D box small RNAs (sRNAs) were identified (Nmar_sR1–sR6). Most C/D box sRNAs guide the precise positioning of posttranscriptional 2′-O-methyl group addition to rRNAs or tRNAs, a process also occurring in eukaryotes, but not Bacteria. In N. maritimus, predicted targets of 2′-O-methylation may include the wobble base of the Leu(CAA) tRNA and two different positions separated by 26 nucleotides in the 23S rRNA. Before the N. maritimus and C. symbiosum genomes, multiple C/D box sRNAs were found almost exclusively in hyperthermophilic archaea (45). Detection of these conserved, syntenic guide sRNAs in the two mesophilic crenarchaeal genomes supports an RNA stabilization/chaperone function not seen in other archaeal mesophiles and possibly more similar to their predicted function(s) in eukaryotes (46).
Regulation of Transcription. The genome contains at least eight transcription factor B (TFB) and two TATA-box binding protein (TBP) genes (Table S3) required for site-specific transcription initiation, making N. maritimus among the densest and richest archaeal genomes for these transcription factors. In Archaea with genomes coding for multiple copies, TFB and TBP are thought to serve functions similar to the bacterial sigma factors (e.g., modulating cellular function in response to fluctuating environmental conditions), with optimal TFB/TBP partners (47). Although many other archaeal genomes contain multiple copies of these transcription factors, only the haloarchaea have more than five TFB genes (47). The functional significance of this exceptionally high density of regulatory factors in an apparently metabolically specialized organism will likely be informed by future transcriptional analyses of different growth states. Genes for two widely distributed types of archaeal chromatin proteins are present, an archaeal histone (Nmar_0683) and two Alba genes (Nmar_0255 and Nmar_0933). These are thought to maintain chromosomal material in a state permitting polymerase accessibility, with differential expression possibly providing for altered global transcription (48).
Unique Cell Division Machinery and Previously Uncharacterized Instance of Archaeal Biosynthesis of Hydroxyectoine. The N. maritimus genome contains elements homologous with two systems of cell division: ftsZ (Nmar_1262) and cdvABC (Nmar_0700, _0816, and _1088). The cdvABC operon, induced at the onset of genome segregation and cell division, codes for machinery related to the eukaryotic endosomal sorting complex (49, 50). With the exception of N. maritimus, C. symbiosum, and the Thermoproteales (where the cell division machinery remains uncharacterized), all available archaeal genomes have either the FtsZ or the Cdv cell division machinery, but not both. The two cell division systems may comprise a hybrid mechanism or serve two distinct processes in marine Crenarchaeota.
The genome of N. maritimus also encodes enzymes for synthesis of the compatible solute hydroxyectoine: ectoine synthase (ectC, Nmar_1344), an L-2,4-diaminobutyrate transaminase (ectB, Nmar_1345), an L-2,4-diaminobutyrate acetyltransferase (ectA, Nmar_1346), and an ectoine hydroxylase (ectD, Nmar_1343). Although these genes are widely distributed among Bacteria (in particular in the genome sequences of marine organisms), their presence in N. maritimus represents a unique indication of archaeal biosynthesis.
Fig. 2. Synteny plots comparing the N. maritimus genome with (A) the Cenarchaeum symbiosum A type genome, (B) crenarchaeal genome fragments, and (C) Sargasso Sea fosmids. Vertical gray bar indicates location of ribosomal RNA operon.

Relationship to C. symbiosum. The genome of N. maritimus differs significantly in G + C content and size from that of the closely related sponge symbiont (97% 16S rRNA gene sequence identity). Despite the differences in overall genomic G + C content (34.2% for N. maritimus versus 57.7% for C. symbiosum), the G + C content for the rRNAs is largely similar between both organisms (50–52%). The higher ORF density (1.19 ORF/kb) relative to C. symbiosum (0.986 ORF/kb) results principally from the 0.4 Mbp smaller genome of N. maritimus, not from a large disparity in the number of predicted ORFs. The two genomes share 1,267 genes in common (when compared via reciprocal BLAST with expectation cutoff values of 10⁻⁴), yet there is little conservation of synteny (Fig. 2A, Table S4). Most of the increased size of the C. symbiosum genome and much of the divergence in gene content are associated with discrete regions (islands), although no obvious functionality could be assigned to individual islands. Homologs for a majority of genes implicated in the archaeal ammonia oxidation pathway (51 of 69 listed in Table S2) appear present in C. symbiosum.
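The reciprocal BLAST comparison mentioned above can be sketched as a reciprocal-best-hit filter. This is one common reading of "reciprocal BLAST" and is an assumption here; the text states only the expectation cutoff. Hits are assumed to be already parsed into (query, subject, E-value, bit score) tuples, and the function names and tuple layout are illustrative.

```python
def best_hits(hits, max_evalue=1e-4):
    """Map each query to its highest-scoring subject among hits passing the cutoff."""
    best = {}
    for query, subject, evalue, bitscore in hits:
        if evalue > max_evalue:
            continue  # fails the expectation cutoff
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a, max_evalue=1e-4):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b, max_evalue)
    reverse = best_hits(b_vs_a, max_evalue)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)
```

Counting the returned pairs gives a shared-gene tally of the kind reported in the text; the exact number depends on the search program, scoring, and tie handling, none of which are specified here.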
Phylogeny and Evolution. The widely distributed Group I archaeal lineage with which N. maritimus is affiliated was earlier assigned to the hyperthermophilic Crenarchaeota (3). Questions regarding this association arose through phylogenetic analysis of C. symbiosum ribosomal proteins, indicating possible divergence before the Crenarchaeota–Euryarchaeota split and therefore deserving provisional assignment to a new archaeal kingdom, the Thaumarchaeota (51). The basal position of the Group I archaea previously inferred from protein sequences encoded by the C. symbiosum genome was reexamined by reanalysis of the combined dataset, using patterns of gene distribution (Table S5) and phylogeny inference. Maximum-likelihood analyses confirmed the basal branching with significant statistical support (bootstrap value = 90%, Fig. S4). Bayesian analysis of a selection of species from the same dataset produced results linking C. symbiosum and N. maritimus as sister taxa of Crenarchaeota, albeit with nonsignificant support (posterior probability = 0.88). Although a definitive placement within the Archaea still must be confirmed by inclusion of genomes from more distantly related lineages, both analyses strongly support a lineage distinct from all other cultivated Crenarchaeota.
High Similarity to the Metagenome of the Globally Distributed Marine Group I Archaea. The genome of N. maritimus shares remarkable

Materials and Methods


Genome sequencing was performed on high-molecular-weight DNA extracted from two cultures of N. maritimus. Whole-genome shotgun sequencing of 3-, 8-, and 40-kb DNA libraries by the Joint Genome Institute produced at least 8× coverage. Annotation of the closed genome was performed using Department of Energy (DOE) computational support at Oak Ridge National Laboratory and The Institute for Genomic Research (TIGR) Autoannotation Service in conjunction with Manatee visualization software. Complete details describing high-molecular-weight DNA purification and sequence analysis are found in SI Materials and Methods.
ACKNOWLEDGMENTS. The authors thank David Bruce and Paul Richardson
from the Joint Genome Institute for facilitating genome sequencing. This


Fig. 3. Comparison of N. maritimus genome to metagenomic sequence reads from GOS. (A) Percentage of reads from each GOS site that align to the N. maritimus genome by protein sequence similarity. (B) Number of GOS reads homologous to each N. maritimus protein-coding gene. Counts of 40 are mostly due to highly conserved proteins that include contaminants from other clades.

conservation of gene content and gene order with numerous archaeal sequences previously recovered in fosmid libraries and recent oceanic surveys. The Antarctic genome fragments DeepAntEC39 (taken from 500 m depth) (52) and cosmid 74A4 (from surface waters) (53) both share very high synteny with portions of the N.
maritimus genome (Fig. 2B and Table S6) despite signicant differences in rRNA sequence identity (93 and 98%, respectively).
Sixteen Sargasso Sea contigs (93% 16S rRNA gene sequence
identity with N. maritimus) also have high synteny (Fig. 2C and
Table S6). Retrieval of DNA fragments from the Global Ocean
Sampling (GOS) database using differential protein sequence
similarity showed that N. maritimus-like sequences constituted an
average of 1.15% of all sequences across a wide range of temperatures (929 C), salinities (freshwater to seawater, 0.163 practical
salinity units), two open oceans, and several coastal environments
(Fig. 3A). The Block Island, NY, coastal site and the Lake Gatun,
Panama Canal, site (neither of which share any notable physical/
chemical characteristics other than sample depth) both exhibited
notably large increases in density (>2.5%). The available GOS
sequences provided almost complete and uniform coverage of the
N. maritimus genome (Fig. 3B), although at least three significant
gaps in coverage exist (possibly corresponding with unique
N. maritimus genomic islands). Whereas some of the coverage may
result from matches to bacterial sequences, particularly for very highly
conserved proteins, the majority of recruited proteins had >50%
sequence identity to N. maritimus proteins. Together, these shared
genomic features suggest N. maritimus is representative of many of
the globally abundant marine Group I Crenarchaeota and that it
should provide a useful model for developing an understanding of
the basic physiology of these abundant and cosmopolitan organisms.
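The recruitment by differential protein sequence similarity described above can be illustrated with a minimal sketch. It assumes, following SI Materials and Methods, that an ORF is retained only when it scores more highly against N. maritimus and/or C. symbiosum proteins than against any other reference proteome. The record layout, the proteome labels, the ORF names, and the function names are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Dict, Iterable, List, Set, Tuple

# One alignment hit: (orf_id, reference_proteome_label, bit_score).
# The labels below are placeholders chosen for this illustration.
Hit = Tuple[str, str, float]

TARGETS: Set[str] = {"N_maritimus", "C_symbiosum"}


def recruit_orfs(hits: Iterable[Hit], targets: Set[str] = TARGETS) -> List[str]:
    """Return ORFs whose best target score strictly beats every non-target score."""
    best_target: Dict[str, float] = {}
    best_other: Dict[str, float] = {}
    for orf, proteome, score in hits:
        bucket = best_target if proteome in targets else best_other
        bucket[orf] = max(score, bucket.get(orf, float("-inf")))
    return sorted(
        orf
        for orf, score in best_target.items()
        if score > best_other.get(orf, float("-inf"))
    )


def percent_recruited(n_recruited: int, n_total: int) -> float:
    """Percentage of a site's sequences that were recruited."""
    return 100.0 * n_recruited / n_total


# Toy data: orf1 and orf4 score best against a target proteome,
# orf2 scores better elsewhere, orf3 has no target hit at all.
example_hits: List[Hit] = [
    ("orf1", "N_maritimus", 210.0),
    ("orf1", "E_coli", 95.0),
    ("orf2", "C_symbiosum", 80.0),
    ("orf2", "S_solfataricus", 120.0),
    ("orf3", "E_coli", 300.0),
    ("orf4", "N_maritimus", 50.0),
]
recruited = recruit_orfs(example_hits)
```

Using a strict comparison means that ties against a non-target proteome are rejected, which is the conservative reading of "scoring more highly"; a per-site percentage like those in Fig. 3A would then be `percent_recruited(len(recruited), n_total)` for that site.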
A comparison of the sequence content and genome organization hints at functionally more divergent marine population
types. N. maritimus has limited syntenic similarity to a deepwater population represented by the North Pacific fosmid 4B7
(93% 16S rRNA sequence identity), but shares proteins highly
similar to most of those coded on this fosmid. Previous comparison of marine crenarchaeal genomic fragments reported
changes in genomic organization with sampling depths, suggestive of depth-related habitat types (54). Coupled with recent
evidence indicating varied physiological lifestyles along depth
and latitudinal gradients, distinct crenarchaeal ecotypes likely
exist, analogous to that observed in marine cyanobacteria (2, 55).
However, no clear correlations currently exist between environmental parameters and the similarity of recruited fragments.
The genome sequence presented here also offers further insight
into the ecological success of AOA. For example, using the likely more
energy-efficient 3-hydroxypropionate/4-hydroxybutyrate pathway for
CO2 fixation rather than the Calvin–Bassham–Benson cycle used by
AOB could provide a growth advantage. Further ecological advantage
may be conferred by their potential capacity for mixotrophic growth or
the use of copper as a major redox-active metal for respiration in
generally iron-limited oceans. However, a deeper understanding of
the remarkable success of this archaeal lineage will come only from
more detailed physiological, biochemical, and genetic characterization of N. maritimus and additional environmental isolates.

work was supported by the Department of Energy Microbial Genome
Program, by National Science Foundation Microbial Interactions and Processes
Grant MCB-0604448 (to D.A.S. and J.R.d.l.T.), by National Science Foundation
Molecular and Cellular Biosciences Grant MCB-0920741 (to D.A.S.), by National
Science Foundation Biological Oceanography Grants OCE-0623174 (to D.A.S.)
and OCE-0623908 (to S.M.S.), by National Science Foundation Grant EF-0412129 (to M.G.K.), by incentive funds from the University of Louisville VP
Research office (to M.G.K.), by the Deutsche Forschungsgemeinschaft (M.K.),
by US Department of Agriculture Grant 2010-65115-20380 (to A.C.R.), and by
a Salk Institute Innovation Grant (to G.M.).

1. Karner MB, DeLong EF, Karl DM (2001) Archaeal dominance in the mesopelagic zone
of the Pacific Ocean. Nature 409:507–510.
2. Agogué H, Brink M, Dinasquet J, Herndl GJ (2008) Major gradients in putatively
nitrifying and non-nitrifying Archaea in the deep North Atlantic. Nature 456:788–791.
3. DeLong EF (1992) Archaea in coastal marine environments. Proc Natl Acad Sci USA 89:
5685–5689.
4. Fuhrman JA, McCallum K, Davis AA (1992) Novel major archaebacterial group from
marine plankton. Nature 356:148–149.
5. Venter JC, et al. (2004) Environmental genome shotgun sequencing of the Sargasso
Sea. Science 304:66–74.
6. Wuchter C, et al. (2006) Archaeal nitrification in the ocean. Proc Natl Acad Sci USA
103:12317–12322.
7. Könneke M, et al. (2005) Isolation of an autotrophic ammonia-oxidizing marine
archaeon. Nature 437:543–546.
8. Leininger S, et al. (2006) Archaea predominate among ammonia-oxidizing prokaryotes
in soils. Nature 442:806–809.
9. Prosser JI, Nicol GW (2008) Relative contributions of archaea and bacteria to aerobic
ammonia oxidation in the environment. Environ Microbiol 10:2931–2941.
10. Prosser JI (1989) Autotrophic nitrification in bacteria. Adv Microb Physiol 30:125–181.
11. de la Torre JR, Walker CB, Ingalls AE, Könneke M, Stahl DA (2008) Cultivation of
a thermophilic ammonia oxidizing archaeon synthesizing crenarchaeol. Environ
Microbiol 10:810–818.
12. Hatzenpichler R, et al. (2008) A moderately thermophilic ammonia-oxidizing
crenarchaeote from a hot spring. Proc Natl Acad Sci USA 105:2134–2139.
13. Martens-Habbena W, Berube PM, Urakawa H, de la Torre JR, Stahl DA (2009)
Ammonia oxidation kinetics determine niche separation of nitrifying Archaea and
Bacteria. Nature 461:976–979.
14. Sernova NV, Gelfand MS (2008) Identification of replication origins in prokaryotic
genomes. Brief Bioinform 9:376–391.
15. Makarova KS, Sorokin AV, Novichkov PS, Wolf YI, Koonin EV (2007) Clusters of
orthologous genes for 41 archaeal genomes and implications for evolutionary
genomics of archaea. Biol Direct 2:33.
16. Schäfer G (2004) Respiration in Archaea and Bacteria, ed Zannoni D (Springer,
Dordrecht, The Netherlands), pp 1–33.
17. Beaumont HJ, Lens SI, Westerhoff HV, van Spanning RJ (2005) Novel nirK cluster
genes in Nitrosomonas europaea are required for NirK-dependent tolerance to
nitrite. J Bacteriol 187:6849–6851.
18. Andersen CL, Matthey-Dupraz A, Missiakas D, Raina S (1997) A new Escherichia coli
gene, dsbG, encodes a periplasmic protein involved in disulphide bond formation,
required for recycling DsbA/DsbB and DsbC redox proteins. Mol Microbiol 26:121–132.
19. Meima R, et al. (2002) The bdbDC operon of Bacillus subtilis encodes thiol-disulfide
oxidoreductases required for competence development. J Biol Chem 277:6994–7001.
20. Raina S, Missiakas D (1997) Making and breaking disulfide bonds. Annu Rev Microbiol
51:179–202.
21. Hiniker A, Collet JF, Bardwell JC (2005) Copper stress causes an in vivo requirement for
the Escherichia coli disulfide isomerase DsbC. J Biol Chem 280:33785–33791.
22. Pogliano J, Lynch AS, Belin D, Lin EC, Beckwith J (1997) Regulation of Escherichia coli
cell envelope proteins involved in protein folding and degradation by the Cpx two-component system. Genes Dev 11:1169–1182.
23. Kershaw CJ, Brown NL, Constantinidou C, Patel MD, Hobman JL (2005) The expression
profile of Escherichia coli K-12 in response to minimal, optimal and excess copper
concentrations. Microbiology 151:1187–1198.
24. Narindrasorasak S, Yao P, Sarkar B (2003) Protein disulfide isomerase, a multifunctional
protein chaperone, shows copper-binding activity. Biochem Biophys Res Commun 311:
405–414.
25. Sliskovic I, Raturi A, Mutus B (2005) Characterization of the S-denitrosation activity of
protein disulfide isomerase. J Biol Chem 280:8733–8741.
26. Ramachandran N, Root P, Jiang XM, Hogg PJ, Mutus B (2001) Mechanism of transfer
of NO from extracellular S-nitrosothiols into the cytosol by cell-surface protein
disulfide isomerase. Proc Natl Acad Sci USA 98:9539–9544.
27. Nicol GW, Tscherko D, Chang L, Hammesfahr U, Prosser JI (2006) Crenarchaeal
community assembly and microdiversity in developing soils at two sites associated
with deglaciation. Environ Microbiol 8:1382–1393.
28. Lieberman RL, Rosenzweig AC (2005) Crystal structure of a membrane-bound
metalloenzyme that catalyses the biological oxidation of methane. Nature 434:
177–182.

29. Klotz MG, Stein LY (2008) Nitrifier genomics and evolution of the nitrogen cycle.
FEMS Microbiol Lett 278:146–156.
30. Fukuto JM, Switzer CH, Miranda KM, Wink DA (2005) Nitroxyl (HNO): Chemistry,
biochemistry, and pharmacology. Annu Rev Pharmacol Toxicol 45:335–355.
31. Miranda KM, et al. (2003) A biochemical rationale for the discrete behavior of nitroxyl
and nitric oxide in the cardiovascular system. Proc Natl Acad Sci USA 100:9196–9201.
32. Arp DJ, Chain PS, Klotz MG (2007) The impact of genome analyses on our
understanding of ammonia-oxidizing bacteria. Annu Rev Microbiol 61:503–528.
33. Berg IA, Kockelkorn D, Buckel W, Fuchs G (2007) A 3-hydroxypropionate/4-hydroxybutyrate
autotrophic carbon dioxide assimilation pathway in Archaea. Science 318:1782–1786.
34. Hallam SJ, et al. (2006) Genomic analysis of the uncultivated marine crenarchaeote
Cenarchaeum symbiosum. Proc Natl Acad Sci USA 103:18296–18301.
35. Hallam SJ, et al. (2006) Pathways of carbon assimilation and ammonia oxidation
suggested by environmental genomic analyses of marine Crenarchaeota. PLoS Biol 4:
e95.
36. Alber B, et al. (2006) Malonyl-coenzyme A reductase in the modified 3-hydroxypropionate
cycle for autotrophic carbon fixation in archaeal Metallosphaera and Sulfolobus spp.
J Bacteriol 188:8551–8559.
37. Alber BE, Kung JW, Fuchs G (2008) 3-Hydroxypropionyl-coenzyme A synthetase from
Metallosphaera sedula, an enzyme involved in autotrophic CO2 fixation. J Bacteriol
190:1383–1389.
38. Hügler M, Huber H, Molyneaux SJ, Vetriani C, Sievert SM (2007) Autotrophic CO2
fixation via the reductive tricarboxylic acid cycle in different lineages within the
phylum Aquificae: Evidence for two ways of citrate cleavage. Environ Microbiol 9:
81–92.
39. Quinn JP, Kulakova AN, Cooley NA, McGrath JW (2007) New ways to break an old
bond: The bacterial carbon-phosphorus hydrolases and their role in biogeochemical
phosphorus cycling. Environ Microbiol 9:2392–2400.
40. Herndl GJ, et al. (2005) Contribution of Archaea to total prokaryotic production in the
deep Atlantic Ocean. Appl Environ Microbiol 71:2303–2309.
41. Ingalls AE, et al. (2006) Quantifying archaeal community autotrophy in the mesopelagic
ocean using natural radiocarbon. Proc Natl Acad Sci USA 103:6442–6447.
42. Ouverney CC, Fuhrman JA (2000) Marine planktonic archaea take up amino acids.
Appl Environ Microbiol 66:4829–4833.
43. Chan PP, Lowe TM (2009) GtRNAdb: A database of transfer RNA genes detected in
genomic sequence. Nucleic Acids Res 37 (Database issue):D93–D97.
44. Grosjean H, Marck C, de Crécy-Lagard V (2007) The various strategies of codon
decoding in organisms of the three domains of life: Evolutionary implications. Nucleic
Acids Symp Ser (Oxf) 51:15–16.
45. Dennis PP, Omer A, Lowe T (2001) A guided tour: Small RNA function in Archaea. Mol
Microbiol 40:509–519.
46. Maxwell ES, Fournier MJ (1995) The small nucleolar RNAs. Annu Rev Biochem 64:
897–934.
47. Facciotti MT, et al. (2007) General transcription factor specified global gene
regulation in archaea. Proc Natl Acad Sci USA 104:4630–4635.
48. Sandman K, Reeve JN (2005) Archaeal chromatin proteins: Different structures but
common function? Curr Opin Microbiol 8:656–661.
49. Lindås AC, Karlsson EA, Lindgren MT, Ettema TJ, Bernander R (2008) A unique cell
division machinery in the Archaea. Proc Natl Acad Sci USA 105:18942–18946.
50. Samson RY, Obita T, Freund SM, Williams RL, Bell SD (2008) A role for the ESCRT
system in cell division in archaea. Science 322:1710–1713.
51. Brochier-Armanet C, Boussau B, Gribaldo S, Forterre P (2008) Mesophilic
Crenarchaeota: Proposal for a third archaeal phylum, the Thaumarchaeota. Nat Rev
Microbiol 6:245–252.
52. Stein JL, Marsh TL, Wu KY, Shizuya H, DeLong EF (1996) Characterization of
uncultivated prokaryotes: Isolation and analysis of a 40-kilobase-pair genome
fragment from a planktonic marine archaeon. J Bacteriol 178:591–599.
53. Béjà O, et al. (2002) Comparative genomic analysis of archaeal genotypic variants in
a single population and in two different oceanic provinces. Appl Environ Microbiol
68:335–345.
54. López-García P, Brochier C, Moreira D, Rodríguez-Valera F (2004) Comparative
analysis of a genome fragment of an uncultivated mesopelagic crenarchaeote reveals
multiple horizontal gene transfers. Environ Microbiol 6:19–34.
55. Johnson ZI, et al. (2006) Niche partitioning among Prochlorococcus ecotypes along
ocean-scale environmental gradients. Science 311:1737–1740.

6 of 6 | www.pnas.org/cgi/doi/10.1073/pnas.0913533107


Supporting Information
Walker et al. 10.1073/pnas.0913533107
SI Materials and Methods
High-Molecular-Weight Genomic DNA Preparation. Two cultures of
Nitrosopumilus maritimus strain SCM1 were grown as previously
described in 500 mL of media in 1-L flasks (1). Cells from both
cultures were harvested in late-exponential phase using Sterivex
filters for one culture and a 0.1-µm filter for the other. High-molecular-weight DNA was isolated as previously described using
either agarose plugs (2) or a modified guanidinium thiocyanate
protocol (3). Cells from the Sterivex filter were resuspended in
1 mL of 2× STE buffer [1 M NaCl, 0.1 M EDTA (pH 8.0), 10 mM
Tris (pH 8.0)], extracted from the filter, and mixed with 1 vol of 1%
molten SeaPlaque LMP agarose (FMC). The mixture was cooled
to 40 °C, immediately drawn into a 1-mL syringe, and placed on ice
for 10 min. The agarose plug was mixed with 10 mL of lysis buffer,
incubated at 37 °C for 1 h, and then transferred to 40 mL of ESP
buffer (1% Sarkosyl and 1 mg of Proteinase K per ml in 0.5 M EDTA).
After incubation at 55 °C for 16 h, the solution was replaced with
fresh ESP buffer and incubated at 55 °C for another hour. DNA
was purified using phenol:chloroform:isoamyl alcohol (24:24:1)
and recovered by precipitation with isopropanol.
Cells collected on the 0.1-µm filter were resuspended in 100 µL of
Tris-EDTA (pH 8.0) and 100 mg/mL lysozyme before incubating at
37 °C for 30 min. Then, 3.0 mL of a solution containing 5 M guanidinium thiocyanate, 100 mM EDTA (pH 8.0), and 0.5% (vol/vol)
sarkosyl was added. The solution was mixed gently for 15 min before
being cooled on ice for 10 min. After cooling, an equal volume of cold
7.5 M ammonium acetate was added, and the solution was mixed
gently and cooled on ice. Purification and precipitation of DNA were
performed with chloroform:isoamyl alcohol (24:1) and isopropanol.
Genome Sequencing. A completely sequenced and closed genome
of N. maritimus was obtained through collaboration with the
Joint Genome Institute. Whole-genome shotgun sequencing of
3-, 8-, and 40-kb DNA libraries produced at least 8× coverage of
the entire genome. Specifics of clone library generation, sequencing, and assembly strategies may be found at the DOE JGI
website (www.jgi.doe.gov/sequencing/index.html).
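As a rough illustration of what a coverage figure such as 8× means, mean sequencing depth can be estimated as the total number of sequenced bases divided by the genome length. The sketch below is not the JGI assembly procedure; the genome length, read length, and read count are illustrative placeholders rather than values reported in this section, and the function names are hypothetical.

```python
import math


def mean_coverage(n_reads: int, mean_read_len: float, genome_len: int) -> float:
    """Mean depth: total sequenced bases divided by genome length."""
    return n_reads * mean_read_len / genome_len


def reads_needed(target_coverage: float, mean_read_len: float, genome_len: int) -> int:
    """Smallest read count whose mean depth reaches the target coverage."""
    return math.ceil(target_coverage * genome_len / mean_read_len)


# Placeholder values for illustration only (not taken from the paper).
GENOME_LEN_BP = 1_645_000   # assumed genome length in base pairs
READ_LEN_BP = 700           # assumed mean read length in base pairs

n_reads = reads_needed(8, READ_LEN_BP, GENOME_LEN_BP)
depth = mean_coverage(n_reads, READ_LEN_BP, GENOME_LEN_BP)
```

Mean depth says nothing about how evenly reads are distributed, which is one reason libraries with several insert sizes are typically combined when closing a genome.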
Genome Sequence Analysis. Autoannotation of the closed genome
sequence was performed by both the Computational Biology
group at Oak Ridge National Laboratory (http://genome.ornl.gov/microbial/nmar/02jul07/)
and the TIGR Autoannotation
Service (now hosted by JCVI; details available from
http://www.jcvi.org/cms/research/projects/annotation-service/overview/). The
genome visualization software Manatee (release 2.4.1; latest
version available from http://manatee.sourceforge.net/) was used
for manual curation. Analysis of potential transporter genes was
performed using the Transporter Automatic Annotation Pipeline (TransAAP) through the TransportDB genomic comparison
tool (membranetransport.org).
Direct comparisons with the C. symbiosum genome and genome
fragments were performed using the Artemis Comparison Tool (3)
1. Könneke M, et al. (2005) Isolation of an autotrophic ammonia-oxidizing marine
archaeon. Nature 437:543–546.
2. Stein LY, et al. (2007) Whole-genome analysis of the ammonia-oxidizing bacterium,
Nitrosomonas eutropha C91: Implications for niche adaptation. Environ Microbiol 9:
2993–3007.
3. Carver TJ, et al. (2005) ACT: The Artemis comparison tool. Bioinformatics 21:3422–3423.
4. Bland C, et al. (2007) CRISPR recognition tool (CRT): A tool for automatic detection of
clustered regularly interspaced palindromic repeats. BMC Bioinformatics 8:209.
5. Brochier-Armanet C, Boussau B, Gribaldo S, Forterre P (2008) Mesophilic Crenarchaeota:
Proposal for a third archaeal phylum, the Thaumarchaeota. Nat Rev Microbiol 6:245–252.

Walker et al. www.pnas.org/cgi/content/short/0913533107

with a comparison library generated through WebACT
(www.webact.org/WebACT/home). Orthologous genes shared between
these two organisms were identified through reciprocal BLAST
searches, with an expectation cutoff value of 10^-4 and a minimum of
75% alignable N. maritimus sequence. Comparisons with the
Global Ocean Sampling (GOS) and Sargasso Sea metagenomic
datasets were performed using several single-copy universal archaeal genes to determine a count of 15 archaeal genomes in the
GOS dataset. An initial set of candidate N. maritimus-like proteins
was found by BLASTP searches with each N. maritimus protein-coding gene, using a cutoff of 100 hits and a maximum expected
value of e = 10. This set consisted of 125,326 peptides drawn from
107,223 scaffolds. Neighboring ORFs on any scaffold with two or
more hits were added to make a total of 319,585 peptides,
amounting to 5.2% of all ORFs in GOS. These sequences were
filtered by BLAST alignment to four sequence datasets, containing
24 complete proteomes from diverse euryarchaeota, crenarchaeota,
and bacteria. Sequences scoring more highly to N. maritimus and/or
C. symbiosum proteins than any other entry in these datasets were
retained, giving a filtered dataset of 21,278 ORFs. Coverage of the
N. maritimus proteome was measured by bidirectional BLASTP of
N. maritimus proteins against the filtered dataset to assign putative
orthologs. The average coverage was 11, although some highly
conserved genes had a much greater number of hits, probably due to
recruitment of nonarchaeal homologs: 48 ORFs had >30 hits, of
which most were highly conserved. Searches for CRISPR regions
were performed using the Java-based CRISPR recognition tool
(CRT) with least stringent settings (4).
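The reciprocal BLAST criterion used for ortholog assignment can be sketched as follows. This is one possible reading, not the authors' script: it assumes the expectation cutoff is an E-value of 10^-4, that "75% alignable" means the alignment spans at least 75% of the N. maritimus protein length, and that the best hit is the one with the highest bit score. The `Hit` record, the function names, and the protein identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Hit:
    query: str          # query protein ID
    subject: str        # subject protein ID
    evalue: float       # expectation value of the alignment
    query_aligned: int  # number of query residues covered by the alignment
    bitscore: float


def best_hits(hits: Iterable[Hit], max_evalue: float) -> Dict[str, Hit]:
    """Keep, per query, the highest-scoring hit that passes the E-value cutoff."""
    best: Dict[str, Hit] = {}
    for h in hits:
        if h.evalue > max_evalue:
            continue
        if h.query not in best or h.bitscore > best[h.query].bitscore:
            best[h.query] = h
    return best


def reciprocal_orthologs(
    nmar_vs_csym: Iterable[Hit],
    csym_vs_nmar: Iterable[Hit],
    nmar_lengths: Dict[str, int],
    max_evalue: float = 1e-4,
    min_fraction: float = 0.75,
) -> List[Tuple[str, str]]:
    """Pairs that are mutual best hits and cover enough of the N. maritimus protein."""
    forward = best_hits(nmar_vs_csym, max_evalue)
    reverse = best_hits(csym_vs_nmar, max_evalue)
    pairs = []
    for nmar_id, hit in forward.items():
        back = reverse.get(hit.subject)
        if back is None or back.subject != nmar_id:
            continue  # not reciprocal
        if hit.query_aligned / nmar_lengths[nmar_id] < min_fraction:
            continue  # too little of the N. maritimus sequence aligned
        pairs.append((nmar_id, hit.subject))
    return sorted(pairs)


# Toy example with made-up identifiers and scores.
lengths = {"NmarA": 200, "NmarB": 150, "NmarC": 300, "NmarD": 110}
fwd = [
    Hit("NmarA", "CsymX", 1e-50, 180, 200.0),
    Hit("NmarB", "CsymY", 1e-30, 60, 120.0),   # only 40% of NmarB aligned
    Hit("NmarC", "CsymZ", 1e-2, 290, 35.0),    # fails the E-value cutoff
    Hit("NmarD", "CsymX", 1e-20, 100, 90.0),   # CsymX prefers NmarA
]
rev = [
    Hit("CsymX", "NmarA", 1e-48, 175, 198.0),
    Hit("CsymY", "NmarB", 1e-29, 60, 118.0),
    Hit("CsymZ", "NmarC", 1e-2, 290, 35.0),
]
orthologs = reciprocal_orthologs(fwd, rev, lengths)
```

Requiring reciprocity guards against assigning several N. maritimus paralogs to the same C. symbiosum gene, while the alignable-fraction filter removes pairs that share only a single conserved domain.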
Maximum-Likelihood and Bayesian Trees of the Archaeal Domain.
The trees are based on the concatenation of ribosomal proteins
used by Brochier-Armanet et al. (5) but including sequences from N.
maritimus and from Candidatus Korarchaeum cryptofilum OPF8.
Sequences were aligned using MUSCLE (6). Resulting alignments
were visually inspected and improved with the MUST software (7).
Regions where homology between sites was doubtful were removed
from further phylogenetic analyses. A total of 6,142 positions were
kept for the phylogenetic analyses. The maximum-likelihood tree
was computed with PHYML, using the WAG model corrected by
a gamma law to take into account among-site variation in
evolutionary rate (8). The parameter alpha of the gamma distribution and the
proportion of invariable sites were estimated from the dataset. The
robustness of each branch was estimated by the bootstrap procedure
implemented in PHYML. A Bayesian tree analysis on a subset of 29
taxa was performed using MrBayes 3.2 (9) with a mixed model of
amino acid substitution and a gamma distribution (eight discrete
categories and an estimated proportion of invariant sites) to take
into account among-site rate variation. MrBayes was run with four
chains for 1 million generations and trees were sampled every 100
generations. To construct the consensus tree, the first 1,500 trees
were discarded as burn-in. The reduction of the taxonomic sampling was necessary to reduce the computation time.
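The sampling scheme above implies some simple bookkeeping, sketched here as a worked calculation. It assumes one sampled tree per 100 generations, and does not count an initial-state sample that a program may also write. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingSummary:
    sampled_trees: int        # trees written during the run
    burn_in_trees: int        # trees discarded before summarizing
    retained_trees: int       # trees contributing to the consensus
    burn_in_generations: int  # generations covered by the discarded trees
    burn_in_fraction: float   # discarded share of all sampled trees


def summarize_sampling(generations: int, sample_every: int, burn_in_trees: int) -> SamplingSummary:
    sampled = generations // sample_every
    if burn_in_trees > sampled:
        raise ValueError("burn-in exceeds the number of sampled trees")
    return SamplingSummary(
        sampled_trees=sampled,
        burn_in_trees=burn_in_trees,
        retained_trees=sampled - burn_in_trees,
        burn_in_generations=burn_in_trees * sample_every,
        burn_in_fraction=burn_in_trees / sampled,
    )


# Values stated in the text: 1 million generations, sampling every 100,
# first 1,500 trees discarded.
summary = summarize_sampling(1_000_000, 100, 1_500)
```

Under these assumptions the run yields 10,000 sampled trees, of which the first 1,500 (the first 150,000 generations, or 15%) are discarded and 8,500 remain for the consensus tree.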
6. Edgar RC (2004) MUSCLE: Multiple sequence alignment with high accuracy and high
throughput. Nucleic Acids Res 32:1792–1797.
7. Philippe H (1993) MUST, a computer package of management utilities for sequences
and trees. Nucleic Acids Res 21:5264–5272.
8. Guindon S, Gascuel O (2003) A simple, fast, and accurate algorithm to estimate large
phylogenies by maximum likelihood. Syst Biol 52:696–704.
9. Ronquist F, Huelsenbeck JP (2003) MrBayes 3: Bayesian phylogenetic inference under
mixed models. Bioinformatics 19:1572–1574.

[Fig. S1 image not reproduced: panels A–C with bootstrap values and scale bars (0.5, 0.1, and 100 aa). Leaf labels recoverable from the image include the N. maritimus ORFs Nmar0121, Nmar0167, Nmar0190, Nmar0273, Nmar0300, Nmar0324, Nmar0361, Nmar0516, Nmar0621, Nmar0718, Nmar0732, Nmar0747, Nmar0762, Nmar0902, Nmar1087, Nmar1789, and Nmar1913 among plastocyanin-like (blue copper) proteins, and Nmar0164, Nmar0169, Nmar0175, Nmar0179, Nmar0218, Nmar0740, Nmar0752, Nmar1590, Nmar1606, Nmar1805, and Nmar1863 among DsbA/BdbD-type thiol-disulfide oxidoreductases and protein-disulfide isomerases, together with homologs from C. symbiosum and other archaea and bacteria.]

Fig. S1. Phylogeny of plastocyanin-like protein sequences. Sequences with significant matches to COG3794 (PetE: Plastocyanin [Energy production and
conversion]) were used to query the non-redundant protein sequence database from NCBI. Sequences from the top 20–30 non-N. maritimus hits were retrieved and their match to the above conserved domain model verified. Sequences from experimentally characterized proteins were obtained from the
available literature and included, aligned with ClustalW and then curated manually. Distance-based phylogenies were inferred in Phylip using the Neighbor-Joining algorithm and 100 bootstrap replicates. Bootstrap support values >60% are displayed. Nodes with <50% bootstrap support were collapsed.

Fig. S2. Archaeal ammonia monooxygenase AmoB sequence mapped onto the crystal structure of the particulate methane monooxygenase (PDB accession
code 1YEW). The pmoA subunit is shown in dark gray, the pmoC subunit in light gray, and the pmoB subunit in red and pink. The red part represents the
region of pmoB conserved in the predicted N. maritimus AmoB. The transmembrane helix and C-terminal cupredoxin domain shown in pink are missing in the
predicted N. maritimus AmoB. Cyan spheres represent copper ions. The grey sphere represents a zinc ion.


Enzyme labels and assigned genes shown in Fig. S3:
Acetoacetyl-CoA β-ketothiolase (Nmar_0841 or Nmar_1631)
3-Hydroxybutyryl-CoA dehydrogenase (Nmar_1028)
Crotonyl-CoA hydratase (Nmar_1308)
Acetyl-CoA carboxylase (Nmar_0272, 0273, 0274)
4-Hydroxybutyryl-CoA dehydratase (Nmar_0207)
Malonyl-CoA reductase, malonate semialdehyde reductase (unknown)
4-Hydroxybutyryl-CoA synthetase (Nmar_0206)
3-Hydroxypropionyl-CoA synthetase, 3-hydroxypropionyl-CoA dehydratase, acryloyl-CoA reductase (unknown)
Propionyl-CoA carboxylase (Nmar_0272, 0273, 0274)
Succinate semialdehyde reductase (Nmar_1110 or Nmar_0161)
Succinyl-CoA reductase (Nmar_1608)
Methylmalonyl-CoA epimerase, methylmalonyl-CoA mutase (Nmar_0953, 0954, 0958)

Fig. S3. Proposed 3-hydroxypropionate/4-hydroxybutyrate cycle for autotrophic carbon xation by N. maritimus.


[Fig. S4 image not reproduced: panels A and B show trees with branch support values and 0.1 scale bars. Group labels in the image are Eucarya, Korarchaeota, Thaumarchaeota, Crenarchaeota, and Euryarchaeota; order-level labels are Thermoproteales, Desulfurococcales, Sulfolobales, Nanoarchaeota, Thermococcales, Methanopyrales, Methanobacteriales, Methanococcales, Thermoplasmatales, Archaeoglobales, Halobacteriales, Methanomicrobiales, and Methanosarcinales. Candidatus Nitrosopumilus maritimus, Cenarchaeum symbiosum, and Candidatus Korarchaeum cryptofilum appear as leaves in both panels.]

Fig. S4. (A) Maximum-likelihood phylogeny of Group 1 Archaea. The phylogeny was inferred using an alignment of concatenated R-proteins (66 taxa, 6,142
positions). WAG+Inv+Gamma (4 classes); 100 replicates. (B) Bayesian tree of mesophilic Group 1 Archaea inferred using an alignment of concatenated R-proteins (29 taxa, 6,142 positions). Mixed model + Gamma (4 classes); 100 replicates.

Other Supporting Information Files


Table S1 (DOC)
Table S2 (DOC)
Table S3 (DOC)
Table S4 (DOC)
Table S5 (DOC)
Table S6 (DOC)

