
Archaeal Chromosome

Nikhil A Thomas, Queen's University, Kingston, Ontario, Canada
David M Faguy, University of New Mexico, Albuquerque, New Mexico, USA
Ken F Jarrell, Queen's University, Kingston, Ontario, Canada
The archaeal chromosome consists of a single circular molecule within the size range reported for bacterial chromosomes. However, within the archaeal genome are both bacterial-like genes and eukaryotic-like genes, as well as archaeal-specific genes, all transcribed from promoters very different from those of bacteria.

Secondary article
Article Contents
. Introduction
. General Features of the Sequenced Archaeal Genomes
. Size and State of the Genome
. Origin of Replication
. Repetitive and Insertional Elements
. DNA Replication and Chromosome Segregation
. Promoters
. Archaeal Histone Proteins
. Nucleosome Structure
. Future Outlook

Introduction
The Archaea represent a third line of evolutionary descent. While prokaryotic in cell form, they are as distinct at the molecular level from the other prokaryotic domain, the Bacteria, as they are from the third domain, the Eukarya. The domain Archaea contains two kingdoms, the Euryarchaeota (comprising methanogens, extreme halophiles and some hyperthermophiles) and the Crenarchaeota (originally consisting of only certain hyperthermophiles but now known to include a variety of nonthermophilic members). Most cultured archaea have been isolated from environments that are extreme with regard to temperature, pH, salt concentration or anaerobiosis. However, recent molecular techniques have indicated that archaea are much more widespread in nature, being significant components of normal environments, such as ocean waters and soils, and even occurring as symbionts of higher organisms. It is clear that the role of archaea in nature has been severely underestimated.
By the end of 1999, five archaeal genomes had been completely sequenced and published (Aeropyrum pernix, Archaeoglobus fulgidus, Methanobacterium thermoautotrophicum, Methanococcus jannaschii and Pyrococcus horikoshii). In addition, others are complete but as yet unpublished (Pyrobaculum aerophilum, Pyrococcus abyssi) and many other archaeal genome-sequencing projects are under way (Halobacterium salinarum, Methanosarcina mazei, Sulfolobus solfataricus, Thermoplasma acidophilum, Thermoplasma volcanium). The five published sequences are all from thermophilic or hyperthermophilic archaea and all but A. pernix are from the Euryarchaeota kingdom. A more generalized view of the archaeal genome will come when members of more diverse groups, such as mesophiles and extreme halophiles, are also included.

General Features of the Sequenced Archaeal Genomes


The five completely sequenced and published archaeal genomes utilize 89–92% of the genome for encoding proteins. Similarly, bacterial genomes typically utilize 87–93% of the genome for protein-encoding open reading frames (ORFs) (with a low of 75% for Rickettsia prowazekii and a high of 95% for Thermotoga maritima). The archaeal genomes tend to have a somewhat higher density of ORFs per kilobase of genome (1.0–1.6) than bacteria, which range from 0.75 to 1.0. This is reflected in the average size of a gene in archaea, which is about 800 bp, compared with bacterial genes, which are usually about 1 kbp (Table 1). However, since all five sequenced archaeal genomes are from thermophilic or hyperthermophilic members, this may be a feature of organisms that inhabit these high-temperature environments rather than a trait of the whole domain. The two hyperthermophilic bacteria with sequenced genomes (T. maritima and Aquifex aeolicus) have 0.98–1.0 ORFs per kilobase but the average gene size is still 950 bp. The availability of nonthermophilic archaeal genome sequences in the near future may answer the question of whether archaea in general have smaller average genes.
In general, the genomes of archaea contain bacterial-like genes and eukaryal-like genes, as well as genes that appear to be archaeal specific. Bacterial-like genes of the archaea include ones involved in metabolism, small molecule biosynthesis, transport and regulation. Archaeal genes that are more similar to eukaryal genes include ones involved in transcription, translation and DNA metabolism. Archaeal-specific genes include ones involved in flagellation, methanogenesis and the synthesis of unique lipids and wall components, as well as many that are still undefined. In addition, there are genes that are common to all life, such as ones for the partitioning of genetic material. Recent evidence indicates that lateral gene transfer has occurred between archaea and bacteria in both directions, an observation that muddies the distinction between archaeal and bacterial genes.
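The ORF densities quoted above follow directly from genome size and ORF count. The following is a minimal sketch of that arithmetic, using genome sizes and ORF counts copied from Table 1 for an arbitrary selection of organisms; the function names are illustrative only.

```python
# Genome size (Mb) and number of protein-encoding ORFs, copied from Table 1
# for a handful of organisms. The selection is arbitrary and illustrative.
GENOMES = {
    "Aeropyrum pernix": ("Archaea", 1.67, 2694),
    "Archaeoglobus fulgidus": ("Archaea", 2.18, 2436),
    "Methanobacterium thermoautotrophicum": ("Archaea", 1.75, 1855),
    "Methanococcus jannaschii": ("Archaea", 1.66, 1682),
    "Pyrococcus horikoshii": ("Archaea", 1.74, 2061),
    "Escherichia coli": ("Bacteria", 4.64, 4288),
    "Rickettsia prowazekii": ("Bacteria", 1.11, 834),
    "Thermotoga maritima": ("Bacteria", 1.87, 1877),
}


def orfs_per_kb(size_mb, n_orfs):
    """Return ORF density in ORFs per kilobase of genome."""
    return n_orfs / (size_mb * 1000.0)


def mean_density(domain):
    """Average ORF density over the listed genomes of one domain."""
    values = [
        orfs_per_kb(size, n)
        for dom, size, n in GENOMES.values()
        if dom == domain
    ]
    return sum(values) / len(values)


if __name__ == "__main__":
    for name, (domain, size, n) in GENOMES.items():
        print(f"{name:40s} {domain:9s} {orfs_per_kb(size, n):.2f} ORFs/kb")
```

Because only a subset of Table 1 is included, the domain averages from this sketch are indicative rather than exact.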
As with bacteria, the study of the completed genomes of archaea reveals that many genes are organized into multigene transcriptional units, with ribosomal-binding

ENCYCLOPEDIA OF LIFE SCIENCES / © 2001 Nature Publishing Group / www.els.net


Table 1 Comparison of sequenced genomes of Archaea and Bacteria

Organism                                Genome size (Mb)  Protein-encoding ORFs (no.)  ORF per kb
Archaea
Aeropyrum pernix                        1.67              2694                         1.6
Archaeoglobus fulgidus                  2.18              2436                         1.1
Methanobacterium thermoautotrophicum    1.75              1855                         1.1
Methanococcus jannaschii (a)            1.66              1682                         1.0
Pyrococcus horikoshii                   1.74              2061                         1.2
Bacteria
Aquifex aeolicus                        1.55              1512                         0.98
Bacillus subtilis                       4.2               4100                         0.98
Borrelia burgdorferi                    0.91              853                          0.94
Chlamydia pneumoniae                    1.23              1073                         0.87
Chlamydia trachomatis                   1.04              894                          0.86
Escherichia coli                        4.64              4288                         0.92
Helicobacter pylori J99                 1.64              1495                         0.91
H. pylori 26695                         1.67              1590                         0.95
Haemophilus influenzae                  1.83              1743                         0.95
Mycobacterium tuberculosis              4.4               3924                         0.89
Mycoplasma genitalium                   0.58              470                          0.81
Mycoplasma pneumoniae                   0.82              677                          0.83
Rickettsia prowazekii                   1.11              834                          0.75
Synechocystis sp.                       3.57              3168                         0.89
Thermotoga maritima                     1.87              1877                         1.0
Treponema pallidum                      1.14              1041                         0.91

Protein-coding region (%): 88.8 92.2 92 90.7 93 87 93 87.8 90.8 91 85 91 88 88.7 75.4 87 95 92.9
Average size ORF (bp): 711 822 816 956 890 992 951 998 945 900 1040 1011 1005 978 947 1041

(a) This does not include plasmid-encoded genes of M. jannaschii.

sites preceding them, that are expressed as long, noncapped messenger RNAs (mRNAs) with only short bacterial-like poly(A) tails. Surprisingly, some operons have conserved gene arrangements between archaea and bacteria, for example, ones involved in chemotaxis and pyrimidine biosynthesis, despite the fact that the structure of these operons is not well conserved within bacteria.
The regulation of gene expression in many archaeal species is generally not well understood. In the complete genomes of M. thermoautotrophicum and A. fulgidus, there are over 20 examples of two-component sensor kinase–response regulator proteins, while there are no representatives of these proteins in the genome of M. jannaschii. Several gene-specific transcriptional regulators have been described in archaea. These include GvpE (gas vesicle synthesis) in H. salinarum, the T6 repressor of the Haloarcula phage ΦH, ArcR (arginine fermentation) in H. salinarum, and Tfx (formylmethanofuran dehydrogenase complex) of M. thermoautotrophicum. In addition, a repressor-binding site that regulates nitrogen fixation gene expression in Methanococcus maripaludis has been reported.
Whereas in the Bacteria and Eukarya many copies of the ribosomal RNA (rRNA) gene clusters have been reported (e.g. Escherichia coli has seven), the Archaea studied often have only one copy.
Analysis of the completed genomes has revealed a number of cases of genes that are apparently missing, as well as the unexpected presence of certain other genes. For example, Hsp70 (DnaK) is a member of a set of proteins that undergoes increased synthesis in response to a variety of stresses including heat shock. Hsp70 has been found in all members of the domains Bacteria and Eukarya in which it has been searched for. Surprisingly, dnaK has not been detected in many archaea, whether by polymerase chain reaction (PCR), Southern blotting or analysis of several complete genome sequences (A. fulgidus, M. jannaschii, P. horikoshii). It has been found in T. acidophilum, M. thermoautotrophicum, M. mazei and Haloarcula marismortui. The observations that dnaK is missing from the crenarchaeotes and that its distribution is haphazard in the euryarchaeal groups suggest that archaeal dnaK homologues were derived from bacterial donors through lateral


transfer. Fibrillarin is a protein contained in the nucleus of eukaryotes and associated with the small nucleolar RNAs. Surprisingly, it has also been found in Methanococcus voltae and M. vannielii by Southern hybridization techniques and in complete genome sequences. DNA repair system genes are also underrepresented in the genome of M. jannaschii.
Archaea, like bacteria, have restriction/modification (R/M) systems to distinguish host DNA from foreign DNA. All the archaeal genomes thus far sequenced have several type I R/M systems. In addition, A. fulgidus has a type III R/M system. H. salinarum and M. jannaschii have several type II systems. These systems are very similar to bacterial systems, with isoschizomers from different domains being more similar to each other than nonisoschizomers from related organisms. All of these systems were identified by comparative genomics (based on homology to bacterial R/M systems). It is possible that archaeal-specific R/M systems remain to be discovered by biochemical means.

Size and State of the Genome


The archaeal chromosome consists of a single circular molecule that falls within the size range established for bacterial chromosomes (Table 2). Many archaea also have plasmids and in certain cases, most prominently in certain extreme halophiles, these are very large megaplasmids (up to 700 kb) that contain a substantial portion of the total genome content. These megaplasmids are unusual in that they appear to contain essential genes. In Halobacterium species NRC-1, two large plasmids have been found and one, pNRC100 (191 kb), has been completely sequenced. Analysis of the sequence indicated that many genes were typical plasmid genes encoding replication and partitioning functions, while other genes were responsible for important but not essential functions such as gas vesicle production. However, an unusual finding was the discovery on the plasmid of a number of genes thought to be essential which were not found on the chromosome (i.e. they were not extra copies). These genes included ones for a terminal cytochrome oxidase needed for the last step in respiration as well as genes required for the synthesis of deoxyribonucleotides. A plausible explanation for these attributes is that the plasmid acquired these genes by integration into the chromosome in an insertion element-mediated fashion, followed by faulty excision in a manner similar to the formation of an F′ factor in E. coli. Characterization of these megaplasmids has blurred the traditional difference between the definitions of plasmids and chromosomes.
The halobacterial genome is unusual in other respects as well. It is known that phenotypic variants can arise at an astonishingly high frequency of about 1 in 100. This occurs due to the transposition of halobacterial insertion sequences as well as rearrangements mediated by these insertion sequences. However, a recent study has indicated that the entire chromosome is not subject to this high rate of variation; it is confined to a 240-kb portion of the genome and the megaplasmids. Presumably, most of the essential genes are located in the stable part of the halobacterial genome. It may also be that the hypervariable regions of the chromosome have lower gene density. Confirmation of these assumptions awaits the complete sequencing of the genome of H. salinarum.
Bacterial DNA, as well as the DNA present in eukaryal chromatin, is negatively supercoiled. An interesting observation is that at least in some archaeal hyperthermophiles there are positively supercoiled DNA molecules; this has been observed for the DNA of the archaeal virus SSV1 as well as for several plasmids. An unusual topoisomerase, called reverse gyrase, has been detected in extracts of all organisms that grow above 75°C (bacteria as well as archaea). In vitro, this enzyme catalyses the introduction of positive supercoils into a closed circular DNA molecule at high temperatures. Because reverse gyrase appears to be limited to very thermophilic and hyperthermophilic organisms, it has been suggested that its biological role is to stabilize the DNA duplex against denaturation at high growth temperatures by incorporation of positive supercoils. However, since topologically closed DNA is known to be extremely resistant to heating, the role of reverse gyrase cannot simply be to prevent overall DNA melting, but perhaps it helps to prevent localized strand separation as might occur in AT-rich segments. Other suggestions for the biological role of reverse gyrase have focused on more specific roles such as maintenance of genetic stability.
Of particular interest is how the genomes of hyperthermophiles avoid melting at their normal growth temperature near 100°C. Stabilization of DNA at elevated temperatures, i.e. preventing the helices from separating, may be brought about by intrinsic or extrinsic factors.
Apparently there is no correlation between the G + C content of a genome and the optimal growth temperature, despite the expectation that hyperthermophiles would have genomes with a high G + C content. This is in contrast to the sequences of the 16S and 23S rRNA genes, which do show elevated G + C content. In fact, many hyperthermophilic archaea have a low genomic G + C content; for example, M. jannaschii grows optimally at 88°C and has a G + C content of only 31%. Since increased G + C content does not appear to be the solution, stabilization of chromosomal DNA at high temperatures appears to be brought about by the presence of extrinsic factors such as polyamines, potassium ions and histone-like proteins. For example, it has been shown that thermodegradation of DNA can be prevented significantly by the presence of KCl. Several hyperthermophilic archaea have been shown to contain intracellular potassium concentrations in the range of 0.5–1.0 mol L⁻¹, which should result in protection of their genome at their high growth temperature.
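The G + C comparisons above, and the idea that AT-rich segments are the likeliest sites of localized strand separation, can be illustrated with a short calculation. This is a minimal sketch only: the function names, the window length and the G + C threshold are assumptions chosen for illustration, not values taken from the studies discussed.

```python
def gc_percent(seq):
    """Percentage of G + C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / total


def at_rich_windows(seq, window=20, max_gc=20.0):
    """Start positions of windows whose G + C percentage is at most max_gc.

    The window length and threshold are illustrative assumptions.
    """
    hits = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        try:
            if gc_percent(chunk) <= max_gc:
                hits.append(start)
        except ValueError:
            # Window made up entirely of ambiguous bases: skip it.
            continue
    return hits
```

Applied to a whole genome sequence, `gc_percent` gives the genomic G + C content, while `at_rich_windows` flags candidate segments where local melting would be expected to begin.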

Table 2 Estimated genome sizes in Archaea

Organism                                Size (kb)
Hyperthermophiles (growth optimum above 80°C)
Acidianus ambivalens                    1855 (PFGE(a))
Acidianus infernus                      1829 (PFGE)
Aeropyrum pernix                        1670 (sequenced)
Archaeoglobus fulgidus                  2180 (sequenced)
Archaeoglobus lithotrophicus            1891 (PFGE)
Archaeoglobus profundus                 1813 (PFGE)
Methanococcus igneus                    1658 (PFGE)
Methanococcus jannaschii                1660 (sequenced)
Pyrobaculum aerophilum                  2220 (sequenced)
Pyrococcus abyssi                       1800 (sequenced)
Pyrococcus furiosus                     2100 (sequencing in progress)
Pyrococcus horikoshii                   1800 (sequenced)
Pyrococcus sp. KOD1                     2036 (PFGE)
Pyrodictium abyssi                      1627 (PFGE)
Sulfolobus solfataricus                 3050 (sequencing in progress)
Stygiolobus azoricus                    1543 (PFGE)
Thermococcus celer                      1890 (PFGE)
Thermophiles (growth optimum 50–79°C)
Acidianus brierleyi                     1880 (PFGE)
Metallosphaera sedula                   1890 (PFGE)
Metallosphaera prunae                   1879 (PFGE)
Methanobacterium thermoautotrophicum    1750 (sequenced)
Sulfolobus acidocaldarius               2760 (PFGE)
Sulfolobus metallicus                   1932 (PFGE)
Sulfolobus shibatae                     3010 (PFGE)
Thermoplasma acidophilum                1700 (sequencing in progress)
Mesophiles (growth optimum below 50°C)
Halobacterium salinarum                 4000 (plus megaplasmids) (sequencing in progress)
Haloferax mediterranei                  2900 (plus plasmids of 490, 320, 130)
Haloferax volcanii                      2920 (plus plasmids of 690, 442, 86, 6.4)
Methanococcus voltae                    1880 (PFGE)
Methanosarcina mazei                    2800 (sequencing in progress)

(a) Determined by pulsed-field gel electrophoresis.

Origin of Replication
Bacteria replicate their chromosomes from a single origin of replication, whereas eukaryotes replicate from multiple origins. When growing rapidly, bacteria have many rounds of replication initiated per round of cell division, whereas eukaryotes replicate their DNA only once per cell division.
There are no direct biochemical or genetic data on archaeal replication origins. However, putative chromosomal origins have been identified by employing the information in so-called cumulative skew plots. Organisms with a single origin of replication will accumulate more G than C in the leading strand of replication (due to a difference in mutational bias between leading and lagging strands). In bacteria, plotting of this GC skew accurately correlates with known origin and termination sites, with the maximum reached at the terminus and the minimum occurring at the replication origin. Using this technique, it was found that P. horikoshii and M. thermoautotrophicum genomes also had this typical bacterial-like pattern, but A. fulgidus and M. jannaschii did not. Genome rearrangements randomize this signal so that genomes that undergo frequent rearrangements will have this pattern obscured. Interestingly, A. fulgidus GC skew plots revealed two minima and two maxima, resembling the behaviour of eukaryotic chromosomes.
This method was refined by finding oligomers that produced plots with less noise than GC skew plots (Lopez et al., 1999). With this method, it was possible to tentatively locate the putative origin in both M. thermoautotrophicum and P. horikoshii in a region very near homologues of the eukaryotic DNA replication initiation genes cdc6/orc. This site contains multiple copies of a 13-bp repeat distributed around an internal stretch of nucleotides that is AT-rich. In M. thermoautotrophicum, the AT-rich element is TTATTATTAAAAATTT. The consensus sequence for the repeat element is t/cTa/cCAg/cTgGAAAT. In M. thermoautotrophicum the predicted origin of replication is located between ORF 1410 and cdc6 near position 1.277 Mb on the 1.75-Mb chromosome. Although the predicted site is highly conserved in M. thermoautotrophicum, P. horikoshii and P. furiosus, a similar site was not found in A. fulgidus or M. jannaschii. Also, no homologue to cdc6/orc was found in the M. jannaschii genome.
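The cumulative GC skew approach described above can be sketched in a few lines. This is an illustrative simplification, not the procedure used in the cited studies: the window size, the function names and the treatment of the sequence as a linearized chromosome starting at an arbitrary position are all assumptions.

```python
def cumulative_gc_skew(seq, window=1000):
    """Cumulative (G - C) / (G + C) over consecutive non-overlapping windows."""
    seq = seq.upper()
    cumulative = []
    total = 0.0
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        if g + c:
            total += (g - c) / (g + c)
        cumulative.append(total)
    return cumulative


def predict_origin_terminus(seq, window=1000):
    """Return (origin, terminus) as coordinates of the cumulative minimum and maximum.

    Each coordinate is the end of the window at which the extreme is reached.
    """
    cumulative = cumulative_gc_skew(seq, window)
    if not cumulative:
        raise ValueError("sequence is shorter than one window")
    indices = range(len(cumulative))
    origin_window = min(indices, key=cumulative.__getitem__)
    terminus_window = max(indices, key=cumulative.__getitem__)
    return (origin_window + 1) * window, (terminus_window + 1) * window
```

A genome with a single origin should give one clear minimum and one clear maximum; several minima and maxima, as reported for A. fulgidus, would show up as a cumulative curve with more than one turning point of each kind.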

Repetitive and Insertional Elements


Repetitive DNA is a hallmark of eukaryotic chromosomes (especially those of higher eukaryotes), but it is generally scarcer in prokaryotes. However, several classes of repetitive elements have been described in prokaryotic genomes. Repeats present in archaea include short tandem repeats in noncoding sequence, insertion sequences (IS elements), multigene families and proteins with repetitive sequence elements.
Perhaps the most interesting class of repeat is the SR class of short direct repeats in noncoding regions. These are found in all completely sequenced archaeal genomes (as well as in Sulfolobus and Haloferax) and also in the bacteria T. maritima and A. aeolicus (Table 3). Similar repeats are also found (although in smaller numbers) on several plasmids found in archaea (e.g. pHV4 of H. volcanii, Mojica et al., 1995 and pNOB8 of S. solfataricus, She et al., 1998). This class consists of dozens to hundreds of copies of a 25–30 bp short direct repeat interspersed with about 40 bp of unique sequence. The direct repeat often has dyad symmetry and clusters of these short repeats often have a longer repeat at one end. While no function has been shown for these repeats, their conserved (and unusual) structure and, in some cases, sequence suggest some biological function. Work with H. volcanii did show that a plasmid containing the repeats caused significant defects in cell viability and altered the pattern of chromosome segregation (Mojica et al., 1995).
Several archaea (e.g. halophiles and Sulfolobus spp.) have large numbers of repeated IS elements, while others (M. thermoautotrophicum) do not appear to have any. The typical IS structure has terminal inverted repeats, flanked by short direct repeats. The IS elements often encode a transposase to facilitate mobility. Work on S. solfataricus has identified six new IS elements and shown that genomic rearrangements due to IS elements may be frequent (Schleper et al., 1994). These IS elements are also found on plasmids in Sulfolobus and Halobacterium and may promote their insertion into the chromosome (She et al., 1998).
The only introns found to date in archaea are those in stable RNAs like 16S and 23S rRNAs and transfer RNAs (tRNAs). No eukaryotic spliceosomal introns, group I or group II self-splicing introns (found in both bacteria and eukaryotes) have been identified in archaea. Introns in archaeal tRNAs are common and widespread and they appear to have many similarities to eukaryotic nuclear tRNA introns. Recent evidence suggests that their splicing mechanisms may be quite similar. Some introns in rRNA genes of archaea are able to transfer to non-intron-containing 23S rRNA genes. An interesting case of a mobile intron in archaea is the intron in the 23S rRNA gene of Desulfurococcus mobilis. This intron can spread through a culture of Sulfolobus cells. Efforts are under way to use this mobile intron in the construction of vectors for use in crenarchaeotes.
Although archaea have no introns in protein-encoding genes, inteins (internal regions that are self-spliced out of the mature protein) are quite common. Some archaeal genomes (M. jannaschii and P. horikoshii) have more than 20 identified inteins, including as many as three in one gene (replication factor C in M. jannaschii), while other genomes (A. fulgidus) have no identified inteins.
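The SR-type repeats described earlier in this section (a short direct repeat recurring at regular intervals, separated by unique sequence) lend themselves to a simple computational search. The sketch below is an assumption-laden illustration rather than a published method: it requires exact repeat copies, and the function name, the repeat length and the spacer limits are hypothetical parameters chosen to match the ranges quoted in the text.

```python
from collections import defaultdict


def find_spaced_repeats(seq, k=25, min_copies=3, min_spacer=30, max_spacer=55):
    """Find exact k-mers that recur with a regular spacer between copies.

    Returns a dict mapping each qualifying k-mer to its start positions.
    A k-mer qualifies when it occurs at least min_copies times and every
    gap between consecutive copies lies within [min_spacer, max_spacer].
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    hits = {}
    for kmer, starts in positions.items():
        if len(starts) < min_copies:
            continue
        # Spacer length = distance between copies minus the repeat itself.
        gaps = [b - a - k for a, b in zip(starts, starts[1:])]
        if all(min_spacer <= gap <= max_spacer for gap in gaps):
            hits[kmer] = starts
    return hits
```

Indexing every k-mer of a whole genome in memory is wasteful, and real repeat copies may differ at a few positions, so a practical search would need approximate matching and a more economical index.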

DNA Replication and Chromosome Segregation


The processes of DNA replication and chromosome segregation are very different in bacterial and eukaryotic cells. Although little is known of these processes in archaea, they appear to have characteristics of both bacteria and eukaryotes. Most of our information about these processes comes from complete genomes and gene sequences. These complete sequences suggest that most of the genes likely to be involved in DNA replication are of the eukaryotic type,

while genes likely to be involved in chromosome partitioning are a mix of bacterial and eukaryotic types.
One of the most puzzling observations arising from the analysis of the first completed archaeal genome (M. jannaschii) was the detection of only one sequence homologous to known DNA polymerases. Since bacteria and eukaryotes both utilize several DNA polymerases for replication and repair of the chromosome, the possibility of only a single enzyme to fulfil all the requirements of DNA polymerases was surprising. However, recently, a novel DNA polymerase family, which does not show homology to any of the known DNA polymerase families, has been identified in the Archaea (Ishino et al., 1998). This archaeal DNA polymerase consists of a heterodimer with excellent primer extension capability as well as 3′ to 5′ exonuclease activity, suggesting that it is this DNA polymerase that is important for DNA replication in the archaea. Homologues of the two subunits comprising this DNA polymerase have been detected in all the complete archaeal genome sequences. Sometimes the two subunits are found in an operon (as in P. furiosus and P. woesei) and sometimes they are found well separated on the chromosome (as in M. jannaschii).
Several other enzymes involved in DNA replication in eukaryotes have been identified in archaeal complete genomes, including DNA helicase, DNA ligase, replication factor C, PCNA (proliferating cell nuclear antigen/sliding clamp), cdc6 (a component of the origin recognition complex), and mcm proteins (minichromosome maintenance proteins or replication licensing factors). Few homologues of bacterial replication proteins have been identified in archaeal complete genomes.
While DNA replication in archaea appears very eukaryotic in character, chromosome partitioning has both bacterial and eukaryotic characteristics. Several proteins involved in mitosis (eukaryotic chromosome partitioning) have been identified in archaeal genomes. A homologue of the pelota protein (affecting spindle formation in Drosophila) has been identified in all of the sequenced archaeal genomes (and in S. solfataricus). SMC (structural maintenance of chromosomes) proteins, which are involved in chromosome condensation and segregation in both bacterial and eukaryotic cells, have also been identified in archaeal genomes. In addition, bacterial proteins (the minD/soj/parA/parB family) involved in nucleoid partitioning have homologues (often several in each genome) in archaea.

Table 3 Short, interspersed tandem repeat elements in archaeal genomes

Organism                 Copies/genome      Number of repeat clusters  Inter-repeat distance (bp)  Repeat sequence (consensus)
A. fulgidus              150                3                          ~40                         CTTTCAATCcCATTTTggtCTGATTTCAACtctta
M. jannaschii            188                18                         31–51                       TTAAAATCAGACCGTAAAAATCTAATAC
M. thermoautotrophicum   171                2                          34–38                       ATTTCAATCCCATTTTGGTCTGATTTAAC
H. volcanii                                                            33–39                       GTTTCAGACGAACCCTTGTGGGGTTGAAGC
H. mediterranei                                                        33–39                       GTTACAGACGAACCCTAGTTGGGTTGAAGC
P. horikoshii            108                3                          32–40                       GTTTCCGTAGAACTTAGTAGTGTGGAAAG
S. solfataricus          at least 196(a)    at least 3(a)              ~40                         GATTAATCCaAAAAGGAATTGAAAG
T. maritima(b)           143                8                          39–40                       TTTCCATACCTCTAAGGAATTATTGAAACA

(a) Based on the approximately 80% of the S. solfataricus genome that is currently sequenced.
(b) Member of domain Bacteria.

Promoters
Archaeal promoter elements are different from the bacterial paradigm and share some similarities with the eukaryotic RNA Pol II promoter, including a TATA box element centred on −26/−27. Recent studies have


identified three sequence elements in archaeal promoters that are shared by all archaeal groups (Soppa, 1999). These are an initiation element around the transcription initiation site (INR), a TATA box centred on −26/−27 and a transcription factor B recognition element (BRE) comprising two adenines at positions −34/−33 upstream of the TATA box. At the same time, no evidence was found for a downstream promoter element sometimes found in eukaryotic promoters and recognized by a TATA-binding protein associated factor, TAF. Various studies have revealed that the TATA box does not have a strict sequence requirement: at several positions two or even three different nucleotides are compatible with high promoter activity. While normally centred at −26/−27 relative to the transcription start, a variation of ±1 or 2 in the location of the TATA box is still compatible with a faithful start, indicating that even the spacing has some flexibility.
Surprisingly, the consensus sequences for TATA, BRE and INR differ for the different subgroups of Archaea (halophiles, methanogens and the crenarchaeotes (mainly Sulfolobus)). This led to the prediction that the importance and number of basal transcription factors will not be the same in all the archaeal lines. For example, in A. fulgidus and P. horikoshii (but not in M. jannaschii or M. thermoautotrophicum) there is a gene with similarity to a TBP-binding protein which has only recently been detected in eukaryotes. This protein, TIP49, appears to be an example of an archaeal basal transcription factor that is not present in all species. In addition, analysis of the sequenced archaeal genomes has revealed a gene encoding a protein with similarity to the α subunit of the eukaryal transcription factor TFIIE. This finding indicates that other basal transcription factors may exist, despite the demonstration that an in vitro archaeal transcription system with RNA polymerase, TBP and TFB alone is possible.
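The positional rules described above (a TATA box centred on −26/−27 with a tolerance of 1 or 2 positions, and two adenines at −34/−33) can be turned into a crude promoter check. The sketch below is hypothetical throughout: because the TATA consensus differs between archaeal subgroups, it substitutes a simple AT-richness criterion for a real consensus, and the 8-bp box length, the 0.75 threshold and the function name are assumptions made only for illustration.

```python
def scan_promoter(seq, tss, tata_len=8, min_at=0.75, max_shift=2):
    """Check the upstream region of a transcription start site (TSS).

    seq  : DNA sequence (coding strand)
    tss  : 0-based index of the +1 nucleotide, so position -n is seq[tss - n]
    Returns a dict with:
      'bre'        : True if positions -34 and -33 are both adenine
      'tata_shift' : offset (in bp, positive = towards the TSS) of the most
                     AT-rich window relative to a box centred on -26/-27,
                     or None if no window reaches min_at
    """
    seq = seq.upper()
    centred_start = tss - 26 - tata_len // 2  # box centred between -27 and -26
    if tss < 34 or tss > len(seq) or centred_start - max_shift < 0:
        raise ValueError("not enough upstream sequence for this start site")

    has_bre = seq[tss - 34:tss - 32] == "AA"

    best = None  # (AT fraction, -|shift|, shift); ties favour small shifts
    for shift in range(-max_shift, max_shift + 1):
        start = centred_start + shift
        window = seq[start:start + tata_len]
        at_fraction = sum(base in "AT" for base in window) / tata_len
        if at_fraction >= min_at:
            candidate = (at_fraction, -abs(shift), shift)
            if best is None or candidate > best:
                best = candidate

    return {"bre": has_bre, "tata_shift": None if best is None else best[2]}
```

A subgroup-specific consensus, once chosen, could replace the AT-richness test without changing the positional logic.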

Archaeal Histone Proteins


In the Eukarya, DNA compaction is achieved by a higher-order protein–DNA complex termed the nucleosome. Certain DNA-binding proteins, called histones, bind to DNA, resulting in the DNA being wrapped around an oligomeric protein core. Within the Archaea, similar histone proteins have been identified and nucleosome-like structures have been observed in vivo. No nucleosome structures have been observed in the Bacteria, but histone-like (HU) proteins have been found. The complexity of eukaryal and archaeal nucleosomes is comparable and the structural units involved in DNA packing appear to be homologous (Pereira and Reeve, 1998). Furthermore, the similarities of archaeal and eukaryal transcription systems, along with nucleosome packing of DNA, are of considerable interest from the standpoint of gene expression. It is believed that the wrapping of DNA around the nucleosome core limits access to promoter and DNA-binding sites required for transcriptional activity.
Archaeal histones are 66–69 amino acids long and exhibit 60–90% identity in pairwise alignments. In addition, considerable sequence conservation is shared with eukaryal histones, particularly in a region called the histone fold (Reeve et al., 1997). While eukaryal histones have additional N- and C-terminal domains (external to the histone fold) that act as sites for posttranslational modifications, these extensions are not essential for nucleosome assembly. The histone fold is formed by two short α helices (α1 and α3, with three turns each) that flank a longer, eight turn-containing α helix (α2) from which they are separated by short β-strand loops (L1 and L2) (Reeve et al., 1997) (Figure 1).
Histones do not exist as monomers but form very stable dimers due to hydrophobic interactions between pairs of antiparallel-orientated α2 helices. Eukaryal histones form exclusively heterodimers ((H2A + H2B) or (H3 + H4)); archaeal histones, however, are capable of forming both homodimers and heterodimers. It should be noted that archaeal histones share more identity (or are less variable) and the conservation of histone sequences may permit multiple combinations for dimer formation. In contrast, residues in α helix 2 of eukaryal histones are generally more variable (albeit still hydrophobic), and may preclude homodimer formation or favour correct heterodimerization.
In addition to the sequence conservation of archaeal and eukaryal histones, nuclear magnetic resonance (NMR)-determined secondary and tertiary structures of histone B from the hyperthermophilic methanogen Methanothermus fervidus (HMfB) revealed that the lengths and spacings of the α-helical regions are directly superimposable on the same elements that form the histone fold in eukaryal nucleosome core histones. These findings and others suggest that the histones of the Eukarya and Archaea are homologous, and have evolved from a common ancestor.
Figure 1 Schematic representation of an archaeal histone. The α-helical and β-strand regions that comprise the histone are identified. The sequence for Methanothermus fervidus histone B (HMfB) is shown below. Amino acid residues denoted by an asterisk are found in all archaeal histones. The boxed serine and threonine residues found within the β strands have been demonstrated to participate in DNA binding. Figure adapted from Reeve et al. (1997).

MELPIAPIGRIIKDAGAERVSDDARITLAKILEEMGRDIASEAIKLARHAGRKTIKAEDIELAVRRFKK


The expression of archaeal histones has been best evaluated in M. fervidus. A growth rate-dependent synthesis of histones was found in M. fervidus, suggesting different roles for the various dimers in vivo. Specifically, HMfA and HMfB form both heterodimers and homodimers in vivo, but HMfA makes up as much as 80% of histone preparations synthesized during exponential phase and then decreases to 50% as cells reach stationary phase (Sandman et al., 1994). It appears that HMfA and HMfB differ in DNA-binding properties, and thus may contribute to altered states of genomic activity.
Although histones seem to be present in all euryarchaeotes so far studied, they have not been found in crenarchaeotes. Examination of the complete genome of A. pernix did not reveal any histone proteins and none have been found in Sulfolobus spp. (although they have been looked for extensively). Interestingly, there is a class of low molecular weight DNA-binding proteins in crenarchaeotes that may play the same role as histones (although they have little or no sequence similarity). The most intensively studied of this class are the Sac7 proteins from S. acidocaldarius.
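To see how the shift in HMfA abundance between growth phases could change the mix of dimers, one can work through the arithmetic under a deliberately simple assumption. The sketch below assumes that HMfA and HMfB monomers pair at random, which is not established by the studies cited here; it is only meant to show the scale of the effect, and the function name is illustrative.

```python
def dimer_fractions(fraction_a):
    """Expected dimer proportions if HMfA and HMfB pair at random.

    fraction_a is the proportion of HMfA among all histone monomers.
    Random pairing is an assumption made for illustration only.
    """
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must lie between 0 and 1")
    fraction_b = 1.0 - fraction_a
    return {
        "HMfA-HMfA": fraction_a ** 2,
        "HMfA-HMfB": 2 * fraction_a * fraction_b,
        "HMfB-HMfB": fraction_b ** 2,
    }


if __name__ == "__main__":
    for phase, share in (("exponential", 0.8), ("stationary", 0.5)):
        print(phase, dimer_fractions(share))
```

Under this assumption, HMfB homodimers would be rare during exponential growth and considerably more common in stationary phase, which is the kind of change that could alter DNA-binding behaviour if the dimers differ in their properties.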

In the Eukarya, it is the histone tetramer that recognizes nucleosome positioning signals. In particular, tandem repeats of the trinucleotide CTG are recognized by the (H3 + H4)₂ tetramer. Archaeal histones were likewise found to assemble nucleosomes in vitro centred preferentially within (CTG)6 and (CTG)8 repeats. It is not known why CTG repeats are favourable sites for nucleosome assembly, although CTG repeats are more flexible than mixed-sequence B-form DNA. In addition to specific positioning signals, DNA binding by HMf protein has been shown to occur more frequently at sites that are intrinsically curved. Specifically, the intergenic regions of methanogens contain oligo(dA) tracts which, when appropriately phased, would be suitable regions for nucleosome positioning, and could thus affect gene expression by preventing access to upstream promoters or other transcriptional elements.
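The positioning signals described above can be located in a sequence with a simple pattern search. The following is a minimal sketch, not a method from the cited work: the function names, the example sequence and the minimum oligo(dA) tract length of four are all illustrative assumptions, and only the given strand is scanned (a CTG repeat on the complementary strand would appear as CAG here).

```python
import re

def find_tandem_repeats(seq, unit="CTG", min_copies=6):
    """Return (start, end, copies) for each run of at least min_copies tandem units."""
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_copies},}}")
    hits = []
    for m in pattern.finditer(seq.upper()):
        copies = (m.end() - m.start()) // len(unit)
        hits.append((m.start(), m.end(), copies))
    return hits

def find_da_tracts(seq, min_len=4):
    """Return (start, end) for each run of at least min_len consecutive A residues.

    min_len=4 is an arbitrary illustrative threshold, not a value from the article.
    """
    pattern = re.compile(f"A{{{min_len},}}")
    return [(m.start(), m.end()) for m in pattern.finditer(seq.upper())]

# Hypothetical test sequence: a (CTG)8 repeat followed by a short oligo(dA) tract.
example = "GGATCC" + "CTG" * 8 + "TTAAAAAGC"
print(find_tandem_repeats(example))
print(find_da_tracts(example))
```

Coordinates are zero-based and half-open, so a hit can be sliced directly with `seq[start:end]`. Assessing whether oligo(dA) tracts are "appropriately phased" would additionally require comparing the spacing between hits with the helical repeat of DNA, which this sketch does not attempt.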

Future Outlook
It has become increasingly clear through the use of sophisticated molecular techniques that the members of the domain Archaea are much more widespread in the biosphere and much more diverse in nature than previously believed. It is not unreasonable to expect that the Archaea are involved as major participants in yet unknown global processes in marine, freshwater and terrestrial habitats beyond their already known importance in, for example, methanogenesis. This, coupled with their important status in evolution and their broad potential in biotechnology applications, ensures that many more archaeal genomes will be sequenced. These complete genome sequences will, in turn, provide abundant information that will ultimately help define the Archaea as a separate domain. More complete archaeal genome sequences will provide whole genomes for studies on the molecular basis of thermostability, uncover novel biochemical pathways and structures, allow the ready cloning of enzymes for potential biotechnological exploitation, present us with unprecedented quantities of data pertaining to the diversity of life and provide invaluable information for researchers studying the origin and evolution of life.

Nucleosome Structure
The eukaryal nucleosome has a protein core that is a histone octamer. This octamer is formed by two (H3 + H4) dimers that form an (H3 + H4)₂ tetramer. Two (H2A + H2B) dimers are then added at the sides of the tetramer to complete the octamer. This structure wraps 146 bp of DNA around the surface of the core to complete the nucleosome. Similarly, archaeal nucleosomes are believed to be formed by an initial tetrameric core (Pereira and Reeve, 1998). Crosslinked histone tetramers have been isolated from methanogens, suggesting that oligomeric histones are involved in nucleosome formation. The composition of the tetramer is believed to be highly variable, because the strong similarity of archaeal histones allows promiscuity in tetramer formation. Archaeal nucleosomes protect 60 bp of DNA from micrococcal nuclease digestion, but the length of DNA incorporated into these structures in vivo may be significantly larger than that protected from nuclease digestion in vitro. Electron microscopy of archaeal chromosomes reveals structures that appear similar to eukaryal nucleosomes, but they appear to be separated by protein-free DNA regions and therefore are probably not as tightly packed as their eukaryal counterparts. Nonetheless, histones comprise a significant portion of total cellular protein. It has been estimated that 0.9% of the total soluble proteins from M. thermoautotrophicum are histones, which allows for one HMt tetramer per 100 bp of the 1.75 Mb M. thermoautotrophicum genome.
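The closing estimate can be turned into absolute numbers with a few lines of arithmetic. This is a rough sketch under stated assumptions: the genome size, the spacing of one tetramer per 100 bp and the 60 bp protected length are taken from the text, while the variable names and the idea of even, non-overlapping spacing along the whole chromosome are illustrative simplifications.

```python
GENOME_BP = 1_750_000        # 1.75 Mb M. thermoautotrophicum genome (from the text)
BP_PER_TETRAMER = 100        # one HMt tetramer per 100 bp (from the text)
MONOMERS_PER_TETRAMER = 4    # a tetramer contains four histone polypeptides
PROTECTED_BP = 60            # DNA protected per archaeal nucleosome in vitro (from the text)

def tetramers_for_genome(genome_bp, bp_per_tetramer):
    """Number of tetramers if they were spaced evenly along the whole genome."""
    return genome_bp // bp_per_tetramer

tetramers = tetramers_for_genome(GENOME_BP, BP_PER_TETRAMER)
monomers = tetramers * MONOMERS_PER_TETRAMER

# Upper bound on the fraction of the genome protected, assuming every tetramer
# is DNA-bound, binding sites do not overlap, and the in vitro value applies.
protected_fraction = PROTECTED_BP / BP_PER_TETRAMER

print(tetramers, monomers, protected_fraction)
```

Under these assumptions the estimate corresponds to 17 500 tetramers (70 000 histone monomers) per genome copy, with at most about 60% of the DNA nuclease-protected, which is consistent with the protein-free regions seen between archaeal nucleosomes by electron microscopy.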

References
Ishino Y, Komori K, Cann IKO and Koga Y (1998) A novel DNA polymerase family found in Archaea. Journal of Bacteriology 180: 2232–2236.
Lopez P, Philippe H, Myllykallio H and Forterre P (1999) Identification of putative chromosomal origins of replication in Archaea. Molecular Microbiology 32: 883–886.
Mojica FJ, Ferrer C, Juez G and Rodriguez-Valera F (1995) Long stretches of short tandem repeats are present in the largest replicons of the Archaea Haloferax mediterranei and Haloferax volcanii and could be involved in replicon partitioning. Molecular Microbiology 17: 85–93.
Pereira SL and Reeve JN (1998) Histones and nucleosomes in Archaea and Eukarya: a comparative analysis. Extremophiles 2: 141–148.
Reeve JN, Sandman K and Daniels CJ (1997) Archaeal histones, nucleosomes and transcription initiation. Cell 89: 999–1002.
Sandman K, Grayling RA, Dobrinski B, Lurz R and Reeve JN (1994) Growth-phase-dependent synthesis of histones in the archaeon Methanothermus fervidus. Proceedings of the National Academy of Sciences of the USA 91: 12624–12628.
Schleper C, Roder R, Singer T and Zillig W (1994) An insertion element of the extremely thermophilic archaeon Sulfolobus solfataricus transposes into the endogenous beta-galactosidase gene. Molecular and General Genetics 243: 91–96.
She Q, Phan H, Garrett RA et al. (1998) Genetic profile of pNOB8 from Sulfolobus: the first conjugative plasmid from an archaeon. Extremophiles 2: 417–425.
Soppa J (1999) Transcriptional initiation of Archaea: facts, factors and future aspects. Molecular Microbiology 31: 1295–1305.

Further Reading
Arents G and Moudrianakis EN (1995) The histone fold: a ubiquitous architectural motif utilized in DNA compaction and protein dimerization. Proceedings of the National Academy of Sciences of the USA 92: 11170–11174.
Baumann C, Judex M, Huber H and Wirth R (1998) Estimation of genome sizes of hyperthermophiles. Extremophiles 2: 101–108.
Bernander R (1998) Archaea and the cell cycle. Molecular Microbiology 29: 955–961.
Bult CJ, White O, Olsen GJ et al. (1996) Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii. Science 273: 1058–1073.
Edgell DR and Doolittle WF (1997) Archaea and the origin(s) of DNA replication proteins. Cell 89: 995–998.
Grayling RA, Sandman K and Reeve JN (1996) DNA stability and DNA binding proteins. Advances in Protein Chemistry 48: 437–467.
Kawarabayasi Y, Hino Y, Horikawa H et al. (1999) Complete genome sequence of an aerobic hyperthermophilic crenarchaeon, Aeropyrum pernix K1. DNA Research 6: 83–101.
Kawarabayasi Y, Sawada M, Horikawa H et al. (1998) Complete sequence and gene organization of the genome of a hyperthermophilic archaebacterium, Pyrococcus horikoshii OT3. DNA Research 5: 147–155.
Klenk H-P, Clayton RA, Tomb JF et al. (1997) The complete genome sequence of the hyperthermophilic, sulphate-reducing archaeon, Archaeoglobus fulgidus. Nature (London) 390: 364–370.
Smith DR, Doucette-Stamm LA, Deloughery C et al. (1997) Complete genome sequence of Methanobacterium thermoautotrophicum delta H: functional analysis and comparative genomics. Journal of Bacteriology 179: 7135–7155.
