The biotech industry was launched on Francis Crick's infamous Central Dogma of molecular
biology, the scientific myth that organisms are hardwired in their genes, and hence, by
moving genes across species separated by billions of years of evolution, new genetically
modified organisms could be created to serve our every need.
http://www.i-sis.org.uk/isisnews/sis24.php#from
The Central Dogma has been thoroughly exploded by scientific findings accumulating since
the mid-1970s, and especially so after the human and other genomes have been sequenced
(see Living with the Fluid Genome, by Mae-Wan Ho).
We bring you the latest surprises that tell you why our health and environmental policies
based on genetic engineering and genomics are misguided; and more importantly, why the
new genetics demands a thoroughly ecological approach to life.
"GM crops are a dead end, invest in non-GM sustainable agriculture right now"
The Independent Science Panel (ISP) (see SiS 18) took its campaign for a GM-free
sustainable world to the European Parliament on 20 October 2004. One hundred and twenty
people registered for the special briefing, including 27 who crossed the Channel with the scientists
from the UK. The event made a big impression and the participants could not stop
congratulating us afterwards. We thank all our sponsors and supporters for making it such a
success. Cordis News, the official EU news service for science and technology, reported the
event the very next day with the title "Politicians, professors and protestors target sustainable
non-GM agriculture". Further media coverage was still coming in five days later.
The ISP message is crucial as GM battles are raging across the world. The high point of the
briefing was the talk by Sue Edwards, Director of the Institute for Sustainable Development,
who helped convince the Ethiopian government to adopt an organic composting, water and
soil conservation package as its main strategy for combating land degradation and poverty
throughout the country (see SiS 23). It brought home the proven successes of low-input,
health-enhancing agricultural practices that should be adopted all over the world.
Sustainable agriculture is particularly important under climate change, when oil and water, on
which industrial agriculture and, even more so, GM agriculture are heavily dependent, are
both running out. Industrial agriculture uses up to seven times as much energy per tonne of food
as organic agriculture; it also turns organic soil, which is a carbon sink, into a carbon
source, and generates other greenhouse gases that exacerbate global warming. In order to
feed the world, we must invest in sustainable, non-GM agriculture across the globe right now,
which will also ameliorate the worst consequences of climate change.
At the same time, important changes have to be made in international agencies and
institutions, which have hitherto supported the dominant model of industrial agriculture as
well as policies that work against poor countries, where farmers are also desperately in need
of secure land tenure.
The evidence against the Central Dogma has piled up to such an extent that rumblings of
"challenging the dogma" and "a new theory is needed to replace the central dogma" can even
be heard in the mainstream scientific journals. Yet Dr. Ewan Birney, who gave the Royal
Society's inaugural Francis Crick Lecture in December 2003, still paid elaborate homage to
the Central Dogma, with arrows pointing strictly one-way from DNA to RNA to protein,
leaving out all the many more arrows that point in reverse.
What are the latest surprises that the fluid and flexible genome has in store? One area is the
importance and pervasiveness of epigenetics, specifically, chemical markings on the DNA
and proteins binding to the DNA in the chromosomes that determine patterns of gene
expression, or which bits of the genetic text are actually read. That is overwhelmingly
determined by experience. In an earlier issue (SiS 20), we showed that the mother's diet and stress
can affect patterns of gene expression in the embryo and foetus, which determine the
individual's health prospects much later in life.
Now, researchers are finding genes that are marked for life in rat pups, strictly by how their
mothers care for them during their first week of life after birth (see "Caring mothers reduce
response to stress for life", this series). It leaves one in no doubt that the environment is
giving the instruction of which genes to turn on.
Only a few years ago, people were referring to the 98% or more of the genome that doesn't
code for proteins as "junk DNA". Not any more. The genome has a definite 'architecture' that
holds up beneath the fluidity. There is a high degree of non-randomness in the parts of the
genome that undergo change. While some parts are hypermutable, certain families of
sequences are 'homogenized' to be nearly identical (see "Keeping in concert", this series),
while still others are 'ultraconservative' in that they have remained absolutely unchanged in
hundreds of millions of years of evolution ("Are ultraconserved elements indispensable?" this
series). And when cells get into a tight corner metabolically speaking, there may even be
genes that mutate to get them out of it ("To mutate or not to mutate", this series).
Most of all, there is a big treasure trove within the apparent junkyard of the genome. Many
sequences that don't code for proteins are involved in regulating development and gene
expression. Many of the surprises are associated with findings that indicate most of the action
is not in proteins, but in the numerous species of RNA 'interfering' at all levels of the 'readout'
of genetic information: with the DNA, with other RNA species, and with proteins (see "RNA
subverting the genetic text", this series).
All of this goes against the very grain of the Central Dogma that posits linear, mechanistic
control. Instead, layers upon layers of chaotic complexity are coordinated, it seems, by mutual
agreement, in an incredibly elaborate, exquisite dance of life that dances itself freely and
spontaneously into being.
It is not so much that we need a new theory to replace the central dogma; it is more important
than that. We need a new way of knowing and being organisms that will prevent us from
mistaking organisms for instruments and machines. That's the real challenge.
Article first published 03/09/04
Maternal effects on the development of offspring are well known. But they are thought to be
due to nutritional and physiological factors affecting the foetus in the womb; and within the
past few years, geneticists have discovered that diet and stress can profoundly change the
pattern of gene expression in the offspring, affecting their health prospects as adults (see Diet
trumping genes, SiS 20).
A team of researchers from the Douglas Hospital Research Centre and McGill University in
Montreal, Canada, and the Molecular Medicine Centre in Edinburgh University Western
General Hospital in the UK, now report a remarkable experiment in which the behaviour of
the mother nursing her pups not only affects the pups' response to stress as adults, but is also
correlated with changes in gene expression states in brain cells that persist into adult life.
Such changes are referred to as epigenetic as they do not involve alterations in the base
sequence of DNA in the genome, only the on and off states of genes; but they can persist in the brain
cells and are passed on to all the daughter cells.
Caring mothers reduce stress response of pups
In the nest, the mother rat licks and grooms her pups, and while nursing, arches her back to
groom and lick her pups. Some mothers (high performers) tend to do these more frequently
than others (low performers). As adults, the offspring of high performers are less fearful and
show more modest responses to stress in the hypothalamus-pituitary-adrenal (HPA) neuroendocrine pathway.
Cross-fostering studies showed that the biological offspring of low performers reared by high performers resemble the offspring of high performers, and vice versa.
Maternal behaviour, therefore, alters the development of the HPA responses to stress. The
magnitude of the HPA response is a function of the corticotropin-releasing factor (CRF)
secreted by the hypothalamus, which activates the pituitary-adrenal system. This is modulated
by glucocorticoid, which feeds back to inhibit CRF synthesis and secretion, thus dampening
the HPA responses to stress. The adult offspring of high- versus low-performer mothers show
increased glucocorticoid receptor expression in the hippocampus, and enhanced sensitivity to
glucocorticoid feedback.
Previous studies indicate that the maternal behaviour of licking and grooming and arching her
back to do so while nursing increased the expression of glucocorticoid receptor (GR),
accompanied by, among other things, an increased expression of a special transcription factor,
NGF1-A, which binds to the promoter of the GR gene to increase its transcription and
expression. But how could this be transmitted from the neonate to the adult?
The answer is: through the structure of chromatin (complex of protein and DNA in the
chromosomes), and the methylation of DNA. DNA methylation is a stable chemical
modification of the cytosine in the cytosine-guanine (CpG) dinucleotides, often associated
with stable variations in gene transcription. Under-methylation of CpG dinucleotides is
associated with active transcription. The researchers decided to look at the methylation state
of the GR promoter around the binding region of the NGF1-A transcription factor in the
hippocampus of adult offspring from high and low performers.
Sure enough, they found highly significant differences in methylation, with low methylation
in offspring from high-performing mothers and high methylation in offspring from low-performing mothers, corresponding to high and low expression respectively of the GR.
Cross-fostering results in methylation patterns associated with the adoptive mother,
consistent with the change in the adult offspring's responses to stress. Moreover, these
epigenetic differences due to maternal behaviour during the first week of life persisted into
adulthood.
A clean slate at birth
Amazingly, the pups of both high- and low-performing mothers start out life epigenetically the
same at this locus. Just before birth, the entire region of the GR promoter was unmethylated in both
groups; and on day one after birth, methylation was found in the region in both groups to the same
extent.
The changes in methylation pattern then develop within the first week according to the
behaviour of the mother, and thereafter remain for the rest of the pups' lives. This finding is
consistent with earlier studies showing that the first week of postnatal life is a critical period
for the effects of early experiences on hippocampal GR expression.
The hippocampus is the emotion centre of the brain, and is believed to be responsible for
transferring memory to the rest of the brain. It is vulnerable to stress and richly supplied with
receptors for the sex hormones [2, 3].
Additional markings of the gene
Next, the researchers looked at the structure of chromatin around the GR gene, as chromatin
structure determines whether a gene is transcribed or not. Chemical modification of the
histones (major chromatin proteins) by adding an acetyl group is a well-established marker for
active chromatin around transcribed genes, which makes the DNA accessible to the transcription
enzyme complex. Again, they found highly significant differences in acetylation between the two
groups of pups. There was greater acetylation and threefold greater binding of the NGF1-A
transcription factor to the GR promoter in the adult offspring of high- compared with low-performing mothers.
Marked for life?
Now, a critical question is, are these gene-marking changes reversible? Is the adult doomed to
conditioning by the mother's behaviour towards it as a pup? The general belief is that one is
marked for life, because the DNA methylation pattern is irreversible. However, recent data from in vitro
experiments suggest that under certain circumstances, it is possible to demethylate DNA by
increasing histone acetylation with trichostatin A (TSA), a chemical inhibitor of the
deacetylating enzyme. The researchers, rather crudely, infused the adult brain with TSA by
applying the solution into the ventricle (space inside the brain), and obtained a more than 3-fold increase in
binding of the NGF1-A protein to the GR promoter in the adult offspring of low performers,
and as expected, no change in the adult offspring of high performers. Correlated
changes in the DNA methylation pattern of the GR promoter were found at the same time in the TSA-treated adults
reared by low-performing mothers, but not in those reared by high-performing
mothers. In other words, those epigenetic changes were reversed.
The next question is, is the reversal of epigenetic changes associated with a reversal in HPA
responses to stress? The answer, incredibly, is yes. The TSA treatment, crude as it was,
appeared to significantly decrease plasma corticosterone in the offspring of low performers in
response to stress.
This is all grist to the mill of the fluid and adaptive, adaptable genome [4] that makes
nonsense of the Central Dogma.
Article first published 07/09/04
References
1. Weaver ICG, Cervoni N, Champagne FA, D'Alessio AC, Sharma S, Seckl JR, Dymov S, Szyf M
and Meaney MJ. Epigenetic programming by maternal behavior. Nature Neuroscience 2004,
7, 847-54.
2. PsychEducation.org http://www.psycheducation.org/index.html
3. hyperdictionary http://www.hyperdictionary.com/medical/hippocampus
4. Ho MW. Living with the Fluid Genome, ISIS & TWN, London & Penang, 2003
According to the Central Dogma, DNA, the genetic text, is read out into RNA and RNA is
translated into protein. RNA is rather like the scribe copying and translating the sacred text to
direct the faithful.
But geneticists are now uncovering a vast underworld of heresy to the Central Dogma where
RNA agents decide which bits of text to copy, which copies get destroyed, which bits
to delete and splice together, which copies are to be transformed into a totally different message
and finally, which resulting message, which may bear little resemblance to the original text, gets translated into protein. RNAs even get to decide which parts of the sacred text to rewrite
or corrupt.
The whole RNA underworld also resembles an enormous espionage network in which genetic
information is stolen, or gets re-routed as it is transmitted, or transformed, corrupted,
destroyed, and in some cases, returned to the source file in a totally different form.
And this underworld is big, really big. The protein-coding sequence is only about 1.5% of the
human genome. Yet, around 97 - 98% of the transcriptional readout of the human genome is
non-protein-coding RNA. This estimate is based on the fact that intronic RNA makes up 95%
of the primary protein-coding transcripts on average, and there are large numbers of non-coding RNA transcripts which may represent at least half of all transcripts. Most of the
miRNAs (microRNA, see below), for example, are derived from (intergenic) regions between
genes; and almost half of all transcripts from the mouse genome are non-coding RNAs. A
similar estimate applies to the human genome [1].
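The arithmetic behind the 97 - 98% figure can be sketched roughly as follows. This is only an illustration under simplifying assumptions: it takes "at least half of all transcripts" to mean exactly half of the transcriptional output, and treats the 95% intronic share as applying uniformly to the rest. The variable names are ours, not from the cited work.

```python
# Rough check of the estimate in the text; names and the 50% split are assumptions.
intron_fraction = 0.95   # intronic share of a primary protein-coding transcript (from the text)
noncoding_share = 0.50   # assumed share of output that is non-coding transcripts ("at least half")

coding_share = 1.0 - noncoding_share                     # output from protein-coding genes
exonic_output = coding_share * (1.0 - intron_fraction)   # what survives as coding sequence
noncoding_output = 1.0 - exonic_output                   # introns plus non-coding transcripts

print(f"protein-coding output:     {exonic_output:.1%}")
print(f"non-protein-coding output: {noncoding_output:.1%}")
```

With these inputs only about one fortieth of the readout is protein-coding sequence, which is where a figure in the 97 - 98% range comes from. A larger share of non-coding transcripts would push it higher still.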
The inescapable conclusion is that the job of mediating between DNA and protein is really the
centre stage of molecular life. And who gives orders to the multitudes of RNA agents? In a
sense it is everyone and no one, because the system works by perfect intercommunication. It
is not the DNA, but rather, the particular environment in which the RNA agents find
themselves.
For the organism (organization) to survive, it needs to turn over the DNA text continuously,
adapting to the realities of its environment. In the process, it keeps certain texts invariant (see
"Are ultra-conserved elements indispensable?" this series), while changing others rapidly in
non-random ways (see "To mutate or not to mutate", this series). It also needs to keep
referring to texts that are relevant, modifying them, or updating the interpretation in keeping with
the times (see "Keeping in concert" this series).
RNA interference
RNA interference (RNAi) was first discovered in the nematode worm, C. elegans in the
1990s. Researchers noticed that injecting either sense RNA (the sequence that gets read and
translated into protein) or antisense RNA (the complementary sequence, which does not code
for protein) into the worm led to specific silencing of the gene involved. It was later found
that the phenomenon was actually caused by double-stranded RNA (dsRNA) contaminating
the sense or antisense RNA. RNAi now refers to all gene-silencing induced by dsRNA.
These include a host of other phenomena discovered at around the same time [2, 3]. For
example, a gene could be silenced, or co-suppressed, simply by introducing an extra copy
into the genome as a transgene, and transgenes themselves may be silenced either at or after
transcription. The coat protein gene of a virus transferred into a plant may protect the plant
from the virus, by silencing the virus genes.
All these phenomena are interlinked through special pathways of RNA processing that are
only just being defined (see Fig. 1). Abnormal single-stranded RNA (ssRNA) is turned into
double-stranded RNA (dsRNA) by an RNA-dependent RNA polymerase enzyme (RDRP).
The dsRNA is then chopped up into small pieces or microRNA (miRNA) by the enzyme
Dicer. The same enzyme also processes certain hairpin RNA (hpRNA) and related pre-microRNA (pre-miRNA) into miRNA. The miRNA is further processed into single-stranded
RNA that's incorporated into a multiprotein complex called RNA-induced silencing complex
(RISC). At this point, the single-stranded RNA fragment binds to the complementary part of the
messenger RNA (mRNA) and either causes the breakdown of the mRNA or prevents its translation into
protein.
Remember that all this depends on complementary base pairing, just as in DNA, so these
mechanisms could potentially exist for each and every one of the now estimated 24 500 genes
in the genome.
at the 5' end of the miRNA synthesized by a phage polymerase [4]. In addition, there are
other problems, such as avoiding interfering with non-target sequences [5], especially as
perfect base-pairing is not required, and matches of as few as 11 consecutive nucleotides can
give non-target effects.
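To see why short matches are a concern, the following sketch scans a transcript for stretches of at least 11 consecutive nucleotides that could pair with a guide RNA. It is a toy illustration, not a published off-target predictor: it considers only strict Watson-Crick pairs (no G-U wobble), and the function names and any sequences passed to it are hypothetical.

```python
# Toy off-target scan; names are illustrative and only A-U / G-C pairs are counted.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna: str) -> str:
    """Sequence that pairs perfectly with `rna`, written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]

def pairing_sites(guide: str, transcript: str, min_run: int = 11):
    """Return (start in transcript, length) for each maximal run of at least
    `min_run` consecutive nucleotides able to base-pair with the guide."""
    site = reverse_complement(guide)  # what a perfect target would look like
    hits = []
    for i in range(len(transcript)):
        for j in range(len(site)):
            # Skip positions that merely continue a run counted earlier.
            if i > 0 and j > 0 and transcript[i - 1] == site[j - 1]:
                continue
            k = 0
            while (i + k < len(transcript) and j + k < len(site)
                   and transcript[i + k] == site[j + k]):
                k += 1
            if k >= min_run:
                hits.append((i, k))
    return hits
```

Calling `pairing_sites(guide, transcript)` on an unintended transcript returns an empty list only if no stretch of 11 or more nucleotides pairs with the guide. A 21-nucleotide guide has eleven such windows, so the number of transcripts that can be hit by chance grows quickly.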
RNA-directed DNA read-out
The dsRNA involved in RNA interference can selectively silence genes at the read-out or
transcription stage [6]; dsRNA species homologous to promoters are involved in crippling the
promoter by methylation (adding methyl (-CH3) groups) in the region of sequence overlap, so
no transcription can occur. In other cases, a dsRNA resulting from a bi-directional
transcription of a repeat element leads to methylation of a nearby histone protein H3 in
chromatin, which, too, results in gene silencing.
Transcriptional gene silencing can potentially be initiated by the dsRNA formed from pairs of
transcriptional units arranged in a tail-to-tail orientation (sense-antisense transcription units,
SATs). In humans, SATs account for most overlapping transcriptional units (70%). A recent
survey estimated that there are 1 600 human SATs (or 3 200 transcription units). When both
transcriptional units are active, formation of dsRNA occurs by default, leading to
modification of the histone protein and gene silencing. This mechanism is involved in
imprinting: the marking of genes in chromosomes to determine whether they are expressed in
cell clones. Expression of the gene only occurs when the antisense promoter is methylated and
inactive.
Recently, a new kind of trans-acting (acting across to different parts of the genome) RNA was
identified in mouse [7]. B2 RNA originates from a short interspersed repetitive element
(SINE); SINEs are repeated in more than 10^5 copies in the genomes of multicellular plants and animals. They
were previously thought to be molecular parasites with no function. However, the levels of B2
and related RNAs have been found to increase up to 100-fold in response to environmental
stresses such as heat shock. And B2 RNA is required for the concomitant inhibition of RNA
polymerase II during heat shock, by interacting directly with the enzyme, preventing it from
working. RNA polymerase II is involved in the transcription of all protein-coding RNA. So an
inhibition of RNA polymerase II will decrease the synthesis of many proteins.
A special kind of RNA-directed DNA read-out is accomplished via riboswitches, which
switch genes off in response to the concentration of a metabolite in the cell, without the need
for a protein repressor (see Box).
Riboswitch and other RNA regulators
A new molecular switch involves an RNA molecule with enzyme activity, a ribozyme, which
can self-destruct by self-cleavage [8]. This self-cleavage is accelerated 1 000-fold in the
presence of a small sugar molecule, glucosamine-6-phosphate, which is generated by the
enzyme protein encoded by a portion of the mRNA downstream from the ribozyme sequence.
So, this simple gene regulatory circuit involves the mRNA being translated into the enzyme,
which makes the product, glucosamine-6-phosphate. As the product accumulates, it binds to
the special catalytic element in the mRNA, causing it to self-destruct. The region of the
mRNA that can confer this regulatory activity is roughly 75 nucleotides long. When placed
upstream of an unrelated reporter gene, it also shuts down that gene's expression, showing that this
active RNA element is transplantable.
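The feedback logic of this circuit (mRNA makes enzyme, enzyme makes product, product triggers destruction of the mRNA) can be illustrated with a very small simulation. This is a sketch only: every rate constant below is hypothetical and in arbitrary units, and the only number taken from the text is the 1 000-fold acceleration of self-cleavage when the product is bound.

```python
# Minimal negative-feedback sketch; all rate constants are invented for illustration.
def simulate(fold_activation, t_end=200.0, dt=0.01):
    """Euler integration of mRNA (m), enzyme (e) and product (p); returns final p."""
    k_tx, d_m = 1.0, 0.1    # mRNA synthesis and ordinary decay (hypothetical)
    k_cleave = 0.01         # basal ribozyme self-cleavage rate (hypothetical)
    k_tl, d_e = 1.0, 0.1    # enzyme translation and decay (hypothetical)
    k_cat, d_p = 1.0, 0.1   # product synthesis and removal (hypothetical)
    K = 10.0                # product level giving half-maximal activation (hypothetical)

    m = e = p = 0.0
    for _ in range(int(t_end / dt)):
        occupancy = p / (K + p)  # fraction of mRNA with product bound
        cleavage = k_cleave * (1.0 + (fold_activation - 1.0) * occupancy)
        dm = k_tx - (d_m + cleavage) * m
        de = k_tl * m - d_e * e
        dp = k_cat * e - d_p * p
        m, e, p = m + dm * dt, e + de * dt, p + dp * dt
    return p

without_feedback = simulate(fold_activation=1.0)     # cleavage ignores the product
with_feedback = simulate(fold_activation=1000.0)     # 1 000-fold acceleration, as in the text
print(without_feedback, with_feedback)
```

In this toy model the product settles at a far lower level when it can accelerate cleavage of its own mRNA, which is the self-limiting behaviour the riboswitch provides without any protein repressor.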
A particular group of ribozymes forms a pocket that binds guanosine monophosphate, one of the
four building blocks of RNA. A specific region of the RNA from the Human
Immunodeficiency Virus (HIV) binds a derivative of the amino acid arginine. Short (<100
nucleotide) RNA aptamers (DNA or RNA molecules that bind other molecules) have been
identified that specifically bind everything, from hydrophobic (water-hating) amino acids to
small organic molecules and metal ions. An RNA aptamer can even distinguish the plant
alkaloid theophylline from the closely related molecule caffeine.
Aptamers found within some natural mRNAs bind small molecules as part of their gene-regulatory feedback circuits. In the E. coli bacterium, coenzyme B12 binds directly to, and
thereby represses translation of, the mRNA coding for the protein that transports its precursor,
cobalamin. In Bacillus species, the synthesis of thiamine and riboflavin involves discrete
genetic units or operons, controlled by direct binding of thiamine pyrophosphate and flavin
mononucleotide to leader sequences of the corresponding mRNAs, resulting in the premature
termination of transcription.
Several research groups had previously engineered artificial riboswitches that accomplish
exactly the same task, that is, induce ribozyme-mediated cleavage of the RNA on binding
small molecules, before these were discovered in nature.
RNA splicing
It is estimated that 64% of the genes in the human genome are interrupted [9]; i.e., the coding
regions exist in short stretches (exons) interrupted by long non-coding stretches (introns).
After the entire sequence is transcribed into RNA, the non-coding stretches are spliced out,
leaving the coding sequence. However, different exons can be spliced together, and the
borders between the exons and introns can themselves be shifted. Alternative splicing
multiplies the number of different proteins that can be obtained from a single gene. This is a
case of extensive cutting and pasting of the genetic text to suit the occasion.
The fruitfly gene Dscam (homologue of the Down syndrome cell adhesion molecule) codes
for a cell-surface protein essential for the development of the fruitfly's brain. It has so many
exons that a total of 38 016 possible alternative splice forms could be generated. Geneticists
from the Whitehead Institute for Biomedical Research, Cambridge, Massachusetts in the
United States analysed the splice forms expressed by different cell types and by individual
cells, and found that the choice of splice variants is regulated both spatially and temporally
[10].
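The figure of 38 016 is simply the product of the numbers of mutually exclusive alternatives at the gene's variable exons. The exon counts used below (12, 48, 33 and 2 alternatives for exons 4, 6, 9 and 17) are not given in the text; they are the commonly cited values for Dscam and should be read as an assumption of this sketch.

```python
from math import prod

# Commonly cited numbers of alternative versions per variable exon (assumed, not from the text).
dscam_alternatives = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

# One alternative is chosen independently at each variable exon.
total_isoforms = prod(dscam_alternatives.values())
print(total_isoforms)
```

Because one version is picked at each variable exon, the possibilities multiply rather than add, which is how a single gene can in principle specify tens of thousands of distinct proteins.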
Different subtypes of photoreceptor cells express broad yet distinctive spectra of Dscam
splice forms. Individual photoreceptor cells express about 14-50 splice forms chosen from the
spectrum of thousands distinctive of its cell type. Thus, the repertoire of each cell is different
from those of its neighbours.
The complexity does not end there. Not only are different splice variants obtained from the
same primary transcript, but trans-splicing between different primary transcripts can also take
place [11], multiplying the combinatorial possibilities of proteins available.
There's increasing evidence that genomic variants in both coding and non-coding sequences in
genes can have unexpected deleterious effects on the splicing of gene transcripts [12]. Even
synonymous base substitutions (those that do not change the amino acid sequence of the
encoded protein) and sequence changes within the introns can affect splicing and cause
diseases.
RNA-directed rewriting of RNA
Some nucleotides are deleted during splicing and others changed by editing. Around 41 to
60% of mouse multi-exon genes generate alternatively spliced transcripts; the frequency of
edited transcripts is unknown. These processes generate new sequences not found in the gene.
Trypanosomes show the importance of RNA rewriting. Their survival depends on editing
defective mitochondrial transcripts using trans-encoded RNA sequences to guide insertion
and deletion of uridine bases. The rewriting of RNA restores the correct reading frame,
allowing the production of functional gene products. RNA guides are also used to direct
rewriting of RNA during editing and splicing of pre-mRNA. In some cases, editing creates
splice sites and in others splicing prevents editing.
Rewriting of RNA is associated with a high turnover of transcripts. Of all the RNA
transcribed in the human nucleus, only about 5% enters the cytoplasm. Quality control
mechanisms dispose of incompletely or improperly processed messages encoding flawed
proteins.
RNA-directed rewriting of DNA
In each ribotype, only specific transcripts are produced and particular mRNAs translated.
These outcomes are achieved by coRNAs that coordinate the action of highly conserved
pathways. An RNA product from one processing event may regulate a downstream event,
making the second outcome contingent on the first. For example, a miRNA encoded in an
intron would only be expressed when the host gene is transcribed. CoRNA may facilitate
coordination of pathways by interacting with sequence motifs shared by a number of targets.
Evolution of rule sets requires creation of new coRNAs, possibly by duplication and mutation.
New coRNAs would result in assembly of new regulatory complexes on conserved DNA
elements, and in new patterns of gene expression during development.
Replication of ribotypes
Both genetic modification, involving changes in DNA, and epigenetic modifications, such as
DNA methylation and histone acetylation, can be inherited. For example, imprinting is
determined by the parent of origin of a chromosome, which means that at some point maternal
and paternal chromosomes are marked so that they can be distinguished during embryonic
development. Methylation may undergo variable erasure during primordial germ cell
development, producing epigenetic mosaic individuals. The persistence of such epigenetic
marks is relevant to the origin of complex diseases. Here, the susceptibility of offspring to
disease can depend on whether there is maternal or paternal history of disease as well as
ethnicity.
Transmission of ribotypes also occurs more directly. The embryo receives RNA from the
mother that is important in specifying cell fate. The foetus is also exposed to the maternal
environment, which can influence the foetal phenotype. For example, pregnant female mice
fed a diet rich in methyl donors have litters with fewer yellow-coloured agouti Avy offspring,
reflecting enhanced silencing of the retroviral promoter in this allele (see "Diet trumping
genes", SiS 20). In other cases, integration of signals received from maternal hormones may
trigger epigenetic modifications that alter long-term phenotypic development by modulating
RNA co-regulatory networks. Low birth weight, for example, has been shown to correlate
with lifetime risk of cardiovascular disease and diabetes mellitus.
Recently, it has been demonstrated that the plasma of pregnant women contains circulating
mRNA originating from the foetus [13], which is rapidly cleared after delivery. This raises the
question of whether coRNAs secreted by various somatic tissues are also used to transmit
information from mother to foetus, a serious case of the inheritance of acquired characteristics
not coded in the genome.
Article first published 09/09/04
References
1. Semon M and Duret L. Evidence that functional transcription units cover at least half of the
human genome. TRENDS in Genetics (in press, 2004).
2. Kusaba M. RNA interference in crop plants. Current Opinion in Biotechnology 2004, 15,
139-43.
3. Novina CD and Sharp PA. The RNAi revolution. Nature 2004, 430, 161-4.
4. Samuel CE. Knockdown by RNAi - proceed with caution. Nature Biotechnology (News and
Views) 2004, 22, 280-2.
5. Caplen NJ. Gene therapy progress and prospects. Downregulating gene expression: the
impact of RNA interference. Gene Therapy 2004, 11, 1241-8.
6. Herbert A. The four Rs of RNA-directed evolution. Nature Genetics 2004, 36, 19-25.
7. Wassarman KM. Nature Structural & Molecular Biology 2004, 11, 803-4
8. Cech TR. RNA finds a simpler way. Nature (news and views) 2004, 428, 263-4.
9. EASED: Extended Alternatively Spliced EST Database. http://eased.bioinf.mdc-berlin.de/statistics.html
10. Neves G, Zucker J, Daly M and Chess A. Stochastic yet biased expression of multiple Dscam
splice variants by individual cells. Nature Genetics 2004,
http://www.nature.com/naturegenetics
11. Dorn R, Reuter G and Loewendorf A. Transgene analysis proves mRNA trans-splicing at the
complex mod(mdg4) locus in Drosophila. Proc Natl Acad Sci USA 2001, 98, 9724-9.
12. Pagani F and Baralle FE. Genomic variants in exons and introns: identifying the splicing
spoilers. Nature Reviews Genetics 2004, 5, 389-96.
13. Ng EKO, Tsui NBY, Lau TK, Leung TN, Chiu RWK, Panesar NS, Lit LCW, Chan K-W and Lo YMD.
mRNA of placental (and hence foetal) origin is readily detectable in maternal plasma. PNAS
2003, 100, 4748-53.
mutations are not strictly 'directed'. Instead, the cells appear to activate a number of different
mechanisms that target mutations to genes, the end result of which is to enable them to grow,
which they otherwise would not be able to do.
A profusion of mechanisms
There are many ways to generate adaptive mutations.
Interestingly, adaptive point mutations in the lac system require homologous recombination
proteins of the E. coli RecBCD double-strand break-repair system, which is widely involved in
gene conversion and recombination (see "How to keep in concert", this series). Double-strand
ends could be generated during DNA replication by a number of different mechanisms.
The adaptive Lac+ point mutations that revert a frameshift allele are nearly all -1 deletions
(deletion of a single nucleotide) in small mononucleotide repeats, whereas the pre-existing
(non-adaptive) Lac+ reversions are heterogeneous. Mononucleotide repeat instability is thought
to reflect DNA polymerase errors, which is consistent with the requirement of a special error-prone DNA polymerase (polIV) for adaptive mutations.
The 'SOS response' is the bacteria's response to DNA damage or the inhibition of DNA
replication. It involves de-repression of at least 42 genes that carry out DNA repair,
recombination, mutation, translesion DNA synthesis (synthesis across non-repaired or
damaged DNA) and prevent cell division.
Global hypermutation is thought to occur in a subpopulation of the cells. This is because the
frequencies of unselected mutations are about two orders of magnitude higher among Lac+
mutants than in the main population of Lac- starved cells. These results mean that
stationary-phase mutations in this system are not directed exclusively to the lac gene, and both adaptive
and neutral mutations are formed. Some or all of the adaptive mutants arise in a subpopulation
that is hypermutable relative to the main population.
The subpopulation of cells that are transiently mutable is estimated to be between 10^-3 and
10^-4 of all cells. Despite that, the frequency of mutations per unit length of DNA in the genome
is markedly uneven, with definite hotspots and coldspots, perhaps depending on the proximity to
the double-strand breaks (DSBs) that are generated in the DNA.
Gene amplification is 'adaptive' in the sense that it only occurs in response to the selective
environment. Cells carrying the amplification are not hypermutated in unselected genes, and
neither the SOS response nor polIV is required. Dependence on homologous recombination is
implied in that adaptive Lac+ colonies do not appear in the absence of RecA and RecBCD
enzyme, and RuvAB and C recombination proteins.
The bacteria were highly variable in their inducible mutator activity. The frequency of
mutations conferring resistance to rifampicin (RifR) in day 1 (D1) and day 7 (D7) was
measured. For all strains, the median frequencies of RifR mutations were 5.8 x 10^-9 on day 1, and
4.03 x 10^-8 on day 7, an increase of 7-fold, while the median number of colony-forming units
increased 1.2-fold. In comparison, the E. coli K12 MG1655 lab strain showed a 5.5-fold
increase in frequency of RifR and a 1.7-fold increase in colony-forming units. Constitutive
mutator strains having D1 mutation frequencies >10-fold or >100-fold higher than the
median D1 frequency of all the strains represented 3.3% and 1.4% of isolates respectively.
The D7/D1 mutation frequency ratio showed that 45% of strains had more than a 10-fold, and
13% more than a 100-fold increase in mutagenesis over 7 days. Interestingly, constitutive
mutagenesis and MAC (mutagenesis in aging cells) showed a negative correlation.
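The 7-fold figure quoted above can be checked directly from the two reported medians. The following is a minimal sketch of that arithmetic in Python; the constant and function names are illustrative and do not come from the study.

```python
# Median RifR mutation frequencies quoted in the text (all natural isolates).
D1_MEDIAN = 5.8e-9    # day 1
D7_MEDIAN = 4.03e-8   # day 7


def fold_change(later: float, earlier: float) -> float:
    """Return how many times larger `later` is than `earlier`."""
    if earlier <= 0:
        raise ValueError("earlier value must be positive")
    return later / earlier


mac_fold = fold_change(D7_MEDIAN, D1_MEDIAN)
print(f"D7/D1 increase in RifR frequency: {mac_fold:.2f}-fold")
```

The ratio comes out just under 7, consistent with the rounded 7-fold increase given in the text.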
The MAC was genome wide in a large fraction of natural isolates. There was no significant
correlation between MAC and phylogeny. The host's nutrition might explain some of the
variation of MAC. For example, bacteria from the guts of omnivorous species like human
beings have weaker stress-inducible mutator activities than those from carnivores.
The mechanisms for generating mutations looked even more diverse than in the laboratory
strains [6].
Indeed, in one study on 12 long-term E. coli lines, 36 genes were chosen at random, and 500-bp
regions were sequenced in four clones from each line and their ancestors [8]. Several mutations
were found in a few lines that evolved mutator phenotypes, but no mutations were found in
any of the 8 lines that retained functional DNA repair throughout the 20 000-generation
experiment. This confirms the low level of 'spontaneous' or unprovoked mutation.
Article first published 15/09/04
"Ultraconservative elements"
Many surprises lay in store as genome sequences accumulated and, thankfully, were deposited
in one public database, so that useful comparisons could be made. It turns out that not only are
there vast hidden treasures among the "junk DNA", but there is also evidence of highly non-random
changes among different stretches of the DNA, some of which change in concert, some of
which change at random, and others that hardly change at all.
There are 481 segments in the human genome longer than 200 bp that are 100% identical with
rat and mouse genomes. Nearly all are also conserved in the chicken (467/481) and dog
(477/481) genomes, with an average of 95.7% and 99.2% identity, respectively. Many are
also significantly conserved in fish (324/481 at an average of 76.8% identity).
Very few of these elements could be traced back to jelly fish, Drosophila or the nematode
worm.
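The counts quoted above describe how many of the 481 elements are found in each species, which is a separate quantity from the average percentage identity of those that are found. A minimal sketch in Python, using only the counts given in the text (the variable names are illustrative), makes the first quantity explicit:

```python
TOTAL_ELEMENTS = 481  # ultraconserved elements (human/rat/mouse), from the text

# Number of those elements also conserved in each species, from the text.
also_conserved = {"chicken": 467, "dog": 477, "fish": 324}

# Share of the 481 elements that are found in each species.
share = {species: n / TOTAL_ELEMENTS for species, n in also_conserved.items()}

for species, fraction in share.items():
    print(f"{species}: {also_conserved[species]}/{TOTAL_ELEMENTS} = {fraction:.1%}")
```

On these figures roughly 97% of the elements are found in chicken, 99% in dog and 67% in fish; the identity percentages in the text then describe how closely the matched elements agree.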
These "ultraconserved" elements are widely distributed in the genome, occurring on all
chromosomes with the exception of the Y chromosome and chromosome 21. They most often
overlap exons in genes involved in RNA processing or in their introns; or near genes involved
in regulation of transcription and development.
Of the 481 ultraconserved elements, 111 overlap the mRNA of a known human protein
coding gene, including the UTR (untranslated region) and are partly exonic (belonging to
protein coding sequences); 256 show no match to expressed mRNA and are therefore
nonexonic (non-protein coding); while the remaining 114 are possibly exonic. One hundred of
the non-exonic elements are located in introns (non-coding intervening sequences) of known
genes and the rest are intergenic (between genes). The non-exonic elements, both intronic and
intergenic, tend to congregate in clusters near transcription factors and developmental genes,
whereas the exonic and possibly exonic elements are more randomly distributed along the
chromosomes.
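The breakdown above can be tallied to confirm that the three categories account for all 481 elements, and to make explicit how many non-exonic elements are intergenic (the text gives this only as "the rest"). A minimal sketch in Python, with illustrative names:

```python
TOTAL_ELEMENTS = 481

# Category counts as given in the text.
categories = {
    "partly exonic": 111,
    "non-exonic": 256,
    "possibly exonic": 114,
}

# The three categories should account for every element.
category_sum = sum(categories.values())

# Of the non-exonic elements, 100 are intronic; the remainder are intergenic.
INTRONIC = 100
intergenic = categories["non-exonic"] - INTRONIC

print(f"sum of categories: {category_sum} (expected {TOTAL_ELEMENTS})")
print(f"intergenic non-exonic elements: {intergenic}")
```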
There are 93 known genes that overlap with exonic ultraconserved elements; these are called
type I genes. The 255 genes that are near the non-exonic elements are type II genes. Type I
genes tend to be RNA binding or involved in regulation of splicing. In contrast, type II genes
are involved in regulation of transcription and DNA binding, and are enriched for DNA
binding motifs such as the homeobox.
Nonexonic ultraconserved elements are often found in "gene deserts" that extend more than a
megabase. Of the non-exonic elements, there are 140 that are more than 10Kb away from any
known gene, and 88 that are more than 100Kb away.
The set of 156 annotated genes that flank intergenic ultraconserved elements is significantly
enriched for developmental genes, and in particular, genes involved in early development,
suggesting that many of the associated ultraconserved elements may be distal enhancers of
these early developmental genes.
Non-exonic elements that lie in introns are also often associated with developmental genes.
Many elements in the ultraconserved set of 481 are considerably longer than 200 bp. The
longest elements (779 bp, 770 bp and 731 bp) all lie in the last three introns in the 3' portion of
the gene for the DNA polymerase alpha catalytic subunit on chromosome X, along with other shorter
ultraconserved elements.
If the criterion is relaxed to "highly conserved" sequences with 99% identity (instead of 100%
identity), then there are 1 974 elements, of lengths up to 1 087 bp, in the human genome.
There are also 5 000 sequences of more than 100bp in length that are 100% identical in the
human, rat and mouse genomes. These appear to be essential for development in mammals
and other vertebrates.
Tens of thousands more are found at lower cutoffs.
Thus, as much as 5% of the genome is more conserved than expected from neutral mutations
occurring at random.
The ultraconserved elements show almost no natural variation in the human population. Only
6 out of 106 767 bp examined are at validated SNPs, whereas 119 are expected.
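To give a sense of how large this shortfall is, the sketch below computes the depletion factor and, under a simple Poisson model, the chance of seeing 6 or fewer SNPs when 119 are expected. The Poisson model is an assumption made here for illustration only; it is not necessarily the test used by the authors, and the names are illustrative.

```python
import math

OBSERVED_SNPS = 6     # validated SNPs found, from the text
EXPECTED_SNPS = 119   # SNPs expected, from the text

# How many times fewer SNPs were seen than expected.
depletion = EXPECTED_SNPS / OBSERVED_SNPS


def poisson_cdf(k: int, mean: float) -> float:
    """P(X <= k) for a Poisson-distributed count with the given mean."""
    return sum(
        math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1)
    )


# Assumption: SNP counts follow a Poisson distribution with mean 119.
p_value = poisson_cdf(OBSERVED_SNPS, EXPECTED_SNPS)

print(f"depletion: about {depletion:.1f}-fold")
print(f"P(6 or fewer | mean 119): {p_value:.1e}")
```

Under that assumption the observed count is about 20 times lower than expected, and the probability of such a shortfall arising by chance is vanishingly small.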
Surprise, surprise
But researchers revealed that mice with big chunks of such ultraconserved sequences deleted
get on very well without them.
Edward Rubin's team at the Lawrence Berkeley National Laboratory in California deleted two
huge regions of DNA from mice containing nearly 1 000 highly conserved sequences shared
between humans and mice. One region was 1.6 million DNA bases long, the other over
800 000 bases long. The researchers expected the mice to show big problems as the result of
the deletions.
But the mutant mice were no different from normal mice in every respect: growth, metabolic
functions, lifespan and overall development. "We were quite amazed," said Rubin, who
presented the findings at a meeting of the Cold Spring Harbor Laboratory in New York earlier
this year.
"It may say as much about our inability to detect any phenotypes as it says about the function
of this region," said David Haussler of the University of California, Santa Cruz, whose team
described the "ultra-conserved regions" in mammals. "What's most mysterious is that we don't
know any molecular mechanism that would demand conservation like this."
Article first published 16/09/04
Sources
Bejerano G, Pheasant M, Makunin I, Stephen S, Kent WJ, Mattick JS and Haussler D.
Ultraconserved elements in the human genome. Sciencexpress, 6 May 2004, page 1,
10.1126/science.1098119, www.sciencexpress.org
"Life goes on without 'vital' DNA", Sylvia Pagán, New Scientist, 3 June 2004,
www.Newscientist.com
A closer look
Microbiologist Liao Daiqing of the University of Sherbrooke in Quebec, Canada, compared
the sequences of multiple rRNA genes within the genome of 12 bacteria that have multiple
copies of the rRNA genes. The genes for the three rRNA molecules (23S, 16S and 5S) found
in the ribosome are typically linked together and transcribed as a single unit, called an operon,
in prokaryotes. The lengths of these three rRNA genes are ~2 900 bp (23S), ~1 500 bp (16S) and
~120 bp (5S), and their sizes as well as sequences are well conserved between different
prokaryotic species. The multiple rRNA operons (multi-gene units under the same
transcription control) are generally dispersed throughout the prokaryotic genome. Liao
analysed the rRNA genes and their immediate flanking sequences in 19 completely sequenced
genomes, but seven of the genomes surveyed contain only one copy of each rRNA gene.
He found striking sequence homogeneity of each individual rRNA gene family within a
species, in contrast to the divergence of gene sequences between species.
Within a genome, evidence of gene conversion was found throughout the entire length of each
individual rRNA gene and its immediate flanking regions. Individual conversion events,
however, convert only a short sequence tract, and the conversion partner can be any gene
within the gene family in the genome. He confirmed that gene sequences undergo much
slower divergence than their flanking sequences, and any homogeneous flanking regions that
exist may be the result of incidental co-conversion with the gene sequence.
The average divergence (difference) among the seven 16S rRNA genes present in E. coli is
0.0055 per site, whereas the average divergence between the 16S rRNA genes in E. coli and its
close relative H. influenzae is 0.1325, or 24 times greater. The same applies to the 23S and 5S
rRNA genes. No sequence heterogeneity was detected for multiple copies of 23S, 16S or 5S
in Aquifex aeolicus, Chlamydia trachomatis, Haemophilus influenzae, Helicobacter pylori,
Methanobacterium thermoautotrophicum and Synechocystis PCC6803. Five of these six
species have only two rRNA operons, whereas there are six operons in H. influenzae. There
are 10 and 7 rRNA operons in B. subtilis and E. coli, but the rRNA genes in these two species
also display remarkable sequence homogeneity.
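The "24 times greater" comparison above follows from the two per-site divergence values. A minimal sketch of the calculation in Python, using only the figures quoted in the text (the constant names are illustrative):

```python
# Average per-site divergence of 16S rRNA genes, as quoted in the text.
WITHIN_E_COLI = 0.0055               # among the seven copies in E. coli
E_COLI_VS_H_INFLUENZAE = 0.1325      # between E. coli and H. influenzae

# How much larger the between-species divergence is than the within-genome one.
ratio = E_COLI_VS_H_INFLUENZAE / WITHIN_E_COLI

print(f"between-species / within-genome divergence: {ratio:.1f}")
```

The ratio is a little over 24, in line with the figure given in the text.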
Obvious sequence heterogeneity was found for the intergenic spacer sequences between 16S
and 23S genes in B. subtilis, E. coli, H. influenzae and T. pallidum. This is mainly due to the
presence or absence of tRNA (transfer RNA) genes or the presence of different tRNA genes
in this intergenic region. The contrast of homogeneity in the gene sequences to heterogeneity
in the intergenic spacers implies that concerted evolution does not reflect gross replacement of
one operon with another; rather it is a gradual, region-by-region homogenisation process.
Individual conversion tracts appear to be short, apparently less than 500bp, similar to those
observed in other organisms.
sequences are frequently found within the 16S and 23S rRNA genes and their vicinities. For
example, the sequence stretch GCTGGCGG near the 5' end of the 16S rRNA gene differs
from Chi by only one nucleotide, and this change does not appear to affect its function. This
Chi sequence is conserved in all bacterial 16S rRNA genes. Although RecBCD/Chi system
may not operate in all the species, similar recombination machinery may be responsible.
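The one-nucleotide difference mentioned above can be verified by comparing the two octamers position by position. In the sketch below, the canonical E. coli Chi sequence GCTGGTGG is supplied from general knowledge rather than from this text, and the function and variable names are illustrative.

```python
def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Assumption: canonical E. coli Chi site (not stated in the text itself).
CHI = "GCTGGTGG"
# Sequence stretch near the 5' end of the 16S rRNA gene, from the text.
RRNA_SITE = "GCTGGCGG"

distance = hamming(CHI, RRNA_SITE)

# List each mismatch as (1-based position, Chi base, rRNA-gene base).
mismatches = [
    (i + 1, x, y) for i, (x, y) in enumerate(zip(CHI, RRNA_SITE)) if x != y
]

print(f"differences: {distance}")
print(f"mismatches: {mismatches}")
```

The two sequences differ at a single position, the sixth base, where Chi has T and the 16S rRNA gene stretch has C.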
trypsinogen gene associated with pancreatitis, the β-crystallin gene CRYBB2 in a dominant
form of cataracts, the CYP21B gene responsible for steroid 21-hydroxylase deficiency and
congenital adrenal hyperplasia, and von Willebrand disease (VWD), the commonest inherited
bleeding disorder. Such pathological gene conversions may be linked to stress, and resemble
the controversial phenomenon of directed mutations found in stressed and starving bacterial
cells (see "To mutate or not to mutate", this series).
Avoiding stress may be much more important for health than inheriting good genes.
Article first published 20/09/04
Sources
1. Liao D. Gene conversion drives within genic sequences: concerted evolution of ribosomal
RNA genes in Bacteria and Archaea. J Mol Evol 2000, 51, 305-17.
2. Arnold DA and Kowalczykowski SC. RecBCD helicase/nuclease. Encyclopaedia of Life
Sciences, Macmillan, 1998.
3. Martinsohn J Th, Sousa AB, Guethlein LA and Howard JC. The gene conversion hypothesis
of MHC evolution: a review. Immunogenetics 1999, 50, 168-200.
4. Dorak MT. Common terms in evolutionary biology and genetics.
http://dorakmt.tripod.com/mhc/glossary.html
5. Jeffreys AJ and May CA. Intense and highly localized gene conversion activity in human
meiotic crossover hot spots. Nature Genetics 2004, 36, 151-6.
6. Guillon H and de Massy B. An initiation site for meiotic crossing-over and gene
conversion in the mouse. Nature Genetics 2002, 32, 296-9.
7. Küppers R and Dalla-Favera R. Mechanisms of chromosomal translocation in B cell
lymphomas. Oncogene 2001, 20, 5580-94.
8. Chen J-M, Raguenes O, Ferec C, Deprez PH and Verellen-Dumoulin C. A CGC>CAT gene
conversion-like event resulting in the R122H mutation in the cationic trypsinogen gene and
its implication in the genotyping of pancreatitis. J Med Genet 2000, 37
(http://jmedgenet.com/cgi/content/full/37/11/e36)
9. Virinder Sarhadi V, Reis A, Jung M, Singh D, Sperling K, Singh JR and Bürger J. A unique