
NEWS FEATURE NATURE|Vol 441|25 May 2006

©2006 Nature Publishing Group



WHAT IS A GENE?
The idea of genes as beads on a DNA string is fast fading. Protein-coding sequences have no
clear beginning or end and RNA is a key part of the information package, reports Helen Pearson.
‘Gene’ is not a typical four-letter word. It is not offensive. It is never bleeped out of TV shows. And where the meaning of most four-letter words is all too clear, that of gene is not. The more expert scientists become in molecular genetics, the less easy it is to be sure about what, if anything, a gene actually is.

Rick Young, a geneticist at the Whitehead Institute in Cambridge, Massachusetts, says that when he first started teaching as a young professor two decades ago, it took him about two hours to teach fresh-faced undergraduates what a gene was and the nuts and bolts of how it worked. Today, he and his colleagues need three months of lectures to convey the concept of the gene, and that’s not because the students are any less bright. “It takes a whole semester to teach this stuff to talented graduates,” Young says. “It used to be we could give a one-off definition and now it’s much more complicated.”

In classical genetics, a gene was an abstract concept: a unit of inheritance that ferried a characteristic from parent to child. As biochemistry came into its own, those characteristics were associated with enzymes or proteins, one for each gene. And with the advent of molecular biology, genes became real, physical things: sequences of DNA which, when converted into strands of so-called messenger RNA, could be used as the basis for building their associated protein piece by piece. The great coiled DNA molecules of the chromosomes were seen as long strings on which gene sequences sat like discrete beads.

This picture is still the working model for many scientists. But those at the forefront of genetic research see it as increasingly old-fashioned, a crude approximation that, at best, hides fascinating new complexities and, at worst, blinds its users to useful new paths of enquiry.

Information, it seems, is parceled out along chromosomes in a much more complex way than was originally supposed. RNA molecules are not just passive conduits through which the gene’s message flows into the world but active regulators of cellular processes. In some cases, RNA may even pass information across generations, normally the sole preserve of DNA.

An eye-opening study last year raised the possibility that plants sometimes rewrite their DNA on the basis of RNA messages inherited from generations past¹. A study on page 469 of this issue suggests that a comparable phenomenon might occur in mice, and by implication in other mammals². If this type of phenomenon is indeed widespread, it “would have huge implications,” says evolutionary geneticist Laurence Hurst at the University of Bath, UK.

“All of that information seriously challenges our conventional definition of a gene,” says molecular biologist Bing Ren at the University of California, San Diego. And the information challenge is about to get even tougher. Later this year, a glut of data will be released from the international Encyclopedia of DNA Elements (ENCODE) project. The pilot phase of ENCODE involves scrutinizing roughly 1% of the human genome in unprecedented detail; the aim is to find all the sequences that serve a useful purpose and explain what that purpose is. “When we started the ENCODE project I had a different view of what a gene was,” says contributing researcher Roderic Guigo at the Center for Genomic Regulation in Barcelona. “The degree of complexity we’ve seen was not anticipated.”

[Image: Spools of DNA still harbour surprises, with one protein-coding gene often overlapping the next.]

Under fire

The first of the complexities to challenge molecular biology’s paradigm of a single DNA sequence encoding a single protein was alternative splicing, discovered in viruses in 1977 (see ‘Hard to track’, overleaf). Most of the DNA sequences describing proteins in humans have a modular arrangement in which exons, which carry the instructions for making proteins, are interspersed with non-coding introns. In alternative splicing, the cell snips out introns and sews together the exons in various different orders, creating messages that can code for different proteins. Over the years geneticists have also documented overlapping genes, genes within genes and countless other weird arrangements (see ‘Muddling over genes’, overleaf).

Alternative splicing, however, did not in itself require a drastic reappraisal of the notion of a gene; it just showed that some DNA sequences could describe more than one protein. Today’s assault on the gene concept is more far reaching, fuelled largely by studies that show the previously unimagined scope of RNA.

The one gene, one protein idea is coming under particular assault from researchers who are comprehensively extracting and analysing the RNA messages, or transcripts, manufactured by genomes, including the human and mouse genome. Researchers led by Thomas Gingeras at the company Affymetrix in Santa Clara, California, for example, recently studied all the transcripts from ten chromosomes across eight human cell lines and worked out precisely where on the chromosomes each of the transcripts came from³.

The picture these studies paint is one of mind-boggling complexity. Instead of discrete genes dutifully mass-producing identical RNA transcripts, a teeming mass of transcription converts many segments of the genome into multiple RNA ribbons of differing lengths. These ribbons can be generated from both strands of DNA, rather than from just one as was conventionally thought. Some of these transcripts come from regions of DNA previously identified as holding protein-coding genes. But many do not. “It’s somewhat revolutionary,” says Gingeras’s colleague Phillip Kapranov. “We’ve come to the realization that the genome is full of overlapping transcripts.”

Other studies, one by Guigo’s team⁴, and one by geneticist Rotem Sorek⁵, now at Tel Aviv University, Israel, and his colleagues, have hinted at the reasons behind the mass of transcription. The two teams investigated occasional reports that transcription can start at a DNA sequence associated with one protein and run straight through into the gene for a completely different protein, producing a fused transcript. By delving into databases of human RNA transcripts, Guigo’s team estimate that 4–5% of the DNA in regions conventionally recognized as genes is transcribed in this way. Producing fused transcripts could be one way for a cell to generate a greater variety of proteins from a limited number of exons, the researchers say.

Many scientists are now starting to think that the descriptions of proteins encoded in DNA know no borders, that each sequence reaches into the next and beyond. This idea will be one of the central points to emerge from the ENCODE project when its results are published later this year.

Kapranov and others say that they have documented many examples of transcripts in which protein-coding exons from one part of the genome combine with exons from another
part that can be hundreds of thousands of bases away, with several other ‘genes’ in between. This continuum of genes might even spill over the boundaries of chromosomes: last year, Richard Flavell at Yale University School of Medicine in New Haven, Connecticut, documented human immune-system genes that seem to be controlled by regulatory regions from another chromosome⁶. “Discrete genes are starting to vanish,” Guigo says. “We have a continuum of transcripts.”

Hard to track
1860s After playing with pea plants, Austrian monk Gregor Mendel defines the basic rules of inheritance. Traits are determined by discrete units that are passed from one generation to the next.
1909 Danish botanist Wilhelm Johannsen coins the word ‘gene’ for the unit associated with an inherited trait, although the physical basis remains unknown.
1910 Thomas Morgan’s work on fruitflies shows that genes sit on chromosomes, leading to the idea of genes as beads on a string.
1941 George Beadle and Edward Tatum introduce the concept that one gene makes one enzyme.
1944 Genes are made of DNA, find Oswald Avery, Colin MacLeod and Maclyn McCarty.
1953 James Watson and Francis Crick publish the chemical structure of DNA; the central dogma of molecular biology emerges in which information flows from DNA to RNA to protein.
1977 Richard Roberts and Phillip Sharp discover that genes can be split into segments, leading to the idea that one gene can make several proteins.
1993 The first microRNA is identified in the worm Caenorhabditis elegans.
2003 GeneSweep: Human geneticists come up with a definition for protein-coding genes in order to decide on a winner for a bet on the number of human genes. The winner is announced, but geneticists acknowledge that they don’t know the true answer.
2006 The idea that human genes are one long continuum begins to emerge.

Slippery concept

The large transcriptional surveys suggest that a vast amount of the RNA manufactured by the mouse and human genomes does not code for proteins. Last year a consortium of researchers in Japan, for example, estimated that a whopping 63% of the mouse genome is transcribed⁷,⁸; only 1–2% of the genome is thought to be spanned by sequences that contain everyday exons.

The discovery of RNA sequences that aren’t just intermediates between the DNA and the protein-making machinery is not new in itself; the cell’s protein-building apparatus requires a number of RNA molecules as well as proteins to operate. But the finding of ‘microRNAs’ and other RNA molecules now known to be vital in controlling many cellular processes in plants and animals, and the newly revealed ferment of RNA transcription, contribute to the view that RNA actively processes and carries out the instructions in the genome.

Perhaps the regions that make non-coding RNA should also carry the status of genes, if not the name itself. “I think it’s time for people to take a deep breath and step back,” says molecular biologist John Mattick of the University of Queensland in Brisbane, Australia. “A lot of the information in the system is being transacted by RNA.”

Although functions have been identified for several RNA molecules, the crux of the debate now is the extent to which all the extra RNA plays a part. It is conceivable that it is easier to overtranscribe and ignore the rubbish than to invest in systems that produce only what is needed. A study from last year, however, hints that at least some of the mass of RNAs is doing something useful. Working at the Genomics Institute of the Novartis Research Foundation in San Diego, California, John Hogenesch and his co-workers systematically quenched the activity of more than 500 non-coding RNAs in human cells and found that eight were involved in cell signalling and growth⁹.

But Hogenesch, and many other scientists, remain convinced that non-coding RNAs are much less important, functionally, than those that describe proteins; in the past, when scientists have searched for the genetic basis of a disease or other characteristic they have overwhelmingly found the underlying mutation to be in a protein-coding gene rather than in another region. “The preponderance of evidence suggests that protein-coding genes will hold their own when the day is over,” Hogenesch says.

Some of the recent discoveries (that the human genome makes a continuum of transcripts and that cells produce masses of non-coding RNA molecules) have not posed much of a problem to people outside the world of molecular biology. Population geneticists can examine how a trait is passed down and evolves regardless of the precise molecular mechanism that underlies it. For example, geneticists can build models showing how a mutation is inherited whether it affects a protein, a non-coding RNA or a regulatory region. “I don’t actually care if it’s making a protein or not,” says Hurst. “The equations are still the same.”

But the same can’t be said for studies revealing so-called extragenomic modes of inheritance. In recent years, many investigators have focused on epigenetic inheritance, in which information is passed from parent to offspring independent of the DNA sequence. And this week in Nature (see page 469), Minoo Rassoulzadegan’s team at the French National Institute for Health and Medical Research (INSERM) in Nice, France, reports that RNA may sometimes be complicating traditional models of inheritance.

In mice, mutations in the Kit gene cause white patches on the tail and feet; if a mouse has one normal Kit gene and one mutated one, it will have the spots. The odd thing is that some of the offspring of such mice, who inherit two normal Kit genes, still have the white tail. The French group suggest that the mutant Kit gene manufactures abnormal RNA molecules, which accumulate in sperm and pass into the egg. These bits of RNA somehow silence the normal Kit gene in the next generation and subsequent ones, producing the spotted-tail effect. “We are convinced that it’s a more general phenomenon,” says co-author François Cuzin.

If this is strange, the work reported last year¹ on the cress plant Arabidopsis by Robert Pruitt and his colleagues at Purdue University in West Lafayette, Indiana, is even stranger. Here the gene involved is called HOTHEAD. Pruitt and his co-workers’ analysis shows that some plants do not carry the mutant version of HOTHEAD that their parents possessed. These plants had replaced the abnormal DNA sequence with the regular code possessed by earlier generations. “It’s like, whoa, this changes everything,” Pruitt says. “It definitely changes my view of inheritance.”

Pruitt is now working to explain how the
plants could perform such a feat. One idea is that they carry a back-up copy of their grandparents’ genetic information encoded in RNA that is passed into seeds along with the regular DNA and is then used as a template to ‘correct’ certain genes. Conceivably, Pruitt says, some of the mystery non-coding transcripts could be responsible. “I think there’s something being inherited outside what we think of as the conventional DNA genome.”

[Image: Back-up copies: mutant DNA in the cress plant may be ‘corrected’ by inherited RNA.]

Changing views

The implications of such findings for our understanding of evolution have yet to be figured out. But research into the role of RNA as a carrier of information across generations promises to enrich, and complicate, the notion of a gene yet further.

Leaving aside the can of worms that studies on epigenetics are beginning to open up, does it matter that many scientists not directly concerned with molecular mechanisms continue to think of genetics in simpler terms? Some geneticists say yes. They worry that researchers working with an oversimplistic idea of the gene could discard important results that don’t fit. A medical researcher, for example, might gloss over the many different transcripts generated by a sequence at one location. And the lack of a clear idea of what a gene is might also hinder collaboration. “I find it sometimes very difficult to tell what someone means when they talk about genes because we don’t share the same definition,” says developmental geneticist William Gelbart of Harvard University in Cambridge, Massachusetts.

Without a clear definition of a gene, life is also difficult for bioinformaticians who want to use computer programs to spot landmark sequences in DNA that signal where one gene ends and the next begins. But reaching a consensus over the definition is virtually impossible, as Karen Eilbeck can attest. Eilbeck, who works at the University of California in Berkeley, is a coordinator of the Sequence Ontology consortium. This defines labels for landmarks within genetic-sequence databases of organisms, such as the mouse and fly, so that the databases can be more easily compared. The consortium tries, for example, to decide whether a protein-coding sequence should always include the triplet of DNA bases that marks its end.

Eilbeck says that it took 25 scientists the better part of two days to reach a definition of a gene that they could all work with. “We had several meetings that went on for hours and everyone screamed at each other,” she says. The group finally settled on a loose definition that could accommodate everyone’s demands. (Since you ask: “A locatable region of genomic sequence, corresponding to a unit of inheritance, which is associated with regulatory regions, transcribed regions and/or other functional sequence regions.”)

Rather than striving to reach a single definition, and coming to blows in the process, most geneticists are instead incorporating less ambiguous words into their vocabulary such as transcripts and exons. When it is used, the word ‘gene’ is frequently preceded by ‘protein-coding’ or another descriptor. “We almost have to add an adjective every time we use that noun,” says Francis Collins, director of the National Human Genome Research Institute at the National Institutes of Health in Bethesda, Maryland.

But however much geneticists struggle to pin down the elusive gene, it is precisely its ambiguous nature that fuels their continued curiosity. “It’s ever more fascinating,” says Whitehead’s Young. Some things, it seems, are not best portrayed by a crude four-letter word. ■

Helen Pearson is a reporter working for Nature in New York.

Muddling over genes
Science philosophers Karola Stotz, at Indiana University in Bloomington, and Paul Griffiths, now at the University of Queensland in Australia, are attempting to measure the extent of working biologists’ bewilderment over genes. They collected together 14 weird and wonderful (but real) genetic arrangements and asked biologists to decide whether each represents one, or more than one, gene.
One is a DNA segment that uses some of the same protein-coding sequences to manufacture two entirely different proteins with distinct functions. In another, one ‘gene’ is nestled within the non-protein-coding intron of another. Another protein is assembled when four different RNA molecules, made from DNA scattered over 40,000 base pairs, are assembled into one transcript.
Confused? So were the 500 biologists who completed the questionnaire. Stotz and Griffiths found that 60% are typically sure of one answer, and 40% are confident of another. Hardly any confess that they don’t know.
Stotz wants to examine whether scientists working in separate disciplines tend to view the situations in different lights. “It will be interesting to know if there is some order to the confusion,” Stotz says. H.P.

1. Lolle, S. J., Victor, J. L., Young, J. M. & Pruitt, R. E. Nature 434, 505–509 (2005).
2. Rassoulzadegan, M. et al. Nature 441, 469–474 (2006).
3. Cheng, J. et al. Science 308, 1149–1154 (2005).
4. Parra, G. et al. Genome Res. 16, 37–44 (2006).
5. Akiva, P. et al. Genome Res. 16, 30–36 (2006).
6. Spilianakis, C. G., Lalioti, M. D., Town, T., Lee, G. R. & Flavell, R. A. Nature 435, 637–645 (2005).
7. FANTOM Consortium and RIKEN Genome Exploration Research Group and Genome Science Group (Genome Network Project Core Group) Science 309, 1559–1563 (2005).
8. RIKEN Genome Exploration Research Group and Genome Science Group (Genome Network Project Core Group) and the FANTOM Consortium Science 309, 1564–1566 (2005).
9. Willingham, A. T. et al. Science 309, 1570–1573 (2005).
