
Functional evolution of noncoding DNA
Michael Z Ludwig


Noncoding DNA in eukaryotes encodes functionally important signals for the regulation of chromosome assembly, DNA replication, and gene expression. The increasing availability of whole-genome sequences of related taxa has led to interest in the evolution of these signals, and the phylogenetic footprints they produce. Cis-regulatory sequences controlling gene expression are often conserved among related species, but are rarely conserved between distantly related taxa. Several experimentally characterized regulatory elements have failed to show sequence similarity even between closely related species.
Addresses
Department of Ecology and Evolution, University of Chicago, 1101 E. 57th St, Chicago, Illinois 60637, USA; e-mail: mludwig@midway.uchicago.edu

Current Opinion in Genetics & Development 2002, 12:634-639

0959-437X/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved.

DOI 10.1016/S0959-437X(02)00355-6

Abbreviations
bp: base pair
eve s2e: even-skipped stripe 2 enhancer
ncDNA: noncoding DNA

comparisons [2,4,13]. Despite the obvious problems attendant on aligning ncDNA, these studies show that regulatory sequences involved in chromatin organization and transcription regulation occupy a much larger fraction of the genome than protein-coding sequences do. Enhancer elements are an important class of cis-regulatory sequences that share both structural features and common mechanisms of action. They dramatically increase transcription from a core promoter in a manner that is independent of orientation and position. Enhancers are typically 100-300 bp long, and contain multiple short binding-site sequences that target several different activators and repressors of transcription [1]. Binding-site motifs for a given transcription factor are often degenerate, and are better characterized as a probability distribution (position weight matrix) than as a consensus sequence [14]. Bound transcription factors can interact in many different contexts, and they can function as either activators or repressors. Transcription factor interactions are modulated by the spacing of binding sites [15]; mechanisms of interaction include direct competition for overlapping binding sites, quenching over short distances (<150 bp), and long-range (>150 bp) interactions [16,17]. An enhancer can be located anywhere within a span of 100 kilobases upstream or downstream of the gene whose expression it regulates. One gene can have many enhancers in order to ensure appropriate activation in response to different temporal or spatial (surrounding cellular environment) cues [1]. It seems plausible that cis-regulatory elements co-evolve with their core promoter in both sequence-specific and location-specific ways to achieve the best possible functional performance [18,19].
Although there is good evidence for functional constraints on regulatory sequences, essentially no work has yet been carried out to investigate spatial constraints between binding sites within cis-regulatory elements, or between the elements themselves, although it seems obvious that such constraints should exist. Spatial constraints on regulatory sequences can be determined experimentally.
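The description of a degenerate binding-site motif as a position weight matrix [14], rather than as a consensus sequence, can be made concrete with a short sketch. The Python below builds a log-odds matrix from a handful of aligned sites and scans a sequence for the best-scoring window. The example sites, the uniform background frequencies and the pseudocount are assumptions invented for illustration; they are not taken from the text or from measured binding data.

```python
import math

# Hypothetical aligned binding sites (illustrative only, not real data).
SITES = ["GGATTA", "GGATTA", "GGATTT", "GAATTA", "GGCTTA"]
# Assumed uniform background base frequencies.
BACKGROUND = {base: 0.25 for base in "ACGT"}


def build_pwm(sites, pseudocount=1.0):
    """Return a list of per-position log2-odds scores for each base."""
    pwm = []
    for i in range(len(sites[0])):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({base: math.log2((counts[base] / total) / BACKGROUND[base])
                    for base in "ACGT"})
    return pwm


def score(pwm, seq):
    """Sum the log-odds scores of seq, which must be as long as the matrix."""
    return sum(column[base] for column, base in zip(pwm, seq))


def best_hit(pwm, seq):
    """Return (score, start) of the highest-scoring window in seq."""
    width = len(pwm)
    return max((score(pwm, seq[i:i + width]), i)
               for i in range(len(seq) - width + 1))


pwm = build_pwm(SITES)
```

Under this representation a site that departs from the consensus at one position still receives a positive but lower score, which is one way to express the distinction between strong and weak binding sites.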

Introduction
Functional noncoding DNA (ncDNA) is composed of cis-regulatory elements such as enhancers, core promoters, matrix or scaffold attachment regions, insulators and silencers [1,2]. Eukaryotic DNA is tightly packaged around nucleosomes and other structural protein components, and it is highly decorated with regulatory proteins. ncDNA mediates nucleoprotein interactions serving as a template for both the physical locations of bound proteins and their binding kinetics. Interspecific sequence comparisons of noncoding regions reveal conserved features, many of which are likely to be cis-regulatory elements. But despite obvious indications of selective constraint [3,4], the structure and sequences of cis-elements change over time, sometimes dramatically so even in cases where expression patterns are conserved [5,6]. Functional conservation of gene expression is not sufficient, therefore, to assure the evolutionary preservation of corresponding cis-regulatory elements [2,7,8,9]. The opportunity for positive selection to act on cis-regulatory sequences (as opposed to transcription factors) has also been emphasized, particularly in the diversification of body plans [10,11,12]. In this review, I discuss how findings based on sequence similarity between regulatory elements shed light on the evolutionary processes acting on cis-regulatory elements.

Preservation by functional constraint


Comparative studies of large ncDNA sequences show conserved regions interspersed among rapidly diverged segments, and conservation can be evident even after 300-450 million years of evolution [20,21]. Both the density and the block lengths of highly conserved regions decrease as evolutionary distances increase. Interpretation of the punctate pattern of conservation in ncDNA has been guided by rules of molecular evolution first elucidated by M Kimura: 'Functionally less important molecules or parts of molecules evolve (in terms of mutant substitutions) faster than more important ones' [22]. In other words, sequence-specific

Cis-regulatory element structure/function


Recent attempts to estimate the functional component of ncDNA have relied on analyses of interspecific sequence


Table 1. Putative binding sites for transcription factors Kruppel and bicoid in eve s2e of 13 Drosophila species.

Kruppel sites
Species  kr-5        kr-4          kr-3         kr-2        kr-1
mel      TTAATCCGTT  ACC-GGGTTGC   GAAGGGATTAG  ACTGGGTTAT  TTAACCCGTTT
sim      ..........  ...--.......  ...........  ..........  ...........
mau      ..........  ...--.......  ...........  ..........  .........C.
sec      ..........  ...--..T....  ...........  ..........  ...........
yak      ..........  ...--.......  ...C.......  ..........  ...........
tei      ..........  ...--.......  ...C.......  ..........  ...........
ere      ..........  ...--.......  ...C.......  ..........  ...........
ore      ..........  ...--.......  ...C.......  ..........  ...........
tak      ..........  ..G--.....TA  ...........  .G........  ...........
ana      .......C..  ..G--.......  ...-......C  ..C.......  .C.G...T...
pse      ..........  ...AA......T  ..........A  .TC.......  .......C..G
vir      ..........  ...--.....A.  AC.........  .T........  N/A
pic      ..........  ...--.....A.  AGG........  .T...C....  N/A

bicoid sites
Species  bcd-5      bcd-4      bcd-3      bcd-2      bcd-1
mel      GTTAATCCG  GAGATTATT  TATAATCGC  GGGATTAGC  GAAGGGATTAG
sim      .........  C........  .C.......  .........  ............
mau      .........  C........  .C.......  .........  ............
sec      .........  .........  .C.......  .........  ............
yak      .........  C........  N/A        .........  ...C.......
tei      .........  C........  N/A        .........  ...C.......
ere      .........  C........  N/A        .........  ...C.......
ore      .........  C........  N/A        .........  ...C.......
tak      .........  C........  N/A        .........  ...........
ana      ........C  C........  N/A        C........  ...-......C
pse      .........  A........  N/A        A........  ..........A
vir      .........  ..C......  N/A        T......TA  AC.........
pic      .........  ..C......  N/A        .A......G  AGG........

eve s2e binding sites in D. melanogaster [30] and homologous sequences from 12 other Drosophila species [7,8]. Residues homologous to D. melanogaster are indicated as dots. Gaps in aligned sequences are indicated by dashes. bcd and kr are the binding sites in D. melanogaster for the transcription factors bicoid and Kruppel. N/A, not available (no homologous sequence identified). Drosophila species: mel, D. melanogaster; sim, D. simulans; mau, D. mauritiana; sec, D. sechellia; ere, D. erecta; ore, D. orena; yak, D. yakuba; tei, D. teissieri; tak, D. takahashii; ana, D. ananassae; pse, D. pseudoobscura; vir, D. virilis; pic, D. picticornis.
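The dot notation used in Table 1 can be expanded back into full sequences mechanically. The following Python is a minimal sketch of that decoding, using the kr-3 column as input; the function names are hypothetical, and the treatment of a dash as a retained gap character is an assumption about how one might wish to handle deletions.

```python
def expand(reference, row):
    """Expand a dot-notation row against the D. melanogaster reference.

    A dot means 'same residue as the reference'; any other character
    (a base or a gap dash) is kept as written. 'N/A' yields None.
    """
    if row == "N/A":
        return None
    if len(row) != len(reference):
        raise ValueError("row and reference must have equal length")
    return "".join(ref if ch == "." else ch for ref, ch in zip(reference, row))


def differences(row):
    """Count positions that differ from the reference (substitutions or gaps)."""
    if row == "N/A":
        return None
    return sum(ch != "." for ch in row)


# kr-3 column of Table 1: reference (mel) and selected species rows.
KR3_MEL = "GAAGGGATTAG"
KR3_ROWS = {"yak": "...C.......", "ana": "...-......C",
            "vir": "AC.........", "pic": "AGG........"}

for species, row in KR3_ROWS.items():
    print(species, expand(KR3_MEL, row), differences(row))
```

Counting non-dot positions per species gives a quick tally of how far each homologous site has diverged from the D. melanogaster sequence.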

conservation of ncDNA implies functional constraint on these sequences and slower rates of molecular evolution. Some studies have attempted to place ncDNA conservation in a quantitative framework. Thus, for example, the fraction of sequence conserved between human and mouse in regions upstream of genes and in introns has been estimated to be 36% and 23%, respectively [23]. A slightly lower estimate of conservation (20%) has been reported between Caenorhabditis elegans and C. briggsae, two species with overall DNA divergence slightly greater than that of the human-mouse comparison [24,25]. Crude numbers alone provide little, if any, insight into evolutionary mechanisms acting in the noncoding portion of the genome. The conserved component contains many different classes of sequences, including ncRNA and even cryptic genes [25,26,27]. At present, it is impossible to estimate the fraction of the conserved features in ncDNA that are cis-regulatory elements. The values obtained are also dependent on alignment methods and can vary substantially between different regions of a genome [28]. The distribution of conserved features along ncDNA segments, including block lengths and spacing, is perhaps

more revealing about the mechanisms underlying functional conservation. A median block-length of 19 bp was found in ncDNA between Drosophila virilis and D. melanogaster (60 million years divergence time) based on a requirement of 70% identity in every 10-base window within a block [29]. 19 bp must be an overestimate of the median length of a regulatory motif because chance sequence identity at the boundaries of conserved features will artificially inflate block lengths. The block lengths of conserved regulatory sequences, therefore, are quite short. The ability to detect them will depend on the divergence times of the species being compared, and the criteria used to search for them. False positive rates can be unacceptably high. For example, one can expect to find, entirely by chance, ~1000 different 10 bp sequences with one or fewer substitutions in 100,000 bp of randomly evolving DNA with total divergence similar to D. melanogaster and D. virilis. For this reason alone, it may be more informative to identify short functionally conserved features in ncDNA by comparing the sequences of many closely related species rather than a single pair of distantly related species. Taking all these factors into account, the distribution of conserved features in Drosophila ncDNA conforms well with


Genomes and evolution

Box 1. Hypotheses for evolutionary changes in cis-elements.

1. Neutral evolution of non-functional segments (e.g. spacer regions). But the distance between interacting transcription factors is functionally important [15], and there is some evidence for evolutionary constraints on the length of spacers in the eve s2e.

2. Fuzziness of binding-site sequence specificity allows neutral or nearly neutral evolution within binding sites. But conservation of distinct (weak and strong) binding-site motifs is evident in functionally characterized enhancers. These binding sequences, therefore, do not appear to have positions that are free to evolve. We simply don't know what governs the speed of substitution in binding sites.

3. Evolutionary accretion of new binding sites until each one becomes marginally important. Accretion leads the evolutionary progression of enhancer structures to become governed by nearly neutral evolution. This may explain certain features of element molecular evolution, but certainly not why some binding sites remain strongly (albeit not completely) conserved. Fuzziness and accretion may not be unrelated. Gerland and Hwa [45], in a theoretical study of regulatory evolution, showed that high mutation rates implied by fuzzy motifs lead to scenarios in which accretion is a natural outcome, especially if there are synergistic interactions among transcription factors.

4. Functional co-evolution within an element. Structure/function studies of enhancers suggest that evolutionary forces will act to maintain overall enhancer function rather than individual binding-site specificity. The best evidence for co-evolutionary change is our work with eve s2e [8], where we showed that enhancer sequence differences between species have functional consequences, but they are masked by other differences, which have co-evolved, to produce no net functional change in expression.

there is rapid turnover of spacer sequences between binding sites; conservation is apparent; turnover of binding site architecture occurs; and there is co-evolution of sites within an enhancer. A number of evolutionary hypotheses have been proposed to explain the observations concerning evolutionary changes in cis-elements (Box 1).

Models of enhancer evolution


One proposal for explaining enhancer evolution has been to model enhancer structure/function as a quantitative character. Many studies have shown that a modular architecture of multiple binding sites is required for eukaryotic enhancer function [1]. The presence of multiple binding sites, each with many fuzzy positions, and the possibility that subtle changes in spacing can also influence enhancer function, are compatible with the idea that many independent mutations will contribute to variation in gene expression. The spatio-temporal requirements for gene expression can then be viewed as a continuous character that can shift forwards and backwards subtly with mutations that affect transcription factor binding or interaction.
Model of stabilizing selection

the lengths of conserved features observed in functionally characterized enhancer elements. The Drosophila even-skipped stripe 2 enhancer (eve s2e), for example, contains 17 transcription factor binding sites for two activator and two repressor proteins [30]. Evolutionary comparisons of eve s2e sequences indicate that most, but not all, functionally important binding sites are conserved at the level of DNA (see Table 1). No single binding site is completely conserved in Drosophila evolution, and some functional binding sites can be lost (or gained) in evolution. In addition, we find conservation in both weak and strong binding sites.
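The block-detection criterion of [29] (70% identity in every 10-base window) and the chance-match argument made in the discussion of block lengths can both be sketched in a few lines of Python. This is one simple reading of the criterion, applied to an ungapped pairwise alignment, and is not the procedure of the cited study. The per-site divergence `d = 0.5` and the assumption that sites diverge independently are illustrative choices that do not come from the text, so the resulting expectation should be read only as an order-of-magnitude check.

```python
import math


def conserved_blocks(a, b, window=10, min_matches=7):
    """Return (start, end) spans, end exclusive, covered by consecutive
    windows that each contain at least min_matches identical positions.
    Assumes a and b are aligned, ungapped, and of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    match = [x == y for x, y in zip(a, b)]
    blocks, start = [], None
    for i in range(len(a) - window + 1):
        passes = sum(match[i:i + window]) >= min_matches
        if passes and start is None:
            start = i
        elif not passes and start is not None:
            # The last passing window began at i - 1 and ends at i - 1 + window.
            blocks.append((start, i - 1 + window))
            start = None
    if start is not None:
        blocks.append((start, len(a)))
    return blocks


def expected_chance_windows(length, d, window=10, max_subs=1):
    """Expected number of windows with at most max_subs differences when
    each site differs independently with probability d (an assumption)."""
    p = sum(math.comb(window, k) * d ** k * (1 - d) ** (window - k)
            for k in range(max_subs + 1))
    return (length - window + 1) * p


# Illustrative values only: d = 0.5 is an assumed per-site divergence.
print(round(expected_chance_windows(100_000, 0.5)))
```

With these assumed values the expectation is on the order of a thousand windows in 100,000 bp, the same order of magnitude as the figure quoted in the text, although the calculation behind that figure may rest on a different model.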

The above view naturally leads to the idea that stabilizing selection is the major mode of enhancer evolution. Kimura [38] investigated the rate of substitution assuming a quantitative character subject to stabilizing selection. If a large number of segregating sites (or loci) are involved, the average selection coefficient per mutant under stabilizing selection will be small. The fate of such weakly selected mutations can then be governed by genetic drift rather than selection, and the rate of substitution can be quite high. Applied to enhancer evolution, stabilizing selection can accommodate binding-site turnover without disruption of primary enhancer function. Studies of eve s2e evolution in Drosophila are consistent with the stabilizing selection model [8].
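The claim that weakly selected mutations behave almost neutrally can be illustrated numerically. The sketch below uses the standard diffusion approximation for the fixation probability of a single new mutation under genic selection, with effective size equal to census size N; it is not the stabilizing-selection model of [38], and the population size and selection coefficients are assumed values chosen only to show the contrast.

```python
import math


def fixation_probability(s, n):
    """Fixation probability of a new mutation with selection coefficient s
    in a diploid population of size n (diffusion approximation, Ne = N,
    initial frequency 1/(2N))."""
    if s == 0:
        return 1.0 / (2 * n)
    # (1 - exp(-2s)) / (1 - exp(-4Ns)), written with expm1 for accuracy.
    return math.expm1(-2 * s) / math.expm1(-4 * n * s)


def relative_substitution_rate(s, n):
    """Substitution rate relative to the neutral rate: 2N * u(s)."""
    return 2 * n * fixation_probability(s, n)


N = 1000  # assumed population size, for illustration only
for s in (0.0, -1e-6, -1e-4, -1e-2):
    print(s, relative_substitution_rate(s, N))
```

When |4Ns| is much smaller than 1 the relative rate stays close to 1, so slightly deleterious changes are substituted at nearly the neutral rate; when |4Ns| is large the rate collapses towards zero.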
Model of compensatory selection

Preservation of function and evolutionary changes in cis-elements


There are many examples of extensively diverged regulatory sequences that have retained expression specificity [9]. What can we say about the evolutionary processes governing the changes in these functionally conserved elements? The study of evolutionary changes in functionally conserved regulatory elements must begin, necessarily, with a characterization of the sequences responsible for functionality. Unfortunately, the number of well-characterized cis-elements, such as the eve s2e in Drosophila, remains small; the number of evolutionary studies of them is even smaller [7,8,10,31-36,37]. From these landmark studies, the following observations can be gleaned: there is a lack of complete conservation of functional binding sites;

The evolution of some regulatory elements might also be consistent with a model of compensatory or epistatic selection. A pair of mutations at different sites (loci) that are singly deleterious but restore normal fitness in combination may be called compensatory neutral mutations [39]. Kimura demonstrated that these mutations could easily become fixed in the population by genetic drift when the genes are tightly linked [39]. Carter and Wagner [40] modeled this process as it might apply to regulatory sequences. They found that large population size accelerates compensatory evolution, whereas small population sizes inhibit this form of drift from occurring. They also interpreted comparative studies of enhancer structure/function as indicating faster turnover rates of Drosophila enhancers than vertebrate enhancers (a conclusion worthy of further investigation), consistent with a prediction of their model.


Origin of cis-elements with novel function


Recent progress has been made in understanding the evolution of new genes [41], including a proposed role for the evolutionary specialization of duplicates involving duplicated regulatory sequences [42,43]. Directional evolution of cis-elements can also bring about new patterns of gene expression [44]. Nevertheless, the mechanisms and processes involved in the origin of new patterns of gene expression remain poorly understood. The major challenge for modeling the evolution of a novel regulatory element is to allow for the stepwise evolutionary progression of an element from one that initially contains, by chance, only a small number of functional transcription factor binding sites [45,46]. Three hypotheses have been proposed for how cis-regulatory elements evolve a new function through the modification and divergence of pre-existing ones: first, duplication and DNA rearrangements involving either all or part of existing functional elements [10]; second, modification of existing elements, for example through gain and loss of binding sites, or acquisition of binding sites for additional transcription factors [10,47]; and third, co-option of an existing element and expansion of its developmental function [10]. These three scenarios adhere to generally accepted principles of developmental biology and population genetics; they involve the evolutionary modification of pre-existing elements and they allow for a novel function to be gained without breaking the primary function of cis-regulatory elements. Is it possible for a cis-regulatory element to evolve de novo from random sequences? In short, no one knows, but some examples of a vertebrate sequence [48-50] that functions similarly to a Drosophila regulatory sequence show that entirely different sequences can direct similar expression. The evolutionary distance between these species, however, precludes our knowing whether the two have evolved from a common ancestor.
Perhaps cases in which apparently nonorthologous sequences in related species carry out the same regulatory function will provide an answer to this question.

follows one given by Arnone and Davidson [1]: a cis-element such as an enhancer is defined as the smallest fragment of DNA that, when linked to a reporter gene and transferred into an appropriate cell, executes a regulatory function in a fashion consistent with that of the native gene in its proper context. This definition evidently simplifies the natural relations between cis-regulatory element structure and function by paring away neighboring sequences in cis, but it does so at the expense of those flanking sequences that have evolved as part of the system regulating gene expression. Second, the cis-regulatory control of transcription has been characterized in depth for only a relatively small number of eukaryotic genes. Few enhancers have been functionally characterized in detail; there is a great need to expand this line of research as a necessary companion to bioinformatics approaches. At best, bioinformatics will refine hypotheses about the evolution of enhancer structure/function, reducing the number of plausible models and experiments needed to test them. Third, the theoretical framework for interpreting evolutionary changes in cis-regulatory sequences is still in its infancy, and what exists is guided by very little informative data.

Acknowledgements
I thank Josep Comeron for providing calculations on the distribution of undiverged sequences after a period of random evolution. I also thank Martin Kreitman for discussions and editorial suggestions.

References and recommended reading




1. Arnone MI, Davidson EH: The hardwiring of development: Organization and function of genomic regulatory systems. Development 1997, 124:1851-1864.
2. Pennacchio LA, Rubin EM: Genomic strategies to identify mammalian regulatory sequences. Nat Rev Genet 2001, 2:100-109.
3. Loots GG, Locksley RM, Blankespoor CM, Wang ZE, Miller W, Rubin EM, Frazer KA: Identification of a coordinate regulator of interleukins 4, 13, and 5 by cross-species sequence comparisons. Science 2000, 288:136-140.
4. Hardison RC: Conserved noncoding sequences are reliable guides to regulatory elements. Trends Genet 2000, 16:369-372.
5. Hardison R, Krane D, Vandenbergh D, Cheng JF, Mansberger J, Taddie J, Schwartz S, Huang Xq, Miller W: Sequence and comparative-analysis of the rabbit alpha-like globin gene-cluster reveals a rapid mode of evolution in a G+C-rich region of mammalian genomes. J Mol Biol 1991, 222:233-249.
6. Takahashi H, Mitani Y, Satoh G, Satoh N: Evolutionary alterations of the minimal promoter for notochord-specific Brachyury expression in ascidian embryos. Development 1999, 126:3725-3734.
7. Ludwig MZ, Patel N, Kreitman M: Functional analysis of eve stripe 2 enhancer evolution in Drosophila: Rules governing conservation and change. Development 1998, 125:949-958.
8. Ludwig MZ, Bergman C, Patel N, Kreitman M: Evidence for stabilizing selection in a eukaryotic cis-regulatory element. Nature 2000, 403:564-567.

Conclusions
It should be evident that a variety of evolutionary processes shape the structure of ncDNA. Functionally important regulatory sequences will tend to be conserved as a result of negative selection against deleterious mutations and positive selection for better-canalized performance. This need not be the case, however; some forms of stabilizing selection can maintain functional conservation of cis-regulatory elements for long periods of evolutionary time despite structural architecture turnover. In addition, compensatory evolution can even accelerate the substitution process in large populations to a level greater than the neutral rate of substitution. Despite advances, cis-regulatory evolution remains poorly understood for several reasons. First, there is no general definition of a cis-regulatory element. A working definition


9. Tautz D: Evolution of transcriptional regulation. Curr Opin Genet Dev 2000, 10:575-579.
10. Carroll SB, Grenier JK, Weatherbee SD: From DNA to diversity: The primacy of regulatory evolution. In From DNA to Diversity: Molecular Genetics and the Evolution of Animal Design. Oxford: Blackwell Science, Inc.; 2001:173-196.
    The authors discuss problems of cis-regulatory evolution. Hypotheses are proposed for how cis-regulatory elements can evolve a new function through the modification of pre-existing ones.
11. Galant R, Carroll SB: Evolution of a transcriptional repression domain in an insect Hox protein. Nature 2002, 415:910-913.
    It is generally believed that body plan evolution can be brought about through the evolution of cis-regulatory sequences. The authors, together with the authors of [12], report that changes in transcriptional regulators such as the HOX proteins may also play an important role. In particular, this paper demonstrates that a functional motif arose in the Ubx protein in insects. An insect-specific glutamine/alanine rich domain correlated with greater repression activity on the gene Distalless limb enhancer. The evolution of this domain may have facilitated the greater morphological diversification of segments characteristic of modern insects.
12. Ronshaugen M, McGinnis N, McGinnis W: Hox protein mutation and macroevolution of the insect body plan. Nature 2002, 415:914-917.
    See annotation [11].
13. Fickett JW, Wasserman WW: Discovery and modeling of transcriptional regulatory regions. Curr Opin Biotechnol 2000, 11:19-24.
14. Krivan W, Wasserman WW: A predictive model for regulatory sequences directing liver-specific transcription. Genome Res 2001, 11:1559-1566.
15. Ondek B, Gloss L, Herr W: The SV40 enhancer contains two distinct levels of organization. Nature 1988, 333:40-45.
16. Gray S, Levine M: Short-range transcriptional repressors mediate both quenching and direct repression within complex loci in Drosophila. Genes Dev 1996, 10:700-710.
17. Barolo S, Levine M: hairy mediates dominant repression in the Drosophila embryo. EMBO J 1997, 16:2883-2891.
18. Schaffner W: Enhancer. In The Encyclopedia of Molecular Biology. Edited by Creighton T. New York: John Wiley and Sons, Inc.; 1999:823-828.
19. Butler JEF, Kadonaga JT: Enhancer-promoter specificity mediated by DPE or TATA core promoter motifs. Genes Dev 2001, 15:2515-2519.
20. Duret L, Bucher P: Searching for regulatory elements in human noncoding sequences. Curr Opin Struct Biol 1997, 7:399-406.
21. Müller F, Blader P, Strähle U: Search for enhancers: teleost models in comparative genomic and transgenic analysis of cis regulatory elements. Bioessays 2002, 24:564-572.
22. Kimura M: The Neutral Theory of Molecular Evolution. Cambridge: Cambridge University Press; 1983.
23. Jareborg N, Birney E, Durbin R: Comparative analysis of noncoding regions of 77 orthologous mouse and human gene pairs. Genome Res 1999, 9:815-824.
24. Shabalina SA, Kondrashov AS: Pattern of selective constraint in C. elegans and C. briggsae genomes. Genet Res 1999, 74:23-30.
25. Webb CT, Shabalina SA, Ogurtsov AY, Kondrashov AS: Analysis of similarity within 142 pairs of orthologous intergenic regions of Caenorhabditis elegans and Caenorhabditis briggsae. Nucleic Acids Res 2002, 30:1233-1239.
    This paper reports the properties of sequence conservation in noncoding regions of related species of nematode. The authors found a mosaic pattern with regions of high similarity (phylogenetic footprints) interspersed with nonalignable sequences. The footprints cover ~20% of intergenic regions. This finding is similar to published estimates of the fraction of sequence conservation in noncoding regions of other complex eukaryotes (cf. [29]).
26. Wong GKS, Passey DA, Huang YZ, Yang ZY, Yu J: Is junk DNA mostly intron DNA? Genome Res 2000, 10:1672-1678.
27. Eddy SR: Computational genomics of noncoding RNA genes. Cell 2002, 109:137-140.
28. Chiaromonte F, Yang S, Elnitski L, Yap VB, Miller W, Hardison RC: Association between divergence and interspersed repeats in mammalian noncoding genomic DNA. Proc Natl Acad Sci USA 2001, 98:14503-14508.
    The authors investigate aligned DNA in four large genomic regions (5.89 Mb of DNA) on different chromosomes from human and mouse. This paper reports that divergence in noncoding nonrepetitive DNA is strongly correlated with the amount of repetitive DNA in all four loci. The results show that overall rates of evolution vary in different segments of the genome, with more flexible regions able to accommodate many point mutations and insertions, whereas more rigid regions tend to accumulate fewer changes of both types.
29. Bergman CM, Kreitman M: Analysis of conserved noncoding DNA in Drosophila reveals similar structural and evolutionary properties of intergenic and intronic sequences. Genome Res 2001, 11:1335-1345.
    This work studies the properties of sequence conservation in noncoding regions by comparing 100 kb of sequence in promoter regions and introns of 40 genes of Drosophila melanogaster and Drosophila virilis. On average, 22-26% of noncoding sequences are conserved (cf. [25]), with median block length ~19 bp. The authors report that patterns of ncDNA structure and evolution differ remarkably little (i.e. are statistically indistinguishable) between intergenic and intronic conserved blocks.
30. Stanojevic D, Small S, Levine M: Regulation of a segmentation stripe by overlapping activators and repressors in the Drosophila embryo. Science 1991, 254:1385-1387.
31. Segal JA, Barnett JL, Crawford DL: Functional analysis of natural variation in Sp1 binding sites of a TATA-less promoter. J Mol Evol 1999, 49:736-749.
32. Wang RL, Stec A, Hey J, Lukens L, Doebley J: The limits of selection during maize domestication. Nature 1999, 398:236-239.
33. Sucena E, Stern DL: Divergence of larval morphology between Drosophila sechellia and its sibling species caused by cis-regulatory evolution of ovo/shaven-baby. Proc Natl Acad Sci USA 2000, 97:4530-4534.
34. Koch MA, Weisshaar B, Kroymann J, Haubold B, Mitchell-Olds T: Comparative genomics and regulatory evolution: conservation and function of the Chs and Apetala3 promoters. Mol Biol Evol 2001, 18:1882-1891.
35. Kim J: Macro-evolution of the hairy enhancer in Drosophila species. J Exp Zool 2001, 291:175-185.
36. McGregor AP, Shaw PJ, Hancock JM, Bopp D, Hediger M, Wratten NS, Dover GA: Rapid restructuring of bicoid-dependent hunchback promoters within and between Dipteran species: implications for molecular coevolution. Evol Devel 2001, 3:397-407.
37. Dermitzakis ET, Clark AG: Evolution of transcription factor binding sites in mammalian gene regulatory regions: conservation and turnover. Mol Biol Evol 2002, 19:1114-1121.
    This paper presents an analysis of evolutionary dynamics of transcription factor binding sites whose function had been experimentally verified in promoters of 51 human genes by comparing their sequences in other primate species and rodents. Turnover of the binding sites was found to be widespread. The authors discuss the efficacy of phylogenetic footprinting and the interpretation of the pattern of evolution in regulatory sequences.
38. Kimura M: Possibility of extensive neutral evolution under stabilizing selection with special reference to nonrandom usage of synonymous codons. Proc Natl Acad Sci USA 1981, 78:5773-5777.
39. Kimura M: The role of compensatory neutral mutations in molecular evolution. J Genet 1985, 64:7-19.
40. Carter AJR, Wagner GP: Evolution of functionally conserved enhancers can be accelerated in large populations: a population-genetic model. Proc R Soc Lond Ser B 2002, 269:953-960.
    This work presents a population-genetic model to explain the different degree of conservation seen in vertebrate and Drosophila enhancers. The model examines the dynamics of fixation of pairs of individually deleterious, but compensating, mutations. The differences in population sizes and generation times are the key characteristics for explaining the observed phenomenon.
41. Long M: Evolution of novel genes. Curr Opin Genet Dev 2001, 11:673-680.
42. Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J: Preservation of duplicate genes by complementary, degenerative mutations. Genetics 1999, 151:1531-1545.


43. Chiu CH, Amemiya C, Dewar K, Kim CB, Ruddle FH, Wagner GP: Molecular evolution of the HoxA cluster in the three major gnathostome lineages. Proc Natl Acad Sci USA 2002, 99:5492-5497.
44. Stern DL: Evolutionary developmental biology and the problem of variation. Evolution 2000, 54:1079-1091.
45. Gerland U, Hwa T: On the selection and evolution of the regulatory DNA motifs. J Mol Evol 2002, 55:386-400.
    The authors construct a theoretical model for the evolution of regulatory sequences under the assumption that fitness of a sequence depends only on its binding affinity to the regulatory protein. The commonly observed fuzziness in binding motifs arises naturally, according to their model, as a consequence of the balance between selection and mutation.
46. Stone JR, Wray GA: Rapid evolution of cis-regulatory sequences via local point mutations. Mol Biol Evol 2001, 18:1764-1770.
    The paper addresses the evolutionary origin of individual binding sites. The authors simulate the evolution of promoters by using standard mutation models and scanning for the appearance of particular binding sites, then calculating the likelihood that these binding sites would become fixed in the population. The results indicate that new binding sites capable of altering gene expression can evolve via local point mutation on short timescales under an assumption of neutral evolution.
47. Galant R, Walsh CM, Carroll SB: Hox repression of a target gene: extradenticle-independent, additive action through multiple monomer binding sites. Development 2002, 129:3115-3126.
    The low DNA-binding specificities of Hox proteins have raised the question of how these transcription factors selectively regulate target gene expression. This paper demonstrates that Ubx directly regulates a flight appendage-specific cis-regulatory element of the spalt gene. The results of this work suggest that HOX proteins can regulate target genes directly in the absence of such protein cofactors as Extradenticle. The authors propose that the regulation of some HOX target genes evolves via the stepwise accumulation of multiple HOX monomer binding sites within cis-regulatory elements.
48. Abel T, Bhatt R, Maniatis T: A Drosophila CREB/ATF transcriptional activator binds to both fat body- and liver-specific regulatory elements. Genes Dev 1992, 6:466-480.
49. Falb D, Maniatis T: A conserved regulatory unit implicated in tissue-specific gene expression in Drosophila and man. Genes Dev 1992, 6:454-465.
50. Gonzalez-Crespo S, Levine M: Related target enhancers for dorsal and NF-kB signaling pathways. Science 1994, 264:255-258.