Opinion TRENDS in Genetics Vol.17 No.1 January 2001 23

Evolution of genome size: new approaches to an old problem

Dmitri A. Petrov

Eukaryotic genomes come in a wide variety of sizes. Haploid DNA contents (C values) range >80 000-fold without an apparent correlation with either the complexity of the organism or the number of genes. This puzzling observation, the C-value paradox, has remained a mystery for almost half a century, despite much progress in the elucidation of the structure and function of genomes. Here I argue that new approaches focussing on the genetic mechanisms that generate genome-size differences could shed much light on the evolution of genome size.

Because DNA is the stuff of genes, it is natural to think that more complex organisms would require more genes and thus have more DNA. Paradoxically, however, even the initial observations [1] showed that many apparently simple organisms could have over a thousand times more DNA than presumably more complex multicellular organisms. It is difficult to see why some amoebas would need 200 times more DNA than humans [2], or why lilies would need over 200 times more DNA than rice. Research into these differences soon revealed that in many organisms much of the DNA is noncoding and often repetitive [3]. This provided a solution to the original paradox in that it showed that apparently simple organisms probably do have fewer essential genes than more complex organisms, even though they sometimes have larger genomes because of larger amounts of noncoding DNA. However, these discoveries also generated a whole set of new questions. Which evolutionary forces could produce vast amounts of noncoding DNA? What is the adaptive function, if any, of the nongenic DNA? If nongenic DNA does not have an adaptive role, why would natural selection tolerate the burden of extra DNA?
D.A. Petrov, Department of Biological Sciences, Stanford University, Stanford, CA 94305, USA. e-mail: dpetrov@stanford.edu

'Adaptive' versus 'junk DNA' theories of genome-size evolution

Traditionally, theories of genome-size evolution primarily attempted to solve the puzzle of the apparent wastefulness of Nature; that is, why would many genomes have vast amounts of extra DNA considering the actual informational needs of the organism? There are two broad classes of explanations. The 'adaptive' theories postulate an adaptive function for this extra DNA given that DNA abundance, rather than its information content, can have a direct and significant effect on phenotype [4] (Box 1). For instance, a larger genome size could be adaptive because it directly or indirectly increases nuclear and cellular volumes [4,5], helps to buffer fluctuations in the concentration of regulatory proteins [6] or protects coding DNA from mutation [7]. According to these hypotheses, the observed variation in genome size reflects different adaptive needs or the efficacy of natural selection in different organisms.

Alternatively, the 'junk DNA' theories propose that the extra DNA is indeed extra - that it is useless, maladaptive DNA fixed by random drift and carried passively in the chromosomes [8]. More recent versions of this theory propose that the junk DNA comprises parasitic transposable elements (TEs) (the 'selfish' DNA hypothesis) [9,10]. According to these theories, purifying selection against the accumulation of useless DNA is often not strong enough to counteract completely the steady stream of DNA addition through transposition and pseudogene formation. The final genome size is then set at the highest tolerable maximum, which depends on the particular ecological and developmental needs of the organism.

Evolutionary forces affecting genome size

The current dichotomy between adaptive and junk DNA theories places the focus squarely on the question of whether extra DNA benefits the organism.
But could there be more to it? I propose that by focussing exclusively on the question of the function (or lack thereof) of extra DNA, the current debate obscures differences between various evolutionary explanations that might also be relevant. The easiest way to see this is to cast the question of genome-size evolution in terms of population genetics. Whatever the evolutionary scenarios of genome-size change might be, they must involve mutational mechanisms of addition and loss of DNA. Figure 1 shows some of these mechanisms, including the activity of TEs, spontaneous deletions and insertions, genome duplications and many others. The genome-size variants sometimes affect phenotype and thus have to go through natural selection before being fixed in the genome. It is also likely that, within a certain range, genome-size variants could be of such similar selective values that their ultimate fates are determined primarily by neutral drift (Box 2).

The essential point is that changes in genome size can occur through modulation of any evolutionary force in Fig. 1. Certainly, changes in the strength and direction of natural selection can do that. But also, unless natural selection is exceptionally strong and does not allow any genetic drift to take place, modulation of the strength of any of the mutational factors shown in Fig. 1 should also affect genome size. For instance, other things being equal, an increase in transposition rates should lead to an increase in genome size, even though the exact magnitude of the increase will depend on the strength of natural selection for or against genome-size growth, and on other factors, such as the availability of nondeleterious insertion sites. The same applies to changes in the average rates

http://tig.trends.com 0168-9525/01/$ - see front matter ©2001 Elsevier Science Ltd. All rights reserved. PII: S0168-9525(00)02157-0

Box 1.
Phenotypic correlates of genome size

Most evidence for the adaptive evolution of genome size comes from the numerous observations of correlations between genome size and various phenotypic traits of apparent selective significance. The strongest positive correlation of the genome size is with the cellular and nuclear sizes [a].

[Fig. I. Plot; axis labels and data not recoverable.]

changes in mutational patterns in different organisms. That is true. By focusing only on the function of extra DNA, adaptive and junk DNA theories do not need to be explicit regarding these issues. But this is exactly my point - by not taking a stand on exactly which evolutionary forces produce changes in genome size and on the relative importance of these forces, the above theories fail to make explicit important distinctions between different scenarios of genome-size evolution.

I suggest that, in addition to thinking of genome-size evolution in terms of the adaptive and junk DNA theories, whenever possible we should also investigate directly all of the potential evolutionary scenarios, including all of the mutational and selective forces potentially affecting genome size. We should then try to estimate directly the strength of individual forces in different organisms to see whether the modulation in strength of the individual forces corresponds to changes in genome size. Such an approach should not only tell us which genetic mechanisms and selective forces affect genome size, but also give us a quantitative sense of their relative importance. For instance, if in a particular group of organisms, variation in the strength of natural selection for or against DNA addition explains 95% of the variation in genome size, whereas variation in the rate of transposition explains 5%, we will be able to tell which force is more important and do it quantitatively.
In the event that natural selection proves to be the dominant force, we would then need to determine which of the many phenotypic correlates of genome-size changes are selectively important. But it is important to know that natural selection is indeed modulating genome size before trying to find phenotypic reasons for this selection.

Of course, different evolutionary forces can be important in different organisms and across different time frames. It might turn out, for instance, that different forces are important in animals versus plants, or in long-term evolution of genome size versus genome-size differences between closely related species. But if we cast the net broadly enough, in the end this approach can help bring about a robust understanding of genome-size evolution.

[Fig. 1 diagram labels]
Mutation pressure:
- Spontaneous deletion: number of deletions, size of deletions
- Spontaneous insertion: number of insertions, size of insertions
- Chromosomal mechanisms: genome duplication, polysomy, duplication/deletion, accessory chromosomes
- Transposable elements: insertion, excision, proliferation, repression
- Microsatellites: expansion, shrinkage
- Heterochromatin: expansion, shrinkage
Negligible effects on fitness: random genetic drift
Selection pressure:
- Physiological effects: DNA replication, nuclear volume, metabolic rate (and others)
- Constraints: polypeptide length, intron length, intergenic spacers

Fig. 1. The forces affecting genome-size evolution. DNA-length mutants are created through a variety of mechanisms shown at the top, producing mutational pressure either to expand or contract the genome size. Some of these mutations affect the phenotype and undergo natural selection. Some might have negligible selective effects and are governed primarily by genetic drift. The combined interplay of all these forces affects genome size.
Distinguishing evolutionary forces of genome-size change

Given the multitude of genetic and population processes involved, the main challenge of the approach I am advocating is to find a way of distinguishing among them and of studying them individually. One way to distinguish among the forces acting on genome size is to consider the timescale over which these forces could be effective, as different mutational mechanisms act on very different timescales. The activity of TEs is relatively fast, potentially amplifying the transposable-element copy number by 20-100 copies (~0.1-1 Mbp) in a single generation [11-16]. By contrast, changes of genome size through small spontaneous deletions and insertions are relatively slow, with, for example, the Drosophila melanogaster genome losing less than a single base pair per generation [17].

Of course, the evolutionary impact of these changes depends on the probability of their becoming fixed in the genome. If there is strong selection against increases in genome size, even strong mutational pressure to increase genome size would not affect the long-term evolution of genome size. However, strong selection for a change in genome size could substantially enhance the impact of slow mutational mechanisms by increasing the probability and the rate of fixation of length variants over the neutral expectations.

The above considerations could help us restrict our search for the mechanisms responsible for genome-size evolution if we know the timescale of genome-size divergence in a particular case. Specifically, we can assert that genome-size changes between closely related organisms must be due either to very fast (relative to the time of species divergence) mutational mechanisms or to natural selection. However, the long-term genome size, established over very long periods of time, is affected by all of the forces in Fig. 1, both fast and slow. In the long run, slow and steady forces could be just as powerful as the quick but sporadic ones.
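The statement that selection can raise or lower the fixation probability of a length variant relative to the neutral expectation can be illustrated with a small numerical sketch. The code below is not taken from the article: it assumes Kimura's standard diffusion approximation for a new mutation in a diploid population whose effective size equals its census size, and the function name, population size and selection coefficients are hypothetical values chosen only for illustration.

```python
import math


def fixation_probability(s: float, n: int) -> float:
    """Fixation probability of a single new mutation (initial frequency 1/(2n)).

    Assumption (not from the article): Kimura's diffusion approximation for a
    diploid population of size n with effective size equal to census size,
    where s is the selection coefficient of the length variant.
    """
    if abs(s) < 1e-12:
        return 1.0 / (2 * n)  # neutral expectation
    exponent = -4.0 * n * s
    if exponent > 700:
        return 0.0  # strongly deleterious: effectively never fixed
    return (1.0 - math.exp(-2.0 * s)) / (1.0 - math.exp(exponent))


# Hypothetical illustration: the same variant under weak selection
# against it, neutrality, and weak selection for it.
N = 10_000
neutral = fixation_probability(0.0, N)
for s in (-1e-4, 0.0, 1e-4):
    ratio = fixation_probability(s, N) / neutral
    print(f"s = {s:+.0e}: fixation probability relative to neutral = {ratio:.3f}")
```

Under these assumptions even a weak selective effect shifts the fixation probability several-fold in either direction, which is why a slow mutational mechanism can matter more, or less, than its raw rate suggests.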
Global nature of forces affecting genome size

Another way to differentiate among multiple forces acting on genome size is to consider the scope of their action. Some forces, such as natural selection acting on a trait correlated with total genome size, are global in the sense that they affect the size of all genomic sequences, provided that these sequences are free to vary in size and that size variation exists. For example, if a lineage experiences an increased selection for a shorter developmental time (and therefore for more rapid DNA replication and cell division), then all of the unconstrained or weakly constrained sequences should be reduced in size. In other cases, however, a force affecting genome size could have a more limited scope and would only affect some genomic components and not others: an increase in the rate of heterochromatin shrinkage through a mechanism specific to heterochromatic DNA should not affect the size of euchromatin, and an expansion of satellite DNA through a mechanism specific to satellite DNA should not affect the size of satellite-free sequences.

Two forces clearly have the capacity to affect the whole genome: natural selection and global deletion-insertion biases. The activity of TEs could also have global effects on genome size, at least in the long run when copies of TEs can no longer be recognized, yet continue to take up space. Other forces, such as satellite expansion, heterochromatic shrinkage or expansion, creation of accessory chromosomes and so forth have a much more limited scope of action. Polyploidization is also not expected to increase the size of all genomic compartments, but rather to make them more numerous.

Thus, a fundamental question in the study of genome-size evolution is whether different genomic components vary together in a correlated fashion. Data of this kind are unfortunately limited. However, a number of cytogenetic and molecular studies give us a glimpse of a general pattern.
The first key observations were made by cytogeneticists who showed that genome-size differences are scattered generally throughout (at least) the euchromatic portion of the genome [18-23]. Subsequent molecular studies support these claims. A comparison of orthologous mammalian introns revealed a correlation between the average size of introns and genome size [24]. A comparison of 115 complete introns present in 42 homologous genes in Drosophila virilis and D. melanogaster [25] showed that the average length of introns is larger in D. virilis than in D. melanogaster (P value = 0.003). The mean intron lengths for the two species were 394 and 283 bp, respectively, a difference of 39%. Interestingly, the size of the euchromatic genome of D. virilis exceeds that of D. melanogaster by 36% (150 Mb versus 110 Mb) [26,27]. The trend of the correlated change in intron sizes with the changes in euchromatic genome size was confirmed by the analysis of a very large number of introns in model organisms [28,29]. Note that the change in intron length does not account for all the changes in genome size, implying that other sequences in the genomes grow or shrink together with introns.

Increases in genome size have also been associated with increases in the copy number of TEs [13,16,30-33], increases in the amount of simple repeated sequences [34], the presence of large numbers of pseudogenes [35], increases in the size of inter-enhancer spacers (C.M. Bergman, pers. commun.), and increases in the size or abundance of microsatellites [30,36]. The tentative trend emerging from these studies is that, when genomes change in size, they do it across all genomic components, implicating a global force as the agent of genome-size change. This conclusion is at best preliminary and much additional work is required before it is firmly established.
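The two percentages quoted for D. virilis and D. melanogaster can be checked with a few lines of arithmetic. The sketch below only recomputes the relative excess from the values given in the text; the function name is a hypothetical helper.

```python
def percent_excess(larger: float, smaller: float) -> float:
    """Relative excess of `larger` over `smaller`, in percent."""
    return 100.0 * (larger - smaller) / smaller


# Values quoted in the text (D. virilis versus D. melanogaster).
intron_excess = percent_excess(394, 283)   # mean intron length, bp
genome_excess = percent_excess(150, 110)   # euchromatic genome size, Mb

print(f"Mean intron length excess: {intron_excess:.1f}%")
print(f"Euchromatic genome size excess: {genome_excess:.1f}%")
```

The two relative differences are of similar magnitude, which is the sense in which intron size and euchromatic genome size are said to change in a correlated fashion.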
Measuring individual forces

How can we assess the relative importance of particular mechanisms as forces of this genome-size change? As genome-size changes are reflected in all genomic components, we cannot answer this question simply by testing whether any one particular class of sequences is amplified in large genomes or reduced in compact ones. The action of any global force would produce very similar long-term effects. What we need are direct experimental studies of the strength of the individual mechanisms of genome-size evolution in action. The lack of viable experimental approaches has long been an obstacle to such studies. However, two new approaches (discussed below) [33,37] have changed this, at least for some of the potential mechanisms of genome-size change.

Long-term estimates of transposable element activity

The activity of TEs shapes much of eukaryotic genomes and has a major impact on the evolution of genome size. Because transposition rates are generally higher than excision rates, TEs increase genome size. Although one can easily envision how a change in transposition rate could result in a change of genome size, the difficulty is that the change in the copy number of TEs in a genome cannot be taken as strong evidence of a change in transposition rates; modulation of any global genomic force would produce the same end result. However, a recent study [33,38] demonstrates a way in which it is possible to estimate the average rate of transposition and fixation of at least some TEs in a predetermined genomic region.

The approach relies on contiguous sequencing of a defined genomic region and identification of all long-terminal repeat (LTR)-containing retrotransposable elements in the region. The authors then use sequence divergence of LTRs in individual elements to estimate when their insertions into the region took place. The mechanism of transposition of LTR retroelements ensures that at the time of insertion the 5' and 3' LTRs are identical in sequence.
However, after insertion they start evolving independently and diverging in sequence. The extent of the divergence can then be used to calibrate the age of each element.

SanMiguel and colleagues [33,38] used this approach in maize to obtain a remarkable result. In the 240 kb of contiguous sequence around the adh1 gene they found 23 copies of TEs belonging to 11 families of retrotransposons. These 23 copies of retrotransposons accounted for over 160 kb. Importantly, the LTR analysis demonstrated that all elements have transposed in the past 6 Myr, with most jumping in the past 3 Myr. Assuming that the adh1 region is representative of the maize genome in general (and there is no reason to believe otherwise), this result implies that the maize genome has doubled in size, from 1200 Mbp to 2400 Mbp, in the past 3 Myr, so that about 50% of the present genome is recently added DNA.

How can we interpret these results? One straightforward explanation is that the transposition frequency in maize has increased substantially in the past 3 Myr. Alternatively, it is also possible that the fixation probability of retrotransposons has changed. For example, natural selection against genome-size growth might keep TEs from fixation in maize relatives, but not in maize itself. This study did not address this question, although in principle it can be approached by estimating population frequencies and ages of individual TEs in maize and its relatives (e.g. sorghum). A higher population frequency of TEs of similar age in maize than in sorghum would implicate natural selection, because it would mean that in maize each individual TE insertion has a higher chance of persisting and ultimately being fixed than in sorghum. Similar population frequencies of similarly aged TEs would implicate differences in transposition rate. One complicating factor is that the more compact sorghum genome could have fewer nondeleterious insertion sites available for TEs than the larger genome of maize.
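The LTR-based dating described above can be sketched in a few lines. This is an illustrative outline rather than the procedure used in the cited studies: it assumes the two LTRs are already aligned, applies a Jukes-Cantor correction to the observed divergence, and divides by twice the substitution rate because both LTRs accumulate changes independently. The function names are hypothetical, and any substitution rate passed in (the usage example uses 6.5e-9 per site per year purely as a placeholder) is an assumption that would have to be calibrated for the organism studied.

```python
import math


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor corrected distance from the proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    if p == 0.0:
        return 0.0
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ltr_insertion_age(ltr5: str, ltr3: str, rate_per_site_per_year: float) -> float:
    """Estimated years since insertion from two aligned LTR sequences.

    Assumptions (illustrative only): the LTRs were identical at insertion,
    alignment gaps ('-') are ignored, and the substitution rate is constant.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("LTR sequences must be aligned to equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a == "-" or b == "-":
            continue  # skip indel columns; only substitutions are counted
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    k = jukes_cantor_distance(mismatches / compared)
    # Both LTRs diverge independently, hence the factor of two.
    return k / (2.0 * rate_per_site_per_year)


# Hypothetical toy example: two substitutions in 100 aligned sites.
left = "ACGT" * 25
right = "T" + left[1:50] + "A" + left[51:]
age = ltr_insertion_age(left, right, 6.5e-9)  # placeholder rate, an assumption
print(f"Estimated insertion age: {age / 1e6:.2f} Myr")
```

Because the estimate scales inversely with the assumed substitution rate, the relative ordering of element ages is more robust than their absolute values.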
The use of a predefined contiguous genomic region is essential in this study as it allows the study of the number and ages of all elements present in the region and extrapolation of the results to the whole genome. However, this requirement makes it difficult to implement in organisms where little genomic analysis has been done. The hope is that continuing technological advances in cloning and sequencing will mitigate this difficulty and allow the use of this method in a broad array of organisms. It will be important to see whether genome-size differences across larger evolutionary distances than those in the studies of SanMiguel and colleagues [33,38] are also due (at least partly) to changes in TE activity. Alternatively, it is possible that many lineages experience transient bursts of transposition and genome-size growth, but that over longer times most lineages experience a similar number of bursts without generating long-term divergences in genome size [39].

Deletion-insertion spectra

The spectrum (distribution) of size and frequency of small spontaneous nucleotide insertions and deletions (indels) is one of the important parameters in the long-term evolution of genome size. If spontaneous insertions are more frequent and longer on average than deletions, this would generate a persistent global pressure towards genome-size growth. The reverse would be true if deletions are more frequent and longer on average than insertions. If the spectrum varies among lineages, this would provide a force for diversification of genome sizes.

The first time that the indel spectrum was investigated as a parameter in genome-size evolution was in the study of mammalian pseudogenes [40]. There, spontaneous deletions outnumbered insertions and were longer on average. Intriguingly, the DNA loss was estimated to be faster in rodents than in humans, corresponding to the smaller rodent genomes.
However, the overall rate of spontaneous DNA loss in mammals turned out to be so low as to appear a minor parameter in genome-size evolution [40-42].

Use of nonfunctional sequences, such as bona fide pseudogenes, is essential in the study of all spontaneous mutations, including indels. The reason is that occurrence of mutations in nonfunctional sequences is affected only by how frequently they are formed and not by natural selection for the information content, whereas the varied selective effects of different kinds of mutation in functional sequences confound meaningful inference. Unfortunately, pseudogenes are very rare in most model organisms (e.g. in Drosophila) and are not available in less well characterized genomes, preventing us from studying indel spectra (and other kinds of mutation) in diverse organisms.

Even though many eukaryotes do not have a large number of bona fide pseudogenes, practically all carry identifiable non-functional DNA. In particular, non-LTR retrotransposable elements are ubiquitous in eukaryotes (with the exception of yeast) and commonly generate non-functional, transpositionally defunct copies. We recently showed that it is possible to use maximum parsimony analysis to separate mutations that occur in active, master lineages (that are subject to selection) from those that occur in the inactive, pseudogene-like copies (that are neutral) [37,43]. The set of mutations in inactive copies can then be used to estimate mutational patterns. The initial application of this method confirmed that spontaneous deletions are more frequent than insertions, but also revealed a striking difference in the indel spectra between Drosophila and mammals, with 60-fold faster DNA loss in Drosophila corresponding to its much more compact genome [37,44]. The rate of DNA loss in Drosophila is also quite substantial in absolute terms, resulting in the loss of 50% of pseudogene DNA in approximately 14 Myr.
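The half-life figure quoted above lends itself to a short worked example. The sketch below assumes, as a simplification not stated in the text, that unconstrained DNA is lost at a constant per-base-pair rate, so that the surviving fraction decays exponentially. It takes the 14 Myr half-life and the 60-fold rate difference from the text; the function names are hypothetical.

```python
import math


def loss_rate_per_myr(half_life_myr: float) -> float:
    """Per-Myr loss rate implied by a half-life, assuming exponential decay."""
    return math.log(2.0) / half_life_myr


def fraction_remaining(t_myr: float, half_life_myr: float) -> float:
    """Fraction of an unconstrained sequence left after t_myr million years."""
    return 0.5 ** (t_myr / half_life_myr)


DROSOPHILA_HALF_LIFE_MYR = 14.0   # from the text
FOLD_SLOWER_IN_MAMMALS = 60.0     # from the text

# Under the constant-rate assumption, a 60-fold lower loss rate
# corresponds to a 60-fold longer half-life.
mammal_half_life = DROSOPHILA_HALF_LIFE_MYR * FOLD_SLOWER_IN_MAMMALS

for t in (14.0, 50.0, 100.0):
    fly = fraction_remaining(t, DROSOPHILA_HALF_LIFE_MYR)
    mammal = fraction_remaining(t, mammal_half_life)
    print(f"after {t:5.0f} Myr: Drosophila {fly:.3f}, mammals {mammal:.3f}")
```

On this simplified model a Drosophila pseudogene is largely erased over a period in which a mammalian one remains almost intact, which illustrates why the difference in indel spectra could matter for long-term genome size.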
To test further whether indel spectra affect genome size, we used the non-LTR approach in another group of insects, Hawaiian crickets of the genus Laupala, which have genomes that are 11 times larger on average than those of Drosophila species. Our study revealed a 40-fold lower rate of DNA loss in these crickets than in Drosophila, in agreement with the genome-size difference [45].

The use of pseudogenes avoids the problem of natural selection for information content of a sequence, but not the problem of selection acting on the bulk of DNA. Thus the difference in observed indel spectra could be due either to the differences in mutational mechanisms or, just as in the case of TE accumulation, to differences in natural selection for genome size [46]. However, there is strong evidence that natural selection for genome size, acting directly on individual indels, cannot explain the observed differences in indel patterns [43,45,47]. Deletions of different sizes do not appear to have different probabilities of persistence in the population as would be predicted if natural selection acted on them on the basis of their effect on genome size. Strong selection on small deletions and insertions would be expected to act even more strongly on much larger insertions of TEs in the same studies, which was not observed. In addition, the deletion profiles change extremely abruptly (e.g. 3- to 5-bp deletions are found in equal frequencies in Drosophila and mammals, whereas deletions of 6- to 8-bp are 25-fold more frequent in Drosophila) - much more abruptly than predicted by selective scenarios.

Because non-LTR elements are ubiquitous in eukaryotes and can be cloned easily [45,48], the non-LTR-element-based method can be used to study mutational patterns in a comprehensive range of organisms. However, one should not ignore alternative sources of unconstrained DNA.
Indeed, recent studies of indel frequency and size spectra in Caenorhabditis elegans using a large pseudogene family [49] and in brown mountain grasshoppers using nuclear insertions of mitochondrial DNA [50] show that it can be done. These studies add further evidence that differences in indel spectra could indeed underlie some changes in genome size. Clearly, it is not important exactly which unconstrained sequences are used to assay indel spectra. What is essential is to study indel patterns in a phylogenetically broad array of organisms to answer the question of how much of the variation in eukaryotic genome sizes is attributable to the variation in indel spectra.

Conclusion

The question of the C-value paradox has puzzled us for almost half a century. For much of this time the debate has centered on whether the vast amounts of noncoding DNA have any functional, adaptive role. This question needs to be settled if we are to understand fully the evolution of genome size. However, I believe the primary focus on this one question distracts us from other essential questions. How important is genetic drift in genome-size evolution? Do mutational rates of DNA addition and loss vary between different organisms independently of natural selection for small or large genome size? If yes, do mutational rates of DNA addition and loss correlate with genome size? These questions are much more mechanistic and address the process of genome-size evolution rather than the functional significance of genome size. One possible reason for their exclusion from debate is that experimental approaches that can be used to address them were lacking until now. However, recent studies on the long-term rates of TE mobilization and on the rates of small deletions and insertions discussed in this review are just two examples of these new experimental approaches.
I believe that we are getting closer to being able to study all possible evolutionary scenarios directly. Such studies are bound to disclose much information on the process of genome-size evolution and might even clarify the question of the possible adaptive significance of 'extra' DNA in eukaryotic genomes.

Acknowledgements

I thank M. Siegal, D. Hartl, D. Bensasson, T. Gregory and E. Zuckerkandl for very helpful comments. Comments by the two anonymous referees and the editor helped to improve the manuscript significantly.

References

1 Mirsky, A.E. and Ris, H. (1951) The DNA content of animal cells and its evolutionary significance. J. Gen. Physiol. 34, 451-462
2 Thomas, C.A. (1971) The genetic organization of chromosomes. Annu. Rev. Genet. 5, 237-256
3 John, B. and Miklos, G.L.G. (1988) The Eukaryotic Genome in Development and Evolution, Allen & Unwin
4 Bennett, M.D. (1971) The duration of meiosis. Proc. R. Soc. Lond. B Biol. Sci. 178, 259-275
5 Cavalier-Smith, T. (1978) Nuclear volume control by nucleoskeletal DNA, selection for cell volume and cell growth rate, and the solution of the DNA C-value paradox. J. Cell Sci. 34, 247-278
6 Vinogradov, A.E. (1998) Buffering: a possible passive-homeostasis role for redundant DNA. J. Theor. Biol. 193, 197-199
7 Hsu, T.C. (1975) A possible function of constitutive heterochromatin: the bodyguard hypothesis. Genetics 79 (Suppl.), 137-150
8 Ohno, S. (1972) So much 'junk' in our genomes. In Evolution of Genetic Systems (Smith, H.H., ed.), pp. 366-370, Brookhaven Symp. Biol.
9 Orgel, L.E. and Crick, F.H.C. (1980) Selfish DNA: the ultimate parasite. Nature 284, 604-607
10 Doolittle, W.F. and Sapienza, C. (1980) Selfish genes, the phenotype paradigm and genome evolution. Nature 284, 601-603
11 O'Neill, R.J. et al. (1998) Undermethylation associated with retroelement activation and chromosome remodelling in an interspecific mammalian hybrid. Nature 393, 68-72
12 Bingham, P.M. et al. (1982) The molecular basis of P-M hybrid dysgenesis: the role of the P element, a P-strain specific transposon family. Cell 29, 995-1004
13 Finnegan, D.J. and Fawcett, D.H. (1986) Transposable elements in Drosophila melanogaster. In Oxford Surveys of Eukaryotic Genes (MacLean, N., ed.), pp. 1-62, Oxford University Press
14 Petrov, D.A. et al. (1995) Diverse transposable elements are mobilized in hybrid dysgenesis in Drosophila virilis. Proc. Natl. Acad. Sci. U. S. A. 92, 8050-8054
15 Walbot, V. et al. (1988) Regulation of mutator activities in maize. Basic Life Sci. 47, 121-135
16 Kalendar, R. et al. (2000) From the cover: genome evolution of wild barley (Hordeum spontaneum) by BARE-1 retrotransposon dynamics in response to sharp microclimatic divergence. Proc. Natl. Acad. Sci. U. S. A. 97, 6603-6607
17 Petrov, D.A. and Hartl, D.L. (1998) High rate of DNA loss in the Drosophila melanogaster and Drosophila virilis species groups. Mol. Biol. Evol. 15, 293-302
18 Keyl, H.G. (1965) A demonstrable local and geometric increase in the chromosomal DNA of Chironomus. Experientia 21, 191-193
19 Jones, R.N. and Rees, H. (1968) Nuclear DNA variation in Allium. Heredity 23, 591-605
20 Narayan, R.K.J. (1982) Constraints upon the organization and evolution of chromosomes in Allium. Theor. Appl. Genet. 75, 319-329
21 Ohri, D. et al. (1998) Evolution of genome size in Allium (Alliaceae). Plant Syst. Evol. 1998, 57-86
22 Ohri, D. and Khoshoo, T.N. (1986) Plant DNA contents and systematics. In DNA Systematics (Dutta, S.K., ed.), pp. 2-19, CRC Press
23 Uozu, S. et al. (1997) Repetitive sequences: cause for variation in genome size and chromosome morphology in the genus Oryza. Plant Mol. Biol. 35, 791-799
24 Ogata, H. et al. (1996) The size differences among mammalian introns are due to the accumulation of small deletions. FEBS Lett. 390, 99-103
25 Moriyama, E.N. et al. (1998) Genome size and intron size in Drosophila. Mol. Biol. Evol. 15, 770-773
26 Ashburner, M. (1989) Drosophila: A Laboratory Handbook, Cold Spring Harbor Laboratory Press
27 Powell, J.R. (1997) Progress and Prospects in Evolutionary Biology: the Drosophila Model, Oxford University Press
28 Deutsch, M. and Long, M. (1999) Intron-exon structures of eukaryotic model organisms. Nucleic Acids Res. 27, 3219-3228
29 Vinogradov, A.E. (1999) Intron-genome size relationship on a large evolutionary scale. J. Mol. Evol. 49, 376-384
30 Crollius, H.R. et al. (2000) Characterization and repeat analysis of the compact genome of the freshwater pufferfish Tetraodon nigroviridis. Genome Res. 10, 939-949
31 Brenner, S. et al. (1993) Characterization of the pufferfish (Fugu) genome as a compact model vertebrate genome. Nature 366, 265-268
32 Smit, A.F. (1999) Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr. Opin. Genet. Dev. 9, 657-663
33 SanMiguel, P. e