Models of Genome Evolution

Zhou, Yi; Mishra, Bud

doi:10.1007/978-3-642-18734-6_13

Part of the book series: Natural Computing Series (NCS)

Summary

The theory of “evolution by duplication”, originally proposed by Susumu Ohno in 1970, can now be verified with the available genome sequences. Recently, several mathematical models that implement the idea of “evolution by duplication” have been proposed to explain the topology of protein interaction networks. The power-law distribution, with its “hubby” topology (e.g., P53 was shown to interact with an unusually large number of other proteins), can be explained under the following assumption: new proteins, which are duplicates of older proteins, have a propensity to interact only with the same proteins as their evolutionary predecessors. Since protein interaction networks, as well as other higher-level cellular processes, are encoded in genomic sequences, the evolutionary structure, topology, and statistics of many biological objects (pathways, phylogeny, symbiotic relations, etc.) are rooted in the evolutionary dynamics of the genome sequences. Susumu Ohno’s hypothesis can be tested “in silico” using Pólya’s urn model. In our model, each basic DNA sequence change is modelled using several probability distribution functions. These functions determine the insertion/deletion positions of the DNA fragments, the copy numbers of the inserted fragments, and the sequences of the inserted/deleted pieces. Moreover, the functions can be interdependent. A mathematically tractable model can be created with a directed graph representation. Such graphs are Eulerian, and each possible Eulerian path encodes a genome. Every “genome duplication” event evolves these Eulerian graphs, and the probability distributions and their dynamics give rise to many intriguing and elegant mathematical problems. In this chapter, we explore and survey these connections between biology, mathematics and computer science in order to reveal simple, and yet deep, models of life itself.
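As a rough illustration of the Pólya urn idea mentioned in the summary, the following minimal sketch simulates the classic urn: a ball is drawn with probability proportional to its colour's current count and returned together with one extra ball of the same colour. Reading colours as sequence fragments and the extra ball as a duplication event is an assumption made here for illustration only; the function name `polya_urn` and its parameters are hypothetical, and the chapter's actual model (with its interdependent position, copy-number, and sequence distributions) is not reproduced.

```python
import random


def polya_urn(initial_counts, steps, seed=None):
    """Simulate a classic Polya urn (illustrative sketch, not the chapter's model).

    initial_counts: dict mapping a label (e.g. a fragment name) to its starting count.
    steps: number of draw-and-reinforce events to simulate.
    seed: optional seed for reproducibility.
    Returns a new dict with the final counts.
    """
    rng = random.Random(seed)
    counts = dict(initial_counts)  # copy so the caller's dict is not modified
    for _ in range(steps):
        labels = list(counts)
        weights = [counts[label] for label in labels]
        # Draw one label with probability proportional to its current count.
        drawn = rng.choices(labels, weights=weights, k=1)[0]
        # Reinforcement: the drawn label gains one copy ("duplication").
        counts[drawn] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical starting urn: one copy each of four fragment types.
    final = polya_urn({"A": 1, "C": 1, "G": 1, "T": 1}, steps=1000, seed=0)
    print(final)
```

Because labels that are already abundant are more likely to be drawn again, repeated runs with different seeds tend to end with quite uneven counts, which is the "rich get richer" behaviour that makes urn models a natural starting point for duplication-driven growth.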

References

Peng, C.K. et al: Long-range correlations in nucleotide sequences. Nature 356, 168–170 (1992)
Gomez, S.M., Rzhetsky, A.: Birth of scale-free molecular networks and the number of distinct DNA and protein domains per genome. Bioinformatics 17, 988–996 (2001)
Fields, S., Schwikowski, B., Uetz, P.: A network of protein-protein interactions in yeast. Nature Biotechnology 18, 1257–1261 (2000)
Albert, R., Barabasi, A.-L.: Statistical mechanics of complex networks. Reviews of Modern Physics 74, 48–97 (2002)
Havlin, S. et al: Mosaic organization of DNA nucleotides. Physical Review E 49, 1685–1689 (1994)
Ehrlich, S.D., Viguera, E., Canceill, D.: Replication slippage involves DNA polymerase pausing and dissociation. EMBO Journal 20, 2587–2596 (2001)
Lilley, D.M.J., Eckstein, F.: DNA Repair (Springer, Berlin Heidelberg New York 1998)
Albert, R. et al: The large-scale organization of metabolic networks. Nature 407, 651–654 (2000)
Barabasi, A.L. et al: Lethality and centrality in protein networks. Nature 411, 41–42 (2001)
Gerstein, M., Qian, J., Luscombe, N.M.: Protein family and fold occurrence in genomes: power-law behavior and evolutionary model. Journal of Molecular Biology 313, 673–681 (2001)
Rain, J.C. et al: The protein-protein interaction map of Helicobacter pylori. Nature 409, 211–215 (2001)
Vogelstein, B., Lane, D., Levine, A.J.: Surfing the P53 network. Nature 408, 307–310 (2000)
Johnson, N.L.: Urn models and their application (Wiley 1977)
Ganapathiraju, M. et al: Comparative n-gram analysis of whole-genome protein sequences. In: HLT’02: Human Language Technologies Conference, San Diego, California, USA (2002)
Ohno, S.: Evolution by Gene Duplication (Springer, Berlin Heidelberg New York 1970)
Apweiler, R. et al: The InterPro database, an integrated documentation resource for protein families, domains and functional sites. Nucleic Acids Research 29, 37–40 (2000)
Sole, R.V., Pastor-Satorras, R., Smith, E.: Evolving protein interaction networks through gene duplication. Santa Fe Institute Working Paper 02-02-008 (2002)
Mantegna, R.N. et al: Linguistic features of noncoding DNA sequences. Physical Review Letters 73, 3169–3172 (1994)
Sneppen, K., Maslov, S.: Specificity and stability in topology of protein networks. Science 296, 910–913 (2002)
Buldyrev, S.V. et al: Fractal landscapes and molecular evolution: modeling the myosin heavy chain gene family. Biophysical Journal 65, 2673–2679 (1993)
Eichler, E.E.: Recent duplication, domain accretion and the dynamic mutation of the human genome. Trends in Genetics 17, 661–669 (2001)
Bailey, J.A. et al: Recent segmental duplications in the human genome. Science 297, 1003–1007 (2002)
Graur, D., Li, W.-H.: Fundamentals of Molecular Evolution (Sinauer 2000)
Gu, X., Li, W.-H.: The size distribution of insertions and deletions in human and rodent pseudogenes suggests the logarithmic gap penalty for sequence alignment. Journal of Molecular Evolution 40, 464–473 (1995)
Ophir, R., Graur, D.: Patterns and rates of indel evolution in processed pseudogenes from humans and murids. Gene 205, 191–202 (1997)

Author information

Authors and Affiliations

Biology Department, New York University, New York
Yi Zhou
Courant Institute of Mathematical Sciences, New York University, New York
Bud Mishra
Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor
Bud Mishra

Editor information

Editors and Affiliations

Institute of Computer Science, Romanian Academy of Sciences, 700506, Iasi, Romania
Gabriel Ciobanu
Institute of Advanced Computer Science, Leiden University, Niels Bohrweg 1, 2333 CA, Leiden, The Netherlands
Grzegorz Rozenberg

About this chapter

Cite this chapter

Zhou, Y., Mishra, B. (2004). Models of Genome Evolution. In: Ciobanu, G., Rozenberg, G. (eds) Modelling in Molecular Biology. Natural Computing Series. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18734-6_13

DOI: https://doi.org/10.1007/978-3-642-18734-6_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-62269-4
Online ISBN: 978-3-642-18734-6
eBook Packages: Springer Book Archive