Models of Genome Evolution

  • Chapter
Modelling in Molecular Biology

Part of the book series: Natural Computing Series ((NCS))

Summary

The theory of “evolution by duplication”, originally proposed by Susumu Ohno in 1970, can now be tested against the available genome sequences. Recently, several mathematical models that implement the idea of “evolution by duplication” have been proposed to explain the topology of protein interaction networks. The power-law distribution with its “hubby” (hub-dominated) topology (e.g., P53 was shown to interact with an unusually large number of other proteins) can be explained under the following assumption: new proteins, which are duplicates of older proteins, have a propensity to interact only with the same proteins as their evolutionary predecessors. Since protein interaction networks, as well as other higher-level cellular processes, are encoded in genomic sequences, the evolutionary structure, topology, and statistics of many biological objects (pathways, phylogeny, symbiotic relations, etc.) are rooted in the evolutionary dynamics of the genome sequences. Susumu Ohno’s hypothesis can be tested “in silico” using Pólya’s urn model. In our model, each basic DNA sequence change is modelled using several probability distribution functions. These functions determine the insertion/deletion positions of the DNA fragments, the copy numbers of the inserted fragments, and the sequences of the inserted/deleted pieces. Moreover, the functions can be interdependent. A mathematically tractable model can be created with a directed graph representation. Such graphs are Eulerian, and each possible Eulerian path encodes a genome. Every “genome duplication” event evolves these Eulerian graphs, and the probability distributions and their dynamics give rise to many intriguing and elegant mathematical problems. In this chapter, we explore and survey these connections between biology, mathematics and computer science in order to reveal simple, yet deep, models of life itself.
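
To make the urn idea concrete, the following is a minimal sketch of the classic Pólya urn, read here as a toy model of duplication: each colour stands for a family of related sequences, and each step duplicates one existing copy chosen uniformly at random, so large families tend to grow faster. This is an illustration under stated assumptions, not the chapter's model; the function name `polya_urn`, the choice of 20 initial families, and the 2000 duplication steps are all hypothetical, and the sketch ignores deletions, insertion positions, and the interdependent distributions described in the summary.

```python
import random


def polya_urn(initial_counts, steps, seed=None):
    """Simulate a classic Polya urn (illustrative sketch only).

    initial_counts: starting number of balls of each colour
                    (read here as copies per sequence family).
    steps:          number of draws; each draw adds one ball of the
                    drawn colour (read here as one duplication event).
    """
    rng = random.Random(seed)
    counts = list(initial_counts)
    colours = range(len(counts))
    for _ in range(steps):
        # Drawing a ball uniformly at random is the same as picking a
        # colour with probability proportional to its current count.
        chosen = rng.choices(colours, weights=counts, k=1)[0]
        counts[chosen] += 1  # return the ball plus one duplicate
    return counts


# Hypothetical parameters: 20 families, one copy each, 2000 duplications.
family_sizes = polya_urn([1] * 20, steps=2000, seed=0)
print(sorted(family_sizes, reverse=True))
```

Sorting the final counts typically shows a few large families alongside many small ones, which is the qualitative "rich get richer" behaviour that duplication-based explanations of hub-dominated networks rely on. Whether the resulting distribution matches any particular empirical law depends on modelling details that this sketch does not attempt to reproduce.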

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Zhou, Y., Mishra, B. (2004). Models of Genome Evolution. In: Ciobanu, G., Rozenberg, G. (eds) Modelling in Molecular Biology. Natural Computing Series. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18734-6_13

  • DOI: https://doi.org/10.1007/978-3-642-18734-6_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-62269-4

  • Online ISBN: 978-3-642-18734-6

  • eBook Packages: Springer Book Archive
