Evolution of Genome Organization by Duplication and Loss: An Alignment Approach

  • Patrick Holloway
  • Krister Swenson
  • David Ardell
  • Nadia El-Mabrouk
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7262)


We present a comparative genomics approach for inferring ancestral genome organization and evolutionary scenarios, based on a model that accounts for content-modifying operations. More precisely, we focus on comparing two ordered gene sequences with duplicated genes that have evolved from a common ancestor through duplications and losses; our model belongs to the class of “Block Edit” models. From a combinatorial point of view, the main consequence is that the problem can be formulated as an alignment problem. Moreover, in contrast to symmetrical metrics such as the inversion distance, duplications and losses are asymmetrical operations, each applicable to only one of the two aligned sequences. Consequently, an ancestral genome can be inferred directly from a duplication-loss scenario attached to a given alignment. Although alignments are a priori simpler to handle than rearrangements, we show that a direct approach based on dynamic programming leads, at best, to an efficient heuristic. We present an exact pseudo-boolean linear programming algorithm that searches for the optimal alignment along with an optimal scenario of duplications and losses. Although the algorithm is exponential in the worst case, we show low running times on real datasets as well as on synthetic data. We apply our algorithm in a phylogenetic context to the evolution of stable RNA (tRNA and rRNA) gene content and organization in Bacillus genomes. Our results lead to various biological insights, such as rates of ribosomal RNA proliferation among lineages, their role in altering tRNA gene content, and evidence of tRNA class conversion.
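To make the link between an alignment and an ancestor concrete, the following is a minimal sketch under strong simplifying assumptions; it is not the pseudo-boolean linear program described above. It ignores duplications entirely and treats every unaligned gene as a single-gene loss with unit cost, so the ancestor is simply read off the alignment columns. The function names `align_loss_only` and `ancestor_from_alignment` are hypothetical and chosen only for this illustration.

```python
def align_loss_only(x, y):
    """Align two gene orders using only matches (cost 0) and gaps (cost 1).

    Simplifying assumption: each gap is read as one single-gene loss;
    duplications are NOT modelled here.
    """
    n, m = len(x), len(y)
    # cost[i][j] = minimum number of gaps needed to align x[:i] with y[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(cost[i - 1][j], cost[i][j - 1]) + 1
            if x[i - 1] == y[j - 1]:
                best = min(best, cost[i - 1][j - 1])
            cost[i][j] = best

    # Trace back one optimal alignment as a list of (gene_in_x, gene_in_y)
    # columns, with None marking a gap.
    columns = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and x[i - 1] == y[j - 1]
                and cost[i][j] == cost[i - 1][j - 1]):
            columns.append((x[i - 1], y[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            columns.append((x[i - 1], None))
            i -= 1
        else:
            columns.append((None, y[j - 1]))
            j -= 1
    columns.reverse()
    return cost[n][m], columns


def ancestor_from_alignment(columns):
    """Read an ancestor off the alignment, treating every gap as a loss.

    A gene present in only one genome is assumed to have been present in
    the ancestor and lost in the other genome.
    """
    return [a if a is not None else b for a, b in columns]


# Toy gene orders; letters stand for gene families.
losses, columns = align_loss_only(list("abcab"), list("acb"))
print(losses, "".join(ancestor_from_alignment(columns)))
```

Under this loss-only reading the inferred ancestor contains every column of the alignment. The model in the abstract is richer: a gene that is unaligned in one sequence may instead be explained as a duplicate created in that sequence, which is why choosing the alignment and the scenario jointly is harder than this sketch suggests.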


Keywords: Comparative Genomics, Gene order, Duplication, Loss, Linear Programming, Alignment, Bacillus, tRNA





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Patrick Holloway (1)
  • Krister Swenson (2, 3)
  • David Ardell (4)
  • Nadia El-Mabrouk (2)

  1. Département d’Informatique et de Recherche Opérationnelle (DIRO), Université de Montréal, Canada
  2. DIRO, Canada
  3. University of McGill, Computer Science, Canada
  4. Center for Computational Biology, School of Natural Sciences, University of California, Merced, USA
