Reversing Gene Erosion – Reconstructing Ancestral Bacterial Genomes from Gene-Content and Order Data

  • Joel V. Earnest-DeYoung
  • Emmanuelle Lerat
  • Bernard M. E. Moret
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3240)


In the last few years, it has become routine to use gene-order data to reconstruct phylogenies on small, duplication-free genomes, both in terms of edge distances (parsimonious sequences of operations that transform one endpoint of an edge into the other) and in terms of genomes at internal nodes. Current gene-order methods break down, though, when the genomes contain more than a few hundred genes, possess high copy numbers of duplicated genes, or give rise to tree edges longer than one hundred operations. We have constructed a series of heuristics that allow us to overcome these obstacles and reconstruct edge distances and genomes at internal nodes for groups of larger, more complex genomes. We present results from the analysis of a group of thirteen modern γ-proteobacteria, as well as from simulated datasets.


Keywords: Edge Length, Gene Content, Internal Node, Xanthomonas campestris, Yersinia pestis





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Joel V. Earnest-DeYoung (1)
  • Emmanuelle Lerat (2)
  • Bernard M. E. Moret (1)
  1. Dept. of Computer Science, Univ. of New Mexico, Albuquerque, USA
  2. Dept. of Ecology and Evolutionary Biology, Univ. of Arizona, Tucson, USA
