Linear Programming for Phylogenetic Reconstruction Based on Gene Rearrangements

  • Conference paper
Combinatorial Pattern Matching (CPM 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3537)

Abstract

Phylogenetic reconstruction from gene rearrangements has attracted increasing attention from biologists and computer scientists over the last few years. Methods used in reconstruction include distance-based methods, parsimony methods using sequence-based encodings, and direct optimization. The last of these, pioneered by Sankoff and extended by us with the software suite GRAPPA, is the most accurate approach, but has been limited to small genomes because the running time of its scoring algorithm grows exponentially with the number of genes in the genome. We report here on a new method to compute a tight lower bound on the score of a given tree, using a set of linear constraints generated through selective applications of the triangle inequality (in the spirit of GESTALT). Our method generates an integer linear program with a carefully limited number of constraints, rapidly solves its relaxed version, and uses the result to provide a tight lower bound. Since this bound is very close to the optimal tree score, it can be used directly as a selection criterion, thereby enabling us to bypass the expensive scoring procedure entirely. We have implemented this method within our GRAPPA software and run several series of experiments on both biological and simulated datasets to assess its accuracy. Our results show that using the bound as a selection criterion yields excellent trees, with error rates below 5% up to very large evolutionary distances, consistently beating the baseline Neighbor-Joining. Our new method enables us to extend the range of applicability of the direct optimization method to chromosomes of size comparable to those of bacteria, as well as to datasets with complex combinations of evolutionary events.
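The following is a minimal sketch of how a triangle-inequality bound of this kind can arise; it is not the GRAPPA implementation, and the abstract does not specify which constraints the method selects. It assumes one nonnegative length variable per tree edge and, for every pair of leaf genomes, one constraint saying that the edge lengths along the tree path between them sum to at least their pairwise distance (which follows from repeated use of the triangle inequality). Minimizing the total edge length under these constraints gives a linear program whose optimum is a lower bound on the tree score. To stay free of solver dependencies, the sketch does not call an LP solver: it checks a hand-picked dual solution and a hand-picked primal solution, and equal values certify the LP optimum. The tree, the distances, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical unrooted tree: leaves A, B, C, D and internal nodes u, v.
TREE = {
    "A": ["u"], "B": ["u"], "C": ["v"], "D": ["v"],
    "u": ["A", "B", "v"], "v": ["C", "D", "u"],
}
LEAVES = ["A", "B", "C", "D"]

# Made-up pairwise rearrangement distances between the leaf genomes.
DIST = {
    ("A", "B"): 4, ("A", "C"): 7, ("A", "D"): 8,
    ("B", "C"): 6, ("B", "D"): 7, ("C", "D"): 5,
}


def path_edges(tree, src, dst):
    """Return the set of edges on the unique tree path from src to dst."""
    stack = [(src, None, [])]
    while stack:
        node, parent, edges = stack.pop()
        if node == dst:
            return {frozenset(e) for e in edges}
        for nxt in tree[node]:
            if nxt != parent:
                stack.append((nxt, node, edges + [(node, nxt)]))
    raise ValueError("no path between %r and %r" % (src, dst))


def constraints(tree, leaves):
    """One constraint per leaf pair: lengths on its path sum to >= distance."""
    return {(i, j): path_edges(tree, i, j) for i, j in combinations(leaves, 2)}


def dual_bound(cons, dist, weights):
    """Check dual feasibility (load on each edge <= 1) and return the bound."""
    load = {}
    for pair, w in weights.items():
        for edge in cons[pair]:
            load[edge] = load.get(edge, 0.0) + w
    if any(total > 1.0 + 1e-9 for total in load.values()):
        raise ValueError("weights are not dual feasible")
    return sum(w * dist[pair] for pair, w in weights.items())


def primal_value(cons, dist, length):
    """Check every path constraint and return the total edge length."""
    for pair, edges in cons.items():
        if sum(length[e] for e in edges) < dist[pair] - 1e-9:
            raise ValueError("constraint violated for %r" % (pair,))
    return sum(length.values())


cons = constraints(TREE, LEAVES)

# Dual solution: weight 1/2 on pairs adjacent in the circular order A, B, C, D.
weights = {("A", "B"): 0.5, ("B", "C"): 0.5, ("C", "D"): 0.5, ("A", "D"): 0.5}
bound = dual_bound(cons, DIST, weights)

# Primal solution: explicit edge lengths satisfying all six path constraints.
length = {
    frozenset(("A", "u")): 2.5, frozenset(("B", "u")): 1.5,
    frozenset(("C", "v")): 2.0, frozenset(("D", "v")): 3.0,
    frozenset(("u", "v")): 2.5,
}
total = primal_value(cons, DIST, length)

print(bound, total)  # both 11.5, so 11.5 is the LP optimum for this toy instance
```

Because tree scores are integers when the pairwise distances are, the fractional value can be rounded up, giving a bound of 12 for this toy instance. In practice an LP solver would replace the two hand-picked solutions, and a method of the kind described in the abstract would restrict the constraint set rather than enumerate every leaf pair.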

References

  1. Blanchette, M., Bourque, G., Sankoff, D.: Breakpoint phylogenies. In: Miyano, S., Takagi, T. (eds.) Genome Informatics 1997, pp. 25–34. Univ. Academy Press, Tokyo (1997)

  2. Berkelaar, M., Eikland, K., Notebaert, P.: lp_solve. Available at http://www.geocities.com/lpsolve/

  3. Bourque, G., Pevzner, P.: Genome-scale evolution: reconstructing gene orders in the ancestral species. Genome Research 12, 26–36 (2002)

  4. Caprara, A.: Formulations and hardness of multiple sorting by reversals. In: Proc. 3rd Ann. Int’l. Conf. Comput. Mol. Biol. (RECOMB 1999), pp. 84–93. ACM Press, New York (1999)

  5. Caprara, A.: On the practical solution of the reversal median problem. In: Gascuel, O., Moret, B.M.E. (eds.) WABI 2001. LNCS, vol. 2149, pp. 238–251. Springer, Heidelberg (2001)

  6. Cosner, M.E., Jansen, R.K., Palmer, J.D., Downie, S.R.: The highly rearranged chloroplast genome of Trachelium caeruleum (Campanulaceae): multiple inversions, inverted repeat expansion and contraction, transposition, insertions/deletions, and several repeat families. Curr. Genet. 31, 419–429 (1997)

  7. Cosner, M.E., Jansen, R.K., Moret, B.M.E., Raubeson, L.A., Wang, L., Warnow, T., Wyman, S.K.: A new fast heuristic for computing the breakpoint phylogeny and experimental phylogenetic analyses of real and synthetic data. In: Proc. 8th Int’l. Conf. on Intelligent Systems for Mol. Biol. (ISMB 2000), pp. 104–115 (2000)

  8. Downie, S.R., Palmer, J.D.: Use of chloroplast DNA rearrangements in reconstructing plant phylogeny. In: Soltis, P., Soltis, D., Doyle, J.J. (eds.) Plant Molecular Systematics, pp. 14–35. Chapman and Hall, Boca Raton (1992)

  9. Earnest-DeYoung, J., Lerat, E., Moret, B.M.E.: Reversing gene erosion: reconstructing ancestral bacterial genomes from gene-content and gene-order data. In: Jonassen, I., Kim, J. (eds.) WABI 2004. LNCS (LNBI), vol. 3240, pp. 1–13. Springer, Heidelberg (2004)

  10. El-Mabrouk, N.: Genome rearrangement by reversals and insertions/deletions of contiguous segments. In: Giancarlo, R., Sankoff, D. (eds.) CPM 2000. LNCS, vol. 1848, pp. 222–234. Springer, Heidelberg (2000)

  11. Eppstein, D.: Finding the k shortest paths. SIAM J. on Computing 28(2), 652–673 (1998)

  12. Lancia, G., Ravi, R.: GESTALT: GEnomic STeiner ALignmenTs. In: Crochemore, M., Paterson, M. (eds.) CPM 1999. LNCS, vol. 1645, pp. 101–114. Springer, Heidelberg (1999)

  13. Moret, B.M.E., Tang, J., Wang, L.-S., Warnow, T.: Steps toward accurate reconstructions of phylogenies from gene-order data. J. Comput. Syst. Sci. 65(3), 508–525 (2002)

  14. Moret, B.M.E., Tang, J., Warnow, T.: Reconstructing phylogenies from gene-content and gene-order data. In: Gascuel, O. (ed.) Mathematics of Evolution and Phylogeny, pp. 321–352. Oxford University Press, Oxford (2005)

  15. Moret, B.M.E., Wyman, S.K., Bader, D.A., Warnow, T., Yan, M.: A new implementation and detailed study of breakpoint analysis. In: Proc. 6th Pacific Symp. on Biocomputing (PSB 2001), pp. 583–594. World Scientific Pub., Singapore (2001)

  16. Palmer, J.D.: Chloroplast and mitochondrial genome evolution in land plants. In: Herrmann, R. (ed.) Cell Organelles, pp. 99–133. Springer, Heidelberg (1992)

  17. Pe’er, I., Shamir, R.: The median problems for breakpoints are NP-complete. Elec. Colloq. on Comput. Complexity 71 (1998)

  18. Raubeson, L.A., Jansen, R.K.: Chloroplast DNA evidence on the ancient evolutionary split in vascular land plants. Science 255, 1697–1699 (1992)

  19. Robinson, D.R., Foulds, L.R.: Comparison of phylogenetic trees. Mathematical Biosciences 53, 131–147 (1981)

  20. Saitou, N., Nei, M.: The neighbor-joining method: A new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425 (1987)

  21. Sankoff, D., Blanchette, M.: Multiple genome rearrangement and breakpoint phylogeny. J. Comput. Biol. 5, 555–570 (1998)

  22. Sankoff, D., Nadeau, J. (eds.): Comparative Genomics. Kluwer Academic Pubs., Dordrecht (2000)

  23. Siepel, A.C.: Exact algorithms for the reversal median problem. Master’s thesis, U. New Mexico, Albuquerque, NM (2001). Available at http://www.cs.unm.edu/~acs/thesis.html

  24. Siepel, A.C., Moret, B.M.E.: Finding an optimal inversion median: experimental results. In: Gascuel, O., Moret, B.M.E. (eds.) WABI 2001. LNCS, vol. 2149, pp. 189–203. Springer, Heidelberg (2001)

  25. Swenson, K.M., Marron, M., Earnest-DeYoung, J.V., Moret, B.M.E.: Approximating the true evolutionary distance between two genomes. In: Proc. 7th Workshop on Alg. Engineering & Experiments (ALENEX 2005), Vancouver. SIAM Press, Philadelphia (2005)

  26. Swofford, D.L., Olson, G., Waddell, P., Hillis, D.M.: Phylogenetic inference. In: Hillis, D.M., Moritz, C., Mable, B. (eds.) Molecular Systematics, 2nd edn., ch. 11. Sinauer Associates (1996)

  27. Tang, J., Moret, B.M.E.: Scaling up accurate phylogenetic reconstruction from gene-order data. In: Proc. 11th Int’l Conf. on Intelligent Systems for Mol. Biol. (ISMB’03). Bioinformatics, vol. 19, pp. i305–i312. Oxford U. Press, Oxford (2003)

  28. Tang, J., Moret, B.M.E., Cui, L., de Pamphilis, C.W.: Phylogenetic reconstruction from arbitrary gene-order data. In: Proc. 4th IEEE Symp. on Bioinformatics and Bioengineering (BIBE 2004), pp. 592–599. IEEE Press, Piscataway (2004)

  29. Wang, L.-S., Jansen, R.K., Moret, B.M.E., Raubeson, L.A., Warnow, T.: Fast phylogenetic methods for genome rearrangement evolution: An empirical study. In: Proc. 7th Pacific Symp. on Biocomputing (PSB 2002), pp. 524–535. World Scientific Pub., Singapore (2002)

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tang, J., Moret, B.M.E. (2005). Linear Programming for Phylogenetic Reconstruction Based on Gene Rearrangements. In: Apostolico, A., Crochemore, M., Park, K. (eds) Combinatorial Pattern Matching. CPM 2005. Lecture Notes in Computer Science, vol 3537. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11496656_35

  • DOI: https://doi.org/10.1007/11496656_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26201-5

  • Online ISBN: 978-3-540-31562-9

  • eBook Packages: Computer Science (R0)
