Sequence-Length Requirements for Phylogenetic Methods

  • Conference paper
Algorithms in Bioinformatics (WABI 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2452)

Abstract

We study the sequence lengths required by neighbor-joining, greedy parsimony, and a phylogenetic reconstruction method (DCM-NJ+MP) based on disk-covering and the maximum parsimony criterion. We use extensive simulations based on random birth-death trees, with controlled deviations from ultrametricity, to collect data on how the sequence-length requirements of each of the three methods scale with the number of taxa, the rate of evolution on the tree, and the deviation from ultrametricity. Our experiments show that DCM-NJ+MP has consistently lower sequence-length requirements than the other two methods when trees of high topological accuracy are desired, although all methods require much longer sequences as the deviation from ultrametricity or the height of the tree grows. Our study has significant implications for large-scale phylogenetic reconstruction (where sequence-length requirements are a crucial factor) and for future performance analyses in phylogenetics (since deviations from ultrametricity are proving pivotal).

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Moret, B.M., Roshan, U., Warnow, T. (2002). Sequence-Length Requirements for Phylogenetic Methods. In: Guigó, R., Gusfield, D. (eds) Algorithms in Bioinformatics. WABI 2002. Lecture Notes in Computer Science, vol 2452. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45784-4_26

  • DOI: https://doi.org/10.1007/3-540-45784-4_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44211-0

  • Online ISBN: 978-3-540-45784-8

  • eBook Packages: Springer Book Archive
