Parsimony is hard to beat!

  • Session 4: Computational Biology I
  • Conference paper
Computing and Combinatorics (COCOON 1997)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1276)

Abstract

The estimation of evolutionary history from biomolecular sequences is a major intellectual project in systematic biology, and many methods are used to reconstruct phylogenetic (i.e., evolutionary) trees from sequence data. In this paper, we report on an extensive performance analysis of parsimony and two distance-based methods: a popular method called neighbor joining, and a new method developed by Agarwala et al. which approximates the L∞-nearest tree. The analysis covers more than 260,000 sequence data sets simulated on approximately 500 model trees. Our experiments indicate a decrease in statistical power of the two distance methods as the diameter grows, but also show that parsimony is not as badly affected by the diameter as the distance methods. More generally, the experiments indicate that parsimony is almost always more accurate than the other two methods on reasonable length sequences even under adverse conditions, such as having sites that evolve quickly within the tree, pairs of taxa with large evolutionary distances between them, or large ratios between the highest and the lowest substitution rates on the edges.

References

  1. R. Agarwala, V. Bafna, M. Farach, B. Narayanan, M. Paterson, and M. Thorup. On the approximability of numerical taxonomy: fitting distances by tree metrics. Proceedings of the 7th Annual SODA, 1996.

  2. D. J. Aldous, Probability distributions on cladograms, in: Discrete Random Structures, eds. D. J. Aldous and R. Pemantle, Springer-Verlag, IMA Vol. in Mathematics and its Applications. Vol. 76, 1–18, 1995.

  3. K. Atteson. The performance of the Neighbor-Joining Method of Phylogeny Reconstruction, Proceedings, COCOON 1997 (in this volume).

  4. J. Cavender, Taxonomy with confidence, Mathematical Biosciences, 40:271–280, 1978.

  5. J. Cohen and M. Farach, Numerical Taxonomy on Data: Experimental Results. SODA '97 and RECOMB '97.

  6. W.H.E. Day, Computational complexity of inferring phylogenies from dissimilarity matrices. Bull. of Math. Biol. 49(4):461–467, 1987.

  7. W.H.E. Day and D.S. Johnson, The computational complexity of inferring rooted phylogenies by parsimony. Mathematical Biosciences, 81:33–42, 1986.

  8. P. L. Erdős, M. A. Steel, L. A. Székely, and T. J. Warnow, Constructing big trees from short sequences. To appear, Proceedings ICALP 1997.

  9. M. Farach and S. Kannan, Efficient algorithms for inverting evolution, Proceedings of the ACM Symposium on the Foundations of Computer Science, 230–236, (1996).

  10. M. Farach, S. Kannan, and T. Warnow. A robust model for finding optimal evolutionary trees. Algorithmica (1995) 13: 155–179.

  11. J. Felsenstein, Cases in which parsimony or compatibility methods will be positively misleading, Syst. Zool. 27 (1978), 401–410.

  12. J. Felsenstein, PHYLIP - Phylogeny Inference Package (Version 3.2), Cladistics, 5: 164–166, 1989.

  13. L. R. Foulds, R. L. Graham, The Steiner problem in phylogeny is NP-complete, Adv. Appl. Math. 3 (1982), 43–49.

  14. O. Gascuel, BIONJ. An improved version of the NJ algorithm based on a simple model of sequence data. GERAD Technical Report, G-97-18.

  15. D. Hillis, Inferring complex phylogenies, Nature Vol 383 12 September, 1996, 130–131.

  16. D. Hillis, J. Huelsenbeck, and D. Swofford, Hobgoblin of phylogenetics? Nature, Vol. 369, 1994, pp. 363–364.

  17. J. Huelsenbeck. Performance of phylogenetic methods in simulation. Syst. Biol. 44(1):17–48, 1995.

  18. J. P. Huelsenbeck and D. M. Hillis, Success of Phylogenetic Methods in the Four-Taxon Case, Syst. Biol., 42(3):247–264, 1993.

  19. J. Kim, General inconsistency conditions for maximum parsimony: effects of branch length and increasing number of taxa, Syst. Biol. 45(3) (1996), 363–374.

  20. K. Rice, M. Donoghue, and R. Olmstead, Analyzing large data sets: rbcL revisited, Systematic Biology, to appear, 1997.

  21. K. Rice. See the web page http://www.cis.upenn.edu/rice/progs/ecat.

  22. K. Rice, M. Steel, T. Warnow, and S. Yooseph. Getting better topology estimates of difficult evolutionary trees, manuscript.

  23. N. Saitou, M. Nei, The neighbor-joining method: a new method for reconstructing phylogenetic trees, Mol. Biol. Evol. 4 (1987), 406–425.

  24. D. L. Swofford, PAUP: Phylogenetic analysis using parsimony, version 3.0s. Illinois Natural History Survey, Champaign. 1992.

  25. A. Templeton, Human origins and analysis of mitochondrial DNA sequences. Science, Vol. 255, 737–739, 1992.

  26. Allan C. Wilson, Rebecca L. Cann, The recent African genesis of humans, Scientific American April 1992, 68–73.

Editor information

Tao Jiang, D. T. Lee

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rice, K., Warnow, T. (1997). Parsimony is hard to beat!. In: Jiang, T., Lee, D.T. (eds) Computing and Combinatorics. COCOON 1997. Lecture Notes in Computer Science, vol 1276. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0045079

  • DOI: https://doi.org/10.1007/BFb0045079

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63357-0

  • Online ISBN: 978-3-540-69522-6

  • eBook Packages: Springer Book Archive
