Abstract
The estimation of evolutionary history from biomolecular sequences is a major intellectual project in systematic biology, and many methods are used to reconstruct phylogenetic (i.e., evolutionary) trees from sequence data. In this paper, we report on an extensive performance analysis of parsimony and two distance-based methods: a popular method called neighbor joining, and a new method developed by Agarwala et al. that approximates the L∞-nearest tree. The analysis covers more than 260,000 sequence data sets simulated on approximately 500 model trees. Our experiments indicate a decrease in statistical power of the two distance methods as the diameter of the tree grows, but also show that parsimony is not as badly affected by the diameter as the distance methods are. More generally, the experiments indicate that parsimony is almost always more accurate than the other two methods on sequences of reasonable length, even under adverse conditions such as sites that evolve quickly within the tree, pairs of taxa with large evolutionary distances between them, or large ratios between the highest and the lowest substitution rates on the edges.
References
R. Agarwala, V. Bafna, M. Farach, B. Narayanan, M. Paterson, and M. Thorup. On the approximability of numerical taxonomy: fitting distances by tree metrics. Proceedings of the 7th Annual SODA, 1996.
D. J. Aldous, Probability distributions on cladograms, in: Discrete Random Structures, eds. D. J. Aldous and R. Pemantle, Springer-Verlag, IMA Vol. in Mathematics and its Applications. Vol. 76, 1–18, 1995.
K. Atteson. The performance of the Neighbor-Joining Method of Phylogeny Reconstruction, Proceedings, COCOON 1997 (in this volume).
J. Cavender, Taxonomy with confidence, Mathematical Biosciences, 40:271–280, 1978.
J. Cohen and M. Farach, Numerical Taxonomy on Data: Experimental Results. SODA '97 and RECOMB '97.
W.H.E. Day, Computational complexity of inferring phylogenies from dissimilarity matrices. Bull. of Math. Biol. 49(4):461–467, 1987.
W.H.E. Day and D.S. Johnson, The computational complexity of inferring rooted phylogenies by parsimony. Mathematical Biosciences, 81:33–42, 1986.
P. L. Erdős, M. A. Steel, L. A. Székely, and T. J. Warnow, Constructing big trees from short sequences. To appear, Proceedings ICALP 1997.
M. Farach and S. Kannan, Efficient algorithms for inverting evolution, Proceedings of the ACM Symposium on the Foundations of Computer Science, 230–236, (1996).
M. Farach, S. Kannan, and T. Warnow. A robust model for finding optimal evolutionary trees. Algorithmica (1995) 13: 155–179.
J. Felsenstein, Cases in which parsimony or compatibility methods will be positively misleading, Syst. Zool. 27 (1978), 401–410.
J. Felsenstein, PHYLIP – Phylogeny Inference Package (Version 3.2), Cladistics, 5: 164–166, 1989.
L. R. Foulds, R. L. Graham, The Steiner problem in phylogeny is NP-complete, Adv. Appl. Math. 3 (1982), 43–49.
O. Gascuel, BIONJ. An improved version of the NJ algorithm based on a simple model of sequence data. GERAD Technical Report, G-97-18.
D. Hillis, Inferring complex phylogenies, Nature, Vol. 383, 12 September 1996, 130–131.
D. Hillis, J. Huelsenbeck, and D. Swofford, Hobgoblin of phylogenetics? Nature, Vol. 369, 1994, pp. 363–364.
J. Huelsenbeck. Performance of phylogenetic methods in simulation. Syst. Biol. 44(1):17–48, 1995.
J. P. Huelsenbeck and D. M. Hillis, Success of Phylogenetic Methods in the Four-taxon Case, Syst. Biol., 42(3):247–264, 1993.
J. Kim, General inconsistency conditions for maximum parsimony: effects of branch length and increasing number of taxa, Syst. Biol. 45(3) (1996), 363–374.
K. Rice, M. Donoghue, and R. Olmstead, Analyzing large data sets: rbcL revisited, Systematic Biology, to appear, 1997.
K. Rice. See the web page http://www.cis.upenn.edu/rice/progs/ecat.
K. Rice, M. Steel, T. Warnow, and S. Yooseph. Getting better topology estimates of difficult evolutionary trees, manuscript.
N. Saitou, M. Nei, The neighbor-joining method: a new method for reconstructing phylogenetic trees, Mol. Biol. Evol. 4 (1987), 406–425.
D. L. Swofford, PAUP: Phylogenetic analysis using parsimony, version 3.0s. Illinois Natural History Survey, Champaign. 1992.
A. Templeton, Human origins and analysis of mitochondrial DNA sequences. Science, Vol. 255, 737–739, 1992.
Allan C. Wilson, Rebecca L. Cann, The recent African genesis of humans, Scientific American, April 1992, 68–73.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rice, K., Warnow, T. (1997). Parsimony is hard to beat!. In: Jiang, T., Lee, D.T. (eds) Computing and Combinatorics. COCOON 1997. Lecture Notes in Computer Science, vol 1276. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0045079
DOI: https://doi.org/10.1007/BFb0045079
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63357-0
Online ISBN: 978-3-540-69522-6
eBook Packages: Springer Book Archive