Abstract
Inferring large phylogenetic trees under elaborate statistical models is computationally very expensive. Progress therefore comes primarily from algorithmic innovation rather than from brute-force allocation of all available computational resources. We present simple heuristics that yield accurate trees for synthetic (simulated) as well as real data and reduce execution time compared to the currently fastest programs. The new heuristics are implemented in a sequential program (RAxML), which is available as open source code. Furthermore, we present a non-deterministic parallel version of our algorithm which, in some cases, yielded super-linear speedups for computations with 1000 organisms. We compare the performance of sequential RAxML with the currently fastest and most accurate programs for statistical phylogenetic tree inference, using 50 synthetic alignments and 9 real-world alignments comprising up to 1000 sequences. On real-world data, RAxML outperforms those programs in terms of both speed and final likelihood values.
This work is sponsored under the project ID ParBaum, within the framework of the “Competence Network for Technical, Scientific High Performance Computing in Bavaria” (KONWIHR).
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Stamatakis, A.P., Ludwig, T., Meier, H. (2005). ParBaum: A Fast Program for Phylogenetic Tree Inference with Maximum Likelihood. In: Bode, A., Durst, F. (eds) High Performance Computing in Science and Engineering, Garching 2004. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-28555-5_24
DOI: https://doi.org/10.1007/3-540-28555-5_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26145-2
Online ISBN: 978-3-540-28555-7
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)