Abstract
Because of its immense computational requirements, phylogenetic inference is considered a grand challenge in bioinformatics. Multi-gene alignments are increasingly popular in biological studies because they typically provide a stable topological signal, owing to a more favorable ratio of base pairs to sequences. Together with the rapid accumulation of sequence data in general, this trend poses new challenges for high-performance computing. In this paper, we present a parallelization strategy for RAxML, which is currently among the fastest and most accurate programs for phylogenetic inference under the maximum likelihood (ML) criterion. We simultaneously exploit the coarse-grained and fine-grained parallelism inherent in every ML-based biological analysis. Our experimental results indicate that our approach scales very well on supercomputer architectures such as the IBM BlueGene/L and SGI Altix, as well as on common Linux clusters with high-speed interconnects.
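The two levels of parallelism named in the abstract can be illustrated with a toy sketch. The abstract does not say how the levels are mapped onto the hardware, so the decomposition here is an assumption: independent replicates (for example, bootstrap weight vectors) form the coarse-grained tasks, and blocks of alignment columns form the fine-grained tasks within one likelihood evaluation. All function names are hypothetical and not part of RAxML, and the per-site likelihoods are taken as given rather than computed from a tree.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def chunk_log_likelihood(site_likelihoods, weights):
    """Partial log-likelihood over one block of alignment columns."""
    return sum(w * math.log(l) for l, w in zip(site_likelihoods, weights))


def tree_log_likelihood(site_likelihoods, weights, n_chunks=4):
    """Fine-grained level (assumed): split columns into blocks, then reduce."""
    n = len(site_likelihoods)
    step = max(1, math.ceil(n / n_chunks))
    blocks = [
        (site_likelihoods[i:i + step], weights[i:i + step])
        for i in range(0, n, step)
    ]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = pool.map(lambda b: chunk_log_likelihood(*b), blocks)
        return sum(partials)


def score_replicates(site_likelihoods, replicate_weights, n_workers=2):
    """Coarse-grained level (assumed): each replicate is an independent task."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(
            pool.map(
                lambda w: tree_log_likelihood(site_likelihoods, w),
                replicate_weights,
            )
        )


if __name__ == "__main__":
    # Toy per-site likelihoods; real values would come from a tree and a model.
    sites = [0.5, 0.25, 0.125, 0.5, 0.25]
    replicates = [[1, 1, 1, 1, 1], [2, 0, 1, 1, 1]]
    print(score_replicates(sites, replicates))
```

Python threads are used only to show the structure of the decomposition; they are not meant to reproduce the performance behavior reported for the actual implementation.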
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ott, M., Zola, J., Aluru, S., Stamatakis, A. (2009). ParBaum: Large-Scale Maximum Likelihood-Based Phylogenetic Analyses. In: Wagner, S., Steinmetz, M., Bode, A., Brehm, M. (eds) High Performance Computing in Science and Engineering, Garching/Munich 2007. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69182-2_9
DOI: https://doi.org/10.1007/978-3-540-69182-2_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69181-5
Online ISBN: 978-3-540-69182-2
eBook Packages: Mathematics and Statistics (R0)