
Abstract

Owing to its immense computational requirements, phylogenetic inference is considered a grand challenge in bioinformatics. The increasing popularity of multi-gene alignments in biological studies, which typically provide a stable topological signal because of a more favorable ratio of base pairs to sequences, coupled with the rapid accumulation of sequence data in general, poses new challenges for high-performance computing. In this paper, we present a parallelization strategy for RAxML, which is currently among the fastest and most accurate programs for phylogenetic inference under the maximum likelihood (ML) criterion. We simultaneously exploit the coarse-grained and fine-grained parallelism that is inherent in every ML-based biological analysis. Our experimental results indicate that our approach scales very well on supercomputer architectures such as the IBM BlueGene/L and the SGI Altix, as well as on common Linux clusters with high-speed interconnects.
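The two levels of parallelism named in the abstract can be illustrated with a minimal sketch. It assumes that the coarse-grained level corresponds to independent units of work (here, bootstrap replicates) and the fine-grained level to the per-column likelihood computation over the alignment; the abstract itself does not spell out this mapping. The two-sequence Jukes-Cantor toy model and all function names below are illustrative assumptions and are not taken from RAxML.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def site_log_likelihood(a, b, t):
    """Log-likelihood of one alignment column for two sequences under the
    Jukes-Cantor model at distance t (toy stand-in for a real ML kernel)."""
    decay = math.exp(-4.0 * t / 3.0)
    p = 0.25 + 0.75 * decay if a == b else 0.25 - 0.25 * decay
    return math.log(0.25 * p)  # 0.25 = equilibrium base frequency


def chunk_log_likelihood(columns, t):
    """Partial log-likelihood over a subset of alignment columns."""
    return sum(site_log_likelihood(a, b, t) for a, b in columns)


def fine_grained_log_likelihood(columns, t, workers=4):
    """Fine-grained level: distribute columns cyclically across workers,
    then reduce the partial sums into one log-likelihood."""
    chunks = [columns[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = pool.map(chunk_log_likelihood, chunks, [t] * workers)
        return sum(partial_sums)


def bootstrap_replicate(columns, seed):
    """Resample alignment columns with replacement (one independent job)."""
    rng = random.Random(seed)
    return [rng.choice(columns) for _ in columns]


def coarse_grained_analysis(columns, t, replicates=3):
    """Coarse-grained level: evaluate independent replicates concurrently;
    each replicate uses the fine-grained routine internally."""
    def run(seed):
        return fine_grained_log_likelihood(bootstrap_replicate(columns, seed), t)

    with ThreadPoolExecutor(max_workers=replicates) as pool:
        return list(pool.map(run, range(replicates)))


if __name__ == "__main__":
    alignment = list(zip("ACGTACGTAC", "ACGTTCGAAC"))  # hypothetical data
    print(fine_grained_log_likelihood(alignment, t=0.1))
    print(coarse_grained_analysis(alignment, t=0.1))
```

Threads are used here only to keep the sketch short and self-contained; in CPython they show the decomposition rather than deliver a speedup. On the machines named in the abstract, the outer level would more plausibly map to separate processes and the inner level to threads or processes sharing one alignment, but that mapping is likewise an assumption.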

Corresponding author

Correspondence to Michael Ott.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ott, M., Zola, J., Aluru, S., Stamatakis, A. (2009). ParBaum: Large-Scale Maximum Likelihood-Based Phylogenetic Analyses. In: Wagner, S., Steinmetz, M., Bode, A., Brehm, M. (eds) High Performance Computing in Science and Engineering, Garching/Munich 2007. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69182-2_9
