Abstract
MrBayes is a widely used software package for Bayesian phylogenetic inference: given biological sequence data from various taxonomic groups, MrBayes returns its estimate of the phylogenetic tree that gave rise to those taxa. This paper presents ta(MC)\(^{3}\), a successor to a(MC)\(^{3}\), which, for protein datasets, improves computational efficiency and overcomes major obstacles to analyzing larger datasets on HPC systems with multiple Graphics Processing Units (GPUs). The major improvements are (a) a new task mapping strategy, (b) the use of Kahan summation to resolve non-convergence issues, and (c) the introduction of 64-bit variables. We evaluate ta(MC)\(^{3}\) on real-world protein datasets on both a desktop server and the Tianhe-1A supercomputer. With a single GPU, ta(MC)\(^{3}\) is nearly 90 times faster than the serial version of MrBayes, up to around 9 times faster than MrBayes utilizing a GPU via the BEAGLE library, and up to 2.5 times faster than a(MC)\(^{3}\). On larger datasets with 64 nodes (GPUs) on Tianhe-1A, ta(MC)\(^{3}\) achieves a speedup of more than \(1000\times\) over serial MrBayes.
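Improvement (b) refers to Kahan's compensated summation (Kahan 1965 in the references). The abstract does not say where in the likelihood computation it is applied, so the following is only a minimal Python sketch of the technique itself, not the ta(MC)\(^{3}\) implementation. The function names and the test data (one million copies of 0.1 in double precision) are assumptions chosen for illustration.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Compensated summation: carry the rounding error of each addition."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for x in values:
        y = x - compensation            # apply the correction to the next term
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost in this step
        total = t
    return total


# Hypothetical data: many small terms of equal size.
values = [0.1] * 1_000_000

# math.fsum returns the correctly rounded sum and serves as the reference.
reference = math.fsum(values)
naive_error = abs(naive_sum(values) - reference)
kahan_error = abs(kahan_sum(values) - reference)

print("naive error:", naive_error)
print("Kahan error:", kahan_error)
```

The compensated version costs three extra floating-point operations per term, in exchange for an error bound that does not grow with the number of terms to first order.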
References
Altekar, G., Dwarkadas, S., Huelsenbeck, J.P., Ronquist, F.: Parallel Metropolis coupled Markov chain Monte Carlo for Bayesian phylogenetic inference. Bioinformatics 20, 407–415 (2004)
Bao, J., Xia, J., Zhou, J., Liu, X.G., Wang, G.: Efficient implementation of MrBayes on multi-GPU. Mol. Biol. Evol. 30, 1471–1479 (2013)
Farber, R.: CUDA Application Design and Development. Morgan Kaufmann, San Francisco (2011)
Felsenstein, J.: Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17, 368–376 (1981)
Kahan, W.: Pracniques: further remarks on reducing truncation errors. Commun. ACM 8(1), 40 (1965). http://doi.acm.org/10.1145/363707.363723
Larget, B., Simon, D.L.: Markov chain Monte Carlo algorithms for the Bayesian analysis of phylogenetic trees. Mol. Biol. Evol. 16, 750–759 (1999)
Li, S., Pearl, D.K., Doss, H.: Phylogenetic tree construction using Markov chain Monte Carlo. J. Am. Statist. Assoc. 95, 493–508 (2000)
Mau, B., Newton, M.A.: Phylogenetic inference for binary data on dendrograms using Markov chain Monte Carlo. J. Comp. Graph. Stat. 6, 122–131 (1997)
NVIDIA: CUDA C Programming Guide (2013)
Pang, S., Stones, R.J., Ren, M.M., Liu, X.G., Wang, G., Xia, H., Wu, H.Y., Liu, Y., Xie, Q.: GPU MrBayes v3.1: GPU MrBayes on graphics processing units for protein sequence data. Mol. Biol. Evol. 32(9), 2496–2497 (2015)
Pratas, F., Trancoso, P., Stamatakis, A., Sousa, L.: Fine-grain parallelism using multi-core, Cell/BE, and GPU systems: accelerating the phylogenetic likelihood function. In: 42nd International Conference on Parallel Processing, pp. 9–17 (2009)
Rannala, B., Yang, Z.: Probability distribution of molecular evolutionary trees: a new method of phylogenetic inference. J. Mol. Evol. 43, 304–311 (1996)
Saitou, N., Nei, M.: The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425 (1987)
Schmidt, H., Strimmer, K., Vingron, M., Haeseler, A.: Tree-puzzle: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18, 502–504 (2002)
Feng, X., Buell, D.A., Rose, J.R., Waddell, P.J.: Parallel algorithms for Bayesian phylogenetic inference. J. Parallel Distrib. Comput. 63, 707–718 (2003)
Xie, Q., Bu, W., Zheng, L.: The Bayesian phylogenetic analysis of the 18S rRNA sequences from the main lineages of Trichophora (Insecta: Heteroptera: Pentatomomorpha). Mol. Phylogenet. Evol. 34, 448–451 (2005)
Yang, Z.: Phylogenetic analysis using parsimony and likelihood methods. J. Mol. Evol. 42(2), 294–307 (1996)
Zhou, J., Liu, X.G., Stones, D.S., Xie, Q., Wang, G.: MrBayes on a graphics processing unit. Bioinformatics 27, 1255–1261 (2011)
Acknowledgements
A biology-focused version of this paper has been published [10]. This work is partially supported by NSF of China (grant numbers: 61373018, 11301288), Program for New Century Excellent Talents in University (grant number: NCET130301) and the Fundamental Research Funds for the Central Universities (grant number: 65141021). Stones was supported by her NSF China Research Fellowship for International Young Scientists (grant number: 11450110409). We would also like to thank Hongju Xia, Jianfu Zhou, Jie Bao and Prof. Qiang Xie for their valuable input.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Pang, S., Stones, R.J., Ren, M.M., Wang, G., Liu, X. (2015). MrBayes for Phylogenetic Inference Using Protein Data on a GPU Cluster. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol 9530. Springer, Cham. https://doi.org/10.1007/978-3-319-27137-8_21
DOI: https://doi.org/10.1007/978-3-319-27137-8_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27136-1
Online ISBN: 978-3-319-27137-8
eBook Packages: Computer Science (R0)