Abstract
Next Generation Sequencing (NGS) technology produces large volumes of genome data, which are processed using various open source bioinformatics tools. Configuring and compiling some of these tools (e.g. BWAKIT, root) is challenging in its own right, and porting them to other architectures (e.g. IBM Power) requires more elaborate work. Best practices for application porting should ensure that (i) the semantics of the program or algorithm are not changed, (ii) the output generated from the original source code and from the modified (ported) source code is the same, even when the code is ported to a different architecture, and (iii) the output is similar across different architectures after porting. Burrows-Wheeler Aligner (BWA) is the most popular genome mapping application used in the BWAKIT toolset. BWAKIT provides pre-compiled binaries for the x86_64 architecture and an end-to-end solution for genome mapping. In this paper, we show how to port the various pre-built application binaries used in BWAKIT to the OpenPOWER architecture and execute the BWAKIT pipeline successfully. Additionally, we validate the output results on OpenPOWER and present benchmarking results for BWAKIT applications that indicate the suitability of the highly multithreaded OpenPOWER architecture for executing these applications.
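Criteria (ii) and (iii) amount to comparing the alignment output produced on two architectures. The sketch below is one possible way to do that, not the validation procedure used in the paper. It assumes both runs emit SAM text with records in the same order, skips header lines (which start with `@` and may legitimately differ, for example in the recorded command line), and compares only the 11 mandatory SAM fields. The function names and sample records are hypothetical.

```python
from itertools import zip_longest

# The SAM format defines 11 mandatory tab-separated fields (QNAME ... QUAL).
MANDATORY_FIELDS = 11


def alignment_records(lines):
    """Yield the mandatory fields of each alignment line, skipping '@' headers."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue
        yield tuple(line.split("\t")[:MANDATORY_FIELDS])


def count_mismatches(reference_lines, ported_lines):
    """Count records that differ between two SAM outputs, compared in order.

    A record present in only one of the two outputs counts as a mismatch.
    """
    pairs = zip_longest(alignment_records(reference_lines),
                        alignment_records(ported_lines))
    return sum(1 for ref, ported in pairs if ref != ported)


if __name__ == "__main__":
    # Hypothetical sample data standing in for output from each architecture.
    record = "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0\n"
    x86_sam = ["@HD\tVN:1.6\n", "@PG\tID:bwa\tCL:run-on-x86_64\n", record]
    power_sam = ["@HD\tVN:1.6\n", "@PG\tID:bwa\tCL:run-on-ppc64le\n", record]
    print("mismatching records:", count_mismatches(x86_sam, power_sam))
```

In practice the two arguments could be open file handles for the SAM files from each platform. A count of zero would indicate that the mandatory fields agree record by record.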
Acknowledgement
The authors gratefully acknowledge the access that was provided to OpenPOWER hardware at Forschungszentrum Jülich Supercomputing Center. Special thanks go to Dr. Dirk Pleiter and Dr. Marcus Richter, Jülich Supercomputing Center, Germany. The authors would also like to thank Mr. Jaideep Bajwa, Mr. Michael Dawson, and Dr. Yinhe Cheng for their help with the V8, K8 and trimadap source code modifications for the POWER architecture.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Kathiresan, N., Al-Ali, R., Jithesh, P., Narayanasamy, G., Al-Ars, Z. (2018). Porting and Benchmarking of BWAKIT Pipeline on OpenPOWER Architecture. In: Yokota, R., Weiland, M., Shalf, J., Alam, S. (eds) High Performance Computing. ISC High Performance 2018. Lecture Notes in Computer Science, vol 11203. Springer, Cham. https://doi.org/10.1007/978-3-030-02465-9_27
DOI: https://doi.org/10.1007/978-3-030-02465-9_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02464-2
Online ISBN: 978-3-030-02465-9
eBook Packages: Computer Science, Computer Science (R0)