Porting and Benchmarking of BWAKIT Pipeline on OpenPOWER Architecture

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11203)

Abstract

Next Generation Sequencing (NGS) technology produces large volumes of genome data, which are processed using various open source bioinformatics tools. Configuring and compiling some of these tools (e.g. BWAKIT, root) is challenging in its own right, and porting them to other architectures (e.g. IBM Power) requires more elaborate work. Best practices for application porting should ensure that (i) the semantics of the program or algorithm are not changed, (ii) the output generated from the original source code and from the modified (i.e., ported) source code is the same, even though the code is ported to a different architecture, and (iii) the output is similar across different architectures after porting. Burrows-Wheeler Aligner (BWA) is the most popular genome mapping application used in the BWAKIT toolset. BWAKIT provides pre-compiled binaries for the x86_64 architecture and an end-to-end solution for genome mapping. In this paper, we show how to port the various pre-built application binaries used in BWAKIT to the OpenPOWER architecture and execute the BWAKIT pipeline successfully. Additionally, we demonstrate the validity of the output results on OpenPOWER and present benchmarking results of BWAKIT applications that indicate the suitability of the highly multithreaded OpenPOWER architecture for executing these applications.
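Criterion (ii) can be checked mechanically. The following is a minimal sketch, not the authors' validation procedure: it assumes both runs wrote SAM text output with records in the same order, and it skips header lines (those starting with `@`), since the `@PG` header records the command line and may legitimately differ between machines. The function names and the sample records are hypothetical.

```python
from itertools import zip_longest


def sam_records(lines):
    """Yield alignment records, skipping '@' header lines and blank lines."""
    for line in lines:
        line = line.rstrip("\n")
        if line and not line.startswith("@"):
            yield line


def first_mismatch(lines_a, lines_b):
    """Return (index, record_a, record_b) for the first differing record.

    Returns None when both inputs contain identical records in the same
    order. A missing record on either side is reported as None.
    """
    pairs = zip_longest(sam_records(lines_a), sam_records(lines_b))
    for index, (rec_a, rec_b) in enumerate(pairs):
        if rec_a != rec_b:
            return (index, rec_a, rec_b)
    return None


# Hypothetical two-read outputs from an x86_64 run and an OpenPOWER run.
x86_run = [
    "@PG\tID:bwa\tCL:bwa mem ref.fa reads.fq (x86_64 host)\n",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t16\tchr1\t250\t60\t4M\t*\t0\t0\tTTGA\tIIII\n",
]
power_run = [
    "@PG\tID:bwa\tCL:bwa mem ref.fa reads.fq (ppc64le host)\n",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t16\tchr1\t250\t60\t4M\t*\t0\t0\tTTGA\tIIII\n",
]

print(first_mismatch(x86_run, power_run))
```

For real files, the same function accepts open file handles, since it only iterates over lines. Comparing in order is a deliberate simplification; if record order is not guaranteed to match between runs, both record streams would need to be sorted first.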

Acknowledgement

The authors gratefully acknowledge the access provided to OpenPOWER hardware at the Forschungszentrum Jülich Supercomputing Center. Special thanks go to Dr. Dirk Pleiter and Dr. Marcus Richter, Jülich Supercomputing Center, Germany. The authors would also like to thank Mr. Jaideep Bajwa, Mr. Michael Dawson, and Dr. Yinhe Cheng for their help with the V8, K8 and trimadap source code modifications for the POWER architecture.

Author information

Corresponding author

Correspondence to Nagarajan Kathiresan.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Kathiresan, N., Al-Ali, R., Jithesh, P., Narayanasamy, G., Al-Ars, Z. (2018). Porting and Benchmarking of BWAKIT Pipeline on OpenPOWER Architecture. In: Yokota, R., Weiland, M., Shalf, J., Alam, S. (eds) High Performance Computing. ISC High Performance 2018. Lecture Notes in Computer Science, vol 11203. Springer, Cham. https://doi.org/10.1007/978-3-030-02465-9_27

  • DOI: https://doi.org/10.1007/978-3-030-02465-9_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-02464-2

  • Online ISBN: 978-3-030-02465-9

  • eBook Packages: Computer Science (R0)
