Abstract
The Smith-Waterman algorithm determines the similarity between two very long data streams; a popular application is sequence alignment of DNA. Like many computational algorithms, it is constrained by the memory and computational capacity of the system. A parallel implementation that distributes the work across an HPC system can therefore both accelerate the algorithm and allow it to run at larger scales. A central part of the algorithm is computing the similarity matrix, which is the mechanism that evaluates the quality of the matching sequences. The access pattern to the matrix during this computation is non-uniform; as such, it better suits the Partitioned Global Address Space (PGAS) programming model. In this paper, we explore parallelizing the Smith-Waterman algorithm using the OpenSHMEM model and the interfaces in OpenSHMEM 1.2, as well as the one-sided communication interfaces in MPI-3. We also explore the advantages of using non-blocking communication interfaces, which are proposed as extensions for a future OpenSHMEM specification. We evaluate the parallel implementations on Titan, a Cray XK7 system at the Oak Ridge Leadership Computing Facility (OLCF). Our results demonstrate good weak and strong scaling characteristics for both the OpenSHMEM and MPI-3 implementations.
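To make the similarity-matrix computation concrete, the following is a minimal sequential sketch of the standard Smith-Waterman recurrence. It is not the parallel implementation described in this paper: the function name `similarity_matrix`, the scoring values (match = 2, mismatch = -1) and the linear gap penalty of -1 are illustrative assumptions.

```python
def similarity_matrix(a, b, match=2, mismatch=-1, gap=-1):
    """Fill the Smith-Waterman similarity matrix H for sequences a and b.

    Scoring values and the linear gap penalty are illustrative assumptions.
    Returns (H, best_score, best_position).
    """
    rows, cols = len(a) + 1, len(b) + 1
    # Row 0 and column 0 stay at zero: a local alignment may start anywhere.
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            # Each cell depends on its upper-left, upper, and left neighbours.
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap
            left = H[i][j - 1] + gap
            # Clamping at zero is what makes the alignment local.
            H[i][j] = max(0, diag, up, left)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    return H, best, best_pos


if __name__ == "__main__":
    H, score, pos = similarity_matrix("ACGT", "AGT")
    print(score, pos)
```

Because each cell reads its upper, left, and upper-left neighbours, a distributed version that splits the matrix across processes has to fetch boundary cells owned by other processes. That is the kind of remote access that one-sided get and put operations are suited to.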
Notice of Copyright
This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Acknowledgments
This work is supported by the United States Department of Defense and used the resources of the Extreme Scale Systems Center (ESSC) located at the Oak Ridge National Laboratory (ORNL).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Baker, M., Welch, A., Gorentla Venkata, M. (2015). Parallelizing the Smith-Waterman Algorithm Using OpenSHMEM and MPI-3 One-Sided Interfaces. In: Gorentla Venkata, M., Shamis, P., Imam, N., Lopez, M. (eds) OpenSHMEM and Related Technologies. Experiences, Implementations, and Technologies. OpenSHMEM 2014. Lecture Notes in Computer Science, vol 9397. Springer, Cham. https://doi.org/10.1007/978-3-319-26428-8_12
DOI: https://doi.org/10.1007/978-3-319-26428-8_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26427-1
Online ISBN: 978-3-319-26428-8
eBook Packages: Computer Science, Computer Science (R0)