International Journal of Parallel Programming, Volume 41, Issue 1, pp 111–136

Parallel Smith-Waterman Comparison on Multicore and Manycore Computing Platforms with BSP++

  • Khaled Hamidouche
  • Fernando Machado Mendonca
  • Joel Falcou
  • Alba Cristina Magalhaes Alves de Melo
  • Daniel Etiemble


Biological sequence comparison is an important operation in Bioinformatics that is often used to relate organisms. Smith and Waterman proposed an exact algorithm (SW) that compares two sequences in quadratic time and space. Because of its high computing power and memory requirements, SW is usually executed on High Performance Computing (HPC) platforms such as multicore clusters and CellBEs. Since HPC architectures exhibit very different hardware characteristics, porting an application to them is an error-prone, time-consuming task. BSP++ is an implementation of the Bulk Synchronous Parallel (BSP) model that aims to facilitate parallel programming and reduce the effort needed to port code. In this paper, we propose and evaluate a parallel BSP++ strategy to execute SW on multiple multicore and manycore platforms. From the same base code, we generated MPI, OpenMP, MPI/OpenMP, CellBE and MPI/CellBE versions, which were executed on heterogeneous platforms with up to 6,144 cores. The results obtained with real DNA sequences show that the performance of our versions is comparable to that of the hand-tuned strategies in the literature, which demonstrates the appropriateness and flexibility of our approach.
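To make the quadratic time and space cost concrete, the following is a minimal sequential sketch of the Smith-Waterman scoring recurrence. It is not the BSP++ strategy evaluated in the paper. The function name `smith_waterman_score` and the scoring values (match +2, mismatch -1, linear gap -1) are assumptions chosen for illustration only.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of strings a and b.

    Illustrative sketch: the scoring parameters are assumed values,
    and a simple linear gap penalty is used.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # Full (len(a)+1) x (len(b)+1) matrix: this is the quadratic space cost.
    h = [[0] * cols for _ in range(rows)]
    best = 0
    # Every cell is visited once: this is the quadratic time cost.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = h[i - 1][j] + gap      # gap inserted in b
            left = h[i][j - 1] + gap    # gap inserted in a
            # Local alignment: a score never drops below zero.
            h[i][j] = max(0, diag, up, left)
            best = max(best, h[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACACACTA", "AGCACACA"))
```

Each cell depends only on its left, upper and upper-left neighbours, so all cells on the same anti-diagonal are independent of one another. This dependency pattern is what parallel SW strategies in general exploit.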


Keywords: Bioinformatics; Parallel programming; Parallel hybrid architectures





Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Khaled Hamidouche (1)
  • Fernando Machado Mendonca (2)
  • Joel Falcou (1)
  • Alba Cristina Magalhaes Alves de Melo (2)
  • Daniel Etiemble (1)

  1. LRI, Universite Paris-Sud, Orsay, France
  2. University of Brasilia, Brasília, Brazil
