SWIMM 2.0: Enhanced Smith–Waterman on Intel’s Multicore and Manycore Architectures Based on AVX-512 Vector Extensions
The Smith–Waterman (SW) algorithm is the most commonly used method for local sequence alignment, but its adoption is limited by its computational cost on large protein databases. Although the acceleration of SW has been studied on many parallel platforms, hardly any studies take advantage of the latest Intel architectures based on the AVX-512 vector extensions. This SIMD instruction set is currently supported by Intel’s Knights Landing (KNL) accelerator and Intel’s Skylake (SKL) general-purpose processors. In this paper, we present an SW version optimized for both architectures: SWIMM 2.0. Because this vector instruction set is new, previous programming and optimization techniques had to be revised. SWIMM 2.0 relies on massive multithreading and SIMD exploitation. It is competitive with other state-of-the-art implementations, reaching 511 GCUPS on a single KNL node and 734 GCUPS on a server equipped with dual SKL processors. These performance rates also make SWIMM 2.0 the most energy-efficient implementation in this study, achieving 2.94 GCUPS/Watt on the SKL processor.
Keywords: Bioinformatics, Smith–Waterman, Xeon-Phi, Intel-KNL, SIMD, Intel-AVX512
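For readers unfamiliar with the quantities in the abstract, the following minimal Python sketch shows a scalar SW score computation and how a GCUPS figure (billions of cell updates per second) is derived from a search. It is illustrative only and is not the SWIMM 2.0 implementation: the function names `sw_score` and `gcups`, the linear gap penalty, and the match/mismatch scores are assumptions made for this example, whereas protein database searches typically use a substitution matrix and affine gap penalties.

```python
def sw_score(query, subject, match=2, mismatch=-1, gap=2):
    """Best local alignment score (scalar reference, linear gap penalty).

    The scoring parameters are hypothetical defaults chosen for the example.
    """
    prev = [0] * (len(subject) + 1)  # previous row of the score matrix
    best = 0
    for q in query:
        curr = [0]  # column 0 is always 0 in local alignment
        for j, s in enumerate(subject, start=1):
            diag = prev[j - 1] + (match if q == s else mismatch)
            # A cell never drops below 0, which is what makes the alignment local.
            h = max(0, diag, prev[j] - gap, curr[j - 1] - gap)
            curr.append(h)
            best = max(best, h)
        prev = curr
    return best


def gcups(query_len, db_residues, seconds):
    """Billions of matrix cells updated per second for one database search."""
    return query_len * db_residues / (seconds * 1e9)
```

Each query residue is compared against every database residue, so the number of cell updates is the product of the query length and the total database length. SIMD implementations raise throughput by computing many such cells per instruction.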
This work has been supported by the EU (FEDER) and the Spanish MINECO, under Grant TIN2015-65277-R and the CAPAP-H6 network (TIN2016-81840-REDT).