Function portability of molecular dynamics on heterogeneous parallel architectures with OpenCL
A classical molecular dynamics simulation for atomistic systems is implemented in OpenCL and benchmarked on a variety of hardware platforms. Varying the number of particles and the system size provides insight into the characteristics of parallel compute platforms: latency, data transfer, memory access characteristics and compute-intensive work can be identified as fingerprints in the benchmark runs. Data layouts are compared, and the structure-of-arrays layout shows the best performance in most cases. It is demonstrated that function portability can be achieved straightforwardly with OpenCL, while performance portability lags behind, as the various architectures depend strongly on specific vectorisation optimisations.
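To make the data-layout comparison concrete, the following is a minimal sketch in plain C of the two layouts usually contrasted in this context: array-of-structures (AoS) and structure-of-arrays (SoA). It is not taken from the paper's code; the type names `ParticleAoS` and `ParticlesSoA` and the helper functions `sum_r2_aos` and `sum_r2_soa` are hypothetical and chosen for illustration only. Both functions compute the same quantity (the sum of squared coordinates over all particles), so only the memory access pattern differs.

```c
#include <stddef.h>

/* Hypothetical AoS layout: the three coordinates of one particle
 * are adjacent in memory. */
typedef struct {
    float x, y, z;
} ParticleAoS;

/* Hypothetical SoA layout: each coordinate is stored in its own
 * contiguous array, indexed by particle. */
typedef struct {
    const float *x;
    const float *y;
    const float *z;
} ParticlesSoA;

/* Sum of x^2 + y^2 + z^2 over n particles, AoS access.
 * Consecutive particles are three floats apart in memory. */
float sum_r2_aos(const ParticleAoS *p, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        s += p[i].x * p[i].x + p[i].y * p[i].y + p[i].z * p[i].z;
    }
    return s;
}

/* Same quantity, SoA access.
 * Consecutive particles are adjacent within each coordinate array. */
float sum_r2_soa(const ParticlesSoA *p, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        s += p->x[i] * p->x[i] + p->y[i] * p->y[i] + p->z[i] * p->z[i];
    }
    return s;
}
```

In the SoA variant, neighbouring loop iterations (or neighbouring work-items in an OpenCL kernel) read neighbouring memory addresses, which is the access pattern that vector units and coalescing memory systems generally favour. Whether this translates into a measurable gain depends on the device and compiler.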
Keywords: Molecular dynamics, OpenCL, Shared memory parallelisation, Many-core architectures