Performance-Portable Many-Core Plasma Simulations: Porting PIConGPU to OpenPower and Beyond

Zenker, Erik; Widera, René; Huebl, Axel; Juckeland, Guido; Knüpfer, Andreas; Nagel, Wolfgang E.; Bussmann, Michael

doi:10.1007/978-3-319-46079-6_21

Conference paper
First Online: 06 October 2016

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9945)

Abstract

With the appearance of the heterogeneous platform OpenPower, many-core accelerator devices have been coupled with Power host processors for the first time. To utilize their full potential, it is worth investigating performance-portable algorithms that allow choosing the best-fitting hardware for each domain-specific compute task. To suit even the high level of parallelism on modern GPGPUs, our approach relies heavily on abstract meta-programming techniques, which are essential for focusing on fine-grained tuning rather than code porting. With this in mind, the CUDA-based open-source plasma simulation code PIConGPU is currently being abstracted to support the heterogeneous OpenPower platform using our fast porting interface cupla, which wraps the abstract parallel C++11 kernel acceleration library Alpaka.

We demonstrate how PIConGPU can benefit from the tunable kernel execution strategies of the Alpaka library, achieving portability and performance with single-source kernels on conventional CPUs, Power8 CPUs and NVIDIA GPUs.
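The single-source idea can be illustrated with a minimal C++11 sketch: the kernel body is written once, and the execution strategy is selected by a back-end tag at the call site. The sketch does not use the Alpaka or cupla API. The names `SerialBackend`, `BlockedBackend`, `AxpyKernel`, `launch` and `runAxpy` are hypothetical and chosen only for illustration, and both strategies run sequentially on the host, whereas a real back end would hand each block to a CPU thread or a GPU thread block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical back-end tags for illustration; NOT Alpaka or cupla types.
struct SerialBackend {};
struct BlockedBackend { std::size_t blockSize; };  // tunable parameter

// Single-source kernel: written once per element index, with no knowledge
// of how the indices are distributed over the hardware.
struct AxpyKernel {
    void operator()(std::size_t i, float a, float const* x, float* y) const {
        y[i] = a * x[i] + y[i];
    }
};

// Strategy 1: visit all indices in one flat loop.
template <typename Kernel, typename... Args>
void launch(SerialBackend, std::size_t n, Kernel const& kernel, Args... args) {
    for (std::size_t i = 0; i < n; ++i) kernel(i, args...);
}

// Strategy 2: visit indices block by block. The outer loop is the unit a
// parallel back end would distribute; here it stays sequential.
template <typename Kernel, typename... Args>
void launch(BlockedBackend backend, std::size_t n, Kernel const& kernel, Args... args) {
    assert(backend.blockSize > 0);
    for (std::size_t begin = 0; begin < n; begin += backend.blockSize) {
        std::size_t const end =
            begin + backend.blockSize < n ? begin + backend.blockSize : n;
        for (std::size_t i = begin; i < end; ++i) kernel(i, args...);
    }
}

// Runs the same kernel with the chosen back end and returns y = 2 * x + 10.
template <typename Backend>
std::vector<float> runAxpy(Backend backend) {
    std::vector<float> x{1.f, 2.f, 3.f, 4.f, 5.f};
    std::vector<float> y(x.size(), 10.f);
    launch(backend, x.size(), AxpyKernel{}, 2.f, x.data(), y.data());
    return y;
}
```

Calling `runAxpy(SerialBackend{})` and `runAxpy(BlockedBackend{2})` executes the identical kernel source under two different traversal strategies. Only the tag and its block size change, which is the kind of tuning knob the abstract refers to.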

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 654220.

Notes

1. --use_fast_math --ftz=false -g0 -O3 -m64.
2. -g0 -O3 -m64 -funroll-loops -march=native --param max-unroll-times=512 -ffast-math.

Author information

Authors and Affiliations

Helmholtz-Zentrum Dresden–Rossendorf, Dresden, Germany
Erik Zenker, René Widera, Axel Huebl, Guido Juckeland & Michael Bussmann
Technische Universität Dresden, Dresden, Germany
Erik Zenker, Axel Huebl, Andreas Knüpfer & Wolfgang E. Nagel

Corresponding authors

Correspondence to Axel Huebl or Michael Bussmann.

Editor information

Editors and Affiliations

University of Delaware, Newark, Delaware, USA
Michela Taufer
Forschungszentrum Jülich, Jülich, Germany
Bernd Mohr
DKRZ, Hamburg, Germany
Julian M. Kunkel

About this paper

Cite this paper

Zenker, E. et al. (2016). Performance-Portable Many-Core Plasma Simulations: Porting PIConGPU to OpenPower and Beyond. In: Taufer, M., Mohr, B., Kunkel, J. (eds) High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science, vol 9945. Springer, Cham. https://doi.org/10.1007/978-3-319-46079-6_21

DOI: https://doi.org/10.1007/978-3-319-46079-6_21
Published: 06 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46078-9
Online ISBN: 978-3-319-46079-6
eBook Packages: Computer Science, Computer Science (R0)
