Abstract
In this paper we present the results of optimizing the performance of the gyrokinetic full-f fusion PIC code XGC1 on the Cori Phase Two Knights Landing system. Within the NERSC Exascale Science Applications Program, the code has undergone substantial development to enable the use of vector instructions in its most expensive kernels. We study the single-node performance of the code on an absolute scale, using the roofline methodology to guide optimization efforts. Enabling vectorization and optimizing the memory layout yielded 2× speedups in single-node performance. On multiple nodes, the code is shown to scale well up to 4000 nodes, nearly half the size of the machine. We also discuss communication bottlenecks that were identified and resolved during the work.
Acknowledgements
The authors wish to thank Drs. S. Abbot, E. D’Azavedo, E. Yoon, S. Ku, R. Hager and C.S. Chang for their help in understanding the XGC1 code and many helpful ideas during the optimization efforts. This research used resources of the National Energy Research Scientific Computing Center (NERSC), a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Koskela, T., Deslippe, J. (2017). Optimizing Fusion PIC Code Performance at Scale on Cori Phase Two. In: Kunkel, J., Yokota, R., Taufer, M., Shalf, J. (eds) High Performance Computing. ISC High Performance 2017. Lecture Notes in Computer Science, vol 10524. Springer, Cham. https://doi.org/10.1007/978-3-319-67630-2_32
DOI: https://doi.org/10.1007/978-3-319-67630-2_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67629-6
Online ISBN: 978-3-319-67630-2
eBook Packages: Computer Science, Computer Science (R0)