Performance Comparison of Eulerian Kinetic Vlasov Code Between Xeon Phi KNL and Xeon Broadwell
The present study treats a kinetic Vlasov simulation code, which solves the first-principles kinetic equation known as the Vlasov equation, as a high-performance application. A five-dimensional Vlasov code with two spatial dimensions and three velocity dimensions is parallelized with MPI-OpenMP hybrid parallelism. The performance of the parallel Vlasov code is measured on a single compute node with one Xeon Phi Knights Landing (KNL) processor and on a single compute node with two Xeon Broadwell processors. It is shown that using Multi-Channel Dynamic Random Access Memory (MCDRAM) in the “cache” mode gives higher performance than the “flat” mode when the size of a computing job is larger than the size of MCDRAM. On the other hand, using MCDRAM in the “flat” mode gives higher performance than the “cache” mode for small jobs, provided that the NUMA (Non-Uniform Memory Access) policy is controlled appropriately. It is also shown that the performance does not differ substantially among the cluster modes, although the Vlasov code performs best with the “Quadrant” cluster mode and worst with the “SNC-4” cluster mode.
Keywords: Performance measurement, Xeon Phi processor, Xeon processor, Eulerian-grid-based method, Hybrid parallelism
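
The abstract's rule of thumb for choosing a memory mode depends on whether the job fits in MCDRAM. The following is a minimal sketch of that arithmetic, not the authors' procedure. The grid sizes, the number of species, the number of array copies, the double-precision storage, and the 16 GiB MCDRAM capacity are all assumptions introduced here for illustration; none of them is taken from the study.

```python
# Hypothetical sketch: every name and default value below is an illustrative
# assumption, not a parameter reported in the study.

MCDRAM_BYTES = 16 * 1024**3  # assumed MCDRAM capacity of one KNL processor (16 GiB)


def distribution_bytes(nx, ny, nvx, nvy, nvz,
                       n_species=2, n_copies=2, bytes_per_value=8):
    """Estimate the memory taken by 5D distribution functions.

    nx, ny        : grid points in the two spatial dimensions
    nvx, nvy, nvz : grid points in the three velocity dimensions
    n_species     : number of particle species (assumed)
    n_copies      : arrays kept per species, e.g. old and new (assumed)
    """
    cells = nx * ny * nvx * nvy * nvz
    return cells * n_species * n_copies * bytes_per_value


def suggest_memory_mode(job_bytes, mcdram_bytes=MCDRAM_BYTES):
    """Return "flat" if the job fits in MCDRAM, otherwise "cache"."""
    return "flat" if job_bytes <= mcdram_bytes else "cache"


if __name__ == "__main__":
    small = distribution_bytes(40, 20, 30, 30, 30)
    large = distribution_bytes(160, 80, 60, 60, 60)
    for label, size in (("small", small), ("large", large)):
        print(f"{label}: {size / 1024**3:.2f} GiB -> {suggest_memory_mode(size)}")
```

Under these assumptions the small grid needs well under 1 GiB and maps to the “flat” mode, while the large grid needs far more than 16 GiB and maps to the “cache” mode. The sketch only covers the size comparison. In the “flat” mode the benefit also depends on the NUMA policy actually placing the arrays in MCDRAM, which has to be arranged separately at launch time.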