Design and Preliminary Evaluation of Omni OpenACC Compiler for Massive MIMD Processor PEZY-SC
PEZY-SC is a novel massive Multiple Instruction Multiple Data (MIMD) processor that is used as an accelerator and is characterized by high power efficiency. OpenACC is a standard directive-based programming model for accelerators that lets programmers concisely offload data and computation to them. In this paper, we present the design and preliminary implementation of an OpenACC compiler for PEZY-SC. Our compiler translates C code with OpenACC directives into the corresponding PZCL code, where PZCL is the programming environment for PEZY-SC. The evaluation shows that the OpenACC version achieves over 98 % of the performance of the PZCL version on N-body and up to 88 % on NAS Parallel Benchmarks CG. In addition, we examined optimization techniques such as kernel merging and explicit context switching to exploit the MIMD architecture of PEZY-SC, which differs from that of single instruction multiple data (SIMD) graphics processing units. We found these optimizations useful for improving performance, and they will be implemented in a future release.
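To illustrate the kind of input such a compiler accepts, the following is a minimal sketch of C code annotated with OpenACC directives. It is not taken from the paper: the function name `saxpy` and the choice of kernel are assumptions made only for illustration. The `copyin` and `copy` clauses describe the data to offload, and `parallel loop` marks the computation to offload.

```c
#include <stddef.h>

/* Hypothetical example kernel (not from the paper): y = a * x + y.
 * With an OpenACC compiler, the directive below offloads the loop and
 * the listed arrays to the accelerator. A compiler without OpenACC
 * support ignores the pragma and runs the loop sequentially on the host. */
void saxpy(int n, float a, const float *x, float *y)
{
    /* copyin: x is only read on the device.
     * copy:   y is sent to the device and copied back afterwards. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Because the directives are plain pragmas, the same source still builds with an ordinary C compiler, which is one reason a directive-based model is more concise than writing a separate kernel and host code by hand.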
Keywords: PEZY-SC, OpenACC, Compiler
The present study was supported in part by the JST/CREST program entitled “Research and Development on Unified Environment of Accelerated Computing and Interconnection for Post-Petascale Era” in the research area of “Development System Software Technologies for post-Peta Scale High Performance Computing”.