Automated GPU Support in LuNA Fragmented Programming System
The paper addresses the problem of reducing the complexity of developing numerical parallel programs for distributed-memory computers with hybrid (CPU+GPU) computing nodes. The basic idea is to employ a high-level representation of an application algorithm that allows its automated execution on multicomputers with hybrid nodes without the programmer having to do low-level programming. LuNA is a programming system for numerical algorithms that implements this idea, but only for CPUs. In this paper we propose a LuNA language extension, together with the necessary run-time algorithms, to support GPU utilization. The user only has to provide a limited number of computational GPU procedures written in CUDA, while the system takes care of the associated low-level problems, such as job scheduling, CPU-GPU data transfer, and network communications. The algorithms developed and implemented take advantage of the informational dependencies of an application and support automated tuning to the available hardware configuration and the application's input data.
Keywords: Hybrid multicomputers · GPGPU · Parallel programming automation · Fragmented programming · LuNA system