Automated GPU Support in LuNA Fragmented Programming System

  • Belyaev Nikolay
  • Vladislav Perepelkin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10421)


The paper addresses the problem of reducing the complexity of developing parallel numerical programs for distributed-memory computers with hybrid (CPU+GPU) computing nodes. The basic idea is to use a high-level representation of the application algorithm so that it can be executed automatically on multicomputers with hybrid nodes, without the programmer having to do low-level programming. LuNA is a programming system for numerical algorithms that implements this idea, but only for CPUs. In this paper we propose an extension of the LuNA language, together with the run-time algorithms needed to support GPU utilization. The user only has to provide a limited number of computational GPU procedures written in CUDA, while the system takes care of the associated low-level problems, such as job scheduling, CPU-GPU data transfer, network communication and others. The algorithms developed and implemented take the informational dependencies of the application into account and support automated tuning to the available hardware configuration and the application input data.
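The division of labour described above can be illustrated with a toy model. The following is only a minimal sketch under stated assumptions, not LuNA's actual run-time algorithm or language: the names `Fragment`, `has_gpu_impl` and `schedule` are hypothetical, each computation is assumed to produce exactly one data item, and a data item is assumed to live in a single memory space at a time (it is moved, not copied). The sketch shows how informational dependencies alone are enough for a run-time system to order the computations, pick a device for each one, and decide when a CPU-GPU transfer is needed.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical computational fragment: one call of a user-supplied procedure."""
    name: str
    inputs: list                 # names of the data items it consumes
    output: str                  # name of the data item it produces
    has_gpu_impl: bool = False   # assumption: True if a GPU procedure was supplied


def schedule(fragments, initial_data, gpu_available=True):
    """Order fragments by their dependencies and count CPU-GPU transfers.

    Returns (plan, transfers), where plan is a list of (fragment name, device).
    """
    # Assumption: all initial data starts in host (CPU) memory.
    location = {name: "cpu" for name in initial_data}
    pending = list(fragments)
    plan, transfers = [], 0
    while pending:
        # A fragment is ready once every one of its inputs has been produced.
        ready = [f for f in pending if all(i in location for i in f.inputs)]
        if not ready:
            raise ValueError("unsatisfiable dependencies")
        frag = ready[0]
        device = "gpu" if (gpu_available and frag.has_gpu_impl) else "cpu"
        # Move each input that currently resides in the other memory space.
        for i in frag.inputs:
            if location[i] != device:
                location[i] = device
                transfers += 1
        location[frag.output] = device
        plan.append((frag.name, device))
        pending.remove(frag)
    return plan, transfers
```

For a chain such as `init -> step1 -> step2 -> reduce`, where only `step1` and `step2` have GPU procedures, the intermediate result stays in GPU memory between the two GPU steps, so no transfer is scheduled there. With `gpu_available=False` the same description runs entirely on the CPU, which is the sense in which a high-level description can be tuned to the available hardware without changes to user code.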


Hybrid multicomputers, GPGPU, Parallel programming automation, Fragmented programming, LuNA system



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Institute of Computational Mathematics and Mathematical Geophysics SB RAS, Novosibirsk, Russia
  2. National Research University of Novosibirsk, Novosibirsk, Russia
