GSched: An Efficient Scheduler for Hybrid CPU-GPU HPC Systems

  • Mariano Raboso Mateos
  • Juan Antonio Cotobal Robles
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 217)


Modern, efficient GPUs are driving a new integration paradigm for parallel processing systems, in which the Message Passing Interface (MPI), OpenMP, and GPU architectures (CUDA) can be combined into a powerful high-performance computing (HPC) system. This combination, however, requires considerable effort to integrate the hardware and the software programming models properly. This paper describes GSched (Grid Scheduler), an optimized scheduler that distributes execution across both CPU and GPU processors according to a previously calculated optimum pattern, yielding the best overall elapsed execution time. In addition, a high-level algorithm description is introduced to distribute processing and network resources efficiently.


Keywords: scheduler, MPI, GPU, OpenMP, parallel processing, HPC, CUDA
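The preview does not include GSched's actual algorithm, so the following is only a minimal sketch of the load-balancing idea the abstract describes: partitioning work between CPU and GPU from a previously measured performance pattern so that both finish at roughly the same time. The function name, the calibration rates, and the equal-finish heuristic are assumptions for illustration, not details taken from the paper.

```python
def balanced_split(total_items, cpu_rate, gpu_rate):
    """Split a workload between CPU and GPU so both sides finish together.

    cpu_rate and gpu_rate are throughputs (items/second) obtained from a
    prior calibration run -- assumed here to play the role of the
    'previously calculated optimum pattern' mentioned in the abstract.
    Returns (cpu_items, gpu_items).
    """
    # Give each device a share proportional to its measured throughput,
    # which equalizes the expected completion times of the two partitions.
    gpu_share = gpu_rate / (cpu_rate + gpu_rate)
    gpu_items = round(total_items * gpu_share)
    return total_items - gpu_items, gpu_items


# Example: a GPU measured at 9x the CPU's throughput receives 90% of the work.
cpu_n, gpu_n = balanced_split(1000, cpu_rate=50.0, gpu_rate=450.0)
print(cpu_n, gpu_n)  # 100 900
```

With this split both partitions take about 2 seconds (100/50 and 900/450), so neither device sits idle waiting for the other, which is the condition that minimizes overall elapsed time for a single joint phase.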





Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Mariano Raboso Mateos¹
  • Juan Antonio Cotobal Robles¹
  1. Facultad de Informática, Universidad Pontificia de Salamanca, Salamanca, Spain
