Experiences with High-Level Programming Directives for Porting Applications to GPUs

  • Oscar Hernandez
  • Wei Ding
  • Barbara Chapman
  • Christos Kartsaklis
  • Ramanan Sankaran
  • Richard Graham
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7174)

Abstract

HPC systems now exploit GPUs within their compute nodes to accelerate program performance. As a result, high-end application development has become extremely complex at the node level. In addition to restructuring the node code to exploit the cores and specialized devices, the programmer may need to choose a programming model such as OpenMP or CPU threads in conjunction with an accelerator programming model to share and manage the different node resources. This comes at a time when programmer productivity and the ability to produce portable code have been recognized as major concerns. To offset the high development cost of creating CUDA or OpenCL kernels, directives have been proposed for programming accelerator devices, but their implications are not well known. In this paper, we evaluate state-of-the-art accelerator directives by programming several application kernels, explore transformations to achieve good performance, and examine the expressivity and performance penalty of using high-level directives versus CUDA. We also compare our results to OpenMP implementations to understand the benefits of running the kernels on the accelerator versus the CPU cores.

Keywords

Graphic Processing Unit · Spectral Element Method · Divergence Sphere · Spherical Element · CUDA Implementation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Oscar Hernandez 1,2
  • Wei Ding 1,2
  • Barbara Chapman 1,2
  • Christos Kartsaklis 1,2
  • Ramanan Sankaran 1,2
  • Richard Graham 1,2
  1. Computer Science and Mathematics Division, National Center for Computational Sciences, Oak Ridge National Laboratory, USA
  2. Dept. of Computer Science, University of Houston, USA