SkelCL: Enhancing OpenCL for High-Level Programming of Multi-GPU Systems

  • Conference paper
Parallel Computing Technologies (PaCT 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7979)

Abstract

Application development for modern high-performance systems with Graphics Processing Units (GPUs) currently relies on low-level programming approaches like CUDA and OpenCL, which leads to complex, lengthy and error-prone programs.

In this paper, we present SkelCL – a high-level programming approach for systems with multiple GPUs and its implementation as a library on top of OpenCL. SkelCL provides three main enhancements to the OpenCL standard: 1) computations are conveniently expressed using parallel algorithmic patterns (skeletons); 2) memory management is simplified using parallel container data types (vectors and matrices); 3) an automatic data (re)distribution mechanism allows for implicit data movements between GPUs and ensures scalability when using multiple GPUs. We demonstrate how SkelCL is used to implement parallel applications on one- and two-dimensional data. We report experimental results to evaluate our approach in terms of programming effort and performance.
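The three enhancements named above can be illustrated with a small sketch. It is plain Python and does not reproduce SkelCL's actual C++ API: the "devices" are simulated as Python lists, and every name in it (`DistributedVector`, `block_distribute`, `map_skeleton`, `zip_skeleton`, `reduce_skeleton`) is hypothetical. It only shows the general idea of a container that hides how its data is split across devices, and of skeletons that operate on each part independently.

```python
from functools import reduce
from operator import add, mul


def block_distribute(data, num_devices):
    """Split data into num_devices contiguous blocks of near-equal size."""
    base, extra = divmod(len(data), num_devices)
    parts, start = [], 0
    for d in range(num_devices):
        size = base + (1 if d < extra else 0)
        parts.append(data[start:start + size])
        start += size
    return parts


class DistributedVector:
    """Hypothetical container: one list per simulated device."""

    def __init__(self, data, num_devices):
        self.num_devices = num_devices
        self.parts = block_distribute(list(data), num_devices)

    def gather(self):
        """Collect all parts back into a single list."""
        return [x for part in self.parts for x in part]

    def redistribute(self, num_devices):
        """Re-split the same data across a different number of devices."""
        return DistributedVector(self.gather(), num_devices)


def map_skeleton(f, vec):
    """Apply f to every element; each part is processed independently."""
    out = DistributedVector([], vec.num_devices)
    out.parts = [[f(x) for x in part] for part in vec.parts]
    return out


def zip_skeleton(f, a, b):
    """Combine two identically distributed vectors element by element."""
    if [len(p) for p in a.parts] != [len(p) for p in b.parts]:
        raise ValueError("vectors must share the same distribution")
    out = DistributedVector([], a.num_devices)
    out.parts = [[f(x, y) for x, y in zip(pa, pb)]
                 for pa, pb in zip(a.parts, b.parts)]
    return out


def reduce_skeleton(f, vec, identity):
    """Reduce each part locally, then combine the partial results.

    Assumes f is associative and identity is its neutral element.
    """
    partials = [reduce(f, part, identity) for part in vec.parts]
    return reduce(f, partials, identity)


def dot(a, b):
    """Dot product composed from the zip and reduce skeletons."""
    return reduce_skeleton(add, zip_skeleton(mul, a, b), 0)
```

For example, `dot(DistributedVector([1, 2, 3, 4, 5], 2), DistributedVector([2] * 5, 2))` multiplies the two vectors part by part and sums the partial results. Calling `redistribute(3)` on a vector re-splits its data over three simulated devices without changing what `gather()` returns.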

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Steuwer, M., Gorlatch, S. (2013). SkelCL: Enhancing OpenCL for High-Level Programming of Multi-GPU Systems. In: Malyshkin, V. (ed.) Parallel Computing Technologies. PaCT 2013. Lecture Notes in Computer Science, vol 7979. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39958-9_24

  • DOI: https://doi.org/10.1007/978-3-642-39958-9_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39957-2

  • Online ISBN: 978-3-642-39958-9

  • eBook Packages: Computer Science, Computer Science (R0)
