Towards High-Level Programming for Systems with Many Cores

Gorlatch, Sergei; Steuwer, Michel

doi:10.1007/978-3-662-46823-4_10

Sergei Gorlatch & Michel Steuwer

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8974)

Included in the following conference series:

International Andrei Ershov Memorial Conference on Perspectives of System Informatics

Abstract

Application development for modern high-performance systems with many cores, i.e., comprising multiple Graphics Processing Units (GPUs) and multi-core CPUs, currently exploits low-level programming approaches like CUDA and OpenCL, which leads to complex, lengthy and error-prone programs. In this paper, we advocate a high-level programming approach for such systems, which relies on the following two main principles: (a) the model is based on the current OpenCL standard, such that programs remain portable across various many-core systems, independently of the vendor, and all low-level code optimizations can be applied; (b) the model extends OpenCL with three high-level features which simplify many-core programming and are automatically translated by the system into OpenCL code. The high-level features of our programming model are as follows: (1) memory management is simplified and automated using parallel container data types (vectors and matrices); (2) a data (re)distribution mechanism supports data partitioning and generates automatic data movements between multiple GPUs; (3) computations are precisely and concisely expressed using parallel algorithmic patterns (skeletons). The well-defined skeletons allow for semantics-preserving transformations of SkelCL programs which can be applied in the process of program development, as well as in the compilation and optimization phase. We demonstrate how our programming model and its implementation are used to express several parallel applications, and we report first experimental results on evaluating our approach in terms of program size and target performance.
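The three high-level features named in the abstract can be illustrated with a minimal, sequential sketch. It does not use the authors' library or OpenCL; every function name below (`map_skeleton`, `zip_skeleton`, `reduce_skeleton`, `block_distribute`, `dot_product`, `compose`) is hypothetical and chosen only for illustration. Plain Python lists stand in for the container types, contiguous blocks stand in for a distribution across devices, and the last lines check one semantics-preserving transformation (fusing two maps into one).

```python
from functools import reduce
from operator import add, mul


def map_skeleton(f, xs):
    """Apply f to every element independently (the data-parallel 'map' pattern)."""
    return [f(x) for x in xs]


def zip_skeleton(f, xs, ys):
    """Combine two equally long vectors element by element."""
    if len(xs) != len(ys):
        raise ValueError("zip_skeleton needs vectors of equal length")
    return [f(x, y) for x, y in zip(xs, ys)]


def reduce_skeleton(op, xs, identity):
    """Fold a vector with an associative operator op."""
    return reduce(op, xs, identity)


def block_distribute(xs, n_devices):
    """Split xs into n_devices contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(len(xs), n_devices)
    blocks, start = [], 0
    for d in range(n_devices):
        size = base + (1 if d < extra else 0)
        blocks.append(xs[start:start + size])
        start += size
    return blocks


def dot_product(xs, ys, n_devices=2):
    """Dot product as zip(mul) followed by reduce(add), one block per simulated device."""
    partials = [
        reduce_skeleton(add, zip_skeleton(mul, bx, by), 0)
        for bx, by in zip(block_distribute(xs, n_devices),
                          block_distribute(ys, n_devices))
    ]
    # Combining per-device partial sums is valid because add is associative.
    return reduce_skeleton(add, partials, 0)


def compose(f, g):
    """Return the function x -> f(g(x))."""
    return lambda x: f(g(x))


if __name__ == "__main__":
    xs, ys = [1, 2, 3, 4, 5], [10, 20, 30, 40, 50]
    print(block_distribute(xs, 2))  # [[1, 2, 3], [4, 5]]
    print(dot_product(xs, ys))      # 550

    # Map fusion: two traversals and one traversal give the same result.
    inc, dbl = (lambda x: x + 1), (lambda x: 2 * x)
    assert map_skeleton(inc, map_skeleton(dbl, xs)) == map_skeleton(compose(inc, dbl), xs)
```

In this sketch the caller never allocates buffers or moves data explicitly; that work is hidden behind the list-based containers and the distribution helper, which is the kind of simplification the abstract describes. A real implementation would generate OpenCL kernels and device transfers instead of running Python loops.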

Acknowledgments

This work is partially supported by the OFERTIE (FP7) and MONICA projects. We would like to thank the anonymous reviewers for their valuable comments, as well as NVIDIA for their generous hardware donation used in our experiments.

Author information

Authors and Affiliations

University of Muenster, Münster, Germany
Sergei Gorlatch & Michel Steuwer

Authors

Sergei Gorlatch
Michel Steuwer

Corresponding author

Correspondence to Sergei Gorlatch.

Editor information

Editors and Affiliations

School of Computer Science, University of Manchester, Manchester, United Kingdom
Andrei Voronkov
A.P. Ershov Institute of Informatics Systems, Novosibirsk, Russia
Irina Virbitskaite

About this paper

Cite this paper

Gorlatch, S., Steuwer, M. (2015). Towards High-Level Programming for Systems with Many Cores. In: Voronkov, A., Virbitskaite, I. (eds) Perspectives of System Informatics. PSI 2014. Lecture Notes in Computer Science, vol 8974. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46823-4_10

DOI: https://doi.org/10.1007/978-3-662-46823-4_10
Published: 19 April 2015
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-46822-7
Online ISBN: 978-3-662-46823-4
eBook Packages: Computer Science, Computer Science (R0)