OpenCL as a Programming Model for GPU Clusters

Kim, Jungwon; Seo, Sangmin; Lee, Jun; Nah, Jeongho; Jo, Gangwon; Lee, Jaejin

doi:10.1007/978-3-642-36036-7_6

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7146)

Abstract

In this paper, we propose an OpenCL framework for GPU clusters. The target cluster architecture consists of a single host node and multiple compute nodes connected by an interconnection network, such as Gigabit Ethernet or InfiniBand switches. Each compute node contains multiple GPUs, and each GPU becomes an OpenCL compute device. The host node executes the host program of an OpenCL application. Our OpenCL framework gives the user the illusion of a single system: it allows the application to use the GPUs in the compute nodes as if they were in the host node. No communication API, such as the MPI library, is required in the application source. We show that the original OpenCL semantics fits the GPU cluster environment naturally, and that the framework achieves both high performance and ease of programming. We implement the OpenCL framework and evaluate its performance with six OpenCL benchmark applications on a GPU cluster that consists of one host node and eight compute nodes.
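
The single-system illusion described in the abstract can be pictured as a flat list of compute devices that hides which node each GPU sits in. The following is only a toy model under stated assumptions, not the authors' implementation: the names `RemoteGPU` and `enumerate_devices` are hypothetical, the node-major numbering is an assumption, and the count of four GPUs per node is invented for illustration (the abstract states eight compute nodes but not the number of GPUs in each).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RemoteGPU:
    """One GPU on a compute node, seen by the host program as a compute device."""
    device_id: int   # flat index visible to the host program (assumed numbering)
    node: int        # compute node that physically holds the GPU
    local_gpu: int   # index of the GPU within that node


def enumerate_devices(num_nodes: int, gpus_per_node: int) -> List[RemoteGPU]:
    """Build the flat device list, numbering GPUs node by node."""
    return [
        RemoteGPU(node * gpus_per_node + gpu, node, gpu)
        for node in range(num_nodes)
        for gpu in range(gpus_per_node)
    ]


# Eight compute nodes as in the abstract; four GPUs per node is an assumption.
devices = enumerate_devices(num_nodes=8, gpus_per_node=4)
```

In this model the host program would address `devices[5]` like any local device, while a runtime layer (not shown) would use the `node` and `local_gpu` fields to forward the work over the interconnect.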

Author information

Authors and Affiliations

Center for Manycore Programming, School of Computer Science and Engineering, Seoul National University, Seoul, 151-744, Korea
Jungwon Kim, Sangmin Seo, Jun Lee, Jeongho Nah, Gangwon Jo & Jaejin Lee

Editor information

Editors and Affiliations

Computer Science Department, Colorado State University, 80523-1873, Fort Collins, CO, USA
Sanjay Rajopadhye & Michelle Mills Strout

About this paper

Cite this paper

Kim, J., Seo, S., Lee, J., Nah, J., Jo, G., Lee, J. (2013). OpenCL as a Programming Model for GPU Clusters. In: Rajopadhye, S., Mills Strout, M. (eds) Languages and Compilers for Parallel Computing. LCPC 2011. Lecture Notes in Computer Science, vol 7146. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36036-7_6

DOI: https://doi.org/10.1007/978-3-642-36036-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36035-0
Online ISBN: 978-3-642-36036-7
eBook Packages: Computer Science, Computer Science (R0)
