OpenCL Task Partitioning in the Presence of GPU Contention

Grewe, Dominik; Wang, Zheng; O’Boyle, Michael F. P.

doi:10.1007/978-3-319-09967-5_5

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8664)

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

Abstract

Heterogeneous multi- and many-core systems are increasingly prevalent in the desktop and mobile domains. On these systems it is common for programs to compete with co-running programs for resources. While multi-task scheduling for CPUs is a well-studied area, how to partition and map computing tasks onto a heterogeneous system in the presence of GPU contention (i.e. when multiple programs compete for the GPU) remains an outstanding problem.

In this paper we consider the problem of partitioning OpenCL kernels on a CPU-GPU based system in the presence of contention on the GPU. We propose a machine learning-based approach that predicts the optimal partitioning of OpenCL kernels, explicitly taking GPU contention into account. Our predictive model achieves a speed-up of 1.92 over a scheme that always uses the GPU. When compared to two state-of-the-art dynamic approaches our model achieves speed-ups of 1.54 and 2.56 respectively.
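To make the notion of a partitioning concrete, the following is a minimal sketch of how a one-dimensional OpenCL index space could be divided between a GPU and a CPU once a split ratio is known. It is an illustration only, not the authors' implementation: the function name `split_ndrange` and the parameter `gpu_fraction` are hypothetical, and the ratio is simply passed in here, whereas in the paper it would come from the predictive model. The sketch assumes that each device receives a contiguous chunk and that chunk sizes must be multiples of the work-group size.

```python
def split_ndrange(global_size, local_size, gpu_fraction):
    """Split a 1-D index space into one GPU chunk and one CPU chunk.

    Hypothetical helper for illustration. Returns
    ((gpu_offset, gpu_size), (cpu_offset, cpu_size)), where both sizes are
    multiples of local_size so that each chunk could be launched as its own
    NDRange with a global offset.
    """
    if local_size <= 0 or global_size % local_size != 0:
        raise ValueError("global_size must be a positive multiple of local_size")
    if not 0.0 <= gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must lie in [0, 1]")

    # Partition at work-group granularity, not at work-item granularity.
    num_groups = global_size // local_size
    gpu_groups = round(num_groups * gpu_fraction)

    gpu_size = gpu_groups * local_size
    cpu_size = global_size - gpu_size

    # The GPU takes the first chunk; the CPU chunk starts where it ends.
    return (0, gpu_size), (gpu_size, cpu_size)


# Example: 1024 work-items in groups of 64, with a quarter sent to the GPU.
gpu_chunk, cpu_chunk = split_ndrange(1024, 64, 0.25)
print(gpu_chunk, cpu_chunk)  # (0, 256) (256, 768)
```

A ratio of 1.0 corresponds to the "always use the GPU" baseline mentioned in the abstract, and 0.0 to running entirely on the CPU; under contention a predicted ratio would presumably shift work towards the CPU.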

Notes

1. NVIDIA GPUs allow concurrent executions of kernels from the same application but not from different applications.

Author information

Authors and Affiliations

School of Informatics, The University of Edinburgh, Edinburgh, UK
Dominik Grewe, Zheng Wang & Michael F. P. O’Boyle

Corresponding author

Correspondence to Dominik Grewe.

Editor information

Editors and Affiliations

Silicon Valley, Qualcomm Research, San Jose, California, USA
Călin Cașcaval
Silicon Valley, Qualcomm Research, San Jose, California, USA
Pablo Montesinos

About this paper

Cite this paper

Grewe, D., Wang, Z., O’Boyle, M.F.P. (2014). OpenCL Task Partitioning in the Presence of GPU Contention. In: Cașcaval, C., Montesinos, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2013. Lecture Notes in Computer Science, vol 8664. Springer, Cham. https://doi.org/10.1007/978-3-319-09967-5_5

DOI: https://doi.org/10.1007/978-3-319-09967-5_5
Published: 01 October 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09966-8
Online ISBN: 978-3-319-09967-5
eBook Packages: Computer Science, Computer Science (R0)
