Load Balancing for Heterogeneous Parallel Architecture

Chen, Quan; Guo, Minyi

doi:10.1007/978-981-10-6238-4_6



Abstract

Besides traditional CPU-based parallel computers, heterogeneous parallel architectures that consist of both CPUs and GPGPUs are used in many emerging large-scale clusters and supercomputers. To better utilize both the CPU and the GPU, an application can divide its workload and distribute it to the two types of hardware at the same time. However, it is not trivial to find an optimal allocation for all applications offline, because applications have different characteristics and therefore achieve different speedup ratios on a GPGPU compared with a CPU. To solve this problem, this chapter presents techniques that balance the application workload across heterogeneous hardware.
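To make the allocation problem concrete, here is a minimal sketch of a static proportional split. It is not the chapter's technique. It assumes that each device's throughput is already known (for example from a short profiling run), that throughput stays constant, and that data-transfer cost is ignored. The names `balanced_split` and `makespan` are hypothetical.

```python
def balanced_split(total_items, cpu_rate, gpu_rate):
    """Split total_items so that both devices finish at about the same time.

    cpu_rate and gpu_rate are throughputs in items per second. They are
    assumed to be known and constant, which is an illustrative simplification.
    """
    if cpu_rate <= 0 or gpu_rate <= 0:
        raise ValueError("rates must be positive")
    # Equal finish time means cpu_items / cpu_rate == gpu_items / gpu_rate,
    # so each device gets a share proportional to its throughput.
    gpu_share = gpu_rate / (cpu_rate + gpu_rate)
    gpu_items = round(total_items * gpu_share)
    cpu_items = total_items - gpu_items
    return cpu_items, gpu_items


def makespan(cpu_items, gpu_items, cpu_rate, gpu_rate):
    """Completion time when both devices run concurrently."""
    return max(cpu_items / cpu_rate, gpu_items / gpu_rate)


if __name__ == "__main__":
    # Hypothetical application whose GPU throughput is 4x its CPU throughput.
    cpu_items, gpu_items = balanced_split(1000, cpu_rate=1.0, gpu_rate=4.0)
    print(cpu_items, gpu_items)
    print(makespan(cpu_items, gpu_items, 1.0, 4.0))
    # An even split leaves the GPU idle while the CPU is still working.
    print(makespan(500, 500, 1.0, 4.0))
```

Because the best share depends on the speedup ratio, a split tuned for one application is generally unbalanced for another, which is why a single offline allocation does not suit all applications.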

Part of the contents of this chapter has been published at the International Workshop on Programming Models and Applications for Multicores and Manycores. Reprinted from Ref. [14], with permission from ACM.


References

[1] C. Augonnet, S. Thibault, R. Namyst, P. Wacrenier, StarPU: A unified platform for task scheduling on heterogeneous multicore architectures, Concurrency and Computation: Practice and Experience 23 (2) (2011) 187–198.
[2] S. S. Baghsorkhi, M. Delahaye, S. J. Patel, W. D. Gropp, W.-m. W. Hwu, An adaptive performance modeling tool for GPU architectures, in: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’10, ACM, New York, NY, USA, 2010, pp. 105–114.
[3] I. Buck, T. Foley, D. Horn, J. Sugerman, K. Fatahalian, M. Houston, P. Hanrahan, Brook for GPUs: stream computing on graphics hardware, in: ACM SIGGRAPH 2004 Papers, SIGGRAPH ’04, ACM, New York, NY, USA, 2004, pp. 777–786.
[4] J. Bueno, L. Martinell, A. Duran, M. Farreras, X. Martorell, R. Badia, E. Ayguade, J. Labarta, Productive cluster programming with OmpSs, Euro-Par 2011 Parallel Processing (2011) 555–566.
[5] B. He, W. Fang, Q. Luo, N. K. Govindaraju, T. Wang, Mars: a MapReduce framework on graphics processors, in: Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, PACT ’08, ACM, New York, NY, USA, 2008, pp. 260–269.
[6] S. Hong, H. Kim, An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness, in: Proceedings of the 36th Annual International Symposium on Computer Architecture, ISCA ’09, ACM, New York, NY, USA, 2009, pp. 152–163.
[7] S. Hong, H. Kim, An integrated GPU power and performance model, in: Proceedings of the 37th Annual International Symposium on Computer Architecture, ISCA ’10, ACM, New York, NY, USA, 2010, pp. 280–289.
[8] C.-K. Luk, S. Hong, H. Kim, Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping, in: Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, ACM, 2009, pp. 45–55.
[9] P. McCormick, J. Inman, J. Ahrens, J. Mohd-Yusof, G. Roth, S. Cummins, Scout: a data-parallel programming language for graphics processors, Parallel Computing 33 (10–11) (2007) 648–662.
[10] A. Munshi, The OpenCL specification version: 1.2 (2011).
[11] NVIDIA, CUDA C programming guide 5.0 (2012).
[12] S. Ryoo, C. I. Rodrigues, S. S. Baghsorkhi, S. S. Stone, D. B. Kirk, W.-m. W. Hwu, Optimization principles and application performance evaluation of a multithreaded GPU using CUDA, in: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’08, ACM, New York, NY, USA, 2008, pp. 73–82.
[13] T. R. Scogland, B. Rountree, W.-c. Feng, B. R. De Supinski, Heterogeneous task scheduling for accelerated OpenMP, in: Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International, IEEE, 2012, pp. 144–155.
[14] Z. Wang, L. Zheng, Q. Chen, M. Guo, CAP: co-scheduling based on asymptotic profiling in CPU+GPU hybrid systems, in: Proceedings of the 2013 International Workshop on Programming Models and Applications for Multicores and Manycores, ACM, 2013, pp. 107–114.
[15] Y. Zhang, J. Owens, A quantitative performance analysis model for GPU architectures, in: High Performance Computer Architecture (HPCA), 2011 IEEE 17th International Symposium on, 2011, pp. 382–393.
[16] F. Zhang, B. Wu, J. Zhai, B. He, W. Chen, Finepar: irregularity-aware fine-grained workload partitioning on integrated architectures, in: Proceedings of the 2017 International Symposium on Code Generation and Optimization, IEEE Press, 2017, pp. 27–38.


Author information

Authors and Affiliations

Shanghai Jiao Tong University, Shanghai, China
Quan Chen & Minyi Guo


Corresponding author

Correspondence to Quan Chen.


About this chapter

Cite this chapter

Chen, Q., Guo, M. (2017). Load Balancing for Heterogeneous Parallel Architecture. In: Task Scheduling for Multi-core and Parallel Architectures. Springer, Singapore. https://doi.org/10.1007/978-981-10-6238-4_6


DOI: https://doi.org/10.1007/978-981-10-6238-4_6
Published: 25 November 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6237-7
Online ISBN: 978-981-10-6238-4
eBook Packages: Computer Science, Computer Science (R0)
