Feedback Control Optimization for Performance and Energy Efficiency on CPU-GPU Heterogeneous Systems

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10048)

Abstract

Owing to rising awareness of environmental protection, high performance is no longer the only aim in system design; energy efficiency has become an increasingly important goal. In line with this goal, heterogeneous systems, which are more efficient than CPU-based homogeneous systems, occupy a growing proportion of the Top500 and Green500 lists. Nevertheless, heterogeneous system design is more complex and presents greater challenges in achieving a good tradeoff between performance and energy efficiency for applications running on such systems. To address the performance-energy tradeoff in CPU-GPU heterogeneous systems, we propose a novel feedback control optimization (FCO) method that alternates between frequency scaling of the devices and division of the kernel workload between the CPU and the GPU. Given a kernel and a workload division, frequency scaling finds near-optimal core frequencies for the CPU and the GPU. Further, an iterative algorithm is proposed for finding a near-optimal workload division that balances the workload between the CPU and the GPU at the frequencies that were optimal for the previous workload division. The frequency scaling phase and the workload division phase are performed alternately until the FCO method converges on a configuration comprising the CPU core frequency, the GPU core frequency, and the workload division. Experiments show that, compared with the state-of-the-art GreenGPU method, performance can be improved by 7.9% while energy consumption can be reduced by 4.16%.
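The alternation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the timing and power model in `run_model`, the frequency tables, the energy-delay cost, the exhaustive frequency search, and the fixed-step workload shift are all assumptions chosen only to make the loop structure concrete. All function and variable names are hypothetical.

```python
from itertools import product

# Hypothetical frequency levels in GHz; real values depend on the hardware.
CPU_FREQS = [1.2, 1.6, 2.0, 2.4]
GPU_FREQS = [0.4, 0.6, 0.8, 1.0]


def run_model(ratio, f_cpu, f_gpu, work=100.0):
    """Toy stand-in for one measured kernel run (assumed model, not from the paper).

    ratio is the fraction of the workload assigned to the CPU.
    Returns (completion time, energy, CPU time, GPU time).
    """
    t_cpu = ratio * work / f_cpu                   # assumed: CPU speed scales with frequency
    t_gpu = (1.0 - ratio) * work / (8.0 * f_gpu)   # assumed: GPU is 8x faster per GHz
    time = max(t_cpu, t_gpu)                       # the kernel ends when both devices finish
    power = (10 + 5 * f_cpu ** 3) + (20 + 60 * f_gpu ** 3)  # static plus cubic dynamic term
    return time, power * time, t_cpu, t_gpu


def cost(ratio, f_cpu, f_gpu):
    """Energy-delay product, used here as an assumed performance-energy metric."""
    time, energy, _, _ = run_model(ratio, f_cpu, f_gpu)
    return time * energy


def scale_frequencies(ratio):
    """Frequency scaling phase: pick the best frequency pair for a fixed division."""
    return min(product(CPU_FREQS, GPU_FREQS), key=lambda f: cost(ratio, *f))


def divide_workload(ratio, f_cpu, f_gpu, step=0.01, max_iter=200):
    """Workload division phase: shift work off the slower device at fixed frequencies."""
    for _ in range(max_iter):
        time, _, t_cpu, t_gpu = run_model(ratio, f_cpu, f_gpu)
        shift = -step if t_cpu > t_gpu else step
        candidate = min(1.0, max(0.0, ratio + shift))
        if run_model(candidate, f_cpu, f_gpu)[0] >= time:
            break                                  # no further reduction in completion time
        ratio = candidate
    return ratio


def fco(ratio=0.5, max_rounds=50):
    """Alternate the two phases until the configuration stops changing."""
    config = None
    for _ in range(max_rounds):
        f_cpu, f_gpu = scale_frequencies(ratio)
        ratio = divide_workload(ratio, f_cpu, f_gpu)
        new_config = (f_cpu, f_gpu, ratio)
        if new_config == config:
            break                                  # converged: same frequencies and division
        config = new_config
    return config


if __name__ == "__main__":
    print(fco())
```

In this sketch each phase can only lower the cost or leave it unchanged, so the loop settles on a fixed configuration. On real hardware, `run_model` would be replaced by measured execution time and energy, and the search strategy would need to tolerate measurement noise.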

Author information

Corresponding author

Correspondence to Feng-Sheng Lin.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Lin, FS., Liu, PT., Li, MH., Hsiung, PA. (2016). Feedback Control Optimization for Performance and Energy Efficiency on CPU-GPU Heterogeneous Systems. In: Carretero, J., Garcia-Blas, J., Ko, R., Mueller, P., Nakano, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2016. Lecture Notes in Computer Science, vol 10048. Springer, Cham. https://doi.org/10.1007/978-3-319-49583-5_29

  • DOI: https://doi.org/10.1007/978-3-319-49583-5_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49582-8

  • Online ISBN: 978-3-319-49583-5

  • eBook Packages: Computer Science, Computer Science (R0)
