Performance/Energy Aware Optimization of Parallel Applications on GPUs Under Power Capping

  • Adam Krzywaniak
  • Paweł Czarnul
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12044)


In this paper we present an approach to, and results from, applying the modern power capping mechanism available on NVIDIA GPUs to benchmarks widely used for assessing the performance of high performance computing systems: NAS Parallel Benchmarks BT, SP and LU, as well as cublasgemm-benchmark. Depending on the benchmark, different power cap configurations are best for a desired trade-off between performance and energy consumption. We present two views of the results: energy savings and performance drops for the same power caps, as well as a normalized performance-energy consumption product. Notably, the optimal configurations are often non-trivial, i.e. they are obtained for power caps smaller than the default and larger than the minimal allowed limits. Tests have been performed on two modern GPUs of the Pascal and Turing generations, the NVIDIA GTX 1070 and the NVIDIA RTX 2080 respectively, so the results can be useful for many applications with profiles similar to the benchmarks, executed on modern GPU-based systems.


Performance/energy optimization · Power capping · GPU · NAS Parallel Benchmarks
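The trade-off metrics named in the abstract can be sketched as follows. This is an illustrative computation, not the authors' code: the function name, the cap values, and the power/runtime figures are hypothetical sample data, standing in for measurements one would collect while sweeping the GPU power limit (e.g. via `nvidia-smi -pl <watts>`).

```python
def tradeoff_metrics(runs, default_cap):
    """For each power cap, compute energy savings and performance drop
    relative to the default cap, plus a normalized time*energy product
    (lower is better)."""
    base = runs[default_cap]
    base_energy = base["power_w"] * base["time_s"]
    out = {}
    for cap, r in runs.items():
        energy = r["power_w"] * r["time_s"]
        out[cap] = {
            "energy_savings_pct": 100.0 * (1.0 - energy / base_energy),
            "perf_drop_pct": 100.0 * (r["time_s"] / base["time_s"] - 1.0),
            # normalized performance-energy product: (T/T0) * (E/E0)
            "norm_time_energy": (r["time_s"] / base["time_s"])
                                * (energy / base_energy),
        }
    return out

# Hypothetical measurements: mean power draw and runtime at three caps.
runs = {
    230: {"power_w": 215.0, "time_s": 100.0},   # default power cap
    180: {"power_w": 172.0, "time_s": 104.0},   # intermediate cap
    125: {"power_w": 121.0, "time_s": 125.0},   # minimal allowed cap
}
metrics = tradeoff_metrics(runs, default_cap=230)
best = min(metrics, key=lambda c: metrics[c]["norm_time_energy"])
print(best)
```

With these sample numbers the normalized product is minimized at the intermediate cap of 180 W, illustrating the paper's observation that the optimum often lies strictly between the minimal allowed limit and the default.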



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, Gdańsk, Poland
