Exploring Energy Efficiency for GPU-Accelerated POWER Servers

Hater, Thorsten; Anlauf, Benedikt; Baumeister, Paul; Bühler, Markus; Kraus, Jiri; Pleiter, Dirk

doi:10.1007/978-3-319-46079-6_15

Thorsten Hater¹⁶,
Benedikt Anlauf¹⁷,
Paul Baumeister¹⁶,
Markus Bühler¹⁷,
Jiri Kraus¹⁸ &
…
Dirk Pleiter¹⁶

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 9945))

Included in the following conference series:

International Conference on High Performance Computing

2401 Accesses

Abstract

Modern servers provide different features for managing the amount of energy that is needed to execute a given work-load. In this article we focus on a new generation of GPU-accelerated servers with POWER8 processors. For different scientific applications, which have in common that they have been written for massively-parallel computers, we measure energy-to-solution for different system configurations. By combining earlier developed performance models and a simple power model, we derive an energy model that can help to optimise for energy efficiency.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
https://github.com/open-power/amester.
2.
Comparison with nvidia-smi indicates an overhead of roughly 40 W measured with idle system.
3.
Less than 15 % of the mean value in pathological cases.
4.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
5.
Trademark of IBM in USA and/or other countries.
6.
We also performed fits with \(\gamma \) as a free parameter, where we found \(\gamma \simeq 3\).

References

Abraham, M.J., Murtola, T., Schulz, R., Páll, S., Smith, J.C., Hess, B., Lindahl, E.: GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1, 19–25 (2015)
Article Google Scholar
Alonso, P., Dolz, M.F., Mayo, R., Quintana-Ortí, E.S.: Modeling power and energy of the task-parallel Cholesky factorization on multicore processors. Comput. Sci. Res. Dev. 29(2), 105–112 (2012). doi:10.1007/s00450-012-0227-z
Article Google Scholar
Baumeister, P.F., Hater, T., Kraus, J., Pleiter, D., Wahl, P.: A performance model for GPU-accelerated FDTD applications. In: 2015 IEEE 22nd International Conference on High Performance Computing (HiPC), pp. 185–193, December 2015
Google Scholar
Beeby, J.: The density of electrons in a perfect or imperfect lattice. Proc. R. Soc. Lond. A Math. Phys. Eng. Sci. 302(1468), 113–136 (1967). The Royal Society
Article Google Scholar
Bilardi, G., Pietracaprina, A., Pucci, G., Schifano, F., Tripiccione, R.: The potential of on-chip multiprocessing for QCD machines. In: Bader, D.A., Parashar, M., Sridhar, V., Prasanna, V.K. (eds.) HiPC 2005. LNCS, vol. 3769, pp. 386–397. Springer, Heidelberg (2005). doi:10.1007/11602569_41
Chapter Google Scholar
Bui, V., Norris, B., Huck, K., McInnes, L.C., Li, L., Hernandez, O., Chapman, B.: A component infrastructure for performance and power modeling of parallel scientific applications. In: Proceedings of the 2008 compFrame/HPC-GECO Workshop on Component Based High Performance, CBHPC 2008, pp. 6:1–6:11. (2008). http://doi.acm.org/10.1145/1456190.1456199
Cabrera, A., Almeida, F., Blanco, V., Giménez, D.: Analytical modeling of the energy consumption for the High Performance Linpack. In: 2013 21st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), pp. 343–350, February 2013
Google Scholar
Caldeira, A.B., et al.: IBM Power System S824L technical overview and introduction (2014). http://www.redbooks.ibm.com/Redbooks.nsf/RedbookAbstracts/redp5139.html
David, H., Gorbatov, E., Hanebutte, U.R., Khanna, R., Le, C.: RAPL: memory power estimation and capping. In: 2010 ACM/IEEE International Symposium on Low-Power Electronics and Design (ISLPED), pp. 189–194, August 2010
Google Scholar
Demmel, J., Gearhart, A.: Instrumenting linear algebra energy consumption via on-chip energy counters. Technical report, UCB/EECS-2012-168, EECS Department, University of California, Berkeley, June 2012. http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-168.html
Dongarra, J., Ltaief, H., Luszczek, P., Weaver, V.M.: Energy footprint of advanced dense numerical linear algebra using tile algorithms on multicore architecture. In: The 2nd International Conference on Cloud and Green Computing, November 2012
Google Scholar
Eyerman, S., Eeckhout, L.: A counter architecture for online DVFS profitability estimation. IEEE Trans. Comput. 59(11), 1576–1583 (2010)
Article MathSciNet Google Scholar
Feng, W.C., et al.: Green500 list, November 2015. http://green500.org
Flinn, J., Satyanarayanan, M.: PowerScope: a tool for profiling the energy usage of mobile applications. In: Proceedings of the Second IEEE Workshop on Mobile Computer Systems and Applications, WMCSA 1999, p. 2 (1999). http://dl.acm.org/citation.cfm?id=520551.837522
Floyd, M., et al.: Introducing the adaptive energy management features of the POWER7 chip. IEEE Micro 31(2), 60–75 (2011)
Article Google Scholar
Freund, R.W., Nachtigal, N.: QMR: a quasi-minimal residual method for non-Hermitian linear systems. Numer. Math. 60(1), 315–339 (1991)
Article MathSciNet MATH Google Scholar
Friedrich, J., Le, H., Starke, W., Stuechli, J., Sinharoy, B., Fluhr, E., Dreps, D., Zyuban, V., Still, G., Gonzalez, C., Hogenmiller, D., Malgioglio, F., Nett, R., Puri, R., Restle, P., Shan, D., Deniz, Z., Wendel, D., Ziegler, M., Victor, D.: The POWER8\(^{\rm {TM}}\) processor: designed for big data, analytics, and cloud environments. In: 2014 IEEE International Conference on IC Design Technology (ICICDT), pp. 1–4, May 2014
Google Scholar
Ge, R., Feng, X., Song, S., Chang, H.C., Li, D., Cameron, K.W.: PowerPack: energy profiling and analysis of high-performance systems and applications. IEEE Trans. Parallel Distrib. Syst. 21(5), 658–671 (2010)
Article Google Scholar
Ghosh, S., Chandrasekaran, S., Chapman, B.: Statistical modeling of power/energy of scientific kernels on a multi-GPU system. In: 2013 International Green Computing Conference (IGCC), pp. 1–6, June 2013
Google Scholar
Hackenberg, D., Ilsche, T., Schuchart, J., Schöne, R., Nagel, W.E., Simon, M., Georgiou, Y.: HDEEM: high definition energy efficiency monitoring. In: Energy Efficient Supercomputing Workshop, E2SC 2014, pp. 1–10, November 2014
Google Scholar
Isci, C., Martonosi, M.: Runtime power monitoring in high-end processors: methodology and empirical data. In: 2003 Proceedings of the 36th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO-36, pp. 93–104, December 2003
Google Scholar
Klavík, P., Malossi, A.C.I., Bekas, C., Curioni, A.: Changing computing paradigms towards power efficiency. Philos. Trans. R. Soc. Lond. A: Math. Phys. Eng. Sci. 372(2018), 20130278 (2014)
Article Google Scholar
Knobloch, M., Foszczynski, M., Homberg, W., Pleiter, D., Böttiger, H.: Mapping fine-grained power measurements to HPC application runtime characteristics on IBM POWER7. Comput. Sci. Res. Dev. 29(3), 211–219 (2013). doi:10.1007/s00450-013-0245-5
Google Scholar
Kohn, W., Rostoker, N.: Solution of the Schrödinger equation in periodic lattices with an application to metallic Lithium. Phys. Rev. 94, 1111–1120 (1954)
Article MATH Google Scholar
Korringa, J.: On the calculation of the energy of a Bloch wave in a metal. Physica 13(6), 392–400 (1947)
Article MathSciNet Google Scholar
Kraus, J.: Increase performance with GPU boost and K80 autoboost (2014). https://devblogs.nvidia.com/parallelforall/increase-performance-gpu-boost-k80-autoboost/
Lee, K.H., Ahmed, I., Goh, R.S., Khoo, E.H., Li, E.P., Hung, T.G.: Implementation of the FDTD method based on Lorentz-Drude dispersive model on GPU for plasmonics applications. Prog. Electromagnet. Res. 116, 441–456 (2011)
Article Google Scholar
Lefurgy, C., Wang, X., Ware, M.: Server-level power control. In: 2007 Fourth International Conference on Autonomic Computing, ICAC 2007, p. 4, June 2007
Google Scholar
Lindahl, E.: Molecular simulation with GROMACS on CUDA GPUs (2013). http://on-demand.gputechconf.com/gtc/2013/webinar/gromacs-kepler-gpus-gtc-express-webinar.pdf
Pleiter, D.: Parallel computer architectures. In: 45th IFF Spring School 2014 “Computing Solids Models, ab-initio methods and supercomputing”. Schriften des Forschungszentrums Jülich, Reihe Schlüsseltechnologien, vol. 74 (2014)
Google Scholar
Rountree, B., Lowenthal, D.K., Schulz, M., de Supinski, B.R.: Practical performance prediction under dynamic voltage frequency scaling. In: 2011 International Green Computing Conference and Workshops (IGCC), pp. 1–8, July 2011
Google Scholar
Ryffel, S.: LEA\(^2\)P: the Linux energy attribution and accounting platform. Master’s thesis, Swiss Federal Institute of Technology (ETH) (2009). http://ftp.tik.ee.ethz.ch/pub/students/2009-FS/MA-2009-04.pdf
Shahmansouri, A., Rashidian, B.: GPU implementation of split-field finite-difference time-domain method for Drude-Lorentz dispersive media. Prog. Electromagnet. Res. 125, 55–77 (2012)
Article Google Scholar
Song, S., Su, C., Rountree, B., Cameron, K.W.: A simplified and accurate model of power-performance efficiency on emergent GPU architectures. In: 2013 IEEE 27th International Symposium on Parallel Distributed Processing (IPDPS), pp. 673–686, May 2013
Google Scholar
Song, S.L., Barker, K., Kerbyson, D.: Unified performance and power modeling of scientific workloads. In: Proceedings of the 1st International Workshop on Energy Efficient Supercomputing, E2SC 2013, pp. 4:1–4:8. (2013). http://doi.acm.org/10.1145/2536430.2536435
Subramaniam, B., Feng, W.C.: Statistical power and performance modeling for optimizing the energy efficiency of scientific computing. In: Green Computing and Communications (GreenCom), pp. 139–146, December 2010
Google Scholar
Taflove, A., Hagness, S.C.: Others: Computational Electrodynamics: The Finite-Difference Time-Domain Method. Artech House, Norwood (1995)
MATH Google Scholar
Tan, L., Kothapalli, S., Chen, L., Hussaini, O., Bissiri, R., Chen, Z.: A survey of power and energy efficient techniques for high performance numerical linear algebra operations. Parallel Comput. 40(10), 559–573 (2014)
Article MathSciNet Google Scholar
Thiess, A., et al.: Massively parallel density functional calculations for thousands of atoms: KKRnano. Phys. Rev. B 85, 235103 (2012)
Article Google Scholar
Tiwari, A., Laurenzano, M.A., Carrington, L., Snavely, A.: Modeling power and energy usage of HPC kernels. In: 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops PhD Forum (IPDPSW), pp. 990–998, May 2012
Google Scholar
Wahl, P., Ly-Gagnon, D., Debaes, C., Miller, D., Thienpont, H.: B-CALM: an open-source GPU-based 3D-FDTD with multi-pole dispersion for plasmonics. In: 2011 11th International Conference on Numerical Simulation of Optoelectronic Devices (NUSOD), pp. 11–12, September 2011
Google Scholar
Weaver, V.M., Johnson, M., Kasichayanula, K., Ralph, J., Luszczek, P., Terpstra, D., Moore, S.: Measuring energy and power with PAPI. In: 2012 41st International Conference on Parallel Processing Workshops (ICPPW), pp. 262–268, September 2012
Google Scholar
Wittmann, M., Hager, G., Zeiser, T., Treibig, J., Wellein, G.: Chip-level and multi-node analysis of energy-optimized lattice Boltzmann CFD simulations. Concur. Comput. Pract. Exper. 28, 2295–2315 (2016). doi:10.1002/cpe.3489
Article Google Scholar
Wu, G., Greathouse, J.L., Lyashevsky, A., Jayasena, N., Chiou, D.: GPGPU performance and power estimation using machine learning. In: 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA), pp. 564–576, February 2015
Google Scholar
Zyuban, V., Taylor, S.A., Christensen, B., Hall, A.R., Gonzalez, C.J., Friedrich, J., Clougherty, F., Tetzloff, J., Rao, R.: IBM POWER7+ design for higher frequency at fixed power. IBM J. Res. Dev. 57(6), 1:1–1:18 (2013)
Article Google Scholar

Download references

Acknowledgements

This work has been carried out in the context of the POWER Acceleration and Design Center, a joined project between IBM, Forschungszentrum Jülich and NVIDIA. We acknowledge generous support from IBM by providing early access to GPU-accelerated POWER8 systems.

Author information

Authors and Affiliations

Forschungszentrum Jülich, JSC, 52425, Jülich, Germany
Thorsten Hater, Paul Baumeister & Dirk Pleiter
IBM Deutschland Research and Development GmbH, 71032, Böblingen, Germany
Benedikt Anlauf & Markus Bühler
NVIDIA GmbH, Würselen, Germany
Jiri Kraus

Authors

Thorsten Hater
View author publications
You can also search for this author in PubMed Google Scholar
Benedikt Anlauf
View author publications
You can also search for this author in PubMed Google Scholar
Paul Baumeister
View author publications
You can also search for this author in PubMed Google Scholar
Markus Bühler
View author publications
You can also search for this author in PubMed Google Scholar
Jiri Kraus
View author publications
You can also search for this author in PubMed Google Scholar
Dirk Pleiter
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Dirk Pleiter .

Editor information

Editors and Affiliations

University of Delaware, Newark, Delaware, USA
Michela Taufer
Forschungszentrum Jülich, Jülich, Germany
Bernd Mohr
DKRZ, Hamburg, Germany
Julian M. Kunkel

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Hater, T., Anlauf, B., Baumeister, P., Bühler, M., Kraus, J., Pleiter, D. (2016). Exploring Energy Efficiency for GPU-Accelerated POWER Servers. In: Taufer, M., Mohr, B., Kunkel, J. (eds) High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science(), vol 9945. Springer, Cham. https://doi.org/10.1007/978-3-319-46079-6_15

Download citation

DOI: https://doi.org/10.1007/978-3-319-46079-6_15
Published: 06 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46078-9
Online ISBN: 978-3-319-46079-6
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics