© 2013

Energy-Efficient High Performance Computing

Measurement and Tuning

Book

Part of the SpringerBriefs in Computer Science book series (BRIEFSCOMPUTER)

Table of contents

  1. Front Matter
    Pages i-xiv
  2. James H. Laros III, Kevin Pedretti, Suzanne M. Kelly, Wei Shu, Kurt Ferreira, John Van Dyke et al.
    Pages 1-4
  3. James H. Laros III, Kevin Pedretti, Suzanne M. Kelly, Wei Shu, Kurt Ferreira, John Van Dyke et al.
    Pages 5-9
  4. James H. Laros III, Kevin Pedretti, Suzanne M. Kelly, Wei Shu, Kurt Ferreira, John Van Dyke et al.
    Pages 11-16
  5. James H. Laros III, Kevin Pedretti, Suzanne M. Kelly, Wei Shu, Kurt Ferreira, John Van Dyke et al.
    Pages 17-19
  6. James H. Laros III, Kevin Pedretti, Suzanne M. Kelly, Wei Shu, Kurt Ferreira, John Van Dyke et al.
    Pages 21-30
  7. James H. Laros III, Kevin Pedretti, Suzanne M. Kelly, Wei Shu, Kurt Ferreira, John Van Dyke et al.
    Pages 31-42
  8. James H. Laros III, Kevin Pedretti, Suzanne M. Kelly, Wei Shu, Kurt Ferreira, John Van Dyke et al.
    Pages 43-49
  9. James H. Laros III, Kevin Pedretti, Suzanne M. Kelly, Wei Shu, Kurt Ferreira, John Van Dyke et al.
    Pages 51-55
  10. James H. Laros III, Kevin Pedretti, Suzanne M. Kelly, Wei Shu, Kurt Ferreira, John Van Dyke et al.
    Pages 57-59
  11. Back Matter
    Pages 61-67

About this book

Introduction

Recognition of the importance of power and energy in the field of high performance computing (HPC) has never been greater. Research has been conducted in a number of areas related to power and energy, but little existing research has focused on large-scale HPC. Part of the reason is the lack of measurement capability currently available on small or large platforms. Typically, research is conducted using coarse methods of measurement, such as inserting a power meter between the power source and the platform, or fine-grained measurements using custom instrumented boards (with obvious limitations in scale). To analyze real scientific computing applications at large scale, an in situ measurement capability that scales to the size of the platform is necessary.

In response to this challenge, the unique power measurement capabilities of the Cray XT architecture were exploited to gain an understanding of power and energy use and the effects of tuning both CPU and network bandwidth. Modifications were made at the operating system level to deterministically halt cores when idle. Additionally, capabilities were added to alter the operating P-state. At the application level, an understanding of the power requirements of a range of important DOE/NNSA production scientific computing applications running at large scale (thousands of nodes) is gained by simultaneously collecting current and voltage measurements on the hosting nodes. The effects of both CPU and network bandwidth tuning are examined, and energy savings opportunities of up to 39% with little or no impact on run-time performance are demonstrated. Capturing scale effects was key. This research provides strong evidence that next generation large-scale platforms should not only approach CPU frequency scaling differently, as we will demonstrate, but could also benefit from the capability to tune other platform components, such as the network, to achieve more energy-efficient performance.
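
As a rough illustration of how per-node current and voltage samples can be turned into an energy figure, the sketch below integrates instantaneous power (P = V x I) over time with the trapezoidal rule and compares two runs. It is not the book's measurement code: the function names, the sampling layout, and all numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Sequence


def energy_joules(times_s: Sequence[float],
                  volts: Sequence[float],
                  amps: Sequence[float]) -> float:
    """Integrate P = V * I over time with the trapezoidal rule.

    Hypothetical helper: assumes one (time, voltage, current) sample
    per index, with times in seconds and strictly increasing.
    """
    if not (len(times_s) == len(volts) == len(amps)):
        raise ValueError("sample sequences must have the same length")
    # Instantaneous power in watts at each sample point.
    power_w = [v * i for v, i in zip(volts, amps)]
    energy = 0.0
    for k in range(1, len(times_s)):
        dt = times_s[k] - times_s[k - 1]
        # Area of the trapezoid between two consecutive samples.
        energy += 0.5 * (power_w[k] + power_w[k - 1]) * dt
    return energy


def percent_savings(baseline_j: float, tuned_j: float) -> float:
    """Energy saved by the tuned run, as a percentage of the baseline."""
    if baseline_j <= 0:
        raise ValueError("baseline energy must be positive")
    return 100.0 * (baseline_j - tuned_j) / baseline_j


# Made-up samples for a single node: constant 12 V at 2 A for 2 seconds.
baseline = energy_joules([0.0, 1.0, 2.0], [12.0, 12.0, 12.0], [2.0, 2.0, 2.0])
# Same duration at a hypothetical lower current draw after tuning.
tuned = energy_joules([0.0, 1.0, 2.0], [12.0, 12.0, 12.0], [1.5, 1.5, 1.5])
savings = percent_savings(baseline, tuned)
```

A comparison of this kind is only meaningful alongside the run time of each configuration, since a tuned run that draws less power but runs much longer may save little or no energy.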

Keywords

Energy Efficiency; High Performance Computing (HPC); Lightweight Kernel; Network Bandwidth Tuning; Operating Systems; Power; Reliability, Availability and Serviceability (RAS)

Authors and affiliations

  1. Sandia National Laboratories, Albuquerque, USA
  2. Sandia National Laboratories, Albuquerque, USA
  3. Sandia National Laboratories, Albuquerque, USA
  4. Electrical and Computer Engineering Dept., University of New Mexico, Albuquerque, USA
  5. Sandia National Laboratories, Albuquerque, USA
  6. Sandia National Laboratories, Albuquerque, USA
  7. Sandia National Laboratories, Albuquerque, USA

Bibliographic information

  • Book Title: Energy-Efficient High Performance Computing
  • Book Subtitle: Measurement and Tuning
  • Authors: James H. Laros III
    Kevin Pedretti
    Suzanne M. Kelly
    Wei Shu
    Kurt Ferreira
    John Van Dyke
    Courtenay Vaughan
  • Series Title: SpringerBriefs in Computer Science
  • DOI: https://doi.org/10.1007/978-1-4471-4492-2
  • Copyright Information: James H. Laros III 2013
  • Publisher Name: Springer, London
  • eBook Packages: Computer Science; Computer Science (R0)
  • Softcover ISBN: 978-1-4471-4491-5
  • eBook ISBN: 978-1-4471-4492-2
  • Series ISSN: 2191-5768
  • Series E-ISSN: 2191-5776
  • Edition Number: 1
  • Number of Pages: XIV, 67
  • Number of Illustrations: 11 b/w illustrations, 8 illustrations in colour
  • Topics: Computer Communication Networks
    Performance and Reliability
    Operating Systems