
Cluster Computing, Volume 12, Issue 3, pp. 299–308

Power and environment aware control of Beowulf clusters

  • Fengping Hu
  • Jeffrey J. Evans

Abstract

Beowulf clusters are now deployed worldwide, chiefly in support of scientific computing. They yield high computing performance, yet they also pose several challenges: (1) heat-induced hardware failure makes large-scale commodity clusters fail quite frequently, and (2) the cost effectiveness of a Beowulf cluster is limited by its lack of a means to adapt its power state to varying workload. This paper addresses these issues by developing a Power and Environment Awareness Module (PEAM) for a Beowulf cluster. The bursty nature of computational load in an academic environment motivated the implementation and analysis of a fixed-timeout Dynamic Power Management (DPM) policy. Because many Beowulf clusters in academic environments are composed of older, recycled nodes that may lack out-of-band management technologies, Advanced Configuration and Power Interface (ACPI) and Wake-on-LAN (WOL) technologies are exploited to control the power state of cluster nodes. A data center environment monitoring system based on Wireless Sensor Network (WSN) technology is developed and deployed to make the cluster environment aware. Our PEAM module has been implemented on our cluster at Purdue University, reducing operational cost and increasing reliability by reducing heat generation and distributing workload in an environment-aware manner.
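The abstract names two standard mechanisms: a fixed-timeout DPM policy (suspend a node once it has sat idle longer than a fixed threshold, wake it when work arrives) and Wake-on-LAN (wake a powered-down node by broadcasting a "magic packet": six bytes of 0xFF followed by sixteen copies of the target MAC address). The Python sketch below is a minimal illustration of how the two might be paired; it is not the authors' PEAM implementation, and the node record, its fields (is_up, idle_since, has_pending_jobs, mac), its suspend() method, and the timeout value are all hypothetical.

    import socket

    # Hypothetical fixed idle timeout; the paper's actual value is not stated here.
    IDLE_TIMEOUT_S = 15 * 60

    def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
        """Broadcast a Wake-on-LAN magic packet: 6 bytes of 0xFF followed
        by 16 repetitions of the target node's MAC address."""
        mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
        if len(mac_bytes) != 6:
            raise ValueError(f"invalid MAC address: {mac!r}")
        payload = b"\xff" * 6 + mac_bytes * 16
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
            sock.sendto(payload, (broadcast, port))

    def dpm_step(node, now: float) -> None:
        """One pass of a fixed-timeout DPM policy over a single node.
        `node` is a hypothetical record with is_up, idle_since,
        has_pending_jobs, mac, and a suspend() method (e.g. an ACPI
        sleep request issued through the node's operating system)."""
        if node.is_up and now - node.idle_since > IDLE_TIMEOUT_S:
            node.suspend()                # power down after the fixed idle timeout
        elif not node.is_up and node.has_pending_jobs:
            send_magic_packet(node.mac)   # wake the node on demand via WOL

The central tuning decision in a fixed-timeout policy is the timeout itself: set too short, nodes thrash between power states under bursty load; set too long, idle nodes keep drawing power.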

Keywords

Cluster · Dynamic power management · Efficiency · High performance computing · Reliability



Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Purdue University, West Lafayette, USA
