Energy Proportionality and Performance in Data Parallel Computing Clusters

  • Conference paper
Scientific and Statistical Database Management (SSDBM 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6809)

Abstract

Energy consumption in datacenters has recently become a major concern due to rising operational costs and scalability issues. Recent solutions to this problem propose the principle of energy proportionality, i.e., the amount of energy consumed by the server nodes must be proportional to the amount of work performed. For data parallelism and fault tolerance purposes, the file systems most commonly used in MapReduce-type clusters maintain a set of replicas for each data block. A covering set is a group of nodes that together contain at least one replica of each data block needed for performing computing tasks. In this work, we develop and analyze algorithms to maintain energy proportionality by discovering a covering set that minimizes energy consumption while placing the remaining nodes in low-power standby mode. Our algorithms can also discover covering sets in heterogeneous computing environments. In order to allow more data parallelism, we generalize our algorithms so that they can discover k-covering sets, i.e., sets of nodes that together contain at least k replicas of each data block. Our experimental results show that we can achieve substantial energy savings without significant performance loss in diverse cluster configurations and working environments.
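The covering-set idea in the abstract can be illustrated with a small sketch. The code below is not the paper's algorithm, which this page does not show. It is the classic greedy set-cover heuristic, extended to require k replicas per block and weighted by per-node power so that it also handles heterogeneous nodes. The function name `greedy_k_cover`, the node names, the block identifiers, and the power figures are all hypothetical, chosen only for illustration.

```python
from typing import Dict, Set


def greedy_k_cover(
    replicas: Dict[str, Set[int]],  # node -> ids of the blocks it stores
    power: Dict[str, float],        # node -> active power draw (assumed units: watts)
    blocks: Set[int],               # blocks the pending tasks need
    k: int = 1,                     # replicas of each block that must stay online
) -> Set[str]:
    """Greedy heuristic (an assumption, not the authors' method):
    repeatedly activate the node that supplies the most still-needed
    replicas per unit of power, until every block has k active replicas."""
    need = {b: k for b in blocks}   # replicas still missing per block
    active: Set[str] = set()
    while any(n > 0 for n in need.values()):
        best, best_ratio = None, 0.0
        for node, held in replicas.items():
            if node in active:
                continue
            gain = sum(1 for b in held if need.get(b, 0) > 0)
            if gain == 0:
                continue
            ratio = gain / power[node]
            if ratio > best_ratio:
                best, best_ratio = node, ratio
        if best is None:
            raise ValueError("not enough replicas to build a k-covering set")
        active.add(best)
        for b in replicas[best]:
            if need.get(b, 0) > 0:
                need[b] -= 1
    return active


# Hypothetical 4-node cluster; nodes outside the result could go to standby.
replicas = {"n1": {1, 2, 3}, "n2": {3, 4}, "n3": {1, 4}, "n4": {2, 4}}
power = {"n1": 100.0, "n2": 100.0, "n3": 100.0, "n4": 100.0}
print(sorted(greedy_k_cover(replicas, power, {1, 2, 3, 4}, k=1)))
print(sorted(greedy_k_cover(replicas, power, {1, 2, 3, 4}, k=2)))
```

Raising k keeps more nodes active, which trades energy savings for data parallelism. A greedy choice like this is only a heuristic and does not guarantee a minimum-energy covering set.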

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kim, J., Chou, J., Rotem, D. (2011). Energy Proportionality and Performance in Data Parallel Computing Clusters. In: Bayard Cushing, J., French, J., Bowers, S. (eds) Scientific and Statistical Database Management. SSDBM 2011. Lecture Notes in Computer Science, vol 6809. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22351-8_26

  • DOI: https://doi.org/10.1007/978-3-642-22351-8_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22350-1

  • Online ISBN: 978-3-642-22351-8

  • eBook Packages: Computer Science, Computer Science (R0)
