An Energy-Efficient Greedy MapReduce Scheduler for Heterogeneous Hadoop YARN Cluster

  • Vaibhav Pandey
  • Poonam Saini
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11297)

Abstract

Energy efficiency of a MapReduce system has become an essential part of infrastructure management in the field of big data analytics. The Hadoop scheduler plays a vital role in ensuring the energy efficiency of the system. A handful of MapReduce scheduling algorithms have been proposed in the literature for slot-based Hadoop systems (i.e., Hadoop 0.x and Hadoop 1.x) to minimize overall energy consumption. However, schedulers for YARN-based Hadoop have received little attention in the literature. In this paper, we design a scheduling model for the Hadoop YARN architecture and formulate the energy-efficient scheduling problem as an Integer Program. To solve the problem, we propose a Greedy scheduler which, in each iteration, selects the job with minimum energy consumption. We evaluate the performance of the proposed algorithm against the FAIR and Capacity schedulers and find that our greedy scheduler shows better results for both CPU- and I/O-intensive workloads.
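The greedy selection rule described above can be illustrated with a minimal sketch. This is not the algorithm from the paper: the `Job` fields, the per-job energy estimates, and the single container budget are all hypothetical assumptions made for illustration, and the heterogeneity of nodes is ignored. In each iteration the sketch picks, among the pending jobs that still fit, the one with the lowest estimated energy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    containers: int        # hypothetical: containers the job needs
    energy_joules: float   # hypothetical: estimated energy if scheduled now


def greedy_schedule(jobs, free_containers):
    """Repeatedly pick the pending job with the lowest estimated energy that fits.

    Returns (scheduled job names in selection order, deferred job names).
    """
    pending = list(jobs)
    order = []
    while pending:
        # Only jobs whose container demand fits the remaining budget are candidates.
        fitting = [j for j in pending if j.containers <= free_containers]
        if not fitting:
            break
        best = min(fitting, key=lambda j: j.energy_joules)
        order.append(best.name)
        free_containers -= best.containers
        pending.remove(best)
    return order, [j.name for j in pending]


# Illustrative values only; they are not measurements from the paper.
jobs = [
    Job("wordcount", containers=4, energy_joules=900.0),
    Job("terasort", containers=6, energy_joules=1500.0),
    Job("grep", containers=2, energy_joules=300.0),
]
scheduled, deferred = greedy_schedule(jobs, free_containers=8)
print(scheduled, deferred)
```

A real YARN scheduler would re-estimate energy as containers are released and would account for per-node power profiles; the sketch keeps the estimates fixed to make the selection step easy to follow.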

Keywords

MapReduce, Scheduling, Energy-efficiency

Notes

Acknowledgment

The authors would like to thank the Ministry of Electronics and IT, Govt. of India, for providing financial support to perform this work under the Visvesvaraya Ph.D. scheme.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Department of CSE, Punjab Engineering College (Deemed to be University), Chandigarh, India
