Apache Hadoop Yarn MapReduce Job Classification Based on CPU Utilization and Performance Evaluation on Multi-cluster Heterogeneous Environment

Mathiya, Bhavin J.; Desai, Vinodkumar L.

doi:10.1007/978-981-10-0129-1_4

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 408)

Abstract

Sources such as Yahoo, Facebook, mobile devices, sensors, and scientific instruments now generate huge amounts of data, and storing, managing, processing, and analyzing this data is a challenge. Apache Hadoop Yarn is a framework that provides a solution for big data. In this paper, we evaluate the performance of Apache Hadoop Yarn MapReduce jobs such as Pi, TeraGen, TeraSort, and Wordcount on a single cluster node. After this evaluation, the jobs are classified by CPU utilization (%) into classes such as low CPU intensive and high CPU intensive. Based on this classification, the Apache Hadoop Yarn MapReduce jobs are then executed on a multi-cluster environment and their performance is evaluated. We find that execution time increased for the low CPU intensive job and decreased for the high CPU intensive job. Total CPU time decreased for both the low and the high CPU intensive jobs. In addition, CPU utilization decreased for the low CPU intensive job and increased for the high CPU intensive job as the number of nodes increased.
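
The classification step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state how CPU utilization was computed or which cut-off separates the classes, so the formula (total CPU time divided by available core-seconds), the 50% threshold, and the names `JobRun`, `cpu_utilization_pct`, and `classify` are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical cut-off: the abstract does not give the threshold the authors used.
HIGH_CPU_THRESHOLD_PCT = 50.0


@dataclass
class JobRun:
    """Measurements for one job execution (field names are illustrative)."""
    name: str
    cpu_time_s: float  # total CPU time consumed by the job, in seconds
    elapsed_s: float   # wall-clock execution time, in seconds
    cores: int         # number of CPU cores available to the job


def cpu_utilization_pct(run: JobRun) -> float:
    """CPU utilization as a percentage of available core-seconds (assumed definition)."""
    if run.elapsed_s <= 0 or run.cores <= 0:
        raise ValueError("elapsed_s and cores must be positive")
    return 100.0 * run.cpu_time_s / (run.elapsed_s * run.cores)


def classify(run: JobRun, threshold_pct: float = HIGH_CPU_THRESHOLD_PCT) -> str:
    """Label a job as high or low CPU intensive relative to the assumed threshold."""
    if cpu_utilization_pct(run) >= threshold_pct:
        return "high CPU intensive"
    return "low CPU intensive"
```

With measurements collected from a real run, each job would be passed through `classify` once on the single node; the threshold would need to be chosen from the observed utilization values rather than taken from this sketch.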

Author information

Authors and Affiliations

C.U. Shah University, Wadhwan City, Gujarat, India
Bhavin J. Mathiya
Department of Computer Science, Government Science College, Chikhli, Navsari, Gujarat, India
Vinodkumar L. Desai

Corresponding author

Correspondence to Bhavin J. Mathiya.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Anil Neerukonda Institute of Technology and Sciences, Visakhapatnam, Andhra Pradesh, India
Suresh Chandra Satapathy
Sabar Institute of Technology, Sabarkantha, Gujarat, India
Amit Joshi
Narsinhbhai Institute of Computer Studies and Management, Kadi, Gujarat, India
Nilesh Modi
Narsinhbhai Institute of Computer Studies and Management, Kadi, Gujarat, India
Nisarg Pathak

About this paper

Cite this paper

Mathiya, B.J., Desai, V.L. (2016). Apache Hadoop Yarn MapReduce Job Classification Based on CPU Utilization and Performance Evaluation on Multi-cluster Heterogeneous Environment. In: Satapathy, S., Joshi, A., Modi, N., Pathak, N. (eds) Proceedings of International Conference on ICT for Sustainable Development. Advances in Intelligent Systems and Computing, vol 408. Springer, Singapore. https://doi.org/10.1007/978-981-10-0129-1_4

DOI: https://doi.org/10.1007/978-981-10-0129-1_4
Published: 11 February 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-0127-7
Online ISBN: 978-981-10-0129-1
eBook Packages: Engineering, Engineering (R0)