
Enhancing the Performance of MapReduce Default Scheduler by Detecting Prolonged TaskTrackers in Heterogeneous Environments

  • Conference paper
Proceedings of the Second International Conference on Computer and Communication Technologies

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 380)


Abstract

MapReduce is now a significant parallel processing model for large-scale, data-intensive applications running on clusters of commodity hardware. The scheduling of jobs and tasks and the identification of slow TaskTrackers in Hadoop clusters have been a focus of research in recent years. MapReduce performance is currently limited by its default scheduler, which does not adapt well to heterogeneous environments. In this paper, we propose a scheduling method that identifies the TaskTrackers running slowly in the map and reduce phases of the MapReduce framework in a heterogeneous Hadoop cluster. The proposed method is integrated with the MapReduce default scheduling algorithm, and its performance is compared with that of the unmodified default scheduler. We observe that the proposed approach improves on the default scheduler in heterogeneous environments: overall job execution times were reduced for different workloads from the HiBench benchmark suite.
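The abstract does not state the rule used to decide that a TaskTracker is slow, so the following is only an illustrative sketch of one plausible heuristic, not the authors' method. It assumes each TaskTracker reports per-task progress rates (progress per second) separately for the map and reduce phases, and it flags a tracker when its mean rate falls below a chosen fraction of the cluster-wide mean. The function name `find_slow_trackers`, the `slow_fraction` parameter, and the sample data are all hypothetical.

```python
from statistics import mean


def find_slow_trackers(rates_by_tracker, slow_fraction=0.5):
    """Return the sorted names of trackers whose mean progress rate is
    below slow_fraction times the mean of all per-tracker means.

    rates_by_tracker maps a tracker name to a list of task progress
    rates for one phase. The threshold rule is an assumption made for
    illustration; the paper's actual criterion is not given here.
    """
    # Average each tracker's task rates, skipping trackers with no tasks.
    tracker_means = {
        name: mean(rates)
        for name, rates in rates_by_tracker.items()
        if rates
    }
    if not tracker_means:
        return []
    cluster_mean = mean(tracker_means.values())
    return sorted(
        name
        for name, tracker_mean in tracker_means.items()
        if tracker_mean < slow_fraction * cluster_mean
    )


# Hypothetical progress rates for three trackers, evaluated per phase.
map_rates = {"tt1": [0.10, 0.12], "tt2": [0.11, 0.09], "tt3": [0.02, 0.03]}
reduce_rates = {"tt1": [0.05], "tt2": [0.01], "tt3": [0.05]}

slow_in_map = find_slow_trackers(map_rates)
slow_in_reduce = find_slow_trackers(reduce_rates)
print(slow_in_map, slow_in_reduce)
```

Evaluating the two phases separately reflects the abstract's point that a tracker may be slow in the map phase, the reduce phase, or both; a scheduler could then avoid assigning new tasks of that phase to the flagged trackers.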

Acknowledgments

Nenavath Srinivas Naik expresses his gratitude to Prof. P.A. Sastry (Principal), Prof. J. Prasanna Kumar (Head of the CSE Department), and Dr. B. Sandhya, MVSR Engineering College, Hyderabad, India for hosting the experimental test bed.

Author information

Corresponding author

Correspondence to Nenavath Srinivas Naik.

Copyright information

© 2016 Springer India

About this paper

Cite this paper

Naik, N.S., Negi, A., Sastry, V.N. (2016). Enhancing the Performance of MapReduce Default Scheduler by Detecting Prolonged TaskTrackers in Heterogeneous Environments. In: Satapathy, S., Raju, K., Mandal, J., Bhateja, V. (eds) Proceedings of the Second International Conference on Computer and Communication Technologies. Advances in Intelligent Systems and Computing, vol 380. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2523-2_21

  • DOI: https://doi.org/10.1007/978-81-322-2523-2_21

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-2522-5

  • Online ISBN: 978-81-322-2523-2

  • eBook Packages: Engineering, Engineering (R0)
