A New Scheduling Algorithm in Hadoop MapReduce

Peng, Zhiping; Ma, Yanchun

doi:10.1007/978-3-642-24282-3_74


Part of the book series: Communications in Computer and Information Science (CCIS, volume 237)

Included in the following conference series:

International Conference on Artificial Intelligence and Computational Intelligence

Abstract

The core concept of cloud computing is the resource pool. Hadoop MapReduce is a software framework for writing applications that process vast amounts of data in parallel on large clusters of commodity hardware in a reliable, fault-tolerant manner. A MapReduce job usually splits the input data set into independent chunks, which are processed by the map tasks in a completely parallel manner. We distribute the total slots among jobs according to P_i, the percentage of all unfulfilled tasks that belong to job i. Because P_i is larger for a large job, the large job is allocated more slots, which clearly improves its response time. This new scheduling algorithm can improve system performance in terms of throughput and response time.
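The proportional rule in the abstract can be illustrated with a minimal sketch. The abstract gives only the definition of P_i, so everything else here is an assumption made for illustration: the function name `allocate_slots`, the job names, and in particular the largest-remainder rounding used to turn fractional shares into whole slots. The sketch also ignores map/reduce slot types, data locality, and any cap on slots per job.

```python
def allocate_slots(unfulfilled, total_slots):
    """Split total_slots among jobs in proportion to P_i.

    unfulfilled maps a job name to its number of unfulfilled tasks, so
    P_i = unfulfilled[i] / sum(unfulfilled.values()).
    Rounding to whole slots (largest remainder) is an assumption;
    the abstract does not say how fractional shares are handled.
    """
    total_tasks = sum(unfulfilled.values())
    if total_tasks == 0:
        # No pending work anywhere: nothing to allocate.
        return {job: 0 for job in unfulfilled}

    # Integer arithmetic avoids floating-point rounding surprises.
    # Floor of total_slots * P_i for each job.
    alloc = {job: total_slots * u // total_tasks
             for job, u in unfulfilled.items()}

    # Hand leftover slots to the jobs with the largest fractional part.
    leftover = total_slots - sum(alloc.values())
    by_remainder = sorted(
        unfulfilled,
        key=lambda job: total_slots * unfulfilled[job] % total_tasks,
        reverse=True,
    )
    for job in by_remainder[:leftover]:
        alloc[job] += 1
    return alloc


# Hypothetical workload: the large job holds 60% of the pending tasks,
# so it receives 60% of the 20 slots.
print(allocate_slots({"small": 10, "medium": 30, "large": 60}, 20))
# prints {'small': 2, 'medium': 6, 'large': 12}
```

In this example P_i is 0.1, 0.3, and 0.6 for the three jobs, so the large job's share of slots grows with its backlog, which is the effect the abstract attributes to the algorithm.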

Author information

Authors and Affiliations

Computer and Electronic Information College, Guangdong University of Petrochemical Technology, Maoming, China
Zhiping Peng
School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang, China
Yanchun Ma

Editor information

Editors and Affiliations

School of Business Information technology, RMIT University, GPO Box 2476V, 3000, Melbourne, Victoria, Australia
Hepu Deng
Department of Computer Science and Technology, Tongji University, Shanghai, China
Duoqian Miao
Caritas Institute of Higher Education, 18 Chui Ling Road, Tseung Kwan, Hong Kong, SAR, China
Fu Lee Wang
School of Computer and Information Engineering, Shanghai University of Electric Power, 200090, Shanghai, China
Jingsheng Lei

About this paper

Cite this paper

Peng, Z., Ma, Y. (2011). A New Scheduling Algorithm in Hadoop MapReduce. In: Deng, H., Miao, D., Wang, F.L., Lei, J. (eds) Emerging Research in Artificial Intelligence and Computational Intelligence. AICI 2011. Communications in Computer and Information Science, vol 237. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24282-3_74

DOI: https://doi.org/10.1007/978-3-642-24282-3_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24281-6
Online ISBN: 978-3-642-24282-3
eBook Packages: Computer Science, Computer Science (R0)
