Abstract
The core concept of cloud computing is the resource pool. Hadoop MapReduce is a software framework for writing applications that process vast amounts of data in parallel on large clusters of commodity hardware in a reliable, fault-tolerant manner. A MapReduce job usually splits the input data set into independent chunks, which the map tasks process in a completely parallel manner. We distribute the total slots according to P_i, the percentage of a job's unfulfilled tasks among the total unfulfilled tasks of all jobs. Since P_i is larger for a large job, the large job is allocated more slots, which improves the response time of large jobs. This new scheduling algorithm can improve system performance measures such as throughput and response time.
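The proportional share described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `allocate_slots`, the job names, and the largest-remainder rule for handing out slots left over after rounding down are all assumptions made for illustration, since the abstract does not say how fractional shares are resolved.

```python
def allocate_slots(unfulfilled_tasks, total_slots):
    """Split total_slots among jobs in proportion to each job's share
    of the total unfulfilled tasks (the proportional share from the abstract).

    unfulfilled_tasks: dict mapping a job name to its count of unfulfilled tasks.
    Returns a dict mapping each job name to a whole number of slots.
    """
    total_unfulfilled = sum(unfulfilled_tasks.values())
    if total_unfulfilled == 0:
        # Nothing left to run, so no job needs a slot.
        return {job: 0 for job in unfulfilled_tasks}

    # Integer arithmetic avoids floating-point rounding surprises.
    # floor(total_slots * n_i / total_unfulfilled) for every job i.
    allocation = {
        job: (total_slots * n) // total_unfulfilled
        for job, n in unfulfilled_tasks.items()
    }

    # Assumption (not from the source): slots lost to rounding down go to
    # the jobs with the largest fractional remainders.
    leftover = total_slots - sum(allocation.values())
    by_remainder = sorted(
        unfulfilled_tasks,
        key=lambda job: (total_slots * unfulfilled_tasks[job]) % total_unfulfilled,
        reverse=True,
    )
    for job in by_remainder[:leftover]:
        allocation[job] += 1
    return allocation


if __name__ == "__main__":
    # Hypothetical workload: the large job holds 60% of the unfulfilled tasks.
    jobs = {"small": 10, "medium": 30, "large": 60}
    print(allocate_slots(jobs, total_slots=20))
```

With these hypothetical numbers the large job holds 60 of the 100 unfulfilled tasks, so it receives 12 of the 20 slots. The shares would need to be recomputed as tasks complete, because each job's fraction of the remaining work changes over time.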
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Peng, Z., Ma, Y. (2011). A New Scheduling Algorithm in Hadoop MapReduce. In: Deng, H., Miao, D., Wang, F.L., Lei, J. (eds) Emerging Research in Artificial Intelligence and Computational Intelligence. AICI 2011. Communications in Computer and Information Science, vol 237. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24282-3_74
DOI: https://doi.org/10.1007/978-3-642-24282-3_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24281-6
Online ISBN: 978-3-642-24282-3
eBook Packages: Computer Science, Computer Science (R0)