
Part of the book series: Communications in Computer and Information Science (CCIS, volume 237)

Abstract

The core concept of cloud computing is the resource pool. Hadoop MapReduce is a software framework for writing applications that process vast amounts of data in parallel on large clusters of commodity hardware in a reliable, fault-tolerant manner. A MapReduce job usually splits the input data set into independent chunks, which the map tasks process in a completely parallel manner. We distribute the total slots according to P_i, the percentage of all unfulfilled tasks that belong to job i. Because P_i is larger for a large job, the large job is allocated more slots, which improves the response time of large jobs. The new scheduling algorithm can improve system performance measures such as throughput and response time.
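
The proportional rule described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `allocate_slots`, the dictionary input, and the largest-remainder rounding used to hand out leftover slots are all assumptions made for illustration, since the abstract does not say how fractional shares are rounded.

```python
def allocate_slots(total_slots, unfulfilled):
    """Split total_slots among jobs in proportion to their unfulfilled tasks.

    unfulfilled maps a job name to its number of unfulfilled tasks.
    The share of job i corresponds to P_i in the abstract. The
    largest-remainder rounding below is an assumption, not from the paper.
    """
    total_unfulfilled = sum(unfulfilled.values())
    if total_unfulfilled == 0:
        # No pending work anywhere, so no slots are handed out.
        return {job: 0 for job in unfulfilled}

    # Integer floor of total_slots * P_i for each job (avoids float error).
    alloc = {job: total_slots * n // total_unfulfilled
             for job, n in unfulfilled.items()}

    # Hand the slots lost to rounding down to the jobs with the
    # largest fractional remainders.
    leftover = total_slots - sum(alloc.values())
    by_remainder = sorted(unfulfilled,
                          key=lambda job: total_slots * unfulfilled[job] % total_unfulfilled,
                          reverse=True)
    for job in by_remainder[:leftover]:
        alloc[job] += 1
    return alloc


if __name__ == "__main__":
    # A large job with 80 pending tasks and a small job with 20.
    print(allocate_slots(10, {"large": 80, "small": 20}))
```

With 10 slots, a job holding 80 of the 100 unfulfilled tasks receives 8 slots and the job holding 20 receives 2, which matches the abstract's statement that the larger job is allocated more slots. The sketch does not cap a job's allocation at its number of pending tasks; a real scheduler would need to.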




Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Peng, Z., Ma, Y. (2011). A New Scheduling Algorithm in Hadoop MapReduce. In: Deng, H., Miao, D., Wang, F.L., Lei, J. (eds) Emerging Research in Artificial Intelligence and Computational Intelligence. AICI 2011. Communications in Computer and Information Science, vol 237. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24282-3_74


  • DOI: https://doi.org/10.1007/978-3-642-24282-3_74

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24281-6

  • Online ISBN: 978-3-642-24282-3

  • eBook Packages: Computer Science, Computer Science (R0)
