Partitioned Parallel Job Scheduling for Extreme Scale Computing

  • Conference paper
Job Scheduling Strategies for Parallel Processing (JSSPP 2012)

Abstract

Recent success in building extreme-scale computing systems poses new challenges for job scheduling design, which must support clusters that can execute millions of concurrent tasks. We show that for these extreme-scale clusters the resource demand at a centralized scheduler can exceed its capacity or limit its ability to perform well. This paper introduces partitioned scheduling, a hybrid centralized and distributed approach in which compute nodes are assigned to the job centrally, while the assignment of tasks to local node resources is performed subsequently at the assigned job nodes. This reduces the growth in memory and processing at the central scheduler and improves the scaling behavior of scheduling time by enabling operations to be done in parallel at the job nodes. When local resource assignments must be distributed to all other job nodes, the partitioned approach trades central processing for increased network communication. We therefore introduce features that improve communication, such as pipelining, which leverages the high-speed cluster network. The new system is evaluated for jobs with up to 50K tasks on clusters with 496 nodes and 128 tasks per node. The partitioned scheduling approach is demonstrated to reduce processor and memory usage at the central processor and to improve job scheduling and job dispatching times by up to an order of magnitude.
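
The two-phase split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Node`, `assign_nodes_centrally`, `assign_tasks_locally`, and `schedule` are hypothetical, the first-fit node selection is an assumed placeholder policy, threads stand in for job nodes working in parallel, and the final merge stands in for distributing local assignments over the network. The cluster size (496 nodes, 128 tasks per node) comes from the abstract, and "50K" is read here as 50,000 tasks.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    slots: int  # task slots available on this node (hypothetical field)


def assign_nodes_centrally(nodes, num_tasks):
    """Central phase: choose nodes and a task count per node, nothing finer."""
    plan = []  # entries are (node, first_task_id, task_count)
    next_task = 0
    for node in nodes:  # first-fit order is an assumed placeholder policy
        if next_task >= num_tasks:
            break
        count = min(node.slots, num_tasks - next_task)
        plan.append((node, next_task, count))
        next_task += count
    if next_task < num_tasks:
        raise ValueError("not enough task slots for this job")
    return plan


def assign_tasks_locally(entry):
    """Local phase: each job node maps its own tasks to its local slots."""
    node, first_task, count = entry
    return {first_task + i: (node.name, i) for i in range(count)}


def schedule(nodes, num_tasks):
    plan = assign_nodes_centrally(nodes, num_tasks)
    # Threads stand in for job nodes doing their local assignment in parallel.
    with ThreadPoolExecutor() as pool:
        local_maps = list(pool.map(assign_tasks_locally, plan))
    # Stand-in for distributing the local assignments to all job nodes.
    merged = {}
    for local in local_maps:
        merged.update(local)
    return plan, merged


# Cluster shape from the abstract: 496 nodes x 128 task slots = 63,488 slots.
cluster = [Node(f"node{i}", 128) for i in range(496)]
plan, task_map = schedule(cluster, 50_000)
print(len(plan), "nodes assigned;", len(task_map), "tasks placed")
```

In this sketch the central phase keeps one plan entry per node rather than one record per task, which is the kind of reduction in central memory and processing the abstract describes; the per-task detail exists only in the maps built at the job nodes.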

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Brelsford, D. et al. (2013). Partitioned Parallel Job Scheduling for Extreme Scale Computing. In: Cirne, W., Desai, N., Frachtenberg, E., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2012. Lecture Notes in Computer Science, vol 7698. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35867-8_9

  • DOI: https://doi.org/10.1007/978-3-642-35867-8_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35866-1

  • Online ISBN: 978-3-642-35867-8

  • eBook Packages: Computer Science, Computer Science (R0)
