Parallel Job Scheduling — A Status Report

  • Conference paper
Job Scheduling Strategies for Parallel Processing (JSSPP 2004)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3277)

Abstract

The popularity of research on the scheduling of parallel jobs demands a periodic review of the status of the field. Indeed, several surveys have been written on this topic in the context of parallel supercomputers [17, 20]. The purpose of the present paper is to update that material, and to extend it to include work concerning clusters and the grid.

References

  1. Alverson, G., Kahan, S., Korry, R., McCann, C., Smith, B.: Scheduling on the Tera MTA. In: Feitelson, D.G., Rudolph, L. (eds.) IPPS-WS 1995 and JSSPP 1995. LNCS, vol. 949, pp. 19–44. Springer, Heidelberg (1995)

  2. Banen, S., Bucur, A.I.D., Epema, D.H.J.: A measurement-based simulation study of processor co-allocation in multicluster systems. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2003. LNCS, vol. 2862, pp. 105–128. Springer, Heidelberg (2003)

  3. Batat, A., Feitelson, D.G.: Gang scheduling with memory considerations. In: 14th Intl. Parallel & Distributed Processing Symp, May 2000, pp. 109–114 (2000)

  4. Bucur, A.I.D., Epema, D.H.J.: The influence of communication on the performance of co-allocation. In: Feitelson, D.G., Rudolph, L. (eds.) JSSPP 2001. LNCS, vol. 2221, pp. 66–86. Springer, Heidelberg (2001)

  5. Bucur, A.I.D., Epema, D.H.J.: The influence of the structure and sizes of jobs on the performance of co-allocation. In: Feitelson, D.G., Rudolph, L. (eds.) IPDPS-WS 2000 and JSSPP 2000. LNCS, vol. 1911, pp. 154–173. Springer, Heidelberg (2000)

  6. Chiang, S.-H., Arpaci-Dusseau, A., Vernon, M.K.: The impact of more accurate requested runtimes on production job scheduling performance. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2002. LNCS, vol. 2537, pp. 103–127. Springer, Heidelberg (2002)

  7. Cirne, W., Berman, F.: Adaptive selection of partition size for supercomputer requests. In: Feitelson, D.G., Rudolph, L. (eds.) IPDPS-WS 2000 and JSSPP 2000. LNCS, vol. 1911, pp. 187–207. Springer, Heidelberg (2000)

  8. Das Sharma, D., Pradhan, D.K.: Job scheduling in mesh multicomputers. In: Intl. Conf. Parallel Processing, August 1994, vol. II, pp. 251–258 (1994)

  9. Ernemann, C., Hamscher, V., Schwiegelshohn, U., Streit, A., Yahyapour, R.: Enhanced Algorithms for Multi-Site Scheduling. In: Parashar, M. (ed.) GRID 2002. LNCS, vol. 2536, pp. 219–231. Springer, Heidelberg (2002)

  10. Ernemann, C., Hamscher, V., Schwiegelshohn, U., Streit, A., Yahyapour, R.: On Advantages of Grid Computing for Parallel Job Scheduling. In: Proc. 2nd IEEE/ACM Int’l Symp. on Cluster Computing and the Grid (CCGRID 2002), May 2002, IEEE Press, Berlin (2002)

  11. Ernemann, C., Hamscher, V., Streit, A., Yahyapour, R.: On Effects of Machine Configurations on Parallel Job Scheduling in Computational Grids. In: International Conference on Architecture of Computing Systems, ARCS, April 2002, pp. 169–179. VDE, Karlsruhe (2002)

  12. Ernemann, C., Hamscher, V., Yahyapour, R.: Economic Scheduling in Grid Computing. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2002. LNCS, vol. 2537, pp. 128–152. Springer, Heidelberg (2002)

  13. Ernemann, C., Yahyapour, R.: Applying Economic Scheduling Methods to Grid Environments. In: Grid Resource Management - State of the Art and Future Trends, pp. 491–506. Kluwer Academic Publishers, Dordrecht (2003)

  14. Etsion, Y., Feitelson, D.G.: User-level communication in a system with gang scheduling. In: 15th Intl. Parallel & Distributed Processing Symp. (April 2001)

  15. Feitelson, D.G.: Experimental Analysis of the Root Causes of Performance Evaluation Results: A Backfilling Case Study. Technical Report 2002–4, School of Computer Science and Engineering, Hebrew University (March 2002)

  16. Feitelson, D.G.: Metric and workload effects on computer systems evaluation. Computer 36(9), 18–25 (2003)

  17. Feitelson, D.G.: A Survey of Scheduling in Multiprogrammed Parallel Systems. Research Report RC 19790 (87657), IBM T. J. Watson Research Center (October 1994)

  18. Feitelson, D.G., Mu’alem Weil, A.: Utilization and predictability in scheduling the IBM SP2 with backfilling. In: 12th Intl. Parallel Processing Symp., April 1998, pp. 542–546 (1998)

  19. Feitelson, D.G., Rudolph, L.: Gang scheduling performance benefits for fine-grain synchronization. J. Parallel & Distributed Comput. 16(4), 306–318 (1992)

  20. Feitelson, D.G., Rudolph, L.: Parallel job scheduling: issues and approaches. In: Feitelson, D.G., Rudolph, L. (eds.) IPPS-WS 1995 and JSSPP 1995. LNCS, vol. 949, pp. 1–18. Springer, Heidelberg (1995)

  21. Feitelson, D.G.: The Supercomputer Industry in Light of the Top500 Data. Comput. in Science & Engineering 7(1), 42–47 (2004)

  22. Foster, I., Kesselman, C.: The Globus toolkit. In: Foster, I., Kesselman, C. (eds.) The Grid: Blueprint for a New Computing Infrastructure, pp. 259–278. Morgan Kaufmann, San Francisco (1999)

  23. Frachtenberg, E., Feitelson, D.G., Fernandez, J., Petrini, F.: Parallel job scheduling under dynamic workloads. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2003. LNCS, vol. 2862, pp. 208–227. Springer, Heidelberg (2003)

  24. Frachtenberg, E., Feitelson, D.G., Petrini, F., Fernandez, J.: Flexible coscheduling: mitigating load imbalance and improving utilization of heterogeneous resources. In: 17th Intl. Parallel & Distributed Processing Symp. (April 2003)

  25. Frachtenberg, E., Petrini, F., Fernandez, J., Pakin, S., Coll, S.: STORM: lightning-fast resource management. In: Supercomputing (November 2002)

  26. Hamscher, V., Schwiegelshohn, U., Streit, A., Yahyapour, R.: Evaluation of Job-Scheduling Strategies for Grid Computing. In: Buyya, R., Baker, M. (eds.) GRID 2000. LNCS, vol. 1971, pp. 191–202. Springer, Heidelberg (2000)

  27. Henderson, R.L.: Job scheduling under the portable batch system. In: Feitelson, D.G., Rudolph, L. (eds.) IPPS-WS 1995 and JSSPP 1995. LNCS, vol. 949, pp. 279–294. Springer, Heidelberg (1995)

  28. Holt, G.: Time-Critical Scheduling on a Well Utilised HPC System Using Resource Reservations. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2004. LNCS, vol. 3277, pp. 102–124. Springer, Heidelberg (2005)

  29. Intel Corp., iPSC/860 Multi-User Accounting, Control, and Scheduling Utilities Manual. Order number 312261-002 (May 1992)

  30. Jackson, D., Snell, Q., Clement, M.: Core algorithms of the Maui scheduler. In: Feitelson, D.G., Rudolph, L. (eds.) JSSPP 2001. LNCS, vol. 2221, pp. 87–102. Springer, Heidelberg (2001)

  31. Lagerstrom, R., Gipp, S.: PScheD: Political Scheduling on the CRAY T3E. In: Feitelson, D.G., Rudolph, L. (eds.) IPPS-WS 1997 and JSSPP 1997. LNCS, vol. 1291, pp. 117–138. Springer, Heidelberg (1997)

  32. Lee, C.B., Schwartzman, Y., Hardy, J., Snavely, A.: Are user runtime estimates inherently inaccurate? In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2004. LNCS, vol. 3277, pp. 253–263. Springer, Heidelberg (2005)

  33. Lifka, D.: The ANL/IBM SP scheduling system. In: Feitelson, D.G., Rudolph, L. (eds.) IPPS-WS 1995 and JSSPP 1995. LNCS, vol. 949, pp. 295–303. Springer, Heidelberg (1995)

  34. Litzkow, M.J., Livny, M., Mutka, M.W.: Condor - a hunter of idle workstations. In: 8th Intl. Conf. Distributed Comput. Syst., June 1988, pp. 104–111 (1988)

  35. Moreira, J.E., Chan, W., Fong, L.L., Franke, H., Jette, M.A.: An infrastructure for efficient parallel job execution in terascale computing environments. In: Supercomputing 1998 (November 1998)

  36. Mraz, R.: Reducing the variance of point-to-point transfers for parallel real-time programs. IEEE Parallel & Distributed Technology 2(4), 20–31 (Winter 1994)

  37. Mu’alem, A.W., Feitelson, D.G.: Utilization, predictability, workloads, and user runtime estimates in scheduling the IBM SP2 with backfilling. IEEE Trans. Parallel & Distributed Syst. 12(6), 529–543 (2001)

  38. Ousterhout, J.K.: Scheduling techniques for concurrent systems. In: 3rd Intl. Conf. Distributed Comput. Syst., October 1982, pp. 22–30 (1982)

  39. Petrini, F., Feng, W.-c.: Buffered coscheduling: a new methodology for multitasking parallel jobs on distributed systems. In: 14th Intl. Parallel & Distributed Processing Symp., May 2000, pp. 439–444 (2000)

  40. Petrini, F., Feng, W.-c.: Time-sharing parallel jobs in the presence of multiple resource requirements. In: Feitelson, D.G., Rudolph, L. (eds.) IPDPS-WS 2000 and JSSPP 2000. LNCS, vol. 1911, pp. 113–136. Springer, Heidelberg (2000)

  41. Petrini, F., Kerbyson, D.J., Pakin, S.: The case of missing supercomputer performance: achieving optimal performance on the 8,192 processors of ASCI Q. In: Supercomputing (November 2003)

  42. Pruyne, J., Livny, M.: Parallel processing on dynamic resources with CARMI. In: Feitelson, D.G., Rudolph, L. (eds.) IPPS-WS 1995 and JSSPP 1995. LNCS, vol. 949, pp. 259–278. Springer, Heidelberg (1995)

  43. Schwiegelshohn, U., Yahyapour, R.: Analysis of First-Come-First-Serve Parallel Job Scheduling. In: Proceedings of the 9th SIAM Symposium on Discrete Algorithms, January 1998, pp. 629–638 (1998)

  44. Schwiegelshohn, U., Yahyapour, R.: Fairness in Parallel Job Scheduling. Journal of Scheduling 3(5), 297–320 (2000)

  45. Schwiegelshohn, U., Yahyapour, R.: Attributes for Communication Between Grid Scheduling Instances. In: Grid Resource Management - State of the Art and Future Trends, pp. 41–52. Kluwer Academic Publishers, Dordrecht (2003)

  46. Rudolph, L., Smith, P.: Valuation of Ultra-scale Computing Systems. In: Feitelson, D.G., Rudolph, L. (eds.) IPDPS-WS 2000 and JSSPP 2000. LNCS, vol. 1911, pp. 39–55. Springer, Heidelberg (2000)

  47. Shmueli, E., Feitelson, D.G.: Backfilling with lookahead to optimize the performance of parallel job scheduling. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2003. LNCS, vol. 2862, pp. 228–251. Springer, Heidelberg (2003)

  48. Sinaga, J.M.P., Mohammed, H.H., Epema, D.H.J.: A dynamic co-allocation service in multicluster systems. In: 10th Job Scheduling Strategies for Parallel Processing (June 2004)

  49. Snell, Q., Clement, M., Jackson, D., Gregory, C.: The performance impact of advance reservation meta-scheduling. In: Feitelson, D.G., Rudolph, L. (eds.) IPDPS-WS 2000 and JSSPP 2000. LNCS, vol. 1911, pp. 137–153. Springer, Heidelberg (2000)

  50. Srinivasan, S., Kettimuthu, R., Subramani, V., Sadayappan, P.: Selective reservation strategies for backfill job scheduling. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2002. LNCS, vol. 2537, pp. 55–71. Springer, Heidelberg (2002)

  51. Talby, D., Feitelson, D.G.: Supporting priorities and improving utilization of the IBM SP scheduler using slack-based backfilling. In: 13th Intl. Parallel Processing Symp., April 1999, pp. 513–517 (1999)

  52. Tullsen, D.M., Eggers, S., Emer, J., Levy, H., Lo, J., Stamm, R.: Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor. In: 23rd Annual International Symposium on Computer Architecture (May 1996)

  53. Tsafrir, D.: (in preparation)

  54. Uno, A., Aoyagi, T., Tani, K.: Job scheduling on the earth simulator. NEC Res. & Develop. 44(1), 47–52 (2003)

  55. Schwiegelshohn, U., Yahyapour, R.: GGF-GFD.6: Attributes for Communication between Scheduling Instances (December 2001), http://www.ggf.org/documents/GFD/GFDI-6.pdf

  56. Wiseman, Y., Feitelson, D.G.: Paired gang scheduling. IEEE Trans. Parallel & Distributed Syst. 14(6), 581–592 (2003)

  57. Yoo, A.B., Jette, M.A., Grondona, M.: SLURM: simple Linux utility for resource management. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2003. LNCS, vol. 2862, pp. 44–60. Springer, Heidelberg (2003)

  58. Zhang, Y., Franke, H., Moreira, J., Sivasubramaniam, A.: An integrated approach to parallel scheduling using gang-scheduling, backfilling, and migration. IEEE Trans. Parallel & Distributed Syst. 14(3), 236–247 (2003)

  59. Zhang, Y., Franke, H., Moreira, J.E., Sivasubramaniam, A.: Improving parallel job scheduling by combining gang scheduling and backfilling techniques. In: 14th Intl. Parallel & Distributed Processing Symp., May 2000, pp. 133–142 (2000)

  60. Zhou, S., Zheng, X., Wang, J., Delisle, P.: Utopia: a load sharing facility for large, heterogeneous distributed computer systems. Software — Pract. & Exp. 23(12), 1305–1336 (1993)

  61. Zotkin, D., Keleher, P.J.: Job-length estimation and performance in backfilling schedulers. In: 8th Intl. Symp. High Performance Distributed Comput. (August 1999)

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (2005). Parallel Job Scheduling — A Status Report. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2004. Lecture Notes in Computer Science, vol 3277. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11407522_1

  • DOI: https://doi.org/10.1007/11407522_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25330-3

  • Online ISBN: 978-3-540-31795-1

  • eBook Packages: Computer Science, Computer Science (R0)
