Benchmarks and Standards for the Evaluation of Parallel Job Schedulers

  • Conference paper in Job Scheduling Strategies for Parallel Processing (JSSPP 1999)

Abstract

The evaluation of parallel job schedulers hinges on the workloads used. It is suggested that this be standardized, in terms of both format and content, so as to ease the evaluation and comparison of different systems. The question remains whether this can encompass both traditional parallel systems and metacomputing systems.

This paper is based on a panel on this subject held at the workshop and on the ensuing discussion; its authors include both the panel members and participants from the audience. Naturally, not all of us agree with all the opinions expressed here...
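To make the first point above concrete, the sketch below shows one way a standardized, text-based workload trace might be consumed when comparing schedulers. It is only an illustration, loosely modeled on whitespace-separated records in the style of the Standard Workload Format promoted by the Parallel Workloads Archive; the exact field positions, the `parse_trace` helper, and the choice of summary metrics are assumptions made for this example, not something specified in the paper.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Job:
    """One record from a standardized workload trace (illustrative subset of fields)."""
    job_id: int
    submit_time: float   # seconds since the start of the trace
    wait_time: float     # seconds spent waiting in the queue
    run_time: float      # seconds of execution
    processors: int      # processors actually allocated


def parse_trace(lines: Iterable[str]) -> Iterator[Job]:
    """Parse whitespace-separated records; ';' starts a comment line (assumed convention)."""
    for line in lines:
        line = line.split(";", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # Assumed field order: job id, submit time, wait time, run time, allocated processors.
        yield Job(
            job_id=int(fields[0]),
            submit_time=float(fields[1]),
            wait_time=float(fields[2]),
            run_time=float(fields[3]),
            processors=int(fields[4]),
        )


def bounded_slowdown(job: Job, tau: float = 10.0) -> float:
    """Bounded slowdown: caps the distortion caused by very short jobs (bound tau in seconds)."""
    return max((job.wait_time + job.run_time) / max(job.run_time, tau), 1.0)


def summarize(jobs: Iterable[Job]) -> dict:
    """Aggregate metrics commonly reported when two schedulers are run on the same trace."""
    jobs = list(jobs)
    n = len(jobs)
    return {
        "jobs": n,
        "mean_wait": sum(j.wait_time for j in jobs) / n,
        "mean_bounded_slowdown": sum(bounded_slowdown(j) for j in jobs) / n,
    }


if __name__ == "__main__":
    sample = [
        "; illustrative three-job trace",
        "1   0    10   300   32",
        "2  60   120   600   64",
        "3  90     5    30    8",
    ]
    print(summarize(parse_trace(sample)))
```

Feeding the same trace, in the same agreed-upon format, to two different schedulers or simulators and comparing such summaries is exactly the kind of apples-to-apples evaluation that a standardized workload format and content would make routine.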

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chapin, S.J. et al. (1999). Benchmarks and Standards for the Evaluation of Parallel Job Schedulers. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 1999. Lecture Notes in Computer Science, vol 1659. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47954-6_4

  • DOI: https://doi.org/10.1007/3-540-47954-6_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66676-9

  • Online ISBN: 978-3-540-47954-3
