Benchmarks and Standards for the Evaluation of Parallel Job Schedulers

  • Conference paper in Job Scheduling Strategies for Parallel Processing (JSSPP 1999)

Abstract

The evaluation of parallel job schedulers hinges on the workloads used. It is suggested that this be standardized, in terms of both format and content, so as to ease the evaluation and comparison of different systems. The question remains whether this can encompass both traditional parallel systems and metacomputing systems.

This paper is based on a panel on this subject held at the workshop and on the ensuing discussion; its authors include both the panel members and participants from the audience. Naturally, not all of us agree with all the opinions expressed here...
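To make the first point above concrete, the sketch below shows one way a standardized, text-based workload trace might be consumed when comparing schedulers. It is only an illustration, loosely modeled on whitespace-separated records in the style of the Standard Workload Format promoted by the Parallel Workloads Archive; the exact field positions, the `parse_trace` helper, and the choice of summary metrics are assumptions made for this example, not something specified in the paper.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Job:
    """One record from a standardized workload trace (illustrative subset of fields)."""
    job_id: int
    submit_time: float   # seconds since the start of the trace
    wait_time: float     # seconds spent waiting in the queue
    run_time: float      # seconds of execution
    processors: int      # processors actually allocated


def parse_trace(lines: Iterable[str]) -> Iterator[Job]:
    """Parse whitespace-separated records; ';' starts a comment line (assumed convention)."""
    for line in lines:
        line = line.split(";", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # Assumed field order: job id, submit time, wait time, run time, allocated processors.
        yield Job(
            job_id=int(fields[0]),
            submit_time=float(fields[1]),
            wait_time=float(fields[2]),
            run_time=float(fields[3]),
            processors=int(fields[4]),
        )


def bounded_slowdown(job: Job, tau: float = 10.0) -> float:
    """Bounded slowdown: caps the distortion caused by very short jobs (bound tau in seconds)."""
    return max((job.wait_time + job.run_time) / max(job.run_time, tau), 1.0)


def summarize(jobs: Iterable[Job]) -> dict:
    """Aggregate metrics commonly reported when two schedulers are run on the same trace."""
    jobs = list(jobs)
    n = len(jobs)
    return {
        "jobs": n,
        "mean_wait": sum(j.wait_time for j in jobs) / n,
        "mean_bounded_slowdown": sum(bounded_slowdown(j) for j in jobs) / n,
    }


if __name__ == "__main__":
    sample = [
        "; illustrative three-job trace",
        "1   0    10   300   32",
        "2  60   120   600   64",
        "3  90     5    30    8",
    ]
    print(summarize(parse_trace(sample)))
```

Feeding the same trace, in the same agreed-upon format, to two different schedulers or simulators and comparing such summaries is exactly the kind of apples-to-apples evaluation that a standardized workload format and content would make routine.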

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chapin, S.J. et al. (1999). Benchmarks and Standards for the Evaluation of Parallel Job Schedulers. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 1999. Lecture Notes in Computer Science, vol 1659. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47954-6_4

  • DOI: https://doi.org/10.1007/3-540-47954-6_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66676-9

  • Online ISBN: 978-3-540-47954-3
