Skip to main content

Parallel application characterization for multiprocessor scheduling policy design

  • Conference paper
  • First Online:

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1162))

Abstract

Much of the recent work on multiprocessor scheduling disciplines has used abstract workload models to explore the fundamental, high-level properties of the various alternatives. As continuing work on these policies increases their level of sophistication, however, it is clear that the choice of appropriate policies must be guided at least in part by the typical behavior of actual parallel applications. Our goal in this paper is to examine a variety of such applications, providing measurements of properties relevant to scheduling policy design. We give measurements for both hand-coded parallel programs (from the SPLASH benchmark suites) and compiler-parallelized programs (from the PERFECT Club suite) running on a KSR-2 shared-memory multiprocessor.

The measurements we present are intended primarily to address two aspects of multiprocessor scheduling policy design:

  • In the spectrum between aggressively dynamic and static allocation policies, what is an appropriate choice for the rate at which reallocations should take place?

  • Is it possible to take measurements of application speedup and efficiency at runtime that are sufficiently accurate to guide allocation decisions?

We address these questions through three sets of measurements:

  • First, we examine application speedup, and the sources of speedup loss. Our results confirm that there is considerable variation in job speedup, and that the bulk of the speedup loss is due to communication and idleness.

  • Next, we examine runtime measurement of speedup information. We begin by looking at how such information might be acquired accurately and at acceptable cost. We then investigate the extent to which recent measurements of speedup accurately predict the future, and so the extent to which such measurements might reasonably be expected to guide allocation decisions.

  • Finally, we examine the durations of individual processor idle periods, and relate these to the cost of reallocating a processor at those times. These results shed light on the potential for aggressively dynamic policies to improve performance.

This work was supported in part by the National Science Foundation (Grants CCR-9123308 and CCR-9200832) and the Washington Technology Center.

This is a preview of subscription content, log in via an institution.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. A. Agarwal and A. Gupta. Memory-Reference Characteristics of Multiprocessor Applications under MACH. In Proceedings of the ACM SIGMETRICS Conference, pages 215–225, May 1988.

    Google Scholar 

  2. I. Ashok and J. Zahorjan. Scheduling a Mixed Interactive and Batch Workload on a Parallel, Shared Memory Supercomputer. In Supercomputing '92, pages 616–625, Nov. 1992.

    Google Scholar 

  3. M. Berry, D. Chen, P. Koss, D. Kuck, S. Lo, Y. Pang, L. Pointer, R. Roloff, A. Sameh, E. Clementi, S. Chin, D. Schneider, G. Fox, P. Messina, D. Walker, C. Hsiung, J. Scharzmeier, K. Lue, S. Orszag, F. Seidl, O. Johnson, R. Goodrum, and J. Martin. The PERFECT Club Benchmarks: Effective Performance Evaluation of Supercomputers. The International Journal of Supercomputer Applications, 3(3):5–40, 1989.

    Google Scholar 

  4. J. Chen, Y. Endo, K. Chan, D. Mazieres, A. Dias, M. Seltzer, and M. Smith. The Measured Performance of Personal Computer Operating Systems. In Proceedings of the 15th ACM Symposium on Operating system Principles, pages 299–313, Dec. 1995.

    Google Scholar 

  5. S.-H. Chiang, R. K. Mansharamani, and M. K. Vernon. Use of Application Characteristics and Limited Preemption for Run-To-Completion Parallel Processor Scheduling Policies. In Proceedings of the ACM SIGMETRICS Conference, pages 33–44, May 1994.

    Google Scholar 

  6. E. C. Cooper and R. P. Draves. C Threads. Technical Report CMU-CS-88-154, Department of Computer Science, Carnegie-Mellon University, June 1988.

    Google Scholar 

  7. G. Cybenko, L. Kipp, L. Pointer, and D. Kuck. Supercomputer Performance Evaluation and the Perfect Benchmarks. In Proceedings of the 1990 International Conference on Supercomputing, ACM SIGARCH Computer Architecture News, pages 254–266, Sept. 1990.

    Google Scholar 

  8. R. Cypher, A. Ho, S. Konstantinidou, and P. Messina. Architectural Requirements of Parallel Scientific Applications with Explicit Communication. In Proceedings of the 20th Annual International Symposium on Computer Architecture, pages 2–13, May 1993.

    Google Scholar 

  9. F. Darema-Rogers, G. Pfister, and K. So. Memory Access Patterns of Parallel Scientific Programs. In Proceedings of the ACM SIGMETRICS Conference, pages 46–58, May 1987.

    Google Scholar 

  10. J. J. Dongarra and T. Dunigan. Message-Passing Performance of Various Computers. Technical Report CS-95-299, University of Tennessee, July 1995.

    Google Scholar 

  11. R. Eigenmann,J. Hoeflinger,Z. Li, and D. Padua. Experience in the Parallelization of Four Perfect-Benchmark Programs. Technical Report 1193, Center for Supercomputing Research and Development, Aug. 1991.

    Google Scholar 

  12. D. G. Feitelson and B. Nitzberg. Job Characteristics of a Production Parallel Scientific Workload on the NASA Ames iPSC/860. In Proceedings of the IPPS'95 Workshop on Job Scheduling Strategies for Parallel Processing, pages 337–360, Apr. 1995.

    Google Scholar 

  13. K. Guha. Using Parallel Program Characteristics in Dynamic Processor Allocation Policies. Technical Report CS-95-03, Department of Computer Science, York University, May 1995.

    Google Scholar 

  14. A. Gupta, A. Tucker, and S. Urushibara. The Impact of Operating System Scheduling Policies and Synchronization Methods on the Performance of Parallel Applications. In Proceedings of the ACM SIGMETRICS Conference, pages 120–133, May 1991.

    Google Scholar 

  15. A. Karlin, K. Li, M. S. Manasse, and S. Owicki. Empirical Studies of Competitive Spinning for a Shared-Memory Multiprocessor. In Proceedings of the 13th ACM Symposium on Operating Systems Principles, pages 41–55, Oct. 1991.

    Google Scholar 

  16. Kendall Square Research Inc., 170 Tracer Lane, Waltham, MA 02154. KSR Fortran Programming, 1993.

    Google Scholar 

  17. S. T. Leutenegger and M. K. Vernon. The Performance of Multiprogrammed Multiprocessor Scheduling Policies. In Proceedings of the ACM SIGMETRICS Conference, pages 226–236, May 1990.

    Google Scholar 

  18. S.-P. Lo and V. Gligor. A Comparative Analysis of Multiprocessor Scheduling Algorithms. In Proceedings of the 7th International Conference on Distributed Computing Systems, pages 356–63, Sept. 1987.

    Google Scholar 

  19. S. Majumdar, D. L. Eager, and R. B. Bunt. Scheduling in Multiprogrammed Parallel Systems. In Proceedings of the ACM SIGMETRICS Conference, pages 104–113, May 1988.

    Google Scholar 

  20. C. McCann, R. Vaswani, and J. Zahorjan. A Dynamic Processor Allocation Policy for Multiprogrammed Shared-Memory Multiprocessors. ACM Transactions on Computer Systems, 11(2):146–178, May 1993.

    Article  Google Scholar 

  21. A. J. Musciano and T. L. Sterling. Efficient Dynamic Scheduling of Medium-Grained Tasks for General Purpose Parallel Processing. In Proceedings of the International Conference on Parallel Processing, pages 166–175, Aug. 1988.

    Google Scholar 

  22. T. D. Nguyen, R. Vaswani, and J. Zahorjan. Maximizing Speedup Through Self-Tuning of Processor Allocation. In Proceedings of the 10th International Parallel Processing Symposium, pages 463–468, Apr. 1996.

    Google Scholar 

  23. T. D. Nguyen, R. Vaswani, and J. Zahorjan. Using Runtime Measured Workload Characteristics in Parallel Processor Scheduling. In Proceedings of the IPPS'96 Workshop on Job Scheduling Strategies for Parallel Processing, Apr. 1996.

    Google Scholar 

  24. T. D. Nguyen, R. Vaswani, and J. Zahorjan. Parallel Application Characterization for Multiprocessor Scheduling Policy Design. Technical report, Department of Computer Science and Engineering, University of Washington, In preparation.

    Google Scholar 

  25. J. K. Ousterhout. Scheduling Techniques for Concurrent Systems. In Proceedings of 3rd International Conference on Distributed Computing Systems, pages 22–30, Oct. 1982.

    Google Scholar 

  26. P. Petersen and D. Padua. Machine-Independent Evaluation of Parallelizing Compilers. Technical Report 1173, Center for Supercomputing Research and Development, 1992.

    Google Scholar 

  27. E. Rothberg, J. P. Singh, and A. Gupta. Working Sets, Cache Sizes, and Node Granularity Issues for Large-Scale Multiprocessors. In Proceedings of the 20th Annual International Symposium on Computer Architecture, pages 14–25, May 1993.

    Google Scholar 

  28. K. C. Sevcik. Characterizations of Parallelism in Applications and their Use in Scheduling. In Proceedings of the ACM SIGMETRICS Conference, pages 171–180, May 1989.

    Google Scholar 

  29. K. C. Sevcik. Application Scheduling and Processor Allocation in Multiprogrammed Parallel Processing Systems. Performance Evaluation, 19(2/3): 107–140, Mar. 1994.

    Article  Google Scholar 

  30. J. P. Singh, W.-D. Weber, and A. Gupta. SPLASH: Stanford Parallel Applications for Shared-Memory. Computer Architecture News, 20(1):5–44, 1992.

    Article  Google Scholar 

  31. R. L. Sites, editor. Alpha Architecture Reference Manual. Digital Press, 1992.

    Google Scholar 

  32. M. Squillante and E. Lazowska. Using Processor-Cache Affinity Information in Shared-Memory Multiprocessor Scheduling. IEEE Transactions on Parallel and Distributed Systems, 4(2):131–143, February 1993.

    Article  Google Scholar 

  33. D. Thiebaut and H. S. Stone. Footprints in the Cache. ACM Transactions on Computer Systems, 5(4):305–329, Nov. 1987.

    Article  Google Scholar 

  34. A. Tucker and A. Gupta. Process Control and Scheduling Issues for Multiprogrammed Shared-Memory Multiprocessors. In Proceedings of the 12th ACM Symposium on Operating Systems Principles, pages 159–166, Dec. 1989.

    Google Scholar 

  35. R. Vaswani and J. Zahorjan. The Implications of Cache Affinity on Processor Scheduling for Multiprogrammed, Shared Memory Multiprocessors. In Proceedings of the 13th ACM Symposium on Operating Systems Principles, pages 26–40, Dec. 1991.

    Google Scholar 

  36. R. P. Wilson, R. S. French, C. S. Wilson, S. P. Amarasinghe, J. M. Anderson, S. W. K. Tjiang, S.-W. Liao, C.-W. Tseng, M. W. Hall, M. S. Lam, and J. L. Hennessy. SUIF: An Infrastructure for Research on Parallelizing and Optimizing Comilers. Technical report, Computer Systems Laboratory, Stanford Univeristy.

    Google Scholar 

  37. S. C.Woo,M. Ohara,E. Torrie,J. P.Singh, and A. Gupta. The SPLASH-2 Programs: Characterization and Methodological Considerations. In Proceedings 22nd Annual International Symposium on Computer Architecture, pages 24–36, June 1995.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Dror G. Feitelson Larry Rudolph

Rights and permissions

Reprints and permissions

Copyright information

© 1996 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nguyen, T.D., Vaswani, R., Zahorjan, J. (1996). Parallel application characterization for multiprocessor scheduling policy design. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 1996. Lecture Notes in Computer Science, vol 1162. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0022294

Download citation

  • DOI: https://doi.org/10.1007/BFb0022294

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-61864-5

  • Online ISBN: 978-3-540-70710-3

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics