Performance Estimation for Scheduling on Shared Networks

Subhlok, Jaspal; Venkataramaiah, Shreenivasa

doi:10.1007/10968987_8

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2862)

Included in the following conference series:

Workshop on Job Scheduling Strategies for Parallel Processing

Abstract

This paper develops a framework to model the performance of parallel applications executing in a shared network computing environment. When a single computation node or network link is shared, the actual performance is predicted; when multiple nodes and links are shared, performance bounds are developed. The methodology for building such a shared execution performance model is based on monitoring an application's execution behavior and resource usage under controlled, dedicated execution. The procedure does not require access to the source code and hence can be applied across programming languages and models. We validate our approach with experiments using the NAS benchmarks, executed in different resource sharing scenarios on a small cluster. Applicability to more general scenarios, such as large clusters, memory-bound and I/O-bound programs, and wide area networks, remains an open question that is included in the discussion. This paper makes the case that understanding and modeling application behavior is important for resource allocation, and it offers a promising approach to putting that into practice.
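As a rough illustration of the idea in the abstract (measure an application under dedicated execution, then estimate its run time under sharing), here is a minimal Python sketch. It is not the model from the paper: the function names, the fair-share CPU assumption, the linear bandwidth scaling, and the way the bounds are formed are all assumptions made for this example only.

```python
def estimate_shared_time(compute_s, comm_s, competing_procs=0, bandwidth_fraction=1.0):
    """Estimate run time when ONE resource (CPU or link) is shared.

    compute_s, comm_s: compute and communication time (seconds) measured
        in a dedicated run.
    competing_procs: number of compute-bound processes sharing the CPU
        (assumption: the scheduler gives every process an equal share).
    bandwidth_fraction: fraction of link bandwidth still available
        (assumption: communication time scales inversely with bandwidth).
    """
    if competing_procs < 0:
        raise ValueError("competing_procs must be non-negative")
    if not 0.0 < bandwidth_fraction <= 1.0:
        raise ValueError("bandwidth_fraction must be in (0, 1]")
    return compute_s * (competing_procs + 1) + comm_s / bandwidth_fraction


def estimate_bounds(compute_s, comm_s, competing_procs, bandwidth_fraction):
    """Bracket run time when CPU and link are shared at the same time.

    Assumption: the run is at least as slow as the worst single-resource
    case (lower bound) and no slower than the dedicated time plus every
    individual delay added together (upper bound).
    """
    dedicated = compute_s + comm_s
    cpu_only = estimate_shared_time(compute_s, comm_s, competing_procs=competing_procs)
    link_only = estimate_shared_time(compute_s, comm_s, bandwidth_fraction=bandwidth_fraction)
    lower = max(cpu_only, link_only)
    upper = dedicated + (cpu_only - dedicated) + (link_only - dedicated)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical dedicated-run measurements: 60 s compute, 40 s communication.
    print(estimate_shared_time(60.0, 40.0, competing_procs=1))
    print(estimate_bounds(60.0, 40.0, competing_procs=1, bandwidth_fraction=0.5))
```

The sketch only needs timings from a dedicated run, which mirrors the abstract's point that no source code access is required; a real model would also have to account for synchronization between nodes, which this example ignores.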

Author information

Authors and Affiliations

Department of Computer Science, University of Houston, Houston, TX, 77204, USA
Jaspal Subhlok & Shreenivasa Venkataramaiah

Editor information

Editors and Affiliations

Department of Computer Science, The Hebrew University of Jerusalem
Dror Feitelson
Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
Larry Rudolph
No affiliation listed
Uwe Schwiegelshohn

About this paper

Cite this paper

Subhlok, J., Venkataramaiah, S. (2003). Performance Estimation for Scheduling on Shared Networks. In: Feitelson, D., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2003. Lecture Notes in Computer Science, vol 2862. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10968987_8

DOI: https://doi.org/10.1007/10968987_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20405-3
Online ISBN: 978-3-540-39727-4
eBook Packages: Springer Book Archive