Abstract
This paper develops a framework to model the performance of parallel applications executing in a shared network computing environment. For sharing of a single computation node or network link, the actual performance is predicted, while for sharing of multiple nodes and links, performance bounds are developed. The methodology for building such a shared execution performance model is based on monitoring an application's execution behavior and resource usage under controlled dedicated execution. The procedure does not require access to the source code and hence can be applied across programming languages and models. We validate our approach with experimental results from NAS benchmarks executed under different resource-sharing scenarios on a small cluster. Applicability to more general scenarios, such as large clusters, memory- and I/O-bound programs, and wide area networks, remains an open question that is included in the discussion. This paper makes the case that understanding and modeling application behavior is important for resource allocation and offers a promising approach to put that into practice.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Subhlok, J., Venkataramaiah, S. (2003). Performance Estimation for Scheduling on Shared Networks. In: Feitelson, D., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2003. Lecture Notes in Computer Science, vol 2862. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10968987_8
DOI: https://doi.org/10.1007/10968987_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20405-3
Online ISBN: 978-3-540-39727-4
eBook Packages: Springer Book Archive