ScoPred–Scalable User-Directed Performance Prediction Using Complexity Modeling and Historical Data

Lafreniere, Benjamin J.; Sodan, Angela C.

doi:10.1007/11605300_3

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3834)

Included in the following conference series:

Workshop on Job Scheduling Strategies for Parallel Processing

Abstract

Using historical information to predict future runs of parallel jobs has been shown to be valuable in job scheduling. Trends toward more flexible job-scheduling techniques, such as adaptive resource allocation, and toward the expansion of scheduling to grids make runtime predictions even more important. We present a technique that combines a user’s knowledge of their parallel application with historical application-run data to derive accurate and scalable predictions for future runs. These scalable predictions apply to runtime characteristics for different numbers of nodes (processor scalability) and different problem sizes (problem-size scalability). We employ multiple linear regression and show that good prediction accuracy can be obtained when the complexity models are reasonably accurate.
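
As a rough illustration of the idea in the abstract, the sketch below fits the coefficients of a user-supplied complexity model to past runs with ordinary least squares, then predicts the runtime for an unseen problem size and node count. It is not the ScoPred implementation: the model terms (a constant, n^3/p, and n^2), the synthetic history, and the names TERMS, fit, and predict are all assumptions made for this example.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical complexity model: each term maps (problem size n, nodes p)
# to a value; the regression estimates one coefficient per term.
Term = Callable[[float, float], float]
Run = Tuple[float, float, float]  # (n, p, measured runtime)

TERMS: List[Term] = [
    lambda n, p: 1.0,         # assumed constant startup overhead
    lambda n, p: n ** 3 / p,  # assumed perfectly parallel computation
    lambda n, p: n ** 2,      # assumed non-parallelized work
]


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * size
    for r in range(size - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, size))
        x[r] = (m[r][size] - tail) / m[r][r]
    return x


def fit(runs: Sequence[Run], terms: Sequence[Term]) -> List[float]:
    """Least-squares fit via the normal equations (X^T X) beta = X^T y."""
    rows = [[t(n, p) for t in terms] for n, p, _ in runs]
    ys = [runtime for _, _, runtime in runs]
    k = len(terms)
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs: Sequence[float], terms: Sequence[Term],
            n: float, p: float) -> float:
    """Evaluate the fitted model for problem size n on p nodes."""
    return sum(c * t(n, p) for c, t in zip(coeffs, terms))


# Synthetic, noise-free "historical" runs generated from made-up coefficients.
history: List[Run] = [
    (n, p, 0.5 + 0.002 * n ** 3 / p + 0.01 * n ** 2)
    for n in (10, 20, 30, 40)
    for p in (1, 2, 4, 8)
]

coeffs = fit(history, TERMS)
predicted = predict(coeffs, TERMS, 50, 16)  # size and node count not in history
print(f"coefficients: {coeffs}")
print(f"predicted runtime for n=50, p=16: {predicted:.3f}")
```

Because the model is linear in its coefficients, a single fit covers both processor scalability (varying p) and problem-size scalability (varying n); only the list of terms changes from one application to another.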

Author information

Authors and Affiliations

University of Windsor, Windsor, ON N9B 3P4, Canada
Benjamin J. Lafreniere & Angela C. Sodan

Editor information

Editors and Affiliations

Department of Computer Science, The Hebrew University of Jerusalem
Dror Feitelson
Powerset, Inc.
Eitan Frachtenberg
Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
Larry Rudolph
No Affiliations
Uwe Schwiegelshohn

About this paper

Cite this paper

Lafreniere, B.J., Sodan, A.C. (2005). ScoPred–Scalable User-Directed Performance Prediction Using Complexity Modeling and Historical Data. In: Feitelson, D., Frachtenberg, E., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2005. Lecture Notes in Computer Science, vol 3834. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11605300_3

DOI: https://doi.org/10.1007/11605300_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31024-2
Online ISBN: 978-3-540-31617-6
eBook Packages: Computer Science (R0)
