Impact of Reservations on Production Job Scheduling
The TeraGrid is a closely linked community of diverse resources: computational, data, and experimental. Examples include the imminent very large computational system at the University of Texas, the extensive data facilities at SDSC, and the physics experiments at ORNL. As research efforts become more extensive in scope, the co-scheduling of multiple resources becomes an essential part of scientific progress. This can be at odds with the traditional management of computational systems, where utilization, queue wait times, and expansion factors are considered paramount and anything that might degrade these metrics is viewed with suspicion. The only way to assuage such concerns is to investigate intensively the likely effects of allowing advance reservations on these performance metrics.
To understand the impact, we developed a simulator that reads our actual production job log and reservation request data to investigate different scheduling scenarios. We explored the effect of reservations and policies using job log data from two different months in consecutive years, and we present our initial results. The simulations suggest that utilization, expansion factor, and queue wait time can indeed be degraded when reservations are numerous and large, but that this effect can be mitigated with appropriate policies.
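As a minimal illustration of the three metrics named above, the sketch below computes them from a toy job log. It assumes the commonly used definition of expansion factor, (wait time + run time) / run time; the abstract does not state the exact definition used in the paper. The `Job` record, its field names, and the helper functions are hypothetical and do not reflect the simulator's actual interfaces.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    """Hypothetical job-log record; times are in seconds."""
    submit: float  # time the job entered the queue
    start: float   # time the job began running
    end: float     # time the job finished
    nodes: int     # number of nodes the job occupied


def expansion_factor(job: Job) -> float:
    """(wait + run) / run, a commonly used definition (an assumption here)."""
    run = job.end - job.start
    if run <= 0:
        raise ValueError("job must have a positive run time")
    wait = job.start - job.submit
    return (wait + run) / run


def mean_queue_wait(jobs: List[Job]) -> float:
    """Average time jobs spent waiting in the queue."""
    return sum(j.start - j.submit for j in jobs) / len(jobs)


def utilization(jobs: List[Job], total_nodes: int, t0: float, t1: float) -> float:
    """Fraction of available node-seconds in [t0, t1] used by jobs."""
    used = 0.0
    for j in jobs:
        # Count only the part of each job that overlaps the window.
        overlap = min(j.end, t1) - max(j.start, t0)
        if overlap > 0:
            used += j.nodes * overlap
    return used / (total_nodes * (t1 - t0))


# Toy log: two 4-node jobs submitted together on an 8-node machine,
# the second delayed by 100 s (for example, by a reservation).
jobs = [Job(0, 0, 100, 4), Job(0, 100, 200, 4)]
```

With this toy log, the delayed job waits as long as it runs, so its expansion factor doubles relative to the undelayed job, while half of the machine's node-seconds over the 200-second window go unused.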
Keywords: Expansion Factor, Advance Reservation, Local Scheduler, Simulation Core, Simulation Clock