SPRUCE: A System for Supporting Urgent High-Performance Computing
Modeling and simulation using high-performance computing are playing an increasingly important role in decision making and prediction. For time-critical emergency decision support applications, such as influenza modeling and severe weather prediction, late results may be useless. A specialized infrastructure is needed to provide computational resources quickly. This paper describes the architecture and implementation of SPRUCE, a system for supporting urgent computing on both traditional supercomputers and distributed computing Grids. Currently deployed on the TeraGrid, SPRUCE provides users with “right-of-way tokens” that can be activated through a Web-based portal or a Web service invocation in the event of an urgent computing need. Tokens are transferable and can be restricted to specific resource sets and priority levels. Once a session is activated, job submissions may request elevated priority. Each computing resource then responds according to its local policy, for example, by preempting active jobs or raising the job’s priority in the queue. This paper also explores the strengths and weaknesses of the SPRUCE architecture and token-based activation for urgent computing applications.
Keywords: Resource Provider, Virtual Organization, Globus Toolkit, Site Administrator, High Priority Queue
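The token workflow summarized in the abstract (activate a token, submit a job with a requested priority, let each resource respond according to local policy) can be illustrated with a minimal Python sketch. This is not SPRUCE's actual interface: the class `Token`, the functions `authorize` and `respond`, the integer priority levels, the resource names, and the `SITE_POLICY` table are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, FrozenSet, Optional


@dataclass
class Token:
    """Hypothetical right-of-way token; all field names are illustrative."""
    token_id: str
    resources: FrozenSet[str]   # resources on which the token is valid
    max_priority: int           # highest priority level the token can grant
    lifetime: timedelta         # session length once the token is activated
    activated_at: Optional[datetime] = None

    def activate(self, now: datetime) -> None:
        """Start an urgent session; a token can be activated only once."""
        if self.activated_at is not None:
            raise ValueError("token already activated")
        self.activated_at = now

    def is_active(self, now: datetime) -> bool:
        """True while the activated session has not yet expired."""
        return (self.activated_at is not None
                and now < self.activated_at + self.lifetime)


def authorize(token: Token, resource: str, priority: int, now: datetime) -> bool:
    """Check a job's priority request against the token's restrictions."""
    return (token.is_active(now)
            and resource in token.resources
            and 1 <= priority <= token.max_priority)


# Assumed local policies: each site maps a priority level to its own response.
SITE_POLICY: Dict[str, Dict[int, str]] = {
    "cluster-a": {1: "raise_queue_priority", 2: "preempt_running_jobs"},
    "cluster-b": {1: "raise_queue_priority", 2: "raise_queue_priority"},
}


def respond(token: Token, resource: str, priority: int, now: datetime) -> str:
    """Return the site's response, or normal queuing if not authorized."""
    if not authorize(token, resource, priority, now):
        return "normal_queue"
    return SITE_POLICY.get(resource, {}).get(priority, "normal_queue")
```

In this sketch a request falls back to ordinary queuing whenever the token is inactive, expired, used on a resource outside its set, or asked for a priority above its limit. Keeping the response table per resource mirrors the idea that the response to an urgent job is a local decision rather than something the token dictates.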