Computation Scheduling and Data Replication Algorithms for Data Grids

Ranganathan, Kavitha; Foster, Ian

doi:10.1007/978-1-4615-0509-9_22

Part of the book series: International Series in Operations Research & Management Science (ISOR, volume 64)

Abstract

Data Grids seek to harness geographically distributed resources for large-scale data-intensive problems such as those encountered in high energy physics, bioinformatics, and other disciplines. These problems typically involve numerous, loosely coupled jobs that both access and generate large data sets. Effective scheduling in such environments is challenging, because of a need to address a variety of metrics and constraints (e.g., resource utilization, response time, global and local allocation policies) while dealing with multiple, potentially independent sources of jobs and a large number of storage, compute, and network resources.

We describe a scheduling framework that addresses these problems. Within this framework, data movement operations may be either tightly bound to job scheduling decisions or performed by a decoupled, asynchronous process on the basis of observed data access patterns and load. We develop a family of job scheduling and data movement (replication) algorithms and use simulation studies to evaluate various combinations. Our results suggest that while it is necessary to consider the impact of replication on the scheduling strategy, it is not always necessary to couple data movement and computation scheduling. Instead, these two activities can be addressed separately, thus significantly simplifying the design and implementation of the overall Data Grid system.
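
The decoupling described above can be illustrated with a minimal sketch. It is not one of the chapter's algorithms: the `Site` class, the `schedule` and `replicate` functions, and both policies (prefer sites that already hold the data, replicate files whose access count reaches a threshold) are hypothetical choices made only to show job placement and data movement running as separate steps.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Site:
    """A hypothetical Grid site with local storage and a job queue."""
    name: str
    files: set = field(default_factory=set)  # data sets stored locally
    queue: int = 0                            # number of jobs waiting


def schedule(job_file, sites):
    """Pick a site for a job that reads `job_file`.

    Assumed policy: prefer sites that already hold the file and
    break ties by shortest queue. No data is moved in this step.
    """
    holders = [s for s in sites if job_file in s.files]
    candidates = holders or sites
    chosen = min(candidates, key=lambda s: s.queue)
    chosen.queue += 1
    return chosen


def replicate(access_counts, sites, threshold):
    """Replication step, run independently of scheduling.

    Assumed policy: copy each file accessed at least `threshold`
    times to the least loaded site that does not yet hold it.
    """
    moved = []
    for file_name, count in access_counts.items():
        if count < threshold:
            continue
        missing = [s for s in sites if file_name not in s.files]
        if missing:
            target = min(missing, key=lambda s: s.queue)
            target.files.add(file_name)
            moved.append((file_name, target.name))
    return moved


# Usage: three jobs read "f1"; the replicator later copies it to site "b".
sites = [Site("a", {"f1"}), Site("b")]
counts = Counter()
for _ in range(3):
    counts["f1"] += 1
    schedule("f1", sites)
moved = replicate(counts, sites, threshold=3)
```

In this sketch the scheduler only reads replica locations and never triggers a transfer, while the replicator only reads access counts and load. That separation is the property the abstract argues can simplify the overall system design.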

Author information

Authors and Affiliations

Department of Computer Science, The University of Chicago, USA
Kavitha Ranganathan & Ian Foster
Mathematics and Computer Science Division, Argonne National Laboratory, USA
Ian Foster

Editor information

Editors and Affiliations

Poznań Supercomputing and Networking Center, Poland
Jarek Nabrzyski
Argonne National Laboratory, USA
Jennifer M. Schopf
Institute of Computing Science, Poznań University of Technology, Poland
Jan Węglarz

About this chapter

Cite this chapter

Ranganathan, K., Foster, I. (2004). Computation Scheduling and Data Replication Algorithms for Data Grids. In: Nabrzyski, J., Schopf, J.M., Węglarz, J. (eds) Grid Resource Management. International Series in Operations Research & Management Science, vol 64. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0509-9_22

DOI: https://doi.org/10.1007/978-1-4615-0509-9_22
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-5112-2
Online ISBN: 978-1-4615-0509-9
eBook Packages: Springer Book Archive
