Abstract
Data Grids are increasingly popular in novel, demanding and data-intensive eScience applications. In these applications, vast amounts of data, generated by specialized instruments, need to be collaboratively accessed, processed and analyzed by a large number of users spread across several organizations. The nearly unlimited storage capabilities of Data Grids allow these data to be replicated at different sites in order to guarantee a high degree of availability. For updateable data objects, several replicas per object need to be maintained in an eager way. In addition, read-only copies serve users' needs for data with different levels of freshness. The number of updateable replicas has to be adapted dynamically to optimize the trade-off between synchronization overhead and the gain achieved by balancing the load of update transactions. Due to the particular characteristics of the Grid, especially the absence of a global coordinator, replication management needs to be provided in a completely distributed way. This includes the synchronization of concurrent updates as well as the dynamic deployment and undeployment of replicas based on actual access characteristics, which might change over time. In this paper, we present the Re:GRIDiT approach to dynamic replica deployment and undeployment in the Grid. Based on a combination of local load statistics, proximity and data access patterns, Re:GRIDiT dynamically adds new replicas or removes existing ones without impacting global correctness. In addition, we provide a detailed evaluation of the overall performance of the dynamic Re:GRIDiT protocol, which shows increased throughput compared with the replication management protocol with a static number of replicas.
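The abstract does not state the concrete decision rule, so the following is only an illustrative sketch of how a site might combine local load statistics, proximity and access patterns to decide on deploying or undeploying a replica. It is not the Re:GRIDiT protocol itself. The names `SiteStats` and `decide`, all fields, and the thresholds `high_load` and `low_load` are assumptions introduced for this example, and the sketch ignores the synchronization of concurrent updates entirely.

```python
from dataclasses import dataclass


@dataclass
class SiteStats:
    """Locally observed statistics for one site (all fields are hypothetical)."""
    load: float         # fraction of capacity in use, 0.0 to 1.0
    update_rate: float  # update transactions per second seen at this site
    distance: float     # proximity measure to the requesting clients, lower is closer


def decide(replicas, candidates, high_load=0.8, low_load=0.2, min_replicas=1):
    """Return ("deploy", site), ("undeploy", site) or ("keep", None).

    replicas:   dict mapping site name -> SiteStats for sites holding an updateable replica
    candidates: dict mapping site name -> SiteStats for sites that could host a new one
    The thresholds are assumed values, not taken from the paper.
    """
    if not replicas:
        raise ValueError("at least one updateable replica is required")

    avg_load = sum(s.load for s in replicas.values()) / len(replicas)

    # Overloaded: add the closest candidate that is not itself overloaded.
    if avg_load > high_load:
        usable = {name: s for name, s in candidates.items() if s.load < high_load}
        if usable:
            best = min(usable, key=lambda name: usable[name].distance)
            return ("deploy", best)

    # Underloaded: remove the replica that receives the fewest updates,
    # but never drop below the minimum number of replicas.
    if avg_load < low_load and len(replicas) > min_replicas:
        idlest = min(replicas, key=lambda name: replicas[name].update_rate)
        return ("undeploy", idlest)

    return ("keep", None)
```

In a setting without a global coordinator, each site would evaluate such a rule on its own local view; how those local decisions are reconciled without violating global correctness is exactly what the paper addresses and is not modeled here.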
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Voicu, L.C., Schuldt, H. (2009). Load-Aware Dynamic Replication Management in a Data Grid. In: Meersman, R., Dillon, T., Herrero, P. (eds) On the Move to Meaningful Internet Systems: OTM 2009. OTM 2009. Lecture Notes in Computer Science, vol 5870. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05148-7_15
DOI: https://doi.org/10.1007/978-3-642-05148-7_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05147-0
Online ISBN: 978-3-642-05148-7
eBook Packages: Computer Science, Computer Science (R0)