Managing Geo-replicated Data in Multi-datacenters

Agrawal, Divyakant; El Abbadi, Amr; Mahmoud, Hatem A.; Nawab, Faisal; Salem, Kenneth

doi:10.1007/978-3-642-37134-9_2

Divyakant Agrawal¹⁷,
Amr El Abbadi¹⁷,
Hatem A. Mahmoud¹⁷,
Faisal Nawab¹⁷ &
…
Kenneth Salem¹⁸

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7813))

Included in the following conference series:

International Workshop on Databases in Networked Information Systems

1603 Accesses
10 Citations

Abstract

Over the past few years, cloud computing and the growth of global large scale computing systems have led to applications which require data management across multiple datacenters. Initially the models provided single row level transactions with eventual consistency. Although protocols based on these models provide high availability, they are not ideal for applications needing a consistent view of the data. There has been now a gradual shift to provide transactions with strong consistency with Google’s Megastore and Spanner. We propose protocols for providing full transactional support while replicating data in multi-datacenter environments. First, an extension of Megastore is presented, which uses optimistic concurrency control. Second, a contrasting method is put forward, which uses gossip-based protocol for providing distributed transactions across datacenters. Our aim is to propose and evaluate different approaches for geo-replication which may be beneficial for diverse applications.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Chang, F., Dean, J., Ghemawat, S., Hsieh, W.C., Wallach, D.A., Burrows, M., Chandra, T., Fikes, A., Gruber, R.E.: Bigtable: a distributed storage system for structured data. In: Proc. 7th USENIX Symp. Operating Systems Design and Implementation, pp. 15–28 (2006)
Google Scholar
Cooper, B.F., Ramakrishnan, R., Srivastava, U., Silberstein, A., Bohannon, P., Jacobsen, H.A., Puz, N., Weaver, D., Yerneni, R.: Pnuts: Yahoo!’s hosted data serving platform. Proc. VLDB Endow. 1(2), 1277–1288 (2008)
Google Scholar
Muthukkaruppan, K.: The underlying technology of messages (2011) (acc. October 5, 2011)
Google Scholar
McKusick, K., Quinlan, S.: Gfs: evolution on fast-forward. Commun. ACM 53(3), 42–49 (2010)
Article Google Scholar
Baker, J., Bond, C., Corbett, J., Furman, J., Khorlin, A., Larson, J., Leon, J.M., Li, Y., Lloyd, A., Yushprakh, V.: Megastore: Providing scalable, highly available storage for interactive services. In: Conf. Innovative Data Systems Research, pp. 223–234 (2011)
Google Scholar
Das, S., Agrawal, D., El Abbadi, A.: G-Store: A scalable data store for transactional multi key access in the cloud. In: Proc. 1st ACM Symp. Cloud Computing, pp. 163–174 (2010)
Google Scholar
Das, S., Agrawal, D., El Abbadi, A.: Elastras: An elastic transactional data store in the cloud. In: USENIX Workshop on Hot Topics in Cloud Computing (2009); An expanded version of this paper will appear in the ACM Transactions on Database Systems
Google Scholar
Amazon.com: Summary of the Amazon EC2 and Amazon RDS service disruption in the US East Region (2011) (acc. October 5, 2011)
Google Scholar
Butcher, M.: Amazon EC2 goes down, taking with it Reddit, Foursquare and Quora (April 2011) (acc. October 5, 2011)
Google Scholar
Greene, A.: Lightning strike causes Amazon, Microsoft cloud outage in Europe. TechFlash (August 2011)
Google Scholar
Corbett, J., Dean, J., Epstein, M., Fikes, A., Frost, C., Furman, J., Ghemawat, S., Gubarev, A., Heiser, C., Hochschild, P., et al.: Spanner: Google’s globally-distributed database. To Appear in Proceedings of OSDI, 1 (2012)
Google Scholar
Chandra, T.D., Griesemer, R., Redstone, J.: Paxos made live: an engineering perspective. In: Proc. 26th ACM Symp. Principles of Distributed Computing, pp. 398–407 (2007)
Google Scholar
Lamport, L.: Paxos made simple. ACM SIGACT News 32(4), 18–25 (2001)
Google Scholar
van Renesse, R.: Paxos made moderately complex. Technical Report (2011)
Google Scholar
Gifford, D.: Weighted voting for replicated data. In: Proceedings of the Seventh ACM Symposium on Operating Systems Principles, pp. 150–162. ACM (1979)
Google Scholar
Stonebraker, M.: Concurrency Control and Consistency in Multiple Copies of Data in Distributed INGRES. IEEE Transactions on Software Engineering 3(3), 188–194 (1979)
Article Google Scholar
Thomas, R.H.: A Majority Consensus Approach to Concurrency Control for Multiple Copy Databases. ACM Transaction on Database Systems 4(2), 180–209 (1979)
Article Google Scholar
Bernstein, P.A., Goodman, N.: An Algorithm for Concurrency Control and Recovery in Replicated Distributed Databases. ACM Transactions on Database Systems 9(4), 596–615 (1984)
Article MathSciNet Google Scholar
Herlihy, M.: Replication Methods for Abstract Data Types. PhD thesis, Laboratory for Computer Science, Massachusetts Institute of Technology (May 1984)
Google Scholar
Birman, K.P.: Replication and Fault-tolerance in the ISIS System. In: Proceedings of the Tenth Symposium on Operating Systems Principles, pp. 79–86 (December 1985)
Google Scholar
El Abbadi, A., Skeen, D., Cristian, F.: An Efficient Fault-Tolerant Protocol for Replicated Data Management. In: Proceedings of the Fourth ACM Symposium on Principles of Database Systems, pp. 215–228 (March 1985)
Google Scholar
El Abbadi, A., Toueg, S.: Availability in partitioned replicated databases. In: Proceedings of the Fifth ACM Symposium on Principles of Database Systems, pp. 240–251 (March 1986)
Google Scholar
Garcia-Molina, H., Barbara, D.: How to assign votes in a distributed system. Journal of the Association of the Computing Machinery 32(4), 841–860 (1985)
Article MathSciNet MATH Google Scholar
Herlihy, M.: A Quorum-Consensus Replication Method for Abstract Data Types. ACM Transactions on Computer Systems 4(1), 32–53 (1986)
Article Google Scholar
Liskov, B., Ladin, R.: Highly Available Services in Distributed Systems. In: Proceedings of the Fifth ACM Symposium on Principles of Distributed Computing, pp. 29–39 (August 1986)
Google Scholar
Demers, A., Greene, D., Hauser, C., Irish, W., Larson, J., Shenker, S., Sturgis, H., Swinehart, D., Terry, D.: Epidemic Algorithms for Replicated Database Maintenance. In: Proceedings of the Sixth ACM Symposium on Principles of Distributed Computing, pp. 1–12 (August 1987)
Google Scholar
Jajodia, S., Mutchler, D.: Dynamic Voting. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 227–238 (June 1987)
Google Scholar
Carey, M.J., Livny, M.: Distributed concurrency control performance: A study of algorithms, distribution, and replication. In: Proceedings of the Fourteenth Conference on Very Large Data Bases, pp. 13–25 (August 1988)
Google Scholar
Agrawal, D., El Abbadi, A.: Reducing storage for quorum consensus algorithms. In: Proceedings of the Thirteenth International Conference on Very Large Data Bases, pp. 419–430 (August 1988)
Google Scholar
El Abbadi, A., Toueg, S.: Maintaining Availability in Partitioned Replicated Databases. ACM Transaction on Database Systems 14(2), 264–290 (1989)
Article Google Scholar
Agrawal, D., El Abbadi, A.: The Tree Quorum Protocol: An Efficient Approach for Managing Replicated Data. In: Proceedings of Sixteenth International Conference on Very Large Data Bases, pp. 243–254 (August 1990)
Google Scholar
Jajodia, S., Mutchler, D.: Dynamic Voting Algorithms for Maintaining the Consistency of a Replicated Database. ACM Transactions on Database Systems 15(2), 230–280 (1990)
Article Google Scholar
Agrawal, D., El Abbadi, A.: The Generalized Tree Quorum Protocol: An Efficient Approach for Managing Replicated Data. ACM Transaction on Database Systems 17(4), 689–717 (1992)
Article Google Scholar
Agrawal, D., El Abbadi, A.: Resilient Logical Structures for Efficient Management of Replicated Data. In: Proceedings of Eighteenth International Conference on Very Large Data Bases, pp. 151–162 (August 1992)
Google Scholar
Gray, J., Helland, P., O’Neil, P., Shasha, D.: The Dangers of Replication. In: Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, pp. 173–182 (June 1996)
Google Scholar
Agrawal, D., El Abbadi, A., Steinke, R.: Epidemic Algorithms in Replicated Databases. In: Proceedings of the ACM Symposium on Principles of Database Systems, pp. 161–172 (May 1997)
Google Scholar
Stanoi, I., Agrawal, D., El Abbadi, A.: Using broadcast primitives in replicated databases. In: Proceedings of the 1998 IEEE International Conference on Distributed Computing Systems, pp. 148–155 (May 1998)
Google Scholar
Lakshman, A., Malik, P.: Cassandra: a decentralized structured storage system. Operating Systems Review 44(2), 35–40 (2010)
Article Google Scholar
Burrows, M.: The chubby lock service for loosely-coupled distributed systems. In: Proceedings of the 7th Symposium on Operating Systems Design and Implementation, OSDI 2006, pp. 335–350. USENIX Association, Berkeley (2006)
Google Scholar
Hunt, P., Konar, M., Junqueira, F.P., Reed, B.: Zookeeper: wait-free coordination for internet-scale systems. In: Proc. 2010 USENIX Conference, USENIXATC 2010, p. 11. USENIX Association, Berkeley (2010)
Google Scholar
DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W.: Dynamo: Amazon’s highly available key-value store. In: Proc. 21st ACM Symp. Operating Systems Principles, pp. 205–220 (2007)
Google Scholar
HBase (2011), http://hbase.apache.org (acc. July 18, 2011)
Calder, B., Wang, J., Ogus, A., Nilakantan, N., Skjolsvold, A., McKelvie, S., Xu, Y., Srivastav, S., Wu, J., Simitci, H., et al.: Windows azure storage: a highly available cloud storage service with strong consistency. In: Proc. Twenty-Third ACM Symp. Operating Systems Principles, pp. 143–157. ACM (2011)
Google Scholar
Curino, C., Jones, E.P.C., Popa, R.A., Malviya, N., Wu, E., Madden, S., Balakrishnan, H., Zeldovich, N.: Relational cloud: a database service for the cloud. In: CIDR, pp. 235–240 (2011)
Google Scholar
Bernstein, P.A., Cseri, I., Dani, N., Ellis, N., Kalhan, A., Kakivaya, G., Lomet, D.B., Manne, R., Novik, L., Talius, T.: Adapting microsoft sql server for cloud computing. In: ICDE, pp. 1255–1263 (2011)
Google Scholar
Glendenning, L., Beschastnikh, I., Krishnamurthy, A., Anderson, T.: Scalable consistency in scatter. In: Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, SOSP 2011, pp. 15–28. ACM, New York (2011)
Chapter Google Scholar
Sovran, Y., Power, R., Aguilera, M.K., Li, J.: Transactional storage for geo-replicated systems. In: Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, SOSP 2011, pp. 385–400. ACM, New York (2011)
Chapter Google Scholar
Lloyd, W., Freedman, M.J., Kaminsky, M., Andersen, D.G.: Don’t settle for eventual: scalable causal consistency for wide-area storage with COPS. In: Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, SOSP 2011, pp. 401–416. ACM, New York (2011)
Chapter Google Scholar
Kraska, T., Pang, G., Franklin, M.J., Madden, S.: Mdcc: Multi-data center consistency. CoRR abs/1203.6049 (2012)
Google Scholar
Fischer, M., Michael, A.: Sacrificing serializability to attain high availability of data in an unreliable network. In: Proceedings of the 1st ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, pp. 70–75. ACM (1982)
Google Scholar
Wuu, G.T., Bernstein, A.J.: Efficient solutions to the replicated log and dictionary problems. In: Proceedings of the Third Annual ACM Symposium on Principles of Distributed Computing, PODC 1984, pp. 233–242. ACM, New York (1984)
Chapter Google Scholar
Kaashoek, M.F., Tanenbaum, A.S.: Group Communication in the Amoeba Distributed Operating Systems. In: Proceedings of the 11th International Conference on Distributed Computing Systems, 222–230 (May 1991)
Google Scholar
Amir, Y., Dolev, D., Kramer, S., Malki, D.: Membership Algorithms for Multicast Communication Groups. In: Segall, A., Zaks, S. (eds.) WDAG 1992. LNCS, vol. 647, pp. 292–312. Springer, Heidelberg (1992)
Chapter Google Scholar
Amir, Y., Moser, L.E., Melliar-Smith, P.M., Agarwal, D.A., Ciarfella, P.: The Totem Single-Ring Ordering and Membership Protocol. ACM Transactions on Computer Systems 13(4), 311–342 (1995)
Article Google Scholar
Neiger, G.: A New Look at Membership Services. In: Proceedings of the ACM Symposium on Principles of Distributed Computing (1996)
Google Scholar
Patterson, S., Elmore, A.J., Nawab, F., Agrawal, D., Abbadi, A.E.: Serializability, not serial: Concurrency control and availability in multi-datacenter datastores. PVLDB 5(11), 1459–1470 (2012)
Google Scholar
Lamport, L.: The part-time parliament. ACM Trans. Computer Systems 16(2), 133–169 (1998)
Article Google Scholar
Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21(7), 558–565 (1978)
Article MATH Google Scholar
Bernstein, P.A., Hadzilacos, V., Goodman, N.: Concurrency Control and Recovery in Database Systems. Addison-Wesley (1987)
Google Scholar
Adya, A., Liskov, B., O’Neil, P.E.: Generalized isolation level definitions. In: ICDE, pp. 67–78 (2000)
Google Scholar
Lin, Y., Kemme, B., Jiménez-Peris, R., Patiño Martínez, M., Armendáriz-Iñigo, J.E.: Snapshot isolation and integrity constraints in replicated databases. ACM Trans. Database Syst. 34(2), 11:1–11:49 (2009)
Google Scholar
Wu, S., Kemme, B.: Postgres-r(si): Combining replica control with concurrency control based on snapshot isolation. In: ICDE, pp. 422–433 (2005)
Google Scholar

Download references

Author information

Authors and Affiliations

Department of Computer Science, University of California at Santa Barbara, USA
Divyakant Agrawal, Amr El Abbadi, Hatem A. Mahmoud & Faisal Nawab
School of Computer Science, University of Waterloo, Canada
Kenneth Salem

Authors

Divyakant Agrawal
View author publications
You can also search for this author in PubMed Google Scholar
Amr El Abbadi
View author publications
You can also search for this author in PubMed Google Scholar
Hatem A. Mahmoud
View author publications
You can also search for this author in PubMed Google Scholar
Faisal Nawab
View author publications
You can also search for this author in PubMed Google Scholar
Kenneth Salem
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Graduate Department of Computer and Information Systems, University of Aizu, Ikki Machi, 965-8580, Aizu-Wakamatsu, Fukushima, Japan
Aastha Madaan , Shinji Kikuchi & Subhash Bhalla , &

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Agrawal, D., El Abbadi, A., Mahmoud, H.A., Nawab, F., Salem, K. (2013). Managing Geo-replicated Data in Multi-datacenters. In: Madaan, A., Kikuchi, S., Bhalla, S. (eds) Databases in Networked Information Systems. DNIS 2013. Lecture Notes in Computer Science, vol 7813. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37134-9_2

Download citation

DOI: https://doi.org/10.1007/978-3-642-37134-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37133-2
Online ISBN: 978-3-642-37134-9
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics