Encyclopedia of Big Data Technologies

2019 Edition
Editors: Sherif Sakr, Albert Y. Zomaya

Geo-Scale Transaction Processing

  • Faisal Nawab
Reference work entry
DOI: https://doi.org/10.1007/978-3-319-77525-8_180

Synonyms

Definitions

Geo-scale transaction processing concerns the processing of transactions on nodes that are separated by wide-area links.

Overview

Replicating and distributing data across nodes serves several objectives, including fault tolerance, load balancing, and read availability. The practice dates back to the early days of computing (Kemme et al. 2010; Bernstein and Goodman 1981) and continues to evolve as new computing technologies emerge. Publicly accessible cloud resources dispersed around the world now make it possible to replicate and distribute data across large distances, potentially spanning many continents. Such a configuration is called a geo-scale deployment, and it allows the objectives of replication and distribution to be met to a higher degree. For example, geo-scale...


References

  1. Agarwal S, Dunagan J, Jain N, Saroiu S, Wolman A, Bhogan H (2010) Volley: automated data placement for geo-distributed cloud services. In: Proceedings of the 7th USENIX conference on networked systems design and implementation, USENIX association, NSDI’10, Berkeley, pp 2–2. http://dl.acm.org/citation.cfm?id=1855711.1855713
  2. Ardekani MS, Terry DB (2014) A self-configurable geo-replicated cloud storage system. In: Proceedings of the 11th USENIX conference on operating systems design and implementation, USENIX association, OSDI’14, Berkeley, pp 367–381. http://dl.acm.org/citation.cfm?id=2685048.2685077
  3. Bailis P, Ghodsi A (2013) Eventual consistency today: limitations, extensions, and beyond. Queue 11(3):20:20–20:32. http://doi.acm.org/10.1145/2460276.2462076
  4. Bailis P, Ghodsi A, Hellerstein JM, Stoica I (2013) Bolt-on causal consistency. In: Proceedings of the 2013 ACM SIGMOD international conference on management of data, SIGMOD’13. ACM, New York, pp 761–772. http://doi.acm.org/10.1145/2463676.2465279
  5. Bailis P, Fekete A, Franklin MJ, Ghodsi A, Hellerstein JM, Stoica I (2014) Coordination avoidance in database systems. Proc VLDB Endow 8(3):185–196. https://doi.org/10.14778/2735508.2735509
  6. Baker J, Bond C, Corbett JC, Furman JJ, Khorlin A, Larson J, Leon J, Li Y, Lloyd A, Yushprakh V (2011) Megastore: providing scalable, highly available storage for interactive services. In: CIDR 2011, Fifth biennial conference on innovative data systems research, Asilomar, pp 223–234, 9–12 Jan 2011. Online Proceedings. http://cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf
  7. Berenson H, Bernstein P, Gray J, Melton J, O’Neil E, O’Neil P (1995) A critique of ansi SQL isolation levels, pp 1–10. http://doi.acm.org/10.1145/223784.223785
  8. Bernstein PA, Goodman N (1981) Concurrency control in distributed database systems. ACM Comput Surv (CSUR) 13(2):185–221
  9. Bernstein PA, Hadzilacos V, Goodman N (1987) Concurrency control and recovery in database systems. Addison-Wesley, Reading
  10. Cooper BF, Ramakrishnan R, Srivastava U, Silberstein A, Bohannon P, Jacobsen HA, Puz N, Weaver D, Yerneni R (2008) Pnuts: Yahoo!’s hosted data serving platform. Proc VLDB Endow 1(2):1277–1288. https://doi.org/10.14778/1454159.1454167
  11. Corbett JC, Dean J, Epstein M, Fikes A, Frost C, Furman JJ, Ghemawat S, Gubarev A, Heiser C, Hochschild P, Hsieh W, Kanthak S, Kogan E, Li H, Lloyd A, Melnik S, Mwaura D, Nagle D, Quinlan S, Rao R, Rolig L, Saito Y, Szymaniak M, Taylor C, Wang R, Woodford D (2012) Spanner: Google’s globally-distributed database, pp 251–264. http://dl.acm.org/citation.cfm?id=2387880.2387905
  12. Daudjee K, Salem K (2006) Lazy database replication with snapshot isolation. In: Proceedings of the 32nd international conference on very large data bases, VLDB Endowment, VLDB’06, pp 715–726. http://dl.acm.org/citation.cfm?id=1182635.1164189
  13. DeCandia G, Hastorun D, Jampani M, Kakulapati G, Lakshman A, Pilchin A, Sivasubramanian S, Vosshall P, Vogels W (2007) Dynamo: Amazon’s highly available key-value store. In: Proceedings of twenty-first ACM SIGOPS symposium on operating systems principles, SOSP’07. ACM, New York, pp 205–220. http://doi.acm.org/10.1145/1294261.1294281
  14. Du J, Elnikety S, Roy A, Zwaenepoel W (2013a) Orbe: scalable causal consistency using dependency matrices and physical clocks. In: Proceedings of the 4th annual symposium on cloud computing, SOCC’13. ACM, New York, pp 11:1–11:14. http://doi.acm.org/10.1145/2523616.2523628
  15. Du J, Elnikety S, Zwaenepoel W (2013b) Clock-si: Snapshot isolation for partitioned data stores using loosely synchronized clocks. In: Proceedings of the 2013 IEEE 32nd international symposium on reliable distributed systems, SRDS’13. IEEE Computer Society, Washington, pp 173–184. https://doi.org/10.1109/SRDS.2013.26
  16. Endo PT, de Almeida Palhares AV, Pereira NN, Goncalves GE, Sadok D, Kelner J, Melander B, Mangs JE (2011) Resource allocation for distributed cloud: concepts and research challenges. IEEE Netw 25(4):42–46. https://doi.org/10.1109/MNET.2011.5958007
  17. Glendenning L, Beschastnikh I, Krishnamurthy A, Anderson T (2011) Scalable consistency in scatter. In: Proceedings of the twenty-third ACM symposium on operating systems principles, SOSP’11. ACM, New York, pp 15–28. http://doi.acm.org/10.1145/2043556.2043559
  18. Kemme B, Jimenez-Peris R, Patino-Martinez M (2010) Database replication. Synth Lect Data Manage 2(1):1–153. http://www.morganclaypool.com/doi/abs/10.2200/S00296ED1V01Y201008DTM007
  19. Kraska T, Pang G, Franklin MJ, Madden S, Fekete A (2013) MDCC: multi-data center consistency. In: Proceedings of the 8th ACM European conference on computer systems, EuroSys’13. ACM, New York, pp 113–126. http://doi.acm.org/10.1145/2465351.2465363
  20. Lamport L (1978) Time, clocks, and the ordering of events in a distributed system. Commun ACM 21(7):558–565. http://doi.acm.org/10.1145/359545.359563
  21. Lamport L (1998) The part-time parliament. ACM Trans Comput Syst 16(2):133–169. http://doi.acm.org/10.1145/279227.279229
  22. Lamport L (2005) Generalized consensus and paxos. Technical report, MSR-TR-2005-33, Microsoft Research
  23. Lamport L (2006) Fast paxos. Distrib Comput 19(2):79–103
  24. Lin Y, Kemme B, Patiño Martínez M, Jiménez-Peris R (2005) Middleware based data replication providing snapshot isolation. In: Proceedings of the 2005 ACM SIGMOD international conference on management of data, SIGMOD’05. ACM, New York, pp 419–430. http://doi.acm.org/10.1145/1066157.1066205
  25. Lin Y, Kemme B, Patino-Martinez M, Jimenez-Peris R (2007) Enhancing edge computing with database replication. In: Proceedings of the 26th IEEE international symposium on reliable distributed systems, SRDS’07. IEEE Computer Society, Washington, pp 45–54. http://dl.acm.org/citation.cfm?id=1308172.1308219
  26. Lloyd W, Freedman MJ, Kaminsky M, Andersen DG (2011) Don’t settle for eventual: scalable causal consistency for wide-area storage with cops. In: Proceedings of the twenty-third ACM symposium on operating systems principles, SOSP’11. ACM, New York, pp 401–416. http://doi.acm.org/10.1145/2043556.2043593
  27. Lloyd W, Freedman MJ, Kaminsky M, Andersen DG (2013) Stronger semantics for low-latency geo-replicated storage. In: Proceedings of the 10th USENIX conference on networked systems design and implementation, NSDI’13. USENIX Association, Berkeley, pp 313–328. http://dl.acm.org/citation.cfm?id=2482626.2482657
  28. Mahmoud H, Nawab F, Pucher A, Agrawal D, El Abbadi A (2013) Low-latency multi-datacenter databases using replicated commit. Proc VLDB Endow 6(9):661–672. https://doi.org/10.14778/2536360.2536366
  29. Moraru I, Andersen DG, Kaminsky M (2013) There is more consensus in Egalitarian parliaments. In: Proceedings of the twenty-fourth ACM symposium on operating systems principles, SOSP’13. ACM, New York, pp 358–372. http://doi.acm.org/10.1145/2517349.2517350
  30. Nawab F, Agrawal D, El Abbadi A (2013) Message futures: fast commitment of transactions in multi-datacenter environments. In: CIDR 2013, sixth biennial conference on innovative data systems research, Asilomar, 6–9 Jan 2013. Online Proceedings. http://cidrdb.org/cidr2013/Papers/CIDR13_Paper103.pdf
  31. Nawab F, Arora V, Agrawal D, El Abbadi A (2015a) Chariots: a scalable shared log for data management in multi-datacenter cloud environments. In: Proceedings of the 18th international conference on extending database technology, EDBT 2015, Brussels, 23–27 Mar 2015, pp 13–24. https://doi.org/10.5441/002/edbt.2015.03
  32. Nawab F, Arora V, Agrawal D, El Abbadi A (2015b) Minimizing commit latency of transactions in geo-replicated data stores. In: Proceedings of the 2015 ACM SIGMOD international conference on management of data, SIGMOD’15. ACM, New York, pp 1279–1294. http://doi.acm.org/10.1145/2723372.2723729
  33. Pang G, Kraska T, Franklin MJ, Fekete A (2014) Planet: making progress with commit processing in unpredictable environments. In: Proceedings of the 2014 ACM SIGMOD international conference on management of data, SIGMOD’14. ACM, New York, pp 3–14. http://doi.acm.org/10.1145/2588555.2588558
  34. Patterson S, Elmore AJ, Nawab F, Agrawal D, El Abbadi A (2012) Serializability, not serial: concurrency control and availability in multi-datacenter datastores. Proc VLDB Endow 5(11):1459–1470. https://doi.org/10.14778/2350229.2350261
  35. Pu Q, Ananthanarayanan G, Bodik P, Kandula S, Akella A, Bahl P, Stoica I (2015) Low latency geo-distributed data analytics. In: Proceedings of the 2015 ACM conference on special interest group on data communication, SIGCOMM’15. ACM, New York, pp 421–434. http://doi.acm.org/10.1145/2785956.2787505
  36. Roy S, Kot L, Bender G, Ding B, Hojjat H, Koch C, Foster N, Gehrke J (2015) The homeostasis protocol: avoiding transaction coordination through program analysis. In: Proceedings of the 2015 ACM SIGMOD international conference on management of data, SIGMOD’15. ACM, New York, pp 1311–1326. http://doi.acm.org/10.1145/2723372.2723720
  37. Shankaranarayanan PN, Sivakumar A, Rao S, Tawarmalani M (2014) Performance sensitive replication in geo-distributed cloud datastores. In: Proceedings of the 2014 44th annual IEEE/IFIP international conference on dependable systems and networks, DSN’14. IEEE Computer Society, Washington, pp 240–251. https://doi.org/10.1109/DSN.2014.34
  38. Sharov A, Shraer A, Merchant A, Stokely M (2015) Take me to your leader!: online optimization of distributed storage configurations. Proc VLDB Endow 8(12):1490–1501. https://doi.org/10.14778/2824032.2824047
  39. Sovran Y, Power R, Aguilera MK, Li J (2011) Transactional storage for geo-replicated systems. In: Proceedings of the twenty-third ACM symposium on operating systems principles, SOSP’11. ACM, New York, pp 385–400. http://doi.acm.org/10.1145/2043556.2043592
  40. Terry DB, Prabhakaran V, Kotla R, Balakrishnan M, Aguilera MK, Abu-Libdeh H (2013) Consistency-based service level agreements for cloud storage. In: Proceedings of the twenty-fourth ACM symposium on operating systems principles, SOSP’13. ACM, New York, pp 309–324. http://doi.acm.org/10.1145/2517349.2522731
  41. Vulimiri A, Curino C, Godfrey PB, Jungblut T, Padhye J, Varghese G (2015) Global analytics in the face of bandwidth and regulatory constraints. In: Proceedings of the 12th USENIX conference on networked systems design and implementation, NSDI’15. USENIX Association, Berkeley, pp 323–336. http://dl.acm.org/citation.cfm?id=2789770.2789793
  42. Wu Z, Butkiewicz M, Perkins D, Katz-Bassett E, Madhyastha HV (2013) Spanstore: cost-effective geo-replicated storage spanning multiple cloud services. In: Proceedings of the twenty-fourth ACM symposium on operating systems principles, SOSP’13. ACM, New York, pp 292–308. http://doi.acm.org/10.1145/2517349.2522730
  43. Wu Z, Yu C, Madhyastha HV (2015) Costlo: cost-effective redundancy for lower latency variance on cloud storage services. In: Proceedings of the 12th USENIX conference on networked systems design and implementation, NSDI’15. USENIX Association, Berkeley, pp 543–557. http://dl.acm.org/citation.cfm?id=2789770.2789808
  44. Zakhary V, Nawab F, Agrawal D, El Abbadi A (2016) Db-risk: the game of global database placement. In: Proceedings of the 2016 international conference on management of data, SIGMOD’16. ACM, New York, pp 2185–2188. http://doi.acm.org/10.1145/2882903.2899405
  45. Zakhary V, Nawab F, Agrawal D, El Abbadi A (2018) Global-scale placement of transactional data stores. In: Proceedings of the 2018 international conference on extending database technology, EDBT’18
  46. Zhang Y, Power R, Zhou S, Sovran Y, Aguilera MK, Li J (2013) Transaction chains: achieving serializability with low latency in geo-distributed storage systems. In: Proceedings of the twenty-fourth ACM symposium on operating systems principles, SOSP’13. ACM, New York, pp 276–291. http://doi.acm.org/10.1145/2517349.2522729

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of California, Santa Cruz, USA