Efficient Snapshot Isolation in Paxos-Replicated Database Systems

  • Jinwei Guo
  • Peng Cai (Email author)
  • Bing Xiao
  • Weining Qian
  • Aoying Zhou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10828)

Abstract

Modern database systems are increasingly deployed on clusters of commodity machines and use Paxos-based replication to offer better performance, higher availability and fault tolerance. In the widely adopted implementation, one database replica is elected leader and is responsible for handling transaction requests. After a transaction finishes executing, the leader generates its transaction log and commits the transaction only once the log has been replicated to a majority of replicas. The state of the leader is always ahead of that of the follower replicas, because the leader commits transactions first and notifies the other replicas of the latest committed log entries only in later communication. Since a follower replica cannot immediately provide the latest snapshot, both read-write and read-only transactions have to be executed at the leader to guarantee strong snapshot isolation semantics. In this work, we design and implement an efficient snapshot isolation scheme. The scheme uses adaptive timestamp allocation to avoid frequently requesting the leader to assign transaction timestamps. Furthermore, we design an early log replay mechanism for follower replicas, which allows a follower replica to execute a read operation without waiting for log replay to generate the required snapshot. Compared with the conventional implementation, we show experimentally that the optimized snapshot isolation scheme for Paxos-replicated database systems achieves better scalability and throughput.
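
The commit rule described in the abstract, under which the leader commits a transaction once its log has reached a majority of replicas, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_commit_index`, the replica positions and the stale follower value are hypothetical, chosen only to show why a follower's view can lag behind the leader's.

```python
def majority_commit_index(acked_indexes):
    """Return the highest log index stored on a majority of replicas.

    acked_indexes holds, for every replica (leader included), the index
    of the last log entry that replica has persisted.
    """
    ranked = sorted(acked_indexes, reverse=True)
    majority = len(ranked) // 2 + 1
    # The entry at position majority-1 (0-based) is held by at least
    # `majority` replicas, so everything up to it is safe to commit.
    return ranked[majority - 1]


# Hypothetical three-replica group: leader at index 7, followers at 5 and 3.
leader_commit = majority_commit_index([7, 5, 3])

# A follower learns the commit point only from a later leader message,
# so the value it knows may be older (assumed value for illustration).
follower_known_commit = 4
lag = leader_commit - follower_known_commit
```

In this sketch the follower cannot serve a read at the leader's latest commit point until it hears about it, which is the gap that forces read-only transactions to the leader in the conventional design.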

Notes

Acknowledgments

This work is partially supported by the National High-tech R&D Program (863 Program) under grant number 2015AA015307, by NSFC under grant numbers 61432006 and 61332006, and by the Guangxi Key Laboratory of Trusted Software (kx201602).

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Jinwei Guo (1)
  • Peng Cai (1, 2), Email author
  • Bing Xiao (1)
  • Weining Qian (1)
  • Aoying Zhou (1)

  1. School of Data Science and Engineering, East China Normal University, Shanghai, People’s Republic of China
  2. Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin, People’s Republic of China