Leader Election Using NewSQL Database Systems

  • Salman NiaziEmail author
  • Mahmoud Ismail
  • Gautier Berthou
  • Jim Dowling
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9038)


Leader election protocols are a fundamental building block for replicated distributed services. They ease the design of leader-based coordination protocols that tolerate failures. In partially synchronous systems, designing a leader election algorithm, that does not permit multiple leaders while the system is unstable, is a complex task. As a result many production systems use third-party distributed coordination services, such as ZooKeeper and Chubby, to provide a reliable leader election service. However, adding a third-party service such as ZooKeeper to a distributed system incurs additional operational costs and complexity. ZooKeeper instances must be kept running on at least three machines to ensure its high availability. In this paper, we present a novel leader election protocol using NewSQL databases for partially synchronous systems, that ensures at most one leader at any given time. The leader election protocol uses the database as distributed shared memory. Our work enables distributed systems that already use NewSQL databases to save the operational overhead of managing an additional third-party service for leader election. Our main contribution is the design, implementation and validation of a practical leader election algorithm, based on NewSQL databases, that has performance comparable to a leader election implementation using a state-of-the-art distributed coordination service, ZooKeeper.


Shared Memory Leader Election Leader Process Synchronous System Clock Drift 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


  1. 1.
    Lamport, L.: Paxos made simple. ACM Sigact News 32(4), 18–25 (2001)Google Scholar
  2. 2.
    Chandra, T.D., Hadzilacos, V., Toueg, S.: The weakest failure detector for solving consensus. J. ACM 43(4), 685–722 (1996)CrossRefzbMATHMathSciNetGoogle Scholar
  3. 3.
    Junqueira, F.P., Reed, B.C.: The life and times of a zookeeper. In: Proceedings of the 28th ACM Symposium on Principles of Distributed Computing, p. 4. ACM (2009)Google Scholar
  4. 4.
    Shvachko, K., Kuang, H., Radia, S., Chansler, R.: The hadoop distributed file system. In: Mass Storage Systems and Technologies, pp. 1–10 (May 2010)Google Scholar
  5. 5.
    Ronström, M., Oreland, J.: Recovery Principles of MySQL Cluster 5.1. In: Proc. of VLDB 2005, pp. 1108–1115. VLDB Endowment (2005)Google Scholar
  6. 6.
    Burrows, M.: The chubby lock service for loosely-coupled distributed systems. In: Proceedings of the 7th Symposium on Operating Systems Design and Implementation, OSDI 2006, pp. 335–350. USENIX Association, Berkeley (2006)Google Scholar
  7. 7.
    Hunt, P., Konar, M., Junqueira, F.P., Reed, B.: Zookeeper: Wait-free coordination for internet-scale systems. In: Proceedings of the 2010 USENIX Conference on USENIX Annual Technical Conference, USENIXATC 2010, p. 11 (2010)Google Scholar
  8. 8.
    Guerraoui, R., Raynal, M.: A Leader Election Protocol for Eventually Synchronous Shared Memory Systems, pp. 75–80. IEEE Computer Society, Alamitos (2006)Google Scholar
  9. 9.
    Fernandez, A., Jimenez, E., Raynal, M.: Electing an eventual leader in an asynchronous shared memory system. In: Dependable Systems and Networks, DSN 2007, pp. 399–408 (June 2007)Google Scholar
  10. 10.
    Fernandez, A., Jimenez, E., Raynal, M., Tredan, G.: A timing assumption and a t-resilient protocol for implementing an eventual leader service in asynchronous shared memory systems. In: ISORC 2007, pp. 71–78 (May 2007)Google Scholar
  11. 11.
  12. 12.
    Mysql :: The world’s most popular open source database @ONLINE (January 2015),
  13. 13.
    White paper: Xa and oracle controlled distributed transactions @ONLINE (June 2010),
  14. 14.
    Lakshman, A., Malik, P.: Cassandra: A decentralized structured storage system. SIGOPS Oper. Syst. Rev. 44, 35–40 (2010)CrossRefGoogle Scholar
  15. 15.
    Riak, basho technologies @ONLINE (January 2015),
  16. 16.
    Stonebraker, M., Madden, S., Abadi, D.J., Harizopoulos, S., Hachem, N., Helland, P.: The end of an architectural era (it’s time for a complete rewrite). In: Proceedings of the 33rd International VLDB Conference, pp. 1150–1160 (2007)Google Scholar
  17. 17.
    Özcan, F., Tatbul, N., Abadi, D.J., Kornacker, M., Mohan, C., Ramasamy, K., Wiener, J.: Are we experiencing a big data bubble? In: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, pp. 1407–1408 (2014)Google Scholar
  18. 18.
    Multi-model database – foundationdb @ONLINE (January 2015),
  19. 19.
    In-memory database, newsql and real-time analytics – voltdb @ONLINE (January 2015),
  20. 20.
    Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. J. ACM 32, 374–382 (1985)CrossRefzbMATHMathSciNetGoogle Scholar
  21. 21.
    Thomson, A., Diamond, T., Weng, S.-C., Ren, K., Shao, P., Abadi, D.J.: Calvin: Fast Distributed Transactions for Partitioned Database Systems. In: Proc. of SIGMOD 2012, pp. 1–12. ACM (2012)Google Scholar
  22. 22.
    Sanz-Marco, V., Zolda, M., Kirner, R.: Efficient leader election for synchronous shared-memory systems. In: Proc. Int’l Workshop on Performance, Power and Predictability of Many-Core Embedded Systems (3PMCES 2014). Electronic Chips and Systems Design Initiative (ECSI) (March 2014)Google Scholar
  23. 23.
    Aguilera, M., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: On implementing omega in systems with weak reliability and synchrony assumptions. Distributed Computing 21(4), 285–314 (2008)CrossRefzbMATHGoogle Scholar
  24. 24.
    Aguilera, M.K., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: Communication-efficient leader election and consensus with limited link synchrony. In: Proceedings of the Twenty-Third Annual ACM Symposium on Principles of Distributed Computing, PODC 2004, pp. 328–337. ACM, New York (2004)CrossRefGoogle Scholar
  25. 25.
    Larrea, M., Fernandez, A., Arevalo, S., Carlos, J., Carlos, J.: Optimal implementation of the weakest failure detector for solving consensus, pp. 52–59. IEEE Computer Society PressGoogle Scholar
  26. 26.
    Mostefaoui, A., Mourgaya, E., Raynal, M.: Asynchronous implementation of failure detectors. In: Proceedings of the 2003 International Conference on Dependable Systems and Networks, pp. 351–360 (June 2003)Google Scholar

Copyright information

© IFIP International Federation for Information Processing 2015

Authors and Affiliations

  • Salman Niazi
    • 1
    Email author
  • Mahmoud Ismail
    • 1
  • Gautier Berthou
    • 2
  • Jim Dowling
    • 1
    • 2
  1. 1.KTH - Royal Institute of TechnologyStockholmSweden
  2. 2.SICS - Swedish ICTStockholmSweden

Personalised recommendations