Self-stabilization Overhead: A Case Study on Coded Atomic Storage

  • Chryssis Georgiou
  • Robert Gustafsson
  • Andreas Lindhé
  • Elad Michael SchillerEmail author
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11704)


Shared memory emulation on distributed message-passing systems can be used as a fault-tolerant and highly available distributed storage solution or as a low-level synchronization primitive. Cadambe et al. proposed the Coded Atomic Storage (CAS) algorithm, which uses erasure coding to achieve data redundancy with much lower communication cost than previous algorithmic solutions. Recently, Dolev et al. introduced a version of CAS where transient faults are included in the fault model, making it self-stabilizing. But self-stabilization comes at a cost, so in this work we examine the overhead of the algorithm by implementing a system we call CASSS (CAS Self-Stabilizing). Our system builds on the self-stabilizing version of CAS, along with several other self-stabilizing building blocks. This provides us with a powerful platform to evaluate the overhead and other aspects of the real-world applicability of the algorithm.

In our case-study, we evaluated the system performance by running it on the world-wide distributed platform PlanetLab. Our study shows that CASSS scales very well in terms of the number of servers, the number of concurrent clients, as well as the size of the replicated object. More importantly, it shows (a) to have only a constant overhead compared to the traditional CAS algorithm and (b) the recovery period (after the last occurrence of a transient fault) is no more than the time it takes to perform a few client (read/write) operations. Our results suggest that the self-stabilizing variation of CAS, which is CASSS, does not significantly impact efficiency while dealing with automatic recovery from transient faults.


  1. 1.
    Herlihy, M.P., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990)CrossRefGoogle Scholar
  2. 2.
    Attiya, H., Bar-Noy, A., Dolev, D.: Sharing memory robustly in message-passing systems. J. ACM 42(1), 124–142 (1995)CrossRefGoogle Scholar
  3. 3.
    Lynch, N., Shvartsman, A.: Communication and data sharing for dynamic distributed systems. In: Schiper, A., Shvartsman, A.A., Weatherspoon, H., Zhao, B.Y. (eds.) Future Directions in Distributed Computing. LNCS, vol. 2584, pp. 62–67. Springer, Heidelberg (2003). Scholar
  4. 4.
    Cadambe, V.R., Lynch, N., Medard, M., Musial, P.: A coded shared atomic memory algorithm for message passing architectures. Distrib. Comput. 30(1), 49–73 (2017)MathSciNetCrossRefGoogle Scholar
  5. 5.
    Dolev, S., Petig, T., Schiller, E.M.: Self-stabilizing and private distributed shared atomic memory in seldomly fair message passing networks. CoRR abs/1806.03498 (2018)Google Scholar
  6. 6.
    Dolev, S., Petig, T., Schiller, E.M.: Brief announcement: robust and private distributed shared atomic memory in message passing networks. In: ACM Symposium on Principles of Distributed Computing, PODC, pp. 311–313. ACM (2015)Google Scholar
  7. 7.
    Dolev, S., Georgiou, C., Marcoullis, I., Schiller, E.M.: Self-stabilizing reconfiguration. In: Networked Systems NETYS, pp. 51–68 (2017)CrossRefGoogle Scholar
  8. 8.
    Dijkstra, E.W.: Self-stabilizing systems in spite of distributed control. Commun. ACM 17(11), 643–644 (1974)CrossRefGoogle Scholar
  9. 9.
    Attiya, H.: Robust simulation of shared memory: 20 years after. Bull. EATCS 100, 99–113 (2010)MathSciNetzbMATHGoogle Scholar
  10. 10.
    Lynch, N., Shvartsman, A.A.: RAMBO: a reconfigurable atomic memory service for dynamic networks. In: Malkhi, D. (ed.) DISC 2002. LNCS, vol. 2508, pp. 173–190. Springer, Heidelberg (2002). Scholar
  11. 11.
    Musial, P.M., Nicolaou, N.C., Shvartsman, A.A.: Implementing distributed shared memory for dynamic networks. Commun. ACM 57(6), 88–98 (2014)CrossRefGoogle Scholar
  12. 12.
    Cadambe, V.R., Nicolaou, N.C., Konwar, K.M., Prakash, N., Lynch, N.A., Médard, M.: ARES: adaptive, reconfigurable, erasure coded, atomic storage. CoRR abs/1805.03727 (2018)Google Scholar
  13. 13.
    Nicolaou, N.C., Georgiou, C.: On the practicality of atomic MWMR register implementations. In: 10th IEEE Parallel and Distributed Processing with Applications, ISPA, pp. 340–347 (2012)Google Scholar
  14. 14.
    Fan, R., Lynch, N.: Efficient replication of large data objects. In: Fich, F.E. (ed.) DISC 2003. LNCS, vol. 2848, pp. 75–91. Springer, Heidelberg (2003). Scholar
  15. 15.
    Lynch, N.A., Shvartsman, A.A.: Robust emulation of shared memory using dynamic quorum-acknowledged broadcasts. In: 27th Fault-Tolerant Computing, pp. 272–281 (1997)Google Scholar
  16. 16.
    Dolev, S.: Self-Stabilization. MIT Press, Cambridge (2000)CrossRefGoogle Scholar
  17. 17.
    Dolev, S., Georgiou, C., Marcoullis, I., Schiller, E.M.: Self-stabilizing reconfiguration. In: El Abbadi, A., Garbinato, B. (eds.) NETYS 2017. LNCS, vol. 10299, pp. 51–68. Springer, Cham (2017). Scholar
  18. 18.
    Canini, M., Salem, I., Schiff, L., Schiller, E.M., Schmid, S.: A self-organizing distributed and in-band SDN control plane. In: ICDCS, IEEE Computer Society, pp. 2656–2657 (2017)Google Scholar
  19. 19.
    Canini, M., Salem, I., Schiff, L., Schiller, E.M., Schmid, S.: Renaissance: a self-stabilizing distributed SDN control plane. In: ICDCS, IEEE Computer Society, pp. 233–243 (2018)Google Scholar
  20. 20.
    Dolev, S., Georgiou, C., Marcoullis, I., Schiller, E.M.: Self-stabilizing byzantine tolerant replicated state machine based on failure detectors. In: Dinur, I., Dolev, S., Lodha, S. (eds.) CSCML 2018. LNCS, vol. 10879, pp. 84–100. Springer, Cham (2018). Scholar
  21. 21.
    Dolev, S., Liba, O., Schiller, E.M.: Self-stabilizing byzantine resilient topology discovery and message delivery. In: Gramoli, V., Guerraoui, R. (eds.) NETYS 2013. LNCS, vol. 7853, pp. 42–57. Springer, Heidelberg (2013). Scholar
  22. 22.
    Dolev, S., Hanemann, A., Schiller, E.M., Sharma, S.: Self-stabilizing end-to-end communication in (bounded capacity, omitting, duplicating and non-FIFO) dynamic networks. In: Richa, A.W., Scheideler, C. (eds.) SSS 2012. LNCS, vol. 7596, pp. 133–147. Springer, Heidelberg (2012). Scholar

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Chryssis Georgiou
    • 1
  • Robert Gustafsson
    • 2
    • 3
  • Andreas Lindhé
    • 2
    • 3
  • Elad Michael Schiller
    • 2
    Email author
  1. 1.University of CyprusNicosiaCyprus
  2. 2.Chalmers University of TechnologyGothenburgSweden
  3. 3.Combitech ABLinköpingSweden

Personalised recommendations