What You Always Wanted to Know About Model Checking of Fault-Tolerant Distributed Algorithms

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9609)

Abstract

Distributed algorithms have numerous mission-critical applications in embedded avionic and automotive systems, cloud computing, computer networks, hardware design, and the internet of things. Although distributed algorithms exhibit complex interactions with their computing environment and are difficult for human engineers to understand, computer science has developed only very limited tool support to catch logical errors in distributed algorithms at design time.

In the last two decades, we have witnessed revolutionary progress in software model checking due to the development of powerful techniques such as abstract model checking, SMT solving, and partial order reduction. Still, model checking of fault-tolerant distributed algorithms poses multiple research challenges, most notably parameterized verification: verifying an algorithm for all system sizes and different combinations of faults. In this paper, we survey our recent results in this area, which extend and combine abstraction, partial orders, and bounded model checking. Our results demonstrate that model checking has acquired sufficient critical mass to build the theory and the practical tools for the formal verification of large classes of distributed algorithms.

Supported by the Austrian Science Fund (FWF) through the National Research Network RiSE (S11403 and S11405) and project P27722 (PRAVDA), and by the Vienna Science and Technology Fund (WWTF) through project ICT15-103 (APALACHE).

Notes

  1. http://forsyte.at/software/bymc/.

  2. Although NuSMV implements bounded model checking for LTL, our present results guarantee completeness only for safety properties.

  3. https://github.com/konnov/fault-tolerant-benchmarks/.

  4. http://forsyte.at/apalache/.

References

  1. Alberti, F., Ghilardi, S., Pagani, E., Ranise, S., Rossi, G.P.: Universal guards, relativization of quantifiers, and failure models in model checking modulo theories. JSAT 8(1/2), 29–61 (2012)

  2. Attiya, H., Welch, J.: Distributed Computing, 2nd edn. Wiley, New York (2004)

  3. Biere, A., Cimatti, A., Clarke, E., Zhu, Y.: Symbolic model checking without BDDs. In: Cleaveland, W.R. (ed.) TACAS 1999. LNCS, vol. 1579, pp. 193–207. Springer, Heidelberg (1999)

  4. Bloem, R., Jacobs, S., Khalimov, A., Konnov, I., Rubin, S., Veith, H., Widder, J.: Decidability of Parameterized Verification. Synthesis Lectures on Distributed Computing Theory. Morgan & Claypool Publishers, San Rafael (2015)

  5. Bracha, G., Toueg, S.: Asynchronous consensus and broadcast protocols. J. ACM 32(4), 824–840 (1985)

  6. Brasileiro, F., Greve, F.G.P., Mostéfaoui, A., Raynal, M.: Consensus in one communication step. In: Malyshkin, V.E. (ed.) PaCT 2001. LNCS, vol. 2127, pp. 42–50. Springer, Heidelberg (2001)

  7. Cavada, R., Cimatti, A., Dorigatti, M., Griggio, A., Mariotti, A., Micheli, A., Mover, S., Roveri, M., Tonetta, S.: The nuXmv symbolic model checker. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 334–342. Springer, Heidelberg (2014)

  8. Chandra, T.D., Toueg, S.: Unreliable failure detectors for reliable distributed systems. J. ACM 43(2), 225–267 (1996)

  9. Cimatti, A., Griggio, A., Schaafsma, B.J., Sebastiani, R.: The MathSAT5 SMT Solver. In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013 (ETAPS 2013). LNCS, vol. 7795, pp. 93–107. Springer, Heidelberg (2013)

  10. Clarke, E., Grumberg, O., Jha, S., Lu, Y., Veith, H.: Counterexample-guided abstraction refinement for symbolic model checking. J. ACM 50(5), 752–794 (2003)

  11. de Moura, L., Bjørner, N.S.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008)

  12. Dobre, D., Suri, N.: One-step consensus with zero-degradation. In: DSN, pp. 137–146 (2006)

  13. Drăgoi, C., Henzinger, T.A., Veith, H., Widder, J., Zufferey, D.: A logic-based framework for verifying consensus algorithms. In: McMillan, K.L., Rival, X. (eds.) VMCAI 2014. LNCS, vol. 8318, pp. 161–181. Springer, Heidelberg (2014)

  14. Dwork, C., Lynch, N., Stockmeyer, L.: Consensus in the presence of partial synchrony. J. ACM 35(2), 288–323 (1988)

  15. Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. J. ACM 32(2), 374–382 (1985)

  16. Gmeiner, A., Konnov, I., Schmid, U., Veith, H., Widder, J.: Tutorial on parameterized model checking of fault-tolerant distributed algorithms. In: Bernardo, M., Damiani, F., Hähnle, R., Johnsen, E.B., Schaefer, I. (eds.) SFM 2014. LNCS, vol. 8483, pp. 122–171. Springer, Heidelberg (2014)

  17. Guerraoui, R.: Non-blocking atomic commit in asynchronous distributed systems with failure detectors. Distrib. Comput. 15(1), 17–25 (2002)

  18. Holzmann, G.: The SPIN Model Checker. Addison-Wesley, Boston (2003)

  19. Imbs, D., Raynal, M.: Simple and efficient reliable broadcast in the presence of Byzantine processes. CoRR abs/1510.06882 (2015). http://arxiv.org/abs/1510.06882

  20. John, A., Konnov, I., Schmid, U., Veith, H., Widder, J.: Brief announcement: parameterized model checking of fault-tolerant distributed algorithms by abstraction. In: PODC, pp. 119–121 (2013)

  21. John, A., Konnov, I., Schmid, U., Veith, H., Widder, J.: Parameterized model checking of fault-tolerant distributed algorithms by abstraction. In: FMCAD, pp. 201–209 (2013)

  22. John, A., Konnov, I., Schmid, U., Veith, H., Widder, J.: Towards modeling and model checking fault-tolerant distributed algorithms. In: Bartocci, E., Ramakrishnan, C.R. (eds.) SPIN 2013. LNCS, vol. 7976, pp. 209–226. Springer, Heidelberg (2013)

  23. Konnov, I., Veith, H., Widder, J.: On the completeness of bounded model checking for threshold-based distributed algorithms: reachability. In: Baldan, P., Gorla, D. (eds.) CONCUR 2014. LNCS, vol. 8704, pp. 125–140. Springer, Heidelberg (2014)

  24. Konnov, I., Veith, H., Widder, J.: SMT and POR beat counter abstraction: parameterized model checking of threshold-based distributed algorithms. In: Kroening, D., Păsăreanu, C.S. (eds.) CAV 2015. LNCS, vol. 9206, pp. 85–102. Springer, Heidelberg (2015)

  25. Lamport, L.: The part-time parliament. ACM Trans. Comput. Syst. 16(2), 133–169 (1998)

  26. Lamport, L.: Paxos made simple. ACM SIGACT News 32(4), 18–25 (2001)

  27. Lamport, L.: Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers. Addison-Wesley Longman Publishing Co., Inc., Boston (2002)

  28. Lynch, N.: Distributed Algorithms. Morgan Kaufmann, San Francisco (1996)

  29. Mostéfaoui, A., Mourgaya, E., Parvédy, P.R., Raynal, M.: Evaluating the condition-based approach to solve consensus. In: DSN, pp. 541–550 (2003)

  30. Netflix: 5 lessons we have learned using AWS (2010). http://techblog.netflix.com/2010/12/5-lessons-weve-learned-using-aws.html

  31. Pease, M., Shostak, R., Lamport, L.: Reaching agreement in the presence of faults. J. ACM 27(2), 228–234 (1980)

  32. Pnueli, A., Xu, J., Zuck, L.D.: Liveness with \((0,1,\infty)\)-counter abstraction. In: Brinksma, E., Larsen, K.G. (eds.) CAV 2002. LNCS, vol. 2404, pp. 107–122. Springer, Heidelberg (2002)

  33. Raynal, M.: A case study of agreement problems in distributed systems: non-blocking atomic commitment. In: HASE, pp. 209–214 (1997)

  34. Song, Y.J., van Renesse, R.: Bosco: one-step Byzantine asynchronous consensus. In: Taubenfeld, G. (ed.) DISC 2008. LNCS, vol. 5218, pp. 438–450. Springer, Heidelberg (2008)

  35. Srikanth, T., Toueg, S.: Simulating authenticated broadcasts to derive simple fault-tolerant algorithms. Distrib. Comput. 2, 80–94 (1987)

Acknowledgements

We are grateful to Annu Gmeiner and Ulrich Schmid for their contributions to several papers [16, 20–22] of our research agenda.

Corresponding author

Correspondence to Igor Konnov.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Konnov, I., Veith, H., Widder, J. (2016). What You Always Wanted to Know About Model Checking of Fault-Tolerant Distributed Algorithms. In: Mazzara, M., Voronkov, A. (eds) Perspectives of System Informatics. PSI 2015. Lecture Notes in Computer Science, vol 9609. Springer, Cham. https://doi.org/10.1007/978-3-319-41579-6_2

  • DOI: https://doi.org/10.1007/978-3-319-41579-6_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41578-9

  • Online ISBN: 978-3-319-41579-6

  • eBook Packages: Computer Science, Computer Science (R0)
