Skip to main content

Automated Addition of Fault-Tolerance under Synchronous Semantics

  • Conference paper
Book cover Stabilization, Safety, and Security of Distributed Systems (SSS 2013)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 8255))

Included in the following conference series:

Abstract

We focus on the problem of automated model repair for synchronous systems. Model repair focuses on revising a model, so that it satisfies a new property while preserving its existing properties. While the problem of model repair has been studied previously in the context of interleaving semantics, we argue that the corresponding solutions are not applicable for several problems encountered in embedded systems. Specifically, in interleaving semantics, only one of the components executes in a given step. On the contrary, in many commonly considered distributed embedded systems, several components can execute synchronously.

We present a polynomial-time sound and complete algorithm for repairing models in synchronous semantics (also called maximum parallelism semantics). We show that our approach allows us to design fault-tolerant systems, where after the occurrence of faults, the system recovers to its normal behavior within a given number of steps. We illustrate our approach by synthesizing a fault-tolerant group membership protocol and a protocol for cache coherence.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Alpern, B., Schneider, F.B.: Defining liveness. Information Processing Letters 21, 181–185 (1985)

    Article  MathSciNet  MATH  Google Scholar 

  2. Alur, R., Dill, D.: A theory of timed automata. Theoretical Computer Science 126(2), 183–235 (1994)

    Article  MathSciNet  MATH  Google Scholar 

  3. Bartocci, E., Grosu, R., Katsaros, P., Ramakrishnan, C.R., Smolka, S.A.: Model repair for probabilistic systems. In: Abdulla, P.A., Leino, K.R.M. (eds.) TACAS 2011. LNCS, vol. 6605, pp. 326–340. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  4. Basu, A., Bozga, M., Sifakis, J.: Modeling heterogeneous real-time components in BIP. In: Software Engineering and Formal Methods (SEFM), pp. 3–12 (2006)

    Google Scholar 

  5. Bonakdarpour, B., Ebnenasir, A., Kulkarni, S.S.: Complexity results in revising UNITY programs. ACM Transactions on Autonomous and Adaptive Systems (TAAS) 4(1), 1–28 (2009)

    Article  Google Scholar 

  6. Bonakdarpour, B., Kulkarni, S.S., Abujarad, F.: Symbolic synthesis of masking fault-tolerant programs. Springer Journal on Distributed Computing (DC) 25(1), 83–108 (2012)

    Article  MATH  Google Scholar 

  7. Bonakdarpour, B., Lin, Y., Kulkarni, S.S.: Automated addition of fault recovery to cyber-physical component-based models. In: ACM International Conference on Embedded Software (EMSOFT), pp. 127–136 (2011)

    Google Scholar 

  8. Buccafurri, F., Eiter, T., Gottlob, G., Leone, N.: Enhancing model checking in verification by ai techniques. Artificial Intelligence 112, 57–104 (1999)

    Article  MathSciNet  MATH  Google Scholar 

  9. Chandy, K.M., Misra, J.: Parallel program design: a foundation. Addison-Wesley Longman Publishing Co., Inc., Boston (1988)

    MATH  Google Scholar 

  10. Chatzieleftheriou, G., Bonakdarpour, B., Smolka, S.A., Katsaros, P.: Abstract model repair. In: Goodloe, A.E., Person, S. (eds.) NFM 2012. LNCS, vol. 7226, pp. 341–355. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  11. Cristian, F.: Reaching agreement on processor group membership in synchronous distributed systems. Distributed Computing 4, 175–187 (1991)

    Article  MATH  Google Scholar 

  12. Ghosh, S.: Distributed Systems: An Algorithmic Approach. Chapman and Hall/CRC Computer and Information Science Series. Taylor & Francis (2010)

    Google Scholar 

  13. Girault, A., Rutten, É.: Automating the addition of fault tolerance with discrete controller synthesis. Formal Methods in System Design (FMSD) 35(2), 190–225 (2009)

    Article  MATH  Google Scholar 

  14. Gössler, G., Sifakis, J.: Composition for component-based modeling. Science of Computer Programming 55(1-3), 161–183 (2005)

    Article  MathSciNet  MATH  Google Scholar 

  15. Gouda, M.G., Haddix, F.F.: The alternator. Distributed Computing 20(1), 21–28 (2007)

    Article  MATH  Google Scholar 

  16. Jobstmann, B., Griesmayer, A., Bloem, R.: Program repair as a game. In: Etessami, K., Rajamani, S.K. (eds.) CAV 2005. LNCS, vol. 3576, pp. 226–238. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  17. Kulkarni, S.S., Arora, A.: Large automating the addition of fault-tolerance. In: Joseph, M. (ed.) FTRTFT 2000. LNCS, vol. 1926, pp. 82–93. Springer, Heidelberg (2000)

    Chapter  Google Scholar 

  18. Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann Publishers Inc., San Francisco (1996)

    MATH  Google Scholar 

  19. Ramadge, P.J., Wonham, W.M.: The control of discrete event systems. Proceedings of the IEEE 77(1), 81–98 (1989)

    Article  Google Scholar 

  20. Samanta, R., Deshmukh, J.V., Emerson, E.A.: Automatic generation of local repairs for boolean programs. In: Formal Methods in Computer-Aided Design (FMCAD), pp. 1–10 (2008)

    Google Scholar 

  21. Somenzi, F.: Cudd: Colorado university decision diagram package

    Google Scholar 

  22. Lin, Y., Kulkarni, S., Bonakdarpour, B.: Automated addition of fault-tolerance under synchronous semantics. Technical Report MSU-CSE-13-5, Computer Science and Engineering, Michigan State University, East Lansing, Michigan (July 2013)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Lin, Y., Bonakdarpour, B., Kulkarni, S. (2013). Automated Addition of Fault-Tolerance under Synchronous Semantics. In: Higashino, T., Katayama, Y., Masuzawa, T., Potop-Butucaru, M., Yamashita, M. (eds) Stabilization, Safety, and Security of Distributed Systems. SSS 2013. Lecture Notes in Computer Science, vol 8255. Springer, Cham. https://doi.org/10.1007/978-3-319-03089-0_19

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-03089-0_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-03088-3

  • Online ISBN: 978-3-319-03089-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics