Checkpointing for the Reliability of Real-Time Systems with On-Line Fault Detection

Conference paper in: Embedded and Ubiquitous Computing – EUC 2005 (EUC 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3824)

Abstract

This paper addresses the checkpointing problem in real-time systems equipped with on-line fault detection mechanisms from a reliability point of view. The reliability analysis assumes that transient faults occur according to a Poisson process and are detected immediately by the detection mechanisms. Under these assumptions, the equidistant checkpointing strategy that maximizes the reliability of the system against transient faults is derived.
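
The setting described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's analytical derivation, which is not reproduced on this page. It assumes a simplified model that goes beyond what the abstract states: a task with a fixed amount of work and a hard deadline, a fixed cost per checkpoint, an optional rollback cost, and reliability taken as the probability of finishing by the deadline. All function and parameter names are hypothetical.

```python
import random


def simulate_once(rng, work, deadline, n_segments, ckpt_cost, fault_rate,
                  rollback_cost=0.0):
    """Return True if one simulated run finishes by the deadline.

    Assumed model (not taken from the paper): the work is split into
    n_segments equal parts, each followed by a checkpoint of fixed cost.
    """
    segment = work / n_segments + ckpt_cost  # compute time plus one checkpoint
    clock = 0.0
    done = 0
    while done < n_segments:
        # Exponential inter-arrival times correspond to a Poisson fault process.
        if fault_rate > 0:
            time_to_fault = rng.expovariate(fault_rate)
        else:
            time_to_fault = float("inf")
        if time_to_fault >= segment:
            # No fault during this segment: it completes and is checkpointed.
            clock += segment
            done += 1
        else:
            # Fault detected immediately: discard the partial segment and
            # restart it from the last checkpoint.
            clock += time_to_fault + rollback_cost
        if clock > deadline:
            return False
    return True


def estimate_reliability(work, deadline, n_segments, ckpt_cost, fault_rate,
                         rollback_cost=0.0, runs=20000, seed=0):
    """Estimate P(task meets its deadline) by repeated simulation."""
    rng = random.Random(seed)
    successes = sum(
        simulate_once(rng, work, deadline, n_segments, ckpt_cost,
                      fault_rate, rollback_cost)
        for _ in range(runs)
    )
    return successes / runs


def best_segment_count(work, deadline, ckpt_cost, fault_rate,
                       rollback_cost=0.0, max_segments=20, runs=20000, seed=0):
    """Pick the equidistant segment count with the highest estimated reliability."""
    return max(
        range(1, max_segments + 1),
        key=lambda n: estimate_reliability(work, deadline, n, ckpt_cost,
                                           fault_rate, rollback_cost,
                                           runs, seed),
    )
```

Under this assumed model, more checkpoints reduce the work lost per fault but add overhead that consumes slack before the deadline, so the estimated reliability typically peaks at an intermediate segment count. Calling `best_segment_count(work=10, deadline=15, ckpt_cost=0.2, fault_rate=0.05)` searches for that count by simulation; the result is an estimate and depends on `runs` and `seed`.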

This work was supported by SaTReC of KAIST.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ryu, SM., Park, DJ. (2005). Checkpointing for the Reliability of Real-Time Systems with On-Line Fault Detection. In: Yang, L.T., Amamiya, M., Liu, Z., Guo, M., Rammig, F.J. (eds) Embedded and Ubiquitous Computing – EUC 2005. EUC 2005. Lecture Notes in Computer Science, vol 3824. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11596356_22

  • DOI: https://doi.org/10.1007/11596356_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30807-2

  • Online ISBN: 978-3-540-32295-5

  • eBook Packages: Computer Science, Computer Science (R0)
