A Controller Safety Concept Based on Software-Implemented Fault Tolerance for Fail-Operational Automotive Applications

Ghadhab, Majdi; Kuntz, Matthias; Kuvaiskii, Dmitrii; Fetzer, Christof

doi:10.1007/978-3-319-29510-7_11

Part of the book series: Communications in Computer and Information Science (CCIS, volume 596)

Included in the following conference series:

International Workshop on Formal Techniques for Safety-Critical Systems

Abstract

We propose to build a fail-operational computing system from a primary self-checking controller and a secondary limp-home controller that guarantees emergency operation if the hardware of the primary controller fails. A self-checking controller commonly builds on hardware-implemented fault detection, e.g. lock-stepping, to reach a high diagnostic coverage of hardware faults. Such techniques conflict with features of modern CPUs such as the inherent non-determinism of execution. An interesting alternative to hardware-based self-checking is therefore to implement software-based fault detection and recovery on the primary controller to detect and mask its hardware failures. We show by means of stochastic model checking and a prototype fault detection technique that the proposed approach not only reduces costs but also guarantees higher availability of the computing system at the same safety level as common replicated execution on redundant hardware.
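The abstract does not name a specific software-based detection technique. As a hedged illustration only, the sketch below uses AN-encoding, a well-known arithmetic code: every value x is stored as A·x, so a valid word is always a multiple of A, and a hardware bit flip breaks that property. The constant `A`, the function names, and the single-bit-flip fault model are assumptions made for this example, not details taken from the chapter.

```python
# Minimal AN-encoding sketch (illustrative; not the chapter's prototype).
A = 58659  # assumed odd constant; any odd A > 1 detects every single bit flip


class FaultDetected(Exception):
    """Raised when an encoded word is not a multiple of A."""


def encode(x: int) -> int:
    """Store x as the code word A * x."""
    return A * x


def check(word: int) -> None:
    """Valid code words are exact multiples of A."""
    if word % A != 0:
        raise FaultDetected(f"corrupted word: {word}")


def decode(word: int) -> int:
    """Verify the code word, then recover the plain value."""
    check(word)
    return word // A


def add_encoded(a: int, b: int) -> int:
    """A*x + A*y = A*(x + y), so the sum of code words is a code word."""
    return a + b


def flip_bit(word: int, bit: int) -> int:
    """Simulate a transient hardware fault by flipping one bit."""
    return word ^ (1 << bit)


def detects_flip(word: int, bit: int) -> bool:
    """Return True if decoding the faulty word raises FaultDetected."""
    try:
        decode(flip_bit(word, bit))
    except FaultDetected:
        return True
    return False


total = add_encoded(encode(5), encode(7))
```

A single bit flip changes the word by plus or minus a power of two, which an odd `A` greater than 1 never divides, so the check always fails for such faults. Multi-bit faults are detected only with high probability, which is one reason practical schemes add further measures.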

Notes

1. Electric/Electronic.
2. FIT: Failures in Time (1 FIT = \(10^{-9}\)/h, i.e. one failure per \(10^{9}\) hours of operation).
3. Robust and Reliant Automotive Computing Environment for Future eCars, www.projekt-race.de/

Author information

Authors and Affiliations

BMW AG, Munich, Germany
Majdi Ghadhab & Matthias Kuntz
Technische Universität Dresden, Dresden, Germany
Dmitrii Kuvaiskii & Christof Fetzer

Corresponding author

Correspondence to Majdi Ghadhab.

Editor information

Editors and Affiliations

AIST Ikeda, Ikeda, Osaka, Japan
Cyrille Artho
University of Oslo, Oslo, Norway
Peter Csaba Ölveczky

Appendix: Sensitivity analysis

To understand how sensitive the measured properties are to the failure rate and the repair rate of the computing platform, we vary one of these parameters at a time (see Tables 3 and 4) while keeping the rest of the specification unchanged.

Table 3. Variation of failure rate \(\lambda \).

Table 4. Variation of repair rate \(\mu \).

1.1 Part 1 - Sensitivity of the “intact”-probability to the parameters failure rate and repair rate (platform 1 vs. 2)

The improvement reached by platform 2 compared to platform 1 regarding the probability of the state “intact” is more significant at high failure rates (Fig. 20) and low repair rates (Fig. 21). At low failure rates or high repair rates, the probability of the state “intact” is almost identical between platform 1 and platform 2.
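The one-parameter-at-a-time sweep can be illustrated with a much simpler model than the one analyzed in the paper. The sketch below assumes a single repairable unit modeled as a two-state continuous-time Markov chain (intact, failed) with failure rate λ and repair rate μ, whose steady-state probability of being intact is μ/(λ+μ). The baseline rates, the sweep factors, and all names are made-up placeholders, not values from Tables 3 and 4.

```python
# Toy sensitivity sweep for a two-state repairable unit (assumed model).

def p_intact(failure_rate: float, repair_rate: float) -> float:
    """Steady-state probability of the 'intact' state: mu / (lambda + mu)."""
    return repair_rate / (failure_rate + repair_rate)


def sweep(base_lambda: float, base_mu: float, factors, vary: str):
    """Scale one parameter by each factor while holding the other fixed."""
    if vary not in ("lambda", "mu"):
        raise ValueError("vary must be 'lambda' or 'mu'")
    rows = []
    for f in factors:
        lam = base_lambda * f if vary == "lambda" else base_lambda
        mu = base_mu * f if vary == "mu" else base_mu
        rows.append((lam, mu, p_intact(lam, mu)))
    return rows


BASE_LAMBDA = 1e-5  # failures per hour (placeholder value)
BASE_MU = 1e-2      # repairs per hour (placeholder value)
FACTORS = [0.1, 1.0, 10.0]

lambda_rows = sweep(BASE_LAMBDA, BASE_MU, FACTORS, vary="lambda")
mu_rows = sweep(BASE_LAMBDA, BASE_MU, FACTORS, vary="mu")

if __name__ == "__main__":
    for label, rows in (("vary lambda", lambda_rows), ("vary mu", mu_rows)):
        print(label)
        for lam, mu, p in rows:
            print(f"  lambda={lam:.1e}/h  mu={mu:.1e}/h  P(intact)={p:.6f}")
```

In this toy model the probability of "intact" approaches 1 as λ falls or μ rises, which leaves little headroom for any architecture to improve on. That is consistent with the trend described above but does not reproduce the paper's platform models.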

1.2 Part 2 - Sensitivity of the availability to the parameters failure rate and repair rate (platform 2 vs. 3)

The improvement of platform 3 over platform 2 in the availability of the computing platform is almost independent of the failure rate of the primary controller: it is negligible at both high and low failure rates (Fig. 22). However, Fig. 23 shows a clear availability improvement at low repair rates.

Cite this paper

Ghadhab, M., Kuntz, M., Kuvaiskii, D., Fetzer, C. (2016). A Controller Safety Concept Based on Software-Implemented Fault Tolerance for Fail-Operational Automotive Applications. In: Artho, C., Ölveczky, P. (eds) Formal Techniques for Safety-Critical Systems. FTSCS 2015. Communications in Computer and Information Science, vol 596. Springer, Cham. https://doi.org/10.1007/978-3-319-29510-7_11

DOI: https://doi.org/10.1007/978-3-319-29510-7_11
Published: 30 January 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-29509-1
Online ISBN: 978-3-319-29510-7
eBook Packages: Computer Science, Computer Science (R0)
