Abstract
A safety-critical real-time computer system must provide its services with a dependability that is much better than the dependability of any one of its constituent components. This challenging goal can only be achieved by the provision of fault tolerance. The design of any fault-tolerant system proceeds in four distinct phases. In the first phase, the fault hypothesis is shaped, i.e., assumptions are made about the types and numbers of faults that must be tolerated by the planned system. In the second phase, an architecture is designed that tolerates the specified faults. In the third phase, the architecture is implemented and the functions and fault-tolerance mechanisms are validated. Finally, in the fourth phase, it has to be confirmed experimentally that the assumptions contained in the fault hypothesis are met by reality. The first part of this contribution focuses on the establishment of a comprehensive fault hypothesis for safety-critical real-time computer systems. The size of the fault containment regions, the failure mode of the fault containment regions, the assumed frequency of the faults, and the assumptions about error detection latency and error containment are discussed under the premise that, in the future, a distributed system node is expected to be a system-on-a-chip (SOC). The second part of this contribution focuses on the implications that such a fault hypothesis will have on the future architecture of distributed safety-critical real-time computer systems in the automotive domain.
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kopetz, H. (2006). On the Fault Hypothesis for a Safety-Critical Real-Time System. In: Broy, M., Krüger, I.H., Meisinger, M. (eds) Automotive Software – Connected Services in Mobile Networks. ASWSD 2004. Lecture Notes in Computer Science, vol 4147. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823063_3
DOI: https://doi.org/10.1007/11823063_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37677-4
Online ISBN: 978-3-540-37678-1
eBook Packages: Computer Science (R0)