Tolerating Software Design Faults in a Command and Control System

  • Tom Anderson
  • Peter A. Barrett
  • Dave N. Halliwell
  • Michael R. Moulding
Part of the Dependable Computing and Fault-Tolerant Systems book series (DEPENDABLECOMP, volume 2)


The process of software development is usually described in terms of a progression from user requirements to the final code, passing through intermediate stages such as specification, design, and validation. Of course, progress through these stages is rarely unidirectional, and “final code” must be considered to be a misnomer given the demand for subsequent software maintenance. An engineering approach to software development should enable software to be produced on time, within budget, and in accordance with user requirements. One important aspect of these requirements concerns the reliability of the software. Software reliability requirements can be expressed in a number of ways, of which the simplest, perhaps, is to impose an upper limit on the measured rate of failure over a specified interval.


Keywords: Acceptance Test; Software Reliability; Control Software; Mean Time Between Failure; State Restoration





Copyright information

© Springer-Verlag/Wien 1988

Authors and Affiliations

  • Tom Anderson (1)
  • Peter A. Barrett (2)
  • Dave N. Halliwell (3)
  • Michael R. Moulding (4)

  1. CSR, University of Newcastle upon Tyne, UK
  2. MARI, Newcastle upon Tyne, UK
  3. CAP Scientific, London, UK
  4. Royal Military College of Science, Shrivenham, UK
