Dependable Computing and Fault Tolerance at LAAS: a Summary

  • Jean-Claude Laprie
  • Alain Costes
Conference paper
Part of the Dependable Computing and Fault-Tolerant Systems book series (DEPENDABLECOMP, volume 1)

Abstract

This paper reviews twelve years of work at LAAS on dependable computing and fault tolerance. From the very beginning, this work has had two main concerns: a) a system approach, and b) the need for quantification. The system approach stems from the need to master complexity, that is, to address a global problem with a global approach. The need for quantification simply acknowledges that no scientific or technical discipline (even one that deals largely with abstractions, as computer science does) can mature without quantification.

Keywords

Fault Tolerance; Fault Injection; Symbolic Execution; Distributed Computing System; Transient Fault

Copyright information

© Springer-Verlag/Wien 1987

Authors and Affiliations

  • Jean-Claude Laprie (1)
  • Alain Costes (1)
  1. LAAS-CNRS, Toulouse Cedex, France
