Software-Implemented Fault Injection of Transient Hardware Errors

  • Charles R. Yount
  • Daniel P. Siewiorek
Part of The Springer International Series in Engineering and Computer Science book series (SECS, volume 283)

Abstract

As computer applications extend to areas which require extreme dependability, their designs mandate the ability to operate in the presence of faults. The problem of assuring that the design goals are achieved requires the observation and measurement of fault behavior parameters under various input conditions. One means to characterize systems is fault injection, but injection of internal faults is difficult due to the complexity and level of integration of contemporary VLSI implementations. This chapter explores the effects of gate-level faults on system operation as a basis for fault models at the program level.

A new fault model for processors based on a register-transfer-level (RTL) description is presented. This model addresses time, cost, and accuracy limitations imposed by current fault-injection techniques. It is designed to be used with existing software-implemented fault-injection (SWIFI) tools, but the error patterns it generates are designed to be more representative of actual transient hardware faults than the ad-hoc patterns currently injected via most SWIFI experiments.
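For contrast, the kind of ad hoc pattern that the abstract says most SWIFI experiments inject can be sketched as a single random bit flip in a simulated register file. This is only an illustrative baseline, not the RTL-derived model the chapter presents; the 32-bit width, the 16-entry register file, and the names `flip_bit` and `inject_random_fault` are all assumptions made for the sketch.

```python
import random

WORD_BITS = 32  # assumed register width; not taken from the chapter


def flip_bit(value, bit, width=WORD_BITS):
    """Return value with one bit inverted, masked to the register width."""
    if not 0 <= bit < width:
        raise ValueError("bit index out of range")
    return (value ^ (1 << bit)) & ((1 << width) - 1)


def inject_random_fault(registers, rng):
    """Flip one randomly chosen bit of one randomly chosen register, in place.

    Returns (register_index, bit_index) so an experiment can log the fault.
    This is the ad hoc baseline, not an RTL-derived error pattern.
    """
    reg = rng.randrange(len(registers))
    bit = rng.randrange(WORD_BITS)
    registers[reg] = flip_bit(registers[reg], bit)
    return reg, bit


if __name__ == "__main__":
    rng = random.Random(0)   # seeded so a run can be repeated
    regs = [0] * 16          # hypothetical 16-entry register file
    reg, bit = inject_random_fault(regs, rng)
    print(f"flipped bit {bit} of r{reg}: {regs[reg]:#010x}")
```

A model of the kind the abstract describes would replace the uniform random choice of register and bit with error patterns derived from the processor's RTL description, while the injection step itself could stay the same.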

Keywords

Fault Model, Register File, System Under Test, Fault Injection, Transient Fault
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Kluwer Academic Publishers 1994

Authors and Affiliations

  • Charles R. Yount (1)
  • Daniel P. Siewiorek (2)

  1. Inter-National Research Institute, USA
  2. Department of Electrical and Computer Engineering, Carnegie Mellon University, USA
