A Hybrid Monitor Assisted Fault Injection Environment

  • Luke T. Young
  • Carlos Alonso
  • Ravi K. Iyer
  • Kumar K. Goswami
Part of the Dependable Computing and Fault-Tolerant Systems book series (DEPENDABLECOMP, volume 8)


This paper describes a hybrid (hardware/software monitor) fault injection environment and its application to a commercial fault-tolerant system. The hybrid environment is useful for obtaining dependability statistics and failure characteristics for a range of system components. The software instrumentation keeps the introduced overhead small, so that error propagation and control flow are not significantly affected by its presence. The hybrid environment can be used to obtain precise measurements of instruction-level activity that would be impossible to obtain with a hardware monitor alone. It is also well suited for measuring extremely short error latencies. Its utility is demonstrated by applying it to the study of a Tandem Integrity S2 system. Faults are injected into CPU registers, cache, and local memory. The effects of faults on individual user applications are studied by obtaining subsystem dependability measurements, such as detection and latency statistics for cache and local memory. Instruction-level error propagation effects are also measured.
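As a rough illustration of what "injecting a fault into local memory and measuring its error latency" means, the following toy sketch flips one bit in a simulated memory and counts the accesses until a parity check notices it. It is not the environment described in the paper, which relies on a hardware monitor plus software instrumentation on a real Tandem Integrity S2. The function names, the list-based memory model, the access trace, and the use of parity as the detection mechanism are all assumptions made for this example.

```python
def parity(word: int) -> int:
    """Return the parity (0 or 1) of the set bits in a 32-bit word."""
    return bin(word & 0xFFFFFFFF).count("1") & 1


def flip_bit(word: int, bit: int) -> int:
    """Inject a single-bit fault by inverting one bit of a 32-bit word."""
    return (word ^ (1 << bit)) & 0xFFFFFFFF


def measure_latency(memory, access_trace, fault_addr, fault_bit, inject_at):
    """Return the number of accesses between injection and detection.

    memory       : list of 32-bit words (hypothetical toy local memory)
    access_trace : addresses read by the workload, one per time step
    inject_at    : time step at which the fault is injected
    Returns None if the faulty word is never read (the error stays latent).
    """
    stored_parity = [parity(w) for w in memory]  # parity recorded before the fault
    mem = list(memory)                           # work on a copy of the caller's data
    for t, addr in enumerate(access_trace):
        if t == inject_at:
            mem[fault_addr] = flip_bit(mem[fault_addr], fault_bit)
        if parity(mem[addr]) != stored_parity[addr]:
            return t - inject_at                 # mismatch detected on this read
    return None                                  # faulty word was never accessed


if __name__ == "__main__":
    memory = [0x0, 0xFFFF, 0x1234, 0xDEADBEEF]
    trace = [0, 1, 0, 2, 3, 2, 1]
    print(measure_latency(memory, trace, fault_addr=2, fault_bit=5, inject_at=1))
```

In this model the latency is measured in workload accesses rather than in wall-clock time, and a fault in a word that the workload never reads is reported as latent. A real measurement would depend on the detection mechanisms and timing resolution of the system under test.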


Local Memory, Fault Injection, Error Latency, Physical Address, Fault Isolation




Copyright information

© Springer-Verlag Wien 1993

Authors and Affiliations

  • Luke T. Young (1)
  • Carlos Alonso (1)
  • Ravi K. Iyer (2)
  • Kumar K. Goswami (2)
  1. Integrity Systems Division, Tandem Computers, Inc., Austin, USA
  2. Center for Reliable and High Performance Computing, Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, USA
