Control Flow Checking in Object-Based Distributed Systems

  • Nasser A. Kanawati
  • Ghani A. Kanawati
  • Jacob A. Abraham
Part of the Dependable Computing and Fault-Tolerant Systems book series (DEPENDABLECOMP, volume 8)

Abstract

Object-based distributed systems are becoming increasingly popular because objects provide a secure and easy means of using the abstraction of shared memory. In this paper we develop a new object-based control flow checking technique, called COTM, to detect errors due to hardware faults in such systems. The proposed technique monitors objects and the flow of threads across objects in two stages. The first stage applies control flow checking to every object invocation. The second stage examines the legality of a terminating thread. Results of fault injection experiments on several applications, written in C++ and modified to incorporate the object-based checks, show that the proposed technique achieves high fault coverage with low performance overhead.
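
The two stages described above can be illustrated with a minimal C++ sketch. It is not the authors' COTM implementation, whose details the abstract does not give. The sketch assumes that legal control flow is known in advance as a set of permitted (caller object, callee object) invocations, and that a thread terminates legally only if no check has failed and every invocation has returned to the object where the thread started. The names ThreadMonitor, ObjectId, invoke, ret, and terminate_is_legal are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical identifier type for objects; not taken from the paper.
using ObjectId = std::uint32_t;

// Illustrative per-thread monitor (an assumption, not the COTM design).
class ThreadMonitor {
public:
    using Edge = std::pair<ObjectId, ObjectId>;  // (caller, callee)

    ThreadMonitor(ObjectId root, std::set<Edge> legal_edges)
        : legal_edges_(std::move(legal_edges)) {
        stack_.push_back(root);  // the thread starts in the root object
    }

    // Stage 1: checked on every object invocation.
    // The invocation is accepted only if the edge is in the legal set.
    bool invoke(ObjectId callee) {
        const Edge edge{stack_.back(), callee};
        if (legal_edges_.count(edge) == 0) {
            error_ = true;  // illegal control transfer detected
            return false;
        }
        stack_.push_back(callee);
        return true;
    }

    // Return from the current object to its caller.
    // Returning past the root object is treated as an error.
    bool ret() {
        if (stack_.size() <= 1) {
            error_ = true;
            return false;
        }
        stack_.pop_back();
        return true;
    }

    // Stage 2: checked when the thread terminates.
    // Legal only if no earlier check failed and the thread is back in the root.
    bool terminate_is_legal() const {
        return !error_ && stack_.size() == 1;
    }

private:
    std::set<Edge> legal_edges_;
    std::vector<ObjectId> stack_;  // chain of objects the thread is currently inside
    bool error_ = false;
};

// Small self-check exercising both stages on a thread with a legal path.
inline void monitor_self_check() {
    ThreadMonitor t(0, {{0, 1}, {1, 2}});
    assert(t.invoke(1));
    assert(t.invoke(2));
    assert(t.ret());
    assert(t.ret());
    assert(t.terminate_is_legal());
}
```

In this sketch a thread that jumps from object 0 directly to object 2 fails the first-stage check, and a thread that ends while still inside object 1 fails the second-stage check, even though each individual invocation it made was legal.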

Keywords

Method Invocation, Concurrent Error Detection, Undetected Error, Control Flow Information, Fault Tolerance Technique
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Wien 1993

Authors and Affiliations

  • Nasser A. Kanawati (1)
  • Ghani A. Kanawati (1)
  • Jacob A. Abraham (1)
  1. Computer Engineering Research Center, The University of Texas at Austin, Austin, USA