A Consensus-Based Framework for Responsive Computer System Design

  • Miroslaw Malek
Conference paper
Part of the NATO ASI Series book series (NATO ASI F, volume 127)

Abstract

The concept of responsive computer systems is presented. The emerging discipline of responsive systems demands fault-tolerant and real-time performance in both parallel and distributed computing environments. The responsiveness measure is discussed and a new design framework for responsive systems is introduced. The new framework is based on the fundamental concept of consensus and on application-specific responsiveness. It is shown that consensus is crucial in responsive synchronization, communication, diagnosis, and reconfiguration.

It is also illustrated how these tasks form part of the consensus-based operating system and, when combined with application-specific methods, handle the fault-tolerance and real-time issues germane to a given application. This approach seems to be the most appropriate for the design of fault-tolerant, real-time, parallel/distributed systems.

Keywords

Fault Diagnosis, Fault Tolerance, Consensus Problem, Consensus Algorithm, Consensus Protocol

Copyright information

© Springer-Verlag Berlin Heidelberg 1994

Authors and Affiliations

  • Miroslaw Malek (1)
  1. Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, USA
