Distributed Fault-Tolerance

  • David Powell
Part of the Research Reports ESPRIT book series (ESPRIT, volume 1)


Distribution and fault-tolerance are tightly related. Should a single element of a distributed system fail, users expect at worst a slight degradation of the service offered; distributed systems must therefore have at least some built-in fault-tolerance. Conversely, most fault-tolerant systems can, at some level, be seen as distributed systems because of their redundant processing resources. The term distributed fault-tolerance is used here to refer to the class of techniques suitable for ensuring fault-tolerance in an architecture consisting of a set of processing elements (called nodes or stations) interconnected by a message-passing communication network (figure 1). The distributed fault-tolerance techniques discussed here are aimed at distributed systems in which the communication network consists of one or more local area networks. In particular, the existence of high-bandwidth broadcast channels allowing efficient multicast communication is assumed.





Copyright information

© ECSC — EEC — EAEC, Brussels — Luxembourg 1991

Authors and Affiliations

  • David Powell, LAAS-CNRS, Toulouse, France
