Distributed Fault-Tolerance

  • David Powell
Part of the Research Reports ESPRIT book series (ESPRIT, volume 1)

Abstract

Distribution and fault-tolerance are tightly related. Should a single element of a distributed system fail, users expect at worst a slight degradation of the service offered; distributed systems must therefore have at least some built-in fault-tolerance. Conversely, most fault-tolerant systems can, at some level, be seen as distributed systems because of their redundant processing resources. The term distributed fault-tolerance is used here to refer to the class of techniques suitable for ensuring fault-tolerance in an architecture consisting of a set of processing elements (called nodes or stations) interconnected by a message-passing communication network (figure 1). The distributed fault-tolerance techniques discussed here are aimed at distributed systems in which the communication network consists of one or more local area networks. In particular, the existence of high-bandwidth broadcast channels allowing efficient multicast communication is assumed.

Copyright information

© ECSC — EEC — EAEC, Brussels — Luxembourg 1991

Authors and Affiliations

  • David Powell (1)
  1. LAAS-CNRS, Toulouse, France
