Muteness Failure Detectors: Specification and Implementation

  • Assia Doudou
  • Benoit Garbinato
  • Rachid Guerraoui
  • André Schiper
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1667)

Abstract

This paper extends the failure detector approach from crash-stop failures to muteness failures. Muteness failures are malicious failures in which a process stops sending algorithm messages but might continue to send other messages, e.g., “I-am-alive” messages. The paper presents both the specification of a muteness failure detector, denoted by $\diamondsuit M_{\mathcal{A}}$, and an implementation of $\diamondsuit M_{\mathcal{A}}$ in a partial synchrony model (there are bounds on message latency and clock skew, but these bounds are unknown and hold only after some point that is itself unknown). We show that, modulo a simple modification, a consensus algorithm that has been designed in a crash-stop model with $\diamondsuit S$ can be reused in the presence of muteness failures simply by replacing $\diamondsuit S$ with $\diamondsuit M_{\mathcal{A}}$.
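
As an illustration only, the idea of suspecting a process that has gone silent with respect to algorithm messages, while ignoring its “I-am-alive” traffic, can be sketched with per-process adaptive timeouts. This is a minimal sketch and not the implementation presented in the paper: the class name `MutenessDetector`, the message kind tags, and the timeout-doubling rule are all assumptions chosen for the example, in the spirit of the usual approach for partial synchrony where the bounds are unknown.

```python
class MutenessDetector:
    """Hypothetical sketch: suspect processes that stop sending algorithm messages."""

    def __init__(self, processes, initial_timeout=1.0):
        # One timeout per monitored process (assumed starting value).
        self.timeout = {p: initial_timeout for p in processes}
        # Time at which the last algorithm message from each process was seen.
        self.last_algo_msg = {p: 0.0 for p in processes}
        self.suspected = set()

    def on_message(self, sender, kind, now):
        # Only algorithm messages count as evidence of non-muteness;
        # other traffic such as "I-am-alive" messages is ignored.
        if kind != "algorithm":
            return
        self.last_algo_msg[sender] = now
        if sender in self.suspected:
            # The suspicion was premature: retract it and enlarge the timeout,
            # since the actual latency bound is unknown.
            self.suspected.discard(sender)
            self.timeout[sender] *= 2

    def check(self, now):
        # Suspect every process whose last algorithm message is too old.
        for p, last in self.last_algo_msg.items():
            if now - last > self.timeout[p]:
                self.suspected.add(p)
        return set(self.suspected)


detector = MutenessDetector(["p", "q"], initial_timeout=1.0)
detector.on_message("p", "algorithm", now=0.5)
detector.on_message("q", "I-am-alive", now=0.5)  # does not reset q's timer
print(detector.check(now=1.2))
```

In this sketch a process that only sends “I-am-alive” messages ends up suspected, which is the behaviour that distinguishes muteness detection from crash detection. Doubling the timeout after each false suspicion is one common way to cope with bounds that hold only eventually.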

Keywords

Correct Process, Failure Detector, Consensus Problem, Consensus Algorithm, Asynchronous System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Assia Doudou (1)
  • Benoit Garbinato (2)
  • Rachid Guerraoui (1)
  • André Schiper (1)

  1. École Polytechnique Fédérale, Lausanne, Switzerland
  2. United Bank of Switzerland, Zürich, Switzerland