Efficient Algorithms to Implement Unreliable Failure Detectors in Partially Synchronous Systems

  • Conference paper
Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1693))

Abstract

Unreliable failure detectors, proposed by Chandra and Toueg [2], are mechanisms that provide information about process failures. In [2], eight classes of failure detectors were defined, depending on how accurate this information is, and an algorithm implementing a failure detector of one of these classes in a partially synchronous system was presented. This algorithm is based on all-to-all communication, and periodically exchanges a number of messages that is quadratic in the number of processes. To our knowledge, no other algorithm implementing these classes of unreliable failure detectors has been proposed.

In this paper, we present a family of distributed algorithms that implement four classes of unreliable failure detectors in partially synchronous systems. Our algorithms are based on a logical ring arrangement of the processes, which defines the monitoring and failure information propagation pattern. The resulting algorithms periodically exchange at most a linear number of messages.
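
The difference in message cost between the two communication patterns can be illustrated with a minimal sketch. It is not the paper's algorithm, which the abstract does not spell out. It assumes that processes are numbered 0 to n-1 around the ring, that each process sends exactly one heartbeat per period to a single ring neighbour, and that a process monitors its nearest non-suspected successor; the function names are hypothetical.

```python
from typing import Optional, Set


def all_to_all_messages(n: int) -> int:
    """Messages per period when every process sends a heartbeat to every other process."""
    return n * (n - 1)


def ring_messages(n: int) -> int:
    """Messages per period on a ring, assuming one heartbeat per process per period."""
    return n


def monitored_successor(p: int, n: int, suspected: Set[int]) -> Optional[int]:
    """Return the nearest process after p on the ring that p does not suspect.

    Assumption for illustration: suspected processes are skipped, so the
    remaining processes still form a ring. Returns None if p suspects
    every other process.
    """
    for step in range(1, n):
        candidate = (p + step) % n
        if candidate not in suspected:
            return candidate
    return None


if __name__ == "__main__":
    n = 10
    print("all-to-all:", all_to_all_messages(n))
    print("ring:", ring_messages(n))
    print("process 0 monitors:", monitored_successor(0, n, {1, 2}))
```

Under these assumptions the all-to-all pattern grows as n(n-1) messages per period, while the ring pattern grows linearly in n.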

References

  1. M. Aguilera and S. Toueg. Heartbeat: A Timeout-Free Failure Detector for Quiescent Reliable Communication. Proceedings of the 11th International Workshop on Distributed Algorithms (WDAG), LNCS, Springer-Verlag, Germany, Sep. 1997.

  2. T. D. Chandra and S. Toueg. Unreliable Failure Detectors for Reliable Distributed Systems. Journal of the ACM, 43(2), pages 225–267, Mar. 1996.

  3. T. D. Chandra, V. Hadzilacos, and S. Toueg. The Weakest Failure Detector for Solving Consensus. Journal of the ACM, 43(4), pages 685–722, Jul. 1996.

  4. D. Dolev, C. Dwork, and L. Stockmeyer. On the Minimal Synchronism Needed for Distributed Consensus. Journal of the ACM, 34(1), pages 77–97, Jan. 1987.

  5. C. Dwork, N. Lynch, and L. Stockmeyer. Consensus in the Presence of Partial Synchrony. Journal of the ACM, 35(2), pages 288–323, Apr. 1988.

  6. C. Fetzer and F. Cristian. Fail-Aware Failure Detectors. Proceedings of the 15th Symposium on Reliable Distributed Systems (SRDS), Canada, Oct. 1996.

  7. M. Fischer, N. Lynch, and M. Paterson. Impossibility of Distributed Consensus with One Faulty Process. Journal of the ACM, 32(2), pages 374–382, Apr. 1985.

  8. R. Guerraoui, M. Larrea, and A. Schiper. Non-Blocking Atomic Commitment with an Unreliable Failure Detector. Proceedings of the 14th Symposium on Reliable Distributed Systems (SRDS), Germany, Sep. 1996.

  9. R. Guerraoui and A. Schiper. Gamma-Accurate Failure Detectors. Proceedings of the 10th International Workshop on Distributed Algorithms (WDAG), LNCS, Springer-Verlag, Italy, Oct. 1996.

  10. M. Pease, R. Shostak, and L. Lamport. Reaching Agreement in the Presence of Faults. Journal of the ACM, 27(2), pages 228–234, Apr. 1980.

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Larrea, M., Arevalo, S., Fernández, A. (1999). Efficient Algorithms to Implement Unreliable Failure Detectors in Partially Synchronous Systems. In: Jayanti, P. (eds) Distributed Computing. DISC 1999. Lecture Notes in Computer Science, vol 1693. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48169-9_3

  • DOI: https://doi.org/10.1007/3-540-48169-9_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66531-1

  • Online ISBN: 978-3-540-48169-0

  • eBook Packages: Springer Book Archive
