Experimental Evaluation of a Failure Detection Service Based on a Gossip Strategy

de Sousa, Leandro P.; Duarte, Elias P.

doi:10.1007/978-3-642-24669-2_21

Leandro P. de Sousa & Elias P. Duarte Jr.

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7017)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing

Abstract

Failure detectors were first proposed as an abstraction that makes it possible to solve consensus in asynchronous systems. A failure detector is a distributed oracle that provides information about the state of the processes of a distributed system. This work presents a failure detection service based on a gossip strategy. The service was implemented on the JXTA platform. A simulator was also implemented so that the detector could be evaluated for a larger number of processes. Experimental results show that increasing the frequency at which gossip messages are sent gives better results than increasing the fanout. Results are included for fault detection time, recovery detection time, and the mistake rate of the detector.
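
As a rough illustration of the kind of mechanism the abstract describes, the following Python sketch simulates a heartbeat-style gossip failure detector. It is not the authors' JXTA service or their simulator. The Node class, the simulate function, the parameters fanout, interval, and t_fail, and the single-crash schedule are all assumptions made for this example. Each process periodically increments its own heartbeat and pushes its table to fanout random peers. A peer is suspected once its heartbeat has not increased for more than t_fail time units.

```python
import random


class Node:
    """One process running a heartbeat-gossip detector (illustrative only)."""

    def __init__(self, node_id, all_ids, t_fail):
        self.node_id = node_id
        self.t_fail = t_fail  # assumed timeout, in simulated time units
        # peer -> [highest heartbeat seen, local time of the last increase]
        self.table = {p: [0, 0] for p in all_ids}

    def beat(self, now):
        # Advance our own heartbeat counter and timestamp it.
        self.table[self.node_id][0] += 1
        self.table[self.node_id][1] = now

    def snapshot(self):
        # The gossip message: heartbeat counters only, no local timestamps.
        return {p: hb for p, (hb, _) in self.table.items()}

    def merge(self, remote, now):
        # Keep the larger heartbeat per peer; refresh the timestamp on increase.
        for peer, hb in remote.items():
            if hb > self.table[peer][0]:
                self.table[peer] = [hb, now]

    def suspects(self, now):
        # Peers whose heartbeat has not increased for more than t_fail.
        return {p for p, (_, seen) in self.table.items()
                if p != self.node_id and now - seen > self.t_fail}


def simulate(n=20, fanout=1, interval=1, rounds=200,
             crash_round=20, crashed=0, t_fail=10, seed=1):
    """Return {node: first time it suspects `crashed` at or after the crash}."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    ids = list(range(n))
    nodes = {i: Node(i, ids, t_fail) for i in ids}
    alive = set(ids)
    first_suspected = {}
    for now in range(1, rounds + 1):
        if now == crash_round:
            alive.discard(crashed)  # the crashed node stops gossiping
        if now % interval == 0:  # a smaller interval means more frequent gossip
            for i in sorted(alive):
                nodes[i].beat(now)
                message = nodes[i].snapshot()
                others = [p for p in ids if p != i]
                for target in rng.sample(others, fanout):
                    if target in alive:  # messages to a crashed node are lost
                        nodes[target].merge(message, now)
        if now >= crash_round:
            for i in alive:
                if i not in first_suspected and crashed in nodes[i].suspects(now):
                    first_suspected[i] = now
    return first_suspected


if __name__ == "__main__":
    times = simulate()
    print("slowest detection delay:", max(times.values()) - 20)
```

Calling simulate with different fanout and interval values gives a feel for the two parameters the abstract compares, but this toy model ignores message delay, loss, and recovery, so it should not be expected to reproduce the paper's measurements.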

This work was partially supported by grant 304013/2009-9 from the Brazilian Research Agency (CNPq).

Author information

Authors and Affiliations

Dept. Informatics, Federal University of Parana (UFPR), P.O. Box 19018, Curitiba, 81531-980, Brazil
Leandro P. de Sousa & Elias P. Duarte Jr.

Editor information

Editors and Affiliations

School of Information Technology, Deakin University, Melbourne Burwood Campus, 221 Burwood Highway, 3125, Burwood, VIC, Australia
Yang Xiang & Wanlei Zhou
ICAR-CNR and University of Calabria, Via P. Bucci 41 C, 87036, Rende, CS, Italy
Alfredo Cuzzocrea
School of Information Technology, Deakin University, Geelong Waurn Ponds Campus, Pigdons Road, 3217, Geelong, VIC, Australia
Michael Hobbs

About this paper

Cite this paper

de Sousa, L.P., Duarte, E.P. (2011). Experimental Evaluation of a Failure Detection Service Based on a Gossip Strategy. In: Xiang, Y., Cuzzocrea, A., Hobbs, M., Zhou, W. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2011. Lecture Notes in Computer Science, vol 7017. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24669-2_21

DOI: https://doi.org/10.1007/978-3-642-24669-2_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24668-5
Online ISBN: 978-3-642-24669-2
eBook Packages: Computer Science, Computer Science (R0)
