Making distributed spanning tree algorithms fault-resilient

Bar-Yehuda, Reuven; Kutten, Shay; Wolfstahl, Yaron; Zaks, Shmuel

doi:10.1007/BFb0039625


Contributed Papers
Conference paper
First Online: 01 January 2005


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 247)

Abstract

We study distributed algorithms for networks with undetectable fail-stop failures, assuming that all of them occurred before the execution started. (It was proved that distributed agreement cannot be reached when a node may fail during execution.) Failures of this type are encountered, for example, during recovery from a crash in the network. We study the problems of leader election and spanning tree construction, which have been characterized as fundamental for this environment. We point out that, in the presence of faults, just duplicating messages in an existing algorithm does not suffice to make it resilient; in fact, this redundancy gives rise to synchronization problems and might also increase the message complexity. In this paper we investigate the problem of making existing spanning tree algorithms fault-resilient while still overcoming these difficulties. Several lower bounds and optimal fault-resilient algorithms are presented for the first time. However, we believe that the main contribution of the paper is twofold. First, in designing the algorithms we use tools that we argue are rather general (for example, we extend the notion of token algorithms to multiple-token algorithms); in fact, we are able to use them on several different algorithms, for several different families of networks. Second, following the notion of amortized computational complexity, we introduce amortized message complexity as a tool for analyzing the message complexity.
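The failure model described in the abstract can be illustrated with a small simulation. The sketch below is not an algorithm from the paper: it is a plain flooding construction, run sequentially, in which some nodes have crashed before the run starts and silently drop every message. The names `flood_spanning_tree`, `adjacency`, `failed` and `root` are assumptions made for this illustration. It shows that a sender still pays for messages to crashed neighbours, because it cannot tell whether they are alive.

```python
from collections import deque


def flood_spanning_tree(adjacency, failed, root):
    """Build a tree over the live nodes reachable from `root` by flooding.

    adjacency: dict mapping each node to a list of its neighbours
    failed:    set of nodes that crashed before the run (hypothetical input);
               they never receive, send, or reply
    Returns (parent map of the tree, number of messages sent).
    """
    if root in failed:
        raise ValueError("the initiator must be a live node")
    parent = {root: None}
    messages = 0
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour == parent[node]:
                continue  # no need to send back to the parent
            messages += 1  # sent regardless: the failure is undetectable
            if neighbour in failed:
                continue  # message is silently lost
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return parent, messages


# A ring of five nodes in which node 2 crashed before the run started.
ring = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
tree, sent = flood_spanning_tree(ring, failed={2}, root=0)
print(tree, sent)
```

In this run the live nodes 0, 1, 3 and 4 end up in one tree, and two of the messages sent are addressed to the crashed node. An algorithm that waited for a reply from every neighbour would wait forever on those two messages, which is one reason resilience needs more than retransmission.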

Extended abstract

The work of this author was supported in part by a Technion grant no. 121–641.



Author information

Authors and Affiliations

Department of Computer Science, Technion, Haifa, Israel
Reuven Bar-Yehuda, Shay Kutten, Yaron Wolfstahl & Shmuel Zaks


Editor information

Franz J. Brandenburg, Guy Vidal-Naquet, Martin Wirsing


Cite this paper

Bar-Yehuda, R., Kutten, S., Wolfstahl, Y., Zaks, S. (1987). Making distributed spanning tree algorithms fault-resilient. In: Brandenburg, F.J., Vidal-Naquet, G., Wirsing, M. (eds) STACS 87. STACS 1987. Lecture Notes in Computer Science, vol 247. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0039625


DOI: https://doi.org/10.1007/BFb0039625
Published: 09 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-17219-2
Online ISBN: 978-3-540-47419-7
eBook Packages: Springer Book Archive
