Abstract
We study distributed algorithms for networks with undetectable fail-stop failures, assuming that all failures occurred before the execution started. (It has been proved that distributed agreement cannot be reached when a node may fail during execution.) Failures of this type are encountered, for example, during recovery from a crash in the network. We study the problems of leader election and spanning tree construction, which have been characterized as fundamental for this environment. We point out that, in the presence of faults, just duplicating messages in an existing algorithm does not suffice to make it resilient; in fact, this redundancy gives rise to synchronization problems and may also increase the message complexity. In this paper we investigate the problem of making existing spanning tree algorithms fault-resilient while overcoming these difficulties. Several lower bounds and optimal fault-resilient algorithms are presented for the first time. However, we believe that the main contribution of the paper is twofold. First, in designing the algorithms we use tools that are thus argued to be rather general (for example, we extend the notion of token algorithms to multiple-token algorithms). In fact, we are able to use them on several different algorithms, for several different families of networks. Second, following the notion of amortized computational complexity, we introduce amortized message complexity as a tool for analyzing the message complexity.
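The fault model in the abstract can be made concrete with a toy simulation. The sketch below is not one of the paper's algorithms: the functions `single_token_traversal` and `flood`, the ring network, and the choice of failed node are all assumptions made for illustration. It only shows why an undetectable, pre-existing failure is a problem for a single-token traversal (a token sent to a failed node simply vanishes, and the sender cannot tell), and how a naive redundant scheme reaches every live node at a higher message cost.

```python
from collections import deque


def single_token_traversal(adj, failed, start):
    """Sequential depth-first traversal driven by one token (illustrative).

    A token sent to a failed node is lost, because failures are undetectable.
    Returns (visited live nodes, messages sent, whether the token was lost).
    """
    visited = {start}
    path = [start]                      # nodes on the token's current path
    tried = {v: set() for v in adj}     # neighbours each node has already used
    messages = 0
    while path:
        node = path[-1]
        untried = [w for w in adj[node] if w not in tried[node]]
        if not untried:
            path.pop()
            if path:
                messages += 1           # token handed back to the parent
            continue
        nxt = untried[0]
        tried[node].add(nxt)
        messages += 1                   # token sent over the edge
        if nxt in failed:
            return visited, messages, True   # token vanished silently
        if nxt in visited:
            messages += 1               # already-visited node returns the token
            continue
        visited.add(nxt)
        tried[nxt].add(node)
        path.append(nxt)
    return visited, messages, False


def flood(adj, failed, start):
    """Naive redundancy: every reached live node sends to all its neighbours."""
    reached = {start}
    queue = deque([start])
    messages = 0
    while queue:
        node = queue.popleft()
        for w in adj[node]:
            messages += 1               # one message per neighbour, always
            if w in failed or w in reached:
                continue
            reached.add(w)
            queue.append(w)
    return reached, messages


# Hypothetical example network: a ring 0-1-2-3-4-0 in which node 2 failed
# before the execution started.
ring = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
failed = {2}

print(single_token_traversal(ring, failed, 0))
print(flood(ring, failed, 0))
```

In this example the single token reaches only nodes 0 and 1 before it is lost at node 2, although nodes 3 and 4 are alive and still connected to node 0. Flooding reaches all four live nodes but uses 8 messages. The sketch illustrates only the cost side of redundancy; it does not model the synchronization problems that the abstract also mentions.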
Extended abstract
The work of this author was supported in part by a Technion grant no. 121–641.
© 1987 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bar-Yehuda, R., Kutten, S., Wolfstahl, Y., Zaks, S. (1987). Making distributed spanning tree algorithms fault-resilient. In: Brandenburg, F.J., Vidal-Naquet, G., Wirsing, M. (eds) STACS 87. STACS 1987. Lecture Notes in Computer Science, vol 247. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0039625
DOI: https://doi.org/10.1007/BFb0039625
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-17219-2
Online ISBN: 978-3-540-47419-7
eBook Packages: Springer Book Archive