Making distributed spanning tree algorithms fault-resilient

  • Contributed Papers
  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNCS, volume 247)

Abstract

We study distributed algorithms for networks with undetectable fail-stop failures, assuming that all failures occurred before the execution started. (It has been proved that distributed agreement cannot be reached when a node may fail during execution.) Failures of this type are encountered, for example, during recovery from a crash in the network. We study the problems of leader election and spanning tree construction, which have been characterized as fundamental for this environment. We point out that, in the presence of faults, merely duplicating messages in an existing algorithm does not suffice to make it resilient; in fact, this redundancy gives rise to synchronization problems and may also increase the message complexity. In this paper we investigate how to make existing spanning tree algorithms fault-resilient while still overcoming these difficulties. Several lower bounds and optimal fault-resilient algorithms are presented for the first time. However, we believe that the main contribution of the paper is twofold. First, in designing the algorithms we use tools that are thus argued to be rather general (for example, we extend the notion of token algorithms to multiple-token algorithms). In fact, we are able to use them on several different algorithms, for several different families of networks. Second, following the notion of amortized computational complexity, we introduce amortized message complexity as a tool for analyzing the message complexity.
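To make the failure model concrete, the following is a minimal sketch of why a single-token traversal is fragile when failures are undetectable. It is not an algorithm from the paper. It assumes a simplified depth-first token traversal in which the token carries the set of visited nodes; the function name `single_token_dfs` and the four-node ring are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

Graph = Dict[int, List[int]]


def single_token_dfs(
    graph: Graph, root: int, failed: Set[int]
) -> Tuple[List[Tuple[int, int]], int, bool]:
    """Simulate a single-token depth-first traversal (illustrative only).

    Assumption: the token carries the set of visited nodes, so it is
    forwarded only to neighbors that have not been visited yet.
    Returns (tree_edges, messages_sent, stalled).
    """
    visited = {root}
    tree: List[Tuple[int, int]] = []
    messages = 0
    stack = [root]  # path from the root to the current token holder
    while stack:
        node = stack[-1]
        unvisited = [v for v in graph[node] if v not in visited]
        if not unvisited:
            stack.pop()
            if stack:
                messages += 1  # token is returned to the parent
            continue
        nxt = unvisited[0]
        messages += 1  # token is forwarded to a neighbor
        if nxt in failed:
            # Fail-stop node that failed before the start: the token is
            # lost, and the sender has no way to detect the loss.
            return tree, messages, True
        visited.add(nxt)
        tree.append((node, nxt))
        stack.append(nxt)
    return tree, messages, False


# Hypothetical example network: a ring on four nodes.
ring: Graph = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
healthy = single_token_dfs(ring, 0, set())
faulty = single_token_dfs(ring, 0, {1})
```

In this sketch the healthy run builds a tree with three edges, while the run with node 1 failed loses the token on its first message, even though the surviving nodes 0, 3 and 2 are still connected. Guarding against this kind of loss is what calls for redundancy, and the abstract notes that naive redundancy brings synchronization and message-complexity problems of its own.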

Extended abstract

The work of this author was supported in part by a Technion grant no. 121–641.

Editor information

Franz J. Brandenburg, Guy Vidal-Naquet, Martin Wirsing

Copyright information

© 1987 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bar-Yehuda, R., Kutten, S., Wolfstahl, Y., Zaks, S. (1987). Making distributed spanning tree algorithms fault-resilient. In: Brandenburg, F.J., Vidal-Naquet, G., Wirsing, M. (eds) STACS 87. STACS 1987. Lecture Notes in Computer Science, vol 247. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0039625

  • DOI: https://doi.org/10.1007/BFb0039625

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-17219-2

  • Online ISBN: 978-3-540-47419-7

  • eBook Packages: Springer Book Archive
