Efficient reconfiguration of trees: A case study in methodical design of nonmasking fault-tolerant programs
We illustrate the effectiveness of a formal method for the design of nonmasking fault-tolerant programs by demonstrating how the method enables us to design a new and efficient program. Our program maintains the processes of any given distributed system in a spanning tree, tolerates any finite number of fail-stop failures and repairs of system processes and channels, and requires only O(n) time and O(n log n) space to reconfigure the tree, where n is the number of nonfaulty processes. Moreover, the program is simple and fully distributed.