Primary-Backup Protocols: Lower Bounds and Optimal Implementations

Budhiraja, Navin; Marzullo, Keith; Schneider, Fred B.; Toueg, Sam

doi:10.1007/978-3-7091-4009-3_14

Part of the book series: Dependable Computing and Fault-Tolerant Systems ((DEPENDABLECOMP,volume 8))

Abstract

We present a precise specification of the primary-backup approach. Then, for a variety of different failure models we prove lower bounds on the degree of replication, failover time, and worst-case blocking time for client requests. Finally, we outline primary-backup protocols and indicate which of our lower bounds are tight.

Supported by Defense Advanced Research Projects Agency (DoD) under NASA Ames grant number NAG 2-593 and by grants from IBM, Siemens, and Xerox. Budhiraja is also supported by an IBM Graduate Fellowship. The views, opinions, and findings contained in this report are those of the authors and should not be construed as an official Department of Defense position, policy, or decision.

Supported in part by the Office of Naval Research under contract N00014-91-J-1219, the National Science Foundation under Grant No. CCR-8701103, DARPA/NSF Grant No. CCR-9014363, and by a grant from IBM Endicott Programming Laboratory.

Supported in part by NSF grants CCR-8901780 and CCR-9102231 and by a grant from IBM Endicott Programming Laboratory.

Author information

Authors and Affiliations

Department of Computer Science, Cornell University, Ithaca, New York, 14853, USA
Navin Budhiraja, Keith Marzullo, Fred B. Schneider & Sam Toueg

Editor information

Editors and Affiliations

Naval Research Laboratory, Washington DC, 20375-5000, USA
Carl E. Landwehr
Dept. of Computer Sc., Univ. of Newcastle, Newcastle upon Tyne, NE1 7RU, UK
Brian Randell
Dip. Ingegneria dell’Informazione, Università di Pisa, I-56100, Pisa, Italia
Luca Simoncini

About this paper

Cite this paper

Budhiraja, N., Marzullo, K., Schneider, F.B., Toueg, S. (1993). Primary-Backup Protocols: Lower Bounds and Optimal Implementations. In: Landwehr, C.E., Randell, B., Simoncini, L. (eds) Dependable Computing for Critical Applications 3. Dependable Computing and Fault-Tolerant Systems, vol 8. Springer, Vienna. https://doi.org/10.1007/978-3-7091-4009-3_14

DOI: https://doi.org/10.1007/978-3-7091-4009-3_14
Publisher Name: Springer, Vienna
Print ISBN: 978-3-7091-4011-6
Online ISBN: 978-3-7091-4009-3
eBook Packages: Springer Book Archive
