Primary-Backup Protocols: Lower Bounds and Optimal Implementations

  • Conference paper
Dependable Computing for Critical Applications 3

Part of the book series: Dependable Computing and Fault-Tolerant Systems (DEPENDABLECOMP, volume 8)

Abstract

We present a precise specification of the primary-backup approach. Then, for a variety of failure models, we prove lower bounds on the degree of replication, failover time, and worst-case blocking time for client requests. Finally, we outline primary-backup protocols and indicate which of our lower bounds are tight.

Supported by Defense Advanced Research Projects Agency (DoD) under NASA Ames grant number NAG 2-593 and by grants from IBM, Siemens, and Xerox. Budhiraja is also supported by an IBM Graduate Fellowship. The views, opinions, and findings contained in this report are those of the authors and should not be construed as an official Department of Defense position, policy, or decision.

Supported in part by the Office of Naval Research under contract N00014-91-J-1219, the National Science Foundation under Grant No. CCR-8701103, DARPA/NSF Grant No. CCR-9014363, and by a grant from IBM Endicott Programming Laboratory.

Supported in part by NSF grants CCR-8901780 and CCR-9102231 and by a grant from IBM Endicott Programming Laboratory.

© 1993 Springer-Verlag Wien

About this paper

Cite this paper

Budhiraja, N., Marzullo, K., Schneider, F.B., Toueg, S. (1993). Primary-Backup Protocols: Lower Bounds and Optimal Implementations. In: Landwehr, C.E., Randell, B., Simoncini, L. (eds) Dependable Computing for Critical Applications 3. Dependable Computing and Fault-Tolerant Systems, vol 8. Springer, Vienna. https://doi.org/10.1007/978-3-7091-4009-3_14

  • DOI: https://doi.org/10.1007/978-3-7091-4009-3_14

  • Publisher Name: Springer, Vienna

  • Print ISBN: 978-3-7091-4011-6

  • Online ISBN: 978-3-7091-4009-3

  • eBook Packages: Springer Book Archive
