Abstract
Membership information is used to provide a consistent, system-wide view of which processes are currently functioning or failed in a distributed computation. This paper describes a membership protocol that is used to maintain this information. Our protocol is novel because it is based on a multicast facility that preserves only the partial order of messages exchanged among the communicating processes. Because it depends only on a partial ordering of messages rather than a total ordering, our protocol requires less synchronization overhead. The advantages of our approach are especially pronounced if multiple failures occur concurrently.
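The difference between a partial and a total order can be illustrated with a small sketch. This is not the protocol described in the paper. It assumes a hypothetical `ContextGraph` in which each message names the messages it directly follows. The message labels (`m1`, `fail(p3)`, `fail(p4)`, `ack`) and the process names in the comments are also invented for illustration. The sketch shows that two failure reports sent concurrently are left unordered relative to each other, so no extra agreement round is needed to rank them.

```python
from dataclasses import dataclass, field


@dataclass
class ContextGraph:
    """Hypothetical partial-order message log (an assumption, not the paper's API)."""
    # Maps each message id to the set of its direct predecessor ids.
    preds: dict = field(default_factory=dict)

    def add(self, msg_id, after=()):
        # Predecessors must already be present, which keeps the graph acyclic.
        missing = [p for p in after if p not in self.preds]
        if missing:
            raise KeyError(f"unknown predecessors: {missing}")
        self.preds[msg_id] = set(after)

    def precedes(self, a, b):
        """Return True if message a is a direct or transitive predecessor of b."""
        stack = list(self.preds[b])
        seen = set()
        while stack:
            m = stack.pop()
            if m == a:
                return True
            if m not in seen:
                seen.add(m)
                stack.extend(self.preds[m])
        return False

    def concurrent(self, a, b):
        """Return True if neither message precedes the other."""
        return a != b and not self.precedes(a, b) and not self.precedes(b, a)


def build_example():
    g = ContextGraph()
    g.add("m1")
    g.add("fail(p3)", after=["m1"])  # reported by one process
    g.add("fail(p4)", after=["m1"])  # reported by another, before seeing fail(p3)
    g.add("ack", after=["fail(p3)", "fail(p4)"])  # sent after seeing both reports
    return g
```

In this sketch, `build_example().concurrent("fail(p3)", "fail(p4)")` holds: the two reports share the predecessor `m1` but neither follows the other. A totally ordered multicast would have to place one before the other at every process.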
This work was supported in part by the National Science Foundation under grants CCR-8811923 and CCR-9003161, and by the Office of Naval Research under grant N00014-91-J-1015.
Copyright information
© 1992 Springer-Verlag/Wien
About this paper
Cite this paper
Mishra, S., Peterson, L.L., Schlichting, R.D. (1992). A Membership Protocol Based on Partial Order. In: Meyer, J.F., Schlichting, R.D. (eds) Dependable Computing for Critical Applications 2. Dependable Computing and Fault-Tolerant Systems, vol 6. Springer, Vienna. https://doi.org/10.1007/978-3-7091-9198-9_15
DOI: https://doi.org/10.1007/978-3-7091-9198-9_15
Publisher Name: Springer, Vienna
Print ISBN: 978-3-7091-9200-9
Online ISBN: 978-3-7091-9198-9
eBook Packages: Springer Book Archive