Abstract
Most high-level programming languages offer little support for writing multicomputer programs that must continue to execute despite failures in the underlying computing platform. This paper describes two projects that address this problem by providing language features designed specifically for fault tolerance. The first is FT-Linda, a version of the Linda coordination language for writing fault-tolerant parallel programs. Its major enhancements include stable tuple spaces, whose contents survive failures, and atomic execution of collections of tuple space operations. The second is FT-SR, a language based on the existing SR distributed programming language. Its major features include support for transparent module replication, ordered group communication, automatic recovery, and failure notification. Prototype versions of both languages have been implemented.
This work was supported in part by the National Science Foundation under grant CCR-9003161 and the Office of Naval Research under grant N00014-91-J-1015.
Copyright information
© 1994 Kluwer Academic Publishers
About this chapter
Cite this chapter
Schlichting, R.D., Bakken, D.E., Thomas, V.T. (1994). Language Support for Fault-Tolerant Parallel and Distributed Programming. In: Koob, G.M., Lau, C.G. (eds) Foundations of Dependable Computing. The Springer International Series in Engineering and Computer Science, vol 284. Springer, Boston, MA. https://doi.org/10.1007/978-0-585-27316-7_3
DOI: https://doi.org/10.1007/978-0-585-27316-7_3
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-7923-9485-3
Online ISBN: 978-0-585-27316-7
eBook Packages: Springer Book Archive