Language Support for Fault-Tolerant Parallel and Distributed Programming

Schlichting, Richard D.; Bakken, David E.; Thomas, Vicraj T.

doi:10.1007/978-0-585-27316-7_3

Part of the book series: The Springer International Series in Engineering and Computer Science (SECS, volume 284)

Abstract

Most high-level programming languages offer little support for writing multicomputer programs that must continue to execute despite failures in the underlying computing platform. This paper describes two projects that address this problem by providing language features designed specifically for fault tolerance. The first is FT-Linda, a version of the Linda coordination language for writing fault-tolerant parallel programs. Its major enhancements are stable tuple spaces, whose contents survive failures, and atomic execution of collections of tuple space operations. The second is FT-SR, a language based on the existing SR distributed programming language. Its major features include support for transparent module replication, ordered group communication, automatic recovery, and failure notification. Prototype versions of both languages have been implemented.
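
As a rough illustration of what atomic execution of a collection of tuple space operations means, the following Python sketch models a toy, single-process tuple space. It is not FT-Linda syntax and does not reflect the FT-Linda implementation: the class `TupleSpace`, the methods `out`, `inp`, `rdp`, and `atomic`, and the use of `None` as a wildcard are all assumptions made for this example. The sketch provides no stability either, since its contents live in ordinary memory and would not survive a failure.

```python
import threading


class TupleSpace:
    """Toy in-memory tuple space (hypothetical; not the FT-Linda API).

    A pattern is a tuple in which None matches any field value.
    """

    def __init__(self):
        self._tuples = []
        # Re-entrant lock so atomic() can call out()/inp() while holding it.
        self._lock = threading.RLock()

    @staticmethod
    def _matches(pattern, tup):
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def out(self, tup):
        """Deposit a tuple."""
        with self._lock:
            self._tuples.append(tuple(tup))

    def rdp(self, pattern):
        """Non-blocking read: return a matching tuple, or None."""
        with self._lock:
            for tup in self._tuples:
                if self._matches(pattern, tup):
                    return tup
            return None

    def inp(self, pattern):
        """Non-blocking withdraw: remove and return a match, or None."""
        with self._lock:
            for i, tup in enumerate(self._tuples):
                if self._matches(pattern, tup):
                    return self._tuples.pop(i)
            return None

    def atomic(self, ops):
        """Apply a list of ("out", tuple) / ("inp", pattern) steps
        all-or-nothing. Returns the list of tuples deposited or withdrawn,
        or None (with the space left unchanged) if any withdraw fails.
        """
        with self._lock:
            snapshot = list(self._tuples)  # restored on failure
            results = []
            for name, arg in ops:
                if name == "out":
                    self.out(arg)
                    results.append(tuple(arg))
                elif name == "inp":
                    got = self.inp(arg)
                    if got is None:
                        self._tuples = snapshot
                        return None
                    results.append(got)
                else:
                    self._tuples = snapshot
                    raise ValueError(f"unknown operation: {name!r}")
            return results


# Usage: claim a task and record the claim in one indivisible step, so no
# other thread can observe the task withdrawn but not yet marked.
space = TupleSpace()
space.out(("task", 1))
claimed = space.atomic([("inp", ("task", None)), ("out", ("in_progress", 1))])
```

Grouping the withdraw and the deposit matters because, if they were separate steps, a failure between them could lose the task tuple. The sketch only shows atomicity with respect to other threads in one process; tolerating processor failures would additionally require the stable storage that the abstract describes.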

This work was supported in part by the National Science Foundation under grant CCR-9003161 and the Office of Naval Research under grant N00014-91-J-1015.

Author information

Authors and Affiliations

Dept. of Computer Science, Univ. of Arizona, Tucson, AZ, 85721
Richard D. Schlichting & David E. Bakken
Honeywell Technology Center, 3660 Technology Drive, Minneapolis, MN, 55418
Vicraj T. Thomas

Editor information

Editors and Affiliations

Office of Naval Research, USA
Gary M. Koob & Clifford G. Lau

About this chapter

Cite this chapter

Schlichting, R.D., Bakken, D.E., Thomas, V.T. (1994). Language Support for Fault-Tolerant Parallel and Distributed Programming. In: Koob, G.M., Lau, C.G. (eds) Foundations of Dependable Computing. The Springer International Series in Engineering and Computer Science, vol 284. Springer, Boston, MA. https://doi.org/10.1007/978-0-585-27316-7_3

DOI: https://doi.org/10.1007/978-0-585-27316-7_3
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-7923-9485-3
Online ISBN: 978-0-585-27316-7
eBook Packages: Springer Book Archive
