Fault tolerance via replication in coarse grain data-flow

Nguyen-Tuong, Anh; Grimshaw, Andrew S.; Karpovich, John F.

doi:10.1007/BFb0023066

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1068)

Included in the following conference series:

International Workshop on Parallel Symbolic Languages and Systems

Author information

Authors and Affiliations

Anh Nguyen-Tuong, Andrew S. Grimshaw & John F. Karpovich
Department of Computer Science, University of Virginia, Thornton Hall, Charlottesville, VA 22903

Editor information

Takayasu Ito, Robert H. Halstead Jr., Christian Queinnec

About this paper

Cite this paper

Nguyen-Tuong, A., Grimshaw, A.S., Karpovich, J.F. (1996). Fault tolerance via replication in coarse grain data-flow. In: Ito, T., Halstead, R.H., Queinnec, C. (eds) Parallel Symbolic Languages and Systems. PSLS 1995. Lecture Notes in Computer Science, vol 1068. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0023066

DOI: https://doi.org/10.1007/BFb0023066
Published: 10 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61143-1
Online ISBN: 978-3-540-68332-2
eBook Packages: Springer Book Archive
