A Recovery Technique Using Multi-agent in Distributed Computing Systems

Lee, Hwa-Min; Chung, Kwang-Sik; Shin, Sang-Chul; Lee, Dae-Won; Lee, Won-Gyu; Yu, Heon-Chang

doi:10.1007/3-540-46000-4_23

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2315)

Included in the following conference series:

International Conference on Coordination Languages and Models

Abstract

This paper proposes a new approach to rollback-recovery that uses multiple agents in distributed computing systems. Previous rollback-recovery protocols depended on the underlying communication system and operating system, which degraded computing performance in distributed computing systems. Using multiple agents, we propose a rollback-recovery protocol that works independently of the operating system. We define three kinds of agents. The first is a recovery agent that performs the rollback-recovery protocol after a failure. The second is an information agent that, during failure-free operation, constructs domain knowledge in the form of fault-tolerance rules and information. The third is a facilitator agent that controls efficient communication between agents. We present the rollback-recovery protocol built on these agents and simulate it using Java and an agent communication language in a CORBA environment.
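
The division of labor among the three agents can be illustrated with a minimal sketch. It is not the authors' protocol, which the abstract does not specify, and it uses Python rather than the Java and CORBA setup of the paper's simulation. All class names, the `Message` fields, and the "tell"/"ask"/"reply" message types are hypothetical, and the checkpoint store is an assumed stand-in for the information agent's domain knowledge.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Message:
    performative: str  # hypothetical message type: "tell", "ask" or "reply"
    sender: str
    receiver: str
    content: Any = None


class Agent:
    def __init__(self, name: str) -> None:
        self.name = name
        self.facilitator: Optional["Facilitator"] = None

    def handle(self, msg: Message) -> Optional[Message]:
        raise NotImplementedError


class Facilitator:
    """Routes messages between registered agents by name (assumed design)."""

    def __init__(self) -> None:
        self._agents: Dict[str, Agent] = {}

    def register(self, agent: Agent) -> None:
        self._agents[agent.name] = agent
        agent.facilitator = self

    def route(self, msg: Message) -> Optional[Message]:
        return self._agents[msg.receiver].handle(msg)


class InformationAgent(Agent):
    """Collects per-process checkpoints during failure-free operation."""

    def __init__(self, name: str) -> None:
        super().__init__(name)
        self._checkpoints: Dict[str, List[dict]] = {}

    def handle(self, msg: Message) -> Optional[Message]:
        if msg.performative == "tell":
            process_id, state = msg.content
            # Store a copy so later mutation of the live state is not recorded.
            self._checkpoints.setdefault(process_id, []).append(dict(state))
            return None
        if msg.performative == "ask":
            history = self._checkpoints.get(msg.content, [])
            latest = dict(history[-1]) if history else None
            return Message("reply", self.name, msg.sender, latest)
        raise ValueError(f"unsupported performative: {msg.performative}")


class RecoveryAgent(Agent):
    """After a failure, fetches the latest checkpoint via the facilitator."""

    def __init__(self, name: str, info_agent_name: str) -> None:
        super().__init__(name)
        self.info_agent_name = info_agent_name

    def handle(self, msg: Message) -> Optional[Message]:
        return None  # this sketch never sends unsolicited messages to it

    def recover(self, process_id: str) -> Optional[dict]:
        request = Message("ask", self.name, self.info_agent_name, process_id)
        reply = self.facilitator.route(request)
        return reply.content  # None when no checkpoint exists


facilitator = Facilitator()
info = InformationAgent("info")
recovery = RecoveryAgent("recovery", "info")
facilitator.register(info)
facilitator.register(recovery)

# Failure-free operation: process "p1" checkpoints after each step.
state = {"counter": 0}
for step in range(1, 4):
    state["counter"] = step
    facilitator.route(Message("tell", "p1", "info", ("p1", state)))

state["counter"] = 99  # work done after the last checkpoint, lost on failure
restored = recovery.recover("p1")  # roll back to the last recorded state
```

In this sketch the application process never talks to the operating system or to the other agents directly; everything goes through the facilitator, which is one way to read the abstract's claim of operating-system independence. A real protocol would also have to handle message logging and consistency across processes, which the sketch leaves out.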

This work was supported by grant No. R01-2001-00354 from the Korea Science & Engineering Foundation.

Author information

Authors and Affiliations

Dept. of Computer Science Education, Korea University, Seoul, Korea
Hwa-Min Lee, Sang-Chul Shin, Dae-Won Lee, Won-Gyu Lee & Heon-Chang Yu
Dept. of Computer Science, University College London, London, UK
Kwang-Sik Chung

Editor information

Editors and Affiliations

Software Engineering Department, Centre for Mathematics and Computer Science, Kruislaan 413, 1098, SJ Amsterdam, The Netherlands
Farhad Arbab
SRI International, 333 Ravenswood Ave., 94025, Menlo Park, CA, USA
Carolyn Talcott

About this paper

Cite this paper

Lee, HM., Chung, KS., Shin, SC., Lee, DW., Lee, WG., Yu, HC. (2002). A Recovery Technique Using Multi-agent in Distributed Computing Systems. In: Arbab, F., Talcott, C. (eds) Coordination Models and Languages. COORDINATION 2002. Lecture Notes in Computer Science, vol 2315. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46000-4_23

DOI: https://doi.org/10.1007/3-540-46000-4_23
Published: 14 March 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43410-8
Online ISBN: 978-3-540-46000-8
eBook Packages: Springer Book Archive
