
A Recovery Technique Using Multi-agent in Distributed Computing Systems

  • Conference paper
Coordination Models and Languages (COORDINATION 2002)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2315))

Abstract

This paper proposes a new approach to rollback-recovery that uses multiple agents in a distributed computing system. Previous rollback-recovery protocols depended on the inherent communication mechanism and the operating system, which degraded computing performance in distributed computing systems. Using multiple agents, we propose a rollback-recovery protocol that works independently of the operating system. We define three kinds of agent. The first is a recovery agent, which performs the rollback-recovery protocol after a failure. The second is an information agent, which builds domain knowledge, in the form of fault-tolerance rules and information, during failure-free operation. The third is a facilitator agent, which manages efficient communication between agents. We also simulate the proposed rollback-recovery protocol using Java and an agent communication language in a CORBA environment.
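
To make the division of labour among the three agents concrete, the following is a minimal single-process sketch in Java. It is not the protocol from the paper, whose details are not given in the abstract. It assumes that a checkpoint is just an integer state, that agents exchange simple messages whose performative names (`ask-one`, `tell`) are borrowed from KQML, and that the facilitator routes messages by agent name. All class, method, and field names are hypothetical, and CORBA transport is left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: every name here is an assumption, not the paper's design.
final class AgentSketch {

    /** A minimal agent message; the four fields are an assumed, simplified layout. */
    static final class Message {
        final String performative, sender, receiver, content;
        Message(String performative, String sender, String receiver, String content) {
            this.performative = performative;
            this.sender = sender;
            this.receiver = receiver;
            this.content = content;
        }
    }

    interface Agent {
        String name();
        void receive(Message m, Facilitator f);
    }

    /** Facilitator agent: delivers each message to the agent registered under the receiver name. */
    static final class Facilitator {
        private final Map<String, Agent> registry = new HashMap<>();

        void register(Agent a) { registry.put(a.name(), a); }

        void send(Message m) {
            Agent target = registry.get(m.receiver);
            if (target == null) throw new IllegalArgumentException("unknown agent: " + m.receiver);
            target.receive(m, this);
        }
    }

    /** Information agent: stores checkpoints during failure-free operation and answers queries. */
    static final class InformationAgent implements Agent {
        private final Map<String, List<Integer>> checkpoints = new HashMap<>();

        public String name() { return "information"; }

        void recordCheckpoint(String process, int state) {
            checkpoints.computeIfAbsent(process, k -> new ArrayList<>()).add(state);
        }

        public void receive(Message m, Facilitator f) {
            if (!m.performative.equals("ask-one")) return;
            List<Integer> saved = checkpoints.getOrDefault(m.content, List.of());
            String latest = saved.isEmpty() ? "none" : String.valueOf(saved.get(saved.size() - 1));
            // Reply with "<process>=<latest checkpoint>"; assumes process names contain no '='.
            f.send(new Message("tell", name(), m.sender, m.content + "=" + latest));
        }
    }

    /** Recovery agent: after a failure, asks for the latest checkpoint and rolls back to it. */
    static final class RecoveryAgent implements Agent {
        private final Map<String, Integer> restored = new HashMap<>();

        public String name() { return "recovery"; }

        void onFailure(String process, Facilitator f) {
            f.send(new Message("ask-one", name(), "information", process));
        }

        public void receive(Message m, Facilitator f) {
            if (!m.performative.equals("tell")) return;
            String[] parts = m.content.split("=", 2);
            if (!parts[1].equals("none")) restored.put(parts[0], Integer.parseInt(parts[1]));
        }

        /** Returns the state the process was rolled back to, or null if none is known. */
        Integer restoredState(String process) { return restored.get(process); }
    }

    public static void main(String[] args) {
        Facilitator facilitator = new Facilitator();
        InformationAgent info = new InformationAgent();
        RecoveryAgent recovery = new RecoveryAgent();
        facilitator.register(info);
        facilitator.register(recovery);

        info.recordCheckpoint("p1", 10);   // failure-free operation
        info.recordCheckpoint("p1", 20);
        recovery.onFailure("p1", facilitator);   // p1 fails; roll back
        System.out.println("p1 restored to checkpoint " + recovery.restoredState("p1"));
    }
}
```

In this sketch the recovery agent never touches the checkpoint store directly. It only exchanges messages through the facilitator, which is one way to read the abstract's claim that the protocol can be kept independent of the operating system.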

This work was supported by grant No. R01-2001-00354 from the Korea Science & Engineering Foundation.




Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, HM., Chung, KS., Shin, SC., Lee, DW., Lee, WG., Yu, HC. (2002). A Recovery Technique Using Multi-agent in Distributed Computing Systems. In: Arbab, F., Talcott, C. (eds) Coordination Models and Languages. COORDINATION 2002. Lecture Notes in Computer Science, vol 2315. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46000-4_23


  • DOI: https://doi.org/10.1007/3-540-46000-4_23


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43410-8

  • Online ISBN: 978-3-540-46000-8

  • eBook Packages: Springer Book Archive
