Garbage collection in uncoordinated checkpointing algorithms

  • Regular Papers
Journal of Computer Science and Technology

Abstract

This paper studies the hard problem of thorough garbage collection in uncoordinated checkpointing algorithms. It first introduces the traditional garbage collection scheme, with which only obsolete checkpoints can be discarded, and shows that this scheme may fail to discard any checkpoint in some special cases. A thorough garbage collection method is therefore needed, one that discards every checkpoint that is useless for any future rollback-recovery, including the obsolete ones. The Thorough Garbage Collection Theorem is then proposed and proved; it ensures the feasibility of thorough garbage collection and also gives a method for calculating the set of useful checkpoints.
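To make the traditional scheme concrete, the following is a minimal Python sketch of the conventional rollback-propagation idea: starting from each process's latest checkpoint, roll a receiver back whenever a message would be recorded as received but not as sent, then treat every checkpoint before the resulting recovery line as obsolete. It is an illustration only, not the paper's algorithm or its theorem (the full text is not reproduced here). The names `Message`, `recovery_line`, and `obsolete_checkpoints`, and the interval-numbering convention, are assumptions made for this example.

```python
from typing import List, NamedTuple, Set, Tuple


class Message(NamedTuple):
    """Hypothetical message record. Interval i on a process is the
    execution after its checkpoint i and before its checkpoint i + 1."""
    sender: int
    send_interval: int
    receiver: int
    recv_interval: int


def recovery_line(latest: List[int], messages: List[Message]) -> List[int]:
    """Return, per process, the checkpoint index of the most recent
    consistent global checkpoint not later than `latest`."""
    line = list(latest)
    changed = True
    while changed:
        changed = False
        for m in messages:
            received = m.recv_interval < line[m.receiver]
            sent = m.send_interval < line[m.sender]
            if received and not sent:
                # Orphan message: undo the receive by rolling the
                # receiver back to the checkpoint preceding it.
                line[m.receiver] = m.recv_interval
                changed = True
    return line


def obsolete_checkpoints(latest: List[int],
                         messages: List[Message]) -> Set[Tuple[int, int]]:
    """Checkpoints (process, index) strictly before the recovery line."""
    line = recovery_line(latest, messages)
    return {(p, c) for p, cut in enumerate(line) for c in range(cut)}


# Assumed example: two processes with checkpoints 0, 1, 2 each, and a
# zigzag message pattern that drags the recovery line to the initial state.
domino = [
    Message(sender=0, send_interval=2, receiver=1, recv_interval=1),
    Message(sender=1, send_interval=1, receiver=0, recv_interval=1),
    Message(sender=0, send_interval=1, receiver=1, recv_interval=0),
    Message(sender=1, send_interval=0, receiver=0, recv_interval=0),
]
print(recovery_line([2, 2], domino))
print(obsolete_checkpoints([2, 2], domino))
```

In the assumed `domino` pattern the recovery line falls back to the initial checkpoints, so no checkpoint is obsolete and the traditional scheme discards nothing. This is the kind of special case the abstract refers to, and it motivates identifying useless checkpoints that lie beyond the recovery line.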

Author information

Corresponding author

Correspondence to Liu Yunlong.

Additional information

Supported by the Doctorate Research Foundation of the State Education Commission of China.

LIU Yunlong was born in 1972. He received his Bachelor's degree in computer science from Harbin Engineering University in 1993 and his Ph.D. degree in communication and information systems from Beijing University of Posts and Telecommunications in 1998. He now works as a Member of Technical Staff at Bell Laboratories, Lucent Technologies. His research interests include fault-tolerant computing, intelligent networks, telecommunication management networks, and heterogeneous network interworking. Address: Bell Labs, 3F, Huilong Technical Center, 30 Haidian Nan Lu, Beijing, 100080, P.R. China.

About this article

Cite this article

Liu, Y., Chen, J. Garbage collection in uncoordinated checkpointing algorithms. J. of Comput. Sci. & Technol. 14, 242–249 (1999). https://doi.org/10.1007/BF02948512

  • DOI: https://doi.org/10.1007/BF02948512
