
Application and Middleware Transparent Checkpointing with TCKPT on Clustergrid

  • József Kovács
  • Rafal Mikolajczak
  • Radoslaw Januszewski
  • Gracjan Jankowski

Keywords

Parallel Application; Integrity Requirement; Compatibility Requirement; Entire Application; Client Process
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


References

  1. [1]
    I. Foster, C. Kesselman, S. Tuecke, “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”, International Journal of Supercomputer Applications, 15(3), 2001.
  2. [2]
    E. N. Elnozahy, D. B. Johnson, Y. M. Wang, “A Survey of Rollback-Recovery Protocols in Message-Passing Systems”, Technical Report CMU-CS-96-181, Carnegie Mellon University, Pittsburgh, PA, October 1996.
  3. [3]
    K.M. Chandy and L. Lamport, “Distributed snapshots: Determining global states of distributed systems”, ACM Transactions on Computer Systems, 3(1):63-75, February 1985.
  4. [4]
    G. Stellner, “Consistent Checkpoints of PVM Applications”, in Proc. 1st European PVM Users Group Meeting, 1994.
  5. [5]
    M. Litzkow, T. Tannenbaum, J. Basney, and M. Livny, “Checkpoint and Migration of UNIX Processes in the Condor Distributed Processing System”, Technical Report #1346, Computer Sciences Department, University of Wisconsin, April 1997.
  6. [6]
    J. Léon, A. L. Fisher, and P. Steenkiste, “Fail-safe PVM: a portable package for distributed programming with transparent recovery”, CMU-CS-93-124, February 1993.
  7. [7]
    G.D. van Albada, J. Clinckemaillie, A.H.L. Emmen, J. Gehring, O. Heinz, F. van der Linden, B.J. Overeinder, A. Reinefeld, and P.M.A. Sloot, “Dynamite - blasting obstacles to parallel cluster computing”, in P.M.A. Sloot, M. Bubak, A.G. Hoekstra, and L.O. Hertzberger, editors, High-Performance Computing and Networking (HPCN Europe '99), Amsterdam, The Netherlands, Lecture Notes in Computer Science, vol. 1593, pp. 300-310, Springer-Verlag, Berlin, April 1999. ISBN 3-540-65821-1.
  8. [8]
    J. Kovacs, “Making PVM applications checkpointable for the Grid”, Proc. of the Microcad 2005 Conference, Section N, pp. 223-228, March 10-11, 2005, Miskolc.
  9. [9]
  10. [10]
    G. Jankowski, R. Mikolajczak, R. Januszewski, “Checkpoint/Restart mechanism for multiprocess applications implemented under SGIGrid Project”, Proceedings of the Cracow Grid Workshop 2004, pp. 142-149, ISBN 83-911541-4-5, 2005.
  11. [11]
    G. Jankowski, R. Januszewski, R. Mikolajczak, J. Kovacs, “Scalable multilevel checkpointing for distributed applications - on the integration possibility of TCKPT and psncLibCkpt”, CoreGRID Technical Report TR-0019, March 2006.
  12. [12]
    G. Jankowski, R. Januszewski, R. Mikolajczak, J. Kovacs, “Scalable multilevel checkpointing for distributed applications - on the possibility of integrating Total Checkpoint and AltixC/R”, CoreGRID Technical Report TR-0035, March 2006.

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • József Kovács (1)
  • Rafal Mikolajczak (2)
  • Radoslaw Januszewski (3)
  • Gracjan Jankowski (4)

  1. Parallel and Distributed Systems Laboratory, MTA SZTAKI, Hungary
  2. Poznan Supercomputing and Networking Center, Poland
  3. Poznan Supercomputing and Networking Center, Poland
  4. Poznan Supercomputing and Networking Center, Poland
