Process Migration in Clusters and Cluster Grids

Part of the book series: The International Series in Engineering and Computer Science (SECS, volume 777)

Abstract

This paper describes two working modes of the parallel program checkpointing mechanism of P-GRADE and its potential application in the nationwide Hungarian ClusterGrid (CG) project. The first-generation ClusterGrid architecture enables parallel processes to migrate among friendly Condor pools. In the second-generation CG, Condor flocking is disabled, so a new technique is introduced that interrupts the whole parallel application and takes it out of the Condor scheduler together with its checkpoint files. This mechanism allows a parallel application to be removed completely from the Condor pool after checkpointing and, after resubmission, to be resumed under another, non-friendly Condor pool. The checkpointing mechanism can automatically (without user interaction) support generic PVM programs created with the P-GRADE Grid programming environment.

The work presented in this paper has been supported by the Hungarian Chemistrygrid OMFB-00580/2003 project, the Hungarian Supergrid OMFB-00728/2002 project, the Hungarian IHM 4671/1/2003 project and the Hungarian Research Fund No. T042459.

Copyright information

© 2005 Springer Science + Business Media, Inc.

About this chapter

Cite this chapter

Kovács, J. (2005). Process Migration in Clusters and Cluster Grids. In: Juhász, Z., Kacsuk, P., Kranzlmüller, D. (eds) Distributed and Parallel Systems. The International Series in Engineering and Computer Science, vol 777. Springer, Boston, MA. https://doi.org/10.1007/0-387-23096-3_12

  • DOI: https://doi.org/10.1007/0-387-23096-3_12

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-23094-8

  • Online ISBN: 978-0-387-23096-2

  • eBook Packages: Engineering, Engineering (R0)
