Application Recovery in Parallel Programming Environment

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2474)

Abstract

This paper presents the fault-tolerance features of TOPAS, a parallel programming environment for distributed systems. TOPAS automatically analyzes data dependences among tasks and synchronizes data, which reduces the time needed to develop parallel programs. TOPAS also provides support for scheduling, load balancing, and fault tolerance. The main topic of this paper is a solution for transparent recovery of asynchronous distributed computations on clusters of workstations without spare hardware when a fault occurs on a node. Experiments show the simplicity and efficiency of parallel programming in the TOPAS environment with integrated fault tolerance, which provides graceful performance degradation and short reconfiguration times for application recovery.

This work is supported by the Slovak Scientific Grant Agency within Research Project No. 2/7186/20.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nguyen, G.T., Tran, V.D., Kotocova Faf, M. (2002). Application Recovery in Parallel Programming Environment. In: Kranzlmüller, D., Volkert, J., Kacsuk, P., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2002. Lecture Notes in Computer Science, vol 2474. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45825-5_39

  • DOI: https://doi.org/10.1007/3-540-45825-5_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44296-7

  • Online ISBN: 978-3-540-45825-8

  • eBook Packages: Springer Book Archive
