Extended Heartbeat Mechanism for Fault Detection Service Methodology

  • Conference paper
Grid and Distributed Computing (GDC 2009)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 63)

Abstract

Fault detection is a crucial part of providing a scalable, dependable and highly available grid computing environment. The most popular fault detection technique is the heartbeat mechanism, which monitors grid resources at very short intervals. However, this technique has a weakness: it takes some time before a node is recognized as faulty, which delays the recovery actions to be taken. The delay arises because the status of each transaction is not indexed, so the detector must wait for a certain time interval before concluding that a node has failed. This paper proposes a fault detection mechanism and service based on an extended heartbeat mechanism. The technique introduces an index server that indexes transactions and uses a pinging service as a push mechanism. The model outperforms the existing techniques, reducing the time taken to detect a fault by approximately 30%. The mechanism also provides a basis for deploying customizable recovery actions.
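
The contrast drawn in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the design in detail, so the class names, the `ping` callback, the `"pending"` status label and the logical clock are all hypothetical. The first class shows plain timeout-based heartbeat detection; the second assumes that an index of transaction status lets the detector actively ping a node with unfinished work instead of waiting for the timeout to expire.

```python
class HeartbeatDetector:
    """Plain heartbeat detection: suspect a node once no beat arrives within `timeout`."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_beat = {}  # node -> time of the last heartbeat received

    def beat(self, node, now):
        self.last_beat[node] = now

    def suspects(self, node, now):
        # A node that has never sent a beat is treated as suspected.
        last = self.last_beat.get(node, float("-inf"))
        return now - last > self.timeout


class IndexedDetector(HeartbeatDetector):
    """Hypothetical extension: index transaction status and ping on demand."""

    def __init__(self, timeout, ping):
        super().__init__(timeout)
        self.ping = ping   # assumed callback: ping(node) -> True if the node answers
        self.index = {}    # transaction id -> (node, status)

    def record(self, txn, node, status):
        self.index[txn] = (node, status)

    def suspects(self, node, now):
        # If the index shows unfinished work on the node, probe it immediately
        # rather than waiting for the heartbeat timeout.
        has_pending = any(
            n == node and s == "pending" for n, s in self.index.values()
        )
        if has_pending and not self.ping(node):
            return True
        return super().suspects(node, now)


if __name__ == "__main__":
    alive = set()  # the node "n1" has crashed, so it answers no pings

    plain = HeartbeatDetector(timeout=5.0)
    plain.beat("n1", now=0.0)

    extended = IndexedDetector(timeout=5.0, ping=lambda node: node in alive)
    extended.beat("n1", now=0.0)
    extended.record("t1", "n1", "pending")

    for t in (2.0, 6.0):
        print(t, plain.suspects("n1", t), extended.suspects("n1", t))
```

In this toy scenario the plain detector reports the crashed node only after the timeout has elapsed, while the indexed variant reports it as soon as it is asked about a node with a pending transaction. The sketch says nothing about the approximately 30% improvement reported in the abstract, which comes from the authors' own model.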

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mohd. Noor, A.S., Mat Deris, M. (2009). Extended Heartbeat Mechanism for Fault Detection Service Methodology. In: Ślęzak, D., Kim, Th., Yau, S.S., Gervasi, O., Kang, BH. (eds) Grid and Distributed Computing. GDC 2009. Communications in Computer and Information Science, vol 63. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10549-4_11

  • DOI: https://doi.org/10.1007/978-3-642-10549-4_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-10548-7

  • Online ISBN: 978-3-642-10549-4

  • eBook Packages: Computer Science, Computer Science (R0)