Abstract
Fault detection is a crucial part of providing a scalable, dependable and highly available grid computing environment. The most popular technique for detecting faults is the heartbeat mechanism, which monitors grid resources at very short intervals. However, this technique has a weakness: some time passes before a node is recognized as faulty, which delays the recovery actions to be taken. The delay arises because the status of each transaction is not indexed, so the detector must wait for a certain time interval before concluding that a node has failed. In this paper, a fault detection mechanism and service using an extended heartbeat mechanism is proposed. The technique introduces an index server for indexing transactions and uses a pinging service as a push mechanism. The model outperforms existing techniques, reducing the time taken to detect a fault by approximately 30%. The mechanism also provides a basis for deploying customizable recovery actions.
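The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives only the outline, so every name here (`HeartbeatMonitor`, `IndexedMonitor`, `check_transaction`, the `ping` callable, the status strings) is a hypothetical stand-in, and time is passed in explicitly so the example stays deterministic. The plain monitor can only suspect a node once a timeout has elapsed, while the indexed variant looks up an unfinished transaction and pings its node straight away.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class HeartbeatMonitor:
    """Plain heartbeat detection: a node is suspected only after `timeout`."""
    timeout: float
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record_heartbeat(self, node: str, now: float) -> None:
        # Remember when the node last reported in.
        self.last_seen[node] = now

    def suspected(self, now: float) -> List[str]:
        # Nodes whose last heartbeat is older than the timeout.
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout)


@dataclass
class IndexedMonitor(HeartbeatMonitor):
    """Assumed extension: an index of transactions plus an on-demand ping."""
    index: Dict[str, Tuple[str, str]] = field(default_factory=dict)

    def record_transaction(self, txn_id: str, node: str, status: str) -> None:
        # Index the transaction by id: which node runs it and its status.
        self.index[txn_id] = (node, status)

    def check_transaction(self, txn_id: str,
                          ping: Callable[[str], bool]) -> Optional[str]:
        # Return the node name if it fails the ping, otherwise None.
        node, status = self.index[txn_id]
        if status == "done":
            return None  # finished transactions need no check
        return None if ping(node) else node


monitor = IndexedMonitor(timeout=3.0)
monitor.record_heartbeat("node-1", now=0.0)
monitor.record_transaction("txn-1", "node-1", "running")

# Heartbeat alone cannot flag the node before the timeout expires.
print(monitor.suspected(now=2.0))
# The indexed check flags it as soon as the ping fails.
print(monitor.check_transaction("txn-1", ping=lambda node: False))
```

In this sketch the indexed check does not replace the heartbeat; it adds a second path that does not have to wait out the timeout. How much time that saves in practice depends on the heartbeat interval and ping latency, which the abstract does not specify.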
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mohd. Noor, A.S., Mat Deris, M. (2009). Extended Heartbeat Mechanism for Fault Detection Service Methodology. In: Ślęzak, D., Kim, Th., Yau, S.S., Gervasi, O., Kang, BH. (eds) Grid and Distributed Computing. GDC 2009. Communications in Computer and Information Science, vol 63. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10549-4_11
DOI: https://doi.org/10.1007/978-3-642-10549-4_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10548-7
Online ISBN: 978-3-642-10549-4
eBook Packages: Computer Science, Computer Science (R0)