Extended Heartbeat Mechanism for Fault Detection Service Methodology

Mohd. Noor, Ahmad Shukri; Mat Deris, Mustafa

doi:10.1007/978-3-642-10549-4_11

Ahmad Shukri Mohd. Noor &
Mustafa Mat Deris

Part of the book series: Communications in Computer and Information Science (CCIS, volume 63)

Included in the following conference series:

International Conference on Grid and Distributed Computing

Abstract

Fault detection is crucial to providing a scalable, dependable, and highly available grid computing environment. The most popular technique for detecting faults is the heartbeat mechanism, which monitors grid resources at very short intervals. Its weakness is that some time must pass before a node is recognized as faulty, which delays recovery actions. This delay arises because the status of each transaction is not indexed, so the detector must wait for a certain time interval before concluding that a node has failed. This paper proposes a fault detection mechanism and service based on an extended heartbeat mechanism. The technique introduces an index server for indexing transactions and uses a pinging service as a push mechanism. The model outperformed existing techniques, reducing the time taken to detect a fault by approximately 30%. The mechanism also provides a basis for deploying customizable recovery actions.
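
The contrast described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the `interval` and `timeout` parameters, the status strings, and the `ping` callback are all hypothetical, and the abstract does not specify how the index server or pinging service are actually built. The baseline detector waits for a full timeout of silence, while the indexed variant keeps a per-node status and confirms a missed heartbeat with an immediate ping.

```python
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Baseline: a node is reported faulty only after `timeout` seconds of silence."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_seen: Dict[str, float] = {}

    def record(self, node: str, now: float) -> None:
        # Store the arrival time of the latest heartbeat from this node.
        self.last_seen[node] = now

    def faulty(self, now: float) -> List[str]:
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)


class IndexedMonitor(HeartbeatMonitor):
    """Hypothetical extension: indexed node status plus a ping on a missed heartbeat."""

    def __init__(self, timeout: float, interval: float,
                 ping: Callable[[str], bool]) -> None:
        super().__init__(timeout)
        self.interval = interval          # expected gap between heartbeats (assumed)
        self.ping = ping                  # returns True if the node answers (assumed)
        self.index: Dict[str, str] = {}   # node -> "alive" or "faulty"

    def record(self, node: str, now: float) -> None:
        super().record(node, now)
        self.index[node] = "alive"

    def faulty(self, now: float) -> List[str]:
        for node, seen in self.last_seen.items():
            silent = now - seen
            if silent > self.timeout:
                # Same rule as the baseline once the full timeout has passed.
                self.index[node] = "faulty"
            elif silent > self.interval and not self.ping(node):
                # One heartbeat missed and the ping failed: mark faulty early.
                self.index[node] = "faulty"
        return sorted(n for n, s in self.index.items() if s == "faulty")


baseline = HeartbeatMonitor(timeout=3.0)
baseline.record("n1", now=0.0)
print(baseline.faulty(now=1.5))   # [] : still inside the timeout window

extended = IndexedMonitor(timeout=3.0, interval=1.0, ping=lambda node: False)
extended.record("n1", now=0.0)
print(extended.faulty(now=1.5))   # ['n1'] : ping failed after one missed interval
```

In this toy setup the indexed monitor reports the node at 1.5 s, whereas the baseline needs more than 3 s. The numbers are arbitrary and say nothing about the roughly 30% improvement reported in the paper.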

Author information

Authors and Affiliations

Department of Computer Science, Faculty of Science and Technology, Universiti Malaysia Terengganu, 21030, Kuala Terengganu, Malaysia
Ahmad Shukri Mohd. Noor
Faculty of Multimedia and Information Technology, Universiti Tun Hussein Onn Malaysia, 86400, Parit Raja, Batu Pahat, Johor Darul Takzim, Malaysia
Mustafa Mat Deris

Editor information

Editors and Affiliations

University of Warsaw and Infobright Inc.
Dominik Ślęzak
Hannam University, 306-791, Daejeon, South Korea
Tai-hoon Kim
Department of Computer Science and Engineering, Arizona State University, AZ 85287-8809, Tempe, USA
Stephen S. Yau
Department of Mathematics and Computer Science, University of Perugia, Via Vanvitelli, 1, 06123, Perugia, Italy
Osvaldo Gervasi
University of Tasmania, 7001, Hobart, Australia
Byeong-Ho Kang

Cite this paper

Mohd. Noor, A.S., Mat Deris, M. (2009). Extended Heartbeat Mechanism for Fault Detection Service Methodology. In: Ślęzak, D., Kim, Th., Yau, S.S., Gervasi, O., Kang, BH. (eds) Grid and Distributed Computing. GDC 2009. Communications in Computer and Information Science, vol 63. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10549-4_11

DOI: https://doi.org/10.1007/978-3-642-10549-4_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10548-7
Online ISBN: 978-3-642-10549-4
eBook Packages: Computer Science, Computer Science (R0)
