Implementable Failure Detectors in Asynchronous Systems
The failure detectors discussed in the literature are either impossible to implement in an asynchronous system, or their exact guarantees have not been stated. We introduce an infinitely-often accurate failure detector that can be implemented in an asynchronous system. We provide one such implementation and show its application to the fault-tolerant server maintenance problem. We also show that some natural timeout-based failure detectors implemented on Unix are not sufficient to guarantee infinitely-often accuracy.
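To make the notion of a timeout-based failure detector concrete, the following is a minimal sketch of a generic heartbeat scheme. It is not the implementation from the paper. The class name `TimeoutDetector`, its methods, and the `increment` back-off step are all hypothetical, and time is passed in explicitly so the example runs deterministically. Because message delay in an asynchronous system is unbounded, a detector of this kind can suspect a process that is merely slow, which the usage lines at the end demonstrate.

```python
from dataclasses import dataclass, field


@dataclass
class TimeoutDetector:
    """Suspects a peer when no heartbeat arrives within `timeout` time units.

    Illustrative only: names and the back-off policy are assumptions,
    not taken from the paper.
    """
    timeout: float
    increment: float = 1.0  # hypothetical back-off step after a false suspicion
    last_heartbeat: dict = field(default_factory=dict)
    suspected: set = field(default_factory=set)

    def on_heartbeat(self, peer: str, now: float) -> None:
        # A heartbeat from a suspected peer shows the suspicion was wrong,
        # so withdraw it and lengthen the timeout.
        if peer in self.suspected:
            self.suspected.discard(peer)
            self.timeout += self.increment
        self.last_heartbeat[peer] = now

    def check(self, now: float) -> set:
        # Suspect every peer whose last heartbeat is older than the timeout.
        for peer, seen in self.last_heartbeat.items():
            if now - seen > self.timeout:
                self.suspected.add(peer)
        return set(self.suspected)


# Usage: process "p" stalls but never crashes.
detector = TimeoutDetector(timeout=2.0)
detector.on_heartbeat("p", now=0.0)
slow_but_alive = detector.check(now=5.0)  # "p" is suspected although it is alive
detector.on_heartbeat("p", now=6.0)       # the late heartbeat withdraws the suspicion
after_recovery = detector.check(now=7.0)
```

Lengthening the timeout after each mistake is one common way to reduce repeated false suspicions. Whether a given scheme of this kind meets an accuracy guarantee such as infinitely-often accuracy depends on details this sketch does not model.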