Abstract
In general, offering a fault-tolerant service boils down to executing replicas of a service process on different nodes of a distributed system. The service is fault-tolerant in the sense that it is still performed correctly even if some of the nodes hosting a replica behave maliciously. To guarantee the correctness of a fault-tolerant service despite the presence of maliciously functioning nodes, it is of key importance that all faulty nodes are removed from the service in a timely manner. Faulty nodes are detected by tests performed by the nodes offering the service. In practice, tests always have imperfect fault coverage. This paper describes a distributed diagnosis algorithm with imperfect tests, by means of which all detectably faulty nodes are removed from a fault-tolerant service. This may, however, inevitably imply the removal of a number of correctly functioning nodes as well. The maximum number of correctly functioning nodes removed from the service by the algorithm is calculated. Finally, the minimum number of nodes required in a fault-tolerant service to perform this diagnosis algorithm is given.
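The trade-off between removing faulty nodes and losing correct ones can be illustrated with a small sketch. This is not the algorithm of the paper, whose details the abstract does not give. It assumes a simple rule chosen for illustration only: a correct node never fails a test on another correct node, so every failed test involves at least one faulty node, and both the tester and the tested node are removed. The function name `remove_accused_pairs` and the input format are hypothetical.

```python
def remove_accused_pairs(nodes, accusations):
    """Illustrative pair-removal rule (an assumption, not the paper's algorithm).

    nodes:       set of node identifiers currently in the service.
    accusations: ordered list of (tester, tested) pairs, one per failed test.

    Assumption: a correct tester never fails a correct node, so each
    accusation has at least one faulty endpoint. Removing both endpoints
    therefore removes at least one faulty node per processed accusation,
    at the cost of at most one correct node.
    """
    remaining = set(nodes)
    removed = set()
    for tester, tested in accusations:
        # Skip accusations that involve a node already removed or unknown.
        if tester == tested:
            continue
        if tester not in remaining or tested not in remaining:
            continue
        remaining -= {tester, tested}
        removed |= {tester, tested}
    return remaining, removed


if __name__ == "__main__":
    remaining, removed = remove_accused_pairs(
        {1, 2, 3, 4, 5},
        [(1, 2), (2, 3), (4, 1)],
    )
    print(sorted(remaining), sorted(removed))
```

Under the stated assumption, the number of correct nodes removed by this rule never exceeds the number of faulty nodes removed, which gives an intuition for why a bound on wrongly removed nodes can exist at all. The rule makes no claim about removing every detectably faulty node.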
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Postma, A., Hartman, G., Krol, T. (1996). Removal of all faulty nodes from a fault-tolerant service by means of distributed diagnosis with imperfect fault coverage. In: Hlawiczka, A., Silva, J.G., Simoncini, L. (eds) Dependable Computing — EDCC-2. EDCC 1996. Lecture Notes in Computer Science, vol 1150. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61772-8_50
DOI: https://doi.org/10.1007/3-540-61772-8_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61772-3
Online ISBN: 978-3-540-70677-9
eBook Packages: Springer Book Archive