Abstract
Backward error recovery is an important technique in software fault tolerance. Because errors propagate between communicating processes, recovery in distributed software requires the processes to cooperate so that they are restored to a consistent state. Existing techniques for achieving this, however, either reduce the level of concurrency or suffer from the domino effect. Based on a formal model of the distributed system, this paper specifies a backward recovery protocol that avoids both drawbacks. The algorithm of the protocol is proven rigorously, and an implementation is proposed.
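The domino effect mentioned in the abstract can be illustrated with a small sketch. This is not the protocol specified in the paper; it only shows how uncoordinated checkpoints let one rollback cascade through other processes. The function `recovery_line`, its data layout, and the example timeline are hypothetical, and the sketch assumes a single global clock, an initial checkpoint at time 0 for every process, and event times greater than 0.

```python
def recovery_line(checkpoints, messages, failed, fail_time):
    """Illustrative rollback propagation (hypothetical, not the paper's protocol).

    checkpoints: dict mapping process name -> list of checkpoint times (must contain 0)
    messages:    list of (sender, send_time, receiver, recv_time)
    Returns a dict mapping process name -> checkpoint time it is restored to;
    float("inf") means the process does not need to roll back.
    """
    def latest_before(proc, t):
        # Latest checkpoint taken strictly before time t.
        return max(c for c in checkpoints[proc] if c < t)

    line = {p: float("inf") for p in checkpoints}
    line[failed] = latest_before(failed, fail_time)

    changed = True
    while changed:
        changed = False
        for sender, send_t, receiver, recv_t in messages:
            # Orphan message: its send has been undone, but its receipt
            # is still part of the receiver's restored state.
            if send_t > line[sender] and recv_t <= line[receiver]:
                line[receiver] = latest_before(receiver, recv_t)
                changed = True
    return line


# Assumed timeline: messages cross every checkpoint interval.
checkpoints = {"P": [0, 3, 10], "Q": [0, 6, 13]}
messages = [
    ("Q", 1, "P", 2),
    ("P", 4, "Q", 5),
    ("Q", 7, "P", 8),
    ("P", 11, "Q", 12),
]
result = recovery_line(checkpoints, messages, failed="P", fail_time=14)
print(result)
```

In this assumed timeline, the failure of `P` at time 14 forces both processes back to their initial checkpoints, because each rollback turns an earlier message into an orphan. The sketch ignores messages that are merely lost (sent before the sender's restored checkpoint but received after the receiver's), which a real scheme would also have to handle.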
Cite this article
Liu, C., Wen, C. Model and algorithm of backward error recovery of distributed software. J. of Comput. Sci. & Technol. 4, 275–285 (1989). https://doi.org/10.1007/BF02943542
DOI: https://doi.org/10.1007/BF02943542