Abstract
This paper introduces algorithms for scheduling fault recovery operations on systems that must preserve the timing correctness of critical application tasks in the presence of faults. The algorithms are based on methods that reserve time for processing recovery tasks at the design stage. This allows recovery tasks to be scheduled with very low run-time overhead, complementing or reducing the need for hardware replication to support dependable operation. Although previous work has advocated the use of reservation methods, no formal methodology exists for allocating such time. A methodology is developed that facilitates the difficult task of verifying the timing correctness of a desired reservation strategy. In addition, simulation results are presented that give insight into the effectiveness of different reservation strategies in averting timing failures under a variety of transient recovery loads.
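The idea of checking a design-time reservation can be illustrated with a minimal sketch. It is not the paper's methodology, which the abstract does not detail. It assumes, purely for illustration, that the reserved recovery time is modeled as one extra periodic task under rate-monotonic scheduling, and it applies the well-known Liu and Layland utilization bound, which is a sufficient but not necessary test. The function names, the task tuples, and the sample numbers are all hypothetical.

```python
from typing import List, Tuple

# A task is (worst-case execution time, period), both in the same time unit.
Task = Tuple[float, float]


def liu_layland_bound(n: int) -> float:
    """Utilization bound n * (2^(1/n) - 1) for n periodic tasks under
    rate-monotonic scheduling (sufficient condition only)."""
    return n * (2 ** (1.0 / n) - 1)


def schedulable_with_reservation(tasks: List[Task], reserve: Task) -> bool:
    """Return True if the critical tasks plus the recovery reservation
    pass the utilization-bound test.

    Assumption: the reservation behaves like one more periodic task.
    A False result means the test is inconclusive, not that deadlines
    are necessarily missed.
    """
    all_tasks = tasks + [reserve]
    utilization = sum(cost / period for cost, period in all_tasks)
    return utilization <= liu_layland_bound(len(all_tasks))


# Hypothetical critical task set with total utilization 0.26.
critical = [(1.0, 10.0), (2.0, 20.0), (3.0, 50.0)]

# Reserving 2 time units every 10 raises utilization to 0.46.
print(schedulable_with_reservation(critical, (2.0, 10.0)))

# Reserving 6 time units every 10 raises utilization to 0.86.
print(schedulable_with_reservation(critical, (6.0, 10.0)))
```

With four tasks the bound is roughly 0.757, so the smaller reservation passes the test and the larger one does not. An exact analysis could still accept a reservation that this bound rejects.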
Copyright information
© 1995 Springer-Verlag/Wien
About this paper
Cite this paper
Ramos-Thuel, S., Strosnider, J.K. (1995). Scheduling Fault Recovery Operations for Time-Critical Applications. In: Cristian, F., Le Lann, G., Lunt, T. (eds) Dependable Computing for Critical Applications 4. Dependable Computing and Fault-Tolerant Systems, vol 9. Springer, Vienna. https://doi.org/10.1007/978-3-7091-9396-9_35
DOI: https://doi.org/10.1007/978-3-7091-9396-9_35
Publisher Name: Springer, Vienna
Print ISBN: 978-3-7091-9398-3
Online ISBN: 978-3-7091-9396-9
eBook Packages: Springer Book Archive