Foundations of Dependable Computing, pp 3-38

# Adaptive System-Level Diagnosis in Real-Time

## Abstract

Distributed real-time systems are subject to stricter fault-tolerance requirements than non-real-time systems. This work applies system-level diagnosis to a real-time distributed system as a first step in providing fault tolerance. An existing algorithm for distributed system-level diagnosis, Adaptive_DSD (ADSD), is converted to a real-time framework, establishing a deadline for the end-to-end diagnosis latency. Rate monotonic analysis is chosen as the framework for achieving real-time performance. The ADSD algorithm is converted into a set of independent periodic tasks running at each node, and a systematic procedure is used to assign priorities and deadlines so as to minimize the hard deadline of the diagnosis function. The resulting algorithm, Real-Time Adaptive Distributed System-Level Diagnosis (RT-ADSD), is fully compatible with a real-time environment in which both the processors and the network support fixed-priority scheduling. The RT-ADSD algorithm provides a useful first step in adding fault tolerance to distributed real-time systems by quickly and reliably diagnosing node failures. The key results presented here include a framework for specifying real-time distributed algorithms and a scheduling model for analyzing them that accounts for many requirements of distributed systems, including network I/O, task jitter, and critical sections caused by shared resources.
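As a rough illustration of the rate monotonic framework named above, the following minimal sketch assigns fixed priorities by period and applies the classical Liu and Layland utilization bound to a set of periodic tasks. The task names, execution times, and periods are hypothetical and are not taken from the chapter. The test is only the basic sufficient condition for independent tasks on a single processor; it does not model the network I/O, task jitter, or critical sections that the chapter's scheduling model accounts for.

```python
# Minimal sketch of rate monotonic priority assignment and the
# Liu-Layland utilization test. Task set is hypothetical.

def rm_priorities(tasks):
    """Return task names ordered highest priority first (shortest period first)."""
    return [name for name, _wcet, _period in sorted(tasks, key=lambda t: t[2])]

def utilization(tasks):
    """Total processor utilization: sum of execution time / period."""
    return sum(wcet / period for _name, wcet, period in tasks)

def liu_layland_bound(n):
    """Utilization bound n * (2^(1/n) - 1) for n periodic tasks."""
    return n * (2 ** (1 / n) - 1)

def rm_schedulable(tasks):
    """Sufficient (not necessary) test: utilization at or below the bound."""
    return utilization(tasks) <= liu_layland_bound(len(tasks))

# Hypothetical per-node tasks: (name, worst-case execution time, period), in ms.
TASKS = [
    ("forward_info", 5, 50),
    ("test_neighbor", 2, 20),
    ("update_state", 10, 100),
]

if __name__ == "__main__":
    print("priority order:", rm_priorities(TASKS))
    print("utilization:", utilization(TASKS))
    print("bound:", liu_layland_bound(len(TASKS)))
    print("passes sufficient test:", rm_schedulable(TASKS))
```

A task set that fails this test is not necessarily unschedulable; an exact response-time analysis would be needed to decide such cases.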

## Keywords

Idle Time, Critical Section, Schedule Model, Diagnosis Latency, Priority Task

