1 Introduction

Distributed applications require reliable network support that is efficient in real time. For economic reasons, network resources cannot be reserved for each application and have to be shared with many other applications. Typical examples are distributed sensor-controller communications known as “Networked Control Systems” (NCS), applied in integrated manufacturing processes, traffic control systems, or Cyber-Physical Systems. Besides the unavoidable network delays, transmission errors may occur through noise interference or through information losses caused by buffer limitations. These effects require strong communication protocol support to guarantee Service Level Objectives (SLO). The performance of distributed control applications is further affected by all other applications which share the common communication infrastructure. The development of such distributed real-time control systems therefore requires methods by which the influence of all interfering sources can be estimated, in order to predict their effects on each considered application and its individual SLO requirements.

Research has already been directed at specific distributed application systems such as sensor networks, city, smart energy, or air traffic control systems, robot and integrated production control systems, or disaster control operations, c.f. [1,2,3,4,5,6,7,8,9,10,11,12,13]. In most of these cases dedicated networks have been applied which are designed or configured for the specific application SLOs, such as Local Area Networks (LAN/WLAN) with priority options for real-time support. Typically, control and network analyses are treated independently; this paper attempts an integrated multi-layer analysis by aggregating the complete network infrastructure into a stochastic equivalent phase which is inserted into the specific application control layer. In a first paper [14] we studied a single Networked Control System where the whole functionality of the underlying network is aggregated into a functional module operated on the Link Layer (2b) with acknowledgment signaling and Timeout control for each frame.

In the remaining part of this paper we inspect in Sect. 2 several principal alternatives for shared communication networks, specifically with respect to the layer on which the Application Layer is based. In particular, we define a local area network as shared infrastructure for all control applications, i.e., an infrastructure where the control application is placed upon an enhanced common Media Access Control (MAC) Layer (2a) to increase the reliability and real-time performance of the control systems. In Sect. 3 a comprehensive model is developed for the performance evaluation of the shared infrastructure (MAC Layer), which is represented by an extended task graph. Its performance is analyzed exactly, resulting in an aggregated virtual service time which represents the whole MAC layer including all control communications. In Sect. 4 single-server queuing models are suggested which represent the whole NCS with all control loops, and from which the aggregated performance of the control systems is derived based on the shared infrastructure. Results are presented and discussed in Sect. 5, and conclusions are drawn in Sect. 6.

2 Architectural Alternatives for NCSs

Communication networks, their interfaces and protocols are standardized for reasons of interoperability, e.g., by the general ISO model for Open System Interconnection (OSI) or by the dominating IETF RFCs for the Internet and its well-known principal layers, c.f. Fig. 1. Our distributed control functionalities belong to Layer 7, which can, in principle, be placed on top of any of the layers below it: as a pure hardware solution directly across wires of the PHY Layer 1 for signal transmissions as in classical electrical control systems, on the MAC Layer 2a of shared LANs for frame exchanges, on the Link Layer 2b for safe exchange of frames as in our papers [14, 15], on the Network Layer 3 for an unreliable end-to-end exchange of packets (but with enhanced transport options), or on the Transport Layer 4 for a safe end-to-end exchange of byte streams. The functional support increases in the upward direction, but the end-to-end delays increase accordingly. For our control application, Layers 3 and 4 are less attractive or have to be complemented with respect to reliability and real-time performance. In this paper we restrict our aims to applications within a local area, such as manufacturing plants, inventory and logistics management, traffic control, or IoT applications. We therefore prefer to put the Application Layer directly upon the MAC Layer. This MAC layer has, however, to be enhanced with functions of the LLC with respect to error control and performance efficiency.

Fig. 1. Reference model for open system interconnection

3 Enhanced MAC Layer Architecture for Shared Infrastructures

3.1 Modeling of the Enhanced MAC Layer

Conventionally, media access control takes care of efficient access to the common transmission medium among independently acting “stations” through functions such as channel activity sensing, contention resolution, access right signaling by Token circulation, or reservation requests. To integrate error control and request/response functionalities in the MAC layer we can make use of the following further options: (1) Acknowledgment signaling after frame reception, (2) Different Inter-Frame Spacing for service class distinctions, (3) Slot-based contention resolution for channel access, (4) Carrier Sensing for collision detection (CD) during transmission, and (5) Channel Reservation for Request/Response cycles. Properties (1) and (2) were suggested by the main author already in 1983 [16] for service differentiation and real-time channel access enhancements in LANs; these functions were adopted later in the development of the IEEE 802.11 series of WLAN standards. Property (3) is known from early MAC protocols as the periodic p-persistent CSMA channel access based on attempts within a slot time Δt [17]. In WLANs, station synchronization takes place by an ACK signal after a successful frame transmission. Property (5) can be used for channel reservation for a whole request/response control cycle. The handling of the “Hidden Stations” problem in WLANs by RTS/CTS-based channel reservation with the Network Allocation Vector (NAV, c.f. [23]) can be applied in our approach; for real-time control applications within plants, however, the centralized controller stations should be located within a mutually reachable area, in which case the RTS/CTS cycle can be omitted. Based on the properties (1)–(4) we suggest the following model of the enhanced MAC Layer of the shared communication infrastructure, c.f. Fig. 2. The “Shared Medium” in Fig. 2 can be a wired or wireless channel which can be accessed by the stations. One or several controllers are the recipients of the frames sent from stations (not shown explicitly in Fig. 2).

Fig. 2. Enhanced MAC layer performance model for multiple networked control systems with shared local area network communication infrastructure

The abbreviations in Fig. 2 and in the analysis are self-explanatory: x indicates the number of stations, D denotes probabilistic decisions, T indicates random time variables, and P and q are probabilities; index numbers refer to different applications of the variables. Δt is the slot time for channel access and has to be larger than the largest round-trip time.

Two operating modes will be distinguished for the channel access and reservation: Mode 1 for Event-based control and Mode 2 for Time-based control. In Mode 1 sensor signals for a full request/response cycle are generated only when certain defined sensor threshold values are exceeded, e.g., a speed, water, or temperature level or fire/gas concentration alarms. Arrival processes are typically clustered rare events. Mode 2 addresses periodic channel reservations for a full request/response cycle sensor-controller-actuator. The arrival process type is D (deterministic), i.e., constant inter-arrival times.

The operation of control activities across the shared infrastructure is based on the access competition for the common channel infrastructure among all plant stations. When a plant station has won the access competition, all other stations have no access right to the channel until the ongoing plant-controller control cycle has been successfully completed. The plant station and the responding controller station have exclusive access to the channel. Two buffers B1 and B2 are used for intermediate buffering of the plant-state frame and the controller-response frame, respectively, for repeated frame transmission in case of a transmission error, c.f. dashed links in Fig. 2. After each successful cycle both buffer contents are cleared by which mutual interferences between competing stations are excluded. To avoid illegal channel tapping of information a strict encryption coding is required. When a plant has gained access to the common channel, the access right remains with that station and with the controller until the activity for that event has been completed successfully. Two signaling messages are applied: ACK acknowledges a complete request/response control cycle between a plant and its corresponding controller. This frame is destined to the corresponding plant and carries the immediate controller response to the plant and the acknowledgment of the successful plant-controller cycle. If the ACK-frame is in error, which happens with probability q2, it is repeated until correct reception. NAK is a negative acknowledgment used by the controller or by the plant and is applied in cases when a frame is received in error (detected by the common frame error control check) which happens with probability q1 for an information frame and with probability q2 for a response frame. 
Upon reception of the NAK frame, which happens with probability q1 (or q2), the receiver of the NAK (plant or controller) immediately repeats its frame buffered in B1 (or B2, respectively) without another channel access competition.
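The effect of the immediate repetitions on the cycle time can be illustrated by a short worked equation. It is only a sketch under the assumption, not stated explicitly above, that frame errors are independent from one transmission attempt to the next; N1 is a symbol introduced here for the number of transmissions of an information frame until it is received without error.

$$ P\{N_1 = n\} = q_1^{\,n-1}\,(1 - q_1), \quad n = 1, 2, \ldots, \qquad E[N_1] = \sum_{n=1}^{\infty} n\, q_1^{\,n-1}\,(1 - q_1) = \frac{1}{1 - q_1}. $$

Under the same assumption the response frame needs 1/(1 − q2) transmissions on average. For q1 = 0.01, for example, a frame is sent about 1.01 times on average, so error recovery adds roughly one percent to the mean transmission effort.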

3.2 Performance Analysis of the MAC Layer

The model for a transmission/acknowledgment cycle of one frame in Fig. 2 is a special case of a Directed Acyclic Graph (DAG) and can be analyzed exactly. The exact mathematical analysis of general DAGs (which also includes logical synchronization conditions) was suggested by the main author in [18] and has been applied, among others, to the exact analysis of NCSs for the LLC 2 protocol models “Send-and-Wait” (SW) and “Selective Repeat” (SR) with Timeout recovery [15], where parallel activities had to be considered to avoid a livelock of the protocol function in case of a frame loss. The performance analysis will be explained through a step-wise aggregation of independent stochastic phases Ti for i ∈ {0,1,2,P,PD,A} by the task graph reduction method introduced in [15]. This allows for the exact analysis of the control model, within which the whole aggregated time for channel access and communication is represented by an equivalent stochastic phase TC(x).

As outlined in Sect. 3.1, the channel access is based on the principle of the p-persistent CSMA/CD protocol with x stations, c.f. [17]. The channel access resolution is based on slotted periods of length Δt. We assume that each station takes part in the channel access competition by deciding to send its frame in the next slot with probability p. If several stations send during this slot, a collision occurs which is detected by the CD function, and all involved stations abort sending. The same procedure is repeated in the successive slots until only one station attempts the channel access during a slot: this station has won the competition and proceeds with construction of the frame. The channel access time T0 is indicated in Fig. 2 and defines the aggregated duration of an arbitrary channel access as a multiple of the slot time Δt, where T0 = (j + 1)·Δt, j = 1,2,... It consists of j slots of length Δt for the contention resolution plus one Δt accounting for the slot during which the channel competition has been won. The slot time has to be sized to Δt ≥ 2τ, where τ denotes the propagation delay for signals between the two most distantly located stations; this guarantees that a safe decision can be made after each slot. Explicit formulas are obtained for: the probability PS for a successful frame transmission, the first and second ordinary moments of the random contention interval J, the Laplace-Stieltjes (LS) transform Φ0(s), the mean E[T0] and the coefficient of variation c0 of T0, and the optimized value of the channel access probability p.
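The slotted contention described above can be illustrated with a minimal Python sketch. It is not the exact analysis of this paper, whose formulas are not reproduced here. It assumes that all x stations contend in every slot and decide independently, so that a slot is won with probability PS = x·p·(1 − p)^(x − 1) and the number of slots up to and including the winning slot is geometrically distributed. The function names are hypothetical.

```python
def success_probability(p, x):
    """Probability that exactly one of x stations sends in a slot.

    Assumes every station sends independently with probability p.
    """
    return x * p * (1.0 - p) ** (x - 1)


def optimal_p(x):
    """Access probability that maximizes success_probability for x stations.

    Setting the derivative of x*p*(1-p)**(x-1) with respect to p to zero
    gives p = 1/x.
    """
    return 1.0 / x


def mean_access_time(x, slot, p=None):
    """Mean channel access time under the geometric-slot assumption.

    The mean number of slots up to and including the winning slot is
    1/PS, so the mean access time is slot/PS.
    """
    if p is None:
        p = optimal_p(x)
    return slot / success_probability(p, x)


if __name__ == "__main__":
    slot_us = 10.0  # slot time in microseconds, as in Sect. 5.1
    for stations in (1, 2, 10, 100):
        ps = success_probability(optimal_p(stations), stations)
        print(stations, round(ps, 4), round(mean_access_time(stations, slot_us), 2))
```

With p = 1/x the success probability falls from 1 for a single station towards 1/e for many stations, so under these assumptions the mean access time stays below e slot times regardless of x.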

TC indicates the random time from the successful channel access to the successful completion of a whole Sensor-Controller-Actuator cycle. It has been analyzed exactly by the mathematical task-graph reduction method [18], resulting in the Laplace transform (LT) ΦC(s), the mean E[TC] and the coefficient of variation cc of TC. From these quantities we can approximate the cumulative distribution function (CDF) of TC by phase-type distributions of hypo- or hyper-exponential type (0 ≤ cc < ∞).
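One common way to carry out such a two-moment approximation is sketched below in Python. It is an illustration under stated assumptions and not necessarily the substitution used by the authors: for a coefficient of variation of at least 1 it uses a two-branch hyper-exponential distribution with balanced means, and for values below 1 a mixture of Erlang(k − 1) and Erlang(k) distributions with a common rate. All function names are hypothetical, and a coefficient of variation of exactly 0 (a constant) is excluded.

```python
import math


def fit_two_moments(mean, cv):
    """Fit a phase-type distribution to a mean and coefficient of variation.

    Returns a list of branches (probability, number_of_stages, rate); each
    branch is an Erlang distribution with the given stages and rate.
    """
    if mean <= 0.0 or cv <= 0.0:
        raise ValueError("mean and cv must be positive")
    scv = cv * cv
    if scv >= 1.0:
        # Hyper-exponential with two branches and balanced means.
        root = math.sqrt((scv - 1.0) / (scv + 1.0))
        p1 = 0.5 * (1.0 + root)
        p2 = 1.0 - p1
        return [(p1, 1, 2.0 * p1 / mean), (p2, 1, 2.0 * p2 / mean)]
    # Hypo-exponential case: choose k with 1/k <= scv <= 1/(k-1).
    k = max(2, math.ceil(1.0 / scv))
    inner = max(0.0, k * (1.0 + scv) - k * k * scv)  # guard against rounding
    p = (k * scv - math.sqrt(inner)) / (1.0 + scv)
    rate = (k - p) / mean
    return [(p, k - 1, rate), (1.0 - p, k, rate)]


def moments(branches):
    """Return (mean, cv) of the mixture of Erlang branches."""
    m1 = sum(p * n / r for p, n, r in branches)
    m2 = sum(p * n * (n + 1) / (r * r) for p, n, r in branches)
    return m1, math.sqrt(m2 / (m1 * m1) - 1.0)


if __name__ == "__main__":
    for target in ((3.0, 0.4), (2.0, 2.5)):
        print(target, moments(fit_two_moments(*target)))
```

The helper `moments` recomputes mean and coefficient of variation from the fitted branches, which gives a quick check that both target moments are matched.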

Note:

The complete mathematical analysis and its explicit results cannot be presented here for reasons of limited space and will be part of a forthcoming paper (but can be provided on request by the main author). We will only outline the principal course of derivation of the key performance metrics for the Channel Access Time T0 and the duration TC of a successful control cycle. The same holds true for the analysis of the application layer control models: the derived results for TC are used as a “virtual service time” within a queuing model of the Application Layer.

4 Application Layer Performance

4.1 Application Layer Queuing Models

4.1.1 Open-Loop NCS Applications

The Application Layer is placed directly above the shared communication infrastructure of the extended MAC layer. It will be modeled by means of queuing systems of the type GI/G/1, where GI (General Independent) indicates a stochastic point process of control request arrivals and G (General) represents the aggregated MAC layer service phase TC for one complete request/response cycle with its channel access, frame transmission, and acknowledgment time components. Figure 3 illustrates the total system model for an “Open-Loop NCS”, i.e., for one-way communication between a station and a controller or vice versa. The competitive effects of the shared medium for all control circuits are reflected in the resulting delay for processing each control request.

Fig. 3. Aggregated queuing model for an NCS with shared communication infrastructure with FIFO, LIFO, and RANDOM order of service

Queuing theory is a highly developed discipline with more than 100 years of research and experience and a rich selection of analytic results. For a number of specific arrival/service process types exact results exist, e.g., for the M/G/1 and GI/M/1 model types, where M stands for Markovian and G for General process types, typically with the FIFO (first-in, first-out) queuing discipline. For the LIFO (last-in, first-out) and RANDOM disciplines the first and second order moments of the waiting times are also known [21], from which we can approximate the delay distributions through the Weibull distribution function [19]. Once a correct model has been defined, many cases can be solved by use of tabled results on standard queuing system types [19, 20]. We therefore encourage the use of tabled results, or of simulations when no analytic results are available, based on adequate system models.
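As an illustration of how such standard results are applied, the following minimal Python sketch evaluates the Pollaczek-Khinchine mean-value formula for an M/G/1 queue with the virtual service time TC as service phase. The function and parameter names are hypothetical, and the numbers in the usage lines are placeholders rather than results of this paper. The sketch relies on two known M/G/1 properties: the probability of waiting equals the utilization ρ = λ·E[TC], and the mean waiting time is the same for the FIFO, LIFO, and RANDOM disciplines.

```python
def mg1_mean_wait(arrival_rate, mean_service, cv_service):
    """Mean waiting times of an M/G/1 queue (Pollaczek-Khinchine).

    Returns (rho, w_all, w_delayed):
      rho        utilization = arrival_rate * mean_service
      w_all      mean waiting time over all requests
      w_delayed  mean waiting time of requests that have to wait
    """
    rho = arrival_rate * mean_service
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    scv = cv_service * cv_service
    # Mean wait of delayed requests; P{wait > 0} = rho for M/G/1.
    w_delayed = mean_service * (1.0 + scv) / (2.0 * (1.0 - rho))
    w_all = rho * w_delayed
    return rho, w_all, w_delayed


if __name__ == "__main__":
    # Placeholder values: service time 0.3 ms with cv 0.2,
    # aggregate arrival rate 2 requests per ms.
    print(mg1_mean_wait(arrival_rate=2.0, mean_service=0.3, cv_service=0.2))
```

The second-order behavior (coefficient of variation of the delay) does depend on the queuing discipline and is not covered by this sketch.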

4.1.2 Closed-Loop NCS Applications

Closed-Loop applications typically originate from automatic control systems, c.f. the basic model in Fig. 4. The closed-loop model consists of a Plant which can be adjusted by a signal of the Actuator A through the Controller C. The output signal Y (“state of the Plant”) is fed back to the Controller C, where it is compared with a Reference Signal R as control objective; the difference E between R and Y forms the input to the Controller C. The Controller determines a reaction signal which is communicated to the Actuator so that Y is driven towards the reference value R. This basic control loop becomes a Networked Control System (NCS) when Controller and Plant reside at different locations. As a typical application example, the controller function is implemented by software operated on a centralized computer system. The network adds to the total delay in both directions, which is critical in applications with strict real-time SLO requirements, especially when there are further delays through the shared use of the communication network.

Fig. 4. Structure of a Networked Control System (NCS)

The model of Fig. 4 has been analyzed before [14] with a dedicated network infrastructure where the “Network” is a two-way logical link control (LLC) connection (Layer 2b) operated under the “Send-and-Wait” or the “Selective Repeat” protocol with Timeout recovery in case of frame loss or excessive frame delay, separately between A and C and between C and A. In this contribution we extend the model to an NCS with a shared network infrastructure, i.e., a local area network operated on layers 1 (Physical Layer) and 2a (Media Access Control Layer) by a wired or a wireless network. To analyze the extended model two tasks have to be solved:

(1) to replace the Network block N by our Enhanced MAC Layer of Sect. 3.1,

(2) to analyze the resulting control system of Fig. 4.

The analysis of the control systems can be performed in different ways: by Classical Analog Control Theory or by Discrete Time State Theory.

The classical analog control theory for linear systems is based on analytical functions for the analog time signals and their Laplace transforms, and on the definition of standard controller functions such as the PID controller (P: proportional, I: integral, D: derivative). The response y(t) to a standard reference signal r(t), represented by the delta function δ(t) for an “impulse response” or by the unit step function u(t) for a “unit-step response”, is analyzed in the Laplace domain, resulting in Y(s) = LT{y(t)}. More complex systems are non-linear and more difficult to analyze. The discrete time state control theory is based on detailed system state variables and their description by systems of state equations, solved by, e.g., the MATLAB Simulink tool or a computational solution of matrix equation systems. We have applied these methods in [14].
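To make the influence of a network delay on a unit-step response concrete, the following Python sketch simulates a deliberately simple closed loop in discrete time. It is a hypothetical example and not the model analyzed in [14]: the plant is a first-order lag with time constant tau, the controller is a PI controller, and the network is reduced to a constant delay on the feedback path, standing in for the random time TC. All names and parameter values are assumptions made for illustration.

```python
from collections import deque


def step_response(delay_s, kp=1.0, ki=1.0, tau=1.0, dt=0.01, t_end=20.0):
    """Unit-step response of a PI-controlled first-order plant.

    The controller sees the plant output delayed by delay_s seconds
    (constant feedback delay); integration uses the forward Euler method.
    """
    n_delay = int(round(delay_s / dt))
    in_flight = deque([0.0] * n_delay)  # measurements still in the network
    y = 0.0          # plant output
    integral = 0.0   # integral of the control error
    trace = []
    for _ in range(int(round(t_end / dt))):
        in_flight.append(y)
        y_seen = in_flight.popleft()   # output from n_delay steps ago
        error = 1.0 - y_seen           # reference is the unit step
        integral += error * dt
        u = kp * error + ki * integral
        y += dt * (u - y) / tau        # first-order plant dynamics
        trace.append(y)
    return trace


if __name__ == "__main__":
    for delay in (0.0, 0.5, 1.0):
        trace = step_response(delay)
        print(delay, round(max(trace), 3), round(trace[-1], 3))
```

With the default parameters the loop without delay approaches the reference without overshoot, while a feedback delay of the order of the plant time constant produces a pronounced overshoot. This is the qualitative effect that makes the delay contributed by the shared network relevant for the control performance.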

5 Selected Numerical Results and Discussion

5.1 MAC Layer Performance

An example for the extended MAC Layer LAN will be studied to demonstrate the real-time optimized performance of the CSMA/CD p-persistent MAC protocol for different numbers of stations attached to the LAN. The parameters are as follows:

Slot time: Δt = 10 μs

Frame transmission time (constant): TA = TP = 100 μs

Propagation delay time (constant): TPD = 5 μs

Frame Construction/Controller Time: T1 = T2 = 50 μs

Number of Stations: x = 1, ..., 100

Channel Transmission Rate: r = 100 Mbit/s

The aggregated arrival process of requests of all stations is assumed to follow a Poisson process, i.e., the inter-arrival times are negative-exponentially distributed (Type M). This assumption is justified especially when many independent arrival processes are superimposed, even when the individual stations send requests at regular intervals, provided that the stations are not synchronized with each other. The accuracy increases with an increasing number of stations. The resulting queuing model is of the type M/G/1 with either a FIFO queuing discipline in the ideal case or RANDOM in a more realistic case. In the RANDOM case the coefficient of variation of the delay of delayed requests is significantly larger, which affects SLA guarantees of delay quantiles. The key results are given in Table 1. The values of E[TC] and tW are given in multiples of 1 ms. The coefficients of variation of the delay, cDF and cDR, are given for the queue disciplines FIFO (left column part) and RANDOM (right column part), respectively. The results indicate the following properties:

Table 1. Performance results of the MAC layer

(1) The duration of the control cycles E[TC] depends only minimally on the number of stations as a result of the optimized parameter p of the CSMA/CD p-persistent access protocol. The coefficient of variation is low and quite stable over the whole range of the number of attached stations.

(2) The mean waiting time of delayed access requests tW remains stable for low and medium loads and approaches infinity asymptotically when the system capacity is approached. The coefficient of variation cDF of delayed requests results primarily from repeated frame transmissions in case of transmission errors and is hypo-exponential; cDR becomes, however, hyper-exponential for higher loads.

(3) The parameters can be used for system resource sizing when certain SLOs have to be met: (3.1) Meeting an SLA with respect to the mean waiting time tW of an arriving request which has to be delayed: the number of attached stations or the request rate of each station can be limited such that tW stays below a prescribed upper threshold tWTh, independently of the applied queue discipline. (3.2) Meeting an SLA with respect to the real-time critical delay quantile q such that a delayed request will not have to wait longer than a threshold time tTh with probability

$$ q = P\left\{ T_{W} \le t_{Th} \mid T_{W} > 0 \right\}. $$

This SLO depends on the queue discipline and is harder to meet for RANDOM than for FIFO service, c.f. the increased coefficient of variation of delay cDR.

Example:

We construct the complementary CDF WC(t)/W for delayed requests from the first and second order parameters tW and cD of Table 1 by the Weibull DF tabled in [19] for various values of cD. An SLO of p = 0.001 for a delay threshold tTh = 5E[TC] is reached for ρ = 0.078 for the FIFO queue discipline. The same SLO is reached for the RANDOM queue discipline only at 15E[TC] due to the much higher coefficient of variation. As a consequence, the allowable load or the allowable number of stations has to be reduced accordingly to meet the SLO.
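The construction in this example can be sketched numerically instead of using tables. The Python code below fits a two-parameter Weibull distribution to a given mean tW and coefficient of variation cD of the delay of delayed requests, and then evaluates the complementary CDF and the delay threshold for a given exceedance probability. It is a sketch under assumptions: the two-moment fit by bisection is a standard numerical approach and not necessarily identical to the tabulation in [19], the function names are hypothetical, and the numbers in the usage lines are placeholders, not values from Table 1.

```python
import math


def weibull_scv(shape):
    """Squared coefficient of variation of a Weibull distribution."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return g2 / (g1 * g1) - 1.0


def fit_weibull(mean, cv):
    """Return (shape, scale) matching the given mean and cv.

    The squared cv decreases monotonically with the shape parameter,
    so the shape is found by bisection on the interval [0.1, 50].
    """
    target = cv * cv
    lo, hi = 0.1, 50.0
    if not weibull_scv(hi) <= target <= weibull_scv(lo):
        raise ValueError("cv outside the range covered by the search interval")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if weibull_scv(mid) > target:
            lo = mid   # cv still too large, shape must increase
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = mean / math.gamma(1.0 + 1.0 / shape)
    return shape, scale


def ccdf(t, shape, scale):
    """P{T_W > t | T_W > 0} under the fitted Weibull distribution."""
    return math.exp(-((t / scale) ** shape))


def delay_threshold(exceed_prob, shape, scale):
    """Threshold t with ccdf(t) equal to exceed_prob."""
    return scale * (-math.log(exceed_prob)) ** (1.0 / shape)


if __name__ == "__main__":
    # Placeholder values: mean delay 0.2 ms, compared for two cv values.
    for cv in (0.8, 1.6):
        shape, scale = fit_weibull(0.2, cv)
        print(cv, round(shape, 3), round(delay_threshold(0.001, shape, scale), 3))
```

For the same mean delay, a larger coefficient of variation yields a smaller shape parameter and a markedly larger threshold for the same exceedance probability, which mirrors the difference between the FIFO and RANDOM cases discussed above.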

5.2 Discussion and Summary

1. The explicitly worked out performance results are easy to apply even for users without specific expertise in applied queuing theory. The Simulink toolset is very helpful both for the analytic evaluation of control systems and for the system analysis by stochastic simulations, but reveals deficiencies concerning the coverage of network and protocol properties.

2. Hard real-time performance requirements are not well supported by current local area networks and the Internet. Current developments towards the future 5G mobile network aim at future IoT applications. Advanced WLAN concepts within the IEEE 802.11 standards, namely the Distributed Coordination Function (DCF), the Point Coordination Function (PCF), and the Hybrid Coordination Function (HCF), are still not sufficient to meet hard real-time control requirements. The concept of an enhanced MAC-layer protocol based on the optimized CSMA/CD p-persistent protocol allows the local area network to be adapted to distributed networked control applications, in particular for indoor integrated production plants and IoT systems operating in the ms range. The concept can easily be extended to multi-class applications to meet different real-time classes through different inter-frame spacing, as already suggested and studied in [16]. The current approach of an enhanced MAC layer is in principle also applicable to future 5G network slicing concepts.

Summary:

Distributed IoT applications require efficient network support, especially for real-time critical problems. Shared network use is attractive for economic reasons. Current LANs are designed for the integration of quite different types of services but do not allow for efficient real-time control. In this contribution a novel concept for networked control systems is suggested which is, in particular, able to meet hard SLO requirements. The concept is based on an enhanced and optimized MAC layer for the CSMA/CD p-persistent media access protocol. The whole NCS is modeled for safe and efficient communication support, where the network functions are aggregated into a stochastic variable TC which is part of the application control loop. The method has been applied to a sample LAN and to distributed NCSs. Details of the performance analysis and NCS examples did not fit into the limited space and will be published separately; they can be provided by the main author on request.