1 Introduction and Related Work

Internet of Things (IoT) networks integrate many physical devices and sensors that send data over wired or wireless connections and follow different architectures (e.g. fog, edge, cloud). Because of the heterogeneity of IoT systems, designing reliable and efficient communication systems is not an easy task [8]. One major issue is energy consumption, which is affected by an incorrect choice of microcontroller, energy-inefficient software [3] or communication protocol parameters [12]. Another major issue is the impact of offered load on quality of service (QoS) and quality of experience (QoE), especially for interactive or streaming services. Two main QoS measures relate to end-to-end connection performance: packet loss and latency. This article presents results for both measures in different scenarios, taking into account the long-range dependent (LRD) nature of the traffic.

Wireless communication in IoT networks is usually based on one of two Medium Access Control (MAC) protocols: Time Division Multiple Access (TDMA) or Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA). In TDMA, each node is assigned a time slot for data transmission in a predetermined order. The main disadvantage of this scheme is that it requires accurate synchronization, which reduces the efficiency of the entire system. The second MAC protocol widely used in IoT systems is the well-known CSMA/CA (used, for example, in the IEEE 802.11 family and IEEE 802.15.4), which has been studied extensively. A comprehensive study of the throughput, delay and stability performance of CSMA networks was presented in [6]. The problem of stable throughput and bounded mean delay was discussed in [24]. Another interesting contribution in the area of CSMA/CA protocols was the proposal of the Handshake Sense Multiple Access with Collision Avoidance (HSMA/CA) protocol [19], which protects a densely deployed network from the classical hidden and exposed terminal problems. This idea was developed using Markov modeling and simulations. Recently, the authors of [9] considered maximum effective throughput and suggested that the minimum mean access delay of a CSMA/CA system is of great practical interest.

Network traffic affects the reliability and performance of a network. Information on the statistical distribution of packet interarrival times is not sufficient for evaluating network performance, since actual network traffic exhibits second-order properties associated with long-range dependence (LRD) [1, 20]. It is important to consider LRD because of its impact on queueing performance [16]. Many models have been developed from the properties of fractional Gaussian noise [17] or the fractional autoregressive moving average process [4]. One of the most popular models that incorporates LRD properties is the on/off source, recently analyzed in [25]. There is also a modified version of the Pareto on/off model [13], which is used later in this article.

Network traffic analysis and modeling are crucial for the design and implementation of efficient and reliable transmission networks. They help to explain possible problems before they occur. Simulation results can be used to identify anomalies [5, 7, 10] or to detect Distributed Denial of Service flood attacks [11, 15]. Furthermore, most of the symptoms that lead to congestion and a high packet loss rate can be detected by simulating network traffic [2, 23].

Experimenting with physical devices is uneconomical in the first phase of a project, especially if the number of devices is large. Therefore, simulation is well suited to developing new methods and testing new scenarios. For the purposes of this article, all simulations were carried out using the OMNeT++ framework [21, 22], a powerful open-source discrete event simulation tool. It is a component-based C++ simulation library and framework which, instead of providing components specifically for computer networks, offers a generic component architecture for building any simulation.

The article is organized as follows. Section 2 characterizes the basic properties of long-range dependence and introduces the estimation methods used in later sections. Section 3 presents the considered network structure and the model of network traffic generated by a single node. It also gives the statistics of the reference queueing system, which are later used for performance evaluation. In Sect. 4, the results obtained for the analyzed network are compared with the corresponding statistics of the commonly used queueing model described in Sect. 3.3.

2 Long-Range Dependence

2.1 Basic Properties

To explain the concept of long-range dependence (LRD), one needs to take a closer look at a stochastic process on different time scales. Let Y(t) be a stochastic process with stationary increments. The following simple equation describes the relationship for the process rescaled in time:

$$\begin{aligned} Y(at) \overset{d}{=} a^H Y(t), \quad a>0, \end{aligned}$$
(1)

where a is a stretching factor, H is the Hurst exponent and \(\overset{d}{=}\) denotes equality in distribution. If \(0.5< H < 1\), the second-order properties associated with the correlation structure are preserved regardless of scaling in time and the process is LRD. The higher the value of H, the stronger the dependence. The autocorrelation function of the incremental process \(X(i) = Y(i) - Y(i-1)\), \(i=1,2,...\), which reflects the similarity between X(i) and \(X(i+k)\), has the following form:

$$\begin{aligned} r_k = \frac{\sigma ^2}{2} \left( (k + 1)^{2H} - 2 k^{2H} + |k - 1|^{2H} \right) , \quad k=0,1,... \end{aligned}$$
(2)

which for \(H>0.5\) is not summable:

$$\begin{aligned} \sum _{k=0}^{\infty } r_k \rightarrow \infty . \end{aligned}$$
(3)

The autocorrelation function decays slowly for LRD processes. For processes without LRD (\(H=0.5\)) there is no dependence and \(r_k=0\) for \(k \ge 1\).
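As a quick numerical check of (2) and (3), the following Python sketch evaluates \(r_k\) and its partial sums for a few values of H. The function names and the default \(\sigma ^2 = 1\) are illustrative assumptions and are not taken from the simulation code used in this article.

```python
def increment_autocov(k, hurst, sigma2=1.0):
    """Evaluate r_k from Eq. (2) for lag k >= 0 (sigma2 = 1 is an assumed default)."""
    e = 2.0 * hurst
    return 0.5 * sigma2 * ((k + 1) ** e - 2.0 * k ** e + abs(k - 1) ** e)


def partial_sum(hurst, n_lags):
    """Truncated version of the series in Eq. (3): r_0 + r_1 + ... + r_n."""
    return sum(increment_autocov(k, hurst) for k in range(n_lags + 1))


if __name__ == "__main__":
    for hurst in (0.5, 0.7, 0.9):
        sums = [partial_sum(hurst, n) for n in (10, 100, 1000)]
        print(f"H={hurst}: " + ", ".join(f"{s:.3f}" for s in sums))
```

For \(H=0.5\) the partial sums stay equal to \(\sigma ^2\), whereas for \(H>0.5\) they keep growing with the number of lags, which is the non-summability stated in (3).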

2.2 Estimation

Three methods of estimating the Hurst exponent were used. The first is the variance-time method, which is based on the aggregated process of X(n) for discrete times \(n=0,1,...,N\) that correspond to fixed-length intervals:

$$\begin{aligned} X^{(m)}(n) = m^{-1} \sum _{t=mn}^{m(n+1)-1} X(t), \quad n=0,1,...,\left\lfloor {N/m}\right\rfloor - 1, \end{aligned}$$
(4)

where m denotes the level of aggregation, i.e.:

$$\begin{aligned} X^{(m)} = m^{-1} \sum _{t=1}^{m} X(t) = m^{H-1}X \end{aligned}$$
(5)

The variance of the aggregated random variable in (5) is:

$$\begin{aligned} Var \left( X^{(m)} \right) = \sigma ^2 m^{2H-2}, \end{aligned}$$
(6)

where \(\sigma ^2\) is the variance of X. Taking logarithms of both sides of (6) gives \(\log Var \left( X^{(m)} \right) = \log \sigma ^2 + (2H-2) \log m\), so the Hurst exponent can be estimated from the slope (\(2H-2\)) of a linear regression over the samples aggregated according to (4).
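The variance-time procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the aggregation levels, the helper names and the ordinary least-squares fit are choices made here and do not reproduce the exact estimator settings used for the results in this article.

```python
import math
import random


def ls_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sum((x - mx) ** 2 for x in xs)


def aggregate(series, m):
    """Block means of length m, as in Eq. (4)."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def hurst_variance_time(series, levels=(1, 2, 4, 8, 16, 32, 64)):
    """Fit log Var(X^(m)) against log m; the slope equals 2H - 2 by Eq. (6)."""
    log_m = [math.log10(m) for m in levels]
    log_var = [math.log10(variance(aggregate(series, m))) for m in levels]
    return 1.0 + ls_slope(log_m, log_var) / 2.0


if __name__ == "__main__":
    random.seed(1)
    noise = [random.gauss(0.0, 1.0) for _ in range(20000)]  # no LRD expected
    print(f"estimated H for white noise: {hurst_variance_time(noise):.2f}")
```

For an uncorrelated series the estimate should come out close to 0.5; values clearly above 0.5 indicate LRD.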

The next, similar method of estimation is the Index of Dispersion for Counts (IDC). It is defined as the variance-to-mean ratio of the sum of the random variable X over a period L:

$$\begin{aligned} IDC(L) = Var \left( \sum _{n=1}^{L} X(n) \right) \bigg / E \left( \sum _{n=1}^{L} X(n) \right) \approx c L^{2H-1}, \end{aligned}$$
(7)

where c is a positive constant. As with the variance-time method, linear regression on a log-log scale yields the estimate \(\tilde{H}\) from the slope (\(2H-1\)).
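A corresponding sketch of the IDC estimator is given below. It assumes that the input is a list of packet counts per fixed-length interval; the window sizes and function names are illustrative assumptions.

```python
import math
import random


def ls_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sum((x - mx) ** 2 for x in xs)


def idc(counts, window):
    """Variance-to-mean ratio of sums over `window` consecutive intervals, Eq. (7)."""
    n_blocks = len(counts) // window
    sums = [sum(counts[i * window:(i + 1) * window]) for i in range(n_blocks)]
    mean = sum(sums) / n_blocks
    var = sum((s - mean) ** 2 for s in sums) / n_blocks
    return var / mean


def hurst_idc(counts, windows=(1, 2, 4, 8, 16, 32, 64)):
    """Fit log IDC(L) against log L; the slope equals 2H - 1 by Eq. (7)."""
    log_l = [math.log10(w) for w in windows]
    log_idc = [math.log10(idc(counts, w)) for w in windows]
    return (ls_slope(log_l, log_idc) + 1.0) / 2.0


if __name__ == "__main__":
    random.seed(7)
    # Independent binomial counts: IDC is constant, so H should be near 0.5.
    counts = [sum(1 for _ in range(10) if random.random() < 0.3) for _ in range(20000)]
    print(f"IDC(1) = {idc(counts, 1):.2f}, estimated H = {hurst_idc(counts):.2f}")
```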

Another estimation method is the periodogram, which is based on the approximate spectral density of LRD processes:

$$\begin{aligned} f(\lambda ,H) \approx \sin (\pi H) \Gamma (2H + 1) |\lambda |^{1-2H}, \end{aligned}$$
(8)

where \(\lambda \) is the frequency for the analyzed random variable X. Although the estimation is done in the frequency domain, the Hurst exponent can again be calculated from a linear regression, with slope \(1-2H\) on a log-log scale. In this case, the FFT values are taken as the regression points. LRD concerns the lowest frequencies, which is reflected in formula (8), where most of the energy is concentrated near 0. For that reason, only the lowest 10% of the frequencies are considered in the periodogram estimation method.
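The periodogram method can be sketched as follows. To keep the example free of external dependencies, it evaluates a direct discrete Fourier transform at the lowest frequencies instead of a full FFT; this substitution, the normalization by \(2\pi N\) and the function names are assumptions made for illustration only.

```python
import cmath
import math
import random


def ls_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sum((x - mx) ** 2 for x in xs)


def low_frequency_periodogram(series, fraction=0.1):
    """Periodogram at the lowest `fraction` of the positive Fourier frequencies."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    n_freqs = max(2, int(fraction * (n // 2)))
    freqs, power = [], []
    for j in range(1, n_freqs + 1):
        lam = 2.0 * math.pi * j / n
        # Direct DFT at frequency lam (an FFT would give the same values faster).
        dft = sum(x * cmath.exp(-1j * lam * t) for t, x in enumerate(centered))
        freqs.append(lam)
        power.append(abs(dft) ** 2 / (2.0 * math.pi * n))
    return freqs, power


def hurst_periodogram(series, fraction=0.1):
    """Fit log I(lambda) against log lambda; the slope equals 1 - 2H by Eq. (8)."""
    freqs, power = low_frequency_periodogram(series, fraction)
    slope = ls_slope([math.log10(f) for f in freqs], [math.log10(p) for p in power])
    return (1.0 - slope) / 2.0


if __name__ == "__main__":
    random.seed(3)
    noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
    print(f"estimated H for white noise: {hurst_periodogram(noise):.2f}")
```

Because individual periodogram ordinates are noisy, the estimate from a single short series scatters noticeably around the true value; longer series reduce this scatter.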

3 Framework

3.1 Network

Fig. 1. Schematic diagram of the analyzed IoT wireless network (example for 5 nodes).

To analyze network traffic in the typical and commonly used IoT structure shown in Fig. 1, a simple and efficient non-persistent CSMA/CA protocol is assumed. This protocol was chosen because it can be implemented easily even on basic, inexpensive microcontrollers and does not consume much power during wireless operation, which is crucial for IoT wireless sensors. In the non-persistent version of CSMA, a waiting node does not listen to the channel continuously until it becomes idle (as in the 1- or p-persistent versions), which reduces energy consumption.

The network consists of nNodes wireless nodes and two stations: st0 and st1. Each node can transmit packets to the station st0 and can sense transmission from another node to avoid collisions. All nodes can hear each other, so there is no hidden node problem [14]. When the channel is busy, because another node is transmitting, the node that wants to send packets must wait a period of time and then listen (sense) again. If the channel is idle, the node starts its transmission immediately.
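The sensing rule described above can be illustrated with a small event-driven sketch in Python. It is an idealized model and not the OMNeT++ implementation used in this article: propagation delay is assumed to be zero, so a node always senses an ongoing transmission, and the uniformly distributed waiting period (`max_backoff`) is a hypothetical choice, since the waiting-time distribution is not specified here.

```python
import heapq
import random


def simulate_np_csma(arrivals, tx_time, max_backoff, seed=0):
    """Idealized non-persistent CSMA on one shared channel.

    arrivals: iterable of (time, node_id) pairs, one per packet to send.
    Returns a list of (start, end, node_id) transmissions in channel order.
    """
    rng = random.Random(seed)
    events = list(arrivals)
    heapq.heapify(events)
    busy_until = 0.0
    sent = []
    while events:
        now, node = heapq.heappop(events)
        if now < busy_until:
            # Channel sensed busy: stop listening, wait, then sense again.
            heapq.heappush(events, (now + rng.uniform(0.0, max_backoff), node))
        else:
            # Channel sensed idle: start transmitting immediately.
            busy_until = now + tx_time
            sent.append((now, busy_until, node))
    return sent


if __name__ == "__main__":
    packets = [(0.0, 0), (0.001, 1), (0.002, 2), (0.0025, 3)]
    for start, end, node in simulate_np_csma(packets, tx_time=0.004096, max_backoff=0.01):
        print(f"node {node}: {start * 1e3:.3f} ms -> {end * 1e3:.3f} ms")
```

Under these assumptions transmissions never overlap; modeling collisions would require a non-zero sensing or propagation delay.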

Wireless station st0 receives packets from all nodes and forwards them across the long-range radio link to station st1. All packets must pass through the wireless network interface of st0 that connects it to st1. The output interface at st0 incorporates a queueing system with K places for packets (including the one being transmitted) and a transmission circuit limited by the bandwidth of the radio link. Because the cumulative traffic from all nodes passes through this interface, all performance statistics of interest, associated with the impact of LRD traffic, can be collected there. A packet arriving at st0 is either processed immediately (if the system is empty) or placed in the queue buffer. If there is not enough buffer space, the packet is dropped (buffer overflow). By changing the bandwidth of this radio link, one can examine the impact of the local traffic on queueing performance, and thus on packet loss and latency.
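The behavior of the output interface can be sketched as a single-server FIFO queue with constant service time and K places in total. The Python function below is a minimal illustration; its name and interface are hypothetical, and it is not the simulation model used to produce the results.

```python
from collections import deque


def simulate_finite_queue(arrival_times, service_time, capacity):
    """FIFO queue with constant service time and `capacity` places in total.

    arrival_times must be sorted. Returns (delays, dropped), where delays holds
    the sojourn time (waiting plus transmission) of every accepted packet.
    """
    in_system = deque()  # departure times of packets still in the system
    delays = []
    dropped = 0
    for t in arrival_times:
        # Remove packets that have already left by time t.
        while in_system and in_system[0] <= t:
            in_system.popleft()
        if len(in_system) >= capacity:
            dropped += 1  # buffer overflow
            continue
        # Service starts now if the system is empty, else after the last packet.
        start = in_system[-1] if in_system else t
        departure = start + service_time
        in_system.append(departure)
        delays.append(departure - t)
    return delays, dropped


if __name__ == "__main__":
    delays, dropped = simulate_finite_queue([0.0, 0.1, 0.2, 0.3, 5.0], 1.0, 2)
    print(delays, dropped)
```

In the usage example the third and fourth packets find both places occupied and are dropped, while the last packet arrives at an empty system and experiences only its own transmission time.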

3.2 Source Node

Every node sends fixed-length packets to the np-CSMA channel according to a Pareto Modulated Poisson Process (PMPP) [13]. This model was chosen because it is versatile, efficient and introduces long-range dependence into the traffic while maintaining a constant mean packet rate. In addition, it resembles the behavior of variable-bit-rate services or protocols that transmit packets in batches. The PMPP source consists of two Poisson sources with alternating traffic intensities \(\lambda _1\) and \(\lambda _2\) (Fig. 2). The sojourn time in each state has a Pareto distribution \(P\{X \ge x\} = x^{-\alpha }\) for \(x \ge 1\), with parameter \(\alpha > 0\). This distribution is heavy-tailed and has infinite variance for \(1< \alpha < 2\).

Fig. 2. Two Poisson sources of the PMPP packet generator.

The approximate value of the IDC for the PMPP source is:

$$\begin{aligned} IDC(t) \approx 1 + \frac{(\lambda _1 - \lambda _2)^2}{\lambda _1 + \lambda _2} \left( \frac{\alpha - 1}{\alpha } \right) t ^ {2 - \alpha }, \end{aligned}$$
(9)

from which H can be easily obtained as:

$$\begin{aligned} H = \frac{3 - \alpha }{2} \end{aligned}$$
(10)

which is consistent with (7), since both expressions have the same exponent (\(2H-1 = 2-\alpha \)). The values of \(\lambda _1\) and \(\lambda _2\) should be selected so that the expected number of packets \(E(N(t))=0.5(\lambda _1 + \lambda _2)t\) corresponds to the desired level of generated network traffic.
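A PMPP source as described above can be sketched in Python as follows. The sketch assumes that sojourn times are measured in units of `sojourn_scale` seconds (a hypothetical parameter, since the time unit of the Pareto variable is not stated here) and uses `random.paretovariate`, which samples from \(P\{X \ge x\} = x^{-\alpha }\) for \(x \ge 1\).

```python
import random


def hurst_from_alpha(alpha):
    """Eq. (10): H = (3 - alpha) / 2."""
    return (3.0 - alpha) / 2.0


def pmpp_arrivals(lambda1, lambda2, alpha, duration, sojourn_scale=1.0, seed=None):
    """Packet arrival times of a two-state Pareto Modulated Poisson Process.

    The source alternates between Poisson rates lambda1 and lambda2; the time
    spent in each state is sojourn_scale times a Pareto(alpha) variable.
    """
    rng = random.Random(seed)
    rates = (lambda1, lambda2)
    state, t = 0, 0.0
    arrivals = []
    while t < duration:
        sojourn_end = min(t + sojourn_scale * rng.paretovariate(alpha), duration)
        rate = rates[state]
        if rate > 0.0:
            # Poisson arrivals: exponential gaps, restarted at each state change
            # (valid because the exponential distribution is memoryless).
            nxt = t + rng.expovariate(rate)
            while nxt < sojourn_end:
                arrivals.append(nxt)
                nxt += rng.expovariate(rate)
        t = sojourn_end
        state = 1 - state
    return arrivals


if __name__ == "__main__":
    alpha = 1.2
    packets = pmpp_arrivals(30.0, 10.0, alpha, duration=600.0, seed=42)
    print(f"target H = {hurst_from_alpha(alpha):.1f}, packets = {len(packets)}")
    print(f"long-run target rate 0.5*(lambda1+lambda2) = {0.5 * (30.0 + 10.0):.1f} pkt/s")
```

Because the sojourn times are heavy-tailed, the empirical packet rate of a single finite run converges slowly to \(0.5(\lambda _1 + \lambda _2)\), especially for \(\alpha \) close to 1.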

3.3 Performance Evaluation

The performance of the backhaul link between st0 and st1 depends on the statistical properties of the inbound traffic as well as on the service rate and packet length distribution. Since all packets have a fixed size, deterministic service is assumed. All traffic from the local network enters the queueing system inside the output interface of st0 (Fig. 1). The queueing system consists of one server and \(K-1\) buffer slots for packets. If the system is full (K packets in the system), the next incoming packet is dropped. The most common queueing system that meets these assumptions is M/D/1/K, for which explicit formulas for the blocking probability, stationary distribution and mean system sojourn time were derived in [18]. Both latency and packet loss can be expressed in terms of the steady-state probabilities of the number of packets in the system:

$$\begin{aligned} D = \frac{1}{\lambda (1 - P_{LOSS})} \sum _{k=0}^{K} k \cdot p_k^{(K)} \end{aligned}$$
(11)
$$\begin{aligned} P_{LOSS} = p_K^{(K)}, \end{aligned}$$
(12)

where:

$$\begin{aligned} p_k^{(K)} = {\left\{ \begin{array}{ll} \left( 1 + \rho \varLambda _{K-1} \right) ^ {-1} \quad \text {for} \quad k=0 \\ \left( \varLambda _k - \varLambda _{k-1} \right) p_0^{(K)} \quad \text {for} \quad k=1,\dots ,K-1 \\ 1 - \varLambda _{K-1} p_0^{(K)} \quad \text {for} \quad k=K \end{array}\right. } \end{aligned}$$
(13)
$$\begin{aligned} \varLambda _k = \sum _{i=0}^{k} \frac{\left( \rho (i-k) \right) ^i}{i!} \exp \left( {(k-i)\rho }\right) . \end{aligned}$$
(14)

These relationships serve as the reference against which the data collected at the interface of st0 are compared for different scenarios, i.e. different levels of LRD and different numbers of nodes.
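The reference values in (11)-(14) can be computed with the short Python sketch below. Two points are assumptions made here: \(\rho \) is taken as the offered load \(\lambda T\) with T the constant service time, and the middle case of (13) is evaluated with the difference of consecutive terms \(\varLambda _k - \varLambda _{k-1}\), which is the form for which the probabilities are non-negative and sum to one. The function names are illustrative.

```python
import math


def big_lambda(k, rho):
    """Eq. (14): alternating sum that defines Lambda_k."""
    return sum(
        (rho * (i - k)) ** i / math.factorial(i) * math.exp((k - i) * rho)
        for i in range(k + 1)
    )


def mdk_distribution(rho, K):
    """Steady-state probabilities p_0..p_K of the M/D/1/K system, Eq. (13)."""
    lam = [big_lambda(k, rho) for k in range(K)]
    p0 = 1.0 / (1.0 + rho * lam[K - 1])
    middle = [(lam[k] - lam[k - 1]) * p0 for k in range(1, K)]
    return [p0] + middle + [1.0 - lam[K - 1] * p0]


def mdk_metrics(arrival_rate, service_time, K):
    """Return (P_LOSS, mean number in system, mean delay D) from Eqs. (11)-(12)."""
    rho = arrival_rate * service_time  # assumed definition of the offered load
    p = mdk_distribution(rho, K)
    p_loss = p[K]
    mean_n = sum(k * pk for k, pk in enumerate(p))
    delay = mean_n / (arrival_rate * (1.0 - p_loss))
    return p_loss, mean_n, delay


if __name__ == "__main__":
    T = 0.004096  # packet transmission time in seconds
    for load in (0.5, 0.8, 0.95):
        p_loss, mean_n, delay = mdk_metrics(load / T, T, K=20)
        print(f"rho={load}: P_LOSS={p_loss:.2e}, E[N]={mean_n:.3f}, D={delay * 1e3:.2f} ms")
```

The sum in (14) alternates in sign, so for large K and high load the evaluation in double precision loses accuracy; for \(K=20\) and loads around 1 it is still adequate for plotting reference curves.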

4 Results

All results were obtained using the OMNeT++ [22] simulation platform. The simulation framework was described in the previous section.

Fig. 3. Sample result for the first 200 ms of highly congested traffic (nNodes: 5, bandwidth: 1 Mbps, packet transmission time T: 4.096 ms).

The performance results were compared with a typical M/D/1/K queueing system, presented in Sect. 3.3, where all performance measures refer to the classical Poisson model of input traffic (without the LRD feature).

The purpose of the experiments was to examine the effect of different Hurst exponent values in the range \(0.5<H<1\), as well as different numbers of nodes (nNodes), on the performance of the backhaul link (Fig. 1). The simulation time, bandwidth and packet length for each scenario were 60 min, 1 Mbps and 4096 bits (512 bytes), respectively. This packet length corresponds to the packet transmission time of 4.096 ms shown in Fig. 3.
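The relation between packet length, bandwidth and transmission time can be checked with a few lines of Python. The offered-load helper assumes that every node generates on average \(0.5(\lambda _1 + \lambda _2)\) packets per second and that all packets reach st0; the rates used in the example are hypothetical and are not the values used in the experiments.

```python
PACKET_BITS = 4096          # packet length (512 bytes)
BANDWIDTH_BPS = 1_000_000   # 1 Mbps backhaul link


def transmission_time(packet_bits, bandwidth_bps):
    """Time in seconds needed to send one packet over the link."""
    return packet_bits / bandwidth_bps


def offered_load(n_nodes, lambda1, lambda2, service_time):
    """rho = total mean packet rate times the constant service time."""
    return n_nodes * 0.5 * (lambda1 + lambda2) * service_time


if __name__ == "__main__":
    T = transmission_time(PACKET_BITS, BANDWIDTH_BPS)
    print(f"T = {T * 1e3:.3f} ms, max service rate = {1.0 / T:.2f} packets/s")
    # Hypothetical per-node rates, chosen only to illustrate the calculation.
    print(f"rho for 5 nodes = {offered_load(5, 30.0, 10.0, T):.4f}")
```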

Fig. 4. Variance and IDC plots for two desired Hurst exponent values: \(H=0.9\) (\(\alpha =1.2\)) and \(H=0.6\) (\(\alpha =1.8\)).

Figure 4 shows the estimates of the Hurst exponent (\(\tilde{H}\)) for the aggregated outbound traffic from 5 nodes. Two estimation methods were applied: the variance-time and IDC plots (see Sect. 2.2 and Eqs. (5), (7)). The desired values of H were 0.9 and 0.6, which correspond to \(\alpha =1.2\) and \(\alpha =1.8\) of the Pareto distribution in the PMPP model.

Fig. 5. Periodogram plots for different numbers of nodes.

Figure 5 shows periodogram estimation plots for different numbers of nodes and the desired \(H=0.9\). All estimates of the Hurst exponent were obtained from formula (8). The value of \(\tilde{H}\) clearly decreases as the number of nodes increases. This is due to the weakening of the LRD structure in the aggregated stream, which consists of many independent flows coming from single nodes. The effect is more visible in Table 1, where the mean \(\tilde{H}\) estimated for the nodes stays approximately the same, while the mean \(\tilde{H}\) at the input of the queue (in0) decreases to a low value, suggesting that the LRD properties gradually disappear.

Table 2 shows the main statistics for different desired H values and a constant number of nodes (\(nNodes=5\)). A slight increase in all measured values can be observed, which suggests that a higher H raises the level of collisions and hence the values of the other statistics.

Table 1. Main statistics for \(H=0.9\) and different numbers of nodes.

Fig. 6. Mean number of packets in the queueing system for different numbers of nodes, \(\alpha =1.2\).

Table 2. Main statistics for 5 nodes and different H values.

The observations from Table 1 are confirmed by the backhaul link performance results. Figure 6 shows the mean number of packets in the queueing system versus offered traffic load for different numbers of nodes. The theoretical curve for the M/D/1/20 system, marked with a solid black line, is calculated as the mean of the distribution \(p_k^{(K)}\) in (13), i.e. \(\sum _{k=0}^{K} k \cdot p_k^{(K)}\), for \(K=20\). Figure 7 shows the same tendency: packet loss decreases as the number of nodes increases. Finally, Fig. 8 presents how latency changes as the number of nodes increases. For \(nNodes=10\) and \(nNodes=15\) the empirical curves fall below their theoretical counterpart calculated from (11), which can be explained by the fact that aggregating the streams from single nodes substantially changes both the LRD and the distribution of the traffic.

Fig. 7. Packet loss for different numbers of nodes, \(\alpha =1.2\).

Fig. 8. Latency for different numbers of nodes compared to the M/D/1/20 queueing system.

5 Conclusions

The number of IoT devices and networks is constantly increasing, which means that congestion can occur, especially when the communication channel capacity does not increase. The main performance measures analyzed in this article were latency and packet loss. It is obvious that for a higher offered traffic load the values of both measures increase, degrading the performance of the link. However, the situation becomes worse when LRD is considered. There is no doubt that this feature exists in network traffic. The question is how strong these long-term relationships are and how they influence performance. The simulation analysis of a typical IoT wireless CSMA/CA network with a backhaul link provided more insight into the impact of both LRD and the number of nodes on latency and packet loss.

All estimates of the Hurst exponent show that \(\tilde{H}\) is stable for a given number of nodes. If the number of nodes increases, \(\tilde{H}\) becomes smaller. Furthermore, all performance results show that performance improves with an increasing number of nodes, to the point where the measured values fall below those predicted by the classical M/D/1/K queueing model. This implies that the aggregated stream, consisting of many single-node streams, has changed its structure in terms of both the LRD feature and the distribution. This phenomenon can be used when choosing the parameters of an IoT network system in order to reduce latency and packet loss, which in turn has a positive effect on QoS and QoE.