1 Introduction

Today’s communication technologies are capable of transmitting increasing amounts of data per second. This data originates not only from human-operated applications, but increasingly from the sensors and hubs of major application areas such as healthcare [6, 31], the Internet of Things (IoT) and other services. However, the Internet’s ease of use and high bandwidth also create tremendous opportunities for attackers, so all these Internet-accessible systems need to be protected from malicious attacks [5, 32].

Since the computational capabilities of individual servers and workstations are limited and they cannot always process data at an adequate speed, Cloud architectures have emerged as an answer to this problem. They group servers into structures that provide huge computing capacities, but these need to be properly accessed and scheduled [4, 40, 42]. A second trend, which is gaining momentum especially with the development of 5G networks, is the multiplication of computing services and their movement to the Edge, close to the users and to the sources of data.

The primary purpose of a computer system and network is to process and transmit data while maintaining adequate Quality of Service (QoS) [20]. Disturbances in QoS result in the need to wait for data, thus wasting computing power, and often in the need to resend data, which in the case of IoT devices additionally costs energy and shortens the life of a battery-powered device. QoS problems could be avoided if it were possible to place processing nodes close enough to the data source so that transmission would not be a problem. However, this can be too costly, both at the investment stage and later when it comes to covering energy costs. Electricity, apart from being an obvious cost for the operator, is obtained in the overwhelming majority from non-renewable energy sources, and its unnecessary consumption has an impact on the climate of our planet. It should therefore be saved for both economic and ecological reasons [22]. The importance of ‘green computing’ and ‘green networking’ [3, 35], although often underestimated, is shown by the fact that the energy consumption of IT systems currently accounts for roughly 10% of global electricity consumption, and by 2030 this share may reach 20% [1, 16].

Another problem is security, which needs to be assured [19, 21]. As the value of data transmitted over the network and processed on external servers increases, so does the number of attacks on the infrastructure for transmitting, processing and storing information. Modern computer systems must take this issue into account from the design stage onwards, according to the security-by-design principle.

Our paper addresses all these issues, improving network performance in terms of QoS, power consumption and security [27]. The remainder of this article is organised as follows. In Sect. 2 we briefly introduce the reader to the topic of RNNs, referring to previous publications on the subject, and we show the specifics of the environment that is the subject of the current research and the tailored solutions that we have used. Section ?? discusses how to collect the QoS, energy and security data that RNNs use to make decisions. Section 3 presents the experimental part, including a description of the implementation and the testbed, together with a discussion of the obtained results. The whole work is summarised in Sect. 4.

2 Random Neural Networks for the Control of Computer Networks

The optimization of the QoS of distributed systems has been discussed in numerous publications [28, 38, 39, 41, 42]. QoS versus energy consumption of distributed services has also been examined experimentally in [18]. However, the focus on security is more recent, and its impact on network management and routing is examined in [10, 11, 17].

To control the network in terms of multiple criteria, including QoS, security and energy in our case, we use a solution based on Random Neural Networks (RNNs) [12, 13], trained using Reinforcement Learning. RNNs optimize data packet transmission paths as well as the selection of Edge Computing services in such a way as to maintain an appropriate (predefined) balance between QoS, energy consumption and security. The switches and servers of a computer network form a distributed system, and its optimization is a variation of a well-known problem. However, by using the RNN and placing our system in a Software Defined Network (SDN) environment as in [8, 9], we show that familiar Machine Learning techniques can also be used in state-of-the-art network architectures.

It should be noted, however, that the use of an SDN controller to implement the presented solutions is convenient from the point of view of demonstrating the usefulness of RNN in computer network control, but due to the distributed architecture of the RNN-based Decision Engine the same solutions can - under certain conditions - also be applied to a traditional, fully distributed network architecture.

The problems of SDN design and optimization are discussed in the survey paper [36], which takes into account not only energy efficiency issues but also touches on security problems. Security issues in SDN have been the subject of a number of publications, for example [2]. An interesting survey article on system deployment and optimization, shedding light on our work, was published in [25]. The popularity of SDN technology and the ease with which routing control algorithms can be implemented in it are also significant.

2.1 The Goal of the Decision System

The system we consider consists of:

  • The set of network SDN switches or forwarders \(S=\{s_1,\ldots ,s_n\}\) that are interconnected via a network graph, where S is the set of nodes and A is the \(n\times n\) one-hop binary connection matrix between nodes.

  • Every switch \(s\in S\) may have connected “clients” or Edge services.

  • The set of Clients is \(C=\{c_1,\ldots ,c_m\}\) and each client c has a node or switch s(c) to which it is directly connected.

  • Edge services are used to offload specific cloud services (with their processing capacity and/or repositories) that are operating in close proximity so as to offer fast service to the clients. They belong to a set \(E =\{e_1,\ldots ,e_M\}\) of M services which all offer equivalent facilities in terms of processing and the ability to provide specific data. Also any service e is connected to some switch or node s(e).

The Goal of the decision system is to find a path P among the set of switches S to connect the pair of clients \((c, c'),~c,c'\in C\) or the client-service pair \((c,e),~c\in C,~e\in E\). The choice of the path is based on the QoS, security and energy criteria, or on one or two of these criteria. For ease of notation we will denote a connection \((c,c')\) or \((c,e)\) as a “flow” f.

Thus a path:

  • \(P=P(c,c')\) from c to \(c'\) is \(P(c,c')=(s(c),s(P)_1,\ldots ,s(P)_{l(P)-2},s(c'))\), or

  • A path \(P=P(c,e)\) from c to e is \(P(c,e)=(s(c),s(P)_1,\ldots ,s(P)_{l(P)-2},s(e))\), where

  • \(A(s(c),s(P)_1)=1\), \(A(s(P)_i,s(P)_{i+1})=1\) for \(1\le i\le l(P)-3\), and \(A(s(P)_{l(P)-2},s(c'))=1\) or \(A(s(P)_{l(P)-2},s(e))=1\), respectively,

  • and l(P) denotes the length of the path P in number of switches or nodes.
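The adjacency conditions above can be checked mechanically. The following is a minimal sketch, assuming that A is held as a list of lists indexed by switch number; the function name `is_valid_path` and the example topology are illustrative only and are not taken from the testbed.

```python
from typing import List, Sequence


def is_valid_path(A: Sequence[Sequence[int]], path: List[int]) -> bool:
    """Return True if every consecutive pair of switches in `path` is one hop apart.

    A    -- n x n binary one-hop connection matrix (A[i][j] == 1 if i and j are linked)
    path -- switch indices from s(c) to s(c') or s(e), endpoints included
    """
    if not path:
        return False
    n = len(A)
    if any(s < 0 or s >= n for s in path):
        return False
    # zip pairs each switch with its successor; an empty pairing (single switch) is valid
    return all(A[u][v] == 1 for u, v in zip(path, path[1:]))


# Hypothetical 4-switch line topology 0 - 1 - 2 - 3 (illustration only).
A_example = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
```

In this sketch a path consisting of a single switch is accepted, which corresponds to a client and a service attached to the same node.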

Thus we can now formulate the goal function G for a given flow and path as the weighted sum of three criteria:

$$\begin{aligned} G(f,P) = a{}Q(f, P) + b{}T(f, P) + c{}J(f,P), \end{aligned}$$
(1)

where a, b, c are non-negative constants with \(a+b+c=1\), and \(Q(f,P)\) is the QoS value for a given flow f using path P. For instance, \(Q(f,P)\) can be the end-to-end delay per packet for flow f on path P or the corresponding packet loss, or some combination thereof. The measurement of such metrics is presented in Section ?? below.

\(T(f,P)\) is the trust metric that expresses the level of insecurity of the traffic belonging to a given flow f travelling along the path P. It can be obtained via Attack or Anomaly Detectors, Honeypots or similar entities that assess the probability, or some other non-negative metric, that connection f is harmed by devices on path P. Note that \(T(f,P)\) may be symmetric, so that it may also characterize the effect of f on P rather than the opposite. Furthermore, it may be expressed as the cumulative effect of all the nodes on path P, or as the worst case among them, such as:

$$\begin{aligned} T(f,P)=\sum _{s\in P}T(f,s),\quad \text {or}\quad T(f,P)=\max \{T(f,s):~s\in P\}. \end{aligned}$$
(2)

\(J(f,P)\) is the energy consumed per packet of flow f by the devices along path P, which can be computed from the power consumption and traffic rate, as follows:

$$\begin{aligned} J(f,P) =\sum _{s\in P} \frac{\varPi (s,\lambda (s))}{\lambda (s)}, \end{aligned}$$
(3)

where \(\varPi (s,\lambda (s))\) is the power consumption when switch or node s carries the traffic rate \(\lambda (s)\), while:

$$\begin{aligned} \lambda (s)=\sum _{f\in F}\sum _{s\in f}\lambda (f), \end{aligned}$$
(4)

and \(\lambda (f)\) is the traffic rate of connection f, \(F=\{f\}\) is the set of all active connections, and the sum is taken over those connections whose path traverses s.
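Eqs. (1)–(4) can be combined into a short numerical sketch. The code below is illustrative only: the function names, the example flows, the per-node trust values and the linear power model `example_power` are assumptions made for the example, and it presumes that Q, T and J have already been scaled to comparable ranges before being weighted.

```python
from typing import Callable, Dict, List


def traffic_rate(s: int, flow_paths: Dict[str, List[int]],
                 flow_rates: Dict[str, float]) -> float:
    """Eq. (4): total rate of all active flows whose path traverses switch s."""
    return sum(flow_rates[f] for f, path in flow_paths.items() if s in path)


def energy_per_packet(path: List[int], power: Callable[[int, float], float],
                      flow_paths: Dict[str, List[int]],
                      flow_rates: Dict[str, float]) -> float:
    """Eq. (3): sum over switches on the path of power(s, lambda(s)) / lambda(s)."""
    total = 0.0
    for s in path:
        lam = traffic_rate(s, flow_paths, flow_rates)
        if lam > 0:  # a switch carrying no traffic contributes nothing per packet
            total += power(s, lam) / lam
    return total


def trust(path: List[int], node_trust: Dict[int, float], cumulative: bool = True) -> float:
    """Eq. (2): either the sum or the maximum of the per-node insecurity values."""
    values = [node_trust[s] for s in path]
    return sum(values) if cumulative else max(values)


def goal(q: float, t: float, j: float, a: float, b: float, c: float) -> float:
    """Eq. (1): weighted sum with non-negative weights that add up to one."""
    if min(a, b, c) < 0 or abs(a + b + c - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return a * q + b * t + c * j


# Hypothetical data: two flows over four switches, rates in packets per second.
flow_paths = {"f1": [0, 1, 2], "f2": [1, 2, 3]}
flow_rates = {"f1": 100.0, "f2": 300.0}
node_trust = {0: 0.1, 1: 0.5, 2: 0.2, 3: 0.0}


def example_power(s: int, lam: float) -> float:
    """Assumed linear power model in watts; not the measured characteristic."""
    return 10.0 + 0.01 * lam
```

With the rate expressed in packets per second and the power in watts, each term of Eq. (3) is in joules per packet.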

2.2 RNN Based Routing for Path Control

The approach taken here is to use the Cognitive Packet Network (CPN) idea [14, 15, 23], so as to store inside the SDN Controller a “good” or near-optimal path P(f) for flow \(f=(c,e)\) from client c to edge device e that minimizes \(G(f,P(f))\). Thus, rather than calculating the path P(f) ex nihilo for each upcoming connection \(f=(c,e)\), we follow the CPN approach, which maintains for each router or switch (i.e. node) s a Random Neural Network [12] that computes the best “next hop” \(s'(s,e)\) from s, where \(s'(s,e)\) is the node to which s is connected and that minimizes \(G((s,e),P(s,e))\).
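The learning rule is not spelled out in this section, so the sketch below follows the reinforcement-learning update commonly described in the CPN literature rather than the exact variant used in our implementation. Each neuron stands for one candidate next hop, the reward is taken as \(R=1/G\), and a decision is reinforced when R is at least as large as a smoothed threshold of past rewards. The class name `NextHopRNN`, the smoothing factor, the external excitation rate and the uniform initial weights are all assumptions chosen for illustration.

```python
from typing import List


class NextHopRNN:
    """Random Neural Network with one neuron per candidate next hop (illustrative)."""

    def __init__(self, n: int, smoothing: float = 0.8, ext_excitation: float = 0.2):
        if n < 3:
            raise ValueError("this update rule needs at least three candidate next hops")
        self.n = n
        self.smoothing = smoothing      # assumed memory of the reward threshold
        self.threshold = 0.0            # smoothed average of past rewards
        self.ext = ext_excitation       # assumed external excitatory arrival rate
        w0 = 1.0 / (2 * (n - 1))        # uniform start, so every firing rate equals 1
        self.w_plus = [[0.0 if i == j else w0 for j in range(n)] for i in range(n)]
        self.w_minus = [[0.0 if i == j else w0 for j in range(n)] for i in range(n)]

    def _rates(self) -> List[float]:
        # Firing rate r_i is the sum of all outgoing excitatory and inhibitory weights.
        return [sum(self.w_plus[i]) + sum(self.w_minus[i]) for i in range(self.n)]

    def excitations(self, iterations: int = 200) -> List[float]:
        """Fixed-point iteration of q_i = lambda_plus_i / (r_i + lambda_minus_i)."""
        n, rates = self.n, self._rates()
        q = [0.5] * n
        for _ in range(iterations):
            q = [
                min(1.0,
                    (self.ext + sum(q[j] * self.w_plus[j][i] for j in range(n)))
                    / (rates[i] + sum(q[j] * self.w_minus[j][i] for j in range(n))))
                for i in range(n)
            ]
        return q

    def best_next_hop(self) -> int:
        """Index of the most excited neuron, i.e. the recommended next hop."""
        q = self.excitations()
        return max(range(self.n), key=q.__getitem__)

    def update(self, chosen: int, goal_value: float) -> None:
        """Reinforce or punish the neuron `chosen` given the measured goal value G > 0."""
        if goal_value <= 0:
            raise ValueError("goal value must be positive")
        n = self.n
        reward = 1.0 / goal_value
        success = reward >= self.threshold
        share = reward / (n - 2)
        old_rates = self._rates()
        for i in range(n):
            for k in range(n):
                if i == k:
                    continue
                if k == chosen:
                    if success:
                        self.w_plus[i][k] += reward
                    else:
                        self.w_minus[i][k] += reward
                elif success:
                    self.w_minus[i][k] += share
                else:
                    self.w_plus[i][k] += share
            # Rescale the row so that the firing rate of neuron i is unchanged.
            scale = old_rates[i] / (sum(self.w_plus[i]) + sum(self.w_minus[i]))
            self.w_plus[i] = [w * scale for w in self.w_plus[i]]
            self.w_minus[i] = [w * scale for w in self.w_minus[i]]
        self.threshold = self.smoothing * self.threshold + (1 - self.smoothing) * reward
```

A controller would call `update` each time a measurement for the most recent decision arrives and `best_next_hop` whenever a flow has to be routed. Rescaling the weights after every update keeps them from growing without bound.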

Since our study is focused on the IoT, where real-time operation is crucial, the latencies of links and paths were chosen as the key QoS metric. Since an SDN controller has no direct way, within its standard means, to measure the latency on links and paths, Cognitive Packets (CPs) as described in [24] were employed for this purpose. CPs have been employed in SDN networks previously [9, 33, 34], but here the concept of the Cognitive Network Map (CNM) was extended to hold all the necessary data within a single data structure.

2.3 Energy

Fig. 1. Measurement circuit for power versus traffic characteristics.

Most network devices cannot directly measure their energy use during operation. However, since each network packet handled needs to be processed and transmitted, the power drawn by a network switch can be expected to depend on the traffic intensity. The characteristic giving the power in watts [W] as a function of the amount of network traffic passing through the switch is, on the one hand, easy to measure in the laboratory; on the other hand, during operation in the real system, it gives the SDN controller, which knows the current throughput of the node, a sufficiently precise answer to the question “How much power does the network switch draw at this moment?”.

The SDN switches used in our experiments are Intel NUC devices [26] that run Open vSwitch [29]. Our approach, however, is universal in the sense that it can be applied to any network switch or router.

The laboratory setup used to measure the power drawn during data transfer is presented in Fig. 1. For each traffic level, set in Mb/s, a power measurement was taken. The traffic was generated and received by workstations connected to the NUC device. The experiment was carried out for successively increasing traffic levels, as shown in Fig. 2.

The electronic circuit used to condition the signal from the current sensor is based on precision operational amplifiers. The Hall effect-based current sensor ACS712-05 (0–5 A current range) is galvanically isolated from the copper conduction path, integrated into the IC, that carries the measured current. This path was connected in series with the supply wire on the DC side, at a constant voltage \(U_{DC} = 19.5\,\mathrm{V}\), of the AC adapter used for the NUCs, as shown in Fig. 1. The output signal from the sensor is amplified in a single-ended amplifier and then converted to differential form. The instantaneous value of the measured power can then be found from the following relationship:

$$\begin{aligned} P = U_{DC}\cdot i = U_{DC}\frac{U_m}{k_u S} = A\,U_m \quad \text {[W]}, \end{aligned}$$
(5)

where \(S=185mV/A\) is the sensitivity of the current sensor, and \(A = U_{DC}/(k_u S) = 520.9A\) is a constant with \(k_u=2\) which is related to the instrumentation, and \(U_m\) is the measured output voltage of the single-sided differential converter shown at “channel 1” of Fig. 1, which results from the Hall-effect measurement of the NUC input current.
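As a worked illustration of Eq. (5), the sketch below converts a measured output voltage \(U_m\) into instantaneous power. The function name `instantaneous_power` is hypothetical, and the defaults are the values quoted in the text. With exactly those values the conversion factor \(U_{DC}/(k_u S)\) evaluates to roughly 52.7 W per volt of \(U_m\), so gain and sensitivity are kept as arguments so that the calibration actually used can be substituted.

```python
U_DC = 19.5   # supply voltage [V], as quoted in the text
S = 0.185     # current sensor sensitivity [V/A] (185 mV/A), as quoted in the text
K_U = 2.0     # instrumentation gain k_u, as quoted in the text


def instantaneous_power(u_m: float, u_dc: float = U_DC,
                        k_u: float = K_U, s: float = S) -> float:
    """Eq. (5): P = U_DC * U_m / (k_u * S), returned in watts."""
    current = u_m / (k_u * s)   # input current of the device [A]
    return u_dc * current
```

For example, an output voltage of 0.37 V corresponds under these defaults to a current of 1 A and therefore to 19.5 W.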

Fig. 2. The dependence between the instantaneous power consumption and traffic load of the Intel NUC when used as a switch or router.

To reduce the effect of noise and interference, the measurement of power consumption as a function of incoming and outgoing traffic was repeated thirty times, and the results are summarised in Fig. 2. We then extracted the difference between the base power level at zero traffic and the value at a given traffic level; the resulting increase in energy consumption per Mb of traffic is presented in Fig. 3.
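The post-processing described above amounts to averaging the repeated readings for each traffic level and dividing the increase over the idle power by the traffic rate. A minimal sketch, with hypothetical function names and made-up readings (not the measured data), is:

```python
from statistics import mean
from typing import Dict, List


def mean_power(samples: Dict[float, List[float]]) -> Dict[float, float]:
    """Average the repeated power readings [W] taken at each traffic level [Mb/s]."""
    return {load: mean(values) for load, values in samples.items()}


def energy_per_mb(power_by_load: Dict[float, float]) -> Dict[float, float]:
    """Extra energy per megabit [J/Mb] relative to the idle (zero-traffic) power.

    Dividing watts by Mb/s yields joules per megabit.
    """
    idle = power_by_load[0.0]
    return {load: (p - idle) / load for load, p in power_by_load.items() if load > 0}


# Made-up readings for illustration: traffic level [Mb/s] -> repeated power samples [W].
example_samples = {
    0.0: [10.0, 10.2],
    100.0: [11.0, 11.2],
    200.0: [11.5, 11.7],
}
```

Calling `energy_per_mb(mean_power(example_samples))` then gives one value per non-zero traffic level, which is the kind of curve plotted in Fig. 3.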

2.4 Security

The level of trust in a given flow, and therefore in the device that generates it, can be assessed using external entities. Within the network, nodes and devices with higher and lower sensitivity may be defined. For example, the failure of some nodes has a greater impact on the operation of the entire network than the failure of others, and attacking such a node will cause more damage. Security-aware routing aims to direct suspicious traffic away from vulnerable nodes, if possible. Trust-assessing entities can be Attack Detectors or Honeypots, e.g. [17, 30]. We employed the SYN attack detector presented in [7].

3 Experiments and Results

The experiments were performed in the IITiS laboratory. The test network consisted of seven NUC devices working as SDN switches, plus an SDN controller, client machines and an attack detector. The basic topology of the network is presented in Fig. 4.

Fig. 3. The energy used per Mb as a function of switch load.

For clarity of presentation, and in order to show concisely the different possibilities of our solution, two separate experiments were performed, while the basic network configuration remained the same. The course and results of the experiments follow.

3.1 Point-to-Point Transmission in Insecure Environment

The aim of the experiment was to reproduce point-to-point communication under attack. As presented in Fig. 6, point-to-point communication from client device \(c_1\) to client device \(c_6\) was established and put under observation. In this experiment energy efficiency was not taken into consideration, so as to avoid having too many factors influence the results, which would make it hard to separate the contribution of each of them.

The experiment had three steps:

  • Normal communication from \(c_1\) to \(c_6\)

  • QoS deterioration on the link \(c_1\)–\(c_4\)

  • Security problem detected – the need to bypass sensitive nodes \(c_3\) and \(c_5\)

The measurement covered the latency on the path from \(c_1\) to \(c_6\). The system’s reaction to the changing conditions can be easily observed in Fig. 5. After the time needed for the neural network to test various conditions and possibilities, a path that is both fast and secure was found. The network configurations in the individual steps are presented in Figs. 6 and 7, and the final one in Fig. 8.

3.2 Energy-efficient Access to the Edge

The final topology of the second experiment is presented in Fig. 9. It includes 24 client devices (implemented as virtual machines) and seven edge services, with every switch accompanied by a separate service instance. The energy characteristics are taken into account, as well as the total time of request handling by the Edge services. The total handling time comprised the time of client-to-service communication \(t_{cs}\), the request handling time in the server \(t_r\), and the time of service-to-client communication \(t_{sc}\). The second component of the goal function was energy efficiency: the energy characteristics from Fig. 3 were loaded into the SDN controller so that energy usage could be read out from the traffic in each switch. The RNN decision engine was used for path-and-service choice (Fig. 7).
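The objective used in this experiment can be illustrated with a small sketch. It evaluates the two-component goal (total handling time plus energy) for a handful of candidate path-and-service pairs and picks the smallest value. This exhaustive comparison only illustrates the objective and is not the RNN decision engine itself. The names `Candidate` and `choose`, the example figures and the assumption that time and energy are already scaled to comparable ranges are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """One path-and-service option for a client request (illustrative)."""
    service: str
    path: List[int]
    t_cs: float     # client-to-service communication time
    t_r: float      # request handling time in the server
    t_sc: float     # service-to-client communication time
    energy: float   # energy per packet along the path, pre-scaled

    @property
    def handling_time(self) -> float:
        return self.t_cs + self.t_r + self.t_sc


def choose(candidates: List[Candidate], a: float, c: float) -> Candidate:
    """Return the candidate minimising a * handling_time + c * energy (a + c = 1)."""
    if a < 0 or c < 0 or abs(a + c - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return min(candidates, key=lambda x: a * x.handling_time + c * x.energy)


# Made-up candidates: a fast but energy-hungry option and a slower, frugal one.
example_candidates = [
    Candidate("e1", [0, 1], t_cs=10.0, t_r=5.0, t_sc=10.0, energy=8.0),
    Candidate("e2", [0, 2, 3], t_cs=12.0, t_r=5.0, t_sc=12.0, energy=0.5),
]
```

With a = 1 and c = 0 the sketch selects the faster service `e1`, whereas equal weights shift the choice to `e2`, mirroring the QoS-only and QoS-plus-energy runs.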

Fig. 4. The configuration of experimental test-bed

Fig. 5. The delay in time between clients 1 and 6

Fig. 6. The \(c_1\)–\(c_6\) path configuration – stage 1

Fig. 7. The \(c_1\)–\(c_6\) path configuration – stage 2

Fig. 8. The \(c_1\)–\(c_6\) path configuration – stage 3

Fig. 9. Configuration of the Edge services experiment

Fig. 10. Average Energy [J]/packet during stress test

Fig. 11. Average QoS (delay [ms]) during stress test

The experiment consisted of loading the network with heavy stress-test traffic, since such a load best reveals differences in energy usage. Seven steps were performed, and in every step the total load in the network was increased by 1 Gb/s. In the first run only QoS optimisation was performed, as a reference result; then both the QoS and Energy components were included in the Goal function. The results, presented in Figs. 10 and 11, show a positive influence of the latter version of the Goal function on the total energy consumption, with a minor effect on QoS.

4 Conclusions

The paper presents the possibilities of using modern tools from the field of Artificial Intelligence (AI) and Machine Learning (ML) to control the operation of computer networks. It has been shown that the theoretical capabilities of RNNs can be translated into practical applications, and that appropriately constructed goal functions make it possible to perform complex routing based on several criteria simultaneously.

Among the criteria tested experimentally are the possibilities of increasing the security and reducing the energy consumption of the IT infrastructure, which are very relevant for today’s IT systems. These very promising ideas have been tested in several experiments which demonstrate their practical value in the framework of Software Defined Networks.