
1 Introduction

Future networks are expected to be highly heterogeneous, consisting of different complex topologies with diverse characteristics and behaviors. For example, 5G is expected to consist of a wired-wireless broadband core, while assuming wireless heterogeneous access networks at the periphery, characterized by numerous simple devices, e.g., Internet of Things (IoT) [1], fog networks [2], etc. As cloud services are pushed towards the access network [2], and owing to a multitude of factors such as operational environment, user trends and device heterogeneity, it is anticipated that the on/off behavior (churn) of network nodes and of the communication links between them will become critical for the feasibility and scaling of future resource allocation/optimization mechanisms, signalling, etc. For this reason, network evolution as a consequence of network churn will be one of the key dynamic processes that will impact and possibly determine the success of future network design/planning efforts. Network design and optimization will need to take into account the anticipated churn rates in order to provide services at low cost and with predictable resource demands.

The impact of network churn on network structure, management and application services has been largely neglected in the literature. Only a few scattered works have addressed it under targeted settings. The authors in [4] focused on sensor network churn induced by energy depletion, addressing it via a queuing framework. Churn of users in online social networks, i.e., how users enter and leave the system, has been studied in [5, 6]. In [3] a mechanism for network robustness against node churn in single-hop wireless networks was developed, while [7] addressed the impact of node churn on malware spreading under random attacks. The scope of our work is broader, aiming at extending the study of churn-induced network evolution to arbitrary complex networks and eventually to more general churn mechanisms.

More specifically, in this paper, we first introduce a general analysis framework for evolving networks, and then use it to study the impact of churn on various complex network topologies that emerge as components in current and future networks. The proposed framework maps topology evolution into multi-dimensional vector trajectories and quantifies the similarity of different topologies. We analyze the impact of evolution on random, scale-free and small-world topologies via various graph metrics typically employed for social/complex network analysis, i.e., average values of node degree, path length, clustering coefficient and centrality measures, thus obtaining a clearer picture of the impact anticipated by network churn. The latter is considered in the form of random edge (link) churn and various types of preferential edge churn, namely churn based on degree, closeness and betweenness centrality metrics. We analyze the impact of churn on each topology and compare their evolution cumulatively, thus obtaining useful knowledge for efficient network design and planning.

The rest of this paper is structured as follows. Section 2 summarizes related work and background on complex networks, analysis metrics and churn, and explains the contribution of this paper. Section 3 introduces the proposed network evolution analysis framework, while Sect. 4 describes the considered edge churn processes. Section 5 provides extensive simulation results on synthetic complex network topologies analyzing the impact of churn on their evolution. Finally, Sect. 6 concludes the paper and provides directions for future research.

2 Network Evolution and Complex Network Analysis Metrics

2.1 Network Evolution and Churn

Networks in general, and computer networks in particular, are characterized by various forms of topological and operational evolution [8]. For instance, different traffic patterns emerge as networks grow, or as user habits change over time. Network evolution can involve variations of nodes/users or, most frequently, variations of connection links (edges). With respect to the latter, the authors in [9] proposed a continuum model for dynamic wireless networks, assuming that communication links change in a continuous fashion. They formulated network evolution through differential equations and obtained the long-term behavior of the average node degree. The work in [10] studied the evolution of real networks, implicitly focusing on power-law (scale-free) topologies, with respect to the metrics of the average path length, network diameter and node degree. The evolution was analyzed under the regime of network growth, which is realistic for commercial communication networks. In [11, 12] network evolution models for wireless multihop networks were considered, and in particular a methodology was proposed for adding features typical of social networks (small-world characteristics, etc.) into multihop networks via network churn evolution mechanisms.

Compared to the above works addressing various facets of network evolution, our work attempts a more holistic consideration. More specifically, in contrast to [9], we consider more network analysis metrics and various complex network topologies for random and preferential edge churn. Like [10], we focus on the impact of churn on complex topologies that emerge in various capacities in real networks, but in a broader setting where more topologies and analysis metrics are considered. Finally, compared to [11, 12], this paper focuses on studying how diverse topology transformations impact the network properties, rather than imposing a specific churn-based network evolution. In essence, this work aspires to pave the way for a more general topology control framework, by enabling controlled network evolution towards desired topologies, for arbitrary initial complex topologies and several analysis metrics, as each application setting specifies.

2.2 Complex and Social Network Analysis Metrics

Various metrics have been employed for complex and social network analysis. The most significant ones have been the node degree and the associated degree distribution, the average path length, the clustering coefficient (CC) and the variants of centrality measures [12, 13]. Some of these metrics refer to network nodes independently, others to the overall network, and some to both. Thus, the CC and centrality can be defined for individual nodes and also computed as network averages. Node degrees refer to individual nodes, but the node degree distribution uniquely characterizes a specific type of network topology. The average path length is a network wide metric, as well.

The clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together. For a network node, the local CC is given by the number of links between the nodes within its neighborhood divided by the number of links that could possibly exist among them [12]. Centrality is a measure of node importance in a network. Since importance is subjective in many application settings, there exist various centrality metrics, e.g., degree centrality (a normalized version of node degree), closeness centrality, betweenness centrality, eigenvector centrality, etc. [12]. In this paper we focus on degree, betweenness and closeness centrality metrics, but other variations/definitions of centrality can be used in a similar fashion. Node betweenness centrality is defined as a normalized sum, over all pairs of vertices, of the fraction of shortest paths between each pair that pass through that node. Various approaches have been employed for computing this type of centrality efficiently, including variants of its definition [14]. Node closeness centrality is typically defined as the normalized reciprocal of the sum of distances of a node from all other nodes in the network.
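The definitions above can be illustrated with a minimal Python sketch on a toy undirected graph. The function names, the adjacency-dictionary representation and the normalization constants (division by N-1 for degree and closeness, by (N-1)(N-2) for betweenness) are assumptions made for illustration, not the implementation used for the results of this paper. Betweenness is computed with a Brandes-style accumulation, one of the efficient approaches alluded to above.

```python
from collections import deque

def local_clustering(adj, v):
    """Links among the neighbours of v divided by the links that could exist among them."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumed convention: CC is 0 with fewer than two neighbours
    links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
    return links / (k * (k - 1) / 2)

def degree_centrality(adj, v):
    """Node degree normalized by the maximum possible degree N - 1."""
    return len(adj[v]) / (len(adj) - 1)

def closeness_centrality(adj, v):
    """(N - 1) divided by the sum of hop distances from v; assumes a connected graph."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(adj) - 1) / total if total > 0 else 0.0

def betweenness_centrality(adj):
    """Normalized betweenness of every node of an undirected, unweighted graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, dist = [], {s: 0}
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:  # accumulate dependencies in order of decreasing distance
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(adj)
    # each unordered pair was counted twice, so divide by (n - 1)(n - 2)
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: bc[v] * scale for v in adj}

# Toy graph: triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
avg_cc = sum(local_clustering(adj, v) for v in adj) / len(adj)
print(avg_cc, closeness_centrality(adj, 3), betweenness_centrality(adj)[2])
```

Averaging each per-node quantity over all nodes, as done for `avg_cc`, yields the network-wide values that are later used as feature vector components.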

3 Analyzing Network Evolution with Network Feature Vector

Network topologies are typically represented as graphs bearing structural features, characteristic of the interactions among nodes/users. In this section, we introduce a framework for tracking and analyzing network evolution, based on the observation of graph properties and other metrics of interest.

For that, we define the notion of network feature vector. Assume a set of m parameters of interest of a network graph, each of which is denoted as \(g_i,1\le i\le m\). The number m of available/employed network metrics can be arbitrary but finite, and mainly depends on the application setting and the objectives of network evolution analysis. For instance, the metrics mentioned in Sect. 2.2 are sufficient for an overall study of network evolution or node importance, while additional features are required for the study of resource allocation.

In the general case, the metrics can be split into node-oriented and edge-oriented. The first are characteristic of node properties, e.g., node centrality or node clustering coefficient. The second are characteristic of edge properties, e.g., edge weight distribution or average weight of the links, etc. Thus, the set of parameters employed can be split in two subsets, one with node-related features \(\{g_i^{(n)},1\le i\le m_1\}\) and a second with edge-related metrics \(\{g_i^{(e)},1\le i\le m_2\}\), \(m_1 + m_2 = m\).

With this in mind we can define the network feature vector, consisting of all the employed network metrics:

$$\begin{aligned} \mathbf {g}=\underbrace{[g_1^{(n)}~g_2^{(n)}~\ldots ~g_{m_1}^{(n)}}_\text {node-related features}| \underbrace{g_1^{(e)}~g_2^{(e)}~\ldots ~g_{m_2}^{(e)}]^T}_\text {edge-related features} \end{aligned}$$
(1)

If the topology varies with time, at least some of the features will be time-varying and a time-dependence can be considered. In this paper, we will demonstrate the framework with a small number of parameters and we will not maintain the aforementioned distinction of features to node-related and edge-related. Thus, we will employ a more compact form of the feature vector \(\mathbf {g}(t) = [g_1(t)~g_2(t)~\ldots ~g_m(t)]^T\). Also, for simplicity and without loss of generality, we will drop the time variable whenever the time-dependence is clear.

Through the feature vector \(\mathbf {g}\), a specific network topology with its properties is mapped to a vector (point) in a metric space. As the topology evolves, so does the point in the metric space and the direction of the associated position vector \(\mathbf {g}(t)\), i.e., the angle coordinates and measure of \(\mathbf {g}(t)\) evolve in time. The dimension of the metric space, denoted as network feature space, depends on the number of network properties considered, so that different network feature spaces correspond to studying different properties of network evolution.

The time-varying feature vector can be used to assess the similarity of different topologies, namely to quantify how “close” the final topology is to the initial one after a series of modifications. It can also be used to assess the similarity of different types of topologies. Since by the above mapping a topology snapshot corresponds to a point in space, various distance or similarity metrics can be employed to quantify the distance/similarity between topologies. In the special case that each component (metric) of the feature vector is independent of the rest, the network feature vector can be cast as a probability density function, and thus, in addition to distance/similarity, entropy-like measures can be employed as well [15]. With respect to the metrics employed in this work, and since the relation among them has not been fully clarified, we will employ only distance and similarity metrics.

Distance metrics are more adequate to quantify the “magnitude” of network evolution, corresponding to the magnitude change of the network feature vector. On the other hand, inner product metrics (e.g., cosine metric) depict the “direction” of change (e.g., if a network changes character drastically or to a lesser degree) corresponding to the directional change of the network feature vector.

In this paper, and in order to demonstrate the potentials of the topology evolution framework and the role of distance/similarity metrics, we will show one simple representative metric from each category. Specifically, we will employ the Euclidean distance and cosine metrics, to quantify the magnitude and direction of change, respectively, between two topology instances, \(\mathbf {g}(t_1)\) and \(\mathbf {g}(t_2)\):

$$\begin{aligned} d_e = \sqrt{\sum _{i=1}^m{|g_i(t_1)-g_i(t_2)|^2}} \end{aligned}$$
(2)
$$\begin{aligned} s_c = \frac{\sum _{i=1}^m{g_i(t_1)g_i(t_2)}}{\sqrt{\sum _{i=1}^m{g_i^2(t_1)}}\sqrt{\sum _{i=1}^m{g_i^2(t_2)}}} \end{aligned}$$
(3)

It should be noted that \(\mathbf {g}(t_1)\) corresponds to the instance of a topology at time \(t_1\), as explained above, while \(\mathbf {g}(t_2)\) denotes another instance, namely the evolution of the topology up to time \(t_2 > t_1\). Furthermore, since the Euclidean distance is a special case of the Minkowski distance family [15] and the cosine similarity a special case of the inner product family of similarities [15], the results provided are indicative of the trends one would obtain with any other member of the Minkowski and inner product families, apart from some scaling factors.
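As a minimal sketch, the two metrics of Eqs. (2) and (3) can be written in a few lines of Python. The function names and the numerical values of the two feature vectors are hypothetical and serve only to show the calculation; they are not results of this paper.

```python
import math

def euclidean_distance(g1, g2):
    """Eq. (2): magnitude of change between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g1, g2)))

def cosine_similarity(g1, g2):
    """Eq. (3): direction of change; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(g1, g2))
    norm1 = math.sqrt(sum(a * a for a in g1))
    norm2 = math.sqrt(sum(b * b for b in g2))
    return dot / (norm1 * norm2)

# Hypothetical 5-dimensional feature vectors of one topology at two time instants
# (illustrative numbers only).
g_t1 = [0.10, 2.50, 0.30, 0.02, 0.40]
g_t2 = [0.20, 2.00, 0.25, 0.01, 0.50]

print(euclidean_distance(g_t1, g_t2))
print(cosine_similarity(g_t1, g_t2))
```

Note that features with a larger numerical range (here the second component) dominate both metrics, so in practice one may want to normalize the components before comparing topologies.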

Another important observation refers to the dimension of the network feature space. It should be stressed that this is defined by the number of metrics employed in the feature vector definition and not the number of nodes in the network. The dimension of the feature vector (not its values) is independent from the size and order of the network graph.

One of the most promising capabilities of the introduced network feature vector is control over topology evolution. As explained above, different instances in time of an evolving topology correspond to a trajectory of a vector in a metric space. One can define a cost function of the form \(J = h(\mathbf {g}(t_f),t_f) + \int _{t_0}^{t_f}{k(\mathbf {g}(t),\mathbf {u}(t),t)dt}\), where \([t_0,t_f]\) is the observation time interval, \(h(\cdot ),k(\cdot )\) are properly defined continuous functions and \(\mathbf {u}(t)\) is a control function related to the real mechanics of network evolution. The control determines the evolution of the topology through the system of equations \(\dot{\mathbf {g}}(t) = \mathbf {a}(\mathbf {g}(t),\mathbf {u}(t),t)\), where \(\mathbf {a}(\cdot )\) determines the relation of the network feature vector with each control. In this way, one can potentially formulate an optimal control problem on \(\mathbf {g}(t)\) and exploit the constraints and controls for optimally balancing the trade-offs relevant to network evolution and the benefit-cost relations of the emerging network processes. The type of cost J, the controls \(\mathbf {u}(t)\) and the relation \(\mathbf {a}(\cdot )\) between the network feature vector and the controls are application/network dependent and define the type of control problem emerging. They also determine the required solution methodology to be employed.

In the following, we demonstrate the impact of edge churn on network topology, by exploiting the capabilities of the framework described above to quantify and provide intuition on the similarity between instances of an evolving topology and of different networks among them.

4 Network Churn and Network Evolution

In this work, we will focus on edge churn for relational complex networks, i.e., variation of links between nodes. In such networks, any link can be added or deleted (no constraint on link formation). Examples of such networks with edge churn are peer-to-peer and online social networks, where users enter/leave the network arbitrarily and form relational links arbitrarily with other users. This would not be the case in lattice and random geometric networks, where spatial constraints would apply to the links that could be added/deleted. However, even in these cases, network evolution could be analyzed similarly. We restrict ourselves to relational graphs only in order to directly compare the results between them and focus on the churn effect, rather than take into account evolutionary features induced by the type of analyzed network.

We consider a generic edge churn mechanism and several variations of it, as described below. Edge churn is a combined effect in complex networks, including environmental aspects (wireless access cases), user behavior (turning devices on/off) and service patterns (connectionless, service-oriented, etc.). A general trend discovered in [10] is that of network “densification”, i.e., edges are added at a higher rate than they are deleted. Thus, we will also consider a slightly higher edge addition rate than deletion rate, which also reduces the probability of rendering the network topology disconnected. Furthermore, by considering the churn-based constructive nature of many complex networks, such as the ones included in this work (especially the scale-free and small-world ones), we further study the effect of edge churn mechanisms on their structure, thus obtaining a clearer picture of its impact on their evolution, while exemplifying the proposed framework.

Edge churn is performed successively in multiple steps, in a constructive process [12]. We assume that at each step, only one of the sub-processes takes place, as described below:

  • Edge Addition: With probability p, \(0\le p\le 1\), we add one new link to a selected pair of nodes that are not currently linked.

  • Edge Deletion: With probability r, \(0\le r\le 1\), we delete one selected edge from an already connected node pair.

  • No Action: With probability \(1-p-r\), neither edge addition nor edge deletion is performed.

Note that \(p+r\le 1\) must hold, i.e., the vector \([p~r~1-p-r]\) is a probability distribution over the above processes.
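The three sub-processes above can be sketched as a single step function for the simplest case in which the edge to add or delete is selected uniformly at random. This is a minimal illustration under stated assumptions: the function name `churn_step`, the representation of edges as a set of sorted node pairs, and the choice to do nothing when an addition is drawn on a complete graph (or a deletion on an empty one) are hypothetical and not taken from the simulations of this paper.

```python
import random

def churn_step(edges, nodes, p, r, rng):
    """Perform one edge churn step in place; return 'add', 'delete' or 'none'.

    edges: set of (u, v) tuples with u < v; nodes: sorted list of node ids.
    """
    if p < 0 or r < 0 or p + r > 1:
        raise ValueError("[p, r, 1 - p - r] must be a probability distribution")
    x = rng.random()
    if x < p:
        # Edge addition: link one uniformly selected, currently unlinked pair.
        non_edges = [(u, v) for i, u in enumerate(nodes)
                     for v in nodes[i + 1:] if (u, v) not in edges]
        if non_edges:
            edges.add(rng.choice(non_edges))
            return "add"
        return "none"  # assumed behavior when the graph is already complete
    if x < p + r:
        # Edge deletion: remove one uniformly selected existing edge.
        if edges:
            edges.remove(rng.choice(sorted(edges)))
            return "delete"
        return "none"  # assumed behavior when there is nothing to delete
    return "none"      # no action, with probability 1 - p - r

# Usage: a small ring of 6 nodes under churn with p = 0.60 and r = 0.40.
rng = random.Random(42)
nodes = list(range(6))
edges = {(i, i + 1) for i in range(5)} | {(0, 5)}
counts = {"add": 0, "delete": 0, "none": 0}
for _ in range(20):
    counts[churn_step(edges, nodes, 0.60, 0.40, rng)] += 1
print(counts, len(edges))
```

Since additions are more likely than deletions whenever p > r, the expected number of edges grows by roughly p - r per step as long as the graph is neither complete nor empty.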

The selection mechanism for which edges to add/delete determines diverse edge churn types. We consider random edge churn (RC), which corresponds to selecting randomly and uniformly which edge to add/delete, as well as which process to perform. We also consider preferential edge churn, which selects nodes and edges for edge churn preferentially with respect to a selected feature (i.e., a centrality metric). We consider three types of preferential edge churn. In the first, new edges are preferentially attached to nodes with high degree centrality and deleted from nodes with low degree centrality. The second and the third variants are similar, but instead of degree centrality, we use betweenness and closeness centrality, respectively. We also examine the inverse of these edge churn types, i.e., the inverse of the first type corresponds to adding new edges preferentially to nodes with low degree centrality and deleting edges from nodes with high degree centrality. The reason for applying such preferential attachment (PA) edge churn mechanisms is to study how network evolution based on a particular network feature (part of the network feature vector) impacts the other features and the behavior/structure of each network as a whole. It is also in line with network churn processes such as the ones described above in peer-to-peer or online social networks, and IoT networks within the fog computing paradigm.
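The exact selection rule behind preferential edge churn is not spelled out above, so the following Python sketch shows only one plausible reading, under explicit assumptions: a node is drawn with probability proportional to its score (e.g., its degree, betweenness or closeness centrality) when high values are preferred, and proportional to the inverse of its score when low values are preferred, and both endpoints of the affected edge are drawn in this way. All names (`weighted_pick`, `preferential_add`, `preferential_delete`) are hypothetical.

```python
import random

def weighted_pick(candidates, score, rng, prefer_high=True):
    """Draw one candidate with weight score (or 1/score when low values are preferred)."""
    eps = 1e-9  # keeps every weight strictly positive, even for a zero score
    if prefer_high:
        weights = [score[c] + eps for c in candidates]
    else:
        weights = [1.0 / (score[c] + eps) for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

def preferential_add(adj, score, rng, prefer_high=True):
    """Add one edge between preferentially chosen, currently unlinked nodes."""
    n = len(adj)
    open_nodes = sorted(v for v in adj if len(adj[v]) < n - 1)
    if not open_nodes:
        return None  # complete graph: nothing to add
    u = weighted_pick(open_nodes, score, rng, prefer_high)
    partners = sorted(v for v in adj if v != u and v not in adj[u])
    v = weighted_pick(partners, score, rng, prefer_high)
    adj[u].add(v)
    adj[v].add(u)
    return (u, v)

def preferential_delete(adj, score, rng, prefer_high=False):
    """Delete one edge whose endpoints are chosen preferentially (low score by default)."""
    linked = sorted(v for v in adj if adj[v])
    if not linked:
        return None  # empty graph: nothing to delete
    u = weighted_pick(linked, score, rng, prefer_high)
    v = weighted_pick(sorted(adj[u]), score, rng, prefer_high)
    adj[u].discard(v)
    adj[v].discard(u)
    return (u, v)

# Usage: degree-based preferential churn on a star with hub 0 and leaves 1..4.
rng = random.Random(7)
adj = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
score = {v: len(adj[v]) for v in adj}  # degree used as the preference score
print(preferential_add(adj, score, rng), preferential_delete(adj, score, rng))
```

The inverse churn types described above correspond to calling `preferential_add` with `prefer_high=False` and `preferential_delete` with `prefer_high=True`. In a full simulation the scores would be recomputed as the topology changes.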

5 Numerical Evaluation

In this section, we exemplify the application of the proposed framework via numerical results obtained by simulating RC and PA edge churn in various complex networks. Initially, we explain the models for the complex networks employed and then present and analyze the obtained results.

5.1 Relational Complex Networks

We consider three different types of relational graphs, random graphs (RG), scale-free (SF) and small-world (SW). The simplest model of RG is the Gilbert \(\mathcal {G}(N,p)\), where every possible edge occurs independently with probability \(0< p < 1\). This is a model extensively used for peer-to-peer networks and other spontaneously forming networks. We consider the Erdős-Rényi (ER) model, \(\mathcal {G}(N,E)\), where a graph is chosen uniformly at random from the collection of all graphs which have N nodes and E edges. If \(pN^2 \rightarrow \infty \), then \(\mathcal {G}(N,p)\) behaves fairly similarly to \(\mathcal {G}(N,E)\) with \(E=\genfrac(){0.0pt}1{N}{2} p\), as N increases [12, 13]. We choose E in the range of \(\frac{N\log (N)}{2}< E < \frac{N(N-1)}{2}\). The lower bound ensures a connected topology, while the upper \(\frac{N(N-1)}{2}\) signifies a massively dense network, i.e., a completely connected graph.
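A minimal Python sketch of the \(\mathcal {G}(N,E)\) model follows: sampling E distinct node pairs uniformly without replacement selects a graph uniformly among all graphs with N nodes and E edges. The function names are hypothetical, and the natural logarithm is assumed for the lower bound on E, since the base is not stated above.

```python
import math
import random
from itertools import combinations

def gnm_random_graph(n, e, rng):
    """Return a uniformly chosen edge set with n nodes and exactly e edges."""
    all_pairs = list(combinations(range(n), 2))  # all N(N-1)/2 possible edges
    if not 0 <= e <= len(all_pairs):
        raise ValueError("e must lie between 0 and n(n-1)/2")
    return set(rng.sample(all_pairs, e))

def edge_count_bounds(n):
    """Bounds N log(N)/2 < E < N(N-1)/2; natural logarithm assumed."""
    return n * math.log(n) / 2, n * (n - 1) / 2

# Usage with N = 100 nodes and E = 2000 edges.
rng = random.Random(0)
edges = gnm_random_graph(100, 2000, rng)
lower, upper = edge_count_bounds(100)
print(len(edges), lower < len(edges) < upper)
```

Under the natural-logarithm assumption, the bounds for N = 100 are approximately 230 and exactly 4950, so E = 2000 lies well inside the stated range.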

Scale-free (SF) is a type of network whose degree distribution follows a power-law. The fraction P(k) of nodes having k connections to other nodes scales as \(P(k) \ \sim \ k^{-\gamma }\), where \(2< \gamma < 3\) [12, 13]. Many real-world networks such as the WWW and the router network of the Internet belong in this category. We consider the Barabasi-Albert (BA) model of scale-free networks, which generates random scale-free networks using the preferential attachment mechanism. If a network begins with an initially connected and d-regular network of \(m_0\) nodes (\(d<m_0\)) and new nodes are added to the network one at a time, each new node will connect to \(m \le m_0\) existing nodes with a probability that is proportional to the degree of the latter. Such probability \(p_i\) that the new node connects to the existing node i is \(p_i = \frac{k_i}{\sum _j k_j}\), where \(k_i\) is the degree of node i and the sum is over all pre-existing nodes j. Heavily linked nodes (hubs) tend to quickly accumulate even more links and usually have high betweenness and closeness centrality values.
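The growth process just described can be sketched in Python as follows. This is an illustration under assumptions rather than the generator used for the reported results: the seed is taken to be a ring lattice in which each node links to its d/2 nearest neighbours on each side (d even), and the m distinct targets of a new node are obtained by repeated degree-proportional draws until m different nodes have been collected. The function names are hypothetical.

```python
import random

def ring_lattice(n, d):
    """d-regular ring: each node links to its d/2 nearest neighbours on each side."""
    if d % 2 != 0 or not 2 <= d < n:
        raise ValueError("d must be even with 2 <= d < n")
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, d // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj

def barabasi_albert(n, m0, d, m, rng):
    """Grow a graph to n nodes from a d-regular seed of m0 nodes, m links per new node."""
    if not 1 <= m <= m0:
        raise ValueError("m must satisfy 1 <= m <= m0")
    adj = ring_lattice(m0, d)
    # Node v appears deg(v) times in this list, so a uniform draw from it
    # selects node i with probability k_i / sum_j k_j.
    endpoints = [v for v in adj for _ in adj[v]]
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:  # redraw until m distinct targets are found
            targets.add(rng.choice(endpoints))
        adj[new] = set()
        for t in sorted(targets):
            adj[new].add(t)
            adj[t].add(new)
            endpoints.extend([new, t])
    return adj

# Usage: 100 nodes grown from a 10-regular seed of 12 nodes, 5 links per new node.
rng = random.Random(0)
graph = barabasi_albert(100, 12, 10, 5, rng)
degrees = sorted((len(nb) for nb in graph.values()), reverse=True)
print(degrees[:5], degrees[-5:])
```

The endpoint list is a common trick for degree-proportional sampling; it avoids recomputing the normalizing sum over all degrees at every step.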

Small-world (SW) is defined as a network where the typical distance L between two randomly chosen nodes grows proportionally to the logarithm of the number of nodes N, i.e., \(L \propto \log N\) [12, 13] and the clustering coefficient is high. Practically, nodes are linked with a small number of local neighbors. However, the average distance between nodes remains small, i.e., nodes are virtually close to each other. Examples include some types of smart grids, telephone networks, etc. We consider the Watts-Strogatz (WS) model of SW. It starts from a regular lattice of degree d and randomly rewires each edge with a probability \(g_p\), connecting nodes that are otherwise far apart with shortcuts. The initial lattice ensures a high clustering coefficient, while a suitable number of shortcuts can further reduce the average path length. For comparison purposes, one should consider that RGs exhibit a small average shortest path length (varying typically as the logarithm of the number of nodes) along with small clustering.
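A minimal Python sketch of the WS construction is given below, assuming that the regular lattice is a ring in which each node links to its d/2 nearest neighbours on each side (d even) and that a rewired edge keeps one endpoint and moves the other to a uniformly chosen node that is not already a neighbour. The function names are hypothetical.

```python
import random

def ring_lattice(n, d):
    """d-regular ring: each node links to its d/2 nearest neighbours on each side."""
    if d % 2 != 0 or not 2 <= d < n:
        raise ValueError("d must be even with 2 <= d < n")
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, d // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj

def watts_strogatz(n, d, g_p, rng):
    """Rewire each lattice edge with probability g_p, keeping the edge count fixed."""
    adj = ring_lattice(n, d)
    for step in range(1, d // 2 + 1):
        for v in range(n):
            w = (v + step) % n
            if w in adj[v] and rng.random() < g_p:
                # New endpoint: any node that is neither v nor already linked to v.
                candidates = [u for u in range(n) if u != v and u not in adj[v]]
                if candidates:
                    new_w = rng.choice(candidates)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new_w)
                    adj[new_w].add(v)
    return adj

# Usage with N = 100, d = 10 and rewiring probability g_p = 0.25.
rng = random.Random(0)
graph = watts_strogatz(100, 10, 0.25, rng)
print(sum(len(nb) for nb in graph.values()) // 2)
```

Because rewiring only moves edges, the number of edges stays at Nd/2, while node degrees are no longer all equal to d. Note that nothing in this sketch prevents a rewiring step from disconnecting the graph, which is unlikely but possible for small d.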

5.2 Case Study of the Application of the Proposed Framework: Impact of Edge Churn on Complex Networks

In this subsection we investigate via simulations the impact of edge churn on complex network evolution. We performed them in MATLAB, averaging over 50 topologies for each considered scenario. The feature vector consists of the following features, in this order: average degree centrality (normalized average node degree), average path length, average clustering coefficient, average betweenness centrality, average closeness centrality, where averaging is performed for all network nodes. We consider \(N = 100\) nodes for all topologies, \(E = 2000\) edges for ER-RGs, initial d-regular lattice with \(d = 10\) for BA-SF and initial lattice with \(d = 10\) and a rewire probability \(g_p = 0.25\) for WS-SW networks. Edge addition takes place with probability 0.60, and edge deletion with probability 0.40.

Impact of Edge Churn on Each Type of Network. In this subsection, we study how the introduced network feature vector can aid in tracking and visualizing the evolution of several types of networks when random and preferential edge churn takes place. The results are shown in Figs. 1, 2 and 3, where edge churn takes place in an initial RG, SF and SW topology, respectively. For visualization purposes, we divide the set of features into two subsets, one including the degree, clustering coefficient and average path length, and the other including all the centrality metrics (degree, betweenness and closeness). Subfigure (a) of each figure corresponds to the first subset and subfigure (b) to the second. The vectors with only a graph type indication in the legend correspond to the features’ values in the initial topology, while the vectors that also carry an edge churn mechanism indication in the legend correspond to the final topology after the application of the edge churn.

From Fig. 1, we observe that an RG is not significantly affected by edge churn, either PA-based or random. The final vectors remain very close to the initial ones for all types of churn. The churn type (e.g., random, preferential) does not seem to affect the final vector, which differs from the initial one mainly due to the increase of the average node degree, as an outcome of network densification. This behavior is intuitively expected for RGs: due to the random nature of the initial network, even preferential additions/deletions eventually behave similarly to random ones. Given the remaining random connections, the ones added/deleted do not considerably alter the initial structure of the RG.

Fig. 1. Illustration of the proposed framework when edge churn with diverse attachment types is applied to an initial random graph.

From Fig. 2, we observe that an evolved SF network under all types of edge churn examined, eventually moves closer to an RG, as the node degree increases due to network densification (nodes with low degree in the initial SF structure also obtain edges) and so does the clustering coefficient. Note that the SF topology has the lowest clustering coefficient compared to the RG and SW ones. The final topologies under PA edge churn based on degree and betweenness centrality distinguish themselves among the rest by staying closer to the initial SF topology, while the second also moves closer to the SW topology, as shown more clearly in Fig. 2(a).

For an initial SW topology under edge churn (Fig. 3), we can make similar observations as in Fig. 2, as the network feature vector moves towards the one corresponding to an RG. However, the final topology under preferential edge churn based on betweenness centrality distinguishes itself among the rest by staying closer to the SW feature vector. Thus, such illustrations of the network evolution reveal important observations for maintaining a non-random network structure, e.g., for SF and SW. For the first, in order to maintain a completely non-random structure, node degree and betweenness centrality should be prevented from changing significantly. For the SW network only betweenness centrality should be prevented from drastic modification.

Fig. 2. Illustration of the proposed framework when edge churn with diverse attachment types is applied to an initial scale-free graph.

Fig. 3. Illustration of the proposed framework when edge churn with diverse attachment types is applied to an initial small-world graph.

Comparative Analysis of Churn Types on Complex Relational Topologies. In this subsection, we compare the evolved complex topologies with the initial ones, in terms of the distance and similarity metrics. In each of the following table pairs, the left-hand table corresponds to the (Euclidean) distance metric between the pure (not evolved) topologies indicated in each row and the final topologies derived from the initial topologies indicated in the columns. The right-hand table corresponds to the (cosine) similarity metric. As an example, the entry in row 1 and column 1 of Table 1 corresponds to the distance between the feature vector of a pure RG (row) and the feature vector of an evolved RG topology (column) under RC. The observations follow the lines of the ones made in Sect. 5.2.

Tables 1 and 2 present the distance and similarity of the evolved topologies for random edge churn compared to the initial ones. Similarly, the table pairs Tables 3 and 4, 5 and 6, and 7 and 8 present the distance and similarity of the evolved topologies compared to the initial ones for the PA edge churn based on degree, closeness and betweenness centralities, respectively.

For random edge churn, the most important observations are that it has almost no impact on RG topologies, a small impact on SF and a noteworthy impact on SW (which move closer to RG topologies). This is expected, since random edge modifications should not affect an already random network. However, they are expected to have some effect on an SW network, where the few shortcuts may be modified and thus affect its structure.

A similar but modified trend emerges under preferential edge churn. Evolved RGs are not affected notably by any type of preferential edge churn and always remain close to the features of the initial random graph. Evolved SFs and SWs after PA edge churn based on degree and closeness centralities become almost RGs, as their distance to a pure RG network decreases and their similarity increases. This is counter-intuitive for SF graphs when considering PA edge churn based on degree centrality. However, edges are added and deleted probabilistically, although preferentially, while the characterization of the network is now more holistic, based on a set of features rather than on specific ones. Notably, an SF network modified via PA edge churn based on betweenness centrality maintains its pure graph’s properties, as shown in Tables 7 and 8 and also observed in the previous subsection. The same observation holds for the evolved SW graph, although to a lesser degree. Note that more specific observations targeted to the extent of change of each topology can be made by comparing the absolute values of the distances.

Table 1. Distance comparison for random churn.
Table 2. Similarity comparison for random churn.
Table 3. Distance comparison for preferential churn (degree-based).
Table 4. Similarity comparison for preferential churn (degree-based).
Table 5. Distance comparison for preferential churn (closeness-based).
Table 6. Similarity comparison for preferential churn (closeness-based).
Table 7. Distance comparison for preferential churn (betweenness-based).
Table 8. Similarity comparison for preferential churn (betweenness-based).

Another general observation is that with respect to distance, SF-SW networks are relatively close and RG is farther away from them in the 5-dimensional network feature space considered. This is also expected since several features of SW networks are also present in SF graphs [12, 13], which is also illustrated in the plots of the previous subsection. All the above observations can be exploited for advanced network design and optimization. By taking into account the anticipated network changes due to edge churn, network planning will be more realistic, and more accurate optimization can be achieved.

6 Conclusions

In this work, we demonstrated a framework for quantifying network evolution and similarity based on the mapping of a set of network parameters to points of a vector space. It allows tracking network evolution by observing the trajectory of the network feature vector, which can include different graph features and thus accommodate various applications. Consequently, it is a very flexible, yet descriptive framework for the analysis and study of network evolution. We further demonstrated the potential of the proposed framework by analyzing edge churn for various complex networks of relational type, namely random, scale-free and small-world. The outcomes are in accordance with observed trends and can be used for advanced network design/optimization.

In the future, we plan to demonstrate the full potential of the proposed framework by extending it to a topology control framework. By proper definition of a cost function, optimal control theory can be used to drive the network feature vector to desired values, thus allowing one type of network to be transformed into another based on feasible trade-offs and precise specifications.