About complexity of complex networks
Abstract
We derive complexity estimates for two classes of deterministic networks: the Boolean networks S(B_{n, m}), which compute the Boolean vector-functions B_{n, m}, and the classes of graphs \(G(V_{P_{m,\,l}}, E)\) with overlapping communities and high density. The latter objects are well suited for the synthesis of resilient networks. For the Boolean vector-functions, we propose a synthesis of networks on a NOT, AND, and OR logical basis with unreliable channels such that the computation of any Boolean vector-function is carried out with polynomial information cost. All vertices of the graphs \(G(V_{P_{m,\,l}}, E)\) are labeled by the trinomial (m^{2}±l, m)-partitions from the set of partitions P_{m, l}. It turns out that such labeling makes it possible to create networks of optimal algorithmic complexity with highly predictable parameters. Numerical simulations of simple graphs for trinomial (m^{2}±l, m)-partition families (m=3,4,…,9) allow for the exact estimation of all commonly known topological parameters for the graphs. In addition, a new topological parameter, the overlapping index, is proposed. The estimation of this index offers an explanation for the maximal density value for the clique graphs \(G(V_{P_{m,\,l}}, E)\).
Keywords
Partition, Simple graph, Labeling, Algorithmic complexity, Information cost, Trinomial coefficients, Predictability, Overlapping community, Topological parameters, Connectivity, Emergency, Robustness
Abbreviations
 DDPF
Direct descendant partition family
 GRN
Gene regulatory networks
 TPF
Trinomial partition family
 TF
Transcription factors
 UD
Universal decoder
 UGP
Uniform gap partitions
Introduction
The fundamental problem of the theory of complex networks is to choose the “right” measure of complexity. For the theory of logical networks (and the theory of algorithms in general), the “right” measures are based on estimates of the resources required for solving the most difficult tasks from given classes. The theory of complex networks aims to study any type of object (brain, social network, or otherwise) that is considered “complex” by definition. The more data that are collected, the better the hope of understanding the nature of a complex system; this is dataism, or the principle of “brute force of data”. There is an established opinion (Krioukov 2014) that “as far as the brain and other complex systems and networks are concerned, we are at a rather Ptolemaic stage, collecting the data, awaiting for Copernicus”. Is there a clear idea of which data we must collect, or of when the process of “mining of big data” could suffice? Could it be something like “Waiting for Godot”?^{1}
Herein, we study the approach “from simplicity to complexity” (see Donetti et al. 2006) as applied to evolving complex networks, with a special emphasis on the estimation of different complexity measures: algorithmic complexity, running complexity, informational complexity and topological complexity. Although it has been empirically established that different real-world networks (brain, technological, social) have “almost the same” nontrivial topological features, it remains a mystery as to whether one type of graph topology or another is better suited for solving problems. There is good reason to believe that the methods and results of computational complexity theory can be useful both for justifying the choice of the topology of complex networks to be designed and for understanding real complex networks. In (Helbing et al. 2014), different cases of the “minimally invasive” principle of “chaos control” in complex networks with simultaneously extremely variable and highly predictable behavior are discussed. Designing networks that demonstrate this principle with precision is the primary goal of our paper.
For any real-world, evolving complex network, the basic stages of evolution (at least, from the algorithmic point of view) are almost identical. In the initial moment, “a set of the basic elements” comes into being that will define the future of the evolving network. As a rule, any evolving system, from genetic networks to computer networks to transportation networks, tends to utilize, as much as possible, any useful decision made at a previous stage of selection. In fact, the main principle of evolving complex network design is scaling, at least if the new design is working sufficiently well.
The term “complex networks” includes two different (vast and entangled) conceptions: network and complexity. Even if a network is described by a simple graph (undirected, unweighted edges, with no graph loops or multiple edges) with nodes without attributes, the problem of graph classification is challenging. The resolution of such a problem is likely impossible without more detailed research in the field of the methods of deterministic network synthesis. Recall that studying such practical problems as the creation of resilient networks calls for deeper research of regular or “almost regular” graphs, such as the families of Cayley graphs, cages and Ramanujan graphs. In (Donetti et al. 2006), the authors report finding topologies of deterministic “almost regular” graphs, called interwoven (entangled) graph structures. Their method (Donetti et al. 2006) requires substantial computer resources to find “almost regular” graphs, even for small numbers of graph vertices.
It is well known that graph theory serves as a theoretical basis for network design. In particular, in the seminal work (Erdös and Rényi 1960), it was established for the first time that the topological behavior of probabilistic graphs with V vertices and E edges, namely, the appearance of subgraphs of a given type (trees, cycles of given order, complete subgraphs, etc.), depends on the ratio V/E. Almost all results of (Erdös and Rényi 1960) were asymptotic, with E∼o(V^{1/2}) or E∼const(V^{1/2}), i.e., for graphs that have a small density. But the density of some real-world objects, for example, computer networks and transcription networks (Sorrells and Johnson 2015), is close to 1. Thus, the topological properties of deterministic graphs with overlapping cliques are of chief interest.
One of the main goals of our work is to describe a new approach to the design of special classes of deterministic networks that are both simple from the viewpoint of algorithmic complexity and entangled from the viewpoint of topological complexity. These features can be ensured by a special method of labeling network nodes in some metric space. Thus far, this approach has not commonly been used in network synthesis theory, and at first glance, it might seem too sophisticated. In reality, this approach is no more exotic than identifying logical elements in computer networks. It should be especially emphasized that, with this approach, one uses a deterministic method for designing networks with exactly predictable topological parameters (at least, for the parameters being checked in numerical modeling).
Assume that we are designing a network S that can be represented by a simple graph G=G(V,E) and the corresponding graph of cliques G_{CL}(G)=G(V_{CL}, E_{CL}), where V is the number of vertexes of graph G(S), and V_{CL} is the number of cliques (possibly overlapping) in graph G_{CL}(G). A special class of networks is represented by a tree directed graph that realizes the Boolean vectorfunctions B_{k, m}, i.e., k Boolean functions of m variables. Let G(B_{k, m}) be the graph of Boolean network S(B_{k, m}).

“general dimension” m;

number of vertexes N=V;

global overlapping index (as a “clusterness” measure) G_{CL}(G);

diameter D of G(S);

algorithmic complexity K(S(m));

density of the graphs G(S(m)) and G_{CL}(G(S(m))); and

information cost J(G(B_{k, m})).
1. V=O(exp m);
2. V_{CL}=O(exp m);
3. K(S(m))=O(log m);
4. ∀m (m≥3) density G_{CL}(G(S(m)))=1;
5. ∀m (m≥3) D(G(S))≤3; and
6. ∀k, m J(G(B_{k, m}))=O(ρ^{−1}m^{4}k), where ρ is the probability of error for any of the edges.
Boolean networks were proposed by Kauffman (Kauffman 1969) to study gene regulatory networks (GRNs). A directed graph of GRNs is comprised of labeled nodes: the genes and their regulators. The inputs of these nodes are proteins (different transcription factors (TF)), and the outputs are the levels of gene expression. When modeling GRNs by Boolean networks, estimating the information cost of computation in Boolean networks is especially desirable for a detailed study of the information flows. We will show that there is a network S(B_{k, m}) computing the Boolean vector-functions B_{k, m} for which constraint 6 holds true.
Aside from the Boolean networks, in this paper we study a special class of labeled networks S(m). Our main goal is the analysis of labeling methods that allow for the synthesis of graphs matching conditions 1–5 listed above. This opens the door to the deterministic synthesis of networks with optimal robustness features. We develop labeling methods that may come close to meeting this goal. Numerical modeling shows that there is a deterministic method for the synthesis of networks G(S(m)) for m=3,…,9 matching conditions Ω(m).
The paper is structured as follows. In Section 1, we introduce two functionals that are complexity measures for the Boolean network. The first, the so-called entropy volume of computation, quantifies the total “occupation” of all circuit channels under a given distribution of input values. The second, the information cost of computation, evaluates the capacity of a circuit’s elements with respect to information transmission. For any B_{n, m}, one can design a network S^{∗}(B_{n, m}) such that the upper bound for the entropy volume and the information cost of computation for S^{∗}(B_{n, m}) grows with n as ρ^{−1} log^{c}(L(B_{n, m})), where ρ>0 is a desired error of computation, L(B_{n, m}) is the complexity of B_{n, m}, and c≤4 is a constant.
In Section 2, we present an explicit construction of deterministic models of complex networks. In these models, the labels of nodes are different partitions of the integer m^{2} (m≥3) into at most m integer parts, and connections between nodes are defined by the metric distance between partitions. The values of some topological parameters of graphs from different families are presented analytically, whereas the values of other parameters (size and number of cliques, diameter, average distance, energy and so on) can be computed numerically. A characteristic feature of these deterministic models is the simplicity of the network-generating algorithm, which nevertheless allows for the design of large topological objects that are very attractive as far as resilience is concerned. It is not inconceivable that the predictability of the topological parameters of these networks is a consequence of their low K-complexity. Some results of numerical experiments on estimating the robustness of these partition networks (quantified by the proportion of nodes and cliques that can be removed before the network loses connectivity) were presented in (Goryashko et al. 2019).
It should be noted that the basic features of the deterministic model from this section were published in our paper (Goryashko et al. 2019).
Finally, Section 5 is focused on a discussion of the perspectives on the design of complex networks in the form of attributed graphs and connections of this approach to the current state of the art.
Informational complexity bounds for computing Boolean functions by combinational circuits
In analyses of complex networks, the most commonly used static complexity measures are the number of elements (nodes or edges of the graph) or the topological parameters of the network’s graph. However, with a focus on functional complexity, the values of such dynamical properties as, for example, the “number of elements in [an] activity state on each computing step”, can be no less important. For the semiconductor elements of logical networks, it can be naturally assumed that the “activity” of an element is connected with changes in the element’s output values. Similar problems were studied within complexity theory beginning 60 years ago in Russia. Now, with the increased study of real-world complex networks (biological, technological, and social), it seems useful to discuss the definitions of a network’s activeness in the process of information transmission.^{4}
Preliminary explanation
In the initial stages of very-large-scale integration (VLSI) design, the most important problem was to reduce the power dissipated by a circuit (see, e.g., (Yajima and Inagaki 1974)).
However, most of the proposed approaches were ad hoc and did not allow for a general understanding of the limits on energy consumption when computing arbitrary Boolean vector-functions. Moreover, in the optimal (in terms of the number of logical elements) synthesis methods, starting with the classic results of C. Shannon, the number of active elements is proportional to the total number of elements; that is, for most functions, it grows exponentially with the number of variables. In (Goryashko and Nemirovski 1978), for the first time, to the best of our knowledge, it was shown that the “energy cost of computation” is proportional to the information cost of computation rather than to the number of elements in the circuit. We believe that this observation is particularly useful when investigating real-life neural networks. A biological neuron, like most basic biological channels (axons), utilizes the principle implemented in (Goryashko and Nemirovski 1978): after becoming active, it switches itself off for a refractory period during which no energy is consumed.
This, combined with appropriate changes in circuit synthesis as compared to the standard synthesis methods of that time, allows us to arrive at the results on energy consumption and reliability of computation stated in Theorems 1–4.
When an “active” element is defined as an element of logical network for which the output value or some of the input values changes, then every Boolean function can be implemented by a network with O(n^{2}) active elements. It is unknown whether this estimate is unimprovable.
For the problem in which we are interested, every network S (combinational circuit) is characterized by two functionals. The first, the so-called “entropy volume of computation”, quantifies the total “occupation” of all network channels under a given distribution of input values. The second, the information cost of computation, evaluates the capacity of the circuit’s elements with respect to information transmission.
Let the network S(f(x_{1},…,x_{n})) be a combinational circuit built from the elements AND, OR, NOT. During the computation, at each instant of time t=0,τ,2τ,…,kτ,…, the values of the input variables x_{1},…,x_{n} are applied to S, and the value of the Boolean function f(x_{1},…,x_{n}) appears at the output.
Let Ψ(x)=Ψ(x_{1},…,x_{n}) be the distribution of the input vectors x_{1},…,x_{T}, where all input vectors are independent and distributed according to Ψ(x).
Let graph G_{S}(V,E) be a directed graph, where v∈V are elements of network S, and e∈E are connections between the elements.
Assume that each vertex v∈V has a discrete binary source without memory and that each edge e∈E is a discrete binary noiseless channel. Each distribution Ψ(x) in S(f(x_{1},…,x_{n})) can be used to set up the distribution Φ_{s} of binary symbols for each source s. For each e∈E, weight H_{e,Ψ} can be assigned, where H_{e,Ψ} is the entropy of the incident source when Ψ(x) is the input distribution.
Definition 1.
Theorem 1.
\(\forall n,m\ \forall f(n,m)\ \forall \Psi (x)\ \exists S(f(n,m)): \{H(S(f(n,m)))\leq c\,(n^{3}+n^{2}m)\}\ \&\ \{|S(f(n,m))|\leq 3m\cdot 2^{n}\}\), where |S| is the number of elements in network S, and c is a constant.
The proof is given by describing the construction of the circuit S and verifying that this circuit indeed meets the theorem’s conclusion. Here we describe the construction only; the computation of H(S(f(n,m))) and of the number of elements in S is straightforward.
The network UD(4) is depicted in Fig. 2. Here, each rank j (j=0,…,n−1) consists of 2^{j} blocks, and each block realizes 2(n−j) conjunctive terms with j+1 letters (input variables). For each block ξ(ξ=(1,2,…,2^{j})), a conjunction in rank j is \((x_{1}^{\sigma _{1}}, \dotsc, x_{j}^{\sigma _{j}}) x_{\mu }\), where σ_{1},…,σ_{j}=ξ and x_{μ} is taken as \(x_{j+1}, \overline {x}_{j+1}, \dotsc, x_{n}, \overline {x}_{n}\). Rank 0 contains only n NOT elements; the other ranks are built up from AND elements with two inputs. Each conjunction in rank j+1 is formed from two conjunctions in rank j as \(x_{1}^{\sigma _{1}}\cdots x_{j}^{\sigma _{j}} x_{j+1}^{\sigma _{j+1}} x_{j+2}^{\sigma _{j+2}} = x_{1}^{\sigma _{1}}\cdots x_{j}^{\sigma _{j}} x_{j+1}^{\sigma _{j+1}}\ \&\ x_{1}^{\sigma _{1}}\cdots x_{j}^{\sigma _{j}} x_{j+2}^{\sigma _{j+2}}\). Therefore, the outputs of rank n−1 are 2^{n} different conjunctions \((x_{1}^{\sigma _{1}}, \dotsc, x_{n}^{\sigma _{n}})\). A universal junction (UJ), made up of OR elements with two inputs, is used to realize any function f(n,m).
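The rank structure of the decoder can be made concrete with a short simulation. The sketch below is one reading of the construction above, not the authors' implementation: it enumerates the conjunctions of every rank as triples (σ, μ, s), standing for the prefix x_1^{σ_1}⋯x_j^{σ_j} multiplied by the literal x_μ^{s}, and counts how many of them evaluate to 1 for a given input. The helper names `rank_terms`, `term_value` and `count_ones` are hypothetical.

```python
from itertools import product

def rank_terms(n, j):
    """Yield (sigma, mu, s) for every conjunction of rank j.

    sigma is the block label (the fixed prefix of j letters), mu is the
    0-based index of the extra variable, s is 1 for x_mu and 0 for NOT x_mu.
    """
    for sigma in product((0, 1), repeat=j):      # 2^j blocks
        for mu in range(j, n):                   # variables x_{j+1}, ..., x_n
            for s in (0, 1):                     # 2(n - j) terms per block
                yield sigma, mu, s

def term_value(x, sigma, mu, s):
    """Value of the conjunction on the input vector x (a tuple of 0/1)."""
    prefix_matches = tuple(x[:len(sigma)]) == sigma
    return int(prefix_matches and x[mu] == s)

def count_ones(x):
    """Number of conjunctions, over all ranks, that equal 1 on input x."""
    n = len(x)
    return sum(term_value(x, *t) for j in range(n) for t in rank_terms(n, j))

if __name__ == "__main__":
    n = 4
    total = sum(1 for j in range(n) for _ in rank_terms(n, j))
    print("terms in the decoder:", total)
    print("terms equal to 1 on input 1011:", count_ones((1, 0, 1, 1)))
```

Under this reading, in each rank only the block whose label matches the input prefix can produce ones, and it produces n−j of them, so n(n+1)/2 terms equal 1 for any input while the total number of terms grows exponentially. This is in line with the O(n^{2}) estimate for active elements quoted earlier, although the exact notion of an "active" element in the paper may differ from the count used here.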
Model for a channel with memory
Here, S(t_{0}) is the initial state of the channel; P(x^{T}) is the distribution of input words of length T; Γ(x^{T}) is the output word after transmission through Γ; and J_{P}(Γ(x^{T});x^{T}) is the mutual information of the random values x^{T} and Γ(x^{T}), assuming that the distribution of input words is P(x^{T}).
Theorem 2.
1. C(Q+1)=C(Q+1)≡C(Q+1)=−ln(1−p^{∗}), where p^{∗} is the root of the equation ln p=(1+Q) ln(1−p), and p and (1−p) are the probabilities of 1 and 0 as channel outputs, respectively;
2. C(Q+1)≤[ln(Q+1)+1]/(Q+1).
The proofs of Theorem 2 and Theorems 3 and 4 are also in the Appendix.
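The two statements of Theorem 2 can be checked numerically. The following is a minimal sketch, assuming natural logarithms as written in the theorem; the function names `channel_capacity` and `capacity_bound` and the bisection tolerance are illustrative choices, not part of the original text. The root p^{∗} is found by bisection, which works because ln p − (1+Q) ln(1−p) increases monotonically on (0, 1).

```python
import math

def channel_capacity(Q, tol=1e-12):
    """Return -ln(1 - p*), where p* solves ln p = (1 + Q) ln(1 - p) on (0, 1)."""
    def g(p):
        return math.log(p) - (1 + Q) * math.log(1 - p)

    lo, hi = 1e-15, 1 - 1e-15        # g(lo) < 0 < g(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    p_star = (lo + hi) / 2
    return -math.log(1 - p_star)

def capacity_bound(Q):
    """Right-hand side of statement 2: [ln(Q + 1) + 1] / (Q + 1)."""
    return (math.log(Q + 1) + 1) / (Q + 1)

if __name__ == "__main__":
    for Q in (0, 1, 5, 20):
        print(Q, channel_capacity(Q), capacity_bound(Q))
```

For Q=0 the equation reduces to p=1−p, so the capacity is ln 2, as for a memoryless noiseless binary channel; the capacity then decreases as the memory Q grows.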
Network S_{Q}(f(n,m)) in a Q-basis and the information cost of computation
Definition 2.

Delay equals zero;

The number of outputs equals r (r≥1); and

Each output channel has memory equaling Q.
Definition 3.
Denote the output of the ideal network S_{0}(f(n,m)), i.e., the network with Q=0, by y_{1},y_{2},…,y_{T}, where each y_{i} consists of the m bits of the values of f(n,m). In the network S_{Q}(f(n,m)) with Q>0, any channel C(Q+1) transmits without error only a part of the unit symbols. As a result, the network S_{Q}(f(n,m)) is unreliable; denote its output by z_{1},z_{2},…,z_{T}. We say that an error occurs at step i if y_{i}≠z_{i}. The total error during T computing steps is denoted by ρ(y^{T},z^{T}). Let Ψ(x)=Ψ(x_{1},…,x_{n}) be the distribution of the input vectors x_{1},…,x_{T}, let \(\mu _{k}^{T}\) be the random sequence of T signals in channel k∈K, and let H(T,k) be the entropy of this sequence when the input vector x^{T} has distribution Ψ(x).
Definition 4.
Definition 5.
Definition 6.
Definition 7.
In (Goryashko and Nemirovski 1978), it was proven that given a function f(n,m) and any distribution of inputs Ψ(x), it is possible to select the values of Q for all elements of the network described in Theorem 1 to ensure the validity of the following statement.
Theorem 3.
In (Goryashko and Nemirovski 1978), it was also shown that the method of synthesis from Theorem 3 provides the following upper bound on the informational cost: if the error of computing f(n,m) is \(\rho + \sum _{i=1}^{m} \rho _{i}\), then \(J(S_{\rho }) \leq \rho ^{-1} n^{4} + n^{3} \sum _{i=1}^{m} (\rho _{i})^{-1}\).
Theorem 4.
\(\forall n,m\ (m\leq n)\ \forall \Psi (\widetilde {x})\ \forall f_{n,m}\ \forall \rho \ (\rho >0)\ \exists\, {\mathfrak {A}}(f_{n,m})\) computing f_{n,m} with error ≤ρ such that \(\{J({\mathfrak {A}}(f_{n,m}))\leq 4\rho ^{-1}n^{4}m\}\ \&\ \{|{\mathfrak {A}}(f_{n,m})|\leq 3m2^{n}\}\).
Graphs labeling by partition: basic definitions

low algorithmic complexity;

the ability to create rich, evolving topological structures; and

low time consumption for the numerical simulation of the resulting structure.
It turns out that all these properties can be achieved by a special method of attributing graph nodes, which in (Goryashko et al. 2019) was named partition family graphs. Here, we restate the basic definitions.
Definition 8.
An (n,m)-partition of n into no more than m parts is defined as a sequence of nonnegative integers a_{1}≥a_{2}≥⋯≥a_{m}≥0 such that n=a_{1}+a_{2}+⋯+a_{m}. The set of all feasible (n,m)-partitions is denoted by P(n,m)^{6}.
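The set P(n,m) is small enough to enumerate directly for the values of m used in this paper. A minimal sketch follows; the generator name `partitions` and the zero-padding of labels to length m are illustrative conventions rather than anything prescribed by the text.

```python
def partitions(n, m, largest=None):
    """Yield every partition of n into at most m parts.

    Each partition is a non-increasing tuple of exactly m nonnegative
    integers (shorter partitions are padded with zeros).
    """
    if largest is None:
        largest = n
    if m == 0:
        if n == 0:
            yield ()
        return
    for first in range(min(n, largest), -1, -1):
        if first * m < n:
            # The remaining parts cannot exceed `first`, so n is unreachable.
            break
        for rest in partitions(n - first, m - 1, first):
            yield (first,) + rest

if __name__ == "__main__":
    family = list(partitions(9, 3))
    print(len(family), "partitions of 9 into at most 3 parts")
    for alpha in family:
        print(alpha)
```

For n=9 and m=3 this lists 12 partitions, from (9,0,0) down to (3,3,3), including the odd-number partition (5,3,1).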
Definition 9.
For an (n,m)-partition α=(a_{1},…,a_{m}), the gaps between parts are the values k_{i}=a_{i}−a_{i+1} (1≤i≤m−1). A partition whose gaps k_{i} are all equal to some fixed k≥2 is called a partition with uniform gaps, and the set of all these partitions is denoted by UGPk(m). Note that any partition set P(m^{2},m) with m≥3 contains UGP2(m), which is the sequence of odd numbers (2m−1,2m−3,…,3,1).
Definition 10.
For all α,β∈P(n,m)(α≠β), it holds that 2≤ρ(α,β)≤2n(1−1/m). The least upper bound is achieved for α=(n,0,…,0) and β=(n/m,n/m,…,n/m).
Definition 11.
Let a graph \(G(V_{\mathcal {P}},\, E)\) consist of a set of vertices (nodes) V, where each node has a unique label \(\alpha \in \mathcal {P} \subseteq P(n, m)\), and every two nodes \(v_{\alpha }, v_{\beta } \in V_{\mathcal {P}}\) are connected by an edge e∈E iff ρ(α,β)=1.
What is the direct descendant partition family?
Following the notation (Andrews 1971), the trinomial coefficient \(\binom {m}{k}_{2}\) with m≥0 and −m≤k≤m is given by the coefficient of x^{m+k} in the expansion of (1+x+x^{2})^{m}.
The central trinomial coefficient (k=0) for m=3,4,5,6,7,8,9,10,… is given by the sequence 7,19,51,141,393,1107,3139,8953,… (sequence A002426 in the OEIS (OEIS Foundation Inc 2018)). For k=1 and k=−1, the trinomial coefficients are given by the sequence 6,16,45,126,357,1016,2907,8350,… (sequence A005717 in the OEIS). Asymptotically, the central trinomial coefficient behaves as \(a(m) \approx d \cdot 3^{m} / \sqrt {m}\), where \(d = \sqrt {3/\pi }/2\) (OEIS Foundation Inc 2018).
Now, we will examine the extended class of partitions, named the direct descendant partition family (DDPF(n,m)), which contains partitions with m parts but with different values of n (a “spectrum” of values n=m^{2}±l, l=0,1,2,…,m−2; m≥2).
For a partition set DDPF(n,m), the subset DDPF(m^{2}), i.e., the case l=0, will be named the progenitor of any family from DDPF(n,m) because any set of partitions F(m,n)∈ DDPF(n,m) is created easily from the subset P(m^{2})≡P.
Thus, we have the possibility of estimating a priori the number of adjacent partitions for the progenitor. More importantly, it turns out that there is a simple recurrent procedure for designing the graphs G(V_{P},E) and, consequently, the graph G(V_{F(m,n)}, E).
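The quoted sequences follow directly from the definition as coefficients of (1+x+x^{2})^{m}. The sketch below expands the polynomial by repeated multiplication; the function names `trinomial_row` and `trinomial` are illustrative.

```python
import math

def trinomial_row(m):
    """Coefficients of (1 + x + x^2)^m, lowest power first."""
    row = [1]
    for _ in range(m):
        new = [0] * (len(row) + 2)
        for i, c in enumerate(row):
            # Multiplying by (1 + x + x^2) spreads c over three powers.
            new[i] += c
            new[i + 1] += c
            new[i + 2] += c
        row = new
    return row

def trinomial(m, k):
    """Coefficient of x^(m+k) in (1 + x + x^2)^m, for -m <= k <= m."""
    return trinomial_row(m)[m + k]

if __name__ == "__main__":
    d = math.sqrt(3 / math.pi) / 2
    for m in range(3, 11):
        central = trinomial(m, 0)
        approx = d * 3 ** m / math.sqrt(m)
        print(m, central, trinomial(m, 1), round(approx / central, 3))
```

The last printed column is the ratio of the asymptotic estimate to the exact central coefficient, which gives a feel for how quickly the approximation becomes usable at the small values of m considered here.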
Recurrent procedure for designing the graph G(V_{P},E)
Step 0. Let us have the initial partition set P(9,3) (Fig. 5a).

Step 1. First, to create the partitions corresponding to the central trinomial coefficient for P(16,4), we prepend the odd number a_{1}=2m−1 to all partitions from Step 0 (Fig. 5b, red color).

Step 2. To find the two side subsets (k=1, k=−1) of trinomial coefficients (see Fig. 5b, green and blue color, respectively), it is sufficient to define all partitions such that the following holds:

for all partitions, n=m^{2};

the first part of all partitions equals 2m when k=1 (green column);

the first part of all partitions equals 2m−2 when k=−1 (blue column); and

for all partitions α, ρ(α,h)=1.

Step 3. To design the graph \(G(V_{P(m^{2}, m)},\, E_{m})\), we have to connect any nodes α,β iff ρ(label α,label β)=1.
In Fig. 5c, the adjacency matrix of G(V_{P(16,4)}, E_{4}) is depicted.
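The construction can be sketched in a few lines, under two assumptions that should be read as such. First, the definition of the distance ρ is not reproduced in this text, so the sketch takes ρ(α,β) to be the largest coordinate-wise difference max_i |a_i − b_i|. Second, instead of the recurrent Steps 0–2, the labels are enumerated directly as the odd-number partition (2m−1, 2m−3, …, 1) plus an offset vector from {−1,0,1}^m with zero sum; this is a stand-in for the recurrent procedure, not a transcription of it. The names `progenitor_labels`, `rho` and `build_graph` are hypothetical. With these assumptions the node counts and degree distributions agree with the m=3, 4, 5 rows of the connectivity table.

```python
from collections import Counter
from itertools import combinations, product

def progenitor_labels(m):
    """Labels of the progenitor family: head partition plus zero-sum offsets."""
    head = tuple(2 * (m - i) - 1 for i in range(m))      # (2m-1, ..., 3, 1)
    labels = [
        tuple(h + d for h, d in zip(head, delta))
        for delta in product((-1, 0, 1), repeat=m)
        if sum(delta) == 0                               # keeps n = m^2
    ]
    return head, labels

def rho(alpha, beta):
    """ASSUMED distance: largest coordinate-wise difference between labels."""
    return max(abs(a - b) for a, b in zip(alpha, beta))

def build_graph(labels):
    """Adjacency sets: two labels are joined iff rho(alpha, beta) == 1."""
    adj = {alpha: set() for alpha in labels}
    for alpha, beta in combinations(labels, 2):
        if rho(alpha, beta) == 1:
            adj[alpha].add(beta)
            adj[beta].add(alpha)
    return adj

if __name__ == "__main__":
    for m in (3, 4, 5):
        head, labels = progenitor_labels(m)
        adj = build_graph(labels)
        degrees = Counter(len(nbrs) for nbrs in adj.values())
        print(m, len(adj), len(adj[head]), sorted(degrees.items()))
```

The pairwise comparison is quadratic in the number of nodes, which is adequate for m up to 9 (3139 nodes) but is not the 3m-operations-per-node neighbor search described in the cost estimate.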
Let us estimate the design cost T_{m} of the graph \(G(V_{P(m^{2}, m)},\, E_{m})\) as an upper bound on its running time. Each partition α∈P(m^{2},m) can be represented by a word w(α) of 2 log_{2} m binary digits. To find all partitions adjacent to the partition α, it is sufficient to perform 3m arithmetical operations on the numbers w(α). The total number of computational steps is no more than 3·3·m·L(m) arithmetical operations, where L(m) is the number of partitions for the central trinomial coefficient. Note that the number of nodes in \(G(V_{P(m^{2}, m)},\, E_{m})\) is no greater than 3·L(m).
At the same time, it is known that for random graphs, the expected runtime is O(N+M), where N is the number of nodes, and M is the expected number of edges (Miller and Hagberg 2011).
The Kolmogorov complexity (K-complexity) of a string s is \(K(s)=\min \{|P| : U(P)=s\}\), where P is a program that produces the string s when executed on the universal Turing machine U, and |P| is the length of the program P, i.e., the number of bits required to represent P. The K-complexity is upper semicomputable, i.e., only an upper bound on the value of the K-complexity can be computed for a given string s. As is easy to see from the description of this procedure, the upper bound on the number of bits required to represent the adjacency matrix of \(G(V_{\vartheta (m^{2},\, m)}, E_{m})\) is O(log m) because only the value of m is necessary to create the appropriate program.
As is clear from the description of the recurrent procedure and Fig. 5c, at each step (increasing the number of partition parts m by one), the adjacency matrix from the previous step becomes the central part of the new matrix for m+1. This principle of iterative nesting provides for the effect of self-similarity (similar to a Russian matryoshka doll).
The results of numerical calculation of the graph’s connectivity (parameters MDenV(m) and MDenCl are calculated for graphs without head nodes)
Each D(i) entry gives a node degree followed, in parentheses, by the number of nodes with that degree.
m  V  D(1)  D(2)  D(3)  D(4)  D(5)  Md  MdenG  MdenCl  Ov  L  Cc
3  7  3(6)  6(1)  –  –  –  3.4  0.87  1.0  2.57  1.43  0.54
4  19  5(6)  9(12)  18(1)  –  –  8.2  0.39  1.0  2.58  1.54  0.59
5  51  13(30)  25(20)  50(1)  –  –  18.2  0.34  1.0  4.9  1.63  0.56
6  141  19(20)  35(90)  69(30)  140(1)  –  40.3  0.28  1.0  6.53  1.7  0.52
7  393  49(140)  95(210)  191(42)  393(1)  –  87.9  0.22  1.0  8.72  1.77  0.47
8  1107  69(70)  132(553)  261(420)  533(55)  1106(1)  194.7  0.17  1.0  11.6  1.82  0.42
9  3139  181(630)  357(1681)  752(755)  1499(74)  3138(1)  437.4  0.14  1.0  15.5  1.86  0.37
Topological parameters of the progenitor partition family DDPF(m^{2}) graphs
Let us introduce some often-used topological parameters of graphs and recall their definitions (see details in (Strang et al. 2018); programming tools for computing these parameters can be found in (Samokhine 2017)). Beyond this point, we assume that G(V,E) is a simple graph with V≡N vertices and E≡M edges. The graph density is den(G)=2M/(N(N−1)), and d(i,j) denotes the distance (the number of edges in the shortest path) between vertices i and j in G(V,E). If there is no path connecting i and j, then d(i,j)=∞. The diameter is \(D(G(V,E)) = \max _{i,j} d(i,j)\). The characteristic path length L is the average distance over all pairs of vertices, i.e., \(L = 1/\left (N(N-1)\right) \sum _{i \neq j} d(i, j)\). In (Strang et al. 2018), a measure named the global efficiency was proposed: \(E_{glob}(G) = 1/\left (N(N-1)\right) \sum _{i \neq j} (1/d(i, j))\). The local efficiency is the average of the global efficiencies of all subgraphs G_{i}, i.e., \(E_{loc}(G) = (1/N)\sum _{i \in G} E_{glob}(G_{i})\). Beyond the common parameters listed above, we need a measure for estimating the overlapping of nodes. Let each node i∈V of the graph G(V,E) belong to ϕ(i) cliques, where 0≤ϕ(i)≤ the total number of cliques of G(V,E). The value ϕ(i) will be named the local overlapping. The global overlapping is the average of the local overlapping: \(Ov(G) = (1/N) \sum _{i \in G} \phi (i)\).
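These definitions translate into a few short routines. The sketch below is illustrative only and is not the tooling of (Samokhine 2017): distances come from breadth-first search, and the overlapping index counts maximal cliques, found with a plain Bron–Kerbosch recursion, which is an assumption about which cliques are meant. The 7-node example graph, a hub joined to every vertex of a 6-cycle, is a hypothetical test case.

```python
import math
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def density(adj):
    n = len(adj)
    edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2 * edges / (n * (n - 1))

def path_stats(adj):
    """Return (characteristic path length L, global efficiency, diameter)."""
    n = len(adj)
    total, inv_total, diameter = 0.0, 0.0, 0
    for s in adj:
        dist = bfs_distances(adj, s)
        for t in adj:
            if t != s:
                d = dist.get(t, math.inf)     # no path: d = infinity
                total += d
                inv_total += 1.0 / d          # 1 / infinity contributes 0
                diameter = max(diameter, d)
    pairs = n * (n - 1)
    return total / pairs, inv_total / pairs, diameter

def maximal_cliques(adj):
    """All maximal cliques via the basic Bron-Kerbosch recursion."""
    found = []
    def expand(r, p, x):
        if not p and not x:
            found.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    expand(frozenset(), set(adj), set())
    return found

def global_overlap(adj):
    """Ov(G): average number of maximal cliques containing a node."""
    cliques = maximal_cliques(adj)
    return sum(sum(v in c for c in cliques) for v in adj) / len(adj)

# Hypothetical example: hub 0 joined to every vertex of the cycle 1..6.
wheel = {0: set(range(1, 7))}
for i in range(1, 7):
    wheel[i] = {0, (i % 6) + 1, ((i - 2) % 6) + 1}

if __name__ == "__main__":
    print(density(wheel), path_stats(wheel), global_overlap(wheel))
```

For this example the sketch gives L ≈ 1.43 and Ov ≈ 2.57, the same values as the m=3 entries of the connectivity table, which suggests (but does not prove) that maximal cliques are the ones counted there. The basic recursion without pivoting is fine for small graphs; larger ones would call for a pivoted variant or a graph library.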
Overlapping problems
In recent years, interest in clique graphs has grown (see (Evans 2010; Fortunato 2010; Leskovec et al. 2010) and references therein). A large body of work has been dedicated to community detection methods and the identification of overlapping communities in large real-world networks such as social networks, biochemical networks, and scientific publications as well as collaboration networks.
Usually, the reasons for the emergence of graph overlapping communities in networks did not attract the particular interest of scientists, although (Crandall et al. 2008) analyzed overlapping community models. There, some mechanisms were established by which the communities in LiveJournal and DBLP networks were growing and changing. In particular, it was shown that networks’ structural changes directly depend on the quantity of overlapping clusters.
In the case under study, the numbers and sizes of the overlapping cliques depend only on the node labels. By changing the nodes’ labeling, one can gain insight into the emergent topological peculiarities of overlapping communities.
It may appear that this approach using a clique graph is no better than a trick because the total numbers of nodes and edges in G and G_{Cl} do not change. But in such cases when the cliques may be thought of as a unified entity (for example, from a point of functionality), such an approach has the potential for yielding information about optimal partition methods of large objects.
For example, one can look at the creation of some large scientific collaboration. Let the network nodes be scientists and the edges be the connections between the scientists. The success of the collaboration depends, to a greater extent than on the creative skills of any one person, on the reconcilability (tolerance) of coworkers. If there are some groups of scientists (“cliques”) that have demonstrated successful interactions in the past, it makes sense to keep these cliques intact and create the network as a clique graph.
Computer modules of some brain regions (gyri; see (McCarthy et al. 2014)) have been studied as clique graphs (Strang et al. 2018) as well. Hence, for real-world biological, technological and computer networks, the properties of the overlapping communities could be governing factors with respect to resilience, effectiveness or density.
The results of numerical calculation of the clique parameters (size and quantity)
Cliques in G^{v}(m) are given as c_{q}, where q is the quantity of cliques of the size c.
m  V  \(\binom {m}{1}^{*}\)  \(\binom {m}{2}\)  \(\binom {m}{3}\)  \(\binom {m}{4}\)  \(\binom {m}{5}\)  \(\binom {m}{6}\)  \(\binom {m}{7}\)  \(\binom {m}{8}\)  Total
3  7  3_{3}  3_{3}  –  –  –  –  –  –  6
4  19  4_{4}  6_{6}  4_{4}  –  –  –  –  –  14
5  51  5_{5}  10_{10}  10_{10}  5_{5}  –  –  –  –  30
6  141  6_{6}  15_{15}  20_{20}  15_{15}  6_{6}  –  –  –  62
7  393  7_{7}  21_{21}  35_{35}  35_{35}  21_{21}  7_{7}  –  –  126
8  1107  8_{8}  28_{28}  56_{56}  70_{70}  56_{56}  28_{28}  8_{8}  –  254
9  3139  9_{9}  36_{36}  84_{84}  126_{126}  126_{126}  84_{84}  36_{36}  9_{9}  510
As one can see from Table 1, the distribution of the degrees of vertices resembles a binomial distribution (distinct from the power law for random graphs). Naturally, the diameter of the progenitor graph G(DDPF(m^{2})) is different from that of a randomly evolving network as well. For example, the asymptotic estimate of the diameter D for the Bollobás–Riordan network model is D(V)∼ln V/ln ln V (Bollobás and Riordan 2004). Evidently, for the progenitor graph G(DDPF(m^{2})) with the head node, the diameter equals 2 for any m. It can be proven that if the head node were excluded from the progenitor graph, then for any m, D(G(DDPF(m^{2})))=3. As shown in Table 1, the characteristic path length remains less than 2 up to m=9, where the graph has more than 3000 nodes.
More surprising is the behavior of the density of the clique graph. While the density of the graph G(DDPF(m^{2})) decreases with m fairly slowly (approximately as 1/ln V for m=4,…,9), which is more or less typical behavior, the clique graph of the progenitor has a density equal to 1 for any m. This effect is reminiscent of the “densification power law” (Leskovec et al. 2010), which implies the existence of evolving networks for which the density tends to increase. In our case, the reason behind the “stable maximal density” of the clique graph is its high value of global overlapping.
Topological parameters of the descendant partition family (DDPF(n, m))
The partition family DDPF(n,m) for n=m^{2}±l(m), where l(m)=0,1,…,4
m  n=m^{2}: total nodes  n=m^{2}: global overlap  n=m^{2}±1: total nodes  n=m^{2}±1: global overlap  n=m^{2}±2: total nodes  n=m^{2}±2: global overlap  n=m^{2}±3: total nodes  n=m^{2}±3: global overlap  n=m^{2}±4: total nodes  n=m^{2}±4: global overlap  
2  —  0  7  0  9  0  9  0  —  — 
3  7  2.57  19  2.63  13  2.46  12  2.0  —  — 
4  19  3.57  51  3.56  39  3.23  27  3.18  —  — 
5  51  4.90  141  4.77  111  4.43  81  4.22  50  4.0 
6  141  6.53  393  6.38  321  5.96  241  5.65  182  5.45 
7  393  8.72  1107  8.52  924  8.04  714  7.6  546  7.38 
8  1107  11.62  3138  11.39  2674  10.79  2114  10.21  1646  9.92 
Experimental observation. The experimental results suggest that the distributions of clique numbers and sizes are determined only by the values of m and the ratio n/m. This would explain why the parameters are highly predictable.
Concluding remarks
The design of networks with attributes described by partitions is a first step on the way to a theory of evolving networks with prearranged topological properties. There are many open problems that call for further investigation before this approach could become a working instrument.
One of the most intriguing problems is robustness, both in technical networks (for example, computing networks) and in biological networks (for example, transcriptional regulatory networks (TRN)); for more details, see (Whitacre 2012) and references therein. It is well known that the methods providing high robustness that were “invented” through billions of years of evolution of living entities, from fungi to mammals, are fundamentally different from today’s engineering methods. However, studies of robustness in the field of complex networks usually originate from the classical models of logical networks. As a result, it is difficult to evaluate the advantages of robust topologies in complex regulatory networks achieved through neutral evolution. We speculate that graph models in which the label of each node contains information about the specifics and connectivity of the node (something akin to gene sequences) will be more appropriate for modeling TRN, including complex regulatory network topologies.
The partition labeling of network nodes makes it possible to develop a fresh approach to behavioral game theory, in particular, to Colonel Blotto and Lotto games (see Bocharov and Goryashko 2017). The results of numerical experiments show that there is a tight connection between equilibria in mixed strategies in antagonistic Blotto-type games and topological parameters of networks with nodes labeled by game strategies.
From the viewpoint of practical applications, it is interesting to study network models where nodes are attributed by economic parameters. For example, each country could be represented by a network node labeled by a partition with m parts presenting the percentages of total expenditures going to a specific area (e.g., defense, health, and education); this maps onto the standard COFOG (Classification of the Functions of Government) data. Topological parameters of such a network with N nodes (for N countries) would provide insight into some special features of international behavior.
We hope that interesting results can eventually be found in this field.
Appendix
Theorem 2.
 1. \(\underline {C}(Q)=\overline {C}(Q)\equiv C(Q)=-\ln (1-p^{*})\), where p^{∗} is the root of the equation lnp=(1+Q) ln(1−p); p and 1−p are, respectively, the probabilities of having 0 and 1 as the channel’s input; and
 2. C(Q)≤[ ln(Q+1)+1]/(Q+1).
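The two statements of Theorem 2 can be illustrated numerically. The sketch below reads the capacity as C(Q)=−ln(1−p^{∗}) in nats, a reading consistent with C(0)=ln2 for Q=0, and finds p^{∗} by bisection; the function names and the choice of bisection are assumptions made for illustration only. The function g(p)=lnp−(1+Q) ln(1−p) is strictly increasing on (0,1), so the root is unique.

```python
from math import log, log1p


def root_p_star(q: float, iterations: int = 200) -> float:
    """Root of ln p = (1 + q) ln(1 - p) on (0, 1), found by bisection."""
    lo, hi = 1e-12, 1.0 - 1e-12
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # g(p) = ln p - (1 + q) ln(1 - p) is increasing in p.
        if log(mid) - (1.0 + q) * log1p(-mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def capacity(q: float) -> float:
    """C(Q) = -ln(1 - p*), in nats (assumed reading of Theorem 2, item 1)."""
    return -log1p(-root_p_star(q))


def capacity_upper_bound(q: float) -> float:
    """Right-hand side of Theorem 2, item 2: [ln(Q + 1) + 1] / (Q + 1)."""
    return (log(q + 1.0) + 1.0) / (q + 1.0)


for q in (0, 1, 2, 5, 10):
    print(q, round(capacity(q), 4), round(capacity_upper_bound(q), 4))
```

For Q=1 the equation reduces to p=(1−p)^{2}, so 1−p^{∗} is the reciprocal of the golden ratio and the capacity equals the logarithm of the golden ratio.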
Theorem 3.
 1. ∀n,m ∀f_{n,m} ∀ρ (ρ>0): $$ J(n,m,\rho)\geq H_{\Psi}(y)m-[\rho m+H(\rho)]. $$
 2. ∀n,m ∀f_{n,m} ∀ρ (ρ>0), for every circuit \({\mathfrak {A}}^{R}(f_{n,m})\) computing f_{n,m} with error ≤ρ, it holds that $$ J({\mathfrak{A}}^{R}(f_{n,m}))\geq R\{H_{\Psi}(y)m-[\rho m+H(\rho)]\}. $$
Here, H_{Ψ}(y) is the entropy per symbol of the output alphabet of circuit \(\mathfrak {A}\), Ψ is the input distribution, and H(ρ)=−ρ ln(ρ)−(1−ρ) ln(1−ρ) is the binary entropy function.
From the definition of circuits (Section 4, item 3) and information cost of computation (Section 4, item 7), it is easily seen that \(J({\mathfrak {A}}(f_{n,m}))\geq C({\mathfrak {A}}(f_{n,m}))\) and \(J({\mathfrak {A}}^{R}(f_{n,m}))\geq RC({\mathfrak {A}}^{R}(f_{n,m}))\) (with the latter inequality following from the fact that the total amount of information passing from level to level cannot be less than the total amount of information passing from the input to the output of \({\mathfrak {A}}^{R}\)).
 1. \(J({\mathfrak {A}}(f_{n,m}))\geq \overline {\lim }_{T\to \infty }T^{-1}J_{\Psi }(\Gamma _{T}x^{T};\Gamma _{T}^{0}x^{T})\equiv J_{\Psi }\); and
 2. \(J({\mathfrak {A}}^{R}(f_{n,m}))\geq RJ_{\Psi }.\)
From this assumption regarding \({\mathfrak {A}}\) (\({\mathfrak {A}}^{R}\)), \(\overline {\lim }_{T\to \infty }T^{-1}\sum \limits _{x^{T}}(\Gamma _{T}x^{T},\Gamma _{T}^{0}x^{T})\Psi ^{T}(x^{T})\leq \rho _{*}\). According to the inequality from the converse to the Coding Theorem ([6], p. 98), it follows that J_{Ψ}≥mH_{Ψ}(y)−[ρm+H(ρ)]. The theorem is proven.
Theorem 4.
\(\forall n,m (m\leq n) \forall \Psi (\widetilde {x}) \forall f_{n,m} \forall \rho (\rho >0) \exists ({\mathfrak {A}}(f_{n,m})\) computing f_{n,m} with error ≤ρ): \(\{J({\mathfrak {A}}(f_{n,m}))\leq 4\rho ^{-1}n^{4}m\} \& \{{\mathfrak {A}}(f_{n,m})\leq 3m2^{n}\}\).
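To get a feeling for the gap between the bounds of Theorems 3 and 4, they can be evaluated for concrete values. The sketch below is illustrative only and rests on several assumptions: the lower bound is taken in the form mH_{Ψ}(y)−[ρm+H(ρ)] from the last line of the proof of Theorem 3; H(ρ) is taken to be the standard binary entropy in nats; H_{Ψ}(y) is set to ln2 (uniformly distributed binary outputs); the first bound of Theorem 4 is read as 4n^{4}m/ρ; and the second bound, 3m2^{n}, is presumed to refer to circuit size, since the statement does not name the quantity. The function names are hypothetical.

```python
from math import log


def binary_entropy(rho: float) -> float:
    """Binary entropy in nats; defined as 0 at the endpoints."""
    if rho <= 0.0 or rho >= 1.0:
        return 0.0
    return -(rho * log(rho) + (1.0 - rho) * log(1.0 - rho))


def info_cost_lower_bound(m: int, rho: float, h_output: float = log(2)) -> float:
    """m * H_Psi(y) - [rho * m + H(rho)], as in the proof of Theorem 3."""
    return m * h_output - (rho * m + binary_entropy(rho))


def info_cost_upper_bound(n: int, m: int, rho: float) -> float:
    """4 * n^4 * m / rho (assumed reading of the first bound in Theorem 4)."""
    return 4.0 * n ** 4 * m / rho


def size_bound(n: int, m: int) -> int:
    """3 * m * 2^n (second bound in Theorem 4; its meaning is presumed)."""
    return 3 * m * 2 ** n


n, m, rho = 8, 8, 0.01
print(info_cost_lower_bound(m, rho))
print(info_cost_upper_bound(n, m, rho))
print(size_bound(n, m))
```

The upper bound grows polynomially in n and m for fixed ρ, whereas the second bound grows exponentially in n.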
Footnotes
 1. Samuel Beckett, “Waiting for Godot”.
 2. All exact definitions for these parameters are given in the subsequent sections.
 3. As usual, O(⋯) means “some function bounded above in absolute value by a constant times what is in the parentheses”.
 4. The results of Section 1 were established in (Goryashko and Nemirovski 1978); because (Goryashko and Nemirovski 1978) was unavailable to Western readers (the journal was not translated into English until the 1980s), we reproduce the original proofs below to make the paper self-contained.
 5. Our model of a channel is similar to a physiological model of muscle cells or neurons, which exhibit a so-called refractory period.
 6. There exist an optimal algorithm and recurrent rules for the exact computation of the number of partitions in the classes P(n,m) (Knuth 2011).
 7. The numerical values of almost all topological parameters for the case of (m^{2}±l,m)-partitions are found to be very close to those of the previous case (differing by less than one percent).
Notes
Acknowledgments
We thank Arcadi Nemirovski for inspiring discussions.
Authors’ contributions
Conceived and designed the experiments: AG, LS. Performed the experiments: AG, LS, PB. Analyzed the data: AG, PB. Contributed programming tools: LS, PB. Wrote the paper: AG, PB. Performed mathematical modeling: AG, LS, PB. All authors read and approved the final manuscript.
Funding
The authors have no support or funding to report.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
References
 Andrews, GE (1971) Number Theory.
 Andrews, GE (1990) Euler’s “Exemplum memorabile inductionis fallacis” and q-trinomial coefficients. J Am Math Soc 3(3):653–669. https://doi.org/10.2307/1990932.
 Bocharov, P, Goryashko A (2017) New approach to solving partition games. Appl Math Sci. https://doi.org/10.12988/ams.2017.7248.
 Bollobás, B, Riordan OM (2004) Mathematical results on scale-free random graphs. In: Handbook of Graphs and Networks. https://doi.org/10.1002/3527602755.ch1.
 Crandall, D, Kleinberg J, Suri S, Cosley D, Huttenlocher D (2008) Feedback effects between similarity and social influence in online communities. https://doi.org/10.1145/1401890.1401914.
 Donetti, L, Neri F, Muñoz MA (2006) Optimal network topologies: Expanders, cages, Ramanujan graphs, entangled networks and all that. http://arxiv.org/abs/0605565. https://doi.org/10.1088/17425468/2006/08/P08007.
 Erdős, P, Rényi A (1960) On the evolution of random graphs. Publication of the Mathematical Institute of the Hungarian Academy of Sciences.
 Evans, TS (2010) Clique graphs and overlapping communities. J Stat Mech: Theory and Experiment. https://doi.org/10.1088/17425468/2010/12/P12037.
 Fortunato, S (2010) Community detection in graphs. http://arxiv.org/abs/0906.0612. https://doi.org/10.1016/j.physrep.2009.11.002.
 Gallager, R (1970) Information Theory and Reliable Communication. Springer, Udine. https://doi.org/10.1007/9783709129456.
 Goryashko, A, Nemirovski A (1978) Estimates of Information Cost of Computing Boolean Functions in Combination Circuits. Prob of Info Trans XIV(1):90–100.
 Goryashko, A, Samokhine L, Bocharov P (2019) Complex Networks and Their Applications VII. In: Aiello LM, Cherifi C, Cherifi H, Lambiotte R, Lió P, Rocha LM (eds), 553–564. Springer, Cham.
 Helbing, D, Brockmann D, Chadefaux T, Donnay K, Blanke U, Woolley-Meza O, Moussaid M, Johansson A, Krause J, Schutte S, Perc M (2014) Saving Human Lives: What Complexity Science and Information Systems can Contribute. J Stat Phys. https://doi.org/10.1007/s1095501410249.
 Kauffman, SA (1969) Metabolic stability and epigenesis in randomly constructed genetic nets. Journal of Theoretical Biology. doi:10.1016/00225193(69)900150.
 Knuth, DE (2011) The Art of Computer Programming, Volume 4A: Combinatorial Algorithms, Part 1. Addison-Wesley Professional.
 Kolmogorov, AN (1965) Three approaches to the quantitative definition of information. Prob Inf Trans 1(1):1–7.
 Krioukov, D (2014) Brain theory. Frontiers in Computational Neuroscience 8(October):114. https://doi.org/10.3389/fncom.2014.00114. http://arxiv.org/abs/1203.2109.
 Leskovec, J, Chakrabarti D, Kleinberg J, Faloutsos C, Ghahramani Z (2010) Kronecker graphs: an approach to modeling networks. J Mach Learn Res 11:985–1042.
 McCarthy, P, Benuskova L, Franz EA (2014) The age-related posterior-anterior shift as revealed by voxelwise analysis of functional brain networks. Front Aging Neurosci. https://doi.org/10.3389/fnagi.2014.00301.
 Miller, JC, Hagberg A (2011) Efficient Generation of Networks with Given Expected Degrees In: Lecture Notes in Computer Science (including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 115–126. http://link.springer.com/10.1007/9783642212864_10.
 OEIS Foundation Inc (2018) The On-Line Encyclopedia of Integer Sequences. https://oeis.org.
 Samokhine, L (2017) Trinomial Family Research Toolbox. https://github.com/samokhine/gory.
 Sorrells, TR, Johnson AD (2015) Making sense of transcription networks. https://doi.org/10.1016/j.cell.2015.04.014.
 Strang, A, Haynes O, Cahill ND, Narayan DA (2018) Generalized relationships between characteristic path length, efficiency, clustering coefficients, and density. Soc Netw Anal Mining. https://doi.org/10.1007/s1327801804923.
 Whitacre, JM (2012) Biological robustness: Paradigms, mechanisms, systems principles. https://doi.org/10.3389/fgene.2012.00067.
 Yajima, S, Inagaki K (1974) Power Minimization Problems of Logic Networks. IEEE Trans Comput. https://doi.org/10.1109/TC.1974.223878.
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.