1 Introduction

The diameter and radius are two basic graph parameters whose values play a vital role in many applications. In distributed computing, these parameters are even more fundamental, since they capture the minimal number of rounds needed in order to send a piece of information to all the nodes in a network. Hence, understanding the complexity of computing these parameters is central to distributed computing, and has been the focus of many studies in the CONGEST model of computation, where in every round each of n nodes may send messages of up to \(O(\log {n})\) bits to each of its neighbors. Frischknecht et al. [21] showed that the diameter is surprisingly hard to compute: \(\widetilde{\varOmega }(n)\) rounds are needed even in networks with constant diameter. This lower bound is nearly tight, due to an O(n) upper bound presented simultaneously and independently by [26, 34] to compute all pairs shortest paths in a network. Naturally, approximate solutions are a desired relaxation, and were indeed addressed in several cornerstone studies [21, 24, 26, 28, 34], bringing us even closer to a satisfactory understanding of the time complexity of computing distances in distributed networks. However, several central questions remained elusive.

Sparse Graphs. The graphs constructed in [21] have \(\varTheta (n^2)\) edges and constant diameter, and require any distributed protocol for computing their diameter to spend \(\widetilde{\varOmega }(n)\) rounds. Such a high lower bound makes one wonder if the diameter can be computed faster in networks that we expect to encounter in realistic applications. Almost all large networks of practical interest are very sparse [29], e.g. the Internet in 2012 had \(\approx 4\) billion nodes and \(\approx 128\) billion edges [30].

The only known lower bound for computing the diameter of a sparse network is obtained by a simple modification to the construction of [21] which yields a much weaker bound of \(\widetilde{\varOmega }(\sqrt{n})\). This leaves hope that the \(\widetilde{\varOmega }(n)\) bound can be beaten significantly in sparse networks. Our first result rules out this possibility.

Theorem 1

The number of rounds needed for any protocol to compute the diameter of a network on n nodes and \(O(n \log n)\) edges of constant diameter in the CONGEST model is \(\varOmega (\frac{n}{\log ^2{n}})\).

We remark that, as in [21], our lower bound holds even for networks with constant diameter. Throughout the paper we say that a graph on n nodes is sparse if it has \(O(n \log {n})\) edges. Due to simple transformations, e.g. by adding dummy nodes, all of our lower bounds will also hold for the more strict definition of sparse graphs as having O(n) edges, up to a loss of a log factor.

As explained next, the sparsity in our new lower bound construction allows us to extend the result in some interesting ways.

Approximation Algorithms. Another important question is whether we can bypass this near-linear barrier if we settle for knowing only an approximation to the diameter. An \(\alpha \)-approximation algorithm to the diameter returns a value \(\hat{D}\) such that \(D \le \hat{D} \le \alpha \cdot D\), where D is the diameter of the network. From [21] we know that \(\widetilde{\varOmega }(\sqrt{n}+D)\) rounds are needed, even for computing a \((\frac{3}{2}-\varepsilon )\)-approximation to the diameter, for any constant \(\varepsilon >0\).

Following this lower bound, almost-complementary upper bounds were studied extensively. It is known that a \(\frac{3}{2}\)-approximation can be computed in a sublinear number of rounds: Holzer and Wattenhofer [26] showed an \(O(n^{3/4}+D)\)-round algorithm and, independently, Peleg et al. [34] obtained an \(O(D \sqrt{n} \log {n})\) bound. These bounds were later improved to \(O(\sqrt{n}\log {n}+D)\) by Lenzen and Peleg [28], and finally Holzer et al. [24] reduced the bound to \(O(\sqrt{n\log {n}}+D)\). When D is small, these upper bounds are near-optimal in terms of the round complexity, but do they have the best possible approximation ratio that can be achieved within a sublinear number of rounds? That is, can we also obtain a \((\frac{3}{2}-\varepsilon )\)-approximation in \(\widetilde{O}(\sqrt{n}+D)\) rounds, to match the lower bound of [21]?

Progress towards answering this question was made by Holzer and Wattenhofer [26] who showed that any algorithm that needs to decide whether the diameter is 2 or 3 has to spend \(\widetilde{\varOmega }(n)\) rounds. However, as the authors point out, their lower bound is not robust and does not rule out the possibility of a \((\frac{3}{2}-\varepsilon )\)-approximation when the diameter is larger than 2, or an algorithm that is allowed an additive \(+1\) error besides a multiplicative \((\frac{3}{2}-\varepsilon )\) error. Furthermore, when the diameter is 2 or 3 as in the construction of [26], any \((\frac{3}{2}-\varepsilon )\)-approximation must return the exact diameter. Thus, to explain why we cannot save time by settling for a \((\frac{3}{2}-\varepsilon )\)-approximation, we need a more general construction.

The Main Challenge. Perhaps the main difficulty in extending the lower bound constructions of [21, 26] to resolve these gaps was that their original graphs are dense. Our new sparse construction technique allows us to tighten the bounds and negatively resolve the above question. In other words, we show a \(\widetilde{\varOmega }(n)\) lower bound for computing a (\(\frac{3}{2}-\varepsilon \))-approximation to the diameter.

At a high level, the reason that the density matters is as follows. The lower bound technique reduces the 2-party communication complexity problem of Set-Disjointness to a distributed algorithm for approximating the diameter. This is done by constructing a graph in which the existence of some of the edges depends on the inputs of the players, and partitioning the graph between the two players, inducing a cut in it. The dependence is such that having a good approximation to the diameter induces an answer to the Set-Disjointness problem. The players then simulate the distributed algorithm, and pay in communicating the bits that are sent on edges that cross the cut between their two partitions. Therefore, a known lower bound on the 2-party communication complexity of Set-Disjointness implies a lower bound on the number of rounds required for a distributed approximation algorithm for the diameter. The larger the cut, the smaller the lower bound for the distributed problem.

Having a sparse graph with a small cut is what allows us to make this leap in the lower bound. The key idea for achieving a sparse graph with a small cut is to connect the nodes to a set of nodes that represent their binary value, so that the only nodes on the cut are the nodes of the binary representation. We call this graph structure a bit-gadget, and it plays a central role in all of our graph constructions. This is inspired by graph constructions for different settings (e.g. [3], see discussion in Sect. 1.1). Our main result follows.

Theorem 2

For all constant \(0<\varepsilon <1/2\), the number of rounds needed for any protocol to compute a (\(3/2-\varepsilon \))-approximation to the diameter of a sparse network in the CONGEST model is \(\varOmega ( \frac{ n}{\log ^3n})\).

Radius. In many scenarios we want one special node to be able to efficiently send information to all other nodes. In this case, we would like this node to be the one that is closest to every other node, i.e. the center of the graph. The radius of the graph is the largest distance from the center, and it captures the number of rounds needed for the center node to transfer a message to another node in the network. While radius and diameter are closely related, the previous lower bounds for diameter do not transfer to radius and it was conceivable that the radius of the graph could be computed much faster. Obtaining a non-trivial lower bound for radius has been stated as an open problem in [26]. A third advantage of our technique is that it extends to computing the radius, for which we show that the same strong near-linear barriers above hold.

Theorem 3

For all constant \(0<\varepsilon <1/2\), the number of rounds needed for any protocol to compute a (\(3/2-\varepsilon \))-approximation to the radius of a sparse network in the CONGEST model is \(\varOmega (\frac{ n}{\log ^3n})\).

Eccentricity. The eccentricity of a node is the largest distance from it. Observe that the diameter is the largest eccentricity in the graph while the radius is the smallest. As pointed out in [26], given a \((\frac{3}{2}-\varepsilon )\)-approximation algorithm for all the eccentricities, we can obtain a \((\frac{3}{2}-\varepsilon )\)-approximation algorithm for the diameter by a simple flooding. This implies an \(\widetilde{\varOmega }(\sqrt{n}+D)\) lower bound for any \((\frac{3}{2}-\varepsilon )\)-approximation algorithm for computing all the eccentricities. Our construction allows us to improve this result by showing that any algorithm for computing even a \((\frac{5}{3}-\varepsilon )\)-approximation to all the eccentricities must spend \(\varOmega (\frac{n}{\log ^3(n)})\) rounds. This improves both the number of rounds and the approximation factor, which we allow to be even larger. Interestingly, it implies that approximating all eccentricities is even harder than approximating just the largest or the smallest one.

Theorem 4

For all constant \(0<\varepsilon <2/3\), the number of rounds needed for any protocol to compute a \((5/3-\varepsilon )\) approximation of all eccentricities of a sparse network in the CONGEST model is \(\varOmega (\frac{ n}{\log ^3n})\).

Verification of Spanners. Finally, our technique allows us to obtain a lower bound for the verification of \((\alpha ,\beta )\)-spanners. An \((\alpha ,\beta )\)-spanner of a graph G is a subgraph H in which for any two nodes u, v it holds that \(d_H(u,v) \le \alpha \cdot d_G(u,v)+\beta \). When spanners are sparse, i.e., when H does not have too many edges, they play a vital role in many applications, such as routing, approximating distances, synchronization, and more. Hence, the construction of sparse spanners has been a central topic of many studies, both in centralized and sequential computing.
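To make the definition concrete, the following minimal centralized sketch checks the \((\alpha ,\beta )\)-spanner condition by running a BFS from every node in both G and H. It is an illustration only and is not a CONGEST protocol; the function names and the edge-list representation are assumptions introduced for this example, and G is assumed to be unweighted.

```python
from collections import deque


def to_adj(nodes, edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def bfs_dist(adj, src):
    """Hop distances from src to every node reachable from it."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def is_spanner(nodes, g_edges, h_edges, alpha, beta):
    """True iff d_H(u,v) <= alpha * d_G(u,v) + beta for all u, v connected in G."""
    g, h = to_adj(nodes, g_edges), to_adj(nodes, h_edges)
    for u in nodes:
        d_g, d_h = bfs_dist(g, u), bfs_dist(h, u)
        for v, d in d_g.items():
            # A pair that is disconnected in H violates the condition.
            if d_h.get(v, float("inf")) > alpha * d + beta:
                return False
    return True


# Hypothetical example: G is a 4-cycle, H is G without the edge (3, 0).
NODES = [0, 1, 2, 3]
G_EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]
H_EDGES = [(0, 1), (1, 2), (2, 3)]
```

In this example the pair (0, 3) has \(d_G=1\) and \(d_H=3\), so H is a \((1,2)\)-spanner and a \((3,0)\)-spanner of G, but not a \((1,1)\)-spanner.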

Here we address the problem of verifying that a given subgraph H is indeed an \((\alpha ,\beta )\)-spanner of G. At the end of the computation, each node outputs a bit indicating whether H is a spanner, with the requirement that if H is indeed a spanner with the required parameters then all nodes indicate this, and otherwise at least one node indicates that it is not. We obtain the following lower bound.

Theorem 5

Given an unweighted graph \(G=(V,E)\) and a subgraph \(H\subset E\) of G, the number of rounds needed for any protocol to decide whether H is an \((\alpha ,\beta )\)-spanner of G in the CONGEST model is \(\varOmega (\frac{n}{(\alpha +\beta )\log ^3{n}})\), for any \(\alpha <\beta +1\).

Notice that for any reasonable value of \(\alpha ,\beta = O(poly\log {n})\), the lower bound is near-linear. This is further evidence of a task for which verification is harder than computation in the CONGEST model, a phenomenon initially brought to light in [18]. This is, for example, because \((+2)\)-purely additive spanners with \(O(n^{3/2}\log {n})\) edges can be constructed in \(O(\sqrt{n}\log {n} + D)\) rounds (this appears in [28], and can also be deduced from [26]), and various additional additive spanners can be constructed fast in CONGEST [14].

Roadmap. Section 2 contains our lower bound for computing the exact or approximate diameter. Due to space limitations we give our lower bounds for computing the exact or approximate radius, computing eccentricities and verifying spanners in the full version [2]. The degree-reduction technique appears in the full version as well.

1.1 Additional Related Work

Communication Complexity and Distributed Computing. A well-known technique to prove lower bounds in the CONGEST model is to use a reduction from communication complexity to distributed computing. Peleg and Rubinovich [35] apply a lower bound from communication complexity to show that the number of rounds needed for any distributed algorithm to construct a minimum spanning tree (MST) is \(\widetilde{\varOmega }(\sqrt{n}+D)\). Many recent papers were inspired by this technique. In [20] Elkin extended the result of [35] to show that any distributed algorithm for constructing an \(\alpha \)-approximation to the MST must spend \(\widetilde{\varOmega }(\sqrt{\frac{n}{\alpha }})\) rounds. Das Sarma et al. [18] show that any distributed verification algorithm for many problems, such as connectivity, \(s-t\) cut and approximating MST requires \(\widetilde{\varOmega }(\sqrt{n}+D)\) rounds. Nanongkai et al. [32] showed an \(\varOmega (\sqrt{\ell \cdot D}+D)\) lower bound for computing a random walk of length \(\ell \). Similar reductions from communication complexity were adapted also in the \(CONGEST\ Clique\ Broadcast\) model [19, 25], where in each round each node can broadcast the same \(O(\log {n})\)-bit message to all the nodes in the network.

Similar to the technique used in [18, 19, 21, 25, 26], our lower bounds are obtained by reductions from the Set-Disjointness problem in the two-party number-in-hand model of communication complexity [39]. Here, each of the players Alice and Bob receives a k-bit string, \(S_a\) and \(S_b\) respectively, and needs to decide whether the two strings are disjoint or not, i.e., whether there is some bit \(0\le i\le k-1\) such that \(S_a[i]=1\) and \(S_b[i]=1\). It is shown in [27] that in order to solve the Set-Disjointness problem, Alice and Bob must exchange \(\varOmega (k)\) bits. The high level idea for applying this lower bound in the CONGEST model is to define a graph \(G=(V,E)\) based on the input strings of Alice and Bob, such that G has some property p (e.g., diameter at least 5) if and only if the two strings of Alice and Bob are not disjoint. Given an algorithm for deciding whether a graph has property p, Alice and Bob can simulate this algorithm on G in order to solve the Set-Disjointness problem.

More on Distributed Distance Computation. Distance computation problems have also been recently studied in weighted distributed networks. Nanongkai [31] presented an \(\widetilde{O}(n^{\frac{1}{2}}\cdot D^{\frac{1}{4}})\) upper bound for the \((1+o(1))\)-approximation to the single source shortest paths problem. More recently, this result was improved by Henzinger et al. [23] as they presented an \(\widetilde{O}(n^{\frac{1}{2} + o(1)} + D^{1+o(1)})\) algorithm for the same approximation factor and showed that this also implies a \((2+o(1))\)-approximation algorithm to the diameter. Moreover, such problems have also been considered in the congested clique model [13, 23, 31], where \((1+o(1))\)-approximate all pairs shortest paths can be computed in \(O(n^{0.158})\) rounds [13].

Intuitively, the technical difficulty in extending the proof for diameter to work for radius as well is the difference in types between the two problems: the diameter asks for a pair of nodes that are far (\(\exists x \exists y\)) while radius asks for a node that is close to everyone (\(\exists x \forall y\)). Recent developments in the theory of (sequential) algorithms suggest that this type-mismatch could lead to fundamental differences between the two problems. Recall that classical sequential algorithms solve APSP in O(nm) time [17] and therefore both diameter and radius can be solved in quadratic \(O(n^2)\) time in sparse graphs.

Due to the lack of techniques for proving unconditional super-linear \(\omega (n)\) lower bounds on the runtime of sequential algorithms for any natural problem, a recent line of work seeks hardness results conditioned on certain plausible conjectures (a.k.a. “Hardness in P”). An interesting example of such a result concerns the diameter: Roditty and Vassilevska W. [36] proved that if the diameter of sparse graphs can be computed in truly-subquadratic \(O(n^{2-\varepsilon })\) time, for any \(\varepsilon >0\), then the Strong Exponential Time Hypothesis (SETH) is false, by reducing SAT to diameter. Since then, many other problems were shown to be “SETH-hard” (e.g. [1, 4, 6, 7, 9] to name a few) but whether a similar lower bound holds for radius is an open question [3, 5, 10, 11, 16, 36]. In fact, Carmosino et al. [12] show that there is a formal barrier for reducing SAT to radius, and Abboud, Vassilevska W. and Wang [5] introduce a new conjecture to prove an \(n^{2-o(1)}\) lower bound for radius (which has a similar \(\exists \forall \) type). Diameter and radius seem to behave differently also in the regime of dense and weighted graphs where the best known algorithms take roughly cubic \(O(n^{3}/2^{\sqrt{\log {n}}})\) time [15, 37] and it is known that radius can be solved in truly-subcubic \(O(n^{3-\varepsilon })\) time if and only if APSP can [3], but showing such a subcubic-equivalence between APSP and diameter is a big open question [3, 8, 38].

The framework and set-up in our unconditional lower bound proofs for distributed algorithms are very different from the ones in the works on conditional lower bounds for sequential algorithms discussed above. Still, some of our graph gadgets are inspired by the constructions in those proofs, e.g. [3, 5, 11, 16, 36]. Thus, it is quite surprising that our hardness proof for diameter transfers without much difficulty to a hardness proof for radius.

1.2 Model and Basic Definitions

We consider a synchronized network of n nodes represented by an undirected graph \(G=(V,E)\). In each round, each node can send a different message of b bits to each of its neighbors. This model is known as the CONGEST(b) model, and as the CONGEST model when \(b=O(\log (n))\) [33]. The graph parameters we consider are formally defined as follows.

Definition 1

(Eccentricity, Diameter and Radius). Let \(d(u,v)\) denote the length of the shortest path between the nodes u and v. The eccentricity e(u) of some node u is \(\max _{v\in V}d(u,v)\). The Diameter (denoted by D) is the maximum distance between any two nodes in the graph: \(D = \max _{u\in V}e(u)\). The Radius (denoted by r) is the maximum distance from some node to the “center” of the graph: \(r = \min _{u\in V}e(u)\).
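As a small illustration of Definition 1, the following sketch computes all eccentricities of a connected unweighted graph by one BFS per node, and derives the diameter and the radius from them. The adjacency-map representation and the example path graph are assumptions made for this illustration; the computation is centralized and says nothing about round complexity.

```python
from collections import deque


def eccentricities(adj):
    """Return {u: e(u)} for a connected graph given as {node: iterable of neighbors}."""
    ecc = {}
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        ecc[src] = max(dist.values())  # largest distance from src
    return ecc


# Hypothetical example: the path 0 - 1 - 2 - 3 - 4.
PATH = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
ECC = eccentricities(PATH)
DIAMETER = max(ECC.values())  # D = max_u e(u)
RADIUS = min(ECC.values())    # r = min_u e(u)
```

On this path the center is node 2, so the radius is 2 while the diameter is 4.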

Finally, we define what we mean when we say that a graph is sparse.

Definition 2

(sparse network). A sparse network \(G=(V,E)\) is a network with n nodes and at most \(O(n\log (n))\) edges.

Recall, however, that all our results can be obtained for graphs that have a strictly linear number of \(\varTheta (n)\) edges, at the cost of at most an additional \(O(\log {n})\) factor in the lower bound.

2 Computing the Diameter

In this section we present lower bounds on the number of rounds needed to compute the diameter exactly and approximately in sparse networks. First, in Sect. 2.1 we present a higher lower bound on the number of rounds needed for any algorithm to compute the exact diameter of a sparse network, and next, in Sect. 2.2 we show how to modify our sparse construction to achieve a higher lower bound on the number of rounds needed for any algorithm to compute a \((\frac{3}{2}-\varepsilon )\)-approximation to the diameter.

2.1 Exact Diameter

To prove Theorem 1 we describe a graph construction \(G=(V,E)\) and a partition of G into \((G_a,G_b)\), such that one part is simulated by Alice (denoted by \(G_a\)), and the second is simulated by Bob (denoted by \(G_b\)). Each player receives an input string defining some additional edges that will affect the diameter of G. The proof is organized as follows: we first describe the graph construction, and then describe the reduction from the Set-Disjointness problem and deduce Theorem 1.

Fig. 1. Graph construction (diameter). Some edges are omitted for clarity.

Graph Construction. Let \(i^j\) denote the value of bit j in the binary representation of i. The set of nodes V is defined as follows (see also Fig. 1). First, it contains two sets of nodes \(L= \{\ell _i \mid 0\le i\le k-1\}\) and \(R= \{r_i \mid 0\le i\le k-1\}\), each of size k. All the nodes in L are connected to an additional node \(\ell _{k}\), which is connected to an additional node \(\ell _{k+1}\). Similarly, all the nodes in R are connected to an additional node \(r_{k}\), which is connected to an additional node \(r_{k+1}\). The nodes \(\ell _{k+1}\) and \(r_{k+1}\) are also connected by an edge.

Furthermore, we add four sets of nodes, which are our bit-gadget: \(F=\{f_j \mid 0\le j\le \log (k)-1\},\,T=\{t_j \mid 0\le j\le \log (k)-1\},\, F'=\{f'_j \mid 0\le j\le \log (k)-1\},\, T'=\{t'_j \mid 0\le j\le \log (k)-1\}\), each of size \(\log (k)\). We connect the sets \(F,T\) with \(F',T'\) by adding edges between \(f_i\) and \(t_i'\), and between \(t_i\) and \(f_i'\), for each \(0\le i \le \log (k)-1\). To define the connections between the sets \(L,R\) and the sets \(F,T,F',T'\), we add the following edges: for each \(\ell _i\in L\) and each \(0\le j\le \log (k)-1\), if \(i^j=0\) we connect \(\ell _i\) to \(f_j\), and otherwise we connect \(\ell _i\) to \(t_j\). Similarly, for each \(r_i\in R\), if \(i^j=0\) we connect \(r_i\) to \(f'_j\), and otherwise we connect \(r_i\) to \(t_j'\).

To complete the construction we add two additional nodes a and b. We connect a to all the nodes in \(F\cup T\cup \{\ell _{k},\ell _{k+1}\}\), and similarly, we connect b to all the nodes in \(F'\cup T'\cup \{r_{k},r_{k+1}\}\). We also add an edge between the nodes a and b. The proofs of the following two claims appear in the full version [2].

Claim 2.1

For every \(i,j\in [k-1]\) it holds that \(d(\ell _i,r_j)=3\) if \(i\ne j\), and \(d(\ell _i,r_j)=5\) otherwise.

Claim 2.2

For every \(u,v\in V{\setminus } (L\cup R)\) it holds that \(d(u,v) \le 3\).

Corollary 1

For every \(u,v\) such that \(u\in (V_a{\setminus } L)\) or \(v\in (V_b{\setminus } R)\), it holds that \(d(u,v) \le 4\).
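The construction above is small enough to check Claims 2.1 and 2.2 by brute force for small k. The following sketch builds the graph (before any input-dependent edges are added) and verifies both claims with BFS. The node labels such as `("l", i)` and `("f'", j)`, the function names, and the restriction of k to powers of two are assumptions of this sketch, not part of the paper.

```python
from collections import deque


def build_base(k):
    """Adjacency sets of the Sect. 2.1 graph without input edges (k a power of two, k >= 2)."""
    bits = k.bit_length() - 1  # log2(k) for a power of two
    adj = {}

    def edge(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    for side, prime, hub in (("l", "", "a"), ("r", "'", "b")):
        edge((side, k), (side, k + 1))
        edge(hub, (side, k))
        edge(hub, (side, k + 1))
        for j in range(bits):
            edge(hub, ("f" + prime, j))
            edge(hub, ("t" + prime, j))
        for i in range(k):
            edge((side, i), (side, k))
            for j in range(bits):
                # Bit j of i selects the f-node (bit 0) or the t-node (bit 1).
                kind = "t" if (i >> j) & 1 else "f"
                edge((side, i), (kind + prime, j))
    edge(("l", k + 1), ("r", k + 1))
    edge("a", "b")
    for j in range(bits):
        edge(("f", j), ("t'", j))
        edge(("t", j), ("f'", j))
    return adj


def bfs(adj, src):
    """Hop distances from src."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def claims_hold(k):
    """Check Claim 2.1 and Claim 2.2 on the graph for the given k."""
    adj = build_base(k)
    for i in range(k):
        dist = bfs(adj, ("l", i))
        if any(dist[("r", j)] != (5 if i == j else 3) for j in range(k)):
            return False
    # Nodes outside L and R: everything except ("l", i) and ("r", i) with i < k.
    core = [v for v in adj
            if not (isinstance(v, tuple) and v[0] in ("l", "r") and v[1] < k)]
    return all(bfs(adj, u)[v] <= 3 for u in core for v in core)
```

Calling `claims_hold(8)` exercises a graph with 34 nodes; such a check is only a sanity test for small instances and does not replace the proofs in the full version.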

Reduction from Set-Disjointness. To prove Theorem 1, we show a reduction from the Set-Disjointness problem. Following the construction defined in the previous section, we define a partition \((G_a=(V_a,E_a),\,G_b=(V_b,E_b))\):

$$\begin{aligned}&V_a = L\cup F\cup T\cup \{\ell _{k},\ell _{k+1},a\},\,E_a = \{(u,v)|u,v\in V_a \wedge (u,v)\in E \}\\&V_b = R\cup F'\cup T'\cup \{r_{k},r_{k+1},b\},\,E_b = \{(u,v)|u,v\in V_b \wedge (u,v)\in E \} \end{aligned}$$

The graph \(G_a\) is simulated by Alice and the graph \(G_b\) is simulated by Bob, i.e., in each round, all the messages that nodes in \(G_a\) send to nodes in \(G_b\) are sent by Alice to Bob. Bob forwards these messages to the corresponding nodes in \(G_b\). All the messages from nodes in \(G_b\) to nodes in \(G_a\) are sent in the same manner. Each player receives an input string (\(S_a\) and \(S_b\), respectively) of k bits. If the bit \(S_a[i] = 0\), Alice adds an edge between the nodes \(\ell _i\) and \(\ell _{k+1}\). Similarly, if \(S_b[i] = 0\), Bob adds an edge between the nodes \(r_i\) and \(r_{k+1}\).

Observation 6

For every \(u,v\in V_a\), it holds that \(d(u,v)\le 4\). Similarly, \(d(u,v)\le 4\) for every \(u,v\in V_b\).

This is because \(d(u_a,\ell _{k+1})\le 2\) for any \(u_a\in V_a\), and \(d(u_b,r_{k+1})\le 2\) for any node \(u_b\in V_b\).

Lemma 1

The diameter of G is at least 5 if and only if the sets of Alice and Bob are not disjoint.

The proof appears in the full version [2].
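Lemma 1 can also be sanity-checked exhaustively for a small k. The sketch below rebuilds the graph together with the input-dependent edges (an edge from \(\ell _i\) to \(\ell _{k+1}\) when \(S_a[i]=0\), and from \(r_i\) to \(r_{k+1}\) when \(S_b[i]=0\)) and compares the diameter against the intersection of the inputs. Node labels and function names are assumptions of this sketch, and k is assumed to be a power of two.

```python
from collections import deque
from itertools import product


def build_graph(k, sa, sb):
    """Sect. 2.1 graph including the edges determined by the inputs sa and sb."""
    bits = k.bit_length() - 1
    adj = {}

    def edge(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    for side, prime, hub, s in (("l", "", "a", sa), ("r", "'", "b", sb)):
        edge((side, k), (side, k + 1))
        edge(hub, (side, k))
        edge(hub, (side, k + 1))
        for j in range(bits):
            edge(hub, ("f" + prime, j))
            edge(hub, ("t" + prime, j))
        for i in range(k):
            edge((side, i), (side, k))
            for j in range(bits):
                kind = "t" if (i >> j) & 1 else "f"
                edge((side, i), (kind + prime, j))
            if s[i] == 0:
                # Input edge: a zero bit gives node i a shortcut to l_{k+1} / r_{k+1}.
                edge((side, i), (side, k + 1))
    edge(("l", k + 1), ("r", k + 1))
    edge("a", "b")
    for j in range(bits):
        edge(("f", j), ("t'", j))
        edge(("t", j), ("f'", j))
    return adj


def diameter(adj):
    """Largest BFS distance over all pairs of a connected graph."""
    best = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best


def lemma1_holds(k):
    """Check 'diameter >= 5 iff the inputs intersect' over all pairs of k-bit inputs."""
    for sa in product((0, 1), repeat=k):
        for sb in product((0, 1), repeat=k):
            intersect = any(x and y for x, y in zip(sa, sb))
            if (diameter(build_graph(k, sa, sb)) >= 5) != intersect:
                return False
    return True
```

For k = 4 this enumerates all 256 input pairs on a 22-node graph, which runs quickly; larger k would call for sampling inputs instead of enumerating them.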

Proof of Theorem 1. From Lemma 1, we get that any algorithm for computing the exact diameter of the graph G can be used to solve the Set-Disjointness problem. Note that since there are \(O(\log (k))\) edges in the cut (\(G_a,G_b\)), in each round Alice and Bob exchange \(O(\log (k)\cdot \log (n))\) bits. Since \(k = \varOmega (n)\) we deduce that any algorithm for computing the diameter of a network must spend \({\varOmega }(\frac{n}{\log ^2(n)})\) rounds, and since \(|E|=O(n\log (n))\) this lower bound holds even for sparse networks.
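The counting step in this proof can be spelled out as follows, where T denotes the number of rounds of the simulated algorithm (T is a symbol introduced here for illustration). Over T rounds the players exchange at most \(T\cdot O(\log (k)\cdot \log (n))\) bits, which must be at least the \(\varOmega (k)\) bits required for Set-Disjointness:

$$\begin{aligned} T\cdot O(\log (k)\cdot \log (n)) \ge \varOmega (k) \quad \Longrightarrow \quad T = \varOmega \left( \frac{k}{\log (k)\cdot \log (n)}\right) = \varOmega \left( \frac{n}{\log ^2(n)}\right) , \end{aligned}$$

where the last equality uses \(k=\varOmega (n)\) and \(\log (k)\le \log (n)\).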

2.2 \((\frac{3}{2}-\varepsilon )\)-approximation to the Diameter

In this section we show how to modify our sparse construction to achieve a stronger lower bound for (\(\frac{3}{2}-\varepsilon \))-approximation algorithms.

Fig. 2. Graph construction, \(P=3\) (diameter approximation). Some edges are omitted.

Graph Construction. The main idea to achieve this lower bound is to stretch our sparse construction by replacing some edges by paths of length P, an integer which will be chosen later. Actually, we only apply the following changes to the construction described in Sect. 2.1 (see also Fig. 2 where \(P=3\)):

  1. Remove the nodes a, b and their incident edges.

  2. Replace all the edges incident to the nodes \(\ell _{k},r_{k}\) by paths of length P.

  3. Replace all the edges \((u,v)\) such that \(u\in L\) and \(v\in (F\cup T)\) by paths of length P. Similarly, replace all the edges \((u,v)\) such that \(u\in R\) and \(v\in (F'\cup T')\) by paths of length P.

  4. Add two additional sets \(L' = \{\ell '_i \mid 0\le i\le k-1\}\), \(R'=\{r'_i \mid 0\le i\le k-1\}\) each of size k. Connect each \(\ell _i'\) to \(\ell _i\) by a path of length P. Similarly, connect each \(r_i'\) to \(r_i\) by a path of length P.

Furthermore, to simplify our proof, we connect each \(u\in (F\cup T)\) to \(\ell _{k+1}\) by a path of length P. Similarly, connect each \(u\in (F'\cup T')\) to \(r_{k+1}\) by a path of length P.

Definition 3

(\(Y(u,v)\)). For each \(u,v\in V\) such that u and v are connected by a path of length P, denote by \(Y(u,v)\) the set of all nodes on the path of length P between u and v (without u and v).

The proofs of the following two claims appear in the full version [2].

Claim 2.3

For every \(u,v\in V{\setminus } (L'\cup R'\bigcup _{i\in [k-1]} Y(\ell '_i,\ell _i)\bigcup _{i\in [k-1]} Y(r'_i,r_i))\) it holds that \(d(u,v)\) is at most \(4P+1\).

Claim 2.4

For every \(i,j\in [k-1]\) it holds that \(d(\ell '_i,r'_j) = 4P+1\) if \(i\ne j\), and \(d(\ell '_i,r'_j) = 6P+1\) otherwise.

Reduction from Set-Disjointness. Following the construction described in the previous section, we define a partition \((G_a=(V_a,E_a),\,G_b=(V_b,E_b))\):

$$\begin{aligned} V_a =&\bigcup _{\begin{array}{c} i\in [k-1] \\ j\in [\log (k)-1] \\ i^j=0 \end{array}}Y(\ell _i,f_j)\bigcup _{\begin{array}{c} i\in [k-1] \\ j\in [\log (k)-1] \\ i^j=1 \end{array}}Y(\ell _i,t_j)\bigcup _{i\in [k-1]} Y(\ell '_i,\ell _i)\\\nonumber&\bigcup _{i\in [k-1]} Y(\ell _i,\ell _{k})\cup Y(\ell _{k},\ell _{k+1})\cup L'\cup L\cup F\cup T\cup \{\ell _{k},\ell _{k+1}\}\\ E_a =&\{(u,v)|u,v\in V_a \wedge (u,v)\in E \}\\ V_b =&\bigcup _{\begin{array}{c} i\in [k-1] \\ j\in [\log (k)-1] \\ i^j=0 \end{array}}Y(r_i,f'_j)\bigcup _{\begin{array}{c} i\in [k-1] \\ j\in [\log (k)-1] \\ i^j=1 \end{array}}Y(r_i,t'_j)\bigcup _{i\in [k-1]} Y(r'_i,r_i)\\\nonumber&\bigcup _{i\in [k-1]} Y(r_i,r_{k})\cup Y(r_{k},r_{k+1})\cup R'\cup R\cup F'\cup T'\cup \{r_{k},r_{k+1}\}\\ E_b =&\{(u,v)|u,v\in V_b \wedge (u,v)\in E \} \end{aligned}$$

Each player receives an input string (\(S_a\) and \(S_b\), respectively) of k bits. If \(S_a[i] = 0\), Alice adds an edge between the nodes \(\ell _i\) and \(\ell _{k+1}\). Similarly, if \(S_b[i] = 0\), Bob adds an edge between the nodes \(r_i\) and \(r_{k+1}\).

Claim 2.5

Let \(0\le i\le k-1\) be such that \(S_a[i]=0\) or \(S_b[i]=0\). Then the distance from the node \(\ell _i \in L\) to any node \(u\in (R\cup \{r_{k+1}\})\) is at most \(2P+2\).

The proof appears in the full version [2]. Note that any node in \(V_b\) is connected by a path of length at most P to some node in \(R\cup \{r_{k+1}\}\), and any node in \(L'\) is connected by a path of length P to some node in L. Thus, combining this with Claim 2.5 we conclude the following.

Corollary 2

Let \(0\le i\le k-1\) be such that \(S_a[i]=0\) or \(S_b[i]=0\). Then \(d(u,v_b)\le 4P+2\) for any \(u\in \{\ell '_i\}\cup Y(\ell '_i,\ell _i)\) and any \(v_b\in V_b\). Symmetrically, \(d(u,v_a)\le 4P+2\) for any \(u\in \{r'_i\}\cup Y(r'_i,r_i)\) and any \(v_a\in V_a\).

Lemma 2

The diameter of G is \(6P+1\) if the two sets of Alice and Bob are not disjoint, and at most \(4P+2\) otherwise.

The proof appears in the full version [2].
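The stretched construction, Claim 2.4 and Lemma 2 can be sanity-checked in the same brute-force manner for small k and P. The sketch below follows the four modifications listed above plus the extra paths from \(F\cup T\) to \(\ell _{k+1}\) and from \(F'\cup T'\) to \(r_{k+1}\). Node labels, the internal path nodes `("y", n)`, and the function names are assumptions of this sketch; in the disjoint case it checks only the upper bound \(4P+2\), since inputs with many zero bits can give a smaller diameter.

```python
from collections import deque
from itertools import count, product


def build_stretched(k, P, sa, sb):
    """Sect. 2.2 graph with input edges; k a power of two, P >= 1."""
    bits = k.bit_length() - 1
    adj = {}
    fresh = count()

    def edge(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def path(u, v):
        """Join u and v by a path of P edges through P - 1 fresh internal nodes."""
        prev = u
        for _ in range(P - 1):
            mid = ("y", next(fresh))
            edge(prev, mid)
            prev = mid
        edge(prev, v)

    for side, prime, s in (("l", "", sa), ("r", "'", sb)):
        hub_k, hub = (side, k), (side, k + 1)
        path(hub_k, hub)                          # item 2
        for j in range(bits):
            path(("f" + prime, j), hub)           # extra paths to l_{k+1} / r_{k+1}
            path(("t" + prime, j), hub)
        for i in range(k):
            path((side, i), hub_k)                # item 2
            path((side + "'", i), (side, i))      # item 4
            for j in range(bits):                 # item 3
                kind = "t" if (i >> j) & 1 else "f"
                path((side, i), (kind + prime, j))
            if s[i] == 0:
                edge((side, i), hub)              # input edge (a single edge)
    edge(("l", k + 1), ("r", k + 1))
    for j in range(bits):
        edge(("f", j), ("t'", j))
        edge(("t", j), ("f'", j))
    return adj


def bfs(adj, src):
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def claim_2_4_holds(k, P):
    """All-ones inputs add no input edges, which is the setting of Claim 2.4."""
    adj = build_stretched(k, P, (1,) * k, (1,) * k)
    return all(bfs(adj, ("l'", i))[("r'", j)] == (6 * P + 1 if i == j else 4 * P + 1)
               for i in range(k) for j in range(k))


def lemma2_holds(k, P):
    """Diameter is 6P+1 when the inputs intersect and at most 4P+2 otherwise."""
    for sa in product((0, 1), repeat=k):
        for sb in product((0, 1), repeat=k):
            adj = build_stretched(k, P, sa, sb)
            diam = max(max(bfs(adj, u).values()) for u in adj)
            if any(x and y for x, y in zip(sa, sb)):
                if diam != 6 * P + 1:
                    return False
            elif diam > 4 * P + 2:
                return False
    return True
```

With k = 4 and P = 2 the graph has 70 nodes, so enumerating all 256 input pairs remains cheap.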

Proof of Theorem 2. To complete the proof we need to choose P such that \((\frac{3}{2}-\varepsilon )\cdot (4P+2) < (6P+1)\), which holds for any \(P>\frac{1}{2\varepsilon }-\frac{1}{2}\). Note that \(k=\varOmega (\frac{n}{\log (n)})\) for a constant \(\varepsilon \). Thus, we deduce that any algorithm for computing a \((\frac{3}{2}-\varepsilon )\)-approximation to the diameter requires at least \(\varOmega (\frac{n}{\log ^3(n)})\) rounds. Furthermore, the number of nodes and the number of edges are both equal to \(\varTheta (k\log (k)\cdot P)\). Thus, this lower bound holds even for graphs with a linear number of edges.
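For completeness, the threshold on P follows by expanding the left-hand side of the required inequality:

$$\begin{aligned} \left( \tfrac{3}{2}-\varepsilon \right) (4P+2) = 6P+3-4\varepsilon P-2\varepsilon< 6P+1 \;\Longleftrightarrow \; 4\varepsilon P > 2-2\varepsilon \;\Longleftrightarrow \; P > \frac{1}{2\varepsilon }-\frac{1}{2}. \end{aligned}$$

As an illustrative choice (not taken from the paper), \(\varepsilon = 0.1\) requires \(P>4.5\), and indeed for \(P=5\) we get \(1.4\cdot 22 = 30.8 < 31\). The round bound then follows as in the proof of Theorem 1: the cut still has \(O(\log (k))\) edges, so the number of rounds is \(\varOmega (\frac{k}{\log (k)\cdot \log (n)})\), which is \(\varOmega (\frac{n}{\log ^3(n)})\) since \(k=\varOmega (\frac{n}{\log (n)})\).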

3 Discussion

We introduce a new technique for reducing the Set-Disjointness communication problem to distributed computation problems, in a highly efficient way. Our reductions encode an instance of Disjointness on k bits into a graph on only \(\widetilde{O}(k)\) nodes and edges with a small “communication-cut” of size \(O(\log {k})\). All previous lower bound constructions had a cut of poly(k) size (e.g., [18, 21, 25, 26]). This efficiency allows us to answer several central open questions regarding the round complexity of distance computation problems in the CONGEST model.

There are several interesting directions for future work. First, there is still a \(\log {n}\) factor gap between the upper and lower bounds on the round complexity of computing the diameter in the CONGEST model. Due to the fundamentality of the diameter, we believe that it will be interesting to close this small gap.

Second, while our ideas greatly improve the state of the art lower bounds for shortest paths problems on unweighted graphs, their potential in the regime of weighted graphs is yet to be explored.

Finally, following our strong barriers for sparse graphs, it is important to seek further natural restrictions on the networks that would allow for much faster distance computation. Planar graphs are an intriguing setting in this context. A promising recent work of Ghaffari and Haeupler [22] showed that computing a minimum spanning tree can be done in \(\widetilde{O}(D)\) rounds in planar graphs, despite the \(\widetilde{\varOmega }(\sqrt{n})\) lower bound for general graphs [18]. Can the diameter of a planar network be computed in \(\widetilde{O}(D)\) rounds? While the graphs in our lower bounds are highly non-planar, it is interesting to note that they have a relatively small treewidth of \(O(\log {n})\).