# The Parameterised Complexity of Computing the Maximum Modularity of a Graph

## Abstract

The *maximum modularity* of a graph is a parameter widely used to describe the level of clustering or community structure in a network. Determining the maximum modularity of a graph is known to be \(\textsf {NP}\)-complete in general, and in practice a range of heuristics are used to construct partitions of the vertex-set which give lower bounds on the maximum modularity but without any guarantee on how close these bounds are to the true maximum. In this paper we investigate the parameterised complexity of determining the maximum modularity with respect to various standard structural parameterisations of the input graph *G*. We show that the problem belongs to \(\textsf {FPT}\) when parameterised by the size of a minimum vertex cover for *G*, and is solvable in polynomial time whenever the treewidth or max leaf number of *G* is bounded by some fixed constant; we also obtain an FPT algorithm, parameterised by treewidth, to compute any constant-factor approximation to the maximum modularity. On the other hand we show that the problem is W[1]-hard (and hence unlikely to admit an FPT algorithm) when parameterised simultaneously by pathwidth and the size of a minimum feedback vertex set.

## Keywords

Modularity, Community detection, Integer quadratic programming, Vertex cover, Pathwidth

## 1 Introduction

The increasing availability of large network datasets has led to great interest in techniques to discover network structure. An important and frequently observed structure in networks is the existence of groups of vertices with many connections between them, often referred to as ‘communities’.

Newman and Girvan introduced the modularity function in 2004 [24]. Modularity gives a measure of how well a graph can be partitioned into communities and is used in the most popular algorithms to cluster large networks. For example, the Louvain method, an iterative clustering technique, uses the modularity function to choose which parts from the previous step to fuse into larger parts at each step [2, 16]. The widespread use of modularity and empirical success in finding communities makes modularity an important function to study from an algorithmic point of view.

The decision problem associated with computing the maximum modularity, which we refer to as Modularity, was shown to be \(\textsf {NP}\)-complete in general by Brandes et al. [4], using a construction that relies on the fact that all vertices of a sufficiently large clique must be assigned to the same part of an optimal partition. They also showed that a variation of the problem in which we wish to find the optimal partition into exactly two sets is hard; their proof for this relied again on the use of large cliques, but DasGupta and Desai [6] later showed that this 2-clustering problem remains \(\textsf {NP}\)-complete on *d*-regular graphs for any fixed \(d \ge 9\). It has also been shown that it is \(\textsf {NP}\)-hard to approximate the maximum modularity within any constant factor [8], although there is a polynomial-time constant-factor approximation algorithm for certain families of scale-free networks [10]. The hardness of computing constant-factor multiplicative approximations in general has motivated research into approximation algorithms with an additive error [8, 17]: the best known result is an approximation algorithm with additive error roughly 0.42084 [17].

In this paper we initiate the study of the parameterised complexity of Modularity, considering its complexity with respect to several standard structural parameterisations. On the positive side, we show that the problem is in \(\textsf {FPT}\) when parameterised by the cardinality of a minimum vertex cover for the input graph *G*, and that it belongs to \(\textsf {XP}\) when parameterised by either the treewidth or max leaf number of *G*. The XP algorithm parameterised by treewidth can easily be adapted to give an FPT algorithm, parameterised by treewidth, to compute any constant-factor approximation to the maximum modularity. On the other hand, we demonstrate that Modularity, parameterised by treewidth, is unlikely to belong to \(\textsf {FPT}\): we prove that the problem is \(\textsf {W}[1]\)-hard even when parameterised simultaneously by the pathwidth of *G* and the size of a minimum feedback vertex set for *G*. For background on parameterised complexity, and the complexity classes discussed here, we refer the reader to [5, 11].

These results follow the same pattern as those obtained for the problem Equitable Connected Partition [12], and indeed our hardness result involves a reduction from a specialisation of this problem. There are clear similarities between the two problems: in a partition that maximises the modularity, every part will induce a connected subgraph and, in certain circumstances, we achieve the maximum modularity with a partition into parts that are as equal as possible. However, the crucial difference between the two problems is that the input to Equitable Connected Partition includes the required number of parts, whereas Modularity requires us to maximise over all possible partition sizes; in fact, if we restrict to partitions with a specified number of parts, it is no longer necessarily true that a partition maximising the modularity must induce connected subgraphs. This difference makes reductions between the two problems non-trivial.

### 1.1 The Modularity Function

The definition of modularity was first introduced by Newman and Girvan in [24]. Many or indeed most popular algorithms used to search for clusterings on large datasets are based on finding partitions with high modularity [15, 19], and the heuristics within them sometimes also use local modularity optimisation, for example in the Louvain method [2]. See [14, 25] for surveys on community detection including modularity based methods.

Knowledge of the maximum modularity for classes of graphs helps us to understand the behaviour of the modularity function. There is a growing literature on this, which began with cycles and complete graphs in [4]. Bagrow [1] and Montgolfier et al. [7] showed that some classes of trees have high maximum modularity; this was extended in [21] to all trees with maximum degree *o*(*n*), and furthermore to all graphs where the product of treewidth and maximum degree grows more slowly than the number of edges. Many random graph models also have high modularity: see [22, 23] for a treatment of Erdős-Rényi random graphs, [21] for random regular graphs and also [26], which includes the preferential attachment model.

For a set *A* of vertices, let *e*(*A*) denote the number of edges within *A*, and let \({{\,\mathrm{vol}\,}}(A)\) (sometimes called the volume of *A*) denote the sum of the degrees \(d_v\) (in the whole graph *G*) over the vertices *v* in *A*. For a graph *G* with \(m\ge 1\) edges and a vertex partition \(\mathcal {A}\) of *G*, set the modularity score of \(\mathcal {A}\) on *G* to be

\[ q_\mathcal {A}(G) = \frac{1}{m}\sum _{A\in \mathcal {A}} e(A) - \frac{1}{4m^2}\sum _{A\in \mathcal {A}} {{\,\mathrm{vol}\,}}(A)^2. \]

The *maximum modularity* of *G* is \(q^*(G)=\max _\mathcal {A} q_\mathcal {A}(G)\), where the maximum is over all partitions \(\mathcal {A}\) of the vertices of *G*. Graphs with no edges are defined conventionally to have modularity 1. However, note that if the modularity of graphs with no edges were defined to be 0 it would not change any of the results.

The modularity function is designed to score partitions highly when most edges fall within the parts and penalise partitions with very few or very big parts. These two objectives are encoded as the *edge contribution* or *coverage* \(q^E_\mathcal {A}(G)=\frac{1}{m}\sum _{\mathsf {A}\in \mathcal {A}} e(\mathsf {A})\), and *degree tax* \(q_\mathcal {A}^D(G)=\frac{1}{4m^2}\sum _{\mathsf {A}\in \mathcal {A}} {{\,\mathrm{vol}\,}}(\mathsf {A})^2\), in the modularity of a vertex partition \(\mathcal {A}\) of *G*.
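The two terms can be computed directly from an edge list. The following is a minimal sketch, assuming a simple undirected graph given as a list of vertex pairs; the function name `modularity` and the input format are illustrative choices, not part of the paper.

```python
from collections import defaultdict

def modularity(edges, partition):
    """Return (edge_contribution, degree_tax, score) for a vertex partition.

    edges: list of (u, v) pairs of a simple undirected graph with m >= 1 edges.
    partition: list of disjoint vertex sets covering every endpoint.
    """
    m = len(edges)
    part_of = {v: i for i, part in enumerate(partition) for v in part}
    degree = defaultdict(int)
    inside = [0] * len(partition)            # e(A) for each part A
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if part_of[u] == part_of[v]:
            inside[part_of[u]] += 1
    vol = [sum(degree[v] for v in part) for part in partition]
    edge_contribution = sum(inside) / m
    degree_tax = sum(x * x for x in vol) / (4 * m * m)
    return edge_contribution, degree_tax, edge_contribution - degree_tax

# Two disjoint triangles, one part per triangle.
triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
print(modularity(triangles, [{0, 1, 2}, {3, 4, 5}]))   # (1.0, 0.5, 0.5)
```

Here every edge lies inside a part, so the edge contribution is 1, while the two parts of volume 6 give a degree tax of 1/2.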

Note that for any graph with \(m\ge 1\) edges \(0 \le q^*(G) \le 1\). To see the lower bound, notice that the trivial partition which places all vertices in the same part has modularity zero. For example, complete graphs and stars have modularity 0 as noted in [4]. A graph consisting of *c* disjoint cliques of the same size has modularity \(1-1/c\) with the optimal partition taking each clique to be a part.
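For very small graphs these values can be checked by exhaustive search. The sketch below is an illustration only (it enumerates every set partition, so its running time grows with the Bell numbers); the helper names `set_partitions`, `score` and `max_modularity` are hypothetical.

```python
def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in set_partitions(rest):
        # put `first` into an existing block ...
        for i in range(len(smaller)):
            yield smaller[:i] + [[first] + smaller[i]] + smaller[i + 1:]
        # ... or into a new block of its own
        yield [[first]] + smaller

def score(edges, partition):
    """Modularity score: edge contribution minus degree tax."""
    m = len(edges)
    part_of = {v: i for i, block in enumerate(partition) for v in block}
    inside = sum(1 for u, v in edges if part_of[u] == part_of[v])
    vol = [0] * len(partition)
    for u, v in edges:
        vol[part_of[u]] += 1
        vol[part_of[v]] += 1
    return inside / m - sum(x * x for x in vol) / (4 * m * m)

def max_modularity(vertices, edges):
    """Brute-force maximum modularity of a graph with at least one edge."""
    return max(score(edges, p) for p in set_partitions(list(vertices)))

k4 = [(a, b) for a in range(4) for b in range(a + 1, 4)]
star = [(0, 1), (0, 2), (0, 3)]
two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
for name, n, edges in [("K4", 4, k4), ("star", 4, star), ("2 triangles", 6, two_triangles)]:
    print(name, max_modularity(range(n), edges))
```

On these inputs the search agrees with the values quoted above: 0 for the complete graph and the star, and \(1-1/2\) for two disjoint triangles.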

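It is sometimes convenient to work with the *modularity deficit* \(\tilde{q}_\mathcal {A}(G)=1-q_\mathcal {A}(G)\). Denote by \(\partial (A)\) the number of edges between vertex set *A* and the rest of the graph. Then

\[ \tilde{q}_\mathcal {A}(G) = \frac{1}{2m}\sum _{A\in \mathcal {A}} \partial (A) + \frac{1}{4m^2}\sum _{A\in \mathcal {A}} {{\,\mathrm{vol}\,}}(A)^2. \]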

### Fact 1

### Fact 2

(Lemma 3.4 of [4]) Suppose that *G* is a graph that contains no isolated vertices. If \(\mathcal {A}\) is a partition of *V*(*G*) such that \(q_{\mathcal {A}}(G) = q^*(G)\) then, for every \(A \in \mathcal {A}\), *G*[*A*] is a connected subgraph of *G*.

### Fact 3

(Corollary 1 of [4]) Let \(G = (V,E)\) and suppose that \(V_0 \subseteq V\) is a set of isolated vertices. Then \(q^*(G) = q^*(G {\setminus } V_0)\). Moreover, if partitions \(\mathcal {A}\) and \(\mathcal {A}'\) agree on all vertices of \(V {\setminus } V_0\), then \(q_{\mathcal {A}}(G) = q_{\mathcal {A}'}(G)\).

### Fact 4

(Lemma 1.6.5 of [27]) If \(\mathcal {A}\) is a partition of *V*(*G*) such that \(q_{\mathcal {A}}(G) = q^*(G)\) then no part *A* consists of a single non-isolated vertex.

### Proof

Let *u* be a vertex with degree \(d_u>0\) and suppose (for a contradiction) that \(\mathcal {A}=\{\{u\},A_1, \ldots , A_k\}\) is an optimal partition of *G*. For each \(i=1,\ldots , k\) define the vertex partition \(\mathcal {B}_i=\{A_1, \ldots , A_i\cup \{u\}, \ldots , A_k\}\). We can derive a simple expression for \(q_{\mathcal {B}_i}(G)-q_{\mathcal {A}}(G)\) as most terms cancel:

\[ q_{\mathcal {B}_i}(G)-q_{\mathcal {A}}(G) = \frac{e(\{u\},A_i)}{m} - \frac{d_u {{\,\mathrm{vol}\,}}(A_i)}{2m^2}. \]

Since \(\mathcal {A}\) is optimal, for each *i* we have \(2m \cdot e(\{u\},A_i) \le d_u {{\,\mathrm{vol}\,}}(A_i)\). Hence we can sum over \(i=1, \ldots , k\) and the inequality should hold. However, for the LHS \(2m \sum _i e(\{u\},A_i) =2m d_u\) and the RHS is

\[ d_u \sum _{i=1}^{k} {{\,\mathrm{vol}\,}}(A_i) = d_u (2m - d_u) < 2m d_u, \]

a contradiction. \(\square \)

Observe that Facts 2, 3 and 4 together imply that the search for an optimal partition can be restricted to those in which all parts are connected subgraphs and no part consists of a single node.

### 1.2 Notation and Definitions

Given a graph \(G = (V,E)\), and a set \(U \subseteq V\) of vertices, we write *G*[*U*] for the subgraph of *G* induced by *U* and \(G {\setminus } U\) for \(G[V {\setminus } U]\). Given two disjoint subsets of vertices \(A,B \subseteq V\), we write *e*(*A*, *B*) for the number of edges with one endpoint in *A* and the other in *B*. We shall often want to denote the number of edges between a set of vertices and the remainder of the graph so set \(\partial (A)=e(A,\bar{A})\). If \(\mathcal {P}\) is a partition of a set *X*, and \(Y \subset X\), we write \(\mathcal {P}[Y]\) for the restriction of \(\mathcal {P}\) to *Y*.

A *vertex cover* of a graph \(G = (V,E)\) is a set \(U \subseteq V\) such that every edge has at least one endpoint in *U*; equivalently, \(G {\setminus } U\) is an independent set (i.e. contains no edges). The *vertex cover number* of *G* is the smallest cardinality of any vertex cover of *G*. A *feedback vertex set* for *G* is a set \(U \subseteq V\) such that \(G {\setminus } U\) contains no cycles. Notice that the vertex cover number of *G* gives an upper bound on the size of the smallest feedback vertex set for *G*, written \(\mathrm{fvs}(G)\). The *max leaf number* of *G* is the maximum number of leaves (degree one vertices) in any spanning tree of *G*.
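As an illustration of these definitions, the following sketch checks whether a given vertex set is a vertex cover or a feedback vertex set. It assumes a simple graph given as an edge list; the function names are hypothetical, and the acyclicity test uses a standard union-find structure.

```python
def is_vertex_cover(edges, cover):
    """True if every edge has at least one endpoint in `cover`."""
    cover = set(cover)
    return all(u in cover or v in cover for u, v in edges)

def is_feedback_vertex_set(vertices, edges, removed):
    """True if deleting `removed` leaves a graph with no cycles."""
    removed = set(removed)
    parent = {v: v for v in vertices if v not in removed}

    def find(v):
        # union-find lookup with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        if u in removed or v in removed:
            continue
        ru, rv = find(u), find(v)
        if ru == rv:
            return False          # this edge closes a cycle
        parent[ru] = rv
    return True

# A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(is_vertex_cover(edges, {0, 2}))                    # True
print(is_feedback_vertex_set(vertices, edges, {0, 2}))   # True
print(is_feedback_vertex_set(vertices, edges, {2}))      # True
print(is_feedback_vertex_set(vertices, edges, {3}))      # False
```

The example reflects the remark above: the vertex cover \(\{0,2\}\) is also a feedback vertex set, but a smaller feedback vertex set exists.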

A *tree decomposition* of a graph *G* is a pair \((T,\mathcal {D})\) where *T* is a tree and \(\mathcal {D} = \{\mathcal {D}(t): t \in V(T)\}\) is a collection of non-empty subsets of *V*(*G*) (or *bags*), indexed by the nodes of *T*, satisfying:

1. \(V(G) = \bigcup _{t \in V(T)} \mathcal {D}(t)\),

2. for every \(e=uv \in E(G)\), there exists \(t \in V(T)\) such that \(u,v \in \mathcal {D}(t)\),

3. for every \(v \in V(G)\), if *T*(*v*) is defined to be the subgraph of *T* induced by nodes *t* with \(v \in \mathcal {D}(t)\), then *T*(*v*) is connected.

We will assume that *T* has a distinguished root node *r*; if not, we may choose an arbitrary node to be the root. Given any node \(t \in V(T)\) we write \(V_t\) for the set of vertices of *G* that appear in bags indexed by *t* and the descendants of *t*.

If *T* is in fact a path, we say that \((T,\mathcal {D})\) is a *path decomposition* of *G*. The *width* of the tree decomposition \((T,\mathcal {D})\) is defined to be \(\max _{t \in V(T)} |\mathcal {D}(t)| - 1\), and the *treewidth* of *G*, written \(\mathrm{tw}(G)\), is the minimum width over all tree decompositions of *G*. The *pathwidth* of *G*, \(\mathrm{pw}(G)\), is the minimum width over all path decompositions of *G*.
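The three conditions and the definition of width translate directly into a checker. The sketch below is illustrative only: it assumes that `tree_edges` really does describe a tree on the keys of `bags` (this is not verified), and the names `is_tree_decomposition` and `width` are hypothetical.

```python
def is_tree_decomposition(vertices, edges, tree_edges, bags):
    """Check conditions 1-3 for (T, D); `bags` maps each tree node to a set of vertices."""
    nodes = set(bags)
    # Condition 1: the bags cover exactly the vertex set of G.
    if set(vertices) != set().union(*bags.values()):
        return False
    # Condition 2: both endpoints of every edge share some bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # Condition 3: the tree nodes whose bags contain v induce a connected subtree.
    adj = {t: set() for t in nodes}
    for s, t in tree_edges:
        adj[s].add(t)
        adj[t].add(s)
    for v in vertices:
        holding = {t for t in nodes if v in bags[t]}
        start = next(iter(holding))
        seen, stack = {start}, [start]
        while stack:
            t = stack.pop()
            for nb in adj[t] & holding:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        if seen != holding:
            return False
    return True

def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1

# A 4-cycle 0-1-2-3-0 with a two-bag path decomposition of width 2.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
bags = {"a": {0, 1, 3}, "b": {1, 2, 3}}
print(is_tree_decomposition(range(4), cycle, [("a", "b")], bags), width(bags))
```

Since the tree in this example is a single edge, the decomposition is also a path decomposition.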

We note that there is an FPT algorithm to compute a minimum-width tree decomposition of any graph *G*, where the treewidth of *G* is taken as the parameter [3]. Moreover, any such tree decomposition can be transformed into a so-called *nice* tree decomposition (having certain algorithmically useful properties) in linear time, without increasing the number of nodes by more than a constant factor [18].

## 2 Positive Results

In this section we identify a number of structural restrictions on the input graph that allow us to compute the maximum modularity of a graph, or a good approximation to this quantity, efficiently.

### 2.1 Parameterisation by Vertex Cover Number

In this section we demonstrate that Modularity is in \(\textsf {FPT}\) when parameterised by the vertex cover number of the input graph.

### Theorem 5

Modularity, parameterised by the cardinality of a minimum vertex cover for the input graph *G*, is in FPT.

Our strategy can be summarised as follows. We first observe that we may restrict our attention to partitions in which every part intersects the vertex cover. Moreover, the vertices outside the vertex cover can be classified into at most \(2^k\) “types” according to their neighbourhood (which by definition must be a subset of the vertex cover). We then argue that the modularity of a partition depends only on (1) the inherited partition of the vertex cover and (2) the number of (non-vertex-cover) vertices of each type that belong to each of the parts. Using this characterisation, we can reduce the problem of maximising the modularity to that of solving a collection of instances of Integer Quadratic Programming.

Before embarking on the proof of Theorem 5, we introduce some notation. Suppose that the graph \(G = (V,E)\) has \(|E| = m\), and that \(U = \{u_1,\ldots ,u_k\}\) is a vertex cover for *G*. Let \(\mathcal {P} = \{P_1,\ldots ,P_{\ell }\}\) be a partition of *U*, and set \(W = V {\setminus } U\) (so *W* is an independent set).

We can partition the vertices of *W* into \(2^k\) sets based on their *type*: the type \(\tau _U(w) \in \{0,1\}^k\) of a vertex \(w \in W\) describes which of the vertices in *U* are neighbours of *w*. Formally \(\tau _U(w)_j = 1\) if \(u_jw\in E(G)\) and \(\tau _U(w)_j = 0\) otherwise. For each \(\sigma \in \{0,1\}^k\), we set \(S_{\sigma }\) to be the set of all vertices in *W* with type exactly \(\sigma \), that is, \(S_\sigma = \{ w\in W : \tau (w)=\sigma \}\).

Now let \(\mathcal {A} = \{A_1,\ldots ,A_r\}\) be a partition of *V*. We write \(x_{\sigma ,i}^{\mathcal {A}}\) for the number of vertices of type \(\sigma \) which are assigned to \(A_i\), that is, \(x_{\sigma ,i}^{\mathcal {A}} = |S_{\sigma } \cap A_i|\). Finally, we introduce 0-1 vectors to encode the sets \(P_i \in \mathcal {P}\): for \(1 \le i \le \ell \), we let \(\pi ^i \in \{0,1\}^k\) be given by \(\pi _j^i = 1\) if \(u_j \in P_i\), and \(\pi _j^i = 0\) otherwise. An example is given in Fig. 1.
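The classification of vertices by type can be sketched as follows. This is an illustration under stated assumptions: the graph is given as an edge list, `cover` is an ordered list \(u_1,\ldots ,u_k\), and isolated vertices (which have the all-zero type) are not reported because they do not appear in any edge. The name `vertex_types` is hypothetical.

```python
from collections import Counter

def vertex_types(edges, cover):
    """Return (tau, sizes): the type of each non-cover vertex and |S_sigma| per type."""
    index = {u: j for j, u in enumerate(cover)}
    neighbours = {}
    for a, b in edges:
        if a in index and b in index:
            continue                      # edge inside U: does not affect types
        if a not in index and b not in index:
            raise ValueError("cover is not a vertex cover")
        # exactly one endpoint lies in the cover; call the other one w
        w, u = (a, b) if b in index else (b, a)
        neighbours.setdefault(w, set()).add(index[u])
    k = len(cover)
    tau = {w: tuple(1 if j in nb else 0 for j in range(k))
           for w, nb in neighbours.items()}
    return tau, Counter(tau.values())

def edges_to_part(sigma, pi):
    """Number of neighbours a vertex of type sigma has in the part encoded by pi."""
    return sum(s * p for s, p in zip(sigma, pi))

cover = [0, 1]
edges = [(0, 1), (0, 2), (0, 3), (1, 3), (1, 4), (0, 5)]
tau, sizes = vertex_types(edges, cover)
print(tau)
print(sizes)
print(edges_to_part(tau[3], (1, 0)))      # neighbours of vertex 3 in the part {u_1}
```

The dot product in `edges_to_part` is the quantity \(\sigma \cdot \pi ^i\) used to count edges between a vertex of type \(\sigma \) and the part \(P_i\).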

### Lemma 1

Let \(\mathcal {P} = \{P_1,\ldots ,P_{\ell }\}\) be a partition of *U*. If \(\mathcal {A}\) is any partition of *V* which extends \(\mathcal {P}\) and has the property that every \(A \in \mathcal {A}\) has non-empty intersection with *U*, then

### Proof

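Let \(\mathcal {A} = \{A_1,\ldots ,A_{\ell }\}\), where \(A_i \cap U = P_i\) for each *i*; we set \(B_i = A_i \cap W\) for each \(1 \le i \le \ell \). For any vertex \(w \in B_i\), we have that \(e(w,P_i)\) is given by the dot product \(\tau (w)\cdot \pi ^i\); thus the number of edges between \(P_i\) and \(B_i\) for each *i* is given by

\[ e(P_i,B_i) = \sum _{w \in B_i} \tau (w)\cdot \pi ^i = \sum _{\sigma \in \{0,1\}^k} x_{\sigma ,i}^{\mathcal {A}} \, (\sigma \cdot \pi ^i). \]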

We are now ready to prove the main result of this section.

### Proof of Theorem 5

We will assume that the input to our instance of Modularity is a graph \(G=(V,E)\), where \(|E| = m\). We may assume without loss of generality that we are also given as input a vertex cover \(U = \{u_1,\ldots ,u_k\}\) for *G* (as if not we can easily compute one in the allowed time). We may further assume that *G* does not contain any isolated vertices, as we can delete any such vertices (in polynomial time) without changing the value of the maximum modularity (by Fact 3).

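The number of partitions of *U* into non-empty parts is equal to the \(k^{th}\) *Bell number*, \(B_k\) (and hence is certainly at most \(k^k\)). It therefore suffices to describe an fpt-algorithm which determines, given some partition \(\mathcal {P}\) of *U*,

\[ q^{\mathcal {P}}(G) = \max \{ q_{\mathcal {A}}(G) : \mathcal {A} \text { is a partition of } V \text { which extends } \mathcal {P} \}. \]

The maximum modularity of *G* can then be calculated by taking

\[ q^*(G) = \max \{ q^{\mathcal {P}}(G) : \mathcal {P} \text { is a partition of } U \}. \]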

We now fix a partition \(\mathcal {P}\) of *U*, and describe how to compute \(q^{\mathcal {P}}(G)\). It follows from Facts 2 and 4, together with the fact that *W* is an independent set, that, if \(\mathcal {A} = \{A_1,\ldots ,A_j\}\) is a partition of *V* which achieves the maximum modularity, then every part \(A_i\) has non-empty intersection with *U*. We will call a partition with this property a *U*-partition of *G*. It then suffices to maximise the modularity over all *U*-partitions in order to determine the value of \(q^{\mathcal {P}}(G)\).
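The outer loop of this strategy ranges over all partitions of the vertex cover, so its length is a Bell number. As a small sanity check of how this compares with \(k^k\), the sketch below computes Bell numbers with the Bell triangle; the function name `bell_numbers` is a hypothetical helper, not part of the algorithm itself.

```python
def bell_numbers(k_max):
    """Return [B_0, ..., B_k_max], computed with the Bell triangle."""
    row, bells = [1], [1]
    for _ in range(k_max):
        new_row = [row[-1]]               # each row starts with the end of the previous one
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells

bells = bell_numbers(10)
for k in range(1, 11):
    print(k, bells[k], k ** k)
```

The gap between \(B_k\) and \(k^k\) widens quickly, but both depend only on the parameter *k*, which is all that is needed for an fpt-algorithm.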

By Lemma 1, the modularity of a *U*-partition \(\mathcal {A}\) can be expressed in terms of the values \(x_{\sigma ,i}^{\mathcal {A}}\). To find an optimal *U*-partition, we therefore need to find the values of \(x_{\sigma ,i}^{\mathcal {A}}\) which maximise this expression.

To maximise the modularity over all *U*-partitions it therefore suffices to determine, for all possible values of \(\theta (\mathcal {A})\) and \(\phi (\mathcal {A})\), the minimum possible value of \(\psi (\mathcal {A})\). Before describing how to do this, we observe that the number of combinations of possible values for \(\theta (\mathcal {A})\) and \(\phi (\mathcal {A})\) is not too large. Note that \(0 \le \sum _{\sigma ,i} x_{\sigma ,i}^{\mathcal {A}} (\sigma \cdot \pi ^i) < nk\), and \(0 \le \sum _{\sigma ,i} x_{\sigma ,i}^{\mathcal {A}} e(P_i)(\sigma \cdot (\mathbf {1} + \pi ^i))< n \left( {\begin{array}{c}k\\ 2\end{array}}\right) 2k < nk^3\), so the number of possible pairs \(\left( \theta (\mathcal {A}),\phi (\mathcal {A})\right) \) is at most \(n^2k^4\). Thus, if we know the minimum possible value of \(\psi (\mathcal {A})\) corresponding to each possible pair \(\left( \theta (\mathcal {A}),\phi (\mathcal {A})\right) \), we can compute the maximum modularity achieved by any *U*-partition \(\mathcal {A}\) such that \(\left( \theta (\mathcal {A}),\phi (\mathcal {A})\right) = (y,z)\), and maximising over the polynomial number of possible pairs (*y*, *z*) will give \(q^{\mathcal {P}}(G)\).

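Given a possible pair of values (*y*, *z*) for \(\left( \theta (\mathcal {A}),\phi (\mathcal {A})\right) \), we describe how to compute the minimum value of \(\psi (\mathcal {A})\) over all *U*-partitions \(\mathcal {A}\) with \(\left( \theta (\mathcal {A}),\phi (\mathcal {A})\right) = (y,z)\), by solving an instance of Integer Quadratic Programming. The matrix *Q* expresses the value of \(\psi (\mathcal {A})\) in terms of the vector \(\mathbf {x}\) of values \(x_{\sigma ,i}^{\mathcal {A}}\), and the largest absolute value of any entry in *Q* is at most \(4k^2\).

The linear constraints must ensure that:

1. \(\theta (\mathcal {A}) = y\),

2. \(\phi (\mathcal {A}) = z\), and

3. the values \(x_{\sigma ,i}\) correspond to a valid *U*-partition \(\mathcal {A}\).

Note that the values \(x_{\sigma ,i}\) correspond to a valid *U*-partition if and only if every \(x_{\sigma ,i}\) is non-negative, and for each \(\sigma \) we have \(\sum _{i = 1}^{\ell } x_{\sigma ,i}^{\mathcal {A}} = |S_{\sigma }|\).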

We can therefore express all three conditions in the form \(A\mathbf {x} \le \mathbf {b}\), where *A* is a \(\left( 4 + (\ell +1)2^k\right) \times n\) matrix and \(\mathbf {b}\) is a \(\left( 4 + (\ell +1)2^k\right) \)-dimensional vector (notice that we use two inequalities to express each of the linear equality constraints).

Altogether, this means that the solution to this Integer Quadratic Programming instance will determine the values of \(x_{\sigma ,i}^{\mathcal {A}}\) which minimise (out of all values corresponding to some *U*-partition \(\mathcal {A}\)) the value of \(\psi (\mathcal {A})\), subject to the additional requirement that \(\theta (\mathcal {A}) = y\) and \(\phi (\mathcal {A}) = z\). Note that the number of variables *n* is at most \(k2^k\) and the largest absolute value of any entry in *A* or *Q* is at most \(2k^3\), so the parameter in the instance of Integer Quadratic Programming is bounded by a function of *k*. This completes the proof. \(\square \)

We note that the algorithm described can easily be modified to output an optimal partition.

### 2.2 Parameterisation by Treewidth

In this section we demonstrate that Modularity, when parameterised by the treewidth of the input graph *G*, belongs to \(\textsf {XP}\) and so is solvable in polynomial time on graph classes whose treewidth is bounded by some fixed constant. We further show that for any fixed \(\varepsilon >0\) there is an FPT-algorithm, parameterised by treewidth, which computes a factor \((1-\varepsilon )\)-approximation; i.e. returning a value between \((1-\varepsilon )q^*\) and \(q^*\) where \(q^*\) is the maximum modularity of the graph.

### Theorem 6

Modularity parameterised by the treewidth of the input graph *G* is in XP.

### Proof

As the proof makes use of standard dynamic programming techniques on tree decompositions, we only give an outline proof here. Suppose that *G* has *n* vertices and *m* edges, and has treewidth *k*. We will assume that we are given a nice tree decomposition \((T,\mathcal {D})\) (where *T* is a tree and \(\mathcal {D} = \{\mathcal {D}(t): t \in V(T)\}\)) of *G*, of width *k*, as part of the input (if not we can compute one in FPT time).

The proof relies heavily on Fact 2. This means we can compute the optimum modularity without considering partitions that induce disconnected subgraphs; hence, for any node \(t \in V(T)\), we need only consider partitions \(\mathcal {A}\) with the property that, if \(A \in \mathcal {A}\) does not intersect \(\mathcal {D}(t)\), then all vertices in *A* only appear in bags indexed by nodes in precisely one connected component of \(T {\setminus } t\).

We compute the modularity by working upwards from the leaves in the standard way. As we do this, we need to keep track of relevant statistics for the parts that intersect the current bag (*liquid* parts) and also the total contribution to the modularity from the parts (*frozen* parts) which contain only vertices from bags indexed by descendants of the current node (and so by the reasoning above cannot accept more vertices from elsewhere in the graph).

For any node \(t \in V(T)\), a valid *state* of *t* consists of the following:

1. a partition \(\mathcal {P}\) of \(\mathcal {D}(t)\);

2. a function \(\alpha : \mathcal {P} \rightarrow [m]\) such that \(\alpha (P_i) \ge e(P_i)\) for each \(P_i \in \mathcal {P}\);

3. a function \(\beta : \mathcal {P} \rightarrow [2m]\) such that \(\beta (P_i) \ge {{\,\mathrm{vol}\,}}(P_i)\) for each \(P_i \in \mathcal {P}\).

Notice that the number of possible states for any node *t* is at most \((k+1)^{(k+1)} \cdot m^{(k+1)} \cdot (2m)^{(k+1)} = m^{\mathcal {O}(k)}\).

For each node *t*, we need to keep track of the maximum contribution to modularity from frozen parts we can achieve consistent with the liquid parts having the specified state: this is done with a function \(\sigma _t\), the *signature* of *t*. Given any state \((\mathcal {P},\alpha ,\beta )\) of *t*, we first define a \((t,\mathcal {P},\alpha ,\beta )\)-*partition* to be any partition \(\mathcal {A}\) of \(V_t\) such that:

1. \(\mathcal {P} = \mathcal {A}[\mathcal {D}(t)]\);

2. for all \(A \in \mathcal {A}\) with \(A \cap \mathcal {D}(t) \ne \emptyset \): \(\alpha \left( A \cap \mathcal {D}(t)\right) = e(A)\), and \(\beta \left( A \cap \mathcal {D}(t) \right) = {{\,\mathrm{vol}\,}}(A)\).

It is clear that, with knowledge of \(\sigma _r\) for the root *r* of the tree decomposition, we can easily determine the maximum modularity of *G*. It therefore remains to outline how we compute \(\sigma _t\) for the four types of node in the nice tree decomposition, using only information about the values of \(\sigma _{t'}\) where \(t'\) is a child of *t*. We begin by observing that if *t* is a leaf node then we can exhaustively consider all possibilities in time depending only on *k*.

Now suppose that *t* is an introduce node with child \(t'\), where \(\mathcal {D}(t) = \mathcal {D}(t') \cup \{v\}\). Given any state \((\mathcal {P},\alpha ,\beta )\) of *t*, we say that a state \((\mathcal {P}',\alpha ',\beta ')\) of \(t'\) is introduce-compatible with \((\mathcal {P},\alpha ,\beta )\) if:

- \(\mathcal {P}' = \mathcal {P} {\setminus } \{v\}\);

- for every \(P \in \mathcal {P}\), if \(v \notin P\) then \(\alpha '(P) = \alpha (P)\), and if \(v \in P\) (but \(P {\setminus } \{v\} \ne \emptyset \)) then \(\alpha '(P) = \alpha (P) - |\{u \in P: uv \in E(G)\}|\);

- for every \(P \in \mathcal {P}\), if \(v \notin P\) then \(\beta '(P) = \beta (P)\), and if \(v \in P\) (but \(P {\setminus } \{v\} \ne \emptyset \)) then \(\beta '(P) = \beta (P) - d(v)\).

Next suppose that *t* is a forget node with child \(t'\), where \(\mathcal {D}(t) = \mathcal {D}(t') {\setminus } \{v\}\). Given any state \((\mathcal {P},\alpha ,\beta )\) of *t*, we define two functions \(\sigma _t^1\) and \(\sigma _t^2\); these functions correspond to the case where one of the parts that is liquid at \(t'\) becomes frozen at *t* (if *v* was the last vertex in its part), and the case where all parts that are liquid at \(t'\) remain liquid at *t*, respectively. We set

Finally, suppose that *t* is a join node with children \(t_1\) and \(t_2\), where \(\mathcal {D}(t_1) = \mathcal {D}(t_2) = \mathcal {D}(t)\). In this case we see that

For a constant *c*, we write \(q_{\le c}(G)\) for the maximum modularity of *G* achievable with a partition into at most *c* parts, that is

\[ q_{\le c}(G) = \max _{|\mathcal {A}| \le c} q_{\mathcal {A}}(G). \]

We refer to the problem of deciding whether \(q_{\le c}(G) \ge q\), for a given graph *G* and constant \(q \in [0,1]\), as *c*-Modularity. We now argue that *c*-Modularity is in \(\textsf {FPT}\) parameterised by the treewidth of the input graph. The crucial difference from our XP algorithm above is the fact that, when we fix the number of parts in the partition, we can no longer assume that every part is connected. However, if the maximum number of parts *c* is a constant, we can keep track of the necessary statistics for every possible part, not just those that intersect the bag under consideration.

### Lemma 2

*c*-Modularity is in FPT when parameterised by the treewidth of the input graph.

### Proof

We again use dynamic programming over a nice tree decomposition \((T,\mathcal {D})\) of *G* of width *k*, but now keep track of statistics for each of the *c* (possibly empty) parts allowed in the partition. Formally, for any node \(t \in V(T)\), a valid state of *t* consists of:

1. a function \(\pi : \mathcal {D}(t) \rightarrow [c]\);

2. a function \(\alpha :[c] \rightarrow [m]\) such that \(\alpha (i) \ge e(\pi ^{-1}(i))\) for all \(i \in [c]\);

3. a function \(\beta :[c] \rightarrow [2m]\) such that \(\beta (i) \ge {{\,\mathrm{vol}\,}}(\pi ^{-1}(i))\) for all \(i \in [c]\).

Here \(\pi \) tells us to which of the *c* possible parts each vertex of the bag is assigned, \(\alpha \) keeps track of the number of edges captured so far in each of the *c* parts, and \(\beta \) the volume so far of each part. Notice that the number of possible states for any node *t* is at most \(c^{k+1} \cdot m^{c} \cdot (2m)^c = c^{k+1} m^{\mathcal {O}(c)}\).

Given any state \((\pi ,\alpha ,\beta )\) of *t*, we define a \((t,\pi ,\alpha ,\beta )\)-*partition* to be any partition \(\mathcal {A} = \{A_1,\ldots ,A_c\}\) of \(V_t\) such that:

1. \(v \in A_{\pi (v)}\) for each \(v \in \mathcal {D}(t)\);

2. for each \(i \in [c]\): \(\alpha (i) = e(A_i)\), and \(\beta (i) = {{\,\mathrm{vol}\,}}(A_i)\).

If *r* is the root of the tree decomposition,

If *t* is a leaf node we can consider all possibilities in time depending only on *k* and *c*; we now outline how to compute the values of \(\theta _t\) for a node *t*, given the values for its children.

Suppose that *t* is an introduce node with child \(t'\), where \(\mathcal {D}(t) = \mathcal {D}(t') \cup \{v\}\). Given any state \((\pi ,\alpha ,\beta )\) of *t*, we say that a state \((\pi ',\alpha ',\beta ')\) of \(t'\) is introduce-compatible with \((\pi ,\alpha ,\beta )\) if:

- \(\pi ' = \pi |_{\mathcal {D}(t')}\);

- for every \(i \in [c]\), if \(\pi (v) \ne i\) then \(\alpha '(i) = \alpha (i)\), and if \(\pi (v) = i\) then \(\alpha '(i) = \alpha (i) - |\{u \in \pi ^{-1}(i): uv \in E(G)\}|\);

- for every \(i \in [c]\), if \(\pi (v) \ne i\) then \(\beta '(i) = \beta (i)\), and if \(\pi (v) = i\) then \(\beta '(i) = \beta (i) - d(v)\).

Suppose that *t* is a forget node with child \(t'\), where \(\mathcal {D}(t) = \mathcal {D}(t') {\setminus } \{v\}\). In this case we have

Suppose that *t* is a join node with children \(t_1\) and \(t_2\), where \(\mathcal {D}(t_1) = \mathcal {D}(t_2) = \mathcal {D}(t)\). In this case we see that

Recall (Fact 1) that \(q^*(G) \ge q_{\le c}(G) > q^*(G)\big (1-\frac{1}{c}\big )\); thus, for any constant \(\varepsilon > 0\), we obtain a factor \((1-\varepsilon )\)-approximation by solving \(\lceil \frac{1}{\varepsilon } \rceil \)-Modularity. This immediately gives the following result.

### Corollary 1

Given any constant \(\varepsilon > 0\), there is an FPT-algorithm, parameterised by the treewidth of the input graph *G*, that returns a partition \(\mathcal {A}\) with \(q_{\mathcal {A}}(G)>(1-\varepsilon )q^*(G)\).

We conclude this section by noting that sparse graphs, in particular graphs *G* with low treewidth \(\mathrm{tw}(G)\) and low maximum degree \(\Delta (G)\), can have high maximum modularity. In particular, Theorem 1.11 of [21] shows \(q^*(G)\ge 1-2((\mathrm{tw}(G)+1)\Delta (G)/|E(G)|)^{1/2}\).
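To get a feel for this bound, it can be evaluated numerically. The sketch below simply evaluates the right-hand side of the inequality quoted from [21]; the function name `modularity_lower_bound` is a hypothetical helper.

```python
import math

def modularity_lower_bound(treewidth, max_degree, num_edges):
    """Evaluate 1 - 2 * sqrt((tw + 1) * Delta / |E|)."""
    return 1 - 2 * math.sqrt((treewidth + 1) * max_degree / num_edges)

# A path on 10001 vertices: treewidth 1, maximum degree 2, 10000 edges.
print(modularity_lower_bound(1, 2, 10_000))
# The bound is vacuous when (tw + 1) * Delta is large compared with |E|.
print(modularity_lower_bound(3, 3, 6))
```

For the long path the bound is approximately 0.96, while for a graph with only six edges, treewidth 3 and maximum degree 3 (such as \(K_4\)) it is negative and so gives no information.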

### 2.3 Parameterisation by Max Leaf Number

In this section we demonstrate that Modularity can be solved in time polynomial in the number of connected subgraphs of the input graph *G*; as a consequence of this result, we deduce that the problem belongs to \(\textsf {XP}\) when parameterised by the max leaf number of *G*.

### Theorem 7

Let *G* be a graph on *n* vertices with *m* edges and at most *h* connected subgraphs. Then Modularity can be solved in time \(\mathcal {O}(h^2n)\).

### Proof

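Without loss of generality we may assume that *G* contains no isolated vertices. For any induced subgraph *H* of *G*, and partition \(\mathcal {A}_H\) of *V*(*H*), we write

\[ q_{\mathcal {A}_H}(H,G) = \sum _{A \in \mathcal {A}_H} \left( \frac{e(A)}{m} - \frac{{{\,\mathrm{vol}\,}}(A)^2}{4m^2} \right) , \]

where \({{\,\mathrm{vol}\,}}(A)\) denotes the volume of *A* in *G*. We then set

\[ q^*(H,G) = \max _{\mathcal {A}_H} q_{\mathcal {A}_H}(H,G), \]

where the maximum is over all partitions \(\mathcal {A}_H\) of *V*(*H*). Thus, \(q^*(H,G)\) can be seen as the maximum possible contribution of parts contained in *H* to the modularity of *G*, if we only consider partitions of *V*(*G*) such that every part is either completely contained in *V*(*H*) or does not intersect *V*(*H*).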

Let *H* be a connected subgraph of *G*. Then, for any partition \(\mathcal {A}_H\) of *V*(*H*) with \(|\mathcal {A}_H| > 1\), such that each part induces a connected subgraph, it is clear that there exists a partition (*X*, *Y*) of *V*(*H*) into two non-empty sets such that *H*[*X*] and *H*[*Y*] are both connected, and every element of \(\mathcal {A}_H\) is completely contained in either *X* or *Y*. Conversely, if (*X*, *Y*) is a partition with this property it is immediate that partitions of *X* and *Y* can be combined to give a partition of *V*(*H*). For any connected graph *H*, we write \(\mathcal {P}(H)\) for the set of all partitions (*X*, *Y*) of *V*(*H*) into two non-empty sets such that *G*[*X*] and *G*[*Y*] are both connected. Since we need only consider partitions in which every part induces a connected subgraph (by Fact 2), it follows that

\[ q^*(H,G) = \max \left\{ \frac{e(H)}{m} - \frac{{{\,\mathrm{vol}\,}}(H)^2}{4m^2},\ \max _{(X,Y) \in \mathcal {P}(H)} \big ( q^*(G[X],G) + q^*(G[Y],G) \big ) \right\} . \qquad (4) \]

By assumption, *G* has only *h* connected induced subgraphs. We note that, with suitable data structures, we can compute a list of all such subgraphs in time \(\mathcal {O}(nh)\). To enumerate all connected induced subgraphs containing the vertex *v*, we can explore a search tree as follows: we associate the pair \((\{v\},V(G) {\setminus } \{v\})\) with the root and, on reaching a node associated with the pair (*U*, *W*), we select an arbitrary vertex \(x \in W\) such that \(N(x) \cap U \ne \emptyset \) (if such a vertex exists), and create two child nodes associated with \((U \cup \{x\}, W {\setminus } \{x\})\) and \((U, W {\setminus } \{x\})\) respectively. When this process terminates, the vertex-set of every connected induced subgraph appears as the first element of the tuple for exactly one leaf node. Repeating the process for each vertex in the graph (after deleting those starting vertices already considered) will produce a list of all connected induced subgraphs.
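The search-tree enumeration described above can be sketched as follows. This is an illustration rather than an optimised implementation (in particular it does not attempt to meet the \(\mathcal {O}(nh)\) bound); the graph is assumed to be given as a dictionary mapping each vertex to the set of its neighbours, and the name `connected_vertex_sets` is hypothetical.

```python
def connected_vertex_sets(adj):
    """Return the vertex sets of all connected induced subgraphs of `adj`."""
    found = []
    removed = set()                        # start vertices already processed
    for v in adj:
        # root of the search tree: ({v}, remaining vertices other than v)
        stack = [(frozenset([v]), frozenset(adj) - removed - {v})]
        while stack:
            inside, candidates = stack.pop()
            # pick any candidate with a neighbour in the current set
            x = next((c for c in candidates if adj[c] & inside), None)
            if x is None:
                found.append(inside)       # leaf: the set cannot be extended
                continue
            stack.append((inside | {x}, candidates - {x}))   # child where x is included
            stack.append((inside, candidates - {x}))         # child where x is excluded
        removed.add(v)
    return found

path = {0: {1}, 1: {0, 2}, 2: {1}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
print(len(connected_vertex_sets(path)))       # 6
print(len(connected_vertex_sets(triangle)))   # 7
```

Each connected vertex set is reported exactly once, at the leaf reached by including precisely its own vertices, in the iteration for the first of its vertices in the processing order.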

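Let \(H_1,\ldots ,H_h\) be the connected induced subgraphs of *G*; without loss of generality we may further assume that these subgraphs are listed in non-decreasing order of their number of vertices. In particular, this means that there is no connected induced subgraph that is strictly contained in \(H_1\), so \(\mathcal {P}(H_1) = \emptyset \) and \(q^*(H_1,G) = \frac{1}{m}e(H_1) - \frac{1}{4m^2} {{\,\mathrm{vol}\,}}(H_1)^2\). We can reformulate (4) as follows:

\[ q^*(H_j,G) = \max \left\{ \frac{e(H_j)}{m} - \frac{{{\,\mathrm{vol}\,}}(H_j)^2}{4m^2},\ \max _{\begin{array}{c} i,i' < j \\ (V(H_i),V(H_{i'})) \in \mathcal {P}(H_j) \end{array}} \big ( q^*(H_i,G) + q^*(H_{i'},G) \big ) \right\} . \]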

*G* has connected components \(C_1,\ldots ,C_{\ell }\), where \(V(C_i) = V_i\) for each *i*. By Fact 2 (see also Lemma 1.6.2 of [27]), we can restrict our attention to partitions \(\mathcal {A}\) of *V*(*G*) such that every part is completely contained in some \(V_i\),

*G*, it occurs in the list \(H_1,\ldots ,H_h\) of connected induced subgraphs. Thus, once we have computed \(q^*(H,G)\) for each connected induced subgraph *H*, we can immediately determine \(q^*(G)\) by summing the appropriate values. Hence the overall time required to compute \(q^*(G)\) is \(\mathcal {O}(h^2n)\). \(\square \)
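The dynamic programme over connected induced subgraphs can be sketched as follows for a connected input graph. The sketch rests on several assumptions that should be checked against the formal statement: it takes the recurrence to be \(q^*(H,G) = \max \{ e(H)/m - {{\,\mathrm{vol}\,}}(H)^2/(4m^2),\ \max _{(X,Y)\in \mathcal {P}(H)} (q^*(H[X],G) + q^*(H[Y],G)) \}\), it uses the standard normalisation of modularity, and it replaces the search-tree enumeration by brute force over all vertex subsets, so it is only suitable for very small graphs. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def is_connected(adj, vertices):
    """True if the subgraph induced by `vertices` is connected."""
    S = set(vertices)
    start = next(iter(S))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] & S:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == S


def max_modularity_connected(adj):
    """Maximum modularity of a connected graph given as {vertex: set_of_neighbours}."""
    m = sum(len(nb) for nb in adj.values()) // 2
    verts = sorted(adj)
    # Connected vertex sets in non-decreasing order of size (brute force).
    subgraphs = [
        frozenset(c)
        for k in range(1, len(verts) + 1)
        for c in combinations(verts, k)
        if is_connected(adj, c)
    ]
    best = {}
    for S in subgraphs:
        e = sum(len(adj[u] & S) for u in S) // 2   # edges inside S
        vol = sum(len(adj[u]) for u in S)          # sum of degrees in G
        value = Fraction(e, m) - Fraction(vol * vol, 4 * m * m)
        anchor = min(S)  # fix one vertex in X so each split is tried once
        for X in subgraphs:
            if anchor in X and X < S and (S - X) in best:
                value = max(value, best[X] + best[S - X])
        best[S] = value
    return best[frozenset(verts)]
```

Because the subgraphs are processed in order of size, both halves of every split have already been evaluated when a set is reached. For a disconnected graph the values for the components would be summed, with *m* still counting all edges of *G*.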

It is known that, if the max leaf number of *G* is *c*, then *G* is a subdivision of some graph *H* on at most 4*c* vertices [13]; a graph on *n* vertices that is a subdivision of such a graph *H* has at most \(2^{4c}n^{(4c)^2}\) connected subgraphs (once we have decided which branch vertices belong to a subgraph, it remains only to decide where to cut each path from one of the chosen branch vertices to one we have not chosen). Thus, if the max leaf number of *G* is bounded by a constant it follows that *G* has at most a polynomial number of connected subgraphs, and the following result is an immediate consequence of Theorem 7.
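As a concrete illustration of this polynomial growth, a path on *n* vertices has \(n(n+1)/2\) connected induced subgraphs and a cycle on *n* vertices has \(n(n-1)+1\). The brute-force count below checks these two formulas for small *n*; the helper names are hypothetical and the code is a sanity check only, not part of the algorithm.

```python
from itertools import combinations


def count_connected_induced(adj):
    """Count vertex subsets inducing a connected subgraph (brute force, small n only)."""
    def connected(S):
        start = next(iter(S))
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in adj[u] & S:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen == S

    verts = sorted(adj)
    return sum(
        connected(set(c))
        for k in range(1, len(verts) + 1)
        for c in combinations(verts, k)
    )


def path(n):
    """Adjacency dictionary of the path on vertices 0..n-1."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


def cycle(n):
    """Adjacency dictionary of the cycle on vertices 0..n-1 (n >= 3)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
```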

### Corollary 2

Modularity is in XP when parameterised by the max leaf number of the input graph *G*.

We conjecture that this result is not optimal, and that Modularity is in fact in \(\textsf {FPT}\) with respect to this parameterisation.

## 3 Hardness results

In this section we complement our positive result about the FPT approximability of the problem parameterised by treewidth by demonstrating that computing the exact value of the maximum modularity is hard even in a more restricted setting.

### Theorem 8

Modularity, parameterised simultaneously by the pathwidth and the size of a minimum feedback vertex set for the input graph, is W[1]-hard.

*r*, \(\mathrm{pw}(G)\) and \(\mathrm{fvs}(G)\). In proving this hardness result, the authors implicitly consider the following variation of ECP.

From the proof of [12, Theorem 1] we can extract the following statement about the hardness of AECP.

### Lemma 3

- 1. *H* is connected;
- 2. the graph \(H'\) obtained from *H* by deleting all vertices of degree one is a subdivision of a 3-regular graph \(\tilde{H}\);
- 3. the branch vertices of \(H'\) (i.e. vertices of \(\tilde{H}\)) are precisely the anchor vertices \(a_1,\ldots ,a_r\);
- 4. \(r\ge 4\) is even and divides \(|V_H|\);
- 5. \(H {\setminus } \{a_1,\ldots ,a_r\}\) is a disjoint union of isolated vertices and paths with pendant edges.

*B*. For \(m\ge 1\) and vertex subset *B* with \({{\,\mathrm{vol}\,}}(B)\ge 1\) we define

\[f_m(B) = \frac{\partial (B)}{{{\,\mathrm{vol}\,}}(B)} + \frac{{{\,\mathrm{vol}\,}}(B)}{2m}.\]

For a set *B* with \(\partial (B)=4\) the modularity maximising volume is \({{\,\mathrm{vol}\,}}(B)=2\sqrt{2m}\). Moreover, while it would usually be better to take parts with \(\partial (B)<4\), these parts are actually worse (i.e. have a higher \(f_m(B)\) value) if their volumes are too big or too small. The function \(f_m(B)\) plays a similar role to the *n-cost* in Proposition 1 of [21].

### Lemma 4

- 0: if \(\partial (B)=0\) and \({{\,\mathrm{vol}\,}}(B)>4 \sqrt{2m}\) then \(f_m(B)>2\sqrt{2/m}\).
- 1: if \(\partial (B)=1\) and \({{\,\mathrm{vol}\,}}(B)>3.7321 \sqrt{2m}\) or \({{\,\mathrm{vol}\,}}(B)<0.2679 \sqrt{2m}\) then \(f_m(B)>2\sqrt{2/m}\).
- 2: if \(\partial (B)=2\) and \({{\,\mathrm{vol}\,}}(B)>3.4143 \sqrt{2m}\) or \({{\,\mathrm{vol}\,}}(B)<0.5857 \sqrt{2m}\) then \(f_m(B)>2\sqrt{2/m}\).
- 3: if \(\partial (B)=3\) and \({{\,\mathrm{vol}\,}}(B)>3\sqrt{2m}\) or \({{\,\mathrm{vol}\,}}(B)<\sqrt{2m}\) then \(f_m(B)>2\sqrt{2/m}\).
- 4: if \(\partial (B)=4\) and \({{\,\mathrm{vol}\,}}(B)\ge 2\sqrt{2m}\) then \(f_m(B) \ge 2\sqrt{2/m}\) with equality iff \({{\,\mathrm{vol}\,}}(B)=2\sqrt{2m}\).
- 5: if \(\partial (B)\ge 5\) then \(f_m(B)>2\sqrt{2/m}\).
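The numerical constants in the cases above can be recovered with a short calculation. Writing \({{\,\mathrm{vol}\,}}(B) = x\sqrt{2m}\) and \(\partial (B)=\ell \), and using \(f_m(B)=\ell /{{\,\mathrm{vol}\,}}(B)+{{\,\mathrm{vol}\,}}(B)/(2m)\), the inequality \(f_m(B) > 2\sqrt{2/m}\) is equivalent to \(x^2 - 4x + \ell > 0\), so the thresholds are the roots \(2 \pm \sqrt{4-\ell }\). The following sketch (with hypothetical function names) computes these roots and evaluates \(f_m\) directly.

```python
import math


def f(m, boundary, vol):
    """f_m(B) for a set with the given edge boundary and volume."""
    return boundary / vol + vol / (2 * m)


def critical_multipliers(boundary):
    """Roots of x^2 - 4x + boundary = 0, where vol(B) = x * sqrt(2m).

    f_m(B) > 2*sqrt(2/m) exactly when x lies strictly outside the two roots;
    returns None when there are no real roots (the inequality then always holds).
    """
    disc = 4 - boundary
    if disc < 0:
        return None
    root = math.sqrt(disc)
    return 2 - root, 2 + root
```

For \(\ell =1\) the roots are \(2\pm \sqrt{3}\) and for \(\ell =2\) they are \(2\pm \sqrt{2}\); the decimal constants in the statement appear to be these values rounded outwards, so that the stated conditions remain sufficient.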

### Proof

*B* with a constant number, \(\ell \), of edges to the rest of the graph (so \(\partial (B)=\ell \)). For \(\ell =0\) one can check directly that if \({{\,\mathrm{vol}\,}}(B)>4\sqrt{2m}\) then \(f_m(B)>2\sqrt{2/m}\), which establishes part 0. Thus we may assume \(\ell \ge 1\). By definition, \(f_m(B)=\ell /{{\,\mathrm{vol}\,}}(B)+{{\,\mathrm{vol}\,}}(B)/(2m)\) and so

We are now ready to prove Theorem 8.

### Proof of Theorem 8

We give a reduction from AECP. Suppose that \((H,\{a_1,\ldots ,a_r\})\) is the input to an instance of AECP; we will describe how to construct a graph *G*, where \(\mathrm{pw}(G)\) and \(\mathrm{fvs}(G)\) are both bounded by a function of *r*, together with an explicit \(q_0 \in (0,1)\) such that \((G,q_0)\) is a yes-instance for Modularity if and only if \((H,\{a_1,\ldots ,a_r\})\) is a yes-instance for AECP.

We may assume without loss of generality that our instance of AECP satisfies all of the conditions of Lemma 3.

*G*, obtained from *H* by adding the following (see Fig. 2):

- \(\alpha \) new leaves adjacent to each anchor vertex \(a_1,\ldots ,a_r\),
- \(\beta \) isolated edges disjoint from *G*, and
- an arbitrary perfect matching on the anchor vertices \(a_1,\ldots ,a_r\),

The idea of the construction is that the \(\alpha \) edges help ensure that each anchor vertex is in a separate part of any modularity optimal partition and the \(\beta \) edges allow us to get the numbers to work at the end of the proof. Notice that, even with these modifications, \(G {\setminus } \{a_1,\ldots ,a_r\}\) is still a disjoint union of isolated vertices and paths with pendant edges; hence \(\mathrm{pw}(G) \le r+1\) and \(\mathrm{fvs}(G) \le r\). We set \(m=|E(G)|\) so \(m=|E(H)| + \alpha r+\beta +r/2\).
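The construction and the resulting edge count can be illustrated with a small sketch. The edge-list representation, the function name `build_G`, the tuple labels for new vertices and the choice of matching (consecutive anchors paired up) are all assumptions made for the example; the values of \(\alpha \) and \(\beta \) used with it are toy values, not the ones required by the proof.

```python
def build_G(h_edges, anchors, alpha, beta):
    """Edge list of G built from the edge list of H and its anchor vertices.

    New vertices are labelled with tuples so they cannot clash with vertices of H.
    The number of anchors is assumed to be even.
    """
    edges = list(h_edges)
    # alpha new leaves adjacent to each anchor vertex
    for a in anchors:
        edges += [(a, ("leaf", a, j)) for j in range(alpha)]
    # beta isolated edges on fresh vertices
    edges += [(("iso", j, 0), ("iso", j, 1)) for j in range(beta)]
    # one arbitrary perfect matching on the anchors: pair consecutive anchors
    edges += [(anchors[i], anchors[i + 1]) for i in range(0, len(anchors), 2)]
    return edges
```

The length of the returned list is \(|E(H)| + \alpha r+\beta +r/2\), in agreement with the expression for *m* above.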

*G* without the vertices supporting the \(\beta \) isolated edges, and let the minimisation be over \(\mathcal {A}'\) which are vertex partitions of \(V'\). We then have

*A* has zero volume, also implies that (7) \(\ge \) (8) with equality if and only if \(f_m(A)=\min _{B \subset V'} f_m(B)\) for every \(A\in \mathcal {A}'\).

Note that, since \(\mathcal {A}'\) is the restriction of some modularity optimal partition \(\mathcal {A}\) to a connected component of *G*, we may assume that, for all \(A\in \mathcal {A}'\), *G*[*A*] is connected. Moreover, if *v* is a pendant vertex adjacent to *u* then *u* and *v* are in the same part in \(\mathcal {A}'\); we call a partition with this last property (or, abusing notation, a set that would not violate this condition in a partition) ‘pendant-consistent’.

We now make the following claim, writing \(s = |H|/r\) for the desired part size in our instance of AECP.

### Claim 9

- a) for any connected, pendant-consistent set \(B\subseteq V'\) we have \(f_m(B)\ge 2\sqrt{2/m}\), and if \(f_m(B) = 2\sqrt{2/m}\) then *B* contains exactly one anchor and \({{\,\mathrm{vol}\,}}(B)=2\sqrt{2m}\);
- b) if \((H,\{a_1,\ldots ,a_r\})\) is a yes-instance, then there is a vertex partition \(\mathcal {A}'\) of \(V'\) so that \(f_m(A)=2\sqrt{2/m}\) for all \(A \in \mathcal {A}'\);
- c) if there is a vertex partition \(\mathcal {A}'=\{A_1, \ldots , A_r\}\) of \(V'\) such that, for all \(A_i\in \mathcal {A}'\), \(f_m(A_i)=2\sqrt{2/m}\), \(A_i\) is pendant-consistent and \(G[A_i]\) is connected, then \((H,\{a_1,\ldots ,a_r\})\) is a yes-instance.

Claim 9(b), together with line (7), implies that, if \((H,\{a_1,\ldots ,a_r\})\) is a yes-instance, then so is \((G,q_0)\). Conversely, if \((G,q_0)\) is a yes-instance, it follows from Claim 9(c) that \((H,\{a_1,\ldots ,a_r\})\) is a yes-instance.

*r* is even. This, along with our parity constraint between \(\alpha \) and *s*, implies that \((\alpha +s+1)^2-r\) is even. Thus we can choose \(\beta \) to be

### Proof of Claim 9(a):

We begin by showing that our two assumptions \(\alpha >32|E(H)|^2\) and \(\alpha +s+1=\sqrt{2m}\) imply that \(\alpha > 0.969\sqrt{2m}\). Recall that *H* without pendant edges is a subdivision of a cubic graph and so the average degree in *H* is at least two. Thus \(|E(H)|\ge |H|\). Also \(r\ge 4\), so \(|H|\ge 4s \ge s+1\) and hence \(|E(H)|\ge s+1\). By assumption \(\alpha > 32 |E(H)|^2 \ge 32(s+1)\). But also by assumption \(\alpha +s+1=\sqrt{2m}\), and so \(\alpha \ge (32/33)\sqrt{2m} > 0.969\sqrt{2m}\).
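The arithmetic linking \(\alpha \), \(\beta \), *s* and *m* can be checked on concrete numbers. The sketch below is an illustration under stated assumptions: it picks the smallest \(\alpha \) exceeding \(32|E(H)|^2\) for which \(\alpha +s+1\) is even, and then takes \(\beta \) to be the value forced by combining \(m=|E(H)| + \alpha r+\beta +r/2\) with \(\alpha +s+1=\sqrt{2m}\). The function name and the sample instance size are hypothetical.

```python
def choose_parameters(e_H, r, s):
    """Return (alpha, beta, m) consistent with the constraints discussed above.

    e_H is |E(H)|, r the (even) number of anchors and s the target part size.
    """
    alpha = 32 * e_H ** 2 + 1          # alpha > 32 |E(H)|^2
    if (alpha + s + 1) % 2:            # parity: make alpha + s + 1 even
        alpha += 1
    t = alpha + s + 1                  # intended value of sqrt(2m)
    # Solve 2m = t^2 for beta, using m = e_H + alpha*r + beta + r/2;
    # t^2 - r is even because both t and r are even.
    beta = (t * t - r) // 2 - e_H - alpha * r
    m = e_H + alpha * r + beta + r // 2
    return alpha, beta, m
```

For example, with \(|E(H)|=18\), \(r=4\) and \(s=4\) this gives \(\alpha =10369\) and \(2m = 10374^2\), and one can confirm directly that \(33\alpha > 32(\alpha +s+1)\), which is the bound \(\alpha > (32/33)\sqrt{2m}\).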

*B* must contain exactly one anchor. First suppose *B* contains no anchors: then *B* does not contain nor is *B* incident to any of the \(\alpha \) extra edges added to anchors nor the *r*/2 extra edges in the perfect matching between anchors. Hence the volume of *B* in *G* is at most what it was in *H*, i.e. \({{\,\mathrm{vol}\,}}_{G}(B)\le {{\,\mathrm{vol}\,}}_H(B) \le 2|E(H)|\). Also note that as \(G[V']\) is connected, \(\partial (B)\ge 1\), hence by Lemma 4 it is enough to show that \({{\,\mathrm{vol}\,}}(B)<0.2679\sqrt{2m}\) and this will show that for *B* with no anchors, \(f_m(B)>2\sqrt{2/m}\). Clearly \(m\ge \alpha \) and by assumption \(\alpha >32|E(H)|^2\). Hence for *B* with no anchors:

\[{{\,\mathrm{vol}\,}}(B) \le 2|E(H)| < 2\sqrt{\alpha /32} \le 2\sqrt{m/32} = \tfrac{1}{4}\sqrt{2m} < 0.2679\sqrt{2m},\]

and so \(f_m(B)>2\sqrt{2/m}\). Thus *B* must contain at least one anchor.

If *B* contains at least two anchors then there are two options: \(B=V'\) and \(B\subsetneq V'\). We rule out \(B=V'\) and \(f_m(B)\le 2\sqrt{2/m}\) first. Note that \({{\,\mathrm{vol}\,}}(V')=2|E(H)|+2\alpha r+r\). But \(r\ge 4\) and by earlier in the proof \(\alpha >0.969\sqrt{2m}\). Hence \({{\,\mathrm{vol}\,}}(V')\ge 8\alpha > 7.752\sqrt{2m}\) and so by Lemma 4 we get that \(f_m(V')>2\sqrt{2/m}\). Thus \(B\ne V'\).

Now we show that for \(B\subsetneq V'\) with at least two anchors in *B* we have \(f_m(B)>2\sqrt{2/m}\). In the case \(B\ne V'\), because \(G[V']\) is connected, \(\partial (B)\ge 1\). If *B* has at least two anchors then \({{\,\mathrm{vol}\,}}(B)\ge 4\alpha >3.876\sqrt{2m}\). Therefore for \(B\subsetneq V'\) with at least two anchors in *B*, \(\partial (B)\ge 1\) and \({{\,\mathrm{vol}\,}}(B) > 3.876\sqrt{2m}\), hence \(f_m(B)>2\sqrt{2/m}\) by Lemma 4.

Thus to ensure \(f_m(B)\le 2\sqrt{2/m}\) we must have exactly one anchor in *B*. In particular we can now assume that *B* contains exactly one anchor. Let graph \(G'\) be *G* without the perfect matching between anchors added at the end of the construction of *G* from *H*. Now \(G'[B]\) is connected, *B* has exactly one anchor, and after stripping pendant vertices that anchor has degree 3 in \(G'\), so we have \(\partial _{G'}(B)\ge 3\). Because *B* is pendant-consistent, in fact \(\partial _{G'}(B)=3\), and after re-adding the perfect matching between anchors \(\partial _{G}(B)=4\).

But now, because \(\partial _G(B)=4\), by Lemma 4 we have that \(f_m(B)\ge 2\sqrt{2/m}\). Also by Lemma 4 to get equality \(f_m(B)=2\sqrt{2/m}\) we must have \({{\,\mathrm{vol}\,}}(B)=2\sqrt{2m}\) which establishes the last part of the claim. \(\square \) *Claim* 9 *(a)*

### Proof of Claim 9(b):

Suppose \((H,\{a_1,\ldots ,a_r\})\) is a yes-instance. We prove there exists a vertex partition \(\mathcal {A}'\) of \(V'\) such that, for all \(A\in \mathcal {A}'\), \(f_m(A)=2\sqrt{2/m}\). By assumption, there is a connected equipartition \(\mathcal {B}=\{B_1,\ldots ,B_r\}\) of *V*(*H*) such that \(a_i \in B_i\) for each *i*. In the construction of the graph *G* from *H* we added \(\alpha \) pendant vertices, say \(u_1^i, \ldots , u_\alpha ^i\), to each anchor \(a_i\). Define \(A_i = B_i \cup \{ u_1^i, \ldots , u_\alpha ^i\}\) for each *i* and \(\mathcal {A}'=\{ A_1, \ldots , A_r\}\). Observe that \(\mathcal {A}'\) is a vertex partition of \(V'\), as the set \(V'\) consists exactly of *V*(*H*) together with the extra \(\alpha r\) vertices added with the pendant edges on each anchor. It now remains to prove that \(f_m(A_i)=2\sqrt{2/m}\) for each *i*.

Consider \(G'\), the graph formed from *G* by removing the arbitrary perfect matching added in the last step of the construction of *G* from *H*. Recall the graph \(G'\) is the subdivision of a 3-regular graph with the anchors as the branch vertices. Fix *i* and note that \(H[B_i]\) connected implies that \(G'[A_i]\) is connected. But as \(G'[A_i]\) is connected, contains exactly one anchor and contains every vertex pendant to a vertex in \(A_i\), it must be the case that \(\partial _{G'}(A_i)=3\). Now re-add the perfect matching and we get that \(\partial _G(A_i)=4\).

\(\square \) *Claim* 9 *(b)*
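The value of \(f_m(A_i)\) for a part of the kind used in Claim 9(b) can be verified with exact arithmetic. The sketch below assumes that \(G[A_i]\) is a tree on \(\alpha +s\) vertices (one anchor, its \(\alpha \) leaves and the remaining \(s-1\) vertices of \(B_i\)), that \(\partial _G(A_i)=4\), and that \(2m=(\alpha +s+1)^2\); under these assumptions \({{\,\mathrm{vol}\,}}(A_i) = 2(\alpha +s-1)+4 = 2\sqrt{2m}\). The function names and the sample numbers are hypothetical.

```python
from fractions import Fraction


def part_volume(alpha, s, boundary=4):
    """Volume of a part whose induced subgraph is a tree on alpha + s vertices."""
    edges_inside = alpha + s - 1      # a tree on alpha + s vertices
    return 2 * edges_inside + boundary


def f_m(m, boundary, vol):
    """Exact value of f_m(B) = boundary/vol + vol/(2m)."""
    return Fraction(boundary, vol) + Fraction(vol, 2 * m)
```

With \(\alpha =10369\), \(s=4\) and \(m=53809938\) (so that \(2m=10374^2\)), `part_volume` returns 20748 and the square of `f_m(m, 4, 20748)` equals \(8/m\), that is, \(f_m(A_i)=2\sqrt{2/m}\).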

### Proof of Claim 9(c):

Suppose there exists a vertex partition \(\mathcal {A}'=\{A'_1, \ldots , A'_r\}\) of \(V'\) such that, for all \(A'_i\in \mathcal {A}'\), \(f_m(A'_i)=2\sqrt{2/m}\), \(A'_i\) is pendant-consistent, and \(G[A_i']\) is connected. By Claim 9(a) we may also assume that for all \(A_i'\in \mathcal {A}'\) we have \({{\,\mathrm{vol}\,}}(A_i')=2\sqrt{2m}\). We will show this implies that \((H,\{a_1,\ldots ,a_r\})\) is a yes-instance.

Fix some *i*. The induced subgraph \(G[A'_i]\) is connected and contains exactly one anchor, say \(a_i\), so we can remove the perfect matching between the anchors and \(G'[A'_i]\) is still connected. Let \(B_i\) be the vertex set obtained from \(A_i'\) by removing the \(\alpha \) added leaves pendant on the anchor \(a_i\). Then \(B_i\subseteq V(H)\) and \(H[B_i]\) is connected.

\(\square \) *Claim* 9 *(c)*

This completes the proof. \(\square \)

## 4 Conclusions and Open Problems

We have shown that Modularity belongs to \(\textsf {FPT}\) when parameterised by the vertex cover number of the input graph, and that the problem is solvable in polynomial time on input graphs whose treewidth or max leaf number is bounded by some fixed constant; we also showed that there is an FPT algorithm, parameterised by treewidth, which computes any constant-factor approximation to the maximum modularity. In contrast with the positive approximation result, we demonstrated that the problem is unlikely to admit an exact FPT algorithm when the treewidth is taken to be the parameter, as it is \(\textsf {W}[1]\)-hard even when parameterised simultaneously by the pathwidth and size of a minimum feedback vertex set for the input graph.

We conjecture that our XP algorithm parameterised by max leaf number is not optimal, and that Modularity in fact belongs to \(\textsf {FPT}\) with respect to this parameterisation. Another open question arising from our work is whether the problem belongs to \(\textsf {FPT}\) with respect to other parameters for which this is not ruled out by our hardness result, including treedepth, modular width and neighbourhood diversity.

It is also natural to ask whether our approximation result can be extended to larger classes of graphs, for example those of bounded cliquewidth or bounded expansion. Moreover, when considering treewidth as the parameter, it would be interesting to investigate the existence or otherwise of an \(\epsilon \)-approximation in time \(f(\mathrm{tw},\epsilon ) n^{\mathcal {O}(1)}\).

## Notes

### Acknowledgements

The authors are grateful to Jessica Enright for some helpful initial discussions about the topic.

## References

- 1. Bagrow, J.P.: Communities and bottlenecks: trees and treelike networks have high modularity. Phys. Rev. E **85**(6), 066118 (2012)
- 2. Blondel, V.D., Guillaume, J.L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech Theory Exp. **2008**(10), P10008 (2008)
- 3. Bodlaender, H.L.: A linear time algorithm for finding tree-decompositions of small treewidth. In: Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, STOC ’93, pp. 226–234. ACM, New York, NY, USA (1993). https://doi.org/10.1145/167088.167161
- 4. Brandes, U., Delling, D., Gaertler, M., Gorke, R., Hoefer, M., Nikoloski, Z., Wagner, D.: On modularity clustering. IEEE Trans. Knowl. Data Eng. **20**(2), 172–188 (2008). https://doi.org/10.1109/TKDE.2007.190689
- 5. Cygan, M., Fomin, F.V., Kowalik, Ł., Lokshtanov, D., Marx, D., Pilipczuk, M., Pilipczuk, M., Saurabh, S.: Parameterized Algorithms. Springer, Cham (2015)
- 6. DasGupta, B., Desai, D.: On the complexity of Newman’s community finding approach for biological and social networks. J. Comput. Syst. Sci. **79**(1), 50–67 (2013). https://doi.org/10.1016/j.jcss.2012.04.003
- 7. de Montgolfier, F., Soto, M., Viennot, L.: Asymptotic modularity of some graph classes. In: Asano, T., Nakano, S., Okamoto, Y., Watanabe, O. (eds.) Algorithms and Computation. ISAAC 2011. Lecture Notes in Computer Science, vol. 7074. Springer, Berlin, Heidelberg (2011)
- 8. Dinh, T.N., Li, X., Thai, M.T.: Network clustering via maximizing modularity: Approximation algorithms and theoretical limits. In: 2015 IEEE International Conference on Data Mining, pp. 101–110 (2015). https://doi.org/10.1109/ICDM.2015.139
- 9. Dinh, T.N., Thai, M.T.: Finding community structure with performance guarantees in scale-free networks. In: Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third Inernational Conference on Social Computing (SocialCom), 2011 IEEE Third International Conference on, pp. 888–891. IEEE (2011)
- 10. Dinh, T.N., Thai, M.T.: Community detection in scale-free networks: approximation algorithms for maximizing modularity. IEEE J. Sel. Areas Commun. **31**(6), 997–1006 (2013). https://doi.org/10.1109/JSAC.2013.130602
- 11. Downey, R.G., Fellows, M.R.: Fundamentals of Parameterized Complexity. Springer, London (2013)
- 12. Enciso, R., Fellows, M.R., Guo, J., Kanj, I., Rosamond, F., Suchý, O.: What makes equitable connected partition easy. In: Chen, J., Fomin, F.V. (eds.) Parameterized and Exact Computation: 4th International Workshop, IWPEC 2009, Copenhagen, Denmark, September 10–11, 2009, Revised Selected Papers, pp. 122–133. Springer, Berlin (2009). https://doi.org/10.1007/978-3-642-11269-0_10
- 13. Estivill-Castro, V., Fellows, M., Langston, M., Rosamond, F.: FPT is P-time extremal structure I. In: Algorithms and Complexity in Durham 2005, Proceedings of the first ACiD Workshop, volume 4 of Texts in Algorithmics, pp. 1–41. King’s College Publications (2005)
- 14. Fortunato, S.: Community detection in graphs. Phys. Rep. **486**(3), 75–174 (2010)
- 15. Fortunato, S., Hric, D.: Community detection in networks: a user guide. Phys. Rep. **659**, 1–44 (2016)
- 16. Jutla, I.S., Jeub, L.G.S., Mucha, P.J.: A generalized Louvain method for community detection implemented in MATLAB. (2011) http://netwiki.amath.unc.edu/GenLouvain
- 17. Kawase, Y., Matsui, T., Miyauchi, A.: Additive Approximation Algorithms for Modularity Maximization. In: Hong, S.H. (ed.) 27th International Symposium on Algorithms and Computation (ISAAC 2016), Leibniz International Proceedings in Informatics (LIPIcs), vol. 64, pp. 43:1–43:13. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2016). https://doi.org/10.4230/LIPIcs.ISAAC.2016.43
- 18. Kloks, T.: Treewidth—Computations and Approximations, Lecture Notes in Computer Science, vol. 842. Springer, Berlin (1994)
- 19. Lancichinetti, A., Fortunato, S.: Limits of modularity maximization in community detection. Phys. Rev. E **84**(6), 066122 (2011)
- 20. Lokshtanov, D.: Parameterized integer quadratic programming: Variables and coefficients. (2015) arXiv:1511.00310 [cs.DS]
- 21. McDiarmid, C., Skerman, F.: Modularity of regular and treelike graphs. J. Complex Netw. **6**, 596 (2017)
- 22. McDiarmid, C., Skerman, F.: Modularity of Erdős-Rényi random graphs. Random Struct. Algorithms (to appear)
- 23. McDiarmid, C., Skerman, F.: Modularity of Erdős-Rényi random graphs. In: 29th International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms, vol. 1 (2018)
- 24. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E **69**(2), 026113 (2004)
- 25. Porter, M., Onnela, J.P., Mucha, P.: Communities in networks. Not AMS **56**(9), 1082–1097 (2009)
- 26. Prokhorenkova, L.O., Prałat, P., Raigorodskii, A.: Modularity of models of complex networks. Electron. Notes Discrete Math. **61**, 947–953 (2017)
- 27. Skerman, F.: Modularity of networks. Ph.D. thesis, University of Oxford (2016)

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.