Abstract
In this paper we show that for any graph H of order m and any graph G of order n and maximum degree \(\Delta \) one can compute the number of subsets S of V(G) that induce a graph isomorphic to H in time \(O(c^m \cdot n )\) for some constant \(c = c(\Delta ) >0\). This is essentially best possible (in the sense that there is no \(c^{o(m)}poly(n)\)-time algorithm under the exponential time hypothesis).
Introduction
For two graphs H and G we denote by \(\text {ind}(H,G)\) the number of subsets of the vertex set of G that induce a graph that is isomorphic to H. (We recall that two graphs \(H=(V_H,E_H)\) and \(G=(V_G,E_G)\) are said to be isomorphic if there exists a bijection \(f:V_H \rightarrow V_G\) such that for any \(u,v \in V_H\), we have that \(f(u)f(v) \in E_G\) if and only if \(uv \in E_H\).) Throughout we take G and H to have n and m vertices respectively unless otherwise stated.
Understanding the numbers \(\text {ind}(H,G)\) for different choices of H gives us much important information about G. For example, if H is the disjoint union of m isolated vertices, \(\text {ind}(H,G)\) equals the number of independent sets of size m in G. Determining these induced subgraph counts is closely related to determining subgraph counts and homomorphism counts; these parameters play a central role in the theory of graph limits [15], and frequently appear in statistical physics, see e.g. Sect. 2.2 in [19] and the references therein.
When H and G are both part of the input, computing \(\text {ind}(H,G)\) is clearly an NP-hard problem because it includes the problem of determining the size of a maximum clique in G. When the graph H is fixed (with m vertices), the brute-force algorithm takes time \(O(n^m)\), and a linear improvement has been made to the exponent [17].
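To make the definition of \(\text {ind}(H,G)\) and the brute-force baseline concrete, the following is a minimal sketch (not the algorithm of this paper). It tries every vertex subset of the right size and tests isomorphism by trying every bijection, so it is only usable for very small graphs. The function names and the edge-list input format are our own illustrative choices.

```python
from itertools import combinations, permutations


def induced_edges(vertices, edges):
    """Edges of the subgraph induced on `vertices`, as a set of frozensets."""
    vs = set(vertices)
    return {frozenset(e) for e in edges if set(e) <= vs}


def is_isomorphic(h_vertices, h_edges, s_vertices, s_edges):
    """Brute-force isomorphism test: try every bijection (tiny graphs only)."""
    h_vertices = list(h_vertices)
    s_vertices = list(s_vertices)
    if len(h_vertices) != len(s_vertices) or len(h_edges) != len(s_edges):
        return False
    for image in permutations(s_vertices):
        f = dict(zip(h_vertices, image))
        # Edge counts agree, so mapping every edge to an edge forces
        # non-edges to map to non-edges as well.
        if all(frozenset(f[x] for x in e) in s_edges for e in h_edges):
            return True
    return False


def ind(h_vertices, h_edges, g_vertices, g_edges):
    """Number of vertex subsets S of G such that G[S] is isomorphic to H."""
    h_vertices = list(h_vertices)
    h_edge_set = {frozenset(e) for e in h_edges}
    count = 0
    for s in combinations(list(g_vertices), len(h_vertices)):
        if is_isomorphic(h_vertices, h_edge_set, s, induced_edges(s, g_edges)):
            count += 1
    return count
```

For instance, in the 4-cycle every 3-subset induces a path on three vertices, and exactly two 2-subsets are independent.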
This problem and some of its variants (e.g. where we count not necessarily induced subgraphs) have been studied from a fixed parameter tractability (FPT) perspective; see [2, 6, 10, 12]. In general, computing \(\text {ind}(H,G)\) when parameterizing by \(m = |V(H)|\) is \(\#W[1]\)-hard because even deciding whether G contains an independent set of size m is \(W[1]\)-hard [11]. Curticapean, Dell and Marx [9] prove a number of interesting dichotomy results for \(\#W[1]\)-hardness using the treewidth of H (and of a certain class of graphs obtained from H) as an additional parameter.
However, when the graph G is of bounded degree, which is often of interest in statistical physics, the problem is no longer \(\#W[1]\)-hard. Indeed, Curticapean, Dell, Fomin, Goldberg, and Lapinskas [8, Theorem 13] showed that for a graph H on m vertices and a bounded degree graph G on n vertices, \(\text {ind}(H,G)\) can be computed in time \(O(m^{O(m)}n)\), thus giving an FPT algorithm in terms of m. In the present paper we go further and give an algorithm with essentially optimal running time. We assume the standard word-RAM machine model with logarithmic-sized words.
Theorem 1.1
There is an algorithm which, given an n-vertex graph G of maximum degree at most \(\Delta \), and an m-vertex graph H, computes \(\text {ind}(H,G)\) in time \(\tilde{O}((7\Delta )^{2m}n+2^{10m})\). (Here the \(\tilde{O}\)-notation means that we suppress polynomial factors in m.)
Remark 1.1
Theorem 13 in [8] in fact concerns vertex-coloured graphs H and G. Our proof of Theorem 1.1 also easily extends to the coloured setting. We discuss this in Sect. 4.
The running time here is essentially optimal under the exponential time hypothesis. Indeed, if we could find an algorithm with an improved running time \(c^{o(m)}poly(n)\) (for some constant c possibly dependent on \(\Delta \)), we could use it to determine the size of a maximum independent set in time \(nc^{o(n)}poly(n)=c^{o(n)}poly(n)\), which is not possible (even in graphs of maximum degree 3) under the exponential time hypothesis; see [13, Lemma 2.1]. (In fact, under the exponential time hypothesis, we may draw the stronger conclusion that there is no \(c'^{m}poly(n)\)-time algorithm for some \(c'>0\).)
Note that our algorithm allows us to compute \(\text {ind}(H,G)\) in polynomial time in G, even when H is logarithmic in G and G has bounded degree. The special case of this when H is an independent set was a crucial ingredient in our recent paper [18], which uses the Taylor approximation method of Barvinok [3] to give (amongst others) a fully polynomial time approximation scheme for evaluating the independence polynomial for bounded degree graphs. Our present paper completes the running time complexity picture for computing \(\text {ind}(H,G)\) on bounded degree graphs G.
We add a few remarks to give further perspective on the problem. Note that computing \(\text {ind}(H,G)\) in time \(\tilde{O}(\Delta ^m n)\) is relatively straightforward for G of bounded degree \(\Delta \) when H is connected (see Lemma 2.2). Thus the difficulty lies in graphs H that have many components. Note also that Curticapean et al. [8] use the fact that induced graph counts can be expressed in terms of homomorphism counts (see e.g. [15]) and that homomorphism counts from H to G can be computed in time \(\tilde{O}(\Delta ^m n)\) in their FPT algorithm. However the limiting factor is the time dependence (on m) of expressing induced graph counts in terms of homomorphism graph counts, which could be as large as the Bell number. This is significantly larger than the time dependence in our algorithm.
Both our approach and the approach in [8] crucially use the bounded degree assumption. It would be very interesting to know whether Theorem 1.1 could be extended to graphs of bounded average degree, such as planar graphs.
Question 1
For which classes of graphs \(\mathcal C\) does there exist a constant \(c=c(\mathcal C)\) and an algorithm such that, given an n-vertex graph \(G\in \mathcal C\) and an m-vertex graph H, the algorithm computes \(\text {ind}(H,G)\) in time \(O(c^{m} poly(n))\)?
We note that Nederlof [16] recently showed that there exists a constant c and an algorithm that computes the number of independent sets of size m in an n-vertex planar graph in time \(c^{O(m)} n\).
Organization The remainder of the paper is devoted to proving Theorem 1.1. The main idea in our proof is to define a multivariate graph polynomial where the coefficients are certain induced graph counts; in particular \(\text {ind}(H,G)\) will be the coefficient of a monomial. We cannot compute these coefficients directly, but use machinery from [18] to compute the coefficients of univariate evaluations of this polynomial. In Sect. 3, we use algebraic techniques to efficiently extract \(\text {ind}(H,G)\) (i.e. the coefficient of interest) from the coefficients of the univariate evaluations. The resulting algorithm to compute \(\text {ind}(H,G)\) is summarized in Sect. 3.
We however need to slightly modify the result from [18]. This will be done in the next section.
Computing Coefficients of Graph Polynomials
An efficient way to compute the coefficients of a large class of (univariate) graph polynomials for bounded degree graphs was given in [18]. We will need a small modification of this result, for which we will provide the details here. In particular, the running time has been improved compared to [18] and we have clarified the dependence on the parameters. We start with some definitions after which we state the main result of this section.
By \(\mathcal {G}\) we denote the collection of all graphs and by \(\mathcal {G}_k\) for \(k\in \mathbb {N}\) we denote the collection of graphs with at most k vertices. A graph invariant is a function \(f:\mathcal {G}\rightarrow S\) for some set S that takes the same value on isomorphic graphs. A (univariate) graph polynomial is a graph invariant \(p:\mathcal {G}\rightarrow \mathbb {C}[z]\), where \(\mathbb {C}[z]\) denotes the ring of polynomials in the variable z over the field of complex numbers. Call a graph invariant f multiplicative if \(f(\emptyset )=1\) and \(f(G_1\cup G_2)=f(G_1)f(G_2)\) for all graphs \(G_1,G_2\) (here \(G_1\cup G_2\) denotes the disjoint union of the graphs \(G_1\) and \(G_2\)). We can now give the key definition and tool we need from [18].
Definition 2.1
Let p be a multiplicative graph polynomial defined by
$$\begin{aligned} p(G)(z):=\sum _{i=0}^{d(G)}e_i(G)z^i \end{aligned}$$ (1)
for each \(G\in \mathcal {G}\) with \(e_0(G)=1\), where d(G) is the degree of the polynomial p(G). We call p a bounded induced graph counting polynomial (BIGCP) if there exists \(\alpha \in \mathbb {N}\), an algorithm A and a non-decreasing sequence \(\beta \in \mathbb {N}^\mathbb {N}\) such that the following two conditions are satisfied:

(i)
for every graph G, the coefficients \(e_i\) satisfy
$$\begin{aligned} e_i(G):=\sum _{H\in \mathcal {G}_{\alpha i}}\lambda _{H,i}\text {ind}(H,G) \end{aligned}$$ (2)
for certain \(\lambda _{H,i}\in \mathbb {C}\);

(ii)
for each i and \(H\in \mathcal {G}_{\alpha i}\), the algorithm A computes the coefficient \(\lambda _{H,i}\) in time \(\beta _i\).
We note that the coefficients of BIGCPs can be seen as graph motif parameters as introduced in [9].
We have the following result for computing coefficients of BIGCPs.
Theorem 2.1
Let \(n, m, \Delta \in \mathbb {N}\) and let \(p(\cdot )\) be a bounded induced graph counting polynomial with parameters \(\alpha \) and \(\beta \). Then there is a deterministic \(\tilde{O}(n (e\Delta )^{\alpha m}\beta _m4^{\alpha m}) \)-time algorithm, which, given any n-vertex graph G of maximum degree at most \(\Delta \), computes the first m coefficients \(e_1(G), \ldots , e_m(G)\) of p(G). (Here the \(\tilde{O}\)-notation means that we suppress polynomial factors in m.)
Remark 2.1
The algorithm in the theorem above only has access to the polynomial p via the algorithm A in the definition of BIGCP that computes the complex numbers \(\lambda _{H,i}\).
Before we prove Theorem 2.1 we will first gather some facts from [18] about induced subgraph counts and the number of connected induced subgraphs of fixed size that occur in a graph. Compared to [18] we actually need to slightly sharpen the statements.
Induced Subgraph Counts
Define \(\text {ind}(H,\cdot ):\mathcal {G}\rightarrow \mathbb {C}\) by \(G\mapsto \text {ind}(H,G)\). So we view \(\text {ind}(H,\cdot )\) as a graph invariant. We can take linear combinations and products of these invariants. In particular, for two graphs \(H_1,H_2\) we have
$$\begin{aligned} \text {ind}(H_1,\cdot )\cdot \text {ind}(H_2,\cdot )=\sum _{H\in \mathcal {G}}c^H_{H_1,H_2}\text {ind}(H,\cdot ), \end{aligned}$$ (3)
where for a graph H, \(c^H_{H_1,H_2}\) is the number of pairs of subsets of V(H), (U, T), such that \(U \cup T = V(H)\) and \(H[U]=H_1\) and \(H[T]=H_2\). In particular, given \(H_1\) and \(H_2\), \(c^H_{H_1,H_2}\) is nonzero for only a finite number of graphs H.
In what follows we will often have to maintain a list L of subsets S of [n] with \(|S|\le k\) (for some k) as well as some (complex) number \(c_S\) associated to S. We will use the standard word-RAM machine model with logarithmic-sized words. This means that given a set S of size k, we have access to \(c_S\) in O(k) time. In particular, this also means we can determine whether S is contained in our list in O(k) time.
The next lemma, a variation on [18, Lemma 3.5], says that computing \(\text {ind}(H,G)\) is fixed parameter tractable when G has bounded degree and H is connected.
Lemma 2.2
Let H be a connected graph on k vertices and let \(\Delta \in \mathbb {N}\). Then

(i)
there is an \(O(n\Delta ^{k-1})\)-time algorithm, which, given any n-vertex graph G with maximum degree at most \(\Delta \), checks whether \(\text {ind}(H,G)\ne 0\);

(ii)
there is an \(O(n \Delta ^{k-1}k)\)-time algorithm, which, given any n-vertex graph G with maximum degree at most \(\Delta \), computes the number \(\text {ind}(H,G)\).
Note that Lemma 2.2 (i) enables us to test for graph isomorphism between connected bounded degree graphs when \(|V(G)| = |V(H)|\).
Proof
We follow the proof from [18]. We assume that \(V(G)=[n]\). Let us list the vertices of V(H), \(v_1,\ldots ,v_k\), in such a way that for \(i\ge 2\) vertex \(v_i\) has a neighbour among \(v_1,\ldots ,v_{i-1}\). Then to embed H into G we first select a target vertex for \(v_1\) and then, given that we have embedded \(v_1, \ldots , v_{i-1}\) with \(i\ge 2\), there are at most \(\Delta \) choices for where to embed \(v_i\). After k iterations, we have a total of at most \(n\Delta ^{k-1}\) potential ways to embed H and each possibility is checked in the procedure above. Hence we determine if \(\text {ind}(H,G)\) is zero or not in \(O(n\Delta ^{k-1})\) time.
Throughout the procedure above we maintain a list L that contains all sets S such that \(G[S]=H\) found thus far. Each time we find a set \(S\subset [n]\) such that \(G[S]=H\) we check if it is contained in L. If this is not the case we add S to L and we discard S otherwise. The length of the resulting list, which we update at each iteration, gives the value of \(\text {ind}(H,G)\). \(\square \)
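The embedding procedure in this proof can be sketched as follows. This is an illustrative implementation under our own assumptions: graphs are given as dictionaries mapping each vertex to its set of neighbours, the vertex ordering of H is a breadth-first order, and a Python `set` of frozensets plays the role of the list L. The name `count_induced_connected` is hypothetical.

```python
def count_induced_connected(h_adj, g_adj):
    """Count vertex subsets S of G with G[S] isomorphic to the connected graph H.

    h_adj, g_adj: dict mapping each vertex to the set of its neighbours.
    """
    # Order V(H) so that every vertex after the first has an earlier
    # neighbour (its BFS parent).
    start = next(iter(h_adj))
    order, parent = [start], {start: None}
    for u in order:  # `order` grows while we iterate: a simple BFS queue
        for w in h_adj[u]:
            if w not in parent:
                parent[w] = u
                order.append(w)
    if len(order) != len(h_adj):
        raise ValueError("H must be connected")

    found = set()  # plays the role of the list L in the proof

    def extend(i, image):
        if i == len(order):
            found.add(frozenset(image.values()))
            return
        v = order[i]
        # At most Delta candidates: neighbours of the image of v's parent.
        for x in g_adj[image[parent[v]]]:
            if x in image.values():
                continue
            # Adjacency to all earlier vertices must match exactly (induced).
            if all((u in h_adj[v]) == (image[u] in g_adj[x]) for u in order[:i]):
                image[v] = x
                extend(i + 1, image)
                del image[v]

    for x in g_adj:  # n choices for the image of the first vertex
        extend(1, {start: x})
    return len(found)
```

Each vertex set is typically reached by several embeddings (one per automorphism of H), which is why the sets are deduplicated before counting.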
Next we consider how to enumerate all possible connected induced subgraphs of fixed size in a bounded degree graph. We will need the following result of Borgs, Chayes, Kahn, and Lovász [5, Lemma 2.1]:
Lemma 2.3
Let G be a graph of maximum degree \(\Delta \). Fix a vertex \(v_0\) of G. Then the number of connected induced subgraphs of G with k vertices containing the vertex \(v_0\) is at most \(\frac{(e\Delta )^{k-1}}{2}\).
As a consequence we can efficiently enumerate all connected induced subgraphs of logarithmic size that occur in a bounded degree graph G.
Lemma 2.4
There is an \(O(nk^3(e\Delta )^{k})\)-time algorithm which, given \(k \in \mathbb {N}\) and an n-vertex graph G on [n] of maximum degree \(\Delta \), outputs \(\mathcal {T}_k\), the list of all \(S \subseteq [n]\) satisfying \(|S| \le k\) and G[S] connected.
Proof
We assume that \(V(G)=[n]\). By the previous result, we know that \(|\mathcal {T}_k| \le n(e \Delta )^{k-1}\) for all k.
We inductively construct \(\mathcal {T}_k\). For \(k=1\), \(\mathcal {T}_k\) is clearly the set of singleton vertices and takes time O(n) to output.
Given that we have found \(\mathcal {T}_{k-1}\) we compute \(\mathcal {T}_k\) as follows. We iteratively compute \(\mathcal {T}_{k}\) by going over all \(S\in \mathcal {T}_{k-1}\) and all \(v\in N_G(S)\) (the collection of vertices that are adjacent to an element of S) and checking whether \(S\cup \{v\}\) is already contained in \(\mathcal {T}_{k}\) or not. We add it to \(\mathcal {T}_k\) if it is not already contained in \(\mathcal {T}_k\).
The set \(N_G(S)\) has size at most \(|S|\Delta \le k\Delta \) and takes time \(O(k\Delta )\) to find (assuming G is given in adjacency list form). Therefore computing \(\mathcal {T}_k\) takes time bounded by \(O(|\mathcal {T}_{k-1}|k^2\Delta ) = O(nk^2(e\Delta )^{k})\).
Starting from \(\mathcal {T}_1\), we perform the above iteration k times, requiring a total running time of \(O(nk^3(e\Delta )^{k})\). The proof that \(\mathcal {T}_k\) contains all the sets we desire is straightforward and can be found in [18]. \(\square \)
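The inductive construction in this proof can be sketched as below. The sketch is illustrative only: the adjacency-dictionary input format and the function name `connected_subsets` are our assumptions, and hashing of frozensets stands in for the O(k)-time list lookup of the word-RAM model.

```python
def connected_subsets(g_adj, k):
    """Return all vertex sets S with 1 <= |S| <= k such that G[S] is connected.

    g_adj: dict mapping each vertex to the set of its neighbours.
    """
    if k < 1:
        return set()
    level = {frozenset([v]) for v in g_adj}  # connected sets of size 1
    all_sets = set(level)
    for _ in range(k - 1):
        next_level = set()
        for s in level:
            # N_G(S): vertices outside S adjacent to some element of S.
            boundary = set().union(*(g_adj[v] for v in s)) - s
            for v in boundary:
                # The set type deduplicates, mirroring the membership check.
                next_level.add(s | {v})
        all_sets |= next_level
        level = next_level
    return all_sets
```

Growing only along the boundary \(N_G(S)\) suffices because every connected set with at least two vertices contains a vertex whose removal leaves a connected set.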
We call a graph invariant \(f:\mathcal {G}\rightarrow \mathbb {C}\) additive if for each \(G_1,G_2\in \mathcal {G}\) we have \(f(G_1\cup G_2)=f(G_1)+f(G_2)\). The following lemma is a variation of a lemma due to Csikvári and Frenkel [7]; it is fundamental to our approach. See [18] for a proof.
Lemma 2.5
Let \(f:\mathcal {G}\rightarrow \mathbb {C}\) be a graph invariant given by \(f(\cdot ):=\sum _{H\in \mathcal {G}}a_H\text {ind}(H,\cdot )\). Then f is additive if and only if \(a_H=0\) for all graphs H that are disconnected.
Let \(p(z) = a_0 + \cdots + a_dz^d\) be a polynomial of degree d with nonzero constant term \(a_0\) and with complex roots \(\zeta _1, \ldots , \zeta _d\). Define for \(j\in \mathbb {N}\), the inverse power sum \(p_j\) by
$$\begin{aligned} p_j:=\zeta _1^{-j}+\cdots +\zeta _d^{-j}. \end{aligned}$$
The next proposition is a variant of the Newton identities that relate the inverse power sums and the coefficients of a polynomial. We refer to [18] for a proof.
Proposition 2.6
Let \(p(z) = a_0 + \cdots + a_dz^d\) be a polynomial of degree d with \(a_0\ne 0\) and inverse power sums \(p_j\), \(j\in \mathbb {N}\). Then for each \(k=1,2, \ldots \), we have
$$\begin{aligned} ka_k=-\sum _{i=0}^{k-1}a_ip_{k-i}. \end{aligned}$$
(Here we take \(a_i=0\) if \(i>d\).)
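As a sanity check of these identities, written as \(ka_k=-\sum _{i=0}^{k-1}a_ip_{k-i}\) with \(p_j=\sum _i\zeta _i^{-j}\), the following sketch converts between coefficients and inverse power sums in exact rational arithmetic. The function names are ours and purely illustrative.

```python
from fractions import Fraction


def inverse_power_sums(coeffs, m):
    """Return [p_1, ..., p_m] for the polynomial a_0 + a_1 z + ... + a_d z^d.

    Uses k*a_k = -sum_{i=0}^{k-1} a_i p_{k-i}, with a_i = 0 for i > d.
    """
    a = [Fraction(c) for c in coeffs]
    if a[0] == 0:
        raise ValueError("constant term must be nonzero")

    def coef(i):
        return a[i] if i < len(a) else Fraction(0)

    p = [None]  # p[0] is unused so that p[j] holds p_j
    for k in range(1, m + 1):
        tail = sum(coef(i) * p[k - i] for i in range(1, k))
        p.append((-k * coef(k) - tail) / a[0])
    return p[1:]


def coefficients_from_power_sums(p, a0=1):
    """Return [a_0, ..., a_m] given p = [p_1, ..., p_m] and constant term a0."""
    a = [Fraction(a0)]
    for k in range(1, len(p) + 1):
        total = sum(a[i] * p[k - i - 1] for i in range(k))  # p[j-1] holds p_j
        a.append(-total / k)
    return a
```

For example, \((1-z)(1-z/2)=1-\tfrac{3}{2}z+\tfrac{1}{2}z^2\) has roots 1 and 2, so \(p_j=1+2^{-j}\); the two functions are inverse to each other on this input.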
Proof of Theorem 2.1
We follow the proof as given in [18], which we modify slightly at certain points.
Recall that \(p(\cdot )\) is a bounded induced graph counting polynomial (BIGCP). Given an n-vertex graph G with maximum degree at most \(\Delta \), we must show how to compute the first m coefficients of p. We will use \(\tilde{O}\)-notation throughout to mean that we suppress polynomial factors in m (and k). To reduce notation, let us write \(p=p(G)\), \(d=d(G)\) for the degree of p, and \(e_i=e_i(G)\) for \(i=0,\ldots ,d\) for the coefficients of p (from (1)). We also write \(p_k:= \zeta _1^{-k} + \cdots + \zeta _d^{-k}\), where \(\zeta _1,\ldots ,\zeta _d\in \mathbb {C}\) are the roots of the polynomial p(G).
Noting \(e_0=1\), Proposition 2.6 gives
$$\begin{aligned} p_k=-ke_k-\sum _{i=1}^{k-1}e_ip_{k-i} \end{aligned}$$ (4)
for each \(k=1, \ldots , d\).
By (2), for \(i\ge 1\), the \(e_i\) can be expressed as linear combinations of induced subgraph counts of graphs with at most \(\alpha i\) vertices. Since \(p_1=-e_1\), this implies that the same holds for \(p_1\). By induction, (3), and (4) we have that for each k
$$\begin{aligned} p_k=\sum _{H\in \mathcal {G}_{\alpha k}}a_{H,k}\text {ind}(H,G) \end{aligned}$$ (5)
for certain, yet unknown, coefficients \(a_{H,k}\).
Since p is multiplicative, the inverse power sums are additive (since the multiset of roots of \(p(G_1\cup G_2)\) is exactly the union of the multisets of the roots of \(p(G_1)\) and \(p(G_2)\)). Thus Lemma 2.5 implies that \(a_{H,k}=0\) if H is not connected. Denote by \(\mathcal {C}_{i}(G)\) the set of connected graphs of order at most i that occur as induced subgraphs in G. Let us assume that G has vertex set [n]. Denote by \(\mathcal {T}_{\le \alpha k}(G)\) the list consisting of those sets \(S\subseteq [n]\) of size at most \(\alpha k\) that induce a connected graph in G. This way we can rewrite (5) as follows:
$$\begin{aligned} p_k=\sum _{H\in \mathcal {C}_{\alpha k}(G)}a_{H,k}\text {ind}(H,G)=\sum _{S\in \mathcal {T}_{\le \alpha k}(G)}a_{G[S],k}. \end{aligned}$$ (6)
The next lemma says that we can compute the coefficients \(a_{S,k}:=a_{G[S],k}\) efficiently for \(k=1,\ldots ,m\).
Lemma 2.7
There is an \(\tilde{O}(n (e\Delta )^{\alpha m}\beta _m4^{\alpha m})\)-time algorithm, which given a BIGCP p (with parameters \(\alpha \) and \(\beta \)) and an n-vertex graph G of maximum degree \(\Delta \), computes and lists the coefficients \(a_{S,k}\) in (6) for all \(S\in \mathcal {T}_{\le \alpha k}(G)\) and all \(k=1,\ldots , m\).
Proof
We assume that the vertex set of G is equal to [n]. Using the algorithm of Lemma 2.4, we first compute the list \(\mathcal {T}_{\le \alpha m}\) consisting of all subsets S of V(G) such that \(|S|\le \alpha m\) and G[S] is connected. This takes time bounded by
$$\begin{aligned} O(n(\alpha m)^3(e\Delta )^{\alpha m})=\tilde{O}(n(e\Delta )^{\alpha m}). \end{aligned}$$ (7)
(Note that the algorithm in Lemma 2.4 actually computes \(\mathcal {T}_{\le \alpha k}\) when it computes \(\mathcal {T}_{\alpha m}\).)
To prove the lemma, let us fix \(k\le m\) and show how to compute the coefficients \(a_{S,k}\), assuming that we have already computed and listed the coefficients \(a_{S',k'}\) for all \(k'<k\) and \(S'\in \mathcal {T}_{\le \alpha k'}\). Let us fix \(S\in \mathcal {T}_{\le \alpha k}\). Let \(H=G[S]\). By (4), it suffices to compute the coefficient of \(\text {ind}(H,\cdot )\) in \(p_{k-i}e_{i}\) for \(i=1,\ldots ,k\) (where we set \(p_0=1\)). By (2) and (5) we know that
$$\begin{aligned} p_{k-i}=\sum _{H'\in \mathcal {G}_{\alpha (k-i)}}a_{H',(k-i)}\text {ind}(H',G)\quad \text {and}\quad e_i=\sum _{H'\in \mathcal {G}_{\alpha i}}\lambda _{H',i}\text {ind}(H',G). \end{aligned}$$
So by (3) we know that the coefficient of \(\text {ind}(H,\cdot )\) in \(p_{k-i}e_{i}\) is given by
$$\begin{aligned} \sum _{H_1,H_2}c^H_{H_1,H_2}a_{H_1,(k-i)}\lambda _{H_2,i}=\sum _{(T,U):\,T\cup U=V(H)}a_{H[T],(k-i)}\lambda _{H[U],i}. \end{aligned}$$ (8)
As \(|V(H)|\le \alpha k\), the second sum in (8) is over at most \(4^{\alpha k}=O(4^{\alpha m})\) pairs (T, U). For each such pair, we need to compute \(\lambda _{H[U], i}\) and \( a_{H[T],(k-i)}\). We can compute \(\lambda _{H[U],i}\) in time bounded by \(O(\beta _i)=O(\beta _m)\) since p is a BIGCP. As \(H[T]=G[T]\), to compute \(a_{H[T],(k-i)}\) we just need to look up the coefficient \(a_{T,k-i}\), which takes time \(O(k-i)\).
Together, all this implies that the coefficient of \(\text {ind}(H,\cdot )\) in \(p_{k-i}e_{i}\) can be computed in time bounded by
$$\begin{aligned} O(4^{\alpha m}(\beta _m+k))=\tilde{O}(4^{\alpha m}\beta _m). \end{aligned}$$ (9)
So the coefficient \(a_{H,k}\) can be computed in the same time (since we suppress polynomial factors in m). Thus all coefficients \(a_{S,k}\) for \(S\in \mathcal {T}_{\le \alpha k}\) can be computed and listed in time bounded by \(|\mathcal {T}_{\le \alpha k}|\) multiplied by the expression (9), which is bounded by
$$\begin{aligned} \tilde{O}(n(e\Delta )^{\alpha m}\beta _m4^{\alpha m}) \end{aligned}$$ (10)
by Lemma 2.3.
So the total running time is bounded by the time to compute the list \(\mathcal {T}_{\le \alpha m}\) (which is given by (7)) plus the time to compute the \(a_{S,k}\) for \(S \in \mathcal {T}_{\le \alpha k}\) (which is given by (10)) for \(k=1,\ldots ,m\). This proves the lemma. \(\square \)
To finish the proof of the theorem, we compute \(p_k\) for each \(k=1,\ldots , m\) by adding all the numbers \(a_{S,k}\) over all \(S\in \mathcal {T}_{\le \alpha k}(G)\) using (6) (these numbers were computed in the previous lemma in time \(\tilde{O}((e\Delta )^{\alpha m}\beta _m4^{\alpha m}n) \)). Doing this addition takes time
$$\begin{aligned} O(m|\mathcal {T}_{\le \alpha m}(G)|)=\tilde{O}(n(e\Delta )^{\alpha m}). \end{aligned}$$
Finally, knowing the \(p_i\), we can inductively compute the \(e_i\) for \(i=1, \ldots , m\) using the relations (4), in quadratic time in m. So we see that the total running time for computing \(e_1, \ldots , e_m\) is dominated by the computation of the \(a_{S,k}\) and is \(\tilde{O}(n (e\Delta )^{\alpha m}\beta _m4^{\alpha m})\). This proves the theorem.
Since our algorithm is spread over several lemmas, we give an overview below.
Algorithm 1
Input: a BIGCP p (with parameters \(\alpha \) and \(\beta _i\)), a natural number m and a graph G. Recall that \(p(G)(z):=\sum _{i=0}^{d(G)} e_{i}(G)z^{i}\) where the coefficients \(e_i\) satisfy \(e_i(G):=\sum _{H\in \mathcal {G}_{\alpha i}}\lambda _{H,i}\text {ind}(H,G)\). One inputs p via an algorithm A that can compute \(\lambda _{H,i}\) in time \(\beta _i\).

Step 1: Use the algorithm of Lemma 2.4 to compute, for \(k=1,\ldots , m\), the collection \(\mathcal {T}_k=\{S\subseteq V(G)\mid G[S]\text { connected},\ |S|\le \alpha k\}\).

Step 2: For each \(k=1,\ldots ,m\), and \(S \subseteq V(G)\) with \(|S| \le \alpha k\), we iteratively compute the coefficients \(a_{S,k}\) using the following recursion
$$\begin{aligned} a_{S,k} ={\left\{ \begin{array}{ll} -k \lambda _{G[S],k} - \sum _{i=1}^{k-1} \sum _{(U,T) : U \cup T = S} a_{T, (k-i)}\lambda _{G[U], i} &{}\text {if}\ S \in \mathcal {T}_{k} ;\\ 0 &{}\text {otherwise}, \end{array}\right. } \end{aligned}$$
as detailed in the proof of Lemma 2.7 using (6) and (8). (Formally, we only compute \(a_{S,k}\) for \(S \in \mathcal {T}_k\) and implicitly assume \(a_{S,k}=0\) for any \(S \subseteq V(G)\) with \(|S| \le \alpha k\) and \(S \not \in \mathcal {T}_k\).)

Step 3: For each \(k=1,\ldots ,m\) compute \(p_k=\sum _{S\in \mathcal {T}_{k}}a_{S,k}\).

Step 4: Use the following recursion (i.e. the Newton identities (4)) together with the values \(p_k\) computed in Step 3 to iteratively compute \(e_1(G),\ldots ,e_m(G)\):
$$\begin{aligned} e_k = -\frac{1}{k} \left( p_k + \sum _{i=1}^{k-1} e_i p_{k-i} \right) \end{aligned}$$
Output: the m coefficients \(e_1(G),\ldots ,e_m(G)\).
Proof of Theorem 1.1
We first set up some notation before we state our key definition. For a graph H we write \(H = i_1H_1 \cup \cdots \cup i_rH_r\) to mean that H is the disjoint union of \(i_1\) copies of \(H_1\), \(i_2\) copies of \(H_2\) all the way to \(i_r\) copies of \(H_r\) for connected and pairwise nonisomorphic graphs \(H_1, \ldots , H_r\). We write \(\underline{H} = (H_1, \ldots , H_r)\). For vectors \(\varvec{\mu } = (\mu _1, \ldots , \mu _r),\varvec{\nu } = (\nu _1, \ldots , \nu _r) \in \mathbb {Z}_{\ge 0}^r\) we write \(\varvec{\mu } \circ \varvec{\nu }\) for the vector in \(\mathbb {Z}_{\ge 0}^r\) that is the pointwise or Hadamard product of \(\varvec{\mu }\) and \(\varvec{\nu }\). We denote by \(\langle \varvec{\mu } , \varvec{\nu }\rangle \) the usual scalar product of \(\varvec{\mu }\) and \(\varvec{\nu }\). For a vector of variables \(\varvec{x}=(x_1,\ldots ,x_r)\) and \(\varvec{\mu }\in \mathbb {Z}_{\ge 0}^r\) we sometimes write \({\varvec{x}}^{\varvec{\mu }}:=x_1^{\mu _1}\cdots x_r^{\mu _r}\).
For pairwise nonisomorphic and connected graphs \(H_1,\ldots ,H_r\) we define the multivariate graph polynomial \(Z_{\underline{H}}(G) \in \mathbb {Z}[x_1, \ldots , x_r]\) as follows. For a graph G we let
$$\begin{aligned} Z_{\underline{H}}(G;\varvec{x}):=\sum _{\varvec{\gamma }\in \mathbb {Z}_{\ge 0}^r}\text {ind}(\varvec{\gamma }\underline{H},G)\,\varvec{x}^{\varvec{\gamma }\circ \varvec{h}}, \end{aligned}$$
where \(\varvec{x}= (x_1, \ldots , x_r)\), \(\varvec{\gamma } = (\gamma _1, \ldots , \gamma _r)\), \(\varvec{h}: = (|V(H_1)|, \ldots , |V(H_r)|)\) and where \(\varvec{\gamma } \underline{H}\) denotes the graph \(\gamma _1H_1 \cup \cdots \cup \gamma _rH_r\). Note that this is a finite polynomial because G is finite and so there are only a finite number of nonzero terms \(\text {ind}(\varvec{\gamma } \underline{H}, G)\).
Computing \(\text {ind}(H,G)\) for any two graphs \(H= i_1H_1 \cup \cdots \cup i_rH_r\) and G can now be modelled as computing the coefficient of the monomial \(x_1^{i_1h_1}x_2^{i_2h_2}\cdots x_r^{i_rh_r}\) in \(Z_{\underline{H}}(G;\varvec{x})\). Let us start by gathering some facts about the polynomial \(Z_{\underline{H}}\).
Proposition 3.1
The polynomial \(Z_{\underline{H}}\) is multiplicative, i.e., for any two graphs \(G_1\) and \(G_2\), \(Z_{\underline{H}}(G_1 \cup G_2;\varvec{x}) = Z_{\underline{H}}(G_1;\varvec{x}) \cdot Z_{\underline{H}}(G_2; \varvec{x})\). In particular, any evaluation of \(Z_{\underline{H}}\) is also multiplicative.
Proof
Note first that every monomial in \(Z_{\underline{H}}(G;\varvec{x})\) is of the form \(\varvec{x}^{\varvec{\gamma } \circ \varvec{h}}\) for some unique choice of \(\varvec{\gamma }\). For notational convenience we write \(s_{\varvec{\gamma }}(G):= \text {ind}(\varvec{\gamma } \underline{H}, G)\). Consider the coefficient of \(\varvec{x}^{\varvec{\gamma } \circ \varvec{h}}\) in the polynomial \(Z_{\underline{H}}(G_1;\varvec{x}) \cdot Z_{\underline{H}}(G_2;\varvec{x})\). The coefficient is given by
$$\begin{aligned} \sum _{\varvec{\mu }+\varvec{\nu }=\varvec{\gamma }}s_{\varvec{\mu }}(G_1)\,s_{\varvec{\nu }}(G_2) \end{aligned}$$
(where \(\varvec{\mu } + \varvec{\nu }\) denotes the usual vector addition of \(\varvec{\mu }\) and \(\varvec{\nu }\)). This counts precisely the number of copies of \(\gamma _1H_1 \cup \cdots \cup \gamma _rH_r\) in \(G_1 \cup G_2\), that is, \(s_{\varvec{\gamma }}(G_1 \cup G_2)\), which is the coefficient of \(\varvec{x}^{\varvec{\gamma } \circ \varvec{h}}\) in the polynomial \(Z_{\underline{H}}(G_1 \cup G_2;\varvec{x})\). \(\square \)
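The multiplicativity can be checked by hand in the simplest special case \(r=1\), \(H_1=K_1\), where \(\text {ind}(\gamma K_1,G)\) is the number of independent sets of size \(\gamma \) and \(Z_{\underline{H}}(G;x)\) is the independence polynomial. The sketch below does this by brute force; the helper names are ours and the code is meant only as an illustration on tiny graphs.

```python
from itertools import combinations


def independence_coefficients(vertices, edges):
    """coeffs[g] = number of independent sets of size g, i.e. ind(g*K_1, G)."""
    vertices = list(vertices)
    edge_set = {frozenset(e) for e in edges}
    coeffs = [0] * (len(vertices) + 1)
    for size in range(len(vertices) + 1):
        for s in combinations(vertices, size):
            if not any(frozenset(pair) in edge_set for pair in combinations(s, 2)):
                coeffs[size] += 1
    return coeffs


def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y  # convolution: sum over mu + nu = gamma
    return out
```

Taking \(G_1\) to be a single edge and \(G_2\) a path on three vertices, the coefficient list of the disjoint union equals the product of the two coefficient lists.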
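Suppose \(\varvec{\mu } \in \mathbb {Z}_{\ge 0}^r\) and let z be a variable. Define the graph polynomial \(Z_{\varvec{\mu }} = Z_{\varvec{\mu },\underline{H}}(G) \in \mathbb {Z}[z]\) by
$$\begin{aligned} Z_{\varvec{\mu }}(G;z):=Z_{\underline{H}}(G;\mu _1z,\ldots ,\mu _rz)=\sum _{i\ge 0}s_i(\varvec{\mu })z^i; \end{aligned}$$
here the second equality defines the numbers \(s_i(\varvec{\mu }) = s_i(\varvec{\mu })(G)\). Evaluating \(Z_{\underline{H}}(G;\varvec{x})\) at \(\varvec{x} = (\mu _1z, \ldots , \mu _rz)\), each monomial \(\varvec{x}^{\varvec{\gamma } \circ \varvec{h}}\) will evaluate to \((\mu _1z)^{\gamma _1h_1} \cdots (\mu _rz)^{\gamma _rh_r} = \varvec{\mu }^{\varvec{\gamma } \circ \varvec{h}}z^{\langle \varvec{\gamma } ,\varvec{h}\rangle }\). Therefore the coefficient of \(z^i\) in \(Z_{\varvec{\mu }}(z)\) is
$$\begin{aligned} s_i(\varvec{\mu })=\sum _{\varvec{\gamma }:\,\langle \varvec{\gamma },\varvec{h}\rangle =i}\varvec{\mu }^{\varvec{\gamma }\circ \varvec{h}}\,\text {ind}(\varvec{\gamma }\underline{H},G). \end{aligned}$$ (11)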
Proposition 3.2
Fix \(\underline{H} = (H_1, \ldots , H_r)\) where the \(H_i\) are pairwise nonisomorphic connected graphs each of maximum degree at most \(\Delta \) and fix \(\varvec{\mu } \in \mathbb {Z}_{\ge 0}^r\). Then \(Z_{\varvec{\mu },\underline{H}}(G;z)\) is a BIGCP with parameters \(\alpha = 1\) and \(\beta _i = i^2r \Delta ^{i-1}\).
Proof
Since \(Z_{\varvec{\mu },\underline{H}}(G)\) is a particular evaluation of \(Z_{\underline{H}}(G)\), we know by Proposition 3.1 that it is multiplicative.
The coefficient of \(z^i\) in \(Z_{\varvec{\mu },\underline{H}}(G;z)\) is given by (11). Since \(\varvec{\gamma } \underline{H}\) is a graph with exactly \(\langle \varvec{\gamma }, \varvec{h} \rangle = i\) vertices, we can take \(\alpha \) to be 1 in the definition of BIGCP.
For a given graph F, we must determine \(\lambda _{F,i}\) in the definition of BIGCP and the time \(\beta _i\) required to do this. Note that we may assume \(|V(F)| = i\); otherwise \(\lambda _{F,i} = 0\). If \(|V(F)| = i\), we must test if F is isomorphic to a graph of the form \(\varvec{\gamma } \underline{H}\) with \(\langle \varvec{\gamma }, \varvec{h} \rangle = i\) and if so we must output the value of \(\lambda _{F,i}\) as \(\varvec{\mu }^{\varvec{\gamma } \circ \varvec{h}}\) (this last step taking i arithmetic operations). To test if F is isomorphic to a graph of the form \(\varvec{\gamma } \underline{H}\), we test isomorphism of each component of F against each of the graphs \(H_1, \ldots , H_r\), which takes time at most \(O(i^2r\Delta ^{i-1})\) using Lemma 2.2 at most ir times. Thus the total time to compute \(\lambda _{F,i}\) is at most \(O(i^2r\Delta ^{i-1})\). \(\square \)
Now since \(Z_{\varvec{\mu },\underline{H}}(G;z)\) is a BIGCP, Theorem 2.1 allows us to compute the coefficients \(s_i(\varvec{\mu })\) in (11) with the desired running time. However the \(s_i(\varvec{\mu })\) are linear combinations of the numbers \(\text {ind}(\varvec{\gamma } \underline{H}, G)\), while we wish to compute one of these numbers in particular, say \(\text {ind}(\rho \underline{H}, G)\). By making careful choices of different \(\varvec{\mu }\), we will obtain an invertible linear system whose solution will include the number \(\text {ind}(\rho \underline{H},G)\). We will require Alon’s Combinatorial Nullstellensatz [1], which we state here for the reader’s convenience.
Theorem 3.3
([1]) Let \(f(x_1, \ldots , x_n)\) be a polynomial of degree d over a field \(\mathbb {F}\). Suppose the coefficient of the monomial \(x_1^{\mu _1}\cdots x_n^{\mu _n}\) in f is nonzero and \(\mu _1+ \cdots + \mu _n = d\). If \(S_1, \ldots , S_n\) are finite subsets of \(\mathbb {F}\) with \(|S_i| \ge \mu _i + 1\) then there exists a point \(x \in S_1 \times \cdots \times S_n\) for which \(f(x) \not =0\).
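Given a vector \(\varvec{h} \in \mathbb {N}^r\), let us write \(\mathcal {P}_{m,r,\varvec{h}}\) for the set of vectors \(\varvec{\gamma } \in \mathbb {Z}_{\ge 0}^r\) such that \(\langle \varvec{\gamma }, \varvec{h} \rangle = m\). We note that, as the number of elements in \(\mathcal {P}_{m,r,\varvec{h}}\) is at most the number of monomials in r variables of degree m, we have
$$\begin{aligned} |\mathcal {P}_{m,r,\varvec{h}}|\le \left( {\begin{array}{c}m+r-1\\ r-1\end{array}}\right) . \end{aligned}$$
Enumerate the vectors in \(\mathcal {P}_{m,r,\varvec{h}}\) as \(\varvec{\gamma }_1, \ldots , \varvec{\gamma }_k\) and write \(\gamma _{i,j}\) for the \(j\hbox {th}\) component of \(\varvec{\gamma }_i\). Given a vector \(\varvec{\nu } \in \mathbb {N}^r\), we write \(\varvec{\nu }^* \in \mathbb {N}^k\) for the vector whose \(i\hbox {th}\) component \(\nu _i^*\) is \(\varvec{\nu }^{\varvec{\gamma }_i \circ \varvec{h}}\), i.e.
$$\begin{aligned} \varvec{\nu }^*:=\left( \varvec{\nu }^{\varvec{\gamma }_1\circ \varvec{h}},\ldots ,\varvec{\nu }^{\varvec{\gamma }_k\circ \varvec{h}}\right) ,\quad \text {where}\quad \varvec{\nu }^{\varvec{\gamma }_i\circ \varvec{h}}=\prod _{j=1}^r\nu _j^{\gamma _{i,j}h_j}. \end{aligned}$$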
Lemma 3.4
Fix \(m,r \in \mathbb {N}\) and \(\varvec{h} \in \mathbb {N}^r\), and let \(\varvec{\gamma }_1, \ldots , \varvec{\gamma }_k\) be an enumeration of the elements in \(\mathcal {P}_{m,r,\varvec{h}}\) as before. In time \(O(k^5 + k^2m e^{m})\), we can find vectors \(\varvec{\nu }_1, \ldots , \varvec{\nu }_k\in \mathbb {N}^r\) such that \(\varvec{\nu }_1^*, \ldots , \varvec{\nu }_k^* \in \mathbb {N}^k\) (as defined above) are linearly independent.
Proof
For any vector \(\varvec{\nu }\), let us write \(\varvec{\nu }_j\) to denote the vector consisting of the first j components. Suppose we have found vectors \(\varvec{\nu }_1, \ldots , \varvec{\nu }_{\ell 1}\in \mathbb {N}^r\) such that the \((\ell 1) \times (\ell 1)\) matrix
has nonzero determinant. We will show how to find \(\varvec{\nu }_\ell \in \mathbb {N}^r\) such that the corresponding matrix \(M_\ell \) has nonzero determinant. First consider the components of \(\varvec{\nu }_\ell \) to be unknown variables \(x_1, \ldots , x_r\) so that \(\det (M_\ell )\) becomes a polynomial \(P = P(x_1, \ldots , x_r)\) in the variables \(x_1, \ldots , x_r\). In fact it is a homogeneous polynomial of degree m. Writing \(\varvec{x} = (x_1, \ldots , x_r)\), we know that the coefficient of \(\varvec{x}^{\varvec{\gamma }_\ell \circ \varvec{h}}\) is \(\pm \det (M_{\ell 1}) \not = 0\) (consider the determinant expansion of the matrix \(M_\ell \) along the \(\ell th\) column). We must now find \(\varvec{\nu }_\ell \in \mathbb {N}^r\) such that \(\det (M_\ell ) = P(\nu ^*_{\ell ,1},\ldots ,\nu ^*_{\ell ,\ell }) \not = 0\), where \(\nu ^*_{\ell , i}\) is the \(i\hbox {th}\) component of \(\varvec{\nu }^*_{\ell }\).
Assume the components of \(\varvec{\gamma }_\ell \in \mathbb {N}^r\) are \(a_1, \ldots , a_r\). Applying Theorem 3.3 to the monomial \(\varvec{x}^{\varvec{\gamma }_\ell \circ \varvec{h}}\) and taking the sets \(S_i = \{1, \ldots , a_ih_i + 1 \}\) for \(i=1,\ldots ,r\), we know there exists a vector \(\varvec{\nu }_\ell \in S := S_1 \times \cdots \times S_r\) such that \(P(\nu ^*_{\ell ,1},\ldots ,\nu ^*_{\ell ,\ell }) \not =0\). Computing the polynomial P requires time at most \(O(k \cdot k^{3})\) (using that computing the determinant of an \(n \times n\) matrix takes \(O(n^3)\) time) and evaluating it at every point in S requires at most \(O(m \cdot k \cdot |S|)\) operations. Since \(\sum _{i=1}^r a_ih_i = \langle \varvec{\gamma }_\ell , \varvec{h} \rangle = m\), we can bound \(|S|\) as follows:
$$\begin{aligned} |S| = \prod _{i=1}^r (a_ih_i + 1) \le \left( \frac{1}{r}\sum _{i=1}^r (a_ih_i + 1) \right) ^r = \left( 1 + \frac{m}{r} \right) ^r \le e^m. \end{aligned}$$
The first inequality follows from the arithmetic-geometric mean inequality. Iterating the procedure, we can determine \(\varvec{\nu }_1, \ldots , \varvec{\nu }_k\) in time \(O(k\cdot (k^4 + m k |S|)) = O(k^5 + k^2m e^{m})\). \(\square \)
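The greedy search in this proof can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the helper names `compositions`, `monomial`, `det` and `find_nus` are hypothetical, \(\mathbb {N}\) is taken to include 0 when enumerating \(\mathcal {P}_{m,r,\varvec{h}}\), and the determinant is evaluated numerically at each grid point rather than first expanded as the polynomial P.

```python
from fractions import Fraction
from itertools import product


def compositions(m, h):
    """Yield every gamma (entries >= 0) with sum(gamma[i] * h[i]) == m."""
    if not h:
        if m == 0:
            yield ()
        return
    for a in range(m // h[0] + 1):
        for rest in compositions(m - a * h[0], h[1:]):
            yield (a,) + rest


def monomial(nu, gamma, h):
    """Evaluate nu^(gamma o h) = prod_i nu[i] ** (gamma[i] * h[i])."""
    value = 1
    for x, a, size in zip(nu, gamma, h):
        value *= x ** (a * size)
    return value


def det(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((i for i in range(c, n) if a[i][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign
        result *= a[c][c]
        for i in range(c + 1, n):
            factor = a[i][c] / a[c][c]
            for j in range(c, n):
                a[i][j] -= factor * a[c][j]
    return sign * result


def find_nus(gammas, h):
    """Greedily pick nu_1, ..., nu_k so that every leading minor is nonzero."""
    nus = []
    for ell, gamma in enumerate(gammas, start=1):
        # Search the grid S = S_1 x ... x S_r with S_i = {1, ..., a_i * h_i + 1}.
        grid = [range(1, a * size + 2) for a, size in zip(gamma, h)]
        for candidate in product(*grid):
            trial = nus + [candidate]
            minor = [[monomial(nu, g, h) for g in gammas[:ell]] for nu in trial]
            if det(minor) != 0:
                nus.append(candidate)
                break
        else:
            raise AssertionError("no grid point works; this should not happen")
    return nus


if __name__ == "__main__":
    h = (1, 2)                      # toy component sizes (an assumption)
    gammas = list(compositions(4, h))
    print(gammas, find_nus(gammas, h))
```

The matrix built here has the vectors \(\varvec{\nu }_i^*\) as rows rather than columns, which does not affect whether the determinant vanishes.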
Remark 3.1
We suspect there should be a simpler argument than the one we have just given (perhaps one where the vectors \(\varvec{\nu }_1, \ldots , \varvec{\nu }_k\) can be explicitly written down rather than having an algorithm to determine them). Note that one can also use a faster randomised algorithm by applying the Schwartz-Zippel Lemma.
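One possible reading of the randomised alternative is sketched below; it is an assumption about how the Schwartz-Zippel Lemma would be applied, not a procedure taken from the paper. Viewing the determinant of the full \(k \times k\) matrix as a polynomial in all kr entries of \(\varvec{\nu }_1, \ldots , \varvec{\nu }_k\), it is nonzero (by Lemma 3.4) and has degree at most km, so sampling every entry uniformly from \(\{1,\ldots ,2km\}\) fails with probability at most 1/2 per attempt. The names `det_bareiss` and `random_nus` are hypothetical.

```python
import random


def monomial(nu, gamma, h):
    """Evaluate nu^(gamma o h) = prod_i nu[i] ** (gamma[i] * h[i])."""
    value = 1
    for x, a, size in zip(nu, gamma, h):
        value *= x ** (a * size)
    return value


def det_bareiss(matrix):
    """Exact integer determinant via fraction-free (Bareiss) elimination."""
    a = [list(row) for row in matrix]
    n = len(a)
    sign, prev = 1, 1
    for c in range(n - 1):
        if a[c][c] == 0:
            swap = next((i for i in range(c + 1, n) if a[i][c] != 0), None)
            if swap is None:
                return 0
            a[c], a[swap] = a[swap], a[c]
            sign = -sign
        for i in range(c + 1, n):
            for j in range(c + 1, n):
                # The division is exact for integer input (Bareiss' identity).
                a[i][j] = (a[i][j] * a[c][c] - a[i][c] * a[c][j]) // prev
        prev = a[c][c]
    return sign * a[n - 1][n - 1]


def random_nus(gammas, h, rng, max_tries=50):
    """Sample all nu_i at random and retry until the nu_i^* are independent."""
    k = len(gammas)
    m = sum(a * size for a, size in zip(gammas[0], h))
    bound = max(2, 2 * k * m)       # sample range {1, ..., bound}
    for _ in range(max_tries):
        nus = [tuple(rng.randint(1, bound) for _ in h) for _ in range(k)]
        matrix = [[monomial(nu, g, h) for g in gammas] for nu in nus]
        if det_bareiss(matrix) != 0:
            return nus
    raise RuntimeError("all attempts produced a singular matrix")


if __name__ == "__main__":
    print(random_nus([(0, 2), (2, 1), (4, 0)], (1, 2), random.Random(0)))
```

Each attempt costs one \(k \times k\) determinant instead of a search over the grid S, which is where the saving over the deterministic procedure would come from.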
We can now state our algorithm to compute \(\text {ind}(H,G)\) for Theorem 1.1.
Algorithm 2
Input: two graphs H and G.

Step 1: Determine the components of H using e.g. breadth-first search. Compute, using Lemma 2.2 (i), the pairwise non-isomorphic components \(H_1, \ldots , H_r\) of H and their multiplicities \(i_1,\ldots ,i_r\).

Step 2: Write \(H = i_1H_1\cup \ldots \cup i_rH_r\) and \(\underline{H} = (H_1, \ldots , H_r)\), let \(\varvec{h}\) be the vector in \(\mathbb {N}^r\) defined by \(h_j=|V(H_j)|\) for each j, and compute \(m=\sum _{j=1}^r i_j h_j\). Consider the multivariate polynomial \(Z_{\underline{H}}(G;\varvec{x})\).

Step 3: Recall
$$\begin{aligned} \mathcal {P}_{m,r,\varvec{h}} = \{ \varvec{\gamma } \in \mathbb {N}^r: \langle \varvec{\gamma } , \varvec{h} \rangle = m \}. \end{aligned}$$Enumerate the set \(\mathcal {P}_{m,r,\varvec{h}}\) as \(\{\varvec{\gamma }_1,\ldots ,\varvec{\gamma }_k\}\) with \(\varvec{\gamma }_1=(i_1,\ldots ,i_r)\).

Step 4: Recall that given a vector \(\varvec{\nu } \in \mathbb {N}^r\), we write \(\varvec{\nu }^* \in \mathbb {N}^k\) for the vector \((\varvec{\nu }^{\varvec{\gamma }_i \circ \varvec{h}})_{i=1}^k\). Use Lemma 3.4 to determine vectors \(\varvec{\nu }_1,\ldots ,\varvec{\nu }_k\) such that the vectors \(\varvec{\nu }_1^*,\ldots ,\varvec{\nu }_k^*\) are linearly independent.

Step 5: For each \(i=1,\ldots ,k\) compute the \(m\hbox {th}\) coefficient \(s_m(\varvec{\nu }_i)\) of the univariate polynomial \(Z_{\underline{H},\varvec{\nu }_i}\) using Algorithm 1. (Here \(Z_{\underline{H},\varvec{\nu }_i}(z)\) is the evaluation of \(Z_{\underline{H}}(\varvec{x})\) at the vector \(\varvec{x} = z \varvec{\nu }_i\), where z is a scalar variable; this univariate polynomial satisfies the properties of a BIGCP by Proposition 3.2.)

Step 6: Invert the system of linear equations
$$\begin{aligned} \langle \varvec{\nu }_i^*, \varvec{s} \rangle = s_m(\varvec{\nu }_i) \quad \text {for }\, i=1,\ldots ,k, \end{aligned}$$to find \(\varvec{s} \in \mathbb {N}^k\) (with components \(s_{1},\ldots ,s_{k}\)).
Output: \(s_{1}=\text {ind}(H,G)\).
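Step 6 is an exact interpolation, which the following toy Python sketch illustrates. Everything concrete in it is an assumption made for illustration: the component sizes \(\varvec{h} = (1,2)\) (a single vertex and an edge), the vectors \(\varvec{\nu }_i\), and in particular the counts in `hidden_s`, which are made up and stand in for the unknown values \(\text {ind}(\varvec{\gamma }_j\underline{H}, G)\). In Algorithm 2 the right-hand sides come from Algorithm 1 in Step 5; here they are synthesised from `hidden_s` so that the recovery can be checked.

```python
from fractions import Fraction


def monomial(nu, gamma, h):
    """Evaluate nu^(gamma o h) = prod_i nu[i] ** (gamma[i] * h[i])."""
    value = 1
    for x, a, size in zip(nu, gamma, h):
        value *= x ** (a * size)
    return value


def solve(matrix, rhs):
    """Solve matrix * s = rhs exactly by Gauss-Jordan elimination.

    Assumes the matrix is invertible, as guaranteed in Step 4.
    """
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for c in range(n):
        pivot = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        lead = aug[c][c]
        aug[c] = [x / lead for x in aug[c]]
        for i in range(n):
            if i != c and aug[i][c] != 0:
                factor = aug[i][c]
                aug[i] = [x - factor * y for x, y in zip(aug[i], aug[c])]
    return [row[n] for row in aug]


h = (1, 2)                              # sizes of H_1 (vertex) and H_2 (edge)
gammas = [(2, 1), (0, 2), (4, 0)]       # gamma_1 = (i_1, i_2) comes first
nus = [(1, 1), (1, 2), (2, 1)]          # vectors with independent nu_i^*
hidden_s = [3, 5, 7]                    # made-up counts, for illustration only

# Row i is nu_i^* = (nu_i^(gamma_j o h))_j; rhs[i] plays the role of s_m(nu_i).
nu_star = [[monomial(nu, g, h) for g in gammas] for nu in nus]
rhs = [sum(c * s for c, s in zip(row, hidden_s)) for row in nu_star]

recovered = solve(nu_star, rhs)
print(recovered[0])                     # first component plays the role of ind(H, G)
```

Exact rational arithmetic is used so that the recovered counts are integers without rounding error; a floating-point solver could be substituted for small instances.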
Proof of Theorem 1.1
Step 1 can be executed in time \(O(m^3\Delta ^m)\), using Lemma 2.2 (i) to test for isomorphism. Step 3, the enumeration of the elements of \(\mathcal {P}_{m,r,\varvec{h}}\), can be executed in time \(O(2^{2m})\), as the size of \(\mathcal {P}_{m,r,\varvec{h}}\) is bounded by \(\left( {\begin{array}{c}m+r-1\\ r-1\end{array}}\right) = O(2^{m+r}) = O(2^{2m})\).
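The size bound on \(\mathcal {P}_{m,r,\varvec{h}}\) can be checked numerically on small cases. The sketch below is illustrative only; the function name `count_gammas` is hypothetical, and it assumes \(\mathbb {N}\) includes 0 and every \(h_j \ge 1\). It counts the solutions of \(\langle \varvec{\gamma }, \varvec{h} \rangle = m\) with a standard dynamic programme and compares the result with the binomial coefficient.

```python
from math import comb


def count_gammas(m, h):
    """Number of gamma (entries >= 0) with sum(gamma[j] * h[j]) == m."""
    ways = [1] + [0] * m            # ways[t]: solutions with total t so far
    for size in h:
        for total in range(size, m + 1):
            ways[total] += ways[total - size]
    return ways[m]


if __name__ == "__main__":
    for m in range(1, 9):
        for h in [(1,), (1, 2), (1, 1), (2, 3), (1, 2, 3)]:
            r = len(h)
            k = count_gammas(m, h)
            bound = comb(m + r - 1, r - 1)
            assert k <= bound <= 2 ** (m + r)
            print(m, h, k, bound)
```

Equality with the binomial coefficient holds when all \(h_j = 1\), since the solutions are then exactly the ways of writing m as an ordered sum of r non-negative integers.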
By Lemma 3.4, we can find the vectors \(\varvec{\nu }_1, \ldots , \varvec{\nu }_k \in \mathbb {N}^r\) such that the vectors \(\varvec{\nu }_1^*, \ldots , \varvec{\nu }_k^* \in \mathbb {N}^k\) are linearly independent in time \(O(k^5 + k^2m e^{m}) = \tilde{O}(2^{10m})\), noting that by (12) k is at most \(\left( {\begin{array}{c}m+r-1\\ r-1\end{array}}\right) = O(2^{m+r}) = O(2^{2m})\). So Step 4 can be executed in \(\tilde{O}(2^{10m})\) time.
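By Proposition 3.2 and Theorem 2.1 we can compute the coefficient \(s_m(\varvec{\nu }_i)\) of \(z^m\) in \(Z_{\underline{H},\varvec{\nu }_i}(G)(z)\) in time \(\tilde{O}((e\Delta )^{\alpha m}\beta _m4^{\alpha m}n) \) with \(\alpha = 1\) and \(\beta _j = j^2r \Delta ^{j-1}\). So Step 5, i.e. computing all these coefficients for \(i=1, \ldots , k\), takes time
$$\begin{aligned} k \cdot \tilde{O}\left( (e\Delta )^{m}\, m^2 r \Delta ^{m-1}\, 4^{m}\, n \right) = \tilde{O}\left( n (7\Delta )^{2m} \right) , \end{aligned}$$
where we used \(k = O(2^{2m})\), \(r \le m\) and \(16e < 49\).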
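Recall that this coefficient is given by
$$\begin{aligned} s_m(\varvec{\nu }_i) = \sum _{j=1}^k \text {ind}(\varvec{\gamma }_j\underline{H}, G) \cdot \varvec{\nu }_i^{\varvec{\gamma }_j \circ \varvec{h}}. \end{aligned}$$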
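More conveniently, writing \(\varvec{s} \in \mathbb {Z}_{\ge 0}^k\) for the vector whose \(j\hbox {th}\) component is \(s_j=\text {ind}(\varvec{\gamma }_j\underline{H}, G)\), we have the invertible system of linear equations given by
$$\begin{aligned} \langle \varvec{\nu }_i^*, \varvec{s} \rangle = s_m(\varvec{\nu }_i) \quad \text {for }\, i=1,\ldots ,k, \end{aligned}$$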
where we have computed the values of \(s_m(\varvec{\nu }_i)\) and \(\varvec{\nu }_i^*\), while the vector \(\varvec{s}\) is unknown (the system is invertible because we chose the \(\varvec{\nu }_i^*\) to be linearly independent). We can then invert the system in time \(O(k^3) = \tilde{O}(2^{6m})\) (Step 6). In particular finding the value of \(s_1 = \text {ind}(H, G)\) takes time \(\tilde{O}(2^{6m})\). This proves correctness of the algorithm.
The total running time is dominated by the time to execute Steps 4 and 5, which is bounded by \(\tilde{O}(n(7\Delta )^{2m} +2^{10m})\). \(\square \)
Concluding Remarks
As we remarked in the introduction, our approach also works in the setting of vertex- and edge-coloured graphs. We will not elaborate on the details here, but just refer the interested reader to Section 3.3 of [18], where we have briefly explained how to extend the results for computing coefficients of BIGCPs to the setting of coloured graphs. In addition, we note that the part of the proof given in Sect. 3 also carries over to the coloured setting, replacing "graph" by "coloured graph" everywhere.
The approach used to prove Theorem 1.1 is very robust. Besides extending to the coloured setting, it also extends easily to other graph-like structures. For example, in [18] it has been extended to fragments, i.e., vertex-coloured graphs in which some edges may be unfinished; Liu et al. [14] extended it to insects, i.e., vertex-coloured hypergraphs in which some edges may be unfinished; and very recently Barvinok and the second author [4] applied this approach to enumerate integer points in certain polytopes. We expect our approach to be applicable to the problem of counting (induced) substructures in other structures as well, as long as there is a notion of connectedness and maximum degree.
References
 1.
Alon, N., Tarsi, M.: Combinatorial nullstellensatz. Comb. Probab. Comput. 8(1), 7–30 (1999)
 2.
Arvind, V., Raman, V.: Approximation algorithms for some parameterized counting problems. In: Algorithms and Computation, 13th International Symposium, ISAAC 2002 Vancouver, BC, Canada, 21–23 Nov 2002, Proceedings, pp. 453–464 (2002)
 3.
Barvinok, A.: Combinatorics and Complexity of Partition Functions, volume 30 of Algorithms and Combinatorics. Springer, Berlin (2017)
 4.
Barvinok, A., Regts, G.: Weighted counting of nonnegative integer points in a subspace (2017). arXiv preprint arXiv:1706.05423
 5.
Borgs, C., Chayes, J., Kahn, J., Lovász, L.: Left and right convergence of graphs with bounded degree. Random Struct. Algorithms 42(1), 1–28 (2013)
 6.
Chen, Y., Thurley, M., Weyer, M.: Understanding the complexity of induced subgraph isomorphisms. In: Automata, Languages and Programming, 35th International Colloquium, ICALP 2008, Reykjavik, Iceland, 7–11 July 2008, Proceedings, Part I: Track A: Algorithms, Automata, Complexity, and Games, pp. 587–596 (2008)
 7.
Csikvári, P., Frenkel, P.E.: Benjamini–Schramm continuity of root moments of graph polynomials. Eur. J. Comb. 52, 302–320 (2016)
 8.
Curticapean, R., Dell, H., Fomin, F.V., Goldberg, L.A., Lapinskas, J.: A fixed-parameter perspective on #BIS. In: 12th International Symposium on Parameterized and Exact Computation, IPEC 2017, 6–8 Sept 2017, Vienna, Austria, pp. 13:1–13:13 (2017)
 9.
Curticapean, R., Dell, H., Marx, D.: Homomorphisms are a good basis for counting small subgraphs. In: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, 19–23 June 2017, pp. 210–223 (2017)
 10.
Curticapean, R., Marx, D.: Complexity of counting subgraphs: only the boundedness of the vertex-cover number counts. In: 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, Philadelphia, PA, USA, 18–21 Oct 2014, pp. 130–139 (2014)
 11.
Downey, R.G., Fellows, M.R.: Fixed-parameter tractability and completeness II: on completeness for W[1]. Theor. Comput. Sci. 141(1&2), 109–131 (1995)
 12.
Flum, J., Grohe, M.: The parameterized complexity of counting problems. SIAM J. Comput. 33(4), 892–922 (2004)
 13.
Johnson, D.S., Szegedy, M.: What are the least tractable instances of max independent set? In: Proceedings of the Tenth Annual ACM-SIAM Symposium on Discrete Algorithms, 17–19 Jan 1999, Baltimore, Maryland, pp. 927–928 (1999)
 14.
Liu, J., Sinclair, A., Srivastava, P.: The Ising partition function: Zeros and deterministic approximation. In: 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, Berkeley, CA, USA, 15–17 Oct 2017, pp. 986–997 (2017)
 15.
Lovász, L.: Large Networks and Graph Limits, volume 60 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence (2012)
 16.
Nederlof, J.: Personal communication
 17.
Nešetřil, J., Poljak, S.: On the complexity of the subgraph problem. Comment. Math. Univ. Carolin. 26(2), 415–419 (1985)
 18.
Patel, V., Regts, G.: Deterministic polynomial-time approximation algorithms for partition functions and graph polynomials. SIAM J. Comput. 46(6), 1893–1919 (2017)
 19.
Scott, A.D., Sokal, A.D.: The repulsive lattice gas, the independentset polynomial, and the Lovász local lemma. J. Stat. Phys. 118(5), 1151–1261 (2005)
Acknowledgements
We thank John Lapinskas for raising a question about the complexity of our main algorithm in a previous version of this paper, which led to an improved running time. We also thank Radu Curticapean for informing us of some historical context to our result. We are grateful for the excellent comments by the anonymous referees leading to an improved presentation of our result.
Viresh Patel: Supported by the Netherlands Organisation for Scientific Research (NWO) through the Gravitation Programme Networks (024.002.003). Guus Regts: Supported by a personal NWO Veni grant.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Patel, V., Regts, G. Computing the Number of Induced Copies of a Fixed Graph in a Bounded Degree Graph. Algorithmica 81, 1844–1858 (2019). https://doi.org/10.1007/s00453-018-0511-9
Keywords
 Induced graph
 Computational counting
 Fixed parameter tractability